Query database in Prolog
zby
created: 2005-12-29 11:41:59
I am quite new with Programming in Logic, but it seems to be a great language for querying a data store and storing facts. In many ways it is so similar to SQL, and in many cases, I am sure, it would be much more convenient. Read Logic Programming with Perl and Prolog if you want an introduction to Prolog with some Perl modules or Other online Prolog tutorials to learn more about the power of programming in logic. So why there are no databases built to support Prolog? Is it too difficult?

At least with Perl you can dump your database as a Prolog program:

use strict;
use DBIx::Class::Loader;
use AI::Prolog;

my $loader = DBIx::Class::Loader->new(
    dsn       => $ARGV[0],
    user      => $ARGV[1],
    password  => $ARGV[2],
    namespace => 'Data',
);

my $prog;
for my $c ( $loader->classes ) {
    my $rs = $c->search();
    $prog .= '% ' . join ( ',', $c->columns ) . "\n";
    while ( my $row = $rs->next ){
        $prog .= $c->table . '(';
        my @quoted = map {AI::Prolog->quote($row->$_)} $c->columns;
        $prog .= join ',', @quoted;
        $prog .= ").\n";
    }
}
print $prog;
Let's see what we can get with this from a simple database like this:
create table usr(id integer primary key, name varchar(100), cityid integer);
create table city (id integer primary key, name varchar(100));

insert into usr values (1, 'John', 1);
insert into usr values (2, 'Marry', 1);
insert into usr values (3, 'Eva', 2);
insert into usr values (4, 'Zby', 3);
insert into city values (1, 'New York');
insert into city values (2, 'London');
insert into city values (3, 'Warsaw');
Run the dumper:
perl dump.pl dbi:Pg:dbname=prolog > mytables
What I get in the mytables file is:
% name,id
city('New York',1).
city('London',2).
city('Warsaw',3).
% cityid,name,id
usr(1,'John',1).
usr(1,'Marry',2).
usr(2,'Eva',3).
usr(3,'Zby',4).
The sequence of columns is not retained by DBIx::Class::Loader - that's why I added the column names in the Prolog comments (lines starting with '%'), this will be usefull for creating the queries. That's our logic database. Now let's query it. I'll run the full SWI prolog engine here and load the above program, I do this because I cannot get AI::Prolog to work with compound queries that I need.
zby@zby:~/progs/prologdata$ swipl
Welcome to SWI-Prolog (Multi-threaded, Version 5.2.13)
Copyright (c) 1990-2003 University of Amsterdam.
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to redistribute it under certain conditions.
Please visit http://www.swi-prolog.org for details.

For help, use ?- help(Topic). or ?- apropos(Word).

?- [mytables].
% mytables compiled 0.00 sec, 1,400 bytes

Yes
?-
Now the database is loaded, and we can query it. How about a table join? Let's look for all users that are from 'New York'. Now I need the comments that tell me the sequence of columns.
?- usr(Cityid, Name, Id), city('New York', Cityid).

Cityid = 1
Name = 'John'
Id = 1 ;

Cityid = 1
Name = 'Marry'
Id = 2 ;

No
?-
Great, it found both.

And here is some code you can add to the first program to ask the query at once without saving the intermediate Prolog program. Unfortunately I did not found any way how to do conjunction in the queries (needed for table joins here).

my $prologDB = AI::Prolog->new( $prog );
$prologDB->query( "usr(1,Name,Uid)." );
while (my $results = $prologDB->results) { print "@$results\n"; }

For sure there are other ways how to convert relational databases to logic, I hope other monks here will share their ideas on how to do that and how to use it.

Re: Query database in Prolog
created: 2005-12-29 14:52:36

[Added: [google://Prolog SQL] has some good info. You should definately avoid re-inventing a wheel. The real Prolog people have already been hacking on this problem.]


I was thinking about this yesterday. It seemed most natural that you'd patch [cpan://AI::Prolog] to be able to query this stuff. You might want to introduce a new built-in function to represent joins. You've got a problem in that SQL relations have no order to columns so you'll either need to ignore that or instruct AI::Prolog in what order to read columns when querying the database.

% A "normal" lookup
city('New York', Cityid), usr(Cityid, Name, Id).

% An id-less query. Maybe the engine notices the join and transforms this into the prior behind the scenes.
% Maybe an invented built-in like references/4 cause the transformation.
:- references( 'USR', 'CITY_ID', 'USR', 'CITY_ID' ).
usr( city( 'New York' ), Name ).

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re^2: Query database in Prolog
zby
created: 2005-12-29 15:16:36
I plan to patch DBIx::Class to retain the order, but I think about other possibilities as well.
Re^3: Query database in Prolog
created: 2005-12-29 15:32:59

"Retain" the order? The db provides no order!

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re^4: Query database in Prolog
zby
created: 2005-12-29 15:52:14
You have some order in the CREATE clause and most databases would retain that when you issue * in a SELECT.
Re^5: Query database in Prolog
created: 2005-12-29 15:55:08

I'm told that any column ordering is incidental and not relational. I'm told that you're supposed to treat columns as unordered just like rows are unordered unless you specify an ORDER BY clause.

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re^5: Query database in Prolog
created: 2005-12-29 16:58:00
Here is an exercise for you.

Create a table. Give it the fields (state, city, address). Populate it with, say, 500 records in random order, divided between, say, 20 states and 80 cities. Put an index on (state, city). Make sure that database statistics are properly populated. Now "select * from tbladdress where state = '$state'" for one of the less populated states.

If you did that as I described, your results are likely to come back grouped by city (in Oracle ordered that way as well), and not the order you inserted them in. That is because the database realizes that it is faster to access data through the index, and the index is organized differently from the table.

In Oracle specifically, if you select from the most populous state it may well come back in your original insertion order. That is because the cutoff between preferring an index to a full table scan is around 7% in Oracle. Your least populous state is below that magic figure. Your most populous one is above it.

In short, it is programmer error to make any assumptions about what order things will come back in if you do not have an order by clause. The order which things come back in is an internal implementation detail, and if you rely on it you deserve to be bitten by it some day.

Occasionally you may run across optimization advice that contradicts this. For instance in Oracle (and likely other databases) a group by also sorts data. So I had one instructor from Oracle point out to me that if you have a group by you can just specify the right group by and avoid having an order by, thereby saving yourself one sort in the database. However it turns out that even that you can't fully depend on - going from Oracle 9i to Oracle 10g the order in which NULL is placed in the result set changed. (But only if you have 2 or more columns in the group by.)

The moral remains: if you have not specified an order by clause, do not expect any particular order to result. If one does, it is a coincidence that you should not rely on.

Re^6: Query database in Prolog
zby
created: 2005-12-30 02:56:58
First I was not talking about the order of records but the order of columns. Second I can force the order of columns by using the columns sequence in the SELECT clause - so I can have any order I want and particularily I can have the same order that was in the CREATE query.
Re^7: Query database in Prolog
created: 2005-12-30 16:49:39
Ah.

Yes, that works and is portable. However whenever I've dealt with code that tried to remember column names by position, it has been a maintainance issue. Every single time. (It wasn't always obvious to other people around me that it was a maintainance issue since the work of maintaining it and the routine failures in doing so were just accepted as normal.)

Therefore even though you can do things that way, I strongly recomment that you not do so. Albeit for different reasons than what I thought you were talking about.

Re^8: Query database in Prolog
zby
created: 2005-12-30 18:03:57
I welcome ideas how to make the code independent on column position - I clearly expressed in the main article that I am waiting for other ideas how to convert the database relations to logic. The example is much complicated by the dependence on column order - but i don't know how to get rid of it in a simple way.
Re^9: Query database in Prolog
created: 2005-12-30 19:10:50
I tend to use raw DBI and fetchrow_hashref. Now I have columns by name. That won't be portable across databases (different databases handle capitalization of field names differently) but I've generally only found myself working with one database at a time so it isn't an issue for me.

Note that if you're processing large amounts of data, making those anonymous hashes is slower. I've never found that to be a bottleneck for me, but I've always had to deal with remote databases and have never worked with more than a few billion rows. (And on large datasets, my performance expectations were already set by the amount of data.) YMMV.

Re^10: Query database in Prolog
created: 2005-12-30 20:22:34
I tend to use raw DBI and fetchrow_hashref. Now I have columns by name. That won't be portable across databases (different databases handle capitalization of field names differently)
If you set $dbh->{fetchHashKeyName} = NAME_uc or NAME_lc at the top of your scripts, then the hash fetches will be portable across databases since they'll all be in either uc or lc.
Re^10: Query database in Prolog
zby
created: 2005-12-31 04:50:05
It is not about DBI - but about converting relations to logic predicates. I know how to get the data from the database (and if you looked at my first post in this thread you would discover that I use an ORM for that - that is order independent), it is the conversion that is the difficult and in my solution order dependent step. I don't want to be rude but I think you did not read the whole thread here and now you attack me with some straw man tactics. It is insulting when you suppose that I don't know basic facts.
Re^10: Query database in Prolog
created: 2006-01-05 16:21:57

> Note that if you're processing large amounts of data, making those anonymous hashes is slower.

Yep. I have had that be an occasional performance bottleneck. Solved it thusly (IIRC):

$sth = $dbh->prepare("Select * FROM datatable");
$sth->execute;
my %map;
{
   my @col = @{ $sth->{NAME_lc} };
   for (0..$#col) { $map{$col[$_]} = $_ }
}
while( my $row = $sth->fetchrow_arrayref ) {
   # I want the cols named ID, name, city, in that order
   printf "%d: %s from %s\n"
     ,$row->{$map{id}}
     ,$row->{$map{name}}
     ,$row->{$map{city}};
}

In this way, you use the fastest fetchrow (fetchrow_arrayref) on each iteration, which seems to be much faster.

Granted, specifying the field order in the SELECT and then using constants to reference field order (or just passing the list to printf) is faster still. But, if (like me) someone else updates the SQL that your code is calling (like when you just call stored procedures), the code above just keeps working, whereas using numerical constants could break.

I second the cry of YMMV, but this has always worked well for me.

<-radiant.matrix->
A collection of thoughts and links from the minds of geeks
The Code that can be seen is not the true Code
"In any sufficiently large group of people, most are idiots" - Kaa's Law
Re: Query database in Prolog
created: 2005-12-29 16:09:08

I'll run the full SWI prolog engine here and load the above program, I do this because I cannot get AI::Prolog to work with compound queries that I need.

This is due to the limited grammar which [dist://AI::Prolog] supports. Changing your query to a rule will permit this. Add the following rule to the mytables file:

city_usr(City, Name, Id) :-
    city(City, City_ID),
    usr(City_ID, Name, Id).

Then you can run the shell:

AI-NeuralNet-Simple-0.03 $ aiprolog mytables.pro 

Welcome to AI::Prolog v 0.732
Copyright (c) 2005, Curtis "Ovid" Poe.
AI::Prolog comes with ABSOLUTELY NO WARRANTY.  This library is free software;
you can redistribute it and/or modify it under the same terms as Perl itself.

Type '?' for help.

?- listing.

1. city_usr/3: 
   city_usr(A, B, C) :- 
        city(A, D),
        usr(D, B, C).
2. usr/3: 
   usr(1, John, 1).
   usr(1, Marry, 2).
   usr(2, Eva, 3).
   usr(3, Zby, 4).
3. city/2: 
   city('New York', 1).
   city(London, 2).
   city(Warsaw, 3).

listing 

Yes
?- city_usr('New York', User, ID).

city_usr('New York', John, 1) ;

city_usr('New York', Marry, 2) ;

 No
?- 

There are other techniques, but that's the easiest.

As for why databases don't support Prolog, there are a variety of reasons for that. The primary reason is that DBMS don't really handle the relational model all that well. Queries frequently return bags instead of sets, SQL does not handle recursion well -- a core requirement of Prolog -- and Prolog allows lists, something databases also don't handle well for a single column.

Querying databases is something I would like to have in AI::Prolog, but in a more general nature. However, what if you've retrieved 15 out of 20 results and the engine backtracks over your function? You need to be able to add previous results to the AI::Prolog::KnowledgeBase and fetch them from there, if appropriate. Only if they are not found should subsequent results be pulled from the database.

To handle this, we'd need some way of registering external functions and having them call the correct primitive from AI::Prolog::Engine::Primitives (a primitive for external functions would need to be added). There's also the question of side effects. What happens when you backtrack over a predicate which deletes file? [diotalevi] and I have discussed much of this, but regrettably, I'm terribly busy at work and don't have time to work on this.

Cheers,
Ovid

New address of my CGI Course.

Re^2: Query database in Prolog
zby
created: 2005-12-29 16:31:47
What I was imagining was using Prolog in place of RDBMS. That is a prolog engine to which you could connect using an analogue to ODBC, store facts there and query them. But this means you need a way to delete the facts as well - I don't know how this can be added to the Prolog computational model. I just started with Prolog and I don't really understand your sentence about backtracking over query results - but I think that backtracking in this environment would have to be limited by a transaction boundary.
Re^3: Query database in Prolog
created: 2005-12-29 16:43:10

In reality, deleting facts is generally not something one wants to do as it's considered a "non-logical" operation. However, you can use retract/1 if you really need to:

AI-NeuralNet-Simple-0.03 $ aiprolog 


Welcome to AI::Prolog v 0.732
Copyright (c) 2005, Curtis "Ovid" Poe.
AI::Prolog comes with ABSOLUTELY NO WARRANTY.  This library is free software;
you can redistribute it and/or modify it under the same terms as Perl itself.

Type '?' for help.

?- help('retract/1') 

retract/1

    Remove facts from the database. You cannot remove rules. This may
    change in the future. See assert(X).

     retract(loves(ovid,java)).

help(retract/1) 

Yes
?- 

You can also just type "help" at the aiprolog prompt to see a list of all built-in predicates you can get help for:

?- help.

Help is available for the following builtins:

!/0          assert/1     call/1       consult/1    div/2    
eq/2         fail/0       ge/2         gt/2         halt/1   
if/3         is/2         le/2         listing/0    listing/1
lt/2         minus/2      mod/2        mult/2       ne/2     
nl/0         not/1        notrace/0    once/1       or/2     
plus/2       pow/2        print/1      println/1    retract/1
trace/0      true/0       var/1        write/1      writeln/1

help 

Yes
?- 

If you really want to get a decent grounding in Prolog, see [http://www.amzi.com/AdventureInProlog/advfrtop.htm|Amzi's "Adventure in Prolog"]. It's a free online book which lets guides through building four different Prolog applications. It's quite cool. Most of the basics work in [dist://AI::Prolog], but it could still use a lot of work/patches.

Cheers,
Ovid

New address of my CGI Course.

perlmonks.org content © perlmonks.org and diotalevi, jZed, Ovid, radiantmatrix, tilly, zby

prlmnks.org © 2006 edmund von der burg (eccles & toad)

v 0.03