Date:Mon Oct 21 09:36:44 2000 
Subject:Re: Performance 
From:Aaron Sloman See text for reply address 
Volume-ID:1001021.02 

Elan <rebol@techscribe.com> writes:

> Are poplog prolog "databases" sufficiently fast to manage a small
> database of several thousand records? Or do I need to employ postgresql
> (which is really overkill in this situation)? Is there a better,
> built-in way for managing data under poplog?

I can't comment in detail on the efficiency of the current prolog
implementation. However I did hear of at least one customer of
ISL gaining a speedup of some orders of magnitude when they switched
from using the Prolog database to using properties in Pop-11.

There are several information files about Pop-11 properties, readable
in Ved:
    HELP PROPERTIES
    HELP NEWPROPERTY
    HELP NEWMAPPING
    HELP NEWANYPROPERTY
    REF PROPS

The HELP files are in
    $usepop/pop/help/
the REF files in
    $usepop/pop/ref
Some of these are in a format that is hard to read without
Ved because of the use of "fancy" fonts. There are "stripped"
(plain text) versions of all the online documentation at
    http://www.cs.bham.ac.uk/research/poplog/doc/

Properties use hash tables. There are two main sorts of properties
in Pop-11.

1. Those created by newproperty require exact identity of keys. They
hash the address of a datastructure to get the actual key used.

2. Those created by newanyproperty and newmapping (a simplified
interface to newanyproperty) hash on the contents of the data
structure.

So two lists with identical contents will be associated with the
same value in case 2 but not in case 1. Newproperty produces
properties that work faster but are less flexible.
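
A minimal sketch of the difference, assuming the four-argument form of
newproperty described in HELP NEWPROPERTY (initial list, table size,
default value, and a garbage collection type such as "perm"). The names
idprop, eqprop and record are invented for the illustration:

vars procedure idprop = newproperty([], 16, false, "perm");
vars procedure eqprop = newmapping([], 16, false, true);
vars record = [a b c];

true -> idprop(record);
true -> eqprop(record);

idprop(record) =>      ;;; the very same list, so the entry should be found
idprop([a b c]) =>     ;;; a fresh list with equal contents: default value
eqprop([a b c]) =>     ;;; contents are hashed and compared: entry found

Words and small integers are unique objects, so either sort of property
can be indexed by them.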

Depending on the form of the data and the way in which you need to
be able to access items, you may find that some sort of tree of
properties of type 1 gives you very fast access.
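
One way such a tree might look, as a sketch only: a top-level property
maps a first key to a second-level property, which maps a second key to
the record. The names top_table, store_record and fetch_record, and the
choice of a word and an integer as keys, are assumptions made for the
example, not anything built in.

vars procedure top_table = newproperty([], 64, false, "perm");

define store_record(key1, key2, record);
    lvars key1, key2, record, table2 = top_table(key1);
    unless table2 then
        ;;; first entry under key1: create its second-level table
        newproperty([], 64, false, "perm") ->> table2 -> top_table(key1);
    endunless;
    record -> table2(key2);
enddefine;

define fetch_record(key1, key2) -> record;
    lvars key1, key2, record = false, table2 = top_table(key1);
    ;;; record stays false if either level has no entry
    if table2 then table2(key2) -> record endif;
enddefine;

store_record("smith", 42, [smith 42 birmingham]);
fetch_record("smith", 42) =>
fetch_record("jones", 7) =>

Each lookup costs two identity-hashed accesses, however many records
are stored.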

However, for many purposes searching down lists can be fast enough.
Create a list of 10000 items thus:

vars x;
vars database
    = [%for x from 1 to 10000 do [the next number is ^x] endfor%];

Now search through it 10 times searching for a non-existent item, using
the "=" test which compares contents (like EQUAL in lisp), and time
it:

sysdaytime() =>
repeat 10 times
    member([the next number is 99999], database) =>
endrepeat;
sysdaytime() =>

On a 450 MHz SPARC running Solaris 7 this printed out immediately:

** Sat Oct 21 09:51:30 BST 2000
** <false>
** <false>
** <false>
** <false>
** <false>
** <false>
** <false>
** <false>
** <false>
** <false>
** Sat Oct 21 09:51:31 BST 2000

Redoing it to measure CPU time instead, without printing the results:

timediff() ->;
repeat 10 times
    member([the next number is 99999], database) ->
endrepeat;
timediff() =>
** 0.16

(i.e. 0.16 secs)

It would be around the same speed on a 450 MHz Pentium/Celeron
with Linux Poplog. I find that non-floating-point performance on SPARC
and Intel corresponds approximately to CPU clock speed.

Compare storing the lists in an expandable property using newmapping:

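true -> popgctrace;      ;;; report garbage collections as they happen
4000000 -> popmemlim;    ;;; raise the memory limit
6 -> pop_hash_lim;       ;;; hash more of each key (default is 3)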
vars procedure pdatabase = newmapping([], 10000, false, true);

;;; Now insert the 10000 items, associating them with true:

    vars x;
    for x from 1 to 10000 do
       true -> pdatabase([the next number is ^x])
    endfor;


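(This is much slower if you don't increase the default value of
pop_hash_lim; see HELP SYSHASH. All the keys here begin with the same
four words, so a hash that inspects only the first few elements cannot
tell them apart.)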

Access is now very much faster than searching down a list: we can
search for the same item (actually in the property) 10000 times in a
fraction of a second. (Not if you leave pop_hash_lim with its default
value of 3, however!)

timediff() ->;
repeat 10000 times
    pdatabase([the next number is 9999]) ->
endrepeat;
timediff() =>

If you want to use pattern variables, e.g. with the Pop-11 pattern
matcher, then the above method won't work, since a property can only be
indexed by a complete key. Instead you can create a list of items and
use constructs like foreach to search over the list with patterns
containing variables to be bound to values.

vars type;
timediff() ->;
repeat 10 times
    foreach [the next ?type is 9999] in database do type=> endforeach;
endrepeat;
timediff() =>

This prints out:
** number
** number
** number
** number
** number
** number
** number
** number
** number
** number
** 0.25

Inside a procedure definition I would declare the pattern variable with
lvars, and use the pattern prefix "!", which allows lexical variables to
be used as pattern variables.
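
A sketch of that, assuming a Poplog version in which the "!" prefix is
available (see HELP MATCHES). The procedure name show_types and its
argument name are invented here; database is the list built earlier.

define show_types(items);
    lvars items, item, type;
    for item in items do
        ;;; "!" lets the lexical variable type be bound by the matcher
        if item matches ![the next ?type is 9999] then
            type =>
        endif
    endfor;
enddefine;

show_types(database);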

It would also be possible to use the new pattern syntax that works with
"=" defined in HELP EQUAL. It works on vectors as well as lists.

;;; make a list of 10000 vectors
vars x;
vars vdatabase
    = [%for x from 1 to 10000 do {the next number is ^x} endfor%];

timediff() ->;
lvars type;
;;; create pattern once, rather than each time round the loop
lvars pattern = {the next =?type is 9999};

repeat 10 times
    lvars item;
    for item in vdatabase do
        if item = pattern then type => endif
    endfor;
endrepeat;
timediff() =>

** number
** number
** number
** number
** number
** number
** number
** number
** number
** number
** 0.19

Vectors take up less space than lists.

I hope that provides some useful information. Old users of Pop-11
may not be aware of the generalisations around 1996 by John Gibson
enabling "=" to be used as a pattern matcher, with a new data
type, matchvars, and new pattern syntax using =?, =?? etc
described in HELP EQUAL.

HELP EQUAL also describes a more powerful matcher called "equals", that
is guaranteed to find all possible matches where segment variables are
involved. The simpler matchers, "matches" and "=", both fail on this:

    vars x,y;
    [[1 2][2 1]]  matches  [[??x ??y][??y ??x]] =>
    ** <false>

    ;;; note new syntax for segment variables "=??"
    vars x,y;
    [[1 2][2 1]]  = [[=??x =??y][=??y =??x]] =>
    ** <false>

But equals gets it right:
    vars x,y;
    [[1 2][2 1]] equals [[=??x =??y][=??y =??x]], x, y =>


Apologies for long reply.

Aaron
===
Aaron Sloman, ( http://www.cs.bham.ac.uk/~axs/ )
School of Computer Science, The University of Birmingham, B15 2TT, UK
EMAIL A.Sloman AT cs.bham.ac.uk   (ReadATas@please !)
PAPERS: http://www.cs.bham.ac.uk/research/cogaff/
FREE TOOLS: http://www.cs.bham.ac.uk/research/poplog/freepoplog.html