Date:Wed, 25 Feb 2004 20:19:44 +0000 (UTC) 
Subject:Re: counting elements in lists 
From:Aaron Sloman 

Ian,

Please try to avoid posting in html in addition to plain text. It
clutters up email lists and news groups. Plain text is best.

Thanks.

> hi, i have a list within lists i.e. [[N V N ADJ] [N V V ADJ] [N ADV N
> V]] (please ignore the grammatical meanings!) and i need to assess which
> element occurs most often for each position, so return one list.

David Young's solution depends on an elegant intermediate operation which
replaces a list of N K-element lists with a list of K N-element lists where
the I'th list contains all the components in position I in the original
list. Using a procedure that finds which element occurs most often in the
I'th list, he then finds which element is most common in each of the K
lists, and that's the required answer.
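
The intermediate operation can be sketched roughly as follows. This is
not David's code: transpose_lists is a hypothetical name used only for
illustration, and this simple version builds new lists rather than
re-using list links.

define transpose_lists(list, K) -> columns;
    ;;; hypothetical helper: given a list of N K-element lists, return
    ;;; a list of K N-element lists, where the Ith new list holds the
    ;;; items found in position I of each original list
    lvars index, item;
    [%
        for index to K do
            [% for item in list do item(index) endfor %]
        endfor
    %] -> columns;
enddefine;

transpose_lists([[N V N ADJ] [N V V ADJ] [N ADV N V]], 4) =>
** [[N N N] [V V ADV] [N V N] [ADJ ADJ V]]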

An alternative is to count occurrences, using properties, as David
suggests.

Here are some fragments of that alternative.

define count_items(list, K) -> props;
    ;;; given a list of K-element lists, return a list of K properties
    ;;; (hash tables), props, where the Ith property associates with
    ;;; every item the number of times it occurs in position I in one of
    ;;; the elements of list. Items never found have the value 0 without
    ;;; needing an explicit entry in the property.

    lvars
        ;;; counter for 1 to K
        index,

        ;;; a rough guess at the number of different symbols in each
        ;;; position: 1 + half the length of the list. This could instead
        ;;; be an extra parameter for the procedure.
        propsize = 1 + (listlength(list) >> 1);

    ;;; make the K properties, with default value 0
    [%
        for index to K do
            newproperty([], propsize, 0, "perm")
        endfor
    %] -> props;

    ;;; Go down the list counting occurrences
    lvars item, entry, prop;
    for item in list do
        for index to K do
            props(index) -> prop;
            item(index) -> entry;
            prop(entry) + 1 -> prop(entry)
        endfor;
    endfor;

enddefine;

vars props =
    count_items([[N V N ADJ] [N V V ADJ] [N ADV N V] [P P P P] [Q Q Q Q]], 4);

props =>
** [<property> <property> <property> <property>]

;;; datalist returns the entries of a property as a list, in arbitrary order
datalist(props(1)) =>
** [[N 3] [P 1] [Q 1]]

datalist(props(2)) =>
** [[V 2] [P 1] [Q 1] [ADV 1]]

datalist(props(3)) =>
** [[P 1] [N 2] [Q 1] [V 1]]

datalist(props(4)) =>
** [[P 1] [ADJ 2] [Q 1] [V 1]]

Then, to find the most common element in position I, apply the
following procedure to the I'th property:

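define commonest(prop) -> item;
    ;;; return the item with the highest count in the property prop.
    ;;; If several items tie, which one is returned depends on the
    ;;; (arbitrary) order in which appproperty visits the entries.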

    lvars
        ;;; item with highest count so far
        item = undef,
        ;;; highest count so far
        count = 0;

    appproperty(prop,
        procedure(entry, val);
            if val > count then
                entry -> item; val -> count
            endif;
        endprocedure);

enddefine;

commonest(props(1)) =>
** N
commonest(props(4)) =>
** ADJ

Then finish with code that uses all the above, applying commonest
(which uses appproperty to examine all the entries in a property)
to each property in turn, thus:

vars I;
[%for I from 1 to 4 do
    commonest(props(I))
endfor%] =>
** [N V N ADJ]
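
If you want that as a single procedure, something like the following
should do. It is only a sketch: the name commonest_in_each_position is
made up for this message, and it assumes count_items and commonest as
defined above.

define commonest_in_each_position(list, K) -> result;
    ;;; build the K properties, then collect the commonest item
    ;;; from each one, in order of position
    lvars prop;
    [%
        for prop in count_items(list, K) do
            commonest(prop)
        endfor
    %] -> result;
enddefine;

commonest_in_each_position([[N V N ADJ] [N V V ADJ] [N ADV N V]], 4) =>
** [N V N ADJ]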

David's method is more elegant and probably just as good for a
short list. If the original list is very long his method will
create another long list and then count things, so that it
takes a temporary list as big as the original and traverses
everything twice, though he has gone to a lot of trouble to
re-use list-links.

Using properties and counting things on the fly as you traverse
the original list could save a lot of space and time if
N is very much larger than K and the number of symbols in each
position is small.

> In general is there an easy way to do this, like a built in
> pop11 procedure that i havent managed to find?

This is one of a vast number of things that could be done with
lists. Pop11 attempts to provide the most common, e.g. applist,
maplist, sort, reverse, length, member, oneof, shuffle, flatten,
last, allbutfirst, allbutlast, and others
(see REF LISTS, REF DATA, and chapter 6 of the primer).
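
A few of these in use, just to give the flavour (the example lists are
made up for illustration):

maplist([1 2 3], procedure(x); x * 2 endprocedure) =>
** [2 4 6]

sort([3 1 2]) =>
** [1 2 3]

last([a b c]) =>
** c

allbutfirst(1, [a b c]) =>
** [b c]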

This one is probably too specialised, though if it were frequently
required it could easily be added to the library. It's better to
teach people how to create such things as required than to try to
anticipate every possible useful list-processing procedure.

There are several more such examples in the exercises on list
processing in TEACH SETS and TEACH SETS2, available here:
    http://www.cs.bham.ac.uk/research/poplog/teach/sets
    http://www.cs.bham.ac.uk/research/poplog/teach/sets2
        E.g. see the 'subculture' problem.

Aaron
====
Aaron Sloman, ( http://www.cs.bham.ac.uk/~axs/ )
School of Computer Science, The University of Birmingham, B15 2TT, UK
EMAIL A.Sloman AT cs.bham.ac.uk   (Read AT as @ please!)
PAPERS: http://www.cs.bham.ac.uk/research/cogaff/ (And free book on Philosophy of AI)
FREE TOOLS: http://www.cs.bham.ac.uk/research/poplog/freepoplog.html