Whitten@Fwva.Saic.Com (David Whitten) writes:
> Date: 12 Feb 93 17:00:52 GMT
> Organization: Science Applications Int'l, Corp. - Computer systems operation
>
> bssdmw@gdr.bath.ac.uk (D M Webster) writes:
> >I need either an algorithm or code to produce a non-redundant
> >series of permutations. What I mean by this is for example
> >given ABC I want the permutations for say groups of 2 letters
> >i.e. AB
> > AC
> > BC
> >
> > I don't want to consider BA, CA, or CB.
> >
> >As an added complication I want to be able to generate these
> >patterns for any number of combinations. That is given for example
> >given 6 items I may want unique patterns of 5,4,3 groups or any
> >number I choose.
> >
> >David Webster bssdmw@uk.ac.bath.gdr
>
> Classic problem. almost a homework problem. (is it?)
> It should be in the FAQ for comp.lang.prolog or comp.lang.pop, I have cross
> posted to them since I couldn't find it in their FAQ's....
>
> David (whitten@fwva.saic.com) US:(619)535-7764 [I don't speak as a company rep.]
I did not see the original request. But here's a Pop-11 version,
since people reading the cross-posted lists may wish to compare with
solutions in their favourite language. (For new readers: Pop-11 is a
Lisp-like language with a Pascal-like syntax. As in Common Lisp,
objects are typed, and types can be checked at run time, but
identifiers are not typed, except that you can declare procedure
identifiers to reduce run-time checks when procedures are called.)
The Pop-11 solution combinations(num, items) -> list;
-----------------------------------------------------
Define a recursive procedure -find_combinations- that takes
an integer -num- and a list -items- and returns a list of all the
non-redundant combinations of -num- items from the list, each
combination being a list of length -num-. This procedure will assume
length(items) >= num
Then define a procedure -combinations- that takes the same inputs
and produces the required list of lists, using the recursive
procedure, but after checking whether it should produce an error
because length(items) < num.
The combinations will all be in the same order as the items
in the original list, which is how non-redundancy is achieved.
It is assumed that the original list contains no redundancies, since
it would be easy to check and eliminate repetitions before calling
combinations(num, items).
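For readers comparing languages, that preliminary clean-up might look
like this in Python. It is only a sketch: the name unique_in_order is
invented for illustration, and it assumes the items are hashable
(numbers, strings, tuples), which a nested list such as [c] is not.

```python
def unique_in_order(items):
    """Return items with later repeats dropped, keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7 onwards)
    return list(dict.fromkeys(items))


print(unique_in_order(["a", "b", "a", "c", "b"]))   # ['a', 'b', 'c']
```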
It is also assumed that if num = 0 then the result should be a list
containing the empty list, namely [[]]. I.e. no matter how many
items there were in the original list, there's only one list of 0
items selected from the original list, namely the empty list.
(Anyone who takes a different view will have problems handling the
recursion below.)
I shall assume, because it makes the task easier, that there is no
requirement to share common subtails of lists. I'll return to that
possibility later.
A utility procedure
-------------------
First create a utility -prepend- which, when given an item and a
list of lists, makes one new list for each list in -lists-, consisting
of the item followed by the elements of that list. It will just leave
all its results on the stack, to be collected into a list by the
calling procedure.
define prepend(item, lists);
    lvars item, lists, list;
    for list in lists do [^item ^^list] endfor
enddefine;
;;; test it. This should produce three lists
prepend(1, [[a b] [b c] [c d]])=>
** [1 a b] [1 b c] [1 c d]
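For comparison, a rough Python counterpart of prepend. Python has no
user stack, so this sketch assumes a generator is a fair stand-in for
"leave the results for the caller to collect"; the caller wraps it in
list(...) much as the Pop-11 caller wraps the call in [% ... %].

```python
def prepend(item, lists):
    """Yield one new list per element of lists, each starting with item."""
    for lst in lists:
        yield [item, *lst]


# the caller decides how to collect the results
print(list(prepend(1, [["a", "b"], ["b", "c"], ["c", "d"]])))
# [[1, 'a', 'b'], [1, 'b', 'c'], [1, 'c', 'd']]
```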
The main recursive sub-routine
------------------------------
;;; Now use prepend in find_combinations, after the recursive call
define find_combinations(num, items) -> list;
    lvars num, item, items, sublists;
    if num == 0 then
        [[]] -> list
    else
        ;;; for each item in the list, make a list of all ways of
        ;;; combining it with combinations of length num-1 made from
        ;;; following items in the list, using a recursive call for
        ;;; the latter.
        ;;; start the list of stacked items.
        [%
            until items == [] do
                hd(items) -> item;      ;;; next item
                tl(items) -> items;     ;;; rest of list
                ;;; recurse, in order to get the list of lists for
                ;;; num-1 and remaining items
                find_combinations(num - 1, items) -> sublists;
                ;;; now create a list of all combinations of length num
                ;;; using item and all the num-1 combinations
                prepend(item, sublists)
            enduntil
            ;;; finish the list of stacked items
        %] -> list
    endif
enddefine;      ;;; find_combinations
Now the main procedure called by users
--------------------------------------
define combinations(num, items) -> list;
    lvars num, items, list;
    if num > listlength(items) then
        mishap('TOO FEW ITEMS IN LIST', [^num ^items]);
    else
        find_combinations(num, items) -> list
    endif
enddefine;      ;;; combinations
Testing the procedures
----------------------
;;; do some tests to check all the main cases
combinations(3, [a b]);
;;; MISHAP - TOO FEW ITEMS IN LIST
;;; INVOLVING: 3 [a b]
;;; FILE : /tmp/followup1x21566 LINE NUMBER: 99
;;; DOING : sysprmishap mishap combinations ....
combinations(0,[a b]) =>
** [[]]
combinations(1,[a]) =>
** [[a]]
combinations(1, [a b c]) =>
** [[a] [b] [c]]
combinations(2, [a b c d]) =>
** [[a b] [a c] [a d] [b c] [b d] [c d]]
combinations(3, [a b c d]) =>
** [[a b c] [a b d] [a c d] [b c d]]
And just to show that you can mix object types freely in Pop-11
lists:
combinations(3, [a 3.14 [c] 4]) =>
** [[a 3.14 [c]] [a 3.14 4] [a [c] 4] [3.14 [c] 4]]
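For readers who want to compare with another language, here is a rough
Python sketch of the same two procedures. It is a transliteration made
under assumptions, not part of the Pop-11 solution: ValueError stands in
for mishap, and list slicing stands in for tl. (Python's standard
library already provides itertools.combinations, which yields tuples in
the same order.)

```python
def find_combinations(num, items):
    """All combinations of num items, kept in the order of items."""
    if num == 0:
        return [[]]          # exactly one way to choose nothing
    result = []
    for i, item in enumerate(items):
        # combine item with every (num - 1)-combination of the later items
        for tail in find_combinations(num - 1, items[i + 1:]):
            result.append([item, *tail])
    return result


def combinations(num, items):
    """User-level entry point: check the arguments, then recurse."""
    if num > len(items):
        raise ValueError(f"too few items in list: {num} {items}")
    return find_combinations(num, items)


print(combinations(2, ["a", "b", "c", "d"]))
print(combinations(3, ["a", 3.14, ["c"], 4]))
```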
A more compact, slightly more efficient, but less readable version
of find_combinations follows:
define find_combinations(num, items) -> list;
    lvars num, item, items;
    if num == 0 then
        [[]] -> list
    else
        num - 1 -> num;
        [% until items == [] do
                dest(items) -> (item, items);
                prepend(item, find_combinations(num, items))
            enduntil
        %] -> list
    endif
enddefine;      ;;; find_combinations
Avoiding duplication of common subtails
---------------------------------------
The algorithms given above do not return lists that share common
"tails". E.g. if the result includes both [a c d] and [b c d] the
common tail [c d] would be created twice. This happens because
find_combinations recurses repeatedly in the "until ... enduntil"
loop, each time with shorter tails of the same list, and
therefore it can repeatedly recreate some lists of length num-1.
This can be shown by tracing prepend and repeating one of the tests:
trace prepend;
combinations(2, [a b c d]) =>
> prepend b [[]]
< prepend [b]
> prepend c [[]]
< prepend [c]
> prepend d [[]]
< prepend [d]
> prepend a [[b] [c] [d]]
< prepend [a b] [a c] [a d]
> prepend c [[]]
< prepend [c]
> prepend d [[]]
< prepend [d]
> prepend b [[c] [d]]
< prepend [b c] [b d]
> prepend d [[]]
< prepend [d]
> prepend c [[d]]
< prepend [c d]
> prepend d []
< prepend
** [[a b] [a c] [a d] [b c] [b d] [c d]]
Notice that there are two occurrences of this call
> prepend c [[]]
< prepend [c]
and three of this one
> prepend d [[]]
< prepend [d]
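The same repetition can be counted rather than read off a trace. The
Python sketch below is an illustration only: it assumes a
straightforward transliteration of the first find_combinations, and the
counter prepend_calls is an invented name that records each distinct
call of prepend.

```python
from collections import Counter

prepend_calls = Counter()    # (item, printed form of lists) -> call count


def prepend(item, lists):
    prepend_calls[(item, repr(lists))] += 1
    return [[item, *lst] for lst in lists]


def find_combinations(num, items):
    if num == 0:
        return [[]]
    result = []
    for i, item in enumerate(items):
        # the recursive call sits inside the loop, as in the first version
        result.extend(prepend(item, find_combinations(num - 1, items[i + 1:])))
    return result


find_combinations(2, ["a", "b", "c", "d"])
for (item, lists), count in prepend_calls.items():
    if count > 1:
        print(item, lists, count)
# c [[]] 2
# d [[]] 3
```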
This sort of repeated construction of the same list can be avoided
by doing the recursive call of find_combinations once at each
recursive level, then iterating down the result, using a new utility
-lists_without-, which, when given an item and a list of lists, goes
down the list till it finds the first one that doesn't start with
the item. It then returns the remainder of the list from there.
define lists_without(item, lists) -> lists;
    lvars item, list, lists;
    repeat
        returnif(lists == []);
        hd(lists) -> list;
        returnif(list == [] or item /= hd(list));
        tl(lists) -> lists;
    endrepeat
enddefine;
;;; test it
lists_without("a", [[]] ) =>
** [[]]
lists_without("a", [[a b][a c]] ) =>
** []
lists_without("a", [[a b][a c] [b c] [b d]] ) =>
** [[b c] [b d]]
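A rough Python equivalent of lists_without, for comparison. One
difference should be stated loudly: the Pop-11 version returns a tail of
the very same list, whereas slicing a Python list copies the outer list
(the inner lists are still the same objects).

```python
def lists_without(item, lists):
    """Drop the leading run of lists whose first element is item."""
    i = 0
    # stop at the end, at an empty list, or at a list not starting with item
    while i < len(lists) and lists[i] and lists[i][0] == item:
        i += 1
    return lists[i:]


print(lists_without("a", [[]]))                       # [[]]
print(lists_without("a", [["a", "b"], ["a", "c"]]))   # []
print(lists_without("a", [["a", "b"], ["a", "c"], ["b", "c"], ["b", "d"]]))
# [['b', 'c'], ['b', 'd']]
```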
;;; Now use it in a new, more economical, version of find_combinations
A more efficient definition of find_combinations
------------------------------------------------
define find_combinations(num, items) -> list;
    lvars num, item, items, sublists;
    if num == 0 then
        [[]] -> list
    else
        ;;; recurse once, outside the loop, to get lists of num - 1 items
        find_combinations(num - 1, tl(items)) -> sublists;
        ;;; now make all the lists of num items, avoiding redundancy
        [%
            for item in items do
                lists_without(item, sublists) -> sublists;
                unless sublists == [] then
                    prepend(item, sublists)
                endunless;
            endfor
        %] -> list
    endif
enddefine;      ;;; find_combinations
Compare its behaviour with prepend traced:
trace prepend;
combinations(2, [a b c d]) =>
> prepend b [[]]
< prepend [b]
> prepend c [[]]
< prepend [c]
> prepend d [[]]
< prepend [d]
> prepend a [[b] [c] [d]]
< prepend [a b] [a c] [a d]
> prepend b [[c] [d]]
< prepend [b c] [b d]
> prepend c [[d]]
< prepend [c d]
** [[a b] [a c] [a d] [b c] [b d] [c d]]
combinations(3, [a b c d]) =>
> prepend c [[]]
< prepend [c]
> prepend d [[]]
< prepend [d]
> prepend b [[c] [d]]
< prepend [b c] [b d]
> prepend c [[d]]
< prepend [c d]
> prepend a [[b c] [b d] [c d]]
< prepend [a b c] [a b d] [a c d]
> prepend b [[c d]]
< prepend [b c d]
** [[a b c] [a b d] [a c d] [b c d]]
Tracing find_combinations shows that it too is no longer called
repeatedly with the same inputs.
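Python lists are arrays, so they cannot share tails the way Pop-11
lists do. To compare the idea, the sketch below assumes a hand-made
linked representation: a chain is either None (empty) or a tuple
(head, tail). The helper to_list and the index variable start are
invented for the illustration; start plays the part of lists_without.

```python
def find_combinations(num, items):
    """Return a Python list of chains; a chain is None or (head, tail)."""
    if num == 0:
        return [None]                     # just the empty combination
    # recurse once, outside the loop
    sub = find_combinations(num - 1, items[1:])
    result = []
    start = 0
    for item in items:
        # skip the sub-combinations that begin with item
        while start < len(sub) and sub[start] is not None and sub[start][0] == item:
            start += 1
        for tail in sub[start:]:
            result.append((item, tail))   # new cell, existing tail reused
    return result


def to_list(chain):
    """Flatten a chain into an ordinary Python list, for printing."""
    out = []
    while chain is not None:
        head, chain = chain
        out.append(head)
    return out


combos = find_combinations(2, ["a", "b", "c", "d"])
print([to_list(c) for c in combos])
# the tails of [a d], [b d] and [c d] are one and the same object
print(combos[2][1] is combos[4][1] is combos[5][1])   # True
```

Because tuples cannot be modified in place, sharing them is harmless in
this sketch; the risk only appears with updatable cells.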
There is now no redundancy in recursive calls, and common subtails
are shared (which may or may not be desirable!). Here's a
demonstration showing that if one tail is changed, the others with
the same elements change too.
untrace prepend;
vars newlist = combinations(2, [a b c d]);
newlist =>
** [[a b] [a c] [a d] [b c] [b d] [c d]]
;;; change the last element of the third list
1 -> last(newlist(3));
;;; check what has changed
newlist =>
** [[a b] [a c] [a 1] [b c] [b 1] [c 1]]
I.e. all tails containing that last element have changed. In some
programs that would be a serious bug. So the efficient solution is
not always the one required.
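Where a program does need to update the results independently, each
combination can be copied separately before use. A small Python
illustration, assuming nested mutable lists as a stand-in for shared
tails (the variable names are invented):

```python
import copy

tail = ["d"]
shared = [["a", tail], ["b", tail], ["c", tail]]

# Copy each combination on its own. A single deepcopy of the whole
# outer list would preserve the sharing inside the copy.
separate = [copy.deepcopy(combo) for combo in shared]

shared[0][1][0] = 1
separate[0][1][0] = 1

print(shared)     # [['a', [1]], ['b', [1]], ['c', [1]]]
print(separate)   # [['a', [1]], ['b', ['d']], ['c', ['d']]]
```

The copying costs exactly the extra storage that sharing had saved.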
Non-list solutions
------------------
The original poster mentioned groups of letters, without saying what
sorts of data structures were to be used. If they were strings or
words, instead of lists, the same sort of technique could be used,
except that there would normally be a far bigger garbage collection
overhead, because you'd be creating temporary structures then
discarding them. In the solutions given above, nothing is created
temporarily then discarded.
In a language without lists, it would be possible, using strings
(e.g. arrays of characters) to avoid creating garbage by
precomputing the number of strings of length num required, creating
them all and passing an array of them to all the sub-routines. The
algorithm would be a bit messier, but not conceptually very
different. E.g. prepend would have to be given a list or array of
all the strings and an index saying at which location the item
should be inserted.
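One way the preallocating approach might look, sketched in Python with
bytearrays standing in for fixed-length strings. Everything here is an
assumption made for illustration: the procedure name fill, its
parameters, and the restriction to ASCII letters. Each call receives
the whole array of rows plus a column index, and writes in place.

```python
from math import comb


def fill(out, row, pos, num, items, start):
    """Fill column pos onwards of rows out[row:], choosing from
    items[start:]; return the index of the next unused row."""
    if pos == num:
        return row + 1                    # one row is now complete
    # leave enough later items to fill the remaining columns
    for i in range(start, len(items) - (num - pos) + 1):
        first = row
        row = fill(out, row, pos + 1, num, items, i + 1)
        for r in range(first, row):       # "prepend" by writing in place
            out[r][pos] = items[i]
    return row


def combinations(num, letters):
    data = letters.encode("ascii")
    # precompute how many rows are needed and create them all at once
    out = [bytearray(num) for _ in range(comb(len(data), num))]
    fill(out, 0, 0, num, data, 0)
    return [bytes(row).decode("ascii") for row in out]   # for display only


print(combinations(2, "abc"))    # ['ab', 'ac', 'bc']
```

Only the final conversion back to strings creates new objects; the
recursion itself writes into the rows allocated at the start.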
Aaron
--
Aaron Sloman,
School of Computer Science, The University of Birmingham, B15 2TT, England
EMAIL A.Sloman@cs.bham.ac.uk OR A.Sloman@bham.ac.uk
Phone: +44-(0)21-414-3711 Fax: +44-(0)21-414-4281