HELP SUMMARISE                                    Aaron Sloman June 1999
                                          Based on work by Riccardo Poli

(DRAFT HELP FILE. May be improved)

LIB SUMMARISE

This program designed and implemented primarily by Riccardo Poli in
October 1996, and slightly modified by Aaron Sloman in June 1999, can
read in a text file, digest it, form an impression of the style, and
spew out a "crazy" summary, written in a parody of the the style of the
original.


CONTENTS

 -- Introduction
 -- summarise_file(file, max_word_count);
 -- Examples
 -- ENTER summarise <number>
 -- ENTER summarise
 -- Invoking the program from the Unix command line
 -- Global variable max_input_words (default 20000 );

-- Introduction -------------------------------------------------------

To make it available load the library:

    uses summarise

The program can be used in three different modes all desrcibed below,
namely:

    o by invoking the procedure summarise_file
    o by giving a Ved command
    o by invoking Pop-11 as a unix command

The library makes available a number of procedures, one for use
directly, e.g. in a procedure of your own which selects a file to
summarise, and one which defines a VED command: ENTER summarise,
described later. The following is the main procedure


-- summarise_file(file, max_word_count);

    file can be either a string naming a file, or a character
    repeater (e.g. vedrepeater, or the output of discin)

    This reads in the file (or the text input stream defined by
    the character repeater) and analyses some statistical properties
    of the file (storing for each word the frequencies with which other
    words follow it in the file). It then prints out (using
    cucharout) a new file containing max_word_count words, in
    the same "style" as the original, making use of the frequency
    information. The result can be an amusing parody of the
    original, though in general it will not make much sense and
    may include ungrammatical sentences.


-- Examples

Try these commands (using ESC d) to compile them. The summary will be
printed into the file output.p, so you will probably want to clear that
file before you start each command. (Edit output.p, then do ENTER
clear):

    summarise_file('$usepop/pop/teach/respond', 300);

    summarise_file('$usepop/pop/teach/aithemes', 300);

    summarise_file('$usepop/pop/doc/continuation', 300);

You can also run this with some of your own files.

The output of summarise_file is a bit like an essay written by a student
who has memorised many short phrases and technical terms and can
generate them fluently without really understanding what they mean.


-- ENTER summarise <number>
-- ENTER summarise

The ENTER summarise command can be used to produce a summary of the
file you are currently reading in VED. The <number> is used as the
second argument to summarise_file. If you don't give the number
it defaults to 1000 words.

The output is printed into a Ved file called SUMMARY, in non-writable
mode. So if you already have a file called SUMMARY in the editor, you
should save the file and "quit" it.

For instance, try it now, giving this command to produce a
summary of the file you are now reading.

    ENTER summarise 400

Alternatively look at a teach file, e.g.
    ENTER teach primer

Then
    ENTER summarise 500


-- Invoking the program from the Unix command line

If your Pop-11 system is set up properly you can also start the program
as a unix command, in this form:


    pop11 summarise <filename> <words>

E.g. try the following Unix command a few times

    pop11 summarise $usepop/pop/teach/eliza 400

You can give that command do the shell in an xterm window.


-- Global variable max_input_words (default 20000 );

If the file you wish to summarise is very long, you may not wish to wait
for the program to read it all in. There is a global variable
max_input_words whose value shold be an integer, and which defaults to
20000. After reading this number of words from the input file the
program stops. You can try increasing or decreasing this number
to see whether it improves the summaries, when dealing with a long input
file.


--- $poplocal/local/help/summarise
--- Copyright University of Birmingham 1999. All rights reserved. ------
