Further information on built in data types

The following sections give more information on the data types listed above. Here they are grouped according to their purpose, rather than being presented alphabetically. Where there is special syntax for creating instances, this is illustrated. Some of the main procedures associated with the data-type are also listed. Most of the generic procedures described previously are not mentioned again. Online HELP or REF files giving further information are listed. In many cases further information is given in other parts of this primer.

Words

Words are structures corresponding to a sequence of characters. They can be used in one of three main sorts of roles:

o data, i.e. objects operated on by programs, e.g. in lists

o syntax words like "if", "define", "endwhile" which form portions of programs and define the program structure

o names of objects, i.e. variable or constant identifiers

They can occur implicitly in programs as part of the code, or they can be explicitly denoted by quoted word expressions like these:

     "cat",  "dog",  "***",  "a",  "xxx_yyy",  "(",  "!+*+*+!",  """
     "'a long mixed character word including spaces %$%$%3333!)(.;;'"

Expressions denoting words have strict formation rules sketched previously. Normally quoted words and program text words such as variable names cannot contain spaces, or mixtures of characters of different types. However there are procedures which can construct words with arbitrary combinations of characters, and arbitrary sequences of characters can be made into a quoted word by enclosing them in string quotes with surrounding word quotes (as in the last example above). This mechanism is not available for using variable names with spaces or illegal mixtures of characters.

Words can be thought of as structures that include a string and other information. In addition they are "standardised" in a dictionary, as described below, unlike strings.

The internal representation of words

A word is represented internally by a special record which includes information about the characters making up the word (which can be accessed by applying the procedure word_string to the word).

If the word is being used as a program identifier (as opposed to merely being a data item in a list, for example) then the word record includes a pointer to an ident record (described below) giving information about the syntactic properties and associated value. This pointer can be changed depending on what the current section is. If the word has a syntactic role then the following procedures can be applied to it to discover what the properties are:

    identprops(word)
        Returns "undef" for word with no syntactic role. Otherwise
        returns the numeric precedence, or one of "macro", "syntax",
        "syntax N" as described in REF IDENT

    full_identprops(word)
        Returns "undef" for an undeclared word, or a list of all the
        keywords used in the declaration of word, e.g. "global",
        "constant", "protected" etc.  See REF IDENT

    identof(word)
        Returns or updates the global (permanent) identifier currently
        associated with the word (see below). This association will
        vary according to which section is 'current'.
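
As a small illustration of identprops, the following sketch applies it to a syntax word, an infix operator and an ordinary variable. The variable name my_var is invented for this example, and the results shown assume the standard system declarations (in which "+" has precedence 5).

    vars my_var = 3;
    ;;; a syntax word
    identprops("if") =>
    ** syntax
    ;;; an infix operator: the result is its numeric precedence
    identprops("+") =>
    ** 5
    ;;; an ordinary variable
    identprops("my_var") =>
    ** 0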

The Pop-11 dictionary

Words play a crucial role in Pop-11. Most of the items in a program text stream are words, including syntax words like "if", "define", "lvars", and also user-defined variable or procedure names, like "list", "rooms", "x_axis", etc. In order to be able to tell quickly whether a word is one that is already known, Pop-11 keeps all words, including both system words and words introduced by the user, in a single global dictionary, which makes use of hash-coding on the characters of the word for rapid access. This makes it very easy to check whether a new sequence of characters corresponds to a currently known word. If so, the existing word record is used. If not, a new word record is created and entered in the dictionary.

That explains why two occurrences of a word expression for the same word will return the very same (i.e. identical) word, unlike two occurrences of a string expression, as shown when the strict equality predicate "==" is used for comparison:

    "cat" == "cat" =>
    ** <true>
    'cat' == 'cat' =>
    ** <false>

The dictionary in Pop-11 corresponds roughly to the symbol table of a conventional programming language, like Pascal or C, except that in those languages the symbol table is used at compilation and link time but is not normally required during program execution, whereas the incremental compiler in Pop-11 requires the dictionary to be available at all times, not merely during a compilation phase prior to execution of programs. It is also required because running programs can create words, e.g. using consword or <>.

The dictionary is not a structure that is accessible to users. However the procedure appdic can be given a procedure which it will apply to every word in the dictionary. For example the following expression will create an alphabetically sorted list of all the words in the dictionary, typically a list of several thousand items:

    sort( [% appdic(identfn) %] )

Additional procedures that operate on words include consword, destword, isword, subscrw, subword, word_string, sys_current_ident, and the concatenator <>.
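
A few of these procedures can be tried out as follows. This is only a sketch: the variable name w is invented for the example, and the printed results assume that pop_pr_quotes has its default value, false.

    ;;; build a word from a string: the dictionary returns the existing word
    vars w = consword('cat');
    isword(w) =>
    ** <true>
    w == "cat" =>
    ** <true>
    ;;; the string of characters held in the word record
    word_string("cat") =>
    ** cat
    ;;; the characters of the word as integers, followed by their number
    destword("cat") =>
    ** 99 97 116 3
    ;;; concatenating two words creates (or finds) another word
    "cat" <> "dog" =>
    ** catdog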

See REF WORDS and REF IDENT for further information.

Strings

Examples of string expressions:

    'a',  '21385d73::;+*)(&%',  'string with spaces'

A string is a vector of characters (8-bit integers). Relevant procedures include inits, consstring, deststring, isstring, subscrs, ><, explode.

Non-printing characters can be represented in strings using special conventions analogous to those used in C string expressions, as explained previously, e.g.

    \s  = a space
    \t  = a tab
    \n  = a newline
    \r  = the return character (ascii 13)

A string expression like 'cat' causes a corresponding new string to be created by the lexical analyser each time such an expression is read in while programs are being compiled. Thus a list like the following will contain two strings with similar contents:

    ['cat' 'cat']

whereas the following list will contain two pointers to the very same word record, namely "cat":

    [cat cat]

Moreover, as shown above, the strict equality test on two strings created at different times will return false, because the strings will be different items in the machine's memory.
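
The difference between identity and similarity of strings can be checked with the following sketch, in which the variable names s1 and s2 are invented for the example. The strict test "==" compares the items themselves, whereas "=" compares their contents.

    vars s1 = 'cat';
    vars s2 = 'cat';
    ;;; two separately created strings are different objects
    s1 == s2 =>
    ** <false>
    ;;; but they have the same characters
    s1 = s2 =>
    ** <true>
    ;;; a string is always identical with itself
    s1 == s1 =>
    ** <true>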

In Pop-11 there is no table containing all strings, like the dictionary containing all the words.

If a string expression occurs inside a procedure, e.g. in an assignment like this

    'The string' -> string;

then the string will be created only once, at compile time, rather than a new string being created each time the procedure is run. This is unlike list expressions and vector expressions which actually plant instructions to create a new instance.

Strings are useful for storing sets of small positive integers (i.e. requiring no more than 8 bits per integer). They are also often used to create text for printing. The string concatenator operator "><" is often handy for creating a string that includes the characters that would normally be used for printing some other object that is not a string. For example if you concatenate the empty string with a number the result is a string that looks like the number. However, if pop_pr_quotes is true the concatenated string will include spurious string quote characters, which can be suppressed by using sys_>< instead of ><, thus:

    false -> pop_pr_quotes;
    vars numstring = 12345 >< '';
    numstring =>
    ** 12345
    true -> pop_pr_quotes;
    numstring =>
    ** '12345'
    vars newstring = 12345 >< '';
    newstring =>
    ** '12345'''
    ;;; Use sys_>< to suppress the effects of pop_pr_quotes true
    vars laststring = 12345 sys_>< '';
    laststring =>
    ** '12345'
    false -> pop_pr_quotes;
    newstring =>
    ** 12345''

The empty string is often very useful, so the built in identifier nullstring is provided with an empty string as its value.

The Poplog editor VED represents each file as a vector of strings and dstrings (described below). Empty lines are represented by nullstring.

For more information see:

    HELP STRINGS, HELP ASCII

Dstrings

Since version 14.2, the Poplog editor VED can use a more complex representation for characters in strings. These use a special datatype known as "dstrings" (or Display Strings) which support characters with different attributes, such as bold, italic, or underlining. Ultimately dstrings will be able to support multiple fonts.

These are described in REF STRINGS/Dstrings.

The VED procedure ved_chat, for modifying CHaracter ATtributes in a VED file, is described in REF VEDCOMMS.

Idents (identifier records)

When a word is used as a syntax word or a variable or constant identifier, a new structure is associated with it which defines its role in Pop-11. This structure is a special record, known as an ident (or identifier). The record contains the following fields:

o An idval field for holding the value of the identifier if there is one. The procedure valof, applied to a word, accesses this field of the corresponding ident. So does any program code that accesses or updates the value of a variable.

o Type information: the identtype. E.g. an ident may be restricted to take only procedure values.

o A flag indicating whether the variable is "active" and if so what its multiplicity is.

o The identprops, which determines the syntactic properties used by the Pop-11 compiler when program text is being compiled.

o A flag indicating whether the identifier is lexical or permanent

At present the ident does not specify the word that identifies it. So more than one word can share the same identifier, i.e. they can function as synonyms in programs.

The syntax word "ident" is available for accessing the ident currently associated with a word, thus:

    vars list1 = [a b c];
    ident list1 =>
    ** <ident [a b c]>

    ident define =>
    ** <ident <procedure define>>

Note that the standard printing routine merely shows the idval field of the ident.

A word may be associated with different idents in different sections. This is why not all the information relevant to the role of a word is held in the word record itself.

Procedures concerned with idents include: consident, isident, idval, identprops, nonactive_idval, sys_current_ident, word_identifier.

The last procedure is used to create words that bypass the section mechanism so that they are guaranteed always to have the same identifier associated with them.
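
The relation between a word, its ident and the value of the variable can be seen in the following sketch, which uses idval and valof on a variable declared with vars. The variable name list1 is just an example.

    vars list1 = [a b c];
    ;;; idval accesses the value field of the ident
    idval(ident list1) =>
    ** [a b c]
    ;;; updating the idval changes the value of the variable
    [d e] -> idval(ident list1);
    list1 =>
    ** [d e]
    ;;; valof goes from the word to the value via the ident
    valof("list1") =>
    ** [d e]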

Types of numbers in Pop-11

There are several sorts of numbers in Pop-11: integers, big integers, decimals (called floats in some languages), ddecimals (sometimes called long floats), ratios, and complex numbers. The formats for typing in numbers are fully specified in REF IDENT, and the internal representation of numbers of different sorts and procedures available for operating on them are described in REF NUMBERS. What follows provides a summary of this information.

Integers and Bigintegers

Integers are `simple' Pop-11 objects not represented by pointers. They occupy one machine word, typically a 32 bit word, though 2 bits are normally used for type identification in Pop-11 and therefore only 30 bits are left for the integer value. Examples are:

    66,  -33,  99999,    -12348888, 0

Bigintegers are structures represented in the heap and can be arbitrarily large. Some examples are:

    12345678900980980911,   2**40,   -99999999999999999999999

The only limit to the size of a biginteger is the (virtual) memory available in the machine.

Floating point numbers: decimals and ddecimals

In Pop-11 there are two kinds of floating point numbers, decimals and ddecimals, which differ only in their degree of precision. Normally the results of floating point calculations in Pop-11 are decimals (single precision floats). However, by making the value of the system variable popdprecision true, instead of false, which is its default, the relevant procedures are changed to return ddecimals. This increases the precision of floating point computations, but can use a lot of temporary storage space in the heap, causing garbage collections to occur (or extra paging on a machine with a small amount of physical memory.)

Decimals, like integers, are `simple' Pop-11 objects not represented by pointers. Each occupies a single machine word, apart from the two bits needed for type information. Thus decimals in Pop-11 are typically restricted to 30 bit precision. Examples are the following.

    66.0,  -33.0,  77.35,  9999.532,  -6666.0,  0.0s0,   5.5s-5

The last example represents

    5.5 * (10 ** -5), i.e.  0.000055

Strictly only the last two examples will create a single precision decimal if typed in to Pop-11, since normally floating point constants are read in as ddecimals irrespective of the value of popdprecision. The last two examples use `s' to force creation of a single precision float.

Examples of expressions denoting ddecimals are

    0.0,  -9999.5, 12345.678,  0.00000001,  1.5d5,   1.5d-5,   1.5e5

In the last three examples the letters "d" and "e" are used interchangeably to indicate the exponent, in contrast with the "s" of single precision floats. For example

    123.45d5   is the double precision number  12345000.0
    123.45d-5  is the double precision number  0.0012345

The last example may print as `0.001234' because the global variable pop_pr_places, which controls the number of decimal places printed, defaults to 6, as shown here:

    0.123456789 =>
    ** 0.123457

As explained above, decimals and integers use single precision arithmetic, and are represented entirely within a single word of memory, usually using 30 bits, as the remaining two bits are required to distinguish pointers, integers and decimals. By contrast ddecimals use double precision arithmetic, and will be created when the value of the global Pop-11 variable popdprecision is non false.

All ddecimals take the same amount of memory space (which depends on the current implementation, but is typically three 32 bit words, one word being used to point to the ddecimal key, and the other two to hold the number.) Thus ddecimals have limited precision, though it is greater than the precision of decimals. Bigintegers are more complex structures that are unlimited in size and therefore unlimited in precision. The same applies to ratios.

Ratios use indefinite precision

Ratios represent the ratio of two integers or two bigintegers, and can be used for very high precision arithmetic. Examples of ways of representing ratios are

    3_/4,  12345_/54321,  -33_/44

In this form a ratio expression will be read as one item, as can be shown by enclosing them in list brackets and printing out the list:

    [3_/4  12345_/54321  -33_/44] =>
    ** [3_/4 4115_/18107 -3_/4]

Unlike this

    [3/4  12345/54321  -33/44] =>
    ** [3 / 4 12345 / 54321 -33 / 44]

The division of two integers will normally produce either an integer, or a biginteger, or a ratio, in the case where the division is not exact. E.g.

    10/3 =>
    ** 10_/3

Programs that involve inexact division of integers will produce ratios, and the computations will be exact, without the loss of precision involved in the use of decimals and ddecimals. However, high precision ratios, like bigintegers, can take up a lot of space, and if many of them are created, they will require temporary storage space, and the garbage collector may be invoked more often than expected. Fortunately the Poplog garbage collector is very fast. Also the frequency of garbage collections can be reduced by using the variables popmemlim and popminmemlim to expand the heap space so that space runs out less often.
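
The exactness of ratio arithmetic can be illustrated with a few expressions. This sketch assumes the default settings of pop_pr_ratios (true) and pop_pr_places (6). Note that a ratio is always reduced to its lowest terms, and becomes an integer again if the denominator reduces to 1.

    2/4 =>
    ** 1_/2
    1/3 + 1/6 =>
    ** 1_/2
    ;;; multiplying back gives the exact integer
    (10/3) * 3 =>
    ** 10
    ;;; by contrast, a floating point operand gives a floating point result
    1.0/3 =>
    ** 0.333333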

The format shown above can make it difficult to take in differences between different ratios. It is possible to get Pop-11 to print out ratios as if they were floating point numbers, which is sometimes more convenient, though potentially misleading. This is done by making the global variable pop_pr_ratios false.

    false -> pop_pr_ratios;
    10/3 =>
    ** 3.333333
    true -> pop_pr_ratios;
    10/3 =>
    ** 10_/3

Complex numbers

Complex numbers are represented in Pop-11 by records that hold the real part and the imaginary part. The real and imaginary parts can be integers, ratios, decimals or ddecimals, though both must be of the same type. These records can be created using the two infix operators +: and -:, where users are invited to think of the colon as an approximate depiction of "i", the square root of -1. Thus, for example:

    0 +: 1 =>
    ** 0_+:1

    3 -: 5 =>
    ** 3_-:5

Complexes are printed out in a form in which they can be typed in as single items, using expressions that start with the real part, followed by an underscore "_", followed by "+:" or "-:" followed by the imaginary part. For example the following is a list of two complex numbers (notice how the integer values are coerced to floats where necessary)

    [ 3_+:2.0  -3_-:4 ] =>
    ** [3.0_+:2.0 -3_-:4]

There are many built in mathematical functions that are capable of taking complex numbers as arguments and/or returning them as results. E.g. attempting to compute the square root of -1, or the logarithm of a negative number produces a complex result:

    sqrt(-1) =>
    ** 0.0_+:1.0

    log(-22.5) =>
    ** 3.113515_+:3.141593

Procedures that are specifically concerned with complex numbers include the two operators mentioned above, and conjugate, destcomplex, realpart, imagpart, iscomplex.
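
Some of these procedures are illustrated in the following sketch, where the variable name z is invented for the example and both parts of the complex number are integers.

    vars z = 3 +: 4;
    iscomplex(z) =>
    ** <true>
    realpart(z), imagpart(z) =>
    ** 3 4
    ;;; the conjugate has the sign of the imaginary part reversed
    conjugate(z) =>
    ** 3_-:4
    ;;; destcomplex puts the real part and the imaginary part on the stack
    destcomplex(z) =>
    ** 3 4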

For more information see REF NUMBERS

Recognizers for number types: integral, rational, decimal, complex

A number is described as "integral" if it is an integer or a biginteger. It is described as "rational" if it is integral or a ratio. It is described as decimal if it is a decimal or a ddecimal. There are various recognizer procedures for detecting the different number classes:

    isinteger, isbiginteger, isintegral, isratio, isrational,
    isdecimal, issdecimal, isddecimal, isreal, iscomplex, isnumber
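
The following sketch shows some of the recognizers at work. The biginteger example uses 2**100, which is too large to be a simple integer whatever the word size of the machine.

    isinteger(66), isbiginteger(2**100), isintegral(2**100) =>
    ** <true> <true> <true>
    isratio(3_/4), isrational(66), isrational(3_/4) =>
    ** <true> <true> <true>
    ;;; floating point numbers are decimal but not rational
    isdecimal(1.5), isrational(1.5) =>
    ** <true> <false>
    iscomplex(3 +: 4), isnumber("cat") =>
    ** <true> <false>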

Reading in numbers relative to a base

Numbers may be represented externally relative to a base, though the internal representation is not changed thereby. The base is indicated by an integer followed by a colon, preceding the number itself. Thus, binary numbers are represented with the prefix `2:'. So:

    2:100 is the same as 4
    2:1011 is the same as 11
    8:101 is the same as 65
    2:1.1 is the same as 1.5
    8:1.1 is the same as 1.125
    16:1FFA represents 8186 as a hexadecimal number.

Note that the prefix `10:' is redundant. 10:999 = 999

For more on notations for numbers see REF ITEMISE, or Chapter 5, below.
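
Since the base prefix affects only how a number is read in, the numbers in the table above print in the usual decimal form when typed in to Pop-11, e.g.

    2:1011, 8:101, 16:1FFA, 2:1.1 =>
    ** 11 65 8186 1.5
    10:999 = 999 =>
    ** <true>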

Additional information about the representation of numbers inside the machine is also given in chapter 5, below.

Characters (8 bit integers)

A character in Pop-11 is represented as a non-negative 8 bit integer (i.e. an integer between 0 and 255) according to the standard ASCII conventions (except in dstrings, where characters carry more information corresponding to font characteristics). The printing characters correspond to the integers between 33 (the exclamation mark character) and 126 (the tilde character). Users of Pop-11 do not need to remember the mapping from integers to characters, since character quote symbols can be used to represent the integers corresponding to a character. Examples are the following, where the last two examples correspond to the backslash character and the character quote character. (The spacing in the printed result has been stretched to show the correspondence.)

       `!`, `a`, `B`, `0`, `9`, `(`, `\s`, `\t`, `\n`, `\\`, `\`` =>
    ** 33   97   66   48   57   40   32    9     10    92    96

The file HELP ASCII gives full details on character codes, including how to represent non-printing characters, such as control characters.

NOTE: `a` is a character, whereas 'a' is a string.

Characters are not really a distinct Pop-11 datatype, as they are (at present) simply 8 bit integers. So they have no key or dataword of their own.
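
Because characters are simply integers, they can be used in arithmetic and given to procedures that build strings, as in this short sketch:

    `a` =>
    ** 97
    `a` + 1 =>
    ** 98
    isinteger(`a`) =>
    ** <true>
    dataword(`a`) =>
    ** integer
    ;;; three characters and a count make a string
    consstring(`c`, `a`, `t`, 3) =>
    ** cat
    ;;; the first element of a string is a character, i.e. an integer
    subscrs(1, 'cat') == `c` =>
    ** <true>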

Booleans (true and false)

In Lisp the empty list is treated as false and everything else as true. In C and many other languages the number 0 is treated as false and everything else as true. The original version of Pop2 followed the latter convention, but when Pop-11 was designed it was decided that the introduction of a boolean data type was desirable as too many obscure bugs could follow from treating the empty list or 0 as false.

Two built in identifiers are provided to refer to the two boolean values:

    true, false =>
    ** <true> <false>

Many other expressions are capable of denoting boolean values, e.g. the result of applying a recognizer procedure to an arbitrary object, or the result of applying an equality test to two objects or an arithmetical comparison to two numbers. Here are several examples of boolean-valued expressions:

    true,  false,   66 == 66,  66 == 99,  77 < 33, "cat" = "dog",
    isinteger(true), isboolean(99), isword("cat"), isstring('cat'),
    member(3, [a b c d])

In addition to recognizers and comparison predicates there are a few operators designed specifically to operate on boolean values, namely:

    not, and, or

These are used for forming complex conditions for conditional and looping instructions. All of these treat any non-false object as if it were true. An "and" expression returns false or the value of its last argument, and an "or" expression returns its first non-false argument, or false if there is none. E.g.

    not(99) =>
    ** <false>
    not(false) =>
    ** <true>
    true and false =>
    ** <false>
    true and "cat" =>
    ** cat
    false or 99 =>
    ** 99
    99 or false =>
    ** 99
    false or not(true) =>
    ** <false>

Strictly "and" and "or" are not infix procedure names, but syntax words, as they prevent their second argument being evaluated if the first argument suffices to determine a value. This can mean that the first argument can be used as a "guard" against an error in the second, e.g.

    false and ("cat" + "dog") =>
    ** <false>
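
The same guarding technique is often used inside procedures. In the following sketch the procedure name safe_hd is invented for the example. It returns the first element of a non-empty list, and false for anything else, because hd is applied only if the ispair test succeeds.

    define safe_hd(x);
        ;;; hd(x) is evaluated only if x is a pair
        ispair(x) and hd(x)
    enddefine;

    safe_hd([a b c]) =>
    ** a
    safe_hd([]) =>
    ** <false>
    safe_hd(99) =>
    ** <false>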

Very many procedures return booleans as results, for use in conditional expressions and loop test expressions.

Pairs and lists

Pairs are records containing two fields, which can contain any type of Pop-11 item. They can be created using the procedure conspair, and have associated procedures destpair, front, back, ispair. In Pop-11 as in several other languages, pairs are used as the basis for a `derived' data-type namely lists. Lists are defined recursively as follows.

o The empty list [] (defined below) is a list.

o A pair is a list if its back is a list.

Lists are described in far more detail in Chapter 6, below. Apart from the syntax for creating lists, there is no special syntax for creating pairs, though they can be created using the procedure conspair:

    conspair(3, 4),     conspair([], "cat"),   conspair("cat", [])

Only the last of these is a list, since its back is a list, namely the empty list. There is special syntax for creating lists, introduced in Chapter 1. Since these lists are made out of pairs, the syntax for creating lists also creates pairs. For example the list

    [cat dog 99]

could be created using the expression

    conspair("cat", conspair("dog", conspair(99, [])))
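
That the two forms produce similar structures can be checked as follows. This is a sketch in which the variable name p is invented for the example.

    vars p = conspair("cat", conspair("dog", conspair(99, [])));
    p =>
    ** [cat dog 99]
    p = [cat dog 99] =>
    ** <true>
    ;;; the front of the first pair is the first element of the list
    front(p) =>
    ** cat
    ;;; the back of the first pair is the rest of the list
    back(p) =>
    ** [dog 99]
    ;;; destpair puts the front and the back of a pair on the stack
    destpair(conspair(3, 4)) =>
    ** 3 4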

Many examples of lists are given in this introduction. Lists are strictly speaking a `derived' data-type in Pop-11, in that they are constructed out of pairs, and therefore do not have their own key or dataword.

Most of the time users do not need to think about pairs. The facilities for building and manipulating lists, described later, are designed to hide such irrelevant details! See REF LISTS and Chapter 6. An overview of Poplog documentation relating to lists is given in HELP LISTS.

Later sections of this primer give a lot more information about lists, including the use of the pattern matcher and the Pop-11 database facility.

Procedures for operating on pairs include conspair, destpair, ispair, front, back. There is a much wider variety of procedures for operating on lists built out of pairs. See Chapter 6, below.

References (single component records, consref, cont).

A reference is a record with a single field that can contain an arbitrary Pop-11 object. There is no special syntax for creating references. They can be created using consref. E.g.

    consref(0), consref("cat") =>
    ** <ref 0> <ref cat>

The contents of a reference created by consref can be accessed or updated using the field_accessor procedure cont. E.g.

    ;;; create a reference record rec, containing the number 10
    vars rec = consref(10);
    rec =>
    ** <ref 10>
    ;;; increment the number by 1
    cont(rec) + 1 -> cont(rec);
    rec =>
    ** <ref 11>

The recognizer is isref. References are often used to share variable information between different processes. For example, a procedure P can create a reference R which it gives to procedure Q. Q may eventually cause the contents of R to change. Then when control returns to P it can examine the contents of R to find out what has happened. This is sometimes more convenient than passing values via the stack (e.g. in programs using parallel co-routines) and safer than using global variables.
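
A minimal sketch of this style of sharing follows. The names bump and counter are invented for the example: the procedure bump changes the contents of whatever reference it is given, and the caller can then inspect the reference to see what has happened.

    define bump(counter);
        ;;; add 1 to the number held in the reference
        cont(counter) + 1 -> cont(counter);
    enddefine;

    vars r = consref(0);
    isref(r) =>
    ** <true>
    bump(r);
    bump(r);
    ;;; the changes made inside bump are visible here
    cont(r) =>
    ** 2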

The procedure datalength applied to any pair will always return the same result:

    datalength( conspair(3, 4) ) =>
    ** 2
    datalength( [a b c d e] ) =>
    ** 2

In the second case it does not chain down the elements of the list. The procedures length and listlength do that.

    length( [a b c d e] ) =>
    ** 5

However, length applied to a pair attempts to treat it as a list, and this will cause an error:

    length( conspair(3, 4) ) =>
    ;;; MISHAP - LIST NEEDED
    ;;; INVOLVING:  [3|4]
    ;;; DOING    :  null listlength length ...

Types of vectors: strings, full vectors, intvecs, shortvecs

There are several types of vectors in Pop-11. Each vector class has a family of associated procedures, including the following:

o initiator procedure, e.g. initv, inits, which takes an integer N and creates a vector with N fields, containing a default value, usually either undef for full vectors or 0, or 0.0 for others.

o constructor procedure, e.g. consvector, consstring, which takes N items and an integer N and creates a vector containing the N items

o subscriptor procedure, e.g. subscrv, subscrs, which takes an integer N and a vector, with at least N fields, and returns or updates the contents of the N'th field

o recognizer, e.g. isvector, isstring

o destructor procedure, e.g. destvector, deststring, which takes a vector and returns all its items on the stack, plus an integer N specifying the number of items. (Thus the results of destvector can be given to consvector to create a copy of the original.)

In addition various generic procedures mentioned previously can be used on all classes of vectors including datalength, appdata, mapdata, datalist, copy, copydata, and the concatenator <>.

The equality tester "=" is defined for all vector classes as follows: V1 = V2 is true if and only if V1 and V2 are of the same vector class (have the same datakey) and have the same length, and if corresponding components of V1 and V2 are themselves = to one another.
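
The family of procedures can be illustrated using standard full vectors, as in the following sketch (the variable name v is invented for the example). The last two lines show that a copy made with destvector and consvector is equal to the original, using "=", but is not the identical object.

    vars v = consvector("a", "b", "c", 3);
    v =>
    ** {a b c}
    isvector(v) =>
    ** <true>
    subscrv(2, v) =>
    ** b
    ;;; the subscriptor has an updater
    "z" -> subscrv(2, v);
    v =>
    ** {a z c}
    ;;; the destructor returns the items followed by their number
    destvector(v) =>
    ** a z c 3
    consvector(destvector(v)) = v =>
    ** <true>
    consvector(destvector(v)) == v =>
    ** <false>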

Users can define their own vector classes, though several standard vector classes are built in to Pop-11, including strings described above. Strings are byte vectors containing packed 8 bit integers, and have associated procedures inits, consstring, subscrs, isstring, etc.

Standard full vectors

One of the standard data types is the class of standard full vectors. Instances of this class can contain any number of items, including no items. The contents of the fields in a vector can be arbitrary Pop-11 items: i.e. they are `full' fields.

There is special syntax for creating vectors. Examples are:

    {},  {a b c},    {cat mouse 3 4},    {'a string' in a vector}
    {[a vector] {containing some} [lists {and} vectors]}

Vectors are very like lists, but stored more compactly in the computer. There are several different types of vectors, besides standard full vectors. E.g. strings are "byte" vectors.

Procedures for operating on standard vectors include initv, consvector, destvector, subscrv, isvector, and the concatenator, <>, and others described in REF VECTORS.

The procedure datalength can be used to check the length of a vector.

    datalength({}) =>
    ** 0
    datalength({{a b} {c d} {e f}}) =>
    ** 3

Packed integer vectors: intvecs and shortvecs

An intvec is a signed packed-integer vector (usually with 32 bit fields). So the fields of these vectors have two more bits than standard Pop-11 integers, which can use only 30 bits. Intvecs have no special syntax, but can be constructed using consintvec, or initintvec. Examples are:

    initintvec(5) =>
    ** <intvec 0 0 0 0 0>
    consintvec(1, 2, 3, 4, 5, 6, 6) =>
    ** <intvec 1 2 3 4 5 6>

The associated procedures are: consintvec, destintvec, initintvec, isintvec, subscrintvec
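
For instance, the components of an intvec can be accessed and updated as in the following sketch, in which the variable name iv is invented for the example:

    vars iv = consintvec(10, 20, 30, 3);
    isintvec(iv) =>
    ** <true>
    subscrintvec(2, iv) =>
    ** 20
    99 -> subscrintvec(2, iv);
    iv =>
    ** <intvec 10 99 30>
    destintvec(iv) =>
    ** 10 99 30 3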

A shortvec is a signed short packed-integer vector (usually with 16 bit fields). Shortvecs have no special syntax. Associated procedures are consshortvec, destshortvec, initshortvec, isshortvec, subscrshortvec.

For more details see the online files: REF INTVEC, REF SHORTVEC

Procedures, closures, arrays, properties

A great deal has already been said about procedures, and more information will be given below. In particular, we have previously illustrated the special syntax for creating named and anonymous procedures using the formats:

    define      ....    enddefine;
    procedure   ....    endprocedure

Procedures are sets of instructions, which tell the computer to do something. Some are built in to the Pop-11 system. Others are added by the user. Unlike some languages, Pop-11 treats procedures as objects, just like numbers or words. E.g. procedures can be created by running procedures, can be stored in lists, or assigned to variables. There are several procedures associated with procedures:

o isprocedure: recognizes procedures

o pdprops: accesses or updates the pdprops field of a procedure, which normally contains the name and possibly other information.

o updater: accesses or updates the updater of a procedure.

For example the following two expressions for updating the hd of a list are equivalent: the first is `syntactic sugar' for the second.

    "cat" -> hd(list);
    updater(hd)("cat", list);

The second applies the updater of hd to "cat" and list. The first does the same, though it is easier to read.
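
Users can give their own procedures updaters, using the "define updaterof" form. In the following sketch the names second_item and items are invented for the example: the procedure accesses the second element of a list, and its updater changes that element.

    define second_item(items);
        hd(tl(items))
    enddefine;

    define updaterof second_item(item, items);
        item -> hd(tl(items))
    enddefine;

    vars items = [a b c];
    second_item(items) =>
    ** b
    "z" -> second_item(items);
    items =>
    ** [a z c]
    ;;; the same kind of update written without the syntactic sugar
    updater(second_item)("y", items);
    items =>
    ** [a y c]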

There are three more special kinds of procedures. They are all functionally equivalent to procedures in that they all have procedure_key as their datakey, they all use procedure-calling syntax for accessing or updating their contents, they can all be partially applied to form closures (defined below), and various other procedure specific procedures can be applied to them, including pdprops and updater.

closures

A closure is a combination of a procedure and some data for it to operate on. A closure may either be created using "partial application" (see HELP PARTAPPLY, HELP CLOSURES), or may be a "lexical closure" created by a procedure containing local lvars variables. See Chapter 4 below, and HELP LVARS, REF VMCODE.

There is special syntax for creating closures using partial application. For example, gensym is a procedure that can be applied to a word to produce new words with appended numerals:

    gensym("cat") =>
    ** cat1
    gensym("cat") =>
    ** cat2

If we wished to partially apply the procedure gensym to the word "cat" to create a new procedure of no arguments that could be run to get the effect of applying gensym to "cat" we could do so as follows:

    vars cat_gen = gensym(%"cat"%);
    cat_gen() =>
    ** cat3
    cat_gen() =>
    ** cat4

The datakey of a closure is the same as the datakey of any other procedure, i.e. procedure_key:

    datakey(cat_gen) =>
    ** <key procedure>
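A lexical closure can be sketched in a similar way. In this illustration (the names make_counter and next_id are invented here, and are not part of Pop-11) each call of make_counter returns a new procedure that retains access to its own lvars variable count:

    define make_counter();
        lvars count = 0;
        procedure();
            count + 1 -> count;
            count
        endprocedure
    enddefine;

    vars next_id = make_counter();
    next_id() =>
    ** 1
    next_id() =>
    ** 2
    datakey(next_id) == procedure_key =>
    ** <true>

A second call of make_counter would produce an independent counter starting again from 1, since each closure has its own copy of count.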

arrays

Arrays are N dimensional structures whose components are accessed by N integers, where N can be 0 or more. E.g. a two dimensional array might represent a picture, and its components could be accessed by giving two numbers representing distance along and distance up, e.g. picture(3,5). For every class of vectors there are corresponding types of arrays. Note that Pop-11, unlike many other languages, treats arrays as procedures, for maximum flexibility. For details see REF ARRAYS.

The main procedures for creating arrays are newarray and newanyarray. Additional procedures associated with arrays include:

array_subscrp, arrayvector, arrayvector_bounds, boundslist, isarray, isarray_by_row.

The global variable poparray_by_row has a boolean value which determines the order in which an array stores its values in the underlying vector.

Arrays are treated like procedures in that they are applied to the numbers representing their subscripts. Every array has an associated updater procedure for updating the contents of the array.

An example of the creation of an array from a vector class follows.

Using newanyarray to create an array from an intvec

Vectors of various types can be used to create arrays of various types using newanyarray. For example, to create a 2-D array of integers stored in an intvec, with subscript values going from -5 to 5 and -10 to 10, do

    vars
        intarray = newanyarray([-5 5 -10 10], initintvec, subscrintvec);

This array has 0 as default value at each location. The value at any location can be accessed by applying the array to two integers that are within the range of the bounds given. The array can also be updated, as follows, using two integers as indexes into the array.

    intarray(1, 1) =>
    ** 0
    99 -> intarray(1, 1);
    intarray(1, 1) =>
    ** 99

Attempting to update the array with the wrong type of item will produce an error:

    [99] -> intarray(1, 1);
    ;;; MISHAP - INTEGER -2147483648 TO 2147483647 NEEDED
    ;;; INVOLVING:  [99]
    ;;; DOING    :  subscrintvec ....

Arrays also have the procedure_key as their datakey.

    datakey(intarray) == procedure_key =>
    ** <true>

The procedure newarray is used to create arrays from ordinary full vectors. Both newarray and newanyarray allow several optional forms in which they can be used. See HELP ARRAYS for an introduction.
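As a minimal sketch of newarray, assuming the common form in which the first argument is a list of bounds and the second is a value used to initialise every location (the name picture is invented for illustration):

    vars picture = newarray([1 3 1 3], 0);   ;;; 3 by 3, all locations 0
    picture(2, 3) =>
    ** 0
    "dark" -> picture(2, 3);    ;;; a full-vector array can hold any item
    picture(2, 3) =>
    ** dark
    boundslist(picture) =>
    ** [1 3 1 3]

Unlike the intvec array above, this array accepts any Pop-11 object as a component, because its underlying vector is an ordinary full vector.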

properties

The last kind of procedure is a property. A property is a table of associations between objects, based on `hash coding' to speed up access. A property can function as a sort of memory of what is associated with what. There are several different kinds of properties in Pop-11, and different procedures for creating them, described in REF PROPS

Properties, like closures, arrays, and ordinary procedures are applied to their arguments in order to access or update the associated value. Every property has an associated updater for changing the contents of the property.

The simplest kind of property is created using newproperty, and examples are given in the next chapter, and in HELP NEWPROPERTY.
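A brief sketch may help in the meantime. Here newproperty is given a list of initial associations, a table size, a default value, and a boolean saying whether entries are permanent. The name legs and the particular values are invented for illustration.

    vars legs = newproperty([[cat 4] [bird 2]], 20, false, true);
    legs("cat") =>
    ** 4
    legs("snake") =>            ;;; no entry, so the default is returned
    ** <false>
    0 -> legs("snake");         ;;; the updater adds a new association
    legs("snake") =>
    ** 0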

More sophisticated types of properties can be created using the rather complex procedure newanyproperty, described in HELP NEWANYPROPERTY. A special subset of its functionality is provided by the procedure newmapping, whose use is illustrated in the next chapter. See also HELP NEWMAPPING

Procedures associated with properties include appproperty, clearproperty, property_default, property_size, datalength, appdata, copy and explode, as described in REF PROPS.

`Destroy properties'

A special kind of property which (as far as I know) is unique to Pop-11, is called a `destroy property'. A destroy property (described in REF PROPS) allows us to associate with an object a procedure to be run when the object becomes garbage. This is particularly useful when the object has associated with it some object outside the current Poplog process that needs to be removed if the object is garbage. A typical example might be a window on the screen corresponding to the object: the destroy action might be used to ensure that such windows are removed when they are no longer needed.

For more details on properties see REF PROPS

Declaring a variable to be of type procedure

When you know that the value of a variable is going to be a procedure and will not ever be anything else you can declare it as being of type procedure. Thus the declaration of cat_gen above could be replaced by this form:

    vars procedure cat_gen = gensym(%"cat"%);

Declaring a variable as of type procedure can cause more efficient code containing it to be compiled, and will also cause extra error checking when anything is assigned to the variable. Several identifiers can be declared to be of type procedure if they are enclosed in parentheses following the word "procedure", e.g.

    vars procedure (p1, p2, p3);

Lightweight processes

A process is a structure containing a combination of procedure and data and a record of how far the procedure has got in its execution. The original procedure may have invoked another procedure, which in turn may have invoked other procedures, and so on. Thus a process needs to include a partial procedure calling stack. It also needs to record the values of any local variables of those procedures.

Each process can be suspended and resumed as required. These are sometimes referred to as "lightweight" processes, because switching between these processes is far less time consuming than switching between operating system processes on a time-shared computer. Processes can be used for running simulations of various kinds, e.g. simulating an operating system.

Procedures for operating on processes include consproc, consproc_to, runproc, resume, suspend, kresume, ksuspend, saveproc, isprocess, isliveprocess.

These are described in HELP PROCESS, and in more detail in REF PROCESS
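The following is a minimal sketch of suspending and resuming a process, assuming the argument conventions described in REF PROCESS: consproc and runproc are each given a count of the items to be passed in, and suspend is given the items to hand back followed by their count. The names count_up and counter are invented for illustration.

    define count_up();
        lvars i;
        for i from 1 to 3 do
            suspend(i, 1)       ;;; hand back one item, then wait
        endfor;
    enddefine;

    vars counter = consproc(0, count_up);
    isprocess(counter) =>
    ** <true>
    runproc(0, counter) =>      ;;; runs until the first suspend
    ** 1
    runproc(0, counter) =>      ;;; resumes inside the loop
    ** 2

Each call of runproc continues from where the process last suspended, which shows that the process record preserves both the calling stack and the value of the local variable i.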

The timing facilities in Pop-11, such as sys_timer, make it possible for processes to be suspended and resumed at regular intervals. Thus a time-shared multiprocessing system can be implemented using Poplog Pop-11.

undefs

Undefs are a special type of record used to initialise newly declared global or dynamic local variables that have not been given an initial value.

Normally when a new variable is created, it is given a default value, which is a special object which may print out something like:

        <undef xx>

Meeting one of these in an error message is usually an indication that you have forgotten to "initialise" a variable with an appropriate value. Note that there is a standard Pop-11 variable called "undef" whose value is the word "undef", and which is NOT an example of an undef data-type, but is a word. (For historical reasons, the word "undef" itself is still used in some contexts).

An example follows:

    vars xxxx;  ;;; declare a new variable.
    xxxx =>
    ** <undef xxxx>
    hd(xxxx) =>
    ;;; MISHAP - LIST NEEDED
    ;;; INVOLVING:  <undef xxxx>
    ;;; DOING    :  hd compile .....

For more information see REF IDENT and HELP UNDEF

Keys

A key is a record containing information about a class of objects in Poplog. Each data type has associated with it a key which is a structure containing information about all objects of that type, such as their dataword, how they are to be printed, how many elements they are made of, how they are recognized, whether the class is a vector class, a record class or some other kind, and if it is a record or vector class, what its accessing procedures are, etc.

Procedures associated with keys, and described in REF KEYS, include conskey, datakey, isvectorclass, isrecordclass, class_=, class_access, class_apply, class_attribute, class_cons, class_datasize, class_dataword, class_dest, class_fast_subscr, class_field_spec, class_hash, class_init, class_print, class_recognise, class_subscr
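A few of these can be tried directly. The sketch below uses only datakey, class_dataword and class_recognise; the variable name wkey is invented for illustration.

    vars wkey = datakey("cat");
    wkey =>
    ** <key word>
    class_dataword(wkey) =>
    ** word
    class_recognise(wkey)("dog") =>   ;;; the recogniser for words
    ** <true>
    class_recognise(wkey)(99) =>
    ** <false>
    datakey([a b c]) == pair_key =>   ;;; lists are built from pairs
    ** <true>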

Unique objects: nil, termin, stackmark

The empty list []

This unique object is used to indicate the end of a chain of pairs making up a list. In Pop-11 it is NOT used to denote FALSE, as it is in many Lisp systems that lack a proper boolean data type. The special identifier [] can be used to represent the empty list. For compatibility with other languages the word "nil" is also defined as a Pop-11 identifier that represents the same object.

    nil =>
    ** []
    nil == [] =>
    ** <true>

The stream terminator, termin

Termin is a unique object used to indicate the end of a sequence of items, e.g. the end of a sequence of characters read from a file, or the end of a sequence of items making a stream, or a dynamic list. It is often produced by typing the end-of-file character to a program reading from the terminal, and it is also the last item produced by a character repeater obtained from a file. There is no special syntax to represent termin, though a constant identifier termin is provided, which has termin as its value.

    termin =>
    ** <termin>
    datakey(termin) =>
    ** <key termin>

When a dynamic list is produced by a generator procedure, the end of the list is indicated by the procedure producing termin as a result. Dynamic lists are discussed in Chapter 6.

See REF CHARIO

The stack mark, popstackmark

The unique item, which prints as <popstackmark>, is used in connection with Pop-11's "open" stack when building lists or vectors. Users will normally only come across it when they make errors involving attempting to take too many items off the stack (stack underflow errors)!

Roughly, whenever a list building expression of the form [ ... ] or a vector building expression of the form { ... } is evaluated, the object popstackmark is placed on the stack when the construction starts, and the final object is created by removing all items on the stack down to the last occurrence of popstackmark and putting them in a list, or a vector.
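The effect is easiest to see with the "%" brackets, which cause the enclosed expressions to be evaluated, leaving their results on the stack to be collected into the new list or vector. This is a small illustrative sketch:

    [% 1 + 2, "cat", 3 * 4 %] =>
    ** [3 cat 12]
    {% 1 + 2, "cat" %} =>
    ** {3 cat}
    popstackmark =>
    ** <popstackmark>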

See REF STACK

Devices

These are records associated with files, terminals and other means of communication between the Pop-11 system and the rest of the world. There are also pseudo-devices created using consdevice.

See REF SYSIO

External pointers

These are pointers to external data or external procedures (functions), created using another language, e.g. Fortran, C or Pascal, and linked into the Pop-11 system. See REF * EXTERNAL_DATA

Sections

Sections are structures that hold information mapping words in the dictionary to the idents that provide information about how the words are currently being used. Because the mapping from word to ident can be different in different sections, sections enable different programmers to use the same words for different purposes without risk of clashing even when their programs are later combined.

Identifiers in one section can access those in another by using full section "pathnames", which are similar to Unix file path names except that "$-" is used instead of "/". Thus the identifier "bite" in sub-section "dog" in sub-section "mammal" in sub-section "alive" of the top level section could be referred to as

        $-alive$-mammal$-dog$-bite

Code written in the section $-alive$-mammal$-dog would merely need to use "bite".
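A sketch of how such sections might be set up is given below, using nested "section ... endsection" brackets. The definition of bite is invented for illustration and simply returns a word.

    section alive;
    section mammal;
    section dog;

    define bite();
        "ouch"
    enddefine;

    endsection;
    endsection;
    endsection;

    ;;; back at top level the full pathname is needed
    $-alive$-mammal$-dog$-bite() =>
    ** ouch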

As implied by this example, sections can contain subsections. For full details see REF SECTIONS. An introduction can be found in HELP SECTIONS.

The active variable current_section holds as its value the current section.

Identifiers associated with sections include:

pop_default_section, pop_section, section_cancel, section_export, section_import, section_key, section_name, section_pathname, section_subsect, section_supersect

Prolog variables and terms: prologvars, prologterms

A prologvar is a one-element record used as a variable for the Prolog subsystem of Poplog. Various utility procedures are available for manipulating them.

A prologterm is an instance of a special class of vectors used to implement terms in Prolog.

Procedures relevant to the prolog sub-mechanisms in Poplog include:

consprologterm, destprologterm, initprologterm, isprologterm, isprologvar, prolog_arg, prolog_args.

See REF PROLOG for more details.

Data types required for the Poplog X window interface

XptDescriptors are used for managing some of the rather complex interactions between Poplog and the X window system, which is mostly written in C. Additional Pop-11 datatypes are created in the X libraries in $usepop/pop/x/*

See especially $usepop/pop/x/pop/ref/*types*

Further details are provided in: REF XptDescriptor

The REF files mentioned above give far more information than beginners can possibly cope with, but experts designing sophisticated software will probably find them indispensable.

Additional built-in data types may be provided in later versions of Poplog, and will be listed in REF DATA

Most beginners will not need to know about most of the data-types mentioned above. For very many programs it suffices to know only about procedures, words, numbers and lists. Booleans are used in conditional instructions or in tests for termination of loops. Strings are useful for formatted printing, and they are also used as names of files. Arrays are useful for representing two dimensional image data. This Primer will be mainly concerned only with the most commonly used data types.


Aaron Sloman
Fri Jan 2 03:17:44 GMT 1998