Date:Mon Jun 9 20:14:03 1994 
Subject:Saving and restoring data and even generating C 
From: Robin Popplestone  
Volume-ID:940610.01 

Some months ago I put out a description of  -save_data- and -restore_data-
procedures which I have been developing, promising the code would follow
shortly.... Well, I hung up on generality, especially after receiving a
message from Jon Meyer. BUT herewith is a NEW GLOSSY (if ascii can be
glossy) help file of a more versatile capability. How versatile you may
come to appreciate if I tell you that twiddling a few buttons or rather
writing a few procedures allowed me to generate C code to recreate the
structure as output rather than the standard pedestrian form. BUT I am
still not satisfied that it can do all the things a Jon could wish.

However it should support anybody who wants to walk through an arbitrary
POP data-structure and produce some kind of report on it.

Comments please.

============================================================================




HELP restore_data, save_data               Robin Popplestone MAR 1994

A general capability for saving Poplog data-structures on backing store.
The format of data written may be user-specified, and include e.g. a
C program to recreate equivalent structures.

This text may be freely copied provided the above attribution is preserved.


         CONTENTS - (Use <ENTER> g to access required sections)

 --  How to use these facilities to save data to backing store and restore it.
 --  The format of the saved data
 --  An Example - saving a circular list.
 --  Limitations - no saving of widgets, external data and some procedures
 --  But class_save class_restore class_update mitigate the limitations.
 --    Writing the -save- procedure
 --    Writing the -restore- and -update- procedures
 --      stack_fields(Rep,m) reads and stacks fields ending with ";"
 --      restore_field(Rep,m) reads a single field
 --  User definition of the saved format.
 --  restore_data provides more capabilities than datafile
 --  Use with lib objectclass

How to use these facilities to save data to backing store and restore it.
-------------------------------------------------------------------------
save_data allows many Pop data structures, including "circular" structures, to
be recorded on backing store, and to  be read back. It thus extends the  scope
of LIB * DATAFILE. To write a structure to backing store type:

    save_data(<struct>, <filename>) -> <property>;

The meaning of the <property> is explained in the section on "User
definition of the saved format". For simple uses of -save_data- it
can be ignored.

Similarly, to read a structure back from backing store, type:

    restore_data(<filename>) -> <struct>;

The (default) permitted datatypes are:

    words, numbers, lists, vector types, record types,
    vector arrays, ordinary properties, booleans and closures

A named procedure can be "saved"; what this means is that its name is
written to the file on saving. On restoration the procedure with the
same (global) name is restored, using valof. save_data checks that
the name of the procedure in its -pdprops- is associated with the
procedure itself as value.

A similar approach is adopted for -key- objects. The dataword is written
and the corresponding key is regenerated.
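
As a minimal sketch of the two calls above, the following saves a small
list and reads it back. It assumes that -save_data- and -restore_data-
are already loaded; the variable names and the file name 'roundtrip.tmp'
are arbitrary choices made for this illustration.

```pop11
;;; Assumes save_data and restore_data have already been loaded.
vars original = [alpha 42 'a string' {1 2 3} [nested list]];

;;; Write the structure; the returned property is not needed here.
save_data(original, 'roundtrip.tmp') ->;

;;; Read it back from the file.
vars copy = restore_data('roundtrip.tmp');

copy = original =>       ;;; structural comparison of the two lists
copy == original =>      ;;; identity comparison
```

If the round trip works as described, the first comparison should yield
<true> and the second <false>, since the restored list is rebuilt from
the file rather than being the original structure.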

The format of the saved data
-----------------------------
The whole saved file is described by the syntax:

    <saved_file> -> SavedData <version> <date> <field> <objects>
    <version> -> V1

Here the <version> specifies which version of save_data was used to save
the data. If the file does not begin with SavedData, the older datafile
capability is assumed to have been used.

<date> is a string, derived from sysdaytime, and records the actual time
of writing of the file.

A <field> is a specification of a POP item (record, vector etc.) which
is either self-contained or is the index referring to a line in the
file where the item actually "is", i.e. a kind of in-file pointer. Simple
items and certain compound items are self-contained.

    <field> ->  (<index>)     Pointer to the item in line labelled <index>
            ->  "<string>"    The word made from the string.
            ->  <string>      A POP-11 string.
            ->  <number>      A POP-11 number, written literally.
            ->  <word>        A POP-11 variable - take the valof.

The <word> option is used for named procedures and certain system constants
(e.g. false).

In the <saved_file> syntax, the <field> specifies the actual item that
has been saved. This distinguishes it from its sub-items, since it will
not necessarily have any particular index. A simple item will have no
<objects> following it. Thus the number 654 is saved as:

    SavedData V1
    654

However, normally we have <objects>:

    <objects> -> <object> <objects>
              -> <null>

where <null> generates only the empty sequence.

    <object>    -> <index> <dataword> <fields> ;
                -> <index> %a <array>
                -> <index> %c <closure>
                -> <index> %p <property>

An array is written out as specified by the following syntax. Here
the two <int> values are the arrayvector_bounds of the array. No
updater is written as it is illegal to change the updater of an array.

    <array>      -> <boundslist> <vector> <subscr> <int> <int> <byrow> <props>
    <boundslist> -> [ <int> <int> .... ]
    <vector>     -> <field>                    ;;; The vector of values
    <subscr>     -> <field>                    ;;; Subscripting procedure
    <byrow>      ->  true | false              ;;; storage by row?
    <props>      -> <field>                    ;;; The pdprops


A closure is written out as:

    <closure> -> <pdpart> <pdprops> <updater>

Here all three components are written out as fields.

A property is written out as:

    <property> -> <alist> <size> <default> <props> <updater>
    <alist> -> <field>          ;;; The association list of the property
    <size>  -> <int>            ;;; The size of the hash table
    <default> -> <field>        ;;; the value to be returned by default
    <props>   -> <field>        ;;; The -pdprops- field of the property.
    <updater> -> <field>        ;;; The -updater- of the property.
                                ;;; This is -false- if the standard updater
                                ;;; is in use.
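
The <alist>, <size> and <default> items appear to correspond to the first
three arguments of -newproperty- (see HELP * NEWPROPERTY). The sketch
below builds a permanent property of the kind that can be saved; the
property name and its entries are invented for this illustration.

```pop11
;;; Arguments: association list, hash table size, default value, and
;;; "perm" to make the property permanent.
vars colour_of = newproperty([[apple red] [sky blue]], 32,
                             "unknown", "perm");

colour_of("apple") =>       ;;; an entry from the association list: red
colour_of("grass") =>       ;;; no entry, so the default: unknown
```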

An Example - saving a circular list.
-----------------------------------
The following code builds a circular list and saves it:

    vars circular = [44 fred 'ab' [55 66] last];
    circular -> circular.tl.tl.tl.tl.tl;
    save_data(circular,'circ.tmp');

This creates the following file:

SavedData V1
(1)
1 pair 44 (2);
2 pair "'fred'"(3);
3 pair 'ab'(4);
4 pair (5)(6);
5 pair 55 (7);
6 pair "'last'"(1);
7 pair 66 nil ;
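
The index numbers in the file above are consistent with numbering the
pairs breadth-first from the saved item. The sketch below shows that
ordering. -number_pairs- is a hypothetical helper written for this
illustration, not part of -save_data-, and it only looks at pairs.

```pop11
;;; Hypothetical helper: number the pairs reachable from Root
;;; breadth-first, recording each index in a property.
define number_pairs(Root) -> Index;
    lvars Root, Index, Queue = [], Count = 0, This, Part;
    newproperty([], 16, false, "perm") -> Index;
    if ispair(Root) then
        1 ->> Count -> Index(Root);
        [^Root] -> Queue;
    endif;
    until Queue == [] do
        dest(Queue) -> (This, Queue);       ;;; take the next pair
        for Part in [% front(This), back(This) %] do
            ;;; A pair already in Index has been seen: this ends cycles.
            if ispair(Part) and not(Index(Part)) then
                Count + 1 ->> Count -> Index(Part);
                Queue <> [^Part] -> Queue;  ;;; visit it later
            endif;
        endfor;
    enduntil;
enddefine;

vars circular = [44 fred 'ab' [55 66] last];
circular -> circular.tl.tl.tl.tl.tl;
vars pair_index = number_pairs(circular);

pair_index(circular) =>                 ;;; index of the first pair
pair_index(circular.tl.tl.tl.hd) =>     ;;; index of the list [55 66]
```

With this list the first pair receives index 1 and the pair that starts
[55 66] receives index 5, agreeing with the lines labelled 1 and 5 in the
saved file. Never print -circular- itself, since printing a circular list
does not terminate.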

Limitations - no saving of widgets, external data and some procedures
---------------------------------------------------------------------
Apart from the inability to save anonymous procedures, widgets and external
data, there are other limitations.

(1) This facility only handles properties that can be created with
-newproperty- and assumes that any properties are -permanent-.

(2) Properties which are circular in the sense that they associate themselves
with some other data, for example P below:

     P -> P(key)

must NOT be saved - an infinite loop will result. It is anticipated that
this will occur rather seldom.

(3) Apart from arrays, closures and properties, no procedure is -truly-
saved, since only its name is written. Saving by name is probably the most
useful behaviour, since, provided the specification of the procedures does
not change, saved data should remain usable with a new version of a program.

(4) The updater of properties is not saved.

But class_save class_restore class_update mitigate the limitations.
-------------------------------------------------------------------
Where a structure contains sub-structures which cannot be saved by the
default -save_data- procedure, for example because they are external data,
widgets or anonymous procedures, or where it contains information that
need not be saved, such as a big image taken from a library, the user can
define her own "methods". These are assigned to class_... procedures which
work in a manner analogous to the built-in class_print (see REF * PRINT).

For any data class, with data-key Key, which is to be saved by a user-defined
procedure, write a procedure P_save (say) to save members of that class, a
procedure P_restore to restore class-members and a procedure P_update to
update members, as described below. Then making the assignments:

section $-SAVE_DATA;
    ;;; ...... code for P_save etc. (don't forget to import anything you
    ;;; ......       need into the section)
    P_save -> class_save(Key);
    P_restore -> class_restore(Key);
    P_update  -> class_update(Key);
endsection;

will ensure that members of that class can be saved and restored.

  Writing the -save- procedure
------------------------------
This procedure should write out enough components of the data-structure
to allow it to be restored by the user-written restore procedure. Layout
is not prescribed, but compound components will normally be written
using the -save_field- procedure. This will typically write just an index
number for a compound component, as described in the -format- section
above. -save_field- must be used if there is any possibility of
circularity. The -dataword- will be written out -before- the -save- procedure
is called, and a semicolon will be written after.

IMPORTANT: when the -restore- and -update- procedures are called, you cannot
assume that any data-structure is properly built. The -restore- procedure
actually constructs -your- data-structure, but does not necessarily
put in the actual final values in its components. The -update- procedure
puts in the pointers to the correct structures, but these may not
have been updated. Therefore you -cannot- rely on any information that
is normally in a data-structure actually being there. In the example
below, while the widget can be reconstructed from the size of picture (Pic)
that it is to hold, we cannot assume that -contents_Image- will evaluate
to a picture at any time during the restoration process (except of
course at the end, but we have to act before the end).

An example:

define save_Image(Im);                         ;;; Save an Image, Im
  lvars W = Widget_Image(Im);                  ;;; If it is displayed in a
  if W then true -> Widget_Image(Im) endif;    ;;; widget, replace by <true>
  lvars Pic = contents_Image(Im);              ;;; Get size of picture
  lvars (dx,dy) = (dx_Pic(Pic),dy_Pic(Pic));
  appdata(Im,save_field);                      ;;; save fields of Im record
  spr(";"); spr(dx); spr(dy);                  ;;; save size of picture
  W -> Widget_Image(Im);                       ;;; restore widget
enddefine;


  Writing the -restore- and -update- procedures
-----------------------------------------------
These both read data written by the corresponding -save- procedure. The
save file is scanned twice. In the first scan -restore- procedures are
called to build the data-structures, in the second forward references
are updated.

The -restore- function is called with parameters

  restore(Rep,m)

It should return the restored data-structure which it has read using
the item-repeater Rep. The parameter -m- is passed on to the -stack_fields-
procedure described below. During the restore scan it is the serial number
of the latest record actually constructed. During the update scan it is a
very large number, so that every reference can be finalised.

    stack_fields(Rep,m) reads and stacks fields ending with ";"
----------------------------------------------------------------
The SAVE_DATA section contains a procedure -stack_fields- which
will stack up fields which it reads until a semicolon is encountered,
putting a count on the stack.

    stack_fields(Rep,m) -> Fld_1 ..... Fld_n, n

The procedure -restore_field- is available to read individual components.

    restore_field(Rep,m) reads a single field
---------------------------------------------
Here Rep is the item-repeater for the file. m is the index of the latest
record formed. Any reference to a file-record preceding m can be finalised.
Forward references have to be postponed.

    restore_field(Rep,m) -> (Obj,flag)

If the flag = termin then we have read the last field in the object.
If the flag = true then Obj is an integer index to a forward reference.
If the flag = false then Obj is the actual finalised POP-11 object.


Thus the -restore- procedure will normally read and stack components
of the structure and then call the class-constructor. An example:

define restore_Image(Rep,m);
  consImage((stack_fields(Rep,m)->;));     ;;; Build the actual structure
  stack_fields(Rep,m) -> ; -> ; -> ;       ;;; ignore the Widget info
enddefine;

restore_Image -> class_restore(Image_key);

The update procedure is called with 3 parameters, namely
 (a) Your own item which is being restored
 (b) The item repeater made from the saved data
 (c) A parameter -m- which is given to -stack_fields- etc.
It should read the fields off the repeater in the same way as the restore
procedure, but use these fields to update the corresponding fields in
the object itself (-fill- is often a useful way of doing this, see
HELP *FILL).

An example is given below.

define update_Image(Im,Rep,m);             ;;; We read the fields and use
  fill((stack_fields(Rep,m)->),Im)->;      ;;; them to update the structure.
  lvars (dx,dy,_) = stack_fields(Rep,m);   ;;; Read the dimensions of widget
  lvars is_W = Widget_Image(Im);           ;;; If there should be a widget
  lvars W_new = false;                     ;;; (otherwise the slot stays false)
  if is_W then
    mk_GraphicWidget('dx'><"*"><'dy',      ;;; make one to spec
                     dx,dy,false,
                     CB_button_Lv) -> W_new;
  endif;                                   ;;; and put it in its
  W_new -> Widget_Image(Im);               ;;; slot.
enddefine;

User definition of the saved format using a property as second argument
-----------------------------------------------------------------------
The second argument of -save_data- may be a property. In that case it
is used to provide user-defined versions of various procedures used
by save_data. This allows a much more radical redefinition of the action
of -save_data- than the class_save capability defined above. For example
-save_data- can be used (in several passes) to create a C program which
will recreate a POP data-structure.

The concept of "index" can be generalised to include any Poplog object
for which an ordering relation is defined. Thus, if a C program is being
generated, indices will be C-identifiers which will be bound to the
objects being recreated.

The POP-11 call:

    save_data(<struct>, <property>) -> <property>

returns a property which maps from the sub-structures of <struct> to
their indices. This property is important if consistency of mapping
is required over repeated calls of -save_data-.


    The "before" property entry specifies how to sort the indices.
------------------------------------------------------------------
This should be a binary ordering predicate, such as nonop <= which
is in fact the default value. Use -alphabefore- if you are dealing
with identifiers as indices.
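
Word indices with a numeric suffix are ordered character by character by
-alphabefore-, which is not the same as numeric order. A small
illustration, using the standard -syssort- procedure and made-up index
names:

```pop11
;;; Made-up word indices of the kind a symbolic index generator might give.
vars indices = [pair12 vector3 pair2];

;;; Sort with the predicate that would be supplied as the "before" entry.
syssort(indices, alphabefore) =>
```

Here pair12 sorts before pair2, because the character 1 precedes the
character 2.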

    The "class_save" entry supports data-class specific output.
---------------------------------------------------------------
In case -save_data- is needed with class_save entries that are different
for different calls of -save_data-, we provide this -class_save- entry.
It is not expected that many people will need this, because more
radical changes are to be expected, using the "save_struct" entry, below.

    The "new_index" entry generates an index for a Poplog structure.
-----------------------------------------------------------------
new_index(Struct) must produce a unique new index for the structure  -Struct-.
The default value of -new_index- produces integer indices starting at 1.

For example, the following generates a new index based on the dataword
of the structure

    define new_c_index(Struct);
      gensym(dataword(Struct));
    enddefine;

E.g. a call such as new_c_index([1 2 3]) may return "pair234".

Note: if save_data is called multiple times and consistent use of indices
is required, you will probably need to incorporate the property returned
by -save_data- in -new_index-, so that -new_index- returns the index
assigned by a previous pass.

For example:

    define new_index_P(Struct);
        P_index(Struct);
    enddefine;

Using the property returned by a first pass:

    save_data(Struct,spec_declare) -> P_index;

We create a new property:

    lvars spec_assign = newassoc([ [new_index   ^new_index_P]]);

And call a second pass of -save_data-

    save_data(Struct,spec_assign)->;

Deliberately, there is no facility whereby the property returned from one
call of save_data can be reused -directly- in a second call, since the
existence of an entry in the property is taken by -save_data- as evidence
that a given structure has been fully saved.

    The "save_field" entry specifies how to save a field
--------------------------------------------------------
The save_field procedure, specified above, can be redefined. Typically
more radical modifications of the output of -save_data- will use
the "save_struct" entry below.

    The "save_struct" entry specifies how to save a whole structure.
--------------------------------------------------------------------
This procedure is usually the most complex one to write for any advanced use
of save_data. Typically, -save_struct(Struct)- will call -appdata- to write
out every component of -Struct-. For example:

    define assign_c(Struct);
      appdata(Struct,save_field_c(%Struct,
                     datakey(Struct),consref(1)%));
    enddefine;

Here -save_field_c- is the procedure that does the work of writing out the
components, using its partially applied arguments for information about the
parent structure. The procedure

        SAVE_DATA$-index

should be called to generate a new index for a POP-11 data-structure if
necessary. Thus, usually the -index- of a compound component will be written
out rather than the component, by the procedure which saves each component.

Note that save_struct will have to treat some POP-11 data-structures
rather specially. These include:

    false, true, nil, termin, procedures, strings, ddecimals


    The "pr_header" entry is called initially to put out a header.
--------------------------------------------------------------------
pr_header(Struct) will be called before any other output is attempted.
Typically it will print out header information on an output file.

Note that it -must- also evaluate SAVE_DATA$-index(Struct) either
directly or indirectly, by calling a procedure that does. This will
have the effect of generating a queue of structures to be saved subsequently.
If there is no queue, nothing gets saved.

This requirement of evaluation may appear bizarre - but it is not so
if we think of saving a POP-11 simple item - in this case the header
will contain just that item and there is no queue of sub-structures
to be saved, since it -has- no sub-structures.


    The "pr_string" entry can save a string in the desired format.
------------------------------------------------------------------
Because strings are not normally treated as -compound data objects- by
-save_data-, you must use this capability to redefine the output
of strings.

    "pr_trailer" is called finally. It may close the output file
--------------------------------------------------------------------

    The "FileSpec" entry specifies where the output is to go.
---------------------------------------------------------------
This entry specifies a file or repeater in the same way as the non-property
second argument of -save_data-.


restore_data provides more capabilities than datafile
-----------------------------------------------------
The scope is extended primarily to include circular structures. In addition
double precision floating point numbers ("ddecimals") are written with
the appropriate number of significant digits to be accurately restored.

Two separate procedures are provided for saving and restoring since it
is anticipated that there will be applications where only one of these is
required.

A simple way of defining save and restore "methods" is provided.

Use with lib objectclass
------------------------
These procedures will save objects generated with lib objectclass, and
will restore them, provided that the specification of object classes
remains constant between saving and restoring. The "methods" draw on the
older paradigm of class_... procedures rather than being methods within
lib objectclass, since this library is not intended to require lib objectclass
to be loaded.