-----------------------------------------------------------------------------
HELP restore_data, save_data Robin Popplestone MAR 1994
This file may be freely copied and modified, provided the above attribution
is preserved or amended appropriately.
The procedure -save_data- visits each node of a Poplog datastructure. By
default it creates a "saved form" of the structure on disc, which can be
read in by -restore_data-.
The user can adapt -save_data- in many versatile ways by supplying her own
versions of various procedures, effectively providing a general structure
mapping capability. For example, an array which indexes a structure can be
generated, and it is even possible to generate C programs which "replicate"
a given POP structure.
CONTENTS - (Use <ENTER> g to access required sections)
-- How to use these facilities to save data to backing store and restore it.
-- The format of the saved data
-- An Example - saving a circular list.
-- Limitations - no saving of widgets, external data and some procedures
-- But class_save class_restore class_update mitigate the limitations.
-- Writing the -save- procedure
-- Writing the -restore- and -update- procedures
-- stack_fields(Rep,m) reads and stacks fields ending with ";"
-- restore_field(Rep,m) reads a single field
-- User definition of the saved format using a property as second argument
-- The "before" property entry specifies how to sort the indices.
-- The "class_save" entry supports data-class specific output.
-- The "new_index" entry generates an index for a Poplog structure.
-- The "save_field" entry specifies how to save a field
-- The "save_struct" entry specifies how to save a whole structure.
-- The "pr_header" entry is called initially to put out a header.
-- The "pr_string" entry can save a string in the desired format.
-- "pr_trailer" is called finally. It may close the output file
-- The "FileSpec" entry specifies where the output is to go.
-- An example - making an array that indexes a structure.
-- restore_data provides more capabilities than datafile
-- Use with lib objectclass
How to use these facilities to save data to backing store and restore it.
-------------------------------------------------------------------------
save_data allows many Pop data structures, including "circular" structures,
to be recorded on backing store, and to be read back. It thus extends the
scope of LIB * DATAFILE. To write a structure to backing store type:
save_data(<struct>, <filename>) -> <property>;
The meaning of the <property> is explained in the section on "User
definition of the saved format". For simple uses of -save_data- it can be
ignored.
Similarly, to read a structure back from backing store, type:
restore_data(<filename>) -> <struct>;
The default permitted datatypes are:
words, numbers, lists, vector types, record types,
vector arrays, ordinary properties, booleans and closures
A named procedure can be "saved"; what this means is that its name is
written to the file on saving. On restoration the procedure with the same
(global) name is restored, using valof. save_data checks that the name of
the procedure in its -pdprops- is associated with the procedure itself as
value.
A similar approach is adopted for -key- objects. The dataword is written
and the corresponding key is regenerated.
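The two calls above can be tried out on a small record. The following is
only a sketch: the record class -point- and the file name 'point.tmp' are
invented for illustration, and it assumes that the library defining
-save_data- and -restore_data- has already been loaded.

```pop11
;;; Hypothetical record class, used only for this illustration.
recordclass point px py;

vars p = conspoint(3, 4);
save_data(p, 'point.tmp') -> ;      ;;; ignore the returned property

vars q = restore_data('point.tmp');
q == p =>                           ;;; identity test: q is built afresh
[% px(q), py(q) %] =>               ;;; field values read back from file
```

The restored record is a newly constructed object, so it should be compared
with the original field by field rather than with ==.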
The format of the saved data
-----------------------------
The whole saved file is described by the syntax:
<saved_file> -> SavedData <version> <date> <field> <objects>
<version> -> V1
Here the <version> specifies which version of save_data was used to save
the data. If the file does not begin with SavedData, the older datafile
capability is assumed to have been used.
<date> is a string, derived from sysdaytime, and records the actual time of
writing of the file.
A <field> is a specification of a POP item (record, vector etc.) which is
either self-contained or is the index referring to a line in the file where
the item actually "is", i.e. a kind of in-file pointer. Simple items and
certain compound items are self-contained.
<field> -> (<index>)      Pointer to the item in line labelled <index>
        -> "<string>"     The word made from the string.
        -> <string>       A POP-11 string.
        -> <word>         A POP-11 variable - take the valof.
The <word> option is used for named procedures and certain system constants
(e.g. false).
In the <saved_file> syntax, the <field> specifies the actual item that has
been saved. This distinguishes it from its sub-items, since it will not
necessarily have any particular index. A simple item will have no <objects>
following it. Thus the number 654 is saved as:
SavedData V1
'Wed Jun 15 13:48:25 EDT 1994'
654
However, normally we have <objects>
<objects> -> <object> <objects>
          -> <null>
where <null> is the grammar which generates just the null sequence.
<object> -> <index> <dataword> <fields> ;
         -> <index> %a <array>
         -> <index> %c <closure>
         -> <index> %p <property>
An array is written out as specified by the following syntax. Here the two
<int> values are the arrayvector_bounds of the array. No updater is written
as it is illegal to change the updater of an array.
<array>      -> <boundslist> <vector> <subscr> <int> <int> <byrow> <props>
<boundslist> -> [ <int> <int> .... ]
<vector>     -> <field>           ;;; The vector of values
<subscr>     -> <field>           ;;; Subscripting procedure
<byrow>      -> true | false      ;;; storage by row?
<props>      -> <field>           ;;; The pdprops
A closure is written out as:
<closure> -> <pdpart> <pdprops> <updater>
Here all three components are written out as fields.
A property is written out as:
<property> -> <alist> <size> <default> <props> <updater>
<alist>    -> <field>   ;;; The association list of the property
<size>     -> <int>     ;;; The size of the hash table
<default>  -> <field>   ;;; The value to be returned by default
<props>    -> <field>   ;;; The -pdprops- field of the property.
<updater>  -> <field>   ;;; The -updater- of the property.
                        ;;; This is -false- if the standard updater
                        ;;; is in use.
An Example - saving a circular list.
-----------------------------------
The circular list created by:
vars circular = [44 fred 'ab' [55 66] last];
circular -> circular.tl.tl.tl.tl.tl;
save_data(circular,'circ.tmp') ->;
creates the following file:
SavedData V1
'Wed Jun 15 13:46:43 EDT 1994'
(1)
1 pair 44 (2);
2 pair "'fred'"(3);
3 pair 'ab'(4);
4 pair (5)(6);
5 pair 55 (7);
6 pair "'last'"(1);
7 pair 66 nil ;
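Reading the file back should give a new list with the same shape,
including the cycle. The sketch below assumes the file 'circ.tmp' written
above still exists; the variable name -restored- is chosen here for
illustration only.

```pop11
vars restored = restore_data('circ.tmp');
hd(restored) =>                             ;;; first element of the list
restored.tl.tl.tl.tl.tl == restored =>      ;;; <true> if the cycle survived
```

The second test follows five -tl- links, which in the original list lead
from the first pair back to itself.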
Limitations - no saving of widgets, external data and some procedures
---------------------------------------------------------------------
Apart from the inability to save anonymous procedures, widgets and external
data, there are other limitations.
(1) This facility only handles properties that can be created with
-newproperty- and assumes that any properties are -permanent-.
(2) Properties which are circular in the sense that they associate
themselves with some other data, for example P below:
P -> P(key)
must NOT be saved - an infinite loop will result. It is anticipated that
this will occur rather seldom. Content-hashed properties are not restored
correctly.
(3) Apart from arrays and properties, no procedure is -truly- saved, since
its name only is written. This is probably the most useful capability,
since, provided there is no change in the specification of procedures, data
should be reusable with a new version of a program.
(4) The updater of properties is not saved.
But class_save class_restore class_update mitigate the limitations.
-------------------------------------------------------------------
Where a structure may contain sub-structures which may not be saved by the
default -save_data- procedure, for example because it contains external
data, or widgets, or anonymous procedures, or if it contains information
that it is not necessary to save, such as a big image taken from a library,
then the user can define her own "methods" using an analog of the simple
built in class_... procedures, which work in a manner analogous to
class_print (see ref print).
For any data class, with data-key Key, which is to be saved by a
user-defined procedure, write a procedure P_save (say) to save members of
that class, a procedure P_restore to restore class-members and a procedure
P_update to update members as described below. Then making the
assignments:
section $-SAVE_DATA;
...... code for P_save etc. (don't forget to import anything you need into
...... the section)
P_save -> class_save(Key);
P_restore -> class_restore(Key);
P_update -> class_update(Key);
endsection;
will ensure that members of that class can be saved and restored.
Writing the -save- procedure
------------------------------
This procedure should write out enough components of the data-structure to
allow it to be restored by the user-written restore procedure. Layout is
not prescribed, but compound components will normally be written using the
-save_field- procedure. This will typically write just an index number for
a compound component, as described in "The format of the saved data" above.
-save_field- must be used if there is any possibility of circularity. The
-dataword- will be written out -before- the -save- procedure is called, and
a semicolon will be written after.
IMPORTANT: when the -restore- and -update- procedures are called, you cannot
assume that any data-structure is properly built yet. The -restore-
procedure actually constructs -your- data-structure, but does not
necessarily put in the actual final values in its components. The -update-
procedure puts in the pointers to the correct structures, but these may not
have been updated. Therefore you -cannot- rely on any information that is
normally in a data-structure actually being there. In the example below,
while the widget can be reconstructed from the size of picture (Pic) that
it is to hold, we cannot assume that -contents_Image- will evaluate to a
picture at any time during the restoration process (except of course at the
end, but we have to act before the end).
An example:
define save_Image(Im);                        ;;; Save an Image, Im
    lvars W = Widget_Image(Im);               ;;; If it is displayed in a
    if W then true -> Widget_Image(Im) endif; ;;; widget, replace by <true>
    lvars Pic = contents_Image(Im);           ;;; Get size of picture
    lvars (dx,dy) = (dx_Pic(Pic),dy_Pic(Pic));
    appdata(Im,save_field);                   ;;; save fields of Im record
    spr(";"); spr(dx); spr(dy);               ;;; save size of picture
    W -> Widget_Image(Im);                    ;;; restore widget
enddefine;
Writing the -restore- and -update- procedures
-----------------------------------------------
These both read data written by the corresponding -save- procedure. The
save file is scanned twice. In the first scan -restore- procedures are
called to build the data-structures, in the second forward references are
updated.
The -restore- function is called with parameters
restore(Rep,m)
It should return the restored data-structure which it has read using the
item-repeater Rep. The parameter -m- is passed on to the -stack_fields-
procedure described below. It is the serial number of the latest record
actually constructed on restore. On update it is very big.
stack_fields(Rep,m) reads and stacks fields ending with ";"
----------------------------------------------------------------
The SAVE_DATA section contains a procedure -stack_fields- which will stack
up fields which it reads until a semicolon is encountered, putting a count
on the stack.
stack_fields(Rep,m) -> Fld_1 ..... Fld_n, n
The procedure -restore_field- is available to read individual components.
restore_field(Rep,m) reads a single field
---------------------------------------------
Here Rep is the item-repeater for the file. m is the index of the latest
record formed. Any reference to a file-record preceding m can be finalised.
Forward references have to be postponed.
    restore_field(Rep,m) -> (Obj,flag)
If the flag = termin then we have read the last field in the object.
If the flag = true then Obj is an integer index to a forward reference.
If the flag = false then Obj is the actual finalised POP-11 object.
Thus the -restore- procedure will normally read and stack components
of the structure and then call the class-constructor. An example:
define restore_Image(Rep,m);
    consImage((stack_fields(Rep,m)->;));    ;;; Build the actual structure
    stack_fields(Rep,m) -> ; -> ; -> ;      ;;; ignore the Widget info
enddefine;
restore_Image -> class_restore(Image_key);
The update procedure is called with 3 parameters, namely
(a) Your own item which is being restored
(b) The item repeater made from the saved data
(c) A parameter -m- which is given to -stack_fields- etc.
It should read the fields off the repeater in the same way as the restore
procedure, but use these fields to update the corresponding fields in
the object itself (-fill- is often a useful way of doing this, see
HELP *FILL).
An example is given below.
define update_Image(Im,Rep,m);              ;;; We read the fields and use
    fill((stack_fields(Rep,m)->),Im)->;     ;;; them to update the structure.
    lvars (dx,dy,_) = stack_fields(Rep,m);  ;;; Read the dimensions of widget
    lvars is_W = Widget_Image(Im);          ;;; If there should be a widget
    if is_W then
        lvars W_new = mk_GraphicWidget('dx'><"*"><'dy', ;;; make one to spec
                                       dx,dy,false,
                                       CB_button_Lv);
        W_new -> Widget_Image(Im);          ;;; and put it in its slot.
    endif;
enddefine;
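The update procedure above relies on -fill- taking the new field values
from the stack and storing them, in order, into an existing structure,
which it then returns. A minimal illustration on an ordinary vector (the
variable name -v- is arbitrary):

```pop11
vars v = {0 0 0};
fill(1, 2, 3, v) -> ;       ;;; overwrite the three elements in place
v =>
** {1 2 3}
```

Because -fill- changes the structure in place, any other structure that
already points to it sees the updated fields, which is what the second scan
of the save file needs.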
User definition of the saved format using a property as second argument
-----------------------------------------------------------------------
The second argument of -save_data- may be a property. In that case it is
used to provide user-defined versions of various procedures used by
save_data. This allows a much more radical redefinition of the action of
-save_data- than the class_save capability defined above. For example
-save_data- can be used (in several passes) to create a C program which
will recreate a POP data-structure.
The concept of "index" can be generalised to include any Poplog object for
which an ordering relation is defined. Thus, if a C program is being
generated, indices will be C-identifiers which will be bound to the objects
being recreated.
The POP-11 call:
save_data(<struct>, <property>) -> <property>
returns a property which maps from the sub-structures of <struct> to their
indices. This property is important if consistency of mapping is required
over repeated calls of -save_data-.
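As a sketch of how the returned property can be inspected, the following
saves a small nested list and then looks up the indices of two of its
sub-structures. The variable names and the file name 'outer.tmp' are
invented for illustration, and the actual index values depend on the order
in which -save_data- visits the structure.

```pop11
vars inner = [55 66];
vars outer = [44 ^inner];
vars index_of = save_data(outer, 'outer.tmp');
index_of(outer) =>      ;;; index given to the first pair of -outer-
index_of(inner) =>      ;;; index given to the first pair of -inner-
```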
The "before" property entry specifies how to sort the indices.
------------------------------------------------------------------
This should be a binary ordering predicate, such as nonop <= which is in
fact the default value. Use -alphabefore- if you are dealing with
identifiers as indices.
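A property that uses words as indices might be set up as in the sketch
below. The procedure -new_word_index-, the variable -spec- and the file
name 'words.tmp' are hypothetical, and it is an assumption that entries
left out of the property fall back to their defaults. This file does not
say whether -restore_data- can read back a file written with non-integer
indices.

```pop11
;;; Hypothetical index generator: words such as pair1, pair2, ...
define new_word_index(Struct);
    gensym(dataword(Struct))
enddefine;

vars spec = newassoc([
        [before ^alphabefore]           ;;; order word indices alphabetically
        [new_index ^new_word_index]
        [FileSpec 'words.tmp']          ;;; hypothetical output file
    ]);

save_data([a [b c]], spec) -> ;
```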
The "class_save" entry supports data-class specific output.
---------------------------------------------------------------
In case -save_data- is needed with class_save entries that are different
for different calls of -save_data-, we provide this -class_save- entry. It
is not expected that many people will need this, because more radical
changes are to be expected, using the "save_struct" entry, below.
The "new_index" entry generates an index for a Poplog structure.
-----------------------------------------------------------------
new_index(Struct) must produce a unique new index for the structure
-Struct-. The default value of -new_index- produces integer indices
starting at 1.
For example, the following generates a new index based on the dataword of
the structure
define new_c_index(Struct);
    gensym(dataword(Struct));
enddefine;
E.g. new_c_index([2 3]) may return "pair234".
Note: if save_data is called multiple times and consistent use of indices
is required, you will probably need to incorporate the property returned by
-save_data- in -new_index-, so that -new_index- returns the index assigned
by a previous pass.
For example:
define new_index_P(Struct);
    P_index(Struct);
enddefine;
Using the property returned by a first pass:
save_data(Struct,spec_declare) -> P_index;
We create a new property:
lvars spec_assign = newassoc([ [new_index ^new_index_P]]);
And call a second pass of -save_data-
save_data(Struct,spec_assign)->;
Deliberately, there is no facility whereby the property returned from one
call from save_data can be reused -directly- in a second call, since the
existence of an entry in the property is taken by -save_data- as evidence
that a given structure has been fully saved.
The "save_field" entry specifies how to save a field
--------------------------------------------------------
The save_field procedure, specified above, can be redefined. Typically more
radical modifications of the output of -save_data- will use the
"save_struct" entry below.
The "save_struct" entry specifies how to save a whole structure.
--------------------------------------------------------------------
This procedure is usually the most complex one to write for any advanced
use of save_data. Typically, -save_struct(Struct)- will call -appdata- to
write out every component of -Struct-. For example:
define assign_c(Struct);
    appdata(Struct,save_field_c(%Struct,
                                 datakey(Struct),consref(1)%));
enddefine;
Here -save_field_c- is the procedure that does the work of writing out the
components, using its partially applied arguments for information about the
parent structure. The procedure
SAVE_DATA$-index
should be called to generate a new index for a POP-11 data-structure if
necessary. Thus, usually the -index- of a compound component will be
written out rather than the component, by the procedure which saves each
component.
Note that save_struct will have to treat some POP-11 data-structures rather
specially. These include:
false, true, nil, termin, procedures, strings, ddecimals
The "pr_header" entry is called initially to put out a header.
--------------------------------------------------------------------
pr_header(Struct) will be called before any other output is attempted.
Typically it will print out header information on an output file.
Note that it -must- also evaluate SAVE_DATA$-index(Struct) either directly
or indirectly, by calling a procedure that does. This will have the effect
of generating a queue of structures to be saved subsequently. If there is
no queue, nothing gets saved.
This requirement of evaluation may appear bizarre - but it is not so if we
think of saving a POP-11 simple item - in this case the header will contain
just that item and there is no queue of sub-structures to be saved, since
it -has- no sub-structures.
The "pr_string" entry can save a string in the desired format.
------------------------------------------------------------------
Because strings are not normally treated as -compound data objects- by
-save_data-, you must use this capability to redefine the output of
strings.
"pr_trailer" is called finally. It may close the output file
--------------------------------------------------------------------
The "FileSpec" entry specifies where the output is to go.
---------------------------------------------------------------
This entry specifies a file or repeater in the same way as the non-property
second argument of -save_data-.
An example - making an array that indexes a structure.
----------------------------------------------------------
lvars index = SAVE_DATA$-index;
define save_in_array(Struct)->A;
    lvars A = newarray([1 1000], undef);
    define lvars save_s(Struct);
        lvars i = index(Struct);
        Struct -> A(i);
        if class_field_spec(datakey(Struct)) then
            appdata(Struct,index<>erase);
        endif;
    enddefine;
    define lvars header(Struct);
        [% 'saved-data', index(Struct) %] -> pdprops(A) ;
    enddefine;
    save_data(Struct, newassoc([
        [pr_header ^header]
        [save_struct ^save_s]
        [FileSpec ^charout]
        [pr_trailer ^erase]
        ])) -> ;
enddefine;
save_in_array({3 4 [5]}) -> A;
arrayvector(A) =>
** {{3 4 [5] } 3 4 [5] 5 [] undef undef undef undef undef ...}
pdprops(A) =>
** [saved-data 1]
restore_data provides more capabilities than datafile
-----------------------------------------------------
The scope is extended primarily to include circular structures. In addition
double precision floating point numbers ("ddecimals") are written with the
appropriate number of significant digits to be accurately restored.
Two separate procedures are provided for saving and restoring since it is
anticipated that there will be applications where only one of these is
required.
A simple way of defining save and restore "methods" is provided.
Use with lib objectclass
------------------------
These procedures will save objects generated with lib objectclass, and will
restore them, provided that the specification of object classes remains
constant between saving and restoring. The "methods" draw on the older
paradigm of class_... procedures rather than being methods within lib
objectclass, since this library is not intended to require lib objectclass
to be loaded.