                          The M-code Instruction Set
                          ==========================

                                 Robert Smith
                                  June 1988

1.0 Introduction
----------------

POPC is a user program that extends the normal POP-11 compiler to form
SYSPOP-11, the systems programming language of POPLOG. These extensions allow
such facilities as definition of structures, manipulation of pointers and
machine integers, etc, and the language starts to bear a strong similarity to
'C'. When compiling a system source file, calls are made to the VM interface
in the same manner as for user programs, but rather than optimisation occurring
as the file is processed, an unadulterated (but slightly modified) code-list
of VM instructions for each procedure is passed to POPC. POPC optimises this
code stream and translates it to an intermediate representation called M-code
(this corresponds to a multiple-operand machine instruction set with
generalised addressing modes). For each M-code there is a procedure
responsible for translating that M-code into the equivalent target machine
assembler.

This note describes the instructions and operands of M-code. A later document
will describe other aspects of the system code generation process such as
register declaration and use, target code emission, inline code expansions,
etc. The M-code to target assembler translation routines are located in
$popsrc/syscomp/genproc.p for a given machine.

The following descriptions are not completely machine independent. There is an
implicit assumption that a 32-bit machine is used. The notes which follow some
instructions assume byte-addressability (the case for all current POPLOG
hosts), which allows the following tagging scheme:


A pointer to an object (all objects word aligned, thus same as object address)
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | 30-bit word index                                         |0|0|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

POP integer
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | 30-bit signed integer                                     |1|1|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

POP decimal
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | 30-bit decimal                                            |0|1|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
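
The tagging scheme above can be illustrated with a small sketch. It is written
in Python purely for illustration; the helper names (make_pint, pint_value,
is_pointer) are hypothetical and are not part of POPC. Only the tag values are
taken from the diagrams.

```python
MASK32 = 0xFFFFFFFF
TAG_MASK = 0b11
POINTER_TAG = 0b00   # word-aligned object pointer
DECIMAL_TAG = 0b01   # POP decimal
PINT_TAG = 0b11      # POP integer

def make_pint(n):
    """Pack a signed 30-bit integer into a tagged 32-bit word."""
    if not -(1 << 29) <= n < (1 << 29):
        raise OverflowError("value does not fit in 30 bits")
    return ((n << 2) | PINT_TAG) & MASK32

def pint_value(word):
    """Recover the signed integer from a tagged POP integer word."""
    if word & TAG_MASK != PINT_TAG:
        raise ValueError("not a POP integer")
    signed = word - (1 << 32) if word & 0x80000000 else word
    return signed >> 2   # arithmetic shift drops the two tag bits

def is_pointer(word):
    """True if the two tag bits mark a word-aligned object pointer."""
    return word & TAG_MASK == POINTER_TAG
```

Under these assumptions the POP integer 5 is held as the machine word 23
(5 * 4 + 3), and any word-aligned address passes is_pointer unchanged.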


2.0 The M-code Instruction Set
------------------------------

The M-code instruction set has a total of 47 instructions. I have divided them
into 7 major classes, based in some cases on physical rather than logical
connections. The instruction groups are:


Data Movement Instructions:

    M_MOVE
    M_MOVEs
    M_MOVEb
    M_MOVEbit
    M_MOVEss
    M_MOVEsb
    M_MOVEsbit
    M_UPDs
    M_UPDb
    M_UPDbit

Arithmetic Instructions:

    M_ADD
    M_SUB
    M_MULT
    M_NEG
    M_PADD
    M_PSUB
    M_PADD_TEST
    M_PSUB_TEST
    M_PTR_ADD_OFFS
    M_PTR_SUB_OFFS
    M_PTR_SUB

Logical and Shift Instructions:

    M_BIS
    M_BIC
    M_BIM
    M_LOGCOM
    M_ASH

Compare and Test Instructions:

    M_BIT
    M_CMP
    M_TEST
    M_PCMP
    M_PTR_CMP
    M_CMPKEY

Branch Instructions:

    M_BRANCH
    M_BRANCH_std
    M_BRANCH_ON
    M_BRANCH_ON_INT

Procedure Call and Stack Frame Instructions:

    M_CALL
    M_CALLSUB
    M_CALL_WITH_RETURN
    M_RETURN
    M_CHAIN
    M_CHAINSUB
    M_CREATE_SF
    M_UNWIND_SF

Miscellaneous:

    M_LABEL
    M_ERASE
    M_END

The remainder of this section describes the abstract syntax and operation of
each of the instructions. The syntax is abstract because each instruction
really appears as a vector whose first element is (a pointer to) an M-code
translation routine. The descriptions are not complete because some (e.g.
stack frame instructions) use data from variables rather than arguments.
Addressing modes and test conditions are dealt with in the next section.


2.1 Data Movement Instructions
------------------------------

M_MOVE                                                             Move Word

Syntax:         M_MOVE src dest

Description:    Move the contents of -src- to -dest-.

Operation:      dest:int = src:int


M_MOVEs                                                  Move Unsigned Short

Syntax:         M_MOVEs src dest

Description:    Move the unsigned short at -src- to word at -dest-.

Operation:      dest:<15:0>  = src:short
                dest:<31:16> = 0


M_MOVEb                                                   Move Unsigned Byte

Syntax:         M_MOVEb src dest

Description:    Move the unsigned byte at -src- to word at -dest-.

Operation:      dest:<7:0>  = src:byte
                dest:<31:8> = 0


M_MOVEbit                                            Move Unsigned Bit Field

Syntax:         M_MOVEbit size pos base dest

Description:    Move the unsigned bit field in word -base- starting at bit
                -pos- and extending up for -size- bits to the word at
                -dest-.

Operation:      dest:<size-1:0> = base:<size+pos-1:pos>
                dest:<31:size>  = 0


M_MOVEss                                                   Move Signed Short

Syntax:         M_MOVEss src dest

Description:    Move the signed short at -src- to word at -dest-.

Operation:      dest:<15:0>  = src:short
                dest:<31:16> = src:<15>


M_MOVEsb                                                    Move Signed Byte

Syntax:         M_MOVEsb src dest

Description:    Move the signed byte at -src- to word at -dest-.

Operation:      dest:<7:0>  = src:byte
                dest:<31:8> = src:<7>


M_MOVEsbit                                             Move Signed Bit Field

Syntax:         M_MOVEsbit size pos base dest

Description:    Move the signed bit field in word -base- starting at bit
                -pos- and extending up for -size- bits to the word at
                -dest-.

Operation:      dest:<size-1:0> = base:<size+pos-1:pos>
                dest:<31:size>  = base:<size+pos-1>


M_UPDs                                                          Update Short

Syntax:         M_UPDs src dest

Description:    Move the least significant short at -src- to word at -dest-.

Operation:      dest:<15:0>  = src:<15:0>
                dest:<31:16> = unaffected


M_UPDb                                                           Update Byte

Syntax:         M_UPDb src dest

Description:    Move the least significant byte at -src- to word at -dest-.

Operation:      dest:<7:0>  = src:<7:0>
                dest:<31:8> = unaffected
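

The short and byte forms described so far differ only in how the upper bits of
-dest- are treated. A minimal Python sketch, assuming 32-bit words held as
non-negative integers (the function names are hypothetical; -bits- is 16 for
the short forms and 8 for the byte forms):

```python
MASK32 = 0xFFFFFFFF

def move_unsigned(src, bits):
    """M_MOVEs / M_MOVEb: zero-extend the low -bits- bits of src."""
    return src & ((1 << bits) - 1)

def move_signed(src, bits):
    """M_MOVEss / M_MOVEsb: sign-extend the low -bits- bits of src."""
    mask = (1 << bits) - 1
    value = src & mask
    if value & (1 << (bits - 1)):        # top bit of the field is set
        value |= MASK32 & ~mask          # replicate it into the upper bits
    return value

def update(src, dest, bits):
    """M_UPDs / M_UPDb: replace the low bits of dest, keep the rest."""
    mask = (1 << bits) - 1
    return (dest & ~mask & MASK32) | (src & mask)
```

Each function returns the new value of -dest- rather than storing it, which
keeps the sketch independent of any particular memory model.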


M_UPDbit                                                    Update Bit Field

Syntax:         M_UPDbit size pos base src

Description:    Move the -size- least significant bits from the word at
                -src- to the bit field in -base- starting at bit position
                -pos- and extending up for -size- bits.

Operation:      base:<size+pos-1:pos> = src:<size-1:0>
                base:<pos-1:0>        = unaffected
                base:<31:size+pos>    = unaffected
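

The three bit field instructions can be sketched in the same way. This is an
illustrative Python model of the Operation lines above, not the generated
code; the function names are hypothetical and words are 32-bit non-negative
integers.

```python
MASK32 = 0xFFFFFFFF

def move_bit(size, pos, base):
    """M_MOVEbit: extract an unsigned field of -size- bits at bit -pos-."""
    return (base >> pos) & ((1 << size) - 1)

def move_sbit(size, pos, base):
    """M_MOVEsbit: extract the field and sign-extend it to 32 bits."""
    field = move_bit(size, pos, base)
    if field & (1 << (size - 1)):                 # field's top bit set
        field |= MASK32 & ~((1 << size) - 1)      # fill bits 31..size
    return field

def upd_bit(size, pos, base, src):
    """M_UPDbit: write the low -size- bits of src into base at -pos-."""
    mask = ((1 << size) - 1) << pos
    return ((base & ~mask) | ((src << pos) & mask)) & MASK32
```

For example, the 4-bit field at bit 4 of 0xAB is 0xA when read unsigned and
0xFFFFFFFA when read signed, since its top bit is set.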



2.2 Arithmetic Instructions
---------------------------

M_ADD                                                   Add Machine Integers

Syntax:         M_ADD src1 src2 dest

Description:    Add machine integer contents of -src1- to machine integer
                contents of -src2- and put machine integer result in -dest-.

Operation:      dest:int = src2:int + src1:int


M_SUB                                              Subtract Machine Integers

Syntax:         M_SUB src1 src2 dest

Description:    Subtract machine integer contents of -src1- from machine
                integer contents of -src2- and put machine integer result
                in -dest-.

Operation:      dest:int = src2:int - src1:int


M_MULT                                             Multiply Machine Integers

Syntax:         M_MULT src1 src2 dest

Description:    Multiply machine integer contents of -src2- by machine integer
                contents of -src1- and put machine integer result in -dest-.

Operation:      dest:int = src2:int * src1:int


M_NEG                                                 Negate Machine Integer

Syntax:         M_NEG src dest

Description:    Negate machine integer contents of -src- and put machine
                integer result in -dest-.

Operation:      dest:int = 0:int - src:int


M_PADD                                                      Add POP Integers

Syntax:         M_PADD src1 src2 dest

Description:    Add POP integer contents of -src1- to POP integer contents
                of -src2- and put POP integer result in -dest-.

Operation:      dest:pint = src2:pint + src1:pint

Notes:          With normal POP integer representation and machine arithmetic:
                dest = src2 + (src1 - 0x3)


M_PSUB                                                 Subtract POP Integers

Syntax:         M_PSUB src1 src2 dest

Description:    Subtract POP integer contents of -src1- from POP integer
                contents of -src2- and put POP integer result in -dest-.

Operation:      dest:pint = src2:pint - src1:pint

Notes:          With normal POP integer representation and machine arithmetic:
                dest = src2 - (src1 - 0x3)
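

The correction by 0x3 in the notes for M_PADD and M_PSUB follows from the tag
bits: a POP integer n is held as 4n + 3, so adding two tagged words directly
would count the tag twice, and subtracting them would cancel it. A short
Python check of this arithmetic (the helper -tag- is hypothetical):

```python
MASK32 = 0xFFFFFFFF

def tag(n):
    """Tagged 32-bit representation of the POP integer n (4n + 3)."""
    return ((n << 2) | 3) & MASK32

def m_padd(src1, src2):
    """dest = src2 + (src1 - 0x3), in 32-bit machine arithmetic."""
    return (src2 + (src1 - 3)) & MASK32

def m_psub(src1, src2):
    """dest = src2 - (src1 - 0x3), in 32-bit machine arithmetic."""
    return (src2 - (src1 - 3)) & MASK32
```

With these definitions m_padd(tag(2), tag(5)) gives tag(7) and
m_psub(tag(2), tag(5)) gives tag(3), so the result carries exactly one tag.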


M_PADD_TEST                                       Add POP Integers With Test

Syntax:         M_PADD_TEST src1 src2 cond label

Description:    Add POP integer contents of -src1- to POP integer contents
                of -src2- and push the POP integer result on the stack. If
                the -cond- is true then branch to the -label- else continue.

Operation:      push (src2:pint + src1:pint) on user stack
                if cond then PC = label

Notes:          Calculation as for M_PADD. In practice -cond- is always an
                overflow test.


M_PSUB_TEST                                  Subtract POP Integers With Test

Syntax:         M_PSUB_TEST src1 src2 cond label

Description:    Subtract POP integer contents of -src1- from POP integer
                contents of -src2- and push the POP integer result on the
                stack. If the -cond- is true then branch to the -label- else
                continue.

Operation:      push (src2:pint - src1:pint) on user stack
                if cond then PC = label

Notes:          Calculation as for M_PSUB. In practice -cond- is always an
                overflow test.
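

Since -cond- is in practice an overflow test, the pair M_PADD_TEST and
M_PSUB_TEST can be modelled as an operation returning both the result word and
an overflow flag. The Python sketch below assumes that overflow means the
exact signed result does not fit in 32 bits; the function names are
hypothetical, and the push and the branch are left to the caller.

```python
MASK32 = 0xFFFFFFFF

def to_signed(word):
    """Interpret a 32-bit word as a signed machine integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word

def tag(n):
    """Tagged 32-bit representation of the POP integer n (4n + 3)."""
    return ((n << 2) | 3) & MASK32

def padd_test(src1, src2):
    """Return (result word, overflow flag) for src2 + src1."""
    exact = to_signed(src2) + (to_signed(src1) - 3)
    overflow = not -(1 << 31) <= exact < (1 << 31)
    return exact & MASK32, overflow

def psub_test(src1, src2):
    """Return (result word, overflow flag) for src2 - src1."""
    exact = to_signed(src2) - (to_signed(src1) - 3)
    overflow = not -(1 << 31) <= exact < (1 << 31)
    return exact & MASK32, overflow
```

Adding 1 to the largest POP integer (2**29 - 1) sets the flag, which is the
case in which the generated code would take the branch to -label-.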


M_PTR_ADD_OFFS                                            Add Pointer Offset

Syntax:         M_PTR_ADD_OFFS type off base dest

Description:    Add offset -off- to pointer in -base- to form pointer at
                -dest-. Pointers and offsets of type -type-.

Operation:      dest:ptr = base:ptr + off:offs

Notes:          As machine arithmetic for byte-addressable machines. -type-
                is irrelevant.


M_PTR_SUB_OFFS                                       Subtract Pointer Offset

Syntax:         M_PTR_SUB_OFFS type off base dest

Description:    Subtract offset -off- from pointer in -base- to form pointer
                at -dest-. Pointers and offsets of type -type-.

Operation:      dest:ptr = base:ptr - off:offs

Notes:          As machine arithmetic for byte-addressable machines. -type-
                is irrelevant.


M_PTR_SUB                                                  Subtract Pointers

Syntax:         M_PTR_SUB type ptr1 ptr2 dest

Description:    Subtract pointer -ptr1- from pointer -ptr2- to form offset
                in -dest-. Pointers and offsets of type -type-.

Operation:      dest:offs = ptr2:ptr - ptr1:ptr

Notes:          As machine arithmetic for byte-addressable machines. -type-
                is irrelevant.


2.3 Logical and Shift Instructions
----------------------------------

M_BIS                                                                Bit Set

Syntax:         M_BIS src1 src2 dest

Description:    Set bits in -src2- that are set in -src1- and put the
                result in -dest-.

Operation:      dest:int = src2:int || src1:int


M_BIC                                                              Bit Clear

Syntax:         M_BIC src1 src2 dest

Description:    Clear bits in -src2- that are set in -src1- and put the
                result in -dest-.

Operation:      dest:int = src2:int && ~~ src1:int


M_BIM                                                               Bit Mask

Syntax:         M_BIM src1 src2 dest

Description:    Clear bits in -src2- that are clear in -src1- and put the
                result in -dest-.

Operation:      dest:int = src2:int && src1:int


M_LOGCOM                                                          Complement

Syntax:         M_LOGCOM src dest

Description:    Put complement of -src- in -dest-.

Operation:      dest:int = ~~ src:int


M_ASH                                                       Shift Arithmetic

Syntax:         M_ASH count src dest

Description:    Perform arithmetic shift of -count- bits on -src- and put
                result in -dest-. A positive -count- shifts left, with zeroes
                shifted in from the right; a negative -count- shifts right,
                with the sign bit replicated from the left.

Operation:      dest:int = src:int << count:int   (arithmetic shift)
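

The logical and shift operations map directly onto bitwise operators. A
minimal Python sketch on 32-bit words (function names hypothetical), in which
a negative -count- to m_ash is taken to mean an arithmetic right shift:

```python
MASK32 = 0xFFFFFFFF

def to_signed(word):
    """Interpret a 32-bit word as a signed machine integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word

def m_bis(src1, src2):
    return (src2 | src1) & MASK32        # set bits that are set in src1

def m_bic(src1, src2):
    return src2 & ~src1 & MASK32         # clear bits that are set in src1

def m_bim(src1, src2):
    return src2 & src1 & MASK32          # keep only bits set in src1

def m_logcom(src):
    return ~src & MASK32                 # one's complement

def m_ash(count, src):
    value = to_signed(src)
    if count >= 0:
        return (value << count) & MASK32  # zeroes enter from the right
    return (value >> -count) & MASK32     # sign bit enters from the left
```

Python integers are unbounded, so each result is masked back to 32 bits; a
real target would get this truncation from the register width.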


2.4 Compare and Test Instructions
---------------------------------

M_BIT                                                               Bit Test

Syntax:         M_BIT mask src cond label

Description:    If the logical AND of -mask- and -src- gives a result for
                which -cond- is true then jump to -label-, else continue.

Operation:      src:int && mask:int (sets condition codes)
                if cond then PC = label


M_TEST                                                  Test Machine Integer

Syntax:         M_TEST src cond label

Description:    If -src- compared with zero gives -cond- true then jump to
                the -label-, else continue.

Operation:      src:int - 0:int (sets condition codes)
                if cond then PC = label


M_CMP                                               Compare Machine Integers

Syntax:         M_CMP src1 src2 cond label

Description:    Compare machine integers -src1- and -src2-. If -cond- is
                true then jump to -label-, else continue.

Operation:      src2:int - src1:int (sets condition codes)
                if cond then PC = label


M_PCMP                                                  Compare POP Integers

Syntax:         M_PCMP src1 src2 cond label

Description:    Compare POP integers -src1- and -src2-. If -cond- is true
                then jump to -label-, else continue.

Operation:      src2:pint - src1:pint (sets condition codes)
                if cond then PC = label

Notes:          As machine integer compare for current implementations.


M_PTR_CMP                                                   Compare Pointers

Syntax:         M_PTR_CMP type src1 src2 cond label

Description:    Compare pointers -src1- and -src2-. If -cond- is true then
                jump to -label-, else continue. The pointers are of type
                -type-.

Operation:      src2:ptr - src1:ptr (sets condition codes)
                if cond then PC = label

Notes:          As machine integer compare for current implementations.


M_CMPKEY                                                         Compare Key

Syntax:         M_CMPKEY key src cond label

Description:    Compare the key -key- with the keyfield of the object -src-.
                If -cond- is true then jump to -label-, else continue. If the
                object is simple then the key will not match.

Operation:      if issimple(src) then
                    set condition codes 'not equal'
                else
                    key(src):key - key:key (set condition codes)
                endif
                if cond then PC = label

Notes:          Only EQ and NEQ conditions are sensible.
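

The M_CMPKEY operation can be sketched as follows. Two things here are
assumptions rather than facts from this note: that a simple item is one whose
bottom tag bit is set (inferred from the tagging diagrams, where both POP
integers and POP decimals have bit 0 set and pointers do not), and the
-key_of- callable, a hypothetical stand-in for fetching the key field of an
object.

```python
def is_simple(word):
    # Assumption: bit 0 set marks a POP integer or POP decimal.
    return (word & 1) == 1

def m_cmpkey(key, src, cond, key_of):
    """Return True if the branch to -label- would be taken."""
    if is_simple(src):
        equal = False                 # a simple item never matches a key
    else:
        equal = key_of(src) == key    # compare against the object's key
    if cond == "EQ":
        return equal
    if cond == "NEQ":
        return not equal
    raise ValueError("only EQ and NEQ are sensible for M_CMPKEY")
```

Passing a dictionary lookup as -key_of- is enough to try it out, for example
m_cmpkey("vector_key", 0x1000, "EQ", {0x1000: "vector_key"}.get).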


2.5 Branch Instructions
-----------------------

M_BRANCH                                                              Branch

Syntax:         M_BRANCH label

Description:    Transfer control to code at -label-.

Operation:      PC = label


M_BRANCH_std                                                 Standard Branch

Syntax:         M_BRANCH_std label

Description:    Transfer control to code at -label-. Equivalent to M_BRANCH
                at the M-code level, but guaranteed to generate target branch
                code of fixed size. Used in procedure code to standardize
                separation between two code entry points.

Operation:      PC = label


M_BRANCH_ON                                            Branch On POP Integer

Syntax:         M_BRANCH_ON switch label_list else_label

Description:    Transfer control to one of the labels in -label_list- given
                by the value of the POP integer -switch-, where a value of 1
                implies the first label. If the -switch- is out of range then
                jump to -else_label- (if -else_label- is false then continue).

Operation:      if switch >= 1 and switch <= length(label_list) then
                    goto label_list(switch)
                elseif else_label then
                    goto else_label
                endif


M_BRANCH_ON_INT                                    Branch On Machine Integer

Syntax:         M_BRANCH_ON_INT switch label_list else_label

Description:    Transfer control to one of the labels in -label_list- given
                by the value of the machine integer -switch-, where a value of
                1 implies the first label. If the -switch- is out of range
                then jump to -else_label- (if -else_label- is false then
                continue).

Operation:      if switch >= 1 and switch <= length(label_list) then
                    goto label_list(switch)
                elseif else_label then
                    goto else_label
                endif
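

The selection rule shared by M_BRANCH_ON and M_BRANCH_ON_INT is easy to get
wrong by one, so here is a small Python sketch of it. The function name is
hypothetical, -switch- is taken as an already decoded integer, and a return
value of None stands for "continue with the next instruction".

```python
def branch_target(switch, label_list, else_label):
    """Return the label to jump to, or None to fall through."""
    if 1 <= switch <= len(label_list):
        return label_list[switch - 1]   # a value of 1 selects the first label
    if else_label:                      # false means there is no else branch
        return else_label
    return None
```

For M_BRANCH_ON the POP integer would first have to be converted to a machine
integer; that step is omitted here.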


2.6 Procedure Call and Stack Frame Instructions
-----------------------------------------------

M_CALL                                                    Call POP Procedure

Syntax:         M_CALL call

Description:    Call execute address of POP procedure -call-.

Operation:      push return PC on call stack
                PC = call


M_CALLSUB                                             Call Assembler Routine

Syntax:         M_CALLSUB call

Description:    Call assembler routine with entry address -call-.

Operation:      push return PC on call stack
                PC = call


M_CALL_WITH_RETURN              Call POP Procedure With Given Return Address

Syntax:         M_CALL_WITH_RETURN call return

Description:    Push supplied return address -return- and chain execute
                address of POP procedure -call-.

Operation:      push return address on call stack
                PC = call


M_CALLER_RETURN                        Access/Update Caller's Return Address

Syntax:         M_CALLER_RETURN update operand

Description:    If -update- is false, move the return address into the
                caller of the current procedure to destination -operand-.
                If -update- is true, set the caller's return address to
                the value from source -operand-.

Operation:      operand = caller's return address (update false)
                caller's return address = operand (update true)

Notes:          Optional: if not defined, caller's return address is
                assumed to be in an ordinary memory location in the
                current stack frame.
                (Currently used only in the SPARC implementation, where
                caller's return address is held in a register, and is
                offset by -8 from the actual return address.)


M_RETURN                                                              Return

Syntax:         M_RETURN

Description:    Return from POP procedure.

Operation:      Pop PC from call stack


M_CHAIN                                                  Chain POP Procedure

Syntax:         M_CHAIN chain

Description:    Chain execute address of POP procedure -chain-.

Operation:      PC = chain


M_CHAINSUB                                           Chain Assembler Routine

Syntax:         M_CHAINSUB chain

Description:    Chain assembler routine with entry address -chain-.

Operation:      PC = chain


M_CREATE_SF                                               Create Stack Frame

Syntax:         M_CREATE_SF

Description:    Create stack frame for POP procedure on call stack.

Operation:      save machine registers
                save dynamic locals
                allocate and zero locals on stack
                allocate space for non pop variables
                save owner pointer

Notes:          Instruction uses information from variables rather than
                arguments.


M_UNWIND_SF                                               Unwind Stack Frame

Syntax:         M_UNWIND_SF

Description:    Unwind stack frame.

Operation:      remove owner pointer and stack variables
                restore dynamic locals
                restore machine registers

Notes:          Instruction uses information from variables rather than
                arguments.


2.7 Miscellaneous Instructions
------------------------------

M_LABEL                                                                Label

Syntax:         M_LABEL label

Description:    Define label -label- of next M-code instruction.

Operation:      Define assembler label


M_ERASE                                                                Erase

Syntax:         M_ERASE dest

Description:    Erase one element from stack specified by auto-indexed operand
                -dest-.

Operation:      If -dest- is auto-indexed operand then modify index by normal
                offset.


M_END                                                                    End

Syntax:         M_END

Description:    End of M-code instruction stream.

Operation:      None


3.0 M-code Operand Addressing Modes
-----------------------------------

The representation of addressing modes in M-code instructions has already been
described by Simon Nichols. The following table is adapted from his document.


M-code operand type  VAX Addressing mode       Example translation     
-------------------  -------------------       -------------------

Word                 Register                 "R1"     -->      R1

Integer              Immediate                4                 $4

Ref
 <ref 'label'>       Immediate                <ref 'foo'>       $foo

String               Absolute                 'label'           label

Vector
 {^reg 0}            Register Deferred        {r1 0}            (r1)

 {^reg ^disp}        Displacement             {r1 4}            4(r1)

 {^reg ^false}       Autodecrement            {r1 false}        -(r1)

 {^reg ^true}        Autoincrement            {r1 true}         (r1)+

 {^reg ^disp ^reg'}  Based Indexed with       {a1 4 d2}         a1@(4,d2:L)
                     Displacement (68K)

Pair

 [^operand'|^disp]   Autoincrement deferred   [{r1 0}|0]        @0(r1)
                     or displacement deferred
                     (effective address is
                     value of operand plus
                     displacement)
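

The table can be read as a translation function from operand to VAX assembler
text. The Python sketch below is an illustration only: Reg, Ref and Deferred
are hypothetical stand-ins for the POP-11 word, ref and pair, a two-element
tuple stands in for the vector, the 68K three-element form is left out, and
for the pair only the case shown in the table (a zero pair displacement) is
handled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reg:            # stands in for a POP-11 word naming a register
    name: str

@dataclass(frozen=True)
class Ref:            # stands in for <ref 'label'>
    label: str

@dataclass(frozen=True)
class Deferred:       # stands in for the pair [operand|disp]
    operand: tuple
    disp: int

def vax_operand(op):
    """Translate one M-code operand into VAX assembler syntax."""
    if isinstance(op, Reg):
        return op.name                        # register
    if isinstance(op, Ref):
        return "$" + op.label                 # immediate label
    if isinstance(op, str):
        return op                             # absolute
    if isinstance(op, int) and not isinstance(op, bool):
        return f"${op}"                       # immediate integer
    if isinstance(op, tuple) and len(op) == 2:
        reg, disp = op
        if disp is False:
            return f"-({reg.name})"           # autodecrement
        if disp is True:
            return f"({reg.name})+"           # autoincrement
        if disp == 0:
            return f"({reg.name})"            # register deferred
        return f"{disp}({reg.name})"          # displacement
    if isinstance(op, Deferred) and op.disp == 0:
        reg, disp = op.operand
        if isinstance(disp, int) and not isinstance(disp, bool):
            return f"@{disp}({reg.name})"     # displacement deferred
    raise ValueError(f"operand form not covered by this sketch: {op!r}")
```

The boolean cases are tested before the integer ones because Python treats
True and False as integers, which would otherwise hide the auto-indexed modes.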

3.1 Conditions
--------------

The conditions referred to in the compare instructions will be one of the
following:

        EQ      equal to
        NEQ     not equal to
        LT      signed less than
        LEQ     signed less than or equal to
        GT      signed greater than
        GEQ     signed greater than or equal to
        ULT     unsigned less than
        ULEQ    unsigned less than or equal to
        UGT     unsigned greater than
        UGEQ    unsigned greater than or equal to
        NEG     negative
        POS     positive
        OVF     overflow
        NOVF    not overflow
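
The difference between the signed and unsigned conditions can be seen in a
small Python sketch of M_CMP. It assumes, from the Operation line
"src2:int - src1:int", that a condition such as LT is read as "-src2- is less
than -src1-"; that reading is an inference, not something this note states.
NEG, POS, OVF and NOVF depend on the result of an arithmetic operation and are
not modelled.

```python
MASK32 = 0xFFFFFFFF

def to_signed(word):
    """Interpret a 32-bit word as a signed machine integer."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word

def condition_holds(cond, src1, src2):
    """Evaluate a relational condition for a compare of src2 with src1."""
    u1, u2 = src1 & MASK32, src2 & MASK32      # unsigned views
    s1, s2 = to_signed(u1), to_signed(u2)      # signed views
    table = {
        "EQ": u2 == u1,   "NEQ": u2 != u1,
        "LT": s2 < s1,    "LEQ": s2 <= s1,
        "GT": s2 > s1,    "GEQ": s2 >= s1,
        "ULT": u2 < u1,   "ULEQ": u2 <= u1,
        "UGT": u2 > u1,   "UGEQ": u2 >= u1,
    }
    return table[cond]
```

The word 0xFFFFFFFF is -1 when read signed and the largest value when read
unsigned, so it is LT 5 but not ULT 5.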


3.2 Pointer Types
-----------------

The pointer types that are generated for some M-codes are:

        1       byte
        2       short
        3       word

As we have seen, for byte-addressable machines these can be ignored.


Appendix 1.0 - Types
--------------------

The operation descriptions of the M-codes made rather cavalier use of types.
They are listed here with the intention that one day they may become more
formal:

        int     32-bit integer
        short   16-bit integer
        byte    8-bit integer
        <n:m>   Bit field from bit n down to bit m in 32-bit integer
        pint    POP integer
        ptr     Pointer
        offs    Offset
        key     Key
