Re: Layout floating point numbers

From: Ullrich von Bassewitz (uz_at_musoftware.de)
Date: 2002-10-05 14:59:19

Hi!

On Sat, Oct 05, 2002 at 12:21:55PM +0200, ruud.baltissen@abp.nl wrote:
> I made a global variable of the length that can be altered using a
> directive. This can save memory when using this compiler on a small
> computer.

Note: Both mentioned compilers have hardcoded lengths for the keywords. You
will have to change that also.

> >   * Both are using displays
>
> This was one of the things I didn't understand. The moment an identifier is
> declared, I give it a relative address. So the moment I need the data I only
> have to perform a "LDA BaseAddress + RelAddress"

While this is the most common case (at least if the compiler does not support
static/global variables), it's not always as easy as this. Assume the
following program:

----------------------------------------------------------------------------
    PROCEDURE P1 (L: INTEGER);
    VAR
      I: INTEGER;
      PROCEDURE P2 (L: INTEGER);
        PROCEDURE P3 (L: INTEGER);
          PROCEDURE P4 (L: INTEGER);
          BEGIN
            IF (L = 0) THEN I := 1 ELSE P4 (L-1);
          END;
        BEGIN
          P4 (L);
        END;
      BEGIN
        P3 (L);
      END;
    BEGIN
      P2 (L);
    END;

  BEGIN
    P1 (5);
  END;
----------------------------------------------------------------------------

The problem here is: When the inner procedure (P4) stores something into the
variable I, which lives in another stack frame, it does not know where that
frame resides on the stack.

There are two solutions to this problem: one is called displays, the other
frame links.

When using displays, each nested procedure receives, as a hidden parameter, a
list of pointers to the stack frames of all procedures that enclose it
lexically. That is, P4 receives an additional parameter as if it were declared
like this:

        PROCEDURE P4 (Display: ARRAY[1..3] OF WORD; L: INTEGER);

Display[1] contains the stack frame pointer needed to access variables in the
scope of P1, Display[2] contains the same for P2, and Display[3] the stack
frame pointer for P3.

P3 receives one pointer fewer, because it is enclosed by only two other lexical
scopes (Display[1] for P1 and Display[2] for P2):

        PROCEDURE P3 (Display: ARRAY[1..2] OF WORD; L: INTEGER);

When calling P4, the code in P3 must copy its own display onto the stack, add
its own stack frame pointer as the third element, and then do the call.
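
In 6502 code, that call setup might look roughly like the following. This is
only a sketch: FP (a zero page pointer to P3's own frame), DISP (the offset of
P3's display inside that frame) and pushax (an assumed runtime routine that
pushes the word in A/X onto the software parameter stack) are made-up names:

        LDY #DISP+1
        LDA (FP),Y      ; P3's Display[1] = frame pointer of P1, high byte
        TAX
        DEY
        LDA (FP),Y      ; low byte
        JSR pushax      ; becomes Display[1] of P4
        LDY #DISP+3
        LDA (FP),Y      ; P3's Display[2] = frame pointer of P2, high byte
        TAX
        DEY
        LDA (FP),Y      ; low byte
        JSR pushax      ; becomes Display[2] of P4
        LDA FP          ; P3's own frame pointer ...
        LDX FP+1
        JSR pushax      ; ... becomes Display[3] of P4
        ; now push the value parameter L and JSR to P4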

Access to variables in the outer lexical scope is quite fast when using
displays: First the frame pointer of the enclosing scope is fetched from the
display, and then an indirect store is done using this pointer. This means
that any access needs one additional indirection, regardless of the depth of
the lexical nesting.
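
For example, the assignment I := 1 inside P4 might compile to something like
the following sketch (FP is a zero page pointer to P4's frame, PTR a zero page
scratch pointer, DISP1 the offset of Display[1] in P4's frame and IOFF the
offset of I in P1's frame; all of these names are made up):

        LDY #DISP1
        LDA (FP),Y      ; low byte of the frame pointer of P1
        STA PTR
        INY
        LDA (FP),Y      ; high byte
        STA PTR+1
        LDA #1          ; I := 1
        LDY #IOFF
        STA (PTR),Y     ; one indirect store, however deep the nesting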

The problem with displays with respect to the 6502 is that they need quite some
memory (each call to a nested procedure passes the display as a value
parameter), and building the display adds runtime overhead to each call.

The second method is called frame links: Each procedure passes its own stack
frame pointer when calling a nested procedure. The advantage is that there is
just one additional (hidden) parameter per call. The disadvantage is that
runtime access to variables in other scopes is more expensive, because more
indirect accesses are needed. To store something into I, P4 has to do the
following (a 6502 sketch follows the list):

        1. load the frame pointer of P3 from its own stack frame
        2. load the frame pointer of P2 from the stack frame of P3
        3. load the frame pointer of P1 from the stack frame of P2
        4. do the store using the frame pointer of P1
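
In 6502 code that chain might look roughly like this (again just a sketch with
made-up names: LINK is the offset of the hidden frame link in each frame, IOFF
the offset of I in P1's frame, FP and PTR are zero page pointers):

        LDY #LINK
        LDA (FP),Y      ; 1. frame pointer of P3 from P4's own frame
        STA PTR
        INY
        LDA (FP),Y
        STA PTR+1
        LDY #LINK
        LDA (PTR),Y     ; 2. frame pointer of P2 from the frame of P3
        TAX             ;    (keep the low byte until the high byte is read)
        INY
        LDA (PTR),Y
        STA PTR+1
        STX PTR
        LDY #LINK
        LDA (PTR),Y     ; 3. frame pointer of P1 from the frame of P2
        TAX
        INY
        LDA (PTR),Y
        STA PTR+1
        STX PTR
        LDA #1          ; 4. I := 1 using the frame pointer of P1
        LDY #IOFF
        STA (PTR),Y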

The reason why both mentioned compilers use displays is that they don't have
anything like global variables. Variables in the global scope are handled as if
they were part of the lowest stack frame. The advantage is that no special
handling in the compiler is needed, and the virtual pcode machine needs no
special addressing modes for static variables. But this in turn means that
each store into a global variable from a procedure at lexical level 1 needs
one indirect access.

My suggestion would be to add a static variable space. While this makes the
compiler more complex (you have to handle more addressing modes), it makes it
easier to generate good machine language code later. And because the most
common case (nested procedures accessing variables in the global scope) then
carries no overhead, making the rarer cases more expensive is not a big
problem, so you can use frame links instead of displays and save some memory
and runtime overhead when calling nested procedures.
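
To illustrate the difference (sketch only; GlobalI, FP and IOFF are made-up
names): with a static variable space a store into a global variable is a
single absolute instruction, while the stack-frame scheme always goes through
a pointer:

        ; with a static variable space:
        LDA #1
        STA GlobalI     ; absolute addressing, no indirection

        ; with globals kept in the lowest stack frame:
        LDA #1
        LDY #IOFF
        STA (FP),Y      ; FP must first be loaded with the frame pointer of
                        ; the global frame (from the display or frame links)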

> >   * Output generation in one pass .... leads to bad code.
>
> That's why I generate ML-code, my assembler takes care of the needed passes.

Using more than one pass in the compiler has nothing to do with a multi-pass
assembler. As soon as you have reached the assembler stage, most of the
information you had collected before is lost. Optimizations in this stage are
far more difficult than optimizations done by the compiler, because the
compiler has access to much more information. I'm thinking of things like

        * Is a variable actually used?
        * When was its first use and when was the last use?
        * Is any part of a (sometimes complex) expression constant or is it
          possible to simplify it by rearranging the expression tree?
        * ...and so on.

For this reason, many compilers do separate passes over the code collecting
more and more information. For example, in a first pass a parse tree is built
in memory. The second pass runs over this parse tree and rearranges
expressions. The third pass may generate an intermediate code using a virtual
machine model with infinite registers and storage. The fourth pass will
optimize this intermediate code. The fifth pass will take the optimized code as
input, do a lifetime and usage analysis for the variables, and assign registers
and/or memory locations to them. And so on.

> >     or units.
>
> Oops, forgot about this one myself as well :( Would have bumped in it soon
> enough but still ...

Supporting modules means that your backend tools need support for linking.
Just to make the list complete :-)

> The only problem I face is that I have two flexible parts, the dummy stack
> and the heap, that I cannot mix in one or another way. The momentary idea is
> that one grows up starting from the declared data and other grows down from
> the top of free space.

The usual program layout is to have the code in low memory, data above the
code, and the heap above the data, growing upwards. The stack is located at
the top of memory, growing downwards. Having the compiler output stack
checking code on request is helpful to detect the situation where the stack
grows down into the heap.
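
Such a check is just a 16 bit compare of the two boundary pointers, roughly
like this (sketch only; SP and HEAPTOP are assumed to be zero page words
holding the software stack pointer and the current top of the heap):

        LDA SP          ; 16 bit compare of SP against HEAPTOP
        CMP HEAPTOP
        LDA SP+1
        SBC HEAPTOP+1
        BCC StackOverflow ; SP below HEAPTOP: the stack ran into the heap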

> > I would suggest to drop the existing pcode in favour of
> > direct machine language output
>
> This means I need a mechanism that does several passes to optimize the code.
> And my assembler has this mechanism.

This is a good starting point. Here are two more things that come to mind:

  * Support for map and symbol files, so you can look up the actual addresses
    of variables and procedures (even nested ones) and maybe even debug the
    programs on the ML level using an emulator (this is helpful for yourself,
    because compiler-generated code tends to be quite - uhm - what is the
    opposite of straightforward?).

  * Having a library that supports basic stuff would be useful. Something like
    screen I/O as in the TP CRT unit, file support and so on.

Of course you don't need all of these things in the beginning, but if you keep
in mind that you might add them later, you can put the necessary hooks in
place even while doing the first steps.

Regards


        Uz


-- 
Ullrich von Bassewitz                                  uz@musoftware.de
