Spanning Two Worlds

[The ninth in a series of posts on the evolution of TransForth]

The dictionary is currently split across two worlds. The definitions live in Forth-world, packed into plain memory, but we still have the F#-world mapping of WordRecords to those memory locations:

let mutable dict = []

type WordRecord = { Name : string; Def : int; Immediate : bool ref }

let immediate () = dict.Head.Immediate := true

let header name = dict <- { Name = name; Def = mem.[h]; Immediate = ref false } :: dict

 

Header Format

 

Instead of this, we now want to move the dictionary headers to plain memory along with everything else. The traditional Forth dictionary header is four bytes, so we can certainly pack the name and the Immediate flag into a single Int32. For now, however, we’ll postpone implementing the traditional Forth way of representing word names (as a four-byte sequence giving the length and the first three ASCII characters). Instead we’ll take a temporary shortcut and go with the simplest thing that could possibly work: a hash of the name, with the high bit reserved as a flag indicating whether the word is immediate. There are obvious shortcomings to this (two different names can hash to the same value, for one), but we’ll change it once again when it’s re-implemented in Forth itself:

let encode (n : string) = n.GetHashCode() &&& 0x7FFFFFFF

let immediate () = mem.[latest] <- mem.[latest] ||| 0x80000000
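To make the bit layout concrete, here is a minimal sketch of the encoding in isolation. The one-cell `mem` array and `latest = 0` are stand-ins assumed purely for illustration; they are not TransForth's real memory. One caveat worth keeping in mind: on newer .NET runtimes string hash codes are randomized per process, so these encoded values should only be compared within a single run.

```fsharp
// Sketch only: a one-cell `mem` and `latest = 0` are stand-ins assumed for illustration.
let encode (n : string) = n.GetHashCode() &&& 0x7FFFFFFF

let mem = [| encode "dup" |]
let latest = 0

let immediate () = mem.[latest] <- mem.[latest] ||| 0x80000000

let flagBefore = (mem.[latest] &&& 0x80000000) <> 0             // false: encode clears the high bit
immediate ()
let flagAfter = (mem.[latest] &&& 0x80000000) <> 0              // true: the flag is now set
let nameIntact = (mem.[latest] &&& 0x7FFFFFFF) = encode "dup"   // true: low 31 bits untouched
```

Because the flag lives in the high bit and the name in the low 31 bits, setting one never disturbs the other.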

 

We’ll be getting rid of the dict and packing these headers into memory alongside their definitions. Each header will hold the name/immediate encoding in one memory cell, followed in the next cell by the address of the previous word’s header, thus forming a linked list. The latest pointer will always point to the header of the most recently added word.

 

Following each header will be the packed definition, compiled as we implemented in the last post.

 

let mutable latest = mem.[h]

let header name =

    let link = latest

    latest <- mem.[h]

    encode name |> append

    append link
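The resulting layout can be seen in a small self-contained sketch. Everything around `header` here is assumed for illustration: a 16-cell `mem`, the dictionary pointer kept at address 0, a simplified `append`, and 0 as the end-of-chain marker (the real system uses a different sentinel address).

```fsharp
// Sketch only: tiny memory, simplified append, and a 0 sentinel are assumptions.
let mem = Array.zeroCreate<int> 16
let h = 0                          // address of the cell holding the dictionary pointer
mem.[h] <- 1                       // dictionary space starts at cell 1 in this sketch
let mutable latest = 0             // 0 marks the end of the chain here

let append v =
    mem.[mem.[h]] <- v             // store at the next free cell
    mem.[h] <- mem.[h] + 1         // and advance the dictionary pointer

let encode (n : string) = n.GetHashCode() &&& 0x7FFFFFFF

let header name =
    let link = latest
    latest <- mem.[h]              // the new header starts at the next free cell
    encode name |> append          // header cell 0: name hash (plus immediate flag)
    append link                    // header cell 1: address of the previous header

header "first"
append 111                         // pretend one-cell definition
header "second"
append 222

// Layout: [1] hash "first"  [2] 0  [3] 111  [4] hash "second"  [5] 1  [6] 222
let previous = mem.[latest + 1]    // 1: "second" links back to "first"
```

Walking the link cells from `latest` visits words from newest to oldest, which is what the lookup code relies on.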

Finding and Forgetting

 

Instead of finding and forgetting words using the niceties of F#...

let find name = List.tryFind (fun w -> w.Name = name) dict

let forget name =

    let found = dict |> Seq.skipWhile (fun w -> w.Name <> name) |> List.ofSeq

    dict <- found.Tail

    mem.[h] <- found.Head.Def

 

… we’ll have to resort to low-level memory walking and adjusting of the latest pointer:

let find name =

    let enc = encode name

    let rec find' addr =

        if addr = 0x0400 // first cell is DOSEMI

        then -1 else

            if mem.[addr] &&& 0x7FFFFFFF = enc

            then addr else find' mem.[addr + 1]

    find' latest

let forget name = mem.[h] <- find name; latest <- mem.[mem.[h] + 1]

The find function now returns the address of a word’s header (or -1 if not found). The forget function just resets the dictionary pointer to the forgotten word’s header and moves the latest pointer back to the previous word, leaving forgotten definitions in memory to be subsequently overwritten.
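Here is a self-contained sketch of the walk and the rewind working together. As before, the small `mem`, the simplified `append`, and the 0 sentinel are assumptions made for illustration. The not-found guard in `forget` is also an addition of this sketch, not something the version above does.

```fsharp
// Sketch only: tiny memory, 0 sentinel, and the not-found guard are assumptions.
let mem = Array.zeroCreate<int> 32
let h = 0                              // address of the dictionary pointer
mem.[h] <- 1
let mutable latest = 0                 // 0 marks the end of the chain here

let append v =
    mem.[mem.[h]] <- v
    mem.[h] <- mem.[h] + 1

let encode (n : string) = n.GetHashCode() &&& 0x7FFFFFFF

let header name =
    let link = latest
    latest <- mem.[h]
    encode name |> append
    append link

let find name =
    let enc = encode name
    let rec walk addr =
        if addr = 0 then -1                                   // ran off the end of the chain
        elif (mem.[addr] &&& 0x7FFFFFFF) = enc then addr      // ignore the immediate bit
        else walk (mem.[addr + 1])                            // follow the link cell
    walk latest

let forget name =
    match find name with
    | -1 -> failwith (name + "?")      // guard added in this sketch only
    | addr ->
        mem.[h] <- addr                // reclaim space from this header onward
        latest <- mem.[addr + 1]       // the previous word becomes the latest

header "a"
append 1
header "b"
append 2
header "c"
append 3

let foundB = find "b"                  // header address of "b"
forget "b"                             // drops "b" and everything defined after it
let goneC = find "c"                   // -1: "c" was defined after "b"
let keptA = find "a"                   // "a" is still reachable
```

Note that forgetting a word also forgets every word defined after it, since they all sit above the rewound dictionary pointer.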

 

A couple of helpers will be useful: one for checking whether a word is immediate, and one for converting a word’s header address to the address of its definition, the so-called “code field address” (cfa), which can always be found two cells past the start of the header.

let isimmediate addr = mem.[addr] &&& 0x80000000 = 0x80000000

let cfa addr = addr + 2 // used in several places
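As a quick illustration, the two helpers can be applied to a hand-built header. The memory contents below (a made-up name hash, a zero link, and a one-cell definition holding 42) are assumptions for the sake of the example.

```fsharp
// Sketch only: a hand-built header at address 1 in a tiny stand-in `mem`.
let mem = Array.zeroCreate<int> 8
mem.[1] <- 0x1234 ||| 0x80000000   // header cell 0: made-up name hash with the immediate bit set
mem.[2] <- 0                       // header cell 1: link to the previous word (none here)
mem.[3] <- 42                      // first cell of the definition

let isimmediate addr = (mem.[addr] &&& 0x80000000) = 0x80000000
let cfa addr = addr + 2            // skip the two header cells

let flag = isimmediate 1           // true
let body = mem.[cfa 1]             // 42
```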

Outer Interpreter Tweaks

 

Finally, our outer interpreter needs to change slightly to expect addresses rather than WordRecord options from find and to use the new dictionary format.

let rep input =

    out.Clear() |> ignore

    source <- input

    while not (Seq.isEmpty source) do

        let word = token ()

        if word.Length > 0 then

            match find word with

            | -1 -> // literal?

                let number, value = Int32.TryParse word

                if number then

                    if interactive then push value else (append LIT_ADDR; append value)

                else word + "?" |> failwith

            | d ->

                let c = cfa d

                if interactive || isimmediate d

                then p <- c; w <- c; i <- HALT_ADDR; execute ()

                else append c

 

That’s the last of the data structures remaining to be moved to plain memory. The last major moving part to be transitioned will be the outer interpreter. TransForth is coming along!
