.. vim: ft=rst showbreak=--»\  noexpandtab fileencoding=utf-8 nomodified   wrap textwidth=0 foldmethod=marker foldmarker={{{,}}} foldcolumn=4 ruler showcmd lcs=tab\:|- list tabstop=8 noexpandtab nosmarttab softtabstop=0 shiftwidth=0 linebreak  

There are many strategies for implementing **FORTH**, but I will refer mainly to `Moving FORTH <https://www.bradrodriguez.com/papers/moving1.htm>`__ by Brad Rodriguez, author of `CamelForth <https://www.camelforth.com/>`__.
:date: 2026.01.15 20:25:16
:_modified: 1970.01.01 00:00:00
:tags: MHF-002
:authors: Gilhad
:summary: FORTH_25.01.13_HW_considerations
:title: FORTH_25.01.13_HW_considerations
:nice_title: |logo| %title% |logo|

%HEADER%

FORTH_25.01.13_HW_considerations
--------------------------------------------------------------------------------

Please read `Moving FORTH <https://www.bradrodriguez.com/papers/moving1.htm>`__ alongside this log, as I will just answer or paraphrase the questions raised there, without repeating everything.

**16 bit or 32 bits? 24 bits!** and **CELL size:**

The target platform is **ATmega2560**, which has these types of memory:

- Flash 256kB (code can be executed only from here; not (simply) writable)
- Internal RAM 8kB (may be extended to 64kB)
- External RAM 128kB (part used to extend RAM)
- Shared RAM 128kB

A pointer needs at least 20 bits to address all of these memory types, so I will use a **CELL** size of **24 bits** (3 bytes, 3 registers) for arithmetic and pointers.
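
A rough check of that bit count, as a sketch in C. The sizes come from the list above; treating the four memories as one flat address space is a simplifying assumption, and the names ``bits_needed`` and ``cell_add`` are hypothetical helpers, not part of the real implementation.

```c
#include <stdint.h>

/* Sizes from the memory list above, in bytes. */
#define FLASH_SIZE   (256UL * 1024UL)
#define RAM_SIZE     (64UL * 1024UL)    /* internal RAM extended to 64 kB */
#define EXT_RAM_SIZE (128UL * 1024UL)
#define SHARED_SIZE  (128UL * 1024UL)

#define CELL_MASK 0xFFFFFFUL            /* a CELL is 24 bits wide */

/* Smallest number of bits that can address `bytes` distinct locations. */
static unsigned bits_needed(uint32_t bytes)
{
    unsigned bits = 0;
    while (bits < 32 && ((uint32_t)1 << bits) < bytes)
        bits++;
    return bits;
}

/* 24-bit addition with wrap-around, as a 3-register CELL would behave. */
static uint32_t cell_add(uint32_t a, uint32_t b)
{
    return (a + b) & CELL_MASK;
}
```

The sum of the four sizes is 576 kB, which is more than 512 kB (19 bits), so ``bits_needed`` gives 20. Rounding up to whole bytes gives the 24-bit CELL.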

**THE THREADING TECHNIQUE:**

Code can be executed only from Flash, which is not (simply and repeatedly) writable, while new data can be stored only in some RAM.

So new (user-defined) words must be placed in RAM and therefore cannot contain executable code.

Only **ITC (Indirect Threaded Code)** allows this (and I like it most, too).

Also, since NEXT is long here, a **JMP to NEXT** is the way to go for me.

Token threaded code needs a table allocated in RAM to hold the tokens, which consumes RAM and limits the number of user-defined names (they use the token space too), so I will not use it.
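
To show why ITC fits this split, here is a minimal model of an ITC inner interpreter in C. It is only a sketch under assumptions: the ``Word`` struct, the ``code_*`` functions and ``run_double`` are hypothetical names, function pointers stand in for native code in Flash, and the ``body`` arrays stand in for threads that could live in RAM. The point is that a compound word is pure data (a list of pointers), so defining one never needs new executable code.

```c
#include <stddef.h>
#include <stdint.h>

#define CELL_MASK 0xFFFFFFUL

typedef struct Word Word;
typedef void (*CodeFn)(const Word *self);

struct Word {
    CodeFn code;                /* code field: address of native code */
    const Word *const *body;    /* parameter field: thread of a compound word */
};

static uint32_t dstack[16];             /* data stack */
static int dsp;
static const Word *const *rstack[16];   /* return stack */
static int rsp;
static const Word *const *ip;           /* instruction pointer into a thread */
static int running;

/* "Native code" of the primitives. */
static void code_dup(const Word *w)  { (void)w; dstack[dsp] = dstack[dsp - 1]; dsp++; }
static void code_plus(const Word *w) { (void)w; dsp--; dstack[dsp - 1] = (dstack[dsp - 1] + dstack[dsp]) & CELL_MASK; }
static void code_exit(const Word *w) { (void)w; ip = rstack[--rsp]; }
static void code_bye(const Word *w)  { (void)w; running = 0; }
/* docol: shared entry code of every compound word. */
static void code_docol(const Word *w) { rstack[rsp++] = ip; ip = w->body; }

static const Word w_dup  = { code_dup,  NULL };
static const Word w_plus = { code_plus, NULL };
static const Word w_exit = { code_exit, NULL };
static const Word w_bye  = { code_bye,  NULL };

/* : DOUBLE DUP + ;  is nothing but data. */
static const Word *const double_body[] = { &w_dup, &w_plus, &w_exit };
static const Word w_double = { code_docol, double_body };

/* Push x, execute DOUBLE, return the top of the data stack. */
static uint32_t run_double(uint32_t x)
{
    static const Word *const top[] = { &w_double, &w_bye };
    dsp = 0;
    rsp = 0;
    dstack[dsp++] = x & CELL_MASK;
    ip = top;
    running = 1;
    while (running) {
        const Word *dt = *ip++;   /* NEXT: DT = *IP++ */
        dt->code(dt);             /* jump through the code field; dt->body is the data part */
    }
    return dstack[--dsp];
}
```

Calling ``run_double(21)`` walks the thread ``DUP + EXIT`` and leaves 42 on the stack. Every step of the loop is one NEXT.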

**Intermezzo:**

- **Primitives** are words written in assembler (or so), the basic words like **+ - @ ! DUP DROP**
- **Compound words** are words written in FORTH, e.g. **: DOUBLE DUP + ;**
- **Phrases** are words that would normally be compound, but that I choose to write in assembly instead (e.g. **DOUBLE: add r2,r2; adc r3,r3; adc r4,r4; NEXT;**)

At the current stage of implementation it looks like NEXT will take around 70 clocks (~4.4 μs), so DOUBLE as a compound word costs 3x NEXT plus some code, while DOUBLE as a phrase costs 1x NEXT plus 3 clocks of code (which is shorter than any of the code in the compound DOUBLE).

I will use phrases for many common combinations of words (yes, phrases), like **1 +**, **1 CELL +**, ... as well as for some more complicated parts.
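
The cost difference can be put into numbers with a small sketch. The 70 clocks per NEXT and the 3 clocks of phrase code are the estimates from above; the 16 MHz clock is an assumption inferred from 70 clocks being about 4.4 μs, and the function names are hypothetical. The compound figure is only a lower bound, because it counts the three NEXTs and ignores the code of the words in between.

```c
#include <stdint.h>

#define F_CPU_HZ           16000000ULL /* assumption: 16 MHz, so 70 clocks ~ 4.4 us */
#define NEXT_CLOCKS        70UL        /* current estimate for one NEXT */
#define DOUBLE_CODE_CLOCKS 3UL         /* add + adc + adc, one clock each */

/* Convert a clock count to nanoseconds (rounded down). */
static uint32_t clocks_to_ns(uint32_t clocks)
{
    return (uint32_t)((uint64_t)clocks * 1000000000ULL / F_CPU_HZ);
}

/* DOUBLE as a phrase: its own code plus one NEXT. */
static uint32_t phrase_double_clocks(void)
{
    return DOUBLE_CODE_CLOCKS + NEXT_CLOCKS;
}

/* DOUBLE as a compound word: at least three NEXTs (lower bound). */
static uint32_t compound_double_min_clocks(void)
{
    return 3UL * NEXT_CLOCKS;
}
```

With these numbers the phrase takes 73 clocks, while the compound word takes at least 210 clocks, so the phrase is nearly three times faster even before the code of DUP, + and EXIT is counted.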

While Flash is a little slower than internal RAM, there is a lot of it, and RAM is also needed for graphics, file operations, stacks, variables, etc., so I prefer to store as much of the usual load as possible (common words, vocabularies ...) in Flash.

Things in Flash must be compiled and uploaded before use, together with the rest of the FORTH code (~the slow and complicated Arduino way).

Things in RAM may be easily changed anytime (added, edited and deleted).

Primitives obviously have their native code in Flash.

Phrases are just more complicated primitives and so are in Flash too.

Compound words may be placed in RAM at runtime, or placed in Flash at compilation of FORTH as data, like this:

- **DEFWORD** w_DOUBLE,0,"DOUBLE",f_docol
	- **PTR24** w_dup_cw
	- **PTR24** w_plus_cw
	- **PTR24** w_exit_cw
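
How such a body might look as bytes can be sketched in C. This is an illustration only: the little-endian byte order, the helper names ``put_ptr24``, ``get_ptr24`` and ``emit_double_body``, and any addresses passed in are assumptions, not the actual output of the **PTR24** macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 24-bit pointer as 3 bytes, low byte first (assumed byte order). */
static size_t put_ptr24(uint8_t *out, uint32_t addr)
{
    out[0] = (uint8_t)(addr & 0xFF);
    out[1] = (uint8_t)((addr >> 8) & 0xFF);
    out[2] = (uint8_t)((addr >> 16) & 0xFF);
    return 3;
}

/* Read a 24-bit pointer back from 3 bytes. */
static uint32_t get_ptr24(const uint8_t *in)
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) | ((uint32_t)in[2] << 16);
}

/* Lay out the body of DOUBLE: three pointers to code words, 9 bytes total. */
static size_t emit_double_body(uint8_t *out, uint32_t dup_cw,
                               uint32_t plus_cw, uint32_t exit_cw)
{
    size_t n = 0;
    n += put_ptr24(out + n, dup_cw);
    n += put_ptr24(out + n, plus_cw);
    n += put_ptr24(out + n, exit_cw);
    return n;
}
```

The same 9 bytes work whether they end up in Flash at build time or in RAM at runtime, since the body is data in both cases.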

Some words (like words using CREATE DOES>) need an executable part, so such words have to be compiled into Flash.

If a compound word's definition contains any IMMEDIATE word, that word has to be resolved before the definition can be compiled into Flash. (All the IF-THEN, BEGIN-UNTIL and other constructs.)

End of intermezzo

**REGISTER ALLOCATION**

- In my previous FORTH implementations I used **NEXT** of the form **DT=*IP++; JMP *DT++;** and **DT** was then used as a pointer to the data part of the word. On Brad's pages **W** is used for the same function.
- On ATmega only **Z (r30,r31)** can access Flash, so it will be the **universal scratch pointer** for everything; **r25** is used for selecting the data source and bank.
- One ATmega pointer (**Y (r28,r29)**, 2 bytes) is preserved by C, and the **Data Stack** will be in Internal RAM, so 2 bytes are enough.
- Pointer **X (r26,r27)** + **r0** will become the **W=DT** pointer, as it is generated again and again for each function and used as scratch.
- **TOS** will enable faster access to the Top Of Stack, as many primitives (@, !, ...) and phrases (1+, 1C+, ...) can work in the registers and not move the stack at all.
- **Return stack (r2,r3)** will also point only to Internal RAM, so 2 bytes are enough.
- The **HW stack pointer** will be used as the stack for C/C++ routines, like SD card readers and interrupt handlers (which are not a FORTH part of the system). There is no simple way to address memory relative to it.
- I have not decided yet whether I will need **UP=UserPointer**; this can wait for now, r11+ are still free.
- Also I will use some `Canary value <https://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries>`__ to see how much of the stacks was used, as C/C++ routines could affect this substantially. (Canary ~ CACARY ~ **0xCACAA7** seems usable, and maybe the debug print routines will translate it to the special string "Canary" for easier spotting inside the FORTH 3-byte stacks.)
- For space other than the 3-byte stacks I will use **0xDEADBEEF** instead, as it is simple to align to 4 and, conversely, to test mod 4.
- Note that moving a pair of registers needs both the source and the destination register to be even numbered (0,2,4,...).
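
The canary idea can be sketched in C as follows. This is a minimal sketch with assumptions: the stack is a plain byte array of 3-byte cells, it grows down from the high end, the canary is stored low byte first, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CANARY     0xCACAA7UL   /* "CACARY", from the list above */
#define CELL_BYTES 3

/* Fill a stack area of `cells` 3-byte cells with the canary value. */
static void canary_fill(uint8_t *mem, size_t cells)
{
    for (size_t i = 0; i < cells; i++) {
        mem[i * CELL_BYTES + 0] = (uint8_t)(CANARY & 0xFF);
        mem[i * CELL_BYTES + 1] = (uint8_t)((CANARY >> 8) & 0xFF);
        mem[i * CELL_BYTES + 2] = (uint8_t)((CANARY >> 16) & 0xFF);
    }
}

/* Count cells at the low end that still hold the canary.
   The stack grows down from the high end, so these were never reached. */
static size_t canary_untouched(const uint8_t *mem, size_t cells)
{
    size_t n = 0;
    while (n < cells &&
           mem[n * CELL_BYTES + 0] == (uint8_t)(CANARY & 0xFF) &&
           mem[n * CELL_BYTES + 1] == (uint8_t)((CANARY >> 8) & 0xFF) &&
           mem[n * CELL_BYTES + 2] == (uint8_t)((CANARY >> 16) & 0xFF))
        n++;
    return n;
}

/* High-water mark: how many cells the stack has used at its deepest. */
static size_t canary_used(const uint8_t *mem, size_t cells)
{
    return cells - canary_untouched(mem, cells);
}
```

A stack value that happens to equal the canary would be counted as unused, so the result is an estimate, which is good enough for sizing the stacks.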

Here is my register allocation (also classical C API) and possible stack implementations on AVR:

|Registers_C_Forth.png|

|ATmega-Stacks.png|

For me the best stack implementation is **register stacks growing down**, as the stack pointer then points at the lowest byte of the value and is easy to manipulate.
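
A downward-growing stack of 3-byte cells can be modelled in C like this. It is a sketch, not the real code: on the AVR the pointer would be a register pair and the accesses would be pre-decrement stores and post-increment loads (``st -Y`` / ``ld Y+``); the array size and the names ``ds_push``, ``ds_pop`` and ``ds_depth`` are assumptions.

```c
#include <stdint.h>

#define STACK_BYTES 48   /* room for 16 cells of 3 bytes */

static uint8_t ds_mem[STACK_BYTES];
static uint8_t *ds_ptr = ds_mem + STACK_BYTES;   /* empty: just past the end */

/* Push: pre-decrement stores, high byte first,
   so the pointer ends up on the low byte of the new value. */
static void ds_push(uint32_t v)
{
    *--ds_ptr = (uint8_t)(v >> 16);
    *--ds_ptr = (uint8_t)(v >> 8);
    *--ds_ptr = (uint8_t)v;
}

/* Pop: post-increment loads, low byte first. */
static uint32_t ds_pop(void)
{
    uint32_t v = *ds_ptr++;
    v |= (uint32_t)*ds_ptr++ << 8;
    v |= (uint32_t)*ds_ptr++ << 16;
    return v;
}

/* Number of cells currently on the stack. */
static int ds_depth(void)
{
    return (int)((ds_mem + STACK_BYTES - ds_ptr) / 3);
}
```

Because the pointer always rests on the low byte, the value can be read in natural order with increasing addresses, and dropping a cell is just adding 3 to the pointer.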

**DOES and company**

`Part 3: Demystifying DOES> <https://www.bradrodriguez.com/papers/moving3.htm>`__ discusses words like **;CODE** or **DOES>**, which compile native code into new words. This is complicated on ATmega2560.

Such words may be used to define new words, but the new words will not work, as they are in RAM. The new words can, however, be dumped and transformed into Flash-compatible byte definitions, which can then be compiled into FORTH and uploaded as a new program.

So I will build my FORTH with a lot of such words in the core, either as compound words or as phrases.

See also `How do you build a Forth system for the Very First Time? <https://www.bradrodriguez.com/papers/moving4.htm>`__

.. |ATmega-Stacks.png| image:: ATmega-Stacks.png
	:align: top
	:target: ATmega-Stacks.png

.. |Registers_C_Forth.png| image:: Registers_C_Forth.png
	:align: top
	:target: Registers_C_Forth.png