#LyX 1.6.5 created this file. For more info see http://www.lyx.org/
\lyxformat 276
\begin_document
\begin_header
\textclass literate-article
\begin_preamble
\usepackage[dvips,colorlinks=true,linkcolor=blue]{hyperref}
\DeclareGraphicsExtensions{.pdf}
\end_preamble
\language american
\inputencoding auto
\font_roman ae
\font_sans default
\font_typewriter default
\font_default_family default
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100
\graphics default
\paperfontsize default
\spacing single
\papersize a4paper
\use_geometry true
\use_amsmath 1
\use_esint 0
\cite_engine basic
\use_bibtopic false
\paperorientation portrait
\leftmargin 36pt
\topmargin 1in
\rightmargin 36pt
\bottommargin 1in
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\papercolumns 2
\papersides 1
\paperpagestyle fancy
\tracking_changes false
\output_changes false
\author ""
\author ""
\end_header

\begin_body

\begin_layout Title
b16 Documentation
\end_layout

\begin_layout Author

\noun on
Bernd Paysan
\end_layout

\begin_layout Standard
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
lhead{
\end_layout

\end_inset

b16 Documentation
\begin_inset ERT
status collapsed

\begin_layout Standard

}
\backslash
chead{
\end_layout

\end_inset


\noun on
Bernd Paysan
\noun default

\begin_inset ERT
status collapsed

\begin_layout Standard

}
\end_layout

\end_inset


\end_layout

\begin_layout Abstract
This article presents the architecture and implementation of the b16 stack processor.
 This processor is inspired by 
\noun on
Chuck Moore'
\noun default
s newest Forth processors.
 The minimalistic design fits into small FPGAs and ASICs and is ideally
 suited for applications that need both control and calculations.
 The balance is shifted towards control to save space.
 The synthesizable implementation is written in Verilog.
\end_layout

\begin_layout Section*
Introduction
\end_layout

\begin_layout Standard
Minimalistic CPUs can be used in many designs.
 A state machine often becomes too complicated and too difficult to develop
 once it has more than a few states.
 A program with subroutines can perform far more complex tasks, and is
 easier to develop at the same time.
 Also, ROM and RAM blocks occupy much less area on silicon than 
\begin_inset Quotes eld
\end_inset

random logic
\begin_inset Quotes erd
\end_inset

.
 This also holds for FPGAs, where 
\begin_inset Quotes eld
\end_inset

block RAM
\begin_inset Quotes erd
\end_inset

 is plentiful, in contrast to logic elements.
\end_layout

\begin_layout Standard
The architecture is inspired by the c18 from
\noun on
 Chuck Moore
\noun default
 
\begin_inset LatexCommand cite
key "c18"

\end_inset

.
 The exact instruction mix differs both from the c18 and from the standard
 b16 core.
 In addition, this architecture is byte-addressed.
\end_layout

\begin_layout Standard
A word about Verilog: Verilog is a C-like language tailored to simulating
 logic and writing synthesizable code.
 Variables are bits and bit vectors, and assignments are typically non-blocking,
 i.e.
 all right-hand sides are evaluated first, and the left-hand sides are
 updated afterwards.
 Verilog also has events, such as value changes or clock edges, and blocks
 can wait on them.
\end_layout
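
\begin_layout Standard
As a minimal illustration of non-blocking assignment (not part of the b16
 source; the module name 
\family typewriter
nb_swap
\family default
 and its registers are made up for this sketch), the following block exchanges
 two registers on every clock edge.
 Because both right-hand sides are sampled before either register is updated,
 no temporary is needed:
\end_layout

\begin_layout Scrap
<<example: non-blocking swap>>=
\newline

// Illustration only, not part of the b16 core.
\newline

module nb_swap(clk);
\newline

   input clk;
\newline

   reg [15:0] a, b;
\newline

   initial begin
\newline

      a = 16'h1111;
\newline

      b = 16'h2222;
\newline

   end
\newline

   // Both right sides are read first, then a and b
\newline

   // are updated together, so the values swap.
\newline

   always @(posedge clk) begin
\newline

      a <= b;
\newline

      b <= a;
\newline

   end
\newline

endmodule
\newline

@
\end_layout

\begin_layout Standard
With blocking assignments (
\family typewriter
=
\family default
) in the same order, both registers would end up holding the old value of
 
\family typewriter
b
\family default
.
\end_layout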

\begin_layout Section
Architectural Overview
\end_layout

\begin_layout Standard
The core components are
\end_layout

\begin_layout Itemize
An ALU
\end_layout

\begin_layout Itemize
A data stack with top and next of stack (T and N) as inputs for the ALU
\end_layout

\begin_layout Itemize
A return stack
\end_layout

\begin_layout Itemize
An instruction pointer P 
\end_layout

\begin_layout Itemize
An address mux 
\family typewriter
addr
\family default
, to address external memory
\end_layout

\begin_layout Itemize
An instruction latch I
\end_layout

\begin_layout Standard
Figure 
\begin_inset LatexCommand ref
reference "blockdiagram"

\end_inset

 shows a block diagram.
\end_layout

\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open

\begin_layout Standard
\align center
\begin_inset Graphics
    filename b16-small.pdf
    width 100col%

\end_inset


\end_layout

\begin_layout Standard
\begin_inset Caption

\begin_layout Standard
Block Diagram
\begin_inset LatexCommand label
name "blockdiagram"

\end_inset


\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Subsection
Registers
\end_layout

\begin_layout Standard
In addition to the standard Forth machine registers there are control registers
 for external RAM (
\family typewriter
rd
\family default
 and 
\family typewriter
wr
\family default
), stack pointers (
\family typewriter
sp
\family default
 and 
\family typewriter
rp
\family default
), and a carry 
\family typewriter
c
\family default
.
 For consistency with Chuck Moore's nomenclature, and in violation of most
 coding style guidelines, the Forth machine registers are single-letter
 variables in upper case.
 Since the source code is a LyX document, you can use the 
\begin_inset Quotes eld
\end_inset

search whole word
\begin_inset Quotes erd
\end_inset

 mode to find them easily, and they also show up at the top of the signal
 list during simulation.
\end_layout

\begin_layout Standard
\begin_inset VSpace medskip
\end_inset


\end_layout

\begin_layout Standard
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="9" columns="2">
<features>
<column alignment="center" valignment="top" width="0pt" leftline="true" rightline="false">
<column alignment="left" valignment="top" width="0pt" leftline="true" rightline="true">
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\emph on
Name
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\emph on
Function
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
T
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
Top of Stack
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
I
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
Instruction Bundle
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
P
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
Program Counter
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
R
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
Top of Return Stack
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
state
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
Processor State
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
sp
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
Stack Pointer
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
rp
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
Return Stack Pointer
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
c
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
Carry Flag
\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Standard
\begin_inset VSpace medskip
\end_inset


\end_layout

\begin_layout Scrap
<<register declarations>>=
\newline

reg [sdep-1:0] sp;
\newline

reg [rdep-1:0] rp;
\newline


\newline

reg `L T, I, P, R;
\newline

reg [1:0] state;
\newline

reg c;
\newline

@
\end_layout

\begin_layout Standard
\begin_inset Float table
wide true
sideways false
status collapsed

\begin_layout Standard
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="7" columns="10">
<features>
<column alignment="center" valignment="top" width="0pt" leftline="true" rightline="true">
<column alignment="center" valignment="top" width="0pt" leftline="false" rightline="true">
<column alignment="center" valignment="top" width="0pt" leftline="false" rightline="true">
<column alignment="center" valignment="top" width="0pt" leftline="false" rightline="true">
<column alignment="center" valignment="top" width="0pt" leftline="false" rightline="true">
<column alignment="center" valignment="top" width="0pt" leftline="false" rightline="true">
<column alignment="center" valignment="top" width="0pt" leftline="false" rightline="true">
<column alignment="center" valignment="top" width="0pt" leftline="false" rightline="true">
<column alignment="center" valignment="top" width="0pt" leftline="false" rightline="true">
<column alignment="left" valignment="top" width="0pt" leftline="false" rightline="true">
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
0 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
1 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
2 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
3 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
4 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
5 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
6 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
7 
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\emph on
Comment
\end_layout

\end_inset
</cell>
</row>
<row topline="false" bottomline="false">
<cell alignment="center" valignment="top" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
0 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
nop
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
call
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
jmp
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
ret
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
jz
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
jnz
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
jc
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
jnc
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
</row>
<row topline="false" bottomline="true">
<cell alignment="center" valignment="top" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
exec
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
goto
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
ret
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
gz
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
gnz
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
gc
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
gnc
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\emph on
for slot 3 
\end_layout

\end_inset
</cell>
</row>
<row topline="false" bottomline="true">
<cell alignment="center" valignment="top" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
8 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
xor
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
com
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
and
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
or
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
+
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
+c
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
\begin_inset Formula $*+$
\end_inset


\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
\begin_inset Formula $/-$
\end_inset


\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
</row>
<row topline="false" bottomline="false">
<cell alignment="center" valignment="top" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
10 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
!+
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
@+
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
@
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
lit
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
c!+
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
c@+
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
c@
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
litc
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
</row>
<row topline="false" bottomline="true">
<cell alignment="center" valignment="top" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
!.
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
@.
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
@
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
lit
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
c!.
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
c@.
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
c@
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
litc
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\emph on
for slot 1
\emph default
 
\end_layout

\end_inset
</cell>
</row>
<row topline="false" bottomline="true">
<cell alignment="center" valignment="top" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
18 
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
nip
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
drop
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
over
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
dup
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
>r
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
r>
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" bottomline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
 
\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Standard
\begin_inset Caption

\begin_layout Standard
Instruction Set
\begin_inset LatexCommand label
name "instructions"

\end_inset


\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Section
Instruction Set
\end_layout

\begin_layout Standard
There are 32 different instructions.
 Since several instructions fit into a 16-bit word, we call the bit field
 that holds one packed instruction a 
\begin_inset Quotes eld
\end_inset

slot
\begin_inset Quotes erd
\end_inset

, and the instruction word itself 
\begin_inset Quotes eld
\end_inset

bundle
\begin_inset Quotes erd
\end_inset

.
 The arrangement here is 1,5,5,5, i.e.
 the first slot is only one bit wide (its more significant bits are filled
 with 0), and the other three slots are 5 bits each.
\end_layout
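
\begin_layout Standard
The 1,5,5,5 arrangement can be sketched as a plain bit-field split.
 This is only an illustration under the assumption that slots are numbered
 from the most significant bit downwards; the module and port names are
 hypothetical and do not appear in the b16 source:
\end_layout

\begin_layout Scrap
<<example: bundle slots>>=
\newline

// Illustration only, not part of the b16 core.
\newline

module bundle_slots(bundle, slot0, slot1,
\newline

                    slot2, slot3);
\newline

   input [15:0] bundle;
\newline

   output [4:0] slot0, slot1, slot2, slot3;
\newline

   // one-bit slot, upper bits filled with 0
\newline

   assign slot0 = { 4'b0000, bundle[15] };
\newline

   // three full 5-bit slots
\newline

   assign slot1 = bundle[14:10];
\newline

   assign slot2 = bundle[9:5];
\newline

   assign slot3 = bundle[4:0];
\newline

endmodule
\newline

@
\end_layout

\begin_layout Standard
For instance, the bundle 
\family typewriter
16'h8443
\family default
 splits into the slot values 1, 1, 2 and 3.
\end_layout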

\begin_layout Standard
The operations in one instruction word are executed one after the other.
 Each instruction takes one cycle; memory operations (including instruction
 fetch) need another cycle.
 The slot currently being executed is tracked in the variable 
\family typewriter
state
\family default
.
\end_layout

\begin_layout Standard
The instruction set is divided into four groups: jumps, ALU, memory, and
 stack.
 Table 
\begin_inset LatexCommand ref
reference "instructions"

\end_inset

 gives an overview of the instruction set.
 Some special characters in the mnemonics have a fixed meaning:
\end_layout

\begin_layout Description
! 
\begin_inset Quotes eld
\end_inset

store
\begin_inset Quotes erd
\end_inset


\end_layout

\begin_layout Description
@ 
\begin_inset Quotes eld
\end_inset

load
\begin_inset Quotes erd
\end_inset

\end_layout

\begin_layout Description
> 
\begin_inset Quotes eld
\end_inset

to
\begin_inset Quotes erd
\end_inset

 if before, 
\begin_inset Quotes eld
\end_inset

from
\begin_inset Quotes erd
\end_inset

 if afterwards.
\end_layout

\begin_layout Standard
Operations will be described using a 
\begin_inset Quotes eld
\end_inset

stack effect
\begin_inset Quotes erd
\end_inset

.
 This is a template for the stack elements before and after the operation,
 separated by a long dash.
 The names are listed from bottom to top; unchanged stack elements further
 down are not listed.
\end_layout

\begin_layout Standard
Jumps use the rest of the instruction word as target address (except 
\family typewriter
ret
\family default
).
 The lower bits of the instruction pointer P are replaced; nothing is added.
 Instructions in the last slot have no address bits left, so they use T
 (TOS) as the target.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<instruction selection>>=
\newline

// instruction and branch target selection   
\newline

wire [4:0] inst, rwinst;
\newline

reg `L jmp;
\newline


\newline

assign inst = { 4'b0000, data[15], I[14:0] }
\newline

              >> (5*(3-state[1:0]));
\newline

assign rwinst = { 5'b00000, I[14:0] }
\newline

                >> (5*(3-state[1:0]));
\newline


\newline

always @(state or I or P or T or data)
\newline

   case(state[1:0])
\newline

     2'b00: jmp = { data[14:0], 1'b0 };
\newline

     2'b01: jmp = { P[15:11], I[9:0], 1'b0 };
\newline

     2'b10: jmp = { P[15:6], I[4:0], 1'b0 };
\newline

     2'b11: jmp = { T[15:1], 1'b0 };
\newline

   endcase // case(state)
\newline

@
\end_layout
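
\begin_layout Standard
A worked example may help with the bit replacement.
 The values below are made up for illustration, and the module 
\family typewriter
jmp_example
\family default
 is not part of the b16 source.
 It assumes a 
\family typewriter
jmp
\family default
 in slot 1, so the ten remaining bits of the bundle replace bits 10 to 1
 of P:
\end_layout

\begin_layout Scrap
<<example: jump target>>=
\newline

// Illustration only; all values are made up.
\newline

module jmp_example;
\newline

   reg [15:0] P, I, jmp;
\newline

   initial begin
\newline

      P = 16'h1234;
\newline

      // I[14:10] = 5'b00010, I[9:0] = 10'h155
\newline

      I = 16'h0955;
\newline

      // slot 1: keep P[15:11], bit 0 is always 0
\newline

      jmp = { P[15:11], I[9:0], 1'b0 };
\newline

      // jmp is now 16'h12aa
\newline

   end
\newline

endmodule
\newline

@
\end_layout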

\begin_layout Standard
The instructions themselves are executed depending on 
\family typewriter
inst
\family default
:
\end_layout

\begin_layout Scrap
<<instructions>>=
\newline

case(inst)
\newline

   <<control flow>>
\newline

   <<ALU operations>>
\newline

   <<load/store>>
\newline

   <<stack operations>>
\newline

endcase // case(inst)
\newline

@
\end_layout

\begin_layout Subsection
Jumps
\end_layout

\begin_layout Standard
In detail, jumps are performed as follows: the target address is stored
 in the address latch 
\family typewriter
addr
\family default
, which addresses memory, not in the P register.
 The register P will be set to the incremented value of 
\family typewriter
addr
\family default
, after the instruction fetch cycle.
 Apart from 
\family typewriter
call
\family default
, 
\family typewriter
jmp
\family default
 and 
\family typewriter
ret
\family default
 there are conditional jumps, which test for 0 and carry.
 The lowest bit of each return stack entry is used to save the carry flag
 across calls.
 Conditional jumps always drop the top of stack, whether or not the jump
 is taken.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Standard
To make it easier to understand, I also define the effect of an instruction
 in a pseudo language.
 Every instruction has a stack effect (before---after) with top of stack
 on the right, 
\begin_inset Quotes eld
\end_inset

r:
\begin_inset Quotes erd
\end_inset

 prefix indicating return stack, and register assignments:
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
nop ( --- )
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
call ( ---r:P ) 
\begin_inset Formula $\mathrm{P}\leftarrow jmp$
\end_inset

; 
\begin_inset Formula $\mathrm{c}\leftarrow0$
\end_inset

 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
jmp ( --- ) 
\begin_inset Formula $\mathrm{P}\leftarrow jmp$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
ret ( r:a--- ) 
\begin_inset Formula $\mathrm{P}\leftarrow a\wedge\$\mathrm{FFFE}$
\end_inset

; 
\begin_inset Formula $\mathrm{c}\leftarrow a\wedge1$
\end_inset

 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
jz ( n--- ) 
\begin_inset Formula $\mathbf{if}(n=0)\,\mathrm{P}\leftarrow jmp$
\end_inset

 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
jnz ( n--- ) 
\begin_inset Formula $\mathbf{if}(n\ne0)\,\mathrm{P}\leftarrow jmp$
\end_inset

 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
jc ( x--- ) 
\begin_inset Formula $\mathbf{if}(c)\,\mathrm{P}\leftarrow jmp$
\end_inset

 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
jnc ( x--- ) 
\begin_inset Formula $\mathbf{if}(c=0)\,\mathrm{P}\leftarrow jmp$
\end_inset

 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<control flow>>=
\newline

5'b00001: begin // call
\newline

   rp <= rpdec;
\newline

   R <= { ~|state ? incaddr[15:1] : P[15:1], c };
\newline

   P <= jmp;
\newline

   c <= 1'b0;
\newline

   if(state == 2'b11) `DROP;
\newline

end // case: 5'b00001
\newline

5'b00010: begin // jmp
\newline

   P <= jmp;
\newline

   if(state == 2'b11) `DROP;
\newline

end
\newline

5'b00011: // ret
\newline

          { rp, c, P, R } <= 
\newline

          { rpinc, R[0], R[l-1:1], 1'b0, toR };
\newline

5'b00100, 5'b00101, 5'b00110, 5'b00111:
\newline

begin // conditional jmps
\newline

   if((inst[1] ? c : zero) ^ inst[0]) 
\newline

      P <= jmp;
\newline

   `DROP;
\newline

end
\newline

@
\end_layout
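
\begin_layout Standard
The condition used by the conditional jumps can be read as a small decoder:
 bit 1 of the opcode selects the tested flag and bit 0 inverts it.
 The following sketch isolates that expression; the module and port names
 are hypothetical, and 
\family typewriter
zero
\family default
 is assumed to be 1 when T is 0:
\end_layout

\begin_layout Scrap
<<example: branch condition>>=
\newline

// Illustration only, not part of the b16 core.
\newline

module branch_taken(inst, c, zero, taken);
\newline

   input [4:0] inst; // 5'b00100 to 5'b00111
\newline

   input c;          // carry flag
\newline

   input zero;       // assumed: 1 when T is 0
\newline

   output taken;
\newline

   // 00100 jz:  zero     00101 jnz: ~zero
\newline

   // 00110 jc:  c        00111 jnc: ~c
\newline

   assign taken = (inst[1] ? c : zero) ^ inst[0];
\newline

endmodule
\newline

@
\end_layout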

\begin_layout Subsection
ALU Operations
\end_layout

\begin_layout Standard
The ALU instructions use the ALU, which computes a result 
\family typewriter
res
\family default
 and a carry bit from T and N.
 The instruction 
\family typewriter
com
\family default
 is an exception: it only inverts T, which does not require the ALU.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Standard
Ordinary ALU instructions just write the result of the ALU into T and c,
 and reload N.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
xor ( a b---r ) 
\begin_inset Formula $r\leftarrow a\oplus b$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
com ( a---r ) 
\begin_inset Formula $r\leftarrow a\oplus\$\mathrm{FFFF}$
\end_inset

, 
\begin_inset Formula $\mathrm{c}\leftarrow1$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
and ( a b---r ) 
\begin_inset Formula $r\leftarrow a\wedge b$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
or ( a b---r ) 
\begin_inset Formula $r\leftarrow a\vee b$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
+ ( a b---r ) 
\begin_inset Formula $\mathrm{c},r\leftarrow a+b$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
+c ( a b---r ) 
\begin_inset Formula $\mathrm{c},r\leftarrow a+b+\mathrm{c}$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
\begin_inset Formula $*$
\end_inset

+ ( a b---a r ) 
\begin_inset Formula $\mathbf{if}(\mathrm{c})\, c_{n},r\leftarrow a+b\,\mathbf{else}\, c_{n},r\leftarrow0,b$
\end_inset

; 
\begin_inset Formula $r,\mathrm{R},\mathrm{c}\leftarrow c_{n},r,\mathrm{R}$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
/-- ( a b---a r ) 
\begin_inset Formula $c_{n},r_{n}\leftarrow a+b+1;$
\end_inset

 
\begin_inset Formula $\mathbf{if}(\mathrm{c}\vee c_{n})\, r\leftarrow r_{n}$
\end_inset

; 
\begin_inset Formula $\mathrm{c},r,\mathrm{R}\leftarrow r,\mathrm{R},\mathrm{c}\vee c_{n}$
\end_inset

 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<ALU operations>>=
\newline

5'b01001: // com
\newline

   { c, T } <= { 1'b1, ~T };
\newline

5'b01110: // *+
\newline

   { T, R, c } <=
\newline

   { c ? { carry, res } : { 1'b0, T }, R };
\newline

5'b01111: // /-
\newline

   { c, T, R } <=
\newline

   { (c | carry) ? res : T, R, (c | carry) };
\newline

5'b01000, 5'b01010, 5'b01011, 5'b01100, 5'b01101:
\newline

   // xor, and, or, +, +c
\newline

   { sp, c, T } <= { spinc, carry, res };
\newline

@
\end_layout

\begin_layout Subsection
Memory Instructions
\end_layout

\begin_layout Standard
Memory instructions use either T as the address and N as the data (source
 or destination), or P as the address and T as the destination (literals).
 The address is auto-incremented, except for instructions in the first slot
 which use T as the address; this is to implement read-modify-write instructions
 (the non-incrementing forms are written as @.
 or !.
 in the assembler, the don't-care forms as @* or !*).
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
!+ ( n A---A' ) 
\begin_inset Formula $mem[A]\leftarrow n$
\end_inset

; 
\begin_inset Formula $\mathrm{A'}\leftarrow\mathrm{A}+2$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
@+ ( A---n A' ) 
\begin_inset Formula $n\leftarrow mem[\mathrm{A}]$
\end_inset

; 
\begin_inset Formula $\mathrm{A'}\leftarrow\mathrm{A}+2$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
@ ( A---n ) 
\begin_inset Formula $n\leftarrow mem[\mathrm{A}]$
\end_inset

; 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
lit ( ---n ) 
\begin_inset Formula $n\leftarrow mem[\mathrm{P}]$
\end_inset

; 
\begin_inset Formula $\mathrm{P}\leftarrow\mathrm{P}+2$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
c!+ ( c A---A' ) 
\begin_inset Formula $mem.b[\mathrm{A}]\leftarrow c$
\end_inset

; 
\begin_inset Formula $\mathrm{A'}\leftarrow\mathrm{A}+1$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
c@+ ( A---c A' ) 
\begin_inset Formula $c\leftarrow mem.b[\mathrm{A}]$
\end_inset

; 
\begin_inset Formula $\mathrm{A'}\leftarrow\mathrm{A}+1$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
c@ ( A---c ) 
\begin_inset Formula $c\leftarrow mem.b[\mathrm{A}]$
\end_inset

; 
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
litc ( ---c ) 
\begin_inset Formula $c\leftarrow mem.b[\mathrm{P}]$
\end_inset

; 
\begin_inset Formula $\mathrm{P}\leftarrow\mathrm{P}+1$
\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<address handling>>=
\newline

wire `L incaddr, dataw, datas;
\newline

wire tos2r, tos2n;
\newline

wire incby, bswap, addrsel, access, rd;
\newline

wire [1:0] wr;
\newline


\newline

assign incby = (rwinst[4:2] != 3'b101);
\newline

assign access = (rwinst[4:3]==2'b10);
\newline

assign addrsel = rd ? 
\newline

      (access & (rwinst[1:0] != 2'b11)) : |wr;
\newline

assign rd = (state==2'b00) || 
\newline

            (access && (rwinst[1:0]!=2'b00));
\newline

assign wr = (access && (rwinst[1:0]==2'b00)) ?
\newline

            { ~rwinst[2] | ~T[0], 
\newline

              ~rwinst[2] | T[0] } : 2'b00;
\newline

assign addr = addrsel ? T : P;
\newline

assign incaddr = addr + incby + 1;
\newline

assign tos2n = (!rd | (rwinst[1:0] == 2'b11));
\newline

assign toN = tos2n ? T : dataw;
\newline

assign bswap = ~incby ^ addr[0];
\newline

assign datas = bswap ? { data[7:0], data[l-1:8] }
\newline

                     : data;
\newline

assign dataw = incby ? datas
\newline

                     : { 8'h00, datas[7:0] }; 
\newline

assign dataout = bswap ? { N[7:0], N[l-1:8] }
\newline

                       : N; 
\newline

@
\end_layout

\begin_layout Standard
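Memory can be accessed not only word-wise, but also byte-wise.
 Therefore two write lines exist.
 For a byte-wise store, the bytes of the data word N are swapped where necessary
 (signal
\family typewriter
 bswap
\family default
), so that the byte to be stored appears on the byte lane selected by the
 address.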
\end_layout

\begin_layout Scrap
<<load/store>>=
\newline

5'b10000, 5'b10001, 5'b10100, 5'b10101:
\newline

begin        // !+, @+, c!+, c@+
\newline

   if(nextstate != 2'b10) T <= incaddr;
\newline

   sp <= rd ? spdec : spinc;
\newline

end
\newline

5'b10010, 5'b10011, 5'b10110, 5'b10111:
\newline

   T <= dataw;  // @, lit, c@, litc
\newline

@
\end_layout

\begin_layout Standard
Memory accesses need an extra cycle.
 Here the result of the memory access is handled.
\end_layout

\begin_layout Scrap
<<load-store>>=
\newline

<<pointer increment>>
\newline

if(|state[1:0]) begin 
\newline

   <<store afterwork>>
\newline

end else begin 
\newline

   <<ifetch>>
\newline

end
\newline

@
\end_layout

\begin_layout Scrap
<<debug>>=
\newline

$write("%b[%b] T=%b%x:%x[%x], ",
\newline

       inst, state, c, T, N, sp);
\newline

$write("P=%x, I=%x, R=%x[%x], res=%b%x
\backslash
n",
\newline

       P, I, R, rp, carry, res);
\newline

@
\end_layout

\begin_layout Standard
After the access is completed, the result for a load has to be pushed on
 the stack, or into the instruction register; for stores, the TOS is to
 be dropped.
\end_layout

\begin_layout Scrap
<<store afterwork>>=
\newline

if(rd && { inst[4:3], inst[1:0] } != 4'b1010)
\newline

   sp <= spdec;
\newline

if(|wr) sp <= spinc;
\newline

@
\end_layout

\begin_layout Standard
Furthermore, the incremented address may go back to the program pointer.
\end_layout

\begin_layout Scrap
<<pointer increment>>=
\newline

if(~|state || 
\newline

   ({ inst[4:3], inst[1:0] } == 4'b1011))
\newline

   P <= incaddr;
\newline

@
\end_layout

\begin_layout Standard
To shortcut a 
\family typewriter
nop
\family default
 in the first instruction, there's some special logic.
 That's the second part of NEXT.
\end_layout

\begin_layout Scrap
<<ifetch>>=
\newline

I <= data; 
\newline

if(!data[15]) state[1:0] <= 2'b01;
\newline

@
\end_layout

\begin_layout Subsubsection
Peripherals
\end_layout

\begin_layout Standard
Peripherals should only use address bits [15:1], read a whole word, and
 select the bytes written to based on the two write bits (bit 1 for most
 significant byte, bit 0 for least significant byte).
\end_layout
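
\begin_layout Standard
As a minimal sketch of this convention, the following module shows one 16-bit
 peripheral register with byte-wise write enables.
 The module name
\family typewriter
 periph_reg
\family default
, the select input
\family typewriter
 sel
\family default
 (assumed to be decoded from address bits [15:1] elsewhere) and the remaining
 port names are assumptions made for this illustration and do not occur in
 the b16 sources; only the meaning of the two write bits is taken from the
 text above.
\end_layout

\begin_layout Scrap
<<sketch: peripheral register>>=
\newline

// Sketch only: periph_reg and its ports
\newline

// are hypothetical names.
\newline

module periph_reg(clk, nreset, sel,
\newline

                  wr, din, dout);
\newline

   input clk, nreset, sel;
\newline

   input [1:0] wr;  // [1]: MSB, [0]: LSB
\newline

   input [15:0] din;
\newline

   output [15:0] dout;
\newline

   reg [15:0] q;
\newline


\newline

   // reads always return the whole word
\newline

   assign dout = sel ? q : 16'h0000;
\newline


\newline

   always @(posedge clk or negedge nreset)
\newline

     if(!nreset) q <= 16'h0000;
\newline

     else if(sel) begin
\newline

        if(wr[1]) q[15:8] <= din[15:8];
\newline

        if(wr[0]) q[7:0]  <= din[7:0];
\newline

     end
\newline

endmodule
\newline

@
\end_layout

\begin_layout Standard
With the write enable logic of the address handling chunk, a word store
 asserts both write bits, a byte store to an even address asserts only bit
 1, and a byte store to an odd address asserts only bit 0.
\end_layout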

\begin_layout Subsection
Stack Instructions
\end_layout

\begin_layout Standard
Stack instructions change the stack pointer and move values into and out
 of latches.
 Among the six stack operations implemented, one common operation is missing:
\family typewriter
 swap
\family default
.
 Instead, there's 
\family typewriter
nip
\family default
.
 The reason is a possible implementation option: it's possible to omit N,
 and fetch this value directly out of the stack RAM.
 This consumes more time, but saves space.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
nip ( a b---b )
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
drop ( a--- )
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
over ( a b---a b a )
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
dup ( a---a a )
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
>r ( a---r:a )
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Description
r> ( r:a---a )
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<stack operations>>=
\newline

5'b11000: sp <= spinc;               // nip
\newline

5'b11001: `DROP;                     // drop
\newline

5'b11010: { sp, T } <= { spdec, N }; // over
\newline

5'b11011: sp <= spdec;               // dup
\newline

5'b11100: begin                      // >r
\newline

   R <= T; rp <= rpdec; `DROP;
\newline

end // case: 5'b11100
\newline

5'b11110: begin                      // r>
\newline

   { sp, T, R } <= { spdec, R, toR };
\newline

   rp <= rpinc;
\newline

end // case: 5'b11110
\newline

default ;                            // noop
\newline

@
\end_layout

\begin_layout Section
The Rest of the Implementation
\end_layout

\begin_layout Standard
First come the implementation file(s), with header comment and modules.
 You can either keep everything in one file (
\family typewriter
b16.v
\family default
), or each module in a file with the same name as the module---the defines
 will go to 
\family typewriter
b16-defines.v
\family default
 for central manipulation of the defines.
\end_layout

\begin_layout Scrap
<<header>>=
\newline

/*
\newline

 * b16 core: 16 bits, 
\newline

 * inspired by c18 core from Chuck Moore
\newline

 * (c) 2002-2011 by Bernd Paysan
\newline

 * 
\newline

 * <<gpl-header>>
\newline

 */
\newline

@
\end_layout

\begin_layout Scrap
<<defines>>=
\newline

`define L [l-1:0]
\newline

`define DROP { sp, T } <= { spinc, N } 
\newline

`define DEBUGGING
\newline

`define FPGA
\newline

// `define BUSTRI
\newline

@
\end_layout

\begin_layout Scrap
<<b16.v>>=
\newline

<<header>>
\newline

/*
\newline

<<inst-comment>>
\newline

 */
\newline

<<defines>>
\newline


\newline

<<ALU>>
\newline

<<latchen>>
\newline

<<Stack>>
\newline

<<cpu>>
\newline

<<debugger>>
\newline

@
\end_layout

\begin_layout Scrap
<<b16-defines.v>>=
\newline

<<defines>>
\newline

@
\end_layout

\begin_layout Scrap
<<alu.v>>=
\newline

<<header>>
\newline

`include "b16-defines.v"
\newline


\newline

<<ALU>>
\newline

@
\end_layout

\begin_layout Scrap
<<stack.v>>=
\newline

<<header>>
\newline

`include "b16-defines.v"
\newline


\newline

<<Stack>>
\newline

@
\end_layout

\begin_layout Scrap
<<latchen.v>>=
\newline

<<header>>
\newline

`include "b16-defines.v"
\newline


\newline

<<latchen>>
\newline

@
\end_layout

\begin_layout Scrap
<<cpu.v>>=
\newline

<<header>>
\newline

/*
\newline

<<inst-comment>>
\newline

 */
\newline

`include "b16-defines.v"
\newline


\newline

<<cpu>>
\newline

@
\end_layout

\begin_layout Scrap
<<debugger.v>>=
\newline

<<header>>
\newline

`include "b16-defines.v"
\newline


\newline

<<debugger>>
\newline

@
\end_layout

\begin_layout Scrap
<<gpl-header>>=
\newline

This program is free software; you can redistribute it and/or modify
\newline

it under the terms of the GNU General Public License as published by
\newline

the Free Software Foundation; version 2 of the License or any later.
\newline


\newline

This program is distributed in the hope that it will be useful,
\newline

but WITHOUT ANY WARRANTY; without even the implied warranty of
\newline

MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
  See the
\newline

GNU General Public License for more details.
\newline


\newline

This is not the source code of the program, the source code is a LyX
\newline

literate programming style article.
\newline

@
\end_layout

\begin_layout Scrap
<<inst-comment>>= 
\newline

 * Instruction set:
\newline

 * 1, 5, 5, 5 bits
\newline

 *     0    1    2    3    4    5    6    7
\newline

 *  0: nop  call jmp  ret  jz   jnz  jc   jnc
\newline

 *  /3      exec goto ret  gz   gnz  gc   gnc
\newline

 *  8: xor  com  and  or   +    +c   *+   /-
\newline

 * 10: !+   @+   @    lit  c!+  c@+  c@   litc
\newline

 *  /1 !.
   @.
   @    lit  c!.
  c@.
  c@   litc
\newline

 * 18: nip  drop over dup  >r        r>
\newline

@
\end_layout

\begin_layout Subsection
Top Level
\end_layout

\begin_layout Standard
The CPU consists of several parts, which are all implemented in the same
 Verilog module.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<cpu>>=
\newline

module cpu(clk, run, nreset, addr, rd, wr, data, 
\newline

           dataout, scanning, atpg 
\newline

`ifdef DEBUGGING,
\newline

           dr, dw, daddr, din, dout, bp`endif);
\newline

   <<port declarations>>
\newline

   <<register declarations>>
\newline

   <<instruction selection>>
\newline

   <<ALU instantiation>>
\newline

   <<address handling>>
\newline

   <<stack pushs>>
\newline

   <<stack instantiation>>
\newline

   <<state changes>>
\newline

   <<debugging read>>
\newline


\newline

   always @(posedge clk or negedge nreset)
\newline

      <<register updates>>
\newline


\newline

endmodule // cpu
\newline

@
\end_layout

\begin_layout Standard
First, Verilog needs port declarations, so that it can know what's input
 and output.
 The parameters are used to configure other word sizes and stack depths.
 The CPU is not fully scalable, e.g.
 the instruction decoder or the byte swap operation for byte access depends
 on 16 bit word size, but those parts of the CPU that are scalable can be
 scaled by changing that parameter---the others need manual intervention.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<port declarations>>=
\newline

parameter rstaddr=16'h3FFE, show=0,
\newline

          l=16, sdep=4, rdep=4;
\newline

input clk, run, nreset, scanning, atpg;
\newline

output `L addr;
\newline

output rd;
\newline

output [1:0] wr;
\newline

input  `L data;
\newline

output `L dataout;
\newline

<<debugging-ports>>
\newline

@
\end_layout

\begin_layout Standard
The ALU is instantiated with the configured width, and the necessary wires
 are declared.
\end_layout

\begin_layout Scrap
<<ALU instantiation>>=
\newline

wire `L res, toN, toR, N;
\newline

wire carry, zero;
\newline


\newline

alu #(l) alu16(.res(res), .carry(carry),
\newline

               .zero(zero), 
\newline

               .T(T), .N(N), .c(c),
\newline

               .inst(inst[2:0]));
\newline

@
\end_layout

\begin_layout Standard
Since the stacks work in parallel, we have to calculate when a value is
 pushed onto the stack (thus 
\series bold
only
\series default
 if something is stored there).
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<stack pushs>>=
\newline

reg dpush, rpush;
\newline


\newline

always @(state or inst or rd or run <<dbg senselist>>)
\newline

  begin
\newline

     rpush = 1'b0;
\newline

     dpush = (|state[1:0] & rd) |
\newline

             (inst[4] && inst[3] && inst[1]);
\newline

     case(inst)
\newline

        5'b00001: rpush = |state[1:0] | run;
\newline

        5'b11100: rpush = 1'b1;
\newline

        default ;
\newline

     endcase // case(inst)
\newline

     <<stack debugging>>
\newline

  end
\newline

@
\end_layout

\begin_layout Standard
The stacks consist not only of the two stack modules, but also need an incremented
 and a decremented stack pointer.
 The return stack additionally allows the top of the return stack to be written
 without changing the return stack depth.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<stack instantiation>>=
\newline

wire [sdep-1:0] spdec, spinc;
\newline

wire [rdep-1:0] rpdec, rpinc;
\newline


\newline

stack #(sdep,l) dstack(.clk(clk),
\newline

                       .sp(sp),
\newline

                       .spdec(spdec),
\newline

                       .push(dpush),
\newline

                       .in(toN),
\newline

                       .out(N),
\newline

                       .scan(scanning));
\newline

stack #(rdep,l) rstack(.clk(clk),
\newline

                       .sp(rp),
\newline

                       .spdec(rpdec),
\newline

                       .push(rpush),
\newline

                       .in(R),
\newline

                       .out(toR),
\newline

                       .scan(scanning));
\newline


\newline

assign spdec = sp-{{(sdep-1){1'b0}}, 1'b1};
\newline

assign spinc = sp+{{(sdep-1){1'b0}}, 1'b1};
\newline

assign rpdec = rp-{{(rdep-1){1'b0}}, 1'b1};
\newline

assign rpinc = rp+{{(rdep-1){1'b0}}, 1'b1};
\newline

@
\end_layout

\begin_layout Standard
The basic core is the fully synchronous register update.
 Each register needs a reset value, and depending on the state transition,
 the corresponding assignments have to be coded.
 Most of that has been shown above; only the instruction fetch and the assignment
 of the next value of
\family typewriter
 incby
\family default
 remain to be done.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<register updates>>=
\newline

if(!nreset) begin
\newline

   <<resets>>
\newline

end else if(run) begin
\newline

`ifdef REPORT_VERBOSE
\newline

   if(show) begin
\newline

      <<debug>>
\newline

   end
\newline

`endif
\newline

   <<load-store>>
\newline

   state <= nextstate;
\newline

   <<instructions>>
\newline

end else begin // debug
\newline

   <<debugging>>
\newline

end // else: !if(nreset)
\newline

@
\end_layout

\begin_layout Standard
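As reset value, we initialize the CPU so that it is about to fetch the next
 instruction from the reset address
\family typewriter
 rstaddr
\family default
 (default
\family typewriter
 16'h3FFE
\family default
).
 The stacks are all empty, the other registers contain all zeros.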
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<resets>>=
\newline

state <= 2'b11;
\newline

P <= rstaddr;
\newline

T <= 16'h0000;
\newline

I <= 16'h0000;
\newline

R <= 16'h0000;
\newline

c <= 1'b0;
\newline

sp <= 0;
\newline

rp <= 0;
\newline

@
\end_layout

\begin_layout Standard
The transition to the next state (the NEXT within a bundle) is done separately.
 That's necessary, since the assignments of the other variables are not
 just dependent on the current state, but partially also on the next state
 (e.g.
 when to fetch the next instruction word).
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<state changes>>=
\newline

wire [1:0] nextstate;
\newline


\newline

assign nextstate = ((~|inst) || (|inst[4:3])) ?
\newline

                   state[1:0] + 2'b01 : 2'b00;
\newline

@
\end_layout
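
\begin_layout Standard
The effect of this expression can be checked in isolation.
 The following testbench is only a sketch: the module name
\family typewriter
 nextstate_tb
\family default
 is hypothetical, while the assign statement is copied from the chunk above.
 It checks that a nop and an ALU instruction advance to the next slot, that
 a jump returns to state 0 (so that a new instruction word is fetched), and
 that the 2-bit state wraps from 3 to 0.
\end_layout

\begin_layout Scrap
<<sketch: nextstate testbench>>=
\newline

// Sketch only: hypothetical testbench.
\newline

module nextstate_tb;
\newline

   reg [4:0] inst;
\newline

   reg [1:0] state;
\newline

   wire [1:0] nextstate;
\newline


\newline

   // same expression as in the CPU
\newline

   assign nextstate =
\newline

      ((~|inst) || (|inst[4:3])) ?
\newline

      state + 2'b01 : 2'b00;
\newline


\newline

   initial begin
\newline

      state = 2'b01;
\newline

      inst = 5'b00000;  // nop
\newline

      #1 if(nextstate !== 2'b10)
\newline

         $display("FAIL nop");
\newline

      inst = 5'b01100;  // +
\newline

      #1 if(nextstate !== 2'b10)
\newline

         $display("FAIL +");
\newline

      inst = 5'b00010;  // jmp
\newline

      #1 if(nextstate !== 2'b00)
\newline

         $display("FAIL jmp");
\newline

      state = 2'b11;
\newline

      inst = 5'b11001;  // drop
\newline

      #1 if(nextstate !== 2'b00)
\newline

         $display("FAIL wrap");
\newline

      $display("nextstate checks done");
\newline

      $finish;
\newline

   end
\newline

endmodule
\newline

@
\end_layout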

\begin_layout Subsection
Debugging
\end_layout

\begin_layout Standard
For debugging purposes, all registers are mapped into memory and can be read
 and written.
 This requires an external bus master attached to the debugging interface.
 The debugging interface is configured with the DEBUGGING flag.
 It's only active when the processor is stopped, so the processor itself
 can't access its own registers.
\end_layout

\begin_layout Standard
The debugging module offers the following registers as address space:
\end_layout

\begin_layout Standard
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="9" columns="3">
<features>
<column alignment="right" valignment="top" width="0" leftline="true" rightline="false">
<column alignment="center" valignment="top" width="0" leftline="true" rightline="false">
<column alignment="center" valignment="top" width="0" leftline="true" rightline="true">
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\emph on
Address
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\emph on
read
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard

\emph on
write
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
$FFE0
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
stack[sp++]
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
push+T
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
$FFE2
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
rstack[rp++]
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
rpush+R
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
$FFE4
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
bp
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
bp
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
$FFE6
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
state+stop
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
state
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
$FFE8
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
P
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
P
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
$FFEA
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
T
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
T
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="false">
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
$FFEC
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
R
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
R
\end_layout

\end_inset
</cell>
</row>
<row topline="true" bottomline="true">
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
$FFEE
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
I
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text

\begin_layout Standard
I
\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\begin_layout Standard
The stacks and the state register change state when being read, so be careful!
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<debugger>>=
\newline

`ifdef DEBUGGING
\newline

module debugger(clk, nreset, run,
\newline

                addr, data, r, w,
\newline

                cpu_addr, cpu_r,
\newline

                drun, dr, dw, bp);
\newline

parameter l=16, dbgaddr = 12'hFFE;
\newline

input clk, nreset, run, r, cpu_r;
\newline

input [1:0] w;
\newline

input [l-1:1] addr;
\newline

input `L data, cpu_addr;
\newline

output drun, dr, dw;
\newline

output `L bp;
\newline


\newline

reg drun, drun1;
\newline

reg `L bp;
\newline

wire dsel = (addr[l-1:4] == dbgaddr);
\newline

assign dr = dsel & r;
\newline

assign dw = dsel & |w;
\newline


\newline

always @(posedge clk or negedge nreset)
\newline

if(!nreset) begin
\newline

   drun <= 1;
\newline

   drun1 <= 1;
\newline

   bp <= 16'hffff;
\newline

end else begin
\newline

   if(cpu_addr == bp && cpu_r)
\newline

      { drun, drun1 } <= 0;
\newline

   else if(run) drun <= drun1;
\newline

   if((dr | dw) && (addr[3:1] == 3'h3)) begin
\newline

      drun <= !dr & dw;
\newline

      drun1 <= !dr & dw & data[12];
\newline

   end
\newline

   if(dw && addr[3:1] == 3'h2) bp <= data;
\newline

end
\newline


\newline

endmodule
\newline

`endif
\newline

@
\end_layout

\begin_layout Scrap
<<debugging>>=
\newline

`ifdef DEBUGGING
\newline

if(dw) case(daddr)
\newline

   3'h0: { sp, T } <= { spdec, din };
\newline

   3'h1: { rp, R } <= { rpdec, din };
\newline

   3'h3: { c, state, sp, rp } <= 
\newline

           { din[10:8],
\newline

             din[sdep+3:4], din[rdep-1:0] };
\newline

   3'h4: P <= din;
\newline

   3'h5: T <= din;
\newline

   3'h6: R <= din;
\newline

   3'h7: I <= din;
\newline

   default ;
\newline

endcase
\newline

if(dr) case(daddr)
\newline

   3'h0: sp <= spinc;
\newline

   3'h1: rp <= rpinc;
\newline

   default ;
\newline

endcase
\newline

`endif
\newline

@
\end_layout

\begin_layout Scrap
<<debugging read>>=
\newline

`ifdef DEBUGGING
\newline

reg `L dout;
\newline


\newline

always @(daddr or dr or run or P or T or R or I or
\newline

         state or sp or rp or c or N or toR or bp)
\newline

if(!dr || run) dout = 'h0;
\newline

else case(daddr)
\newline

   3'h0: dout = N;
\newline

   3'h1: dout = toR;
\newline

   3'h2: dout = bp;
\newline

   3'h3: dout = { run, 4'h0, c, state,
\newline

                  {4-sdep{1'b0}}, sp,
\newline

                  {4-rdep{1'b0}}, rp };
\newline

   3'h4: dout = P;
\newline

   3'h5: dout = T;
\newline

   3'h6: dout = R;
\newline

   3'h7: dout = I;
\newline

endcase
\newline

`endif
\newline

@
\end_layout
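
\begin_layout Standard
The status word read from register 3 (address $FFE6) packs several fields
 into one word.
 The following sketch reproduces that packing and checks the bit positions
 that the write path of the debugging chunk reads back.
 It assumes the default stack depths sdep = rdep = 4; the testbench name and
 its local signals are hypothetical.
\end_layout

\begin_layout Scrap
<<sketch: status word testbench>>=
\newline

// Sketch only: hypothetical testbench,
\newline

// assumes sdep = rdep = 4.
\newline

module status_word_tb;
\newline

   reg run, c;
\newline

   reg [1:0] state;
\newline

   reg [3:0] sp, rp;
\newline

   wire [15:0] status;
\newline


\newline

   // read layout of debug register 3
\newline

   assign status =
\newline

      { run, 4'h0, c, state, sp, rp };
\newline


\newline

   initial begin
\newline

      run = 1'b0; c = 1'b1;
\newline

      state = 2'b10;
\newline

      sp = 4'h3; rp = 4'hE;
\newline

      #1;
\newline

      if(status !== 16'h063E)
\newline

         $display("FAIL pack");
\newline

      // fields as used when writing
\newline

      if(status[10:8] !== { c, state })
\newline

         $display("FAIL c/state");
\newline

      if(status[7:4] !== sp)
\newline

         $display("FAIL sp");
\newline

      if(status[3:0] !== rp)
\newline

         $display("FAIL rp");
\newline

      $display("status checks done");
\newline

      $finish;
\newline

   end
\newline

endmodule
\newline

@
\end_layout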

\begin_layout Scrap
<<debugging-ports>>=
\newline

`ifdef DEBUGGING
\newline

   input [2:0] daddr;
\newline

   input dr, dw;
\newline

   input `L din, bp;
\newline

   output `L dout;
\newline

`endif
\newline

@
\end_layout

\begin_layout Scrap
<<dbg senselist>>=
\newline

`ifdef DEBUGGING
\newline

or run or dw or daddr
\newline

`endif
\newline

@
\end_layout

\begin_layout Scrap
<<stack debugging>>=
\newline

`ifdef DEBUGGING
\newline

if(!run && dw) case(daddr)
\newline

   3'h0: dpush = 1;
\newline

   3'h1: rpush = 1;
\newline

   default ;
\newline

endcase
\newline

`endif
\newline

@
\end_layout

\begin_layout Subsection
ALU
\end_layout

\begin_layout Standard
The ALU just computes the sum with possible carry-ins, the logical operations,
 and a zero flag.
 It reuses the same logic (essentially what comprises a full adder) to do
 both sums and logic.
 Figure 
\begin_inset LatexCommand ref
reference "fig:ALU-bit-slice"

\end_inset

 illustrates the logic that processes one bit of the ALU operation: two
 multiplexers and one full adder (or the equivalent logic) per bit are sufficient
 to implement an ALU.
 The carry works as an AND gate if the carry in is 0 (both 
\begin_inset Formula $a$
\end_inset

 and 
\begin_inset Formula $b$
\end_inset

 input must be 1 to create a carry out), an OR gate if the carry in is 1
 (both 
\begin_inset Formula $a$
\end_inset

 and 
\begin_inset Formula $b$
\end_inset

 input must be 0 to not create a carry out), and the sum is an XOR of 
\begin_inset Formula $a$
\end_inset

 and 
\begin_inset Formula $b$
\end_inset

 without carry in, and an XNOR with carry in.
 The XNOR operation of the ALU is not used.
 When the carry is propagated, a normal sum is generated; in this case,
 the result 
\begin_inset Formula $r$
\end_inset

 selected is always the sum.
\end_layout
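
\begin_layout Standard
The reading of the full adder as AND, OR, XOR and XNOR gates can be verified
 exhaustively for one bit.
 The testbench below is a sketch under that reading only; it is not part
 of the b16 sources, and the module name
\family typewriter
 slice_tb
\family default
 is hypothetical.
\end_layout

\begin_layout Scrap
<<sketch: bit slice testbench>>=
\newline

// Sketch only: one-bit full adder,
\newline

// checked for all 8 input combinations.
\newline

module slice_tb;
\newline

   reg a, b, cin;
\newline

   wire sum, cout;
\newline

   reg [3:0] i;
\newline

   integer errors;
\newline


\newline

   assign sum  = a ^ b ^ cin;
\newline

   assign cout = (a & b) | (a & cin) |
\newline

                 (b & cin);
\newline


\newline

   initial begin
\newline

      errors = 0;
\newline

      for(i = 0; i < 8; i = i + 1) begin
\newline

         { cin, a, b } = i[2:0];
\newline

         #1;
\newline

         if(cin == 1'b0) begin
\newline

            // carry is AND, sum is XOR
\newline

            if(cout !== (a & b))
\newline

               errors = errors + 1;
\newline

            if(sum !== (a ^ b))
\newline

               errors = errors + 1;
\newline

         end else begin
\newline

            // carry is OR, sum is XNOR
\newline

            if(cout !== (a | b))
\newline

               errors = errors + 1;
\newline

            if(sum !== (a ~^ b))
\newline

               errors = errors + 1;
\newline

         end
\newline

      end
\newline

      $display("mismatches: %0d", errors);
\newline

      $finish;
\newline

   end
\newline

endmodule
\newline

@
\end_layout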

\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open

\begin_layout Standard
\align center
\begin_inset Graphics
    filename alu.pdf
    scale 40

\end_inset


\end_layout

\begin_layout Standard
\begin_inset Caption

\begin_layout Standard
\begin_inset LatexCommand label
name "fig:ALU-bit-slice"

\end_inset

ALU bit slice
\end_layout

\end_inset


\end_layout

\end_inset


\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<ALU>>=
\newline

module alu(res, carry, zero, T, N, c, inst);
\newline

   <<ALU ports>>
\newline


\newline

   wire        `L r1, r2;
\newline

   wire [l:0]  carries;
\newline


\newline

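   // r1 is the sum bit of each full adder
\newline

   assign r1 = T ^ N ^ carries;
\newline

   // r2 is the carry out (majority) of each full adder
\newline

   assign r2 = (T & N) | 
\newline

               (T & carries`L) | 
\newline

               (N & carries`L);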
\newline

// This generates a carry *chain*, not a loop!
\newline

   assign carries = 
\newline

        prop ? { r2[l-1:0], (c | selr) & andor } 
\newline

             : { c, {(l){andor}}};
\newline

   assign res = (selr & ~prop) ? r2 : r1;
\newline

   assign carry = carries[l];
\newline

   assign zero = ~|T;
\newline

endmodule // alu
\newline

@
\end_layout

\begin_layout Standard
The ALU takes T and N, the carry in, and the lowest 3 bits of the instruction
 as inputs, and produces a result, a carry out, and a zero flag as outputs.
 The zero flag tests T, not the result.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<ALU ports>>=
\newline

parameter l=16;
\newline

input `L T, N;
\newline

input c;
\newline

input [2:0] inst;
\newline

output `L res;
\newline

output carry, zero;
\newline


\newline

wire prop, andor, selr;
\newline


\newline

assign { prop, selr, andor } = inst;
\newline

@
\end_layout
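
\begin_layout Standard
Reading the two scraps above together, the three control bits select the
 results listed below.
 This table is a sketch derived only from the assignments to 
\family typewriter
carries
\family default
 and 
\family typewriter
res
\family default
; it assumes that the macro 
\family typewriter
`L
\family default
 denotes the bit range 
\family typewriter
[l-1:0]
\family default
, and it deliberately names no instruction mnemonics.
 The symbol 
\begin_inset Formula $c_{\mathrm{out}}$
\end_inset

 stands for the carry out of the most significant bit of the sum, and 
\begin_inset Formula $\times$
\end_inset

 marks a bit that does not matter:
\begin_inset Formula \[
\begin{array}{ccc|l|l}
\mathit{prop} & \mathit{selr} & \mathit{andor} & \mathit{res} & \mathit{carry}\\
\hline 0 & 0 & 0 & T\oplus N & c\\
0 & 0 & 1 & \overline{T\oplus N}\mbox{ (unused)} & c\\
0 & 1 & 0 & T\wedge N & c\\
0 & 1 & 1 & T\vee N & c\\
1 & \times & 0 & T+N & c_{\mathrm{out}}\\
1 & 0 & 1 & T+N+c & c_{\mathrm{out}}\\
1 & 1 & 1 & T+N+1 & c_{\mathrm{out}}\end{array}\]

\end_inset

In the rows with 
\begin_inset Formula $\mathit{prop}=0$
\end_inset

 the carry chain is cut, every bit sees the constant 
\begin_inset Formula $\mathit{andor}$
\end_inset

 as its carry in, and the incoming carry 
\begin_inset Formula $c$
\end_inset

 is passed through unchanged.
\end_layout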

\begin_layout Subsection
Stacks
\end_layout

\begin_layout Standard
The stacks are modelled as block RAM in the FPGA.
 In an ASIC, they are implemented with latches.
 The block RAM (or register file) needs one read port, addressed by sp,
 and one write port, addressed by spdec.
\begin_inset ERT
status collapsed

\begin_layout Standard


\backslash
filbreak
\end_layout

\end_inset


\end_layout

\begin_layout Scrap
<<Stack>>=
\newline

module stack(clk, sp, spdec, push, scan, in, out);
\newline

   parameter dep=2, l=16;
\newline

   input clk, push, scan;
\newline

   input [dep-1:0] sp, spdec;
\newline

   input `L in;
\newline

   output `L out;
\newline


\newline

   reg `L stackmem[0:(1@<<dep)-1];
\newline


\newline

`ifndef FPGA
\newline

   // ASIC: latch-based memory, transparent while the gated write pulse is high
\newline

   wire write;
\newline

   latchen genwrite(.clk(clk),
\newline

                    .en(push),
\newline

                    .scan(scan),
\newline

                    .out(write));
\newline


\newline

   always @(write or spdec or in)
\newline

      if(write) stackmem[spdec] <= in;
\newline

`else
\newline

   // FPGA: synchronous write on the rising clock edge
\newline

   always @(posedge clk)
\newline

      if(push)
\newline

         stackmem[spdec] <= in;
\newline

`endif
\newline


\newline

  assign out = stackmem[sp];
\newline


\newline

endmodule // stack
\newline

@
\end_layout

\begin_layout Scrap
<<latchen>>=
\newline

`ifndef FPGA
\newline

module latchen(clk, en, scan, out);
\newline

   input clk, en, scan;
\newline

   output out;
\newline


\newline

   // write pulse: active only while clk is low and scan mode is off
\newline

   assign out = en & ~clk & ~scan;
\newline

endmodule
\newline

`endif
\newline

@
\newline


\end_layout

\begin_layout Bibliography
\begin_inset LatexCommand bibitem
key "c18"

\end_inset


\emph on
c18 ColorForth Compiler,
\emph default
 
\noun on
Chuck Moore
\noun default
, 
\begin_inset Formula $17^{\mathrm{th}}$
\end_inset

 EuroForth Conference Proceedings, 2001
\end_layout

\end_body
\end_document