(This article is part of the The Fabric of Computing series: in search of simplicity)
With today’s programming languages, it’s easy to ignore what goes on inside. Maybe you’ve never even bothered to find out.
The trouble is that there are only so many hours in a day. As long as the chip is usable from a high-level language, who cares?
Well… I do. It’s fun to open up the hood and figure out how things are built and configured internally. Often with lots of clever and creative engineering.
There’s an older but delightfully minimal CPU by Tim Böscke, called the MCPU. His design fits into aCPLD with a mere thirty-two macrocells plus 64 bytes of RAM.
The MCPU project now lives on GitHub, including a 6-page PDF. But this post is not about duplicating his design, or simulating it at the logic level - instead, I’d like to show how to create an assembler and emulator for this CPU with only a few dozen lines of code. At the end, we’ll use this to run a little prime-number generator written in MCPU-assembler, and then re-establish the earth-shattering fact that the 52nd prime is 239!
The CPU is the computer’s brain. It steps through code stored in memory, performs computations, and makes decisions. Here’s a tiny example program for this MCPU:
3E 45 7F C2 C4 FA 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF 01
These machine instructions are specific for MCPU, but every computer has‘em, and they all look like this sort of gibberish.
Here is an emulator, written in Python, which follows the MCPU’s rules and executes / interprets the above program:
$ python emu.py <test.obj
pc,ir,acc,cf: 0 0x3e 0 0
pc,ir,acc,cf: 1 0x45 0 0
pc,ir,acc,cf: 2 0x7f 250 0
pc,ir,acc,cf: 3 0xc2 251 0
pc,ir,acc,cf: 2 0x7f 251 0
pc,ir,acc,cf: 3 0xc2 252 0
pc,ir,acc,cf: 2 0x7f 252 0
pc,ir,acc,cf: 3 0xc2 253 0
pc,ir,acc,cf: 2 0x7f 253 0
pc,ir,acc,cf: 3 0xc2 254 0
pc,ir,acc,cf: 2 0x7f 254 0
pc,ir,acc,cf: 3 0xc2 255 0
pc,ir,acc,cf: 2 0x7f 255 0
pc,ir,acc,cf: 3 0xc2 0 1
pc,ir,acc,cf: 4 0xc4 0 0
pc,ir,acc,cf: 4 0xc4 0 0
pc,ir,acc,cf: 4 0xc4 0 0
pc,ir,acc,cf: 4 0xc4 0 0
pc,ir,acc,cf: 4 0xc4 0 0
pc,ir,acc,cf: 4 0xc4 0 0
$
There’s a counter in there, which counts to 255, “overflows” to 0, and then stops.
The above will not make much sense so far, but perhaps this will help:
org 0
lda count
cloop add one
jcc cloop
done jcc done
count dcb 250
That’s the same code, in “MCPU-assembly” language: load a count in the accumulator, add one, and loop until the carry flag is set. Then we enter an infinite loop.
Half a century ago, this would probably have been state of the art. The assembly language freed programmers from looking up codes and entering them by hand. Even more importantly, the assembler figured out all the memory addresses and jump distances, with support for human-friendly labels (cloop / done / count in this case).
The big jump is that the assembler is itself a computer program. In the old days these were very complex, but with the help of a bit of Python scripting, we can now boil it down to just a few dozen lines of code:
# MCPU assembler, see https://github.com/cpldcpu/MCPU# python asm.py <test.asm >test.objfrom__future__import print_functionimportsysdefvalof(arg):if arg in syms:
arg = syms[arg]try:returnint(arg)except:return0# pseudo opsdeforg(arg):global pc
pc = valof(arg)defdcb(arg):global pc
mem[pc &0x3F] = valof(arg)
pc +=1# instruction setdefnor(arg): dcb(valof(arg) |0x00)defadd(arg): dcb(valof(arg) |0x40)defsta(arg): dcb(valof(arg) |0x80)defjcc(arg): dcb(valof(arg) |0xC0)# combined instructionsdefjcs(arg): jcc(pc+2); jcc(arg)defjmp(arg): jcc(arg); jcc(arg)deflda(arg): nor('allone'); add(arg)defsub(arg): nor('zero'); add(arg); add('one')defout(arg): dcb(0xFF)
code = sys.stdin.readlines()
syms = {}for _ inrange(2):
pc =0
mem =64* [0]for line in code:
fields = line.split()iflen(fields) >0and line[0] !='#':if line.lstrip() == line:
syms[fields[0]] = pcdel fields[:1]
f =globals()[fields[0]]
f(fields[1])
s ='\n'print(s.join([format(x, '02X') for x in mem]))
That’s the entire two-pass assembler. We need two passes because lines may refer to labels defined later on, so the first pass figures out all label values and the second pass then generates the final code.
The emulator for this charmingly-simple MCPU processor is equally small:
# MCPU emulator, see https://github.com/cpldcpu/MCPU# python asm.py <... | python emu.py# optional arg is # of cycles to run# show instruction trace if cycles < 1000from__future__import print_functionimportsystry:
cycles =int(sys.argv[1])except:
cycles =20
mem = [int(x, 16) for x in sys.stdin.readlines()]assertlen(mem) ==64
pc =0
acc =0
cf =0for _ inrange(cycles):
ir = mem[pc &0x3F]
pc +=1
op, arg = ir >>6, ir &0x3Fif cycles <1000:print('pc,ir,acc,cf:', pc-1, hex(ir), acc, cf >>8, sep='\t')if op ==0:
acc = (acc | mem[arg]) ^0xFFelif op ==1:
acc += mem[arg]
cf = acc &0x100
acc &=0xFFelif op ==2:
mem[arg] = accelse:if ir ==0xFF:print(acc)elif cf ==0:
pc = arg
cf =0
With these tools, we can now tackle a more substantial task - generate prime numbers:
org 61
zero dcb 0
allone dcb 255
one dcb 1
org 0
lda two
out *
start lda allone
add allone
sta subs
loop lda number
inner add subs
jcs inner
sub subs
add allone
jcc noprime
lda subs
add allone
sta subs
add number
add allone
jcs loop
lda number
out *
noprime lda number
add two
sta number
jmp start
two dcb 2
subs dcb 0
number dcb 3
Assembling the above code produces this:
3E 62 FF 3E 7E 7E A3 3E 64 63 CC C9 3D 63 7F 7E
DC 3E 63 7E A3 64 7E D9 C7 3E 64 FF 3E 64 62 A4
C3 C3 02 00 03 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF 01
If we feed that into the emulator, we get:
$ python asm.py <prime.asm | python emu.py 200000
2
3
5
[...]
229
233
239
$
Fifty two lines of output, and sure enough the last line is 239, which we can easily verify to be correct with Wolfram Alpha.
There you have it: an assembler + emulator which are simple enough to understand with just a little bit of code-reading.
(The way prime numbers are “computed” is quite a brain teaser in itself, by the way.)
Sooo… next time you start writing software on your computer, note that inside it, all of the above is happening - the pioneers who came up with these ideas live on as the giants whose shoulders we now stand on…
PS. The source files are available asgist.