When trying out things in Verilog, one of the struggles I have is understanding
what my “noob” code is doing. There’s no such thing as a printf here, obviously.
There’s simulation, but that only goes so far when tying into the
real world, and there are LEDs and 7-segment displays, which are trivial to
attach to some internal signals.
But that’s not enough: I want a transcript of a series of events, to see what happened, in the proper context and sequence.
My solution to this is something I’m calling an “SPI peek” device, which can be included in the FPGA and talks to an external µC.
SPI is a brilliant data exchange mechanism. It’s simple and it’ll handle more than 20 Mbit/s.
SPI is basically a circular shift register split into two parts: one half lives in the master, the other half lives in the slave. Three wires (SCLK, MOSI, and MISO) carry all the information, and a fourth enable wire (SS) delimits message boundaries.
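To make the “circular shift register” picture concrete, here is a small Python model of one exchange. It is only a sketch: the function name `spi_exchange` is made up for this illustration, and it assumes MSB-first shifting with both halves having the same width.

```python
def spi_exchange(master, slave, bits=8):
    """Clock `bits` bits around the ring formed by the two shift registers."""
    mask = (1 << bits) - 1
    for _ in range(bits):
        mosi = (master >> (bits - 1)) & 1   # master's MSB leaves on MOSI
        miso = (slave >> (bits - 1)) & 1    # slave's MSB leaves on MISO
        master = ((master << 1) | miso) & mask
        slave = ((slave << 1) | mosi) & mask
    return master, slave

# After a full round of clocks, the two halves have swapped contents.
m, s = spi_exchange(0xA5, 0x3C)
print(hex(m), hex(s))  # 0x3c 0xa5
```

Every clock moves one bit in each direction, so sending and receiving always happen at the same time.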
The SPI peek idea is to have, say, 32 bits on both sides, rapidly sending them across as often as needed. The enable signal (which is active low) is used as follows: on the falling edge, 32 bits are latched into the FPGA’s slave register. On the rising edge, the slave register is latched into an output register.
The result is that we can connect up to 32 internal FPGA signals to the slave’s input, and that we get 32 output signals to tie back into the FPGA to control it.
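The latching scheme can be modelled in a few lines of Python. This is a behavioural sketch, not the actual Verilog: the class `SpiPeekModel`, its method names, and the MSB-first bit order are assumptions made for the illustration.

```python
class SpiPeekModel:
    """Behavioural model of the latching described above (hypothetical names)."""

    def __init__(self, width=32):
        self.width = width
        self.mask = (1 << width) - 1
        self.shift = 0     # slave half of the SPI shift register
        self.outputs = 0   # output register, feeds back into the FPGA logic

    def select(self, fpga_inputs):
        """SS falling edge: capture the internal signals to be sent out."""
        self.shift = fpga_inputs & self.mask

    def clock(self, mosi):
        """One SPI clock: present the MSB on MISO, shift MOSI in at the LSB."""
        miso = (self.shift >> (self.width - 1)) & 1
        self.shift = ((self.shift << 1) | (mosi & 1)) & self.mask
        return miso

    def deselect(self):
        """SS rising edge: latch the received word into the output register."""
        self.outputs = self.shift


def transfer(dev, word, fpga_inputs):
    """Run one complete SS-low ... SS-high cycle from the master's side."""
    dev.select(fpga_inputs)
    reply = 0
    for i in reversed(range(dev.width)):
        reply = (reply << 1) | dev.clock((word >> i) & 1)
    dev.deselect()
    return reply


dev = SpiPeekModel()
reply = transfer(dev, 0x12345678, fpga_inputs=0xCAFEF00D)
print(hex(reply), hex(dev.outputs))  # 0xcafef00d 0x12345678
```

The outputs only change on the rising edge of SS, so the FPGA logic never sees a half-shifted word.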
As a first test, I’ve “connected” these virtual output pins to the 4-digit 7-segment display on my starter board: 4 pins to digit select, and 8 pins to the individual segments.
The attached µC is a HyTiny STM32F103, running this Forth code:
spi-init
%0000000001011100 SPI1-CR1 ! \ clk/16, i.e. 4.5 MHz, master
\ %0000000001001100 SPI1-CR1 ! \ clk/4, i.e. 18 MHz, master (max supported)
: >fpga> ( u -- u ) \ exchange 32 bits with attached FPGA, takes ≈ 7 µs
+spi
dup 24 rshift >spi> 24 lshift swap
dup 16 rshift >spi> 16 lshift swap
dup 8 rshift >spi> 8 lshift swap
>spi>
-spi or or or ;
\ verify that data comes back when loopback is set
depth . $12345678 >fpga> hex. depth .
depth . $90ABCDEF >fpga> hex. depth .
: scan1 ( u -- ) not $FFF and >fpga> drop 5 ms ;
: scanner
begin
$808 scan1
$404 scan1
$202 scan1
$101 scan1
key? until ;
scanner
In other words, the µC is acting as if it had the 7-segment display attached directly to its I/O pins, and does all the multiplexing and segment setup. The FPGA just passes the signals on to the real display hardware.
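For readers less familiar with Forth, the same data handling can be sketched in Python. The names `fpga_exchange` and `scan_word` are invented for this illustration, and the sketch assumes that `>spi>` transfers only the low 8 bits of its argument and that `not` is a bitwise invert.

```python
def fpga_exchange(word, spi_byte):
    """Send a 32-bit word as four bytes, MSB first, and reassemble the reply.

    `spi_byte` stands in for the Forth word `>spi>`: it takes one byte to
    send and returns the byte received during the same transfer.
    """
    reply = 0
    for shift in (24, 16, 8, 0):
        reply |= spi_byte((word >> shift) & 0xFF) << shift
    return reply


def scan_word(pattern):
    """Mirror `not $FFF and`: invert the pattern, keep the low 12 bits."""
    return ~pattern & 0xFFF


loopback = lambda byte: byte  # stand-in for an FPGA that echoes its input
print(hex(fpga_exchange(0x12345678, loopback)))  # 0x12345678
print([hex(scan_word(p)) for p in (0x808, 0x404, 0x202, 0x101)])
# ['0x7f7', '0xbfb', '0xdfd', '0xefe']
```

Each scan word covers the 12 display lines (4 digit selects plus 8 segments). The inversion suggests the lines are active low, though that is a reading of the code rather than something stated here.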
Once the proper segments lit up on each of the digits, I knew that this new SPI peek interface was working. Now, it can be used to develop new logic in the FPGA.
As a next test, I’ll send a 16-bit value and have the FPGA display it as a hex number.
Here is the Forth side, acting as test driver:
spi-init
[... same as above ...]
: >fpga> ( u -- u )
[... same as above ...]
: counter 0 begin 1+ dup >fpga> drop 250 ms key? until drop ;
counter
It’s nothing but a counter, sending a new 32-bit count to the FPGA every 250 ms.
Now, the FPGA does a bit more: converting each hex nibble to a 7-segment pattern, and rapidly multiplexing each of the digits:
(that’s the display, after 1343 seconds…)
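The nibble-to-segment conversion is easy to prototype outside the FPGA. The Python sketch below uses the conventional encoding (bit 0 = segment a up to bit 6 = segment g, active high) and a one-hot digit select. Both are assumptions: the actual Verilog and the board’s wiring may use a different bit order or polarity.

```python
# Conventional 7-segment patterns for 0..F, bit 0 = segment a, bit 6 = segment g.
SEGMENTS = [0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07,
            0x7F, 0x6F, 0x77, 0x7C, 0x39, 0x5E, 0x79, 0x71]


def digit_frames(value):
    """Return (digit_select, segments) pairs for a 16-bit value.

    Digit 0 is the least significant hex digit. Showing the four frames
    in quick succession is what multiplexes the display.
    """
    frames = []
    for digit in range(4):
        nibble = (value >> (4 * digit)) & 0xF
        frames.append((1 << digit, SEGMENTS[nibble]))
    return frames


for select, segs in digit_frames(0x053F):
    print(f"{select:04b} {segs:07b}")
```

In hardware, the lookup table would typically become a `case` statement and the loop a small counter that steps through the digits.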
For the SPI peek device code, see GitHub. Here is a Verilator (PDF) simulation run.
SPI peek works up to ≈ 1/10th the FPGA’s clock rate (tested w/ 18 Mb/s @ 200 MHz).
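These figures can be sanity-checked with a little arithmetic. The helper `transfer_time_us` is a made-up name, and it counts only the raw bit time, ignoring any per-byte overhead on the µC side.

```python
def transfer_time_us(bits, sclk_hz):
    """Raw time in microseconds to clock `bits` bits at `sclk_hz`."""
    return bits / sclk_hz * 1e6


print(round(transfer_time_us(32, 4.5e6), 2))  # 7.11, matching the ≈ 7 µs above
print(round(transfer_time_us(32, 18e6), 2))   # 1.78
print(18e6 / 200e6)                           # 0.09, i.e. roughly 1/10th
```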
What I really like about this setup is that I can prototype a bit of logic in Forth on the µC side, setting up the FPGA to simply pass through all its signals, and then gradually convert parts into Verilog code to make the FPGA take over specific functionality.
Verilog synthesis is a very slow process. With Forth’s interactive peeking, poking, and fast-paced coding cycle, the whole process becomes a lot more… enjoyable!
Update - SpiPeek has been optimised to use 30% fewer LEs and can run 2x as fast.