Synthesizing formal models of hardware from RTL for efficient verification of memory model implementations

Yao Hsiao, Dominic P. Mulligan, Nikos Nikoleris, Gustavo Petri, Caroline Trippel
[doi] [ISBN] [Google Scholar] [DBLP] [Citeseer] [url]
Read: 20 August 2022

MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture
Virtual Event, Greece
Association for Computing Machinery
New York, NY, USA
Pages 679-694
Note(s): ISA specification, weak memory, CPU verification, uspec
Papers: manerkar:micro:2017

RTL2uspec fills a gap in the uspec methodology by automatically lifting RTL into uspec models. It works by producing an initial overapproximation of the dependencies and then refining that using a bounded model checker (jasper gold) to check that all the dependencies are valid.

Input is the RTL of a processor plus some manual labelling with the PC that applies to various stages. The netlist for the RTL is transformed to a graph with registers as nodes and combinational logic collapsed into the edges; nodes are numbered by shortest distance from I-fetch. For each instruction, extract a sub-graph based on which nodes are potentially written to by that instruction - this is the execution path of that instruction. (Limitation: only really works for processors that have a single execution path for each instruction which rules out use of store buffers, caches, bypass logic, etc.)

The resulting overapproximation is a bit like the result of a taint analysis - it says that there might be a data dependency but does not guarantee that there is. This is refined by instantiating some SVA templates that check each edge in the graph and that each node is reachable.

The nice thing about this approach is that it is a very local analysis: mostly just analysing combinational logic. So it scales really well. The evaluation compares against RTLCheck manerkar:micro:2017 (which tried to verify the RTL directly against the uspec spec) on a simple RISC-V core (that does not have store buffers, caches, etc.). The approach of first lifting the spec scales much better. They also found a bug in the processor’s handling of undefined instructions.