Dagstuhl: Formal Methods for Correct Persistent Programming
Last week, I boarded the train to Wadern, Germany (through London, Paris and Saarbrücken) to attend Dagstuhl seminar 23412 on “Formal Methods for Correct Persistent Programming”.
Researcher at Intel
Last week, I boarded the train to Wadern, Germany (through London, Paris and Saarbrücken) to attend Dagstuhl seminar 23412 on “Formal Methods for Correct Persistent Programming”.
I was long overdue to attend a conference in person: the last conference I attended was POPL 2020.
This year’s PLDI was in Orlando, Florida at the height of a hot, humid summer in the largest conference
center I have ever seen (we did a lot of walking within the conference center).
PLDI was co-located with other conferences including
the hardware conference ISCA as part of the Federated Computing Research Conference (FCRC)
and I dropped in on a few of the ISCA talks.
A couple of weeks ago, I attended PLARCH 2023: a new workshop about
the intersection between Programming Languages and Computer Architecture.
There was a lot of interest in attending and speaking at the workshop so the
program consisted of a lot of short talks with group discussions in between.
Programming languages provide modules as a way of splitting large programs
up into small, separate pieces.
Modules enable information hiding that prevents one part of the program
from using and becoming dependent on some internal detail of how
other parts of the program are implemented.
Almost every major language designed in the last 50 years has some
form of module system.
There are lots of potential uses for machine readable specifications
so you would think that every major real world artifact like long-lived
hardware and software systems, protocols, languages, etc. would have
a formal specification that is used by all teams extending the design,
creating new implementations, testing/verifying the system, verifying
code that uses the system, doing security analyses or any of the other
potential uses.
But, in practice, this is usually not true: most real world systems do not
have a well tested, up to date, machine readable specification.
Today I am joining Intel Strategic CAD Labs to work on formal specifications.
There’s not much to say so far because I’ve only just started but
I think it’s going to be a lot of fun (and a lot of work).
I figure that a lot of the first month or two will be spent figuring out what
people want to do with formal specifications so, if you think that this will be
relevant to you, please get in touch.
I am equally interested in uses that are on my list of uses for ISA specifications
and uses that I haven’t even thought of.
I’ve spent the
last couple of years working at Google Research. For the first 5–6 months, I
was working in London in an office nestled between King’s Cross station and
the iconic St Pancras station. This location was ideal for me because I was
still living in Cambridge at the time and it was a very easy journey to
either of these stations from Cambridge.
ISA specifications describe the behaviour of a processor: the instructions, memory protection, the privilege mechanisms, debug mechanisms, etc. The traditional form of an ISA specification is as a paper document but, as ISAs have grown, this has become unwieldy. More importantly though, there are more and more potential uses for machine readable, mechanized, executable ISA specifications.
Last year, I wrote about the 122 papers that I read in my first year at Google and that I summarize on the RelatedWork site. Over the last 18 months or so, I’ve spent a lot less time doing the one hour train commute between Cambridge and London so I only read 59 papers in the last year and added 188 papers to the backlog of unread papers. You can read my summaries of all the papers, read notes on common themes in the papers, and download BibTeX for all the papers.
There are lots of hard problems in industrial research
but one of the hardest is timing.
If you start a project too early, your users
are focused on other problems and don’t yet feel a
need for your solution.
And, if you start a project too late, your
users have already hit the problem and you
have to work round whatever solution they have
found.
And so it is with this project.[^but-also-other-things]
At the moment, our potential users are still working on all the things
you must have to support a new language in a large company
and there is not a strong pull for a new, unproven research
idea.
So, as we close the project down, I thought it would be good
to have a bit of a retrospective on what the project was
trying to do and how.
To conclude this series on using KLEE with Rust for Linux,
we are going to write a simple verification harness
for testing the
rust_semaphore
device driver written in Rust
and symbolically execute the verification harness in KLEE.
It’s worth repeating that the goal of these posts is not to
verify the Rust for Linux code or to find bugs in it.
Instead, my goal is that you should be able to apply
these ideas to do that verification yourself.
You might choose to continue in the same direction as these posts suggest
by creating better mocks and test harnesses.
Or you might want to try something else based on different answers
to the questions in the first part of this series;
or, perhaps, using some tool other than KLEE
that I described in the second part of the series.
The code changes described in this post are
in this branch of my fork of Rust for Linux.
Using tools like the KLEE symbolic execution tool or bounded model checkers
with Linux kernel code is still a bit of a black art.
The first part of this series
on using KLEE on the Rust for Linux
considered what we would want to check.
This second part, digs deeply into how to prepare the codebase for use with
LLVM-based tools like KLEE.
(Warning: it may contain far more detail than most people are interested in.)
The final part will
show how we can build simple verification frameworks.
The code changes described in this post are
in this branch of my fork of Rust for Linux.
The Rust for Linux project is working on adding support for the Rust language
to the Linux kernel
with the hope that using Rust will make new code more safe, easier
to refactor and review, and easier to write.
(See the RFC for more detail about goals and for the varied
responses of the Linux Kernel community.)
In larger, long-running research projects, it can be useful to
make a list of all the risks and research questions you have
and use that as one way to guide what you tackle
first.
This is a list that we considered
as we were planning the Rust verification project.
Although some of the questions are specific to our particular context (what
we want to achieve, who we want to help, etc.), I think that
many of the questions apply to any project that aims to help
developers to verify their own code.
The goal of our Rust Verification project is to help programmers be more
productive using tools and techniques derived from the formal verification community.
We wrote about this in the paper “Towards making formal methods normal: meeting developers where they are”
and I described it recently in the Building Better Systems podcast but
I think there’s a few things that did not come over clearly in those.
Research is uncertain. It’s not clear what problems you will hit. It’s not clear how many problems you will hit. It’s not clear how long success will take or what success will look like or whether you will even succeed. When we talk about research, we often focus on the ideas, the sudden insight, the stroke of genius, standing on the shoulders of giants, etc. We sometimes talk about uncertainty and persistence. But we very rarely talk about risk and, most importantly, how to manage it.
A lot of our work over the last year was on identifying and fixing obstacles to using KLEE with Rust
and the main technique we used for finding new obstacles was to try to use
KLEE with different Rust programs and libraries.
One of the largest suites of programs we tackled was the Rust CoreUtils
library: a Rust rewrite and drop in replacement for the GNU CoreUtils suite
that includes programs like ls, cp, df, cat, and about 90 other standard Unix shell
commands.
Last year, I wrote a summary of all the Rust verification tools
that I knew about.
But formal verification of Rust is a fast-changing field so I took advantage
of all the experts at the 2021 Rust Verification Workshop
to make an up to date list of tools.
This post focuses on the automatic verification tools.
It is inevitable that automatic verification tools will have performance
problems because they push up against “the decidability
ceiling”: trying to solve undecidable problems and
often getting away with it.
In an earlier article, we looked at how to profile the
verification process to find which part of your program is causing the problem.
But that is only half the problem: we need to actually fix the problem.[^not-even-half]
So this article looks at one way that we can fix performance bottlenecks
when verifying Rust code using the KLEE symbolic execution tool.
In particular, it looks at using path-merging to overcome the path explosion problem.
Research is characterized by allowing yourself to make mistakes: performing
experiments; drawing conclusions; later, realizing that your
experiment was not sufficient and you got it wrong; and
trying again.
Back in March, we thought that we knew how to deal with vectorized Rust: tell the compiler not to auto-vectorize code; tell the compiler
not to use vector instructions; and use existing conditional compilation
feature flags to disable hand-vectorized code.
Unfortunately, two of those three ideas don’t work – but we think we have a
viable approach now.
A lot of our work on Rust formal verification is based on LLVM based tools
and, in particular, the KLEE symbolic execution tool that can be used
to find bugs and to generate high coverage testsuites.
We still have more work to do but it’s a good time for a Rust/KLEE status update.
One of the major themes in our work on Rust verification is
eliminating verification obstacles: things that mean that you
can’t even run a verification tool on your code.
So we have worked on how to verify cargo crates,
how to verify code that uses the Rust FFI
and how to verify programs with command line arguments.
One recurring obstacle is that some code depends on processor-specific
intrinsic functions.
For example the Aho-Corasick crate supports fast string searches
using SIMD acceleration and uses intrinsics to access the x86 architecture’s
AVX2 or SSE2 vector extensions if they are available.
Although verification tools could support these intrinsics,
most of them do not – so if your program uses Aho-Corasick (or
any crate that depends on it like the regex crate), then you won’t be
able to verify your program.
Automatic formal verification is usually pushing up against what Leino calls “the
decidability ceiling”: pushing the tools beyond what
they can be guaranteed to solve in some reasonable time, taking the risk that
the tools will blow up, but often getting away with it.
But what can we do when the toast lands butter-side-down?
This is a summary of a quick investigation to find out if
anybody had any answers. Tl;dr: they do.
Rust is able to call C code using the FFI (Foreign Function Interface).
This note describes how to verify crates that consist of a mixture of
Rust code and C code that is built using a
build script
such as
Alex Crichton’s cc-rs crate.
(If your crate calls a separate C library (e.g., libX11 or libSSL),
you will need to do a bit more, although this note may be a useful starting
point.)
What can we do when designing Rust code to make it easier to test?
This is a survey of everything I could find[^survey-method] about
testing Rust with a particular focus on design for testability for
correctness. Some of the articles show multiple things to do on a
worked example, some are more focused on a particular trick.
This is an overview of the papers that I have read over the last 12 months since I joined Google Research. Over the last year, I have read 122 papers and I have added 364 to my paper backlog. You can read my summaries of all the papers, read notes on common themes in the papers, and download BibTeX for all the papers. As the number of papers grew, I reorganized several times and the site became a bit like a Zettelkasten.
One important difference between C and Rust is that the C main function expects
to be given a list of command line arguments via
argc
/argv
function
parameters while Rust programs access their command line arguments via the
sys::env::args()
library function.
Dijkstra famously dissed testing by saying “Program testing can be used to show
the presence of bugs, but never to show their absence!”
As if you should choose one over the other.
I don’t see them as opposites but as complementary techniques that
should both be used to improve the quality of your code.
The goal of the tools and library in this repository is to let you verify
interesting things about non-trivial programs.
Unfortunately, interesting/non-trivial programs are too large for introducing
you to a tool so, for now, we will consider this trivial program.
Our tools for verifying Rust programs are based on the
verification-annotations
API
that provides a common API for multiple tools to use.
While we recommend that you use the higher-level propverify
library
it is useful to describes how to use the verification-annotations
library
in case you wonder how propverify
works.
One of the most straightforward and most solid automatic verification tools is the KLEE
symbolic execution tool that can be used to search for bugs in programs.
KLEE was originally developed for C but, because KLEE and the Rust compiler
are both based on the LLVM platform, it is possible to use KLEE
with Rust programs.
The Rust language and the Rust community are really interesting if you are
want to build better quality systems software.
Since joining Google in September, I have spent a lot of time reading about software verification tools and one of the things that keeps coming up is that there are competitions between different verification tools.
ASLi
is an interpreter for
Arm’s Architecture Specification Language.
To use it, you need to download the
MRA tools,
Arm’s
Architecture Specifications
and, of course,
ASLi.
(Note: that I am linking to an updated version of ASLi
instead of to
Arm’s official release
– I had to extend ASLi a bit to be able to execute programs with it.)
Facebook London has built a great team to bring the latest bugfinding techniques into Facebook’s development process. A lot of the time, they are having to develop those techniques themselves. But, they also give grants to academic groups to encourage them and they hold an annual two day symposium to talk about the challenges, progress, techniques, etc. The symposium was open to anybody interested in the topic: I met Facebook staff, many academic researchers (professors and PhD students), people from some of the other major tech companies, people from automative industry, entrepreneurs creating bugfinding tools, and many others. I think it is great that Facebook is investing in developing the testing and verification community in this way.
Just over a month ago, I joined Google Research to work on
Project Oak
and, in particular, formal reasoning about the Hafnium
Hypervisor.
It feels as if the idea of having a formal specification of a processor has turned a corner. Instead of having incomplete specs for parts of some architectures, we now have specs for Arm and RISC-V that are complete enough that you can boot an OS on them and we have complete specs of the x86 instruction set. Instead of having specs that are tied to some particular project/purpose, we have flexible specs that can be used to reason for many different purposes. So this is a great time to hold a workshop for the people working on all the different specs and applications to get together, compare notes and identify future challenges.
After 15 great years at Arm, I’m making a change.
You can use an SMT solver to find a solution to a set of constraints.
But what happens if you want to find multiple solutions?
My previous posts in this series have looked at
how you can turn execution traces into SMT problems
and at how you can
use an SMT solver to enumerate all paths through a specification.
In this post, I’ll look at how you can generate multiple inputs that will
trigger each path.
This can be useful if you are trying to generate testcases although it is good
for other things too.
SMT solvers are incredibly flexible tools for analyzing complex systems. In
my previous post,
I showed how you can: generate a symbolic execution trace from running an
instrumented interpreter on some input values; turn the trace into an SMT
circuit; and use an SMT solver to check that the SMT circuit matches
your original trace.
This post will explore how we can check assertions, array index bounds, etc.
to find bugs in the specification and how we can enumerate all the paths
through a specification.
In a previous post about formal validation of the Arm architecture I sketched how you
can reason about Arm’s processor specifications using SMT solvers such as
Z3,
Boolector and
CVC4. The main idea is that you translate
ASL (the language that Arm’s specifications are written in) to SMT-Lib2 (the
language that SMT solvers accept).
Last week’s S-REPLS
keynote by Sylvan
Clebsch was a talk
about the limitations of current microprocessor architecture and how it hides
everything of interest from the programmer: instruction level parallelism is
hidden behind out-of-order execution, message passing is hidden behind cache
coherency, etc. This reminded me that I have been meaning to write about the
SoC-C project that aimed to make parallelism
explicit by adding some language extensions to C and that achieved very high
performance, good scaling and high energy efficiency by using compiler tricks
and some programmer annotations to let us exploit a fairly bare-bones parallel
system.
One of the tantalising pieces of information contained in ARM’s machine
readable specifications is a specification of the assembly syntax.
A few years ago (on an earlier) version of the specification, Wojciech Meyer
and I decided to try to transform this specification into assemblers and
disassemblers.
At the time, this was not very useful to ARM because we already had assemblers
and disassemblers so, although technically successful, the project died and
the code has been slowly bitrotting ever since.
In a few days time, I will be giving a talk at the 34th Chaos Communication
Congress
[video]
[pdf]
in
Leipzig about practical things you can do with ARM’s specification and
I thought it would be a good idea to suggest that someone creates a similar
tool.
But maybe it would be a good idea if I showed you what Wojciech and I did to
get you started?
In my last post about natural language specifications, I talked about the problem with
executable specifications: they are great for specifying existing behaviour but
they are bad about describing overall properties and, in particular, what is
not allowed to happen. This makes it possible that extensions that add
new behaviour could break fundamental properties of the specification without
anybody noticing.
In my efforts to create a formal specification of the Arm architecture, I have
focussed on the parts written in “pseudocode”. I have reverse engineered the
ASL language hiding inside the pseudocode to create a formal, executable
specification that I can execute and test. In the process, I have tended to
ignore all the natural language prose that makes up the bulk of the 6,000 page
Arm Architecture Reference Manual. In defiance of Betteridge’s
Law, this
article is going to explain how I finally found a use for all that prose.
Three months ago, Arm released version v8.2 of its processor architecture
specification. Arm’s release includes PDF and HTML for the specification but
what makes this specification unusual is that it includes a machine readable
spec as well. The machine readable spec contains the instruction encodings, the system register
encodings and an executable specification (written in
ASL)
of each instruction and all the
supporting code such as exception handling, interrupt handling, page table
walks, hypervisor support, debug support, etc.
In my post about dissecting the ARM Machine Readable Architecture files
I described how to extract the ASL code from the XML files that ARM provides.
In this post, I will describe how to start processing that code by examining
the lexical syntax.
In doing so, I will be going back to one of the first things I had to figure
out when I started trying to use ARM’s documentation as an executable
specification so I will be looking at code I have barely thought about in
6 years and trying to remember my thought processes at the time as
I reverse engineered the language lurking inside ARM’s pseudocode.
Last week, ARM released their Machine Readable
Architecture Specification and
I wrote about what you can do with it.
But before you can do anything with the specification, you need
to extract the useful bits from the release so I thought I would try
for myself and describe what I found out (and release some
scripts that demonstrate/test
what I am saying).
The device you are reading this post on consists of a very tall
stack of layers - all the way from transistors and NAND gates all the
way up to processors, C, Linux/ Android/ iOS/ Windows to the browser.
Each of these layers may be written by a different team possibly in a different
company and the interface between these layers is documented and specified
so that each team knows what it can assume and what it must provide.
One of the most important interfaces in this stack is the one between
hardware and software that says what a processor will do if it is configured
a certain way, provided with page tables, interrupt handlers, put in a certain
privilege level and finallyprovided with a program to execute.
If you want to write a compiler, operating system, hypervisor or security
layer, then you need to know how the processor will behave.
If someone gives you a virus and asks you to figure out how it works
and how to defend against it, then you need to know how the processor will
behave.
What language should you write a specification in? Should you use the language
supported by your favourite verification tool (HOL, Verilog, …)?
Or should you write it in a DSL and translate that to whatever your
verification tool needs?
As I was explaining the ISA-Formal technique for verifying ARM processors to people, I realized that it was
important to be clear about what ISA-Formal does not verify and why.
Part of the reason why the ISA-Formal technique for verifying ARM
processors is so effective and
so portable across different ARM processors is the fact that we directly use the
ARM Instruction Set Architecture (ISA) Specification in our flow.
That is, I translate ARM’s official printed documentation into something that
I can load into a model checker alongside ARM’s processor Verilog and I verify
that the two match each other.
Probably the most important thing I didn’t say enough in my paper about
verifying ARM processors
is why we focus on finding bugs.
I have spent the last 5 years working on ARM’s processor specifications:
making them executable, testable, mechanizing them, formalizing them,
making them trustworthy, etc.
The difficult thing about Industrial Research Labs is that you don’t always get to talk about what you are doing so, for the last five years or so, I haven’t published anything except the occasional patent and I have given only abstract descriptions to most of my friends.
I write lots of minor documentation files: Readme’s; release notes; instructions for checking out and building the project; weekly reports; todo lists; internal documentation; etc. If I want to keep these in sync with the source code they refer to, then I need to put it in git/Hg/SVN/CVS version control which rules out using MS Word (not that I was seriously planning to use Word). So it’s plain text then…
I don’t have a problem. My wife thinks it strange that I always have a pen in my pocket - but I feel that she also admires the practicality of it. Almost the first thing I do on getting a new computer is to install software for taking and cross-referencing notes - but everyone does that, don’t they? And when I’m travelling and want to be able to take notes without whipping out a laptop and typing a password, then I use a pocketmod.
My first laptops came with docking stations - hefty beasts you clip your laptop into with USB, DVI, power, printer and possibly even RS-232 connectors. Convenient at the desk but too bulky to travel with so it sits on my desk at work and doesn’t come home with me (or sits at home and …)
I’ve been using a Synology Diskstation for a while as a Network Attached Storage box for sharing music and photos and as a TimeMachine server but I’ve always known that it can do a lot more and I wanted a private git server so I googled and found these useful pages:
I have been wanting to experiment with ARM’s new 64-bit architecture for a while and now ARM has released a simulator.
What does it take to turn a Mac into a decent programming environment?
I recently got a new home computer - my first Mac. As with any new machine, it takes a while to set it up just as you like it. Here are some of the basics that I started off with.
Historically, I’ve changed jobs every 5-6 years; I’ve been at ARM over 7 years
now. When I mentioned this to my manager, he started wondering where this was
heading. In fact, I was explaining why I felt it was time for me to shift my
focus away from parallelism, vector processing, performance optimization, etc.
and into something I didn’t know much about. It’s a fast moving industry and
if you don’t keep moving, your skills become irrelevant and if you don’t make
a big enough shift, you won’t take the risks you need to take to achieve
something new. (A very bad paraphrase of Richard Hamming.)
When I worked for universities, I had a very visible public presence. Anyone I talked to could easily google for me and find my recent projects, papers, etc. Without even trying, I was the first hit on Google. But 7 years ago, I moved to industry and suddenly I was invisible. I had no web page, there was no single collection of all my publications, nothing about recent projects, nothing about the people I work with. And I often can’t say much about my current project until all the patents are filed or because we’re working with another company and there are confidentiality requirements.