The meaning of memory safety

Arthur Azevedo de Amorim, Cătălin Hrițcu, Benjamin C. Pierce
[ISBN] [Google Scholar] [DBLP] [Citeseer]
Read: 17 April 2021

Principles of Security and Trust
Springer International Publishing
Cham
Pages 79-105
2018
Note(s): memory safety, separation logic, non-interference, Coq theorem prover, Rust language, Cyclone language

Aims to define memory safety in terms of reasoning principles it enables instead of the errors that memory unsafety manifests. The key idea is the separation logic notion of a function’s “footprint” – whether specified by a separation logic predicate or by reachability (from free variables / arguments).

Discussed in the context of a simple imperative language that includes local variables, heap objects, conditionals, loops and pointer arithmetic but whose semantics uses some form of fat pointer so that the underlying object can always be recovered. Also assumes underlying objects cannot be distinguished and that memory is unbounded.

Initial set of theorems

  1. Adding more memory doesn’t affect result of terminating program.
  2. Allocating objects in a different order doesn’t affet result.
  3. Adding more memory doesn’t make non-terminating programs terminate.
  4. Adding more memory doesn’t make erroneous programs non-erroneous.

Corollary (non-interference): Inaccessible memory neither interferes with nor is interfered with by a command. Note that this captures both integrity and secrecy; they hint at connections between secrecy and relational program logics.

An initial attempt (theorem 5) at a frame rule following from theorems 1 and 3 is not very useful because it only works for non-erroneous commands – an overly restrictive requirement.

A revised attempt (theorem 6) uses a relaxed Hoare-style contract that allows errors and replaces the ‘separating conjunction’ “p * r” with the ‘isolating conjunction’ “p |> r” where none of the object identifiers in the state described by p overlap with the objects in the state described by r. (The difference is the addition of the word “identifiers”). Use of the isolating conjunction ensures that the fragment satisfying ‘r’ is reachable from the rest of the state.

In a realistic language, various parts of these guarantees are lost

  • memory is finite - breaking integrity by allowing the outcome to depend on available free memory
  • time is observable - breaking secrecy and integrity
  • pointer values may be partly observable (e.g., can be ordered)
  • memory can be allocated without initialization (perhaps restricted to having a non-pointer type) - breaking secrecy
  • dangling pointers, double-free, etc. - breaking integrity
  • pointer spoofing - e.g., Javascript allows all globals to be accessed via a string with its name: effectively allowing pointer spoofing.

The reason about a heap-safety monitor for machine-code.

Everything is formalized in Coq theorem prover.

The related work points to other formalizations, languages such as garbage collected languages, Cyclone language and Rust language that relax one or more of the assumptions, etc.


Memory safety