To formally verify Rust programs, we are building coq-of-rust, a translator from Rustย ๐ฆ code to the proof system Coqย ๐. We generate Coq code that is as similar as possible to the original Rust code, so that the user can easily understand the generated code and write proofs about it. In this blog post, we explain how we are representing side effects in Coq.
๐ฆ Side effects in Rustโ
In programming, side effects are all what is not representable by pure functions (mathematical functions, functions that always return the same output for given input parameters). In Rust there are various kinds of side effects:
- errors (the panic! macro) that propagate and do appear in the return type of functions,
- non-termination, with some potentially non-terminating loops (never returning a result is considered as a side-effect),
- control-flow, with the
break
,continue
,return
keywords, that can jump to a different part of the code, - memory allocations and memory mutations,
- I/O, with for example the println! macro, that prints a message to the standard output,
- concurrency, with the thread::spawn function, that creates a new thread.
๐ Coq, a purely functional languageโ
Like most proof systems, Coq is a purely functional language. This means we need to find an encoding for the side effects. The reason for most proof systems to forbid side effects is to be logically consistent. Otherwise, it would be easy to write a proof of False
by writing a term that does not terminate for example.
๐ฎย Monads in Coqโ
Monads are a common way to represent side effects in a functional language. A monad is a type constructor M
:
Definition M (A : Set) : Set :=
...
representing computations returning values of type A
. As an example we can take the error monad of computations that can fail with an error message, using the Result type like in Rust:
Definition M (A : Set) : Set :=
Result A string.
It must have two operators, Pure
and Bind
.
The Pure
operatorโ
The Pure
operator has type:
Definition Pure {A : Set} (v : A) : M A :=
...
It lifts a pure value v
into the monad. For our error monad, the Pure
operator is:
Definition Pure {A : Set} (v : A) : M A :=
Ok v.
The Bind
operatorโ
The Bind
operator has type:
Definition Bind {A B : Set} (e1 : M A) (f : A -> M B) : M B :=
...
It sequences two computations e1
with f
, where f
is a function that takes the result of e1
as input and returns a new computation. We also note the Bind
operator:
let* x := e1 in
e2
assuming that f
is a function that takes x
as input and returns e2
. Requiring this operator for all monads shows that sequencing computations is a very fundamental operation for side effects.
For our error monad, the Bind
operator is:
Definition Bind {A B : Set} (e1 : M A) (f : A -> M B) : M B :=
match e1 with
| Ok v => f v
| Err msg => Err msg
end.
๐ง State, exceptions, non-termination, control-flowโ
We use a single monad to represent all the side effects that interest us in Rust. This monad is calledย M
and is defined as follows:
Definition RawMonad `{State.Trait} :=
...
Module Exception.
Inductive t (R : Set) : Set :=
| Return : R -> t R
| Continue : t R
| Break : t R
| Panic {A : Set} : A -> t R.
Arguments Return {_}.
Arguments Continue {_}.
Arguments Break {_}.
Arguments Panic {_ _}.
End Exception.
Definition Exception := Exception.t.
Definition Monad `{State.Trait} (R A : Set) : Set :=
nat -> State -> RawMonad ((A + Exception R) * State).
Definition M `{State.Trait} (A : Set) : Set :=
Monad Empty_set A.
We assume the definition of some RawMonad
for memory handling that we will describe in a later post. Our monadย M
is a particular case of the monadย Monad
with R = Empty_set
. It is a combination four monads:
- The
RawMonad
. - A state monad, that takes a
State
as input and a return an updated state as output. The traitState.Trait
provides read/write operations on theState
type. - An error monad with errors of type
Exception R
. There errors include theReturn
,Continue
,Break
andPanic
constructors. TheReturn
constructor is used to return a value from a function. TheContinue
constructor is used to continue the execution of a loop. TheBreak
constructor is used to break the execution of a loop. ThePanic
constructor is used to panic with an error message. We implement all these operations as exceptions, even if onlyPanic
is really an error, as they behave in the same way: interrupting the execution of the current sub-expression to bubble up to a certain level. - A fuel monad for non-termination, with the additional
nat
parameter.
The parameter R
of the type constructor Monad
is used to represent the type of values that can be returned in the body of a function. It is the same as the return type of the function. So for a function returning a value of type A
, we define its body in Monad A A
. Then, we wrap it in an operator:
Definition catch_return {A : Set} (e : Monad A A) : M A :=
...
that catches the Return
exceptions and returns the value.
Conclusionโ
We will see in the next post how we define the RawMonad
to handle the Rust state of a program and memory allocation.
If you have a Rust codebase that you wish to formally verify, or need advice in your work, contact us atย contact@formal.land. We will be happy to set up a call with you.