We are working on formally verifying the ๐ฆย Rust implementation of the Move type-checker for bytecode in the proof system ๐ย Coq. You can find the code of this type-checker in the crate move-bytecode-verifier.

This requires translating all the Rust code in idiomatic Coq on which we will write our specifications and proofs. We write this translation by hand relying as much as possible on generative AI tools such as GitHub Copilot, as there are many particular cases. We plan, eventually, to prove it equivalent to the translation automatically generated by coq-of-rust.

In this blog post we present how we organize our ๐ฎย monad to represent the side-effects used in this Rust code. We believe this organization should work for other Rust projects as well.

To ensure your code is secure today, contact us atย ย ๐งcontact@formal.land!ย ๐

Formal verification goes further than traditional audits to make 100% sure you cannot lose your funds. It can be integrated into your CI pipeline to make sure that every commit is correct without running a full audit again.

We make bugs such as the DAO hack ($60 million stolen) virtually impossible to happen again.

## Primitive effectsโ

In functional programming, effects (or side-effects) are every operation that cannot be directly represented as a mathematical function, that is to say, a procedure that returns an output purely based on the value of its inputs and does nothing else. For example, the function returning the current time makes an effect as it depends on a hidden state (the current time) that is not passed as an argument. The function printing a message to the console makes an effect as it modifies the state of the console, in addition to returning a value that is generally either empty or a confirmation of the printing. Arithmetic operations (`+`

, `*`

, ...) are an example of pure functions.

We consider three primitive effects in our Rust code:

**Panic**For many reasons, a Rust program can panic, as a result of an out-of-bounds access to an array or a wrong unwrap, for example. This is an effect as no outputs are returned in case of a panic.**Result**The`Result`

type is used to represent the result of a computation that can fail. It is a sum type with two constructors:`Ok`

for the successful result and`Err`

for the error. The Rust operator`?`

is used to propagate errors in a function that returns a`Result`

. This is another effect for us.**State**Finally, we consider the functions that mutate one of their arguments as effectful. The mutated parameter is generally typed as a mutable reference`&mut`

.

All our Coq definitions to represent the effects are in the file simulations/M.v.

### Panicโ

We define a monad `Panic.t`

to represent the effect of a panic with:

`Module Panic.`

Inductive t (A : Set) : Set :=

| Value : A -> t A

| Panic {Error : Set} : Error -> t A.

Note that the type `Error`

in this position is an existential type. This has a few consequences:

- We do not need to annotate the type
`Panic.t`

with the type of the error. - We can use any type for
`Error`

when we trigger a panic operation. This is useful for debugging, as we can add any payload to the panic message to help us understand what went wrong. - We cannot compute on the panic payload. We do not consider this as a limitation, as panics should not be caught and handled in a Rust program, only propagated.

We define the monadic *return* and *bind* operations as usual:

`Definition return_ {A : Set} (value : A) : t A := Value value.`

Definition bind {A B : Set} (value : t A) (f : A -> t B) : t B :=

match value with

| Value value => f value

| Panic error => Panic error

end.

We introduce notations based on the exclamation mark `!`

to make the code more readable:

`Notation "M!" := Panic.t.`

Notation "return!" := Panic.return_.

Notation "'let!' x ':=' X 'in' Y" :=

(Panic.bind X (fun x => Y))

(at level 200, x pattern, X at level 100, Y at level 200).

### Resultโ

We define the monad `Result.t`

to represent the propagation of errors with the `?`

operator with:

`Module Result.`

Inductive t (A Error : Set) : Set :=

| Ok : A -> t A Error

| Err : Error -> t A Error.

The difference with the `Panic.t`

monad is that the error type is not existential anymore. This is because we want to be able to compute on the error payload, as some functions depend on the error value.

We define the *return* and *bind* operations as:

`Definition return_ {A Error : Set} (value : A) : t A Error := Ok value.`

Definition bind {Error A B : Set} (value : t A Error) (f : A -> t B Error) : t B Error :=

match value with

| Ok value => f value

| Err error => Err error

end.

The *bind* corresponds to the question mark operator `?`

in Rust. We also introduce notation to make the code more readable:

`Notation "M?" := (fun A Error => Result.t Error A).`

Notation "return?" := Result.return_.

Notation "'let?' x ':=' X 'in' Y" := ...

### Stateโ

Finally, we define the monad `State.t`

ย ๐บ๐ธ to represent the effect of one or several mutable references with a mutable state typeย `S`

:

`Module State.`

Definition t (State A : Set) : Set := State -> A * State.

Definition return_ {State A : Set} (value : A) : t State A :=

fun state => (value, state).

Definition bind {State A B : Set} (value : t State A) (f : A -> t State B) : t State B :=

fun state =>

let (value, state) := value state in

f value state.

The stateย `S`

will typically be the tuple of all the current mutable references in the Rust code. We use notations based on the letterย `S`

.

We also introduce lens operations that mimic how we can extract a mutable reference to the part of a data structure from a mutable reference to the whole data structure in Rust. Here is the definition of the lens type:

`Record t {Big_A A : Set} : Set := {`

read : Big_A -> M! A;

write : Big_A -> A -> M! Big_A

}.

The `read`

and `write`

operations correspond to the dereferencing and the assignment of a mutable reference in Rust. The type `Big_A`

is the type of the whole data structure, and the type `A`

is the type of the part that we are referencing. These primitives might fail (there are in the panic monad) if the mutable reference is not valid, for example, for an out-of-bounds access in an array or an invalid case in an enum.

We can use a lens to lift a computation that operates on a part of a data structure to a computation that operates on the whole data structure. We provide various *lift* operators to help with this.

## Combinaisonsโ

Depending on the Rust code we want to translate, we might need to use none, one, or several of the effects above. We explicitly define all the possible combinations of the above monads, as well as return operations to go from one monad to another, more general monad.

The special case is for the combination of the panic and state effect. When a panic occurs, we do not return the resulting state, as we are not supposed to continue the evaluation after a panic so the current state should not be relevant. We lose the information about the state of the program when a panic occurs, which can be a limitation for debugging, but:

- It simplifies some definitions of simulations, and forces us not to speak about the specification of a state after a panic, what should not be relevant.
- We can still return the current state as an additional payload in the panic operator. This is actually what our panic operator does by default.

The most complete monad combines all the effects:

`Module StatePanicResult.`

Definition t (State Error A : Set) : Set :=

MS! State (M? Error A).

Definition return_ {State Error A : Set} (value : A) : t State Error A :=

returnS! (Result.Ok value).

Definition bind {State Error A B : Set}

(value : t State Error A)

(f : A -> t State Error B) :

t State Error B :=

letS! value := value in

match value with

| Result.Ok value => f value

| Result.Err error => returnS! (Result.Err error)

end.

with the notations:

`Notation "MS!?" := StatePanicResult.t.`

Notation "returnS!?" := StatePanicResult.return_.

Notation "'letS!?' x ':=' X 'in' Y" := ...

We are repeating our notations a lot, as our three effects and their combinations are very similar. In addition, we always have to explicitly choose in our code which monad we use and add explicit conversions to go from one to another. A future enhancement could be to add some automation at this level, through the use of type-classes, for example, to automatically infer the monad to use based on the operations used in the codeย ๐ฆพ. For now, we prefer to stay explicit.

## Iterationsโ

To convert code involving `for`

loopsย ๐ or manipulations with the `.map`

method of iterators, we introduce the effectful version of the `for`

loop (*fold* or *reduce* in functional languages) and the `map`

method. For example, for the folding operation:

`(** The order of parameters is the same as in the source `for` loops. *)`

Definition fold_left {State Error A B : Set}

(init : A)

(l : list B)

(f : A -> B -> t State Error A) :

t State Error A :=

List.fold_left (fun acc x => bind acc (fun acc => f acc x)) l (return_ init).

with the notation:

`Notation "foldS!?" := StatePanicResult.fold_left.`

## Conclusionโ

Thanks to the definitions and notations above, we were able to translate (manually/with GitHub Copilot) all the code of the type-checker for the Move bytecode to Coq in an idiomatic Coq code of a size roughly similar to the original Rust code. This translation is available in our folder move_sui/simulations/move_bytecode_verifierย ๐.

In the next post we will present how we tested this translation to be faithful to the original Rust code, waiting to have an efficient way to prove it equivalent.