At Formal Land our mission is to reduce the cost of finding bugs in software. We use formal verification, that is to say mathematical reasoning on code, to make sure we find more bugs than with testing. As part of this effort, we are working on a tool coq-of-rust to translate Rust code to Coq, a proof assistant, to analyze Rust programs. Here we present a technical improvement we made in this tool.

One of the challenges of our translation from Rust to Coq is that the generated code is very verbose. The size increase is about ten folds in our examples. A reasons is that we use a monad to represent side effects in Coq, so we need to name each intermediate result and apply the `bind`

operator. Here, we will present a monadic notation that prevents naming intermediate results to make the code more readable.

This work is funded by the Aleph Zero crypto-currency to verify their Rust smart contracts. You can follow us on X to get our updates. We propose tools and services to make your codebase bug-free with formal verification.

Contact us at contact@formal.land to chat ☎️!

## Example 🔎

Here is the Rust source code that we consider:

`fn add(a: i32, b: i32) -> i32 {`

a + b

}

Before, we were generating the following Coq code, with `let*`

as the notation for the bind:

`Definition add (τ : list Ty.t) (α : list Value.t) : M :=`

match τ, α with

| [], [ a; b ] =>

let* a := M.alloc a in

let* b := M.alloc b in

let* α0 := M.read a in

let* α1 := M.read b in

BinOp.Panic.add α0 α1

| _, _ => M.impossible

end.

Now, with the new monadic notation, we generate:

`Definition add (τ : list Ty.t) (α : list Value.t) : M :=`

match τ, α with

| [], [ a; b ] =>

ltac:(M.monadic

(let a := M.alloc (| a |) in

let b := M.alloc (| b |) in

BinOp.Panic.add (| M.read (| a |), M.read (| b |) |)))

| _, _ => M.impossible

end.

The main change is that we do not need to introduce intermediate `let*`

expressions with generated names. The code structure is more similar to the original Rust code, with additional calls to memory primitives such as `M.alloc`

and `M.read`

.

The notation `f (| x1, ..., xn |)`

represents the call to the function `f`

with the arguments `x1`

, ..., `xn`

returning a monadic result. We bind the result with the current continuation that goes up to the wrapping `ltac:(M.monadic ...)`

tactic. We automatically transform the `let`

into a `let*`

with the `M.monadic`

tactic when needed.

## Where do we use this notation? 🤔

We use this notation in all the function bodies that we generate, that are all in a monad to represent side effects. We call the `ltac:(M.monadic ...)`

tactic at the start of the functions, as well as at the start of closure bodies that are defined inside functions. This also applies to the translation of `if`

, `match`

, and `loop`

expressions, as we represent their bodies as functions.

Here is an example of code with a `match`

expression:

`fn add(a: i32, b: i32) -> i32 {`

match a - b {

0 => a + b,

_ => a - b,

}

}

We translate it to:

`Definition add (τ : list Ty.t) (α : list Value.t) : M :=`

match τ, α with

| [], [ a; b ] =>

ltac:(M.monadic

(let a := M.alloc (| a |) in

let b := M.alloc (| b |) in

M.read (|

M.match_operator (|

M.alloc (| BinOp.Panic.sub (| M.read (| a |), M.read (| b |) |) |),

[

fun γ =>

ltac:(M.monadic

(let _ :=

M.is_constant_or_break_match (|

M.read (| γ |),

Value.Integer Integer.I32 0

|) in

M.alloc (|

BinOp.Panic.add (| M.read (| a |), M.read (| b |) |)

|)));

fun γ =>

ltac:(M.monadic (

M.alloc (|

BinOp.Panic.sub (| M.read (| a |), M.read (| b |) |)

|)

))

]

|)

|)))

| _, _ => M.impossible

end.

We see that we call the tactic `M.monadic`

for each branch of the `match`

expression.

## How does it work? 🛠️

The `M.monadic`

tactic is defined in M.v. The main part is:

`Ltac monadic e :=`

lazymatch e with

(* ... *)

| context ctxt [M.run ?x] =>

lazymatch context ctxt [M.run x] with

| M.run x => monadic x

| _ =>

refine (M.bind _ _);

[ monadic x

| let v := fresh "v" in

intro v;

let y := context ctxt [v] in

monadic y

]

end

(* ... *)

end.

In our translation of Rust, all of the values have the common type `Value.t`

. The monadic bind is of type `M -> (Value.t -> M) -> M`

where `M`

is the type of the monad. The `M.run`

function is an axiom that we use as a marker to know where we need to apply `M.bind`

. The type of `M.run`

is:

`Axiom run : M -> Value.t.`

The notation for monadic function calls is defined using the `M.run`

axiom with:

`Notation "e (| e1 , .. , en |)" := (M.run ((.. (e e1) ..) en)).`

When we encounter a `M.run`

(line 4) we apply the `M.bind`

(line 8) to the monadic expression `x`

(line 9) and its continuation `ctx`

that we obtain thanks to the `context`

keyword (line 4) of the matching of expressions in Ltac.

There is another case in the `M.monadic`

tactic to handle the `let`

expressions, that is not shown here.

## Conclusion

Thanks to this new monadic notation, the generated Coq code is more readable and closer to the original Rust code. This should simplify our work in writing proofs on the generated code, as well as debugging the translation.

If you are interested in formally verifying your Rust projects, do not hesitate to get in touch with us at contact@formal.land 💌! Formal verification provides the highest level of safety for critical applications, with a mathematical guarantee of the absence of bugs for a given specification.