Field type to write programs that run efficiently as circuits. Having formal verification for Noir enables the development of applications holding large amounts of money in this language, as it ensures that the code is correct with a mathematical level of certainty.
In this first post, we present how we translate Noir code to the 🐓 Coq proof system. We explore a translation after monomorphization and then at the HIR level. Note that we are interested in verifying programs written in Noir; the verification of the Noir compiler itself is a separate topic.
All our code is available as open source at github.com/formal-land/coq-of-noir, and you are welcome to use it. We also provide all-included audit services to formally verify your smart contracts using coq-of-noir.
To ensure your code is secure today, contact us at 💌contact@formal.land! 🚀
Formal verification goes further than traditional audits to make 100% sure you cannot lose your funds, thanks to mathematical reasoning on the code. It can be integrated into your CI pipeline to check that every commit is fully correct without doing a whole audit again.
We make bugs such as the DAO hack ($60 million stolen) virtually impossible to happen again.
Noir is designed as a small version of 🦀 Rust with many built-in constructs to make it more amenable to efficient compilation to zero-knowledge circuits. Being a smaller version of Rust simplifies the development of tooling, as the surface of the language is reduced. In addition, as Noir shares similarities with Rust, we can reuse our knowledge from coq-of-rust, a formal verification tool for Rust, to propose an equivalent tool for Noir.
A notable difference between Rust and Noir is that Noir has a much simpler memory management model: nothing is ever deallocated! As a result, the various kinds of pointers that exist in Rust (Rc, RefCell, ...) are not present in Noir. Most of the data is immutable, and mutations are encouraged to be done only on local variables.
Loops are restricted to for loops with bounds known at compile time, which simplifies reasoning about them. For example, we are sure that all loops terminate, which is required for the verification of the code.
Here is an example of a Noir program that we will use in this series of blog posts. It showcases the use of mutable variables in a loop, as well as generic values such as InputElements that are known at compile time and specialized during the monomorphization phase to compile the code down to a circuit. It is part of the noir_base64 library, which encodes an array of ASCII values into base64 values using finite field operations to stay efficient.
/**
 * @brief Take an array of ASCII values and convert into base64 values
 **/
pub fn base64_encode_elements<let InputElements: u32>(
    input: [u8; InputElements]
) -> [u8; InputElements] {
    let mut Base64Encoder = Base64EncodeBE::new();
    let mut result: [u8; InputElements] = [0; InputElements];
    for i in 0..InputElements {
        result[i] = Base64Encoder.get(input[i] as Field);
    }
    result
}
In this phase of compilation, all generic types and values are instantiated with their concrete values, as well as trait instances. The resulting code is much simpler as it only contains functions and types. If we translate the code to an untyped representation in Coq, we can even consider that the monomorphized code only contains functions. Thus, for convenience, we started doing our translation from the monomorphized level.
The abstract syntax tree for this level is in the Rust file compiler/noirc_frontend/src/monomorphization/ast.rs of the Noir compiler. As an example, here is how expressions are represented:
pub enum Expression {
    Ident(Ident),
    Literal(Literal),
    Block(Vec<Expression>),
    Unary(Unary),
    Binary(Binary),
    Index(Index),
    Cast(Cast),
    For(For),
    If(If),
    Tuple(Vec<Expression>),
    ExtractTupleField(Box<Expression>, usize),
    Call(Call),
    Let(Let),
    Constrain(Box<Expression>, Location, Option<Box<(Expression, HirType)>>),
    Assign(Assign),
    Semi(Box<Expression>),
    Break,
    Continue,
}
If you look at the various constructors of this enum, they correspond to the language's primitives presented in the reference manual of Noir. Expressions (Ident, Binary, Call, ...) and statements (If, Let, Break, ...) are mixed together. If we look at the definition of Ident:
pub struct Ident {
    pub location: Option<Location>,
    pub definition: Definition,
    pub mutable: bool,
    pub name: String,
    pub typ: Type,
}
and then at the definition of Definition:
pub enum Definition {
    Local(LocalId),
    Function(FuncId),
    Builtin(String),
    LowLevel(String),
    // used as a foreign/externally defined unconstrained function
    Oracle(String),
}
we get that most of the names have an associated id, which is a unique number. This is because the monomorphization phase duplicates a lot of the definitions (once for each instantiation of a generic type), so we have to give each copy a unique id to distinguish it.
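The id assignment can be pictured with a toy Python monomorphizer; the `$f` naming mimics the compiler output shown below, but the counter logic here is our own illustration, not the compiler's:

```python
from itertools import count

fresh_ids = count()
instances: dict = {}

def instantiate(name: str, input_elements: int) -> str:
    """Return a specialized copy of `name` for a given value of its generic
    parameter, creating it (with a fresh unique id) on first use."""
    key = (name, input_elements)
    if key not in instances:
        instances[key] = f"{name}$f{next(fresh_ids)}"
    return instances[key]
```

Each distinct instantiation gets its own copy, while re-using the same instantiation returns the already-created one.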
We translate the monomorphized code to Coq in two steps:
- We serialize the monomorphized AST to JSON, using the serde serialization library in Rust.
- We read this JSON from a Python script that generates the corresponding Coq code.
We find this development process to be rather efficient, as the Python language is quite flexible and allows us to manipulate the JSON data easily. Compared to the work of a full compiler, which can be rather expensive computationally, what we do is mostly a translation from one syntax to another, and Python is a good fit.
Our Noir example is monomorphized to the following code, which can be shown with the development option --show-monomorphized of nargo:
fn base64_encode_elements$f4(input$l26: [u8; 36]) -> [u8; 36] {
    let Base64Encoder$l27 = new$f6();
    let result$l28 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
    for i$l29 in 0 .. 36 {
        result$l28[i$l29] = get$f7(Base64Encoder$l27, (input$l26[i$l29] as Field))
    };
    result$l28
}
We see that the generic variable InputElements is replaced by the constant value 36, as this is the value we use in the example we translate. All the identifiers have an additional $... suffix to make them unique. Thanks to the serialization library serde, we automatically get the JSON representation of this code, which starts with:
{
  "id": 4,
  "name": "base64_encode_elements",
  "parameters": [
    [
      49,
      false,
      "input",
      {
        "Array": [
          118,
          {
            "Integer": [
              "Unsigned",
              "Eight"
            ]
          }
        ]
      }
    ]
  ],
  "body": {
    "Block": [
      {
        "Let": {
          "id": 27,
          "mutable": true,
          "name": "Base64Encoder",
          "expression": {
            "Call": {
              "func": {
                "Ident": {
                  "location": {
                    "span": {
                      "start": 5312,
                      "end": 5315
                    },
                    "file": 70
                  },
                  "definition": {
                    "Function": 6
                  },
                  "mutable": false,
                  "name": "new",
                  "typ": {
                    "Function": [
                      [],
                      {
                        "Tuple": [
                          {
                            "Array": [
                              64,
                              {
                                "Integer": [
                                  "Unsigned",
                                  "Eight"
                                ]
                              }
                            ]
                          }
                        ]
                      }
                    ]
                  },
// much more JSON code
This is extremely verbose, and there is some information that we do not need, such as the locations of some of the items in the source. The advantage of JSON is that it is easy to parse and handle in most programming languages. In our case, here is an extract of the Python script that translates this JSON to Coq:
'''
pub enum Expression {
    Ident(Ident),
    Literal(Literal),
    Block(Vec<Expression>),
    Unary(Unary),
    Binary(Binary),
    Index(Index),
    Cast(Cast),
    For(For),
    If(If),
    Tuple(Vec<Expression>),
    ExtractTupleField(Box<Expression>, usize),
    Call(Call),
    Let(Let),
    Constrain(Box<Expression>, Location, Option<Box<(Expression, HirType)>>),
    Assign(Assign),
    Semi(Box<Expression>),
    Break,
    Continue,
}
'''
def expression_to_coq(node) -> str:
    node_type: str = list(node.keys())[0]
    if node_type == "Ident":
        node = node["Ident"]
        return ident_to_coq(node)
    if node_type == "Literal":
        node = node["Literal"]
        return alloc(literal_to_coq(node))
    if node_type == "Block":
        node = node["Block"]
        return \
            "\n".join(
                expression_inside_block_to_coq(expression, index == len(node) - 1)
                for index, expression in enumerate(node)
            )
    if node_type == "Unary":
        node = node["Unary"]
        return unary_to_coq(node)
    if node_type == "Binary":
        node = node["Binary"]
        return binary_to_coq(node)
    # more cases...
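As noted above, the JSON contains information we do not need, such as the source locations. A small pre-processing pass can drop them; the helper below is a sketch of ours, not part of the actual translation script:

```python
def strip_locations(node):
    """Recursively drop the "location" fields from a serde-produced JSON tree,
    keeping everything else intact."""
    if isinstance(node, dict):
        return {
            key: strip_locations(value)
            for key, value in node.items()
            if key != "location"
        }
    if isinstance(node, list):
        return [strip_locations(item) for item in node]
    return node
```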
For each kind of node in the AST, we write the original Rust type in comments, then let GitHub Copilot write the Python code and refine it. Here is the final Coq code that we get for this example:
Definition base64_encode_elements₄ (α : list Value.t) : M.t :=
  match α with
  | [input] =>
    let input := M.alloc input in
    let* result :=
      let~ Base64Encoder := [[ M.copy_mutable (
        M.alloc (M.call_closure (
          M.read (| M.get_function (| "new", 6 |) |),
          []
        ))
      ) ]] in
      let~ result := [[ M.copy_mutable (
        M.alloc (Value.Array [
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |);
          M.read (| M.alloc (Value.Integer IntegerKind.U8 0) |)
        ])
      ) ]] in
      do~ [[
        M.for_ (
          M.read (| M.alloc (Value.Integer IntegerKind.U32 0) |),
          M.read (| M.alloc (Value.Integer IntegerKind.U32 36) |),
          fun (i : Value.t) =>
            [[
              M.alloc (M.assign (
                M.read (| M.alloc (M.index (
                  M.read (| M.alloc (result) |),
                  M.read (| i |)
                )) |),
                M.read (| M.alloc (M.call_closure (
                  M.read (| M.get_function (| "get", 7 |) |),
                  [
                    M.read (| Base64Encoder |);
                    M.read (| M.alloc (M.cast (
                      M.read (| M.alloc (M.index (
                        M.read (| input |),
                        M.read (| i |)
                      )) |),
                      IntegerKind.Field
                    )) |)
                  ]
                )) |)
              ))
            ]]
        )
      ]] in
      [[
        result
      ]] in
    M.read result
  | _ => M.impossible "wrong number of arguments"
  end.
If you attentively compare this Coq code to the original Noir version, you will see that the two are similar, although the Coq version is much more verbose, with all the memory allocations and reads made explicit. You might be wondering why we chose this specific representation. How did we know we had to use M.for_, for example, to represent the loops?
This is where the semantics comes in. In the semantics phase, we define the meaning of each construct of the language in Coq. We reused our experience in building coq-of-rust and coq-of-solidity, where we also had to define the semantics of imperative languages in Coq.
We remove all the type information to avoid having to bridge the differences between Coq's type system and the type system of Noir. All the values have the same type Value.t:
Module Value.
  Inductive t : Set :=
  | Bool (b : bool)
  | Integer (kind : IntegerKind.t) (integer : Z)
  | String (s : string)
  | FmtStr : string -> Z -> t -> t
  | Pointer (pointer : Pointer.t t)
  | Array (values : list t)
  | Slice (values : list t)
  | Tuple (values : list t)
  | Closure : {'(Value, M) : (Set * Set) @ list Value -> M} -> t.
End Value.
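To give an intuition of this untyped universe, here is a rough Python analogue (our own sketch, covering only a few constructors): every value carries its constructor tag, so heterogeneous data can be mixed freely, just as in the Coq definition.

```python
from dataclasses import dataclass

# A few of the Value.t constructors, modeled as a Python tagged union.
@dataclass
class Bool:
    b: bool

@dataclass
class Integer:
    kind: str  # e.g. "U8", "U32", "Field"
    value: int

@dataclass
class Array:
    values: list

@dataclass
class Tuple:
    values: list

# Since everything lives in the same universe, a tuple can mix kinds of
# values without any typing constraint.
mixed = Tuple([Bool(True), Integer("U8", 0), Array([Integer("Field", 36)])])
```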
We have a monad M.t to represent the side effects of Noir in Coq (memory mutations, non-termination for recursive calls, ...). We define this monad as the composition of two monads:
- LowM.t, which contains all the effects we cannot directly represent in Coq.
- Result.t, which represents special control-flow operations, such as break and continue, which have to interrupt the execution of the current loop prematurely, as well as a panic value in case of assert failure, which must propagate up to the main function.
The definition of these types is as follows:
Module LowM.
  Inductive t (A : Set) : Set :=
  | Pure (value : A)
  | CallPrimitive {B : Set} (primitive : Primitive.t B) (k : B -> t A)
  | CallClosure (closure : Value.t) (args : list Value.t) (k : A -> t A)
  | Let (e : t A) (k : A -> t A)
  | Loop (body : t A) (k : A -> t A)
  | Impossible (message : string).
End LowM.

Module Result.
  Inductive t : Set :=
  | Ok (value : Value.t)
  | Break
  | Continue
  | Panic {A : Set} (payload : A).
End Result.

Module M.
  Definition t : Set :=
    LowM.t Result.t.
End M.
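The role of Result.t can be illustrated with a small Python interpreter for a for loop whose body returns a tagged outcome. This is our own sketch of the intended control flow, not the Coq semantics itself:

```python
def run_for(start, end, body):
    """Run `body` on each index; "continue" goes to the next iteration,
    "break" exits the loop, and "panic" propagates up to the caller."""
    for i in range(start, end):
        tag, payload = body(i)
        if tag == "break":
            break
        if tag == "panic":
            return ("panic", payload)
        # "ok" and "continue" both proceed to the next iteration
    return ("ok", None)

seen = []

def body(i):
    seen.append(i)
    if i == 3:
        return ("break", None)
    return ("continue", None)

outcome = run_for(0, 36, body)
```

Breaking at i == 3 stops after four iterations, while a panic short-circuits the whole computation, mirroring the Break, Continue, and Panic constructors.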
Note that since our type of values is always Value.t, we do not parameterize the monad M.t by the type of values.
Thanks to all the work above, we obtain a translation of a large subset of the Noir language to the Coq proof system, which type-checks and has a semantics. A difficulty with handling the code we produce from monomorphization is the unique identifiers added after each name to make them unique. These identifiers are generated in a rather non-deterministic way that can depend on the machine that runs the compiler. In addition, they change every time we make changes to the source code.
In the next blog post, we will see how we prevent the identifiers from appearing in the generated code by working at a higher level than the monomorphization phase.
In particular, we cover the changes we made to use unoptimized Yul code and how we made a functional representation of the loop to compute the most significant bit of the scalars.
We are now verifying the code based on the unoptimized Yul output of the Solidity compiler instead of the optimized one. As a consequence, the code is a little more verbose, although in our present case the difference is limited, as we are verifying code that is already hand-optimized. The main advantage is that the variables are preserved instead of being moved to locations in memory, which makes the verification easier, especially when handling loop invariants. A downside is that we now have to trust the correctness of the Solidity compiler's optimization passes.
As an example, here is how we now translate in Coq the loop to compute the most significant bit of the scalars with the unoptimized Yul code:
let~ var_ZZZ_83 := [[ 0 ]] in
let_state~ '(var_ZZZ_83, var_mask_63) :=
  (* for loop *)
  Shallow.for_
    (* init state *)
    (var_ZZZ_83, var_mask_63)
    (* condition *)
    (fun '(var_ZZZ_83, var_mask_63) => [[
      iszero ~(| var_ZZZ_83 |)
    ]])
    (* body *)
    (fun '(var_ZZZ_83, var_mask_63) =>
      Shallow.lift_state_update
        (fun var_ZZZ_83 => (var_ZZZ_83, var_mask_63))
        (let~ var_ZZZ_83 := [[ add ~(|
          add ~(|
            sub ~(| 1, iszero ~(| and ~(| var_scalar_u_55, var_mask_63 |) |) |),
            shl ~(| 1, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, var_scalar_u_55 |), var_mask_63 |) |) |) |)
          |),
          add ~(|
            shl ~(| 2, sub ~(| 1, iszero ~(| and ~(| var_scalar_v_57, var_mask_63 |) |) |) |),
            shl ~(| 3, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, var_scalar_v_57 |), var_mask_63 |) |) |) |)
          |)
        |) ]] in
        M.pure (BlockUnit.Tt, var_ZZZ_83)))
    (* post *)
    (fun '(var_ZZZ_83, var_mask_63) =>
      Shallow.lift_state_update
        (fun var_mask_63 => (var_ZZZ_83, var_mask_63))
        (let~ var_mask_63 := [[ shr ~(| 1, var_mask_63 |) ]] in
        M.pure (BlockUnit.Tt, var_mask_63)))
As a reference, here is the original smart contract code, in handwritten Yul:
ZZZ := 0
for {} iszero(ZZZ) { mask := shr(1, mask) } {
    ZZZ := add(
        add(
            sub(1, iszero(and(scalar_u, mask))),
            shl(1, sub(1, iszero(and(shr(128, scalar_u), mask))))
        ),
        add(
            shl(2, sub(1, iszero(and(scalar_v, mask)))),
            shl(3, sub(1, iszero(and(shr(128, scalar_v), mask))))
        )
    )
}
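To make the loop's behavior concrete, here is a direct Python port (our own model, with a hypothetical initial mask of 2^127; the Yul semantics of running the post-action after every body iteration is preserved):

```python
def iszero(x: int) -> int:
    return 1 if x == 0 else 0

def msb_loop(scalar_u: int, scalar_v: int, mask: int = 1 << 127):
    """Port of the Yul loop: scan the mask downwards until one of the four
    128-bit halves of the scalars has its bit set, building the 4-bit ZZZ
    selector. The post-action (the mask shift) runs after every body."""
    zzz = 0
    while iszero(zzz):
        zzz = (
            (1 - iszero(scalar_u & mask))
            + ((1 - iszero((scalar_u >> 128) & mask)) << 1)
            + ((1 - iszero(scalar_v & mask)) << 2)
            + ((1 - iszero((scalar_v >> 128) & mask)) << 3)
        )
        mask >>= 1  # for-loop post-action
    return zzz, mask
```

Note that the loop does not terminate when both scalars are zero, which matches the exceptional case called out for the recursive model.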
We recognize the variables var_ZZZ_83 and var_mask_63, corresponding to ZZZ and mask in the original code. They are made explicit in a state monad with the state (var_ZZZ_83, var_mask_63) for the loop.
There were some constructs that we were not handling in coq-of-solidity, as they appear in the unoptimized code but not in the optimized one. An example is the initialization part of the for loop, which seems to always be moved away in the optimized code. We added those missing cases to our tool to be able to translate the unoptimized Yul code of Smoo.th.
Verifying the for loop above can be challenging. Automated verification tools for Solidity typically do not fully handle loops, and instead unroll them three or four times to check the first iterations, which can miss some bugs.
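As an aside, the limit of bounded unrolling is easy to demonstrate. In this contrived Python example (ours, unrelated to the library's code), the invariant only breaks at the fifth iteration, so checking four unrollings reports no error:

```python
def accumulate(n: int) -> int:
    """A loop whose invariant `total < 10` holds for the first four
    iterations and breaks at the fifth (i == 4 gives total == 10)."""
    total = 0
    for i in range(n):
        total += i
        assert total < 10, f"invariant broken at iteration {i}"
    return total

# Checking only the first four iterations finds nothing...
for n in range(5):
    accumulate(n)

# ...while the bug is there, one iteration further.
try:
    accumulate(5)
    bug_found = False
except AssertionError:
    bug_found = True
```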
The first step is to prove that the loop is equivalent to a recursive function, as this will simplify the reasoning. Here is a recursive function that computes the most significant bit of the scalars u and v:
Fixpoint get
    (u_low u_high v_low v_high : U128.t) (over_index : nat) :
    PointsSelector.t * nat :=
  match over_index with
  | O =>
    (* We should never reach this case if the scalars
       are not all zero *)
    (PointsSelector.Build_t false false false false, O)
  | S index =>
    let selector := HighLow.get_selector
      u_low u_high v_low v_high (Z.of_nat index) in
    if PointsSelector.is_zero selector then
      let new_over_index := index in
      get u_low u_high v_low v_high new_over_index
    else
      let next_over_index := index in
      (selector, next_over_index)
  end.
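This recursion can be transcribed in Python to sanity-check its behavior (our own model; get_selector here is our guess at what HighLow.get_selector computes, namely the four bits of the scalar halves at a given index):

```python
def get_selector(u_low, u_high, v_low, v_high, index):
    # The four booleans of the selector: one bit per 128-bit scalar half.
    mask = 1 << index
    return (
        bool(u_low & mask),
        bool(u_high & mask),
        bool(v_low & mask),
        bool(v_high & mask),
    )

def get(u_low, u_high, v_low, v_high, over_index):
    """Mirror of the Coq Fixpoint: decrement over_index until a non-zero
    selector is found; over_index == 0 is the all-zero fallback case."""
    if over_index == 0:
        return ((False, False, False, False), 0)
    index = over_index - 1
    selector = get_selector(u_low, u_high, v_low, v_high, index)
    if not any(selector):
        return get(u_low, u_high, v_low, v_high, index)
    return (selector, index)
```

The termination argument is visible in the code: every recursive call passes a strictly smaller over_index.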
Here are some notable changes compared to the original for loop:
- We split the scalars u and v of 256 bits into their high and low parts, u_low, u_high, v_low, and v_high, of 128 bits each.
- We return the selector as a PointsSelector type, which is a record with four boolean fields. In the original code, the ZZZ variable is used to group these four booleans into a single integer.
- We use a natural number over_index to represent the mask. We decrement it at each iteration until it reaches zero, proving by construction the termination of the function.
Note that this means that when over_index is zero, the mask is zero. This corresponds to the last case of the loop. We use the variable name over_index so that if we define:
$\text{over\_index} = \text{index} + 1$
then the relation with the mask is:
$\text{mask} = 2^{\text{index}}$
for all cases except the last one.
Here is the reasoning rule for the smart contract loops in Coq:
Lemma LoopStep codes environment {In Out : Set}
    (init : In)
    (body : In -> LowM.t Out)
    (break_with : Out -> In + Out)
    (k : Out -> LowM.t Out)
    (output output_inter : Out)
    state state_inter state'
    (H_body :
      {{? codes, environment, state |
        body init ⇓ output_inter
      | state_inter ?}}
    )
    (H_break_with :
      match break_with output_inter with
      | inr output_inter' =>
        {{? codes, environment, state_inter |
          k output_inter' ⇓ output
        | state' ?}}
      | inl next_init =>
        {{? codes, environment, state_inter |
          LowM.Loop next_init body break_with k ⇓ output
        | state' ?}}
      end
    ) :
  {{? codes, environment, state |
    LowM.Loop init body break_with k ⇓ output
  | state' ?}}.
This rule, to be used in combination with some reasoning by induction, allows us to verify that a certain property is true for any number of iterations of the loop. In the present case, we use it to prove that the recursive function get is equivalent to the for loop. Basically, it states that:
- if the body of the loop evaluates to some output output_inter,
- and if the continuation chosen by the break_with helper, which wraps the end of the loop to either continue it or break out of it, evaluates to output,
- then the whole loop evaluates to output.
Here, the output of the body of the loop contains the state of the state monad, that is to say, the two variables ZZZ and mask, and a special variable to break or continue the for loop iterations.
Due to a lack of time, we only made a sketch of the proof of evaluation of this loop, admitting some intermediate lemmas about identities over the selector function. This work is available in the file coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4/run.v.
We have seen how to reason about loops with coq-of-solidity. This example with bit-level arithmetic was rather complex, but the general idea is still to reason by induction, showing the equivalence with a recursive function and using the reasoning rule LoopStep above to step through the loop.
If you have smart contracts that you need to secure, talk to us! 🤝 The cost of an attack always far outweighs the cost of an audit, and our solution, with full formal verification, is the most extensive in terms of coverage.
In this blog post, we present how we work with customers to integrate full formal verification into their workflow and ensure that their code is secure in the best possible way.
Security is central to the long-term success of decentralized platforms. Traditional testing or security audits can catch many issues, but they are not enough to guarantee the absence of bugs. Formal verification is a technique that checks every possible input of your program to ensure that it is always correct, for a given set of security properties. It works by mathematically reasoning about the code's constructs and then checking this reasoning with a computer.
Our process is as follows:
Ready to take your application's security to the next level? Reach out to us at 💌contact@formal.land, and let's build a secure future together! 🚀
The Smoo.th library is interesting, as elliptic curves are at the core of many cryptographic protocols, including authentication protocols, and having a generic and fast implementation simplifies the development of dApps in environments with missing precompiles (like some L1s) or missing circuits (like zero-knowledge layers).
From a verification point of view, it is very challenging, as it combines low-level operations (hand-optimized Yul code with bit shifts, inlined functions, ...) with higher-level reasoning on elliptic curves and arithmetic 💪.
The library is implemented in SCL_mulmuladdX_fullgen_b4.sol, mostly in Yul. Given two points $G$ and $Q$ on an elliptic curve over the field $\mathbb{F}_p$ and two scalars $u$ and $v$, it computes the following operation:
$u \cdot G + v \cdot Q$
where the points are represented as $(x, y)$ coordinates, the scalars are integers, and the curve is described in the short Weierstrass form.
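The naive operation that serves as the specification can be written down directly. Here is a self-contained Python model over a toy short Weierstrass curve (small prime p = 97, a = 2, b = 3, a base point (3, 6): all values chosen by us for illustration, not a cryptographic curve):

```python
# Short Weierstrass curve y^2 = x^3 + a*x + b over F_p; None is the point at
# infinity. Toy parameters only.
p, a, b = 97, 2, 3

def add(P, Q):
    """Affine point addition, handling doubling and opposite points."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = point at infinity
    if P == Q:
        slope = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        slope = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (slope * slope - x1 - x2) % p
    y3 = (slope * (x1 - x3) - y1) % p
    return (x3, y3)

def scalar_mult(k, P):
    """Double-and-add computation of k * P."""
    result = None
    while k > 0:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

G = (3, 6)  # on the curve: 6^2 = 36 = 3^3 + 2*3 + 3 (mod 97)
Q = scalar_mult(2, G)
u, v = 5, 7
```

Since Q = 2G here, the specification uG + vQ must agree with (u + 2v)G, which gives a quick consistency check of the model.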
Here is a diagram to summarize the workflow of the library 🤓:
You can find more details about the algorithms used in the library in the complete audit report by CryptoExperts.
Our goal is to show that all these steps are equivalent to the naive operation of adding the points $u \cdot G$ and $v \cdot Q$ on the elliptic curve (ignoring the higher gas consumption), and thus that the library is free of bugs. Note that there are a few exceptional points, for example when $G$ is the opposite of $Q$, where the library does not work as is and runs another algorithm instead. We need to make these points explicit in the proof and assume we are not in these special cases.
In order to formally verify that the code is correct for any possible inputs, we first need to translate it to a proof language, in our case Coq. We run our tool coq-of-solidity on the optimized Yul code as generated by the Solidity compiler, which further optimizes the already hand-optimized code of the library. All our verification work is available on GitHub in the folder coq/CoqOfSolidity/contracts/scl/mulmuladdX_fullgen_b4 of the coq-of-solidity repository.
Here is an example of handwritten Yul code from the contract, to compute the most significant bit of the scalars:
ZZZ := 0
for {} iszero(ZZZ) { mask := shr(1, mask) } {
    ZZZ := add(
        add(
            sub(1, iszero(and(scalar_u, mask))),
            shl(1, sub(1, iszero(and(shr(128, scalar_u), mask))))
        ),
        add(
            shl(2, sub(1, iszero(and(scalar_v, mask)))),
            shl(3, sub(1, iszero(and(shr(128, scalar_v), mask))))
        )
    )
}
The Yul code after optimization by the Solidity compiler is:
mstore(0xe0, 0)
for { } iszero(mload(0xe0)) { mstore(0x01a0, shr(1, mload(0x01a0))) } {
    mstore(0xe0, add(
        add(
            sub(1, iszero(and(mload(0x0120), mload(0x01a0)))),
            shl(1, sub(1, iszero(and(shr(128, mload(0x0120)), mload(0x01a0)))))
        ),
        add(
            shl(2, sub(1, iszero(and(mload(0x0160), mload(0x01a0))))),
            shl(3, sub(1, iszero(and(shr(128, mload(0x0160)), mload(0x01a0)))))
        )
    ))
}
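The substitution of variables by memory accesses can be pictured with a toy word-addressed memory in Python (our simplification: real EVM memory is byte-addressed, so overlapping mload/mstore accesses are ignored here, and the initial mask value is a hypothetical choice):

```python
class Memory:
    """Toy EVM-like memory: 256-bit words stored at fixed offsets."""
    def __init__(self):
        self.words = {}

    def mstore(self, offset: int, value: int) -> None:
        self.words[offset] = value % 2**256

    def mload(self, offset: int) -> int:
        return self.words.get(offset, 0)

def iszero(x: int) -> int:
    return 1 if x == 0 else 0

# The variable ZZZ now lives at offset 0xe0, and mask at offset 0x01a0.
memory = Memory()
memory.mstore(0xe0, 0)            # ZZZ := 0
memory.mstore(0x01a0, 1 << 127)   # a hypothetical initial mask
memory.mstore(0x01a0, memory.mload(0x01a0) >> 1)  # mask := shr(1, mask)
```

Every occurrence of a variable in the loop becomes such a load or store, which is why tracking loop invariants over raw offsets is harder than over named variables.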
As we can see, the variable names were replaced by fixed memory addresses, which will make the verification more complex. The Coq code that we generate with coq-of-solidity is:
do~ [[ mstore ~(| 0xe0, 0 |) ]] in
let_state~ 'tt :=
  (* for loop *)
  Shallow.for_
    (* init state *)
    tt
    (* condition *)
    (fun 'tt => [[
      iszero ~(| mload ~(| 0xe0 |) |)
    ]])
    (* body *)
    (fun 'tt =>
      do~ [[
        mstore ~(| 0xe0, add ~(|
          add ~(|
            sub ~(| 1, iszero ~(| and ~(| mload ~(| 0x0120 |), mload ~(| 0x01a0 |) |) |) |),
            shl ~(| 1, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, mload ~(| 0x0120 |) |), mload ~(| 0x01a0 |) |) |) |) |)
          |),
          add ~(|
            shl ~(| 2, sub ~(| 1, iszero ~(| and ~(| mload ~(| 0x0160 |), mload ~(| 0x01a0 |) |) |) |) |),
            shl ~(| 3, sub ~(| 1, iszero ~(| and ~(| shr ~(| 128, mload ~(| 0x0160 |) |), mload ~(| 0x01a0 |) |) |) |) |)
          |)
        |) |)
      ]] in
      M.pure (BlockUnit.Tt, tt))
    (* post *)
    (fun 'tt =>
      do~ [[ mstore ~(| 0x01a0, shr ~(| 1, mload ~(| 0x01a0 |) |) |) ]] in
      M.pure (BlockUnit.Tt, tt))
  default~ tt in
We use a monadic notation f ~(| x1, ..., xn |) to represent the side effects of the EVM, such as memory reads and writes with mload and mstore. The function Shallow.for_ represents a for loop with an initial state, a condition, a body, and a post-action. We implement it using a primitive from our monad to represent potentially non-terminating loops.
Here, the proper state of the loop is empty (the value tt), and we instead read and modify the memory with mload and mstore. Ideally, we should have (ZZZ, mask) as the state of the loop to simplify the verification. For our next attempt at verifying this code, we will look at the Yul code generated before optimizations by the Solidity compiler in order to keep these variables.
We are not done yet with the verification of this library. For now, we have verified that:
- the function ecAddn2 is implemented as specified;
- the function ecDblNeg is implemented as in the specification, in an inlined manner.
For example, here is our statement for the execution of the ecAddn2 operation:
Lemma run_usr'dollar'ecAddn2 codes environment state
    (P1_X P1_Y P1_ZZ P1_ZZZ P2_X P2_Y : U256.t) (p : U256.t) :
  let output :=
    ecAddn2 p
      {| PZZ.X := P1_X; PZZ.Y := P1_Y; PZZ.ZZ := P1_ZZ; PZZ.ZZZ := P1_ZZZ |}
      {| PA.X := P2_X; PA.Y := P2_Y |} in
  let output := Result.Ok (output.(PZZ.X), output.(PZZ.Y), output.(PZZ.ZZ), output.(PZZ.ZZZ)) in
  {{? codes, environment, Some state |
    Contract_91.Contract_91_deployed.usr'dollar'ecAddn2 P1_X P1_Y P1_ZZ P1_ZZZ P2_X P2_Y p ⇓
    output
  | Some state ?}}.
It says that in a given environment (codes, environment, state), the execution of the translated function Contract_91.Contract_91_deployed.usr'dollar'ecAddn2 gives the same result as a handwritten, purely functional version ecAddn2 operating on data types directly representing the curve points (PZZ.t and PA.t).
We verify this execution in a straightforward way by unfolding the definition and executing it step by step:
Proof.
  simpl.
  unfold Contract_91.Contract_91_deployed.usr'dollar'ecAddn2.
  l. {
    repeat (l; [repeat cu; p]).
    p.
  }
  p.
Qed.
For the verification of the inlined ecDblNeg operation, here is the memory state just after computing the coordinates of the doubled point:
[
mem0; mem1; Pure.add 0 2048; mem3; mem4;
Pure.addmod
(Pure.mulmod
(Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)
(Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)
(Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)
(Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p) p)
(Pure.mulmod (Pure.sub p 2)
(Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p) p) p;
Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p;
Pure.mulmod
(Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p)
P_127.(PZZ.ZZZ) p;
Pure.addmod
(Pure.mulmod
(Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p)
P_127.(PZZ.Y) p)
(Pure.mulmod
(Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)
(Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)
(Pure.addmod
(Pure.addmod
(Pure.mulmod
(Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)
(Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p)
(Pure.addmod (Pure.mulmod 3 (Pure.mulmod P_127.(PZZ.X) P_127.(PZZ.X) p) p)
(Pure.mulmod a (Pure.mulmod P_127.(PZZ.ZZ) P_127.(PZZ.ZZ) p) p) p) p)
(Pure.mulmod (Pure.sub p 2)
(Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p) p) p)
(Pure.sub p (Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p))
p) p) p;
HighLow.merge u_high u_low; 480; HighLow.merge v_high v_low; Pure.add 0 2048; 2 ^ 126;
Pure.mulmod (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) P_127.(PZZ.ZZ) p;
p; Q.(PA.Y); Q'.(PA.X); Q'.(PA.Y); p; a; G.(PA.X); G.(PA.Y); G'.(PA.X); G'.(PA.Y);
0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;
0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;
P0.(PZZ.X); P0.(PZZ.Y); P0.(PZZ.ZZ); P0.(PZZ.ZZZ);
P1.(PZZ.X); P1.(PZZ.Y); P1.(PZZ.ZZ); P1.(PZZ.ZZZ);
P2.(PZZ.X); P2.(PZZ.Y); P2.(PZZ.ZZ); P2.(PZZ.ZZZ);
P3.(PZZ.X); P3.(PZZ.Y); P3.(PZZ.ZZ); P3.(PZZ.ZZZ);
P4.(PZZ.X); P4.(PZZ.Y); P4.(PZZ.ZZ); P4.(PZZ.ZZZ);
P5.(PZZ.X); P5.(PZZ.Y); P5.(PZZ.ZZ); P5.(PZZ.ZZZ);
P6.(PZZ.X); P6.(PZZ.Y); P6.(PZZ.ZZ); P6.(PZZ.ZZZ);
P7.(PZZ.X); P7.(PZZ.Y); P7.(PZZ.ZZ); P7.(PZZ.ZZZ);
P8.(PZZ.X); P8.(PZZ.Y); P8.(PZZ.ZZ); P8.(PZZ.ZZZ);
P9.(PZZ.X); P9.(PZZ.Y); P9.(PZZ.ZZ); P9.(PZZ.ZZZ);
P10.(PZZ.X); P10.(PZZ.Y); P10.(PZZ.ZZ); P10.(PZZ.ZZZ);
P11.(PZZ.X); P11.(PZZ.Y); P11.(PZZ.ZZ); P11.(PZZ.ZZZ);
P12.(PZZ.X); P12.(PZZ.Y); P12.(PZZ.ZZ); P12.(PZZ.ZZZ);
P13.(PZZ.X); P13.(PZZ.Y); P13.(PZZ.ZZ); P13.(PZZ.ZZZ);
P14.(PZZ.X); P14.(PZZ.Y); P14.(PZZ.ZZ); P14.(PZZ.ZZZ);
P15.(PZZ.X); P15.(PZZ.Y); P15.(PZZ.ZZ); P15.(PZZ.ZZZ);
0; p
]
The state is very large, as we are verifying a large function (250 lines) directly mutating the memory. We recognize the parameters of the function (Q, Q', G, G') as well as the precomputed points (P0, P1, P2, ..., P15). We also see the computation of the coordinates of the doubled point, stored at fixed memory addresses.
We define the dbl_neg_P_127 point as:
set (dbl_neg_P_127 := ecDblNeg a p P_127).
We then rewrite the memory locations of the doubled point with the coordinates of dbl_neg_P_127:
apply_memory_update_at P_127_X_address dbl_neg_P_127.(PZZ.X); [reflexivity].
apply_memory_update_at P_127_Y_address dbl_neg_P_127.(PZZ.Y); [reflexivity].
apply_memory_update_at P_127_ZZ_address dbl_neg_P_127.(PZZ.ZZ); [reflexivity].
apply_memory_update_at P_127_ZZZ_address dbl_neg_P_127.(PZZ.ZZZ); [reflexivity].
giving us the new state:
[
mem0; mem1; Pure.add 0 2048; mem3; mem4; dbl_neg_P_127.(PZZ.X);
Pure.mulmod P_127.(PZZ.X) (Pure.mulmod (Pure.mulmod 2 P_127.(PZZ.Y) p) (Pure.mulmod 2 P_127.(PZZ.Y) p) p) p;
dbl_neg_P_127.(PZZ.ZZZ); dbl_neg_P_127.(PZZ.Y);
HighLow.merge u_high u_low; 480; HighLow.merge v_high v_low; Pure.add 0 2048; 2 ^ 126;
dbl_neg_P_127.(PZZ.ZZ);
p; Q.(PA.Y); Q'.(PA.X); Q'.(PA.Y); p; a; G.(PA.X); G.(PA.Y); G'.(PA.X); G'.(PA.Y);
0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;
0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;
P0.(PZZ.X); P0.(PZZ.Y); P0.(PZZ.ZZ); P0.(PZZ.ZZZ);
P1.(PZZ.X); P1.(PZZ.Y); P1.(PZZ.ZZ); P1.(PZZ.ZZZ);
P2.(PZZ.X); P2.(PZZ.Y); P2.(PZZ.ZZ); P2.(PZZ.ZZZ);
P3.(PZZ.X); P3.(PZZ.Y); P3.(PZZ.ZZ); P3.(PZZ.ZZZ);
P4.(PZZ.X); P4.(PZZ.Y); P4.(PZZ.ZZ); P4.(PZZ.ZZZ);
P5.(PZZ.X); P5.(PZZ.Y); P5.(PZZ.ZZ); P5.(PZZ.ZZZ);
P6.(PZZ.X); P6.(PZZ.Y); P6.(PZZ.ZZ); P6.(PZZ.ZZZ);
P7.(PZZ.X); P7.(PZZ.Y); P7.(PZZ.ZZ); P7.(PZZ.ZZZ);
P8.(PZZ.X); P8.(PZZ.Y); P8.(PZZ.ZZ); P8.(PZZ.ZZZ);
P9.(PZZ.X); P9.(PZZ.Y); P9.(PZZ.ZZ); P9.(PZZ.ZZZ);
P10.(PZZ.X); P10.(PZZ.Y); P10.(PZZ.ZZ); P10.(PZZ.ZZZ);
P11.(PZZ.X); P11.(PZZ.Y); P11.(PZZ.ZZ); P11.(PZZ.ZZZ);
P12.(PZZ.X); P12.(PZZ.Y); P12.(PZZ.ZZ); P12.(PZZ.ZZZ);
P13.(PZZ.X); P13.(PZZ.Y); P13.(PZZ.ZZ); P13.(PZZ.ZZZ);
P14.(PZZ.X); P14.(PZZ.Y); P14.(PZZ.ZZ); P14.(PZZ.ZZZ);
P15.(PZZ.X); P15.(PZZ.Y); P15.(PZZ.ZZ); P15.(PZZ.ZZZ);
0; p
]
Still large but much cleaner!
There are two main parts that remain to be done in order to have a full formal verification of the library:
- the for loops: reasoning on the loops is complex; in the current version, we unroll the loops once as a first step towards the full proof;
- as the memory used by the main function is quite large, we will first need to change the code we verify by looking at the Yul code generated by the Solidity compiler before optimizations.

We have seen how the Smoo.th library works at a high level, how we can start verifying it, and what challenges we face. This is also an interesting example to improve our tool coq-of-solidity
In this blog post, we present how we developed an effect inference mechanism to translate optimized Yul code combining variable mutations and control flow with loops and nested premature returns (break, continue, and leave) to a clean 🧼 purely functional representation in the proof system Coq.
We will be talking about this work at the Encode London Conference on Friday, October 25, 2024 📢.
Yul is the intermediate language of the Solidity compiler that we translate to the Coq proof system to formally verify properties of smart contracts. The issue is that it has slightly different behaviors from the Coq language. In particular, it allows for variable mutations and imperative loops (for loops) with premature exits, which have no native equivalents in purely functional languages like the ones used for formal verification.
Here is a short example of Yul code that is impossible to translate to Coq as it is:
function rugby() -> x {
let i := 0
x := 0
for { } lt(i, 10) { i := add(i, 1) } {
x := add(x, i)
if eq(i, 5) {
leave
}
}
}
It uses the variable x to store the sum of the increasing sequence of integers 1, 2, 3, ..., but prematurely stops the loop when i reaches 5 and returns the final value of x.
To represent this code in a purely functional language, we need to:
- make the mutations of the variables i and x explicit;
- represent the premature exit of the loop when the condition eq(i, 5) is satisfied, which then bubbles up to the body of the function to return the final result x.

Having a purely functional representation of the Yul code is important, as verifying functional programs is easier than verifying imperative ones, especially in a system like Coq that is based on functional programming even at the logical level.
Ideally, such a translation should be done automatically so that we are not at risk of making mistakes and can focus our time on the verification work. This would allow us to formally verify properties of smart contracts or similar imperative programs more efficiently. Note that in Yul, in addition to mutations of variables, there are also mutations of the contract's memory and storage, which we do not cover here.
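To preview the target shape of the translation, here is a hand-written Python sketch of a purely functional version of rugby (names and encoding are ours, not the tool's output): the loop body returns a (mode, state) pair instead of mutating variables, and a non-normal mode bubbles up out of the loop.

```python
# Purely functional version of the Yul `rugby` example: the loop body
# returns a (mode, state) pair instead of mutating variables in place.
TT, LEAVE = "tt", "leave"  # normal continuation vs premature function exit

def rugby() -> int:
    def body(i: int, x: int):
        x = x + i          # x := add(x, i), as a rebinding, not a mutation
        if i == 5:
            return (LEAVE, (i, x))
        return (TT, (i, x))

    i, x = 0, 0
    while i < 10:          # for { } lt(i, 10) { i := add(i, 1) }
        mode, (i, x) = body(i, x)
        if mode == LEAVE:  # bubble up: skip the post-iteration and return
            return x
        i = i + 1          # post: i := add(i, 1)
    return x
```

Running this gives the same result as the Yul function: the sum 0 + 1 + 2 + 3 + 4 + 5 = 15, since the loop leaves right after adding i = 5.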
Our solution is a tool that performs an effect inference on the Yul code to determine which variables might be mutated at each point of the program, and then propagates the results both in the case where the execution continues to the next instruction and in the case where it bubbles up.
We wrote our tool in 🐍 Python, for ease of development, parsing the Yul code from the JSON output of the Solidity compiler and outputting a Coq file that represents the functional version of the code. Yul is a rather pleasant language to work with, optimized for formal verification and with very few constructs. Our code is available on our GitHub repository github.com/formal-land/coq-of-solidity, in a pull request that is about to be merged.
Here is the header of our main Python function, which translates Yul statements to Coq:
def statement_to_coq(node) -> tuple[Callable[[set[str]], str], set[str], set[str]]:
It takes a JSON node corresponding to a statement (assignment, if, for, leave, ...) and returns a triple describing its translation and its effects. From this information, we can infer the variables that are mutated at each point of the program and propagate them.
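To give an idea of the effect inference, here is a much-simplified Python sketch (not the actual tool) computing the set of variables that a Yul statement may assign; the node shapes are a hypothetical subset of the solc JSON AST.

```python
# Simplified effect inference: collect the variables that a Yul statement
# may assign, so we know which parts of the state to thread through the
# functional translation. Node shapes follow the solc Yul JSON AST
# (YulAssignment, YulBlock, YulIf, YulForLoop), simplified.
def assigned_vars(node: dict) -> set[str]:
    kind = node["nodeType"]
    if kind == "YulAssignment":
        return {var["name"] for var in node["variableNames"]}
    if kind == "YulBlock":
        result: set[str] = set()
        for stmt in node["statements"]:
            result |= assigned_vars(stmt)
        return result
    if kind == "YulIf":
        return assigned_vars(node["body"])
    if kind == "YulForLoop":
        return (assigned_vars(node["pre"])
                | assigned_vars(node["body"])
                | assigned_vars(node["post"]))
    return set()  # leave, break, continue, expression statements, ...
```

On the rugby example, this would report that the loop body assigns x and the post-block assigns i, which is exactly the (i, x) state threaded through the generated Coq code.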
As an example, here is the generated Coq translation of our 🏉 rugby example above:
Definition rugby : M.t U256.t :=
let~ '(_, result) :=
let~ i := [[ 0 ]] in
let~ x := [[ 0 ]] in
let_state~ '(i, x) :=
(* for loop *)
Shallow.for_
(* init state *)
(i, x)
(* condition *)
(fun '(i, x) => [[
lt ~( i, 10 )
]])
(* body *)
(fun '(i, x) =>
Shallow.lift_state_update
(fun x => (i, x))
(let~ x := [[ add ~( x, i ) ]] in
let_state~ 'tt := [[
Shallow.if_ (
eq ~( i, 5 ),
M.pure (BlockUnit.Leave, tt),
tt
)
]] default~ x in
M.pure (BlockUnit.Tt, x)))
(* post *)
(fun '(i, x) =>
Shallow.lift_state_update
(fun i => (i, x))
(let~ i := [[ add ~( i, 1 ) ]] in
M.pure (BlockUnit.Tt, i)))
default~ x in
M.pure (BlockUnit.Tt, x)
in
M.pure result.
On lines 3 and 4, we see that we use normal let declarations for the variables i and x:
let~ i := [[ 0 ]] in
let~ x := [[ 0 ]] in
The notation let~ is a monadic notation to represent the side effects of the EVM (storage updates, contract calls, ...), but the variables i and x are plain Coq variables, which will facilitate the formal verification process later.
On line 5, we see that we consider the for loop to have a two-variable state (i, x):
let_state~ '(i, x) :=
(* for loop *)
Shallow.for_
(* init state *)
(i, x)
The condition depends on the whole state, even if it only uses a part of it:
(* condition *)
(fun '(i, x) => [[
lt ~( i, 10 )
]])
The body is more interesting. We only modify the variable x, but we need to read and return the whole state (i, x), so we start with a lift operation:
(* body *)
(fun '(i, x) =>
Shallow.lift_state_update
(fun x => (i, x))
Then we update the variable x with a standard variable declaration, as if the variable were immutable:
(let~ x := [[ add ~( x, i ) ]] in
The updated value of the variable x is propagated at the end of the body:
M.pure (BlockUnit.Tt, x)))
This is how we translate the inner if:
let_state~ 'tt := [[
Shallow.if_ (
eq ~( i, 5 ),
M.pure (BlockUnit.Leave, tt),
tt
)
]] default~ x in
If the condition is satisfied, we return the special value BlockUnit.Leave, which will be interpreted as a premature exit of the function and activate the bubble-up mechanism. The associated state is the special empty value tt, as there are no mutations in the if statement. We use default~ x on the next line to say that we complete the tt state with the value x if we are bubbling up.
The binding of the expression of default~ is done after the let_state~ so that we can retrieve parts of the state that might have been modified, if needed. This is, for example, the case for the for loop, where we first get the values of the two variables i and x:
let_state~ '(i, x) :=
(* for loop *)
and then propagate only the state x in case of a premature exit:
default~ x in
on line 33.
The monad we use to represent the bubbleup mechanism is the following:
Module Shallow.
Definition t (State : Set) : Set :=
M.t (BlockUnit.t * State).
where:
- M.t is the monad representing the side effects of the EVM,
- BlockUnit.t is a type representing the different modes of the bubble-up mechanism: no bubble-up, or a bubble-up with a break, continue, or leave instruction,
- State is the type of the current state that we might be writing to.

We define the notation let_state~ ... default~ ... in with:
Notation "'let_state~' pattern ':=' e 'default~' state 'in' k" :=
(let_state e (fun pattern => (state, k)))
and the function:
Definition let_state {State1 State2 : Set}
(expression : t State1) (body : State1 -> State2 * t State2) :
t State2 :=
M.strong_let_ expression (fun value =>
let '(mode, state1) := value in
match mode with
(* no bubble-up, do not use the default state *)
| BlockUnit.Tt => snd (body state1)
(* bubble-up, use the default state and keep the same bubble-up mode *)
| _ => M.pure (mode, fst (body state1))
end).
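The behavior of let_state can be mirrored in Python to see the two branches at work (a sketch with our own encoding: a computation is a (mode, state) pair, and the body returns the default state together with the continuation's result):

```python
# Mirrors Shallow.let_state: `value` is a (mode, state1) pair; `body` maps
# state1 to (default_state2, k) where k is the continuation's (mode, state2).
TT = "tt"  # normal mode; anything else ("break", "continue", "leave") bubbles up

def let_state(value, body):
    mode, state1 = value
    default_state2, k = body(state1)
    if mode == TT:
        return k                       # no bubble-up: run the continuation
    return (mode, default_state2)      # bubble-up: keep the mode, use the default state
```

In the normal mode the continuation's result is used; in a bubble-up mode the continuation is discarded and the default state is propagated with the same mode.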
You can also look at the definitions of the Shallow.if_ and Shallow.for_ functions in our code. For loops, we use a non-termination effect of the underlying monad M.t, because loops can be infinite, which is not allowed in Coq.
We are using the new translation above to formally verify a hand-optimized Yul implementation of cryptographic operations that uses loops and mutations for efficiency. We believe that this translation would work just as well for other examples of Yul code, enabling the formal verification of arbitrary Solidity or Yul code in a more functional way.
We have shown how we can automatically translate arbitrary Yul code into a purely functional form 🌟, excluding mutations of the memory and the storage, in order to simplify further formal verification work 🙂.
Work that remains to be done is to prove that this transformation is correct, by showing it equivalent to our initial and simpler Yul semantics where variables are represented as string keys in a map. We believe this is possible by generating a proof on a case-by-case basis for each transformed program, working by unification and exploring all the branches. But this remains to be done.
In this blog post, we present how we test this translation to ensure it is correct, by running the typechecker on each opcode of the Move bytecode and comparing the results with the Rust code, testing both the success and error cases.
The typechecker of Move Sui is a large piece of Rust code with a core function verify_instr in move-bytecode-verifier/src/type_safety.rs that typechecks each individual instruction of the Move bytecode. There are exactly 77 different opcodes. To give you an example, here is how it typechecks the opcode Add:
let operand1 = safe_unwrap_err!(verifier.stack.pop());
let operand2 = safe_unwrap_err!(verifier.stack.pop());
if operand1.is_integer() && operand1 == operand2 {
verifier.push(meter, operand1)?;
} else {
return Err(verifier.error(StatusCode::INTEGER_OP_TYPE_MISMATCH_ERROR, offset));
}
The Move virtual machine is stack-based. The typechecker maintains a stack of types, corresponding to the types of the values that should be on the stack at the current point of the execution. For the Add operation, it pops the last two types off the stack, checks that they are integers and equal, and pushes the result type onto the stack. The result of an addition has the same type as the operands. In case of an error, it returns the status code INTEGER_OP_TYPE_MISMATCH_ERROR.
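The logic of this case can be sketched in Python with a plain list as the type stack (a simplified illustration with our own names, not the Move Sui API):

```python
# Simplified sketch of typechecking the Add opcode on a stack of types.
INTEGER_TYPES = {"U8", "U16", "U32", "U64", "U128", "U256"}

def check_add(stack: list[str]) -> None:
    """Pop two operand types; they must be equal integer types.
    Push the result type (same as the operands) back on the stack.
    Raises ValueError with a status-code-like message on failure."""
    if len(stack) < 2:
        raise ValueError("EMPTY_VALUE_STACK")
    operand1 = stack.pop()
    operand2 = stack.pop()
    if operand1 in INTEGER_TYPES and operand1 == operand2:
        stack.append(operand1)
    else:
        raise ValueError("INTEGER_OP_TYPE_MISMATCH_ERROR")
```

The success path leaves one type on the stack (the operands' common type); mixing types or using a non-integer type hits the mismatch error, as in the Rust code.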
We translate this code to Coq in the following way:
letS! operand1 :=
liftS! TypeSafetyChecker.lens_self_stack AbstractStack.pop in
letS! operand1 := return!toS! $ safe_unwrap_err operand1 in
letS! operand2 :=
liftS! TypeSafetyChecker.lens_self_stack AbstractStack.pop in
letS! operand2 := return!toS! $ safe_unwrap_err operand2 in
if andb
(SignatureToken.is_integer operand1)
(SignatureToken.t_beq operand1 operand2)
then
TypeSafetyChecker.Impl_TypeSafetyChecker.push operand1
else
returnS! $ Result.Err $ TypeSafetyChecker.Impl_TypeSafetyChecker.error
verifier StatusCode.INTEGER_OP_TYPE_MISMATCH_ERROR offset
The two code extracts above look very similar, but how can we make sure that they are indeed the same, and that we made no typos or misunderstandings in the 3,200 lines of translation?
To answer that question, we chose to write unit tests on the Rust side covering all the execution paths (success and error, all the opcodes) and to run the same tests on the Coq side after a manual/AI-assisted translation of these tests. We then compare the results of the tests to ensure that the Coq code behaves exactly like the Rust code.
The tests on the Rust side are in the file move-bytecode-verifier/src/type_safety_tests/mod.rs, a 3,000-line file with 176 tests. For example, for the addition we have:
#[test]
fn test_arithmetic_correct_types() {
for instr in vec![
Bytecode::Add,
Bytecode::Sub,
Bytecode::Mul,
Bytecode::Mod,
Bytecode::Div,
Bytecode::BitOr,
Bytecode::BitAnd,
Bytecode::Xor,
] {
for push_ty_instr in vec![
Bytecode::LdU8(42),
Bytecode::LdU16(257),
Bytecode::LdU32(89),
Bytecode::LdU64(94),
Bytecode::LdU128(Box::new(9999)),
Bytecode::LdU256(Box::new(U256::from(745_u32))),
] {
let code = vec![push_ty_instr.clone(), push_ty_instr.clone(), instr.clone()];
let module = make_module(code);
let fun_context = get_fun_context(&module);
let result = type_safety::verify(&module, &fun_context, &mut DummyMeter);
assert!(result.is_ok());
}
}
}
There are four other tests covering the error cases (missing arguments, wrong types, ...).
One of the difficulties in these tests, apart from their size, is that we need to initialize the module variable with the proper content to be able to typecheck some of the instructions. We defined some helpers for that, such as:
fn add_simple_struct_with_abilities(module: &mut CompiledModule, abilities: AbilitySet) {
let struct_def = StructDefinition {
struct_handle: StructHandleIndex(0),
field_information: StructFieldInformation::Declared(vec![FieldDefinition {
name: IdentifierIndex(5),
signature: TypeSignature(SignatureToken::U32),
}]),
};
let struct_handle = StructHandle {
module: ModuleHandleIndex(0),
name: IdentifierIndex(0),
abilities: abilities,
type_parameters: vec![],
};
module.struct_defs.push(struct_def);
module.struct_handles.push(struct_handle);
}
which is used in 26 tests involving struct data structures.
We translated the tests using the same approach as for the typechecker, with the same monadic representation of effects. For example, we represent in Coq the arithmetic test above as:
Definition test_arithmetic_correct_types
(instr push_ty_instr : Bytecode.t) :
M!? PartialVMError.t unit :=
let code := [push_ty_instr; push_ty_instr; instr] in
let module := make_module code in
let! fun_context := get_fun_context module in
verify module fun_context.
Goal List.Forall
(fun instr =>
List.Forall
(fun push_ty_instr =>
test_arithmetic_correct_types instr push_ty_instr = return!? tt
)
[
Bytecode.LdU8 42;
Bytecode.LdU16 257;
Bytecode.LdU32 89;
Bytecode.LdU64 94;
Bytecode.LdU128 9999;
Bytecode.LdU256 745
]
)
[
Bytecode.Add;
Bytecode.Sub;
Bytecode.Mul;
Bytecode.Mod;
Bytecode.Div;
Bytecode.BitOr;
Bytecode.BitAnd;
Bytecode.Xor
].
Proof.
repeat constructor.
Qed.
We convert the test that iterates over assertions into an anonymous proof goal using the List.Forall predicate to verify a series of equalities. The List.Forall predicate states that a given property holds for all the elements of a list.
Fortunately for us, GitHub Copilot was extremely efficient at translating these tests, with a success rate of about 95% (we did not make a precise measurement). The end result is in move_sui/simulations/move_bytecode_verifier/type_safety_tests/mod.v, which contains more than 6,000 lines of Coq code excluding comments.
About 20% of our translated Coq tests failed 💥, which we actually consider a very good result 💪, as the translated Coq code of the typechecker had never been run before. Apart from one misunderstanding of the Rust code, all the issues were due to typos in the translation. We had about a dozen of them, such as a missing negation in a condition, some of them generating multiple test failures. It took about one day to fix all of them by changing our Coq translation of the typechecker accordingly. Now all the tests pass 🎉!
A few errors were also due to incorrectly translated tests, typically with a missing line. We did a manual review, but we do not know for sure whether there are tests with a mistake that by chance masks an error in the translation of the typechecker. We have not seen any such case yet.
We now have an idiomatic 🐓 Coq translation of the Move bytecode typechecker written in Rust. In addition, we test the result of this translation for every opcode and error case.
Now that we are confident enough in the translation, we can start the specification and formal verification of the typechecker. This will involve reasoning on both the typechecker and the bytecode interpreter, showing that:
This requires translating all the Rust code into idiomatic Coq, on which we will write our specifications and proofs. We write this translation by hand, relying as much as possible on generative AI tools such as GitHub Copilot, as there are many particular cases. We plan, eventually, to prove it equivalent to the translation automatically generated by coq-of-rust.
In this blog post, we present how we organize our 🔮 monad to represent the side effects used in this Rust code. We believe this organization should work for other Rust projects as well.
In functional programming, effects (or side effects) are all the operations that cannot be directly represented as a mathematical function, that is to say, a procedure that returns an output purely based on the value of its inputs and does nothing else. For example, a function returning the current time has an effect, as it depends on a hidden state (the current time) that is not passed as an argument. A function printing a message to the console has an effect, as it modifies the state of the console, in addition to returning a value that is generally either empty or a confirmation of the printing. Arithmetic operations (+, *, ...) are an example of pure functions.
We consider three primitive effects in our Rust code:
- panics, which abort the execution of the program;
- the Result type, used to represent the result of a computation that can fail. It is a sum type with two constructors: Ok for the successful result and Err for the error. The Rust operator ? is used to propagate errors in a function that returns a Result. This is another effect for us;
- mutations of variables or data structures through mutable references &mut.

All our Coq definitions to represent the effects are in the file simulations/M.v.
We define a monad Panic.t to represent the effect of a panic with:
Module Panic.
Inductive t (A : Set) : Set :=
| Value : A -> t A
| Panic {Error : Set} : Error -> t A.
Note that the type Error in this position is an existential type. This has a few consequences:
- we do not need to parameterize the type Panic.t with the type of the error;
- we can choose any type for Error when we trigger a panic operation. This is useful for debugging, as we can add any payload to the panic message to help us understand what went wrong.

We define the monadic return and bind operations as usual:
Definition return_ {A : Set} (value : A) : t A := Value value.
Definition bind {A B : Set} (value : t A) (f : A -> t B) : t B :=
match value with
| Value value => f value
| Panic error => Panic error
end.
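The same monad can be sketched in Python as tagged tuples, which makes the short-circuiting behavior of bind concrete (our encoding for illustration, not the Coq code):

```python
# Panic monad as tagged tuples: ("value", v) or ("panic", error_payload).
def ret(value):
    return ("value", value)

def panic(error):
    # The error payload can be of any type, mirroring the existential Error.
    return ("panic", error)

def bind(m, f):
    tag, payload = m
    if tag == "value":
        return f(payload)
    return m  # propagate the panic unchanged, skipping f

# Example: a division that panics on zero.
def safe_div(a, b):
    if b == 0:
        return panic(("division by zero", a))
    return ret(a // b)
```

Chaining safe_div with bind either produces a value or propagates the first panic without running the rest of the computation.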
We introduce notations based on the exclamation mark ! to make the code more readable:
Notation "M!" := Panic.t.
Notation "return!" := Panic.return_.
Notation "'let!' x ':=' X 'in' Y" :=
(Panic.bind X (fun x => Y))
(at level 200, x pattern, X at level 100, Y at level 200).
We define the monad Result.t to represent the propagation of errors with the ? operator:
Module Result.
Inductive t (A Error : Set) : Set :=
| Ok : A -> t A Error
| Err : Error -> t A Error.
The difference with the Panic.t monad is that the error type is no longer existential, because we want to be able to compute on the error payload, as some functions depend on the error value.
We define the return and bind operations as:
Definition return_ {A Error : Set} (value : A) : t A Error := Ok value.
Definition bind {Error A B : Set} (value : t A Error) (f : A -> t B Error) : t B Error :=
match value with
| Ok value => f value
| Err error => Err error
end.
The bind corresponds to the question-mark operator ? in Rust. We also introduce notations to make the code more readable:
Notation "M?" := (fun A Error => Result.t Error A).
Notation "return?" := Result.return_.
Notation "'let?' x ':=' X 'in' Y" := ...
Finally, we define the monad State.t 🇺🇸 to represent the effect of one or several mutable references with a mutable state type S:
Module State.
Definition t (State A : Set) : Set := State -> A * State.
Definition return_ {State A : Set} (value : A) : t State A :=
fun state => (value, state).
Definition bind {State A B : Set} (value : t State A) (f : A -> t State B) : t State B :=
fun state =>
let (value, state) := value state in
f value state.
The state S will typically be the tuple of all the current mutable references in the Rust code. We use notations based on the letter S.
We also introduce lens operations that mimic how we can extract a mutable reference to the part of a data structure from a mutable reference to the whole data structure in Rust. Here is the definition of the lens type:
Record t {Big_A A : Set} : Set := {
read : Big_A -> M! A;
write : Big_A -> A -> M! Big_A
}.
The read and write operations correspond to the dereferencing and assignment of a mutable reference in Rust. The type Big_A is the type of the whole data structure, and the type A is the type of the part that we are referencing. These primitives might fail (they are in the panic monad) if the mutable reference is not valid, for example for an out-of-bounds access in an array or an invalid case in an enum.
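A lens can be sketched in Python as a read/write pair, here focusing on one index of a list, with failure on out-of-bounds access (our encoding, reusing a tagged-tuple style for the panic monad):

```python
# A lens focusing on index i of a list, with read/write that may fail,
# mimicking how a Rust &mut to an element is derived from a &mut to the list.
def index_lens(i):
    def read(big):
        if 0 <= i < len(big):
            return ("value", big[i])
        return ("panic", f"index {i} out of bounds")

    def write(big, small):
        if 0 <= i < len(big):
            return ("value", big[:i] + [small] + big[i + 1:])
        return ("panic", f"index {i} out of bounds")

    return read, write
```

A computation on the element (the small state) can then be lifted to a computation on the whole list (the big state) by reading through the lens, running the computation, and writing the result back.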
We can use a lens to lift a computation that operates on a part of a data structure to a computation that operates on the whole data structure. We provide various lift operators to help with this.
Depending on the Rust code we want to translate, we might need to use none, one, or several of the effects above. We explicitly define all the possible combinations of the above monads, as well as return operations to go from one monad to another, more general monad.
The special case is the combination of the panic and state effects. When a panic occurs, we do not return the resulting state, as we are not supposed to continue the evaluation after a panic, so the current state should not be relevant. We lose the information about the state of the program when a panic occurs, which can be a limitation for debugging, but:
The most complete monad combines all the effects:
Module StatePanicResult.
Definition t (State Error A : Set) : Set :=
MS! State (M? Error A).
Definition return_ {State Error A : Set} (value : A) : t State Error A :=
returnS! (Result.Ok value).
Definition bind {State Error A B : Set}
(value : t State Error A)
(f : A > t State Error B) :
t State Error B :=
letS! value := value in
match value with
| Result.Ok value => f value
| Result.Err error => returnS! (Result.Err error)
end.
with the notations:
Notation "MS!?" := StatePanicResult.t.
Notation "returnS!?" := StatePanicResult.return_.
Notation "'letS!?' x ':=' X 'in' Y" := ...
We are repeating our notations a lot, as our three effects and their combinations are very similar. In addition, we always have to explicitly choose in our code which monad we use and add explicit conversions to go from one to another. A future enhancement could be to add some automation at this level, through the use of typeclasses, for example, to automatically infer the monad to use based on the operations used in the code 🦾. For now, we prefer to stay explicit.
To convert code involving for loops 🔁 or manipulations with the .map method of iterators, we introduce effectful versions of the for loop (fold or reduce in functional languages) and of the map method. For example, for the folding operation:
(** The order of parameters is the same as in the source `for` loops. *)
Definition fold_left {State Error A B : Set}
(init : A)
(l : list B)
(f : A -> B -> t State Error A) :
t State Error A :=
List.fold_left (fun acc x => bind acc (fun acc => f acc x)) l (return_ init).
with the notation:
Notation "foldS!?" := StatePanicResult.fold_left.
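The observable behavior of this effectful fold can be sketched in Python over a state+result encoding (our names): the fold threads the state through each step and stops at the first error.

```python
# Effectful fold: `step` takes (acc, item, state) and returns
# (("ok", new_acc) | ("err", e), new_state). The fold threads the state
# and short-circuits on the first error, like foldS!? over StatePanicResult.
def fold_s(init, items, step, state):
    acc = init
    for item in items:
        result, state = step(acc, item, state)
        tag, payload = result
        if tag == "err":
            return (result, state)
        acc = payload
    return (("ok", acc), state)
```

For instance, a step that sums non-negative items while counting iterations in the state returns the partial state reached when an error interrupts the fold.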
Thanks to the definitions and notations above, we were able to translate (manually, with GitHub Copilot) all the code of the typechecker for the Move bytecode into idiomatic Coq code of a size roughly similar to the original Rust code. This translation is available in our folder move_sui/simulations/move_bytecode_verifier 🚀.
In the next post we will present how we tested this translation to be faithful to the original Rust code, waiting to have an efficient way to prove it equivalent.
Formal verification is one of the best techniques to ensure that your code is correct, as it checks every possible input ✨ of your program. For a long time, formal verification was reserved for specific fields, such as the space industry 🧑‍🚀. We are making this technology accessible to the blockchain industry and general programming thanks to the tools and services we develop, like coq-of-solidity and coq-of-rust.
We have existed for three years, focusing on formal verification for the web3 industry to validate software 🛡️ where safety is of paramount importance. Formal verification is a technique to analyze the code of a program that relies on making a mathematical proof that the code is correct, a proof that is furthermore checked by a computer 🤓 to make sure there are absolutely no missing cases! As programs are made of 0s and 1s and are fully deterministic, obtaining perfect programs is something we can achieve.
We need to rely on a proof system. We exclusively use the 🐓 Coq proof system as it is both:
We choose to verify existing 🗿 code rather than to develop new code written in a style that simplifies formal verification. This is generally harder, but it is also more useful for many of our customers, who have already written code and want to ensure it is correct without rewriting it. Verifying the existing code also enables the verification of optimizations, which generally involve low-level operations that would be forbidden when rewriting the code in a formal verification language.
We verify the actual 🌍 implementation of programs rather than a 🗺️ model of them, to capture all the implementation details, such as integer overflows or the use of specific data structures or libraries. We believe that a lot of bugs are hidden in the details (the devil is in the details), in addition to the high-level design bugs. Verifying the implementation also helps to follow code updates 🪜, as we are able to say that we verified the code for a precise commit hash.
The tool coq-of-ocaml was our first product, analyzing 🐫 OCaml programs by translating the code to Coq. The translation is almost one-to-one in terms of size, for verification work simplified as much as possible. It was initially developed as part of a PhD at Inria and then at the Nomadic Labs company.
We use it to verify properties of the code of the Layer 1 of Tezos with the project Coq Tezos of OCaml. We analyzed a code base of more than 100,000 lines of OCaml code, for which we made a full and automatic translation to the proof system Coq that can be maintained as the code evolves. We verified various properties, including:
Many more properties are yet to be verified, but the project is currently on hold. You can have more information by looking at the project blog!
Our second project is coq-of-rust, to verify Rust programs. Rust is an interesting target as more and more programs are being written in it, especially for projects where security is critical. Even if Rust offers a strong type system, with memory-safe programs by design, there are still many bugs that can happen, like logical bugs or code panicking (a sudden stop of the program) in production due to an out-of-bounds access in an array.
The project coq-of-rust was funded by Aleph Zero to verify the code of their smart contracts.
We manage to translate most Rust programs to the Coq proof system, including the core library 🎉, which is the standard library of Rust. To our knowledge, we are the only ones who have achieved such a translation of the standard library. The generated Coq code is about ten times the size of the initial Rust code. This is quite verbose and related, in particular, to the compilation of match patterns to primitive patterns.
We have a semantics for the translated code, and are working on reasoning principles to show that this translated code is equivalent to a much simpler version (simulations) on which to reason.
As an example, here is the Coq translation of one of the functions of the revm, a Rust implementation of the Ethereum Virtual Machine:
(*
pub fn add<H: Host + ?Sized>(interpreter: &mut Interpreter, _host: &mut H) {
gas!(interpreter, gas::VERYLOW);
pop_top!(interpreter, op1, op2);
*op2 = op1.wrapping_add(*op2);
}
*)
Definition add (ε : list Value.t) (τ : list Ty.t) (α : list Value.t) : M :=
match ε, τ, α with
 [], [ H ], [ interpreter; _host ] =>
ltac:(M.monadic
(let interpreter := M.alloc ( interpreter ) in
let _host := M.alloc ( _host ) in
M.catch_return (
ltac:(M.monadic
(M.read (
let~ _ :=
M.match_operator (
M.alloc ( Value.Tuple [] ),
[
fun γ =>
ltac:(M.monadic
(let γ :=
M.use
(M.alloc (
UnOp.not (
M.call_closure (
M.get_associated_function (
Ty.path "revm_interpreter::gas::Gas",
"record_cost",
[]
),
[
M.SubPointer.get_struct_record_field (
M.read ( interpreter ),
"revm_interpreter::interpreter::Interpreter",
"gas"
);
M.read (
M.get_constant (
"revm_interpreter::gas::constants::VERYLOW"
)
)
]
)
)
)) in
let _ :=
M.is_constant_or_break_match ( M.read ( γ ), Value.Bool true ) in
(* ... more code ... *)
Last but not least, there is the tool coq-of-solidity to translate Solidity smart contracts to Coq. We use the Yul intermediate language of the Solidity compiler to do our translation, with roughly a three-times size increase in the translated code.
We support most of the Solidity instructions, passing 90% of tests of the Solidity compiler. We recently developed a new translation mode that can represent arbitrary Solidity code, or Yul written by hand, in a nice monad, even in case of complex control flow like nested loops with break
and continue
instructions and variable mutations. This is done thanks to our new effect inference engine in coq-of-solidity
to always give a purely functional representation of imperative code.
Compared to other formal analysis tools for Solidity, its strength is the ability to verify arbitrarily complex properties. This is crucial for the verification of cryptographic operations (elliptic curve implementations, zero-knowledge verifiers linking the L1 to the L2s, ...) that are out of reach of standard verification tools. For example, we are currently verifying a hand-optimized Yul implementation of elliptic curve operations.
We have seen what we are proposing at Formal Land to enhance the security of your applications to the best possible level 🌟, with security of mathematical certainty. Next time, we will see how to use the Coq proof system to verify simple properties by following the Coq in a Hurry tutorial 🚀.
We will formally verify the part that checks that the bytecode is well-typed, so that when a smart contract is executed it cannot encounter critical errors. The type checker itself is also written in Rust, and we will verify it using the proof assistant Coq 🐓 and our tool coq-of-rust that translates Rust programs to Coq.
To formally verify your Rust code and ensure it contains no bugs or vulnerabilities, contact us at 📧contact@formal.land.
The cost is €10 per line of Rust code (excluding comments) and €20 per line for concurrent code.
The plan for this project is as follows:
One of the steps is the formal verification of the code as translated by coq-of-rust. This part will give more precise results than the tests, as it will cover all possible inputs and states of the program. The complexity of this part is to go through all the details that exist in the Rust code, like the use of references to manipulate the memory, the macros after expansion, and the parts of the Rust standard library that the code depends on.

For now, we have written a simulation for the type checker in CoqOfRust/move_sui/simulations/move_bytecode_verifier/type_safety.v, and we are now testing and verifying it.
In the following blog posts, we will describe how we structured the simulations and how we are testing or verifying them.
This project is kindly funded by the Sui Foundation, which we thank for their trust and support.
The proofs are still tedious for now, as there are around 1,000 lines of proofs for 100 lines of Solidity. We plan to automate this work as much as possible in the subsequent iterations of the tool. One good thing about the interactive theorem prover Coq is that we know we can never be stuck, so we can always make progress in our proof techniques and verify complex properties even if it takes time ✨.
Formal verification with an interactive proof assistant is the strongest way to verify programs since:
To audit your smart contracts and make sure they contain no bugs, contact us at 📧contact@formal.land.
We refund our work if we missed a high/critical severity bug.
We specify the ERC20 smart contract by writing an equivalent version in Coq that acts as a functional specification. In this specification, we ignore the emit
operations that are logging events in Solidity and the precise payload of revert operations (we only say that "a revert occurs"). We make all our arithmetic operations on Z,
the type of unbounded integers, with explicit overflow checks.
For example, here is the _transfer
function of the Solidity smart contract:
function _transfer(address from, address to, uint256 value) internal {
require(to != address(0), "ERC20: transfer to the zero address");
// The subtraction and addition here will revert on overflow.
_balances[from] = _balances[from] - value;
_balances[to] = _balances[to] + value;
emit Transfer(from, to, value);
}
We specify it in the file erc20.v by:
Definition _transfer (from to : Address.t) (value : U256.t) (s : Storage.t)
: Result.t Storage.t :=
if to =? 0 then
revert_address_null
else if balanceOf s from <? value then
revert_arithmetic
else
let s :=
s <| Storage.balances :=
Dict.declare_or_assign s.(Storage.balances) from (balanceOf s from - value)
|> in
if balanceOf s to + value >=? 2 ^ 256 then
revert_arithmetic
else
Result.Success s <| Storage.balances :=
Dict.declare_or_assign s.(Storage.balances) to (balanceOf s to + value)
|>.
With the Coq notation:
storage <| field := new_value |>
we modify a storage element as in the equivalent Solidity:
field = new_value;
With the two tests:
if balanceOf s from <? value then
if balanceOf s to + value >=? 2 ^ 256 then
we make explicit the overflow checks that are implicit in the Solidity code.
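As an illustration of these explicit checks, here is a small Python model of the _transfer specification. It is not part of the tool; all names (transfer, Revert, MAX_U256) are ours, and it only sketches the intent of the Coq code:

```python
# Hypothetical Python model of the Coq _transfer specification.
# All names (transfer, Revert, MAX_U256) are ours, for illustration only.

MAX_U256 = 2 ** 256

class Revert(Exception):
    """Signals a revert, as the Result.t type does in the Coq specification."""

def transfer(balances: dict, from_addr: int, to_addr: int, value: int) -> dict:
    if to_addr == 0:
        raise Revert("transfer to the zero address")
    if balances.get(from_addr, 0) < value:
        raise Revert("arithmetic")  # the implicit underflow check, made explicit
    new_balances = dict(balances)
    new_balances[from_addr] = new_balances.get(from_addr, 0) - value
    if new_balances.get(to_addr, 0) + value >= MAX_U256:
        raise Revert("arithmetic")  # the implicit overflow check, made explicit
    new_balances[to_addr] = new_balances.get(to_addr, 0) + value
    return new_balances

assert transfer({1: 100}, 1, 2, 40) == {1: 60, 2: 40}
```

A transfer that would overflow or draw from an insufficient balance raises Revert instead of silently wrapping, mirroring the Result.t type of the Coq specification.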
A Solidity smart contract has two public functions:
We will focus on the second one. It takes the contract's payload in a specific format:
This blog article Deconstructing a Solidity Contract - Part III: The Function Selector from OpenZeppelin gives more information about it. In Coq, we represent the payload of a contract with a sum type:
Module Payload.
Inductive t : Set :=
| Transfer (to: Address.t) (value: U256.t)
| Approve (spender: Address.t) (value: U256.t)
| TransferFrom (from: Address.t) (to: Address.t) (value: U256.t)
| IncreaseAllowance (spender: Address.t) (addedValue: U256.t)
| DecreaseAllowance (spender: Address.t) (subtractedValue: U256.t)
| TotalSupply
| BalanceOf (owner: Address.t)
| Allowance (owner: Address.t) (spender: Address.t).
End Payload.
We define how to get this payload from the binary representation:
Definition of_calldata (callvalue : U256.t) (calldata: list U256.t) :
option Payload.t :=
if Z.of_nat (List.length calldata) <? 4 then
None
else
let selector := Stdlib.Pure.shr (256 - 32) (StdlibAux.get_calldata_u256 calldata 0) in
if selector =? get_selector "approve(address,uint256)" then
let to := StdlibAux.get_calldata_u256 calldata (4 + 32 * 0) in
let value := StdlibAux.get_calldata_u256 calldata (4 + 32 * 1) in
if negb (callvalue =? 0) then
None
else if negb (get_have_enough_calldata (32 * 2) calldata) then
None
else if negb (get_is_address_valid to) then
None
else
Some (Approve to value)
else if selector =? get_selector "totalSupply()" then
(* ... other cases ... *)
The callvalue
is the amount of Ether sent with the transaction, which has to be zero for non-payable functions. The calldata
is the list of bytes of the payload of the transaction. We check that the length of the payload is at least 4 bytes, then we extract the selector and the arguments of the function. We check that the arguments are valid, and we return the corresponding payload or None
in case of error.
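The shift-based selector extraction can be sketched in Python. The function name selector_of_word is ours; 0x095ea7b3 is the standard selector of approve(address,uint256):

```python
# Sketch of the selector extraction performed by of_calldata (names are ours).
# The first 32-byte calldata word holds the 4-byte selector in its top bits,
# so a right shift by 256 - 32 bits recovers it.

def selector_of_word(word: int) -> int:
    return word >> (256 - 32)

# 0x095ea7b3 is the standard selector of approve(address,uint256).
word = 0x095EA7B3 << (256 - 32)  # selector followed by zero bytes
assert selector_of_word(word) == 0x095EA7B3
```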
Note that a lot of the code is very repetitive and can be generated automatically by AI. For example, the definition of the Payload.t
type was automatically generated by Claude.ai in one shot, with the code of the smart contract and its specification in context.
Here is the lemma stating that, for any possible user inputs and storage values, the Solidity smart contract and the Coq specification behave exactly the same:
Lemma run_body codes environment state
(s : erc20.Storage.t)
(H_environment : Environment.Valid.t environment)
(H_s : erc20.Storage.Valid.t s) :
let memoryguard := 128 in
let memory_start :=
[0; 0; 0; 0; 0] in
let state_start :=
make_state environment state memory_start (SimulatedStorage.of_erc20_state s) in
let output :=
The functional specification here:
erc20.body
environment.(Environment.caller)
environment.(Environment.callvalue)
s
environment.(Environment.calldata) in
let memory_end_middle :=
[memoryguard; 0] in
let state_end :=
match output with
| erc20.Result.Revert _ _ => None
| erc20.Result.Success (memory_end_beginning, memory_end_end, s) =>
Some (make_state environment state
(memory_end_beginning ++ memory_end_middle ++ memory_end_end)
(SimulatedStorage.of_erc20_state s)
)
end in
{{? codes, environment, Some state_start |
The original code here:
ERC20_403.ERC20_403_deployed.body ⇓
match output with
| erc20.Result.Revert p s => Result.Revert p s
| erc20.Result.Success (_, memory_end_end, _) =>
Result.Return memoryguard (32 * Z.of_nat (List.length memory_end_end))
end
| state_end ?}}.
The proof is done in the same way as in the previous blog post 🪁 Coq of Solidity – part 3 about the verification of the _approve
function. The body of the contract calls all the other functions of the contract, and we reuse the equivalence proofs for the other functions here.
The main difficulty we encountered in the proof was missing information in the specification. For example, our predicate of equivalence requires the memory of the smart contract to have the exact same value as in its specification at the end of execution, except in case of revert. This means we needed to also add the final state of the memory to the specification, even if this is an implementation detail. We will refine our equivalence statement in the future to avoid this kind of issue.
For most of the proof, the work was about stepping through both codes and making sure, by automatic unification, that the two are indeed equal.
The development of coq-of-solidity
is made possible thanks to the Aleph Zero project. We thank the Aleph Zero Foundation for their support 🙏.
We have presented how to specify and formally verify a typical smart contract in Solidity, the ERC20 token, using our tool coq-of-solidity
(open-source). In the next post, we will see how to verify an invariant on the code and how the proof system Coq reacts if we introduce a bug.
This is very important as a single bug can lead to the loss of millions of dollars in smart contracts, as we have regularly seen in the past, and we can never be sure that a human review of the code did not miss anything.
Our tool coq-of-solidity
is one of the only tools using an interactive theorem prover for Solidity, together with Clear from Nethermind. This might be the most powerful approach to writing code without bugs, as exemplified in this PLDI paper comparing the reliability of various C compilers: they found numerous bugs in each compiler except in the formally verified one!
In this blog post we show how we functionally specify and verify the _approve
function of an ERC20 smart contract. We will see how we prove that a refined version of the function is equivalent to the original one.
The development of coq-of-solidity
is made possible thanks to the Aleph Zero project. We thank the Aleph Zero Foundation for their support 🙏.
Here is the _approve
function of the Solidity smart contract that we want to specify:
mapping (address => mapping (address => uint256)) private _allowances;
function _approve(address owner, address spender, uint256 value) internal {
require(owner != address(0), "ERC20: approve from the zero address");
require(spender != address(0), "ERC20: approve to the zero address");
_allowances[owner][spender] = value;
emit Approval(owner, spender, value);
}
It modifies an item in the _allowances
map and emits an Approval
event after a few sanity checks. We will now write a functional specification of this function in Coq. The idea is to explain what this function is supposed to do by describing its behavior with idiomatic Coq code. This will be useful to make sure there are no mistakes in the smart contract, although here we have a very simple example. From the functional specification, we will also be able to check higher-level properties of the smart contract, such as the fact that the total amount of tokens is always conserved.
Here is the Coq version of the _approve
function:
Module Storage.
Record t := {
allowances : Dict.t (Address.t * Address.t) U256.t;
(* other fields *)
}.
End Storage.
Definition _approve (owner spender : Address.t) (value : U256.t) (s : Storage.t) :
Result.t Storage.t :=
if (owner =? 0) || (spender =? 0) then
revert_address_null
else
Result.Success s <| Storage.allowances :=
Dict.declare_or_assign s.(Storage.allowances) (owner, spender) value
|>.
It takes the same parameters as the Solidity code: owner
, spender
, value
, and the current state s
of the storage. It returns a Result.t Storage.t
type, which is either a Result.Success
with the new storage after the execution of the _approve
function, or a revert_address_null
if the owner
or spender
is the null address. To create the new storage, we use the corresponding Coq notation and function to update the _allowances
map.
We ignore the emit
primitives for now.
Now let us show that, for any possible owner
, spender
, value
, and storage state s
, the _approve
function in Solidity will behave exactly as the Coq specification.
Here is the Coq translation of the _approve
function as generated by coq-of-solidity
:
Definition fun_approve (var_owner : U256.t) (var_spender : U256.t) (var_value : U256.t) : M.t unit :=
let~ _1 := [[ and ~( var_owner, (sub ~( (shl ~( 160, 1 )), 1 )) ) ]] in
do~ [[
M.if_unit ( iszero ~( _1 ),
let~ memPtr := [[ mload ~( 64 ) ]] in
do~ [[ mstore ~( memPtr, (shl ~( 229, 4594637 )) ) ]] in
do~ [[ mstore ~( (add ~( memPtr, 4 )), 32 ) ]] in
do~ [[ mstore ~( (add ~( memPtr, 36 )), 36 ) ]] in
do~ [[ mstore ~( (add ~( memPtr, 68 )), 0x45524332303a20617070726f76652066726f6d20746865207a65726f20616464 ) ]] in
do~ [[ mstore ~( (add ~( memPtr, 100 )), 0x7265737300000000000000000000000000000000000000000000000000000000 ) ]] in
do~ [[ revert ~( memPtr, 132 ) ]] in
M.pure tt
)
]] in
let~ _2 := [[ and ~( var_spender, (sub ~( (shl ~( 160, 1 )), 1 )) ) ]] in
do~ [[
M.if_unit ( iszero ~( _2 ),
let~ memPtr_1 := [[ mload ~( 64 ) ]] in
do~ [[ mstore ~( memPtr_1, (shl ~( 229, 4594637 )) ) ]] in
do~ [[ mstore ~( (add ~( memPtr_1, 4 )), 32 ) ]] in
do~ [[ mstore ~( (add ~( memPtr_1, 36 )), 34 ) ]] in
do~ [[ mstore ~( (add ~( memPtr_1, 68 )), 0x45524332303a20617070726f766520746f20746865207a65726f206164647265 ) ]] in
do~ [[ mstore ~( (add ~( memPtr_1, 100 )), 0x7373000000000000000000000000000000000000000000000000000000000000 ) ]] in
do~ [[ revert ~( memPtr_1, 132 ) ]] in
M.pure tt
)
]] in
do~ [[ mstore ~( 0x00, _1 ) ]] in
do~ [[ mstore ~( 0x20, 0x01 ) ]] in
let~ dataSlot := [[ keccak256 ~( 0x00, 0x40 ) ]] in
let~ dataSlot_1 := [[ 0 ]] in
do~ [[ mstore ~( 0, _2 ) ]] in
do~ [[ mstore ~( 0x20, dataSlot ) ]] in
let~ dataSlot_1 := [[ keccak256 ~( 0, 0x40 ) ]] in
do~ [[ sstore ~( dataSlot_1, var_value ) ]] in
let~ _3 := [[ mload ~( 0x40 ) ]] in
do~ [[ mstore ~( _3, var_value ) ]] in
do~ [[ log3 ~( _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 ) ]] in
M.pure tt.
We plug into the Solidity compiler and translate the intermediate representation Yul that solc
uses to generate EVM bytecode. We automatically refine the Yul generated by the Solidity compiler, but for now this refinement is limited.
The two M.if_unit
at the beginning correspond to the require
statements in the Solidity code. The revert
statements are used to return an error message to the caller. The mstore
and sstore
functions are used to store values in the memory and the storage of the EVM. The keccak256
function encodes the storage addresses to access the _allowances
map. The log3
function is used to emit an event at the end.
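To illustrate the nested keccak256 pattern used in the translated code for the two-level _allowances mapping, here is a hedged Python sketch. Python's standard library has no Keccak-256, so hashlib.sha3_256 stands in for it (the two differ in a padding byte); only the nesting structure is the point here, and BASE_SLOT and the function names are ours:

```python
import hashlib

# Sketch of the nested hashing that addresses _allowances[owner][spender].
# hashlib.sha3_256 is only a stand-in: the EVM uses Keccak-256, which is not
# in the Python standard library and differs from SHA3-256 in a padding byte.
# BASE_SLOT and all function names are ours.

def h(*words: int) -> int:
    data = b"".join(w.to_bytes(32, "big") for w in words)
    return int.from_bytes(hashlib.sha3_256(data).digest(), "big")

BASE_SLOT = 1  # illustrative slot of the _allowances mapping

def allowance_slot(owner: int, spender: int) -> int:
    # As in the translated code: hash (owner, slot) first, then (spender, inner).
    inner = h(owner, BASE_SLOT)
    return h(spender, inner)

# Distinct (owner, spender) pairs land on distinct storage slots.
assert allowance_slot(0xA, 0xB) != allowance_slot(0xB, 0xA)
```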
This representation of the _approve
function is very verbose as it corresponds exactly to what the source code does and contains a lot of implementation details. Our goal now is to show that this version is equivalent to the functional specification that we wrote by hand.
We express that the functional specification is equivalent to the original one with this lemma:
Lemma run_fun_approve codes environment state
(owner spender : Address.t) (value : U256.t) (s : erc20.Storage.t)
(mem_0 mem_1 mem_3 mem_4 : U256.t)
(H_owner : Address.Valid.t owner)
(H_spender : Address.Valid.t spender) :
let memoryguard := 0x80 in
let memory_start :=
[mem_0; mem_1; memoryguard; mem_3; mem_4] in
let state_start :=
make_state environment state memory_start (SimulatedStorage.of_erc20_state s) in
let output :=
erc20._approve owner spender value s in
let memory_end :=
[spender; erc20.keccak256_tuple2 owner 1; memoryguard; mem_3; value] in
let state_end :=
match output with
| erc20.Result.Revert _ _ => None
| erc20.Result.Success s =>
Some (make_state environment state memory_end (SimulatedStorage.of_erc20_state s))
end in
{{? codes, environment, Some state_start |
ERC20_403.ERC20_403_deployed.fun_approve owner spender value ⇓
match output with
| erc20.Result.Revert p s => Result.Revert p s
| erc20.Result.Success _ => Result.Ok tt
end
| state_end ?}}.
This lemma of equivalence requires some parameters:
- the codes, environment, and state values, that describe the state of the blockchain before the execution of the _approve function,
- the memoryguard value that gives a memory zone that we are safe to use,
- the mem_i variables, as we do not know the exact values of the memory slots before the execution of the function,
- the owner, spender, and value that are the parameters of the _approve function,
- the storage s that is the state of storage of the smart contract before the execution of the _approve function,
- the H_owner and H_spender proofs that the owner and spender are valid addresses. These two proofs are required to execute the function as expected and are always available, thanks to runtime checks made at the entrypoints of the smart contract.

The lemma will hold for any possible values of the parameters above, even if there are infinitely many possibilities. This is the power of formal verification: we can prove that our smart contract is correct for all possible inputs and states.
The core statement uses the predicate:
{{? codes, environment, start_state |
original_code ⇓
refined_code
| end_state ?}}
It says that some original_code
executed in the start_state
environment will give the same output as the refined_code
and will result in the final state end_state
. The state is an option type: either Some
state or None
if the execution reverted. That way we do not have to deal with describing the state after a contract revert, which resets the storage anyway.
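The idea of this judgment can be mimicked in Python with a toy harness (all names are ours): run an imperative original and a pure refinement on the same starting state and compare the results, with None standing for the reverted state:

```python
# Toy Python analogue (our names) of the {{? ... ?}} equivalence judgment:
# both programs must produce the same output and the same final state;
# None would stand for the state after a revert, which is never described.

def equivalent(original, refined, start_state) -> bool:
    out_original, state_original = original(dict(start_state))
    out_refined, state_refined = refined(dict(start_state))
    return out_original == out_refined and state_original == state_refined

def original_inc(state):
    state["x"] = state["x"] + 1  # imperative, in-place update
    return ("ok", state)

def refined_inc(state):
    return ("ok", {**state, "x": state["x"] + 1})  # pure update

assert equivalent(original_inc, refined_inc, {"x": 41})
```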
The statement of equivalence is relatively verbose so there could be mistakes in the way it is stated. This is not really an issue, as the _approve
function is an intermediate function, so the only statement that really matters is the one on the main function of the contract that dispatches to the relevant entrypoint according to the payload of the transaction. There could also be mistakes there, but perhaps we can automatically generate this statement from the Solidity code.
The way we write the proof is interesting. We use Coq as a symbolic debugger, where we execute both the original code and the functional specification until we reach the end of execution for all the branches, always with the same result.
Here is an example of a debugging step (in the proof mode of Coq):
{{?codes, environment,
Some
(make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; mem_4]
[IsStorable.IMap.(IsStorable.to_storable_value) s.(erc20.Storage.balances);
StorableValue.Map2
(Dict.declare_or_assign s.(erc20.Storage.allowances) (owner, spender) value);
StorableValue.U256 s.(erc20.Storage.total_supply)])
|
The original code here:
do~ call (Stdlib.mstore 128 value)
in (do~ call
(Stdlib.log3 128 32
63486140976153616755203102783360879283472101686154884697241723088393386309925
owner spender) in LowM.Pure (Result.Ok tt)) ⇓
The functional specification here:
Result.Ok tt
| Some
(make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; value]
(SimulatedStorage.of_erc20_state
s <| erc20.Storage.allowances := Dict.declare_or_assign s.(erc20.Storage.allowances)
(owner, spender) value |>))?}}
On the original code side we can recognize:
do~ [[ mstore ~( _3, var_value ) ]] in
do~ [[ log3 ~( _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 ) ]] in
M.pure tt
that corresponds to the end of the execution of the _approve
function. On the functional specification, we have:
Result.Ok tt
that ends the execution successfully but does not return anything. This is because we ignore the emit
operation, translated as a log3
Yul primitive. We also ignore the mstore
call as it is only used to fill information for the log3
call.
Here are the various commands to step through the code, encoded as Coq tactics:
- p : final Pure expression
- pn : final Pure expression ignoring the resulting state with a None (for a revert)
- pe : final Pure expression with non-trivial Equality of results
- pr : Yul PRimitive
- prn : Yul PRimitive ignoring the resulting state with a None
- l : step in a Let
- lu : step in a Let by Unfolding
- c : step in a function Call
- cu : step in a function Call by Unfolding
- s : Simplify the goal

These commands verify that the two programs are equivalent as we step through them. As a reference, the proof is in CoqOfSolidity/proofs/ERC20_functional.v:
Proof.
simpl.
unfold ERC20_403.ERC20_403_deployed.fun_approve, erc20._approve.
l. {
now apply run_is_non_null_address.
}
unfold Stdlib.Pure.iszero.
lu.
c; [p].
s.
unfold Stdlib.Pure.iszero.
destruct (owner =? 0); s. {
change (true || _) with true; s.
lu; c. {
apply_run_mload.
}
repeat (
lu ||
cu ||
(prn; intro) ||
s ||
p
).
}
l. {
now apply run_is_non_null_address.
}
lu.
c; [p]; s.
unfold Stdlib.Pure.iszero.
change (false || ?e) with e; s.
destruct (spender =? 0); s. {
lu; c. {
apply_run_mload.
}
repeat (
lu ||
cu ||
(prn; intro) ||
s ||
p
).
}
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
apply_run_keccak256_tuple2.
}
lu.
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
apply_run_keccak256_tuple2.
}
lu; c. {
apply_run_sstore_map2_u256.
}
CanonizeState.execute.
lu; c. {
apply_run_mload.
}
s.
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
p.
}
p.
Qed.
To audit your smart contracts and make sure they contain no bugs, contact us at 📧contact@formal.land.
We refund our work in case we missed any high/critical severity bugs.
We have presented how to functionally specify a function with coq-of-solidity
. In the next blog post we will see how to extend this proof and specification to the entire ERC20 smart contract.
We work by translating the Yul version of a smart contract to the formal language Coq 🐓, in which we then express the code specifications/security properties and formally verify them 🔄. The Yul language is an intermediate language used by the Solidity compiler and others to generate EVM bytecode. Yul is simpler than Solidity and at a higher level than the EVM bytecode, making it a good target for formal verification.
In this blog post we present the recent developments we made to simplify the reasoning 🧠 about Yul programs once translated in Coq.
This development is made possible thanks to the Aleph Zero project. We thank the Aleph Zero Foundation for their support to bring more security to the Web3 space 🙏.
We present here the general workflow to use coq-of-solidity
to make sure your smart contracts contain no bugs 🐛.
The workflow is as follows:
- The coq-of-yul tool generates a first Coq version. This version is very low-level, with, for example, variable names represented by the string of their names.
- A prepare.py script makes as many refinements as possible in the Coq code to make it more readable and easier to reason about. For example, we order the function definitions by the order in which they are used and replace the Yul variables by standard Coq variables.

The code that coq-of-solidity
generates is very verbose. For example, for this Yul function generated by the Solidity compiler to make an addition with overflow check:
function checked_add_uint256(x) -> sum
{
sum := add(x, /** @src 0:419:421 "20" */ 0x14)
/// @src 0:33:3484 "contract ERC20 {..."
if gt(x, sum)
{
mstore(0, shl(224, 0x4e487b71))
mstore(4, 0x11)
revert(0, 0x24)
}
}
we get a Coq translation:
Code.Function.make (
"checked_add_uint256",
["x"],
["sum"],
M.scope (
do! ltac:(M.monadic (
M.assign (
["sum"],
Some (M.call (
"add",
[
M.get_var ( "x" );
[Literal.number 0x14]
]
))
)
)) in
do! ltac:(M.monadic (
M.if_ (
M.call (
"gt",
[
M.get_var ( "x" );
M.get_var ( "sum" )
]
),
M.scope (
do! ltac:(M.monadic (
M.expr_stmt (
M.call (
"mstore",
[
[Literal.number 0];
M.call (
"shl",
[
[Literal.number 224];
[Literal.number 0x4e487b71]
]
)
]
)
)
)) in
do! ltac:(M.monadic (
M.expr_stmt (
M.call (
"mstore",
[
[Literal.number 4];
[Literal.number 0x11]
]
)
)
)) in
do! ltac:(M.monadic (
M.expr_stmt (
M.call (
"revert",
[
[Literal.number 0];
[Literal.number 0x24]
]
)
)
)) in
M.pure BlockUnit.Tt
)
)
)) in
M.pure BlockUnit.Tt
)
)
This is quite long to follow, and even harder to use to write formal proofs. We made a script prepare.py that simplifies the code above to:
Definition checked_add_uint256 (x : U256.t) : M.t U256.t :=
let~ sum := [[ add ~( x, 0x14 ) ]] in
do~ [[
M.if_unit ( gt ~( x, sum ),
do~ [[ mstore ~( 0, (shl ~( 224, 0x4e487b71 )) ) ]] in
do~ [[ mstore ~( 4, 0x11 ) ]] in
do~ [[ revert ~( 0, 0x24 ) ]] in
M.pure tt
)
]] in
M.pure sum.
This is much more readable. We have monadic notations to compose all the primitive Yul functions such as mstore
and revert
, that may cause side effects such as memory mutation or premature return. The code uses standard Coq variables and functions instead of strings, which simplifies the proofs.
To make sure that this transformation is correct, we also generate a Coq proof file that shows that our transformation is correct and that the original and transformed code from prepare.py
are equivalent ✔️.
We can simplify the code even further. For example:
- The functions add, gt, and shl are purely functional, so we could make this property explicit in the Coq code. For now they are called as monadic functions with the notation f ~( arg1, ..., argn ) even though they never have side effects.
- The mstore function stores values at fixed addresses in the memory, here 0 and 4. We could remove these memory operations by introducing named variables to hold the results instead.

We hope to be able to automate as many refinements as possible in the future, but for now we have to do some manual work 🔧.
We manually refine the code by showing that it returns the same result, for every possible input and initial memory state, as a simplified version written by hand. For the checked_add_uint256
function above we use:
Definition simulation_checked_add_uint256 (x y : Z) : Result.t Z :=
if x + y >=? 2 ^ 256 then
Result.Revert 0 0x24
else
Result.Ok (x + y).
Here, all the computations are made with the Z
type of unbounded integers that are simpler to manipulate for the proofs. We use an if
statement to explicitly detect the overflows. The revert statement has the same parameters as in the original code, but we do not fill the memory area 0
to 0x24
anymore. The reason is that we ignore what the revert
returns in our specifications, as this is not relevant for now and also simplifies the proofs.
In the code above we do not manipulate the memory anymore. In general, we do the following kinds of refinements:
- removing the memory writes that only fill the payload of the revert
statement,
- simplifying the keccak256
hash encoding of the addresses,
- abstracting the keccak256
function.

For now these transformations are manual and semi-automated, but we hope to automate them as much as possible in the future. By proving that simulation_checked_add_uint256
behaves as the original checked_add_uint256
function we are sure that we can reason on the simplified code instead of the original one without losing any information 🔍.
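As a sanity check of this kind of simulation, we can compare, in Python, the explicit overflow test on unbounded integers with the wraparound test gt(x, sum) used by the original Yul code. This sketch generalizes from the constant 0x14 to an arbitrary y, and all names are ours:

```python
# Comparing the two overflow tests (names are ours). The Yul code computes
# sum = add(x, y) modulo 2^256 and reverts when gt(x, sum); the simulation
# tests x + y >= 2^256 directly on unbounded integers.

MAX_U256 = 2 ** 256

def yul_overflows(x: int, y: int) -> bool:
    s = (x + y) % MAX_U256  # 256-bit wrapping addition
    return x > s            # the sum wrapped around

def simulation_overflows(x: int, y: int) -> bool:
    return x + y >= MAX_U256

# The two tests agree on values around the 2^256 boundary.
samples = list(range(5)) + list(range(MAX_U256 - 5, MAX_U256))
for x in samples:
    for y in samples:
        assert yul_overflows(x, y) == simulation_overflows(x, y)
```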
To audit your smart contracts with the method above contact us at 📧contact@formal.land.
Compared to other auditing methods, formal verification has the strong advantage of covering all possible execution cases 💪.
We have presented the current status of our work to formally verify smart contracts, especially the refinements steps that make the reasoning possible. In our next posts we will continue seeing how we can verify a full smart contract 🔮.
Formal verification is a technique to test a program on all possible inputs, even when there are infinitely many. This contrasts with traditional testing techniques, which can only execute a finite set of scenarios. As such, it appears to be an ideal way to bring more security to smart contract audits.
In this blog post, we present the formal verification tool coq-of-solidity
that we are developing for Solidity. Its specificities are that:
Here, we present how we translate Solidity code into Coq using the intermediate language Yul. We explain the semantics we use and what remains to be done.
The code is available in our fork of the Solidity compiler at github.com/formal-land/coq-of-solidity.
We reuse the code of the standard Solidity compiler solc
in order to make sure that we can stay in sync with the evolutions of the language and be compatible with all the Solidity features. Thus, our most straightforward path to implementing a translation tool from Solidity to Coq was to fork the C++ code of solc
in github.com/formal-land/coq-of-solidity. We add a new solc
flag --ir-coq
that tells the compiler to also generate a Coq output in addition to the expected EVM bytecode.
At first, we looked at the direct translation from the Solidity language to Coq, but this was getting too complex. We changed our strategy to instead target the Yul language, an intermediate language used by the Solidity compiler as an intermediate step in its translation to the EVM bytecode. The Yul language is simpler than Solidity and still at a higher level than the EVM bytecode, making it a good target for formal verification. In contrast to the EVM bytecode, there are no explicit stack-manipulation or goto
instructions in Yul, simplifying formal verification.
To give an idea of the size difference between Solidity and Yul, here are the files to export these languages to JSON in the Solidity compiler:
The Solidity language appears more complex than Yul, as the code to translate it to JSON is five times longer.
We copied the file libyul/AsmJsonConverter.cpp
above to make a version that translates Yul to Coq: libyul/AsmCoqConverter.cpp. We reused the code for compilation flags to add a new option --ir-coq
, which runs the conversion to Coq instead of the conversion to JSON.
To limit the size of the generated Coq code, we translate the Yul code after the optimization passes. This helps to remove boilerplate code but may make the Yul code less relatable to the Solidity sources. Thankfully, the optimized Yul code is still readable in our tests, and the Solidity compiler can prettyprint a version of the optimized Yul code with comments to quote the corresponding Solidity source code.
As an example, here is how we translate the if keyword of Yul:
std::string AsmCoqConverter::operator()(If const& _node)
{
yulAssert(_node.condition, "Invalid if condition.");
std::string ret = "M.if_ (\n";
m_indent++;
ret += indent() + std::visit(*this, *_node.condition) + ",\n";
ret += indent() + (*this)(_node.body) + "\n";
m_indent--;
ret += indent() + ")";
return ret;
}
We convert each Yul _node
to an std::string
that represents the Coq code. We use the m_indent
variable to keep track of the indentation level, and the indent()
function to add the right number of spaces at the beginning of each line. We do not need to add extra parentheses to disambiguate precedence, as the Yul language is simple enough.
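The same printing pattern can be sketched in Python with a toy visitor. This is not the actual converter; the class and method names are ours:

```python
# Toy Python version (our names) of the printing pattern in AsmCoqConverter:
# a visitor carries an indentation counter and emits one string per AST node.

class CoqPrinter:
    def __init__(self):
        self.m_indent = 0

    def indent(self) -> str:
        return "  " * self.m_indent

    def visit_if(self, condition: str, body: str) -> str:
        ret = "M.if_ (\n"
        self.m_indent += 1
        ret += self.indent() + condition + ",\n"
        ret += self.indent() + body + "\n"
        self.m_indent -= 1
        ret += self.indent() + ")"
        return ret

assert CoqPrinter().visit_if("iszero x", "M.pure tt") == (
    "M.if_ (\n  iszero x,\n  M.pure tt\n)"
)
```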
Here is the generated Coq code for the beginning of the erc20.sol example from the Solidity compiler's test suite:
(* Generated by solc *)
Require Import CoqOfSolidity.CoqOfSolidity.
Module ERC20_403.
Definition code : M.t BlockUnit.t :=
do* ltac:(M.monadic (
M.function (
"allocate_unbounded",
[],
["memPtr"],
do* ltac:(M.monadic (
M.assign (
["memPtr"],
Some (M.call (
"mload",
[
[Literal.number 64]
]
))
)
)) in
M.od
)
)) in
do* ltac:(M.monadic (
M.function (
"revert_error_ca66f745a3ce8ff40e2ccaf1ad45db7774001b90d25810abd9040049be7bf4bb",
[],
[],
do* ltac:(M.monadic (
M.expr_stmt (
M.call (
"revert",
[
[Literal.number 0];
[Literal.number 0]
]
)
)
)) in
M.od
)
)) in
(* ... 6,000 remaining lines ... *)
This code is quite verbose, for an original smart contract size of 100 lines of Solidity. As a reference, the corresponding Yul code is 1,000 lines long and starts with:
/// @use-src 0:"erc20.sol"
object "ERC20_403" {
    code {
        function allocate_unbounded() -> memPtr
        { memPtr := mload(64) }
        function revert_error_ca66f745a3ce8ff40e2ccaf1ad45db7774001b90d25810abd9040049be7bf4bb()
        { revert(0, 0) }
        // ... 1,000 remaining lines ...
The content is actually the same up to notation, but we use many more line breaks and keywords in the Coq version.
Now that the code is translated to Coq, we need to define a runtime for it. This means giving a definition for all the functions and types that are used in the generated code, such as `M.t BlockUnit.t`, `M.monadic`, `M.function`, ... This runtime gives the semantics of the Yul language, that is to say, the meaning of all its primitives.
We first define a monadic notation `ltac:(M.monadic ...)` to apply a monadic transformation to the generated code. We reuse here what we have done for our Rust translation to Coq, which we describe in our blog post 🦀 Monadic notation for the Rust translation. The notation:

f ( x_1, ..., x_n )

corresponds to the call of a monadic function. The tactic `M.monadic` automatically chains all these calls using the monadic bind operator.

The `do* ... in ...` notation is another monadic notation to chain monadic expressions, directly calling the monadic bind. This notation is more explicit, and we use it in combination with the `ltac:(M.monadic ...)` notation as it might be more efficient for type-checking very large files.
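For intuition, the `do* ... in ...` notation can be seen as a thin layer over the bind of the monad. Here is a minimal sketch of how such a notation can be declared; the actual definition in coq-of-solidity may differ in its precedence levels and in the bind it calls:

```coq
(* Sketch only: [M.let_] stands for the monadic bind of the runtime. *)
Notation "'do*' a 'in' b" :=
  (M.let_ a (fun _ => b))
  (at level 200, b at level 200).
```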
To represent the side effects in Yul, we use the following Coq monad, which we define in `CoqOfSolidity/CoqOfSolidity.v`:
Module U256.
Definition t := Z.
End U256.
Module Environment.
Record t : Set := {
caller : U256.t;
(** Amount of wei sent to the current contract *)
callvalue : U256.t;
calldata : list Z;
(** The address of the contract. *)
address : U256.t;
}.
End Environment.
Module BlockUnit.
(** The return value of a code block. *)
Inductive t : Set :=
(** The default value in case of success *)
| Tt
(** The instruction `break` was called *)
| Break
(** The instruction `continue` was called *)
| Continue
(** The instruction `leave` was called *)
| Leave.
End BlockUnit.
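A loop interpreter can then dispatch on this value to decide whether to run the next iteration. As an illustration (this helper is our own, not part of the runtime), the test for exiting the current loop could be:

```coq
(* [Break] stops the loop; [Leave] also stops it and further propagates to
   exit the enclosing function; [Tt] and [Continue] go to the next iteration. *)
Definition exits_loop (unit : BlockUnit.t) : bool :=
  match unit with
  | BlockUnit.Break | BlockUnit.Leave => true
  | BlockUnit.Tt | BlockUnit.Continue => false
  end.
```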
Module Result.
(** A wrapper for the result of an expression or a code block. We can either return a normal value
with [Ok], or a special instruction [Return] that will stop the execution of the contract. *)
Inductive t (A : Set) : Set :=
| Ok (output : A)
| Return (p s : U256.t)
| Revert (p s : U256.t).
Arguments Ok {_}.
Arguments Return {_}.
Arguments Revert {_}.
End Result.
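The bind over `Result.t` then short-circuits on `Return` and `Revert`, so that a `revert(p, s)` anywhere in a block aborts the rest of the block. A minimal sketch of such a bind (our own illustration; the runtime's version also threads the state):

```coq
Definition bind {A B : Set} (r : Result.t A) (f : A -> Result.t B) : Result.t B :=
  match r with
  | Result.Ok output => f output
  | Result.Return p s => Result.Return p s
  | Result.Revert p s => Result.Revert p s
  end.
```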
Module Primitive.
(** We group together primitives that share being impure functions operating over the state. *)
Inductive t : Set -> Set :=
| OpenScope : t unit
| CloseScope : t unit
| GetVar (name : string) : t U256.t
| DeclareVars (names : list string) (values : list U256.t) : t unit
| AssignVars (names : list string) (values : list U256.t) : t unit
| MLoad (address length : U256.t) : t (list Z)
| MStore (address : U256.t) (bytes : list Z) : t unit
| SLoad (address : U256.t) : t U256.t
| SStore (address value : U256.t) : t unit
| RLoad : t (list Z)
| TLoad (address : U256.t) : t U256.t
| TStore (address value : U256.t) : t unit
| Log (topics : list U256.t) (payload : list Z) : t unit
| GetEnvironment : t Environment.t
| GetNonce : t U256.t
| GetCodedata (address : U256.t) : t (list Z)
| CreateAccount (address code : U256.t) (codedata : list Z) : t unit
| UpdateCodeForDeploy (address code : U256.t) : t unit
| LoadImmutable (name : U256.t) : t U256.t
| SetImmutable (name value : U256.t) : t unit
(** The call stack is there to debug the semantics of Yul. *)
| CallStackPush (name : string) (arguments : list (string * U256.t)) : t unit
| CallStackPop : t unit.
End Primitive.
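Each constructor only describes an effect; executing it is the job of an interpreter over a concrete state. As a sketch of what handling the storage primitives could look like (the state here is our own toy association list, not the actual state of the runtime):

```coq
(* A toy storage: an association list from addresses to values, with 0 as the
   default value, as in the EVM. Assumes [List] and [ZArith] are imported. *)
Definition storage : Set := list (U256.t * U256.t).

Definition sload (s : storage) (address : U256.t) : U256.t :=
  match List.find (fun '(key, _) => Z.eqb key address) s with
  | Some (_, value) => value
  | None => 0%Z
  end.

Definition sstore (s : storage) (address value : U256.t) : storage :=
  (address, value) :: s.
```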
Module LowM.
Inductive t (A : Set) : Set :=
| Pure (output : A)
| Primitive