We continue to strengthen the security of smart contracts with our tool coq-of-solidity π οΈ. It checks for vulnerabilities or bugs in Solidity code. It uses formal verification with an interactive theorem prover (CoqΒ π) to make sure that we cover:
- all possible user inputs/storage states, even if there are infinite possibilities,
- for any security properties.
This is very important as a single bug can lead to the loss of millions of dollars in smart contracts, as we have regularly seen in the past, and we can never be sure that a human review of the code did not miss anything.
Our tool coq-of-solidity
is one of the only tools using an interactive theorem prover for Solidity, together with Clear from Nethermind. This might be the most powerful approach to making code without bugs, as exemplified in this PLDI paper comparing the reliability of various C compilers. They found numerous bugs in each compiler except in the formally verified one!
In this blog post we show how we functionally specify and verify the _approve
function of an ERC-20 smart contract. We will see how we prove that a refined version of the function is equivalent to the original one.
The development of coq-of-solidity
is made possible thanks to the AlephZero project. We thank the AlephZero Foundation for their supportΒ π.
Functional specificationβ
Here is the _approve
function of the Solidity smart contract that we want to specify:
mapping (address => mapping (address => uint256)) private _allowances;
function _approve(address owner, address spender, uint256 value) internal {
require(owner != address(0), "ERC20: approve from the zero address");
require(spender != address(0), "ERC20: approve to the zero address");
_allowances[owner][spender] = value;
emit Approval(owner, spender, value);
}
It modifies an item in the _allowances
map and emits an Approval
event after a few sanity checks. We will now write a functional specification of this function in Coq. The idea is to explain what this function is supposed to do describing its behavior with an idiomatic Coq code. This will be useful to make sure there are no mistakes in the smart contract, although here we have a very simple example. From the functional specification, we will also be able to check higher-level properties of the smart contract, such as the fact that the total amount of tokens is always conserved.
Here is the Coq version of the _approve
function:
Module Storage.
Record t := {
allowances : Dict.t (Address.t * Address.t) U256.t;
(* other fields *)
}.
End Storage.
Definition _approve (owner spender : Address.t) (value : U256.t) (s : Storage.t) :
Result.t Storage.t :=
if (owner =? 0) || (spender =? 0) then
revert_address_null
else
Result.Success s <| Storage.allowances :=
Dict.declare_or_assign s.(Storage.allowances) (owner, spender) value
|>.
It takes the same parameters as the Solidity code: owner
, spender
, value
, and the current state s
of the storage. It returns a Result.t Storage.t
type, which is either a Result.Success
with the new storage after the execution of the _approve
function, or a revert_address_null
if the owner
or spender
is the null address. To create the new storage, we use the corresponding Coq notation and function to update the _allowances
map.
We ignore the emit
primitives for now.
Now let us show that, for any possible owner
, spender
, value
, and storage state s
, the _approve
function in Solidity will behave exactly as the Coq specification.
Approve functionβ
Here is the Coq translation of the _approve
function as generated by coq-of-solidity
:
Definition fun_approve (var_owner : U256.t) (var_spender : U256.t) (var_value : U256.t) : M.t unit :=
let~ _1 := [[ and ~(| var_owner, (sub ~(| (shl ~(| 160, 1 |)), 1 |)) |) ]] in
do~ [[
M.if_unit (| iszero ~(| _1 |),
let~ memPtr := [[ mload ~(| 64 |) ]] in
do~ [[ mstore ~(| memPtr, (shl ~(| 229, 4594637 |)) |) ]] in
do~ [[ mstore ~(| (add ~(| memPtr, 4 |)), 32 |) ]] in
do~ [[ mstore ~(| (add ~(| memPtr, 36 |)), 36 |) ]] in
do~ [[ mstore ~(| (add ~(| memPtr, 68 |)), 0x45524332303a20617070726f76652066726f6d20746865207a65726f20616464 |) ]] in
do~ [[ mstore ~(| (add ~(| memPtr, 100 |)), 0x7265737300000000000000000000000000000000000000000000000000000000 |) ]] in
do~ [[ revert ~(| memPtr, 132 |) ]] in
M.pure tt
|)
]] in
let~ _2 := [[ and ~(| var_spender, (sub ~(| (shl ~(| 160, 1 |)), 1 |)) |) ]] in
do~ [[
M.if_unit (| iszero ~(| _2 |),
let~ memPtr_1 := [[ mload ~(| 64 |) ]] in
do~ [[ mstore ~(| memPtr_1, (shl ~(| 229, 4594637 |)) |) ]] in
do~ [[ mstore ~(| (add ~(| memPtr_1, 4 |)), 32 |) ]] in
do~ [[ mstore ~(| (add ~(| memPtr_1, 36 |)), 34 |) ]] in
do~ [[ mstore ~(| (add ~(| memPtr_1, 68 |)), 0x45524332303a20617070726f766520746f20746865207a65726f206164647265 |) ]] in
do~ [[ mstore ~(| (add ~(| memPtr_1, 100 |)), 0x7373000000000000000000000000000000000000000000000000000000000000 |) ]] in
do~ [[ revert ~(| memPtr_1, 132 |) ]] in
M.pure tt
|)
]] in
do~ [[ mstore ~(| 0x00, _1 |) ]] in
do~ [[ mstore ~(| 0x20, 0x01 |) ]] in
let~ dataSlot := [[ keccak256 ~(| 0x00, 0x40 |) ]] in
let~ dataSlot_1 := [[ 0 ]] in
do~ [[ mstore ~(| 0, _2 |) ]] in
do~ [[ mstore ~(| 0x20, dataSlot |) ]] in
let~ dataSlot_1 := [[ keccak256 ~(| 0, 0x40 |) ]] in
do~ [[ sstore ~(| dataSlot_1, var_value |) ]] in
let~ _3 := [[ mload ~(| 0x40 |) ]] in
do~ [[ mstore ~(| _3, var_value |) ]] in
do~ [[ log3 ~(| _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 |) ]] in
M.pure tt.
We plug into the Solidity compiler and translate the intermediate representation Yul that solc
uses to generate EVM bytecode. We automatically refine the Yul generated by the Solidity compiler but for now this refinement is limited.
The two M.if_unit
at the beginning correspond to the require
statements in the Solidity code. The revert
statements are used to return an error message to the caller. The mstore
and sstore
functions are used to store values in the memory and the storage of the EVM. The keccak256
function encodes the storage addresses to access the _allowances
map. The log3
function is used to emit an event at the end.
This representation of the _approve
function is very verbose as it corresponds exactly to what the source code does and contains a lot of implementation details. Our goal now is to show that this version is equivalent to the functional specification that we wrote by hand.
Equivalenceβ
We express that the functional specification is equivalent to the original one with this lemma:
Lemma run_fun_approve codes environment state
(owner spender : Address.t) (value : U256.t) (s : erc20.Storage.t)
(mem_0 mem_1 mem_3 mem_4 : U256.t)
(H_owner : Address.Valid.t owner)
(H_spender : Address.Valid.t spender) :
let memoryguard := 0x80 in
let memory_start :=
[mem_0; mem_1; memoryguard; mem_3; mem_4] in
let state_start :=
make_state environment state memory_start (SimulatedStorage.of_erc20_state s) in
let output :=
erc20._approve owner spender value s in
let memory_end :=
[spender; erc20.keccak256_tuple2 owner 1; memoryguard; mem_3; value] in
let state_end :=
match output with
| erc20.Result.Revert _ _ => None
| erc20.Result.Success s =>
Some (make_state environment state memory_end (SimulatedStorage.of_erc20_state s))
end in
{{? codes, environment, Some state_start |
ERC20_403.ERC20_403_deployed.fun_approve owner spender value β
match output with
| erc20.Result.Revert p s => Result.Revert p s
| erc20.Result.Success _ => Result.Ok tt
end
| state_end ?}}.
This lemma of equivalence requires some parameters:
- an initial
codes
,environment
, andstate
values, that describe the state of the blockchain before the execution of the_approve
function, - a
memoryguard
value that gives a memory zone that we are safe to use, - some
mem_i
variables, as we do not know the exact values of the memory slots before the execution of the function, - an
owner
,spender
, andvalue
that are the parameters of the_approve
function, - an
s
that is the state of storage of the smart contract before the execution of the_approve
function, - an
H_owner
andH_spender
proofs that theowner
andspender
are valid addresses. These two proofs are required to execute the function as expected and always available, thanks to runtime checks made at the entrypoints of the smart contract.
The lemma will hold for any possible values of the parameters above, even if there are infinite possibilities. This is the power of formal verification: we can prove that our smart contract is correct for all possible inputs and states.
The core statement uses the predicate:
{{? codes, environment, start_state |
original_code β
refined_code
| end_state ?}}
It says that some original_code
executed in the start_state
environment will give the same output as the refined_code
and will result in the final state end_state
. The state is an option type: either Some
state or None
if the execution reverted. That way we do not have to deal with describing the state after a contract revert, that will reset the storage anyways.
The statement of equivalence is relatively verbose so there could be mistakes in the way it is stated. This is not really an issue, as the _approve
function is an intermediate function, so the only statement that really matters is the one on the main function of the contract that dispatches to the relevant entrypoint according to the payload of the transaction. There could also be mistakes there, but perhaps we can automatically generate this statement from the Solidity code.
Proof of equivalenceβ
The way we write the proof is interesting. We use Coq as a symbolic debugger, where we execute both the original code and the functional specification until we reach the end of execution for all the branches, always with the same result.
Here is an example of a debugging step (in the proof mode of Coq):
{{?codes, environment,
Some
(make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; mem_4]
[IsStorable.IMap.(IsStorable.to_storable_value) s.(erc20.Storage.balances);
StorableValue.Map2
(Dict.declare_or_assign s.(erc20.Storage.allowances) (owner, spender) value);
StorableValue.U256 s.(erc20.Storage.total_supply)])
|
The original code here:
do~ call (Stdlib.mstore 128 value)
in (do~ call
(Stdlib.log3 128 32
63486140976153616755203102783360879283472101686154884697241723088393386309925
owner spender) in LowM.Pure (Result.Ok tt)) β
The functional specification here:
Result.Ok tt
| Some
(make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; value]
(SimulatedStorage.of_erc20_state
s<|erc20.Storage.allowances:= Dict.declare_or_assign s.(erc20.Storage.allowances)
(owner, spender) value|>))?}}
On the original code side we can recognize:
do~ [[ mstore ~(| _3, var_value |) ]] in
do~ [[ log3 ~(| _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 |) ]] in
M.pure tt
that corresponds to the end of the execution of the _approve
function. On the functional specification, we have:
Result.Ok tt
that ends the execution successfully but does not return anything. This is because we ignore the emit
operation, translated as a log3
Yul primitive. We also ignore the mstore
call as it is only used to fill information for the log3
call.
Here are the various commands to step through the code, encoded as Coq tactics:
p
: final Pure expressionpn
: final Pure expression ignoring the resulting state with a None (for a revert)pe
: final Pure expression with non-trivial Equality of resultspr
: Yul PRimitiveprn
: Yul PRimitive ignoring the resulting state with a Nonel
: step in a Letlu
: step in a Let by Unfoldingc
: step in a function Callcu
: step in a function Call by Unfoldings
: Simplify the goal
These commands verify that the two programs are equivalent as we step through them. As a reference, the proof is in CoqOfSolidity/proofs/ERC20_functional.v:
Proof.
simpl.
unfold ERC20_403.ERC20_403_deployed.fun_approve, erc20._approve.
l. {
now apply run_is_non_null_address.
}
unfold Stdlib.Pure.iszero.
lu.
c; [p|].
s.
unfold Stdlib.Pure.iszero.
destruct (owner =? 0); s. {
change (true || _) with true; s.
lu; c. {
apply_run_mload.
}
repeat (
lu ||
cu ||
(prn; intro) ||
s ||
p
).
}
l. {
now apply run_is_non_null_address.
}
lu.
c; [p|]; s.
unfold Stdlib.Pure.iszero.
change (false || ?e) with e; s.
destruct (spender =? 0); s. {
lu; c. {
apply_run_mload.
}
repeat (
lu ||
cu ||
(prn; intro) ||
s ||
p
).
}
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
apply_run_keccak256_tuple2.
}
lu.
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
apply_run_keccak256_tuple2.
}
lu; c. {
apply_run_sstore_map2_u256.
}
CanonizeState.execute.
lu; c. {
apply_run_mload.
}
s.
lu; c. {
apply_run_mstore.
}
CanonizeState.execute.
lu; c. {
p.
}
p.
Qed.
To audit your smart contracts and make sure they contain no bugs, contact us atΒ Β π§contact@formal.land.
We refund our work in case we missed any high/critical severity bugs.
Conclusionβ
We have presented how to functionally specify a function with coq-of-solidity
. In the next blog post we will see how to extend this proof and specification to the entire ERC-20 smart contract.