🪁 Coq of Solidity – part 3

August 12, 2024 · 11 min read

We continue to strengthen the security of smart contracts with our tool coq-of-solidity 🛠️. It checks for vulnerabilities or bugs in Solidity code. It uses formal verification with an interactive theorem prover (Coq 🐓) to make sure that we cover:

all possible user inputs/storage states, even if there are infinite possibilities,
for any security properties.

This is very important as a single bug can lead to the loss of millions of dollars in smart contracts, as we have regularly seen in the past, and we can never be sure that a human review of the code did not miss anything.

Our tool coq-of-solidity is one of the only tools using an interactive theorem prover for Solidity, together with Clear from Nethermind. This might be the most powerful approach to making code without bugs, as exemplified in this PLDI paper comparing the reliability of various C compilers. They found numerous bugs in each compiler except in the formally verified one!

In this blog post we show how we functionally specify and verify the _approve function of an ERC-20 smart contract. We will see how we prove that a refined version of the function is equivalent to the original one.

AlephZero

The development of coq-of-solidity is made possible thanks to the AlephZero project. We thank the AlephZero Foundation for their support 🙏.

Functional specification

Here is the _approve function of the Solidity smart contract that we want to specify:

mapping (address => mapping (address => uint256)) private _allowances;

function _approve(address owner, address spender, uint256 value) internal {
    require(owner != address(0), "ERC20: approve from the zero address");
    require(spender != address(0), "ERC20: approve to the zero address");

    _allowances[owner][spender] = value;
    emit Approval(owner, spender, value);
}

It modifies an item in the _allowances map and emits an Approval event after a few sanity checks. We will now write a functional specification of this function in Coq. The idea is to explain what this function is supposed to do describing its behavior with an idiomatic Coq code. This will be useful to make sure there are no mistakes in the smart contract, although here we have a very simple example. From the functional specification, we will also be able to check higher-level properties of the smart contract, such as the fact that the total amount of tokens is always conserved.

Here is the Coq version of the _approve function:

Module Storage.
  Record t := {
    allowances : Dict.t (Address.t * Address.t) U256.t;
    (* other fields *)
  }.
End Storage.

Definition _approve (owner spender : Address.t) (value : U256.t) (s : Storage.t) :
    Result.t Storage.t :=
  if (owner =? 0) || (spender =? 0) then
    revert_address_null
  else
    Result.Success s <| Storage.allowances :=
      Dict.declare_or_assign s.(Storage.allowances) (owner, spender) value
    |>.

It takes the same parameters as the Solidity code: owner, spender, value, and the current state s of the storage. It returns a Result.t Storage.t type, which is either a Result.Success with the new storage after the execution of the _approve function, or a revert_address_null if the owner or spender is the null address. To create the new storage, we use the corresponding Coq notation and function to update the _allowances map.

info

We ignore the emit primitives for now.

Now let us show that, for any possible owner, spender, value, and storage state s, the _approve function in Solidity will behave exactly as the Coq specification.

Approve function

Here is the Coq translation of the _approve function as generated by coq-of-solidity:

Definition fun_approve (var_owner : U256.t) (var_spender : U256.t) (var_value : U256.t) : M.t unit :=
  let~ _1 := [[ and ~(| var_owner, (sub ~(| (shl ~(| 160, 1 |)), 1 |)) |) ]] in
  do~ [[
    M.if_unit (| iszero ~(| _1 |),
      let~ memPtr := [[ mload ~(| 64 |) ]] in
      do~ [[ mstore ~(| memPtr, (shl ~(| 229, 4594637 |)) |) ]] in
      do~ [[ mstore ~(| (add ~(| memPtr, 4 |)), 32 |) ]] in
      do~ [[ mstore ~(| (add ~(| memPtr, 36 |)), 36 |) ]] in
      do~ [[ mstore ~(| (add ~(| memPtr, 68 |)), 0x45524332303a20617070726f76652066726f6d20746865207a65726f20616464 |) ]] in
      do~ [[ mstore ~(| (add ~(| memPtr, 100 |)), 0x7265737300000000000000000000000000000000000000000000000000000000 |) ]] in
      do~ [[ revert ~(| memPtr, 132 |) ]] in
      M.pure tt
    |)
  ]] in
  let~ _2 := [[ and ~(| var_spender, (sub ~(| (shl ~(| 160, 1 |)), 1 |)) |) ]] in
  do~ [[
    M.if_unit (| iszero ~(| _2 |),
      let~ memPtr_1 := [[ mload ~(| 64 |) ]] in
      do~ [[ mstore ~(| memPtr_1, (shl ~(| 229, 4594637 |)) |) ]] in
      do~ [[ mstore ~(| (add ~(| memPtr_1, 4 |)), 32 |) ]] in
      do~ [[ mstore ~(| (add ~(| memPtr_1, 36 |)), 34 |) ]] in
      do~ [[ mstore ~(| (add ~(| memPtr_1, 68 |)), 0x45524332303a20617070726f766520746f20746865207a65726f206164647265 |) ]] in
      do~ [[ mstore ~(| (add ~(| memPtr_1, 100 |)), 0x7373000000000000000000000000000000000000000000000000000000000000 |) ]] in
      do~ [[ revert ~(| memPtr_1, 132 |) ]] in
      M.pure tt
    |)
  ]] in
  do~ [[ mstore ~(| 0x00, _1 |) ]] in
  do~ [[ mstore ~(| 0x20, 0x01 |) ]] in
  let~ dataSlot := [[ keccak256 ~(| 0x00, 0x40 |) ]] in
  let~ dataSlot_1 := [[ 0 ]] in
  do~ [[ mstore ~(| 0, _2 |) ]] in
  do~ [[ mstore ~(| 0x20, dataSlot |) ]] in
  let~ dataSlot_1 := [[ keccak256 ~(| 0, 0x40 |) ]] in
  do~ [[ sstore ~(| dataSlot_1, var_value |) ]] in
  let~ _3 := [[ mload ~(| 0x40 |) ]] in
  do~ [[ mstore ~(| _3, var_value |) ]] in
  do~ [[ log3 ~(| _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 |) ]] in
  M.pure tt.

We plug into the Solidity compiler and translate the intermediate representation Yul that solc uses to generate EVM bytecode. We automatically refine the Yul generated by the Solidity compiler but for now this refinement is limited.

The two M.if_unit at the beginning correspond to the require statements in the Solidity code. The revert statements are used to return an error message to the caller. The mstore and sstore functions are used to store values in the memory and the storage of the EVM. The keccak256 function encodes the storage addresses to access the _allowances map. The log3 function is used to emit an event at the end.

This representation of the _approve function is very verbose as it corresponds exactly to what the source code does and contains a lot of implementation details. Our goal now is to show that this version is equivalent to the functional specification that we wrote by hand.

Equivalence

We express that the functional specification is equivalent to the original one with this lemma:

Lemma run_fun_approve codes environment state
    (owner spender : Address.t) (value : U256.t) (s : erc20.Storage.t)
    (mem_0 mem_1 mem_3 mem_4 : U256.t)
    (H_owner : Address.Valid.t owner)
    (H_spender : Address.Valid.t spender) :
  let memoryguard := 0x80 in
  let memory_start :=
    [mem_0; mem_1; memoryguard; mem_3; mem_4] in
  let state_start :=
    make_state environment state memory_start (SimulatedStorage.of_erc20_state s) in
  let output :=
    erc20._approve owner spender value s in
  let memory_end :=
    [spender; erc20.keccak256_tuple2 owner 1; memoryguard; mem_3; value] in
  let state_end :=
    match output with
    | erc20.Result.Revert _ _ => None
    | erc20.Result.Success s =>
      Some (make_state environment state memory_end (SimulatedStorage.of_erc20_state s))
    end in
  {{? codes, environment, Some state_start |
    ERC20_403.ERC20_403_deployed.fun_approve owner spender value ⇓
    match output with
    | erc20.Result.Revert p s => Result.Revert p s
    | erc20.Result.Success _ => Result.Ok tt
    end
  | state_end ?}}.

This lemma of equivalence requires some parameters:

an initial codes, environment, and state values, that describe the state of the blockchain before the execution of the _approve function,
a memoryguard value that gives a memory zone that we are safe to use,
some mem_i variables, as we do not know the exact values of the memory slots before the execution of the function,
an owner, spender, and value that are the parameters of the _approve function,
an s that is the state of storage of the smart contract before the execution of the _approve function,
an H_owner and H_spender proofs that the owner and spender are valid addresses. These two proofs are required to execute the function as expected and always available, thanks to runtime checks made at the entrypoints of the smart contract.

The lemma will hold for any possible values of the parameters above, even if there are infinite possibilities. This is the power of formal verification: we can prove that our smart contract is correct for all possible inputs and states.

The core statement uses the predicate:

{{? codes, environment, start_state |
  original_code ⇓
  refined_code
| end_state ?}}

It says that some original_code executed in the start_state environment will give the same output as the refined_code and will result in the final state end_state. The state is an option type: either Some state or None if the execution reverted. That way we do not have to deal with describing the state after a contract revert, that will reset the storage anyways.

info

The statement of equivalence is relatively verbose so there could be mistakes in the way it is stated. This is not really an issue, as the _approve function is an intermediate function, so the only statement that really matters is the one on the main function of the contract that dispatches to the relevant entrypoint according to the payload of the transaction. There could also be mistakes there, but perhaps we can automatically generate this statement from the Solidity code.

Proof of equivalence

The way we write the proof is interesting. We use Coq as a symbolic debugger, where we execute both the original code and the functional specification until we reach the end of execution for all the branches, always with the same result.

Here is an example of a debugging step (in the proof mode of Coq):

{{?codes, environment,
Some
  (make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; mem_4]
     [IsStorable.IMap.(IsStorable.to_storable_value) s.(erc20.Storage.balances);
      StorableValue.Map2
        (Dict.declare_or_assign s.(erc20.Storage.allowances) (owner, spender) value);
      StorableValue.U256 s.(erc20.Storage.total_supply)])
|
  The original code here:
  do~ call (Stdlib.mstore 128 value)
  in (do~ call
            (Stdlib.log3 128 32
               63486140976153616755203102783360879283472101686154884697241723088393386309925
               owner spender) in LowM.Pure (Result.Ok tt)) ⇓
  The functional specification here:
  Result.Ok tt
| Some
    (make_state environment state [spender; erc20.keccak256_tuple2 owner 1; 128; mem_3; value]
       (SimulatedStorage.of_erc20_state
          s<|erc20.Storage.allowances:= Dict.declare_or_assign s.(erc20.Storage.allowances)
                                          (owner, spender) value|>))?}}

On the original code side we can recognize:

do~ [[ mstore ~(| _3, var_value |) ]] in
do~ [[ log3 ~(| _3, 0x20, 0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925, _1, _2 |) ]] in
M.pure tt

that corresponds to the end of the execution of the _approve function. On the functional specification, we have:

Result.Ok tt

that ends the execution successfully but does not return anything. This is because we ignore the emit operation, translated as a log3 Yul primitive. We also ignore the mstore call as it is only used to fill information for the log3 call.

Here are the various commands to step through the code, encoded as Coq tactics:

p: final Pure expression
pn: final Pure expression ignoring the resulting state with a None (for a revert)
pe: final Pure expression with non-trivial Equality of results
pr: Yul PRimitive
prn: Yul PRimitive ignoring the resulting state with a None
l: step in a Let
lu: step in a Let by Unfolding
c: step in a function Call
cu: step in a function Call by Unfolding
s: Simplify the goal

These commands verify that the two programs are equivalent as we step through them. As a reference, the proof is in CoqOfSolidity/proofs/ERC20_functional.v:

Proof.
  simpl.
  unfold ERC20_403.ERC20_403_deployed.fun_approve, erc20._approve.
  l. {
    now apply run_is_non_null_address.
  }
  unfold Stdlib.Pure.iszero.
  lu.
  c; [p|].
  s.
  unfold Stdlib.Pure.iszero.
  destruct (owner =? 0); s. {
    change (true || _) with true; s.
    lu; c. {
      apply_run_mload.
    }
    repeat (
      lu ||
      cu ||
      (prn; intro) ||
      s ||
      p
    ).
  }
  l. {
    now apply run_is_non_null_address.
  }
  lu.
  c; [p|]; s.
  unfold Stdlib.Pure.iszero.
  change (false || ?e) with e; s.
  destruct (spender =? 0); s. {
    lu; c. {
      apply_run_mload.
    }
    repeat (
      lu ||
      cu ||
      (prn; intro) ||
      s ||
      p
    ).
  }
  lu; c. {
    apply_run_mstore.
  }
  CanonizeState.execute.
  lu; c. {
    apply_run_mstore.
  }
  CanonizeState.execute.
  lu; c. {
    apply_run_keccak256_tuple2.
  }
  lu.
  lu; c. {
    apply_run_mstore.
  }
  CanonizeState.execute.
  lu; c. {
    apply_run_mstore.
  }
  CanonizeState.execute.
  lu; c. {
    apply_run_keccak256_tuple2.
  }
  lu; c. {
    apply_run_sstore_map2_u256.
  }
  CanonizeState.execute.
  lu; c. {
    apply_run_mload.
  }
  s.
  lu; c. {
    apply_run_mstore.
  }
  CanonizeState.execute.
  lu; c. {
    p.
  }
  p.
Qed.

Get started

To audit your smart contracts and make sure they contain no bugs, contact us at 📧contact@formal.land.

We refund our work in case we missed any high/critical severity bugs.

Conclusion

We have presented how to functionally specify a function with coq-of-solidity. In the next blog post we will see how to extend this proof and specification to the entire ERC-20 smart contract.

Functional specification​

Approve function​

Equivalence​

Proof of equivalence​

Conclusion​

Functional specification

Approve function

Equivalence

Proof of equivalence

Conclusion