# What is coq-of-ocaml

**coq-of-ocaml** is a transpiler from the ๐ซย OCaml programming language to the ๐ย Coq proof language. It allows **formal verification** on OCaml programs. We developed it for the Tezos crypto-currency, verifying more that 100K lines of OCaml code in Coq Tezos of OCamlย ๐ซ. The sources are on Github.

`coq-of-ocaml`

generates idiomatic and human readable Coq code. Technically speaking, this is a shallow embedding of OCaml into Coq. We supports the purely functional parts of OCaml, including advanced features such as functors, first-class modules, and GADTs. Side-effects in an OCaml program can be translated when represented by a monad.

If you want to formally verify OCaml programs or see what is possible, contact us atย contact@formal.land.

We do not support yet OCaml 5 and this is our next target.

## Exampleโ

`coq-of-ocaml`

translates the OCaml code:

`type 'a tree =`

| Leaf of 'a

| Node of 'a tree * 'a tree

let rec sum tree =

match tree with

| Leaf n -> n

| Node (tree1, tree2) -> sum tree1 + sum tree2

to the Coq code:

`(* Generated by coq-of-ocaml *)`

Inductive tree (a : Set) : Set :=

| Leaf : a -> tree a

| Node : tree a -> tree a -> tree a.

Arguments Leaf {_}.

Arguments Node {_}.

Fixpoint sum (tree : tree int) : int :=

match tree with

| Leaf n => n

| Node tree1 tree2 => Z.add (sum tree1) (sum tree2)

end.

We map the algebraic datatype `tree`

to an equivalent inductive type `tree`

in Coq. With the `Arguments`

command, we ask Coq to be able to infer the type parameter `a`

, as it is done in OCaml. We translate the recursive function `sum`

using the command `Fixpoint`

of Coq. Here, we represent the `int`

ย type of OCaml by `Z`

in Coq, but this can be parametrized.

## Workflowโ

`coq-of-ocaml`

works by compiling the OCaml files one by one. Thanks to Merlin, we get access to the typing environment of each file. Thus names referencing external definitions are properly interpreted.

In a typical project, we may want to translate some of the `.ml`

files and keep the rest as axioms (for the libraries or non-critical files). To generate the axioms, we can run `coq-of-ocaml`

on the `.mli`

files for the parts we want to abstract. When something is not properly handled, `coq-of-ocaml`

generates an error message. These errors do not necessarily need to be fixed. However, they are good warnings to help having a more extensive and reliable Coq formalization.

Generally, the generated Coq code for a project does not compile as it is. This can be due to unsupported OCaml features, or various small errors such as name collisions. In this case, you can:

- modify the OCaml input code, so that it fits what
`coq-of-ocaml`

handles or avoids Coq errors (follow the error messages); - use the attributes or configuration mechanism to customize the translation of
`coq-of-ocaml`

; - fork
`coq-of-ocaml`

to modify the code translation; - post-process the output with a script;
- post-process the output by hand.

## Conceptsโ

We can import to Coq the OCaml programs which are either purely functional or whose side-effects are in a monad. We translate the primitive side-effects (references, exceptions, ...) to axioms. We have no proofs that we preserve the semantics of the source code. One should do manual reviews to assert that the generated Coq code is a correct formalization of the source code. We produce a dummy Coq term and an explicit message in case of error. In particular, we always generate something and no errors are fatal.

We compile OCaml projects by pluging into the LSP of OCaml Merlin. This means that if you are using Merlin then you can run `coq-of-ocaml`

with no additional configurations.

We do not do special treatments for the termination of fixpoints. If needed, you can disable termination checks using the Coq's flag Guard Checking. We erase the type parameters for the GADTs. This makes sure that the type definitions are accepted, but can make the pattern matchings incomplete. In this case we offer the possibility to introduce dynamic casts guided by annotations in the OCaml code. We did not find a way to nicely represent GADTs in Coq yet. We think that this is hard because the dependent pattern matching works well on type indicies which are values, but does not with types.

We support modules, module types, functors and first-class modules. For OCaml modules, we generate either Coq modules or polymorphic records depending on the case. We generate axioms for `.mli`

files to help formalizations, but importing `.mli`

files should not be necessary for a project to compile in Coq.

## Relatedโ

In the OCaml community:

- Cameleer (verify OCaml programs leveraging the Why3's infrastructure)
- CFML (import OCaml to Coq using characteristic formulae)
- coq-of-ocaml-mrmr1993 (fork of
`coq-of-ocaml`

including side-effects, focusing on the compilation of the OCaml's stdlib)

In the JavaScript community:

- coq-of-js (sister project;
*currently on halt to support*)`coq-of-ocaml`

In the Haskell community:

- hs-to-coq (import Haskell to Coq)
- hs-to-gallina (2012, by Gabe Dijkstra, first known project to do a shallow embedding of a mainstream functional programming language to Coq)

In the Go community;

- goose (import Go to Coq)

In the Rust community:

- electrolysis (import Rust to Lean)

## Creditsโ

The `coq-of-ocaml`

project started as part of a PhD directed by Yann Regis-Gianas and Hugo Herbelin
as the university of Paris 7. Originally, the goal was to formalize real OCaml programs in Coq to study side-effects inference and proof techniques on functional programs. This project was then financed by Nomadic Labs and then the Tezos Foundation, with the aim to be able to reason about the implementation of the crypto-currency Tezos. See this blog post to get an example about what we can prove.