tim carstens
intoverflow

Hello VST, hello Verifiable C

draft draft draft draft draft

On today’s show:

Files you can download:

Introduction

This post is about writing formally verified C code. If you’re writing formally verified software in C, there are three things you’re gonna want:

  • A mathematical meta-language for defining and reasoning about an abstract model of your code;
  • A “program logic” for C – that is, a meta-language within-which you may reason about the behavior of C code, and relate it to your mathematical meta-language;
  • A C compiler which preserves whatever theorems you’ve proven in your program logic.

For instance, if you are formally verifying a program which sieves for prime numbers, you’d want:

  • A model of the integers, with a definition of primality;
  • Some kind of argument relating your code’s behavior to this model;
  • A compiler whose output relates to your model the same as your C code did.

There aren’t a lot of tool suites which support this type of work flow. In industry, there are many tools for reasoning formally about software; but these tools tend to be closed-source and tres riche, making them ill-suited for Internet-scale infrastructure.

A new hope?

Today we’ll look at one approach to formally verifying C code: Coq + CompCert + VST + Verifiable C. Coq is a proof agent; it provides the mathematical meta-language. VST and Verifiable C provide the program logic. CompCert is the compiler.

Coq and CompCert have been described all over Internet. For us, the take-away is that CompCert is written in Coq, and has been shown not to introduce any bugs during the compilation process (versus, say, GCC, which is known to introduce bugs into programs).

VST (Verified Software Toolchain) is a framework for using separation logic to analyze the behavior of programs with global mutable state. Presently, the “killer app” for VST is Verifiable C, a program logic for a psudo-C language (“C light,” which happens to be an intermediate language used by CompCert).

The nice thing about Verifiable C is that its program logic has been formally shown (again, in Coq) to be compatible with the semantics defined by CompCert.

So with this tool chain, we can write our code in more-or-less ordinary C style, prove theorems about our code in Coq using the Verifiable C program logic, and when we compile using CompCert we’ll know that (a) CompCert didn’t introduce any bugs into our compiled program and (b) our compiled program satisfies an appropriate version of the theorems we proved earlier.

Sounds very nice to me, let’s get started.

Installing CompCert and VST

At time of writing, the latest release of VST is Version 1.4, which is compatible with CompCert 2.3pl2.

Download link for CompCert 2.3pl2: http://compcert.inria.fr/release/compcert-2.3pl2.tgz, or, if that’s stale, you can get a copy from me: http://intoverflow.github.io/assets/compcert-2.3pl2.tgz.

Download link for VST 1.4: http://vst.cs.princeton.edu/download/vst-1.4.tgz, or, if that’s stale, you can get a copy from me: http://intoverflow.github.io/assets/vst-1.4.tgz.

So now we get started:

[tcarstens@weibel ~]$ mkdir verifiable-c
[tcarstens@weibel ~]$ cd verifiable-c/
[tcarstens@weibel verifiable-c]$ wget --quiet "http://compcert.inria.fr/release/compcert-2.3pl2.tgz"
[tcarstens@weibel verifiable-c]$ wget --quiet "http://vst.cs.princeton.edu/download/vst-1.4.tgz"
[tcarstens@weibel verifiable-c]$ md5sum *
f97700e91163e6fbdb645721e2c1350f  compcert-2.3pl2.tgz
d6a2aff5a454ffc421e18dbda690e930  vst-1.4.tgz
[tcarstens@weibel verifiable-c]$ sha256sum *
a8626962e1aa0c7ac566d799c4b4c588a2bbc9e47fd5a2bfae8152438caf04b3  compcert-2.3pl2.tgz
68a0ee84e3800c4fba933a9fd8a36fe4416ef758e80ec0a8416b3d984144aa1c  vst-1.4.tgz
[tcarstens@weibel verifiable-c]$ tar -xf compcert-2.3pl2.tgz
[tcarstens@weibel verifiable-c]$ tar -xf vst-1.4.tgz
[tcarstens@weibel verifiable-c]$ ls -F
compcert-2.3pl2/  compcert-2.3pl2.tgz  vst/  vst-1.4.tgz
[tcarstens@weibel verifiable-c]$

So far so good. Now we inspect our host for dependencies. In my case, I’m running Linux x86-64, with the intent to have CompCert target ia32-linux. You’ll note that I have two gcc’s installed; when building CompCert, I’ll need to specify which gcc to use.

[tcarstens@weibel verifiable-c]$ uname -a
Linux weibel 3.10.9-1-ARCH #1 SMP PREEMPT Wed Aug 21 13:49:35 CEST 2013 x86_64 GNU/Linux
[tcarstens@weibel verifiable-c]$ cat /proc/cpuinfo | grep "model name"
model name	: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
model name	: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
model name	: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
model name	: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
[tcarstens@weibel verifiable-c]$ gcc -v
...(output elided)...
Target: x86_64-unknown-linux-gnu
...(output elided)...
gcc version 4.8.1 20130725 (prerelease) (GCC)
[tcarstens@weibel verifiable-c]$ arm-none-eabi-gcc -v
...(output elided)...
Target: arm-none-eabi
...(output elided)...
gcc version 4.9.0 (Arch Repository) 
[tcarstens@weibel verifiable-c]$ coqc -v
The Coq Proof Assistant, version 8.4pl3 (March 2014)
compiled on Mar 25 2014 01:27:43 with OCaml 4.01.0
[tcarstens@weibel verifiable-c]$ file `which coqc`
/usr/bin/coqc: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=adcbf1ed229ca82e8a43c124c15a32278de4dee9, stripped
[tcarstens@weibel verifiable-c]$

Very good, let’s build CompCert. Now as it happens, CompCert depends on GCC at run-time (it uses the GCC preprocessor). So we need to tell CompCert which GCC to build for. By default, it will assume gcc, which happens to be what I want because I’m targetting ia32-linux. If I were targeting arm-eabi, I’d want to use arm-none-eabi-gcc, and to do that I’d toss in an extra -toolprefix arm-none-eabi- flag to the configure script.

[tcarstens@weibel verifiable-c]$ pushd compcert-2.3pl2
~/verifiable-c/compcert-2.3pl2 ~/verifiable-c
[tcarstens@weibel compcert-2.3pl2]$ ./configure ia32-linux -no-runtime-lib &> ../compcert-2.3pl2.configurelog
[tcarstens@weibel compcert-2.3pl2]$ cat ../compcert-2.3pl2.configurelog
Testing assembler support for CFI directives... yes
Testing Coq... version 8.4pl3 -- good!
Testing OCaml... version 4.01.0 -- good!
Testing Menhir... version 20140422 -- good!

CompCert configuration:
    Target architecture........... ia32
    Application binary interface.. standard
    OS and development env........ linux
    C compiler.................... gcc -m32
    C preprocessor................ gcc -m32 -U__GNUC__ -E
    Assembler..................... gcc -m32 -c
    Assembler supports CFI........ true
    Assembler for runtime lib..... gcc -m32 -c
    Linker........................ gcc -m32
    Math library.................. -lm
    Binaries installed in......... /usr/local/bin
    Runtime library provided...... false
    Library files installed in.... /usr/local/lib/compcert
    cchecklink tool supported..... false

If anything above looks wrong, please edit file ./Makefile.config to correct.

[tcarstens@weibel compcert-2.3pl2]$ time make -j 6 all &> ../compcert-2.3pl2.makelog
...(go get some coffee, or open another terminal and tail -f ../compcert-2.3pl2.makelog)...

real	22m17.246s
user	79m14.576s
sys	1m28.580s
[tcarstens@weibel compcert-2.3pl2]$ time make -j 6 clightgen &> ../compcert-2.3pl2.makelog-clightgen

real	0m1.732s
user	0m1.600s
sys	0m0.090s
[tcarstens@weibel compcert-2.3pl2]$ ./clightgen --help
The CompCert Clight generator
Usage: clightgen [options] <source files>
Recognized source files:
  .c             C source file
Processing options:
  -E             Preprocess only, send result to standard output
Preprocessing options:
  -I<dir>        Add <dir> to search path for #include files
  -D<symb>=<val> Define preprocessor symbol
  -U<symb>       Undefine preprocessor symbol
  -Wp,<opt>      Pass option <opt> to the preprocessor
Language support options (use -fno-<opt> to turn off -f<opt>) :
  -fbitfields    Emulate bit fields in structs [off]
  -flongdouble   Treat 'long double' as 'double' [off]
  -fstruct-return  Emulate returning structs and unions by value [off]
  -fvararg-calls Emulate calls to variable-argument functions [on]
  -fpacked-structs  Emulate packed structs [off]
  -fall          Activate all language support options above
  -fnone         Turn off all language support options above
Tracing options:
  -dparse        Save C file after parsing and elaboration in <file>.parse.c
  -dc            Save generated Compcert C in <file>.compcert.c
  -dclight       Save generated Clight in <file>.light.c
General options:
  -v             Print external commands before invoking them
[tcarstens@weibel compcert-2.3pl2]$ ./ccomp --help
The CompCert C verified compiler, version 2.3pl2
Usage: ccomp [options] <source files>
...(output elided)...
[tcarstens@weibel compcert-2.3pl2]$ popd
~/verifiable-c
[tcarstens@weibel verifiable-c]$

Sweet. Now we can build VST. You’ll need to create a vst/CONFIGURE file pointing to your CompCert source directory.

[tcarstens@weibel verifiable-c]$ pushd vst
~/verifiable-c/vst ~/verifiable-c
[tcarstens@weibel vst]$ echo "COMPCERT=../compcert-2.3pl2" > CONFIGURE
[tcarstens@weibel vst]$ cat CONFIGURE
COMPCERT=../compcert-2.3pl2
[tcarstens@weibel vst]$ time make -j 6 &> ../vst.makelog
...(go get some lunch, or open another terminal and tail -f ../vst.makelog)...
real	33m20.637s
user	83m42.863s
sys	0m39.973s
[tcarstens@weibel vst]$ coqtop `cat .loadpath` -l progs/verif_sumarray.v
Welcome to Coq 8.4pl3 (March 2014)
<W> Grammar extension: in [constr:operconstr] some rule has been masked

Coq < Print sumarray_spec.
sumarray_spec = 
(_sumarray:ident,
(WITH  p : val * share * (Z -> val), size : Z PRE  [
 (_a, tptr tint), (_n, tint)]
 (let (p0, contents) := p in
  let (a0, sh) := p0 in
  PROP  (0 <= size <= Int.max_signed;
  forall i : Z, 0 <= i < size -> is_int (contents i))
  LOCAL  (`(eq a0) (eval_id _a); `(eq (Vint (Int.repr size))) (eval_id _n);
  `isptr (eval_id _a))
  SEP  (`(array_at tint sh contents 0 size) (eval_id _a))) POST  [tint]
 (let (p0, contents) := p in
  let (a0, sh) := p0 in
  local
    (`(eq (Vint (fold_range (add_elem contents) Int.zero 0 size))) retval) &&
  `(array_at tint sh contents 0 size a0)))
:funspec)
     : ident * funspec

Coq < Quit.
[tcarstens@weibel vst]$ popd
~/verifiable-c
[tcarstens@weibel verifiable-c]$

Excellent. As you can see, we’ve made it to a point where we can work with the VST sample programs (in the case above, sumarray).

Now let’s write and verify something of our own.

strncat in Verifiable C

First lets set up our environment. While you weren’t looking, I authored .loadpath and strncat.c (you can download the latter here: strncat.c). The .loadpath file is adapted from a file of the same name in the VST directory, which is generated when building VST. You can see it getting written in my VST build log above.

[tcarstens@weibel strncat]$ ln -s ~/verifiable-c/compcert-2.3pl2 compcert
[tcarstens@weibel strncat]$ ln -s ~/verifiable-c/vst vst
[tcarstens@weibel strncat]$ ls -Fa
./  ../  .loadpath  compcert@  strncat.c  vst@
[tcarstens@weibel strncat]$ cat .loadpath
-I vst/msl -as msl -I vst/sepcomp -as sepcomp -I vst/veric -as veric -I vst/floyd -as floyd -I vst/progs -as progs -R compcert -as compcert
[tcarstens@weibel strncat]$ cat strncat.c
typedef unsigned int size_t;

size_t strlen(const unsigned char *s) {
  size_t i;
  unsigned char c;

  i = 0;
  c = s[i];
  while ('\0' != c) {
    i++;
    c = s[i];
  }

  return i;
}

unsigned char *strncat(unsigned char *dest, const unsigned char *src, size_t n) {
  size_t dest_len = strlen(dest);
  size_t i;

  for (i = 0; i < n && src[i] != 0; i++)
    dest[dest_len + i] = src[i];
  dest[dest_len + i] = 0;

  return dest;
}

[tcarstens@weibel strncat]$

We now generate a Coq file strncat.v which contains abstract-syntax for a Clight version of strncat.c (which you can download here: strncat.v):

[tcarstens@weibel strncat]$ compcert/clightgen strncat.c
[tcarstens@weibel strncat]$ cat strncat.v
Require Import Clightdefs.

Local Open Scope Z_scope.

Definition _c : ident := 32%positive.
Definition ___compcert_va_int64 : ident := 16%positive.
Definition _strncat : ident := 38%positive.
Definition ___builtin_fmadd : ident := 24%positive.
Definition ___builtin_fmax : ident := 22%positive.
Definition ___compcert_va_float64 : ident := 17%positive.
Definition ___builtin_memcpy_aligned : ident := 8%positive.
Definition ___builtin_subl : ident := 5%positive.
Definition ___builtin_va_arg : ident := 12%positive.
Definition ___builtin_annot_intval : ident := 10%positive.
Definition ___builtin_negl : ident := 3%positive.
Definition ___builtin_write32_reversed : ident := 2%positive.
Definition ___builtin_write16_reversed : ident := 1%positive.
Definition _n : ident := 36%positive.
Definition _i : ident := 31%positive.
Definition ___builtin_va_end : ident := 14%positive.
Definition ___builtin_mull : ident := 6%positive.
Definition ___builtin_fnmadd : ident := 26%positive.
Definition ___builtin_bswap32 : ident := 19%positive.
Definition _main : ident := 39%positive.
Definition ___builtin_va_start : ident := 11%positive.
Definition _dest_len : ident := 37%positive.
Definition ___builtin_addl : ident := 4%positive.
Definition ___builtin_read16_reversed : ident := 28%positive.
Definition ___builtin_fabs : ident := 7%positive.
Definition ___builtin_fsqrt : ident := 21%positive.
Definition ___builtin_bswap : ident := 18%positive.
Definition ___builtin_annot : ident := 9%positive.
Definition ___builtin_va_copy : ident := 13%positive.
Definition ___builtin_fnmsub : ident := 27%positive.
Definition _strlen : ident := 33%positive.
Definition ___builtin_fmsub : ident := 25%positive.
Definition ___compcert_va_int32 : ident := 15%positive.
Definition _dest : ident := 34%positive.
Definition ___builtin_read32_reversed : ident := 29%positive.
Definition _src : ident := 35%positive.
Definition _s : ident := 30%positive.
Definition ___builtin_fmin : ident := 23%positive.
Definition ___builtin_bswap16 : ident := 20%positive.
Definition _dest_len' : ident := 40%positive.


Definition f_strlen := {|
  fn_return := tuint;
  fn_callconv := cc_default;
  fn_params := ((_s, (tptr tuchar)) :: nil);
  fn_vars := nil;
  fn_temps := ((_i, tuint) :: (_c, tuchar) :: nil);
  fn_body :=
(Ssequence
  (Sset _i (Econst_int (Int.repr 0) tint))
  (Ssequence
    (Sset _c
      (Ecast
        (Ederef
          (Ebinop Oadd (Etempvar _s (tptr tuchar)) (Etempvar _i tuint)
            (tptr tuchar)) tuchar) tuchar))
    (Ssequence
      (Swhile
        (Ebinop One (Econst_int (Int.repr 0) tint) (Etempvar _c tuchar) tint)
        (Ssequence
          (Sset _i
            (Ebinop Oadd (Etempvar _i tuint) (Econst_int (Int.repr 1) tint)
              tuint))
          (Sset _c
            (Ecast
              (Ederef
                (Ebinop Oadd (Etempvar _s (tptr tuchar)) (Etempvar _i tuint)
                  (tptr tuchar)) tuchar) tuchar))))
      (Sreturn (Some (Etempvar _i tuint))))))
|}.

Definition f_strncat := {|
  fn_return := (tptr tuchar);
  fn_callconv := cc_default;
  fn_params := ((_dest, (tptr tuchar)) :: (_src, (tptr tuchar)) ::
                (_n, tuint) :: nil);
  fn_vars := nil;
  fn_temps := ((_dest_len, tuint) :: (_i, tuint) :: (41%positive, tint) ::
               (_dest_len', tuint) :: nil);
  fn_body :=
(Ssequence
  (Ssequence
    (Scall (Some _dest_len')
      (Evar _strlen (Tfunction (Tcons (tptr tuchar) Tnil) tuint cc_default))
      ((Etempvar _dest (tptr tuchar)) :: nil))
    (Sset _dest_len (Etempvar _dest_len' tuint)))
  (Ssequence
    (Ssequence
      (Sset _i (Econst_int (Int.repr 0) tint))
      (Sloop
        (Ssequence
          (Ssequence
            (Sifthenelse (Ebinop Olt (Etempvar _i tuint) (Etempvar _n tuint)
                           tint)
              (Ssequence
                (Sset 41%positive
                  (Ecast
                    (Ebinop One
                      (Ederef
                        (Ebinop Oadd (Etempvar _src (tptr tuchar))
                          (Etempvar _i tuint) (tptr tuchar)) tuchar)
                      (Econst_int (Int.repr 0) tint) tint) tbool))
                (Sset 41%positive (Ecast (Etempvar 41%positive tbool) tint)))
              (Sset 41%positive (Econst_int (Int.repr 0) tint)))
            (Sifthenelse (Etempvar 41%positive tint) Sskip Sbreak))
          (Sassign
            (Ederef
              (Ebinop Oadd (Etempvar _dest (tptr tuchar))
                (Ebinop Oadd (Etempvar _dest_len tuint) (Etempvar _i tuint)
                  tuint) (tptr tuchar)) tuchar)
            (Ederef
              (Ebinop Oadd (Etempvar _src (tptr tuchar)) (Etempvar _i tuint)
                (tptr tuchar)) tuchar)))
        (Sset _i
          (Ebinop Oadd (Etempvar _i tuint) (Econst_int (Int.repr 1) tint)
            tuint))))
    (Ssequence
      (Sassign
        (Ederef
          (Ebinop Oadd (Etempvar _dest (tptr tuchar))
            (Ebinop Oadd (Etempvar _dest_len tuint) (Etempvar _i tuint)
              tuint) (tptr tuchar)) tuchar) (Econst_int (Int.repr 0) tint))
      (Sreturn (Some (Etempvar _dest (tptr tuchar)))))))
|}.

Definition prog : Clight.program := {|
prog_defs :=
((___builtin_fabs,
   Gfun(External (EF_builtin ___builtin_fabs
                   (mksignature (AST.Tfloat :: nil) (Some AST.Tfloat)
                     cc_default)) (Tcons tdouble Tnil) tdouble cc_default)) ::
 (___builtin_memcpy_aligned,
   Gfun(External (EF_builtin ___builtin_memcpy_aligned
                   (mksignature
                     (AST.Tint :: AST.Tint :: AST.Tint :: AST.Tint :: nil)
                     None cc_default))
     (Tcons (tptr tvoid)
       (Tcons (tptr tvoid) (Tcons tuint (Tcons tuint Tnil)))) tvoid
     cc_default)) ::
 (___builtin_annot,
   Gfun(External (EF_builtin ___builtin_annot
                   (mksignature (AST.Tint :: nil) None
                     {|cc_vararg:=true; cc_structret:=false|}))
     (Tcons (tptr tschar) Tnil) tvoid
     {|cc_vararg:=true; cc_structret:=false|})) ::
 (___builtin_annot_intval,
   Gfun(External (EF_builtin ___builtin_annot_intval
                   (mksignature (AST.Tint :: AST.Tint :: nil) (Some AST.Tint)
                     cc_default)) (Tcons (tptr tschar) (Tcons tint Tnil))
     tint cc_default)) ::
 (___builtin_va_start,
   Gfun(External (EF_builtin ___builtin_va_start
                   (mksignature (AST.Tint :: nil) None cc_default))
     (Tcons (tptr tvoid) Tnil) tvoid cc_default)) ::
 (___builtin_va_arg,
   Gfun(External (EF_builtin ___builtin_va_arg
                   (mksignature (AST.Tint :: AST.Tint :: nil) None
                     cc_default)) (Tcons (tptr tvoid) (Tcons tuint Tnil))
     tvoid cc_default)) ::
 (___builtin_va_copy,
   Gfun(External (EF_builtin ___builtin_va_copy
                   (mksignature (AST.Tint :: AST.Tint :: nil) None
                     cc_default))
     (Tcons (tptr tvoid) (Tcons (tptr tvoid) Tnil)) tvoid cc_default)) ::
 (___builtin_va_end,
   Gfun(External (EF_builtin ___builtin_va_end
                   (mksignature (AST.Tint :: nil) None cc_default))
     (Tcons (tptr tvoid) Tnil) tvoid cc_default)) ::
 (___compcert_va_int32,
   Gfun(External (EF_external ___compcert_va_int32
                   (mksignature (AST.Tint :: nil) (Some AST.Tint) cc_default))
     (Tcons (tptr tvoid) Tnil) tuint cc_default)) ::
 (___compcert_va_int64,
   Gfun(External (EF_external ___compcert_va_int64
                   (mksignature (AST.Tint :: nil) (Some AST.Tlong)
                     cc_default)) (Tcons (tptr tvoid) Tnil) tulong
     cc_default)) ::
 (___compcert_va_float64,
   Gfun(External (EF_external ___compcert_va_float64
                   (mksignature (AST.Tint :: nil) (Some AST.Tfloat)
                     cc_default)) (Tcons (tptr tvoid) Tnil) tdouble
     cc_default)) ::
 (___builtin_bswap,
   Gfun(External (EF_builtin ___builtin_bswap
                   (mksignature (AST.Tint :: nil) (Some AST.Tint) cc_default))
     (Tcons tuint Tnil) tuint cc_default)) ::
 (___builtin_bswap32,
   Gfun(External (EF_builtin ___builtin_bswap32
                   (mksignature (AST.Tint :: nil) (Some AST.Tint) cc_default))
     (Tcons tuint Tnil) tuint cc_default)) ::
 (___builtin_bswap16,
   Gfun(External (EF_builtin ___builtin_bswap16
                   (mksignature (AST.Tint :: nil) (Some AST.Tint) cc_default))
     (Tcons tushort Tnil) tushort cc_default)) ::
 (___builtin_fsqrt,
   Gfun(External (EF_builtin ___builtin_fsqrt
                   (mksignature (AST.Tfloat :: nil) (Some AST.Tfloat)
                     cc_default)) (Tcons tdouble Tnil) tdouble cc_default)) ::
 (___builtin_fmax,
   Gfun(External (EF_builtin ___builtin_fmax
                   (mksignature (AST.Tfloat :: AST.Tfloat :: nil)
                     (Some AST.Tfloat) cc_default))
     (Tcons tdouble (Tcons tdouble Tnil)) tdouble cc_default)) ::
 (___builtin_fmin,
   Gfun(External (EF_builtin ___builtin_fmin
                   (mksignature (AST.Tfloat :: AST.Tfloat :: nil)
                     (Some AST.Tfloat) cc_default))
     (Tcons tdouble (Tcons tdouble Tnil)) tdouble cc_default)) ::
 (___builtin_fmadd,
   Gfun(External (EF_builtin ___builtin_fmadd
                   (mksignature
                     (AST.Tfloat :: AST.Tfloat :: AST.Tfloat :: nil)
                     (Some AST.Tfloat) cc_default))
     (Tcons tdouble (Tcons tdouble (Tcons tdouble Tnil))) tdouble
     cc_default)) ::
 (___builtin_fmsub,
   Gfun(External (EF_builtin ___builtin_fmsub
                   (mksignature
                     (AST.Tfloat :: AST.Tfloat :: AST.Tfloat :: nil)
                     (Some AST.Tfloat) cc_default))
     (Tcons tdouble (Tcons tdouble (Tcons tdouble Tnil))) tdouble
     cc_default)) ::
 (___builtin_fnmadd,
   Gfun(External (EF_builtin ___builtin_fnmadd
                   (mksignature
                     (AST.Tfloat :: AST.Tfloat :: AST.Tfloat :: nil)
                     (Some AST.Tfloat) cc_default))
     (Tcons tdouble (Tcons tdouble (Tcons tdouble Tnil))) tdouble
     cc_default)) ::
 (___builtin_fnmsub,
   Gfun(External (EF_builtin ___builtin_fnmsub
                   (mksignature
                     (AST.Tfloat :: AST.Tfloat :: AST.Tfloat :: nil)
                     (Some AST.Tfloat) cc_default))
     (Tcons tdouble (Tcons tdouble (Tcons tdouble Tnil))) tdouble
     cc_default)) ::
 (___builtin_read16_reversed,
   Gfun(External (EF_builtin ___builtin_read16_reversed
                   (mksignature (AST.Tint :: nil) (Some AST.Tint) cc_default))
     (Tcons (tptr tushort) Tnil) tushort cc_default)) ::
 (___builtin_read32_reversed,
   Gfun(External (EF_builtin ___builtin_read32_reversed
                   (mksignature (AST.Tint :: nil) (Some AST.Tint) cc_default))
     (Tcons (tptr tuint) Tnil) tuint cc_default)) ::
 (___builtin_write16_reversed,
   Gfun(External (EF_builtin ___builtin_write16_reversed
                   (mksignature (AST.Tint :: AST.Tint :: nil) None
                     cc_default)) (Tcons (tptr tushort) (Tcons tushort Tnil))
     tvoid cc_default)) ::
 (___builtin_write32_reversed,
   Gfun(External (EF_builtin ___builtin_write32_reversed
                   (mksignature (AST.Tint :: AST.Tint :: nil) None
                     cc_default)) (Tcons (tptr tuint) (Tcons tuint Tnil))
     tvoid cc_default)) :: (_strlen, Gfun(Internal f_strlen)) ::
 (_strncat, Gfun(Internal f_strncat)) :: nil);
prog_main := _main
|}.

[tcarstens@weibel strncat]$ 

Lovely. Now we need to specify its behavior, and prove it conforms to that specification.

A formal specification for strncat

An informal specification for strncat is given in man 3 strncat:

char *strcat(char *dest, const char *src);

char *strncat(char *dest, const char *src, size_t n);

The strcat() function appends the src string to the dest string, overwriting the terminating null byte ('\0') at the end of dest, and then adds a terminating null byte. The strings may not overlap, and the dest string must have enough space for the result. If dest is not large enough, program behavior is unpredictable; buffer overruns are a favorite avenue for attacking secure programs.

The strncat() function is similar, except that

  • it will use at most n bytes from src; and
  • src does not need to be null-terminated if it contains n or more bytes.

As with strcat(), the resulting string in dest is always null-terminated.

If src contains n or more bytes, strncat() writes n+1 bytes to dest (n from src plus the terminating null byte). Therefore, the size of dest must be at least strlen(dest)+n+1.

Our present task is to produce a formal version of this specification in Coq. To do so, we’ll need a meta-theory of strings, which we will use to abstractly model the behavior of strncat.

Coq already has a string type, which represents strings as lists of ascii values, which themselves are represented as 8-tuples of bools. The Coq standard library includes functions for computing string length, concatenating strings, and obtaining sub-strings. In short, it should have everything we need, so we may as well use it as our meta-theory.

So now we may begin writing our specification. Let us first specify the behavior of strlen, as our strncat specification will surely make use of it.

Ok, so let’s compile strncat.v and fire up CoqIDE to begin authoring and proving a specification:

[tcarstens@weibel strncat]$ time coqc `cat .loadpath` strncat.v

real	0m2.107s
user	0m1.900s
sys	0m0.200s
[tcarstens@weibel strncat]$ coqide `cat .loadpath` verif_strncat.v

At this point we’ve (a) setup our environment so that we can use CompCert and VST, (b) compiled our Coq representation of strncat.c (meaning that other Coq code can open this module and work with its definitions), and (c) launched CoqIDE with a new file verif_strncat.v.

Let’s begin authoring verif_strncat.v. (Here’s a download link for verif_strncat.v). We’ll use this soothing yellowish-cream background color to indicate the contents of this file, concatenating as we go. First we import some libraries:

Require Import floyd.proofauto.

Require Import Coq.Strings.Ascii.

Now we’ll make some definitions which move us towards relating Clight values to our meta-theory of strings. For now, we only need concern ourselves with specifying what we mean by a (Clight) “ascii” value or a (Clight) “null-terminated ascii string”.

Inductive ascii : val -> Prop :=
  | is_ascii c: (0 <= Int.intval c < 128) -> ascii (Vint c).

Inductive ascii_not_nil : val -> Prop :=
  | is_ascii_not_nil c: (0 < Int.intval c < 128) -> ascii_not_nil (Vint c).

Lemma ascii_not_nil_is_ascii: forall v, ascii_not_nil v -> ascii v.
Proof. intros. inversion H. apply is_ascii. omega. Qed.

Inductive ascii_string (contents: Z -> val) (len size: Z) : Prop :=
  | is_ascii_string:
     (0 <= len < size)
     -> (forall i, 0 <= i < len -> ascii_not_nil (contents i))
     -> contents len = Vint Int.zero
     -> ascii_string contents len size.

Lemma ascii_string_is_int:
  forall contents len size,
   ascii_string contents len size
   -> forall i, 0 <= i <= len -> is_int (contents i).
Proof.
  intros.
  destruct (Z_dec i len) eqn:?.
  + destruct s ; try omega.
    destruct H.
    assert (ascii_not_nil (contents i)) as ann by (apply H1 ; omega).
    inversion ann.
    simpl ; trivial.
  + inversion H.
    subst.
    rewrite H3.
    simpl ; trivial.
Qed.

Now we’re set to write our specification for strlen. Try to read the specification and figure out what it means. I’ll offer a quick hint: think of _s as the argument to strlen, and s0 as an abstract representative of that argument.

Require Import strncat.

Local Open Scope logic.

Definition strlen_spec :=
 DECLARE _strlen
  WITH s0: val, sh : share, contents : Z -> val, len: Z, size: Z
  PRE [ _s OF (tptr tuchar) ]
          PROP (0 < size <= Int.max_signed;
                ascii_string contents len size)
          LOCAL (`(eq s0) (eval_id _s); `isptr (eval_id _s))
          SEP (`(array_at tuchar sh contents 0 size s0))
  POST [ tuint ]
        local (`(eq (Vint (Int.repr len))) retval)
              && `(array_at tuchar sh contents 0 size s0).

This specification says that strlen takes an argument _s which is an array of chars of size size. It posits that this array is a null-terminated ascii string of length len, in the sense described above (see our ascii_string definition).

It says that strlen returns a machine int which is equivalent to len (modulo 8*sizeof(int)).

Lastly, it says that strlen does not modify the data pointed-to by _s.

A formal proof that our strlen is correct

First we bundle up all of our definitions and specifications:

Definition Vprog : varspecs := nil.
Definition Gprog : funspecs :=  strlen_spec :: nil.

Now we can state and prove a lemma about strlen meeting its specification. Note that I got stuck near the end and cheated with an “admit”. I admit it, I got tired!

Lemma body_strlen: semax_body Vprog Gprog f_strlen strlen_spec.
Proof.
  start_function.
  set (H_ascii_string := H0).
  destruct H0 as [ zero_lte_len_lt_size H_ascii_not_nil H_contents_len ].
  forward (* i = 0; *).
  forward. (* c = s[i]; *)
  { entailer!.
    destruct (zeq 0 len).
    + subst ; rewrite H_contents_len ; simpl ; trivial.
    + assert (ascii_not_nil (contents 0)) as ann
         by (apply H_ascii_not_nil; omega).
      inv ann; simpl; auto.
  }
  set (strlen_Inv :=
    EX i': Z,
    (PROP ( 0 <= i' <= len )
     LOCAL ( `(eq s0) (eval_id _s)
           ; `(eq (Vint (Int.repr i'))) (eval_id _i)
           ; `(eq (contents i')) (eval_id _c)
           )
     SEP ( `(array_at tuchar sh contents 0 size) (eval_id _s) )
    ) ).
  set (strlen_Post :=
    (PROP ( )
     LOCAL ( `(eq s0) (eval_id _s)
           ; `(eq (Vint (Int.repr len))) (eval_id _i)
           )
     SEP ( `(array_at tuchar sh contents 0 size) (eval_id _s) )
    ) ).
  forward_while strlen_Inv strlen_Post; unfold strlen_Inv, strlen_Post in * ; clear strlen_Inv strlen_Post.
  (* Prove that current precondition implies loop invariant *)
  { apply exp_right with 0.
    entailer!.
    rewrite H0.
    normalize.
    destruct (zeq 0 len).
    + subst ; rewrite H_contents_len ; simpl.
      normalize.
    + assert (ascii_not_nil (contents 0)) as contents_0_ascii_nn.
      { apply H_ascii_not_nil ; omega. }
      assert (is_int (contents 0)) as contents_0_is_int.
      { apply ascii_string_is_int with len size ; try omega.
        apply H_ascii_string.
      }
      destruct (contents 0) ; simpl in contents_0_is_int ; try contradiction.
      destruct contents_0_ascii_nn.
      simpl.
      (* *)
      admit.
  }
  (* Prove that loop invariant implies typechecking condition *)
  { entailer!. }
  (* Prove that invariant && not loop-cond implies postcondition *)
  { entailer!.
    replace len with i' ; trivial.
    apply Z.le_lteq in H1 ; destruct H1 ; try assumption.
    (* now we go for the contradiction *)
    rewrite negb_false_iff in H2.
    apply int_eq_e in H2.
    assert (ascii_not_nil (contents i')) as H_contents_i'.
    { apply H_ascii_not_nil. omega. }
    destruct H_contents_i'.
    inversion H3 ; subst ; simpl in *.
    omega.
  }
  (* Prove that loop body preserves invariant *)
  { forward (* i++; *).
    forward (* c = s[i]; *).
    + entailer!.
      { (* is_int (contents (i' + 1)) *)
        assert (i' < len) as H_i'_len.
        { apply Z.le_lteq in H1 ; destruct H1 ; subst ; try assumption.
          rewrite H_contents_len in H4.
          inversion H4 ; subst.
          compute in H3 ; inversion H3.
        }
        assert (is_int (contents (i' + 1))) as Q.
        { apply ascii_string_is_int with len size.
          + apply H_ascii_string.
          + omega.
        }
        destruct (contents (i' + 1)) ; simpl in * ; try contradiction ; trivial.
      }
      { (* 0 <= i' + 1 < size *)
        assert (i' < len) ; try omega.
        apply Z.le_lteq in H1.
        destruct H1 ; trivial ; subst.
        rewrite H_contents_len in H4 ; inversion H4 ; subst.
        compute in H3 ; inversion H3.
      }
    + entailer!.
      apply exp_right with (i' + 1).
      entailer!.
      { (* i' + 1 <= len *)
        assert (contents i' <> Vint Int.zero).
        { intro Q ; rewrite Q in H4.
          compute in H4 ; inversion H4.
        }
        apply Z.le_lteq in H1.
        destruct H1 ; subst ; try omega ; congruence.
      }
      { (* contents (i' + 1) = Vint _id *)
        rewrite H2 ; normalize.
        (* *)
        admit.
      }
  }
  (* loop is done, continue with rest of proof *)
  forward (* return i; *).
Qed.

A formal proof that our strncat is correct

Next time!