
The blog of Maxime Kjaer

CS-550 Formal Verification

Throughout these notes, vectors are denoted in bold and lowercase (e.g. ). Booleans are denoted interchangeably by and , or and . Generally, I try to use and for formulas, and and for mathematical notation1.

Transition systems

Definition

We’ll start by introducing transition systems, a generalization of DFAs. They may be neither finite nor deterministic.

Definition: Transition system

A transition system is a 4-tuple where:

  • is the set of states
  • is the set of starting states
  • is the transition relation
  • is the alphabet

For and , means that there is a transition from to on input .

A few special cases of this general form exist:

  • If is finite, we have a finite state machine
  • If and , then the system is deterministic.

Traces and reachability

Definition: Trace

A trace is a finite or infinite sequence, describing steps taken by a transition system:

where we require, :

A trace may or may not be finite. If it is finite, we can assume that the trace ends with a state . We’ll introduce some notation for traces:

Definition: Trace of a transition system

is the set of all traces of a transition system , starting from .

Definition: Reachable states of a transition system

The reachable states are states for which there exists a trace that ends in :

To check for reachability, for a finite , we can simply run DFS.
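As a small aside, here’s what that might look like in Scala, with states as plain values and the transition relation as a set of (state, input, state) triples. This is just my own illustrative sketch, not something from the course:

// Reachable states of a finite transition system, by depth-first search.
def reachableStates[S, A](
    initial: Set[S],
    transitions: Set[(S, A, S)]
): Set[S] = {
  // successors of a single state, ignoring which input labels the edge
  def successors(s: S): Set[S] =
    transitions.collect { case (`s`, _, t) => t }

  // standard DFS over the transition graph
  def dfs(visited: Set[S], frontier: List[S]): Set[S] = frontier match {
    case Nil => visited
    case s :: rest =>
      val fresh = successors(s).diff(visited)
      dfs(visited ++ fresh, fresh.toList ++ rest)
  }

  dfs(initial, initial.toList)
}

reachableStates(Set(0), Set((0, 'a', 1), (1, 'b', 2), (3, 'a', 0)))  // Set(0, 1, 2)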

Relations

Let’s study relations more closely. A relation is a directed edge in the transition graph. We follow this edge when we see a given input .

If we look at the graph, we may not be interested in the inputs associated to each edge. Therefore, we can construct the edge set :

Generally, we’ll use a bar for relations that disregard input. Note that even when is deterministic, can become non-deterministic.

Composition

Relations can be composed:

Definition: Composition of relations

Note that composition is not commutative, but is associative.

To understand what a composition means, intuitively, we’ll introduce a visual metaphor. Imagine the nodes of the graph as airports, and the edges as possible routes. Let be the routes operated by United Airlines, and be the routes operated by Delta. Then is the set of routes possible by taking a United flight followed by a Delta flight.
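To make this concrete, here’s a small Scala sketch (my own, not from the course) with relations represented as sets of pairs:

type Rel[A] = Set[(A, A)]

// Composition: first take an edge from r1, then an edge from r2
def compose[A](r1: Rel[A], r2: Rel[A]): Rel[A] =
  for {
    (a, b)  <- r1
    (b2, c) <- r2
    if b == b2
  } yield (a, c)

val united: Rel[String] = Set(("GVA", "JFK"))
val delta:  Rel[String] = Set(("JFK", "SFO"), ("SFO", "LAX"))

compose(united, delta)  // Set(("GVA", "SFO")): a United flight followed by a Delta flight
compose(delta, united)  // Set(): composition is not commutative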

Iteration

Definition: Iteration

An iteration describes paths of length in a relation . It is defined recursively by:

Identity

In the above, describes the identity relation, i.e. a relation mapping every node to itself. We’ll see a small generalization below, which will be useful for the rest of the course:

Definition: Identity relation

The identity relation (also called “diagonal relation” or “triangular relation”) is the relation mapping every item in the universe to itself:

This relation can be conditioned, so as to only include elements in a set (or satisfying a condition ):

Transitive closure

Applying iteration an arbitrary number of times leads us to the transitive closure:

Definition: Transitive closure

The transitive closure of a relation is:

In our airport analogy, the transitive closure relates two airports whenever one can be reached from the other by some sequence of flights.
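Sticking with the Scala sketch from above, a naive way to compute the (reflexive) transitive closure of a finite relation is to keep adding composed edges until nothing changes:

// Reflexive transitive closure by fixpoint iteration, reusing Rel and compose
// from the sketch above. The identity part is restricted to the elements that
// actually appear in r.
def closure[A](r: Rel[A]): Rel[A] = {
  val elems = r.flatMap { case (a, b) => Set(a, b) }
  val identity: Rel[A] = elems.map(a => (a, a))

  def iterate(acc: Rel[A]): Rel[A] = {
    val next = acc ++ compose(acc, r)   // add one more step
    if (next == acc) acc else iterate(next)
  }

  iterate(identity ++ r)
}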

A few properties of transitive closure follow from the definition:

  • Reflexivity: , so every item is mapped back onto itself (among others, potentially)
  • Transitivity:
  • Distributivity (under certain conditions): if is reflexive and transitive and , then
  • Monotonicity:

A few bonus facts:

  • is the smallest reflexive transitive relation containing .
  • An edge is in the transitive closure if and only if there is a path in the graph . In mathematical notation, this is:

Image

Finally, we’ll introduce one more definition:

Definition: Image of a set

The image of a state set under a relation is the set of states reachable in one step from :

We also introduce an alternative notation:

The alternative notation may make it simpler to read images; can be read as “ following then following ”.

Reach

The above definitions lead us to a first definition of reach:

Theorem: Definition 1 of reach

Post

We’ll introduce another definition in order to give an alternative definition of reach. We’re still considering a state system .

Definition: Post

If , define:

We also define:

This definition of post leads us to another formulation of reach:

Theorem: Definition 2 of reach

The proof is done by expanding the post:

Where:

Invariants

Definition: Invariant

An invariant of a system is any superset of the reachable states:

A way to think of invariants is that all the reachable states must “satisfy the invariant”, i.e. be included in .

Definition: Inductive Invariant

An inductive invariant is a set satisfying:

Intuitively, the second condition means that you can’t “escape” an inductive invariant by taking a step: there can be ingoing edges to an inductive invariant, but no edges exiting the set.

Note that every inductive invariant is also an invariant. Indeed, for an inductive invariant , if then we must also grow to include all states reachable from in order to satisfy the second property. Therefore, and is an invariant.

Definition: Inductive strengthening

For an invariant , is an inductive strengthening of if:

  • is an inductive invariant

In this case, we have:

Encoding and storage

Motivating example

Transition systems very quickly get out of hand. Suppose a snack dispenser has different items, and has 10 of each. It can take 500 coins. It then has:

They might not all be reachable, but reachability is a separate question. Here, we’re just defining the graph. In any case, this is too large to store explicitly. Therefore, we must find formulas and data structures to represent it more compactly.

Sequential circuit

We’ll consider a deterministic finite-state transition system . We will let be such that , and , or in other words, and .

To encode the transition system in bits, we can think of it as 4-tuple of boolean functions:

  • We can represent the states on bits:
  • We can represent the alphabet on bits:
  • We can represent the set of initial states by the boolean function , which maps each state to a boolean (“initial” or “not initial”).
  • We can represent the transition relation as a function , which is

We can think of these functions as a circuit receiving an input , which alters the state , which in turn alters the relation .

Example of a simple sequential circuit

Formula encoding of boolean functions

Both the initial states and the transition relation are boolean functions. We will often define them mathematically (e.g. or ), but how do we encode them in bits?

We can use a truth table. For instance, the truth table encoding of is a table mapping inputs and to outputs . This is the most efficient encoding if we need to be able to encode any relation: there is no better general encoding. However, most functions can be stored more efficiently.

Let us consider the relation function . We can consider an input-output pair as the following condition:

We can write this as a propositional formula with variables . This formula should be true exactly when the tuple belongs to .

Let , and let be a propositional variable. We can then define:

With this notation in hand, we can state the following:

Theorem: Formula encoding of boolean functions

We can always represent a transition relation as a propositional formula in disjunctive normal form, where:

For many boolean functions (, , etc) this formulation will be quite small.

Auxiliary variables

This formula is a tree, where variables are leaves, and operations are internal nodes (much like an AST). Clearly, the variables are shared by many operations, so the more efficient representations exploit this sharing, and represent the tree as a directed acyclic graph (DAG) instead.

Note that not only leaves of the tree are shared, but that bigger nodes can also be shared. This leads us to introduce the notion of auxiliary variable definitions. An auxiliary variable represents one such shared node.

We can define auxiliary variables directly in the propositional logic by using the operator (which is alternative notation for ).

For instance, for a -bit ripple-carry adder can be encoded with:

  • Initial state
  • Input numbers
  • Output

The formula for the adder with auxiliary variables (serving as the carry) is:

The initial carry is . Every individual adder gets 3 bits of input to add: two bits, and the previous carry. It outputs a carry of 1 iff two or more input bits are 1. It outputs a single bit as the result of the addition, which is the XOR of all inputs.
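As a concrete (if informal) sketch, here is how one could generate those definitions in Scala, with formulas written as plain strings; the variable names are my own choice, not the course’s exact encoding:

// n-bit ripple-carry adder as a list of auxiliary-variable definitions.
def rippleCarryAdder(n: Int): Seq[String] = {
  def x(k: Int) = s"x$k"; def y(k: Int) = s"y$k"   // input bits
  def c(k: Int) = s"c$k"; def z(k: Int) = s"z$k"   // carries (auxiliary) and output bits

  val initialCarry = s"${c(0)} ≡ 0"
  val bits = (0 until n).flatMap { k =>
    Seq(
      // the sum bit is the XOR of the two input bits and the incoming carry
      s"${z(k)} ≡ (${x(k)} ⊕ ${y(k)} ⊕ ${c(k)})",
      // the outgoing carry is 1 iff at least two of the three inputs are 1
      s"${c(k + 1)} ≡ ((${x(k)} ∧ ${y(k)}) ∨ (${x(k)} ∧ ${c(k)}) ∨ (${y(k)} ∧ ${c(k)}))"
    )
  }
  initialCarry +: bits
}

rippleCarryAdder(2).foreach(println)
// c0 ≡ 0
// z0 ≡ (x0 ⊕ y0 ⊕ c0)
// c1 ≡ ((x0 ∧ y0) ∨ (x0 ∧ c0) ∨ (y0 ∧ c0))
// z1 ≡ (x1 ⊕ y1 ⊕ c1)
// c2 ≡ ((x1 ∧ y1) ∨ (x1 ∧ c1) ∨ (y1 ∧ c1))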

Propositional logic

Definition

Propositional logic is a language for representing Boolean functions as formulas. The grammar of this language is:

Where denotes variable identifiers.

QBF

Having looked at boolean formulas let’s now look at a small generalization, QBFs:

Definition: Quantified Boolean Formulas

Quantified Boolean Formulas (QBFs) are built from:

  • Propositional variables
  • Constants and
  • Operators

We will use as alternative notation for . A boolean formula is a QBF without quantifiers ( and ).

Free variables

Definition: Free variables

The free variables of a formula are the variables that are not bound by a quantifier:

Environment

Definition: Environment

An environment is a partial map from propositional variables to (“false” or “true”).

Consider two vectors and of propositional variables. We denote the environment as a mapping given by ,

We denote the result of evaluating a boolean expression with the environment as . This can evaluate to (“false”) or (“true”).

While the formula might seem circular at first (we define the connective in terms of another occurrence of itself), it makes more sense if we think of the first as an AST node and of the second as a logical operation in the host language:

def eval(expr: Expr)(implicit env: Var => Boolean): Boolean = expr match {
    case x: Var => env(x)
    case Literal(0) => false
    case Literal(1) => true
    case And(a, b) => eval(a) && eval(b)
    case Or(a, b) => eval(a) || eval(b)
}

With this notation in hand, we can introduce the following shorthand:

Definition: Models

We write to denote that is true in environment , i.e. that .

Substitution

Definition: Substitution

Let and be propositional formulas, and let be a variable. Let denote the result of replacing each occurrence of in by :

We’ll also introduce a general notation to simultaneously replace many variables: denotes the substitution of a vector of variables with a vector of expressions .
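Substitution is easy to implement for the little Expr AST used in the eval example above; here’s a sketch, assuming the same Var, Literal, And and Or case classes:

// F[x := G]: replace every occurrence of the variable x in F by G.
def subst(f: Expr, x: Var, g: Expr): Expr = f match {
  case v: Var if v == x => g
  case v: Var           => v
  case l: Literal       => l
  case And(a, b)        => And(subst(a, x, g), subst(b, x, g))
  case Or(a, b)         => Or(subst(a, x, g), subst(b, x, g))
}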

Validity, Satisfiability and Equivalence

Definition: Satisfiability

A formula is satisfiable

Note that if is not satisfiable, it is unsatisfiable, which means .

Definition: Validity

A formula is valid

Theorem: Validity and unsatisfiability

is valid is unsatisfiable.

The proof should follow quite trivially from the definitions.

Definition: Equivalence

Formulas and are equivalent if and only if, for every environment defined for all free variables in , we have:

This means that two formulas are equivalent if and only if they always return the same values for the same environment (assuming it’s defined for all their free variables).

Theorem: Equivalence and validity

and are equivalent if and only if the formula is valid.

Bounded model checking

Formula representation of sequential circuits

Definition: Sequential circuit

We represent a sequential circuit as a 5-tuple where:

  • is the vector of state variables
  • is a boolean formula describing the initial state
  • is a boolean formula called the transition formula
  • is the vector of auxiliary variables
  • is the vector of input variables

The boolean formula tells us which states we can start in, so it can only contain state variables:

The transition formula can only contain state (), next-state (), auxiliary () or input () variables:

The sequential circuit is a representation of a transition system , where:

  • , meaning that the initial states of a transition system is given by an assignment of state variables that verifies the formula
  • , meaning that the transition relation is given by a mapping of states and inputs to next-states , such that the mapping satisfies the transition formula. Here, the auxiliary variables are existentially quantified so that we can express the criterion without them.

Inductive invariant checking

Given a sequential circuit representation of a transition system , and a formula , how can we check that is an inductive invariant? According to the definition, we require:

We’ll ask the SAT solver to check if it’s possible to break either of the two conditions by asking it if it can satisfy either of the following two formulas:

  • to check if there is an initial state not included in the invariant.
  • to check if, starting from the invariant and taking a step, we can end up outside of the invariant.

    To understand this condition, it’s useful to think of as determining the assignment of . Then, seeing that contains variables and , it will fix the assignment for the next states . We can then see if the invariant is still true at the next state.

If the SAT solver returns UNSAT to both, we have proven that is an inductive invariant. Note that this resembles a proof by induction (because it is!).
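To make the shape of these two queries concrete, here is a small generic sketch; all names and the solver interface are hypothetical:

// F is whatever formula type the SAT solver accepts; solve returns Some(model)
// when the formula is satisfiable and None when it is UNSAT.
def isInductiveInvariant[F](
    init: F,         // Init(x): the initial-state formula
    trans: F,        // T(x, i, x'): the transition formula
    inv: F,          // Inv(x): candidate invariant over the current state variables
    invNext: F,      // Inv(x'): the same invariant over the next-state variables
    and: (F, F) => F,
    not: F => F,
    solve: F => Option[Map[String, Boolean]]
): Boolean = {
  val initiation  = and(init, not(inv))                    // an initial state outside Inv?
  val consecution = and(and(inv, trans), not(invNext))     // a step from Inv escaping Inv?
  solve(initiation).isEmpty && solve(consecution).isEmpty  // both UNSAT ⇒ Inv is inductive
}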

Bounded model checking for reachability

How do we check whether a given state is reachable? Often, we’re interested in knowing if we can reach an error state or not; being able to do so would be bad. To simplify the question a little, we can ask whether we can reach this error state in steps.

Let be the error formula corresponding to the error state, so . When we talked about circuits, we said that the state and inputs change at each step, so let us denote the state at step as , and the inputs at step as .

We’ll construct an error formula that is satisfiable if and only if there exists a trace of length starting from the initial state that satisfies :

This formula starts at the initial state, then computes all states , and plugs in the final state in the error formula to see if it can be satisfied.

If the SAT solver returns UNSAT, the error state is not reachable within the given bound. If it returns SAT, the satisfying assignment describes a concrete trace reaching the error state, which serves as a counterexample.
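Here is a sketch of how the unrolled formula could be assembled, again with hypothetical helpers for naming the per-step copies of the variables and for building the sub-formulas:

// Builds Init(s0) ∧ T(s0, i0, s1) ∧ ... ∧ T(s(k-1), i(k-1), sk) ∧ E(sk).
def bmcQuery[F, V](
    k: Int,
    stateVars: Int => V,          // the j-th copy of the state variables
    inputVars: Int => V,          // the j-th copy of the input variables
    init: V => F,                 // Init(s0)
    trans: (V, V, V) => F,        // T(s_j, i_j, s_{j+1})
    error: V => F,                // E(sk)
    and: (F, F) => F
): F = {
  val steps = (0 until k).map(j => trans(stateVars(j), inputVars(j), stateVars(j + 1)))
  (Seq(init(stateVars(0))) ++ steps ++ Seq(error(stateVars(k)))).reduce(and)
}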

Satisfiability checking

SAT problem

The SAT problem is to determine whether a given formula is satisfiable. The problem is NP-complete, but useful heuristics exist.

A SAT solver is a program that given a boolean formula either:

  • Returns SAT and optionally an environment such that
  • Returns UNSAT and optionally a proof that no satisfying assignment exists

Formal proof system

Let’s consider a set of logical formulas (e.g. propositional logic).

Definition: Proof system

A proof system is a pair where is a decidable set of inference steps, where:

  • A set is decidable if there is a program to check whether an element belongs to it
  • Given a set , denotes all finite sequences with elements from

Definition: Inference step

An inference step is a 2-tuple , which we can denote as:

We say that from the premises , we derive the conclusion .

Definition: Axiom

An inference step is called an axiom when , i.e. when it has no premises:

Definition: Proof

Given a proof system , a proof is a finite sequence of inference steps such that, for every inference step , each premise is a conclusion of a previous step.

A minimal propositional logic proof system

We’ll look into a simple logic called the Hilbert system. We’ll define the grammar of our logic as follows. This grammar defines a set of formulas .

This may seem very minimal, but we can actually express many things by combining these. For instance, we can introduce negation (“for free”) as a shorthand:

The inference rules are , where:

The first two rules are axioms, telling us that an implication is true if the right-hand side is true (), and that implication is distributive (). We might recognize modus ponens in the last rule.

We can use these rules to construct a proof of . We’ll draw this proof as a DAG: we can always view a proof as a DAG because of the requirement that the premises of an inference step be a conclusion of a previous step.

Proof DAG: modus ponens applied to the axiom instances a→((a→a)→a) and (a→((a→a)→a)) → ((a→(a→a))→(a→a)) derives (a→(a→a))→(a→a); applying modus ponens again with the axiom instance a→(a→a) derives a→a.

Provability

A formula is provable if we can derive it from a set of initial assumptions. We’ll start by formally defining what an assumption even is:

Definition: Assumptions

Given where , and given a set of assumptions , a derivation from in is a proof in where:

In other words, assumptions from are just treated as axioms (i.e. they are rules that have no prerequisites, hence ). A derivation is a proof that starts from assumptions.

Definition: Provable

We say that “a formula is provable from a set of assumptions ”, denoted , when there exists a derivation from in that contains an inference step whose conclusion is .

We write (or simply ) to denote that there exists a proof in containing as a conclusion.

Consequence and soundness in propositional logic

Definition: Semantic consequence

Given a set of assumptions , where is in propositional logic, and given , we say that is a semantic consequence of , denoted , if for every environment that defines all variables in , we have:

In other words, if a formula is true in every environment in which all the assumptions are true, then we call it a semantic consequence of those assumptions.

Definition: Soundness

A step is sound

A proof system is sound if every inference step is sound.

In other words, a conclusion of step is sound if it is a semantic consequence of the previous steps. A proof is sound if all steps are sound.

If is an axiom (which has no precondition, meaning that in the above), this definition of soundness means that is always a valid formula. We call this a tautology.

Theorem: Semantic consequence and provability in sound proof systems

Let where are propositional logic formulas. If every inference rule in is sound, then implies .

This theorem tells us that if all the inference rules are sound, then is a semantic consequence of whenever is provable from . This may sound somewhat straightforward (if everything is sound, then it seems natural that the semantic consequence follows from provability), but is a nice way to restate the previous definitions.

The proof is immediate by induction on the length of the formal proof. As a consequence, a formula that is provable without any assumptions is a tautology.

Proving unsatisfiability

Let’s take a look at two propositional formulas and . These are semantically equivalent if and .

We can prove equivalence by repeatedly applying the following “case analysis” rule, which replaces a given variable by 0 in and by 1 in :

Definition: Case analysis rule

This is sound, because if we consider an environment defining , and assume and , then:

  • if then
  • if then

In either case remains true when and are sound.

Strictly speaking, the above rule may not be quite enough, so we’ll also introduce a few simplification rules that preserve the equivalence:

Those rules together form the sound system , where:

Remember that a set of formulas is satisfiable if there exists an environment such that for every formula , . We can use to conclude unsatisfiability:

Theorem: Refutation soundness

If then is unsatisfiable

Here, means false. This follows from the soundness of . More interestingly, the converse is also true.

Theorem: Refutation completeness

If a finite set is unsatisfiable, then

This means that unsatisfiable .

For the proof, we can take the conjunction of formulas in and existentially quantify it to get (i.e. )

Conjunctive Normal Form (CNF)

To define conjunctive normal form, we need to define the three levels of the formula:

  • CNF is the conjunction of clauses
  • A clause is a disjunction of literals
  • A literal is either a variable or its negation

This is a nice form to work with, because we have the following property: if is a clause then there exists a literal such that .

We can represent formulas in CNF as a set of sets. For instance:

The false value can be represented as the empty clause . Note that seeing an empty clause in CNF means that the whole formula is unsatisfiable.

Clausal resolution

Definition: Clausal resolution rule

Let and be two clauses.

This rule resolves two clauses with respect to . It says that if clause contains , and clause contains , then we can remove the variable from the clauses and merge them.
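As a tiny sketch, if we represent clauses as sets of integer literals (a positive number for a variable, a negative one for its negation), resolution is a one-liner:

type Clause = Set[Int]

// Resolve c1 and c2 on variable x, assuming x ∈ c1 and ¬x ∈ c2.
def resolve(c1: Clause, c2: Clause, x: Int): Clause =
  (c1 - x) ++ (c2 - (-x))

resolve(Set(1, 2), Set(-1, 3), 1)  // Set(2, 3)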

Theorem: Soundness of the clausal resolution rule

Clausal resolution is sound for all clauses and propositional variables .

This tells us that clausal resolution is a valid rule. A stronger result is that we can use clausal resolution to determine satisfiability for any CNF formula:

Theorem: Refutational completeness of the clausal resolution rule

A finite set of clauses is satisfiable there exists a derivation to the empty clause from using clausal resolution.

Unit resolution

A unit clause is a clause that has precisely one literal: it’s of the form where is a literal. Note that the literal in a unit clause must be true.

Given a literal we define the dual as and .

Unit resolution is a special case of resolution where at least one of the clauses is a unit clause.

Definition: Unit resolution

Let be a clause, and let be a literal.

This is sound (if is true then is false and can thus be removed from another clause ). When applying this rule we get a clause : this gives us progress towards , which is good.

Equivalence and equisatisfiability

Let’s recall that two formulas and are equivalent iff and .

Definition: Equisatisfiability

Two formulas and are equisatisfiable when one is satisfiable if and only if the other is satisfiable.

Equivalent formulas are always equisatisfiable, but equisatisfiable formulas are not necessarily equivalent.

Tseytin’s Transformation

Tseytin’s transformation is based on the following insight: if and are two formulas, and we let be a fresh variable, then is equisatisfiable with:

Tseytin’s transformation applies this recursively in order to transform an expression to CNF. To show this, let’s consider a formula using :

The transformation works by introducing a fresh variable for each operation (we can think of it as being for each AST node):

Note that these formulas refer to subterms by their newly introduced equivalent variable. This prevents us from having an explosion of terms in this transformation.

Each of these equivalences can be converted to CNF by using De Morgan’s law, and switching between and . The resulting conversions are:

Operation CNF

Note that the Tseytin transformations can be read as implications. For instance, the transformation can be read as:

  • If and are true, then is true
  • If is false, then is false
  • If is false, then is false

It then takes the conjunction of all these equivalences:
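To make the whole transformation a bit more concrete, here is a compact sketch of my own over a small hypothetical AST, producing clauses in the integer-literal representation used earlier. Negation is handled by flipping a literal instead of introducing a fresh variable, and firstFresh must be larger than every variable id already in use:

sealed trait E
case class V(id: Int)      extends E
case class Not(e: E)       extends E
case class And(l: E, r: E) extends E
case class Or(l: E, r: E)  extends E

def tseytin(root: E, firstFresh: Int): (Int, List[Set[Int]]) = {
  var next = firstFresh
  def fresh(): Int = { next += 1; next - 1 }

  // returns the literal standing for the subterm, plus the clauses defining it
  def go(e: E): (Int, List[Set[Int]]) = e match {
    case V(id) => (id, Nil)
    case Not(a) =>
      val (la, ca) = go(a)
      (-la, ca)
    case And(a, b) =>
      val (la, ca) = go(a); val (lb, cb) = go(b)
      val x = fresh() // x ≡ (la ∧ lb), encoded as three clauses
      (x, Set(-x, la) :: Set(-x, lb) :: Set(x, -la, -lb) :: ca ::: cb)
    case Or(a, b) =>
      val (la, ca) = go(a); val (lb, cb) = go(b)
      val x = fresh() // x ≡ (la ∨ lb), encoded as three clauses
      (x, Set(-x, la, lb) :: Set(x, -la) :: Set(x, -lb) :: ca ::: cb)
  }

  val (l, clauses) = go(root)
  (l, Set(l) :: clauses) // also assert the root literal itself
}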

SAT Algorithms for CNF

Now that we know how to transform to CNF, let’s look into algorithms that solve SAT for CNF formulas.

DPLL

The basic algorithm that we’ll use is DPLL, which applies clausal resolution recursively until an empty clause appears, or all clauses are unit clauses. This works thanks to the theorem on refutational completeness of the clausal resolution rule.

def DPLL(S: Set[Clause]): Bool = {
    val S' = subsumption(UnitProp(S))
    if (∅ ∈ S') false // an empty clause means the whole thing is unsat
    else if (S' has only unit clauses) true // the unit clauses give a satisfying environment
    else {
        val L = a literal from a clause of S' where {L} not in S'
        DPLL(S' + Set(L)) || DPLL(S' + Set(complement(L)))
    }
}

// Unit Propagation
def UnitProp(S: Set[Clause]): Set[Clause] =
    if (C ∈ S && unit U ∈ S && resolve(U, C) not in S)
        UnitProp((S - C) + resolve(U, C))
    else S

// Subsumption: drop clauses that are supersets of other clauses
def subsumption(S: Set[Clause]): Set[Clause] =
    if (C1, C2 ∈ S such that C1 ⊆ C2)
        subsumption(S - C2)
    else S

Backtracking

Perhaps the most intuitive algorithm is to construct a binary decision tree. At each node, we take a random decision. If that leads to a conflict, we go back one step, and take the opposite decision.

This is quite simple, but the complexity is exponential. We’ll therefore see some smarter algorithms.

Non-chronological backtracking

We still construct a binary decision tree. Each decision may also force some other variables into a certain value: we will track these implications in a directed graph.

In this graph, each node represents a value assignment to a variable; let’s say we color a node blue if it has been set directly by a decision, and in green if the value is a consequence of a decision. Edges go from nodes (which may be blue or green) to their consequences.

As we construct the graph, we may create conflicts, meaning that we have two (green) nodes assigning different values to the same variable. In this case, we look at the set of nodes pointing to the conflicting pair of nodes:

Implication graph: the assignments x1 = 0, x2 = 0, x3 = 1 and x7 = 1 lead to x4 = 1, x8 = 0, x11 = 1 and x12 = 1, as well as to both x9 = 0 and x9 = 1, which is a conflict; the nodes x3 = 1, x7 = 1 and x8 = 0 point to the conflicting pair.

In the above example, the variables x3, x7 and x8 point to the conflicting pair of x9 assignments. This means that one of the assignments to these variables was incorrect, because it led us to the conflict. We can learn from this conflict, and add the following conflict clause to the formula:

We can then backtrack to the first decision where we set , or (which may be much earlier than the parent decision in the tree). Every once in a while, it may be useful to completely abandon the search tree (but keep the conflict clauses): this is called a restart.

This approach significantly prunes the search tree; by introducing this conflict clause, we’ve learned something that will be useful forever.

2-literal watching

This algorithm does the standard trick of unit propagation, but avoids expensive book-keeping by only picking 2 literals in each clause to “watch”. We ignore assignments to other variables in the clause.

For each variable, we keep two sets of pointers:

  • Pointers to clauses in which the variable is watched in its negated form
  • Pointers to clauses in which the variable is watched in its non-negated form

Then, when a variable is assigned true, we only need to visit the clauses in which it is watched in its negated form (and symmetrically when it is assigned false). A nice consequence is that backtracking becomes cheap: undoing an assignment requires no updates to the watch lists.

Linear Temporal Logic

Definition

With bounded model checking, we’ve seen how to check for a property over a finite trace. However, there are certain properties that we cannot prove by just seeing a finite trace. For instance, program termination or eventual delivery cannot be proven with the semantics of propositional logic. This will prompt us to introduce linear temporal logic, with which we can study events on a trace.

Let be a boolean formula over state and input variables.

We write to say that a trace satisfies formula at time . The rules for the above constructs are:

We can interpret until to mean that is true until becomes true. For instance, we can guarantee something until an overflow bit is on, at which point all bets are off. Note that this does not impose any constraints on once is true.

Note that next and prev, and since and until are duals of each other3.

We can add some derived operations from these:

The operation globally can be thought of as meaning “forever, from now on” (where “now” is step ). For instance, means that from now on, every point in the trace satisfying is followed by a point in the trace satisfying at some point.
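To make the recursive definitions more tangible, here is a small evaluator sketch over a finite trace (real LTL traces are infinite, so this only checks a finite prefix; the AST and names are my own):

type State = Set[String]     // a state is just the set of atoms true in it
type Trace = Vector[State]

sealed trait Ltl
case object True                 extends Ltl
case class Atom(p: String)       extends Ltl
case class Not(f: Ltl)           extends Ltl
case class And(l: Ltl, r: Ltl)   extends Ltl
case class Next(f: Ltl)          extends Ltl
case class Until(l: Ltl, r: Ltl) extends Ltl

def holds(t: Trace, i: Int, f: Ltl): Boolean = f match {
  case True        => true
  case Atom(p)     => t(i).contains(p)
  case Not(g)      => !holds(t, i, g)
  case And(l, r)   => holds(t, i, l) && holds(t, i, r)
  case Next(g)     => i + 1 < t.length && holds(t, i + 1, g)
  case Until(l, r) =>
    // l U r: r holds at some step j >= i, and l holds at every step before j
    (i until t.length).exists(j =>
      holds(t, j, r) && (i until j).forall(k => holds(t, k, l)))
}

// Derived operators: eventually and globally
def eventually(f: Ltl): Ltl = Until(True, f)
def globally(f: Ltl): Ltl   = Not(eventually(Not(f)))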

Binary Decision Diagrams

Definition

Binary Decision Diagrams (BDDs) are a representation of Boolean functions: we can represent a Boolean function as a directed acyclic graph (DAG). We distinguish terminal from non-terminal nodes. Each edge is labeled with a decision: a solid edge (“high edge”) means the variable is 1, and a dashed edge (“low edge”) means that it is 0. The leaf nodes are labeled with the outcome.

For instance, the BDD for is:

BDD with root node x: one outgoing edge leads to a terminal 1, the other to a node y, whose edges lead to a second terminal 1 and to the terminal 0.

We can also think of this DAG as a finite automaton, or as a (loop-free) branching program.

Canonicity

The key idea of BDDs is that specific forms of BDDs are canonical, and that BDDs can be transformed into their canonical form. For instance, the previous example can be rewritten as:

The same BDD with the two 1-terminals merged: x has edges to y and to the terminal 1, and y has edges to the terminals 1 and 0.

Let’s define this transformation more formally.

We first assign an arbitrary total ordering to variables: variables must appear in that order along all paths. Here, we picked as our ordering. Note that in general, selecting a good ordering is an intractable problem, but decent application-specific heuristics exist.

Then, the reduced ordered BDD (RO-BDD) is obtained by:

  • Merging equivalent leaf nodes (which is what we did above)
  • Merging isomorphic nodes (same variables and same children)
  • Eliminating redundant tests (where both outgoing edges go to the same child)

Doing this brings us to the following property:

Theorem: Unicity of RO-BDD

With a fixed variable order, the RO-BDD for a Boolean function is unique

In the following, we’ll always refer to RO-BDDs, and just call them BDDs.

Data structures for BDDs

To encode a BDD, we’ll number all nodes as 0, 1, 2, …, where 0 and 1 are terminals. The variables are also numbered as . We also number the levels in the diagram, according to the total ordering that we selected: level 1 is the root node, and level contains terminal nodes.

The same BDD annotated with levels: x is at level 1, y at level 2, and the terminals 1 and 0 at level 3.

A data structure for BDDs is the node table , mapping a node to its low child and high child . The node table also contains an entry describing the level in the ordering as well as the name of the variable.

For instance, for the example we used above, the table would be:

| Node number | Level and name | Low edge | High edge |
|-------------|----------------|----------|-----------|
| 0           | 3              |          |           |
| 1           | 3              |          |           |
| 2           | 2 (y)          | 0        | 1         |
| 3           | 1 (x)          | 2        | 1         |

We’ll assume we have some methods for querying this table:

  • init(T): initialize with 0 and 1 terminal nodes
  • u = add(T, i, l, h): add node with to
  • var(u): get of node
  • low(u): get of node
  • high(u): get of node

We can also define the inverse of a node table, allowing us to find a node if we have all the information about it. We’ll assume that we have some methods to query and update this structure:

  • init(H): initialize with 0 and 1 terminal nodes
  • u = lookup(H, i, l, h): get node from
  • insert(H, i, l, h, u): add an entry from to

Basic operations on BDDs

Somewhat surprisingly, given a BDD for a formula , we can check whether it is a tautology, satisfiable or inconsistent in constant time:

  • is a tautology if and only if its BDD is the terminal node 1
  • is satisfiable if and only if its BDD is not the terminal node 0
  • Two formulas are equivalent if and only if their BDDs are equal4

To insert a node into , we ensure that we keep an RO-BDD by eliminating redundant tests and avoiding duplicate isomorphic nodes. If the node does not already exist, we update the table and the reverse mapping .

type NodeId = Int
type Level = Int

def mk(i: Level, l: NodeId, h: NodeId): NodeId = {
  // eliminate redundant tests:
  if (l == h) l
  // prevent isomorphic nodes:
  else if (lookup(H, i, l, h) != null) lookup(H, i, l, h)
  else {
    val u = add(T, i, l, h) // insert node
    insert(H, i, l, h, u)   // update reverse mapping
    u
  }
}

With this method in hand, we can see how to build a BDD from a formula . The key idea is to use Shannon’s expansion:

Here, means that we replace all nodes by their high-edge subtree. In the formula, we can think of it as a substitution of by 1. We call this basic operation Restrict.

This breaks the problem into two subproblems, which we can solve recursively:

def build(f: Formula): NodeId = build2(f, 1)

private def build2(f: Formula, level: Level): NodeId =
  if (level > n)
    if (f == False) 0 else 1
  else {
    val xi = f.variables.head
    val f0 = build2(f.subst(xi, 0), level + 1)
    val f1 = build2(f.subst(xi, 1), level + 1)
    mk(level, f0, f1)
  }

Interpolation-based model checking

Interpolation

Definition: Interpolant

Let and be propositional formulas. An interpolant for and is a formula such that:

Note that if these conditions hold, we have . The goal of is to serve as an explanation of why implies , using variables that the two have in common.

Theorem: Existence and Lattice of Interpolants

Let and be propositional formulas such that and let be the set of interpolants of . Then:

  • If then and

For instance, let’s consider two formulas and . The variables they have in common are so we’re looking for an interpolant .

Reverse interpolants

We know that . The negation of this is . Let and . Instead of looking at the validity of , we can look at unsatisfiability of . To aid us in this, we’ll define as an interpolant for , meaning that we have:

  • , meaning that

As previously seen, we can determine whether is unsatisfiable by using the theorem on completeness of clause resolution: it is unsatisfiable iff we can derive the empty clause using resolution. A key insight here is that we can use the resolution proof to construct an interpolant for .

Let’s define as the interpolant for and , where denotes the clause , but only with the literals belonging to . The idea here is that we can construct it recursively, and that is the interpolant for and .

There are multiple ways of constructing this , like the Symmetric System or McMillan’s System (which uses a SAT solver). Many interpolants exist, and it’s unclear which is best. Small formulas are good, but we do not currently know any efficient algorithms to find them.

Tseytin’s transformation and interpolants

Remember that Tseytin’s transformation transforms an expression into CNF by introducing fresh variables. Suppose we have converted and to and , and that the newly introduced variables are and , respectively. The results of the Tseytin transformations are:

Lemma: Interpolant for CNF

The interpolant for and is also an interpolant for and

We’ll prove this by showing that all three properties of interpolants are preserved:

  • todo
  • todo
  • By assumption of being an interpolant for and , we have:

    The second step is by the above definition of the transformation’s results. The last step is because and are disjoint.

Reachability checking using interpolants

We can use interpolants to improve upon bounded model checking, in order to prove properties without having to unfold up to the maximum length. To do this, we’ll need to recall two concepts that we’ve seen previously:

  • Remember that bounded model checking constructs a formula that is satisfiable iff there exists a trace of length starting from the initial state, satisfying an error formula .
  • Also, recall the function.

Reachability checking

To check for reachability, we can keep adding to the set of reachable states until one of the following two situations arises:

  • We reach a fixpoint:
  • We find that an error state is reachable: .

The algorithm would look something like this:

val errorStates: Set[State] = ???

def reachable(currentStates: Set[State]): Boolean = {
  val nextStates = currentStates union post(currentStates)
  if ((nextStates intersect errorStates).nonEmpty)
    true
  else if (nextStates == currentStates)
    false
  else
    reachable(nextStates)
}

reachable(Init)

The problem with this approach is that computing may result in a complex image (a large formula or BDD), and it may take many steps to compute all reachable states. To fix this problem, we can do the same trick we always do in formal verification: simplifying the model.

Approximated reachability checking

The insight to simplify is that we can drop complex and uninteresting parts of the formula: instead of , we’ll only look at . This allows us to be faster: indeed, a formula that says nothing about certain boolean variables describes a larger set of states (meaning that we grow faster), and has a smaller formula (meaning that the result of is less complex).

So how does one approximate the post? There are many possible approximations of it, so we’ll use a subscript to denote the (potentially) different approximations. We’ll use a superscript to denote an approximation of something. Note that a superscript number still denotes recursive applications of the function.

Let denote any over-approximation of :

We’ll consider a monotonic 5, meaning that:

Now, consider the following sequence:

The relationship here is that:

This means that we can replace with recursive applications of different approximations. This leads us to the following algorithm:

val errorStates: Set[State] = ???
val post#: Array[Set[State] => Set[State]] = ???

def maybeReachable(
  currentStates: Set[State],
  postAcc: Set[State], // accumulator of post#[i-1](post#[i-2](...))
  i: Int
): Boolean = {
  // compute post#[i](...post#[1](Init)...)
  val nextPostAcc = post#[i](postAcc)

  val nextStates = currentStates union nextPostAcc
  if ((nextStates intersect errorStates).nonEmpty)
    true
  else if (nextStates == currentStates)
    false
  else
    maybeReachable(nextStates, nextPostAcc, i + 1)
}

maybeReachable(Init, Init, 1)

If maybeReachable returns false, then we know for sure that the system is safe, that the error states are not reachable. However, crucially, if maybeReachable returns true, then we cannot really conclude anything. Maybe the error states are reachable, and maybe we just need to try again with better approximations .

Constructing approximations from interpolants

We have discussed how to use an approximation, but not how to choose one. We’ll start out by showing how to find one from , and how to continue from there. Let’s look at the formula for bounded model checking again:

To simplify the formula a little, we’ll define:

This allows us to rewrite the formula as:

We define and as above, and are now interested in finding an interpolant between the two. The first thing we can look at is which variables and have in common; by the third property of interpolants:

We also need the two first properties of interpolants:

  • , which is equivalent to . Intuitively, this is like , so we can define . This is a valid approximation, as .

  • , which is equivalent to . We can just impose the constraint of to prevent from being too general.

Once we have an approximation for , we can find the by treating as the new and repeating the process, and so on for all the other approximations.

LCF Approach to Formal Proofs

Logic for Computable Functions (LCF) is a logic for reasoning about computable partial functions based on domain theory.

In “A metalanguage for Interactive Proof in LCF”, the idea is:

  • A Theorem is an abstract data type that stores a formula
  • We cannot create a theorem out of an arbitrary formula (the constructor is private)
  • We can create theorems by:
    • Creating axiom instances given some parameters (static methods)
    • Using inference rules to get theorems from other theorems (instance methods)

The idea is that this allows us to maintain invariants about the theorems. For instance:

final class Theorem private(val formula: Formula) {
  def modusPonens(pq: Theorem, p: Theorem): Theorem = pq.formula match {
    case Implies(pf, qf) if p.formula == pf => new Theorem(qf)
    case _ => throw new Exception("illegal use of modus ponens")
  }
}

First order logic

Grammar

We have formulas and terms :

Where:

  • denotes a variable
  • denotes a predicate symbol
  • denotes a function symbol
  • denotes a constant. We can think of it as a function symbol of arity 0.

To denote how many arguments a function or predicate symbol takes, we define a function . For instance, if then takes two arguments.

We’ll also define granularity levels of a formula:

  • We call an atomic formula if it contains no logical connectives or quantifiers.
  • A literal is an atomic formula or its negation.
  • A clause is a disjunction of literals

Interpretation

Definition: First-order logic interpretation

A first-order interpretation is denoted , where:

  • is the set of constants
  • maps constants, functions and predicate functions as follows:
    • Each constant into an element of , i.e.
    • Each function symbol with into a total function of arguments, i.e.
    • Each predicate symbol with into an -ary relation, i.e.

One thing that’s important to understand about first-order logic is that it is not defined over a fixed set of constants. It can be interpreted with an arbitrary set of constants , which is not necessarily just booleans (which would be ). Much like with SAT solvers which can return a boolean assignment , in an interpretation of a first-order formula, the choice of maps constants, function symbols and predicate symbols into actual values.

We can evaluate a given formula with a given interpretation . We denote this as .

  • If the result of the evaluation is true, we write or .
  • If the result of the evaluation is false, we write or .

Since the constants in aren’t necessarily booleans, how does the result of the evaluation return a boolean? Well, we can “push down” evaluation and compute logical operations on those results6:

We say that:

Definition: Validity and satisfiability in first-order logic

  • is valid if
  • is satisfiable if

As an example, let’s take a formula with function symbols and :

  • Validity:
  • Satisfiability:

This leads us to the following observation:

Lemma

is valid is not satisfiable

This means that to check validity, we can instead check satisfiability of the negation. Looking into negations will lead us to defining a normal form for negation.

Negational normal form

In negational normal form, negation only applies to atomic formulas, and the only other connectives are and (we’ve translated and away). It essentially “pushes down” negation. We can transform formulas to negational normal form as follows:

Prenex normal form

Prenex normal form is a form where all quantifiers appear first, followed by a formula without quantifiers. Once we are in negational normal form, we can pull quantifiers to the top by the following:

Skolem functions

Observe that the following formula is valid:

Indeed, if we assume the left-hand side of the implication to be true, we need to prove that the right-hand side also holds. To do this, we can simply pick to be the value of . This seems obvious.

What’s less obvious is a sort of converse: if we assume that the right-hand side holds, we can prove that there exists an satisfying the whole formula. Note that we can’t write because is a symbol and the FOL grammar doesn’t allow this, but what we can do instead is to extend the signature with a new function symbol that does not appear in the formula. We call this a Skolem function.

Definition: Skolemization

We can skolemize a formula in prenex normal form by replacing:

with

Where is a fresh function symbol (called a Skolem function) of arity .

Converting imperative programs to formulas

Motivating example

An imperative program can be thought of as a relation between initial state and final state. For instance, let’s consider the following instructions:

x = x + 2
y = x + 10

We can think of this as the relation:

We can ensure postconditions on this. In Stainless, we could check the following condition:

import stainless.lang._
import stainless.lang.StaticChecks._

case class FirstExample(var x: BigInt, var y: BigInt) {
  def increase: Unit = {
    x = x + 2
    y = x + 10
  }.ensuring(_ => old(this).x > 0 ==> (x > 0 && y > 0))
}

This is equivalent to the following condition, which says that the relation of the program should be a subset of the relation of the pre- and postconditions.

This is equivalent to checking the validity of the following formula:

Translating commands

We’ll first define a general formulation of the translation, and then look into commands on a case-by-case basis.

Suppose we want to translate a program containing mutable variables . Let be an arbitrary command. Let be the formula describing the relation between initial state and final state . We define the relation as :

Assignment

We formulate assignment (x = t) as follows:

The formula says that now has value , but also fixes everything that hasn’t changed. We need to constrain the other variables so that they cannot take arbitrary values.

Conditions

We formulate conditions (if (b) c1 else c2) as follows:

This is a fairly straightforward transformation that corresponds to the boolean formulation of a condition.

Non-deterministic choice

What if we have a condition where we do not know what happens in the condition (e.g. if (*) c1 else c2)? We can simply translate that into a disjunction:

Sequences

We formulate sequences (c1; c2) as composition of the relations and corresponding to and :

This simply tells us that there must exist an intermediary state that binds initial and final state together. The formula is thus:

Where are freshly picked. This relation places the constraint that the output variables of should be the input variables of . This is a little reminiscent of the formula for bounded model checking.

As a useful convention when converting larger programs, we can name after the position in the program.

Non-deterministic commands

Let’s introduce a potentially dangerous command, which we’ll call havoc (a word meaning destruction and confusion). How can we formulate havoc(x)? A conservative approach would be to let be arbitrary henceforth, but keep all other variables as they are:

Assumption

An assumption (which we’ll call assume(F)) limits the space of possibilities. We can use assume to recover from havoc (e.g. havoc(x); assume(x == e) is equivalent to x = e if x isn’t in the free variables of e). The command doesn’t change anything, but if the condition doesn’t hold, execution should be stopped.

The relation created by this translation is the identity relation over the set of variables satisfying . To justify the name “assume”, we can look at the following example.

  • , where are the final values that we get from executing .

Note that if (b) c1 else c2 is equivalent to if (*) { assume(b); c1 } else { assume(!b); c2 }; the generated formulas will be equivalent.
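Here is a toy sketch of the whole translation in Scala (my own formulation): each variable gets numbered copies x0, x1, …, and a command turns a “current version” environment into formula conjuncts plus an updated environment, in the spirit of naming intermediate variables after positions in the program. Right-hand sides and conditions are kept as raw text for simplicity.

type Env = Map[String, Int]                                    // current copy of each variable
def name(x: String, env: Env): String = s"$x${env(x)}"

sealed trait Cmd
case class Assign(x: String, rhs: Env => String) extends Cmd   // x = rhs
case class Havoc(x: String)                      extends Cmd
case class Assume(cond: Env => String)           extends Cmd
case class Seq2(c1: Cmd, c2: Cmd)                extends Cmd   // c1; c2
case class Choice(c1: Cmd, c2: Cmd)              extends Cmd   // if (*) c1 else c2

def translate(c: Cmd, env: Env): (String, Env) = c match {
  case Assign(x, rhs) =>
    val env2 = env.updated(x, env(x) + 1)                      // fresh copy of x
    (s"${name(x, env2)} = ${rhs(env)}", env2)
  case Havoc(x) =>
    ("true", env.updated(x, env(x) + 1))                       // fresh, unconstrained copy of x
  case Assume(cond) =>
    (cond(env), env)                                           // restrict, change nothing
  case Seq2(c1, c2) =>
    val (f1, env1) = translate(c1, env)
    val (f2, env2) = translate(c2, env1)                       // thread the intermediate state
    (s"($f1) ∧ ($f2)", env2)
  case Choice(c1, c2) =>
    val (f1, env1) = translate(c1, env)
    val (f2, env2) = translate(c2, env)
    // both branches must agree on which copy is final, so introduce one more
    // copy per variable and equate it with each branch's last copy
    val merged = env.map { case (x, _) => x -> (math.max(env1(x), env2(x)) + 1) }
    def sync(f: String, e: Env) =
      (s"($f)" :: env.keys.toList.map(x => s"${name(x, merged)} = ${name(x, e)}")).mkString(" ∧ ")
    (s"(${sync(f1, env1)}) ∨ (${sync(f2, env2)})", merged)
}

// The motivating example from earlier, x = x + 2; y = x + 10:
val example = Seq2(Assign("x", e => s"${name("x", e)} + 2"),
                   Assign("y", e => s"${name("x", e)} + 10"))
translate(example, Map("x" -> 0, "y" -> 0))
// ("(x1 = x0 + 2) ∧ (y1 = x1 + 10)", Map(x -> 1, y -> 1))

With this encoding, if (b) c1 else c2 is just Choice(Seq2(Assume(b), c1), Seq2(Assume(!b), c2)), matching the remark above.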

Full example

Consider the following code:

1: (if (b) { x = x + 1 } else { y = x + 2 });
2: x = x + 5;
3: (if (*) { y = y + 1 } else { x = y })

Notice the line numbers; we’ll use those to name our intermediary variables. The translation becomes:

Here, and denote the variables’ initial state (before execution), and and denote their final state.

Hoare Logic

Hoare logic (named after Sir Tony Hoare) is a logic that was introduced to reason about programs. We can think of it as a way of inserting annotations into code, in order to make proofs about (imperative) program behavior simpler.

As an example, annotations have been added in the program below:

// {0 <= y}
i = y;
// {0 <= y && i = y}
r = 0;
// {0 <= y && i = y && r = 0}
while // {r = (y-i)*x && 0 <= i}
  (i > 0) {
    // {r = (y-i)*x && 0 < i}
    r = r + x;
    // {r = (y-i+1)*x && 0 < i}
    i = i - 1
    // {r = (y-i)*x && 0 <= i}
}
// {r = x * y}

Let’s look at the first three lines:

// {0 <= y}
i = y;
// {0 <= y && i = y}

This constitutes a Hoare triple, which we’ll study in more detail now.

Hoare triples

Central to Hoare logic are Hoare triples.

Definition: Hoare Triple

Let be the set of possible states. Let . Let be a relation over the states . A Hoare triple is defined as:

We call the precondition and the postcondition.

Note that does not mean “singleton set of ”, but is notation for an “assertion” around a command. A Hoare triple may or may not hold; after all it is simply a shorthand for a logical formula, as the definition shows. Let’s look at examples of triples that do and do not hold:

  • {j = a} j = j + 1 {a = j + 1}

    This triple does not hold: if is equal to initially, and we then increment by 1, then .

  • {i != j} if (i > j) then m=i-j else m=j-i {m > 0}

    This triple does hold. If then , and if then it is also positive.
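Just to convince ourselves, the second triple can even be checked by brute force over a small window of values (a throwaway Scala snippet, not part of the course):

// For every (i, j) in a small range satisfying the precondition i != j,
// run the command and check the postcondition m > 0.
val ok = (-5 to 5).forall { i =>
  (-5 to 5).filter(_ != i).forall { j =>
    val m = if (i > j) i - j else j - i
    m > 0
  }
}
// ok == true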

Preconditions and postconditions

Here are a few cues as to how to think of stronger vs. weaker conditions:

  • A stronger condition is a smaller set than the weaker condition.
  • A stronger condition is a subset of a weaker condition.
  • A stronger condition implies a weaker condition.

In other words, putting conditions on a set makes it smaller.

The strongest possible condition is “false”, which is the set . The weakest condition is “true”, which is the biggest set (all tuples).

Definitions

Definition: Strongest postcondition

To visualize this, let’s look at a diagram of the relation :

Strongest postcondition of a relation

Image from lara.epfl.ch

If we let be the red set, then is the blue set: it’s the smallest set containing all the values that are pointed to from by . In other words, it’s the image of ( or ), which we can notice has the same definition.

Definition: Weakest precondition

Weakest precondition of a relation

Image from lara.epfl.ch

Dually, if we let be the blue set, then is the red set: it’s the largest set of values that only point to . Notice that there are points that point to and (since doesn’t need to be injective), but those are not included in . The weakest precondition is the largest possible set from which we are guaranteed to end up in .

Equivalence

Lemma: Three forms of Hoare Triple

The following three conditions are equivalent.

These conditions expand into the following formulas, respectively:

From here on, we can prove equivalence using first-order logic properties:

  • when
  • when

Characterization

The above lemma also gives us a good intuitive grip of what wp and sp are: they’re the best possible values for and , respectively. For the triple to hold, any other must be a subset of wp (i.e. stronger), and any other must be a superset of (i.e. weaker). This leads us to the following characterization lemmas.

Lemma: Characterization of sp

is the smallest set such that , that is, the two following hold:

The first condition immediately follows from the equivalence in the above lemma: it is equivalent to . The second condition ensures that the strongest precondition is the smallest one (as it’s a subset of all ).

Lemma: Characterization of wp

is the largest set such that , that is, the two following hold:

The reasoning is the same as above.

Duality

When defining wp, we could see from the diagram that almost corresponded to going backwards from . It is only almost that, because we must deal with the cases where values outside of point to (which is possible when the relation is not injective). To deal with this, we don’t look at , but at the complement set of “error states” (in brown in the diagram).

Note that and is always defined.

To prove this lemma, we can expand both sides and apply basic first-order logic properties. We first prove it from the left-hand side:

Now on the right-hand side:

As these are equal, we have proven equality.

More laws

First, we’ll state a lemma telling us that sp can be applied to each point of a set, or to the whole set, and we get the same results:

Lemma: Pointwise sp

From this, it should be clear what the sp of unions is:

Lemma: Disjunctivity of sp

The wp can be obtained by selecting each point whose sp is included in . To understand this, remember that the sp is like the image, so we can select all points that only point to , which is the intuitive explanation we had for wp.

Lemma: Pointwise wp

Likewise, this should give us an idea of how to deal with intersections of sets:

Lemma: Conjunctivity of wp

All of these can be proven by expanding to formulas and using basic first-order logic.

Hoare Logic for loop-free programs

By now, we’ve seen how to annotate a single relation, and how to reason about preconditions and postconditions. How do we scale this up to a whole program? Specifically, we’ll need to reason over unions and compositions of relations. To do that, we’ll introduce two theorems telling us how to do that.

Suppose the possible paths of a program are , a set of relations.

Theorem: Expanding paths

The condition:

is equivalent to:

This simply says that a triple is valid over a set of relations when it is valid over all relations in the set: it serves as a generalization when we’re considering multiple possible paths.

To prove this, we can use the definitions to expand into formulas. Alternatively, we can use the previous lemmas:

The first step translates into the third equivalent form, and the second step uses disjunctivity of sp. The third step translates the union to a quantified formula, and the last step translates back to the first equivalent form.

Theorem: Transitivity of Hoare triples

If and then . We can write this as the following inference rule:

We won’t go into too much detail for this theorem. The two theorems above tell us that if we can annotate Hoare triples for individual statements, we can annotate the whole program.

Hoare logic for loops

For general loops, the simplest rule that we can have is the following. It says that if a single iteration of the loop preserves a condition, then the whole loop preserves it.

Lemma: Rule for non-deterministic loops

This is obviously going to be true by transitivity, but let’s prove it formally nonetheless. We can generalize the previous results to programs with loops. A special case of the transitivity theorem of Hoare triples is when and . In that case, we have:

We can keep applying the transitivity rule to achieve a more general form:

By the Expanding Paths condition, we then have:

Note that we have the transitive closure in the denominator, .

Hoare logic for loops with conditions

Also known as while loops. We did not previously define the relation for while loops like while (b) { c }, so let’s do it now.

Let be the set corresponding to the command b, and be the set corresponding to the command !b. Recall the definition of identity relation. Let be the relation of the loop body. The relation of the while loop is:

The relation follows the relation an arbitrary number of times while the condition is true, and then finally the condition is false. This translates almost directly into the following rule:

Lemma: Rule for loop with condition

For a while (b) { c } loop, we have:

Restated with commands and conditions instead of relations and sets:

We trivially have:

So by transitivity:

From the rule for non-deterministic loops, we have:

Applying these rules, we get the lemma.

Properties of program contexts

Properties of relations

Lemma: Monotonicity over composition

Lemma: Monotonicity over union

Lemma: Union distributivity

Properties of expressions

Expressions are functions that take relations, and construct a more complex relation from them. We’ll only consider expressions that use union () and composition (). An example of such an expression would be:

We’ll see two theorems stating that expressions are monotonic, and that they “sort of distribute” over union (meaning sometimes with equality, but always at least with ).

Theorem: Monotonicity of expressions using union and composition

Let be the universe, and let be the set of all possible relations in . Let be any function building one relation from another, where is built from:

  • ,
  • some additional constant expressions ,
  • using and .

Such expressions are monotonic on :

This theorem can be proven by induction on the expression tree defining , using the monotonicity properties of and that we have seen above.

Theorem: Union “distributivity” of expressions in one direction

Let with be a family of relations. Then:

Let . Note that . Since is monotonic, . Since all the are included in , so is their union, so .

Note that we do not have equality and we therefore don’t really have a real union distributivity! To see this, take , and consider the family of relations where and . In this case:

However, under some special assumptions on , we actually have full, normal distributivity with equality:

Theorem: Refined union distributivity of expressions

If satisfies one of the conditions below:

  1. contains at most once
  2. contains any number of times, but is a set of natural numbers and is an increasing sequence
  3. contains any number of times, but is a directed family of relations, meaning that for each there exists a such that (and is possibly uncountably infinite)

Then distributes over union:

The third case is actually a generalization of the second. With the increasing sequence, the can just be the bigger relation between and . In this second case, can be finite or countably infinite, which is also less general than the third case.

All the proofs are done by induction, but we will not go into much detail on them.

More on mapping programs to formulas

We’ve seen how to translate really basic constructs like conditions, while loops and assignment, but will now see some more advanced constructs that many real programming languages.

Mutable local variables

When translating code into formulas, each statement is a relation between variables in scope. What happens if we introduce a variable in local scope? Suppose x and z are global variables, and we have a block introducing a local variable y:

x = x + 1;
{
  var y;
  y = x + 3;
  z = x + y + z
};
x = x + z

The final formula should not depend on y. Therefore, we need to do some kind of local variable translation, which we can do with existential quantification.

Let’s call the whole expression in brackets . Inside , we have variables . Let be the formula for , containing variables in the scope. This is a refinement of the previous definition of , which only used the top-level scope ; here, we can give it a local scope.

The variable initialization in the bracket block can be translated into a formula by:
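A plausible way to write this translation (a sketch in the notation above, where the primed variables denote final values):

```latex
% The local variable y is projected away by existentially quantifying
% both its initial and its final value.
F_{\{\texttt{var y; c}\}}(x, z, x', z') \;=\; \exists y,\, y'.\; F_c(x, y, z,\; x', y', z')
```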

This lets and be unconstrained, and translates the rest of the block with now added into the scope . This lets take any value in . Note that the translation of this is quite similar to what havoc does. In fact, we can express havoc using these semantics:

This equivalence will serve as a handy shorthand: when we initialize multiple variables, we can write it as:

Note that the order of havocs does not matter.

Specifications

A specification (a.k.a. spec) is a formula that expresses some pre- and postconditions that the program should respect. Note that the spec is often more general than the strongest postcondition, but it could very well be the actual sp.

There are three ways in which we can think of a spec; we give a translation of the same example for each:

  • Formula:
  • Relation:
  • Program: havoc(x, y); assume(x > 0 && y > 0)
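For this example, the formula and relation views can be sketched as follows (a reconstruction, assuming the state consists of x and y):

```latex
% Formula view: the initial x and y are unconstrained, the final values are positive.
x' > 0 \;\land\; y' > 0
% Relation view: all pairs of states whose final component satisfies the constraint.
\{\, ((x, y), (x', y')) \mid x' > 0 \land y' > 0 \,\}
```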

A program adheres to the spec if its relation is a subset of the spec’s relation.

Note that the above program representation lets and be unconstrained by the havoc, and constrains and through the assume. We get by default, by not writing anything about .

Program refinement and equivalence

Definition: Refinement

For two programs and , we define refinement if and only if the following is a valid formula:
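A plausible way to spell this out (a sketch, using F for the formula of each program over initial and final state):

```latex
p_1 \sqsubseteq p_2
\;\iff\;
\forall \bar{x}, \bar{x}'.\; \big( F_{p_1}(\bar{x}, \bar{x}') \rightarrow F_{p_2}(\bar{x}, \bar{x}') \big)
```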

We can define the operator in the other direction in the obvious manner:

Note that this is nothing more than a new notation for relation subset, since:

Definition: Equivalence

We define equivalence of two programs and as:

From the above comment, we can see that this is just new notation for relation equality, since:

Theorem: Monotonicity of refinement over sequences and if-statements

This follows directly from our lemmas of monotonicity of relations over composition and monotonicity over union.

Stepwise refinement

Why is this notion of refinement useful? It allows us to start from a very general, possibly non-deterministic program and model one possible execution of it. We can do this by refining the program in steps, until it becomes deterministic, simple, and efficiently executable:

Loops, revisited

Let’s examine the following program , which has variables :

while (x > 0) {
  x = x - y
}

We’re interested in finding the smallest relation between initial state and final state . Let be the number of time the loop executes. We can see that there are two different cases here:

  • When , we never enter the loop because the initial state does not satisfy the loop condition; therefore, all variables keep their values
  • When , we enter the loop, decrement by times, and finally the loop condition no longer holds.

This leads us to the following formulas:

To find a formula that fits both the and cases, we can take the disjunction of the two. For , there will be a single value of at runtime, so we existentially quantify to express that:
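A plausible reconstruction of the combined formula (a sketch; the exact formulation in the course may differ):

```latex
% Either the loop is skipped entirely, or it runs k >= 1 times:
% after k-1 iterations x is still positive, after k iterations it is not.
(x \le 0 \land x' = x \land y' = y)
\;\lor\;
\big( x > 0 \land \exists k \ge 1.\;
      x - (k{-}1)\,y > 0 \;\land\; x' = x - k\,y \;\land\; x' \le 0 \;\land\; y' = y \big)
```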

This existential quantifier is annoying, and we will see later how to apply the one-point rule to eliminate it. (TODO; note that the bar in the slides is notation for “y divides …”.)

Fixpoints

Before we see how to translate recursive functions, we need a bit of theory on fixpoints:

Definition: Fixpoint

is a fixpoint of if

This is analogous to what exists for real functions: the fixpoints of are and .
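For instance (an illustrative example, not necessarily the one used in the lecture), squaring has exactly two fixpoints:

```latex
f(x) = x^2: \qquad x^2 = x \;\iff\; x(x - 1) = 0 \;\iff\; x \in \{0, 1\}
```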

We are often interested in the fixpoint that is smaller than all others, which we call the least fixpoint. In the example on the polynomial, it would be . For relations, we're looking for the fixpoint that is a subset of all other fixpoints.

Recursion

A recursive function will result in a recursive relation; say that we’re translating the following function f into a relation:

def f() = if (x > 0) {
    x = x - 1;
    f();
    y = y + 2
} else { }

We know how to translate most of these constructs, but what postconditions can we assume on the recursive call? We're not done inferring postconditions on the body, so we don't know what the final result should be. The solution is to find a relation for such that .

For now, let’s suppose has this property, and leave it as a variable. Translating the above function f, we get:

Here, the subscript s means “the set corresponding to this formula”. We'll now see how to find this relation :

Theorem: Relation for recursion

Let be the expression translating the body of a recursive function, where is the relation of the recursive call. Let be defined as follows:
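A plausible reconstruction of this definition (a sketch; the course's exact notation may differ):

```latex
% Iterate E starting from the empty relation and take the union of all iterates.
r_0 = \emptyset, \qquad r_{n+1} = E(r_n), \qquad r \;=\; \bigcup_{n \in \mathbb{N}} r_n
```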

where . The relation is the least fixpoint of , and is the relation corresponding to the recursive function.

We’ll prove that is indeed a fixpoint of . Note that, according to the theorem of monotonicity of expressions using union and composition, is monotonic. This fact means that we actually satisfy the second condition of union-distributivity of . To see this, pick as follows:

We indeed have by monotonicity of . Therefore, distributes over union. Having union-distributivity will be crucial in finding a relation such that . Indeed, we can pick as follows:

With union-distributivity, we can show that this choice of satisfies :

The step is the one that needs the union-distributivity property.

Let’s follow this up with a proof that is the least fixpoint. We obtained by solving ; now, suppose is an arbitrary relation such that . Note that does satisfies this condition. We now claim that , i.e. that it is the smallest. If we expand to its definition, we’ll see that we need to prove:

If we can show for every , we’ll have proven this. This can be done by induction: the base case is , which is trivially true. The induction step is by monotonicity of :

Step is by monotonicity of , and step is by the original assumption of .

Note that this second proof is actually stronger than we strictly need it to be: not only did we prove that is the least fixpoint (smaller than any such that ), we proved that is also smaller than any other such that . We’ll see some vocabulary for this later (todo link), with which we can say that is the least postfix point.

This model has the advantage of being simple, but also has limitations; for one, it translates programs that never terminate in the same way as programs that do. Therefore, the correct semantics of the translation is that it only applies to terminating executions. The alternative would be error states for non-termination, but we won’t look into that.
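To make the least-fixpoint construction more concrete, here is a minimal Scala sketch (my own illustration, not code from the course) that computes the fixpoint by iterating an expression E from the empty relation until nothing changes; it terminates whenever only finitely many pairs can ever be added:

```scala
object LeastFixpoint {
  type Rel = Set[(Int, Int)]

  // Relational composition: first r1, then r2 (matching the course's convention).
  def compose(r1: Rel, r2: Rel): Rel =
    for ((a, b) <- r1; (b2, c) <- r2 if b == b2) yield (a, c)

  // Kleene iteration: start from the empty relation and apply E until a fixpoint is reached.
  def lfp(e: Rel => Rel): Rel = {
    var r: Rel = Set.empty
    var next = e(r)
    while (next != r) { r = next; next = e(r) }
    r
  }

  def main(args: Array[String]): Unit = {
    // A toy "step" relation standing in for the body of a recursive call.
    val step: Rel = Set((0, 1), (1, 2), (2, 3))
    // E(r) = step ∪ (step ∘ r): unfolding one level of the recursion.
    val result = lfp(r => step ++ compose(step, r))
    println(result) // the transitive closure of `step`
  }
}
```

Here the expression E(r) = step ∪ (step ∘ r) is monotonic, so the iterates form an increasing sequence, and the fixpoint reached is exactly the union of all of them.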

Specification of recursive functions

In the previous section, we had . Let’s think about what that means, from a more intuitive perspective. It means that when we plug into , we get something that conforms to the spec. This is exactly what we wanted in the first place: finding a translation of a recursive call that also satisfies the spec for the function itself.

Now, suppose that we want to check a user-defined specification :

As for any spec, we need to check that the relation of the program is a subset of the spec, . This is actually equivalent to proving .

Mutual recursion

Suppose we have two mutually recursive procedures and . We can extend the previous approach to work on pairs. We define and . We want to find a pair of relations such that:

In order to do this, we introduce a partial ordering (more on this later, todo link), where means and . This will allow us to express refinements.

Pairs aren’t sets, but we can define set-like operations on them by applying the operation to both members of the pair. We denote these operations using the square variant of the set operation in question. For instance:

The entire theory is preserved as long as we have a partial order with some “good properties” (i.e. a lattice, of which sets ordered by inclusion are a special case), so this extension can be done without too much extra work.

To find the least fixpoint, the semantics are now:

As before, we can prove that if we consider any relations such that and then is the least fixpoint, meaning and .

Quantifier elimination in Presburger arithmetic

Quantifier elimination

Definition: Quantifier elimination

Given a formula , the goal of quantifier elimination (QE) is to find a formula that:

  • is equivalent to
  • has no quantifiers
  • has free variables that are a subset of those of the original formula

Generally, given an assignment , it’s easier to check the truth value of because it has no quantifiers.

The third condition above may prompt an interesting question: what if has no free variables, for instance because they're all bound by quantifiers? The condition still holds, so would have no variables at all; it would simply be a constant expression, whose truth we can immediately verify.

This also means that we can check satisfiability and validity of by:

  • Satisfiability: do QE on and evaluate
  • Validity: do QE on and evaluate

Presburger arithmetic

Let’s talk about how to use what we’ve discussed in order to verify programs. We can do this in three steps:

  1. Compile programs to logical formulas
  2. Express properties in logic or code
  3. Use an automated theorem prover for generated conditions (SAT, SMT, rewrite, interactive provers)

Which logic should we use? We’ll look into integer linear arithmetic, also known as Presburger arithmetic.

This is integer arithmetic with logical operations (), quantifiers (), addition (), and comparison (). This is one of the earliest theories that was shown to be decidable7.

We’ll see that we can extend Presburger arithmetic to admit quantifier elimination (QE): this is not a trivial result, as not all theories admit QE. The reason for which we want to do QE in the first place is that formulas are hard to decide when they have formulas. A Presburger arithmetic formula of size with quantifiers is decidable by a single-tape alternating Turing machine in time . This is very sensitive to the value of .

Another reason to do QE is to remove quantifiers from interpolants, which is useful in generalizing counterexamples to invariants. Note that if a logic has QE, it also has quantifier-free interpolants.

One-point rule

The one-point rule is one of the many steps used in quantifier elimination procedures. Note that it is not enough on its own, because it only handles equalities, and not inequalities .

Theorem: One-point rule

If is a tuple of variables not containing , then:
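A plausible reconstruction of the rule (a sketch; t is a term over the other variables, so the quantified variable does not occur in it):

```latex
\exists x.\; \big( x = t \land F(x, \bar{y}) \big) \;\iff\; F(t, \bar{y})
```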

This rule should make intuitive sense; it’s somewhat akin to inlining (in the direction) or extracting to a variable (in the direction). In the direction, we say that we apply the one-point rule, and in the direction we call it flattening.

We can also state the dual of this rule, which is the negated version of this statement:

Theorem: Dual one-point rule
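A plausible statement of the dual rule (a sketch, under the same assumption on t):

```latex
\forall x.\; \big( x = t \rightarrow F(x, \bar{y}) \big) \;\iff\; F(t, \bar{y})
```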

The proof for this is easy; we negate both sides and see that it reduces to the rule for :

Making Presburger arithmetic admit QE

As is, Presburger arithmetic (PA) does not admit QE. If we lack some operations that can be expressed with quantifiers, there may be no equivalent without quantifiers. For instance, we can state divisibility by 3 in PA as follows:
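One way to write it (a reconstruction, not necessarily the exact formula from the course): divisibility by 3 can be expressed with an existential quantifier over a witness.

```latex
3 \mid x \;\iff\; \exists y.\; x = y + y + y
```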

This indicates to us that we need a divisibility operator (e.g. ), because in order to do QE we must have a quantifier-free equivalent of the above formula. We will therefore extend PA with the following, where we always have .

  • Comparison
  • Subtraction
  • Divisibility
  • Multiplication by a constant factor (this is just shorthand for very long additions)

The resulting language has operators on numbers and variables, and on atomic formulas.

Normalized Presburger arithmetic

To normalize terms in PA, we can rearrange each term to be of the form:
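A plausible shape for this form (a sketch, with integer constants K and variables x):

```latex
K_0 + K_1 x_1 + K_2 x_2 + \dots + K_n x_n
```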

A term in this form is called a linear term. To normalize literals in PA, we do the following translation:

This generates some disjunctions and conjunctions, but we can convert back to DNF again.

Eliminating existential quantifiers

Let’s see how to eliminate from the following, where is quantifier-free:

To do that, we can convert to disjunctive normal form, a disjunction of clauses , each of which is a conjunction of literals.

The elimination is then done by:
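A sketch of the standard step: the existential quantifier distributes over the disjuncts of the DNF, so it suffices to eliminate it from each conjunction of literals separately.

```latex
\exists x.\; (D_1 \lor \dots \lor D_n) \;\iff\; (\exists x.\; D_1) \lor \dots \lor (\exists x.\; D_n)
```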

Eliminating universal quantifiers

Suppose is quantifier-free, and we want to eliminate from the following formula:

This is equivalent to:
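A sketch of the equivalence being used here (the standard duality of quantifiers):

```latex
\forall x.\; F \;\iff\; \lnot\, \exists x.\; \lnot F
```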

We know how to handle this right-hand side: we can eliminate the existential quantifier in to obtain . The result of elimination is therefore:

Exposing the variable to eliminate

Sometimes, we want to do QE of a quantified variable , but it appears multiplied by constants in the formula, or within bounds.

Multiplication

For comparison and “divides” terms, we can rely on the following observations, for :

Let be the coefficients next to ; we choose . Then, to ensure coefficient 1, we multiply all literals by to obtain multiples or (as in the example above, in which ). Then, we let and solve:
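A sketch of the underlying equivalence (my formulation): once every literal has been scaled so that the variable only occurs with the common coefficient m, we can substitute a fresh variable for the scaled occurrence, at the price of a divisibility constraint.

```latex
\exists x.\; G(m \cdot x) \;\iff\; \exists y.\; (m \mid y) \land G(y)
```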

As an example of this, let’s consider the following formula:

In this case, we can take the least common multiple (lcm) of the coefficients of :

We now make all occurrences of have this coefficient (while preserving mathematical equivalence, so we need to update the other coefficients too):

Let denote . Note that this is not arbitrary: it must be divisible by 30, so we add a condition for that at the end. The formula is now:

Bounds

Let’s focus on a single variable . We can reorganize the formula in terms of to fit the following form, so that there are lower bounds on , and upper bounds, and divisibility constraints. We call this form :

If there are no divisibility constraints (), then the formula can be rewritten as:

The min and the max are not something that we can express directly in PA, but we can convert that into the following equivalent form:
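A sketch of these two steps, for the case with no divisibility constraints (lower bounds a_i, upper bounds b_j):

```latex
\exists x.\; \Big( \bigwedge_i a_i \le x \Big) \land \Big( \bigwedge_j x \le b_j \Big)
\;\iff\; \max_i a_i \le \min_j b_j
\;\iff\; \bigwedge_{i,\, j} a_i \le b_j
```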

Alternatively, we can express this as follows, where does not contain (after all, our goal is to eliminate ):

Let’s see a few instances of this, depending on the values of , and :

  • and :

    At least one of the terms in will evaluate to , so this is valid.

  • :

    where . This works because if holds, then holds too.

  • :

Generalization of Presburger arithmetic

Let’s consider a generalization of Presburger logic: first-order formulas with equality and comparison, interpreted over rationals instead of integers. This is called dense linear order without endpoints. The solution for QE is actually simpler than for PA, because there are no divisibility constraints.

  1. These are of course equivalent, but it sometimes helps to be able to distinguish whether an operator is written in the context of a formula or in the context of a mathematical statement. 

  2. The proof of this identity is in exercise session 1, exercise 2.1.1. It is done by decomposing into existential quantifiers. 

  3. Or rather, they’re almost duals, because the future is infinite but the past is finite. 

  4. This is a consequence of the theorem of unicity of the RO-BDD. 

  5. Why do we consider to be monotonic? Well, that will become clear later in the course: for our application of verifying programs, we will work with monotonic relations. 

  6. These rules are not complete, but suffice if we've translated the formula to negation normal form. 

  7. The story goes that Presburger solved this in his MSc thesis with Tarski in 1929, but that Tarski was not impressed. 
