# Sat Solving

To better understand the `SAT`

(i.e., *boolean satisfiability*) problem and `SAT`

solvers, I decided to implement a basic one. I was pleasantly surprised that wikipedia has enough details to implement the so called **DPLL** algorithm quite easily, with even some improvements described in wikipedia. Even better, in the case when there was no solution, the same algorithm gives a proof that there is no solution. The proof that there is no solution was based on *resolution* due to Davis-Putnam — so the algorithm gives as a bonus a proof that resolution is refutation complete for propositional calculus.

The code corresponding to this blog is in the `fol`

project (and directory) of my ProvingGround repository.

## The `SAT`

(Boolean satisfiability) problem

Suppose we are given finitely many variables `$P$`

, `$Q$`

, … which are *boolean*, i.e. represent whether a statement is true or false, and finitely many relations among them. The relations are logical statements built from the variables using logical operators such as `$\neg$`

(*not*), `$\vee$`

(*or*), `$\wedge$`

(*and*), `$\Rightarrow$`

(*implies*) and `$\Leftrightarrow$`

(*equivalent*). The `SAT`

(*boolean satisfiability*) problem asks whether we can assign truth values to these (i.e., declare each of `$P$`

, `$Q$`

, … to be `true`

or `false`

) so that all the relations are satisfied.

In the sequel, we shall use the standard logic terminology. Logical statements are called *Propositions* (these can be either true or false, unlike the standard meaning of the word proposition in mathematics). Statements built from logical variables by logical operators are called *formulas*. All the variables we consider are boolean. Note that a boolean variable is itself a formula.

## CNF

If `$P$`

is a (boolean) variable, the formulas `$P$`

and `$\neg P$`

are called *literals*. A *clause* is a formula of the form `$l_1\vee l_2\vee\dots\vee l_n$`

with each `$l_i$`

a literal. A formula in **CNF** is a formula of the form `$C_1\wedge C_2\wedge\dots\wedge C_m$`

such that each `$C_j$`

is a clause. Observe that `$C_1\wedge C_2\wedge\dots\wedge C_m$`

is satisfied if and only if all the formulas `$C_j$`

are satisfied. Hence we can (and will) view a formula in `$CNF$`

as a collection of clauses, all of which are required to be satisfied.

Any formula can be rewritten as a CNF formula to which it is equivalent. Namely, we first rewrite all instances of `$A\Rightarrow B$`

as `$\neg B\vee A$`

and all instances of `$A\Leftrightarrow B$`

as `$(\neg A\wedge \neg B)\vee (A\wedge B)$`

to eliminate operators other than `$\vee$`

, `$\wedge$`

and `$\neg$`

. Next we use `$\neg(A \vee B) = \neg A \wedge \neg B$`

and `$\neg(A \wedge B) = \neg A \vee \neg B$`

recursively to rewrite the formula as combinations of literals using `$\vee$`

and `$\wedge$`

only. Finally, using the distributivity property `$A \vee (B\wedge C)= (A\vee B)\wedge (A \vee C)$`

recursively we get a formula in CNF.

### Tautologies and Contradictions

Since `$l\vee l = l$`

, we can assume that no literal appears more than once in a clause. Further, if a clause contains both `$P$`

and `$\neg P$`

for some variable `$P$`

, then this clause is always true, i.e., is a *tautology*. We can drop such clauses. We henceforth assume both these simplifications have been made.

On the other hand, an empty clause can never be satisfied (as *some* literal in the clause must be true for it to be satisfied). Thus, an empty clause is a *contradiction*, and we denote it by `$\bot$`

. In particular, the formula in CNF has an empty clause, then it cannot be satisfied. This will rarely happen, but we shall see how to deduce new clauses in such a way as to get an empty clause for unsatisfiable formulas.

### Resolution

Given two clauses that are of the form `$C = P\vee l_1\vee l_2\vee \dots\vee l_n$`

and `$C' = \neg P\vee l'_1\vee l'_2\vee \dots\vee l'_{n'}$`

(after possibly reordering literals), we can deduce the clause `$l_1\vee l_2\vee \dots\vee l_n\vee l'_1\vee l'_2\vee \dots\vee l'_{n'}$`

. Namely, if `$P$`

is false then as `$C$`

is true we must have `$l_1\vee l_2\vee \dots\vee l_n$`

. On the other hand if `$P$`

is true, as `$C'$`

is true we must have `$l'_1\vee l'_2\vee \dots\vee l'_{n'}$`

. Either way, `$l_1\vee l_2\vee \dots\vee l_n\vee l'_1\vee l'_2\vee \dots\vee l'_{n'}$`

holds.

The clause `$l_1\vee l_2\vee \dots\vee l_n\vee l'_1\vee l'_2\vee \dots\vee l'_{n'}$`

is said to be obtained by *resolution* from `$C$`

and `$C'$`

. A theorem of Martin Davis and Hilary Putnam says that resolution is *refutation complete* not just for `SAT`

but in a more general context, namely first-order logic. This means that if a collection of clauses is not satisfiable, then we can deduce the empty clause by (repeated) resolution starting with the given clauses. Note that the final step will be resolving (singleton) clauses of the form `$P$`

and `$\neg P$`

to get the empty clause `$\bot$`

.

The algorithm I implemented gives either a solution to a collection of clauses or a deduction using resolution of the empty clause `$\bot$`

. Thus, we get in particular a proof that resolution is refutation complete in the context of `SAT`

.

## An example: the N-Queens problem

Before we sketch our algorithm, we consider a class of examples – the N-Queens problem. This asks whether we can place `$N$`

queens on an `$N\times N$`

chessboard with no two queens attacking each other.

To formulate this in terms of `SAT`

, we consider a collection of `$N^2$`

boolean variables `QueenAt(i, j)`

for `$0 \leq i, j < N$`

representing whether a queen is present at the corresponding square on the grid (as is common with programs the indices begin at `$0$`

). As is usual, instead of using (the clumsy) equation for their being (at least) `$N$`

queens on the board, we use `$N$`

equations saying that each row has a queen. We also have a bunch of equation for queens not attacking each other horizontally, vertically and along diagonals. These equations are generated (programmatically) as clauses.

### Solution to 8-Queens

The classical 8-Queens problem is solved instantly. A little scripting lets us display the solution in a table (using a unicode character for the queen).

♕ | |||||||

♕ | |||||||

♕ | |||||||

♕ | |||||||

♕ | |||||||

♕ | |||||||

♕ | |||||||

♕ |

### No solution for 3-Queens

On the other hand, the 3-Queens problem has no solutions. Indeed, our program gives a proof of this by deducing a contradiction using resolution, starting with the assumptions. A little scripting lets us write this solution as nested lists, with resolutions written in terms of the final clause being deduced from the given clauses. This gives a tree-like structure, with the root a contradiction and the leaves the given assumptions.

*Contradiction***from**(¬QueenAt(1,0))**and**(QueenAt(1,0)),**using**- (¬QueenAt(1,0))
**from**(¬QueenAt(1,0) ∨ QueenAt(0,0))**and**(¬QueenAt(0,0) ∨ ¬QueenAt(1,0)),**using**- (¬QueenAt(1,0) ∨ QueenAt(0,0))
**from**(¬QueenAt(1,0) ∨ ¬QueenAt(0,2))**and**(¬QueenAt(1,0) ∨ QueenAt(0,0) ∨ QueenAt(0,2)),**using**- (¬QueenAt(1,0) ∨ ¬QueenAt(0,2))
**from**(¬QueenAt(1,0) ∨ QueenAt(2,2))**and**(¬QueenAt(0,2) ∨ ¬QueenAt(2,2)),**using**- (¬QueenAt(1,0) ∨ QueenAt(2,2))
**from**(¬QueenAt(2,1) ∨ ¬QueenAt(1,0))**and**(¬QueenAt(1,0) ∨ QueenAt(2,1) ∨ QueenAt(2,2)),**using**- (¬QueenAt(2,1) ∨ ¬QueenAt(1,0)) by assumption
- (¬QueenAt(1,0) ∨ QueenAt(2,1) ∨ QueenAt(2,2))
**from**(¬QueenAt(2,0) ∨ ¬QueenAt(1,0))**and**(QueenAt(2,1) ∨ QueenAt(2,2) ∨ QueenAt(2,0)),**using**- (¬QueenAt(2,0) ∨ ¬QueenAt(1,0)) by assumption
- (QueenAt(2,1) ∨ QueenAt(2,2) ∨ QueenAt(2,0)) by assumption

- (¬QueenAt(0,2) ∨ ¬QueenAt(2,2)) by assumption

- (¬QueenAt(1,0) ∨ QueenAt(2,2))
- (¬QueenAt(1,0) ∨ QueenAt(0,0) ∨ QueenAt(0,2))
**from**(¬QueenAt(0,1) ∨ ¬QueenAt(1,0))**and**(QueenAt(0,0) ∨ QueenAt(0,2) ∨ QueenAt(0,1)),**using**- (¬QueenAt(0,1) ∨ ¬QueenAt(1,0)) by assumption
- (QueenAt(0,0) ∨ QueenAt(0,2) ∨ QueenAt(0,1)) by assumption

- (¬QueenAt(1,0) ∨ ¬QueenAt(0,2))
- (¬QueenAt(0,0) ∨ ¬QueenAt(1,0)) by assumption

- (¬QueenAt(1,0) ∨ QueenAt(0,0))
- (QueenAt(1,0))
**from**(QueenAt(1,0) ∨ QueenAt(2,1))**and**(¬QueenAt(2,1) ∨ QueenAt(1,0)),**using**- (QueenAt(1,0) ∨ QueenAt(2,1))
**from**(QueenAt(1,2) ∨ QueenAt(1,0) ∨ QueenAt(2,1))**and**(¬QueenAt(1,2) ∨ QueenAt(2,1)),**using**- (QueenAt(1,2) ∨ QueenAt(1,0) ∨ QueenAt(2,1))
**from**(QueenAt(1,2) ∨ QueenAt(1,0) ∨ ¬QueenAt(2,0))**and**(QueenAt(1,2) ∨ QueenAt(1,0) ∨ QueenAt(2,0) ∨ QueenAt(2,1)),**using**- (QueenAt(1,2) ∨ QueenAt(1,0) ∨ ¬QueenAt(2,0))
**from**(QueenAt(1,1) ∨ QueenAt(1,2) ∨ QueenAt(1,0))**and**(¬QueenAt(2,0) ∨ ¬QueenAt(1,1)),**using**- (QueenAt(1,1) ∨ QueenAt(1,2) ∨ QueenAt(1,0)) by assumption
- (¬QueenAt(2,0) ∨ ¬QueenAt(1,1)) by assumption

- (QueenAt(1,2) ∨ QueenAt(1,0) ∨ QueenAt(2,0) ∨
QueenAt(2,1))
**from**(QueenAt(1,2) ∨ QueenAt(1,0) ∨ ¬QueenAt(2,2))**and**(QueenAt(2,0) ∨ QueenAt(2,2) ∨ QueenAt(2,1)),**using**- (QueenAt(1,2) ∨ QueenAt(1,0) ∨ ¬QueenAt(2,2))
**from**(QueenAt(1,1) ∨ QueenAt(1,2) ∨ QueenAt(1,0))**and**(¬QueenAt(2,2) ∨ ¬QueenAt(1,1)),**using**- (QueenAt(1,1) ∨ QueenAt(1,2) ∨ QueenAt(1,0)) by assumption
- (¬QueenAt(2,2) ∨ ¬QueenAt(1,1)) by assumption

- (QueenAt(2,0) ∨ QueenAt(2,2) ∨ QueenAt(2,1)) by assumption

- (QueenAt(1,2) ∨ QueenAt(1,0) ∨ ¬QueenAt(2,2))

- (QueenAt(1,2) ∨ QueenAt(1,0) ∨ ¬QueenAt(2,0))
- (¬QueenAt(1,2) ∨ QueenAt(2,1))
**from**(¬QueenAt(1,2) ∨ QueenAt(2,1) ∨ QueenAt(0,1))**and**(¬QueenAt(0,1) ∨ ¬QueenAt(1,2)),**using**- (¬QueenAt(1,2) ∨ QueenAt(2,1) ∨ QueenAt(0,1))
**from**(¬QueenAt(1,2) ∨ QueenAt(2,1) ∨ ¬QueenAt(0,0))**and**(¬QueenAt(1,2) ∨ QueenAt(0,0) ∨ QueenAt(0,1)),**using**- (¬QueenAt(1,2) ∨ QueenAt(2,1) ∨ ¬QueenAt(0,0))
**from**(¬QueenAt(1,2) ∨ QueenAt(2,0) ∨ QueenAt(2,1))**and**(¬QueenAt(0,0) ∨ ¬QueenAt(2,0)),**using**- (¬QueenAt(1,2) ∨ QueenAt(2,0) ∨
QueenAt(2,1))
**from**(¬QueenAt(2,2) ∨ ¬QueenAt(1,2))**and**(QueenAt(2,0) ∨ QueenAt(2,2) ∨ QueenAt(2,1)),**using**- (¬QueenAt(2,2) ∨ ¬QueenAt(1,2)) by assumption
- (QueenAt(2,0) ∨ QueenAt(2,2) ∨ QueenAt(2,1)) by assumption

- (¬QueenAt(0,0) ∨ ¬QueenAt(2,0)) by assumption

- (¬QueenAt(1,2) ∨ QueenAt(2,0) ∨
QueenAt(2,1))
- (¬QueenAt(1,2) ∨ QueenAt(0,0) ∨ QueenAt(0,1))
**from**(¬QueenAt(0,2) ∨ ¬QueenAt(1,2))**and**(QueenAt(0,0) ∨ QueenAt(0,1) ∨ QueenAt(0,2)),**using**- (¬QueenAt(0,2) ∨ ¬QueenAt(1,2)) by assumption
- (QueenAt(0,0) ∨ QueenAt(0,1) ∨ QueenAt(0,2)) by assumption

- (¬QueenAt(1,2) ∨ QueenAt(2,1) ∨ ¬QueenAt(0,0))
- (¬QueenAt(0,1) ∨ ¬QueenAt(1,2)) by assumption

- (¬QueenAt(1,2) ∨ QueenAt(2,1) ∨ QueenAt(0,1))

- (QueenAt(1,2) ∨ QueenAt(1,0) ∨ QueenAt(2,1))
- (¬QueenAt(2,1) ∨ QueenAt(1,0))
**from**(¬QueenAt(2,1) ∨ QueenAt(1,1) ∨ QueenAt(1,0))**and**(¬QueenAt(1,1) ∨ ¬QueenAt(2,1)),**using**- (¬QueenAt(2,1) ∨ QueenAt(1,1) ∨ QueenAt(1,0))
**from**(¬QueenAt(1,2) ∨ ¬QueenAt(2,1))**and**(QueenAt(1,1) ∨ QueenAt(1,2) ∨ QueenAt(1,0)),**using**- (¬QueenAt(1,2) ∨ ¬QueenAt(2,1)) by assumption
- (QueenAt(1,1) ∨ QueenAt(1,2) ∨ QueenAt(1,0)) by assumption

- (¬QueenAt(1,1) ∨ ¬QueenAt(2,1)) by assumption

- (¬QueenAt(2,1) ∨ QueenAt(1,1) ∨ QueenAt(1,0))

- (QueenAt(1,0) ∨ QueenAt(2,1))

- (¬QueenAt(1,0))

## Solving SAT

I now describe the algorithm I implemented to solve `SAT`

for a collection of clauses, giving either an assignment of truth values that satisfies all the clauses or a contradiction using resolution starting with the given clauses. We shall see this in stages.

A `SAT`

problem is specified by a finite set `$V$`

of Boolean variables and a finite set `$E$`

of clauses in these variables — we refer to this as *the SAT problem $SAT(V, E)$*. We also assume that

`$E$`

does not contain tautologies, and none of the clauses contain a literal more than once. We also regard clauses as equal if they become so after reordering.### Recursive solving: the DP algorithm

Assume that we are given sets `$V$`

and `$E$`

as above. We pick a variable `$P\in V$`

and look for solutions first when `$P$`

is assigned the value true and then when it is assigned false (in my code I actually randomized the order of the branches). For solutions where `$P$`

is true:

- Any clause containing
`$P$`

is true, so can be dropped. - Any clause of the form
`$\neg P\vee l_1\vee\dots\vee l_n$`

(up to reordering) is true if and only the clause`$C =l_1\vee\dots\vee l_n$`

obtained by dropping`$\neg P$`

is true. We can thus replace`$\neg P \vee C := \neg P\vee l_1\vee\dots\vee l_n$`

by`$C$`

when considering solutions with`$P$`

true. - Finally, clauses
`$C$`

containing neither`$P$`

nor`$\neg P$`

are unaffected, and should be satisfied even after assigning`$P$`

.

Thus, we get a new set of clauses, which we denote `$\rho(E, P)$`

, in the variables `$V\setminus\{P\}$`

, giving the *restricted* `SAT`

problem `$SAT(V\setminus\{P\}, \rho(E, P))$`

.
If the restricted problem has a solution then we get a solution to the original problem `$SAT(V, E)$`

by assigning true to `$P$`

. Otherwise we look for a solution where `$P$`

is false, once more obtaining a restricted `SAT`

problem `$SAT(V\setminus\{P\}, \rho(E, \neg P))$`

with fewer variables (with `$\rho(E, \neg P))$`

defined analogous to `$\rho(E, P))$`

). If the latter problem has a solution so does the original one `$SAT(V, E)$`

. If not, we know both the restricted problems have no solutions. Since `$P$`

must be true or false, the original problem `$SAT(V, E)$`

has no solution.

Thus, we can reduce the solution of a `SAT`

problem to the solution of `SAT`

problems with fewer variables. This gives a recursive algorithm once we solve in the base case, which is the case with no variables. But in this case the only clause is the empty clause. Thus there are only two `SAT`

problems.

`$E = \phi$`

. Here every clause is satisfied, so there is a solution.`$E = \{\bot\}$`

, i.e., a singleton consisting of the empty clause. This is not satisfiable as the empty clause cannot be satisfied.

To summarize, we have a recursive algorithm by reducing to a simpler case, namely with fewer variables, in case the set of variables is non-empty, and (easily) solving in the case where the set of variables is empty.

### DPLL algorithm

There are two ways in which the DP algorithm can be improved, giving the DPLL algorithm. Firstly, if some clause is a *unit literal*, i.e., a literal `$l$`

with `$l = P$`

or `$l = \neg P$`

, then `$P$`

must be assigned the value that makes `$l$`

true. On doing this, all clauses containing `$l$`

are true and can be dropped, while a clause of the form `$\neg l\vee C$`

can be replaced by the clause `$C$`

obtained by deleting `$\neg l$`

, i.e., we replace `$E$`

by `$\rho(E, l)$`

. On making such simplifications, new units may be created and this process repeated. For instance, if each clause has length 2 (so called `2-SAT`

), this clearly gives a fast algorithm.

A second improvement is to use *pure literals*, which are literals `$l$`

so that `$\neg l$`

is not present in any clause. Then the `SAT`

problem has a solution if and only if it has a solution with `$l$`

true. Hence we can assign the value of the variable `$P$`

with `$l = P$`

or `$l = \neg P$`

to make `$l$`

true, and drop all the clauses containing `$l$`

.

I have implemented this without further heuristics, except for some use of a *conflict driven* approach to avoid full back-tracking, based on
keeping track of proofs. I next sketch how we keep track of proofs (the algorithmic improvement will be evident).

### Lifting proofs

Suppose the given `SAT`

problem `$SAT(V, E)$`

has no solution. We can enhance our algorithm to in this case give a proof using resolution of a contradiction starting with the given clauses `$E$`

. I sketch this here. The main step is the *lifting* of a proof from a restricted problem. We shall denote the set of clauses deduced by resolution from a set of clauses `$E$`

by `$D(E)$`

. Thus, `$D(E)\supset E$`

, and is the smallest set containing `$E$`

that is closed under resolution.

To start with consider the base case where there is no variable. Here if the `SAT`

problem is not satisfiable, we must have the empty clause in `$E$`

. Thus a given clause is itself a contradiction.

Now consider the general case with `$n$`

variables, and assume that our algorithm gives either a solution or deduces a contradiction using resolution whenever we have fewer than `$n$`

variables. Pick a variable `$P$`

and assign that `$P$`

is true. As above, we get a restricted problem `$SAT(V\setminus\{P\}, \rho(E, P))$`

not involving the variable `$P$`

.

Suppose the original problem `$SAT(V, E)$`

has no solution, then the restricted problem `$SAT(V\setminus\{P\}, \rho(E, P))$`

does not either. By the induction hypothesis, using resolution we can deduce a contradiction starting with `$\rho(E, P)$`

, i.e., we have `$\bot\in D(\rho(E, P))$`

.
We cannot in general conclude from this that `$\bot\in D(E)$`

. Nevertheless we can lift the proof of a contradiction to something useful.

Observe that, by the construction of `$\rho(E, P)$`

, if `$C \in \rho(E, P)$`

then either `$C \in E$`

or `$\neg P\vee C\in E$`

(if both hold we choose the first case in the sequel for efficiency). We claim that an analogous statement holds for clauses deduced by resolution. Namely, if `$C \in D(\rho(E, P))$`

then either `$C \in D(E)$`

or `$\neg P\vee C\in D(E)$`

.

This claim can be proved by induction on the number of steps in the deduction using resolution. For the case with `$0$`

steps, we just get elements of `$\rho(E, P)$`

, for which we have the claim by the above observations as `$E\subset D(E)$`

.

Next, if `$C \in D(\rho(E, P))$`

is deduced in `$n > 0$`

steps, then `$C$`

can be deduced from clauses `$C_1$`

and `$C_2$`

by resolution, and the clauses `$C_i$`

can be deduced from `$\rho(E, p)$`

in fewer than `$n$`

steps.

By induction hypothesis, it follows that, for each `$i =1, 2$`

, either `$C_i \in D(E)$`

or `$\neg P\vee C_i\in D(E)$`

. By definition of resolution, if both `$C_1$`

and `$C_2$`

are in `$D(E)$`

then `$C\in D(E)$`

, and in all other cases `$\neg P\vee C\in D(E)$`

.

As `$\bot \in D(\rho(E, P))$`

, it follows that either `$\bot\in D(E)$`

or `$\neg P = \neg P\vee \bot \in D(E)$`

.
If `$\bot\in D(E)$`

, we have proved unsatifiability. Otherwise we apply the same algorithm to the restricted problem `$SAT(V \setminus \{P\}, \rho(E, \neg P))$`

obtained by assigning the value false to `$P$`

. In this case we deduce either `$\bot\in D(E)$`

or that `$P\in D(E)$`

. Thus, we either have a proof of unsatisfiability or both `$P\in D(E)$`

and `$\neg P\in D(E)$`

hold. But resolution using `$P$`

and `$\neg P$`

gives `$\bot$`

, showing that `$\bot\in D(E)$`

.

The above has been sketched as an existence result, but indeed all steps are effective, algorithmically giving a proof in the case of problems that are not satisfiable. Note that by checking if the contradiction requires the assumption of the truth value of `$P$`

while lifting, we have also avoided full back-tracking in some cases.

## Looking ahead: formal programs and proofs

Using languages implementing *Dependent Type Theory* such as `Lean 4`

or `Idris`

, one can hope for more — a program that outputs one of the following:

- a solution together with a proof that this is a solution, or
- a proof that there is no solution — not just a resolution tree but a proof in the given foundations.

Furthermore, the compiler should be able to verify that the program terminates for any valid input, and has correct output of one of the above forms. It appears that there is no obstruction in principle to doing this, but some work is needed.