Many-Sorted First-Order Logic¶
Note: Executive summary
To formally define formulas such as \( (a \Implies (2x + y < 3)) \land (b \lor \neg(x + y \neq 0)) \land … \), one needs to extend propositional logic be means to express terms such as \( 2x + 3 \) and Boolean-valued atoms such as \( 2x + y < 3 \). This is done by using first-order logic that allows one to define what constants (e.g., \( 2 \)), functions (\( + \)), predicates (\( < \)) are allowed in the formulas and how they are interpreted (e.g. that the domains are integers and \( + \) is interpreted in the usual way).
In this context, a theory means a way to indicate what functions and predicates are allowed and how they are interpreted. For instance, \( (y = 2x - 0.5) \land (y = -x + 1) \) is satisfiable modulo the theory of linear arithmetic over reals but not modulo the theory of linear arithmetic over integers (the lines cross at the point \( x = 0.5 \) and \( y= 0.5 \)).
In the following, we give basic definitions of first-order logic (FOL) needed to understand the SMT formulas, theories, and satisfiability. Like many contemporary research articles, we will use many-sorted first-order logic. It augments the standard first-order logic with a simple type system. The basic definitions we use are based, among others, on
Silvio Ranise, Christophe Ringeissen, and Calogero Zarba: Combining Data Structures with Nonstably Infinite Theories Using Many-Sorted Logic, Proc. FroCoS 2005, pp 48–64.
Dejan Jovanović and Clark Barrett: Sharing Is Caring: Combination of Theories, Proc. FroCoS 2011, pp 195–210
Note
Many slightly different variants of FOL, many-sorted FOL etc appear in the literature. Also other conventions (e.g., use of the terms “variable” and “free constant”) may vary. In the following, I’ve selected a presentation that hopefully suits the purposes of this course best.
Syntax¶
The syntax of the formulas in many-sorted FOL are built on “signatures”, which declare the non-Boolean sorts (“types” in programming language terminology) and the operators used in formulas.
Definition: Signatures
A signature \( \Sig = \Tuple{\Sorts,\Funcs,\Preds} \) consists of countable sets of:
sorts \( \Sorts \). These are only the names of the sorts, they have no domain associated to them yet.
function symbols \( \Funcs \). Each function symbol \( \Func \) is associated with a type \( \SortI1 \times … \times \SortI{n} \to \Sort \), where \( n \ge 0 \) and \( \SortI1,…,\SortI{n},\Sort \in \Sorts \). Nullary function symbols, i.e. those of type \( \to \Sort \), are called constants of sort \( \Sort \).
predicate symbols \( \Preds \). Each predicate symbol \( \Pred \) is associated with a type \( \SortI1 \times … \times \SortI{n} \), where \( n \ge 1 \) and \( \SortI1,…,\SortI{n} \in \Sorts \). The set \( \Preds \) always includes the binary equality predicate \( \EqS{\Sort} \) with the type \( \Sort \times \Sort \) for each sort \( \Sort \).
As the equality symbol \( \EqS{\Sort} \) is always included for each sort, we are actually using many-sorted FOL with built-in equality. We just write \( \Eq \) instead of \( \EqS{\Sort} \) because the sort \( \Sort \) will always be implicit from the context.
Example
A signature for integers with addition and multiplication could be $$\IntsSig = \Tuple{\Set{\IntsSort},\Funcs,\Preds}$$ where \( \Funcs = \Set{0,1,+,-,*} \) such that
\( 0 \) and \( 1 \) are constants (of sort \( \IntsSort \))
\( + \), \( - \) and \( * \) have the type \( \IntsSort \times \IntsSort \to \IntsSort \)
and \( \Preds = \Set{\Eq,<} \) such that
\( < \) and \( \Eq \) are of the type \( \IntsSort \times \IntsSort \).
Variables, Terms, and Atoms¶
Assume a signature \( \Sig \). A set \( \Vars \) of \( \Sig \)-variables, or simply variables, is a countable set of variable names, each associated with a sort in \( \Sig \). Based on variables, we can next build terms. Intuitively, terms are expressions that evaluate to non-Boolean values.
Definition: Terms
The \( \Sig \)-terms over \( \Vars \), or simply terms, are defined inductively:
Each variable \( \Var \in \Vars \) of sort \( \Sort \) is a term of sort \( \Sort \).
If \( t_1,…,t_n \) are terms of sorts \( \SortI1,…,\SortI{n} \), respectively, and \( \Func \) is a function symbol of type \( \SortI1 \times … \times \SortI{n} \to \Sort \), then \( \Func(t_1,…,t_n) \) is a term of sort \( \Sort \).
(There are no other terms.)
We usually write a constant application \( c() \) simply as \( c \), and may use the infix notation \( t_1 \mathop{f} t_2 \) for a binary function symbol \( f \).
Example: continuing the previous one
If \( x \) and \( y \) are variables of sort \( \IntsSort \), then \( x \), \( y \), \( 1 \), \( 3 \), \( x+1 \), and \( 3*(x+1) \) are \( \IntsSig \)-terms of sort \( \IntsSort \). Here \( 3 \) abbreviates the term \( +(1,+(1,1)) \), \( x+1 \) the term \( +(x,1) \), and \( 3*(x+1) \) the term \( * (+(1,+(1, 1)),+(x,1)) \).
Based on terms, we next define “atoms”. Intuitively, atoms are obtained by applying predicates to terms and they are the simplest expressions that evaluate to Boolean values.
Definition: Atoms
Whenever
\( t_1,…,t_n \) are \( \Sig \)-terms over \( \Vars \) of sorts \( \SortI1,…,\SortI{n} \), resp., and
\( \Pred \) is a predicate symbol with the type \( \SortI1 \times … \times \SortI{n} \),
then \( \Pred(t_1,…,t_n) \) is a \( \Sig \)-atom over \( \Vars \) (or simply an atom).
Again, we use the infix notation whenever convenient and, e.g., write \( x \Eq y \) instead of \( {\Eq}(x,y) \).
Example: continuing the previous ones
In our integer signature, \( x < y \) and \( x+1 \Eq y \) are atoms.
Formulas¶
In addition to sorted variables, we assume a countable set of propositional variables \( \BVars = \Set{a,b,c,a_1,b_1,….} \). In principle, propositional variables could be simulated with sorted variables or with nullary predicates but we include them for some notational convenience reasons.
Finally, we can define “formulas” that are built over propositional variables and atoms by using the usual logical connectives as well as quantifiers.
Definition: Formulas
\( \Sig \)-formulas over \( \Vars \) and \( \BVars \), or simply formulas, are defined inductively by:
Each propositional variable in \( \BVars \) is a formula.
Each \( \Sig \)-atom over \( \Vars \) is a formula.
If \( \LSF \) and \( \RSF \) are formulas, then so are \( \neg\LSF \), \( (\LSF \lor \RSF) \), and \( (\LSF \land \RSF) \).
If \( \Var \in \Vars \) is a variable of sort \( \Sort \) and \( \LSF \) is a formula, then so are \( (\E{\Var}{\Sort}{\LSF}) \) and \( (\A{\Var}{\Sort}{\LSF}) \).
In the above definition,
\( \exists \) is the existential quantifier and “\( \E{\Var}{\Sort}{\LSF} \)” is read “there exists a value for \( \Var \) such that \( \LSF \) holds”, and
\( \forall \) is the universal quantifier and “\( \A{\Var}{\Sort}{\LSF} \)” is read “for all values of \( \Var \), \( \LSF \) holds”.
A formula is said to be quantifier-free if it does not contain any quantifiers.
Parentheses can be omitted whenever there is no risk of ambiguity. As usual, we can use standard abbreviations as long as there is a common understanding of their meaning. For instance:
\( {t_1 \Neq t_2} \AbbrvEq \neg(t_1 \Eq t_2) \)
\( \LSF \Implies \RSF \AbbrvEq (\neg\LSF \lor \RSF) \)
\( \LSF \Iff \RSF \AbbrvEq (\LSF \Implies \RSF) \land (\RSF \Implies \LSF) \)
\( \True \AbbrvEq ({\neg\BVar} \lor\BVar) \) for some propositional variable \( \BVar \)
\( \False \AbbrvEq \neg\True \)
Example: continuing the previous ones
One example of an \( \IntsSig \)-formula is $$\A{x,y}{\IntsSort}((x < y) \Implies (x+1 \le y))$$ where
“\( \A{x,y}{\IntsSort}{\alpha} \)” abbreviates “\( \A{x}{\IntsSort}{(\A{y}{\IntsSort}{\alpha})} \)” and
“\( (x+1 \le y) \)” abbreviates “\( ({x+1 < y} \lor {x+1 \Eq y}) \)”.
For a variable \( \Var \), a term \( \Var \) in a formula \( \SF \) is called an occurrence of \( \Var \) in \( \SF \). An occurrence is bound if it is part of a formula \( \SF \) in \( \A{\Var}{\Sort}{\SF} \) or \( \E{\Var}{\Sort}{\SF} \), otherwise it is free. If \( \Var \) has a free occurrence in \( \SF \), it is called a free variable in \( \SF \).
Example
In the formula \[(\A{x}{\IntsSort}{x < y}) \lor {x > 0}\] the first occurrence of \( x \) is bound; the second one and that of \( y \) are free.
A formula is defined to be a sentence if it does not have any free variables.
Example
The formula \( \A{x}{\IntsSort}{({x < y} \lor {x > 0})} \) is not a sentence as \( y \) is free but \( \A{x}{\IntsSort}{\E{y}{\IntsSort}{({x < y} \lor {x > 0})}} \) is a sentence.
Semantics¶
Consider a signature \( \Sig = \Tuple{\Sorts,\Funcs,\Preds} \). It only describes the names of the types and operations that are used to build terms, atoms and formulas. However, it does not describe what elements each sort actually contains and how the functions and predicates are evaluated on them. Adding this information is the task of \( \Sig \)-structures.
Definition: Structures
A \( \Sig \)-structure \( \Interp \) assigns
a non-empty domain set \( \DomOf{\Sort} \) to each sort \( \Sort \in \Sorts \)
a function \( \InterpOf{\Func}: \DomOf{\SortI1} \times … \times \DomOf{\SortI{n}} \to \DomOf{\Sort} \) for each function symbol \( \Func \in \Funcs \) of type \( \SortI1 \times … \times \SortI{n} \to \Sort \)
a function \( \InterpOf{\Pred}: \DomOf{\SortI1} \times … \times \DomOf{\SortI{n}} \to \Set{\False,\True} \) for each predicate symbol \( \Pred \in \Preds \) of type \( \SortI1 \times … \times \SortI{n} \)
It is required that for all sorts \( \Sort \) and all \( v,w \in \DomOf{\Sort} \), \( {{\InterpOf{\EqS{\Sort}}}(v,w)=\True} \text{ iff } {v = w} \). That is, equality is always interpreted as element equality. Note that each constant \( c \) of sort \( \Sort \) is mapped to an element \( \InterpOf{c} \in \DomOf{\Sort} \).
Example
For our “integer signature” \( \IntsSig=\Tuple{\Set{\IntsSort},\Set{0,1,+,-,*},\Set{<,\Eq}} \), we can have many \( \IntsSig \)-structures. For instance:
The “natural” structure \( \IInts \).
\( \DomOf{\IntsSort} = \Ints \), and
\( \IIntsOf{0} = 0 \), \( \IIntsOf{1} = 1 \), and
\( \IIntsOf{+} \), \( \IIntsOf{-} \), \( \IIntsOf{*} \), \( \IIntsOf{<} \) and \( \IIntsOf{\Eq} \) correspond to the standard arithmetic operations over integers, e.g., \( 2 \IIntsOf{+} 6 = 8 \)
The “modulo 7” structure \( \IMods \).
\( \DomOf{\IntsSort} = \Set{0,1,…,6} \), and
\( \IModsOf{0} = 0 \), \( \IModsOf{1} = 1 \), and
\( \IModsOf{+} \), \( \IModsOf{-} \), \( \IModsOf{*} \), \( \IModsOf{<} \) and \( \IModsOf{\Eq} \) correspond to the standard arithmetic operations over integers modulo 7, e.g., \( 2 \IModsOf{+} 6 = 1 \).
In order to evaluate terms, atoms and formulas, we extend structures to variables and propositional variables as well. Consider a signature \( \Sig = \Tuple{\Sorts,\Funcs,\Preds} \) as well as sets of \( \Sig \)-variables \( \Vars \) and propositional variables \( \BVars \).
Definition: Interpretations
A \( \Sig \)-interpretation \( \Interp \) over \( \Vars \) and \( \BVars \), or simply an interpretation, is a \( \Sig \)-structure that additionally assigns
a Boolean value \( \InterpOf{\BVar} \in \Set{\False,\True} \) to each propositional variable \( \BVar \in \BVars \), and
a value \( \InterpOf{\Var} \in \DomOf{\Sort} \) to each variable \( \Var \in \Vars \) of sort \( \Sort \).
Thus, a \( \Sig \)-structure is a \( \Sig \)-interpretation when the sets of variables and propositional variables are empty. Arbitrary non-variable terms are evaluated in an interpretation \( \Interp \) inductively: \[\InterpOf{f(t_1,…,t_n)} = \InterpOf{f}(\InterpOf{t_1},…\InterpOf{t_n}).\]
Example
Under an interpretation \( \Interp \) that extends \( \IInts \) of the previous example by \( \Set{x \mapsto 3, y \mapsto 7} \), \begin{eqnarray*} \InterpOf{((x+2)*y)} &=&\\\\ \InterpOf{(x+2)} \InterpOf{*} \InterpOf{y} &=&\\\\ (\InterpOf{x} \InterpOf{+} \InterpOf{2}) \InterpOf{*} 7 &=&\\\\ (3 \InterpOf{+} 2) \InterpOf{*} 7 &=&\\\\ 5 \InterpOf{*} 7 &=& 35 \end{eqnarray*}
If \( \Var \in \Vars \) is a variable of sort \( \Sort \) and \( \Val \in \DomOf{\Sort} \), then \( \InterpSubst{\Var}{\Val} \) is the interpretation which is the same as \( \Interp \) except that it evaluates \( \Var \) to \( \Val \), i.e. \( \InterpIOf{\InterpSubst{\Var}{\Val}}{\Var} = \Val \) and \( \InterpIOf{\InterpSubst{\BVar}{\Val}}{y} = \InterpOf{y} \) whenever \( y \neq x \).
The following inductive rules define when an interpretation \( \Interp \) evaluates a formula \( \SF \) to true, denoted by \( \Interp \Models \SF \):
\( \Interp \Models \BVar \) for a propositional variable \( \BVar \in \BVars \) iff \( \InterpOf{\BVar} = \True \)
\( \Interp \Models \Pred(t_1,…,t_n) \) iff \( \InterpOf{p}(\InterpOf{t_1},…\InterpOf{t_n}) = \True \)
\( \Interp \Models {\neg\LSF} \) iff not \( \Interp \Models \LSF \)
\( \Interp \Models (\LSF \land \RSF) \) iff \( \Interp \Models \LSF \) and \( \Interp \Models \RSF \)
\( \Interp \Models (\LSF \lor \RSF) \) iff \( \Interp \Models \LSF \) or \( \Interp \Models \RSF \)
\( \Interp \Models (\A{\Var}{\Sort}{\LSF}) \) iff \( \InterpSubst{\Var}{\Val} \Models \LSF \) for all \( \Val \) in the domain of \( \Var \)
\( \Interp \Models (\E{\Var}{\Sort}{\LSF}) \) iff \( \InterpSubst{\Var}{\Val} \Models \LSF \) for at least one \( \Val \) in the domain of \( \Var \)
Example: Continuing the previous ones
\( \IInts \Models \A{x}{\IntsSort}{({x > 0} \Implies {x < 2*x})} \)
\( \IInts \NotModels \A{x}{\IntsSort}{\E{y}{\IntsSort}{x \Eq 2*y}} \)
\( {\IMods\cup\Set{x \mapsto 5, y \mapsto 3}} \Models {x > y} \land {x+2 < y} \)
Theories¶
We are not usually interested how formulas evaluate in arbitrary structures but want the functions and predicates to have some specific meaning.
Definition: Theories
A \( \Sig \)-theory \( \Th \) is a non-empty set of \( \Sig \)-structures.
One can define the class by
describing it explicitly, or
axiomatically by giving a countable set of \( \Sig \)-sentences. The \( \Sig \)-structures in which all the axioms hold are the ones in the theory. Theories defined in this way are also called first-order theories.
Both methods can be found in the literature.
Definition: Theory of nonlinear integer arithmetic
Let the theory of nonlinear integer arithmetic \( \TNIA \) be the \( \IntsSig \)-theory consisting only of the \( \IntsSig \)-structure \( \IInts \) defined in our earlier example.
Let’s have some additional definitions:
A \( \Th \)-interpretation is a \( \Sig \)-interpretation \( \Interp \) that extends some structure in the theory \( \Th \).
A formula \( \SF \) is \( \Th \)-satisfiable if \( \Interp \Models \SF \) for some \( \Th \)-interpretation \( \Interp \).
A formula \( \SF \) is \( \Th \)-valid, denoted by \( \ThModels \SF \), if \( \Interp \Models \SF \) for all \( \Th \)-interpretations \( \Interp \).
Example: continuing…
The quantifier-free formula \( {x < 2*x} \) is
\( \TNIA \)-satisfiable as \( \IInts \cup \Set{x \mapsto 1} \Models {x < 2*x} \) but
not \( \TNIA \)-valid as \( \IInts \cup \Set{x \mapsto 0} \NotModels {x < 2*x} \)
The quantifier-free formula \( {x > 0} \Implies {x < 2*x} \) is \( \TNIA \)-valid
When solving practical problems, we are usually interested in the \( \Th \)-satisfiability problem
Problem: \( \Th \)-satisfiability
Given a formula, is it \( \Th \)-satisfiable?
and especially its restricted version of quantifier-free \( \Th \)-satisfiability problem:
Problem: quantifier-free \( \Th \)-satisfiability
Given a quantifier-free formula, is it \( \Th \)-satisfiable?
Naturally, the decidability and complexity of these problems depends on the theory \( \Th \) that we are considering. For instance, quantifier-free \( \TNIA \)-satisfiability is undecidable but we’ll see many decidable theories soon.
Definition: Models and consequences
A \( \Th \)-interpretation \( \Interp \) such that \( \Interp \Models \phi \) is called a \( \Th \)-model of \( \phi \).
A formula \( \phi \) is a \( \Th \)-consequence of a formula \( \psi \), denoted by $$\psi \ThModels \phi$$ if each \( \Th \)-model of \( \psi \) is also a \( \Th \)-model of \( \phi \) (that is, if \( \Interp \Models \psi \) implies \( \Interp \Models \phi \) for all \( \Th \)-interpretations \( \Interp \)).
Example
\( {x > 0} \NIAModels {x < 2*x} \)
\( {x*x+y*y \le 1} \NIAModels {x+y < 2} \)