\documentclass[11pt]{article}
%%%%% written in LaTeX %%%
\addtolength{\oddsidemargin}{-1.25cm}
\addtolength{\textwidth}{3cm}
\addtolength{\textheight}{4cm}
\addtolength{\topmargin}{-2cm}
\newcommand{\een}{\end{enumerate}}
\newcommand{\noi}{\noindent}
\newcommand{\ra}{\rightarrow}
\newcommand{\sig}{\sigma}
\newcommand{\eps}{\epsilon}
\newcommand{\del}{\delta}
\newcommand{\Del}{\Delta}
\newcommand{\TREE}{\mbox{TREE}}
\newcommand{\MO}{{\tt MOSER}}
\newcommand{\FX}{{\tt FIXIT!}}
\newcommand{\Sb}{{\tt PURE}\,}
\newcommand{\Sa}{{\tt ACTUAL}\,}
\newcommand{\lam}{\lambda}
\newcommand{\ah}{\alpha}
\newcommand{\gam}{\gamma}
\newcommand{\ben}{\begin{enumerate}}
\newcommand{\beq}{\begin{equation}}
\newcommand{\eeq}{\end{equation}}
\newcommand{\bec}{\begin{center}}
\newcommand{\ece}{\end{center}}
\newcommand{\hsa}{\hspace*{1cm}}
\newcommand{\hsb}{\hspace*{2cm}}
\newtheorem{conjecture}{Conjecture}
\newtheorem{definition}{Definition}
\newtheorem{define}{Definition}
\newtheorem{theorem}{Theorem}[section]
\newtheorem{claim}{Claim}[theorem]
\newtheorem{prop}[theorem]{Proposition}
\newtheorem{remark}[theorem]{Remark}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{guess}[theorem]{Conjecture}
\begin{document}
\bec {\bf Robin Moser makes Lov\'asz Local Lemma Algorithmic!}
\\ Notes of Joel Spencer \ece
\section{Preliminaries}
The idea in these notes is to explain a new approach of Robin
Moser \footnote{Moser is a graduate student (!) at ETH, working
with Emo Welzl} to give an algorithm for the Lov\'asz Local Lemma.
This description is of the approach as modified and improved by
G\'abor Tardos.
We don't strive for best possible or most general here.
In particular, we stick to what is called the symmetric case.
Let's start with a particular and instructive example. Let $x_i$,
$1\leq i \leq n$ be Boolean variables. Let $C_j$, $1\leq j \leq m$
be clauses, each the disjunction of $k$ variables or their negations.
For example, with $k=3$, $x_8\vee\overline{x_{19}}\vee x_{37}$ would
be a typical clause. We say two clauses overlap, and write $C_i\sim C_j$,
if they share a variable, regardless of whether the variable
is negated or not in either clause. A set of clauses is called mutually
satisfiable if there exists a truth assignment of the underlying variables
so that each clause is satisfied or, equivalently, if the $\wedge$ of the
clauses is satisfiable.
\begin{theorem}\label{one} Assume, using the above notation, that each
clause overlaps at most $d$ clauses (including itself). Assume
\beq 2^{-k}\frac{d^d}{(d-1)^{d-1}} \leq 1 \eeq
Then the set of clauses is mutually satisfiable.
Moreover (and this is the new part) there is a randomized algorithm,
running in expected time linear in $n$ with $k,d$ fixed, that finds
an assignment under which every clause $C_j$ is satisfied.
\end{theorem}
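To get a numerical feel for the hypothesis of Theorem \ref{one}, here is a small check (a sketch in Python; the function name is ours):

```python
def max_overlap_degree(k):
    """Largest d >= 2 with 2^{-k} * d^d / (d-1)^{d-1} <= 1,
    i.e. the biggest overlap degree the hypothesis tolerates for
    clause size k.  Assumes k >= 2, so that d = 2 itself qualifies."""
    d = 2
    # advance d while d+1 still satisfies (d+1)^{d+1}/d^d <= 2^k
    while (d + 1) ** (d + 1) / d ** d <= 2 ** k:
        d += 1
    return d

# For k = 3 the condition holds up to d = 3: 27/4 = 6.75 <= 8,
# while 256/27 > 8 rules out d = 4.
```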
Here is a more general setting. Let $\Omega$ be a set of size $n$, and
let $X_v$, $v\in\Omega$, be independent random variables. For $1\leq j\leq m$
let $e^j\subseteq \Omega$ and let $B_j$ be an event that depends only
on the values $X_v$, $v\in e^j$. We say two events overlap, and write
$B_i\sim B_j$, if $e^i \cap e^j \neq \emptyset$.
\begin{theorem}\label{two} Assume, using the above notation, that each
event overlaps at most $d$ events (including itself). Assume
\beq \Pr[B_j] \leq p \mbox{ for all } j \eeq
and that
\beq p\frac{d^d}{(d-1)^{d-1}} \leq 1 \eeq
Then
\beq \wedge_{j=1}^m \overline{B_j} \neq \emptyset \eeq
Moreover (and this is the new part) there is a randomized algorithm,
running in expected time linear in $n$ with $d$ and the sizes $|e^j|$ fixed,
that finds values of the $X_v$
for which every $\overline{B_j}$ holds.
\end{theorem}
To see that Theorem \ref{two} implies Theorem \ref{one},
consider a random assignment $X_v$ of the variables $x_v$. That is,
each $X_v$ independently takes on the values true and false, each with
probability one half. The ``bad'' event $B_j$ is that the clause $C_j$ is not
satisfied, which has probability $2^{-k}$. The event that none of the
bad events occur is nonempty. By Erd\H{o}s Magic, there is a point
in the probability space, which is precisely an assignment of truth
values, such that no bad event occurs, which is precisely that the
clauses are all simultaneously satisfied.
We will write the proof in the more general form, but the example
of Theorem \ref{one} is a good one to keep in mind.
The running time of the algorithm will actually depend on some
data structure assumptions, which we omit.
{\tt The Moser-Tardos (MT) Algorithm}.
\ben
\item Give $X_v$ random
values from their distributions.
\item WHILE some $B_j$ holds
\item PICK some $B_j$ that holds
\item Reset the $X_v$, $v\in e^j$, independently from their distributions
\item END WHILE
\een
The selection mechanism for PICK can be arbitrary. For definiteness,
we may pick the minimal $j$ for which $B_j$ holds, but it doesn't affect
the proof. We just need some specified mechanism for PICK.
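Specialized to the clause setting of Theorem \ref{one}, the loop above can be sketched as follows (Python; the clause encoding and the function name are our own, and PICK is taken to be the minimal index, as suggested above):

```python
import random

def moser_tardos_sat(n, clauses, rng=None):
    """Moser-Tardos resampling on a k-SAT instance.

    A clause is a list of nonzero ints: literal v (resp. -v) is satisfied
    when x_v is True (resp. False).  Returns a satisfying assignment and
    the LOG of resampled variable sets e_t.
    """
    rng = rng or random.Random(0)
    x = {v: rng.random() < 0.5 for v in range(1, n + 1)}  # random initial values

    def violated(c):
        # a clause is violated iff every one of its literals is false
        return all(x[abs(l)] != (l > 0) for l in c)

    log = []
    while True:
        bad = [c for c in clauses if violated(c)]   # WHILE some B_j holds
        if not bad:
            return x, log
        c = bad[0]                                  # PICK: minimal index j
        log.append({abs(l) for l in c})             # record e_t in the LOG
        for l in c:                                 # reset X_v for v in e_t
            x[abs(l)] = rng.random() < 0.5
```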
As this is a randomized algorithm, its run will be considered
a random variable. Let $B_t,e_t$ be the event and underlying set chosen in the
$t$-th iteration of the WHILE loop. We shall refer to this as {\em time}
$t$ in the running of MT.
We define the LOG of the running of the algorithm to
be the sequence $e_1,\ldots,e_t,\ldots$.
A priori, there is no reason to believe that this algorithm will actually
terminate, and so the LOG might be an infinite sequence. On the other
extreme, the initial random assignment might work in which case LOG
would be the null sequence.
For convenience we let $H=\{e^1,\ldots,e^m\}$ so that the
$e\in H$ are just the possible values of the $e_t$.
For $e\in H$ let $COUNT[e]$ denote the number of times $e$ appears
in LOG, that is, the number of times $t$ for which $e=e_t$. A priori
this could be infinite. But our main result is:
\begin{theorem}\label{three}
\beq E[COUNT[e]] \leq \frac{1}{d-1} \eeq
\end{theorem}
Given this result, linearity of expectation gives
\begin{theorem}\label{four} The expected length of LOG is
at most $\frac{m}{d-1}$ where $m$ is the number of events.
\end{theorem}
As any two events containing a given $v\in\Omega$ overlap, and each
event overlaps at most $d$ events, each $v$ lies in at most $d$
events, and so (each $e^j$ being nonempty) $m\leq nd$. Theorem
\ref{four} then gives that the expected length of LOG is
linear in the size of $\Omega$. This is why we call
the MT algorithm linear time, though in particular
instances one would need further assumptions about the data
structure.
The remainder of the argument is a proof of Theorem \ref{three}.
Given a running of MT with the LOG of size at least $t$ we
define $TREE[t]$ to be a rooted tree with vertices labelled
by the $e\in H$. (Note: Several vertices may have the same
label.) The root of $TREE[t]$ is $e_t$. Now we construct
the tree by reverse induction from $i=t-1$ to $i=1$. (When
$t=1$ the tree has only the root $e_1$.) For a given $i$
we check whether there is a $j$, $i<j\leq t$, such that $e_i\sim e_j$
and $e_j$ has already been placed in the tree. If so, we add a node
labelled $e_i$ as a child of such an $e_j$ of maximal depth (breaking
ties arbitrarily); if not, $e_i$ does not appear in $TREE[t]$.
For a finite rooted tree $T$ with labels in $H$, let $OCCUR[T]$
denote the event that $T=TREE[t]$ for some time $t$ in the running of MT.
\begin{theorem}\label{five}
\beq \Pr[OCCUR[T]] \leq p^{|T|} \eeq
\end{theorem}
Consider first the simplest nontrivial case: $T$ consists of a root
$e$ and a single child $f$ with $e\sim f$, so that $OCCUR[T]$ requires
times $s<t$ with $e_s=f$ and $e_t=e$.
The subtlety arises with those $i\in\Omega$ lying in both $e$ and $f$.
For these we must distinguish the original value of $X_i$ and the
revised value after time $s$. It is necessary that $B_s$ hold with
the original values of $X_i$ and this occurs with probability at most
$p$. It is further necessary that $B_t$ holds using the revised values
$X_i$ for $i\in e\cap f$ and the original values for the other $i\in e$.
This also occurs with probability at most $p$ and, more importantly, the
probability that both $B_s$ and $B_t$ hold is at most $p^2$. This is
because in looking at $B_t$ we are looking at different ``coin flips'' for
the values $X_i$, $i\in e\cap f$, than those used for $B_s$.
We argue the general case of Theorem \ref{five} by preprocessing the
randomness. That is, for each $v\in \Omega$ we independently generate
a countable sequence of evaluations of $X_v$, labelled $x_v^0,x_v^1,\ldots$.
Following this preprocessing, MT is deterministic: when it needs to
reset $X_v$ for the $i$-th time, it takes the value $x_v^i$, with $x_v^0$
the original evaluation. We call $i$ above the evaluation number.
Preprocessing is a powerful tool in
analyzing randomized algorithms, though it has no effect on the actual
running of the algorithm.
While $TREE[t]$ does not determine LOG, or even the precise order
of the appearances of the nodes of $TREE[t]$ in the LOG, it does
determine the order of appearance of the nodes $f_0,\ldots,f_s$
containing any given vertex $v$: any two such nodes overlap, so they
lie at distinct depths of the tree, with the deeper node appearing
the earlier in the LOG. When the MT algorithm reached
$f_i$, $v$ had value $x_v^i$. That is, for each $e\in TREE[t]$
and each vertex $v\in e$ there is an evaluation number
$i$ so that when MT reaches
$e$ the vertex $v$ had value $x_v^i$. Critically, this $i$ is determined
by the $TREE$ (even though many values of LOG could yield the same
$TREE$) and the particular node $e$ and vertex $v$.
For $OCCUR[T]$ it is {\em necessary} that
\bec For each $e\in TREE$ the associated bad event
$B$ occurs where for each $v\in e$, $v$ is using the $i$-th
evaluation with $i$ the evaluation number as determined above.
\ece
For each $e\in TREE$ this occurs with probability at most
$p$, as the various evaluations are independent. Moreover,
critically, these events $B$ are mutually independent,
as for each $v$ a different evaluation number is used at each
node containing $v$. Thus the probability that all the $B$ hold at their
various ``times'' is at most $p^{|T|}$, completing the proof of Theorem \ref{five}.
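For concreteness, the construction of $TREE[t]$ from a given LOG can be carried out as follows (a Python sketch; we represent each $e_i$ as a set of elements of $\Omega$, and the function name is ours):

```python
def build_tree(log, t):
    """Construct TREE[t] from LOG = [e_1, ..., e_T] (a list of sets).

    The root is e_t.  Going from i = t-1 down to 1, e_i is attached as a
    child of the deepest already-placed overlapping node, if any.
    Returns a list of (time, set, depth, parent position) tuples.
    """
    nodes = [(t, log[t - 1], 0, None)]        # the root e_t, at depth 0
    for i in range(t - 1, 0, -1):             # reverse induction on i
        e_i = log[i - 1]
        hits = [k for k, (_, s, _, _) in enumerate(nodes) if e_i & s]
        if hits:                              # e_i overlaps a placed node
            parent = max(hits, key=lambda k: nodes[k][2])  # deepest one
            nodes.append((i, e_i, nodes[parent][2] + 1, parent))
    return nodes
```

Note that every earlier time whose set meets a placed node is itself placed, so for a fixed vertex $v$ the nodes containing $v$ end up at distinct depths, earlier times deeper.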
Now we turn to proving Theorem \ref{three} by bounding
$E[COUNT[e]]$. The trees $TREE[t]$ over the times $t$ with $e_t=e$ are
distinct: every time $s\leq t$ with $e_s=e$ appears as a node of $TREE[t]$,
so later such $t$ yield trees with strictly more nodes labelled $e$.
Hence no $T$ can be counted more than once in $COUNT[e]$, and so
\beq\label{count1}
E[COUNT[e]] = \sum_T \Pr[OCCUR[T]] \leq
\sum_T p^{|T|}
\eeq
where the sum is over all finite trees $T$ with root labelled $e$, and
the inequality is from Theorem \ref{five}.
We bound the right-hand side of (\ref{count1}) using standard methods from
algebraic combinatorics. Let \Sb denote the infinite rooted
tree in which every node has precisely $d$ children. Let
\Sa be the infinite tree rooted at $e$ in which $f$ has
as children all $g\in H$ which overlap it. As each event overlaps
at most $d$ events, each node in \Sa has at most $d$ children, and
so \Sa embeds in \Sb. Hence the sum of $p^{|T|}$ over subtrees $T$ of
\Sa is at most the corresponding sum over subtrees of \Sb. Here we have an
exact result.
\begin{theorem}\label{six} Assume
\beq
p\leq \frac{(d-1)^{d-1}}{d^d}
\eeq
Then, taking the sum over all subtrees of \Sb rooted at the root of \Sb,
\beq \label{e3}
y=\sum_T p^{|T|}
\eeq
is finite and is the unique $y$ with $0\leq y \leq \frac{1}{d-1}$ satisfying
\beq \label{e1}
y = p(1+y)^d
\eeq
\end{theorem}
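Before the proof, a quick sanity check on the boundary case: when $p=\frac{(d-1)^{d-1}}{d^d}$, the value $y=\frac{1}{d-1}$ satisfies (\ref{e1}) with equality, since
\beq
p(1+y)^d = \frac{(d-1)^{d-1}}{d^d}\left(\frac{d}{d-1}\right)^d = \frac{1}{d-1} = y
\eeq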
\noi {\tt Proof:} Assuming the sum is finite, we find (\ref{e1}) by considering
the terms in which the root of $T$ has exactly $i$ children. Each of the $i$
subtrees below these children contributes a sum of $y$, and the sums multiply,
so those terms total $p{d\choose i}y^i$. Thus
\beq\label{e2}
y = p\sum_{i=0}^d {d\choose i}y^i =
p(1+y)^d
\eeq
To make this precise, let $y_s$ be the sum (\ref{e3}) over all $T$ of depth at most $s$.
Then $y_0=p$ and the argument for (\ref{e2}) gives
\beq\label{e4}
y_{s+1} = p\sum_{i=0}^d {d\choose i}y_s^i =
p(1+y_s)^d
\eeq
The sequence $y_s$ is increasing, and induction using
$p\leq (d-1)^{d-1}/d^d$ gives $y_s\leq\frac{1}{d-1}$ for all $s$:
if $y_s\leq\frac{1}{d-1}$ then
$y_{s+1}\leq p\left(\frac{d}{d-1}\right)^d\leq\frac{1}{d-1}$.
Hence $y_s$ converges to a limit $y\leq\frac{1}{d-1}$, which by
continuity satisfies (\ref{e1}), as desired.
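The recursion (\ref{e4}) is easy to iterate numerically (a Python sketch; the function name is ours): at the critical $p=(d-1)^{d-1}/d^d$ the $y_s$ climb slowly toward $\frac{1}{d-1}$, while for smaller $p$ they converge quickly to the least root of (\ref{e1}).

```python
def tree_sum(p, d, iters=10000):
    """Iterate y_{s+1} = p (1 + y_s)^d starting from y_0 = p.

    y_s is the weight sum over subtrees of the pure d-ary tree of depth
    at most s; the limit is the least root y of y = p (1 + y)^d."""
    y = p
    for _ in range(iters):
        y = p * (1.0 + y) ** d
    return y

d = 4
p_crit = (d - 1) ** (d - 1) / d ** d   # here 27/256
y = tree_sum(p_crit, d)                # approaches 1/(d-1) = 1/3 from below
```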
\end{document}