Post

Did just a couple of deviations suffice all along? (problems 10-14)

In September 2024, at a conference on Mathematical Aspects of Learning Theory in Barcelona, Dan Spielman gave a beautiful talk on discrepancy theory and some of its applications in clinical trials. Even though this was not the precise focus of the talk, it sparked a discussion that day about lower bounds for Joel Spencer’s Six Deviations Suffice Theorem. I will describe some of the unanswered questions from this discussion. Many thanks to Dan Spielman, Tselil Schramm, Amir Yehudayoff, Petar Nizic-Nikolac, and Anastasia Kireeva with whom discussing this problem contributed to making the workshop a memorable week!

Given $n$ a positive integer and $A$ and $n\times n$ matrix, we define the discrepancy of $A$, $\mathrm{disc}(A)$ as

\[\\ \mathrm{disc}(A) := \inf_{x\in\{\pm1\}^n}\|Ax\|_\infty. \\\]

One of the motivations of this question is to understand how much smaller is $\inf_{x\in\{\pm1\}^n}\|Ax\|_\infty$ compared to the value, e.g. in expectation, when $x$ is drawn uniformly from the hypercube. We point to the references in this post for a more thorough discussion of the context, motivations, and state of the art.

In 1985, Joel Spencer (Spencer, 1985) showed the cellebrated ``Six Deviations Suffice’’ Theorem, which states that for any positive integer $n$ and any $A\in{\pm1}^{n\times n}$ $\mathrm{disc}(A) \leq 6\sqrt{n}$. There have since been improvements on this constant, but the question we will pursue here concerns lower bounds.

When the condition of $\pm1$ entries is replaced by an $\ell_2$ condition on the columns, the corresponding question is the famous K'{o}mlos Conjecture (which coincidentally is the first open problem in my lecture notes from around a decade ago (Bandeira, 2016)).

Conjecture 10 (Komlos Conjecture)
There exists a universal constant $K$ such that, for all square matrices $A$ whose columns have unit $\ell_2$ norm, $\mathrm{disc}(A) \leq K$.

It is easy to see that the statement of this conjecture implies Spencer’s Theorem, since the columns of an $n\times n$, $\pm1$ matrix, have $\ell_2$ norm $\sqrt{n}$. The best current lower bound for K'{o}mlos is a recent construction by Tim Kunisky~(Kunisky, 2023) which shows a lower bound on $K$ of $1+\sqrt{2}$. Unfortunately, the resulting matrix is not a $\pm1$ matrix (even after scaling), and so it does not provide a lower bound for the Spencer setting.

Open Problem 11 (How many deviations are needed afterall?)
What is the value of the following quantity?
\(\\ \sup_{n\in\mathbb{N}}\sup_{A\in\{\pm1\}^{n\times n}}\frac1{\sqrt{n}}\mathrm{disc}(A) \\\)

It is easy to see that the matrix \(\begin{bmatrix}1 & 1 \\ 1 & -1\end{bmatrix}\) has discrepancy $2=\sqrt{2}\sqrt{2}$, giving a lower bound of $\sqrt{2}$ to this quantity. Also, Dan Spielman mentioned a numerical construction (of a specific size) that achieved a slightly larger discrepancy value. To the best of my knowledge, there is no known lower bound above $2$, motivating the question: ``Did just a couple of deviations suffice all along?’‘.

Also, to the best of my knowledge, there is no known construction (or proof of existence) of an infinite family achieving a lower bound strictly larger than one. More explicitly, we formulate the following conjecture.

Conjecture 12
Given the definitions above, we have \(\\ \limsup_{n\to\infty}\sup_{A\in\{\pm1\}^{n\times n}}\frac1{\sqrt{n}}\mathrm{disc}(A) > 1. \\\)

Hadamard matrices

A natural place to search for a family proving a lower bound is to use the Sylvester construction of Hadarmard matrices. An Hadarmard matrix is a $\pm1$ matrix whose columns are all orthogonal (in other words, after scaling by $\frac{1}{\sqrt{n}}$ it is an orthogonal matrix). In 1867 James Joseph Sylvester proposed the following construction for sizes $n$ that are powers of $2$. $H_0 = 1$ and $H_k$ is the $2^{k}\times 2^{k}$ matrix given by $H_k = H_1 \otimes H_{k-1}$. In other words

\[\\ H_k = \begin{bmatrix} H_{k-1} & H_{k-1} \\ H_{k-1} & -H_{k-1}\end{bmatrix}. \\\]

Recall that $n=2^k$. Since $\frac{1}{\sqrt{n}}H_k$ is an orthogonal matrix, it is easy to see that $\left|H_kx\right|_2=n$ for all $x\in\{\pm1\}^n$ and thus $\mathrm{disc}(H_k)\geq \sqrt{n}$. This is not tight for $H_1$ (as its discrepancy is $2$), however it is tight for $H_2$: e.g. taking the eigenvector $y = (1, 1, 1, -1)$.

To give an upper bound on $\mathrm{disc}(H_k)$ we proceed with an explicit construction of a $\pm1$ vector, using the (optimal) eigenvector $y$ for $H_2$. Let $x^\natural_{(0)}=1$, $x^\natural_{(1)}=(1, 1)$, and for $k \geq 2$, $x^\natural_{(k)}\in\{\pm1\}^{2^k}$ is given by

\[\\ x^\natural_{(k)} = y \otimes x^\natural_{(k-2)} =\left[\begin{array}{c}x^\natural_{(k-1)} \\ x^\natural_{(k-1)} \\x^\natural_{(k-1)} \\ -x^\natural_{(k-1)}\end{array}\right]. \\\]

Since $H_k = H_1 \otimes (H_1 \otimes H_{k-2}) = H_2 \otimes H_{k-2}$, the mixed-product property gives

\[\\ H_kx^\natural_{(k)} = (H_2 \otimes H_{k-2}) (y \otimes x^\natural_{(k-2)}) = (H_2y) \otimes (H_{k-2}x^\natural_{(k-2)}) = (2 y) \otimes (H_{k-2}x^\natural_{(k-2)}). \\\]

Inducting this argument shows that

\[\\ \left\|H_kx^\natural_{(k)}\right\|_\infty = \begin{cases} \sqrt{2}\sqrt{2^k} \text{ if } k \text{ is odd,} \\ \sqrt{2^k} \text{ if } k \text{ is even.} \end{cases} \\\]

While this settles the discrepancy of $H_k$ for $k$ even, the odd case is, to the best of my knowledge, open.

Conjecture 13
Let $H_k$ be the $2^k\times 2^k$ Hadamard matrix obtained by the Sylvester construction (see above). For odd $k$, we have \(\\ \mathrm{disc}(H_k) = \sqrt{2}\sqrt{2^k}. \\\)

Note that this would settle Conjecture 12.

Remark (Boolean Fourier Analysis): I am not going to discuss this at length here, but since $H_k$ essentially corresponds to the boolean Fourier transform for boolean functions on $k$ variables, the discrepancy of $H_k$ has a natural interpretation in Boolean Analysis. We point the reader to (O’Donnell, 2014) for more on this subject.

As a final note, I could not mention Hadamard matrices without stating a famous conjecture regarding their existence.

Conjecture 14 (Hadamard Conjecture)
For any positive $n$ a multiple of $4$, there exists an $n\times n$ Hadamard matrix.

References

  1. Spencer, J. (1985). Six Standard Deviations Suffice. Transactions of the American Mathematical Society, 289(2), 679–706.
  2. Bandeira, A. S. (2016). Ten Lectures and Forty-Two Open Problems in the Mathematics of Data Science. Available at \Urlhttps://People.math.ethz.ch/ Abandeira/TenLecturesFortyTwoProblems.pdf.
  3. Kunisky, D. (2023). The discrepancy of unsatisfiable matrices and a lower bound for the Komlós conjecture constant. SIAM J. Discrete Mathematics (SIDMA), 37(2).
  4. O’Donnell, R. (2014). Analysis of Boolean Functions. Cambridge Press.
This post is licensed under CC BY 4.0 by the author.

Comments powered by Disqus.