Sampling from the Sherrington-Kirkpatrick Model (problem 15)

Posted Dec 20, 2024

By Almut Roedder

5 min read

In recent years, significant progress has been made in understanding the mixing behavior of Glauber dynamics for Ising models, particularly in the Sherrington-Kirkpatrick (SK) model. Despite the classical Dobrushin-uniqueness condition being too restrictive for this model, recent breakthroughs employing techniques like stochastic localization and spectral analysis have improved the understanding of the fast-mixing regime. In this post, we will discuss these improved mixing bounds for Glauber dynamics in the SK model and the open questions that remain.

Let us start with some preliminaries. Consider the following measure on the hypercube $\{\pm 1\}^n$: $\mu_{J,h}(x) \propto \exp(\frac{1}{2} x^T J x +h^Tx),$ where $\propto$ corresponds to equal up to a normalizing constant. Here, the interaction matrix $J \in \mathbb{R}^{n \times n}$ encodes pairwise interactions between spins, while $h$ represents the influence of an external field. We will particularly focus on $J = \beta W$ being a random matrix with $W \sim GOE(n)$ and inverse temperature $\beta>0$. This model was introduced by Sherrington and Kirkpatrick (Sherrington & Kirkpatrick, 1975) and we refer to it as SK-model. The reader can focus on the case $h=0$. All statements below are with high probability on the disorder $W$.

Glauber Dynamics is an MCMC algorithm which chooses a site $i \in [n]$ uniformly at random and updates the spin according to $\mu$ conditioned on the remaining spins. The dynamics can be represented by the following transition kernel:

\[P(x,y) = \begin{cases} \frac{1}{n}\mu(X_i = y_i \mid X_{-i} = x_{-i}) ,& \text{if } d_H(x,y) = 1\\ 1-\frac{1}{n} \sum_{j=1}^n \mu(X_j \neq x_j \mid X_{-i} = x_{-i}) ,& \text{if } d_H(x,y) = 0\\ 0, &\text{otherwise,} \end{cases}\]

representing the probability of transitioning from $x$ to $y$. Here, $d_H$ denotes the Hamming distance in the hypercube. Mixing in time $T_\epsilon$ means that, regardless of the starting point, the measure of the Markov chain at time $T_\epsilon$ is $\epsilon$-close to its stationary measure $\mu$ with respect to the total variation distance (although other notions of distances can also be used). The reader should think of $\epsilon$ as a small constant.

One celebrated and generally sufficient condition of Glauber Dynamics mixing in $O(n\log(n))$ time is the Dobrushin-uniqueness condition (Dobruschin, 1968) which requires $\max_{i}\sum_{j} |J_{ij}|<1$. While this condition is tight for some models (e.g. the Curie-Weiss-model), it only proves fast mixing for very high temperatures ($\beta = O(\frac{1}{\sqrt{n}})$) in the SK-model. In fact, a long standing open problem asked to extend the results above for some $\beta = O(1)$.

Indeed, in recent years, new proof techniques based on stochastic localization schemes have emerged and, when applied to the SK-model, significantly improved the bounds on $\beta$ for guaranteeing fast mixing. Eldan, Koehler and Zeitouni (Eldan et al., 2022) were able to prove the following general (deterministic) spectral condition: As long as

\[\lambda_{\max}(J) -\lambda_{\min}(J) < 1-\delta\]

for some $\delta >0$, then Glauber Dynamics mixes in $O(n^2)$ time with respect to total variation distance (big-O notation may hide constants depending on $\epsilon$). Applied to the SK-model this guarantees fast mixing for inverse temperature $\beta < \frac{1}{4}$.

The bound on the width of the spectrum of the Interaction Matrix proves to be tight again in the Curie-Weiss-Model. Moreover, based on the low-degree-conjecture, Kunisky (Kunisky, 2024) gave evidence that no polynomial-time sampling algorithm exists that can succeed in general for all $J$ with $\lambda_{\max}(J) -\lambda_{\min}(J) \leq 1+\delta$. Hence, for improving bounds in the SK-model using spectral analysis, it does not suffice to measure the distance of the largest and the smallest eigenvalue of the interaction matrix, but additional properties of the interaction matrix in the SK model have to be used. One nice example is the symmetry of the spectrum: In contrast to the all-ones-matrix which defines the Curie-Weiss-Interaction matrix, the spectrum of $W$ is almost symmetric around $0$. Anari, Koehler and Vuong (Anari et al., 2024) used exactly this property to improve the bound on $\beta$: When

\[\frac{|\lambda_{\max}(J)|}{|\lambda_{\min}(J)|} = 1 + o(1),\]

Glauber Dynamics mixes in $O(n \log(n))$ time with respect to the Kullback-Leibler divergence, when

\[\lambda_{\max}(J) - \lambda_{\min}(J) < 1.18\]

and $\lambda_{\min}(J) <0$. The proof is based on a trickle down equation in a stochastic localization scheme and the authors introduce novel techniques to bound the operator norm of covariance matrices, depending on both the ratio of $\lambda_{\max}(J)$ and the distance $\lambda_{\min}(J)$ and $\lambda_{\max}(J) - \lambda_{\min}(J)$. This leads to the currently best-known upper bound $\beta < 0.295$ and it remains an open question whether this bound could be improved up to $1$:

Open Problem 15 (Fast Mixing for Glauber Dynamics in the SK-Model)

Consider the SK-model with an arbitrary external field. Does Glauber Dynamics fast mix (meaning in polynomial time) up to the replica symmetry breaking threshold $\beta_\ast =1$?

El Alaoui, Montanari and Sellke (Alaoui et al., 2022) proposed an algorithm for sampling from the SK-model without an external field that applies a stochastic localization scheme and makes use of the Approximate Message Passing framework. They proved fast mixing in Wasserstein Distance (which is a weaker metric than total variation distance) up to $\beta <\frac{1}{2}$. Celentano (Celentano, 2024) extended the argument to all $\beta$ up to the threshold $\beta_\ast =1$. Moreover, El Alaoui, Montanari and Sellke, proved in the same paper (Alaoui et al., 2022) that it is impossible for stable algorithms to sample from the SK model without external field for $\beta >1$, which also rules out fast mixing for their algorithm.

References

Sherrington, D., & Kirkpatrick, S. (1975). Solvable model of a spin-glass. Physical Review Letters, 35(26), 1792.
Dobruschin, P. L. (1968). The description of a random field by means of conditional probabilities and conditions of its regularity. Theory of Probability & Its Applications, 13(2), 197–224.
Eldan, R., Koehler, F., & Zeitouni, O. (2022). A spectral condition for spectral gap: fast mixing in high-temperature Ising models. Probability Theory and Related Fields, 182(3), 1035–1051.
Kunisky, D. (2024). Optimality of Glauber dynamics for general-purpose Ising model sampling and free energy approximation. Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 5013–5028.
Anari, N., Koehler, F., & Vuong, T.-D. (2024). Trickle-Down in Localization Schemes and Applications. Proceedings of the 56th Annual ACM Symposium on Theory of Computing, 1094–1105.
Alaoui, A. E., Montanari, A., & Sellke, M. (2022). Sampling from the Sherrington-Kirkpatrick Gibbs measure via algorithmic stochastic localization. 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), 323–334.
Celentano, M. (2024). Sudakov–Fernique post-AMP, and a new proof of the local convexity of the TAP free energy. The Annals of Probability, 52(3), 923–954.

Open Problems 2024

This post is licensed under CC BY 4.0 by the author.

References

Trending Tags