Exchangeable Pairs

Introduction

The classical combinatorial central limit theorem deals with sums of the following form: \[ \begin{align} \sum_{i=1}^{n} c(i, \pi(i)), \label{eqn:classic_cclt} \end{align} \] where the \(c(i,j)\) are constants, and the randomness comes from the permutation \(\pi\). Our sums are slightly more complicated: \[ \begin{align} \sum_{i,j,k} a(i,j,k) b(\pi(i), \pi(j), \pi(k)), \label{eqn:new_form} \end{align} \] but the hope is that the classical approach can be extended to our more complicated setting11 In our setting, the regular indices separate from the permutation indices, and this makes the formatting of the calculations easier, but I don’t think it’s actually necessary that they decompose for our results to go through..

Centering for Classical

Equation 8 of (Hoeffding ⊕1951Hoeffding, Wassily. 1951. “A Combinatorial Central Limit Theorem.” The Annals of Mathematical Statistics 22 (4): 558–66.) gives the term for the variance in the combinatorial central limit theorem, which we reproduce below: \[ \begin{align*} d(i,j) = c(i,j) - c(\cdot, j) - c(i, \cdot) + c(\cdot, \cdot), \end{align*} \] where the \(\cdot\) corresponds to taking an average across all possible indices in that term22 In other words, \(c(i, \cdot) = \frac{1}{n}\sum_j c(i,j)\). This centering is powerful because not only does it correspond to a zero mean random variable in that \[ \begin{align} \sum_i c(i, \pi(i)) - \mathbb{E}\left( \sum_i c(i, \pi(i)) \right) = \sum_i d(i, \pi(i)), \label{eqn:equality} \end{align} \] but the \(d\) terms have the property that their row and column sums are 0. Clearly, \[ \begin{align} \sum_i d(i,j) &= \sum_i \left[ c(i,j) - c(\cdot, j) - c(i, \cdot) + c(\cdot, \cdot) \right] \\ &= n c(i,j) - n c(\cdot, j) - n c(\cdot, \cdot) + n c(\cdot, \cdot) = 0, \end{align} \] and by symmetry the row sums are also 0. Now, to show \(\eqref{eqn:equality}\), note that \[ \begin{align} \sum_i d(i, \pi(i)) &= \sum_i \left[ c(i,\pi(i)) - c(\cdot, \pi(i)) - c(i, \cdot) + c(\cdot, \cdot) \right] \\ &= \sum_i c(i, \pi(i)) - n c(\cdot, \cdot) - n c(\cdot, \cdot) + n c(\cdot, \cdot) \\ &= \sum_i c(i, \pi(i)) - n c(\cdot, \cdot). \end{align} \] But the expectation of \(\eqref{eqn:classic_cclt}\) is exactly the second term above: \[ \begin{align} \mathbb{E} \sum_i c(i, \pi(i)) &= \sum_i \mathbb{E} c(i, \pi(i)) \\ &= \sum_i \sum_j c(i, j) \mathbb{P}(\pi(i) = j) \\ &= \frac{1}{n} \sum_{i,j} c(i, j) = n c(\cdot, \cdot) \end{align} \]

New Centering

Let’s see if we can do something similar. Consider the following centered version of the summands of \(\eqref{eqn:new_form}\): \[ \begin{align*} \tilde{a}(i,j,k,r,s,t) &= a(i,j,k,r,s,t) \\&- a(\cdot,j,k,r,s,t) - a(i,\cdot,k,r,s,t) - \ldots \\&+ a(\cdot,\cdot,k,r,s,t) + a(\cdot,j,\cdot,r,s,t) + \ldots \\&- a(\cdot,\cdot,\cdot,r,s,t) - a(\cdot,\cdot,k,\cdot,s,t) - \ldots \\&+ a(\cdot,\cdot,\cdot,\cdot,s,t) + a(\cdot,\cdot,\cdot,r,\cdot,t) + \ldots \\&- a(\cdot,\cdot,\cdot,\cdot,\cdot,t) - a(\cdot,\cdot,\cdot,\cdot,s,\cdot) - \ldots \\&+ a(\cdot,\cdot,\cdot,\cdot,\cdot,\cdot) % \\&= a_{\emptyset} % \\&- a_{1} - a_{2} - \ldots % \\&+ a_{1,2} + a_{1,3} + \ldots % \\&- a_{1,2,3} - a_{1,2,4} - \ldots % \\&+ a_{1,2,3,4} + a_{1,2,3,5} + \ldots % \\&- a_{1,2,3,4,5} - a_{1,2,3,4,6} - \ldots % \\&+ a_{1,2,3,4,5,6} \end{align*} \] Let us first check that sums over one index with everything else fixed are 0. Without loss of generality, pick the first index, \(i\). Clearly, every term on the RHS either has an \(i\) in the first position, or a \(\cdot\) there. A little more thought will convince you that these terms pair up, in the sense that every term on the right with an \(i\) in the first position can be matched to a term that has the same entries in the last 5 positions but a \(\cdot\) in the first one, and that this pairing is exhaustive.

But once you have this pairing, then just note that \[ \begin{align} \sum_i a(i, \ldots) - \sum_i a(\cdot, \ldots) = n a(\cdot, \ldots) - n a(\cdot, \ldots) = 0, \end{align} \] where the \(\ldots\) here represents some common sequence of indices. Hence, the single-coordinate sums for \(\tilde{a}\) are 0. It remains to show that \[ \begin{align} \sum_{i,j,k} a(i,j,k,\pi(i),\pi(j),\pi(k)) - \mathbb{E}\left( \sum_{i,j,k} a(i,j,k,\pi(i),\pi(j),\pi(k)) \right) \\= \sum_{i,j,k} \tilde{a}(i,j,k,\pi(i),\pi(j),\pi(k)) \end{align} \] The problem is that this is not true. Consider the following: \[ \begin{align} \sum_{i,j,k} a(\cdot,j,k,\pi(i),\pi(j),\pi(k)) = \sum_{i,j,k,r} a(i,j,k,r,\pi(j),\pi(k)) \end{align} \] What we see is that, unlike in the classical case where the extra terms would never have indices showing up in multiple places, we have both \(j\) and \(k\) showing up in multiple places, so that the sum doesn’t simplify.