Johnson-Lindenstrauss 引理

Johnson-Lindenstrauss 引理

对于NNdd维向量构成的集合A={x1,x2,,xNRd}A=\left\lbrace x_1,x_2,\cdots,x_N\in\mathbb{R}^d\right\rbraceε(0,1)\forall\varepsilon\in(0,1)f:RdRk\exists f:\mathbb{R}^d\to\mathbb{R}^k,满足

(1ε)xixj2f(xi)f(xj)2(1+ε)xixj2(1-\varepsilon)\|x_i-x_j\|^2\leq \|f(x_i)-f(x_j)\|^2\leq (1+\varepsilon)\|x_i-x_j\|^2

其中

k24logN3ε22ε3O(logNε2)k\geq \frac{24\log N}{3\varepsilon^2-2\varepsilon^3}\simeq O(\frac{\log N}{\varepsilon^2})

Proof:\mathrm{Proof:}

随机矩阵

证明主要基于Cramér-Chernoff方法

uRd\mathbf{u}\in \mathbb{R}^d, Riji.i.dN(0,1)R_{ij}\overset{i.i.d}{\sim}\mathcal{N}(0,1), 随机投影向量 v=1kRu\mathbf{v}=\frac{1}{\sqrt{k}}R\cdot \mathbf{u}

根据随机矩阵期望保距

E[v2]=u2\mathbb{E}[\|\mathbf{v}\|^2]=\|\mathbf{u}\|^2

X=kuv=1uRuX=\frac{\sqrt{k}}{\|\mathbf{u}\|}\mathbf{v}=\frac{1}{\|\mathbf{u}\|}R\cdot \mathbf{u}, x=i=1kxi2=kv2u2χ2(k)x=\sum_{i=1}^k x_i^2 =\frac{k\|\mathbf{v}\|^2}{\|\mathbf{u}\|^2}\sim \chi^2(k)

E[eλxi2]=+eλxi212πexi22dxi=112λ\begin{aligned} \mathbb{E}[e^{\lambda x_i^2}]&=\int_{-\infty}^{+\infty}e^{\lambda x_i^2}\frac{1}{\sqrt{2\pi}}e^{-\frac{x_i^2}{2}}\,\mathrm{d}x_i=\sqrt{\frac{1}{1-2\lambda}} \end{aligned}

一方面

P[v2(1+ε)u2]=P[x(1+ε)k]=P[eλxeλ(1+ε)k]infλ0E[eλx]eλ(1+ε)k=infλ0i=1kE[eλxi2]eλ(1+ε)k=infλ0(112λeλ(1+ε))k=[(1+ε)eε]k2=exp(k2(ε+log(1+ε)))exp(k2(ε22+ε33))\begin{aligned} P[\|\mathbf{v}\|^2\geq(1+\varepsilon)\|\mathbf{u}\|^2]&=P[x\geq(1+\varepsilon)k]\\ &=P[e^{\lambda x}\geq e^{\lambda(1+\varepsilon)k}]\\ &\leq \inf_{\lambda\geq 0}\frac{\mathbb{E}[e^{\lambda x}]}{e^{\lambda(1+\varepsilon)k}}\\ &=\inf_{\lambda\geq 0}\frac{\prod_{i=1}^k\mathbb{E}[e^{\lambda x_i^2}]}{e^{\lambda(1+\varepsilon)k}}\\ &=\inf_{\lambda \geq 0}\left(\frac{1}{\sqrt{1-2\lambda}\,e^{\lambda(1+\varepsilon )}}\right)^k\\ &=\left[(1+\varepsilon)e^{-\varepsilon}\right]^{\frac{k}{2}}\\ &=\exp\left(\frac{k}{2}(-\varepsilon+\log(1+\varepsilon))\right)\\ &\leq\exp\left(\frac{k}{2}\left(-\frac{\varepsilon^2}{2}+\frac{\varepsilon^3}{3}\right)\right) \end{aligned}

k2(ε22+ε33)<2logN\dfrac{k}{2}\left(-\dfrac{\varepsilon^2}{2}+\dfrac{\varepsilon^3}{3}\right)<-2\log N, 即 k>24logN3ε22ε3k>\dfrac{24\log N}{3\varepsilon^2-2\varepsilon^3} 时,满足

P[v2(1+ε)u2]N2P\left[\|\mathbf{v}\|^2\geq (1+\varepsilon)\|\mathbf{u}\|^2\right]\leq N^{-2}

因此

P[v2(1+ε)u2]1N2P\left[\|\mathbf{v}^2\|\leq (1+\varepsilon)\|\mathbf{u}\|^2\right]\geq 1-N^{-2}

另一方面

P[v2(1ε)u2]=P[x(1ε)k]=P[eλxeλ(1ε)k]infλ0E[eλx]eλ(1ε)k=infλ0i=1kE[eλxi2]eλ(1ε)k=infλ0(11+2λeλ(1ε))k=[(1ε)eε]k2=exp(k2(ε+log(1ε)))exp(kε24)\begin{aligned} P\left[\|\mathbf{v}\|^2\leq(1-\varepsilon)\|\mathbf{u}\|^2\right]&=P\left[x\leq (1-\varepsilon)k\right]\\ &=P\left[e^{-\lambda x}\geq e^{-\lambda(1-\varepsilon)k}\right]\\ &\leq \inf_{\lambda\geq 0}\frac{\mathbb{E}[e^{-\lambda x}]}{e^{-\lambda(1-\varepsilon)k}}\\ &=\inf_{\lambda\geq 0} \frac{\prod_{i=1}^k\mathbb{E}\left[e^{-\lambda x_i^2}\right]}{e^{-\lambda (1-\varepsilon )k}}\\ &=\inf_{\lambda \geq 0}\left(\frac{1}{\sqrt{1+2\lambda}\,e^{-\lambda(1-\varepsilon )}}\right)^k\\ &= \left[(1-\varepsilon)e^{\varepsilon}\right]^{\frac{k}{2}}\\ &= \exp\left(\frac{k}{2}\left(\varepsilon+\log(1-\varepsilon)\right)\right)\\ &\leq \exp \left(-\frac{k\varepsilon^2}{4}\right) \end{aligned}

kε242logN-\dfrac{k\varepsilon^2}{4}\leq -2\log N时,即k8logNε2k\geq \dfrac{8\log N}{\varepsilon^2}

P[v2(1ε)u2]N2P\left[\|\mathbf{v}\|^2\leq(1-\varepsilon)\|\mathbf{u}\|^2\right]\leq N^{-2}

P[v2(1ε)u2]1N2P\left[\|\mathbf{v}\|^2\geq(1-\varepsilon)\|\mathbf{u}\|^2\right]\geq 1- N^{-2}

k{k24logN3ε22ε3}{k8logNε2}={k24logN3ε22ε3}k\in \left\{k\geq\frac{24\log N}{3\varepsilon^2-2\varepsilon^3}\right\}\cap \left\{k\geq\frac{8\log N}{\varepsilon^2}\right\}=\left\{k\geq\frac{24\log N}{3\varepsilon^2-2\varepsilon^3}\right\}

满足

P[v2∉[(1ε)u2,(1+ε)u2]]2N2P\left[\|\mathbf{v}\|^2\not \in\left[(1-\varepsilon)\|\mathbf{u}\|^2,(1+\varepsilon)\|\mathbf{u}\|^2\right]\right]\leq \frac{2}{N^2}

进一步

xi,xjA,P[f(xi)f(xj)2∉[(1ε)xixj2,(1+ε)xixj2]]2N2\forall x_i, x_j\in A,\, P\left[\|f(x_i)-f(x_j)\|^2\not \in\left[(1-\varepsilon)\|x_i-x_j\|^2,(1+\varepsilon)\|x_i-x_j\|^2\right]\right]\leq\frac{2}{N^2}

Boole不等式,遍历集合AA的二元向量组

P(Ei)P(Ei)N(N1)22N2=11NP(\cup E_i)\leq\sum P(E_i)\leq\frac{N(N-1)}{2}\frac{2}{N^2}=1-\frac{1}{N}

由此可以得出,当 k24logN3ε22ε3k\geq\dfrac{24\log N}{3\varepsilon^2-2\varepsilon^3}时,将整个 AA 几乎保距映入低维空间的映射ff存在的概率不为0

Fast Johnson-Linderstrauss Transform