// Databricks notebook source exported at Tue, 28 Jun 2016 11:19:25 UTC

Scalable Data Science

Course Project by Dominic Lee


The html source url of this databricks notebook and its recorded Uji (Dogen's Time-Being):

sds/uji/studentProjects/09_DominicLee/056_RandomMatrices


# Random Matrices

This project explores how the theory of random matrices can be used in a practical way to decompose matrices. In particular, we focus on an important practical application:

  • Separating “signal” from “noise”: Obtain a low-rank representation of a matrix that retains most of the “signal” and filters out most of the “noise” in the original matrix.

References

  • Alan Edelman and Yuyang Wang (2013). Random matrix theory and its innovative applications.
    Abstract: Recently more and more disciplines of science and engineering have found Random Matrix Theory valuable. Some disciplines use the limiting densities to indicate the cutoff between “noise” and “signal.” Other disciplines are finding eigenvalue repulsions a compelling model of reality. This survey introduces both the theory behind these applications and MATLAB experiments allowing a reader immediate access to the ideas.

  • Alan Edelman and N. Raj Rao (2005). Random matrix theory. Acta Numerica, pp. 1-65.
    Abstract: Random matrix theory is now a big subject with applications in many disciplines of science, engineering and finance. This article is a survey specifically oriented towards the needs and interests of a numerical analyst. This survey includes some original material not found anywhere else. We include the important mathematics which is a very modern development, as well as the computational software that is transforming the theory into useful practice.
    Contents:
    1 Introduction
    2 Linear systems
    3 Matrix calculus
    4 Classical random matrix ensembles
    5 Numerical algorithms stochastically
    6 Classical orthogonal polynomials
    7 Multivariate orthogonal polynomials
    8 Hypergeometric functions of matrix argument
    9 Painlevé equations
    10 Eigenvalues of a billion by billion matrix
    11 Stochastic operators
    12 Free probability and infinite random matrices
    13 A random matrix calculator
    14 Non-Hermitian and structured random matrices
    15 A segue

Applications of Random Matrix Theory

  • Mark Buchanan (2010). Enter the matrix: the deep law that shapes our reality. New Scientist Vol. 206 Issue 2755, p. 28-31.
    Abstract: The article discusses random matrix theory, a facet of physics research said to be growing rapidly. Once applied only in the study of physics, the article reports that the concept is finding application in numerous other areas including finance and materials science. Electrical engineer Raj Nadakuditi is cited regarding his observation that random matrix theory is deeply rooted in nature. The history of application of random matrix theory to studies of complex atomic nuclei is noted. Physicist Eugene Wigner is mentioned for his insights into the quantum nature of atomic nuclei.


//This allows easy embedding of publicly available information into any other notebook
//when viewing in git-book just ignore this block - you may have to manually chase the URL in frameIt("URL").
//Example usage:
// displayHTML(frameIt("https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation#Topics_in_LDA",250))
def frameIt( u:String, h:Int ) : String = {
      """<iframe 
 src=""""+ u+""""
 width="95%" height="""" + h + """"
 sandbox>
  <p>
    <a href="http://spark.apache.org/docs/latest/index.html">
      Fallback link for browsers that, unlikely, don't support frames
    </a>
  </p>
</iframe>"""
   }
displayHTML(frameIt("https://en.wikipedia.org/wiki/Random_matrix",250))


Separating “Signal” from “Noise”


From the decomposition of a matrix as a sum of rank-one matrices, the goal is to truncate the sum appropriately so as to obtain a low-rank representation of the original matrix that retains the “signal” (or structure) and filters out the “noise”.


Wigner’s Theorem

Wigner’s theorem, also known as the semi-circle law, states that for a real symmetric \(n\times n\) random matrix, \(X_{n}\), whose entries are independent random variables with mean 0, variance 1, and bounded moments, the distribution of the eigenvalues of \(\frac{1}{\sqrt{n}}X_{n}\) approaches a semi-circle density on the support [-2,2] as \(n\to\infty\).
Denoting an eigenvalue of \(\frac{1}{\sqrt{n}}X_{n}\) by \(\lambda\), the asymptotic semi-circle density is \(f(\lambda)=\frac{1}{2\pi}\sqrt{4-\lambda^{2}}.\)
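
To make the limiting density concrete, here is a minimal sketch (using Breeze; imports are included so the cell stands alone) that evaluates the asymptotic semi-circle density on a grid, e.g. for overlaying on an eigenvalue histogram:

import breeze.linalg._
import breeze.numerics._
import breeze.numerics.constants.Pi

val grid = linspace(-2.0, 2.0, 101) // grid over the support [-2,2]
val semiCircleDensity = grid.map(l => sqrt(4.0 - l * l) / (2.0 * Pi)) // f(lambda) at each grid point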

Let’s import the needed libraries.


import breeze.linalg._
import breeze.linalg.operators._
import breeze.numerics._
import breeze.numerics.constants._
import breeze.stats._
import breeze.stats.distributions._
import breeze.stats.DescriptiveStats._
sqlContext // confirm that the SQLContext (used by toDF below) is available

Gaussian Example


// Gaussian example
val gaussianMean = 0.0
val gaussianSD = 1.0
val n = 4
val x = new DenseMatrix(n,n,new Gaussian(gaussianMean,gaussianSD).sample(n*n).toArray) // nxn Gaussian random matrix


val y = upperTriangular(x) + upperTriangular(x).t - diag(diag(x)) // symmetric Gaussian random matrix


val yEigval = eigSym.justEigenvalues(y/sqrt(n)).toArray // eigenvalues of scaled symmetric Gaussian random matrix


val n = 1000
val nSquared = n * n
val nIndex = DenseVector.range(1,n+1,1).toArray // array of integers from 1 to n
val nSquaredIndex = DenseVector.range(1,nSquared+1,1).toArray // array of integers from 1 to nSquared


val s = new Gaussian(gaussianMean,gaussianSD).sample(nSquared).toArray
val sDF = sc.parallelize(nSquaredIndex zip s).toDF("index","sample") // create RDD and convert to dataframe
sDF.show


display(sDF.select("sample"))


val x = new DenseMatrix(n,n,s) // nxn Gaussian random matrix
val y = upperTriangular(x) + upperTriangular(x).t - diag(diag(x)) // symmetric Gaussian random matrix
val yEigval = eigSym.justEigenvalues(y/sqrt(n)).toArray // eigenvalues of scaled symmetric Gaussian random matrix


val yEigvalDF = sc.parallelize(nIndex zip yEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
yEigvalDF.show


display(yEigvalDF.select("eigenvalue"))

Gaussian example: Density histogram with semi-circle density.

Uniform Example


// Uniform example
val uniformMin = -sqrt(3) // endpoints chosen so that the mean is 0...
val uniformMax = sqrt(3) // ...and the variance is (uniformMax - uniformMin)^2 / 12 = 1
val uniformMean = Uniform(uniformMin,uniformMax).mean
val uniformVariance = Uniform(uniformMin,uniformMax).variance


val s = new Uniform(uniformMin,uniformMax).sample(nSquared).toArray
val sDF = sc.parallelize(nSquaredIndex zip s).toDF("index","sample") // create RDD and convert to dataframe
display(sDF.select("sample"))


val x = new DenseMatrix(n,n,s) // nxn uniform random matrix
val y = upperTriangular(x) + upperTriangular(x).t - diag(diag(x)) // symmetric uniform random matrix
val yEigval = eigSym.justEigenvalues(y/sqrt(n)).toArray // eigenvalues of scaled symmetric random matrix
val yEigvalDF = sc.parallelize(nIndex zip yEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(yEigvalDF.select("eigenvalue"))

Laplacian (Double Exponential) Example


// Laplacian (double exponential) example:
// an exponential magnitude times an independent random sign is a Laplace variate;
// rate sqrt(2) makes the variance 2/rate^2 = 1
val exponentialRate = sqrt(2)
val a = new DenseMatrix(n,n,new Exponential(exponentialRate).sample(nSquared).toArray) // nxn exponential random matrix
implicit def bool2double(b:Boolean) = if (b) 1.0 else 0.0
val bernoulliProb = 0.5
val b = new DenseMatrix(n,n,new Bernoulli(bernoulliProb).sample(nSquared).map(b => 2 * bool2double(b) - 1).toArray) // nxn random sign (+1/-1) matrix
val x = a :* b // nxn Laplacian random matrix (elementwise product)
val s = x.toDenseVector.toArray
val sDF = sc.parallelize(nSquaredIndex zip s).toDF("index","sample") // create RDD and convert to dataframe
display(sDF.select("sample"))


val y = upperTriangular(x) + upperTriangular(x).t - diag(diag(x)) // symmetric Laplacian random matrix
val yEigval = eigSym.justEigenvalues(y/sqrt(n)).toArray // eigenvalues of scaled symmetric random matrix
val yEigvalDF = sc.parallelize(nIndex zip yEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(yEigvalDF.select("eigenvalue"))

Does it work when the underlying density is non-symmetric?

Exponential Example


// Exponential example
val exponentialRate = 0.5
val s = new Exponential(exponentialRate).sample(nSquared).toArray
val sDF = sc.parallelize(nSquaredIndex zip s).toDF("index","sample") // create RDD and convert to dataframe
display(sDF.select("sample"))


val x = new DenseMatrix(n,n,s) // nxn exponential random matrix
val y = upperTriangular(x) + upperTriangular(x).t - diag(diag(x)) // symmetric exponential random matrix
val z = (y - mean(y)) / sqrt(variance(y)) // symmetric random matrix standardized to have mean 0 and variance 1
val zEigval = eigSym.justEigenvalues(z/sqrt(n)).toArray // eigenvalues of scaled standardized symmetric random matrix
val zEigvalDF = sc.parallelize(nIndex zip zEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(zEigvalDF.select("eigenvalue"))

It still works!!! The underlying density can be non-symmetric.

Exponential example: Density histogram with semi-circle density.

Does it work when the underlying distribution is discrete?

Poisson Example


// Poisson example
val poissonMean = 5
val t = new Poisson(poissonMean).sample(nSquared)
val s = t.map(ti => ti.toDouble).toArray
val sDF = sc.parallelize(nSquaredIndex zip s).toDF("index","sample") // create RDD and convert to dataframe
display(sDF.select("sample"))


val x = new DenseMatrix(n,n,s) // nxn Poisson random matrix
val y = upperTriangular(x) + upperTriangular(x).t - diag(diag(x)) // symmetric Poisson random matrix
val z = (y - mean(y)) / sqrt(variance(y)) // symmetric random matrix standardized to have mean 0 and variance 1
val zEigval = eigSym.justEigenvalues(z/sqrt(n)).toArray // eigenvalues of scaled standardized symmetric random matrix
val zEigvalDF = sc.parallelize(nIndex zip zEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(zEigvalDF.select("eigenvalue"))

It still works!!! The underlying distribution can be discrete.

Does it still work when the matrix entries are independent but not identically distributed?

Example: Gaussians with different means and variances


val s = new DenseMatrix(n,n,new Gaussian(gaussianMean,gaussianSD).sample(nSquared).toArray) // nxn standard Gaussian random matrix
val t = new DenseMatrix(n,n,new Poisson(poissonMean).sample(nSquared).map(t => t.toDouble + 1.0).toArray) // nxn matrix of positive Poisson variates
val x = (s :* t) + t // entry (i,j) is Gaussian with mean t(i,j) and standard deviation t(i,j)
val y = upperTriangular(x) + upperTriangular(x).t - diag(diag(x)) // symmetric random matrix with independent, non-identically distributed entries
val z = (y - mean(y)) / sqrt(variance(y)) // symmetric random matrix standardized to have mean 0 and variance 1
val zEigval = eigSym.justEigenvalues(z/sqrt(n)).toArray // eigenvalues of scaled standardized symmetric random matrix
val zEigvalDF = sc.parallelize(nIndex zip zEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(zEigvalDF.select("eigenvalue"))

It still works!!! The independent matrix entries do not have to be identically distributed.

What happens when the matrix contains non-random structure?

Example: Exponential random matrix with strongly connected block


// Exponential example
val s = new Exponential(exponentialRate).sample(nSquared).toArray
val x = new DenseMatrix(n,n,s) // nxn exponential random matrix
var y = upperTriangular(x) + upperTriangular(x).t - diag(diag(x)) // symmetric exponential random matrix
y = y :/ max(y) // normalize largest entry to 1.0
y(0 to 99,0 to 99) := 0.7 // first 100 nodes form strongly connected block


val z = (y - mean(y)) / sqrt(variance(y)) // symmetric random matrix standardized to have mean 0 and variance 1
val zEigval = eigSym.justEigenvalues(z/sqrt(n)).toArray // eigenvalues of scaled standardized symmetric random matrix
val zEigvalDF = sc.parallelize(nIndex zip zEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(zEigvalDF.select("eigenvalue"))

The eigenvalues do not have a semi-circle density.

Separating “signal” from “noise”

Let \(Y\) be a real symmetric \(n\times n\) matrix with eigenvalues, \(\lambda_{1},...,\lambda_{n}\), and corresponding eigenvectors, \(u_{1},...,u_{n}\). A decomposition of \(Y\) into a sum of rank-one matrices is given by \(Y = \sum_{k=1}^{n} \lambda_{k} u_{k} u_{k}^{T}.\)

\(Y\) can decomposed into “signal” plus “noise” by partitioning its component rank-one matrices into signal components and noise components: \(Y = Y_{signal} + Y_{noise} = \sum_{i \in S} \lambda_{i} u_{i} u_{i}^{T} + \sum_{j\in N} \lambda_{j} u_{j} u_{j}^{T},\)

where \(S\) is an index set of signal components and \(N\) is an index set of noise components (\(S\cup N = \{1,\ldots,n\}\), \(S\cap N=\emptyset\)).
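
As a quick sanity check of this decomposition, here is a minimal sketch (using Breeze; the small symmetric matrix is purely illustrative) that rebuilds a matrix from its rank-one components:

import breeze.linalg._

val m = DenseMatrix((2.0, 1.0, 0.0),
                    (1.0, 3.0, 1.0),
                    (0.0, 1.0, 2.0)) // small symmetric test matrix
val es = eigSym(m) // eigenvalues (ascending) and eigenvectors
val rebuilt = (0 until m.rows).map(k => es.eigenvalues(k) * (es.eigenvectors(::,k) * es.eigenvectors(::,k).t)).reduce(_ + _) // sum of rank-one matrices
val maxAbsError = max(abs(m - rebuilt)) // should be near machine precision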


val yEig = eigSym(y)
val yEigval = yEig.eigenvalues
val yEigvec = yEig.eigenvectors
val yEigvalDF = sc.parallelize(nIndex zip yEigval.toArray).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(yEigvalDF.select("eigenvalue"))

\(S_{1}= \{1000\}\) and \(N_{1}= \{1,\ldots,999\}\)


val yNoise = y - (yEigval(-1) * (yEigvec(::,-1) * yEigvec(::,-1).t))
val z = (yNoise - mean(yNoise)) / sqrt(variance(yNoise)) // symmetric random matrix standardized to have mean 0 and variance 1
val zEigval1 = eigSym.justEigenvalues(z/sqrt(n)) // eigenvalues of scaled standardized symmetric random matrix
val zEigval = zEigval1.toArray
val zEigvalDF = sc.parallelize(nIndex zip zEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(zEigvalDF.select("eigenvalue"))

Signal and noise are not well separated.

\(S_{2}=\{999,1000\}\) and \(N_{2}= \{1,...,998\}\)


val yNoise = y - (yEigval(-1) * (yEigvec(::,-1) * yEigvec(::,-1).t)) - (yEigval(-2) * (yEigvec(::,-2) * yEigvec(::,-2).t))
val z = (yNoise - mean(yNoise)) / sqrt(variance(yNoise)) // symmetric random matrix standardized to have mean 0 and variance 1
val zEigval2 = eigSym.justEigenvalues(z/sqrt(n)) // eigenvalues of scaled standardized symmetric random matrix
val zEigval = zEigval2.toArray
val zEigvalDF = sc.parallelize(nIndex zip zEigval).toDF("index","eigenvalue") // create RDD and convert to dataframe
display(zEigvalDF.select("eigenvalue"))

Signal and noise are adequately separated.

Semi-Circle Cumulative Distribution Function


def semiCircleCDF(s:DenseVector[Double]):DenseVector[Double]={
  val n = s.length
  // CDF of the semi-circle density: F(s) = 1/2 + s*sqrt(4-s^2)/(4*Pi) + asin(s/2)/Pi
  val cdf = (0.5 * (s :* sqrt(4.0 - (s :* s))) + 2.0 * (asin(0.5 * s) - asin(-1.0))) / (2.0 * Pi)
  for(i <- 0 until n){ // clamp values outside the support [-2,2] (the NaNs from sqrt are overwritten)
    if(s(i) < -2.0)
     cdf(i) = 0.0
    if(s(i) > 2.0)
     cdf(i) = 1.0
  }
  cdf
}
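
As a quick check of semiCircleCDF, its values at the edges and centre of the support should be 0, 0.5 and 1:

val cdfCheck = semiCircleCDF(DenseVector(-2.0, 0.0, 2.0)) // expect approximately (0.0, 0.5, 1.0)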

Kolmogorov-Smirnov Test


def KolmogorovSmirnovTest(scdf:DenseVector[Double]):Double={
  // Assumes scdf holds the theoretical CDF evaluated at the *sorted* sample values
  // (eigSym returns eigenvalues in ascending order, so this holds here).
  val m = scdf.length
  val n = m.toDouble
  val indexInt = DenseVector.range(1,m+1,1)
  val indexDou = indexInt.map(i => i.toDouble)
  val index1 = indexDou / n // empirical CDF just after each sample point
  val index2 = (indexDou - 1.0) / n // empirical CDF just before each sample point
  val diff1 = max(abs(scdf - index1))
  val diff2 = max(abs(scdf - index2))
  val ksStat = max(diff1,diff2) // Kolmogorov-Smirnov statistic
  val pValue = min(1.0,2.0 * exp(-2.0 * n * ksStat * ksStat)) // asymptotic two-sided p-value (upper bound)
  pValue
}

Test Noise 1


val zcdf1 = semiCircleCDF(zEigval1)
val pValue1 = KolmogorovSmirnovTest(zcdf1)

The p-value suggests that Noise 1 is not a pure random matrix: some signal remains.

Test Noise 2


val zcdf2 = semiCircleCDF(zEigval2)
val pValue2 = KolmogorovSmirnovTest(zcdf2)

The p-value suggests that Noise 2 is consistent with being a pure random matrix.

Extension to Singular Values: Quarter-Circle Law

The quarter-circle law states that for a real \(n\times n\) random matrix, \(X_{n}\), whose entries are independent random variables with mean 0, variance 1, and bounded moments, the distribution of the singular values of \(\frac{1}{\sqrt{n}}X_{n}\) approaches a quarter-circle density on the support [0,2] as \(n\to\infty\).
Denoting a singular value of \(\frac{1}{\sqrt{n}}X_{n}\) by \(\sigma\), the asymptotic quarter-circle density is \(f(\sigma)=\frac{1}{\pi}\sqrt{4-\sigma^{2}}.\)
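
The same experiment as above can be run with singular values in place of eigenvalues. A minimal sketch, using Breeze's svd and the imports from earlier cells (the size nQC is purely illustrative):

val nQC = 500 // illustrative size
val gQC = new DenseMatrix(nQC,nQC,new Gaussian(0.0,1.0).sample(nQC*nQC).toArray) // nQC x nQC Gaussian random matrix
val singVals = svd(gQC / sqrt(nQC.toDouble)).singularValues // should approximately follow the quarter-circle density on [0,2]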

Extension to Rectangular Matrices: The Marchenko-Pastur Theorem

The Marchenko-Pastur theorem states that for a real \(m\times n\) random matrix, \(X_{n}\), with \(m\geq n\) satisfying \(\frac{n}{m}\to r\leq 1\) as \(n\to\infty\), and whose entries are independent random variables with mean 0, variance 1, and bounded moments, the asymptotic density, as \(n\to\infty\), of the singular values of \(\frac{1}{\sqrt{m}}X_{n}\) is \(f(\sigma)=\frac{1}{\pi \sigma r}\sqrt{[\sigma^{2}-(1-\sqrt{r})^{2}][(1+\sqrt{r})^{2}-\sigma^{2}]}\) on the interval \([1-\sqrt{r},1+\sqrt{r}]\).
Note that when \(m=n\) (so that \(r=1\)), the Marchenko-Pastur theorem reduces to the quarter-circle law.
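
A corresponding sketch for the rectangular case (the sizes mRows and nCols are purely illustrative, giving \(r=0.25\)):

val mRows = 800
val nCols = 200 // r = nCols/mRows = 0.25
val rectX = new DenseMatrix(mRows,nCols,new Gaussian(0.0,1.0).sample(mRows*nCols).toArray) // mRows x nCols Gaussian random matrix
val rectSingVals = svd(rectX / sqrt(mRows.toDouble)).singularValues // support approximately [1-sqrt(0.25), 1+sqrt(0.25)] = [0.5, 1.5]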

