A.23  Vector prediction exploiting spatial correlation

Instead of exploiting correlation over time, this section discusses methods that exploit the so-called spatial correlation: an element of a random vector is estimated from the other elements of the same vector. As signal model, the Gaussian block or “packet” ISI channel of [CF97] (page 84, Eq. 4.6) is adopted here, which is given by

Y = HX + N,
(A.103)

where X is a zero-mean complex random input m-vector, Y is a zero-mean complex random output n-vector, N is a complex zero-mean Gaussian noise n-vector independent of X, and H is the complex n × m channel matrix [CF97]. Since these random vectors are zero-mean, their correlation matrices coincide with their covariance matrices.

The output covariance matrix is given by

Ryy = H Rxx H^H + Rnn,
(A.104)

where the superscript H denotes the Hermitian (conjugate transpose) operation.
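
As a quick numerical illustration of Eqs. (A.103) and (A.104), the sketch below generates realizations of the block channel model and compares the sample covariance of Y with the theoretical expression. The dimensions, H, Rxx and Rnn are arbitrary (hypothetical) choices adopted here for illustration, not values from [CF97].

% Minimal numerical check of Eqs. (A.103) and (A.104); all values below are hypothetical.
m = 2; n = 3; numRealizations = 1e5;         % arbitrary dimensions and sample size
H = randn(n,m) + 1j*randn(n,m);              % arbitrary complex n x m channel matrix
Ax = randn(m) + 1j*randn(m); Rxx = Ax*Ax';   % arbitrary valid input covariance
An = randn(n) + 1j*randn(n); Rnn = An*An';   % arbitrary valid noise covariance
X = sqrtm(Rxx)*(randn(m,numRealizations)+1j*randn(m,numRealizations))/sqrt(2); % input with covariance Rxx
N = sqrtm(Rnn)*(randn(n,numRealizations)+1j*randn(n,numRealizations))/sqrt(2); % noise with covariance Rnn
Y = H*X + N;                                 % block channel model, Eq. (A.103)
RyyTheory = H*Rxx*H' + Rnn                   % Eq. (A.104); ' is the Hermitian in Matlab/Octave
RyyEstimate = (Y*Y')/numRealizations         % sample covariance, approaches RyyTheory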

A characteristic representation of a random m-vector X is given by the linear combination of the columns of a matrix F, whose determinant is equal to one, weighted by a vector V of uncorrelated random variables, that is:

X = FV.
(A.105)

Hence, the covariance matrix of X is given by:

Rxx = F Rvv F^H,
(A.106)

where Rvv is diagonal (the random variables in V are uncorrelated).

There are two alternatives of interest for representing a vector in its characteristic form: the modal and the innovations representations. The first is derived from the eigendecomposition of Rxx. Given the factorization Rxx = U Λx^2 U^H = (U Λx)(U Λx)^H, by comparison with Eq. (A.106), F corresponds to the unitary matrix U from the eigendecomposition, while the uncorrelated vector V from Eq. (A.105) corresponds to U^{-1} X. The latter, the innovations representation, is derived from the Cholesky decomposition. In a similar manner, given the factorization Rxx = L Dx^2 L^H = (L Dx)(L Dx)^H, F corresponds to the lower triangular matrix L, while V corresponds to L^{-1} X.

The important conclusion yielded by these two representations is that a vector X whose random variables are correlated can be whitened by a forward section given by U^{-1}, the inverse of the unitary matrix from the eigendecomposition of its covariance matrix, or by L^{-1}, the inverse of the lower triangular matrix from the Cholesky decomposition of its covariance matrix.

The innovations representation is a natural adaptation of linear prediction over time and is obtained with a Cholesky factorization of Ryy, while the modal representation can be obtained via eigenanalysis or SVD.
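
The two whitening alternatives can be visualized with a short script. The sketch below uses a hypothetical covariance matrix (the same matrix later adopted in Listing A.15); since the book's ldl_dg helper is not reproduced here, the monic factor L is obtained by normalizing the lower Cholesky factor returned by chol, which is an equivalent construction assumed here, not the book's routine.

Ryy = [16 8 4; 8 20 10; 4 10 21];   % hypothetical covariance (same matrix as Listing A.15)
[U, Lambda2] = eig(Ryy);            % modal representation: Ryy = U*Lambda2*U'
G = chol(Ryy, 'lower');             % lower Cholesky factor: Ryy = G*G'
L = G*diag(1./diag(G));             % normalize columns to get a monic (unit-diagonal) L
Dy2 = diag(diag(G).^2);             % so that Ryy = L*Dy2*L'
covModal = U'*Ryy*U                 % covariance after the forward section U^{-1} (= U')
covInnovations = inv(L)*Ryy*inv(L)' % covariance after the forward section L^{-1}
% Both results are diagonal: covModal equals Lambda2 and covInnovations equals Dy2.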

The optimum MMSE linear predictor in this scenario is

Ỹ = P Y,
(A.107)

where the predictor matrix P is given by

P = I - L^{-1},
(A.108)

with L being obtained from the innovations representation Ryy = L Dy^2 L^H and I being the identity matrix. It is assumed that Ryy is nonsingular; otherwise, the pseudoinverse can be used.

Because L is lower triangular and monic, its inverse is also lower triangular and monic. Subtracting L^{-1} from I therefore makes P strictly lower triangular, with zeros on the main diagonal. This structure imposes a causal relation among the elements of Y, such that Ỹ can be obtained recursively, as sketched below.
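
The recursion is illustrated by the minimal sketch below, which assumes a hypothetical Ryy (the same matrix later used in Listing A.15) and an arbitrary realization y; the monic factor is again built from chol rather than the book's ldl_dg.

Ryy = [16 8 4; 8 20 10; 4 10 21];              % hypothetical covariance matrix
G = chol(Ryy,'lower'); L = G*diag(1./diag(G)); % monic lower triangular factor of Ryy
P = eye(size(L)) - inv(L);                     % optimum MMSE predictor, Eq. (A.108)
y = [2; -1; 3];                                % arbitrary realization of Y
tilde_y = zeros(size(y));
for i = 1:length(y)                            % causal recursion: element i uses only y(1),...,y(i-1)
    tilde_y(i) = P(i,1:i-1)*y(1:i-1);          % empty product for i=1, hence tilde_y(1) = 0
end
tilde_y, P*y                                   % the recursion matches the matrix product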

The error vector is

E = Y - Ỹ = Y - P Y = Y - (I - L^{-1}) Y = L^{-1} Y.
(A.109)

In general, the total mean-squared prediction error (summed over the vector elements) is

𝔼[||E||^2] = 𝔼[||Y - Ỹ||^2] = trace{Ree},
(A.110)

where Ree = 𝔼[E E^H] is the autocorrelation matrix of E. It can be proved (see, e.g., [BLM04]) that when the optimum linear predictor of Eq. (A.108) is adopted, the error power trace{Ree} achieves its minimum value, given by trace{Dy^2}. This avoids the step of estimating Ree to obtain the prediction gain, which is given by

prediction gain = 10 log10 ( trace{Ryy} / trace{Ree} ) = 10 log10 ( trace{Ryy} / trace{Dy^2} ) dB.
(A.111)
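
The sketch below is a Monte Carlo check, under the assumption of Gaussian Y and a hypothetical Ryy (the matrix later used in Listing A.15), that the optimum predictor indeed attains trace{Ree} = trace{Dy^2}, and it evaluates Eq. (A.111).

Ryy = [16 8 4; 8 20 10; 4 10 21];       % hypothetical covariance matrix
G = chol(Ryy,'lower'); L = G*diag(1./diag(G)); Dy2 = diag(diag(G).^2); % Ryy = L*Dy2*L'
P = eye(3) - inv(L);                    % optimum MMSE predictor, Eq. (A.108)
numRealizations = 1e5;
Y = sqrtm(Ryy)*randn(3,numRealizations); % zero-mean realizations with covariance Ryy
E = Y - P*Y;                            % prediction error, E = inv(L)*Y
measuredMSE = trace((E*E')/numRealizations) % approaches trace(Dy2) = 48
theoreticalMSE = trace(Dy2)
predictionGain = 10*log10(trace(Ryy)/theoreticalMSE) % about 0.746 dB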

Hence, making an analogy with prediction over time, whose analysis/synthesis cascade is repeated here for convenience:

X[n] → Mx^{-1}(z) → I[n] → Mx(z) → X[n],

the spatial prediction allows one to obtain

Y → L^{-1} → E → L → Y,

which is expressed in matrix notation as E = L^{-1} Y and Y = L E.
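
This cascade can be verified with a one-line check, sketched below for an arbitrary (hypothetical) monic lower triangular L and realization y.

L = [1 0 0; 0.5 1 0; 0.25 0.5 1];  % arbitrary monic lower triangular factor
y = [2; -1; 3];                    % arbitrary realization of Y
e = L\y                            % forward section: E = inv(L)*Y (innovations)
yRecovered = L*e                   % synthesis section: recovers y exactly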

Listing A.15 illustrates an example discussed in [BLM04].

Listing A.15: MatlabOctaveCodeSnippets/snip_appprediction_spatialLinearPredictionExample.m
%Example 10-11 from Barry, 2004 (note a typo in matrix R in the book)
Ryy=[16 8 4; 8 20 10; 4 10 21] %noise autocorrelation, correlated
%Ryy=[10, 8, 2; 8 10 10; 2 10 10]; %another option, higher gain
[L D] = ldl_dg(Ryy) %own LDL, do not use chol(A) because it swaps rows
Ryy-L*D*L' %should be (close to) a matrix of zeros, since L*D*L' equals Ryy
P=eye(size(L))-inv(L) %optimum MMSE linear predictor, Eq. (A.108)
minMSE=trace(D) %minimum MSE: trace{Ree} = trace{Dy^2} = trace(D)
sumPowerX=trace(Ryy); %total power of Y (sum of all "users" powers)
predictionGain = 10*log10(sumPowerX/minMSE) %Eq. (A.111), in dB

In Listing A.15, the predictor matrix obtained with the original Ryy is

P =     [0,         0,         0;
    0.5000,         0,         0;
         0,    0.5000,         0]

and the prediction gain is 0.7463 dB. Adopting a new correlation matrix Ryy=[10, 8, 2; 8 10 10; 2 10 10] leads to

P =     [0,         0,         0;
    0.8000,         0,         0;
   -1.6667,    2.3333,         0]

and a prediction gain of 9.2082 dB.

Note that the first element ỹ1 of Ỹ = [ỹ1, …, ỹn]^T in Ỹ = P Y is always zero due to the structure of P. Then, the second element ỹ2 is a scaled version of the first element y1 of Y, and so on.
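
To make this concrete, the short sketch below applies the first predictor matrix shown above to an arbitrary (hypothetical) observation y.

P = [0 0 0; 0.5 0 0; 0 0.5 0];  % predictor obtained from the original Ryy of Listing A.15
y = [4; 2; 6];                  % arbitrary realization of Y
tilde_y = P*y                   % = [0; 2; 1]: tilde_y(1) is always 0,
                                % tilde_y(2) = 0.5*y(1), and tilde_y(3) = 0.5*y(2)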