A.22  Advanced: Vector Prediction Exploring Spatial Correlation

Instead of exploring correlation over time, this section discusses methods that explore the so-called spatial correlation: an element of a random vector is estimated based on the other elements of this vector. As the signal model, the Gaussian block or “packet” ISI channel of [CF97] (page 84, Eq. 4.6) is adopted here, which is given by

Y = HX + N,
(A.101)

where X is a zero-mean complex random input m-vector, Y is a zero-mean complex random output n-vector, N is a complex zero-mean Gaussian noise n-vector independent of X, and H is the complex n × m channel matrix [CF97]. For these random vectors, the correlation matrices coincide with the covariance matrices.

The output covariance matrix is given by

Ryy = HRxxHᴴ + Rnn,
(A.102)

where the superscript ᴴ denotes the Hermitian (conjugate transpose).
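
The relation in Eq. (A.102) can be checked numerically. The sketch below, with arbitrarily chosen dimensions, channel matrix and (diagonal) covariances, draws many realizations of X and N, forms Y according to Eq. (A.101), and compares the sample covariance of Y with HRxxHᴴ + Rnn:

m=3; n=4; numRealizations=1e5; %arbitrary dimensions for illustration
H = randn(n,m) + 1j*randn(n,m); %some complex channel matrix
Rxx = 2*eye(m); Rnn = 0.5*eye(n); %assumed input and noise covariances (diagonal)
X = sqrt(Rxx)*(randn(m,numRealizations)+1j*randn(m,numRealizations))/sqrt(2);
N = sqrt(Rnn)*(randn(n,numRealizations)+1j*randn(n,numRealizations))/sqrt(2);
Y = H*X + N; %Eq. (A.101)
RyyTheory = H*Rxx*H' + Rnn %Eq. (A.102)
RyyEstimated = (Y*Y')/numRealizations %sample covariance, approaches RyyTheory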

A characteristic representation of a random m-vector X is given by the linear combination of the columns of a matrix F, whose determinant is equal to one, weighted by a vector of uncorrelated random variables V, that is:

X = FV.
(A.103)

Hence, the covariance matrix of X is given by:

Rxx = FRvvFᴴ,
(A.104)

where Rvv is diagonal (the random variables in V are uncorrelated).
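
As a minimal sketch of Eqs. (A.103) and (A.104), assume an arbitrary monic matrix F and a diagonal Rvv; generating realizations of X = FV confirms that the sample covariance of X approaches FRvvFᴴ:

F = [1 0 0; 0.5 1 0; 0.25 0.5 1]; %arbitrary matrix with determinant equal to one
Rvv = diag([16 16 16]); %V has uncorrelated elements: diagonal covariance
numRealizations = 1e5;
V = sqrt(Rvv)*randn(3,numRealizations); %realizations of V
X = F*V; %Eq. (A.103)
RxxTheory = F*Rvv*F' %Eq. (A.104)
RxxEstimated = (X*X')/numRealizations %sample covariance, approaches RxxTheory

Incidentally, this particular choice of F and Rvv reproduces the matrix Ryy = [16 8 4; 8 20 10; 4 10 21] used later in Listing A.15.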

There are two alternatives of interest for representing a vector in its characteristic form: the modal and the innovations representation. The first is derived from the eigendecomposition of Rxx. Given the factorization Rxx = UΛx²Uᴴ = (UΛx)(UΛx)ᴴ, by comparison with Eq. (A.104), F corresponds to the unitary matrix U from the eigendecomposition, while the uncorrelated vector V from Eq. (A.103) corresponds to U−1X. Meanwhile, the latter (innovations representation) is derived from the Cholesky decomposition. In a similar manner, given the factorization Rxx = LDx²Lᴴ = (LDx)(LDx)ᴴ, F corresponds to the lower triangular matrix L, while V corresponds to L−1X.

The important conclusion yielded by these two representations is that a vector X whose random variables are correlated can be whitened by a forward section given by U−1, the inverse of the unitary matrix from the eigendecomposition of its covariance matrix, or L−1, the inverse of the lower triangular matrix from the Cholesky decomposition of its covariance matrix.

The innovations representation is a natural adaptation of linear prediction over time and is obtained with a Cholesky factorization of Ryy, while the modal representation can be obtained via eigenanalysis or SVD.
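
Both whitening alternatives can be illustrated with the short sketch below. It assumes an example (real-valued) covariance matrix and obtains the monic lower triangular factor from the built-in chol, instead of the ldl_dg helper used in Listing A.15; the outputs of the forward sections U−1 = Uᴴ and L−1 have approximately diagonal sample covariances:

Rxx = [16 8 4; 8 20 10; 4 10 21]; %assumed (example) covariance matrix
[U, Lambda2] = eig(Rxx); %modal: Rxx = U*Lambda2*U', with U unitary
R = chol(Rxx); L = R'/diag(diag(R')); D = diag(diag(R').^2); %innovations: Rxx = L*D*L'
numRealizations = 1e5;
X = sqrtm(Rxx)*randn(3,numRealizations); %correlated realizations of X
Vmodal = U'*X; %forward section U^{-1} = U' (unitary)
Vinnov = L\X;  %forward section L^{-1}
RvModal = (Vmodal*Vmodal')/numRealizations %approximately diagonal (Lambda2)
RvInnov = (Vinnov*Vinnov')/numRealizations %approximately diagonal (D)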

The optimum MMSE linear predictor in this scenario is

Y~ = PY,
(A.105)

where the predictor matrix P is given by

P = I − L−1,
(A.106)

with L obtained from the innovations representation Ryy = LDy²Lᴴ and I being the identity matrix. It was assumed that Ryy is nonsingular; otherwise, the pseudoinverse can be used.

Because L is lower triangular and monic, its inverse is also lower triangular and monic. The subtraction of L−1 from I makes P lower triangular with zeros along the main diagonal. This structure imposes a causal relation among the elements of Y, such that Y~ can be obtained recursively.
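
A minimal sketch of this recursion, assuming the same example covariance matrix as before and an arbitrary realization y, shows that each element of Y~ depends only on previously available elements of Y and coincides with the matrix product PY:

Ryy = [16 8 4; 8 20 10; 4 10 21]; %assumed covariance of Y
R = chol(Ryy); L = R'/diag(diag(R')); %monic lower triangular factor of Ryy
P = eye(3) - inv(L); %Eq. (A.106): strictly lower triangular predictor
y = [4; 2; 6]; %one arbitrary realization of Y
ytilde = zeros(3,1); %ytilde(1) is always zero
for i=2:3
    ytilde(i) = P(i,1:i-1)*y(1:i-1); %uses only already available elements of y
end
ytilde
P*y %matrix form yields the same result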

The error vector is

E = Y − Y~ = Y − PY = Y − (I − L−1)Y = L−1Y.
(A.107)

In general, the sum mean-squared prediction error is

𝔼[||E||²] = 𝔼[||Y − Y~||²] = trace{Ree},
(A.108)

where Ree = 𝔼[EEᴴ] is the autocorrelation matrix of E. It can be proved (see, e.g., [BLM04]) that when the optimum linear predictor of Eq. (A.106) is adopted, the error power trace{Ree} achieves its minimum value, given by trace{Dy²}. This avoids the step of estimating Ree to obtain the prediction gain, which is given by

prediction gain = 10 log10 (trace{Ryy}/trace{Ree}) = 10 log10 (trace{Ryy}/trace{Dy²}) dB.
(A.109)
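
In fact, since E = L−1Y, the error covariance is Ree = L−1Ryy(L−1)ᴴ = Dy², which is why the minimum error power and the prediction gain follow directly from the factorization. A quick numerical check, reusing the matrix Ryy adopted later in Listing A.15, is sketched below:

Ryy = [16 8 4; 8 20 10; 4 10 21]; %same matrix used in Listing A.15
R = chol(Ryy); L = R'/diag(diag(R')); %monic lower triangular factor of Ryy
Ree = inv(L)*Ryy*inv(L)' %equals the diagonal matrix Dy^2 (up to round-off)
predictionGain = 10*log10(trace(Ryy)/trace(Ree)) %0.7463 dB, as in Eq. (A.109)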

Hence, making an analogy with prediction over time, repeated here for convenience:

X[n] → Mx−1(z) → I[n] → Mx(z) → X[n]

the spatial prediction allows one to obtain

Y → L−1 → E → L → Y,

which is expressed in matrix notation as E = L−1Y and Y = LE.
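
A minimal numerical sketch of this analysis/synthesis pair, assuming the same example Ryy and one random realization of Y, is:

Ryy = [16 8 4; 8 20 10; 4 10 21]; %assumed covariance of Y
R = chol(Ryy); L = R'/diag(diag(R')); %monic lower triangular factor of Ryy
y = sqrtm(Ryy)*randn(3,1); %one realization of Y
e = L\y; %forward (whitening) section: E = L^{-1} Y
yRecovered = L*e %inverse section: Y = L E recovers y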

Listing A.15 illustrates an example discussed in [BLM04].

Listing A.15: MatlabOctaveCodeSnippets/snip_appprediction_spatialLinearPredictionExample.m
%Example 10-11 from Barry, 2004 (note a typo in matrix R in the book)
Ryy=[16 8 4; 8 20 10; 4 10 21] %noise autocorrelation, correlated
%Ryy=[10, 8, 2; 8 10 10; 2 10 10]; %another option, higher gain
[L D] = ldl_dg(Ryy) %own LDL, do not use chol(A) because it swaps rows
Ryy-L*D*L' %check: L*D*L' should equal Ryy, so this difference should be ~zero
P=eye(size(L))-inv(L) %optimum MMSE linear predictor
minMSE=trace(D) %minimum MSE is trace{Ree} = trace{D}
sumPowerX=trace(Ryy); %sum of the powers of all "users" (elements of Y)
predictionGain = 10*log10(sumPowerX/minMSE)

In Listing A.15, the original predictor matrix is

     P =     [0,         0,         0;
         0.5000,         0,         0;
              0,    0.5000,         0]

and the prediction gain is 0.7463 dB. Adopting a new correlation matrix Ryy=[10, 8, 2; 8 10 10; 2 10 10] leads to

     P =     [0,         0,         0;
         0.8000,         0,         0;
        -1.6667,    2.3333,         0]

and a prediction gain of 9.2082 dB.

Note that the first element y~1 of Y~ = [y~1,…,y~n]ᵀ in Y~ = PY is always zero due to the structure of P. Then, the second element y~2 is a scaled version of the first element y1 of Y, and so on.
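
For instance, taking the second predictor matrix above and an arbitrary realization y, this structure can be spelled out as:

P = [0 0 0; 0.8 0 0; -1.6667 2.3333 0]; %second predictor matrix above
y = [1; 2; 3]; %arbitrary realization of Y
ytilde = P*y %[0; 0.8*y(1); -1.6667*y(1)+2.3333*y(2)], i.e., approximately [0; 0.8; 3.0]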