A.23  Vector prediction exploiting spatial correlation

Instead of exploiting correlation over time, this section discusses methods that exploit the so-called spatial correlation: an element of a random vector is estimated from the other elements of the same vector. As signal model, the Gaussian block or “packet” ISI channel of [CF97] (page 84, Eq. 4.6) is adopted here, which is given by

Y = HX + N,
(A.103)

where X is a zero-mean complex random input m-vector, Y is a zero-mean complex random output n-vector, N is a complex zero-mean Gaussian noise n-vector independent of X, and H is the complex n × m channel matrix [CF97]. Since these random vectors are zero-mean, their correlation matrices coincide with their covariance matrices.

The output covariance matrix is given by

Ryy = H Rxx H^H + Rnn,
(A.104)

where the superscript H denotes the Hermitian (conjugate transpose) operation.
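
As a quick numerical illustration of Eqs. (A.103) and (A.104), the sketch below generates realizations of the block channel model and compares the sample covariance of Y with the theoretical expression. The dimensions, H, Rxx and Rnn are arbitrary (hypothetical) choices adopted here for illustration, not values from [CF97].

% Minimal numerical check of Eqs. (A.103) and (A.104); all values below are hypothetical.
m = 2; n = 3; numRealizations = 1e5;         % arbitrary dimensions and sample size
H = randn(n,m) + 1j*randn(n,m);              % arbitrary complex n x m channel matrix
Ax = randn(m) + 1j*randn(m); Rxx = Ax*Ax';   % arbitrary valid input covariance
An = randn(n) + 1j*randn(n); Rnn = An*An';   % arbitrary valid noise covariance
X = sqrtm(Rxx)*(randn(m,numRealizations)+1j*randn(m,numRealizations))/sqrt(2); % input with covariance Rxx
N = sqrtm(Rnn)*(randn(n,numRealizations)+1j*randn(n,numRealizations))/sqrt(2); % noise with covariance Rnn
Y = H*X + N;                                 % block channel model, Eq. (A.103)
RyyTheory = H*Rxx*H' + Rnn                   % Eq. (A.104); ' is the Hermitian in Matlab/Octave
RyyEstimate = (Y*Y')/numRealizations         % sample covariance, approaches RyyTheory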

A characteristic representation of a random m-vector X is given by the linear combination of the columns of a matrix F, whose determinant is equal to one, weighted by a vector V of uncorrelated random variables, that is:

X = FV.
(A.105)

Hence, the covariance matrix of X is given by:

Rxx = F Rvv F^H,
(A.106)

where Rvv is diagonal (the random variables in V are uncorrelated).

There are two alternatives of interest for representing a vector in its characteristic form: the modal and the innovations representations. The first is derived from the eigendecomposition of Rxx. Given the factorization Rxx = U Λx^2 U^H = (U Λx)(U Λx)^H, by comparison with Eq. (A.106), F corresponds to the unitary matrix U from the eigendecomposition, while the uncorrelated vector V from Eq. (A.105) corresponds to U^{-1} X. The latter, the innovations representation, is derived from the Cholesky decomposition. In a similar manner, given the factorization Rxx = L Dx^2 L^H = (L Dx)(L Dx)^H, F corresponds to the lower triangular matrix L, while V corresponds to L^{-1} X.

The important conclusion yielded by these two representations is that a vector X whose random variables are correlated can be whitened by a forward section given by U^{-1}, the inverse of the unitary matrix from the eigendecomposition of its covariance matrix, or by L^{-1}, the inverse of the lower triangular matrix from the Cholesky decomposition of its covariance matrix.

The innovations representation is a natural adaptation of linear prediction over time and is obtained with a Cholesky factorization of Ryy, while the modal representation can be obtained via eigenanalysis or SVD.
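
The two whitening alternatives can be visualized with a short script. The sketch below uses a hypothetical covariance matrix (the same matrix later adopted in Listing A.15); since the book's ldl_dg helper is not reproduced here, the monic factor L is obtained by normalizing the lower Cholesky factor returned by chol, which is an equivalent construction assumed here, not the book's routine.

Ryy = [16 8 4; 8 20 10; 4 10 21];   % hypothetical covariance (same matrix as Listing A.15)
[U, Lambda2] = eig(Ryy);            % modal representation: Ryy = U*Lambda2*U'
G = chol(Ryy, 'lower');             % lower Cholesky factor: Ryy = G*G'
L = G*diag(1./diag(G));             % normalize columns to get a monic (unit-diagonal) L
Dy2 = diag(diag(G).^2);             % so that Ryy = L*Dy2*L'
covModal = U'*Ryy*U                 % covariance after the forward section U^{-1} (= U')
covInnovations = inv(L)*Ryy*inv(L)' % covariance after the forward section L^{-1}
% Both results are diagonal: covModal equals Lambda2 and covInnovations equals Dy2.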

The optimum MMSE linear predictor in this scenario is

Ỹ = P Y,
(A.107)

where the predictor matrix P is given by

P = I - L^{-1},
(A.108)

with L being obtained from the innovations representation Ryy = L Dy^2 L^H and I being the identity matrix. It is assumed that Ryy is nonsingular; otherwise, the pseudoinverse can be used.

Because L is lower triangular and monic, its inverse is also lower triangular and monic. Subtracting L^{-1} from I therefore makes P strictly lower triangular, with zeros on the main diagonal. This structure imposes a causal relation among the elements of Y, such that Ỹ can be obtained recursively, as sketched below.
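
The recursion is illustrated by the minimal sketch below, which assumes a hypothetical Ryy (the same matrix later used in Listing A.15) and an arbitrary realization y; the monic factor is again built from chol rather than the book's ldl_dg.

Ryy = [16 8 4; 8 20 10; 4 10 21];              % hypothetical covariance matrix
G = chol(Ryy,'lower'); L = G*diag(1./diag(G)); % monic lower triangular factor of Ryy
P = eye(size(L)) - inv(L);                     % optimum MMSE predictor, Eq. (A.108)
y = [2; -1; 3];                                % arbitrary realization of Y
tilde_y = zeros(size(y));
for i = 1:length(y)                            % causal recursion: element i uses only y(1),...,y(i-1)
    tilde_y(i) = P(i,1:i-1)*y(1:i-1);          % empty product for i=1, hence tilde_y(1) = 0
end
tilde_y, P*y                                   % the recursion matches the matrix product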

The error vector is

E = Y - Ỹ = Y - P Y = Y - (I - L^{-1}) Y = L^{-1} Y.
(A.109)

In general, the total mean-squared prediction error (summed over the vector elements) is

𝔼[||E||^2] = 𝔼[||Y - Ỹ||^2] = trace{Ree},
(A.110)

where Ree = 𝔼[E E^H] is the autocorrelation matrix of E. It can be proved (see, e.g., [BLM04]) that when the optimum linear predictor of Eq. (A.108) is adopted, the error power trace{Ree} achieves its minimum value, given by trace{Dy^2}. This avoids the step of estimating Ree to obtain the prediction gain, which is given by

prediction gain = 10 log10 ( trace{Ryy} / trace{Ree} ) = 10 log10 ( trace{Ryy} / trace{Dy^2} ) dB.
(A.111)
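
The sketch below is a Monte Carlo check, under the assumption of Gaussian Y and a hypothetical Ryy (the matrix later used in Listing A.15), that the optimum predictor indeed attains trace{Ree} = trace{Dy^2}, and it evaluates Eq. (A.111).

Ryy = [16 8 4; 8 20 10; 4 10 21];       % hypothetical covariance matrix
G = chol(Ryy,'lower'); L = G*diag(1./diag(G)); Dy2 = diag(diag(G).^2); % Ryy = L*Dy2*L'
P = eye(3) - inv(L);                    % optimum MMSE predictor, Eq. (A.108)
numRealizations = 1e5;
Y = sqrtm(Ryy)*randn(3,numRealizations); % zero-mean realizations with covariance Ryy
E = Y - P*Y;                            % prediction error, E = inv(L)*Y
measuredMSE = trace((E*E')/numRealizations) % approaches trace(Dy2) = 48
theoreticalMSE = trace(Dy2)
predictionGain = 10*log10(trace(Ryy)/theoreticalMSE) % about 0.746 dB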

Hence, making an analogy with prediction over time, whose analysis/synthesis cascade is repeated here for convenience:

X[n] → Mx^{-1}(z) → I[n] → Mx(z) → X[n],

the spatial prediction allows one to obtain

Y → L^{-1} → E → L → Y,

which is expressed in matrix notation as E = L^{-1} Y and Y = L E.
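
This cascade can be verified with a one-line check, sketched below for an arbitrary (hypothetical) monic lower triangular L and realization y.

L = [1 0 0; 0.5 1 0; 0.25 0.5 1];  % arbitrary monic lower triangular factor
y = [2; -1; 3];                    % arbitrary realization of Y
e = L\y                            % forward section: E = inv(L)*Y (innovations)
yRecovered = L*e                   % synthesis section: recovers y exactly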

Listing A.15 illustrates an example discussed in [BLM04].

Listing A.15: MatlabOctaveCodeSnippets/snip_appprediction_spatialLinearPredictionExample.m
%Example 10-11 from Barry, 2004 (note a typo in matrix R in the book)
Ryy=[16 8 4; 8 20 10; 4 10 21] %noise autocorrelation, correlated
%Ryy=[10, 8, 2; 8 10 10; 2 10 10]; %another option, higher gain
[L D] = ldl_dg(Ryy) %own LDL, do not use chol(A) because it swaps rows
Ryy-L*D*L' %should be (close to) a matrix of zeros, since L*D*L' equals Ryy
P=eye(size(L))-inv(L) %optimum MMSE linear predictor, Eq. (A.108)
minMSE=trace(D) %minimum MSE: trace{Ree} = trace{Dy^2} = trace(D)
sumPowerX=trace(Ryy); %total power of Y (sum of all "users" powers)
predictionGain = 10*log10(sumPowerX/minMSE) %Eq. (A.111), in dB

In Listing A.15, the predictor matrix obtained with the original Ryy is

P =     [0,         0,         0;
    0.5000,         0,         0;
         0,    0.5000,         0]

and the prediction gain is 0.7463 dB. Adopting a new correlation matrix Ryy=[10, 8, 2; 8 10 10; 2 10 10] leads to

P =     [0,         0,         0;
    0.8000,         0,         0;
   -1.6667,    2.3333,         0]

and a prediction gain of 9.2082 dB.

Note that the first element ỹ1 of Ỹ = [ỹ1, …, ỹn]^T in Ỹ = P Y is always zero due to the structure of P. Then, the second element ỹ2 is a scaled version of the first element y1 of Y, and so on.
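
To make this concrete, the short sketch below applies the first predictor matrix shown above to an arbitrary (hypothetical) observation y.

P = [0 0 0; 0.5 0 0; 0 0.5 0];  % predictor obtained from the original Ryy of Listing A.15
y = [4; 2; 6];                  % arbitrary realization of Y
tilde_y = P*y                   % = [0; 2; 1]: tilde_y(1) is always 0,
                                % tilde_y(2) = 0.5*y(1), and tilde_y(3) = 0.5*y(2)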