A.22  One-dimensional linear prediction over time

A.22.1  The innovations process

As indicated in Eq. (4.39), when white noise with power σ_n² is passed through an LTI filter M(z), the output PSD is

S(e^{jΩ}) = σ_n² |M(e^{jΩ})|².
(A.97)

Conversely, when a PSD S(e^{jΩ}) satisfies the corresponding conditions, it is possible to find a filter M(z) that obeys Eq. (A.97).
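As a numerical sketch of Eq. (A.97) in Python (SciPy assumed; the filter coefficients below are an arbitrary, hypothetical example), the averaged periodogram of filtered white noise can be compared against σ_n²|M(e^{jΩ})|²:

```python
import numpy as np
from scipy.signal import lfilter, welch, freqz

rng = np.random.default_rng(3)
sigma_n = 2.0
noise = sigma_n * rng.standard_normal(400_000)  # white noise, power sigma_n^2 = 4

b, a = [1.0, 0.5], [1.0, -0.7]  # hypothetical stable LTI filter M(z)
y = lfilter(b, a, noise)

# Welch estimate of the one-sided output PSD (fs = 1)
f, S_est = welch(y, nperseg=1024)

# Theoretical one-sided PSD: 2 * sigma_n^2 * |M(e^{jOmega})|^2 for 0 < f < 1/2
_, H = freqz(b, a, worN=2 * np.pi * f)
S_theory = 2 * sigma_n**2 * np.abs(H) ** 2

ratio = S_est[1:-1] / S_theory[1:-1]  # skip DC/Nyquist (not doubled when one-sided)
print(np.mean(ratio))  # close to 1, in agreement with Eq. (A.97)
```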

As discussed in [BLM04] (page 71), a WSS random process X[n] with PSD S_X(e^{jΩ}) allows the following monic minimum-phase spectral factorization of the associated “transfer function” S_X(z):

S_X(z) = σ_i² M_x(z) M_x(1/z),
(A.98)

where M_x(z) is a monic loosely minimum-phase causal filter and σ_i² is the variance of the innovation process or, equivalently, the geometric mean of S_X(e^{jΩ}).

If M_x(z) is strictly minimum phase, M_x^{-1}(z) is stable and the following filtering process

X[n] → M_x^{-1}(z) → I[n] → M_x(z) → X[n]

illustrates how to generate a white WSS process I[n] with power σ_i² from X[n] (the “whitening” operation) and how to recover X[n] from I[n].
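This diagram can be checked with a small Python/SciPy sketch, using a hypothetical AR(1) synthesis filter M_x(z) = 1/(1 − 0.8z⁻¹), which is monic and strictly minimum phase:

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
sigma_i = 1.5                      # assumed innovation standard deviation
i_true = sigma_i * rng.standard_normal(200_000)

a = [1.0, -0.8]                    # M_x(z) = 1/(1 - 0.8 z^-1), pole at 0.8
x = lfilter([1.0], a, i_true)      # synthesis: I[n] -> M_x(z) -> X[n]
i_hat = lfilter(a, [1.0], x)       # whitening: X[n] -> M_x^{-1}(z) -> I[n]

print(np.allclose(i_hat, i_true))  # True: the innovation is recovered
print(np.var(i_hat))               # close to sigma_i^2 = 2.25
```

The inverse of an all-pole M_x(z) is FIR, so the whitening stage here is a simple two-tap filter.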

The filter M_x^{-1}(z) removes the correlated components of X[n] and produces I[n] with the characteristics of white noise. Hence, it is called the whitening filter. And because I[n] corresponds to the new and random part of the process, it is called the innovation sequence.

Linear prediction for signal decorrelation

The whitening filter M_x^{-1}(z) is also called the prediction-error filter because its action can be split into two stages, assuming linear prediction:

1. a prediction X̃[m] = Σ_{k=1} p_k X[m − k] of the current (m-th) sample of X[n] is obtained from past samples of X[n] via the prediction filter

P(z) = Σ_{n=1} p_n z^{−n},
(A.99)

which is P(z) = 1 − M_x^{-1}(z);

2. this prediction is then subtracted from the current sample to produce the prediction error I[m] = X[m] − X̃[m]. It can be shown that this scheme is an optimum minimum mean square error (MMSE) estimator [BLM04] (page 72).

Using the notation of “prediction”, the whitening can be obtained with:

X[n] → 1 − P(z) → I[n]

In summary, the filter M_x^{-1}(z) = 1 − P(z) is often used to decorrelate a signal, while M_x(z) is called the synthesis filter because it can generate a signal X[n] with a given PSD from white noise I[n].
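The equivalence between applying 1 − P(z) directly and the two-stage predict-then-subtract scheme can be verified numerically (Python sketch; the prediction coefficients p_k are hypothetical):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)

p = np.array([0.5, -0.2, 0.1])             # hypothetical prediction coefficients p_k
P = np.concatenate(([0.0], p))             # P(z) = sum_k p_k z^{-k}, no z^0 term

x_pred = lfilter(P, [1.0], x)              # stage 1: prediction X~[n]
e_two_stage = x - x_pred                   # stage 2: prediction error I[n]

A = np.concatenate(([1.0], -p))            # prediction-error filter 1 - P(z)
e_direct = lfilter(A, [1.0], x)

print(np.allclose(e_two_stage, e_direct))  # True: both routes agree
```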

When X[m] is a correlated signal, the decorrelation procedure decreases its power, which is typically beneficial for signal compression (reducing the bit rate).

The power of the prediction error I[m] is σ_i², which is the geometric mean of the PSD S_X(e^{jΩ}), while the power σ_x² of X[m] is the arithmetic mean of S_X(e^{jΩ}). Hence, linear prediction reduces the power by the

prediction gain = 10 log₁₀(σ_x²/σ_i²) = 10 log₁₀(1/SFM) dB.
(A.100)

The ratio SFM = σ_i²/σ_x² is called the spectral flatness measure. Assuming the PSD was estimated at K discrete points S_X[k], the SFM can alternatively be obtained by

SFM = (PSD geometric mean)/(PSD arithmetic mean) = (∏_{k=1}^{K} S_X[k])^{1/K} / ((1/K) Σ_{k=1}^{K} S_X[k]).
(A.101)

Because the geometric mean is never larger than the arithmetic mean, 0 ≤ SFM ≤ 1, and SFM = 1 only if S_X[k] is constant ∀k.

The prediction gain (in linear scale) coincides with 1/SFM. Small values of SFM suggest concentration of power in narrow portions of the PSD and, consequently, large prediction gains, while large SFM values indicate the PSD is approximately flat, which corresponds to small prediction gains.
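Eq. (A.101) and these bounds can be sketched in Python (the two example PSDs below are hypothetical):

```python
import numpy as np

def sfm(S):
    """Spectral flatness: geometric mean over arithmetic mean of PSD samples."""
    S = np.asarray(S, dtype=float)
    geo = np.exp(np.mean(np.log(S)))  # geometric mean via log-average
    return geo / np.mean(S)

flat = np.ones(256)
print(sfm(flat))                      # 1.0: flat PSD, no prediction gain

peaky = np.full(256, 0.01)
peaky[10] = 100.0                     # power concentrated in one bin
val = sfm(peaky)
print(val, 10 * np.log10(1 / val))    # small SFM -> large prediction gain in dB
```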

Autoregressive (AR) linear prediction

This discussion is related to Section 4.8. There are computationally efficient methods to estimate M(z) when it is restricted to be an autoregressive (AR) filter M(z) = 1/A(z). For example, the Yule-Walker equations provide an efficient method to estimate A(z) [Hay01]. In this case, it is assumed that X[n] is a realization of an ergodic autoregressive random process of order P, denoted as AR(P).

The problem can then be posed mathematically as finding the FIR filter A(z) = 1 − Σ_{i=1}^{P} a_i z^{−i} that minimizes the energy Σ_n |I[n]|² of the output signal I[n] in the filtering process:

X[n] → A(z) → I[n].

In this case, the prediction-error filter can be written as M^{-1}(z) = A(z) = 1 − P̃(z), where P̃(z) = Σ_{k=1}^{P} a_k z^{−k} predicts an estimate X̃[n] of X[n] as

X̃[n] = Σ_{k=1}^{P} a_k X[n − k],
(A.102)

based on the P past samples of X[n].
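Outside Matlab, the Yule-Walker estimate can be sketched with NumPy/SciPy (biased autocorrelation estimates plus a Toeplitz solve); this is a simplified illustration, not the toolbox implementation, and the AR(2) model below is hypothetical:

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def yule_walker(x, P):
    """Estimate the prediction-error filter A(z) of an AR(P) model from data x."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # biased autocorrelation estimates r[0..P]
    r = np.array([np.dot(x[:N - k], x[k:]) / N for k in range(P + 1)])
    a = solve_toeplitz(r[:P], r[1:P + 1])     # normal equations R a = r
    var_i = r[0] - np.dot(a, r[1:P + 1])      # innovation variance estimate
    return np.concatenate(([1.0], -a)), var_i

# check on a synthetic AR(2) realization (poles at ~0.85 e^{+-j pi/4}, assumed)
rng = np.random.default_rng(2)
true_A = [1.0, -1.2, 0.72]
x = lfilter([1.0], true_A, rng.standard_normal(100_000))
A_hat, var_hat = yule_walker(x, 2)
print(A_hat)     # close to [1, -1.2, 0.72]
print(var_hat)   # close to 1.0 (unit-variance innovation)
```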

AR-based modeling using (A.102) is widely used in speech coding and, because X̃[n] in (A.102) is a linear combination of past samples, it is called linear predictive coding (LPC) (see Section 4.8). Listing A.14 provides two examples of linear prediction, with and without an underlying AR process.

Listing A.14: MatlabOnly/snip_appprediction_example.m
randn('seed',1); % reset number generator. Could use rng('default')
useARProcess = 0;  % 1 for AR or 0 for ARMA
N = 100000;        % Number of samples
inn = randn(1, N); % innovation: WGN with zero mean and unit variance
if useARProcess == 1 % AR process
    fprintf('AR process example\n')
    Bz = 4; % H(z) = Bz/Az is an all-poles filter
    Az = poly([0.98*exp(1j*pi/4) 0.98*exp(-1j*pi/4)]);
    x = filter(Bz, Az, inn); % realization of an AR(2) process
    P = 2; % adopt the correct order of H(z)
else % ARMA process
    fprintf('ARMA process example\n')
    Bz = 6*poly([0.7*exp(1j*pi/3) 0.7*exp(-1j*pi/3)]);
    Az = poly([0.9*exp(1j*pi/4) 0.9*exp(-1j*pi/4)]);
    x = filter(Bz, Az, inn); % realization of an ARMA process
    P = 10; % it is not AR, so choose a relatively high value
end
[A, Pinn] = aryule(x, P); % prediction; P: order and Pinn: error variance
% Given the filter M(z) = 1/A(z), the inverse M^{-1}(z) = A(z)
% estimates the innovation sequence with X as input:
estimatedInnovation = filter(A, 1, x); % predict
Hw = freqz(Bz, Az); % frequency response
Spsd = abs(Hw).^2 + eps; % convert to scaled PSD and avoid zeros
Sgeo = geo_mean(Spsd); % geometric mean: power of innovation
Sarit = mean(Spsd); % arithmetic mean: power of input signal
predictionGain = 10*log10(Sarit/Sgeo); % prediction gain in dB
fprintf('prediction gain = %g dB \n', predictionGain)
fprintf('Signal power (time-domain) =  %g \n', mean(x.^2))
fprintf('Signal power (via PSD arithmetic mean) = %g \n', Sarit)
fprintf('Innovation power (via PSD geometric mean) = %g \n', Sgeo)
fprintf('Innovation power (via AR modeling) = %g \n', Pinn)
fprintf('Innovation power (time-domain) = %g \n', ...
    mean(estimatedInnovation.^2))
% freqz(Bz,Az) % frequency response, if useful

The result of running Listing A.14 with useARProcess = 1 is the following:

AR process example
prediction gain = 14.0937 dB
Signal power (time-domain) =  422.558
Signal power (via PSD arithmetic mean) = 412.08
Innovation power (via PSD geometric mean) = 16.0552
Innovation power (via AR modeling) = 16.0843
Innovation power (time-domain) = 16.0819

Running Listing A.14 with useARProcess = 0 leads to the following:

ARMA process example
prediction gain = 2.99342 dB
Signal power (time-domain) =  72.056
Signal power (via PSD arithmetic mean) = 71.8227
Innovation power (via PSD geometric mean) = 36.0512
Innovation power (via AR modeling) = 36.1937
Innovation power (time-domain) = 36.1937

In both cases the results were consistent. In the second case, an order of 10 was adopted for the AR filter. Note that moving the poles closer to the unit circle increases the prediction gain. This occurs because the frequency response (and the power spectral density) then tends to present a sharper peak with higher magnitude. This, in turn, increases the arithmetic mean more significantly than the geometric mean of the PSD, which implies an increased prediction gain, as inferred from Eq. (A.100).
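The effect of the pole radius on the prediction gain can be sketched directly from the model PSD (Python with SciPy; the radii below are illustrative choices, not values from the listing):

```python
import numpy as np
from scipy.signal import freqz

def prediction_gain_db(b, a, n_fft=4096):
    """Prediction gain from the model PSD: 10 log10(arithmetic mean / geometric mean)."""
    _, H = freqz(b, a, worN=n_fft)
    S = np.abs(H) ** 2 + np.finfo(float).eps  # PSD samples, avoiding zeros
    geo = np.exp(np.mean(np.log(S)))
    return 10 * np.log10(np.mean(S) / geo)

# all-pole models with conjugate poles at radius r and angle pi/4
gains = [prediction_gain_db([1.0],
                            np.poly([r * np.exp(1j * np.pi / 4),
                                     r * np.exp(-1j * np.pi / 4)]).real)
         for r in (0.5, 0.9, 0.98)]
print(gains)  # the gain grows as the poles approach the unit circle
```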