2.4 Block Transforms

Generic block processing was presented in Section 1.4. This section discusses block transforms as a specific kind of block-oriented signal processing. It covers the DCT, DFT, and Haar transforms, which are among the most widely used, together with the already presented KLT (or PCA).

In many signal processing tasks, it is useful to have two related transformations called a transform pair. One transform is the inverse of the other, with the inverse undoing the forward transform. When dealing with block transforms,⁵ one simply uses linear algebra (as detailed in Appendix A.14), and both transforms are simple matrix multiplications. The pair of transforms is defined by a pair of matrices. The matrices can be rectangular, as in lapped transforms,⁶ but most block transforms are defined by a pair of $N \times N$ square matrices $A$ and $A^{- 1}$ and the jargon $N$ -point transform indicates their dimension. The inverse matrix $A^{- 1}$ is assumed to exist and can “undo” the transformation $A$ (and vice-versa). Here, the forward (or direct) transformation is denoted as

X = A x,

while the inverse transformation is denoted here as

x = A^{- 1} X .

(2.10)

The vector $X$ is called the transform of $x$ and the elements of $X$ are called coefficients. In this text, vectors are represented by bold lower case letters, but the vector with coefficients will be denoted by capital letters to be consistent with the jargon (unfortunately, vectors of coefficients such as $X$ can be confused with matrices, but the context will distinguish them).

An important concept is:

The columns of the inverse transform matrix $A^{- 1}$ are called the basis functions (or basis vectors).
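As a minimal sketch of a transform pair, the Python code below applies $X = A x$ and then recovers $x$ in two equivalent ways: by the product $A^{- 1} X$ and by a linear combination of the columns of $A^{- 1}$ (the basis functions). The 2 x 2 matrices and the helper `matvec` are illustrative assumptions, not taken from the text.

```python
# Illustrative (hypothetical) 2x2 transform pair; A_inv is the inverse of A.
A = [[2.0, 1.0],
     [1.0, 1.0]]
A_inv = [[1.0, -1.0],
         [-1.0, 2.0]]

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

x = [3.0, 4.0]
X = matvec(A, x)          # forward transform: vector of coefficients
x_rec = matvec(A_inv, X)  # inverse transform

# Same reconstruction written as a linear combination of the basis
# functions (columns of A_inv), each weighted by its coefficient X[k].
basis = [[row[k] for row in A_inv] for k in range(2)]
x_comb = [sum(X[k] * basis[k][n] for k in range(2)) for n in range(2)]
print(X, x_rec, x_comb)
```

Both reconstructions return the original vector, which is what "undoing" the forward transform means.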

2.4.1 Advanced: Unitary or orthonormal transforms

A matrix $B$ with orthonormal columns is called unitary. The rows of a unitary matrix are also orthonormal. If one takes any pair of distinct rows or columns of the unitary matrix $B$ , their inner product is zero (they are orthogonal) and their norms are equal to one.

Unitary matrices are widely used in transforms because their inverse is simply the conjugate transpose as indicated in:

B^{- 1} = B^{H} = {(B^{*})}^{T},

where $H$ denotes the Hermitian (conjugate transposition).

Considering Eq. (2.10), if the basis functions (columns of $A^{- 1}$ ) are orthonormal, then $A^{- 1} = A^{H}$ . Consequently, the rows of the direct transform $A$ are the complex-conjugate of the basis functions. Two important facts are:

Forward: the $k$ -th coefficient $X [k]$ of the forward transform $X = A x$ is obtained by performing the inner product between $x$ and the $k$ -th basis function. Following the definition of inner product for complex-valued vectors in Eq. (A.32), the complex conjugate of the $k$ -th basis function is used. The larger the magnitude of $X [k]$ , the better the $k$ -th basis function represents the signal $x$ .
Inverse: in the inverse transform $x = A^{- 1} X$ , the column vector $x$ is obtained as a linear combination of the basis functions: the $k$ -th element (coefficient) of $X$ multiplies the $k$ -th column (basis function) of $A^{- 1}$ .

Observing that the inverse of a unitary real matrix is its transpose

To get insight on why $B^{- 1} = B^{H}$ for a unitary $B$ , assume the elements of $B$ are real numbers, such that $B^{H} = B^{T}$ . The product $B^{T} B = I$ is the identity matrix (that is, $B^{- 1} = B^{T}$ ), because the inner product between a row of $B^{T}$ (a column of $B$ ) and a column of $B$ is one when they coincide (main diagonal of $I$ ) and zero otherwise (due to their orthogonality). In other words, the matrix with all inner products between columns of $B$ is the identity, given that the columns are orthonormal. In case $B$ is complex-valued, one has $B^{- 1} = B^{H}$ via a similar reasoning.
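The reasoning above can be checked numerically. The sketch below uses a 2 x 2 rotation matrix as an example of a real matrix with orthonormal columns (the matrix, the angle and the helper `gram` are assumptions made for illustration) and computes $B^{T} B$ .

```python
import math

theta = 0.3  # arbitrary angle in radians (assumption; any value works)
B = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

def gram(M):
    """Return M^T M: entry (i, j) is the inner product of columns i and j."""
    n = len(M)
    return [[sum(M[r][i] * M[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

G = gram(B)
print(G)  # numerically close to the 2x2 identity matrix
```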

The next paragraphs present the DCT transform, which can be used for both frequency analysis and coding.

2.4.2 DCT transform

An example of a unitary matrix transform very useful for coding is the discrete cosine transform (DCT). When $N = 4$ , the corresponding matrices are

A = \frac{1}{2} [\begin{matrix} 1 & 1 & 1 & 1 \\ 1.307 & 0.541 & - 0.541 & - 1.307 \\ 1 & - 1 & - 1 & 1 \\ 0.541 & - 1.307 & 1.307 & - 0.541 \end{matrix}]

and, because in this case of a real-valued matrix $A^{H} = A^{T}$ :

A^{- 1} = A^{T} = \frac{1}{2} [\begin{matrix} 1 & 1.307 & 1 & 0.541 \\ 1 & 0.541 & - 1 & - 1.307 \\ 1 & - 0.541 & - 1 & 1.307 \\ 1 & - 1.307 & 1 & - 0.541 \end{matrix}]

for the direct and inverse transforms, respectively.

Because they have finite duration, one can represent the input signals of block transforms as vectors or sequences. For instance, the DCT basis function corresponding to the second column of $A^{- 1}$ could be written as

b [n] = \frac{1}{2} (1.307 δ [n] + 0.541 δ [n - 1] - 0.541 δ [n - 2] - 1.307 δ [n - 3]) .

Considering that the first element is $a_{0, 0}$ (the first index in equations is zero, not one as in Matlab/Octave), an element $a_{n, k}$ of the $N$ -point inverse DCT matrix $A^{- 1} = \{ a_{n, k} \}$ can be obtained by

a_{n, k} = w_{k} cos (\frac{π (2 n + 1) k}{2 N}),

where $w_{k}$ is a scaling factor that gives the basis vectors unit norm, i.e., $w_{k} = \frac{1}{\sqrt{N}}$ for $k = 0$ and $w_{k} = \sqrt{\frac{2}{N}}$ for $k = 1, 2, \dots, N - 1$ .

Example 2.5. DCT calculation in Matlab/Octave. Listing 2.1 illustrates how the DCT matrices can be obtained in Matlab/Octave.

Listing 2.1: MatlabOctaveFunctions/ak_dctmtx.m

function [A, Ainverse] = ak_dctmtx(N)
% function [A, Ainverse] = ak_dctmtx(N)
% Calculate the DCT-II matrix of dimension N x N. A and Ainverse are
% the direct and inverse transform matrices, respectively. Ex:
%  [A, Ainv]=ak_dctmtx(4); x=[1:4]'+1j; A*x, dct(x)
Ainverse=zeros(N,N); %pre-allocate space
scalingFactor = sqrt(2/N); %make base functions to have norm = 1
for n=0:N-1 %a loop helps to clarify obtaining inverse matrix
    for k=0:N-1 %first array element is 1, so use A(n+1,k+1):
        Ainverse(n+1,k+1)=scalingFactor*cos((pi*(2*n+1)*k)/(2*N));
    end
end
Ainverse(1:N,1)=Ainverse(1:N,1)/sqrt(2); %different scaling for k=0
%unitary transform: direct transform is the Hermitian of the inverse
A = transpose(Ainverse); %Matrix is real, so transpose is Hermitian

The matrices obtained with ak_dctmtx.m can be used to perform the transformations, but this has only pedagogical value. There are algorithms for computing the DCT that are faster than a plain matrix multiplication. Check the functions dct and idct in Matlab/Octave and scipy.fftpack.dct (or scipy.fft.dct in recent SciPy versions) in Python. $□$
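For readers working in Python, the following is a rough port of the same construction using only the standard library. The function name `dct_matrices` is an assumption made for this sketch and is not part of the book's code.

```python
import math

def dct_matrices(N):
    """Return (A, A_inv) for the orthonormal N-point DCT-II as lists of rows."""
    A_inv = [[0.0] * N for _ in range(N)]
    for n in range(N):
        for k in range(N):
            # w_k = 1/sqrt(N) for k = 0 and sqrt(2/N) otherwise
            w = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
            A_inv[n][k] = w * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
    # Real unitary transform: the forward matrix is the transpose of the inverse.
    A = [[A_inv[n][k] for n in range(N)] for k in range(N)]
    return A, A_inv

A, A_inv = dct_matrices(4)
for row in A:
    print([round(2 * a, 3) for a in row])  # entries scaled by 2, as in the text
```

If SciPy is available, `scipy.fft.dct(x, type=2, norm='ortho')` is expected to give the same coefficients as multiplying by this `A`, since the `norm='ortho'` option selects the orthonormal scaling adopted here.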

Example 2.6. The DCT basis functions are cosines of distinct frequencies. Figure 2.2 shows four basis functions of a 32-point DCT transform.

Figure 2.2: The first three ( $k = 0, 1, 2$ ) and the last ( $k = 31$ ) basis functions for a 32-point DCT. Note that the frequency increases with $k$ .

Figure 2.2 indicates that, in order to represent signals composed of “low frequencies”, DCT coefficients of low order (small values of $k$ ) can be used, while higher order coefficients are more useful for signals composed of “high frequencies”. For example, the commands:

N=32;k=3;n=0:N-1;x=7*cos(k*(pi*(2*n+1)/(2*N)));stem(x); X=dct(x)

return a vector X with all elements equal to zero but X(4)=28, which corresponds to $k = 3$ (recall the first index in Matlab/Octave is 1, not 0). Using a larger $k$ will increase the frequency and the order of the corresponding DCT coefficient. $□$
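The same experiment can be sketched in Python by evaluating the orthonormal DCT-II sums directly. The helper `dct_ortho` is a hypothetical name introduced for this illustration; it is a slow $O (N^{2})$ evaluation, not a fast algorithm.

```python
import math

def dct_ortho(x):
    """Orthonormal DCT-II of the sequence x, computed by direct summation."""
    N = len(x)
    X = []
    for k in range(N):
        w = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        X.append(w * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                         for n in range(N)))
    return X

N, k0 = 32, 3
x = [7 * math.cos(k0 * math.pi * (2 * n + 1) / (2 * N)) for n in range(N)]
X = dct_ortho(x)
print([round(c, 6) for c in X])  # only the entry with index k0 is non-zero
```

The non-zero coefficient equals $7 \sqrt{N ∕ 2} = 28$ because the input is 7 times the unnormalized $k = 3$ cosine, whose norm is $\sqrt{N ∕ 2}$ .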

Example 2.7. Example of a DCT transformation. Assuming a 4-point DCT and $x = {[1, 2, 3, 4]}^{T}$ , the forward transform can be obtained with

X = [\begin{matrix} X [0] \\ X [1] \\ X [2] \\ X [3] \end{matrix}] = A x \approx \frac{1}{2} [\begin{matrix} 1 & 1 & 1 & 1 \\ 1.307 & 0.541 & - 0.541 & - 1.307 \\ 1 & - 1 & - 1 & 1 \\ 0.541 & - 1.307 & 1.307 & - 0.541 \end{matrix}] [\begin{matrix} 1 \\ 2 \\ 3 \\ 4 \end{matrix}] \approx [\begin{matrix} 5 \\ - 2.230 \\ 0 \\ - 0.158 \end{matrix}] .

In this case,

X [0] = ⟨ x, A (0, :) ⟩ = ⟨ {[1, 2, 3, 4]}^{T}, 0.5 {[1, 1, 1, 1]}^{T} ⟩ = 5,

where $A (0, :)$ represents the first (0-th) row of matrix $A$ . Similarly, $X [2]$ is given by

X [2] = ⟨ x, A (2, :) ⟩ = ⟨ {[1, 2, 3, 4]}^{T}, 0.5 {[1, - 1, - 1, 1]}^{T} ⟩ = 0,

and so on.

The previous expressions provide intuition on the direct transform. In the inverse transform, when reconstructing $x$ , the coefficient $X [k]$ is the scaling factor that multiplies the $k$ -th basis function in the linear combination $x = A^{- 1} X$ . Still considering the 4-point DCT, the inverse corresponds to

[\begin{matrix} x [0] \\ x [1] \\ x [2] \\ x [3] \end{matrix}] = \frac{1}{2} [\begin{matrix} 1 & 1.307 & 1 & 0.541 \\ 1 & 0.541 & - 1 & - 1.307 \\ 1 & - 0.541 & - 1 & 1.307 \\ 1 & - 1.307 & 1 & - 0.541 \end{matrix}] [\begin{matrix} 5 \\ - 2.230 \\ 0 \\ - 0.158 \end{matrix}] = \frac{1}{2} \{ 5 [\begin{matrix} 1 \\ 1 \\ 1 \\ 1 \end{matrix}] - 2.230 [\begin{matrix} 1.307 \\ 0.541 \\ - 0.541 \\ - 1.307 \end{matrix}] + 0 [\begin{matrix} 1 \\ - 1 \\ - 1 \\ 1 \end{matrix}] - 0.158 [\begin{matrix} 0.541 \\ - 1.307 \\ 1.307 \\ - 0.541 \end{matrix}] \} .

Note that $X [2] = 0$ and, consequently, the basis function $0.5 {[1, - 1, - 1, 1]}^{T}$ is not used to reconstruct $x$ . The reason is that this specific basis function is orthogonal to $x$ and does not contribute to its reconstruction. As another example, the DCT coefficient $X [0] = 5$ indicates that a scaling factor equal to 5 should multiply the corresponding $0$ -th basis function $0.5 {[1, 1, 1, 1]}^{T}$ in order to reconstruct the vector $x$ in the time domain. $□$
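The two views of Example 2.7 (inner products in the forward direction, a weighted sum of basis functions in the inverse direction) can be sketched in Python as follows. The variable names are assumptions made for this illustration.

```python
import math

N = 4

def w(k):
    """Scaling factor that gives the k-th DCT basis function unit norm."""
    return math.sqrt((1.0 if k == 0 else 2.0) / N)

# basis[k][n]: n-th sample of the k-th DCT basis function (column k of A^{-1})
basis = [[w(k) * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]
         for k in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
# Forward: inner product of x with each (real-valued) basis function.
X = [sum(xn * bn for xn, bn in zip(x, basis[k])) for k in range(N)]
# Inverse: accumulate X[k] times the k-th basis function.
x_rec = [0.0] * N
for k in range(N):
    for n in range(N):
        x_rec[n] += X[k] * basis[k][n]
print([round(c, 3) for c in X])
print([round(v, 3) for v in x_rec])
```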

Alternatively, the matrix multiplication can be described by a transform equation. For example, the DCT coefficients can be calculated by

X [k] = \sqrt{\frac{2}{N}} \sum_{n = 0}^{N - 1} x [n] cos (\frac{π (2 n + 1) k}{2 N}), k = 1, …, N - 1

and

X [0] = \frac{1}{\sqrt{N}} \sum_{n = 0}^{N - 1} x [n] .

As mentioned, the $k$ -th element (or coefficient) $X [k]$ of $X$ can be calculated as the inner product of $x$ with the complex conjugate of the $k$ -th basis function. This can be done because the DCT basis functions are orthogonal among themselves, as discussed in Section A.14.3. The factors $\sqrt{2 ∕ N}$ and $1 ∕ \sqrt{N}$ are used to have basis vectors with norms equal to 1. The $k$ -th basis vector is a cosine with frequency $(kπ) ∕ N$ and phase $(kπ) ∕ (2 N)$ .
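The frequency and phase quoted above follow from expanding the argument of the cosine. As a short worked step:

```latex
\cos\left(\frac{\pi (2n+1) k}{2N}\right)
  = \cos\left(\frac{k\pi}{N}\, n + \frac{k\pi}{2N}\right),
```

so that, as a function of $n$ , the $k$ -th basis vector is a cosine with angular frequency $kπ ∕ N$ rad and phase $kπ ∕ (2 N)$ rad.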

Section 2.10 discusses examples of DCT applications, including coding (signal compression). One advantage of adopting block transforms in coding applications is the distinct importance of coefficients. In the original domain, all samples (or pixels in image processing) have the same importance, but in the transform domain, coefficients typically have distinct importance. Hence, the coding scheme can concentrate on representing the most important coefficients and even discard the non-important ones. Another application of DCTs is in frequency analysis (finding the most relevant frequencies that compose a signal). But, in this application, the DFT is more widely adopted than the DCT.

2.4.3 DFT transform

Like the DCT, the discrete Fourier transform (DFT) is a very useful tool for frequency analysis, where the goal is to estimate the coefficients for basis functions that are distinguished by their frequencies. The DFT is related to the discrete-time Fourier series, which also uses cosines $cos (\frac{2 πnk}{N})$ and sines $sin (\frac{2 πnk}{N})$ , $k = 0, 1, \dots, N - 1$ , as basis functions, and will be discussed in this chapter. While the DCT uses cosines and its matrices are real, the DFT uses complex exponentials as basis functions.

Using Euler’s formula, Eq. (A.1), complex numbers provide a more concise representation of sines and cosines and the $k$ -th DFT basis function is given by

\frac{1}{N} e^{\frac{j 2 πnk}{N}} = \frac{1}{N} [cos (\frac{2 πnk}{N}) + j sin (\frac{2 πnk}{N})],

(2.11)

where $n = 0, 1, \dots, N - 1$ expresses time evolution, as for the DCT. The value $k$ determines the angular frequency $Ω_{k} = \frac{2 πk}{N}$ (in radians) of the basis function $b [n]$ . Eq. (2.11) can be written as

b [n] = \frac{1}{N} e^{j Ω_{k} n} = \frac{1}{N} [cos (Ω_{k} n) + j sin (Ω_{k} n)],

(2.12)

which gives the $n$ -th value of the time-domain basis function $b [n]$ .

Hence, one can notice that the DFT operates with an angular frequency resolution $Δ Ω = 2 π ∕ N$ , such that $Ω_{k} = k Δ Ω$ . For instance, considering $N = 4$ , the resolution is $Δ Ω = 2 π ∕ 4 = π ∕ 2$ rad and the DFT uses basis functions with angular frequencies $Ω_{k} = 0, π ∕ 2, π$ and $3 π ∕ 2$ rad. This is depicted in Figure 2.3.

Figure 2.3: Angles $Ω_{k} = 0, π ∕ 2, π$ and $3 π ∕ 2$ rad (or, equivalently, discrete-time angular frequencies) used by a DFT of $N = 4$ points when $k$ varies from $k = 0$ to 3, respectively.

When $k = 2$ in Figure 2.3, the angular frequency is $Ω_{k} = π$ rad, and the basis function is $b [n] = \frac{1}{N} e^{jπn} = \frac{1}{N} {(e^{jπ})}^{n} = \frac{1}{N} {(- 1)}^{n}$ . The signal ${(- 1)}^{n}$ corresponds to the highest angular frequency $Ω = π$ rad among discrete-time signals, and this frequency is called the Nyquist frequency, as discussed in Section 1.8.3.

To complement Figure 2.3, which uses an even value $N$ , Figure 2.4 describes the DFT angular frequencies when $N = 5$ .

Figure 2.4: Angles $Ω_{k} = 0, 2 π ∕ 5, 4 π ∕ 5, 6 π ∕ 5$ and $8 π ∕ 5$ rad (that in degrees corresponds to $0$ , $72^{\circ}$ , $144^{\circ}$ , ${- 144}^{\circ}$ , ${- 72}^{\circ}$ ) used by a DFT of $N = 5$ points when $k$ varies from $k = 0$ to 4, respectively.

As depicted in Figure 2.4, when $N$ is odd, the Nyquist frequency $Ω = π$ rad is not used by the DFT.

Assuming a DFT with $N = 6$ points, Figure 2.5 indicates the angular frequency $Ω_{k}$ corresponding to each DFT coefficient $X [k]$ . For instance, the DC coefficient corresponding to $k = 0$ is the first element of the vector with DFT coefficients. For $N$ even, the Nyquist frequency is used and its respective coefficient is the element corresponding to $k = N ∕ 2$ .

Figure 2.5: Mapping between angular frequencies $Ω_{k}$ and the corresponding DFT coefficient $X [k]$ for $N = 6$ .

Figure 2.6 shows a figure similar to Figure 2.5 but for a DFT with $N = 5$ points. In the case of an odd value of $N$ , the Nyquist frequency is not used in the DFT calculation.

Another aspect depicted by both Figure 2.6 and Figure 2.5 is that the negative frequencies are located after the positive ones in the DFT coefficients vector. The Matlab/Octave function fftshift can be used to move the negative frequencies to the first elements of this vector.

Figure 2.6: Mapping between angular frequencies $Ω_{k}$ and the corresponding DFT coefficient $X [k]$ for $N = 5$ .
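A minimal sketch of the mapping between the index $k$ and the angle $Ω_{k}$ is given below. The function name `dft_angles_deg` and the choice of wrapping angles to the interval $(- 180^{\circ}, 180^{\circ}]$ are assumptions made for illustration.

```python
def dft_angles_deg(N):
    """Return Omega_k in degrees for k = 0..N-1, wrapped to (-180, 180]."""
    angles = []
    for k in range(N):
        deg = 360.0 * k / N
        if deg > 180.0:
            deg -= 360.0  # angles above 180 degrees map to negative frequencies
        angles.append(deg)
    return angles

print(dft_angles_deg(4))  # [0.0, 90.0, 180.0, -90.0]
print(dft_angles_deg(5))  # [0.0, 72.0, 144.0, -144.0, -72.0]
print(dft_angles_deg(6))  # [0.0, 60.0, 120.0, 180.0, -120.0, -60.0]
```

The Nyquist angle of 180 degrees appears only for even $N$ , and the negative angles occupy the last positions, which is the ordering that fftshift rearranges.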

The $N$ -point forward and inverse DFT pair can be written as a pair of equations. The forward (direct) transform is

X [k] = \sum_{n = 0}^{N - 1} x [n] e^{- \frac{j 2 πnk}{N}}, for k = 0, \dots, N - 1,

(2.13)

which corresponds to calculating the $k$ -th DFT coefficient $X [k] = ⟨ x [n], e^{\frac{j 2 πnk}{N}} ⟩$ as the inner product between the time-domain signal $x [n]$ and the $k$ -th basis vector of Eq. (2.11) multiplied by $N$ . In this calculation, the adopted inner product for complex-valued vectors of Eq. (A.32) is responsible for Eq. (2.13) using $e^{- \frac{j 2 πnk}{N}}$ instead of $e^{\frac{j 2 πnk}{N}}$ .

The inverse DFT transform equation is

x [n] = \frac{1}{N} \sum_{k = 0}^{N - 1} X [k] e^{\frac{j 2 πnk}{N}}, for n = 0, \dots, N - 1 .

(2.14)

Eq. (2.13) and Eq. (2.14) are the most widely adopted definitions of a DFT pair. However, the scaling factor $1 ∕ N$ in Eq. (2.14) can be changed to $1 ∕ \sqrt{N}$ , with the same factor $1 ∕ \sqrt{N}$ also used in Eq. (2.13), if one wants a DFT pair whose basis functions have unit norm. In this case the transform is called unitary DFT and has the properties already discussed in Section 2.4.1.⁷
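The DFT pair and its unitary variant can be sketched directly from the two sums. This is a plain $O (N^{2})$ evaluation meant only to mirror the equations; the function names and the `unitary` flag are assumptions of this sketch.

```python
import cmath

def dft(x, unitary=False):
    """Forward DFT by direct summation; optional 1/sqrt(N) scaling."""
    N = len(x)
    scale = 1 / N ** 0.5 if unitary else 1.0
    return [scale * sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                        for n in range(N)) for k in range(N)]

def idft(X, unitary=False):
    """Inverse DFT by direct summation; scaling 1/N or 1/sqrt(N)."""
    N = len(X)
    scale = 1 / N ** 0.5 if unitary else 1.0 / N
    return [scale * sum(X[k] * cmath.exp(2j * cmath.pi * n * k / N)
                        for k in range(N)) for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
X = dft(x)                 # conventional pair
x_rec = idft(X)
Xu = dft(x, unitary=True)  # unitary pair
xu_rec = idft(Xu, unitary=True)
print(X, Xu)
```

Both pairs recover $x$ ; they differ only in how the factor $1 ∕ N$ is split between the forward and inverse transforms.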

Instead of using transform equations, sometimes it is convenient to use a matrix notation for transforms. With a matrix notation, it is easier to interpret Eq. (2.14) as composing the time-domain signal $x [n]$ as a linear combination of basis functions $(1 ∕ N) e^{\frac{j 2 πnk}{N}}$ scaled by the respective coefficient $X [k]$ .

To develop the matrix notation, one can observe that an element $a_{n, k}$ of the $N$ -point inverse DFT matrix $A^{- 1} = \{ a_{n, k} \}$ is given by

a_{n, k} = \frac{1}{N} e^{\frac{j 2 πnk}{N}} .

(2.15)

Because $n$ and $k$ appear in Eq. (2.15) only through the product $nk$ , $a_{n, k} = a_{k, n}$ and the inverse DFT matrix $A^{- 1}$ is symmetric, that is $A^{- 1} = {(A^{- 1})}^{T}$ .

For the sake of comparison, when $N = 4$ , the inverse DCT and DFT matrices are, respectively:

A_{DCT}^{- 1} = \frac{1}{2} [\begin{matrix} 1 & 1.307 & 1 & 0.541 \\ 1 & 0.541 & - 1 & - 1.307 \\ 1 & - 0.541 & - 1 & 1.307 \\ 1 & - 1.307 & 1 & - 0.541 \end{matrix}] and A_{DFT}^{- 1} = \frac{1}{4} [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & j & - 1 & - j \\ 1 & - 1 & 1 & - 1 \\ 1 & - j & - 1 & j \end{matrix}] .

The example shows that the inverse DFT matrix $A_{DFT}^{- 1}$ is symmetric, while the inverse DCT matrix $A_{DCT}^{- 1}$ is not. It can also be noticed that $A_{DFT}^{- 1}$ is complex-valued, having both real and imaginary elements, whereas $A_{DCT}^{- 1}$ is real-valued. However, the first and third columns of $A_{DFT}^{- 1}$ are real-valued (apart from the factor $1 ∕ 4$ , all elements are $+ 1$ or $- 1$ ). For $N = 4$ , these columns correspond to $k = 0$ and $k = N ∕ 2$ , which are called the DC and Nyquist frequencies, respectively. All $N$ values of the basis vector of frequency $k = 0$ (DC) are given by $a_{n, 0} = 1 ∕ 4 = 0.25, n = 0, \dots, N - 1$ , as indicated in the first column of $A_{DFT}^{- 1}$ . And for $N$ even, all $N$ values of the basis vector of the Nyquist frequency $k = N ∕ 2$ (third column of $A_{DFT}^{- 1}$ when $N = 4$ ) are given by $a_{n, N ∕ 2} = (1 ∕ N) {(- 1)}^{n}, n = 0, \dots, N - 1$ .

From Eq. (2.15) and adopting $N = 4$ , the DFT inverse matrix is given by

A^{- 1} = \frac{1}{N} [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & e^{jπ ∕ 2} & e^{jπ} & e^{j 3 π ∕ 2} \\ 1 & e^{jπ} & e^{j 2 π} & e^{j 3 π} \\ 1 & e^{j 3 π ∕ 2} & e^{j 3 π} & e^{j 9 π ∕ 2} \end{matrix}] = \frac{1}{4} [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & j & - 1 & - j \\ 1 & - 1 & 1 & - 1 \\ 1 & - j & - 1 & j \end{matrix}] .

Note that the complex numbers $e^{j Ω}$ with angles $Ω = 0, π ∕ 2, π$ and $3 π ∕ 2$ (which is equivalent to $- π ∕ 2$ ) rad and magnitudes equal to 1, can be written as $1, j, - 1$ and $- j$ , respectively. This simplifies writing the elements of $A^{- 1}$ for $N = 4$ and other values of $N$ .

Example 2.8. Examples of 4-point DFT calculation. Let us calculate the 4-point DFTs of three signals represented by the column vectors $x = {[2, 2, 2, 2]}^{T}$ , $y = {[0, 5, - 5, 0]}^{T}$ and $z = {[1, 2, 3, 4]}^{T}$ . In this case, the forward and inverse matrices can be written as

A = [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & - j & - 1 & j \\ 1 & - 1 & 1 & - 1 \\ 1 & j & - 1 & - j \end{matrix}] and A^{- 1} = \frac{A^{H}}{N} = \frac{1}{4} [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & j & - 1 & - j \\ 1 & - 1 & 1 & - 1 \\ 1 & - j & - 1 & j \end{matrix}],

respectively. The coefficient vectors are $X = A x = {[8, 0, 0, 0]}^{T}$ , $Y = A y = {[0, 5 - j 5, - 10, 5 + j 5]}^{T}$ and $Z = A z = {[10, - 2 + j 2, - 2, - 2 - j 2]}^{T}$ .

Because the basis functions (columns of $A^{- 1}$ ) are orthogonal, but not orthonormal, the $k$ -th coefficient of the forward transform is obtained by performing the inner product between $x$ and the complex conjugate of the $k$ -th basis function multiplied by $N = 4$ .

The first basis function (corresponding to $k = 0$ ) is always the first column of $A^{- 1}$ , which in this case is ${[0.25, 0.25, 0.25, 0.25]}^{T}$ . Because all elements of this basis function are equal to a constant ( $0.25$ when $N = 4$ ), the job of the respective coefficient is to represent the average value (DC level) of the input signal. After multiplication by $N = 4$ , the scaled basis function for $k = 0$ is $[1, 1, 1, 1]$ (first row of $A$ ) and the corresponding coefficients in $X$ , $Y$ and $Z$ are 8, 0 and 10, respectively. This indicates that the DC levels of $x$ , $y$ and $z$ are 2, 0 and 2.5, respectively.

This example also illustrates the symmetry of DFT coefficients when the input signal is real-valued. One can note that the coefficients corresponding to $k = 1$ and $k = 3$ are complex conjugates, while the coefficients corresponding to $k = 0$ and $k = 2$ (more generally, $k = N ∕ 2$ , for $N$ even) are always real-valued. In fact, because the basis functions for frequencies $k = 0$ (DC) and $k = N ∕ 2$ (Nyquist) are real-valued, if the input vector is also real-valued, these two DFT coefficients will always be real-valued (in the case $N$ is even). $□$
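The numbers of Example 2.8 can be reproduced with a plain matrix-vector product, as in the sketch below. The helper `matvec` is a hypothetical name; Python's built-in complex type stores $j$ as `1j`.

```python
# 4-point forward DFT matrix written explicitly (rows indexed by k).
A = [[1, 1, 1, 1],
     [1, -1j, -1, 1j],
     [1, -1, 1, -1],
     [1, 1j, -1, -1j]]

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

x = [2, 2, 2, 2]
y = [0, 5, -5, 0]
z = [1, 2, 3, 4]
X, Y, Z = matvec(A, x), matvec(A, y), matvec(A, z)
# DC level: first coefficient divided by N = 4.
dc_levels = [c[0].real / 4 for c in (X, Y, Z)]
print(X, Y, Z, dc_levels)
```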

The following example addresses the situation when $N$ is an odd number. In this case, the Nyquist frequency is not represented in the DFT values.

Example 2.9. Examples of 5-point DFT calculation. Let us calculate the 5-point DFTs of three signals that have five elements but are similar to the ones in the previous example: $x = {[2, 2, 2, 2, 2]}^{T}$ , $y = {[0, 5, - 5, 0, 0]}^{T}$ and $z = {[1, 2, 3, 4, 5]}^{T}$ . In this case, the inverse matrix can be written as

A^{- 1} = \frac{1}{5} [\begin{matrix} 1 & 1 & 1 & 1 & 1 \\ 1 & e^{j 2 π ∕ 5} & e^{j 4 π ∕ 5} & e^{- j 4 π ∕ 5} & e^{- j 2 π ∕ 5} \\ 1 & e^{j 4 π ∕ 5} & e^{- j 2 π ∕ 5} & e^{j 2 π ∕ 5} & e^{- j 4 π ∕ 5} \\ 1 & e^{- j 4 π ∕ 5} & e^{j 2 π ∕ 5} & e^{- j 2 π ∕ 5} & e^{j 4 π ∕ 5} \\ 1 & e^{- j 2 π ∕ 5} & e^{- j 4 π ∕ 5} & e^{j 4 π ∕ 5} & e^{j 2 π ∕ 5} \end{matrix}],

which can be written in Cartesian form using two decimal places as

A^{- 1} \approx \frac{1}{5} [\begin{matrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 0.31 + 0.95 j & - 0.81 + 0.59 j & - 0.81 - 0.59 j & 0.31 - 0.95 j \\ 1 & - 0.81 + 0.59 j & 0.31 - 0.95 j & 0.31 + 0.95 j & - 0.81 - 0.59 j \\ 1 & - 0.81 - 0.59 j & 0.31 + 0.95 j & 0.31 - 0.95 j & - 0.81 + 0.59 j \\ 1 & 0.31 - 0.95 j & - 0.81 - 0.59 j & - 0.81 + 0.59 j & 0.31 + 0.95 j \end{matrix}] .

When $N = 5$ , the DFT angular resolution is $Δ Ω = 2 π ∕ 5$ rad, which corresponds to $72$ degrees. To better visualize the involved angles, as a third alternative to denote $A^{- 1}$ , one can use the notation $m∠ 𝜃^{\circ}$ for complex values and use angles in degrees to write

A^{- 1} = \frac{1}{5} [\begin{matrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 ∠ 72^{\circ} & 1 ∠ 144^{\circ} & 1 ∠ {- 144}^{\circ} & 1 ∠ {- 72}^{\circ} \\ 1 & 1 ∠ 144^{\circ} & 1 ∠ {- 72}^{\circ} & 1 ∠ 72^{\circ} & 1 ∠ {- 144}^{\circ} \\ 1 & 1 ∠ {- 144}^{\circ} & 1 ∠ 72^{\circ} & 1 ∠ {- 72}^{\circ} & 1 ∠ 144^{\circ} \\ 1 & 1 ∠ {- 72}^{\circ} & 1 ∠ {- 144}^{\circ} & 1 ∠ 144^{\circ} & 1 ∠ 72^{\circ} \end{matrix}] .

Performing the direct transform, one finds that the coefficient vectors in this case are $X = A x = {[10, 0, 0, 0, 0]}^{T}$ , $Y = A y = {[0, 5.59 - j 1.82, - 5.59 - 7.69 j, - 5.59 + 7.69 j, 5.59 + j 1.82]}^{T}$ and $Z = A z = {[15, - 2.5 + 3.44 j, - 2.5 + 0.81 j, - 2.5 - 0.81 j, - 2.5 - 3.44 j]}^{T}$ .

The first (DC) coefficients in $X$ , $Y$ and $Z$ are 10, 0 and 15, respectively. After division by $N = 5$ , one can find that the DC levels (or average values) of the time-domain vectors $x$ , $y$ and $z$ are 2, 0 and 3, respectively.

On the symmetry of DFT coefficients when the input signal is real-valued

Figure 2.5 and Figure 2.6 are useful to observe symmetries related to the DFT transform. Starting from Eq. (2.12), if we substitute the original angular frequency $Ω_{k}$ by $- Ω_{k}$ , the new basis $d [n]$ is given by

d [n] = \frac{1}{N} e^{- j Ω_{k} n} = \frac{1}{N} [cos (- Ω_{k} n) + j sin (- Ω_{k} n)] = \frac{1}{N} [cos (Ω_{k} n) - j sin (Ω_{k} n)] .

(2.16)

Comparing $d [n]$ with Eq. (2.12), one can note that $d [n] = b^{*} [n]$ is the complex conjugate⁸ of the function that has frequency $Ω_{k}$ . Therefore, if $X [k] = ⟨ x [n], b [n] ⟩$ , in case one changes the sign of the angular frequency of $b [n]$ from $Ω_{k}$ to $- Ω_{k}$ to obtain a new basis function $d [n] = b^{*} [n]$ , the new coefficient $Z [k] = ⟨ x [n], d [n] ⟩$ has the property $Z [k] = X^{*} [k]$ of being the complex conjugate of $X [k]$ if $x [n]$ is real-valued.

Observing Figure 2.5, the angles corresponding to $k = 1$ and $k = 5$ are opposites. Also, $k = 2$ is the opposite of $k = 4$ . Therefore, the respective pairs of DFT coefficients are complex conjugates. More specifically, $X [1] = X^{*} [5]$ and $X [2] = X^{*} [4]$ , when a 6-point DFT is adopted. Similarly, when assuming $N = 5$ and following the mentioned reasoning, one can conclude from Figure 2.6 that $X [1] = X^{*} [4]$ and $X [2] = X^{*} [3]$ .

In general, if the input signal is real-valued, one has DFT coefficients with the complex conjugate property

X [k'] = X^{*} [N - k'] .

If $N$ is even, this is valid for $k' = 1, \dots, (N ∕ 2 - 1)$ . And if $N$ is an odd number, the range is $k' = 1, \dots, (N - 1) ∕ 2$ . The following example illustrates this fact.

Example 2.10. Coefficients that are complex conjugates when the signal is real-valued. All the time-domain input vectors of Example 2.8 and Example 2.9 are real-valued and can be used to help studying the symmetry of the respective DFT coefficients.

In Example 2.8, the coefficient vectors are $X = {[8, 0, 0, 0]}^{T}$ , $Y = {[0, 5 - j 5, - 10, 5 + j 5]}^{T}$ and $Z = {[10, - 2 + j 2, - 2, - 2 - j 2]}^{T}$ . Recalling that the first element has index $k = 0$ (not $k = 1$ ), it can be seen that the coefficients corresponding to $k = 1$ and $k = 3$ are always complex conjugates.

Similarly, in Example 2.9, the coefficient vectors are $X = {[10, 0, 0, 0, 0]}^{T}$ , $Y = {[0, 5.59 - j 1.82, - 5.59 - 7.69 j, - 5.59 + 7.69 j, 5.59 + j 1.82]}^{T}$ and $Z = {[15, - 2.5 + 3.44 j, - 2.5 + 0.81 j, - 2.5 - 0.81 j, - 2.5 - 3.44 j]}^{T}$ , and one can observe that the coefficients corresponding to the pairs $(k = 1, k = 4)$ and $(k = 2, k = 3)$ are always complex conjugates. $□$

The basis functions $b [n]$ for the DC and Nyquist coefficients are real-valued.⁹ Hence, another property of the DFT for real-valued inputs is that $X [k] = ⟨ x [n], b [n] ⟩$ are real-valued coefficients for $k = 0$ and, in case $N$ is even, $k = N ∕ 2$ . This can be observed in the DFT coefficients discussed in Example 2.10.
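The conjugate symmetry can also be checked numerically for the 4-point and 5-point inputs used in the examples. In the sketch below, `dft` is a direct-summation helper and `max_symmetry_error` is a hypothetical function that returns the largest deviation from $X [k] = X^{*} [N - k]$ .

```python
import cmath

def dft(x):
    """Forward DFT by direct summation."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def max_symmetry_error(X):
    """Largest |X[k] - conj(X[N-k])| over k = 1, ..., N-1."""
    N = len(X)
    return max(abs(X[k] - X[N - k].conjugate()) for k in range(1, N))

Y4 = dft([0, 5, -5, 0])     # real-valued input, N even
Z5 = dft([1, 2, 3, 4, 5])   # real-valued input, N odd
print(max_symmetry_error(Y4), max_symmetry_error(Z5))  # both close to zero
```

For a complex-valued input the same check generally fails, which is why the property is stated only for real-valued signals.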

Advanced: DFT and FFT denoted with the twiddle factor $W_{N}$

For convenience, the twiddle factor $W_{N}$ is defined as

W_{N} = e^{- \frac{j 2 π}{N}}

(2.17)

such that the $k$ -th basis function is $(1 ∕ N) {(W_{N})}^{- nk}$ for the conventional DFT and $(1 ∕ \sqrt{N}) {(W_{N})}^{- nk}$ for the unitary DFT. Twiddle means to lightly turn over or around and is used because the complex number $W_{N}$ has magnitude equal to 1 and changes only the angle of a complex number that is multiplied by it. Each element of the inverse DFT matrix $A^{- 1} = \{ a_{n, k} \}$ is

a_{n, k} = \frac{1}{N} {(W_{N})}^{- nk} .

Hence, the angle of the twiddle factor $W_{N} = e^{- \frac{j 2 π}{N}}$ has magnitude equal to the DFT angular frequency resolution $Δ Ω = 2 π ∕ N$ .

Figure 2.7 illustrates the complex numbers $W_{N}$ as vectors for different values of $N$ . Because $| W_{N} | = 1$ , the twiddle factor is located on the unit circle of the complex plane and effectively specifies an angle. For example, the three angles used by a DFT of $N = 3$ points are 0, 120 and 240 degrees, while a 4-point DFT uses 0, 90, 180 and 270 degrees.

Figure 2.7: The angles corresponding to $W_{N}$ on the unit circle, for $N = 3, 4, 5, 6$ .

For $N = 4$ , the DFT pair is given by

A = [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & W_{4}^{1} & W_{4}^{2} & W_{4}^{3} \\ 1 & W_{4}^{2} & W_{4}^{4} & W_{4}^{6} \\ 1 & W_{4}^{3} & W_{4}^{6} & W_{4}^{9} \end{matrix}]

(2.18)

and

A^{- 1} = \frac{1}{N} [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & W_{4}^{- 1} & W_{4}^{- 2} & W_{4}^{- 3} \\ 1 & W_{4}^{- 2} & W_{4}^{- 4} & W_{4}^{- 6} \\ 1 & W_{4}^{- 3} & W_{4}^{- 6} & W_{4}^{- 9} \end{matrix}] .

Note that $A^{- 1} \neq A^{H}$ because the basis functions have norms equal to $1 ∕ \sqrt{N}$ (the DFT matrix $A$ was not defined as unitary). In this case,

A^{- 1} = \frac{1}{N} A^{H} .

(2.19)

Also note that the reason to have $W_{N}$ defined with a negative exponent in Eq. (2.17) is that the direct transform $A$ then uses positive powers of $W_{N}$ . Another important fact is that ${(W_{N})}^{aN + b} = {(W_{N})}^{b}$ for any $a \in ℤ$ . This can be seen by noting that $W_{N}^{n}$ has a period of $N$ samples and ${(W_{N})}^{aN} = 1$ . Hence, the 4-point direct DFT matrix of Eq. (2.18) can be written as

A = [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & W_{4}^{1} & W_{4}^{2} & W_{4}^{3} \\ 1 & W_{4}^{2} & 1 & W_{4}^{2} \\ 1 & W_{4}^{3} & W_{4}^{2} & W_{4}^{1} \end{matrix}] = [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & - j & - 1 & j \\ 1 & - 1 & 1 & - 1 \\ 1 & j & - 1 & - j \end{matrix}] .
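The periodicity ${(W_{N})}^{aN + b} = {(W_{N})}^{b}$ means that the exponents in the DFT matrix can be reduced modulo $N$ . A minimal numerical check, with names chosen only for this sketch, is:

```python
import cmath

def twiddle(N):
    """Return W_N = exp(-j 2 pi / N)."""
    return cmath.exp(-2j * cmath.pi / N)

N = 4
W = twiddle(N)
# Forward DFT matrix with the full exponents n*k ...
A_full = [[W ** (n * k) for n in range(N)] for k in range(N)]
# ... and with the exponents reduced modulo N.
A_mod = [[W ** ((n * k) % N) for n in range(N)] for k in range(N)]
err = max(abs(A_full[k][n] - A_mod[k][n]) for k in range(N) for n in range(N))
print(err)  # close to zero: both matrices coincide up to rounding
```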

It can be seen that the matrix $A$ (and also $A^{- 1}$ ) has several symmetries that can be exploited to save computations. There is a large set of algorithms for efficiently computing the DFT. These are collectively called fast Fourier transform (FFT) algorithms. Apart from numerical errors, both DFT (the plain matrix multiplication) and FFT algorithms lead to the same result. Therefore, the DFT is the name of the transform and FFT is a fast algorithm (from a large collection) used to speed up calculating the DFT. Implementing an FFT routine is not trivial, but such routines are typically available in software packages such as Matlab/Octave.

There are FFT algorithms specialized in the case where the FFT size $N$ is a power of two (i.e., $N = 2^{a}, a \in ℕ$ ). Other FFT algorithms do not have this restriction but typically are less efficient. The computational costs of DFT and FFT are quite different when $N$ grows large. The DFT matrix multiplication requires computing $N$ inner products, each with $N$ multiplications, so the order of this computation is $O (N^{2})$ , while FFT algorithms achieve $O (N {log}_{2} N)$ . Figure 2.8 illustrates this comparison.

Figure 2.8: Computational cost of the DFT calculated via matrix multiplication versus an FFT algorithm. Note that sizes such as $N = 4096$ are used in standards such as VDSL2, for example, and it is clearly unreasonable to use matrix multiplication in this case.
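To make the comparison concrete, the sketch below implements one textbook member of the FFT family, a recursive radix-2 decimation-in-time algorithm, and checks it against the direct summation. It is an illustrative implementation assumed for this text (it is not the routine used by Matlab/Octave) and requires $N$ to be a power of two.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) DFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def fft_radix2(x):
    """Recursive decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])   # DFT of odd-indexed samples
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]  # twiddle W_N^k
        X[k] = even[k] + t
        X[k + N // 2] = even[k] - t
    return X

x = [float(n % 5) for n in range(16)]
err = max(abs(a - b) for a, b in zip(dft(x), fft_radix2(x)))

N = 4096
cost_dft = N * N                    # order of the matrix multiplication
cost_fft = int(N * math.log2(N))    # order of the FFT
print(err, cost_dft, cost_fft)
```

For $N = 4096$ the two orders are $N^{2} = 16777216$ and $N {log}_{2} N = 49152$ , a ratio of more than 300.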

Instead of using matrix notation, one can use transform equations. The $N$ -point DFT pair (forward and inverse, respectively) is:

X [k] = \sum_{n = 0}^{N - 1} x [n] {(W_{N})}^{nk}

(2.20)

and

x [n] = \frac{1}{N} \sum_{k = 0}^{N - 1} X [k] {(W_{N})}^{- nk} .

(2.21)

For the unitary DFT pair, the normalization factors should be changed to:

X [k] = \frac{1}{\sqrt{N}} \sum_{n = 0}^{N - 1} x [n] {(W_{N})}^{nk}

(2.22)

and

x [n] = \frac{1}{\sqrt{N}} \sum_{k = 0}^{N - 1} X [k] {(W_{N})}^{- nk} .

(2.23)

2.4.4 Haar transform

In signal processing, the Haar transform is intimately related to wavelets. This section will not explore this relation nor detail the generation law of Haar matrices. The goal here is to motivate the study of wavelets by indicating how basis functions with support shorter than $N$ can be useful.

The Haar transform is unitary, such that $A^{- 1} = A^{H}$ . The reader can use the script MatlabOctaveThirdPartyFunctions/haarmtx.m with A=haarmtx(N) to obtain the forward matrix $A$ . For $N = 2$ , the Haar, DCT and unitary DFT forward matrices are all the same:

A = \frac{1}{\sqrt{2}} [\begin{matrix} 1 & 1 \\ 1 & - 1 \end{matrix}] .

For $N = 4$ , the Haar forward matrix is

A = \frac{1}{2} [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & 1 & - 1 & - 1 \\ \sqrt{2} & - \sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & - \sqrt{2} \end{matrix}] .
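As a quick check that a 4-point Haar matrix is unitary, the sketch below computes $A A^{T}$ for the matrix whose rows are $0.5 [1, 1, 1, 1]$ , $0.5 [1, 1, - 1, - 1]$ , $0.5 [\sqrt{2}, - \sqrt{2}, 0, 0]$ and $0.5 [0, 0, \sqrt{2}, - \sqrt{2}]$ . The helper `row_gram` is a hypothetical name used only here.

```python
import math

s = math.sqrt(2.0)
# 4-point Haar forward matrix; the last two rows are non-zero on only two samples.
A = [[0.5 * v for v in row] for row in
     [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [s, -s, 0, 0],
      [0, 0, s, -s]]]

def row_gram(M):
    """Return M M^T: entry (i, j) is the inner product of rows i and j."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in M] for ri in M]

G = row_gram(A)
for row in G:
    print([round(v, 12) for v in row])  # numerically the 4x4 identity
```

Rows with non-overlapping supports, such as the last two, are orthogonal simply because their element-wise product is zero everywhere.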

Figure 2.9: The first four ( $k = 0, 1, 2, 3$ ) and the last ( $k = 31$ ) basis functions for a 32-point Haar. Note that the frequency increases with $k$ .

Figure 2.10: Basis functions $k = 0, 17$ and the last four ( $k = 28, 29, 30, 31$ ) for a 32-point Haar. The support (range of non-zero samples) of the last 15 functions ( $k \geq 17$ ) is just two samples.

Figure 2.9 and Figure 2.10 depict some Haar basis functions. The most interesting aspect is that, in contrast to DCT and DFT, some Haar functions are non-zero only at specific intervals. This characteristic allows the Haar coefficients to provide information not only about “frequency”, but also about “time” (or localization). For example, it is well-known in signal processing that Fourier transforms are better suited for stationary signals while wavelets can better represent transient signals in a time-frequency plane. This kind of behavior can already be visualized by comparing the DFT (and DCT) basis functions with Haar’s and is explored in Application 2.2.

2.4.5 Advanced: Properties of orthogonal and unitary transforms

As commented, for transforms with unitary matrices, $X [k]$ is simply the inner product between $x$ and the $k$ -th basis function.¹⁰ In the inverse transform, the coefficient $X [k]$ is the scaling factor that multiplies the $k$ -th basis function. This interpretation is not valid for a general non-unitary transform pair $A$ and $A^{- 1}$ .

The next paragraphs discuss the fact that unitary matrices do not change the norm of the input vectors. After that, the orthogonal transforms are compared to the unitary, to show that even when the norm of the basis vectors is not equal to one, the orthogonality still enables inverting a matrix using inner products.

Advanced: Unitary matrices lead to energy conservation

Another interesting property of a unitary matrix (or transform) $A$ is that it does not change the norm of the input vector, that is

| | X | | = | | A x | | = | | x | | .

As indicated in Eq. (1.65), the squared norm of a vector corresponds to its energy and, consequently, a unitary matrix conserves energy when $x$ is transformed into $X$ . The following theorem summarizes this result.

Theorem 2. Parseval theorem for unitary block transforms. If $A$ is unitary, the transform $X = A x$ conserves energy ( $| | X | |^{2} = | | x | |^{2}$ ) or, equivalently, $| | X | | = | | x | |$ .

Proof sketch: First, recall that

\begin{array}{rl} ∥ a + b ∥^{2} & = ⟨ a + b, a + b ⟩ = ⟨ a, a ⟩ + ⟨ a, b ⟩ + ⟨ b, a ⟩ + ⟨ b, b ⟩ \\ & = ∥ a ∥^{2} + ∥ b ∥^{2} + 2 ⟨ a, b ⟩ . \end{array}

To visualize that $∥ X ∥ = ∥ x ∥$ , for simplicity, assume that $N = 2$ and $x = {[α, β]}^{T}$ . Hence, $X = α A (:, 0) + β A (:, 1)$ and, because $A$ is unitary, $∥ A (:, 0) ∥ = ∥ A (:, 1) ∥ = 1$ and $⟨ A (:, 0), A (:, 1) ⟩ = 0$ . Therefore, the squared norm is $∥ X ∥^{2} = α^{2} ∥ A (:, 0) ∥^{2} + β^{2} ∥ A (:, 1) ∥^{2} + 2 α β ⟨ A (:, 0), A (:, 1) ⟩ = α^{2} + β^{2} = ∥ x ∥^{2}$ . The same reasoning applies for $N > 2$ , as indicated by Eq. (A.35). $□$

Example 2.11. If the transform is unitary in a coding application, the total error energies in time and frequency domains are equal. In a coding application that uses a unitary transform, the error energy is the same in the time and “frequency” domains. In other words, assume that $A$ is a unitary matrix and the original vector $x = A^{H} X$ is obtained from the coefficients $X$ . If the original coefficients $X$ are encoded and represented by an approximation $\hat{X}$ , which leads to the reconstructed vector $\hat{x} = A^{H} \hat{X}$ , the error vector $e_{f} = X - \hat{X}$ in the frequency domain has the same norm $∥ e_{f} ∥ = ∥ e_{t} ∥$ as the error vector $e_{t} = x - \hat{x}$ in the time domain.

To better understand this fact, one can write:

∥ e_{f} ∥ = ∥ X - \hat{X} ∥ = ∥ A (x - \hat{x}) ∥ = ∥ A e_{t} ∥ = \sqrt{{(A e_{t})}^{H} A e_{t}} = \sqrt{e_{t}^{H} A^{H} A e_{t}} = \sqrt{e_{t}^{H} e_{t}} = ∥ e_{t} ∥ .

(2.24)

The proof above uses the reasoning adopted in Eq. (A.35). $□$
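The equality of the two error norms can be checked numerically. The Python sketch below uses the 2-point unitary matrix (for which Haar, DCT and unitary DFT coincide) and, as an assumption made only for illustration, a uniform quantizer with step 0.5 as the "encoder". The helper names `matvec` and `norm` are hypothetical.

```python
import math

def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def norm(v):
    """Euclidean norm of a real-valued vector."""
    return math.sqrt(sum(c * c for c in v))

r = 1 / math.sqrt(2)
A = [[r, r], [r, -r]]                                   # 2-point unitary transform
A_H = [[A[j][i] for j in range(2)] for i in range(2)]   # A is real, so A^H = A^T

x = [3.0, 1.0]
X = matvec(A, x)                            # forward transform
step = 0.5                                  # hypothetical quantizer step
X_hat = [step * round(c / step) for c in X] # coarse approximation of X
x_hat = matvec(A_H, X_hat)                  # reconstruction in the time domain

e_f = [a - b for a, b in zip(X, X_hat)]     # error in the "frequency" domain
e_t = [a - b for a, b in zip(x, x_hat)]     # error in the time domain
print(norm(e_f), norm(e_t))                 # equal up to rounding
```

The individual error samples differ between the two domains; only the total error energy is preserved.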

Advanced: Orthogonal but not unitary also allows easy inversion

Orthogonal but not unitary matrices also lead to useful transforms. They do not lead to energy conservation, but the coefficients are conveniently obtained by inner products, as for unitary matrices. An important detail is that, when used in transforms, the inner products with the basis vectors of an orthogonal matrix must be normalized by the energies (squared norms) of these vectors, as discussed for the DFT. This aspect is further discussed in the sequel. The term “energy” is used here instead of vector “norm” because the results are valid not only for vectors, but also for continuous-time signals, for example.

An orthogonal matrix $B$ can be written as $B = A D$ , where $A$ is unitary and $D = diag [\sqrt{E_{1}}, \dots, \sqrt{E_{N}}]$ is a diagonal matrix with the norm of the $i$ -th column of $B$ , or square root of its energy $E_{i}$ , composing the main diagonal. The inverse of a matrix with orthogonal columns is

B^{- 1} = diag [1 ∕ \sqrt{E_{1}}, \dots, 1 ∕ \sqrt{E_{N}}] A^{H} = diag [1 ∕ E_{1}, \dots, 1 ∕ E_{N}] B^{H} .

Example 2.12. Inversion of orthogonal but not unitary matrix. For example, ${[3, 3]}^{T}$ and ${[- 1, 1]}^{T}$ form a basis for $ℝ^{2}$ with energies $E_{1} = 18$ and $E_{2} = 2$ , respectively. The matrix $B = [3, - 1; 3, 1]$ with orthogonal columns can be written as $B = A [\sqrt{18}, 0; 0, \sqrt{2}]$ , where $A = [3 ∕ \sqrt{E_{1}}, - 1 ∕ \sqrt{E_{2}}; 3 ∕ \sqrt{E_{1}}, 1 ∕ \sqrt{E_{2}}]$ is orthonormal. The inverse of $B$ is $B^{- 1} = [1 ∕ \sqrt{E_{1}}, 0; 0, 1 ∕ \sqrt{E_{2}}] A^{H}$ or, equivalently, $B^{- 1} = [1 ∕ E_{1}, 0; 0, 1 ∕ E_{2}] B^{H}$ . $□$
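The numbers of Example 2.12 can be verified with a few lines of Python. This sketch builds $B^{- 1}$ from the column energies as $diag [1 ∕ E_{1}, 1 ∕ E_{2}] B^{H}$ and checks that $B^{- 1} B = I$ . The test vector `x` is an arbitrary choice made for illustration.

```python
N = 2
B = [[3.0, -1.0],
     [3.0, 1.0]]   # columns [3, 3]^T and [-1, 1]^T are orthogonal

# Energy (squared norm) of each column
E = [sum(B[r][c] ** 2 for r in range(N)) for c in range(N)]
print(E)  # column energies E_1 and E_2

# Row c of the inverse is column c of B divided by E_c (B is real, so B^H = B^T)
B_inv = [[B[r][c] / E[c] for r in range(N)] for c in range(N)]

# Check that B_inv times B is the identity matrix
P = [[sum(B_inv[i][k] * B[k][j] for k in range(N)) for j in range(N)]
     for i in range(N)]
print(P)

# Coefficients of an arbitrary vector: inner products normalized by the energies
x = [5.0, 1.0]
coef = [sum(B_inv[i][k] * x[k] for k in range(N)) for i in range(N)]
print(coef)  # x = coef[0] * [3, 3]^T + coef[1] * [-1, 1]^T
```

Each coefficient is the inner product of $x$ with a basis vector divided by the energy of that vector, so no general matrix inversion routine is needed.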

If all columns of an orthogonal matrix $B$ have the same energy $E_{i} = E, \forall i$ , its inverse is

B^{- 1} = \frac{1}{E} B^{H} .

(2.25)

Example 2.13. About the orthogonal (non-unitary) DFT transform. For example, in Eq. (2.11) or, equivalently, Eq. (2.20), the DFT was defined by an orthogonal matrix with the same energy $E = 1 ∕ N$ for all basis functions. Therefore, taking $B$ as the inverse DFT matrix, whose columns are the basis functions, Eq. (2.25) leads to $B^{- 1} = N B^{H}$ , which is equivalent to Eq. (2.19). $□$
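The same check can be made for the non-unitary DFT. The sketch below assumes that $B$ is the inverse DFT matrix of Eq. (2.21), with elements $e^{j 2 π n k ∕ N} ∕ N$ , so that its columns are the basis functions. It then confirms that every column has energy $1 ∕ N$ and that $N B^{H}$ is both the inverse of $B$ and the forward DFT matrix.

```python
import cmath

N = 4
# Inverse DFT matrix: element (n, k) is exp(j 2 pi n k / N) / N
B = [[cmath.exp(2j * cmath.pi * n * k / N) / N for k in range(N)]
     for n in range(N)]

# Energy of each column (basis function); all should equal 1/N
E = [sum(abs(B[n][k]) ** 2 for n in range(N)) for k in range(N)]
print(E)

# Candidate inverse from Eq. (2.25) with E = 1/N: N times the Hermitian of B
B_inv = [[N * B[n][k].conjugate() for n in range(N)] for k in range(N)]

# Product B_inv times B should be the identity matrix
P = [[sum(B_inv[i][m] * B[m][j] for m in range(N)) for j in range(N)]
     for i in range(N)]
for row in P:
    print([round(abs(v), 12) for v in row])
```

Element $(k, n)$ of `B_inv` equals $e^{- j 2 π n k ∕ N}$ , which is the forward DFT kernel of Eq. (2.20).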

The next section focuses on Fourier transforms and series, which have many connections with block transforms. For example, extending Eq. (2.25) for continuous-time signals, the basis functions for Fourier series, other than the one for $k = 0$ , have energy $E = T_{0}$ over the duration of a fundamental period $T_{0}$ . Hence, the inverse transform in Eq. (2.27) has the factor $1 ∕ T_{0}$ .

⁵ There are other transforms, such as Laplace, which is not a block transform and will be discussed later in this chapter.

⁶ See [Mal92] for a nice description of transforms, with focus on lapped transforms.

⁷ Other concepts related to different choices for scaling the DFT basis functions are further discussed later, in Sections 2.4.5.0 and 2.7.1.

⁸ Because, as discussed in Section 1.6.1, the cosine is an even function while the sine is an odd function.

⁹ The existence of a Nyquist coefficient requires $N$ to be an even number.

¹⁰ With the complex conjugate of the basis incorporated to the definition of this inner product.