2.4  Block Transforms

Generic block processing was presented in Section 1.4. This section discusses block transforms as a specific kind of block-oriented signal processing, covering the DCT, DFT, and Haar transforms, which are among the most widely used transforms, together with the already presented KLT (or PCA).

In many signal processing tasks, it is useful to have two related transformations called a transform pair. One transform is the inverse of the other, with the inverse undoing the forward transform. When dealing with block transforms, one simply uses linear algebra (as detailed in Appendix A.14), and both transforms are simple matrix multiplications. The pair of transforms is defined by a pair of matrices. The matrices can be rectangular, as in lapped transforms, but most block transforms are defined by a pair of N × N square matrices A and A^{-1}, and the jargon N-point transform indicates their dimension. The inverse matrix A^{-1} is assumed to exist and can “undo” the transformation A (and vice-versa). Here, the forward (or direct) transformation is denoted as

X = Ax,

while the inverse transformation is denoted here as

x = A^{-1}X.
(2.10)

The vector X is called the transform of x and the elements of X are called coefficients. In this text, vectors are represented by bold lower case letters, but the vector with coefficients will be denoted by capital letters to be consistent with the jargon (unfortunately, vectors of coefficients such as X can be confused with matrices, but the context will distinguish them).

An important concept is:

2.4.1  Advanced: Unitary or orthonormal transforms

A matrix B with orthonormal columns is called unitary. The rows of a unitary matrix are also orthonormal. If one takes any pair of distinct rows or columns of the unitary matrix B, their inner product is zero (they are orthogonal) and their norms are equal to one.

Unitary matrices are widely used in transforms because their inverse is simply the conjugate transpose as indicated in:

B^{-1} = B^H = (B^*)^T,

where H denotes the Hermitian (conjugate transposition) and * denotes element-wise complex conjugation.

Considering Eq. (2.10), if the basis functions (columns of A^{-1}) are orthonormal, then A^{-1} = A^H. Consequently, the rows of the direct transform A are the complex-conjugates of the basis functions. Two important facts are:

Observing that the inverse of a unitary real matrix is its transpose

To get insight on why B^{-1} = B^H for a unitary B, first consider the case in which the elements of B are real numbers, such that B^H = B^T. The product B^T B = I is the identity matrix (that is, B^{-1} = B^T) because the inner product between a row of B^T (a column of B) and a column of B is one when they coincide (main diagonal of I) and zero otherwise (due to their orthogonality). In other words, B^T B = I because the columns of B are orthonormal. In case B is complex-valued, one has B^{-1} = B^H via a similar reasoning.
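This property is easy to check numerically in Matlab/Octave. The sketch below is only an illustration: B is an arbitrary real orthonormal matrix obtained with qr, and N = 5 is an arbitrary choice.

N=5;
[B, R]=qr(randn(N));      %B has orthonormal columns (real orthonormal matrix)
disp(norm(B'*B - eye(N))) %approximately 0: B^H B = I
disp(norm(inv(B) - B'))   %approximately 0: the inverse equals the Hermitian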

The next paragraphs present the DCT transform, which can be used for both frequency analysis and coding.

2.4.2  DCT transform

An example of a unitary matrix transform very useful for coding is the discrete cosine transform (DCT). When N = 4, the corresponding matrices are

A = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1.307 & 0.541 & -0.541 & -1.307 \\ 1 & -1 & -1 & 1 \\ 0.541 & -1.307 & 1.307 & -0.541 \end{bmatrix}

and

A^{-1} = A^H = A^T = \frac{1}{2}\begin{bmatrix} 1 & 1.307 & 1 & 0.541 \\ 1 & 0.541 & -1 & -1.307 \\ 1 & -0.541 & -1 & 1.307 \\ 1 & -1.307 & 1 & -0.541 \end{bmatrix}

for the direct and inverse transforms, respectively.

Because they have finite duration, one can represent the input signals of block transforms as vectors or sequences. For instance, the DCT basis function corresponding to the second column of A^{-1} could be written as

b[n] = \frac{1}{2}\left(1.307\,\delta[n] + 0.541\,\delta[n-1] - 0.541\,\delta[n-2] - 1.307\,\delta[n-3]\right).

Considering that the first element is a_{0,0} (the first index in equations is zero, not one as in Matlab/Octave), an element a_{n,k} of the N-point inverse DCT matrix A^{-1} = \{a_{n,k}\} can be obtained by

a_{n,k} = w_k \cos\left(\frac{\pi(2n+1)k}{2N}\right),

where w_k is a scaling factor that enforces the basis vectors to have unit norm, i.e., w_k = \sqrt{1/N} for k = 0 and w_k = \sqrt{2/N} for k = 1, 2, \ldots, N-1.

Example 2.5. DCT calculation in Matlab/Octave. Listing 2.1 illustrates how the DCT matrices can be obtained in Matlab/Octave.

Listing 2.1: MatlabOctaveFunctions/ak_dctmtx.m
function [A, Ainverse] = ak_dctmtx(N)
% function [A, Ainverse] = ak_dctmtx(N)
%Calculate the DCT-II matrix of dimension N x N. A and Ainverse are
% the direct and inverse transform matrices, respectively. Ex:
%  [A, Ainv]=ak_dctmtx(4); x=[1:4]'+1j; A*x, dct(x)
Ainverse=zeros(N,N); %pre-allocate space
scalingFactor = sqrt(2/N); %make base functions to have norm = 1
for n=0:N-1 %a loop helps to clarify obtaining inverse matrix
    for k=0:N-1 %first array element is 1, so use A(n+1,k+1):
        Ainverse(n+1,k+1)=scalingFactor*cos((pi*(2*n+1)*k)/(2*N));
    end
end
Ainverse(1:N,1)=Ainverse(1:N,1)/sqrt(2); %different scaling for k=0
%unitary transform: direct transform is the Hermitian of the inverse
A = transpose(Ainverse); %Matrix is real, so transpose is Hermitian

The matrices obtained with ak_dctmtx.m can be used to perform the transformations but this has only pedagogical value. There are algorithms for computing the DCT that are faster than a plain matrix multiplication. Check the functions dct and idct in Matlab/Octave and scipy.fftpack.dct in Python.    

Example 2.6. The DCT basis functions are cosines of distinct frequencies. Figure 2.2 shows four basis functions of a 32-point DCT.


Figure 2.2: The first three (k = 0, 1, 2) and the last (k = 31) basis functions of a 32-point DCT. Note that the frequency increases with k.

Figure 2.2 indicates that, in order to represent signals composed of “low frequencies”, DCT coefficients of low order (small values of k) can be used, while higher-order coefficients are more useful for signals composed of “high frequencies”. For example, the commands:

N=32;k=3;n=0:N-1;x=7*cos(k*(pi*(2*n+1)/(2*N)));stem(x); X=dct(x)

return a vector X with all elements equal to zero except X(4)=28, which corresponds to k = 3 (recall that the first index in Matlab/Octave is 1, not 0). Using a larger k will increase the frequency and the order of the corresponding DCT coefficient.

Example 2.7. Example of a DCT transformation. Assuming a 4-point DCT and x = [1, 2, 3, 4]^T, the forward transform can be obtained in this case with

X = \begin{bmatrix} X[0] \\ X[1] \\ X[2] \\ X[3] \end{bmatrix} = Ax = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1.307 & 0.541 & -0.541 & -1.307 \\ 1 & -1 & -1 & 1 \\ 0.541 & -1.307 & 1.307 & -0.541 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} \approx \begin{bmatrix} 5 \\ -2.230 \\ 0 \\ -0.158 \end{bmatrix}.

In this case,

X[0] = \langle x, A(0,:) \rangle = \left\langle [1,2,3,4]^T,\; 0.5\,[1,1,1,1]^T \right\rangle = 5,

where A(0,:) represents the first (0-th) row of matrix A. Similarly, X[2] is given by

X[2] = \langle x, A(2,:) \rangle = \left\langle [1,2,3,4]^T,\; 0.5\,[1,-1,-1,1]^T \right\rangle = 0,

and so on.

The previous expressions provide intuition on the direct transform. In the inverse transform, when reconstructing x, the coefficient X[k] is the scaling factor that multiplies the k-th basis function in the linear combination x = A^{-1}X. Still considering the 4-point DCT, the inverse corresponds to

\begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & 1.307 & 1 & 0.541 \\ 1 & 0.541 & -1 & -1.307 \\ 1 & -0.541 & -1 & 1.307 \\ 1 & -1.307 & 1 & -0.541 \end{bmatrix}\begin{bmatrix} 5 \\ -2.230 \\ 0 \\ -0.158 \end{bmatrix} = \frac{1}{2}\left\{ 5\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} - 2.230\begin{bmatrix} 1.307 \\ 0.541 \\ -0.541 \\ -1.307 \end{bmatrix} + 0\begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix} - 0.158\begin{bmatrix} 0.541 \\ -1.307 \\ 1.307 \\ -0.541 \end{bmatrix} \right\}.

Note that X[2] = 0 and, consequently, the basis function 0.5[1,-1,-1,1]^T is not used to reconstruct x. The reason is that this specific basis function is orthogonal to x and does not contribute to its reconstruction. As another example, the DCT coefficient X[0] = 5 indicates that a scaling factor equal to 5 should multiply the corresponding 0-th basis function 0.5[1,1,1,1]^T in order to reconstruct the vector x in the time domain.

Alternatively, the matrix multiplication can be described by a transform equation. For example, the DCT coefficients can be calculated by

X[k] = \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} x[n] \cos\left(\frac{\pi(2n+1)k}{2N}\right), \quad k = 1, \ldots, N-1

and

X[0] = \sqrt{\frac{1}{N}} \sum_{n=0}^{N-1} x[n].

As mentioned, the k-th element (or coefficient) X[k] of X can be calculated as the inner product of x with the complex conjugate of the k-th basis function. This can be done because the DCT basis functions are orthogonal among themselves, as discussed in Section A.14.3. The factors \sqrt{2/N} and \sqrt{1/N} are used to have basis vectors with norms equal to 1. The k-th basis vector is a cosine with angular frequency \pi k/N and phase \pi k/(2N).
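As a sanity check, the transform equations above can be evaluated directly and compared with the dct function. The sketch below assumes an arbitrary value N = 8 and a random test vector.

N=8; n=0:N-1; x=randn(1,N); %arbitrary test signal
X=zeros(1,N);
X(1)=sqrt(1/N)*sum(x); %k = 0 (DC) coefficient
for k=1:N-1 %remaining coefficients, from the DCT transform equation
    X(k+1)=sqrt(2/N)*sum(x.*cos(pi*(2*n+1)*k/(2*N)));
end
disp(max(abs(X-dct(x)))) %approximately 0: same values as dct(x)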

Section 2.10 discusses examples of DCT applications, including coding (signal compression). One advantage of adopting block transforms in coding applications is the distinct importance of coefficients. In the original domain, all samples (or pixels in image processing) have the same importance, but in the transform domain, coefficients typically have distinct importance. Hence, the coding scheme can concentrate on representing the most important coefficients and even discard the non-important ones. Another application of DCTs is in frequency analysis (finding the most relevant frequencies that compose a signal). But, in this application, the DFT is more widely adopted than the DCT.

2.4.3  DFT transform

Like the DCT, the discrete Fourier transform (DFT) is a very useful tool to accomplish frequency analysis, where the goal is to estimate the coefficients of basis functions that are distinguished by their frequencies. The DFT is related to the discrete-time Fourier series, which also uses cosines \cos(2\pi nk/N) and sines \sin(2\pi nk/N), k = 0, 1, \ldots, N-1, as basis functions, and will be discussed in this chapter. While the DCT uses cosines and its matrices are real, the DFT uses complex exponentials as basis functions.

Using Euler’s formula, Eq. (A.1), complex numbers provide a more concise representation of sines and cosines and the k-th DFT basis function is given by

\frac{1}{N} e^{j2\pi nk/N} = \frac{1}{N}\left[\cos\left(\frac{2\pi nk}{N}\right) + j\sin\left(\frac{2\pi nk}{N}\right)\right],
(2.11)

where n = 0, 1, \ldots, N-1 expresses time evolution, as for the DCT. The value k determines the angular frequency \Omega_k = 2\pi k/N (in radians) of the basis function b[n]. Eq. (2.11) can be written as

b[n] = \frac{1}{N} e^{j\Omega_k n} = \frac{1}{N}\left[\cos(\Omega_k n) + j\sin(\Omega_k n)\right],
(2.12)

which gives the n-th value of the time-domain basis function b[n].

Hence, one can notice that the DFT operates with an angular frequency resolution \Delta\Omega = 2\pi/N, such that \Omega_k = k\,\Delta\Omega. For instance, considering N = 4, the resolution is \Delta\Omega = 2\pi/4 = \pi/2 rad and the DFT uses basis functions with angular frequencies \Omega_k = 0, \pi/2, \pi and 3\pi/2 rad. This is depicted in Figure 2.3.


Figure 2.3: Angles \Omega_k = 0, \pi/2, \pi and 3\pi/2 rad (or, equivalently, discrete-time angular frequencies) used by a DFT of N = 4 points when k varies from k = 0 to 3, respectively.

When k = 2 in Figure 2.3, the angular frequency is \Omega_k = \pi rad, and the basis function is b[n] = \frac{1}{N} e^{j\pi n} = \frac{1}{N}\left(e^{j\pi}\right)^n = \frac{1}{N}(-1)^n. The signal (-1)^n corresponds to the highest angular frequency \Omega = \pi rad among discrete-time signals, and this frequency is called the Nyquist frequency, as discussed in Section 1.8.3.

To complement Figure 2.3, which uses an even value N, Figure 2.4 describes the DFT angular frequencies when N = 5.


Figure 2.4: Angles \Omega_k = 0, 2\pi/5, 4\pi/5, 6\pi/5 and 8\pi/5 rad (which in degrees correspond to 0, 72, 144, -144 and -72) used by a DFT of N = 5 points when k varies from k = 0 to 4, respectively.

As depicted in Figure 2.4, when N is odd, the Nyquist frequency Ω = π rad is not used by the DFT.

The N-point forward and inverse DFT pair can be written as a pair of equations. The forward (direct) transform is

X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi nk/N}, \quad \text{for } k = 0, \ldots, N-1,
(2.13)

which corresponds to calculating the k-th DFT coefficient X[k] = \langle x[n], e^{j2\pi nk/N} \rangle as the inner product between the time-domain signal x[n] and the k-th basis vector of Eq. (2.11) multiplied by N. In this calculation, the adopted inner product for complex-valued vectors of Eq. (A.32) is responsible for Eq. (2.20) using e^{-j2\pi nk/N} instead of e^{j2\pi nk/N}.

The inverse DFT transform equation is

x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j2\pi nk/N}, \quad \text{for } n = 0, \ldots, N-1.
(2.14)

Eq. (2.13) and Eq. (2.14) are the most adopted definitions of a DFT pair. But the DFT scaling factor 1/N in Eq. (2.21) can be changed to 1/\sqrt{N}, and the same factor 1/\sqrt{N} can also be used in Eq. (2.13), if one wants to have a DFT pair with basis functions of unit norm. In this case the transform is called the unitary DFT and has the properties already discussed in Section 2.4.1.
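The sketch below illustrates the unitary DFT by simply rescaling the outputs of fft and ifft (the test vector and the value N = 16 are arbitrary choices); note the perfect reconstruction and the energy conservation.

N=16; x=randn(N,1)+1j*randn(N,1); %arbitrary complex test vector
X=fft(x)/sqrt(N);   %unitary forward DFT
xr=ifft(X)*sqrt(N); %unitary inverse DFT
disp(norm(x-xr))           %approximately 0: perfect reconstruction
disp(abs(norm(X)-norm(x))) %approximately 0: energy conservation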

Instead of using transform equations, sometimes it is convenient to use a matrix notation for transforms. With a matrix notation, it is easier to interpret Eq. (2.21) as composing the time-domain signal x[n] as a linear combination of basis functions (1/N)e^{j2\pi nk/N} scaled by the respective coefficients X[k].

To develop the matrix notation, one can observe that an element a_{n,k} of the N-point inverse DFT matrix A^{-1} = \{a_{n,k}\} is given by

a_{n,k} = \frac{1}{N} e^{j2\pi nk/N}.
(2.15)

Because n and k appear in Eq. (2.15) only through their product nk, a_{n,k} = a_{k,n} and the inverse DFT matrix A^{-1} is symmetric, that is, A^{-1} = (A^{-1})^T.

For the sake of comparison, when N = 4, the inverse DCT and DFT matrices are, respectively:

A_{DCT}^{-1} = \frac{1}{2}\begin{bmatrix} 1 & 1.307 & 1 & 0.541 \\ 1 & 0.541 & -1 & -1.307 \\ 1 & -0.541 & -1 & 1.307 \\ 1 & -1.307 & 1 & -0.541 \end{bmatrix} \quad \text{and} \quad A_{DFT}^{-1} = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & j & -1 & -j \\ 1 & -1 & 1 & -1 \\ 1 & -j & -1 & j \end{bmatrix}.

The example shows that the inverse DFT matrix A_{DFT}^{-1} is symmetric, while the inverse DCT matrix A_{DCT}^{-1} is not. It can also be noticed that the inverse DFT matrix A_{DFT}^{-1} is complex-valued, having elements with real and imaginary parts. In contrast, the inverse DCT matrix A_{DCT}^{-1} is real-valued. However, the first and third columns of A_{DFT}^{-1} are real-valued (inside the brackets, all their elements are +1 or -1). For N = 4, these columns correspond to k = 0 and k = N/2, which are called the DC and Nyquist frequencies, respectively. All N values corresponding to the basis vector of frequency k = 0 (DC) are given by a_{n,0} = 1/4 = 0.25, n = 0, \ldots, N-1, as indicated in the first column of A_{DFT}^{-1}. And for N even, all N values corresponding to the basis vector of the Nyquist frequency k = N/2 (third column of A_{DFT}^{-1}) are given by a_{n,N/2} = (1/N)(-1)^n, n = 0, \ldots, N-1.

From Eq. (2.15) and adopting N = 4, the DFT inverse matrix is given by

A^{-1} = \frac{1}{N}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & e^{j\pi/2} & e^{j\pi} & e^{j3\pi/2} \\ 1 & e^{j\pi} & e^{j2\pi} & e^{j3\pi} \\ 1 & e^{j3\pi/2} & e^{j3\pi} & e^{j9\pi/2} \end{bmatrix} = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & j & -1 & -j \\ 1 & -1 & 1 & -1 \\ 1 & -j & -1 & j \end{bmatrix}.

Note that the complex numbers e^{j\Omega} with angles \Omega = 0, \pi/2, \pi and 3\pi/2 (which is equivalent to -\pi/2) rad and magnitudes equal to 1 can be written as 1, j, -1 and -j, respectively. This simplifies writing the elements of A^{-1} for N = 4 and other values of N.
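The sketch below builds this DFT pair numerically from Eq. (2.15) and checks the forward transform against the fft function (the test vector is an arbitrary choice).

N=4; n=0:N-1; k=0:N-1;
Ainv=exp(1j*2*pi*(n')*k/N)/N; %inverse DFT matrix, elements (1/N)e^(j 2 pi n k/N)
A=N*Ainv';                    %forward matrix, using A^{-1} = A^H/N
x=[1 2 3 4].';
disp([A*x fft(x)])            %the two columns coincide (up to rounding)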

Example 2.8. Examples of 4-point DFT calculation. Let us calculate the 4-point DFTs of three signals represented by the column vectors x = [2, 2, 2, 2]^T, y = [0, 5, -5, 0]^T and z = [1, 2, 3, 4]^T. In this case, the forward and inverse matrices can be written as

A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{bmatrix} \quad \text{and} \quad A^{-1} = \frac{A^H}{N} = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & j & -1 & -j \\ 1 & -1 & 1 & -1 \\ 1 & -j & -1 & j \end{bmatrix},

respectively. The coefficient vectors are X = Ax = [8, 0, 0, 0]^T, Y = Ay = [0,\; 5-j5,\; -10,\; 5+j5]^T and Z = Az = [10,\; -2+j2,\; -2,\; -2-j2]^T.

Because the basis functions (columns of A1) are orthogonal, but not orthonormal, the k-th coefficient of the forward transform is obtained by performing the inner product between x and the complex conjugate of the k-th basis function multiplied by N = 4.

The first basis function (corresponding to k = 0) is always the first column of A^{-1}, which in this case is [0.25, 0.25, 0.25, 0.25]^T. Because all elements of this basis function are a constant (0.25 when N = 4), the job of the respective coefficient is to represent the average value (DC level) of the input signal. After multiplication by N = 4, the scaled basis function for k = 0 is [1, 1, 1, 1] (first row of A) and the corresponding coefficients in X, Y and Z are 8, 0 and 10, respectively. This indicates that the DC levels of x, y and z are 2, 0 and 2.5, respectively.

This example also illustrates the symmetry of DFT coefficients when the input signal is real-valued. One can note that the coefficients corresponding to k = 1 and k = 3 are complex conjugates, while the coefficients corresponding to k = 0 and k = 2 (more generally, k = N/2, for N even) are always real-valued. In fact, because the basis functions for frequencies k = 0 (DC) and k = N/2 (Nyquist) are real-valued, if the input vector is also real-valued, these two DFT coefficients will always be real-valued (in the case N is even).
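These specific results can be reproduced with the fft function, as in the sketch below.

x=[2 2 2 2].'; y=[0 5 -5 0].'; z=[1 2 3 4].';
X=fft(x) %returns [8 0 0 0]'
Y=fft(y) %returns [0 5-5j -10 5+5j]'
Z=fft(z) %returns [10 -2+2j -2 -2-2j]'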

The following example addresses the situation when N is an odd number. In this case, the Nyquist frequency is not represented in the DFT values.

Example 2.9. Examples of 5-point DFT calculation. Let us calculate the 5-point DFTs of three signals that have five elements but are similar to the ones in the previous example: x = [2, 2, 2, 2, 2]^T, y = [0, 5, -5, 0, 0]^T and z = [1, 2, 3, 4, 5]^T. In this case, the inverse matrix can be written as

A^{-1} = \frac{1}{5}\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & e^{j2\pi/5} & e^{j4\pi/5} & e^{-j4\pi/5} & e^{-j2\pi/5} \\ 1 & e^{j4\pi/5} & e^{-j2\pi/5} & e^{j2\pi/5} & e^{-j4\pi/5} \\ 1 & e^{-j4\pi/5} & e^{j2\pi/5} & e^{-j2\pi/5} & e^{j4\pi/5} \\ 1 & e^{-j2\pi/5} & e^{-j4\pi/5} & e^{j4\pi/5} & e^{j2\pi/5} \end{bmatrix},

which can be written in Cartesian form using two decimal places as

A^{-1} \approx \frac{1}{5}\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 0.31 + 0.95j & -0.81 + 0.59j & -0.81 - 0.59j & 0.31 - 0.95j \\ 1 & -0.81 + 0.59j & 0.31 - 0.95j & 0.31 + 0.95j & -0.81 - 0.59j \\ 1 & -0.81 - 0.59j & 0.31 + 0.95j & 0.31 - 0.95j & -0.81 + 0.59j \\ 1 & 0.31 - 0.95j & -0.81 - 0.59j & -0.81 + 0.59j & 0.31 + 0.95j \end{bmatrix}.

When N = 5, the DFT angular resolution is \Delta\Omega = 2\pi/5 rad, which corresponds to 72 degrees. To better visualize the involved angles, as a third alternative to denote A^{-1}, one can use the notation m\angle\theta for complex values and angles in degrees to write

A^{-1} = \frac{1}{5}\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1\angle 72^\circ & 1\angle 144^\circ & 1\angle{-144^\circ} & 1\angle{-72^\circ} \\ 1 & 1\angle 144^\circ & 1\angle{-72^\circ} & 1\angle 72^\circ & 1\angle{-144^\circ} \\ 1 & 1\angle{-144^\circ} & 1\angle 72^\circ & 1\angle{-72^\circ} & 1\angle 144^\circ \\ 1 & 1\angle{-72^\circ} & 1\angle{-144^\circ} & 1\angle 144^\circ & 1\angle 72^\circ \end{bmatrix}.

Performing the direct transform, one finds that the coefficient vectors in this case are X = Ax = [10, 0, 0, 0, 0]^T, Y = Ay = [0,\; 5.59 - j1.82,\; -5.59 - j7.69,\; -5.59 + j7.69,\; 5.59 + j1.82]^T and Z = Az = [15,\; -2.5 + j3.44,\; -2.5 + j0.81,\; -2.5 - j0.81,\; -2.5 - j3.44]^T.

The first (DC) coefficients in X, Y and Z are 10, 0 and 15, respectively. After division by N = 5, one can find that the DC levels (or average values) of the time-domain vectors x, y and z are 2, 0 and 3, respectively.

This example also illustrates the symmetry of DFT coefficients when the input signal is real-valued. One can note that the coefficients corresponding to the pairs (k = 1, k = 4) and (k = 2, k = 3) are complex conjugates. Because N = 5 is odd, the Nyquist frequency is not represented and, among the DFT coefficients, only the DC coefficient (k = 0) is guaranteed to be real-valued for a real-valued input.

On the symmetry of DFT coefficients when the input signal is real-valued

Assuming a DFT with N = 6 points, Figure 2.5 indicates the angular frequency \Omega_k corresponding to each DFT coefficient X[k]. For instance, the DC coefficient corresponding to k = 0 is the first element of the vector with DFT coefficients. For N even, the Nyquist frequency is used and its respective coefficient is the element corresponding to k = N/2.


Figure 2.5: Mapping between angular frequencies \Omega_k and the corresponding DFT coefficient X[k] for N = 6.

Figure 2.6 is similar to Figure 2.5, but for a DFT with N = 5 points. In the case of an odd value of N, the Nyquist frequency is not used in the DFT calculation.

Another aspect depicted by both Figure 2.6 and Figure 2.5 is that the negative frequencies are located after the positive ones in the DFT coefficients vector. The Matlab/Octave function fftshift can be used to move the negative frequencies to the first elements of this vector.


Figure 2.6: Mapping between angular frequencies \Omega_k and the corresponding DFT coefficient X[k] for N = 5.

Figure 2.5 and Figure 2.6 are also useful to observe symmetries related to the DFT transform. Starting from Eq. (2.12), if we substitute the original angular frequency \Omega_k by -\Omega_k, the new basis function d[n] is given by

d[n] = \frac{1}{N} e^{-j\Omega_k n} = \frac{1}{N}\left[\cos(-\Omega_k n) + j\sin(-\Omega_k n)\right] = \frac{1}{N}\left[\cos(\Omega_k n) - j\sin(\Omega_k n)\right].
(2.16)

Comparing d[n] with Eq. (2.12), one can note that d[n] = b^*[n] is the complex conjugate of the function that has frequency \Omega_k. Therefore, if X[k] = \langle x[n], b[n] \rangle, in case one changes the sign of the angular frequency of b[n] from \Omega_k to -\Omega_k to obtain a new basis function d[n] = b^*[n], the new coefficient Z[k] = \langle x[n], d[n] \rangle has the property Z[k] = X^*[k] of being the complex conjugate of X[k] if x[n] is real-valued.

Observing Figure 2.5, the angles corresponding to k = 1 and k = 5 are opposites. Also, the angle for k = 2 is the opposite of the one for k = 4. Therefore, the respective pairs of DFT coefficients are complex conjugates. More specifically, X[1] = X^*[5] and X[2] = X^*[4] when a 6-point DFT is adopted. Similarly, when assuming N = 5 and following the mentioned reasoning, one can conclude from Figure 2.6 that X[1] = X^*[4] and X[2] = X^*[3].

In general, if the input signal is real-valued, the DFT coefficients have the complex conjugate property

X[k'] = X^*[N - k'].

If N is even, this is valid for k' = 1, \ldots, (N/2 - 1). And if N is an odd number, the range is k' = 1, \ldots, (N-1)/2. The following example illustrates this fact.
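This property can also be checked numerically for any real-valued vector, as in the sketch below (N = 6 and the random signal are arbitrary choices; recall that X[k'] corresponds to X(k'+1) in Matlab/Octave).

N=6; x=randn(N,1); %arbitrary real-valued signal
X=fft(x);
for k=1:N-1 %k' = 1,...,N-1
    disp(abs(X(k+1)-conj(X(N-k+1)))) %approximately 0 for every k'
end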

Example 2.10. Coefficients that are complex conjugates when the signal is real-valued. All the time-domain input vectors of Example 2.8 and Example 2.9 are real-valued and can be used to help study the symmetry of the respective DFT coefficients.

In Example 2.8, the coefficient vectors are X = [8, 0, 0, 0]^T, Y = [0,\; 5-j5,\; -10,\; 5+j5]^T and Z = [10,\; -2+j2,\; -2,\; -2-j2]^T. Recalling that the first element has index k = 0 (not k = 1), it can be seen that the coefficients corresponding to k = 1 and k = 3 are always complex conjugates.

Similarly, in Example 2.9, the coefficient vectors are X = [10, 0, 0, 0, 0]^T, Y = [0,\; 5.59 - j1.82,\; -5.59 - j7.69,\; -5.59 + j7.69,\; 5.59 + j1.82]^T and Z = [15,\; -2.5 + j3.44,\; -2.5 + j0.81,\; -2.5 - j0.81,\; -2.5 - j3.44]^T, and one can observe that the coefficients corresponding to the pairs (k = 1, k = 4) and (k = 2, k = 3) are always complex conjugates.

The basis functions b[n] for the DC and Nyquist coefficients are real-valued. Hence, another property of the DFT for real-valued inputs is that the coefficients X[k] = \langle x[n], b[n] \rangle are real-valued for k = 0 and, in case N is even, for k = N/2. This can be observed in the DFT coefficients discussed in Example 2.10.

Advanced: DFT and FFT denoted with the twiddle factor W_N

For convenience, the twiddle factor W_N is defined as

W_N = e^{-j2\pi/N}
(2.17)

such that the k-th basis function is (1/N)(W_N)^{-nk} for the conventional DFT and (1/\sqrt{N})(W_N)^{-nk} for the unitary DFT. Twiddle means to lightly turn over or around, and the term is used because the complex number W_N has magnitude equal to 1 and changes only the angle of a complex number that is multiplied by it. Each element of the inverse DFT matrix A^{-1} = \{a_{n,k}\} is

a_{n,k} = \frac{1}{N} (W_N)^{-nk}.

Hence, the twiddle factor W_N = e^{-j2\pi/N} corresponds to the DFT angular frequency resolution \Delta\Omega = 2\pi/N.

Figure 2.7 illustrates the complex numbers W_N as vectors for different values of N. Because |W_N| = 1, the twiddle factor is located on the unit circle of the complex plane and effectively informs an angle. For example, the three angles used by a DFT of N = 3 points are 0, 120 and 240 degrees, while a 4-point DFT uses 0, 90, 180 and 270 degrees.
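A minimal sketch (using the same values of N as in Figure 2.7) lists the angles, in degrees, visited by the powers of W_N:

for N=[3 4 5 6]
    WN=exp(-1j*2*pi/N); %twiddle factor, magnitude 1
    disp([N mod(angle(WN.^(0:N-1))*180/pi,360)]) %angles of W_N^0,...,W_N^(N-1)
end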

(a) N = 3   (b) N = 4   (c) N = 5   (d) N = 6
Figure 2.7: The angles corresponding to W_N on the unit circle, for N = 3, 4, 5, 6.

For N = 4, the DFT pair is given by

A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & W_4^{1} & W_4^{2} & W_4^{3} \\ 1 & W_4^{2} & W_4^{4} & W_4^{6} \\ 1 & W_4^{3} & W_4^{6} & W_4^{9} \end{bmatrix}
(2.18)

and

A^{-1} = \frac{1}{N}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & W_4^{-1} & W_4^{-2} & W_4^{-3} \\ 1 & W_4^{-2} & W_4^{-4} & W_4^{-6} \\ 1 & W_4^{-3} & W_4^{-6} & W_4^{-9} \end{bmatrix}.

Note that A^{-1} \neq A^H because the basis functions have norms equal to 1/\sqrt{N} (the DFT matrix A was not defined as unitary). In this case,

A^{-1} = \frac{1}{N} A^H.
(2.19)

Also note that the reason to have W_N defined with a negative exponent in Eq. (2.17) is that the direct transform A uses positive powers of W_N. Another important fact is that (W_N)^{aN+b} = (W_N)^{b} for any a \in \mathbb{Z}. This can be seen by noting that (W_N)^n has a period of N samples and (W_N)^{aN} = 1. Hence, the 4-point direct DFT matrix of Eq. (2.18) can be written as

A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & W_4^{1} & W_4^{2} & W_4^{3} \\ 1 & W_4^{2} & 1 & W_4^{2} \\ 1 & W_4^{3} & W_4^{2} & W_4^{1} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{bmatrix}.

It can be seen that the matrix A (and also A^{-1}) has several symmetries that can be exploited to save computations. There is a large set of algorithms for efficiently computing the DFT. These are collectively called fast Fourier transform (FFT) algorithms. Apart from numerical errors, both the DFT (the plain matrix multiplication) and FFT algorithms lead to the same result. Therefore, DFT is the name of the transform and FFT is a fast algorithm (from a large collection) used to speed up the DFT calculation. Implementing an FFT routine is not trivial, but FFTs are typically available in software such as Matlab/Octave.

There are FFT algorithms specialized in the case where the FFT size N is a power of two (i.e., N = 2^a, a \in \mathbb{N}). Other FFT algorithms do not have this restriction but are typically less efficient. The computational costs of the DFT and FFT are quite different when N grows large. The DFT matrix multiplication requires computing N inner products, so the order of this computation is O(N^2), while FFT algorithms achieve O(N \log_2 N). Figure 2.8 illustrates this comparison.
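A rough timing comparison can be obtained with the sketch below (N = 2048 is an arbitrary choice; the measured times depend on the machine, but the FFT is typically faster by orders of magnitude).

N=2048; x=randn(N,1);
A=exp(-1j*2*pi*(0:N-1)'*(0:N-1)/N); %forward DFT matrix, built once
tic; X1=A*x; timeMatrix=toc;
tic; X2=fft(x); timeFFT=toc;
disp([timeMatrix timeFFT]) %matrix multiplication versus FFT
disp(max(abs(X1-X2)))      %both give the same coefficients, up to rounding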


Figure 2.8: Computational cost of the DFT calculated via matrix multiplication versus an FFT algorithm. Note that N = 4096 is used in standards such as VDSL2, for example, and for such values it is clearly unreasonable to use matrix multiplication.

Instead of using matrix notation, one can use transform equations. The N-point DFT pair (forward and inverse, respectively) is:

X[k] = \sum_{n=0}^{N-1} x[n]\,(W_N)^{nk}
(2.20)

and

x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\,(W_N)^{-nk}.
(2.21)

For the unitary DFT pair, the normalization factors should be changed to:

X[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x[n]\,(W_N)^{nk}
(2.22)

and

x[n] = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X[k]\,(W_N)^{-nk}.
(2.23)

2.4.4  Haar transform

In signal processing, the Haar transform is intimately related to wavelets. This section will not explore this relation nor detail the generation law of Haar matrices. The goal here is to motivate the study of wavelets by indicating how basis functions with support shorter than N can be useful.

The Haar transform is unitary, such that A^{-1} = A^H. The reader can use the script MatlabOctaveThirdPartyFunctions/haarmtx.m with A=haarmtx(N) to obtain the forward matrix A. For N = 2, the Haar, DCT and unitary DFT forward matrices are all the same:

A = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.

For N = 4, the Haar forward matrix is

A = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ \sqrt{2} & -\sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{bmatrix}.
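A quick check in Matlab/Octave, writing this N = 4 matrix explicitly (without relying on haarmtx.m), confirms the unitary property and the perfect reconstruction; the test vector is an arbitrary choice.

A=(1/2)*[1 1 1 1; 1 1 -1 -1; sqrt(2) -sqrt(2) 0 0; 0 0 sqrt(2) -sqrt(2)];
disp(norm(A*A'-eye(4))) %approximately 0: A is unitary (real orthonormal)
X=A*[1 2 3 4].'  %forward Haar transform of an example vector
xr=A'*X          %inverse transform recovers [1 2 3 4]'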


Figure 2.9: The first four (k = 0, 1, 2, 3) and the last (k = 31) basis functions for a 32-point Haar transform. Note that the frequency increases with k.


Figure 2.10: Basis functions k = 0, 17 and the last four (k = 28, 29, 30, 31) for a 32-point Haar transform. The support (range of non-zero samples) of the last 15 functions (k \geq 17) is just two samples.

Figure 2.9 and Figure 2.10 depict some Haar basis functions. The most interesting aspect is that, in contrast to the DCT and DFT, some Haar functions are non-zero only at specific intervals. This characteristic allows the Haar coefficients to provide information not only about “frequency”, but also about “time” (or localization). For example, it is well-known in signal processing that Fourier transforms are better suited for stationary signals, while wavelets can better represent transient signals in a time-frequency plane. This kind of behavior can already be visualized by comparing the DFT (and DCT) bases with Haar's, and it is explored in Application 2.2.

2.4.5  Advanced: Properties of orthogonal and unitary transforms

As commented, for transforms with unitary matrices, X[k] is simply the inner product between x and the k-th basis function. In the inverse transform, the coefficient X[k] is the scaling factor that multiplies the k-th basis function. This interpretation is not valid for a general non-unitary transform pair A and A^{-1}.

The next paragraphs discuss the fact that unitary matrices do not change the norm of the input vectors. After that, orthogonal transforms are compared to unitary ones, to show that even when the norm of the basis vectors is not equal to one, the orthogonality still enables inverting a matrix using inner products.

Advanced: Unitary matrices lead to energy conservation

Another interesting property of a unitary matrix (or transform) A is that it does not change the norm of the input vector, that is

||X|| = ||Ax|| = ||x||.

As indicated in Eq. (1.65), the squared norm of a vector corresponds to its energy and, consequently, a unitary matrix conserves energy when x is transformed into X. The following theorem summarizes this result.

Theorem 2. Parseval theorem for unitary block transforms. If A is unitary, the transform X = Ax conserves energy (\|X\|^2 = \|x\|^2) or, equivalently, \|X\| = \|x\|.

Proof sketch: First, recall that

\|a + b\|^2 = \langle a + b, a + b \rangle = \langle a, a \rangle + \langle a, b \rangle + \langle b, a \rangle + \langle b, b \rangle = \|a\|^2 + \|b\|^2 + 2\langle a, b \rangle.

To visualize that \|X\| = \|x\|, for simplicity, assume that N = 2 and x = [\alpha, \beta]^T. Hence, X = \alpha A(:,0) + \beta A(:,1) and, because A is unitary, \|A(:,0)\| = \|A(:,1)\| = 1 and \langle A(:,0), A(:,1) \rangle = 0. Therefore, the squared norm is \|X\|^2 = \alpha^2 \|A(:,0)\|^2 + \beta^2 \|A(:,1)\|^2 + 2\alpha\beta \langle A(:,0), A(:,1) \rangle = \alpha^2 + \beta^2 = \|x\|^2. The same reasoning applies for N > 2, as indicated by Eq. (A.35).

Example 2.11. If the transform is unitary in a coding application, the total error energies in the time and frequency domains are equal. In a coding application that uses a unitary transform, the error energy in the time and “frequency” domains are the same. In other words, assume that A is a unitary matrix and the original vector x = A^H X is obtained from the coefficients X. If the original coefficients X are encoded and represented by an approximation \hat{X}, the error vector e_f = X - \hat{X} has the same norm \|e_f\| = \|e_t\| as the error vector in the time domain e_t = x - \hat{x}.

To better understand this fact, one can write:

\|e_f\| = \|X - \hat{X}\| = \|A(x - \hat{x})\| = \|A e_t\| = \sqrt{(A e_t)^H (A e_t)} = \sqrt{e_t^H A^H A\, e_t} = \|e_t\|.
(2.24)

The proof above uses the reasoning adopted in Eq. (A.35).    
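The sketch below illustrates Example 2.11 numerically, assuming ak_dctmtx.m from Listing 2.1 is in the path; the input vector and the crude quantizer (rounding the coefficients to multiples of 0.25) are arbitrary choices.

N=16; x=randn(N,1); %arbitrary input vector
[A, Ainverse]=ak_dctmtx(N); %unitary DCT pair from Listing 2.1
X=A*x;              %transform coefficients
Xhat=round(4*X)/4;  %crude quantization of the coefficients
xhat=Ainverse*Xhat; %reconstructed time-domain vector
disp(norm(X-Xhat)-norm(x-xhat)) %approximately 0: ||e_f|| = ||e_t||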

Advanced: Orthogonal but not unitary also allows easy inversion

Orthogonal but not unitary matrices also lead to useful transforms. They do not lead to energy conservation, but the coefficients are conveniently obtained by inner products, as for unitary matrices. An important detail is that, when used in transforms, the inner products with the basis vectors of an orthogonal matrix must be normalized by the respective norms, as discussed for the DFT. This aspect is further discussed in the sequel. The term “energy” is used here instead of vector “norm” because the results are valid not only for vectors, but also for continuous-time signals, for example.

An orthogonal matrix B can be written as B = AD, where A is unitary and D = \mathrm{diag}[\sqrt{E_1}, \ldots, \sqrt{E_N}] is a diagonal matrix with the norm of the i-th column of B, or square root of its energy E_i, composing the main diagonal. The inverse of a matrix with orthogonal columns is

B^{-1} = \mathrm{diag}[1/\sqrt{E_1}, \ldots, 1/\sqrt{E_N}]\, A^H = \mathrm{diag}[1/E_1, \ldots, 1/E_N]\, B^H.

Example 2.12. Inversion of an orthogonal but not unitary matrix. For example, [3, 3]^T and [1, -1]^T form a basis for \mathbb{R}^2 with energies E_1 = 18 and E_2 = 2, respectively. The matrix B = [3, 1; 3, -1] with orthogonal columns can be written as B = A\,[\sqrt{18}, 0; 0, \sqrt{2}], where A = [3/\sqrt{E_1}, 1/\sqrt{E_2}; 3/\sqrt{E_1}, -1/\sqrt{E_2}] is orthonormal. The inverse of B is B^{-1} = [1/\sqrt{E_1}, 0; 0, 1/\sqrt{E_2}]\, A^H or, equivalently, B^{-1} = [1/E_1, 0; 0, 1/E_2]\, B^H.
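The sketch below checks this example numerically.

B=[3 1; 3 -1]; %columns are orthogonal but not orthonormal
E1=18; E2=2;   %energies (squared norms) of the columns
Binv=diag([1/E1 1/E2])*B'; %B^{-1} = diag(1/E_i) B^H (B is real, so B' = B^H)
disp(norm(Binv-inv(B)))    %approximately 0: matches the usual inverse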

If all columns of an orthogonal matrix B have the same energy E_i = E, \forall i, its inverse is

B^{-1} = \frac{1}{E} B^H.
(2.25)

Example 2.13. About the orthogonal (non-unitary) DFT transform. For example, in Eq. (2.11) or, equivalently, Eq. (2.20), the DFT transform was defined by an orthogonal matrix with the same energy E = 1/N for all basis functions. Therefore, Eq. (2.25) leads to B^{-1} = N B^H, which is equivalent to Eq. (2.19).

The next section focuses on Fourier transforms and series. The connections with block transforms are plentiful. For example, extending Eq. (2.25) to continuous-time signals, the basis functions for the Fourier series, other than the one for k = 0, have energy E = T_0 over the duration of a fundamental period T_0. Hence, the inverse transform in Eq. (2.27) has the factor 1/T_0.