A.14  Linear Algebra

A.14.1  Inner products and norms

In general, a norm is a function f (also denoted as ∥⋅∥) that takes a vector x and outputs a non-negative real number y ∈ ℝ_+. The norm y is often interpreted as the distance between x and the adopted origin. A norm can be defined for any real or complex vector space, and the vector x will belong to this adopted vector space. For example, assuming the Euclidean space and a vector x = [x_1, x_2, …, x_D] with real-valued elements, the so-called L2 norm is denoted as ∥x∥_2 = √(x_1^2 + x_2^2 + ⋯ + x_D^2).

Another useful norm is the Manhattan norm ∥x∥_1 = |x_1| + |x_2| + ⋯ + |x_D|, also known as the L1 norm, which corresponds to the sum of the absolute values of the elements of the vector x.

A third common norm is the maximum norm ∥x∥_∞ = max{|x_1|, |x_2|, …, |x_D|}, also known as the infinity norm or Chebyshev norm, which measures the maximum distance along any dimension of the vector. For example, the maximum norm of x = [3, 10, 20] is ∥x∥_∞ = 20.
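
These norms can also be computed with the built-in norm function in Matlab/Octave. The short sketch below, reusing the example vector above, is only illustrative:

x=[3,10,20]; %example vector
normL2=norm(x) %L2 (Euclidean) norm: sqrt(3^2+10^2+20^2)
normL1=norm(x,1) %L1 (Manhattan) norm: 3+10+20=33
normInf=norm(x,inf) %maximum (infinity) norm: 20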

To be a valid norm, the function ∥⋅∥ needs to have some fundamental properties: a) ∥x∥ ≥ 0; b) if ∥x∥ = 0 then x is the zero vector; c) ∥αx∥ = |α| ∥x∥; and d) (triangle inequality) ∥x + y∥ ≤ ∥x∥ + ∥y∥.
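
These properties can be verified numerically for the L2 norm; the sketch below uses arbitrarily chosen vectors and scalar:

x=[2,-3,1]; y=[4,0,-5]; alpha=-2.5; %arbitrary vectors and scalar
norm(alpha*x)-abs(alpha)*norm(x) %property c): should be numerically zero
norm(x+y)<=norm(x)+norm(y) %property d): triangle inequality, should return 1 (true)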

The inner product in a K-dimensional space with complex-valued vectors is:

⟨a, b⟩ = ∑_{i=1}^{K} a_i b_i^* = ∥a∥ ∥b∥ cos(θ).
(A.32)

See Table 2.2 for alternative definitions of inner products.

When a = b, Eq. (A.32) can be written as

∥a∥ = √(⟨a, a⟩).
(A.33)

An inner product a,b can also be written as a multiplication of two vectors

⟨a, b⟩ = b^H a,
(A.34)

where in this case both are assumed to be column vectors (row vectors would suggest a b^H).
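
For instance, with complex-valued column vectors, Eq. (A.34) can be evaluated in Matlab/Octave with the operator ', which implements the Hermitian (conjugate) transpose. The vectors in the sketch below are arbitrary:

a=[1+2j; -3j; 2]; b=[4; 1-1j; -2+5j]; %arbitrary complex column vectors
innerProd=b'*a %<a,b> = b^H a
innerProd2=sum(a.*conj(b)) %same result via element-wise products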

In case a vector b = Aa is obtained via multiplication by a unitary matrix A, Eq. (A.33) and Eq. (A.34) lead to

∥b∥^2 = ⟨Aa, Aa⟩ = (Aa)^H (Aa) = a^H A^H A a = a^H I a = ∥a∥^2
(A.35)

because A^H A = I, which indicates that the unitary matrix A does not alter the norm of the input vector.
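
A minimal numerical check of this norm preservation, here using the normalized DFT matrix as one example of a unitary matrix:

N=4; A=fft(eye(N))/sqrt(N); %normalized DFT matrix, which is unitary: A'*A = I
a=randn(N,1)+1j*randn(N,1); %arbitrary complex vector
norm(A*a)-norm(a) %should be numerically zero: the norm is preserved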

A.14.2  Projection of a vector using inner product

To explore the properties and advantages of linear transforms, it is useful to study the vector projection (or simply projection) of a vector onto another one.


Figure A.2: The perpendicular line for obtaining the projection p_xy of a vector x onto y in ℝ^2. Note that θ = cos^{-1}(⟨x,y⟩/(∥x∥ ∥y∥)) is the angle between x and y, and the inner product ⟨x,y⟩ = ∥p_xy∥ ∥y∥ = ∥p_yx∥ ∥x∥.

Using ℝ^2 for simplicity, note that the projection p_xy of a vector x onto another vector y is obtained by choosing the point along the direction of y that has the minimum distance to x. The line connecting x to this point is perpendicular to y, as indicated in Figure A.2. Using the Pythagorean theorem and assuming that 0 ≤ θ ≤ π/2, the norm ∥p_xy∥ of the projection can be written as ∥p_xy∥ = ∥x∥ cos(θ). If π/2 < θ ≤ π, then ∥p_xy∥ = ∥x∥ cos(π − θ) = −∥x∥ cos(θ). Hence, in general,

∥p_xy∥ = ∥x∥ |cos(θ)| = |⟨x,y⟩| / ∥y∥.
(A.36)

For a given norm ∥y∥, the larger the inner product, the larger the norm of the projection. The same is valid for ∥p_yx∥, as depicted in Figure A.3:

∥p_yx∥ = ∥y∥ |cos(θ)| = |⟨x,y⟩| / ∥x∥.


Figure A.3: Projections of vectors x and y onto each other. Note that the error vectors are orthogonal to the directions of the respective projections.

Any vector z can be written as its norm ∥z∥ multiplied by a unit-norm vector z/∥z∥ that indicates its direction. Note that the vector p_yx is in the same or the opposite direction of x, which can be specified by the unit-norm vector sign(cos(θ)) x/∥x∥. Hence, one has

p_yx = ∥p_yx∥ sign(cos(θ)) x/∥x∥ = (⟨x,y⟩ / ∥x∥^2) x,

where ⟨x,y⟩ / ∥x∥^2 is a scaling factor that can be negative but does not change the direction of x. Similarly, the projection p_xy of x onto y is given by

p_xy = (⟨x,y⟩ / ∥y∥^2) y.

Note that if the vector y has unitary norm, the absolute value of the inner product ⟨x,y⟩ coincides with the norm ∥p_xy∥. These expressions are valid for other vector spaces, such as ℝ^n, n > 2.
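
As a quick sketch, these scaling formulas can be evaluated directly. The vectors below, x = [2,2] and y = [5,1], are the same ones adopted later in Listing A.1:

x=[2,2]; y=[5,1]; %example vectors
p_xy=(sum(x.*y)/sum(y.*y))*y %projection of x onto y
p_yx=(sum(x.*y)/sum(x.*x))*x %projection of y onto x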

Using geometry to interpret a projection vector is very useful. When one projects x onto y, the result p_xy is the “best” representation (in the minimum distance, or least-squares, sense) of x that y alone can provide. The error vector e_xy = x − p_xy is orthogonal to y (and, consequently, to p_xy), i.e., ⟨e_xy, y⟩ = 0. The vector e_xy represents what should be added to p_xy in order to completely represent x, and the orthogonality ⟨e_xy, y⟩ = 0 indicates that y cannot contribute any further.

Figure A.3 completes the example. It was obtained using the following Matlab/Octave script. The example assumes x = [2,2] and y = [5,1], which have an angle θ of approximately 33.7 degrees between them.

Listing A.1: MatlabOctaveCodeSnippets/snip_transforms_projection.m
x=[2,2]; %define a vector x
y=[5,1]; %define a vector y
magx=sqrt(sum(x.*x)) %magnitude (norm) of x
magy=sqrt(sum(y.*y)) %magnitude (norm) of y
innerprod=sum(x.*y) %<x,y>=||x|| ||y|| cos(theta)
theta=acos(innerprod / (magx*magy)) %angle between x and y
%obs: could use acosd to have angle in degrees
disp(['Angle is ' num2str(180*theta/pi) ' degrees']);
%check if inverting direction is needed:
invertDirection=1; if theta>pi/2, invertDirection=-1; end
%find the projection of x over y (called p_xy) and p_yx
mag_p_xy = magx*abs(cos(theta)) %magnitude of p_xy
%directions: obtained as y normalized by magy, x by magx
y_unitary=y/magy; %normalize y by magy to get unitary vec.
p_xy = mag_p_xy * y_unitary * invertDirection %p_xy
mag_p_yx = magy*abs(cos(theta)); %magnitude of p_yx
x_unitary=x/magx; %normalize x by magx to get unitary vec.
p_yx = mag_p_yx * x_unitary * invertDirection %p_yx
%test orthogonality of error vectors:
error_xy = x - p_xy; %we know: p_xy + error_xy = x
sum(error_xy.*y) %this inner product should be zero
error_yx = y - p_yx; %we know: p_yx + error_yx = y
sum(error_yx.*x) %this inner product should be zero

The concept of projections via inner products will be extensively used in our discussion about transforms. For example, the coefficients of a Fourier series of a signal x(t) correspond to the normalized projections of x(t) onto the corresponding basis functions. A large value for the norm of a projection indicates that the given basis function contributes significantly to the task of representing x(t).

Chapter 2 discusses block transforms and relies on orthogonal functions. Hence, it is useful to discuss why orthogonality is important.

A.14.3  Orthogonal basis allows inner products to transform signals

Assume the existence of a set of orthogonal vectors composing the basis of a vector space. For example, in ℝ^2, a convenient basis is the standard (canonical) one, composed of ī = [1,0] and j̄ = [0,1]. The inner product ⟨ī, j̄⟩ = 0 indicates that these two vectors are orthogonal. The orthogonality property simplifies the following analysis task: given any vector x, the coefficients α and β of the linear combination x = αī + βj̄ can be easily found by using the inner products α = ⟨x, ī⟩ and β = ⟨x, j̄⟩. The following theorem proves this important result.

Theorem 4. Analysis via inner products. If the basis set {b_1, …, b_N} of an inner product space (e.g., Euclidean) is orthonormal, the coefficients of a linear combination x = ∑_{i=1}^{N} α_i b_i that generates a vector x can be calculated by the inner product α_i = ⟨x, b_i⟩ between x and the respective vector b_i in the basis set.

Proof: Recall the following properties of a dot product: ⟨ā + b̄, c̄⟩ = ⟨ā, c̄⟩ + ⟨b̄, c̄⟩ and ⟨αā, b̄⟩ = α⟨ā, b̄⟩, and write

⟨x, b_j⟩ = ⟨∑_{i=1}^{N} α_i b_i, b_j⟩ = ∑_{i=1}^{N} α_i ⟨b_i, b_j⟩

Because the basis vectors are orthonormal, ⟨b_i, b_j⟩ = 1 if i = j and ⟨b_i, b_j⟩ = 0 if i ≠ j. Therefore,

⟨x, b_j⟩ = α_j

because all the terms in the above summation are zero but the one for i = j.   

Example A.1. Obtaining the coefficients of a linear combination of basis functions. A simple example can illustrate the analysis procedure: the coefficients of x = 4ī + 8j̄ are α = 4 and β = 8 by inspection, but they could be calculated as α = ⟨x, ī⟩ = ⟨[4,8], [1,0]⟩ = 4 and β = ⟨x, j̄⟩ = ⟨[4,8], [0,1]⟩ = 8. Note that the zeros in these basis vectors make the calculation overly simple. Another example may be more useful to highlight orthogonality, and the following alternative basis set is assumed: ī = [0.5, 0.866] and j̄ = [0.866, −0.5]. Let x = 3ī + 2j̄ = [3.232, 1.598]. Given x, the task is again to find the coefficients such that x = αī + βj̄. Due to the orthonormality of ī and j̄, one can, for example, obtain α = ⟨x, ī⟩ = ⟨[3.232, 1.598], [0.5, 0.866]⟩ = 3. These computations can be done in Matlab/Octave as follows.

i=[0.5,0.866], j=[0.866 -0.5] %two orthonormal vectors
x=3*i+2*j %create an arbitrary vector x to be analyzed
alpha=sum(x.*i), beta=sum(x.*j) %find inner products

In contrast, let us modify the previous example, adopting a non-orthogonal basis. Assume that ī = [1,1] and j̄ = [0,1] (note that ⟨ī, j̄⟩ = 1, hence the vectors are not orthogonal). Let x = 3ī + 2j̄ = [3,5]. In this case, ⟨x, ī⟩ = 8 and ⟨x, j̄⟩ = 5, which do not coincide with the coefficients α = 3 and β = 2. How could the coefficients be properly recovered in this case? An alternative is to write the problem as a set of linear equations, organize it in matrix notation and find the coefficients by inverting the matrix. In Matlab/Octave:

Listing A.2: MatlabOctaveCodeSnippets/snip_transforms_non_orthogonal_basis.m
i=transpose([1,1]), j=transpose([0,1]) %non-orthogonal
x=3*i+2*j %create an arbitrary vector x to be analyzed
A=[i j]; %organize basis vectors as a matrix
temp=inv(A)*x; alpha=temp(1), beta=temp(2) %coefficients

In summary, the analysis procedure for many linear transforms (such as Fourier, Z, etc.) obtains the coefficients via an inner product, and the procedure can be interpreted as calculating the projection of x onto the basis vector ī (possibly scaled by the norm of ī).

This discussion leads to the conclusion that a basis with orthogonal vectors significantly simplifies the task: in this case, the analysis procedure can be done via inner products. This also applies when the basis vectors are orthogonal but do not have unitary norm; in that case, each inner product must be normalized by the squared norm of the respective basis vector. Orthogonal basis vectors are a property of all block transforms discussed in this text.
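
A brief sketch of this normalization, assuming an arbitrary orthogonal (but not orthonormal) basis in the plane:

b1=[1,1]; b2=[1,-1]; %orthogonal basis vectors with norm sqrt(2) (not unitary)
x=3*b1+2*b2 %x=[5,1], an arbitrary vector to be analyzed
alpha=sum(x.*b1)/sum(b1.*b1) %=3, inner product normalized by ||b1||^2
beta=sum(x.*b2)/sum(b2.*b2) %=2, inner product normalized by ||b2||^2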

A.14.4  Moore-Penrose pseudoinverse

Pseudoinverses are generalizations of the inverse matrix and are useful when the given matrix does not have an inverse (for example, when the matrix is not square or not full rank).

The Moore-Penrose pseudoinverse has several interesting properties and is well suited to least-squares problems. It provides the minimum-norm least-squares solution z = X^+ b to the problem of finding a vector z that minimizes the norm ∥Xz − b∥ of the error vector. Assuming X is an m × n matrix, the pseudoinverse provides the solution for a set of overdetermined or underdetermined equations if m > n or m < n, respectively.
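
As a small illustration of this least-squares role, the sketch below assumes an arbitrary overdetermined system; X and b are just example values:

X=[1 2; 3 4; 5 6]; b=[1; 0; 2]; %overdetermined: m=3 equations, n=2 unknowns
z=pinv(X)*b %minimum-norm least-squares solution z = X^+ b
z2=X\b %backslash gives the same least-squares solution here (X has full column rank)
residual=norm(X*z-b) %the norm that is being minimized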

Two properties of a Moore-Penrose pseudoinverse X+ are

X^H = X^H X X^+,
(A.37)

and

X^H = X^+ X X^H.
(A.38)

With r being the rank of X, if r = n (i.e., the columns of X are linearly independent), the pseudoinverse can be calculated as

X^+ = (X^H X)^{-1} X^H,
(A.39)

and if r = m (i.e., the rows of X are linearly independent),

X^+ = X^H (X X^H)^{-1}.
(A.40)

Whenever available, instead of Eq. (A.39) or Eq. (A.40), which require linear independence of the columns or rows, one should use a robust method to obtain X^+ such as the pinv function in Matlab/Octave, which adopts an SVD to calculate X^+. Listing A.3 illustrates such calculations and the convenience of relying on pinv when the independence of rows or columns is not guaranteed.

Listing A.3: MatlabOctaveCodeSnippets/snip_systems_pseudo_inverse.m
test=3; %choose the case below
switch test
    case 1 %m>n (overdetermined/tall) and linearly independent columns
        X=[1 2 3; -4+1j -5+1j -6+1j; 1 0 0; 0 1 0]; %4 x 3
    case 2 %n>m (underdetermined/fat) and linearly independent rows
        X=[1 2 3; -4+1j -5+1j -6+1j]; %2 x 3
    case 3 %neither rows nor columns of X are linearly independent
        %rows X(2,:)=2*X(1,:) and columns X(:,4)=3*X(:,1)
        X=[1 2 3 3; 2 4 6 6; -4+1j -5+1j -6+1j -12+3*1j]; %3 x 4
end
Xp_svd = pinv(X) %pseudoinverse via SVD
Xp_over = inv(X'*X)*X' %valid when columns are linearly independent
Xp_under = X'*inv(X*X') %valid when rows are linearly independent
rank(X'*X) %X'*X is square but may not be full rank
rank(X*X') %X*X' is square but may not be full rank
Xhermitian=X'*X*pinv(X) %equal to X' (this property is always valid)
Xhermitian2=pinv(X)*X*X' %equal to X' (this property is always valid)
maxError_over=max(abs(Xp_svd(:)-Xp_over(:))) %error for overdetermined case
maxError_under=max(abs(Xp_svd(:)-Xp_under(:))) %error for underdetermined case