3.4 Linear Time-Invariant Systems
This section is dedicated to the study of linear and time-invariant (LTI) systems.
Figure 3.5 depicts the position of LTI systems and also includes the important subset of systems described by linear, constant-coefficient differential (or difference, in discrete-time) equations (LCCDE). In this text, it is assumed that the LCCDE systems have zero initial conditions or, equivalently, that the system is at rest.
3.4.1 Impulse response and convolution for LTI systems
A LTI system is completely characterized by its impulse response.1 This fact is illustrated in the sequel using a simple example. Assume that a system is LTI and its impulse response is obtained by imposing an input . For example, assuming , one has
This example illustrates that, knowing and using linearity and time-invariance, one can calculate the output.
In general, the input/output relation of a LTI depends on two facts:
- As indicated by Eq. (1.3), any signal can be decomposed as the sum of impulses that are shifted in time (by ) and scaled (by ) in amplitude.
- By time-invariance, each impulse generates a sequence at the output and by linearity these sequences can be scaled and summed to composed the output .
These two key facts lead to the convolution operation:
Note how linearity and time-invariance were both invoked in this proof. The continuous-time convolution is similar:
| (3.2) |
The convolution is so important that it is represented by the operator . For example, denotes the convolution of and . Because is the same symbol used for multiplication in programming languages, the context has to distinguish them.
Figure 3.6 illustrates the convolution between two sequences. The interpretation is that is composed by the sum of several scaled and shifted impulse responses. The plots at the right in Figure 3.6 indicate that is the sum of the two parcels and . The Listing 3.1 provides an example of implementing convolution of discrete-time sequences,2 which gives the same result as the (faster) conv function in Matlab/Octave.
1function y=ak_convolution(x,h) 2% function y=ak_convolution(x,h) 3%convolution between sequences x and h 4N1=length(x); %get the number of samples in x 5N2=length(h); 6N=N1+N2-1; %this is the number of samples in the output y 7y=zeros(1,N); %pre-allocate space for y[n] 8for i=1:N1 %calculate y[n]= sum_k x[k] h[n-k] 9 y(i:i+N2-1)=y(i:i+N2-1)+x(i)*h; %scaling h by x(i) 10end
Note that if the samples of and are organized as elements of vectors of polynomial coefficients, convolving them is equivalent to multiplying the two polynomials. For the example in Figure 3.6 one can obtain the convolution result by multiplying the equivalent polynomials and , which leads to , i. e., the coefficients are the samples of .
If the impulse response completely characterizes a LTI system, all its properties can be inferred from the corresponding impulse response. This is discussed in Appendix A.27.7.
3.4.2 Advanced: Convolution properties
The convolution is:
- Commutative:
- Associative:
- Distributive:
Some other facts about the convolution:
- If and are the duration in samples of two finite-length discrete-time sequences and , then has duration samples
- The expression Eq. (1.51) can also be written as
- If , to obtain for a specific instant , one sums all products for
An example of this last fact is appropriate: if and , the output is given by
y[0] = x1[0] x2[0] = 5 y[1] = x1[0] x2[1] + x1[1] x2[0] = 16 y[2] = x1[0] x2[2] + x1[1] x2[1] + x1[2] x2[0] = 34 y[3] = x1[0] x2[3] + x1[1] x2[2] + x1[2] x2[1] = 40 y[4] = x1[1] x2[3] + x1[2] x2[2] = 37 y[5] = x1[2] x2[3] = 24
where one can note that the indexes of the input sequences sum up to the index of the output. Hence, one can alternatively implement convolution as in Listing 3.2.
1function [y,n]=ak_convolution2(x1,x2,n1,n2) 2% function [y,n]=ak_convolution2(x1,x2,n1,n2) 3%calculate the convolution between the sequences x1 and x2 4%that start at index n1 and n2, respectively 5%Example: [y,n]=ak_convolution2(1:3,5:8,-3,2); stem(n,y) 6N1=length(x1); %get the number of samples in x1 7N2=length(x2); %get the number of samples in x2 8N=N1+N2-1; %this is the number of samples in the output y 9y=zeros(1,N); %pre-allocate space for y[n] 10for i=0:N1-1 %calculate y[n] = sum_k x1[k] x2[n-k] 11 for j=0:N2-1 12 y(i+j+1) = y(i+j+1) + x1(i+1)*x2(j+1); %update 13 end 14end 15n=n1+n2:n1+n2+N-1; %generate the "time" indices
Note that Listing 3.2 allows to explicitly deal with sequences that do not start at . When using Matlab/Octave, it is often necessary to explicitly deal with and generate the time indexes.
It can be observed from Listing 3.2 that the first non-zero sample of the convolution between two sequences and is located at , which are assumed to be the indexes of the first non-zero samples of the two sequences. Similarly, the last non-zero sample of the convolution is the sum of the indexes of the last non-zero samples of and . This can be seen by running Listing 3.3.
1x1=1:3; %define sequence x1 2x2=5:8; %define sequence x2 3nx1=-3:-1; %define abscissa for x1 4nx2=2:5; %define abscissa for x2 5subplot(311); stem(nx1,x1) 6subplot(312); stem(nx2,x2) 7[y,n]=ak_convolution2(x1,x2,-3,2); %calculate convolution 8subplot(313); stem(n,y) %show result with proper time axis
3.4.3 Advanced: Convolution via correlation and vice-versa
Given that convolution and cross-correlation are tightly related, Listing 3.4 illustrates how the Matlab/Octave functions xcorr and conv can be used to calculate the other operation.
1x=(1:4)+j*(4:-1:1); %define some complex signals as row vectors, such 2y=rand(1,15)+j*rand(1,15); %that fliplr inverts the ordering 3%% Correlation via convolution 4Rref=xcorr(x,y); %reference of a cross-correlation 5xcorrViaConv=conv(x,conj(fliplr(y))); %use the second argument 6%% Convolution via correlation 7Cref=conv(x,y); %reference of a convolution 8convViaXcorr=xcorr(x,conj(fliplr(y))); %using the second argument 9%convViaXcorr=conj(fliplr(xcorr(conj(fliplr(x)),y))); %alternative 10%% Make sure they have the same length and compare the results 11if length(x) ~= length(y) %this case requires post-processing because 12 %xcorr assumes the sequences have the same length and uses 13 %zero-padding if they do not. We treat the effect of these zeros: 14 convolutionLength=length(x)+length(y)-1; 15 correlationLength=2*max(length(x),length(y))-1; 16 if length(x) < length(y) %zeros at the end 17 convViaXcorr = convViaXcorr(1:convolutionLength); 18 xcorrViaConv = [xcorrViaConv zeros(1,correlationLength- ... 19 length(xcorrViaConv))]; 20 elseif length(x) > length(y) %zeros at the beginning 21 convViaXcorr = convViaXcorr(end-convolutionLength+1:end); 22 xcorrViaConv = [zeros(1,correlationLength- ... 23 length(xcorrViaConv)) xcorrViaConv]; 24 end 25end 26ErroXcorr= max(abs(Rref - xcorrViaConv)) %calculate maximum errors 27ErroConv = max(abs(Cref - convViaXcorr)) %should be small numbers
As indicated in Listing 3.4, the key step to mimic a convolution via correlation or vice-versa is to flip the signal using , use its complex conjugate and eventually shift by as indicated in
| (3.3) |
3.4.4 Advanced: Discrete-time convolution in matrix notation
Convolution can be denoted in matrix notation. This is especially convenient when dealing with finite-duration discrete-time signals. When these signals are represented by column vectors and , a convolution matrix allows to obtain the convolution result as
| (3.4) |
where is composed by the elements of . It is also possible to create from the elements of and use .
This can be better understood with examples. Considering the goal is to obtain the convolution of two column vectors x1=[1; 2; 3] and x2=[5; 6; 7; 8] (previous example), instead of using y=conv(x1,x2), one can create a matrix
hmatrix = [ 1 0 0 0 2 1 0 0 3 2 1 0 0 3 2 1 0 0 3 2 0 0 0 3]
with the command hmatrix = convmtx(x1,length(x2)), and then calculate the convolution with y=hmatrix * x2. Note that when using row vectors, the commands could be hm = convmtx(x1,length(x2)); x2*hm (in this case, hm would be the transpose of hmatrix). Application 3.8 discusses the issue of repeatedly processing blocks of samples using the convolution in matrix notation. The matrices created by convmtx are Toeplitz matrices and the command toeplitz.m can also be useful to compose convolution matrices.
3.4.5 Approximating continuous-time via discrete-time convolution
Some situations require approximating the continuous-time convolution denoted by Eq. (3.2) via the discrete-time convolution in Eq. (3.1). In such cases, the sampling interval should be used as a normalization factor, as indicated in Eq. (A.31). This factor is adopted in the next example.
Example 3.2. Convolution of a pulse with itself leads to a triangular waveform. Assume a pulse (see in Section 1.3.6) with amplitude V and support s. The convolution corresponds to a triangular waveform: it is 0 for and assumes its maximum value at at 0.2 s, which corresponds to the time that the two pulses overlap completely and is the area of . Figure 3.7 depicts both the pulse and continuous-time convolution .
But Figure 3.7 was in fact obtained by representing via its discrete-time version , and approximating the convolution with the scaled discrete-time convolution , i. e., Eq. (A.31). Listing 3.5 shows the first lines of the script that generated Figure 3.7. After running Listing 3.5, the vectors pulse and triangle are shown in Figure 3.7 with the proper time axes.
1T0=0.2; %pulse "duty cycle" (interval with non-zero amplitude): 0.2 s 2Ts=2e-3; %sampling interval: 2 ms 3N=T0/Ts; %number of samples to represent the pulse "duty cycle" 4A=4; %pulse amplitude: 4 Volts 5pulse=A*[zeros(1,N) ones(1,N) zeros(1,4*N)]; %pulse 6triangle=Ts*conv(pulse,pulse); %(approximated) continuous convolution 7disp(['Convolution peak at ' num2str(T0) ' is ' num2str(A^2*T0)])
Note in line 6 the scaling factor , which leads to the correct values, as indicated by the datatip in Figure 3.7.
3.4.6 Frequency response: Fourier transform of the impulse response
As indicated in Table 3.1, the frequency response of an LTI system is the Fourier transform of its impulse response. In continuous-time it will be denoted by ( in Hz) or ( in rad/s), while in discrete-time the notation is ( in rad). The frequency response is particularly useful because complex exponentials are eigenfunctions of LTI systems as explained in the sequel.
Eigenfunctions
Eigenfunctions are closely related to eigenvectors. if is a matrix (a linear transformation), a non-null vector is an eigenvector of if there is a scalar such that
The scalar is an eigenvalue of corresponding to the eigenvector . In general, the matrix generates a completely new vector , i. e., and have different directions and magnitudes. However, if is an eigenvector of , then , which means that the operation changed only the magnitude of , leaving its direction unchanged (or possibly reversing it in case ).
Similarly, for any LTI system (continuous-time is assumed here), a complex exponential is an eigenfunction because the output is when the input is . The frequency response is useful because the eigenvalue is the value of the frequency response at the specific frequency . This implies that an LTI never creates new frequencies, because it can simply change the magnitude and phase of frequencies that were presented at its input. This fact can be represented as
It is then possible to analyze an LTI in the frequency domain by following the steps:
- f 1.
- Decompose the input as a sum (if the signal is periodic) or integral of complex exponentials. For that, one can use the Fourier series or transform
- f 2.
- Obtain the frequency response, which provides the eigenvalues for all frequencies:
- f 3.
- Conceptually, multiply each complex exponential by its eigenvalue and add the partial results to obtain the system output . In practice, this step is performed by and then using the inverse transform to obtain .
Example 3.3. LTI filtering of continuous-time complex exponential. For example, consider that is the input of an LTI with . The eigenvalues corresponding to each the complex exponential eigenfunctions with frequencies rad/s can be obtained from the frequency response . The Matlab/Octave commands Omega=4; H=1/(j*Omega+2) calculate for the positive frequency. Because is real-valued, the frequency response presents Hermitian symmetry and , i. e., . The commands abs(H), angle(H) can transform from Cartesian to polar: and . Therefore, the output is .
The effect of the LTI system was to impose a gain of and a phase of rad. Figure 3.8 depicts the frequency response for a range rad/s. Because of the Hermitian symmetry (when is real), it is common to plot only the positive frequencies. For example, investigate the command freqs(1,[1 2]), which shows the frequency response of in a different format.
The previous discussion was restricted to exponentials of the form , but a general complex exponential (with ) is also an eigenfunction of any LTI and the eigenvalue is given by the Laplace transform , as represented by
The frequency response of discrete-time LTI systems is also very useful. When the time axis is discretized by sampling, i. e., , the complex exponential becomes , which is more conveniently represented by , where
with .
The discrete-time complex exponential is an eigenfunction of discrete-time LTI systems. To prove that, consider , then
The eigenvalue can be obtained from the transfer function
which is the Z-transform of the system’s impulse response. As a special case, when , the eigenfunction has its amplitude and phase eventually modified by a LTI as depicted by:
Example 3.4. LTI filtering of discrete-time complex exponentials. This example describes how an LTI system filters a complex exponential. Consider that is the input of an LTI with . The eigenvalues can be obtained from when rad. The Matlab/Octave commands omega=pi/2; H=1/(1-0.7*exp(-j*omega)) calculate . Because is real-valued, the frequency response presents Hermitian symmetry. The commands abs(H), angle(H) can transform from Cartesian to polar: . Therefore, the output is . The effect of the LTI system was to impose a gain of and a phase of rad.
Figure 3.9 depicts the frequency response for a range rad. As expected, is periodic because . Given this periodicity and the Hermitian symmetry (when is real), it is common to plot only the positive frequencies in the range . For example, Figure 3.10 shows the plots obtained with the command freqz(1,[1 -0.7]), which shows the frequency response of with the magnitude in dB, the phase in degrees (instead of rad) and the abscissa in a “normalized frequency” that maps into .
3.4.7 Fourier convolution property
The convolution property of Fourier transforms states that the convolution between two signals has a Fourier transform given by the multiplication of their Fourier transforms. In other words, the convolution between these signals by can be obtained as the inverse Fourier transform of the multiplication of their Fourier transforms. For example, assuming continuous-time, the convolution can be obtained with
| (3.5) |
where and are the corresponding Fourier transforms. Similarly, in discrete-time, the convolution result can be written as
| (3.6) |
where and are the corresponding DTFTs and, in this case, denotes the inverse DTFT.
Example 3.5. Convolution using DTFTs and inverse DTFT. For example, if and are two discrete-time complex exponentials with and , their DTFTs are , respectively. The convolution result is
| (3.7) |
and can be obtained via the inverse DTFT of , which is given by
| (3.8) |
Eq. (3.7) can be obtained by using partial fraction expansion of Eq. (3.8).
Due to the duality of Fourier transforms, the multiplication in time-domain corresponds to convolution in frequency-domain between the respective spectra. This is called the multiplication property and corresponds to
| (3.9) |
in continuous-time.
In discrete-time, one needs to take in account that the spectra are periodic and the conventional convolution is not adequate when both signals are periodic. In fact, the multiplication property for discrete-time signals is
| (3.10) |
The integral in Eq. (3.10) differs from a conventional convolution because it is calculated over only one period ( rad in this case) and the result is normalized by this period. This modified convolution is denoted as and is called periodic, cyclic or circular convolution. As in a Fourier series expansion to obtain coefficients , the associated period must be known in order to properly use . The normalization by the period can be incorporated in its definition, which leads to the following expression for the periodic convolution:
| (3.11) |
where both signals are periodic in .
The definition of Eq. (3.11) allows to write Eq. (3.10) as
| (3.12) |
where the period is implicit.
The periodic convolution corresponds to performing the conventional convolution (called “linear” convolution, in contrast to “circular”) between and , where is a single period of that is normalized by the period . The interval to define can be conveniently chosen as , or otherwise, or , or otherwise.
As discussed in the sequel, the periodic convolution is typically associated to FFT-based processing.
3.4.8 Circular and fast convolutions using FFT
Recalling that the FFT corresponds to sampling the DTFT, Eq. (3.6) suggests that FFTs can be used to efficiently compute a convolution. However, even if were not periodic, when its DTFT is sampled (in frequency domain) by the FFT, the FFT values are representing the spectrum of a periodic version of . Consequently, when FFTs substitute DTFTs in Eq. (3.6), the result is not the linear but the circular convolution represented as
| (3.13) |
where both FFTs must have the same length , which plays the whole of the period of a circular convolution.
Listing 3.6 provides an example of using Eq. (3.13). Note how zero-padding is used to assure the element-wise multiplication fft(x,N).*fft(h,N) uses arrays with the same length. The result of Listing 3.6 confirms that, in general, linear and circular convolution differ.
1x=[1 2 3 4]; h=[.9 .8]; %signals to be convolved 2shouldMakeEquivalent=0 %in general, linear and circular conv. differ 3if shouldMakeEquivalent==1 4 N=length(x)+length(h)-1; %to force linear and circular coincide 5else 6 N=max(length(x),length(h)); %required for FFT zero-padding 7end 8linearConv=conv(x,h) %linear convolution 9circularConv=ifft(fft(x,N).*fft(h,N)) %circular convolution, N=4 10%circularConv=cconv(x,h,N) %note that Matlab has the cconv function
If the value of shouldMakeEquivalent is made equal to 1, Listing 3.6 returns the same results for both linear and circular convolutions. In fact, when the FFT length is at least the number of non-zero samples of the convolution output, the circular and linear convolution results coincide. This suggests that Eq. (3.13) can be used to calculate a linear convolution, provided that the FFT length is made long enough.
Obtaining a linear convolution via FFTs is trickier when one of the signals to be convolved has infinite duration. In this case, it is obviously not possible to find a large enough FFT length to use Eq. (3.13). However, if the other signal has finite duration, it is feasible and often used in practice to segment the long signal into blocks and calculate the linear convolution by sequentially calculating one FFT per block and properly combining the results. There are basically two alternatives to combine the intermediate results that are called overlap-add and overlap-save methods, and are roughly equivalent. Listing 3.7 illustrates the former.
1x=1:1000; %infinite duration (or "long") input signal 2h=ones(1,3); %non-zero samples of finite-length impulse response 3Nh=length(h); %number of impulse response non-zero samples 4Nb=5; %block (segment) length 5Nfft=2^nextpow2(Nh+Nb-1); %choose a power of 2 FFT size 6Nx = length(x); %number of input samples 7H = fft(h,Nfft); %pre-compute impulse response DFT, with zero-padding 8beginIndex = 1; %initialize index for first sample of current block 9y = zeros(1,Nh+Nx-1); %pre-allocate space for convolution output 10while beginIndex <= Nx %loop over all blocks 11 endSample = min(beginIndex+Nb-1,Nx);%last sample of current block 12 Xblock = fft(x(beginIndex:endSample),Nfft); %DFT of block 13 yblock = ifft(Xblock.*H,Nfft); %get circular convolution result 14 outputIndex = min(beginIndex+Nfft-1,Nh+Nx-1); %auxiliary variab. 15 y(beginIndex:outputIndex) = y(beginIndex:outputIndex) + ... 16 yblock(1:outputIndex-beginIndex+1); %add parcial result 17 beginIndex = beginIndex+Nb; %shift begin of block 18end 19stem(y-conv(x,h)) %compare the error with result from conv
A typical application of the overlap-add and overlap-save methods is to compute the output of a LTI system represented by a finite-duration impulse response (i. e., a FIR filter, as will be discussed in Section 3.14). Listing 3.7 provides an example where the impulse response has only three non-zero samples, and the long input signal is segmented in blocks of Nb=5 samples.