Windows for spectral analysis

4.3 Windows for spectral analysis

Windows were briefly discussed in the context of block processing and FIR design in Sections 1.4 and 3.14.5, respectively. Here they are applied to spectral analysis in order to better control trade offs such as resolution and leakage.

Windows are essential to model the extraction of a finite-duration segment from a signal with a potentially infinite duration. For example, a $N_{fft}$ -points FFT is restricted to operate on $N$ signal samples, with $N \leq N_{fft}$ . And even if the user is not aware or explicitly modeling the windowing operation, obtaining a segment of $N$ samples from an eventually longer signal $x [n]$ is mathematically equivalent to multiplying $x [n]$ by a rectangular window $w [n]$ of duration $N$ samples (as implicitly done in Section 1.4).

This windowing operation in time domain can often be modeled by

x_{w} [n] = x [n] w [n],

(4.1)

as a pre-processing step, to generate $x_{w} [n]$ with a finite duration of $N$ samples. Windowing enables calculating the FFT of the finite-duration $x_{w} [n]$ , while the FFT of the original signal $x [n]$ would be unfeasible due to its long duration.

As informed by Eq. (3.10), this multiplication $x [n] w [n]$ in time-domain corresponds to a periodic convolution in frequency-domain, as repeated here for convenience:

X_{w} (e^{j Ω}) = X (e^{j Ω}) ⊛ W (e^{j Ω}),

(4.2)

between the DTFTs $X (e^{j Ω})$ and $W (e^{j Ω})$ of $x [n]$ and $w [n]$ , respectively.

Five popular windows are discussed in the next subsection.

4.3.1 Popular windows

Figure 4.1: Selected windows $w [n]$ of duration $N = 32$ samples in time-domain.

Figure 4.2: The DTFTs $W (e^{j Ω})$ of the windows in Figure 4.1.

Figure 4.1 shows some common windows for the case of $N = 32$ samples in time-domain. Note that all of them were normalized to have the maximum amplitudes equal to one. Figure 4.2 shows these windows in frequency-domain, via their DTFTs.

Example 4.1. Designing windows in Matlab/Octave. Listing 4.1 lists the commands to design windows with $N = 256$ samples using Matlab/Octave.

Listing 4.1: MatlabOctaveCodeSnippets/snip_frequency_windows.m. [ Python version]

1N=256; %desired number of samples of window W 
2W=rectwin(N); %Rectangular window 
3W=hamming(N); %Hamming window 
4W=hanning(N); %Hann window 
5W=kaiser(N,7.85); %Kaiser with beta=7.85 
6W=flattopwin(N); %Flat top window

Assuming discrete-time windows, the only input parameter for their design is their duration N. The Kaiser windows (also called Kaiser-Bessel window) is the only exception among the five, because it also has a parameter $β$ . For example, increasing $β$ of a Kaiser window widens the main lobe and decreases the amplitude of the sidelobes. $□$

Figure 4.3: The DTFT of windows in Figure 4.1 with their values normalized such that $| W (e^{j Ω}) | = 1$ for $Ω = 0$ rad.

It is common to normalize a window such that its DTFT has a value equal to one at DC. This was done to generate Figure 4.3. To obtain this normalization, one can use the fact that the DC value $W (z) |_{z = 1}$ coincides with the sum of the window coefficients. For example, a normalized Hamming window can be obtained with w=hamming(32);w=w/sum(w).

Equations describing popular windows

The Hamming window adopted here is given by

w [n] = 0.54 - 0.46 cos (\frac{2 πn}{N - 1}), n = 0, …, N - 1,

(4.3)

while the Hann window is

w [n] = 0.5 - 0.5 cos (\frac{2 πn}{N - 1}), n = 0, …, N - 1 .

(4.4)

Note that according to Eq. (4.4), the first and last samples ( $n = 0$ and $N - 1$ ) are zero and this version of the Hann window is called “periodic”. In Python, the commands import numpy and w = numpy.hanning(5) return [0.,0.5,1.,0.5,0.]. In Matlab/Octave, the command hanning(6,’periodic’) returns [0;0.25;0.75;1.00;0.75;0.25] while the commands hanning(5) and hanning(5,’symmetric’) return [0.25;0.75;1.00;0.75;0.25], without the zero-valued samples.

The flat-top window is

\begin{align} w [n] & = 0.21557895 - 0.41663158 cos (\frac{2 πn}{N - 1}) + 0.277263158 cos (\frac{4 πn}{N - 1}) & (4.5) \\ - 0.083578947 cos (\frac{6 πn}{N - 1}) + 0.006947368 cos (\frac{8 πn}{N - 1}), n = 0, …, N - 1 . \end{align}

The Kaiser windows for $N$ odd, such that $M = (N - 1) ∕ 2$ is an integer, is

w [n] = \frac{I_{0} (β \sqrt{1 - {(\frac{n - M}{M})}^{2}})}{I_{0} (β)}, n = 0, \dots, N - 1,

(4.6)

where $I_{0} (x)$ is the $0^{th}$ order Bessel function of the first kind, which is calculated in Matlab/Octave with besseli(0,x), as indicated in the following commands:

1N=13; beta=7.85; %produce same window as w=kaiser(N,beta): 
2n=0:N-1; M=(N-1)/2; %auxiliary variables 
3w=abs(besseli(0,beta*sqrt(1-((n-M)/M).^2)))/abs(besseli(0,beta)); 
4w=w/max(w); %normalize to have maximum value equal to 1

that create a Kaiser window (for $N$ odd).

4.3.2 Figures of merit applied to windows

Four of the most important parameters of a window are:

width of its main lobe,
amplitude of its sidelobes,
rate of sidelobe decay as frequency increases,
and scalloping loss.

Each of them are discussed in the sequel, starting by the scalloping loss. There are other figures of merit of windows, and the reader is invited to check the references about windows in Section 4.11.

Estimating sinusoid amplitude and observing the scalloping loss

In some applications of FFT-based spectral analysis it is necessary not only detect the frequencies (their values in Hertz, for instance) but also the amplitudes (e. g., in volts) of the detected frequency components. In this case, the impact of windowing and then using an FFT needs to be taken in account because they impose an error (or loss) on the estimated sinusoid amplitude with respect to the true value called “scalloping”. For most windows, this so-called scalloping loss is worst when the frequency is half-way the two neighboring bins.

For instance, if one is calculating the harmonic distortion solely based on percentage values, such as: “- the third harmonic has 12% of the amplitude of the fundamental frequency component”, then mitigating the scalloping loss may not be needed. But if the analysis requires the absolute amplitude values of the harmonic components, the scalloping loss issue becomes important.

Listing 4.2 illustrates a procedure for estimating the amplitude $A = 6$ V of a cosine using FFT and the effect of the scalloping loss.

Listing 4.2: MatlabOctaveCodeSnippets/snip_frequency_scalloping.m. [ Python version]

1window_choice = 1; %choose one among 3 possible windows 
2alpha=8.5; %specifies the cosine frequency 
3clf, N=32; A=6; %clear figure, FFT length, cosine amplitude 
4Wc=(alpha*2*pi)/N; %cosine frequency in radians 
5n=0:N-1; x=A*cos(Wc*n); %generate N samples of the cosine 
6switch (window_choice) 
7  case 1 
8    this_window = rectwin(N); 
9  case 2 
10    this_window = hanning(N); 
11  case 3 
12  this_window = flattopwin(N); 
13end 
14amplitude_scaling = sum(this_window); %factor to mitigate scalloping 
15xw = x.* transpose(this_window); %multiply in time-domain 
16Xw_scaled_fft = fft(xw)/amplitude_scaling; %N-points FFT and scale it 
17max_fft_amplitude = max(abs(Xw_scaled_fft)); 
18disp(['Max(abs(scaled FFT)) = ' num2str(max_fft_amplitude )]) 
19scalloping_loss=(A/2)-max_fft_amplitude; 
20disp(['Scalloping loss = ' num2str(scalloping_loss)]) 
21disp(['Correct amplitude (Volts) = ' num2str(A)]) 
22disp(['Estimated amplitude (Volts) = ' num2str(2*max_fft_amplitude)]) 
23amplitude_error = A - 2*max_fft_amplitude; 
24disp(['Amplitude error (Volts) = ' num2str(amplitude_error)]) 
25disp(['Amplitude error (%) = ' num2str(100*amplitude_error/A)]) 
26ak_impulseplot([A/2, A/2],[Wc,2*pi-Wc]/pi,[]); %plot cosine impulses 
27hold on, stem([0:N-1]*(2*pi/N)/pi,abs(Xw_scaled_fft),'or') 
28xlabel('Angular frequency \Omega (rad) normalized by \pi') 
29ylabel('Spectrum magnitude'), grid 
30legend('DTFT (impulse area divided by \pi)', 'FFT magnitude')

When Listing 4.2 is used with $α = 8$ to generate a bin-centered cosine, the result using the rectangular window is Figure 4.4. Changing the window to flattop leads to Figure 4.5.

Figure 4.4: Comparison of DTFT and DFT/FFT of a bin-centered signal $x [n] = 6 cos ((α 2 π ∕ N) n)$ , with $α = 8$ and $N = 32$ , using a rectangular window.

Figure 4.5: Comparison of DTFT and DFT/FFT of a bin-centered signal $x [n] = 6 cos ((α 2 π ∕ N) n)$ , with $α = 8$ and $N = 32$ , using a flattop window.

Both Figure 4.4 and Figure 4.5 show that the maximum value of the FFT magnitude (3 in this case), when normalized by the FFT size $N$ leads exactly to the area $πA$ (divided by $π$ ) of the impulse representing the cosine DTFT. In this case of bin-centered sinusoids there is no visible scalloping loss.

However, if the cosine frequency $Ω_{c}$ is located half-way two neighboring bins (alpha=8.5), the scalloping loss becomes evident. When using the rectangular window with alpha=8.5, one obtains Figure 4.6 and the numerical results below:

Max(abs(scaled FFT)) = 1.9221
Scalloping loss = 1.0779
Correct amplitude (volts) = 6
Estimated amplitude (volts) = 3.8442
Amplitude error (volts) = 2.1558
Amplitude error (%) = 35.9301

The amplitude error in this case is 35.9%, which is due to the worst-case scalloping loss for the rectangular window in this situation of a sinusoid exactly half-way the bins.¹

Figure 4.6: Comparison of DTFT and DFT/FFT of a non-bin-centered signal $x [n] = 6 cos ((α 2 π ∕ N) n)$ , with $α = 8.5$ and $N = 32$ , using a rectangular window.

In case one chooses the flat-top window in Listing 4.2 (window_choice=3), the corresponding figure is Figure 4.7 and the numerical results are:

Max(abs(scaled FFT)) = 2.9949
Scalloping loss = 0.0050728
Correct amplitude (volts) = 6
Estimated amplitude (volts) = 5.9899
Amplitude error (volts) = 0.010146
Amplitude error (%) = 0.16909

The flat-top window does not have a good frequency resolution but reaches an amplitude estimation error of only 0.17%. Because of that, it is widely adopt to calibrate equipment.²

Figure 4.7: Comparison of DTFT and DFT/FFT of a non-bin-centered signal $x [n] = 6 cos ((α 2 π ∕ N) n)$ , with $α = 8.5$ and $N = 32$ , using a flattop window.

Figure 4.8: Comparison of DTFT and DFT/FFT of a non-bin-centered signal $x [n] = 6 cos ((α 2 π ∕ N) n)$ , with $α = 8.5$ and $N = 32$ , using a Hann window.

As indicated in Figure 4.8, the Hann window is a trade-off between frequency resolution and amplitude error, reaching 16% of amplitude error when one uses window_choice=2 in Listing 4.2, as listed:

Max(abs(scaled FFT)) = 2.5197
Scalloping loss = 0.48034
Correct amplitude (Volts) = 6
Estimated amplitude (Volts) = 5.0393
Amplitude error (Volts) = 0.96069
Amplitude error (%) = 16.0115

This study just showed the concept of scalloping loss and the importance of choosing the proper window function (such as flattop). A better interpretation of scalloping loss is obtained by interpreting the FFT as a filter bank, as briefly discussed in the next subsection.

FFT Interpreted as a Filter Bank

Filter banks are often used for performing spectrum analysis and signal synthesis. A DFT can be seen as a uniform filter bank due to the fact that all filters have the same bandwidth.

The interpretation of FFT and other transforms as filter banks lead to improved insight. For example, the good result obtained by the rectangular window in Figure 4.21 is due to a particular property of its side-lobe structure: all side-lobes are zero in the center of the bins. When the spectrum being analyzed has non-zero components only in bin centers, the leakage due to the side-lobes will not appear in the final FFT result. The leakage could be seen in the DTFT though.

Figure 4.9: DFT filter bank with rectangular window of $N = 8$ samples. The circles mark the DFT bin centers. The filter for $k = 3$ is emphasized.

Figure 4.10: DFT filter bank with Kaiser window of $N = 8$ samples. The filter for $k = 6$ is emphasized.

Figure 4.10 and Figure 4.9 indicate the filter banks corresponding to an FFT with rectangular and Kaiser windows, respectively. The rectangular has larger side-lobe levels, but all of them achieve a zero in the bin centers. The Kaiser window does not have side-lobes with amplitude equal to zero in the bin centers, but these side-lobe levels are very small. And there are other strategies (see e. g. [Lyo11]) to mitigate the effects of the scalloping loss that complement choosing a good window function.

Evaluating windows using Matlab’s wintool

Matlab’s wintool (Window Design & Analysis Tool) GUI is useful to explore trade offs in windowing. One can observe, for example, that increasing $β$ of a Kaiser window widens the main lobe and decreases the amplitude of the sidelobes. It also calculates the leakage factor, relative sidelobe attenuation and main lobe width.

Windows differ in their main lobe width, which is typically inversely proportional to $N$ , the length of the window. Thus, it is possible to achieve a better resolution in frequency by increasing $N$ , but more signal samples are needed and sometimes they are not available.

Another important window parameter is the highest sidelobe level, which should have a small value. When a window is used for FIR design, the smaller the sidelobe amplitude with respect to the amplitude of the main lobe, the higher the filter attenuation in the stopband. From the spectra in Figure 4.3, Table 4.1 can be obtained, which indicates the difference in dB between the main lobe and the highest sidelobe amplitudes. It should be noticed that, for the Hamming window, the highest sidelobe is the third one. The first sidelobe is localized in the frequency equal to 0.46 rad and has a level of $- 50.53$ dB when compared to the main-lobe level.

Table 4.1: Difference in dB between the window main lobe and highest sidelobe amplitudes.


Window	Rectangular	Hann	Hamming	Kaiser ( $β = 7.85$ )	Flat top

Difference (dB)	$- 13$	$- 32$	$- 43$	$- 57$	$- 68$

When comparing spectral components that are close to each other, the adopted windows must have good resolution and a low highest sidelobe level. When the spectral components are far apart, having a large sidelobe decay can be more important.

This sidelobe fall-off indicates how the peak amplitude of the sidelobes decay with frequency and, for example, is a motivation for using the Hann window. The side-lobe fall-off is dependent on the continuity of the window $w (t)$ in time-domain and its derivatives. For example, the Hann window is continuous, and besides, its first derivative is continuous too. Thus, its transform falls off at $1 ∕ ω^{3}$ or, equivalently, at $- 18$ dB per octave. In contrast, the rectangular and Hamming windows, which are not continuous at their endpoints, have a $- 6$ dB per octave fall-off.

Relation between roots location and spectrum

. A window $w [n]$ is used to multiply a signal in time-domain, but for the sake of observing its spectrum, it can be interpreted as the impulse response of a FIR filter, which does not have finite poles, only zeros. Under this assumption, there is an interesting relation between the zeros of the windows, and its spectrum $W (e^{j Ω})$ .

Figure 4.11: Relationship between roots location and spectrum for a rectangular window with $N = 32$ samples.

Figure 4.11 shows the spectrum (left plot) of a rectangular window with $N = 32$ and the roots (right plot) of this window. This plot outlines that there are $N - 1$ zeros (the amount of roots of a polynomial of order $N$ ) and the width of the main lobe is obtained from the angle $Θ = 2 π ∕ N$ (in this case, $Θ = 2 π ∕ 32 \approx 0.1963$ rad) of the first root, given that there is no root at $Ω = 0$ and the $N - 1$ roots are uniformly spaced around the unit circle.

Figure 4.12: Relationship between roots location and spectrum for a Kaiser window with $N = 32$ samples.

Figure 4.12 shows that a greater concentration of zeros in the region of the main lobe, allows the Kaiser window (with $β = 7.85$ ) achieve a very small first side-lobe. However, the width of the main lobe increases considerably when compared to the rectangular window in Figure 4.11, due to the increase of $Θ$ . In summary, the rectangular window has the smallest main lobe, but its first sidelobe level is just $- 13$ dB below the main-lobe level. On the other hand, the Kaiser window has a wider main lobe, but this window has very small sidelobe levels.

The next subsections discuss leakage and the picket-fence FFT effect.

4.3.3 Leakage

Leakage is a very important aspect on spectral analysis using FFT. To understand leakage one must remember that the multiplication $x [n] w [n]$ in time-domain as in Eq. (4.1), corresponds to the convolution in frequency-domain as in Eq. (4.2). The spectra of the windows in Figure 4.2 indicate that the values of the original spectrum $X (e^{j Ω})$ will be surely altered if $X (e^{j Ω})$ is convolved with any finite-duration window.

The result of the convolution caused by windowing may not significantly impact the original spectrum $X (e^{j Ω})$ in case $W (e^{j Ω})$ is relatively narrow compared to variations in $X (e^{j Ω})$ . But this condition implies that the window $w [n]$ is sufficiently long (i. e., $N$ must be sufficiently large). As mentioned, the ideal situation, where windowing would not alter the original $X (e^{j Ω})$ , is to have $W (e^{j Ω})$ as an impulse in the range $- π < Ω \leq π$ (i. e. $W (e^{j Ω}) = δ (Ω + k 2 π), k \in ℤ$ ), but this implies that $w [n]$ has infinite duration and all samples of the eventually eternal $x [n]$ are available for spectral analysis.

Hence, even if $W (e^{j Ω})$ is narrow compared to $X (e^{j Ω})$ , the convolution $X_{w} (e^{j Ω}) = X (e^{j Ω}) ⊛ W (e^{j Ω})$ may have non-zero (“leaked”) values in frequencies where the original signal $X (e^{j Ω})$ was zero.

Note that many times, leakage is erroneously interpreted as a consequence of using FFTs to perform spectral analysis. However, even when dealing with DTFTs, without the inherent discretization of frequency provoked by DFTs / FFTs, windowing creates “frequency leaks” that show up on the DTFT $X_{w} (e^{j Ω})$ . The following paragraphs provide concrete examples of leakage.

Example 4.2. Examples of spectrum leakage due to windowing. Spectrum leakage can be understood by observing the case of an eternal sinusoid $x [n] = A cos (Ω_{c} n)$ . The DTFT of $x [n]$ is $X (e^{j Ω}) = Aπ [δ (Ω + Ω_{c}) + δ (Ω - Ω_{c})]$ within the range $- π \leq Ω \leq π$ , then repeated in multiples of $2 π$ .

Assuming a rectangular window $w [n]$ with amplitudes equal to 1 from $n = 0, 1, \dots, N - 1$ and zero otherwise, the spectrum of $w [n]$ as given by Eq. (A.55) is

W (e^{j Ω}) = \frac{sin (\frac{N Ω}{2})}{sin (\frac{Ω}{2})} e^{- j \frac{Ω (N - 1)}{2}} .

(4.7)

The DTFT $X_{w} (e^{j Ω})$ of the windowed cosine $x_{w} [n] = x [n] w [n]$ is the periodic convolution between $X (e^{j Ω})$ and the window DTFT $W (e^{j Ω})$ . Recalling that the periodic convolution of Eq. (4.2) has a factor $\frac{1}{2 π}$ , the resulting DTFT is

X_{w} (e^{j Ω}) = \frac{A}{2} [W (e^{j (Ω + Ω_{c})}) + W (e^{j (Ω - Ω_{c})})] .

(4.8)

Figure 4.13 depicts the example of a cosine with amplitude $A = 3$ V and frequency $Ω_{c} = 1$ rad. The data tip on the top plot indicates the area of the impulses at $Ω = \pm Ω_{c}$ is $0.5 Aπ = 0.5 \times 3 \times π \approx 4.71239$ . The middle plot shows $| W (e^{j Ω}) |$ and its data tip indicates that its value at DC ( $Ω = 0$ ) is $N$ . The bottom plot of Figure 4.13 shows $| X_{w} (e^{j Ω}) |$ . The value 22.6146 of the data tip at $Ω = - Ω_{c}$ is obtained from Eq. (4.8) as:

\begin{align} X_{w} (e^{j Ω}) |_{Ω = - Ω_{c}} & = \frac{A}{2} [W (e^{j 0}) + W (e^{- j 2 Ω_{c}})] \\ = \frac{A}{2} [N + \frac{sin (- N Ω_{c})}{sin (- Ω_{c})} e^{j Ω_{c} (N - 1)}] \\ = \frac{3}{2} [14 + \frac{sin (- 14)}{sin (- 1)} e^{j 13}] \\ = 22.6024 + 0.7420 j . \end{align}

The magnitude $| 22.6024 + 0.7420 j | = 22.6146$ is the data tip value shown in the bottom plot of Figure 4.13.

Figure 4.13: Spectrum leakage when a cosine $x [n] = 3 cos (Ω_{c} n)$ , with $Ω_{c} = 1$ rad, is windowed with a rectangular window $w [n]$ of duration $N = 14$ samples.

As discussed in Section 4.3.2.0, recovering exactly the amplitude $A$ from $X_{w} (e^{j Ω})$ is not an easy task because the scalloping loss is influenced by contributions from two replicas of $W (e^{j Ω})$ in Eq. (4.8).

Figure 4.14 depicts the result when the window length is increased from $N = 14$ to $N = 50$ samples. In this case, $X_{w} (e^{j Ω}) |_{Ω = - Ω_{c}}$ is given by

\begin{align} X_{w} (e^{j Ω}) |_{Ω = - Ω_{c}} & = \frac{3}{2} [50 + \frac{sin (- 50)}{sin (- 1)} e^{j 49}] \\ = \frac{3}{2} [50 + (- 0.0937 + 0.2974 j)] . \\ = 74.8594 + 0.4461 j \end{align}

The magnitude $| 74.8594 + 0.4461 j | = 74.8607$ is slightly smaller than $\frac{3}{2} 50 = 75$ , which is the value that would be obtained by using only the first replica of $W (e^{j Ω})$ . This uncertainty about the amplitude of the sinusoid provoked by the second replica of $W (e^{j Ω})$ can be mitigated using a flattop instead of the rectangular window, as discussed in Section 4.3.2.0.

Figure 4.14: Spectrum leakage when the cosine $x [n] = 3 cos (n)$ of Figure 4.13 is now windowed with a rectangular window $w [n]$ of duration $N = 50$ samples.

The top plot in Figure 4.13 and Figure 4.14 is showing impulses with the same area values because the input signal does not change. But the middle plots show that the DTFT peak gets larger when $N$ is increased. Figure 4.15 presents the results when the window length is increased to $N = 1000$ samples.

Figure 4.15: Spectrum leakage when the cosine $x [n] = 3 cos (n)$ of Figure 4.13 is now windowed with a rectangular window $w [n]$ of duration $N = 1000$ samples.

Independent of the window length $N$ , a key point made by Figs. 4.13, 4.14 and 4.15 is that the spectrum $X_{w} (e^{j Ω})$ of a windowed sinusoid is not composed solely of impulses, as in $X (e^{j Ω})$ for the eternal sinusoid. The DTFT $X_{w} (e^{j Ω})$ has sidelobes dictated by the window DTFT $W (e^{j Ω})$ . $□$

As mentioned, the convolution in Eq. (4.2) may not significantly alter $X_{w} (e^{j Ω})$ with respect to the original $X (e^{j Ω})$ in case $W (e^{j Ω})$ is relatively narrow compared to variations in $X (e^{j Ω})$ . However, there is a tradeoff between narrower bandwidths and longer duration over time. See, for example, the Fourier transform of a square pulse in Eq. (A.54) to remember the tradeoff between time and frequency. Increasing the duration of $w [n]$ (using more samples from the original signal) mitigates leakage as indicated by the behavior of Figure 4.15, which shows the effect of increasing $N$ to obtain a window that looks more similar to the ideal impulse.

Leakage is associated to windowing and should not be confused with the picket-fence effect caused by sampling the DTFT via a FFT, which is discussed in the sequel.

4.3.4 Picket-fence effect

As discussed, the result of windowing a signal and then taking the FFT of the windowed signal leads to two consequences: a) creation of spectrum $X_{w} (e^{j Ω})$ with leakage due to windowing, and then b) spectrum values $X_{w} [k]$ available only at a discrete frequency grid along the angular frequency $Ω$ imposed by the FFT.

The consequence of discretizing $Ω$ via an FFT is sometimes called picket-fence effect, which corresponds to observing the DTFT of the signal under analysis just at the DFT/FFT bin centers, as dictated by Eq. (2.40) (or Eq. (2.37), when using frequency in Hz). To study the picket-fence effect, it is useful to place sinusoids at arbitrary positions within a given FFT bin. This is the topic of the next paragraphs.

Positioning sinusoids with respect to the FFT bin

Given that the bin width of an $N$ -points DFT (or FFT) is $Δ Ω = \frac{2 π}{N}$ radians, the angular frequency associated to the $k$ -th FFT bin is $Ω_{k} = k \frac{2 π}{N}$ radians, as given by Eq. (2.40). At this point, it may be useful to recall the definition of a bin center using Figure 2.18.

When performing the spectral analysis of a bin-centered sinusoid $x [n] = cos (k \frac{2 π}{N} n)$ using an $N$ -points FFT, the components of “positive” and “negative” frequencies of $x [n]$ will be mapped to the FFT bins $k$ and $(N - k)$ , respectively. In order to investigate non-bin-centered sinusoids, it is useful to generalize the signal expression to

x [n] = cos (α \frac{2 π}{N} n),

(4.9)

where $α \in ℝ$ is a real number in the range $[0, N ∕ 2]$ that allows to obtain any desired value for the positive angular frequency $Ω_{c} = α \frac{2 π}{N}$ , from 0 to $π$ rad. One can write Eq. (4.9) in terms of the FFT resolution $Δ Ω = 2 π ∕ N$ as:

x [n] = cos (α Δ Ω) .

(4.10)

Scalloping loss: the combined effects of leakage and picket-fence

It is informative to revisit the result of an FFT when the original input signal is a discrete-time sinusoid $x [n] = A cos (Ω_{c} n)$ after windowing, as discussed in Section 4.3.2.0 and Example 4.2.

Recall that the DTFT of this eternal cosine is $X (e^{j Ω}) = Aπ [δ (Ω + Ω_{c}) + δ (Ω - Ω_{c})]$ within the range $- π \leq Ω \leq π$ . And the DTFT $W (e^{j Ω})$ of the rectangular window is given by Eq. (4.7).

Assuming that $X_{w} (e^{j Ω})$ denotes the DTFT of the windowed cosine, the final FFT result depends on the two effects:

f 1.

Leakage:

X_{w} (e^{j Ω})

is the periodic convolution between

X (e^{j Ω})

and

W (e^{j Ω})

, and is given by Eq. (4.8), which is repeated below for convenience:

X_{w} (e^{j Ω}) = \frac{A}{2} [W (e^{j (Ω + Ω_{c})}) + W (e^{j (Ω - Ω_{c})})] .

f 2.

Picket-fence: The FFT

X_{w} [k]

has a resolution of

Δ Ω = (2 π) ∕ N

and corresponds to samples

X_{w} [k] = X_{w} (e^{j Ω}) |_{Ω = k Δ Ω}

of the DTFT

X_{w} (e^{j Ω})

at frequencies

Ω = k Δ Ω

Two cases of interest are a bin-centered sinusoid ( $α \in ℤ$ ), and $Ω_{c}$ half way two bin centers, exactly on their edges. The latter case represents a worst-case situation with respect to the scalloping loss.

Complementing the experiments in Section 4.3.2.0, here we study scalloping loss specifically adopting the rectangular window. For that, one can execute Listing 4.3 with the command snip_frequency_fftCosineExample(8, 2) and snip_frequency_fftCosineExample(8, 2.5) to generate Figure 4.16 and Figure 4.17, respectively. These figures are similar to Figure 4.4 and Figure 4.6.

Listing 4.3: MatlabOctaveCodeSnippets/snip_frequency_fftCosineExample.m

1function snip_frequency_fftCosineExample(N, alpha) 
2% function snip_frequency_fftCosineExample(alpha) 
3% Superimposes the DTFT and FFT of a windowed cosine, 
4% assuming the rectangular window. 
5% N     - FFT length 
6% alpha - specifies the cosine angular frequency Wc as 
7%         Wc=(alpha*2*pi)/Nfft, where Wc is in radians 
8A=6; %cosine amplitude 
9if alpha > N/2 
10    error('alpha cannot be larger than N/2') 
11end 
12Wc=(alpha*2*pi)/N; %cosine frequency in radians 
13%% Calculate DTFT values in a fine grid 
14M=1000; %number of sample points for DTFT visualization 
15% discretize the frequency axis in large number M of points: 
16Omega = linspace(-pi, pi, M); 
17%add frequency values of interest to facilitate plots 
18Omega = sort([Omega -Wc Wc 0]); 
19Xw = (A/2)*(rect_dtft(Omega+Wc,N) + rect_dtft(Omega-Wc,N)); %DTFT 
20%% Calculate DFT 
21n=0:N-1; x=A*cos(Wc*n); %generate N samples of the cosine 
22Xfft = fftshift(fft(x)); %calculate FFT with N points and shift 
23deltaW = 2*pi/N;  %the unit circle was divided in N slices 
24if rem(N,2) == 0 
25    W_fft = -pi:deltaW:pi-deltaW; %N is even 
26else 
27    W_fft = -pi+(deltaW/2):deltaW:pi-(deltaW/2); %N is odd 
28end 
29%% Plot and calculate scalloping loss 
30plot(Omega,abs(Xw)),hold on, stem(W_fft,abs(Xfft),'or') 
31xlabel('Angular frequency \Omega (rad)') 
32ylabel('|X_w(e^{j\Omega})|'), legend('DTFT','DFT') 
33grid, axis tight 
34disp(['Max(abs(FFT))=' num2str(max(abs(Xfft)))]) 
35disp(['Scalloping loss in DTFT=' num2str(A*N/2-max(abs(Xfft)))]) 
36end 
37function rect_window_dtft = rect_dtft(Omega, N) 
38    %DTFT of rectangular window 
39    W_div_2 = Omega/2; %half of angles (pre-compute to speed up) 
40    rect_window_dtft=(sin(N*W_div_2)./sin(W_div_2)).*exp(-1j*W_div_2*(N-1)); 
41    %Use L'Hospital rule to correct NaN if 0/0: 
42    rect_window_dtft(isnan(rect_window_dtft))=N; 
43end

Figure 4.16: Comparison of DTFT and DFT/FFT for bin-centered $x [n] = 6 cos ((α 2 π ∕ N) n)$ , with $α = 2$ and $N = 8$ using a rectangular window.

Figure 4.16 indicates that when the sinusoid is bin-centered, the FFT outputs two peaks of value A*N/2, which in this example corresponds to 6*8/2=24. In this case, the scalloping loss is zero and there is no visible leakage. This situation is interesting because as can be observed by the DTFT plot in Figure 4.16, the windowing operation already created leakage, but the picket-fence effect makes it not visible to the user that is observing solely the FFT result $X_{w} [k]$ , not the DTFT $X_{w} (e^{j Ω})$ .

Figure 4.17: Version of Figure 4.16 for a (worst-case) not bin-centered $x [n] = 6 cos ((α 2 π ∕ N) n)$ , with $α = 2.5$ and $N = 8$ using a rectangular window.

In contrast to Figure 4.16, in Figure 4.17 the picket-fence effect on a non-bin-centered sinusoid clearly discloses an extreme leakage. The adopted value $α = 2.5$ , places the cosine right in the shared edge of bin centers $k = 2$ and $k = 3$ . The FFT value for $k = 2$ is slightly larger ( $| X_{w} [2] | = 16.8$ ) than the value for $k = 3$ ( $| X_{w} [2] | = 14.4$ ) due to the influence of the first replica of $W (e^{j Ω})$ positioned at $Ω_{c} = - α 2 π ∕ N \approx - 1.9635$ rad. In this example, the scalloping loss is calculated with

1disp(['Max(abs(FFT))=' num2str(max(abs(Xfft)))]) 
2disp(['Scalloping loss in DTFT=' num2str(A*N/2-max(abs(Xfft)))])

and the result is

1Max(abs(FFT))=16.7876 
2Scalloping loss in DTFT=7.2124

In this case, the expected (ideal) DTFT peak for the sinusoid with amplitude A=6 would be A*N/2=24, but the maximum $| X_{w} [k] |_{max}$ of the amplitude spectrum is 16.7876. Hence, the scalloping loss is $24 - 16.7876 = 7.2124$ .

If one uses the fact that $| X_{w} [k] |_{max}$ should be equal to A*N/2, the sinusoid amplitude can be estimated with $Â = 2 | X_{w} [k] |_{max} ∕ N = 2 \times 16.7876 ∕ 8 = 4.1969$ V. This corresponds to an error of 1.8031 V with respect to the correct sinusiod amplitude of 6 V.

Advanced: FFT of a sinusoid multiplied by a rectangular window

Looking for a general expression for the FFT values $X_{w} [k]$ of a windowed cosine $x_{w} [n] = x [n] w [n]$ , with $Ω_{c} = α Δ Ω = α (2 π) ∕ N$ (recall that $α \in ℝ$ and $α \in [0, N ∕ 2]$ ) and assuming a rectangular window, one can substitute Eq. (4.7) into Eq. (4.8) to obtain:

X_{w} [k] = \frac{A}{2} [\frac{sin ((k + α) π)}{sin (\frac{(k + α) π}{N})} e^{- j (\frac{(N - 1) π}{N} (k + α))} + \frac{sin ((k - α) π)}{sin (\frac{(k - α) π}{N})} e^{- j (\frac{(N - 1) π}{N} (k - α))}] .

(4.11)

From Eq. (4.7), the value of $W (e^{j Ω})$ for $Ω = 0$ rad can be found using L’Hospital’s rule, which leads to $W (e^{j Ω}) |_{Ω = 0} = N$ .

In the special case of $Ω_{c} = 0$ , the signal $x [n] = A$ has a constant amplitude, its DTFT is $X_{w} (e^{j Ω}) = AW (e^{j Ω})$ and the FFT DC bin $X_{w} [k] |_{k = 0} = X_{w} [0] = AN$ . This situation is depicted in Figure 4.18.

Figure 4.18: Comparison of DTFT and DFT/FFT for a constant signal $x [n] = 6 cos ((α 2 π ∕ N) n) = A$ , with $α = 0$ (centered at DC) and $N = 8$ , using a rectangular window.

Let us assume the goal is to estimate the sinusoid amplitude $A$ in volts, via the DFT coefficients obtained from the signal FFT. In this case, Eq. (2.49), indicates that one needs to normalize the DFT by $N$ . Hence, the normalized DC value for Figure 4.18 is $X_{w} [0] ∕ N = A$ . Similarly, when $Ω_{c} \neq 0$ as in Figure 4.16 and Figure 4.17, the two DTFT peaks at $Ω = \pm Ω_{c}$ have amplitude $AN ∕ 2$ that lead to an amplitude of $A ∕ 2$ for each after division by $N$ . In summary, as already discussed in the context of Eq. (2.49), the estimate of amplitudes from a FFT result requires normalizing the absolute values $X_{w} [k]$ by $N$ .

As indicated in Listing 4.3, the scalloping loss $S$ for a windowed sinusoid with amplitude $A$ when a rectangular window and an $N$ -points FFT are used, can be obtained with:

S = N \frac{A}{2} - max {| X_{w} [k] |},

(4.12)

where $max {| X_{w} [k] |$ is the magnitude corresponding to the FFT bin center that best represents the windowed sinusoid $x_{w} [n]$ (its strongest FFT component).

4.3.5 Advanced: Only a bin-centered sinusoid leads to an FFT without visible leakage

Leakage and picket-fence make harder to use FFT for analyzing eternal sinusoids. In spite of that, the FFT is a flexible tool and its user needs to choose the proper interpretation of its results. For instance, one perspective from the FFT of a windowed signal $x_{w} [n]$ is that the basis functions of the DFT have a finite duration of $N$ samples and can only represent a finite-duration $x_{w} [n]$ . In this case, the FFT user is not concerned with signal periodicity. Another perspective is to consider that the FFT is estimating the DTFS coefficients and $x_{w} [n]$ is a single period with $N$ samples of a periodic and eternal signal. In any case, it is not always possible to conclude about the periodicity and frequency components of the infinite duration $x [n]$ by observing only its windowed version $x_{w} [n]$ .

Recall that, to check the periodicity of $x [n]$ , one can use Eq. (1.21) and write

\frac{m}{N_{0}} = \frac{Ω_{c}}{2 π} = \frac{\frac{2 π}{N} α}{2 π} = \frac{α}{N},

(4.13)

which is a ratio $m ∕ N_{0}$ of two integers when $α ∕ N$ is a rational number ( $α ∕ N \in ℚ$ ). In case $α$ is an integer, then $α ∕ N$ is a rational number. But when $α = 2.5$ and $N = 10$ , we still have $α ∕ N = 1 ∕ 4$ rational. Hence, $x [n]$ can be periodic even if $α$ is not an integer. But $x [n]$ will be bin-centered, only if $α$ is an integer. This is illustrated in the top plots of Figure 4.19.

Figure 4.19: Top plots are obtained from the periodic but not bin-centered $x [n] = 6 cos ((α 2 π ∕ N) n)$ , with $α = 2.5$ and its DTFT and FFT with a rectangular window and $N = 10$ . Bottom plots correspond to the non-periodic $x [n]$ with $α = π$ .

The bottom plots of Figure 4.19 were obtained from a non-periodic signal $x [n] = 6 cos ((π^{2} ∕ 5) n)$ , but the DFT result presents a stronger peak than its top plot counterpart. By observing only the top DFT, one could be confused whether $x [n]$ is composed of one or two sinusoids. On the other hand, observing the bottom DFT could lead to concluding that $x [n]$ is periodic, while strictly it is not. The idea of Figure 4.19 is that care must be exercised when interpreting FFT results.

Another couple of examples is provided. Assume now that the signal of interest is $x [n] = cos ((100 π ∕ 256) n)$ , with $α = 50$ and $N = 256$ , where $N$ is the adopted length of the rectangular window and FFT size. In this special case, in which $α \in ℤ$ , the fundamental period is $N_{0} = 128$ samples, given that simplifying the fraction $α ∕ N$ leads to $m ∕ N_{0} = 25 ∕ 128$ , and this cosine is represented by the FFT bins $k = 50$ and $k = 256 - 50 = 206$ . The DTFT and FFT magnitudes of $x [n]$ are depicted at the left plot in Figure 4.20.

Figure 4.20: Comparison of DTFT and DFT for $x [n] = 6 cos ((α 2 π ∕ N) n)$ with $α = 50$ (left) and $α = 30.5$ (right) $N = 256$ using a rectangular window.

The right plot in Figure 4.20 used the signal $x [n] = cos ((61 π ∕ 256) n)$ , still with a window and FFT size $N = 256$ . In this case, $α = 30.5$ and $α \notin ℤ$ , but $α ∕ N \in ℚ$ and can be written as a ratio $30.5 ∕ 256 = 61 ∕ 512$ of two integers $m = 61$ and $N_{0} = 512$ . The signal $x [n]$ has a period of $N_{0} = 512$ samples but only $N = 256$ of its samples are being analyzed via the FFT. In this case, the FFT does not “see” a pure sinusoid. The component of $x [n]$ corresponding to the positive frequency will reside between the bins $k = 30$ and $k = 31$ , while the negative-frequency component will be located between bins $k = 224$ and 225. Leakage will occur for both components and will be more visible in their neighbor bins.

4.3.6 Summarizing leakage and picket-fence effect

In previous paragraphs, we observed that reduced resolution and leakage are the two primary effects on the spectrum as a result of applying a window to the signal. The resolution of the overall spectral analysis is primarily influenced by the width of the main lobe of the window, which is inversely proportional to the window (and consequently signal) duration. The degree of leakage depends on aspects of the window DTFT such as the relative amplitude of the main lobe with respect to the amplitudes of the sidelobes.

An FFT-based spectral analysis resolution also depends on the FFT resolution $Δ f = F_{s} ∕ N_{fft}$ as described by Eq. (2.37), with $N_{fft}$ being the number of FFT points. Given a windowed signal $x_{w} [n]$ with duration of $N$ samples, the FFT resolution can always be improved via zero-padding and adopting $N_{fft} > N$ .

In summary, there are two artifacts when an FFT is used for spectral analysis:

Leakage, which becomes stronger when the window duration is made shorter (or equivalently, $N$ made smaller)
FFT resolution $Δ f$ , which can be improved by increasing the number $N_{fft}$ of FFT points and using zero-padding.

These two effects appear in a combined way in FFT-based spectral analysis. And while $Δ f = F_{s} ∕ N_{fft}$ improves as $N_{fft}$ gets larger, this cannot recover the leakage that occurred due to windowing.

These effects appear in a combined way in FFT-based spectral analysis, such as in Application 2.11.

4.3.7 Example of using windows in spectral analysis

To illustrate the importance of windows other than the rectangular, two distinct signals, $x_{1} [n]$ and $x_{2} [n]$ , each composed by a stronger and a weaker sinusoids, will have their spectra estimated using a FFT with $N = 256$ points:

Case 1: the stronger sinusoid is bin-centered - The first sequence is
$x_{1} [n] = 1 cos ((64 π ∕ 256) n + π ∕ 3) + 100 cos ((76 π ∕ 256) n + π ∕ 4) .$

The idea is that the weaker cosine (of amplitude 1 V) is located in the center of the FFT bin $k = 32$ , and the stronger is located in the center of bin 38. Having a bin-centered sinusoid corresponds to choosing the FFT size coinciding with a multiple of the sinusoid period.
Case 2: the stronger sinusoid is not bin-centered - The second sequence is
$x_{2} [n] = 1 cos ((64 π ∕ 256) n + π ∕ 3) + 100 cos ((75 π ∕ 256) n + π ∕ 4) .$

In this case, the weaker cosine is located in the center of bin 32 but the stronger is located in the border of bins 37 and 38.

In this example the goal is to distinguish the sinusoids and estimate their frequencies. As in many applications of spectral analysis, the amplitudes are irrelevant and what is taken into account is the relative difference between the signal components in frequency domain. For both $x_{1} [n]$ and $x_{2} [n]$ , the difference in power between the sinusoids is $20 {log}_{10} (100 ∕ 1) = 40$ dB. These sequences can be generated as in Listing 4.4.

Listing 4.4: MatlabOctaveCodeSnippets/snip_frequency_sequence_generation.m. [ Python version]

1N=256; %number of samples available of x1 and x2 
2n=0:N-1; %abscissa 
3kweak=32; %FFT bin where the weak cosine is located 
4kstrong1=38; %FFT bin for strong cosine in x1 
5weakSigal = 1*cos((2*pi*kweak/N)*n+pi/3); %common parcel 
6x1=100*cos((2*pi*kstrong1/N)*n+pi/4) + weakSigal; %x1[n] 
7kstrong2=37.5; %location for strong cosine in x2 
8x2=100*cos((2*pi*kstrong2/N)*n+pi/4) + weakSigal; %x2[n]

Figure 4.21: Comparison of spectra obtained with four windows in case both sinusoids are bin-centered (left plots) and not (right plots).

The commands that design the windows to pre-multiply x1 and x2 in Listing 4.4, before calling the FFT, are given in Listing 4.1.

The results with four windows from Listing 4.1 are shown in Figure 4.21. The flat top window was not used due to its poor frequency resolution. For example, the spectrum in Figure 4.21 for $x_{2} [n]$ (called x2 in the code) using Hamming was obtained with Listing 4.5.

Listing 4.5: MatlabOctaveCodeSnippets/snip_frequency_analysis.m. [ Python version]

1N=256; n=0:N-1; %N is the number of available samples and FFT length 
2x2=100*cos((2*pi*37.5/N)*n+pi/4) + cos((2*pi*32/N)*n+pi/3); %cosines 
3dw=(2*pi)/N; %DFT spacing in radians 
4w=-pi:dw:pi-dw; %abscissa for plots matched to fftshift 
5x=x2.*hamming(N)'; %perform windowing 
6factor=max(abs(fft(x))); %normalization to have stronger at 0 dB 
7plot(w,fftshift(20*log10(abs(fft(x)/factor))));

The graphs at the left column in Figure 4.21 correspond to estimated spectra of $x_{1} [n]$ (both sinusoids are bin-centered). For $x_{1} [n]$ , it can be seen that the best result was obtained by a rectangular window while the other windows led to spectra with leakage (spurious power) near the cosines. Based solely on the results obtained for case 1, one could erroneously concluded that the rectangular window is always the best. However, as shown in the top-right graph, when the strong cosine is not bin-centered, the rectangular window miserably fails to detect the weaker cosine.

In Figure 4.21, the Hann window allows a marginal detection of the weaker cosine, while the Hamming window allows the detection of both cosines but contaminates the whole spectrum with spurious power along frequency just 50 dB below the power level of the strongest sinusoid. This is a consequence of the reduced side-lobe fall-off for the Hamming window, as indicated in Figure 4.3. The Kaiser window achieves the best result when one takes in account both cases (bin-centered and not).

Figure 4.22: Individual spectra of the two sinusoids superimposed obtained using the Kaiser window.

Figure 4.23: Individual spectra of the two sinusoids superimposed, obtained using the rectangular window.

To complement the analysis of estimating the spectrum of $x_{2} [n]$ (the stronger sinusoid is not bin centered), Figure 4.22 and Figure 4.23 show the individual spectra of the two sinusoids superimposed, obtained using the Kaiser and rectangular windows, respectively. Note that, Figure 4.21 shows exactly the combined effect of these individual spectra. Figure 4.23 clearly shows that the significant leakage of the stronger cosine caused by the rectangular window, completely “masks” the weaker cosine.

Note that this was a specific application. For example, features such as side-lobe fall-off are important in other situations. A very general conclusion is to always use a window other than the rectangular when performing spectral analysis of a signal that is not guaranteed to have all components bin-centered.

When a DTFT has impulses, does this create a problem for an FFT-based analysis?

A DTFT can have impulses such as, for example, the DTFT of a sinusoid. One question that may arise then is: -If the FFT simply performs sampling of the DTFT in the frequency domain, what is the FFT value when exactly sampling an impulse? Should this value go to infinite? The answer requires noting that the impulse in the DTFT appears because $x [n]$ is an infinite-long signal (e. g., an eternal sinusoid). But an FFT has to be applied to a signal with a finite number $N$ of samples. Therefore, the FFT is always applied to a windowed signal $x_{w} [n]$ , with a DTFT $X_{w} (e^{j Ω})$ that does not have impulses.

After this introduction to windowing, the next sections discuss three important functions in spectral analysis.

¹ Similar to the amplitudes, the absolute power values inferred from of an FFT-based analysis require correction [Bra11].

² See, e. g., the presentation by Matt Ettus at http://www.youtube.com/watch?v=wqKNUXDdIvU.