1.4  Block or Window Processing

Many algorithms segment the signal into blocks of samples and process each block individually. For example, when writing Matlab/Octave code for DSP, it is tempting to organize the information in a single vector, which is then processed and converted into another vector. This works well when the number of elements in each vector is relatively small. However, many situations require processing billions of samples, and the corresponding long vectors would eventually not fit in memory. In these cases, the best strategy is to write DSP code using block processing, also called window processing and frame-based processing.

When processing a discrete-time “infinite” duration signal x[n] in blocks (also called window or frames), x[n] is iteratively segmented and represented by finite dimensional vectors xm with N elements each. Assuming that x[n] is a right-sided signal (n = 0 corresponds to the first non-zero sample, n = 1 to the second and so on), the first block x0 is formed by the samples

x0 = [x[0],x[1],,x[N 2],x[N 1]]T .

In general, the m-th block is represented by the column vector

xm = [x[mN],x[mN + 1],,x[mN + N 2],x[mN + N 1]]T .
(1.8)

To avoid too many indexes, the dependence on m will be omitted hereafter.

Example 1.10. Segmenting a signal into blocks. Suppose the task is to segment a very long signal into blocks and then calculate the power of each block. Listing 1.6 illustrates how to perform this task in Matlab/Octave.

Listing 1.6: MatlabOctaveCodeSnippets/snip_transforms_segmentation.m. [ Python version]
1S=10000; %some arbitrary number of samples 
2x=rand(1,S); %create some very very long vector 
3N=50; %number of samples per frame (or block) 
4M=floor(S/N); %number of blocks, floor may discard last block samples 
5powerPerBlock=zeros(M,1); %pre-allocate space 
6for m=0:M-1 %following the book convention 
7    beginIndex = m*N+1; %index of the m-th block start 
8    endIndex = beginIndex+N-1; %m-th block end index 
9    xm=x(beginIndex:endIndex) %the samples of m-th block 
10    powerPerBlock(m+1)=mean(abs(xm).^2); %estimate power of block xm 
11end 
12clf; subplot(211); plot(x); subplot(212); plot(powerPerBlock) %plot
  

Note that Eq. (1.8) assumes the first index is 0, which is adopted in Python, C, Java and other languages. However, in Matlab/Octave the first index is 1 and this required using beginIndex = m*N+1 instead of beginIndex = m*N as in Eq. (1.8).

The reader is invited to use a speech signal instead of the random signal and observe how the signal power varies over time, comparing it for vowels and consonants.    

1.4.1  Advanced: Block processing with overlapped windows

The next paragraphs discuss an alternative to deal with observation windows containing L samples (or observations). It is assumed that for each window, it is desired to obtain a single estimate. For instance, one can use windows with L = 100 samples to estimate the average signal power within that window. The goal is to extract more information from the available collection of samples and alleviate the noise and blocking effects. Two strategies to improve block processing are overlapping windows and multiplying the block of L samples by a function w[n] with support of L samples and that goes to zero at its endpoints. The function w[n] is called window and is discussed in Section 4.3. Here, we will solely rely on overlapping windows, as illustrated in Figure 1.12.

To specify the extraction of overlapping windows, besides the length L, one needs to define the shift S in samples. When the window shift (also called stride in deep learning) is S < L, neighboring windows share L S samples. Figure 1.12 provides an example in which S = 1 and L = 4. As also indicated in this figure, non-overlapping windows are obtained with

S = L.
(1.9)

With S = 1, for each new input sample, there is a new observation window and consecutive windows share L S = 3 samples. For example, in Figure 1.12, when sample x[4] comes, the observation window slides to the right and encompasses x[1] to x[4]. When x[5] comes, the window covers x[2] to x[5], and so on. In this case, because there is a new estimate for each new sample, the estimator’s output rate is the same as its input rate despite the value of L. Another application of overlapping window is in estimating the spectogram, as explained in Section 4.9.1.

PIC

Figure 1.12: The top representation shows non-overlapping windows of L = 4 samples, with both non-windowed indexing x[n] and windowed indexing x[k,m]. The bottom representation shows overlapping windows with L = 4 and shift S = 1 sample using non-windowed indexing.

When dealing with block processing, it is sometimes useful to adopt a windowed indexing in which x[k,m] represents the m-th sample of the k-th window, as depicted in Figure 1.12.

When there are N available samples, the number M of windows with L samples and shifted by S samples that can be extracted is given by

M = N L S + 1,
(1.10)

where all values N, L and S are greater than zero, and N L (otherwise there is not a single window). For example, if we have N = 13 samples, with windows of L = 5 samples and shifted by S = 3, the result is M = (13 5)3 + 1 = 3 windows.

To prove Eq. (1.10), it must be taken into account that of the N input samples, we need L to make one valid window and hence the summation of 1. As depicted in Figure 1.13, for the remaining samples, N L, each extra valid window requires S samples. The floor function ⌊⋅⌋ eliminates incomplete windows. This is illustrated in Figure 1.13, where the red line separates, at the left, L = 5 samples for the first window, and the remaining N L = 8 samples are enough for two extra valid windows (each one requires S = 3 samples). In this example, the last two of the N = 13 samples are not used in any window.

PIC

Figure 1.13: Interpreting Eq. (1.10) to obtain M = 3 windows of L = 5 samples from a total of N = 13 samples using a shift of S = 3 samples.