5.3  MAP and ML Receivers for Symbol-by-Symbol Decisions

This section discusses two decision criteria widely adopted in communication receivers: maximum a posteriori (MAP) and maximum likelihood (ML). Under some conditions, MAP and ML can be the optimum data detection strategy for minimizing Pe. For example, for the AWGN channel, which is memoryless, MAP executed on a symbol-by-symbol basis is the optimum criterion. When the symbols are uniformly distributed, ML is equivalent to MAP and, consequently, ML is optimum for AWGN with uniform symbol priors.

The discussion assumes the receiver knows all M conditional PDFs p(r|m) and that they are the correct ones. MAP and ML also guide the receiver when a decision must be made about a sequence of symbols but, for simplicity, only symbol-by-symbol decisions are discussed here.

When using the ML criterion, the receiver defines the decision regions by choosing:

$$\hat{m}_{\text{ML}} = \arg\max_{m_i} p(r|m_i).$$

Hence, after observing a given value $r = R$, all values $p(r = R|m_i)$ for the symbols $m_i$, $i = 1,\ldots,M$, are calculated and the chosen symbol is the one that achieves the maximum $p(r = R|m_i)$.

Note that, commonly, the elements of r are continuous random variables. For example, for AWGN, r is distributed according to a Gaussian. Hence, p(r = R|m) is not a probability but a likelihood (in the context of this text, the value of a continuous PDF or the PDF itself). This is the reason for the name ML criterion. ML does not take into account the prior probabilities p(m), which can be important, as illustrated by the example in Figure 5.4.
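As a concrete illustration, the minimal Octave/Matlab sketch below evaluates the Gaussian likelihoods of one observed sample and picks the argmax. All numeric values (a 4-PAM constellation, the noise variance and the observation) are hypothetical choices made here for the example, not taken from this text:

    % ML decision for one observed sample under AWGN (all values assumed
    % for illustration): evaluate p(r=R|mi) for each mi and take the argmax
    mi = [-3 -1 1 3];            % hypothetical 4-PAM symbol values
    sigma2 = 0.5;                % hypothetical noise variance
    R = 0.7;                     % observed value r = R
    likelihoods = exp(-(R-mi).^2/(2*sigma2)) / sqrt(2*pi*sigma2);
    [~, iML] = max(likelihoods); % index achieving the maximum p(r=R|mi)
    mML = mi(iML)                % ML decision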

The distinction with respect to ML is that MAP takes the priors into account and is the optimum solution in the general case of non-uniform priors.

The MAP criterion seeks the maximization of the a posteriori distribution p(m|r):

$$\hat{m}_{\text{MAP}} = \arg\max_{m_i} p(m_i|r). \qquad (5.2)$$

The posteriors $p(m_i|r)$ can be obtained via Bayes' theorem as in Eq. (A.62), repeated here for convenience:

$$p(m_i|r) = \frac{p(r|m_i)p(m_i)}{p(r)} = \frac{p(r|m_i)p(m_i)}{\sum_{j=1}^{M} p(m_j)p(r|m_j)}.$$

Because p(r) is the same normalization factor for all candidates $m_i$, Eq. (5.2) is equivalent to

$$\hat{m}_{\text{MAP}} = \arg\max_{m_i} p(r|m_i)p(m_i). \qquad (5.3)$$

Eq. (5.3) should be compared to Eq. (5.1). It is possible to prove that the decision regions imposed by the MAP criterion are optimal according to the following reasoning, which assumes r is a discrete r.v. for simplicity.2
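Continuing the previous hypothetical sketch, a MAP decision only requires weighting each likelihood by its prior before the argmax, as in Eq. (5.3); dividing by p(r) to obtain the posteriors of Eq. (5.2) would not change the argmax:

    % MAP decision: weight each likelihood by its prior (values assumed)
    mi = [-3 -1 1 3];              % hypothetical 4-PAM symbols
    priors = [0.1 0.2 0.3 0.4];    % hypothetical non-uniform priors
    sigma2 = 0.5; R = 0.7;
    likelihoods = exp(-(R-mi).^2/(2*sigma2)) / sqrt(2*pi*sigma2);
    [~, iMAP] = max(likelihoods .* priors); % argmax of p(r=R|mi)p(mi), Eq. (5.3)
    mMAP = mi(iMAP)                % MAP decision
    % Normalizing by p(r) gives the posteriors of Eq. (5.2), which sum to 1:
    posteriors = likelihoods.*priors / sum(likelihoods.*priors)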

For each received r, if there is only one symbol $m_i$ for which $p(r|m_i) > 0$, then $\hat{m} = m_i$ and r does not contribute to Pe. If there are two or more symbols $m_i$ for which $p(r|m_i) > 0$, then the decision will surely influence Pe. To see that, assume the symbols for which $p(r|m_i) > 0$ are $m_1$, $m_2$ and $m_3$. Choosing $m_1$ would imply adding the terms $p(r|m_2)p(m_2) + p(r|m_3)p(m_3)$ to Pe (see Eq. (5.1)), while choosing $m_2$ would imply adding the terms $p(r|m_1)p(m_1) + p(r|m_3)p(m_3)$. Therefore, choosing $\hat{m} = \arg\max_{m \in \mathcal{M}} p(r|m)p(m)$ is the optimal decision because it minimizes the term that r contributes to Pe.
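A small numerical sketch of this argument, with made-up discrete likelihoods and priors for three symbols at one particular value of r:

    % Contribution of one discrete value r to Pe (made-up numbers)
    p_r_given_m = [0.2 0.5 0.1];   % hypothetical p(r|m1), p(r|m2), p(r|m3)
    p_m = [0.5 0.2 0.3];           % hypothetical priors
    terms = p_r_given_m .* p_m;    % p(r|mi)p(mi) for each candidate
    % Deciding for candidate k adds sum(terms)-terms(k) to Pe, so the
    % MAP choice (the argmax of terms) minimizes this contribution:
    [~, k] = max(terms);
    Pe_contribution = sum(terms) - terms(k)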

As mentioned, when the symbols are uniformly distributed, i.e., all priors $p(m) = 1/M$ are the same, the ML and MAP criteria lead to the same result. For AWGN, all p(r|m) are Gaussians with the same variance (imposed by the noise) and the symmetry derived from uniform priors implies that the thresholds are the same for ML and MAP and located at the midpoint between neighboring symbols. Figure 5.5 provides an example, where the ML thresholds are relatively easy to obtain as $-2, 0, 2$ by observing where the likelihood PDFs cross each other.
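For instance, assuming a 4-PAM constellation at $-3, -1, 1, 3$ (consistent with the thresholds $-2, 0, 2$ mentioned above), the midpoints can be obtained directly:

    % With uniform priors and equal variances, the ML/MAP thresholds are
    % the midpoints between neighboring symbols (4-PAM assumed here)
    mi = [-3 -1 1 3];
    thresholds = (mi(1:end-1) + mi(2:end))/2  % yields -2, 0, 2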

Figure 5.6: Conditional probabilities p(r|m) and MAP thresholds. In case a), the priors are uniform but the variances of the noises added to each symbol differ, while in b) the variances are the same (as in AWGN) but the priors differ.

To gain insight, a more general case is assumed in the sequel. Assume binary modulation with symbols $m_1$ and $m_2$, where $p(r|m_1) = \mathcal{N}(r|\mu_1,\sigma_1)$ and $p(r|m_2) = \mathcal{N}(r|\mu_2,\sigma_2)$ with $\sigma_1 \neq \sigma_2$. The goal is to calculate the values of r for which $p(m_1)p(r|m_1) = p(m_2)p(r|m_2)$ because they correspond to the thresholds of the MAP decision regions:

$$p(m_1)\frac{1}{\sqrt{2\pi}\,\sigma_1}e^{-\frac{(r-\mu_1)^2}{2\sigma_1^2}} = p(m_2)\frac{1}{\sqrt{2\pi}\,\sigma_2}e^{-\frac{(r-\mu_2)^2}{2\sigma_2^2}}$$

$$\ln\left(\frac{p(m_1)\sigma_2}{p(m_2)\sigma_1}\right) = \frac{(r-\mu_1)^2}{2\sigma_1^2} - \frac{(r-\mu_2)^2}{2\sigma_2^2}.$$

This is a second-order equation $ar^2 + br + c = 0$ and the two thresholds are
$$r = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},$$
where $a = \sigma_2^2 - \sigma_1^2$, $b = 2(\sigma_1^2\mu_2 - \sigma_2^2\mu_1)$ and $c = \sigma_2^2\mu_1^2 - \sigma_1^2\mu_2^2 - 2\sigma_1^2\sigma_2^2\ln\left(\frac{\sigma_2\,p(m_1)}{\sigma_1\,p(m_2)}\right)$.

Figure 5.6 was generated with the code ak_MAPforTwoGaussians.m, which implements this calculation. In case a), the priors are uniform but the standard deviations (2 and 0.4) are distinct, with $p(r|m_1) = \mathcal{N}(r|-1,2)$ and $p(r|m_2) = \mathcal{N}(r|1,0.4)$, as if the noise affected the two symbols differently. Then, there are two thresholds of interest, which is always the case when $\sigma_1^2 \neq \sigma_2^2$. The MAP optimal thresholds are 0.24 and 1.93, which coincide with the ML thresholds because the priors are uniform. A receiver that uses these two thresholds to make its decisions would achieve the Bayes error (the minimum Pe), which in this case is Pe = 0.117. In case b), the standard deviations are the same, such that $p(r|m_1) = \mathcal{N}(r|-1,1)$ and $p(r|m_2) = \mathcal{N}(r|1,1)$, but the threshold is not at the midpoint because the priors $p(m_1) = 0.8$ and $p(m_2) = 0.2$ differ. In this case, because $m_1$ has the larger prior probability, the MAP optimal threshold is 0.69, biased in favor of the symbol $m_1$.
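The lines below are a minimal sketch of the same calculation (ak_MAPforTwoGaussians.m is the book's code; these lines only reproduce the quadratic of the previous paragraph for case a)). Note that the second parameter of $\mathcal{N}(\cdot|\mu,\sigma)$ is the standard deviation:

    % MAP thresholds as the roots of ar^2 + br + c = 0, case a) of Figure 5.6
    mu1 = -1; sigma1 = 2;   % p(r|m1) = N(r|-1,2)
    mu2 = 1;  sigma2 = 0.4; % p(r|m2) = N(r|1,0.4)
    p1 = 0.5; p2 = 0.5;     % uniform priors
    a = sigma2^2 - sigma1^2;
    b = 2*(sigma1^2*mu2 - sigma2^2*mu1);
    c = sigma2^2*mu1^2 - sigma1^2*mu2^2 ...
        - 2*sigma1^2*sigma2^2*log((sigma2*p1)/(sigma1*p2));
    thresholds = roots([a b c])  % the thresholds 0.24 and 1.93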

Figure 5.7: A posteriori distributions p(m|r) for the examples in Figure 5.6. Note that while the p(r|m) are likelihoods, the p(m|r) are probabilities and sum to one.

Figure 5.7 shows the a posteriori distributions p(m|r) for the examples in Figure 5.6. While the MAP threshold for case b) cannot be easily found visually in Figure 5.6, it is very easy to see via Figure 5.7 that 0.69 is the optimal threshold: it is the value of r at which the two posteriors cross.
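A sketch of how the curves of Figure 5.7 for case b) can be generated and the threshold read off numerically (the grid and its range are arbitrary choices made here):

    % Posteriors p(m|r) for case b): equal variances, priors 0.8 and 0.2
    r = linspace(-4, 4, 801);                  % grid with 0.01 spacing
    lik1 = exp(-(r+1).^2/2)/sqrt(2*pi);        % p(r|m1) = N(r|-1,1)
    lik2 = exp(-(r-1).^2/2)/sqrt(2*pi);        % p(r|m2) = N(r|1,1)
    post1 = 0.8*lik1 ./ (0.8*lik1 + 0.2*lik2); % p(m1|r) via Bayes' theorem
    post2 = 1 - post1;                         % p(m2|r); posteriors sum to 1
    [~, i] = min(abs(post1 - 0.5)); % MAP threshold: posteriors cross at 0.5
    threshold = r(i)                % approximately 0.69 = ln(4)/2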

It should be noted that, for AWGN, because the Gaussian PDF decreases monotonically with the distance to its mean, the thresholds obtained with the ML criterion coincide with the boundaries of the Voronoi regions defined by the Euclidean distances among the symbols.
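A quick check of this equivalence, reusing the hypothetical 4-PAM values from the earlier sketches: maximizing the Gaussian likelihood and minimizing the Euclidean distance lead to the same decision:

    % For AWGN with equal variances, argmax likelihood = nearest symbol
    mi = [-3 -1 1 3]; sigma2 = 0.5; R = 0.7;
    [~, iML] = max(exp(-(R-mi).^2/(2*sigma2)));
    [~, iNN] = min(abs(R-mi));  % minimum Euclidean distance (Voronoi) rule
    isequal(iML, iNN)           % prints 1 (true): identical decisions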

2 It is also assumed that $p(m) > 0, \forall m$; otherwise, the effective number of symbols would be less than M.