Convex programming for detection in structured communication problems

The generalized minimum mean squared error (GMMSE) detector has a bit error rate performance similar to that of the MMSE detector. Its advantage is that it does not require knowledge of the noise power. However, its computational complexity is significantly higher than that of the MMSE detector. In this paper, the complexity of the GMMSE detector is reduced by taking into account the Toeplitz structure of the system matrix. Furthermore, by using a circular approximation of the structured system matrix, an approximate GMMSE detector is presented whose computational complexity is only slightly higher than that of MMSE: the only additional cost is an iterative gradient descent algorithm based on the inversion of diagonal matrices.


Introduction
The maximum likelihood (ML) detection problem can be written as a quadratic optimization problem with integer constraints (Verdu, 1998). Unfortunately, this problem is in general non-deterministic polynomial hard (NP-hard) (Verdu, 1989). This observation led to the development of many receivers with reasonable complexity (Tan and Rasmussen, 2001; Chang and Han, 2008), with the well-known least squares (LS) and minimum mean squared error (MMSE) detectors (Lupas and Verdu, 1989; Madhow and Honig, 1994) as the simplest cases.
Recently, convex programming has been successfully employed to solve such detection problems sub-optimally. This kind of relaxation converts the discrete optimization problem into a continuous one, which can be solved iteratively (Boyd and Vandenberghe, 2004). The generalized minimum mean squared error (GMMSE) detector is one important detector of this type: it solves the relaxed detection problem with an unconstrained gradient descent algorithm (Yener, Yates and Ulukus, 2002). Its advantages are that its BER performance is similar to that of the MMSE detector and that it does not require knowledge of the noise power. It can therefore be used in scenarios where the noise power changes rapidly or is unknown. Its disadvantage is a significantly higher computational complexity compared to the MMSE detector.
Correspondence to: T. Morsy (tharwat.morsy@tu-dortmund.de)
In order to decrease this computational complexity, the structure of the system matrix is exploited in this paper. First, the Toeplitz structure of the channel convolution matrix is taken into consideration. In this case, computing the solution of the GMMSE detector requires the eigenvalue decomposition (EVD) of a Toeplitz matrix, but the effort for the iterative gradient descent algorithm is reduced significantly. Nevertheless, computing the EVD of the Toeplitz matrix (using, e.g., the Lanczos algorithm) is still computationally demanding. Therefore, we approximate the banded Toeplitz matrix by a circular matrix (Vollmer et al., 2001). In this case the MMSE/GMMSE solution is obtained by computing the EVD of the circular matrix using FFT/IFFT, such that the required EVD implies no additional effort. The additional effort of the approximate GMMSE detector is then determined only by the iteration steps of the gradient descent algorithm, which operate on the diagonal matrix containing the eigenvalues.
This paper is organized as follows: in Sect. 2 a system model for the detection problem and the convex relaxations of the problem are introduced. The LS and MMSE detectors are described from the convex programming point of view in Sect. 3. The GMMSE detector is described in Sect. 4. In Sect. 5 we introduce our new detector, which is derived from the GMMSE detector by taking into account the structure of the channel matrix. Simulation results are used to compare the bit error rate (BER) of the different detectors in Sect. 6, and the computational complexity is discussed in Sect. 7. Conclusions are drawn in Sect. 8.
Published by Copernicus Publications on behalf of the URSI Landesausschuss in der Bundesrepublik Deutschland e.V.

System model and its relaxations
Consider the system model in matrix form

$$\mathbf{r} = \mathbf{H}\mathbf{x} + \mathbf{n}. \tag{1}$$

The vector $\mathbf{r} \in \mathbb{R}^m$ is the received signal vector, the matrix $\mathbf{H} \in \mathbb{R}^{m \times n}$ is the channel matrix, and the vector $\mathbf{n} \in \mathbb{R}^m$ is additive white Gaussian noise with noise power $\sigma^2$. The transmitted symbols $\mathbf{x} \in \mathbb{R}^n$ are drawn from a Binary Phase Shift Keying (BPSK) constellation, i.e. $\mathbf{x} \in \{-1,+1\}^n$. Under the white Gaussian noise assumption the ML detector of $\mathbf{x}$ is given by

$$\hat{\mathbf{x}}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \{-1,+1\}^n} \|\mathbf{r} - \mathbf{H}\mathbf{x}\|^2. \tag{2}$$

Since $\mathbf{r}^T\mathbf{r}$ does not depend on $\mathbf{x}$, the ML problem in Eq. (2) can be equivalently written as

$$\hat{\mathbf{x}}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \{-1,+1\}^n} \mathbf{x}^T\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\,\mathbf{r}^T\mathbf{H}\mathbf{x}. \tag{3}$$

Substituting the value of the matched filter output

$$\mathbf{y} = \mathbf{H}^T\mathbf{r} \tag{4}$$

into Eq. (3), we get

$$\hat{\mathbf{x}}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \{-1,+1\}^n} \mathbf{x}^T\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\,\mathbf{y}^T\mathbf{x}. \tag{5}$$

This problem is NP-hard, and solving Eq. (5) by exhaustive search has a complexity which grows with $2^n$ (Verdu, 1989). This makes computationally less complex solutions of Eq. (5) interesting. We use convex programming as a mathematical tool to solve problem Eq. (5) by relaxing its constraint set. The constraint set $\mathbf{x} \in \{-1,+1\}^n$, which contains only the corners of the unit hypercube, is not a convex set. We therefore replace it by a convex set.
Figure 1 shows the relaxed constraint sets for $n = 2$, taking into account that the original problem contains only the corners of the unit hypercube. Three relaxations are considered: relaxation of the constraint set to the whole unit hypercube (region I), relaxation to the sphere which covers the unit hypercube (region I+II), and relaxation to the whole space (region I+II+III). The solution in each case can be mapped to the feasible set of the original problem by taking the sign of each component of the relaxed solution vector.
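To make the $2^n$ growth concrete, the following is a minimal sketch of exhaustive ML detection for BPSK together with the sign mapping used for relaxed solutions. The function names, the random test channel, and the problem size are illustrative assumptions, not part of the original work.

```python
import itertools
import numpy as np

def ml_detect_exhaustive(H, r):
    """Brute-force ML detection for BPSK: evaluate all 2^n sign vectors."""
    n = H.shape[1]
    best_x, best_cost = None, np.inf
    for bits in itertools.product((-1.0, 1.0), repeat=n):
        x = np.array(bits)
        cost = np.sum((r - H @ x) ** 2)  # squared Euclidean distance
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x

def map_to_bpsk(x_relaxed):
    """Map a relaxed real-valued solution back to {-1,+1}^n by its sign."""
    return np.where(x_relaxed >= 0, 1.0, -1.0)

# Hypothetical small example (n = 6 means only 64 candidates).
rng = np.random.default_rng(0)
n = 6
H = rng.standard_normal((n + 2, n))
x_true = rng.choice([-1.0, 1.0], size=n)
r = H @ x_true                      # noiseless, so ML must return x_true
x_ml = ml_detect_exhaustive(H, r)
```

Each additional symbol doubles the number of candidates, which is why the relaxations are attractive for realistic block lengths.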

Least squares and MMSE detectors
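We first discuss the LS and MMSE solutions from the convex programming point of view. Relaxing the constraint set to be the whole space, i.e. region I+II+III, problem Eq. (5) takes the form

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in \mathbb{R}^n} \mathbf{x}^T\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\,\mathbf{y}^T\mathbf{x}. \tag{6}$$

The following theorem, stated in (Boyd and Vandenberghe, 2004), describes the LS and MMSE solutions from the convex programming point of view.
Theorem 1 Suppose that the objective function $f$ in an unconstrained convex optimization problem is differentiable. Then the well-known necessary and sufficient condition for a point $\mathbf{x}^*$ to be optimal is

$$\nabla f(\mathbf{x}^*) = \mathbf{0}. \tag{7}$$

Applying condition Eq. (7) to problem Eq. (6), which has the objective function

$$f(\mathbf{x}) = \mathbf{x}^T\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\,\mathbf{y}^T\mathbf{x}, \tag{8}$$

the optimality condition $2\,\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\,\mathbf{y} = \mathbf{0}$ gives the solution

$$\hat{\mathbf{x}}_{\mathrm{LS}} = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{y}, \tag{9}$$

which is the well-known least squares solution.
When the noise power $\sigma^2$ is known, using the same relaxation (region I+II+III) we get the minimum mean squared error solution

$$\hat{\mathbf{x}}_{\mathrm{MMSE}} = (\mathbf{H}^T\mathbf{H} + \sigma^2\mathbf{I})^{-1}\mathbf{y}. \tag{10}$$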
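As an illustration, the LS and MMSE detectors can be sketched in a few lines of NumPy: both solve a linear system in the matched filter output $\mathbf{y} = \mathbf{H}^T\mathbf{r}$ and then take the sign. The function names and the random test channel below are assumptions made for this example only.

```python
import numpy as np

def ls_detect(H, r):
    """Least squares detector: sign of (H^T H)^{-1} H^T r."""
    y = H.T @ r                               # matched filter output
    x = np.linalg.solve(H.T @ H, y)           # solve instead of forming the inverse
    return np.where(x >= 0, 1.0, -1.0)

def mmse_detect(H, r, sigma2):
    """MMSE detector: sign of (H^T H + sigma^2 I)^{-1} H^T r."""
    n = H.shape[1]
    y = H.T @ r
    x = np.linalg.solve(H.T @ H + sigma2 * np.eye(n), y)
    return np.where(x >= 0, 1.0, -1.0)

# Hypothetical example: tall random channel, low noise.
rng = np.random.default_rng(0)
m, n = 40, 8
sigma = 0.1
H = rng.standard_normal((m, n))
x_true = rng.choice([-1.0, 1.0], size=n)
r = H @ x_true + sigma * rng.standard_normal(m)
x_ls = ls_detect(H, r)
x_mmse = mmse_detect(H, r, sigma ** 2)
```

The only difference between the two detectors is the diagonal loading term, which is why the MMSE detector needs the noise power as an input.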

Generalized MMSE detector
If we relax the constraint set in problem Eq. (5) to be the sphere which contains the unit hypercube, i.e. region I+II, then our detection problem takes the form

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}^T\mathbf{x} \le n} \mathbf{x}^T\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\,\mathbf{y}^T\mathbf{x}. \tag{11}$$

Since problem Eq. (11) has a convex objective function over a convex constraint set, it is a convex optimization problem and it has a unique minimum (Boyd and Vandenberghe, 2004). The convex duality theorem guarantees that no duality gap exists, so one can solve the dual problem instead (Nash and Sofer, 1996). Problem Eq. (11) has a single constraint, so there is only one dual variable and a simple iterative algorithm can be employed to solve the dual problem.
We can express the Lagrangian as

$$L(\mathbf{x}, \lambda) = \mathbf{x}^T\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\,\mathbf{y}^T\mathbf{x} + \lambda\,(\mathbf{x}^T\mathbf{x} - n), \tag{12}$$

which is minimized over $\mathbf{x}$ and maximized over $\lambda \ge 0$. Solving for $\mathbf{x}$ in terms of $\lambda$ gives $\mathbf{x}(\lambda) = (\mathbf{H}^T\mathbf{H} + \lambda\mathbf{I})^{-1}\mathbf{y}$. Substituting back and changing the sign of the objective, we obtain

$$\lambda^* = \arg\min_{\lambda \ge 0} \; \lambda n + \mathbf{y}^T(\mathbf{H}^T\mathbf{H} + \lambda\mathbf{I})^{-1}\mathbf{y}. \tag{13}$$

This problem has the advantage that it is a one-dimensional optimization problem. We can solve the dual problem Eq. (13) instead of the primal problem Eq. (11) because there is no duality gap between these two problems. Problem Eq. (13) can be solved by different iterative algorithms (Hansen, 1979). A simple unconstrained gradient descent algorithm given by

$$\bar{\lambda}(k+1) = \bar{\lambda}(k) - \mu\left(n - \mathbf{y}^T(\mathbf{H}^T\mathbf{H} + \bar{\lambda}(k)\mathbf{I})^{-2}\mathbf{y}\right) \tag{14}$$

converges to $\bar{\lambda}$ for a reasonable choice of the step size $\mu$. The solution of Eq. (13) is given by

$$\lambda^* = \max(0, \bar{\lambda}). \tag{15}$$

Then, the unique minimizer of Eq. (11) is

$$\hat{\mathbf{x}}_{\mathrm{GMMSE}} = (\mathbf{H}^T\mathbf{H} + \lambda^*\mathbf{I})^{-1}\mathbf{y}. \tag{16}$$

This solution looks familiar because of its similarity to the MMSE detector. When $\lambda^* = \sigma^2$, the GMMSE detector reduces to the MMSE detector. This detector, which depends on the value of the optimum dual solution $\lambda^*$, is therefore named the generalized MMSE detector. Like the MMSE detector, the GMMSE detector improves the BER performance compared to the LS detector, but it does not require knowledge of the noise power $\sigma^2$. Its disadvantage is that it requires a significantly higher computational complexity than the MMSE detector.
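The dual iteration can be sketched as follows. This is a minimal illustration that assumes the dual objective $\lambda n + \mathbf{y}^T(\mathbf{H}^T\mathbf{H} + \lambda\mathbf{I})^{-1}\mathbf{y}$, whose derivative with respect to $\lambda$ is $n - \|(\mathbf{H}^T\mathbf{H} + \lambda\mathbf{I})^{-1}\mathbf{y}\|^2$. The function name, the step size `mu`, the iteration count, and the test channel are placeholder assumptions and would need tuning for a real channel.

```python
import numpy as np

def gmmse_detect(H, r, mu=0.1, num_iter=100, lam0=0.0):
    """GMMSE sketch: gradient descent on the scalar dual variable lambda."""
    n = H.shape[1]
    A = H.T @ H
    y = H.T @ r
    I = np.eye(n)
    lam = lam0
    for _ in range(num_iter):
        z = np.linalg.solve(A + lam * I, y)   # z = (A + lam I)^{-1} y
        grad = n - z @ z                      # derivative of the dual objective
        lam -= mu * grad                      # unconstrained gradient step
    lam_star = max(0.0, lam)                  # enforce lambda >= 0 afterwards
    x_relaxed = np.linalg.solve(A + lam_star * I, y)
    return np.where(x_relaxed >= 0, 1.0, -1.0), lam_star

# Hypothetical example; note that the noise power is never passed in.
rng = np.random.default_rng(1)
m, n = 40, 8
H = rng.standard_normal((m, n))
x_true = rng.choice([-1.0, 1.0], size=n)
r = H @ x_true + 0.1 * rng.standard_normal(m)
x_hat, lam_star = gmmse_detect(H, r, mu=0.5, num_iter=50)
```

Every iteration solves a dense $n \times n$ system here, which illustrates why the unstructured GMMSE detector is much more expensive than MMSE.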

Structured problem
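In this section we consider the reduction of the computational complexity of the GMMSE detector by taking the structure of the underlying detection problem into account. In the first case the Toeplitz structure of the channel convolution matrix $\mathbf{H}$ is used. We express the matrix $\mathbf{H}^T\mathbf{H}$ in problem Eq. (11) by its eigenvalue decomposition $\mathbf{H}^T\mathbf{H} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^T$, where $\mathbf{V}$ is the matrix whose columns are the eigenvectors of $\mathbf{H}^T\mathbf{H}$ and $\boldsymbol{\Lambda}$ is a diagonal matrix that contains the corresponding eigenvalues as its diagonal elements. Problem Eq. (11) can be rewritten as

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}^T\mathbf{x} \le n} \mathbf{x}^T\mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^T\mathbf{x} - 2\,\mathbf{y}^T\mathbf{x}. \tag{17}$$

The dual problem for problem Eq. (17) takes the form

$$\lambda^* = \arg\min_{\lambda \ge 0} \; \lambda n + \mathbf{y}^T\mathbf{V}(\boldsymbol{\Lambda} + \lambda\mathbf{I})^{-1}\mathbf{V}^T\mathbf{y}. \tag{18}$$

The unconstrained gradient descent algorithm takes the form

$$\bar{\lambda}(k+1) = \bar{\lambda}(k) - \mu\left(n - \mathbf{y}^T\mathbf{V}(\boldsymbol{\Lambda} + \bar{\lambda}(k)\mathbf{I})^{-2}\mathbf{V}^T\mathbf{y}\right) \tag{19}$$

and the GMMSE solution will be

$$\hat{\mathbf{x}}_{\mathrm{GMMSE}} = \mathbf{V}(\boldsymbol{\Lambda} + \lambda^*\mathbf{I})^{-1}\mathbf{V}^T\mathbf{y}. \tag{20}$$

Besides computing $\mathbf{V}^T\mathbf{y}$, only diagonal matrices must be inverted in Eq. (19) and Eq. (20), which simplifies the computations significantly. We can also make use of the Toeplitz structure of $\mathbf{H}^T\mathbf{H}$ when computing the EVD by using the Lanczos algorithm (Golub and Loan, 1996). Although this approach reduces the computational complexity of the GMMSE detector significantly, it is still much more complex than MMSE because of the required EVD (the iterations of Eq. (19) on the diagonal matrices are only of $O(n)$).
In the following we discuss the circular structure case, which is obtained by an approximation of the Toeplitz case. A banded Toeplitz structure gives a circular structure by adding $L - 1$ columns to the Toeplitz matrix, where $L$ is the length of the channel impulse response. This is shown in the following example for $L = 2$ with channel taps $h_0$ and $h_1$:

$$\mathbf{H} = \begin{pmatrix} h_0 & & \\ h_1 & \ddots & \\ & \ddots & h_0 \\ & & h_1 \end{pmatrix} \quad\longrightarrow\quad \tilde{\mathbf{H}} = \begin{pmatrix} h_0 & & & h_1 \\ h_1 & \ddots & & \\ & \ddots & h_0 & \\ & & h_1 & h_0 \end{pmatrix}.$$

If the channel matrix $\mathbf{H}$ in problem Eq. (11) is approximated by the circular matrix $\tilde{\mathbf{H}}$ we obtain

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}^T\mathbf{x} \le n} \mathbf{x}^T\tilde{\mathbf{H}}^T\tilde{\mathbf{H}}\mathbf{x} - 2\,\mathbf{y}^T\mathbf{x}. \tag{21}$$

We can express the matrix $\tilde{\mathbf{H}}^T\tilde{\mathbf{H}}$ by its eigenvalue decomposition $\tilde{\mathbf{H}}^T\tilde{\mathbf{H}} = \mathbf{F}^T\boldsymbol{\Lambda}\mathbf{F}$, where $\mathbf{F}$ is the discrete Fourier transform matrix (computed by FFT) and $\boldsymbol{\Lambda} = \mathrm{diag}(\mathbf{F} \cdot \tilde{\mathbf{H}}(:,1))$. In that case problem Eq. (11) can be written as

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}^T\mathbf{x} \le n} \mathbf{x}^T\mathbf{F}^T\boldsymbol{\Lambda}\mathbf{F}\mathbf{x} - 2\,\mathbf{y}^T\mathbf{x}. \tag{22}$$

The dual problem for problem Eq. (22) is

$$\lambda^* = \arg\min_{\lambda \ge 0} \; \lambda n + \mathbf{y}^T\mathbf{F}^T(\boldsymbol{\Lambda} + \lambda\mathbf{I})^{-1}\mathbf{F}\mathbf{y} \tag{23}$$

and the gradient descent algorithm in the circular case takes the form

$$\bar{\lambda}(k+1) = \bar{\lambda}(k) - \mu\left(n - \mathbf{y}^T\mathbf{F}^T(\boldsymbol{\Lambda} + \bar{\lambda}(k)\mathbf{I})^{-2}\mathbf{F}\mathbf{y}\right). \tag{24}$$

After getting the optimal value $\lambda^*$, the GMMSE solution in the circular case is

$$\hat{\mathbf{x}}_{\mathrm{GMMSE}} = \mathbf{F}^T(\boldsymbol{\Lambda} + \lambda^*\mathbf{I})^{-1}\mathbf{F}\mathbf{y}. \tag{25}$$

Again, besides computing $\mathbf{F}\mathbf{y}$ (IFFT), only diagonal matrices must be inverted in Eqs. (24) and (25). Most importantly, no separate EVD computation is required in the circular case, since the EVD of a circular matrix is easily obtained using FFT/IFFT. Therefore, in this case the additional effort compared to MMSE is given by the iteration of Eq. (24), i.e. inversions of diagonal matrices and scalar products.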
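The circular case can be sketched with NumPy's FFT. This illustration relies on the standard circulant identity that the eigenvalues of $\tilde{\mathbf{H}}^T\tilde{\mathbf{H}}$ are the squared magnitudes of the FFT of the first column of $\tilde{\mathbf{H}}$. It also assumes, purely for the example, that the sphere constraint uses the extended block length $N$, that the channel has no spectral nulls, and that `mu` is hand-tuned. All names and values are hypothetical.

```python
import numpy as np

def gmmse_circular(c, r, mu=0.01, num_iter=100, radius2=None):
    """GMMSE sketch for a circulant channel with first column c (length N)."""
    N = len(c)
    if radius2 is None:
        radius2 = float(N)                 # assumed constraint x^T x <= N
    C = np.fft.fft(c)
    d = np.abs(C) ** 2                     # eigenvalues of H~^T H~
    Y = np.conj(C) * np.fft.fft(r)         # FFT of y = H~^T r
    lam = 0.0
    for _ in range(num_iter):
        Z = Y / (d + lam)                  # diagonal "inversion" only
        # Parseval: ||z||^2 = sum|Z|^2 / N for NumPy's unnormalized FFT
        grad = radius2 - np.sum(np.abs(Z) ** 2) / N
        lam -= mu * grad
    lam_star = max(0.0, lam)
    x_relaxed = np.fft.ifft(Y / (d + lam_star)).real
    return np.where(x_relaxed >= 0, 1.0, -1.0), lam_star

# Hypothetical example with L = 2 taps, checked against a dense circulant matrix.
rng = np.random.default_rng(2)
N = 16
c = np.zeros(N)
c[:2] = [1.0, 0.5]
idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
H_circ = c[idx]                            # H_circ[i, j] = c[(i - j) mod N]
x_true = rng.choice([-1.0, 1.0], size=N)
r = H_circ @ x_true + 0.02 * rng.standard_normal(N)
x_hat, lam_star = gmmse_circular(c, r)
```

Each iteration costs only elementwise operations on length-$N$ vectors, while the transforms are computed once outside the loop.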

Simulation results
In this section the BER performance of the different detectors is discussed. In the simulation we compare the BER performance of the LS, MMSE, and GMMSE detectors for two different structures, Toeplitz and its circular approximation. We ran the simulation for two scenarios. The equalization problem in the first scenario has a channel impulse response of length L = 7 and a transmitted bit vector of length n = 50. The second scenario describes the equalization problem with a channel impulse response of length L = 15 and a transmitted bit vector of length n = 1000. Figures 2 and 3 show that the GMMSE detector has almost the same performance as the MMSE detector, with the advantage that it does not require knowledge of σ 2 . Furthermore, we see that the circular approximation only slightly degrades the performance of the detectors.
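A Monte Carlo BER experiment of this kind can be sketched as below. This is not the simulation setup used for the figures: the channel taps, the SNR definition (unit symbol power over noise power), the number of trials, and the function name are assumptions for illustration, and only the LS and MMSE detectors are included to keep the example short.

```python
import numpy as np

def simulate_ber(h, n, snr_db, num_trials=200, seed=0):
    """Estimate the BER of LS and MMSE equalization over a Toeplitz channel."""
    rng = np.random.default_rng(seed)
    L = len(h)
    m = n + L - 1
    H = np.zeros((m, n))
    for j in range(n):
        H[j:j + L, j] = h                  # banded Toeplitz convolution matrix
    sigma2 = 10.0 ** (-snr_db / 10.0)      # assumed SNR definition
    A = H.T @ H
    errors = {"LS": 0, "MMSE": 0}
    for _ in range(num_trials):
        x = rng.choice([-1.0, 1.0], size=n)
        r = H @ x + np.sqrt(sigma2) * rng.standard_normal(m)
        y = H.T @ r
        x_ls = np.sign(np.linalg.solve(A, y))
        x_mmse = np.sign(np.linalg.solve(A + sigma2 * np.eye(n), y))
        errors["LS"] += np.sum(x_ls != x)
        errors["MMSE"] += np.sum(x_mmse != x)
    total = num_trials * n
    return {name: count / total for name, count in errors.items()}

# Hypothetical two-tap channel at 10 dB.
ber = simulate_ber(np.array([1.0, 0.5]), n=20, snr_db=10.0)
```

Sweeping `snr_db` over a range and plotting the returned values on a logarithmic axis gives BER curves of the usual form.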

Complexity analysis
The computational complexity of the GMMSE detector is composed of two parts. Part 1 is the solution of the system of equations (Eqs. 16, 20 or 25), which is the same as for LS and MMSE (Eqs. 9 or 10). Part 2 is the iterative gradient descent algorithm that determines λ * (Eqs. 14, 19 or 24).
In part 1, if there is no structure, the solution is obtained by the Cholesky algorithm with complexity $n^3/3$. When there is a Toeplitz structure, the solution is given by the Levinson algorithm with complexity $4n^2$, and if we approximate this Toeplitz matrix by a circular matrix, the solution is obtained using the FFT decomposition with complexity $\frac{3}{2}(n + L - 1)\log_2(n + L - 1)$. Therefore, the circular approximation results in a significantly reduced computational complexity.
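The three part 1 operation counts can be evaluated directly for the two simulation scenarios (n = 50, L = 7 and n = 1000, L = 15). The helper below is a small illustrative calculation; its name and dictionary keys are hypothetical.

```python
import math

def part1_complexity(n, L):
    """Operation counts for solving the linear system, per the formulas above."""
    m = n + L - 1                          # size of the circular approximation
    return {
        "unstructured (Cholesky)": n ** 3 / 3,
        "Toeplitz (Levinson)": 4 * n ** 2,
        "circular (FFT)": 1.5 * m * math.log2(m),
    }

for n, L in [(50, 7), (1000, 15)]:
    counts = part1_complexity(n, L)
    print(f"n = {n}, L = {L}")
    for name, value in counts.items():
        print(f"  {name}: {value:.0f}")
```

The gap between the three counts widens with n, since they grow cubically, quadratically, and as n log n respectively.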
In part 2 the gradient descent algorithm adds some complexity. However, for the structured cases (Eqs. 19 or 24) the iterations of the gradient descent algorithm are only applied to diagonal matrices (Λ), such that the complexity is only of O(n) per iteration. Figures 4 and 5 show the mean number of required iterations for the Toeplitz case and the circular case in our two scenarios, respectively. The number of iterations in each scenario is almost the same for both cases. We assume the worst-case complexity for part 2 by taking 6 iterations for all cases. Since the required number of iterations is quite small and the computational complexity is only of O(n) per iteration, the complexity of the gradient descent algorithm is almost negligible compared to part 1. The overall complexity (part 1 and part 2) for our two cases is shown in Figs. 6 and 7.

Conclusions
In this paper, it was shown that the circular approximation of the Toeplitz channel matrix significantly reduces the computational complexity of the GMMSE detector based on the gradient descent algorithm. At the same time it keeps the performance gain over the LS detector, with a BER almost the same as that of MMSE, without any requirement to know the noise power σ 2 .
In future work we will apply the presented technique to various practical problems and evaluate the performance depending on the channel length (L) and the dimension of the transmitted bit vector (n). We will also apply it to some common communication schemes like CDMA and OFDM.