## Abstract

Few-mode fiber transmission systems are typically impaired by mode-dependent loss (MDL). In an MDL-impaired link, maximum-likelihood (ML) detection yields a significant advantage in system performance compared to linear equalizers, such as zero-forcing and minimum-mean square error equalizers. However, the computational effort of the ML detection increases exponentially with the number of modes and the cardinality of the constellation. We present two methods that allow for near-ML performance without being afflicted with the enormous computational complexity of ML detection: improved reduced-search ML detection and sphere decoding. Both algorithms are tested regarding their performance and computational complexity in simulations of three and six spatial modes with QPSK and 16QAM constellations.

© 2015 Optical Society of America

## 1. Introduction

One widely discussed solution to the imminent capacity crunch in long-haul single-mode fiber (SMF) systems is the use of space-division multiplexed (SDM) systems [1–3]. SDM allows for a virtually linear capacity increase, as the transmission is performed through fiber ribbons, or through multiple cores or modes in a single fiber; the latter approach is referred to as mode-division multiplexing (MDM). Numerous experiments have already validated the feasibility of MDM-based transmission [2, 3] using specially tailored fibers supporting few modes. The use of few-mode fiber (FMF) promises a reduction in cost per bit through the development of optical components that are able to operate on all modes, e.g. few-mode wavelength selective switches [4] and amplifiers [5, 6].

However, several challenges still need to be addressed before MDM can be considered of practical importance to the industry, e.g. mode-dependent loss (MDL), also referred to as mode-dependent gain. MDL is generated by all kinds of inline optical components, e.g. few-mode amplifiers (FM-EDFAs) and switches. It reduces the channel capacity and in extreme cases causes system outage [1, 7, 8]. FM-EDFAs, being one of the major sources of MDL, are a focus of research aiming at gain equalization over all modes and over a wide range of wavelengths [5, 6, 9]. However, this problem has not been solved yet, and it becomes more difficult as the number of modes increases.

Simulations demonstrate that nonlinear maximum-likelihood (ML) detection improves the performance and extends the transmission distance of MDL-impaired systems compared to linear equalization such as zero-forcing (ZF) and minimum-mean square error (MMSE) equalization [8], albeit at enormous computational complexity. While the computational effort for linear equalization already grows with the third power of the number of tributaries (spatial and polarization modes), the complexity of ML detection grows exponentially [8].

In this paper, two alternatives to ML detection with lower computational effort are presented: improved reduced-search ML (IRSML) detection [10] and sphere decoding (SD) [11, 12]. Both were initially proposed for wireless systems, but they are good candidates for improving the performance of MDL-impaired systems. Although the channel matrix in wireless systems differs from the one in FMF systems, both are nonunitary: in wireless systems because of reflection, refraction and scattering, and in FMFs because of MDL. Based on simulations, three- and six-mode orthogonal frequency division multiplexing (OFDM) receivers are assessed in the presence of MDL in terms of performance and computational complexity. Two different modulation formats are tested, resulting in 3×158 Gb/s and 6×158 Gb/s MDM quadrature phase-shift keying (QPSK) OFDM (MDM-QPSK-OFDM), and 3×316 Gb/s and 6×316 Gb/s MDM 16-level quadrature amplitude modulation (16QAM) OFDM (MDM-16QAM-OFDM).

## 2. System setup

The simulated system transmits an OFDM signal using three or six spatial modes for MDM. Through its cyclic prefix (CP), OFDM allows for the removal of the intersymbol interference introduced by dispersion and differential mode delay (DMD). Therefore, equalization and detection can be done separately on each subcarrier and are thus less complex than with single-carrier schemes [8, 13]. The system setup is depicted in Fig. 1 and follows the setup in [14].

With a net symbol rate of 25 Gbaud per tributary, the polarization-multiplexed three-mode system yields a total of 3 × 100 Gb/s with QPSK modulation. Accordingly, the six-mode system with twelve tributaries allows for a total of 6 × 100 Gb/s with QPSK. For 16QAM, bit rates of 3 × 200 Gb/s and 6 × 200 Gb/s, respectively, are achieved. Taking into account 24% overhead for forward error correction, 16.1% for the CP, and 10% for training symbols, this results in gross bit rates of 3 × 158.3 Gb/s and 6 × 158.3 Gb/s for three- and six-mode QPSK transmission, respectively; for 16QAM these values correspond to 3 × 316 Gb/s and 6 × 316 Gb/s, respectively.
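The bit-rate arithmetic above can be retraced with a short sketch. It assumes that the three overheads stack multiplicatively, which the text does not state explicitly; under that assumption the results land close to the quoted 158.3 Gb/s and 316 Gb/s per spatial mode. All names are illustrative.

```python
# Net and gross bit rate per spatial mode (each mode carries two polarizations).
NET_SYMBOL_RATE_GBAUD = 25.0   # per tributary, from the text
POLARIZATIONS_PER_MODE = 2
# Overheads from the text; multiplicative stacking is an ASSUMPTION of this sketch.
OVERHEADS = {"fec": 0.24, "cyclic_prefix": 0.161, "training": 0.10}


def net_rate_per_mode(bits_per_symbol):
    """Net bit rate in Gb/s carried by one spatial mode."""
    return NET_SYMBOL_RATE_GBAUD * bits_per_symbol * POLARIZATIONS_PER_MODE


def gross_rate_per_mode(bits_per_symbol):
    """Gross bit rate in Gb/s after adding FEC, CP and training overhead."""
    rate = net_rate_per_mode(bits_per_symbol)
    for overhead in OVERHEADS.values():
        rate *= 1.0 + overhead
    return rate


for name, bits in (("QPSK", 2), ("16QAM", 4)):
    print(name, net_rate_per_mode(bits), round(gross_rate_per_mode(bits), 1))
```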

#### 2.1. Transmitter

In the transmitter, shown in detail in Fig. 2, the bits from the information source are mapped onto symbols and then converted into *N*_{DATA} = 3328 parallel symbol streams. Together with the *N*_{ZP} = 768 zero-padded subcarriers, the signal is transformed with an inverse fast Fourier transform (IFFT) of size *N*_{FFT} = 4096.

Then, a CP of length *N*_{CP} = 660 is added as a guard interval to avoid intersymbol interference (ISI), and training symbols (TSs) are inserted for channel estimation. The CP length allows for a maximum of 16.7 ns of delay spread due to DMD and residual dispersion, enabling transmission over the maximum simulated distance of 1200 km with a fiber that has approximately 9 ps/km of DMD. The TSs consist of two identical constant amplitude zero autocorrelation (CAZAC) symbols for channel estimation and frame synchronization purposes [8]. Subsequently, pre-emphasis is applied in order to compensate for the spectrum roll-off caused by the digital-to-analog converters (DACs). The signal is split into real (I) and imaginary (Q) parts, which are separately converted into the analog domain by two DACs to drive an I/Q modulator.
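As a quick plausibility check of the guard-interval numbers, the sketch below recomputes the CP overhead and compares the tolerated delay spread with the DMD accumulated over the longest link. It assumes worst-case linear accumulation of DMD with distance; the sampling rate is only implied by the quoted figures and is not stated in the text.

```python
N_FFT = 4096
N_CP = 660
MAX_DELAY_SPREAD_NS = 16.7   # delay spread tolerated by the CP, from the text
DMD_PS_PER_KM = 9.0          # approximate fiber DMD, from the text
MAX_DISTANCE_KM = 1200.0

# CP overhead relative to the useful OFDM symbol length (about 16.1 %).
cp_overhead = N_CP / N_FFT

# ASSUMPTION: DMD accumulates linearly with distance (worst case).
accumulated_dmd_ns = DMD_PS_PER_KM * MAX_DISTANCE_KM / 1000.0

# Sampling rate implied by 660 CP samples spanning 16.7 ns (samples per ns = GSa/s).
implied_sample_rate_gsa = N_CP / MAX_DELAY_SPREAD_NS

print(f"CP overhead: {cp_overhead:.3f}")
print(f"DMD at {MAX_DISTANCE_KM:.0f} km: {accumulated_dmd_ns:.1f} ns")
print(f"Implied sampling rate: {implied_sample_rate_gsa:.1f} GSa/s")
```

The margin between the accumulated DMD and the tolerated delay spread is what remains for residual dispersion.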

#### 2.2. Link

For the three-mode system employing the LP_{01} and two degenerate LP_{11} spatial modes, each in x- and y-polarization, six tributaries are generated. Accordingly, for the six-mode system, six polarization-multiplexed transmitters generate the input signals for the transmission of six spatial modes: LP_{01}, two degenerate LP_{11}, LP_{02}, and two degenerate LP_{21}. As shown in Fig. 1, the outputs of the transmitters are multiplexed (MUX) into the FMF and scrambled in a mode scrambler (MS) at the link input and after each inline amplification. MSs are modeled as a concatenation of complex unitary matrices with random mode coupling, as described in [14, 15]. Inducing mode coupling in the link is fundamental for reducing the accumulated MDL throughout the optical link and for the efficiency of the ML and near-ML detection; with weakly coupled modes, the performance of linear and nonlinear equalization is very similar [8, 14, 15].

The link consists of *N*_{spans} spans of 80 km of FMF, whose parameters and model are reported in [14–16] and the references therein. The amplified spontaneous emission (ASE) noise from the erbium-doped fiber amplifiers (EDFAs) is modeled as a Gaussian noise source at the end of the link. The MDL in the link is simulated by assigning different gains per mode to the inline EDFAs. The gain for the LP_{01} mode is adjusted such that the loss through one fiber span is compensated. For three modes, the gains are adjusted such that *G*_{11} = *G*_{01} − 2 dB as in [6], where *G*_{lm} is the gain for mode LP_{lm}. For six modes, the gains correspond to *G*_{11} = *G*_{01} − 1.5 dB, *G*_{21} = *G*_{01} − 3.5 dB, and *G*_{02} = *G*_{01} − 4 dB, following [5]. At the end of the link, the signal is demultiplexed (DMUX) and fed into *N*_{t} front ends, where *N*_{t} is the number of tributaries.
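A toy version of this link model can illustrate how per-mode amplifier gains and mode scramblers combine into an MDL-impaired channel matrix. The sketch is not the simulator of [14–16]: it assumes one random unitary matrix per scrambler, ignores fiber propagation and noise, and defines MDL as the ratio of the largest to the smallest squared singular value. The function names are hypothetical.

```python
import numpy as np


def random_unitary(n, rng):
    """Random unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    # Normalize column phases so the distribution does not depend on QR conventions.
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases


def link_matrix(n_spans, rel_gains_db, rng):
    """Channel matrix: scrambler at the link input, then (amplifier, scrambler) per span.

    rel_gains_db holds the per-tributary power gain relative to LP01 in dB.
    ASSUMPTION: each mode scrambler is a single random unitary matrix.
    """
    n = len(rel_gains_db)
    amplifier = np.diag(10.0 ** (np.asarray(rel_gains_db, dtype=float) / 20.0))
    h = random_unitary(n, rng)
    for _ in range(n_spans):
        h = random_unitary(n, rng) @ amplifier @ h
    return h


def mdl_db(h):
    """MDL in dB as the ratio of the extreme squared singular values."""
    sv = np.linalg.svd(h, compute_uv=False)
    return 20.0 * np.log10(sv[0] / sv[-1])


# Three-mode case: LP01 (2 polarizations) at 0 dB, LP11a/b (4 tributaries) at -2 dB.
three_mode_gains_db = [0.0, 0.0, -2.0, -2.0, -2.0, -2.0]
rng = np.random.default_rng(0)
for spans in (1, 5, 15):
    values = [mdl_db(link_matrix(spans, three_mode_gains_db, rng)) for _ in range(200)]
    print(spans, "spans: mean MDL", round(float(np.mean(values)), 2), "dB")
```

With scramblers in place the accumulated MDL stays well below the worst case of 2 dB per span, which is the effect the mode scramblers are introduced for.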

#### 2.3. Receiver

The receiver front end, shown in detail in Fig. 2(b), consists of a coherent detector that uses two separate analog-to-digital converters (ADCs) to convert the real and imaginary parts of the signal into the digital domain. The subsequent digital signal processing (DSP) compensates for the dispersion, synchronizes the frames, and removes the CP before converting the symbol stream back to *N*_{FFT} parallel symbol streams. After performing a fast Fourier transform (FFT), the zero-padding (ZP) is removed and the signal is detected with the aid of the channel matrix **H**, which is estimated separately using the TSs. For the detection, the schemes described in Section 3 are employed. Afterward, the signal streams are converted back to serial and demapped into a bit stream.

## 3. Detection schemes

The channel is modeled as linear and time-invariant, disturbed by additive white Gaussian noise (AWGN) [17], assuming low launch power (negligible nonlinear effects), no laser phase noise and no nonlinearities from transmitter and receiver imperfections. The relation between the channel output $\mathbf{r}\in {\mathbb{C}}^{{N}_{\mathrm{t}}}$ and the channel input $\mathbf{s}\in {\mathbb{A}}^{{N}_{\mathrm{t}}}$, where $\mathbb{A}$ is the set of symbols in the constellation, is thus given as

$$\mathbf{r} = \mathbf{H}\mathbf{s} + \eta, \qquad (1)$$

where $\eta \in {\mathbb{C}}^{{N}_{\mathrm{t}}}$ is a random AWGN component. The channel matrix $\mathbf{H}\in {\mathbb{C}}^{{N}_{\mathrm{t}}\times {N}_{\mathrm{t}}}$ for an *N*_{t}×*N*_{t} multiple-input multiple-output (MIMO) channel is estimated separately for every OFDM subcarrier via TSs. To exploit the one-tap channel equalization provided by OFDM, the channel description in Eq. (1) and the subsequent definitions apply subcarrier-wise.

In this section, different equalization and detection schemes are reviewed. First, the methods employing the linear equalizer MMSE and ML detection are presented. Afterward, the IRSML [10] and the SD schemes [11, 12] are described.

#### 3.1. Linear equalizer MMSE

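The MMSE linear equalizer is used because it suffers less from noise amplification than a ZF equalizer. It offers the best possible compromise between fulfilling the Nyquist intersymbol interference criterion and limiting noise amplification. Its equalizer matrix is given as [18]

$$\mathbf{G}_{\mathrm{MMSE}} = {\left(\mathbf{H}^{\mathrm{H}}\mathbf{H} + \frac{\sigma_{\eta}^{2}}{\sigma_{\mathbf{s}}^{2}}\,\mathbf{I}\right)}^{-1}\mathbf{H}^{\mathrm{H}}, \qquad (2)$$

where $\mathbf{I}$ is the identity matrix and ${(\cdot)}^{\mathrm{H}}$ denotes the Hermitian transpose. Here, ${\sigma}_{\mathbf{s}}^{2}$ and ${\sigma}_{\eta}^{2}$ denote the variances of the transmit signal and the noise *η*, respectively. The equalization is performed by multiplying the equalizer matrix **G**_{MMSE} with the received signal **r** as [18]

$$\mathbf{y}_{\mathrm{MMSE}} = \mathbf{G}_{\mathrm{MMSE}}\,\mathbf{r}. \qquad (3)$$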
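The MMSE equalization step can be sketched numerically as follows. The sketch assumes the standard regularized-inverse form of the MMSE matrix, built from the Gram matrix of **H** plus the noise-to-signal variance ratio on the diagonal; the function and parameter names (`mmse_matrix`, `noise_var`, `signal_var`) are illustrative.

```python
import numpy as np


def mmse_matrix(h, noise_var, signal_var=1.0):
    """MMSE equalizer matrix (H^H H + (noise_var / signal_var) I)^-1 H^H."""
    n_t = h.shape[1]
    gram = h.conj().T @ h + (noise_var / signal_var) * np.eye(n_t)
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(gram, h.conj().T)


def mmse_equalize(h, r, noise_var, signal_var=1.0):
    """Apply the MMSE equalizer to one received vector r (one subcarrier)."""
    return mmse_matrix(h, noise_var, signal_var) @ r


# Usage on a random 6x6 channel with unit-power QPSK symbols.
rng = np.random.default_rng(0)
h = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
s = (rng.choice([-1.0, 1.0], 6) + 1j * rng.choice([-1.0, 1.0], 6)) / np.sqrt(2)
noise_var = 0.01
noise = np.sqrt(noise_var / 2) * (rng.standard_normal(6) + 1j * rng.standard_normal(6))
y = mmse_equalize(h, h @ s + noise, noise_var)
print(np.round(y, 2))
```

In a receiver the matrix would be computed once per channel estimate and reused for all received vectors of that subcarrier, which is the split assumed in the complexity analysis of Section 5.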

#### 3.2. Maximum-likelihood detection

The optimum receiver is provided by the maximum a posteriori (MAP) criterion [18], which is equivalent to ML decisions if the input symbols have equal a priori probabilities, as in this case. The ML detector is based on the maximization of the probability *P*(**s**|**r**) over all input hypersymbols **s**. In case of an AWGN channel, this leads to the well-known form of the ML detection scheme [18]:

$$\hat{\mathbf{s}}_{\mathrm{ML}} = \underset{\mathbf{s}\in {\mathbb{A}}^{{N}_{\mathrm{t}}}}{\arg\min}\ {\lVert \mathbf{r}-\mathbf{H}\mathbf{s}\rVert}^{2}. \qquad (4)$$

Equation (4) implies an exhaustive search over all possible input vectors **s**; for that reason, the computational effort rises exponentially with the constellation size and the number of tributaries [8] (see also Section 5). This approach is considered infeasible in a practical scenario.
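A brute-force version of this search makes the exponential growth tangible. The sketch below enumerates every hypersymbol and keeps the one with the smallest squared Euclidean distance to the received vector; it is only practical for small dimensions, and the names are illustrative.

```python
import itertools

import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)


def ml_detect(h, r, constellation):
    """Exhaustive ML detection: minimize ||r - H s||^2 over all M^N_t hypersymbols."""
    n_t = h.shape[1]
    best, best_metric = None, np.inf
    for candidate in itertools.product(constellation, repeat=n_t):
        s = np.array(candidate)
        metric = np.linalg.norm(r - h @ s) ** 2
        if metric < best_metric:
            best, best_metric = s, metric
    return best, best_metric


# Usage: 3 tributaries with QPSK means 4**3 = 64 candidates.
rng = np.random.default_rng(0)
h = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
s_true = QPSK[rng.integers(0, 4, 3)]
noise = 0.05 * (rng.standard_normal(3) + 1j * rng.standard_normal(3))
s_hat, metric = ml_detect(h, h @ s_true + noise, QPSK)
print(s_hat, metric)
```

For six tributaries the same loop already runs over 4096 QPSK candidates, and for twelve tributaries with 16QAM over 16^12 candidates.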

#### 3.3. Improved reduced-search ML detection

Because ML detection offers optimum performance on the one hand but has high complexity on the other, finding algorithms capable of achieving near-ML performance with reduced effort has been a topic of research [10, 19, 20]. In this paper, the IRSML approach is investigated [10], since it achieves near-ML performance with few pre-processing stages before the ML stage and has a complexity that can be fixed independently of the channel conditions. The goal of the IRSML algorithm is to find the most reliable search set with a smaller set of hypersymbol hypotheses than brute-force ML detection. It is referred to as the IRSML detector because it is an improved version of the algorithm proposed earlier in [21]. IRSML offers better performance and higher flexibility regarding different constellation formats than the reduced-search maximum-likelihood (RSML) algorithm.

The structure of the IRSML detector is shown in Fig. 3. The output of the MMSE equalizer is fed into a hard decision unit and is then used as a starting point to construct a search set for the ML stage of the algorithm, which is characterized by the equation

$$\hat{\mathbf{s}}_{\mathrm{IRSML}} = \underset{\mathbf{s}\in \mathbb{S}}{\arg\min}\ {\lVert \mathbf{r}-\mathbf{H}\mathbf{s}\rVert}^{2}. \qquad (5)$$

The search set $\mathbb{S}$ is fed into the ML detector, where $\mathbb{S}$ is constructed according to the MAP criterion by finding the hypothetical symbols *s*_{i} for *i* = 1,…,*N*_{t} with the highest probability $P({s}_{i}|{y}_{{\mathrm{MMSE}}_{i}})$ as derived in [10]. For this purpose, the Euclidean distances between the MMSE equalizer output and the detected symbols from this equalizer are computed as

$$d_{\mu,i} = \left| {y}_{{\mathrm{MMSE}}_{i}} - s_{\mu,i} \right|, \qquad (6)$$

where *μ* = 1,…,*M* is the index over all possible constellation points; accordingly, *s*_{μ,i} refers to the constellation point of index *μ* for tributary *i* [10]. The Euclidean distances in Eq. (6) are sorted in ascending order by finding the pair (*μ*, *i*) that satisfies

$$(\mu, i) = \underset{\mu,\,i}{\arg\min}\ d_{\mu,i}, \qquad (7)$$

and the corresponding symbol *s*_{μ,i} is added as a candidate to the set ${\mathbb{S}}_{i}$ of tributary *i*. The process of adding the hypersymbol candidates to the set ${\mathbb{S}}_{i}$ is repeated until a predefined maximum number of hypersymbol candidates *N*_{max} is reached or until it would be exceeded by adding another candidate. The maximum number of hypersymbol candidates *N*_{max} is chosen according to the required performance and the allowed computational complexity.

#### 3.4. Sphere decoding

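The idea behind the SD algorithm is to achieve ML performance by searching for the lattice point closest to the received signal, under the condition that it lies within a hypersphere of radius *R* [11, 12]. Equations (8) and (9) represent this constraint [22, 23],

$$\hat{\mathbf{s}}_{\mathrm{SD}} = \underset{\mathbf{s}\in {\mathbb{S}}_{{R}^{\prime}}(\mathbf{r},\mathbf{H})}{\arg\min}\ {\lVert \mathbf{r}-\mathbf{H}\mathbf{s}\rVert}^{2}, \qquad (8)$$

$${\mathbb{S}}_{{R}^{\prime}}(\mathbf{r},\mathbf{H}) = \left\{ \mathbf{s}\in {\mathbb{A}}^{{N}_{\mathrm{t}}} : {\lVert \mathbf{r}-\mathbf{H}\mathbf{s}\rVert}^{2} \le {R^{\prime}}^{2} \right\}, \qquad (9)$$

where ${\mathbb{S}}_{{R}^{\prime}}(\mathbf{r},\mathbf{H})\subset {\mathbb{A}}^{{N}_{\mathrm{t}}}$ corresponds to the set of possible candidates for performing the SD algorithm. In the equations the radius is written ${R}^{\prime}$ to distinguish it from the triangular matrix **R** introduced below.

By decomposing the channel matrix **H** into a unitary matrix **Q** and an upper triangular matrix **R** with the so-called QR decomposition [24], another perspective on Eq. (8) is obtained. After substituting **QR** for **H**, both terms inside the norm in Eq. (8) can be multiplied by **Q**^{H}, since multiplication with a unitary matrix does not change the *l*_{2}-norm. This leads to an equivalent form of Eq. (8) and Eq. (9):

$$\hat{\mathbf{s}}_{\mathrm{SD}} = \underset{\mathbf{s}\in {\mathbb{S}}_{{R}^{\prime}}(\mathbf{w},\mathbf{R})}{\arg\min}\ {\lVert \mathbf{w}-\mathbf{R}\mathbf{s}\rVert}^{2}, \qquad (10)$$

$${\mathbb{S}}_{{R}^{\prime}}(\mathbf{w},\mathbf{R}) = \left\{ \mathbf{s}\in {\mathbb{A}}^{{N}_{\mathrm{t}}} : {\lVert \mathbf{w}-\mathbf{R}\mathbf{s}\rVert}^{2} \le {R^{\prime}}^{2} \right\}, \qquad (11)$$

where **w** = **Q**^{H}**r**.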

A closer look at the elements of the vector **Q**^{H}**r** − **Rs** reveals how the triangularity of **R** can be exploited. For this, the norm is rewritten as the sum of its components:

$${\lVert \mathbf{Q}^{\mathrm{H}}\mathbf{r}-\mathbf{R}\mathbf{s}\rVert}^{2} = \sum_{k=1}^{{N}_{\mathrm{t}}} {\left| \mathbf{q}_{k}^{\mathrm{H}}\mathbf{r} - \sum_{l=1}^{{N}_{\mathrm{t}}} R_{kl}\,s_{l} \right|}^{2}, \qquad (12)$$

where **q**_{k} is the *k*-th column of **Q**. The order of the outer summation is inverted and the triangularity of **R**, i.e. *R*_{kl} = 0 ∀ *l* < *k*, is exploited, yielding

$${\lVert \mathbf{Q}^{\mathrm{H}}\mathbf{r}-\mathbf{R}\mathbf{s}\rVert}^{2} = \sum_{k={N}_{\mathrm{t}}}^{1} {\left| \mathbf{q}_{k}^{\mathrm{H}}\mathbf{r} - \sum_{l=k}^{{N}_{\mathrm{t}}} R_{kl}\,s_{l} \right|}^{2}. \qquad (13)$$

As evident in Eq. (13), the addend for tributary *N*_{t} only depends on the candidate symbol *s*_{l} with *l* = *N*_{t}. The remaining addends for the lower tributary numbers *N*_{t}−1, *N*_{t}−2 and so on only depend on the candidate symbols of their own tributary and of higher tributary numbers. It is important to note that only the last addend, for *k* = 1, depends on all the elements of the candidate hypersymbol **s**, and, because of the absolute value in Eq. (13), every term adds up monotonically. For this reason, the summation can be interrupted as soon as it exceeds the bound set by the radius *R* of the hypersphere. Thus, a significant number of candidates can be discarded.

The appropriate choice of the initial hypersphere radius *R* is crucial. The algorithm will find a suboptimal solution if *R* is too small. However, if *R* is oversized, the complexity of the algorithm can be even greater than that of the ML algorithm because of the additional operations that the SD has to perform, such as the QR decomposition. Following the approach in [25], the starting point for the tree search is

$$\mathbf{e} = \left\lfloor \mathbf{R}^{-1}\mathbf{Q}^{\mathrm{H}}\mathbf{r} \right\rceil, \qquad (14)$$

where ⌊·⌉ denotes rounding to the closest constellation point. The hypersymbol **e** can be expected to be close to the actual solution, even though noise amplification through **R**^{−1} prevents ML performance from being achieved directly with the computation of **e**. Accordingly, the search starts with an infinite radius *R*, which is first shrunk to the radius resulting from the starting point **e** and is then updated whenever a lower value is found with another symbol combination.
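The tree search described in this subsection can be sketched as a depth-first search with radius shrinking. This is a minimal illustration rather than the implementation used in the simulations: it assumes a square channel matrix, visits the children of each node in constellation order (no Schnorr-Euchner ordering), and uses hypothetical function names.

```python
import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)


def nearest_points(x, constellation):
    """Round every entry of x to the closest constellation point."""
    idx = np.argmin(np.abs(x[:, None] - constellation[None, :]), axis=1)
    return constellation[idx]


def sphere_decode(h, r, constellation):
    """Depth-first sphere decoder; returns (hypersymbol, squared distance, visited nodes)."""
    n_t = h.shape[1]
    q, r_tri = np.linalg.qr(h)          # H = Q R with upper triangular R
    w = q.conj().T @ r
    # Starting point: rounded solution of R e = w; it sets the initial squared radius.
    best = nearest_points(np.linalg.solve(r_tri, w), constellation)
    best_metric = np.linalg.norm(w - r_tri @ best) ** 2
    s = np.zeros(n_t, dtype=complex)
    visited = 0

    def search(k, partial):
        nonlocal best, best_metric, visited
        for symbol in constellation:
            s[k] = symbol
            visited += 1
            # Row k only involves tributaries k..N_t-1 because R is upper triangular.
            metric = partial + abs(w[k] - r_tri[k, k:] @ s[k:]) ** 2
            if metric >= best_metric:
                continue                 # partial sum left the sphere: prune the branch
            if k == 0:
                best, best_metric = s.copy(), metric   # leaf inside the sphere: shrink it
            else:
                search(k - 1, metric)

    search(n_t - 1, 0.0)
    return best, best_metric, visited


# Usage on a random 6x6 channel with QPSK (4**6 = 4096 leaves in the full tree).
rng = np.random.default_rng(0)
h = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
s_true = QPSK[rng.integers(0, 4, 6)]
noise = 0.1 * (rng.standard_normal(6) + 1j * rng.standard_normal(6))
s_hat, metric, visited = sphere_decode(h, h @ s_true + noise, QPSK)
print(visited, "nodes visited")
```

The `visited` counter corresponds to the number of nodes explored in the search tree, which is the quantity that makes the complexity of SD depend on the noise level and the channel conditions.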

## 4. System performance

In this section, the system performance of the different detection schemes described in Section 3 is compared. It is evaluated in terms of the required optical signal-to-noise ratio (OSNR) penalty at a target bit error rate (TBER) of 1E-3. The back-to-back performance is taken as the reference for computing the required OSNR penalty. Three and six modes employing QPSK and 16QAM were simulated. The mean OSNR penalty and error bars of twice the standard deviation, obtained from 200 channel realizations, are shown in Fig. 4(a) for a system with three modes using QPSK. Figure 4(a) depicts two exemplary values of *N*_{max}, since the IRSML algorithm offers the possibility to vary the size of the search set.

In Fig. 4(a) it can be seen that the OSNR penalty of the different schemes increases with the distance. This is due to the MDL introduced by the FM-EDFA, which accumulates with the square root of the distance as shown in Fig. 4(b) [14].

Figure 4(a) shows that the MMSE linear equalizer is outperformed by the other detection schemes, since these schemes do not cause the noise amplification that the MMSE equalizer incurs through its matrix pseudoinverse. It can also be seen that both the SD and the IRSML detection with a sufficient number of hypersymbol candidates *N*_{max} achieve near-ML performance. As *N*_{max} is increased, the IRSML detection curve approaches the performance of the ML detection. In this case, with six tributaries and QPSK, the total number of hypersymbol candidates, and hence the largest possible *N*_{max}, is 4^{6} = 4096. In Fig. 4, it can be observed that allowing for only two candidates already yields an increase of the transmission distance of 240 km at 1 dB of OSNR penalty, i.e. an increase of 60 % in the transmission reach.

The results are summarized in Fig. 5, where the different detection schemes and modulation formats are illustrated. The curves represent the average reach allowing an OSNR penalty of 1 dB. Figure 5 shows how the reach of the IRSML detection approaches the ML performance with increasing search set size. In Fig. 5(a), the MMSE equalization poses the lower bound around 365 km, while ML and SD allow for transmission over more than double this distance.

Because of the tremendous computational complexity, it was possible to employ the brute-force ML detection only for the case of three modes and QPSK. Therefore, the SD algorithm will serve as a benchmark for ML performance. This can be supported by its theoretical derivation (see Section 3), and it was furthermore confirmed with the near-ML performance achieved in the case of three modes and QPSK, as shown in Fig. 4(a) and 5(a).

In addition, the solid markers in Fig. 5 highlight the points on the IRSML curve that achieve more than 95 % of the corresponding ML or SD reach. This is used as a criterion for choosing *N*_{max} with near-ML performance. For three modes, IRSML achieves this transmission distance with a minimum of 16 and 32 candidates for QPSK and 16QAM, respectively; for six modes, these values correspond to 32 and 128 candidates for QPSK and 16QAM, respectively.

## 5. Computational complexity

This section shows a comparison of the computational complexity of the different equalization and detection schemes described in the previous sections. Here, the complexity is computed per subcarrier and is defined as the number of complex multiplications needed to detect one bit [13].

#### 5.1. MMSE equalizer

According to Eq. (2), the MMSE equalizer needs one complex matrix inversion and two matrix multiplications to calculate the filter matrix ${\mathbf{G}}_{\mathrm{MMSE}}\in {\mathbb{C}}^{{N}_{\mathrm{t}}\times {N}_{\mathrm{t}}}$. The matrix inversion is assumed to be performed using the LU matrix factorization method, which is a well-known method with good stability and computational efficiency [24]. The number of complex multiplications needed for the matrix inversion is given by Eq. (15) [24]. The two matrix multiplications in Eq. (2), i.e. the product ${\mathbf{H}}^{\mathrm{H}}\mathbf{H}\in {\mathbb{C}}^{{N}_{\mathrm{t}}\times {N}_{\mathrm{t}}}$ and the multiplication of the inverse with ${\mathbf{H}}^{\mathrm{H}}\in {\mathbb{C}}^{{N}_{\mathrm{t}}\times {N}_{\mathrm{t}}}$, require $2{N}_{\mathrm{t}}^{3}$ complex multiplications. It is assumed that **G**_{MMSE} is computed only when the channel is estimated, i.e. every *N*_{f} OFDM symbols, whereas the operation in Eq. (3) is necessary for every input vector and requires ${N}_{\mathrm{t}}^{2}$ complex multiplications. Therefore, the total number of complex multiplications per bit needed for the detection using the MMSE equalizer is given by Eq. (16). The denominator in Eq. (16) corresponds to the total number of bits that can be simultaneously equalized per subcarrier.

#### 5.2. ML detection

The ML detection described in Eq. (4) requires the computation of the Euclidean distances between the ${M}^{{N}_{\mathrm{t}}}$ possible input vectors and the received signal vector. For each candidate, the vector $\mathbf{s}\in {\mathbb{C}}^{{N}_{\mathrm{t}}}$ is multiplied with the channel matrix $\mathbf{H}\in {\mathbb{C}}^{{N}_{\mathrm{t}}\times {N}_{\mathrm{t}}}$, which requires ${N}_{\mathrm{t}}^{2}$ complex multiplications. These operations allow for the simultaneous detection of *N*_{t} log_{2} *M* bits. Therefore, the number of complex multiplications per bit for ML detection is

$$\frac{{M}^{{N}_{\mathrm{t}}}\,{N}_{\mathrm{t}}^{2}}{{N}_{\mathrm{t}}\,{\log}_{2}M} = \frac{{M}^{{N}_{\mathrm{t}}}\,{N}_{\mathrm{t}}}{{\log}_{2}M}. \qquad (17)$$

#### 5.3. IRSML detection

As described in Section 3.3, the IRSML scheme employs two stages. First, it equalizes the signal with the MMSE equalizer, which requires *c*_{2} complex multiplications per bit. Then, it performs ML detection, whose complexity corresponds to Eq. (17) with the factor ${M}^{{N}_{\mathrm{t}}}$ replaced by *N*_{max}, owing to the reduced number of hypotheses employed in the IRSML algorithm. The complexity of the IRSML is therefore the sum of the complexities of these two stages:

$$c_{2} + \frac{{N}_{\mathrm{max}}\,{N}_{\mathrm{t}}^{2}}{{N}_{\mathrm{t}}\,{\log}_{2}M}. \qquad (18)$$
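The two stages can be sketched in code as follows. This is one possible reading of the candidate-set construction of Section 3.3, not the reference implementation of [10]: each tributary starts with its MMSE hard decision, further constellation points are added in order of increasing distance to the MMSE output, and the process stops as soon as one more candidate would push the number of hypersymbols beyond *N*_{max}. All names are illustrative.

```python
import itertools
import math

import numpy as np

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)


def irsml_detect(h, r, constellation, n_max, noise_var, signal_var=1.0):
    """Reduced-search ML: MMSE stage, candidate-set construction, then ML over the set."""
    n_t = h.shape[1]
    # Stage 1: MMSE equalization.
    gram = h.conj().T @ h + (noise_var / signal_var) * np.eye(n_t)
    y = np.linalg.solve(gram, h.conj().T @ r)

    # Distances d[mu, i] between constellation point mu and the MMSE output of tributary i.
    d = np.abs(constellation[:, None] - y[None, :])
    # Every tributary starts with its hard decision (closest point).
    sets = [[int(np.argmin(d[:, i]))] for i in range(n_t)]
    # Add further points in ascending order of distance until n_max would be exceeded.
    for flat in np.argsort(d, axis=None):
        mu, i = divmod(int(flat), n_t)
        if mu in sets[i]:
            continue
        size = math.prod(len(c) for c in sets)
        if size // len(sets[i]) * (len(sets[i]) + 1) > n_max:
            break
        sets[i].append(mu)

    # Stage 2: ML detection restricted to the Cartesian product of the per-tributary sets.
    best, best_metric = None, np.inf
    for combo in itertools.product(*sets):
        s = constellation[list(combo)]
        metric = np.linalg.norm(r - h @ s) ** 2
        if metric < best_metric:
            best, best_metric = s, metric
    return best, best_metric, math.prod(len(c) for c in sets)


# Usage: 6 tributaries, QPSK, at most 16 of the 4096 hypersymbols are tested.
rng = np.random.default_rng(0)
h = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
s_true = QPSK[rng.integers(0, 4, 6)]
noise = 0.1 * (rng.standard_normal(6) + 1j * rng.standard_normal(6))
s_hat, metric, n_candidates = irsml_detect(h, h @ s_true + noise, QPSK, 16, 0.02)
print(n_candidates, "hypersymbol candidates tested")
```

Because the number of tested hypersymbols never exceeds `n_max`, the effort of the second stage is bounded regardless of the channel realization, in contrast to the sphere decoder.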

#### 5.4. Sphere decoding

Since the SD algorithm was proposed, its complexity has been examined for different implementations [22, 26]. In [22], the authors demonstrate that the average complexity grows exponentially, although not proportionally to ${M}^{{N}_{\mathrm{t}}}$ as for ML detection, but proportionally to ${M}^{\gamma {N}_{\mathrm{t}}}$ for *γ* ∈ (0, 1]. According to [22], ${M}^{\gamma {N}_{\mathrm{t}}}$ corresponds to the number of nodes visited in the search tree, i.e. the number of considered hypotheses. If the factor *γ* is much smaller than one, the SD may be feasible even for large numbers of tributaries and high-order constellation formats. If *γ* is one, the algorithm has visited all nodes in the search tree, as ML detection does, but with higher complexity because of additional operations such as the QR decomposition.
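Given a logged number of visited nodes, the exponent *γ* follows from inverting the relation between the node count and ${M}^{\gamma {N}_{\mathrm{t}}}$. The helper below is a hypothetical illustration of that inversion; how nodes are counted in the actual simulations is not specified here.

```python
import math


def gamma_from_nodes(visited_nodes, constellation_size, n_tributaries):
    """Solve visited_nodes = M**(gamma * N_t) for gamma."""
    return math.log(visited_nodes) / (n_tributaries * math.log(constellation_size))


# Visiting 64 of the 4**6 = 4096 hypotheses for six tributaries and QPSK gives gamma = 0.5.
print(gamma_from_nodes(64, 4, 6))
```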

The complexity of the SD algorithm comprises the complexity of the QR decomposition as in Eq. (19) [24], the inversion of **R** for computing Eq. (14) as in Eq. (15), and the multiplications required to compute the metrics in Eq. (12) for every hypothesis **s**. Since the first two terms depend only on the channel matrix and not on the received signal, they are calculated only every *N*_{f} OFDM symbols. The latter term, the number of multiplications needed to compute the metrics in Eq. (12), is denoted by *B* and has been logged during the simulations, since it depends on the noise level and the channel conditions, i.e. on the amount of MDL in the system. *B* is the sum, over all visited nodes, of the complex multiplications per visited node. The total complexity of the SD algorithm corresponds to Eq. (20).

#### 5.5. Comparison

In this section, the complexity of the different detection schemes is compared. The parameter *N*_{f} for computing Eq. (16) and (20) is assumed to be 20, which allows for 10% of TS overhead when sending two TSs per frame for synchronization and channel estimation purposes. That is, the channel is assumed to be constant over 20 OFDM symbols or 2.4 µs, which is in accordance with the experimental results shown in [27], where the minimum rate at which the channel estimation should be updated was determined to be approximately 250 kHz, i.e. every 4 µs. Accordingly, *N*_{f} can be increased even further, up to 33 OFDM symbols. With such values of *N*_{f} for the IRSML detection and SD, the complex multiplications per bit and subcarrier affected by *N*_{f} in Eq. (18) and (20) are virtually negligible. For the MMSE equalizer, the contribution of the part dependent on *N*_{f} in Eq. (16) is comparable to the rest of the equation.
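The frame-length figures in this paragraph follow from simple arithmetic, retraced below with illustrative variable names. The only inputs are the numbers quoted in the text.

```python
FRAME_SYMBOLS = 20            # N_f assumed in the comparison
FRAME_DURATION_US = 2.4       # duration of 20 OFDM symbols, from the text
TRAINING_SYMBOLS = 2          # TSs per frame
CHANNEL_UPDATE_RATE_KHZ = 250.0

symbol_duration_us = FRAME_DURATION_US / FRAME_SYMBOLS      # 0.12 us per OFDM symbol
update_period_us = 1000.0 / CHANNEL_UPDATE_RATE_KHZ         # 4 us between channel updates
max_frame_symbols = int(update_period_us / symbol_duration_us)
training_overhead = TRAINING_SYMBOLS / FRAME_SYMBOLS

print(symbol_duration_us, update_period_us, max_frame_symbols, training_overhead)
```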

The complexity of the IRSML detection is depicted in Fig. 6. *N*_{max} is assumed to be the minimum value in Fig. 5 that achieves 95 % of the reach of ML or SD. For example, for three modes and QPSK the *N*_{max} employed is 16. Figure 6 shows that IRSML achieves near-ML performance with far fewer complex multiplications than the ML algorithm. For three modes, the IRSML algorithm reduces the complexity of the ML detection by two and five orders of magnitude for QPSK and 16QAM, respectively; for six modes, the complexity reduction is larger, reaching five and 12 orders of magnitude for QPSK and 16QAM, respectively.
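These orders of magnitude can be reproduced approximately from the multiplication counts of Sections 5.1 to 5.3. The sketch below counts ${N}_{\mathrm{t}}^{2}$ multiplications per tested hypersymbol and divides by the *N*_{t} log_{2} *M* bits detected at once. The cost of the LU-based inversion is not reproduced here, so the sketch assumes ${N}_{\mathrm{t}}^{3}$ multiplications for it; this assumption only affects the small MMSE term.

```python
import math


def ml_mults_per_bit(m, n_t):
    """M**N_t candidates, N_t**2 multiplications each, N_t*log2(M) bits per detection."""
    return m ** n_t * n_t ** 2 / (n_t * math.log2(m))


def mmse_mults_per_bit(m, n_t, n_f=20, inversion_mults=None):
    """MMSE stage: matrix computed every n_f symbols, N_t**2 multiplications per vector."""
    if inversion_mults is None:
        inversion_mults = n_t ** 3   # ASSUMPTION: stand-in for the LU-based inversion cost
    per_update = (inversion_mults + 2 * n_t ** 3) / n_f
    return (per_update + n_t ** 2) / (n_t * math.log2(m))


def irsml_mults_per_bit(m, n_t, n_max, n_f=20):
    """MMSE stage plus ML stage over n_max hypersymbols."""
    return mmse_mults_per_bit(m, n_t, n_f) + n_max * n_t ** 2 / (n_t * math.log2(m))


# (constellation size M, tributaries N_t, N_max) for the four simulated systems.
cases = {
    "3 modes, QPSK": (4, 6, 16),
    "3 modes, 16QAM": (16, 6, 32),
    "6 modes, QPSK": (4, 12, 32),
    "6 modes, 16QAM": (16, 12, 128),
}
for name, (m, n_t, n_max) in cases.items():
    ratio = ml_mults_per_bit(m, n_t) / irsml_mults_per_bit(m, n_t, n_max)
    print(f"{name}: ML/IRSML = 10^{math.log10(ratio):.1f}")
```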

As the complexity of the SD algorithm is variable, it is represented with error bars in Fig. 7, where the length of the vertical lines is twice the standard deviation of the data. In Fig. 7(a), the solid lines show the average number of complex multiplications for a TBER of 1E-3; the dotted lines represent the average complexity for values above and below a bit error rate (BER) of 1E-3 for six modes and 16QAM. It can be seen that the average complexity of the SD increases as the channel conditions degrade. As the inset in Fig. 7(a) shows, the MDL accumulates with the square root of the distance, due to the strong mode coupling generated by the MSs [14]. This demonstrates that both the BER and the MDL influence the complexity of the SD algorithm. The greater the BER and the MDL, the less likely it is that the initial hypothesis in Eq. (14) is the one that minimizes Eq. (8). Consequently, with badly conditioned channels the algorithm searches extensively for the ML solution, which leads to the execution of more complex multiplications.

Whereas Fig. 7(a) indicates the average complexity, and therefore gives an indication of the energy consumption of the algorithm implemented in hardware, the maximum complexity indicates the amount of hardware resources that need to be available. As Fig. 7(b) shows, the maximum number of complex multiplications is several orders of magnitude above the average complexity, often exceeding the complexity of the IRSML.

Figures 7(c) and 7(d) illustrate the value of *γ*, which has also been computed during the simulations from the number of nodes visited by the SD algorithm at a BER of 1E-3. The error bars correspond to the average *γ* and the isolated markers correspond to the maximum *γ* registered. Additionally, the dotted curves in Fig. 7(d) correspond to the average *γ* for six modes, 16QAM and a BER of 1.8E-3 and 0.2E-3. Similar to Fig. 7(a) and 7(b), and as mentioned in [22], the average and the maximum *γ* increase as the channel conditions worsen, and *γ* is no longer much smaller than one.

A promising approach to reduce the complexity of SD has been proposed and compared with existing schemes in [23]. The decoder proposed in [23] uses a fixed effort, i.e., it does not rely on the variable complexity of the SD. However, its complexity reduction has to be traded off against its performance degradation.

## 6. Conclusion

The detection algorithms IRSML and SD have been compared in terms of performance and complexity in 3×158-Gb/s and 6×158-Gb/s MDM-QPSK-OFDM, and 3×316-Gb/s and 6×316-Gb/s MDM-16QAM-OFDM systems in the presence of MDL. Both algorithms outperform the MMSE linear equalizer. The SD algorithm achieves ML performance with a complexity that is on average lower than that of ML detection. However, the complexity of the SD is variable, since it depends on the channel conditions, i.e. MDL and signal-to-noise ratio (SNR), which makes it difficult to implement in hardware. In contrast, the IRSML algorithm has a fixed complexity and can achieve near-ML performance with considerably less computational effort than ML detection. Already with two hypersymbol candidates for the IRSML detection, the 16QAM six-mode system can achieve an improvement of 60 % in the average transmission reach over the MMSE equalizer, allowing for 1 dB of OSNR penalty.

## References and links

**1.** P. J. Winzer and G. J. Foschini, “MIMO capacities and outage probabilities in spatially multiplexed optical transport systems,” Opt. Express **19**(17), 16680–16696 (2011).

**2.** V. A. J. M. Sleiffer, H. Chen, Y. Jung, P. Leoni, M. Kuschnerov, A. Simperler, H. Fabian, H. Schuh, F. Kub, D. J. Richardson, S. U. Alam, L. Grüner-Nielsen, Y. Sun, A. M. J. Koonen, and H. de Waardt, “Field demonstration of mode-division multiplexing upgrade scenarios on commercial networks,” Opt. Express **21**(25), 31036–31046 (2013).

**3.** R. Ryf, S. Randel, N. K. Fontaine, X. Palou, E. Burrows, S. Corteselli, S. Chandrasekhar, A. H. Gnauck, C. Xie, R.-J. Essiambre, P. J. Winzer, R. Delbue, P. Pupalaikis, A. Sureka, Y. Sun, L. Grüner-Nielsen, R. V. Jensen, and R. Lingle Jr., “708-km combined WDM/SDM transmission over few-mode fiber supporting 12 spatial and polarization modes,” in Proc. of ECOC’13 (Optical Society of America, 2013), paper We.2.D.1.

**4.** N. K. Fontaine, R. Ryf, C. Liu, B. Ercan, J. R. Salazar-Gil, S. G. Leon-Saval, J. Bland-Hawthorn, and D. T. Neilson, “Few-mode fiber wavelength selective switch with spatial-diversity and reduced-steering angle,” in Proc. of OFC/NFOEC’14 (Optical Society of America, 2014), paper Th4A.7.

**5.** M. Salsi, R. Ryf, G. Le Cocq, L. Bigot, D. Peyrot, G. Charlet, S. Bigo, N. K. Fontaine, M. A. Mestre, S. Randel, X. Palou, C. Bolle, B. Guan, and Y. Quiquempois, “A six-mode erbium-doped fiber amplifier,” in Proc. of ECOC’12 (Optical Society of America, 2012), paper Th.3.A.6.

**6.** Y. Jung, S. Alam, Z. Li, A. Dhar, D. Giles, I. P. Giles, J. K. Sahu, F. Poletti, L. Grüner-Nielsen, and D. J. Richardson, “First demonstration and detailed characterization of a multimode amplifier for space division multiplexed transmission systems,” Opt. Express **19**(26), B952–B957 (2011).

**7.** K.-P. Ho and J. M. Kahn, “Mode-dependent loss and gain: statistics and effect on mode-division multiplexing,” Opt. Express **19**(17), 16612–16635 (2011).

**8.** A. Lobato, F. Ferreira, B. Inan, S. Adhikari, M. Kuschnerov, A. Napoli, B. Spinnler, and B. Lankl, “Maximum-likelihood detection in few-mode fiber transmission with mode-dependent loss,” IEEE Photon. Technol. Lett. **25**(12), 1095–1098 (2013).

**9.** Y. Jung, Q. Kang, J. Sahu, B. Corbett, J. O’Callaghan, F. Poletti, S. Alam, and D. Richardson, “Reconfigurable modal gain control of a few-mode EDFA supporting 6 spatial modes,” IEEE Photon. Technol. Lett. **26**(11), 1100–1103 (2014).

**10.** M. Chouayakh, A. Knopp, and B. Lankl, “Fixed effort MIMO decoders for wireless indoor channels: Theory and practical field trials,” in Proc. of PIMRC’08, 2008.

**11.** U. Fincke and M. Pohst, “Improved methods for calculating vectors of short length in a lattice, including a complexity analysis,” Mathematics of Computation **44**(170), 463–471 (1985).

**12.** E. Viterbo and J. Boutros, “A universal lattice code decoder for fading channels,” IEEE Trans. Inf. Theory **45**(5), 1639–1642 (1999).

**13.** B. Inan, B. Spinnler, F. Ferreira, D. van den Borne, A. Lobato, S. Adhikari, V. Sleiffer, M. Kuschnerov, N. Hanik, and S. Jansen, “DSP complexity of mode-division multiplexed receivers,” Opt. Express **20**(9), 10859–10869 (2012).

**14.** A. Lobato, F. Ferreira, J. Rabe, M. Kuschnerov, B. Spinnler, and B. Lankl, “Mode scramblers and reduced search maximum likelihood detection for mode-dependent-loss-impaired transmission,” in Proc. of ECOC’13 (Optical Society of America, 2014), paper Th.2.C.3.

**15.** A. Lobato, F. Ferreira, J. Rabe, M. Kuschnerov, B. Spinnler, and B. Lankl, “Enhanced performance for MDL-impaired few-mode fiber transmission,” in Proc. of OECC/ACOFT’14 (Optical Society of America, 2014), paper TU4B-3.

**16.** F. Ferreira, D. Fonseca, A. Lobato, B. Inan, and H. Silva, “Reach improvement of mode division multiplexed systems using fiber splices,” IEEE Photon. Technol. Lett. **25**(12), 1091–1094 (2013).

**17.** S. Randel, A. Sierra, S. Mumtaz, A. Tulino, R. Ryf, P. Winzer, C. Schmidt, and R.-J. Essiambre, “Adaptive MIMO signal processing for mode-division multiplexing,” in Proc. of OFC/NFOEC’14 (Optical Society of America, 2014), paper OW3D.5.

**18.** E. Biglieri, R. Calderbank, A. Constantinides, A. Goldsmith, A. J. Paulraj, and V. H. Poor, *MIMO Wireless Communications* (Cambridge University, 2007).

**19.** J.-S. Kim, S.-H. Moon, and I. Lee, “A new reduced complexity ML detection scheme for MIMO systems,” IEEE Trans. Commun. **58**(4), 1302–1310 (2010).

**20.** K. Honjo and T. Ohtsuki, “Computational complexity reduction of MLD based on SINR in MIMO spatial multiplexing systems,” IEICE Transactions on Communications **89**(3), 914–921 (2006).

**21.** M. Chouayakh, A. Knopp, and B. Lankl, “Low-effort near maximum likelihood MIMO detection with optimum hardware resource exploitation,” Electron. Lett. **43**(20), 1104–1106 (2007).

**22.** J. Jaldén and B. Ottersten, “On the complexity of sphere decoding in digital communications,” IEEE Trans. Signal Process. **53**(4), 1474–1484 (2005).

**23.** N. Tax and B. Lankl, “Fixed effort sphere decoder for MIMO OFDM systems,” in Proc. of WCSP’13, 2013.

**24.** L. N. Trefethen and D. Bau, *Numerical Linear Algebra* (Society for Industrial and Applied Mathematics, 1997).

**25.** E. Agrell, T. Eriksson, A. Vardy, and K. Zeger, “Closest point search in lattices,” IEEE Trans. Inf. Theory **48**(8), 2201–2214 (2002).

**26.** B. Hassibi and H. Vikalo, “On the expected complexity of integer least-squares problems,” in *Proc. of ICASSP’02*, 2002.

**27.** X. Chen, J. He, A. Li, J. Ye, and W. Shieh, “Characterization and analysis of few-mode fiber channel dynamics,” IEEE Photon. Technol. Lett. **25**(18), 1819–1822 (2013).