## Abstract

Polarization dependent loss (PDL) causes an imbalance in the optical signal-to-noise ratio (OSNR) of the two polarizations, and thus remains one of the major bottlenecks for next-generation polarization-division-multiplexed (PDM) coherent optical transmission systems. In this paper, we investigate Pairwise Coding for adaptive PDL mitigation in PDM coherent optical systems. By pre-coding across two polarizations, the PDL-induced performance degradation can be largely mitigated without any coding overhead. We present details of the coding and decoding design, and derive the analytical symbol/bit error rate of the Polarization Pairwise Coding scheme, which can be used to predict the performance gain and to calculate the optimal rotation angle. Simulation results verify that Pairwise Coding achieves substantial system performance gains over a wide range of PDL values. Compared with other digital coding techniques, Polarization Pairwise Coding outperforms the Walsh-Hadamard transform because it maximizes the coordinate diversity, and it is computationally much simpler to decode than the Golden and Silver Codes; it is therefore practical for current 100-Gb/s and future 400-Gb/s and 1-Tb/s digital coherent transceivers.

© 2015 Optical Society of America

## 1. Introduction

The combination of advanced modulation formats, polarization-division-multiplexing (PDM) and digital coherent receivers enables next-generation high-capacity optical transmission [1]. By obtaining the full optical field information with coherent detection, powerful digital signal processing (DSP) techniques enable effective compensation of most of the system impairments such as I/Q imbalance, chromatic dispersion (CD), polarization mode dispersion (PMD), laser phase noise and local oscillator frequency offset [2]; even fiber nonlinearity can be effectively mitigated to a certain level [3]. However, polarization dependent loss (PDL), which refers to two orthogonal polarizations being attenuated differently, remains an unsolved problem due to its non-unitary nature. Although PDL is negligible in fibers, it is significant in discrete devices such as amplifiers, wavelength division multiplexers, circulators, and isolators [4]. After long-haul transmission, an accumulated PDL of several dB can be easily observed, which may become one of the major bottlenecks for the high-speed long-haul PDM coherent optical systems.

In a PDM coherent optical system, PDL causes non-orthogonality of the PDM signals and imbalance in the received optical signal to noise ratios (OSNR) of the two signal polarizations [5, 6]. Signal non-orthogonality can be equalized with an adaptive multiple-input-multiple-output (MIMO) PMD equalizer; however, OSNR imbalance will eventually limit the overall system performance. In practical systems, the data bits for both polarizations are encoded together so that there is only one FEC decoder at the receiver; the total system performance is then the average over the “good” and “bad” polarizations, but it is still dominated by the lossy polarization.

The degraded OSNR of the lossy polarization is similar to frequency selective fading in a wireless system; thus, PDL could be viewed as “polarization selective fading”. Therefore, coding concepts from wireless systems may be applicable to PDL mitigation. Recently, polarization-time coding, in the form of the Golden and Silver Codes, has been introduced for PDM coherent optical systems to achieve superior PDL tolerance [7–9]. Golden and Silver Codes encode across four symbols (two in each polarization) with a non-invertible coding matrix, and they require computationally expensive maximum likelihood sequence estimators at the receiver. A semi-Silver Code has been demonstrated in [10], using a 4 × 4 adaptive filter for simpler joint channel equalization and decoding; however, the performance is compromised and the computational effort is still double that of a conventional 2 × 2 adaptive filter. The Walsh-Hadamard transform [11], on the other hand, has a simpler coding/decoding transform matrix and maintains the normal constellation decision region at the receiver. It is also able to equalize the OSNR difference between two polarizations; however, the performance improvement is limited because it does not fully utilize the coordinate diversity. The optical implementation of low-complexity space-time pre-coding [12] and disjoint receiver detection schemes [13] either provide limited system performance gain or increase the system’s hardware complexity.

Pairwise Coding originates from the scheme of maximizing the signal space diversity by rotating the conventional signal constellations [14], i.e., rotating the constellation maximizes the performance when the powers of the I and Q noise components are not identical. Pairwise Coding was first applied to single-input-single-output systems to improve the performance over fading channels [14], where the I and Q components are interleaved to allow different channel gains, mitigating the imbalanced SNRs of the I and Q components. It was then extended to MIMO wireless systems, where pairing of sub-channels with different signal-to-interference-and-noise-ratios (SINR), using the same rotation angle and exchanging the real/imaginary parts between different sub-channels, improved the overall BER performance [15]. This Pairwise Coding also improved the receiver sensitivity for direct-detection optical orthogonal-frequency-division multiplexing (OFDM) [16]. At OFC 2015 we reported Pairwise Coding in a PDM coherent optical system for PDL mitigation [17]; our experimental results showed that the pairing of two polarizations significantly improves the system performance over a wide range of worst-case PDL values. Importantly, Pairwise Coding does not require an overhead that would reduce the payload data rate, and needs only a few extra computations per symbol, because at the receiver end, after I/Q de-interleaving, only symbol-by-symbol decision processing is required.

In this paper, we present a theoretical analysis for the Pairwise Coding and decoding design, and derive the analytical symbol error rate (SER) and bit error rate (BER) for pairwise coded signals, which leads to accurate prediction of the performance gain and makes the determination of the optimal rotation angle easier. By comparing with alternative techniques including the Walsh-Hadamard transform and Golden and Silver Codes, using numerical simulations, we show that Pairwise Coding is a good candidate to be integrated into current 100 Gb/s, future 400 Gb/s and 1 Tb/s digital coherent transceivers.

## 2. Pairwise coding theory and design

#### 2.1 Pairwise pre-coding

Figure 1 shows the structure of polarization pairwise pre-coding. Firstly, two data streams are mapped to quadrature amplitude modulation (QAM) symbols ${X}_{n}={a}_{n}+{b}_{n}j$ and ${Y}_{n}={c}_{n}+{d}_{n}j$, and then a constant phase shift, *θ*, is applied to all symbols, which leads to ${X}_{\theta ,n}$ and ${Y}_{\theta ,n}$. After this angular rotation, I/Q component interleaving is used to generate the transmitted signals for each polarization:

*θ* = 45°.
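The rotation and I/Q interleaving steps described above can be sketched numerically. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `qpsk_symbols` and `pairwise_precode` are hypothetical, the symbols are normalized to unit energy, and the interleaving rule (each polarization keeps its own I component and takes the Q component of the other polarization) is assumed from the description in Section 4.1.

```python
import numpy as np

def qpsk_symbols(n, rng):
    """Draw n unit-energy QPSK symbols (+-1 +-1j) / sqrt(2)."""
    bits = rng.integers(0, 2, size=(n, 2))
    return ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

def pairwise_precode(x, y, theta_deg):
    """Rotate both streams by theta, then swap the Q components between polarizations."""
    rot = np.exp(1j * np.deg2rad(theta_deg))
    x_rot, y_rot = x * rot, y * rot
    # Assumed interleaving rule: own I component, Q component of the other polarization.
    x_tx = x_rot.real + 1j * y_rot.imag
    y_tx = y_rot.real + 1j * x_rot.imag
    return x_tx, y_tx

rng = np.random.default_rng(0)
x = qpsk_symbols(4096, rng)
y = qpsk_symbols(4096, rng)
x_tx, y_tx = pairwise_precode(x, y, 45.0)

# Distinct transmitted points on the X polarization for theta = 45 degrees.
points = {(int(round(v.real)), int(round(v.imag))) for v in x_tx}
print(len(points), "distinct constellation points on X")
```

With *θ* = 45° the transmitted points fall on a 3 × 3 grid, which is consistent with the 9-QAM-like constellation discussed in Section 4.1. The interleaving only reshuffles real and imaginary parts, so the total power over both polarizations is unchanged.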

The angular rotation and I/Q interleaving can also be described in matrix form as:

#### 2.2 Channel model and receiver decoding process

After single-mode fiber transmission, the frequency representation of the received signals can be described by the following equation:

*N* PDL elements contributing to the total PDL. ${R}_{\alpha ,i}$ and ${R}_{\beta ,i}$ are two random rotation matrices with uniform distribution in the range $[0, 2\pi]$, and ${\gamma}_{i}$ defines the actual PDL for each PDL element, where $\mathrm{PDL}_{i}=\frac{1+{\gamma}_{i}}{1-{\gamma}_{i}}$. Typically PDL causes one polarization to have a higher OSNR (therefore SNR) than the other, unless the two polarizations share the same loss. The overall system performance is then dominated by the polarization with the lowest SNR. We can combine the PDL rotation matrices with ${H}_{linear}$ to rewrite Eq. (1) as:

*vs*. 28 Gbaud, so it is appropriate to perform analysis over tens of thousands of symbols based on the same $\eta$ value. In this equation, we therefore assume that the signals have a constant polarization state, and hence the same PDL characteristics, within a short time frame. This assumption allows the PDL to be separated from the PMD matrix. A more general expression that encapsulates all of the polarization effects can be found in [20]. We will show later that our scheme can adapt to random changes of the signal polarization state.
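The relation $\mathrm{PDL}_{i}=(1+\gamma_i)/(1-\gamma_i)$ given above can be made concrete with a short sketch that converts a PDL value in dB to $\gamma_i$ and cascades several elements with random orientations. The diagonal form $\mathrm{diag}(\sqrt{1+\gamma_i},\sqrt{1-\gamma_i})$ for the field transfer matrix of one element, the use of real rotation matrices, and all function names are assumptions made for illustration.

```python
import numpy as np

def gamma_from_pdl_db(pdl_db):
    """Invert PDL = (1 + gamma) / (1 - gamma), with PDL given in dB (power ratio)."""
    ratio = 10.0 ** (pdl_db / 10.0)
    return (ratio - 1.0) / (ratio + 1.0)

def pdl_db_from_gamma(gamma):
    """PDL in dB for a given gamma."""
    return 10.0 * np.log10((1.0 + gamma) / (1.0 - gamma))

def rotation(angle):
    """Real 2x2 rotation matrix (assumed form of R_alpha and R_beta)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def pdl_element(pdl_db, alpha, beta):
    """Field transfer matrix of one PDL element between two rotations."""
    g = gamma_from_pdl_db(pdl_db)
    loss = np.diag([np.sqrt(1.0 + g), np.sqrt(1.0 - g)])  # assumed diagonal form
    return rotation(alpha) @ loss @ rotation(beta)

# Cascade four 1-dB elements with random orientations in [0, 2*pi].
rng = np.random.default_rng(1)
h = np.eye(2)
for _ in range(4):
    alpha, beta = rng.uniform(0.0, 2.0 * np.pi, size=2)
    h = pdl_element(1.0, alpha, beta) @ h

# Total PDL is the power ratio between the strongest and weakest singular values.
sv = np.linalg.svd(h, compute_uv=False)
total_pdl_db = 20.0 * np.log10(sv[0] / sv[-1])
print(f"accumulated PDL: {total_pdl_db:.2f} dB")
```

In this model the accumulated PDL depends on the relative orientations of the elements and cannot exceed the sum of the individual values in dB, which is reached only when all lossy axes are aligned.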

After channel equalization, and frequency and phase correction, the normalized recovered time-domain signals can be described as [2]:

*L*, the highest and lowest possible SNR differences are *L* and 0, corresponding to the worst- and best-case PDLs, respectively. In the following, we focus on the system performance for different ΔSNR values. The performance improvement with ΔSNR > 0 can be thought of as an increased tolerance of PDL.

The pairwise decoding process is shown in Fig. 2: the SNR of each polarization is first estimated using the statistical moments method [22] to obtain $\eta $, and the equalized signals are then rescaled accordingly, which essentially balances the noise variances of both polarizations. After I/Q de-interleaving, the resulting real and imaginary parts on each polarization now have different SNR levels:

The essential idea of Polarization Pairwise Coding is that by interleaving and de-interleaving the I/Q components at the transmitter and receiver, the PDL-induced SNR difference between two polarizations is translated into an SNR imbalance between the real and imaginary components of both polarizations, which then enables constellation rotation and rescaling to create an optimal decision region to achieve improved system performance.
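This translation of a polarization SNR imbalance into an I/Q imbalance can be checked with a short Monte Carlo sketch. It assumes an idealized, already equalized channel in which each polarization only sees additive Gaussian noise of a different variance (7 dB and 13 dB SNR are arbitrary illustrative values), and it assumes the interleaving rule in which the Q components are swapped between polarizations. All function names are hypothetical.

```python
import numpy as np

def pairwise_precode(x, y, theta_deg):
    """Rotate by theta and swap the Q components (assumed interleaving rule)."""
    rot = np.exp(1j * np.deg2rad(theta_deg))
    x_rot, y_rot = x * rot, y * rot
    return x_rot.real + 1j * y_rot.imag, y_rot.real + 1j * x_rot.imag

def pairwise_deinterleave(x_rx, y_rx):
    """Undo the Q-component swap; the rotation is left in place for detection."""
    return x_rx.real + 1j * y_rx.imag, y_rx.real + 1j * x_rx.imag

def complex_noise(n, snr_db, rng):
    """Circular Gaussian noise for unit-energy symbols at the given SNR."""
    sigma = np.sqrt(0.5 / 10.0 ** (snr_db / 10.0))
    return sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

rng = np.random.default_rng(2)
n = 200_000
bits = rng.integers(0, 2, size=(4, n))
x = ((2 * bits[0] - 1) + 1j * (2 * bits[1] - 1)) / np.sqrt(2)
y = ((2 * bits[2] - 1) + 1j * (2 * bits[3] - 1)) / np.sqrt(2)

theta = 45.0
x_tx, y_tx = pairwise_precode(x, y, theta)
x_rx = x_tx + complex_noise(n, 7.0, rng)    # lossy polarization (lower SNR)
y_rx = y_tx + complex_noise(n, 13.0, rng)   # good polarization (higher SNR)
x_hat, y_hat = pairwise_deinterleave(x_rx, y_rx)

# Noise on the recovered (still rotated) X stream, split into I and Q parts.
err = x_hat - x * np.exp(1j * np.deg2rad(theta))
var_i, var_q = np.var(err.real), np.var(err.imag)
ratio_db = 10.0 * np.log10(var_i / var_q)
print(f"I/Q noise variance ratio: {ratio_db:.2f} dB")
```

In this simplified model, the I component of the recovered X stream carries the noise of the lossy polarization and its Q component carries the noise of the good polarization, so the ratio of the two variances reproduces the ΔSNR between the polarizations.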

## 3. Error rate and optimal rotation angle analysis

In the following sections we answer two questions: (a) what is the expected benefit of applying Pairwise Coding in the presence of PDL, and (b) which rotation angle leads to the largest benefit?

#### 3.1 Analytical SER and BER

As described in Eq. (7), in each polarization, one of the I and Q components will have a worse SNR than the other, but as both polarizations have similar overall characteristics, we need only analyze one polarization. Also for the sake of simplicity, we use QPSK as the modulation format. As shown in Fig. 3, we need to compare the four Euclidean distances (${D}_{k}$) between a given signal and ${\zeta}_{k}$, and make the decision based on the one that gives the minimum value.
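The minimum-distance decision described above can be sketched as follows. The reference points here are assumed to be the rotated QPSK points with each axis divided by the noise standard deviation of that axis (so that the noise becomes isotropic); the exact definition of ${\zeta}_{k}$ used in the paper is not reproduced, and the names `reference_points` and `decide` are hypothetical.

```python
import numpy as np

# Unit-energy QPSK alphabet, indexed k = 0..3.
QPSK = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)

def reference_points(theta_deg, sigma_i, sigma_q):
    """Rotated QPSK points with each axis scaled by 1/sigma of its noise (assumed form)."""
    pts = QPSK * np.exp(1j * np.deg2rad(theta_deg))
    return pts.real / sigma_i + 1j * pts.imag / sigma_q

def decide(r, theta_deg, sigma_i, sigma_q):
    """Return, for each received sample, the index k of the nearest reference point."""
    r = np.asarray(r, dtype=complex)
    r_scaled = r.real / sigma_i + 1j * r.imag / sigma_q
    zeta = reference_points(theta_deg, sigma_i, sigma_q)
    # Euclidean distances D_k between every sample and every reference point.
    distances = np.abs(r_scaled[:, None] - zeta[None, :])
    return np.argmin(distances, axis=1)

# Noise-free rotated symbols should map back to their own indices.
clean = QPSK * np.exp(1j * np.deg2rad(30.0))
print(decide(clean, 30.0, sigma_i=0.3, sigma_q=0.1))
```

Scaling both the received samples and the reference points by the same per-axis factors is one way to obtain a decision region that accounts for the unequal noise on the I and Q components.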

The SER for pairwise coded QPSK signals can be calculated as:

*θ*= 45°:

#### 3.2 Optimal rotation angle

The optimal rotation angle for QPSK modulation has been derived based on maximizing the mutual information between two information vectors [15]:

*θ*, over more symbols with larger symbol rate. An alternative solution is to test across a number of *θ* candidates using Eqs. (11) and (14) to find the one that provides the minimum SER or BER; however, this approach requires information about ΔSNR and also the total SNR. The advantage of the SER or BER searching method is that it can be easily extended to high-order QAM modulation formats, whereas only the solution for QPSK is provided in [15].
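The candidate search can be illustrated with a sketch in which a Monte Carlo SER estimate stands in for the analytical expressions of Eqs. (11) and (14), which are not reproduced here. The model is an assumption: unit-energy QPSK in one polarization, with independent Gaussian noise of different variance on the I and Q components, parametrized by the hypothetical arguments `snr_i_db` and `snr_q_db`, and detected with a minimum-distance rule on noise-normalized axes.

```python
import numpy as np

QPSK = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)

def simulated_ser(theta_deg, snr_i_db, snr_q_db, n=200_000, seed=0):
    """Monte Carlo SER of rotated QPSK with unequal noise on I and Q."""
    rng = np.random.default_rng(seed)  # same seed: every candidate sees the same noise
    sigma_i = np.sqrt(0.5 / 10.0 ** (snr_i_db / 10.0))
    sigma_q = np.sqrt(0.5 / 10.0 ** (snr_q_db / 10.0))
    k = rng.integers(0, 4, size=n)
    pts = QPSK * np.exp(1j * np.deg2rad(theta_deg))
    r = pts[k] + sigma_i * rng.standard_normal(n) + 1j * sigma_q * rng.standard_normal(n)
    # Normalize each axis by its noise level, then pick the nearest reference point.
    r_scaled = r.real / sigma_i + 1j * r.imag / sigma_q
    zeta = pts.real / sigma_i + 1j * pts.imag / sigma_q
    k_hat = np.argmin(np.abs(r_scaled[:, None] - zeta[None, :]), axis=1)
    return np.mean(k_hat != k)

def search_theta(snr_i_db, snr_q_db, step_deg=4.5):
    """Sweep theta over 0..90 degrees and return the best candidate and all SERs."""
    candidates = np.arange(0.0, 90.0 + 1e-9, step_deg)
    sers = np.array([simulated_ser(t, snr_i_db, snr_q_db) for t in candidates])
    return candidates[int(np.argmin(sers))], sers

theta_best, sers = search_theta(snr_i_db=7.0, snr_q_db=13.0)
print(f"best candidate: {theta_best:.1f} deg, SER {sers.min():.2e} (no rotation: {sers[0]:.2e})")
```

A simulation-based search like this is far slower than evaluating a closed-form SER, and its resolution is limited by the number of simulated symbols; it is meant only to show the structure of the search, including the need to know both component SNRs.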

Figure 4 plots the derived optimal angles based on Eq. (15), SER and BER searching, using a fine search resolution of 0.045°. In Fig. 4(a), at SNR = 12 dB, the difference between the theoretical and SER searching curves is more than 5° for ΔSNR > 5 dB. This difference is reduced by about 50% when Gray-coded BER searching is used, because the minimum SER does not correspond to the minimum BER. Figure 4(b) shows that the SER and BER searching methods estimate different optimal rotation angles over a range of SNRs; for example, there is a 13°-difference at 5-dB SNR. At high SNRs the two methods converge.

#### 3.3 Simulation verifications

We verified the above derivations by conducting a single-channel PDM-QPSK system simulation using VPItransmissionMaker. The symbol rate is 12.5 Gbaud, with root-raised-cosine filtering at a roll-off factor of 0.01. We simulated different $\Delta \text{SNR}$ values by using a single PDL element with the lossy polarization aligned to the X-polarization of the PDM signals, followed by an OSNR-setting module to define the OSNR (specified for a 0.1-nm noise bandwidth). After coherent reception, training-aided channel equalization [23] and pilot-aided maximum likelihood phase estimation [24] were used to recover the signals. Gray coding was applied for bit-to-symbol mapping.

Figure 5 plots BER and SER performances at 10-dB SNR, against rotation angle at a resolution of 4.5° for 3- and 9-dB ΔSNR. For QPSK, the performance is periodic with rotation angle, with a period of 90°; thus, we need only sweep in a range of 0° to 90°. It is clear that the optimal rotation angle is the same for the minimum BER and the minimum SER at 3-dB ΔSNR, because in this case the BER is almost half of the SER, similar to the normal QPSK case. When the ΔSNR is increased to 9 dB, the optimal angles that lead to best BER and SER performance are slightly different. This illustrates that if we choose the rotation angle from the SER search, Gray coding is not the optimal bit-to-symbol mapping scheme. Also, different SNR values lead to different optimal rotation angles; therefore, to pursue the best system performance, we need to jointly consider the bit-to-symbol mapping scheme, system SNR (OSNR), and ΔSNR when determining the optimum rotation angle.

Figure 6 shows the single-channel OSNR versus ΔSNR, where the rotation angle is calculated based on either Eq. (15) or the BER and SER searching methods. Only results for BERs and SERs worse than $10^{-6}$ are shown, due to the limited number of symbols that can be simulated. For the BER results in Figs. 6(a)-6(d), for ΔSNRs of 0 and 3 dB, the optimal rotation angle estimated by both methods is the same; for ΔSNRs of 6 and 9 dB, the two optimal rotation angle calculation methods achieve a similar performance. With increased ΔSNR between the two polarizations, Pairwise Coding provides a larger performance enhancement compared with the unpaired signals; the pairwise coded signals require about 1/2.5/4 dB lower OSNR to achieve a BER of $10^{-3}$ with 3/6/9 dB ΔSNR. There is no performance penalty when ΔSNR is zero, i.e., Pairwise Coding achieves the same performance as unpaired signals if there is zero PDL, or if both polarizations are attenuated equally with best-case PDL (45° between the signal polarizations and the PDL lossy axis).

The SER results with 6-dB and 9-dB $\Delta \text{SNR}$ are shown in Figs. 6(e) and 6(f), respectively. Pairwise Coding requires about 3/4.5 dB lower OSNR to achieve an SER of $10^{-3}$ with 6/9 dB ΔSNR; this again illustrates that a penalty of ≈0.5 dB is incurred due to the sub-optimal Gray bit-to-symbol mapping for Pairwise Coding. For all BER and SER results, the simulations agree well with the theoretical curves derived from Eqs. (14) and (11). Comparing Pairwise Coding with the different optimal angle calculation methods, the optimal angle from Eq. (15) generally produces a performance similar to that of the SER and BER searching methods, but the magnified insets in Figs. 6(c) and 6(e) show that the SER/BER searching methods can achieve a minor improvement. This illustrates that, with the optimal rotation angle calculated from Eq. (15), the system performance approximates the theoretical optimal performance.

Overall, the simulation results verify that, based on the theoretical models, the performance gain given by Pairwise Coding can be predicted precisely.

To demonstrate the effectiveness of the proposed technique in a more practical scenario, we first simulated a single PDL element with 4-dB PDL, fixed the system SNR at 10 dB, and then swept the angle between the input signals’ SOP and the PDL element from 0° to 360°. As shown in Fig. 7(a), without Pairwise Coding, the good and bad polarizations switch between the X and Y axes every 45°. The overall system performance (black curve) also swings with the same period. With Pairwise Coding, the worst performance appears when the SNR is the same for both polarizations (the crossing points of the blue and red curves), and the best performance is attained with the largest ΔSNR. Therefore, Pairwise Coding improves the lower bound of the achievable system performance to be the same as that of the uncoded system with ‘best case PDL’.

We then split the single 4-dB PDL element into four cascaded 1-dB PDL elements, with the same lossy axes and randomly varying signal SOPs between the four PDL elements. We collected 1000 simulation results. The uncoded and coded system performances are shown in Figs. 7(b) and 7(c), respectively. It is clear that Polarization Pairwise Coding improves the worst, best and average system performances significantly. It is also worth mentioning that, although we have only considered the system performance without forward error correction (FEC) codes, it has been reported that a combination of constellation rotation and FEC codes can achieve further system performance gains over fading wireless channels [25]; therefore, there may also be some potential benefits by combining advanced FEC techniques with Polarization Pairwise Coding.

## 4. Discussion on practical implementation issues

#### 4.1 Choosing a fixed rotation angle

We have presented approaches to identify the optimal rotation angle $\theta $ in the previous section. However, the methods based on Eq. (15) and SER/BER searching need values of ΔSNR, and sometimes the system SNR, to be communicated from the receiver to the transmitter, which complicates the system architecture. Also, because the ΔSNR and system SNR are time varying, the rotation angle needs to be updated adaptively to cope with the evolution of these two parameters. This is especially true for the ΔSNR, since the state of polarization of PDM signals can change completely within one millisecond, requiring frequent feedback. Furthermore, different rotation angles generate different constellations after transmitter-side I/Q interleaving, which may set higher requirements on the number of bits of the transmitter-side DAC. The transmitted constellations that are based on optimal rotation angle calculation using Eq. (15) for 3, 6 and 9-dB ΔSNR are shown in Figs. 8(a), 8(b) and 8(c), respectively, indicating that at least a 2-bit DAC is required, rather than a 1-bit DAC for PDM-QPSK. Therefore, a fixed rotation angle is preferred, as it gives a reasonable gain over a wide range of ΔSNR and SNR values with simpler transmitted constellations.

As 45° is the optimal rotation angle for ΔSNR ≤ 4.7 dB, and yields a 9-QAM-like constellation requiring only a 2-bit DAC, *θ* = 45° is attractive for Pairwise Coding. Moreover, the receiver digital channel equalization and phase estimation processes for the 9-QAM format have been thoroughly investigated [26]; thus, the adaptive filter and phase estimator structure of conventional PDM-QPSK systems can be used, with small changes to the error-update algorithms. As the main feature of the *θ* = 45° solution is a simplified pre-coding/decoding process, it is worth comparing it with another well-known coding scheme, the Walsh-Hadamard transform (WHT), which also has a simple encoding matrix for single-carrier PDM signals:

*θ* = 45° case, except that Pairwise Coding exchanges the I of X/Y-polarization and the Q of Y/X-polarization, while WHT mixes all of the I and Q components of two polarizations. If we consider the $\Delta \text{SNR}$ as the worst-case PDL with a single PDL element, applying the WHT is equivalent to rotating the SOP of the incoming signals by *θ* = 45° before passing through the PDL element, which changes the worst-case PDL to the best-case PDL. In conclusion, the major difference between these two schemes is that Pairwise Coding not only balances the SNR between two polarizations, it also manipulates the constellation to maximize the coordinate diversity, while WHT only aims to equalize SNR.
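The SNR-equalizing effect of the WHT can be illustrated with a short sketch. It assumes the standard normalized 2 × 2 Hadamard matrix as the encoding matrix (the modified transform of [11] may differ) and an idealized equalized channel in which the two polarizations see additive Gaussian noise at 7 dB and 13 dB SNR, both arbitrary illustrative values.

```python
import numpy as np

# Assumed encoding matrix: normalized 2x2 Hadamard matrix (orthogonal and symmetric).
W = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

rng = np.random.default_rng(3)
n = 200_000
# Two unit-energy QPSK streams, one per row.
s = (rng.choice([-1.0, 1.0], size=(2, n)) + 1j * rng.choice([-1.0, 1.0], size=(2, n))) / np.sqrt(2)

# Per-polarization noise standard deviation (per real dimension) for 7 dB and 13 dB SNR.
sigma = np.sqrt(0.5 / 10.0 ** (np.array([7.0, 13.0]) / 10.0))
noise = sigma[:, None] * (rng.standard_normal((2, n)) + 1j * rng.standard_normal((2, n)))

rx = W @ s + noise        # WHT-coded signals over the imbalanced channel
s_hat = W.T @ rx          # decoding with the inverse (transpose) transform
noise_var = np.mean(np.abs(s_hat - s) ** 2, axis=1)

print("noise variance per decoded stream:", noise_var)
print("average of the channel noise variances:", np.mean(2.0 * sigma ** 2))
```

After decoding, both streams carry the average of the two channel noise variances, so the SNR is balanced, but the constellation and its decision regions are unchanged; this is the sense in which the WHT equalizes SNR without exploiting coordinate diversity.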

We simulated the performance with three different coding schemes: (1) using *θ* = 45° for Pairwise Coding; (2) adaptively using the optimal angle calculated by Eq. (15) for Pairwise Coding; and (3) using the WHT coding scheme. Figure 9 shows the results. The normal PDM-QPSK performance is also shown as a reference. All three schemes show improved performance compared with the uncoded signals with worst-case PDL. For Pairwise Coding, *θ* = 45° clearly provides a similar performance gain to the optimal angle case for ΔSNR below 7 dB, and the performance penalty for the *θ* = 45° case is about 2.8 dB at 12-dB ΔSNR. At the same time, the *θ* = 45° solution shows a consistent advantage over the Walsh-Hadamard transform method. The reason can be revealed by comparing the recovered constellations for the three schemes (12 dB SNR and 10 dB ΔSNR), as shown in Figs. 9(b)-9(d). With *θ* = 45°, the symbol errors mainly come from the overlap between the two quadrants that sit on the real axis, and in this case one symbol error translates to two bit errors due to Gray coding; with the optimal $\theta $, the four quadrants interact with each other in a quite similar fashion, which leads to better performance. In contrast, WHT only aims to equalize the SNR between two polarizations without altering the decision region, resulting in the least improvement, and it agrees well with the uncoded system performance with best-case PDL. Note that at high ΔSNR, the performance with *θ* = 45° Pairwise Coding can be improved by optimizing the bit-to-symbol mapping scheme to halve the BER, e.g., coding the two interacting quadrants (1 − 1*j* and −1 + 1*j*) with only one bit difference (e.g., 10 01).

An extreme example that helps to understand these three schemes is the case of no noise on the Y-polarization (infinite ΔSNR). Here, Pairwise Coding with an optimal rotation angle will create constellations that all sit on the imaginary axis, achieving infinite capacity. With the *θ* = 45° solution, part of the coordinate diversity is missed, which results in moderate performance. Using WHT, the system’s capacity is limited to being equivalent to having twice the SNR of the X-polarization, which is the lowest among the three schemes. We can therefore conclude that Pairwise Coding with *θ* = 45° provides large gains for a wide range of ΔSNR values, while greatly simplifying the system design.
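The infinite-ΔSNR limit can be explored with a simple geometric sketch: if one component is noise-free, only the projection of the rotated QPSK points onto that axis matters, so one can search for the angle that maximizes the minimum spacing of the projected points. This is an illustration under that assumption and is not a reproduction of Eq. (15); the function name is hypothetical, and the search is restricted to 0° to 45° because the spacing is symmetric about 45°.

```python
import numpy as np

QPSK = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)

def min_spacing_on_imag_axis(theta_deg):
    """Smallest gap between the imaginary parts of the rotated QPSK points."""
    levels = np.sort((QPSK * np.exp(1j * np.deg2rad(theta_deg))).imag)
    return np.min(np.diff(levels))

# Candidate angles at the same 0.045-degree resolution used for Fig. 4.
candidates = np.arange(0.0, 45.0 + 1e-9, 0.045)
spacing = np.array([min_spacing_on_imag_axis(t) for t in candidates])
theta_star = candidates[int(np.argmax(spacing))]

print(f"angle with the largest minimum spacing: {theta_star:.3f} deg")
print("projected levels:", np.sort((QPSK * np.exp(1j * np.deg2rad(theta_star))).imag))
```

Under this geometric criterion the search lands close to $\arctan(1/2)\approx 26.57°$, where the four projected points are equally spaced like a 4-PAM constellation, whereas at *θ* = 45° two of the four points project onto the same level.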

#### 4.2 Comparison with other polarization time codes

Besides WHT, other digital coding methods have also been proposed to improve the PDL tolerance; for example, the Alamouti code, a well-known code with the best performance in 2 × 1 MIMO systems, has been applied for PDL penalty mitigation [27]:

*M*-QAM modulation format, Pairwise Coding needs to compare between *M* single-entry Euclidean distances for each symbol, while for Golden and Silver Codes, $M^{4}$ and $2M^{3}$ 2 × 2 matrix likelihood calculations are required to decode every four symbols, respectively. Therefore, compared with Golden/Silver Codes, Pairwise Coding is significantly less complex during equalization and symbol detection, so could be added into existing digital coherent optical systems.
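The scaling of the detection effort quoted above can be tabulated with a few lines of arithmetic. The function name is hypothetical, and the counts are not directly comparable operation for operation, since Pairwise Coding evaluates single-entry Euclidean distances while the Golden and Silver Codes evaluate 2 × 2 matrix likelihoods.

```python
def detection_counts_per_four_symbols(m):
    """Number of metric evaluations needed to decode four M-QAM symbols."""
    return {
        "pairwise": 4 * m,     # M distances per symbol, four symbols
        "golden": m ** 4,      # M^4 matrix likelihoods per four symbols
        "silver": 2 * m ** 3,  # 2M^3 matrix likelihoods per four symbols
    }

for m in (4, 16, 64):
    counts = detection_counts_per_four_symbols(m)
    print(f"M = {m:2d}: pairwise {counts['pairwise']}, "
          f"golden {counts['golden']}, silver {counts['silver']}")
```

The gap widens rapidly with the modulation order, because the Pairwise Coding count grows linearly in *M* while the other two grow as the fourth and third powers.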

## 5. Conclusions

In this paper, we have presented the design of a zero-overhead digital coding method, Pairwise Coding, to improve the PDL tolerance of PDM coherent optical systems. The pre-coding and decoding structures have been discussed in detail. We derived the analytical SER and BER for pairwise coded signals, which can accurately predict the performance gain and can also be used to find the optimal rotation angle for the transmitter-side pre-coding. Simulation results illustrate the benefit of Pairwise Coding over a wide range of PDL values, and agree well with the analytical SER and BER models. Although the investigation in this paper is limited to QPSK modulation, the principle can be easily extended to higher-order QAM formats by constructing the analytical SER and BER analysis accordingly, to select the correct rotation angle for system performance enhancement. By comparing with other coding methods, we show that using a fixed rotation angle, *θ* = 45°, greatly simplifies the system coding and decoding process while still providing a large performance gain. Thus, Pairwise Coding with a fixed rotation angle can be easily integrated into current and next-generation commercial digital coherent transceivers to give a beneficial performance gain.

## Acknowledgments

We thank VPIphotonics (www.vpiphotonics.com) for the use of their simulator, VPItransmissionMakerWDM V9.1. This work is supported under the Australian Research Council’s Laureate Fellowship (FL130100041) scheme and under CUDOS – ARC Centre of Excellence for Ultrahigh bandwidth Devices for Optical Systems (CE110001018).

## References and links

**1.** P. J. Winzer, “High-spectral-efficiency optical modulation formats,” J. Lightwave Technol. **30**(24), 3824–3835 (2012).

**2.** S. J. Savory, “Digital coherent optical receivers: algorithms and subsystems,” IEEE J. Sel. Top. Quantum Electron. **16**(5), 1164–1179 (2010).

**3.** L. B. Du, D. Rafique, A. Napoli, B. Spinnler, A. D. Ellis, M. Kuschnerov, and A. J. Lowery, “Digital fiber nonlinearity compensation: towards 1Tb/s transport,” IEEE Signal Process. Mag. **31**(2), 46–56 (2014).

**4.** E. Lichtman, “Limitations imposed by polarization-dependent gain and loss on all-optical ultralong communication systems,” J. Lightwave Technol. **13**(5), 906–913 (1995).

**5.** M. Shtaif, “Performance degradation in coherent polarization multiplexed systems as a result of polarization dependent loss,” Opt. Express **16**(18), 13918–13932 (2008).

**6.** C. Xie, “Polarization-dependent loss induced penalties in PDM-QPSK coherent optical communication systems,” in *Optical Fiber Communication Conference and Exposition and The National Fiber Optic Engineers Conference*, OSA Technical Digest Series (CD) (Optical Society of America, 2010), paper OWE6.

**7.** S. Mumtaz, G. Othman, and Y. Jaouen, “Space-time codes for optical fiber communication with polarization multiplexing,” in Proc. IEEE ICC, Cape Town, South Africa, May 2010, pp. 1–5.

**8.** E. Awwad, Y. Jaouën, and G. R. Othman, “Polarization-time coding for PDL mitigation in long-haul PolMux OFDM systems,” Opt. Express **21**(19), 22773–22790 (2013).

**9.** E. Meron, A. Andrusier, M. Feder, and M. Shtaif, “Use of space-time coding in coherent polarization-multiplexed systems suffering from polarization-dependent loss,” Opt. Lett. **35**(21), 3547–3549 (2010).

**10.** M. Zamani, C. Li, and Z. Zhang, “Polarization-time code and 4 × 4 equalizer-decoder for coherent optical transmission,” IEEE Photonics Technol. Lett. **24**(20), 1815–1818 (2012).

**11.** W.-R. Peng, T. Tsuritani, and I. Morita, “Modified Walsh-Hadamard transform for PDL mitigation,” in *39th European Conference and Exposition on Optical Communications*, OSA Technical Digest (CD) (Optical Society of America, 2013), paper P.3.5.

**12.** A. Andrusier, E. Meron, M. Feder, and M. Shtaif, “Optical implementation of a space-time-trellis code for enhancing the tolerance of systems to polarization-dependent loss,” Opt. Lett. **38**(2), 118–120 (2013).

**13.** A. Andrusier and M. Shtaif, “Disjoint detection in polarization multiplexed communication systems affected by polarization dependent loss,” Opt. Express **17**(10), 8173–8184 (2009).

**14.** J. Boutros and E. Viterbo, “Signal space diversity: a power- and bandwidth-efficient diversity technique for the Rayleigh fading channel,” IEEE Trans. Inf. Theory **44**(4), 1453–1467 (1998).

**15.** S. K. Mohammed, E. Viterbo, Y. Hong, and A. Chockalingam, “MIMO precoding with X- and Y-codes,” IEEE Trans. Inf. Theory **57**(6), 3542–3566 (2011).

**16.** Y. Hong, A. J. Lowery, and E. Viterbo, “Sensitivity improvement and carrier power reduction in direct-detection optical OFDM systems by subcarrier pairing,” Opt. Express **20**(2), 1635–1648 (2012).

**17.** C. Zhu, B. Song, L. Zhuang, B. Corcoran, and A. Lowery, “Pairwise coding to mitigate polarization dependent loss,” in *Optical Fiber Communication Conference*, OSA Technical Digest (online) (Optical Society of America, 2015), paper W4K.4.

**18.** P. Poggiolini, “The GN model of non-linear propagation in uncompensated coherent optical systems,” J. Lightwave Technol. **30**(24), 3857–3879 (2012).

**19.** P. M. Krummrich, E. Schmidt, W. Weiershausen, and A. Mattheus, “Field trial results on statistics of fast polarization changes in long haul WDM transmission systems,” in *Optical Fiber Communication Conference and Exposition and The National Fiber Optic Engineers Conference*, Technical Digest (CD) (Optical Society of America, 2005), paper OThT6.

**20.** C. Antonelli, A. Mecozzi, L. E. Nelson, and P. Magill, “Autocorrelation of the polarization-dependent loss in fiber routes,” Opt. Lett. **36**(20), 4005–4007 (2011).

**21.** L. E. Nelson, C. Antonelli, A. Mecozzi, M. Birk, P. Magill, A. Schex, and L. Rapp, “Statistics of polarization dependent loss in an installed long-haul WDM system,” Opt. Express **19**(7), 6790–6796 (2011).

**22.** C. Zhu, A. V. Tran, S. Chen, L. B. Du, C. C. Do, T. Anderson, A. J. Lowery, and E. Skafidas, “Statistical moments-based OSNR monitoring for coherent optical systems,” Opt. Express **20**(16), 17711–17721 (2012).

**23.** C. Zhu, A. V. Tran, C. Do Cuong, S. Chen, T. Anderson, and E. Skafidas, “Digital signal processing for training-aided coherent optical single-carrier frequency-domain equalization systems,” J. Lightwave Technol. **32**(24), 4712–4722 (2014).

**24.** S. Zhang, P. Y. Kam, C. Yu, and J. Chen, “Decision-aided carrier phase estimation for coherent optical communications,” J. Lightwave Technol. **28**(11), 1597–1607 (2010).

**25.** N. H. Tran, H. H. Nguyen, and T. Le-Ngoc, “Performance of BICM-ID with signal space diversity,” IEEE Trans. Wirel. Commun. **6**(5), 1732–1742 (2007).

**26.** B. Huang, J. Zhang, J. Yu, Z. Dong, X. Li, H. Ou, N. Chi, and W. Liu, “Robust 9-QAM digital recovery for spectrum shaped coherent QPSK signal,” Opt. Express **21**(6), 7216–7221 (2013).

**27.** S. Mumtaz, G. Rekaya-Ben Othman, Y. Jaouen, J. Li, S. Koenig, R. Schmogrow, and J. Leuthold, “Alamouti Code against PDL in Polarization Multiplexed Systems,” in *Advanced Photonics*, OSA Technical Digest (CD) (Optical Society of America, 2011), paper SPTuA2.

**28.** J.-C. Belfiore, G. Rekaya, and E. Viterbo, “The golden code: a 2x2 full-rate space-time code with nonvanishing determinants,” IEEE Trans. Inf. Theory **51**(4), 1432–1436 (2005).

**29.** O. Tirkkonen and A. Hottinen, “Square-matrix embeddable space-time block codes for complex signal constellations,” IEEE Trans. Inf. Theory **48**(2), 384–395 (2002).