## Abstract

In this paper, we propose a low-complexity format-transparent digital signal processing (DSP) scheme for next generation flexible and energy-efficient transceivers. It employs QPSK symbols as the training and pilot symbols for the initialization and tracking stages of the receiver-side DSP, respectively, for various modulation formats. The performance is numerically and experimentally evaluated in a dual polarization (DP) 11 Gbaud 64QAM system. Employing the proposed DSP scheme, we conduct a system-level study of Tb/s bandwidth-adaptive superchannel transmissions with flexible modulation formats including QPSK, 8QAM and 16QAM. The spectrum bandwidth allocation is realized in the digital domain instead of turning on/off sub-channels, which improves the performance of higher order QAM. Various transmission distances ranging from 240 km to 6240 km are demonstrated with colorless detection for hardware complexity reduction.

© 2014 Optical Society of America

## 1. Introduction

100G coherent optical transceivers are being commercially deployed [1]. Moreover, 800G transmission has been demonstrated using commercial products [2]. As the industry moves towards the next generation transceiver, some new capabilities are anticipated. First, the data rate per transceiver is expected to reach 1 Tb/s. The superchannel technology is likely to be used for addressing the speed limitations of the electronics [3]. Second, the flexibility in transmitting a signal with adaptive spectral efficiency and bandwidth will be essential for future flexgrid and agile optical networks [4]. Tb/s flexible transmission has been demonstrated without using digital-to-analog converters (DACs) [5]. However, DACs are more desirable in such scenarios for their flexibility and agility in signal generation. Furthermore, high speed DACs are now available and are used in commercial products [6]. Recently, we demonstrated spectral efficiency-adaptive transmissions using time domain hybrid QAM and high speed DACs [7]. Third, energy-efficient and format-transparent digital signal processing (DSP) will be needed to reduce the required hardware resources and power consumption for next generation transceivers. Due to the massive parallelization in the DSP implementation at such high sampling rates, the tracking speed of conventional decision-directed feedback algorithms such as the least-mean-square (LMS) algorithm for filter adaptation and the phase-locked loop (PLL) for carrier phase recovery (CPR) will be significantly reduced [8, 9]. Therefore, it is desirable to design new DSP concepts and algorithms that are low-complexity, format-transparent and suitable for parallel implementation.

Recently, we reported on a system-level study of Tb/s bandwidth- and format-adaptive transmission employing high speed DACs (34 GSa/s) and novel format-transparent DSP algorithms in [10]. In this paper, we provide a more detailed description of this study with additional experimental results including the investigation of the bandwidth adjustment and colorless detection. In particular, we first describe the principle of the proposed DSP scheme, which employs QPSK symbols as the training and pilot symbols for the receiver initialization and tracking, respectively. The performance is compared to the LMS + PLL scheme both numerically and experimentally in an 11 Gbaud dual-polarization (DP) 64QAM system, indicating an improved performance of the proposed scheme. After that, the experimental setup for the Tb/s superchannel flexible transmission is introduced, which consists of 10 sub-channels. The raw data rate is fixed at 1.2 Tb/s and we investigate the performance of three cases including 30 Gbaud QPSK, 20 Gbaud 8QAM and 15 Gbaud 16QAM per sub-channel. Different from the previous work [5], where sub-channels are turned on/off for bandwidth allocation, we adjust the symbol rate of each sub-channel by re-sampling the digital signal, leading to a performance optimization when the implementation noise is dependent on the symbol rate. Assuming a 7% hard-decision (HD) forward error correction (FEC), we realize a transmission distance from 240 km to 2400 km with a net spectral efficiency from 6.44 b/s/Hz to 3.22 b/s/Hz, while with a 20% soft-decision (SD) FEC the transmission distance is from 1680 km to 6240 km with a net spectral efficiency from 5.78 b/s/Hz to 2.89 b/s/Hz. Moreover, the colorless detection of each sub-channel is employed with negligible performance degradation, which is important for the cost-effective implementation of such systems.

## 2. Format-transparent digital signal processing

#### 2.1 Description of proposed DSP

Figures 1(a) and 1(b) depict the generic structure of a flexible transmitter and receiver, respectively. In particular, the transmitted sequence is encoded and processed before being converted to analog signals by the DACs. The transmitter-side DSP enables switching of the modulation format, tuning of the signal symbol rate, pulse shaping and pre-compensation of channel impairments such as chromatic dispersion (CD) and fiber nonlinearities. Linear electrical-to-optical (E/O) conversion, including linear RF drivers and linear IQ modulators, is essential to ensure the quality of a high order QAM signal or a pre-compensated signal. At the receiver side, the optical field can be linearly mapped to the electrical field by coherent detection. After digitization using analog-to-digital converters (ADCs), DSP is applied to recover and decode the signal.

Many DSP algorithms for various functions have been proposed for specific modulation formats. However, for a flexible transceiver, low-complexity and format-transparent DSP is preferred in order to save hardware resources and power consumption. Among those functions, it is especially challenging to design the adaptive filter and the CPR for parallel implementation. LMS combined with PLL is usually employed in transmission experiments based on offline processing, where the feedback delay is not considered [11]. However, when the large feedback delay is taken into account, either novel feedforward CPR algorithms such as the blind phase search (BPS) proposed in [9] or the superscalar parallelization based PLL in [12, 13] should be employed. The disadvantage of the former is its high complexity, while for the latter it is not straightforward to feed the recovered phase back to the LMS adaptation.

In this work, we propose a data-aided receiver-side DSP scheme as shown in Fig. 1(c), which is low-complexity, format-transparent and suitable for parallel implementation. The DSP is divided into two stages: 1) initialization; and 2) tracking. Training symbols with QPSK format are sent for the initialization stage. Two identical patterns each containing 100 symbols are sent at the very beginning for a coarse synchronization based on the auto-correlation metric. Then the pre-convergence of the butterfly filter taps is achieved using constant modulus algorithm (CMA). Since the training symbols are QPSK symbols which are suitable for CMA, optimal filter coefficients can be obtained. Afterwards, the equalized training symbols are used for the initial frequency offset (FO) estimation based on the 4th power of the QPSK symbols. After compensating CD and FO, fine synchronization can be done using the cross-correlation between the received and transmitted training symbols [7].
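Two of the initialization steps above can be sketched in a few lines of Python. The sketch below locates the repeated 100-symbol pattern with an auto-correlation metric and then estimates the FO from the 4th power of the QPSK training symbols. The function names, the idle (zero) samples placed before the burst, the 200 MHz offset and the noise-free test signal are all assumptions made for illustration; the sketch works at one sample per symbol and omits the CMA and CD compensation stages.

```python
import cmath
import math
import random


def qpsk_symbols(n, rng):
    """Draw n unit-power QPSK symbols at phases pi/4 + k*pi/2."""
    return [cmath.exp(1j * (math.pi / 4 + rng.randrange(4) * math.pi / 2))
            for _ in range(n)]


def coarse_sync(rx, pattern_len):
    """Index where two adjacent windows of length pattern_len correlate best."""
    best_idx, best_metric = 0, 0.0
    for d in range(len(rx) - 2 * pattern_len + 1):
        corr = sum(rx[d + i].conjugate() * rx[d + i + pattern_len]
                   for i in range(pattern_len))
        # Small margin so that numerical ties keep the earliest index.
        if abs(corr) > best_metric + 1e-9:
            best_idx, best_metric = d, abs(corr)
    return best_idx


def fo_estimate_4th_power(symbols, symbol_rate):
    """Estimate the FO in Hz; unambiguous for |FO| < symbol_rate / 8."""
    fourth = [s ** 4 for s in symbols]  # the 4th power removes QPSK modulation
    acc = sum(fourth[i + 1] * fourth[i].conjugate()
              for i in range(len(fourth) - 1))
    return cmath.phase(acc) * symbol_rate / (8 * math.pi)


rng = random.Random(1)
symbol_rate = 11e9   # 11 Gbaud, as in the evaluation system
true_fo = 200e6      # hypothetical offset for the demonstration (Hz)
pattern = qpsk_symbols(100, rng)
burst = [0j] * 17 + pattern + pattern + qpsk_symbols(50, rng)
rx = [s * cmath.exp(2j * math.pi * true_fo * n / symbol_rate)
      for n, s in enumerate(burst)]

start = coarse_sync(rx, 100)
fo_hat = fo_estimate_4th_power(rx[start:start + 200], symbol_rate)
```

The auto-correlation magnitude is insensitive to the FO, which is why the coarse synchronization can precede the FO estimation; the 4th-power estimate, in turn, only needs the symbols to be QPSK and not their actual values.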

The initialization stage is relatively easy to design, since the processing speed requirement is not high as it only needs to be done once for the entire transmission. In contrast, the tracking stage is much more challenging to design, as mentioned earlier. The drifting effects in a typical optical transport system include clock jitter, polarization rotation, polarization mode dispersion (PMD), laser frequency and laser phase noise. The clock jitter is tracked using the square and filter method [14]. For the other effects, we propose the pilot symbol (PS)-aided scheme shown in Fig. 2(a). The structure of one transmitted frame is illustrated in Fig. 2(b). In each frame, we send 8 PS’s for 800 data symbols, with 4 consecutive PS’s inserted after each 400 data symbols, resulting in a 1% overhead. The 8 PS’s contain a pair of special PS’s, $e^{j\pi/4}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$, for polarization rotation tracking, and 6 random QPSK symbols for CMA-based channel tracking (e.g., PMD).
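The frame layout described above can be made concrete with a short Python sketch. It assumes that the rows of the special PS matrix correspond to the X and Y polarizations and the columns to consecutive symbol slots, and that the special pair occupies the first two slots of the first pilot group; neither detail is stated in the text, and the helper names are hypothetical.

```python
import cmath
import math
import random

DATA_PER_HALF = 400   # data symbols before each pilot group
PS_PER_GROUP = 4      # consecutive pilot symbols per group


def qpsk(rng):
    """Draw one unit-power QPSK symbol."""
    return cmath.exp(1j * (math.pi / 4 + rng.randrange(4) * math.pi / 2))


def build_frame(data_x, data_y, rng):
    """Interleave 800 data symbols per polarization with 8 pilot symbols."""
    rot = cmath.exp(1j * math.pi / 4)
    # Special pair e^{j*pi/4} * [1 1; 1 -1]: row 0 is X, row 1 is Y (assumed).
    special_x = [rot * 1, rot * 1]
    special_y = [rot * 1, rot * -1]
    pilots_x = special_x + [qpsk(rng) for _ in range(6)]
    pilots_y = special_y + [qpsk(rng) for _ in range(6)]
    frame_x, frame_y = [], []
    for g in range(2):
        d = slice(g * DATA_PER_HALF, (g + 1) * DATA_PER_HALF)
        p = slice(g * PS_PER_GROUP, (g + 1) * PS_PER_GROUP)
        frame_x += data_x[d] + pilots_x[p]
        frame_y += data_y[d] + pilots_y[p]
    return frame_x, frame_y


rng = random.Random(0)
data_x = [qpsk(rng) for _ in range(800)]
data_y = [qpsk(rng) for _ in range(800)]
frame_x, frame_y = build_frame(data_x, data_y, rng)
overhead = (len(frame_x) - len(data_x)) / len(data_x)   # 8 / 800
```

Each frame therefore spans 808 symbols per polarization. The two special PS's are orthogonal across the polarizations, so the entries of the rotation matrix can be separated by simple sums and differences of the received pair.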

The PS’s aided polarization tracking algorithm is modified from an algorithm proposed in our earlier work [15]. The polarization rotation matrix can be modeled as:

where *θ* is the rotation angle. With the received special PS’s ${r}_{x/y}[n]$, the absolute value of *b* and the sum of the angles of *a* and *b* can be calculated as:

where *k* is the frame index and *g* is a weighting factor which is used to balance the tracking speed and noise tolerance. The obtained rotation angle can be integrated into the coefficients of the butterfly filter for the polarization recovery. The remaining 6 QPSK PS’s are used to track the drifting of other linear effects such as PMD, CD and filtering effects, which are much slower and have less influence on the performance than the polarization rotation. CMA can be applied since the PS’s are in QPSK format.
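To illustrate why QPSK pilots suit CMA, the following Python sketch performs one CMA update of a butterfly filter on a single pair of received pilot symbols. It is a simplified single-tap version under assumed parameters (unit target modulus, hypothetical step size `mu`); a practical receiver would use multi-tap filters, and the function name is not taken from the paper.

```python
def cma_update(h, x_in, y_in, mu=0.01):
    """One CMA step for a single-tap 2x2 butterfly filter.

    h is [[hxx, hxy], [hyx, hyy]]; the target modulus is 1 (unit-power QPSK).
    Returns the updated taps and the two equalized outputs.
    """
    out_x = h[0][0] * x_in + h[0][1] * y_in
    out_y = h[1][0] * x_in + h[1][1] * y_in
    # CMA error: deviation of the output power from the constant modulus.
    err_x = 1.0 - abs(out_x) ** 2
    err_y = 1.0 - abs(out_y) ** 2
    new_h = [
        [h[0][0] + mu * err_x * out_x * x_in.conjugate(),
         h[0][1] + mu * err_x * out_x * y_in.conjugate()],
        [h[1][0] + mu * err_y * out_y * x_in.conjugate(),
         h[1][1] + mu * err_y * out_y * y_in.conjugate()],
    ]
    return new_h, (out_x, out_y)


identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
# An X-polarization sample with twice the target amplitude and no Y signal.
taps, (out_x, out_y) = cma_update(identity, 2 + 0j, 0j)
```

Because the error depends only on the output modulus, the update needs neither the pilot values nor the carrier phase, which is consistent with decoupling the channel tracking from the CPR.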

The computational complexity of the polarization rotation tracking is calculated as follows. Equation (2) involves 4 real multipliers, 4 real adders and 1 square root. Equation (3) requires 8 real multipliers, 6 real adders and 1 arg(∙). In Eq. (4), we need 1 cos(∙), 1 real multiplier and 1 asin(∙). Finally, in Eq. (5), 1 real multiplier and 1 adder are used. Therefore, if the square root, arg(∙), cos(∙) and asin(∙) are all implemented using look-up tables (LUTs), the polarization tracking requires in total 14 real multipliers, 11 real adders and 4 LUTs for each frame. The complexity per symbol is the above complexity divided by the number of symbols in one frame, which is 800 symbols in our case. We can see that the computational complexity and power consumption per symbol are almost negligible, while the hardware complexity is reasonably low. Similarly, for the subsequent CMA operation, the computational complexity should be scaled by 6/800 in our case, since only 6 of every 800 symbols are processed, leading to negligible power consumption per symbol as well.
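The tally above can be checked with a few lines of Python. The per-equation counts are those quoted in the text; the dictionary layout and variable names are only an illustrative choice.

```python
# Per-frame operation counts quoted in the text for Eqs. (2)-(5).
ops = {
    "Eq. (2)": {"mult": 4, "add": 4, "lut": 1},   # LUT: square root
    "Eq. (3)": {"mult": 8, "add": 6, "lut": 1},   # LUT: arg
    "Eq. (4)": {"mult": 1, "add": 0, "lut": 2},   # LUTs: cos and asin
    "Eq. (5)": {"mult": 1, "add": 1, "lut": 0},
}
total = {k: sum(eq[k] for eq in ops.values()) for k in ("mult", "add", "lut")}

symbols_per_frame = 800
per_symbol = {k: v / symbols_per_frame for k, v in total.items()}
cma_duty = 6 / symbols_per_frame   # fraction of symbols processed by CMA
```

With these numbers, the polarization tracking costs fewer than 0.02 real multiplications per symbol, and the pilot-based CMA runs on 0.75% of the symbols.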

Compared to the conventional decision-directed LMS algorithm, in addition to the significantly reduced computational complexity, the proposed PS-aided polarization tracking and CMA scheme has two extra advantages: 1) the channel tracking is decoupled from the CPR, which helps to ease the design of the receiver DSP and improves the system tolerance to laser phase noise; 2) the performance is not deteriorated by the parallelization. In contrast, the conventional LMS + PLL might not be practically useful with a large parallelization degree.

After channel equalization, the 8 PS’s are reused for carrier recovery including FO tracking and CPR. For the FO tracking, the average phase of each group of 4 consecutive PS’s is calculated and compared to that of the previous 4 PS’s to obtain one phase difference. In this work, we average over 40 such phase differences (one block) to obtain an accurate FO and apply it to the symbols in the next block. By repeating this process the FO tracking is realized. Similar to the polarization tracking, the computational complexity of the FO tracking is negligible when averaged over all symbols, since the calculation is only performed on the PS’s. For the CPR, the PS-aided superscalar parallelization based PLL combined with the maximum-likelihood (ML) algorithm, denoted as SSP-PLL + ML, is employed, which has been demonstrated to tolerate very high laser linewidths with a reasonably low complexity for arbitrary modulation formats in our earlier work [13]. The linewidth tolerance and complexity of SSP-PLL + ML have been thoroughly investigated for QPSK, 16QAM and 64QAM in [13], so in the next section we focus on the evaluation of the polarization rotation and FO tracking speed, as well as the steady state performance. The cycle slips are corrected by the PS’s and thus differential coding can be removed in the proposed scheme.
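The FO tracking step can be sketched as follows. The sketch assumes that consecutive pilot groups are 404 symbols apart (400 data symbols plus 4 PS's), that the transmitted PS's are known at the receiver so their modulation can be removed, and that the signal is noise-free; the function names and the 2 MHz residual offset are hypothetical.

```python
import cmath
import math

GROUP_SPACING = 404   # symbols between pilot groups: 400 data + 4 PS (assumed)


def group_phase(rx_pilots, tx_pilots):
    """Average phase of one group of PS's after removing the known modulation."""
    return cmath.phase(sum(r * t.conjugate()
                           for r, t in zip(rx_pilots, tx_pilots)))


def fo_from_pilots(group_phases, symbol_rate, block=40):
    """Average `block` successive phase differences and convert them to Hz."""
    diffs = []
    for prev, cur in zip(group_phases[:block], group_phases[1:block + 1]):
        # Wrap each difference into [-pi, pi] before averaging.
        diffs.append(cmath.phase(cmath.exp(1j * (cur - prev))))
    mean_diff = sum(diffs) / len(diffs)
    return mean_diff * symbol_rate / (2 * math.pi * GROUP_SPACING)


symbol_rate = 11e9
true_fo = 2e6   # hypothetical residual FO (Hz)
pilot = cmath.exp(1j * math.pi / 4)
phases = []
for g in range(41):   # 41 groups give 40 phase differences
    rx = [pilot * cmath.exp(2j * math.pi * true_fo
                            * (g * GROUP_SPACING + i) / symbol_rate)
          for i in range(4)]
    phases.append(group_phase(rx, [pilot] * 4))
fo_hat = fo_from_pilots(phases, symbol_rate)
```

Under these assumptions the phase difference is unambiguous only for a residual FO below `symbol_rate / (2 * GROUP_SPACING)`, about 13.6 MHz at 11 Gbaud, which is why a separate initial FO estimate is obtained in the initialization stage.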

#### 2.2 Performance evaluation in 64QAM systems

The performance of the proposed DSP scheme is evaluated in an 11 Gbaud DP-64QAM system. Figures 3(a) and 3(b) show the numerical investigation of the tracking speed for the FO and polarization rotation drifting, respectively, in the back-to-back configuration. The combined laser linewidth was set to 100 kHz, the initial rotation angle was 40 degrees and no initial FO was added. 2^{19} symbols were sent for evaluation. As shown in Fig. 3(a), the FO drifting tolerance can reach up to 1 MHz/μs with a negligible performance loss, which is faster than the typical frequency variation rate of an external cavity laser (ECL) [16]. If needed, the FO tracking speed can be further increased by optimizing the length of the phase difference averaging block.
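To put the 1 MHz/μs figure in perspective, the short calculation below estimates how long one FO averaging block lasts at 11 Gbaud and how far the FO can drift during it. It assumes two phase differences per 808-symbol frame (one per group of 4 PS's) and 40 differences per block, which is an interpretation of the scheme rather than a stated parameter.

```python
symbol_rate = 11e9            # 11 Gbaud
frame_len = 808               # 800 data symbols + 8 PS's
diffs_per_frame = 2           # assumed: one phase difference per group of 4 PS's
block_diffs = 40              # phase differences averaged per block
drift_rate = 1e6 / 1e-6       # 1 MHz/us expressed in Hz/s

frames_per_block = block_diffs / diffs_per_frame
block_duration = frames_per_block * frame_len / symbol_rate   # seconds
fo_change_per_block = drift_rate * block_duration             # Hz
```

Under these assumptions a block lasts about 1.47 μs, so at the maximum tolerated drift rate the FO estimate applied to the next block is stale by roughly 1.5 MHz. A shorter block reduces this lag at the cost of a noisier estimate.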

The polarization tracking speed comparison is shown in Fig. 3(b). The serially implemented LMS (with PLL) can tolerate > 100 krad/s, and a performance degradation is observed at all polarization angular frequencies with differential coding (denoted as D. C.), which is necessary in a system without PS’s. It is known that the tracking speed of LMS will be significantly reduced with a large feedback delay in the parallel implementation [8]. As an example, we show in Fig. 3(b) that the tolerance of LMS is reduced by a factor of 10 with a feedback delay of 10 symbols. With PS-aided CMA only, the tolerance is reduced to 3 krad/s because the filter adaptation frequency is too low. By adding the proposed polarization tracking scheme, we can significantly improve the tolerance depending on the weighting factor. In particular, the tracking speed is increased to 10 krad/s, 25 krad/s and 50 krad/s with *g* = 0.05, 0.1 and 0.2, respectively. In practice, *g* should be as small as possible for system stability, since a divergence of the polarization tracking will cause the transmission to fail. It is noteworthy that the tracking speed of the proposed system can be improved by increasing the overhead of the PS’s. Therefore, for a specific transmission system the symbol frame should be designed while taking into account the overhead and the tracking speed requirement.
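The trade-off controlled by the weighting factor can be illustrated with a generic first-order tracking loop that is updated once per frame. The update rule `est += g * (measured - est)` is an assumption chosen for illustration and is not claimed to be the exact recursion used in the proposed scheme; the measurement is also taken to be noise-free.

```python
def steady_state_lag(g, rate_rad_per_s, frame_duration, n_frames=5000):
    """Track a linearly drifting angle with est += g * (measured - est)."""
    step = rate_rad_per_s * frame_duration   # rotation per frame (rad)
    angle, est = 0.0, 0.0
    for _ in range(n_frames):
        angle += step
        est += g * (angle - est)
    return angle - est


frame_duration = 808 / 11e9   # one 808-symbol frame at 11 Gbaud (s)
step = 50e3 * frame_duration  # rotation per frame at 50 krad/s
lags = {g: steady_state_lag(g, 50e3, frame_duration) for g in (0.05, 0.1, 0.2)}
```

For such a loop the residual angle error under a constant rotation rate settles at `(1 - g) * step / g`, so a larger *g* follows a fast rotation more closely, while a smaller *g* averages over more frames and is less sensitive to noise on each measurement.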

The experimental comparison between the proposed tracking scheme and LMS + PLL was also conducted. The experimental setup was almost the same as that of the Tb/s bandwidth-adaptive transmission experiment, which is introduced in the next section, except that we used only one laser as the light source at the transmitter. The 64QAM signal was uploaded to the DACs with one sample per symbol at 11 GSa/s, leading to a non-return-to-zero (NRZ) pulse shape. From the comparison results in Fig. 4, we can see that the proposed scheme has the same performance as the LMS + PLL without differential coding and feedback delay. For the back-to-back performance in Fig. 4(a), the simulation results without implementation noise demonstrate an improvement of 0.4 dB at bit error rate (BER) = 3.8 × 10^{−3} (HD FEC) and 0.9 dB at BER = 2.7 × 10^{−2} (SD FEC) for the proposed scheme compared to the LMS + PLL with differential coding. In experiments, the improvement is increased to 1.5 dB and 1.6 dB, respectively, due to the reduced slope of the curves caused by implementation noise. For the transmission performance in Fig. 4(b), we observe a distance increase of up to 140 km when comparing the proposed scheme to the LMS + PLL with differential coding. We also investigate the performance of LMS + PLL with a feedback delay of 10 symbols, which reduces the distance by 240 km at BER = 2.7 × 10^{−2}, mainly due to the reduced laser linewidth tolerance. Finally, we compare the performance with different amounts of nonlinear noise in Fig. 4(c). It is observed that the proposed scheme performs similarly well in both the linear and nonlinear regimes.

## 3. Tb/s bandwidth-adaptive transmission

Flexibility and Tb/s data rates are the targets of the next generation transceiver. In this section we present and discuss a system-level study of superchannel based Tb/s transmission with flexibility in modulation format and bandwidth. All the experiments employed the proposed low-complexity DSP scheme, demonstrating its superior performance for various modulation formats.

#### 3.1 Experiment setup

Figure 5 depicts the schematic of the experimental setup. The transmitted symbols were shaped by a root raised cosine (RRC) filter with a roll-off factor *α* = 0.12 to increase the spectral efficiency, and a pre-emphasis was applied to compensate for the frequency response of the transmitter. Then the in-phase and quadrature parts of the obtained waveform were uploaded to the memory of two field-programmable gate arrays (FPGAs), respectively, which drove two DACs to generate the analog signals. Ten ECLs were combined and bulk modulated by the electrical signals through an IQ modulator. DP signals were formed using a DP emulator with one path delayed for de-correlation. The delay length was set to 808 symbols in order to align the PS’s. The re-circulating loop consisted of 3 spans of 80 km single mode fiber (SMF-28e+) and 3 Erbium-doped fiber amplifiers (EDFAs). A Finisar waveshaper was employed as a gain-flattening filter (GFF). The output of the loop was coherently detected without pre-amplification and filtering. Another ECL was used as the local oscillator (LO). The linewidths of the ECLs were below 100 kHz. One real-time scope operating at 80 GSa/s was used to digitize the electrical signals for offline processing.

The sampling rates of the DACs were fixed at 34 GSa/s. The RRC signals were first generated in Matlab with two samples per symbol, and then re-sampled to achieve the desired symbol rate, which was 30 Gbaud, 20 Gbaud and 15 Gbaud for QPSK, 8QAM and 16QAM, respectively. Therefore, the raw data rate was 1.2 Tb/s for all modulation formats. Note that the spacing of the ECLs was accordingly tuned to 33 GHz, 22 GHz and 16.5 GHz for the three cases, respectively, resulting in a 10% guard band. The optical spectra of the three types of signals at the output of the transmitter are plotted in Fig. 6. Because the total power is similar, the power per wavelength bin increases as the bandwidth reduces for higher order QAM.
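The figures quoted above can be verified with a short calculation. The values are taken from the text; the variable names and the definition of the guard band as the spacing relative to the symbol rate minus one are illustrative assumptions.

```python
DAC_RATE = 34.0          # GSa/s
N_SUBCHANNELS = 10
N_POL = 2                # dual polarization

# format: (bits per symbol, symbol rate in Gbaud, ECL spacing in GHz)
formats = {
    "QPSK": (2, 30.0, 33.0),
    "8QAM": (3, 20.0, 22.0),
    "16QAM": (4, 15.0, 16.5),
}

summary = {}
for name, (bits, baud, spacing) in formats.items():
    summary[name] = {
        "raw_rate_gbps": N_SUBCHANNELS * N_POL * bits * baud,
        "guard_band": spacing / baud - 1.0,
        "samples_per_symbol": DAC_RATE / baud,
    }
```

All three configurations give a raw rate of 1200 Gb/s and a 10% guard band, while the number of DAC samples per symbol grows from about 1.13 for QPSK to about 2.27 for 16QAM.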

#### 3.2 Comparison of different bandwidth-adaptive schemes

Spectrum allocation can be realized by turning on/off sub-channels as demonstrated in [5]. However, this approach is not appropriate when high speed DACs are employed. This is because the effective number of bits (ENOB) of those DACs might drop significantly at high frequencies, resulting in a much larger implementation noise for a higher symbol rate signal than for a lower symbol rate signal. In addition, other impairments such as the hardware response also deteriorate the high frequency components of a signal. Therefore, in this scenario the implementation penalty can be significantly reduced with a lower symbol rate, which is very important for high order QAM.

In our experiments, as described earlier, the spectrum bandwidth was adjusted by digitally adjusting the symbol rate of each sub-channel while keeping all of them on for transmission. Such a scheme is capable of optimizing the performance of each modulation format, and it is inherently compatible with fixed data rate transmission, which requires a lower symbol rate for higher order QAM. To illustrate the performance improvement, we also transmitted 6 sub-channels with 30 Gbaud 8QAM (1.08 Tb/s). The transmission performance is compared in Fig. 7. Clearly, the 10 sub-channel 20 Gbaud system performs much better than the 6 sub-channel 30 Gbaud system, with a 270% reach improvement at the 20% FEC threshold. In addition to the reduced ENOB at high frequencies, the performance loss of the 30 Gbaud signals also comes from the fact that the small oversampling factor of 1.13 ( = 34/30) induces non-negligible spectral images on both sides of the signal, since the 3 dB bandwidth of our DACs is around 20 GHz, leading to extra inter-sub-channel interference.
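The image argument can be made quantitative with a simple estimate. Assuming an ideal RRC spectrum that extends to (1 + α)/2 times the symbol rate on each side of baseband, the first DAC image begins at the sampling rate minus that band edge. The roll-off, sampling rate and approximate 3 dB bandwidth are taken from the text; the function name is hypothetical.

```python
DAC_RATE = 34.0      # GSa/s
ROLL_OFF = 0.12      # RRC roll-off factor from the setup description
DAC_BW_3DB = 20.0    # GHz, approximate value quoted in the text


def image_edges(baud):
    """Return (signal band edge, lowest frequency of the first image) in GHz."""
    band_edge = baud * (1 + ROLL_OFF) / 2
    return band_edge, DAC_RATE - band_edge


edges = {baud: image_edges(baud) for baud in (30.0, 20.0, 15.0)}
inside_dac_bw = {baud: lo < DAC_BW_3DB for baud, (_, lo) in edges.items()}
```

At 30 Gbaud the signal extends to 16.8 GHz and its image starts at 17.2 GHz, inside the DAC's 3 dB bandwidth, so the image is only weakly attenuated. At 20 Gbaud and 15 Gbaud the images start at 22.8 GHz and 25.6 GHz, respectively, beyond the 3 dB bandwidth.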

Although the performance difference might be smaller with more advanced DACs and better image rejection in real systems, our comparison implies that when adjusting the bandwidth of a superchannel system, it is preferable to keep all sub-channels running and change the symbol rate of each of them in order to minimize the implementation penalty. In addition to the performance improvement, adjusting the bandwidth in the digital domain also allows a very small granularity, which will be needed in future flexgrids with 12.5 GHz slot granularity. In contrast, turning on/off sub-channels can only achieve a granularity equal to the bandwidth of each sub-channel. Note that the proposed scheme requires fine tuning of the laser wavelength spacing, which is feasible with commercial ECLs since they have very small frequency tuning resolutions. If a comb generator is employed as the multi-tone source, the spacing can be changed by varying the clock frequency. On the other hand, the energy consumption of a system with more sub-channels (lower symbol rates) might be higher than that of a system with fewer sub-channels (higher symbol rates), which should be taken into account in the system design.

#### 3.3 Back-to-back and transmission performance

Figure 8 shows the BER of the fifth sub-channel in the back-to-back configuration for all modulation formats. The OSNR (0.1 nm) penalty with respect to the theoretical limit at BER = 3.8 × 10^{−3} is 4.2, 4.6 and 5.3 dB for QPSK, 8QAM and 16QAM, respectively, while at BER = 2.7 × 10^{−2} it is 2.7, 2.3 and 2.4 dB, respectively. For QPSK, the implementation noise is quite large due to the low ENOB at high frequencies and the extra inter-sub-channel interference caused by the images, as explained in the previous section. However, because QPSK is very tolerant to noise, it still achieves reasonable back-to-back performance. This is also reflected in the achieved transmission distance shown later. Although the performance of QPSK can be improved by reducing the symbol rate to 15 Gbaud, the required number of DACs/ADCs and other components to achieve the same aggregate data rate would be doubled. In Fig. 8, the 16QAM performance with a serially implemented LMS + PLL tracking scheme is also plotted for comparison; this scheme includes no PS’s but requires differential coding to address cycle slips. It can be seen that the proposed DSP scheme reduces the required OSNR by 1.8 dB and 1.5 dB at BER = 3.8 × 10^{−3} and BER = 2.7 × 10^{−2}, respectively. The improvement is mainly attributed to the removal of differential coding, at the cost of only 1% overhead.

It is known that the optimal launch power is the same for different modulation formats with a fixed symbol rate [17], but that it increases as the symbol rate gets larger [18], which is confirmed in our measurement. Figure 9(a) shows the BER as a function of the launch power for QPSK, 8QAM and 16QAM signals at distances of 2400 km, 720 km and 240 km, respectively. The optimal launch powers were 8 dBm, 7 dBm and 5 dBm for the three modulation formats, respectively, and they were used in the following transmission performance evaluation. Figure 9(b) shows the achieved transmission distances of the different modulation formats, where the BERs of all sub-channels are below the FEC thresholds. Various distances ranging from 240 km to 6240 km are realized for different formats and different FEC coding schemes, demonstrating the flexibility and dynamic range of our system. At BER = 2.7 × 10^{−2}, in spite of the large implementation noise, QPSK still achieves an ultra long-haul distance of 6240 km. With reduced implementation noise at lower symbol rates, 8QAM and 16QAM can be transmitted over 2880 km and 1680 km, respectively.

In our experiments all sub-channels carry the same data, which might lead to an enhancement of fiber nonlinearities [19]. Therefore, we carried out extensive simulations comparing the system performance with the same data or different data on the sub-channels. The simulation results, which are not included in the paper, show that with the configuration of our experiments the superchannel system with identical data only slightly underestimates the performance, thanks to the CD-induced sub-channel decorrelation and the relatively large symbol rates. It should be noted that with 20% FEC overhead, the net data rate will be slightly below 1 Tb/s, since other overheads such as the PS’s (1%) and the transport overhead (≈5%) should also be subtracted from the data rate. The constellations of all modulation formats before and after transmission over fiber are plotted in Fig. 10. Their power is normalized to be identical and the back-to-back noise difference can be clearly observed. It is worth mentioning that it is quite easy to change the modulation format in our setup. To be specific, on the DSP side only the decision functions were switched, while on the hardware side only the laser wavelength spacing was tuned.

#### 3.4 Performance of colorless detection

Colorless reception is especially important for cost savings in flexible transmissions, since the spectrum bandwidth is dynamic, which might otherwise require bandwidth-tunable filters for each sub-channel. In Fig. 11, we investigate the performance of the systems with and without filters. The 3 dB filter bandwidth is 0.4 nm, 0.3 nm and 0.2 nm for QPSK, 8QAM and 16QAM, respectively. For all modulation formats, the performance difference between the two cases is negligible, especially for longer distances with high BERs, meaning that the filters are not required in our systems. Such colorless detection is achievable in commercial products, since colorless integrated coherent receivers have been demonstrated [20].
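For comparison with the sub-channel spacings, the filter bandwidths can be converted from nm to GHz using Δf ≈ cΔλ/λ². The operating wavelength is not stated in the text, so the 1550 nm value below is an assumption, as are the variable names.

```python
C = 299_792_458.0          # speed of light (m/s)
WAVELENGTH = 1550e-9       # assumed operating wavelength (m); not stated in the text


def nm_to_ghz(delta_nm):
    """Convert a small optical bandwidth from nm to GHz around WAVELENGTH."""
    return C * (delta_nm * 1e-9) / WAVELENGTH ** 2 / 1e9


filters_ghz = {"QPSK": nm_to_ghz(0.4), "8QAM": nm_to_ghz(0.3), "16QAM": nm_to_ghz(0.2)}
spacing_ghz = {"QPSK": 33.0, "8QAM": 22.0, "16QAM": 16.5}
ratio = {k: filters_ghz[k] / spacing_ghz[k] for k in filters_ghz}
```

Under this assumption the filters are roughly 50 GHz, 37 GHz and 25 GHz wide, that is, about 1.5 to 1.7 times the corresponding sub-channel spacing.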

## 4. Conclusion

We reported an experimental demonstration of Tb/s bandwidth-adaptive transmissions. High speed DACs were employed for easy switching of modulation formats, and the bandwidth adjustment was realized by changing the oversampling ratio of the digital signal. In addition, we proposed a pilot symbol-aided digital signal processing (DSP) scheme, which is format-transparent, low-complexity and compatible with parallel implementation. This kind of DSP scheme is essential for next generation flexible and energy-efficient transceivers. With the proposed DSP scheme, we demonstrated the superchannel transmission of QPSK, 8QAM and 16QAM with a fixed raw data rate of 1.2 Tb/s. The transmission distance ranged from 240 km to 6240 km for different formats and FEC thresholds. Colorless detection was also employed for the purpose of cost reduction.

## References and links

**1.** K. Roberts, D. Beckett, D. Boertjes, J. Berthold, and C. Laperle, “100G and beyond with digital coherent signal processing,” IEEE Commun. Mag. **48**(7), 62–69 (2010).

**2.** Ciena Press Releases, “BT and Ciena Light World’s First 800G Super-Channel,” (2013), http://www.ciena.com/about/newsroom/press-releases/BT-and-Ciena-Light-Worlds-First-800G-Super-Channel.html?campaign=X379513&src=blog

**3.** S. Chandrasekhar and X. Liu, “Experimental investigation on the performance of closely spaced multi-carrier PDM-QPSK with digital coherent detection,” Opt. Express **17**(24), 21350–21361 (2009).

**4.** O. Gerstel, M. Jinno, A. Lord, and S. J. B. Yoo, “Elastic optical networking: a new dawn for the optical layer?” IEEE Commun. Mag. **50**(2), s12–s20 (2012).

**5.** Y. Huang, E. Ip, P. N. Ji, Y. Shao, T. Wang, Y. Aono, Y. Yano, and T. Tajima, “Terabit/s optical superchannel with flexible modulation format for dynamic distance/route transmission,” in Proc. OFC'12, Paper OM3H.4 (2012).

**6.** K. Roberts and C. Laperle, “Flexible transceivers,” in Proc. ECOC'12, Paper We.3.A.3 (2012).

**7.** Q. Zhuge, M. Morsy-Osman, X. Xu, M. Chagnon, M. Qiu, and D. V. Plant, “Spectral efficiency-adaptive optical transmission using time domain hybrid QAM for agile optical networks,” J. Lightwave Technol. **31**(15), 2621–2628 (2013).

**8.** M. Kuschnerov, M. Chouayakh, K. Piyawanno, B. Spinnler, E. de Man, P. Kainzmaier, M. S. Alfiad, A. Napoli, and B. Lankl, “Data-aided versus blind single-carrier coherent receivers,” IEEE Photonics Journal **2**(3), 387–403 (2010).

**9.** T. Pfau, S. Hoffmann, and R. Noe, “Hardware-efficient coherent digital receiver concept with feedforward carrier recovery for M-QAM constellations,” J. Lightwave Technol. **27**(8), 989–999 (2009).

**10.** Q. Zhuge, M. Morsy-Osman, M. Chagnon, X. Xu, M. Qiu, and D. V. Plant, “Demonstration of energy-efficient and format-transparent digital signal processing for Tb/s flexible transceiver,” in Proc. ACP'13, Paper AF2E.7 (2013).

**11.** A. H. Gnauck, P. J. Winzer, A. Konczykowska, F. Jorge, J. Dupuy, M. Riet, G. Charlet, B. Zhu, and D. W. Peckham, “Generation and transmission of 21.4-Gbaud PDM 64-QAM using a novel high-power DAC driving a single I/Q modulator,” J. Lightwave Technol. **30**(4), 532–536 (2012).

**12.** K. Piyawanno, M. Kuschnerov, B. Spinnler, and B. Lankl, “Low complexity carrier recovery for coherent QAM using superscalar parallelization,” in Proc. ECOC'10, Paper We.7.A.3 (2010).

**13.** Q. Zhuge, M. Morsy-Osman, X. Xu, M. E. Mousa-Pasandi, M. Chagnon, Z. A. El-Sahn, and D. V. Plant, “Pilot-aided carrier phase recovery for M-QAM using superscalar parallelization based PLL,” Opt. Express **20**(17), 19599–19609 (2012).

**14.** M. Oerder and H. Meyr, “Digital filter and square timing recovery,” IEEE Trans. Commun. **36**(5), 605–612 (1988).

**15.** M. Morsy-Osman, M. Chagnon, Q. Zhuge, X. Xu, M. E. Mousa-Pasandi, Z. A. El-Sahn, and D. V. Plant, “Ultrafast and low overhead training symbol based channel estimation in coherent M-QAM single-carrier transmission systems,” Opt. Express **20**(26), B171–B180 (2012).

**16.** S.-H. Fan, J. Yu, D. Qian, and G.-K. Chang, “A fast and efficient frequency offset correction technique for coherent optical orthogonal frequency division multiplexing,” J. Lightwave Technol. **29**(13), 1997–2004 (2011).

**17.** G. Bosco, V. Curri, A. Carena, P. Poggiolini, and F. Forghieri, “On the Performance of Nyquist-WDM Terabit Superchannels Based on PM-BPSK, PM-QPSK, PM-8QAM or PM-16QAM Subcarriers,” J. Lightwave Technol. **29**(1), 53–61 (2011).

**18.** P. Poggiolini, G. Bosco, A. Carena, V. Curri, V. Miot, and F. Forghieri, “Performance Dependence on Channel Baud-Rate of PM-QPSK Systems Over Uncompensated Links,” IEEE Photon. Technol. Lett. **23**(1), 15–17 (2011).

**19.** L. B. Du and A. J. Lowery, “The validity of “Odd and Even” channels for testing all-optical OFDM and Nyquist WDM long-haul fiber systems,” Opt. Express **20**(26), B445–B451 (2012).

**20.** M. Morsy-Osman, M. Chagnon, X. Xu, Q. Zhuge, M. Poulin, Y. Painchaud, M. Pelletier, C. Paquet, and D. V. Plant, “Colorless and preamplifierless reception using an integrated Si-photonic coherent receiver,” IEEE Photon. Technol. Lett. **25**(11), 1027–1030 (2013).