Abstract
The blind phase search (BPS) algorithm for M-QAM has excellent tolerance to laser linewidth at the expense of rather high computational complexity (CC). Here, we first theoretically obtain the quadratic relationship between the test angle and the corresponding distance metric during the BPS implementation. Afterwards, we propose a carrier phase estimation (CPE) scheme based on a two-stage BPS with quadratic approximation (QA). Instead of searching the phase blindly with a fixed step size as in the BPS algorithm, QA can significantly accelerate the phase search. As a result, the CC is reduced by group factors of 2.96/3.05, 4.55/4.67 and 2.27/2.3 (in the form of multipliers/adders) for 16QAM, 64QAM and 256QAM, respectively, in comparison with the traditional BPS scheme. Meanwhile, a guideline for determining the summing filter block length is put forward during performance optimization. Under the condition of optimum filter block length, our proposed scheme shows performance similar to that of the traditional BPS scheme. At 1 dB required $E_S/N_0$ penalty @ BER = $10^{-2}$, our proposed CPE scheme can tolerate a linewidth times symbol duration product of $1.7 \times 10^{-4}$, $6 \times 10^{-5}$ and $1.5 \times 10^{-5}$ for 16/64/256-QAM, respectively.
© 2015 Optical Society of America
1. Introduction
To satisfy the dramatic bandwidth requirements of fiber optical networks, the utilization of higher-order M-ary quadrature amplitude modulation (QAM) formats combined with coherent detection and digital signal processing (DSP) has been proposed to build future spectrally efficient high-capacity wavelength-division multiplexing (WDM) transmission systems [1–5]. However, the tolerance of higher-order M-QAM formats toward laser phase noise decreases dramatically due to their inherently shorter Euclidean distance [6]. Although narrow-linewidth lasers with linewidths of only several kHz can be used to maintain the performance, deploying a carrier phase estimation (CPE) algorithm in the receiver-side DSP flow is of great interest on account of implementation cost and complexity.
Since decision-directed feedback CPE approaches are not very efficient in terms of laser linewidth tolerance because of large feedback delays [6, 7], especially in practical parallel and pipelined architectures, feed-forward CPE algorithms have been extensively investigated [8–22]. Until now, the most popular feed-forward CPE algorithms are based on either the Mth Power approach or blind phase search (BPS). Since only a small portion of the current symbols can be used for phase estimation in high-order QAM formats, the Mth Power approach shows inherently poor tolerance to laser linewidth [9, 10]. On the other hand, the BPS algorithm, originally invented for synchronous communication systems [17, 18], has good tolerance to laser linewidth and is flexible with respect to various QAM formats [7]. Nevertheless, a practical problem associated with the BPS approach is its huge computational complexity (CC). Generally, the required number of test phase angles increases with the modulation order M. For instance, 64 test angles are generally required in order to achieve the best linewidth tolerance for a 64-QAM signal. In order to cut down the CC of the BPS algorithm, several multi-stage feed-forward CPE schemes based on BPS have been proposed to reduce the number of test phase angles. In [19, 20], the BPS algorithm with fewer test angles was employed as the first stage and either the maximum likelihood (ML) algorithm or the Mth Power algorithm was implemented in the second stage. With these works, the CC has been reduced by a factor ranging from 1.5 to 3 for 16/64-QAM. In addition, the BPS algorithm can be utilized in both stages for the coarse and fine phase estimations, respectively [21–23]. As a result, the computational effort has been further reduced by a factor ranging from 2 to 4 for 64-QAM.
In this paper, we propose a feed-forward CPE algorithm based on two-stage BPS with quadratic approximation (QA). Instead of searching the phase blindly with a fixed step size as the BPS algorithm does, the QA algorithm can significantly accelerate the phase search. As a result, our proposed CPE scheme achieves a significant reduction of CC compared with the traditional BPS scheme. In addition, our proposed CPE scheme is verified to have a similar tolerance to laser phase noise.
2. Operation principle
2.1 Quadratic relationship during BPS implementation
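Assuming ideal coherent detection, clock recovery, equalization, and laser frequency offset compensation, the received symbol-rate sample before the CPE in a typical digital optical coherent receiver can be modeled as:
$$r_k = s_k \exp(j\theta_k) + n_k \qquad (1)$$
where $s_k$ denotes the kth transmitted symbol drawn from a QAM constellation; $n_k$ stands for additive complex white Gaussian noise; and $\theta_k$ represents the phase noise, which remains unchanged within a proper time window. In the BPS algorithm, the received signal is first rotated by $B$ test phase angles:
$$\varphi_b = \frac{b}{B} \cdot \frac{\pi}{2}, \quad b \in \{0, 1, \ldots, B-1\} \qquad (2)$$
Then all rotated symbols are fed into a hard decision circuit and the squared distance to the closest constellation point is calculated in the complex plane:
$$|d_{k,b}|^2 = \left| r_k \exp(-j\varphi_b) - \hat{X}_{k,b} \right|^2 \qquad (3)$$
where $\hat{X}_{k,b}$ stands for the hard decision on $r_k \exp(-j\varphi_b)$ in accordance with the given QAM constellation. In order to filter out the noise, the distances of $N$ consecutive test symbols rotated by the same test phase angle are summed up and the distance metric is obtained as follows:
$$D_{k,b} = \sum_{n=-\lfloor N/2 \rfloor}^{\lceil N/2 \rceil - 1} |d_{k-n,b}|^2 \qquad (4)$$
where $\lfloor \cdot \rfloor$ denotes the flooring function and $\lceil \cdot \rceil$ denotes the ceiling function; $N$ is an integer and denotes the summing filter block length. Assuming that the hard decision is right, namely $\hat{X}_{k,b} = s_k$, we can substitute Eqs. (1) and (3) into Eq. (4). Writing $n'_k = n_k \exp(-j\varphi_b)$ for the rotated noise and neglecting the zero-mean cross terms between signal and noise, $D_{k,b}$ can be represented by:
$$D_{k,b} \approx 4 \sin^2\!\left( \frac{\theta_k - \varphi_b}{2} \right) \sum_{n} |s_{k-n}|^2 + \sum_{n} |n'_{k-n}|^2 \qquad (5)$$
As we can see from Eq. (5), $D_{k,b}$ shows its minimum value when the carrier phase error $\theta_k - \varphi_b$ is equal to zero, namely when the phase noise is totally compensated. Moreover, we can also conclude that $D_{k,b}$ shows an approximately quadratic relationship with $\sin[(\theta_k - \varphi_b)/2]$ or $\theta_k - \varphi_b$, especially when $\theta_k - \varphi_b$ is small. Since the sine function is monotonic within $[-\pi/2, \pi/2]$, we can also conclude that $D_{k,b}$ shows a quadratic relationship with $\varphi_b$. Figure 1 depicts the normalized distance metric as a function of the test angles, without phase noise loading ($\theta_k = 0$). As expected, when $\varphi_b = 0$, namely $\theta_k - \varphi_b = 0$, $D_{k,b}$ is minimized. In addition, there exists a quadratic relationship between $D_{k,b}$ and $\varphi_b$ within a specific range. Therefore, we can utilize the quadratic relationship to identify the minimum value of $D_{k,b}$, which indicates the optimum test phase angle for the kth sample.
2.2 Algorithm design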
Assume three points within the quadratic interval have been obtained, namely $(\varphi_1, D_1)$, $(\varphi_2, D_2)$ and $(\varphi_3, D_3)$, where $\varphi_1$, $\varphi_2$ and $\varphi_3$ are the three test phase angles and $D_1$, $D_2$ and $D_3$ are the corresponding distance metrics obtained by Eq. (5). Please note that those points are located on the real plane. Therefore, we can get
$$D_i = \alpha \varphi_i^2 + \beta \varphi_i + \gamma, \quad i = 1, 2, 3 \qquad (6)$$
Some intermediate parameters are defined in Eq. (7), from which the coefficients $\alpha$, $\beta$ and $\gamma$ in Eq. (6) are expressed by Eq. (8). From Eqs. (6)-(8), the minimum point can be obtained by:
$$\hat{\varphi}^{(1)} = -\frac{\beta}{2\alpha} \qquad (9)$$
where $\hat{\varphi}^{(1)}$ is the estimated phase after one QA iteration. Then we choose the minimum point and its two adjacent points from the three initial points and the newly obtained point, and the newly designated three points are utilized to obtain another estimated phase according to Eqs. (6)-(9). After $n$ iterations, a series of estimated phases is obtained: $\hat{\varphi}^{(1)}, \hat{\varphi}^{(2)}, \ldots, \hat{\varphi}^{(n)}$, where $n$ denotes the number of iterations. In practice, $n$ is determined by the stopping condition of Eq. (10), where $\varepsilon$ is the designated phase estimation error. At this moment, the iteration is completed and $\hat{\varphi}^{(n)}$ is the final estimated phase angle. Please note that the initially obtained points $(\varphi_1, D_1)$, $(\varphi_2, D_2)$ and $(\varphi_3, D_3)$ must satisfy the condition of Eq. (11) in order to yield a minimum point. If not, the QA algorithm cannot be applied. In that case, we suggest that the currently estimated phase be taken as that of the prior symbol. In this situation, no iteration is used and $n = 0$ is defined.
3. Implementation of proposed CPE scheme
Figure 2 shows the flowchart of the proposed CPE scheme. The simplified two-stage BPS is used to obtain the quadratic interval for the later QA implementation [21]. As shown in Fig. 2, $B_1$ test angles are first used to carry out a rough CPE by minimizing the distance metric, which yields the test angle $\varphi_{b_{min}}$. Meanwhile, the two test angles adjacent to $\varphi_{b_{min}}$ are also identified, defined by $\varphi_{b_{min}-1}$ and $\varphi_{b_{min}+1}$. Note that on the condition of $b_{min} = 0$, $\varphi_{b_{min}-1}$ is determined by $\varphi_{B_1-1} - \pi/2$. Similarly, on the condition of $b_{min} = B_1 - 1$, $\varphi_{b_{min}+1}$ is determined by $\varphi_0 + \pi/2$. Due to the rotational symmetry of QAM constellations, the obtained distance metrics remain unchanged. Obviously, the operations under these two special scenarios lead to the additional implementation of 1 comparator and 2 adders, in comparison with the traditional BPS scheme. In the second stage, we first choose two new test angles, one in the middle of $\varphi_{b_{min}-1}$ and $\varphi_{b_{min}}$, and the other in the middle of $\varphi_{b_{min}}$ and $\varphi_{b_{min}+1}$. Together with the above three test angles, those five test angles are employed to implement the BPS algorithm with a different summing window. Similarly, the rough estimate of the phase noise and the two test angles adjacent to it can be obtained, together with the corresponding distance metrics. For the QA implementation at the second stage, the minimum point is first obtained by Eqs. (6)-(9). Then the stopping condition of Eq. (10) is evaluated. In order to secure the performance, a small designated phase error is preferred. If the condition is satisfied, the final kth estimated phase is determined by the current QA estimate. If not, we choose another two points close to the current estimate from the existing points and repeat the above operations until the condition is met. Please note that during the second-stage BPS, though the total number of test angles is 5, 3 of the 5 test angles have already been utilized in the first-stage BPS and therefore make no contribution to the CC of the second-stage BPS. Therefore, the effective number of test angles needed to obtain the quadratic interval for the QA algorithm is $B_1 + 2$.
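The flow described above can be illustrated with a minimal, noise-free sketch. It is not the authors' implementation: the 16-QAM levels, the rotation sign convention, the function names (`decide`, `metric`, `vertex`, `estimate_phase`), the stopping rule `abs(new - est) <= eps` and the values of `b1`, `eps` and `max_iter` are all assumptions made for illustration, and the parabola vertex is computed with the standard three-point interpolation formula rather than through the intermediate parameters of Eqs. (7)-(8).

```python
import cmath
import math

LEVELS = (-3.0, -1.0, 1.0, 3.0)  # assumed unnormalised 16-QAM levels per quadrature


def decide(z):
    """Hard decision: nearest 16-QAM constellation point."""
    re = min(LEVELS, key=lambda v: abs(z.real - v))
    im = min(LEVELS, key=lambda v: abs(z.imag - v))
    return complex(re, im)


def metric(block, phi):
    """Summed squared distance after rotating every sample by -phi."""
    rot = cmath.exp(-1j * phi)
    return sum(abs(r * rot - decide(r * rot)) ** 2 for r in block)


def vertex(p1, p2, p3):
    """Abscissa of the extremum of the parabola through three (angle, metric) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    return x2 if den == 0 else x2 - 0.5 * num / den


def estimate_phase(block, b1=5, eps=1e-3, max_iter=2):
    step = (math.pi / 2) / b1
    # Stage 1: coarse BPS over b1 test angles in [0, pi/2).
    coarse = min((b * step for b in range(b1)), key=lambda p: metric(block, p))
    # Stage 2: coarse angle, its two neighbours and the two midpoints.
    # The metric is pi/2-periodic, so angles outside [0, pi/2) are valid.
    # (A real design would reuse the three stage-1 metrics instead of recomputing.)
    pts = [(coarse + k * step / 2, 0.0) for k in (-2, -1, 0, 1, 2)]
    pts = [(p, metric(block, p)) for p, _ in pts]
    # The coarse angle already beats both outer angles, so search indices 1..3.
    i = min(range(1, 4), key=lambda k: pts[k][1])
    tri = pts[i - 1:i + 2]
    est = tri[1][0]
    # QA iterations (assumed stopping rule: successive estimates differ by <= eps).
    for _ in range(max_iter):
        new = vertex(*tri)
        done = abs(new - est) <= eps
        est = new
        if done:
            break
        four = sorted(tri + [(new, metric(block, new))])
        j = min(range(4), key=lambda k: four[k][1])
        j = min(max(j, 1), 2)  # keep one neighbour on each side
        tri = four[j - 1:j + 2]
    return est


# Usage: all 16 constellation points rotated by a known phase of 0.3 rad.
block = [complex(i, q) * cmath.exp(1j * 0.3) for i in LEVELS for q in LEVELS]
print(estimate_phase(block))
```

With only 5 coarse test angles plus 2 midpoints, the QA step lands close to the true rotation in this noise-free case, which is the behaviour the effective count of $B_1 + 2$ test angles relies on.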
4. Simulation results and discussions
In order to evaluate the performance of the proposed CPE scheme, a coherent optical M-QAM single-polarization transmission system is used as the simulation platform. In our simulations, the M-QAM symbols are generated by combining de-correlated pseudo-random binary sequences (PRBSs) with a word length of $2^{17}-1$. Differential coding is used in order to avoid cycle slips [24]. The laser phase noise is modeled as a Wiener process with a variance of $\sigma^2 = 2\pi \Delta\nu \cdot T_S$, where $\Delta\nu$ denotes the combined linewidth of the transmitter-side and receiver-side lasers, $T_S$ is the symbol period and $\Delta\nu \cdot T_S$ represents the linewidth times symbol duration product [25]. Additive complex Gaussian noise is loaded to adjust the signal-to-noise ratio per symbol ($E_S/N_0$). During the implementation of the QA algorithm, the designated phase error $\varepsilon$ is kept fixed in order to achieve a trade-off between phase estimation accuracy and CC.
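As a small illustration of the phase noise model, the sketch below generates a Wiener process whose independent Gaussian increments have variance $2\pi \Delta\nu \cdot T_S$, the standard form of this model. The function name `wiener_phase_noise`, the seed handling and the example value of the linewidth times symbol duration product are assumptions for illustration only, not details of the simulation platform used here.

```python
import math
import random


def wiener_phase_noise(n_symbols, linewidth_ts, seed=0):
    """Return the accumulated phase theta_k (radians) for n_symbols symbols.

    linewidth_ts is the linewidth times symbol duration product; each
    increment is zero-mean Gaussian with variance 2*pi*linewidth_ts.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sigma = math.sqrt(2.0 * math.pi * linewidth_ts)
    theta = 0.0
    phases = []
    for _ in range(n_symbols):
        theta += rng.gauss(0.0, sigma)
        phases.append(theta)
    return phases


# Usage: compare the measured increment variance with the target variance.
phases = wiener_phase_noise(20000, 6e-5)
steps = [b - a for a, b in zip(phases, phases[1:])]
var = sum(s * s for s in steps) / len(steps)
print(var, 2.0 * math.pi * 6e-5)
```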
4.1 Optimization of the test angles number B1
Firstly, we investigate the relationship between the number of test angles $B_1$ and the performance of the 16/64/256-QAM formats for the proposed CPE scheme. Generally, a larger $B_1$ is preferred in order to obtain three initial points within the quadratic interval for the QA implementation. However, the CC also increases as $B_1$ grows. Figure 3 shows the $E_S/N_0$ penalty required to achieve BER = $10^{-2}$, compared with the case without phase noise. The BER benchmark is determined by the fact that the system can tolerate a 1 dB $E_S/N_0$ penalty due to phase noise without exceeding the forward error correction (FEC) threshold, which is assumed to be $2 \times 10^{-2}$, as granted by current state-of-the-art soft FEC codes with 20% overhead [26]. As we can see, the required number of test angles $B_1$ is 5/7/16 for 16/64/256-QAM. Considering the two additional test angles used in the second stage, the total number of test angles used to obtain the quadratic interval for the QA algorithm is 7/9/18 for 16/64/256-QAM.
4.2 Optimization of summing filter block length
Then, we evaluate the effect of the summing filter block length. For simplicity, the 64-QAM modulation format is considered. In order to obtain the optimum filter block length, a two-dimensional contour plot over the two parameters $N_1$ and $N_2$, the summing filter block lengths of the first and second stages, is obtained for our proposed CPE scheme, on the condition of a typical linewidth times symbol duration product $\Delta\nu \cdot T_S$ and $E_S/N_0$ = 21.5 dB, as shown in Fig. 4(a). In terms of Q-factor, a maximum point occurs in the ($N_1$, $N_2$) plane. Similarly, the optimal filter block lengths for other values of $\Delta\nu \cdot T_S$ can also be obtained and the results are summarized in Table 1. Generally speaking, the optimal filter block length is determined by a trade-off between ASE noise and laser phase noise. A large filter block length helps to average the additive noise, while a small filter block length is preferred to avoid the de-correlation of the laser phase noise within the block. Therefore, the filter block length decreases with larger $\Delta\nu \cdot T_S$, as shown in Table 1. Moreover, we find that $N_1$ is always larger than $N_2$ whatever $\Delta\nu \cdot T_S$ is. If $N_1$ is small, the coarse phase estimation may be far from the real phase in the presence of noise distortions. In this case, the obtained initial points will be out of the quadratic interval, which inevitably degrades the performance. Therefore, a larger $N_1$ is always preferred during the implementation of the proposed CPE scheme, and this can be regarded as a guideline for performance optimization: $N_1 > N_2$. In order to further verify the guideline, the optimal filter block lengths under various $\Delta\nu \cdot T_S$ for 16QAM and 256QAM are summarized in Tables 2 and 3, respectively.
Meanwhile, during the optimization of the filter block length, an interesting phenomenon concerning $N_2$ is observed as well. Figure 6 shows the relationship between $N_2$ and the achieved Q-factor, assuming that $N_1$ takes the optimum value in each case and $E_S/N_0$ = 21 dB. For comparison, the traditional BPS scheme with 64 test angles (BPS(64)) is also investigated, with $N$ as its summing filter block length. As we can see, the proposed CPE scheme shows performance similar to that of the BPS(64) scheme with optimized filter block length. In addition, the optimal $N_2$ is approximately equal to the optimal filter block length $N$ of the BPS(64) scheme. For further verification, the relationship between the optimized $N_2$ and $N$ under different $\Delta\nu \cdot T_S$ is also listed in Table 4.
4.3 Phase noise tolerance
Under the optimum summing filter block length, the performance of our proposed CPE scheme and of the traditional single-stage BPS scheme with 32/64/64 test angles for 16/64/256-QAM [7] is investigated by calculating the required $E_S/N_0$ to achieve BER = $10^{-2}$. Since the ML algorithm has been widely used together with the BPS in order to reduce the CC, the BPS/ML CPE scheme is also included as a reference [20]. As shown in Fig. 5, the proposed and traditional BPS schemes show similar tolerance to laser phase noise. Given a 1 dB required $E_S/N_0$ penalty, a linewidth times symbol duration product of $1.7 \times 10^{-4}$, $6 \times 10^{-5}$ and $1.5 \times 10^{-5}$ can be tolerated for 16/64/256-QAM, respectively. However, it reduces to $1 \times 10^{-4}$, $2 \times 10^{-5}$ and $1 \times 10^{-5}$ for 16/64/256-QAM using the BPS/ML scheme, because its simplified BPS stage provides only a rough phase estimate that can be far from the real phase noise value, so that the second-stage ML CPE suffers a performance penalty due to wrong constellation-assisted hard decisions [20]. In order to further evaluate the phase noise tolerance of the individual algorithms at a smaller BER, such as BER = $10^{-3}$, the relationship between BER and $E_S/N_0$ is obtained for typical values of $\Delta\nu \cdot T_S$, as shown in Fig. 6, together with the theoretical curve, which accounts for differential coding through a differential coding factor. The proposed scheme and the traditional BPS scheme have similar performance under various $E_S/N_0$. However, the BPS/ML scheme suffers from a performance penalty. Moreover, compared with the benchmark of BER = $10^{-2}$, the penalty becomes worse at BER = $10^{-3}$ [7].
4.4 Complexity computations
The CC of our proposed CPE scheme is determined by the number of QA iterations $n$. The probability distribution of $n$ is first calculated, as shown in Fig. 7. Obviously, the maximum value of $n$ is only 2 for the designated phase error adopted in the simulations, indicating that 2 iterations are enough for the QA algorithm implementation. Since the initial points for the QA implementation are located on the real plane, square and multiplication operations have the same CC. Moreover, the initial condition for the first iteration, as shown in Eq. (11), is satisfied due to the minimum-selection operation in the second-stage BPS. Since no more than 2 iterations are needed, the CC calculation for 2 QA iterations is summarized as follows:
- ①. Equation (6) only expresses the quadratic relationship and makes no contribution to the CC.
- ②. Equation (7) calculates the necessary intermediate parameters. Three of them each require 3 real multipliers and 1 real adder; note that the CC of a subtraction is the same as that of an addition. Another three each require 1 real multiplier and 1 real adder. The remaining one requires 2 real multipliers and 3 real adders. In summary, Eq. (7) requires 14 real multipliers and 9 real adders.
- ③. In Eq. (8), two of the quadratic coefficients each require 2 real multipliers and 2 real adders. During the DSP implementation, both division and multiplication are implemented with cyclic shifts on binary data, so the CC of a divider is almost the same as that of a multiplier [6, 9, 10]. The third coefficient requires 3 real multipliers and 2 real adders. In total, Eq. (8) requires 7 real multipliers and 6 real adders.
- ⑤. Evaluating the stopping condition of Eq. (10) requires 2 real multipliers, 1 real adder and 1 comparator.
- ⑥. For the next iteration, we need to identify three points. Since the newly obtained point is the minimum point, we only need to choose the two points closest to it from the three previous points. This can be achieved simply by comparing the new estimate with one of the previous test angles, which requires only 1 comparator because the quantities involved have already been computed.
- ⑦. Repeating operations ①-⑤ for the second iteration requires 28 real multipliers, 18 real adders and 1 comparator.
In conclusion, the overall CC of 2 QA iterations is 56 real multipliers, 36 real adders and 3 comparators. We can then obtain the overall CC of our proposed scheme (two-stage BPS together with 2 QA iterations), as shown in Table 5. In particular, the additional complexity of 1 comparator and 2 adders for obtaining the second-stage test angles is also included. For ease of comparison, the CC of the traditional BPS scheme and of the BPS/ML scheme are also listed according to reported results [27]. Compared with the traditional BPS algorithm, the complexity of the proposed scheme is significantly reduced by group factors of 2.96/3.05, 4.55/4.67 and 2.27/2.3 (in the form of multipliers/adders) for 16QAM, 64QAM and 256QAM, respectively. According to Fig. 6, the proposed scheme with parameters $N_1$ = 30 and $N_2$ = 19 has almost the same performance as the traditional BPS scheme with $N$ = 19. Therefore, for the proposed scheme using these representative parameters, the required number of multipliers/adders is 1260/1198, 1620/1556 and 3240/3167 for 16QAM, 64QAM and 256QAM, respectively. For the traditional BPS scheme, the required number of multipliers/adders is 3724/3656, 7372/7272 and 7372/7272 for 16QAM, 64QAM and 256QAM, respectively.
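The quoted reduction factors follow directly from the multiplier/adder counts above, which can be checked with a few lines of arithmetic. The dictionary layout and the function name `reduction_factors` are illustrative choices; the counts themselves are the ones stated in the text.

```python
# (multipliers, adders) per modulation format, as quoted in the text.
PROPOSED = {"16QAM": (1260, 1198), "64QAM": (1620, 1556), "256QAM": (3240, 3167)}
TRADITIONAL_BPS = {"16QAM": (3724, 3656), "64QAM": (7372, 7272), "256QAM": (7372, 7272)}


def reduction_factors(reference, candidate):
    """Ratio of reference to candidate counts, per format, as (multipliers, adders)."""
    return {
        fmt: (reference[fmt][0] / candidate[fmt][0],
              reference[fmt][1] / candidate[fmt][1])
        for fmt in reference
    }


factors = reduction_factors(TRADITIONAL_BPS, PROPOSED)
for fmt, (mult, add) in factors.items():
    print(f"{fmt}: multipliers x{mult:.3f}, adders x{add:.3f}")
```

The computed ratios agree with the quoted group factors to within rounding in the second decimal place.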
5. Conclusion
A low-complexity, linewidth-tolerant feed-forward CPE algorithm based on two-stage BPS with QA is proposed for M-QAM signals. Instead of searching the phase blindly with a fixed step size as the traditional BPS algorithm does, QA can significantly accelerate the phase search. Therefore, the CC is reduced by group factors of 2.96/3.05, 4.55/4.67 and 2.27/2.3 (in the form of multipliers/adders) for 16QAM, 64QAM and 256QAM, respectively. Meanwhile, a guideline for determining the summing filter block length is also discussed during the performance optimization of the proposed CPE scheme. Under the condition of optimum block length, the proposed CPE scheme shows a tolerance to phase noise similar to that of the traditional BPS scheme. At 1 dB required $E_S/N_0$ penalty @ BER = $10^{-2}$, the proposed scheme can tolerate a linewidth times symbol duration product of $1.7 \times 10^{-4}$, $6 \times 10^{-5}$ and $1.5 \times 10^{-5}$ for 16/64/256-QAM signals, respectively.
Acknowledgments
This work was supported by the 863 High Technology Plan (2015AA016904), and National Natural Science Foundation of China (61307091, 61331010).
References and links
1. R. W. Tkach, “Scaling optical communications for the next decade and beyond,” Bell Labs Tech. J. 14(4), 3–9 (2010).
2. S. K. Korotky, “Traffic trends: Drivers and measures of cost-effective and energy-efficient technologies and architectures for backbone optical networks,” in Proceedings of OFC (Los Angeles, California, 2012), paper OM2G.1.
3. E. Lach and W. Idler, “Modulation formats for 100 G and beyond,” Opt. Fiber Technol. 17(5), 377–386 (2011).
4. X. Zhou, L. Nelson, and K. Carlson, “4000 km transmission of 50 GHz spaced, 10 × 494.85-Gb/s Hybrid 32–64 QAM using cascaded equalization and training-assisted phase recovery,” in Proceedings of OFC (Los Angeles, California, 2012), paper PDP5C.
5. P. Winzer, “High-spectral-efficiency optical modulation formats,” J. Lightwave Technol. 30(8), 3824–3835 (2012).
6. E. Ip and J. M. Kahn, “Feedforward carrier recovery for coherent optical communications,” J. Lightwave Technol. 25(9), 2675–2692 (2007).
7. T. Pfau, S. Hoffmann, and R. Noé, “Hardware-efficient coherent digital receiver concept with feedforward carrier recovery for M-QAM constellations,” J. Lightwave Technol. 27(24), 989–999 (2009).
8. M. Seimetz, “Laser linewidth limitations for optical systems with high-order modulation employing feedforward digital carrier phase estimation,” in Proceedings of OFC (San Diego, California, 2008), paper OTuM2.
9. Y. Gao, A. P. T. Lau, S. Yan, and C. Lu, “Low-complexity and phase noise tolerant carrier phase estimation for dual-polarization 16-QAM systems,” Opt. Express 19(22), 21717–21729 (2011).
10. I. Fatadin, D. Ives, and S. J. Savory, “Laser linewidth tolerance for 16QAM coherent optical systems using QPSK partitioning,” IEEE Photonics Technol. Lett. 22(9), 631–633 (2010).
11. S. M. Bilal, G. Bosco, J. Chen, and C. Lu, “Carrier phase estimation through the rotation algorithm for 64-QAM optical systems,” J. Lightwave Technol. 33(9), 1766–1773 (2015).
12. S. Zhang, P. Y. Kam, C. Yu, and J. Chen, “Laser linewidth tolerance of decision-aided maximum likelihood phase estimation in coherent optical M-ary PSK and QAM systems,” IEEE Photonics Technol. Lett. 21(15), 1075–1077 (2009).
13. K. Zhong, J. H. Ke, Y. Gao, and J. C. Cartledge, “Linewidth-tolerant and low-complexity two-stage carrier phase estimation based on modified QPSK partitioning for dual-polarization 16-QAM systems,” J. Lightwave Technol. 31(1), 50–57 (2013).
14. I. Fatadin, D. Ives, and S. J. Savory, “Carrier-phase estimation for 16-QAM optical coherent systems using QPSK partitioning with barycenter approximation,” J. Lightwave Technol. 32(13), 2420–2427 (2014).
15. S. M. Bilal, C. Fludger, and G. Bosco, “Multi-stage CPE algorithms for 64-QAM constellations,” in Proceedings of OFC (San Francisco, California, 2014), paper M2A.8.
16. Y. Gao, A. P. T. Lau, C. Lu, J. Wu, Y. Li, K. Xu, W. Li, and J. Lin, “Multi-stage CPE algorithms for 64-QAM constellations,” in Proceedings of OFC (Los Angeles, California, 2011), paper OMJ6.
17. F. Rice, B. Cowley, B. Moran, and M. Rice, “Cramér-Rao lower bounds for QAM phase and frequency estimation,” IEEE Trans. Commun. 49(9), 1582–1591 (2001).
18. S. K. Oh and S. Stapleton, “Blind phase recovery using finite alphabet properties in digital communications,” Electron. Lett. 33(3), 175–176 (1997).
19. T. Pfau and R. Noé, “Phase-noise-tolerant two-stage carrier recovery concept for higher order QAM formats,” IEEE J. Sel. Top. Quantum Electron. 16(5), 1210–1216 (2010).
20. X. Zhou, “An improved feed-forward carrier recovery algorithm for coherent receivers with M-QAM modulation format,” IEEE Photonics Technol. Lett. 22(14), 1051–1053 (2010).
21. X. Li, Y. Cao, S. Yu, W. Gu, and Y. Ji, “A simplified feed-forward carrier recovery algorithm for coherent optical QAM system,” J. Lightwave Technol. 29(5), 801–807 (2011).
22. Q. Zhuge, C. Chen, and D. V. Plant, “Low computation complexity two-stage feedforward carrier recovery algorithm for M-QAM,” in Proceedings of OFC (Los Angeles, California, 2011), paper OMJ5.
23. J. Li, L. Li, Z. Tao, T. Hoshida, and J. C. Rasmussen, “Laser-linewidth-tolerant feed-forward carrier phase estimator with reduced complexity for QAM,” J. Lightwave Technol. 29(16), 2358–2364 (2011).
24. J. K. Hwang, Y. L. Chiu, and C. S. Liao, “Angle differential-QAM scheme for resolving phase ambiguity in continuous transmission system,” Int. J. Commun. Syst. 21(6), 631–641 (2008).
25. A. Bisplinghoff, C. Vogel, and B. Schmauss, “Slip-reduced carrier phase estimation for coherent transmission in the presence of non-linear phase noise,” in Proceedings of OFC (Anaheim, California, 2013), paper OTu3I.1.
26. Y. Miyata, K. Sugihara, W. Matsumoto, K. Onohara, T. Sugihara, K. Kubo, H. Yoshida, and T. Mizuochi, “A triple-concatenated FEC using soft-decision decoding for 100 Gb/s optical transmission,” in Proceedings of OFC (San Diego, California, 2010), paper OThL3.
27. K. Zhong, J. H. Ke, and Y. Gao, “Linewidth-tolerant and low-complexity two-stage carrier phase estimation for dual-polarization 16-QAM coherent optical fiber communications,” J. Lightwave Technol. 30(24), 3987–3992 (2012).