Low-latency and efficient retiming and equalizing scheme for a 112-Gbps bandwidth-limited optical PAM-4 system

Lin Sun; Jiawang Xiao; Yi Cai; Gangxiang Shen; Gordon Ning Liu; Chao Lu

doi:10.1364/OE.457998

1. Introduction

Driven by the growth of data center (DC) interconnection in recent years, the demand for short-reach optical communication links with high capacity, low latency, and low complexity has increased significantly. For intra-DC interconnections, in which the link distance is generally below 2 km, the primary constraint on link capacity is the limited bandwidth of the optoelectronic devices in the optical transceivers [1]. When operating in the O-band, the effect of fiber dispersion is relatively small. Optical intensity modulation with direct detection (IMDD) is preferred because it has lower power consumption and simpler digital signal processing (DSP) than its coherent-detection counterpart. For intra-DC applications, the optical IMDD system should support 1) a high capacity under the constraint of limited channel bandwidth and 2) simple DSP with low latency. Many schemes have been studied to achieve these objectives, including Tomlinson-Harashima precoding [2], Nyquist pulse shaping [3], and simplified equalizers at the receiver [4–6]. For practical implementation, a simplified DSP flow is preferred to achieve low latency and complexity [7,8], for example blind retiming without training pilots, a short-tap feedforward equalizer (FFE), and a decision feedback equalizer (DFE).

To meet these DSP requirements for intra-DC interconnection, a low-latency and low-complexity retiming scheme is of great importance as the first step of receiver signal processing. Commonly used timing synchronization algorithms include the Gardner loop [9,10] and the Mueller and Müller algorithm [11]. The Gardner retiming loop reaches its minimum timing error by updating the interpolation phase of 2-sample/symbol sequences, without prior-known pilots for training. It exhibits low complexity and is compatible with other 2-sample/symbol equalizers [10].

However, when signals are narrowly filtered due to channel bandwidth limitation, the rising/falling edges are distorted, and a much more precise retiming is required to achieve a good bit error rate (BER) performance. Previous studies have shown that the retiming phase of the Gardner loop has a non-negligible variance even after many iterations [10,12]. This is due to the inherent tradeoff between convergence speed and residual timing error, which is set by the iteration step size of error-minimization iteration algorithms. Increasing the convergence speed requires a coarse step, which results in significant timing jitter. A smaller residual timing error can be obtained with finer steps, but this increases the convergence time [13]. An improved Gardner retiming scheme has been proposed to reduce timing jitter by multiplying the updating phase by a factor less than 1; however, it also slows down convergence [14].

In this work, we propose an efficient retiming method that reduces the sampling phase variance of the Gardner loop with a moving average filter (MAF). Based on simulation results for an optical IMDD Nyquist-shaped 4-level pulse amplitude modulation (PAM-4) system, the improved retiming method reduces timing jitter without sacrificing iteration speed. It enables much more precise sampling than the traditional Gardner loop. Moreover, when this stabilized retiming method is used before the DFE, the iteration of the feedback coefficients is accelerated. Thanks to the better DFE performance, significant BER reductions are obtained for a 112-Gbps optical PAM-4 system, providing at least 2-dB power budget gain at the 20%-overhead (OH) forward error correction (FEC) limit with low latency.

2. Operational principle

2.1 Principle of the ultra-stable retiming method

The principle of the proposed retiming loop is given in Fig. 1. For the traditional Gardner retiming loop, the timing error $Error(k)$ of the $k$th sample is calculated from the 2-sample/symbol sequence $y(n)$ as

$$Error(k) = \left[ y(k - 1) - \frac{y(k) - y(k - 2)}{2} \right] \left[ y(k) - y(k - 2) \right]. \tag{1}$$

The timing error is minimized by a loop that updates the sampling phase, without using a prior-known pilot for synchronization. In particular, an interpolation filter is used to obtain the sampled value at the sampling phase ${u_k}$, and the retiming error is then calculated using Eq. (1). The loop filter is an integral filter that generates a control word $w(k)$ for the numerically controlled oscillator (NCO). The control word is updated from the timing error through $w({k + 1} )= w(k )+ {C_1}[{Error(k )- Error({k - 1} )} ]$, where ${C_1}$ is the loop filter coefficient. This formula indicates that whenever the differential error $Error(k )- Error({k - 1} )$ is non-zero, $w(k)$ varies correspondingly, which affects the accuracy of the timing phase. To solve this problem, a method has been proposed that stabilizes the timing phase by multiplying the loop filter output by a scaling factor to reduce the variance of the NCO output [14]. However, it requires an additional digital switch to decide when to apply the scaling factor at the loop filter output. In this work, we employ a moving average filter (MAF) after the NCO to stabilize the sampling phase. The basic reason for using an MAF is that the sampling phase is a stationary random sequence. As the results in Section 3 show, a short MAF tap length is sufficient, which maintains low complexity and latency. Moreover, the proposed retiming scheme does not reduce convergence speed because the loop filter coefficient ${C_1}$ is not changed.
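The building blocks described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function and class names (`gardner_error`, `update_control_word`, `MovingAverage`) are hypothetical, the interpolation filter and the NCO itself are omitted, and no phase wrap-around handling is included.

```python
from collections import deque


def gardner_error(y_k, y_km1, y_km2):
    """Timing error of Eq. (1) from three consecutive 2-sample/symbol values."""
    diff = y_k - y_km2
    return (y_km1 - diff / 2.0) * diff


def update_control_word(w_k, err_k, err_km1, c1=0.0005):
    """Loop-filter update w(k+1) = w(k) + C1 * [Error(k) - Error(k-1)]."""
    return w_k + c1 * (err_k - err_km1)


class MovingAverage:
    """N-tap moving average applied to the sampling phase after the NCO."""

    def __init__(self, taps=10):
        # deque with maxlen drops the oldest phase automatically
        self.buf = deque(maxlen=taps)

    def step(self, phase):
        """Push one new phase value and return the current average."""
        self.buf.append(phase)
        return sum(self.buf) / len(self.buf)
```

In this sketch the averaging acts only on the phase passed to the interpolator, so `c1` (and with it the loop dynamics) is left untouched, which mirrors the argument that the MAF does not slow convergence.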

Fig. 1. Principle of the proposed ultra-stable retiming loop.

2.2 Advantages of the ultra-stable retiming in DFE-equalized optical PAM-4 systems

The retiming process for PAM signals is depicted in Fig. 2. When the Gardner error function in Eq. (1) is used to calculate the timing error on the rising/falling edges, the thresholds are time variant, given by $T{H_i}(k )= ({y(k )- y({k - 2} )} )/2$. This causes intrinsic timing jitter during the Gardner retiming procedure. In addition, the thresholds $T{H_i}$ after iteration are not always optimal for PAM-4 signals with different pulse shapes. For example, PAM-4 signals with a Gaussian waveform have a large duty cycle and thus a larger tolerance to timing jitter, as shown in Fig. 2(a). However, the half-voltage timing point on the rising/falling edges is not at the center of the symbol duration, as shown on the left of Fig. 2. Consequently, a low retiming jitter is still required unless the optimal sampling is achieved at the largest eye opening. For spectrally shaped PAM-4 (which can result from Nyquist shaping and channel bandwidth limitation), the slowed-down edges result in severe eye closure, and any significant timing jitter is converted to amplitude noise.

Fig. 2. Schematic diagram of the influences of a suboptimal retiming to PAM-4 signaling.

On the other hand, low-jitter retiming can also lead to better DFE performance, because wrong decisions cause error propagation that deteriorates the BER. A more precise retiming is therefore expected to yield more precise decisions and a better DFE performance. The DFE can be described in the time domain by the following equation,

$$Y(k) = y(k) + \sum_{j = 1}^{D} {w_{DFE}}(j)\,\hat{Y}(k - j) \tag{2}$$

where $y(k )$ is the input of the DFE (that is, the output of the Gardner loop), $Y(k )$ is the DFE output, $\hat{Y}(k )$ is the decided symbol, and $D$ is the number of feedback taps. As indicated in Fig. 2(b), jitter-induced noise increases the decision errors in the DFE, which slows down the convergence of the feedback coefficients ${w_{DFE}}(j )$ or even results in error propagation. It is therefore anticipated that the proposed stabilized retiming method will improve the performance of the DFE.
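To make Eq. (2) concrete, the following Python sketch applies a feedback-only DFE with fixed coefficients to a symbol-rate sequence. It is an illustration under assumptions: the PAM-4 levels are taken as normalized to ±1 and ±3, the decision device is a nearest-level slicer, and the names `slicer` and `dfe` are hypothetical.

```python
PAM4_LEVELS = (-3.0, -1.0, 1.0, 3.0)  # assumed normalized PAM-4 levels


def slicer(value):
    """Hard decision: return the PAM-4 level closest to value."""
    return min(PAM4_LEVELS, key=lambda level: abs(value - level))


def dfe(y, w_dfe):
    """Apply Eq. (2): Y(k) = y(k) + sum_j w_dfe[j-1] * Y_hat(k-j).

    y     : retimed input samples, one per symbol
    w_dfe : feedback coefficients w_DFE(1..D)
    Returns (equalized outputs, decided symbols).
    """
    depth = len(w_dfe)
    past = [0.0] * depth  # past[j-1] holds the decided symbol Y_hat(k-j)
    outputs, decisions = [], []
    for y_k in y:
        out_k = y_k + sum(w * d for w, d in zip(w_dfe, past))
        dec_k = slicer(out_k)
        outputs.append(out_k)
        decisions.append(dec_k)
        past = [dec_k] + past[:-1]  # shift the decision history by one symbol
    return outputs, decisions
```

For a toy channel with one post-cursor, $y(k) = s(k) + 0.25\,s(k-1)$, the single coefficient `w_dfe = [-0.25]` cancels the interference exactly as long as the decisions are correct. A wrong decision instead feeds an incorrect correction into the following symbols, which is the error propagation mechanism discussed above.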

3. Results and discussions

To investigate the performance of the proposed retiming method in an optical IMDD PAM-4 system, we conducted a simulation study using VPItransmissionMaker. The simulation setup is shown in Fig. 3. The parameters of the modules used in the simulation are the same as those in our previous experimental study [15]. The arbitrary waveform generator (AWG) is set to a 56-GSa/s sampling rate with 35-GHz bandwidth, and the corresponding DAC resolution is 8 bits. The 2-km fiber link is modeled using the standard single-mode fiber (SMF) model, taking fiber attenuation, dispersion, and nonlinearity into consideration. Before detection, an erbium-doped fiber amplifier (EDFA) pre-amplifies the signals with 20-dB gain and a 5-dB noise figure. The photodetector (PD) has a 35-GHz bandwidth and 0.8-A/W responsivity. A digital storage oscilloscope (DSO) is then modeled according to the Keysight real-time oscilloscope (Z562) with 8-bit ADC resolution and a 160-GSa/s sampling rate. The digital signal processing includes retiming, DFE, and BER counting.

Fig. 3. Evaluation setup for the proposed joint retiming and equalization method.

To find a compromise between convergence speed and stability, three different values of ${C_1}$ are tested for the traditional Gardner retiming method. The sampling phase evolutions are plotted in Fig. 4. They indicate that ${C_1} = 0.0005$ gives a good compromise between convergence speed and stability. Thus, we set ${C_1}$ to 0.0005 when comparing the performance of the MAF-Gardner and the traditional Gardner methods in the rest of this paper.

Fig. 4. Sampling phase convergences with different loop filter coefficients.

To characterize the retiming performance for different system bandwidths, the sampling phase and constellations are plotted for every iteration in Fig. 5 for system bandwidth limitations of 35 GHz and 30 GHz, which are set by adding a low-pass filter with the corresponding cutoff frequency. The roll-off factor of the pulse shaping is 0.02 and the received optical power is 0 dBm.

Fig. 5. (a) Sampling phase and (b) constellations by traditional Gardner retiming (with MAF length at 1) and the proposed method with channel bandwidth at 35 GHz. (c) Sampling phase and (d) constellations by traditional Gardner retiming (with MAF length at 1) and the proposed method with channel bandwidth at 30 GHz.

To investigate the convergence performance under different clock offsets, the sampling phase is initialized differently for the results in Fig. 5(a) and (c). The results indicate that the proposed MAF-Gardner method exhibits a reduced sampling phase variance and a faster convergence speed compared with the traditional Gardner retiming method. In addition, the converged constellations obtained with the MAF-Gardner loop are better than those of the traditional Gardner loop, as shown in Fig. 5(b). When the channel bandwidth is limited to 30 GHz, the proposed MAF-Gardner retiming method again yields clearer constellations than the conventional Gardner algorithm, as shown in Fig. 5(c) and (d).

To verify the BER improvement from the proposed retiming method, we use BER bathtub curves with a fine granularity of sampling phase, which indicate the signals' tolerance to sampling phase error and timing jitter. To obtain the bathtub curves shown in Fig. 6, we synchronize the samples using prior-known training sequences. The sampling points obtained by the traditional Gardner method and the proposed MAF-Gardner method are also plotted for comparison. The timing jitter induced by the Gardner loop is illustrated by plotting the 3-σ range of the sampling phase, where σ is the standard deviation of the sampling phase after 10k samples of iteration. The timing jitter J is defined from the standard deviation of the sampling phase $\mathrm{\sigma}$ as

$$J = 20{\log _{10}}(\sigma) \tag{3}$$

As shown in Fig. 6, there is little difference between the sampling points of the traditional Gardner and MAF-Gardner retiming. However, the 3-σ range is significantly reduced by the MAF-Gardner method and covers a much smaller sampling region with lower BERs. The BER values in this case are $2.1 \times 10^{-3}$ and $6.3 \times 10^{-4}$ for the traditional Gardner and the MAF-Gardner retiming, respectively.
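The jitter metric of Eq. (3) and the 3-σ range can be computed from a recorded sampling phase sequence as in the sketch below. The helper names are hypothetical, and two details are assumptions because the text does not state them: the population form of the standard deviation is used, and σ is expressed in whatever unit the phase sequence uses.

```python
import math


def phase_sigma(phases):
    """Population standard deviation of a sampling phase sequence."""
    n = len(phases)
    mean = sum(phases) / n
    return math.sqrt(sum((p - mean) ** 2 for p in phases) / n)


def timing_jitter_db(phases):
    """Timing jitter J = 20 * log10(sigma), as in Eq. (3); requires sigma > 0."""
    return 20.0 * math.log10(phase_sigma(phases))


def three_sigma_range(phases):
    """Return (mean - 3*sigma, mean + 3*sigma) of the sampling phase."""
    mean = sum(phases) / len(phases)
    sigma = phase_sigma(phases)
    return mean - 3.0 * sigma, mean + 3.0 * sigma
```

Because J is logarithmic in σ, a difference in J between two retiming methods maps directly to a ratio of standard deviations: a 20-dB reduction corresponds to a tenfold smaller σ.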

Fig. 6. Bathtub curve with channel bandwidth of 35 GHz and retiming region by traditional Gardner and the proposed retiming method.

Figure 7 shows the timing jitter and BER values against MAF tap length for different system bandwidths. The BER here is calculated directly on the output sequences after retiming, without equalization. As shown in Fig. 7(a), the timing jitter decreases as the MAF length increases. Typically, with an MAF tap length of 10, a jitter reduction of over 25 dB can be achieved compared with the traditional Gardner retiming. Notably, the lowest timing jitter is achieved when the bandwidth is 25 GHz. As indicated in Fig. 7(b), an MAF with a tap length of 10 is sufficient for achieving a good BER value without equalization. As discussed in Section 2.2, a more precise retiming also benefits the decisions in the DFE, which can further improve DFE performance.

Fig. 7. (a) Timing jitter vs. MAF tap length. (b) BER vs. MAF tap length.

We then implemented a DFE after the retiming block to investigate the DFE performance gain from better retiming. In this work, we investigated a 10-tap DFE for reasons of low latency and complexity. We use a recursive least squares (RLS) DFE for equalization, whose updating rule is based on minimizing a weighted linear least squares cost function related to the input signals [16]. For the 30-GHz bandwidth case, the evolution of the feedback coefficients of the 10-tap DFE is plotted in Fig. 8, after the traditional Gardner retiming and the proposed MAF-Gardner retiming, respectively. Figure 8(a) shows that the feedback coefficients of the DFE iterate to zero because of poor retiming by the traditional Gardner loop. Moreover, the feedback coefficients fluctuate strongly even after many samples of iteration. This is because a high retiming jitter leads to an unstable sampling point and hence a poor convergence of the feedback coefficients. For the MAF-Gardner retiming, the iterated feedback coefficients are non-zero, which indicates that the inter-symbol interference (ISI) has been successfully learned during the iterations. Moreover, the feedback coefficients become stable after 400 samples of iteration with the assistance of precise retiming. To further improve the equalization performance of the DFE, one can also use DSP methods that alleviate the error propagation effect [15,17].
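As an illustration of how an RLS update can adapt the feedback coefficients of Eq. (2), the sketch below uses the textbook exponentially weighted RLS recursion. It is not taken from the paper: the forgetting factor `lam`, the initialization `delta`, the use of known training symbols as decisions, the noise-free toy channel, and all function names are assumptions made for the example.

```python
import random

PAM4 = (-3.0, -1.0, 1.0, 3.0)  # assumed normalized PAM-4 levels


def make_isi_signal(n, isi=0.3, seed=0):
    """Toy, noise-free test signal y(k) = s(k) + isi * s(k-1)."""
    rng = random.Random(seed)
    s = [rng.choice(PAM4) for _ in range(n)]
    y = [s[k] + (isi * s[k - 1] if k > 0 else 0.0) for k in range(n)]
    return y, s


def rls_dfe_train(y, d, taps=2, lam=0.99, delta=1e3):
    """Adapt feedback taps so that y(k) + sum_j w[j-1] * d(k-j) tracks d(k).

    y : DFE input samples; d : known (training) symbols used as decisions.
    """
    w = [0.0] * taps
    # Inverse correlation matrix estimate, initialized to delta * identity
    P = [[delta if i == j else 0.0 for j in range(taps)] for i in range(taps)]
    u = [0.0] * taps  # u[j-1] holds d(k-j)
    for y_k, d_k in zip(y, d):
        Pu = [sum(P[i][j] * u[j] for j in range(taps)) for i in range(taps)]
        denom = lam + sum(u[i] * Pu[i] for i in range(taps))
        gain = [v / denom for v in Pu]
        # A priori error between the training symbol and the DFE output
        err = d_k - (y_k + sum(w[i] * u[i] for i in range(taps)))
        w = [w[i] + gain[i] * err for i in range(taps)]
        # P <- (P - gain * u^T P) / lam; P is symmetric, so u^T P equals Pu^T
        P = [[(P[i][j] - gain[i] * Pu[j]) / lam for j in range(taps)]
             for i in range(taps)]
        u = [d_k] + u[:-1]
    return w
```

With `y, s = make_isi_signal(300)` and `rls_dfe_train(y, s, taps=2)`, the first coefficient approaches -0.3 and the second approaches zero, since only one post-cursor is present in the toy channel. In a decision-directed setting the regressor would hold sliced outputs instead of training symbols, which is where timing jitter and decision errors can disturb the adaptation.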

Fig. 8. Feedback coefficients evolution of DFE after (a) the traditional Gardner retiming, (b) the MAF-Gardner retiming with tap length at 10.

Using a 10-tap FFE and DFE for post-processing after retiming, the BER with 30-GHz bandwidth is plotted against received optical power (ROP) in Fig. 9. For the traditional Gardner retiming, the DFE brings no BER improvement at low received optical powers. This is because the high retiming jitter of the traditional Gardner method leads to a poor iteration of the DFE coefficients, as seen in Fig. 8(a). With MAF-Gardner retiming at a tap length of 10, a significantly reduced BER is obtained even without DFE processing, thanks to the improved sampling jitter under this bandwidth-limited condition. Furthermore, the improved retiming also enhances the performance of the DFE. With the DFE after MAF-Gardner retiming, a BER below the 7%-OH FEC limit is achieved for ROP above −2 dBm. Compared with the conventional Gardner method, the proposed low-jitter retiming provides nearly 2-dB power budget gain at the 20%-OH FEC threshold. Moreover, with the traditional Gardner algorithm, the BERs of signals equalized by the DFE are even worse than those without the DFE at low received powers, owing to the error propagation effect of the DFE at low SNR. With the MAF-Gardner retiming method, the BERs of signals equalized by the DFE remain lower than those without the DFE.

Fig. 9. BER vs. received optical power using traditional Gardner and MAF-Gardner with the assistance of 10-tap FFE and DFE.

In addition, the proposed retiming method exhibits low complexity and latency for practical use. The only difference between the proposed ultra-stable retiming and the traditional Gardner retiming is the additional MAF. According to the results in Fig. 7, an MAF with a tap length of 10 is sufficient for achieving a significantly reduced BER. Compared with the traditional Gardner loop, the signals in the MAF-Gardner loop additionally pass through an MAF, in which each tap contains a shift register (1 clock cycle) and an adder (1 clock cycle) [18]. Thus, an N-tap MAF adds 2N operations of one clock cycle each. With a 200-MHz digital circuit after the SerDes, the additional latency of the 10-tap MAF-Gardner is 20 cycles of 5 ns, or approximately 0.1 µs. Thus, we believe that the latency induced by the proposed MAF-Gardner is acceptable.
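The latency estimate above is simple arithmetic and can be reproduced as follows. The function name is hypothetical, and the sketch assumes, as in the text, two one-cycle operations per tap executed sequentially.

```python
def maf_extra_latency_s(taps, clock_hz, cycles_per_tap=2):
    """Extra latency in seconds of an N-tap MAF: N * cycles_per_tap / f_clk."""
    return taps * cycles_per_tap / clock_hz


# 10 taps at 200 MHz: 20 cycles of 5 ns each, i.e. about 0.1 microseconds
latency_us = maf_extra_latency_s(10, 200e6) * 1e6
```

A more parallel hardware implementation could reduce this figure, so the value is best read as the estimate under the sequential assumption stated here.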

4. Conclusion

In this work, we propose a low-latency and efficient retiming and equalizing scheme for bandwidth-limited optical IMDD systems. With the proposed ultra-stable retiming scheme, the performance of the DFE is improved and the BER is significantly reduced. Because the latency added by the proposed MAF-Gardner retiming is negligible, we believe that the proposed joint retiming and equalizing scheme is promising for intra-DC interconnections where low complexity and latency are important.

Funding

National Key Research and Development Program of China (2018YFB1801701); National Natural Science Foundation of China (62105273).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. C. F. Lam, H. Liu, B. Koley, X. Zhao, V. Kamalov, and V. Gill, “Fiber optic communication technologies: What's needed for datacenter network operations,” IEEE Communications Magazine 48(7), 32–39 (2010).

2. Q. Hu, M. Chagnon, K. Schuh, F. Buchali, and H. Bülow, “IM/DD beyond bandwidth limitation for data center optical interconnects,” J. Lightwave Technol. 37(19), 4940–4946 (2019).

3. Y. Zhu, F. Zhang, F. Yang, L. Zhang, X. Ruan, Y. Li, and Z. Chen, “Toward single lane 200G optical interconnects with silicon photonic modulator,” J. Lightwave Technol. 38(1), 67–74 (2020).

4. M. Li, W. Zhang, Q. Chen, and Z. He, “High-throughput hardware deployment of pruned neural network based nonlinear equalization for 100-Gbps short-reach optical interconnect,” Opt. Lett. 46(19), 4980–4983 (2021).

5. J. Zhang, C. Guo, J. Liu, X. Wu, A. P. T. Lau, C. Lu, and S. Yu, “Decision-feedback frequency-domain Volterra nonlinear equalizer for IM/DD OFDM long-reach PON,” J. Lightwave Technol. 37(13), 3333–3342 (2019).

6. J. Zhang, Q. Liu, M. Zhu, H. Lin, S. Hu, X. Yi, and K. Qiu, “EML-based 200-Gbit/s/λ DMT signal transmission over 10-km SSMF using entropy loading and simplified Volterra equalization,” in 2021 Optical Fiber Communication Conference, Optical Society of America, pp. W6A–32.

7. K. Gopalakrishnan, A. Ren, A. Tan, A. Farhood, A. Tiruvur, B. Helal, and V. Shvydun, “3.4 A 40/50/100Gb/s PAM-4 Ethernet transceiver in 28 nm CMOS,” in 2016 IEEE International Solid-State Circuits Conference (ISSCC), IEEE, pp. 62–63.

8. C. Loi, A. Mellati, A. Tan, A. Farhoodfar, A. Tiruvur, B. Helal, and Y. Liao, “6.5 A 400Gb/s transceiver for PAM-4 optical direct-detect application in 16 nm FinFET,” in 2019 IEEE International Solid-State Circuits Conference (ISSCC), IEEE, pp. 120–122.

9. N. Stojanović, T. Rahman, S. Calabrò, J. Wei, and C. Xie, “Baud-rate timing phase detector for systems with severe bandwidth limitations,” in 2019 Optical Fiber Communication Conference, Optical Society of America, pp. M4J-5.

10. H. Zhou, Y. Li, D. Lu, L. Yue, C. Gao, Y. Liu, and J. Wu, “Joint clock recovery and feedforward equalization for PAM4 transmission,” Opt. Express 27(8), 11385–11395 (2019).

11. K. Mueller and M. Müller, “Timing recovery in digital synchronous data receivers,” IEEE Transactions on Communications 24(5), 516–531 (1976).

12. M. S. Alam, A. Abdo, X. Li, S. Aouini, M. Parvizi, N. Ben-Hamida, and D. V. Plant, “Performance and complexity analysis of blind timing phase error detectors in pluggable coherent receivers,” in Signal Processing in Photonic Communications, Optical Society of America, pp. SpTu1I-2.

13. F. Li, D. Zou, L. Ding, Y. Sun, J. Li, Q. Sui, and Z. Li, “100 Gbit/s PAM4 signal transmission and reception for 2-km interconnect with adaptive notch filter for narrowband interference,” Opt. Express 26(18), 24066–24074 (2018).

14. W. Lei and S. Qiang, “Improved timing recovery loop in laser communication,” in 2016 2nd IEEE International Conference on Computer and Communications (ICCC), IEEE, pp. 2159–2163.

15. J. Zhang, X. Wu, L. Sun, J. Liu, A. P. T. Lau, C. Guo, and C. Lu, “C-band 120-Gb/s PAM-4 transmission over 50-km SSMF with improved weighted decision-feedback equalizer,” Opt. Express 29(25), 41622–41633 (2021).

16. K. Bandara and Y. H. Chung, “Reduced training sequence using RLS adaptive algorithm with decision feedback equalizer in indoor visible light wireless communication channel,” in 2012 International Conference on ICT Convergence (ICTC), IEEE, pp. 149–154.

17. J. Zhou, C. Yang, D. Wang, Q. Sui, H. Wang, S. Gao, and Z. Li, “Burst-error-propagation suppression for decision-feedback equalizer in field-trial submarine fiber-optic communications,” J. Lightwave Technol. 39(14), 4601–4606 (2021).

18. L. Chen, C. Li, C. W. Oh, and A. T. Koonen, “A low-latency real-time PAM-4 receiver enabled by deep-parallel technique,” Opt. Commun. 508, 127836 (2022).


Optics Express