We study the performance of novel quadrature amplitude modulation (QAM) constellations for 100 Gb/s transmission by a directly-modulated laser. Due to the strong nonlinearity of a directly-modulated laser, rectangular constellations suffer a large penalty from their regular spacing between symbols. We present a method for synthesizing irregular constellations which position symbols more efficiently. We will demonstrate the improved performance of these novel constellations over the conventional rectangular constellation as well as the superior performance achievable with digital QAM compared to optimally bit-loaded discrete-multitone modulation.
© 2014 Optical Society of America
1. Introduction
Future fiber optic links in data centers will likely transmit 100 Gb/s per wavelength through high-order modulation techniques such as pulse amplitude modulation (PAM), carrierless amplitude and phase (CAP) modulation, quadrature amplitude modulation (QAM), and discrete multitone (DMT) modulation . For the high-volume data center market, cost is an important aspect of system design, which makes direct modulation of lasers favorable over external modulation. However, directly-modulated lasers (DMLs) exhibit significant nonlinearities in the conversion from drive current to output intensity. Additionally, their large modulation bandwidth results in a large relative intensity noise (RIN)  which can dominate over shot and thermal noise in a short-reach link. These differences greatly affect the performance of any system employing high-order modulation techniques.
Among the single-carrier formats of PAM, CAP, and QAM, CAP and QAM intrinsically offer higher performance, since PAM is essentially a special case of CAP or QAM with an inefficient constellation . Much of the recent work on high data rate, short-reach intensity-modulated/direct-detection (IM/DD) systems has focused on CAP and DMT [4–10]. This is due to the high spectral efficiency and baseband nature of these modulation formats. Both CAP and DMT have previously been used in digital subscriber loops (DSL) [11, 12]. Most implementations of CAP are very similar to QAM in that they synthesize the symbols from two orthogonal basis functions. In fact, standard implementations of CAP utilize basis functions constructed from multiplying the square-root raised cosine pulse with quadrature sinusoids. As in QAM, this 2-D CAP system demodulates the received signal through matched filtering. For the case of an analog CAP system, the CAP transmitter first encodes the data into two independent PAM signals. One PAM signal is pulse shaped by the in-phase filter and the other PAM signal is pulse-shaped by the quadrature filter. These two pulse-shaped PAM signals are then combined to form the 2-D CAP signal. At the receiver the two PAM signals are recovered through matched filtering and sampling. Thus, the CAP system can be implemented using analog filters and without a digital-to-analog converter (DAC) or analog-to-digital converter (ADC), which would lower the power consumption [4, 5].
However, for a digital implementation of CAP, the traditional analog CAP architecture can be modified to use a QAM mapper rather than two independent PAM mappers (for example,  used CAP with QAM mappers). For such a system, when the in-phase and quadrature pulse shapes are the root raised cosine with quadrature sinusoidal modulation, the digital CAP system is nearly the same as a digital QAM system. Both systems use the same fundamental pulse shape, have the same number of signaling dimensions, and occupy the same bandwidth. Although CAP can be extended to higher dimensions than two, the spectral efficiency remains the same; using more than two dimensions is for the purpose of multiple access [12, 14].
Our study focuses on the digital QAM architecture. However, it is equally applicable to digital CAP systems based on quadrature-modulated root raised cosine pulses and QAM constellation mapping. Such digital systems have the following advantages over analog CAP systems. First, in analog CAP, using two PAM mappers only allows for synthesis of signals from a constellation of size $M^2$, where $M$ is an integer, because its 2-D constellation is the Cartesian product of two $M$-PAM constellations. Additionally, the shape of this constellation is constrained, since the two PAM mappers operate independently of each other. Second, analog CAP based on analog filters has very limited equalization capability and thus only works well in fairly flat channels. There are two reasons for this. The first is the limited accuracy and tuning range of an analog filter compared to a digital filter. The second is that a digital equalizer can be fractionally spaced. The fractionally-spaced equalizer has been found to provide significant performance gains due to its ability to equalize before aliasing by the symbol-rate sampler (pp. 655–660 of ).
Unlike most past studies, we simulated all systems using a nonlinear rate-equation model for a commercial distributed feedback (DFB) DML at 1310 nm. These models are based upon laboratory characterizations of a 20-GHz-bandwidth DFB laser used in commercial Finisar LR-4 transmitters. Thus, these models fully characterize the nonlinearity and RIN of a DML. As we will show, the DML nonlinearity greatly favors a non-rectangular constellation, which requires a QAM mapper to generate. We employ a fully-digital implementation of QAM, as described in . The ability to perform the operations of pulse shaping and frequency translation digitally at symbol rates near 20 Gbaud has been enabled by the recent development of commercial CMOS DAC and ADC cores operating at 65 GSamples/s [10, 16]. Due to CMOS implementation, these data converters operate at approximately 1 W/channel [17, 18], making them practical for use in data center transceivers. The adoption of a fully-digital QAM/CAP implementation avoids the many imperfections in an analog system and benefits from the steady improvements in CMOS technology.
In addition to comparing the performance of novel non-rectangular constellations to the standard rectangular constellation, we will demonstrate superior performance to optimally bit-loaded DMT modulation. Digital QAM/CAP is thus a strong contender for future short-reach IM/DD systems.
This paper is organized as follows. Section 2 introduces the digital QAM architecture. Section 3 explains our method for generating a more optimal QAM constellation for a DML. Section 4 provides full system simulations of QAM and DMT, comparing the performance of QAM with a conventional constellation, QAM with our optimized constellations, and DMT. Finally, Section 5 provides the conclusions of our study.
2. QAM architecture
2.1. QAM transmitter
The digital QAM transmitter is depicted in Fig. 1. This form of QAM is based on the concept of subcarrier modulation, in which the pulses are shaped to occupy a narrow band and frequency-translated with an RF subcarrier. The first step in the transmission process is mapping of the input bits into complex QAM symbols. These complex symbols are then upsampled by a factor of 3 to the DAC’s sampling rate of $f_{DAC}$. This means that the DAC operates at a sampling rate of 3 times the baud rate. Although 3 samples/symbol is not the lowest upsampling ratio needed, it is the lowest integer upsampling ratio possible, as will be explained. The minimum non-integer upsampling ratio is given by the ratio of twice the signal bandwidth to the baud rate. Using an integer ratio of sampling rate to baud rate simplifies the transmitter and receiver circuitry but results in a slightly higher sampling rate than the minimum needed.
The upsampled symbols then become the amplitudes of a stream of baseband pulses. These pulses are shaped to limit their bandwidth. In particular, we employ the finite impulse response (FIR) approximation of the square-root raised cosine pulse. Such pulses have a bandwidth of $f_{RC} = (1+\beta)/(2T)$, where $T$ is the symbol period and $\beta$ is the roll-off factor (see [3]). It should be noted that $f_{RC}$ is the frequency at which the pulse spectrum reaches 0. For the case of an FIR approximation, the spectrum will not exactly reach 0, but will be very small for frequencies above $f_{RC}$. This baseband pulse stream then modulates an RF subcarrier at a frequency of $f_{RC}$ to form a digital QAM signal which occupies a bandwidth of $2 f_{RC}$. Finally, the digital QAM signal is converted into an analog signal with a DAC. Since the DAC sampling rate must be at least twice the highest frequency present in the RF QAM signal, we need $f_{DAC} \ge 4 f_{RC} = 2(1+\beta)/T$.
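As a numerical check of the sampling-rate requirement, the sketch below assumes the standard raised-cosine relation $f_{RC} = (1+\beta)/(2T)$ and a signal that spans 0 to $2 f_{RC}$ after translation to the subcarrier. The baud rate of 65/3 Gbaud and β = 0.2 are taken from the simulation parameters given later in the paper; the function name is hypothetical.

```python
import math

def qam_sampling_requirements(baud_rate_hz, beta):
    """Return (f_rc, min_dac_rate, min_ratio, min_integer_ratio).

    f_rc is the frequency at which the ideal root-raised-cosine spectrum reaches zero.
    """
    f_rc = (1 + beta) * baud_rate_hz / 2       # one-sided baseband bandwidth
    highest_freq = 2 * f_rc                    # subcarrier at f_rc, so the signal spans 0 .. 2*f_rc
    min_dac_rate = 2 * highest_freq            # Nyquist criterion
    min_ratio = min_dac_rate / baud_rate_hz    # equals 2*(1 + beta) samples/symbol
    return f_rc, min_dac_rate, min_ratio, math.ceil(min_ratio)

baud = 65e9 / 3                                # 65 GSamples/s at 3 samples/symbol
f_rc, min_dac_rate, min_ratio, int_ratio = qam_sampling_requirements(baud, beta=0.2)
print(f"f_RC = {f_rc / 1e9:.2f} GHz, minimum DAC rate = {min_dac_rate / 1e9:.2f} GSamples/s")
print(f"minimum ratio = {min_ratio:.2f} samples/symbol, lowest integer ratio = {int_ratio}")
```

Under these assumptions the minimum ratio is 2(1+β) = 2.4 samples/symbol, which is why 3 is the lowest usable integer ratio.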
Although in principle, one could construct a digital QAM signal which has its peaks at the largest amplitudes of the constellation, this would require a baseband pulse shape of [...] [11]; digital QAM with square-root raised-cosine pulse shaping will possess a lower peak-to-average power ratio (PAPR) since the pulses decay rapidly in time, so only a small number of pulses contribute significantly to the sum. However, ultra-high-speed DACs typically have resolutions in the range of 5–8 bits, and thus their digital inputs must be appropriately scaled in order to make efficient use of the DAC’s dynamic range. This means that optimal use of a low-resolution DAC for digital QAM will involve some degree of digital clipping, as in DMT. In other words, the digital QAM signal $x[n]$ is clipped at the limits $\pm x_0$ to yield the clipped signal $x_c[n] = \max(-x_0, \min(x[n], x_0))$ [...] [11]. However, for the case of QAM, it cannot be modeled in this manner, because clipping is the result of transmitting the larger-magnitude symbols of the constellation. As a result, clipping becomes another source of signal-dependent nonlinear distortion in the received constellation.
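The clipping and scaling step can be illustrated with a short sketch. It assumes that the clipping ratio is defined as 20 log10(x0 / RMS of the signal) and that the DAC is a uniform mid-rise-free quantizer spanning ±x0; neither definition is stated in the paper. A Gaussian test signal stands in for the QAM waveform, and the function name is hypothetical.

```python
import numpy as np

def clip_and_quantize(x, clip_ratio_db, n_bits):
    """Clip x at +/-x0, then quantize uniformly to 2**n_bits levels over [-x0, x0].

    Assumption: clip_ratio_db = 20*log10(x0 / rms(x)).
    """
    rms = np.sqrt(np.mean(x ** 2))
    x0 = rms * 10 ** (clip_ratio_db / 20)
    clipped = np.clip(x, -x0, x0)                 # hard limit at +/-x0
    step = 2 * x0 / (2 ** n_bits - 1)             # spacing between DAC levels
    codes = np.round((clipped + x0) / step)       # integer codes 0 .. 2**n_bits - 1
    return codes * step - x0, x0

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)                   # stand-in for the digital QAM signal
y, x0 = clip_and_quantize(x, clip_ratio_db=9.0, n_bits=6)
clipped_fraction = np.mean(np.abs(x) > x0)
print(f"x0 = {x0:.3f}, fraction of samples clipped = {clipped_fraction:.4f}")
```

A lower clipping ratio uses the DAC levels more densely around the typical signal values at the cost of clipping more of the large-magnitude symbols.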
2.2. QAM receiver
The QAM receiver is depicted in Fig. 2. First, the received optical intensity is digitized at the same sampling rate as the transmitter’s DAC. This QAM signal is then down-converted to a complex baseband signal for equalization by a feedforward equalizer (FFE). The FFE consists of an adaptive FIR feedforward filter. This equalizer has the same sampling rate as the ADC; thus, it is a fractionally-spaced equalizer. This allows matched filtering to occur digitally and eliminates sensitivity to timing offsets in the decision circuit (pp. 674–677 of ). The feedforward filter coefficients are adapted during transmission of a training sequence.
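A fractionally-spaced FFE trained on a known sequence can be sketched as follows. The least-mean-square update is the adaptation rule named in the simulation section; everything else here is an assumption made for illustration: the QPSK training symbols, the sample-and-hold pulses, the three-tap toy channel, the step size, and the function name. It is not the receiver used in the paper.

```python
import numpy as np

def lms_ffe(rx, training, n_taps=31, sps=3, mu=0.005):
    """Adapt a fractionally-spaced FFE with complex LMS.

    rx       : received baseband samples at sps samples/symbol
    training : known symbols, one per sps received samples
    Returns the final taps and the squared error for each training symbol.
    """
    half = n_taps // 2                                # n_taps is assumed odd
    padded = np.concatenate([np.zeros(half, complex), rx, np.zeros(half, complex)])
    taps = np.zeros(n_taps, dtype=complex)
    taps[half] = 1.0                                  # centre-spike initialisation
    sq_err = np.empty(len(training))
    for k, desired in enumerate(training):
        window = padded[k * sps : k * sps + n_taps]   # samples centred on symbol k
        error = desired - np.dot(taps, window)        # known symbol minus equalizer output
        taps += mu * error * np.conj(window)          # LMS tap update
        sq_err[k] = abs(error) ** 2
    return taps, sq_err

# Toy link: QPSK symbols held for 3 samples each, then a short FIR channel.
rng = np.random.default_rng(1)
symbols = (rng.choice([-1, 1], 5000) + 1j * rng.choice([-1, 1], 5000)) / np.sqrt(2)
tx = np.repeat(symbols, 3)
rx = np.convolve(tx, [1.0, 0.5, 0.25])[: len(tx)]
taps, sq_err = lms_ffe(rx, symbols)
mse_end = sq_err[-500:].mean()
print(f"mean squared error over the last 500 training symbols: {mse_end:.4f}")
```

Because the taps are spaced at one third of a symbol, the filter can combine samples taken at different phases within a symbol, which is what lets it perform matched filtering and absorb timing offsets.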
We also considered an optional decision feedback tap in the equalizer to feed back the most recent symbol decision. Due to the difficulty of implementing feedback at 20 Gbaud, we only considered a single feedback tap. Achieving feedback at such a high rate requires a highly parallelized implementation and is only practical for a single feedback tap [21, 22].
2.3. Digital pre-emphasis
Since the magnitude response of the channel is approximately known, it can be partially pre-equalized at the transmitter through a pre-emphasis filter. This reduces the amount of noise enhancement at the receiver [3, 15]. To achieve pre-equalization, a linear-phase FIR pre-emphasis filter with high-frequency gain is inserted at the output of the up-converter. The spectra before and after pre-emphasis are shown in Fig. 3. For the purposes of implementation, the pre-emphasis could alternatively be absorbed into the pulse-shaping filter.
3. QAM constellation
An important choice in the design of a QAM system is the constellation. For the target bit rate of 109 Gb/s, the optimal constellation size is 32, when restricted to sizes of the form $2^b$, with $b$ being an integer. A constellation size of 16 would require a symbol rate that would exceed the bandwidth limits of the DFB and a constellation size of 64 would not be able to meet the target symbol error rate (SER) of $5 \times 10^{-4}$ due to the intrinsic noise and nonlinearities in the system.
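The symbol rates behind this choice follow from dividing the bit rate by the number of bits per symbol. The short calculation below uses only the 109 Gb/s target stated above; the symbols $R_b$ (bit rate) and $R_s$ (symbol rate) are notation introduced here for illustration.

```latex
R_s = \frac{R_b}{\log_2 M}, \qquad R_b = 109\ \text{Gb/s}:
\quad
\begin{aligned}
M = 16 &: \; R_s = 109/4 = 27.25\ \text{Gbaud} \\
M = 32 &: \; R_s = 109/5 = 21.8\ \text{Gbaud} \\
M = 64 &: \; R_s = 109/6 \approx 18.17\ \text{Gbaud}
\end{aligned}
```

Moving from 32 to 16 symbols therefore raises the symbol rate by 25%, while moving to 64 symbols lowers it by about 17% at the cost of a denser constellation.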
Typical QAM systems employ the rectangular QAM constellation formed from the rectangular lattice. However, the nonlinearity of the DFB penalizes any constellation formed from a lattice, because symbols further from the origin experience greater nonlinearity. As a result, the received symbol clusters are not circular and are larger for symbols further from the origin. This is depicted in Fig. 4. A more efficient constellation would pack symbols less densely when further away from the origin, so that errors are distributed more equally among all symbols.
3.1. Constellation A
We first present a 32-symbol constellation constructed using eight concentric rings centered at the origin. We will refer to this constellation as constellation A. Each ring consists of four equally-spaced symbols, as shown in Fig. 5. In the figure, the rings are labeled 1–8. Since the SER depends on the relative positions of the rings in a complicated way, the constellation was optimized by an iterative Monte-Carlo algorithm, with the SER calculated in each iteration. Let the rings be assigned the following relations: Ring 1 is defined as interior to all other rings. Ring 2 is defined as interior to rings 3–8. Ring 3 is defined as interior to rings 6 and 7. Ring 4 is defined as interior to rings 7 and 8. Ring 5 is defined as interior to rings 6 and 8. Note that some pairs of rings do not have an interior-exterior relation. The radii and rotations of the rings are optimized by the following algorithm:
1. Iteratively perturb the radii of all rings by adding the same amount Δr1 to all radii. Determine the Δr1 which locally minimizes the SER. Add this Δr1 to all radii.
2. Iteratively perturb the rotation angle of ring 2 by Δθ2 to locally minimize SER. Add Δθ2 to the rotation angle of ring 2.
3. Iteratively perturb the radii of ring 2 and all rings exterior to ring 2 by the same length Δr2 to locally minimize SER. Add this Δr2 to the radii of these rings.
4. Repeat steps 2 and 3 for rings 3 through 8, beginning with ring 3 and ending with ring 8.
5. Re-iterate all of these steps starting from step 1 and continue until the SER has converged to a minimum.
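The structure of this ring-by-ring search can be sketched in Python. The interior-exterior relations are those listed above for constellation A. Everything else is a stand-in chosen for illustration: the SER is estimated with a toy model in which noise grows with symbol magnitude (not the rate-equation DML simulation), the starting radii and rotations are invented, the step sizes are arbitrary, and a fixed number of sweeps replaces the convergence test.

```python
import numpy as np

def build_constellation(radii, angles, per_ring=4):
    """Place per_ring equally spaced symbols on each ring (as complex points)."""
    base = 2 * np.pi * np.arange(per_ring) / per_ring
    return np.concatenate([r * np.exp(1j * (a + base)) for r, a in zip(radii, angles)])

def toy_ser(radii, angles, noise, growth=0.6):
    """Stand-in SER: unit-power symbols, noise scaled up with symbol magnitude."""
    if np.min(radii) <= 0:
        return 1.0                                    # reject degenerate rings
    pts = build_constellation(radii, angles)
    pts = pts / np.sqrt(np.mean(np.abs(pts) ** 2))
    rx = pts[:, None] + noise[None, :] * (1 + growth * np.abs(pts))[:, None]
    decided = np.argmin(np.abs(rx[:, :, None] - pts[None, None, :]), axis=2)
    return float(np.mean(decided != np.arange(len(pts))[:, None]))

def line_search(cost_of_delta, step, max_moves=5):
    """Walk in multiples of step while the cost strictly improves; return the best delta."""
    best_delta, best_cost = 0.0, cost_of_delta(0.0)
    for direction in (1, -1):
        for m in range(1, max_moves + 1):
            c = cost_of_delta(direction * m * step)
            if c >= best_cost:
                break
            best_delta, best_cost = direction * m * step, c
        if best_delta != 0.0:
            break
    return best_delta

def optimise_rings(radii, angles, exterior, cost, dr=0.03, dth=0.05, sweeps=2):
    radii, angles = np.array(radii, float), np.array(angles, float)
    n = len(radii)
    for _ in range(sweeps):                           # step 5 (fixed sweep count here)
        radii = radii + line_search(lambda d: cost(radii + d, angles), dr)   # step 1
        for ring in range(1, n):                      # steps 2-4: rings 2..n (0-based index)
            def rotated(d):
                a = angles.copy()
                a[ring] += d
                return cost(radii, a)
            angles[ring] += line_search(rotated, dth)                        # step 2
            mask = np.zeros(n)
            mask[[ring] + exterior.get(ring, [])] = 1.0                      # ring + exterior rings
            radii = radii + mask * line_search(lambda d: cost(radii + d * mask, angles), dr)  # step 3
    return radii, angles

rng = np.random.default_rng(0)
noise = 0.06 * (rng.standard_normal(400) + 1j * rng.standard_normal(400))    # reused for every SER estimate
radii0 = np.linspace(0.5, 2.6, 8)                     # hypothetical starting radii
angles0 = np.tile([0.0, np.pi / 4], 4)                # hypothetical starting rotations
# 0-based version of the relations for constellation A (ring 1 is handled by step 1).
exterior_a = {1: [2, 3, 4, 5, 6, 7], 2: [5, 6], 3: [6, 7], 4: [5, 7]}
cost = lambda r, a: toy_ser(r, a, noise)
ser_before = cost(radii0, angles0)
radii_opt, angles_opt = optimise_rings(radii0, angles0, exterior_a, cost)
ser_after = cost(radii_opt, angles_opt)
print(f"toy SER before: {ser_before:.4f}, after: {ser_after:.4f}")
```

Reusing the same noise samples for every SER estimate makes the cost deterministic, so a move is accepted only when it strictly lowers the estimate and the search cannot end worse than it started.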
3.2. Constellation B
Considering that a hexagonal lattice offers the densest packing of spheres in the plane (p. 256 of ), we also consider a constellation constructed from its perturbation, constellation B. However, the hexagonal lattice does not naturally produce a symmetric 32-point subset. We instead consider a 31-point subset of the hexagonal lattice centered at the origin, as shown in Fig. 6. As for the case of the 32-symbol constellation based on eight concentric rings, we partition the constellation into concentric rings for performing numerical optimization. In this constellation, ring 1 is a single point at the origin. We define the interior-exterior relations among the rings as follows: Ring 2 is interior to rings 3–6. Ring 3 is interior to rings 5 and 6. Ring 4 is interior to rings 5 and 6. To optimize this constellation, we use the same algorithm as before, except that step 2 first applies to ring 3 rather than ring 2.
Since 31 symbols are transmitted instead of 32, the entropy of the source is reduced to $\log_2 31 \approx 4.95$ bits (pp. 332–336 of ). Thus, the baud rate must be increased by the factor $\log_2 32 / \log_2 31 \approx 1.0092$ in order to transmit at the same bit rate. Since the constellation size is no longer a power of two, the method by which bits are mapped to symbols must be altered. One method to perform the encoding of bits into symbols is block encoding. Let the binary data be grouped into blocks of $b$ consecutive bits. The encoder must map a block of $b$ bits to a block of $k$ 31-QAM symbols. In order for the block of 31-QAM symbols to be able to represent all of the $2^b$ possible bit patterns, the 31-QAM block must be of length $k$, where $2^b < 31^k$. If we choose the smallest such $k$ which satisfies this, then we have that $k = \lceil b / \log_2 31 \rceil$.
For the case of a small block length, the ratio b/k will be less than the entropy log2 31, as many of the possible patterns of k symbols are unused. In this case, it may be advantageous to combine the binary-to-base-31 conversion with forward error correction (FEC) coding. In other words, rather than applying a binary FEC to the data before block encoding into a 31-QAM block, one will send the raw uncoded binary data to the 31-QAM block encoder. The error protection is instead achieved by appropriately mapping the binary block to a 31-QAM block so that the output codewords are “far apart” (pp. 432–439 of ).
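The block encoding described above amounts to writing a b-bit integer in base 31. The sketch below shows one plain way to do this, without the combined error-correction mapping mentioned in the text; the function names are hypothetical, and symbols are represented by their indices 0 to 30.

```python
import math

def symbols_per_block(b, m=31):
    """Smallest k with m**k >= 2**b, found with exact integer arithmetic."""
    k = 0
    while m ** k < 2 ** b:
        k += 1
    return k

def encode_block(bits, m=31):
    """Map a non-empty list of bits to k base-m symbol indices (most significant first)."""
    k = symbols_per_block(len(bits), m)
    value = int("".join(str(bit) for bit in bits), 2)
    digits = []
    for _ in range(k):
        value, digit = divmod(value, m)
        digits.append(digit)
    return digits[::-1]

def decode_block(symbols, b, m=31):
    """Invert encode_block, returning the original b bits."""
    value = 0
    for s in symbols:
        value = value * m + s
    return [int(c) for c in format(value, f"0{b}b")]

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1]                 # b = 9 example block
symbols = encode_block(bits)
recovered = decode_block(symbols, len(bits))
for b in (4, 9, 24, 104):
    k = symbols_per_block(b)
    print(f"b = {b:3d}, k = {k:2d}, b/k = {b / k:.3f} (entropy limit {math.log2(31):.3f})")
```

The printed ratios show how b/k approaches, but never reaches, the entropy limit as the block length grows.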
3.3. Constellation based on perturbation of the rectangular constellation
In the preceding discussions, we considered the SER of the constellations, but not the bit error rate (BER). Depending on the encoding of bits to symbols, the relation between the SER and BER can be very different. Assuming that one can map 5 bits to a single 32-QAM symbol with a Gray code, each symbol differs from its nearest neighbors by one bit, so each symbol error results in one bit error, provided that errors only occur between nearest neighbors. However, for the previous two constellations, the spacings between the symbols are irregular, so it is difficult to determine an optimal mapping of bits to symbols. Determining an efficient mapping of bits to symbols is a complex problem and beyond the scope of this work. In our comparisons involving constellations A and B, we will restrict our discussion to SER.
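Under the Gray-coding and nearest-neighbor assumptions stated above, each symbol error corrupts one of the $\log_2 M$ bits carried by the symbol. The resulting approximation, written out here for illustration, is:

```latex
\mathrm{BER} \approx \frac{\mathrm{SER}}{\log_2 M}
\quad\Rightarrow\quad
M = 32:\;\; \mathrm{BER} \approx \frac{\mathrm{SER}}{5},
\qquad
\mathrm{SER} = 5 \times 10^{-4} \;\mapsto\; \mathrm{BER} \approx 10^{-4}.
```

This is consistent with the SER target of $5 \times 10^{-4}$ and the BER target of $10^{-4}$ used elsewhere in the paper. It is only an approximation: it breaks down when errors land on non-adjacent symbols or when the bit mapping is not an exact Gray code.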
However, to make a concrete comparison of BER performance, we present a third constellation, constellation C, for which we determine the BER. This constellation is derived by perturbation of the 32-QAM rectangular cross constellation. It should be noted that the rectangular cross constellation does not have an exact Gray code, but an approximate Gray code exists [23,24]. Our constellation C is derived from a rectangular cross constellation with the bit mapping of [23, 24].
As for constellation A, we partition constellation C into 8 rings of 4 symmetrically-located symbols, as shown in Fig. 7. We define the following interior-exterior relations: Ring 1 is interior to rings 2, 3, 5, and 6. Ring 2 is interior to ring 5. Ring 3 is interior to ring 6. Ring 4 is interior to rings 7 and 8. We optimize constellation C using the same algorithm as for constellation A.
4. Simulation results
The simulated DML was a nonlinear rate-equation model of a 25G 1310 nm DFB currently used in commercial Finisar transmitters. The laser had a bandwidth of 20 GHz and an average RIN of −145 dB/Hz in the 0–20 GHz band. The small-signal transfer function and RIN spectrum of the DFB are shown for various levels of bias current in Fig. 8.
For all of our simulations, we operated the DFB at a bias current of 60 mA, as it gave the largest bandwidth and lowest RIN. For the QAM simulations, the DFB was operated with a peak-to-peak drive current of 55 or 60 mA, depending on the constellation and equalizer, as summarized in Table 1. For comparison, we also simulated a DMT system based on the same DML. For the DMT simulations we found a lower peak-to-peak drive current of 50 mA to yield the optimal trade-off between signal strength and linearity. DMT is thus more sensitive to the nonlinearity of the DFB.
Our simulated QAM systems had the following parameters. The baseband QAM pulse had the square-root raised cosine spectral shape with β = 0.2 and a length of 10 symbols. Although using a smaller β results in a narrower bandwidth, it would also require a longer pulse-shaping filter and would increase PAPR, since more pulses would overlap. The pulse-shaped signal had an upsampling ratio of 3 samples/symbol. The pre-emphasis filter consisted of 20 taps. The clipping ratio at the DAC/ADC inputs was either 9 or 10 dB, depending on the constellation and equalizer. The DAC and ADC operated at a sampling rate of 65 GSamples/s with 6 bits of resolution. Dispersion was neglected since data center links are typically within 2 km and dispersion is minimal at 1310 nm. We simulated two types of equalizers: (1) an FFE-only equalizer consisting of a 31-tap FIR filter and (2) an FFE-DFE equalizer consisting of an additional single-tap decision feedback branch. The equalizers were adapted using the least-mean-square algorithm (pp. 710–720 of ). We employed the constellation optimization techniques described previously. Since the constellation optimization occurs at a particular receiver power level, the optimizations were performed at power levels yielding an SER of approximately $5 \times 10^{-4}$. The optimal constellations for the FFE-only receiver are shown in Fig. 9. The optimal constellations for the FFE-DFE receiver were similar.
In Fig. 10(a), we compare the SER performance of these four QAM constellations with an FFE-only receiver at various receiver power levels. As seen, constellations A, B, and C outperform the rectangular constellation in terms of SER. In Fig. 10(b), we make this same SER comparison for the case of an FFE-DFE at the receiver. Again, constellations A, B, and C outperform the rectangular constellation in SER.
To make a BER comparison, we consider only constellation C, as shown in Fig. 11. For data center applications, low-latency FEC is preferred. One such standard FEC is the IEEE 802.3bj PAM4 FEC, which has a BER threshold of $2.3 \times 10^{-4}$. To meet this BER threshold, we target a BER of $10^{-4}$. For both the FFE-only and FFE-DFE equalizers, constellation C significantly outperforms the rectangular constellation.
We also compare the performance of the QAM system with DMT using the same electrical and optical components, as shown in Fig. 11. The DMT system used 128 subcarriers. We found that a larger number of subcarriers gave negligible improvement. To optimize the power and bit allocation among the subcarriers, we employed Campello’s bit-loading algorithm, which is the optimal discrete-bit-allocation algorithm . The DMT system thus requires a back channel to transmit the channel state information for optimal bit loading. Since the conventional bit and power allocation algorithm assumes an additive white Gaussian noise channel, the bit-loading algorithm was performed iteratively . This was necessary since the effective “noise” at the receiver consisting of both noise and distortion depends on the signal. As seen in Fig. 11, QAM with constellation C outperforms DMT (and further does not rely on a back channel).
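Discrete bit loading of the kind referred to above can be illustrated with a simplified greedy allocation in the Levin-Campello style: each bit goes to the subcarrier where it costs the least additional energy. This is a sketch under stated assumptions, not the implementation used in the paper. It assumes the usual gap approximation, in which carrying b bits on a subcarrier with gain g costs gap·(2^b − 1)/g in energy, and it omits the iterative re-estimation of signal-dependent noise. The function name and the example gains are hypothetical.

```python
import heapq

def greedy_bit_loading(gains, total_bits, gap=1.0, max_bits=8):
    """Allocate total_bits across subcarriers, always adding the cheapest next bit.

    gains[n] is the SNR per unit transmit energy on subcarrier n.
    Energy for b bits on subcarrier n is modelled as gap * (2**b - 1) / gains[n].
    """
    bits = [0] * len(gains)
    # Cost of the first bit on subcarrier n is gap * (2**1 - 2**0) / gains[n].
    heap = [(gap / g, n) for n, g in enumerate(gains)]
    heapq.heapify(heap)
    for _ in range(total_bits):
        if not heap:
            raise ValueError("total_bits exceeds max_bits on every subcarrier")
        _, n = heapq.heappop(heap)
        bits[n] += 1
        if bits[n] < max_bits:
            # Going from b to b+1 bits costs gap * 2**b / gains[n] more energy.
            heapq.heappush(heap, (gap * 2 ** bits[n] / gains[n], n))
    energy = [gap * (2 ** b - 1) / g for b, g in zip(bits, gains)]
    return bits, energy

gains = [100.0, 50.0, 10.0, 1.0]          # made-up subcarrier gains, strongest first
bits, energy = greedy_bit_loading(gains, total_bits=10)
print("bits per subcarrier:", bits)
print("total energy:", round(sum(energy), 4))
```

Because the incremental energy doubles with every bit added to a subcarrier, the greedy choice naturally places more bits on strong subcarriers and may leave weak ones unused.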
It should be noted that the optimality of the presented constellations depends on the parameters of the laser and the clipping levels. In practice, the variation among lasers can be taken into account by finding the optimal constellation for each transmitter during production testing.
5. Conclusions
This study has shown that an optimized non-rectangular constellation can significantly outperform a rectangular constellation. Compared to the rectangular constellation, a non-rectangular constellation only needs modification to the decision boundaries at the receiver’s decision circuit. Additionally, the proposed QAM system outperforms DMT.
Acknowledgments
We would like to acknowledge the financial support of Finisar Corporation. We also thank Jonathan Ashbrook of Finisar Corporation for helpful discussions regarding implementation of a DFE circuit.
References and links
1. C. Cole, I. Lyubomirsky, A. Ghiasi, and V. Telang, “Higher-order modulation for client optics,” IEEE Commun. Mag. 51, 50–57 (2013).
2. G. P. Agrawal, Fiber-Optic Communication Systems (John Wiley and Sons, 2002).
3. J. G. Proakis and M. Salehi, Digital Communications, 5th ed. (McGraw-Hill, 2008).
4. J. D. Ingham, R. V. Penty, I. H. White, and D. G. Cunningham, “40 Gb/s carrierless amplitude and phase modulation for low-cost optical datacommunication links,” in OFC/NFOEC (2011), p. OThZ3.
5. J. L. Wei, L. Geng, R. V. Penty, I. H. White, and D. G. Cunningham, “100 Gigabit Ethernet transmission enabled by carrierless amplitude and phase modulation using QAM receivers,” in OFC/NFOEC (2013), p. OW4A.5.
6. R. Rodes, M. Wieckowski, T. T. Pham, J. B. Jensen, J. Turkiewicz, J. Siuzdak, and I. T. Monroy, “Carrierless amplitude phase modulation of VCSEL with 4 bit/s/Hz spectral efficiency for use in WDM-PON,” Opt. Express 19, 26551–26556 (2011).
7. J. L. Wei, D. G. Cunningham, R. V. Penty, and I. H. White, “Feasibility of 100G Ethernet enabled by carrierless amplitude/phase modulation and optical OFDM,” in Proc. Eur. Conf. Opt. Commun. (2012), p. P6.05.
8. L. Tao, Y. G. Wang, Y. L. Gao, A. P. T. Lau, N. Chi, and C. Lu, “Experimental demonstration of 10 Gb/s multilevel carrier-less amplitude and phase modulation for short range optical communication systems,” Opt. Express 21, 6459–6465 (2013).
9. J. L. Wei, D. G. Cunningham, R. V. Penty, and I. H. White, “Study of 100 Gigabit Ethernet using carrierless amplitude/phase modulation and optical OFDM,” J. Lightw. Technol. 31, 1367–1373 (2013).
10. W. Z. Yan, T. Tanaka, B. Liu, M. Nishihara, L. Li, T. Takahara, Z. Tao, J. C. Rasmussen, and T. Drenski, “100 Gb/s optical IM-DD transmission with 10G-class devices enabled by 65 GSamples/s CMOS DAC core,” in OFC/NFOEC 2013 (Anaheim, USA, 2013), p. OM3H.1.
11. J. Armstrong, “OFDM for optical communications,” J. Lightw. Technol. 27, 189–204 (2009).
12. A. F. Shalash and K. K. Parhi, “Multidimensional carrierless AM/PM systems for digital subscriber loops,” IEEE Trans. Commun. 47, 1655–1667 (1999).
13. M. I. Olmedo, T. Zuo, J. B. Jensen, Q. Zhong, X. Xu, S. Popov, and I. T. Monroy, “Multiband carrierless amplitude phase modulation for high capacity optical data links,” J. Lightw. Technol. 32, 798–804 (2014).
14. M. B. Othman, X. Zhang, L. Deng, M. Wieckowski, J. B. Jensen, and I. T. Monroy, “Experimental investigations of 3-D-/4-D-CAP modulation with directly modulated VCSELs,” IEEE Photon. Technol. Lett. 24, 2009–2011 (2012).
15. I. Lyubomirsky and W. A. Ling, “Digital QAM modulation and equalization for high performance 400 GbE data center modules,” in OFC/NFOEC (2014).
16. I. Dedic, “56Gs/s ADC: Enabling 100GbE,” in OFC/NFOEC 2010 (San Diego, USA, 2010), p. OThT6.
17. “LEIA digital-to-analog converter,” www.fujitsu.com/downloads/MICRO/fme/documentation/c60.pdf.
18. “LUKE analog-to-digital converter,” www.fujitsu.com/downloads/MICRO/fme/documentation/c63.pdf.
19. A. S. Karar and J. C. Cartledge, “Generation and detection of a 56 Gb/s signal using a DML and half-cycle 16-QAM Nyquist-SCM,” IEEE Photon. Technol. Lett. 25, 757–760 (2013).
21. K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation (John Wiley and Sons, 1999).
22. J. Ashbrook, “Real-time implementation of DFE,” private communication (Oct. 2013).
23. J. G. Smith, “Odd-bit quadrature amplitude-shift keying,” IEEE Trans. Commun. 23, 385–389 (1975).
24. P. K. Vitthaladevuni, M.-S. Alouini, and J. C. Kieffer, “Exact BER computation for cross QAM constellations,” IEEE Trans. Wirel. Commun. 4, 3039–3050 (2005).
25. J. D’Ambrosia, M. Gustlin, and P. Anslow, “802.3bj FEC overview and status,” in IEEE 802.3bm, 40 Gb/s and 100 Gb/s Fiber Optic Task Force (2012).
26. J. Campello, “Practical bit loading for DMT,” in Proc. Global Telecommun. Conf. (GLOBECOM ’99) (Vancouver, Canada, 1999), pp. 801–805.
27. D. J. F. Barros and J. M. Kahn, “Comparison of orthogonal frequency-division multiplexing and on-off keying in amplified direct-detection single-mode fiber systems,” J. Lightw. Technol. 28, 1811–1820 (2010).