## Abstract

We investigate the application of time and frequency packing techniques, an extension of the classical faster-than-Nyquist signaling, to long-haul optical links. These techniques provide a significant spectral efficiency increase and represent a viable alternative to overcome the theoretical and technological issues related to the use of high-order modulation formats. Adopting these techniques, we successfully demonstrate through simulations the transmission of 1 Tbps over 200 GHz bandwidth in a realistic (nonlinear) long-haul optical link.

© 2011 OSA

## 1. Introduction

The evolution of long-haul optical communications is currently oriented beyond the established 100 Gbps [1]. Hence, there is widespread interest in algorithms and techniques that can overcome the difficulties of further capacity growth, enabling data rates up to 1 Tbps. Among the most severe limitations of such a system upgrade are the technological and practical issues of processing high data rates on a single channel, and the optical channel impairments related to the required transmit power (i.e., the nonlinear effects). Thus, effective and feasible solutions for a deeper exploitation of the optical channel and of the available devices and technologies are needed.

In optical communications, as in most digital communication systems, orthogonal signaling is usually adopted to ensure the absence of intersymbol interference (ISI) and, in multi-carrier scenarios, also the absence of interference from adjacent channels. In fact, in coherent optical systems, possibly employing polarization multiplexing, given a conventional transmitter with Mach-Zehnder (MZ) modulators and return-to-zero (RZ) or non-return-to-zero (NRZ) shaping pulses, when group velocity dispersion (GVD) and polarization mode dispersion (PMD) are effectively compensated and nonlinear effects are limited, proper filtering and sampling at the receiver ensure that even a symbol-by-symbol detector achieves almost-optimal performance [2]. However, if the orthogonality condition must be satisfied and given the present hardware limitations, even when adopting spectrally efficient orthogonal techniques such as orthogonal frequency-division multiplexing (OFDM) [3] or Nyquist wavelength-division multiplexing (WDM) [4], the only way to increase the spectral efficiency with the aim of reaching 1 Tb/s transmission is to increase the constellation cardinality, thus employing modulation formats that are more sensitive to nonlinear effects. This is clearly illustrated by Fig. 1, which shows the achievable spectral efficiency $\eta$ per polarization for three different modulation formats, namely quaternary and octal phase-shift keying (QPSK and 8-PSK) and 16-ary quadrature amplitude modulation (16-QAM), as a function of $E_b/N_0$, $E_b$ being the energy per bit and $N_0/2$ the noise power spectral density (PSD) per polarization (the Shannon limit for additive white Gaussian noise channels is also reported). An excess bandwidth of 20% with respect to the minimum value ensuring orthogonal signaling (i.e., a transmitted bandwidth of 1.2 times the signaling frequency) is assumed, since in real systems a penalty must be expected, for technological and practical reasons, when implementing Nyquist WDM systems [4, Fig. 4]. This gives, for 16-QAM, an asymptotic maximum value of 3.2 bit/s/Hz instead of 4 bit/s/Hz.
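As a rough illustration of the Shannon-limit reference curve mentioned above, the sketch below evaluates the standard AWGN relation $\eta = \log_2(1 + \eta\,E_b/N_0)$, solved for $E_b/N_0$. The function name `shannon_ebn0_db` is hypothetical, and the text does not state how the curve of Fig. 1 was actually computed, so this is an assumption rather than the authors' procedure.

```python
import math


def shannon_ebn0_db(eta):
    """Minimum Eb/N0 (in dB) needed to reach spectral efficiency `eta`
    (bit/s/Hz) on the AWGN channel, from eta = log2(1 + eta * Eb/N0)."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    ebn0_linear = (2.0 ** eta - 1.0) / eta
    return 10.0 * math.log10(ebn0_linear)


# Example: tabulate the limit for a few spectral efficiencies.
for eta in (1.0, 2.0, 3.2, 4.0):
    print(f"eta = {eta:.1f} bit/s/Hz -> Eb/N0 >= {shannon_ebn0_db(eta):.2f} dB")
```

Any practical curve for a finite constellation lies to the right of this limit and saturates at its asymptotic spectral efficiency.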

An alternative way to improve the spectral efficiency of low-order modulations, such as QPSK or 8-PSK, can be obtained by giving up the orthogonality condition. For example, faster-than-Nyquist signaling (FTN, see [5, 6]) is a well known technique consisting of reducing the spacing between two adjacent pulses in the time domain (i.e., the symbol interval) well below that corresponding to the Nyquist rate. Controlled ISI is thus introduced but the ISI-free performance is still reached provided the optimal detector (whose complexity could, however, become unmanageable) is employed and proper values of the symbol interval selected (such that the minimum Euclidean distance of the system is not reduced). Following the same principle, in [7] both the symbol interval and the frequency separation among adjacent channels are optimized with the aim of maximizing the *achievable spectral efficiency* *η*, which is thus used as a performance measure instead of the minimum distance. In addition, rather than the optimal receiver, a symbol-by-symbol detector working on the samples at the output of a filter matched to the transmitted shaping pulse (matched filter, MF) is considered, thus constraining the receiver complexity to its minimum value [7].

By employing more sophisticated detection algorithms, *η* can be further improved. Two receiver architectures will be considered in this paper: (i) a proper filtering of the MF output plus a symbol-by-symbol detector, and (ii) a low-complexity *maximum a posteriori* (MAP) symbol detector, which takes into account only a limited amount of interference. Improving *η* without increasing the constellation order can be very convenient, since the larger the constellation size, the higher the decoding complexity. Moreover, it is well known that low-order constellations are more robust to channel impairments such as phase noise and nonlinearities, the effects of the latter being already increased by the higher transmitted power needed to obtain higher spectral efficiency values. In the case of frequency packing, a further improvement could be achieved by adopting, at the receiver side, a multi-user detector, although this case is not considered here since it would increase the receiver complexity.

The remainder of this paper is organized as follows. The system model is described in Section 2. The spectral efficiency computation and optimization are then described in Section 3, considering detectors with different complexity. Numerical results are reported in Section 4 and, finally, some conclusions are drawn in Section 5.

## 2. System Model

We consider a frequency-division multiplexed system where adjacent channels (assumed perfectly synchronized) employ the same linear modulation format and shaping pulse $p(t)$. The baseband equivalent of the received signal is expressed as

$$r(t) = \sqrt{E_s}\,\sum_{\ell}\sum_{n} x_{n,\ell}\, p(t-nT)\, e^{j 2\pi \ell F t} + w(t) \qquad (1)$$

where $E_s$ is the symbol energy, $T$ the symbol interval, $x_{n,\ell}$ the symbol transmitted over the $\ell$-th channel during the $n$-th symbol interval, $F$ the frequency spacing between adjacent channels, and $w(t)$ a circularly symmetric zero-mean white Gaussian noise process with PSD $2N_0$. When polarization multiplexing is also employed, $r(t)$ is the received signal on one state of polarization. In the following, we neglect GVD and PMD since, as is known, they can be perfectly compensated through a proper two-dimensional equalizer [2]. The transmitted symbols $\{x_{n,\ell}\}$ are independent and uniformly distributed and belong to a given zero-mean $M$-ary complex constellation $\chi$, properly normalized such that $E\{|x_{n,\ell}|^2\} = 1$. Note that, in order to avoid boundary effects, the summations in Eq. (1) extend from $-\infty$ to $+\infty$, namely an infinite number of time epochs and carriers are employed. As in [7], we consider the central user only, and in the definition of the spectral efficiency we will use $F$ as a measure of the signal bandwidth. The symbol interval $T$ and the frequency spacing $F$ will be optimized to maximize the spectral efficiency.
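To make the signal model concrete, the following sketch evaluates the noiseless part of the time-frequency packed signal (a linearly modulated signal on each carrier, with carriers spaced by $F$ and symbols spaced by $T$) at a single time instant. The helper names `packed_signal` and `rect_pulse`, the dictionary layout of the symbols, and the rectangular unit-energy pulse are illustrative assumptions, not choices made in the paper; a finite number of symbols and carriers replaces the infinite summations.

```python
import cmath
import math


def packed_signal(t, symbols, pulse, T, F, Es):
    """Noiseless time-frequency packed signal at time t.

    symbols: dict mapping channel index l -> dict {n: x_{n,l}}
    pulse:   callable p(t), the shaping pulse
    T, F:    symbol interval and frequency spacing
    Es:      symbol energy
    """
    total = 0j
    for l, seq in symbols.items():
        carrier = cmath.exp(2j * math.pi * l * F * t)  # e^{j 2 pi l F t}
        for n, x in seq.items():
            total += x * pulse(t - n * T) * carrier
    return math.sqrt(Es) * total


def rect_pulse(T):
    """Unit-energy rectangular pulse on [0, T) (hypothetical choice)."""
    return lambda t: 1.0 / math.sqrt(T) if 0.0 <= t < T else 0.0


# Example: three carriers, two symbols each, F < 1/T (frequency packing).
T, F, Es = 1.0, 0.8, 1.0
symbols = {l: {0: 1 + 0j, 1: -1 + 0j} for l in (-1, 0, 1)}
print(packed_signal(0.25, symbols, rect_pulse(T), T, F, Es))
```

Adding samples of a complex white Gaussian process to the output would complete the received-signal model.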

The possibility of generating a transmitted signal of the form

$$s(t) = \sqrt{E_s}\,\sum_{\ell}\sum_{n} x_{n,\ell}\, p(t-nT)\, e^{j 2\pi \ell F t}$$

is strictly related to the availability of a linear modulator. In other words, let us consider the transmitted signal associated with the user for $\ell = 0$. If the pulse $p(t)$ has support larger than $T$, this signal cannot be directly generated through a MZ modulator unless it is properly linearized. This is due to the nonlinear transfer function of the MZ modulator between the electrical signal at its input and the optical signal at its output. We could, however, use a MZ modulator to generate a linearly modulated signal with a shaping pulse having support at most $T$ and then “stretch” the transmitted pulses through an optical filter. Hence, in this case, time packing is not an available option. The only degrees of freedom are the frequency spacing $F$ and the bandwidth of the optical filter used at the MZ output.

## 3. Spectral Efficiency optimization

#### 3.1. Symbol-by-Symbol detection

We first consider a symbol-by-symbol detector working on the central user, i.e., that for $\ell = 0$. The receiver is composed of a filter matched to the shaping pulse $p(t)$, followed by a proper discrete-time filter and a symbol-by-symbol detector. Although the discrete-time filter could be, in general, fractionally spaced (FS), the detector operates on one sample per symbol interval. These samples will be denoted by $\{y_{k,0}\}$ and can be expressed as

$$y_{k,0} = \sqrt{E_s}\,h(0,0,k)\,x_{k,0} + \sqrt{E_s}\sum_{(n,\ell)\neq(0,0)} h(n,\ell,k)\,x_{k-n,\ell} + z_k \qquad (2)$$

where $h(n,\ell,k)$ is the residual interference at time $kT$ due to the $\ell$-th user and the $(k-n)$-th transmitted symbol, and $\{z_k\}$ is the additive noise term, in general colored unless a whitening filter (WF) is employed after the MF. The discrete-time filter is assumed properly normalized such that the noise variance is $2N_0$. The dependence of the coefficients $h(n,\ell,k)$ on $k$ is through a complex coefficient of unit amplitude, which disappears for $\ell = 0$ (hence $h(n,0,k)$ is independent of $k$) and is due to the fact that $F$ is not an integer multiple of $1/T$.

The interference due to adjacent symbols and users is here modeled as a zero-mean Gaussian process with PSD equal to $2N_I$, independent of the additive thermal noise. This approximation is exploited only by the receiver, while in the actual channel the interference is generated as in Eq. (2). The interference is exactly Gaussian distributed only if the transmitted symbols $x_{k,\ell}$ are Gaussian distributed as well; otherwise, the Gaussian model is a good approximation when $T$ and $F$ are optimized and a large number of interferers arises.

With the above-mentioned Gaussian approximation, the channel model assumed by the receiver is

$$y_{k,0} = \sqrt{E_s}\,h(0,0,k)\,x_{k,0} + v_k \qquad (3)$$

where $\{v_k\}$ are independent and identically distributed zero-mean circularly symmetric Gaussian random variables with variance $2(N_0 + N_I)$. From Eq. (2) it is

$$N_I = \frac{E_s}{2}\sum_{(n,\ell)\neq(0,0)} |h(n,\ell,k)|^2 \qquad (4)$$

which does not depend on $k$. The achievable information rate (AIR), measured in bit per channel use, for this mismatched receiver (see [8, 9]) is

$$I(X_{k,0};Y_{k,0}) = E\left\{\log_2 \frac{p_{Y_{k,0}|X_{k,0}}(y_{k,0}|x_{k,0})}{\frac{1}{M}\sum_{x\in\chi} p_{Y_{k,0}|X_{k,0}}(y_{k,0}|x)}\right\} \qquad (5)$$

where $p_{Y_{k,0}|X_{k,0}}(y_{k,0}|x_{k,0})$ is a Gaussian probability density function (pdf) of mean $\sqrt{E_s}\,h(0,0,k)\,x_{k,0}$ and variance $2(N_0 + N_I)$ (in accordance with the auxiliary channel model of Eq. (3)), while the outer statistical average, with respect to $x_{k,0}$ and $y_{k,0}$, is carried out according to the real channel model of Eq. (2) [9]. Equation (5) can be evaluated efficiently by means of a Monte Carlo average [9]. From a system viewpoint, the spectral efficiency is more significant than the information rate. Under the assumption of infinite transmission, $\eta$ is defined as

$$\eta = \frac{I(X_{k,0};Y_{k,0})}{FT} \quad \text{[bit/s/Hz]} \qquad (6)$$

For a given constellation and shaping pulse, it is possible to show that the optimal spacings *T* and *F* that provide the largest *η* depend on the signal-to-noise ratio (SNR). Also, note that as *T* and *F* are reduced, interference increases and thus the information rate degrades, but *η* can still improve. This means that, for a given fixed code, the asymptotic performance will degrade. Information theory, however, ensures that with a proper code of lower rate, those values of spectral efficiency can be obtained.
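The Monte Carlo evaluation of the mismatched AIR described above can be sketched as follows, under simplifying assumptions that are not in the paper: a single user (no adjacent channels), QPSK symbols, and a hypothetical set of real interference taps `h` indexed by the symbol offset. Symbols and noise are generated according to the actual channel, while the likelihoods are those of the Gaussian auxiliary channel with variance $2(N_0+N_I)$.

```python
import cmath
import math
import random

QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]


def mismatched_air(h, Es, N0, n_sym=20000, seed=1):
    """Monte Carlo estimate (bit/channel use) of the AIR of a symbol-by-symbol
    receiver that treats all interference as Gaussian noise.

    h: dict {n: h_n} of taps; h[0] is the useful coefficient.
    """
    rng = random.Random(seed)
    # Interference PSD seen by the receiver: half the interference power.
    NI = 0.5 * Es * sum(abs(v) ** 2 for n, v in h.items() if n != 0)
    var = 2.0 * (N0 + NI)            # variance of the auxiliary channel
    amp = math.sqrt(Es)
    span = max(abs(n) for n in h)
    x = [rng.choice(QPSK) for _ in range(n_sym + 2 * span)]
    sigma = math.sqrt(N0)            # per real component (complex variance 2*N0)
    acc = 0.0
    for k in range(span, n_sym + span):
        # Actual channel: useful term + ISI + thermal noise.
        y = amp * sum(v * x[k - n] for n, v in h.items())
        y += complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        # Auxiliary-channel log-likelihoods (common constant cancels out).
        ll = [-abs(y - amp * h[0] * c) ** 2 / var for c in QPSK]
        ll_true = -abs(y - amp * h[0] * x[k]) ** 2 / var
        m = max(ll)
        log_mix = m + math.log(sum(math.exp(v - m) for v in ll) / len(QPSK))
        acc += (ll_true - log_mix) / math.log(2.0)
    return acc / n_sym


# Example: orthogonal signaling versus a packed channel with two ISI taps.
print(mismatched_air({0: 1.0}, Es=1.0, N0=0.01, n_sym=5000))
print(mismatched_air({-1: 0.5, 0: 1.0, 1: 0.5}, Es=1.0, N0=0.01, n_sym=5000))
```

Dividing the estimate by the product $FT$ would give the corresponding spectral efficiency for a given pair of spacings.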

The properties of the function $\eta(T,F,E_s/N_0)$ cannot be easily studied in closed form, but it is clear, by physical arguments, that it is bounded, continuous in $T$ and $F$, and tends to zero when $T,F \to 0$ or $T,F \to \infty$. Hence, the function $\eta(T,F,E_s/N_0)$ has a maximum value; according to our findings, in most cases there are no local maxima other than the global maximum. The problem can be solved by evaluating $\eta(T,F,E_s/N_0)$, for fixed modulation, shaping pulse, and $E_s/N_0$, on a grid of values of $T$ and $F$ (coarse search), followed by an interpolation of the obtained values (fine search).
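The coarse-plus-fine search described above can be sketched as follows. The interpolation used in the paper is not specified at this point, so a parabolic refinement along each axis around the best grid point is assumed here; the surface `toy_eta` is purely hypothetical and only stands in for the simulated spectral efficiency.

```python
def parabolic_peak(xm, x0, xp, fm, f0, fp):
    """Vertex abscissa of the parabola through three equally spaced points."""
    curvature = fm - 2.0 * f0 + fp
    if curvature >= 0.0:      # not a strict local maximum: keep the grid point
        return x0
    step = x0 - xm
    return x0 + 0.5 * step * (fm - fp) / curvature


def coarse_then_fine(eta, T_grid, F_grid):
    """Coarse grid search over (T, F) followed by a per-axis parabolic refinement."""
    vals = [[eta(T, F) for F in F_grid] for T in T_grid]
    i, j = max(
        ((a, b) for a in range(len(T_grid)) for b in range(len(F_grid))),
        key=lambda ab: vals[ab[0]][ab[1]],
    )
    T_opt, F_opt = T_grid[i], F_grid[j]
    if 0 < i < len(T_grid) - 1:
        T_opt = parabolic_peak(T_grid[i - 1], T_grid[i], T_grid[i + 1],
                               vals[i - 1][j], vals[i][j], vals[i + 1][j])
    if 0 < j < len(F_grid) - 1:
        F_opt = parabolic_peak(F_grid[j - 1], F_grid[j], F_grid[j + 1],
                               vals[i][j - 1], vals[i][j], vals[i][j + 1])
    return T_opt, F_opt


def toy_eta(T, F):
    """Hypothetical smooth surface with a single maximum (illustration only)."""
    return 2.0 - (T - 0.83) ** 2 - (F - 0.62) ** 2


T_grid = [0.5 + 0.1 * i for i in range(8)]
F_grid = [0.4 + 0.1 * i for i in range(8)]
print(coarse_then_fine(toy_eta, T_grid, F_grid))
```

Because the text reports that local maxima other than the global one are rare, a single refinement around the best grid point is a reasonable first attempt; a finer grid would be the fallback otherwise.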

A measure of the SNR more significant than $E_s/N_0$ is given by $E_b/N_0$, for which the following relation holds: $E_s = I(E_s)\,E_b$. The optimization problem becomes

$$\eta_M(E_b/N_0) = \max_{T,F>0}\ \eta(T,F,E_b/N_0) \qquad (7)$$

It is solved by evaluating the AIR for a set of coarse values of $(T,F)$, which ensure an accurate sampling of the AIR, and of $E_s/N_0$. For each pair $(T_i,F_j)$, cubic spline interpolation can be used to obtain a continuous function of $E_s/N_0$ (fine search), denoted as $I(T_i,F_j,E_s/N_0)$. Then, given a value of $E_b/N_0$, the following fixed-point problems are solved in $E_s/N_0$ for different pairs $(T_i,F_j)$,

$$\frac{E_s}{N_0} = I\!\left(T_i,F_j,\frac{E_s}{N_0}\right)\frac{E_b}{N_0}$$

which yield the AIR as a function of $E_b/N_0$, denoted as $I(T_i,F_j,E_b/N_0)$. Further improvements could be achieved by adding $N_I$ as a variable in Eq. (7). However, we have found by numerical results that choosing $N_I$ as in Eq. (4) is almost optimal.

The spectral efficiency depends on the employed discrete-time filter. Since the optimization of this filter with the aim of maximizing the spectral efficiency is a hard task, we restricted our analysis to the case of a FS minimum-mean-square-error (MMSE) feedforward filter with at most 22 coefficients, since it provided, among all considered filters, the best results.
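The conversion from $E_s/N_0$ to $E_b/N_0$ used in the optimization, i.e., solving $E_s/N_0 = I(E_s/N_0)\,E_b/N_0$ for $E_s/N_0$, can be sketched with a simple bisection. The AIR curve `toy_air` below is a stand-in (the Gaussian-input capacity $\log_2(1+\rho)$), not the spline-interpolated AIR of the paper, and the function name `solve_es_n0` is hypothetical.

```python
import math


def solve_es_n0(air, eb_n0, lo=1e-9, hi=1e6, iters=200):
    """Nontrivial root of Es/N0 = air(Es/N0) * Eb/N0 (all in linear units),
    found by bisection on a logarithmic scale."""
    def g(rho):
        return rho - air(rho) * eb_n0

    if g(lo) >= 0.0 or g(hi) <= 0.0:
        raise ValueError("no sign change: Eb/N0 is too low for this AIR curve")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)     # geometric midpoint
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def toy_air(rho):
    """Stand-in AIR curve in bit/channel use (illustration only)."""
    return math.log1p(rho) / math.log(2.0)


# Example: Eb/N0 = 1.5 (linear) gives Es/N0 = 3, since log2(1 + 3) = 2.
es_n0 = solve_es_n0(toy_air, 1.5)
print(es_n0, toy_air(es_n0))
```

With the AIR at the solved $E_s/N_0$ in hand, the spectral efficiency for the pair $(T_i,F_j)$ follows by dividing by $F_j T_i$, and the maximum over all pairs gives one point of the optimized curve.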

As mentioned before, when a nonlinear MZ modulator is adopted, the optimization problem of Eq. (7) will be reduced to the optimization of the frequency spacing and of the bandwidth of the optical filter (for this latter parameter, only a coarse search was performed).

#### 3.2. Single-User Trellis Processing

An improved, still achievable, lower bound can be reached by the adoption of more effective detection algorithms, namely a more complex receiver, able to cope with part of the interference introduced by the time-frequency packing. Interference due to the adjacent users is not considered—a single-user receiver is adopted.

For a general channel with finite intersymbol interference, an optimal MAP symbol detector can be designed working on the samples at the WF output. These samples, referred to as the Forney observation model [10], can still be expressed as in Eq. (2) with a proper expression of the coefficients $h(n,\ell,k)$. We assume that the optimal receiver for the following auxiliary channel is adopted:

$$y_{k,0} = \sqrt{E_s}\,\sum_{n=0}^{L} f_n\, x_{k-n,0} + v_k$$

where $\{f_n\}_{n\geq 0}$ are such that $f_n = h(n,0,k)$ and, as mentioned, are independent of $k$, whereas the noise samples $\{v_n\}$, which take into account the white noise and the residual interference, are assumed independent and identically distributed zero-mean circularly symmetric Gaussian random variables with variance $2(N_0 + N_I)$, with $N_I$ accounting for the interference not described by the coefficients $\{f_n\}_{n=0}^{L}$. The detector state, $\sigma_{k,0} = (x_{k-1,0},\dots,x_{k-L,0})$, takes into account $L$ interfering symbols only, according to a given maximal allowable receiver complexity. Since the number of trellis states is $S = M^L$, we will consider very limited values of $L$.
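The exponential growth of the trellis with the number of interfering symbols can be made explicit with a short sketch. The helper names `trellis_states` and `max_memory` are hypothetical; they only evaluate the relation between the constellation size, the number of interfering symbols, and the number of states.

```python
def trellis_states(M, L):
    """Number of trellis states for an M-ary constellation and L interfering symbols."""
    return M ** L


def max_memory(M, S_max):
    """Largest L such that the trellis has at most S_max states."""
    L = 0
    while M ** (L + 1) <= S_max:
        L += 1
    return L


# Example: states required by each constellation for L = 1, 2, 3.
for M in (2, 4, 8, 16):
    print(M, [trellis_states(M, L) for L in (1, 2, 3)])
```

For a fixed state budget, a smaller alphabet leaves room for more interfering symbols, which is why binary per-component processing of Gray-mapped QPSK is attractive.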

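Let us define $\mathbf{x}^n = (x_{0,0}, x_{1,0},\dots,x_{n,0})$ and $\mathbf{y}^n = (y_{0,0}, y_{1,0},\dots,y_{n,0})$. The simulation-based method described in [9] allows the evaluation of the AIR for the mismatched receiver, i.e.,

$$I(X;Y) = \lim_{n\to\infty}\frac{1}{n}\,E\left\{\log_2\frac{p(\mathbf{y}^n|\mathbf{x}^n)}{p(\mathbf{y}^n)}\right\}$$

where the pdfs are those of the auxiliary channel, while the expectation is carried out according to the actual channel.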

For channels with finite ISI, optimal MAP symbol detection can be equivalently implemented by working directly on the MF output [12], i.e., on the so-called Ungerboeck observation model [13]. The equivalence does not hold when reduced-complexity detection is considered and interference from adjacent channels arises. The spectral efficiency *η* for the Ungerboeck observation model can be computed as described for the Forney model. Since the Forney observation model has been shown to be less convenient, in terms of spectral efficiency values [14], than the Ungerboeck model, it will not be considered further in this paper.
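The simulation-based computation of the AIR for a trellis-based receiver can be sketched with a forward recursion over the states of the auxiliary channel. For brevity, the sketch assumes a real-valued BPSK alphabet and a white-noise (Forney-type) auxiliary model, which is a simplification with respect to the Ungerboeck-based processing adopted in the paper; the function name `air_bpsk_isi` and all tap values are hypothetical.

```python
import itertools
import math
import random


def gauss_logpdf(y, mean, sigma):
    """Natural log of a real Gaussian pdf."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mean) ** 2 / (2.0 * sigma ** 2)


def air_bpsk_isi(f_true, sigma, f_aux, sigma_aux, n=4000, seed=7):
    """Estimate (1/n) E{log2 q(y^n|x^n) / q(y^n)} in bit/channel use.

    f_true, sigma:    taps and noise std of the actual channel (generates y)
    f_aux, sigma_aux: taps and noise std of the auxiliary channel q(.)
    """
    rng = random.Random(seed)
    L = len(f_aux) - 1                       # memory tracked by the trellis
    mem = max(len(f_true) - 1, L)
    x = [rng.choice((-1.0, 1.0)) for _ in range(n + mem)]
    states = list(itertools.product((-1.0, 1.0), repeat=L))
    alpha = {s: 1.0 / len(states) for s in states}   # uniform initial state
    log_py = 0.0
    log_pyx = 0.0
    for k in range(mem, n + mem):
        # Sample generated by the actual channel.
        y = sum(c * x[k - i] for i, c in enumerate(f_true)) + rng.gauss(0.0, sigma)
        # log q(y_k | transmitted symbols) under the auxiliary channel.
        mean = sum(c * x[k - i] for i, c in enumerate(f_aux))
        log_pyx += gauss_logpdf(y, mean, sigma_aux)
        # Forward recursion; state s = (x_{k-1}, ..., x_{k-L}).
        new = dict.fromkeys(states, 0.0)
        for s, a in alpha.items():
            for xk in (-1.0, 1.0):
                mean = f_aux[0] * xk + sum(c * p for c, p in zip(f_aux[1:], s))
                nxt = ((xk,) + s)[:L]
                new[nxt] += 0.5 * a * math.exp(gauss_logpdf(y, mean, sigma_aux))
        norm = sum(new.values())             # equals q(y_k | y^{k-1})
        log_py += math.log(norm)
        alpha = {s: v / norm for s, v in new.items()}
    return (log_pyx - log_py) / (n * math.log(2.0))


# Example: trellis receiver versus a symbol-by-symbol receiver on a 2-tap channel.
print(air_bpsk_isi([1.0, 0.5], 0.3, [1.0, 0.5], 0.3, n=2000))
print(air_bpsk_isi([1.0, 0.5], 0.3, [1.0], math.sqrt(0.3 ** 2 + 0.5 ** 2), n=2000))
```

The second call treats the ISI as additional Gaussian noise, mirroring the symbol-by-symbol receiver of Section 3.1. Note that a very small `sigma_aux` may cause numerical underflow in this plain-probability implementation; a log-domain recursion would avoid it.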

Notice that tighter lower bounds can be obtained by using a more general auxiliary channel model and the corresponding optimal receiver, i.e., a multi-user receiver for the central user (that with *ℓ* = 0) that takes into account *J* adjacent signals on each side (an approximation exploited only by the receiver, while in the actual channel the interference is generated as in Eq. (1)). The benefit is two-fold. First, these tighter bounds allow us to evaluate the performance degradation due to the use of single-user receivers with respect to a more involved multi-user receiver, which is more “matched” to the real channel. Second, they give a practical performance upper bound when low-complexity approximate multi-user receivers, for example based on linear equalization or interference cancellation (see [15] and references therein), are employed. Obviously, in this case some (limited) degradation must be expected.

## 4. Simulation Results

In this section, we report the optimal spectral efficiency $\eta_M$ as a function of $E_b/N_0$ for different modulation formats. Since we consider the case of a MZ modulator, as mentioned, simulation results for frequency packing only are presented. The employed shaping pulses are those resulting from the use of RZ pulses with duty cycle 33, 50, and 66%, and a Gaussian or 4th-order Gaussian optical filter. The frequency spacing $F$ and the optical filter bandwidth $B$ have been optimized for each value of $E_b/N_0$. Hence, their values change along the curves. The considered modulation formats are QPSK, 8-PSK, and 16-QAM. Regarding QPSK, we note that, when Gray mapping is used, it can be viewed, with a proper rotation of the constellation, as two independent BPSK signals transmitted over the in-phase and quadrature components, respectively. Hence, at the receiver side, we may use two identical and independent detectors, one working on the in-phase and the other on the quadrature component. This is beneficial when a MAP symbol detector is adopted. In fact, when $L$ interfering symbols are taken into account, we have two detectors working on a trellis with $2^L$ states instead of a single detector working on a trellis with $4^L$ states. Hence, for a given complexity, a larger number of interferers can be taken into account.

Fig. 2 shows the achievable spectral efficiency $\eta_M$ for QPSK, 8-PSK, and 16-QAM modulations with RZ pulses with 50% duty cycle. No differences were observed for different values of the duty cycle, although the optimal values of $B$ and $F$ may be different. At the output of the MZ modulator, a 4th-order Gaussian optical filter is employed and, at the receiver side, the MF is followed by an optimized discrete-time MMSE filter and a symbol-by-symbol detector. The Shannon limit is also shown for comparison. By comparing these results with those in Fig. 1, related to the case of orthogonal signaling, we may observe a significant improvement for QPSK.

Fig. 3 shows the achievable spectral efficiency *η*_{M} of the same system but with trellis processing, which improves the spectral efficiency with respect to a symbol-by-symbol detector by almost 30% for QPSK and 8-PSK, whereas a limited improvement is obtained for 16-QAM. The trellis processing is here performed with *S* = 16 for QPSK and 16-QAM, and with *S* = 64 for 8-PSK. If we compare the theoretical SE curve for QPSK that can be obtained with the proposed technique with that corresponding to orthogonal signaling, it is clear that an asymptotic SE of 3.3 bit/s/Hz per polarization can be obtained instead of 2 bit/s/Hz per polarization. Thus, the gain is 65%. In addition, it is known that a realistic transmission system should envisage an excess bandwidth (as an example, see ref. [4]) and, as shown in Fig. 1, a SE of 1.6 bit/s/Hz per polarization with a 20% excess bandwidth (a more realistic value) should be taken as a reference. In this sense, the improvement is around 106%.
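The two quoted gains follow directly from the asymptotic SE values reported above, as the short worked computation below shows (the symbol $G$ for the relative gain is introduced here only for illustration):

```latex
G_{\text{ideal}} = \frac{3.3 - 2}{2} = 0.65 \;\; (65\%),
\qquad
G_{\text{20\% excess bandwidth}} = \frac{3.3 - 1.6}{1.6} \approx 1.06 \;\; (106\%).
```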

What information theory promises can be approached by using proper coding schemes, even in the presence of nonlinear effects. We consider a compensated optical link (14 spans of about 90 km each, described in [16]) with QPSK modulation and three different combinations of codes and subchannel data rates. Similar results are expected in a link without inline dispersion compensation. Two low-density parity-check (LDPC) codes having codewords of length 64800 bits and rates 4/5 and 8/9, respectively, are employed. As the target bit rate in our simulations is 1 Tbps on a bandwidth of 200 GHz and given a polarization-multiplexed QPSK transmission, a resulting spectral efficiency of 2.5 bit/s/Hz per polarization is required. We consider three setups: (1) seven 180 Gbps 30 GHz-spaced channels using the code with rate 4/5, (2) six 188 Gbps 35 GHz-spaced channels with the same code, and (3) eight 140 Gbps 26 GHz-spaced channels with the code of rate 8/9. Transmit and receive optical filter bandwidths are chosen equal to $0.3/T$ for (1), and $0.325/T$ for (2) and (3), whereas the launch powers per channel are −2.5, −2.6, and −4 dBm, respectively. A two-dimensional (2-D) adaptive feedforward equalizer (FFE) with 9 taps processes the signals received over two orthogonal states of polarization to compensate for GVD and PMD [2]; its output is provided to a BCJR detector, which iteratively exchanges information with the LDPC decoder for a maximum of 50 iterations. The receiver architecture is shown in Fig. 4. The values of $E_b/N_0$ (estimated as if the channel were linear) at which the three systems achieve a bit-error rate (BER) of $10^{-7}$ are reported in Fig. 5 along with the theoretical spectral efficiency curves. It may be observed that, despite the lack of optimization in the code design, we are 2.5 to 3 dB away from the theoretical results. This loss is due to the presence of nonlinear effects, which require a careful redesign of the codes and the investigation of the best combination of coding and modulation. The PSD of the transmitted signal of system (3) is also shown in Fig. 6. It can be noticed that, since the bandwidth of each subchannel is highly reduced by filtering, the required sampling rate is always within state-of-the-art technology, i.e., well below 50 Gsample/s.

We point out that no alternative schemes could be adopted in this scenario with this target value of spectral efficiency. In fact, 8-PSK or 16-QAM modulations with orthogonal signaling and proper coding schemes fail to attain the target BER value, in this specific link, due to the highly detrimental nonlinear effects. In this case, we considered both the cases of single-carrier transmissions, relaxing all constraints on the sampling rate available today, and multicarrier transmissions with a limited number of subcarriers to satisfy such technological constraints.

## 5. Conclusions

In this paper, we investigated a possible solution to improve the spectral efficiency of low-order linear modulations with different kinds of receivers. The improvement is related to the use of narrow optical filtering and frequency packing, which give up signal orthogonality in the time and frequency domains, and to the adoption of detectors with different complexity. We showed preliminary results for combinations of modulation and coding formats that reach promising performance on a realistic optical link and, in particular, 1 Tbps over 200 GHz bandwidth has been successfully demonstrated by simulations.

## References and links

**1.** S. Chandrasekhar and X. Liu, “Enabling components for future high-speed coherent communication systems,” in Proc. Optical Fiber Commun. Conf. (OFC’09) (Los Angeles, CA, USA, 2011), Paper OMU5.

**2.** G. Colavolpe, T. Foggi, E. Forestieri, and G. Prati, “Robust multilevel coherent optical systems with linear processing at the receiver,” J. Lightwave Technol. **27**, 2357–2369 (2009).

**3.** J. Zhao and A. Ellis, “Electronic impairment mitigation in optically multiplexed multicarrier systems,” J. Lightwave Technol. **29**, 278–290 (2011).

**4.** G. Bosco, V. Curri, A. Carena, P. Poggiolini, and F. Forghieri, “On the performance of Nyquist-WDM terabit superchannels based on PM-QPSK, PM-8PSK or PM-16QAM subcarriers,” J. Lightwave Technol. **29**, 53–61 (2011).

**5.** J. E. Mazo, “Faster-than-Nyquist signaling,” Bell System Tech. J. **54**, 1450–1462 (1975).

**6.** F. Rusek and J. B. Anderson, “The two dimensional Mazo limit,” in Proc. IEEE International Symposium on Information Theory (Adelaide, Australia, 2005), pp. 970–974.

**7.** A. Barbieri, D. Fertonani, and G. Colavolpe, “Time-frequency packing for linear modulations: spectral efficiency and practical detection schemes,” IEEE Trans. Commun. **57**, 2951–2959 (2009).

**8.** N. Merhav, G. Kaplan, A. Lapidoth, and S. Shamai, “On information rates for mismatched decoders,” IEEE Trans. Inform. Theory **40**, 1953–1967 (1994).

**9.** D. M. Arnold, H.-A. Loeliger, P. O. Vontobel, A. Kavčić, and W. Zeng, “Simulation-based computation of information rates for channels with memory,” IEEE Trans. Inform. Theory **52**, 3498–3508 (2006).

**10.** G. D. Forney Jr., “Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference,” IEEE Trans. Inform. Theory **18**, 363–378 (1972).

**11.** L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inform. Theory **20**, 284–287 (1974).

**12.** G. Colavolpe and A. Barbieri, “On MAP symbol detection for ISI channels using the Ungerboeck observation model,” IEEE Commun. Lett. **9**, 720–722 (2005).

**13.** G. Ungerboeck, “Adaptive maximum likelihood receiver for carrier-modulated data-transmission systems,” IEEE Trans. Commun. **com-22**, 624–636 (1974).

**14.** F. Rusek and D. Fertonani, “Lower bounds on the information rate of intersymbol interference channels based on the Ungerboeck observation model,” in Proc. IEEE International Symposium on Information Theory (2009).

**15.** G. Colavolpe, D. Fertonani, and A. Piemontese, “SISO detection over linear channels with linear complexity in the number of interferers,” IEEE J. Sel. Top. Signal Process. (submitted).

**16.** A. Barbieri, G. Colavolpe, T. Foggi, E. Forestieri, and G. Prati, “OFDM vs. single-carrier transmission for 100 Gbps optical communication,” J. Lightwave Technol. **28**, 2537–2551 (2010).