## Abstract

We present a symbol-by-symbol coherent optical receiver, which employs a novel, complex-weighted, decision-aided, maximum-likelihood (CW-DA-ML) carrier phase and frequency offset estimator. The CW-DA-ML carrier estimator uses a CW transversal filter to generate a carrier reference phasor, and the filter weights are automatically adapted on-line by linear regression on the observed signals. A complete modulo-*R* reduced frequency offset estimation (FOE) range of $\pm $*R*/2 is achieved, independent of modulation format, where *R* is the symbol rate. Carrier phase and frequency tracking is achieved rapidly. The acquisition speed of frequency offset in quaternary phase-shift keying (4-PSK) signals is more than 5 times faster than that of differential FOE. A constant penalty of approximately 1 dB at bit-error rate of 10^{−4} is demonstrated for all frequency offsets in 4-PSK signals with laser-linewidth-symbol-duration product of 8$\times $10^{−5}.

©2012 Optical Society of America

## 1. Introduction

Traditionally, a major challenge in homodyne coherent receivers is to phase- and frequency-lock the local oscillator (LO) laser to the received optical signal. Recently, digital signal processing (DSP) algorithms have been widely used to perform post-detection carrier synchronization, thereby giving birth to the intradyne coherent receiver featuring an intermediate frequency (IF) which allows for a free-running LO laser [1]. A prevalent DSP based block *M*th power carrier estimation (CE) was proposed in [2], but is meant for *M*-ary phase-shift keying (MPSK) formats only and requires nonlinear operations which increase system latency. A computationally linear, decision-aided, maximum-likelihood (DA-ML) CE with comparable performance to block *M*th power CE was derived in [3] and proposed for use in coherent optical communications in [4]. In an intradyne receiver, the frequency offset, ∆*f*, between LO and transmitter laser can be as large as $\pm $5 GHz [5]. However, block *M*th power and DA-ML CE were designed to estimate the carrier phase without considering any frequency offset. Block *M*th power and DA-ML CE have limited frequency offset tolerance of 10^{−3} times the symbol rate at laser-linewidth-symbol-duration product of 6.35$\times $10^{−5} [2, 6]. Hence, it is imperative for a digital CE to incorporate frequency offset estimation (FOE) capability.

A differential FOE, using the sum of differenced signal phases over an observation interval, was suggested for the block *M*th power CE in [7]. It utilizes the nonlinear operations of arctan($\cdot$) and *M*th power to remove data modulation. Consequently, it is only applicable to MPSK formats and the FOE range is limited to $\pm $*R*/(2*M*), where *R* is the symbol rate. With higher order modulation formats, the FOE range shrinks while the complexity of raising to the *M*th power increases. This differential FOE has an *R*/*M* frequency estimate ambiguity, requiring differential encoding (DE) which leads to a DE-induced performance penalty [8].

Extending upon the DA-ML CE of [3, 4], we propose here a modulation-format-independent symbol-by-symbol receiver employing a novel, complex-weighted (CW), DA-ML CE to estimate the unknown carrier phase and frequency offset. A complete modulo-*R* reduced frequency offset range of $\pm $*R*/2 can be estimated for arbitrary modulation formats. CW-DA-ML CE achieves near-ideal FOE with a low training overhead cost and no *a priori* knowledge of the system statistics. Carrier phase and frequency are acquired rapidly. The FOE for 4-PSK signals is more than 5 times faster than that of the differential FOE of [7].

The paper is organized as follows. In Section 2, the received signal in a phase and polarization diversity intradyne coherent receiver is modeled. Section 3 provides the derivation of the symbol-by-symbol receiver structure consisting of CW-DA-ML CE and data detection (DD). Section 4 examines the CE performance via simulations, and includes a discussion on its complexity and parallel implementation. Section 5 concludes the paper. Throughout this paper, ${\rm E}[\cdot]$ is the expectation, $|\cdot|$ is the modulus operator, boldface characters represent matrices, and $\delta $ is the Kronecker delta. Superscripts $\ast $, *T*, and *H* denote conjugate, transpose, and complex conjugate transpose, respectively. All phase quantities are treated modulo-2*π* to account for their circular nature and frequency quantities are treated modulo-*R*, which is justified later in Subsection 4.3.

## 2. Phase and polarization diversity intradyne coherent receiver

The opto-electronic front end followed by DSP modules in a phase and polarization diversity intradyne coherent receiver is shown in Fig. 1. The entire received optical field, ${E}_{r}\left(t\right)$, within the receiver bandwidth is mapped into *x*- and *y*-polarized, in-phase, *I*, and quadrature-phase, *Q*, photocurrents by mixing ${E}_{r}\left(t\right)$ with an LO using two 90° optical hybrids. The LO laser is free running, in contrast to a homodyne coherent receiver where the LO laser needs to be locked to the received optical signal. Each photocurrent branch $i\left(t\right)$ is sampled at symbol rate *R*, which amounts to sampling the complex signal at 2*R*, therefore satisfying the Nyquist theorem for fully preserving the optical signal and noise statistics.

We assume ideal signal conditioning by the first four DSP blocks, namely, analog-to-digital conversion (ADC), clock recovery and retiming, chromatic dispersion compensation, and polarization demultiplexing and polarization-mode dispersion compensation. The received signal samples ${r}_{x}\left(k\right)$ and ${r}_{y}\left(k\right)$ over the *k*th symbol interval $\left[kT,\left(k+1\right)T\right)$ (*T* = symbol duration) are clock-synchronized with one complex sample per symbol. They are assumed to be free of intersymbol interference (ISI), polarization crosstalk, and nonlinear distortions. The $r\left(k\right)$ in one of the polarizations can be shown to be [9]

*k*th interval, respectively. Here, $\theta \left(k\right)$ is the combined laser phase noise of the transmitter and LO lasers, whereas ${\theta}_{LO}\left(k\right)$ is the LO phase noise. Most long-haul optical fiber systems are limited by amplified spontaneous emission (ASE) noise from inline optical amplifiers, which is represented by ${n}^{\prime}\left(k\right)$ in Eq. (1). The ${i}_{sh}$ is the LO shot noise, assuming ${P}_{LO}\gg {P}_{r}$. Thermal noise is ignored in Eq. (1). On the right-hand side of Eq. (1), the first term is the signal-LO beat term and the second term is the LO-ASE beat noise. Using a pair of balanced photodiodes to detect each photocurrent branch, compared to a single photodiode in [1], eliminates the signal-ASE and ASE-ASE beat noise terms.

A canonical model of the received signal in Eq. (1) can be written as

$$r(k)=m(k){e}^{j\left(\Delta \omega k+\theta (k)\right)}+n(k),\qquad (2)$$

where $\Delta \omega =2\pi \Delta fT$ is the angular frequency offset and $m(k)$ is the *k*th data symbol. The LO-ASE beat noise and LO shot noise are modeled as additive white Gaussian noise (AWGN), and are combined to form $n\left(k\right)=\chi \sqrt{{P}_{LO}}{n}^{\prime}\left(k\right){e}^{j\left(2\pi \Delta fTk-{\theta}_{LO}\left(k\right)\right)}+{i}_{sh}$. The set $\left\{n\left(k\right)\right\}$ is a sequence of circularly symmetric AWGN samples with mean zero and covariance function ${\rm E}\left[n\left(k\right){n}^{\ast}\left(k-i\right)\right]={N}_{0}\delta \left(i\right)$. The carrier phase $\theta \left(k\right)$ has a Gaussian random-walk model of $\theta \left(k\right)=\eta \left(k\right)+\theta \left(k-1\right)$, where $\left\{\eta \left(k\right)\right\}$ is a set of independent, identically distributed, Gaussian random variables with mean zero and variance ${\sigma}_{p}^{2}=2\pi \Delta \nu T$ [9]. Here, ∆*ν* is the combined 3-dB linewidth of the transmitter and LO lasers. The carrier phase $\theta \left(k\right)$ has a temporal correlation characterized by ${\rm E}\left[\theta \left(i\right)\theta \left(j\right)\right]={\sigma}_{p}^{2}\mathrm{min}\left[i,j\right]$. Signal-to-noise ratio (SNR) per bit is defined as ${\gamma}_{b}={\rm E}\left[{\left|m\left(k\right)\right|}^{2}\right]/\left({N}_{0}{\mathrm{log}}_{2}M\right)$. In an intradyne receiver, ${E}_{r}\left(t\right)$ is downconverted to an IF of ∆*f* as seen in Eq. (2). Optical frequency bands around ${f}_{LO}+\Delta f$ and ${f}_{LO}-\Delta f$ map to the same IF, where ${f}_{LO}$ is the LO frequency. In order to avoid crosstalk in a dense wavelength-division-multiplexing system and to avoid excess optical noise from unwanted image bands, an optical filter of a single channel bandwidth is required before the intradyne receiver.
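The received-signal model can be exercised numerically. The following is a minimal sketch, assuming the canonical form $r(k)=m(k){e}^{j(\Delta \omega k+\theta (k))}+n(k)$ with $\Delta \omega =2\pi \Delta fT$, unit-energy MPSK symbols, and $\theta (0)=0$; the function name, parameter names, and default values are illustrative choices rather than part of the paper.

```python
import cmath
import math
import random


def simulate_received(num_symbols, M=4, delta_f_hz=3e9, symbol_rate_hz=25e9,
                      linewidth_hz=2e6, snr_per_bit_db=10.0, seed=1):
    """Return (m, r) with r(k) = m(k) * exp(j(dw*k + theta(k))) + n(k)."""
    rng = random.Random(seed)
    T = 1.0 / symbol_rate_hz
    dw = 2.0 * math.pi * delta_f_hz * T                    # angular offset per symbol
    sigma_p = math.sqrt(2.0 * math.pi * linewidth_hz * T)  # std dev of each phase step
    gamma_b = 10.0 ** (snr_per_bit_db / 10.0)
    n0 = 1.0 / (gamma_b * math.log2(M))                    # valid because E[|m|^2] = 1
    sigma_n = math.sqrt(n0 / 2.0)                          # noise std dev per real dimension
    theta = 0.0
    symbols, received = [], []
    for k in range(num_symbols):
        if k > 0:
            theta += rng.gauss(0.0, sigma_p)               # Gaussian random walk
        m = cmath.exp(2j * math.pi * rng.randrange(M) / M)  # unit-energy MPSK point
        n = complex(rng.gauss(0.0, sigma_n), rng.gauss(0.0, sigma_n))
        symbols.append(m)
        received.append(m * cmath.exp(1j * (dw * k + theta)) + n)
    return symbols, received


m, r = simulate_received(1000)
print(len(r), abs(r[0]))
```

Setting `linewidth_hz=0.0` and a very large `snr_per_bit_db` reduces the output to a pure rotation of the data symbols, which is a convenient sanity check.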

## 3. Carrier estimation and data detection

In this paper, we consider the detection of an uncoded data symbol sequence $\left\{m\left(k\right)\right\}$ transmitted over an AWGN channel with unknown carrier phase and frequency offset, as modeled in Eq. (2). Because the data is uncoded, the symbols of $\left\{m\left(k\right)\right\}$ are independent, and each symbol can assume with equal probability any point ${S}_{i}$, *i* = 0,…, *M*−1, in the signal constellation. The elements of $\left\{\theta \left(k\right)\right\}$ are correlated, whereas the assumption of an AWGN channel with no ISI makes the elements of $\left\{n\left(k\right)\right\}$ independent. The elements of $\left\{r\left(k\right)\right\}$ are rendered independent when conditioned on given values of uncoded data symbol sequence $\left\{m\left(k\right)\right\}$, carrier phase process $\left\{\theta \left(k\right)\right\}$, and frequency offset, ∆*f*. Hence, each data symbol $m\left(k\right)$ in the sequence $\left\{r\left(k\right)\right\}$ will be detected individually, i.e., symbol-by-symbol with minimum symbol error probability. We assume mutual statistical independence among $\left\{m\left(k\right)\right\}$, $\left\{\theta \left(k\right)\right\}$, and ∆*f*, which leads to the separation of the CE problem from the DD problem [10]. Therefore, at high SNR, the optimum symbol-by-symbol receiver structure consists of a CE followed by a coherent DD, as illustrated in Fig. 1 [10].

We develop our new CE algorithm in Subsection 3.1 for carrier phase and frequency offset estimation. The estimated phase and frequency are treated as if they were the true values of the carrier and are used in (partially) coherent DD in Subsection 3.2. Separate symbol-by-symbol receiver structures, consisting of CE and DD, can be applied independently on each polarization channel as shown in Fig. 1. All equations and quantities expressed hereafter are thus meant for one polarization channel.

#### 3.1 Complex-weighted DA-ML carrier estimation

In DA-ML CE [3, 4], the carrier phase is estimated using a complex reference phasor (RP), where the argument of the RP is the ML phase estimate of the carrier. It is assumed that the carrier phase process $\left\{\theta \left(k\right)\right\}$ is fluctuating slowly compared to the symbol rate such that it can be approximated to be piecewise constant over intervals longer than *LT*, where *L* is the estimator filter length. The RP $V\left(k+1\right)$ without considering any frequency offset is computed in [3, 4] using the immediate past *L* received signals as

$$V(k+1)=C(k)\sum_{l=1}^{L}r(k-l+1){\widehat{m}}^{\ast}(k-l+1),\qquad (3)$$

where $\widehat{m}(k)$ is the receiver's decision on the *k*th symbol from DD and ${C}^{-1}(k)=\sum_{l=1}^{L}{\left|\widehat{m}(k-l+1)\right|}^{2}$ is a factor to normalize the RP's magnitude in the event of a non-constant-energy signal constellation.
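As an illustration, the DA-ML reference phasor can be computed directly from the last *L* samples and decisions. This is a minimal sketch, assuming the RP takes the form $V(k+1)=C(k)\sum_{l=1}^{L}r(k-l+1){\widehat{m}}^{\ast}(k-l+1)$ with ${C}^{-1}(k)=\sum_{l=1}^{L}{|\widehat{m}(k-l+1)|}^{2}$; the function name is hypothetical.

```python
import cmath
import math


def daml_reference_phasor(r, m_hat, k, L):
    """Return V(k+1) from samples r[k-L+1..k] and decisions m_hat[k-L+1..k]."""
    acc = 0j
    energy = 0.0
    for l in range(1, L + 1):
        idx = k - l + 1
        acc += r[idx] * m_hat[idx].conjugate()   # strip the data modulation
        energy += abs(m_hat[idx]) ** 2
    return acc / energy                          # multiply by C(k) = 1 / energy


# Usage: noise-free 4-PSK with a constant carrier phase of 0.3 rad.
points = [cmath.exp(1j * (math.pi / 4 + math.pi / 2 * i)) for i in range(4)]
decisions = [points[i % 4] for i in range(10)]
samples = [s * cmath.exp(0.3j) for s in decisions]
print(cmath.phase(daml_reference_phasor(samples, decisions, k=9, L=5)))
```

With a constant phase and no noise, the argument of the returned phasor equals the carrier phase, which is the case the equal-weight average is designed for.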

Considering the presence of an unknown carrier frequency offset, we propose here a new RP ${V}^{\prime}(k+1)$ that is formed by

$${V}^{\prime}(k+1)=C(k)\sum_{l=1}^{L}{w}_{l}(k)r(k-l+1){\widehat{m}}^{\ast}(k-l+1)=C(k){w}^{T}(k)y(k).\qquad (4)$$

Ideally, one should have ${w}_{l}(k)={e}^{j\Delta \omega l}$, but ∆*ω* is unknown in practice. Thus, we propose to choose the complex weights automatically and adaptively at each time *k* based on the observations $\left\{r(l),0\le l\le k\right\}$ to minimize the cost function $J(k)$ given by [11]

$$J(k)=\sum_{l=1}^{k}{\left|e(l)\right|}^{2},\qquad e(l)=\frac{r(l)}{\widehat{m}(l)}-C(l-1){w}^{T}(k)y(l-1),\qquad (5)$$

where $w(k)={\left[\begin{array}{ccc}{w}_{1}(k)& \cdots & {w}_{L}(k)\end{array}\right]}^{T}$ is the *L*-by-1 filter-weight vector and $y(k)={\left[\begin{array}{ccc}r(k){\widehat{m}}^{\ast}(k)& \cdots & r(k-L+1){\widehat{m}}^{\ast}(k-L+1)\end{array}\right]}^{T}$ is the *L*-by-1 filter-input vector at time *k*. The error $e(l)$ is the difference between the desired response $r(l)/\widehat{m}(l)$ and the RP output of Eq. (4) at time *l*−1 using the latest set of filter coefficients $w(k)$. Minimization of the cost function $J(k)$ forces RP ${V}^{\prime}(l)$ to track the normalized term $r(l)/\widehat{m}(l)$, and thus forces ${w}_{l}(k)$ to track ${e}^{j\Delta \omega l}$. Adaptation of filter weights using a least-squares criterion, as opposed to a mean-square-error criterion $\tilde{J}(k)={\rm E}\left[{\left|e(k)\right|}^{2}\right]$, requires no statistical information about the AWGN, carrier phase, or frequency offset.

Since $J(k)$ is a real-valued function of $w(k)$ and ${w}^{\ast}(k)$, we solve the equation $\partial J(k)/\partial {w}^{\ast}(k)=0$ for $w(k)$ using the complex vector differentiation identity $\partial \left[a-{w}^{H}b\right]/\partial {w}^{\ast}=-b$, where $a$ is a scalar and $b$ is a vector. The optimum $\widehat{w}(k)$ is obtained as the solution of a least-squares normal equation

$$\Phi (k)\widehat{w}(k)=z(k),\qquad (6)$$

where $\Phi (k)=\sum_{l=1}^{k}{C}^{2}(l-1){y}^{\ast}(l-1){y}^{T}(l-1)$ is the *L*-by-*L* time-average autocorrelation matrix and $z(k)=\sum_{l=1}^{k}C(l-1){y}^{\ast}(l-1)r(l)/\widehat{m}(l)$ is the *L*-by-1 time-average cross-correlation vector. The ${\Phi}^{-1}(k)$ matrix and the least-squares $\widehat{w}(k)$ at each time *k* can be obtained recursively using the matrix inversion lemma [12]. Therefore, CW-DA-ML CE avoids any matrix inversion. The optimum $\widehat{w}(k)$ can respond to changing channel conditions, as it depends on the observations $\left\{r(k)\right\}$.
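The recursion can be sketched with the standard rank-one form of the matrix inversion lemma. The code below is an illustrative sketch, not a transcription of the paper's algorithm table: it assumes $\Phi (k)=\Phi (k-1)+{u}^{\ast}{u}^{T}$ with $u=C(k-1)y(k-1)$, the desired response $r(k)/\widehat{m}(k)$, an initial inverse ${\Phi}^{-1}(0)={\delta}^{-1}I$ with a small constant `delta` (an assumption, not specified in the text), and the initial weights $\widehat{w}(0)={[1\ 0\ \cdots \ 0]}^{T}$. Function names are hypothetical.

```python
import cmath


def rls_update(P, w, y_prev, c_prev, desired):
    """One least-squares step: returns the updated (inverse(Phi), weights)."""
    L = len(w)
    u = [c_prev * v for v in y_prev]                                  # scaled filter input
    uc = [v.conjugate() for v in u]
    Pu = [sum(P[i][j] * uc[j] for j in range(L)) for i in range(L)]   # P u*
    uP = [sum(u[i] * P[i][j] for i in range(L)) for j in range(L)]    # u^T P
    denom = 1.0 + sum(u[i] * Pu[i] for i in range(L))                 # 1 + u^T P u*
    gain = [v / denom for v in Pu]
    err = desired - sum(w[i] * u[i] for i in range(L))                # a priori error
    w_new = [w[i] + gain[i] * err for i in range(L)]
    P_new = [[P[i][j] - gain[i] * uP[j] for j in range(L)] for i in range(L)]
    return P_new, w_new


def cw_daml_track(r, m_hat, L=5, delta=0.01):
    """Return the reference phasors V'(k), k = L..len(r)-1, and the final weights."""
    P = [[(1.0 / delta if i == j else 0.0) for j in range(L)] for i in range(L)]
    w = [1.0 + 0j] + [0j] * (L - 1)
    phasors = []
    for k in range(L, len(r)):
        y_prev = [r[k - l] * m_hat[k - l].conjugate() for l in range(1, L + 1)]
        c_prev = 1.0 / sum(abs(m_hat[k - l]) ** 2 for l in range(1, L + 1))
        phasors.append(c_prev * sum(w[i] * y_prev[i] for i in range(L)))  # V'(k)
        P, w = rls_update(P, w, y_prev, c_prev, r[k] / m_hat[k])
    return phasors, w


# Usage: noise-free 4-PSK with a constant offset of 0.7 rad/symbol.
qpsk = [1 + 0j, 1j, -1 + 0j, -1j]
m = [qpsk[k % 4] for k in range(80)]
r = [m[k] * cmath.exp(0.7j * k) for k in range(80)]
phasors, weights = cw_daml_track(r, m)
print(abs(phasors[-1] - cmath.exp(0.7j * 79)))  # residual tracking error
```

Each step costs on the order of ${L}^{2}$ operations and involves no explicit matrix inversion. In this noise-free example the filter inputs are linearly dependent, so the individual weights are not unique even though the reference phasor tracks the rotating carrier.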

#### 3.2 Symbol-by-symbol data detection

Treating ${V}^{\prime}(k)$ as the true phasor ${e}^{j\left(\Delta \omega k+\theta (k)\right)}$, we multiply $r(k)$ by ${V}^{\prime \ast}(k)$ to compensate for the carrier phase and frequency offset. The derotated $r(k)$ is plugged into the ML minimum-distance DD for an AWGN channel given by [12]

$$\widehat{m}(k)=\arg \underset{{S}_{i}}{\min}{\left|r(k){V}^{\prime \ast}(k)-{S}_{i}\right|}^{2}.\qquad (7)$$

Equation (7) can be simplified by expanding the squared modulus, dropping the terms independent of ${S}_{i}$, and rearranging them to yield

$$\widehat{m}(k)=\arg \underset{{S}_{i}}{\max}\left\{{\rm Re}\left[r(k){V}^{\prime \ast}(k){S}_{i}^{\ast}\right]-\frac{{\left|{S}_{i}\right|}^{2}}{2}\right\}.\qquad (8)$$

Our symbol-by-symbol receiver algorithm employing the novel CW-DA-ML CE with recursive updating of the ${\Phi}^{-1}(k)$ matrix is outlined in Table 1. The ${\Phi}^{-1}(k)$ matrix is Hermitian symmetric, thus only the upper triangle of ${\Phi}^{-1}(k)$ needs to be computed whereas the lower triangle is obtained through diagonal reflection as signified by the ${\rm Tri}\{\cdot \}$ operator in Table 1. In operating the filter of Eq. (4), we set ${V}^{\prime}(0)=1$ and $\widehat{w}(0)={\left[\begin{array}{cccc}1& 0& \cdots & 0\end{array}\right]}^{T}$ to give a maximum gain of one on the first filter input $r(0){\widehat{m}}^{\ast}(0)$. An initial preamble of *N* known symbols is used to enable $\widehat{w}(k)$ to settle to a steady state and for ${V}^{\prime}(k)$ to acquire tracking of the phasor ${e}^{j\left(\Delta \omega k+\theta (k)\right)}$. The filter operates in decision-directed mode subsequently.
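The detection step can be sketched as follows, assuming a minimum-distance rule applied to the derotated sample $r(k){V}^{\prime \ast}(k)$ and its equivalent correlation form, which maximizes ${\rm Re}[r(k){V}^{\prime \ast}(k){S}_{i}^{\ast}]-{|{S}_{i}|}^{2}/2$. The function names and the example constellations are illustrative.

```python
import cmath


def detect_symbol(r_k, v_k, constellation):
    """Minimum-distance decision on r(k) after derotation by conj(V'(k))."""
    z = r_k * v_k.conjugate()
    return min(constellation, key=lambda s: abs(z - s) ** 2)


def detect_symbol_correlation(r_k, v_k, constellation):
    """Equivalent form: maximise Re[z * conj(s)] - |s|^2 / 2 over the points s."""
    z = r_k * v_k.conjugate()
    return max(constellation,
               key=lambda s: (z * s.conjugate()).real - abs(s) ** 2 / 2)


# Usage: a 4-PSK point rotated by 1.2 rad and slightly perturbed.
qpsk = [1 + 0j, 1j, -1 + 0j, -1j]
v = cmath.exp(1.2j)
r_k = (1j + 0.1 - 0.05j) * v
print(detect_symbol(r_k, v, qpsk), detect_symbol_correlation(r_k, v, qpsk))
```

The energy term ${|{S}_{i}|}^{2}/2$ only matters for non-constant-energy constellations; for MPSK it is the same for every point and the correlation alone decides.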

## 4. Numerical results and discussion

In all subsequent Monte-Carlo simulations, a single-polarization received signal modeled after Eq. (2) is used with a fixed bit rate of 50 Gbit/s for all modulation formats. A filter length of *L* = 5 is maintained for all CE algorithms tested. The modulo-2*π* reduced ∆*ω* is assumed to have a probability density function given by *p*(∆*ω*) = 1/(2*π*) for ∆*ω* $\in $ [−*π*, *π*), where ∆*ω* is random but time invariant. The models of the carrier phase, $\theta \left(k\right)$, and angular frequency offset, ∆*ω*, are not known to the receiver.

#### 4.1 Operation of CW-DA-ML CE

In this subsection, we analyze the statistical performance of CW-DA-ML CE by simulating carrier recovery for a 4-PSK signal with a preamble of *N* = 50 and a laser linewidth of 2 MHz.
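This operating point can be related to the laser-linewidth-symbol-duration product quoted in the abstract. The short calculation below assumes single-polarization 4-PSK at the fixed bit rate of 50 Gbit/s, so that the symbol rate is 25 GBaud, and uses the phase-step variance ${\sigma}_{p}^{2}=2\pi \Delta \nu T$; the function name is illustrative.

```python
import math


def phase_noise_parameters(linewidth_hz, bit_rate, M):
    """Return (linewidth * T, sigma_p^2) for an M-ary signal at the given bit rate."""
    symbol_rate = bit_rate / math.log2(M)
    product = linewidth_hz / symbol_rate           # delta_nu * T
    return product, 2.0 * math.pi * product        # sigma_p^2 = 2 * pi * delta_nu * T


product, variance = phase_noise_parameters(2e6, 50e9, 4)
print(product, variance)
```

For a 2 MHz linewidth this gives a product of about 8×10^{−5} and a per-symbol phase-step variance of about 5×10^{−4} rad².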

Figure 2 plots the acquisition of the argument of the complex weights, $\left\{\mathrm{arg}\left({\widehat{w}}_{l}\left(k\right)\right),1\le l\le L\right\}$, from Eq. (6) at a ${\gamma}_{b}$ of 10 dB and a frequency offset of 3 GHz. The $\mathrm{arg}\left({\widehat{w}}_{l}\left(k\right)\right)$ converges quickly to the actual value of *l*∆*ω*. Hence, ${w}_{l}\left(k\right)$ in Eq. (4) effectively removes the angular frequency offset difference between consecutive filter input terms $\left\{r\left(k-l+1\right){\widehat{m}}^{\ast}\left(k-l+1\right),1\le l\le L\right\}$ and helps to produce a more accurate RP compared to that produced by Eq. (3).

The mean-square error (MSE) learning curve of the CW-DA-ML CE algorithm, given by ${J}^{\prime}(k)={\rm E}\left[{\left|r(k)/\widehat{m}(k)-{V}^{\prime}(k)\right|}^{2}\right]$, is empirically evaluated. The ensemble-average squared error curves averaged over 10^{4} runs at frequency offsets of 0, 0.03, and 3 GHz are plotted for SNR values of 7, 10, and 13 dB in Fig. 3. Ideal decision feedback is assumed, i.e., $\widehat{m}(k)=m(k)$ for all *k*. Using Eq. (2), the MSE ${J}^{\prime}(k)$ can be expanded into

$${J}^{\prime}(k)={\rm E}\left[{\left|\frac{n(k)}{m(k)}\right|}^{2}\right]+{\rm E}\left[{\left|{e}^{j\left(\Delta \omega k+\theta (k)\right)}-{V}^{\prime}(k)\right|}^{2}\right]+{\rm E}\left[\frac{n(k)}{m(k)}{\left({e}^{j\left(\Delta \omega k+\theta (k)\right)}-{V}^{\prime}(k)\right)}^{\ast}\right]+{\rm E}\left[{\left(\frac{n(k)}{m(k)}\right)}^{\ast}\left({e}^{j\left(\Delta \omega k+\theta (k)\right)}-{V}^{\prime}(k)\right)\right].\qquad (9)$$

Recall that ∆*f*, $\theta (k)$, $n(k)$, and $m(k)$ are mutually independent. Thus, the third and fourth terms on the right-hand side of Eq. (9) amount to zero because ${\rm E}\left[n(k)/m(k)\right]=0$. Equation (9) then reduces to ${J}^{\prime}(k)={{J}^{\prime}}_{\mathrm{min}}(k)+{{J}^{\prime}}_{ex}(k)$, where ${{J}^{\prime}}_{\mathrm{min}}(k)={\rm E}\left[{\left|n(k)/m(k)\right|}^{2}\right]$ is the minimum achievable MSE and ${{J}^{\prime}}_{ex}(k)={\rm E}\left[{\left|{e}^{j\left(\Delta \omega k+\theta (k)\right)}-{V}^{\prime}(k)\right|}^{2}\right]$ is the excess MSE due to the estimation error of the RP. It can be shown that ${{J}^{\prime}}_{\mathrm{min}}=1/\left({\gamma}_{b}{\mathrm{log}}_{2}M\right)$ for 4-PSK signals. Figure 3 shows the values of ${{J}^{\prime}}_{\mathrm{min}}$, and also ${{J}^{\prime}}_{ex}$, which was obtained by averaging from time *k* = 100 to *k* = 200 the average of ${\left|{e}^{j\left(\Delta \omega k+\theta (k)\right)}-{V}^{\prime}(k)\right|}^{2}$ over 10^{4} runs at each time point *k*. A ${{J}^{\prime}}_{ex}$ value less than 2.4$\times $10^{−2} demonstrates the good tracking of ${e}^{j\left(\Delta \omega k+\theta (k)\right)}$ by ${V}^{\prime}(k)$, for SNR values above 7 dB. The excess MSE ${{J}^{\prime}}_{ex}$ decreases with increasing SNR but is indifferent to varying frequency offset, attesting that FOE by the CW-DA-ML CE algorithm is unbiased to the frequency offset present.
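For reference, the floor set by the additive noise can be evaluated at the three SNR values considered. The sketch below assumes unit-energy MPSK symbols and the relation ${{J}^{\prime}}_{\mathrm{min}}=1/({\gamma}_{b}{\mathrm{log}}_{2}M)$; the function name is illustrative.

```python
import math


def minimum_mse(snr_per_bit_db, M=4):
    """Return J'_min = 1 / (gamma_b * log2(M)) for SNR per bit given in dB."""
    gamma_b = 10.0 ** (snr_per_bit_db / 10.0)
    return 1.0 / (gamma_b * math.log2(M))


for snr_db in (7.0, 10.0, 13.0):
    print(snr_db, minimum_mse(snr_db))
```

At 10 dB this evaluates to 0.05 for 4-PSK, so an excess MSE below 2.4×10^{−2} is smaller than the noise floor itself at that SNR.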

The CW-DA-ML CE algorithm converges quickly, yielding rapid carrier phase and frequency tracking. We note that a short preamble of approximately twice the filter length, *N* ≈ 2*L*, is sufficient to aid the complex weight ${\widehat{w}}_{l}\left(k\right)$ in acquiring tracking of ${e}^{j\Delta \omega l}$, thus keeping the training overhead cost low. The MSE of CW-DA-ML CE for 4-PSK signals at SNR values above 7 dB settles to a steady state, implying acquisition of the frequency offset, within 100 observed samples. In contrast, the differential FOE for 4-PSK signals in [7] used a minimum observation sample size of 500 for acquisition of the frequency offset. Hence, CW-DA-ML FOE is more than 5 times faster than the differential FOE of [7] for 4-PSK signals at SNR values above 7 dB.

#### 4.2 Bit-error rate performance with CW-DA-ML CE

Bit-error rate (BER) curves of block *M*th power [2], DA-ML [3, 4], and CW-DA-ML CE are plotted in Fig. 4 for 4- and 8-PSK, as a function of SNR per bit, ${\gamma}_{b}$, in the presence of different frequency offsets. The filter length and laser linewidth were set to 5 and 0, respectively. A preamble of *N* = 10 known symbols is used for DA-ML and CW-DA-ML CE. DE according to [8] is employed to resolve phase ambiguity in block *M*th power CE, and to counter runaways due to decision errors in DA-ML and CW-DA-ML CE.

It is evident that block *M*th power and DA-ML CE are intolerant to frequency offsets. For instance, in block *M*th power CE of 4-PSK signals, as the frequency offset increases from 0 to 100 and 200 MHz, the ${\gamma}_{b}$ penalty rises from 1.13 to 1.33 and 1.80 dB, respectively. Likewise for DA-ML CE, the ${\gamma}_{b}$ penalty rises from 0.89 to 1.30 and 2.14 dB. SNR per bit penalties are referenced to the perfect coherent (ASE noise limited) receiver at BER = 10^{−4}. Moreover, block *M*th power and DA-ML CE become increasingly sensitive to frequency offset with increasing modulation order. For a fixed frequency offset of 100 MHz, as modulation format increases from 4- to 8-PSK, block *M*th power CE incurs an increasing ${\gamma}_{b}$ penalty of 1.33 and 3.18 dB, respectively. Likewise, DA-ML CE incurs an increasing ${\gamma}_{b}$ penalty of 1.30 and 3.63 dB. Frequency offset intolerance of block *M*th power and DA-ML CE originates from the violation of their assumption that $\left\{r\left(l\right),k-L+1\le l\le k\right\}$ have identical angular frequency offsets.
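The effect of a frequency offset on an equal-weight RP can be checked numerically. The sketch below assumes noise-free, unit-energy samples, perfect decisions, and a constant carrier phase (simplifications made here for illustration only). Under these assumptions, the sum of the last *L* derotated samples has an argument that lags the phase required at time *k* + 1 by (*L* + 1)∆*ω*/2. The function name is hypothetical.

```python
import cmath
import math


def daml_phase_lag(delta_omega, L):
    """Phase (rad) by which an equal-weight L-tap RP lags exp(j * dw * (k + 1))."""
    k = L  # any k >= L - 1 gives the same lag
    acc = sum(cmath.exp(1j * delta_omega * (k - l + 1)) for l in range(1, L + 1))
    target = cmath.exp(1j * delta_omega * (k + 1))
    return cmath.phase(target * acc.conjugate())


# Usage: 100 MHz offset at 25 GBaud (4-PSK at 50 Gbit/s) with L = 5.
dw = 2.0 * math.pi * 100e6 / 25e9
print(daml_phase_lag(dw, 5), (5 + 1) * dw / 2.0)
```

For this example the lag is about 0.075 rad (roughly 4.3°), a systematic rotation of the derotated constellation that grows linearly with the offset and eats into the smaller decision regions of higher-order PSK.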

On the contrary, the ${\gamma}_{b}$ penalty for CW-DA-ML CE remains unchanged at 0.89 and 1.13 dB regardless of the frequency offset present in 4- and 8-PSK, respectively, at BER = 10^{−4}. Regardless of ∆*f*, the BER of CW-DA-ML CE over all tested SNR values closely replicates that of DA-ML CE at zero frequency offset. Hence, CW-DA-ML CE approaches ideal FOE, without requiring any statistical knowledge of AWGN, carrier phase, or frequency offset.

Figure 4 also shows CW-DA-ML CE at ∆*f* = 100 MHz with ideal decision feedback, i.e., $\widehat{m}\left(k\right)=m\left(k\right)$ for all *k*. In the range of SNR shown, the performance difference between actual and ideal decision feedback is not noticeable for 4-PSK, and is slightly more pronounced at the lower SNR region for 8-PSK. The effect of decision errors on the receiver performance becomes negligible for BERs below 10^{−3}.

#### 4.3 FOE range of CW-DA-ML CE

The SNR per bit, ${\gamma}_{b}$, penalties of block *M*th power CE [2], DA-ML CE [3, 4], differential FOE [7], and CW-DA-ML CE are plotted in Fig. 5 for 4- and 8-PSK, as the frequency offset is varied between $\pm $*R*/2, in the presence of different laser linewidths, ∆*ν*. Any integer-multiple-of-*R* frequency offset, ∆*f*, translates into the same modulo-2*π* reduced ∆*ω*, thus having the same rotational effect on $r\left(k\right)$. Hence, it is sufficient to consider only the primary modulo-*R* reduced frequency offset range $\pm $*R*/2. The differential FOE utilizes a subsequent block *M*th power CE to correct any residual carrier phase, DE to overcome its *R*/*M* frequency estimate ambiguity, and a sample size of 500 to acquire the frequency offset, following that of [7].
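The modulo-*R* equivalence can be illustrated with a small helper that maps a frequency offset to its modulo-2*π* reduced angular offset per symbol, assuming $\Delta \omega =2\pi \Delta fT$ with *T* = 1/*R*. The function name is illustrative.

```python
import math


def reduced_angular_offset(delta_f_hz, symbol_rate_hz):
    """Map a frequency offset to the angular offset per symbol in [-pi, pi)."""
    dw = 2.0 * math.pi * delta_f_hz / symbol_rate_hz
    return (dw + math.pi) % (2.0 * math.pi) - math.pi


# Usage: offsets that differ by the symbol rate rotate r(k) identically.
R = 25e9
print(reduced_angular_offset(3e9, R), reduced_angular_offset(3e9 + R, R))
```

Both calls return the same value up to floating-point rounding, which is why only the range $\pm $*R*/2 needs to be examined.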

CE techniques, such as block *M*th power and differential FOE, which erase data modulation by raising the received signal to the *M*th power, and extract the phase using the arctan($\cdot$) operator are inherently disadvantaged in three ways. First, they are restricted to MPSK signals. Second, their FOE range is limited, modulation-format dependent, and decreases with increasing modulation order. For a fixed bit rate of ${R}_{b}$, block *M*th power CE has a maximum allowable frequency offset of $\pm {R}_{b}/\left(2LM{\mathrm{log}}_{2}M\right)$ in the absence of carrier phase variations [7], and the differential FOE has an FOE range of $\pm {R}_{b}/\left(2M{\mathrm{log}}_{2}M\right)$ [13]. Third, they require phase unwrapping in estimating the carrier phase and frequency offset, which is difficult to perform accurately at low SNR and is prone to the highly nonlinear cycle slipping phenomenon [14].

On the other hand, our CW-DA-ML CE achieves a complete modulo-*R* reduced, modulation-format-independent, FOE range of $\pm $*R*/2 as witnessed in Fig. 5. This is attributed to the use of the RP, ${V}^{\prime}\left(k\right)$, having an unambiguous phase tracking range of [0, 2*π*). The use of the RP also eliminates the need for phase unwrapping in carrier phase and frequency offset estimation. The circular-2*π* nature of the RP results in frequency-estimate duplicity of *R*, but this duplicity does not call for pilot assistance or DE to ensure correct DD. From Fig. 5(a), it is notable that a penalty of only approximately 1-dB is incurred at BER = 10^{−4} for a 50 Gbit/s 4-PSK signal having a laser linewidth of 2 MHz, regardless of the frequency offset present.
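To put numbers on these ranges, the sketch below evaluates the three limits at the simulated bit rate of 50 Gbit/s with *L* = 5, using *R* = ${R}_{b}/{\mathrm{log}}_{2}M$. The function and key names are illustrative.

```python
import math


def foe_ranges_hz(bit_rate, M, L):
    """Return one-sided frequency-offset limits (Hz) for a fixed bit rate."""
    R = bit_rate / math.log2(M)                    # symbol rate
    return {
        "symbol_rate": R,
        "block_mth_power": R / (2 * L * M),        # +/- R_b / (2 L M log2 M)
        "differential_foe": R / (2 * M),           # +/- R_b / (2 M log2 M)
        "cw_da_ml": R / 2,                         # +/- R / 2
    }


for M in (4, 8):
    print(M, foe_ranges_hz(50e9, M, 5))
```

For 4-PSK (*R* = 25 GBaud) the limits are ±0.625 GHz, ±3.125 GHz, and ±12.5 GHz, respectively; for 8-PSK the first two shrink to roughly ±0.21 GHz and ±1.04 GHz, while the CW-DA-ML range remains *R*/2.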

#### 4.4 Complexity and parallel implementation of CW-DA-ML CE

The computational load, defined as the number of operations required to estimate ${e}^{j\left(\Delta \omega k+\theta \left(k\right)\right)}$ per symbol, of the CW-DA-ML CE algorithm presented in Table 1 is $O\left({L}^{2}\right)$ real multiplications and additions. However, by implementing Eq. (4) in a lattice structure, which is not presented here for brevity, the complexity reduces to simply $O\left(L\right)$ real multiplications and additions [12]. Differential FOE (inclusive of subsequent block *M*th power CE) requires $O\left({\mathrm{log}}_{2}M\right)$ real multiplications and additions, and (2*L* + 1)/*L* accesses to a read-only-memory to map arctan($\cdot$) operations. The 1/*L* factor arises from sharing of the common arctan($\cdot$) operation among *L* symbols during the block *M*th power processing. The higher complexity of CW-DA-ML CE compared to differential FOE is traded off with CW-DA-ML CE’s wider and faster frequency estimation, and its applicability to arbitrary modulation formats. The update of $w\left(k\right)$ need be performed only until the frequency offset is acquired, say, until the 100th iteration for 4-PSK signals according to Fig. 3. Thereafter, the same weight vector $w\left(k\right)$ can be used in Eq. (4) at each time point *k*, i.e., skip steps 4 to 8 in Table 1. Then, the complexity of CW-DA-ML CE becomes simply 4*L* + 6 and 4*L* real multiplications and additions, respectively, per symbol.

Our CW-DA-ML CE is a *D*-symbol forward linear predictor, where the filter predicts ${e}^{j\left(\Delta \omega \left(k+D\right)+\theta \left(k+D\right)\right)}$ as a weighted linear combination of past observed samples $\left\{r\left(l\right),l\le k\right\}$ and symbol decisions $\left\{\widehat{m}\left(l\right),l\le k\right\}$ which are fed back. All of the above simulations used an ideal feedback delay of *D* = 1, which is the minimum in a feedback system. Practical coherent optical receivers supporting 100 Gb/s or higher data rates use parallel and pipelined processing to overcome the limited complementary metal-oxide-semiconductor processor speed of several Gb/s. In parallel processing, the effect of the observed sample on the feedback value is delayed by *D* > 1 [15]. Parallel implementation of CW-DA-ML CE, with a constant frequency offset, zero laser phase noise, and a feedback delay of *D*-symbols, is expected to incur marginal performance deterioration as it can estimate *D*-symbols ahead the linearly increasing angular frequency offset ∆*ω*(*k* + *D*). However, in the additional presence of laser phase noise, performance penalty originating from feedback delay is expected as pointed out in [15]. This is due to the difficulty in estimating the accumulated Gaussian random-walk of the carrier phase over the previous *D*-symbols, for which the estimated carrier phase information is unavailable because of feedback delay. A detailed implementation and performance analysis of CW-DA-ML CE in parallel processing is left as part of our future work.

## 5. Conclusions

If the frequency-offset-symbol-duration product is small, separate frequency offset estimation (FOE) may not be necessary as block *M*th power and decision-aided, maximum-likelihood carrier estimation (DA-ML CE) can cope with small frequency offsets. To combat larger frequency offsets, an FOE is necessary. Hence, we proposed a symbol-by-symbol receiver employing a novel complex-weighted (CW) DA-ML CE for optical channels with both carrier phase and frequency offset uncertainty. A near-ideal FOE over a complete modulo-*R* reduced frequency range of $\pm $*R*/2 (*R* = symbol rate) is demonstrated such that the bit-error rate performance is only limited by the carrier phase variance. CW-DA-ML CE is applicable to both *M*-ary phase-shift keying (MPSK) and quadrature amplitude modulation formats. The FOE for 4-PSK signals is more than 5 times faster than that of the differential FOE at signal-to-noise ratio values above 7 dB.

## Acknowledgments

The financial support of the Singapore MoE AcRF Tier 2 Grant MOE2010-T2-1-101 is gratefully acknowledged.

## References and links

**1.** F. Derr, “Coherent optical QPSK intradyne system: concept and digital receiver realization,” J. Lightwave Technol. **10**(9), 1290–1296 (1992).

**2.** D. S. Ly-Gagnon, S. Tsukamoto, K. Katoh, and K. Kikuchi, “Coherent detection of optical quadrature phase-shift keying signals with carrier phase estimation,” J. Lightwave Technol. **24**(1), 12–21 (2006).

**3.** P. Y. Kam, “Maximum likelihood carrier phase recovery for linear suppressed-carrier digital data modulations,” IEEE Trans. Commun. **34**(6), 522–527 (1986).

**4.** S. Zhang, P. Y. Kam, J. Chen, and C. Yu, “Decision-aided maximum likelihood detection in coherent optical phase-shift-keying system,” Opt. Express **17**(2), 703–715 (2009).

**5.** “Integrable Tunable Laser Assembly Multi Source Agreement,” OIF-ITLA-MSA-01.1 (Optical Internetworking Forum, Los Angeles, 2005).

**6.** A. Meiyappan, P. Y. Kam, and H. Kim, “Performance of decision-aided maximum-likelihood carrier phase estimation with frequency offset,” in Proc. OFC/NFOEC, Los Angeles, CA, 2012, paper OTu2G.6.

**7.** A. Leven, N. Kaneda, U.-V. Koc, and Y.-K. Chen, “Frequency estimation in intradyne reception,” IEEE Photon. Technol. Lett. **19**(6), 366–368 (2007).

**8.** W. J. Weber, “Differential encoding for multiple amplitude and phase shift keying systems,” IEEE Trans. Commun. **26**(3), 385–391 (1978).

**9.** K.-P. Ho, *Phase-modulated Optical Communication* (Springer, 2005).

**10.** P. Y. Kam, S. S. Ng, and T. S. Ng, “Optimum symbol-by-symbol detection of uncoded digital data over the Gaussian channel with unknown carrier phase,” IEEE Trans. Commun. **42**(8), 2543–2552 (1994).

**11.** P. Y. Kam, K. H. Chua, and X. Yu, “Adaptive symbol-by-symbol reception of MPSK on the Gaussian channel with unknown carrier phase characteristics,” IEEE Trans. Commun. **46**(10), 1275–1279 (1998).

**12.** J. G. Proakis, *Digital Communications* (McGraw-Hill, 2008).

**13.** U. Mengali and A. N. D'Andrea, *Synchronization Techniques for Digital Receivers* (Plenum Press, 1997).

**14.** H. Meyr, M. Moeneclaey, and S. Fechtel, *Digital Communication Receivers* (John Wiley, 1997).

**15.** T. Pfau, S. Hoffmann, and R. Noe, “Hardware-efficient coherent digital receiver concept with feedforward carrier recovery for M-QAM constellations,” J. Lightwave Technol. **27**(8), 989–999 (2009).