## Abstract

The ultimate limits introduced by polarization dependent loss (PDL) in coherent polarization multiplexed systems using advanced signal processing are studied. An analytical framework for effectively assessing the penalties is established and applied to systems with and without dynamically optimized launch polarization control. In systems with no launch polarization control, the PDL induced penalty is described by a simple formula and it is independent of the choice of constellation, or modulation format. The gain from optimizing launch polarizations is studied numerically and the mechanisms limiting it are described.

©2008 Optical Society of America

## 1. Introduction

Coherent polarization-multiplexed optical systems using advanced signal processing are currently revolutionizing the field of optical communications. In such systems, given the rapidly growing capabilities of high-speed electronics, many optical phenomena such as chromatic dispersion (CD), polarization mode dispersion (PMD) and even certain types of nonlinear distortions can be compensated for electronically at the receiver, so that very high tolerance to these phenomena can be achieved [1]–[4]. In contrast, perfect compensation for non-unitary phenomena such as polarization dependent loss (PDL) is fundamentally impossible, and therefore its effects must be properly accounted for. Studies aimed at single polarization systems [5],[6], or at systems operating with intensity modulation and detection [7], are not suitable for this new regime. More contemporary studies that addressed the PDL issue as part of their analysis of a polarization multiplexed QPSK system [3],[4] focused on a particular implementation and lacked the broader view of how PDL may affect coherent systems of a more general kind.

In this paper the goal is to assess the ultimate performance degradation caused by PDL in a generic, optimally designed, coherent polarization multiplexed system: one in which inter-symbol interference caused by dispersive effects or by intra-channel nonlinearities is well compensated for, and where joint detection and processing of data transmitted over both polarizations can be performed. For such systems, we obtain simple expressions that relate the degradation in performance to the PDL parameters of the system and allow the assessment of margins needed for given outage probabilities. The obtained results are quite general and in most cases of relevance they do not depend on the exact choice of constellation or modulation format.

## 2. Theory

We study a coherent polarization multiplexed system that consists of $N_s$ amplified spans and exploits an optimally designed receiver. It is assumed that the distortion of the Gaussian noise statistics as a result of the optical nonlinearity of the fiber is small and can be neglected. The matrix describing transmission from the output of the $j$-th amplifier to the receiver is denoted by $\mathbf{T}_j$. Thus $\mathbf{T}_0$ is the matrix describing transmission over the entire link from transmitter to receiver. The signal launched into the fiber on the transmitter side is denoted by a Jones vector $[s_1(t), s_2(t)]^t$, where the superscript $t$ denotes “transposed” and where

with $\xi$ having units of energy and with $f(t)$ being a unit energy pulse representing the transmitted symbol. The terms $a_1^{(k)}$ and $a_2^{(k)}$ are complex numbers that represent the digital information carried by the $k$-th symbol in each one of the two polarizations. It is assumed that the two polarizations are modulated independently while using identical modulation formats. For example, in the case of M-ary PSK (M-PSK) transmission $a_1^{(k)}=\exp\left(i\frac{2\pi l_1}{M}\right)$ and $a_2^{(k)}=\exp\left(i\frac{2\pi l_2}{M}\right)$, where $l_1$ and $l_2$ are two independent integers between 0 and $M-1$. The terms $\Delta\varphi$ and $\Delta t$ denote possible differences in optical phase and temporal alignment of the two signals, respectively. Typically, both of these parameters are slowly varying functions of time. The optical signal reaching the receiver can be expressed as

with $n_1^{(j)}$ and $n_2^{(j)}$ being two independent circular Gaussian processes whose power density is given by $\frac{P_n}{2N_s}$, such that the overall noise power density in the absence of PDL would be $P_n$.

The most difficult obstacle in the analysis that follows is the possible temporal misalignment between the two data-carrying waveforms. For non-integer values of $\Delta t/T$ and assuming optimal signaling, a given symbol in each data stream overlaps with two symbols in the other data stream, such that the number of dimensions that must be taken into account in the analysis of optimal detection needs to be doubled. This is in fact a form of inter-symbol interference generated by the loss of orthogonality that is induced by PDL. Notice that with the effects of PMD compensated for, misalignment between the data streams is generated mainly at the transmitter and may be caused by phase misalignment between the clocks driving the two signals, a situation that can in principle be fixed in practical implementations. Thus, for the tractability of the analysis, we consider only the case of full temporal overlap between the data streams in the calculations that follow. The relevance of the calculations in providing an estimate for the performance degradation in the presence of PDL should not be noticeably affected by this limitation.

Denoting by *h*(*t*) the receiver filter’s impulse response, one may now replace the time dependent quantities in the column vectors in Eq. (2) by their filtered samples

where $r_{1,2}=\int_T r_{1,2}(t)h(-t)\,\mathrm{d}t$ and $n_{1,2}^{(j)}=\int_T n_{1,2}^{(j)}(t)h(-t)\,\mathrm{d}t$ are independent identically distributed circular Gaussian random variables with variance equal to $P_n/(2N_s)$. The term $\eta$ is the overlap integral $\eta=\int_T f(t)h(-t)\,\mathrm{d}t$ and it assumes its maximum value of 1 for the case of matched filtering, i.e. when $h(t)=f(-t)$. The superscript $k$ denoting the symbol number above $a_1$ and $a_2$ in Eq. (1) has been omitted for clarity. In a shortened notation, Eq. (3) can be conveniently rewritten as

where we use an underline to denote a two dimensional (Jones) vector with complex components. The variables $r_1$ and $r_2$ are the decision variables, based on which a decision regarding the identity of the received symbols is to be made. The statistics of the accumulated Gaussian noise in Eq. (3) is completely described in terms of its coherency matrix, which can be readily obtained from Eq. (2) and is given by $\Lambda_n=\frac{P_n}{2N_s}\sum_{j=1}^{N_s}\mathbf{T}_j\mathbf{T}_j^{\dagger}$. Equation (3) is now perfectly well defined and it allows extraction of the error probabilities. Our goal in what follows is to manipulate the decision variables such that the assessment of performance will become more straightforward.
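The coherency matrix defined above is straightforward to evaluate numerically once the matrices $\mathbf{T}_j$ are known. The following is a minimal sketch, assuming NumPy; the function name `coherency_matrix` and the example matrices are hypothetical and serve only to illustrate the formula.

```python
import numpy as np

def coherency_matrix(T_list, P_n):
    """Lambda_n = P_n / (2 N_s) * sum_j T_j T_j^dagger.

    T_list holds the 2x2 matrices T_1 ... T_{N_s}, each describing
    transmission from the output of amplifier j to the receiver.
    """
    N_s = len(T_list)
    acc = np.zeros((2, 2), dtype=complex)
    for T in T_list:
        acc += T @ T.conj().T
    return P_n / (2 * N_s) * acc

# Hypothetical example: three made-up, slightly non-unitary 2x2 matrices.
rng = np.random.default_rng(0)
T_list = [np.eye(2) + 0.1 * (rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))
          for _ in range(3)]
Lam = coherency_matrix(T_list, P_n=1.0)

# Without PDL every T_j is unitary and Lambda_n reduces to (P_n / 2) * I.
Lam_no_pdl = coherency_matrix([np.eye(2, dtype=complex)] * 3, P_n=1.0)
```

The result is Hermitian and positive semidefinite by construction, which is what allows the diagonalization used in the following steps.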

Using the fact that the matrices $\mathbf{T}_j\mathbf{T}_j^{\dagger}$ are Hermitian, the coherency matrix can be conveniently expressed in the form

where $\{g_j^{(0)}\}_{j=1}^{N_s}$ are real nonnegative numbers and where $\{\vec{g}_j\}_{j=1}^{N_s}$ denotes a set of three dimensional vectors with real components. The modulus of $\vec{g}_j$ can be shown to satisfy $|\vec{g}_j|\le g_j^{(0)}$, with equality obtained only when the matrix $\mathbf{T}_j$ represents a perfect polarizer. The term $\vec{\sigma}$ is the Pauli-matrix vector $\vec{\sigma}=[\sigma_1,\sigma_2,\sigma_3]$, with $\sigma_i$ ($i=1,2,3$) being the Pauli matrices, permuted as in [9] for consistency with the conventional polarization terminology. The rightmost expression in Eq. (5) follows the definitions $g'=\frac{1}{N_s}\sum_{j=1}^{N_s}g_j^{(0)}$ and $\vec{\Gamma}'=\frac{1}{N_s g'}\sum_{j=1}^{N_s}\vec{g}_j$, and the term $\mathbf{I}$ represents the 2 by 2 identity matrix. Next, we denote by $\mathbf{U}$ the matrix that diagonalizes the coherency matrix $\Lambda_n$. Clearly, since $\Lambda_n$ is Hermitian, $\mathbf{U}$ is a unitary matrix satisfying $\mathbf{U}^{-1}=\mathbf{U}^{\dagger}$. The diagonalized coherency matrix can be expressed as

where $\Gamma'$ denotes the modulus of $\vec{\Gamma}'$ and where $\sigma_1$ is the first Pauli matrix in the notation of [9] (which is the same as $\sigma_z$ in the physical spin literature). It is now convenient to normalize the decision vector $[r_1,r_2]^t$ as follows

$$\underline{\tilde{r}}=\frac{\sqrt{2\eta\xi}}{\sqrt{P_n g'\left(1-\Gamma'^2\right)}}\sqrt{\mathbf{I}-\Gamma'\sigma_1}\,\mathbf{U}\mathbf{T}_0\underline{u}+\underline{\tilde{n}}\qquad(7)$$

such that the components of $\underline{\tilde{n}}=[\tilde{n}_1,\tilde{n}_2]^t$ are statistically independent circular-complex Gaussian variables with zero mean and unit variance. The term $\sqrt{\mathbf{I}-\Gamma'\sigma_1}$ is a diagonal 2 by 2 matrix whose top left element is $\sqrt{1-\Gamma'}$ and whose bottom right element equals $\sqrt{1+\Gamma'}$. The first term on the right-hand side of Eq. (7) represents the re-normalized constellation of transmitted signals. While the exact symbol error-rate can only be found by dividing the four dimensional constellation space into decision regions and considering the complete constellation, a good approximation follows from the notion that the majority of errors occur between minimally distant constellation points. Thus, the symbol-error rate is approximately equal to $Q(d_{\min}/2)$, where $Q(x)=\frac{1}{\sqrt{2\pi}}\int_x^{\infty}\exp\left(\frac{-s^2}{2}\right)\mathrm{d}s$ is the well-known Gaussian Q-function and where $d_{\min}$ is the smallest distance between a pair of constellation points in the re-normalized space. The PDL induced SNR penalty is therefore equal to the reduction in the re-normalized minimum distance as a result of PDL.
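The nearest-neighbour approximation above is easy to evaluate with the complementary error function, since $Q(x)=\tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$. The sketch below uses only the Python standard library; the function names are hypothetical, and the dB conversion in `penalty_db` is an illustrative convenience rather than a definition taken from the text.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def approx_symbol_error_rate(d_min):
    """Nearest-neighbour approximation quoted in the text: Q(d_min / 2)."""
    return q_function(d_min / 2.0)

def penalty_db(d_min_sq_with_pdl, d_min_sq_without_pdl):
    """SNR penalty in dB from the ratio of squared minimum distances."""
    return -10.0 * math.log10(d_min_sq_with_pdl / d_min_sq_without_pdl)
```

For instance, halving the squared minimum distance corresponds to a penalty of about 3 dB.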

Assume two constellation points in the original space, $\underline{u}_1$ and $\underline{u}_2$, the difference between which is $\Delta\underline{u}=[\Delta a_1,\exp(i\Delta\varphi)\Delta a_2]^t$. The corresponding square distance between those points in the re-normalized constellation space is

Note that $\mathbf{U}^{\dagger}(\mathbf{I}-\Gamma'\sigma_1)\mathbf{U}=\mathbf{I}-\vec{\Gamma}'\cdot\vec{\sigma}$ and also that $\mathbf{T}_0^{\dagger}\mathbf{T}_0$ can be expressed as $\mathbf{T}_0^{\dagger}\mathbf{T}_0=g_0(\mathbf{I}+\vec{\Gamma}_0\cdot\vec{\sigma})$ with $\vec{\Gamma}_0$ being the PDL vector of the entire link and with $g_0$ denoting the polarization averaged loss. With these relations, the square distance can be re-expressed as

where $\vec{\Delta}_{in}=\Delta\underline{u}^{\dagger}\vec{\sigma}\Delta\underline{u}$ and $\vec{\Delta}_{out}=\Delta\underline{u}^{\dagger}\mathbf{T}_0^{\dagger}\vec{\sigma}\mathbf{T}_0\Delta\underline{u}$ are the Stokes vectors that correspond to the difference between the two constellation points at the input and output of the link, respectively. Namely, these are the Stokes representations of the difference between the Jones vectors corresponding to the two constellation points at the input and output of the link, respectively. One must therefore avoid misinterpreting these quantities as the difference between the Stokes vectors of the two constellation points. The term $\Delta_{in}$ denotes the modulus of $\vec{\Delta}_{in}$ (which is equal to $\Delta_{in}=|\Delta a_1|^2+|\Delta a_2|^2$), whereas $\widehat{\Delta}_{in}$ represents its orientation. Using the relation $\Delta_{out}=\Delta\underline{u}^{\dagger}\mathbf{T}_0^{\dagger}\mathbf{T}_0\Delta\underline{u}=\Delta_{in}g_0(1+\vec{\Gamma}_0\cdot\widehat{\Delta}_{in})$ and denoting by $|\Delta\underline{\tilde{r}}_0|^2=\frac{2\eta\xi\Delta_{in}}{P_n}$ the square distance in the absence of PDL, the following final expression for the distance ratio is obtained

Equation (10) is the main analytical result of this paper. It connects the reduction in distance between any two constellation points to the PDL parameters of the link. Note that this is an exact expression for the assumed system without any approximations or restrictions on the magnitude of the PDL or its statistics. It will later serve us for the assessment of the PDL induced performance penalty. The vector $\vec{\Gamma}_0$ is the forward [5] PDL vector of the entire link and it is related to the instantaneous PDL of the link $\rho$ through $\rho=\frac{1+\Gamma_0}{1-\Gamma_0}$. The vector $\vec{\Gamma}'$ defined in Eq. (5) is approximately the backward [5] PDL vector integrated over the system length. The values of the polarization averaged loss $g_0$ and the integrated polarization averaged loss $g'$ of Eq. (5) depend both on the particular random fiber realization and on the equalization strategy, as we elaborate later.
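Combining the identities quoted in the derivation, namely $\mathbf{U}^{\dagger}(\mathbf{I}-\Gamma'\sigma_1)\mathbf{U}=\mathbf{I}-\vec{\Gamma}'\cdot\vec{\sigma}$ and $\Delta_{out}=\Delta_{in}g_0(1+\vec{\Gamma}_0\cdot\widehat{\Delta}_{in})$, suggests a distance ratio of the form $\eta_r=\frac{g_0}{g'}\frac{(1+\vec{\Gamma}_0\cdot\widehat{\Delta}_{in})(1-\vec{\Gamma}'\cdot\widehat{\Delta}_{out})}{1-\Gamma'^2}$. This form is worked out here from the stated relations and should be read as an assumption, not as a quotation of the displayed result. The sketch below (NumPy; all function names hypothetical; Pauli matrices ordered with $\sigma_1$ diagonal; birefringence and polarization independent loss omitted; made-up PDL strength) checks the closed form against a direct evaluation of the noise-whitened distance.

```python
import numpy as np

# Pauli matrices in the Stokes-space ordering assumed here (sigma_1 diagonal).
SIGMA = np.array([[[1, 0], [0, -1]],
                  [[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]]], dtype=complex)

def pauli_decompose(H):
    """Return (h0, h_vec) such that H = h0*I + h_vec . sigma (H Hermitian)."""
    h0 = 0.5 * np.trace(H).real
    h_vec = np.array([0.5 * np.trace(H @ s).real for s in SIGMA])
    return h0, h_vec

def stokes(w):
    """Stokes vector w^dagger sigma w of a Jones vector w."""
    return np.array([(w.conj() @ s @ w).real for s in SIGMA])

def pdl_element(alpha):
    """exp(-(1/2) alpha . sigma) in closed form."""
    a = np.linalg.norm(alpha)
    if a == 0:
        return np.eye(2, dtype=complex)
    n_sigma = np.tensordot(alpha / a, SIGMA, axes=1)
    return np.cosh(a / 2) * np.eye(2) - np.sinh(a / 2) * n_sigma

rng = np.random.default_rng(1)
N_s, P_n = 10, 1.0
# Spans 0..N_s-1 carry PDL; the last matrix M_{N_s} is the identity.
M = [pdl_element(0.1 * rng.normal(size=3)) for _ in range(N_s)]
M.append(np.eye(2, dtype=complex))

# T_i = M_{N_s} ... M_i, smallest index on the rightmost side.
T = [None] * (N_s + 1)
T[N_s] = M[N_s]
for i in range(N_s - 1, -1, -1):
    T[i] = T[i + 1] @ M[i]

Lam = P_n / (2 * N_s) * sum(T[j] @ T[j].conj().T for j in range(1, N_s + 1))

# Lam = (P_n g'/2)(I + Gamma' . sigma) and T_0^dagger T_0 = g0 (I + Gamma_0 . sigma)
g_prime, g_vec = pauli_decompose(2 * Lam / P_n)
Gamma_p = g_vec / g_prime
g0, t_vec = pauli_decompose(T[0].conj().T @ T[0])
Gamma_0 = t_vec / g0

du = np.array([0.3 - 0.2j, 0.7 + 0.1j])   # arbitrary difference of two points
w = T[0] @ du

# Direct evaluation: whitened square distance over its PDL-free value
# (the common factor eta * xi cancels in the ratio).
eta_direct = (w.conj() @ np.linalg.inv(Lam) @ w).real / ((2 / P_n) * (du.conj() @ du).real)

# Closed form in terms of the PDL parameters.
D_in_hat = stokes(du) / np.linalg.norm(stokes(du))
D_out_hat = stokes(w) / np.linalg.norm(stokes(w))
eta_closed = (g0 / g_prime) * (1 + Gamma_0 @ D_in_hat) * (1 - Gamma_p @ D_out_hat) \
    / (1 - Gamma_p @ Gamma_p)

# Instantaneous link PDL in dB from rho = (1 + Gamma_0) / (1 - Gamma_0).
G0 = np.linalg.norm(Gamma_0)
rho_db = 10 * np.log10((1 + G0) / (1 - G0))
```

The two evaluations agree to numerical precision for any choice of `du`, which is consistent with the statement that the expression is exact and does not rely on a small-PDL expansion.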

For the sake of the subsequent analysis and to lay the foundations for the later simulation of penalties with the help of Eq. (10), we now describe the construction of the transmission link in more detail. The matrix representing the overall effect of PDL in each span of the assumed multi-span fiber-optic system is expressed as $M_j=\exp\left(-\frac{1}{2}\left(\alpha_j^{(0)}\mathbf{I}+\vec{\alpha}_j\cdot\vec{\sigma}-i\vec{\beta}_j\cdot\vec{\sigma}\right)\right)$. The terms $\vec{\alpha}_j$ and $\alpha_j^{(0)}$ denote the local PDL vector [11] and the polarization independent loss, respectively, whereas $\vec{\beta}_j$ is the birefringence vector. Since we do not consider dispersive effects, all quantities are taken to be independent of optical frequency. The previously defined transmission matrices $\mathbf{T}_i$ are then given by

$$\mathbf{T}_i=\prod_{j=i}^{N_s}M_j$$

where the matrices $M_j$ are ordered with the smallest index appearing on the rightmost side and where $i$ is between 0 and $N_s$. The noise of each amplifier is modelled as if it is added to the signal at the amplifier output and so the noise of the $j$-th amplifier sees the PDL of only the spans and amplifiers that follow (see Fig. 1). Although in practice, in multi-stage amplifiers, the noise is generated adiabatically along the amplifying medium, this simplified view (which is critical from the analytical standpoint) is quite reliable in systems consisting of a large number of spans. It is assumed that the first amplifier is positioned at the end of the first span and that the last amplifier is immediately in front of the receiver so that the noise emitted by it does not experience PDL. Therefore, the last matrix $M_j$, for $j=N_s$, is the identity matrix. The vectors $\vec{\alpha}_j$ are picked from an isotropic Gaussian distribution with [8],[10]

where $\gamma=20/\ln(10)$, where $\langle\rho^2\rangle$ is the mean square PDL and where the mean PDL is $\langle\rho\rangle=\sqrt{\frac{8\langle\rho^2\rangle}{3\pi}}$ [8]. Note that $M_j$ accounts for the PDL of the entire span and therefore it includes the combined effect of many small PDL contributions from the fiber and various inline components, thereby justifying the choice of Gaussian distribution. Nevertheless, as in similar situations encountered in previous studies of PDL and PMD, this particular choice has no effect on the final distributions when $N_s\gg1$. The presence of the birefringence vectors $\vec{\beta}_j$ is immaterial and they can be omitted without any loss of generality. In fact, the omission of these vectors is rigorously equivalent to adopting a frame of reference that rotates with the birefringence [8], [5]. The choice of the polarization independent loss term $\alpha_j^{(0)}$ is not obvious, as it reflects the equalization strategy used in the system and it is related to the mode in which the inline optical amplifiers are operated. For example, one option would be to operate the amplifiers such that the polarization averaged gain of each amplified span is equal to unity. This choice is equivalent to requiring that the trace of $M_j^{\dagger}M_j$ is equal to 2 and it is achieved with $\alpha_j^{(0)}=\ln(\cosh(|\vec{\alpha}_j|))$. While this mode of operation is perhaps convenient from the point of view of analysis, its practical implementation in actual systems is highly nontrivial. A much more practical strategy would be to assume that the amplifiers are operated in a constant output power mode, so that the total optical power at the amplifier output equals the total launched power. In this case, assuming that the states of polarization of the different WDM channels are random and the distribution of their states of polarization is approximately uniform at the input [12], the above condition implies that the polarization averaged loss from the link input to the output of any amplifier is equal to 1. Mathematically, this means that the terms $\alpha_j^{(0)}$ are such that the trace of $\mathbf{T}_{0\to j}^{\dagger}\mathbf{T}_{0\to j}$ is equal to 2 for all $j$, where $\mathbf{T}_{0\to j}\equiv\mathbf{T}_j^{-1}\mathbf{T}_0$ is the matrix describing transmission from the link input to the output of the $j$-th amplifier. In the numerical examples presented in what follows, the latter form of operation is assumed. Once the set of matrices $M_j$ is generated, the parameters of Eq. (10) are trivially obtained.
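The link construction described above can be sketched as follows. This is a minimal illustration, assuming NumPy, no birefringence, and a per-span variance $\langle|\vec{\alpha}_j|^2\rangle\approx\langle\rho\rangle^2/(64N_s)$ with $\langle\rho\rangle$ in dB (a small-PDL assumption made here, not a value taken from the displayed distribution). The function names are hypothetical. The sketch covers both equalization options: unity polarization averaged gain per span, and constant output power at every amplifier.

```python
import numpy as np

SIGMA = np.array([[[1, 0], [0, -1]],
                  [[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]]], dtype=complex)

def pdl_element(alpha):
    """exp(-(1/2) alpha . sigma) in closed form (no birefringence)."""
    a = np.linalg.norm(alpha)
    if a == 0:
        return np.eye(2, dtype=complex)
    n_sigma = np.tensordot(alpha / a, SIGMA, axes=1)
    return np.cosh(a / 2) * np.eye(2) - np.sinh(a / 2) * n_sigma

def unity_gain_span(alpha):
    """Span matrix with alpha0 = ln(cosh|alpha|), giving trace(M^dagger M) = 2."""
    alpha0 = np.log(np.cosh(np.linalg.norm(alpha)))
    return np.exp(-alpha0 / 2) * pdl_element(alpha)

def constant_power_link(alphas):
    """Span matrices M_0 .. M_{N_s-1} followed by M_{N_s} = I.

    Each polarization independent loss is chosen so that the matrix from the
    link input to the output of every amplifier has trace(P^dagger P) = 2.
    """
    M, P = [], np.eye(2, dtype=complex)
    for alpha in alphas:
        M_j = pdl_element(alpha)
        cand = M_j @ P
        scale = np.sqrt(2.0 / np.trace(cand.conj().T @ cand).real)
        M_j = scale * M_j
        P = M_j @ P
        M.append(M_j)
    M.append(np.eye(2, dtype=complex))
    return M

# Hypothetical parameters: 10 spans, 2 dB mean PDL.
N_s, mean_pdl_db = 10, 2.0
rng = np.random.default_rng(2)
sigma_component = np.sqrt(mean_pdl_db**2 / (64 * N_s) / 3)  # per Stokes component
alphas = sigma_component * rng.normal(size=(N_s, 3))
link = constant_power_link(alphas)
```

Scaling each $M_j$ by a real positive factor is equivalent to choosing its $\alpha_j^{(0)}$, so the constant output power condition can be imposed span by span without solving for all loss terms jointly.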

In the next section the usage and implications of Eq. (10) for the estimation of PDL induced performance degradation are discussed and demonstrated.

## 2.1. The relation to performance degradation

In polarization multiplexed constellations (unlike in the case of generalized polarization shift keying) the nearest constellation points in the original space are those that differ in only one of the two polarization components. This is easily verified from the fact that $|\Delta\underline{u}|^2=|\Delta a_1|^2+|\Delta a_2|^2$, which is smallest when either $\Delta a_1$ or $\Delta a_2$ is set to 0. Since the two polarization signals are assumed to be identically modulated, the smallest square distance between two constellation points that differ in both polarization components is larger by exactly a factor of 2 than the minimum square distance over all points. Therefore, for a pair of points that differ in both polarizations to become minimally distant in the re-normalized constellation space, $\eta_r$ must reach values lower than 0.5, implying a fairly large PDL. Unless the input states of polarization are optimized dynamically, with feedback from the receiver, a scenario that we consider separately in the next section, such high values of PDL cannot be encountered with any meaningful probability in normally operating systems. If it were not so, then penalties in excess of 3 dB would be observed with approximately the same probability. For this reason, the minimally distant pairs of constellation points in the re-normalized space would always be minimally distant in the original space, corresponding to either $\widehat{\Delta}_{in}=(1,0,0)$ when $\Delta a_2=0$, or $\widehat{\Delta}_{in}=(-1,0,0)$ when $\Delta a_1=0$. The smaller of the two values obtained from Eq. (10) with $\widehat{\Delta}_{in}=(\pm1,0,0)$ and the corresponding $\widehat{\Delta}_{out}$ is the SNR penalty, defined as the ratio between the PDL impaired SNR and the SNR in the absence of PDL. In what follows we use the symbol $\eta_{SNR}$ to denote this quantity. Note that we find it convenient to use linear units, in which case $\eta_{SNR}$ receives values in the range of 0 to 1, with 1 corresponding to no penalty. It is important to note that in this regime the PDL induced SNR penalty is independent of the choice of constellation. In other words, the SNR will be degraded by the same amount regardless of which modulation format is applied to the individual polarizations, provided only that coherent detection with joint signal processing is performed.
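The factor of 2 between the two kinds of minimum square distance can be confirmed by brute-force enumeration. The sketch below uses only the standard library and assumes M-PSK symbols on the unit circle in each polarization with $\Delta\varphi=0$; the function names are hypothetical.

```python
import cmath
import itertools

def mpsk(M):
    """M-PSK symbols exp(i 2 pi l / M), l = 0 .. M-1."""
    return [cmath.exp(2j * cmath.pi * l / M) for l in range(M)]

def min_square_distances(points):
    """Minimum |du|^2 over pairs differing in one polarization and in both."""
    dual = list(itertools.product(points, repeat=2))   # (a_1, a_2) pairs
    d_one, d_both = float("inf"), float("inf")
    for (a1, a2), (b1, b2) in itertools.combinations(dual, 2):
        d1, d2 = abs(a1 - b1) ** 2, abs(a2 - b2) ** 2
        if d1 > 1e-12 and d2 > 1e-12:
            d_both = min(d_both, d1 + d2)   # differs in both polarizations
        else:
            d_one = min(d_one, d1 + d2)     # differs in only one polarization
    return d_one, d_both

ratios = {}
for M in (4, 8):
    d_one, d_both = min_square_distances(mpsk(M))
    ratios[M] = d_both / d_one
```

The ratio equals 2 for both Q-PSK and 8-PSK, which is why a pair differing in both polarizations can only become minimally distant once its own distance ratio drops below 0.5.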

Much insight into the problem can be gained from considering first the low PDL regime, where Eq. (10) can be considerably simplified. As we show in the next section, results obtained for this regime are quite useful in the relevant range of average PDL values. In this regime Eq. (10) can be written as

where we have dropped the polarization independent loss term $g_0/g'$ and expanded the rest of the expression to second order in $\vec{\Gamma}_0$ and $\vec{\Gamma}'$. In this expansion we have used the relation $\widehat{\Delta}_{out}\simeq\widehat{\Delta}_{in}+\vec{\Gamma}_0^{\perp}$, which is correct to first order in $\vec{\Gamma}_0$, as can be shown based on the dynamic equation of the unit Stokes vectors in fibers with PDL [5]. The term $\vec{\Gamma}_0^{\perp}$ denotes the component of $\vec{\Gamma}_0$ that is orthogonal to $\widehat{\Delta}_{in}$. Additionally, for small PDL and with the above specified equalization strategy, $g_j^{(0)}\simeq1$, $\vec{g}_j\simeq\sum_{i=j}^{N_s-1}\vec{\alpha}_i$ and $\vec{\Gamma}_0\simeq\sum_{i=0}^{N_s-1}\vec{\alpha}_i$. Finally, using the above relations in the definition of $\vec{\Gamma}'$ and rearranging the sums leads to the relation

with which $\eta_r$ can be described in terms of a quadratic form of Gaussian variables. On the other hand, truncating the expression after the first term on the right-hand side leads to a Gaussian approximation. While the distribution of quadratic forms can be addressed rigorously, the resulting expressions are very cumbersome and unnecessary for the approximation level that we need here. Instead, we use the Gaussian approximation with a small correction to the mean value based on the two second order terms. As we show by comparison to numerical simulations, the resulting Gaussian distribution very accurately predicts the behavior of the studied systems in the relevant range of PDL values. Thus, after some algebra we find that the mean of $\eta_r$ is

and its variance is

where $\vec{\Gamma}_0^{\parallel}$ and $\vec{\Gamma}'^{\parallel}$ are the components of $\vec{\Gamma}_0$ and $\vec{\Gamma}'$ in the direction of $\vec{\Delta}_{in}$ and with $\langle|\vec{\alpha}_j|^2\rangle$ specified by Eq. (11); it is approximately equal to the square of the mean PDL in decibels divided by $64N_s$. The SNR reduction ratio is

$$\eta_{SNR}=m_r-\left|\vec{\Gamma}_0^{\parallel}-\vec{\Gamma}'^{\parallel}\right|,$$

and its distribution density looks like the left wing of a Gaussian function,

The outage probability for any prescribed PDL margin is then given in terms of the Gaussian Q-function.
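The numerical factor 64 that relates $\langle|\vec{\alpha}_j|^2\rangle$ to the mean PDL can be checked with a few lines of arithmetic. The sketch assumes the small-PDL relation $\langle\rho^2\rangle\approx\gamma^2N_s\langle|\vec{\alpha}_j|^2\rangle$ (with $\rho$ in dB and independent span contributions), which is an assumption made here for the check, together with the relation $\langle\rho\rangle=\sqrt{8\langle\rho^2\rangle/(3\pi)}$ quoted earlier. The function name is hypothetical.

```python
import math

gamma = 20 / math.log(10)                 # dB per unit PDL-vector modulus
mean_sq_over_mean_sq = 3 * math.pi / 8    # <rho^2> / <rho>^2 for a Maxwellian modulus

# <|alpha_j|^2> = <rho^2> / (gamma^2 N_s) = <rho>^2 / (factor * N_s)
factor = gamma**2 / mean_sq_over_mean_sq  # equals 8 gamma^2 / (3 pi)

def mean_square_alpha(mean_pdl_db, N_s):
    """Per-span <|alpha_j|^2> for a given mean link PDL in dB."""
    return mean_pdl_db**2 / (factor * N_s)
```

The factor evaluates to a value very close to 64, so the quantity scales with the square of the mean PDL in dB, as dimensional consistency requires.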

The accuracy of the above analysis is demonstrated in Fig. 2 by comparison with numerical Monte-Carlo simulations of Eq. (10). In Fig. 2(a) the probability density function of $\eta_{SNR}$ is shown, whereas Fig. 2(b) shows the cumulative distribution of the SNR penalty, namely the probability that $\eta_{SNR}$ is smaller than a given value. The numerical results were obtained by simulating Eq. (10) directly, without any assumptions or approximations, and for all relevant pairs of points in an 8-PSK constellation. Identical results were obtained for the case of Q-PSK, demonstrating the independence of the exact constellation. For clarity of the figure, the displayed numerical data corresponds only to the case of 8-PSK. Monte-Carlo simulations were performed with $10^6$ fiber realizations and for the case of $N_s=10$. The results correspond to average PDL values of 1, 2 and 3 dB and in all cases the excellent agreement with the analytical results of Eq. (16) is evident. This agreement provides final evidence of the fact that the choice of modulation format is immaterial. Also, in this situation our early assumption that there is full temporal overlap between the symbols transmitted in the two polarizations does not introduce any inaccuracy into the obtained results. Only in the case of $\langle\rho\rangle=3$ dB can a small discrepancy between the approximate theory and simulation be observed in the tails of the distribution, and even then it is of insignificant magnitude. The system margin that needs to be allocated for PDL is easily obtained from the above theory. For a specified outage probability the margin can be defined as the SNR degradation whose probability of occurrence is equal to the outage probability. Thus, based on Eq. (16) we express the system margin in decibels as

where $Q^{-1}(\cdot)$ is the inverse Q-function. It is interesting to plot the allowed average PDL for a specified margin and outage probability. This information is provided in Fig. 4(a), where the bottom solid curve is obtained from the analytical relation (17) whereas the open circles around it represent simulation results. The curve was plotted for the same simulation parameters described in the context of Fig. 2 and for $P_{outage}=4\times10^{-5}$ (equivalent to approximately 20 minutes per year). Note that an average PDL of 3 dB is allowed only when margins in excess of 3 dB are acceptable, a fairly unrealistic situation in practical systems. The other data presented in Fig. 4 is related to the next section and it will be discussed there.
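The correspondence between the quoted outage probability and annual outage time is a one-line conversion. The sketch below assumes a 365-day year; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525600 minutes in a non-leap year

def outage_minutes_per_year(p_outage):
    """Expected outage time per year for a given outage probability."""
    return p_outage * MINUTES_PER_YEAR

minutes = outage_minutes_per_year(4e-5)
```

For an outage probability of $4\times10^{-5}$ this gives roughly 21 minutes per year, consistent with the figure of about 20 minutes quoted above.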

## 2.2. Dynamically optimized launch polarization

In advanced polarization multiplexed systems it is possible to optimize the launch states of polarization for the achievement of better performance. Since the PDL of the link changes gradually with time, such optimization needs to be performed dynamically with continuous feedback from the receiver. In a practical application, a control mechanism that continuously searches for the best launch polarization can be implemented quite reasonably. In contrast, the numerical simulation of this same process and the assessment of performance is very inefficient, since it requires the search for the optimal launch polarization in every realization of the PDL in the link. A much more practical approach is to note that the best launch polarization would have to be close to the polarization that minimizes the leading, first-order term in Eq. (12). Since $\Gamma_0$ is always well below unity for relevant PDL values, the first-order term is by far the most significant contributor to performance degradation. This is particularly true since the second-order term in Eq. (12) is independent of the launch polarization. This reasoning suggests that $\widehat{\Delta}_{in}$ should be in the plane that is orthogonal to the vector $\vec{\Gamma}_0-\vec{\Gamma}'$, but it does not specify its direction in that plane. As we show in what follows, the exact choice of direction is not very significant. That is because in this regime, with the first-order term eliminated, performance degradation is dictated not so much by the high-order terms in Eq. (12), but rather by the increased probability of events in which constellation points that are not minimally distant in the original space, namely those that differ in both polarization components at the input, become minimally distant following the effect of PDL.

In order to gain some insight into this situation, note that the relevant pairs of constellation points that differ in both polarizations and are candidates for becoming minimally distant in the re-normalized space are those for which the distances in each polarization, $|\Delta a_1|=|\Delta a_2|$, are equal to the minimum distance. For these pairs of points, the input unit Stokes vector ^{1} is of the form $\widehat{\Delta}'_{in}=(0,\sin(\theta),\cos(\theta))$, where $\theta$ can assume different values, depending on the modulation format that is used. Note that the orientation of $\widehat{\Delta}'_{in}$ in this case is in a plane that is orthogonal to the direction of $\widehat{\Delta}_{in}$ that corresponds to constellation points differing in only one polarization. Thus, launching the input signals in polarization states that are approximately orthogonal to $\vec{\Gamma}_0-\vec{\Gamma}'$ for the purpose of minimizing the first-order penalty term approximately maximizes the probability that pairs of points differing in both polarizations become minimally distant. To illustrate this point further, we consider two possible choices for $\widehat{\Delta}_{in}$. The first choice is to have $\widehat{\Delta}_{in}$ point in the direction specified by $\pm\vec{\Gamma}_0\times\vec{\Gamma}'$ so that it is orthogonal to both vectors. The second choice is to place $\widehat{\Delta}_{in}$ in the same plane in which $\vec{\Gamma}_0$ and $\vec{\Gamma}'$ reside, but such that it is orthogonal to the difference between them. In both cases the SNR penalty can be expressed as

where the minimization is over the two possible signs in $\widehat{\Delta}_{in}$ and over all allowed values of $\theta$ in $\widehat{\Delta}'_{in}$. The factor of 2 in the second term in Eq. (18) is due to the fact that for these events the PDL-free square distance is twice as large as the minimum square distance between points differing in only one polarization.
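The two candidate orientations for $\widehat{\Delta}_{in}$ described above can be constructed with cross products. The sketch below assumes NumPy and two non-parallel PDL vectors; the function name and the example vectors are hypothetical.

```python
import numpy as np

def launch_orientations(Gamma_0, Gamma_p):
    """Two unit vectors orthogonal to Gamma_0 - Gamma_p.

    first  : along Gamma_0 x Gamma_p, hence orthogonal to both vectors.
    second : in the plane spanned by Gamma_0 and Gamma_p, orthogonal to
             their difference.
    Both are defined up to sign; Gamma_0 and Gamma_p must not be parallel.
    """
    d = Gamma_0 - Gamma_p
    n = np.cross(Gamma_0, Gamma_p)        # normal to the plane of the two vectors
    first = n / np.linalg.norm(n)
    in_plane = np.cross(d, n)             # orthogonal to d and to the plane normal
    second = in_plane / np.linalg.norm(in_plane)
    return first, second

# Made-up example vectors.
Gamma_0 = np.array([0.20, 0.05, -0.10])
Gamma_p = np.array([0.10, -0.03, 0.04])
first, second = launch_orientations(Gamma_0, Gamma_p)
```

Either orientation removes the first-order penalty term, because both have zero projection on $\vec{\Gamma}_0-\vec{\Gamma}'$.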

^{1}Of course, technically we use a reference frame in which the launch states coincide with the $x$ and $y$ axes. Variation of launch polarization in the actual physical space therefore translates formally into the corresponding opposite rotation of the physical vectors $\vec{\Gamma}_0$, $\vec{\Gamma}'$, $\vec{\alpha}_j$, etc.

In Fig. 3(a) we show the cumulative distribution curves of $\eta_{SNR}$ in these two cases, corresponding to a mean PDL of 4 dB and to 8-PSK modulation. These curves were obtained from Monte-Carlo simulations of Eq. (10), similarly to the numerical calculations described in the context of Fig. 2, and performed for all the pairs of constellation points participating in the scheme. The dashed curves represent the simulation of Eq. (10) while ignoring pairs of constellation points differing in both polarizations. They illustrate the effect of the high-order terms in Eq. (12). The solid curves correspond to a simulation that takes all relevant pairs of constellation points into account. Notice how the solid curves are characterized by two distinct behaviors. While for small penalties the high-order terms dominate (as can be seen from the comparison between the dashed and solid curves), the more relevant range of high penalties is dominated by events in which the minimal distance corresponds to pairs that differ in both polarizations. In this range the two solid curves overlap almost perfectly, suggesting that as long as $\widehat{\Delta}_{in}$ is orthogonal to $\vec{\Gamma}_0-\vec{\Gamma}'$, the further optimization of its orientation has little significance. In what follows, the choice of $\widehat{\Delta}_{in}$ orthogonal to both $\vec{\Gamma}_0$ and $\vec{\Gamma}'$ is assumed. Figure 3(b) shows the cumulative distributions of $\eta_{SNR}$ for PDL values of 2, 3 and 4 dB, indicating fairly large penalties.

In Fig. 4(a), we show the tolerable average PDL for a specified system margin and assuming the standard outage probability of $4\times10^{-5}$. The case in which the launch polarization is optimized is represented by the top two curves. These curves correspond to 8-PSK and to Q-PSK modulation. The curve corresponding to the case without optimizing launch polarizations is also shown in the figure, as discussed earlier. Note that the improvement in PDL tolerance that is achieved by optimal polarization control is only of the order of 2 dB.

The fact that in the case of optimized launch polarization the results for 8-PSK and Q-PSK differ slightly is not surprising in view of our earlier discussion. Once events involving pairs of points that differ in both polarizations start affecting performance, the penalty becomes dependent on the choice of constellation. In general, it can be argued that the PDL tolerance decreases with the richness of the constellation. The reason is that in rich constellations the number of allowed angles $\theta$ in Eq. (18) increases, and thus the minimum distance can attain a smaller value. For example, in the case of M-PSK, the allowed values of $\theta$ are $\theta=2\pi j/M$ with $j=0,\ldots,M-1$, so that the vector $\widehat{\Delta}_{in}$ may point in $M$ equally spaced angles in the plane $\hat{S}_1=0$. Thus, the larger $M$, the more options there are for $\widehat{\Delta}'_{in}$ to coincide with $\vec{\Gamma}_0-\vec{\Gamma}'$. This notion has an important implication: for a given mean PDL, the outage probability may scale at most linearly with $M$. In practice, linear scaling is quite insignificant for outage probabilities in the relevant range. Therefore, even in cases where optimal state launch is performed, the exact constellation is of minimal importance in terms of PDL tolerance. This is evident from Fig. 4(a), where the difference between the Q-PSK and 8-PSK curves is shown to be very small, and it is also illustrated in Fig. 4(b), where the cumulative probability curves corresponding to Q-PSK and 8-PSK are plotted for $\langle\rho\rangle=4$ dB. Consistent with the above argument, the probability of a given penalty in the Q-PSK case is found to be smaller by slightly less than a factor of 2 in the wings of the distribution.
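The linear-scaling argument can be made concrete with a short Python sketch. It lists the allowed M-PSK angles $\theta=2\pi j/M$ and applies a union bound: if each of the $M$ allowed orientations leads to an outage with some small probability, the probability that at least one of them does so is at most $M$ times larger. The function names and the per-orientation probability `p_single` are hypothetical and chosen only for illustration; they are not values taken from the simulations.

```python
import numpy as np


def mpsk_angles(M):
    """Allowed angles theta = 2*pi*j/M for j = 0, ..., M-1."""
    return 2.0 * np.pi * np.arange(M) / M


def outage_union_bound(p_single, M):
    """Union bound on the probability that at least one of M events,
    each of probability p_single, occurs (capped at 1)."""
    return min(1.0, M * p_single)


p_single = 1e-5  # ASSUMPTION: illustrative per-orientation probability

for M in (4, 8):  # Q-PSK and 8-PSK
    angles_deg = np.degrees(mpsk_angles(M))
    bound = outage_union_bound(p_single, M)
    print(f"M = {M}: angles (deg) = {angles_deg}, outage bound = {bound:.1e}")

# Ratio of the two bounds: the most the outage probability can grow
# when going from Q-PSK to 8-PSK under this argument.
ratio = outage_union_bound(p_single, 8) / outage_union_bound(p_single, 4)
print(f"8-PSK / Q-PSK bound ratio = {ratio:.1f}")
```

Because the union bound is an upper limit, the ratio of 2 between the 8-PSK and Q-PSK bounds is compatible with the slightly smaller factor observed in the wings of the distributions in Fig. 4(b).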

## 3. Conclusions

We have studied the penalties induced by PDL in coherent polarization-multiplexed optical systems with advanced signal processing. Such systems are rapidly becoming practical and are widely proposed for future fiber-optic links. An analytical framework for considering the effects of PDL in such systems, and for shedding light on the various properties of this phenomenon, was presented. We showed that the PDL-induced penalties are practically independent of the type of constellation, or modulation format, used in each of the polarizations. In systems that do not use active optimization of launch polarizations, the PDL penalty is accurately described by a simple analytical formula. The presented analytical framework was also used for simple numerical extraction of PDL-induced penalties in systems where active control of launch polarizations is applied. The gain from using active polarization control is limited and is estimated to be about 2 dB.

## Acknowledgment

The author is pleased to acknowledge Dr. Michael Eiselt for asking a question that led to this research.

## References and links

**1. **J. Renaudier, G. Charlet, M. Salsi, O. B. Pardo, H. Mardoyan, P. Tran, and S. Bigo, “Linear Fiber Impairments Mitigation of 40-Gbit/s Polarization-Multiplexed QPSK by Digital Processing in a Coherent Receiver,” J. Lightwave Technol. **26**, 36–42 (2008).

**2. **L. E. Nelson, S. L. Woodward, M. D. Feuer, X. Zhou, P. D. Magill, S. Foo, D. Hanson, D. McGhan, H. Sun, M. Moyer, and M. O’Sullivan, “Performance of a 46Gbps dual polarization QPSK transceiver in a high-PMD fiber transmission experiment,” Optical Fiber Communications conference, Paper PDP9, OFC San Diego (2008).

**3. **C. Laperle, B. Villeneuve, Z. Zhang, D. McGhan, H. Sun, and M. O’Sullivan, “WDM Performance and PMD Tolerance of a Coherent 40-Gbit/s Dual-Polarization QPSK Transceiver,” J. Lightwave Technol. **26**, 168–175 (2008).

**4. **H. Sun, K.-T. Wu, and K. Roberts, “Real-time measurements of a 40 Gb/s coherent system,” Opt. Express **16**, 873–879 (2008).

**5. **A. Mecozzi and M. Shtaif, “Signal-to-noise-ratio degradation caused by polarization-dependent loss and the effect of dynamic gain equalization,” J. Lightwave Technol. **22**, 1856–1871 (2004).

**6. **M. Shtaif and A. Mecozzi, “Polarization-dependent loss and its effect on the signal-to-noise ratio in fiber-optic systems,” IEEE Photon. Technol. Lett. **16**, 671–673 (2004).

**7. **I. T. Lima, A. O. Lima, Yu Sun, Hua Jiao, J. Zweck, C. R. Menyuk, and G. M. Carter, “A receiver model for optical fiber communication systems with arbitrarily polarized noise,” J. Lightwave Technol. **23**, 1478–1490 (2004).

**8. **A. Mecozzi and M. Shtaif, “The statistics of polarization dependent loss in optical communication systems,” IEEE Photon. Technol. Lett. **14**, 313–315 (2002).

**9. **J. P. Gordon and H. Kogelnik, “PMD fundamentals,” Proc. Natl. Acad. Sci. **97**, 4541–4550 (2000).

**10. **A. Galtarossa and L. Palmieri, “Spatially Resolved PMD Measurements,” J. Lightwave Technol. **22**, 1103–1105 (2004).

**11. **B. Huttner, C. Geiser, and N. Gisin, “Polarization-induced distortions in optical fiber networks with polarization-mode dispersion and polarization-dependent losses,” IEEE J. Sel. Top. Quantum Electron. **6**, 317–329 (2000).

**12. **This is what happens when each channel passes through different optical routes before being multiplexed into the transmission fiber.