## Abstract

Modal dispersion (MD) in a multimode fiber may be considered as a generalized form of polarization mode dispersion (PMD) in single mode fibers. Using this analogy, we extend the formalism developed for PMD to characterize MD in fibers with multiple spatial modes. We introduce a MD vector defined in a *D*-dimensional extended Stokes space whose square length is the sum of the square group delays of the generalized principal states. For strong mode coupling, the MD vector undertakes a *D*-dimensional isotropic random walk, so that the distribution of its length is a chi distribution with *D* degrees of freedom. We also characterize the largest differential group delay, that is the difference between the delays of the fastest and the slowest principal states, and show that it too is very well approximated by a chi distribution, although in general with a smaller number of degrees of freedom. Finally, we study the spectral properties of MD in terms of the frequency autocorrelation functions of the MD vector, of the square modulus of the MD vector, and of the largest differential group delay. The analytical results are supported by extensive numerical simulations.

© 2012 Optical Society of America

## 1. Introduction

With the fiber-optic communications industry facing an imminent capacity crunch [1, 2], the search for new, scalable technologies has become unavoidable. One of the most promising approaches for preventing the predicted crisis is the use of spatially multiplexed transmission [3] in a combination of multi-core (MCF) [4] and multi-mode (MMF) fibers [5, 6]. Indeed, a notable fraction of regular and post-deadline papers presented in the last two largest conferences on optical communications (the Optical Fiber Communications conference (OFC), Los Angeles 2011, and the European Conference on Optical Communications (ECOC), Geneva 2011) were devoted to spatially multiplexed optical transmission.

Conceptually, provided that all fiber modes are selectively addressed at transmitter and receiver, spatially multiplexed fiber-optic transmission is a direct extension of polarization multiplexed transmission in single mode fibers that is being used today. To describe a fiber with *N* spatial modes, the 2-dimensional polarization vector needs to be replaced by a 2*N* dimensional vector whose elements represent the excitation of the various modes (with the factor of 2 accounting for the fact that each mode supports two polarizations). Correspondingly, the 2 × 2 transfer matrix that describes linear transmission in single mode fibers needs to be replaced by a matrix whose dimension is 2*N* × 2*N*. This situation calls for the extension of the concepts that are at the heart of polarization studies. Thus, rather than referring to the phenomena of polarization mode dispersion (PMD) and polarization dependent loss (PDL), which describe the relevant consequence of polarization coupling, one now has to deal with the more general concepts of modal dispersion (MD) and mode dependent loss (MDL). Relevant work on this topic has already been reported, where the concept of principal states of polarizations was generalized to the multi-mode case [7] and where the statistics of the corresponding group delays was derived [8]. Important observations on the dynamics of MDL were also reported in [9]. Nonetheless, the understanding of the physical nature of these phenomena can greatly benefit from the generalization of the analytical tools, which were invaluable in polarization studies.

Polarization vectors, which have two complex-number components, reside in a so-called Jones space. This Jones space is isomorphic to a so-called Stokes space which consists of 3 dimensional real-valued column vectors [10]. Thus, every state of polarization can be uniquely represented either as a Jones vector, or as a vector in Stokes space. In addition, unitary operations in Jones space can be represented by a Stokes vector, whose direction is the axis of rotation and whose modulus equals the rotation angle. For this reason, the effect of birefringence (which is a unitary operation in Jones space) is customarily represented by a 3-dimensional vector *β⃗* in Stokes space, known as the *birefringence vector*. Specifying the birefringence of a fiber therefore comes down to specifying the birefringence vector *β⃗*(*z*) that represents it at every point *z* along the fiber. Similarly, the effect of PMD is customarily represented by a vector *τ⃗*in Stokes space, known as the PMD vector, capturing birefringence-induced waveform distortion. The Stokes representations of the slowest and fastest states of polarization (known as the principal states of polarization) coincide with ±*τ̂* (where *τ̂* = *τ⃗/τ* is a unit vector), and the differential group delay (DGD) between the slowest and fastest polarization components is equal to *τ* = |*τ⃗*|. Even non-unitary phenomena such as PDL are typically represented in terms of vectors in Stokes space [11]. It is thus natural for the analysis of spatially multiplexed transmission to seek an extension of the above concepts to the multi-mode case.

In this paper, we introduce an extended *D* = 4*N*^{2} – 1 dimensional Stokes space, that allows the representation of multi-mode 2*N*-dimensional state-vectors (which are the extension of the 2-dimensional Jones vectors in the case of polarization). While generalized Stokes representations have been considered previously in other contexts [12, 13], the properties of the generalized Stokes space and the relation to unitary dynamics (which is of particular importance in our case) have not been fully formulated previously, to the best of our knowledge. We show that while the extended Stokes-space lacks some of the properties of the usual Stokes-space describing polarization, many properties can be generalized in a way that sheds light on the physics of multi-mode propagation. In particular, focusing on the case where MDL is negligible, we show that the local birefringence vector *β⃗* and the PMD vector *τ⃗* can be extended to the *D*-dimensional space so as to capture the effects of local mode coupling and MD. As in the case of PMD, the MD vector *τ⃗* determines the temporal spread of propagating pulses, although the relation between its magnitude and the differential mode delays is somewhat less direct in the multi-mode case, as we shall see. In order to demonstrate the use of the proposed formalism we consider the case of quasi-degenerate modes, namely transmission over multiple spatial modes with varying degrees of coupling, but with small deterministic difference between the various group delays, such that the delay spread observed at the output is always dominated by the random coupling caused by perturbations. In addition, our theory applies to the case of arbitrary deterministic group delay difference, as long as the coupling between the modes is strong. The statistics of mode dispersion in this scenario are identical to that of strongly coupled degenerate modes. This framework includes real-world scenarios in which groups of modes propagating at different group velocities are processed separately [14]. For example, this corresponds to the case in which the four degenerate LP_{11} modes of a step-index fiber are processed separately from its two LP_{01} modes. Larger groups of degenerate modes can be encountered in situations where multiple MMFs are combined in a MCF structure. Our analysis does not rely on a specific choice of the set of modes that are used to describe signal propagation in the optical fiber and it is valid as long as the quasi-degeneracy conditions are satisfied. For example, in the case of a single-core multi-mode fiber one could choose to work with either the exact (hybrid) fiber modes, or with the linearly polarized mode approximation, with no consequence to the validity of the results.

The source of MD within a group of degenerate modes is random fiber imperfection and environmental perturbation that remove the degeneracy and cause group-delay spread. While MD can in principle be compensated for using digital signal processing (DSP) [15], the requirements from the DSP hardware and algorithm, as well as the overall system performance strongly depend on the statistical properties of MD, which we extract in this paper. In what follows we will use the acronym MMF in order to refer collectively to all structures in which light can propagate in multiple spatial modes, whether by means of a single multi-mode waveguide, or through multiple cores.

## 2. Main results

Prior to delving into the analytical details of the proposed model, we devote the present section to a summary of the main results.

#### 2.1. Generalized Stokes space and modal dispersion vector

An important outcome of this work is the introduction of a generalized *D*-dimensional Stokes space (with *D* = 4*N*^{2} – 1) and the definition of a *D*-dimensional MD vector *τ⃗*, as derived in Section 3. Knowledge of *τ⃗* allows the extraction of the principal modes of the system as well as their various group delays, although the relations between those quantities and *τ⃗* are less transparent than they are in the case of PMD. As shown in Section 3.2, the quantity *τ*^{2}, which is the square modulus of *τ⃗*, is proportional to the sum of the squares of the 2*N* individual group delays *t _{i}*,

The individual group delays are defined with the mode averaged delay set to 0, ∑* _{j} t_{j}* = 0. The MD vector

*τ⃗*evolves as a Gaussian vector, such that its modulus is chi distributed (its square modulus obeys the chi-square distribution). If no spatial mode coupling exists (while still allowing for polarization coupling within each spatial mode), only 3

*N*of the components of

*τ⃗*are different from zero, and the distribution of the modulus

*τ*is chi distributed with 3

*N*degrees of freedom. In the other extreme, i.e. in the presence of significant random coupling among all

*N*spatial modes (as well as among the 2 polarizations within each spatial mode), the number of degrees of freedom is 4

*N*

^{2}– 1 implying that the orientation of

*τ⃗*is homogenously distributed in the entire generalized Stokes space, and the chi distribution has 4

*N*

^{2}– 1 degrees of freedom. Note that the chi distribution is a special case of the Nakagami distribution, which is famous in the context of wireless communications [16]. Interestingly, in the presence of partial coupling, the distribution of

*τ*can be well approximated as a chi distribution with an effective number of degrees of freedom

*D*

_{eff}, ranging between 3

*N*and 4

*N*

^{2}– 1. This transition is illustrated in Fig. 1, where we plot the probability density function (PDF) of

*τ*for different coupling regimes for the case

*N*= 2. In (a) the spatial modes are uncoupled (

*D*

_{eff}= 3

*N*) and in (d) there is full coupling (

*D*

_{eff}= 4

*N*

^{2}– 1); plots (b) and (c) represent different degrees of partial coupling. The dots are the results of Monte Carlo simulations and the solid curves correspond to chi probability density functions with

*D*

_{eff}degrees of freedom. The details of the computation are discussed in Section 4.

#### 2.2. Maximum modal delay spread

A quantity of particular relevance in the analyzed scenario is the largest *differential* group delay, defined as the difference between the largest and the smallest group delays, *T* = max* _{k}*{

*t*} – min

_{k}*{*

_{k}*t*}, as discussed in Section 3.2. This parameter defines the maximum spread of the propagating waveforms and is therefore related to the required complexity of the digital signal processing hardware (e.g. the number of taps of a finite impulse response equalizer). While the derivation of the distribution of

_{k}*T*in the general case seems to be a very difficult task [8, 9], the chi distribution turns out to provide an excellent approximation, as illustrated in Fig. 2 for the case of full coupling. The parameters 〈

*T*

^{2}〉 and

*K*characterizing the distribution were obtained numerically and are plotted in the top panel of Fig. 2 as a function of the number of spatial modes

_{N}*N*. The term 〈

*T*

^{2}〉 is the mean square value of

*T*, whereas

*K*is what we refer to as the

_{N}*shape parameter*of the chi-distribution used to fit the PDF of

*T*[17]. The dependence of 〈

*T*

^{2}〉 and

*K*on

_{N}*N*is very well approximated by the following heuristic expressions

*x*⌉ denotes the smallest integer greater than

*x*. The approximate curves are not displayed in Fig. 2, as they are practically indistinguishable from the actual curves in the plotted scale. The curves in the bottom panels illustrate the accuracy of this approximation on a logarithmic scale for three values of

*N*. The plots show that, as the number of spatial modes increases, the chi distribution becomes more and more symmetric around the root-mean square value of

*T*, asymptotically approaching the Gaussian distribution. The analytic expression provided in [8] for the case of 2

*N*= 3 (which unfortunately does not correspond to a practical multi-mode scenario) also agrees very closely with the chi approximation, the differences between the two curves being practically unnoticeable.

#### 2.3. Autocorrelation functions

The spectral behavior of MD is conveniently quantified in terms of the autocorrelation functions of the MD-vector and of its modulus. The width of these functions is a measure of the bandwidth across which the distortion affecting different spectral components of the transmitted optical field are statistically correlated, and it is also defined as the bandwidth across which the first-order approximation [10] applies to the interpretation of MD effects. In the regime of strong coupling we were able to derive analytic expressions for the frequency autocorrelation function of the MD vector *τ⃗*,

*τ*

^{2},

*N*= 1, that is

*D*= 3. The ACF of

*τ⃗*, Eq. (4), normalized to its peak value, is plotted versus

*ω*(〈

*τ*

^{2}〉/

*D*)

^{1/2}by solid lines in the top panels of Fig. 3 for

*N*= 2,

*N*= 3 and

*N*= 4. The stars are the results of Monte Carlo simulations that we performed to test the analytical result. The corresponding ACF of

*τ*

^{2}, Eq. (5), after subtraction of its asymptotic value for

*ω*→ ∞ and normalization to its peak value, is plotted by solid lines in the bottom panels of Fig. 3, where the stars are again the results of Monte Carlo simulations. We numerically verified that the autocorrelation function of

*τ*,

*R*(

_{τ}*ω*) = 〈

*τ*(Ω +

*ω*)

*τ*(Ω)〉, whose analytic expression is not available, is approximated very well by Eq. (5), provided that their asymptotic values are removed and that they are normalized to their peak values. More remarkably, we verified that Eq. (5), re-scaled and modified by replacing

*D*with

*K*, gives an excellent approximation of the autocorrelation function of the largest

_{N}*differential*group delay. The ACF of

*T*is plotted in the bottom panels of Fig. 3 by circles and the modified Eq. (5) by dot-dashed lines: although the modification of the re-scaled ACF of

*τ*

^{2}is practically unnoticeable for small values of

*N*, an excellent match between simulations and analytical expressions characterizes all cases. The bandwidth of the largest

*differential*group delay

*B*can thus be very well approximated as the 3dB width of the modified ACF (5) ${B}_{T}\simeq 3.2\sqrt{{K}_{N}/\u3008{\tau}^{2}\u3009}$, which, for a moderate, yet practically important, number of spatial modes (∼

_{T}*N*< 5), is only negligibly different from ${B}_{T}\simeq 3.2\sqrt{D/\u3008{\tau}^{2}\u3009}$.

Analytical expressions corresponding to Eq. (4) and (5) in the regime of partial coupling are not available. However, we found through numerical simulations that the intriguing result illustrated in Fig. 1 (i.e., the good match to a chi distribution independent of *D*_{eff}) extends somewhat to the MD correlation properties. Indeed, the shape of the autocorrelation function of *τ⃗* and *τ*^{2} is not affected by the coupling magnitude, whereas its bandwidth increases as the coupling magnitude increases and scales, yet only approximately, with
$\sqrt{{D}_{\text{eff}}}$.

#### 2.4. Non-degenerate modes

Figures 1(d), 2 and 3 were plotted for the case of *N* degenerate fully coupled modes with no MDL. This situation differs somewhat from reality, where there are groups of degenerate modes that are characterized by deterministically different wave-numbers and hence they couple to each other significantly less as a result of perturbations. Using the characteristic example of the coupling between the LP_{01} and the LP_{11} groups of modes in step-index fibers [20], we show that the results presented above remain practically unchanged. To illustrate this point, we show in Fig. 4(a) the PDF of the length of the *D*_{LP11} = 15 dimensional MD vector *τ⃗*_{LP11} that represents the four LP_{11} modes and the PDF of the length of the *D*_{LP01} = 3 dimensional MD vector *τ⃗*_{LP01} representing the two polarization modes of LP_{01}. In Figs. 4(b) and 4(c) we show the corresponding MD vector ACFs. In order to plausibly emulate the situation in a real fiber [20], we adjusted the magnitude of the difference between the wave-numbers corresponding to the two groups of modes to the point where the overall output power in the LP_{11} modes was about 10dB lower than the output power in the LP_{01} mode, given that only the LP_{01} mode was excited. As is evident from the figures, the numerically obtained distribution of *τ*_{LP11} and *τ*_{LP01} (symbols) is still very well fitted by chi distributions of *D*_{LP11} = 15 and *D*_{LP01} = 3 degrees of freedom respectively (solid lines). At the same time the numerical ACF (symbols) still matches the theoretical curves (solid lines) given by Eq. (4) very well. The individual MD vectors were extracted by considering the unitary part of the corresponding 4 × 4 and 2 × 2 blocks along the main diagonal of the simulated overall (6 × 6) transfer matrix.

## 3. Theory

Following [10], we use the notation |*ψ*(*z,ω*)〉 to represent the state of the field in what we refer to as the generalize Jones space, at angular frequency *ω* at a given point *z* along the optical fiber. One should interpret |*ψ*(*z*)〉 as a column vector with 2*N* complex components representing the excitation of the various modes in the fiber. The corresponding row vector, which is the hermitian conjugate of |*ψ*(*z*)〉, is denoted by 〈*ψ*(*z*)| = (|*ψ*(*z*)〉)^{†}. For convenience we will also assume in what follows that the state vector |*ψ*(*z*)〉 is normalized, such that 〈*ψ*(*z*)|*ψ*(*z*)〉 = 1. Propagation between the positions *z*_{0} and *z* along the fiber is represented by a linear operator **U**(*z,z*_{0}) so that

**U**(

*z,z*

_{0}) and |

*ψ*(

*z*)〉 on

*ω*was suppressed for brevity of notation. When considering the process of propagation in the frequency domain,

**U**(

*z,z*

_{0}) can be conveniently represented by a 2

*N ×*2

*N*frequency dependent matrix, which is unitary at each frequency in the absence of MDL, such that

**U**(

*z*

_{0}

*, z*) =

**U**

^{−1}(

*z,z*

_{0}) =

**U**

^{†}(

*z, z*

_{0}). The space and frequency evolution of

**U**(

*z,z*

_{0}) can be expressed most generally as

**B**(

*z*) and

**T**(

*z,z*

_{0}) are frequency dependent Hermitian matrices. Their Hermiticity is required for the unitarity of

**U**(

*z,z*

_{0}) to be preserved at all

*z*and

*ω*. The multi-mode propagation problem of Eqs. (7) and (8) can be conveniently approached by extending the Pauli-matrix formalism to the multi-mode case as we show in the section that follows.

#### 3.1. Trace orthogonal matrices and the generalized Stokes space

A generic 2*N ×* 2*N* Hermitian matrix **H** contains 4*N*^{2} entries. Of these, 2*N* are necessarily real (on the main diagonal) and the remaining 2*N*(2*N* – 1) come in complex conjugate pairs. Thus 4*N*^{2} real numbers are necessary to completely define a 2*N ×*2*N* Hermitian matrix. This is hence the number of basis elements in the linear space of 2*N ×* 2*N* Hermitian matrices. A convenient basis for this space is the one consisting of the identity matrix **1** and 4*N*^{2} – 1 matrices Λ* _{j}* having zero trace and satisfying the condition of trace-orthogonality

*δ*is the Kronecker delta function. Equation (9) can be shown to define an inner product in the space of Hermitian matrices. An example of a trace-orthogonal basis in the case of

_{i,j}*N*= 1 is the identity with the three Pauli matrices that are used for the study of PMD [10]. The choice of a trace-orthogonal basis is never unique, and many algorithms for constructing one are possible. The Gell-Mann method [21] is one known method of construction. Its modification, which is slightly more convenient for our purposes, is described in the appendix. As should be expected, the choice of basis is immaterial for any of the physically meaningful results, and the only guideline for its selection is the simplicity of calculations. Once a basis is chosen, the matrix

**H**is given by a superposition of the basis elements. A compact representation is the following

*η*⃗ is a vector of 4

*N*

^{2}– 1 real elements

*η*, Λ⃗ is a vector whose elements are the matrices Λ

_{j}*, and the dot operator is implemented as*

_{j}*η*⃗ can be extracted by using the trace-orthogonality of the Λ

*matrices,*

_{j}*η*

_{0}represents a mode independent quantity, whereas all mode related phenomena are captured by the traceless content of

**H**, given by the term

*η*⃗

*·*Λ⃗. The vectors

*η*⃗ reside in a 4

*N*

^{2}– 1 dimensional space which is a generalization of the Stokes space used in the analysis of polarizations (

*N*= 1). In what follows we will refer to vectors in this space as to

*generalized Stokes vectors*, to be distinguished from the vectors |

*ψ*〉 to which we refer as

*state-vectors*. Since a projection operator |

*ψ*〉〈

*ψ*| onto a normalized state-vector |

*ψ*〉 is a Hermitian operator, it too can be expressed in the form

*ψ*⃗ is obtained from |

*ψ*〉 via Eq. (12), such that its components are given by

*ψ*|

*ϕ*〉|

^{2}= trace{|

*ψ*〉〈

*ψ*|

*ϕ*〉〈

*ϕ*|} and substituting the right-hand-side of (13) instead of the individual projection operators, one finds that

**1**} = 2

*N*, trace{Λ

*} = 0, as well as the identity*

_{j}*ϕ*〉 = |

*ψ*〉 in Eq. (15), one finds that |

*ψ*⃗|

^{2}= 2

*N*− 1, implying that only in the case

*N*= 1, the Stokes representation of a normalized state-vector is a unit vector. If the state vectors are orthogonal to each other, Eq. (15) indicates that the product of their generalized Stokes vectors is equal to −1. Consequently, the angle

*α*between two generalized Stokes vectors that correspond to orthogonal states satisfies cos(

*α*) = − (2

*N*− 1)

^{−1}, and the vectors are antipodal only in the case

*N*= 1. The fact that 0 ≤ |〈

*ψ*|

*ϕ*〉|

^{2}≤ 1 implies that −1 ≤

*ψ*⃗

*· ϕ*⃗

*≤*2

*N*− 1. The left equality occurs when the state-vectors are orthogonal to each other, whereas the right equality occurs when |

*ψ*〉 = |

*ϕ*〉. It is evident that in the case

*N >*1 not all generalized Stokes vectors with length $\sqrt{2N-1}$ have a state vector that corresponds to them. For example, if

*ϕ*⃗ = −

*ψ*⃗ the product

*ψ*⃗

*· ϕ*⃗ = −2

*N*+ 1 is smaller than −1 so that at least one of the two generalized Stokes vectors cannot represent a legitimate state. This observation can also be made directly from Eq. (13), noting that the projection operator |

*ψ*〉〈

*ψ*| has only two eigenvalues. A non-degenerate eigenvalue equal to 1 corresponds to the eigenstate |

*ψ*〉 itself and another 2

*N*− 1 fold degenerate eigenvalue is equal to 0. The eigenstates of the latter eigenvalue are the 2

*N*− 1 state vectors that are orthogonal to |

*ψ*〉. Thus,

*ψ*⃗ is a legitimate representation of a state-vector only if the matrix

*ψ*⃗

*·*Λ⃗ has a 2

*N*−1 degenerate eigenvalue equal to −1 and exactly one non-degenerate eigenvalue, equal to 2

*N*− 1. It is obvious that only a small fraction of Stokes vectors satisfies this requirement when

*N >*1.

As discussed in Section 3.2, a relevant question concerns the relation between a generalized Stokes vector *S*⃗ and the eigenvalues of the matrix *S*⃗ *·* Λ⃗. In the case *N* = 1 the two eigenvalues are *±*|*S*⃗| and the corresponding eigenvectors are those whose Stokes representation is parallel, or anti-parallel to the direction of *S*⃗. In the case of general *N*, there is no such simple relationship. The total number of eigenvalues is 2*N*, and the orientation of *S*⃗ usually does not correspond to a legitimate state-vector. Nonetheless, the following relationship can be established. Given that |*ψ _{i}*〉 is an eigenvector of

*S*⃗ · Λ⃗, thereby satisfying

*S*⃗ · Λ⃗|

*ψ*〉 =

_{i}*θ*|

_{i}*ψ*〉, where

_{i}*θ*is the corresponding eigenvalue, multiplication by 〈

_{i}*ψ*| from the left and using (14) yields

_{i}*S⃗*, whereas the smallest eigenvalue corresponds to the eigenvector whose generalized Stokes representation is least aligned with it. A more useful relationship follows from noting that ${\theta}_{i}^{2}$ is an eigenvalue of (

*S*⃗ · Λ⃗)

^{2}and hence that $\sum {\theta}_{i}^{2}=\text{Trace}\left\{{\left(\overrightarrow{S}\cdot \overrightarrow{\Lambda}\right)}^{2}\right\}$. Using Eq. (16), this equality produces the relation As we shall see in what follows, this relation gives a physical meaning to the vectors that will be used to represent the matrices associated with propagation in the fiber.

#### 3.2. Mode coupling and the mode-dispersion vector

Using Eq. (10), Eqs. (7) and (8) can be conveniently reexpressed as

*β*

_{0},

*τ*

_{0},

*β⃗*and

*τ⃗*on

*z*and

*ω*was suppressed for the simplicity of notation. Note that

*β*

_{0}and

*τ*

_{0}introduce a phase and group delay that is common to all the propagation modes in the fiber. These terms are therefore immaterial to our study of MD and their values can be set to zero. The vectors

*β⃗*and

*τ⃗*are the two most important quantities for the study of modal dispersion in optical fibers. The first vector,

*β⃗*is a generalization of the birefringence vector, as it describes the local coupling between the various modes. If one chooses to use the basis presented in Appendix A, then the first

*N*triplets of the components of

*β⃗*correspond to the birefringence vectors of the

*N*individual propagation modes. Namely, (

*β*

_{1},

*β*

_{2},

*β*

_{3}) is the birefringence vector of the first mode, (

*β*

_{4},

*β*

_{5},

*β*

_{6}) is the birefringence vector of the second mode and so on. The remaining 4

*N*

^{2}− 3

*N*− 1 terms of

*β⃗*represent coupling between spatial modes, or the difference in the spatial modes’ propagation vectors. The vector

*τ⃗*is the direct generalization of the PMD vector [10] and it captures the essentials of MD.

Equations (19) and (20) can be used to establish the concatenation rule satisfied by the MD vector *τ⃗*. To this end let us consider an assembly of two concatenated fiber sections characterized by the two matrices **U**_{1} and **U**_{2}. The matrix describing their concatenation is **U** = **U**_{2}**U**_{1}. One may express the combined MD vector *τ⃗* of the two fibers as

*τ⃗*

_{1}and

*τ⃗*

_{2}are the MD vectors of the first and second fiber section, respectively. The concatenation rule (21) can be stated as follows: The joint MD vector equals the MD vector of the first section rotated by the coupling matrix of the second section and supplemented by the MD vector of the second section alone. This description is identical to that used in the case of PMD [10], although the word rotation is applied more loosely in the present case, indicating an operation that does not change the length of a generalized stokes vector. To see that this is indeed the case with the second term in (21), one may define ${\overrightarrow{\tau}}_{1}^{(R)}\cdot \overrightarrow{\Lambda}={\mathbf{U}}_{2}\left({\overrightarrow{\tau}}_{1}\cdot \overrightarrow{\Lambda}\right){\mathbf{U}}_{2}^{\u2020}$ and take advantage of identity (16) with

*A⃗*=

*B⃗*to show that

**U**

_{1}represents propagation from

*z*

_{0}to

*z*whereas

**U**

_{2}represents an incremental evolution between the points

*z*and

*z*+ d

*z*. Using Eq. (19) one finds that to first order in d

*z*,

**U**

_{2}=

**1**+ (

*i*/2

*N*)

*β⃗*(

*z*,

*ω*)

*·*Λ⃗d

*z*. Substituting this into (21) produces

*β⃗*=

_{ω}*∂β⃗*/

*∂ω*. This expression reduces to Eq. (6.15) of [10] in the case of

*N*= 1. The solution to Eq. (23) is formally identical to Eq. (7.22) of [10],

**R**(

*L,z*) is a

*D × D*norm-preserving matrix accounting for the evolution of the infinitesimal contribution

*β⃗*(

_{ω}*z*)d

*z*from

*z*to the fiber end at

*z*=

*L*and related to

**U**(

*L,z*) by

*τ⃗*is provided by Eqs. (17), (18). Indeed, if we use

*S⃗*=

*τ⃗*/(2

*N*), then the eigenvalues

*θ*coincide with the group delays of the individual principal states

_{i}*t*and Eq. (17) becomes

_{i}*t*=

_{i}*τ⃗*·

*ψ⃗*/(2

_{i}*N*), thus showing that these are bounded between $-\tau \sqrt{2N-1}/(2N)$ and $\tau \sqrt{2N-1}/(2N)$. On the other hand, Eq. (18) becomes which gives the magnitude of the MD vector as proportional to the sum of the square group delays. For

*N*= 1,

*τ*is equal to the differential group delay |

*t*

_{2}−

*t*

_{1}|, as expected.

An alternative and much simpler formulation notationally follows by defining a cross-product operation in the generalized Stokes space. The Hermiticity and the zero-trace property of the matrices Λ* _{i}* imply that

*i*(Λ

*Λ*

_{i}*− Λ*

_{j}*Λ*

_{j}*) /(2*

_{i}*N*) is also a traceless Hermitian matrix. Hence it can be expanded as

*i*(Λ

*Λ*

_{i}*− Λ*

_{j}*Λ*

_{j}*) /(2*

_{i}*N*) = ∑

_{k}*f*Λ

_{i,j,k}*, with*

_{k}*f*being a set of real-valued structure constants,

_{i,j,k}*e⃗*are the orthogonal unit-length basis vector of the

_{i}*D*-dimensional generalized Stokes space [24]. Equation (28) can be used to re-express the second term on the right-hand-side of Eq. (23) as (

*β⃗ ×*

*τ⃗*)

*·*Λ

*⃗*, so that Eq. (23) simplifies to

*S⃗*= 〈

*ψ*(0)|

**U**(

*z,*0)

^{†}Λ⃗

**U**(

*z,*0)|

*ψ*(0)〉 with respect to

*z*and

*ω*, and by using Eqs. (19) and (20), the evolution of the generalized Stokes vector in space and frequency can be expressed as

## 4. Statistical analysis

In order to account for the statistical properties of MD, we assume that the distribution of the mode coupling vector *β⃗* can be adequately modeled as white Gaussian noise. Although this assumption is not strictly satisfied even in single-mode optical fibers [22], it was shown to produce a very accurate description of PMD when the overall fiber length is significantly larger than the correlation length of the local birefringence. In principle, the white noise model is viable whenever the correlation length of the mode coupling vector is very small relative to the overall system length. Since the source of perturbations (geometrical distortions and mechanical stress) are similar to those experienced by single-mode fibers, it is expected that the correlation length of *β⃗*(*z*) is of a similar order of magnitude as the correlation length of the polarization birefringence vector in single-mode fibers. Using this assumption, it is quite obvious that the MD vector *τ⃗* is Gaussian and evolves as a Wiener process in *z*, which is the continuous limit of a random walk. This conclusion can be readily drawn from Eq. (24) which shows that *τ⃗* is the sum of rotated and statistically independent increments of *β⃗ _{ω}*. Exploiting similar arguments to those made in the analysis of PMD statistics [23], one can conclude that the vector

*τ⃗*(

*ω,z*) experiences a

*D*dimensional isotropic random walk, hence it is Gaussian distributed and its modulus

*τ*obeys a chi distribution. An important consequence of Eq. (24) is that the mean-square MD value 〈

*τ*

^{2}〉 is equal to 〈

*τ*

^{2}〉 =

*γ*

^{2}

*z*, where

*γ*

^{2}is defined through the relation 〈

*β⃗*(

_{ω}*z*)·

*β⃗*(

_{ω}*z*′)〉 =

*γ*

^{2}

*δ*(

*z*−

*z*′). In the case of partial coupling,

*τ⃗*is still Gaussian if

*β⃗*(

*z*) is a white noise process, yet its distribution may not be isotropic, and the equation 〈

*τ*

^{2}〉 =

*γ*

^{2}

*z*still rigorously applies. In this case, numerical modeling shows that

*τ*approximately obeys a chi distribution with an effective number of degrees of freedom

*D*

_{eff}

*< D*.

The derivation of the autocorrelation functions Eqs. (4) and (5) is performed most simply by using the relations introduced in Eq. (29). Just as in the procedure followed in the study of PMD [19], we switch to a rotating reference frame, where, after a first-order expansion in frequency, *β⃗* is replaced by *ωβ⃗ _{ω}* with

*ω*denoting the offset from the central frequency of the propagating signal. The spatial dependence of

*β⃗*is modeled as white Gaussian noise. The customary procedure in this case is to represent the integral of

_{ω}*β⃗*over the propagation distance as an isotropic

_{ω}*D*-dimensional Wiener process

*W⃗*(

*z*). The increments of

*W⃗*(

*z*) are characterized by 〈d

*W*(

_{i}*z*)d

*W*(

_{j}*z*)〉=

*D*

^{−1}

*γ*

^{2}

*δ*d

_{i,j}*z*, where

*γ*is a constant proportional to the mean-square magnitude of the generalized birefringence, such that 〈

*τ*

^{2}(

*z*)〉 =

*γ*

^{2}

*z*. Equation (29) can be re-expressed as a stochastic differential equation

*γ*

^{2}

*ω*

^{2}/

*D*)

*τ⃗*d

*z*, which is obtained after some algebra. From this point on, considering that our cross-product possesses the property

*τ⃗ ·*(d

*W⃗*×

*τ⃗*) = 0, the derivation of the autocorrelation functions is identical to the one presented in [19]. The only modification is that the term

*D*generalizes the factor 3 appearing in the corresponding term in [19].

Finally, we note that the scaling with the number of degrees of freedom is such that the shape of the autocorrelation functions is uniquely determined by 〈*τ*^{2}〉/*D*, a quantity that has the meaning of the mean square MD parameter per dimension in the extended Stokes space. Polarization mode dispersion in single-mode fibers is a special case corresponding to *D* = 3.

#### 4.1. Numerical simulations

In the numerical simulations a fiber is modeled as an assembly of *N _{s}* wave-plates, such that the

*k*-th element is characterized by a generalized real-valued

*D*-dimensional birefringence vector

*b⃗*

_{det}+

*b⃗*, where

_{k}*b⃗*

_{det}describes the deterministic birefringence and

*b⃗*accounts for perturbation-induced birefringence. In the simulation of quasi-degenerate modes

_{k}*b⃗*

_{det}= 0, whereas in the study of non-degenerate modes reported in Fig. 4

*b⃗*

_{det}is used to tune the degree of deterministic coupling between different groups of quasi-degenerate modes. Since

*b⃗*

_{det}describes wave-vector mismatch between different modes, as discussed in Appendix A, only the last

*N*− 1 components of

*b⃗*

_{det}can be nonzero. Expanding

*b⃗*to first order in the optical frequency, and omitting the frequency-independent zeroth-order term (whose effect on the statistics of modal dispersion is immaterial) we write

_{k}*b⃗*=

_{k}*ωb⃗*′

*, where*

_{k}*ω*denotes the offset from the central frequency of the transmitted signal. The transfer matrix of the

*k*-th section is therefore given by

**U**

*= exp [*

_{k}*i*(

*b⃗*

_{det}+

*ωb⃗*′

*)*

_{k}*·*Λ⃗/(2

*N*)]. The frequency independent part of

*b⃗*was verified to have no effect on the statistics of MD and hence it was omitted in the simulations. A similar omission of the frequency independent part of the random birefringence is customary in the study of polarization effects and it is known to have no consequences on the statistics when characterizing linear propagation problems. The vectors

_{k}*b⃗*′

*are selected independently of one another and with each having*

_{k}*D*statistically independent zero-mean Gaussian components.

We start from discussing the case of quasi-degenerate modes, when *b*⃗_{det} = 0. In this case, using Eq. (21), the average square-modulus of the MD vector can be shown to be 〈*τ*^{2}〉 = *N _{s}*〈|

*b⃗*′

*|*

_{k}^{2}〉. To link this notation to the one used in prior sections we note that

*b*⃗

*is equivalent to*

_{k}*β*⃗(

*z*)Δ

_{k}*z*with

_{k}*z*being the position of the

_{k}*k*-th fiber increment and with Δ

*z*being its length.

_{k}In the case of full coupling, all components of the vector *b*⃗′* _{k}* are identically distributed and their variance is taken to be

*σ*

^{2}= 〈

*τ*

^{2}〉/(

*N*). While the modeling of the full coupling case is relatively straightforward, the case of partial coupling is significantly more involved. Most importantly, the regime of partial coupling cannot be captured unequivocally in terms of a single parameter. Recent experimental work [20] characterized mode coupling between modes

_{s}D*i*and

*j*in terms of the fraction of power that is coupled into mode

*j*at the end of the link when mode

*i*is excited. While this is a practical and useful parameter for quantifying mode coupling, it may not provide a unique characterization of the MD statistics, as the same coupling can be generated in a variety of ways. While a detailed description of the partial coupling regime in multi-mode fibers requires further experimental investigation, we model this situation in two ways. If partial coupling is caused by the phase mismatch between non-degenerate modes, then the magnitude of

*b*⃗

_{det}controls the degree of coupling. The results reported in Fig. 4 refer to this situation. Otherwise, in the case of quasi-degenerate modes, partial coupling can only be caused by an asymmetry of the coupling strength itself. This situation is modeled by attaching different values to the variance of different components of

*b*⃗′

*. In particular, using our choice of the matrices constructing Λ⃗ (see Appendix A), the first 3*

_{k}*N*components of

*b*⃗′

*are responsible for polarization coupling within the same spatial mode, whereas the remaining components are responsible for coupling different spatial modes with one another. Thus the degree of inter-modal coupling is controlled by varying the variance ${\sigma}_{c}^{2}$ of the last*

_{k}*D*− 3

*N*components of

*b*⃗′

*, as compared to the variance of the first 3*

_{k}*N*components, which we denote by ${\sigma}_{p}^{2}$. In this case the average square modulus of the MD vector is given by $\u3008{\tau}^{2}\u3009=\left[3N{\sigma}_{p}^{2}+(D-3N){\sigma}_{c}^{2}\right]{N}_{s}$.

## 5. Conclusions

Inspired by the analogy between MD and PMD, we developed a formalism to characterize modal dispersion in multimode fibers. The proposed formalism is based on the definition of a set of *D* = 4*N*^{2} − 1 matrices — which are the generalization of the Pauli spin matrices — as a basis for the expansion of the generalized Jones matrix **U** describing the propagation of *N* spatial modes. In the absence of mode-dependent loss, MD is characterized by a real *D*-dimensional MD vector *τ*⃗, whose components are the coefficients of the expansion of the traceless part of the matrix **T** = −*i*(d**U**/d*ω*)**U**^{†}. The eigenstates of such a matrix are generalized principal states and its eigenvalues are the corresponding group delays; the square length of *τ*⃗ is the sum of the square group delays. In the regime of strong mode coupling the vector *τ*⃗ is Gaussian distributed and its modulus follows a chi distribution with *D* degrees of freedom. We found that to an excellent degree of approximation, the same distribution, yet with a smaller number of degrees of freedom, also characterizes the largest differential group delay, that is the difference between the delays of the fastest and the slowest principal states. Finally, we studied the frequency autocorrelation of the MD vector, of the square modulus of the MD vector and of the largest differential group delay. Extensive numerical simulations were used to check the analytical results, showing good agreement in all cases.

## A. Algorithm to build the Λ matrices

For *N* = 1 the matrices Λ* _{i}* coincide with the Pauli matrices, namely

*N*> 1 the algorithm which gives the 4

*N*

^{2}− 1 matrices proceeds as follows.

**From**Λ_{1}**to**Λ_{3}: Each 2 × 2 block in the main diagonal, from the leftmost to the rightmost, is sequentially set to either one of the three Pauli matrices, while the other elements are set to zero. This gives 3_{N}*N*matrices. The trace of such matrices squared is 2 and hence a normalization coefficient $\sqrt{N}$ is required.**From**Λ_{3}_{N}_{+1}**to**Λ_{4N2–N}: Each element outside the 2 × 2 blocks in the main diagonal is sequentially set to either 1 or*i*and the symmetric element to 1 or −*i*respectively, while the other elements are set to zero. This gives 4*N*^{2}– 4*N*matrices. The trace of such matrices squared is 2 and hence a normalization coefficient $\sqrt{N}$ is required.**From**Λ_{4N2–N+1}**to**Λ_{4N2–1}: We denote these matrices as Λ_{4N2–N+n}, with*n*= 1,...,*N*−1. The 2 × 2 blocks in the main diagonal from the first to the*n*-th are set to*σ*_{0}and the (*n*+ 1)-th to −*nσ*_{0}, while the other elements are set to zero. The trace of such matrix squared is 2(*n*^{2}+*n*) and hence a normalization coefficient $\sqrt{N/\left({n}^{2}+n\right)}$ is required.

It is apparent from the definition that the matrices Λ_{1} . . . Λ_{3}* _{N}* do not couple different spatial modes. Correspondingly, the components

*β*

_{1}. . .

*β*

_{3}

*of the generalized birefringence vector*

_{N}*β*⃗ account for polarization-mode coupling within the same spatial mode. The matrices Λ

_{3}

_{N}_{+1}to Λ

_{4N2–N}couple individual polarization modes of different spatial modes, whose magnitude is defined by the corresponding components of

*β*⃗. Finally, the matrices Λ

_{4N2–N}to Λ

_{4N2–1}describe wave-vector mismatch between different spatial modes.

As an example, we list below the Λ* _{i}* matrices for the case

*N*= 2.

## Acknowledgments

This work has been carried out within an agreement funded by
Alcatel-Lucent in the framework of Green Touch (www.greentouch.org). MS also acknowledges financial support from Tera Santa Consortium and from the University of L’Aquila under project “Progetto speciale multiasse *Reti per la conoscenza e l’orientamento tecnico-scientifico per lo sviluppo della competitività* (Re.C.O.Te.S.S.C.) P.O.R. Regione Abruzzo F.S.E. 2007–2013 Piano 2007–2008.”

## References and links

**1. **A. R. Chraplyvy, “The coming capacity crunch,” European Conference on Optical Communication 2009 (ECOC09), plenary talk (2009).

**2. **R. W. Tkach, “Scaling optical communications for the next decade and beyond,” Bell Labs Tech. J. **14**, 3–9 (2010). [CrossRef]

**3. **Peter J. Winzer and Gerard J. Foschini, “MIMO capacities and outage probabilities in spatially multiplexed optical transport systems,” Opt. Express **19**, 16680–16696 (2011). [CrossRef] [PubMed]

**4. **S. Chandrasekhar, A. H. Gnauck, X. Liu, P. J. Winzer, Y. Pan, E. C. Burrows, B. Zhu, T. F. Taunay, M. Fishteyn, M. F. Yan, J. M. Fini, E. M. Monberg, and F. V. Dimarcello, “WDM/SDM transmission of 10 × 128-Gb/s PDM-QPSK over 2688-km 7-core fiber with a per-fiber net aggregate spectral-efficiency distance product of 40,320 km×b/s/Hz,” Proceedings of European Conference on Optical Communication 2011 (ECOC11), Paper Th.13.C4 (2011).

**5. **S. Randel, R. Ryf, A. Sierra, P.J. Winzer, A.H. Gnauck, C.A. Bolle, R-J. Essiambre, D.W. Peckham, A. McCurdy, and R. Lingle “6×56-Gb/s mode-division multiplexed transmission over 33-km few-mode fiber enabled by 6×6 MIMO equalization,” Opt”. Express **19**, 16697–16707 (2011). [CrossRef] [PubMed]

**6. **E. Ip, N. Bai, Y.K. Huang, E. Mateo, F. Yaman, S. Bickham, H.Y. Tam, C. Lu, M.J. Li, S. Ten, A.P. Tao Lau, V. Tse, G.D. Peng, C. Montero, X. Prieto, and G. Li, “88 × 3 × 112-Gb/s WDM transmission over 50 km of three-mode fiber with inline few-mode fiber amplifier,” Proceedings of European Conference on Optical Communication 2011 (ECOC11), Paper Th.13.C2 (2011).

**7. **S. Fan and J. M. Kahn, “Principal modes in multi-mode waveguides”, Opt. Lett. **30**, 135–137 (2005). [CrossRef] [PubMed]

**8. **K-P. Ho and J. M Kahn, “Statistics of group delays in multi-mode fibers with strong mode coupling,” J. Lightwave Technol. **29**, 3119–3128 (2011). [CrossRef]

**9. **K-P. Ho and J. M Kahn, “Mode-dependent loss and gain: statistics and effect on mode-division multiplexing,” Opt. Express **19**, 16612–16635 (2011). [CrossRef] [PubMed]

**10. **J. P. Gordon and H. Kogelnik, “PMD fundamentals: polarization mode dispersion in optical fibers,” Proc. Natl. Acad. Sci. USA **97**, 4541–4550 (2000). [CrossRef] [PubMed]

**11. **N. Gisin, “Statistics of polarization dependent losses,” Opt. Commun. **114**, 399–405 (1995). [CrossRef]

**12. **D. F. V. James, P. G. Kwiat, W. J. Munro, and A. G. White, “Measurement of qubits,” Phys. Rev. A **64**, 052312 (2001). [CrossRef]

**13. **M. Nazarathy and E. Simony, “Generalized Stokes parameters shift keying approach to multichip differential phase encoded optical modulation formats,” Opt. Lett. **31**, 435–437 (2006). [CrossRef] [PubMed]

**14. **M. Salsi, C. Koebele, D. Sperti, P. Tran, H. Mardoyan, P. Brindel, S. Bigo, A. Boutin, F. Verluise, P. Sillard, M. Astruc, L. Provost, and G. Charlet, “Mode division multiplexing of 2 × 100Gb/s channels using an LCOS based spatial modulator,” J. Lightwave Technol. **30**, 618–623 (2012). [CrossRef]

**15. **S. Randel, M. Magarini, R. Ryf, R-J. Essiambre, A. H. Gnauck, P. J. Winzer, T. Hayashi, T. Taru, and T. Sasaki “MIMO-based signal processing of spatially multiplexed 112-Gb/s PDM-QPSK signals using strongly-coupled 3-core fiber,” Proceedings of European Conference on Optical Communication 2011 (ECOC11), Paper Tu.5.B1 (2011).

**16. **D. Cassioli, M. Z. Win, and A. F. Molisch, “The ultra-wide bandwidth indoor channel: from statistical model to simulations,” IEEE J. Sel. Areas Commun. **20**, 1247–1257 (2002). [CrossRef]

**17. **In the literature the shape parameter *K _{N}* is typically referred to as the number of degrees of freedom of the chi distribution. We refrain from this terminology so as to avoid confusion with the number of degrees of freedom

*D*

_{eff}representing the dimension of the manyfold spanned by the MD vector in the generalized Stokes space.

**18. **M. Karlsson and J. Brentel, “The autocorrelation function of the polarization-mode dispersion vector,” Opt. Lett. **24**, 939–941 (1999). [CrossRef]

**19. **M. Shtaif and A. Mecozzi, “Study of the frequency autocorrelation of the differential group delay in fibers with polarization mode dispersion,” Opt. Lett. **25**, 707–709 (2000). [CrossRef]

**20. **R. Ryf, S. Randel, A. H. Gnauck, C. Bolle, A. Sierra, S. Mumtaz, M. Esmaeelpour, E. C. Burrows, R-J. Essiambre, P. J. Winzer, D. W. Peckham, A. H. McCurdy, and R. Lingle, “Mode-division multiplexing over 96km of few-mode fiber using coherent 6 × 6 MIMO processing,” J. Lightwave Technol. **30**, 521–531 (2012). [CrossRef]

**21. **M. Gell-Mann, “Symmetries of baryons and mesons,” Phys. Rev. **125**, 1067–1084 (1962). [CrossRef]

**22. **A. Galtarossa, L. Palmieri, M. Schiano, and T. Tambosso, “Measurement of birefringence correlation length in long, single-mode fibers,” Opt. Lett. **26**, 962–964 (2001). [CrossRef]

**23. **F. Curti, B. Daino, G. De Marchis, and F. Matera, “Statistical treatment of the evolution of the principal states of polarization in single-mode fibers,” J. Lightwave Technol. **8**, 1162–1166 (1990). [CrossRef]

**24. **The more common terminology for the vector operation defined in Eq. (28) is Lie bracket. We prefer the term generalized cross-product for emphasizing the analogy with the case of PMD.