## Abstract

We present an efficient method for system identification (nonlinear channel estimation) of the third-order nonlinear Volterra Series Transfer Function (VSTF) characterizing the four-wave-mixing (FWM) nonlinear process over a coherent OFDM fiber link. Despite the seemingly large number of degrees of freedom in the VSTF (cubic in the number of frequency points), we identify a compressed VSTF representation which does not entail loss of information. Additional, slightly lossy compression may be obtained by discarding very low power VSTF coefficients associated with regions of destructive interference in the FWM phased array effect. Based on this two-stage compressed VSTF representation, we develop a robust and efficient algorithm of nonlinear system identification (optical performance monitoring), which estimates the VSTF by transmitting an extended training sequence over the OFDM link and performing just a matrix-vector multiplication at the receiver by a pseudo-inverse matrix which is pre-evaluated offline. For 512 (1024) frequency samples per channel, the VSTF measurement takes less than 1 (10) msec to complete, with a computational complexity of one real-valued multiply-add operation per time sample. Relative to a naïve exhaustive three-tone-test, our algorithm is far more tolerant of ASE additive noise and its acquisition time is orders of magnitude shorter.

© 2012 OSA

## 1. Introduction

The nonlinear impairment is a dominant one in long-haul fiber-optic transmission. Modeling and
mitigating fiber nonlinearity is an active and vigorous research topic. Multiple studies of
nonlinear propagation and transmission capacity for single carrier and *Orthogonal
Frequency Division Multiplexing* (OFDM) transmission have been conducted [1–11] and various
*nonlinear compensation* (NLC) methods have been considered. Three prominent
classes of NLC methodologies are digital back-propagation (DBP) [12–14], RF pilot methods [15,16], and
*Volterra Series* (VS) based methods [17–24]. VS is a basic formalism for
characterizing any nonlinear system with memory [25],
particularly useful for formalizing the FWM generation model over a fiber-optic nonlinear
channel [7–11]. A key question not addressed in the prior NLC works pertains to how to estimate
the nonlinear fiber link parameters to be used for compensation. To the best of our knowledge
all *nonlinear* (NL) mitigation approaches just arbitrarily assumed a-priori
knowledge of the nonlinear parameters characterizing the link, either by means of an unspecified
measurement process, or from a theoretical calculation. In particular, to the best of our
knowledge, all VS-based mitigation methods have heretofore assumed analytic formulas for either
the time-domain *Volterra Series Kernel* (VSK) or for the corresponding frequency
domain *Volterra Series Transfer Function* (VSTF). In practice, the exact link
configuration must be estimated in terms of its nonlinear and dispersion parameters, amplifier
gains and fiber span optical lengths, which may discontinuously jump to new values due to
protection switching or network reconfiguration over different paths. A key question which
remains unanswered in all NLC proposals is how the receiver ‘knows’ the
amount of nonlinearity to be compensated in the link. Upon connection initialization, an optical
monitoring procedure would be required in order to estimate the nonlinear channel, measuring and
inferring either the distributed profile of the nonlinear parameter $\gamma (z)$ along the link (to be used in DBP-based NLC) or the VSTF (to be
used in VSTF-based NLC). In any case, the appropriate nonlinear description should then be
derived and fed as a parameter set initializing the NLC procedure.

This paper aims to address this glaring omission – the current unavailability of a
practical characterization method for the fiber channel VSTF nonlinear description, which is an
indispensable ingredient for nonlinear compensation to be fed into the parameters of any
VS-based NLC. We develop a methodology of nonlinear optical performance monitoring, providing
for the first time an efficient nonlinear channel estimation algorithm. Specifically we propose
to measure the link VSTF by means of a fast and efficient data-aided algorithm, using an OFDM
*transmitter* (Tx) and *receiver* (Rx). If the OFDM Tx and Rx are
used as part of an operational link, the VSTF estimation will be seen to be available
‘for free’. As the fiber link nonlinear parameters are typically stable over
extended periods of time, it may suffice to perform the VSTF estimation very seldom, e.g. just
upon link initialization or immediately following a protection switching event. The estimated
VSTF may also be used for network planning purposes.

The nonlinear channel estimation algorithm conceived and investigated here is fast and efficient, reducing to a linear transformation of low computational complexity. E.g., for 256 frequency samples per channel, the VSTF measurement takes less than 1 msec to complete and its computational complexity is just a single real-valued multiply-add per time sample. Relative to a naïve exhaustive three-tone-test our scheme is far more tolerant of ASE additive noise and its acquisition time is orders of magnitude shorter. Remarkably, the algorithm consists of nothing more than a linear transformation performed on the deviation between the received samples at the FFT output (after SPM/XPM/dispersion compensation) and the samples of the transmitted training sequence. It amounts to a single matrix-vector multiplication at the Rx: the vector formed from the subcarrier outputs in response to the training sequence is multiplied by a matrix which is calculated offline, in advance, based on the known training sequence. The complexity of this matrix-vector multiplication, expressed as multiply-accumulate operations normalized over the acquisition time, is negligible relative to the complexity of the overall OFDM Rx. Thus, the proposed nonlinear estimation algorithm may be built into any OFDM Rx as part of its initialization procedure, rapidly reporting the nonlinear VSTF characteristics of the fiber link to the higher network layers and allowing the parameters of the active NLC modules used during normal operation to be configured.

The paper is structured as follows: Section 2 reviews the Volterra nonlinear formalism,
detailing the background for the concept of VSTF which is the target of nonlinear monitoring in
this paper. In section 3 we explore the limitations of a naïve initial approach to VSTF
monitoring, consisting of exhaustive 3-tone probing of the nonlinear fiber link over coherent
OFDM. Section 4 revisits the VSTF analytics, extracting the key insight on lossless compressed
VSTF representation. In section 5 we develop our nonlinear *system
identification* (SID) algorithm for efficient estimation of the VSTF of the fiber link,
and determine its noise robustness, latency and computational complexity. Section 6 treats the
additional slightly lossy compression attainable by prioritized discarding of low-power and
low-multiplicity VSTF samples. In section 7 we present simulations validating the proposed SID
methods vs. *Split-Step-Fourier* (SSF) numerical modeling. Section 8 concludes
the paper.

## 2. Volterra nonlinear formalism – third order fiber nonlinearity characterization

Our objective is fiber nonlinearity monitoring. The first step is to adopt a suitable mathematical representation for the fiber nonlinearity. This section provides the necessary background reviewing the VSTF, which is our target quantity for optical measurement.

The *Volterra Series* (VS) provides a general model for nonlinear
transformations with memory. For our purposes it suffices to truncate the VS to third-order and
adopt a discretized frequency-domain representation of the VS as elaborated in this section,
which reviews the frequency-domain VS model from an optical physics vantage point, relating the
mathematical model with FWM generation. Readers unfamiliar with frequency-domain VS nonlinear
modeling should consult our review in Ch. 3 of [26] for a
self-contained tutorial. Similar Volterra-based nonlinear models in slightly different notations
have also appeared in [7–11] and recently the number of papers approaching nonlinear compensation via
VS-based tools rather than by SSF-based back-propagation is on the rise [17–24]. This trend is consistent
with the Volterra formalism being increasingly recognized as the proper tool for describing weak
nonlinearities in all science and engineering areas and recently also in fiber-optic
communication. The first application of the Volterra series to fiber transmission was presented
in [7].

We remark that even without the benefit of the VS mathematical framework, the theory of
Kerr-induced fiber nonlinearity may be developed in the frequency domain in purely physical
terms. The key is to identify the elementary mechanism of nonlinear generation as the
interaction of frequency triplets via FWM. The overall nonlinear interference is obtained by
summing up all the FWM elementary intermodulation contributions, each of which consists of a
triple product of three complex amplitudes associated with three frequencies. Generally, each
such triple product would be weighted differently in the overall nonlinear superposition. The
VSTF is nothing but the complex weight attached to each frequency triplet, describing the FWM
generation efficiency. Thus, the VSTF may be viewed as the extension of the conventional concept
of *linear transfer function* (TF) (which in turn is the complex weight that
multiplies the amplitude of a single input frequency of a linear system in order to obtain its
output). For a nonlinear system, a full frequency domain characterization involves sweeping over
all possible triplets of frequencies rather than over a single frequency. Hence, the FWM
nonlinearity is fully determined by specifying the VSTF as a complex-valued function of three
frequencies (which may be suitably sampled).

It follows that, when designing nonlinear compensation systems at either the Tx or the Rx, the first indispensable step should be to measure the VSTF for any given fiber link. Heretofore, to the best of our knowledge, the VSTF measurement requirement has been ignored: none of the papers addressing VSTF-based models and their nonlinear compensation have ever specified how the VSTF is obtained, but have typically assumed some analytic model for the VSTF, stemming from an idealized highly structured fiber topology, such as a cascade of fiber spans which are identical in geometry and material parameters. In practice, the geometry and material parameters affecting linear and nonlinear characteristics of the fiber link may deviate from uniformity, either by design or despite being intended to be uniform, and there may also be slow environmental variations. Therefore the measurement / monitoring of the VSTF fiber nonlinear characteristic, as addressed in this paper, is of critical importance if nonlinear monitoring and compensation is to progress from an abstract theoretical concept to practical application.

For the purposes of our mathematical derivation we assume that the readers have made themselves partially familiar with the formulation of the Volterra series methodology for modeling FWM nonlinear generation in the fiber-optic transmission context, as systematically developed in our previous works, Ch. 3 of [26] and [27], or alternatively in other researchers’ works [1–6] in equivalent notations. In the next subsection we introduce our particular definition and notation for the VSTF, in preparation for deriving a novel VSTF measurement procedure.

Finally, as this paper is devoted to proof of concept, for the sake of clarity, rather than
using a vector Manakov system to handle both orthogonal polarizations we illustrate the key
points for a single polarization, amounting to a treatment of the scalar *nonlinear
Schrödinger equation* (NLSE). Future work should explore extending the novel
SID procedure derived here to a dual polarization context.

#### 2.1 Volterra Series Transfer Function (VSTF)

Let three optical harmonic tones at frequencies ${\nu}_{j},{\nu}_{k},{\nu}_{l}$ generate a fourth FWM tone at frequency ${\nu}_{i}={\nu}_{j}+{\nu}_{k}-{\nu}_{l}$ such that ${\nu}_{j}\ne {\nu}_{i},{\nu}_{k}\ne {\nu}_{i}.$ The rotating phasors describing the optical field complex envelopes (CE) of the three input tones are given by

$${A}_{j}{e}^{j2\pi {\nu}_{j}t},\quad {A}_{k}{e}^{j2\pi {\nu}_{k}t},\quad {A}_{l}{e}^{j2\pi {\nu}_{l}t}.\qquad (1)$$

In OFDM, the center frequencies (subcarriers) of the sub-channels fall on a regularly spaced frequency grid, ${\nu}_{i}=i\Delta \nu ,\;i=1,2,\ldots ,M,$ hence it is convenient to label all the discrete tones by their integer indices, $i\in \mathbb{Z},$ setting a one-to-one correspondence between frequencies and their indices: ${\nu}_{i}={\nu}_{j}+{\nu}_{k}-{\nu}_{l}=(j+k-l)\Delta \nu .$ Substituting the three phasors (1) into (2) yields the nonlinear output field complex amplitude at the mixing frequency ${\nu}_{i}:$

$${R}_{i;jkl}^{(3)}={H}_{i;jkl}{A}_{j}{A}_{k}{A}_{l}^{*},$$

where the complex scaling factor depends on the three input tone indices *j,k,l* (which in turn determine the output tone $i=j+k-l$). This complex scaling factor ${H}_{i;jkl}$ is defined as the *Volterra Series Transfer Function* (VSTF) of the 3rd order nonlinear system. The VSTF describes the amplitude attenuation or gain and the phase-shift experienced by the FWM mixing product excited by the three input tones. The VSTF is a generalization of the concept of linear *Transfer Function* (TF). Notice that for a specified output tone *i*, once the two input tones *j,k* are also given, the third input tone, *l*, becomes redundant, as it is uniquely determined as $l=j+k-i.$ We then discard this implied index, introducing the abbreviated VSTF notation ${H}_{i;jk}\equiv {H}_{i;j,k,j+k-i},$ expressing the output FWM contribution due to the three tones as follows (*j,k* determine the third index *l* causing the mixing product to fall onto the specified *i*):

$${R}_{i;jk}^{(3)}={H}_{i;jk}{A}_{j}{A}_{k}{A}_{j+k-i}^{*}.\qquad (5)$$

As detailed in [26], the CE of the output nonlinear signal ${R}_{i;jk}^{(3)}$ is specified at the input plane of the nonlinear system where the input tones CEs, ${A}_{j},{A}_{k},{A}_{j+k-i},$ are measured. Thus, if we obtain an expression for the CE of the nonlinear field at some output plane, in order to extract the corresponding VSTF we ought to first (quasi-linearly) back-propagate the nonlinear generated field to the input plane and then express the resulting field as in Eq. (5).

When the input contains a multitude of tones, e.g. the multiple subcarriers in an OFDM
signal, the mixing product contributions from all tone triplets must be coherently superposed.
Let the input into the nonlinear system be given by a *Fourier Series* (FS),
implying that it is either time-limited or periodic. Further assume that the input is
approximately *Band-Limited* (BL) i.e. it may be expanded as a *Finite
FS* (FFS) i.e. a FS with a finite number *N* of harmonics:

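$$a(t)={\sum}_{i=1}^{N}{A}_{i}{e}^{j2\pi {\nu}_{i}t},\quad 0\le t\le T.\qquad (6)$$

The third-order nonlinear output at the *i*-th frequency is then

$${R}_{i}^{(3)}={\sum}_{j=1}^{N}{\sum}_{k=1}^{N}{R}_{i;jk}^{(3)}.\qquad (7)$$

Given the target index *i*, ${R}_{i;jk}^{(3)}$ is null for certain indices $j,k,$ namely whenever $l=j+k-i$ is not a valid index falling in the range $[1,N].$ Equivalently, given *i*, it suffices to restrict the summation to the set $S[i]\subset [1,N]\times [1,N]$ of “proper FWM triplets” [27], namely index pairs $j,k$ subject to $l=j+k-i$ being a valid index, and excluding *Self-Phase-Modulation* (SPM) (*j* = *k* = *i*) and *Cross-Phase-Modulation* (XPM) (*j* = *i* or *k* = *i*) contributions, which are separately treated:

$$S[i]\equiv \left\{(j,k)\in [1,N]\times [1,N]:\;j\ne i,\;k\ne i,\;1\le j+k-i\le N\right\}.\qquad (8)$$

The summation of Eq. (7), describing just the FWM contribution, then reduces to

$${R}_{i}^{\text{FWM}}={\sum}_{(j,k)\in S[i]}{R}_{i;jk}^{(3)},\qquad {r}_{i}^{\text{FWM}}(t)={R}_{i}^{\text{FWM}}{e}^{j2\pi {\nu}_{i}t}.\qquad (9)$$

The total FWM field (generated by the *N* sub-carriers of the given OFDM channel) is a superposition over all nonlinear contributions of all the *i* tones: ${r}^{\text{FWM}}(t)={\sum}_{i=1}^{N}{r}_{i}^{\text{FWM}}(t).$ Substituting Eqs. (9) and (5) into the last equation yields a FS expansion of the nonlinear system output over the [0, *T*] interval:

$${r}^{\text{FWM}}(t)={\sum}_{i=1}^{N}{R}_{i}^{\text{FWM}}{e}^{j2\pi {\nu}_{i}t},\qquad {R}_{i}^{\text{FWM}}={\sum}_{(j,k)\in S[i]}{H}_{i;jk}{A}_{j}{A}_{k}{A}_{j+k-i}^{*}.\qquad (10)$$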
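The coherent superposition over proper FWM triplets can be illustrated with a minimal Python sketch. It assumes 1-based subcarrier indices, the index set $S[i]$ as described above (pairs with $j\ne i$, $k\ne i$ and $l=j+k-i$ in range), and the triple-product weighting ${H}_{i;jk}{A}_{j}{A}_{k}{A}_{j+k-i}^{*}$. The names `proper_fwm_pairs`, `fwm_amplitude` and the constant `flat_vstf` are hypothetical, introduced here purely for illustration; a real link would supply measured or analytic VSTF values.

```python
from typing import Callable, List, Sequence, Tuple


def proper_fwm_pairs(i: int, n: int) -> List[Tuple[int, int]]:
    """Return S[i]: pairs (j, k) with j != i, k != i and 1 <= j + k - i <= n."""
    return [
        (j, k)
        for j in range(1, n + 1)
        for k in range(1, n + 1)
        if j != i and k != i and 1 <= j + k - i <= n
    ]


def fwm_amplitude(
    i: int,
    amps: Sequence[complex],
    vstf: Callable[[int, int, int], complex],
) -> complex:
    """Sum H_{i;jk} * A_j * A_k * conj(A_{j+k-i}) over all (j, k) in S[i].

    Subcarrier indices are 1-based, so A_j is stored at amps[j - 1].
    """
    n = len(amps)
    total = 0j
    for j, k in proper_fwm_pairs(i, n):
        l = j + k - i  # third input tone implied by (i, j, k)
        total += vstf(i, j, k) * amps[j - 1] * amps[k - 1] * amps[l - 1].conjugate()
    return total


def flat_vstf(i: int, j: int, k: int) -> complex:
    """Hypothetical placeholder VSTF: the same small complex weight for every triplet."""
    return 1e-3j


amps = [1 + 0j] * 4  # four equal-amplitude subcarriers (toy input)
print(proper_fwm_pairs(2, 4))            # pairs contributing to output tone i = 2
print(fwm_amplitude(1, amps, flat_vstf)) # FWM amplitude falling on tone i = 1
```

With four subcarriers, tone 1 collects three pairs and tone 2 collects five, showing that inner subcarriers receive more FWM contributions than edge ones.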

#### 2.2 VSTF of a general optically amplified dispersive fiber link with non-uniform parameters

We now review the generalized analytical expressions derived in [26,27] for the
VSTF, ${H}_{i;jk}$, of a quite general multi-span fiber-optic link of total length
*L*, with the spans not necessarily identical, with arbitrary z-varying
dispersion and nonlinear fiber parameters ${\beta}_{2}\left(z\right),\gamma \left(z\right),$ and with arbitrary power differential loss profile,
$\alpha \left(z\right),$ which incorporates possible lumped or distributed gains (e.g.
lumped optical amplifiers are represented as negative impulsive components of
$\alpha \left(z\right)$); the power gain (or attenuation factor if less than unity) from
the link input (*z* = 0) to a point *z* is given by
${G}_{p}(z)=\mathrm{exp}\left\{-{\int}_{0}^{z}\alpha ({z}^{\prime})d{z}^{\prime}\right\}.$ Our most general expression for the VSTF of a spatially
non-uniform a-periodic or quasi-periodic fiber optic link with second order *chromatic
dispersion* (CD) is given by Eq. (3.70) of [26]:

For a uniformly dispersive link, the VSTF is expressed in terms of the *Fourier Transform* (FT) of the product $\gamma \left(z\right){G}_{p}(z){1}_{[0,L]}(z)$ (where ${1}_{[a,b]}(z)=1$ if $z\in [a,b],$ zero otherwise, and the FT is generically defined as $G(\kappa )=F\left\{g\left(z\right)\right\}\equiv {\int}_{-\infty}^{\infty}g(z)\mathrm{exp}\left\{-j\kappa z\right\}dz$):

The phase mismatch grows in magnitude as the input indices *j*, *k* deviate from the target index *i*. The FT specific value yielding the VSTF is evaluated at a spatial frequency equal to $\Delta {\beta}_{i;jk}$ (Eq. (13)), corresponding to the mismatch in propagation constants between the four waves participating in the FWM process. We reiterate that the only assumptions underlying Eq. (12) are that the nonlinearity be of third order, the CD be quadratic, $\beta (\nu )=\beta ({\nu}_{0})+\tfrac{{(2\pi )}^{2}}{2}{\beta}_{2}{(\nu -{\nu}_{0})}^{2},$ and all constituent frequencies be on a regular grid with $\Delta \nu $ spacing: ${\nu}_{i}={\nu}_{0}+i\cdot \Delta \nu .$
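As a brief worked check of how the tone indices enter the mismatch, the following sketch assumes the mismatch is defined as $\Delta {\beta}_{i;jk}=\beta ({\nu}_{j})+\beta ({\nu}_{k})-\beta ({\nu}_{j+k-i})-\beta ({\nu}_{i})$; the overall sign convention of Eq. (13) may differ. Substituting the quadratic CD law on the regular grid ${\nu}_{i}={\nu}_{0}+i\cdot \Delta \nu $, the constant terms $\beta ({\nu}_{0})$ cancel and

$$\begin{aligned}
\Delta {\beta}_{i;jk} &= \tfrac{{(2\pi \Delta \nu )}^{2}}{2}{\beta}_{2}\left[{j}^{2}+{k}^{2}-{(j+k-i)}^{2}-{i}^{2}\right] \\
&= \tfrac{{(2\pi \Delta \nu )}^{2}}{2}{\beta}_{2}\left[-2jk+2ji+2ki-2{i}^{2}\right] \\
&= -{(2\pi \Delta \nu )}^{2}{\beta}_{2}\,(j-i)(k-i).
\end{aligned}$$

Under this assumption the mismatch depends on the three indices only through the product $(j-i)(k-i)$, and it vanishes when $j=i$ or $k=i$, i.e. for the SPM and XPM terms.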

The VSTF was derived in [26,27] for the commonly applicable scenario of a multi-span fiber link with
uniform parameters and identical spans, where the optical gain exactly compensates for the span
loss. A remarkable feature is that the VSTF in that case is factorable into a single-span VSTF
component and an *array-factor*, which is akin to the one arising in the
radiation pattern from an antenna array. It is this array-factor which accounts for the
nonlinear tolerance advantage of dispersion-uncompensated links over dispersion-compensated
ones. We quote here the result for the VSTF in that case, ${H}_{i;jk}^{\text{ID spans}}$:

This completes our review of the VSTF formulation, as defined and derived above for a link
with z-varying physical parameters, $\alpha (z),{\beta}_{2}(z),\gamma (z).$ In principle, were these parameters independently measured or
a-priori known, the VSTF could be calculated based on the analytical formulas presented above.
However, the axial *z*-varying and slowly time-varying profile of these
(non)linear parameters is extremely challenging to measure [28–30]. In fact, for the purposes of
nonlinear compensation at either the Tx or Rx, it suffices to obtain the overall VSTF of the
end-to-end fiber link, which may be treated as a black-box concealing and encapsulating the
details of axial distribution of the $\alpha (z),{\beta}_{2}(z),\gamma (z)$ parameters. Accordingly, our novel SID approach bypasses the
evaluation of the distributed parameters, directly deriving an estimate of the VSTF which is
the operational characterization required for nonlinear compensation.

An interesting related concept, to be further explored in future work, is whether Eq. (11) for the VSTF may be inverse-sourced, i.e., whether, given our measurement of the VSTF ${H}_{i;jk}$ for a multitude of frequency indexes, the integral equation (Eq. (11)) could be solved for ${\beta}_{2}(z),\gamma (z)$. As for the loss profile, $\alpha (z),$ featuring in ${G}_{p}(z)=\mathrm{exp}\left\{-{\int}_{0}^{z}\alpha ({z}^{\prime})d{z}^{\prime}\right\},$ this quantity may be independently measured with an OTDR, with good accuracy.

#### 2.3 Numeric validation of the analytic Volterra series based nonlinear fiber link description

We end this section by validating our VSTF analytical expressions by means of a numeric
simulation of nonlinear propagation using the Split-Step-Fourier method. We simulate nonlinear
dispersive transmission of a single OFDM channel with *N* = 128 subcarriers and
aggregate bandwidth of ${B}_{T}=25\,\text{GHz}$ over a dispersion-unmanaged 1000 km SSMF link containing 10
identical noiselessly amplified spans precisely compensating fiber loss in each span. The
transmitted modulation format per subcarrier is 16-QAM, with equal powers in all subcarriers.
Additional transmission parameters are stated in the caption of Fig. 1,
which plots the *modulation error ratio* (MER) vs. the subcarrier index.
The receiver performs an FFT followed by CD equalization multiplying all sub-carriers by
complex constants (as sufficient *Cyclic Prefix* (CP) is provided) as well as
SPM/XPM nonlinear compensation, de-rotating each received constellation per subcarrier
(actually an identical rotation to all subcarriers, proportional to the overall, constant,
power). As the transmission is noiseless and CD, SPM/XPM have been compensated for, the
remaining impairment is FWM among the subcarriers. The bottom curve in Fig. 1 describes the FWM limited performance, as expressed by time-averaging
the following MER expression:

In the top curve of Fig. 1 we additionally apply
*nonlinear compensation* (NLC) based on the analytic FWM generation model of
Eq. (10), which predicts the nonlinear
distortion complex amplitudes ${R}_{i}^{\text{FWM}},$ expressed as a summation of triple products of transmitted complex
amplitudes weighted by the VSTF, ${H}_{i;jk}.$ The FWM compensated samples for each subcarrier are now set to
${\rho}_{i}={R}_{i}-{R}_{i}^{\text{FWM}},$ analytically calculating ${R}_{i}^{\text{FWM}}$ as per Eq. (10) and
subtracting it out, and the MER (Eq. (18)) is
evaluated for the resulting FWM-compensated ${\rho}_{i}.$ The very large improvement in MER attained relative to the
FWM-uncompensated case is indicative of the high accuracy of the analytic FWM generation model
of Eq. (10) and in particular verifies the
analytic expressions for the VSTF. This simulation clearly validates our nonlinear propagation
analytic model for OFDM as derived in [27] and reviewed
in this section.
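The compensate-then-measure step described above can be sketched in a few lines of Python. The subtraction ${\rho}_{i}={R}_{i}-{R}_{i}^{\text{FWM}}$ follows the text; the MER formula used here (total transmitted power over total error power, in dB) is an assumption standing in for Eq. (18), and the symbol values and the 10% model error are made-up toy numbers. The helper names `compensate_fwm` and `mer_db` are hypothetical.

```python
import math
from typing import List, Sequence


def compensate_fwm(received: Sequence[complex], fwm_predicted: Sequence[complex]) -> List[complex]:
    """Subtract the predicted FWM distortion: rho_i = R_i - R_i^FWM."""
    return [r - f for r, f in zip(received, fwm_predicted)]


def mer_db(transmitted: Sequence[complex], samples: Sequence[complex]) -> float:
    """Assumed MER definition: sum |A_i|^2 / sum |sample_i - A_i|^2, in dB.

    The error power must be nonzero, i.e. the samples may not match exactly.
    """
    signal = sum(abs(a) ** 2 for a in transmitted)
    error = sum(abs(s - a) ** 2 for a, s in zip(transmitted, samples))
    return 10.0 * math.log10(signal / error)


# Toy data: four QPSK-like symbols plus a small made-up FWM distortion.
tx = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
fwm_true = [0.05 + 0.02j, -0.03 + 0.04j, 0.01 - 0.06j, -0.02 - 0.01j]
rx = [a + d for a, d in zip(tx, fwm_true)]

# Imperfect model prediction capturing 90% of the true distortion (assumption).
fwm_model = [0.9 * d for d in fwm_true]

mer_before = mer_db(tx, rx)
mer_after = mer_db(tx, compensate_fwm(rx, fwm_model))
print(mer_before, mer_after, mer_after - mer_before)
```

In this toy case the residual error amplitude is one tenth of the original, so the MER improves by about 20 dB; the more accurate the VSTF model, the larger the gain.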

In summary, this section reviewed the concept of frequency-sampled VSTF, useful for characterizing the nonlinearity of a most general fiber link and presented the particular VSTF formula for a link with multiple identical homogeneous spans. Having reassured ourselves by simulation that our VSTF analytic model is a realistic and accurate one, we proceed to introduce novel optical monitoring procedures for measuring the VSTF of any fiber link.

## 3. Naïve VSTF monitoring using exhaustive 3-tone tests over coherent OFDM

In this section we consider an initial naïve approach to nonlinear monitoring making use of a coherent optical OFDM transceiver for VSTF system identification.

In Eq. (5) we characterized the VSTF as the
coefficient ${H}_{i;jk}$ to be applied to the triple product of complex amplitudes of three
harmonic tones interacting through FWM, in order to obtain the CE of the mixing product,
${R}_{i;jk}^{\text{FWM}}={H}_{i;jk}{A}_{j}{A}_{k}{A}_{j+k-i}^{*}.$ This suggests that, in principle, the VSTF may be determined by
probing the system with all possible triplets of tones, i.e., applying an exhaustive
triple-tone test. Such a nonlinear monitoring procedure may readily be realized if we set up
coherent OFDM transmission in the particular channel over the fiber link. At the OFDM Tx we turn
on three subcarriers at a time to be used as pilots for nonlinearity monitoring for each
subcarrier, *i*, namely subcarriers $j,k$ and $l=j+k-i.$ In each probe interval we transmit an OFDM symbol over the duration
*T*, comprising just these three subcarrier frequencies. This has the effect of
linearly exciting the same subcarrier frequencies in the Rx; however, at subcarrier index
*i* (i.e., at the *i*-th port of the FFT output in the Rx),
there appears a small nonlinear disturbance ${R}_{i;jk}^{\text{FWM}}$ due to FWM mixing of the three frequencies. Assuming the probing
subcarriers are sufficiently strong, the FWM disturbance will be above the noise. Alternatively,
the *signal-to-noise ratio* (SNR) may be improved by having each triplet of pilot
frequencies repeatedly transmitted. The samples collected at the *i*-th output of
the FFT in the Rx in response to the repetitions of each triple-frequency training symbol should
then be time-averaged. Then, assuming noise has been sufficiently suppressed relative to the
signal, the particular value of the VSTF at the frequency triplet is simply obtained by
${\widehat{H}}_{i;jk}=\overline{{R}_{i;jk}^{\text{FWM}}}/\left({A}_{j}{A}_{k}{A}_{j+k-i}^{*}\right),$ where $\overline{{R}_{i;jk}^{\text{FWM}}}$ is the time-averaged value of multiple samples at the
*i*-th FFT output, and the *A*-s represent the complex amplitudes
of the transmitted training signal. In particular, it is convenient to set the CE of all
transmitted tones to be real-valued, all of amplitude *A*,
yielding ${\widehat{H}}_{i;jk}=\overline{{R}_{i;jk}^{\text{FWM}}}/{A}^{3}.$
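A minimal Python sketch of the three-tone estimator for a single triplet follows. It implements only the averaging and division described above; the "true" VSTF value, the noise level and the additive complex Gaussian noise model are assumptions chosen for illustration, and `estimate_vstf` is a hypothetical name.

```python
import random
from typing import Sequence


def estimate_vstf(samples: Sequence[complex], a_j: complex, a_k: complex, a_l: complex) -> complex:
    """Three-tone estimate: H_hat = mean(R) / (A_j * A_k * conj(A_l)), with l = j + k - i."""
    mean_r = sum(samples) / len(samples)
    return mean_r / (a_j * a_k * a_l.conjugate())


rng = random.Random(0)              # fixed seed so the toy run is repeatable
h_true = 2e-3 * complex(0.6, 0.8)   # hypothetical VSTF value for one triplet
amp = 1.0 + 0j                      # real-valued pilot amplitude A
sigma = 1e-3                        # assumed noise std per quadrature
num_repeats = 100                   # S repetitions of the training symbol

# Received samples at the i-th FFT output: FWM product plus additive noise.
samples = [
    h_true * amp * amp * amp.conjugate()
    + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    for _ in range(num_repeats)
]

h_hat = estimate_vstf(samples, amp, amp, amp)
print(abs(h_hat - h_true))  # estimation error after averaging
```

Averaging over the repetitions reduces the noise standard deviation of the estimate by the square root of `num_repeats`, which is the averaging gain the text refers to.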

While this method works in principle, its main shortcomings are an excessively long
acquisition time and poor noise tolerance. Indeed, there are $O\left({N}^{3}\right)$ OFDM VSTF coefficients to be evaluated (where *N* is
the number of *Analog to Digital Converter* (ADC) samples in an OFDM symbol,
i.e., the OFDM FFT size), thus the overall measurement time, expressed in ADC sample interval
units, should be $N\cdot O\left({N}^{3}\right)\cdot S=O\left({N}^{4}S\right),$ where $S$ is the number of averaged measurements. E.g., for *N*
= 1024 and *S* = 100 (yielding 20 dB averaging gain), we would require
$O\left({10}^{14}\right)$ samples. In this particular example we would be able to measure
the VSTF with a resolution of $\Delta \nu ={B}_{T}/N,$ where ${B}_{T}$ is the OFDM signal aggregate bandwidth, e.g.
for ${B}_{T}=25\,\text{GHz},$ we would attain a resolution of $\Delta \nu \cong 25\,\text{MHz}.$

Unfortunately, this naïve procedure would be prohibitive in terms of the time required to complete the measurement. E.g., assuming the ADC sampling rate is 25 GS/s (in practice the sampling rate would be somewhat higher than ${B}_{T}$, due to oversampling), the $O\left({10}^{14}\right)$ samples would correspond to a measurement time of $O\left({N}^{4}S\right)\cdot {B}_{T}^{-1}\cong {10}^{14}\cdot 40\,\text{psec}=4000\,\text{sec}.$ If the frequency resolution were coarsened by a factor of 8, to 200 MHz, by using *N* = 128 rather than *N* = 1024 OFDM FFT size, the measurement time would be reduced by a factor of ${8}^{4}=4096$, from 4000 sec down to about 1 sec. This would still be an unacceptably slow acquisition, precluding protection switching, for which application the link must be reconfigured within several msec. Thus, this naïve exhaustive three-tone test method of VSTF monitoring is not sufficiently fast to be practically applicable.
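The order-of-magnitude arithmetic above can be reproduced with a short Python sketch. It assumes exactly ${N}^{3}$ triplets (the text only states $O({N}^{3})$), *N* samples per OFDM symbol, *S* repetitions per triplet, and a sampling rate equal to the 25 GS/s figure quoted in the text; `naive_acquisition_time_s` is a hypothetical helper name.

```python
def naive_acquisition_time_s(fft_size: int, num_averages: int, sample_rate_hz: float) -> float:
    """Time for the exhaustive test: N samples/symbol * ~N^3 triplets * S repeats / ADC rate."""
    total_samples = fft_size ** 4 * num_averages
    return total_samples / sample_rate_hz


for n in (1024, 128):
    seconds = naive_acquisition_time_s(n, 100, 25e9)
    print(f"N = {n}: about {seconds:.3g} s")
```

With these assumptions the sketch gives roughly 4400 s for *N* = 1024 and roughly 1.07 s for *N* = 128, consistent with the rounded 4000 sec and 1 sec figures in the text, and the ratio between the two cases is ${8}^{4}=4096$.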

In the sequel we explore alternative VSTF monitoring approaches consisting of sending more sophisticated *training sequences* (TS) rather than the simplistic three-tone OFDM symbols just considered. Such TS would contain more energy and improve the SNR, making it possible to eliminate averaging and speed up the measurement. The foreseen difficulty is that the TS, now consisting of multiple (many more than three) pilot tones, would simultaneously excite superpositions of three-tone responses, which would fall on the same set of output frequencies and would be hard to separate out. Nevertheless, the challenges entailed in such an approach may be successfully addressed and a fast method of nonlinear channel identification obtained on its basis, as will be explored below.

## 4. VSTF analytics revisited - compressed representations

Let us revisit Eq. (11), which specified the VSTF of a most general fiber link, the only underlying assumptions having been that the link be dispersive up to second order and the Kerr nonlinearity be of the third-order. We observe a unique mathematical structure of that expression, which turns out to be crucial to our objectives: the dependence of the VSTF on the tone indices, $i;jk,$ is solely via their combination

$${m}_{i;jk}\equiv (j-i)(k-i),\qquad (19)$$

which expression was referred to in Eq. (90) of [27] as the ‘hyperbolic distance’ between the pairs $\left(j,k\right)$ and $\left(i,i\right).$ Thus, the VSTF may be expressed as a composition of two mappings: the hyperbolic distance function ${m}_{i;jk}$ (a function of the three indices *i,j,k*) and the function ${{H}^{\prime}}_{m}$ of Eq. (20), which depends on the single integer index *m*. Equation (20) is henceforth referred to as ‘compressed VSTF’. It is the composition of the two mappings that yields the VSTF:

$${H}_{i;jk}={{H}^{\prime}}_{{m}_{i;jk}}={{H}^{\prime}}_{(j-i)(k-i)}\qquad (21)$$

(for every target index *i*). This is indicative of potential substantial reduction in the number of degrees of freedom (DOFs) required for nonlinear SID: The compressed VSTF number of DOFs will be seen to grow quadratically, as $O\left({N}^{2}\right)$ rather than cubically, $O\left({N}^{3}\right),$ with *N*. As for the coefficient in the $O\left({N}^{2}\right)$ dependence, this coefficient will be seen to be of the order of 0.2-0.3, as the set of valid *m*-values happens to include “gaps”, i.e., there may be non-consecutive values of *m*, corresponding to missing *m* indices for which no combination of valid *i,j,k* subcarrier indices may be found.
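The quadratic growth and the “gaps” can be explored numerically with a brute-force Python sketch. It assumes the index mapping $m=(j-i)(k-i)$ and the proper-triplet constraints ($j\ne i$, $k\ne i$, $1\le j+k-i\le N$) with 1-based subcarrier indices; `feasible_m_values` is a hypothetical helper, and no particular printed values are claimed here.

```python
from typing import Set


def feasible_m_values(n: int) -> Set[int]:
    """Distinct m = (j - i)(k - i) over all proper FWM triplets with indices in 1..n."""
    values: Set[int] = set()
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            for k in range(1, n + 1):
                if j != i and k != i and 1 <= j + k - i <= n:
                    values.add((j - i) * (k - i))
    return values


for n in (16, 32, 64):
    m_values = feasible_m_values(n)
    # Integers between min and max m that never occur (m = 0 is always among them).
    gaps = (max(m_values) - min(m_values) + 1) - len(m_values)
    print(n, len(m_values), len(m_values) / n ** 2, gaps)
```

The printed ratio `len(m_values) / n ** 2` is the coefficient of the $O({N}^{2})$ dependence before any symmetry is exploited. For example, with *N* = 16 the value *m* = 56 = 7·8 is feasible while *m* = 55 and *m* = 17 are not, since none of their integer factorizations fits within the subcarrier range.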

The simple insight that the VSTF is fully specified in terms of the 1-D compressed VSTF function ${{H}^{\prime}}_{m}$ will lead to considerable simplification of the nonlinear monitoring procedure, which is reformulated to directly target ${{H}^{\prime}}_{m},$ rather than the original ${H}_{i;jk}.$ This approach is generally applicable to any fiber optic link affected by second-order CD and third order nonlinearity, be it uniform or non-uniform, with arbitrary profiles of loss, nonlinearity and dispersion parameters.

Further reduction in the number of DOFs may be obtained for a CD-unmanaged multi-span fiber link by accounting for the specific functional form of ${{H}^{\prime}}_{m},$ which may attain low-power and/or low multiplicity values for certain values of the index *m* (corresponding to certain triplets), whenever the array factor generates destructive interference. Additional functional symmetry properties of ${{H}^{\prime}}_{m}$ will also be applied to further reduce the number of effective DOFs. In particular, it is readily proven that the compressed VSTF, ${{H}^{\prime}}_{m}$ (Eq. (20)), is an anti-hermitian function of its index, *m*:

$${{H}^{\prime}}_{-m}=-{{H}^{\prime}}_{m}^{*},\qquad (22)$$

a property to be exploited in the *System Identification* (SID) procedure as derived in the next section.

In the remainder of this section we develop the mathematical properties of the compressed VSTF formulation, setting the mathematical background for developing the novel fast SID.

#### 4.1 Compressed VSTF mathematical properties

It is useful to rework, in the compressed *m*-index notation, our previous results for the VSTF of various links. Notice that the following relation always holds:

For a general uniformly-dispersive link, ${\beta}_{2}\left(z\right)={\beta}_{2},$ we convert Eq. (12) using the mapping of Eq. (23), obtaining the following result for the compressed VSTF:

Fig. 2(a) plots, against the *m*-index, the compressed VSTF magnitude for a multi-span link with identical spans, as described by Eq. (25). The figure also shows the constituent multiplicative factors, namely the single-span VSTF and the array factor as per the respective Eqs. (26), (27). The array factor is seen to modulate the slowly and smoothly decaying profile of the single-span VSTF, superimposing a spiky periodic structure.

Notice that Eq. (19) amounts to a many-to-one mapping from the pair of indices *j,k* to the index *m* (for a given target index, *i*). Certain indices, *m*, may each correspond to multiple (*j,k*) pairs as there may be multiple ways to factor *m* as a product of *j*−*i* and *k*−*i*. In contrast, *m*-index values which are either prime or are expressible as products of two prime numbers are factorizable just in a single way as products of *j*−*i* and *k*−*i*. Moreover, there might be values of *m* which are not feasible (are undefined) as the *j,k* values of their integer factors fall outside $S[i],$ for every $1\le i\le N.$ The inverse image of the many-to-one transformation of Eq. (19), i.e. the set of (*j,k*) pairs falling on target index *i* (uniquely defining FWM triplets) and satisfying the relation $m=(j-i)(k-i),$ is denoted as ${S}_{m}[i]$ (formally defined in Eq. (37) below), introducing the following sets of indices:

For points *m* which are not feasible, we take ${S}_{m}[i]=\varnothing ,$ the empty set, with cardinality zero.

Comparing the definition of ${S}_{m}[i]$ with that of $S[i]$ (Eq. (8)) it is apparent that the set of inverse images ${\left\{{S}_{m}[i]\right\}}_{m\in \tilde{M}}$ forms a disjoint partition of the aggregate set $S[i]$ (Eq. (8)), with each subset in the partition indexed by *m*, i.e., $S[i]={\bigcup}_{m\in \tilde{M}}{S}_{m}[i]$ and ${S}_{m}[i]\cap {S}_{{m}^{\prime}}[i]=\varnothing \ \text{if}\ m\ne {m}^{\prime}.$
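As a concrete illustration of this partition, the following Python sketch enumerates the index sets for a small number of subcarriers. Since Eq. (8) is not reproduced in this section, the sketch assumes that $S[i]$ consists of the ordered pairs $(j,k)$ with $j\ne i,$ $k\ne i$ and $1\le j+k-i\le N$; this form, and the function names `triplet_set` and `partition_by_m`, are illustrative assumptions rather than definitions taken from the text.

```python
from collections import defaultdict


def triplet_set(i, n):
    """Assumed form of S[i]: ordered pairs (j, k), both different from i,
    whose third index j + k - i also lies in 1..n."""
    return {
        (j, k)
        for j in range(1, n + 1)
        for k in range(1, n + 1)
        if j != i and k != i and 1 <= j + k - i <= n
    }


def partition_by_m(i, n):
    """Group the pairs of S[i] by their compressed index m = (j - i)(k - i)."""
    parts = defaultdict(set)
    for j, k in triplet_set(i, n):
        parts[(j - i) * (k - i)].add((j, k))
    return dict(parts)


if __name__ == "__main__":
    # List each feasible m for target subcarrier i = 3 of an N = 8 system.
    for m, pairs in sorted(partition_by_m(3, 8).items()):
        print(m, sorted(pairs))
```

Under this assumption every pair of $S[i]$ lands in exactly one subset, so the subset sizes add up to $\left|S[i]\right|,$ and indices *m* that never occur simply have no entry (the empty set).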

In addition we shall also require the following set, assembling all triplets with given *m*-index corresponding to all the subcarriers:

*m*, referred to as the ‘*m-index multiplicity*’, is plotted in Fig. 2(b), featuring a large variation. Thus, whereas all ${H}_{i;j,k}$ elements are of equal relevance (in the sense that each element corresponds to a single triplet of channels with all triplets on equal footing), the elements of ${{H}^{\prime}}_{m}$ vary wildly in “relevance”: some correspond to many triplet combinations, while others correspond to just a few. Thus the samples of the compressed VSTF must be weighted in some sense by their multiplicity, $\left|{S}_{m}\right|,$ which was plotted in Fig. 2(b). In the next sub-section we proceed to specify the nature of the multiplicity weighting following [26,27] in order to express the total FWM power at target index *i* as an incoherent summation of the powers of the individual FWM intermodulation products falling on the target frequency indexed *i*.

#### 4.2 Compressed representation of the FWM mixing products build-up

Based on the compressed representation of the VSTF we proceed to formulate a corresponding compressed representation of the FWM products build-up process, recasting Eq. (9) in the form:

Next, we evaluate the efficiency improvement attained by switching from the conventional to the compressed VSTF description by evaluating $\left|\tilde{M}\right|$ vs. ${\sum}_{i=1}^{N}\left|S[i]\right|.$

The cardinality of the set *S*[*i*] was derived in [27]:

*N*, the corresponding sizes of the set $\tilde{M}$ of *m*-indices which are seen to be much smaller. As the set $\tilde{M}$ contains the products of two integer factors $\left(j-i\right),\left(k-i\right),$ each ranging from −(*N*−1) to (*N*−1), it approximately scales as $O({N}^{2}),$ which is a factor of *N* less than the order of ${\sum}_{i=1}^{N}\left|S[i]\right|.$ Actually, $O({N}^{2})$ is a loose upper bound.

In the case of interest of *N* being a power-of-two, we have for the maximum value included in the $\tilde{M}$set the expression:

*m*-index is given by ${m}_{\mathrm{min}}=-{m}_{\mathrm{max}}.$ The $\tilde{M}$ set is properly included in the integer range $\{-{m}_{\mathrm{max}},-{m}_{\mathrm{max}}+1,\mathrm{...},{m}_{\mathrm{max}}-1,{m}_{\mathrm{max}}\},$ as there are non-feasible values of *m*, as already indicated. Thus, we have the upper bound $\left|\tilde{M}\right|\le 2{m}_{\mathrm{max}}\cong {N}^{2}/2.$ In fact $0.5{N}^{2}$ is a loose upper bound: the ratio $\left|\tilde{M}\right|/{N}^{2}$ is tabulated in Table 1 and is seen to fall well under 0.5, essentially falling in the 0.2 to 0.3 range. For $N\ge 128,$ which is the range of OFDM sizes of interest, the table indicates a tighter upper bound $\left|\tilde{M}\right|<\tfrac{1}{4}{N}^{2}.$
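The counting argument can be checked by brute force for small *N*. The sketch below assumes that the triplets falling on subcarrier *i* are the ordered pairs $(j,k)$ with $j\ne i,$ $k\ne i$ and $1\le j+k-i\le N$ (an assumed form of Eq. (8), which is not reproduced here), and reports $\left|\tilde{M}\right|,$ the ratio $\left|\tilde{M}\right|/{N}^{2}$ and the compression ratio. The function name is hypothetical, and the printed numbers depend on the assumed triplet definition, so they are not guaranteed to coincide with Table 1.

```python
def feasible_m_and_triplet_count(n):
    """Return (set of feasible m-indices, total number of (i; j, k) triplets)
    under the assumed triplet definition."""
    m_set, total = set(), 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            for k in range(1, n + 1):
                if j != i and k != i and 1 <= j + k - i <= n:
                    m_set.add((j - i) * (k - i))
                    total += 1
    return m_set, total


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        m_set, total = feasible_m_and_triplet_count(n)
        # columns: N, |M|, |M| / N^2, compression ratio sum|S[i]| / |M|
        print(n, len(m_set), len(m_set) / n**2, total / len(m_set))
```

For even *N* and this assumed definition, the largest feasible index works out to $(N/2)(N/2-1),$ which is consistent with the $2{m}_{\mathrm{max}}\cong {N}^{2}/2$ bound quoted above.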

This approximately quantifies the initial compression achievable with our ‘fast SID’ approach. The precise compression factor is presented in the rightmost column of Table 1, which presents the compression ratio ${\sum}_{i=1}^{N}\left|S[i]\right|/\left|\tilde{M}\right|.$ An approximation for this ratio is

*i*. Assuming uncoded OFDM *n*-ary PSK transmission with constellation $\sqrt{{P}_{T}/N}{e}^{j\alpha 2\pi /n},\ \alpha =0,1,\mathrm{...},n-1$ (as will be used for our training sequences), we follow the approach of [27] to express the total FWM power falling on the *i*-th subcarrier in terms of the powers of the individual triplets.

*m*-index notation, realizing that all triplets corresponding to a given *m*-index share the same ${\left|{{H}^{\prime}}_{m}\right|}^{2},$ hence their power contribution is weighted by the factors $\left|{S}_{m}[i]\right|,$ which were referred to as the multiplicities of the *m*-indices. The expression of Eq. (35) for the FWM power was described in [27]; it is approximately akin to having individual triplets incoherently superpose in power, except that the triplets $j,k,j+k-i$ and $k,j,j+k-i$ (related by transposing *j,k*) are indistinguishable in their respective coherent field contributions and hence reinforce coherently, yielding a factor of two in Eq. (35). The approximation stems from neglecting the ‘degenerate’ triplets of the form $j,j,2j-i.$ Summing the nonlinear interference power (Eq. (35)) over all sub-carriers yields

## 5. Efficient nonlinear system identification of the fiber link

At the end of section 3 we proposed to explore the potential for SID nonlinear performance monitoring (measurement of the FWM VSTF) by transmitting training sequences which are ‘rich’ in frequency content and more energetic, as opposed to the initial naïve three-tone-test approach which was based on training symbols containing just three pilot subcarriers.

Resorting to the compressed VSTF formalism of the last section, we reformulate the nonlinear channel model of Eq. (30) as follows:

*i*-th subcarrier, due to all triplets corresponding to the *m*-th subset, ${S}_{m}[i]:$

*i* and *m*, since out of the triplets falling on *i* (belonging to the set $S[i]$) we must single out a subset ${S}_{m}[i]$ of triplets satisfying the requirement $(j-i)(k-i)=m.$ The modified formulation of Eq. (37), with ${A}_{i,m}^{\text{FWM}}$ given by Eq. (38), describes a simple linear transformation underlying our proposed efficient SID method, to be introduced next.

#### 5.1 SID procedure using arbitrary training sequences formulated as least-squares problem

We proceed to formulate the SID problem with general training sequences ${\left\{{A}_{i}\right\}}_{i=1}^{N}$ as estimation of the compressed VSTF, ${{H}^{\prime}}_{m},$ based on measuring the subcarrier complex amplitudes, ${R}_{i}^{\text{FWM}},$ under transmission of arbitrary training sequences. We transmit a TS consisting of a succession of OFDM symbols, each having all its *N* sub-carriers generally non-zero. The received complex amplitudes are labeled as ${R}_{i}[t],$ where $i\in \{1,2,\mathrm{...},N\}$ is the subcarrier frequency index (*i*-th output of the Rx FFT), and $t\in \tilde{T}$ indexes the transmitted OFDM symbol in the TS (the ‘training symbol’). Here $\tilde{T}\subset {\mathbb{Z}}^{+}$ is a set of training symbol discrete-time indices, assumed for simplicity of notation to consist of contiguous integers. We also allow for the possibility of repeated transmissions, in order to enable averaging of the additive noise for SNR improvement; however, the averaging process is to be modeled outside the context of the set $\tilde{T}$ of TS indices, i.e. the training symbols labeled by $t\in \tilde{T}$ are independent rather than being repeated.

The subcarrier complex amplitudes ${R}_{i}[t]$ received in training mode are logged after CD equalization and after SPM/XPM compensation (a deterministic counter-rotation of all subcarrier complex amplitudes to correct for SPM/XPM [27]). Assuming perfect CD equalization and SPM/XPM compensation, and neglecting noise-signal interaction, the received signal would be equal to the transmitted signal ${A}_{i}[t],$ plus the FWM nonlinear distortion terms, plus additive noises:

*least-squares* (LS) one (under additive white Gaussian noise this is equivalent to a maximum-likelihood estimator), expressed as a minimization of the mean-squared error:

*t* would be discarded, and the *Mean Square Error* (MSE) would be expressible in terms of a summation over the subcarriers, *i*, represented in vectorial notation as a squared norm:

*t*. The mean square error now becomes $\mathrm{MSE}={\Vert \delta R-A\,{H}^{\prime}\Vert}^{2},$ where the $\left|\tilde{M}\right|\times 1$ column vector ${H}^{\prime}$ is defined just as in Eq. (43), whereas the observation vector, $\delta R,$ and the $A$ matrix are now defined in terms of the following vertical concatenations of blocks:

has full column rank, which it typically does) and there would be no precise solution. However, we may obtain an optimal ‘pseudo-solution’ by selecting ${H}^{\prime}$ such that $A\,{H}^{\prime}$ be closest in Euclidean distance to the observed $\delta R$ vector. Formally, Eq. (41) is rewritten as

**A** *pseudoinverse* (PI) $\left|\tilde{M}\right|\times N\left|\tilde{T}\right|$ matrix:

*the PI may be evaluated offline* from the training sequences, which are specified in advance. The remaining real-time task of the Rx is to perform the matrix-vector multiplication. Thus, the estimation of the $\left|\tilde{M}\right|$-element column vector $\widehat{{H}^{\prime}}$ of VSTF samples amounts to a linear processing task consisting of multiplying the accumulated observations vector $\delta R$ (which contains $N\left|\tilde{T}\right|$ samples) by this fixed pre-evaluated $\left|\tilde{M}\right|\times N\left|\tilde{T}\right|$ PI matrix. This is more efficient than the nonlinear SID method [32], wherein the matrix inversion must be dynamically evaluated using the far more complex adaptive *Recursive Least Squares* (RLS).
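The offline/online split described in this subsection can be sketched numerically on a toy system. The code below is a minimal illustration under stated assumptions: the FWM matrix entry $(i,m)$ is taken as the sum of ${A}_{j}{A}_{k}{A}_{j+k-i}^{*}$ over the pairs with $(j-i)(k-i)=m,$ $j\ne i,$ $k\ne i$ and $1\le j+k-i\le N$ (an assumed reading of Eqs. (8) and (38), which are not reproduced here); the "true" compressed VSTF is random synthetic data; and the helper names are hypothetical.

```python
import numpy as np


def feasible_m(n):
    """Sorted list of feasible m-indices under the assumed triplet definition."""
    return sorted({(j - i) * (k - i)
                   for i in range(1, n + 1)
                   for j in range(1, n + 1)
                   for k in range(1, n + 1)
                   if j != i and k != i and 1 <= j + k - i <= n})


def fwm_matrix(symbol, m_list):
    """Assumed N x |M| FWM matrix of one training symbol."""
    n = len(symbol)
    col = {m: c for c, m in enumerate(m_list)}
    mat = np.zeros((n, len(m_list)), dtype=complex)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            for k in range(1, n + 1):
                third = j + k - i
                if j != i and k != i and 1 <= third <= n:
                    m = (j - i) * (k - i)
                    mat[i - 1, col[m]] += (symbol[j - 1] * symbol[k - 1]
                                           * np.conj(symbol[third - 1]))
    return mat


rng = np.random.default_rng(0)
n, n_symbols = 8, 40
m_list = feasible_m(n)

# Offline: unit-modulus training symbols, stacked A matrix and its pseudo-inverse.
training = np.exp(1j * rng.uniform(0, 2 * np.pi, size=(n_symbols, n)))
a_matrix = np.vstack([fwm_matrix(sym, m_list) for sym in training])
pi_matrix = np.linalg.pinv(a_matrix)

# Online: synthetic ground truth, noiseless observations, one matrix-vector product.
h_true = rng.normal(size=len(m_list)) + 1j * rng.normal(size=len(m_list))
delta_r = a_matrix @ h_true
h_est = pi_matrix @ delta_r
```

In this noiseless toy case `h_est` reproduces `h_true` to numerical precision; only the last line would have to run in real time, since `pi_matrix` depends on the training sequence alone.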

#### 5.2 Exploiting the anti-hermitian symmetry of the compressed VSTF

In the last subsection we have shown how to select an optimal $\widehat{{H}^{\prime}}$ as a solution to the optimization problem of Eq. (45). We have previously shown that the compressed VSTF features anti-hermitian symmetry (Eq. (22)), which is equivalent to the real and imaginary parts satisfying:

*u*, *l* indicate upper and lower blocks.

Without loss of generality we then decompose the $\delta R$ vector into real and imaginary parts,

and further separate the $A$ matrix into left and right blocks (denoted by the subscripts *L,R*):

The optimization problem of Eq. (45), applied to ${H}^{\prime}$ constrained to the form of Eq. (49), and further using the decompositions of Eqs. (50),(51), yields after separation to real and imaginary parts the formulation:

**A**-matrix is now $2N\left|\tilde{T}\right|\times \left|\tilde{M}\right|$ rather than $N\left|\tilde{T}\right|\times \left|\tilde{M}\right|$ and the modified PI matrix is now dimensioned $\left|\tilde{M}\right|\times 2N\left|\tilde{T}\right|$) albeit with real-valued rather than complex-valued unknowns.

We have thus gained in two respects. First, for a given number of transmitted OFDM symbols, we collect twice as many equations, each with the same number of unknowns. Indeed, the threshold condition for a unique valid optimal least-squares solution is having a full column rank **A**-matrix, for which it is necessary that the number of equations be equal to or greater than the number of unknowns; we have thus halved the number of OFDM symbols necessary to obtain a solution and consequently halved the acquisition time. Secondly, each complex multiplier in the original **A**-matrix formulation amounts to three real multipliers, whereas the new ${A}^{-}$ has twice as many columns but is real-valued rather than complex-valued, i.e. it requires two real multipliers in place of each original complex one. The anti-hermitian-constrained optimization therefore provides a complexity reduction factor of 3/2 relative to the original unconstrained optimization problem.
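The effect of the anti-hermitian constraint can be illustrated with a small numerical sketch. It is not the formulation of the missing display equations but one way of writing the same idea, under these assumptions: the columns of the complex matrix are ordered as [positive *m* | matching negative *m*], the constraint is ${{H}^{\prime}}_{-m}=-{{H}^{\prime}}_{m}^{*},$ and a random complex matrix stands in for the FWM matrix. With $x$ and $y$ the real and imaginary parts of the positive-*m* samples, $\delta R=({A}_{L}-{A}_{R})x+j({A}_{L}+{A}_{R})y,$ which separates into a real-valued system with twice as many rows.

```python
import numpy as np

rng = np.random.default_rng(1)
n_rows, n_pos = 6, 5  # 6 complex equations; 5 positive-m samples (10 complex unknowns overall)

# Synthetic stand-in for the FWM matrix, columns ordered [m > 0 | matching -m].
a_left = rng.normal(size=(n_rows, n_pos)) + 1j * rng.normal(size=(n_rows, n_pos))
a_right = rng.normal(size=(n_rows, n_pos)) + 1j * rng.normal(size=(n_rows, n_pos))

# Anti-hermitian ground truth: H'_{-m} = -conj(H'_m).
h_pos = rng.normal(size=n_pos) + 1j * rng.normal(size=n_pos)
h_neg = -np.conj(h_pos)
delta_r = a_left @ h_pos + a_right @ h_neg

# Real-valued system in the unknowns [x; y] = [Re(h_pos); Im(h_pos)].
c, d = a_left - a_right, a_left + a_right
a_real = np.block([[c.real, -d.imag],
                   [c.imag, d.real]])          # shape (2 * n_rows, 2 * n_pos)
rhs = np.concatenate([delta_r.real, delta_r.imag])

xy, *_ = np.linalg.lstsq(a_real, rhs, rcond=None)
h_est = xy[:n_pos] + 1j * xy[n_pos:]
```

Here six complex equations would be too few to determine ten unconstrained complex unknowns, whereas the constrained real-valued system has twelve equations for ten real unknowns and recovers `h_pos` in the noiseless case.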

#### 5.3 Training sequences

As for the training sequences to be used for best performance: since we probe the fiber nonlinearity, it is advantageous to excite the system with signals as strong as possible, within the regime of validity of the third-order Volterra model formalism. Under a peak transmit-power constraint, ${P}_{\text{peak}},$ it is thus worthwhile to set all the components of the training sequence at an equal maximum power level. Our proposed training sequence then consists of OFDM symbols with their *N* subcarriers having the complex amplitudes $\sqrt{{P}_{\text{peak}}}{e}^{j{\theta}_{i}[t]},$ where the phase sequence ${\theta}_{i}[t]$ is white (over both the *i* and *t* indices) with elements drawn from some distribution, e.g. ${\theta}_{i}[t]\sim \text{Unif}[0,2\pi ],$ or more practically from a PSK constellation such as QPSK: ${\theta}_{i}[t]\in \{2\pi \alpha /n\},\ \alpha =0,1,\mathrm{...},n-1.$ It is sufficient to store one realization of a training sequence in the OFDM Tx and offline evaluate the corresponding PI matrix to be applied in the Rx processing.
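A constant-modulus PSK training sequence of this kind is straightforward to generate. The sketch below is illustrative only; the function name, its parameters and the fixed seed are assumptions introduced here (a fixed seed mirrors the idea of storing a single realization at the Tx so that the PI matrix can be pre-evaluated).

```python
import numpy as np


def psk_training_sequence(n_subcarriers, n_symbols, p_peak=1.0, n_psk=16, seed=0):
    """Constant-modulus training symbols with i.i.d. n-PSK phases.

    Returns a complex array of shape (n_symbols, n_subcarriers); every entry
    has power p_peak and a phase drawn uniformly from {2*pi*alpha/n_psk}.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.integers(0, n_psk, size=(n_symbols, n_subcarriers))
    return np.sqrt(p_peak) * np.exp(2j * np.pi * alpha / n_psk)


ts = psk_training_sequence(n_subcarriers=64, n_symbols=20, p_peak=2.0)
```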

#### 5.4 Noise analysis

Let us derive the noise performance of the proposed PI-based system identification (PI-SID) procedure and compare it with that of the *three-tone-test* (3TT) SID described in section 3. The PI-SID amounts to a matrix multiplication by ${A}^{-}$ of the noisy observation vector $\delta R=\langle \delta R\rangle +n,$ where $n$ denotes the vector of additive noises accompanying the components of the observed vector $\delta R.$ Propagating signal and noise through the PI we have:

*n*-PSK training sequences, which are our preferred option, we have $\left|{\breve{A}}_{i}\right|=1,$ i.e. the only DOFs in TS design are the phases.

The FWM-matrix elements are then expressed as:

**A**-matrix is then expressed as $A={A}_{o}^{3}\breve{A},$ where $\breve{A}$ is defined as the normalized FWM-matrix with elements ${\left[\breve{A}\right]}_{i,m}={\breve{A}}_{i,m}^{\text{FWM}}$ given in Eq. (38). Substituting $A={A}_{o}^{3}\breve{A}$ into Eq. (46) relates the un-normalized and normalized PIs as follows: ${A}^{-}={A}_{o}^{-3}{\breve{A}}^{-}.$

The total power of the output noise vector, ${{n}^{\prime}}^{\text{PI-SID}},$ is then given by

Let us now introduce the *Singular-Value-Decomposition* (SVD) of the FWM-normalized matrix (assumed to have full column rank $\left|\tilde{M}\right|$): $\breve{A}=UD{V}^{\dagger},$ where ${U}_{N\left|\tilde{T}\right|\times N\left|\tilde{T}\right|},{V}_{\left|\tilde{M}\right|\times \left|\tilde{M}\right|}$ are unitary, and ${D}_{N\left|\tilde{T}\right|\times \left|\tilde{M}\right|}=\mathrm{diag}\left[\sqrt{{\lambda}_{1}},\sqrt{{\lambda}_{2}},\mathrm{...},\sqrt{{\lambda}_{\left|\tilde{M}\right|}}\right]$ is a rectangular (“portrait”, $N\left|\tilde{T}\right|>\left|\tilde{M}\right|$) matrix containing the square-roots of the singular values along its diagonal, zero elsewhere. The average noise power in the reconstructed compressed VSTF vector, ${H}^{\prime}$ (averaged over all its $\left|\tilde{M}\right|$ components), is then obtained by dividing the total noise power (Eq. (55)) by $\left|\tilde{M}\right|$:

*i*, in response to a “3-tone” OFDM training symbol with amplitudes ${A}_{o}$ for subcarriers *j, k, j + k − i* and zero elsewhere. Now, writing $\delta {R}_{i;jk}=\langle \delta {R}_{i;jk}\rangle +{\underset{\sim}{n}}_{i}^{\text{3TT-SID}},$ the fluctuations in the reconstructed VSTF are given by:

Figure 3 plots the noise reduction figure-of-merit (Eq. (58)) vs. the compressed size $\left|{\tilde{M}}^{\prime}\right|$ (as described in section 6), parameterized by $T\equiv \left|\tilde{T}\right|,$ for *N* = 64 and a 16-PSK TS (evaluated for the symmetrized **A** matrix of section 5.2). It is apparent that doubling the measurement time $\left|\tilde{T}\right|$ results in a noise reduction larger than 3 dB, indicating that this technique is preferable to simple averaging of the received noisy samples.
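The way white noise propagates through a pseudo-inverse can be verified numerically. The sketch below uses a random complex matrix as a stand-in for the normalized FWM matrix (an assumption made purely for illustration) and relies on the standard identity that, for white noise of variance ${\sigma}^{2}$ per sample, the total output noise power equals ${\sigma}^{2}$ times the sum of the inverse squared singular values of the matrix; the squared singular values ${s}_{i}^{2}$ in the code play the role of the ${\lambda}_{i}$ in the notation of this subsection. A Monte-Carlo run is included as a sanity check.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_unknowns, sigma2 = 60, 12, 1e-3

# Synthetic full-column-rank matrix standing in for the normalized FWM matrix.
a = rng.normal(size=(n_obs, n_unknowns)) + 1j * rng.normal(size=(n_obs, n_unknowns))
pi = np.linalg.pinv(a)

# Analytic total output noise power: sigma^2 * trace(PI PI^H) = sigma^2 * sum(1 / s_i^2).
s = np.linalg.svd(a, compute_uv=False)
total_analytic = sigma2 * np.sum(1.0 / s**2)
total_trace = sigma2 * np.trace(pi @ pi.conj().T).real

# Monte-Carlo check with circular white Gaussian noise of variance sigma2 per sample.
trials = 20000
noise = np.sqrt(sigma2 / 2) * (rng.normal(size=(n_obs, trials))
                               + 1j * rng.normal(size=(n_obs, trials)))
total_mc = np.mean(np.sum(np.abs(pi @ noise) ** 2, axis=0))

# Average noise power per reconstructed coefficient.
average_per_coefficient = total_analytic / n_unknowns
```

Small singular values dominate the sum, which is why a well-conditioned training sequence (and a longer one, which raises all singular values) lowers the estimation noise.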

## 6. Extra ‘lossy’ compression by band-limitation and prioritized discarding of *m*-indices

In principle, in order to fully characterize the VSTF over all triplets we ought to evaluate ${{H}^{\prime}}_{m}$ (Eq. (30)) over all feasible indices, $m\in \tilde{M}.$ However, this task might be time and computationally demanding, especially when higher frequency resolution is sought. It turns out that the computational and acquisition time demands may be substantially relaxed while sacrificing a very small controlled amount of estimation accuracy, by introducing additional ‘lossy’ compression of the VSTF, either applying band-limitation (discarding *m*-indices corresponding to higher frequencies) or preferably by smart prioritization of the coefficients according to their power and multiplicity.

In either case the idea is to restrict the domain of the estimated VSTF by excluding the triplets corresponding to smaller nonlinear power contributions. Thus, the VSTF is only evaluated over a ‘relevant’ subset $\tilde{{M}^{\prime}}\subset \tilde{M}$ of the full domain of *m*-indices, $\tilde{M}.$ The compressed VSTF is set to zero outside the restricted domain, with the expectation that this restriction has negligible or little impact on the evaluation of the nonlinear distortions. This is akin to video or image compression, where low-magnitude spatial transform coefficients are simply set to zero.

We note that the methods developed in section 5 apply to the new 'lossy-compressed' VSTF estimation problem by simply replacing $\tilde{M}$ by $\tilde{{M}^{\prime}},$ which amounts to eliminating the corresponding elements of ${H}^{\prime}$ and columns of the **A** matrix.

The *lossy compression quality* (LCQ) may be quantified in terms of the fraction of the power of the participating triplets:

Equation (36) indicates that not all ${{H}^{\prime}}_{m}$ coefficients contribute on equal footing to the overall nonlinear distortion. The nonlinear power contribution associated with index *m* is approximately proportional to $\left|{S}_{m}\right|{\left|{{H}^{\prime}}_{m}\right|}^{2}.$ Many coefficients are weighted very low in terms of their squared absolute value ${\left|{{H}^{\prime}}_{m}\right|}^{2},$ as they either fall between the array factor's side lobes or result from the diminishing single-span VSTF with increasing *m*. Other coefficients may contribute only an insignificant portion of the nonlinear distortion power due to their small multiplicity $\left|{S}_{m}\right|,$ resulting in a nonlinear contribution from a small number of triplets. The various lossy compression strategies differ by their selection of the reduced subset, $\tilde{{M}^{\prime}}.$

#### 6.1 Lossy compression of the VSTF by band-limitation

Noticing that both the *m*-index multiplicity, and the single span VSTF decrease with increasing *m*, a simple approach towards discarding of *m*-indices is to exclude all *m*-values beyond a certain distance from the origin, *m* = 0, i.e. select the target *m*-indices as the following ‘reduced’ subset of the full set, $\tilde{M}$:

*m*-index represents a sampled spatial frequency, we may refer to this set as low-pass-filtered or band-limited (BL) around DC. Figure 4(a) plots the LCQ (Eq. (59)) incurred in this simple lossy compression procedure. The LCQ falls off monotonically with increasing size $\left|{\tilde{M}}_{{m}_{\text{cutoff}}}^{\text{BL}}\right|,$ which provides a useful initial tradeoff between complexity (the number of coefficients used) and performance (LCQ). Notice that for an LCQ of −0.4 dB (red line), 958 coefficients are retained. In the next subsection we apply a smarter prioritized discarding procedure.

#### 6.2 Sorted prioritization of the VSTF coefficients according to power and multiplicity

Inspecting Eq. (36) it is apparent that optimal ‘lossy compression’ of the VSTF estimation would be obtained by sorting the *m*-indices of the sequence of summand terms $\left|{S}_{m}\right|{\left|{{H}^{\prime}}_{m}\right|}^{2}$ in decreasing order of their weighted powers, truncating the sorted list at some level, and determining the associated *m*-indices. Denoting the sorted indices by $\mu $ (rather than *m*), and further denoting the ordered list of indices by ${\tilde{M}}_{sort},$ we then replace Eq. (36) by the following equivalent expression:

*i* may then be approximated by truncating the power-sorted list, ${\tilde{M}}_{sort},$ at a certain radius ${\mu}_{\text{cutoff}}<{m}_{\mathrm{max}},$ using the following approximate representation:

*m*-indices or any other permutation of these indices. Figure 4(b) plots the LCQ (Eq. (59)) incurred in this compression method. It is apparent that the sorted policy is superior to the band-limited one in its performance vs. complexity tradeoff; e.g., for an LCQ of −0.4 dB (the horizontal red line), 350 coefficients are retained, as compared with the 958 coefficients required for the band-limited compression of the previous subsection.
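The two selection policies can be compared on synthetic data. The sketch below assumes that the LCQ is the retained fraction of the total weighted power $\sum \left|{S}_{m}\right|{\left|{{H}^{\prime}}_{m}\right|}^{2},$ expressed in dB (an assumed reading of Eq. (59), which is not reproduced here). The weight profile is an arbitrary spiky, decaying function invented for illustration, not the VSTF of any real link, and the function names are hypothetical.

```python
import numpy as np


def lcq_db(weights, keep):
    """Assumed LCQ: retained fraction of the total weighted power, in dB."""
    return 10 * np.log10(weights[keep].sum() / weights.sum())


def band_limited_count(m, weights, target_db):
    """Size of the smallest set {|m| <= cutoff} that meets the target LCQ."""
    for cutoff in np.unique(np.abs(m)):
        keep = np.abs(m) <= cutoff
        if lcq_db(weights, keep) >= target_db:
            return int(keep.sum())
    return len(m)


def sorted_count(weights, target_db):
    """Smallest number of largest-weight coefficients that meets the target LCQ."""
    order = np.argsort(weights)[::-1]
    cumulative = np.cumsum(weights[order]) / weights.sum()
    count = np.searchsorted(cumulative, 10 ** (target_db / 10)) + 1
    return int(min(count, len(weights)))


# Synthetic weights |S_m| |H'_m|^2: spiky and decaying with |m| (illustrative only).
m = np.array([v for v in range(-200, 201) if v != 0])
weights = (np.sin(0.37 * m) ** 2 + 1e-3) / (1.0 + np.abs(m) / 20.0)

n_bl = band_limited_count(m, weights, target_db=-0.4)
n_sorted = sorted_count(weights, target_db=-0.4)
```

Because the largest-weight coefficients maximize the retained power for any given count, `n_sorted` can never exceed `n_bl`; how large the gap is depends entirely on the weight profile.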

The compressed representation of Eq. (62) should also be instrumental in optimizing complexity-performance tradeoffs in active nonlinear compensation of OFDM by means of a frequency-domain Volterra equalizer, optimizing NLC performance for a given number of VSTF coefficients. However, this topic is outside the scope of the current paper, which is dedicated to VSTF system identification.

#### 6.3 Complexity of the proposed nonlinear system identification procedure

The proposed PI-SID procedure amounts to a matrix-vector multiplication via the PI matrix of size $\left|\tilde{M}\right|\times 2N\left|\tilde{T}\right|.$ Fortunately, the processing may be performed ‘on-the-fly’. It is not necessary to collect the full $\delta R$ observations vector (of length $2N\left|\tilde{T}\right|$) prior to commencing the matrix-vector multiplication. Rather, matrix multiplication partial results may be gradually accumulated as soon as the responses due to each TS arrive. To this end, the PI matrix ${A}^{-}$ (in its anti-hermitian symmetrized version) is viewed as being horizontally partitioned into sub-blocks of dimension $\left|\tilde{M}\right|\times 2N$ (there are $\left|\tilde{T}\right|$ such sub-blocks). For each arriving sub-block of 2*N* elements of $\delta R$ we proceed to multiply the received sub-block by the corresponding sub-block of the ${A}^{-}$ matrix, yielding a partial result consisting of an $\left|\tilde{M}\right|$-element column vector, which is element-by-element accumulated in a running sum, yielding the overall $\left|\tilde{M}\right|$-element estimated vector, ${H}^{\prime}.$ This practically eliminates latency, as the multiplication of the $\delta R$ vector may commence right after the negligible delay entailed in receiving its first 2*N* elements, rather than waiting for the entire vector to be received. This indicates that the SID procedure completion time is practically equal to the SID computation time.
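The block-wise accumulation is simply a partitioned matrix-vector product, as the following sketch shows. Random real-valued data stand in for the symmetrized PI matrix and for the observations, and the observation vector is assumed to be ordered so that the 2*N* samples of each training symbol are contiguous; both are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_unknowns, block_len, n_blocks = 30, 16, 25   # block_len plays the role of 2N

# Stand-ins for the pre-evaluated real-valued PI matrix and the stacked observations.
pi_matrix = rng.normal(size=(n_unknowns, block_len * n_blocks))
delta_r = rng.normal(size=block_len * n_blocks)

# Streaming estimate: one PI sub-block is applied per received training symbol.
h_running = np.zeros(n_unknowns)
for t in range(n_blocks):
    cols = slice(t * block_len, (t + 1) * block_len)
    h_running += pi_matrix[:, cols] @ delta_r[cols]

# Reference: the same product computed after collecting the whole vector.
h_batch = pi_matrix @ delta_r
```

Only the running sum of $\left|\tilde{M}\right|$ elements has to be kept between training symbols, so no buffer for the full observation vector is needed.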

To obtain a unique valid pseudo-inverse solution we must have $2N\left|\tilde{T}\right|>\left|\tilde{M}\right|.$ Ignoring noise, it is advantageous to take $\left|\tilde{T}\right|$ as small as possible while satisfying the condition above. In practice, simulations indicate that, due to numerical inaccuracy, $2N\left|\tilde{T}\right|$ must typically exceed $\left|\tilde{M}\right|$ by a factor of the order of 10: $2N\left|\tilde{T}\right|\cong 10\left|\tilde{M}\right|\Rightarrow \left|\tilde{T}\right|\cong 5\left|\tilde{M}\right|/N.$ The number of elements of the PI matrix (equal to the number of real-valued *multiply-accumulate* (MAC) operations) is given by

*N*, as detailed in Fig. 5 (assuming the sorted prioritization of section 6.2).

The complexity of evaluation is defined as the MAC rate, namely the number of MACs per sample, obtained by dividing Eq. (64) by the number of ADC samples, ${S}_{SID}={T}_{SID}/{T}_{s},$ transmitted during the SID procedure duration ${T}_{SID},$ where ${R}_{s}={T}_{s}^{-1}$ is the ADC sampling rate in the OFDM transmitter:

*N*. If we insist on keeping the SID procedure complexity down to a low level of just 1 MAC per sample, then the measurement time must accordingly scale as ${N}^{4}$:

Figure 6 indicates that for the specific channel under test and for *N* = 128,256,512 the SID measurement time is less than 1 msec, whereas for *N* = 1024 it steeply rises to 10 msec.

We note that the fiber nonlinearity is quite a stable effect; thus the nonlinear SID estimate, rapidly obtained in just 1 msec (10 msec for *N* = 1024), should be applicable for an extended period of time, as suitable for protection switching reconfiguration scenarios.
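The relation between the number of retained coefficients, the MAC count and the measurement time can be put into a small budget calculation. The sketch uses only the rule of thumb stated in this subsection ($2N\left|\tilde{T}\right|\cong 10\left|\tilde{M}\right|,$ hence about $10{\left|\tilde{M}\right|}^{2}$ PI-matrix elements); the function name, the example of 1000 retained coefficients and the 25 GS/s sampling rate are hypothetical values chosen for illustration and are not taken from Figs. 5 and 6.

```python
import math


def sid_budget(n_coeffs, sample_rate_hz, oversize=10.0, macs_per_sample=1.0):
    """Return (real-valued MAC count, SID duration in seconds).

    n_coeffs        -- number of retained compressed-VSTF coefficients
    oversize        -- ratio 2N|T| / |M| (about 10 according to the text)
    macs_per_sample -- allowed complexity, in MACs per ADC sample
    """
    macs = oversize * n_coeffs ** 2            # elements of the |M| x 2N|T| PI matrix
    duration = macs / (macs_per_sample * sample_rate_hz)
    return macs, duration


# Hypothetical example: 1000 coefficients at an assumed 25 GS/s sampling rate.
macs, duration = sid_budget(n_coeffs=1000, sample_rate_hz=25e9)
print(f"{macs:.3g} MACs, {duration * 1e3:.3g} ms at 1 MAC per sample")
```

Since the coefficient count grows roughly as ${N}^{2},$ the quadratic dependence on `n_coeffs` is what produces the ${N}^{4}$ scaling of the measurement time at a fixed MAC rate.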

## 7. Simulations of operational performance of the proposed SID

In this section we present additional simulations assessing and validating the proposed SID procedure over a fiber link numerically modeled by the SSF method. Once the VSTF is estimated using our SID method, we assess the quality of the resulting estimate by substituting the estimated VSTF into a nonlinear compensation module incorporated in the simulated Rx. The better the quality of the VSTF estimate, the more precise the nonlinear compensation becomes, thus this method provides direct operational assessment of the quality of our SID, and its dependence on the various parameters.

The estimated VSTF is plugged into an ideal NLC whereby a ‘genie’ makes available to the NLC the transmitted symbols, which are then distorted via a ‘synthetic fiber link’ modeled in terms of the SID-estimated VSTF. The synthesized distortion is subtracted off the actually ‘measured’ distortion of the uncompensated link. The residual FWM distortion per subcarrier is represented by the variance of the nonlinear interference in each sub-carrier, expressed as the MER of the noisy constellation per sub-carrier, which is used as the criterion of merit for the SID performance.

#### 7.1 Simulation setup

The simulation setup is an OFDM link with *N* subcarriers akin to that described in sub-section 2.4. At the Tx we launch a uniformly distributed white 16-PSK pseudo-random training sequence. At the Rx we compensate for CD and SPM/XPM, providing *N* subcarrier measurements ${\left\{{R}_{i}\right\}}_{i=1}^{N}$ per OFDM symbol. These measurements are made available to the SID procedure, which performs the PI matrix multiplication as per section 5, reconstructing the compressed VSTF, from which the uncompressed VSTF is retrieved as per Eq. (21). We perform the lossy-compressed SID procedure introduced in section 6, varying the number of coefficients $\left|{\tilde{M}}^{\prime}\right|,$ as well as the acquisition time $\left|\tilde{T}\right|$ available to the Rx.

The quality of the estimated VSTF is tested at the NLC output in terms of the variance of the nonlinear interference per subcarrier, expressed as the perturbed constellation *modulation error ratio* (MER). We compare the MER performance for the uncompensated received signal, with post-NLC performance loading our identified VSTF into the NLC, and also with the post-NLC performance obtained with an analytically calculated VSTF according to Eq. (12), which results are used for reference.
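For reference, the MER figure of merit can be computed as the ratio of the reference constellation power to the error power. The sketch below uses this common definition (assumed here, since the text does not spell it out) on a toy single-subcarrier example with made-up distortion values; the function name is illustrative.

```python
import numpy as np


def mer_db(reference, received):
    """Modulation error ratio in dB: reference power over error power."""
    reference = np.asarray(reference)
    error = np.asarray(received) - reference
    return 10 * np.log10(np.sum(np.abs(reference) ** 2) / np.sum(np.abs(error) ** 2))


# Toy example: 16 unit-modulus symbols with a distortion of magnitude 0.1 each.
tx = np.exp(2j * np.pi * np.arange(16) / 16)
distortion = 0.1 * np.exp(1j * np.linspace(0, 3, 16))
rx = tx + distortion

mer_uncompensated = mer_db(tx, rx)
# Compensation with an imperfect distortion estimate that captures 90% of the field.
mer_compensated = mer_db(tx, rx - 0.9 * distortion)
```

In this toy case the uncompensated MER is 20 dB and removing 90% of the distortion field raises it to 40 dB, which illustrates how the accuracy of the estimated VSTF translates directly into post-NLC MER.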

#### 7.2 SID simulated performance

We start with a short 3x100Km noiselessly amplified link and *N* = 32 subcarriers. Figure 7(a) presents SID performance vs. the number of VSTF coefficients $\left|{\tilde{M}}^{\prime}\right|,$ for a simple selection scheme setup (subsection 6.1), parameterized by the acquisition time $\left|\tilde{T}\right|$:

For *N* = 32 the full group of *m*-indices, $\tilde{M},$ includes 282 coefficients. As $\left|{\tilde{M}}^{\prime}\right|$ approaches this limit we get improved results, until finally inclusion of additional coefficients provides diminishing improvement in approximating the nonlinear distortion. Upon increasing the acquisition time the results approach the optimal selection, which is a band-limited truncation of the analytically calculated ${{H}^{\prime}}_{m}.$

Figure 7(b) depicts results for a SID procedure with the sorted prioritization scheme introduced in subsection 6.2. The MER for any given number of coefficients is improved here compared to the band-limited approach. Figure 8 plots results for the identified ${{H}^{\prime}}_{m}$ for $\left|{\tilde{M}}^{\prime}\right|=280,\left|\tilde{T}\right|=120.$ The analytically calculated ${{H}^{\prime}}_{m}$ is presented for reference.

Figure 9(a) shows full MER results vs. subcarrier index for the parameters given above, compared with the uncompensated results, and results for the analytically calculated VSTF.

Figure 9(b) plots similar results for a long 25x100Km noiselessly amplified link with *N* = 256. It is apparent that for sufficient acquisition time the nonlinear estimation results are substantially improved.

Finally we consider a noisy version of the noiseless channel used in Fig. 7, whereby each amplifier along the link adds white Gaussian noise, set at a level such that the overall average uncompensated MER decreases by about 3 dB, from 24.5 dB to 21.6 dB, which is attained at a Noise Figure of 8.5 dB for each of the three link amplifiers. This configuration approximately balances the amounts of FWM distortion power and ASE Gaussian noise power, which is the typical operating point of practically operated long-haul links. It is apparent that the acquisition time is increased relative to the noiseless ideal case simulated in Fig. 7, yet our SID procedure is quite tolerant of ASE white noise, as illustrated in Fig. 10.

## 8. Conclusions

We have proposed a novel computationally efficient and accurate system identification method for estimating the Volterra Series Transfer Function of the fiber link by re-using existing OFDM transmission gear. For frequency resolution of *N* = 512,1024 points per 25 GHz channel, the proposed SID algorithm is fast (1,10 msec respectively) and the nonlinear optical performance monitoring complexity is negligible relative to the overall OFDM Rx complexity.

Future research directions include extending the acquired insights into the compressed representations of the VSTF beyond nonlinear optical monitoring in order to improve active nonlinear compensation techniques. Another important direction relegated to future work is to possibly use the estimated VSTF in order to extract the spatial profile of the nonlinear parameter, $\gamma (z),$ along the fiber link.

## Acknowledgments

This work was supported in part by the Israeli Science Foundation (ISF) and by the Chief Scientist Office of the Israeli Ministry of Industry, Trade and Labor within the ‘Tera Santa’ consortium.

## References and links

**1.** A. Bononi, P. Serena, N. Rossi, E. Grellier, and F. Vacondio, “Modeling nonlinearity in coherent transmissions with dominant intrachannel-four-wave-mixing,” Opt. Express **20**(7), 7777–7791 (2012).

**2.** G. Bosco, P. Poggiolini, A. Carena, V. Curri, and F. Forghieri, “Analytical results on channel capacity in uncompensated optical links with coherent detection,” Opt. Express **19**(26), B440–B449 (2011).

**3.** A. Carena, V. Curri, G. Bosco, P. Poggiolini, and F. Forghieri, “Modeling of the impact of nonlinear propagation effects in uncompensated optical coherent transmission links,” J. Lightwave Technol. **30**(10), 1524–1539 (2012).

**4.** X. Chen and W. Shieh, “Closed-form expressions for nonlinear transmission performance of densely spaced coherent optical OFDM systems,” Opt. Express **18**(18), 19039–19054 (2010).

**5.** J. K. Fischer, C.-A. Bunge, and K. Petermann, “Equivalent single-span model for dispersion-managed fiber-optic transmission systems,” J. Lightwave Technol. **27**(16), 3425–3432 (2009).

**6.** F. Vacondio, O. Rival, C. Simonneau, E. Grellier, A. Bononi, L. Lorcy, J.-C. Antona, and S. Bigo, “On nonlinear distortions of highly dispersive optical coherent systems,” Opt. Express **20**(2), 1022–1032 (2012).

**7.** K. Peddanarappagari and M. Brandt-Pearce, “Volterra series transfer function of single-mode fibers,” J. Lightwave Technol. **15**(12), 2232–2241 (1997).

**8.** B. Xu and M. Brandt-Pearce, “Modified Volterra series transfer function method,” Photon. Technol. Lett. **14**(1), 47–49 (2002).

**9.** B. Xu and M. Brandt-Pearce, “Comparison of FWM- and XPM-induced crosstalk using the Volterra series transfer function method,” J. Lightwave Technol. **21**(1), 40–53 (2003).

**10.** J. D. Reis, D. M. Neves, and A. L. Teixeira, “Weighting nonlinearities on future high aggregate data rate PONs,” Opt. Express **19**(27), 26557–26567 (2011).

**11.** J. D. Reis and A. L. Teixeira, “Unveiling nonlinear effects in dense coherent optical WDM systems with Volterra series,” Opt. Express **18**(8), 8660–8670 (2010).

**12.** E. Ip and J. M. Kahn, “Compensation of dispersion and nonlinear impairments using digital backpropagation,” J. Lightwave Technol. **26**(20), 3416–3425 (2008).

**13.** G. Li, E. Mateo, and L. Zhu, “Compensation of nonlinear effects using digital coherent receivers,” in OFC/NFOEC - Conference on Optical Fiber Communication and the National Fiber Optic Engineers Conference (2011), p. OWW1.

**14.** D. Rafique, M. Mussolin, M. Forzati, J. Mårtensson, M. N. Chugtai, and A. D. Ellis, “Compensation of intra-channel nonlinear fibre impairments using simplified digital back-propagation algorithm,” Opt. Express **19**(10), 9453–9460 (2011).

**15.** A. Lobato, B. Inan, S. Adhikari, and S. L. Jansen, “On the efficiency of RF-Pilot-based nonlinearity compensation for CO-OFDM,” in OFC/NFOEC - Conference on Optical Fiber Communication and the National Fiber Optic Engineers Conference (2011), p. OThF2.

**16.** L. B. Y. Du and A. J. Lowery, “Pilot-based XPM nonlinearity compensator for CO-OFDM systems,” Opt. Express **19**(26), B862–B867 (2011).

**17.** L. Liu, L. Li, Y. Huang, K. Cui, Q. Xiong, F. N. Hauske, C. Xie, and Y. Cai, “Intrachannel nonlinearity compensation by inverse Volterra series transfer function,” J. Lightwave Technol. **30**(3), 310–316 (2012).

**18.** L. Liu, L. Li, Y. Huang, K. Cui, Q. Xiong, F. N. Hauske, C. Xie, and Y. Cai, “Electronic nonlinearity compensation of 256 Gb/s PDM-16QAM based on inverse Volterra transfer function,” in ECOC’11 (2011).

**19.** L. B. Du and A. J. Lowery, “Improved nonlinearity precompensation for long-haul high-data-rate transmission using coherent optical OFDM,” Opt. Express **16**(24), 19920–19925 (2008).

**20.** F. P. Guiomar, J. D. Reis, A. L. Teixeira, and A. N. Pinto, “Digital postcompensation using Volterra series transfer function,” Photon. Technol. Lett. **23**(19), 1412–1414 (2011).

**21.** F. P. Guiomar, J. D. Reis, A. L. Teixeira, and A. N. Pinto, “Mitigation of intra-channel nonlinearities using a frequency-domain Volterra series equalizer,” Opt. Express **20**(2), 1360–1369 (2012).

**22.** R. Weidenfeld, M. Nazarathy, R. Noe, and I. Shpantzer, “Volterra nonlinear compensation of 100G coherent OFDM with baud-rate ADC, tolerable complexity and low intra-channel FWM/XPM error propagation,” in OFC/NFOEC - Conference on Optical Fiber Communication and the National Fiber Optic Engineers Conference (2010).

**23.** H.-M. Chin, M. Forzati, and J. Mårtensson, “Volterra based nonlinear compensation on 224 Gb/s PolMux-16QAM optical fibre link,” in OFC/NFOEC - Conference on Optical Fiber Communication and the National Fiber Optic Engineers Conference (2012).

**24.** Z. Pan, B. Châtelain, M. Chagnon, and D. V. Plant, “Volterra filtering for nonlinearity impairment mitigation in DP-16QAM and DP-QPSK fiber optic communication systems,” in OFC/NFOEC - Conference on Optical Fiber Communication and the National Fiber Optic Engineers Conference (2011).

**25.** V. J. Mathews and G. L. Sicuranza, *Polynomial Signal Processing* (Wiley-Interscience, 2000).

**26.** S. Kumar, *Impact of Nonlinearities on Fiber Optic Communications*, Ch. 3 (Springer, 2011).

**27.** M. Nazarathy, J. Khurgin, R. Weidenfeld, Y. Meiman, P. Cho, R. Noe, I. Shpantzer, and V. Karagodsky, “Phased-array cancellation of nonlinear FWM in coherent OFDM dispersive multi-span links,” Opt. Express **16**(20), 15777–15810 (2008).

**28.** H. W. Hatton and M. Nishimura, “Temperature dependence of chromatic dispersion in single mode fibers,” J. Lightwave Technol. **4**(10), 1552–1555 (1986).

**29.** G. Ishikawa and H. Ooi, “Demonstration of automatic dispersion equalization in 40 Gbit/s OTDM transmission,” in European Conference of Optical Communication (ECOC) (1998), pp. 519–520.

**30.** H. Onaka, K. Otsuka, H. Miyata, and T. Chikama, “Measuring the longitudinal distribution of four-wave mixing efficiency in dispersion-shifted fibers,” Photon. Technol. Lett. **6**(12), 1454–1456 (1994).

**31.** S. Haykin, *Adaptive Filter Theory* (Prentice Hall, 2002).

**32.** S. W. Nam, S. B. Kim, and E. J. Powers, “On the identification of a third-order Volterra nonlinear system using a frequency-domain block RLS adaptive algorithm,” in International Conference on Acoustics, Speech, and Signal Processing (ICASSP-90) (1990), pp. 2407–2410.