## Abstract

Optical implementations of reservoir computing systems are very promising because of their high processing speeds and the possibility to process several tasks in parallel. These systems can be implemented using semiconductor lasers subject to optical delayed feedback and optical injection. While the strength of the feedback/injection can be easily controlled, it is much more difficult to control the optical feedback/injection phase. We present extensive numerical investigations of the influence of the feedback/injection phases on laser-based reservoir computing systems with feedback. We show that a change in the phase can lead to a strong reduction in the reservoir computing system performance. We introduce a new readout layer design that, at least for some tasks, reduces this sensitivity to changes in the phase. It consists of optimizing the readout weights from a coherent combination of the reservoir’s readout signal and its delayed version rather than only from the reservoir’s readout signal, as is usually done.

© 2016 Optical Society of America

## 1. Introduction

To solve computationally hard tasks such as pattern recognition, chaotic time series prediction and classification, much effort is focused on developing efficient and computationally fast tools which can outperform traditional digital computers. Reservoir computing (RC) is one paradigm within the machine learning field that is being developed to reach this goal [1, 2]. Originally, it was based on recurrent neural networks, but now it can be fully implemented in optics using off-the-shelf components [3, 4]. In order to perform the tasks well, an RC system needs to be nonlinear and needs to operate in a high-dimensional phase space. As such, an input signal is expanded by the RC into a high-dimensional phase space in which a processing problem becomes much easier to solve. In classical recurrent neural networks, the nonlinearity is typically implemented through sigmoidal activation functions (e.g., a hyperbolic tangent function) while a high dimensionality is obtained using a large number of randomly interconnected neurons (these neurons are often also called nodes) [1,2,5,6]. These two properties (i.e., nonlinearity and high dimensionality) can also be obtained by using a single nonlinear neuron with delayed feedback [3, 4, 7–12]. The latter configuration is an effective approach to simplify the physical implementation. The variability of the input over the delay line is usually ensured in this configuration by time-multiplexing the input data stream with a mask consisting of a series of random numbers [13].

Since the first time-delay configuration introduced by Appeltant et al. [7], both optical and optoelectronic delay-feedback RCs have been demonstrated [3, 4, 7–12]. These systems are typically dedicated to high-speed information processing [4, 9, 14]. In particular, it has been shown that delay-based RC systems using a laser with optical delayed feedback can be used to process, in parallel, several independent machine learning tasks [11]. In addition, lasers with optical delayed feedback have been fully integrated on a single chip, providing manufacturability, power efficiency and compactness [15]. They are therefore potential candidates for future optical RC systems that are fully integrated on a single chip.

However, in many laser applications such as subwavelength position sensing or wavelength switching and selection [16], it has been found that the performance of laser-based systems with optical delayed feedback is sensitive to changes in the feedback phase. This sensitivity is typically a disadvantage as the phase is difficult to control in a real-world environment. In particular, under short delay feedback (i.e., delay times shorter than or comparable to the relaxation oscillation period of the laser), the response of the laser can be strongly influenced by the feedback phase and the injection detuning [17,18]. For example, a small change in an (on-chip) waveguide’s refractive index or in the operating optical wavelength can drastically modify the feedback phase or the injection detuning. Nonetheless, a short delay time is desirable for on-chip implementations of laser-based systems with delayed optical feedback as it consumes less wafer space. In optical delay-based RC systems, changes in the feedback phase modify the effectiveness of the feedback strength, which in turn can influence the memory capacity of the system, while the frequency detuning and the static phase of the Mach-Zehnder modulator (MZM) used for data injection influence the amplitude of the injection signal containing the data to be processed. For all these reasons, a comprehensive study of the effect of the different static phases (e.g., feedback phase, injection phase) is of great interest.

In this paper, we consider an RC system based on semiconductor ring lasers (SRLs) with optical feedback in which the data to be processed is optically injected using another laser [11]. We provide a systematic study of the influence of the different static phases and the injection detuning on the computational performance of the system. Our choice is motivated by the fact that SRLs are scalable and can easily be integrated with other photonic components (laser, short delay lines, couplers) on the same chip. They are therefore promising candidates for compact and integrated on-chip delay-based RC systems. We will show that changes in the feedback/injection phase and in the frequency detuning between SRL and injection laser lead to a strong reduction of the system performance. We will also discuss how this degradation can be lowered by carefully designing the readout layer of the RC system. The rest of the paper is structured as follows. We briefly describe the basic operation of RC systems in Sec. 2. In Sec. 3, we describe the optical RC system studied in this paper. In Sec. 4, we provide details about the dynamical regimes of the system in the absence of the input data. Benchmark tasks used in this study are briefly presented in Sec. 5. In Sec. 6, we address the system performance as a function of the feedback phase and in Sec. 7 we introduce a new readout layer design intended to reduce the negative effects of phase fluctuations. Secs. 8 and 9 deal with the influence of the injection detuning and the static Mach-Zehnder modulator phase, respectively. Final remarks are given in Sec. 10.

## 2. Reservoir computing systems

A typical RC system is composed of three main parts: an input layer, a reservoir and an output layer (see Fig. 1). The input layer consists of components which provide the data stream to be fed into the reservoir, while the output layer (also called readout layer) is composed of components which weight and sum the states of the different nodes of the reservoir. In classical RC (Fig. 1(a)), the reservoir consists of a large number (10^{2} to 10^{3}) of randomly connected nodes or neurons. Only the output connections are adjusted during the so-called training phase. This training consists of feeding a certain amount of input data into the system and determining the optimal weight values for which the weighted sum of all the neuron responses approaches the associated target as closely as possible. This is achieved by minimizing the deviation between the system’s output after summation and the target value. In the ideal case, the deviation is zero, meaning that the signal obtained after the summation is identical to the desired target.
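
The training step described above can be sketched as a regularized least-squares fit of the readout weights. This is only an illustrative sketch: the use of ridge regression, the regularization strength `ridge`, the added bias column and the function names are assumptions made here, not details taken from the paper.

```python
import numpy as np

def train_readout(states, targets, ridge=1e-6):
    """Fit linear readout weights by ridge-regularized least squares.

    states  : (n_samples, n_nodes) array of node responses
    targets : (n_samples,) array of desired outputs
    ridge   : hypothetical regularization strength (not from the paper)
    """
    # Append a constant column so the readout can also learn an offset.
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    # Solve (X^T X + ridge * I) w = X^T y for the weight vector w.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def apply_readout(states, weights):
    """Weighted sum of the node states (plus the learned offset)."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    return X @ weights

# Toy check: synthetic states whose target is an exact linear combination.
rng = np.random.default_rng(0)
demo_states = rng.normal(size=(300, 5))
true_w = np.array([0.5, -1.0, 2.0, 0.0, 0.3])
demo_targets = demo_states @ true_w + 0.1
w = train_readout(demo_states, demo_targets)
demo_error = np.max(np.abs(apply_readout(demo_states, w) - demo_targets))
```

In this toy case the deviation between output and target is essentially zero, which corresponds to the ideal case mentioned above; with real reservoir states the residual deviation is what the error measures quantify.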

In delay-based RC (Fig. 1(b)), all the neurons in the reservoir are replaced by a single nonlinear node (which is a laser in the optical RC system studied in this paper). Because of the delayed feedback, the output of the node follows a path in a high-dimensional phase space. For the processing, the input data is fed into the nonlinear node and the delay line is sampled at equidistant time intervals to construct virtual nodes (instead of real neurons) from which the signal is read out. The time interval between two virtual nodes is called *θ*. To ensure variability over the different virtual nodes, each input sample is convolved with a mask, i.e., a random sequence that is constant over each time interval *θ* and periodic with a period equal to the delay time *T*.
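
The masking step can be illustrated with a short sketch. It assumes the common sample-and-hold interpretation, in which each input value is held for one delay time *T* and multiplied by a mask value that changes every *θ*; the four mask levels and the 200 virtual nodes are the values quoted in Sec. 3, while the function name and the toy input are hypothetical.

```python
import numpy as np

def mask_input(samples, mask):
    """Time-multiplex each input sample over the virtual nodes.

    Returns an array of shape (len(samples), len(mask)): row n holds the
    values injected during the n-th delay period T, one value per
    interval theta.
    """
    samples = np.asarray(samples, dtype=float)
    mask = np.asarray(mask, dtype=float)
    return samples[:, None] * mask[None, :]

rng = np.random.default_rng(1)
n_nodes = 200  # number of virtual nodes per delay time T
# Random mask with four equiprobable levels, reused for every input sample.
mask = rng.choice([-1.0, -0.25, 0.25, 1.0], size=n_nodes)
masked = mask_input([0.3, -0.7, 1.0], mask)  # toy input samples
drive = masked.ravel()  # piecewise-constant drive, one value per theta
```

Because the same mask row is reused for every sample, the drive signal is periodic in the mask with period *T*, as stated above.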

In this paper, our nonlinear node is an SRL lasing in two directional modes, i.e., one propagating in the clockwise (CW) direction and one propagating in the counterclockwise (CCW) direction [9, 11]. Each directional mode is subject to self-feedback. As such, independent inputs and outputs can be constructed using the directional modes. In the next section, we provide details about the modelling of such a laser-based optical RC system.

## 3. System description

The main dynamical features of a laser subject to moderate optical delayed feedback can be recovered using the Lang-Kobayashi equations [19]. In the specific case of an RC system based on a laser subject to optical delayed feedback, the input data is usually injected optically into the optical mode of the laser in which it has to be processed [4, 9, 14]. For this reason, the usual Lang-Kobayashi model has to be modified to include the injection terms [9,11,14]. Here, we consider a single-longitudinal-mode SRL subject to optical self-feedback. SRLs support lasing in two directional modes, i.e., one propagating in the clockwise (CW) direction and one propagating in the counterclockwise (CCW) direction [9, 11]. To implement the double self-feedback configuration, a part of the output signal of each directional mode is injected back into the same directional mode. The two optical modes can be used to process two independent tasks in parallel [11]. The scheme of such an RC system is illustrated in Fig. 2. It has the following structure. A laser beam with constant power generated by a continuous-wave semiconductor laser (SL) is equally split into two parts. Each part is injected into a Mach-Zehnder modulator (MZM) which has a radio-frequency (RF) electrode and a bias electrode. The data is injected via the RF electrode and the output of the MZM (i.e., ${\mathcal{E}}_{1}(t)$ or ${\mathcal{E}}_{2}(t)$) is coupled to the specific directional mode of the SRL in which it has to be processed. It should be noted that injection into one optical mode is sufficient for single-task processing. Here we consider optical injection into each directional mode because we want to use the same system throughout the paper both for single-task processing and for simultaneous processing of two tasks. The power in the CW and CCW directions is sampled at time intervals *θ* in order to construct the (virtual) nodes in the readout layer of Fig. 2.

In terms of the mean-field slowly varying complex amplitudes of the electric field associated with the two counter-propagating modes, *E*_{cw} and *E*_{ccw}, and the carrier number $\mathcal{N}$, the model of concern, Eqs. (1)–(3), follows [11]. Its parameters are the linewidth enhancement factor *α*, renormalized bias current *μ*, field decay rate *κ*, carrier inversion decay rate *γ*, feedback strengths *η*_{cw} and *η*_{ccw}, solitary laser frequency *ω*_{0}, delay time *T*, and the backscattering coefficients *k*_{d} + *ik*_{c}, where *k*_{c} and *k*_{d} are the conservative and the dissipative couplings, respectively. The differential gain functions are given by ${\mathcal{G}}_{cw}=1-s{\left|{E}_{cw}\right|}^{2}-c{\left|{E}_{ccw}\right|}^{2}$ and ${\mathcal{G}}_{ccw}=1-s{\left|{E}_{ccw}\right|}^{2}-c{\left|{E}_{cw}\right|}^{2}$, where *s* and *c* account for the phenomenological self- and cross-saturations, respectively. The term *ω*_{0}*T* is the static feedback phase which results from the time delay. It depends on the operating wavelength through *ω*_{0} = 2*πc/λ* (*c* here being the velocity of light in vacuum) and on the delay time *T*. The fourth term on the right-hand side of Eqs. (1) and (2) represents the effect of spontaneous emission noise coupled to the CW/CCW modes. The noise amplitude is ${D}_{cw,ccw}={D}_{m}\left(\mathcal{N}+{G}_{0}{\mathcal{N}}_{0}/\kappa \right)$, where *D*_{m} is the noise strength and *ξ*_{i}(*t*) (*i* = *cw, ccw*) are two independent complex Gaussian white noises with zero mean and correlation $\langle {\xi}_{i}\left(t\right){\xi}_{j}^{*}\left({t}^{\prime}\right)\rangle ={\delta}_{ij}\delta \left(t-{t}^{\prime}\right)$. The last terms in Eqs. (1) and (2) are the injected fields containing the data of the different tasks to be processed in the two modes (e.g., CW for *task* 1 and CCW for *task* 2), *k*_{1,2} being the injection strengths. The injected fields are given by Eq. (4), in which Φ_{1,2} are the MZMs’ static phases and Δ*ω*_{1,2} are the frequency detunings between *E*_{cw,ccw} and ${\mathcal{E}}_{1,2}$. In Eqs. (1)–(4), the time factor convention exp(+*iω*_{0}*t*) has been used. These frequency detunings arise when the laser (SL) used for optical data injection operates at a different wavelength than that of the free-running SRL. *S*_{1}(*t*) and *S*_{2}(*t*) are the original input data convolved with a random mask. We will use the same mask for the two tasks. This mask has 4 discrete values (−1, −0.25, 0.25, 1) generated randomly with equal probability [20]. More details on how to construct the masks can be found in [7,13,20].

As the optical injection of the input signals into the two SRL modes is achieved using the same laser (SL), and since the CW and CCW modes emit at the same wavelength, Δ*ω*_{1} = Δ*ω*_{2} = Δ*ω*. We use the parameter values [11, 21, 22]: *α* = 3.5, *s* = 0.005, *c* = 0.01, *κ* = 100 ns^{−1}, *γ* = 0.2 ns^{−1}, *k*_{d} = 0.033 ns^{−1}, *η*_{cw} = *η*_{ccw} = 10 ns^{−1}, *k*_{c} = 0.44 ns^{−1}, *μ* = 1.2, *k*_{1} = *k*_{2} = 9 ns^{−1}, *N* = 200 nodes, *θ* = 20 ps, $\left|{\mathcal{E}}_{0}\right|=2$. With *θ* = 20 ps, the overall delay loop is *T* = *Nθ* = 4 ns. This delay time is comparable to the relaxation oscillation period of the free-running laser, which is ${\tau}_{R0}\approx 2\pi /\sqrt{2\left(\mu -1\right)\gamma \kappa}\approx 3$ ns. The noise parameters are: *D*_{m} = 5 × 10^{−6} ns^{−1}, *G*_{0} = 10^{−12} m^{3}s^{−1} and ${\mathcal{N}}_{0}=1.4\times {10}^{24}\phantom{\rule{0.2em}{0ex}}{\mathrm{m}}^{-3}$. Other parameters (i.e., *ω*_{0}*T*, Δ*ω* and Φ_{1,2}) are stated in the figure captions. We next investigate the influence of the different static phases on the nonlinearity and the memory capacity, which are necessary properties for an efficient RC system.

## 4. Dynamical characterization in the absence of input data

Empirical evidence has shown that the optimal performance of an RC system is usually obtained when the system operates in its stable regime, i.e., the output intensity of the laser is constant in time when no input data is fed into the reservoir [3,6,7]. In laser-based systems with feedback, these stable regimes are obtained by appropriately adjusting the feedback/injection strengths or the pump current of the laser. From a bifurcation study of Eqs. (1)–(3), we know that the system of Fig. 2 operates in a stable regime in the absence of input data [i.e., *S*_{1,2}(*t*) = 0] for *μ ≤* 1.55, *ω*_{0}*T* = *π/*2, Δ*ω* = 0 and Φ_{1} = Φ_{2} = 0. Here we will use *μ* = 1.2 as this value led to the best performance [11]. It should be noted that the SRL output is unstable (i.e., not constant in time) in the absence of the external injection (*k*_{1,2} = 0) for our parameters, but the output can be stabilized by optical injection as pointed out in [9, 23, 24]. However, this stabilization depends on the effective feedback strength (i.e., the real part of the feedback term in Eqs. (1) and (2)) and the injection strength. From Eq. (4), it can be seen that the signal injected into the SRL depends on the frequency detuning Δ*ω* and the MZM static phase Φ_{1,2}. In this section, we want to further explore the features of the stable regime when the static phases or frequency detuning are varied. To this end, we show in Fig. 3 the orbit diagrams highlighting the local maxima and minima in the numerically simulated time series at the output of the laser for different values of the static feedback phase *ω*_{0}*T* (Fig. 3(a)), frequency detuning Δ*ω* (Fig. 3(b)), and MZM static phase Φ_{1} = Φ_{2} (Fig. 3(c)). Fig. 3(a) shows that the SRL’s output is periodic for −0.8*π* ≲ *ω*_{0}*T* ≲ −0.25*π*. These oscillations are much faster than the relaxation oscillations. Their period is rather close to the period at which the phase dynamics relaxes [24]. The output is stable for all other values of *ω*_{0}*T*. In the stable regime, the output power is a nonlinear function of *ω*_{0}*T* with a maximum at *ω*_{0}*T* ≈ 0.4*π*. If we look at the effect of Δ*ω* in Fig. 3(b), we observe several regions of unstable oscillatory behavior (around −2.17 rad/ns, −0.6 rad/ns, 0.9 rad/ns and 2.47 rad/ns) which correspond to unlocking to consecutive external cavity modes. In the stable regime, the maxima of the output power are found at *≈* (0.1 ± *π*/2) rad/ns. In Fig. 3(c), we display the orbit diagram showing the local extrema of the SRL output signal as a function of Φ_{1,2}. For Φ_{1,2} ∈ [−*π*, −*π*/2] ∪ [*π/*2, *π*], the amplitude of the MZM output signal is small so that the injection is unable to stabilize the SRL. The reservoir’s output therefore remains unstable. For Φ_{1,2} ∈ [−*π*/2, *π*/2], the SRL delivers a stable response with a maximum at Φ_{1,2} *≈* 0.
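
As an illustration of how the points of such an orbit diagram can be extracted from a simulated intensity trace, the following sketch collects the local maxima and minima of a time series. The function name and the synthetic test signals are hypothetical; in practice the input would be the simulated laser output for one value of the scanned parameter.

```python
import numpy as np

def local_extrema(x):
    """Return the values of the strict local maxima and minima of a 1-D series."""
    x = np.asarray(x, dtype=float)
    mid = x[1:-1]
    is_max = (mid > x[:-2]) & (mid > x[2:])  # larger than both neighbours
    is_min = (mid < x[:-2]) & (mid < x[2:])  # smaller than both neighbours
    return mid[is_max], mid[is_min]

# Periodic output: ten oscillation periods give ten maxima and ten minima.
t = np.linspace(0.0, 10.0, 2001)
maxima, minima = local_extrema(1.0 + 0.2 * np.sin(2 * np.pi * t))

# Stable (constant) output: no extrema are found at all.
flat_max, flat_min = local_extrema(np.ones(100))
```

A stable regime thus shows up as an absence of extrema (or a single line if the steady-state value is plotted instead), whereas a periodic regime produces two distinct branches.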

## 5. Benchmark tasks

To investigate how the changes in the static phases and detuning can influence the RC performance, we will evaluate the system performance on two typical machine learning tasks: chaotic time series prediction and nonlinear channel equalization (NCE).

For chaotic time series prediction, we will use as input data a chaotic intensity time series experimentally recorded from a far-infrared laser operating in a chaotic state. This data, referred to as the Santa-Fe time series, is available online [25] and has been widely used in machine learning [1–4, 7–12]. The goal for this task (which will be referred to as the Santa-Fe task) is to predict the next sample in a chaotic time trace before it has been injected into the reservoir computer (one-step-ahead prediction). The system performance on this task is evaluated by calculating the normalized mean square error (NMSE) between the predicted value *y* and the expected value *y*_{target}:

$$\mathrm{NMSE}=\frac{\langle {\Vert y(n)-{y}_{target}(n)\Vert}^{2}\rangle}{\langle {\Vert {y}_{target}(n)-\langle {y}_{target}(n)\rangle \Vert}^{2}\rangle} \qquad (5)$$

where *n* is a discrete time index while ‖.‖ and 〈.〉 stand for the norm and the average. Note that NMSE = 0 means perfect prediction while NMSE = 1 indicates no prediction at all.
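
A minimal sketch of this error measure for scalar outputs is given below, assuming the usual definition in which the mean squared deviation is normalized by the variance of the target. The function name and the toy data are illustrative only.

```python
import numpy as np

def nmse(y, y_target):
    """Normalized mean square error between prediction and target."""
    y = np.asarray(y, dtype=float)
    y_target = np.asarray(y_target, dtype=float)
    num = np.mean((y - y_target) ** 2)                  # mean squared deviation
    den = np.mean((y_target - np.mean(y_target)) ** 2)  # variance of the target
    return num / den

target = np.array([1.0, 2.0, 3.0, 4.0])
perfect = nmse(target, target)                     # prediction equals target
trivial = nmse(np.full(4, target.mean()), target)  # always predict the mean
```

The two toy cases reproduce the limits quoted above: a perfect prediction gives 0, while a predictor that only outputs the mean of the target gives 1.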

The NCE task models the transmission of symbols from one party to another over a distorting channel. The original signal is a random sequence *d*(*n*) with values from {−3, −1, +1, +3}. Then, 10 consecutive symbols are mixed linearly to produce the channel output *Q*(*n*), Eq. (6).

Finally, the second-order and the third-order nonlinear distortions, together with Gaussian white noise, are taken into account by applying a nonlinear transformation to *Q*(*n*) so that the final input signal to the reservoir, *S*(*n*), is given by Eq. (7), in which *ξ*_{e}(*t*) is an independent Gaussian white noise with zero mean and correlation $\langle {\xi}_{e}\left(t\right){\xi}_{e}^{*}\left({t}^{\prime}\right)\rangle ={D}_{e}\delta \left(t-{t}^{\prime}\right)$. *D*_{e} is typically chosen such that the signal-to-noise ratio (SNR) lies between 12 and 32 dB. Here, we choose *D*_{e} such that the SNR is 30 dB. The goal of the NCE task is to reconstruct *d*(*n − j*) when *S*(*n*) is presented as input data. This means a reconstruction of the original symbols with *j* delay steps. In [8, 10], *d*(*n*) has been reconstructed starting from *S*(*n*). But similarly to [2,11,26], we will compute *d*(*n* − 2) starting from *S*(*n*). This choice is motivated by the fact that the classification performance of the system can be better appreciated by reconstructing *d*(*n* − 2) rather than *d*(*n*), since *d*(*n* − 2) is more difficult to compute than *d*(*n*) when the input is *S*(*n*). The system performance on this task will be quantified by the symbol error rate (SER), which is the fraction of *d*(*n* − 2) values that is misclassified, Eq. (8). Here, perfect classification corresponds to SER = 0.
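
The data generation and the error measure for this task can be sketched as follows. The channel coefficients used below are those of the widely used nonlinear channel equalization benchmark and are an assumption here, since they are not quoted in this text; the real-valued noise, the SNR convention (signal power over noise power) and all function names are likewise illustrative choices.

```python
import numpy as np

# Linear taps applied to d(n+2), d(n+1), ..., d(n-7): assumed benchmark values.
TAPS = np.array([0.08, -0.12, 1.0, 0.18, -0.1, 0.091, -0.05, 0.04, 0.03, 0.01])

def nce_channel(d, snr_db, rng):
    """Mix 10 consecutive symbols, distort nonlinearly and add white noise."""
    d = np.asarray(d, dtype=float)
    n = np.arange(7, len(d) - 2)  # indices for which all 10 symbols exist
    q = sum(TAPS[k] * d[n + 2 - k] for k in range(10))  # linear mixing
    u = q + 0.036 * q**2 - 0.011 * q**3  # 2nd- and 3rd-order distortion (assumed)
    noise_power = np.mean(u**2) / 10 ** (snr_db / 10)
    s = u + rng.normal(scale=np.sqrt(noise_power), size=u.shape)
    return n, s

def symbol_error_rate(y, d_target):
    """Fraction of outputs whose nearest symbol level differs from the target."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    y = np.asarray(y, dtype=float)
    decided = levels[np.argmin(np.abs(y[:, None] - levels[None, :]), axis=1)]
    return np.mean(decided != np.asarray(d_target))

rng = np.random.default_rng(2)
d = rng.choice([-3.0, -1.0, 1.0, 3.0], size=1000)
idx, s = nce_channel(d, snr_db=30.0, rng=rng)
target = d[idx - 2]  # the symbols d(n - 2) to be reconstructed from S(n)
ser_perfect = symbol_error_rate(target + 0.2, target)  # small offsets: no errors
ser_worst = symbol_error_rate(-target, target)         # sign-flipped output
```

Here `s` plays the role of the reservoir input *S*(*n*) before masking and rescaling, and `target` is what the trained readout should reproduce.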

Besides these two tasks, we will also evaluate the memory capacity (MC) and how it depends on the static phases and detuning. To this end, we drive the reservoir of the RC system with a uniformly distributed random signal *U*, and the desired targets are the delayed versions of the input signal, i.e., *U*(*i − j*). Here, we use for *U*(*i*) real numbers drawn from a uniform distribution in the interval [−0.5, 0.5]. We consider an input scaling factor such that *−π/*4 ≤ *S*_{1}(*t*) ≤ *π/*4 (we consider *S*_{2}(*t*) = 0). After training the RC system for a particular value of the delay *j*, we then provide previously unseen data *V*(*i*) to the RC system with *i* = 1, 2, … The values at the RC system output are labeled as *y*_{j}(*i*), calculated for different values of *i*. The MC is calculated as the sum over the delays *j* of the normalized correlation between the estimated values *y*_{j}(*i*) and the desired targets *V*(*i − j*), i.e.,

$$\mathrm{MC}=\sum_{j=1}^{\infty} m(j) \qquad (9)$$

*m*(*j*) being the memory function, defined in Eq. (10) as this normalized correlation for delay *j*. *m*(*j*) = 1 means that the RC system perfectly retains information about the input data of *j* time-steps ago.
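
The memory capacity evaluation can be sketched as below. The sketch assumes that the normalized correlation is the squared Pearson correlation coefficient between output and delayed input, which is the common convention but is an assumption here; the truncation of the sum at 10 delays follows the value used later in Sec. 6, and the function names and the idealized test outputs are hypothetical.

```python
import numpy as np

def memory_function(y_j, v, j):
    """Squared correlation between the output y_j(i) and the input delayed by j."""
    y_j = np.asarray(y_j, dtype=float)
    v = np.asarray(v, dtype=float)
    target = v[:-j]   # V(i - j)
    output = y_j[j:]  # y_j(i), restricted to the i for which V(i - j) exists
    r = np.corrcoef(output, target)[0, 1]
    return r**2

def memory_capacity(outputs, v, max_delay=10):
    """Sum of m(j) for j = 1 .. max_delay; outputs[j] is trained for delay j."""
    return sum(memory_function(outputs[j], v, j) for j in range(1, max_delay + 1))

rng = np.random.default_rng(3)
v = rng.uniform(-0.5, 0.5, size=2000)  # unseen test input V(i)
# Idealized readouts that reproduce the delayed input exactly for every j.
ideal = {j: np.roll(v, j) for j in range(1, 11)}
mc_ideal = memory_capacity(ideal, v)
```

With these idealized outputs every *m*(*j*) equals 1, so the truncated sum reaches its maximum value of 10; a real reservoir gives smaller values that decay with *j*.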

We will use these benchmark tasks to process either single tasks using the CW SRL mode [i.e., *S*_{1}(*n*) ≠ 0 and *S*_{2}(*n*) = 0] or two independent tasks in parallel using the CW and the CCW SRL modes [i.e., *S*_{1}(*n*) ≠ 0 and *S*_{2}(*n*) ≠ 0]. For parallel computation, the processing in the CW mode will be referred to as “task 1” while “task 2” is processed in the CCW mode. The input signals (including the mask) *S*_{1}(*n*) and *S*_{2}(*n*) are rescaled in each simulation so that −*π/*2 ≤ *S*_{1,2}(*t*) ≤ *π/*2. For parallel computation of two tasks, we consider the two following cases: i) a simultaneous prediction of two different Santa-Fe time series; ii) a simultaneous computation of two independent NCE tasks. For the processing of each task, we use 3000 data points for training and 1000 other data points for testing. Each NMSE or SER value is evaluated 5 times with different randomly generated masks, and the results shown are the mean values over these 5 runs. In each run of a parallel computation, an identical random mask is used for the two tasks. For clarity, we do not depict the error bars showing the standard deviation.

## 6. Influence of the feedback phase *ω*_{0}*T* on the RC performance

As shown in Sec. 4, the output power depends on the static feedback phase *ω*_{0}*T*. In order to investigate how the feedback phase influences the prediction performance of SRL-based RCs, we show in Fig. 4(a) the prediction errors for the Santa-Fe prediction task as a function of *ω*_{0}*T*. The results show that there is a clear effect of the feedback phase on the RC prediction performance both for a single Santa-Fe time series and for simultaneous prediction of two different Santa-Fe time series. By way of illustration, the NMSE obtained for a single Santa-Fe prediction task (magenta) for *ω*_{0}*T* ∈ [0.75*π*, *π*] is more than twice as large as the NMSE obtained for *ω*_{0}*T* ∈ [0.25*π*, 0.5*π*], although the SRL output is stable for the values of *ω*_{0}*T* belonging to both intervals. Note that the highest errors (i.e., worst performance) are obtained when *ω*_{0}*T* ∈ [−0.75*π*, −0.5*π*], for which the SRL is unstable as was shown in Fig. 3(a). This strong dependence of the NMSE is also observed for two simultaneous predictions (red and blue curves in Fig. 4(a)) of Santa-Fe time series. The performance when processing two tasks is always worse than when processing a single task. This is because the two directional modes that process the independent tasks are coupled in the cavity of the SRL. These unwanted couplings are due to the saturation effects and backscattering.

In Fig. 4(b), we also evaluate the effect of the feedback phase on the NCE task. It can be seen that the effect of the feedback phase for this task is even more pronounced. Even for a single NCE task (magenta), only feedback phase values close to 0.6*π* lead to an RC performance that is comparable to results reported in the literature (i.e., a SER of the order of 10^{−4}) [8]. For example, the SER is *≈* 6 *×* 10^{−2} for *ω*_{0}*T* = 0 while it is 6 *×* 10^{−4} for *ω*_{0}*T ≈* 0.6*π*. This performance is significantly degraded when simultaneously computing two NCE tasks, although the lowest SER values are still obtained for the same *ω*_{0}*T* values [see Fig. 4(b)]. Note from Fig. 4 that the lowest NMSE and SER values (i.e., best performance) are not obtained for the same value of *ω*_{0}*T*.

In order to understand the degradation of the performance for many values of *ω*_{0}*T*, we have calculated the memory capacity of the system for different values of *ω*_{0}*T*. Figure 5(a) shows the results when *j* in Eq. (9) is truncated at 10. It turns out that the capacity of the system to retain past information strongly depends on *ω*_{0}*T*. For example, MC *≈* 8.75 at *ω*_{0}*T* = 0.75*π* while MC *≈* 5 for *ω*_{0}*T ≈* 0.1*π*, which means that the ability of the RC to partially or totally recall past information is larger for *ω*_{0}*T* = 0.75*π* than for *ω*_{0}*T ≈* 0.1*π*. Note that the values of *ω*_{0}*T* for which the MC in Fig. 5(a) is largest correspond to the region of best performance in Fig. 4(b). For many values of *ω*_{0}*T*, we have found that, at any given time-step, the RC system is only able to perfectly remember the previous input data up to 1 past time-step (i.e., *j* = 1 in Fig. 5(a)). Clearly, this memory is insufficient to reconstruct *d*(*n −* 2) starting from *S*(*n*). By way of illustration, Fig. 5(b) shows the memory function for *ω*_{0}*T* = 0.6*π* and *ω*_{0}*T* = 0.9*π*. Although both values lead to a stable reservoir output power, it is seen for *ω*_{0}*T* = 0.9*π* that the memory function *m*(*j*) starts to decrease when *j* > 1 and the RC system can only remember 77% of the information injected 4 time-steps ago. For *ω*_{0}*T* = 0.6*π*, *m*(*j*) only starts to decrease for *j* > 2 and the RC system can still remember 95% of the information injected 4 time-steps ago. For all values of *ω*_{0}*T* where the memory function *m*(*j* = 4) is far from 1, a high SER value has been obtained. This confirms the importance of the memory for the computation of the NCE task. It should be noted, however, that the memory is not the only property required for a successful computation. This is why, for some tasks such as the Santa-Fe prediction task, the best performance is not found for the highest MC value.

## 7. Reducing phase sensitivity through the reservoir output design

In order to reduce the strong influence of the phase on our RC performance, we introduce in this section a different approach to optimize the readout weights. More concretely, we consider that the readout layer of our RC system is modified so that the readout signal is a coherent combination of the reservoir’s readout signal and its delayed version (new procedure) instead of the reservoir’s readout signal only (standard procedure). In other words, the readout weights are optimized from |*E*_{1,2}(*t*) + *E*_{1,2}(*t − T*_{d})|^{2}/2 with *T*_{d} ≠ 0 rather than from |*E*_{1,2}(*t*)|^{2}, *T*_{d} being the extra delay which is implemented at the readout layer only (and therefore does not affect the reservoir’s internal dynamics). Note that even in this configuration, the number of nodes in each readout layer is the same and equal to *N*. We anticipate that, by mixing the delayed reservoir output in the readout, this new procedure results in: (i) increasing the length of the memory; and (ii) reducing the number *j* of past input data required to reconstruct *d*(*n* − 2) starting from *S*(*n*). Throughout this manuscript, we will consider *T*_{d} = 1.6*T*. Figure 6(a) shows the memory capacity as a function of the feedback phase for *T*_{d} = 0 and *T*_{d} = 1.6*T*. It turns out that the memory capacity increases for all values of *ω*_{0}*T*. Fig. 6(b) shows the memory function *m*(*j*) for different values of *T*_{d} considering *ω*_{0}*T* = 0.6*π*. It is clear from Fig. 6(b) that the number of past time-steps that are perfectly remembered increases with increasing *T*_{d}. Roughly speaking, the increase in the number of perfectly remembered time-steps is equal to the integer part of *T*_{d}/*T*. However, it should be noted that an original memory capacity can only be improved to approximately its double. If *T*_{d}/*T* is larger than the largest *j* for which *m*(*j*) = 1 when *T*_{d} = 0, a dip will appear in *m*(*j*) at approximately this *j*.
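
The modified readout signal can be illustrated with a short sketch that combines sampled complex field values with a copy delayed by *T*_{d}. It assumes that the field is sampled once per interval *θ*, so that *T*_{d} = 1.6*T* corresponds to a shift of 1.6*N* = 320 samples for *N* = 200; the function name and the random test field are hypothetical and do not come from a laser simulation.

```python
import numpy as np

def delayed_readout(field, delay_samples):
    """Readout intensity |E(t) + E(t - T_d)|^2 / 2 from complex field samples."""
    field = np.asarray(field, dtype=complex)
    if delay_samples == 0:
        delayed = field
    else:
        delayed = field[:-delay_samples]  # E(t - T_d)
        field = field[delay_samples:]     # E(t), where the delayed copy exists
    return np.abs(field + delayed) ** 2 / 2

n_nodes = 200                         # N, virtual nodes per delay time T
delay_samples = round(1.6 * n_nodes)  # T_d = 1.6 T expressed in units of theta
rng = np.random.default_rng(4)
field = rng.normal(size=2000) + 1j * rng.normal(size=2000)  # stand-in for E(t)
intensity = delayed_readout(field, delay_samples)
```

Because the two fields are added before the modulus is taken, the resulting intensity contains an interference term that depends on the relative phase of *E*(*t*) and *E*(*t* − *T*_{d}), which is what distinguishes this coherent combination from simply adding two intensities.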

Next we explicitly show how this new design of the readout layer leads to an improved RC performance. We first look at the NCE task as it needs memory of many time steps ago. The SER values (averaged over 5 runs) for *T*_{d} = 1.6*T* are depicted in Fig. 7 together with those obtained with the traditional readout layer design (i.e., *T*_{d} = 0). For the processing of a single NCE task, it can be seen that the optimum performance is now reached for all values of the feedback phase *ω*_{0}*T*. For the simultaneous processing of two tasks, the overall performance is improved by one order of magnitude and the performance is only weakly dependent on the feedback phase. Even for the values of *ω*_{0}*T* leading to an unstable reservoir output in Fig. 3, a significant improvement of the RC performance is found. This confirms that the readout layer design discussed in this paper (i.e., *T*_{d} ≠ 0) makes the RC performance more robust against fluctuations in the feedback phase. We have also investigated the effect of the new readout layer design on the RC performance for the Santa-Fe prediction task. Figure 8 displays the NMSE values for *T*_{d} = 0 and *T*_{d} = 1.6*T*. It can be seen that a limited improvement is found for all feedback phase values both for single Santa-Fe time series prediction (see Fig. 8(a)) and for two simultaneous predictions (see Fig. 8(b)). We conclude from Fig. 8 that the increase of the memory capacity because of the new readout layer design is not beneficial for the Santa-Fe prediction task.

## 8. Influence of the injection detuning Δ*ω* on the RC performance

Figure 3(b) has shown that the reservoir output power also depends on the injection detuning. Therefore, we investigate here in detail the consequences of changes in the detuning on the RC performance. We will only consider Δ*ω* ∈ [*−π/*2, *π/*2] rad/ns since Fig. 3(b) also indicates that a similar behavior is obtained after a period of *π/*2 with respect to Δ*ω*. Figure 9 shows the evolution of the NMSE in the window Δ*ω* ∈ [*−π/*2, *π/*2] rad/ns (corresponding to a frequency detuning −250 MHz ≤ Δ*f* ≤ 250 MHz) for the single Santa-Fe time series prediction (a) and for the simultaneous prediction of two independent Santa-Fe time series (b). It turns out that the highest NMSE values (worst performance) are obtained for the Δ*ω* values which lead to unstable behavior as shown in Fig. 3(b). Similar to the results discussed in the previous section, it is further confirmed that the calculation of the weights using *T*_{d} = 1.6*T* does not significantly improve the NMSE.
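
The correspondence between the angular detuning window and the frequency window quoted above follows from Δ*f* = Δ*ω*/(2*π*), with 1 rad/ns corresponding to 1/(2*π*) GHz. A small sketch of this unit conversion (the function name is hypothetical):

```python
import math

def detuning_to_mhz(delta_omega_rad_per_ns):
    """Convert an angular detuning in rad/ns to a frequency detuning in MHz."""
    # rad/ns -> GHz by dividing by 2*pi, then GHz -> MHz.
    return delta_omega_rad_per_ns / (2 * math.pi) * 1e3

edge = detuning_to_mhz(math.pi / 2)  # upper edge of the window, 250 MHz
```

The lower edge, −*π*/2 rad/ns, maps to −250 MHz in the same way.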

For the NCE task, Fig. 10 depicts the SER values as a function of Δ*ω* both for *T*_{d} = 0 (magenta, red, blue) and for *T*_{d} = 1.6*T* (black, cyan). For *T*_{d} = 0, low SER values are obtained only around the local maxima located around (0 ± *π*/2) rad/ns in Fig. 3(b). The computation of the memory function (not shown) has shown that these values correspond to those for which the system successfully recalls information of up to four time steps in the past. Elsewhere, the system performance for this task is worse. This means that the best performance is obtained when the frequency of the laser used to inject the data into the SRL is tuned and locked to the frequency of the SRL. It is also seen that low and high classification errors (SER) alternate several times. This means that a slight change in the operating wavelength of the SRL or that of the injection SL can lead to a strong degradation of the performance for this task. Fortunately, the optimization of the readout weights considering *T*_{d} = 1.6*T* leads to a significant improvement of the performance. Remarkably, this improvement is also observed for the simultaneous processing of two different NCE tasks. While we have plotted the results only for Δ*ω* ∈ [−*π*/2, *π*/2] rad/ns for clarity, it is to be noted that similar SER values are found for any Δ*ω* within the interval [−2*π*, 2*π*] rad/ns that we have explored.
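To make the error measure used in this discussion concrete, the sketch below computes a symbol error rate by mapping each readout output to the nearest symbol and counting mismatches with the transmitted symbols. The four-level alphabet {−3, −1, 1, 3} is the one commonly used for this benchmark and is an assumption here, as are the function names.

```python
def nearest_symbol(value, alphabet=(-3, -1, 1, 3)):
    """Return the alphabet symbol closest to a real-valued readout output."""
    return min(alphabet, key=lambda s: abs(value - s))


def symbol_error_rate(targets, readout_outputs, alphabet=(-3, -1, 1, 3)):
    """Fraction of readout outputs whose nearest symbol differs from the target."""
    if len(targets) != len(readout_outputs):
        raise ValueError("targets and readout_outputs must have the same length")
    errors = sum(
        1
        for d, y in zip(targets, readout_outputs)
        if nearest_symbol(y, alphabet) != d
    )
    return errors / len(targets)


if __name__ == "__main__":
    sent = [-3, -1, 1, 3]
    received = [-2.6, -0.2, 0.1, 1.4]  # the last output is decided as 1, not 3
    print(symbol_error_rate(sent, received))  # one error out of four symbols
```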

## 9. Influence of the static phase Φ_{1,2}

The original data to be processed in each directional mode is injected via an MZM. As such, it can undergo a linear/nonlinear transformation before being injected into the SRL. This transformation depends on the MZM static phase Φ_{1,2}. Figure 11 shows the evolution of the prediction errors (a) [classification errors (b)] as a function of the MZM static phase for two simultaneous predictions of Santa-Fe time series [two simultaneous classifications of nonlinear channel equalizations] for Φ_{1,2}-values leading to a stable reservoir output power. The dependence of the system performance on Φ_{1,2} over the whole range is minor both for the prediction task (Fig. 11(a)) and for the classification task (Fig. 11(b)). This suggests that the extra nonlinearity added by the MZM is not necessary because it does not contribute to further performance improvement. Thus Φ_{1,2} can be set arbitrarily without degrading the system performance. Interestingly, it is further confirmed that the prediction/classification errors are significantly lowered when the readout weights are optimized considering *T*_{d} = 1.6*T*. The performance improvement when using the new read-out layer design is largest for the NCE task.
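The modified readout can be illustrated with a short sketch. It assumes that the reservoir output is available as sampled complex field values, that *T* spans a fixed number of samples (`nodes_per_period`, a hypothetical parameter), and that the coherent combination is the detected intensity of the field plus its copy delayed by *T*_{d}. These are assumptions made for illustration, not a statement of the exact scheme used in the simulations; the readout weights would then be trained on the resulting samples instead of on the plain intensity.

```python
def coherent_delayed_intensity(field, nodes_per_period, delay_ratio=1.6):
    """Intensity |E[k] + E[k - d]|^2 with d = round(delay_ratio * nodes_per_period).

    `field` is a sequence of complex field samples (assumed representation).
    Samples earlier than the delay have no delayed partner and use 0 instead.
    With delay_ratio = 0 the result is 4|E[k]|^2, a scaled plain intensity.
    """
    d = round(delay_ratio * nodes_per_period)
    combined = []
    for k in range(len(field)):
        delayed = field[k - d] if k >= d else 0j
        combined.append(abs(field[k] + delayed) ** 2)
    return combined


def to_state_rows(samples, nodes_per_period):
    """Group consecutive samples into one row of reservoir states per period."""
    n_rows = len(samples) // nodes_per_period
    return [
        samples[r * nodes_per_period:(r + 1) * nodes_per_period]
        for r in range(n_rows)
    ]


if __name__ == "__main__":
    # Toy field with one sample per period, so T_d = 1.6 T rounds to 2 samples.
    toy_field = [1 + 0j, 1j, -1 + 0j, -1j]
    print(coherent_delayed_intensity(toy_field, nodes_per_period=1))
```

In the toy example, the third and fourth samples interfere destructively with the samples two steps earlier, which shows how the combined signal carries information about past reservoir states.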

## 10. Conclusions

We have investigated the influence of the static phases and frequency detuning on the performance of optical RC systems based on semiconductor lasers with optical feedback. The static phases/detuning that we consider are the feedback phase, the injection detuning and the MZM phase. Considering two benchmark tasks which require both a nonlinearity and a memory, we have provided evidence that the static feedback phase and the injection detuning strongly influence the computational performance of laser-based RC systems with feedback. The dependence of the system performance on the static MZM phase is minor, even though this parameter can be precisely tuned in experiments. By modifying the readout design of our RC system, such that the read-out weights are optimized from a combination of the reservoir’s state and its delayed version, we have found that the influence of changes in the feedback phase and in the injection detuning can be considerably reduced, mainly for the NCE task. We have shown that the performance improves due to an increase in the memory length. We expect these results to be of particular importance for the design and optimization of optical RC systems. More particularly, they can help to make these systems less dependent on the optical phase or detuning, which are difficult to stabilize or control in a real-world system. We anticipate that, if needed, the memory length can be increased even further by mixing multiple other delayed outputs in the readout layer.

## Acknowledgments

The authors acknowledge the Research Foundation Flanders (FWO) for project support, the Research Council of the VUB and the Interuniversity Attraction Poles program of the Belgian Science Policy Office, under grant IAP P7-35 “photonics@be”. R.M.N. also acknowledges the Fonds National de la Recherche Scientifique (FNRS) (Belgium).

## References and links

**1.** W. Maass, T. Natschläger, and H. Markram, “Real-time computing without stable states: a new framework for neural computation based on perturbations,” Neural Comput. **14**, 2531–2560 (2002).

**2.** H. Jaeger and H. Haas, “Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication,” Science **304**, 78–80 (2004).

**3.** L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer, “Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing,” Opt. Express **20**, 3241–3249 (2012).

**4.** D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer, “Parallel photonic information processing at gigabyte per second data rates using transient states,” Nat. Commun. **4**, 1364 (2013).

**5.** D. Verstraeten, J. Dambre, X. Dutoit, and B. Schrauwen, “Memory versus non-linearity in reservoirs,” in Proc. Int Neural Networks (IJCNN) Joint Conf, 1–8 (2010).

**6.** K. Vandoorne, P. Mechet, T. V. Vaerenbergh, M. Fiers, G. Morthier, D. Verstraeten, B. Schrauwen, J. Dambre, and P. Bienstman, “Experimental demonstration of reservoir computing on a silicon photonics chip,” Nat. Commun. **5**, 4541 (2014).

**7.** L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, “Information processing using a single dynamical node as complex system,” Nat. Commun. **2**, 468 (2011).

**8.** Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, “Optoelectronic reservoir computing,” Sci. Rep. **2**, 287 (2012).

**9.** R. M. Nguimdo, G. Verschaffelt, J. Danckaert, and G. Van der Sande, “Fast photonic information processing using semiconductor lasers with delayed optical feedback: role of phase dynamics,” Opt. Express **22**, 8672–8686 (2014).

**10.** F. Duport, B. Schneider, A. Smerieri, M. Haelterman, and S. Massar, “All optical reservoir computing,” Opt. Express **20**, 22783–22795 (2012).

**11.** R. M. Nguimdo, G. Verschaffelt, J. Danckaert, and G. Van der Sande, “Simultaneous computation of two independent tasks using reservoir computing based on a single photonic nonlinear node with optical feedback,” IEEE Trans. Neural Netw. Learn. Syst. **26**, 3301–3307 (2015).

**12.** R. Martinenghi, S. Rybalko, M. Jacquot, Y. K. Chembo, and L. Larger, “Photonic nonlinear transient computing with multiple-delay wavelength dynamics,” Phys. Rev. Lett. **108**, 244101 (2012).

**13.** L. Appeltant, G. Van der Sande, J. Danckaert, and I. Fischer, “Constructing optimized binary masks for reservoir computing with delay systems,” Sci. Rep. **4**, 3629 (2014).

**14.** K. Hicke, M. A. Escalona-Moran, D. Brunner, M. C. Soriano, I. Fischer, and C. R. Mirasso, “Information processing using transient dynamics of semiconductor lasers subject to delayed feedback,” IEEE J. Sel. Top. Quantum Electron. **19**, 1501610 (2013).

**15.** A. Argyris, M. Hamacher, K. Chlouverakis, A. Bogris, and D. Syvridis, “Photonic integrated device for chaos applications in communications,” Phys. Rev. Lett. **100**, 194101 (2008).

**16.** M. Khoder, R. M. Nguimdo, G. Verschaffelt, J. Bolk, X. J. M. Leijtens, and J. Danckaert, “Wavelength switching speed in semiconductor ring lasers with on-chip filtered optical feedback,” Photon. Technol. Lett. **26**, 520–523 (2014).

**17.** T. Heil, I. Fischer, W. Elsäßer, B. Krauskopf, K. Green, and A. Gavrielides, “Delay dynamics of semiconductor lasers with short external cavities: bifurcation scenarios and mechanisms,” Phys. Rev. E **67**, 066214 (2003).

**18.** M. C. Soriano, J. García-Ojalvo, C. R. Mirasso, and I. Fischer, “Complex photonics: dynamics and applications of delay-coupled semiconductors lasers,” Rev. Mod. Phys. **85**, 421–470 (2013).

**19.** R. Lang and K. Kobayashi, “External optical feedback effects on semiconductor injection laser properties,” IEEE J. Quantum Electron. **16**, 347–355 (1980).

**20.** M. C. Soriano, S. Ortín, D. Brunner, L. Larger, C. R. Mirasso, I. Fischer, and L. Pesquera, “Optoelectronic reservoir computing: Tackling noise-induced performance degradation,” Opt. Express **21**, 12–20 (2013).

**21.** L. Gelens, S. Beri, G. Van der Sande, G. Mezosi, M. Sorel, J. Danckaert, and G. Verschaffelt, “Exploring multi-stability in semiconductor ring lasers: theory and experiment,” Phys. Rev. Lett. **102**, 193904 (2009).

**22.** R. M. Nguimdo, G. Verschaffelt, J. Danckaert, X. Leijtens, J. Bolk, and G. Van der Sande, “Fast random bit generation based on a single chaotic semiconductor ring laser,” Opt. Express **20**, 28603–28613 (2012).

**23.** S. Wieczorek, B. Krauskopf, T. B. Simpson, and D. Lenstra, “The dynamical complexity of optically injected semiconductor lasers,” Phys. Rep. **416**, 1 (2005).

**24.** R. M. Nguimdo, M. Khoder, G. Verschaffelt, J. Danckaert, and G. Van der Sande, “Fast phase response and chaos bandwidth enhancement in semiconductor lasers subject to optical feedback and injection,” Opt. Lett. **39**, 5945–5948 (2014).

**25.** A. S. Weigend and N. A. Gershenfeld, “Time series prediction: Forecasting the future and understanding the past,” ftp://ftp.santafe.edu/pub/Time-Series/Competition (1993).

**26.** A. Rodan and P. Tiňo, “Minimum complexity echo state network,” IEEE Trans. Neural Netw. **22**, 131–144 (2011).