Effects of cavity nonlinearities and linear losses on silicon microring-based reservoir computing

Bernard J. Giron Castro; Christophe Peucheret; Darko Zibar; Francesco Da Ros

doi:10.1364/OE.509437

1. Introduction

Neuromorphic computing systems, which try to resemble the working mechanism of the human brain, are an interesting alternative to traditional von Neumann architectures. Fundamental physical limits of electronics constrain how far the computing capacity of current architectures can be increased. Hence, neuromorphic computing appears to be a promising step in the development of novel artificial intelligence processors that can enhance the performance of current computing architectures and might extend Moore’s law [1]. Over the last decade, developments in integrated photonics have allowed the exploration of novel computing paradigms. Additionally, these developments have driven the photonic hardware realization of computing schemes and machine learning algorithms. Lower energy consumption, parallel computing and faster processing speed are the key potential benefits of photonic computing architectures that could address the limitations of traditional electronic circuits [2,3]. Photonic neural networks, all-optical switching, optical spiking neurons and optical activation functions are some examples of the emergence of the photonic computing field [3].

Reservoir computing (RC) is a relatively recent computing paradigm in the recurrent neural networks (RNNs) family that offers a lower complexity of the training process with respect to conventional RNN and other neural network schemes [4,5]. An RC architecture consists of an input layer, in which the data is assigned random fixed weights before being transferred to the reservoir layer, where the data is mapped into a higher dimensional space by means of interconnected nonlinear nodes with random and fixed connections. Using the response of the reservoir nodes, the weights in the output layer are trained to solve a specific target task, usually using ridge or linear regression. Based on the trained weights, RC can also make a prediction of the target task for subsequent input sequences that are unknown to the reservoir. Only the output layer is trained in RC, and this key feature considerably decreases the training time of this type of neural network [5,6]. RC has been shown to have applications in time-series predictions, channel equalization, speech recognition, medical and financial applications, etc. Further details about RC architecture are available in comprehensive reviews on the subject [5,6].

Multiple works have demonstrated implementations of photonic RC, where usually the nonlinear nodes are achieved through the nonlinear behaviour of photonic devices [7–20] or the dynamics of nonlinear optical phenomena [21–23]. Some of these works consist of blocks of photonic devices that act as nonlinear nodes [7,11,12], but this makes scaling the RC challenging and considerably increases the footprint of the photonic circuit. An alternative approach known as time-delay reservoir computing (TDRC) is to multiplex the nodes in time and use a single physical nonlinear node that typically receives feedback through a physical loop to boost the connectivity between the virtual nodes and the overall memory of the RC. Several works regarding photonic TDRC can be found in the literature, e.g., using a Mach-Zehnder modulator as the nonlinear node [9,10,14,17,19], semiconductor optical amplifiers [8] or the nonlinear dynamics of laser devices [13,15]. A TDRC setup based on microring resonators (MRRs), first studied in [18], demonstrated good performance in time-series prediction tasks. Nonetheless, there was no clearly established relationship between the performance of RC and the physical effects that generate the nonlinear dynamics of the microring cavity. In [24] we reported initial studies on the impact of varying the relaxation times of such physical effects and numerically showed that it is possible to obtain frequency detuning and power regions with a prediction error lower than other RC implementations with a similar number of virtual nodes and input rate.

In this work, we extend our study of the MRR-based TDRC architecture from [24] by investigating the impact of the cavity waveguide linear loss on the performance of RC. This study also encompasses an analysis of how the amount of nonlinearity given by the cavity dynamics influences RC and how the behaviour of such dynamics can be used to improve the performance of this type of RC implementation. We also explore the impact of the generated nonlinear oscillations in the time-series prediction and show that it is a non-desirable effect for solving the discrete-time tenth-order nonlinear auto-regressive moving average (NARMA-10) task. Additionally, this work deepens the understanding of the performance thresholds and the fabrication requirements for the microring waveguide in order to achieve lower error prediction than similar numerical RC schemes. This improvement is shown for the prediction of the NARMA-10 sequence and can potentially be extended to other RC tasks.

The structure of the paper goes as follows: In section 2 we introduce the model used to mathematically describe the nonlinear dynamics of the MRR-based TDRC scheme. In this section, we also detail the major assumptions and values used for the optical parameters. In section 3 we present the details of each of the TDRC layers, including the mathematical description of the electric field of the processed optical signal at the different stages of our setup. In section 4 we describe the benchmark methodology when solving the NARMA-10 task. In section 5 we present the results of the setup when varying different physical parameters related to the MRR properties, and define input power vs. frequency detuning regions with different levels of prediction error. Afterwards, in section 6 we analyze the previous results and relate the region of low error prediction with the physical properties of the MRR and the dominance of each of the nonlinear effects. Section 7 summarizes the main conclusions.

2. Free-carrier nonlinearities in silicon MRR

Silicon microring resonators have been extensively studied in the field of photonic computing as their features have demonstrated applications in a variety of computing processes such as all-optical switching [25–27], optical logic gates [28], weight banks [29], photonic spiking neural networks [30], photonic accelerators [31] and photonic RC [16,18]. The study in [18] focuses on the dependence of the MRR-based TDRC performance on the feedback intensity and the phase shift of the external feedback waveguide. This analysis is tested for the NARMA-10 and Mackey-Glass tasks. The dependence on the input power and detuning of the pump for the Santa Fe task is also studied. The system investigated in [16] presents a similar scheme but without external feedback and was tested on analog and binary operations. Furthermore, the impact of noise on the performance of the previous schemes with or without feedback was analyzed in [32].

To model the dynamics of a silicon MRR, we use the same mathematical model as in our previous study [24], similar to the one used in [18]. In this model, we introduce in the input port of the MRR a quasi-monochromatic electric field $E_{in}$ at an angular frequency $\omega _p$ close to that of the resonance frequency of the cold MRR cavity $\omega _0$. This field triggers the generation of excess carriers due to two-photon absorption (TPA). The conversion of the optical mode energy to heat by the absorption of power results in the increase of the cavity temperature. The generated free carriers change the refractive index of the cavity waveguide through free-carrier dispersion (FCD), which in turn results in a blue shift of the resonance frequency. They also become a source of free-carrier absorption (FCA). FCA contributes to the total rise of heat in the cavity. The resulting thermo-optic (TO) effect also changes the refractive index of the cavity but in the opposite direction, causing a red shift of the resonance [33]. All of these effects taking place inside the MRR cavity can be described by the temporal coupled mode theory (TCMT) through the following system of coupled differential equations, for an add-drop MRR configuration [18,33–37]:

(1)$$\frac{\textrm{d}a(t)}{\textrm{d}t} = [i\delta_\omega(t)-\gamma_{\textrm{tot}}(t)]a(t) + i \sqrt{\frac{2}{\tau_\textrm{c}}}\left[E_{\textrm{in}}(t) + E_{\textrm{add}}(t)\right]e^{i\omega_\textrm{p}t},$$

(2)$$\frac{\textrm{d}\Delta N(t)}{\rm{d}t} ={-}\frac{\Delta N(t)}{\tau_{\textrm{FC}}} + \frac{\Gamma_{\textrm{FCA}}c^{2} \beta_{\textrm{TPA}}}{2\hbar\omega_pV^{2}_{\textrm{FCA}}n^{2}_{\rm Si}} |a(t)|^4,$$

(3)$$\frac{\textrm{d}\Delta T(t)}{\rm{d}t} ={-}\frac{\Delta T(t)}{\tau_{\textrm{th}}} + \frac{\Gamma_{\textrm{th}}P_{\textrm{abs}}(t)}{mc_\textrm p} |a(t)|^2,$$

where $a$ is the modal amplitude within the resonator cavity, $\Delta N$ represents the excess free-carrier density generated via TPA, and $\Delta T$ is the temperature difference of the waveguide cavity with respect to the environment. The variation of $a$ described in Eq. (1) is dependent on both $E_\textrm { {in}}$ and $E_\textrm { {add}}$, where the latter denotes the electric field at the add port in an add-drop MRR configuration. 1/$\tau _\textrm c$ is the decay rate of the cavity modal energy due to the coupling of each bus waveguide to the MRR. The terms $\delta _\omega (t)$ and $\gamma _\textrm { {tot}}(t)$ represent the total angular frequency detuning and losses rate in the MRR cavity, respectively. In Eq. (2) and (3), $c$ and $\hbar$ denote the speed of light in vacuum and the reduced Planck’s constant, respectively. $\tau _\textrm { {FC}}$ is the relaxation time of the free carriers, $\tau _\textrm { {th}}$ is the decay time of the TO effect, and $m$ is the mass of the MRR. The terms $\beta _\textrm { {TPA}}$, $n_\textrm { {Si}}$ and $c_\textrm { p}$ refer to the TPA coefficient, refractive index and specific heat of silicon, respectively. $\Gamma _\textrm { {FCA}}$ and $\Gamma _\textrm { {th}}$ are the FCA and thermal confinement factors related to the fractional energy overlap of the mode with the differential temperature and excess of FCD within the silicon microring. $V_\textrm { {FCA}}$ is the effective volume of the FCA. $P_\textrm { {abs}}(t)$ is the total optical mode energy converted into absorbed power. The time-dependent terms $\delta _\omega (t)$, $\gamma _\textrm { {tot}}(t)$ and $P_\textrm { {abs}}(t)$ have the following definitions [33,35,36]:

(4)$$\delta_\omega(t) = \omega_\textrm{p} - \omega_0\left[1 - \frac{1}{n_{\textrm{Si}}}\left(\Delta N(t) \frac{\textrm{d}n_{\textrm{Si}}}{\textrm{d}N} + \Delta T(t) \frac{\textrm{d}n_{\textrm{Si}}}{\textrm{d}T} \right)\right],$$

(5)$$\gamma_{\textrm{tot}}(t) = \frac{c\alpha}{n_{\textrm{Si}}} + \frac{2}{\tau_\textrm{c}} + \gamma_{\textrm{TPA}} + \gamma_{\textrm{FCA}} = \frac{c\alpha}{n_{\textrm{Si}}} + \frac{2}{\tau_\textrm{c}} + \frac{\beta_{\textrm{TPA}}c^{2}}{n_{\textrm{Si}}^2V_{\textrm{TPA}}}|a(t)|^2 +\frac{\Gamma_{\textrm{FCA}}\sigma_{\textrm{FCA}}c}{2n_{\textrm{Si}}} \cdot \Delta N(t),$$

(6)$$P_{\textrm{abs}}(t) = \left(\frac{c\alpha}{n_{\textrm{Si}}} + \frac{\beta_{\textrm{TPA}}c^{2}}{n_{\textrm{Si}}^2V_{\textrm{TPA}}} |a(t)|^2 +\frac{\Gamma_{\textrm{FCA}}\sigma_{\textrm{FCA}}c}{2n_{\textrm{Si}}} \cdot \Delta N(t)\right)|a(t)|^2.$$

In Eq. (4), the total angular frequency detuning is a result of the sum of the detuning between $\omega _\textrm p$ and $\omega _0$, which we refer to in this work as $\Delta \omega = \omega _\textrm p - \omega _0$, and the nonlinear detuning due to TO and FCD effects. $\textrm dn_\textrm { {Si}}/\textrm dT$ and $\textrm dn_\textrm { {Si}}/\textrm dN$ represent the TO and FCD coefficients of silicon. In Eq. (5) and (6), the terms $\gamma _\textrm { {TPA}}$ and $\gamma _\textrm { {FCA}}$ denote the losses due to TPA and FCA [33], and $\alpha$ is the linear attenuation of the waveguide. $\sigma _\textrm { {FCA}}$ is the total FCA cross-section. The values of the optical parameters used for the simulation of the photonic RC are listed in Table 1.

Table 1. Optical parameters used in the photonic RC simulations.

Our model does not consider the contribution of the frequency detuning due to Kerr effect in Eq. (1) as the induced refractive index change (of the cavity waveguide) is negligible compared to the change caused by TO and FCD effects [38] at the simulated input power range. However, it is important to point out that the Kerr effect becomes more relevant in the case of coupled cavities and different material platforms, as recently studied in [39]. The model also does not consider any source of noise, nor does it consider the counterpropagating optical mode as the absence of backscattering is assumed. This assumption follows the approaches of [34–38]. In order to apply the model to our RC tasks, we normalize and solve Eq. (1)–(3) using a $4^{\textrm {th}}$-order Runge-Kutta method similar to the solutions of the TCMT equations described in [35,38]. Throughout this work, we sweep the values of the nonlinear effects lifetimes $\tau _\textrm { {FC}}$, and $\tau _\textrm { {th}}$, as well as the attenuation value given by $\alpha$. Then, we evaluate the impact of varying those parameters on RC dynamics and performance.
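As a minimal sketch of a fixed-step fourth-order Runge-Kutta integrator of the kind mentioned above, the following Python snippet solves only the linear, undriven limit of Eq. (1), $\textrm{d}a/\textrm{d}t = (i\delta - \gamma)a$, for which the exact solution $a(t) = a(0)e^{(i\delta-\gamma)t}$ is available for comparison. The function names `rk4_step` and `integrate`, the step size and the values of `DELTA` and `GAMMA` are illustrative assumptions in arbitrary normalized units; they are not the normalized model or the parameters of Table 1.

```python
import cmath

def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one classical 4th-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(f, y0, h, steps):
    """Apply rk4_step repeatedly, starting from y0 at t = 0."""
    t, y = 0.0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

# Assumed illustrative values (normalized units), not taken from the paper
DELTA = 0.05   # constant detuning
GAMMA = 0.01   # constant loss rate
H = 2.0        # step size
STEPS = 100

def linear_mode(t, a):
    """Linear, undriven limit of Eq. (1): da/dt = (i*delta - gamma) * a."""
    return (1j * DELTA - GAMMA) * a

a_num = integrate(linear_mode, 1.0 + 0.0j, H, STEPS)
a_exact = cmath.exp((1j * DELTA - GAMMA) * H * STEPS)
rel_err = abs(a_num - a_exact) / abs(a_exact)
```

In the full model, the right-hand side would also carry $\Delta N$ and $\Delta T$ as state variables together with the driving fields; the same stepping routine applies if `y` is replaced by a vector-like state.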

3. MRR-based time-delay photonic RC

The photonic TDRC simulated in this work (Fig. 1(a)) consists of an optical pump from a laser source that is modulated by the masked input data sequence of the RC. The virtual nodes are multiplexed in time by the masking signal $m(n)$, and a delay waveguide, whose length sets the added delay, is included to increase the memory of the RC. In [16,18], the authors demonstrated the memory capacity provided by the MRR cavity itself and by the external waveguide. The response of the RC is obtained through a photodetector connected to the drop port of the MRR. Afterwards, the training and testing of the output layer are performed using ridge regression.

Fig. 1. a) Photonic TDRC scheme using a silicon MRR with delayed feedback. b) Data sequence and masked input of the RC, taken from an individual testing set of the RC simulations. c) Corresponding electric field envelope of the signal at the input port of the MRR. The small modulation index approximates a quasi-monochromatic optical signal.

The silicon MRR cavity acts as the single physical node of RC when nonlinear behaviour is induced. Indeed, this behaviour tends to become oscillatory as the FCD and TO effects cause an increase in cavity losses, which decreases the modal energy $|a|^2$. This reduction of energy in turn diminishes the nonlinear effects together with their caused losses, and consequently, $|a|^2$ starts to rise again, forming a cyclic nonlinear behaviour known as self-pulsing (SP) [34,37,38].

In RC, the high nonlinearity is correlated with the dimensionality expansion that is required for computation tasks when their solution is not feasible in low dimensional input space, and a task-dependent correlation between higher dimensionality and better performance has been demonstrated [40,41]. However, a higher level of nonlinearity and dimensionality does not always entail a better performance of RC [42], whereas, in time-series prediction tasks like the one analyzed in this work, it might even be detrimental for the memory capacity of the reservoir [43]. Furthermore, in certain frequency detuning and input power conditions, a perturbation of the input optical signal can trigger self-pulsing of the cavity energy with ultra-short discontinuities or spikes, as previously studied on silicon MRR nonlinearities [37,44]. Such fast pulse transitions can alter the stability of the RC dynamics, as they affect the computational consistency of the system, as defined in [45]. In [39], it is also pointed out that in the case of coupled cavities under the influence of FCD effects, there is an optimum input power interval in which the increase of dimensionality comes before the loss of consistency. As further discussed in section 5, similar findings are achieved in this work where the RC reaches enough dimensionality to achieve good performance without exciting SP that could affect its stability. Further details from each of the layers of the simulated RC scheme are described in the following subsections.

3.1 Input layer

The 1-GBd input symbol sequence $u(n)$ is multiplied by the mask $m(n)$, whose random values are drawn from a uniform distribution over the interval [0, +1]. The length of $m(n)$ is determined by the number of nodes $N$ of the RC. Unless another value is specified, this work uses $N = 50$. Therefore, every symbol of the sequence $u(n)$, which has a duration of 1.0 ns, is masked into virtual nodes that each span a time interval of $\theta =\frac {1.0 \textrm { ns}}{N} = 20$ ps, producing $X(n)$ (Fig. 1(b)). To fulfil the requirement of our model of a quasi-monochromatic input electric field, we keep the modulation index small by adding a bias $\beta = 8.0$ to $X(n)$; the value of $\beta$ was optimized with respect to the performance of the RC (with the rest of the parameters fixed). The resulting masked input signal, $\hat {X}(n)$, has a modulation index of less than 2%. $\hat {X}(n)$ is transformed into an electric signal that linearly modulates the optical field from an ideal noiseless continuous-wave laser source that generates an optical pump at a frequency $\omega _\textrm p$ and with average power $\overline {P}_\textrm { {in}}$. We denote the resulting linearly modulated optical signal power at the input port of the MRR as $X_\textrm { {in}}(n)$. Consequently, we can mathematically describe the input sequence $\hat {X}(n)$ and the electric field $E_\textrm { {in}}$ at the input port of the MRR (Fig. 1(c)) corresponding to the input optical signal (square root of $X_\textrm { {in}}(n)$), in the following way [18]:

(7)$$\hat{X}(n) = u(n)m(n) + \beta = X(n) + \beta,$$

(8)$$E_\textrm{ {in}}(n) = [X_\textrm{ {in}}(n)]^{(1/2)}.$$
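The masking step of Eq. (7) can be sketched in a few lines of Python. The snippet below is an illustration under stated assumptions: the helper names (`make_mask`, `mask_input`, `field_modulation_index`) and the seed are hypothetical, $u(n)$ is drawn from [0.0, 0.5] as for the NARMA-10 task, the optical power is taken to be proportional to $\hat{X}(n)$, and the modulation index is assumed to be defined on the field amplitude as $(E_{\max}-E_{\min})/(E_{\max}+E_{\min})$, which the text does not specify.

```python
import math
import random

N_NODES = 50   # number of virtual nodes N (from the text)
BIAS = 8.0     # bias beta (from the text)

def make_mask(n_nodes, rng):
    """Random mask m, one value per virtual node, uniform over [0, 1)."""
    return [rng.random() for _ in range(n_nodes)]

def mask_input(u, mask, bias):
    """Eq. (7): one row per symbol, X_hat[n][j] = u[n] * m[j] + bias."""
    return [[u_n * m_j + bias for m_j in mask] for u_n in u]

def field_modulation_index(x_hat):
    """(E_max - E_min) / (E_max + E_min) with E = sqrt(X_hat).

    Assumed definition; power is taken proportional to X_hat (Eq. (8)).
    """
    fields = [math.sqrt(v) for row in x_hat for v in row]
    return (max(fields) - min(fields)) / (max(fields) + min(fields))

rng = random.Random(0)                            # arbitrary seed
mask = make_mask(N_NODES, rng)
u = [rng.uniform(0.0, 0.5) for _ in range(100)]   # assumed input range
x_hat = mask_input(u, mask, BIAS)
index = field_modulation_index(x_hat)
```

Under these assumptions, $\hat{X}(n)$ stays between 8.0 and 8.5, so the field-amplitude index is bounded by $(\sqrt{8.5}-\sqrt{8})/(\sqrt{8.5}+\sqrt{8}) \approx 1.5\%$, which is compatible with the stated value of less than 2%.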

3.2 Reservoir layer

The Runge-Kutta solution of Eq. (1)–(3) is obtained with a step $\eta = 2.0$ ps. This value was sufficiently small to obtain an accurate solution for the cavity nonlinearities of the system under investigation in [24]. There are $M = \frac {\theta }{\eta } = 10$ solver steps per virtual node (500 per symbol for $N$ = 50). We then sample and hold $E_\textrm { {in}}(n)$ at each $k^{\textrm {th}}$ step of the Runge-Kutta solver over an interval $\theta$ for each $j^{\textrm {th}}$ virtual node to build the input electric field $\hat {E}_\textrm { {in}}(k)$ used in our model solver:

(9)$$\hat{E}_\textrm{ {in}}(k) = E_\textrm{ {in}}(n), \qquad \textrm{for} \qquad (j-1)\theta \le (j-1)\theta + k\eta < j\theta, \qquad\qquad 0 \le k<M.$$
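The timing arithmetic and the sample-and-hold operation of Eq. (9) can be checked with the short Python sketch below. The constants come from the text; the function name `sample_and_hold` and the list-based representation are illustrative assumptions.

```python
SYMBOL_PS = 1000.0   # symbol duration: 1.0 ns at 1 GBd
N_NODES = 50         # virtual nodes per symbol
ETA_PS = 2.0         # Runge-Kutta step

theta_ps = SYMBOL_PS / N_NODES                 # duration of one virtual node
steps_per_node = round(theta_ps / ETA_PS)      # M in the text
steps_per_symbol = steps_per_node * N_NODES    # solver steps per symbol

def sample_and_hold(node_values, steps_per_node):
    """Eq. (9): hold each virtual-node value constant for steps_per_node steps."""
    held = []
    for value in node_values:
        held.extend([value] * steps_per_node)
    return held

# Two virtual nodes held for three steps each
example = sample_and_hold([1.0, 2.0], 3)
```

With the values from the text this gives $\theta = 20$ ps, $M = 10$ and 500 solver steps per symbol.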

Similar to the mathematical description of the scheme performed in [18], the external waveguide provides a delay $\tau _\textrm d$ to the optical signal that links the through and the add port. Throughout the simulations of the RC, we use a value of $\tau _\textrm d = 0.5$ ns for the delay waveguide. This value was optimized as a function of the ratio between $\tau _d$ and the symbol duration (1.0 ns) where a delay of a duration of half the symbol length was found to be the optimum value. The solver steps equivalent to the delay time are determined as $\hat {\tau }_\textrm d = \tau _\textrm d / \eta = 250$. We assume no counterpropagating modes in the microring cavity. Hence, the electric field samples at the add and drop ports can be expressed as:

(10)$$\hat{E}_\textrm{ {add}}(k) = \kappa e^{{-}i\phi_\textrm d}\left[\hat{E}_\textrm{ {in}}(k-\hat{\tau}_\textrm d )+\frac{1}{\tau_\textrm c}a(k-\hat{\tau}_\textrm d )\right],$$

(11)$$\hat{E}_\textrm{ {drop}}(k) = \frac{1}{\tau_\textrm c}a(k)\hat{E}_\textrm{ {in}}(k)+\hat{E}_\textrm{ {add}}(k),$$

where $\kappa$ = 0.95 represents the optimized coupling factor between the delay waveguide and the bus waveguides of the MRR. As in the case of $\beta$, $\kappa$ was optimized with respect to the performance of the RC (lowest prediction error). The obtained value of $\kappa$ is also close to the one used in [18]. The model takes into account the phase shift $\phi _\textrm d$ due to propagation through the delay waveguide for the optical pump with wavelength $\lambda _ \textrm p$. In [18], an adjustable external phase shift ($\Delta \phi$) in the delay waveguide was used to improve performance. In this work, we use $\Delta \phi$ = 0, which, within a limited range of configurations of the setup, gives performance similar to that obtained for other values of $\Delta \phi$. The results obtained in this work could be further improved by a more systematic optimization of $\Delta \phi$, which is out of the scope of this work. $\phi _\textrm d$ is defined as:

(12)$$\phi_\textrm d = \frac{2\pi \tau_\textrm d c}{\lambda_\textrm p}.$$
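The delayed feedback of Eq. (10) amounts to reading the input field and the modal amplitude from $\hat{\tau}_\textrm{d}$ solver steps in the past. The Python sketch below implements Eq. (10) as written; the function name `add_field`, the history lists and the choice of a zero add-port field before the delay line has filled are assumptions made for illustration, since the text does not state the initial condition.

```python
import cmath

TAU_D_S = 0.5e-9   # delay of the external waveguide (from the text)
ETA_S = 2.0e-12    # Runge-Kutta step (from the text)
DELAY_STEPS = round(TAU_D_S / ETA_S)   # number of solver steps in the delay

def add_field(k, e_in_hist, a_hist, kappa, phi_d, inv_tau_c, delay_steps):
    """Eq. (10): kappa * exp(-i*phi_d) * [E_in(k - d) + (1/tau_c) * a(k - d)].

    Returns 0 while k < delay_steps (assumed empty delay line at start-up).
    """
    if k < delay_steps:
        return 0j
    past = k - delay_steps
    return kappa * cmath.exp(-1j * phi_d) * (
        e_in_hist[past] + inv_tau_c * a_hist[past]
    )

# Toy histories with a two-step delay, only to show the indexing
e_hist = [1.0, 2.0, 3.0]
a_hist = [0j, 0j, 0j]
fed_back = add_field(2, e_hist, a_hist, 0.95, 0.0, 1.0, 2)
```

In a solver loop, `e_in_hist` and `a_hist` would be appended to at every step so that index `k - delay_steps` is always available once the delay has elapsed.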

3.3 Readout layer

We average the $M$ step values of the Runge-Kutta solution for $\hat {E}_\textrm { {drop}}(k)$ over the duration $\theta$ for each $j^{\textrm {th}}$ virtual node to obtain $E_\textrm { {drop}}(n)$:

(13)$$E_\textrm{ {drop}}(n) = \frac{1}{M}\sum_{k=(j-1)M+1}^{jM} \hat{E}_\textrm{ {drop}}(k)$$

Next, we simulate the photodiode response of the RC by calculating the squared modulus of $E_\textrm { {drop}}(n)$:

(14)$$X_\textrm{ {drop}}(n) = \lvert E_\textrm{ {drop}}(n)\rvert^2$$
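Equations (13) and (14) can be illustrated with the following Python sketch, which averages the complex drop-port field over groups of $M$ solver steps and then applies square-law detection. The function names `node_average` and `photodiode` are hypothetical, and an ideal, noiseless detector is assumed, in line with the noise-free model.

```python
def node_average(samples, steps_per_node):
    """Eq. (13): average consecutive groups of steps_per_node field samples."""
    if len(samples) % steps_per_node != 0:
        raise ValueError("sample count must be a multiple of steps_per_node")
    return [
        sum(samples[i:i + steps_per_node]) / steps_per_node
        for i in range(0, len(samples), steps_per_node)
    ]

def photodiode(fields):
    """Eq. (14): ideal square-law detection, |E|^2 for each averaged field."""
    return [abs(e) ** 2 for e in fields]

# Four complex samples, two per virtual node
averaged = node_average([1 + 1j, 1 - 1j, 2, 4], 2)
detected = photodiode(averaged)
```

Note that the field is averaged before detection, following the order of Eq. (13) and Eq. (14); averaging the detected power instead would generally give a different result when the phase varies within a virtual node.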

Once the response of the RC is obtained, we calculate the $N$-size weight vector $W$ for the output layer of the RC by using ridge regression with an optimized regularization parameter $\Lambda = 10^{-9}$. We keep $\Lambda$ constant and do not analyze the impact of varying it for the different values of the physical parameters considered in this work. The elements of the predicted sequence $\hat {y}(n)$ are then determined as follows:

(15)$$\hat{y}(n) = \sum_{j=1}^N W_jX_{n,j}(n)$$
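For illustration, the readout training can be sketched as a ridge regression solved through the regularized normal equations $(X^{T}X + \Lambda I)W = X^{T}y$. The pure-Python solver below is a hedged sketch: the function names `ridge_fit` and `ridge_predict` are hypothetical, no intercept term is included (the text does not mention one), and a practical implementation would more likely rely on a linear-algebra library.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gaussian elimination."""
    n_feat = len(X[0])
    # Augmented matrix [X^T X + lam * I | X^T y]
    A = [
        [sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
         for j in range(n_feat)]
        + [sum(row[i] * t for row, t in zip(X, y))]
        for i in range(n_feat)
    ]
    # Forward elimination with partial pivoting
    for col in range(n_feat):
        pivot = max(range(col, n_feat), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n_feat):
            factor = A[r][col] / A[col][col]
            for c in range(col, n_feat + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution
    w = [0.0] * n_feat
    for i in reversed(range(n_feat)):
        s = sum(A[i][j] * w[j] for j in range(i + 1, n_feat))
        w[i] = (A[i][n_feat] - s) / A[i][i]
    return w

def ridge_predict(X, w):
    """Eq. (15): weighted sum of the node responses for each symbol."""
    return [sum(x_j * w_j for x_j, w_j in zip(row, w)) for row in X]

# Toy data generated from weights [2, -3]
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_toy = [2.0, -3.0, -1.0]
w_toy = ridge_fit(X_toy, y_toy, 1e-9)
```

Here each row of `X` would hold the $N$ detected node responses of one symbol, and `lam` plays the role of $\Lambda$.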

4. Benchmark

To investigate the impact of the relaxation times of the studied nonlinear effects and the waveguide attenuation we test the system when solving a chaotic time-series prediction task like NARMA-10. This task requires the high dimensionality given by the nonlinearities of the cavity and the photodiode response, in addition to memory capabilities as it requires a memory of 10 steps in the past to be solved. The NARMA-10 time-series target equation can be expressed as [46]:

(16)$$y(n+1) = 0.3y(n) + 0.05y(n)\left[\sum_{i=0}^9 y(n-i)\right] + 1.5u(n-9)u(n) + 0.1$$
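A target sequence can be generated with the standard NARMA-10 recursion, in which the left-hand side is the next sample $y(n+1)$. The Python sketch below is illustrative: the function name `narma10`, the seed, the sequence length and the zero initial conditions for the first ten samples are assumptions, and $u(n)$ is drawn uniformly from [0.0, 0.5].

```python
import random

def narma10(u):
    """NARMA-10: y[n+1] = 0.3 y[n] + 0.05 y[n] * sum_{i=0..9} y[n-i]
    + 1.5 u[n-9] u[n] + 0.1, with the first ten samples set to zero (assumed)."""
    y = [0.0] * len(u)
    for n in range(9, len(u) - 1):
        window = sum(y[n - i] for i in range(10))
        y[n + 1] = (0.3 * y[n] + 0.05 * y[n] * window
                    + 1.5 * u[n - 9] * u[n] + 0.1)
    return y

rng = random.Random(1)                            # arbitrary seed
u = [rng.uniform(0.0, 0.5) for _ in range(300)]   # arbitrary length
y = narma10(u)
```

Each new sample depends on the ten previous outputs and on the inputs $u(n)$ and $u(n-9)$, which is the origin of the ten-step memory requirement of the task.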

For this task, the sequence $u\left (n\right )$ is taken from a uniform distribution over the interval [0.0, 0.5]. We use 2000 data points for the training and 2000 new data points for the testing. The whole data batch is processed into the RC at once. In addition to the 4000 data points, we add 200 warm-up data points at the start of both the training and the testing sets. This allows the Runge-Kutta solution to settle and eliminates any memory dependency between the sequences. To measure the performance of the RC, we calculate the normalized mean square error (NMSE) of the predicted sets. This metric is an estimation of the prediction errors, and so the lower the error, the higher the performance of the RC. The NMSE between the predicted sequence $\hat {y}(n)$ and the target $y(n)$, which has variance $\sigma _{y}^2$ and length $L_\textrm { {data}}$, can be expressed as:

(17)$$\textrm{NMSE} = \frac{1}{L_\textrm{ {data}}}\frac{\sum_{n} \left( \hat{y}(n) - y(n) \right)^2}{\sigma_{y}^2}$$
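Equation (17) translates directly into the Python sketch below. The function name `nmse` is hypothetical, and the variance is assumed to be the population variance (normalized by $L_\textrm{data}$), which the text does not specify.

```python
def nmse(y_pred, y_true):
    """Eq. (17): mean squared error divided by the variance of the target."""
    n = len(y_true)
    mean = sum(y_true) / n
    var = sum((v - mean) ** 2 for v in y_true) / n   # population variance (assumed)
    mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n
    return mse / var

target = [1.0, 2.0, 3.0, 4.0]
perfect = nmse(target, target)            # exact prediction
mean_only = nmse([2.5] * 4, target)       # always predicting the target mean
```

With this normalization, a predictor that always outputs the mean of the target scores an NMSE of 1, which gives a useful reference point: values above 1 are worse than that trivial predictor.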

5. Results

In the following subsections, all the results are obtained from simulations under a $\overline {P}_\textrm { {in}}$ range of -20 to +20 dBm and a $\Delta \omega /2\pi$ range of $\pm 300$ GHz. For every simulation with the same value of $N$, we use the same generated random mask sequence $m(n)$. The calculated NMSE of our simulation results corresponds to the testing set and it is averaged over 10 different seeds used to generate $u(n)$ and consequently, the NARMA-10 sequence. As previously mentioned, a normalization process is carried out when solving Eq. (1)–(3). However, for visualization purposes, we present the denormalized quantities of $\Delta N(t)$ and $\Delta T(t)$ in our results. When not otherwise specified, we use the following values for the analyzed parameters: $\tau _ \textrm {th}$ = 50 ns, $\tau _ \textrm {FC}$ = 10 ns, $\alpha$ = 0.8 dB/cm, $N$ = 50. The value of each parameter analyzed was selected as a middle point within realistic ranges of previously reported values from studies in the literature that use MRRs for photonic computing applications [34–38]. First, using the previous parameters as the initial conditions, we obtain the results of subsection 5.1.
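For reference, the average input power limits quoted above can be converted from dBm to watts with the short Python sketch below. The function names `dbm_to_watts` and `watts_to_dbm` are hypothetical helpers, not part of the simulation described in this work.

```python
import math

def dbm_to_watts(p_dbm):
    """Convert a power in dBm (decibels relative to 1 mW) to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)

def watts_to_dbm(p_w):
    """Convert a power in watts to dBm."""
    return 10 * math.log10(p_w / 1e-3)

p_min_w = dbm_to_watts(-20.0)   # lower limit of the simulated range
p_max_w = dbm_to_watts(20.0)    # upper limit of the simulated range
```

The simulated range of -20 to +20 dBm therefore spans average input powers from 10 µW to 100 mW.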

5.1 $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ regions of NMSE

We divide the parameter space $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ according to the NMSE performance and identify three different behaviours. The corresponding regions are labelled A, B, and C, as shown in Fig. 2. Region A occurs at large detuning from the (cold) resonance of the MRR in the heat maps, where the NMSE gradually approaches a constant value. Region C is located near the resonance of the MRR (although this is not necessarily the case for other parameter configurations, as further results demonstrate). It yields an NMSE > 1.0, which indicates an inconsistent response of the RC that prevents accurate predictions. Region B lies between regions A and C on both the red-shifted and blue-shifted sides of the spectrum. This region gives the best performance of the RC (NMSE < 0.2).

Fig. 2. The regions A, B, and C in terms of $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ as defined in subsection 5.1 when solving the NARMA-10 task ($N$ = 50, $\tau _ \textrm {th}$ = 50 ns, $\tau _ \textrm {FC}$ = 10 ns, $\alpha$ = 0.8 dB/cm).

In Fig. 3, we show a slightly different perspective of the previous result by simulating a frequency sweep for a set of $\overline {P}_\textrm { {in}}$ values that highlight the transitions between the different regions. To understand why the NMSE in region A appears to approach a constant value as the positive or negative detuning increases, we first determine the performance of the RC in the absence of nonlinear dynamics in the MRR. In this scenario, we can expect the photodiode response at the detection stage of Fig. 1(a) to become the only component acting as the physical nonlinear node of the RC, with the delay waveguide still present in the system. Therefore, we simulate our system using a photodiode as the nonlinear node and without MRR nonlinearities, which we refer to as photodiode-based RC in this work. For such a setup, we obtain approximately the same NMSE as the value to which region A converges in Fig. 3 (NMSE $\approx$ 0.3475). This indicates that the MRR approaches a linear regime the further we detune $\omega _p$ from $\omega _0$. The slope of this trend increases at low power levels and decreases for 20 dBm, as the high power sustains the FCD nonlinearities over a longer $\Delta \omega$ range. Later in this work, we also study the characteristic response from each of the regions.

Fig. 3. NMSE as a function of $\Delta \omega$ for different levels of $\overline {P}_\textrm {in}$ ($\tau _ \textrm {FC}$ = 10 ns, $\tau _ \textrm {th}$ = 50 ns, $\alpha$ = 0.8 dB/cm). We also include the NMSE result obtained when the system uses a photodiode as single nonlinear element of the TDRC.

5.2 Impact of the thermo-optic decay time

The value of $\tau _ \textrm {th}$ is related to the thermal conduction and geometry properties of the microring silicon waveguide and the cladding. It is possible to tailor $\tau _ \textrm {th}$ by altering the thickness of the cladding or its material. It is also possible to control $\tau _ \textrm {th}$ by etching trenches around the microring [33,38]. In Fig. 4, we vary $\tau _ \textrm {th}$ while fixing the values of $\tau _ \textrm {FC}$ and $\alpha$. The NMSE of the testing set prediction is obtained for the aforementioned ranges of average input power and frequency detuning. The minimum NMSE obtained for each value of $\tau _ \textrm {th}$ is shown in Table 2.

Fig. 4. NMSE of the simulated RC for an increasing $\tau _ \textrm {th}$ and $\tau _ \textrm {FC}$ = 10 ns, $\alpha$ = 0.8 dB/cm.

Table 2. Minimum NMSE of testing set prediction for selected values of $\tau _ \textrm {th}$ ($\tau _ \textrm {FC}$ = 10 ns, $\alpha$ = 0.8 dB/cm).

When increasing $\tau _ \textrm {th}$, the TO effect inside the cavity becomes dominant over the FCD effect. Hence, as mentioned in section 2, the TO effect causes a red shift of the resonance frequency of the MRR, which is more evident at moderately higher powers (above 0 dBm) [34]. This red shift appears to be reflected in the behaviour of region C with respect to $\omega _p$: region C is shifted to negative detunings (redshift of $\omega _p$), also for values of $\overline {P}_\textrm { {in}}$ above 0 dBm. On the other hand, as $\tau _ \textrm {th}$ increases, region B gets narrower, and the minimum NMSE obtained is also higher. In general, increasing $\tau _ \textrm {th}$ proves detrimental to the performance of the RC, resulting in a limited tolerance for practical demonstration. The results of Table 2 seem to indicate that the longer $\tau _ \textrm {th}$, the higher the power required to achieve the minimum NMSE of the results of Fig. 4(a-d). Nevertheless, as region C shifts towards negative detunings, the location of the minimum NMSE shifts to positive detunings, where regions A and B are now located.

5.3 Impact of the free-carrier relaxation time

The carrier’s lifetime within the diffusion effective area of a microring cavity is determined by recombination processes such as the Shockley-Read-Hall (SRH), radiative and Auger recombinations. $\tau _ \textrm {FC}$ depends on the density of holes in the valence band and electrons in the conduction band [47]. In practice, the value of $\tau _ \textrm {FC}$ is usually approximated taking into account just the SRH recombination rate if the deviation of carrier density from thermal equilibrium is much larger than the density of traps [44]. $\tau _ \textrm {FC}$ can also be adjusted by improving the quality of the silicon-silica interfaces [38]. Using the same methodology as in the previous subsection, we determine the RC performance when varying $\tau _ \textrm {FC}$ while fixing the values of $\tau _ \textrm {th}$ and $\alpha$ (Fig. 5).

Fig. 5. NMSE of the simulated RC for different values of $\tau _ \textrm {FC}$ with $\tau _ \textrm {th}$ = 50 ns, $\alpha$ = 0.8 dB/cm.

We vary $\tau _ \textrm {FC}$ from the order of picoseconds to the order of tens of nanoseconds, so that we can analyze the impact of both reducing and increasing $\tau _ \textrm {FC}$ and gain a deeper understanding of how the ratio $\tau _ \textrm {FC}$/$\tau _ \textrm {th}$ affects the RC. This ratio has a significant influence on the behaviour of the nonlinearities, as discussed in further detail later in this work. The minimum NMSE values obtained under the previous conditions are shown in Table 3. For a low value of $\tau _ \textrm {FC}$ (Fig. 5(a and b)), the TO effect is dominant and, similarly to the results obtained in Fig. 4, region C is shifted towards negative $\Delta \omega$. However, its extension over the $\Delta \omega /2\pi$ axis seems to be reduced with respect to Fig. 4(a) when $\tau _ \textrm {FC}$ is reduced. As $\tau _ \textrm {FC}$ increases to the order of tens of nanoseconds, the FCD effect becomes more dominant and region C shifts towards positive $\Delta \omega$, which means a blueshift of $\omega _p$. This behaviour of region C is similar to the results of the previous subsection but in the opposite direction of the spectrum. This is to be expected since TO and FCD effects cause a detuning of $\omega _0$ in opposite directions. Unlike the previous case though, when region C shifts towards either negative or positive $\Delta \omega$, it gets narrower, which translates into a larger region B. This is of crucial importance as it opens a significantly larger window of $\overline {P}_\textrm { {in}}$ vs $\Delta \omega$ where RC operates with a very low NMSE and low power (particularly in Fig. 5(e and f)). This finding is also summarized by the results listed in Table 3.

Table 3. Minimum NMSE of testing set prediction for selected values of $\tau _ \textrm {FC}$ ($\tau _ \textrm {th}$ = 50 ns, $\alpha$ = 0.8 dB/cm).

5.4 Impact of the waveguide linear attenuation

The attenuation of a silicon MRR is directly related to the fabrication quality of the MRR waveguide, and it has been extensively studied as high-quality factor ($Q$) silicon microring cavities have been pursued during the last decades [48]. In the context of this work, it is also important to consider that modifying $\tau _ \textrm {th}$ or $\tau _ \textrm {FC}$ could have a collateral impact on $\alpha$. It is therefore necessary to assess the influence of $\alpha$ on the RC performance in order to fully grasp the possible performance penalties of altering the dimensions or physical properties of the MRR when tuning other parameters. Under the same frequency and average power conditions as before, we increase the value of $\alpha$ while fixing those of $\tau _ \textrm {th}$ and $\tau _ \textrm {FC}$ (Fig. 6). The minimum NMSE obtained for each increasing value of $\alpha$ is shown in Table 4. Contrary to the previous results, altering the waveguide attenuation does not have a great impact on the position of region C when $\overline {P}_\textrm { {in}}$ increases, although it does have an effect on its extension. A higher attenuation appears to increase the size of region C and, in turn, makes region B narrower. Therefore, a relatively high-$Q$ MRR (low attenuation) is desirable in terms of performance of the RC.

Fig. 6. NMSE of the simulated RC as a function of $\alpha$, with $\tau _ \textrm {th}$ = 50 ns and $\tau _ \textrm {FC}$ = 10 ns.

Table 4. Minimum NMSE of testing set prediction for selected values of $\alpha$ ($\tau _ \textrm {FC}$ = 10 ns, $\tau _ \textrm {th}$ = 50 ns).
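
As a quick consistency check on the link between attenuation and $Q$, the sketch below multiplies each $\alpha$ by the $Q$ value listed next to it in Table 4. The numbers are transcribed from that table; the helper name `q_alpha_products` is hypothetical and this check is not part of the original analysis. A nearly constant product would indicate that the tabulated $Q$ scales approximately as $1/\alpha$.

```python
# Values transcribed from Table 4: attenuation in dB/cm and the listed Q.
alpha_db_per_cm = [0.2, 0.5, 0.8, 1.0, 1.5, 2.0]
q_factor = [3.5e5, 1.4e5, 8.8e4, 7.0e4, 4.7e4, 3.5e4]


def q_alpha_products(alphas, qs):
    """Return alpha * Q per row; a constant product means Q scales as 1/alpha."""
    return [a * q for a, q in zip(alphas, qs)]


products = q_alpha_products(alpha_db_per_cm, q_factor)
mean_product = sum(products) / len(products)
# Largest relative deviation of any row from the mean product.
max_rel_dev = max(abs(p - mean_product) / mean_product for p in products)

print("alpha*Q per row:", [round(p) for p in products])
print("max relative deviation:", max_rel_dev)
```

Within the rounding of the tabulated values, the product stays close to one constant, which supports reading "low attenuation" and "high $Q$" interchangeably in this subsection.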

5.5 Decreasing the number of virtual nodes

In order to make a fair comparison with the study on the MRR-based RC with external feedback done in [18], we decrease the number of virtual nodes to 25 and simulate the RC for a sample set of configurations from the previous subsections (Fig. 7). This set corresponds to the minimum and maximum values of $\tau _ \textrm {th}$, $\tau _ \textrm {FC}$ and $\alpha$ simulated in the previous subsections. There is a slight increase in the obtained minimum NMSE as is expected from a neural network with fewer virtual nodes (Table 5). Likewise, when reducing $N$ to 10 virtual nodes, further slight degradation of the performance is observed. For this case (Fig. 8), the reduction of the size over the parameter space of region B is more noticeable than in the case of $N = 25$. The minimum NMSE for $N = 10$ is shown in Table 6.

Fig. 7. NMSE of the simulated RC with $N =25$ for: a) $\tau _ \textrm {th}$ = 50 ns, b) $\tau _ \textrm {th}$ = 400 ns, c) $\tau _ \textrm {FC}$ = 0.01 ns, d) $\tau _ \textrm {FC}$ = 50 ns, e) $\alpha$ = 0.2 dB/cm, and f) $\alpha$ = 2.0 dB/cm.

Fig. 8. NMSE of the simulated RC with $N =10$ for: a) $\tau _ \textrm {th}$ = 50 ns, b) $\tau _ \textrm {th}$ = 400 ns, c) $\tau _ \textrm {FC}$ = 0.01 ns, d) $\tau _ \textrm {FC}$ = 50 ns, e) $\alpha$ = 0.2 dB/cm, and f) $\alpha$ = 2.0 dB/cm.

Table 5. Minimum reached NMSE and corresponding parameters for the testing set prediction with $N$ = 25.

Table 6. Minimum reached NMSE and corresponding parameters for the testing set prediction with $N$ = 10.

The results indicate that even when the minimum error obtained increases, there is still no dependence of the defined $\overline {P}_\textrm { {in}}$ vs $\Delta \omega$ regions (A, B and C) on $N$, as the obtained trends are very similar between the different values of $N$. It is possible to argue that the gain in terms of minimum NMSE is relatively small when increasing $N$ up to 50. This indicates that the dimensionality of the RC is mainly given by the MRR nonlinear dynamics, and, within the considered values, $N$ does not impact the behaviour of the defined regions as much as $\tau _ \textrm {th}$, $\tau _ \textrm {FC}$ and $\alpha$.
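
To make the role of $N$ concrete, the following is a minimal sketch of how a time-multiplexed response can be arranged into virtual-node states and passed to a ridge-regression readout. It assumes one sample per virtual node and a closed-form ridge solution; the function names, the regularization value and the synthetic trace are illustrative assumptions, not the simulation code used in this work.

```python
import numpy as np


def build_state_matrix(trace, n_nodes):
    """Reshape a 1-D sampled response into (n_symbols, n_nodes) node states."""
    trace = np.asarray(trace, dtype=float)
    n_symbols = trace.size // n_nodes
    return trace[: n_symbols * n_nodes].reshape(n_symbols, n_nodes)


def add_bias(states):
    """Append a constant column so the readout can learn an offset."""
    return np.hstack([states, np.ones((states.shape[0], 1))])


def train_ridge_readout(states, targets, ridge=1e-6):
    """Closed-form ridge regression: w = (X^T X + ridge * I)^-1 X^T y."""
    x = add_bias(states)
    gram = x.T @ x + ridge * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ targets)


def predict(states, weights):
    """Apply trained readout weights to a state matrix."""
    return add_bias(states) @ weights


# Synthetic stand-in for a photodetected trace (NOT data from this study).
rng = np.random.default_rng(0)
n_nodes, n_symbols = 25, 400
trace = rng.random(n_nodes * n_symbols)
states = build_state_matrix(trace, n_nodes)
targets = states @ rng.normal(size=n_nodes) + 0.3
weights = train_ridge_readout(states, targets)
```

Changing `n_nodes` only changes the width of the state matrix and the number of trained weights; the reservoir dynamics that generate the trace are untouched, which is in line with the observation that the regions A, B and C do not depend on $N$.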

5.6 Characteristic RC response of the $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ regions of NMSE

In Fig. 9(a-f), we display an instance of the waveform of the MRR response at the drop port for each of the defined regions shown in Fig. 2. To switch between the regions, we fix the value of $\overline {P}_\textrm { {in}}$ to 0 dBm, while varying $\Delta \omega /2\pi$. Fig. 9(a-c) correspond to the response to the same 10 bits (10 ns) of the testing sequence set previously shown in Fig. 1(a) and 1(b). By zooming out of this sequence to capture 1000 bits (1 $\mu$s), we obtain Fig. 9(d-f). The response of the RC in Fig. 9(a,d) (region A) shows a delayed linear transformation of the input in which the signal does not differ much from the input sequence. Due to the lack of nonlinearity, the prediction fails to resemble the original NARMA-10 sequence, resulting in a relatively high NMSE = 0.3357, as the system approaches the response of a photodiode-based RC as explained in subsection 5.1 for region A. Next, the waveform examples of region B (Fig. 9(b,e)) show a nonlinear transformation of the input sequence under stable conditions (avoiding SP), which gives enough dimensionality to the RC to achieve a low error (NMSE = 0.0269). Lastly, region C, which can be observed in Fig. 9(c,f), shows SP oscillatory behaviour with various discontinuities or 'spikes' and fast transitions to values near zero amplitude (more visible in Fig. 9(f)). This behaviour of the SP affects the consistency of the RC response, as in these time intervals the RC is incapable of learning a consistent response to similar inputs in order to be tested with unknown data sets.

Fig. 9. a-c) Sampled waveforms at drop port corresponding to 10 bits of each of the three defined $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ parameter space regions of the simulated RC when solving the NARMA-10 task with $\overline {P}_\textrm { {in}}$ = 0 dBm, $N$ = 50, $\tau _ \textrm {th}$ = 50 ns, $\tau _ \textrm {FC}$ = 10 ns, $\alpha$ = 0.8 dB/cm. d-f) 1.0 $\mu$s extended sampled waveforms under the same conditions. g-i) Samples of the target (blue) and predicted (orange) testing sets of a NARMA-10 sequence.
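
For readers who want to reproduce the kind of target and error metric quoted above, the sketch below generates a NARMA-10 sequence in the form commonly used in the RC literature and computes an NMSE normalized by the target variance. Both definitions are assumptions here: the exact expressions used in this work are given in its benchmark section and may differ in detail, and the function names are illustrative.

```python
import numpy as np


def narma10(u):
    """Commonly used NARMA-10 recursion driven by an input sequence u."""
    u = np.asarray(u, dtype=float)
    y = np.zeros_like(u)
    for n in range(9, u.size - 1):
        y[n + 1] = (
            0.3 * y[n]
            + 0.05 * y[n] * np.sum(y[n - 9 : n + 1])  # sum over the last 10 outputs
            + 1.5 * u[n - 9] * u[n]
            + 0.1
        )
    return y


def nmse(target, prediction):
    """Mean squared error normalized by the variance of the target."""
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    return float(np.mean((target - prediction) ** 2) / np.var(target))


# Illustrative input: uniform random values in [0, 0.5], a common choice.
rng = np.random.default_rng(1)
u = rng.uniform(0.0, 0.5, size=1000)
y = narma10(u)

# A predictor that always outputs the target mean has NMSE = 1 by construction,
# which gives a reference point for values such as 0.3357 or 0.0269.
baseline_nmse = nmse(y, np.full_like(y, y.mean()))
```

With this normalization, an NMSE of 0 means a perfect prediction and an NMSE of 1 corresponds to always predicting the mean of the target.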

5.7 RC linear vs nonlinear regimes

The previous subsections qualitatively indicate the amount of nonlinear transformation the input goes through in each region. A way to quantitatively evaluate this is by determining the coefficient of determination, $R^2$, between $E_\textrm {drop}$ and $E_\textrm {in}$ so that we can quantify, on a range $[0.0 - 1.0]$, how much of the response of the RC (before photodetection) can be accounted for as a linear transformation of the input. In other words, an $R^2$ of 1.0 would indicate that the RC approximates a linear regime. We show the results for 6 instances of the configurations used in Fig. 2–4 (2 per subsection). As demonstrated in Fig. 10, each of the 3 defined regions of the previous subsections matches specific levels of $R^2$. First, it confirms that the cavity gradually gets close to a linear regime, as $R^2$ tends to 1.0 when we increase the detuning from $\omega _0$ (region A). Then, the best performance is obtained in a mixed state between a linear and nonlinear regime where $R^2$ oscillates between $\sim$0.2 and $\sim$0.9. The location of this mixed state matches that of region B. Lastly, an $R^2$ of 0 is obtained in the same area that corresponds to SP (region C), which indicates that there is no direct relation between the response of the RC and the input. Similar findings regarding the importance of the transition between linear and nonlinear regimes to enhance TDRC performance were previously found in [49] for a Mach-Zehnder modulator used as a nonlinear node of photonic TDRC. However, in their implementation, the TDRC performs a different time-series prediction task.
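
One possible way to compute such an $R^2$ is sketched below. It assumes that $E_\textrm {drop}$ and $E_\textrm {in}$ are available as equally sampled complex arrays and that the linear model is a complex least-squares fit $E_\textrm {drop} \approx a E_\textrm {in} + b$ without any delay compensation; the exact procedure used to obtain Fig. 10 is not specified here, so the function `linear_r2` and the synthetic signals are illustrative only.

```python
import numpy as np


def linear_r2(e_in, e_drop):
    """R^2 of the best complex linear fit e_drop ~ a * e_in + b."""
    e_in = np.asarray(e_in, dtype=complex)
    e_drop = np.asarray(e_drop, dtype=complex)
    # Design matrix with the input field and a constant offset column.
    design = np.column_stack([e_in, np.ones_like(e_in)])
    coeffs, *_ = np.linalg.lstsq(design, e_drop, rcond=None)
    residual = e_drop - design @ coeffs
    ss_res = np.sum(np.abs(residual) ** 2)
    ss_tot = np.sum(np.abs(e_drop - e_drop.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic signals standing in for the simulated fields (NOT data from this study).
rng = np.random.default_rng(0)
e_in = rng.normal(size=2000) + 1j * rng.normal(size=2000)

# A purely linear response should give R^2 close to 1 (region A-like behaviour).
r2_linear = linear_r2(e_in, (0.5 - 0.2j) * e_in + 0.1)

# A response unrelated to the input should give R^2 close to 0 (region C-like).
unrelated = rng.normal(size=2000) + 1j * rng.normal(size=2000)
r2_unrelated = linear_r2(e_in, unrelated)
```

Because the fit includes an offset term, the returned value stays within the $[0, 1]$ range up to numerical precision, which matches the range used in this subsection.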

Fig. 10. $R^2$ between $E_ \textrm {drop}$ and $E_ \textrm {in}$ for a) $\tau _ \textrm {th}$ = 50 ns, b) $\tau _ \textrm {th}$ = 400 ns, c) $\tau _ \textrm {FC}$ = 0.01 ns, d) $\tau _ \textrm {FC}$ = 50 ns, e) $\alpha$ = 0.2 dB/cm, and f) $\alpha$ = 2.0 dB/cm.

5.8 Dependence of the RC dynamics on $\Delta T$ and $\Delta N$

Finally, we also assess the relation between the defined regions (A, B and C) of the $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ parameter space and $\Delta T$ and $\Delta N$. To achieve this, we plot the averages of $\Delta T$ and $\Delta N$ obtained from the Runge-Kutta solution of Eq. (1)–(3) over the whole testing set. By comparing their $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ heatmaps with the ones related to the NMSE and $R^2$, as shown in Fig. 11, we can see a clear relation between the increase in free-carrier concentration and temperature, and the rise of SP with its previously stated effects on the RC response. A very low average $\Delta N$ or $\Delta T$ is also detrimental to the performance of the RC, as it is highly correlated with the MRR approximating a linear regime. To achieve optimum performance, relatively small increases in temperature ($\sim$2.5 to 15 K) and a $\Delta N$ between $\sim$10$^{18}$ and $\sim$10$^{22}$ appear to be the key requirements. However, the susceptibility of the MRR cavity to enable SP for a given $\Delta \omega$ and $\overline {P}_\textrm { {in}}$ also depends strongly on the ratio between the relaxation lifetimes of the nonlinear effects, as discussed in section 6.

Fig. 11. Comparison of the $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ heatmaps for a) NMSE of the testing set, b) $R^2$ between $E_ \textrm {drop}$ and $E_ \textrm {in}$, c) average $\Delta T$ over the testing set, and d) average $\Delta N$ over the testing set, with $\tau _ \textrm {th}$ = 50 ns, $\tau _ \textrm {FC}$ = 10 ns, $\alpha$ = 0.8 dB/cm.
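
The approximate windows quoted above can be turned into a simple screening mask over a grid of operating points, as sketched below. The window limits are the approximate values stated in the text; the function name, the inclusive bounds and the example grid values are assumptions made for illustration, and $\Delta N$ is taken in the same units as in the text.

```python
import numpy as np

# Approximate windows quoted in the text (treated here as inclusive bounds).
DT_WINDOW_K = (2.5, 15.0)
DN_WINDOW = (1e18, 1e22)  # same units as the carrier density quoted in the text


def candidate_low_error_mask(mean_dt, mean_dn):
    """Flag grid points whose average dT and dN both fall inside the windows."""
    mean_dt = np.asarray(mean_dt, dtype=float)
    mean_dn = np.asarray(mean_dn, dtype=float)
    in_dt = (mean_dt >= DT_WINDOW_K[0]) & (mean_dt <= DT_WINDOW_K[1])
    in_dn = (mean_dn >= DN_WINDOW[0]) & (mean_dn <= DN_WINDOW[1])
    return in_dt & in_dn


# Hypothetical averages for three operating points (NOT values from Fig. 11):
# nearly linear, moderately nonlinear, and strongly heated / carrier-rich.
example_dt = [1.0, 5.0, 40.0]
example_dn = [1e17, 1e20, 1e23]
example_mask = candidate_low_error_mask(example_dt, example_dn)
```

Such a mask is only a coarse indicator: as noted above, whether SP actually appears at a given point also depends on the ratio between the relaxation lifetimes.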

6. Discussion

In our study, the relaxation times of the free-carrier and TO nonlinearities ($\tau _ \textrm {FC}$, $\tau _ \textrm {th}$) and the waveguide attenuation $\alpha$ have been varied one at a time while fixing the other parameters, in order to understand the individual contribution of each of them to the overall performance of the RC. However, in practice, each of them is highly correlated with the others, and the potential changes that can be made to the MRR to optimize one of them will most probably affect the others. In the case of $\tau _ \textrm {FC}$ and $\tau _ \textrm {th}$ it is also essential to consider the ratio between them, $\tau _ \textrm {FC}$/$\tau _ \textrm {th}$. As shown in [37,38], this ratio has a severe influence on the existence of SP, depending in turn on $\Delta \omega$ and $\overline {P}_\textrm { {in}}$. In [37] the authors conclude that in order to enhance SP, a short $\tau _ \textrm {FC}$ is required with respect to $\tau _ \textrm {th}$, but not too short (it should be longer than the photon lifetime, i.e., the relaxation time of the resonance $\tau _ \textrm {r} = \frac {n_\textrm { {Si}}}{c\alpha }$). This condition is matched in the simulated parameters used to obtain the results of Fig. 4. By increasing $\tau _ \textrm {th}$, the ratio $\tau _ \textrm {FC}$/$\tau _ \textrm {th}$ becomes smaller and since $\tau _ \textrm {FC}$ = 10 ns is higher than $\tau _ \textrm {r}$ ($\sim$0.14 ns for $\alpha$ = 0.8 dB/cm), conditions are favourable to reinforce SP and extend its region. Due to the increase of $\tau _ \textrm {th}$, the TO effect is dominant and the SP region is shifted towards negative detuning (red-shift of the resonance) [38].

Hence, for a given $\alpha$, one way to diminish SP behaviour is to reduce $\tau _ \textrm {FC}$ as much as possible, so that the requirement $\tau _ \textrm {FC}$<$\tau _ \textrm {r}$ is fulfilled. A few attempts to reduce $\tau _ \textrm {FC}$ to sub-nanosecond values are found in the literature. A $\tau _ \textrm {FC}$ of 55 ps is achieved in [26] using ion implantation in a silicon MRR, although the waveguides are penalized with an additional 22 dB/cm of propagation loss. Moreover, a $\tau _ \textrm {FC}$ of 15 ps is reported in [27] using an oxygen-implanted silicon-on-insulator (SOI) MRR for all-optical switching purposes. Lastly, in [50], a p-i-n junction is integrated into an SOI waveguide to reduce $\tau _ \textrm {FC}$ to 12.2 ps with an attenuation loss of 2 dB/cm.

Another way to keep a certain level of nonlinear dynamics without reinforcing SP would be to increase the ratio $\tau _ \textrm {FC}$/$\tau _ \textrm {th}$ [37]. This can be done by decreasing $\tau _ \textrm {th}$ so that both relaxation times are around the same order of magnitude (Fig. 4(a)), but this is more challenging as $\tau _ \textrm {th}$ is very dependent on the quality of the fabrication process and the geometry of the silicon waveguide. $\tau _ \textrm {th}$ is also relatively fixed by the required width of the cladding material. So, the other way to increase $\tau _ \textrm {FC}$/$\tau _ \textrm {th}$ would be to increase $\tau _ \textrm {FC}$, and with it, the strength of free-carrier nonlinearities. Our results are consistent with these previous methods, as the area of very high error (region C) that corresponds to SP is reduced either by making $\tau _ \textrm {FC}$ very small in comparison to $\tau _ \textrm {th}$ and smaller than $\tau _ \textrm {r}$ (order of picoseconds, Fig. 5(a)) or by making $\tau _ \textrm {FC}$/$\tau _ \textrm {th}$ $\geq$ 0.5 (Fig. 5(b)). Both scenarios provide enough nonlinearity to the RC and region B is extended to lower levels of power.
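
The qualitative rules of this discussion can be collected into a small rule-of-thumb helper, sketched below. It only encodes the statements made in the text: SP is diminished when $\tau _ \textrm {FC}$ is below $\tau _ \textrm {r}$, or when $\tau _ \textrm {FC}$/$\tau _ \textrm {th}$ is at least 0.5, and is favoured otherwise. The function name, the returned labels and the use of the quoted $\tau _ \textrm {r} \sim$ 0.14 ns as a default are assumptions; this is not a substitute for solving the cavity equations.

```python
def sp_tendency(tau_fc_ns, tau_th_ns, tau_r_ns=0.14, ratio_threshold=0.5):
    """Heuristic label for the self-pulsation (SP) tendency of a lifetime pair.

    tau_r_ns defaults to the ~0.14 ns quoted in the text for alpha = 0.8 dB/cm.
    ratio_threshold is the tau_FC / tau_th value of 0.5 mentioned in the text.
    """
    if tau_fc_ns < tau_r_ns:
        # Free carriers relax faster than the resonance itself.
        return "diminished (tau_FC < tau_r)"
    if tau_fc_ns / tau_th_ns >= ratio_threshold:
        # Both relaxation times are of comparable magnitude.
        return "diminished (large tau_FC/tau_th)"
    # tau_r < tau_FC and tau_FC much shorter than tau_th.
    return "favoured"


# Lifetime pairs taken from configurations simulated in this work (in ns).
examples = {
    "tau_FC = 0.01, tau_th = 50": sp_tendency(0.01, 50.0),
    "tau_FC = 50, tau_th = 50": sp_tendency(50.0, 50.0),
    "tau_FC = 10, tau_th = 400": sp_tendency(10.0, 400.0),
}
```

The three example pairs reproduce the trends described above: the picosecond and equal-lifetime cases fall in the diminished-SP categories, while the long-$\tau _ \textrm {th}$ case is the one in which SP is reinforced.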

Regarding the linear losses, resonator cavities with a high $Q$-factor are indispensable to achieve the necessary nonlinear dynamics for the RC to operate with low NMSE [37]. Higher linear losses contribute to the increase in heat due to power absorption and in turn, the TO effect becomes the dominant effect and the region of SP is widened. On the contrary, low linear waveguide losses (which normally translate into relatively high $Q$ MRRs) allow the dynamics of free-carrier nonlinearities to become more relevant within the cavity. As FCD becomes more dominant, the resonance is blue-shifted and the SP region shifts to positive detunings [37]. Our results in Fig. 6 match the previously described physical dynamics.

Following the aforementioned insights from studies about SP in MRR cavities therefore provides a way to extend the range of $\overline {P}_\textrm { {in}}$ and $\Delta \omega /2\pi$ in which region B occurs. These conditions are consistent with our results. The existence of this interval of low-error prediction (region B in our case) also fits well with the results of [39], where, as we mentioned before, a power interval is also found for FCD nonlinearities in which the RC is given enough dimensionality while keeping consistency using directly coupled cavities.

For performance comparison, we list previous works on photonic TDRC schemes solving the NARMA-10 task with the same number of virtual nodes ($N$ = 50) and similar speed. An optoelectronic scheme using a Mach-Zehnder modulator achieved an NMSE = 0.168 $\pm$ 0.015 [10]; we acknowledge, though, that the results in [10] were obtained both in simulations and experimentally. Numerically, an NMSE = 0.103 $\pm$ 0.018 was obtained in [14]. In order to perform a fair comparison of our results with [18], we simulate the system using the highest and lowest values of the nonlinear effects lifetimes and waveguide loss considered in this study, for $N$ = 25 nodes. The minimum NMSE numerically obtained in [18] with that number of virtual nodes is 0.204 $\pm$ 0.026. In our results, we obtain a minimum NMSE of 0.0151 $\pm$ 0.0021, which is a considerable improvement over [18]. We could also expand the region in which the RC achieves its best performance in comparison to our previous study [24].

7. Conclusion

Throughout this work, we have investigated the relationship between the performance of an MRR-based TDRC and the lifetimes of the free-carrier and TO effects, as well as the impact of the waveguide attenuation. We simulate different values of the parameters ($\tau _ \textrm {FC}$, $\tau _ \textrm {th}$ and $\alpha$), and define three regions of the $\overline {P}_\textrm { {in}}$ vs $\Delta \omega /2\pi$ parameter space, each of which shows a very different level of prediction error. We characterize these regions by first showing qualitatively the waveform response of each region, and then quantify the differences in nonlinearity between them using $R^2$. We show that when the response at the drop port can be represented by a linear transformation of the input sequence, the nonlinearities of the system are given mostly by the photodiode response at the detection stage. When the $R^2$ between the output and input of the RC is very low, the system runs the risk of producing a response too inconsistent to accurately calculate the weights during the training process, due to the discontinuities and near-zero response of the RC caused by SP nonlinearities. There is an interval of average input power and angular frequency detuning in which enough dimensionality is given to the RC without it becoming unstable due to SP. This work provides a further understanding of the physical conditions that are optimum to reduce SP while keeping a high dimensionality. In the areas that fulfil such conditions, our RC achieves low error at low power when solving a time-series prediction task. Our results show an NMSE lower than that of other works that propose a similar $N$ and speed. We also show that decreasing $N$ does not have a great impact on the physical dynamics of this RC setup.

Funding

Villum Fonden (VIL29334); Vetenskapsrådet (2022-04798).

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. P. Yadav, A. Mishra, and S. Kim, Neuromorphic Hardware Accelerators (Springer International Publishing, Cham, 2023), pp. 225–268.

2. G. Dabos, D. V. Bellas, R. Stabile, et al., “Neuromorphic photonic technologies and architectures: scaling opportunities and performance frontiers [invited],” Opt. Mater. Express 12(6), 2343–2367 (2022).

3. K. Demertzis, G. D. Papadopoulos, L. Iliadis, et al., “A comprehensive survey on nanophotonic neural networks: Architectures, training methods, optimization, and activations functions,” Sensors 22(3), 720 (2022).

4. L. Appeltant, M. C. Soriano, G. Van der Sande, et al., “Information processing using a single dynamical node as complex system,” Nat. Commun. 2(1), 468 (2011).

5. M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Comput. Sci. Rev. 3(3), 127–149 (2009).

6. M. Lukoševičius, H. Jaeger, and B. Schrauwen, “Reservoir computing trends,” KI - Künstliche Intelligenz 26(4), 365–371 (2012).

7. K. Vandoorne, W. Dierckx, B. Schrauwen, et al., “Toward optical signal processing using photonic reservoir computing,” Opt. Express 16(15), 11182–11192 (2008).

8. F. Duport, B. Schneider, A. Smerieri, et al., “All-optical reservoir computing,” Opt. Express 20(20), 22783–22795 (2012).

9. L. Larger, M. C. Soriano, D. Brunner, et al., “Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing,” Opt. Express 20(3), 3241–3249 (2012).

10. Y. Paquot, F. Duport, A. Smerieri, et al., “Optoelectronic reservoir computing,” Sci. Rep. 2(1), 287 (2012).

11. C. Mesaritakis, V. Papataxiarhis, and D. Syvridis, “Micro ring resonators as building blocks for an all-optical high-speed reservoir-computing bit-pattern-recognition system,” J. Opt. Soc. Am. B 30(11), 3048–3055 (2013).

12. K. Vandoorne, P. Mechet, T. Van Vaerenbergh, et al., “Experimental demonstration of reservoir computing on a silicon photonics chip,” Nat. Commun. 5(1), 3541 (2014).

13. J. Bueno, D. Brunner, M. C. Soriano, et al., “Conditions for reservoir computing performance using semiconductor lasers with delayed optical feedback,” Opt. Express 25(3), 2401–2412 (2017).

14. Y. Chen, L. Yi, J. Ke, et al., “Reservoir computing system with double optoelectronic feedback loops,” Opt. Express 27(20), 27431–27440 (2019).

15. J. Vatin, D. Rontani, and M. Sciamanna, “Experimental reservoir computing using VCSEL polarization dynamics,” Opt. Express 27(13), 18579–18584 (2019).

16. M. Borghi, S. Biasi, and L. Pavesi, “Reservoir computing based on a silicon microring and time multiplexing for binary and analog operations,” Sci. Rep. 11(1), 15642 (2021).

17. M. Nakajima, K. Tanaka, and T. Hashimoto, “Scalable reservoir computing on coherent linear photonic processor,” Commun. Phys. 4(1), 20 (2021).

18. G. Donati, C. R. Mirasso, M. Mancinelli, et al., “Microring resonators with external optical feedback for time delay reservoir computing,” Opt. Express 30(1), 522–537 (2022).

19. M. Abdalla, C. Zrounba, R. Cardoso, et al., “Minimum complexity integrated photonic architecture for delay-based reservoir computing,” Opt. Express 31(7), 11610–11623 (2023).

20. T. Wang, C. Jiang, Q. Fang, et al., “Reservoir computing and task performing through using High-β lasers with delayed optical feedback,” Prog. Electromagn. Res. 178, 1–12 (2023).

21. C. Mesaritakis and D. Syvridis, “Reservoir computing based on transverse modes in a single optical waveguide,” Opt. Lett. 44(5), 1218–1221 (2019).

22. J. Dong, M. Rafayelyan, F. Krzakala, et al., “Optical reservoir computing using multiple light scattering for chaotic systems prediction,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–12 (2020).

23. T. Bu, H. Zhang, S. Kumar, et al., “Efficient optical reservoir computing for parallel data processing,” Opt. Lett. 47(15), 3784–3787 (2022).

24. B. J. Giron Castro, C. Peucheret, D. Zibar, et al., “Impact of free-carrier nonlinearities on silicon microring-based reservoir computing,” arXiv, arXiv:2307.07011 (2023).

25. P. Sethi and S. Roy, “All-optical ultrafast switching in 2 × 2 silicon microring resonators and its application to reconfigurable demux/mux and reversible logic gates,” J. Lightwave Technol. 32(12), 2173–2180 (2014).

26. M. Först, J. Niehusmann, T. Plötzing, et al., “High-speed all-optical switching in ion-implanted silicon-on-insulator microring resonators,” Opt. Lett. 32(14), 2046–2048 (2007).

27. M. Waldow, T. Plötzing, M. Gottheil, et al., “25ps all-optical switching in oxygen implanted silicon-on-insulator microring resonator,” Opt. Express 16(11), 7693–7702 (2008).

28. M. Xiong, L. Lei, Y. Ding, et al., “All-optical 10 Gb/s AND logic gate in a silicon microring resonator,” Opt. Express 21(22), 25772–25779 (2013).

29. A. N. Tait, T. F. de Lima, E. Zhou, et al., “Neuromorphic photonic networks using silicon photonic weight banks,” Sci. Rep. 7(1), 7430 (2017).

30. J. Xiang, Y. Zhang, Y. Zhao, et al., “All-optical silicon microring spiking neuron,” Photonics Res. 10(4), 939–946 (2022).

31. H. Zhou, J. Dong, J. Cheng, et al., “Photonic matrix multiplication lights up photonic accelerator and beyond,” Light: Sci. Appl. 11(1), 30 (2022).

32. G. Donati, A. Argyris, C. R. Mirasso, et al., “Noise effects on time delay reservoir computing using silicon microring resonators,” Proc. SPIE 12004, 120040U (2022).

33. T. J. Johnson, M. Borselli, and O. Painter, “Self-induced optical modulation of the transmission through a high-Q silicon microdisk resonator,” Opt. Express 14(2), 817–831 (2006).

34. M. Soltani, “Novel integrated silicon nanophotonic structures using ultra-high Q resonators,” PhD thesis, Georgia Institute of Technology (2009).

35. T. Van Vaerenbergh, M. Fiers, P. Mechet, et al., “Cascadable excitability in microrings,” Opt. Express 20(18), 20292–20308 (2012).

36. M. Mancinelli, “Linear and non linear coupling effects in sequence of microresonators,” PhD thesis, University of Trento (2013).

37. T. Van Vaerenbergh, M. Fiers, J. Dambre, et al., “Simplified description of self-pulsation and excitability by thermal and free-carrier effects in semiconductor microcavities,” Phys. Rev. A 86(6), 063808 (2012).

38. L. Zhang, Y. Fei, T. Cao, et al., “Multibistability and self-pulsation in nonlinear high-Q silicon microring resonators considering thermo-optical effect,” Phys. Rev. A 87(5), 053805 (2013).

39. I. K. Boikov, D. Brunner, and A. D. Rossi, “Evanescent coupling of nonlinear integrated cavities for all-optical reservoir computing,” New J. Phys. 25(9), 093056 (2023).

40. A. Skalli, X. Porte, N. Haghighi, et al., “Computational metrics and parameters of an injection-locked large area semiconductor laser for neural network computing [invited],” Opt. Mater. Express 12(7), 2793–2804 (2022).

41. P. Kärkkäinen and R. Linna, “Dimensional criterion for forecasting nonlinear systems by reservoir computing,” arXiv, arXiv:2202.05159v3 (2022).

42. T. L. Carroll, “Dimension of reservoir computers,” Chaos: An Interdiscip. J. Nonlinear Sci. 30(1), 013102 (2020).

43. M. Inubushi and K. Yoshimura, “Reservoir computing beyond memory-nonlinearity trade-off,” Sci. Rep. 7(1), 10199 (2017).

44. M. Borghi, D. Bazzanella, M. Mancinelli, et al., “On the modeling of thermal and free carrier nonlinearities in silicon-on-insulator microring resonators,” Opt. Express 29(3), 4363–4377 (2021).

45. A. Uchida, R. McAllister, and R. Roy, “Consistency of nonlinear system response to complex drive signals,” Phys. Rev. Lett. 93(24), 244102 (2004).

46. A. Atiya and A. Parlos, “New results on recurrent network training: unifying the algorithms and accelerating convergence,” IEEE Trans. Neural Netw. 11(3), 697–709 (2000).

47. D. Schroder, “Carrier lifetimes in silicon,” IEEE Trans. Electron Devices 44(1), 160–170 (1997).

48. D. Zeng, Q. Liu, C. Mei, et al., “Demonstration of Ultra-High-Q silicon microring resonators for nonlinear integrated photonics,” Micromachines 13(7), 1155 (2022).

49. J. Pauwels, G. Verschaffelt, S. Massar, et al., “Distributed Kerr non-linearity in a coherent all-optical fiber-ring reservoir computer,” Front. Phys. 7, 138 (2019).

50. A. C. Turner-Foster, M. A. Foster, J. S. Levy, et al., “Ultrashort free-carrier lifetime in low-loss silicon nanowaveguides,” Opt. Express 18(4), 3582–3591 (2010).

Table 1 values:
Parameter	Value	Parameter	Value
$m$	$1.2 \times 10^{-11}$ kg	$\beta_\textrm{TPA}$	$8.4 \times 10^{-11}$ m$\cdot$W$^{-1}$ [35]
$\tau_\textrm{c}$	54.7 ps	$\Gamma_\textrm{FCA}$	0.9996 [35]
$n_\textrm{Si}$	3.485 [33]	$\Gamma_\textrm{th}$	0.9355 [35]
$\lambda_0$	1553.49 nm	$\textrm{d}n_\textrm{Si}/\textrm{d}T$	$1.86 \times 10^{-4}$ K$^{-1}$ [33]
$L$	$2\pi \cdot 7.5$ $\mu$m	$\textrm{d}n_\textrm{Si}/\textrm{d}N$	$-1.73 \times 10^{-27}$ m$^{-3}$ [35]
$c_p$	0.7 J$\cdot$(g$\cdot$K)$^{-1}$ [33]	$\sigma_\textrm{FCA}$	$1.0 \times 10^{-21}$ m$^{2}$ [35]
$V_\textrm{FCA}$	2.36 $\mu$m$^{3}$ [35]	$V_\textrm{TPA}$	2.59 $\mu$m$^{3}$ [35]

Table 2 values:
NMSE	$\tau_\textrm{th}$ [ns]	$\Delta\omega/2\pi$ [GHz]	$\overline{P}_\textrm{in}$ [dBm]
0.0178 $\pm$ 0.0018	50	-30	-5.0
0.0283 $\pm$ 0.0026	100	-65	-2.5
0.0412 $\pm$ 0.0030	150	-95	7.5
0.0611 $\pm$ 0.0033	200	-145	15
0.0736 $\pm$ 0.0044	300	75	-7.5
0.0748 $\pm$ 0.0044	400	80	-7.5

Table 3 values:
NMSE	$\tau_\textrm{FC}$ [ns]	$\Delta\omega/2\pi$ [GHz]	$\overline{P}_\textrm{in}$ [dBm]
0.0184 $\pm$ 0.0008	0.01	-25	10.0
0.0192 $\pm$ 0.0023	0.1	30	0.0
0.0228 $\pm$ 0.0019	1.0	45	-17.5
0.0178 $\pm$ 0.0018	10	-30	-5.0
0.0174 $\pm$ 0.0024	25	-50	-5.0
0.0151 $\pm$ 0.0021	50	-45	-7.5

Table 4 values:
NMSE	$\alpha$ [dB/cm]	$Q$	$\Delta\omega/2\pi$ [GHz]	$\overline{P}_\textrm{in}$ [dBm]
0.0164 $\pm$ 0.0020	0.2	$3.5 \times 10^{5}$	-10	-7.5
0.0169 $\pm$ 0.0020	0.5	$1.4 \times 10^{5}$	-20	-5.0
0.0178 $\pm$ 0.0018	0.8	$8.8 \times 10^{4}$	-30	-5.0
0.0182 $\pm$ 0.0022	1.0	$7.0 \times 10^{4}$	-45	-5.0
0.0190 $\pm$ 0.0024	1.5	$4.7 \times 10^{4}$	-55	-2.5
0.0215 $\pm$ 0.0030	2.0	$3.5 \times 10^{4}$	60	-7.5

Optics Express

Table 5 values ($N$ = 25):
NMSE	$\tau_\textrm{FC}$ [ns]	$\tau_\textrm{th}$ [ns]	$\alpha$ [dB/cm]	$\Delta\omega/2\pi$ [GHz]	$\overline{P}_\textrm{in}$ [dBm]
0.0197 $\pm$ 0.0009	0.01	50	0.8	-10	7.5
0.0169 $\pm$ 0.0023	50	50	0.8	-45	-5.0
0.0185 $\pm$ 0.0021	10	50	0.8	-40	-5.0
0.0758 $\pm$ 0.0044	10	400	0.8	80	-7.5
0.0173 $\pm$ 0.0025	10	50	0.2	-35	-12.5
0.0250 $\pm$ 0.0033	10	50	2.0	60	-7.5

Table 6 values ($N$ = 10):
NMSE	$\tau_\textrm{FC}$ [ns]	$\tau_\textrm{th}$ [ns]	$\alpha$ [dB/cm]	$\Delta\omega/2\pi$ [GHz]	$\overline{P}_\textrm{in}$ [dBm]
0.0249 $\pm$ 0.0021	0.01	50	0.8	35	2.5
0.0300 $\pm$ 0.0018	50	50	0.8	-45	-7.5
0.0224 $\pm$ 0.0019	10	50	0.8	50	-7.5
0.0812 $\pm$ 0.0038	10	400	0.8	80	-7.5
0.0272 $\pm$ 0.0024	10	50	0.2	35	-15.0
0.0385 $\pm$ 0.0033	10	50	2.0	70	2.5

Bernard J. Giron Castro	https://orcid.org/0009-0003-1265-1118
Christophe Peucheret	https://orcid.org/0000-0002-1655-9293
Francesco Da Ros	https://orcid.org/0000-0002-9068-9125