Impact of optical coherence on the performance of large-scale spatiotemporal photonic reservoir computing systems

Abstract

Large-scale spatiotemporal photonic reservoir computer (RC) systems offer remarkable solutions for the massively parallel processing of a wide variety of hard real-world tasks. In such systems, neural networks are created by either optical or electronic coupling. Here, we investigate the impact of optical coherence on the performance of large-scale spatiotemporal photonic RCs by comparing a coherent (optical coupling between the reservoir nodes) and an incoherent (digital coupling between the reservoir nodes) RC system. Although the coherent configuration offers a significant reduction in computational load compared to the incoherent architecture, we find that the incoherent configuration outperforms the coherent one on image and video classification benchmark tasks. Moreover, the incoherent configuration exhibits a larger memory capacity than the coherent scheme. Our results pave the way towards optimized implementations of large-scale RC systems.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

A wide variety of hard tasks such as speech recognition [1], nonlinear channel equalization and time-series prediction [2], detection of epileptic seizures [3], robot control [4], and automatic classification of images and videos [5–7] can be solved using brain-inspired systems that do not require explicit instructions, but rely on patterns and inferences [8,9]. Reservoir computing belongs to this class of information-processing systems [10]. Over the last decade, its photonic implementations have received considerable attention due to their excellent performance, energy efficiency, and high speed [11–14]. A large variety of photonic architectures have already been proposed, such as single laser diodes with optical or optoelectronic feedback [15–21], free-space optics [6,22–24], optical cavities [25,26], and photonic integrated technology [27–30].

A RC architecture typically consists of three parts: an input layer, a reservoir, and an output layer. The input and output layers are the stages where pre-processing and post-processing are performed, respectively, while the reservoir is usually a high-dimensional recurrent dynamical network typically composed of hundreds to thousands of interconnected nodes (also called an echo state network). The simplicity of the RC approach relies on the fact that the recurrent internal weights of the RC are fixed and can be chosen randomly, meaning that only the readout layer has to be trained [2,31]. This dramatically reduces the training time and also simplifies experimental implementations of such systems. Nevertheless, physical implementations of RC systems remained very challenging for over a decade due to the large number of nonlinear nodes to be coupled and controlled. To get around this drawback, time-delay configurations have been successfully investigated [16,17,19,25,32,33]. In this approach, the reservoir size increases linearly with the length of the feedback loop, but at the expense of processing speed: the data cannot be fed into the system faster than the time delay if all the virtual nodes are to be subjected to the data [13].

To perform complex tasks requiring a large number of nodes, a large-scale spatiotemporal photonic RC with a reservoir of up to $2500$ nodes was first demonstrated [23]. Antonik et al. then achieved state-of-the-art experimental performance on image and video classification using a spatiotemporal photonic RC system capable of implementing neural networks of up to $16384$ nodes [6,7]. In previous large-scale spatiotemporal photonic RC systems, complex and recurrently connected networks have been created using imaging with spatial structuring either via a diffractive optical element [23,24] or from a computer [6,7]. Recent studies have shown the potential of free-space RC architectures to achieve hundreds of thousands of interconnected nodes with diffractive coupling [34] and more than a million with a scattering medium [35].

Apart from the dimensionality, which plays a crucial role in the system performance, nonlinearity is another key attribute of RC systems. Various implementations with different nonlinearities have been demonstrated, all showing good performance [6,17,19,25,32,33]. However, it is still not clear whether a particular nonlinearity outperforms the others on specific tasks. For large-scale spatiotemporal photonic RC systems, the required high dimensionality and nonlinearity can typically be created in the optical or electronic domain using polarizers, high-resolution cameras, and spatial light modulators (SLMs) [6,23,24].

A few investigations have separately focused on the incoherent [6,7] or coherent [22–24] case, but no comparative study has been carried out to evaluate the benefits of each configuration. In such schemes, the inherent nonlinearity implemented in the optical domain depends on the coherence of the light beam, which determines the internal connectivity of the network. Hence, the following question arises: between coherent and incoherent light beams, which is the most suitable for implementing spatiotemporal photonic RC systems?

In this work, we report numerical results on two SLM-based, large-scale, spatiotemporal RC systems, which differ only in the nature of the light beam and the physical coupling of the reservoir nodes. The paper is structured as follows. Section 1 is devoted to the introduction. In Sec. 2, we describe the operating principle of each configuration and give the corresponding numerical model. In Sec. 3, we describe the benchmark tasks used to compare the performance of the two systems. Section 4 highlights the main results and discusses the performance of each configuration. Section 5 concludes the paper, while further modeling details are given in the Appendix (Sec. 6).

2. System modeling

The two architectures of large-scale, spatiotemporal, photonic RC systems investigated in this work are shown in Fig. 1. They are electro-optical systems composed of an optical path and an electrical path, both built by interconnecting stand-alone components. An SLM transforms an electrical signal into an optical signal through phase modulation, while a camera transforms the optical signal back into an electrical signal. The number of nodes of the reservoir is essentially limited by the resolutions of the SLM and the camera. In our current setup, the lowest resolution is that of the SLM, whose $512\times512$ pixels could provide up to $N=262144$ nodes. However, we typically use only pixels located at the center of the SLM matrix due to physical limitations from (i) a slight misalignment between the SLM and camera optical axes and (ii) the inhomogeneous intensity distribution of the optical beam. This constraint limits our study to a maximum of $N=16384$ nodes, similar to [6].

Fig. 1. Principle schemes of the experimental setups. The data to be processed are the HOG features extracted from images or video streams. The features are multiplied by a random mask before being injected into the reservoir. The reservoir structure depends on the configuration. a) Incoherent configuration. The SLM is illuminated by incoherent light from an LED. A polarizer then transforms the SLM phase-modulated signal into an intensity modulation before its detection by a camera. After the camera, the computer multiplies the signal by a random coupling matrix and superimposes it with the new masked data. The result is used to update the SLM state. b) Coherent configuration. The SLM is illuminated by coherent light from a laser. After the second polarizer, a diffuser is inserted to ensure interconnections between the reservoir nodes via interference before their detection on the camera. Here, the computer is only used to inject the masked data into the reservoir and to update the phase states of the SLM.

2.1 Incoherent reservoirs

For the incoherent architecture, the SLM is illuminated by a continuous-intensity light beam emitted by an LED [Fig. 1(a)]. The reflected light beam then passes through a linear polarizer oriented at $45^\circ$ with respect to the vertical axis, which implements a nonlinear sine-square function by transforming the phase modulation into an intensity modulation. Afterwards, the obtained signal is detected by a camera and serves two purposes: the first is the readout for post-processing of the data on a computer, and the second is the implementation of the coupling between the reservoir’s nodes by digitally multiplying this signal with an $N\times N$ random matrix. Subsequently, the data to be processed are added, and the resulting signal is used to update the SLM phase of each pixel.

Experimentally, the camera and the SLM have a limited resolution of $8$ bits. Hence, the dynamics of the system shown in Fig. 1(a) can be described by the following coupled maps:

$$I^{n+1}_i=\left \lfloor I_0 \sin^2 \left \lfloor \sum_{k=1}^N w_{ik}I^n_k +\beta\sum_{l=1}^M b_{il}U^n_l \right \rfloor_8 \right \rfloor_8,$$
where $\lfloor . \rfloor _8$ refers to the $8$-bit quantization and $I_0$ is the input intensity, which uniformly illuminates all pixels of the SLM. $I^{n}_i$ is the light intensity at time step $n$ for the $i$-th pixel at the output of the camera. The $w_{ik}$ are the coefficients of the random interconnection matrix $w \in \mathbb {R}^{N\times N}$, while $b\in \mathbb {R}^{N\times M}$ is the input weight matrix, $M$ being the dimension of the input vector; $\beta$ is the scaling parameter for the input data. The modeling details leading to Eq. (1) are given in Appendix 6.1.
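
To make this update rule concrete, here is a minimal Python sketch (our own illustration, not the code used for the simulations) of one iteration of Eq. (1); the quantization ranges, $[0,2\pi]$ for the SLM phase and $[0,I_0]$ for the camera, are assumptions of the sketch.

```python
import numpy as np

def q8(x, span):
    """8-bit quantization onto 256 levels over [0, span] (assumed device model)."""
    return np.round(np.clip(x, 0.0, span) / span * 255.0) / 255.0 * span

def incoherent_step(I, u, W, b, I0=5.0, beta=10.0):
    """One iteration of Eq. (1): digital coupling W @ I, masked input b @ u,
    sin^2 nonlinearity, with 8-bit SLM and camera quantization."""
    phi = q8(W @ I + beta * (b @ u), 2.0 * np.pi)   # 8-bit SLM phase
    return q8(I0 * np.sin(phi) ** 2, I0)            # 8-bit camera intensity
```

Iterating incoherent_step over the input sequence produces the trajectory of reservoir states used later for training.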

2.2 Coherent reservoirs

The architecture for the coherent case is shown in Fig. 1(b). It has structural similarities with the incoherent architecture, with a few noticeable differences. First, the SLM is illuminated by a continuous-wave coherent light beam emitted by a laser. Then, a diffuser is inserted after the second polarizer (Pol. 2) so that the coherent light beam from this polarizer is scattered, creating a complex interference pattern. Hence, the diffuser output is a randomly weighted sum of the different contributions from all SLM pixels. In other words, the variability and the coupling between the RC nodes are ensured optically: an input image imprinted as a phase modulation on the coherent light beam at the SLM propagates through the optical diffuser to form a speckle pattern on the camera. Here, the computer is only used to inject the data and to update the SLM phases. This offers a significant reduction in computational load compared to the incoherent architecture.

Under this description, the RC system shown in Fig. 1(b) can be modeled by:

$$I^{n+1}_i=\left \lfloor I_0\left| \sum_{k=1}^N w_{ik}\sin \left \lfloor I^n_k +\beta\sum_{l=1}^M b_{kl}U^n_l \right \rfloor_8 \right|^2 \right \rfloor_8.$$
Here, $w \in \mathbb {C}^{N\times N}$ is the complex-valued transmission matrix. The other parameters were defined in Sec. 2.1. The detailed derivation of Eq. (2) is given in Appendix 6.2.

Since we have assumed that the summation of the different internal weights contributing to the $i$-th pixel is achieved by a diffuser, the matrix elements of $w$ are complex numbers fixed by this component, with their phase values randomly distributed in the interval $[0,2\pi ]$. It has been demonstrated that the singular values of $w$ follow the so-called quarter-circle law [36]. Accordingly, the moduli of the $w_{ik}$ are generated so that the singular values of the normally distributed square matrix lie on a quarter circle. More details on $w$ are given in Appendix 6.3. The elements of the matrix $b$ can be freely fixed during pre-processing on the computer. Hence, $I_0$ and $\beta$ are the key parameters that can be tuned to observe different dynamical regimes.
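
For illustration, one iteration of Eq. (2) could be sketched as follows; drawing $w$ with i.i.d. complex Gaussian entries (whose singular values follow the quarter-circle law before the normalization of Appendix 6.3) and the quantization ranges are assumptions of this sketch, which reuses the q8 quantizer from the sketch of Sec. 2.1.

```python
import numpy as np

def random_transmission_matrix(N, seed=0):
    """Complex w with uniformly random phases; i.i.d. complex Gaussian entries
    have quarter-circle-distributed singular values (see Appendix 6.3)."""
    rng = np.random.default_rng(seed)
    return (rng.standard_normal((N, N))
            + 1j * rng.standard_normal((N, N))) / np.sqrt(2.0 * N)

def q8(x, span):
    """8-bit quantization onto 256 levels over [0, span] (assumed device model)."""
    return np.round(np.clip(x, 0.0, span) / span * 255.0) / 255.0 * span

def coherent_step(I, u, w, b, I0=5.0, beta=10.0):
    """One iteration of Eq. (2): the coupling is the optical field sum
    w @ sin(.), detected in intensity by the 8-bit camera."""
    phi = q8(I + beta * (b @ u), 2.0 * np.pi)         # 8-bit SLM phase
    return q8(I0 * np.abs(w @ np.sin(phi)) ** 2, I0)  # speckle intensity, 8-bit
```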

For both configurations, we have normalized the matrix $w$ so that $\sum _{i,k} |w_{ik}|^2=1$. The training is done on the camera readout signal, which forms the following RC outputs:

$$y_{RC,p}^n= \sum_{k=1}^N w^{out}_{pk}\left \lfloor I^n_k \right \rfloor_8,$$
where $w^{out}\in \mathbb {R}^{P\times N}$ is a matrix whose elements are determined by the training. The number $P$ of RC outputs depends on the benchmark task or metric used.
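
As an illustration, training the readout of Eq. (3) amounts to an ordinary least-squares fit of the collected reservoir states to the targets, followed by a winner-takes-all decision; the row-wise layout of the state matrix X is an assumption of this sketch.

```python
import numpy as np

def train_readout(X, Y):
    """Least-squares solution for the output weights of Eq. (3).
    X: (T, N) reservoir states over T time steps; Y: (T, P) one-hot targets."""
    w_out, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return w_out                                  # shape (N, P)

def classify(X, w_out):
    """Winner-takes-all: the predicted class is the largest linear output."""
    return np.argmax(X @ w_out, axis=1)
```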

3. Benchmark tasks and performance metrics

Our comparative study is carried out on two different benchmark tasks, namely the classification of images of handwritten digits and the recognition of human actions in video streams. The image classification is evaluated on the publicly available MNIST database [37], while the recognition of human actions in video streams is performed on the KTH database, also publicly available online [38]. In addition, we consider the memory capacity metric as a task-independent measure of the systems' properties.

3.1 MNIST task and pre-processing

The MNIST database contains $70000$ images of handwritten digits from $0$ to $9$. All images have been normalized to fit into a $28 \times 28$ pixel bounding box, antialiased, and converted into gray-scale levels. We use $60000$ of these images to train the system and the remaining $10000$ for testing. The output layer is composed of $10$ binary outputs corresponding to the $10$ different classes of digits. The pre-processing consists in extracting relevant spatial and shape information from individual images. An efficient way to do this is to compute the histograms of oriented gradients (HOG) of the original images [7,39]. For the HOG calculations, we used $9$ bin orientations, $7\times 7$ pixels per cell, and $2\times 2$ cells per block [7]. In addition, the HOG values are normalized by their highest value, so that the input scaling factor sets the amplitude of the reservoir input signal. In this approach, the RC input data are the HOG features multiplied by a random mask, which ensures variability over the reservoir nodes. The HOG features of the same image are injected simultaneously into the reservoir at time step $n$ and the features of the next image are injected at time step $n+1$, meaning that the injection interval between two consecutive images is one discrete time step. Increasing the wait time between the injection of two consecutive inputs would decrease the classification error rate [7]: the inputs in the MNIST database are not time-dependent; hence, for small wait times, past inputs affect more significantly the current reservoir state used for the classification. Nevertheless, the overall performance scaling is not drastically changed; this is briefly discussed in Sec. 4.1. The order of image injection does not matter. Linear regression training is applied to the different node responses so that the RC outputs “$1$” on the output corresponding to the class the image belongs to, and “$0$” otherwise. The “winner-takes-all” decision strategy is then applied to classify each image based on the binary output with the highest value.
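
A minimal sketch of this pre-processing using the hog function of scikit-image with the parameters quoted above; normalizing each feature vector by its own maximum, rather than by the highest value over the whole database, is a simplification made here.

```python
import numpy as np
from skimage.feature import hog  # scikit-image

def mnist_hog(image):
    """HOG features of a 28x28 gray-scale digit: 9 orientation bins,
    7x7 pixels per cell, 2x2 cells per block [7]."""
    f = hog(image, orientations=9, pixels_per_cell=(7, 7),
            cells_per_block=(2, 2))
    return f / np.max(np.abs(f))   # normalize by the highest HOG value
```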

3.2 KTH task and pre-processing

The KTH database contains video recordings of six different actions (walking, jogging, running, boxing, hand waving, and hand clapping) performed by $25$ subjects. Each action is repeated four times. Similarly to [6], we limit the database to the "s1" scenario, i.e., videos shot over a uniform background without zoom or lighting variations. The "s1" subset contains a total of $599$ video sequences, which we bring up to $600$ by artificially duplicating a missing boxing sequence. Of the $600$ video sequences, a subset of $450$ is used for training, while the remaining $150$ sequences are used for testing. Both for training and testing, all videos are concatenated together and split into individual image frames. Then, the HOG algorithm is applied to each image to extract features, which are used as the input of the RC. To reduce the computational cost, we reduce the number of HOG features by further applying principal component analysis (PCA) based on the covariance method [40]. We only keep the first $2000$ components (out of $9576$), whose eigenvalues account for $91.6\%$ of the total variability in the data. From each video, one constructs a matrix whose rows are image frames and whose columns are the HOG features extracted from those images. At each discrete time step, we inject the HOG features of one image. The order in which the videos are injected does not matter, but the images of each video must be injected as a sequence. In this work, the images are injected so that the time interval between two consecutive images corresponds to the discrete time step of the model. In other words, the injection interval between two consecutive images belonging to the same video is the same as that between two consecutive images from two different videos. After classifying the images from the $600$ videos through the training and testing processes of our RC system, we subsequently classify the individual frames of these videos into their respective classes, again using the “winner-takes-all” approach. In practice, the classifier output is evaluated throughout the full video sequence (from the first frame to the last), and the final result corresponds to the class to which the majority of frames within the sequence is attributed.
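
A possible sketch of this dimensionality reduction using scikit-learn; the feature matrix H below is a hypothetical placeholder standing in for the stack of per-frame HOG vectors.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stack of per-frame HOG features: one row per image frame.
H = np.random.rand(2500, 9576)          # placeholder data for illustration
pca = PCA(n_components=2000)            # keep the first 2000 components
H_reduced = pca.fit_transform(H)        # (2500, 2000) reservoir inputs
# On the real KTH features these components retain ~91.6% of the variance.
print(pca.explained_variance_ratio_.sum())
```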

3.3 Memory capacity metric

Memory capacity is a common way to evaluate the ability of machine learning systems to recall past information. For time-dependent tasks, this property is crucial for the performance and efficiency of the information processing system. Considering a sequence of numbers $(u^k)_{k\ge 0}$ at discrete time steps, drawn from a uniform distribution in the interval $[-0.5, 0.5]$, a typical way to address the linear memory capacity of a physical system consists in training the system to reconstruct, at its output $\hat {y}^k$, copies of the input shifted by $s$ time steps, i.e., the targets $u^{k-s}$ with $s=0,1,2,\ldots$. To quantify this capacity, we define the memory function as the cross-correlation between $u^{k-s}$ and $\hat {y}^k$ [41]:

$$m^s=\frac{\langle \left[u^{k-s}-\langle u^{k-s} \rangle\right]\left[\hat{y}^k-\langle \hat{y}^k \rangle\right] \rangle}{\left\langle \left|u^{k-s}- \left\langle u^{k-s} \right\rangle \right|^{2} \right\rangle ^{1/2} \left\langle \left| \hat{y}^k- \langle \hat{y}^k \rangle \right|^{2} \right\rangle ^{1/2} }.$$

From this memory function, the memory capacity is calculated as:

$$MC=\sum_s m^s.$$

For all tasks, the training consists in using linear regression to determine the weights assigned to the reservoir readout responses so that the output signal approaches the target as closely as possible. The simulations are carried out on Eq. (1) for the incoherent case and Eq. (2) for the coherent case. The classification error is calculated in each case as the percentage of images/videos wrongly classified by the reservoir computer on the MNIST/KTH task.
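
Assuming the RC has been trained separately for each delay $s$ and that its test outputs are stored column-wise in a matrix Y, the memory capacity of Eqs. (4) and (5) can be evaluated with a sketch such as:

```python
import numpy as np

def memory_capacity(u, Y):
    """Sum over delays s of the memory function m^s, Eqs. (4)-(5).
    u: (T,) input sequence; Y: (T, S) trained RC outputs, column s
    approximating the input delayed by s steps."""
    T, S = Y.shape
    mc = 0.0
    for s in range(S):
        # m^s of Eq. (4) is the Pearson correlation between target and output
        mc += np.corrcoef(u[: T - s], Y[s:, s])[0, 1]
    return mc
```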

4. Results and discussions

4.1 Performance of the two system configurations

In order to identify the appropriate parameter ranges for task processing, we first scan the system’s performance over the ($\beta , I_0$) parameter space. The results for the coherent and incoherent configurations are shown in Fig. 2. For the incoherent configuration, classification errors are as low as $1.8\%$ with $1024$ nodes, even with the $8$-bit quantization considered in the model. Overall, classification errors of less than $2\%$ can be obtained over a broad range of parameter sets for this configuration. These errors are consistent with the state-of-the-art performance reported in the literature for similar reservoir sizes [7]. For the coherent configuration, however, worse classification errors are obtained for the MNIST task over the same parameter ranges. The lowest classification error obtained is $2.8\%$. Overall, classification errors of less than $3\%$ can be obtained, but only in a very narrow range of values of the parameters $(\beta ,I_0)$.

Guided by the results in Fig. 2, we now choose the optimal values of $\beta$ and $I_0$ and investigate the influence of the reservoir size on the performance of the two configurations. These values are $I_0=5$ and $\beta =10$ for both the coherent and the incoherent cases. They are kept fixed for all reservoir sizes, since the normalization $\sum _{i,k} |w_{ik}|^2=1$ has been adopted. The results are shown in Fig. 3. We find that increasing the reservoir size has a different effect on the system performance depending on the configuration. For the incoherent configuration, our results confirm a decrease of the classification errors with increasing reservoir size, as previously reported in [6]. By way of illustration, the classification errors gradually decrease from $\sim 1.8\%$ for $N=1024$ nodes to $\sim 0.8\%$ for $N=9216$ nodes. For the coherent configuration, which has not been explored before, we observe that the classification error gradually decreases and reaches a minimum for a particular size before becoming worse as the reservoir size increases further. For example, the classification error decreases from $3.7\%$ for $1024$ nodes to $2.5\%$ for $7168$ nodes, and then increases again to $2.6\%$ for $11264$ nodes. We have noted that, for some parameter sets, the increase of the errors with reservoir size after reaching the minimum is quite significant. As indicated in Sec. 3.1, the results are obtained with a wait time of one time step for the injection of consecutive inputs, as we were not primarily interested in heavily optimizing the performance on the task. We have checked for the two configurations that an increase in wait time up to $10$ time steps provides the expected increase in performance without radically changing its scaling. For example, with $N=1024$ nodes, we achieve an optimal classification error rate of $1.72\%$ (vs $1.8\%$) for the incoherent configuration and $3.15\%$ (vs $3.7\%$) for the coherent one. This shows that the two architectures can be further optimized by carefully investigating the impact of the wait time on the classification performance.

Fig. 2. Classification error for the MNIST task plotted in ($\beta , I_0$)-plane for the coherent case (left) and the incoherent case (right) with a reservoir of $1024$ nodes.

Fig. 3. Classification error for the MNIST task computed from Eqs. (1) and (2) as a function of the reservoir size. The parameters are $I_0=5$, $\beta =10$ for both the coherent and the incoherent cases.

Next, we compare the performance of the two configurations on the classification of video streams. Figure 4 shows the classification errors over the $150$ test videos for the two configurations in the ($\beta , I_0$)-plane. For the incoherent configuration, there is a large range of parameter sets for which the fraction of misclassified videos lies between $20\%$ and $25\%$, and a few parameter sets with errors slightly below $20\%$; for the coherent configuration, the fraction of misclassified videos lies between $25\%$ and $30\%$ in a narrow window of parameter sets. Explicitly, up to $77.3\%$ of videos can be successfully classified using our incoherent RC, while a maximum of $73\%$ of videos are successfully classified using the coherent RC. The incoherent configuration therefore offers more flexibility in the choice of ($\beta , I_0$) while ensuring better performance.

Fig. 4. Video classification error for the KTH task plotted in ($\beta , I_0$)-plane for the incoherent case (left) and the coherent case (right) with a reservoir of $1024$ nodes considering $8$-bit resolution.

In Fig. 5, we again choose the optimal parameter set for each configuration (coherent and incoherent) and plot the classification error on the videos as a function of the reservoir size. The parameters $\beta$ and $I_0$ are kept fixed for all reservoir sizes. It is confirmed that the incoherent configuration outperforms the coherent configuration for all reservoir sizes. Interestingly, the results for large reservoir sizes indicate a significant improvement of the performance for both configurations. For example, we achieve an accuracy of $90\%$ and $85\%$ for the incoherent and coherent configurations, respectively, with a reservoir size of $11264$ nodes.

Fig. 5. Classification error for the KTH task as a function of the reservoir size for $\beta =2$, $I_0=2.5$ for the incoherent case and $\beta =7$, $I_0=25$ for the coherent case, considering $8$-bit resolution.

To provide further insight into the comparison of the two configurations, we show in Fig. 6 the confusion matrices, which give the explicit success rate for each specific action in each configuration. It appears that the two configurations achieve good performance on the "boxing", "hand clapping", and "hand waving" actions, which are typically less demanding to learn than the "jogging" and "running" actions. For the two latter actions, the learning performance of both systems is limited. Both from Fig. 6 and from our results with other reservoir sizes (not shown here), the incoherent architecture outperforms the coherent configuration on these two specific actions. In particular, with a relatively large reservoir size (e.g., $5120$ nodes), the classification accuracy on the "jogging" and "running" actions increases to $\sim 65\%$ for the coherent configuration, while it exceeds $\sim 75\%$ for the incoherent configuration.

Fig. 6. Confusion matrices for $\beta =2$, $I_0=2.5$ for the incoherent case and $\beta =7$, $I_0=25$ for the coherent case, considering a reservoir size of $5120$ nodes.

One should note that the results reported here are comparable to, but lower than, those reported previously on the same simulated setup [6,42]. The difference stems in part from the optimization of the hyper-parameters $\beta$ and $I_0$. In this work, to avoid excessive computational times, we chose to keep the same hyper-parameter values for all reservoir sizes, whereas in [6,42] they were optimized specifically for each size of the neural network. The similarity of our results to those reported previously shows the robustness of our system, since accuracies comparable to the state of the art can be obtained even with sub-optimal hyper-parameters.

4.2 Memory capacity

Another interesting property of RC systems is their memory capacity. Although a low memory capacity may be sufficient for the above-mentioned tasks, some time-dependent machine learning tasks require a large memory capacity (e.g., NARMA10 [25,32]). Therefore, for further insight into our two configurations, we show in Fig. 7 the memory capacity of the system scanned over the ($\beta ,I_0$)-plane with a reservoir size of $N=1024$ nodes, considering $10000$ points for training and $6000$ points for testing. For the incoherent configuration, a memory capacity of up to $20$ can be obtained over a substantial range of parameter sets, despite the presence of quantization noise. On the contrary, the coherent configuration yields a poor memory capacity: the best memory capacity is less than $6$, and only very few parameter sets allow a memory capacity larger than $3$ for this configuration.

Fig. 7. Memory capacity plotted in the ($\beta$, $I_0$)-plane with a reservoir size of $N=1024$ at $8$-bit resolution. Note the difference in scale of the colour bars for the two configurations.

4.3 Impact of noise

4.3.1 Quantization noise

The decrease in performance of the coherent configuration on the MNIST task (see Fig. 3) is an unexpected result, as RC performance typically improves with increasing dimensionality. In order to understand this result, we investigate the specific role of bit quantization by comparing the results with and without quantization noise (i.e., with the bit resolution set to that of the computer in our simulation model, $64$ bits). Figure 8 shows the results for $64$-bit resolution in comparison with those at $8$-bit resolution already reported. For the incoherent configuration, the same classification errors are obtained at $64$-bit and $8$-bit resolutions for the same reservoir sizes. The system is therefore robust to the readout noise introduced by bit quantization for this parameter set. For the coherent configuration, we find classification errors as low as those obtained for the incoherent configuration at $64$-bit resolution. This suggests that the large classification errors obtained at $8$-bit resolution were caused by quantization noise. Thus, this configuration appears to be more sensitive to low bit resolution than the incoherent configuration: for example, the classification error increases from $1.8\%$ at $64$-bit resolution to $3.6\%$ at $8$-bit resolution for a RC size of $1024$ nodes. In addition, the results of the coherent configuration without bit quantization also indicate that the unexpected degradation of the system performance for large RC sizes is likely caused by overfitting, the data becoming harder to learn in the presence of strong noise. This suggests that more sophisticated learning techniques, such as ridge regularization [31], should be investigated to improve the system performance when devices with low bit resolution are used to implement the coherent configuration. For future studies, the overall impact of noise (additive or multiplicative) on the performance could be investigated within the framework developed in Ref. [43].

Fig. 8. Classification error on the MNIST task as a function of the reservoir size for different quantization levels. The parameters are the same as in Fig. 3.

To unveil the impact of the quantization noise on the memory, we plot in Fig. 9 the memory capacity considering the same parameters as in Fig. 7, but at $64$-bit resolution. Comparing these results with those obtained at $8$-bit resolution, it appears that the memory capacity of the coherent configuration improves but its best value still remains below $6$, while the incoherent configuration exhibits robustness to quantization noise, with very similar memory-capacity landscapes at $8$-bit and $64$-bit quantization.

Fig. 9. Same as Fig. 7 at $64$-bit resolution.

4.3.2 Phase and intensity noise

The $8$-bit quantizations introduced by the SLM and the camera are sources of phase and intensity noise, respectively. We propose to analyze their individual impact on the performance of the unquantized system, using the MNIST task as a benchmark. The inclusion of noise in the models is detailed in Eqs. (9) and (14) of the Appendix. For the two noise sources, we use independent and identically distributed random variables following a normal distribution with zero mean and a standard deviation equal to a small percentage ($1{-}5\%$) of $2\pi$ for the phase and of $I_0$ for the intensity.

Figure 10 (top panel) displays the impact of phase noise levels of $1$, $2$, and $5\%$ on the MNIST classification error rate as a function of the number of nodes $N$. For low levels, the incoherent architecture performs slightly better than the coherent architecture, with performance close to that obtained without noise. However, as the phase noise increases, we observe a significant degradation of the performance of the incoherent configuration, versus a relative immunity to phase noise of the coherent configuration, even at the $5\%$ level. We interpret this result based on how the phase noise is introduced in each model: in the incoherent configuration, it directly affects each intensity state variable, whereas in the coherent case, the interference pattern created by the scattering medium leads to a large number of independent phase-noise contributions affecting a given intensity state. These individual fluctuations may be replaced by an attenuated, averaged phase-noise contribution, which is reminiscent of the law of large numbers.

Fig. 10. Classification error on the MNIST task at $64$-bit resolution as a function of the reservoir size, considering models with phase noise (top panel) and detection noise (bottom panel). The parameters are the same as in Fig. 3.

Figure 10 (bottom panel) displays the impact of intensity (or detection) noise levels of $1$, $2$, and $5\%$. It shows that the incoherent configuration exhibits a consistently better level of performance, highlighting a stronger sensitivity of the coherent configuration, whose performance degrades significantly. In contrast to how the phase noise enters the models, the intensity noise affects each intensity state variable individually and in the same way for the two configurations; consequently, we do not observe the same scenario. If we combine phase and intensity noise at the lower levels ($<2\%$), the scenario is qualitatively comparable to what is observed in Fig. 8, revealing that the quantization noise from the camera is predominant in the degradation of performance observed in the coherent case.

5. Conclusion

Based on the various benchmark tasks used here, it appears that the incoherent configuration has higher image and video processing capacities than the coherent configuration. For the sake of clarity, Table 1 summarizes the best performance of the two configurations on the different tasks at $8$- and $64$-bit resolutions for a reservoir with $1024$ nodes. The incoherent configuration outperforms the coherent configuration on all tasks. We also notice that the significant effect of low bit resolution on image classification is attenuated for the KTH task because each full video sequence is classified by simply choosing the class with the majority of frames within the sequence. For the KTH task and the memory capacity, we have not investigated the influence of bit quantization at $64$-bit resolution for other reservoir sizes.

Table 1. Performance of the two RC system configurations on the different tasks at $8$- and $64$-bit resolutions, considering a reservoir with $1024$ nodes.

In this work, we have compared the performance of an incoherent and a coherent free-space photonic RC system on image and video classification, while preserving as much as possible the structure of each architecture by using similar components.

The results have shown a trade-off between the incoherent and coherent architectures. On the one hand, the coherent RC architecture allows physical coupling between the reservoir nodes and avoids the additional computational resources needed to perform the multiplication of the adjacency matrix with the RC’s state vector, an operation that considerably slows down the processing as the reservoir scales up. On the other hand, our results have shown that the incoherent configuration outperforms the coherent configuration on all benchmark tasks tested, but at the cost of processing speed. Furthermore, we have found that the incoherent configuration can exhibit a large memory capacity, while the coherent configuration has shown a comparatively low (five- to tenfold lower) memory capacity for all the hyper-parameter sets explored. Based on our simulations, we have found that the coherent configuration is more sensitive than the incoherent configuration to the quantization noise generated by the low bit resolution imposed by the physical components. In particular, the incoherent configuration shows very high robustness, as no visible difference in performance between the $8$-bit and $64$-bit quantizations was observed. The coherent configuration shows a more significant resilience to phase noise than the incoherent configuration; the reverse scenario is observed for intensity noise, with a better resilience for the incoherent configuration.

The precise origin of the difference in performance induced by the optical coherence remains an open question. Based on our mathematical modeling, at a given network size, the coherent configuration, with its interference pattern, has a more complex nonlinearity than the incoherent configuration. This could potentially trigger the bias-variance trade-off, an effect known in machine learning to link the complexity of a mathematical model to its difficulty in generalizing [44]. Further investigations along this line of reasoning, coupled with the use of task-independent metrics such as the computational ability, could provide additional insight towards answering this question.

In summary, our results have unveiled the impact of optical coherence on the processing capabilities of structurally similar RC systems on demanding computer-vision tasks; this will potentially pave the way towards informed design choices and the optimization of future large-scale, free-space, photonic RC architectures.

6. Appendix

6.1. Appendix 1 - Modeling of the incoherent configuration

For the incoherent configuration, an LED delivers an incident incoherent light beam propagating through a first linear polarizer (Pol. 1) oriented at $45^\circ$ such that, according to the Jones formalism, the resulting electric field reads $\mathcal {E}_0= E_0/\sqrt {2}e^{j\varphi _0}[1,1]^T$, with $E_0$ and $\varphi _0$ the amplitude and the phase of $\mathcal {E}_0$, respectively. On each pixel of the SLM, a specific time-discrete phase modulation is applied, resulting in a local electric field $\mathcal {E}_i^n$ for the $i$-th pixel at the $n$-th time step. The SLM output then passes through a second linear polarizer (Pol. 2) oriented at $45^\circ$ with respect to the vertical axis. This polarizer transforms the electric field $\mathcal {E}_i^n$ of each pixel into $E_0/\sqrt {2} \sin (\varphi ^n_i/2)[1,1]^T$, where $\varphi ^n_i$ is the phase of the $i$-th SLM pixel at the $n$-th time step. We have assumed here homogeneity and a constant electric field over a given SLM pixel for the duration of a time step. Afterwards, a camera is used to detect the intensity of the polarized light beam, and its output gives the intensity of each pixel as [7]:

$$I^{n}_i=I_0 \sin^2 \left(\varphi^n_i/2\right),$$
with $I_i \propto |\mathcal {E}_i|^2$ and $I_0 \propto |E_0|^2$. After detection, the signal $I^n_k$ is multiplied by a random matrix $W \in \mathbb {R}^{N\times N}$, whose elements are generated on the computer, before being added to the randomly masked data. The resulting signal is used to update the programmable phase shifts of the SLM as follows:
$$\varphi^{n+1}_i=\sum_{k=1}^N W_{ik} I^n_k +\beta\sum_{k=1}^M b_{ik}U^n_k.$$
Thus, by combining Eqs. (6) and (7) and rescaling the matrices and state variables, one finds
$$I^{n+1}_i=I_0 \sin^2\left(\sum_{k=1}^N W_{ik} I^n_k +\beta\sum_{k=1}^M b_{ik}U^n_k\right),$$
which is Eq. (1) of Sec. 2.1 without the $8$-bit quantization operation imposed on the SLM phase values and on the intensities detected by the camera. The introduction of phase and intensity noise is performed as follows: for the phase-noise sources, we add independent and identically distributed normal random variables $\zeta _i^n$ in Eq. (7); for the intensity noise, we add normal random variables $\eta _i^n$ in Eq. (8). We obtain the following nonlinear stochastic model:
$$I^{n+1}_i=I_0 \sin^2\left(\sum_{k=1}^N W_{ik} I^n_k +\beta\sum_{k=1}^M b_{ik}U^n_k+\zeta_i^n\right)+\eta_i^n.$$
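
As an illustration, the stochastic map of Eq. (9) can be iterated with the following sketch; the noise levels sp and si correspond to the small percentages of $2\pi$ and $I_0$ quoted in Sec. 4.3.2.

```python
import numpy as np

def incoherent_step_noisy(I, u, W, b, rng, I0=5.0, beta=10.0, sp=0.01, si=0.01):
    """One iteration of Eq. (9): unquantized incoherent map with additive
    i.i.d. Gaussian phase noise (std = sp * 2*pi) and intensity noise
    (std = si * I0)."""
    zeta = rng.normal(0.0, sp * 2.0 * np.pi, size=I.shape)  # phase noise
    eta = rng.normal(0.0, si * I0, size=I.shape)            # detection noise
    return I0 * np.sin(W @ I + beta * (b @ u) + zeta) ** 2 + eta
```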

6.2. Appendix 2 - Modeling of the coherent configuration

For the coherent configuration, the SLM is illuminated by a linearly polarized coherent light beam generated by a laser. After the second polarizer (Pol. 2), a diffuser is inserted so that the polarized coherent light originating from each pixel, $E_0/\sqrt {2} \sin (\varphi ^n_i/2)[1,1]^T$, is scattered. The electric field of the $i$-th pixel at the diffuser’s output, $\mathcal {F}^n_i$, is then the coherent sum of the different contributions from all $k$-th pixels:

$$\mathcal{F}^n_i= {E}_0\sum_{k=1}^N w_{ik}\sin\left(\varphi^n_k/2\right )\mathbf{e}_{i,k} ,$$
where $\mathbf {e}_{i,k}$ is a normalized vector giving the polarization direction, at the diffuser’s output, of the field amplitude contributed by the $k$-th pixel to the $i$-th pixel. We assume that the vectors $\mathbf {e}_{i,k}$ have independent, uncorrelated, random orientations, thus leading to a degree of polarization (DOP) of the scattered light with respect to its initial polarization approximately equal to zero (i.e., unpolarized light). The coefficients $w_{ik} = w^{d}_{ik}e^{j\phi _{ik}}$ form the complex-valued light transmission matrix, with $w_{ik}^d$ the real-valued amplitude transmission coefficient and $\phi _{ik}$ the phase accumulated during the propagation of light from the $k$-th to the $i$-th pixel. The diffuser’s output is then focused by an imaging lens onto a camera, which reads the intensity of the signal of each pixel as $I_i^n\propto |\mathcal {F}_i^n|^2$. As in the incoherent configuration, the computer is used to include the masked data and to update the programmable phase-shift state of the SLM’s $i$-th pixel according to the following update rule:
$$\varphi^{n+1}_i =I_i^n +\beta\sum_{k=1}^M b_{ik}U^n_k,$$
where $\beta$ is the control parameter used to rescale the amplitude of the input data and $(b_{ik})_{i,k}$ is the input matrix. At time step $n+1$, the intensity detected by the $i$-th camera pixel is such that $I^{n+1}_i \propto \Big |\mathcal{F}^{n+1}_i\Big |^2$, which becomes, using Eq. (10):
$$I^{n+1}_i \propto \left|E_0\sum_{k=1}^N w_{ik}\sin\left(\varphi^{n+1}_k/2\right)\mathbf{e}_{i,k}\right |^2.$$
With a coherent light beam, there are cross terms of the form $I_0 w_{ik}\overline {w}_{ik'}\sin (\varphi _k^{n+1}/2)\sin (\varphi _{k'}^{n+1}/2)(\mathbf {e}_{i,k}\cdot \mathbf {e}_{i,k'})$, which come from pixel-to-pixel interference between the diffuser output signals detected by the $i$-th pixel of the camera. They play a crucial role both in the dynamics and in the RC’s processing capabilities. To simplify the notation, we have embedded the contribution of the dot products $(\mathbf {e}_{i,k}\cdot \mathbf {e}_{i,k'})=\cos \left (\mathbf {e}_{i,k},\mathbf {e}_{i,k'}\right )$ directly in the transmission matrix $w$. Finally, using Eq. (11) and the same rescaling procedure as in Eq. (8), the model of the coherent RC reads
$$I^{n+1}_i=I_0\left| \sum_{k=1}^N w_{ik}\sin\left(I^n_k +\beta\sum_{l=1}^M b_{kl}U^n_l\right) \right|^2.$$
Equation (13) corresponds to Eq. (2) of Sec. 2.2 without considering the 8-bit quantization operation imposed on the SLM phase values and intensities detected by the camera.

The introduction of phase and intensity noise is performed similarly to the incoherent configuration, by adding their respective contributions to Eqs. (11) and (13). We obtain the stochastic model:

$$I^{n+1}_i=I_0\left| \sum_{k=1}^N w_{ik}\sin\left(I^n_k +\beta\sum_{l=1}^M b_{kl}U^n_l+\zeta_i^n\right) \right |^2+\eta_i^n.$$

6.3. Appendix 3 - Normalization of the transmission matrix

For our system, the optical power is conserved if we neglect the losses that may occur during the propagation of the light between the input of the SLM and the output of the camera. To ensure this property, a singular value decomposition (SVD) is applied to the coupling matrix $w$ to find its singular values. Precisely, $w$ is decomposed as:

$$w=U\Sigma V^T,$$
where $U$ is the unitary change-of-basis matrix between the transmission-channel output modes and the output free modes, $\Sigma$ is a diagonal matrix whose elements are the singular values $\lambda _m$ of $w$, $V$ is the unitary change-of-basis matrix linking the input free modes to the transmission-channel input modes of the system, and the superscript $T$ refers to the matrix transpose [36]. Since the singular values of $w$ are the square roots of the energy transmission values of the transmission channels, we ensure the conservation of the total power transmission by normalizing the singular values as follows:
$$\lambda_m \longrightarrow \lambda_m/\sqrt{\sum_j \lambda^2_j}.$$
After this normalization, we reconstruct $w=U\Sigma ' V^T$ where $\Sigma '$ is a diagonal matrix whose elements are normalized singular values.
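
For illustration, this normalization can be implemented in a few lines; note that for a complex-valued $w$, the numerical SVD returns the conjugate transpose $V^\dagger$ in place of $V^T$.

```python
import numpy as np

def normalize_transmission(w):
    """Rescale the singular values of w so that the total power
    transmission equals one, Eqs. (15) and (16)."""
    U, s, Vh = np.linalg.svd(w, full_matrices=False)
    s = s / np.sqrt(np.sum(s ** 2))        # Eq. (16)
    return U @ np.diag(s) @ Vh             # w = U Sigma' V^T (V^H if complex)
```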

Funding

This work was supported by the AFOSR (grant No. FA-9550-17-1-0072) and the Région Grand-Est.

Acknowledgments

The authors gratefully acknowledge Dr. Daniel Brunner for the insightful scientific exchanges and discussions related to spatiotemporal photonic reservoir computers.

Disclosures

The authors declare that there are no conflicts of interest related to this article.

References

1. F. Triefenbach, A. Jalal, B. Schrauwen, and J.-P. Martens, “Phoneme recognition with large hierarchical reservoirs,” in Advances in Neural Information Processing Systems, vol. 23 (2010), pp. 2307–2315.

2. H. Jaeger and H. Haas, “Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication,” Science 304(5667), 78–80 (2004). [CrossRef]  

3. P. Buteneers, D. Verstraeten, P. van Mierlo, T. Wyckhuys, D. Stroobandt, R. Raedt, H. Hallez, and B. Schrauwen, “Automatic detection of epileptic seizures on the intra-cranial electroencephalogram of rats using reservoir computing,” Artif. Intell. Medicine 53(3), 215–223 (2011). [CrossRef]

4. E. A. Antonelo, B. Schrauwen, and D. Stroobandt, “Event detection and localization for small mobile robots using reservoir computing,” Neural Networks 21(6), 862–871 (2008). [CrossRef]  

5. K. Lüdge and A. Röhm, “Computing with a camera,” Nat. Mach. Intell. 1(12), 551–552 (2019). [CrossRef]  

6. P. Antonik, N. Marsal, D. Brunner, and D. Rontani, “Human action recognition with a large-scale brain-inspired photonic computer,” Nat. Mach. Intell. 1(11), 530–537 (2019). [CrossRef]  

7. P. Antonik, N. Marsal, and D. Rontani, “Large-scale spatiotemporal photonic reservoir computer for image classification,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–12 (2020). [CrossRef]  

8. W. Maass, T. Natschläger, and H. Markram, “Real-time computing without stable states: a new framework for neural computation based on perturbations,” Neural Comput. 14(11), 2531–2560 (2002). [CrossRef]  

9. H. Jaeger, “The ‘echo state’ approach to analysing and training recurrent neural networks,” GMD Report 148 (German National Research Center for Information Technology, 2001).

10. D. Verstraeten, B. Schrauwen, M. D’Haene, and D. Stroobandt, “An experimental unification of reservoir computing methods,” Neural Networks 20(3), 391–403 (2007). [CrossRef]  

11. G. Van der Sande, D. Brunner, and M. C. Soriano, “Advances in photonic reservoir computing,” Nanophotonics 6(3), 561–576 (2017). [CrossRef]

12. G. Tanaka, T. Yamane, J.-B. Héroux, R. Nakane, N. Kanazawa, S. Takeda, H. Numata, D. Nakano, and A. Hirose, “Recent advances in physical reservoir computing: A review,” Neural Networks 115, 100–123 (2019). [CrossRef]  

13. Y. K. Chembo, “Machine learning based on reservoir computing with time-delayed optoelectronic and photonic systems,” Chaos 30(1), 013111 (2020). [CrossRef]  

14. A. Lugnan, A. Katumba, F. Laporte, M. Freiberger, S. Sackesyn, C. Ma, E. Gooskens, J. Dambre, and P. Bienstman, “Photonic neuromorphic information processing and reservoir computing,” APL Photonics 5(2), 020901 (2020). [CrossRef]  

15. Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, “Optoelectronic reservoir computing,” Sci. Rep. 2(1), 287 (2012). [CrossRef]  

16. R. Martinenghi, S. Rybalko, M. Jacquot, Y. K. Chembo, and L. Larger, “Photonic nonlinear transient computing with multiple-delay wavelength dynamics,” Phys. Rev. Lett. 108(24), 244101 (2012). [CrossRef]  

17. D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer, “Parallel photonic information processing at gigabyte per second data rates using transient states,” Nat. Commun. 4(1), 1364 (2013). [CrossRef]  

18. F. Duport, A. Smerieri, A. Akrout, M. Haelterman, and S. Massar, “Fully analogue photonic reservoir computer,” Sci. Rep. 6(1), 22381 (2016). [CrossRef]  

19. R. M. Nguimdo, E. Lacot, O. Jacquin, O. Hugon, G. Van der Sande, and H. Guillet de Chatellus, “Prediction performance of reservoir computing systems based on a diode-pumped erbium-doped microchip laser subject to optical feedback,” Opt. Lett. 42(3), 375–378 (2017). [CrossRef]

20. L. Larger, A. Baylón-Fuentes, R. Martinenghi, V. S. Udaltsov, Y. K. Chembo, and M. Jacquot, “High-speed photonic reservoir computing using a time-delay-based architecture: Million words per second classification,” Phys. Rev. X 7(1), 011015 (2017). [CrossRef]  

21. A. Argyris, J. Bueno, and I. Fischer, “Photonic machine learning implementation for signal recovery in optical communications,” Sci. Rep. 8(1), 8487 (2018). [CrossRef]  

22. D. Brunner and I. Fischer, “Reconfigurable semiconductor laser networks based on diffractive coupling,” Opt. Lett. 40(16), 3854 (2015). [CrossRef]  

23. J. Bueno, S. Maktoobi, L. Froehly, I. Fischer, M. Jacquot, L. Larger, and D. Brunner, “Reinforcement learning in a large scale photonic recurrent neural network,” Optica 5(6), 756 (2018). [CrossRef]  

24. J. Dong, M. Rafayelyan, F. Krzakala, and S. Gigan, “Optical reservoir computing using multiple light scattering for chaotic systems prediction,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–12 (2020). [CrossRef]  

25. Q. Vinckier, F. Duport, A. Smerieri, K. Vandoorne, P. Bienstman, M. Haelterman, and S. Massar, “High-performance photonic reservoir computer based on a coherently driven passive cavity,” Optica 2(5), 438–446 (2015). [CrossRef]  

26. S. Sunada and A. Uchida, “Photonic reservoir computing based on nonlinear wave dynamics at microscale,” Sci. Rep. 9(1), 19078 (2019). [CrossRef]  

27. K. Vandoorne, P. Mechet, T. V. Vaerenbergh, M. Fiers, G. Morthier, D. Verstraeten, B. Schrauwen, J. Dambre, and P. Bienstman, “Experimental demonstration of reservoir computing on a silicon photonics chip,” Nat. Commun. 5(1), 3541 (2014). [CrossRef]  

28. K. Takano, C. Sugano, M. Inubushi, K. Yoshimura, S. Sunada, K. Kanno, and A. Uchida, “Compact reservoir computing with a photonic integrated circuit,” Opt. Express 26(22), 29424 (2018). [CrossRef]  

29. F. D. L. Coarer, M. Sciamanna, A. Katumba, M. Freiberger, J. Dambre, P. Bienstman, and D. Rontani, “All-optical reservoir computing on a photonic chip using silicon-based ring resonators,” IEEE J. Sel. Top. Quantum Electron. 24(6), 1–8 (2018). [CrossRef]  

30. A. Katumba, X. Yin, J. Dambre, and P. Bienstman, “A neuromorphic silicon photonics nonlinear equalizer for optical communications with intensity modulation and direct detection,” J. Lightwave Technol. 37(10), 2232–2239 (2019). [CrossRef]  

31. M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Comput. Sci. Rev. 3(3), 127–149 (2009). [CrossRef]  

32. L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, “Information processing using a single dynamical node as complex system,” Nat. Commun. 2(1), 468 (2011). [CrossRef]

33. L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer, “Photonic information processing beyond turing: an optoelectronic implementation of reservoir computing,” Opt. Express 20(3), 3241 (2012). [CrossRef]  

34. S. Maktoobi, L. Froehly, L. Andreoli, X. Porte, M. Jacquot, L. Larger, and D. Brunner, “Diffractive coupling for photonic networks: how big can we go?” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–8 (2020). [CrossRef]  

35. M. Rafayelyan, J. Dong, Y. Tan, F. Krzakala, and S. Gigan, “Large-scale optical reservoir computing for spatiotemporal chaotic systems prediction,” arXiv:2001.09131v1 (2020).

36. S. M. Popoff, G. Lerosey, M. Fink, A. C. Boccara, and S. Gigan, “Controlling light through optical disordered media: transmission matrix approach,” New J. Phys. 13(12), 123021 (2011). [CrossRef]  

37. Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” in Proceedings of the IEEE, vol. 86 (1998), p. 2278.

38. C. Schuldt, I. Laptev, and B. Caputo, “Recognizing human actions: a local svm approach,” in Proceedings of the 17th International Conference on Pattern Recognition, (2004).

39. H. E. Bahi, Z. Mahani, A. Zatni, and S. Saoud, “A robust system for printed and handwritten character recognition of images obtained by camera phone,” WSEAS Trans. Signal Process. 11, 9–22 (2015).

40. I. T. Jolliffe, Principal Component Analysis (Springer, 2002), 2nd ed.

41. R. M. Nguimdo, G. Verschaffelt, J. Danckaert, and G. Van der Sande, “Reducing the phase sensitivity of laser-based optical reservoir computing systems,” Opt. Express 24(2), 1238 (2016). [CrossRef]

42. P. Antonik, N. Marsal, D. Brunner, and D. Rontani, “Bayesian optimisation of large-scale photonic reservoir computers,” arXiv preprint arXiv:2004.02535 (2020).

43. N. Semenova, X. Porte, L. Andreoli, M. Jacquot, L. Larger, and D. Brunner, “Fundamental aspects of noise in analog-hardware neural networks,” Chaos 29(10), 103128 (2019). [CrossRef]

44. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, Springer Series in Statistics (Springer New York Inc., New York, NY, USA, 2001).
