Optica Publishing Group

High-speed serial deep learning through temporal optical neurons

Open Access

Abstract

Deep learning can functionally mimic the human brain and has thus attracted considerable recent interest. Optics-assisted deep learning is a promising approach to improve forward-propagation speed and reduce power consumption relative to electronics-based techniques. However, present methods are based on a parallel processing approach that is inherently ineffective in dealing with the serial data signals at the core of information and communication technologies. Here, we propose and demonstrate a sequential optical deep learning concept that is specifically designed to process high-speed serial data directly. By utilizing ultra-short optical pulses as the information carriers, the neurons are distributed at different time slots in a serial pattern and interconnected with each other through group-delay dispersion. A 4-layer serial optical neural network (SONN) was constructed and trained for the classification of both analog and digital signals, with simulated accuracy rates of over 79.2% at appropriate individuality variance rates. Furthermore, we performed a proof-of-concept experiment with a pseudo-3-layer SONN to successfully recognize the ASCII codes of English letters at a data rate of 12 gigabits per second. This concept represents a novel one-dimensional realization of artificial neural networks, enabling a direct application of optical deep learning methods to the analysis and processing of serial data signals, while offering a new overall perspective for temporal signal processing.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

The framework of artificial neural networks originated from biological neural networks, which deep learning methods seek to mimic. The realization of deep learning requires a tremendous amount of data and computational resources. Fortunately, the advent of the big data field and the explosion of computing capabilities, supported by the deployment of graphics processing units (GPUs) and other parallel processing units, have enabled remarkable progress in deep learning. As a result, deep learning has been extensively employed for many important tasks in both industrial and academic settings, such as speech recognition, image classification, game playing (decision making), language translation, etc. [1–7]. However, computing speed and power consumption remain key concerns for the further development of deep learning methods based on conventional technologies.

It stands to reason that the nature of light endows photonic signal processing with broad bandwidth (i.e., potentially high processing speed), low delay/latency, low power consumption, and advanced mathematical operation capabilities [8–12]. Therefore, several different approaches have been proposed and demonstrated for the implementation of neural networks using light waves, so-called optical neural networks (ONNs), including optical reservoir computing with passive silicon photonic circuits and deep learning through optical diffraction, optical interference, multimode fibers, wavelength division multiplexing, etc. [13–17]. As shown in Fig. 1(a), all these schemes operate on parallel-pattern data in the physical space domain, which are processed through a system that emulates a traditional neural network architecture. This parallel-signal-processing structure can only be fed with data in a parallel fashion. Nevertheless, serial data are utilized across a broad range of application scenarios, such as telecommunications, sensing, lidar, radar, ultra-fast optical imaging, etc. [18–25]. Thus, the data present in these applications are inherently incompatible with current ONNs. Although serial-to-parallel conversion may be envisioned, this would increase processing latency and reduce overall system efficiency, limiting the applicability of ONN strategies to high-speed serial data signals.


Fig. 1. Principle of SONN. (a) Schematic of a conventional optical network for deep learning. The neurons are arranged abreast, and the ‘data’ flows through the network in a parallel fashion. (b) Schematic of the proposed serial optical neural network (SONN). The objects to be processed are temporal waveforms, which are sampled in amplitude by an optical pulse train (with a constant phase) and then fed into the SONN. All the neurons/pulses are distributed within a dispersive link (e.g., an optical fiber) over time and interconnected through dispersion. (c) Schematic of the interconnections between layers. The “weights” are applied across the neurons at each layer through a process of temporal complex-field (e.g., phase) modulation of the incoming temporal sequence. The blue dashed lines represent the trained phase applied to the neurons/pulses. The optical spectral components (arrows with different colors) of each pulse/neuron in layer j will be retarded or advanced with respect to each other, and then added coherently with the components of other neurons to form the new neuronal pattern in layer j+1.


In this paper, we propose an entirely new ONN scheme that is specifically conceived for processing a serial data flow. In this scheme, the serial pattern to be processed (the object) is encoded in a sequence of optical pulses, which serve as a sequential set of temporal neurons that are subsequently interconnected through group-velocity dispersion, e.g., easily implemented by linear propagation through a section of optical fiber. In this way, the optical neurons are distributed and interconnected with each other along a single channel, namely, the time domain. The desired ‘weights’ on the different neurons are realized by imposing a prescribed modulation on the coded pulses along the time domain, e.g., through widely available temporal modulation. This time-domain serial ONN (SONN) scheme is thus ideally suited to deal with serial data patterns, directly on-the-fly and in a real-time fashion. We discuss here the basic design conditions and main trade-offs of the proposed SONN scheme. Moreover, the concept is successfully validated for application on analog and digital data through numerical simulations and a proof-of-concept experiment.

2. Principle

The proposed SONN scheme is illustrated in Fig. 1(b). The input of this SONN is a continuous sequence of short optical pulses modulated by the input pattern (or object) under analysis. In other words, the data to be recognized are allocated to different timeslots in a serial optical channel, rather than to different parallel channels. For simplicity, in our following analysis, we assume that the object is a temporal intensity (or amplitude) pattern, thus involving temporal intensity modulation of the optical pulse train, though generally, the object can be a complex-field pattern. Subsequently, the modulated pulse sequence is fed into the SONN and serially processed in real-time. In particular, different consecutive time slots in the temporal trace represent the different serial neurons in the network, with each neuron occupying a single period of the original pulse train. In the SONN shown in Fig. 1(c), the neurons of consecutive layers are connected by use of group-velocity dispersion. As is well known, dispersion retards or advances the different spectral components of a propagating signal to different time slots. As shown in Fig. 1(c), when the pulse train in our problem propagates through a dispersive medium, the broadband spectral components of each pulse (arrows in different colors) will be distributed to adjacent timeslots, and add coherently. This process is used to perform the required connections among the neurons in consecutive layers. The amount of dispersion used in the scheme should be sufficiently high to ensure that the spectral content of each pulse is spread out in time beyond the number of periods (or neurons) to be connected. Under these fundamental conditions, our numerical simulations show that the performance of the proposed SONN scheme is optimized when the dispersion amount is fixed to work under the conditions needed to produce a Talbot self-imaging effect of the original pulse train [26–28] (see Supplement 1 for a detailed discussion).
Recall that temporal Talbot self-imaging can be observed when a periodic pulse train propagates through a dispersive medium that satisfies a Talbot effect condition, namely, $s/f_{rep}^{2} = 2\pi |\ddot{\beta}| L_T$, where $f_{rep}$ is the repetition rate of the input pulse train, and $|\ddot{\beta}|$, $L_T$ and $s$ are the second-order derivative of the propagation constant, the dispersive propagation length, and a positive integer number, respectively. At the dispersion lengths defined by this condition, the original periodic pulse train reproduces itself with the exact same repetition rate. As per our discussions in Supplement 1, operating at a Talbot distance ensures an optimal interaction among the consecutive coded pulses (neurons) to be interconnected so that a maximal accuracy can be obtained. Thus, the dispersion value of each layer in the network should be carefully designed to satisfy a Talbot effect condition [26].
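As a numerical sanity check, the first-order Talbot condition above can be evaluated directly. In the sketch below, the repetition rate matches the simulations in this work (5 GHz), while the $\ddot{\beta}$ value is a typical figure for standard single-mode fiber at 1550 nm, assumed here for illustration:

```python
import numpy as np

# Numerical check of the first-order temporal Talbot condition.
f_rep = 5e9        # pulse-train repetition rate [Hz]
beta2 = -21.7e-27  # group-velocity dispersion [s^2/m] (typical SMF value, assumed)
s = 1              # positive Talbot integer

# Talbot condition: s / f_rep^2 = 2*pi*|beta2|*L_T
L_T = s / (2 * np.pi * abs(beta2) * f_rep**2)   # required fibre length [m]
D_T = abs(beta2) * L_T                          # accumulated dispersion [s^2]
print(f"L_T = {L_T / 1e3:.0f} km, D_T = {D_T * 1e24:.0f} ps^2")
```

With the assumed $\ddot{\beta}$, the required length is on the order of a few hundred kilometers of standard fiber, which hints at why compact dispersive devices such as chirped fiber Bragg gratings (discussed in Section 5) are attractive in practice.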

The interconnection of the temporal neurons between neighboring layers merely through dispersive propagation is not enough to implement a deep learning process because of the lack of ‘weights’, the essential element of any neural network scheme. In our proposed system, the ‘weights’ are realized by imposing a proper temporal modulation pattern on the sequential pulses/neurons before propagating the resulting sequence through the dispersive medium. Phase modulation is used in the demonstrations reported here, though complex modulation (both amplitude and phase) is generally possible, as discussed further in Supplement 1, section 7. The resulting complex-valued ‘weights’ of the neurons are then determined by the temporal modulation function (to be customized through the training stage) and the phase imposed by the dispersive propagation process itself. As stated before, the connections among neurons are produced through coherent addition of the temporally dispersed optical spectral components of the phase-modulated neurons of the former layer, as shown in Fig. 1(c). To be more specific, each of the spectral components (colored arrows in Fig. 1(c)) in layer j is relocated to adjacent timeslots through dispersive propagation at a Talbot distance $L_T$. These components then add coherently with respective weights that depend on the trained phase applied to each neuron, forming the new neurons at the subsequent network layer. Without phase modulation, dispersion alone would only produce a temporal averaging effect on the original ‘intensity-varying’ pulses [29]. However, in our proposed scheme, the output of each layer can be controlled through the application of the trained phase profile to the temporal neurons.
By properly training the phase modulation profiles to be applied across all the layers in the network, one can customize the shape of the temporal waveform at the output of the network according to the specific features of the input data pattern (object) under analysis, implementing a desired input-to-output waveform mapping. For instance, as illustrated in Fig. 1(b), a final output waveform with a peak pulse at a designed temporal position can be obtained, with the temporal position being dependent on the specific shape of the input object. In this way, different input data patterns can be recognized through the different time positions of the peak pulses at the ONN output. Hence, through this new ONN concept, the two-dimensional structure of a traditional neural network is actually implemented in a single dimension (the temporal dimension), enabling a serial ONN architecture.

Figure 2(a) shows the model flowchart of the proposed SONN and the related functions that are critical to constructing the model. In the forward propagation, every layer comprises temporal phase modulation followed by dispersive propagation. After propagation through several layers, the final measured output waveform and the target (ideal) output waveform are used to calculate the corresponding mean square error (MSE) as the cost function. A backpropagation algorithm is employed to train the phase modulation profiles across all layers of the network to achieve the highest possible accuracy rate (see Supplement 1 for a detailed analysis of this SONN model). In what follows, we illustrate the proposed concept through a 4-layer serial neural network designed for the classification of analog and digital signals, respectively.
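In simulation, one forward-propagation layer (phase modulation followed by dispersive propagation) reduces to a pointwise phase factor and a quadratic spectral phase. The following minimal NumPy sketch illustrates this; the function name, pulse parameters and the sign convention of the dispersion transfer function are our own assumptions, not taken from the paper:

```python
import numpy as np

# Minimal sketch of one SONN forward-propagation layer: temporal phase
# modulation (the trainable 'weights') followed by dispersive propagation.
def sonn_layer(field, phase, beta2_L, dt):
    """field: complex temporal samples; phase: trainable phase profile [rad];
    beta2_L: accumulated dispersion (signed) [s^2]; dt: sample spacing [s]."""
    field = field * np.exp(1j * phase)                  # temporal phase modulation
    omega = 2 * np.pi * np.fft.fftfreq(field.size, dt)  # angular frequency grid
    H = np.exp(-0.5j * beta2_L * omega**2)              # quadratic spectral phase
    return np.fft.ifft(np.fft.fft(field) * H)           # dispersive propagation

# Example: eight Gaussian 'neuron' pulses at f_rep = 5 GHz, untrained (zero) phase
f_rep, dt = 5e9, 1e-12
t = np.arange(1600) * dt
pulses = sum(np.exp(-(t - (k + 0.5) / f_rep) ** 2 / (2 * (10e-12) ** 2))
             for k in range(8)).astype(complex)
out = sonn_layer(pulses, np.zeros_like(t), 6.37e-21, dt)
```

Since the transfer function is unit-modulus, each such layer is lossless; training would consist of optimizing the `phase` array of every layer by backpropagation through this chain.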


Fig. 2. Simulation of a 4-layer SONN for analog signals. (a) Model of the SONN consisting of forward and backward propagation. Each layer contains two operations: temporal phase modulation and dispersive propagation. The related mathematical functions are shown in the corresponding regions. Layers 2 and 3 are omitted given the similar structure of all layers. (b)–(e) ‘Feature’ waveforms of sine, square, reverse-triangle and sawtooth functions, respectively. (f)–(i) Ideal ‘label’ waveforms corresponding to Figs. 2(b)–(e). (j)–(m) Trained output waveforms corresponding to Figs. 2(b)–(e).


3. Simulation

An example of a 4-layer SONN based on the proposed architecture is first designed and validated through numerical simulations. The simulation results for the classification of analog signals are shown in Figs. 2(b)–(m). The data for training and testing were specifically designed to validate the SONN architecture and evaluate its performance. A pulse train with a repetition rate of 5 GHz is used to sample the different objects under analysis, namely, sine, square, reverse-triangle and sawtooth waves, each repeating periodically with a 3-ns period. These four patterns are first generated. Then an individuality variance rate (IVR) of 30 dB, more precisely, white Gaussian noise on the peak amplitudes, was imposed on the patterns/objects to generate and diversify the data, with the aim of generalizing the trained model to unseen data (see Supplement 1 for more information about the IVR). The ‘feature’ data (input objects to be processed) are shown in Figs. 2(b)–(e) and the ‘label’ data (corresponding ideal target output waveforms) are shown in Figs. 2(f)–(i). We generated 100 samples for each kind of analog wave (400 samples in total), of which 70% were used to train the network and 30% were used to test the training performance. A classification result is correct when the peak pulse of the output waveform matches that of its ‘label’ waveform in terms of temporal position. Given that the data rate of the phase profile is 10 Gb/s and the batch size is 6, the network was trained for 1000 epochs until the cost and accuracy trends were stable (see Supplement 1). Notably, the accuracy can reach 100% at an IVR of 30 dB (see Supplement 1 for more results). Examples of the output waveforms corresponding to some of the test data are shown in Figs. 2(j)–(m). As can be seen, the peak pulses of the trained outputs align well with those of the ‘label’ waveforms.
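The peak-position decision rule described above can be sketched as follows; the slot centres, half-width, and function name are illustrative assumptions rather than the paper's actual implementation:

```python
import numpy as np

# Hypothetical sketch of the decision rule described in the text: a test
# pattern is assigned to the class whose designated output time slot
# contains the peak of the measured waveform.
def classify_by_peak(output_power, slot_centres, slot_halfwidth):
    peak = int(np.argmax(output_power))             # sample index of the peak pulse
    dists = np.abs(np.asarray(slot_centres) - peak)
    k = int(np.argmin(dists))
    return k if dists[k] <= slot_halfwidth else -1  # -1: peak outside all slots

# Example: a peak at sample 310 falls inside class 1's slot (centre 300)
waveform = np.zeros(1000)
waveform[310] = 1.0
label = classify_by_peak(waveform, [100, 300, 500, 700], 20)
```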

The operation of the proposed SONN is based on the interference among the optical spectral components of the incoming pulses. These components cancel each other at the specific time slots for which no pulse is desired, as shown in Figs. 2(f)–(i). Nonetheless, in the resulting output waveforms, we observe the presence of undesired side peaks near the main pulse. These peaks have no negative effect on the classification accuracy. Another important issue to be considered for this optical deep learning scheme concerns the classification of successive patterns/objects. Due to the broadband nature of the light source and the dispersion used in the SONN, the optical spectral components of an incoming pattern will inevitably be stretched beyond the temporal slot allocated to this specific pattern, which may affect the classification ability for consecutive patterns/objects. However, by inserting suitable temporal gaps between the incoming patterns and employing a symmetrical dispersion strategy, consecutive patterns can be classified with the desired accuracy, as shown in Supplement 1. Symmetrical dispersion here refers to the use of dispersion amounts in neighboring layers with the same absolute value but with alternating (opposite) signs. This strategy also allows one to overcome a potential limitation in the number of layers associated with an excessive accumulation of dispersion along the system, thus helping to implement a ‘deeper’ ONN. Here, the gap in the evaluated models for classification of analog signals (results in Fig. 2) and digital signals (results in Fig. 3, described below) is fixed to 16× and 20× the pulse repetition period, respectively, and the absolute dispersion values for these two cases are $2 \times D_T$ and $3 \times D_T$, where $D_T = 1/(2\pi f_{rep}^{2})$ is the fundamental (first-order) Talbot dispersion, i.e., with s = 1, corresponding to the input pulse train at $f_{rep} = $ 5 GHz.
The additional latency caused by the time gap should be taken into consideration in evaluating the total delay needed for the analysis of each pattern. In the considered examples, an additional latency of ∼3.2 ns and ∼4 ns should be considered for the analog and digital patterns under analysis, respectively.
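The symmetrical dispersion strategy can be illustrated numerically: with no phase modulation in between, equal and opposite dispersion stages cancel exactly, which is why alternating signs keep the accumulated dispersion bounded. The sketch below (with assumed pulse and dispersion values) verifies this round trip:

```python
import numpy as np

# Sketch of the symmetrical dispersion idea: +D followed by -D (with no
# phase modulation in between) exactly restores the waveform, so the net
# accumulated dispersion stays bounded across layers.
def disperse(field, beta2_L, dt):
    omega = 2 * np.pi * np.fft.fftfreq(field.size, dt)
    return np.fft.ifft(np.fft.fft(field) * np.exp(-0.5j * beta2_L * omega**2))

dt = 1e-12
t = np.arange(2048) * dt
x_in = np.exp(-(t - t.mean()) ** 2 / (2 * (10e-12) ** 2)).astype(complex)
x_out = disperse(disperse(x_in, +6.37e-21, dt), -6.37e-21, dt)  # stretch, then undo
```

In the actual network, the trained phase modulation applied between the two stages breaks this exact cancellation, which is what makes each layer a useful transformation rather than an identity.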


Fig. 3. Simulation of a 4-layer SONN for digital signals. (a)–(d) ‘Feature’ waveforms of four English letters, ‘u’, ‘c’, ‘a’ and ‘s’, in the form of binary ASCII codes. Each bit is represented by the power of the corresponding pulse. (e)–(h) Ideal ‘label’ waveforms corresponding to Figs. 3(a)–(d). (i)–(l) Trained output waveforms corresponding to Figs. 3(a)–(d).


Next, the classification capability for digital signals is investigated. The digital signals to be recognized are the 8-bit binary ASCII codes of ‘u’, ‘c’, ‘a’ and ‘s’. Likewise, 100 samples for each letter were generated and diversified for training and testing the model. These codes are mapped onto the pulses, as shown in Figs. 3(a)–(d), through on-off-keying (OOK) intensity modulation. The corresponding ‘label’ waveforms are presented in Figs. 3(e)–(h). The model and the hyperparameters are the same as those in the classification model for analog signals. An accuracy of 100% was obtained after 800 epochs of training at an IVR of 30 dB (see Supplement 1 for more results with different IVRs). The relationship between the IVR and the accuracy rate is discussed in Supplement 1, section 6. The outputs of the test data after training are displayed in Figs. 3(i)–(l), where the peak positions also match well with the corresponding ‘label’ waveforms in Figs. 3(e)–(h).
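The OOK encoding of a letter's 8-bit ASCII code onto a pulse train, as described above, can be sketched as follows; the Gaussian pulse shape and timing parameters are our own illustrative assumptions:

```python
import numpy as np

# Sketch of the OOK encoding described in the text: each bit of a letter's
# 8-bit ASCII code gates the power of one pulse in the train.
def ascii_to_pulses(letter, f_rep=5e9, dt=1e-12, width=10e-12):
    bits = [int(b) for b in format(ord(letter), '08b')]   # MSB-first ASCII bits
    t = np.arange(len(bits) * round(1 / (f_rep * dt))) * dt
    wave = sum(b * np.exp(-(t - (k + 0.5) / f_rep) ** 2 / (2 * width ** 2))
               for k, b in enumerate(bits))               # one pulse per '1' bit
    return wave, bits

wave, bits = ascii_to_pulses('u')   # ord('u') = 117 -> bits 01110101
```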

4. Experiment

Proof-of-concept experiments were also performed to verify the feasibility of the proposed SONN model. We designed and trained a pseudo-3-layer SONN for the separate classification of two groups of letters: ‘u’ versus ‘c’ and ‘a’ versus ‘s’. The experimental setup is presented in Fig. 4(a) (see Supplement 1 for more details). To compensate for the degraded recognition capability compared with a 4-layer SONN, a layer with no trainable variables (no phase modulation) is added for improved performance (see Supplement 1). The repetition rate of the mode-locked laser (MLL) is set to 12 GHz, matched with the dispersion value (∼−850 ps/nm for the first-order Talbot condition). The ideal output waveform (not shown here) is set to be a peak pulse located on the left or right side to better discriminate the results. We trained the pseudo-3-layer ONN for 2000 and 1000 epochs, respectively, and obtained the phase modulation profiles shown inside the green arrows in Fig. 4(a) (red line for ‘u’ & ‘c’, blue line for ‘a’ & ‘s’). Notice that, unlike the symmetrical dispersion strategy used in the simulated classification setup above, the dispersion of each layer in the experimental design is fixed to $D_T$ with the same sign. The data rate of the phase modulation profiles is also reduced with respect to the design above; in particular, it is fixed to be equal to the data rate of the input data pattern. These two factors give rise to an amount of undesired noise around the peak pulse in the obtained output waveforms, affecting the classification performance of the demonstrated SONN, as detailed below.


Fig. 4. Experimental setup of a pseudo-3-layer SONN. (a) Experimental configuration of a pseudo-3-layer SONN. I: Generation and coding of the pulse train. ASG, analog signal generation; MLL, actively mode-locked laser; OBPF, optical bandpass filter; PC, polarization controller; PBERT, parallel bit error ratio tester; IM, intensity modulator; AWG, arbitrary waveform generator. II: Pseudo layer 1 and layer 2 with phase modulation of the proposed SONN. DCF, dispersion compensation fiber; OC, optical coupler; EDFA, erbium-doped fiber amplifier; OTDL, optical tunable delay line; PM, phase modulator. Pseudo layer 1 contains no trainable variables (i.e., no modulation phase) and is added to the system to obtain a better classification performance (see Supplement 1). III: Layer 3 of the proposed SONN. TDCM, tunable dispersion compensation module; PD, photodetector; OSC, oscilloscope. The temporal waveforms inside the green arrows in II and III are applied to the phase modulators. Dashed lines indicate synchronization between different signal sources. (b) Trained phase profiles applied to the phase modulators in the experiments. The red and blue waveforms are for the ‘u’ & ‘c’ and ‘a’ & ‘s’ classifications, respectively. The waveforms of the first and second rows are generated for layers 2 and 3, respectively.


Figures 5(a) and 5(c) show the input waveforms corresponding to the English letters ‘u’ & ‘c’ and ‘a’ & ‘s’, respectively. After carefully synchronizing the optical pulses with the applied trained phase (shown in Fig. 4(b); the red line for ‘u’ & ‘c’ and the blue line for ‘a’ & ‘s’), the two pairs of English letters are well classified by the SONN. The results are shown in Figs. 5(b) and 5(d), where the peak pulses appear at the specific prescribed time positions in both the simulation and experimental results. The mismatch between the simulation and measurement results with regard to the noise observed around the peak pulse is attributed to the instability of the laser and electrical signal sources, as well as the unavoidable noise induced by environmental fluctuations along the fiber devices. This phenomenon was verified in our simulations by changing the IVR of the input signals. Nevertheless, the noise of the incoming signals does not affect the temporal position of the peak pulse, which demonstrates the classification ability of the proposed method under practical conditions. Furthermore, the reconfigurability of the proposed system is confirmed by the classification of two separate groups of objects: by simply changing the phase profile applied to the phase modulators (see the respective phase profiles in Fig. 4(b)), we can fulfill different tasks using the same hardware setup. In all evaluated cases, the experimental results agree well with the simulated ones, thus confirming the feasibility of the proposed serial optical deep learning technique.


Fig. 5. Comparison between experimental and simulated results of a pseudo-3-layer SONN. (a) ‘Feature’ waveforms corresponding to ‘u’ and ‘c’ of the input layer in the simulation (blue, top plots) and experiment (red, bottom plots). (b) Final output waveforms corresponding to (a) in the simulation (blue, top plots) and experiment (red, bottom plots). (c) ‘Feature’ waveforms corresponding to ‘a’ and ‘s’ of the input layer in the simulation (blue, top plots) and experiment (red, bottom plots). (d) Final output waveforms corresponding to (c) in the simulation (blue, top plots) and experiment (red, bottom plots). These results were obtained by inputting a single isolated pattern to the SONN at a time.


5. Discussion

A recurrent neural network (RNN) is a kind of artificial neural network for processing temporal sequences, but the proposed SONN is not an RNN. In an RNN, the serial inputs are inherently temporally correlated, and the network learns to predict the next input from previous inputs. In contrast, the input data or patterns of the SONN are temporally independent, and the output of the SONN is predicted based solely on the current input. Hence, the proposed SONN is essentially different from an RNN. Instead, it is a traditional fully-connected neural network, since all the neurons of each layer are connected to each of the neurons in the next layer.

It is inherently challenging for traditional ONNs to implement nonlinear ‘activation’ functions for every neuron because of the side-by-side arrangement of neurons [8–11]. In other words, each neuron in every layer would need to be provided with a separate nonlinear activation device, which could increase complexity. In contrast, the proposed SONN is ideally suited for the realization of activation functions, as all neurons pass through the same single optical fiber. Thus, only one nonlinear activation device would be required for each neural network layer. In practice, the activation functions of this system could be achieved by simply placing one saturable absorber or semiconductor optical amplifier after each layer. Although the classification capability of the SONN is limited due to the absence of activation functions, a better performance could be expected from the incorporation of activation functions [30–32]. In addition, the symmetrical dispersion strategy should facilitate the practical realization of the proposed SONN scheme, e.g., through the use of linearly chirped fiber Bragg gratings (LCFBGs) for implementation of the dispersive media in the network. This is so because a single LCFBG can provide a specific dispersion from one port of the fiber grating, as well as the same amount of dispersion but with the opposite sign from the other port. Thus, a single LCFBG could be used to implement the needed dispersion lines in two neighboring layers, with the grating operated from its two different ports. This strategy would also ensure a higher dispersion precision than using two separate dispersive media in each pair of consecutive network layers.
Furthermore, the use of an LCFBG as the dispersive medium can also significantly reduce the signal processing time (i.e., latency) and the optical loss, and contribute to implementing a ‘deeper’ neural network, thanks to the shorter physical length of the grating compared with other dispersion devices (e.g., a section of optical fiber). However, the dispersive media and modulators will inevitably induce optical loss, which makes this technique not as power-efficient as current ONN techniques and increases the difficulty of integrating the whole system. Additionally, it is important to note that the proposed SONN system requires very strict synchronization between the input pulse train and the electric phase signals, as well as for the output pattern identification. These two issues become more severe for a ‘deeper’ SONN. Fortunately, precise temporal alignment techniques down to the attosecond regime [33,34] are currently available for application in the proposed scheme. Moreover, in addition to the application as a classifier for analog or digital signals demonstrated here, the introduced SONN scheme also holds potential for the implementation of any signal-processing functionality based on a prescribed input-to-output mapping, such as reconfigurable pulse repetition rate conversion, extraction of specific waveforms, etc.
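As a rough illustration of the saturable-absorber activation suggested above, a simple power-dependent transmission model (a generic textbook form, not the measured response of any specific device in this work) behaves as a soft optical nonlinearity:

```python
import numpy as np

# Generic saturable-absorber activation model: low powers see strong
# absorption, while high powers bleach the absorber, yielding a smooth
# nonlinear input-output mapping. Parameters are illustrative.
def saturable_absorber(power, alpha0=0.5, p_sat=1.0):
    """alpha0: unsaturated loss; p_sat: saturation power (same units as power)."""
    transmission = 1.0 - alpha0 / (1.0 + power / p_sat)  # saturable loss
    return power * transmission

powers = np.linspace(0.0, 10.0, 5)
outputs = saturable_absorber(powers)
```

Because every temporal neuron passes through the same device in sequence, a single such element per layer would apply this activation to all neurons, in contrast to the per-neuron devices needed in spatially parallel ONNs.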

6. Conclusions

In this paper, we have proposed and demonstrated a SONN concept that can process serial data in a direct and real-time fashion, avoiding the need for serial-to-parallel conversion. The concept relies on a novel deep learning strategy that exploits sequential temporal optical neurons, which are suitably weighted through temporal phase modulation and interconnected through group-velocity dispersion. We have provided the design conditions and trade-offs of this novel ONN scheme, and have successfully demonstrated the recognition of analog and digital serial data signals within one temporal channel. The proposed SONN is compatible with most current time-domain data processing systems, and the data are processed directly in an on-the-fly and real-time manner. By adequately training the phase profiles of the neurons, the ‘features’ (input data patterns) are well mapped into the desired prescribed ‘labels’ (output waveforms). We have reported a proof-of-concept experimental demonstration of a pseudo-3-layer SONN, which can recognize the ASCII codes of English letters at a data rate of 12 gigabits per second. Furthermore, the proposed SONN scheme is fully reconfigurable for different tasks and can be readily extended for a simple realization of activation functions. The proposed technique provides a novel route to optical neural networks, and it also offers a fresh overall perspective for temporal signal processing.

Funding

National Key Research and Development Program of China (2020AAA0130301, 2018YFB2201902); National Natural Science Foundation of China (61925505, 62075212).

Disclosures

The authors declare no conflicts of interest.

Supplemental document

See Supplement 1 for supporting content.

References

1. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521(7553), 436–444 (2015). [CrossRef]  

2. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, (Curran Associates, Inc., 2012), pp. 1097–1105.

3. D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature 529(7587), 484–489 (2016). [CrossRef]  

4. K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation,” arXiv:1406.1078 [cs, stat] (2014).

5. K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev, and A. Walsh, “Machine learning for molecular and materials science,” Nature 559(7715), 547–555 (2018). [CrossRef]  

6. P. Márquez-Neila, C. Fisher, R. Sznitman, and K. Heng, “Supervised machine learning for analysing spectra of exoplanetary atmospheres,” Nat. Astron. 2(9), 719–724 (2018). [CrossRef]  

7. M. Molla, M. Waddell, D. Page, and J. Shavlik, “Using machine learning to design and interpret gene-expression microarrays,” 37 (n.d.).

8. M. Nazirzadeh, M. Shamsabardeh, and S. J. Ben Yoo, “Energy-Efficient and High-Throughput Nanophotonic Neuromorphic Computing,” in Conference on Lasers and Electro-Optics (OSA, 2018), p. ATh3Q.2.

9. R. Hamerly, L. Bernstein, A. Sludds, M. Soljačić, and D. Englund, “Large-Scale Optical Neural Networks Based on Photoelectric Multiplication,” Phys. Rev. X 9, 021032 (2019). [CrossRef]  

10. M. He, M. Xu, Y. Ren, J. Jian, Z. Ruan, Y. Xu, S. Gao, S. Sun, X. Wen, L. Zhou, L. Liu, C. Guo, H. Chen, S. Yu, L. Liu, and X. Cai, “High-performance hybrid silicon and lithium niobate Mach–Zehnder modulators for 100 Gbit s −1 and beyond,” Nat. Photonics 13(5), 359–364 (2019). [CrossRef]  

11. M. A. Muriel, J. Azaña, and A. Carballar, “Real-time Fourier transformer based on fiber gratings,” Opt. Lett. 24(1), 1 (1999). [CrossRef]  

12. R. Slavík, Y. Park, M. Kulishov, R. Morandotti, and J. Azaña, “Ultrafast all-optical differentiators,” Opt. Express 14(22), 10699 (2006). [CrossRef]  

13. K. Vandoorne, P. Mechet, T. Van Vaerenbergh, M. Fiers, G. Morthier, D. Verstraeten, B. Schrauwen, J. Dambre, and P. Bienstman, “Experimental demonstration of reservoir computing on a silicon photonics chip,” Nat. Commun. 5(1), 3541 (2014). [CrossRef]  

14. X. Lin, Y. Rivenson, N. T. Yardimci, M. Veli, Y. Luo, M. Jarrahi, and A. Ozcan, “All-optical machine learning using diffractive deep neural networks,” Science 361(6406), 1004–1008 (2018). [CrossRef]  

15. Y. Shen, N. C. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, and D. Englund, “Deep learning with coherent nanophotonic circuits,” Nat. Photonics 11(7), 441–446 (2017). [CrossRef]  

16. J. Bueno, S. Maktoobi, L. Froehly, I. Fischer, M. Jacquot, L. Larger, and D. Brunner, “Reinforcement learning in a large-scale photonic recurrent neural network,” Optica 5(6), 756 (2018). [CrossRef]  

17. J. Feldmann, N. Youngblood, C. D. Wright, H. Bhaskaran, and W. H. P. Pernice, “All-optical spiking neurosynaptic networks with self-learning capabilities,” Nature 569(7755), 208–214 (2019). [CrossRef]  

18. K. Goda, K. K. Tsia, and B. Jalali, “Serial time-encoded amplified imaging for real-time observation of fast dynamic phenomena,” Nature 458(7242), 1145–1149 (2009). [CrossRef]  

19. P. V. Kelkar, F. Coppinger, A. S. Bhushan, and B. Jalali, “Time domain optical sensing,” in 1999 IEEE LEOS Annual Meeting Conference Proceedings (LEOS’99), 12th Annual Meeting of the IEEE Lasers and Electro-Optics Society (Cat. No. 99CH37009) (1999), Vol. 1, pp. 381–382.

20. K. Goda, A. Ayazi, D. R. Gossett, J. Sadasivam, C. K. Lonappan, E. Sollier, A. M. Fard, S. C. Hur, J. Adam, C. Murray, C. Wang, N. Brackbill, D. D. Carlo, and B. Jalali, “High-throughput single-microparticle imaging flow analyzer,” Proc. Natl. Acad. Sci. U. S. A. 109(29), 11630–11635 (2012). [CrossRef]  

21. C.-H. Cheng, C.-Y. Chen, J.-D. Chen, D.-K. Pan, K.-T. Ting, and F.-Y. Lin, “3D pulsed chaos lidar system,” Opt. Express 26(9), 12230–12241 (2018). [CrossRef]  

22. C. V. Poulton, A. Yaacobi, D. B. Cole, M. J. Byrd, M. Raval, D. Vermeulen, and M. R. Watts, “Coherent solid-state LIDAR with silicon photonic optical phased arrays,” Opt. Lett. 42(20), 4091–4094 (2017). [CrossRef]  

23. P. Ghelfi, F. Laghezza, F. Scotti, G. Serafino, A. Capria, S. Pinna, D. Onori, C. Porzi, M. Scaffardi, A. Malacarne, V. Vercesi, E. Lazzeri, F. Berizzi, and A. Bogoni, “A fully photonics-based coherent radar system,” Nature 507(7492), 341–345 (2014). [CrossRef]  

24. D. S. Bykov, O. A. Schmidt, T. G. Euser, and P. S. J. Russell, “Flying particle sensors in hollow-core photonic crystal fibre,” Nat. Photonics 9(7), 461–465 (2015). [CrossRef]  

25. C. Haffner, W. Heni, Y. Fedoryshyn, J. Niegemann, A. Melikyan, D. L. Elder, B. Baeuerle, Y. Salamin, A. Josten, U. Koch, C. Hoessbacher, F. Ducry, L. Juchli, A. Emboras, D. Hillerkuss, M. Kohl, L. R. Dalton, C. Hafner, and J. Leuthold, “All-plasmonic Mach–Zehnder modulator enabling optical high-speed communication at the microscale,” Nat. Photonics 9(8), 525–528 (2015). [CrossRef]  

26. J. Azaña and M. A. Muriel, “Temporal self-imaging effects: theory and application for multiplying pulse repetition rates,” IEEE J. Sel. Top. Quantum Electron. 7(4), 728–744 (2001). [CrossRef]  

27. H. F. Talbot, “LXXVI. Facts relating to optical science. No. IV,” London Edinburgh Philos. Mag. J. Sci. 9, 401–407 (1836). [CrossRef]  

28. B. H. Kolner, “Space-time duality and the theory of temporal imaging,” IEEE J. Quantum Electron. 30(8), 1951–1963 (1994). [CrossRef]  

29. Z. Lin, S. Sun, W. Li, N. Zhu, and M. Li, “Temporal Cloak Without Synchronization,” IEEE Photonics Technol. Lett. 31(5), 373–376 (2019). [CrossRef]  

30. K. Hornik, “Approximation capabilities of multilayer feedforward networks,” Neural Networks 4(2), 251–257 (1991). [CrossRef]  

31. G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Extreme learning machine: Theory and applications,” Neurocomputing 70(1-3), 489–501 (2006). [CrossRef]  

32. M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, “Multilayer feedforward networks with a nonpolynomial activation function can approximate any function,” Neural Networks 6(6), 861–867 (1993). [CrossRef]  

33. F. Riehle, “Optical clock networks,” Nat. Photonics 11(1), 25–31 (2017). [CrossRef]  

34. J. Kim and F. X. Kärtner, “Attosecond-precision ultrafast photonics,” Laser Photonics Rev. 4(3), 432–456 (2010). [CrossRef]  

Supplementary Material (1)

Supplement 1: Supplemental Document

Figures (5)

Fig. 1. Principle of SONN. (a) Schematic of a conventional optical network for deep learning. The neurons are arranged side by side, and the ‘data’ flows through the network in parallel. (b) Schematic of the proposed serial optical neural network (SONN). The objects to be processed are temporal waveforms, which are sampled in amplitude by an optical pulse train (with a constant phase) and then fed into the SONN. All the neurons/pulses are distributed over different time slots within a dispersive link (e.g., an optical fiber) and interconnected through dispersion. (c) Schematic of the interconnections between layers. The “weights” are applied across the neurons at each layer through temporal complex-field (e.g., phase) modulation of the incoming temporal sequence. The blue dashed lines represent the trained phase applied to the neurons/pulses. The optical spectral components (arrows with different colors) of each pulse/neuron in layer j are retarded or advanced with respect to each other, and then added coherently with the components of other neurons to form the new neuronal pattern in layer j+1.
Fig. 2. Simulation of a 4-layer SONN for analog signals. (a) Model of the SONN, consisting of forward and backward propagation. Each layer contains two operations: temporal phase modulation and dispersive propagation. The related mathematical functions are shown in the corresponding regions. Layers 2 and 3 are omitted, since all layers share the same structure. (b–e) ‘Feature’ waveforms of sine, square, reverse-triangle, and sawtooth functions, respectively. (f–i) Ideal ‘label’ waveforms corresponding to Fig. 2(b)–(e). (j–m) Trained output waveforms corresponding to Fig. 2(b)–(e).
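The two per-layer operations named in the Fig. 2 caption (temporal phase modulation followed by dispersive propagation) can be sketched numerically with an FFT-based model. This is a minimal illustration only: the pulse width, pulse spacing, and group-delay-dispersion value below are assumptions for the example, not the parameters used in the paper.

```python
import numpy as np

def sonn_layer(field, phase, beta2_L, dt):
    """One SONN layer: temporal phase modulation, then dispersive propagation.

    field   : complex temporal envelope (the serial neurons/pulses)
    phase   : trainable temporal phase profile (radians), one value per sample
    beta2_L : accumulated group-delay dispersion beta2*L in s^2
    dt      : sampling interval in s
    """
    # Operation 1: apply the trained temporal phase profile.
    modulated = field * np.exp(1j * phase)
    # Operation 2: dispersive propagation as a quadratic spectral phase,
    # exp(-i * beta2*L * omega^2 / 2), applied in the frequency domain.
    omega = 2 * np.pi * np.fft.fftfreq(field.size, d=dt)
    spectrum = np.fft.fft(modulated)
    return np.fft.ifft(spectrum * np.exp(-0.5j * beta2_L * omega**2))

# Example: five Gaussian pulses (neurons) on a 1 ps grid, 50 ps apart.
N, dt = 1024, 1e-12
t = (np.arange(N) - N // 2) * dt
pulses = sum(np.exp(-((t - k * 50e-12) / 5e-12) ** 2) for k in (-2, -1, 0, 1, 2))
pulses = pulses.astype(complex)

# Untrained layer (zero phase) with an illustrative GDD of -100 ps^2:
# dispersion spreads each pulse so that neighbors overlap and interfere,
# which is how the interlayer connections are formed.
out = sonn_layer(pulses, phase=np.zeros(N), beta2_L=-1e-22, dt=dt)
```

Because both operations are unitary, the total pulse energy is conserved; the dispersion redistributes each neuron’s field across neighboring time slots, which realizes the interconnections between layers described in the caption.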
Fig. 3. Simulation of a 4-layer SONN for digital signals. (a–d) ‘Feature’ waveforms of four English letters (‘u’, ‘c’, ‘a’, and ‘s’) in binary ASCII code. Each bit is represented by the power of a pulse. (e–h) Ideal ‘label’ waveforms corresponding to Fig. 3(a)–(d). (i–l) Trained output waveforms corresponding to Fig. 3(a)–(d).
Fig. 4. Experimental setup of a pseudo-3-layer SONN. (a) Experimental configuration of a pseudo-3-layer SONN. I: Generation and coding of the pulse train. ASG, analog signal generation; MLL, actively mode-locked laser; OBPF, optical bandpass filter; PC, polarization controller; PBERT, parallel bit error ratio tester; IM, intensity modulator; AWG, arbitrary waveform generator. II: Pseudo layer 1 and layer 2 with phase modulation of the proposed SONN. DCF, dispersion compensation fiber; OC, optical coupler; EDFA, erbium-doped fiber amplifier; OTDL, optical tunable delay line; PM, phase modulator. Pseudo layer 1 contains no trainable variables (i.e., no modulation phase) and is added to the system to obtain a better classification performance (see Supplement 1). III: Layer 3 of the proposed SONN. TDCM, tunable dispersion compensation module; PD, photodetector; OSC, oscilloscope. The temporal waveforms inside the green arrows in II and III are applied to the phase modulators. Dashed lines indicate synchronization between different signal sources. (b) Trained phase profiles applied to the phase modulators in the experiments. The red and blue waveforms are for the ‘u’ & ‘c’ and ‘a’ & ‘s’ classifications, respectively. The waveforms in the first and second rows are generated for layers 2 and 3, respectively.
Fig. 5. Comparison between experimental and simulated results of a pseudo-3-layer SONN. (a) ‘Feature’ waveforms corresponding to ‘u’ and ‘c’ of the input layer in the simulation (blue, top plots) and experiment (red, bottom plots). (b) Final output waveforms corresponding to (a) in the simulation (blue, top plots) and experiment (red, bottom plots). (c) ‘Feature’ waveforms corresponding to ‘a’ and ‘s’ of the input layer in the simulation (blue, top plots) and experiment (red, bottom plots). (d) Final output waveforms corresponding to (c) in the simulation (blue, top plots) and experiment (red, bottom plots). These results were obtained by inputting a single isolated pattern to the SONN at a time.