Reconfigurable all-optical nonlinear activation functions for neuromorphic photonics

Aashu Jha; Chaoran Huang; Paul R. Prucnal

doi:10.1364/OL.398234

Optics Letters, Vol. 45, Issue 17, pp. 4819-4822 (2020)

Open Access

About this Article
History
- Original Manuscript: June 11, 2020
- Revised Manuscript: July 21, 2020
- Manuscript Accepted: July 26, 2020
- Published: August 26, 2020
Virtual Issues
September 14, 2020 Spotlight on Optics

Abstract

We experimentally demonstrate all-optical reconfigurable nonlinear activation functions in a cavity-loaded Mach–Zehnder interferometer device on a silicon photonics platform, via the free-carrier dispersion effect. Our device is programmable to generate various nonlinear activation functions, including sigmoid, radial-basis, clamped rectified linear unit, and softplus, with tunable thresholds. We simulate benchmark tasks such as XOR and MNIST handwritten digit classifications with experimentally measured activation functions and obtain accuracies of 100% and 94%, respectively. Our device can serve as nonlinear units in photonic neural networks, while its nonlinear transfer function can be flexibly programmed to optimize the performance of different neuromorphic tasks.

The appeal of neuromorphic photonics rests primarily on the enviable speed and energy efficiency of photonics relative to digital electronics. A neuron consists of linear weighting and summing followed by a nonlinear activation. Interference in the optical domain naturally lends itself to highly efficient optical matrix–vector multiplication. Wavelength-division multiplexed (WDM) [1] as well as coherent [2] architectures have already been demonstrated. A nonlinear activation function is a critical unit of a neuron. Several nonlinear unit designs have been studied based on a silicon microring modulator [3], Mach–Zehnder modulator (MZM) [4], spiking laser [5], electro-absorption modulator [6], etc. Programmable electro-optical nonlinearity has been demonstrated in Refs. [3,4]. However, such schemes require efficient optical–electrical conversion, which needs either highly efficient modulators or transimpedance gain [7]. The latter typically calls for heterogeneous integration with CMOS electronics, which adds to the system complexity. An alternative approach is to implement the nonlinearity purely in the optical domain. With the advent of ultrafast, low-energy optical switching, enabling sub-picosecond switching times and switching energies on the order of femtojoules per bit [8], all-optical signal processing fares remarkably well against hybrid approaches and thus deserves attention. Recent experimental work targeting all-optical nonlinear functionality for neurons has been implemented in free space [9] and using discrete components [6,10], albeit with fixed nonlinearity. Here, we demonstrate programmable all-optical nonlinear activation functions on an integrated device, which, to our knowledge, has not been demonstrated before.

Silicon-on-insulator (SOI) is a desirable platform candidate given its ubiquity in state-of-the-art photonic integrated circuits (PICs) and the nonlinear properties of silicon, discussed in Ref. [11]. The dominant nonlinearity in silicon waveguides is the free-carrier dispersion (FCD) effect, making it the enabling nonlinearity of our device. FCD has been extensively applied in switching, thresholding [12], signal regeneration [13], etc. Its application in all-optical neurons has been theoretically proposed [14] but, to the best of our knowledge, has not been experimentally demonstrated to date.

Here, we experimentally demonstrate an all-optical neuron unit, via the FCD effect, with programmable nonlinear activation functions. Our device is based on a cavity-loaded interferometer, where the cavity buildup, allowing efficient nonlinear operation via FCD, paired with the tuning biases on the interferometers enables programming both the shape and threshold of the nonlinear activation functions. We experimentally demonstrate nonlinear transfer functions, resembling sigmoid, clamped rectified linear unit (ReLU), radial-basis, and softplus, and theoretically verify these results. Such functions have numerous applications—sigmoid shapes are used in Hopfield recurrent networks [15], ReLU in convolutional networks [16], radial basis functions in support vector machines [17], and softplus in deep learning networks [18], among others. Finally, we also demonstrate benchmark machine learning tasks, such as a binary exclusive-OR (XOR) task and a multi-class MNIST handwritten digit classification task, with accuracies of 100% and 94%, respectively, using our experimental activation functions.

A neuron node comprises weighting, summation, and nonlinear activation [Fig. 1(a)]. In this work, we demonstrate all-optical nonlinear activation functions utilizing the FCD effect in silicon. Our device, as illustrated in Fig. 1(b), is composed of a microring resonator (MRR) loaded on an arm of a Mach–Zehnder interferometer (MZI) after a Mach–Zehnder coupler (MZC). This design was first proposed in Ref. [19] for optical thresholding, and later extended to other applications [14]. The MRR cavity buildup triggers FCD, which manifests as a nonlinear phase response to optical power. This nonlinear phase translates to nonlinear transmission via the MZI. The MZC behaves as a tunable coupler, allowing further control of the interference at the MZI output. Thermo-optic heaters on the MRR, MZI, and MZC allow for tuning amplitude and phase biases of the device, which together enable the programmability of the shape and threshold of the device’s nonlinear transfer functions.
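To make the phase-to-transmission step concrete, the following Python sketch evaluates textbook two-beam interference for a given MRR phase and transmission. It is an idealization assumed here for illustration, not the device model used in this work: the couplers are taken as lossless, the output combiner as an ideal 50:50 coupler, and `r` is interpreted as the fraction of power the MZC sends to the MRR-loaded arm. The function and argument names are hypothetical.

```python
import cmath
import math


def device_output_power(p_in, r, t_mrr, phi_mrr, phi_mzi):
    """Two-beam interference at the MZI output (idealized, lossless couplers).

    p_in    : input optical power (arbitrary units)
    r       : fraction of power the MZC sends to the MRR-loaded arm (0..1)
    t_mrr   : power transmission of the MRR (0..1)
    phi_mrr : phase imparted by the MRR, including the FCD-induced part (rad)
    phi_mzi : bias phase on the other arm, set by the MZI heater (rad)
    """
    # Field amplitudes in the two arms after the tunable coupler.
    arm_mrr = math.sqrt(r * t_mrr) * cmath.exp(1j * phi_mrr)
    arm_ref = math.sqrt(1.0 - r) * cmath.exp(1j * phi_mzi)
    # An ideal 50:50 combiner sends (arm_mrr + arm_ref) / sqrt(2) to one port.
    field_out = (arm_mrr + arm_ref) / math.sqrt(2.0)
    return p_in * abs(field_out) ** 2


if __name__ == "__main__":
    # Sweep the MRR phase at a balanced split to trace the interference fringe.
    for k in range(5):
        phi = k * math.pi / 4
        p_out = device_output_power(1.0, 0.5, 1.0, phi, 0.0)
        print(f"phi_mrr = {phi:.2f} rad -> P_out = {p_out:.3f}")
```

In this picture, a power-dependent `phi_mrr` and `t_mrr` turn the fixed fringe into a nonlinear transfer function, while `r` and `phi_mzi` play the role of the heater-controlled biases.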

Fig. 1. (a) Schematic of a neuron, showing weighting of inputs $[{x_1},{x_2},\ldots,{x_n}]$, summing, and nonlinear operation resulting in output, $y$. (b) Schematic of the device, comprising an MRR loaded on an MZI, with a preceding MZC. (c) Optical micrograph of the fabricated device on an SOI platform. The $Q$ factor of the fabricated MRR was ${10^4}$. The resonance of the MRR on the second arm of the MZI was tuned far from the operating range of wavelength in this work.

We simulated the dynamic behavior of the device to understand the operating principle and distinguish the parameter regimes necessary for a given nonlinear function. We used the coupled-mode theory model shown in Ref. [20] that incorporates all nonlinear effects in a silicon MRR. The governing dynamic equations of the normalized complex amplitude of light in the MRR cavity, $a$, and the number of free carriers generated, $n$, as prescribed by the model are

(1)$$\begin{split}{\delta a/\delta t}&= i\left(\delta \omega - {n_{{\rm kerr}}}|a|^2 + {\sigma _{{\rm fcd}}}{\alpha _{{\rm TPA}}}n\right)a\\&\quad - \left(1 + {\alpha _{{\rm TPA}}}|a|^2 + {\gamma _{{\rm FCA}}}{\alpha _{{\rm TPA}}}n\right)a + \sqrt {{\gamma _p}{P_{{\rm IN}}}(t)},\\{\delta n/\delta t}& = |a|^4 - n/\tau,\end{split}$$

where $\delta \omega$ is the detuning between the frequency of the input signal and that of the MRR resonance (${\omega _0}$); $t$ represents time normalized with respect to the quantity $\Gamma _0^{- 1} = 2{Q_L}/{\omega _0}$, where ${Q_L}$ is the MRR loaded quality factor. ${n_{{\rm kerr}}},{\sigma _{{\rm fcd}}},{\alpha _{{\rm TPA}}},{\gamma _{{\rm FCA}}},{\gamma _p}$ correspond to nonlinear parameters, discussed in detail in Ref. [12]. The nonlinearity in the MRRs is supplemented by linear operation in the MZI and MZC. While the nonlinearity results primarily from the power-dependent nonlinear phase change due to the FCD effect, the tuning heaters that change the coupling ratio and thus the interference condition are equally important to achieve the programmability of the nonlinear transfer function. We simulated this reconfigurability by tuning the following model parameters: $r$, which corresponds to the coupling ratio of the MZC, and $\Delta \lambda$, which is the wavelength detuning between the input signal and the MRR resonance.
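As a rough illustration of how Eq. (1) can be stepped forward in time, the Python sketch below integrates both equations with forward Euler for a constant input power. The integration scheme is an assumption made here (the solver used for the simulations is not stated), and every default coefficient value is a placeholder chosen for numerical convenience, not a parameter from Ref. [12] or the fabricated device. The function name `simulate_mrr` is hypothetical.

```python
import math


def simulate_mrr(p_in, delta_omega=-1.0, n_kerr=0.1, sigma_fcd=5.0,
                 alpha_tpa=0.1, gamma_fca=0.2, gamma_p=1.0, tau=10.0,
                 dt=0.01, t_end=100.0):
    """Integrate Eq. (1) with forward Euler for a constant input power.

    All quantities are in the normalized units of the model; the default
    coefficient values are placeholders, not fitted device parameters.
    Returns the final (a, n): complex cavity amplitude and carrier number.
    """
    a = 0j
    n = 0.0
    drive = math.sqrt(gamma_p * p_in)
    for _ in range(int(round(t_end / dt))):
        intensity = abs(a) ** 2
        # Detuning plus Kerr and free-carrier-dispersion frequency shifts.
        shift = delta_omega - n_kerr * intensity + sigma_fcd * alpha_tpa * n
        # Linear loss plus two-photon and free-carrier absorption.
        loss = 1.0 + alpha_tpa * intensity + gamma_fca * alpha_tpa * n
        da = (1j * shift - loss) * a + drive
        dn = intensity ** 2 - n / tau
        a += dt * da
        n += dt * dn
    return a, n


if __name__ == "__main__":
    # Sweep the input power and report the cavity energy and carrier number.
    for p in (0.1, 0.3, 0.5, 1.0):
        a, n = simulate_mrr(p)
        print(f"P_IN = {p:.1f}: |a|^2 = {abs(a) ** 2:.4f}, n = {n:.4f}")
```

With all nonlinear coefficients set to zero the loop relaxes to the linear steady state $a = \sqrt{{\gamma _p}{P_{{\rm IN}}}}/(1 - i\delta \omega)$, which offers a quick sanity check before the nonlinear terms are switched on.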

Figures 2(a)–2(d).(i) show the nonlinear transfer functions (left column), the MRR-induced nonlinear phase change (${\Phi _{{\rm MRR}}}$), and MRR transmission (${T_{{\rm MRR}}}$) (right column) as a function of input power. The insets illustrate the relative wavelength detuning between input light and MRR resonance at the initial and final conditions. A clamped ReLU, i.e., ReLU with saturation [Fig. 2(a).(i)], occurs when the signal is red-detuned at the initial point. The output power initially increases linearly with the input, while generating free carriers that induce a proportional nonlinear phase change [Fig. 2(a).(ii)]. At the maximum of ${\Phi _{{\rm MRR}}}$, the MRR resonance blue shifts, resulting in less coupling into the MRR, and hence decreased nonlinear phase. A reduction in ${\Phi _{{\rm MRR}}}$ means a reduced phase offset between the two MZI arms. As a consequence, the output power stays constant as the input power increases further. The slope of the linear region as well as the flatness of the one-level can be tuned by adjusting the MZC bias parameter, $r$.

Fig. 2. (a)–(d).(i) Simulated nonlinear transfer functions, i.e., device output power (${P_{{\rm OUT}}}$) versus input power (${P_{{\rm IN}}}$). Insets are illustrations of the wavelength detuning of the MRR resonance (green) relative to input signal (yellow) at the initial and final powers, as indicated by the arrows. (a)–(d).(ii) MRR transmission, ${T_{{\rm MRR}}}$, (solid blue, left axis), and nonlinear FCD-induced phase change, ${\Phi _{{\rm MRR}}}$, (dashed red, right axis) as a function of input power corresponding to each transfer function type.

A sigmoid-like function occurs when the signal is slightly blue-detuned to the resonance [Fig. 2(b).(i)]. Initially, an increase in ${P_{{\rm IN}}}$ is negated by an increase in ${\Phi _{{\rm MRR}}}$, resulting in constant ${P_{{\rm OUT}}}$. When enough carriers accumulate and the MRR resonance aligns to the signal as a result of FCD-induced blue shift, there is a sharp decline in ${\Phi _{{\rm MRR}}}$, as well as a dip in ${T_{{\rm MRR}}}$. As the input power continues to increase, the signal is off-resonance to the right, and ${\Phi _{{\rm MRR}}}$ reduces due to less light being coupled into the MRR. An oscillation is seen in the output power, resulting from oscillation in ${T_{{\rm MRR}}}$ and reduced ${\Phi _{{\rm MRR}}}$ as the system stabilizes.

A radial-basis-like function is obtained when the signal is further blue-detuned to the resonance [Fig. 2(c).(i)]. Here, the output power increases linearly with the input power, with minimal change in ${\Phi _{{\rm MRR}}}$. At the threshold, the MRR resonance blue shifts to the signal, as illustrated by sharp transitions in ${\Phi _{{\rm MRR}}}$ and ${T_{{\rm MRR}}}$. An oscillation in ${P_{{\rm OUT}}}$ is seen after the transition point, similar to the sigmoid case. This feature was found to be largely suppressed in the experiment (to be discussed later), which can be attributed to having access to an additional variable, i.e., heater bias on the MZI arm, resulting in better interference control at the MZI output. The signal moves to the right of the resonance as ${P_{{\rm IN}}}$ increases further. Figure 2(d).(i) shows a softplus-like function, obtained when the signal is at the edge of the MRR resonance. Transmission here increases linearly with input power with negligible change in ${\Phi _{{\rm MRR}}}$. Given the higher detuning, more input power is required to cause sufficient FCD-induced blue shift for the MRR resonance and signal to align. At this point, a sharp change in ${\Phi _{{\rm MRR}}}$ occurs, accompanied by a sharp transition to higher output power. The MZC bias parameter adjusts the slope of the linear region.

Fig. 3. Schematic of the experimental setup. DFB, distributed feedback laser; SG, signal generator; MZM, Mach–Zehnder modulator; EDFA, erbium-doped fiber amplifier; BPF, bandpass filter; CS, current source; OSC, sampling oscilloscope; PD, photodetector; OSA, optical spectrum analyzer.

For the experimental demonstration, our device was fabricated at the Advanced Micro Foundry, Singapore, on an SOI substrate. The optical components of the device reside on a 220 nm Si layer beneath a 3 µm oxide cladding, and the electrical components on a TiN heater layer and Al routing layer. The schematic of the experimental setup is shown in Fig. 3. The data from a signal generator, a sine wave at 500 Mbps, are first modulated by an MZM, and then carved out by a square wave (1 Gbps) at a second MZM. The optical signal is then amplified by an erbium-doped fiber amplifier, while a bandpass filter filters out-of-band amplified spontaneous emission noise. Optical coupling is done via sub-wavelength grating couplers. The transmission spectrum and input/output waveforms are monitored on an optical spectrum analyzer and a sampling oscilloscope, respectively. Device biases are controlled via thermo-optic heaters on MZI, MRR, and MZC, each with a measured resistance of $200\,\Omega$. A three-dimensional sweep of current biases was sourced to the heaters through the current source, remotely controlled by Lightlab [21]. The transfer functions represent the instantaneous input and output power amplitudes measured at various points within the 3D sweep. Figures 4(a)–4(d) show the various classes of measured nonlinear activation functions, clamped ReLU, sigmoid, radial basis, and softplus, obtained at different bias points. The heaters allow further tuning of the threshold of the nonlinear transfer function besides the nonlinear shape. This reconfigurability distinguishes our device from previous neuron designs, where the nonlinear activation function is fixed [6,10]. We note that these conventional nonlinear functions represent a subset of functions achievable by our device. Additionally, a cascaded ring architecture can be employed to enable more sophisticated transfer functions.
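For a sense of the drive currents involved in such a bias sweep, the short Python sketch below combines the $200\,\Omega$ heater resistance quoted above with the $25\;{\rm{mW}}/\pi$ heater efficiency given in the power discussion of this Letter. It assumes the phase shift scales linearly with dissipated electrical power and ignores thermal crosstalk; the function name is hypothetical.

```python
import math

HEATER_RESISTANCE_OHM = 200.0  # measured heater resistance quoted in the text
P_PI_WATT = 25e-3              # 25 mW per pi phase shift (power discussion)


def heater_current_for_phase(phase_rad):
    """Current (A) that dissipates the power needed for the given phase shift."""
    # Assumption: phase shift is proportional to dissipated power, P = I^2 * R.
    power = P_PI_WATT * phase_rad / math.pi
    return math.sqrt(power / HEATER_RESISTANCE_OHM)


if __name__ == "__main__":
    current_ma = heater_current_for_phase(math.pi) * 1e3
    print(f"Current for a pi phase shift: {current_ma:.2f} mA")  # about 11.18 mA
```

Under these assumptions a full $\pi$ shift needs roughly 11 mA per heater, and because the phase grows with the square of the current, a sweep that is uniform in current is not uniform in phase.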

Fig. 4. Experimentally measured nonlinear transfer functions. (a) Clamped ReLU, (b) radial basis, (c) sigmoid, and (d) softplus. Labels correspond to $(\Delta {\phi _{{\rm MZC}}},\Delta {\phi _{{\rm MZI}}},\Delta \lambda)$, i.e., actuated phase shifts by MZC/MZI heaters and wavelength detuning. Thermal crosstalk between heaters has not been calibrated for.

We then simulate a binary classification task, exclusive-OR (XOR), and a multi-class classification task, MNIST handwritten digit classification, using experimentally measured sigmoid and clamped ReLU activation functions, respectively, in the PyTorch framework. Specifically, the activation functions are spline-interpolated functions generated from the experimentally measured transfer functions.
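One way to turn a sampled transfer function into a callable activation is a natural cubic spline, sketched below in plain Python. This is an illustration under stated assumptions rather than the code used in this work: the type of spline and its boundary conditions are not specified in the text, inputs outside the sampled range are clamped here, and the sample points are made-up placeholders, not the measured data.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a callable natural cubic spline through the points (xs, ys).

    xs must be strictly increasing. Inputs outside [xs[0], xs[-1]] are
    clamped to the end points (an assumption made here for robustness).
    """
    xs, ys = list(xs), list(ys)
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Second derivatives m[0..n]; the natural spline fixes m[0] = m[n] = 0.
    m = [0.0] * (n + 1)
    if n >= 2:
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n)]
        # Thomas algorithm: forward elimination on the tridiagonal system.
        for k in range(1, n - 1):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        # Back substitution for the interior second derivatives.
        m[n - 1] = rhs[n - 2] / diag[n - 2]
        for j in range(n - 3, -1, -1):
            m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]

    def spline(x):
        x = min(max(x, xs[0]), xs[-1])
        i = min(bisect_right(xs, x) - 1, n - 1)
        left, right = x - xs[i], xs[i + 1] - x
        return ((m[i] * right ** 3 + m[i + 1] * left ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * right
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left)

    return spline


# Placeholder sigmoid-like samples (NOT measured data), arbitrary units.
PLACEHOLDER_P_IN = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
PLACEHOLDER_P_OUT = [0.05, 0.06, 0.10, 0.45, 0.80, 0.85]

if __name__ == "__main__":
    activation = natural_cubic_spline(PLACEHOLDER_P_IN, PLACEHOLDER_P_OUT)
    for p in (0.5, 2.5, 4.5):
        print(f"P_IN = {p:.1f} -> P_OUT = {activation(p):.4f}")
```

To use such a function inside a gradient-based framework, the interpolant would also need a derivative, which a cubic spline provides in closed form on each interval.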

The architecture of our XOR classification network is shown in Fig. 5(a). The two-dimensional input, represented by ${x_1}$ and ${x_2}$, is sent to two fully connected layers, of two and one nodes, each followed by the experimental sigmoid activation function, $a$, resulting in the output, $y$. A stochastic gradient descent optimizer was used for training, with the parameters: momentum 0.9, learning rate 0.01. Figure 5(b) shows the mean-squared error (MSE) and accuracy obtained during training. The network was found to converge to 100% accuracy within 20 epochs. The MSE of a test run on 20 samples was about 0.0015.
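The 2-2-1 network and optimizer settings described above can be sketched without any framework, as in the plain-Python example below. Several choices are assumptions made for illustration: the logistic sigmoid stands in for the experimentally measured activation, the gradient is taken over all four XOR patterns at once, the momentum update follows the common form v = momentum * v + grad, and the initial weights are arbitrary. Convergence to the reported accuracy is not guaranteed by this sketch.

```python
import math

XOR_DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(z):
    # Logistic sigmoid, used here as a stand-in for the measured activation.
    return 1.0 / (1.0 + math.exp(-z))


def forward(p, x):
    """2-2-1 network; p holds the 9 weights and biases as a flat list."""
    h1 = sigmoid(p[0] * x[0] + p[1] * x[1] + p[2])
    h2 = sigmoid(p[3] * x[0] + p[4] * x[1] + p[5])
    y = sigmoid(p[6] * h1 + p[7] * h2 + p[8])
    return h1, h2, y


def mse_and_grad(p):
    """Mean-squared error over the four XOR patterns and its gradient."""
    loss, grad = 0.0, [0.0] * 9
    for x, target in XOR_DATA:
        h1, h2, y = forward(p, x)
        loss += (y - target) ** 2 / len(XOR_DATA)
        # Backpropagate through the output sigmoid, then the hidden sigmoids.
        dy = 2.0 * (y - target) / len(XOR_DATA) * y * (1.0 - y)
        dh1 = dy * p[6] * h1 * (1.0 - h1)
        dh2 = dy * p[7] * h2 * (1.0 - h2)
        contrib = [dh1 * x[0], dh1 * x[1], dh1,
                   dh2 * x[0], dh2 * x[1], dh2,
                   dy * h1, dy * h2, dy]
        grad = [g + c for g, c in zip(grad, contrib)]
    return loss, grad


def train(p, epochs, lr=0.01, momentum=0.9):
    """Full-batch gradient descent with momentum; returns weights and losses."""
    velocity = [0.0] * 9
    history = []
    for _ in range(epochs):
        loss, grad = mse_and_grad(p)
        history.append(loss)
        velocity = [momentum * v + g for v, g in zip(velocity, grad)]
        p = [w - lr * v for w, v in zip(p, velocity)]
    return p, history


if __name__ == "__main__":
    initial = [0.5, -0.4, 0.1, -0.3, 0.6, -0.2, 0.7, -0.5, 0.05]  # arbitrary
    trained, history = train(initial, epochs=2000)
    print(f"MSE: {history[0]:.4f} -> {history[-1]:.4f}")
    for x, target in XOR_DATA:
        print(x, target, round(forward(trained, x)[2], 3))
```

Swapping `sigmoid` for an interpolated measured transfer function (and its derivative in `mse_and_grad`) would bring the sketch closer to the procedure described in the text.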

Fig. 5. (a) Schematic illustration of the XOR classification network. [${x_1},{x_2}$], $a$, and $y$ correspond to inputs, activation functions, and the network output, respectively. (b) Mean-squared-error (MSE) loss (left) and accuracy (right) as a function of epoch count during training.

The schematic of our multi-class classification network for the MNIST handwritten digit classification task is shown in Fig. 6(a). The experimental clamped ReLU function employed here is modeled as a ReLU6 function, defined in Ref. [22]. The MNIST dataset is divided into training and test samples, with batch sizes of 64 and 1000, respectively. The stochastic gradient descent optimizer is used with the following hyperparameters: learning rate 0.001, momentum 0.5. The input image of a handwritten digit is a 2D tensor comprising ${{28 \times 28}}$ pixels. The input tensor is normalized with respect to the global mean (0.1307) and standard deviation (0.3081) of the MNIST dataset. This input undergoes convolutions (labeled as conv), pooling (labeled as maxpool), and experimental ReLU6 operations, followed by two fully connected layers and a softmax activation function. Figure 6(b) shows the negative log likelihood loss as a function of samples fed into the network, during the training and test stages. The network is found to converge within 10 epochs to negligible loss and 94% accuracy during testing. Figure 6(c) shows a sample of handwritten digit inputs, with labels correctly predicted by the network. Besides these classification examples, there are other applications where such activation functions are routinely used for artificial neural network tasks.
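The two scalar operations named in this paragraph, input normalization and the ReLU6 clamp, are small enough to write out directly. The Python sketch below assumes pixel intensities have already been scaled to [0, 1] before normalization and that normalization means subtracting the mean and dividing by the standard deviation; the function names are hypothetical.

```python
MNIST_MEAN = 0.1307  # global mean quoted in the text
MNIST_STD = 0.3081   # global standard deviation quoted in the text


def normalize_pixel(value):
    """Map a pixel intensity in [0, 1] to the normalized network input."""
    return (value - MNIST_MEAN) / MNIST_STD


def relu6(x):
    """ReLU clamped at 6: zero below 0, linear up to 6, flat above."""
    return min(max(0.0, x), 6.0)


if __name__ == "__main__":
    for pixel in (0.0, 0.5, 1.0):
        z = normalize_pixel(pixel)
        print(f"pixel = {pixel:.1f} -> normalized = {z:.4f} -> relu6 = {relu6(z):.4f}")
```

The flat region above 6 is what makes ReLU6 a natural model for the saturating, clamped ReLU response of the device.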

Fig. 6. (a) Schematic illustration of MNIST handwritten digit classification network, comprising convolutional (conv), pooling (maxpool), experimental clamped ReLU-like activation function (ReLU6), dropout, and last, two fully connected layers followed by a softmax activation, resulting in outputs in the range of [0,9]. (b) Negative log likelihood loss (left axis) during training and testing stages, and accuracy during testing (right axis) as a function of number of samples. Each datapoint corresponds to an epoch. (c) Example of network predicted labels corresponding to six input images.

Here, we discuss the optical and electrical power requirements of our device. The phase efficiency of thermo-optic heaters in the current platform is $25\;{\rm{mW}}/\pi$, but using thermal isolation trenches can enable efficiencies up to $1.3\;{\rm{mW}}/\pi$ [23]. Alternatively, passive tuning schemes can be employed: the interferometer can be designed so that the phase bias for a desired activation function is set at fabrication, or post-fabrication trimming can eliminate the need for constant tuning power [24]. The optical power cost is a function of material nonlinearity and cavity design. The current device sensitivity is about 5 dBm, which can be enhanced significantly with more efficient cavities, such as photonic crystals with switching energy of a few femtojoules [25]. In terms of architecture, a coherent interference unit [2] best suits the nonlinear unit, given that the operating wavelength range is limited to the proximity of the cavity resonance. To ensure indefinite cascadability despite losses, either a pump/probe scheme to drive the neuron or an active III–V platform that allows efficient nonlinearities with on-chip gain for amplifying the signal between layers may be useful.

To recapitulate, we experimentally demonstrate reconfigurable nonlinear activation functions in a cavity-loaded interferometer device via the FCD effect in silicon, and theoretically verify these results. This is the first experimental demonstration of a single device with programmable all-optical nonlinear activation functions. We further employ these functions in two classification tasks and achieve high accuracy. Photonic implementation of such activation functions paves the way for realizing highly efficient on-chip photonic neural networks.

Funding

Office of Naval Research (N00014-18-1-2297).

Disclosures

The authors declare no conflicts of interest.

REFERENCES

1. A. N. Tait, J. Chang, B. J. Shastri, M. A. Nahmias, and P. R. Prucnal, Opt. Express 23, 12758 (2015).

2. Y. Shen, N. C. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund, and M. Soljačić, Nat. Photonics 11, 441 (2017).

3. A. N. Tait, T. F. De Lima, M. A. Nahmias, H. B. Miller, H.-T. Peng, B. J. Shastri, and P. R. Prucnal, Phys. Rev. Appl. 11, 064043 (2019).

4. M. M. P. Fard, I. A. Williamson, M. Edwards, K. Liu, S. Pai, B. Bartlett, M. Minkov, T. W. Hughes, S. Fan, and T.-A. Nguyen, Opt. Express 28, 12138 (2020).

5. M. A. Nahmias, B. J. Shastri, A. N. Tait, and P. R. Prucnal, IEEE J. Sel. Top. Quantum Electron. 19, 1 (2013).

6. M. Miscuglio, A. Mehrabian, Z. Hu, S. I. Azzam, J. George, A. V. Kildishev, M. Pelton, and V. J. Sorger, Opt. Mater. Express 8, 3851 (2018).

7. T. F. de Lima, A. N. Tait, H. Saeidi, M. A. Nahmias, H.-T. Peng, S. Abbaslou, B. J. Shastri, and P. R. Prucnal, “Noise analysis of photonic modulator neurons,” arXiv:1907.07325 (2019).

8. V. Rutckaia and J. Schilling, Nat. Photonics 14, 4 (2020).

9. Y. Zuo, B. Li, Y. Zhao, Y. Jiang, Y.-C. Chen, P. Chen, G.-B. Jo, J. Liu, and S. Du, Optica 6, 1132 (2019).

10. G. Mourgias-Alexandris, A. Tsakyridis, N. Passalis, A. Tefas, K. Vyrsokinos, and N. Pleros, Opt. Express 27, 9620 (2019).

11. Q. Lin, O. J. Painter, and G. P. Agrawal, Opt. Express 15, 16604 (2007).

12. C. Huang, T. F. De Lima, A. Jha, S. Abbaslou, A. N. Tait, B. J. Shastri, and P. R. Prucnal, IEEE Photon. Technol. Lett. 31, 1834 (2019).

13. D. A. Bekele, Y. Yu, H. Hu, P. Guan, M. Galili, L. Ottaviano, L. K. Oxenløwe, K. Yvind, and J. Mork, Opt. Express 26, 19596 (2018).

14. C. Huang, A. Jha, T. F. de Lima, A. N. Tait, B. J. Shastri, and P. R. Prucnal, IEEE J. Sel. Top. Quantum Electron. 27, 6100211 (2020).

15. J. J. Hopfield and D. W. Tank, Biol. Cybern. 52, 141 (1985).

16. G. E. Dahl, T. N. Sainath, and G. E. Hinton, in IEEE International Conference on Acoustics, Speech and Signal Processing (2013), p. 8609.

17. C. Cortes and V. Vapnik, Mach. Learning 20, 273 (1995).

18. H. Zheng, Z. Yang, W. Liu, J. Liang, and Y. Li, in International Joint Conference on Neural Networks (IJCNN) (IEEE, 2015), p. 1.

19. A. N. Tait, B. J. Shastri, M. P. Fok, M. A. Nahmias, and P. R. Prucnal, J. Lightwave Technol. 31, 1263 (2013).

20. S. Chen, L. Zhang, Y. Fei, and T. Cao, Opt. Express 20, 7454 (2012).

21. T. de Lima and A. Tait, “Lightlab–laboratory instrumentation and automation,” 2018, https://github.com/lightwave-lab/lightlab.

22. A. Krizhevsky, “Convolutional deep belief networks on CIFAR-10,” http://www.cs.utoronto.ca/~kriz/conv-cifar10-aug2010.pdf.

23. A. Masood, M. Pantouvaki, G. Lepage, P. Verheyen, J. Van Campenhout, P. Absil, D. Van Thourhout, and W. Bogaerts, in 10th International Conference on Group IV Photonics (IEEE, 2013), p. 83.

24. D. Bachman, Z. Chen, R. Fedosejevs, Y. Y. Tsui, and V. Van, Opt. Express 21, 11048 (2013).

25. X. Yang, C. Husko, C. W. Wong, M. Yu, and D.-L. Kwong, Appl. Phys. Lett. 91, 051113 (2007).
