Optica Publishing Group

Low-threshold all-optical nonlinear activation function based on a Ge/Si hybrid structure in a microring resonator

Open Access

Abstract

An optical nonlinear activation function is an indispensable part of an optical neural network. While linear matrix computation has thrived in integrated optical neural networks, on-chip nonlinear activation functions still face many challenges, such as large latency, high power consumption and high threshold. Here, we demonstrate that a Ge/Si hybrid structure is a qualified candidate owing to its CMOS compatibility, low nonlinear threshold and compact footprint. Thanks to the strong thermal-optic effect of germanium in conjunction with a micro-ring resonator, we experimentally demonstrate three different types of nonlinear function (radial basis, ReLU and ELU functions), with a lowest threshold of 0.74 mW among our measured nonlinear functions, and they work well at repetition rates below 100 kHz. Simultaneously shrinking the germanium and confining the resonance inside the germanium is proposed to speed up the response time. Furthermore, we apply our measured nonlinear activation functions to the classification of the MNIST handwritten digit image dataset and improve the test accuracy from 91.8% to 94.8% with a feedforward fully connected neural network containing three hidden layers. This shows that our scheme has potential for future optical neural networks.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

With the rapid advancement of artificial intelligence, the conventional von Neumann architecture, which executes instructions sequentially, struggles to meet the parallelism required by neural networks [1]. Much effort has been devoted to photonic architectures to overcome the drawbacks of their electronic counterparts, because of their innate high parallelism, wide bandwidth and low power consumption. Generally, a neural network comprises several layers, each made up of a linear matrix multiplication unit and a nonlinear activation function (NAF). Many mature schemes for the linear matrix multiplication unit have arisen in free-space diffractive elements or integrated photonic circuits, such as spatial light modulators (SLMs) [2–4], Mach-Zehnder interferometer (MZI) meshes [5–8], micro-ring weight banks [9–13] and phase-change material-based networks [14–17].

In contrast to the dramatic development of photonic matrix computation, a nonlinear effect with high efficiency and simple implementation has yet to find its way into optical NAFs, which makes the NAF the main bottleneck in realizing large-scale optical neural networks. At present, saturable absorbers are commonly used as the NAF for free-space light propagation [2,3]. Other ideas using electromagnetically induced transparency of laser-cooled atoms [18] or reverse saturable absorbers [19] have also been reported. However, these schemes require either special materials or a strict low-temperature environment. On the indium phosphide platform, active devices such as semiconductor optical amplifiers (SOAs) are suitable as NAFs owing to their abundant nonlinear effects and their distinctive ability to amplify light [20–25]. Nonetheless, the silicon photonic platform is a better choice for implementing neural networks because it allows large-scale integration. Optical-electrical-optical (O-E-O) conversion is still the main route to introducing nonlinearity for NAFs on the silicon photonic platform, but the O-E-O process faces four main challenges [10,26,27]. First, the bandwidth of the photodetector and modulator determines the maximum data rate that can be handled. Second, the latency of the nonlinear part is estimated to be on the same scale as that of the MZI mesh, which becomes a bottleneck when the total number of layers is large [28]. Third, the power consumption of the extra laser, photodetector and electrical amplifier adds to the total power cost. Finally, integrating a photodetector and an optical modulator on the same chip takes a large footprint. Recently, some new approaches have been proposed to address these problems, such as converting only a small amount of optical power to electricity and then using it to modulate the remaining light, which eliminates the need for an extra laser [28]. Another method uses the free-carrier effect in a micro-ring resonator to realize a reconfigurable, CMOS-compatible all-optical NAF [29]. However, the free-carrier effect is relatively weak in intrinsic silicon, which means that the input power has to be high enough to trigger the nonlinearity. Moreover, phase-change materials and 2D materials have been found to be suitable NAFs in spiking neural networks in the past few years [14,30,31], but their demand for special fabrication processes makes them unsuitable for batch production.

In this paper, we design a low-threshold all-optical nonlinear activation function based on a Ge/Si hybrid structure in a microring resonator. Its compatibility with the CMOS process makes it easy to integrate with modulators, photodetectors and passive waveguides. Furthermore, we take advantage of the strong thermal-optic effect of germanium around 1550 nm and of the micro-ring resonator to lower the nonlinear threshold. The device is all-passive, so no extra power is consumed in electronic circuits. By varying the wavelength of the input light, three different types of nonlinear function can be realized, with a lowest threshold of 0.74 mW among our measured nonlinear functions. All the functions work well at repetition rates below 100 kHz. In the task of classifying the MNIST handwritten digit image dataset, we show that our NAF can improve the test accuracy from 91.8% to 94.8% with a feedforward fully connected neural network containing three hidden layers, which shows that our scheme has potential for future optical neural networks. We also analyze the speed limitation and outline a possible path towards higher-speed applications.

2. Principle and device fabrication

Our device is a racetrack resonator containing a section of Ge/Si hybrid waveguide, as shown in Fig. 1(a). In the ring waveguide, light is coupled into the hybrid region through a 500 nm wide single-mode silicon waveguide. Figure 1(a) also shows the mode propagation in the hybrid region. Light is transferred between silicon and germanium alternately owing to mode mismatch. The length of the germanium stripe is chosen to be 3 µm so that light is fully coupled back into the silicon waveguide at the end of the hybrid region. The gap and coupling length of the racetrack resonator are chosen to be 210 nm and 12 µm respectively, which ensures that the resonator is in the critical coupling state.

Fig. 1. (a) Structure and light propagation of proposed device. (b) Scanning electron microscope image of fabricated device. (c) Principle of proposed device’s nonlinearity.

In the communication band, germanium is generally adopted as the absorptive material of photodetectors to convert photons to free carriers, which can be collected as electric current [32]. However, few investigations have addressed how its optical properties change when light is absorbed around 1550 nm. A recent theoretical study [33] showed that the self-induced thermo-optical effect in germanium dielectric nanoresonators could provide a large spectral shift, which lays the foundation of our proposed NAF. The principle of the nonlinearity is shown in Fig. 1(c): germanium absorbs light energy near 1550 nm and produces heat. The rise in temperature results in a red shift of the absorption edge of germanium, which leads to a dramatic rise of the absorption coefficient (labeled α) around the absorption edge [34]. More heat is then produced, forming a positive feedback between heat and absorption coefficient. Moreover, according to the Kramers-Kronig relations, the dramatic change of absorption coefficient leads to an increase of refractive index [35]. As a result, the transmission spectrum of the resonator experiences a red shift and a shallower resonant dip, as presented in the inset of Fig. 1(c). To represent the nonlinearity mathematically, we assume that the effective changes of absorption coefficient and refractive index in the ring versus temperature change follow [34]:

$$\Delta \alpha = {k_1}{e^{\Delta T}},\quad \Delta n = {k_2}\Delta T, \tag{1}$$
where ${k_1}$ and ${k_2}$ are constants. According to the principle of the micro-ring resonator, the input power and output power follow:
$${P_{out}} = {\left| {\frac{{t - {e^{i\phi - \alpha }}}}{{1 - t{e^{i\phi - \alpha }}}}} \right|^2}{P_{in}}, \tag{2}$$
where $t$ represents the transmission coefficient of the coupling region of the micro-ring, which is a constant here. $\phi $ represents the total phase shift accumulated in a single round trip in the ring, whose change is proportional to $\Delta n$. $\alpha $ represents the total loss in a single round trip in the ring. If $\phi $ and $\alpha $ remain constant, then ${P_{out}} \propto {P_{in}}$ according to Eq. (2). In our case, however, an increase in ${P_{in}}$ leads to a variation in temperature and thus also in $\phi $ and $\alpha $, which results in nonlinearity. Using a similar principle, an experiment on Mie resonance in a silicon nanostructure showed an efficient nonlinearity and a potential speed in the GHz range [36]. The thermal-optic coefficient of germanium is estimated to be $7 \times {10^{ - 4}}$ /K at 1550 nm and 20 °C, which is larger than the $1.84 \times {10^{ - 4}}$ /K of silicon under the same conditions [37], and thus lowers the nonlinear threshold.
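As an illustration of how Eqs. (1) and (2) can combine into a power-dependent transmission, the following minimal Python sketch solves for a self-consistent temperature rise by damped fixed-point iteration. It is not the model fitted to the device: every numerical default (`t`, `phi0`, `alpha0`, `k1`, `phase_per_dt`, `heat_per_mw`) is a hypothetical value, the temperature is normalized, and the heating law (temperature rise proportional to the power that is not transmitted) is an added assumption.

```python
import cmath
import math


def ring_transmission(t, phi, alpha):
    """Power transmission of Eq. (2): |(t - e^{i*phi - alpha}) / (1 - t*e^{i*phi - alpha})|^2."""
    z = cmath.exp(1j * phi - alpha)
    return abs((t - z) / (1 - t * z)) ** 2


def output_power(p_in, t=0.9, phi0=-0.3, alpha0=0.05, k1=0.01,
                 phase_per_dt=0.5, heat_per_mw=0.2, n_iter=500, damping=0.1):
    """Self-consistent output power for a given input power (arbitrary units).

    All default values are illustrative assumptions, not fitted device parameters.
    """
    d_temp = 0.0  # normalized temperature rise
    for _ in range(n_iter):
        alpha = alpha0 + k1 * math.exp(d_temp)  # Eq. (1): delta_alpha = k1 * exp(dT)
        phi = phi0 + phase_per_dt * d_temp      # phase change proportional to delta_n = k2 * dT
        trans = ring_transmission(t, phi, alpha)
        # Assumption: heating is proportional to the power that is not transmitted.
        target = heat_per_mw * p_in * (1.0 - trans)
        d_temp += damping * (target - d_temp)   # damped fixed-point update
    alpha = alpha0 + k1 * math.exp(d_temp)
    phi = phi0 + phase_per_dt * d_temp
    return ring_transmission(t, phi, alpha) * p_in


if __name__ == "__main__":
    for p in (0.01, 1.0, 2.0, 5.0):
        print(p, output_power(p) / p)  # the transmission depends on the input power
```

With a negative `phi0` (input on the long-wavelength side of the resonance), heating pulls the resonance toward the input, so the transmission first drops as power increases. A positive `phi0` moves the resonance away from the input instead. A plain fixed-point loop like this one is only appropriate away from the bistable regime.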

We fabricate the device on a silicon-on-insulator wafer with a 220-nm-thick top silicon layer. The widths of germanium and silicon in the hybrid zone are set to 500 nm and 700 nm respectively. A lift-off process is used to place germanium on the designed area after the silicon waveguide is fabricated. The mask is ZEP photoresist, and germanium is deposited by e-beam evaporation. Figure 1(b) shows the scanning electron microscope (SEM) image of the fabricated device. The measured length of the germanium stripe is about 2.58 µm.

3. Experimental measurement

First, we measure the static characteristics of the NAF. Figure 2(a) shows the measured spectrum of our device, with a notch depth of 27 dB, where the input power is kept low enough that the spectrum represents the initial state of the device. Figure 2(c) sketches the evolution of transmission with respect to input optical power at different wavelengths. The relationship between input power and output power falls into four categories:

  • I. When the input wavelength is set at point a (i.e., 1548.45 nm), far away from the resonant wavelength (i.e., 1548.17 nm), the device exhibits a bistable state, as shown in Fig. 2(b), which is not suitable for use as an NAF. Nevertheless, the bistability only occurs at wavelengths far away from the resonant wavelength. Therefore, it has no impact on the reliability of the subsequently measured nonlinear functions.
  • II. When the wavelength moves to points A, B and C (i.e., 1548.32 nm, 1548.29 nm and 1548.24 nm), the transmission goes through a slump followed by a linear rise as input power increases, resulting in the radial basis function, as shown in Fig. 2(d).
  • III. When the wavelength moves much closer to the resonant wavelength (i.e., point D, 1548.22 nm), the output power is nearly zero while the input power is within the first 1 mW, and then starts to rise with increasing input power, giving the ReLU function in Fig. 2(e).
  • IV. When the wavelength moves further, to the left side of the resonant wavelength (i.e., point E, 1548.15 nm), the transmission always increases as power is added, so an ELU function is obtained in Fig. 2(f).

Fig. 2. (a) Initial optical spectrum of our device without nonlinearity. (b) Measured bistable state with input wavelength at point a in (a). (c) Sketch of transmission evolution with incremental input optical power for different NAFs. (The yellow and blue arrows represent the evolution process. The insets around those arrows show the transmission versus input power for different NAFs.) (d-f) Measured nonlinear activation functions at different wavelengths in (a).

Here we define the nonlinear threshold as the optical power required to generate a 50% change in the power transmission relative to the transmission with null input [28]. According to this definition, the nonlinear thresholds for A, B, C, D and E are 1.7 mW, 1.37 mW, 0.74 mW, 1.15 mW and 1.8 mW respectively. Although our measured NAFs are not exactly the radial basis, ReLU or ELU function, they can be regarded as approximations of these commonly used NAFs. These measured NAFs also perform well, as we show in the Discussion part.
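One possible way to extract such a threshold from a sampled input/output power curve is sketched below in Python. The sketch makes three assumptions: the first (lowest-power) sample stands in for the "null input" transmission, "50% change" is read as a relative change in transmission, and the crossing is located by linear interpolation between samples. The function name and the sample data are hypothetical, not the measured curves.

```python
def nonlinear_threshold(p_in, p_out, change=0.5):
    """Input power at which the transmission first differs from its low-power
    value by the relative amount `change`; returns None if never reached.

    p_in must be positive and sorted in increasing order.
    """
    trans = [po / pi for pi, po in zip(p_in, p_out)]
    t0 = trans[0]  # assumption: first sample approximates the null-input transmission
    for i in range(1, len(trans)):
        rel_prev = abs(trans[i - 1] - t0) / t0
        rel = abs(trans[i] - t0) / t0
        if rel >= change:
            # Linear interpolation between the two neighbouring samples.
            frac = (change - rel_prev) / (rel - rel_prev)
            return p_in[i - 1] + frac * (p_in[i] - p_in[i - 1])
    return None


if __name__ == "__main__":
    # Synthetic example data (mW), for illustration only.
    powers = [0.1, 0.5, 1.0, 1.5, 2.0]
    transmissions = [0.8, 0.7, 0.5, 0.3, 0.2]
    outputs = [p * tr for p, tr in zip(powers, transmissions)]
    print(nonlinear_threshold(powers, outputs))
```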

We then measure the frequency response of our device to determine the maximum frequency at which it can work. We carry out the dynamic experiment shown in Fig. 3(a), where a sawtooth electrical signal generated by an arbitrary waveform generator (AWG) is delivered to an intensity modulator (IM). The input laser wavelength is set to 1548.45 nm. The output waveform of the IM is shown in Fig. 3(b) as a reference. The output signal of our device under different frequencies of the sawtooth electrical signal is then measured by an oscilloscope (OSC). Figure 3(c) shows the output signal, where T0 denotes one clock period of each output signal and is scaled to the same length on the axis for different frequencies. From the evolution of the output waveform, we conclude that the nonlinear function remains unchanged at repetition rates up to 100 kHz. Above that, because the thermal response cannot keep up with the fast change of the signal, the heat diffusion manifests as a stable offset in the red shift of the micro-ring resonance. As a result, the output waveform evolves from line A to line E in Fig. 2 as the frequency increases gradually. Furthermore, we send a digital signal with three levels (0, 1 and 2) into the IM and obtain the output signal shown in Fig. 3(d). A clear nonlinearity appears, and we can extract the rise and fall times of our device, that is, the time that the output signal takes to become stable when the input signal changes its level. We obtain 14 µs, 7 µs and 13 µs for the level changes (0-1), (1-2) and (2-1) respectively. The main obstacle to higher speed here is the speed of heat conduction and dissipation, especially in silicon. We discuss possible solutions to improve the speed in the Discussion part.
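The transition from a tracking response to a static offset can be pictured with a first-order thermal model. The Python sketch below is a toy illustration only: it assumes a single thermal time constant `tau_s` (the 3 µs default is a hypothetical value, not a measured one) and drives a normalized temperature with a unit sawtooth, then reports the mean and the peak-to-peak ripple over the last period.

```python
import math


def thermal_response(freq_hz, tau_s=3e-6, steps_per_period=200):
    """Return (mean, ripple) of a normalized temperature driven by a unit sawtooth.

    tau_s is a hypothetical single thermal time constant, not a measured value.
    """
    dt = 1.0 / (freq_hz * steps_per_period)
    decay = math.exp(-dt / tau_s)
    # Simulate at least 10 time constants so the start-up transient dies out.
    periods = max(20, math.ceil(10.0 * tau_s * freq_hz))
    temp = 0.0
    last_period = []
    for k in range(periods * steps_per_period):
        power = (k % steps_per_period) / steps_per_period  # sawtooth from 0 to 1
        temp = power + (temp - power) * decay              # exact first-order step
        if k >= (periods - 1) * steps_per_period:
            last_period.append(temp)
    mean = sum(last_period) / len(last_period)
    return mean, max(last_period) - min(last_period)


if __name__ == "__main__":
    for f in (1e4, 1e5, 1e6, 1e7):
        print(f, thermal_response(f))
```

In this toy model the ripple shrinks as the modulation frequency rises while the mean stays near the average drive level, which is the qualitative behaviour described above as a stable offset of the resonance red shift.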

Fig. 3. (a) Dynamic experiment scheme. TLS, tunable laser source; PC, polarization controller; AWG, arbitrary waveform generator; IM, intensity modulator; DUT, device under test; PD, photodetector; OSC, oscilloscope. (b) Waveform after sawtooth modulation. (c) Output waveform under different frequencies of the input signal in (b). (d) Input and output signals with digital modulation containing three levels.

4. Discussion

4.1 Application

To verify the applicability of our NAF, we simulate the classification of the MNIST dataset in TensorFlow with two of our measured NAFs (B and D in Fig. 2), which we call optical activation functions OAF1 and OAF2 respectively in the following. To implement the classification, raw images of 28×28 pixels are first flattened into one-dimensional arrays, as shown in Fig. 4. The 784 input pixels are then fed into a four-layer network, and the output elements are normalized to represent the probabilities of digits 0 to 9. The loss function is the cross-entropy loss. In a real application, the linear matrix network can be implemented with MZI meshes [5]. Input data can be loaded into the optical domain as optical intensity through an optical modulator, and we assume for simplicity that the input optical sources are incoherent. After the MZI mesh, an array of Ge/Si hybrid micro-ring resonators serves as the NAF. Light passing through the optical NAF is the input signal to the next layer. A photodetector array collects the final output optical powers.
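The data flow just described (flattened 784-pixel input, four weight layers with an activation after each hidden layer, normalized outputs, cross-entropy loss) can be sketched in plain Python as follows. This is not the TensorFlow code used for the simulation: the hidden-layer sizes, the random non-negative weights, the softmax normalization and the `relu_like` stand-in activation are all assumptions made for illustration.

```python
import math
import random


def dense(x, weights, bias):
    """One linear layer: weights is a list of rows, one row per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu_like(v, threshold=0.1):
    """Hypothetical stand-in for a measured ReLU-like optical activation."""
    return max(0.0, v - threshold)


def softmax(z):
    """Assumed output normalization; the source only says outputs are normalized."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(pixels, layers, activation=relu_like):
    """Apply each layer; the activation follows every layer except the last."""
    x = pixels
    for i, (w, b) in enumerate(layers):
        x = dense(x, w, b)
        if i < len(layers) - 1:
            x = [activation(v) for v in x]
    return softmax(x)


def cross_entropy(probs, label):
    return -math.log(probs[label])


def random_layers(sizes, seed=0):
    """Non-negative random weights and zero biases (illustrative initialization)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.uniform(0.0, 1.0 / n_in) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((w, [0.0] * n_out))
    return layers


if __name__ == "__main__":
    layers = random_layers([784, 64, 32, 16, 10])  # hidden sizes are hypothetical
    probs = forward([0.5] * 784, layers)
    print(probs, cross_entropy(probs, 3))
```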

Fig. 4. Schematic of handwritten digit image recognition using neural network with NAFs.

In the simulation, 60,000 images are used as the training dataset and the remaining 10,000 images are used as the test dataset. In the NAF part, it should be noted that, because the optical neural network described above carries no negative values, the weight and bias of each neuron in the simulation are set to be non-negative. Therefore, we fit our measured discrete data from the static experiment with a rational function whose numerator degree is one larger than its denominator degree, make sure that the fitted curve passes through the origin, and select only the positive part of the function. As shown in Fig. 5(a), we adopt a half Sigmoid and a linear function for comparison besides OAF1 and OAF2. The accuracy and cross-entropy loss on the training dataset and the test dataset during training are shown in Figs. 5(b) and 5(c), respectively. Both optical neural networks with OAF1 and OAF2 outperform the one without an NAF (i.e., with the linear function), and they show an ability similar to that of the commonly used Sigmoid function. With our NAF, the test accuracy rises from 91.8% to 94.8%. The confusion matrices of the classification task are shown in Figs. 5(d) and 5(e), from which we can see a high prediction accuracy. The accuracy can be further improved with a more complex network structure, such as a deeper network or a convolutional network. This does not change the important role of our nonlinear activation function in improving accuracy relative to a network without any nonlinearity. Moreover, one can finely tune the NAF by adjusting the input wavelength or applying an initial phase shift to the micro-ring, which makes the network more flexible. For example, in a deeper optical neural network, the optical loss caused by waveguides and NAFs can reach 10 dB for 10 layers with 1 dB loss in each layer. Therefore, it is beneficial to lower the threshold of the NAFs in the deeper layers. Moreover, some NAFs such as Sigmoid and Softmax are more suitable for the output layer than for hidden layers, and some NAFs might perform better in specific scenarios. Therefore, the programmability of our device helps to improve the performance of the optical neural network.
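A rational fit of the kind described above can be sketched as follows. The specific form $f(x) = (a_1 x + a_2 x^2)/(1 + b_1 x)$ is an assumption: it has a numerator degree one larger than the denominator degree and passes through the origin, but the degrees actually used are not stated. The fit is linearized as $y = a_1 x + a_2 x^2 - b_1 x y$ and solved by ordinary least squares, and the function names are hypothetical.

```python
def fit_rational(xs, ys):
    """Least-squares fit of y ~ (a1*x + a2*x^2) / (1 + b1*x); returns (a1, a2, b1)."""
    # Linearized model: y = a1*x + a2*x^2 - b1*x*y
    rows = [(x, x * x, -x * y) for x, y in zip(xs, ys)]
    n = 3
    # Normal equations (A^T A) c = A^T y, stored as an augmented matrix.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        coef[i] = (m[i][n] - sum(m[i][j] * coef[j] for j in range(i + 1, n))) / m[i][i]
    return tuple(coef)


def rational_naf(x, a1, a2, b1):
    """Fitted activation; only the positive part is kept."""
    return max(0.0, (a1 * x + a2 * x * x) / (1.0 + b1 * x))


if __name__ == "__main__":
    # Synthetic data generated from known coefficients, for illustration only.
    xs = [0.2 * i for i in range(1, 16)]
    ys = [(0.1 * x + 0.5 * x * x) / (1.0 + 0.3 * x) for x in xs]
    print(fit_rational(xs, ys))
```

The linearized fit weights the residuals differently from a direct nonlinear fit, so on noisy measured data its result is best treated as a starting point for further refinement.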

Fig. 5. (a) Different NAFs used in the simulation. (b) Accuracy and cross-entropy loss of train dataset. (c) Accuracy and cross-entropy loss of test dataset. (d) Confusion matrix of neural network using OAF1. (e) Confusion matrix of neural network using OAF2.

4.2 Comparison between silicon micro-ring and Ge/Si micro-ring

To compare our device with a pure silicon micro-ring and verify the function of germanium, we fabricate both of them on the same chip. The measured red shift of the spectrum with increasing optical power is shown in Fig. 6. Figures 6(a) and 6(c) show the initial optical spectra of our device and the reference micro-ring respectively, without nonlinearity. A tunable laser source combined with a broadband light source (BLS) is injected into the chip after being amplified by an erbium-doped fiber amplifier (EDFA). The BLS here acts as a ‘sensor’ of the device changes induced by the tunable laser source. The laser wavelength is set at the right edge of a resonant notch, annotated with ‘Laser in’ and an arrow. The nearest resonant notch to the left is chosen as the monitored window, as marked in Figs. 6(a) and 6(c). The change in the BLS optical spectrum for our device and for the silicon micro-ring is shown in Figs. 6(b) and 6(d). The Ge/Si hybrid micro-ring exhibits a large red shift and a clearly decreased notch extinction ratio, which agrees with our theoretical analysis. The silicon micro-ring, in contrast, exhibits only a small red shift, which means that it has only a thermal-optic effect without significant light absorption. Silicon has a lower thermal-optic coefficient than germanium. In addition, silicon absorbs much less light than germanium under the same optical power injection and thus produces less heat. Therefore, our scheme outperforms the ring resonator without germanium.

Fig. 6. (a) Initial optical spectrum of our device. (b) Spectrum change of our device under different input laser power. (c) Initial optical spectrum of silicon micro-ring. (d) Spectrum change of silicon micro-ring under different input laser power.

4.2.1 Optimization of nonlinear threshold

The red shift of micro-ring spectrum under perturbation of refractive index can be written as:

$$\Delta \lambda = \frac{{\lambda \,\overline {\Delta {n_{\mathrm{eff}}}} }}{{\overline {{n_g}} }}, \tag{3}$$
where $\lambda$ is the monitored resonant wavelength, which is almost the same for the two micro-rings tested in the last part. $\overline {\Delta {n_{\mathrm{eff}}}}$ and $\overline {{n_g}}$ represent the average change in effective refractive index and the average group refractive index over the whole micro-ring. A large $\overline {\Delta {n_{\mathrm{eff}}}} /\overline {{n_g}}$ is desired for a large red shift. A practical method is to increase the proportion of the Ge/Si hybrid region in the whole micro-ring, so a smaller micro-ring radius and a longer germanium length are desired. However, the length of germanium should be chosen carefully, because longer germanium brings more loss, which reduces the light enhancement in the micro-ring. Furthermore, a longer wavelength such as 1600 nm shows a bigger change in absorption coefficient and thermal-optic coefficient [34]. Apart from adjusting the working wavelength, the annealing parameters for the growth of germanium can also be tailored to increase the thermal-optic effect at 1550 nm [38].
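To make the dependence on the hybrid-region proportion concrete, the Python sketch below evaluates the red-shift expression above under a simplifying assumption: the index change is confined to the hybrid section, so the ring-averaged change is the hybrid-section change scaled by the length fraction. The function name and all numbers in the example are hypothetical.

```python
def resonance_shift_nm(wavelength_nm, dn_hybrid, hybrid_length_um,
                       ring_length_um, group_index):
    """Red shift = wavelength * (ring-averaged index change) / (group index).

    Assumes only the hybrid section changes its effective index.
    """
    avg_dn = dn_hybrid * hybrid_length_um / ring_length_um
    return wavelength_nm * avg_dn / group_index


if __name__ == "__main__":
    # Hypothetical values: 3 um hybrid section, two different ring circumferences.
    for ring_um in (100.0, 50.0):
        print(ring_um, resonance_shift_nm(1548.17, 0.01, 3.0, ring_um, 4.0))
```

Under this assumption, halving the ring circumference while keeping the germanium length fixed doubles the shift, which is the motivation for a smaller ring radius.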

4.2.2 Optimization of response speed

Although we obtain an optical NAF with a low threshold, its speed is still limited to 100 kHz, which falls short of the GHz response required in optical neural networks. Here we analyze the relationship between thermal response speed and germanium size in order to point out a direction towards higher speed. Figure 7(a) shows the heat simulation settings in COMSOL, where the thermal conductivities of germanium, silicon and silica are set to 60 W/(m·K), 130 W/(m·K) and 1.38 W/(m·K) respectively. The initial temperatures of the germanium nanostructure and of the rest of the simulation region are 800 K and 293.15 K respectively. We change the width of the germanium nanostructure and obtain the corresponding heat dissipation inside germanium, shown in Fig. 7(b). As the size of the germanium decreases, the heat dissipation accelerates. Thus the thermal response speed depends largely on the size of the heated object. This is why a silicon nanostructure has a nanosecond thermal response time [36]. In our device, the nonlinearity comes from both the germanium and the silicon waveguide. Owing to the larger size of the silicon waveguide, its thermal response time limits the overall response speed.
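The size dependence can also be estimated to order of magnitude from the diffusion time $\tau \approx L^2/D$, with thermal diffusivity $D = k/(\rho c_p)$. The Python sketch below uses the germanium thermal conductivity quoted above; the density and specific heat are approximate textbook values added here as assumptions, and the estimate ignores interfaces and the surrounding silica, so it does not replace the COMSOL simulation.

```python
def diffusion_time_s(size_m, conductivity, density, heat_capacity):
    """Characteristic diffusion time L^2 / D, with D = k / (rho * c_p)."""
    diffusivity = conductivity / (density * heat_capacity)
    return size_m ** 2 / diffusivity


if __name__ == "__main__":
    # k from the text; rho and c_p are approximate textbook values (assumed).
    k_ge, rho_ge, cp_ge = 60.0, 5323.0, 320.0  # W/(m*K), kg/m^3, J/(kg*K)
    for width_nm in (1000, 500, 250):
        print(width_nm, diffusion_time_s(width_nm * 1e-9, k_ge, rho_ge, cp_ge))
```

The quadratic scaling means that halving the heated dimension shortens this characteristic time by a factor of four.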

Fig. 7. (a) Heat simulation setting in COMSOL. (b) Heat dissipation process with different germanium size.

Based on the above analysis, we propose a modified scheme that uses the resonance inside a germanium nanostructure to produce the nonlinearity, as shown in Fig. 8(a), where the height, width and length of the germanium nanostructure are 260 nm, 440 nm and 220 nm respectively. Figure 8(b) shows the transmission spectrum of the device simulated in Ansys Lumerical FDTD, in which a resonant dip appears. The contribution of silicon to the nonlinearity can be neglected here because the resonant light is almost entirely within the germanium. This approach is expected to combine a small footprint, a large nonlinearity and a fast speed, and we expect to apply it to high-speed applications in the future.

Fig. 8. (a) Structure of modified scheme to improve the speed. (b) Simulated transmission spectrum of the modified device.

5. Conclusion

We design and fabricate a CMOS-compatible all-optical NAF using a racetrack resonator loaded with a Ge/Si hybrid structure. The device takes full advantage of the large thermal-optic coefficient of germanium around 1550 nm and of the light enhancement in the resonator. We investigate the static and dynamic behavior of our device, which indicates that it can work as three different NAFs with a low threshold. We then discuss its application in optical neural networks. The simulated classification of handwritten digit images shows that the measured functions perform well as alternative NAFs. Methods to improve the speed and the nonlinear threshold are also proposed. We believe that our proposal provides a new way to realize optical NAFs and shows great potential for future all-optical on-chip neural networks.

Funding

National Key Research and Development Program of China (2018YFB2201901); National Natural Science Foundation of China (62075075).

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. B. J. Shastri, A. N. Tait, T. Ferreira de Lima, W. H. P. Pernice, H. Bhaskaran, C. D. Wright, and P. R. Prucnal, “Photonics for artificial intelligence and neuromorphic computing,” Nat. Photonics 15(2), 102–114 (2021).

2. X. Lin, Y. Rivenson, N. T. Yardimci, M. Veli, Y. Luo, M. Jarrahi, and A. Ozcan, “All-optical machine learning using diffractive deep neural networks,” Science 361(6406), 1004–1008 (2018).

3. A. Dejonckheere, F. Duport, A. Smerieri, L. Fang, J. L. Oudar, M. Haelterman, and S. Massar, “All-optical reservoir computer based on saturation of absorption,” Opt. Express 22(9), 10868–10881 (2014).

4. J. Bueno, S. Maktoobi, L. Froehly, I. Fischer, M. Jacquot, L. Larger, and D. Brunner, “Reinforcement learning in a large-scale photonic recurrent neural network,” Optica 5(6), 756–760 (2018).

5. Y. Shen, N. C. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund, and M. Soljačić, “Deep learning with coherent nanophotonic circuits,” Nat. Photonics 11(7), 441–446 (2017).

6. H. Zhang, M. Gu, X. D. Jiang, J. Thompson, H. Cai, S. Paesani, R. Santagati, A. Laing, Y. Zhang, M. H. Yung, Y. Z. Shi, F. K. Muhammad, G. Q. Lo, X. S. Luo, B. Dong, D. L. Kwong, L. C. Kwek, and A. Q. Liu, “An optical neural chip for implementing complex-valued neural network,” Nat. Commun. 12(1), 457 (2021).

7. M. Nakajima, K. Tanaka, and T. Hashimoto, “Scalable reservoir computing on coherent linear photonic processor,” Commun. Phys. 4(1), 1 (2021).

8. T. W. Hughes, M. Minkov, Y. Shi, and S. Fan, “Training of photonic neural networks through in situ backpropagation and gradient measurement,” Optica 5(7), 864–871 (2018).

9. A. N. Tait, T. F. de Lima, E. Zhou, A. X. Wu, M. A. Nahmias, B. J. Shastri, and P. R. Prucnal, “Neuromorphic photonic networks using silicon photonic weight banks,” Sci. Rep. 7(1), 1 (2017).

10. A. N. Tait, T. F. de Lima, M. A. Nahmias, H. B. Miller, H.-T. Peng, B. J. Shastri, and P. R. Prucnal, “Silicon photonic modulator neuron,” Phys. Rev. Appl. 11, 064043 (2019).

11. A. N. Tait, H. Jayatilleka, T. F. de Lima, P. Y. Ma, M. A. Nahmias, B. J. Shastri, S. Shekhar, L. Chrostowski, and P. R. Prucnal, “Feedback control for microring weight banks,” Opt. Express 26(20), 26422–26443 (2018).

12. A. N. Tait, A. X. Wu, T. F. de Lima, E. Zhou, B. J. Shastri, M. A. Nahmias, and P. R. Prucnal, “Microring weight banks,” IEEE J. Sel. Top. Quantum Electron. 22(6), 590214 (2016).

13. A. N. Tait, T. F. de Lima, M. A. Nahmias, B. J. Shastri, and P. R. Prucnal, “Continuous calibration of microring weights for analog optical networks,” IEEE Photonics Technol. Lett. 28(8), 887–890 (2016).

14. J. Feldmann, N. Youngblood, C. D. Wright, H. Bhaskaran, and W. H. P. Pernice, “All-optical spiking neurosynaptic networks with self-learning capabilities,” Nature 569(7755), 208–214 (2019).

15. C. Wu, H. Yu, S. Lee, R. Peng, I. Takeuchi, and M. Li, “Programmable phase-change metasurfaces on waveguides for multimode photonic convolutional neural network,” Nat. Commun. 12(1), 96 (2021).

16. J. Feldmann, N. Youngblood, M. Karpov, H. Gehring, X. Li, M. Stappers, M. Le Gallo, X. Fu, A. Lukashchuk, A. S. Raja, J. Liu, C. D. Wright, A. Sebastian, T. J. Kippenberg, W. H. P. Pernice, and H. Bhaskaran, “Parallel convolutional processing using an integrated photonic tensor core,” Nature 589(7840), 52–58 (2021).

17. I. Chakraborty, G. Saha, A. Sengupta, and K. Roy, “Toward fast neural computing using all-photonic phase change spiking neurons,” Sci. Rep. 8, 1 (2018).

18. Y. Zuo, B. H. Li, Y. J. Zhao, Y. Jiang, Y. C. Chen, P. Chen, G. B. Jo, J. W. Liu, and S. W. Du, “All-optical neural network with nonlinear activation functions,” Optica 6(9), 1132–1137 (2019).

19. M. Miscuglio, A. Mehrabian, Z. Hu, S. I. Azzam, J. George, A. V. Kildishev, M. Pelton, and V. J. Sorger, “All-optical nonlinear activation function for photonic neural networks [Invited],” Opt. Mater. Express 8(12), 3851–3863 (2018).

20. B. Shi, N. Calabretta, and R. Stabile, “Deep neural network through an InP SOA-based photonic integrated cross-connect,” IEEE J. Sel. Top. Quantum Electron. 26(1), 77011 (2020).

21. G. Mourgias-Alexandris, A. Tsakyridis, N. Passalis, A. Tefas, K. Vyrsokinos, and N. Pleros, “An all-optical neuron with sigmoid activation function,” Opt. Express 27(7), 9620–9630 (2019).

22. J. Crnjanski, M. Krstic, A. Totovic, N. Pleros, and D. Gvozdic, “Adaptive sigmoid-like and PReLU activation functions for all-optical perceptron,” Opt. Lett. 46(9), 2003–2006 (2021).

23. F. Duport, B. Schneider, A. Smerieri, M. Haelterman, and S. Massar, “All-optical reservoir computing,” Opt. Express 20(20), 22783–22795 (2012).

24. S. Xiang, Y. Zhang, J. Gong, X. Guo, L. Lin, and Y. Hao, “STDP-based unsupervised spike pattern learning in a photonic spiking neural network with VCSELs and VCSOAs,” IEEE J. Sel. Top. Quantum Electron. 25(6), 170019 (2019).

25. B. Shi, K. Prifti, E. Magalhães, N. Calabretta, and R. Stabile, “Lossless monolithically integrated photonic InP Neuron for all-optical computation,” in 2020 Optical Fiber Communications Conference and Exhibition (OFC) (2020), pp. 1–3.

26. J. K. George, A. Mehrabian, R. Amin, J. Meng, T. F. de Lima, A. N. Tait, B. J. Shastri, T. El-Ghazawi, P. R. Prucnal, and V. J. Sorger, “Neuromorphic photonics with electro-absorption modulators,” Opt. Express 27(4), 5181–5191 (2019).

27. R. Amin, J. K. George, S. Sun, T. F. de Lima, A. N. Tait, J. B. Khurgin, M. Miscuglio, B. J. Shastri, P. R. Prucnal, T. El-Ghazawi, and V. J. Sorger, “ITO-based electro-absorption modulator for photonic neural activation function,” APL Mater. 7, 081112 (2019).

28. I. A. D. Williamson, T. W. Hughes, M. Minkov, B. Bartlett, S. Pai, and S. H. Fan, “Reprogrammable electro-optic nonlinear activation functions for optical neural networks,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–12 (2020).

29. A. Jha, C. Huang, and P. R. Prucnal, “Reconfigurable all-optical nonlinear activation functions for neuromorphic photonics,” Opt. Lett. 45(17), 4819–4822 (2020).

30. Z. Cheng, C. Rios, W. H. P. Pernice, C. D. Wright, and H. Bhaskaran, “On-chip photonic synapse,” Sci. Adv. 3(9), e1700160 (2017).

31. A. Jha, C. Huang, H.-T. Peng, B. Shastri, and P. R. Prucnal, “Photonic spiking neural networks and CMOS-compatible graphene-on-silicon spiking neurons,” arXiv:2109.13797 (2021).

32. H. Chen, P. Verheyen, P. De Heyn, G. Lepage, J. De Coster, S. Balakrishnan, P. Absil, W. Yao, L. Shen, G. Roelkens, and J. Van Campenhout, “-1 V bias 67 GHz bandwidth Si-contacted germanium waveguide p-i-n photodetector for optical links at 56 Gbps and beyond,” Opt. Express 24(5), 4622–4631 (2016).

33. T. V. Tsoulos and G. Tagliabue, “Self-induced thermo-optical effects in silicon and germanium dielectric nanoresonators,” Nanophotonics 9(12), 3849–3861 (2020).

34. V. Sorianello, A. Perna, L. Colace, G. Assanto, H. C. Luan, and L. C. Kimerling, “Near-infrared absorption of germanium thin films on silicon,” Appl. Phys. Lett. 93(11), 111115 (2008).

35. L. Peng and L. Min, “Effects of the absorption coefficient on the refractive index of germanium in a fiber optic-semiconductor temperature sensor,” Proc. SPIE 8199, 81990Y (2011).

36. Y. S. Duh, Y. Nagasaki, Y. L. Tang, P. H. Wu, H. Y. Cheng, T. H. Yen, H. X. Ding, K. Nishida, I. Hotta, J. H. Yang, Y. P. Lo, K. P. Chen, K. Fujita, C. W. Chang, K. H. Lin, J. Takahara, and S. W. Chu, “Giant photothermal nonlinearity in a single silicon nanostructure,” Nat. Commun. 11(1), 4101 (2020).

37. M. Li and Y. Li, “Fiber-optic temperature sensor based on interaction of temperature-dependent refractive index and absorption of germanium film,” Appl. Opt. 50(2), 231–236 (2011).

38. Y. Ishikawa, K. Wada, D. D. Cannon, J. Liu, H.-C. Luan, and L. C. Kimerling, “Strain-induced band gap shrinkage in Ge grown on Si substrate,” Appl. Phys. Lett. 82(13), 2044–2046 (2003).

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Figures (8)

Fig. 1. (a) Structure and light propagation of the proposed device. (b) Scanning electron microscope image of the fabricated device. (c) Principle of the proposed device’s nonlinearity.
Fig. 2. (a) Initial optical spectrum of our device without nonlinearity. (b) Measured bistable state with the input wavelength at point a in (a). (c) Sketch of the transmission evolution with increasing input optical power for different NAFs. The yellow and blue arrows represent the evolution process; the insets around those arrows show the transmission versus input power for different NAFs. (d-f) Measured nonlinear activation functions at different wavelengths in (a).
Fig. 3. (a) Dynamic experiment scheme. TLS, tunable laser source; PC, polarization controller; AWG, arbitrary waveform generator; IM, intensity modulator; DUT, device under test; PD, photodetector; OSC, oscilloscope. (b) Waveform after sawtooth modulation. (c) Output waveforms at different frequencies of the input signal in (b). (d) Input and output signals under digital modulation containing three levels.
Fig. 4. Schematic of handwritten digit image recognition using a neural network with NAFs.
Fig. 5. (a) Different NAFs used in the simulation. (b) Accuracy and cross-entropy loss on the training dataset. (c) Accuracy and cross-entropy loss on the test dataset. (d) Confusion matrix of the neural network using OAF1. (e) Confusion matrix of the neural network using OAF2.
Fig. 6. (a) Initial optical spectrum of our device. (b) Spectral change of our device under different input laser powers. (c) Initial optical spectrum of the silicon micro-ring. (d) Spectral change of the silicon micro-ring under different input laser powers.
Fig. 7. (a) Heat simulation setup in COMSOL. (b) Heat dissipation process for different germanium sizes.
Fig. 8. (a) Structure of the modified scheme to improve the speed. (b) Simulated transmission spectrum of the modified device.

Equations (3)


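$$\Delta\alpha = k_1 e^{\Delta T}, \quad \Delta n = k_2 \Delta T, \tag{1}$$
$$P_{out} = \left| \frac{t - e^{i\phi}\alpha}{1 - t e^{i\phi}\alpha} \right|^2 P_{in}, \tag{2}$$
$$\Delta\lambda = \lambda \frac{\Delta \overline{n_{eff}}}{\overline{n_g}}, \tag{3}$$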
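
As a rough numerical illustration of the ring transmission and resonance-shift relations listed above, the following Python sketch evaluates them for a few made-up parameter values. It assumes the usual all-pass micro-ring convention, in which t is the self-coupling coefficient, α the round-trip amplitude transmission, and φ the round-trip phase. The function names and all numbers (t = α = 0.95, a group index of 4.0, an index change of 1e-3) are hypothetical choices for illustration, not values reported in the paper.

```python
import cmath
import math


def ring_transmission(t, alpha, phi):
    """Power transmission |(t - alpha*e^{i*phi}) / (1 - t*alpha*e^{i*phi})|^2.

    t     : self-coupling coefficient (assumed convention)
    alpha : round-trip amplitude transmission (assumed convention)
    phi   : round-trip phase in radians
    """
    round_trip = alpha * cmath.exp(1j * phi)
    return abs((t - round_trip) / (1 - t * round_trip)) ** 2


def resonance_shift(wavelength, delta_n_eff, n_g):
    """Resonance shift: delta_lambda = wavelength * delta_n_eff / n_g."""
    return wavelength * delta_n_eff / n_g


# Illustrative (assumed) values: critical coupling, t == alpha.
t = 0.95
alpha = 0.95

on_resonance = ring_transmission(t, alpha, 0.0)       # phase is a multiple of 2*pi
off_resonance = ring_transmission(t, alpha, math.pi)  # halfway between resonances

# Assumed index change of 1e-3 and group index of 4.0 at 1550 nm.
shift = resonance_shift(1550e-9, 1e-3, 4.0)

print(f"on-resonance transmission:  {on_resonance:.6f}")
print(f"off-resonance transmission: {off_resonance:.6f}")
print(f"resonance shift: {shift * 1e9:.4f} nm")
```

With these assumed values the on-resonance transmission drops to essentially zero while the off-resonance transmission stays close to one, so a thermally induced shift of the resonance relative to a fixed input wavelength changes P_out strongly. That input-power-dependent detuning is the mechanism the device uses to produce its nonlinear response.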
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.