Achieving efficient inverse design of low-dimensional heterostructures based on a vigorous scalable multi-task learning network

Shiyin Du; Jie You; Yuhua Tang; Hao Ouyang; Zilong Tao; Tian Jiang

doi:10.1364/OE.426968

1. Introduction

The advent of graphene in 2004 triggered a huge amount of innovative scientific inquiry, owing to its metallic, gapless structure and remarkable mechanical, thermal, and physical properties [1]. Over the last decade, graphene has been shown to exhibit large thermal conductivity [2] and carrier mobility (∼2×10⁵ cm²/(V·s)) [3], zero bandgap [4], and appreciable optical absorption (2.3%) [5] over an ultrabroad wavelength range, namely from the visible to the terahertz (THz), which has led to extraordinary progress in graphene-based on-chip photonic functionalities [6,7]. In particular, owing to its metallic nature, a graphene film or nano-patterned graphene can support propagating or localized surface plasmon polaritons (SPPs), which allows the surface optical field to be enhanced once these localized SPPs are excited [2,8]. Furthermore, external stimuli (e.g., chemical doping or electric gating) can be used to engineer the optical response of graphene on an ultrashort time scale [9–11], which gives graphene excellent potential for applications in ultrafast optoelectronic devices.

Spurred by the great success of metamaterials in understanding light-matter interactions and implementing multiple functions, including optical sensing [12], electromagnetic cloaking [13], perfect absorption [14], optical modulation [15,16] and imaging [17], the exploration of graphene has been extended to its periodically patterned counterparts. Unlike traditional metallic metamaterials [18,19], graphene metasurfaces not only offer the advantages of 2D metamaterials and comparatively small optical losses [20], but also have significantly reduced dimensions while maintaining resonances at THz frequencies similar to those of metallic ones [21,22]. A particularly appealing aspect of graphene metasurfaces is their integration with silicon (Si) nanoparticles and the resulting optical properties [23,24], which is extremely valuable for innovations in ultrafast nanodevices and for understanding the underlying physics. The fabrication of graphene ribbons and squares [25], Si nanostructures, and their heterostructures is feasible nowadays. However, a vigorous and intelligent tool that describes the physics of low-dimensional heterostructures and facilitates the fast inverse design of such structures is still lacking, since in practice both tasks require either sophisticated experiments or traditional computational methods that consume considerable computing time and resources [8].

Owing to advances in computer science, deep learning (DL), a statistical learning method, has been widely applied in many fields, including natural language processing [26,27], computer vision [28,29], signal processing [30,31], biomedicine [32,33], gaming [34,35] and physics [36–38]. In particular, DL-based inverse design in nanophotonics has received considerable scholarly attention in recent years [39–53]. Peurifoy et al. employed artificial neural networks to estimate the light scattering of multilayer nanoparticles, opening the era of inverse design based on deep learning [39]. Ashalley et al. utilized multi-task learning to improve the performance of the main task through auxiliary tasks, thereby completing the design of chiral plasmonic metamaterials [42]. Liu et al. designed nanostructures using neural networks based on transmission properties. They found that the non-uniqueness of the transmission spectrum degraded the training of the model, and therefore designed an architecture that combines an inverse-mapping algorithm with a forward-mapping algorithm [43]. Liu et al. exploited generative adversarial networks (GANs) to design metasurfaces, demonstrating the possibility of using unsupervised learning for the inverse design of nanophotonic structures [44]. Compared with the conventional design process, these DL methods exhibit great advantages in design speed, accuracy and generalization ability, and can produce many non-intuitive structures. However, the use of DL has certain drawbacks, one of which is that abundant training samples are normally required for such a model to accomplish a specific task. For instance, in 2020 Tang et al. used 15,000 samples to train a generative deep learning model in order to design nanopatterned integrated photonic components [54]. Zhang et al. trained an artificial neural network using 20,000 samples to achieve spectrum prediction, inverse design, and performance optimization for structures consisting of a plasmonic waveguide coupled with cavities [55]. Unfortunately, it remains difficult to acquire such a large amount of training data in more general cases. This problem becomes more prominent in experimental research. One salient example is that, in order to design an optical image random positioning system, Han et al. had to collect optical image datasets from 6 different optical microscopes over the past 10 years [56]. Another significant disadvantage is that one DL model is generally responsible for a single task and lacks the capability of handling additional ones. In fact, biological data processing appears to follow a multi-tasking strategy, as evidenced by the human ability to deal with multiple tasks simultaneously [57,58]. Therefore, a rational inverse design model whose key algorithm is multi-task deep learning (MDL) should be developed in order to identify nanostructures and estimate their geometric parameters from spectral images across various problems.

One notable feature of an MDL model is that it can accomplish multiple tasks simultaneously. A task generally refers to the learning of one output target from a single input source, which can be mathematically expressed as $T_i \triangleq \{p_i(x),\; p_i(y|x),\; \mathcal{L}_i\}$. Here, $p_i(x)$ and $p_i(y|x)$ represent the probability distribution of the input source and the conditional probability distribution of the output target, respectively, and $\mathcal{L}_i$ is a cost function for model training. In this sense, MDL can learn different output attributes from one input source, the same attribute of multiple input sources, or multiple attributes of multiple input sources [59]. Considering the constraints among different tasks, MDL is believed to have the advantages of implicit data augmentation, eavesdropping, attention focusing, representation bias, and regularization [60–62].

In this work, a scalable multi-task learning (SMTL) algorithm is proposed and used to perform the efficient inverse design of graphene-silicon heterostructures, as well as to predict the optical properties of such samples. A striking characteristic of the SMTL model is that it can not only perform the inverse design of a variety of heterostructures in a highly accurate and ultrafast manner, but also be rapidly extended to new structures, overcoming the limitation that traditional deep learning methods can only deal with one specific physical problem. Here, both single graphene-Si heterostructures containing n×n graphene squares and their periodic array counterparts, as well as periodic graphene patches and pure Si cubes, are systematically explored via the SMTL model. Importantly, the conventional finite element method (FEM) is used to characterize the optical absorption of the above nanostructures and thereby supply the data for SMTL training. During the evaluation process, a normalization mechanism is put forward to bring the inconsistent parameter scales of different optical structures onto the same scale, which serves as the basis of the multi-task inverse design. Meanwhile, an algorithm that quantitatively measures the influence of geometric parameters on the optical response is employed in the training of SMTL, enhancing its generalization ability. A comprehensive analysis of optical absorption in graphene-Si heterostructures with different geometric parameters, together with the optimal inverse design, is conducted with SMTL, during which the complex and nonintuitive relations between optical absorption and geometric parameters, including the number and size of graphene patches and the variable length, width, and thickness of the Si cuboid, are thoroughly revealed.

2. Design of a graphene-Si hybrid nanostructure

Figure 1 shows the schematics of the two-dimensional (2D) periodic graphene-Si cube hybrid structure, which is denoted as S1. Its unit cell contains n×n graphene squares, a Si cube and the glass substrate, with unit periods P_x and P_y along the x- and y-axes. Nine types of graphene patches are considered in the Si-based heterostructures, namely 1×1, 2×2, 3×3, 4×4, 5×5, 6×6, 7×7, 8×8, and 9×9 graphene squares, with their spatial distributions displayed in the inset of Fig. 1. These hybrid structures are denoted as GS1-GS9. Two factors that may significantly affect the absorption properties of the graphene squares are introduced here, namely the length of the graphene square (w) and the gap between two adjacent graphene patches (g). As for the Si cube, its length (L), width (W), and height (H) can be varied as desired, which makes it possible to explore the complicated absorption behavior of the graphene-Si hybrid nanostructure.

Fig. 1. Schematics of the two-dimensional periodic graphene-Si cube hybrid structure, on top of the glass substrate. Its unit cell comprises one Si cube whose length (L), width (W), and height (H) are variable, as well as the n×n graphene patches with n=1∼9.


Furthermore, in addition to the hybrid structures presented in Fig. 1, other nanostructures are investigated in this work, including the single graphene-Si cube heterostructure (S2), the one-dimensional periodic graphene ribbon (S3), the 2D periodic graphene square (S4), the single Si cube (S5), and the periodic Si cube (S6). For comparison, we also change the cube material from Si to air in both the single and the periodic graphene-Si cube heterostructures in our numerical simulations, which are referred to as S7 and S8, respectively. Notably, the linear optical properties of graphene are determined by the surface optical conductivity σ_s [23], which can be calculated using Kubo's formula. We use a Fermi level of 0.6 eV, a relaxation time of 0.25 ps, and a temperature of 300 K for σ_s of graphene in all numerical simulations. The permittivity of Si is taken from [63].
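As a rough illustration of how σ_s can be evaluated with the parameters quoted above, the following sketch computes only the intraband (Drude-like) term of the Kubo conductivity. This is an assumption on our part: the text does not state which terms of Kubo's formula are retained, the interband contribution is omitted here, and the exp(-iωt) time convention is assumed. The function name `sigma_intraband` is hypothetical.

```python
import math

# Physical constants (SI units).
E_CHARGE = 1.602176634e-19  # elementary charge, C
HBAR = 1.054571817e-34      # reduced Planck constant, J*s
K_B = 1.380649e-23          # Boltzmann constant, J/K


def sigma_intraband(freq_hz, fermi_ev=0.6, tau_s=0.25e-12, temp_k=300.0):
    """Intraband term of the Kubo surface conductivity of graphene, in siemens.

    Assumptions (not taken from the paper): exp(-i*omega*t) time convention,
    interband term neglected.
    """
    omega = 2.0 * math.pi * freq_hz
    kt = K_B * temp_k
    ef = fermi_ev * E_CHARGE  # Fermi level converted from eV to joules
    # Thermal factor; tends to ef / kt when ef >> kt.
    thermal = ef / kt + 2.0 * math.log(math.exp(-ef / kt) + 1.0)
    prefactor = 1j * E_CHARGE**2 * kt / (math.pi * HBAR**2)
    return prefactor * thermal / (omega + 1j / tau_s)


# Example evaluation at 10 THz with the Fermi level, relaxation time and
# temperature quoted in the text.
sigma_10thz = sigma_intraband(10e12)
```

With a Fermi level of 0.6 eV at 300 K the thermal correction is negligible, so the result is practically the plain Drude expression; a full treatment would add the interband term.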

3. Scalable multi-task learning model

In conventional numerical simulation, the optical response of nanostructures is obtained by solving the Maxwell equations [64–66]. This process can be mathematically formulated as a mapping $f:X \to Y$ from a nanostructure ($x \in X$) to an optical response ($y \in Y$), where $y = f(x)$. The corresponding inverse design of nanostructures with a desired optical response is naturally expressed as a mapping $g:Y \to X$ from an optical response $y \in Y$ to an optical structure $x \in X$, where $x = g(y)$ [67]. Evidently, the mapping $g$ is the inverse of the mapping $f$, namely $g = {f^{ - 1}}$; finding it, however, is an ill-posed problem that is difficult to solve mathematically. The solution of an ill-posed problem either does not exist, or is not unique, or does not depend continuously on the data and parameters [68,69]. As a result, parameters that barely affect the optical response may lead to multiple solutions of the inverse design. Meanwhile, small changes in the optical response are likely to cause huge differences in the parameters, since the solution does not depend continuously on the response. Moreover, for an arbitrarily specified optical response, there may be no corresponding structure. In this case, an approximate solution ($G$) is needed to minimize the error, which is mathematically given by:

$$\mathop{\arg\min}\limits_{G} \; D\big(G(\varphi(y)),\, x\big) + \lambda R\big(G(\varphi(y))\big) \tag{1}$$

where $D$ is a distance measurement function, $\varphi$ is the transformation that converts the spectrum into a higher-dimensional space to increase the difference among spectra, $R$ is the regularization function, and $\lambda$ is a hyperparameter that represents the weight of $R$. After finding the approximate solution $G$, the desired optical structure can be calculated as $\tilde x = G(\varphi (y))$. Here, the transformation is equivalent to indirectly increasing the influence of unimportant parameters on the optical response, so that even minor changes in unimportant parameters are accurately reflected in the transformed spectrum. The regularization $R$ relies on the fact that the abovementioned unimportant parameters of nanostructures may determine other factors, such as price, stability, and lifetime, rather than the optical properties. When there are multiple inverse-designed structures, one can select the optimal one by means of additional constraints (the regularization function $R$).

The schematic configuration of the SMTL network is shown in Fig. 2. The SMTL model consists of two important sections, namely a classification network and a series of multi-task inverse design networks. More specifically, the classification network preliminarily identifies the nanostructures based on the features of their linear absorption spectra. It is relatively simple for the classification network to roughly distinguish the type of heterostructure, the geometric parameters and their number. Next, the inverse design networks determine the category and geometric parameters of the nanostructure according to the results of the classification network. Herein, the classification network acts as a decision-maker in the SMTL model, while the inverse design networks are executors. Importantly, when facing a completely different task, we only need to fine-tune these two networks of SMTL, instead of changing all algorithms. This indicates the high scalability of the SMTL model. Moreover, the multilayer classification structure of the SMTL model is good at tackling the inconsistency of geometrical parameters, while the inverse design network is itself an MDL model that shows advantages in predicting a variety of specific structures with similar configurations and similar geometric parameters. In this context, the linear absorption spectra datasets processed by each inverse design network come from a variety of low-dimensional nanostructures, rather than from one particular category of nanostructure with different geometric parameters.

Fig. 2. Schematics of the SMTL model for inverse design of low-dimensional nanostructures. The orange dashed box shows the main architecture of the SMTL model, namely a classification network and a series of inverse design networks, while the solid blue box below describes the detailed structure of the inverse design network. Here, the SMTL model used for inverse design can be divided into three stages, indicated by blue big arrows: (A) Input the linear absorption spectrum into the classification network to obtain the category number. (B) Choose an inverse design network based on the category number. (C) Use the selected inverse design network to predict the low-dimensional nanostructures. Additionally, to verify the inverse design performance of the SMTL model, we also need to calculate the linear absorption spectrum for the predicted structure (D) and compare it with the input spectrum (E).
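Stages (A) to (C) in Fig. 2 amount to a classify-then-dispatch routine. The sketch below illustrates that control flow only; the function name `smtl_inverse_design`, the toy classifier and the toy inverse design networks are hypothetical stand-ins and are not the trained networks of this work.

```python
def smtl_inverse_design(spectrum, classifier, inverse_networks):
    """Route a sampled absorption spectrum through stages (A)-(C).

    classifier: callable mapping a spectrum to a list of category probabilities.
    inverse_networks: dict mapping a category number to a callable that maps
        the spectrum to predicted structure parameters.
    """
    probs = classifier(spectrum)                                  # stage (A)
    category = max(range(len(probs)), key=probs.__getitem__)      # most likely category
    network = inverse_networks[category]                          # stage (B)
    return category, network(spectrum)                            # stage (C)


# Toy stand-ins, for illustration only.
def toy_classifier(spectrum):
    return [0.1, 0.7, 0.2]


toy_networks = {
    0: lambda s: {"a": 1},
    1: lambda s: {"a": 2, "b": 3},
    2: lambda s: {"a": 4, "b": 5, "c": 6},
}

category, params = smtl_inverse_design([0.2, 0.5, 0.3], toy_classifier, toy_networks)
```

Because the dispatch table is a plain mapping, supporting a new family of structures in this sketch only requires adding one entry and retraining the classifier, which mirrors the scalability argument made in the text.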


We now turn to the details of these two key networks. To start with, the classification network in Fig. 2 is a fully connected neural network, which contains an input layer, an output layer, and eight hidden layers. The output layer uses the “softmax” function as the activation function, while in the hidden layers the “leaky-relu” function is used for the first four layers and the “softplus” function for the last four layers. During model training, the cross-entropy is used as the loss function. Secondly, one can see from Fig. 2 that the inverse design network consists of an autoencoder and a fully connected neural network. The main function of the autoencoder is to convert linear absorption spectra into a higher-dimensional feature space. Specifically, this autoencoder is a six-layer fully connected neural network, in which the first four layers constitute the encoder and the last two layers serve as the decoder. Here, we use the Euclidean distance to measure the difference among the transformed spectra. Therefore, during the training of the autoencoder, a loss function ${{\cal L}_\varphi }$ is applied to guide its learning, which is mathematically expressed as:

$$\mathcal{L}_\varphi = \frac{1}{m}\sum\limits_{i = 1}^{m} \left| y_{pred}(i) - y_{real}(i) \right| + \frac{\alpha m(m - 1)}{\sum\limits_{i = 1}^{m} \sum\limits_{j = 1, j \ne i}^{m} \| y^{\prime}(i) - y^{\prime}(j) \|_2^2} \tag{2}$$

where ${y_{pred}}$ and ${y_{real}}$ represent the output of the decoder and the input of the encoder, respectively, $y^{\prime}$ is the output of the encoder, $m$ is the batch size and $\alpha $ is the weight of the difference term.
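To make the two terms of Eq. (2) concrete, here is a minimal plain-Python sketch of the loss for one batch. It assumes that the absolute value in the first term is summed over the spectral points of each sample, which the text does not specify; the function name `autoencoder_loss` is hypothetical.

```python
def autoencoder_loss(y_real, y_pred, y_latent, alpha):
    """Sketch of Eq. (2) for one batch of m samples.

    y_real, y_pred: lists of m spectra (encoder input and decoder output).
    y_latent: list of m encoder outputs y'.
    alpha: weight of the term that rewards well-separated encoder outputs.
    """
    m = len(y_real)
    # First term: reconstruction error, averaged over the batch
    # (assumption: absolute differences are summed over spectral points).
    recon = sum(
        sum(abs(p - r) for p, r in zip(y_pred[i], y_real[i])) for i in range(m)
    ) / m
    # Denominator of the second term: squared Euclidean distances between
    # encoder outputs over all ordered pairs (i, j) with i != j.
    pair_sum = sum(
        sum((a - b) ** 2 for a, b in zip(y_latent[i], y_latent[j]))
        for i in range(m)
        for j in range(m)
        if i != j
    )
    return recon + alpha * m * (m - 1) / pair_sum


# Two toy samples whose encoder outputs are a distance of 5 apart.
example_loss = autoencoder_loss(
    y_real=[[0.0, 1.0], [1.0, 0.0]],
    y_pred=[[0.0, 0.5], [1.0, 0.0]],
    y_latent=[[0.0, 0.0], [3.0, 4.0]],
    alpha=1.0,
)
```

The second term shrinks as the encoder outputs move apart, which is how the loss pushes similar spectra toward distinguishable representations. Note that this sketch divides by zero if all encoder outputs in a batch coincide.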

In fact, the autoencoder implemented here corresponds to the aforementioned transformation $\varphi$. Naturally, the fully connected neural network in the inverse design network coincides with the approximate solution $G$ mentioned earlier. What follows is the training of this fully connected neural network (the hard parameter sharing network in Fig. 2) with the datasets extracted from a variety of low-dimensional nanostructures. Nevertheless, it is hard to use these datasets directly to train the inverse design network, since their geometric parameters have different scales. Therefore, the geometric parameters must be normalized to the same scale before model training, because differences in parameter scales often lead to the failure of the inverse design network. Inspired by the fabrication precision of the device, a normalization mechanism is proposed to address this issue. Specifically, each optical structure is fabricated with a minimum resolution, and its fabrication precision cannot exceed this minimum resolution. In other words, the size of the optical structure is discrete and is an integer multiple of the minimum resolution. Thus, we define a unit size and use the ratio of the actual size to the unit size to represent the normalized size. In this way, all geometric parameters are normalized to a positive integer. If a minimum size of 0 is selected, all geometric parameters will be aligned. Mathematically, the normalization mechanism can be defined as:

$$x_{norm} = \frac{x - x_{\min}}{\delta} \tag{3}$$

where ${x_{\min }}$ and $x$ denote the minimum size and the actual size, respectively, and $\delta $ represents the unit size. It is worth noting that the geometric parameters of each structure have different ${x_{\min }}$ and $\delta $ when normalized. Therefore, it is necessary to choose appropriate normalization parameters (${x_{\min }}$ and $\delta $) for different structures, ensuring that the labels of different tasks are on the same scale. Since all parameters after normalization are integers, the predicted parameters must also be integers. Thus, the parameter predictions of the inverse design network need to be rounded, so that prediction errors smaller than 0.5 vanish after rounding.
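The normalization of Eq. (3) and the subsequent rounding of predictions can be sketched as follows. The numerical values of the minimum size and unit size used in the example are illustrative assumptions and are not the normalization parameters used in this work; the function names are hypothetical.

```python
import math


def normalize_size(x, x_min, delta):
    """Eq. (3): express a size as an integer number of unit sizes above x_min."""
    return round((x - x_min) / delta)


def denormalize_size(x_norm, x_min, delta):
    """Map an integer label back to a physical size."""
    return x_min + x_norm * delta


def snap_prediction(raw_prediction):
    """Round a real-valued network output to the nearest integer label.

    Errors smaller than 0.5 therefore disappear after rounding.
    """
    return math.floor(raw_prediction + 0.5)


# Illustrative values only: a size of 8000 nm with a 1000 nm unit size, and a
# size of 230 nm with a 50 nm minimum size and a 10 nm unit size.
label_si = normalize_size(8000, x_min=0, delta=1000)
label_graphene = normalize_size(230, x_min=50, delta=10)
recovered = denormalize_size(snap_prediction(17.7), x_min=50, delta=10)
```

Choosing the minimum size and unit size per structure, as the text describes, is what lets parameters that differ by orders of magnitude share one integer label range.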

The training process of the hard parameter sharing network is described as follows. As shown in Fig. 2, this network has multiple outputs and all outputs share the first few layers (i.e., shared layers), with each output corresponding to a loss function. Using this strategy, the specific layers learn some specific features via loss functions, whereas the shared layers attain certain common features by synthesizing the constraints of multiple outputs. Therefore, we can obtain the following empirical risk minimization formula:

$$\mathop{\arg\min}\limits_{\theta^{sh},\,\theta^{i}} \; \sum\limits_{i = 1}^{T} w_i \mathcal{L}_i(\theta^{sh},\, \theta^{i}) \tag{4}$$

where ${w_i}$ is the weight of the loss function, $T$ is the number of outputs, and ${\theta ^{sh}}$ and ${\theta ^i}$ represent the shared parameters and the specific parameters in the inverse design network, respectively [31]. Considering that the trained fully connected neural network is the approximate solution $G$, the loss function ${{\cal L}_i}({\theta ^{sh}},\;{\theta ^i})$ corresponds to $D({G({\varphi (y )} ),{x_i}} )+ {\lambda _i}R({G({\varphi (y )} )} )$, where ${x_i}$ and ${\lambda _i}$ represent the $i$-th component of $x$ and $\lambda $, respectively, and $\varphi$ is the transformation introduced in Eq. (1). According to Eq. (4), the loss function during training of the hard parameter sharing network is defined as:

$$\mathcal{L}_{FNN} = \sum\limits_{i = 1}^{T} w_i \left\{ D\big(G(\varphi(y)),\, x_i\big) + \lambda_i R\big(G(\varphi(y))\big) \right\} \tag{5}$$

Since the geometric parameters predicted by the inverse design network are divided into structural parameters and size parameters, we use two types of distance measurement functions in Eq. (5). The prediction of structural parameters is a classification task, so we use the cross-entropy to measure the difference. The prediction of size parameters, in contrast, is a regression task, so the function $D$ is the mean absolute error (MAE). Because larger devices are easier to manufacture in the design of low-dimensional material heterostructures, we prefer larger devices. Thus, the regularization function $R$ is expressed as:

$$R(x) = \frac{1}{\| x \|} \tag{6}$$

where $x$ is the input of the regularization function $R$. Detailed information about the construction and training of the SMTL model is provided in Supplement 1.
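The weighted multi-task loss of Eq. (5), with cross-entropy for the structural output, absolute error for each size output, and the regularizer of Eq. (6), can be sketched for a single sample as below. Several details are assumptions made for illustration: the Euclidean norm in $R$, the use of a single classification output indexed 0, and all numerical values of the weights and of $\lambda_i$. Averaging the absolute errors over a batch would give the MAE mentioned in the text.

```python
import math


def cross_entropy(probs, true_class):
    """Cross-entropy of predicted class probabilities against the true class."""
    return -math.log(probs[true_class])


def size_regularizer(sizes):
    """Eq. (6): R(x) = 1 / ||x||, smaller for larger devices (Euclidean norm assumed)."""
    return 1.0 / math.sqrt(sum(v * v for v in sizes))


def multitask_loss(class_probs, true_class, size_pred, size_true, weights, lambdas):
    """Sketch of Eq. (5) for one sample.

    Output 0 is the structural (classification) output; outputs 1..T-1 are the
    size parameters. weights and lambdas hold w_i and lambda_i for all outputs.
    """
    reg = size_regularizer(size_pred)
    total = weights[0] * (cross_entropy(class_probs, true_class) + lambdas[0] * reg)
    for i, (p, t) in enumerate(zip(size_pred, size_true), start=1):
        total += weights[i] * (abs(p - t) + lambdas[i] * reg)
    return total


# Illustrative numbers only.
example_loss = multitask_loss(
    class_probs=[0.5, 0.5],
    true_class=0,
    size_pred=[3.0, 4.0],
    size_true=[3.0, 5.0],
    weights=[1.0, 2.0, 1.0],
    lambdas=[0.0, 0.1, 0.1],
)
```

In this sketch a larger predicted structure lowers the regularization term, so among candidates with equal data misfit the larger device is preferred, in line with the manufacturing argument above.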

Finally, we quantitatively measure the influence of geometric parameters on the optical response. By analyzing the gradient backpropagation during the network training, it can be found that ${w_i}$ is used to integrate the constraints of multiple outputs. At the same scale of losses, a larger ${w_i}$ would lead to a stronger influence of the corresponding optical geometric parameters on the feature extraction of the shared layer. Thus, it can be safely concluded that the greater the influence of the parameter on the linear absorption spectrum, the larger the weight. Therefore, we calculate the optimization weight of size parameter $k$ according to the following formula:

$$w_k = \frac{2}{n(n - 1)} \sum\limits_{i = 1}^{n} \sum\limits_{j = i + 1}^{n} \frac{\| sptr(i) - sptr(j) \|_1}{|k_i - k_j| + 1} \tag{7}$$

where $sptr(i )$ and $sptr(j )$ represent the i-th and j-th linear absorption spectra in the training set, with ${k_i}$ and ${k_j}$ being their corresponding size parameters. As for the classification parameter $c$, its weight is defined as follows:

$$w_c = \frac{2}{s(s - 1)} \sum\limits_{p = 1}^{s} \sum\limits_{q = p + 1}^{s} \frac{\| \overline{sptr_p} - \overline{sptr_q} \|_1}{|c_p - c_q|} \tag{8}$$

where $\overline {spt{r_p}} $ and $\overline {spt{r_q}} $ are the average of the linear absorption spectra for the p-th and the q-th category in the training set, ${c_p}$ and ${c_q}$ represent the corresponding category numbers, and s is the number of categories.
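Equations (7) and (8) are pairwise averages and translate directly into nested loops. The sketch below is a plain-Python reading of the two formulas; the function names are hypothetical, the toy spectra are invented for illustration, and it is assumed that $n$ counts the spectra passed in.

```python
def l1_distance(u, v):
    """L1 norm of the difference between two sampled spectra."""
    return sum(abs(a - b) for a, b in zip(u, v))


def size_weight(spectra, sizes):
    """Eq. (7): average spectral change per unit change of a size parameter."""
    n = len(spectra)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # The +1 in the denominator keeps the term finite when two
            # samples share the same size.
            total += l1_distance(spectra[i], spectra[j]) / (abs(sizes[i] - sizes[j]) + 1)
    return 2.0 * total / (n * (n - 1))


def class_weight(mean_spectra, categories):
    """Eq. (8): same idea, using the mean spectrum of each category."""
    s = len(mean_spectra)
    total = 0.0
    for p in range(s):
        for q in range(p + 1, s):
            total += l1_distance(mean_spectra[p], mean_spectra[q]) / abs(
                categories[p] - categories[q]
            )
    return 2.0 * total / (s * (s - 1))


# Toy data: spectra that change with the size parameter versus spectra that do not.
w_sensitive = size_weight([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]], [1, 2, 4])
w_insensitive = size_weight([[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]], [1, 2, 4])
w_class = class_weight([[0.0, 0.0], [1.0, 1.0]], [1, 2])
```

A parameter that leaves the spectrum unchanged receives a weight of zero in this sketch, which is the behavior the text asks of unimportant parameters.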

4. Inverse design of low-dimensional nanostructures

In this section, the SMTL model is employed to achieve the efficient inverse design of graphene-silicon heterostructures and other related nanoparticles, with the aim of verifying the superiority of the SMTL model in multi-structure inverse design. An essential prerequisite is that the SMTL model can accurately predict the category of a nanostructure and its geometric parameters from its linear absorption spectrum. Here, the inputs of the SMTL model are discrete data points uniformly sampled from the linear absorption spectra, and the outputs are the structure parameters and the structure number. Three different inverse design networks (see Fig. 2) are used in our SMTL model: (1) The first one predicts the specific category of low-dimensional nanoparticles (i.e., S1-S8), with linear absorption spectra as the input and an output containing a classification parameter and two size parameters. For convenience, these two size parameters are denoted as a and b, where a represents the length of graphene and b represents the length of silicon. (2) The second network identifies the key geometric parameters of the Si cuboids in the arrays of graphene-Si heterostructures, which are the length, width, and height, represented by a, b, and c, respectively. (3) The third network distinguishes the patterns of graphene squares in the GS1-GS9 structures. Here, the two important geometric parameters are the length of the graphene patches and the gap between two adjacent patches, denoted as a and b, respectively. The resulting optimization weights for all geometric parameters are shown in Table 1.

Table 1. The parameter optimization weights of inverse design networks.


Using the parameter optimization weights in Table 1, we proceed to the efficient inverse design of graphene-silicon heterostructures and other related nanoparticles, with the main results shown in Fig. 3. It should be emphasized that the FEM approach is also applied to calculate the linear absorption spectra of the nanostructures predicted by SMTL, in order to visually verify the accuracy of the SMTL model. Figure 3(a) shows the input spectrum and the spectrum corresponding to the prediction for the S4 sample consisting of 2D periodic graphene squares, for which the true length and the predicted length are both 161.6 nm. Additionally, the optical absorption responses of the GS4 sample obtained from FEM and from the prediction of the second inverse design network of the SMTL model are presented in Fig. 3(c). Interestingly, the true length, width and height of the Si cuboid are 8000 nm, 2000 nm, and 6000 nm, respectively, whereas the predicted values are 8000 nm, 8000 nm, and 6000 nm. Although the predicted sizes are inconsistent with the true ones, the linear absorption spectra are almost the same. This indicates the existence of unimportant parameters in the inverse design. Therefore, it is necessary to consider the influence of unimportant parameters in the inverse design solution. For the GS9 sample, the comparison between the FEM-calculated absorption and that obtained from the prediction of the third network of SMTL is shown in Fig. 3(e). The predicted length and gap of the graphene squares are 230 nm and 368 nm, respectively, which are consistent with the original setting.

Fig. 3. Quantified results of inverse design with SMTL model. (a), (c), (e) Spectral comparison of true (red solid lines) and predicted values (blue dotted lines). (b), (d), (f) The predicted MAPE of samples in different categories based on the SMTL model. (a) The real result and prediction result for the 2D periodic graphene square (S4). (b) The results of the SMTL model on a data set composed of samples with eight kinds of structures (S1-S8). (c) The real result and prediction result for the 4×4 graphene-silicon heterostructures (GS4). (d) The results of the SMTL model on a data set containing samples with different channel numbers and silicon cuboid geometric parameters (length, width, and height). (e) The real result and prediction result for the 9×9 graphene-silicon heterostructures (GS9). (f) The results of the SMTL model in a data set consisting of samples with different channel numbers and graphene square geometric parameters (length and gap).


The three panels mentioned above show the prediction ability of the SMTL model for individual samples, which is not enough to demonstrate its overall predictive performance. Therefore, we set up three datasets, which correspond to the three inverse design networks. Firstly, we randomly select 400 samples from a dataset containing S1-S8 as the first validation set, as shown in Fig. 3(b). The mean absolute percentage error (MAPE) is used as the main evaluation metric, which is described by the following equation:

$$MAPE = \frac{1}{n}\sum\limits_{i = 1}^{n} \left| \frac{y_i^{pred} - y_i^{real}}{y_i^{real}} \right| \times 100\% \tag{9}$$
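For reference, Eq. (9) can be evaluated in a few lines. The sketch below is a direct transcription of the formula; the function name `mape` and the example numbers are illustrative and do not come from the validation sets of this work.

```python
def mape(y_pred, y_real):
    """Eq. (9): mean absolute percentage error, in percent.

    Undefined if any reference value in y_real is zero.
    """
    n = len(y_real)
    return 100.0 * sum(abs((p - r) / r) for p, r in zip(y_pred, y_real)) / n


# Two predictions that are each off by 1% of the reference value.
example_mape = mape([101.0, 198.0], [100.0, 200.0])
```

Because each term is scaled by its own reference value, the metric treats a 1% deviation equally whether the reference is large or small.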

The MAPEs in Fig. 3(b) are all lower than 1%, which means that the SMTL model is able to accurately predict the structure and geometric parameters of S1-S8 from their absorption properties. Additionally, we use 800 samples from GS1-GS9 with different Si cuboid sizes to construct the second validation set, and the test results are shown in Fig. 3(d). Except for the GS6 samples, whose MAPE is slightly higher than 1%, the MAPEs of the remaining channels are all lower than 1%. Finally, the third validation set consists of 600 samples from GS1-GS9 with different parameters for the graphene squares. As shown in Fig. 3(f), the MAPE for all channels is less than 0.4%. Moreover, there are six channels whose MAPE is even less than 0.1% in Fig. 3(f), corresponding to the structures GS1, GS4-GS5, and GS7-GS9.

Although the geometric parameters of some optical structures differ greatly, their linear absorption spectra are quite similar (e.g., Fig. 3(c)). Such geometric parameters are referred to as unimportant parameters, since they have only a slight impact on the linear absorption characteristics, and they should therefore be given a smaller optimization weight in the SMTL training process. To verify the effectiveness of the proposed weight calculation method, we train the SMTL model with the influence weights calculated by Eq. (7) and Eq. (8). For comparison, we also train the model with the optimization weights of all parameters set to 1, which means that the influence of all parameters is assumed to be the same. The generalization ability of the SMTL model is then tested on the three validation sets constructed earlier, with the final results shown in Table 2. Our proposed method reduces the MAPE of the SMTL model in all cases. This indicates that the generalization ability of the SMTL model can be improved by paying sufficient attention to the influence of parameters on the optical response.

Table 2. The MAPEs of the SMTL model trained based on different optimization weights.


5. SMTL assisted in-depth exploration of graphene-Si heterostructures

Armed with such a powerful prototyping tool, we now perform the inverse design of various low-dimensional heterostructures and investigate the complex correlations between their geometrical parameters and optical properties. To address the interaction between graphene and the Si cube and its impact on the optical response of the heterostructure, we first investigate the linear absorption of both the pure graphene square and the graphene-Si cube heterostructures, with the results shown in Fig. 4. More specifically, six types of graphene-based structures are considered here, with the unit period being P_x=P_y=10 µm for all periodic nanostructures. Figures 4(a) and 4(b) show the linear absorption of a single graphene square of variable length (w=50 nm∼2 µm) placed on a glass substrate, and of the same graphene patch arranged in a 2D periodic pattern, respectively. One important finding is that, although the resonance at the fundamental frequency (FF) of the single graphene square follows a trend similar to that of the periodic graphene square, the FF of the former is smaller than that of the latter when w<1 µm. Moreover, the plasmon resonance at the third harmonic (TH) frequency can be easily discerned in Fig. 4(a) and Fig. 4(b), allowing for further study of the nonlinear optical response. For comparison, Fig. 4(c) and Fig. 4(d) show a single heterostructure comprising the aforementioned graphene square and one Si cube of fixed length (L=2 µm), and its 2D periodic array counterpart, respectively. What stands out in these two figures is the increase of the FF compared with the cases without the Si cube, regardless of whether the structure is periodic. A reasonable explanation for this phenomenon is that the optical modes in the Si cube interact with the localized SPPs in the graphene squares, resulting in different local optical fields.
In order to quantify the contribution of Si cube to the absorption property of graphene, both single- and periodic heterostructures possessing one Si cube but in a changeable length (L=500 nm∼9 µm) and a graphene square (w=100 nm) are investigated, whose results are presented in Fig. 4(e) and Fig. 4(f), respectively. The most interesting aspect is that two different types of resonances appear at the absorption map for both single and periodic heterostructures. More specifically, the curves with a smaller gradient represent the resonances of SPPs in graphene square, while the lines with a larger gradient correspond to optical modes of Si cube. In addition, several crossing points with high absorption intensities can be seen from there two figures, whose detailed information are provided in Table S2 in Supplement 1, indicating the strong coupling effect between graphene and Si cube at these frequencies.

Fig. 4. The SMTL-predicted normalized absorption of single graphene-based nanostructures versus their periodic arrays, considering both the pure graphene square and the graphene-Si cube hybrid structure. Specifically, (a) shows a single 1×1 graphene square on top of the glass substrate, whose length (w) is in the range of 50 nm ∼ 2 µm; (b) shows the two-dimensional periodic array of graphene squares with the same dimensions as in (a); (c) shows a single hybrid structure containing a 1×1 graphene square of variable length (w=50 nm∼2 µm) and a Si cube of fixed length (L=2 µm); (d) shows the two-dimensional periodic array counterpart of (c); (e) shows a single hybrid structure containing a 1×1 graphene square (w=100 nm) and a Si cube of variable length (L = 500 nm ∼ 9 µm); (f) shows the same hybrid structure as (e) but arranged in a two-dimensional periodic pattern. In all periodic arrays, the unit period is set to P_x= P_y= 10 µm.


The optical absorption of graphene-Si heterostructures depends critically on the graphene patches. In particular, the number of graphene squares and their geometric parameters (e.g., length and gap) are the main factors that determine the overall performance. Thus, we have explored how the absorption characteristics of the periodic graphene-Si cube heterostructure depend on the graphene dimensions and on the number of graphene patches. To illustrate the main findings of this analysis, we show in Fig. 5 the variation of the normalized absorption with w and g for the periodic GS1-GS9 structures, whose distributions on top of the Si cube are given in Fig. 1. Notably, the geometric parameters of the Si cube are L = W=H=8 µm, and the corresponding unit periods are P_x=P_y=8 µm along the x- and y-axis. It can be seen from Fig. 5 that the normalized absorption increases with the number of graphene patches, so the GS9 structure exhibits the strongest absorption response. However, for the same w and g, the resonances of GS1-GS9 show only a slight wavelength shift. In all nine types of heterostructures, the main absorption peaks red-shift as w increases, for both g=0.2w (solid lines) and g = w (dashed lines). It is worth mentioning that the gap does not affect the absorption of the GS1 structure, since this structure contains only one graphene square. Notably, the spectral resonances of the periodic graphene-Si hybrid structures at shorter wavelengths are attributed to the coupling between the SPPs of the graphene squares and the optical modes of the Si cube. Another significant finding is that the normalized absorption for g=0.2w is larger than that for g = w when w ≤ 100 nm, whereas the opposite is true when w > 100 nm. This is possibly because the larger areas associated with larger w and g, namely w > 100 nm and g = w, yield a stronger absorption response.
For heterostructures with shorter graphene length (w ≤ 100 nm), the shorter gap (g=0.2w) results in stronger interactions between adjacent graphene patches, which dominate the optical absorption over the coupling among unit cells, given that the unit period of P_x=P_y=8 µm is considerably large. In addition, the smaller gaps distort the absorption curves to some extent, regardless of the number and length of the graphene patches.
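As a quick geometric check on the parameter ranges discussed above, the following Python sketch computes the side length of an n×n patch array and the fraction of the unit cell covered by graphene. It assumes (this is not stated in the text) that the patches form a square grid whose side is n·w + (n−1)·g; the function names are hypothetical and are not part of the SMTL code.

```python
def patch_array_side_nm(n: int, w_nm: float, g_nm: float) -> float:
    """Side length of an n x n grid of squares of width w separated by gap g.

    Assumption: gaps exist only between adjacent patches, so there are n - 1 gaps.
    """
    return n * w_nm + (n - 1) * g_nm


def graphene_fill_fraction(n: int, w_nm: float,
                           period_x_nm: float, period_y_nm: float) -> float:
    """Fraction of the unit-cell area covered by the n x n graphene squares."""
    return (n * n * w_nm * w_nm) / (period_x_nm * period_y_nm)


PERIOD_NM = 8000.0  # P_x = P_y = 8 um, as quoted for Fig. 5

for n in (1, 5, 9):
    for w_nm in (50.0, 300.0):
        for gap_ratio in (0.2, 1.0):  # g = 0.2w and g = w
            side = patch_array_side_nm(n, w_nm, gap_ratio * w_nm)
            fill = graphene_fill_fraction(n, w_nm, PERIOD_NM, PERIOD_NM)
            print(f"n={n} w={w_nm:.0f} nm g={gap_ratio}w: "
                  f"side={side:.0f} nm, fill={fill:.4%}")
```

Under this assumption, the largest array considered in Fig. 5 (9×9, w = 300 nm, g = w) spans 5.1 µm, which fits on the 8 µm cube, and covers about 11% of the unit cell. The fill fraction does not depend on g, so differences between the g=0.2w and g = w curves at fixed n and w cannot come from the graphene area alone.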

Fig. 5. Normalized absorption of different periodic graphene-Si hybrid structures for various lengths of the graphene patches and gaps between adjacent patches, calculated via the SMTL model. Panels (a)-(f) correspond to graphene square lengths of w = 50 nm, 100 nm, 150 nm, 200 nm, 250 nm, and 300 nm, respectively. The red, blue, black, magenta, light blue, green, orange, grey, and purple curves correspond to 1×1, 2×2, 3×3, 4×4, 5×5, 6×6, 7×7, 8×8, and 9×9 graphene patches within a unit cell, respectively. The solid lines correspond to a gap between adjacent patches of g=0.2w, while the dashed lines correspond to g = w. The structure consisting of a 1×1 graphene patch is not affected by the gap, so its absorption curves for different g overlap. Additionally, the dimensions of the Si cube are fixed at L = W=H=8 µm, so that the unit periods are P_x= P_y= 8 µm along the x- and y-axis.


We now consider the influence of the Si cuboids on the absorption properties of the graphene-Si heterostructures. To further clarify the dependence of the absorption on the geometric parameters, we varied the length and height of the Si cuboids in the range of 2 µm∼8 µm while keeping the width constant at W=2 µm. The corresponding absorption curves predicted by SMTL for the GS1-GS9 structures are depicted in Fig. 6. One remarkable finding from this figure is that, in all cases, the optical absorption is enhanced as the number of graphene patches increases. The cuboid length, in contrast, affects the absorption of the GS1-GS9 structures differently depending on the height. For instance, in Fig. 6(a)-(c), the absorption resonances at the FFs of the heterostructures (H=2 µm and W=2 µm) exhibit a distinct red-shift with increasing length, namely from λ=27 µm at L=2 µm to λ=30 µm at L=5 µm and λ=34 µm at L=8 µm. However, only a slight difference among these absorption peaks is observed for GS1-GS9 with H=8 µm (see Fig. 6(g)-(i)), indicating that the length of the Si cuboid barely affects the optical response of these heterostructures. This may be explained by the fact that the size of the n×n graphene squares (n=1–9) is rather small compared to the unit period of P_x=L=8 µm, resulting in similarly weak coupling among unit cells, especially when the height is relatively large. Similarly, if the length of the Si cuboids is further increased to 10 µm and 25 µm, it hardly affects the optical absorption of the GS1-GS9 heterostructures with W=2 µm and H=8 µm. Furthermore, when the height decreases to H=5 µm, the impact of the length becomes more apparent, as seen by comparing Fig. 6(d) and Fig. 6(f). On the other hand, the heterostructures in Fig. 6(a) preserve most of the absorption features of pure periodic arrays of graphene squares, particularly the peaks at the FFs and THs.
Nonetheless, the shapes of the absorption curves of the heterostructures change to some extent when P_x ≠ P_y.
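To put the quoted red-shift into frequency units, the short Python sketch below converts the three peak wavelengths to THz and estimates the shift per unit cuboid length. It assumes that the three peaks belong to L = 2 µm, 5 µm, and 8 µm, i.e., the three cuboid lengths of Fig. 6(a)-(c); the function name is hypothetical, and only the wavelengths are taken from the text.

```python
C_UM_THZ = 299.792458  # speed of light expressed in um * THz


def wavelength_um_to_thz(wavelength_um: float) -> float:
    """Convert a free-space wavelength in micrometres to a frequency in THz."""
    return C_UM_THZ / wavelength_um


# (cuboid length L in um, FF resonance wavelength in um) for H = W = 2 um.
# The lengths are an assumption based on the panel layout of Fig. 6.
peaks = [(2.0, 27.0), (5.0, 30.0), (8.0, 34.0)]

for length_um, lam_um in peaks:
    print(f"L={length_um:.0f} um: lambda={lam_um:.0f} um, "
          f"f={wavelength_um_to_thz(lam_um):.2f} THz")

# Average red-shift (um of wavelength per um of cuboid length) between
# consecutive panels.
shift_rates = [
    (lam_b - lam_a) / (len_b - len_a)
    for (len_a, lam_a), (len_b, lam_b) in zip(peaks, peaks[1:])
]
print("red-shift per um of length:", [round(r, 2) for r in shift_rates])
```

With these inputs, the FF resonance moves from roughly 11.1 THz to 10.0 THz and then to 8.8 THz, and the wavelength shift per micrometre of added length grows from about 1.0 to about 1.3, so the red-shift is not strictly linear in L.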

Fig. 6. Predicted absorption of nine types of graphene-Si hybrid arrays containing Si cuboids of different shapes. Specifically, a unit cell of the periodic graphene-Si hybrid structures contains 1×1 (red), 2×2 (blue), 3×3 (black), 4×4 (magenta), 5×5 (light blue), 6×6 (green), 7×7 (orange), 8×8 (grey), or 9×9 (purple) graphene patches. In all cases, the width of the Si cuboid is fixed at W=2 µm, while its length (L) and height (H) change from 2 µm to 8 µm. From left to right, the panels correspond to L=2 µm, L=5 µm, and L=8 µm, while from top to bottom they correspond to H=2 µm, H=5 µm, and H=8 µm, respectively. The length and gap of the graphene patches are fixed at w = g=100 nm, and the unit periods along the x- and y-direction are P_x=L and P_y=W, respectively.


6. Conclusion

In summary, we have established an SMTL framework that achieves efficient inverse design of multiple nanostructures and accurate forecasting of their optical absorption. The framework relies on both a normalization mechanism that addresses the different parameter scales of various structures and a specific algorithm that captures the dependence of the optical absorption on the geometric parameters. The conventional FEM is first employed to compute the optical absorption of numerous structures, such as single graphene-Si heterostructures consisting of n×n graphene squares (n=1∼9) and the corresponding arrays, as well as periodically patterned graphene and Si cubes, in order to provide the training dataset for SMTL. On this basis, the complicated and highly nonlinear relationships between the dimensions of the graphene or Si layer in the heterostructures and the optical response are studied in depth by means of SMTL. The model is verified to offer extremely fast computation and high accuracy in the inverse design of numerous nanoparticles and heterostructures, and it can readily be extended to the on-demand design of new structures. This work represents a major breakthrough in the multi-task inverse design of heterostructures and highlights the potential usefulness of SMTL for more complicated nanostructures and nonlinear optical devices.

Funding

National Natural Science Foundation of China (62075240, 11902358, 41904167); Distinguished Young Scholar Foundation of Hunan Province (2020JJ2036).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. K. S. Novoselov, A. K. Geim, S. V. Morozov, D. Jiang, Y. Zhang, S. V. Dubonos, I. V. Grigorieva, and A. A. Firsov, “Electric field effect in atomically thin carbon films,” Science 306(5696), 666–669 (2004). [CrossRef]

2. J. W. You, J. You, M. Weismann, and N. C. Panoiu, “Double-resonant enhancement of third-harmonic generation in graphene nanostructures,” Philos Trans A Math Phys Eng Sci 375, (2017).

3. K. I. Bolotin, K. J. Sikes, Z. Jiang, M. Klima, G. Fudenberg, J. Hone, P. Kim, and H. L. Stormer, “Ultrahigh electron mobility in suspended graphene,” Solid State Commun. 146(9-10), 351–355 (2008). [CrossRef]

4. I. Meric, M. Y. Han, A. F. Young, B. Ozyilmaz, P. Kim, and K. L. Shepard, “Current saturation in zero-bandgap, top-gated graphene field-effect transistors,” Nat. Nanotechnol. 3(11), 654–659 (2008). [CrossRef]

5. H. Heo, S. Lee, and S. Kim, “Broadband absorption enhancement of monolayer graphene by prism coupling in the visible range,” Carbon 154, 42–47 (2019). [CrossRef]

6. M. Romagnoli, V. Sorianello, M. Midrio, F. H. L. Koppens, C. Huyghebaert, D. Neumaier, P. Galli, W. Templ, A. D’Errico, and A. C. Ferrari, “Graphene-based integrated photonics for next-generation datacom and telecom,” Nat. Rev. Mater. 3(10), 392–414 (2018). [CrossRef]

7. M. B. Mia, S. Z. Ahmed, I. Ahmed, Y. J. Lee, M. Qi, and S. Kim, “Exceptional coupling in photonic anisotropic metamaterials for extremely low waveguide crosstalk,” Optica 7(8), 881–887 (2020). [CrossRef]

8. J. W. You, E. Threlfall, D. F. G. Gallagher, and N. C. Panoiu, “Computational analysis of dispersive and nonlinear 2D materials by using a GS-FDTD method,” J. Opt. Soc. Am. B 35(11), 2754–2763 (2018). [CrossRef]

9. M. Liu, X. Yin, E. Ulin-Avila, B. Geng, T. Zentgraf, L. Ju, F. Wang, and X. Zhang, “A graphene-based broadband optical modulator,” Nature 474(7349), 64–67 (2011). [CrossRef]

10. V. W. Brar, M. S. Jang, M. Sherrott, J. J. Lopez, and H. A. Atwater, “Highly Confined Tunable Mid-Infrared Plasmonics in Graphene Nanoresonators,” Nano Lett. 13(6), 2541–2547 (2013). [CrossRef]

11. B. Liu, C. You, C. Zhao, G. Shen, Y. Liu, Y. Li, H. Yan, and Y. Zhang, “High responsivity and near-infrared photodetector based on graphene/MoSe 2 heterostructure,” Chin. Opt. Lett. 17(2), 020002 (2019). [CrossRef]

12. E. J. Osley, C. G. Biris, P. G. Thompson, R. R. F. Jahromi, P. A. Warburton, and N. C. Panoiu, “Fano Resonance Resulting from a Tunable Interaction between Molecular Vibrational Modes and a Double Continuum of a Plasmonic Metamolecule,” Phys. Rev. Lett. 110(8), 087402 (2013). [CrossRef]

13. D. Schurig, J. J. Mock, B. J. Justice, S. A. Cummer, J. B. Pendry, A. F. Starr, and D. R. Smith, “Metamaterial Electromagnetic Cloak at Microwave Frequencies,” Science 314(5801), 977–980 (2006). [CrossRef]

14. S. Song, Q. Chen, L. Jin, and F. Sun, “Great light absorption enhancement in a graphene photodetector integrated with a metamaterial perfect absorber,” Nanoscale 5(20), 9615–9619 (2013). [CrossRef]

15. Y. Poudel, J. Sławińska, P. Gopal, S. Seetharaman, Z. Hennighausen, S. Kar, F. D’souza, M. B. Nardelli, and A. Neogi, “Absorption and emission modulation in a MoS 2–GaN (0001) heterostructure by interface phonon–exciton coupling,” Photonics Res. 7(12), 1511–1520 (2019). [CrossRef]

16. Y. Hu, J. You, M. Tong, X. Zheng, Z. Xu, X. Cheng, and T. Jiang, “Pump-Color Selective Control of Ultrafast All-Optical Switching Dynamics in Metaphotonic Devices,” Adv. Sci. 7(14), 2000799 (2020). [CrossRef]

17. T. Decoopman, G. Tayeb, S. Enoch, D. Maystre, and B. Gralak, “Photonic Crystal Lens: From Negative Refraction and Negative Index to Negative Permittivity and Permeability,” Phys. Rev. Lett. 97(7), 073905 (2006). [CrossRef]

18. F. Qin, L. Ding, L. Zhang, F. Monticone, C. C. Chum, J. Deng, S. Mei, Y. Li, J. Teng, M. Hong, S. Zhang, A. Alù, and C. W. Qiu, “Hybrid bilayer plasmonic metasurface efficiently manipulates visible light,” Sci. Adv. 2(1), e1501168 (2016). [CrossRef]

19. Y. Zhao and A. Alù, “Manipulating light polarization with ultrathin plasmonic metasurfaces,” Phys. Rev. B 84(20), 205428 (2011). [CrossRef]

20. Q. Ren, J. W. You, and N. C. Panoiu, “Large enhancement of the effective second-order nonlinearity in graphene metasurfaces,” Phys. Rev. B 99(20), 205404 (2019). [CrossRef]

21. J. W. You and N. C. Panoiu, “Polarization control using passive and active crossed graphene gratings,” Opt. Express 26(2), 1882–1894 (2018). [CrossRef]

22. T. Jiang, K. Yin, C. Wang, J. You, H. Ouyang, R. Miao, C. Zhang, K. Wei, H. Li, and H. Chen, “Ultrafast fiber lasers mode-locked by two-dimensional materials: review and prospect,” Photonics Res. 8(1), 78–90 (2020). [CrossRef]

23. J. You, Z. Tao, Y. Luo, J. Yang, J. Zhang, X. Zheng, X. Cheng, and T. Jiang, “BER evaluation in a multi-channel graphene-silicon photonic crystal hybrid interconnect: a study of fast- and slow-light effects,” Opt. Express 28(12), 17286–17298 (2020). [CrossRef]

24. J. You, Y. Luo, J. Yang, J. Zhang, K. Yin, K. Wei, X. Zheng, and T. Jiang, “Hybrid/Integrated Silicon Photonics Based on 2D Materials in Optical Communication Nanosystems,” Laser Photon. Rev. 14(12), 2000239 (2020). [CrossRef]

25. A. Celis, M. N. Nair, A. Taleb-Ibrahimi, E. H. Conrad, C. Berger, W. A. De Heer, and A. Tejeda, “Graphene nanoribbons: fabrication, properties and devices,” J. Phys. D: Appl. Phys. 49(14), 143001 (2016). [CrossRef]

26. T. Young, D. Hazarika, S. Poria, and E. Cambria, “Recent trends in deep learning based natural language processing,” IEEE Comput. Intell. Mag. 13(3), 55–75 (2018). [CrossRef]

27. D. W. Otter, J. R. Medina, and J. K. Kalita, “A survey of the usages of deep learning for natural language processing,” IEEE Trans. Neural Networks Learn. Syst. (2020).

28. C. Tian, Y. Xu, L. Fei, and K. Yan, “Deep learning for image denoising: a survey,” in (Springer, 2018), pp. 563–572.

29. A. Voulodimos, N. Doulamis, A. Doulamis, and E. Protopapadakis, “Deep learning for computer vision: A brief review,” Comput. Intell. Neurosci. 2018, (2018).

30. H. Purwins, B. Li, T. Virtanen, J. Schlüter, S. Y. Chang, and T. Sainath, “Deep learning for audio signal processing,” IEEE J. Sel. Top. Signal Process. 13(2), 206–219 (2019). [CrossRef]

31. R. Haeb-Umbach, S. Watanabe, T. Nakatani, M. Bacchiani, B. Hoffmeister, M. L. Seltzer, H. Zen, and M. Souden, “Speech processing for digital home assistants: Combining signal processing with deep-learning techniques,” IEEE Signal Process. Mag. 36(6), 111–124 (2019). [CrossRef]

32. M. Wainberg, D. Merico, A. Delong, and B. J. Frey, “Deep learning in biomedicine,” Nat. Biotechnol. 36(9), 829–838 (2018). [CrossRef]

33. C. Cao, F. Liu, H. Tan, D. Song, W. Shu, W. Li, Y. Zhou, X. Bo, and Z. Xie, “Deep learning and its applications in biomedicine,” Genomics, Proteomics Bioinf. 16(1), 17–32 (2018). [CrossRef]

34. N. Justesen, P. Bontrager, J. Togelius, and S. Risi, “Deep learning for video game playing,” IEEE Trans. Games 12(1), 1–20 (2020). [CrossRef]

35. H. Tembine, “Deep learning meets game theory: Bregman-based algorithms for interactive deep generative adversarial networks,” IEEE Trans. Cybern. 50(3), 1132–1145 (2020). [CrossRef]

36. Z. Tao, J. You, J. Zhang, X. Zheng, H. Liu, and T. Jiang, “Optical circular dichroism engineering in chiral metamaterials utilizing a deep learning network,” Opt. Lett. 45(6), 1403–1406 (2020). [CrossRef]

37. Z. Tao, J. Zhang, J. You, H. Hao, H. Ouyang, Q. Yan, S. Du, Z. Zhao, Q. Yang, X. Zheng, and T. Jiang, “Exploiting deep learning network in optical chirality tuning and manipulation of diffractive chiral metamaterials,” Nanophotonics 9(9), 2945–2956 (2020). [CrossRef]

38. S. Du, J. You, J. Zhang, Z. Tao, H. Hao, Y. Tang, X. Zheng, and T. Jiang, “Expedited circular dichroism prediction and engineering in two-dimensional diffractive chiral metamaterials leveraging a powerful model-agnostic data enhancement algorithm,” Nanophotonics 10(3), 1155–1168 (2021). [CrossRef]

39. J. Peurifoy, Y. Shen, L. Jing, Y. Yang, F. Cano-Renteria, B. G. DeLacy, J. D. Joannopoulos, M. Tegmark, and M. Soljačić, “Nanophotonic particle simulation and inverse design using artificial neural networks,” Sci. Adv. 4(6), eaar4206 (2018). [CrossRef]

40. Y. Long, J. Ren, Y. Li, and H. Chen, “Inverse design of photonic topological state via machine learning,” Appl. Phys. Lett. 114(18), 181105 (2019). [CrossRef]

41. I. Malkiel, M. Mrejen, A. Nagler, U. Arieli, L. Wolf, and H. Suchowski, “Plasmonic nanostructure design and characterization via deep learning,” Light: Sci. Appl. 7(1), 60 (2018). [CrossRef]

42. E. Ashalley, K. Acheampong, L. V. Besteiro, P. Yu, A. Neogi, A. O. Govorov, and Z. M. Wang, “Multitask deep-learning-based design of chiral plasmonic metamaterials,” Photonics Res. 8(7), 1213–1225 (2020). [CrossRef]

43. D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design of nanophotonic structures,” ACS Photonics 5(4), 1365–1369 (2018). [CrossRef]

44. Z. Liu, D. Zhu, S. P. Rodrigues, K. T. Lee, and W. Cai, “Generative model for the inverse design of metasurfaces,” Nano Lett. 18(10), 6570–6576 (2018). [CrossRef]

45. P. R. Wiecha, A. Arbouet, C. Girard, and O. L. Muskens, “Deep learning in nano-photonics: inverse design and beyond,” Photonics Res. 9(5), B182–B200 (2021). [CrossRef]

46. N. J. Anika and M. B. Mia, “Design and analysis of guided modes in photonic waveguides using optical neural network,” Optik 228, 165785 (2021). [CrossRef]

47. W. He, T. mingyu, Z. Xu, Y. Hu, X. Cheng, and T. Jiang, “Ultrafast all-optical terahertz modulation based on inverse-designed metasurface,” Photonics Res. (2021).

48. J. Qie, E. Khoram, D. Liu, M. Zhou, and L. Gao, “Real-time deep learning design tool for far-field radiation profile,” Photonics Res. 9(4), B104–B108 (2021). [CrossRef]

49. Y. Xu, X. Zhang, Y. Fu, and Y. Liu, “Interfacing photonics with artificial intelligence: an innovative design strategy for photonic structures and devices based on artificial neural networks,” Photonics Res. 9(4), B135–B152 (2021). [CrossRef]

50. C. Liu, W. M. Yu, Q. Ma, L. Li, and T. J. Cui, “Intelligent coding metasurface holograms by physics-assisted unsupervised generative adversarial network,” Photonics Res. 9(4), B159–B167 (2021). [CrossRef]

51. Z. Zhen, C. Qian, Y. Jia, Z. Fan, R. Hao, T. Cai, B. Zheng, H. Chen, and E. Li, “Realizing transmitted metasurface cloak by a tandem neural network,” Photonics Res. 9(5), B229–B235 (2021). [CrossRef]

52. P. Dai, Y. Wang, Y. Hu, C. H. de Groot, O. Muskens, H. Duan, and R. Huang, “Accurate inverse design of Fabry–Perot-cavity-based color filters far beyond sRGB via a bidirectional artificial neural network,” Photonics Res. 9(5), B236–B246 (2021). [CrossRef]

53. S. So, T. Badloe, J. Noh, J. Rho, and J. Bravo-Abad, “Deep learning enabled inverse design in nanophotonics,” Nanophotonics 9(5), 1041–1057 (2020). [CrossRef]

54. Y. Tang, K. Kojima, T. Koike-Akino, Y. Wang, P. Wu, Y. Xie, M. H. Tahersima, D. K. Jha, K. Parsons, and M. Qi, “Generative Deep Learning Model for Inverse Design of Integrated Nanophotonic Devices,” Laser Photon. Rev. 14(12), 2000287 (2020). [CrossRef]

55. T. Zhang, J. Wang, Q. Liu, J. Zhou, J. Dai, X. Han, Y. Zhou, and K. Xu, “Efficient spectrum prediction and inverse design for plasmonic waveguide systems based on artificial neural networks,” Photonics Res. 7(3), 368–380 (2019). [CrossRef]

56. B. Han, Y. Lin, Y. Yang, N. Mao, W. Li, H. Wang, K. Yasuda, X. Wang, V. Fatemi, and L. Zhou, “Deep-Learning-Enabled Fast Optical Identification and Characterization of 2D Materials,” Adv. Mater. 32(29), 2000953 (2020). [CrossRef]

57. S. Vandenhende, S. Georgoulis, W. Van Gansbeke, M. Proesmans, D. Dai, and L. Van Gool, “Multi-Task Learning for Dense Prediction Tasks: A Survey,” arXiv Prepr. arXiv2004.13379 (2020).

58. Y. Zhang and Q. Yang, “A survey on multi-task learning,” arXiv Prepr. arXiv1707.08114 (2017).

59. K. H. Thung and C. Y. Wee, “A brief review on multi-task learning,” Multimed. Tools Appl. 77(22), 29705–29725 (2018). [CrossRef]

60. S. Ruder, “An overview of multi-task learning in deep neural networks,” arXiv Prepr. arXiv1706.05098 (2017).

61. Y. Zhang and Q. Yang, “An overview of multi-task learning,” Natl. Sci. Rev. 5(1), 30–43 (2018). [CrossRef]

62. O. Sener and V. Koltun, “Multi-task learning as multi-objective optimization,” arXiv Prepr. arXiv1810.04650 (2018).

63. D. Chandler-Horowitz and P. M. Amirtharaj, “High-accuracy, midinfrared (450 cm− 1⩽ ω⩽ 4000 cm− 1) refractive index values of silicon,” J. Appl. Phys. 97(12), 123526 (2005). [CrossRef]

64. A. Taflove and S. C. Hagness, “Computational electromagnetics: the finite-difference time-domain method,” Artech House 3, (2000).

65. E. Barkanov, “Introduction to the finite element method,” Inst. Mater. Struct. Fac. Civ. Eng. Riga Tech. Univ. 1–70 (2001).

66. M. Clemens and T. Weiland, “Discrete electromagnetism with the finite integration technique,” Prog. Electromagn. Res. 32, 65–87 (2001). [CrossRef]

67. S. Molesky, Z. Lin, A. Y. Piggott, W. Jin, J. Vucković, and A. W. Rodriguez, “Inverse design in nanophotonics,” Nat. Photonics 12(11), 659–670 (2018). [CrossRef]

68. J. V Beck, B. Blackwell, and C. R. S. Clair Jr, Inverse Heat Conduction: Ill-Posed Problems (James Beck, 1985).

69. P. C. Hansen, “Analysis of discrete ill-posed problems by means of the L-curve,” SIAM Rev. 34(4), 561–580 (1992). [CrossRef]

Networks	Classification parameter	Size parameter a	Size parameter b	Size parameter c
1	10.38	0.93	4.17	/
2	2.56	2.81	3.25	3.17
3	2.56	2.01	0.83	/

Validation sets	Mean absolute percentage error, equal weight (%)	Mean absolute percentage error, calculated weight (%)
1	0.76	0.65
2	0.37	0.33
3	0.16	0.14
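
The validation table above reports the mean absolute percentage error (MAPE) for equal and calculated task weights. The Python sketch below shows the conventional MAPE definition, which is assumed here to be the one used for the table, and computes the relative error reduction obtained with the calculated weights from the tabulated values. The function names are hypothetical and the tabulated numbers are the only data taken from the document.

```python
def mape_percent(y_true, y_pred):
    """Conventional mean absolute percentage error, in percent.

    Assumes no reference value is zero.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def relative_reduction_percent(equal_weight, calculated_weight):
    """Relative decrease of the error when moving to calculated weights."""
    return 100.0 * (equal_weight - calculated_weight) / equal_weight


# Validation set -> (MAPE with equal weight, MAPE with calculated weight), in %.
REPORTED_MAPE = {1: (0.76, 0.65), 2: (0.37, 0.33), 3: (0.16, 0.14)}

for val_set, (equal, calculated) in REPORTED_MAPE.items():
    gain = relative_reduction_percent(equal, calculated)
    print(f"validation set {val_set}: {equal:.2f}% -> {calculated:.2f}% "
          f"({gain:.1f}% relative reduction)")
```

For the tabulated values, the calculated weights lower the error by roughly 11% to 14% in relative terms on each validation set.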



Optics Express