We propose and demonstrate an approach to optimizing the Q factors of two-dimensional photonic crystal (2D-PC) nanocavities based on deep learning. We prepare a data set consisting of 1000 nanocavities generated by randomly displacing the positions of many air holes in a base nanocavity and calculate their Q factors using a first-principles method. Using this data set, we train a four-layer neural network including a convolutional layer to recognize the relationship between the air holes’ displacements and the Q factors. After the training, the neural network is able to estimate the Q factors from the air holes’ displacements with an error of 13% in standard deviation. Crucially, the trained neural network can estimate the gradient of the Q factor with respect to the air holes’ displacements very quickly using back-propagation. A nanocavity structure with an extremely high Q factor of 1.58 × 10⁹ was successfully obtained by optimizing the positions of 50 holes over ~10⁶ iterations, taking advantage of the very fast evaluation of the gradient in high-dimensional parameter spaces. The obtained Q factor is more than one order of magnitude higher than that of the base cavity and more than twice the highest Q factor reported so far for cavities with similar modal volumes. This approach can optimize 2D-PC structures over a parameter space that is unfeasibly large for previous optimization methods based solely on direct calculations. We believe that this approach is also useful for improving other optical characteristics.
© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement
Photonic nanocavities based on artificial defects in two-dimensional (2D) photonic crystal (PC) slabs have been used to realize high quality (Q) factors ranging from about one thousand, tens of thousands, hundreds of thousands, and millions [4–9] to more than ten million, together with small modal volumes (Vcav) of the order of one cubic wavelength or less. A higher Q factor increases the storage time of photons and the light-matter interaction time, while a smaller Vcav enhances the light-matter interaction strength and decreases the footprint. There have been various efforts to increase the Q factors and/or Q/V of 2D-PC slab nanocavities both in theory and in experiments [2–17]. The developed PC nanocavities have been used for various applications including ultracompact channel add/drop devices, nano-lasers, laser arrays for sensing, strongly coupled light-matter systems in solids [20,21], ultra-low-power-consumption optical bistable systems, ultracompact and low-threshold all-Si Raman lasers, and photonic buffer memories [24–26]. However, further improvement is desirable for the realization of more advanced applications.
The fundamental design principle for increasing the designed Q factor (Qdes) of 2D-PC nanocavities is well known: the wavevector components of the cavity electromagnetic field within the light cone should be decreased as much as possible to reduce radiation loss. Many design methods, including Gaussian envelope approaches [2,3], analytic inverse problem approaches [12,13], genetic algorithms [14,15], and leaky position visualization, have been proposed for obtaining a higher Qdes while keeping a small Vcav. For example, a five-step heterostructure nanocavity comprising a defect waveguide with lattice constant modulation, analytically designed to realize a Gaussian envelope function for the mode field, was reported to have a Qdes of 7 × 10⁸ and a Vcav of 1.3 cubic wavelengths in the material, (λ/n)³, with the assistance of the leaky mode visualization technique. In addition, a two-step heterostructure nanocavity with a Qdes of 1.4 × 10⁸ and a Vcav of 1.5 (λ/n)³ was reported, where the positions of the eight air holes near the center of the cavity (two parameters) were tuned using the leaky position visualization method. Recently, the L4/3 cavity, in which the positions of 22 air holes (11 parameters) were tuned by genetic algorithms [14,15], was reported to have a Qdes of 2.1 × 10⁷ and a Vcav of 0.32 (λ/n)³. Although these approaches were successful, many design degrees of freedom in PC nanocavities remain unused. They are difficult to exploit fully because of the large cost of calculating the gradient of the Q factor in a high-dimensional structural parameter space at each step of the structural optimization.
In this paper, we propose an approach to optimizing 2D-PC nanocavities based on deep learning of the relationship between the nanocavities’ structures and their Q factors. We prepare a data set consisting of 1000 different nanocavities whose air hole positions are randomly but symmetrically displaced. Their Q factors are calculated by a first-principles method, for which parallel computation can be fully utilized to reduce the computation time. Next, we use the data set to train a four-layer artificial neural network (NN) including a convolutional layer to recognize the relationship between the air hole displacement patterns and their Q factors. The trained NN is able to predict the gradient of the Q factor with respect to the air holes’ displacements far faster than the first-principles calculation. This gradient is used to optimize the displacements of many air holes (50 air holes, 27 parameters) over a large number of repetitions (>1,000,000). This optimization method yields a very high Q factor that exceeds one billion.
In general, automatic structural optimization with respect to target characteristic value(s) requires at least three steps: (a) select a set of parameters that represents the structure to be optimized, (b) calculate the gradient of the target characteristic value(s) with respect to the structural parameters, and (c) modify the structural parameters based on the calculated gradient. Steps (b) and (c) are repeated until the target value(s) saturate. It is important to select all parameters that have a strong correlation with the target characteristic(s) in step (a). In step (b), fast evaluation of the gradient is required so that the optimization can be repeated a sufficient number of times. However, it is difficult to fulfill these requirements when the structure to be optimized has many degrees of freedom and the evaluation of the gradient requires a large computation cost.
This situation applies to Q factor optimization in 2D-PC nanocavities, and we utilize a deep neural network (DNN) to meet these requirements. (Structural optimization of optical nanostructures using neural networks has been reported for multilayered films and metamaterials [29,30].) A DNN implements a complex nonlinear function that maps a fixed-size input to a fixed-size output through multiple units connected from layer to layer by linear and nonlinear operations. Because a DNN contains a large number of adjustable internal parameters (such as connection weights and biases), it can approximate various input-output relationships once the internal parameters are properly tuned using many sets of example input-output data (training data). In particular, a DNN that contains convolutional layer(s), called a convolutional neural network (CNN), is very effective for learning the spatial features of input data. Because such a CNN is effective for image processing, it is considered useful for learning the relationship between the structure of 2D-PC nanocavities and their Q factors. Once we obtain a properly trained CNN using a data set prepared by first-principles calculations, the gradient of the Q factor with respect to the structural parameters can be estimated much faster than by direct calculation. Therefore, optimization of many parameters can be repeated many times to fully exploit the potential of 2D-PC structures, whereas this is impossible using methods that are based solely on direct calculation because of the exponential increase in computational cost with increasing dimensions of the structural parameters. Our strategy is summarized as follows:
- I. Select a base cavity structure to be optimized.
- II. Select the type of structural parameter to be optimized. (e.g. air hole position, air hole size, air hole shape)
- III. Generate many 2D-PC nanocavities from the base structure by randomly fluctuating all structural parameters selected in (II) in an area much larger than the cavity field.
- IV. Calculate the Q factors of the nanocavities prepared in (III) by a first principles method, where many structures can be calculated separately in a multiple parallel fashion to reduce the computation time.
- V. Prepare an NN to learn the relationship between the structural parameters and the Q factors.
- VI. Train the NN using the relationship between the subset of the parameters selected from (III) and the Q factors calculated in (IV).
- VII. Find the best set of parameters that minimizes the error between the Q factors predicted by the NN and those calculated by the first principles method.
- VIII. Starting from an initial cavity structure, gradually change the structural parameters selected in (VII) using the gradient of the Q factor with respect to the parameters predicted by the trained NN many times until the Q factor saturates.
- IX. Check the true Q factor of the optimized structure obtained in (VIII) by the first principles calculation.
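The cost argument behind steps (b) and (c) of the general procedure can be illustrated with a minimal sketch. It assumes a forward-difference gradient, which needs one evaluation of the target per parameter plus one at the current point; the function `toy_target` is a hypothetical stand-in for an expensive first-principles calculation and is not part of the method described here.

```python
def finite_difference_gradient(f, x, h=1e-6):
    """Forward-difference gradient of f at x; returns (gradient, n_evaluations)."""
    f0 = f(x)                      # one evaluation at the current point
    n_eval = 1
    grad = []
    for k in range(len(x)):
        shifted = list(x)
        shifted[k] += h            # perturb one parameter at a time
        grad.append((f(shifted) - f0) / h)
        n_eval += 1
    return grad, n_eval


def gradient_ascent(f, x, rate=0.1, steps=100):
    """Steps (b) and (c): evaluate the gradient, then move the parameters along it."""
    total_eval = 0
    for _ in range(steps):
        grad, n_eval = finite_difference_gradient(f, x)
        total_eval += n_eval
        x = [xi + rate * gi for xi, gi in zip(x, grad)]
    return x, total_eval


# Hypothetical smooth target with its maximum at x = (1, 1, ..., 1).
def toy_target(x):
    return -sum((xi - 1.0) ** 2 for xi in x)


# 27 parameters, as in the optimization reported in this paper.
x_opt, cost = gradient_ascent(toy_target, [0.0] * 27)
print(cost)  # number of target evaluations used by 100 gradient steps
```

With 27 parameters, every gradient step in this sketch consumes 28 evaluations of the target, so 100 steps already need 2800 evaluations. If each evaluation were a full 3D simulation, repeating such a step on the order of a million times would be out of reach, which is the motivation for replacing the direct gradient with one obtained from a trained network.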
In this section, we describe the optimization of a two-step heterostructure nanocavity as an example. Fig. 1 shows the structure of the base nanocavity [step (I)], which is a two-step heterostructure nanocavity made of a silicon slab with a thickness of 220 nm. The radii of the air holes were 110 nm, and a line defect waveguide was formed by filling a row of air holes. The base lattice constant, a, was 410 nm, and the lattice constants around the center of the nanocavity were modulated by 3 nm in two steps to confine light by the mode gap effect, as shown in Fig. 1. The eight air holes shown in the figure were shifted from their original positions by an order of a/1000 through a manual tuning process based on the leaky position visualization technique. This manual tuning process increased the Q factor of the nanocavity from 50 million (before tuning) to 140 million (after tuning), while Vcav was almost unchanged [~1.5 (λ/n)³].
3.1 Preparation of data set for learning
In step (II), we selected the displacements of the air holes as the parameters to be optimized. This is because the displacement of the air holes can be implemented in the fabrication process more accurately than other parameters such as the air hole radii or shapes. The air holes’ positions can be accurately controlled in the electron beam writing process, while their radii and shapes are largely influenced by an etching process that is more difficult to control. In step (III), we added random displacements to all air holes in the x and y directions in such a way that the symmetry of the structure was maintained. We maintained the symmetry because an asymmetry of a PC cavity increases radiation loss . The induced random displacements obeyed a Gaussian distribution with a standard deviation of a/1000. This magnitude of fluctuation was determined from experience of the manual optimization mentioned above . We generated 1000 randomly-fluctuated nanocavity structures using the above procedure.
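A minimal sketch of how symmetric Gaussian displacements of this kind could be generated is given below. It assumes a simplified rectangular index grid (i, j) centered on the cavity, with mirror planes at i = 0 and j = 0; the real triangular lattice, its row offsets, and the missing holes of the line defect are ignored. The function name and grid convention are hypothetical, and displacements are expressed in units of a/1000.

```python
import random


def symmetric_displacements(n_half_x, n_half_y, sigma, seed=0):
    """Return {(i, j): (dx, dy)} with mirror symmetry about both axes."""
    rng = random.Random(seed)
    disp = {}
    for i in range(0, n_half_x + 1):
        for j in range(0, n_half_y + 1):
            # A hole lying on a mirror plane cannot move across that plane.
            dx = 0.0 if i == 0 else rng.gauss(0.0, sigma)
            dy = 0.0 if j == 0 else rng.gauss(0.0, sigma)
            # Copy the quadrant value to its mirror images with flipped signs.
            for sx in (1, -1):
                for sy in (1, -1):
                    disp[(sx * i, sy * j)] = (sx * dx, sy * dy)
    return disp


# sigma = 1.0 corresponds to a standard deviation of a/1000 in these units.
pattern = symmetric_displacements(n_half_x=6, n_half_y=2, sigma=1.0, seed=1)
print(len(pattern))  # 13 x 5 grid positions
```

Only the displacements in one quadrant are drawn independently, which is why the number of free parameters is much smaller than twice the number of holes.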
In step (IV), we calculated the electromagnetic field and Q factors of the fundamental resonant mode for the 1000 structures using the three-dimensional (3D) finite difference time domain (FDTD) method. We used sub cells with a size of for the discretization of the dielectric constant distributions. Next, the dielectric constants of the cells for FDTD calculation were determined by averaging the sub cells in each FDTD cell. The size of the FDTD cell was . Therefore, change of the order of a/4000 in dielectric constant distribution according to the air holes’ displacements can be reflected in the FDTD calculation.
A histogram of the calculated Q factors (QFDTD) is plotted in Fig. 2, and examples of the generated cavity structures are shown in Fig. 3 with their electric fields (Ey) and QFDTD. It can be seen from Fig. 2 that QFDTD is distributed over a range of almost two orders of magnitude, from ~10⁶ to more than 3 × 10⁸; however, it is mainly concentrated in the region below ~3 × 10⁷. In addition, there is a steep drop at the lower-Q side of the peak. We considered that this nonuniform distribution of QFDTD would be relatively difficult for an NN to learn, and transformed QFDTD to log10(QFDTD); as shown in Fig. 2(b), the transformed values have a more uniform distribution that resembles a Gaussian distribution. As a result, we used the relationships between the air holes’ displacement patterns and the log10(QFDTD) values of the 1000 prepared structures as the data set for the machine learning in step (VI).
3.2 Configuration of neural network
Fig. 4 shows the configuration of the neural network prepared in step (V). The input data were a set of displacement vectors of air holes [from the positions of a base cavity (I)] in an Nx (a) × Ny (rows) area around the center of the nanocavity, normalized by the unit of a/1000 and indexed by i and j, the discrete x and y coordinates of the air holes. The first layer was a convolutional layer in which the displacements in a local area of the input are summarized into one unit in the next layer by element-wise multiplication with a weight matrix of size Nfw (holes) × Nfh (rows) (called a filter) followed by summation. By iteratively shifting the application area of this operation, where the amount of shift is defined as the stride, the input is convolved with the filter and summarized into a feature map. We used 50 filters of size 3 × 5 (× 2 channels: x and y displacements) so that the second layer contained 50 different summaries (feature maps) of the input, where the strides in the x and y directions were 1 and 2, respectively.
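The filtering operation described above can be sketched in a few lines. This is an illustration only: it assumes that no padding is applied at the edges of the input (the padding is not stated in the text), and the function name and the all-ones test data are hypothetical.

```python
def conv2d_valid(x, w, stride_x, stride_y):
    """Cross-correlate x[channel][row][col] with one filter w[channel][row][col].

    No padding is applied, so the filter only visits positions where it fits
    entirely inside the input.
    """
    n_ch, n_rows, n_cols = len(x), len(x[0]), len(x[0][0])
    f_rows, f_cols = len(w[0]), len(w[0][0])
    out = []
    for r0 in range(0, n_rows - f_rows + 1, stride_y):
        row_out = []
        for c0 in range(0, n_cols - f_cols + 1, stride_x):
            s = 0.0
            for c in range(n_ch):
                for dr in range(f_rows):
                    for dc in range(f_cols):
                        # Element-wise multiplication followed by summation.
                        s += x[c][r0 + dr][c0 + dc] * w[c][dr][dc]
            row_out.append(s)
        out.append(row_out)
    return out


# Input: 2 channels (x and y displacements), Ny = 5 rows, Nx = 10 holes per row.
x = [[[1.0] * 10 for _ in range(5)] for _ in range(2)]
# One filter: 3 holes wide and 5 rows high, for both channels.
w = [[[1.0] * 3 for _ in range(5)] for _ in range(2)]
fmap = conv2d_valid(x, w, stride_x=1, stride_y=2)
print(len(fmap), len(fmap[0]))  # feature-map size: rows, columns
```

Under the no-padding assumption, a 10 × 5 input and a 3 × 5 filter give a feature map of 8 values per filter, so 50 filters would produce 400 values for the next layer. With padding, the feature maps would be larger.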
The second layer was fully connected to the third layer with 200 units through a rectified linear unit (ReLU ) and Affine transformation (multiplication with a weight matrix followed by summation with a bias vector). The third layer was fully connected to the fourth layer with 50 units through ReLU, random information selection units (Dropout ), and Affine transformation. Finally, the information in the fourth layer was summarized into the one output unit through ReLU and Affine transformation. This output unit was supposed to predict log10(QFDTD).
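The chain of operations from the feature maps to the single output can be sketched as follows. The layer widths (200, 50, 1) and the order ReLU, Dropout, Affine follow the description above; the dropout rate of 0.5, the inverted-dropout scaling, the random initial weights, and the 400-element feature vector (50 maps of 8 values, which assumes no padding for a 10 × 5 input) are assumptions made only for illustration.

```python
import random


def affine(x, weights, bias):
    """y = W x + b, with the weight matrix given as a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, xi) for xi in x]


def dropout(x, rate, rng, training):
    """Inverted dropout (assumed variant): randomly zero units while training."""
    if not training:
        return list(x)
    keep = 1.0 - rate
    return [xi / keep if rng.random() < keep else 0.0 for xi in x]


def random_layer(n_out, n_in, rng, scale=0.1):
    """Hypothetical initialization: small Gaussian weights and zero biases."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_log10_q(features, layers, rng, training=False, rate=0.5):
    """Flattened feature maps -> 200 units -> 50 units -> 1 output."""
    (w2, b2), (w3, b3), (w4, b4) = layers
    h3 = affine(relu(features), w2, b2)                          # layer 2 -> 3
    h4 = affine(dropout(relu(h3), rate, rng, training), w3, b3)  # layer 3 -> 4
    return affine(relu(h4), w4, b4)[0]                           # layer 4 -> output


rng = random.Random(0)
n_features = 50 * 8
layers = [random_layer(200, n_features, rng),
          random_layer(50, 200, rng),
          random_layer(1, 50, rng)]
features = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
out = predict_log10_q(features, layers, rng)
```

Dropout is active only during training in this sketch, so a prediction made with `training=False` is deterministic for a given set of weights.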
3.3 Training of neural network
3.3.1 Loss function
In step (VI), we trained this neural network using 900 of the 1000 samples prepared in steps (III) and (IV) and kept the remaining 100 samples as a test data set. A test data set is necessary to avoid overfitting, a situation in which an NN is so closely fitted to the training examples that it cannot predict meaningful answers for other inputs. It is important to maximize the generalization ability of an NN, that is, its ability to predict meaningful answers for new inputs that it did not see during the training. Therefore, the generalization ability of an NN should be checked at intervals using a test data set. For the training, we set the loss function L as the sum of two terms [Eq. (1)].
The first term in Eq. (1) represents the deviation from the true answer. The second term was introduced to penalize large connection weights in the network, where the summation is taken over all the weights in the network. This additional term is effective in preventing overfitting (the weight decay method), and its coefficient is the control parameter (we used 0.001). We randomly selected one pair of “displacement pattern”-“log10(QFDTD)” from the training data set, and gradually changed the internal parameters of the NN to reduce L (stochastic gradient method) based on the gradient of L with respect to the internal parameters, which was obtained using the back-propagation method. We also applied the Momentum optimization method to speed up the convergence, where the learning rate and the momentum decay rate were set to 0.001 and 0.9, respectively. The learning step was repeated until L of the test data set converged. The accuracy of the prediction was evaluated as the standard deviation of the output [= log10(QNN)] from the true answer [log10(QFDTD)] for the test data set and for the training data set. These values were further converted into more comprehensible average prediction errors for the Q values (EQ) using Eq. (2).
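The two quantities described above can be sketched as follows. The exact expressions are assumptions: the first term is taken as a squared error, the weight penalty as an L2 term with a factor of 1/2, and the conversion to a relative error as 10 to the power of the deviation in log10(Q), minus one. All function names are hypothetical.

```python
import math


def training_loss(output, log10_q_fdtd, weights, decay=0.001):
    """Squared error plus an L2 weight penalty (assumed form of the loss)."""
    error_term = (output - log10_q_fdtd) ** 2            # deviation from the true answer
    penalty = 0.5 * decay * sum(w * w for w in weights)  # weight decay term
    return error_term + penalty


def std_of_log_error(outputs, targets):
    """Root-mean-square deviation of the predictions from the true log10(Q) values."""
    n = len(outputs)
    return math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, targets)) / n)


def relative_q_error(sigma_log10):
    """Convert a deviation in log10(Q) into a fractional error in Q (assumed mapping)."""
    return 10.0 ** sigma_log10 - 1.0


# A deviation of about 0.053 in log10(Q) corresponds to roughly 13% in Q.
print(relative_q_error(math.log10(1.13)))
```

Under this assumed mapping, the 13% prediction error quoted in this paper would correspond to a deviation of about 0.053 in log10(Q), which illustrates why the error is easier to interpret after conversion.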
3.3.2 Training example
An example of a learning curve [iteration of learning (optimization) vs. prediction error] is shown in Fig. 5(a), where the input area size was (Nx, Ny) = (10, 5). It can be seen from the figure that the EQ values for the training and test data sets are initially more than 80%; however, they decrease to less than 20% after ~2000 learning iterations. It is natural that EQtest is always larger than EQtrain because the NN has never learned the test data set. Nevertheless, the minimum EQtest became as small as ~16% within 2 × 10⁵ iterations. The correlations between QFDTD and QNN for the test and training data sets (for the case with minimum EQtest) are shown in Fig. 5(b) and 5(c), respectively. Good correlation (with a correlation coefficient of 0.92) was obtained even for the test data set. This result demonstrates that generalization ability was successfully achieved, at least within the parameter space in which the prepared structures were distributed (dimensions: 10 × 5 × 2, range of each displacement: ~±a/1000). We noticed that the correlation is better in the lower-Q region, and that the deviation from QFDTD increases in the higher-Q region. This is because the number of training samples is much smaller in the higher-Q region than in the lower-Q region, as shown in Fig. 2.
3.3.3 Dependence on input area size
Next, we trained the NN while changing the input area size (Nx, Ny), and plotted the minimum EQtest (obtained during 2 × 10⁵ iterations of the learning steps) as a function of Nx and Ny in Fig. 6. It can be observed from the figure that the prediction error is higher than 60% when (Nx, Ny) is as small as (2, 5). The prediction error decreases as the input area size increases, and the case with (Nx, Ny) = (13, 5) shows the minimum prediction error of ~13%, where the correlation coefficients between QFDTD and QNN for the test and training data sets were 0.96 and 0.99, respectively. However, the error increases again for larger input area sizes. We consider that the learning process was disturbed when air hole displacements that have small correlations with QFDTD were input to the NN, because they acted as noise for the learning process. This provides a hint: Fig. 6 allows us to pick out the structural parameters that have strong correlations with the target value, that is, the parameters that are effective in the optimization process. In this case, we decided to optimize in step (VIII) the displacements of the air holes in the input area with (Nx, Ny) = (13, 5), which has 27 parameters in total. It can be seen from the lower-right inset of Fig. 6 that this area corresponds to the area where the electric field of the cavity is mainly distributed. Therefore, in future optimizations, step (VII) can be omitted by simply setting the input area to the main area of the cavity field distribution.
3.4 Structural optimization by trained neural network
3.4.1 Loss function
In step (VIII), we performed a structural optimization of the nanocavity using the gradient method, where the gradient was calculated by the trained NN. More precisely, we took advantage of the error back-propagation method to enable a high-speed calculation of the gradient. The back-propagation method is extremely effective for calculating the gradient of the loss with respect to the internal parameters of an NN, and was already used in the training process (VI). This method can also be used to rapidly obtain the gradient of the loss with respect to the input parameters (i.e., the air holes’ displacements). We defined the loss L as the deviation of the predicted log10(QNN) from a target value, and calculated the gradient of L with respect to the displacements using the same framework used for the training process, where the target Q was set to a very high constant value (10¹⁰). We also added an artificial loss to penalize large displacements, giving the total loss L’ [Eq. (3)], to constrain the air holes from moving far away from the parameter space that the NN learned in step (VI), where the prediction error was small. The structure in Fig. 1 is the origin of the displacements because we prepared the learning data set by adding random displacements to this structure. Then, we changed the structure step by step to reduce the loss L’ based on the Momentum method.
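The optimization loop described above can be sketched with a toy model. Everything specific here is an assumption for illustration: the surrogate `toy_log10_q` stands in for the trained NN, its analytic gradient stands in for back-propagation to the inputs, the loss is taken as a squared deviation from log10 of the target plus a quadratic displacement penalty with a factor of 1/2, and the Momentum update uses a common textbook form.

```python
C_OPT = [1.0, -0.5, 0.3]   # hypothetical optimum of the toy surrogate
LOG10_Q_TARGET = 10.0      # target Q of 10^10, as in the text


def toy_log10_q(d):
    """Stand-in for the trained network: a smooth surrogate of log10(Q)."""
    return 9.0 - sum((dk - ck) ** 2 for dk, ck in zip(d, C_OPT))


def toy_grad_log10_q(d):
    """Analytic input gradient, standing in for back-propagation to the inputs."""
    return [-2.0 * (dk - ck) for dk, ck in zip(d, C_OPT)]


def optimize(d, penalty, rate=0.01, momentum=0.9, steps=2000):
    """Momentum descent on L' = (log10 Q - target)^2 + 0.5 * penalty * |d|^2."""
    v = [0.0] * len(d)
    for _ in range(steps):
        miss = toy_log10_q(d) - LOG10_Q_TARGET
        g_q = toy_grad_log10_q(d)
        # Chain rule for the first term, plus the gradient of the penalty.
        grad = [2.0 * miss * gk + penalty * dk for gk, dk in zip(g_q, d)]
        v = [momentum * vk - rate * gk for vk, gk in zip(v, grad)]
        d = [dk + vk for dk, vk in zip(d, v)]
    return d


start = [0.0, 0.0, 0.0]                 # zero displacement: the base structure
d_loose = optimize(start, penalty=0.01)
d_tight = optimize(start, penalty=1.0)
print(toy_log10_q(start), toy_log10_q(d_loose), toy_log10_q(d_tight))
```

In this toy model, a larger penalty coefficient keeps the optimized displacements closer to the origin, at the price of a lower surrogate value. In the actual method the trade-off is different, because the NN prediction itself becomes less reliable far from the origin.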
3.4.2 Optimization curve
In Section 3.3.3, we decided to use the NN with an input area size of (Nx, Ny) = (13, 5). Consequently, only the displacements of air holes in this area were recognized by the NN and modified in the optimization step. Fig. 7(a) shows the evolution of QNN during the optimization for various values of the displacement penalty coefficient from 0.01 to 1, where the initial structure was set to the structure that had the highest QFDTD [= 3.8 × 10⁸, Fig. 3(a)] among the 1000 randomly prepared structures in steps (III) and (IV). The displacements of the air holes for this initial structure recognized by the NN are shown in Fig. 7(b). We also optimized the structure using , however, the obtained result was identical to the case with . It is seen in the figure that QNN increased from the original value after optimization for the smaller penalty coefficients and slightly decreased from the original value for the larger ones. We consider that this is because the high additional loss inhibited large deformation from the original structure (Fig. 1). We also performed the optimization process with a completely different randomly created initial structure [denoted as fluc46398753, Fig. 7(c)] with . The result is plotted in Fig. 7(a) as the brown solid line. The initial QNN of this structure is as low as 1 × 10⁷; however, it increased to 4.84 × 10⁸ after optimization. The final QNN is almost the same as the case started from the structure in Fig. 3(a) with , and the obtained structures were also almost identical [see Fig. 8(b) and 8(f)]. Therefore, the selection of the initial structure is not very important.
3.4.3 Validation by FDTD
In step (IX), we calculated the Q factors for the structures obtained in step (VIII) using the 3D-FDTD method. Here, the displacements of air holes outside the NN input area were set to zero because these air holes have a small correlation with the Q factor, as discussed in Section 3.3.3. The results are summarized in Fig. 8, where the structure after optimization, QFDTD, QNN, the electric field distribution (Ey), and the cavity modal volume (Vcav) are shown. (Precise displacement values of the air holes are provided in Dataset 1.) It can be seen from the figure that an extremely high QFDTD of 1.58 × 10⁹ was obtained with a displacement penalty coefficient of 0.1 [Fig. 8(c)]. This QFDTD is one order of magnitude higher than that of the manually optimized two-step heterostructure nanocavity (QFDTD = 1.37 × 10⁸, Fig. 1), and more than twice the highest QFDTD reported so far for a 2D-PC nanocavity (QFDTD = 7 × 10⁸). The successful achievement of such an extremely high QFDTD demonstrates the effectiveness of the proposed optimization method, in which the large number of degrees of freedom of the 2D-PC structure was effectively utilized, as can be seen from a comparison between Fig. 1 and Fig. 8.
4.1 Discussion of results
It can be observed from Fig. 8(c) that QNN is less than 1/3 of QFDTD. This can be understood from the response of the trained NN shown in Fig. 5(b) and 5(c), where QNN tends to be lower than QFDTD as QFDTD increases. As discussed before, samples with QFDTD > 1 × 10⁸ are rare (40 samples among 1000), and there are many samples with lower QFDTD (Fig. 2), so that the prediction tends to be low for high-QFDTD structures. Nevertheless, the fact that the structure optimized by this method has a larger QFDTD than the initial structure indicates that the direction of the gradient of QFDTD with respect to the air holes’ displacements was properly evaluated by the trained NN.
It is also interesting that the highest QFDTD was achieved by constraining the magnitude of the air holes’ displacements to some extent by increasing the coefficient of the displacement penalty in the loss function L’ [Eq. (3)]. In Fig. 8(a) to 8(c), QFDTD increases from 4.48 × 10⁸ to 1.58 × 10⁹ as this coefficient increases from 0.01 to 0.1, and the corresponding air hole displacements decrease. The structure in Fig. 8(c) shows a much larger QFDTD than that of Fig. 8(a) because the accuracy of the Q prediction becomes lower as the displacements of the air holes move away from the center of the parameter space that the NN has learned (i.e., zero displacement). For coefficients > 0.5, the magnitudes of the displacements are too constrained to obtain the highest QFDTD; however, QFDTD values > 1.39 × 10⁹ were still realized.
We also compare Fig. 8(b) and 8(f). Although the initial structures for these two cases are completely different (as can be seen in Fig. 7(b) and 7(c)), the final optimized structures and their QFDTD’s are almost the same. This result indicates that the structures obtained using this method are globally optimized at least within and near the parameter space that the NN learned.
4.2 Comparison with other optimization methods
(A) Optimization based on a genetic algorithm prepares many randomly generated cavities (individuals) and selects the better individuals to produce the individuals of the next generation through natural selection, cross-over, and random mutation; this is repeated until the Q factor converges. This method can optimize cavities automatically; however, it requires the calculation of a relatively large number of cavities. It has been reported that a few tens of generations with 80 individuals (i.e., calculation of 1600~2400 patterns of cavities) were required to optimize 3 parameters (shifts of 3 air holes) in the L3 cavity. The requirement increases as the number of parameters to be optimized increases. To optimize 5 parameters in the L3 cavity, where a QFDTD of 4.2 million with Vcav = 0.95 (λ/n)³ has been achieved, 100 generations were required (i.e., calculation of ~8000 cavities). To optimize 7 parameters in the H0 cavity, where a QFDTD of 8.3 million with Vcav = 0.64 (λ/n)³ and a QFDTD of 1.66 million with Vcav = 0.34 (λ/n)³ have been achieved, 300 generations with 120 individuals in each generation were required (i.e., calculation of 36000 cavities).
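The sample counts quoted above follow from multiplying the number of generations by the number of individuals per generation. The short sketch below reproduces this arithmetic; reading "a few tens of generations" as 20 to 30 and taking 80 individuals for the 5-parameter case are assumptions inferred from the quoted totals.

```python
def ga_evaluations(generations, individuals):
    """Cavity calculations needed when every individual is evaluated each generation."""
    return generations * individuals


# "A few tens" of generations is read here as 20 to 30 (assumption).
l3_three_params = (ga_evaluations(20, 80), ga_evaluations(30, 80))
# 80 individuals per generation is assumed from the quoted total of ~8000.
l3_five_params = ga_evaluations(100, 80)
h0_seven_params = ga_evaluations(300, 120)
deep_learning_samples = 1000   # size of the data set used in this paper

print(l3_three_params, l3_five_params, h0_seven_params)
print(h0_seven_params / deep_learning_samples)
```

The 7-parameter genetic optimization therefore used 36 times as many cavity calculations as the 1000-sample data set prepared in this paper, which was used to optimize 27 parameters.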
(B) Optimization based on leaky component visualization utilizes Fourier transformation of a cavity mode field, followed by clipping of the components within the light cone and inverse Fourier transformation, to visualize the real-space region where out-of-plane loss occurs. The air holes located in the leaky region are manually tuned by scanning the parameters (e.g., positions or radii) and calculating the Q factor to obtain local maxima. This procedure is repeated until the Q factor converges. This method requires fewer cavity patterns to optimize the structure. In our previous work, we repeated the procedure 8 times and calculated 10~20 cavities in each step to optimize the L3 cavity (i.e., calculation of < 200 cavities), where a QFDTD of 5.02 million with Vcav = 0.75 (λ/n)³ was achieved by tuning 9 parameters. To optimize the H0 cavity, the procedure was repeated four times (i.e., calculation of < 100 cavities), where a QFDTD of 1.67 million with Vcav = 0.31 (λ/n)³ was achieved by tuning five parameters. Method (B) requires far fewer samples for the optimization because the most important region is determined and optimized in each step. However, this method needs manual tuning because both the parameter(s) to be optimized and the scanning range are selected manually. In addition, there is a risk that the optimization becomes local because only one or two parameters are tuned in each step.
In comparison to these two methods, the proposed method based on deep learning requires a relatively small number of sample cavities (1000) to optimize a large number of parameters (27 parameters), and the optimization is performed automatically. (The learning and optimization processes consume much less time (< 2 hours) than the FDTD calculation of the 1000 sample cavities.) Because the features of the cavities were recognized by a layered network, not only spatially local features but also spatially global features that relate to the Q factors were separately learned in each layer of the network, and these were then used to optimize the cavity structure very efficiently. The recognition of cavity structures by deep layered networks is the key to this optimization method, and it can be applied to more general structures, even though the target cavity optimized in this paper is different from those discussed above. However, as discussed in Section 4.1, our approach works well only in the vicinity of the parameter space where the learning data set was prepared. When the structure needs to be modified significantly, a two-step approach (rough and fine data sets) is considered effective. In such cases, the number of sample cavities required for optimization will increase.
We have proposed and demonstrated a novel approach for optimizing 2D-PC nanocavities based on deep learning of the relationship between nanocavities’ structures and their Q factors. We have successfully trained a neural network consisting of a convolutional layer and three fully connected layers using 1000 randomly generated nanocavities and their Q factors. After the training, the convolutional neural network was able to predict Q factors from the displacement patterns of air holes with an error of 13% in standard deviation. Structural optimization was performed by estimating the gradient of Q with respect to the displacements of the air holes using the trained neural network. We successfully obtained a nanocavity structure with an extremely high theoretical Q factor of 1.58 × 10⁹, which is 10 times larger than that of the manually optimized base structure and more than twice the highest Q factor ever reported for 2D-PC cavities with similar modal volumes. We attribute this unprecedentedly high Q factor to the ability of our method to optimize the nanocavity over a parameter space that is unfeasibly large for previous methods based solely on direct calculations. We believe that this approach is effective for the optimization of various types of 2D-PC nanocavity structures, not only for increasing Q factors but also for improving other target characteristics.
JSPS KAKENHI (15H03993); New Energy and Industrial Technology Development Organization (NEDO).
The authors would like to thank Mr. Koki Saito for his helpful textbook on deep learning written in Japanese (Deep learning from scratch, O’Reilly Japan).
3. B. S. Song, S. Noda, T. Asano, and Y. Akahane, “Ultra-high-Q photonic double-heterostructure nanocavity,” Nat. Mater. 4(3), 207–210 (2005).
5. E. Kuramochi, M. Notomi, S. Mitsugi, A. Shinya, T. Tanabe, and T. Watanabe, “Ultrahigh-Q photonic crystal nanocavities realized by the local width modulation of a line defect,” Appl. Phys. Lett. 88(4), 041112 (2006).
7. E. Kuramochi, H. Taniyama, T. Tanabe, A. Shinya, and M. Notomi, “Ultrahigh-Q two-dimensional photonic crystal slab nanocavities in very thin barriers,” Appl. Phys. Lett. 93(11), 111112 (2008).
8. Z. Han, X. Checoury, D. Néel, S. David, M. El Kurdi, and P. Boucaud, “Optimized design for 2×10⁶ ultra-high Q silicon photonic crystal cavities,” Opt. Commun. 283(21), 4387–4391 (2010).
13. Y. Tanaka, T. Asano, and S. Noda, “Design of photonic crystal nanocavity with Q-Factor of ~10⁹,” J. Lightwave Technol. 26(11), 1532–1539 (2008).
14. Y. Lai, S. Pirotta, G. Urbinati, D. Gerace, M. Minkov, V. Savona, A. Badolato, and M. Galli, “Genetically designed L3 photonic crystal nanocavities with measured quality factor exceeding one million,” Appl. Phys. Lett. 104(24), 241101 (2014).
16. T. Nakamura, Y. Takahashi, Y. Tanaka, T. Asano, and S. Noda, “Improvement in the quality factors for photonic crystal nanocavities via visualization of the leaky components,” Opt. Express 24(9), 9541–9549 (2016).
17. M. Minkov, V. Savona, and D. Gerace, “Photonic crystal slab cavity simultaneously optimized for ultra-high Q / V and vertical radiation coupling,” Appl. Phys. Lett. 111(13), 131104 (2017).
18. M. Nomura, N. Kumagai, S. Iwamoto, Y. Ota, and Y. Arakawa, “Laser oscillation in a strongly coupled single quantum dot-nanocavity system,” Nat. Phys. 6(4), 279–283 (2010).
20. T. Yoshie, A. Scherer, J. Hendrickson, G. Khitrova, H. M. Gibbs, G. Rupper, C. Ell, O. B. Shchekin, and D. G. Deppe, “Vacuum Rabi splitting with a single quantum dot in a photonic crystal nanocavity,” Nature 432(7014), 200–203 (2004).
22. K. Nozaki, A. Shinya, S. Matsuo, Y. Suzaki, T. Segawa, T. Sato, Y. Kawaguchi, R. Takahashi, and M. Notomi, “Ultralow-power all-optical RAM based on nanocavities,” Nat. Photonics 6(4), 248–252 (2012).
25. Y. Sato, Y. Tanaka, J. Upham, Y. Takahashi, T. Asano, and S. Noda, “Strong coupling between distant photonic nanocavities and its dynamic control,” Nat. Photonics 6(1), 56–61 (2012).
28. D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design of nanophotonic structures,” ACS Photonics 5(4), 1365–1369 (2018).
30. S. Inampudi and H. Mosallaei, “Neural network based design of metagratings,” Appl. Phys. Lett. 112(24), 241102 (2018).
31. Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Handwritten digit recognition with a back-propagation network,” in Proceedings of Advances in Neural Information Processing Systems, pp. 396–404 (1990).
32. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proceedings of Advances in Neural Information Processing Systems, pp. 1097–1105 (2012).
33. X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in Proceedings of Artificial Intelligence and Statistics, pp. 315–323 (2011).
34. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” J. Mach. Learn. Res. 15, 1929–1958 (2014).
35. A. Krogh and J. A. Hertz, “A simple weight decay can improve generalization,” in Proceedings of Advances in Neural Information Processing Systems, pp. 950–957 (1991).
36. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature 323(6088), 533–536 (1986).
37. B. T. Polyak, “Some methods of speeding up the convergence of iteration methods,” USSR Comput. Math. Math. Phys. 4(5), 791–803 (1964).
38. Data for the precise displacement values of air holes are provided: https://doi.org/10.6084/m9.figshare.7223222.