Self-adaptive deep reinforcement learning for THz beamforming with silicon metasurfaces in 6G communications

Yi Ji Tan; Changyan Zhu; Thomas Caiwei Tan; Abhishek Kumar; Liang Jie Wong; Yidong Chong; Ranjan Singh

doi:10.1364/OE.458823

1. Introduction

The exponential increase in demand for high bandwidth applications [1], such as autonomous driving and enhanced mobile broadband [2,3], has driven efforts to realize 6G terabits per second wireless communications [4–6]. The terahertz band is a leading candidate for 6G wireless communications because its wide bandwidth (from 0.1 to 10 THz) can allow for exceptionally high data rates. A crucial consideration for terahertz wireless communications is spectral efficiency [7–9], which quantifies the data transmission rate for a given number of users over a given bandwidth [10]. Achieving high spectral efficiency is necessary to counteract free-space path loss and atmospheric attenuation [11,12], as well as the limited conversion efficiency of transmitters at terahertz frequencies [13,14]. One way to increase spectral efficiency is to use beamforming in MIMO systems to exploit multipath propagation, simultaneously directing multiple radiation beams towards spatially separated users [15–17]. However, the implementation of beamforming circuitry in antenna arrays is difficult at terahertz frequencies for several reasons: the need for sub-wavelength inter-element spacing to avoid grating lobes in the radiation pattern, the large number of antennas required to compensate for power losses [18], and high power density and heat dissipation requirements in densely packed antenna arrays. Furthermore, the complex and fast control of phased arrays required for terahertz beamforming is yet to be realized with existing digital circuitry [19].

Metasurfaces, which are two-dimensional (2D) metamaterials of sub-wavelength thickness, are emerging as a promising alternative to phased array antennas for terahertz beamforming [20–22]. Current fabrication technologies allow for metasurfaces with sub-micron features, satisfying the requirement of sub-wavelength inter-element spacing to achieve total wavefront manipulation at terahertz frequencies [23]. In addition, dielectric metasurfaces can manipulate terahertz wavefronts with low loss [24], potentially circumventing the issues of power loss and heat dissipation associated with phased array antennas. Recently, the arbitrary manipulation of terahertz wavefronts has been experimentally demonstrated using reconfigurable and reprogrammable metasurfaces [25,26]. Iterative radiation pattern optimization techniques, such as the adjoint method and particle swarm optimization [27,28], are unsuitable for the real-time control of metasurfaces for spatial phase modulation, since the optimized phase profiles may be invalid by the time the algorithm completes. On the other hand, deep learning algorithms based on neural networks (NNs) can be used for such dynamic control of metasurfaces to alter the radiation patterns in real-time. To date, NN-based deep learning models have been used to predict realistic radiation patterns produced by metasurfaces [29,30], and NN-based predictions of phase profiles for target intensity patterns have been demonstrated for one-dimensional (1D) arrays [31,32]. NN models for predicting the phase profile from 2D intensity patterns have also been demonstrated [33,34]. However, these NN models are based on supervised learning and have yet to exhibit the ability to generalize phase profile predictions to more complex input intensity patterns (i.e., patterns with more primary radiation lobes).
This is mainly due to the predetermined training data set used in these models, where each example consists of an input intensity pattern paired with an output phase profile. Furthermore, different phase profiles may produce nearly identical intensity patterns, and such similarity in the training data is difficult to model using a single NN trained on input-output data pairs. A tandem NN model, which trains an inverse NN based on the output of a forward NN, helps overcome this fundamental issue [35,36].

Here, we propose an intelligent real-time and self-adaptive beamforming scheme based on deep reinforcement learning. Our deep learning model uses a fully connected NN to predict the phase profile required to beamform desired intensity patterns and is suitable for implementing a multi-user MIMO system that can respond dynamically to real-time changes in user locations. It employs automatic differentiation to calculate the error gradients and adaptively optimize the NN parameters by minimizing the difference between input and predicted intensity patterns [37,38], of which the latter is calculated from the output phase profiles. Since the model is trained by comparing input and predicted intensity patterns, the training data does not require input-intensity and output-phase pairs, a departure from previous NN-based beamforming schemes [31–34]. Instead, training of the NN is performed in real-time using mini-batch data taken from an ever-growing database of intensity patterns based on the users’ location history. As such, the prediction accuracy of the NN improves in a self-adaptive manner while being employed for real-time beamforming without the need for predetermined data. Similar to the training of an inverse network using a pre-trained forward network in a tandem NN model [35,36], our NN is trained to predict phase profiles from input intensity patterns. However, our deep learning model uses the phase-to-intensity forward function to train the NN, thus avoiding any limitations imposed by a pre-trained forward network (i.e., the forward prediction capability). Moreover, we show that the trained NN is capable of generalization. After being trained using limited data designed for a maximum of three simultaneous users, it accurately performs beamforming for five randomly positioned and spatially separated users. 
As a proof of concept, we design silicon metasurfaces with the aid of our deep learning model and use them to experimentally realize the 2D beamforming of terahertz radiation towards multiple spatially separated users. These results demonstrate the utility of our deep learning model for real-time and self-adaptive beamforming, paving the way for intelligent metasurfaces in terahertz communication, imaging, and sensing applications.

2. Results

2.1 Deep learning model for intelligent real-time terahertz beamforming

Figure 1 shows our deep learning scheme for real-time self-adaptive terahertz beamforming. Based on the angular positions of a group of multiple spatially separated users, we define the desired intensity pattern consisting of a superposition of beam profiles directed toward each user (see Methods). New intensity patterns are defined and added to a database as the users’ locations change in real-time. The fully connected NN receives, at its input layer, the current desired intensity pattern as well as a mini-batch of intensity patterns randomly drawn from the database. The output of the NN is a set of phase profiles, one for each intensity pattern in the input. This design allows the self-adaptive training of the NN and the simultaneous phase profile prediction for beamforming. The NN parameters are updated using the adaptive moment (Adam) estimation algorithm [39], with error gradients computed via automatic differentiation.

Fig. 1. Self-adaptive deep reinforcement learning model for intelligent real-time terahertz beamforming based on the prediction of phase profile for metasurfaces (see Visualization 1). A fully connected neural network (NN) outputs the spatial phase profile required to beamform a terahertz wave for transmission to multiple spatially separated users. New target intensity patterns are continuously added to a database from which mini-batch data is used to train the NN. The prediction error is defined as the difference between the target intensity and predicted intensity patterns calculated from the output phase profiles. The error gradient is computed via automatic differentiation, which is used to adaptively optimize the NN parameters. The inset (top right) is an image of the fabricated silicon metasurface designed with the aid of our deep learning model, taken using a scanning electron microscope.

The prediction error is defined as the weighted mean squared difference between the input intensity patterns (i.e., the current and mini-batch patterns) and the predicted intensity patterns calculated from the output phase profiles. The output phase profile for the current desired intensity pattern can then be sent to a reprogrammable metasurface to spatially modulate the wavefront of a terahertz wave. Although we focus on terahertz beamforming, a similar scheme may be applied to other frequency regimes such as radio waves and visible light, limited mainly by the scalability of metasurfaces.
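
The weighted error described above can be sketched in a few lines of Python. The exact weights are defined in the Methods and are not reproduced here, so the choice $w = 1/\max({\cos ^2}\theta ,\epsilon )$ below is only an assumption that follows the stated intent of counteracting the ${\cos ^2}\theta $ envelope. The function name `weighted_mse` and the clipping parameter `eps` are hypothetical.

```python
import math


def weighted_mse(target, predicted, thetas, eps=1e-3):
    """Weighted mean squared difference between two sampled intensity patterns.

    target, predicted: intensity samples at the same angular grid points.
    thetas: polar angle (radians) of each sample.
    ASSUMPTION: the weights 1 / max(cos^2(theta), eps) are an illustrative
    choice that boosts samples at large polar angles; they are not the
    weights used in the paper.
    """
    weights = [1.0 / max(math.cos(t) ** 2, eps) for t in thetas]
    numerator = sum(w * (a - b) ** 2
                    for w, a, b in zip(weights, target, predicted))
    return numerator / sum(weights)
```

With this choice, the same intensity mismatch is penalized more strongly at a large polar angle than at broadside, which is the qualitative behaviour the text describes.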

The radiation pattern produced by an array of M by N isotropic radiating elements (e.g., a metasurface) with amplitude ${A_{mn}}$ and phase ${\varphi _{mn}}$ is given by the nonlinear array factor

$$AF(\mathbf{r}) = \sum_{m}^{M} \sum_{n}^{N} A_{mn} \exp\left[ i\left( \mathbf{k} \cdot \mathbf{r}_{mn} - \varphi_{mn} \right) \right], \tag{1}$$

where ${\textbf k}$ is the wavevector and ${{\textbf r}_{mn}}$ is the position of each radiating element. We assume constant amplitude modulation (${A_{mn}} = 1$), and model the radiation pattern of each radiating element with a ${\cos ^2}\theta $ intensity distribution, where $({r,\theta ,\phi } )$ denotes spherical coordinates with the principal axis perpendicular to the metasurface. The resulting intensity pattern is

$$I(\theta, \phi) = \cos^2\theta \left| AF(\mathbf{r}) \right|^2 = \cos^2\theta \left| \sum_{m}^{M} \sum_{n}^{N} \exp\left\{ i\left[ k\sin\theta \left( x_{mn}\cos\phi + y_{mn}\sin\phi \right) - \varphi_{mn} \right] \right\} \right|^2, \tag{2}$$

where $k = |{\textbf k} |$ and $({{x_{mn}},\; {y_{mn}}} )$ are the in-plane Cartesian coordinates of ${{\textbf r}_{mn}}$. We define the target intensity pattern as a superposition of single-beam intensity profiles, given by

$$I_0 = \frac{1}{U} \sum_{u}^{U} \cos^2\theta \left| \mathrm{sinc}\left[ \frac{kN\Delta x}{2} \left( \sin\theta\cos\phi - \sin\theta_u\cos\phi_u \right) \right] \mathrm{sinc}\left[ \frac{kM\Delta y}{2} \left( \sin\theta\sin\phi - \sin\theta_u\sin\phi_u \right) \right] \right|^2, \tag{3}$$

where $\mathrm{\Delta }x$ and $\mathrm{\Delta }y$ are the spacings between the radiating elements, and U is the total number of spatially separated users located at angular positions $({{\theta_u},{\phi_u}} )$. Each expression in the summation in Eq. (3) is a radiation pattern intended for transmitting towards a single user, calculated from a linear phase profile (see Methods). Although Eq. (2) allows us to calculate the intensity pattern for any given phase profile, it is non-trivial to achieve the reverse, i.e., finding a spatial phase profile that produces the desired intensity pattern given by Eq. (3). However, NNs can help solve such complex and nonlinear inverse problems. The NN shown in Fig. 1 consists of artificial neurons fully connected from one layer to the next. Each artificial neuron in the hidden layers and output layer takes the weighted sum of the outputs of all artificial neurons in the previous layer and passes it through a nonlinear activation function. The information propagation in a fully connected NN is performed via matrix-vector multiplication operations with a time complexity of $O({{n^2}} )$. By comparison, the Gerchberg-Saxton (GS) algorithm, an iterative phase retrieval algorithm commonly used in designing holographic metasurfaces [40], requires the discrete spherical Fourier transform (DSFT) for computing wavefront propagations in spherical coordinates, with a time complexity of $O({{n^2}{{({\log n} )}^2}} )$ [41]. A fully connected NN can predict phase profiles in a single forward pass within milliseconds, making it suitable for real-time beamforming.
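
As a numerical illustration of Eqs. (2) and (3), the following Python sketch evaluates the intensity produced by a linear phase profile and compares it with the sinc-based target for a single user. It is a minimal sketch under stated assumptions rather than the code used in this work: the element coordinates are assumed to be centred on the array, Eq. (2) is divided by $(MN)^2$ so that a broadside beam peaks at 1, and all function names are hypothetical.

```python
import cmath
import math

WAVELENGTH_UM = 300.0  # free-space wavelength at 1.0 THz
SPACING_UM = 75.0      # quarter-wavelength inter-element spacing
N_SIDE = 32            # 32 x 32 radiating elements
K = 2.0 * math.pi / WAVELENGTH_UM


def element_positions(n_side=N_SIDE, spacing=SPACING_UM):
    """Coordinates (x_mn, y_mn); centring on the array is an assumption."""
    offset = (n_side - 1) / 2.0
    return [((n - offset) * spacing, (m - offset) * spacing)
            for m in range(n_side) for n in range(n_side)]


def linear_phase(positions, theta_u, phi_u):
    """Linear phase profile steering one beam to (theta_u, phi_u), in radians."""
    sx = math.sin(theta_u) * math.cos(phi_u)
    sy = math.sin(theta_u) * math.sin(phi_u)
    return [K * (x * sx + y * sy) for x, y in positions]


def intensity(positions, phases, theta, phi):
    """Eq. (2) with A_mn = 1, normalized by (MN)^2 (normalization assumed)."""
    sx = math.sin(theta) * math.cos(phi)
    sy = math.sin(theta) * math.sin(phi)
    af = sum(cmath.exp(1j * (K * (x * sx + y * sy) - p))
             for (x, y), p in zip(positions, phases))
    return math.cos(theta) ** 2 * abs(af) ** 2 / len(positions) ** 2


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity handled."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x


def target_intensity(users, theta, phi, n_side=N_SIDE, spacing=SPACING_UM):
    """Eq. (3): average of single-beam sinc^2 profiles, one per user."""
    half = K * n_side * spacing / 2.0
    total = 0.0
    for theta_u, phi_u in users:
        dx = (math.sin(theta) * math.cos(phi)
              - math.sin(theta_u) * math.cos(phi_u))
        dy = (math.sin(theta) * math.sin(phi)
              - math.sin(theta_u) * math.sin(phi_u))
        total += (sinc(half * dx) * sinc(half * dy)) ** 2
    return math.cos(theta) ** 2 * total / len(users)
```

For one user, `intensity(pos, linear_phase(pos, theta_u, phi_u), theta, phi)` and `target_intensity([(theta_u, phi_u)], theta, phi)` agree exactly at the beam centre and closely within the main lobe. For several users no single linear phase profile reproduces Eq. (3), which is the inverse problem the NN is trained to solve.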

2.2 Neural network training and predictions

We train a fully connected NN with 2 hidden layers (16384×4096×4096×1024 nodes) using intensity patterns calculated from Eq. (3) for a maximum number of three simultaneous users. The intensity patterns are calculated for an array of 32×32 radiating elements, operating at 1.0 THz frequency with 75 µm inter-element spacing. The users’ angular positions for each intensity pattern are randomly chosen from a uniform distribution within the angular range $\theta \in $ [0°, +60°] and $\phi \in $ [−180°, +180°]. The NN is trained according to the deep learning model in Fig. 1 for 10⁶ iterations using a batch of 256 input intensity patterns per iteration, which consists of 128 new intensity patterns generated every iteration and a mini-batch of 128 intensity patterns randomly selected from the database. New intensity patterns are continuously added to the database, and the NN weights and biases are updated as the prediction accuracy improves. In the first iteration, the NN is trained using 128 newly generated intensity patterns which form the initial database. We define the prediction error as the weighted mean squared difference between the input intensity and predicted intensity patterns (see Methods), with the weights designed to counteract the ${\cos ^2}\theta $ intensity distribution. Automatic differentiation is employed to calculate the error gradient for the NN weights and biases, which we use to update the NN using the Adam optimizer.
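
The batch construction described above can be sketched as follows. This is a simplified illustration, not the training code used in this work: each intensity pattern is represented only by the user positions that define it through Eq. (3), the number of users per pattern is assumed to be drawn uniformly from 1 to 3, and the names `random_users` and `next_batch` are hypothetical.

```python
import random

NEW_PER_ITERATION = 128     # new patterns generated every iteration
REPLAY_PER_ITERATION = 128  # patterns drawn from the database
MAX_USERS = 3               # training data covers at most three users


def random_users(rng):
    """User angles in degrees: theta in [0, 60], phi in [-180, 180]."""
    # ASSUMPTION: the number of users is uniform in 1..MAX_USERS.
    count = rng.randint(1, MAX_USERS)
    return [(rng.uniform(0.0, 60.0), rng.uniform(-180.0, 180.0))
            for _ in range(count)]


def next_batch(database, rng):
    """Return one training batch and grow the database in place.

    The first call returns only the 128 new patterns, which form the
    initial database; later calls return 128 new patterns plus a
    mini-batch of 128 patterns sampled from the existing database.
    """
    new = [random_users(rng) for _ in range(NEW_PER_ITERATION)]
    replay = rng.sample(database, REPLAY_PER_ITERATION) if database else []
    database.extend(new)
    return new + replay
```

Each returned batch would then be converted to intensity patterns, passed through the NN, and used for one Adam update.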

Fig. 2. Beamforming accuracy of a fully connected NN trained using intensity patterns generated for a maximum number of three spatially separated users. User and beam positions are given by their polar and azimuth angular position relative to the surface normal. Correlation between the user and beam positions predicted by the trained NN for: (a), (b) 4000 cases of three spatially separated users; (c), (d) 2400 cases of five spatially separated users. The predictions are performed for intensity patterns that do not belong to the training data.

Figure 2 illustrates the beamforming accuracy of the trained fully connected NN based on the linear correlation between the user positions and the beam positions for both the polar and azimuth angles. For a given input intensity pattern, we define the “user position” as the angular position of the local intensity maximum, while the “beam position” is defined as the local maximum of the predicted intensity pattern nearest to the “user position”. Figures 2(a) and 2(b) show a strong linear correlation between the user positions and beam positions generated by the trained NN, calculated using 4000 input intensity patterns generated for three spatially separated users. The linear correlation of the polar angles in Fig. 2(a) shows a standard deviation of less than 2° for $\theta < $ 45°, which may be further decreased by increasing the number of training iterations. On the other hand, the linear correlation in Fig. 2(b) shows a standard deviation of less than 15° across all azimuth angles. The larger standard deviation in the linear correlation of the azimuth angles is due to the higher uncertainty in beamforming radiation towards the desired azimuth angle at smaller polar angles. The predicted intensity patterns cover most angular positions since the standard deviation in the linear correlation is less than the 6.3° angular beamwidth (see Methods).

Although the input intensity patterns used in Fig. 2 were not part of the training data, the high beamforming accuracy shown in Figs. 2(a) and 2(b) is perhaps unsurprising due to similarities with the intensity patterns in the training data. On the other hand, Figs. 2(c) and 2(d) show the correlation between the user positions and beam positions calculated for 2400 target intensity patterns corresponding to five randomly positioned and spatially separated users. The NN here is the same as in Figs. 2(a) and 2(b), trained for a maximum of three simultaneous users. While the predicted intensity patterns suffer higher inaccuracies, a strong correlation is still observed for the polar angles with a standard deviation of about 2° for $\theta < $ 45°.

The strong linear correlation in Figs. 2(c) and 2(d) indicates that the trained NN can generalize its predictions to more complex intensity pattern inputs. These intensity patterns consist of up to five primary radiation lobes, completely unlike the training data. The ability to generalize to more complex intensity profiles can be explained by the nonlinearity in Eq. (2) for three or more spatially separated users. For instance, a NN trained on intensity patterns designed for two spatially separated users could not accurately predict the phase profile required to beamform radiation towards three or more simultaneous users (see Supplement 1, S1). The reason is that only a linear or periodic phase profile is necessary to form one or two primary radiation lobes in the intensity pattern, and a neural network trained on these intensity data would not be capable of predicting the nonlinear phase profile required to form three or more primary radiation lobes in the intensity pattern. The beamforming accuracy of the trained NN for more than five simultaneous users is also analyzed below.

Fig. 3. Correlation between the user and beam positions predicted by the trained NN for: (a), (b) 2000 cases of six spatially separated users; (c), (d) 1200 cases of ten spatially separated users. The predictions are performed for intensity patterns that do not belong to the training data.

Figure 3 shows the beamforming accuracy of the same pre-trained fully connected NN in Fig. 2, trained for a maximum of three simultaneous users. While a linear correlation between the user and beam positions is observed for 6 simultaneous users, the beamforming accuracy of the trained NN deteriorates significantly when there are 10 simultaneous users. The corresponding input and predicted intensity patterns of the test scenarios in Fig. 3 are shown in Fig. 4 below.

Fig. 4. Input and predicted intensity patterns for 6 and 10 users corresponding to the test scenarios in Fig. 3. The intensity values are normalized against the maximum of the intensity pattern for a single beam positioned at θ = 0°.

The pre-trained fully-connected NN in Figs. 2 and 3 has demonstrated the ability to generalize beamforming predictions beyond the training data. To understand the potential and limitations of our deep learning model, the performance of the Adam estimation algorithm with the defined error function (see Methods) is explored for the rest of this section.

Fig. 5. Desired intensity pattern for (a) a single user (b) 112 linearly spaced users in the far-field projection plane, calculated using Eq. (3). The intensity values are normalized against the maximum of the intensity pattern in (a).

In the case where a single intensity lobe is required for each user, the number of simultaneous users is fundamentally limited by the angular beamwidth of each radiation lobe, which is in turn determined by the spatial extent of the phase profile or beamforming array. For a 2D array of 32×32 radiating elements with quarter-wavelength inter-element spacing, the intensity lobe at an angular position of θ = 0° has a full width at half maximum beamwidth of 6.3° (see Methods). In theory, the 2D radiating array could beamform radiation towards more than 100 spatially separated users with the intensity pattern in Fig. 5(b).

Fig. 6. (a) Spatial phase profile required to beamform the intensity pattern in Fig. 5(b). (b) The intensity pattern resulting from the phase profile in (a), normalized against the maximum of the intensity pattern in Fig. 5(a)

The deep learning model's ability to predict the spatial phase profile for generating the intensity pattern in Fig. 5(b) is further limited by the Adam estimation algorithm with the defined error function. Figure 6(a) shows the phase profile designed to beamform the intensity pattern in Fig. 5(b), obtained via gradient descent optimization using the error gradients computed via automatic differentiation. The intensity pattern shown in Fig. 6(b) closely resembles the intensity pattern in Fig. 5(b), which exemplifies the limits of our deep learning model.

Fig. 7. Input intensity pattern for 112 randomly positioned users and the output phase profile obtained via gradient descent optimization using the Adam optimizer. The corresponding predicted intensity pattern and the error are shown.

To illustrate a more realistic scenario, the optimized phase profile for an input intensity pattern designed for 112 randomly positioned users is shown in Fig. 7. The output intensity pattern, obtained via gradient descent optimization, almost perfectly matches the input intensity pattern in this case. Therefore, our deep learning model can potentially be trained to accurately predict spatial phase profiles for beamforming radiation towards more than 100 randomly positioned and spatially separated users when given sufficient training iterations.

2.3 Design of silicon metasurfaces

To test the phase profiles generated by the deep learning scheme, we use them for designing silicon metasurfaces to beamform an incident linearly polarized 1.0 THz wave. Instead of using the commonly implemented half-wavelength inter-element spacing to avoid grating lobes in the radiation pattern, we use quarter-wavelength inter-element spacing for more precise and complex phase control. The grating lobes are suppressed in the radiation pattern for any inter-element spacing that is a dyadic rational fraction (e.g., ¼ or ¾) of the radiation wavelength (see Supplement 1, S2). A 2D array of 32×32 radiating elements with 75 µm period provides sufficient resolution for the spatial phase profile to beamform intensity patterns with minimal side lobes (see Supplement 1, S2).

Figure 8(a) shows a metasurface element structure with a quarter-wavelength period of 75 µm. A Finite-Difference Time-Domain (FDTD) simulation shows that a y-polarized 1.0 THz wave transmitted through the structure in Fig. 8(a) produces a far-field radiation pattern with an approximately ${\cos ^2}\theta $ intensity distribution (see Supplement 1, S3). By changing the lateral dimensions of the silicon metasurface structure, the transmittance and phase shift of a linearly polarized 1.0 THz wave can be tuned as shown in Figs. 8(b) and 8(c), respectively. In the simulation sweep (Fig. 8), we vary both the silicon structure lateral dimensions ${l_x}$ and ${l_y}$ from 10 µm to 70 µm in steps of 1 µm. Due to fabrication constraints in the etching of the silicon wafers [42], the extreme lateral dimensions of less than 10 µm and more than 70 µm are not included to limit the aspect ratio of the silicon pillars and grooves. FDTD simulations show that using a structure thickness of 200 µm, full 2π phase modulation can be achieved (hence allowing the metasurface to implement arbitrary phase profiles) within the specified range of lateral dimensions with a relatively high transmittance of over 0.7 (see Fig. 8(b)).

Fig. 8. A single metasurface element and the simulated transmittance and phase shift of a y-polarized 1.0 THz wave. (a) A single element of the silicon metasurface on a silicon substrate. (b), (c) Simulated transmittance and phase shift of a y-polarized 1.0 THz wave as a function of structure length (${l_x}$) and width (${l_y}$). The full 2π phase modulation in (c) is achieved using a quarter-wavelength period of 75 µm and a pillar height (${l_z}$) of 200 µm.

2.4 Terahertz beamforming experiments

Figure 9(a) shows a spatial phase profile generated by the deep learning model, to be implemented on a silicon metasurface with 32×32 quarter-wavelength-spaced elements, intended to beamform 1.0 THz radiation towards five spatially separated users at the following angular positions: $\theta = 15^\circ ,\; \phi ={-} 45^\circ $; $\theta = 20^\circ ,\; \phi = 80^\circ $; $\theta = 25^\circ ,\; \phi = 140^\circ $; $\theta = 25^\circ ,\; \phi ={-} 175^\circ $; and $\theta = 30^\circ ,\; \phi ={-} 170^\circ $. From this phase profile, we design the silicon metasurface depicted in Fig. 9(b) using nearest neighbor interpolation of the simulated phase shift data in Fig. 8(c), with each metasurface element having equal lateral dimensions (${l_x} = {l_y}$). Figure 9(c) shows an image of the fabricated silicon metasurface taken using a scanning electron microscope. Figure 9(d) plots the simulated far-field intensity pattern of 1.0 THz radiation transmitted through the silicon metasurface, calculated using an FDTD simulation and projected onto a 170 mm × 170 mm plane located 130 mm from the metasurface in the direction of the surface normal.
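
The nearest neighbor step that maps a target phase to an element size can be sketched as below. The lookup table here is a placeholder: it is an invented linear phase-versus-size relation over the 10 µm to 70 µm sweep range and does not reproduce the simulated data of Fig. 8(c). The function names are hypothetical, and the comparison of phases modulo 2π is an assumption about how the lookup is performed.

```python
import math

TWO_PI = 2.0 * math.pi


def wrapped_distance(a, b):
    """Smallest absolute difference between two phases, modulo 2*pi."""
    d = (a - b) % TWO_PI
    return min(d, TWO_PI - d)


def nearest_dimension(target_phase, table):
    """Lateral size l (l_x = l_y = l) whose tabulated phase is closest.

    table: iterable of (size_um, phase_shift_rad) pairs.
    """
    best = min(table, key=lambda row: wrapped_distance(row[1], target_phase))
    return best[0]


# PLACEHOLDER data, not simulation results: sizes from 10 um to 70 um in
# 1 um steps with an invented linear phase response covering almost 2*pi.
PLACEHOLDER_TABLE = [(size, TWO_PI * (size - 10) / 61.0)
                     for size in range(10, 71)]


def design_metasurface(phase_profile, table=PLACEHOLDER_TABLE):
    """Map a 2D phase profile (list of rows, radians) to element sizes."""
    return [[nearest_dimension(p, table) for p in row]
            for row in phase_profile]
```

In practice the table would be replaced by the simulated phase shifts along the ${l_x} = {l_y}$ diagonal of the sweep.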

Fig. 9. Experimental demonstration of terahertz beamforming using a silicon metasurface. (a) Predicted spatial phase profile for beamforming 1.0 THz radiation towards five spatially separated users. (b) A silicon metasurface design with 32 × 32 quarter-wavelength-spaced elements required to achieve the desired spatial phase modulation. (c) An image of the fabricated silicon metasurface taken using a scanning electron microscope. (d) Simulated intensity pattern at 1.0 THz frequency for the silicon metasurface design, projected onto a 170 mm × 170 mm plane 130 mm away from the metasurface in the direction of the surface normal. The angular positions of the five users are labelled as shown. The 4th and 5th users share a single radiation lobe (of higher intensity) as their angular separation is smaller than the angular beamwidth. (e-i) Experimental measurements of the maximum intensity of each radiation lobe beamformed by the silicon metasurface at 1.0 THz frequency with the simulated radiation pattern overlaid in gray contours.

Figures 9(e) to 9(i) show the experimentally measured maximum intensity of each primary radiation lobe beamformed by the fabricated silicon metasurface in Fig. 9(c) at 1.0 THz frequency. The intensity data are experimentally obtained via a 2D raster scan of a terahertz pulse transmitted through the silicon metasurface using a fiber-based terahertz time-domain spectroscopy setup (see Methods), with the detector positioned 130 mm away and aligned parallel to the expected beamforming angle. The terahertz detector has a small angular detection window (due to the detection limit of its photoconductive antenna), so only the intensity data near the expected beamforming position are experimentally measured, as shown in Figs. 9(e) to 9(i). Both the simulated radiation intensity pattern (Fig. 9(d)) and the experimentally obtained intensity data (Figs. 9(e) to 9(i)) are normalized against the intensity of a terahertz wave transmitted through an un-patterned silicon wafer at 1.0 THz frequency. The experimentally measured normalized intensities in Figs. 9(e) to 9(i) are of comparable magnitude to the simulated intensity pattern at the expected beamforming positions, showing that our designed silicon metasurface can accurately and efficiently beamform radiation towards five spatially separated users. Further experimental results of terahertz beamforming using silicon metasurfaces, designed using the trained NN in Fig. 2, for three and four spatially separated users, can be found in Supplement 1, S4.

Fig. 10. Spatial phase profile designed for beamforming the intensity pattern in Fig. 9(d), represented using (a) 8-bit values, (b) 4-bit values, and (c) 2-bit values. (d-f) Intensity patterns of the phase profiles in (a), (b) and (c), respectively.

While current experimental demonstrations of active terahertz beamforming are limited to 8-bit programmable metasurfaces [26], we show that the spatial phase profiles predicted by our deep learning model can still accurately beamform desired intensity patterns when downsampled to 2-bit values. Figure 10 shows a spatial phase profile generated by our deep learning model, intended to beamform radiation towards five simultaneous users. The resulting intensity pattern remains relatively unchanged when the representation of the spatial phase profile decreases from 8-bit to 2-bit values, although the radiation lobe intensity decreases slightly for the 2-bit phase profile. This shows that our deep learning model may be implemented using existing programmable metasurfaces with 2-bit phase control to beamform radiation towards multiple spatially separated users.
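
The effect of reducing the phase resolution can be illustrated with a short sketch. It quantizes a phase to $2^b$ uniformly spaced levels and evaluates the on-target intensity of a quantized single-beam steering profile relative to the unquantized one. This is a simplified 1D single-beam example with assumed parameters (32 elements, quarter-wavelength spacing, a 20° steering angle), not the 2D five-user profile of Fig. 10, and the uniform rounding scheme and function names are assumptions.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi


def quantize_phase(phase, bits):
    """Round a phase (radians) to the nearest of 2**bits levels in [0, 2*pi)."""
    levels = 2 ** bits
    step = TWO_PI / levels
    return (round((phase % TWO_PI) / step) % levels) * step


def peak_ratio(bits, n_elements=32, spacing_over_wavelength=0.25,
               theta_deg=20.0):
    """On-target intensity with b-bit phases relative to continuous phases.

    1D array steered to theta_deg; only the phase quantization error
    remains in the exponent at the target direction.
    """
    k_d = TWO_PI * spacing_over_wavelength
    s = math.sin(math.radians(theta_deg))
    field = 0j
    for n in range(n_elements):
        ideal = k_d * n * s
        field += cmath.exp(1j * (ideal - quantize_phase(ideal, bits)))
    return abs(field) ** 2 / n_elements ** 2
```

Because the rounding error of a 2-bit phase never exceeds π/4, the on-target intensity in this toy case stays above half of the unquantized value, consistent with the modest reduction in lobe intensity noted above.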

3. Discussion

The deep learning model presented in this work can serve as the basis for implementing a MIMO system based on reprogrammable terahertz metasurfaces. Further performance improvements may be achieved by combining the NN with other optimization algorithms. For instance, the spatial phase profile predicted by the NN can be used as an initializer for further non-NN-based beam optimization. Moreover, the NN is designed to accept multiple current intensity patterns as the input, which could be used for the simultaneous prediction of multiple phase profiles (e.g., real-time resource slicing to support multiple users using several reprogrammable metasurfaces).

While we have only considered input intensity patterns with the specific form shown in Eq. (3), composed of a superposition of radiation lobes with equal amplitude and fixed angular width, the deep learning model can be trained to predict phase profiles for achieving more general intensity patterns, with multiple radiation lobes of varying intensities and angular widths. The ability to form unequal radiation lobes is extremely useful for broadcasting to user clusters of various sizes at different radial distances.

So far, we have only experimentally demonstrated beamforming using a passive metasurface. Our deep learning model must be implemented alongside a reprogrammable metasurface to achieve real-time complex beamforming. Dynamic terahertz beam steering and beamforming have been experimentally demonstrated using reprogrammable metasurfaces [25]. However, the phase profile control demonstrated in these experiments limits the range of beamforming radiation patterns they support. Spatial phase modulation of terahertz waves has also been theoretically proposed using micro-electromechanical systems (MEMS) [43], which could achieve full 2π spatial phase modulation and may be integrated with field programmable gate arrays (FPGAs) to utilize our NN beamforming scheme.

For simplicity, our simulations and experiments have focused on the spatial phase modulation of a monochromatic terahertz wave. For a terahertz wave with a finite frequency bandwidth, the intensity pattern of the transmitted radiation changes with frequency due to the variable frequency response of the silicon metasurface (see Supplement 1, S5). The use of a narrowband terahertz pulse can minimize this effect. However, a broadband terahertz pulse may present an advantage where the metasurface can simultaneously be used for beamforming at multiple frequencies.

4. Conclusion

We have demonstrated an intelligent real-time and self-adaptive terahertz beamforming scheme enabled by deep reinforcement learning based on a fully connected neural network, which outputs spatial phase profiles describing complex terahertz beams that accurately match a set of desired radiation patterns. To achieve real-time terahertz beamforming, the scheme may be implemented using a microcontroller to perform spatial phase modulation in reprogrammable metasurfaces or MEMS. We find that the trained neural network exhibits a high level of generalizability, producing good results for target intensity patterns that are unlike any of the training data. These results indicate that deep learning and automatic differentiation provide a promising approach for real-time terahertz beamforming in MIMO systems for next-generation wireless communications. Similar schemes could prove helpful in terahertz imaging and sensing applications.

5. Methods

Calculation of the intensity pattern for beamforming radiation towards a single user:

The phase profile required to form a single beam towards the angular position $(\theta_u, \phi_u)$ is given by the linear phase profile $\varphi_{mn} = k\sin\theta_u (x_{mn}\cos\phi_u + y_{mn}\sin\phi_u)$, where $k = |\mathbf{k}|$ and $(x_{mn}, y_{mn})$ are the in-plane Cartesian coordinates of $\mathbf{r}_{mn}$. The corresponding intensity pattern is given by

$$I_u(\theta, \phi) \approx \left| \frac{1}{\Delta x \Delta y} \int_{-\frac{M\Delta y}{2}}^{+\frac{M\Delta y}{2}} \int_{-\frac{N\Delta x}{2}}^{+\frac{N\Delta x}{2}} \exp\left[ ik\left[ x(\sin\theta\cos\phi - \sin\theta_u\cos\phi_u) + y(\sin\theta\sin\phi - \sin\theta_u\sin\phi_u) \right] \right] \mathrm{d}x\,\mathrm{d}y \right|^2, \tag{4}$$

where $\Delta x$ and $\Delta y$ are the spacings between the radiating elements. Evaluating the double integral leads, up to a constant factor of $(NM)^2$, to the following equation describing a single-beam intensity pattern:

$$I_u = \left| \frac{\sin\left[ k\frac{N\Delta x}{2}(\sin\theta\cos\phi - \sin\theta_u\cos\phi_u) \right]}{k\frac{N\Delta x}{2}(\sin\theta\cos\phi - \sin\theta_u\cos\phi_u)} \times \frac{\sin\left[ k\frac{M\Delta y}{2}(\sin\theta\sin\phi - \sin\theta_u\sin\phi_u) \right]}{k\frac{M\Delta y}{2}(\sin\theta\sin\phi - \sin\theta_u\sin\phi_u)} \right|^2. \tag{5}$$
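As an illustration, Eq. (5) can be evaluated numerically. The following Python sketch is not part of the original work. It uses the wavelength (300 µm), element count ($N$ = 32) and element spacing ($\Delta x$ = 75 µm) quoted in this section, and additionally assumes $M = N$ and $\Delta y = \Delta x$. The function name `single_beam_intensity` and the example user direction (30°, 45°) are hypothetical.

```python
import numpy as np

WAVELENGTH = 300e-6            # m, value quoted in this section
K = 2 * np.pi / WAVELENGTH     # wavenumber k = 2*pi/lambda
N = M = 32                     # elements per side (M = N is an assumption)
DX = DY = 75e-6                # m, element spacing (DY = DX is an assumption)

def single_beam_intensity(theta, phi, theta_u, phi_u):
    """Normalized single-beam pattern of Eq. (5); all angles in radians."""
    u = np.sin(theta) * np.cos(phi) - np.sin(theta_u) * np.cos(phi_u)
    v = np.sin(theta) * np.sin(phi) - np.sin(theta_u) * np.sin(phi_u)
    # np.sinc(t) is sin(pi*t)/(pi*t), so the arguments are divided by pi.
    fx = np.sinc(K * N * DX / 2 * u / np.pi)
    fy = np.sinc(K * M * DY / 2 * v / np.pi)
    return np.abs(fx * fy) ** 2

# Cut through the pattern along the azimuth of a hypothetical user.
theta = np.radians(np.linspace(0.0, 90.0, 901))
theta_u, phi_u = np.radians(30.0), np.radians(45.0)
cut = single_beam_intensity(theta, phi_u, theta_u, phi_u)
peak_deg = np.degrees(theta[np.argmax(cut)])
print(f"main lobe at theta = {peak_deg:.1f} deg")
```

The main lobe of the cut sits at the requested polar angle, as expected for a linear phase profile.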

For a single user at angular position $\theta_u = 0^\circ$ and $\phi_u = 0^\circ$, Eq. (5) reduces to $\left| \mathrm{sinc}\left( k\frac{N\Delta x}{2}\sin\theta \right) \right|^2$ in the $\phi = 0^\circ$ plane, where the $y$-dependent factor equals unity. We can determine the half width at half maximum (HWHM) of the function $|\mathrm{sinc}(x)|^2$ numerically to find $x_{\mathrm{HWHM}} \approx 1.3916$. The HWHM of the intensity lobe therefore satisfies $k\frac{N\Delta x}{2}\sin\theta_{\mathrm{HWHM}} = 1.3916$, and the full width at half maximum (FWHM) of the intensity lobe is given by $\theta_{\mathrm{FWHM}} = 2\sin^{-1}\left( \frac{2.7832}{kN\Delta x} \right)$. For wavevector $k = \frac{2\pi}{\lambda} = \frac{2\pi}{300\ \mathrm{\mu m}}$, $N$ = 32 radiating elements and $\Delta x$ = 75 µm element spacing, the FWHM angular beamwidth of the intensity lobe is approximately 6.3°. The angular beamwidth of the intensity lobe can be preserved by keeping the parameter $kN\Delta x$ constant.
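The beamwidth estimate above can be checked with a few lines of Python. This is a minimal sketch rather than the procedure used in this work: it finds the half-maximum point of $|\mathrm{sinc}(x)|^2$ by bisection (a root-finding substitute for the numerical minimization mentioned above), and the helper names `sinc_sq` and `bisect` are hypothetical.

```python
import math

def sinc_sq(x):
    """|sinc(x)|^2 with the unnormalized convention sinc(x) = sin(x)/x."""
    return 1.0 if x == 0 else (math.sin(x) / x) ** 2

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Half-maximum point of the main lobe, bracketed between x = 1 and x = 2.
x_hwhm = bisect(lambda x: sinc_sq(x) - 0.5, 1.0, 2.0)

# Parameters quoted in the text: lambda = 300 um, N = 32, dx = 75 um.
k = 2 * math.pi / 300e-6
n_elements, dx = 32, 75e-6
fwhm_deg = math.degrees(2 * math.asin(2 * x_hwhm / (k * n_elements * dx)))
print(f"x_HWHM = {x_hwhm:.4f}, FWHM = {fwhm_deg:.2f} deg")
```

Because the result depends only on the product $kN\Delta x$, halving the wavelength while halving $N\Delta x$ leaves the computed beamwidth unchanged.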

Neural network architecture and training:

Our fully connected neural network (16384×4096×4096×1024 nodes) receives an input intensity pattern with 128 × 128 = 16384 data points and outputs the phase profile for a metasurface with an array of 32 × 32 = 1024 radiating elements. The outputs of each fully connected layer are passed through a leaky rectified linear unit (Leaky ReLU) activation function with a 0.01 multiplier for negative values [44]. The weights of the fully connected layers are initialized using the Glorot initializer [45], and the biases are initialized to zero. The propagation of information through the fully connected layers is calculated via a series of matrix multiplication operations:

$$\mathbf{x}_1 = f(\mathbf{W}_1\mathbf{I}_0 + \mathbf{b}_1),\quad \mathbf{x}_2 = f(\mathbf{W}_2\mathbf{x}_1 + \mathbf{b}_2),\quad \boldsymbol{\varphi}_{\mathrm{output}} = f(\mathbf{W}_3\mathbf{x}_2 + \mathbf{b}_3), \tag{6}$$

where $\mathbf{x}$ and $\boldsymbol{\varphi}$ are the outputs of the hidden and output layers, respectively, $\mathbf{I}_0$ is the vector containing the input intensity pattern, and $f(x)$ is the nonlinear activation function. The matrices $\mathbf{W}$ and vectors $\mathbf{b}$ are the weights and biases of the fully connected layers, respectively. The predicted intensity pattern $\mathbf{I}_{\mathrm{predicted}}$ is calculated from the output phase profile $\boldsymbol{\varphi}_{\mathrm{output}}$ using Eq. (2), and the error $\xi$ is defined as the weighted half sum of squared differences between the predicted and input intensity patterns:

$$\xi = \frac{1}{2}\sum_{\theta, \phi} \exp\left[ 8\ln 2\left( \frac{\theta}{90^\circ} \right)^2 \right] \left| \mathbf{I}_{\mathrm{predicted}}(\theta, \phi) - \mathbf{I}_0(\theta, \phi) \right|^2, \tag{7}$$

where the weight $\exp\left[ 8\ln 2\left( \frac{\theta}{90^\circ} \right)^2 \right]$ counteracts the $\cos^2\theta$ factor in Eq. (2). We optimize the fully connected neural network using the adaptive moment estimation (Adam) algorithm [39] and update the neural network weights and biases with a global learning rate of 0.0001, where the error gradients $\frac{\mathrm{d}\xi}{\mathrm{d}\mathbf{W}}$ and $\frac{\mathrm{d}\xi}{\mathrm{d}\mathbf{b}}$ are computed via automatic differentiation [37,38].
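The forward pass of Eq. (6) and the weighted error of Eq. (7) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this work: the layer sizes are scaled down from 16384×4096×4096×1024 to keep the example light, the uniform variant of the Glorot initializer is assumed, and the mapping from phase profile to predicted intensity (Eq. (2)) is not reproduced, so the error is evaluated on placeholder arrays. All function names are hypothetical.

```python
import numpy as np

def glorot_uniform(rng, n_in, n_out):
    """Glorot initialization (uniform variant assumed for this sketch)."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_out, n_in))

def leaky_relu(x, slope=0.01):
    """Leaky ReLU with a 0.01 multiplier for negative values."""
    return np.where(x > 0, x, slope * x)

def init_params(rng, sizes):
    """Glorot weights and zero biases for each fully connected layer."""
    return [(glorot_uniform(rng, n_in, n_out), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, intensity):
    """Eq. (6): x -> f(W x + b), with f applied to every layer."""
    x = intensity
    for weights, bias in params:
        x = leaky_relu(weights @ x + bias)
    return x

def weighted_error(i_pred, i_target, theta_deg):
    """Eq. (7): half sum of squared differences with an angular weight."""
    weight = np.exp(8 * np.log(2) * (theta_deg / 90.0) ** 2)
    return 0.5 * np.sum(weight * np.abs(i_pred - i_target) ** 2)

rng = np.random.default_rng(0)
sizes = [64, 32, 32, 16]          # scaled-down stand-in for the real network
params = init_params(rng, sizes)
phase = forward(params, rng.random(sizes[0]))
print(phase.shape)

# The weight grows from 1 at theta = 0 deg to 2**8 = 256 at theta = 90 deg.
edge_error = weighted_error(np.array([1.0]), np.array([0.0]), np.array([90.0]))
print(edge_error)
```

In a full training loop, the gradients of the error with respect to each weight matrix and bias vector would be obtained from an automatic differentiation framework and passed to an Adam optimizer, which this sketch omits.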

Simulation of far-field radiation patterns:

The electromagnetic fields of a terahertz pulse transmitted through a silicon metasurface are computed via FDTD simulations using the Lumerical software (https://www.lumerical.com). To obtain the far-field radiation intensity pattern at a specific frequency, a power monitor is placed at the surface of the silicon metasurface to record the simulated electromagnetic fields, and a Fourier transform of the recorded fields yields the angular spectrum of the transmitted terahertz pulse at the specified frequency. Each electromagnetic plane wave component of the angular spectrum is then propagated towards a hemispherical surface of 1 m radius, where the intensity pattern is computed.
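The spatial Fourier transform step can be illustrated with a short NumPy sketch. It does not use or reproduce the Lumerical workflow: it assumes an idealized monochromatic field sampled once per radiating element (32 × 32 samples at 75 µm pitch, 300 µm wavelength), with unit amplitude and a linear phase that steers the beam to 30° along $x$. The sign convention of the phase and the function name `angular_spectrum` are assumptions of this sketch.

```python
import numpy as np

def angular_spectrum(field, pitch, wavelength, pad=8):
    """Normalized angular spectrum of a sampled near field.

    Returns the direction cosines s = k_x / k of the zero-padded FFT grid
    and the intensity of the propagating plane-wave components.
    """
    n = field.shape[0] * pad
    spectrum = np.fft.fftshift(np.fft.fft2(field, s=(n, n)))
    k = 2 * np.pi / wavelength
    k_axis = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n, d=pitch))
    kx, ky = np.meshgrid(k_axis, k_axis, indexing="xy")
    propagating = kx**2 + ky**2 <= k**2      # discard evanescent components
    intensity = np.where(propagating, np.abs(spectrum) ** 2, 0.0)
    return k_axis / k, intensity / intensity.max()

pitch, wavelength, n_side = 75e-6, 300e-6, 32
k = 2 * np.pi / wavelength
x = (np.arange(n_side) - (n_side - 1) / 2) * pitch
xx, _ = np.meshgrid(x, x, indexing="xy")
field = np.exp(1j * k * np.sin(np.radians(30.0)) * xx)   # steer to 30 deg in x

s, intensity = angular_spectrum(field, pitch, wavelength)
iy, ix = np.unravel_index(np.argmax(intensity), intensity.shape)
print(f"peak at sin(theta_x) = {s[ix]:.3f}, sin(theta_y) = {s[iy]:.3f}")
```

Each propagating component with direction cosines $(s_x, s_y)$ corresponds to a point on the far-field hemisphere, so the peak at $s_x = 0.5$ maps to a beam at 30° from the surface normal.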

Sample fabrication and preparation:

The silicon metasurface samples were fabricated on 500 µm thick high-resistivity (>5000 Ω·cm) silicon wafers, chosen for their negligible transmission losses at terahertz frequencies. Deep reactive ion etching was used to pattern the designed metasurface structure with an etching depth of 200 µm. Aluminum foils were then used to prevent the direct transmission of terahertz waves through the un-patterned regions of silicon surrounding each metasurface sample.

Experimental setup and measurements:

The intensity profile of the radiation pattern was obtained from a two-dimensional raster scan of a terahertz pulse transmitted through the silicon metasurface, using a fiber-based terahertz time-domain spectroscopy (THz-TDS) setup with a waveform acquisition rate of 100 waveforms per second. A terahertz lens with a 50 mm focal length was used to focus the terahertz pulse onto the substrate side of the silicon metasurface (2.4 mm × 2.4 mm sample size) with a 6 mm beam spot. The terahertz detector was placed on a two-dimensional raster scanning stage, with the lens positioned 130 mm away from the silicon metasurface in the perpendicular direction and the detector aligned parallel to the expected beamforming angle. We performed raster scans over a 170 mm × 170 mm flat surface with a pixel size of 0.5 mm and a scanning speed of 10 mm per second, obtaining 5 waveforms per pixel. The power spectrum of the averaged waveform was then computed via Fourier transform, and the normalized power at 1.0 THz was taken as the value of each pixel of the intensity pattern.
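The per-pixel processing described above can be sketched as follows. The acquisition rate, pixel size and scanning speed are taken from the text; the time step of 0.05 ps, the synthetic waveforms and the function name `power_at` are assumptions made only for this illustration and do not represent measured data.

```python
import numpy as np

ACQUISITION_RATE = 100.0   # waveforms per second
PIXEL_SIZE = 0.5           # mm
SCAN_SPEED = 10.0          # mm per second

# Time spent over one pixel multiplied by the acquisition rate.
waveforms_per_pixel = ACQUISITION_RATE * PIXEL_SIZE / SCAN_SPEED

def power_at(waveforms, dt, freq):
    """Spectral power of the averaged waveform at the bin nearest `freq`."""
    mean_waveform = np.mean(waveforms, axis=0)
    power = np.abs(np.fft.rfft(mean_waveform)) ** 2
    freqs = np.fft.rfftfreq(mean_waveform.size, d=dt)
    return power[np.argmin(np.abs(freqs - freq))]

# Synthetic stand-in for one pixel: a 1 THz tone plus noise.
rng = np.random.default_rng(1)
dt = 0.05e-12                                  # s, assumed sampling step
t = np.arange(400) * dt
waveforms = (np.sin(2 * np.pi * 1.0e12 * t)
             + 0.1 * rng.standard_normal((int(waveforms_per_pixel), t.size)))

p_signal = power_at(waveforms, dt, 1.0e12)
p_off = power_at(waveforms, dt, 0.5e12)
print(waveforms_per_pixel, p_signal > p_off)
```

Repeating this for every pixel of the raster scan and dividing by the maximum value over all pixels would give a normalized intensity map at 1.0 THz.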

Funding

National Research Foundation Singapore (NRF-CRP23-2019-0005).

Acknowledgments

YJ Tan acknowledges the A*STAR Graduate Scholarship funding from the Agency for Science, Technology and Research, Singapore. The authors also acknowledge the funding support from the National Research Foundation, Singapore (Grant No. NRF-CRP23-2019-0005).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. S. Cherry, “Edholm's law of bandwidth,” IEEE Spectr. 41(7), 58–60 (2004).

2. J. Wang, J. Liu, and N. Kato, “Networking and Communications in Autonomous Driving: A Survey,” IEEE Commun. Surv. Tutorials 21(2), 1243–1274 (2018).

3. D. M. Abdullah and S. Y. Ameen, “Enhanced mobile broadband (EMBB): A review,” J. Inf. Technol. Informatics 1(1), 13–19 (2021).

4. T. Nagatsuma, G. Ducournau, and C. C. Renaud, “Advances in terahertz communications accelerated by photonics,” Nat. Photonics 10(6), 371–379 (2016).

5. K. Sengupta, T. Nagatsuma, and D. M. Mittleman, “Terahertz integrated electronic and hybrid electronic–photonic systems,” Nat. Electron. 1(12), 622–635 (2018).

6. Y. Yang, Y. Yamagami, X. Yu, P. Pitchappa, J. Webber, B. Zhang, M. Fujita, T. Nagatsuma, and R. Singh, “Terahertz topological photonics for on-chip communication,” Nat. Photonics 14(7), 446–451 (2020).

7. S. Dang, O. Amin, B. Shihada, and M.-S. Alouini, “What should 6G be?” Nat. Electron. 3(1), 20–29 (2020).

8. M. H. Alsharif, A. H. Kelechi, M. A. Albreem, S. A. Chaudhry, M. S. Zia, and S. Kim, “Sixth Generation (6G) Wireless Networks: Vision, Research Activities, Challenges and Potential Solutions,” Symmetry 12(4), 676 (2020).

9. I. F. Akyildiz, A. Kak, and S. Nie, “6G and Beyond: The Future of Wireless Communications Systems,” IEEE Access 8, 133995–134030 (2020).

10. G. Miao, J. Zander, K. W. Sung, and S. Ben Slimane, Fundamentals of Mobile Data Networks (Cambridge University Press, 2016).

11. Y. Yang, A. Shutler, and D. Grischkowsky, “Measurement of the transmission of the atmosphere from 0.2 to 2 THz,” Opt. Express 19(9), 8830 (2011).

12. D. M. Slocum, E. J. Slingerland, R. H. Giles, and T. M. Goyette, “Atmospheric absorption of terahertz radiation and water vapor continuum effects,” J. Quantitative Spectrosc. Radiative Transfer 127, 49–63 (2013).

13. F. Ahmed, M. Furqan, B. Heinemann, and A. Stelzer, “0.3-THz SiGe-Based High-Efficiency Push–Push VCOs With > 1-mW Peak Output Power Employing Common-Mode Impedance Enhancement,” IEEE Trans. Microwave Theory Technol. 66(3), 1384–1398 (2017).

14. K. Guo, Y. Zhang, and P. Reynaert, “A 0.53-THz Subharmonic Injection-Locked Phased Array With 63-µW Radiated Power in 40-nm CMOS,” IEEE J. Solid-State Circuits 54(2), 380–391 (2018).

15. H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, “Energy and Spectral Efficiency of Very Large Multiuser MIMO Systems,” IEEE Trans. Commun. 61(4), 1436–1449 (2013).

16. D. Wang, J. Wang, X. You, Y. Wang, M. Chen, and X. Hou, “Spectral Efficiency of Distributed MIMO Systems,” IEEE J. Select. Areas Commun. 31(10), 2112–2127 (2013).

17. E. Bjornson, E. G. Larsson, and M. Debbah, “Massive MIMO for Maximal Spectral Efficiency: How Many Users and Pilots Should Be Allocated?” IEEE Trans. Wireless Commun. 15(2), 1293–1308 (2015).

18. C. Lin and G. Y. L. Li, “Terahertz Communications: An Array-of-Subarrays Solution,” IEEE Commun. Mag. 54(12), 124–131 (2016).

19. X. Fu, F. Yang, C. Liu, X. Wu, and T. J. Cui, “Terahertz Beam Steering Technologies: From Phased Arrays to Field-Programmable Metasurfaces,” Adv. Opt. Mater. 8(3), 1900628 (2020).

20. L. Liu, X. Zhang, M. Kenney, X. Su, N. Xu, C. Ouyang, Y. Shi, J. Han, W. Zhang, and S. Zhang, “Broadband Metasurfaces with Simultaneous Control of Phase and Amplitude,” Adv. Mater. 26(29), 5031–5036 (2014).

21. Q. Wang, X. Zhang, Y. Xu, J. Gu, Y. Li, Z. Tian, R. Singh, S. Zhang, J. Han, and W. Zhang, “Broadband metasurface holograms: toward complete phase and amplitude engineering,” Sci. Rep. 6(1), 1–10 (2016).

22. L. Cong, Y. K. Srivastava, H. Zhang, X. Zhang, J. Han, and R. Singh, “All-optical active THz metasurfaces for ultrafast polarization switching and dynamic beam splitting,” Light Sci. Appl. 7(1), 1–9 (2018).

23. H. Zhang, X. Zhang, Q. Xu, C. Tian, Q. Wang, Y. Xu, Y. Li, J. Gu, Z. Tian, C. Ouyang, X. Zhang, C. Hu, J. Han, and W. Zhang, “High-Efficiency Dielectric Metasurfaces for Polarization-Dependent Terahertz Wavefront Manipulation,” Adv. Opt. Mater. 6(1), 1700773 (2018).

24. R. T. Ako, A. Upadhyay, W. Withayachumnankul, M. Bhaskaran, and S. Sriram, “Dielectrics for Terahertz Metasurfaces: Material Selection and Fabrication Techniques,” Adv. Opt. Mater. 8(3), 1900750 (2020).

25. J. Guo, T. Wang, H. Zhao, X. Wang, S. Feng, P. Han, W. Sun, J. Ye, G. Situ, H. Chen, and Y. Zhang, “Reconfigurable Terahertz Metasurface Pure Phase Holograms,” Adv. Opt. Mater. 7(10), 1801696 (2019).

26. S. Venkatesh, X. Lu, H. Saeidi, and K. Sengupta, “A high-speed programmable and scalable terahertz holographic metasurface based on tiled CMOS chips,” Nat. Electron. 3(12), 785–793 (2020).

27. L. A. Greda, A. Winterstein, D. L. Lemes, and M. V. T. Heckler, “Beamsteering and Beamshaping Using a Linear Antenna Array Based on Particle Swarm Optimization,” IEEE Access 7, 141562–141573 (2019).

28. M. Zhou, D. Liu, S. W. Belling, H. Cheng, M. A. Kats, S. Fan, M. L. Povinelli, and Z. Yu, “Inverse Design of Metasurfaces Based on Coupled-Mode Theory and Adjoint Optimization,” ACS Photonics 8(8), 2265–2273 (2021).

29. J. Qie, E. Khoram, D. Liu, M. Zhou, and L. Gao, “Real-time deep learning design tool for far-field radiation profile,” Photonics Res. 9(4), B104–B108 (2021).

30. H. Taghvaee, A. Jain, X. Timoneda, C. Liaskos, S. Abadal, E. Alarcón, and A. Cabellos-Aparicio, “Radiation Pattern Prediction for Metasurfaces: A Neural Network-Based Approach,” Sensors 21(8), 2765 (2021).

31. T. N. Kapetanakis, I. O. Vardiambasis, G. S. Liodakis, M. P. Ioannidou, and A. M. Maras, “Smart Antenna Design Using Neural Networks,” in Proceedings of the 8th International Conference: New Horizons in Industry, Business and Education (2013), pp. 130–135.

32. J. H. Kim and S. W. Choi, “A Deep Learning-Based Approach for Radiation Pattern Synthesis of an Array Antenna,” IEEE Access 8, 226059–226063 (2020).

33. R. Lovato and X. Gong, “Phased Antenna Array Beamforming using Convolutional Neural Networks,” in 2019 IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting (IEEE, 2019), pp. 1247–1248.

34. T. Shan, X. Pan, M. Li, S. Xu, and F. Yang, “Coding Programmable Metasurfaces Based on Deep Learning Techniques,” IEEE J. Emerg. Sel. Topics Circuits Syst. 10(1), 114–125 (2020).

35. D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training Deep Neural Networks for the Inverse Design of Nanophotonic Structures,” ACS Photonics 5(4), 1365–1369 (2018).

36. Z. Zhen, C. Qian, Y. Jia, Z. Fan, R. Hao, T. Cai, B. Zheng, H. Chen, and E. Li, “Realizing transmitted metasurface cloak by a tandem neural network,” Photonics Res. 9(5), B229–B235 (2021).

37. R. D. Neidinger, “Introduction to Automatic Differentiation and MATLAB Object-Oriented Programming,” SIAM Rev. 52(3), 545–563 (2010).

38. A. G. Baydin, B. A. Pearlmutter, A. A. Radul, and J. M. Siskind, “Automatic Differentiation in Machine Learning: a Survey,” J. Mach. Learn. Res. 18, 1–43 (2018).

39. D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” arXiv preprint arXiv:1412.6980 (2014).

40. S. A. Kuznetsov, M. A. Astafev, M. Beruete, and M. Navarro-Cía, “Planar Holographic Metasurfaces for Terahertz Focusing,” Sci. Rep. 5(1), 1–8 (2015).

41. S. Kunis and D. Potts, “Fast spherical Fourier algorithms,” J. Comput. Appl. Math. 161(1), 75–98 (2003).

42. B. Wu, A. Kumar, and S. Pamarthy, “High aspect ratio silicon etch: A review,” J. Appl. Phys. 108(5), 9 (2010).

43. L. Cong, P. Pitchappa, Y. Wu, L. Ke, C. Lee, N. Singh, H. Yang, and R. Singh, “Active Multifunctional Microelectromechanical System Metadevices: Applications in Polarization Control, Wavefront Deflection, and Holograms,” Adv. Opt. Mater. 5(2), 1600716 (2017).

44. A. L. Maas, A. Y. Hannun, and A. Y. Ng, “Rectifier Nonlinearities Improve Neural Network Acoustic Models,” in Proc. ICML (2013), 30, p. 3.

45. X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (JMLR Workshop and Conference Proceedings, 2010), pp. 249–256.

Supplementary Material (2)

Optics Express

Name	Description
Supplement 1	Supplement 1
Visualization 1	Self-adaptive deep learning model for intelligent real-time terahertz beamforming based on the prediction of phase profile for metasurfaces