AI-assisted intent-based traffic grooming in a dynamically shared 5G optical fronthaul network

Luyao Guan; Min Zhang; Min Zhang; Yihan Gui; Chunyu Zhang; Hui Yang; Anthony C. Boucouvalas; Danshi Wang; Danshi Wang

doi:10.1364/OE.428024

1. Introduction

With the ultra-dense deployment of small cell sites in 5G networks, mobile data traffic is expected to expand rapidly to 110 EB/month by 2023 [1]. This growth is driven by the proliferation of smart mobile devices and new applications, such as virtual reality, augmented reality, and 4K video [2, 3]. Under these circumstances, the networks will carry a tremendous number of real-time, high-bandwidth services [4], which, in conjunction with the increasingly dynamic traffic patterns of cell sites at different times and locations, pose a major challenge to mobile networks [5]. Therefore, upcoming 5G optical fronthaul networks must evolve to support exploding mobile traffic volumes, real-time extraction of fine-grained and intelligent analytics, and intent-based automatic management of network resources to handle such traffic effectively [6].

Traditional optical fronthaul networks, which lack a traffic prediction-assisted intent-based intelligent resource scheduling scheme [7] and sufficient network resources, such as bandwidth, computational power, and storage, will not be capable of coping with the extensive, large-bandwidth, and unevenly distributed traffic that will be generated in the future [8]. Specifically, traditional network management is triggered by real-time traffic requirements, so the network resources need to be reprogrammed for each incoming traffic demand. Such schemes lack the necessary long-term network resource configuration and planning [9], and the average delay increases with the frequent processing of resource requirements and execution of policies. Moreover, because of the limited network bandwidth, the limited buffer capacity of network equipment, and the large number of unevenly distributed services with large bandwidth requirements, these schemes can even cause network congestion. Planning resources in advance requires accurate traffic prediction. However, owing to the complicated dynamic nature of mobile traffic demand, traditional time series methods cannot satisfy the requirements of prediction tasks well and often neglect important spatial factors [10]. In addition, although some recent approaches model the mobile traffic prediction problem using temporal and spatial features, they consider only local geographical dependency and do not take influential distant regions into consideration [11]. Thus, we design an adaptive graph convolutional network with gated recurrent unit (AGCN-GRU) model that learns the temporal and spatial dependencies of the traffic patterns of cell sites to provide accurate traffic prediction. The AGCN model is not restricted to the graph structure given by the local spatial connections between nodes and can learn the real potential correlations from each type of feature across all nodes.

Although the overall data traffic demand of the mobile network is growing, the demand is not evenly distributed across areas and periods of time. Thus, one of the most promising solutions to the abovementioned problem is to achieve traffic prediction-assisted intent-based intelligent network management and optimization in an SDN-based 5G optical fronthaul network architecture. We leverage the k-means algorithm to map each cluster of cell sites to a BBU in advance, thereby achieving cell site clustering based on the prediction results. Furthermore, we consider how to deal with unpredicted burst traffic. To prevent network performance from being severely degraded by the congestion that unpredicted burst traffic causes, we use intent-based traffic grooming, which grooms the services by allocating network resources according to the monitored real-time bandwidth and the QoS of services. However, most existing approaches pertaining to network planning and resource management are demonstrated by simulations, and the corresponding experimental verification is lacking, especially in intelligent network control. Experimental verification can evaluate a scheme in a real network scenario [12]. Thus, we establish a software-defined testbed in which our AI-assisted intent-based traffic grooming (AI-ITG) scheme is deployed to predict network traffic and achieve intelligent network control.

The contributions of this study can be summarised as follows:

First, we design an AGCN-GRU model that learns the temporal and spatial dependencies of the traffic patterns of cell sites to provide accurate traffic prediction. The AGCN model is not restricted to the graph structure given by the local spatial connections between nodes and can learn the real potential correlations from each type of feature across all nodes.

Second, we propose an intelligent intent-based traffic grooming (ITG) scheme that performs cell site clustering based on the prediction results and grooms the services by allocating network resources according to the monitored real-time bandwidth and the QoS requirements of services. The scheme prevents network performance from being severely degraded by the congestion that unpredicted burst traffic causes.

Finally, we establish a software-defined testbed and deploy the AGCN-GRU model and ITG schemes on the mentioned testbed. We acquire the records of cell site traffic values in the existing networks in Guangzhou city over a 28-day period, and we use these data in our testbed to verify the intelligent-control-based network. The experimental results show that the proposed schemes can optimize network resource allocation, increase the average resource utility ratio, and reduce the average delay and rejection ratio via network load balancing and traffic grooming.

2. Related works

Accurate network traffic prediction assists in executing intent-based intelligent network management. However, mobile network traffic experiences temporal and spatial fluctuations that make traffic forecasting challenging [13]. These fluctuations arise from complicated temporal variations, including burstiness and long periods; from multi-variant impact factors, such as points of interest and the day of the week; and from potential spatial dependencies introduced by the movement of the population. In optical metro and core networks, time-series modelling and forecasting have been introduced for traffic prediction [14, 15]. However, considering the highly dynamic nature of the traffic patterns of cellular sites in a mobile network, it is challenging to predict and characterize these patterns with high accuracy [16], and research on such networks is limited. The ARMA/ARIMA model and the Holt-Winters algorithm are the most widely used traditional linear prediction methods [17]. A key limitation of this type of predictor is its inability to detect sudden changes in traffic values. Recently, recurrent neural networks (RNNs) have been widely used to analyse complicated nonlinear sequence patterns and to model the traffic flow of 5G optical networks [18]. The major drawback of these methods is that they do not take into account the potential spatial dependencies between network traffic series, which are very common and important in a mobile network. Specifically, the spatial correlations appear as follows: in some cases, adjacent base stations share similar traffic patterns, whereas some faraway base stations show reversed traffic patterns [19]. A likely reason is that adjacent cells share similar infrastructure and serve similar user groups with the same habits; for example, residential areas have low demand during the day and high demand at night, whereas shopping districts have particularly high demand for mobile traffic on weekends compared with weekdays [20].
However, beyond simple spatial proximity, distant cells can also be correlated because of their similar city function or because they are connected by transportation systems such as subways and buses, and thus they may share similar traffic patterns. In addition, according to Xu et al. [21], the movement of users who require and consume mobile traffic contributes to spatial correlations. The recently proposed graph convolutional network (GCN) provides a new way of extracting spatial dependency from non-Euclidean data [22] and has been applied to vehicle traffic flow prediction. A GCN models the dynamics of mobile traffic as an information dissemination process. However, the GCN model assumes a given graph structure for the data, so the structure of the network is fixed and the model is incapable of learning new connections [23]. For example, consider two base stations in different downtown areas where the population consists mainly of office workers. Because the base stations are far apart, the similarity of their traffic patterns may not be reflected in the graph topology, and the GCN model therefore cannot learn their correlation. The proposed AGCN-GRU model addresses these issues because the AGCN model is not restricted to the graph structure of local geographical dependency and can learn the real potential correlations from each type of feature across all nodes.

Accurate and timely prediction of cellular traffic helps carriers to schedule resources and groom traffic based on network intent. On this basis, we propose an intelligent intent-based traffic grooming (ITG) scheme and consider how to deal with unpredicted burst traffic. The current literature does not contain any scheme based on AI-assisted intent-based intelligent traffic grooming in an optical fronthaul network [24, 25]. In [18] and [26], the authors used a long short-term memory (LSTM) RNN to predict network traffic and then allocate resources in a 5G C-RAN network. The authors of [8] proposed the MuLSTM model to foresee RRH traffic patterns in a future time period, together with a two-phase framework to dynamically determine the optimal RRH clustering and BBU mapping schemes under different contexts. We focus on how to provide accurate network traffic predictions by learning the temporal and spatial dependencies of the traffic patterns of cell sites, and on how traffic prediction can be used effectively to assist in executing intent-based intelligent and automatic network management, which prevents network performance from being severely degraded by the congestion that unpredicted burst traffic causes.

3. Software-defined fronthaul network architecture and AI-assisted network resource management

3.1 SD-FNet architecture considering the AI-assisted ITG scheme

To achieve better resource utilisation and cope with the explosion of mobile traffic, we designed an SDN-based optical fronthaul network architecture, namely the SD-FNet, that is more flexible and programmable than traditional C-RANs, as shown in Fig. 1. In the data plane, the vBBUs in the virtual baseband unit (vBBU) cloud are implemented on commercial servers that provide real-time baseband processing. The conventional, complicated, and power-hungry cell sites can be simplified to RRHs. Each RRH is equipped with a tuneable transceiver and thereby serves as a terminal of the fronthaul optical network. Each RRH serves the area within its coverage by assigning resource blocks to users in its spectrum band. All the RRHs are connected to the vBBUs by a multiplexer/demultiplexer that aggregates and separates traffic over multiple wavelengths.

Fig. 1. SD-FNet architecture utilizing the AI-assisted ITG scheme

In the controller plane, the intelligent ITG orchestrator contains the data analytics and prediction module, the network orchestration module, and the SDN controller module. The SDN controller consists of a data monitoring module and a provisioning manager. Traffic statistics and network resources are collected by the data monitoring module, and the provisioning manager module is responsible for resource provision and management. The controller sends the collected information to the network resource and traffic demands databases. The network resource information is transmitted to the network orchestration module. The traffic-demand information is used in the data analytics and prediction module to predict the traffic load and the number of access users. Data pre-processing removes unique attributes, such as users' cell phone numbers, from the original data, then handles the missing values, selects the features, and standardises the data. The AGCN-GRU prediction model predicts the traffic load and the number of access users. The network orchestrator module realises the ITG scheme-based network optimiser and network reconfiguration.
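The pre-processing steps listed above (remove identifying attributes, handle missing values, select features, standardise) can be illustrated with a minimal sketch. The field names, the use of mean imputation for missing values, and the function name `preprocess` are assumptions made for illustration; the paper does not specify them.

```python
from statistics import mean, pstdev

def preprocess(records, feature_fields):
    """Return standardised feature columns from raw per-sample records.

    records: list of dicts; a missing value is None or an absent key.
    feature_fields: names of the features to keep. Anything not listed,
    such as a phone number, is dropped (hypothetical field names).
    """
    columns = {}
    for field in feature_fields:
        raw = [r.get(field) for r in records]
        observed = [v for v in raw if v is not None]
        # Assumption: missing values are replaced by the column mean.
        fill = mean(observed) if observed else 0.0
        values = [fill if v is None else v for v in raw]
        # Standardise to zero mean and unit variance; constant columns map to 0.
        mu, sigma = mean(values), pstdev(values)
        columns[field] = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
    return columns

records = [
    {"phone": "x", "load": 1.0, "users": 10},
    {"phone": "y", "load": None, "users": 20},
    {"phone": "z", "load": 3.0, "users": 30},
]
out = preprocess(records, ["load", "users"])
```

Selecting features by name removes the identifying attribute implicitly, so the phone number never reaches the prediction model.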

3.2 AI-assisted intent-based network resource management signalling procedure

By deploying the AGCN-GRU model and ITG schemes, we realise an AI-assisted intent-based traffic grooming scheme on the SD-FNet. The network resource management signalling procedure with AGCN-GRU-ITG on the SD-FNet is shown in Fig. 2. In steps 1 and 2, the SDN controller collects traffic statistics and network resource information by sending a state-request message to the BBU, the optical distribution network remote node (RN), and the RRH, and then collects state-reply information from them in real time. When the controller receives the state-reply message from each responder, all the information is updated and stored in the corresponding network resource database and traffic demands database. Subsequently, in step 3, the AGCN-GRU model is trained periodically to predict the cellular traffic load and the number of access users in the next period. In step 4, the ITG algorithm is executed according to the prediction results from the AGCN-GRU model, and the controller sends flow-mod messages to inform the corresponding RRH, RN, and BBU to cluster the RRHs and realise traffic grooming. In step 5, after the path has been successfully established, the connection is considered to be provisioned, and the traffic flow in the RRH is relayed to the corresponding virtual BBU with the allocated network resource. For comparison, Fig. 2 also shows the signalling procedure of the round-robin scheduling (RRS) scheme, which is used as a benchmark and allocates resources according to the real-time resource requirements of services.

Fig. 2. AI-assisted intent-based network resource management signalling procedure

3.3 AI-assisted intent-based network resource management algorithm

Studies have been conducted in which graph convolutional networks (GCNs) were integrated with recurrent neural networks (RNNs) [27, 28] or with convolutional neural networks (CNNs) [29]; the bottlenecks of current GCNs were reported in [23]. The GCN model assumes a given graph structure for the data, so the structure of the network is fixed and the model is incapable of learning new connections, as shown in Fig. 3. A GCN is an efficient method for capturing the spatial features of a node given its adjacency matrix. Unlike in image data, the neighbors of a node are unordered and variable in number. From a spatial perspective, graph convolution takes the weighted average of the features of a node and its neighbors. Kipf and Welling proposed the first approximate convolution on a graph, which is defined as [23]:

$$\boldsymbol{Y} = \sigma (\tilde{\boldsymbol{A}}\boldsymbol{X}\boldsymbol{W}) \tag{1}$$

Here, $\tilde{{\boldsymbol A}} \in {{\boldsymbol R}^{N \times N}}$ is the normalized adjacency matrix of graph $\textrm{G}$ with self-connections, and it is set as untrainable. ${\boldsymbol X} \in {{\boldsymbol R}^{N \times K}}$ denotes the input historical signal data, comprising the total number of access users, the traffic load of services, and the vertex domains and edges of the graph; ${\boldsymbol Y} \in {{\boldsymbol R}^{N \times L}}$ denotes the output; ${\boldsymbol W} \in {{\boldsymbol R}^{K \times L}}$ denotes a trainable weight matrix; and $\mathrm{\sigma }(\cdot )$ is an activation function, such as the logistic sigmoid function.
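As an illustration of Eq. (1), the following sketch evaluates one graph convolution on a toy two-node graph. It assumes the symmetric normalization used by Kipf and Welling, that is, the adjacency matrix with self-connections added and then scaled by the inverse square root of the node degrees; plain Python lists are used to keep the example dependency-free, and all function names are illustrative.

```python
import math

def normalize_adjacency(adj):
    """Add self-connections, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # at least 1 because of the self-connection
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gcn_layer(a_norm, x, w):
    """Y = sigma(A_norm X W): a_norm is N x N, x is N x K, w is K x L."""
    z = matmul(matmul(a_norm, x), w)
    return [[sigmoid(v) for v in row] for row in z]

# Two connected nodes, one input feature, one output feature.
a_norm = normalize_adjacency([[0.0, 1.0], [1.0, 0.0]])
y = gcn_layer(a_norm, [[1.0], [3.0]], [[1.0]])
```

In this toy graph every entry of the normalized adjacency matrix is 0.5, so each node receives the average of the two input features before the activation is applied.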

Fig. 3. Convolution operation comparison. Red point: center of the kernel. Orange points: coverage of the kernel. (a) GCN, spatial-based convolutional operation on a graph; (b) AGCN, operation on an adaptive graph. Learned edges are shown as red dashed lines. The color of an edge indicates the dependency weight between nodes.

However, if two base stations are located in different downtown areas where the population consists mainly of office workers, the similarity of their traffic patterns will not be reflected in the graph topology because the base stations are far apart. Thus, the GCN model cannot learn their correlation. The AGCN model, in contrast, is not restricted to the graph structure of local geographical dependency and can learn the real potential correlations from each type of feature across all nodes. To fully employ the spatial and temporal features of the traffic flow data of cell sites, we propose the framework of the AGCN-GRU model, as shown in Fig. 4.

Fig. 4. The architecture of AGCN-GRU. The inputs are first entered into the AGCN layer, then passed to the GRU (encoder), and followed by the decoder GRU.

In AGCN, for each feature $\textrm{k} \in \textrm{K}$, we use an adaptive adjacency matrix $\tilde{{\textbf A}}_{\textrm{adpt}}^\textrm{k} \in {\textrm{R}^{\textrm{N} \times \textrm{N}}}$ to model the spatial dependency, where $\textrm{N}$ is the number of monitoring nodes. This self-adaptive adjacency matrix does not require prior knowledge of the graph and is trained end to end through backpropagation, so that it learns the hidden spatial dependency from each type of feature across all nodes by itself. When the graph structure is not provided or is unavailable, the original adjacency matrix is replaced by the self-adaptive adjacency matrix, which is trainable. Each adaptive adjacency matrix $\tilde{{\textbf A}}_{\textrm{adpt}}^\textrm{k}$ is able to model the spatial dependency of each cell site for a different feature during training. The input ${\textbf X}$ is the historical data of the total traffic load, the number of access users, and the traffic load of P2P, VoIP, finance, navigation, stocks, instant messaging, social networks, video, web, file transfer, e-mail, games, and other services. The output ${\textbf Y}$, which is extracted from the historical traffic load to capture spatial dependencies, serves as the input of the GRU network.

$${\textbf Y} = \sum\limits_{\textrm{k} = 0}^{\textrm{K}} \tilde{{\textbf A}}_{\textrm{adpt}}^{\textrm{k}}{\textbf X} \tag{2}$$
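The aggregation in Eq. (2) can be sketched as a sum of matrix products, one per adaptive adjacency matrix. In the sketch below the matrices are fixed numbers, whereas in the model their entries would be trainable parameters updated by backpropagation; the function names and the toy values are illustrative assumptions, and the sum simply runs over whatever list of matrices is supplied.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def identity(n):
    """Identity matrix: each node initially depends only on itself."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def agcn_aggregate(adaptive_adjs, x):
    """Y = sum over k of A_adpt^k X, with each A_adpt^k of size N x N and X of size N x K."""
    n, k = len(x), len(x[0])
    y = [[0.0] * k for _ in range(n)]
    for a_k in adaptive_adjs:
        prod = matmul(a_k, x)
        for i in range(n):
            for j in range(k):
                y[i][j] += prod[i][j]
    return y

x = [[1.0], [2.0]]
# Before any learning: two identity matrices simply return 2 * X.
y_init = agcn_aggregate([identity(2), identity(2)], x)
# A hypothetical learned edge from node 1 to node 0 with weight 0.5.
y_learned = agcn_aggregate([[[1.0, 0.5], [0.0, 1.0]]], x)
```

The off-diagonal weight in the second call stands in for a learned dependency between two cell sites that are not neighbours in the distance-based graph.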

The entire framework is trained by maximizing the likelihood of generating the target future time series using backpropagation through time. In multi-step ahead forecasting, we employ the sequence-to-sequence architecture. Both the encoder and decoder are RNNs with GRU cells. We feed the historical time series into the encoder and use its final states to initialize the decoder. The decoder generates predictions based on the predictions generated in the previous time step.
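The encoder-decoder forecasting loop described above can be sketched independently of the GRU internals. Here `encoder_step` and `decoder_step` are hypothetical stand-ins for GRU cells, and feeding the last observation as the first decoder input is an assumption of this sketch (a fixed start symbol is another common choice).

```python
def seq2seq_forecast(history, horizon, encoder_step, decoder_step, init_state):
    """Multi-step forecast with an encoder-decoder pair.

    encoder_step(x, state) -> new state
    decoder_step(prev_output, state) -> (prediction, new state)
    """
    state = init_state
    for x in history:  # encode the historical time series
        state = encoder_step(x, state)
    predictions = []
    prev = history[-1]  # assumption: the first decoder input is the last observation
    for _ in range(horizon):
        # Each prediction is fed back as the next decoder input.
        prev, state = decoder_step(prev, state)
        predictions.append(prev)
    return predictions

# Toy stand-ins for GRU cells: simple averaging updates.
enc = lambda x, s: 0.5 * s + 0.5 * x
dec = lambda prev, s: (0.5 * prev + 0.5 * s, s)
preds = seq2seq_forecast([0.0, 4.0], 3, enc, dec, 0.0)
```

Because each prediction becomes the next input, errors can accumulate as the horizon grows, which is one reason multi-step accuracy is usually reported per horizon.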

Using the aforementioned method, we train the AGCN-GRU to perform a 30-minute prediction of the traffic load and the number of access users, given the previous six hours of historical data. Based on these prediction results, we propose an intelligent ITG scheme. We leverage the k-means algorithm to map each cluster of cell sites to a BBU in advance, thereby achieving cell site clustering based on the prediction results. We then achieve intent-based traffic grooming by allocating network resources according to the monitored real-time bandwidth and the QoS requirements of services. The ITG algorithm is given as Algorithm 1, and the abbreviations are listed in Table 1.

At every time Tc, we first perform the intelligent wavelength resource allocation and association step in the network orchestrator module. We perform constrained k-means on the cell sites to identify N clusters (C = {C₁, C₂, …, C_N}) according to the predicted values of T and NAU of each cell site. We calculate the load factor of each cluster, determine C_max and C_min, and calculate the optimal balancing values T* and NAU*. We find the RRH(target) closest to T*/2 and NAU*/2 in the cluster C_max, move the RRH(target) to the cluster C_min, and update T*/2, NAU*/2, C_max and C_min, until RRH(min) in the cluster C_max is greater than T/2 or NAU/2, or C_min is greater than C_max. We repeat these steps until the determined clusters no longer change. Then, we allocate wavelength resources to the clusters, with one wavelength assigned to each cluster.

Next, we start the traffic grooming period, which is performed in the SDN controller. According to the monitored real-time bandwidth of each wavelength and the QoS requirements of services, we determine whether the traffic of each RRH should be groomed. We classify the services in the fronthaul optical network into two QoS classes by priority. High-priority services, such as P2P, VoIP, and instant messaging, have stringent delay requirements and usually do not require high data rates; for these services we focus on the ultra-low latency requirements. Low-priority services, such as video, web, and finance, require a large fronthaul bandwidth or a large amount of data exchange and are insensitive to delay. H and L denote the numbers of high-load ONUs dominated by low-priority services and low-load ONUs dominated by high-priority services, respectively. If the monitored real-time bandwidth of a wavelength is about to exceed the wavelength capacity provided by the optical fronthaul network, we calculate the allocated bandwidth for each RRH on that wavelength and groom the traffic. Otherwise, we do not groom the traffic. In this way we achieve intent-based traffic grooming that prioritizes the bandwidth requirements of high-priority services and thus ensures the QoS of services.
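The two stages described above, cluster rebalancing and priority-aware grooming, can be sketched in simplified form. This is not Algorithm 1 itself: the rebalancing below uses only the predicted traffic load (not NAU), omits the constrained k-means step, and interprets the target as half of the load gap between the heaviest and lightest clusters; the grooming step assumes that low-priority services share the remaining capacity in proportion to their demand. All names are illustrative.

```python
def rebalance(clusters, load):
    """Move RRHs from the heaviest to the lightest cluster while this narrows the gap.

    clusters: {cluster_id: set of RRH ids}; load: {rrh_id: predicted traffic load}.
    """
    clusters = {c: set(members) for c, members in clusters.items()}
    while True:
        totals = {c: sum(load[r] for r in m) for c, m in clusters.items()}
        c_max = max(totals, key=totals.get)
        c_min = min(totals, key=totals.get)
        gap = totals[c_max] - totals[c_min]
        # Only a move with 0 < load < gap brings both cluster loads closer together.
        candidates = [r for r in sorted(clusters[c_max]) if 0 < load[r] < gap]
        if not candidates:
            return clusters
        # Assumption: the best target is the RRH whose load is closest to gap / 2.
        target = min(candidates, key=lambda r: abs(load[r] - gap / 2))
        clusters[c_max].remove(target)
        clusters[c_min].add(target)

def groom_wavelength(capacity, demands):
    """Allocate bandwidth on one wavelength; demands: {rrh_id: (high_bw, low_bw)}."""
    total_high = sum(h for h, _ in demands.values())
    total_low = sum(l for _, l in demands.values())
    if total_high + total_low <= capacity:
        return {r: h + l for r, (h, l) in demands.items()}  # no grooming needed
    # High-priority demand is served first, scaled down only if it alone exceeds capacity.
    high_scale = min(1.0, capacity / total_high) if total_high > 0 else 0.0
    remaining = max(0.0, capacity - total_high)
    # Assumption: low-priority demand shares what is left proportionally.
    low_scale = remaining / total_low if total_low > 0 else 0.0
    return {r: h * high_scale + l * low_scale for r, (h, l) in demands.items()}

balanced = rebalance({"c1": {"a", "b", "c"}, "c2": {"d"}}, {"a": 6, "b": 3, "c": 1, "d": 2})
alloc = groom_wavelength(10.0, {"rrh1": (2.0, 6.0), "rrh2": (2.0, 6.0)})
```

In the example, the clusters end with equal loads of 6 each, and on the congested wavelength each RRH keeps its full high-priority demand while its low-priority demand is halved.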

Algorithm 1. ITG algorithm

Table 1. List of abbreviations

4. Experimental results

4.1 Experimental environment

Most approaches pertaining to network planning and resource management are demonstrated by simulations, and the corresponding experimental verification is infrequent, especially for traffic prediction-assisted intent-based intelligent resource scheduling schemes. To demonstrate the performance of the proposed scheme, we built an experimental testbed with an RYU controller. The RYU controller, which is written entirely in Python, is convenient and efficient for developing and implementing the proposed schemes in optical networks [30–33]. Furthermore, it is highly compatible with OpenStack, an open-source cloud computing management platform that is responsible for managing cloud and fog computing resources. Developing on the RYU controller therefore also supports further study of the joint optimization of cloud and fog computing resources and optical network resources. We established an SD-FNet testbed with the RYU controller based on servers and deployed the proposed schemes on it, as shown in Fig. 5(a). The orchestrator and controller were implemented on a Linux server (with four processors, 8 GB memory, and an independent network adapter) using RYU version 3.20.29. The wavelength selective switch (WSS) connected the RRUs and the vBBUs, which were deployed on a separate Linux server. In addition, each virtual machine had its own operating system (i.e., Ubuntu) and virtual hardware resources, and each machine was considered a real node. The specific network connection topology is shown in Fig. 5(b). The interfaces were connected with a virtual Ethernet switch supporting OpenFlow 1.3, which was implemented using OpenvSwitch (OvS) 2.48. OvS1 and OvS2 were the switches connected to the RRUs (i.e., RRU1 and RRU2, respectively). OvS3 and OvS4 were the switches connected to the BBUs (i.e., BBU1 and BBU2, respectively). The RYU controller communicated with the OvS and the WSS (extended with OvS) using the OpenFlow protocol.

Fig. 5. (a) Software-defined fronthaul network (SD-FNet) testbed; (b) Specific network connection topology; (c) Wireshark capture of OpenFlow message showing how SDN transmits flows with AGCN-GRU-ITG. vBBU: virtual baseband unit; RRH: remote radio head; OvS: OpenvSwitch; UE: user equipment.

To visually validate the AGCN-GRU-ITG scheme, we performed an experiment on the testbed. Initially, without the AGCN-GRU-ITG scheme, the users connected to RRU1 and RRU2 simultaneously transmitted data flows, generated with the OP WILL OTP-6200 traffic analyser, to vBBU1 over wavelength 1 (λ₁), as shown by the red line in Fig. 5(b). When we executed the AGCN-GRU-ITG scheme, RRU1 and RRU2 transmitted the data flows to vBBU1 and vBBU2 over wavelength 1 (λ₁) and wavelength 2 (λ₂), respectively (green line in Fig. 5(b)). According to the results of the scheme, the SDN controller informed UE2 and modified the flow entries and the carrier wavelength using an OFPT_FLOW_MOD message, which indicated the successful configuration of the flow entry. The corresponding OpenFlow protocol messages were captured, as shown in Fig. 5(c). We varied the transferred data by adjusting the service generation interval of the OP WILL OTP-6200 traffic analyser on the SD-FNet testbed.

We define the load balancing coefficient (LBC) as the reciprocal of the sum of two parameters indicative of the imbalance in the traffic load and in the number of access users:

$$LBC = \frac{1}{std(T) + std(NAU)} \tag{3}$$

$$std(T) = \frac{\sum\nolimits_{i = 1}^N \left| x(T)_i - \frac{1}{N}\sum\nolimits_{i = 1}^N x(T)_i \right|}{\sum\nolimits_{i = 1}^N x(T)_i} \tag{4}$$

$$std(NAU) = \frac{\sum\nolimits_{i = 1}^N \left| x(NAU)_i - \frac{1}{N}\sum\nolimits_{i = 1}^N x(NAU)_i \right|}{\sum\nolimits_{i = 1}^N x(NAU)_i} \tag{5}$$

Here, std(T) is the sum over all vBBUs of the absolute deviation of each vBBU's traffic load from the mean, divided by the total traffic load, and std(NAU) is the sum over all vBBUs of the absolute deviation of each vBBU's number of access users from the mean, divided by the total number of access users. A higher value of $LBC$ represents a better balance between vBBUs.
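Equations (3) to (5) translate directly into a few lines of code. The sketch below is a minimal illustration with made-up loads for two vBBUs; returning infinity for a perfectly balanced network, where the denominator of Eq. (3) is zero, is a choice made here and is not specified in the text.

```python
def normalized_deviation(values):
    """Sum of absolute deviations from the mean, divided by the total (Eqs. 4 and 5)."""
    total = sum(values)
    avg = total / len(values)
    return sum(abs(v - avg) for v in values) / total

def load_balancing_coefficient(traffic, users):
    """LBC = 1 / (std(T) + std(NAU)) as in Eq. (3)."""
    denom = normalized_deviation(traffic) + normalized_deviation(users)
    # Assumption: a perfectly balanced network is reported as infinity.
    return float("inf") if denom == 0 else 1.0 / denom

# Hypothetical per-vBBU traffic loads and numbers of access users.
lbc = load_balancing_coefficient([30.0, 10.0], [60.0, 40.0])
```

For these values std(T) = 20/40 = 0.5 and std(NAU) = 20/100 = 0.2, so the LBC is 1/0.7, or about 1.43.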

4.2 Experimental verification of the AGCN-GRU-ITG scheme considering existing network traffic data

To demonstrate the effectiveness of our proposed scheme on the SD-FNet, we acquired cellular traffic datasets from existing optical networks. The dataset records the mobile traffic of 5 base stations in Guangzhou city at one-minute intervals from January 5th, 2018 to February 2nd, 2018. It consists of 5 time series, each of which contains the mobile phone traffic of different categories of applications, such as video, social networks, and games. Details of the dataset are given in Table 2. The collected data were processed and cleaned, converted into vector form, and divided into two sets: 75% (from the 1st day to the 21st day) for training and 25% (from the 22nd day to the 28th day) for testing. We compared the proposed AGCN-GRU model with the following statistical and deep neural network models. (1) ARIMA: the autoregressive integrated moving average model is a linear model commonly used to predict future points in a series. (2) LSTM: a special kind of RNN designed to address the gradient problems of standard RNNs by introducing gates (input, forget, and output) that allow better control over the gradient flow and better preservation of long-range dependencies [34]. (3) GRU: a variation of the LSTM that has exhibited better performance on smaller datasets. (4) GCN-GRU: a two-layer model that integrates a fast graph convolution layer and a GRU layer with 64 hidden units, also referred to as the Temporal-GCN [28]. The initial adjacency matrix of the nodes was constructed from the distances between cell sites. All the models considered in this paper were built and implemented on the TensorFlow framework. For all deep neural network models, the Adam optimizer was used; our model was trained with an initial learning rate of 0.001 and a decay parameter of 0.95.

Table 2. Dataset descriptions and all features

The evaluation metrics include the mean absolute percentage error (MAPE) and accuracy. A predicted value was considered accurate when it was between 95% and 105% of the actual value, because an error of this size does not affect the results of executing the ITG scheme. Table 3 compares the performance of AGCN-GRU and the baseline models for predictions 30 min and 1 h ahead on the mobile traffic dataset. We averaged the results of twenty runs and found that the RNN-based methods, including the GRU and LSTM models, generally showed better prediction accuracy than other baselines, such as the ARIMA model. Compared with the purely temporal prediction models (GRU and LSTM) and with the non-adaptive GCN approach that uses spatial information, AGCN-GRU obtained the best performance in all evaluation metrics. The results indicate that our method outperforms the others in predicting traffic and applies to traffic volumes spanning a wide dynamic range. They also show that when prior graph knowledge is not provided and the AGCN adjacency matrices are initialized as identity matrices, the matrices, and hence the adaptive graph topology for each feature, are learned from the data. Additionally, we make the following observations from the results. (1) For all prediction methods, MAPE increased with the forecasting horizon (from 30 min to 1 h). This is because the temporal dynamics become increasingly non-linear as the horizon grows. (2) GCN-GRU performed similarly to GRU. A possible explanation is that the GRU plays the dominant role in the spatial-temporal prediction model and that the explicit geographical graph structure does not reflect the true dependency. (3) The AGCN-GRU method was robust across all forecasting horizons and outperformed the baselines. Owing to its self-adaptive graph structure and smaller number of parameters, our model balances time consumption and parameter settings well.

Table 3. Performance comparison of AGCN-GRU and baseline models
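The two evaluation metrics, MAPE and the accuracy criterion of being within 95% to 105% of the actual value, can be computed as in the following sketch. The sample values are made up for illustration, and the code assumes strictly positive actual values.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def accuracy_within(actual, predicted, tol=0.05):
    """Fraction of predictions lying between (1 - tol) and (1 + tol) times the actual value."""
    hits = sum(1 for a, p in zip(actual, predicted)
               if (1 - tol) * a <= p <= (1 + tol) * a)
    return hits / len(actual)

actual = [100.0, 200.0, 400.0]
predicted = [104.0, 180.0, 400.0]
error = mape(actual, predicted)               # (4% + 10% + 0%) / 3
accuracy = accuracy_within(actual, predicted)  # 2 of 3 predictions within 5%
```

The second sample is off by 10% and therefore counts as inaccurate, although it contributes only moderately to the MAPE.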

Figure 6 shows the observed and predicted traffic load data of the five cell sites from the 16th day to the 22nd day obtained with the AGCN-GRU model. The corresponding numbers of access users of the five cell sites are shown in Fig. 7. Table 4 presents the overall evaluation results of the proposed AGCN-GRU model. The experimental results show that the proposed AGCN-GRU model accurately predicted the traffic load and the number of access users based on the temporal and spatial dependencies learned from the training set, and that the prediction results for the testing dataset (the data of the 22nd day) can be used in the ITG scheme to optimize network resources in advance. We then used the data of the 22nd day to validate the proposed scheme.

Fig. 6. Real, training and testing traffic loads of five cell sites.

Fig. 7. Real, training and testing number of access users of five cell sites

Table 4. Evaluation results

We used the Iperf tool (a network performance test tool) to generate traffic [35–36]. The Iperf tool can test bandwidth quality, delay jitter, and packet loss rate [37]. We used the tool to generate TCP traffic for each of the 13 application types separately, to emulate the applications listed above. We took the RRS scheme as a benchmark. With the RRS scheme, the resources of the two wavelengths are assigned to the cell sites in turn according to the real-time traffic requirements of the five cell sites, and the traffic of the five cell sites is transmitted to vBBU1 and vBBU2 over the assigned wavelengths. With AGCN-GRU-ITG, the five cell sites were assigned wavelengths dynamically and in advance by means of the ITG scheme according to the prediction results, and the traffic of the five cell sites converged into two data flows transmitted to vBBU1 and vBBU2 over the assigned wavelengths. The traffic load of each wavelength on the 22nd day is shown for both cases in Fig. 8(a), and the number of access users of each wavelength on the 22nd day is shown in Fig. 8(b). With AGCN-GRU-ITG, the traffic load and the number of access users are more balanced, because traffic is groomed in advance.
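The round-robin benchmark can be sketched in a few lines. The request list, the wavelength names, and the function name are hypothetical; the sketch only illustrates that assigning wavelengths in turn ignores how much traffic each request carries.

```python
from itertools import cycle

def round_robin_loads(requests, wavelengths):
    """Assign each (site, bandwidth) request to the next wavelength in turn.

    Returns the total bandwidth placed on each wavelength.
    """
    loads = dict.fromkeys(wavelengths, 0.0)
    for (_site, bandwidth), wavelength in zip(requests, cycle(wavelengths)):
        loads[wavelength] += bandwidth
    return loads

# Hypothetical real-time requests from five cell sites, in arrival order.
requests = [("s1", 8.0), ("s2", 1.0), ("s3", 8.0), ("s4", 1.0), ("s5", 8.0)]
loads = round_robin_loads(requests, ["w1", "w2"])
```

With this arrival order, the heavy requests all land on the first wavelength, which is the kind of imbalance that prediction-based clustering aims to avoid.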

Fig. 8. (a) Traffic load of vBBUs with RRS and with AGCN-GRU-ITG; (b) Number of access users of vBBUs with RRS and with AGCN-GRU-ITG.

The LBC at each instant is shown in Fig. 9(a). The LBC with AGCN-GRU-ITG was significantly higher than that of the benchmark. The average delay at each instant is shown in Fig. 9(b); compared with the benchmark, the average delays with AGCN-GRU-ITG were significantly lower, especially between 7:30 and 24:00. The rejection ratio is shown in Fig. 9(c). Compared with the benchmark, the rejection ratio with AGCN-GRU-ITG was reduced by 46.45% on average between 7:30 and 24:00. The bandwidth utilisation ratios of the proposed scheme and the benchmark are shown in Fig. 9(d). Compared with the benchmark, the AGCN-GRU-ITG scheme increased the bandwidth utilisation ratio by over 8.52% on average.
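
Averaged improvements of this kind can be computed in more than one way, and the two common choices generally give different numbers. The sketch below shows both using hypothetical sample values; which of them underlies the reported percentages is not stated in this section, and the function names are assumptions.

```python
def mean_of_relative_reductions(benchmark, proposed):
    """Average of the per-instant reductions (b - p) / b, in percent."""
    pairs = [(b, p) for b, p in zip(benchmark, proposed) if b != 0]
    if not pairs:
        raise ValueError("no non-zero benchmark samples")
    return 100.0 * sum((b - p) / b for b, p in pairs) / len(pairs)


def reduction_of_means(benchmark, proposed):
    """Reduction of the time-averaged value, in percent."""
    mean_b = sum(benchmark) / len(benchmark)
    mean_p = sum(proposed) / len(proposed)
    return 100.0 * (mean_b - mean_p) / mean_b


# Hypothetical per-instant delays (ms) for the benchmark and the proposed scheme.
rrs_delay = [2.0, 4.0]
itg_delay = [1.0, 3.0]
print(mean_of_relative_reductions(rrs_delay, itg_delay))
print(reduction_of_means(rrs_delay, itg_delay))
```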

Fig. 9. Network performance with AGCN-GRU-ITG and RRS: (a) load balancing coefficient (LBC) with time; (b) average delay performance with time; (c) rejection ratio performance with time; (d) bandwidth utilization ratio performance with time.

The main reason for these findings is that, during the period considered, the traffic load and the number of access users were usually distributed very unevenly. With the RRS scheme, some subscribers still send their traffic over a heavily loaded optical fronthaul wavelength, and that traffic is delivered to the vBBU with the heavier load. AGCN-GRU-ITG can globally optimise the network resources in advance according to the network intent and relieve the bandwidth burden on optical fronthaul wavelengths that are about to become heavily loaded. The RRHs were grouped and clustered to balance the network load, so that all the optical fronthaul wavelengths share the bandwidth burden of the optical fronthaul network. The proposed scheme thus avoids traffic congestion, lowers the queuing and transmission delays, decreases the rejection ratio, and improves the bandwidth utilisation ratio.

4.3 Performance evaluation of the AGCN-GRU-ITG scheme on the SD-FNet testbed

The above experiment verifies the performance of the AGCN-GRU-ITG scheme using existing network traffic data. However, although the traffic data of the existing network are dynamic and vary over time, they do not contain all possible future traffic loads. Thus, the performance of the AGCN-GRU-ITG scheme also had to be evaluated over the range of possible traffic loads in terms of the average delay, cumulative distribution function (CDF) of delay, rejection ratio, and bandwidth utilisation ratio of each wavelength. We assumed that the two data flows were transmitted simultaneously to vBBU1 and vBBU2 and that the total data rate of the two data flows ranged from 200 to 2000 Mbps at 200 Mbps intervals. The RRS scheme served as the benchmark, with the two data flows given different load ratios (9:1, 8:2, 7:3, and 6:4) as inputs. The LBCs of these cases were 1/8, 1/6, 1/4, and 1/2 for the load ratios of 9:1, 8:2, 7:3, and 6:4, respectively.
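
The test matrix described above can be enumerated as follows. This is a sketch: the function and field names are hypothetical, and the LBC values are copied from the text rather than computed.

```python
from fractions import Fraction

# Load ratios of the two data flows and the LBC values quoted in the text.
LOAD_RATIOS = [(9, 1), (8, 2), (7, 3), (6, 4)]
LBC_BY_RATIO = {
    (9, 1): Fraction(1, 8),
    (8, 2): Fraction(1, 6),
    (7, 3): Fraction(1, 4),
    (6, 4): Fraction(1, 2),
}


def sweep_points(start=200, stop=2000, step=200):
    """List every (total rate, load ratio) combination of the RRS benchmark."""
    points = []
    for total in range(start, stop + 1, step):
        for hi, lo in LOAD_RATIOS:
            points.append({
                "total_mbps": total,
                "flow1_mbps": total * hi / (hi + lo),  # flow towards vBBU1
                "flow2_mbps": total * lo / (hi + lo),  # flow towards vBBU2
                "lbc": LBC_BY_RATIO[(hi, lo)],
            })
    return points


points = sweep_points()
print(len(points), points[0])
```

The sweep yields ten total rates times four load ratios, that is, 40 benchmark operating points.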

As shown in Fig. 10(a), all curves exhibited a similar trend; however, the average delay with AGCN-GRU-ITG was only 0.705 ms, which is considerably lower than the delay with the RRS scheme and less than half of the delay in the RRS case with $\textrm{LBC} = 1/8$. Specifically, with the RRS scheme, the delay increased as the LBC decreased, because some traffic was carried by the heavily loaded optical fronthaul wavelength and the uneven load distribution caused network congestion. Figure 10(b) presents the CDF of the average delay. With AGCN-GRU-ITG, all packets were delivered within 2.4 ms. In contrast, the RRS cases exhibited long-tailed delay distributions.
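
An empirical CDF such as the one in Fig. 10(b) can be built from delay samples as sketched below. The delay values are hypothetical and only illustrate the computation; the helper names are assumptions.

```python
def empirical_cdf(delays_ms):
    """Return sorted (delay, cumulative fraction) pairs.

    For repeated delay values, the last pair with that delay holds the CDF value.
    """
    xs = sorted(delays_ms)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


def fraction_within(delays_ms, bound_ms):
    """Fraction of samples delivered within bound_ms."""
    return sum(d <= bound_ms for d in delays_ms) / len(delays_ms)


# Hypothetical delay samples in milliseconds, not measurements from the testbed.
delays = [0.5, 0.7, 0.6, 2.4, 1.1]
print(empirical_cdf(delays))
print(fraction_within(delays, 2.4))
```

A long-tailed distribution shows up in such a curve as a CDF that approaches 1 only slowly at large delays, whereas a bounded distribution reaches 1 at a finite delay.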

Fig. 10. Delay performance: (a) average delay vs. traffic load; (b) CDF of the average delay.

The service rejection ratio was also investigated, as shown in Fig. 11. The rejection ratio with AGCN-GRU-ITG was always within 1.515% and considerably lower than in the RRS cases; it was lower by a factor of five than in the RRS case with $\textrm{LBC} = 1/8$. The rejection ratio with the RRS scheme increased with decreasing LBC because of the uneven load distribution. The proposed scheme can therefore globally optimise the network resources in advance based on the network intent and relieve the bandwidth burden on the optical fronthaul network before it becomes heavily loaded, thereby decreasing the rejection ratio.

Fig. 11. Rejection ratio performance for different load balancing coefficients.

Finally, we investigated the bandwidth utilisation ratio of the wavelengths; the results are presented in Fig. 12. At low loads, the network resources were adequate; therefore, when the network load was smaller than 0.4, the bandwidth utilisation ratio was similar for all conditions. At high loads, the bandwidth utilisation ratio with AGCN-GRU-ITG was relatively high, because the proposed scheme performs cell site clustering for load balancing and traffic grooming according to the capacity perceived in real time, which improves network resource utilisation and avoids traffic congestion.

Fig. 12. Bandwidth utilization ratio performance for different load balancing coefficients.

5. Conclusion

The proposed scheme provides a foundation for intent-based traffic grooming based on traffic prediction in 5G optical fronthaul networks. Compared with the benchmark, AGCN-GRU-ITG reduced the average delay by 45.71% and the rejection ratio by 17.64% on average, and increased the bandwidth utilization ratio by over 8.52% on average. Future work will focus on evaluating the model in large-scale network scenarios, developing resource scheduling in more detail, and jointly optimizing cloud and fog computing resources and optical network resources, which will be configured on the network resource management platform.

Funding

This work was supported in part by the National Natural Science Foundation of China (61975020), by the Fund of State Key Laboratory of IPOC (BUPT) (No. IPOC2020ZT05), and by the Key Laboratory Fund (6142104190207).

Disclosures

The authors declare that there are no conflicts of interest related to this article.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. L. Gong and Z. Zhu, “Virtual Optical Network Embedding (VONE) over Elastic Optical Networks,” J. Lightwave Technol. 32(3), 450–460 (2014). [CrossRef]

2. D. Wang, Z. Zhang, M. Zhang, M. Fu, J. Li, S. Cai, C. Zhang, and X. Chen, “The Role of Digital Twin in Optical Communication: Fault Management, Hardware Configuration, and Transmission Simulation,” IEEE Commun. Mag. 59(1), 133–139 (2021). [CrossRef]

3. L. Gong, X. Zhou, X. Liu, W. Zhao, W. Lu, and Z. Zhu, “Efficient Resource Allocation for All-Optical Multicasting over Spectrum-Sliced Elastic Optical Networks,” J. Opt. Commun. Netw. 5(8), 836–847 (2013). [CrossRef]

4. Z. Zhu, W. Lu, L. Zhang, and N. Ansari, “Dynamic Service Provisioning in Elastic Optical Networks with Hybrid Single-/Multi-Path Routing,” J. Lightwave Technol. 31(1), 15–22 (2013). [CrossRef]

5. Y. Yin, H. Zhang, M. Zhang, M. Xia, Z. Zhu, S. Dahfort, and S. J. B. Yoo, “Spectral and Spatial 2D Fragmentation-Aware Routing and Spectrum Assignment Algorithms in Elastic Optical Networks,” J. Opt. Commun. Netw. 5(10), A100–A106 (2013). [CrossRef]

6. X. Wang, Y. Han, V. C. M. Leung, D. Niyato, X. Yan, and X. Chen, “Convergence of edge computing and deep learning: A comprehensive survey,” IEEE Commun. Surv. Tutorials 22(2), 869–904 (2020). [CrossRef]

7. M. Chen, Y. Miao, H. Gharavi, L. Hu, and I. Humar, “Intelligent Traffic Adaptive Resource Allocation for Edge Computing-Based 5G Networks,” IEEE Trans. Cogn. Commun. Netw. 6(2), 499–508 (2020). [CrossRef]

8. L. Chen, D. Yang, D. Zhang, C. Wang, and J. Li, “Deep mobile traffic forecast and complementary base station clustering for C-RAN optimization,” Journal of Network and Computer Applications 121(1), 59–69 (2018). [CrossRef]

9. M. F. Iqbal, M. Zahid, D. Habib, and L. K. John, “Efficient prediction of network traffic for real-time applications,” Journal of Computer Networks and Communications, Feb. 2019.

10. K. He, Y. Huang, X. Chen, Z. Zhou, and S. Yu, “Graph attention spatial-temporal network for deep learning based mobile traffic prediction,” 2019 IEEE Global Communications Conference (GLOBECOM), pp. 1–6, 2019.

11. C. Zhang and P. Patras, “Long-term mobile traffic forecasting using deep spatio-temporal neural networks,” in Proceedings of the Nineteenth ACM International Symposium on Mobile Ad Hoc Networking and Computing, MobiHoc, 2018, pp. 231–240.

12. S. Xiao and W. Chen, “Dynamic allocation of 5G transport network slice bandwidth based on LSTM traffic prediction,” in Proc. ICSESS, 2018.

13. J. Feng, X. Chen, R. Gao, M. Zeng, and Y. Li, “Deeptp: An end-to-end neural network for mobile cellular traffic prediction,” IEEE Network 32(6), 108–115 (2018). [CrossRef]

14. C. Song, M. Zhang, X. Huang, Y. Zhan, D. Wang, M. Liu, and Y. Rong, “Machine learning enabling traffic-aware dynamic slicing for 5G optical transport networks,” in Proc. CLEO, JTu2A.44,2019.

15. I. Alawe, A. Ksentini, Y. Hadjadj-Aoul, and P. Bertin, “Improving traffic forecasting for 5G core network scalability: A machine learning approach,” IEEE Network 32(6), 42–49 (2018). [CrossRef]

16. H. Lu, M. Zhang, M. Wang, C. Song, D. Wang, and L. Guan, “Big-data-driven dynamic clustering and load balancing of virtual base stations for 5G fronthaul network,” In Proc. 24th OptoElectronics and Communications Conference (OECC) and 2019 International Conference on Photonics in Switching and Computing (PSC), Fukuoka, Japan, Paper ThE3-4,2019.

17. A. Azzouni and G. Pujolle, “NeuTM: A neural network-based framework for traffic matrix prediction in SDN,” IEEE/IFIP Network Operations and Management Symposium. 2018.

18. W. C. Chien, C. F. Lai, and H. C. Chao, “Dynamic resource prediction and allocation in C-RAN with edge artificial intelligence,” IEEE Trans. Ind. Inf 15(7), 4306–4314 (2019). [CrossRef]

19. F. Sun, P. Wang, J. Zhao, N. Xu, J. Zeng, J. Tao, K. Song, C. Deng, J. C. Lui, and X. Guan, “Mobile data traffic prediction by exploiting time-evolving user mobility patterns,” IEEE Transactions on Mobile Computing.

20. K. He, X. Chen, Q. Wu, S. Yu, and Z. Zhou, “Graph attention spatial-temporal network with collaborative global-local learning for citywide mobile traffic prediction,” IEEE Transactions on Mobile Computing.

21. F. Xu, Y. Li, M. Chen, and S. Chen, “Mobile Cellular Big Data: Linking Cyberspace and the Physical World with Social Ecology,” IEEE Network 30(3), 6–12 (2016). [CrossRef]

22. R. Li, S. Wang, F. Zhu, and J. Huang, “Adaptive graph convolutional neural networks,” in 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, pp. 3546–3553, 2018.

23. T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings, pp. 1–14, 2019.

24. C. A. Chan, M. Yan, A. F. Gygax, W. Li, L. Li, I. Chih-Lin, J. Yan, and C. Leckie, “Big data driven predictive caching at the wireless edge,” In Proc. 2019 IEEE International Conference on Communications Workshops (ICC Workshops), Shanghai, pp. 1–6, 2019.

25. Y. Wang, Y. Zhao, W. Wang, X. Yu, and J. Zhang, “Dynamic tidal traffic grooming in software defined metropolitan networks,” In Proc. 2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS), pp. 735–739, 2018.

26. W. Mo, C. L. Gutterman, Y. Li, G. Zussman, and D. C. Kilper, “Deep neural network based dynamic resource reallocation of BBU pools in 5G C-RAN ROADM networks,” Optical Fiber Communication Conference. Optical Society of America, Th1B. 4, 2018.

27. Y. Li, R. Yu, C. Shahabi, and Y. Liu, “Diffusion convolutional recurrent neural network: Data-driven traffic forecasting,” 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, pp. 1–16, 2018.

28. L. Zhao, Y. Song, C. Zhang, Y. Liu, P. Wang, T. Lin, M. Deng, and H. Li, “T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction,” IEEE Trans. Intell. Transport. Syst. 14(8), 1–11 (2019). [CrossRef]

29. B. Yu, H. Yin, and Z. Zhu, “Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting,” International Joint Conference on Artificial Intelligence, vol. 2018-July, pp. 3634, 2018.

30. Q. Guo, R. Gu, M. Cen, X. Kang, T. Zhao, L. Bai, and Y. Ji, “Multi-tenant Hybrid Slicing with Cross-layer Heterogeneous Resource Coordination in 5G Transport Network,” Optical Fiber Communication Conference. Optical Society of America, M2A. 1, 2018.

31. P. Marques, A. P. do Carmo, V. Frascolla, C. Silva, E. D. Sena, R. Braga, and L. F. Bittencourt, “Optical and wireless network convergence in 5G systems–an experimental approach,” 2018 IEEE 23rd International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), pp. 1–5, 2018.

32. R. Gu, Y. Qu, M. Lian, H. Li, Z. Wang, Y. Zhu, and Y. Ji, “Flexible Optical Network Enabled Proactive Cross-Layer Restructuring for 5G/B5G Backhaul Network with Machine Learning Engine,” 2020 Optical Fiber Communications Conference and Exhibition (OFC). pp. 1–3, 2020.

33. M. Lian, R. Gu, Y. Qu, Z. Wang, and Y. Ji, “Flexible Optical Network Enabled Hybrid Recovery for Edge Network with Reinforcement Learning,” Optical Fiber Communication Conference. Optical Society of America, M1A. 2, 2020.

34. J. Wang, J. Tang, Z. Xu, Y. Wang, G. Xue, X. Zhang, and D. Yang, “Spatiotemporal modeling and prediction in cellular networks: A big data enabled deep learning approach,” Proceedings - IEEE INFOCOM, pp. 1323–1331, 2017.

35. A. Tirumala, L. Cottrell, and T. Dunigan, “Measuring end-to-end bandwidth with Iperf using Web100,” In Web100, Proc. of Passive and Active Measurement Workshop. 2003.

36. C. H. Hsu and U. Kremer, “IPERF: A framework for automatic construction of performance prediction models,” Workshop on Profile and Feedback-Directed Compilation (PFDC), Paris, France. 1998.

37. Y. Jiang, Z. Mi, H. Wang, X. Wang, and N. Zhao, “The experiment and performance analysis of multi-node UAV ad hoc network based on swarm tactics,” 2018 10th International Conference on Wireless Communications and Signal Processing (WCSP). IEEE, 2018: 1–6.

Abbreviation	Explanation
C	Number of clusters
C_max	The cluster with the maximum load
C_min	The cluster with the minimum load
LF	Load factor of each cluster
T	Traffic load
NAU	Number of access users
RRH(target)	The RRH whose load is closest to T/2 and NAU/2
RRH(min)	The RRH with the minimum load
N	Number of total ONUs
H	Number of high load ONUs
L	Number of low load ONUs
B_req	Requested wavelength bandwidth
B_assign	Allocated wavelength bandwidth
B_surplus	Surplus wavelength bandwidth
B_ins	Insufficient wavelength bandwidth

Dataset	#Nodes	#Features	Range of values	Mean value
Guangzhou traffic	5	15	0∼9.76E+8	3.91E+08
Application categories	The total number of access users, total traffic, traffic of P2P, VoIP, finance, navigation, stocks, instant messaging, social networks, video, web, file transfer, E-mail, game, and others.

Model	30-min traffic load MAPE	30-min traffic load accuracy	1-hour traffic load MAPE	1-hour traffic load accuracy	30-min number of access users MAPE	30-min number of access users accuracy	1-hour number of access users MAPE	1-hour number of access users accuracy
ARIMA	26.2%	78.9%	30.4%	72.3%	19.1%	78.9%	22.4%	78.9%
LSTM	13.2%	89.3%	16.8%	81.9%	11.8%	89.3%	16.8%	89.3%
GRU	13.5%	90.1%	17.2%	81.7%	10.3%	90.1%	16.6%	90.1%
GCN-GRU	12.6%	90%	16.3%	82.1%	8.9%	90%	14.2%	90%
AGCN-GRU	10.7%	93.2%	13.1%	86.2%	5.2%	96.2%	9.8%	92.7%

Cell site	a	b	c	d	e
Prediction MAPE of traffic load (%)	8.55	8.75	10.61	12.68	12.92
Prediction accuracy of traffic load (%)	92.98	93.87	94.54	91.62	92.87
Prediction MAPE of number of access users (%)	4.21	6.23	3.78	4.81	6.96
Prediction accuracy of number of access users (%)	96.98	94.72	97.53	96.87	94.73

Optics Express