Channel characteristics estimation based on a secure optical transmission system with deep neural networks

Kun Wu; Hongxiang Wang; Yuefeng Ji

doi:10.1364/OE.464257

1. Introduction

Owing to their high speed, large bandwidth and long transmission distance, optical fiber communications are widely applied in personal, commercial and military communications. With the advent of 5G and the concepts of interconnection and intelligence, the capacity of optical networks is growing explosively [1]. At the same time, securing optical networks is becoming increasingly important. Optical fiber, the transmission medium of the optical network, is vulnerable to many security threats [2,3], including jamming, eavesdropping, interception, identity spoofing attacks and infrastructure attacks. Meanwhile, conventional secure schemes face a multitude of threats from the development of quantum computers, which have the capacity to crack ciphers in a short period of time. Hence, ensuring security at the physical layer of the optical network is extremely critical.

Numerous anti-eavesdropping schemes based on optical or electrical layer processing have recently been proposed, such as the all-optical XOR logic gate [4,5], optical steganography [6,7], optical code division multiple access (OCDMA) technology [8,9] and spectral phase encryption with dispersion apparatuses [10,11]. These schemes can effectively enhance the privacy and confidentiality of data signals, but they pose additional challenges to existing optical digital signal processing (DSP) and place higher requirements on terminal equipment. Moreover, the security of OCDMA technology is still an open question [12].

Optical chaos encryption [13,14] and optical quantum key encryption [15,16] are two other encryption methods that can guarantee the security of the optical layer. However, optical chaos encryption usually applies a broadband chaos signal as the optical carrier to hide the transmitted data, so it is challenging to deploy in practice owing to the complexity and bandwidth of the chaotic carrier. Likewise, the combination of optical quantum key encryption and the one-time pad can ensure absolute security of the optical network, but optical quantum key encryption still has shortcomings, such as short transmission distance, low key generation rate (KGR) and expensive infrastructure.

Furthermore, the security provided by the above schemes is additional to the transmission system rather than derived from the channel itself. Considering the continuous improvement of the eavesdropper’s ability, it is necessary to propose a more secure scheme based on the fiber channel characteristics. To this end, we propose a novel secure optical transmission system based on neural networks (NNs), which are applied for channel characteristics estimation. In the proposed system, the legal parties first transmit pre-agreed detection data to each other, and then each trains an NN locally with the transmitted data and the post-processed received data. The legal transmitter then generates a pseudo-key and derives the final key from the pseudo-key and the trained NN. Once the legal transmitter has obtained the final key, the plaintext can be encrypted into ciphertext. After the data encryption, the pseudo-key and ciphertext are transmitted to the legal receiver. Finally, the legal receiver derives the final key from the received pseudo-key and its trained NN, and then decrypts the received ciphertext into the correct plaintext. Although the eavesdropper can also acquire the pseudo-key and ciphertext, the correct plaintext cannot be obtained because the NN models trained by the legal parties and by the illegal eavesdropper are different. Unlike the above additional secure schemes, the proposed system takes advantage of the variable channel characteristics. Hence, the higher-level security requirement can be satisfied.

The remainder of the paper is organized as follows. The related works are introduced in Section 2. Section 3 describes the system model and the principle in detail. The simulation results are put forward and discussed in Section 4. Section 5 concludes the paper.

2. Related works

In recent years, an increasing number of physical layer secure schemes based on optical fiber channel characteristics have been surveyed and proposed [17–23]. In [17], the authors realized secure key distribution between the legal parties by using the random phase fluctuation generated in polarization-maintaining fiber. Likewise, to increase the transmission distance, the authors in [18] proposed a secure scheme that utilized the random mode mixing in multimode fiber to generate the key. However, these works did not take into account that the keys acquired by the legal parties still differ partially. Therefore, a secure key distribution scheme exploiting polarization mode dispersion (PMD) was proposed in [19], in which the inconsistency problem was solved. Similarly, to improve the KGR, the authors in [20] took advantage of digital chaotic devices to scramble the polarization state in optical fiber. Meanwhile, a bidirectional wide-band polarization scrambler (WBPS) was applied to alter rapidly and symmetrically the state of polarization (SOP) of the two counter-propagating optical signals in fiber [21]. Furthermore, a secure key generation and distribution (SKGD) scheme, which exploited the SOPs between the optical line terminal (OLT) and optical network units (ONUs), was proposed in [22]. Nevertheless, the transmission distances of [20], [21] and [22] were 24 km, 10 km and 11 km, respectively. Therefore, these secure schemes [20–22] cannot be employed in medium- and long-distance optical fiber communications. In order to achieve high KGR and long transmission distance simultaneously, the authors in [23] proposed an error-free SKGD scheme over 300 km of standard single-mode fiber (SSMF) and verified its feasibility by experiment. However, the KGR of [23] was 127.12 Mbit/s, which still cannot meet the higher KGR requirement of the future.

On account of the powerful performance of NNs in optical networks [24] and channel modeling [25–28], research on NNs is steadily increasing. It has been proved that an NN with more hidden layers or more neurons in each layer can fit a particularly complicated function very well. In [25], an NN model was applied to model the stochastic channel, accurately reproducing the measured path losses, including shadow fading and small-scale parameters. Meanwhile, the NN model was expected to overcome the drawbacks of statistical modeling methods based on the geometry-based stochastic model (GBSM) in time-varying channels. In [26], deep learning was employed to model uncoded space-time labeling diversity (USTLD) without requiring knowledge of the channel autocorrelation statistics and noise variance. Deep learning was used to estimate the Rayleigh channel in [27]. In comparison with traditional algorithms, this method can track the channel dynamically and is robust to the statistical characteristics of the channel. In [28], a generative adversarial network was used to estimate the SSMF channel and was able to model wavelength and polarization division multiplexing optical channels.

Compared to the above secure schemes, our proposed secure optical transmission system supports the integrated transmission of plaintext together with the key and can modify the final key dynamically. Because PMD is time-varying and random [29,30], the proposed optical transmission system remains secure even when the illegal eavesdropper acquires, outside the detection time, the lengths of the local fibers used by the legal parties. Even if the two sets of training data applied by the legal parties are acquired by the illegal eavesdropper, the proposed optical transmission system is still secure because the structures of the NNs utilized by the legal parties and the eavesdropper are still dissimilar. Furthermore, even if the illegal eavesdropper can acquire all secure parameters applied by the legal parties, the proposed system can still remain secure. The reason is that the lengths of the local fibers applied by the legal parties can be changed and the local NNs can be retrained before the illegal eavesdropper has acquired these secure parameters.

3. System model and principle

The proposed secure optical transmission system can be divided into the following five steps. As shown in Fig. 1, steps 1 and 2 constitute ① (channel characteristics detection), while steps 3, 4 and 5 constitute ② (communication).

1. The legal transmitter and legal receiver, which are the legal parties, transmit the pre-agreed detection data to each other. The legal parties then apply data post-processing at the receiving end, after which they hold identical received data. Meanwhile, the post-processing enlarges the difference between the data received by the legal parties and the data received by the illegal eavesdropper.
2. Each legal party trains an NN and stores it locally. The input data of the NN model is the pre-agreed detection data, which is represented as $data$ in Fig. 1, and the label data is the received data after post-processing.
3. The legal transmitter generates a pseudo-random code as the pseudo-key and derives the final key from the pseudo-key and the NN trained in step 2. The legal transmitter then encrypts the plaintext into the ciphertext with the final key.
4. The legal transmitter sends the pseudo-key and ciphertext to the legal receiver. The legal receiver then recovers the error-free ciphertext and pseudo-key by using a channel compensation algorithm.
5. The legal receiver converts the received pseudo-key into the final key through its trained NN, and then decrypts the received ciphertext into the correct plaintext with this final key.

Fig. 1. System chart of the proposed secure optical transmission system. Tr, transmitter; Re, receiver; NN, neural network; IR, information reconciliation; PA, privacy amplification; SSMF, standard single-mode fiber; SW, MEMS switch.

3.1 Channel characteristics detection and data consistency

During the channel characteristics detection, Alice and Bob send the same pre-agreed detection data to each other, which guarantees that the two sets of data transmitted by Alice and Bob are equivalent. Meanwhile, we assume that the detection data has also been acquired by the illegal eavesdropper. The detection data is represented as $data$. After the SSMF transmission, $data_{Alice}$ is received by Alice, $data_{Bob}$ is received by Bob and $data_{Eve}$ is received by Eve, as shown in Fig. 2(a). Alice and Bob are the legal parties, while Eve is the illegal eavesdropper. Tr denotes the optical transmitter and Re the optical receiver. It can be seen from Fig. 2(b) that the legal transmitter first uses a Mach-Zehnder modulator (MZM) to modulate the 100 Gbps pulse-amplitude-modulation-8 (PAM8) detection data onto the optical carrier. The optical signal is then amplified by an erbium-doped fiber amplifier (EDFA) and transmitted through the 130 km SSMF, that is, the 100 km public fiber plus the two 15 km local fibers. At the receiving end, the received data is acquired after a photodiode (PD) and digital signal processing (DSP). The security level of the proposed system is related to the difference between the NNs trained by Alice and Eve. That is, the greater the disagreement rate between the two sets of data received by Alice and Eve after post-processing, the more secure the system is. The local fibers applied by Alice and Bob are unknown to Eve. Therefore, we set the gain of the EDFA to 20 dB in order to increase the linear and nonlinear distortions caused by this part of the fiber, which the data received by Eve has not experienced. Besides, the lengths of the local fibers used by Alice and Bob are both set to 15 km.

Fig. 2. (a) Schematic diagram of channel characteristics detection, (b) Illustration of Tr and Re, (c) Three sets of raw data received by Alice, Bob and Eve.

Since optical fiber has reciprocity [31] and the two sets of data received by Alice and Bob have gone through the same fiber path, there is a strong correlation between $data_{Alice}$ and $data_{Bob}$. However, even if Eve is as close as possible to Alice or Bob, there is no strong correlation between $data_{Alice}$ and $data_{Eve}$, because the local fibers applied by Alice and Bob are unknown to Eve. The three sets of data received by Alice, Bob and Eve are shown in Fig. 2(c). The two sets of data received by Alice and Bob closely resemble each other. Because the local fibers are unknown to Eve, Eve can only eavesdrop on the 100 km public fiber. Hence, the data received by Eve experiences a different fiber path from the data received by Alice or Bob. Meanwhile, because the gain of the EDFA is set to 20 dB, the optical fiber causes large linear and nonlinear distortions to the transmitted data. Therefore, the two sets of data received by Alice and Eve are inconsistent. Moreover, we also calculate the correlation coefficients (CCs) of the three sets of data received by Alice, Bob and Eve, as shown in Table 1(a). It can be seen that the CC between Alice and Bob is as high as 0.99, while the CC between Alice and Eve is 0.39.
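The CC comparison above can be illustrated with a minimal sketch. The text does not state which correlation coefficient is used, so the Pearson coefficient is assumed here; the function name and the toy PAM8 symbol sequences are invented for illustration and are not the simulated data of Table 1.

```python
import math

def correlation_coefficient(a, b):
    """Pearson correlation coefficient of two equal-length sequences (assumed CC definition)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("sequences must have the same length (>= 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_b)

# Toy PAM8 symbol indices (0..7), invented for illustration only.
data_alice = [0, 3, 7, 2, 5, 1, 6, 4]
data_bob = [0, 3, 7, 2, 5, 1, 6, 5]  # a single symbol differs from Alice

cc_alice_bob = correlation_coefficient(data_alice, data_bob)
```

A single differing symbol already pulls the coefficient slightly below 1, which mirrors why a CC of 0.99 between the legal parties still calls for post-processing.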

Table 1. The correlation coefficients between the two sets of received data. (a) original data, (b) data after IR, (c) data after post-processing.

However, Alice and Bob cannot use one optical fiber to estimate the channel characteristics simultaneously. Therefore, we assume that Alice transmits the $data$ first, and that Bob sends the $data$ after he has received $data_{Bob}$. That is, Alice and Bob use different time periods for channel characteristics detection. The variation of the fiber channel characteristics has been shown to be very small over a short time [31], so there is only a slight difference between $data_{Alice}$ and $data_{Bob}$. It should be noted that even if the lengths of the local fibers applied by the legal parties have been changed, the two sets of data received by the legal parties are still similar, since the legal parties still share the same fiber path to estimate the channel characteristics. It can be seen from Table 1(a) that the CC between $data_{Alice}$ and $data_{Bob}$ is 0.99 instead of 1. Hence, post-processing is required to make the two sets of data received by Alice and Bob consistent [32,33]. We adopt the Cascade algorithm for information reconciliation (IR) between Alice and Bob in the proposed system. Meanwhile, we assume that Alice is the legal transmitter and that Eve applies the same reconciliation information and IR procedure as Bob. Moreover, we also calculate the CCs between $data_{Alice}$ and $data_{Bob}^{'}$, $data_{Eve}^{'}$, which have experienced the IR, as shown in Table 1(b). The CC between $data_{Alice}$ and $data_{Bob}^{'}$ is now 1 instead of 0.99, which means that the two sets of data held by Alice and Bob are equivalent. As can be seen from Table 1(a) and (b), the CC between the two sets of data received by Alice and Eve increases from 0.39 to 0.51. The reason is that Eve also corrects her received data with the same reconciliation information when Bob rectifies his received data according to the reconciliation information transmitted by Alice.
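As a rough illustration of how parity-based reconciliation works, the sketch below implements one heavily simplified pass: block parities are compared, and a binary search over disclosed half-block parities locates one error per mismatching block. The full Cascade algorithm additionally uses several passes with random permutations and back-tracking, which are omitted here; all function names are hypothetical and this is not the authors' implementation.

```python
def parity(bits, lo, hi):
    """Parity (0 or 1) of bits[lo:hi]."""
    return sum(bits[lo:hi]) % 2

def binary_correct(alice, bob, lo, hi):
    """Flip one wrong bit of bob inside [lo, hi), whose parities are known to differ.

    Returns the number of parity bits Alice disclosed during the search.
    """
    leaked = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        leaked += 1  # Alice discloses the parity of the left half
        if parity(alice, lo, mid) != parity(bob, lo, mid):
            hi = mid  # an odd number of errors lies in the left half
        else:
            lo = mid  # otherwise the mismatch must be in the right half
    bob[lo] ^= 1
    return leaked

def cascade_pass(alice, bob, block_size):
    """One simplified pass: correct one error in every block whose parity differs."""
    leaked = 0
    for lo in range(0, len(alice), block_size):
        hi = min(lo + block_size, len(alice))
        leaked += 1  # the block parity itself is disclosed
        if parity(alice, lo, hi) != parity(bob, lo, hi):
            leaked += binary_correct(alice, bob, lo, hi)
    return leaked
```

Every disclosed parity bit is also visible to Eve, which is consistent with the rise of the Alice-Eve CC after IR and motivates a subsequent privacy amplification step.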

In order to further amplify the distinction between the two sets of data received by Alice and Eve, we apply the SHA3-512 hash function for privacy amplification (PA). The PA is combined with IR to form the post-processing. To keep the KGR constant, we set the length of the input data of SHA3-512 to 512 bits, so that the input data and output data of SHA3-512 have the same length. As shown in Table 1(c), after PA the CC between Alice and Eve decreases from 0.51 to 0.023, meaning that there is only a very weak correlation between $data_{Alice}^{'}$ and $data_{Eve}^{''}$. However, the CC between Alice and Bob is still 1, which means that Alice and Bob still hold the same received data after the post-processing.
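The effect of the PA step can be sketched with Python's standard `hashlib`. The block-wise hashing of a bit string and the helper names below are assumptions made for illustration; the source only states that SHA3-512 is applied to 512-bit inputs.

```python
import hashlib

def privacy_amplify(bits):
    """Hash a '0'/'1' string in 512-bit blocks with SHA3-512 (output length = input length)."""
    if len(bits) % 512 != 0:
        raise ValueError("input length must be a multiple of 512 bits")
    out = []
    for start in range(0, len(bits), 512):
        block = bits[start:start + 512]
        raw = int(block, 2).to_bytes(64, "big")  # 512 bits = 64 bytes
        digest = hashlib.sha3_512(raw).digest()  # 64-byte digest
        out.append("".join(f"{byte:08b}" for byte in digest))
    return "".join(out)

def disagreement(a, b):
    """Fraction of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Two toy 512-bit inputs that differ in a single bit (illustrative values only).
x = "01" * 256
y = x[:-1] + "0"
```

Identical inputs (Alice and Bob after IR) hash to identical outputs, whereas inputs that differ in even one bit (Alice and Eve) produce outputs that disagree in roughly half of their bits, which is the behavior the drop in CC reflects.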

3.2 Local training of NN model

After the three sets of received data have been post-processed, Alice, Bob and Eve each train an NN model locally. Since the purpose of the NN model is to estimate the channel characteristics, the input data is $data$, while the label data is the received data after post-processing, as shown in Fig. 3(a). It should be noted that the NN model is not required to produce an especially accurate estimate; it only needs to ensure that the NN models trained by Alice and Eve are dissimilar. Alice and Bob hold exactly the same training data. Therefore, if the structures of the NN models applied by Alice and Bob, such as the number of hidden layers and the number of neurons in each layer, and the initial values of the weights and biases are identical, the NN models trained by Alice and Bob will be completely consistent. In contrast, Eve cannot acquire the training data used by Alice, and she is not even aware of the configuration of the NN model used by Alice. Therefore, it is impossible for Eve to train an NN model equivalent to that trained by Alice. In other words, it can be guaranteed that $NN_{Alice}=NN_{Bob} \not = NN_{Eve}$.

Fig. 3. (a) Channel characteristics detection, (b) Final key generation.

The structure of the NN employed for channel characteristics estimation is shown in Fig. 3(a). The NN contains three hidden layers. The circles in the figure represent neurons. The input layer has 51 neurons, which reflects that each single bit suffers from crosstalk from the 50 surrounding bits. The numbers of neurons in the first, second and third hidden layers are 8*128, 4*128 and 2*128, respectively, and the number of neurons in the output layer is 8, corresponding to the probabilities of the eight amplitudes of the PAM8 signal. This kind of NN model is known as a fully connected NN model.
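To give a sense of the size of this model, the following sketch counts the weights and biases of the fully connected layers from the layer widths stated above. It deliberately ignores the trainable batch normalization parameters, whose placement in the network is not specified in the text; the names are illustrative.

```python
# Layer widths as stated in the text: input, three hidden layers, output.
LAYER_SIZES = [51, 8 * 128, 4 * 128, 2 * 128, 8]

def dense_parameter_count(sizes):
    """Number of weights plus biases in a chain of fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total

total_parameters = dense_parameter_count(LAYER_SIZES)
```

Under these assumptions the four dense layers hold 711,432 trainable parameters, most of them in the 1024-to-512 layer.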

The input data of this NN model is the pre-agreed detection data, which is represented as $data$ in Fig. 1 and Fig. 2(a). And the label data is the corresponding received data after post-processing. Each neuron of hidden layer and output layer is equivalent to a computing unit, which consists of nonlinear operation and linear operation and can be expressed as follows:

(1)$$y=f(\sum_{i=1}^{n} w_{i}x_{i}+b)$$

where $y$ is the output data of the node, $x_{i}$ is not only the input data of the node but also the output data of the previous node, $w_{i}$ is the corresponding weight and $b$ represents the corresponding bias. Moreover, $f(*)$ is a nonlinear operation, which can also be called the activation function. It is precisely because of the existence of the activation function that the NN model has the ability to fit the nonlinear function. The activation function applied in the hidden layer is ReLU, while the activation function of the output layer is Softmax, which can be expressed as follows:

(2)$$ReLU(x)=max(0,x), Softmax(x_{i})=\frac{e^{x_{i}}}{\sum_{j=1}^{8}e^{x_{j}}}$$
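Equations (1) and (2) can be written out directly as a minimal pure-Python sketch. The function names are illustrative, and subtracting the maximum inside the softmax is a standard numerical-stability step that does not change the result of Eq. (2).

```python
import math

def relu(x):
    """ReLU(x) = max(0, x), the hidden-layer activation of Eq. (2)."""
    return max(0.0, x)

def softmax(logits):
    """Softmax of Eq. (2) over the output-layer values (eight for PAM8)."""
    shift = max(logits)  # stability shift; cancels in the ratio
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def neuron(inputs, weights, bias, activation=relu):
    """Single computing unit of Eq. (1): y = f(sum_i w_i * x_i + b)."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)
```

For example, eight equal output-layer values give a uniform probability of 1/8 for each PAM8 amplitude.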

The output data of the output layer, which can be represented as $y_{i}$ or $Softmax(x_{i})$, is composed of

(3)$$y=(y_{1}, y_{2}, y_{3}, y_{4}, y_{5}, y_{6}, y_{7}, y_{8})$$

where each element indicates the probability of one of the eight amplitudes of PAM8. The index $i$ of the largest element $y_{i}$ of this vector identifies the detected symbol as the $i$-th symbol of PAM8. Therefore, the target data is one-hot encoded as

(4)$$\hat{y}_{i}=\begin{cases} 0, & i \not=k \\ 1, & i=k \\ \end{cases}$$

where $k \in \{1,2,3,4,5,6,7,8\}$ is the index of the label symbol. Moreover, the cross-entropy loss is selected as the loss function of the NN model and is calculated from the output data $y$ and the corresponding label data $\hat {y}$. Cross-entropy is an information-theoretic measure of the similarity between two probability distributions [34] and can be expressed as below:

(5)$$CrossEntropyLoss={-}{\sum_{i=1}^{8}{\hat{y}_{i}\log(y_{i})}}$$

where $y_{i}$ is the $i$-th element of the output data and $\hat {y}_{i}$ is the $i$-th element of the corresponding label data. During the training, the NN model learns the channel characteristics step by step and finally realizes the channel characteristics estimation. The number of epochs, the batch size, the learning rate and the length of the training data of the NN model are 120, 100, 0.0001 and 65536, respectively. Moreover, Adam is used as the optimizer to carry out back propagation and gradient descent [35]. For better performance, we also make use of the batch normalization operation [36], which can be expressed as

(6)$$y=\frac{x-E[x]}{\sqrt{Var[x]+\varepsilon}}\gamma+\beta$$

where $x$ is the input data, $y$ is the corresponding normalized output data, and $\gamma$ and $\beta$ are trainable vectors of the same size as $x$. It is worth noting that the NN model does not need to estimate the channel characteristics with high accuracy; it only has to guarantee that the NN models trained by Alice and Eve are distinct. Therefore, cross-validation data, test data and the dropout strategy [37] are not applied in our proposed system. Furthermore, if the lengths of the local fibers applied by the legal parties have been changed, the locally trained NNs need to be retrained so that the corresponding channel characteristics are re-estimated.
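The loss of Eq. (5) and the normalization of Eq. (6) can be sketched as follows. With a one-hot label, only the term of the label index $k$ survives in Eq. (5). The value of $\varepsilon$, the scalar $\gamma$ and $\beta$, and the function names are assumptions chosen for illustration, since the text does not specify them.

```python
import math

def cross_entropy(probs, k):
    """Eq. (5) with the one-hot label of Eq. (4): -log of the probability of class k (1-based)."""
    return -math.log(probs[k - 1])

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Eq. (6) applied to one feature across a mini-batch (eps value is an assumption)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) * gamma + beta for v in values]
```

A uniform output over the eight PAM8 amplitudes therefore yields a loss of $\log 8$, while a fully confident correct prediction yields zero.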

3.3 Key generation and plaintext encryption

After the NN model has been trained, the legal transmitter generates the final key by using the pseudo-key and the trained NN model, and then encrypts the plaintext into the ciphertext with the generated key. First, the legal transmitter produces a pseudo-random code locally as the pseudo-key and feeds it into the trained NN model. The output data of the trained NN model is the real key, which is also called the final key, as shown in Fig. 3(b). At this point, the legal transmitter can encrypt the plaintext into the ciphertext by applying the generated final key.

We have also designed an encryption algorithm suitable for the proposed system. In the following, we demonstrate the algorithm for 512 bits of plaintext and key. Before the data encryption, the complete key must be divided into four functional blocks. As shown in Fig. 4, $K_{1}$ is the layout function key; that is, the sum of the five bits in $K_{1}$ determines the arrangement of the function blocks. When the sum of the five bits in $K_{1}$ is even, the distribution of the function blocks is as shown in Fig. 4(a); otherwise, it is as shown in Fig. 4(b). The code lengths of $K_{2}$, $K_{3}$ and $K_{4}$ are 11 bits, 7*16 bits and 3*128 bits, respectively, so that together with the five bits of $K_{1}$ the four blocks make up the full 512-bit key. The functions of $K_{2}$, $K_{3}$ and $K_{4}$ are to divide, rotate and XOR the bits of the plaintext, respectively.
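The partition of the key can be sketched as below. The block lengths come from the text, but the actual layouts of Fig. 4 are not described in it, so the block orders used here ($K_{1}$ first, then $K_{2}$, $K_{3}$, $K_{4}$ for an even sum and the reverse order for an odd sum) are purely hypothetical placeholders, as is the function name.

```python
K1_BITS, K2_BITS, K3_BITS, K4_BITS = 5, 11, 7 * 16, 3 * 128  # 5 + 11 + 112 + 384 = 512

def split_key(key_bits):
    """Split a 512-bit '0'/'1' string into K1..K4.

    Assumption: K1 occupies the first five bits, and the remaining order is a
    placeholder for the layouts of Fig. 4, which the text does not spell out.
    """
    if len(key_bits) != 512 or set(key_bits) - {"0", "1"}:
        raise ValueError("key must be a string of 512 '0'/'1' characters")
    k1 = key_bits[:K1_BITS]
    even = k1.count("1") % 2 == 0           # parity of the sum of the five K1 bits
    order = [("K2", K2_BITS), ("K3", K3_BITS), ("K4", K4_BITS)]
    if not even:
        order.reverse()                      # hypothetical odd-sum layout
    blocks, pos = {"K1": k1}, K1_BITS
    for name, length in order:
        blocks[name] = key_bits[pos:pos + length]
        pos += length
    return blocks
```

Whatever the true layout is, the four block lengths always add up to the 512-bit key, and the 11-bit $K_{2}$ converts to a decimal value between 0 and 2047.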

Fig. 4. Two key layouts according to the sum of $K_{1}$. (a) even, (b) odd.

The designed encryption algorithm not only uses distinct key layouts depending on whether the sum of the five bits of $K_{1}$ is even or odd, but also adopts different encryption methods depending on the value of the sum: 0, 2 or 4 in the even case and 1, 3 or 5 in the odd case. A more detailed description of the designed encryption algorithm is shown in Fig. 5.

Fig. 5. The flow chart of the designed encryption algorithm.

As shown in Fig. 5, the two inputs of the designed encryption algorithm are the Key and PT, which represents the plaintext. First, the sum of the five bits in $K_{1}$ is calculated and the distribution of the function blocks is determined according to this sum. Next, the binary $K_{2}$ is converted into the decimal $num_{2}$, and the PT is split into information segments according to $num_{2}$. It should be noted that the length of the last information segment depends on the specific situation. Then, $K_{3}$ is split into seven $K_{3}$ blocks of the same length, and all $K_{3}$ blocks are converted to decimal numbers. Each information segment is then rotated as many times as the corresponding decimal number. For example, the first information segment is rotated $num_{31}$ times, where $num_{31}$ is converted from the first $K_{3}$ block. A single left rotation is shown as follows:

(7)$$P=[P_{1},P_{2},P_{3},\ldots,P_{n}]\stackrel{1}{\longrightarrow}[P_{2},P_{3},\ldots,P_{n},P_{1}]=P'$$

where $P$ represents the plaintext, while $P'$ is the corresponding rotated plaintext. Moreover, since $K_{3}$ is divided into seven blocks, these seven converted decimal numbers are reused cyclically when the number of information segments is greater than seven. After rotation, all information segments are combined into one complete sequence, and then the last bit is inverted. Next, $K_{4}$ is divided into three $K_{4}$ blocks of the same length, and two different $K_{4}$ blocks are selected according to $num_{1}$ to perform XOR on the inverted sequence. The sequence is then split again into information segments according to $num_{2}$, and all information segments are rotated $num_{3}$ times, where $num_{3}$ is converted from $K_{3}$. After rotation, all rotated information segments are combined into one complete sequence, and the remaining $K_{4}$ block is used to perform XOR on it. In the last step, every eight bits of the complete sequence are converted to a decimal number, all decimal numbers are converted to ASCII codes, and the resulting ASCII codes constitute the ciphertext, which is then output.
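The segment-and-rotate stage described above, including Eq. (7) and the cyclic reuse of the seven rotation counts, can be sketched as follows. Interpreting $num_{2}$ as the segment length is an assumption, since the text only says the plaintext is split "according to" it; the function names and the `inverse` flag (used to undo the rotation during decryption) are likewise illustrative.

```python
def rotate_left(bits, times):
    """Apply the single left rotation of Eq. (7) `times` times to a bit string."""
    if not bits:
        return bits
    shift = times % len(bits)  # a negative `times` yields a right rotation
    return bits[shift:] + bits[:shift]

def split_segments(bits, segment_length):
    """Cut bits into consecutive segments; the last one may be shorter."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    return [bits[i:i + segment_length] for i in range(0, len(bits), segment_length)]

def rotate_segments(bits, segment_length, counts, inverse=False):
    """Rotate each segment by its count, recycling `counts` when segments outnumber them."""
    rotated = []
    for index, segment in enumerate(split_segments(bits, segment_length)):
        times = counts[index % len(counts)]
        rotated.append(rotate_left(segment, -times if inverse else times))
    return "".join(rotated)
```

Because each rotation is undone by the same count in the opposite direction, applying `rotate_segments` with `inverse=True` restores the original bit string, consistent with decryption being the reverse of encryption.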

3.4 Communication

After the plaintext has been encrypted into the ciphertext, the legal transmitter sends the generated ciphertext to the legal receiver; that is, the legal parties now communicate. During the communication, since nonlinear distortion is no longer beneficial, we set the gain of the EDFA to 0 dB and the lengths of the local fibers used by Alice and Bob to 0 km. The schematic diagram of the communication is shown in Fig. 6. The legal transmitter first combines the ciphertext and the pseudo-key into new data, and then transmits the new data to the legal receiver through the 100 km public fiber. The transmission distance is 100 km during the communication because the lengths of the local fibers applied by the legal parties are 0 km.

Fig. 6. Schematic diagram of communication.

It should be noted that, before the communication, the gain of the EDFA is 20 dB and the lengths of the local fibers used by the legal parties are both 15 km. In our proposed system, the ciphertext and pseudo-key are combined by simple front-to-back concatenation. During the communication, the linear and nonlinear distortions experienced by the transmitted data are very small, so the transmitted data can be received without error by employing a channel compensation algorithm at the receiving end. Therefore, the legal receiver can acquire the error-free ciphertext and pseudo-key in the same combined form as the transmitted data. Meanwhile, there is no need to guarantee that the linear and nonlinear distortions experienced by the data received by the legal parties are similar. Hence, a fiber pair can be used to support duplex communication during the communication.

3.5 Key generation and ciphertext decryption

After the pseudo-key and ciphertext have been received, the legal receiver first needs to generate the final key to decrypt the ciphertext. It can be seen from Fig. 3(b) that the final key is generated from the pseudo-key and the NN model trained in Section 3.2. For simplicity, we assume that Alice is the legal transmitter, while Bob is the corresponding legal receiver. Because $NN_{Alice}=NN_{Bob}$, the final key used by Alice to encrypt the plaintext can be acquired by Bob. In other words, $key_{Alice}=key_{Bob}$. However, even if Eve has obtained the pseudo-key, the final key applied by Alice for encryption is still inaccessible to Eve because $NN_{Alice} \not = NN_{Eve}$. That is, $key_{Alice} \not = key_{Eve}$. Therefore, we conclude that $key_{Alice}=key_{Bob}\not =key_{Eve}$. The process of decryption is the reverse of the process of encryption. Consequently, the plaintext transmitted by Alice is acquired by Bob, while the correct plaintext is unobtainable to Eve. That is to say, the legal transmitter has sent the plaintext to the legal receiver safely. Compared with a scheme that directly uses the received data as the key, the final key of our proposed scheme depends not only on the received data, but also on the structure of the NN and the pseudo-key transmitted with the ciphertext, so the higher-level security requirement is satisfied. In addition to more secure key distribution, the final key can be changed by revising the pseudo-key during the communication. Moreover, the simultaneous transmission of the key together with the plaintext is realized.

Furthermore, we also consider the case in which Eve has acquired the lengths of the local fibers used by Alice and Bob. The final key of our proposed system is generated from the NN model trained in Section 3.2 and the pseudo-key produced in Section 3.3, and the pseudo-key is acquired by Eve during the communication. Therefore, the security of our proposed system is equivalent to the security of the NN model trained in Section 3.2. On account of the randomness and time-varying property of PMD [29], by the time Eve has acquired the lengths of the local fibers used by Alice and Bob, the PMD coefficients of these fibers have already changed. In other words, the channel characteristics for the legal parties and the illegal eavesdropper are still not completely equivalent. Hence, Eve could obtain the training data utilized by Alice only by matching the lengths of the local fibers applied by Alice and Bob during the detection time, which is unachievable. Moreover, even if the training data applied by Alice has been acquired by Eve, the structures of the NNs used by Alice and Eve are still distinct. Therefore, the security level of the proposed system is extremely high.

If we assume that the illegal eavesdropper has a digital coherent receiver with enough memory, the data received by the illegal eavesdropper during the detection time can be recorded. Then, once the lengths of the local fibers applied by the legal parties have been acquired by the illegal eavesdropper, the corresponding fiber parameters (such as PMD and CD) can be determined by brute-force search and digital propagation. In this case, a secure method based on a simple nonlinear pseudo-random number generator (PRNG) would be cracked because of its stationary secure parameter (such as the initial seed). However, our proposed optical transmission system can still guarantee the high-level security of the plaintext. The reason is that the lengths of the local fibers applied by the legal parties can be adjusted without negotiation between the legal parties. Using brute-force search to obtain the corresponding secure fiber parameters consumes a lot of time. If the lengths of the local fibers applied by the legal parties have been changed and the NNs have been retrained to re-estimate the new channel characteristics before the illegal eavesdropper has acquired these fiber parameters, the illegal eavesdropper still cannot acquire the same received data as the legal parties. Consequently, the key acquired by the legal parties cannot be obtained by the illegal eavesdropper. As shown in Fig. 2(a), even if the lengths of the local fibers used by the legal parties have been changed, the legal parties still receive two similar sets of data, since they still share the same fiber path to estimate the channel characteristics. Moreover, the legal parties do not need to know the length of the local fiber applied by each other.
In principle, the shorter the time interval between changes of the local fiber lengths and retraining of the local NNs, the higher the security level of the proposed system. The safest scenario is to change the lengths of the local fibers applied by the legal parties and retrain the local NNs after each communication between the legal parties has completed.

In summary, even if the illegal eavesdropper can acquire all secure parameters used by the legal parties (such as the lengths of the local fibers, the corresponding fiber parameters in the detection time, and the structure and initial values of the biases and weights of the NN), the correct, real-time key obtained by the legal parties is still unobtainable to the illegal eavesdropper. The reason is that the legal parties can change the lengths of their local fibers and retrain the local NNs before the illegal eavesdropper has obtained these secure parameters. Even if the lengths of the local fibers are not changed, the legal parties can still update the local NNs because the channel characteristics are time-varying. However, when the environment changes during the communication, the local NNs already trained by the legal parties remain identical to each other, while the NNs trained by the legal parties and by the illegal eavesdropper remain different. Therefore, the final keys generated by the legal parties are still secure. In conclusion, the criterion for retraining the NNs is whether the NN trained by the legal parties is still safe, rather than whether the environment has changed.

4. Simulation results and discussion

4.1 System evaluation criteria

4.1.1 KDR

(8)$$KDR=\frac{(\sum_{i=1}^{N}|X_{A}(i)-X_{B}(i)|)}{N}$$

The $KDR$ corresponds to the security level of the system. The higher the $KDR$ with respect to the illegal eavesdropper, the fewer correct bits of the final key the eavesdropper acquires, and the more difficult it is to decrypt the ciphertext into the correct plaintext. Conversely, a lower $KDR$ means a less secure system.
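As a minimal sketch of Eq. (8), the KDR can be computed as the fraction of positions at which two equal-length keys differ. The function name `key_disagreement_rate` and the 8-bit example keys are illustrative assumptions, not values from the simulation; the sketch also assumes that $X_{A}$ and $X_{B}$ are binary sequences of length $N$.

```python
import numpy as np


def key_disagreement_rate(x_a, x_b):
    """Eq. (8): mean absolute difference between two equal-length bit sequences."""
    x_a = np.asarray(x_a, dtype=np.int64)
    x_b = np.asarray(x_b, dtype=np.int64)
    if x_a.shape != x_b.shape:
        raise ValueError("both keys must have the same length N")
    return float(np.mean(np.abs(x_a - x_b)))


# Hypothetical 8-bit keys, chosen only to illustrate the two extremes discussed in the text.
key_alice = [1, 0, 1, 1, 0, 0, 1, 0]
key_bob = [1, 0, 1, 1, 0, 0, 1, 0]   # identical to Alice's key
key_eve = [1, 1, 0, 1, 0, 1, 1, 1]   # differs from Alice's key in 4 of 8 positions

kdr_alice_bob = key_disagreement_rate(key_alice, key_bob)
kdr_alice_eve = key_disagreement_rate(key_alice, key_eve)
print(kdr_alice_bob, kdr_alice_eve)
```

For binary keys the absolute difference is 1 exactly where the bits disagree, so the mean is the disagreement rate.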

4.1.2 KGR

(9)$$KGR=\frac{N_{out}}{N_{in}}\times R_{b}$$

The $KGR$ represents the efficiency of the system. A higher $KGR$ means that the system generates more key bits in the same time, so more bits of plaintext can be encrypted and the system is more efficient. Conversely, a lower $KGR$ means lower efficiency.
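The following sketch evaluates Eq. (9) for two cases. It assumes, as an interpretation rather than a definition taken from the text, that $N_{out}$ is the number of final-key bits produced, $N_{in}$ is the total number of transmitted bits (pseudo-key plus ciphertext), and $R_{b}$ is the line bit rate. The function name and the split between pseudo-key and ciphertext are hypothetical.

```python
def key_generation_rate(n_out, n_in, bit_rate):
    """Eq. (9): KGR = (N_out / N_in) * R_b, in the same unit as bit_rate."""
    if n_in <= 0:
        raise ValueError("n_in must be positive")
    return n_out / n_in * bit_rate


BIT_RATE = 100e9      # 100 Gbps, the bit rate used in the simulation
KEY_BITS = 65536      # pseudo-key length used in the simulation

# Assumed case 1: only the pseudo-key is transmitted (plaintext length is zero).
kgr_key_only = key_generation_rate(KEY_BITS, KEY_BITS, BIT_RATE)

# Assumed case 2: pseudo-key and ciphertext share the transmitted bits equally.
kgr_half = key_generation_rate(KEY_BITS, 2 * KEY_BITS, BIT_RATE)

print(kgr_key_only / 1e9, "Gbps;", kgr_half / 1e9, "Gbps")
```

Under these assumptions the first case reaches the bit rate and the second gives half of it, which shows how the proportion of pseudo-key to ciphertext trades key rate against payload.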

4.1.3 Randomness

In our proposed system, the output data of the trained NN model is the final key, as shown in Fig. 3(b). Furthermore, it can be seen from Fig. 3(a) that the input data of the NN model is the pre-agreed detection data, while the label data is the received data after post-processing. Both sets of data are random and irregular, which ensures that when the input data of the NN model is random, the output data is also irregular. Thus, the randomness of the final key can be guaranteed when the pseudo-key has no regularity.

4.2 Data source

In this paper, the signal transmission of the system is simulated with a combination of Python and VPItransmissionMaker. VPItransmissionMaker is used to generate the detection data and plaintext, to produce the modulated PAM8 optical signals and to receive data. In the channel characteristics estimation, the length of the public fiber is 100 km, while the lengths of the local fibers employed by Alice and Bob are both 15 km. The gain of the EDFA is set to 20 dB, and the bit rate is 100 Gbps. Moreover, the length of the training data is 65536 bits, and the input power of the laser is 2 mW. The other optical fiber parameters are shown in Table 2. During the communication, to decrease the linear and nonlinear distortions caused by the optical fiber, the lengths of the local fibers used by Alice and Bob are changed to 0 km and the gain of the EDFA is set to 0 dB.

Table 2. Standard single-mode fiber parameters


4.3 Simulation results

To simplify the analysis of the simulation results, we assume that Alice is the legal transmitter and Bob is the legal receiver. The scatter diagrams of the three sets of data received by Alice, Bob and Eve, with Eve located 15 km, 65 km and 115 km away from Bob, are shown in Fig. 7. The CCs of the three sets of data received by Alice, Bob and Eve are listed in Table 3. The figure and table show that the two sets of data received by Alice and Bob are highly correlated, while the data received by Alice and Eve are not significantly correlated no matter where Eve is located. The reason is that the lengths of the local fibers used by Alice and Bob are not revealed to Eve. Consequently, Eve cannot acquire the training data used by Alice or Bob.
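A correlation coefficient of this kind can be computed as sketched below, assuming the CC is the Pearson correlation coefficient. The received sequences here are synthetic stand-ins (a shared random waveform plus small independent noise for Alice and Bob, and an independent waveform for Eve), not output of the fiber simulation, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 4096

# Synthetic stand-ins: Alice and Bob observe the same channel response with small
# independent noise, while Eve observes an unrelated response.
shared_channel = rng.normal(size=n_samples)
rx_alice = shared_channel + 0.05 * rng.normal(size=n_samples)
rx_bob = shared_channel + 0.05 * rng.normal(size=n_samples)
rx_eve = rng.normal(size=n_samples)

# np.corrcoef returns the full correlation matrix; the off-diagonal entry is the CC.
cc_alice_bob = np.corrcoef(rx_alice, rx_bob)[0, 1]
cc_alice_eve = np.corrcoef(rx_alice, rx_eve)[0, 1]

print(f"CC(Alice, Bob) = {cc_alice_bob:.3f}")
print(f"CC(Alice, Eve) = {cc_alice_eve:.3f}")
```

The matrix returned by `np.corrcoef` is symmetric with ones on the diagonal, which is the same layout as each panel of Table 3.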

Fig. 7. The scatter diagrams of different locations of Eve. (a) 15 km, (b) 65 km, (c) 115 km.


Table 3. The correlation coefficients of different locations of Eve. (a) 15 km, (b) 65 km, (c) 115 km.


The loss function of the NN model trained in section 3.2 is shown in Fig. 8(a). The loss declines sharply at the beginning of training. As the NN model continuously updates its weights and biases by the gradient descent algorithm, the loss converges slowly and stably to a very small value. When the NN model is trained further, it becomes unstable, which is what we hope to see: the performance of the NN itself is not important, and we only need to guarantee that the NNs trained by Alice and Eve are distinct. As a result, when Alice and Eve acquire identical training data but apply two NNs with different structures, the NN models trained by Alice and Eve are different.

Fig. 8. (a) The loss function of NN, (b) The KDR corresponding to different positions of Eve.


The detailed parameters of the NN and the corresponding results are given in section 3.2 and Table 4. With a base learning rate of 0.0001, the training time is 12.355 ms per sample with 120 epochs. When the trained NN is applied to generate the final key, the average time consumed is 0.1445 ms per sample, which is acceptable. Moreover, apart from specific scenarios with very high security requirements, the NN does not need to be retrained frequently, because the final key can also be changed by modifying the pseudo-key. Therefore, the time cost of the proposed optical transmission system is satisfactory.

Table 4. Time cost of training/generating final key


Then, to calculate the KDR between Alice and Eve, we input the same pseudo-key of 65536 bits into the NN models trained locally by Alice, Bob and Eve. Note that Eve has trained five NN models, corresponding to Eve being located 15 km, 40 km, 65 km, 90 km and 115 km away from Bob. We also assume that Eve knows the number of hidden layers and the number of neurons in each layer of the NN model employed by Alice. The results are shown in Fig. 8(b), where the solid line indicates that Eve knows the initial values of the weights and biases of the NN model used by Alice, while the dashed line indicates that Eve does not know them. Regardless of where Eve is located and whether or not Eve knows these initial values, the KDR between Alice and Eve approaches 50%. In contrast, the KDR between Alice and Bob is always 0, meaning that Alice and Bob have acquired identical final keys. Furthermore, because the code lengths of the input data and output data of the NN model are equal, the KGR can reach the bit rate when the code length of the plaintext is zero. The KGR can also be varied by adjusting the proportion of pseudo-key and ciphertext.

In general, it is impossible for Eve to know the structure of the NN model employed by Alice. Therefore, we assume that Alice and Eve use two NNs with different structures, while Alice and Bob still apply identical NNs. We then input the same pseudo-key into these three trained NNs; the corresponding KDRs are shown in Fig. 9. Figure 9(a) presents the KDR between Alice and Eve when Eve knows only the number of hidden layers but not the number of neurons in each layer of the NN used by Alice. The KDR between Alice and Eve is close to 50% whether or not Eve knows the initial values of the weights and biases of the NN employed by Alice, while the KDR between Alice and Bob is always 0. Figure 9(b) presents the KDR between Alice and Eve when Eve knows only the number of neurons in each layer but not the number of hidden layers of the NN used by Alice. Again, the KDR between Alice and Eve is close to 50% whether or not Eve knows the initial values of the weights and biases, while the KDR between Alice and Bob is still 0.

Fig. 9. Different training data. (a) The KDR corresponding to different number of neurons in each layer, (b) The KDR corresponding to different number of hidden layers.


Figures 8(b) and 9 show that if Eve does not know the lengths of the local fibers used by Alice and Bob, then regardless of where Eve is located and whether or not Eve knows the structure and the initial values of the weights and biases of the NN model employed by Alice, the KDR between Alice and Eve approaches 50%. A KDR of 50% between Alice and Eve means that Eve has acquired no information about the final key held by Alice and is blindly guessing. Therefore, the data decrypted by Eve differs from the correct plaintext.
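As a rough check on the 50% figure, consider the idealized case (an assumption for illustration, not a result of the simulation) in which each bit of Eve's key is uniformly distributed and independent of Alice's key. Each position then disagrees with probability 1/2, so for a key of $N$ bits

$$E[KDR]=\frac{1}{2},\qquad \sigma_{KDR}=\sqrt{\frac{1}{4N}}=\frac{1}{2\sqrt{N}}$$

For the 65536-bit pseudo-key used here, $\sigma_{KDR}=1/512\approx 0.002$, so a KDR within a few tenths of a percent of 50% is what blind guessing would be expected to produce.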

To examine the relationship between the decryption error rate and the KDR, we perform the corresponding simulation, as shown in Fig. 10(a). The abscissa represents the number of differing bits between two keys, while the ordinate indicates the number of differing bits between the decrypted data and the correct plaintext. The code lengths of the key and the plaintext are both 512 bits. The green line shows the lowest error number over twenty tests, which is the most favorable situation for Eve, while the purple line shows the highest error number over twenty tests, which is the most unfavorable situation for Eve. Even when Eve is in the most favorable situation, the high-level security of the plaintext is achieved once the disagreement rate between the two keys exceeds 12.5%. That is, the proposed encryption algorithm guarantees the high-level security of the plaintext when the KDR is greater than 12.5%.

Fig. 10. (a) The number of different bits between decrypted data and correct plaintext corresponding to the number of different bits of two keys, (b) The results of NIST randomness test.


Meanwhile, we apply the randomness tests of the National Institute of Standards and Technology (NIST) test suite [38] to evaluate the randomness (and equivalently, the security level) of a generated key with a length of three million bits, as shown in Fig. 10(b). Each test evaluates the randomness of a bit sequence with an information-theoretic metric or a specific pattern. All of the returned $P$-values of the 15 sub-tests are greater than the threshold of 0.01. Therefore, the symmetric keys generated by the proposed system have excellent randomness.
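To illustrate how one of these sub-tests produces a $P$-value, the sketch below implements the frequency (monobit) test from the NIST suite: bits are mapped to ±1, summed, normalized by $\sqrt{n}$, and converted to a $P$-value with the complementary error function. It is a single-test illustration with hypothetical input sequences, not the full 15-test suite and not the key generated by the proposed system.

```python
import math

import numpy as np


def monobit_p_value(bits):
    """NIST SP 800-22 frequency (monobit) test: P = erfc(|S_n| / sqrt(2 n))."""
    bits = np.asarray(bits, dtype=np.int64)
    n = bits.size
    if n == 0:
        raise ValueError("bit sequence must not be empty")
    s_n = int(np.sum(2 * bits - 1))      # map 0/1 to -1/+1 and sum
    s_obs = abs(s_n) / math.sqrt(n)      # normalized test statistic
    return math.erfc(s_obs / math.sqrt(2))


rng = np.random.default_rng(1)
random_bits = rng.integers(0, 2, size=100_000)   # stand-in for a generated key
constant_bits = np.ones(100_000, dtype=np.int64)  # obviously non-random sequence

p_random = monobit_p_value(random_bits)
p_constant = monobit_p_value(constant_bits)

# A sequence passes this sub-test when its P-value is at least the 0.01 threshold.
print(f"random: P = {p_random:.4f}, constant: P = {p_constant:.4g}")
```

A sequence with equal numbers of zeros and ones gives $S_n=0$ and therefore $P=1$, while a heavily biased sequence drives the $P$-value toward zero and fails the threshold.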

Because the PMD is random, time-varying and sensitive to the surrounding environment, the linear and nonlinear distortions experienced by the transmitted data are also time-varying and random. Meanwhile, it is unachievable for Eve to acquire the lengths of the local fibers used by Alice and Bob within the detection time. Therefore, even if Eve spends only a short time matching the lengths of the local fibers used by Alice and Bob, the corresponding PMD coefficients of these fibers have already changed. Based on these premises, we simulate the relationship between the disagreement rate of two sets of output data and the deviation between two PMD coefficients, as shown in Fig. 11(a). The reference PMD coefficient is $\frac{0.1\times 10^{-12}}{31.62}\ \mathrm{s}/\sqrt{\mathrm{m}}$, that is, 0.1 ps/$\sqrt{\mathrm{km}}$. The figure shows that the output data changes if the PMD coefficient is reduced or increased, because a change in the PMD coefficient changes the linear and nonlinear distortions experienced by the transmitted data. Moreover, the designed encryption algorithm guarantees the high-level security of the plaintext when the KDR exceeds 12.5%. Hence, even if Eve has perfectly matched the lengths of the local fibers used by Alice and Bob, the high-level security of the plaintext is still guaranteed when the PMD coefficients of these fibers have increased by a factor of more than 1.2 or decreased by a factor of more than 1.7.

Fig. 11. The disagreement rate of two sets of output data corresponding to (a) the deviation of two PMD coefficients, (b) the length deviation of two local fibers.


Figure 11(a) covers the situation in which Eve has perfectly matched the lengths of the local fibers used by Alice and Bob. However, the lengths of the local fibers used by Eve may deviate slightly from those used by Alice and Bob. Therefore, we simulate the relationship between the disagreement rate of two sets of output data and the length deviation between two local fibers, as shown in Fig. 11(b). The reference length is 130 km. The figure shows that Eve can acquire part of the correct plaintext only when the length deviation is within −0.06 km to 0.04 km. This 0.1 km window is very small compared with the unlimited range of possible lengths. Hence, the system can be considered to guarantee the high-level security of the plaintext whenever the length deviation between the local fibers used by the legal parties and by the illegal eavesdropper is not zero. In conclusion, it is very difficult for Eve to acquire even part of the correct plaintext that Alice and Bob obtain.

However, if Eve has obtained the training data used by Alice but does not know the structure of the NN model employed by Alice, is the proposed system still safe? To answer this, we simulate the circumstances under which the proposed system remains secure, as shown in Fig. 12. Figure 12(a) shows the KDR between Alice and Eve when Eve knows only the number of hidden layers but not the number of neurons in each layer of the NN used by Alice. The KDR between Alice and Eve is lower than 12.5% only when Eve knows the initial values of the weights and biases of the NN used by Alice and the number of neurons in each layer of Eve's NN is similar to that of Alice's NN; in all other cases the KDR is greater than 12.5%. The KDR between Alice and Bob is always 0. Figure 12(b) shows the KDR between Alice and Eve when Eve knows only the number of neurons in each layer but not the number of hidden layers of the NN employed by Alice. The KDR between Alice and Eve is lower than 12.5% only when Eve uses the same number of hidden layers as Alice and knows the initial values of the weights and biases of the NN used by Alice; in all other cases the KDR is higher than 12.5%. The KDR between Alice and Bob is still 0. Therefore, Eve can obtain part of the correct plaintext only when she has obtained the training data used by Alice, knows the initial values of the weights and biases and the number of hidden layers of the NN model used by Alice, and uses a number of neurons in each layer similar to that used by Alice.

Fig. 12. Same training data. (a) The KDR corresponding to different number of neurons of each layer, (b) The KDR corresponding to different number of hidden layers.


Finally, we consider the influence of channel noise and of the power mismatch between the lasers used by the legal parties on the proposed system. If the illegal eavesdropper cannot acquire the data received by the legal parties, the correct key is secure. Therefore, we represent the security level of the proposed system by the correlation coefficients of the received data. The relationship between the channel noise and the security level is shown in Fig. 13(a). Even if the powers of the noise and the optical signal are identical, the CC of the data received by the legal parties is greater than the CC between Alice and Eve. When the legal parties do not receive the same data, post-processing is required. Therefore, in normal environments, the channel noise does not affect the security level of the proposed system. Figure 13(b) shows the relationship between the security level and the deviation of optical powers, where the power of the laser used by Bob is 2 mW. Even if the power of the laser used by Alice is 1 mW or 3 mW, the CC of the data received by the legal parties is greater than the CC between Alice and Eve. Hence, the deviation of optical powers does not affect the security level of the proposed system. Note also that the powers of the lasers used by the legal parties are negotiated in advance, so the power deviation must be very small and arises only from imperfections in the manufacturing process.

Fig. 13. The correlation coefficient between Alice and Bob (Eve) corresponding to (a) the optical signal-to-noise ratio, (b) the deviation of optical powers.


5. Conclusion

We propose and discuss a novel secure optical transmission system based on NNs, which are employed to estimate channel characteristics. Secure optical transmission at 100 Gbps over 100 km is demonstrated. The proposed system is divided into five parts: channel characteristics detection and data consistency, local training of the NN model, key generation and plaintext encryption, communication, and key generation and ciphertext decryption. The proposed system realizes joint transmission of plaintext and key, and the final key can be modified by changing the pseudo-key during the communication. The simulation results indicate that, to acquire even part of the correct plaintext, Eve must satisfy the following conditions: 1) the number of neurons in each layer of the NN model used by Eve is similar to that used by Alice or Bob; 2) the number of hidden layers of the NN model used by Eve is identical to that used by Alice or Bob; 3) the initial values of the biases and weights of the NN model used by Eve are the same as those used by Alice or Bob; 4) the length deviation between the local fibers used by the illegal eavesdropper and by the legal parties is within −0.06 km to 0.04 km; 5) the PMD coefficients of the fibers used by the legal parties have increased by a factor of less than 1.2 or decreased by a factor of less than 1.7. Therefore, the key space of the proposed secure system is enormous, so the illegal eavesdropper must spend a lot of time to acquire these secure parameters by brute force search. Moreover, because the local fibers used by the legal parties are variable, even if the illegal eavesdropper has obtained all secure parameters, the real-time, valid plaintext cannot be acquired.
The reason is that the lengths of the local fibers used by the legal parties can be changed and the locally trained NNs can be retrained before the illegal eavesdropper has acquired all secure parameters. We also find that the channel noise and the deviation of the optical powers used by the legal parties do not affect the security level of the proposed system. Moreover, we achieve a KGR of 50 Gbps for the legal parties and a KDR of 50% for the illegal eavesdropper. The proposed system provides a reference for combining artificial intelligence with security technology and opens a new perspective for the simultaneous transmission of plaintext and key.

Funding

National Natural Science Foundation of China (61831003, 62021005).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. G. Das, C. Leckie, R. Parthiban, and R. S. Tucker, “Traffic grooming in hybrid ring-star optical networks,” J. Opt. Netw. 6(3), 263–277 (2007).

2. M. Fok, Z. Wang, Y. Deng, and P. Prucnal, “Optical Layer Security in Fiber-Optic Networks,” IEEE Trans. Inform. Forensic Secur. 6(3), 725–736 (2011).

3. N. Skorin-Kapov, M. Furdek, S. Zsigmond, and L. Wosinska, “Physical-layer security in evolving optical networks,” IEEE Commun. Mag. 54(8), 110–117 (2016).

4. J. M. Castro, I. B. Djordjevic, and D. F. Geraghty, “Novel Super Structured Bragg Gratings for Optical Encryption,” J. Lightwave Technol. 24(4), 1875–1885 (2006).

5. K. Vahala, R. Paiella, and G. Hunziker, “Ultrafast WDM logic,” IEEE J. Sel. Top. Quantum Electron. 3(2), 698–701 (1997).

6. B. Wu, A. Agrawal, I. Glesk, E. Narimanov, S. Etemad, and P. Prucnal, “Steganographic Fiber-Optic Transmission using Coherent Spectral-Phase-Encoded Optical CDMA,” in Conference on Lasers and Electro-Optics, (OSA, 2008), pp. CFF5.

7. B. Wu, M. P. Chang, B. J. Shastri, P. Y. Ma, and P. R. Prucnal, “Dispersion Deployment and Compensation for Optical Steganography Based on Noise,” IEEE Photonics Technol. Lett. 28(4), 421–424 (2016).

8. N. Kostinski, K. Kravtsov, and P. R. Prucnal, “Demonstration of an All-Optical OCDMA Encryption and Decryption System With Variable Two-Code Keying,” IEEE Photonics Technol. Lett. 20(24), 2045–2047 (2008).

9. S. S. Pawar and R. K. Shevgaonkar, “Modelling of FBG for encoding/decoding in SAC-OCDMA system,” in International Conference on Next Generation Networks, (IET, 2010), pp. 1–4.

10. X. Wang, Z. Gao, N. Kataoka, and N. Wada, “Time domain spectral phase encoding/DPSK data modulation using single phase modulator for OCDMA application,” Opt. Express 18(10), 9879–9890 (2010).

11. N. Jiang, A. Zhao, C. Xue, J. Tang, and K. Qiu, “Physical secure optical communication based on private chaotic spectral phase encryption/decryption,” Opt. Lett. 44(7), 1536–1539 (2019).

12. Z. Jiang, D. Leaird, and A. Weiner, “Experimental Investigation of Security Issues in O-CDMA,” J. Lightwave Technol. 24(11), 4228–4234 (2006).

13. B. Zhu, F. Wang, and J. Yu, “A Chaotic Encryption Scheme in DMT for IM/DD Intra-Datacenter Interconnects,” IEEE Photonics Technol. Lett. 33(8), 383–386 (2021).

14. Y. Fu, M. Cheng, W. Shao, H. Luo, D. Li, L. Deng, Q. Yang, and D. Liu, “Analog-digital hybrid chaos-based long-haul coherent optical secure communication,” Opt. Lett. 46(7), 1506–1509 (2021).

15. C. Cai, Y. Sun, and Y. Ji, “Simultaneous Long-Distance Transmission of Discrete-Variable Quantum Key Distribution and Classical Optical Communication,” IEEE Trans. Commun. 69(5), 3222–3234 (2021).

16. J. Niu, Y. Sun, X. Jia, and Y. Ji, “Key-Size-Driven Wavelength Resource Sharing Scheme for QKD and the Time-Varying Data Services,” J. Lightwave Technol. 39(9), 2661–2672 (2021).

17. A. Hajomer, X. Yang, A. Sultan, and W. Hu, “Key Distribution Based on Phase Fluctuation Between Polarization Modes in Optical Channel,” IEEE Photonics Technol. Lett. 30(8), 704–707 (2018).

18. Y. Bromberg, B. Redding, S. Popoff, N. Zhao, G. Li, and H. Cao, “Remote key establishment by random mode mixing in multimode fibers and optical reciprocity,” Opt. Eng. 58(01), 1–10 (2019).

19. I. Zaman, A. Lopez, M. Faruque, and O. Boyraz, “Physical Layer Cryptographic Key Generation by Exploiting PMD of an Optical Fiber Link,” J. Lightwave Technol. 36(24), 5903–5911 (2018).

20. A. A. E. Hajomer, L. Zhang, X. Yang, and W. Hu, “284.8-Mb/s Physical-Layer Cryptographic Key Generation and Distribution in Fiber Networks,” J. Lightwave Technol. 39(6), 1595–1601 (2021).

21. L. Zhang, A. A. E. Hajomer, W. Hu, and X. Yang, “2.7 Gb/s secure key generation and distribution using bidirectional polarization scrambler in fiber,” IEEE Photonics Technol. Lett. 33(6), 289–292 (2021).

22. L. Zhang, X. Huang, W. Hu, and X. Yang, “Point to multi-point physical-layer key generation and distribution in passive optical networks,” Opt. Lett. 46(13), 3223–3226 (2021).

23. K. Zhu, J. Zhang, Y. Li, W. Wang, X. Liu, and Y. Zhao, “Experimental demonstration of error-free key distribution without an external random source or device over a 300-km optical fiber,” Opt. Lett. 47(10), 2570–2573 (2022).

24. Y. Ji, R. Gu, Z. Yang, J. Li, H. Li, and M. Zhang, “Artificial intelligence-driven autonomous optical networks: 3S architecture and key technologies,” Sci. China Inf. Sci. 63(6), 160301 (2020).

25. X. Zhao, F. Du, S. Geng, N. Sun, Y. Zhang, Z. Fu, and G. Wang, “Neural network and GBSM based time-varying and stochastic channel modeling for 5G millimeter wave communications,” China Commun. 16(6), 80–90 (2019).

26. B. Mthethwa and H. Xu, “Deep Learning-Based Wireless Channel Estimation for MIMO Uncoded Space-Time Labeling Diversity,” IEEE Access 8, 224608–224620 (2020).

27. Q. Bai, J. Wang, Y. Zhang, and J. Song, “Deep Learning-Based Channel Estimation Algorithm Over Time Selective Fading Channels,” IEEE Trans. Cogn. Commun. Netw. 6(1), 125–134 (2020).

28. H. Yang, Z. Niu, S. Xiao, J. Fang, Z. Liu, D. Fainsin, and Y. Li, “Fast and Accurate Optical Fiber Channel Modeling Using Generative Adversarial Network,” J. Lightwave Technol. 39(5), 1322–1333 (2021).

29. S. Ten and M. Edwards, “An Introduction to the Fundamentals of PMD in Fibers,” in WP5051 (2006).

30. D. Boiyo, S. Kuja, D. Waswa, G. Amolo, R. Gamatham, E. Kipnoo, T. Gibbon, and A. Leitch, “Effects of polarization mode dispersion (PMD) on Raman gain and PMD measurement using an optical fibre Raman amplifier,” in Africon, (IEEE, 2013), pp. 1–5.

31. Y. Wu, Y. Yu, Y. Hu, Y. Sun, T. Wang, and Q. Zhang, “Channel-Based Dynamic Key Generation for Physical Layer Security in OFDM-PON Systems,” IEEE Photonics J. 13(2), 1–9 (2021).

32. A. Hajomer, L. Zhang, X. Yang, and W. Hu, “Post-Processing Protocol for Physical-Layer Key Generation and Distribution in Fiber Networks,” IEEE Photonics Technol. Lett. 32(15), 901–904 (2020).

33. S. Chen, L. Zhang, W. Hu, and X. Yang, “Efficient Post-Processing for Physical-Layer Secure Key Distribution in Fiber,” IEEE Photonics Technol. Lett. 33(6), 325–328 (2021).

34. S. G. Zadeh and M. Schmid, “Bias in cross-entropy-based training of deep survival networks,” IEEE Trans. Pattern Anal. Mach. Intell. 43(9), 3126–3137 (2021).

35. M. Zhang, Y. Zhou, W. Quan, J. Zhu, R. Zheng, and Q. Wu, “Online Learning for IoT Optimization: A Frank–Wolfe Adam-Based Algorithm,” IEEE Internet Things J. 7(9), 8228–8237 (2020).

36. M. M. Kalayeh and M. Shah, “Training faster by separating modes of variation in batch-normalized models,” IEEE Trans. Pattern Anal. Mach. Intell. 42(6), 1483–1500 (2020).

37. X. Shen, X. Tian, T. Liu, F. Xu, and D. Tao, “Continuous dropout,” IEEE Trans. Neural Netw. Learning Syst. 29(9), 3926–3937 (2018).

38. L. Bassham III, A. Rukhin, J. Soto, J. Nechvatal, M. Smid, E. Barker, S. Leigh, M. Levenson, M. Vangel, D. Banks, et al., SP 800-22 Rev. 1a. A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications (National Institute of Standards & Technology, 2010).

Table 4. Time cost of training/generating final key

NN	Samples	Time (per sample)
Training	65536	12.355 ms
Generating final key	65536	0.1445 ms


Optics Express

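Table 3. The correlation coefficients for different locations of Eve: (a) 15 km, (b) 65 km, (c) 115 km.

	(a) Alice	(a) Bob	(a) Eve	(b) Alice	(b) Bob	(b) Eve	(c) Alice	(c) Bob	(c) Eve
Alice	1	0.99	0.39	1	1	0.51	1	1	0.023
Bob	0.99	1	0.25	1	1	0.51	1	1	0.023
Eve	0.39	0.25	1	0.51	0.51	1	0.023	0.023	1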