CN111917474A

CN111917474A - Implicit triple neural network and optical fiber nonlinear damage balancing method

Info

Publication number: CN111917474A
Application number: CN202010710931.6A
Authority: CN
Inventors: 杨爱英; 何品靖; 郭芃; 冯立辉; 忻向军
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2020-07-22
Filing date: 2020-07-22
Publication date: 2020-11-10
Anticipated expiration: 2040-07-22
Also published as: CN111917474B

Abstract

The invention relates to an implicit triple neural network and an optical fiber nonlinear damage balancing method, belonging to the technical field of optical fiber communication and balancing. The method comprises the following steps: 1) generating a training set, a verification set and a test set, specifically: generating a binary bit stream, generating a label symbol stream, acquiring a symbol stream to be processed, generating a sample and dividing a data set 2) optimizing an implicit triplet neural network to obtain an optimal implicit triplet neural network; initializing a hyper-parameter search process, initializing an adjustable parameter iteration process, calculating a loss function and gradients of all adjustable parameters, updating the adjustable parameters, iterating, evaluating an optimization result, and selecting an optimal implicit triple neural network; 3) and testing the implicit triple neural network to obtain the balanced signal. Compared with the prior art, the neural network and the method have lower calculation cost and can further improve the balance effect.

Description

Implicit triple neural network and optical fiber nonlinear damage balancing method

Technical Field

The invention relates to an implicit triple neural network and an optical fiber nonlinear damage balancing method, and belongs to the technical field of optical fiber communication and balancing.

Background

The capacity of fiber optic communication systems is limited by fiber optic linear and nonlinear impairments. With the development of optical fiber communication technology, the communication capacity of optical fiber communication systems has approached the shannon limit of the linear region. Further increasing the communication capacity of optical fiber communication systems requires breaking through the limitation of nonlinear damage of optical fibers. Typical optical fiber nonlinear compensation methods include digital signal processing-type compensation methods in addition to link optical compensation-type methods. Compensation methods for digital signal processing include digital back-propagation (DBP), methods based on a Volterra Series Transfer Function (VSTF), methods based on perturbation theory, and methods based on machine learning.

DBP and VSTF can effectively alleviate fiber nonlinear damage in signals. However, due to its recurrences, DBP and VSTF require unacceptable computational complexity. Methods based on perturbation theory, although not requiring recursion, require computation of a large-scale coefficient matrix in the case of large dispersion accumulated in the signal, and also require unacceptable computational complexity. In the method based on machine learning, the neural network is a potential optical fiber nonlinear equalization method. The neural network estimates an inverse transmission system of the optical fiber by fitting training data to achieve the aim of balancing nonlinear damage of the optical fiber. The use of neural networks for fiber nonlinear equalization does not require recursion, requiring lower computational complexity. Also, due to its adaptive nature, neural networks are likely to perform better than non-machine learning-like methods. However, the design of the neural network does not take any expert knowledge into consideration, which makes training of the neural network require a large amount of training data, and aggravates the cost of equalization. As an attempt to introduce expert knowledge, using signal-forming triplets as inputs can improve the equalization effect of neural networks. However, computing triples also requires a large amount of computation.

Disclosure of Invention

The invention aims to provide an implicit triple neural network and an optical fiber nonlinear damage balancing method aiming at the technical defects that the existing optical fiber nonlinear damage balancing method needs more training data and has higher complexity.

The invention is realized by the following technical scheme:

the implicit triplet neural network and the optical fiber nonlinear damage balancing method comprise the implicit triplet neural network used for optical fiber nonlinear damage balancing and the optical fiber nonlinear damage balancing method based on the implicit triplet neural network.

The implicit triplet neural network is defined by the following equation:

wherein subscript x/y represents either the x polarization state or the y polarization state;

the equalization result of the k-th symbol in the x or y polarization state; r is_x/y,kThe kth received symbol for either the x or y polarization state;^*representing a conjugation operation; beta is a_sIs the s-th anchor weight; g_sIs the s-th implicit triple kernel function; m is called the number of anchor points, and the number of anchor points M is more than or equal to 1; (r)_x/y,k,r_y/x,k) Middle, current person r_x/y,kIs in the x polarization state r_x,kThen r is the latter_y/x,kIs in the y polarization state r_y,k；r_x/y,kOr r_y/x,kThe information vector formed by a plurality of symbols adjacent to the kth received symbol specifically includes:

wherein D is a time window and is an odd number; the time windows D are all larger than or equal to 1, and superscript T represents transposition;

the s-th implicit triplet kernel is defined as:

g_s(r_x/y,k,r_y/x,k)＝P_x/y,s,k(|P_x/y,s,k|²+|P_y/x,s,k|²) (2)

wherein the content of the first and second substances,

namely:

wherein H is a conjugate transpose operation; alpha is alpha_sA parameter vector called the s-th implicit triplet kernel; the parameter vectors and the anchor weights of all the implicit triple kernel functions are called adjustable parameters;

the optical fiber nonlinear damage equalization method based on the implicit triplet neural network comprises the following steps:

step one, generating a data set;

the data set comprises a training set, a verification set and a test set;

the first step specifically comprises the following substeps:

step 1.1 generating a binary bit stream, i.e. randomly generating a binary bit stream b transmitted in the x polarization state_xAnd a binary bit stream b transmitted in the y polarization state_y；

Wherein the binary bit stream b_xAnd b_yOnly bit 0 and bit 1;

wherein, the occurrence probability of bit 0 and bit 1 is 50%;

step 1.2, generating a label symbol stream, specifically: b generated in step 1.1_xAnd b_yMapping the mapping table f to a constellation diagram respectively to obtain a label symbol stream s transmitted in an x polarization state_xAnd a stream s of label symbols transmitted in the y polarization state_y；

Wherein the mapping table f is determined by a modulation format;

step 1.3, obtaining a symbol stream to be processed, and generating a sample, specifically: step 1.2 generating a stream of label symbols s_xAnd s_yAfter transmission of an optical fiber communication system and pre-equalization, a symbol stream r to be processed in a corresponding x polarization state is obtained_xAnd a stream r of symbols to be processed of y polarization states_y；

Wherein, the pre-equalization comprises linear equalization or non-adaptive fiber nonlinear damage equalization; the combination of a symbol to be processed and a corresponding label symbol is called a sample;

step 1.4, data set division, specifically comprising:

randomly dividing a sample into three parts according to a certain proportion, and respectively using the three parts as a training set, a verification set and a test set;

step two, optimizing the implicit triplet neural network to obtain an optimal implicit triplet neural network; the method specifically comprises the following substeps:

step 2.1, initializing a hyper-parameter search process, namely setting a hyper-parameter candidate set, a maximum iteration number, a minimum optimization quantity, a maximum enduring step number, a loss function, a gradient optimization function, a batch processing size and distributing cache;

the cache comprises a loss cache, a hyper-parameter cache, an adjustable parameter cache and a verification result cache;

wherein, the initial value of the loss cache is 0;

the hyper-parameters comprise a time window D, an anchor point number M and a learning rate eta;

the super-parameter configuration comprises a time window, an anchor point number and a learning rate which are marked as (D, M, eta), and all the super-parameter configurations to be selected form a super-parameter candidate set;

wherein the loss function J is defined by the following formula (4):

wherein N is the batch size and is greater than or equal to 1; a is_iIs the ith output symbol of the network; s_iFor sum a in the label symbol stream_iA corresponding tag symbol;

wherein the gradient optimization function is a function taking the gradient of the loss function relative to the adjustable parameter as an independent variable and is recorded as u (grad)_θ)；

Wherein, grad_θThe gradient of the loss function relative to an adjustable parameter theta is determined, theta is a certain component of a weight vector of the anchor weight or the implicit triple kernel function, and the gradient optimization function is a bounded function with or without memory;

the adjustable parameters comprise parameter vectors and anchor weights of all implicit triple kernel functions;

wherein, the maximum iteration number is denoted as iter_maxThe minimum optimization is marked as I_minThe iteration times are marked as iter;

wherein the minimum optimization amount I_minGreater than 0;

step 2.2, initializing an adjustable parameter iteration process, specifically: initializing the iteration number iter to be 0, the tolerance step number to be 0, and the loss cache to be 0; selecting a group of hyper-parameter configurations from the hyper-parameter candidate set and applying the hyper-parameter configurations, and removing the selected hyper-parameter configurations from the hyper-parameter candidate set;

step 2.3, calculating a loss function, specifically: randomly extracting samples with the same number as the batch processing size N from a training set, sending the samples into the network to obtain N balanced symbols, calculating loss through a formula (4), calculating an optimization amount I, and finally storing the calculated loss into a loss cache; wherein, the optimization quantity I is the loss cache minus the loss;

step 2.4 calculating the gradient of the loss function in step 2.3 with respect to all adjustable parameters;

wherein the gradient is a Watertian lattice gradient;

step 2.5, updating the adjustable parameter theta and adding 1 to iter, and specifically realizing the method by updating the formula (5):

θ←θ-ηu(grad_θ) (5)

wherein theta is an adjustable parameter to be updated;

step 2.6, judging whether the adjustable parameter iteration is terminated, specifically:

2.6A if iter equals iter_maxThen jump to step 2.7;

2.6B if iter is less than iter_maxAnd further judging, specifically:

2.6BA if I_minIf the value is less than or equal to I, jumping to the step 2.3;

2.6BB F I_minIf the tolerance step number is larger than I, adding 1 to the tolerance step number, and jumping to the step 2.6C;

step 2.6C, judging whether to terminate the adjustable parameter iteration according to the tolerance step number, specifically:

step 2.6, if the tolerance step number is equal to the maximum tolerance step number, the CA jumps to step 2.7;

step 2.6CB, if the tolerance step number is less than the maximum tolerance step number, jumping to step 2.3;

step 2.7, evaluating the optimization result, specifically comprising the following steps: sending all samples in the verification set into the network to obtain a balanced signal, calculating the error rate of the signal, storing the error rate as a verification result into a verification result cache and attaching a number to the verification result cache; storing the hyper-parameter configuration into a hyper-parameter cache, and attaching a number which is the same as the verification result of the time; storing the adjustable parameters into an adjustable parameter cache, and attaching the same number as the verification result;

step 2.8, determining whether to end the hyper-parameter search process according to the number of elements in the hyper-parameter candidate set, specifically: if the number of the elements in the hyper-parameter candidate set is more than 0, jumping to the step 2.2; otherwise, if the number of the elements in the hyper-parameter candidate set is equal to 0, ending the hyper-parameter search process and jumping to the step 2.9;

step 2.9, selecting an optimal implicit triplet neural network, specifically: searching the verification result with the minimum value in the verification result cache, and respectively reading the hyper-parameter configuration and the adjustable parameter with the same number as the verification result with the minimum value from the hyper-parameter cache and the adjustable parameter cache to obtain the optimal hyper-parameter configuration and the optimal adjustable parameter; applying the optimal hyper-parameter configuration and the optimal adjustable parameter to obtain an optimal implicit triple neural network;

step three, testing the implicit triple neural network, specifically: and sending all samples in the test set into an optimal implicit triple neural network to obtain signals after the nonlinear damage of the optical fibers is balanced.

Advantageous effects

The invention discloses an optical fiber nonlinear damage equalization method based on an implicit triplet neural network, which has the following beneficial effects:

1. compared with recursive fiber nonlinear damage equalization methods such as DBP (doubly-bound power process) and the like, the method gets rid of recursive behaviors and requires lower calculation cost;

2. the implicit triplet neural network requires less training data than an artificial neural network;

3. the method can improve the equalization effect of the non-adaptive optical fiber nonlinear damage equalization method with lower calculation cost.

Drawings

Fig. 1 is a schematic diagram illustrating mapping from a bit stream to a constellation diagram in an embodiment 1 of a method for equalizing a hidden triplet neural network and a nonlinear impairment of an optical fiber according to the present invention;

fig. 2 is a schematic system diagram of a coherent optical communication simulation system in embodiment 1 of a method for equalizing implicit triplet neural networks and nonlinear impairments of optical fibers according to the present invention;

FIG. 3 is a schematic structural diagram of an implicit triplet neural network and a flowchart of a method for equalizing nonlinear damage of optical fibers according to the present invention;

fig. 4 is a diagram of an equalization result in embodiment 1 of an implicit triplet neural network and optical fiber nonlinear damage equalization method according to the present invention;

fig. 5 is a schematic diagram of computational complexity in an embodiment 1 of the method for equalizing nonlinear damage of an optical fiber based on an implicit triplet neural network according to the present invention.

Detailed Description

The following describes an implicit triplet neural network and an optical fiber nonlinear damage equalization method according to the present invention with reference to the accompanying drawings and specific embodiments. The following examples are provided only for illustrating the present invention in more detail, but are not intended to limit the scope of the present invention.

Example 1

Two Pseudo Random Bit Sequences (PRBS) are generated in each channel and are respectively used for two polarization state transmissions, and the two PRBS strings are mapped on a constellation diagram according to the mapping relation shown in figure 1 to obtain a dual-polarization 16QAM symbol which is used as a label symbol stream. A simulation system is built according to the system schematic diagram of the attached figure 2, the fiber incoming power and the transmission distance are changed (the fiber incoming power is from-4 dBm to 2dBm per channel, the stepping is 1, the transmission distance is from 2400km to 4000km, the stepping is 80km), and data sets under different conditions are generated according to the step one. The training set ranges in size from 3000 symbols to 32768 symbols, with the validation set and test set each ranging in size from 32768 symbols.

An implicit triplet neural network is constructed from the block diagram of figure 3a to be optimized. And aiming at the data sets under different conditions, obtaining the optimal implicit triplet neural network under different conditions by the flow shown in the attached figure 3 b. And (3) executing the third step on the optimal implicit triplet neural network under different conditions to obtain the result of the fiber nonlinear damage equalization shown in the figure 4.

Fig. 4a is a curve of Q factor variation with fiber-entering power of each channel obtained by different fiber nonlinear damage equalization methods after a dual-polarization 16QAM signal is transmitted by 2400 km. Fig. 4a shows that after the optimal implicit triplet neural network obtained in step two is used for fiber nonlinear damage equalization, the Q factor of the signal is improved by about 0.6dB under the condition of optimal fiber input power, namely-1 dBm per channel,

fig. 4b is a graph of Q factors obtained by different fiber nonlinear impairment equalization methods as a function of the length of the fiber link, when a dual-polarization 16QAM signal is transmitted at a fiber input power of-1 dBm per channel. The lowest Q factor for 20% soft-decision forward error correction (20% SD-FEC) is shown in the figure with a horizontal dashed line. Fig. 4b shows that the method and DBP per span 1 step (DBP1) improves the same transmission distance compared to linear equalization by about 10.3% with the target of the lowest Q factor required for 20% SD-FEC. When the method is cascaded after the DBP1, the transmission distance is improved by about 28.2% compared with linear equalization, and the equalization effect is the same as that of the DBP (ideal DBP) with the step size of 1 km.

Fig. 4c is a graph of the implicit triplet neural network and the artificial neural network, after the power of-1 dBm entering the fiber of each channel of the dual-polarization 16QAM signal is transmitted by 2400km, the equalization result of the nonlinear damage of the optical fiber to the dual-polarization 16QAM signal changes with the size of the training set. The equalization results of the DBP1 under the same conditions are shown in the figure in the form of a horizontal dashed line. Fig. 4c shows that the implicit triplet neural network requires a training set size of approximately 7200 symbols and the artificial neural network requires a training set size of approximately 13000 symbols, targeting the equalization effect of DBP 1.

Fig. 5 shows the algorithm complexity of different fiber nonlinear equalization algorithms for different transmission distances. Fig. 5 shows that the real number of multiplications required to equalize a symbol for the implicit triplet neural network is approximately 1100, while the real number of multiplications required to equalize a symbol for the DBP1 equalization method is greater than or equal to 3000. The complexity required for cascading DBP1 with the method is only about 1.5% of ideal DBP.

The computational cost of a classification method such as DBP is mainly derived from the recursive nature. The optical fiber nonlinear damage equalization method based on the implicit triple neural network does not need recursion, and can obtain an equalization result only by one-step calculation, so that the complexity is lower compared with recursive equalization methods such as DBP (direct binary Pattern) and the like, and the 1 st beneficial effect of the method is embodied.

An artificial neural network can fit any function with sufficient parameters, i.e., assuming a sufficiently large space, at the cost of requiring a large amount of training data. The implicit triplet neural network is designed with a priori knowledge of nonlinear damage of the optical fiber, has a smaller but more accurate hypothesis space compared with an artificial neural network, and therefore needs less training data. Compared with an artificial neural network, the training data required by the implicit triplet neural network is reduced by 40%, and the 2 nd beneficial effect of the method is embodied.

Non-adaptive fiber nonlinear damage equalization methods such as DBP are difficult to equalize random fiber nonlinear damage introduced by amplifier noise accumulation and out-of-band channels. The optical fiber nonlinear damage equalization method based on the implicit triplet neural network is a self-adaptive method, and can inhibit residual random optical fiber nonlinear damage after the non-self-adaptive optical fiber nonlinear damage equalization method. Cascading the network after the DBP1 increases the transmission distance as well as the ideal DBP, and the cascading method is less complex than the ideal DBP, embodying the 3 rd advantageous effect of the present invention.

In the second step, the number of anchor points to be selected ranges from 5 to 30, and the step is 5; the candidate range of the time window is 15 to 70, and the step is 5; the candidate ranges of the learning rate are 0.1, 0.01, 0.001 and 0.0001. The hyperparameter candidate set is composed of all possible combinations of the three hyperparameters in the candidate range. And (5) selecting hyper-parameter configurations of the optimal implicit triple neural network obtained in the step two, wherein the number of anchor points is 10, the time window is 31, and the learning rate is 0.001 under all simulation conditions. In step 2.1, the maximum iteration number is 30 times of the size of the data set, the minimum optimization number is 0.001, the maximum tolerance step number is 300, the batch processing size is 64, and a gradient optimization function u (grad)_θ)＝grad_θ。

In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. The implicit triplet neural network for the fiber nonlinear damage equalization is characterized in that: is defined by the formula:

the equalization result of the k-th symbol in the x or y polarization state;r_x/y,kthe kth received symbol for either the x or y polarization state;^*representing a conjugation operation; beta is a_sIs the s-th anchor weight; g_sIs the s-th implicit triple kernel function; m is called the number of anchor points, and the number of anchor points M is more than or equal to 1; (r)_x/y,k,r_y/x,k) Middle, current person r_x/y,kIs in the x polarization state r_x,kThen r is the latter_y/x,kIs in the y polarization state r_y,k；r_x/y,kOr r_y/x,kThe information vector formed by a plurality of symbols adjacent to the kth received symbol specifically includes:

the s-th implicit triplet kernel is defined as:

g_s(r_x/y,k,r_y/x,k)＝P_x/y,s,k(|P_x/y,s,k|²+|P_y/x,s,k|²) (2)

wherein the content of the first and second substances,

namely:

wherein H is a conjugate transpose operation; alpha is alpha_sA parameter vector called the s-th implicit triplet kernel; the parameter vectors and anchor weights of all implicit triplet kernel functions are called tunable parameters.

2. The implicit triplet neural network supported fiber nonlinear damage equalization method of claim 1, characterized in that: the method comprises the following steps:

step one, generating a data set;

the data set comprises a training set, a verification set and a test set;

wherein, the initial value of the loss cache is 0;

wherein the loss function J is defined by the following formula (4):

wherein the minimum optimization amount I_minGreater than 0;

wherein the gradient is a Watertian lattice gradient;

θ←θ-ηu(grad_θ) (5)

wherein theta is an adjustable parameter to be updated;

2.6A if iter equals iter_maxThen jump to step 2.7;

2.6B if iter is less than iter_maxAnd further judging, specifically:

2.6BA if I_minIf the value is less than or equal to I, jumping to the step 2.3;

3. The method according to claim 1, wherein: the first step specifically comprises the following substeps:

Step 1.2, generating a label symbol stream, specifically: b generated in step 1.1_xAnd b_yWarp reflectionMapping the emanator f to a constellation diagram respectively to obtain a label symbol stream s transmitted in an x polarization state_xAnd a stream s of label symbols transmitted in the y polarization state_y；

Wherein the mapping table f is determined by a modulation format;

The combination of a symbol to be processed and a corresponding label symbol is called a sample;

step 1.4, data set division, specifically comprising:

and randomly dividing the sample into three parts according to a certain proportion, and respectively using the three parts as a training set, a verification set and a test set.

4. The method according to claim 3, wherein: binary bit stream b in step 1.1_xAnd b_yContains only bit 0 and bit 1.

5. The method according to claim 3, wherein: in step 1.1, the probability of occurrence of both bit 0 and bit 1 is 50%.

6. The method according to claim 3, wherein: in step 1.3, the pre-equalization includes linear equalization or non-adaptive fiber nonlinear damage equalization.