CN111917474A - Implicit triplet neural network and optical fiber nonlinear impairment equalization method - Google Patents

Implicit triplet neural network and optical fiber nonlinear impairment equalization method

Publication number: CN111917474A (application CN202010710931.6A)
Authority: CN (China)
Prior art keywords: parameter, implicit, hyper-parameter, neural network, cache
Legal status: Granted
Application number: CN202010710931.6A
Other languages: Chinese (zh)
Other versions: CN111917474B
Inventor
杨爱英
何品靖
郭芃
冯立辉
忻向军
Current Assignee: Beijing Institute of Technology (BIT)
Original Assignee: Beijing Institute of Technology (BIT)
Application filed by Beijing Institute of Technology
Priority: CN202010710931.6A
Publication of CN111917474A; application granted; publication of CN111917474B
Legal status: Active


Classifications

    • H04B 10/2543 — Arrangements specific to fibre transmission for the reduction or elimination of distortion or dispersion due to fibre non-linearities, e.g. Kerr effect
    • G06N 3/045 — Neural networks: combinations of networks
    • G06N 3/08 — Neural networks: learning methods
    • H04B 10/516 — Transmitters: details of coding or modulation
    • H04B 10/6163 — Coherent receivers: compensation of non-linear effects in the fiber optic link, e.g. self-phase modulation [SPM], cross-phase modulation [XPM], four wave mixing [FWM]


Abstract

The invention relates to an implicit triplet neural network and an optical fiber nonlinear impairment equalization method, belonging to the technical field of optical fiber communication and equalization. The method comprises the following steps: 1) generate the training, validation, and test sets, specifically: generate a binary bit stream, generate a label symbol stream, obtain the symbol stream to be processed, generate samples, and divide the data set; 2) optimize the implicit triplet neural network to obtain the optimal implicit triplet neural network: initialize the hyper-parameter search process, initialize the adjustable-parameter iteration process, compute the loss function and the gradients of all adjustable parameters, update the adjustable parameters, iterate, evaluate the optimization result, and select the optimal implicit triplet neural network; 3) test the implicit triplet neural network to obtain the equalized signal. Compared with the prior art, the neural network and the method have a lower computational cost and further improve the equalization effect.

Description

Implicit triplet neural network and optical fiber nonlinear impairment equalization method
Technical Field
The invention relates to an implicit triplet neural network and an optical fiber nonlinear impairment equalization method, and belongs to the technical field of optical fiber communication and equalization.
Background
The capacity of optical fiber communication systems is limited by linear and nonlinear fiber impairments. With the development of optical fiber communication technology, the communication capacity of such systems has approached the Shannon limit of the linear regime, so further capacity increases require breaking through the limitation imposed by fiber nonlinear impairments. Typical fiber nonlinearity compensation methods include, besides optical in-link compensation, digital-signal-processing compensation methods: digital back-propagation (DBP), methods based on the Volterra series transfer function (VSTF), methods based on perturbation theory, and methods based on machine learning.
DBP and VSTF can effectively mitigate fiber nonlinear impairments in the signal, but owing to their recursive nature they incur unacceptable computational complexity. Methods based on perturbation theory avoid recursion, but when the signal has accumulated large dispersion they require the computation of a large coefficient matrix, which is likewise unacceptably complex. Among machine-learning methods, the neural network is a promising approach to fiber nonlinear equalization: it estimates the inverse transmission system of the fiber by fitting training data, thereby equalizing the fiber nonlinear impairment. Using a neural network for fiber nonlinear equalization requires no recursion and hence a lower computational complexity, and thanks to its adaptivity a neural network is likely to outperform non-machine-learning methods. However, a generic neural network design incorporates no expert knowledge, so its training requires a large amount of training data, which raises the cost of equalization. As an attempt to introduce expert knowledge, using triplets formed from the signal as inputs can improve the equalization effect of the neural network; computing the triplets, however, itself requires a large amount of computation.
Disclosure of Invention
The invention aims to provide an implicit triplet neural network and an optical fiber nonlinear impairment equalization method, addressing the technical defects that existing fiber nonlinear impairment equalization methods need large amounts of training data and have high complexity.
The invention is realized by the following technical scheme:
the implicit triplet neural network and the optical fiber nonlinear damage balancing method comprise the implicit triplet neural network used for optical fiber nonlinear damage balancing and the optical fiber nonlinear damage balancing method based on the implicit triplet neural network.
The implicit triplet neural network is defined by the following equation:

$$\hat{a}_{x/y,k} = r_{x/y,k} + \sum_{s=1}^{M} \beta_s\, g_s(\mathbf{r}_{x/y,k}, \mathbf{r}_{y/x,k}) \quad (1)$$

wherein the subscript x/y denotes either the x or the y polarization state; $\hat{a}_{x/y,k}$ is the equalization result of the k-th symbol of the x or y polarization state; $r_{x/y,k}$ is the k-th received symbol of the x or y polarization state; * denotes the conjugation operation; $\beta_s$ is the s-th anchor weight; $g_s$ is the s-th implicit triplet kernel function; M is called the anchor number, with M ≥ 1; in $(\mathbf{r}_{x/y,k}, \mathbf{r}_{y/x,k})$, if the former $\mathbf{r}_{x/y,k}$ is the x-polarization vector $\mathbf{r}_{x,k}$, then the latter $\mathbf{r}_{y/x,k}$ is the y-polarization vector $\mathbf{r}_{y,k}$; $\mathbf{r}_{x/y,k}$ (or $\mathbf{r}_{y/x,k}$) is the information vector formed by the symbols adjacent to the k-th received symbol, specifically:

$$\mathbf{r}_{x,k} = \left[r_{x,k-(D-1)/2},\; \ldots,\; r_{x,k},\; \ldots,\; r_{x,k+(D-1)/2}\right]^T$$

$$\mathbf{r}_{y,k} = \left[r_{y,k-(D-1)/2},\; \ldots,\; r_{y,k},\; \ldots,\; r_{y,k+(D-1)/2}\right]^T$$

wherein D is the time window, an odd number with D ≥ 1, and the superscript T denotes transposition;

the s-th implicit triplet kernel is defined as:

$$g_s(\mathbf{r}_{x/y,k}, \mathbf{r}_{y/x,k}) = P_{x/y,s,k}\left(|P_{x/y,s,k}|^2 + |P_{y/x,s,k}|^2\right) \quad (2)$$

wherein

$$P_{x/y,s,k} = \boldsymbol{\alpha}_s^H \mathbf{r}_{x/y,k} \quad (3)$$

wherein H denotes the conjugate transpose operation; $\boldsymbol{\alpha}_s$ is called the parameter vector of the s-th implicit triplet kernel; the parameter vectors and anchor weights of all implicit triplet kernel functions are called the adjustable parameters;
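To make the network definition concrete, the following is a minimal NumPy sketch of the forward pass: for each symbol, the information vectors of both polarizations are projected onto each kernel's parameter vector (P = α_s^H r, eq. (3)), the implicit triplet kernel of eq. (2) is evaluated, and the anchor-weighted sum forms the output. The exact form of eq. (1) is obscured by an image placeholder in the source, so the additive output form below (received symbol plus anchor-weighted kernel sum) is an assumption, and all names are illustrative.

```python
import numpy as np

def implicit_triplet_forward(rx, ry, alphas, betas, D):
    """Sketch of the implicit triplet neural network (eqs. (1)-(3)).

    rx, ry : 1-D complex arrays, received symbols of the two polarizations.
    alphas : (M, D) complex array, parameter vector of each of the M kernels.
    betas  : (M,) complex array, anchor weights.
    Assumption: the output is the received symbol plus the anchor-weighted
    sum of kernel outputs; edge symbols without a full window pass through.
    """
    half = (D - 1) // 2                      # D is the (odd) time window
    ax, ay = rx.astype(complex).copy(), ry.astype(complex).copy()
    for k in range(half, len(rx) - half):
        vx = rx[k - half:k + half + 1]       # information vector r_{x,k}
        vy = ry[k - half:k + half + 1]       # information vector r_{y,k}
        Px = alphas.conj() @ vx              # P_{x,s,k} = alpha_s^H r_{x,k}, eq. (3)
        Py = alphas.conj() @ vy
        gx = Px * (np.abs(Px) ** 2 + np.abs(Py) ** 2)   # eq. (2), x polarization
        gy = Py * (np.abs(Py) ** 2 + np.abs(Px) ** 2)   # eq. (2), y polarization
        ax[k] = rx[k] + betas @ gx           # assumed additive form of eq. (1)
        ay[k] = ry[k] + betas @ gy
    return ax, ay
```

With the anchor weights set to zero the network reduces to the identity, so under this assumed form training starts from the pre-equalized signal and the kernels model only the (small) nonlinear perturbation.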
the optical fiber nonlinear damage equalization method based on the implicit triplet neural network comprises the following steps:
step one, generating a data set;
the data set comprises a training set, a verification set and a test set;
the first step specifically comprises the following substeps:
Step 1.1, generate a binary bit stream, i.e. randomly generate a binary bit stream b_x to be transmitted in the x polarization state and a binary bit stream b_y to be transmitted in the y polarization state;
wherein the binary bit streams b_x and b_y contain only bit 0 and bit 1;
wherein bit 0 and bit 1 each occur with probability 50%;
Step 1.2, generate the label symbol stream, specifically: map the b_x and b_y generated in step 1.1 onto the constellation diagram through a mapping table f, obtaining the label symbol stream s_x transmitted in the x polarization state and the label symbol stream s_y transmitted in the y polarization state;
wherein the mapping table f is determined by the modulation format;
Step 1.3, obtain the symbol streams to be processed and generate samples, specifically: after the label symbol streams s_x and s_y generated in step 1.2 are transmitted through the optical fiber communication system and pre-equalized, the corresponding symbol stream to be processed r_x of the x polarization state and symbol stream to be processed r_y of the y polarization state are obtained;
wherein pre-equalization comprises linear equalization or non-adaptive fiber nonlinear impairment equalization; the combination of a symbol to be processed and its corresponding label symbol is called a sample;
Step 1.4, data set division, specifically:
randomly divide the samples into three parts in a chosen proportion, used respectively as the training set, the validation set and the test set;
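Steps 1.1–1.4 can be sketched as follows. The 16QAM bit-to-level table below is an illustrative Gray mapping, not necessarily the patent's mapping table f of Fig. 1, and the stream length and split proportions are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def map_16qam(bits):
    """Step 1.2: map 4 bits per symbol onto a 16QAM constellation.

    The 2-bit -> amplitude-level table is an illustrative Gray code;
    the patent's actual mapping table f is given in its Fig. 1.
    """
    level = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}
    b = bits.reshape(-1, 4)
    i = np.array([level[tuple(row)] for row in b[:, :2]])
    q = np.array([level[tuple(row)] for row in b[:, 2:]])
    return (i + 1j * q) / np.sqrt(10)        # normalize to unit mean power

# Step 1.1: equiprobable binary bit streams for the two polarizations.
bx = rng.integers(0, 2, 4 * 1000)
by = rng.integers(0, 2, 4 * 1000)
sx, sy = map_16qam(bx), map_16qam(by)        # label symbol streams

# Step 1.3 would transmit sx, sy through the (simulated) fiber link and
# pre-equalize, yielding rx, ry; here we only illustrate the bookkeeping.
rx, ry = sx.copy(), sy.copy()

# Step 1.4: random split of sample indices into training/validation/test.
idx = rng.permutation(len(sx))
train, val, test = np.split(idx, [600, 800])  # 60/20/20 split (illustrative)
```

A sample, in the patent's sense, pairs a symbol to be processed (rx[k], ry[k]) with its label symbol (sx[k], sy[k]); the split above partitions sample indices so the three sets are disjoint.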
step two, optimizing the implicit triplet neural network to obtain an optimal implicit triplet neural network; the method specifically comprises the following substeps:
Step 2.1, initialize the hyper-parameter search process, namely set the hyper-parameter candidate set, the maximum iteration number, the minimum optimization amount, the maximum tolerated step number, the loss function, the gradient optimization function and the batch size, and allocate the caches;
the cache comprises a loss cache, a hyper-parameter cache, an adjustable parameter cache and a verification result cache;
wherein, the initial value of the loss cache is 0;
the hyper-parameters comprise a time window D, an anchor point number M and a learning rate eta;
A hyper-parameter configuration comprises a time window, an anchor number and a learning rate, denoted (D, M, η); all candidate hyper-parameter configurations form the hyper-parameter candidate set;
wherein the loss function J is defined by the following formula (4):

$$J = \frac{1}{N}\sum_{i=1}^{N} \left| a_i - s_i \right|^2 \quad (4)$$

wherein N is the batch size, N ≥ 1; $a_i$ is the i-th output symbol of the network; $s_i$ is the label symbol in the label symbol stream corresponding to $a_i$;
wherein the gradient optimization function is a function taking the gradient of the loss function with respect to an adjustable parameter as its argument, denoted $u(\mathrm{grad}_\theta)$;
wherein $\mathrm{grad}_\theta$ is the gradient of the loss function with respect to an adjustable parameter θ, θ being an anchor weight or a component of a parameter vector of an implicit triplet kernel function; the gradient optimization function is a bounded function with or without memory;
wherein the adjustable parameters comprise the parameter vectors and anchor weights of all implicit triplet kernel functions;
wherein the maximum iteration number is denoted iter_max, the minimum optimization amount is denoted I_min, and the iteration count is denoted iter;
wherein the minimum optimization amount I_min is greater than 0;
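Formula (4) is a mean squared error over a mini-batch of complex symbols (the source shows it only as an image placeholder, so the MSE form is inferred from the surrounding text); a one-line sketch with illustrative names:

```python
import numpy as np

def loss_J(a, s):
    """Eq. (4): J = (1/N) * sum_i |a_i - s_i|^2 over a batch of N symbols."""
    a, s = np.asarray(a), np.asarray(s)
    return float(np.mean(np.abs(a - s) ** 2))
```

The loss is real-valued even though the equalized symbols a_i and label symbols s_i are complex, which is what makes the gradient-descent update of step 2.5 well defined.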
Step 2.2, initialize the adjustable-parameter iteration process, specifically: initialize the iteration count iter to 0, the tolerated step count to 0, and the loss cache to 0; select a hyper-parameter configuration from the hyper-parameter candidate set, apply it, and remove the selected configuration from the candidate set;
Step 2.3, compute the loss function, specifically: randomly draw from the training set a number of samples equal to the batch size N, feed them into the network to obtain N equalized symbols, compute the loss by formula (4), compute the optimization amount I, and finally store the computed loss into the loss cache; wherein the optimization amount I is the loss cache minus the current loss;
step 2.4 calculating the gradient of the loss function in step 2.3 with respect to all adjustable parameters;
wherein the gradient is a Wirtinger gradient;
Step 2.5, update the adjustable parameter θ and increment iter by 1, implemented by update formula (5):

$$\theta \leftarrow \theta - \eta\, u(\mathrm{grad}_\theta) \quad (5)$$

wherein θ is the adjustable parameter to be updated;
Step 2.6, judge whether the adjustable-parameter iteration terminates, specifically:
2.6A: if iter equals iter_max, jump to step 2.7;
2.6B: if iter is less than iter_max, judge further, specifically:
2.6BA: if I_min ≤ I, jump to step 2.3;
2.6BB: if I_min > I, increment the tolerated step count by 1 and jump to step 2.6C;
Step 2.6C, judge whether to terminate the adjustable-parameter iteration according to the tolerated step count, specifically:
2.6CA: if the tolerated step count equals the maximum tolerated step count, jump to step 2.7;
2.6CB: if the tolerated step count is less than the maximum tolerated step count, jump to step 2.3;
Step 2.7, evaluate the optimization result, specifically: feed all samples of the validation set into the network to obtain the equalized signal, compute the bit error rate of the signal, and store the error rate as the validation result into the validation result cache with an attached number; store the hyper-parameter configuration into the hyper-parameter cache with the same number as this validation result; store the adjustable parameters into the adjustable-parameter cache with the same number as this validation result;
Step 2.8, decide whether to end the hyper-parameter search process according to the number of elements in the hyper-parameter candidate set, specifically: if the number of elements in the candidate set is greater than 0, jump to step 2.2; otherwise, if it equals 0, end the hyper-parameter search process and jump to step 2.9;
Step 2.9, select the optimal implicit triplet neural network, specifically: find the validation result with the smallest value in the validation result cache, and read from the hyper-parameter cache and the adjustable-parameter cache the hyper-parameter configuration and adjustable parameters bearing the same number as that smallest validation result, obtaining the optimal hyper-parameter configuration and optimal adjustable parameters; apply them to obtain the optimal implicit triplet neural network;
Step three, test the implicit triplet neural network, specifically: feed all samples of the test set into the optimal implicit triplet neural network to obtain the signal after fiber nonlinear impairment equalization.
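The optimization flow of steps 2.2–2.9 is, in essence, a grid search over hyper-parameter configurations with an early-stopping ("tolerated step") inner loop. A condensed sketch follows, using a toy differentiable objective in place of the actual network and plain SGD as the gradient optimization function u — both assumptions for illustration; in the patent the selection metric of step 2.7 is the bit error rate on the validation set, not the training loss used here.

```python
import numpy as np

def train_one_config(loss_and_grad, theta0, eta,
                     iter_max=500, I_min=1e-6, patience=50):
    """Steps 2.2-2.6: gradient iteration with a tolerated-step count.

    loss_and_grad(theta) -> (loss, gradient); u(grad) = grad (plain SGD,
    as in the embodiment).  Stops at iter_max iterations, or once the
    optimization amount I has fallen below I_min `patience` times.
    """
    theta, loss_cache, endured = theta0.copy(), 0.0, 0
    loss = None
    for _ in range(iter_max):
        loss, grad = loss_and_grad(theta)    # step 2.3: compute the loss
        I = loss_cache - loss                # optimization amount
        loss_cache = loss                    # store loss into the loss cache
        theta = theta - eta * grad           # step 2.5: update formula (5)
        if I < I_min:                        # steps 2.6BB / 2.6C
            endured += 1
            if endured == patience:
                break
    return theta, loss

def hyper_search(loss_and_grad, theta0, candidate_etas):
    """Steps 2.2 and 2.7-2.9: try every candidate, keep the best result."""
    cache = []                               # numbered (result, eta, theta)
    for eta in candidate_etas:               # step 2.8: until set is empty
        theta, val = train_one_config(loss_and_grad, theta0, eta)
        cache.append((val, eta, theta))      # step 2.7: fill the caches
    return min(cache, key=lambda entry: entry[0])  # step 2.9: smallest value
```

Note the fidelity to the stated procedure: the tolerated-step count is never reset when improvement resumes, and because the loss cache starts at 0 the first iteration always registers a non-improvement, exactly as the steps are written.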
Advantageous effects
The invention discloses a fiber nonlinear impairment equalization method based on an implicit triplet neural network, with the following beneficial effects:
1. Compared with recursive fiber nonlinear impairment equalization methods such as digital back-propagation (DBP), the method avoids recursion and requires a lower computational cost;
2. The implicit triplet neural network requires less training data than an artificial neural network;
3. The method can improve, at low computational cost, the equalization effect of non-adaptive fiber nonlinear impairment equalization methods.
Drawings
Fig. 1 is a schematic diagram of the mapping from the bit stream to the constellation diagram in embodiment 1 of the implicit triplet neural network and fiber nonlinear impairment equalization method of the present invention;
Fig. 2 is a schematic diagram of the coherent optical communication simulation system in embodiment 1;
Fig. 3 shows the structure of the implicit triplet neural network and the flowchart of the fiber nonlinear impairment equalization method of the present invention;
Fig. 4 shows the equalization results in embodiment 1;
Fig. 5 shows the computational complexity in embodiment 1.
Detailed Description
The following describes the implicit triplet neural network and fiber nonlinear impairment equalization method of the present invention with reference to the drawings and a specific embodiment. The following example only illustrates the invention in more detail and does not limit its scope.
Example 1
Two pseudo-random bit sequences (PRBS) are generated in each channel for the two polarization states and mapped onto the constellation according to the mapping relation of Fig. 1, obtaining dual-polarization 16QAM symbols used as the label symbol stream. A simulation system is built according to Fig. 2; the launch power and the transmission distance are varied (launch power from -4 dBm to 2 dBm per channel in steps of 1 dB; transmission distance from 2400 km to 4000 km in steps of 80 km), and data sets for the different conditions are generated according to step one. The training set size ranges from 3000 to 32768 symbols; the validation set and the test set each contain 32768 symbols.
The implicit triplet neural network to be optimized is constructed from the block diagram of Fig. 3a. For the data sets under the different conditions, the optimal implicit triplet neural networks are obtained by the flow of Fig. 3b. Step three is executed on these optimal networks, yielding the fiber nonlinear impairment equalization results of Fig. 4.
Fig. 4a shows the Q factor versus per-channel launch power obtained by different fiber nonlinear impairment equalization methods after a dual-polarization 16QAM signal is transmitted over 2400 km. It shows that after equalization with the optimal implicit triplet neural network obtained in step two, the Q factor of the signal improves by about 0.6 dB at the optimal launch power of -1 dBm per channel.
Fig. 4b shows the Q factor versus fiber link length obtained by different fiber nonlinear impairment equalization methods for a dual-polarization 16QAM signal transmitted at a launch power of -1 dBm per channel. The minimum Q factor for 20% soft-decision forward error correction (20% SD-FEC) is shown as a horizontal dashed line. Taking that threshold as the target, the method and DBP with 1 step per span (DBP1) both improve the transmission distance by about 10.3% compared with linear equalization. When the method is cascaded after DBP1, the transmission distance improves by about 28.2% compared with linear equalization, matching the equalization effect of DBP with a step size of 1 km (ideal DBP).
Fig. 4c shows how the fiber nonlinear impairment equalization results of the implicit triplet neural network and of an artificial neural network vary with the training set size, for a dual-polarization 16QAM signal transmitted over 2400 km at a launch power of -1 dBm per channel. The equalization result of DBP1 under the same conditions is shown as a horizontal dashed line. Taking the equalization effect of DBP1 as the target, the implicit triplet neural network requires a training set of approximately 7200 symbols, while the artificial neural network requires approximately 13000 symbols.
Fig. 5 shows the complexity of different fiber nonlinear equalization algorithms for different transmission distances. The number of real multiplications required to equalize one symbol is approximately 1100 for the implicit triplet neural network, and at least 3000 for DBP1. The complexity of cascading DBP1 with the method is only about 1.5% of that of ideal DBP.
The computational cost of recursive methods such as DBP derives mainly from their recursive nature. The fiber nonlinear impairment equalization method based on the implicit triplet neural network needs no recursion and obtains the equalization result in a single pass, so its complexity is lower than that of recursive equalization methods such as DBP; this embodies the first beneficial effect of the invention.
An artificial neural network can fit any function given enough parameters, i.e., it has a sufficiently large hypothesis space, at the cost of requiring a large amount of training data. The implicit triplet neural network is designed with prior knowledge of fiber nonlinear impairments; compared with an artificial neural network it has a smaller but more accurate hypothesis space and therefore needs less training data. The training data required by the implicit triplet neural network is about 40% less than that of an artificial neural network; this embodies the second beneficial effect of the invention.
Non-adaptive fiber nonlinear impairment equalization methods such as DBP struggle to equalize the random fiber nonlinear impairments introduced by accumulated amplifier noise and out-of-band channels. The fiber nonlinear impairment equalization method based on the implicit triplet neural network is adaptive and can suppress the random fiber nonlinear impairment remaining after a non-adaptive equalization method. Cascading the network after DBP1 increases the transmission distance as much as ideal DBP does, at far lower complexity; this embodies the third beneficial effect of the invention.
In step two, the candidate range of the anchor number is 5 to 30 in steps of 5; the candidate range of the time window is 15 to 70 in steps of 5; the candidate learning rates are 0.1, 0.01, 0.001 and 0.0001. The hyper-parameter candidate set consists of all combinations of the three hyper-parameters within these candidate ranges. Under all simulation conditions, the optimal implicit triplet neural network obtained in step two has 10 anchors, a time window of 31 and a learning rate of 0.001. In step 2.1, the maximum iteration number is 30 times the data set size, the minimum optimization amount is 0.001, the maximum tolerated step number is 300, the batch size is 64, and the gradient optimization function is u(grad_θ) = grad_θ.
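The hyper-parameter candidate set of the embodiment can be enumerated directly; a small sketch, with the (D, M, η) ordering following the notation of step 2.1:

```python
from itertools import product

anchor_counts = range(5, 31, 5)        # anchor number M: 5, 10, ..., 30
time_windows = range(15, 71, 5)        # time window D: 15, 20, ..., 70
learning_rates = [0.1, 0.01, 0.001, 0.0001]

# Hyper-parameter candidate set: every (D, M, eta) combination.
candidate_set = [(D, M, eta)
                 for D, M, eta in product(time_windows, anchor_counts,
                                          learning_rates)]
print(len(candidate_set))              # 12 * 6 * 4 = 288 configurations
```

As a consistency check, the reported optimal anchor number (10) and learning rate (0.001) lie on this grid; the reported optimal time window of 31 does not fall on a step-5 grid starting at 15, which may be an artifact of the machine translation of the embodiment's ranges.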
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. An implicit triplet neural network for fiber nonlinear impairment equalization, characterized in that it is defined by the formula:

$$\hat{a}_{x/y,k} = r_{x/y,k} + \sum_{s=1}^{M} \beta_s\, g_s(\mathbf{r}_{x/y,k}, \mathbf{r}_{y/x,k}) \quad (1)$$

wherein the subscript x/y denotes either the x or the y polarization state; $\hat{a}_{x/y,k}$ is the equalization result of the k-th symbol of the x or y polarization state; $r_{x/y,k}$ is the k-th received symbol of the x or y polarization state; * denotes the conjugation operation; $\beta_s$ is the s-th anchor weight; $g_s$ is the s-th implicit triplet kernel function; M is called the anchor number, with M ≥ 1; in $(\mathbf{r}_{x/y,k}, \mathbf{r}_{y/x,k})$, if the former $\mathbf{r}_{x/y,k}$ is the x-polarization vector $\mathbf{r}_{x,k}$, then the latter $\mathbf{r}_{y/x,k}$ is the y-polarization vector $\mathbf{r}_{y,k}$; $\mathbf{r}_{x/y,k}$ (or $\mathbf{r}_{y/x,k}$) is the information vector formed by the symbols adjacent to the k-th received symbol, specifically:

$$\mathbf{r}_{x,k} = \left[r_{x,k-(D-1)/2},\; \ldots,\; r_{x,k},\; \ldots,\; r_{x,k+(D-1)/2}\right]^T$$

$$\mathbf{r}_{y,k} = \left[r_{y,k-(D-1)/2},\; \ldots,\; r_{y,k},\; \ldots,\; r_{y,k+(D-1)/2}\right]^T$$

wherein D is the time window, an odd number with D ≥ 1, and the superscript T denotes transposition;

the s-th implicit triplet kernel is defined as:

$$g_s(\mathbf{r}_{x/y,k}, \mathbf{r}_{y/x,k}) = P_{x/y,s,k}\left(|P_{x/y,s,k}|^2 + |P_{y/x,s,k}|^2\right) \quad (2)$$

wherein

$$P_{x/y,s,k} = \boldsymbol{\alpha}_s^H \mathbf{r}_{x/y,k} \quad (3)$$

wherein H denotes the conjugate transpose operation; $\boldsymbol{\alpha}_s$ is called the parameter vector of the s-th implicit triplet kernel; the parameter vectors and anchor weights of all implicit triplet kernel functions are called the adjustable parameters.
2. A fiber nonlinear impairment equalization method supported by the implicit triplet neural network of claim 1, characterized in that it comprises the following steps:
step one, generating a data set;
the data set comprises a training set, a verification set and a test set;
step two, optimizing the implicit triplet neural network to obtain an optimal implicit triplet neural network; the method specifically comprises the following substeps:
Step 2.1, initialize the hyper-parameter search process, namely set the hyper-parameter candidate set, the maximum iteration number, the minimum optimization amount, the maximum tolerated step number, the loss function, the gradient optimization function and the batch size, and allocate the caches;
the cache comprises a loss cache, a hyper-parameter cache, an adjustable parameter cache and a verification result cache;
wherein, the initial value of the loss cache is 0;
the hyper-parameters comprise a time window D, an anchor point number M and a learning rate eta;
A hyper-parameter configuration comprises a time window, an anchor number and a learning rate, denoted (D, M, η); all candidate hyper-parameter configurations form the hyper-parameter candidate set;
wherein the loss function J is defined by the following formula (4):

$$J = \frac{1}{N}\sum_{i=1}^{N} \left| a_i - s_i \right|^2 \quad (4)$$

wherein N is the batch size, N ≥ 1; $a_i$ is the i-th output symbol of the network; $s_i$ is the label symbol in the label symbol stream corresponding to $a_i$;
wherein the gradient optimization function is a function taking the gradient of the loss function with respect to an adjustable parameter as its argument, denoted $u(\mathrm{grad}_\theta)$;
wherein $\mathrm{grad}_\theta$ is the gradient of the loss function with respect to an adjustable parameter θ, θ being an anchor weight or a component of a parameter vector of an implicit triplet kernel function; the gradient optimization function is a bounded function with or without memory;
wherein the adjustable parameters comprise the parameter vectors and anchor weights of all implicit triplet kernel functions;
wherein the maximum iteration number is denoted iter_max, the minimum optimization amount is denoted I_min, and the iteration count is denoted iter;
wherein the minimum optimization amount I_min is greater than 0;
Step 2.2, initialize the adjustable-parameter iteration process, specifically: initialize the iteration count iter to 0, the tolerated step count to 0, and the loss cache to 0; select a hyper-parameter configuration from the hyper-parameter candidate set, apply it, and remove the selected configuration from the candidate set;
Step 2.3, compute the loss function, specifically: randomly draw from the training set a number of samples equal to the batch size N, feed them into the network to obtain N equalized symbols, compute the loss by formula (4), compute the optimization amount I, and finally store the computed loss into the loss cache; wherein the optimization amount I is the loss cache minus the current loss;
step 2.4, calculating the gradient of the loss function in step 2.3 with respect to all adjustable parameters;
wherein the gradient is the Wirtinger gradient;
step 2.5, updating the adjustable parameters θ and incrementing iter by 1, implemented via the update formula (5):
θ ← θ − η · u(grad_θ)    (5)
wherein θ is the adjustable parameter to be updated;
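The update of formula (5) can be sketched as follows; since the claim only requires u to be a bounded function with or without memory, the memoryless sign function (as used in signSGD) is taken here as one illustrative choice of u:

```python
import numpy as np

def update_parameters(theta, grad_theta, eta):
    """One update step of formula (5): theta <- theta - eta * u(grad_theta).
    Here u is the sign function, a bounded memoryless choice; the claim
    permits any bounded gradient optimization function."""
    u = np.sign(grad_theta)
    return theta - eta * u
```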
step 2.6, judging whether the adjustable-parameter iteration terminates, specifically:
step 2.6A, if iter equals iter_max, jump to step 2.7;
step 2.6B, if iter is less than iter_max, judge further, specifically:
step 2.6BA, if I_min is less than or equal to I, jump to step 2.3;
step 2.6BB, if I_min is greater than I, increment the tolerance step count by 1 and jump to step 2.6C;
step 2.6C, judging whether to terminate the adjustable-parameter iteration according to the tolerance step count, specifically:
step 2.6CA, if the tolerance step count equals the maximum tolerance step count, jump to step 2.7;
step 2.6CB, if the tolerance step count is less than the maximum tolerance step count, jump to step 2.3;
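The branching of steps 2.6A through 2.6CB can be sketched as a small helper; the function name `should_stop` and its tuple return value are illustrative, not from the claim:

```python
def should_stop(iter_count, iter_max, I, I_min, tol_steps, max_tol_steps):
    """Termination test of step 2.6. Returns (stop, new_tol_steps)."""
    if iter_count == iter_max:
        return True, tol_steps              # 2.6A: iteration budget exhausted
    if I >= I_min:
        return False, tol_steps             # 2.6BA: enough improvement, continue
    tol_steps += 1                          # 2.6BB: insufficient improvement
    return tol_steps == max_tol_steps, tol_steps  # 2.6C: tolerance exhausted?
```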
step 2.7, evaluating the optimization result, specifically: feed all samples in the verification set into the network to obtain equalized signals and compute their bit error rate; store the bit error rate as the verification result in the verification result cache with an attached number; store the hyper-parameter configuration in the hyper-parameter cache with the same number as this verification result; store the adjustable parameters in the adjustable-parameter cache with the same number as this verification result;
step 2.8, determining whether to end the hyper-parameter search process according to the number of elements in the hyper-parameter candidate set, specifically: if the number of the elements in the hyper-parameter candidate set is more than 0, jumping to the step 2.2; otherwise, if the number of the elements in the hyper-parameter candidate set is equal to 0, ending the hyper-parameter search process and jumping to the step 2.9;
step 2.9, selecting an optimal implicit triplet neural network, specifically: searching the verification result with the minimum value in the verification result cache, and respectively reading the hyper-parameter configuration and the adjustable parameter with the same number as the verification result with the minimum value from the hyper-parameter cache and the adjustable parameter cache to obtain the optimal hyper-parameter configuration and the optimal adjustable parameter; applying the optimal hyper-parameter configuration and the optimal adjustable parameter to obtain an optimal implicit triple neural network;
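Step 2.9 can be sketched as follows, assuming the three caches are dictionaries keyed by the shared number attached in step 2.7 (an illustrative data layout, not specified in the claim):

```python
def select_best(verification_cache, hyper_cache, param_cache):
    """Step 2.9: find the numbered verification result with the smallest
    value (bit error rate) and return the hyper-parameter configuration
    and adjustable parameters stored under the same number."""
    best_id = min(verification_cache, key=verification_cache.get)
    return hyper_cache[best_id], param_cache[best_id]
```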
step three, testing the implicit triple neural network, specifically: feed all samples in the test set into the optimal implicit triple neural network to obtain signals in which the fiber nonlinear damage has been equalized.
3. The method according to claim 1, wherein: the first step specifically comprises the following substeps:
step 1.1, generating binary bit streams, i.e. randomly generating a binary bit stream b_x transmitted in the x polarization state and a binary bit stream b_y transmitted in the y polarization state;
step 1.2, generating label symbol streams, specifically: the b_x and b_y generated in step 1.1 are each mapped onto the constellation diagram via the mapper f to obtain the label symbol stream s_x transmitted in the x polarization state and the label symbol stream s_y transmitted in the y polarization state;
wherein the mapper f is determined by the modulation format;
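As one illustrative instance of the mapper f, a Gray-coded QPSK mapping is shown below; the claim leaves f generic, fixed only by the modulation format, so the QPSK choice and the function name are assumptions:

```python
import numpy as np

def map_bits_qpsk(bits):
    """Illustrative mapper f for a QPSK modulation format: each pair of
    bits is mapped to a Gray-coded, unit-power constellation point."""
    bits = np.asarray(bits).reshape(-1, 2)
    # Gray mapping: bit 0 -> +1, bit 1 -> -1 on each axis, then normalize.
    return ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2)
```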
step 1.3, obtaining the symbol streams to be processed and generating samples, specifically: the label symbol streams s_x and s_y generated in step 1.2 pass through an optical fiber communication system and pre-equalization to obtain the corresponding symbol stream to be processed r_x in the x polarization state and the symbol stream to be processed r_y in the y polarization state;
wherein the combination of a symbol to be processed and its corresponding label symbol is called a sample;
step 1.4, data set division, specifically comprising:
randomly dividing the samples into three parts in a certain proportion, to serve as the training set, the verification set and the test set respectively.
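Step 1.4 can be sketched as follows; the 70/15/15 proportion and the fixed seed are illustrative, since the claim only requires "a certain proportion":

```python
import numpy as np

def split_dataset(samples, ratios=(0.7, 0.15, 0.15), seed=0):
    """Step 1.4: randomly divide the samples into training, verification
    and test sets in the given proportion."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    n_train = int(ratios[0] * len(samples))
    n_val = int(ratios[1] * len(samples))
    train = [samples[i] for i in idx[:n_train]]
    val = [samples[i] for i in idx[n_train:n_train + n_val]]
    test = [samples[i] for i in idx[n_train + n_val:]]
    return train, val, test
```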
4. The method according to claim 3, wherein: the binary bit streams b_x and b_y in step 1.1 contain only bit 0 and bit 1.
5. The method according to claim 3, wherein: in step 1.1, the probability of occurrence of both bit 0 and bit 1 is 50%.
6. The method according to claim 3, wherein: in step 1.3, the pre-equalization includes linear equalization or non-adaptive fiber nonlinear damage equalization.
CN202010710931.6A 2020-07-22 2020-07-22 Implicit triple neural network and optical fiber nonlinear damage balancing method Active CN111917474B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010710931.6A CN111917474B (en) 2020-07-22 2020-07-22 Implicit triple neural network and optical fiber nonlinear damage balancing method

Publications (2)

Publication Number Publication Date
CN111917474A true CN111917474A (en) 2020-11-10
CN111917474B CN111917474B (en) 2022-07-29

Family

ID=73280197


Country Status (1)

Country Link
CN (1) CN111917474B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113285758A (en) * 2021-05-18 2021-08-20 成都信息工程大学 Optical fiber nonlinear equalization method based on IPCA-DNN algorithm
CN113656333A (en) * 2021-10-20 2021-11-16 之江实验室 Method for accelerating deep learning training task data loading
CN114006657A (en) * 2021-10-27 2022-02-01 北京理工大学 Nonlinear parameter optimization method based on Gaussian pulse peak power distribution
CN115208721A (en) * 2022-06-23 2022-10-18 上海交通大学 Volterra-like neural network equalizer construction method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160127047A1 (en) * 2013-05-13 2016-05-05 Xieon Networks S.A.R.L. Method, device and communication system for reducing optical transmission impairments
CN109379132A (en) * 2018-12-05 2019-02-22 北京理工大学 A kind of apparatus and method of low speed coherent detection and neural network estimation fibre-optical dispersion
US20190393965A1 (en) * 2018-06-22 2019-12-26 Nec Laboratories America, Inc Optical fiber nonlinearity compensation using neural networks
CN110636020A (en) * 2019-08-05 2019-12-31 北京大学 Neural network equalization method for adaptive communication system


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CATANESE CLARA等: "A Survey of Neural Network Applications in Fiber Nonlinearity Mitigation", 《21ST INTERNATIONAL CONFERENCE ON TRANSPARENT OPTICAL NETWORKS (ICTON)》 *
WANG ZIYI等: "CNN based OSNR estimation method for long haul optical fiber communication systems", 《ASIA COMMUNICATIONS AND PHOTONICS CONFERENCE (ACP)》 *
YANG SHUANGMING等: "Scalable Digital Neuromorphic Architecture for Large-Scale Biophysically Meaningful Neural Network With Multi-Compartment Neurons", 《IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS》 *
LI YAJIE et al.: "Overview of Research on Artificial-Intelligence-Based Optical Fiber Nonlinearity Equalization Algorithms", Telecommunications Science *




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant