CN113852434A - LSTM and ResNet assisted deep learning end-to-end intelligent communication method and system - Google Patents


Info

Publication number: CN113852434A (granted version: CN113852434B)
Application number: CN202111113281.8A
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: layer, lstm, output, input, sending
Legal status: granted; active
Inventors: 张皓天, 姜园, 张琳
Original and current assignee: Sun Yat Sen University
Application filed by Sun Yat Sen University

Classifications

    • H04B 17/3912: Monitoring; testing of propagation channels; modelling the propagation channel; simulation models, e.g. distribution of spectral power density or received signal strength indicator [RSSI] for a given geographic region
    • G06N 3/044: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology; recurrent networks, e.g. Hopfield networks
    • G06N 3/045: Neural network architecture; combinations of networks
    • G06N 3/048: Neural network architecture; activation functions
    • G06N 3/084: Neural network learning methods; backpropagation, e.g. using gradient descent
    • H04W 24/02: Supervisory, monitoring or testing arrangements; arrangements for optimising operational condition
    • H04W 24/06: Testing, supervising or monitoring using simulated traffic
    • Y02D 30/70: Reducing energy consumption in wireless communication networks


Abstract

The invention discloses an LSTM and ResNet assisted deep learning end-to-end intelligent communication method, which applies long short-term memory (LSTM) units in an end-to-end intelligent communication system to perform joint coding/decoding and modulation/demodulation on grouped bit sequences in a practical communication system, thereby improving the error-rate performance of the system. The invention further applies a residual network (ResNet) structure in the end-to-end intelligent communication system to effectively accelerate the convergence of the neural network and avoid the vanishing-gradient and exploding-gradient problems that may otherwise occur.

Description

LSTM and ResNet assisted deep learning end-to-end intelligent communication method and system
Technical Field
The invention relates to the technical field of communication, in particular to an LSTM and ResNet assisted deep learning end-to-end intelligent communication method and system.
Background
As a powerful machine learning technique, deep learning can efficiently learn the linear or nonlinear mappings required by users and has been widely applied in fields such as image classification, speech recognition and language translation. Meanwhile, future communication systems need to be deeply integrated with artificial intelligence and machine learning to further improve performance and realize intelligent communication, a goal also put forward at the world's first 6G summit held in March 2019.
Current research combining deep learning with communication systems is usually limited to using deep learning to optimize a single module, e.g., demodulation or decoding. Optimizing modules separately is unlikely to reach a global optimum, which limits the performance ceiling the system can achieve. Meanwhile, both traditional optimization algorithms and deep-learning-based modular optimization struggle to deliver satisfactory performance when the communication environment is unknown, greatly limiting the practicality of high-speed communication systems such as 5G in complex environments. Finally, the accelerating turnover of communication technologies in recent years strains the construction cost of communication infrastructure and the compatibility and upgradability of equipment, which further highlights the advantage of an intelligent communication system that can switch neural network structures and adjust parameters on demand at low cost.
To solve the above problems while exploiting the advantages of deep learning, end-to-end intelligent communication systems based on deep learning have been proposed. Such a system treats the whole transmission chain as a black box and jointly optimizes all components of the transmitting and receiving ends through a neural network, yielding a globally optimal end-to-end intelligent communication system that covers coding, modulation, demodulation and decoding. Compared with traditional communication systems and machine-learning-assisted systems optimized module by module, an end-to-end intelligent communication system can effectively adapt to unknown communication environments and to nonlinearities imposed by communication equipment or the channel. At the same time, thanks to deep learning, the communication system can optimize its algorithms and parameters quickly and inexpensively.
However, existing end-to-end intelligent communication systems still have shortcomings; most importantly, the neural network structures they use lack improvements tailored to communication applications, so there remains substantial room for performance gains.
Chinese patent publication No. CN110460402A, published on 15 November 2019, discloses a method for establishing an end-to-end communication system based on deep learning. Its process is divided into two stages. First, an autoencoder neural network incorporating a channel layer is established and initially trained on randomly generated data, yielding an encoding scheme robust to channel interference. Then, a USRP collects communication data over a large number of real channels, and the collected data is used as a training set to train the decoding layer separately so that it performs well under practical conditions. While this patent uses deep learning, the neural network architecture it employs likewise lacks improvements tailored to communication applications.
Disclosure of Invention
The invention aims to provide an LSTM and ResNet assisted deep learning end-to-end intelligent communication method, which effectively utilizes user data and improves the communication performance of a system.
It is a further object of this invention to provide an LSTM and ResNet assisted deep learning end-to-end intelligent communication system.
In order to solve the technical problems, the technical scheme of the invention is as follows:
an LSTM and ResNet assisted deep learning end-to-end intelligent communication method comprises the following steps:
s1: randomly generating binary data as data to be sent by a user;
s2: setting a simulation channel environment;
s3: initializing a deep neural network, wherein the deep neural network comprises a sending block, a noise layer and a receiving block, and binary data randomly generated in the step S1 is processed by the sending block and then output to the noise layer;
s4: the noise layer scrambles and adds noise to the signal processed by the sending block according to the simulation channel environment set in the step S2, and sends the signal to the receiving block;
s5: the receiving block obtains estimated user data after using the reverse processing of the sending block;
s6: training the deep neural network according to the difference between the estimated user data obtained in the step S5 and the user data to be sent in the step S1;
s7: the method comprises the steps of setting a trained sending block at a communication sending end, using binary data bits needing to be sent as input vectors of the sending block, using output vectors of the sending block as sending signals to be sent to a wireless channel, setting a trained receiving block at a communication receiving end, using the trained receiving block as the input vectors of the receiving block after the communication receiving end receives the signals, and obtaining estimated user sending data after processing.
Preferably, the step S2 sets a channel simulation environment, specifically:
when the actual channel environment is known, simulation is directly carried out according to the probability distribution of the known actual channel environment;
if the actual channel environment is unknown, the probability distribution of the unknown channel is adaptively fitted and learned through a generative adversarial network and used for the simulation.
Preferably, the sending block of the deep neural network in step S3 comprises an LSTM layer, a fully connected layer and a regularization layer, wherein the input of the LSTM layer is the data to be sent by the user; the output of the LSTM layer is superimposed with the input of the LSTM layer and then used as the input of the fully connected layer, forming a ResNet structure; the output of the fully connected layer is the input of the regularization layer; and the output of the regularization layer is the input of the noise layer.
Preferably, the data to be sent of the user is converted into a one-hot vector and then input to the LSTM layer of the sending block.
Preferably, the LSTM layer computes the output vector h_t at time t from the external input vector x_t at time t and the output vector h_{t-1} at time t-1:

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
h_t = o_t ⊙ tanh(c_t)

where [h_{t-1}, x_t] denotes the concatenation of h_{t-1} and x_t; W_f, b_f, W_i, b_i, W_c, b_c, W_o and b_o are trainable parameters; σ(·) denotes the sigmoid activation function, tanh(·) the tanh activation function, and ⊙ element-wise multiplication; c_{t-1} is the memory cell vector at time t-1; and f_t, i_t and o_t are the forget gate, input gate and output gate vectors, respectively. The forget gate f_t controls which elements of c_{t-1} are preserved and which are weakened; the input gate i_t determines which elements of the candidate update vector c̃_t are added to the memory cell at the current time; f_t, i_t and c̃_t together generate the new memory cell vector c_t. Finally, o_t determines which elements of tanh(c_t) are output, yielding the output vector h_t at time t.
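The gate computation described above can be written out directly. The following NumPy sketch is a generic LSTM step under the stated notation; the layer sizes and the parameter initialization are illustrative assumptions, not the patent's trained values.

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step implementing the six gate equations.
    params holds W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    z = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    f = sigmoid(params["W_f"] @ z + params["b_f"])        # forget gate
    i = sigmoid(params["W_i"] @ z + params["b_i"])        # input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate update
    c = f * c_prev + i * c_tilde                          # new memory cell
    o = sigmoid(params["W_o"] @ z + params["b_o"])        # output gate
    h = o * np.tanh(c)                                    # output vector h_t
    return h, c

rng = np.random.default_rng(1)
nx, nh = 4, 8                                 # illustrative sizes
params = {}
for g in "fico":                              # gates f, i, c, o
    params[f"W_{g}"] = rng.normal(scale=0.1, size=(nh, nx + nh))
    params[f"b_{g}"] = np.zeros(nh)

h, c = np.zeros(nh), np.zeros(nh)
for t in range(3):                            # process a short input sequence
    h, c = lstm_step(rng.normal(size=nx), h, c, params)
```

Because h_t and c_t feed back into the next step, the unit carries information across time, which is what lets it process several bit groups jointly.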
Preferably, the receiving block of the deep neural network comprises two fully connected layers and one LSTM unit with a ResNets structure, wherein:
the input of the first fully connected layer is the output of the noise layer, and its output is the input of the LSTM unit with the ResNet structure; the output of the LSTM unit with the ResNet structure is superimposed with the output of the first fully connected layer and used as the input of the second fully connected layer; the output of the second fully connected layer is a probability vector over the one-hot representation of the user data to be sent; the index of the largest element of this probability vector is taken as the transmitted user information, and the estimated user data is obtained from it.
Preferably, in step S6, the deep neural network is trained according to the difference between the estimation data obtained in step S5 and the binary data randomly generated in step S1, specifically:
the method comprises the steps of calculating a loss function value between estimated user data output by a deep neural network and user data to be sent according to preset loss functions, quantitatively measuring the difference between actual output and expected output, calculating the partial derivative of a loss value to each trainable parameter in the deep neural network through a back propagation algorithm, updating the parameters of the deep neural network by using a preset optimization algorithm, and reducing the difference between the estimated user data of the deep neural network and the user data to be sent.
Preferably, in step S7, if the actual channel varies slowly, retraining with low time cost and low computation cost is performed by methods including transfer learning, and deep neural network training and deployment are carried out synchronously by means including parallel training.
An LSTM and ResNets assisted deep learning end-to-end intelligent communication system, wherein the system uses the LSTM and ResNets assisted deep learning end-to-end intelligent communication method, the system comprising:
the generating module is used for randomly generating binary data as data to be sent by a user;
the channel simulation module is used for setting a simulation channel environment;
the device comprises an initialization network module, a data processing module and a data processing module, wherein the initialization network module is used for initializing a deep neural network, the deep neural network comprises a sending block, a noise layer and a receiving block, and binary data randomly generated by the generation module are output to the noise layer after being processed by the sending block;
the noise layer scrambles and adds noise to the signal processed by the sending block according to the simulation channel environment set by the channel simulation module and sends the signal to the receiving block;
a receiving processing module, wherein the receiving block obtains estimated user data after using the processing opposite to that of the sending block;
the training module trains the deep neural network according to the difference between the estimated user data obtained by the receiving and processing module and the data to be sent by the user in the generating module;
and the deployment module is used for deploying the trained sending block at the communication transmitting end, using the binary data bits to be sent as the input vector of the sending block, sending the output vector of the sending block to the wireless channel as the transmit signal, deploying the trained receiving block at the communication receiving end, using the received signal as the input vector of the receiving block, and obtaining the estimated user transmit data after processing.
Preferably, the sending block of the deep neural network comprises an LSTM layer, a fully connected layer and a regularization layer, wherein the input of the LSTM layer is the data to be sent by the user; the output of the LSTM layer is superimposed with the input of the LSTM layer and then used as the input of the fully connected layer, forming a ResNet structure; the output of the fully connected layer is the input of the regularization layer; and the output of the regularization layer is the input of the noise layer.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the invention applies a long short-term memory (LSTM) unit to carry out joint coding and decoding and modulation and demodulation processing on a packet bit sequence in an actual communication system, and effectively utilizes the characteristics of user data to improve the performance of a Symbol Error Rate (SER) of the system; meanwhile, the invention provides that a residual network (ResNet) structure is applied in an end-to-end intelligent communication system to effectively improve the convergence speed of the neural network and avoid the problems of gradient disappearance and gradient explosion which can occur.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of a deep neural network structure according to the present invention.
Fig. 3 is a schematic diagram of the structure of the LSTM layer.
Fig. 4 is a schematic diagram of the structure of ResNets.
Fig. 5 is a schematic diagram comparing the SER performance of the communication system using the proposed method with that of the conventional Hamming-coded, BPSK-modulated communication system under an AWGN channel.
Fig. 6 is a schematic diagram comparing the SER performance of the method of the present invention with that of the conventional Hamming-coded, BPSK-modulated communication system under a Rayleigh fast fading channel.
Fig. 7 is a schematic diagram comparing the SER performance of the method of the present invention with that of conventional Hamming coding under different channel estimation accuracy conditions.
Fig. 8 is a schematic diagram comparing, under an AWGN channel, the SER performance of the communication system whose deep neural network includes the ResNet structure with that of the system without it.
Fig. 9 is a schematic diagram of the system structure of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
The embodiment provides an LSTM and ResNets assisted deep learning end-to-end intelligent communication method, as shown in fig. 1, including the following steps:
s1: randomly generating binary data as data to be sent by a user;
s2: setting a simulation channel environment;
s3: initializing a deep neural network, wherein the deep neural network comprises a sending block, a noise layer and a receiving block as shown in fig. 2, and binary data randomly generated in step S1 is processed by the sending block and then output to the noise layer;
s4: the noise layer scrambles and adds noise to the signal processed by the sending block according to the simulation channel environment set in the step S2, and sends the signal to the receiving block;
s5: the receiving block obtains estimated user data after using the reverse processing of the sending block;
s6: training the deep neural network according to the difference between the estimated user data obtained in the step S5 and the user data to be sent in the step S1;
s7: the method comprises the steps of setting a trained sending block at a communication sending end, using binary data bits needing to be sent as input vectors of the sending block, using output vectors of the sending block as sending signals to be sent to a wireless channel, setting a trained receiving block at a communication receiving end, using the trained receiving block as the input vectors of the receiving block after the communication receiving end receives the signals, and obtaining estimated user sending data after processing.
Steps S1 to S6 constitute the training stage of the invention and are used to optimize the trainable parameters of the deep neural network; step S7 is the deployment stage, in which the deep neural network directly loads the optimized parameters, encoding and modulating the bit data to be sent at the transmitting end and demodulating and decoding the received signal at the receiving end in real time.
In step S2, a channel simulation environment is set, specifically:
when the actual channel environment is known, for example a Gaussian channel or a Rayleigh fading channel, the output signal s(t) of the neural network sending block is scrambled and noised directly according to the probability distribution of the known channel environment to obtain the received signal r(t), i.e., r(t) = I(s(t)), where I(·) denotes the mapping from input signal to output signal set according to the known channel environment during training;
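For the known-channel case, the mapping I(·) can be sketched for the two example channels named above; the SNR parameterization, the function names, and the per-symbol fading model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def awgn(s, snr_db):
    """Known Gaussian channel: r(t) = I(s(t)) adds complex white noise
    scaled to the requested signal-to-noise ratio."""
    p = np.mean(np.abs(s) ** 2)
    sigma = np.sqrt(p / 10 ** (snr_db / 10) / 2)
    return s + sigma * (rng.normal(size=s.shape) + 1j * rng.normal(size=s.shape))

def rayleigh(s, snr_db):
    """Rayleigh fast fading: an independent complex Gaussian gain per
    symbol (scrambling), followed by additive white Gaussian noise."""
    h = (rng.normal(size=s.shape) + 1j * rng.normal(size=s.shape)) / np.sqrt(2)
    return awgn(h * s, snr_db)

s = np.exp(2j * np.pi * rng.random(7))   # 7 unit-power symbols
r = rayleigh(s, snr_db=10.0)             # received signal r(t) = I(s(t))
```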
if the actual channel environment is unknown, the probability distribution of the unknown channel is adaptively fitted and learned through a generative adversarial network and used for the simulation.
The sending block of the deep neural network in step S3 is shown in fig. 3 and comprises an LSTM layer, a fully connected layer and a regularization layer, wherein the input of the LSTM layer is the data to be sent by the user; the output of the LSTM layer is superimposed with the input of the LSTM layer and used as the input of the fully connected layer, forming a ResNet structure; the output of the fully connected layer is the input of the regularization layer; and the output of the regularization layer is the input of the noise layer.
The data to be sent by the user is converted into a one-hot vector before being input to the LSTM layer of the sending block; in a one-hot vector, exactly one element is 1 and all others are 0. For example, if the user data is processed in groups of 2 bits, each group d carries 2 bits of information and has 4 possible values; if the transmitting end sends the third of these possibilities (d = 3), the input one-hot vector is l_d = [0, 0, 1, 0].
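The conversion can be sketched as follows; the function name and the reading of the bit group as a binary index are assumptions.

```python
import numpy as np

def bits_to_onehot(bits):
    """Map a group of bits to its one-hot vector; the group index is the
    bits read as a binary number, so exactly one element is 1."""
    d = int("".join(map(str, bits)), 2)
    l_d = np.zeros(2 ** len(bits), dtype=int)
    l_d[d] = 1
    return l_d

# A 2-bit group has 4 possibilities; group "10" is the third of them
# (d = 3 when counted from 1), giving the one-hot vector [0, 0, 1, 0]:
print(bits_to_onehot([1, 0]))   # prints [0 0 1 0]
```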
The LSTM layer computes the output vector h_t at time t from the external input vector x_t at time t and the output vector h_{t-1} at time t-1:

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
h_t = o_t ⊙ tanh(c_t)

where [h_{t-1}, x_t] denotes the concatenation of h_{t-1} and x_t; W_f, b_f, W_i, b_i, W_c, b_c, W_o and b_o are trainable parameters; σ(·) denotes the sigmoid activation function, tanh(·) the tanh activation function, and ⊙ element-wise multiplication; c_{t-1} is the memory cell vector at time t-1; and f_t, i_t and o_t are the forget gate, input gate and output gate vectors, respectively. The forget gate f_t controls which elements of c_{t-1} are preserved and which are weakened; the input gate i_t determines which elements of the candidate update vector c̃_t are added to the memory cell at the current time; f_t, i_t and c̃_t together generate the new memory cell vector c_t. Finally, o_t determines which elements of tanh(c_t) are output, yielding the output vector h_t at time t.
The LSTM layer captures the dynamic characteristics of the input vectors, so coding/decoding and modulation/demodulation can effectively exploit the correlation of the user data across time periods, improving the reliability of information transmission. Even in the special case where the user data of each time period are completely independent, the LSTM-assisted neural network can still jointly code and modulate the user data of multiple time periods. This preserves the advantage of global optimization while increasing the constraint length of the code at an unchanged code rate, effectively improving the reliability of the system.
The fully connected layer is responsible for applying further nonlinear operations to the output of the LSTM layer, deepening the neural network and improving its learning capability. The regularization layer in the sending block normalizes the coded and modulated signal so as to keep the transmit signal power at a fixed value.
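The regularization layer's normalization can be sketched as follows; the unit target power and the function name are assumptions.

```python
import numpy as np

def normalize_power(s, target_power=1.0):
    """Regularization layer: scale the coded/modulated signal so that its
    average per-symbol power equals target_power."""
    p = np.mean(np.abs(s) ** 2)
    return s * np.sqrt(target_power / p)

rng = np.random.default_rng(4)
s = rng.normal(size=14) + 1j * rng.normal(size=14)  # raw block output
x = normalize_power(s)                              # fixed transmit power
```

After the call, np.mean(np.abs(x) ** 2) equals 1 up to floating-point error, regardless of the raw amplitude the previous layers produced.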
The receiving block of the deep neural network is responsible for learning and performing the inverse operation of the sending block, so as to recover the user data bits as correctly as possible from the disturbed received signal. It comprises two fully connected layers and one LSTM unit with a ResNet structure, wherein:
the input of the first fully connected layer is the output of the noise layer, and its output is the input of the LSTM unit with the ResNet structure; the output of the LSTM unit with the ResNet structure is superimposed with the output of the first fully connected layer and used as the input of the second fully connected layer; the output of the second fully connected layer is a probability vector over the one-hot representation of the user data to be sent; the index of the largest element of this probability vector is taken as the transmitted user information, and the estimated user data is obtained from it.
As shown in fig. 4, in the ResNet structure, suppose the input vector is x, the mapping to be realized is g(x), and the actual output of the target layer is f(x). Before the ResNet structure is introduced, the layer must be trained so that f(x) = g(x); after the ResNet structure is introduced, the training target becomes h(x) = f(x) + x = g(x), i.e., f(x) = g(x) - x. With this structure, the deep neural network performs residual learning at the designated layers, which avoids network degradation, reduces the possibility of vanishing gradients during training and improves the convergence efficiency of the neural network.
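A minimal sketch of the shortcut connection, assuming a single ReLU layer as f(x); in the invention the wrapped layers are the LSTM/fully connected blocks described above.

```python
import numpy as np

def f(x, W, b):
    return np.maximum(W @ x + b, 0.0)    # the target layer's mapping f(x)

def residual_block(x, W, b):
    return f(x, W, b) + x                # ResNet: h(x) = f(x) + x

rng = np.random.default_rng(5)
n = 8
W, b = rng.normal(scale=0.01, size=(n, n)), np.zeros(n)
x = rng.normal(size=n)
y = residual_block(x, W, b)
# With W near zero, f(x) is near zero, so the block approximates the
# identity: the layer only has to learn the small residual g(x) - x
# instead of the full mapping g(x).
```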
Step S6 trains the deep neural network according to the difference between the estimated data obtained in step S5 and the binary data randomly generated in step S1, specifically:
the method comprises the steps of calculating a loss function value between estimated user data output by a deep neural network and user data to be sent according to preset loss functions, quantitatively measuring the difference between actual output and expected output, calculating the partial derivative of a loss value to each trainable parameter in the deep neural network through a back propagation algorithm, updating the parameters of the deep neural network by using a preset optimization algorithm, and reducing the difference between the estimated user data of the deep neural network and the user data to be sent.
Note that in step S6 the deep neural network is in the training phase, and the sending block and the receiving block are trained and optimized jointly. In the deployment stage after training, the sending block and the receiving block of the neural network are deployed at the transmitting end and the receiving end, respectively, to perform the coding and modulation and the demodulation and decoding of the user bit data.
In step S7, if the actual channel varies slowly, retraining with low time cost and low computation cost is performed by methods including transfer learning, and deep neural network training and deployment are carried out synchronously by means including parallel training.
In a specific embodiment, the wireless channel over which the signal propagates is assumed to be an AWGN channel, and 4 bits of user information must be transmitted over 7 channel uses. Under these conditions, the SER performance of the proposed LSTM and ResNet assisted deep learning end-to-end intelligent communication system is compared with that of a conventional Hamming-coded, BPSK-modulated communication system; the result is shown in fig. 5, where Q in the legend denotes the number of groups of 4-bit user information jointly processed by the LSTM unit at a time. As fig. 5 shows, under the AWGN channel the SER performance of the proposed system is better than that of the conventional system with hard-decision decoding and similar to that of the conventional system with maximum-likelihood decoding, demonstrating the effectiveness of the invention. In addition, the neural network using LSTM units with the ResNet structure achieves better SER performance than the one using LSTM units without the ResNet structure, showing that the adopted ResNet structure effectively improves the reliability of the end-to-end intelligent communication system.
With other conditions unchanged, the wireless channel is set as a Rayleigh fast fading channel, and the SER performance of the LSTM and ResNet assisted deep learning end-to-end intelligent communication system is again compared with that of the traditional Hamming coding and BPSK modulation communication system; the result is shown in FIG. 6. Under this more complex communication environment, the end-to-end intelligent communication system achieves better SER performance than the traditional system, and the performance gain grows as Q increases. This further demonstrates the reliability of the proposed system and shows that the LSTM unit can indeed improve the SER performance of information transmission by jointly processing multiple groups of user data bits.
With other conditions unchanged, the wireless channel is set as a Rayleigh fast fading channel, and the SER performance of the proposed system and the traditional communication system is tested under different channel estimation accuracy conditions to reflect the robustness of the system. The results are shown in FIG. 7. Let the actual channel coefficient be h and the channel estimation accuracy be ρ; the estimated channel coefficient ĥ can be expressed as:

ĥ = ρh + √(1 − ρ²)·ε
wherein ε is a complex Gaussian variable with mean 0 and variance 1, and ρ is a constant between 0 and 1. In the training stage, ρ is fixed at 1, i.e., the channel estimation during training is perfectly accurate; in the testing stage, the SER performance of each system is tested under different values of ρ to reflect the robustness of the system when the test condition deviates from the training condition. As can be seen from FIG. 7, when the channel estimation is imperfect and the test condition deviates from the training condition, the proposed system still achieves better SER performance than the traditional communication system for Q = 4 and Q = 8, which demonstrates its good robustness.
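The imperfect-estimation model above can be sketched in NumPy as follows. The function name and the RNG handling are illustrative assumptions, not from the patent; only the formula ĥ = ρh + √(1 − ρ²)·ε with ε ~ CN(0, 1) follows the text.

```python
import numpy as np

def estimate_channel(h, rho, rng=None):
    """Imperfect channel estimate: h_hat = rho*h + sqrt(1 - rho**2)*eps,
    where eps ~ CN(0, 1) and rho in [0, 1] is the estimation accuracy."""
    rng = np.random.default_rng(rng)
    # Complex Gaussian with mean 0 and total variance 1 (0.5 per component).
    eps = (rng.standard_normal(np.shape(h)) +
           1j * rng.standard_normal(np.shape(h))) / np.sqrt(2)
    return rho * h + np.sqrt(1.0 - rho ** 2) * eps

# rho = 1 reproduces the training condition (perfect estimation): h_hat == h.
h = np.array([0.3 + 0.4j, -1.2 + 0.1j])
assert np.allclose(estimate_channel(h, rho=1.0), h)
```

Setting ρ = 0 yields an estimate that is pure noise, which is the opposite extreme tested in FIG. 7.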
With other conditions unchanged, the SER performance of the neural network with the ResNet structure and of the neural network without it is measured after each training epoch to reflect the convergence efficiency of the system. As shown in FIG. 8, during training the deep neural network adopted by the proposed system improves its SER performance faster before reaching saturation, and attains better SER performance once training saturates, compared with a neural network without the ResNet structure. The neural network adopted by the system therefore converges more efficiently and has a lower training time cost.
Embodiment 2

This embodiment provides an LSTM and ResNet assisted deep learning end-to-end intelligent communication system that uses the LSTM and ResNet assisted deep learning end-to-end intelligent communication method described in Embodiment 1. As shown in FIG. 9, the system includes:
the generating module is used for randomly generating binary data as data to be sent by a user;
the channel simulation module is used for setting a simulation channel environment;
the device comprises an initialization network module, a data processing module and a data processing module, wherein the initialization network module is used for initializing a deep neural network, the deep neural network comprises a sending block, a noise layer and a receiving block, and binary data randomly generated by the generation module are output to the noise layer after being processed by the sending block;
the noise layer scrambles and adds noise to the signal processed by the sending block according to the simulation channel environment set by the channel simulation module and sends the signal to the receiving block;
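The scrambling and noise addition performed by the noise layer can be sketched as follows, assuming a per-sample Rayleigh fading coefficient as the "scrambling" and AWGN at a given SNR. The function signature, the SNR parameterization, and the unit-signal-power assumption are illustrative, not from the patent.

```python
import numpy as np

def noise_layer(x, snr_db, fading=True, rng=None):
    """Sketch of the noise layer: optional per-sample Rayleigh fading
    (the 'scrambling') followed by additive white Gaussian noise."""
    rng = np.random.default_rng(rng)
    if fading:
        # CN(0, 1) fading coefficient per sample (Rayleigh-distributed magnitude).
        h = (rng.standard_normal(x.shape) +
             1j * rng.standard_normal(x.shape)) / np.sqrt(2)
        x = h * x
    sigma2 = 10 ** (-snr_db / 10)  # noise variance, assuming unit signal power
    n = np.sqrt(sigma2 / 2) * (rng.standard_normal(x.shape) +
                               1j * rng.standard_normal(x.shape))
    return x + n
```

With `fading=False` the layer reduces to the AWGN channel used in the FIG. 5 experiment; with `fading=True` it corresponds to the Rayleigh fast fading setting of FIG. 6.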
a receiving processing module, wherein the receiving block applies the inverse of the sending block's processing to obtain the estimated user data;
the training module trains the deep neural network according to the difference between the estimated user data obtained by the receiving and processing module and the data to be sent by the user in the generating module;
and the deployment module is used for deploying the trained sending block at the communication transmitting end, taking the binary data bits to be sent as the input vector of the sending block and sending the output vector of the sending block to the wireless channel as the transmit signal; the trained receiving block is deployed at the communication receiving end, the received signal is used as the input vector of the receiving block, and the estimated user transmitted data is obtained after processing.
The sending block of the deep neural network comprises an LSTM layer, a fully connected layer and a regularization layer, wherein the input of the LSTM layer is the data to be sent by the user; the output of the LSTM layer is superimposed with the input of the LSTM layer to form a ResNet structure and serves as the input of the fully connected layer; the output of the fully connected layer is the input of the regularization layer; and the output of the regularization layer is the input of the noise layer.
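A minimal NumPy sketch of the sending block's forward pass follows. The LSTM layer is passed in as a callable, the layer widths are arbitrary, and the regularization layer is assumed here to be average-power normalization; these choices are illustrative interpretations, not specified by the patent.

```python
import numpy as np

def sending_block(x, lstm_layer, W_fc, b_fc):
    """Sketch of the sending block: LSTM layer, ResNet shortcut
    (LSTM output + LSTM input), fully connected layer, then a
    regularization layer, assumed to be average-power normalization."""
    h = lstm_layer(x)                       # LSTM layer (same width as x)
    r = h + x                               # ResNet structure: output + input
    s = r @ W_fc + b_fc                     # fully connected layer
    return s / np.sqrt(np.mean(np.abs(s) ** 2))   # regularization layer

# Toy usage with an identity stand-in for the LSTM layer.
rng = np.random.default_rng(0)
x = rng.standard_normal(16)                 # e.g. a one-hot-sized input vector
W_fc = rng.standard_normal((16, 7))         # map to 7 channel uses, as in FIG. 5
tx = sending_block(x, lambda v: v, W_fc, np.zeros(7))
```

The power normalization enforces a transmit-power constraint so that the noise layer's SNR is well defined; any other regularization (e.g. per-symbol amplitude limiting) would slot into the same position.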
The same or similar reference numerals correspond to the same or similar parts;
the terms describing positional relationships in the drawings are for illustrative purposes only and are not to be construed as limiting the patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. An LSTM and ResNet assisted deep learning end-to-end intelligent communication method is characterized by comprising the following steps:
s1: randomly generating binary data as data to be sent by a user;
s2: setting a simulation channel environment;
s3: initializing a deep neural network, wherein the deep neural network comprises a sending block, a noise layer and a receiving block, and binary data randomly generated in the step S1 is processed by the sending block and then output to the noise layer;
s4: the noise layer scrambles and adds noise to the signal processed by the sending block according to the simulation channel environment set in the step S2, and sends the signal to the receiving block;
s5: the receiving block obtains estimated user data after using the reverse processing of the sending block;
s6: training the deep neural network according to the difference between the estimated user data obtained in the step S5 and the user data to be sent in the step S1;
s7: the method comprises the steps of setting a trained sending block at a communication sending end, using binary data bits needing to be sent as input vectors of the sending block, using output vectors of the sending block as sending signals to be sent to a wireless channel, setting a trained receiving block at a communication receiving end, using the trained receiving block as the input vectors of the receiving block after the communication receiving end receives the signals, and obtaining estimated user sending data after processing.
2. The LSTM and ResNets-assisted deep learning end-to-end intelligent communication method according to claim 1, wherein the channel simulation environment is set in step S2, specifically:
when the actual channel environment is known, simulation is directly carried out according to the probability distribution of the known actual channel environment;
if the actual channel environment is unknown, the probability distribution of the unknown channel is adaptively fitted and learned through the countermeasure generation network for simulation.
3. The LSTM and ResNets-assisted deep learning end-to-end intelligent communication method of claim 1, wherein the sending block of the deep neural network in step S3 comprises an LSTM layer, a fully connected layer and a regularization layer, wherein the input of the LSTM layer is the data to be sent by the user; the output of the LSTM layer is superimposed with the input of the LSTM layer to form a ResNets structure and serves as the input of the fully connected layer; the output of the fully connected layer is the input of the regularization layer; and the output of the regularization layer is the input of the noise layer.
4. The LSTM and ResNets assisted deep learning end-to-end intelligent communication method according to claim 3, wherein the user data to be sent is converted into a one-hot vector before being input into the LSTM layer of the sending block.
5. The LSTM and ResNets assisted deep learning end-to-end intelligent communication method of claim 4, wherein the LSTM layer computes the output vector ht at time t from the external input vector xt at time t and the output vector ht-1 at time t-1:

ft = σ(Wf·[ht-1, xt] + bf)
it = σ(Wi·[ht-1, xt] + bi)
c̃t = tanh(Wc·[ht-1, xt] + bc)
ct = ft ⊙ ct-1 + it ⊙ c̃t
ot = σ(Wo·[ht-1, xt] + bo)
ht = ot ⊙ tanh(ct)

wherein [ht-1, xt] denotes the concatenation of ht-1 and xt; Wf, bf, Wi, bi, Wc, bc, Wo and bo are trainable parameters; σ(·) denotes the sigmoid activation function, tanh(·) denotes the tanh activation function, and ⊙ denotes element-wise multiplication; ct-1 is the memory cell vector at time t-1; ft, it and ot are the forget gate, input gate and output gate vectors, respectively: ft controls which elements of ct-1 are preserved and which are attenuated; it determines which elements of the candidate update vector c̃t are added to the memory cell vector at the current time; ft, it and c̃t act together to produce the new memory cell vector ct; finally, ot determines which elements of tanh(ct) are used as output, yielding the output vector ht at time t.
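The six LSTM equations above can be sketched step by step in NumPy. The dict-based parameter layout and the gate weight shapes are illustrative assumptions for readability, not part of the claim.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, W, b):
    """One LSTM step following the six equations of claim 5.
    W maps gate name ('f', 'i', 'c', 'o') -> weight matrix acting on
    the concatenation [h_{t-1}, x_t]; b maps gate name -> bias vector."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f = sigmoid(W['f'] @ z + b['f'])           # forget gate f_t
    i = sigmoid(W['i'] @ z + b['i'])           # input gate i_t
    c_tilde = np.tanh(W['c'] @ z + b['c'])     # candidate update vector
    c = f * c_prev + i * c_tilde               # new memory cell vector c_t
    o = sigmoid(W['o'] @ z + b['o'])           # output gate o_t
    h = o * np.tanh(c)                         # output vector h_t
    return h, c
```

Running the cell over a sequence of inputs, feeding each (h, c) back in, reproduces the recurrent behavior by which the LSTM layer jointly processes Q groups of user bits.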
6. The LSTM and ResNets assisted deep learning end-to-end intelligent communication method of claim 4, wherein the receiving block of the deep neural network comprises two fully connected layers and one LSTM unit with ResNets structure, wherein:
the input of the first fully connected layer is the output of the noise layer, and its output is the input of the LSTM unit with the ResNets structure; the output of the LSTM unit with the ResNets structure is superimposed with the output of the first fully connected layer to form the input of the second fully connected layer; the output of the second fully connected layer is a probability vector over the one-hot encoding of the user data to be sent; the index of the largest element of the probability vector is judged to be the user transmitted information, and the estimated user data is obtained accordingly.
7. The LSTM and ResNets assisted deep learning end-to-end intelligent communication method of claim 1, wherein in step S6 the deep neural network is trained according to the difference between the estimated data obtained in step S5 and the binary data randomly generated in step S1, specifically:

a loss function value between the estimated user data output by the deep neural network and the user data to be sent is calculated according to a preset loss function, so as to quantitatively measure the difference between the actual output and the expected output; the partial derivative of the loss value with respect to each trainable parameter in the deep neural network is calculated through a back propagation algorithm; and the parameters of the deep neural network are updated using a preset optimization algorithm, reducing the difference between the estimated user data and the user data to be sent.
8. The LSTM and ResNets assisted deep learning end-to-end intelligent communication method of claim 1, wherein in step S7, if the actual channel changes slowly, retraining with low time cost and low computation cost is performed by a method including transfer learning, and the deep neural network training and the deployment application are performed synchronously by a means including parallel training.
9. An LSTM and ResNets assisted deep learning end-to-end intelligent communications system using the LSTM and ResNets assisted deep learning end-to-end intelligent communications method of any of claims 1 to 8, the system comprising:
the generating module is used for randomly generating binary data as data to be sent by a user;
the channel simulation module is used for setting a simulation channel environment;
the device comprises an initialization network module, a data processing module and a data processing module, wherein the initialization network module is used for initializing a deep neural network, the deep neural network comprises a sending block, a noise layer and a receiving block, and binary data randomly generated by the generation module are output to the noise layer after being processed by the sending block;
the noise layer scrambles and adds noise to the signal processed by the sending block according to the simulation channel environment set by the channel simulation module and sends the signal to the receiving block;
a receiving processing module, wherein the receiving block obtains estimated user data after using the processing opposite to that of the sending block;
the training module trains the deep neural network according to the difference between the estimated user data obtained by the receiving and processing module and the data to be sent by the user in the generating module;
and the deployment module is used for deploying the trained sending block at the communication transmitting end, taking the binary data bits to be sent as the input vector of the sending block and sending the output vector of the sending block to the wireless channel as the transmit signal; the trained receiving block is deployed at the communication receiving end, the received signal is used as the input vector of the receiving block, and the estimated user transmitted data is obtained after processing.
10. The LSTM and ResNets assisted deep learning end-to-end intelligent communication system of claim 9, wherein the sending block of the deep neural network comprises an LSTM layer, a fully connected layer and a regularization layer, wherein an input of the LSTM layer is data to be sent by a user, an output of the LSTM layer is superimposed with an input of the LSTM layer to be used as an input of the fully connected layer to form a ResNets structure, an output of the fully connected layer is an input of the regularization layer, and an output of the regularization layer is an input of a noise layer.
CN202111113281.8A 2021-09-18 2021-09-18 LSTM and ResNet-assisted deep learning end-to-end intelligent communication method and system Active CN113852434B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111113281.8A CN113852434B (en) 2021-09-18 2021-09-18 LSTM and ResNet-assisted deep learning end-to-end intelligent communication method and system


Publications (2)

Publication Number Publication Date
CN113852434A true CN113852434A (en) 2021-12-28
CN113852434B CN113852434B (en) 2023-07-25

Family

ID=78979321


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023164166A1 (en) * 2022-02-24 2023-08-31 Protopia AI, Inc. Conditional noise layers for generating adversarial examples

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109767759A (en) * 2019-02-14 2019-05-17 重庆邮电大学 End-to-end speech recognition methods based on modified CLDNN structure
CN110460402A (en) * 2019-07-15 2019-11-15 哈尔滨工程大学 A kind of end-to-end communication system method for building up based on deep learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HU Zhangfang; XU Xuan; FU Yaqin; XIA Zhiguang; MA Sudong: "End-to-end speech recognition based on ResNet-BLSTM", Computer Engineering and Applications, no. 18 *




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant