CN114268328A - Convolutional code decoding method based on bidirectional LSTM and convolutional code encoding and decoding method - Google Patents

Convolutional code decoding method based on bidirectional LSTM and convolutional code encoding and decoding method

Info

Publication number
CN114268328A
CN114268328A
Authority
CN
China
Prior art keywords
neural network
training
lstm
decoding
bidirectional lstm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111462642.XA
Other languages
Chinese (zh)
Inventor
吴少川
王利繁
李壮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mechanical And Electrical Engineering General Design Department
Harbin Institute of Technology
Original Assignee
Beijing Mechanical And Electrical Engineering General Design Department
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mechanical And Electrical Engineering General Design Department, Harbin Institute of Technology filed Critical Beijing Mechanical And Electrical Engineering General Design Department
Priority to CN202111462642.XA
Publication of CN114268328A
Pending legal-status Critical Current

Landscapes

  • Error Detection And Correction (AREA)

Abstract

The invention discloses a convolutional code decoding method based on bidirectional LSTM (Long Short-Term Memory) and a convolutional code encoding and decoding method, belonging to the technical field of electronic communication. It solves the problems that the time complexity and space complexity of traditional encoding and decoding grow exponentially with the code length and constraint degree, that a long-code codebook is not easy to select, and that encoded information is easy to intercept and crack, giving low security. The method of the invention comprises the following steps: constructing a bidirectional LSTM neural network decoder, wherein the neural network decoder uses a bidirectional LSTM neural network for decoding; establishing a received sequence data set, and constructing a training codebook from the received sequence data set and the bidirectional LSTM neural network; selecting a training signal-to-noise ratio, wherein the training signal-to-noise ratio is the signal-to-noise ratio of the received sequence; setting simulation parameters, and training the bidirectional LSTM neural network decoder with the training codebook and the training signal-to-noise ratio; and decoding with the trained bidirectional LSTM neural network decoder. The invention is suitable for end-to-end convolutional code encoding and decoding.

Description

Convolutional code decoding method based on bidirectional LSTM and convolutional code encoding and decoding method
Technical Field
The present application relates to the field of electronic communications technologies, and in particular, to a method for encoding and decoding an end-to-end convolutional code.
Background
The conventional convolutional code encoding method uses shift registers, as shown in fig. 1. The specific principle is as follows: the convolutional code has code length n0 = 2, information length k0 = 1, code storage 2, code constraint degree 3, and code constraint length NA = 6. The encoder consists of two shift registers and two adders. In each output code group of the convolutional code, the first code word is the modulo-two sum of the input information code word, the input code word of the previous time interval, and the input code word of two time intervals earlier; the second code word is the modulo-two sum of the input information code word and the input code word of two time intervals earlier.
The decoding of convolutional codes divides mainly into two directions: algebraic decoding and probabilistic decoding. The former is based on a large number of logical decisions aimed at convolutional codes of a specific code pattern, decoding the received sequence through a syndrome matrix and error patterns. The latter is based on the maximum a posteriori probability (MAP) criterion and the maximum likelihood (ML) criterion, inferring the transmitted sequence from the probability distribution of the received sequence; it is highly inclusive of convolutional code patterns, and whether the code pattern is complex or simple its decoding performance is superior to algebraic decoding, so it is applied more generally in actual communication systems.
The mathematical basis of the Viterbi decoding algorithm is the maximum likelihood (ML) criterion, and it has been proved to be the optimal probabilistic decoding algorithm; its decoding structure and algorithm are simple and its processing efficiency is high, making it the most popular decoding algorithm for convolutional codes. Viterbi decoding represents the transitions between states by a trellis diagram and decodes the transmitted sequence by finding the path with the minimum distance metric.
The defects of the conventional encoding and decoding are as follows:
1. The time complexity and space complexity of traditional convolutional code encoding and decoding, such as Viterbi decoding, grow exponentially with the code length and constraint degree, so the traditional convolutional code encoding and decoding mode is not suitable for long codes.
2. When the code length is long, a convolutional code codebook that suits the current channel and performs excellently is difficult to obtain by mathematical calculation; that is, a long-code codebook is not easy to select.
3. The encoding and decoding modes of a short-code codebook are fixed, so the encoded information is easy to intercept and crack, and the security is low.
Disclosure of Invention
To solve the above problems of the prior art, the present invention provides a convolutional code decoding method based on bidirectional LSTM and a convolutional code encoding and decoding method.
The invention is realized by the following technical scheme. In one aspect, the invention provides a convolutional code decoding method based on bidirectional LSTM, which comprises the following steps:
constructing a bidirectional LSTM neural network decoder, wherein the neural network decoder uses a bidirectional LSTM neural network for decoding;
establishing a received sequence data set, and constructing a training codebook from the received sequence data set and the bidirectional LSTM neural network;
selecting a training signal-to-noise ratio, wherein the training signal-to-noise ratio is the signal-to-noise ratio of the received sequence;
setting simulation parameters, and training the bidirectional LSTM neural network decoder with the training codebook and the training signal-to-noise ratio;
and decoding with the trained bidirectional LSTM neural network decoder.
Further, constructing the bidirectional LSTM neural network decoder specifically includes: the bidirectional LSTM neural network decoder comprises an input layer, hidden layers, and an output layer;
the hidden layers are formed by combining several bidirectional LSTM network layers, batch normalization (BN) layers, and Dropout layers, and are used for extracting the correlation between adjacent code words from the received sequence;
the input layer is used for lifting the input codebook into a high-dimensional space;
the bidirectional LSTM network layer turns the sequentially input convolutional code words into a characteristic rule that the neural network can fit, by learning the front-and-back correlation of the convolutional code input coding sequence;
the BN layer redistributes the output data of the neural network within a specified range through a normalization formula, without changing its distribution law;
the Dropout layer prevents overfitting of the network by removing neural network training units from the network with a certain probability during training;
the output layer is used for reducing high-dimensional information to a low dimension.
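A minimal Keras sketch of a decoder with this structure is given below. The layer count, Dropout rate, and Timesteps value are assumptions, since the patent's simulation parameter tables are reproduced only as images; the input Dense width of 50 and the LSTM cell state dimension of 200 follow figures quoted later in the description (ΔParam = 50Δn and a cell dimension of 200), and the Huber loss with δ = 2 follows the training discussion below.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

n = 5            # code length, e.g. the (5,1,9) convolutional code used later
timesteps = 100  # assumed; must be much larger than the constraint degree M

inputs = layers.Input(shape=(timesteps, n))
# Input layer: a fully connected Dense layer lifts each received code group
# into a high-dimensional space.
x = layers.Dense(50, activation="relu")(inputs)
# Hidden layers: bidirectional LSTM (cell state dimension 200) with batch
# normalization and Dropout.
x = layers.Bidirectional(layers.LSTM(200, return_sequences=True))(x)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.2)(x)
x = layers.Bidirectional(layers.LSTM(200, return_sequences=True))(x)
x = layers.BatchNormalization()(x)
# Output layer: reduce the high-dimensional features to one information bit
# per time step, normalized to [0, 1] for decision output.
outputs = layers.Dense(1, activation="sigmoid")(x)

decoder = models.Model(inputs, outputs)
decoder.compile(optimizer="adam", loss=tf.keras.losses.Huber(delta=2.0))
```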
Further, establishing the received sequence data set and constructing the training codebook from the received sequence data set and the bidirectional LSTM neural network specifically includes:
according to the received sequence, intercepting a sequence of length n × m, recorded as [R11, R12, ..., R1n, R21, R22, ..., R2n, ..., Rm1, Rm2, ..., Rmn], where the code length n corresponds to the Input_dims of the bidirectional LSTM neural network and m corresponds to its Timesteps, so that the received sequence is converted into an m × n two-dimensional tensor;
inputting the m × n two-dimensional tensors in parallel to the bidirectional LSTM neural network;
turning all the two-dimensional tensors into (Batchsize, m, n) three-dimensional tensors according to the Batchsize set in the bidirectional LSTM neural network parameters;
and constructing the training codebook from the (Batchsize, m, n) three-dimensional tensor.
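A sketch of this reconstruction in NumPy (the function name is illustrative; grouping the first axis into batches of Batchsize is assumed to be left to the training framework):

```python
import numpy as np

def build_training_codebook(received, n, m):
    """Cut a serial noisy received sequence into code blocks of shape (m, n):
    n is the code length (the Input_dims of the Bi-LSTM) and m its Timesteps.
    Stacking along the first axis and batching by Batchsize then yields the
    (Batchsize, m, n) three-dimensional tensors described above."""
    usable = (len(received) // (n * m)) * (n * m)   # drop any tail bits
    return np.asarray(received[:usable], dtype=np.float32).reshape(-1, m, n)
```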
Further, selecting the training signal-to-noise ratio specifically includes:
selecting a signal-to-noise ratio data set according to the received sequence data set;
and selecting the training signal-to-noise ratio from the signal-to-noise ratio data set according to the principle of the optimal bit error rate curve.
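For illustration, candidate received-sequence training sets at a given signal-to-noise ratio could be generated as in the following sketch (the SNR convention, signal power over noise power in dB with unit signal power, is an assumption):

```python
import numpy as np

def awgn(x, snr_db):
    """Add white Gaussian noise to unit-power BPSK symbols x at the given
    signal-to-noise ratio, to build candidate training sets; the embodiments
    below finally settle on a 0 dB training set."""
    sigma = np.sqrt(10.0 ** (-snr_db / 10.0))   # noise standard deviation
    return x + sigma * np.random.randn(*x.shape)
```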
In another aspect, based on the above bidirectional LSTM-based convolutional code decoding method, the present invention provides a convolutional code encoding and decoding method based on bidirectional LSTM, which includes:
establishing a cascade-trained neural network, wherein the cascade-trained neural network comprises a transmitter, channel noise, and a receiver;
the transmitter comprises an LSTM encoder and a modulator connected in sequence, the LSTM encoder uses a bidirectional LSTM neural network for encoding, and the modulator is used for constellation mapping;
the receiver comprises the bidirectional LSTM neural network decoder constructed in the above bidirectional LSTM-based convolutional code decoding method and a decision output connected in sequence, the decision output being used for classifying the output of the bidirectional LSTM neural network decoder according to a decision threshold and decoding the information bits;
establishing a source sequence data set, and constructing a cascade training codebook of the cascade-trained neural network from the source sequence data set and a bidirectional LSTM neural network;
selecting a cascade training signal-to-noise ratio, wherein the cascade training signal-to-noise ratio is the signal-to-noise ratio of the convolutional code output by the modulator;
setting simulation parameters, and training the cascade-trained neural network with the cascade training codebook and the cascade training signal-to-noise ratio;
and encoding and decoding with the trained cascade-trained neural network.
Further, the LSTM encoder specifically includes: an input layer, hidden layers, and an output layer;
the hidden layers are formed by combining several bidirectional LSTM network layers, batch normalization (BN) layers, and Dropout layers, and are used for generating a time-correlated sequence through the bidirectional LSTM neurons;
the input layer is used for lifting the input codebook into a high-dimensional space;
the bidirectional LSTM network layer turns the sequentially input convolutional code words into a characteristic rule that the neural network can fit, by learning the front-and-back correlation of the convolutional code input coding sequence;
the BN layer redistributes the output data of the neural network within a specified range through a normalization formula, without changing its distribution law;
the Dropout layer prevents overfitting of the network by removing neural network training units from the network with a certain probability during training;
the output layer is used for reducing high-dimensional information to a low dimension.
Further, the modulator is a DNN fully connected network modulator, which maps the encoded information into a transmission sequence of corresponding code length and performs two-dimensional modulation;
the two-dimensional modulation specifically splits a symbol into a real part and an imaginary part, which are transmitted simultaneously.
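A minimal sketch of such a modulator (the hidden width of 128 is an assumption; only the 2n-wide real/imaginary output structure follows the text):

```python
from tensorflow.keras import layers

def dnn_modulator(x, n):
    """Sketch of the DNN fully connected modulator. Each encoded group is
    mapped to 2*n real outputs interpreted as n complex symbols: the first n
    values as real parts, the last n as imaginary parts, since the network
    cannot operate on complex numbers directly."""
    x = layers.Dense(128, activation="relu")(x)
    return layers.Dense(2 * n, activation="linear")(x)  # [Re_1..Re_n, Im_1..Im_n]
```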
Further, establishing the source sequence data set and constructing the cascade training codebook of the cascade-trained neural network from the source sequence data set and the bidirectional LSTM neural network specifically includes:
setting a one-way queue of length L, where L = M - 1, M is the constraint length of the convolutional code, and the one-way queue is initialized to all zeros;
inputting the source sequence serially into the queue in groups of Timesteps, the queue state value when each bit is input being the mapped output;
constructing a Timesteps × L two-dimensional tensor, in which each row component is correlated with the preceding L row components and the following L row components;
turning the Timesteps × L two-dimensional tensor into a (Batchsize, Timesteps, L) three-dimensional tensor according to the Batchsize set in the neural network parameters;
and constructing the cascade training codebook from the (Batchsize, Timesteps, L) three-dimensional tensor.
In a third aspect, the present invention provides a computer device comprising a memory and a processor, wherein the memory stores a computer program and the processor runs the computer program stored in the memory to perform the steps of the bidirectional LSTM-based convolutional code decoding method described above.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the bidirectional LSTM-based convolutional code decoding method described above.
The invention has the beneficial effects that:
Firstly, because a neural network can obtain more features from a convolutional code with a long code length, the invention exploits this advantage of neural networks and adopts a bidirectional LSTM neural network, solving the problem that the time complexity and space complexity of Viterbi decoding grow exponentially with the code length and constraint degree.
Secondly, the neural network of the invention is a bidirectional LSTM. If another network such as an RNN were adopted, the RNN's weakness would appear once the gap between the position to be predicted and the associated information becomes large, namely the long-term dependency problem: when the length of the input information exceeds the memory depth, the gradient vanishes during training. LSTM, a special kind of recurrent neural network, was proposed to solve this problem.
Thirdly, a communication system using deep learning differs from a traditional communication system built on modular mathematical models. The optimal algorithm is obtained by learning the weights from the training model, so the global optimal solution is emphasized and the overall performance can be improved. A traditional algorithm must derive the optimal coding mode under different channel conditions through mathematics, and when the convolutional code is too long this calculation becomes very complicated, which highlights the advantage of deep learning training.
The invention can encode and decode convolutional codes of long code length and high constraint degree. After training is finished and fixed, the neural network is a 'black box' model: when the outside does not know the specific model parameters, the intermediate coded output is unknown.
The invention is suitable for end-to-end convolutional code encoding and decoding.
Drawings
To explain the technical solution of the present application more clearly, the drawings needed in the embodiments are briefly described below; obviously, those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a diagram of the encoding scheme of a conventional convolutional code;
FIG. 2 is a block diagram of an overall scheme of a bi-directional LSTM convolutional code decoder;
FIG. 3 is a diagram of a convolutional code decoder neural network architecture;
FIG. 4 is a diagram illustrating a method for constructing a training set in language prediction;
FIG. 5 is a schematic diagram of a method for reconfiguring a receive sequence;
FIG. 6 is a diagram illustrating the variation of the input tensor in the neural network;
FIG. 7 is a comparison of tests for training sets of different signal-to-noise ratios;
FIG. 8 is an MSE error function image;
FIG. 9 is a Huber Loss error function image;
FIG. 10 is an overall block diagram of a cascaded neural network of a transmitter and a receiver;
FIG. 11 is a diagram of a neural network architecture for the overall simulation of a transmitter and receiver;
fig. 12 is a schematic diagram of reconstruction of time correlation of source information;
FIG. 13 is a neural network structure with noise addition and power normalization;
FIG. 14 is a comparison of decoding performance for unidirectional, bidirectional LSTM and DNN fully connected networks;
FIG. 15 is a comparison of decoding performance of bi-directional LSTM networks with convolutional codes of different code lengths in AWGN channels;
FIG. 16 is a comparison of decoding performance of different constraint length bi-directional LSTM networks under AWGN channels;
FIG. 17 is a comparison of performance of a bi-directional LSTM network decoder for a single path Rayleigh channel and an AWGN channel;
FIG. 18 is a comparison of error rate performance for different code lengths and the same correlation length;
FIG. 19 is a comparison of bit error rate performance for different code length concatenated decoding and Viterbi decoding;
fig. 20 is a modulation output constellation diagram for neural network training.
Detailed Description
In a first embodiment, a convolutional code decoding method based on bidirectional LSTM comprises:
constructing a bidirectional LSTM neural network decoder, wherein the neural network decoder uses a bidirectional LSTM neural network for decoding;
establishing a received sequence data set, and constructing a training codebook from the received sequence data set and the bidirectional LSTM neural network;
selecting a training signal-to-noise ratio, wherein the training signal-to-noise ratio is the signal-to-noise ratio of the received sequence;
setting simulation parameters, and training the bidirectional LSTM neural network decoder with the training codebook and the training signal-to-noise ratio;
and decoding with the trained bidirectional LSTM neural network decoder.
In this embodiment, as shown in fig. 2, bidirectional LSTM (Bi-LSTM) is selected as the main structure of the neural network decoder. A random 0/1 source sequence is encoded by a shift-register-based convolutional encoder into an (n, k, M) convolutional code sequence s with code length n, information length k, and constraint degree M; a BPSK modulator maps the coded sequence onto a constellation to obtain the symbols x, which are transmitted in the channel.
The transmitted signal in the channel is affected by additive white Gaussian noise (AWGN) or Rayleigh channel interference. The symbols received at the receiving end are y; they pass directly through the bidirectional LSTM convolutional code decoder to obtain the decoded prediction sequence u', and decoding is complete.
It should be noted that, in the step of setting simulation parameters and training the bidirectional LSTM neural network decoder with the training codebook and the training signal-to-noise ratio, the output samples during training are the original source sequence, so the closer the neural network's prediction is to the training samples, the smaller the value of the loss function. This embodiment initially selects the mean square error (MSE) loss function, whose mathematical expression is shown in formula (1); it represents the mean of the squared distances between the predicted values of the network model and the actual values of the samples, and its error function image is shown in fig. 8.
MSE = (1/m) Σ_{i=1}^{m} (y_i - f(x_i))^2    (1)
where y_i represents the target value, f(x_i) represents the predicted value, and m is the number of samples.
The image shows that the MSE function is continuous with a continuous first derivative, so the gradient changes continuously when the gradient descent algorithm is used, and its range of application is very wide. But because of the squaring, when the difference between y_i and f(x_i) is greater than 1 the error is amplified, and when the difference is less than 1 the error is shrunk. Since the training samples use a 0 dB noise set, some samples suffer large noise interference; adopting the MSE loss function therefore increases the deviation at certain sample points, reducing the accuracy of the model as a whole.
This embodiment uses the smoothed mean absolute error, Huber Loss, whose mathematical expression is formula (2); its image when δ = 2 is shown in fig. 9.
L_δ(y, f(x)) = (1/2)(y - f(x))^2,           when |y - f(x)| ≤ δ
L_δ(y, f(x)) = δ|y - f(x)| - (1/2)δ^2,      otherwise    (2)
where δ is a hyperparameter controlling the gradient change of the loss.
With the smoothed mean absolute error Huber Loss, when the deviation of a training sample is small, the gradient during back-propagation is also small, which helps the neural network converge further; when the deviation of a training sample is large, the convergence speed does not increase and the error is not amplified, effectively avoiding the negative effect of large deviations on neural network training.
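A direct transcription of formula (2), useful for checking the behavior described above (a NumPy sketch; Keras's built-in tf.keras.losses.Huber(delta=2.0) computes the same function):

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=2.0):
    """Formula (2): quadratic for small deviations (small back-propagation
    gradients near convergence), linear for large ones (no amplification of
    errors from noisy 0 dB training samples)."""
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2
    linear = delta * err - 0.5 * delta ** 2
    return np.mean(np.where(err <= delta, quadratic, linear))
```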
The neural network layer number settings and other simulation parameters are shown in tables 1 and 2:
TABLE 1 LSTM convolutional code decoder and channel environment simulation parameter configuration
[Table 1 is reproduced as an image in the original publication.]
TABLE 2 LSTM network model output dimensionality
[Table 2 is reproduced as an image in the original publication.]
In a second embodiment, the bidirectional LSTM-based convolutional code decoding method of the first embodiment is further limited; in this embodiment, constructing the bidirectional LSTM neural network decoder is further limited to specifically include:
the bidirectional LSTM neural network decoder comprises an input layer, hidden layers, and an output layer;
the hidden layers are formed by combining several bidirectional LSTM network layers, batch normalization (BN) layers, and Dropout layers, and are used for extracting the correlation between adjacent code words from the received sequence;
the input layer is used for lifting the input codebook into a high-dimensional space;
the bidirectional LSTM network layer turns the sequentially input convolutional code words into a characteristic rule that the neural network can fit, by learning the front-and-back correlation of the convolutional code input coding sequence;
the BN layer redistributes the output data of the neural network within a specified range through a normalization formula, without changing its distribution law;
the Dropout layer prevents overfitting of the network by removing neural network training units from the network with a certain probability during training;
the output layer is used for reducing high-dimensional information to a low dimension.
In this embodiment, the bidirectional LSTM neural network decoder is composed of an input layer, bidirectional LSTM layers, batch normalization (BN) layers, Dropout layers, and an output layer; its specific structure is shown in fig. 3.
In the figure, the input layer and output layer are fully connected Dense layers. The input layer lifts the input codebook into a high-dimensional space so that the subsequent bidirectional LSTM network can find the features of the data, and the output layer reduces the high-dimensional information to a low dimension so that it can be normalized to [0, 1] for decision output.
The middle hidden layers are formed by combining multiple bidirectional LSTM network layers, batch normalization (BN) layers, and Dropout layers. The bidirectional LSTM network layers are the main component of the convolutional code decoder: by learning the front-and-back correlation of the convolutional code input coding sequence, they turn the sequentially input code words into a characteristic rule that the neural network can fit, ready for the decision output of the Dense layer. The BN layer, also called the batch normalization layer, redistributes the output data of the neural network within a specified range through a normalization formula without changing its distribution law, thereby accelerating the training of the neural network. Whether a Dropout layer is used depends on whether overfitting occurs during training. Overfitting is the phenomenon that, when the neural network structure is too complex or the training samples are insufficient, the training accuracy is far higher than the test accuracy, i.e. the neural network has learned a wrong mapping rule; the Dropout layer prevents overfitting by removing neural network training units from the network with a certain probability during training.
In a third embodiment, the bidirectional LSTM-based convolutional code decoding method of the first embodiment is further limited; in this embodiment, the step of establishing the received sequence data set and constructing the training codebook from the received sequence data set and the bidirectional LSTM neural network is further limited to specifically include:
according to the received sequence, intercepting a sequence of length n × m, recorded as [R11, R12, ..., R1n, R21, R22, ..., R2n, ..., Rm1, Rm2, ..., Rmn], where the code length n corresponds to the Input_dims of the bidirectional LSTM neural network and m corresponds to its Timesteps, so that the received sequence is converted into an m × n two-dimensional tensor;
inputting the m × n two-dimensional tensors in parallel to the bidirectional LSTM neural network;
turning all the two-dimensional tensors into (Batchsize, m, n) three-dimensional tensors according to the Batchsize set in the bidirectional LSTM neural network parameters;
and constructing the training codebook from the (Batchsize, m, n) three-dimensional tensor.
In the present embodiment, a method of constructing a training codebook for training a bi-directional LSTM neural network decoder is provided.
As shown in fig. 4, the construction of the convolutional code training set is based on the context language prediction of LSTM.
As shown in fig. 5, in this embodiment, the received sequences with channel characteristics added are composed into code blocks as follows.
Assume the received sequence is serial data containing noise from an (n, k, M) convolutional code, where n is the code length, k is the information length, and M is the code constraint degree. Since the code length is n, this embodiment intercepts a sequence of length n × m, recorded as [R11, R12, ..., R1n, R21, R22, ..., R2n, ..., Rm1, Rm2, ..., Rmn]. The code length n then corresponds to the Input_dims of the LSTM network and m corresponds to the Timesteps of the Bi-LSTM network; that is, the coding relation between adjacent code groups is sought within m groups of received (n, k, M) convolutional code words. As long as Timesteps >> M, i.e. the time sequence length of the LSTM network is far larger than the constraint degree of the convolutional code, the Bi-LSTM can correctly learn the coding rule of the convolutional code.
Taking a (3, 1, M) convolutional code as an example, the received sequence is serial data containing noise. Since the code length is 3, a sequence of length 3 × m is intercepted and recorded as [R11, R12, R13, R21, R22, R23, ..., Rm1, Rm2, Rm3]. The code length 3 corresponds to the Input_dims of the LSTM network and m corresponds to its Timesteps; that is, the coding relation between adjacent code groups is sought within m groups of received (3, 1, M) convolutional code words, and as long as Timesteps >> M the LSTM can correctly learn the coding rule of the convolutional code.
After this processing, the received sequence becomes a three-dimensional tensor. Its first dimension is Batchsize, whose size is not fixed and can be adjusted according to the training effect; the second dimension is Timesteps, representing the search for the correlation of adjacent code words within the set time sequence length; the third dimension is Input_dims, the feature number of the input code group, which equals the code length n before it is further lifted. The dimension changes of the tensors in the network are shown in fig. 6.
It should be noted that fig. 6 draws the tensor change process of one code block in a Batch only for ease of observation; in actual training an entire Batch is usually input in parallel, so the efficiency of the neural network decoder is considerable.
In a fourth embodiment, the bidirectional LSTM-based convolutional code decoding method of the first embodiment is further limited; in this embodiment, the step of selecting the training signal-to-noise ratio is further limited to specifically include:
selecting a signal-to-noise ratio data set according to the received sequence data set;
and selecting the training signal-to-noise ratio from the signal-to-noise ratio data set according to the principle of the optimal bit error rate curve.
In this embodiment, a method of selecting a suitable signal-to-noise ratio for training the bidirectional LSTM neural network decoder is proposed, i.e. selecting the training signal-to-noise ratio.
The input of the training samples is a received sequence containing noise interference. This embodiment selects a training set with a signal-to-noise ratio of 0 dB, and this is the optimal training signal-to-noise ratio. (5,1,9) convolutional codes were tested with signal-to-noise-ratio data from -5 dB to 10 dB under the same network model, the same optimization method, and the same number of training iterations; representative training results at -5 dB, -2 dB, 0 dB, 5 dB, and 10 dB are shown in fig. 7.
As shown in fig. 7, a training-sample signal-to-noise ratio that is too high or too low harms the final decoding performance. When the signal-to-noise ratio is too low, as in the -5 dB curve, the bit error rate performance remains poor even at high signal-to-noise ratio, because the excessive noise makes it difficult for the neural network to accurately extract the coding features. When the signal-to-noise ratio is too high, as at 5 dB and 10 dB, the decoding performance curve falls very slowly, because the small proportion of noise in the samples leaves the trained neural network weak against white noise. Therefore the 0 dB training set chosen as a compromise is the best signal-to-noise ratio: the resulting model has good noise resistance at low signal-to-noise ratio and maintains good performance at high signal-to-noise ratio.
In this embodiment, 3.2 × 10^7 bits are selected as the training set size; under limited hardware and time, this amount of data helps the network find the features between the code words of the convolutional code well.
In this embodiment, 5.12 × 10^6 bits are selected as the test set size, which lets the test curve reach the 10^-5 to 10^-6 bit error rate level commonly used in communications.
In a fifth embodiment, based on the bidirectional LSTM-based convolutional code decoding method of the first embodiment, this embodiment provides a convolutional code encoding and decoding method based on bidirectional LSTM, which includes:
establishing a cascade-trained neural network, wherein the cascade-trained neural network comprises a transmitter, channel noise, and a receiver;
the transmitter comprises an LSTM encoder and a modulator connected in sequence, the LSTM encoder uses a bidirectional LSTM neural network for encoding, and the modulator is used for constellation mapping;
the receiver comprises the bidirectional LSTM neural network decoder constructed in the first embodiment and a decision output connected in sequence, the decision output being used for classifying the output of the bidirectional LSTM neural network decoder according to a decision threshold and decoding the information bits;
establishing a source sequence data set, and constructing a cascade training codebook of the cascade-trained neural network from the source sequence data set and a bidirectional LSTM neural network;
selecting a cascade training signal-to-noise ratio, wherein the cascade training signal-to-noise ratio is the signal-to-noise ratio of the convolutional code output by the modulator;
setting simulation parameters, and training the cascade-trained neural network with the cascade training codebook and the cascade training signal-to-noise ratio;
and encoding and decoding with the trained cascade-trained neural network.
In this embodiment, bidirectional LSTM is used as the main structure of the neural network encoder and decoder; this structure lets the channel coding designed by the neural network retain temporal correlation like a conventional convolutional code. A block diagram of the overall scheme is shown in fig. 10.
In this scheme, the transmitter, the receiver, and the channel noise are trained and optimized as a single whole neural network; after training is completed, the network is divided into a transmitter end and a receiver end that work independently. The input of the neural network is no longer a binary random 0/1 source sequence; it is first pre-processed by a mapping reconstructor. The processed data becomes a coded sequence through the LSTM encoder, then a modulator built from a DNN fully connected network maps the constellation, and the result enters the channel layer; this is the whole flow at the transmitter end.
The channel layer is a neural network layer during training: it adds noise of different signal-to-noise ratios to the output of the previous layer, helping the whole neural network complete its anti-noise training. In actual use, it is the real physical channel.
After the channel layer comes the receiver end, which includes the bidirectional LSTM neural network decoder constructed in the first embodiment together with a decision output. The receiver can directly decode the received information sequence containing noise: the multi-layer LSTM network again finds the time correlation between received sequences, and the decision output completes the decoding.
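The decision output itself is a simple thresholding step; a sketch (the threshold value of 0.5 is an assumption, matching a sigmoid output normalized to [0, 1]):

```python
import numpy as np

def decision_output(decoder_probs, threshold=0.5):
    """Classify the decoder's [0, 1] outputs against a decision threshold to
    recover the information bits."""
    return (decoder_probs > threshold).astype(np.int8)
```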
In this embodiment, during training, a noise tensor with the same dimension as the output tensor of the DNN modulator is added to the signal tensor by an Add layer, representing the influence of additive white Gaussian noise; it is injected at the output of the LSTM encoder and DNN modulation layer to exercise the noise resistance of the network. The actual network structure is shown in fig. 13. As fig. 13 shows, before white Gaussian noise is added, the power of the transmitted signal must be normalized to ensure that the added noise power can correspond to signal-to-noise ratios of different magnitudes. Power normalization sets the mean of the sum of squares of the real and imaginary parts of each code block of the transmitted signal to a fixed value. In practical simulation, a Lambda layer is added before the Add layer, and a function written in the Lambda layer removes the DC component of the transmitted signal and divides by the standard deviation, so that the sum of squares of the real and imaginary parts equals the set transmit power.
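A sketch of the power normalization and noise injection just described, using a Keras Lambda layer before the Add layer (the unit power target and the fixed noise standard deviation sigma are assumptions; in training, sigma would be set from the chosen cascade training signal-to-noise ratio):

```python
import tensorflow as tf
from tensorflow.keras import layers

def normalize_power(s):
    """Lambda-layer function: remove the DC component of the transmitted
    signal and divide by its standard deviation, so that the mean square of
    the real/imaginary outputs equals the set (unit) transmit power."""
    s = s - tf.reduce_mean(s, axis=-1, keepdims=True)
    return s / (tf.math.reduce_std(s, axis=-1, keepdims=True) + 1e-8)

def channel_layer(modulated, sigma):
    """Training-time channel: power normalization, then an Add layer sums in
    a Gaussian noise tensor of the same shape as the modulator output."""
    x = layers.Lambda(normalize_power)(modulated)
    noise = layers.Lambda(
        lambda t: tf.random.normal(tf.shape(t), stddev=sigma))(x)
    return layers.Add()([x, noise])
```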
In a sixth embodiment, the bidirectional LSTM-based convolutional code encoding and decoding method of the fifth embodiment is further limited; in this embodiment, the LSTM encoder is further limited to specifically include:
an input layer, hidden layers, and an output layer;
the hidden layers are formed by combining several bidirectional LSTM network layers, batch normalization (BN) layers, and Dropout layers, and are used for generating a time-correlated sequence through the bidirectional LSTM neurons;
the input layer is used for lifting the input codebook into a high-dimensional space;
the bidirectional LSTM network layer turns the sequentially input convolutional code words into a characteristic rule that the neural network can fit, by learning the front-and-back correlation of the convolutional code input coding sequence;
the BN layer redistributes the output data of the neural network within a specified range through a normalization formula, without changing its distribution law;
the Dropout layer prevents overfitting of the network by removing neural network training units from the network with a certain probability during training;
the output layer is used for reducing high-dimensional information to a low dimension.
In this embodiment, the specific structure of the LSTM encoder is given. As shown in fig. 11, the encoder and the decoder both adopt a network structure cascading several bidirectional LSTM and BN layers; a time-correlated sequence is generated by the bidirectional LSTM neurons, or the correlation between adjacent code words is extracted from the received sequence.
In a seventh embodiment, the bidirectional LSTM-based convolutional code encoding and decoding method of the fifth embodiment is further limited; in this embodiment, the method specifically includes:
the modulator is a DNN fully connected network modulator, which maps the encoded information into a transmission sequence of corresponding code length and performs two-dimensional modulation;
the two-dimensional modulation specifically splits a symbol into a real part and an imaginary part, which are transmitted simultaneously.
In this embodiment, the specific structure of the modulator is given. The modulator uses a multi-layer fully connected DNN, which maps the encoded information into a transmission sequence of corresponding code length and modulates it. Since deep neural networks currently have no good way of handling complex numbers, when two-dimensional modulation is required a symbol must be split into a real part and an imaginary part that are transmitted simultaneously.
In an eighth embodiment, the bidirectional LSTM-based convolutional code encoding and decoding method of the fifth embodiment is further limited; in this embodiment, the step of establishing the source sequence data set and constructing the cascade training codebook of the cascade-trained neural network from the source sequence data set and the bidirectional LSTM neural network is further limited to specifically include:
setting a one-way queue of length L, where L = M - 1, M is the constraint length of the convolutional code, and the one-way queue is initialized to all zeros;
inputting the source sequence serially into the queue in groups of Timesteps, the queue state value when each bit is input being the mapped output;
constructing a Timesteps × L two-dimensional tensor, in which each row component is correlated with the preceding L row components and the following L row components;
turning the Timesteps × L two-dimensional tensor into a (Batchsize, Timesteps, L) three-dimensional tensor according to the Batchsize set in the neural network parameters;
and constructing the cascade training codebook from the (Batchsize, Timesteps, L) three-dimensional tensor.
Under the overall framework of cascaded transmitter-receiver simulation, the 0/1 bit stream of the source cannot be input directly into the neural network for training: the source information is completely random, with no correlation between adjacent bits, so the LSTM neural network encoder cannot encode it with time correlation, and a neural network trained this way would merely classify the 0 and 1 bits of the source, without any channel coding significance. In this embodiment, before being input to the neural network, the source bit sequence is therefore mapped and reconstructed into a codebook with correlation.
The encoder of a convolutional code inputs the 0/1 bit stream into cascaded shift registers in time order, and at each moment takes the modulo-two sum of fixed taps on the data in the different shift registers, generating time-correlated code words. The bit sequence input to the neural network can therefore be preprocessed to achieve the same effect as convolutional code encoding. For example, to generate an (n, k, M) convolutional code, where M is the constraint length of conventional convolutional coding and its value is the number of encoder shift registers plus 1, a queue of length M - 1 is needed. After the random 0/1 bit sequence is interleaved in groups of k bits (input directly in sequence when k = 1), it is enqueued from the left and dequeued from the right, and the original 0/1 bits are mapped into different, front-and-back correlated sequences that are then input into the encoder network for encoding.
For example: a certain piece of source information is [1,0,0,1,1,0,1,1]. To generate through this neural network a time-correlated code with a convolutional-code-like coding constraint length M of 4, i.e. L = 3, only the reconstruction shown in fig. 12 is needed.
As shown in fig. 12, a queue of length 3 is constructed with its initial state all zeros. The source information is input sequentially from the left, and the information already in the queue shifts to the right, giving the queue state corresponding to each moment. These 8 state sequences, containing front-and-back correlated information, replace the original 0/1 bit sequence and are input into the LSTM neural network for training, yielding neural-network-based convolutional code encoding with a memory depth of 3.
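A worked sketch of this reconstruction (it assumes the queue state is read after the current bit has entered, which matches the memory depth of 3 described above):

```python
from collections import deque

def map_reconstruct(bits, L=3):
    """Worked version of the fig. 12 example: slide the source bits through a
    length-L queue initialized to all zeros; the queue state at each input is
    the mapped output that replaces the raw bit."""
    queue = deque([0] * L, maxlen=L)
    states = []
    for b in bits:
        queue.appendleft(b)          # source bit enters from the left
        states.append(list(queue))   # earlier bits have shifted right
    return states

print(map_reconstruct([1, 0, 0, 1, 1, 0, 1, 1]))
# [[1,0,0], [0,1,0], [0,0,1], [1,0,0], [1,1,0], [0,1,1], [1,0,1], [1,1,0]]
```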
In a ninth embodiment, the following examples verify the technical effect of the invention. For the bidirectional LSTM-based convolutional code decoding method, the influence of different neural network types on convolutional code decoding is as follows:
As shown in fig. 14, after fixing the training parameters, optimizer, loss function, and activation function of the neural network, the same 30 epochs of training yield the bit error rate performance curves shown in the figure. The (5,1,9) convolutional code decoder built from the LSTM neural network performs far better than the decoder built from a DNN fully connected network, because the DNN fully connected network has no ability to extract time-correlation features, whereas the code words of the (5,1,9) convolutional code have strong correlation between preceding and following code groups, which the similarly time-correlated LSTM neural network can decode.
In addition, the performance of the bidirectional LSTM is significantly better than that of the unidirectional LSTM, with an advantage of about 3 dB in the 10^-5 to 10^-6 bit error rate interval. This is because, for the same number of training parameters, the bidirectional LSTM has a stronger time-correlation extraction capability than the unidirectional LSTM; adopting the bidirectional LSTM structure thus increases the complexity of neural network training and prediction, but brings a good performance benefit.
In a tenth embodiment, for the bidirectional LSTM-based convolutional code decoding method of the invention, the decoding performance of convolutional codes with different code lengths is compared as follows:
As shown in fig. 15, when the code length is short, the bit error rate performance of the bidirectional LSTM decoder is far worse than that of conventional Viterbi decoding. For the (2,1,9) convolutional code with the shortest code length, the bidirectional LSTM network does not accurately extract the corresponding decoding rule under the training parameters set in this experiment, so the bit error rate curve does not fall rapidly as the signal-to-noise ratio increases; this may be because the short code length gives the corresponding training unit few features, so that the data volume and training time required for the neural network to fit completely far exceed those set in the experiment. When the code length is greater than or equal to 4, however, the decoding performance of the bidirectional LSTM decoder reaches the 10^-5 level in the signal-to-noise-ratio region above 3 dB, and the gap to conventional Viterbi decoding keeps shrinking, because the neural network can obtain more features from a convolutional code with a long code length, which helps it learn the decoding. Of course, conventional Viterbi decoding is a decoding mode customized for a specific code pattern, and its ability to resist high noise is superior to that of the neural network: the figure shows that in the low signal-to-noise-ratio region of 0-3 dB the gap between the decoding performance of the bidirectional LSTM decoder and Viterbi decoding is large. Taking the (5,1,9) convolutional code as an example, in the 10^-5 to 10^-6 bit error rate interval the gap between bidirectional LSTM and Viterbi decoding is about 1.7 dB.
In an eleventh embodiment, for the bidirectional LSTM-based convolutional code decoding method of the invention, the decoding performance at the same code length but different constraint degrees is compared as follows:
As shown in fig. 16, when the code length of the convolutional code is fixed and the constraint length is changed, the performance of the bidirectional-LSTM-based convolutional code decoder decreases slightly as the coding constraint degree decreases, but the change is not particularly severe. This is because, when the codebook is constructed, the method of the invention recombines the received sequence according to the code length alone, so the coding constraint is only implicit in the received sequence; it does not affect the structure or dimension changes of the whole network, and hence does not essentially affect the training of the neural network. In other words, as long as the code length lets the neural network effectively extract the features between adjacent code words, the influence of the constraint length on decoding performance is far smaller than that of a change in code length.
In a twelfth embodiment, for the bidirectional LSTM-based convolutional code decoding method of the invention, the bit error rate performance under a single-path Rayleigh channel is analyzed as follows.
As shown in fig. 17, Rayleigh fading is also a common channel environment in wireless communication, so the performance of the bidirectional LSTM convolutional code decoder under a Rayleigh fading channel is studied. Taking the (5,1,9) convolutional code as an example, the channel environment is assumed to be a Rayleigh fading channel with only a single path and no Doppler shift, whose channel gain is a complex Gaussian component.
In the figure, the broken lines are the performance curves of (5,1,9) convolutional code bidirectional LSTM decoding and Viterbi decoding under the AWGN channel, and the solid lines are the performance curves of the same code pattern under the single-path Rayleigh channel; the neural network settings are the same as in AWGN. The comparison shows that, in the 10^-5 to 10^-6 bit error rate region, under the original AWGN channel the bidirectional LSTM neural network is about 2 dB worse than conventional Viterbi decoding, while after the channel is changed to the more complex Rayleigh channel the gap shrinks to about 0.5 dB. This also shows that the neural network can be regarded as a universal function approximator: when the channel conditions become complicated and the traditional decoding mode is difficult to derive through strict mathematical formulas, the neural network's ease of approximating complicated conditions shows its advantage.
Comparing these embodiments, it can be concluded that the superiority of the method of the invention would become even more prominent over a more complex wireless channel with multipath transmission and dispersion.
In a thirteenth embodiment, for the convolutional code decoding method based on bidirectional LSTM described in the present invention, the complexity is analyzed as follows:
For Viterbi decoding, assume the total information length is L and an (n, k, m) convolutional code is used. The number of states is 2^(km), and at each step 2^(km) additions and comparisons are made to obtain the surviving paths of the 2^(km) states, so the total computation is L · 2^(km) additions and comparisons, and L · 2^(km) storage units are also required. The complexity is therefore O(L · 2^(km)). The computation and memory required by Viterbi decoding grow linearly with the information length L, while the number of states and the computation grow exponentially with the group size k and the constraint length m. Thus for long code lengths and high constraint degrees, i.e. when the code constraint length NA is large, the complexity of conventional Viterbi decoding becomes enormous and its practical value very low.
For the bidirectional LSTM network convolutional code decoder, the complexity mainly depends on the number of units required for matrix operation, and different types of neural network layers have different calculation modes.
1. For the fully connected layer, the number of operation units is given by formula (3):
Param = (input_dims + 1) × output_dims    (3)
where Param is the number of operation units, the 1 represents the bias of each neuron, and the number of neurons equals the output dimension.
2. For the Bi-LSTM layer, the number of operation units is given by formula (4):
Param = 2 × 4 × (input_dims + hidden_size + 1) × hidden_size    (4)
where input_dims is the input dimension and hidden_size is the LSTM cell state dimension; for a unidirectional LSTM the cell state dimension equals output_dims, while for a bidirectional Bi-LSTM, output_dims is 2 times the cell state dimension. The added 1, as in the fully connected layer, accounts for the bias at the output; the factor 4 corresponds to the four gate transformations of an LSTM cell, and the factor 2 to the two directions.
3. The BN (Batch Normalization) layer and the Dropout layer are network layers that assist training; they do not participate in the back-propagation operation as training units, and their fixed size is small enough to be ignored.
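Under the above formulas, the operation-unit counts can be tallied with a few lines of Python (the Viterbi count follows the O(L · 2^(km)) analysis above; the example layer sizes are those of this embodiment):

```python
def dense_params(input_dims, output_dims):
    """Formula (3): one bias per neuron."""
    return (input_dims + 1) * output_dims

def bilstm_params(input_dims, hidden_size):
    """Formula (4): four gate transformations over the concatenated input and
    hidden state plus bias, doubled for the two directions of a Bi-LSTM."""
    return 2 * 4 * (input_dims + hidden_size + 1) * hidden_size

def viterbi_ops(L, k, m):
    """Total add-compare operations of Viterbi decoding: L * 2^(k*m)."""
    return L * 2 ** (k * m)

n = 5                                  # code length of the (5,1,9) code
print(dense_params(n, 50))             # 300: input Dense layer; changing n
                                       # only changes this layer (50 per bit)
print(bilstm_params(50, 200))          # 401600: a Bi-LSTM layer, independent of n
print(viterbi_ops(L=10**6, k=1, m=9))  # 512000000: exponential in k*m
```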
For the neural network decoder, each Param above represents one multiplication. Because the construction of the codebook depends only on the code length of the convolutional code and not on its constraint length, when the code length n is much smaller than the cell dimension of the bidirectional LSTM layer (200 in this embodiment), and the constraint length m is much smaller than the neural network time sequence length Timesteps set when constructing code blocks in the method of the invention, the number of operation units is determined mainly by the dimensions and number of the LSTM layers. In other words, the complexity change introduced by a change in the code length n appears only in the first layer of the neural network; taking the simulation parameters of this embodiment as an example, the influence of a code length change on the operation count is only ΔParam = 50Δn. The computational complexity is therefore O(n), linear in the code length, with almost no influence on the overall operation count.
TABLE 3 Bi-directional LSTM decoder vs. Viterbi decoder complexity
[Table 3 is reproduced as an image in the original publication.]
In conclusion, the bidirectional LSTM neural network convolutional code decoder accommodates long codes far better than conventional Viterbi decoding: the longer the code length and the larger the constraint length, the more obvious the complexity advantage of the neural network decoder. Moreover, conventional Viterbi decoding usually takes a serial data stream as input, so its parallel capability is poor, whereas the neural network's processing of the data can be highly parallelized by adjusting Timesteps and Batchsize, so the neural network convolutional code decoder also has the advantage in transmission rate for large data volumes.
In a fourteenth embodiment, for the bidirectional LSTM-based convolutional code encoding and decoding method, the bit error rates at different code lengths are analyzed, specifically as follows:
As shown in fig. 18, during encoding the correlation length is fixed and only the code length is changed; the same neural network model and parameters are then used for training, and the bit error rate performance is tested under an AWGN channel, yielding the curves of fig. 18.
As the figure shows, for the transmitter-receiver cascaded neural network system, decoding performance improves as the code length increases, consistent with the convolutional code decoder that uses bidirectional LSTM only at the receiver. Moreover, for the same code length and coding constraint degree, the cascaded network encodes and decodes the information jointly, and its gradient descent training drives encoding and decoding toward a jointly optimized operating point, so it clearly achieves better bit error rate performance than the bidirectional LSTM convolutional code decoder alone.
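A minimal sketch of the BER measurement loop behind such curves, assuming unit-energy symbols; `decode` is a stand-in hard decision, not the trained bidirectional LSTM decoder, so the script runs stand-alone:

```python
# Hedged sketch of a BER-vs-SNR measurement under AWGN.
import numpy as np

def awgn(x: np.ndarray, ebn0_db: float, rate: float) -> np.ndarray:
    # Noise std for unit-energy real symbols at the given Eb/N0 and code rate.
    sigma = np.sqrt(1.0 / (2.0 * rate * 10 ** (ebn0_db / 10.0)))
    return x + sigma * np.random.randn(*x.shape)

def decode(y: np.ndarray) -> np.ndarray:
    return (y < 0).astype(int)  # placeholder for the neural decoder

bits = np.random.randint(0, 2, 100_000)
symbols = 1.0 - 2.0 * bits  # map bit 0 -> +1, bit 1 -> -1
for ebn0 in (0, 2, 4, 6):
    rx = decode(awgn(symbols, ebn0, rate=0.5))
    print(f"Eb/N0={ebn0} dB  BER={np.mean(rx != bits):.4f}")
```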
In a fifteenth embodiment, the bidirectional LSTM-based convolutional code encoding and decoding method according to the present invention is compared with traditional Viterbi decoding, specifically as follows:
As shown in fig. 19, the bit error rate curves of the cascade-trained neural network coding and decoding system at different code lengths are compared with traditional (5,1,9) convolutional coding with QPSK modulation-demodulation and Viterbi decoding. Viterbi decoding is the optimal decoding method for convolutional codes, and under the same code length and constraint length its bit error rate performance in the low signal-to-noise-ratio region is indeed better than that of the coding and decoding scheme obtained by neural network training; however, as the neural network's code length increases, its decoding performance gradually improves and eventually exceeds traditional Viterbi decoding performance.
However, to achieve the same bit error rate performance, the convolutional code encoder and decoder obtained by cascaded whole-system training of the deep learning transmitter and receiver needs a longer code length, i.e. lower spectral efficiency. In applications with modest information-rate requirements, though, the coding scheme learned by the neural network has advantages of its own. For example, in military communication scenarios the requirements on transmission rate and delay are less strict than in mobile communication, while the requirements on confidentiality and interception resistance of information transmission are high.
The convolutional coding and modulation scheme obtained by neural network training is similar in principle to traditional convolutional coding with QPSK modulation, but its output is no longer the two levels 0 and 1; it is an irregular floating-point sequence produced by the network. Taking code length 5, correlation length 9 and two-dimensional modulation in fig. 19 as an example, the two-dimensional constellation output by the modulator network is shown in fig. 20.
As fig. 20 shows, although the encoder's output constellation points lie on four diagonals as in QPSK, the output of the neural network convolutional encoder is no longer a bit sequence of 0s and 1s but a floating-point sequence with a varied distribution; the resulting constellation is not a conventional linear two-dimensional modulation constellation, and a conventional QPSK-plus-Viterbi receiver cannot demodulate and recover the information at all. The coding and decoding scheme obtained by neural network training can therefore be regarded as an LPI (Low Probability of Intercept) signal: rather than lowering the transmit power so that an adversary cannot detect the signal, it ensures that the adversary cannot recover the intercepted signal in time, thereby achieving interception resistance and security.
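A minimal plotting sketch for such a constellation view; the Gaussian samples are purely a placeholder for the trained modulator's real/imaginary output pairs:

```python
# Hedged sketch: scatter the modulator network's 2-D outputs as in
# fig. 20. Gaussian samples stand in for the trained network's output.
import numpy as np
import matplotlib.pyplot as plt

iq = np.random.randn(2000, 2)  # placeholder (in-phase, quadrature) pairs
plt.scatter(iq[:, 0], iq[:, 1], s=4)
plt.xlabel("In-phase")
plt.ylabel("Quadrature")
plt.title("Learned constellation (illustrative)")
plt.show()
```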
In a sixteenth embodiment, the complexity of the bidirectional LSTM-based convolutional code encoding and decoding method according to the present invention is analyzed, specifically as follows:
For a traditional convolutional code encoding and decoding system, the complexity is concentrated at the decoder: the encoder's shift register implements the convolution with the generator polynomial, so its complexity is negligible. The analysis therefore matches that of the traditional Viterbi decoder in the bidirectional LSTM-based convolutional code decoding method: with total information length L and an (n, k, m) convolutional code, there are 2^(km) states, 2^(km) add-compare operations per step to obtain the surviving path of each state, a total of L·2^(km) additions and comparisons, and L·2^(km) storage units, giving an overall decoder-side complexity of O(L·2^(km)).
For the convolutional code encoding and decoding system built entirely from neural networks, the complexity must account for both the encoder at the transmitter and the decoder at the receiver. Assume the code rate generated by the network is k/n and the correlation sequence length is m. From the network structure and codebook construction method of sections 1 and 2, at the encoder the quantity that mainly affects the amount of network computation is the feature dimension of the input data: the input codebook is constructed according to the correlation sequence length m, but one correlation length can construct the correlation sequence of only one information bit, and a rate-k/n code has k information bits, so the computational complexity at the encoder is O(k·m). At the decoder, because the network constructed by the invention is the same as that in the bidirectional LSTM-based convolutional code decoding method, the computational complexity depends only on the code length n, i.e. O(n). The overall computational complexity of the whole network is therefore O(k·m + n).
TABLE 4 Overall encoding and decoding complexity comparison
System                                        Overall complexity
Traditional convolutional coding + Viterbi    O(L·2^(km))
Cascaded neural network encoder and decoder   O(k·m + n)
As table 4 shows, because traditional convolutional coding uses serial input, the encoding and decoding complexity for information of the same length is linear in the sequence length but exponential in the product of the encoded information bits and the constraint length. The neural network coding and decoding system, owing to its block structure and parallel processing, has a complexity tied only to the input-dimension changes of the network: for the network designed here, the computational complexity is linear in the product of the information bits and the constraint length plus the code length, and with the most common 1/n coding rate it is linear simply in the sum of the constraint length and the code length. For channel coding and decoding systems with large data volumes, long codes and a high degree of constraint, the neural network therefore holds a considerable advantage in computational complexity over the traditional convolutional code system.
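A minimal sketch evaluating the two asymptotic counts of table 4 side by side; constants and layer sizes are deliberately omitted, so the numbers indicate scaling only:

```python
# Hedged sketch: scaling comparison from table 4 (constants omitted).
# Classical: O(L * 2^(km)); cascaded neural system: O(k*m + n) per block.

def classical_ops(L: int, k: int, m: int) -> int:
    return L * 2 ** (k * m)

def neural_ops(k: int, m: int, n: int) -> int:
    return k * m + n

L, k, n = 1000, 1, 5
for m in (7, 9, 11):
    print(f"m={m:>2}: classical ~ {classical_ops(L, k, m):>10,}  "
          f"neural ~ {neural_ops(k, m, n)} (times layer-size constants)")
```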
In summary, the neural network system designed by the present invention, with transmitter and receiver cascaded as a whole, encodes and decodes convolutional codes for the source sequence, and under long code lengths and high constraint degrees it has a definite bit error rate advantage over using a neural network only in the convolutional decoder. Although its bit error rate performance differs slightly from that of traditional Viterbi decoding, it has advantages of its own in computational complexity; and because encoding and decoding are performed entirely by the neural network, it constitutes a completely new convolutional coding and decoding scheme whose interception resistance and security are greatly improved relative to traditional communication systems.

Claims (10)

1. A method for decoding convolutional codes based on bi-directional LSTM, the method comprising:
constructing a bidirectional LSTM neural network decoder, wherein the neural network decoder adopts a bidirectional LSTM neural network for decoding;
establishing a receiving sequence data set, and constructing a training codebook according to the receiving sequence data set and a bidirectional LSTM neural network;
selecting a training signal-to-noise ratio, wherein the training signal-to-noise ratio is the signal-to-noise ratio of the receiving sequence;
setting simulation parameters, and training the bidirectional LSTM neural network decoder by using the training codebook and the training signal-to-noise ratio;
and decoding by using the trained bidirectional LSTM neural network decoder.
2. The bi-directional LSTM-based convolutional code decoding method of claim 1, wherein said constructing a bi-directional LSTM neural network decoder specifically comprises: the bidirectional LSTM neural network decoder comprises an input layer, hidden layers and an output layer;
the hidden layers are formed by combining a plurality of bidirectional LSTM network layers, batch normalization (BN) layers and Dropout layers, and are used for extracting the correlation between adjacent code words of the received sequence;
the input layer is used for amplifying the input codebook into a high-dimensional space;
the bidirectional LSTM network layer, by learning the correlation between preceding and following code words of the convolutional code input sequence, turns the sequentially input convolutional code code words into a feature pattern that the neural network can fit;
the batch normalization BN layer redistributes the output data of the neural network within a specified range through a normalization formula without changing its distribution law;
the Dropout layer prevents overfitting by temporarily removing neural network training units from the network with a certain probability during training;
the output layer is used to reduce the information of high dimension to low dimension.
3. The bi-directional LSTM based convolutional code decoding method of claim 1, wherein said establishing a receiving sequence dataset and constructing a training codebook according to the receiving sequence dataset and a bi-directional LSTM neural network specifically comprises:
according to the received sequence, cutting sequences of length n×m, denoted [R11, R12, ..., R1n, R21, R22, ..., R2n, ..., Rm1, Rm2, ..., Rmn], wherein the code length n corresponds to the input dimension (Input_dims) of the bidirectional LSTM neural network and m corresponds to its Timesteps, and converting the received sequence into an m×n two-dimensional tensor;
inputting the m×n two-dimensional tensors in parallel to the bidirectional LSTM neural network;
changing all the two-dimensional tensors into (Batchsize, m, n) three-dimensional tensors according to the Batchsize set in the bidirectional LSTM neural network parameters;
and constructing a training codebook from the (Batchsize, m, n) three-dimensional tensor.
4. The bi-directional LSTM-based convolutional code decoding method as claimed in claim 1, wherein said selecting a training snr specifically comprises:
selecting a signal-to-noise ratio data set according to the received sequence data set;
and selecting the training signal-to-noise ratio from the signal-to-noise ratio data set according to the principle of the optimal bit error rate curve.
5. A method for encoding and decoding bi-directional LSTM based convolutional codes, comprising:
establishing a cascade training neural network, wherein the cascade training neural network comprises a transmitter, channel noise and a receiver;
the transmitter comprises an LSTM encoder and a modulator, the LSTM encoder is sequentially connected with the modulator, the LSTM encoder adopts a bidirectional LSTM neural network for encoding, and the modulator is used for mapping a constellation map;
the receiver comprises a bidirectional LSTM neural network decoder and a decision output, wherein the bidirectional LSTM neural network decoder is constructed according to the method in claim 1, the bidirectional LSTM neural network decoder is sequentially connected with the decision output, and the decision output is used for classifying the output of the bidirectional LSTM neural network decoder according to a decision threshold and decoding information bits;
establishing a received information source sequence data set, and constructing a cascade training codebook of the cascade training neural network according to the received information source sequence data set and a bidirectional LSTM neural network;
selecting a cascade training signal-to-noise ratio, wherein the cascade training signal-to-noise ratio is the signal-to-noise ratio of the convolutional code output by the modulator;
setting simulation parameters, and training the cascade training neural network by using the cascade training codebook and the cascade training signal-to-noise ratio;
and coding and decoding by using the trained cascade trained neural network.
6. The bi-directional LSTM based convolutional code encoding and decoding method of claim 5, wherein said LSTM encoder comprises: an input layer, hidden layers and an output layer;
the hidden layers are formed by combining a plurality of bidirectional LSTM network layers, batch normalization BN layers and Dropout layers, and are used for generating time-correlated sequences through the bidirectional LSTM neurons;
the input layer is used for amplifying the input codebook into a high-dimensional space;
the bidirectional LSTM network layer changes the sequentially input convolutional code coding code words into a characteristic rule which can be fitted by a neural network by learning the correlation between the front and the back of the convolutional code input coding sequence;
the batch standardized BN layer redistributes the output data of the neural network in a specified range through a normalization formula without changing the distribution rule;
the Dropout layer prevents overfitting by temporarily removing neural network training units from the network with a certain probability during training;
the output layer is used to reduce the information of high dimension to low dimension.
7. The method according to claim 5, wherein the modulator is a DNN fully-connected network modulator that maps the encoded information to a transmission sequence of corresponding code length and performs two-dimensional modulation;
the two-dimensional modulation specifically splits each symbol into a real part and an imaginary part, which are transmitted simultaneously.
8. The bi-directional LSTM based convolutional code encoding and decoding method of claim 5, wherein said building a received source sequence data set, and constructing a cascaded training codebook of said cascaded trained neural network from said received source sequence data set and bi-directional LSTM neural network comprises:
setting a one-way queue of length L, wherein L = M − 1, M is the constraint length of the convolutional code, and the queue is initialized to all 0s;
inputting the received source sequence serially into the queue in groups of Timesteps, the queue state value at the moment each bit is input being the mapped output;
constructing a Timesteps×L two-dimensional tensor, wherein each row of the tensor is correlated with the preceding L rows and the following L rows;
changing the Timesteps×L two-dimensional tensor into a (Batchsize, Timesteps, L) three-dimensional tensor according to the Batchsize set in the neural network parameters;
and constructing a cascade training codebook from the (Batchsize, Timesteps, L) three-dimensional tensor.
9. A computer device comprising a memory and a processor, the memory having a computer program stored therein, characterized in that the steps of the method of any one of claims 1 to 4 are performed when the processor runs the computer program stored in the memory.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 4.
CN202111462642.XA 2021-12-02 2021-12-02 Convolutional code decoding method based on bidirectional LSTM and convolutional code encoding and decoding method Pending CN114268328A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111462642.XA CN114268328A (en) 2021-12-02 2021-12-02 Convolutional code decoding method based on bidirectional LSTM and convolutional code encoding and decoding method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111462642.XA CN114268328A (en) 2021-12-02 2021-12-02 Convolutional code decoding method based on bidirectional LSTM and convolutional code encoding and decoding method

Publications (1)

Publication Number Publication Date
CN114268328A true CN114268328A (en) 2022-04-01

Family

ID=80826088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111462642.XA Pending CN114268328A (en) 2021-12-02 2021-12-02 Convolutional code decoding method based on bidirectional LSTM and convolutional code encoding and decoding method

Country Status (1)

Country Link
CN (1) CN114268328A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115441993A (en) * 2022-09-01 2022-12-06 中国人民解放军国防科技大学 Channel coding and decoding method, device, equipment and storage medium
CN115441993B (en) * 2022-09-01 2024-05-28 中国人民解放军国防科技大学 Channel coding and decoding method, device, equipment and storage medium
CN116055273A (en) * 2023-01-19 2023-05-02 浙江工业大学 QPSK receiver cascaded by neural network and auxiliary model training method thereof
CN116073952A (en) * 2023-02-01 2023-05-05 西安电子科技大学 Quick parallel convolution coding and decoding method, system, equipment and medium based on MaPU architecture
CN116073952B (en) * 2023-02-01 2024-03-12 西安电子科技大学 Quick parallel convolution coding and decoding method, system, equipment and medium based on MaPU architecture
CN116155453A (en) * 2023-04-23 2023-05-23 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Decoding method and related equipment for dynamic signal-to-noise ratio
CN116155453B (en) * 2023-04-23 2023-07-07 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Decoding method and related equipment for dynamic signal-to-noise ratio

Similar Documents

Publication Publication Date Title
CN114268328A (en) Convolutional code decoding method based on bidirectional LSTM and convolutional code encoding and decoding method
RU2303330C1 (en) Method for receiving signal in communication system with several channels for transmitting and receiving
CN109921804B (en) Self-adaptive fusion serial offset list polarization code decoding method and system
Erdemir et al. Generative joint source-channel coding for semantic image transmission
CN112464837A (en) Shallow sea underwater acoustic communication signal modulation identification method and system based on small data samples
CN109525369B (en) Channel coding type blind identification method based on recurrent neural network
CN109347487A (en) Freeze the polarization code SCL interpretation method of auxiliary based on bit
CN107453807B (en) A kind of polarization method, device and the electronic equipment of atmospheric optical communication channel model
CN114337933B (en) High-speed visible light communication system based on SCMA and self-adaptive coding and decoding method
CN107864029A (en) A kind of method for reducing Multiuser Detection complexity
CN107231158A (en) A kind of polarization code iterative receiver, system and polarization code iterative decoding method
CN105656823A (en) Underwater communication Turbo receiving system and underwater communication Turbo receiving method based on minimum bit error rate criterion
CN103929210A (en) Hard decision decoding method based on genetic algorithm and neural network
CN110061803B (en) Low-complexity polar code bit interleaving coding modulation method
CN107659318B (en) Self-adaptive polar code decoding method
CN110808932B (en) Multi-layer sensor rapid modulation identification method based on multi-distribution test data fusion
Miao et al. A low complexity multiuser detection scheme with dynamic factor graph for uplink SCMA systems
WO2009115042A1 (en) Joint iterative detection and decoding method and device
CN113114269A (en) Belief propagation-information correction decoding method
CN111431542A (en) CRC (Cyclic redundancy check) -assisted polarization code-based confidence propagation flip algorithm design
CN114629595B (en) Distributed shaping polarization code method and system suitable for turbulent flow channel
CN111431620B (en) Construction method of differential spatial modulation system based on PPM modulation
Saha et al. Novel Multi-Parameter based Rate-Matching of Polar Codes
Li et al. Design of an end-to-end autoencoder for maritime communication system towards internet of vessels
Li et al. A denoiser for correlated noise channel decoding: Gated-neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination