CN114598575B

CN114598575B - Deep learning channel estimation method based on self-attention mechanism

Info

Publication number: CN114598575B
Application number: CN202210239196.4A
Authority: CN
Inventors: 赵嗣强; 邱玲; 许逸丰
Original assignee: University of Science and Technology of China USTC
Current assignee: University of Science and Technology of China USTC
Priority date: 2022-03-11
Filing date: 2022-03-11
Publication date: 2024-01-09
Anticipated expiration: 2042-03-11
Also published as: CN114598575A

Abstract

The invention discloses a deep learning channel estimation method based on a self-attention mechanism for multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) systems. The method exploits the correlation between time-frequency-domain channel impulse responses: a self-attention module extracts channel feature information and builds a globally dependent feature map, and a deep learning network learns the correlation between the time-frequency-domain channel impulse responses in depth. Compared with existing channel estimation methods for MIMO-OFDM systems, the method gives a satisfactory improvement in the accuracy of the channel estimation result.

Description

Deep learning channel estimation method based on self-attention mechanism

Technical Field

The invention belongs to the technical field of wireless communication, and particularly relates to a deep learning channel estimation method based on a self-attention mechanism, which is applicable to a multi-input multi-output orthogonal frequency division multiplexing system.

Background

An article in Transactions on Emerging Telecommunications Technologies ("Orthogonal frequency division multiplexing with subcarrier power modulation for doubling the spectral efficiency of, G and beyond networks." in Transactions on Emerging Telecommunications Technologies, 2020, 31(4): e3921.) states that orthogonal frequency division multiplexing is widely used in 5G wireless communications because of its robustness to inter-carrier and inter-symbol interference, and that it will remain a key technology in 6G communications. Multiple-input multiple-output systems increase channel capacity without increasing bandwidth, so more users can be served within the available bandwidth. They are therefore often combined with orthogonal frequency division multiplexing to achieve higher data rates, higher spectral efficiency and better resistance to multipath fading. The performance of a MIMO-OFDM system in a wireless fading environment depends heavily on an accurate channel impulse response, so obtaining an accurate channel impulse response through channel estimation is very important. An article in IEEE Communications Letters ("Deep learning-based channel estimation." in IEEE Communications Letters, 2019, 23(4): 652-655.) proposes treating the time-frequency response of a fast-fading communication channel as a two-dimensional image and estimating the channel impulse response with a channel estimation method based on a deep learning algorithm. However, the accuracy of the channel impulse response obtained by that method is limited, because it does not effectively exploit the correlation between the time-frequency-domain channel impulse responses.

Disclosure of Invention

The invention provides a deep learning channel estimation method based on a self-attention mechanism for multiple-input multiple-output orthogonal frequency division multiplexing systems, which effectively exploits the correlation between time-frequency-domain channel impulse responses and thereby improves the accuracy of the estimation result.

The deep learning channel estimation method based on a self-attention mechanism for a multiple-input multiple-output orthogonal frequency division multiplexing system according to the invention is characterized in that:

for a multiple-input multiple-output orthogonal frequency division multiplexing system with $N_t$ transmit antennas and $N_r$ receive antennas, an orthogonal frequency division multiplexing symbol on the $n_t$-th transmit antenna at the transmitting end is denoted as:

$$X_{n_t} = [X_{n_t}^1, X_{n_t}^2, \ldots, X_{n_t}^L]^T$$

where $L$ is the number of subcarriers and $[\cdot]^T$ denotes the transpose. The symbol on the kth subcarrier at the $n_r$-th receive antenna is denoted as:

$$Y_{n_r}^k = \sum_{n_t=1}^{N_t} H_{n_r,n_t}^k X_{n_t}^k + Z_{n_r}^k$$

where $H_{n_r,n_t}^k$ denotes the channel impulse response between the $n_t$-th transmit antenna and the $n_r$-th receive antenna on the kth subcarrier, and $Z_{n_r}^k$ denotes the additive white Gaussian noise at the receive antenna. For one of the pilot symbols, the symbols on the kth subcarrier received by all $N_r$ receive antennas are expressed as:

$$Y^k = H^k X^k + Z^k, \quad k = 1, \ldots, L$$

where $Y^k$, $X^k$ and $Z^k$ denote the receive matrix at the receiving end, the transmit matrix at the transmitting end and the noise matrix, respectively, and $H^k$ denotes the frequency-domain channel impulse response matrix on the kth subcarrier, expressed as:

$$H^k = \begin{bmatrix} H_{1,1}^k & \cdots & H_{1,N_t}^k \\ \vdots & \ddots & \vdots \\ H_{N_r,1}^k & \cdots & H_{N_r,N_t}^k \end{bmatrix}$$

Channel estimation of $H^k$ yields the channel impulse response estimate $\hat{H}^k$. The channel estimation method based on the least-squares criterion minimizes the following cost function:

$$J(\hat{H}^k) = \left\| Y^k - \hat{H}^k X^k \right\|^2$$

Setting the partial derivative of the cost function with respect to $\hat{H}^k$ to 0, i.e.

$$\frac{\partial J(\hat{H}^k)}{\partial \hat{H}^k} = 0$$

gives the channel impulse response estimate under the least-squares criterion:

$$\hat{H}_{LS}^k = Y^k (X^k)^H \left( X^k (X^k)^H \right)^{-1}$$

where $(\cdot)^H$ denotes the conjugate transpose and $(\cdot)^{-1}$ denotes the matrix inverse;
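The least-squares step described above can be illustrated with a minimal NumPy sketch. It assumes the pilot matrix holds $N_p \ge N_t$ pilot symbols per subcarrier so that the inverse exists; the function name `ls_channel_estimate`, the orthogonal pilot pattern and the noise level are illustrative assumptions, not values taken from the invention.

```python
import numpy as np

def ls_channel_estimate(Y, X):
    """Least-squares estimate of H in Y = H X + Z for one subcarrier.

    Y: (N_r, N_p) received pilots; X: (N_t, N_p) transmitted pilots.
    Assumes N_p >= N_t and that X X^H is invertible.
    """
    XH = X.conj().T
    return Y @ XH @ np.linalg.inv(X @ XH)

rng = np.random.default_rng(0)
N_t, N_r, N_p = 2, 2, 2
# Random complex channel, used only to exercise the estimator.
H = rng.standard_normal((N_r, N_t)) + 1j * rng.standard_normal((N_r, N_t))
# Orthogonal pilot pattern (an assumption made for this sketch).
X = np.array([[1, 1], [1, -1]], dtype=complex)
Z = 0.01 * (rng.standard_normal((N_r, N_p)) + 1j * rng.standard_normal((N_r, N_p)))
Y = H @ X + Z
H_ls = ls_channel_estimate(Y, X)
```

Without noise the estimator returns the channel exactly; with noise the error is the noise term multiplied by the pseudo-inverse of the pilot matrix, which is why the raw least-squares estimate is comparatively noisy at low signal-to-noise ratio.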

the specific steps of the deep learning channel estimation method based on the self-attention mechanism are as follows:

Step S1: estimate the channel state information at the pilot symbols with the least-squares channel estimation algorithm to obtain the estimated channel matrix $\hat{H}_{LS}$;

Step S2: perform up-sampling by linear interpolation on the channel matrix $\hat{H}_{LS}$ to obtain the frequency-domain channel response at the data symbols, combine it with the frequency-domain channel response at the pilots to obtain the complete frequency-domain channel response matrix, and reshape the complete frequency-domain channel response matrix into a low-resolution matrix of size $L \times N_{sym} \times (N_t \times N_r \times 2)$, where $L$ is the number of subcarriers, $N_{sym}$ is the number of OFDM symbols, $N_t$ is the number of transmit antennas and $N_r$ is the number of receive antennas;

Step S3: extract features from the low-resolution matrix with two convolutional neural networks to obtain a feature map of the low-resolution matrix; pass the feature map through a self-attention module to obtain a weighted self-attention feature map, multiply the self-attention feature map by a learnable coefficient and add the result to the original feature map, obtaining relational features that capture the interdependence between any two positions in the feature map; pass the relational features twice through a convolutional network and a self-attention module to obtain more comprehensively learned relational dependency features;

Step S4: pass the relational dependency features through a convolutional network to output a matrix whose dimensions match those of the channel matrix $\hat{H}$ that is finally to be estimated.
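The reshaping in step S2 can be sketched as follows: the complex channel response is split into real and imaginary parts, which become separate channels of a real-valued tensor. The function names and the value $N_{sym} = 14$ are assumptions chosen for illustration; the ordering of the channels (all real parts first, then all imaginary parts) is likewise one possible convention.

```python
import numpy as np

def to_real_tensor(H):
    """Complex (L, N_sym, N_r, N_t) -> real (L, N_sym, N_t * N_r * 2)."""
    L, N_sym = H.shape[:2]
    flat = H.reshape(L, N_sym, -1)            # one entry per antenna pair
    return np.concatenate([flat.real, flat.imag], axis=-1)

def to_complex_tensor(T, N_r, N_t):
    """Inverse of to_real_tensor: real (L, N_sym, C) -> complex (L, N_sym, N_r, N_t)."""
    L, N_sym, C = T.shape
    half = C // 2                             # first half real, second half imaginary
    return (T[..., :half] + 1j * T[..., half:]).reshape(L, N_sym, N_r, N_t)

rng = np.random.default_rng(0)
L_sub, N_sym, N_r, N_t = 64, 14, 2, 2         # N_sym = 14 is an assumed value
H_full = rng.standard_normal((L_sub, N_sym, N_r, N_t)) \
    + 1j * rng.standard_normal((L_sub, N_sym, N_r, N_t))
T = to_real_tensor(H_full)                    # shape (64, 14, 8)
```

Keeping an inverse mapping alongside the forward one makes it straightforward to convert the network output of step S4 back into a complex channel matrix.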

The invention relates to a deep learning channel estimation method based on a self-attention mechanism for multiple-input multiple-output orthogonal frequency division multiplexing systems. The method effectively exploits the correlation between time-frequency-domain channel impulse responses, which earlier channel estimation work did not consider. By feeding the channel features into a super-resolution neural network equipped with a self-attention mechanism, the method effectively captures the interdependence between any two positions. Because the elements of the channel matrix generated by the adopted channel model are correlated to some degree, the self-attention module exploits the correlation between time-frequency-domain channel impulse responses more effectively, and the accuracy of the channel estimation result improves on that of existing channel estimation methods for multiple-input multiple-output orthogonal frequency division multiplexing systems.

Description of the drawings:

FIG. 1 is a flowchart of a neural network implementation of a deep learning channel estimation method based on a self-attention mechanism according to the present invention;

FIG. 2 is a schematic diagram of a self-attention mechanism module according to the present invention.

FIG. 3 is a graph comparing the Mean Square Error (MSE) performance of the method of the present invention with the channel estimation of an existing MIMO OFDM system at different signal-to-noise ratio (SNR) settings;

FIG. 4 is a graph comparing the Bit Error Rate (BER) performance of the method of the present invention with the channel estimation of existing MIMO-OFDM systems at different signal-to-noise ratio (SNR) settings.

Detailed Description

The deep learning channel estimation method based on the self-attention mechanism for MIMO-OFDM systems according to the present invention is described in further detail below through the following embodiment, with reference to the accompanying drawings.

Embodiment 1:

To facilitate understanding of how the method is implemented, the following describes in detail how the present invention performs channel estimation with the deep learning method based on the self-attention mechanism. For a multiple-input multiple-output orthogonal frequency division multiplexing system with $N_t$ transmit antennas and $N_r$ receive antennas, an orthogonal frequency division multiplexing symbol on the $n_t$-th transmit antenna at the transmitting end is denoted as:

$$X_{n_t} = [X_{n_t}^1, X_{n_t}^2, \ldots, X_{n_t}^L]^T$$

where $L$ is the number of subcarriers and $[\cdot]^T$ denotes the transpose. The symbol on the kth subcarrier at the $n_r$-th receive antenna is denoted as:

$$Y_{n_r}^k = \sum_{n_t=1}^{N_t} H_{n_r,n_t}^k X_{n_t}^k + Z_{n_r}^k$$

where $H_{n_r,n_t}^k$ denotes the channel impulse response between the $n_t$-th transmit antenna and the $n_r$-th receive antenna on the kth subcarrier, and $Z_{n_r}^k$ denotes the additive white Gaussian noise at the receive antenna. For one of the pilot symbols, the symbols on the kth subcarrier received by all $N_r$ receive antennas are expressed as:

$$Y^k = H^k X^k + Z^k, \quad k = 1, \ldots, L$$

The channel impulse response estimate under the least-squares criterion is

$$\hat{H}_{LS}^k = Y^k (X^k)^H \left( X^k (X^k)^H \right)^{-1}$$

where $(\cdot)^H$ denotes the conjugate transpose and $(\cdot)^{-1}$ denotes the matrix inverse;

Step S2: perform up-sampling by linear interpolation on the channel matrix $\hat{H}_{LS}$ to obtain the frequency-domain channel response at the data symbols, combine it with the frequency-domain channel response at the pilots to obtain the complete frequency-domain channel response matrix, and reshape the complete frequency-domain channel response matrix into a low-resolution matrix of size $L \times N_{sym} \times (N_t \times N_r \times 2)$, where $L$ is the number of subcarriers, $N_{sym}$ is the number of OFDM symbols, $N_t$ is the number of transmit antennas and $N_r$ is the number of receive antennas;

Fig. 1 shows a flowchart of a neural network implementation of the deep learning channel estimation method based on the self-attention mechanism.

In this embodiment, step S1, namely estimating the channel state information at the pilot symbols with the least-squares channel estimation algorithm to obtain the estimated channel matrix $\hat{H}_{LS}$, specifically comprises:

Pilot symbols $X^k$ with known pilot positions are transmitted, and the received signal $Y^k$ is used to estimate, by the least-squares method, the channel state information at all pilot symbols, giving the estimated channel matrix $\hat{H}_{LS}$, i.e. the channel matrix A0 in Fig. 1.

In one embodiment, step S2, namely performing up-sampling by linear interpolation on the channel matrix $\hat{H}_{LS}$ to obtain the frequency-domain channel response at the data symbols, combining it with the frequency-domain channel response at the pilots to obtain the complete frequency-domain channel response matrix, and reshaping the complete frequency-domain channel response matrix into a low-resolution matrix of size $L \times N_{sym} \times (N_t \times N_r \times 2)$, where $L$ is the number of subcarriers, $N_{sym}$ is the number of OFDM symbols, $N_t$ is the number of transmit antennas and $N_r$ is the number of receive antennas, specifically comprises:

Up-sampling by linear interpolation is performed on $\hat{H}_{LS}$, i.e. the up-sampling operation A1 in the first step of Fig. 1; the method implements up-sampling by linear interpolation. Assuming that the interval between two pilots is $L$, the frequency-domain channel response estimate at a position between the mth pilot and the (m+1)th pilot can be expressed as:

$$\hat{H}(mL + l) = \hat{H}_p(m) + \frac{l}{L}\left( \hat{H}_p(m+1) - \hat{H}_p(m) \right), \quad 0 \le l < L$$

where $\hat{H}_p(m)$ denotes the least-squares estimate at the mth pilot and $l$ is the offset from the mth pilot.

After the frequency-domain channel response at the data symbols is obtained, the real and imaginary parts of the matrix are used as separate channel dimensions of the channel matrix to facilitate the convolutional network operations, so that the matrix is reshaped into a low-resolution matrix of size $L \times N_{sym} \times (N_t \times N_r \times 2)$.
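The linear interpolation between neighbouring pilots can be sketched in NumPy as below. The function name `interpolate_pilots`, the pilot spacing of 4 and the toy pilot values are assumptions made for illustration; the sketch interpolates along a single axis and uses the standard two-point linear interpolation rule.

```python
import numpy as np

def interpolate_pilots(H_p, spacing):
    """Linearly interpolate channel estimates between consecutive pilots.

    H_p: complex array of shape (M, ...) holding estimates at pilot
         positions 0, spacing, 2*spacing, ...
    Returns an array of shape ((M - 1) * spacing + 1, ...).
    """
    M = H_p.shape[0]
    out = []
    for m in range(M - 1):
        for l in range(spacing):
            # Point at offset l between pilot m and pilot m + 1.
            out.append(H_p[m] + (l / spacing) * (H_p[m + 1] - H_p[m]))
    out.append(H_p[-1])                        # keep the last pilot itself
    return np.array(out)

H_p = np.array([0 + 0j, 4 + 8j, 8 + 0j])       # toy pilot estimates
H_full = interpolate_pilots(H_p, spacing=4)    # 9 interpolated positions
```

The values at the pilot positions are preserved exactly, so the interpolated grid agrees with the least-squares estimate wherever a pilot was transmitted.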

In one embodiment, step S3, namely extracting features from the low-resolution matrix with two convolutional neural networks to obtain a feature map of the low-resolution matrix; passing the feature map through a self-attention module to obtain a weighted self-attention feature map, multiplying the self-attention feature map by a learnable coefficient and adding the result to the original feature map, obtaining relational features that capture the interdependence between any two positions in the feature map; and passing the relational features twice through a convolutional network and a self-attention module to obtain more comprehensively learned relational dependency features, comprises the following:

The low-resolution matrix is first passed through the two convolutional neural networks A2 and A3 shown in Fig. 1 to obtain a feature map $x$:

$$x = W_2 * \left( W_1 * \mathrm{Inter}(\hat{H}_{LS}) \right)$$

where $\hat{H}_{LS}$ is the channel matrix estimated by the least-squares method, $\mathrm{Inter}(\cdot)$ denotes the interpolation function, i.e. the up-sampling module in the network, and $W_1$ and $W_2$ denote the weights of the two convolutional neural networks. The resulting feature map is then input into the self-attention feature module (i.e. A4 in Fig. 1). Fig. 2 is a schematic diagram of the self-attention module used in the present invention, in which the feature map is B0. Three convolutional networks with a kernel size of 1×1 implement three feature mapping functions f, g and h, which yield the three feature matrices query, key and value:

$$\mathrm{query}(x) = f(x) = W_f x + b_f,$$

$$\mathrm{key}(x) = g(x) = W_g x + b_g,$$

$$\mathrm{value}(x) = h(x) = W_h x + b_h,$$
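A convolution with a 1×1 kernel applies the same linear map to the channel vector at every position of the feature map, which is why each of the three mappings reduces to a weight matrix and a bias. The sketch below illustrates this with NumPy; the helper name `conv1x1`, the channel counts and the random weights are assumptions, not parameters of the invention.

```python
import numpy as np

def conv1x1(x, W, b):
    """1x1 convolution: x (C_in, H, W_), W (C_out, C_in), b (C_out,) -> (C_out, H, W_)."""
    # Contract over input channels independently at each spatial position.
    return np.einsum("oc,chw->ohw", W, x) + b[:, None, None]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 5))             # feature map with 8 channels
W_f = rng.standard_normal((2, 8))              # weights of the query mapping f
b_f = rng.standard_normal(2)                   # bias of the query mapping f
query = conv1x1(x, W_f, b_f)                   # shape (2, 4, 5)
```

The key and value mappings g and h follow the same pattern with their own weights and biases.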

where $W$ and $b$ denote the weights and biases of the convolutional networks, respectively. The three feature matrices query, key and value correspond to B1, B2 and B3 in Fig. 2. The query, key and value matrices are then each reshaped into matrices Query, Key and Value of size $\bar{C} \times N$ (B4, B5 and B6 in Fig. 2), where $\bar{C}$ is the number of channels and $N = H \times W$, with $H$ and $W$ the height and width of the feature map. Query and Key are then multiplied, and the result is passed through a softmax module to obtain the interdependent attention mapping matrix Attention_map of size $N \times N$, i.e. B7 in Fig. 2:

$$\mathrm{Attention\_map}_{j,i} = \frac{\exp(s_{ij})}{\sum_{i=1}^{N} \exp(s_{ij})}, \quad s_{ij} = \mathrm{Query}(x_i)^T \, \mathrm{Key}(x_j)$$

$\mathrm{Attention\_map}_{j,i}$ indicates the degree to which the ith region is attended to when synthesizing the jth region, which realizes the interdependence between any two positions in the features.

Because the resulting Attention_map is a weight matrix whose rows each sum to 1, Attention_map is transposed and then multiplied by the Value matrix to obtain the weighted sum at each position. A convolutional network with a kernel size of 1×1 then yields the self-attention feature map $O$ (i.e. B8 in Fig. 2). The self-attention feature map is multiplied by a coefficient and added to the original feature map, so the final output is:

$$y = \gamma O + x,$$

where $y$ is B9 in Fig. 2 and $\gamma$ is a learnable scalar initialized to 0. Because features in the local neighbourhood are easier to learn, introducing $\gamma$ lets the network rely on local information first and then gradually learn to give higher weight to non-local features. This matches human intuition: a simple task is learned first, and the complexity of the task is then increased gradually. As learning proceeds, the weighted self-attention feature map is continually superimposed on the original feature map, and a globally dependent feature map is finally obtained.
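The self-attention module described above can be sketched end to end in NumPy. This is a minimal illustration under stated assumptions rather than the network of the invention: the 1×1 convolutions are written as matrix products on the flattened feature map, all parameter names (`Wf`, `bf`, `Wo`, ...) and sizes are hypothetical, and the weights are random instead of learned.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)    # subtract max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wf, bf, Wg, bg, Wh, bh, Wo, bo, gamma):
    """x: (C, H, W) feature map. Returns (y, attention_map)."""
    C, H, W = x.shape
    N = H * W
    flat = x.reshape(C, N)                     # one column per position
    query = Wf @ flat + bf[:, None]            # (C_bar, N)
    key = Wg @ flat + bg[:, None]              # (C_bar, N)
    value = Wh @ flat + bh[:, None]            # (C, N)
    attn = softmax(query.T @ key, axis=-1)     # (N, N), each row sums to 1
    weighted = value @ attn.T                  # weighted sum of values per position
    o = Wo @ weighted + bo[:, None]            # final 1x1 convolution
    y = gamma * o + flat                       # residual connection scaled by gamma
    return y.reshape(C, H, W), attn

rng = np.random.default_rng(0)
C, C_bar, H, W = 8, 2, 4, 4
x = rng.standard_normal((C, H, W))
params = dict(
    Wf=rng.standard_normal((C_bar, C)), bf=np.zeros(C_bar),
    Wg=rng.standard_normal((C_bar, C)), bg=np.zeros(C_bar),
    Wh=rng.standard_normal((C, C)), bh=np.zeros(C),
    Wo=rng.standard_normal((C, C)), bo=np.zeros(C),
)
y0, attn = self_attention(x, gamma=0.0, **params)   # gamma = 0: output equals input
y1, _ = self_attention(x, gamma=0.5, **params)      # gamma > 0: attention is mixed in
```

With `gamma` at its initial value of 0 the module is an identity mapping, so training starts from the purely convolutional features and the attention term is blended in only as `gamma` grows.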

In one embodiment, step S4, namely passing the relational dependency features through a convolutional network to output a matrix whose dimensions match those of the channel matrix $\hat{H}$ that is finally to be estimated, specifically comprises:

By setting the number of convolution kernels, a convolution (i.e. A5 in Fig. 1) converts the channel dimension of the final output so that it matches the dimensions of the channel matrix to be estimated. Finally, the neural network is trained by solving the following optimization problem, so that a better estimation result is obtained:

where Θ denotes the trainable parameters of the whole network.
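One plausible form of such a training objective, stated here only as an assumption, is a mean-square-error criterion over $M$ training samples. In the sketch below, $f_\Theta$ denotes the whole network, $\hat{H}_{LS}^{(i)}$ the least-squares input of the ith sample and $H^{(i)}$ the corresponding true channel; these symbols and the choice of the Frobenius norm are illustrative.

```latex
\Theta^{*} = \arg\min_{\Theta} \; \frac{1}{M} \sum_{i=1}^{M}
  \left\| f_{\Theta}\!\left( \hat{H}_{LS}^{(i)} \right) - H^{(i)} \right\|_F^{2}
```

A criterion of this kind would be consistent with the use of mean square error as an evaluation metric, although other regression losses are equally possible.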

After the above steps, the recovered estimated channel (i.e., A6 shown in fig. 1) is finally outputted.

The invention relates to a deep learning channel estimation method based on a self-attention mechanism for multiple-input multiple-output orthogonal frequency division multiplexing systems. The method effectively exploits the correlation between the time-frequency responses of the channels, which earlier channel estimation work did not consider. By feeding the channel features into a super-resolution neural network equipped with a self-attention mechanism, the method effectively captures the interdependence between any two positions. Because the elements of the channel matrix generated by the adopted channel model are correlated to some degree, the self-attention module exploits the correlation between the time-frequency responses of the channels more effectively, and the accuracy of the channel estimation result improves on that of existing channel estimation methods for multiple-input multiple-output orthogonal frequency division multiplexing systems.

Simulations are used to compare the deep learning channel estimation method based on the self-attention mechanism for MIMO-OFDM systems according to the invention with the existing channel estimation methods for such systems. The accuracy of channel estimation is measured by the Mean Square Error (MSE) and the Bit Error Rate (BER).
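The two metrics can be computed as in the following sketch. The function names are hypothetical, and the mean square error is taken here as the average squared magnitude of the estimation error over all entries of the channel matrix, which is one common definition.

```python
import numpy as np

def channel_mse(H_est, H_true):
    """Mean squared magnitude of the channel estimation error."""
    H_est, H_true = np.asarray(H_est), np.asarray(H_true)
    return float(np.mean(np.abs(H_est - H_true) ** 2))

def bit_error_rate(bits_hat, bits):
    """Fraction of detected bits that differ from the transmitted bits."""
    bits_hat, bits = np.asarray(bits_hat), np.asarray(bits)
    return float(np.mean(bits_hat != bits))

mse = channel_mse(np.array([1 + 1j, 0]), np.array([0, 0]))
ber = bit_error_rate([0, 1, 1, 0], [0, 1, 0, 0])
```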

The simulation of the deep learning channel estimation method based on the self-attention mechanism for MIMO-OFDM systems in this embodiment is set up as follows:

For the simulations at different signal-to-noise ratios, the number of transmit antennas is 2, the number of receive antennas is 2, the number of subcarriers is 64, the guard interval is a cyclic prefix with a ratio of 1/4, the modulation scheme is quadrature phase shift keying, the noise is additive white Gaussian noise, the transmit power is normalized, and the signal-to-noise ratio is expressed on a logarithmic (dB) scale.
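Parts of this setup can be sketched in NumPy: quadrature phase shift keying with unit average power, a cyclic prefix, and additive white Gaussian noise at a signal-to-noise ratio given in dB. The Gray bit mapping, the interpretation of the ratio 1/4 as a prefix of one quarter of the symbol length, and all function names are assumptions made for this illustration.

```python
import numpy as np

def qpsk_modulate(bits):
    """Map bit pairs to unit-power QPSK symbols (bit 0 -> +, bit 1 -> -)."""
    b = np.asarray(bits).reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def qpsk_demodulate(symbols):
    """Hard-decision QPSK demapper, inverse of qpsk_modulate."""
    s = np.asarray(symbols)
    return np.stack([s.real < 0, s.imag < 0], axis=1).astype(int).reshape(-1)

def add_cyclic_prefix(time_symbol, cp_ratio=0.25):
    """Prepend the last cp_ratio fraction of the symbol as a guard interval."""
    n_cp = int(len(time_symbol) * cp_ratio)
    return np.concatenate([time_symbol[-n_cp:], time_symbol])

def awgn(signal, snr_db, rng):
    """Add complex white Gaussian noise, assuming unit signal power."""
    noise_var = 10 ** (-snr_db / 10)
    noise = np.sqrt(noise_var / 2) * (
        rng.standard_normal(signal.shape) + 1j * rng.standard_normal(signal.shape)
    )
    return signal + noise

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=128)            # 2 bits per subcarrier, 64 subcarriers
symbols = qpsk_modulate(bits)
tx = add_cyclic_prefix(np.fft.ifft(symbols))   # 64 samples plus 16 prefix samples
rx_symbols = awgn(symbols, snr_db=40.0, rng=rng)
```

Normalizing the transmit power to 1 means the noise variance alone sets the signal-to-noise ratio, which is the convention the `awgn` helper relies on.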

Fig. 3 shows the mean square error of the method of the present invention and of the existing estimation methods at different signal-to-noise ratios. The uppermost solid line, marked C3, represents the existing estimation method based on least squares; the solid line marked C2 represents the existing estimation method based on super-resolution deep learning; and the lowermost solid line, marked C1, represents the method of the present invention. As Fig. 3 shows, a MIMO-OFDM system using the method of the present invention has a smaller mean square error than one using the least-squares method or the super-resolution deep learning method. Relative to the super-resolution deep learning method, the method of the present invention has a gain of about 1.1 dB at high signal-to-noise ratio and about 0.8 dB at low signal-to-noise ratio.

Fig. 4 shows the bit error rate of the method of the present invention and of the existing estimation methods at different signal-to-noise ratios. The uppermost solid line, marked D3, represents the existing estimation method based on least squares; the middle solid line, marked D2, represents the existing estimation method based on super-resolution deep learning; and the lowermost solid line, marked D1, represents the method of the present invention. As Fig. 4 shows, a MIMO-OFDM system using the method of the present invention has a lower bit error rate than one using the least-squares method or the super-resolution deep learning method. As the signal-to-noise ratio increases, the performance gap between the method of the present invention and the existing estimation methods gradually widens.

The above embodiment demonstrates that, because it effectively exploits the correlation between channels, the deep learning channel estimation based on the self-attention mechanism for MIMO-OFDM systems gives a more accurate channel estimation result than the existing channel estimation methods and performs well in terms of both mean square error and bit error rate.

Claims

1. A deep learning channel estimation method based on a self-attention mechanism is characterized in that:

$$Y^k = H^k X^k + Z^k, \quad k = 1, \ldots, L$$

wherein $Y^k$, $X^k$ and $Z^k$ denote the receive matrix at the receiving end, the transmit matrix at the transmitting end and the noise matrix, respectively, and $H^k$ denotes the frequency-domain channel impulse response matrix on the kth subcarrier, expressed as:

$$H^k = \begin{bmatrix} H_{1,1}^k & \cdots & H_{1,N_t}^k \\ \vdots & \ddots & \vdots \\ H_{N_r,1}^k & \cdots & H_{N_r,N_t}^k \end{bmatrix}$$

wherein $(\cdot)^{-1}$ denotes the matrix inverse;

Step S2: performing up-sampling by linear interpolation on the channel matrix $\hat{H}_{LS}$ to obtain the frequency-domain channel response at the data symbols, combining it with the frequency-domain channel response at the pilots to obtain the complete frequency-domain channel response matrix, and reshaping the complete frequency-domain channel response matrix into a low-resolution matrix of size $L \times N_{sym} \times (N_t \times N_r \times 2)$, where $L$ is the number of subcarriers, $N_{sym}$ is the number of OFDM symbols, $N_t$ is the number of transmit antennas and $N_r$ is the number of receive antennas, comprising:

performing up-sampling by linear interpolation on $\hat{H}_{LS}$, the up-sampling being implemented by linear interpolation; assuming that the interval between two pilots is $L$, the frequency-domain channel response estimate at a position between the mth pilot and the (m+1)th pilot can be expressed as:

$$\hat{H}(mL + l) = \hat{H}_p(m) + \frac{l}{L}\left( \hat{H}_p(m+1) - \hat{H}_p(m) \right), \quad 0 \le l < L$$

wherein $\hat{H}_p(m)$ denotes the least-squares estimate at the mth pilot and $l$ is the offset from the mth pilot;

after the frequency-domain channel response at the data symbols is obtained, the real and imaginary parts of the matrix are used as separate channel dimensions of the channel matrix, and the matrix is reshaped into a low-resolution matrix of size $L \times N_{sym} \times (N_t \times N_r \times 2)$;

Step S3: extracting features from the low-resolution matrix with two convolutional neural networks to obtain a feature map of the low-resolution matrix; passing the feature map through a self-attention module to obtain a weighted self-attention feature map, multiplying the self-attention feature map by a learnable coefficient and adding the result to the original feature map, obtaining relational features that capture the interdependence between any two positions in the feature map; and passing the relational features twice through a convolutional network and a self-attention module to obtain more comprehensively learned relational dependency features, comprising:

the low-resolution matrix is first passed through two convolutional neural networks to obtain a feature map $x$:

$$x = W_2 * \left( W_1 * \mathrm{Inter}(\hat{H}_{LS}) \right)$$

wherein $\hat{H}_{LS}$ is the channel matrix estimated by the least-squares method, $\mathrm{Inter}(\cdot)$ denotes the interpolation function, i.e. the up-sampling module in the network, and $W_1$ and $W_2$ denote the weights of the two convolutional neural networks; the resulting feature map is input into a self-attention feature module, in which three convolutional networks with a kernel size of 1×1 implement three feature mapping functions f, g and h, which yield the three feature matrices query, key and value:

$$\mathrm{query}(x) = f(x) = W_f x + b_f,$$

$$\mathrm{key}(x) = g(x) = W_g x + b_g,$$

$$\mathrm{value}(x) = h(x) = W_h x + b_h,$$

wherein $W$ and $b$ denote the weights and biases of the convolutional networks, respectively; the query, key and value matrices are then each reshaped into matrices Query, Key and Value of size $\bar{C} \times N$, wherein $\bar{C}$ is the number of channels and $N = H \times W$, with $H$ and $W$ the height and width of the feature map; Query and Key are then multiplied, and the result is passed through a softmax module to obtain the interdependent attention mapping matrix Attention_map of size $N \times N$:

$$\mathrm{Attention\_map}_{j,i} = \frac{\exp(s_{ij})}{\sum_{i=1}^{N} \exp(s_{ij})}, \quad s_{ij} = \mathrm{Query}(x_i)^T \, \mathrm{Key}(x_j)$$

$\mathrm{Attention\_map}_{j,i}$ indicates the degree to which the ith region is attended to when synthesizing the jth region, which realizes the interdependence between any two positions in the features;

the Attention_map is a weight matrix whose rows each sum to 1; the Attention_map is transposed and multiplied by the Value matrix to obtain the weighted sum at each position, a convolutional network with a kernel size of 1×1 then yields the self-attention feature map $O$, and the self-attention feature map is multiplied by a coefficient and added to the original feature map, so that the final output $y$ is:

$$y = \gamma O + x,$$

wherein $y$ is B9 and $\gamma$ is a learnable scalar initialized to 0; through $\gamma$ the network learns from local features first and then gradually learns to give higher weight to non-local features, and as learning proceeds the weighted self-attention feature map is continually superimposed on the original feature map, so that a globally dependent feature map is finally obtained;

Step S4: passing the relational dependency features through a convolutional network to output a matrix whose dimensions match those of the channel matrix $\hat{H}$ that is finally to be estimated, comprising:

setting the number of convolution kernels so that a convolution converts the channel dimension of the final output to match the dimensions of the channel matrix to be estimated, and finally training the neural network by solving the following optimization problem, so that a better estimation result is obtained:

wherein Θ denotes the trainable parameters of the whole network.