CN116405139A

CN116405139A - Spectrum prediction model and method based on Informer

Info

Publication number: CN116405139A
Application number: CN202310226715.8A
Authority: CN
Inventors: 关磊; 杨迪丹; 司江勃; 李晨曦; 郝本健; 齐佩汉; 李赞; 王天洋; 付杭; 惠佩
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2023-03-09
Filing date: 2023-03-09
Publication date: 2023-07-07

Abstract

The invention relates to a spectrum prediction model and a spectrum prediction method based on Informer. The model comprises a high-dynamic system spectrum time-series processing module, an Informer model and a frequency-domain attention calculation module, which are connected with one another. The method comprises the following steps: in the high-dynamic system spectrum time-series processing module, normalizing the original sequence to obtain its mean, its standard deviation and the transformed sequence; embedding the normalized sequence and sending the embedded sequence into the encoder to obtain the original feature map P; calculating the frequency-domain attention with the DCT to obtain an enhancement vector r, and concatenating r with P in the frequency dimension to obtain the frequency-domain attention-enhanced feature map P'; passing P' into the decoder to compute the initial prediction result y'; and de-normalizing y' to obtain the final prediction output y. The invention reduces the memory requirement, makes it feasible to input longer sequences, learns the spectrum data more comprehensively, and offers high accuracy and strong applicability.

Description

Spectrum prediction model and method based on Informer

Technical Field

The invention belongs to the field of spectrum management, and particularly relates to a spectrum prediction model and method based on Informer.

Background

In an intelligent communication system, spectrum usage management is an important link, and the spectrum scheduling process follows these steps: spectrum sensing, spectrum analysis, spectrum decision and spectrum adjustment. In today's complex communication environments, comprehensive real-time monitoring of the spectrum consumes considerable resources and time, and the complexity of spectrum sensing increases further in highly dynamic radio environments. Moreover, monitoring the spectrum in real time and then computing and analyzing the available spectrum holes takes a certain amount of time; because of this monitoring and computation delay, collisions are likely during spectrum access, which degrades communication quality. Spectrum prediction was therefore proposed to reduce computation delay, lower monitoring cost and provide more data support for subsequent spectrum management.

Spectrum prediction analyzes historical spectrum usage to predict the subsequent occupancy state of the spectrum, so that spectrum resources can be used flexibly and the communication quality of the system improved. In high-dynamic electromagnetic environments such as the Internet of Things, the delay requirement on spectrum switching and the quality requirement on information transmission are both stricter, so spectrum prediction can play an important role there and has greater potential to improve the overall communication quality of the system. Spectrum prediction is therefore a valuable strategy in intelligent communication, particularly in high-dynamic electromagnetic environments.

With the rise of new application scenarios such as the Internet of Vehicles, intelligent communication has emerged. Because spectrum resources are non-renewable, systems must use them more efficiently, and spectrum prediction has great potential for reducing spectrum switching delay and resource consumption. Spectrum prediction is therefore a valuable research topic in modern communication environments, especially high-dynamic electromagnetic environments.

Currently, for sequence prediction, the closest prior art is the Transformer model proposed by Google. It is based entirely on the attention mechanism and shows strong ability in sequence modeling. However, it still has some limitations. (1) High computational complexity: the model uses a full attention mechanism, i.e. the attention scores between all positions in the input sequence must be calculated. Assuming the input sequence length is $L$, a total of $L^2$ scores must be calculated, so the computational complexity of full attention is $O(L^2)$. In experiments with larger data volumes, the hardware requirements may be high and the experiment may even fail, which limits the application of the model in practical scenarios. (2) In a spectrum time series, the Transformer focuses on the relations of the sequence in the time dimension, but information such as interference is difficult to distinguish in the time dimension and is better distinguished in the frequency domain; the model lacks learning of the frequency-domain aspects of the data. (3) In high-dynamic electromagnetic environments, the stationarity of the spectrum data may be poor, and the Transformer's lack of dedicated handling for such data may affect the prediction results.

Disclosure of Invention

To solve the problems of spectrum prediction in the prior art, namely the high computational complexity of long-sequence prediction and the lack of learning in the frequency domain of the data, the invention provides an Informer-based spectrum prediction model and algorithm. They reduce memory requirements, make it feasible to input longer sequences, learn the spectrum data more comprehensively, and offer high accuracy and strong applicability.

The Informer-based spectrum prediction model comprises a high-dynamic system spectrum time-series processing module, an Informer model and a frequency-domain attention calculation module, wherein the Informer model is connected to the high-dynamic system spectrum time-series processing module and to the frequency-domain attention calculation module.

Further, the normalization unit of the high-dynamic system spectrum time-series processing module is connected to the encoder of the Informer model, the encoder is connected to the frequency-domain attention calculation module, the frequency-domain attention calculation module is connected to the decoder of the Informer model, and the decoder is connected to the de-normalization unit of the high-dynamic system spectrum time-series processing module.

An Informer-based spectrum prediction method comprises the following steps:

step 1: in the high-dynamic system spectrum time-series processing module, normalizing the original sequence to obtain its mean, its standard deviation and the transformed sequence, and calculating the non-stationary factors through a multi-layer perceptron layer;

step 2: embedding the normalized sequence and sending the embedded sequence into the encoder of the Informer model, which computes the original feature map P;

step 3: performing the frequency-domain attention calculation on the original feature map P using the DCT to obtain an enhancement vector r, and concatenating r with P in the frequency dimension to obtain the frequency-domain attention-enhanced feature map P';

step 4: passing the frequency-domain attention-enhanced feature map P' into the decoder of the Informer model to compute the initial prediction result y';

step 5: de-normalizing the initial prediction result y' to obtain the final prediction output y.

Further, step 1 specifically includes:

1) normalizing the original sequence $X$ and calculating its mean $\mu_X$, its standard deviation $\sigma_X$ and the transformed sequence $X'$;

2) sending the obtained mean $\mu_X$ and standard deviation $\sigma_X$, each together with the original sequence $X$, into the multi-layer perceptron unit to obtain the non-stationary factors $\tau$ and $\Delta$, which the high-dynamic system spectrum time-series processing module uses for subsequent sequence recovery, wherein $\log\tau = \mathrm{MLP}(\sigma_X, X)$ and $\Delta = \mathrm{MLP}(\mu_X, X)$.
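As an illustration of item 1), here is a minimal sketch of the normalization, assuming a per-sequence z-score. The function name `normalize`, the epsilon term and the sample values are hypothetical and not taken from the patent. The factors $\tau$ and $\Delta$ come from a trained multi-layer perceptron and are not reproduced here.

```python
import math

def normalize(seq, eps=1e-5):
    """Return (mean, std, normalized sequence) for a list of floats."""
    n = len(seq)
    mu = sum(seq) / n
    var = sum((x - mu) ** 2 for x in seq) / n
    sigma = math.sqrt(var + eps)  # eps (assumed) guards against division by zero
    return mu, sigma, [(x - mu) / sigma for x in seq]

# Hypothetical received-power readings of one frequency bin, in dBm.
x = [-92.0, -88.5, -75.0, -90.0, -60.5, -89.0]
mu_x, sigma_x, x_norm = normalize(x)
```

The mean and standard deviation are kept because step 5 needs them to map the prediction back to the original scale.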

Further, step 2 specifically includes:

1) embedding the normalized sequence;

2) sending the embedded data to the encoder of the Informer model: in each layer stack of the encoder, first calculating the matrices $Q'$, $K'$, $V'$ required for the attention calculation, then calculating the sparse attention from these three matrices and the non-stationary factors, selecting the important attention and replacing the remaining query values with the mean value, and then performing attention distillation once; the original feature map P is obtained after N such calculations.

Further, in the above Informer-based spectrum prediction method, if the encoder has N layers, the following steps are performed N times:

2.1) calculating $Q'$, $K'$, $V'$ by linear projection;

2.2) calculating the sparse query matrix:

2.2.1) randomly selecting U dot-product pairs from $K'$ to form a sampled key matrix;

2.2.2) calculating the sample score of each query;

2.2.3) selecting from $Q'$ the U queries with the largest difference from the mean to form the sparse query matrix;

2.2.4) replacing the remaining query values with the mean value;

2.3) using the sparse query matrix, $K'$, $V'$, $\tau$ and $\Delta$ to calculate the stationary sparse attention, wherein d represents the dimension of the sequence;

2.4) calculating the residual connection and performing layer normalization;

2.5) passing the result through the feed-forward network for training;

2.6) calculating the residual connection again and performing layer normalization;

2.7) performing attention distillation.
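The formulas for the sample score and the attention are not reproduced in this text. The sketch below therefore assumes the sparsity measure of the published Informer model (maximum minus mean of the scaled dot products) and attention logits of the form $(\tau\, q k^{\top} + \Delta)/\sqrt{d}$. Both are assumptions, as are all function and parameter names. Two simplifications: one key sample is shared by all queries, and unselected queries receive the mean of $V'$ as their output.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_attention(Q, K, V, tau, delta, n_sample, n_top, seed=0):
    scale = math.sqrt(len(Q[0]))
    rng = random.Random(seed)
    # 2.2.1: draw a random subset of keys.
    sampled = rng.sample(range(len(K)), n_sample)
    # 2.2.2: score each query as max minus mean over the sampled keys (assumed).
    scores = []
    for q in Q:
        s = [dot(q, K[j]) / scale for j in sampled]
        scores.append(max(s) - sum(s) / len(s))
    # 2.2.3: keep the n_top queries with the largest score.
    order = sorted(range(len(Q)), key=lambda i: scores[i], reverse=True)
    top = set(order[:n_top])
    # 2.2.4: every other query falls back to the mean of V.
    mean_v = [sum(col) / len(V) for col in zip(*V)]
    out = []
    for i, q in enumerate(Q):
        if i not in top:
            out.append(list(mean_v))
            continue
        # 2.3: logits rescaled by tau and shifted by delta (assumed form).
        logits = [(tau * dot(q, k) + dl) / scale for k, dl in zip(K, delta)]
        weights = softmax(logits)
        out.append([dot(weights, col) for col in zip(*V)])
    return out, top

gen = random.Random(1)
L, d = 8, 4
Q = [[gen.gauss(0, 1) for _ in range(d)] for _ in range(L)]
K = [[gen.gauss(0, 1) for _ in range(d)] for _ in range(L)]
V = [[gen.gauss(0, 1) for _ in range(d)] for _ in range(L)]
out, top = sparse_attention(Q, K, V, tau=1.0, delta=[0.0] * L, n_sample=4, n_top=3)
```

Only `n_top` rows require a full softmax over the keys, which is where the saving over full attention comes from.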

Further, step 3 specifically includes:

1) applying a one-dimensional convolution to the original feature map P to obtain V;

2) applying the DCT to V:

$$\mathrm{Freq} = \mathrm{DCT}(V) = \mathrm{stack}([\mathrm{Freq}^{0}, \mathrm{Freq}^{1}, \ldots, \mathrm{Freq}^{n-1}]);$$

3) calculating the frequency-domain attention enhancement vector:

$$F_{c\text{-}att} = \sigma(W_2\,\delta(W_1\,\mathrm{Freq})),$$

wherein $W_1$ and $W_2$ are learnable parameters obtained through training, $\delta$ denotes the ReLU activation function and $\sigma$ denotes the Sigmoid activation function;

4) calculating the enhancement vector r by one-dimensional convolution:

$$r = P * (F_{c\text{-}att});$$

5) concatenating P and r in the frequency dimension to obtain the enhanced input feature map P' of the decoder: P' = torch.cat(P, r).
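The five items above can be sketched in plain Python as follows. This is an illustration under several assumptions that the text does not settle: the initial convolution is omitted (V is taken to be P), channel `c` contributes the DCT coefficient with index `c % n_freq`, the final convolution is replaced by channel-wise scaling, and the "frequency dimension" is treated as the channel axis. All names, shapes and weights are hypothetical.

```python
import math
import random

def dct2(x):
    """Unnormalized DCT-II of a list of floats."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def frequency_attention(P, W1, W2, n_freq):
    """P: list of C channels, each a list of L values. Returns (P', attention)."""
    # Item 2: one DCT coefficient per channel, stacked into a vector.
    freq = [dct2(ch)[c % n_freq] for c, ch in enumerate(P)]
    # Item 3: sigmoid(W2 * relu(W1 * Freq)).
    hidden = [max(0.0, h) for h in matvec(W1, freq)]
    att = [1.0 / (1.0 + math.exp(-a)) for a in matvec(W2, hidden)]
    # Item 4 (simplified): scale each channel by its attention weight.
    r = [[a * v for v in ch] for a, ch in zip(att, P)]
    # Item 5: concatenate P and r along the channel axis.
    return P + r, att

gen = random.Random(0)
C, L, H = 4, 8, 2
P = [[gen.gauss(0, 1) for _ in range(L)] for _ in range(C)]
W1 = [[gen.uniform(-0.5, 0.5) for _ in range(C)] for _ in range(H)]
W2 = [[gen.uniform(-0.5, 0.5) for _ in range(H)] for _ in range(C)]
P_prime, att = frequency_attention(P, W1, W2, n_freq=4)
```

In a PyTorch implementation the last line of the function would correspond to `torch.cat` along the chosen axis.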

Further, step 4 is specifically as follows: the frequency-domain attention-enhanced feature map P' is passed into the decoder of the Informer model for calculation, and at the same time the sequence $X_{de} = \{X_{token}, X_0\}$ is input to the decoder, where the $X_{token}$ part is the start token and the $X_0$ part marks the length of the prediction result. After the decoder and the fully connected layer, the positions originally filled with 0 hold the output initial prediction result y'. The decoder consists of a masked multi-head sparse attention mechanism and a masked multi-head attention mechanism.
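A minimal sketch of how the decoder input described in step 4 could be assembled, assuming the start token is the tail of the known sequence. The names `label_len` and `pred_len` and the sample values are hypothetical.

```python
def build_decoder_input(history, label_len, pred_len):
    """Concatenate the start token with a zero placeholder for the forecast."""
    x_token = history[-label_len:]   # known values used as the start token
    x_zero = [0.0] * pred_len        # placeholder marking the prediction length
    return x_token + x_zero

def extract_prediction(decoder_output, pred_len):
    """The positions that held zeros carry the initial prediction y'."""
    return decoder_output[-pred_len:]

x_de = build_decoder_input([0.1, 0.4, -0.2, 0.3, 0.9], label_len=3, pred_len=2)
# x_de == [-0.2, 0.3, 0.9, 0.0, 0.0]
```

Because the whole placeholder is filled in one forward pass, the forecast for all `pred_len` positions is produced at once.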

Further, in the above Informer-based spectrum prediction method, the following calculation is performed in each layer stack of the decoder:

1) calculating the masked multi-head sparse attention;

2) calculating the residual connection and performing layer normalization;

3) passing the result through the feed-forward network for training;

4) calculating the residual connection again and performing layer normalization.

Further, in step 5, the predicted output y is calculated from the initial prediction result y' by the de-normalization operation.
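The formula for y is not reproduced in this text. Here is a minimal sketch, assuming the de-normalization simply inverts the z-score of step 1, that is $y = \sigma_X\, y' + \mu_X$. The function name and sample values are hypothetical.

```python
def denormalize(y_norm, mu, sigma):
    """Map a normalized prediction back to the original scale (assumed inverse z-score)."""
    return [sigma * v + mu for v in y_norm]

# Hypothetical statistics saved in step 1 and a hypothetical decoder output y'.
y = denormalize([0.5, -1.0], mu=-80.0, sigma=10.0)
# y == [-75.0, -90.0]
```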

The invention has the following beneficial effects:

1. The base model of the invention is the Informer model, an attention-based model with a smaller amount of computation. Its computational complexity is reduced from $O(L^2)$ to $O(L \log L)$, which reduces memory requirements and makes it feasible to input longer sequences.

2. The invention adds an attention mechanism for frequency-domain information. Based on the finding that GAP (global average pooling) is the lowest-frequency component of the DCT, it uses the DCT when calculating the frequency-domain attention, so that the model learns the spectrum data more comprehensively and the error is reduced.

3. The invention preprocesses the spectrum data collected in high-dynamic electromagnetic environments, which enhances the stationarity of the sequence and provides better learning conditions for the attention mechanism; the output of the decoder then undergoes non-stationary restoration, which reduces the prediction error and improves the applicability of the model.
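A quick arithmetic check of the first effect, assuming (as in the published Informer design) that roughly $\ln L$ queries stay active and each is scored against all $L$ keys. The function name and the constant factor `c` are hypothetical.

```python
import math

def attention_cost(L, c=1):
    """Return (full, sparse) counts of query-key score evaluations."""
    full = L * L
    sparse = c * math.ceil(math.log(L)) * L
    return full, sparse

for length in (96, 1024, 4096):
    full, sparse = attention_cost(length)
    print(length, full, sparse, round(full / sparse, 1))
```

The ratio between the two counts grows with the sequence length, which is why longer inputs become affordable.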

Drawings

Fig. 1 is a flowchart of the Informer-based spectrum prediction model of this embodiment.

Fig. 2 is a schematic diagram of the connection between the Informer model and the frequency-domain attention calculation module of this embodiment.

Fig. 3 is a flowchart of the frequency-domain attention calculation module of this embodiment.

Fig. 4 is a schematic diagram of the data embedding method of this embodiment.

Detailed Description

To further explain the technical means adopted by the invention to achieve its intended objects and the effects obtained, the specific embodiments, structural features and functions of the invention are described in detail below with reference to the accompanying drawings and examples.

This embodiment provides an Informer-based spectrum prediction model. Referring to Figs. 1-3, the model comprises a high-dynamic system spectrum time-series processing module, an Informer model and a frequency-domain attention calculation module, and the Informer model is connected to the high-dynamic system spectrum time-series processing module and to the frequency-domain attention calculation module. The normalization unit of the high-dynamic system spectrum time-series processing module is connected to the encoder of the Informer model, the encoder is connected to the frequency-domain attention calculation module, the frequency-domain attention calculation module is connected to the decoder of the Informer model, and the decoder is connected to the de-normalization unit of the high-dynamic system spectrum time-series processing module.

This embodiment also provides an Informer-based spectrum prediction method, comprising the following steps:

Step 1: in the high-dynamic system spectrum time-series processing module, normalize the original sequence to obtain its mean, its standard deviation and the transformed sequence, and calculate the non-stationary factors through a multi-layer perceptron layer.

1) Normalize the original sequence $X$ and calculate its mean $\mu_X$, its standard deviation $\sigma_X$ and the transformed sequence $X'$.

Step 2: embed the normalized sequence and send the embedded sequence into the encoder of the Informer model, which computes the original feature map P.

The encoder of this embodiment consists of two stacked layers. In each layer, a multi-head sparse attention mechanism obtains a certain number of important attention values, and a distillation layer then screens them to obtain the effective attention. This is the key to reducing the computational complexity and the memory requirement of the algorithm.
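The distillation formula is not reproduced in this text. In the published Informer design, distilling applies a convolution and an activation followed by max-pooling with stride 2, which halves the sequence length between encoder layers. The sketch below shows only the pooling part for one channel and is an assumption about this embodiment; the function name and sample values are hypothetical.

```python
def distill(seq, kernel=3, stride=2):
    """Max-pool one channel over time with padding, roughly halving its length."""
    pad = kernel // 2
    neg_inf = float("-inf")
    padded = [neg_inf] * pad + list(seq) + [neg_inf] * pad
    return [max(padded[i:i + kernel]) for i in range(0, len(seq), stride)]

pooled = distill([1.0, 5.0, 2.0, 0.0, 3.0, 9.0, 4.0, 4.0])
# pooled == [5.0, 5.0, 9.0, 9.0]
```

With two stacked layers, a sequence of length L is handled at length L in the first layer and at about L/2 in the second.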

1) Embed the normalized sequence; for the data embedding method, refer to Fig. 4.

2) Send the embedded data to the encoder. If the encoder has N layers, the following steps are performed N times:

2.1) calculating $Q'$, $K'$, $V'$ by linear projection;

2.2) calculating the sparse query matrix:

2.2.1) randomly selecting U dot-product pairs from $K'$ to form a sampled key matrix;

2.2.2) calculating the sample score of each query;

2.2.3) selecting from $Q'$ the U queries with the largest difference from the mean to form the sparse query matrix;

2.2.4) replacing the remaining query values with the mean value;

2.3) using the sparse query matrix, $K'$, $V'$, $\tau$ and $\Delta$ to calculate the stationary sparse attention, wherein d represents the dimension of the sequence;

2.4) calculating the residual connection and performing layer normalization;

2.5) passing the result through the feed-forward network for training;

2.6) calculating the residual connection again and performing layer normalization;

2.7) performing attention distillation.

Step 3: the original feature map P is calculated by DCT to calculate the frequency domain attention module, and because the traditional channel attention intelligence learns the attention of the lowest frequency, the enhancement vector r is obtained by DCT calculation, and is spliced with the original feature map P in the frequency dimension to obtain the feature map P' enhanced by the frequency domain attention.

2) The DCT transformation is performed on V and,

Freq＝DCT(V)＝stack([Freq ⁰ ,Freq ¹ ,...,Freq ^n-1 ])；

3) Calculating a frequency domain attention enhancement vector:

F _c -att＝σ(W ₂ δ(W ₁ Freq))

4) The enhancement vector r is calculated by one-dimensional convolution:

r＝P*(F _c -att)；
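The remark that conventional channel attention sees only the lowest frequency can be checked numerically: for an unnormalized DCT-II, the coefficient with index 0 is the sum of the samples, i.e. the global average pooling (GAP) value times the sequence length. The helper name and sample values below are hypothetical.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a list of floats."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
coeffs = dct2(x)
gap = sum(x) / len(x)            # global average pooling of the channel
lowest = coeffs[0] / len(x)      # lowest-frequency DCT component, rescaled
```

Here `lowest` equals `gap`, while `coeffs[1:]` carry the higher-frequency information that GAP alone discards.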

Step 4: the decoder of the frequency-domain attention-enhanced feature map P' into the Informir model is computed while inputting the sequence X to the decoder _de ＝{X _token ,X ₀ }，X _token Part is a start token, X ₀ The length of the part of the mark predicted result is output through the decoder and the full-connection layer, and the original position of 0 is the output initial predicted result y', wherein the decoder is provided with a multi-head sparse attention mechanism and a multi-head attention mechanism which are covered, and the purpose of the covering is to prevent the predicted position from being influenced by the subsequent sequence, so that autoregressive is caused.

The following calculation is performed in each layer stack of the decoder:

1) calculating the masked multi-head sparse attention;

2) calculating the residual connection and performing layer normalization;

3) passing the result through the feed-forward network for training;

The predicted output y is calculated by de-normalizing the initial prediction result y'.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention, which should be construed according to its technical scope.

Claims

1. An Informer-based spectrum prediction model, characterized by comprising a high-dynamic system spectrum time-series processing module, an Informer model and a frequency-domain attention calculation module, wherein the Informer model is connected to the high-dynamic system spectrum time-series processing module and to the frequency-domain attention calculation module.

2. The Informer-based spectrum prediction model of claim 1, wherein the normalization unit of the high-dynamic system spectrum time-series processing module is connected to the encoder of the Informer model, the encoder is connected to the frequency-domain attention calculation module, the frequency-domain attention calculation module is connected to the decoder of the Informer model, and the decoder is connected to the de-normalization unit of the high-dynamic system spectrum time-series processing module.

3. An Informer-based spectrum prediction method, characterized by comprising the following steps:

step 2: embedding the normalized sequence and sending the embedded sequence into the encoder of the Informer model to obtain the original feature map P.

4. The Informer-based spectrum prediction method according to claim 3, wherein step 1 specifically comprises:

1) normalizing the original sequence $X$ and calculating its mean $\mu_X$, its standard deviation $\sigma_X$ and the transformed sequence $X'$.

5. The Informer-based spectrum prediction method according to claim 3, wherein step 2 specifically comprises:

1) embedding the normalized sequence;

2) sending the embedded data to the encoder of the Informer model.

6. The method of claim 5, wherein if the encoder has N layers, the following steps are performed N times:

2.1) calculating $Q'$, $K'$, $V'$ by linear projection;

2.2) calculating the sparse query matrix:

2.2.1) randomly selecting U dot-product pairs from $K'$ to form a sampled key matrix;

2.2.2) calculating the sample score of each query;

2.2.3) selecting from $Q'$ the U queries with the largest difference from the mean to form the sparse query matrix;

2.2.4) replacing the remaining query values with the mean value;

2.3) using the sparse query matrix, $K'$, $V'$, $\tau$ and $\Delta$ to calculate the stationary sparse attention, wherein d represents the dimension of the sequence;

2.4) calculating the residual connection and performing layer normalization;

2.5) passing the result through the feed-forward network for training;

2.6) calculating the residual connection again and performing layer normalization;

2.7) performing attention distillation.

7. The Informer-based spectrum prediction method according to claim 3, wherein step 3 specifically comprises:

2) applying the DCT to V:

$$\mathrm{Freq} = \mathrm{DCT}(V) = \mathrm{stack}([\mathrm{Freq}^{0}, \mathrm{Freq}^{1}, \ldots, \mathrm{Freq}^{n-1}]);$$

3) calculating the frequency-domain attention enhancement vector:

$$F_{c\text{-}att} = \sigma(W_2\,\delta(W_1\,\mathrm{Freq}))$$

4) calculating the enhancement vector r by one-dimensional convolution:

$$r = P * (F_{c\text{-}att});$$

8. The method of claim 3, wherein step 4 is specifically as follows: the frequency-domain attention-enhanced feature map P' is passed into the decoder of the Informer model for calculation, and at the same time the sequence $X_{de} = \{X_{token}, X_0\}$ is input to the decoder, where the $X_{token}$ part is the start token and the $X_0$ part marks the length of the prediction result; after the decoder and the fully connected layer, the positions originally filled with 0 hold the output initial prediction result y', wherein the decoder consists of a masked multi-head sparse attention mechanism and a masked multi-head attention mechanism.

9. The method of claim 8, wherein the following calculation is performed in each layer stack of the decoder:

1) calculating the masked multi-head sparse attention;

2) calculating the residual connection and performing layer normalization;

3) passing the result through the feed-forward network for training;

10. The Informer-based spectrum prediction method according to claim 3, wherein in step 5 the predicted output y is calculated from the initial prediction result y' by the de-normalization operation.