CN116151459A

CN116151459A - Power grid flood prevention risk probability prediction method and system based on improved Transformer

Info

Publication number: CN116151459A
Application number: CN202310178028.3A
Authority: CN
Inventors: 姚德贵; 刘善峰; 石英; 李哲; 寇晓适; 袁少光; 王晓辉; 刘莘昱; 王津宇; 王棨; 王超; 夏中原; 田杨阳; 毛万登; 施瑀; 杨亚锡
Original assignee: Wuhan University of Technology WUT; State Grid Henan Electric Power Co Ltd; Electric Power Research Institute of State Grid Henan Electric Power Co Ltd
Current assignee: Wuhan University of Technology WUT; State Grid Henan Electric Power Co Ltd; Electric Power Research Institute of State Grid Henan Electric Power Co Ltd
Priority date: 2023-02-28
Filing date: 2023-02-28
Publication date: 2023-05-23

Abstract

The invention provides a power grid flood prevention risk prediction method and system based on an improved Transformer. The prediction method comprises the following steps: acquiring flood prevention data; performing identification and analysis of power grid flood prevention prediction and early-warning influence factors on the flood prevention data; improving the Transformer network framework using a feature enhancement improvement strategy based on a gating selection mechanism and embedded coding; then further improving the Transformer network framework using an explicit sparse attention improvement strategy; and performing power grid flood prevention risk prediction using the improved Transformer network framework. Building on the Transformer network framework, the invention optimizes the feature selection module, the feature fusion module and the attention module, thereby enhancing overall network performance and improving the accuracy of flood prevention risk probability prediction.

Description

Power grid flood prevention risk probability prediction method and system based on improved Transformer

Technical Field

The invention belongs to the field of data analysis in the power industry, and particularly relates to a power grid flood prevention risk probability prediction method and system based on an improved Transformer.

Background

Accurate prediction of the flood prevention risk probability of a transformer substation is of great significance for improving the flood prevention capability of the power grid and for perfecting the flood prevention early-warning system. The factors influencing substation flood prevention include not only dynamic meteorological data but also static data such as the volume of the substation's water collecting well, the drainage capacity of its water pumps, the stock of flood prevention materials, the topography and the hydrological characteristics. This combination of dynamic and static data makes traditional flood prevention risk prediction difficult.

Conventional time series prediction methods, represented by the autoregressive integrated moving average (ARIMA) model, are highly interpretable, but they are mainly applicable to univariate prediction problems, which limits their application to complex multivariate time series data. Unlike the independent prediction of a single time series or a small number of them, machine learning-based methods make full use of the large-scale time series data available in existing application scenarios through feature processing and can fit a more accurate model. However, machine learning methods have limited capability for processing raw data: domain expertise is required to convert the raw data into relevant representations or features, and the quality of that representation directly affects the performance of the model. By comparison, data-driven deep learning models can learn relevant representations and features of the data without extensive manual feature engineering. By combining simple nonlinear function modules, they convert low-level input into high-level abstract feature representations and automatically capture complex relationships in the original high-dimensional data, which makes them suitable for substation flood prevention risk prediction on large-scale data. However, deep learning models require a large amount of training data, their interpretability is weak, and a model trained on the data of a single station is difficult to apply to flood prevention risk probability prediction at other substations, so their generalization capability is limited.

Disclosure of Invention

In view of the deficiencies of the prior art, the invention provides a power grid flood prevention risk probability prediction method and system based on an improved Transformer, which solve the following problems in the prior art: flood prevention data is complex and differs from text data, so it is difficult to use directly with a Transformer model; the Transformer model cannot make full use of the timestamp information of a time series; and the self-attention module in the Transformer model has high complexity in computation time and memory consumption.

In order to achieve the above purpose, the invention adopts the following technical scheme:

A power grid flood prevention risk prediction method based on an improved Transformer comprises the following steps:

acquiring flood prevention data;

performing identification and analysis of power grid flood prevention prediction and early-warning influence factors on the flood prevention data;

improving the Transformer network framework using a feature enhancement improvement strategy based on a gating selection mechanism and embedded coding;

then further improving the Transformer network framework using an explicit sparse attention improvement strategy;

and performing power grid flood prevention risk prediction using the improved Transformer network framework.

Preferably, before the identification and analysis of power grid flood prevention prediction and early-warning influence factors is performed on the flood prevention data, the method further includes:

after the flood prevention data is acquired, performing data preprocessing, and then performing the identification and analysis of power grid flood prevention prediction and early-warning influence factors;

the data preprocessing comprises data cleaning and data normalization;

the data cleaning is used for removing abnormal values in the data set and filling missing values;

the data normalization is used to bring the data features to the same scale.

Preferably, the identifying and analyzing the flood prevention data for the power grid flood prevention prediction and early warning influence factors includes:

dividing monitoring information of a transformer substation into meteorological monitoring information, geographical environment information and site self information;

the geographic environment information and the site self information are static covariate characteristics of the transformer substation;

the weather monitoring information is a dynamic covariate characteristic.

Preferably, the feature enhancement improvement strategy using a gating selection mechanism and embedded coding comprises: the main body input feature selection strategy based on the gating mechanism and the time stamp feature fusion strategy based on the embedded coding.

Preferably, the body input feature selection policy based on the gating mechanism includes:

taking the core meteorological features as main body input feature vectors, taking the static covariate features of the site as optional context feature vectors, and taking the main body input feature vectors and the optional context feature vectors as input features;

passing the input features through a gated residual network module and a softmax operation to obtain the selection weights of the input variables;

weighting the input features by the selection weights of the input variables to obtain the instance feature selection of the input features;

constructing a feature variable selection network for the static covariates and the time-dependent sequence variables, which is used for the instance feature selection.

Preferably, the embedding encoding-based time stamp feature fusion strategy comprises:

S1, taking the input sequence retained at time t as the input transformation sequence vector;

S2, preserving the local position feature vector of the sequence using a fixed position embedding;

S3, representing the global timestamp feature vector using a learnable time code;

S4, mapping the input transformation sequence, the local position feature vector and the global timestamp feature vector to the same feature dimension using a one-dimensional convolution filter with a kernel size of 3 and a stride of 1, and fusing them to obtain the enhanced input feature embedded code.

Preferably, the explicit sparse attention improvement strategy is implemented by a top-k selection strategy, comprising:

computing an original attention score matrix from the queries and keys of the self-attention mechanism;

multiplying the original attention score matrix with a threshold matrix, and then calculating by a sign function to obtain an original position mask matrix;

replacing 1 in the original position mask matrix with 0 and replacing 0 with minus infinity to obtain a position mask matrix;

and adding the original attention score matrix and the position mask matrix to obtain an attention score matrix.

A power grid flood prevention risk prediction system based on an improved Transformer, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the power grid flood prevention risk prediction method based on the improved Transformer is implemented when the processor executes the computer program.

A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the improved Transformer-based power grid flood prevention risk probability prediction method.

Although the Transformer has performed excellently as a representative deep learning model in recent years, it was designed for natural language processing problems. Text sequences and time series are both sequence data, but their characteristics differ; time series, for example, carry timestamp information. The Transformer model performs position coding with sine and cosine functions, adds the result to the input sequence embedding and feeds the sum to the prediction model. Although this approach can extract some position information from the time series, it cannot make full use of the timestamp information, so the timestamp needs to be encoded as an additional position code. Meanwhile, in order to make full use of the static basic characteristics of the substation and to learn the implicit relationship between the meteorological factors and the static characteristics, the static characteristics need to be selected interactively. In addition, considering the complexity of the self-attention module in the Transformer model in terms of computation time and memory consumption, the self-attention module needs to be improved in order to reduce the hardware requirements of power grid early-warning management and to improve model efficiency. In summary, the invention addresses these shortcomings of the Transformer model and adapts the algorithm to the characteristics of the prediction target: taking full account of the coupling among the multidimensional influence factors of flood disasters, it extracts high-dimensional coupled sequence features, constructs a flood risk prediction model, and realizes prediction and early warning of substation flood risk, as detailed below:

The invention provides a power grid flood prevention risk prediction method based on an improved Transformer, which uses the improved Transformer network framework to predict the power grid flood prevention risk probability. An optimized model structure based on the Transformer network framework is selected according to the characteristics of the flood prevention data, and a feature enhancement improvement strategy based on a gating selection mechanism and embedded coding and an explicit sparse attention improvement strategy are used. The method provides an identification and analysis strategy for power grid flood prevention prediction and early-warning influence factors, which makes efficient use of the characteristic factors of power grid flood prevention data. It provides a feature enhancement strategy based on a gating selection mechanism and embedded coding, which strengthens the extraction of information from data features and enhances the input of the prediction model. It also provides an explicit sparse attention improvement strategy, which effectively removes irrelevant information and reduces the complexity of the self-attention module in computation time and memory consumption. The method performs well in power grid flood prevention prediction.

The invention provides an improved Transformer-based power grid flood prevention risk prediction method, which is a network framework based on Transformer, and is characterized in that a feature selection module, a feature fusion module and an attention module are optimized, so that the overall network performance is enhanced, the accuracy of flood prevention risk probability prediction is improved, and the problems that flood prevention data is complex, has a certain difference with text data, is difficult to directly use on a Transformer model, time stamp information of a time sequence cannot be fully utilized by the Transformer model, and the complexity of a self-attention module in the Transformer model in calculation time and memory consumption is high in the prior art are solved.

Drawings

FIG. 1 is a flowchart of the improved Transformer-based power grid flood prevention risk probability prediction method;

FIG. 2 is a block flow diagram of a subject input feature selection strategy based on gating mechanisms of the present invention;

FIG. 3 is a flow chart of the embedded code based timestamp feature fusion strategy of the present invention;

FIG. 4 is a block flow diagram of the present invention using an explicit sparse attention improvement strategy.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

It should be noted that although functional block diagrams are depicted as block diagrams, and logical sequences are shown in the flowchart, in some cases, the steps shown or described may be performed in a different order than the block diagrams in the system. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.

The invention provides a power grid flood prevention risk prediction method based on an improved Transformer. The method provides an identification and analysis strategy for power grid flood prevention prediction and early-warning influence factors, which makes efficient use of the characteristic factors of power grid flood prevention data. It provides a feature enhancement strategy based on a gating selection mechanism and embedded coding, which strengthens the extraction of information from data features and enhances the input of the prediction model. It also provides an explicit sparse attention improvement strategy, which effectively removes irrelevant information and reduces the complexity of the self-attention module in computation time and memory consumption. The method performs well in power grid flood prevention prediction.

Embodiments of the present invention will be further described with reference to the accompanying drawings.

The overall architecture of the Transformer is divided into four modules: an input module, an encoder module, a decoder module and an output module. The input module comprises the source data embedding layer and its position encoder, which implement the data input coding strategy. The encoder module and the decoder module each comprise a self-attention module, a normalization layer, a feedforward fully connected sub-layer and a residual connection module. The output module comprises a linear layer and a softmax layer.

Referring to FIG. 1, FIG. 1 is a flowchart of the improved Transformer-based power grid flood prevention risk prediction method according to an embodiment of the present invention. The method includes, but is not limited to, steps S110 to S150.

Step S110, flood control data are obtained;

step S120, carrying out identification analysis on the flood control prediction early warning influence factors of the power grid on the flood control data;

step S130, improving the flood control data input coding strategy by using a characteristic enhancement improvement strategy based on a gating selection mechanism and embedded coding;

step S140, the self-attention module is improved by using the explicit sparse attention improvement strategy;

and step S150, predicting the power grid flood prevention risk probability using the improved Transformer network framework obtained in step S140.

Further, before the power grid flood prevention prediction early warning influence factor identification analysis is performed on the flood prevention data, the method further comprises the following steps:

performing data preprocessing on the flood prevention data, including data cleaning and data normalization.

The data cleaning removes abnormal values from the data set and fills in missing values; the data normalization brings the data features to the same scale.
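The two preprocessing steps can be sketched as follows. This is a minimal illustration only: the source names the steps but not the techniques, so the z-score outlier rule, the threshold `z_thresh`, linear interpolation for missing values and min-max scaling are all assumptions, as is the function name `clean_and_normalize`.

```python
import numpy as np

def clean_and_normalize(x, z_thresh=3.0):
    """Clean and min-max normalize a (time, feature) array.

    Assumptions (not from the source): outliers are flagged by a z-score
    rule, gaps are filled by linear interpolation, and each column has at
    least one valid observation.
    """
    x = np.asarray(x, dtype=float).copy()
    idx = np.arange(x.shape[0])
    for j in range(x.shape[1]):
        col = x[:, j]  # view into x, so edits below modify x in place
        mu, sigma = np.nanmean(col), np.nanstd(col)
        if sigma > 0:
            with np.errstate(invalid="ignore"):
                # Data cleaning, part 1: mark abnormal values as missing.
                col[np.abs(col - mu) > z_thresh * sigma] = np.nan
        # Data cleaning, part 2: fill missing values by interpolation.
        ok = ~np.isnan(col)
        col[~ok] = np.interp(idx[~ok], idx[ok], col[ok])
    # Data normalization: scale every feature to the range [0, 1].
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (x - lo) / span
```

Other choices, such as median-based outlier rules or standardization to zero mean and unit variance, would satisfy the same description.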

Further, carrying out identification analysis on the power grid flood prevention prediction early warning influence factors on the flood prevention data, wherein the identification analysis comprises the following steps:

Through identification and analysis of the factors influencing substation flood prevention, the monitoring information of the transformer substation is divided into three categories: meteorological monitoring information, geographical environment information and site self information. The geographical environment information and the site self information are static covariate features of the substation; the meteorological monitoring information is classified as dynamic covariate features.
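One way to represent this division in code is sketched below. The container `SubstationSample`, the helper `split_covariates`, the field names and the numeric values are all hypothetical and serve only to show static covariates as a fixed vector and dynamic covariates as a time-indexed matrix.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SubstationSample:
    static: np.ndarray   # shape (n_static,): geographic and site features
    dynamic: np.ndarray  # shape (T, n_dynamic): meteorological time series

def split_covariates(record, static_keys, dynamic_keys):
    """Separate one substation record into static and dynamic covariates."""
    static = np.array([record[k] for k in static_keys], dtype=float)
    dynamic = np.column_stack(
        [np.asarray(record[k], dtype=float) for k in dynamic_keys]
    )
    return SubstationSample(static=static, dynamic=dynamic)

# Illustrative record; keys and values are made up for the example.
record = {
    "well_volume_m3": 50.0,
    "pump_drainage_m3_per_h": 120.0,
    "elevation_m": 86.5,
    "rainfall_mm": [0.0, 2.5, 11.0, 7.2],
    "humidity_pct": [61.0, 70.0, 93.0, 88.0],
}
sample = split_covariates(
    record,
    static_keys=["well_volume_m3", "pump_drainage_m3_per_h", "elevation_m"],
    dynamic_keys=["rainfall_mm", "humidity_pct"],
)
```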

Further, feature enhancement refinement strategies based on gating selection mechanisms and embedded coding are used, including: the main body input feature selection strategy based on the gating mechanism and the time stamp feature fusion strategy based on the embedded coding.

Still further, referring to fig. 2, the gating mechanism-based principal input feature selection strategy includes the following steps, but is not limited to steps S210 to S240:

step S210, taking the core meteorological features as main input feature vectors, taking the static covariate features of the site as optional context feature vectors, and taking the main input feature vectors and the optional context feature vectors as input features;

step S220, passing the input features through a gated residual network module and a softmax operation to obtain the selection weights of the input variables;

step S230, weighting the input features by the selection weights of the input variables to obtain the instance feature selection of the input features;

step S240, constructing a feature variable selection network for the static covariates and the time-dependent sequence variables, which is used for the instance feature selection.

Still further, referring to fig. 3, the embedded encoding-based timestamp feature fusion strategy includes the following steps, but is not limited to steps S310 to S340:

step S310, taking the input sequence retained at time t as the input transformation sequence vector;

step S320, preserving the local position feature vector of the sequence using a fixed position embedding;

step S330, representing the global timestamp feature vector using a learnable time code;

and step S340, mapping the input transformation sequence, the local position feature vector and the global timestamp feature vector to the same feature dimension using a one-dimensional convolution filter with a kernel size of 3 and a stride of 1, and fusing them to obtain the enhanced input feature embedded code.

Further, referring to fig. 4, the use of an explicit sparse attention improvement strategy is achieved by employing a top-k selection strategy comprising the steps of:

step S410, computing an original attention score matrix from the queries and keys of the self-attention mechanism;

step S420, multiplying the original attention score matrix with a threshold matrix, and then calculating by a sign function to obtain an original position mask matrix;

step S430, replacing 1 in the original position mask matrix with 0, and replacing 0 with minus infinity to obtain a position mask matrix;

step S440, adding the original attention score matrix to the position mask matrix to obtain an attention score matrix.

In one embodiment, flood prevention data is first obtained; identification and analysis of power grid flood prevention prediction and early-warning influence factors is then performed on the data; next, the feature enhancement improvement strategy based on a gating selection mechanism and embedded coding is applied, followed by the explicit sparse attention improvement strategy; finally, the improved Transformer network framework is used to predict the power grid flood prevention risk probability, achieving efficient and accurate prediction. The details are as follows:

step S1: data analysis

Flood prevention data is acquired and preprocessed: abnormal values in the data set are removed, missing values are filled in, and the data features are brought to the same scale. Identification and analysis of power grid flood prevention prediction and early-warning influence factors is then performed on the data, dividing the substation monitoring information into three categories: meteorological monitoring information, which serves as the core meteorological data and as dynamic covariates, and geographical environment information and site self information, which serve as static covariates. Relevant variables are then selected to improve model prediction accuracy.

Step S2: feature selection

The input of the prediction model is enhanced using the main body input feature selection strategy based on the gating mechanism and the timestamp feature fusion strategy based on embedded coding. The main body input feature selection strategy based on the gating mechanism is as follows:

Step S21: taking the core meteorological features as the main body input feature vector a, and taking the static covariate features of the site as the optional context feature vector c;

Step S22: passing the main body input feature vector and the optional context feature vector through a gated residual network module and a softmax operation to obtain the selection weight w_xt of the input variables, where w_xt is computed as:

w_xt = softmax(GRN(ξ_t))

where ξ_t is the input feature vector at time t. The gated residual network GRN is computed as:

GRN(a, c) = LayerNorm(a + GLU(η₁))

η₁ = W₁η₂ + b₁

η₂ = ELU(W₂a + W₃c + b₂)

where a is the main body input feature vector; c is the optional context feature vector; ELU is an activation function; W_i are learnable weight parameters of the model; b_i are bias parameters; and η_i are intermediate variables.

The gated linear unit GLU is a specific implementation of the gating layer. It adjusts the input of the model through a gating mechanism, which improves the flexibility of the model. GLU is computed as:

GLU(γ) = σ(W₄γ + b₄) ⊙ (W₅γ + b₅)

where γ is the input of the gating layer and σ is the Sigmoid activation function;

Step S23: weighting the input features by the selection weights of the input variables to obtain the instance feature selection of the input features at time t;

Step S24: constructing a feature variable selection network for the static covariates and the time-dependent sequence variables, which is used for the instance feature selection.
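Steps S21 to S23 can be sketched in NumPy as follows. The GRN, GLU and softmax lines follow the formulas given above. Several details are assumptions rather than part of the source: the layer normalization has no learnable scale or shift, the weighting in step S23 is taken to be an element-wise product of the weights with the main body input vector, and the helper names (`init_params`, `select_features`) and the hidden size `h` are hypothetical.

```python
import numpy as np

def elu(x):
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_norm(x, eps=1e-5):
    # Assumption: plain normalization without learnable gain and bias.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def glu(gamma, p):
    # GLU(gamma) = sigmoid(W4 gamma + b4) * (W5 gamma + b5)
    return sigmoid(p["W4"] @ gamma + p["b4"]) * (p["W5"] @ gamma + p["b5"])

def grn(a, c, p):
    eta2 = elu(p["W2"] @ a + p["W3"] @ c + p["b2"])  # eta_2
    eta1 = p["W1"] @ eta2 + p["b1"]                  # eta_1
    return layer_norm(a + glu(eta1, p))              # residual + LayerNorm

def select_features(a, c, p):
    w = softmax(grn(a, c, p))  # selection weights w_xt
    return w, w * a            # assumed element-wise weighting (step S23)

def init_params(n, m, h, rng):
    """Random parameters: n main features, m context features, hidden size h."""
    shapes = {"W1": (h, h), "b1": (h,), "W2": (h, n), "W3": (h, m),
              "b2": (h,), "W4": (n, h), "b4": (n,), "W5": (n, h), "b5": (n,)}
    return {k: rng.normal(scale=0.1, size=s) for k, s in shapes.items()}

rng = np.random.default_rng(0)
params = init_params(n=4, m=3, h=8, rng=rng)
a = rng.normal(size=4)  # core meteorological features at time t
c = rng.normal(size=3)  # static covariates of the site
weights, selected = select_features(a, c, params)
```

Because the weights come from a softmax, they are positive and sum to one, so they can be read as the relative importance assigned to each input variable at time t.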

Step S3: feature fusion

The input of the prediction model is enhanced using the main body input feature selection strategy based on the gating mechanism and the timestamp feature fusion strategy based on embedded coding. The timestamp feature fusion strategy based on embedded coding is as follows:

Step S31: the input sequence retained at time t is taken as the input transformation sequence, that is, the input feature vector ξ_t at time t;

Step S32: the local position feature PE_pos of the sequence is preserved using a fixed position embedding, where pos denotes the position of the original data in the sequence and d denotes the dimension of the multidimensional features after mapping by the input vector embedding layer;

Step S33: the global timestamp feature is represented using a learnable time code TE_i, where TE_i denotes the encoding of the i-th type of timestamp, and t_i and T_i denote the time point and the period under that timestamp type. Taking the month as an example, t is the sampling month of the data to be encoded and T is the period, 12;

step S34: for t timeExample feature selection of an incoming feature

Local position feature vector PE _pos And a global timestamp feature vector TE _pos Using one-dimensional convolution filter with kernel size of 3 and stride of 1, the same feature dimension d _model And fusing to obtain the embedded code χ of the enhanced input feature ^t The specific calculation of the module is as follows:

where α is a factor that balances the size between the projection of the input sequence and the local and global embeddings, if the input sequence is normalized, it is recommended that α have a value of 1.
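Steps S31 to S34 can be sketched as below. The text above does not reproduce the position-coding, time-code or fusion formulas, so this sketch makes explicit assumptions: the fixed position embedding is the standard sine and cosine encoding with base 10000, the learnable time code is a lookup table indexed by month (period 12), the convolution uses zero padding to keep the sequence length, only the selected input is projected by the convolution, and fusion is the sum `alpha * conv(x) + PE + TE`. All function and parameter names are hypothetical.

```python
import numpy as np

def sinusoidal_position_encoding(length, d_model, base=10000.0):
    """Fixed position embedding (assumed standard sine/cosine form)."""
    pos = np.arange(length)[:, None]
    j = np.arange(0, d_model, 2)[None, :]
    angle = pos / np.power(base, j / d_model)
    pe = np.zeros((length, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle[:, : d_model // 2])
    return pe

def conv1d_k3(x, weight, bias):
    """1-D convolution over time with kernel size 3 and stride 1.

    x: (L, c_in); weight: (3, c_in, c_out); bias: (c_out,).
    Zero padding of one step on each side keeps the length L (assumption).
    """
    length = x.shape[0]
    xp = np.pad(x, ((1, 1), (0, 0)))
    out = np.zeros((length, weight.shape[2]))
    for k in range(3):
        out += xp[k:k + length] @ weight[k]
    return out + bias

def fuse(x_sel, month_idx, conv_w, conv_b, time_table, alpha=1.0):
    """Assumed fusion: alpha * projected input + local PE + global TE."""
    length, d_model = x_sel.shape[0], conv_w.shape[2]
    u = conv1d_k3(x_sel, conv_w, conv_b)               # project to d_model
    pe = sinusoidal_position_encoding(length, d_model)  # local position
    te = time_table[month_idx]                          # learnable time code
    return alpha * u + pe + te

rng = np.random.default_rng(0)
L, c_in, d_model = 6, 4, 8
x_sel = rng.normal(size=(L, c_in))       # selected features per time step
conv_w = rng.normal(size=(3, c_in, d_model))
conv_b = np.zeros(d_model)
time_table = rng.normal(size=(12, d_model))  # one row per month
months = np.array([0, 1, 2, 3, 4, 5])
embedded = fuse(x_sel, months, conv_w, conv_b, time_table)
```

In a trained model `conv_w`, `conv_b` and `time_table` would be learned parameters; here they are random values used only to check shapes.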

Step S4: self-attention sparsification

Step S41: the original attention score matrix P is computed from the queries and keys of the self-attention mechanism as follows:

P = QKᵀ / √d_k

where Q and K are the queries and keys in the attention mechanism, and d_k is the feature dimension of each attention head;

step S42: multiplying the original attention score matrix P with a threshold matrix T, and then calculating by a sign function to obtain an original position mask matrix M, wherein the calculation of the original position mask matrix M is realized as follows:

M＝sign(PT)；

step S43: replacing 1 in the original position mask matrix M with 0, and replacing 0 with minus infinity to obtain a position mask matrix M';

Step S44: adding the original attention score matrix P and the position mask matrix M′ to obtain a first output Y, and applying a softmax function to Y to obtain the attention score matrix A, which is computed as:

A = softmax(P + M′)

Step S45: the normalized attention score matrix A obtained above is combined with the value vector V, obtained by mapping the source context, to compute the final self-attention output C:

C = AV.
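Steps S41 to S45 can be sketched in NumPy as follows. One simplification is an assumption: instead of the threshold matrix and sign function described in steps S42 and S43, the sketch compares each score directly with the k-th largest score in its row, which yields the same kind of 0 / minus-infinity position mask. The function name `topk_sparse_attention` is hypothetical.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Explicit sparse attention that keeps the k largest scores per query.

    Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v); requires 1 <= k <= n_k.
    """
    d_k = Q.shape[-1]
    P = Q @ K.T / np.sqrt(d_k)                    # S41: original scores
    # k-th largest score of each row acts as that row's threshold.
    thresh = np.sort(P, axis=-1)[:, -k][:, None]
    M = np.where(P >= thresh, 0.0, -np.inf)       # S42-S43: position mask M'
    Y = P + M                                     # S44: masked scores
    Y = Y - Y.max(axis=-1, keepdims=True)         # numerical stability
    A = np.exp(Y)
    A /= A.sum(axis=-1, keepdims=True)            # softmax over kept entries
    return A @ V, A                               # S45: C = A V

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 4))
K = rng.normal(size=(7, 4))
V = rng.normal(size=(7, 3))
C, A = topk_sparse_attention(Q, K, V, k=2)
```

Masked positions receive a weight of exactly zero after the softmax, so each query attends to only k keys. If several scores tie at the threshold, more than k entries are kept in that row.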

Step S5: network output

And selecting a Transformer as a basic network frame, adopting an optimization and improvement strategy from step S2 to step S4 to optimize the feature selection module, the feature fusion module and the attention module, enhancing the overall network performance and reducing the calculation cost, thereby constructing a power grid flood prevention risk probability prediction model based on the improved Transformer, and realizing high-efficiency and accurate calculation of the power grid flood prevention risk probability.

The invention also provides a power grid flood prevention risk prediction system based on the improved Transformer, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements any of the above power grid flood prevention risk prediction methods based on the improved Transformer.

The processor and the memory may be connected by a bus or other means.

The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

It should be noted that the improved Transformer-based power grid flood prevention risk prediction system in this embodiment may include a service processing module, an edge database, a server version information register and a data synchronization module. When the processor executes the computer program, the improved Transformer-based power grid flood prevention risk probability prediction method applied to the system is implemented.

The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Furthermore, the present invention provides a computer-readable storage medium storing computer-executable instructions. When executed by a processor or a controller, for example by one of the processors in the above terminal embodiment, the instructions cause the processor to execute the power grid flood prevention risk prediction method based on the improved Transformer in the above embodiment.

Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the above embodiment, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present invention, and these equivalent modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

The above-described embodiments of the present invention do not limit the scope of the present invention. Any other corresponding changes and modifications made in accordance with the technical idea of the present invention shall be included in the scope of the claims of the present invention.

Claims

1. A power grid flood prevention risk probability prediction method based on an improved Transformer, characterized by comprising the following steps:

acquiring flood prevention data;

2. The improved Transformer-based power grid flood prevention risk probability prediction method according to claim 1, wherein before the power grid flood prevention prediction early warning impact factor identification analysis is performed on the flood prevention data, the method further comprises:

the data preprocessing comprises data cleaning and data normalization;

the data normalization is used to bring the data features onto the same scale.
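By way of illustration only, the normalization recited in claim 2 can be sketched as follows. The claim text here does not name a specific normalization method; min-max scaling to [0, 1] is an assumption, and the function name `min_max_normalize`, the feature names, and the sample readings are hypothetical.

```python
def min_max_normalize(columns):
    """Rescale each feature column to the range [0, 1].

    `columns` maps a feature name to a list of numeric readings.
    Min-max scaling is an assumed choice, not taken from the claim.
    Constant columns are mapped to 0.0 to avoid division by zero.
    """
    scaled = {}
    for name, values in columns.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        scaled[name] = [0.0 if span == 0 else (v - lo) / span for v in values]
    return scaled


# Hypothetical readings; names and values are illustrative only.
raw = {
    "rainfall_mm": [0.0, 12.5, 50.0, 25.0],
    "water_level_m": [1.2, 1.2, 1.2, 1.2],
}
normalized = min_max_normalize(raw)
```

After this step, features recorded in different units share one numeric range, so no single feature dominates the downstream network merely because of its unit.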

3. The improved Transformer-based power grid flood prevention risk probability prediction method of claim 1, wherein performing the power grid flood prevention prediction early warning impact factor identification analysis on the flood prevention data comprises:

the geographic environment information and the site information are static covariate features of the transformer substation;

the weather monitoring information is a dynamic covariate feature.

4. The improved Transformer-based power grid flood prevention risk probability prediction method of claim 1, wherein the feature enhancement improvement strategy based on the gating selection mechanism and embedded coding comprises: a main body input feature selection strategy based on a gating mechanism, and a timestamp feature fusion strategy based on embedded coding.

5. The improved Transformer-based power grid flood prevention risk probability prediction method of claim 4, wherein the main body input feature selection strategy based on the gating mechanism comprises:

6. The improved Transformer-based power grid flood prevention risk probability prediction method of claim 4, wherein the timestamp feature fusion strategy based on embedded coding comprises:

7. The improved Transformer-based power grid flood prevention risk probability prediction method of claim 1, wherein the explicit sparse attention improvement strategy is implemented by employing a top-k selection strategy, comprising:
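By way of illustration only, the top-k selection named in claim 7 can be sketched for a single query as follows. This is a generic form of explicit sparse attention under stated assumptions, not the specific steps of the claim: scaled dot-product scoring, keeping ties at the threshold, and the names `topk_sparse_attention`, `query`, `keys`, `values`, and `k` are all hypothetical.

```python
import math


def topk_sparse_attention(query, keys, values, k):
    """Single-query attention that keeps only the k highest-scoring keys.

    Assumed generic formulation: scores outside the top k are masked to
    -inf before the softmax, so their attention weights are exactly 0.
    """
    d = len(query)
    # Scaled dot-product score between the query and every key.
    scores = [sum(q * x for q, x in zip(query, key)) / math.sqrt(d) for key in keys]
    # Clamp k to a valid range, then take the k-th largest score as threshold.
    k = max(1, min(k, len(scores)))
    threshold = sorted(scores, reverse=True)[k - 1]
    # Scores tied with the threshold are all kept.
    masked = [s if s >= threshold else float("-inf") for s in scores]
    # Numerically stable softmax over the masked scores.
    peak = max(masked)
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output


# Hypothetical toy inputs, for illustration only.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
weights, output = topk_sparse_attention(query, keys, values, k=2)
```

With `k=2`, only the first and third keys receive non-zero weight, which concentrates attention on the most relevant positions and suppresses low-scoring ones.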

8. An improved Transformer-based power grid flood prevention risk prediction system, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the improved Transformer-based power grid flood prevention risk probability prediction method according to any one of claims 1-7.

9. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the improved Transformer-based power grid flood prevention risk probability prediction method according to any one of claims 1-7.