CN117236201B - Diffusion and ViT-based downscaling method - Google Patents
- Publication number
- CN117236201B (Application No. CN202311525721.XA)
- Authority
- CN
- China
- Prior art keywords
- model
- diffusion
- steps
- precipitation
- resolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a downscaling method based on Diffusion and ViT, which comprises the following steps: S1, establishing paired samples of low-resolution numerical-model precipitation forecasts and high-resolution precipitation observations, and preprocessing them; S2, constructing a Diffusion-Vision Transformer precipitation prediction model; S3, training the model until the error of the Diffusion-Vision Transformer converges, then saving the model and using it for prediction. By replacing the U-Net structure in the original Diffusion model with a Vision Transformer, the invention greatly improves the training efficiency of the model and reduces its prediction time.
Description
Technical Field
The invention relates to the technical field of weather forecasting, and in particular to a downscaling method based on Diffusion and ViT.
Background
Most traditional statistical downscaling methods are based on linear frameworks, which struggle to process complex, high-dimensional meteorological field data and to characterize the nonlinear dynamics of the atmosphere. The rise of deep learning offers a new direction for characterizing high-dimensional, strongly nonlinear data such as meteorological element fields. By using efficient spatial feature extraction modules to extract key information from high-dimensional spatial data and establishing a statistical model mapping low-resolution inputs to high-resolution outputs, deep learning models have been applied effectively to scenarios such as image denoising and image resolution enhancement; such models are generally called super-resolution models. However, how to efficiently transfer such models to the meteorological downscaling problem, and to further improve their computational efficiency and prediction accuracy, still requires further research and exploration.
Disclosure of Invention
The invention aims to: the invention aims to provide a downscaling method based on Diffusion and ViT to solve the problems of insufficient spatial resolution and large prediction error of numerical mode precipitation prediction.
The technical scheme is as follows: the invention discloses a downscaling method based on Diffusion and ViT, which comprises the following steps:
S1: establishing paired samples of low-resolution numerical-model precipitation forecasts and high-resolution precipitation observations, and preprocessing them;
S2: constructing a Diffusion-Vision Transformer precipitation prediction model; this comprises the following steps:
S21: performing forward noising on the high-resolution precipitation observation sample in the Diffusion model;
S22: extracting high-order spatial features of the low-resolution numerical-model precipitation forecast using the Vision Transformer model;
S23: denoising the result obtained in step S21 in the Diffusion model, and introducing the high-order spatial features obtained in step S22 as condition information to obtain a downscaled high-resolution precipitation forecast;
S3: training the model until the error of the Diffusion-Vision Transformer converges, then saving the model and performing prediction.
Further, in the step S1, the preprocessing includes: the data set is subjected to operations of logarithmization and normalization.
Further, the specific process of step S21 is as follows:
Let $x_0$ be a preprocessed high-resolution precipitation observation sample at a given moment. Random Gaussian noise $\epsilon \sim \mathcal{N}(0, I)$ is added to the original observation stepwise over $T$ steps, yielding $x_1, \dots, x_T$. The data distribution at step $t$, given step $t-1$, satisfies the following formula:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right);$$

where $\beta_t$ is a preset constant hyperparameter ranging between 0 and 1.

The data distribution at step $t$ can also be obtained directly from the data at step 0 by the following formula:

$$q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,I\right);$$

where $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s$; then $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$.
Further, the step S22 is specifically as follows: input the paired high-resolution precipitation observation sample $x_0$ and the low-resolution numerical-model precipitation forecast $c$, and determine the number of forward noising steps $T$ and the variance hyperparameters $\beta_1,\dots,\beta_T$ of the added random Gaussian noise.
Further, the step S23 includes the following steps:
s231: dividing the low-resolution numerical mode precipitation forecast into a plurality of image blocks, and then carrying out linear mapping on the divided image blocks;
s232: the position information of different image blocks is represented by position codes, and the processed coding information is used as the input of N groups of self-attention modules;
s233: the convolution operation is replaced with a spatial self-attention module.
Further, the formula of step S231 is as follows:

$$Z = XW + b;$$

where $X$ is a group of segmented image blocks, $W$ is the weight matrix to be trained, $b$ is the bias (intercept) to be trained, and $Z$ is the set of vectors after the linear mapping.
Further, the position encoding of step S232 is a two-dimensional position embedding method.
Further, the step S233 specifically includes the following steps:
Let a group of divided image blocks be $Z$. Using three sets of weights — the query weight $W_Q$, the key weight $W_K$, and the value weight $W_V$ — the raw data is projected into three features: the query matrix $Q = ZW_Q$, the key matrix $K = ZW_K$, and the value matrix $V = ZW_V$. The self-attention corresponding to $Z$ is then given by:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V;$$

where $\sqrt{d_k}$ is the square root of the dimension of $K$.
Further, the step S3 specifically includes the following steps:
The result obtained through steps S21–S22 is $\epsilon_\theta(x_t, c, t)$, where $\epsilon_\theta$ is the model obtained in steps S21–S22, $c$ is the low-resolution numerical-model precipitation forecast, $x_0$ is the paired high-resolution precipitation observation sample, $\bar{\alpha}_t$ is the hyperparameter preset in step S21, and $T$ is the number of forward noising steps in step S21. The prediction error $L$ of the Diffusion-Vision Transformer model in step S3 is given by:

$$L = \mathbb{E}_{t, x_0, \epsilon}\left\|\epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ c,\ t\right)\right\|^2;$$

where $\epsilon \sim \mathcal{N}(0, I)$ is random Gaussian noise; then $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$.

When the prediction error $L$ of the Diffusion-Vision Transformer model converges, the reverse process is deduced from step $T$ backwards until the model prediction $x_0$ is obtained; the previous step $x_{t-1}$ is obtained from the next step $x_t$ by:

$$x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(x_t, c, t)\right) + \sigma_t z;$$

where $\epsilon_\theta$ is the model obtained in steps S21–S22, $c$ is the low-resolution numerical-model precipitation forecast, $\alpha_t$ and $\bar{\alpha}_t$ are the hyperparameters preset in step S21, and $z \sim \mathcal{N}(0, I)$ is random Gaussian noise; then $\sigma_t = \sqrt{\beta_t}$.
An apparatus of the present invention includes a memory, a processor, and a program stored in the memory and executable on the processor; when executing the program, the processor implements the steps in any of the above downscaling methods based on Diffusion and ViT.
Beneficial effects: compared with the prior art, the invention has the following notable advantages: (1) the Diffusion model improves the refinement of the downscaled prediction, which is particularly advantageous for tasks with a downscaling factor exceeding 4; (2) replacing the U-Net structure in the original Diffusion model with a Vision Transformer greatly improves the training efficiency of the model and reduces its prediction time.
Drawings
FIG. 1 is a general flow chart of the present invention;
FIG. 2 is a schematic diagram of the training flow of the Diffusion-ViT model;
FIG. 3 is a schematic diagram of the Diffusion model;
FIG. 4 is a schematic diagram of the Vision Transformer model.
Description of the embodiments
The technical scheme of the invention is further described below with reference to the accompanying drawings.
As shown in fig. 1, an embodiment of the present invention provides a downscaling method based on Diffusion and ViT, which comprises the following steps:
S1: establishing paired samples of low-resolution numerical-model precipitation forecasts and high-resolution precipitation observations, and preprocessing them; the preprocessing comprises subjecting the data set to logarithmization and normalization.
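For illustration only (the patent does not fix the exact transform or constants), the logarithmization and min-max normalization of S1 might be sketched as follows; the synthetic gamma-distributed rainfall field and the small stabilizing constant are assumptions:

```python
import numpy as np

def preprocess(precip):
    """Logarithmize and normalize a precipitation field.

    log1p compresses the heavy-tailed precipitation distribution and is
    safe at zero rainfall; min-max scaling then maps values into [0, 1].
    Returns the normalized field and the (min, max) stats needed to invert.
    """
    logged = np.log1p(precip)
    lo, hi = logged.min(), logged.max()
    norm = (logged - lo) / (hi - lo + 1e-8)   # 1e-8 guards a constant field
    return norm, (lo, hi)

field = np.random.default_rng(0).gamma(shape=0.5, scale=5.0, size=(64, 64))
norm, stats = preprocess(field)
```

The stored `(lo, hi)` statistics allow the downscaled output to be mapped back to physical precipitation units after prediction.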
As shown in fig. 2, S2: constructing a Diffusion-Vision Transformer precipitation prediction model; the method comprises the following steps:
S21: Forward noising is performed on the high-resolution precipitation observation sample in the Diffusion model. Specifically, as shown in FIG. 3, let $x_0$ be a preprocessed high-resolution precipitation observation sample at a given moment. Random Gaussian noise $\epsilon \sim \mathcal{N}(0, I)$ is added to the original observation stepwise over $T$ steps, yielding $x_1, \dots, x_T$. The data distribution at step $t$, given step $t-1$, satisfies the following formula:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right);$$

where $\beta_t$ is a preset constant hyperparameter ranging between 0 and 1.

The data distribution at step $t$ can also be obtained directly from the data at step 0 by the following formula:

$$q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,I\right);$$

where $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s$; then $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$.
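The forward noising of step S21 admits a closed-form sample at any step $t$. A minimal NumPy sketch follows; the linear $\beta_t$ schedule and $T = 1000$ are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # preset variance hyperparameters beta_t in (0, 1)
alphas = 1.0 - betas                 # alpha_t = 1 - beta_t
alpha_bars = np.cumprod(alphas)      # alpha-bar_t = product of alpha_s for s <= t

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((64, 64))   # stand-in for a normalized observation sample
xt, eps = q_sample(x0, T - 1, rng)   # near t = T the field is almost pure noise
```

Because `alpha_bars[T-1]` is close to zero under this schedule, $x_T$ is essentially standard Gaussian noise, which is what the reverse process of step S3 starts from.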
S22: Extract high-order spatial features of the low-resolution numerical-model precipitation forecast using the Vision Transformer model. Specifically: input the paired high-resolution precipitation observation sample $x_0$ and the low-resolution numerical-model precipitation forecast $c$, and determine the number of forward noising steps $T$ and the variance hyperparameters $\beta_1,\dots,\beta_T$ of the added random Gaussian noise.
S23: The result obtained in step S21 is denoised in the Diffusion model, and the high-order spatial features obtained in step S22 are introduced as condition information to obtain a downscaled high-resolution precipitation forecast;
the method comprises the following steps:
S231: As shown in fig. 4, the low-resolution numerical-model precipitation forecast is divided into a plurality of image blocks, and the divided image blocks are then linearly mapped. The formula is as follows:

$$Z = XW + b;$$

where $X$ is a group of segmented image blocks, $W$ is the weight matrix to be trained, $b$ is the bias (intercept) to be trained, and $Z$ is the set of vectors after the linear mapping.
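The block partition and linear mapping of step S231 can be sketched as follows. The patch size of 8 and embedding dimension of 128 are illustrative assumptions, not values from the patent:

```python
import numpy as np

def patchify(img, p):
    """Split an (H, W) field into non-overlapping p x p blocks,
    each flattened into a row vector of p*p pixels."""
    H, W = img.shape
    return (img.reshape(H // p, p, W // p, p)
               .swapaxes(1, 2)
               .reshape(-1, p * p))

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))        # stand-in low-resolution forecast field
X = patchify(img, 8)                       # 64 blocks of 64 pixels each

d_model = 128
W = rng.standard_normal((X.shape[1], d_model)) * 0.02   # weight matrix to be trained
b = np.zeros(d_model)                                   # bias to be trained
Z = X @ W + b                              # Z = XW + b: one token per image block
```

Each row of `Z` is the token for one image block; position encodings (step S232) would then be added to these tokens before the self-attention modules.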
S232: the position information of different image blocks is represented by position codes, and the processed coding information is used as the input of N groups of self-attention modules; the position coding is a two-dimensional position embedding method, and specifically comprises the following steps: by encoding the position of each tile relative to the X-axis and the Y-axis, different tiles are represented with different position encodings.
S233: the convolution operation is replaced with a spatial self-attention module. The method comprises the following steps:
Let a group of divided image blocks be $Z$. Using three sets of weights — the query weight $W_Q$, the key weight $W_K$, and the value weight $W_V$ — the raw data is projected into three features: the query matrix $Q = ZW_Q$, the key matrix $K = ZW_K$, and the value matrix $V = ZW_V$. The self-attention corresponding to $Z$ is then given by:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V;$$

where $\sqrt{d_k}$ is the square root of the dimension of $K$. The spatial self-attention module consists of a normalization layer, multi-head self-attention, a residual structure, and a feed-forward neural network.
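The scaled dot-product self-attention of step S233 can be sketched in NumPy as follows. This is a single-head, illustrative version; the token count and dimensions are assumptions:

```python
import numpy as np

def self_attention(Z, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a token set Z:
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    Q, K, V = Z @ Wq, Z @ Wk, Z @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # (n_tokens, n_tokens)
    scores -= scores.max(axis=-1, keepdims=True)           # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)         # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
n_tokens, d = 16, 32
Z = rng.standard_normal((n_tokens, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out = self_attention(Z, Wq, Wk, Wv)        # same number of tokens as the input
```

In the full module this operation would be wrapped with the normalization layer, multi-head splitting, residual connection, and feed-forward network mentioned above.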
S3: Train the model until the error of the Diffusion-Vision Transformer converges, then save the model and perform prediction. The method is specifically as follows:
The result obtained through steps S21–S22 is $\epsilon_\theta(x_t, c, t)$, where $\epsilon_\theta$ is the model obtained in steps S21–S22, $c$ is the low-resolution numerical-model precipitation forecast, $x_0$ is the paired high-resolution precipitation observation sample, $\bar{\alpha}_t$ is the hyperparameter preset in step S21, and $T$ is the number of forward noising steps in step S21. The prediction error $L$ of the Diffusion-Vision Transformer model in step S3 is given by:

$$L = \mathbb{E}_{t, x_0, \epsilon}\left\|\epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ c,\ t\right)\right\|^2;$$

where $\epsilon \sim \mathcal{N}(0, I)$ is random Gaussian noise; then $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$.

When the prediction error $L$ of the Diffusion-Vision Transformer model converges, the reverse process is deduced from step $T$ backwards until the model prediction $x_0$ is obtained; the previous step $x_{t-1}$ is obtained from the next step $x_t$ by:

$$x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(x_t, c, t)\right) + \sigma_t z;$$

where $\epsilon_\theta$ is the model obtained in steps S21–S22, $c$ is the low-resolution numerical-model precipitation forecast, $\alpha_t$ and $\bar{\alpha}_t$ are the hyperparameters preset in step S21, and $z \sim \mathcal{N}(0, I)$ is random Gaussian noise; then $\sigma_t = \sqrt{\beta_t}$.
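The reverse (denoising) recursion of step S3 can be sketched as follows. The zero-output noise predictor is merely a stand-in for the trained Diffusion-ViT model $\epsilon_\theta$, and the linear $\beta_t$ schedule and the choice $\sigma_t = \sqrt{\beta_t}$ are assumptions for illustration:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def toy_eps_model(xt, cond, t):
    """Placeholder for the trained noise predictor eps_theta(x_t, c, t)."""
    return np.zeros_like(xt)

def p_sample(xt, cond, t, rng):
    """One reverse step x_{t-1} from x_t (ancestral sampling):
    x_{t-1} = (x_t - (1-alpha_t)/sqrt(1-alpha_bar_t) * eps) / sqrt(alpha_t) + sigma_t z."""
    eps = toy_eps_model(xt, cond, t)
    mean = (xt - (1.0 - alphas[t]) / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t == 0:
        return mean                       # no noise is added at the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)

rng = np.random.default_rng(0)
cond = rng.standard_normal((64, 64))      # conditioning: low-resolution forecast features
x = rng.standard_normal((64, 64))         # start from pure noise x_T
for t in reversed(range(T)):
    x = p_sample(x, cond, t, rng)         # iterate t = T-1, ..., 0 to recover x_0
```

With a trained $\epsilon_\theta$ in place of the placeholder, the final `x` would be the downscaled high-resolution precipitation prediction conditioned on `cond`.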
The embodiment of the invention also provides a device, which comprises a memory, a processor, and a program stored in the memory and executable on the processor; when executing the program, the processor implements the steps in any of the above downscaling methods based on Diffusion and ViT.
Claims (10)
1. A downscaling method based on Diffusion and ViT, comprising the following steps:
S1: establishing paired samples of low-resolution numerical-model precipitation forecasts and high-resolution precipitation observations, and preprocessing them;
S2: constructing a Diffusion-Vision Transformer precipitation prediction model, comprising the following steps:
S21: performing forward noising on the high-resolution precipitation observation sample in the Diffusion model;
S22: extracting high-order spatial features of the low-resolution numerical-model precipitation forecast using the Vision Transformer model;
S23: denoising the result obtained in step S21 in the Diffusion model, and introducing the high-order spatial features obtained in step S22 as condition information to obtain a downscaled high-resolution precipitation forecast;
S3: training the model until the error of the Diffusion-Vision Transformer converges, then saving the model and performing prediction.
2. The downscaling method based on Diffusion and ViT of claim 1, wherein the preprocessing in step S1 comprises: the data set is subjected to operations of logarithmization and normalization.
3. The downscaling method based on the Diffusion and the ViT according to claim 1, wherein the specific procedure of the step S21 is as follows:
setting a preprocessed high-resolution precipitation observation sample $x_0$ at a given moment; adding random Gaussian noise $\epsilon \sim \mathcal{N}(0, I)$ to the original observation stepwise over $T$ steps to obtain $x_1, \dots, x_T$; the data distribution at step $t$, given step $t-1$, satisfies the following formula:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right);$$

where $\beta_t$ is a preset constant hyperparameter ranging between 0 and 1;

the data distribution at step $t$ can also be obtained directly from the data at step 0 by the following formula:

$$q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,I\right);$$

where $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s$; then $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$.
4. The downscaling method based on Diffusion and ViT of claim 1, wherein step S22 is specifically as follows: input the paired high-resolution precipitation observation sample $x_0$ and the low-resolution numerical-model precipitation forecast $c$, and determine the number of forward noising steps $T$ and the variance hyperparameters $\beta_1,\dots,\beta_T$ of the added random Gaussian noise.
5. The downscaling method based on Diffusion and ViT of claim 1, wherein the step S23 comprises the steps of:
s231: dividing the low-resolution numerical mode precipitation forecast into a plurality of image blocks, and then carrying out linear mapping on the divided image blocks;
s232: the position information of different image blocks is represented by position codes, and the processed coding information is used as the input of N groups of self-attention modules;
s233: the convolution operation is replaced with a spatial self-attention module.
6. The downscaling method based on Diffusion and ViT of claim 4, wherein the formula of step S231 is as follows:

$$Z = XW + b;$$

where $X$ is a group of segmented image blocks, $W$ is the weight matrix to be trained, $b$ is the bias (intercept) to be trained, and $Z$ is the set of vectors after the linear mapping.
7. The downscaling method based on Diffusion and ViT of claim 4, wherein the position encoding of step S232 is a two-dimensional position embedding method.
8. The downscaling method based on Diffusion and ViT of claim 4, wherein the step S233 is specifically as follows:

let a group of divided image blocks be $Z$; using three sets of weights — the query weight $W_Q$, the key weight $W_K$, and the value weight $W_V$ — the raw data is projected into three features: the query matrix $Q = ZW_Q$, the key matrix $K = ZW_K$, and the value matrix $V = ZW_V$; the self-attention corresponding to $Z$ is then given by:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V;$$

where $\sqrt{d_k}$ is the square root of the dimension of $K$.
9. The downscaling method based on Diffusion and ViT according to claim 1, wherein the step S3 is specifically as follows:

the result obtained through steps S21–S22 is $\epsilon_\theta(x_t, c, t)$, where $\epsilon_\theta$ is the model obtained in steps S21–S22, $c$ is the low-resolution numerical-model precipitation forecast, $x_0$ is the paired high-resolution precipitation observation sample, $\bar{\alpha}_t$ is the hyperparameter preset in step S21, and $T$ is the number of forward noising steps in step S21; the prediction error $L$ of the Diffusion-Vision Transformer model in step S3 is given by:

$$L = \mathbb{E}_{t, x_0, \epsilon}\left\|\epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ c,\ t\right)\right\|^2;$$

where $\epsilon \sim \mathcal{N}(0, I)$ is random Gaussian noise; then $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$;

when the prediction error $L$ of the Diffusion-Vision Transformer model converges, the reverse process is deduced from step $T$ backwards until the model prediction $x_0$ is obtained; the previous step $x_{t-1}$ is obtained from the next step $x_t$ by:

$$x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(x_t, c, t)\right) + \sigma_t z;$$

where $\epsilon_\theta$ is the model obtained in steps S21–S22, $c$ is the low-resolution numerical-model precipitation forecast, $\alpha_t$ and $\bar{\alpha}_t$ are the hyperparameters preset in step S21, and $z \sim \mathcal{N}(0, I)$ is random Gaussian noise; then $\sigma_t = \sqrt{\beta_t}$.
10. An apparatus comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor performs the steps in a downscaling method based on Diffusion and ViT as claimed in any one of claims 1-9 when the program is executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311525721.XA CN117236201B (en) | 2023-11-16 | 2023-11-16 | Diffusion and ViT-based downscaling method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117236201A CN117236201A (en) | 2023-12-15 |
CN117236201B true CN117236201B (en) | 2024-02-23 |
Family
ID=89098904
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311525721.XA Active CN117236201B (en) | 2023-11-16 | 2023-11-16 | Diffusion and ViT-based downscaling method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117236201B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6688180B1 (en) * | 1999-07-05 | 2004-02-10 | Sinvent As | Multi-test assembly for evaluating, detecting and monitoring processes at elevated pressure
CN109524061A (en) * | 2018-10-23 | 2019-03-26 | 中国人民解放军陆军防化学院 | A kind of radionuclide diffusion calculation method based on transmission coefficient matrix |
CN115964869A (en) * | 2022-12-14 | 2023-04-14 | 西北核技术研究所 | High-space-time resolution atmospheric pollution diffusion migration simulation method |
CN116740223A (en) * | 2023-04-26 | 2023-09-12 | 先进操作系统创新中心(天津)有限公司 | Method for generating image based on text |
CN116953642A (en) * | 2023-06-29 | 2023-10-27 | 安徽大学 | Millimeter wave radar gesture recognition method based on adaptive coding Vision Transformer network |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1882693B1 (en) * | 2006-07-24 | 2012-11-14 | Imec | Method and solution for growing a charge-transfer complex salt onto a metal surface |
JP6697394B2 (en) * | 2014-04-10 | 2020-05-20 | イェール ユニバーシティーYale University | Methods and compositions for detecting a misfolded protein |
JP2024517412A (en) * | 2021-04-16 | 2024-04-22 | ストロング フォース ヴィーシーエヌ ポートフォリオ 2019,エルエルシー | Systems, methods, kits, and apparatus for digital product network systems and biology-based value chain networks |
CA3177620A1 (en) * | 2021-05-06 | 2022-11-06 | Strong Force Iot Portfolio 2016, Llc | Quantum, biological, computer vision, and neural network systems for industrial internet of things |
KR20240019771A (en) * | 2021-05-11 | 2024-02-14 | 스트롱 포스 브이씨엔 포트폴리오 2019, 엘엘씨 | Systems, methods, kits and devices for edge distributed storage and querying of value chain networks |
US20220301097A1 (en) * | 2022-06-03 | 2022-09-22 | Intel Corporation | Methods and apparatus to implement dual-attention vision transformers for interactive image segmentation |
- 2023-11-16: Application CN202311525721.XA filed; patent CN117236201B granted (status: Active)
Non-Patent Citations (3)
Title |
---|
Downscaling method for the conditional vegetation temperature index based on the point spread function; Wang Pengxin; Transactions of the Chinese Society for Agricultural Machinery; Vol. 48, No. 12; pp. 165-173 *
Study on the mesoscale predictability of torrential rain along the Jianghuai Meiyu front; Yang Shunan; China Doctoral Dissertations Full-text Database, Basic Sciences; No. 6; A009-6 *
Self-attention diffusion model for multi-weather degraded image restoration; Qin Jing; Journal of Shanghai Jiao Tong University; pp. 1-22 *
Also Published As
Publication number | Publication date |
---|---|
CN117236201A (en) | 2023-12-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111598786B (en) | Hyperspectral image unmixing method based on depth denoising self-coding network | |
CN112733965B (en) | Label-free image classification method based on small sample learning | |
CN111259397B (en) | Malware classification method based on Markov graph and deep learning | |
CN112560966B (en) | Polarized SAR image classification method, medium and equipment based on scattering map convolution network | |
Huang et al. | Compressing multidimensional weather and climate data into neural networks | |
CN111931813A (en) | CNN-based width learning classification method | |
CN115311555A (en) | Remote sensing image building extraction model generalization method based on batch style mixing | |
CN112417752A (en) | Cloud layer track prediction method and system based on convolution LSTM neural network | |
CN110807497A (en) | Handwritten data classification method and system based on deep dynamic network | |
CN108388918B (en) | Data feature selection method with structure retention characteristics | |
CN117236201B (en) | Diffusion and ViT-based downscaling method | |
CN111080516A (en) | Super-resolution image reconstruction method based on self-sampling enhancement | |
CN113835964B (en) | Cloud data center server energy consumption prediction method based on small sample learning | |
CN109840888B (en) | Image super-resolution reconstruction method based on joint constraint | |
CN114581789A (en) | Hyperspectral image classification method and system | |
CN109919200B (en) | Image classification method based on tensor decomposition and domain adaptation | |
CN113537573A (en) | Wind power operation trend prediction method based on dual space-time feature extraction | |
CN112926670A (en) | Garbage classification system and method based on transfer learning | |
CN111967580B (en) | Low-bit neural network training method and system based on feature migration | |
CN116698410B (en) | Rolling bearing multi-sensor data monitoring method based on convolutional neural network | |
CN113627073B (en) | Underwater vehicle flow field result prediction method based on improved Unet++ network | |
CN117878928B (en) | Wind power prediction method and device based on deep learning | |
CN113627556B (en) | Method and device for realizing image classification, electronic equipment and storage medium | |
CN114022360B (en) | Rendered image super-resolution system based on deep learning | |
CN117495153A (en) | Method and system for quickly correcting regional power grid line loss influence factors |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||