CN117236390A - Reservoir prediction method based on cross attention diffusion model

Reservoir prediction method based on cross attention diffusion model

Info

Publication number
CN117236390A
CN117236390A (application CN202311229800.6A)
Authority
CN
China
Prior art keywords
attention
data
noise
sample
hidden
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311229800.6A
Other languages
Chinese (zh)
Inventor
武娟
罗仁泽
罗磊
雷璨如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Petroleum University
Original Assignee
Southwest Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Petroleum University filed Critical Southwest Petroleum University
Priority to CN202311229800.6A priority Critical patent/CN117236390A/en
Publication of CN117236390A publication Critical patent/CN117236390A/en
Pending legal-status Critical Current


Abstract

The invention provides a reservoir prediction method based on a cross-attention diffusion model. The method takes a generative diffusion model as the basis of reservoir parameter prediction, uses the existing logging and coring data of a work area as data support, and combines a Transformer feature extraction module with a cross-attention module to better capture the interactive association between logging data and reservoir parameters, improving the accuracy with which the model predicts reservoir parameters during the iterative denoising process. Compared with conventional geological methods that rely on mathematical empirical formulas and cumbersome modeling to calculate reservoir parameters, the method predicts reservoir parameters in an end-to-end manner: logging data of any target area are fed directly into the trained model to predict the reservoir parameters. The correlations among the various logging data features are better extracted, the cost of acquiring core data to correct the model is avoided, and low-cost, high-precision reservoir parameter prediction for any target area is realized.

Description

Reservoir prediction method based on cross attention diffusion model
Technical Field
The invention relates to the technical field of geological reservoir prediction and deep learning, in particular to a reservoir prediction method based on a cross attention diffusion model.
Background
Porosity, permeability and saturation are key parameters describing reservoir physical properties, and they directly affect the storage and production capacity of oil and gas in the reservoir. Predicting reservoir parameters with high accuracy is therefore important: it is the basic task underlying the construction of high-accuracy geologic models, the estimation of reasonable oil and gas reserves, and the determination of efficient development schemes.
At present, traditional reservoir prediction methods mainly comprise semi-quantitative analysis based on curve cross-plots and simple mathematical analysis methods for fitting the parameters to be predicted (mainly factor analysis for optimizing discrimination parameters, Bayesian discriminant analysis, cluster analysis, and the like). However, these conventional methods are mostly based on linear prediction, have low efficiency and high subjectivity, and easily degrade the accuracy of the predicted reservoir parameters. Meanwhile, in actual well logging, because reservoirs are complex and highly heterogeneous, the logging data are affected by noise and abnormal values, which further degrades the prediction of reservoir parameters. Benefiting from the development of deep learning, a data-driven way of thinking provides petroleum geology researchers with a brand-new analysis perspective that can reduce analysis cost and improve the accuracy of reservoir prediction.
The diffusion model is a type of generative model: in the forward diffusion process, noise is gradually added to the input data until the data becomes pure Gaussian noise; in the reverse diffusion process, denoising is performed step by step through sampling, and the model learns how to recover the real input data from Gaussian noise. However, the original diffusion model is aimed at image generation tasks and cannot be applied directly to reservoir parameter prediction. The Transformer network has strong feature extraction capability and can mine the correlations among logging data features well. Meanwhile, the cross-attention mechanism helps capture the interaction between logging data and reservoir parameters by computing the cross-attention between the source data and the target data. Combining the advantages of these networks, a reservoir parameter prediction method driven by logging data is constructed.
Aiming at complex logging response characteristics, the invention proposes a reservoir prediction method based on a cross-attention diffusion model, built on logging data according to the principle of the diffusion model. The method first performs outlier removal and standardization on the logging data, preventing the model from learning wrong information during training. Second, a 1×1 convolution module maps the logging data training samples and their corresponding real reservoir parameters into a high-dimensional vector space, so that different features are better distinguished. The features of the training samples and of the noised real reservoir parameters in the high-dimensional vector space are then extracted by the Transformer backbone network of the diffusion model, and the cross-attention between them is computed to predict the noise of the current time step. Finally, the denoising module iteratively denoises to obtain the final predicted reservoir parameters.
Disclosure of Invention
The invention mainly overcomes the defects of the existing reservoir parameter prediction method, and aims to provide a reservoir prediction method based on a cross attention diffusion model.
In order to achieve the technical purpose, the invention provides the following technical scheme:
a reservoir prediction method based on a cross-attention diffusion model, comprising the steps of:
step 1, acquiring the existing logging data and coring data of a working area:
the acquisition of the existing logging data of the working area comprises the following logging parameters: natural Gamma (GR), sonic time difference (AC), borehole diameter (CAL), compensated Neutron (CNL), compensated Density (DEN), natural potential (SP), resistivity log (RT), at least three or more logging parameters should be selected as the characteristic data; the existing coring data of the work area is used as label data, which comprises the following steps: porosity (POR), permeability (PERM), and oil Saturation (SO), such reservoir parameters would need to be predicted for the uncancelled work area.
Step 2, preprocessing the acquired logging data, and constructing a training and testing sample data set:
(1) Preprocessing the acquired logging data comprises outlier removal and standardization, where outlier removal adopts the robust random cut forest algorithm (Robust Random Cut Forest, RRCF) and includes the following steps:
a) Given a logging data set $X = \{x_1, x_2, \ldots, x_n, \ldots, x_N\}$ containing N sample points, where each sample $x_i$ has dimension M. First, the first n sample points $X' = \{x_1, x_2, \ldots, x_n\}$ are selected to initialize a robust random cut tree: compute the maximum $\max_d$ and minimum $\min_d$ of all samples in $X'$ in each feature dimension d; randomly select a feature dimension d according to the probability distribution $p_d = \dfrac{\max_d - \min_d}{\sum_{d'}(\max_{d'} - \min_{d'})}$; sample a cut value C on the selected feature dimension d from the uniform distribution on $[\min_d, \max_d]$; split $X'$ into the left subtree $X_{left} = \{x_i \mid x_i \in X', x_{id} \le C\}$ and the right subtree $X_{right} = X' \setminus X_{left}$ according to the selected feature dimension d; repeat the sampling of the cut value C and the splitting on $X_{left}$ and $X_{right}$ until every sample point in $X'$ has been assigned to a leaf node, each leaf node containing exactly one sample point; all the resulting $X_{left}$ and $X_{right}$ finally constitute the initialized robust random cut tree T;
b) Anomaly scores are computed by deleting and inserting sample points. Starting from the (n+1)-th sample point, the sample point $x_{i-n}$ is first deleted from T by replacing the parent of the leaf node to which the sample belongs with its sibling node, where i = n+1. The change in model complexity caused by deleting $x_{i-n}$, i.e. the change in the sum of the depths of all leaf nodes in T, measures the degree of abnormality of that point. The anomaly score of sample point $x_{i-n}$ is therefore computed as:

$s(x_{i-n}, X') = \sum_{y \in X'} f(y, X') - \sum_{y \in X' \setminus \{x_{i-n}\}} f\big(y, X' \setminus \{x_{i-n}\}\big)$

where the function $f(y, \cdot)$ denotes the depth of leaf node y in the tree generated from the given sample set. The closer the anomaly score $s(x_{i-n}, X')$ is to 0, the less likely $x_{i-n}$ is an outlier. After deleting $x_{i-n}$ from T, the sample point $x_i$ is inserted into the set $X' \setminus \{x_{i-n}\}$ from which $x_{i-n}$ was deleted, giving the new sample point set $X' \setminus \{x_{i-n}\} \cup \{x_i\}$. Repeating the robust random cut tree initialization procedure yields a robust random cut tree T'' containing the inserted sample point $x_i$. By deleting sample point $x_{i-n}$, now with i = n+2, the anomaly score of $x_{i-n}$ is computed and recorded;
c) The procedures of initializing the robust random cut tree and deleting and inserting sample points are executed iteratively until the anomaly scores $S = \{s(x_1), s(x_2), \ldots, s(x_N)\}$ of all sample points have been obtained. All anomaly scores are arranged in descending order, $S' = \mathrm{DescendingSort}(S)$; the sample points corresponding to the top 2% of $S'$ are marked as outliers and removed from X, giving the outlier-free logging data set $X_{new}$;
d) The outlier-free logging data $X_{new}$ are standardized as follows (a preprocessing sketch is given below):

$X'_{new} = \dfrac{X_{new} - \mu}{\sigma}, \quad \mu = \dfrac{1}{N'}\sum_{i=1}^{N'} x_i, \quad \sigma = \sqrt{\dfrac{1}{N'}\sum_{i=1}^{N'} (x_i - \mu)^2}$

where $X'_{new}$ is the standardized sample value, μ and σ are the mean and standard deviation of $X_{new}$, and N' is the number of logging data samples after outlier removal;
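A minimal preprocessing sketch for step 2(1), assuming the open-source `rrcf` package and using its collusive-displacement score `codisp` as a stand-in for the depth-sum-change anomaly score defined above; the sliding-window size and the detail of scoring the inserted rather than the deleted point are illustrative simplifications.

```python
# Sliding-window RRCF outlier scoring followed by z-score standardization.
import numpy as np
import rrcf

def remove_outliers_and_standardize(X, window=256, drop_frac=0.02):
    """X: (N, M) array of logging curves (e.g. GR, AC, CAL, CNL, DEN, SP, RT)."""
    n_samples = X.shape[0]
    tree = rrcf.RCTree()
    scores = np.zeros(n_samples)
    for i, point in enumerate(X):
        if i >= window:                    # keep a fixed-size window of sample points
            tree.forget_point(i - window)  # delete the oldest point from the tree
        tree.insert_point(point, index=i)  # insert the newest point
        scores[i] = tree.codisp(i)         # anomaly score of the inserted point
    n_drop = int(np.ceil(drop_frac * n_samples))      # top 2% marked as outliers
    keep = np.sort(np.argsort(scores)[: n_samples - n_drop])
    X_new = X[keep]
    mu, sigma = X_new.mean(axis=0), X_new.std(axis=0)  # z-score standardization
    return (X_new - mu) / sigma
```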
(2) When constructing the training and test sample data sets, 80% of the preprocessed logging data is used as the training set $X_{train}$ and the remaining 20% as the test set $X_{test}$.
Step 3, constructing a cross attention diffusion model network structure:
the model network structure comprises a 1 multiplied by 1 convolution module, a noise adding module, a noise removing module, a transducer characteristic extraction module and a cross attention module.
1 x 1 convolution module: the number of convolution kernels is set to c, and the convolution kernels are used for mapping the input logging data characteristics into a high-dimensional space;
and (3) a noise adding module: in the diffusion process, gaussian noise is gradually added to the original real reservoir parameters (tag data)Until it becomes completely pure gaussian noise;
and a denoising module: the method comprises a learnable neural network model, wherein in the back diffusion process, the state of the current time step is denoised by predicting the noise of the current time step to obtain the state of the previous time step;
the transducer feature extraction module: the encoder is composed of a layer normalization, a multi-head attention layer, a feedforward neural network, a neuron discarding layer (Dropout) and residual connection, wherein the decoder is added with a cross attention layer on the basis of the encoder, and the rest components are the same as the encoder;
cross-attention module: the multi-head attention layer, layer normalization and residual connection, wherein the key (K) and value (V) required for multi-head attention calculation are derived from the output of the last layer of the encoder, and the query (Q) is derived from the output of the previous multi-head attention of the decoder.
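As an illustration of the cross-attention module, the following PyTorch sketch wires the query to the decoder state and the key/value to the encoder output; the layer sizes are assumptions, not values fixed by the patent text.

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Multi-head attention + layer normalization + residual connection."""
    def __init__(self, d_model=64, n_heads=8, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, decoder_state, encoder_out):
        # Q comes from the decoder's preceding multi-head attention output;
        # K and V come from the output of the last encoder layer.
        attn_out, _ = self.attn(query=decoder_state,
                                key=encoder_out,
                                value=encoder_out)
        return self.norm(decoder_state + attn_out)   # residual + layer norm
```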
Step 4, inputting the training sample into a cross attention diffusion model for training:
(1) The preprocessed logging data serving as training samples, $X_{train} \in \mathbb{R}^{m \times d \times 1}$, are input to the 1×1 convolution module, where m denotes the number of logging data training samples, d the feature dimension of the logging data, and the number of channels is 1. After c 1×1 convolutions, the embedded vector of the logging data, $X_{embedding} \in \mathbb{R}^{m \times d \times c}$, is obtained, mapping each logging feature into a high-dimensional space;
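One way to realize the 1×1 convolution embedding is a Conv1d with kernel_size=1; the channel count c = 8 mirrors the embodiment and is otherwise an assumption.

```python
import torch
import torch.nn as nn

m, d, c = 32, 7, 8                      # samples, logging features, embedding channels
x_train = torch.randn(m, 1, d)          # (batch, channels=1, feature dimension)
embed = nn.Conv1d(in_channels=1, out_channels=c, kernel_size=1)
x_embedding = embed(x_train)            # (m, c, d): every feature lifted to c channels
```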
(2) The embedded vector $X_{embedding}$ is input to the encoder of the Transformer feature extraction module, and layer normalization is applied over all neurons of the layer to obtain $X_{LN}$:

$X_{LN} = \gamma \odot \dfrac{X_{embedding} - \mathrm{E}[X_{embedding}]}{\sqrt{\mathrm{Var}[X_{embedding}] + \epsilon}} + \beta$ (5)

where E[·] denotes the mean, Var[·] denotes the variance, γ and β are the scaling and translation parameters respectively, and ε is a very small value that prevents the denominator from becoming 0. $X_{LN}$ is then fed into the multi-head attention layer of the encoder, which first constructs the matrix vectors Q, K and V:

$Q = X_{LN} W_Q + b_Q$ (6)
$K = X_{LN} W_K + b_K$ (7)
$V = X_{LN} W_V + b_V$ (8)

where $W_Q$, $W_K$ and $W_V$ are three weight matrices and $b_Q$, $b_K$ and $b_V$ are three bias vectors. Next, the number of heads in the multi-head attention layer is set to h, and the matrix vectors Q, K and V are split into h groups, i.e. $Q = [Q_1, Q_2, \ldots, Q_h]$, $K = [K_1, K_2, \ldots, K_h]$ and $V = [V_1, V_2, \ldots, V_h]$, forming h different heads $(Q_i, K_i, V_i)$, $1 \le i \le h$. Each head computes attention in a different vector subspace; for a single head, the attention mechanism is calculated as:

$\mathrm{Attention}(Q_i, K_i, V_i) = \mathrm{softmax}\!\left(\dfrac{Q_i K_i^{T}}{\sqrt{d_k}}\right) V_i$ (9)

where $K_i^{T}$ denotes the transpose of $K_i$ and $d_k$ denotes the dimension of $K_i$. Finally, the attention computed by the different heads is fused:

$X_{Attention} = \mathrm{Concat}\big(\mathrm{Attention}(Q_1, K_1, V_1), \ldots, \mathrm{Attention}(Q_h, K_h, V_h)\big)\, W_O$ (10)

where Concat(·) denotes vector concatenation and $W_O$ is a weight matrix. After the fused feature $X_{Attention}$ is obtained, it is combined with the embedded vector $X_{embedding}$ through a residual connection:

$X'_{Attention} = X_{Attention} + X_{embedding}$ (11)

$X'_{Attention}$ is then input into a feed-forward neural network, which comprises two fully connected layers, a Dropout layer and a GELU activation function. The output of the feed-forward network, $X_{Feed}$, is computed as:

$X_{Feed} = \mathrm{Linear}(\mathrm{GELU}(\mathrm{Dropout}(\mathrm{Linear}(X'_{Attention}))))$ (12)

where Linear denotes a fully connected layer. $X_{Feed}$ is then residually connected with $X'_{Attention}$, finally giving the output vector $X_{hidden}$ of the embedded vector $X_{embedding}$ after passing through the encoder (a minimal encoder-block sketch follows):

$X_{hidden} = X_{Feed} + X'_{Attention}$ (13)
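A minimal PyTorch sketch of this encoder block, with illustrative dimensions, is:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, d_model=64, n_heads=8, d_ff=256, dropout=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)                    # eq. (5)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(                            # eq. (12): Linear -> Dropout -> GELU -> Linear
            nn.Linear(d_model, d_ff),
            nn.Dropout(dropout),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x_embedding):
        x_ln = self.norm(x_embedding)                        # eq. (5)
        x_attn, _ = self.attn(x_ln, x_ln, x_ln)              # eqs. (6)-(10)
        x_attn = x_attn + x_embedding                        # eq. (11)
        return self.ffn(x_attn) + x_attn                     # eqs. (12)-(13): X_hidden
```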
(3) The real reservoir parameters (label data) $Y_{train}$ corresponding to the training samples are input to a 1×1 convolution module, similarly to (1) in step 4, yielding $Y_{embedding}$. $Y_{embedding}$ is then input to the noise-adding module, and the forward diffusion process gradually adds Gaussian noise to $Y_{embedding}$ at time steps $t \in \{1, \ldots, T\}$, giving a set of hidden variables $Y_1, \ldots, Y_T$:

$q(Y_t \mid Y_{t-1}) = \mathcal{N}\!\big(Y_t;\ \sqrt{1-\beta_t}\, Y_{t-1},\ \beta_t \mathbf{I}\big)$ (14)

where $\beta_t$ denotes the diffusion rate and $Y_t$ denotes the state of $Y_{embedding}$ at time step t. At any time step t, the relationship between $Y_t$ and the original state $Y_0$ ($Y_{embedding}$) can be expressed as:

$Y_t = \sqrt{\bar{\alpha}_t}\, Y_0 + \sqrt{1-\bar{\alpha}_t}\, z, \qquad z \sim \mathcal{N}(0, \mathbf{I})$ (15)

where $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.
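A closed-form noising sketch for equations (14)-(15); the linear β schedule is an illustrative assumption (the embodiment only states the range in which β_t is taken).

```python
import torch

T = 100
betas = torch.linspace(1e-4, 2e-2, T)         # illustrative linear schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)     # cumulative product \bar{alpha}_t

def q_sample(y0, t):
    """Sample Y_t directly from Y_0 for an integer time step t in [1, T] (eq. (15))."""
    noise = torch.randn_like(y0)
    a_bar = alpha_bars[t - 1]
    return torch.sqrt(a_bar) * y0 + torch.sqrt(1.0 - a_bar) * noise, noise
```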
(4) $Y_t$ and the encoder output vector $X_{hidden}$ are input to the decoder of the denoising module, i.e. the Transformer feature extraction module, to predict the noise of the current time step. First, a multi-head attention computation is performed on $Y_t$ in the same manner as in (2) of step 4, and $Y_t^{Attention}$ at time step t is obtained analogously to formula (11). Next, $Y_t^{Attention}$ and $X_{hidden}$ are input to the cross-attention module, whose matrix vectors Q, K and V are computed as:

$Q = Y_t^{Attention} W_Q + b_Q$ (16)
$K = X_{hidden} W_K + b_K$ (17)
$V = X_{hidden} W_V + b_V$ (18)

When computing the cross-attention, apart from the different construction of the matrix vectors Q, K and V compared with multi-head self-attention, $Y_t^{hidden}$ is obtained in the same manner as in (2) of step 4, analogously to formula (13). This is the noise predicted by the denoising module at time step t, $z_\theta(Y_t^{Attention}, t, X_{hidden}) = Y_t^{hidden}$, where $z_\theta$ denotes the output function with which the denoising module predicts the noise from $Y_t^{hidden}$ during the reverse diffusion process, and θ denotes the parameters of this function;
(5) Based on the noise $z_\theta(Y_t^{Attention}, t, X_{hidden})$ predicted at time step t in (4) of step 4, the state at time step t is denoised to obtain the state $Y_{t-1}$ at time step t−1. The mean and variance of the predicted noise are computed as:

$\mu_\theta(Y_t, t) = \dfrac{1}{\sqrt{\alpha_t}}\left(Y_t - \dfrac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, z_\theta(Y_t^{Attention}, t, X_{hidden})\right)$ (19)

$\sigma_t^2 = \dfrac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\, \beta_t$ (20)

The state at time step t−1 is then sampled as (a sampling sketch follows):

$Y_{t-1} = \mu_\theta(Y_t, t) + \sigma_t\, z, \qquad z \sim \mathcal{N}(0, \mathbf{I})$ (21)
(6) Starting from time step t = T, the noise is iteratively predicted and the previous time step is sampled using the denoising module until t = 0, yielding the predicted reservoir parameters $Y_{predict}$;
(7) The goal of model training is to minimize the squared error between the mean of the predicted noise and the mean of the true noise; the loss function is therefore computed as:

$L = \big\| \mu_{t-1} - \mu_\theta(Y_t, t) \big\|^2$ (22)

where $\mu_{t-1}$ denotes the mean of the true noise at time step t−1, i.e. $\mu_{t-1} = \dfrac{1}{\sqrt{\alpha_t}}\left(Y_t - \dfrac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, z\right)$, with z the Gaussian noise actually added in the forward process.
After the model loss value has been computed with equation (22), the derivatives with respect to the weight coefficient matrices $w = [W_Q, W_K, W_V]$ and the bias vectors $b = [b_Q, b_K, b_V]$ are obtained via the chain rule, and w and b are updated using a stochastic gradient descent algorithm:

$w^{*} = w - \eta\, \dfrac{\partial L}{\partial w}$ (23)

$b^{*} = b - \eta\, \dfrac{\partial L}{\partial b}$ (24)

where $w^{*}$ and $b^{*}$ denote the updated weight coefficient matrices and bias vectors, and η denotes the learning rate. A compact training-loop sketch covering (1)-(7) is given below.
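In this sketch, `encoder`, `decoder_predict_noise` and the optimizer are placeholders for the modules described above, `q_sample` reuses the helper sketched earlier, and the loss is written as an MSE on the predicted noise, which matches equation (22) up to a time-step-dependent scaling factor.

```python
import torch

def train_step(encoder, decoder_predict_noise, x_train, y_embedding,
               optimizer, T=100):
    x_hidden = encoder(x_train)                              # step 4(2)
    t = torch.randint(1, T + 1, (1,)).item()                 # random diffusion time step
    y_t, true_noise = q_sample(y_embedding, t)               # step 4(3), eq. (15)
    z_pred = decoder_predict_noise(y_t, t, x_hidden)         # step 4(4), cross-attention decoder
    loss = torch.nn.functional.mse_loss(z_pred, true_noise)  # surrogate for eq. (22)
    optimizer.zero_grad()
    loss.backward()                                          # chain-rule derivatives
    optimizer.step()                                         # SGD update, eqs. (23)-(24)
    return loss.item()
```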
Step 5, predicting the reservoir parameters of any target area with the cross-attention diffusion model trained in step 4, comprising the following steps:
(1) Acquire the logging data $X_{test}$ of any target area;
(2) $X_{test}$ and randomly sampled Gaussian noise $Y_T \sim \mathcal{N}(0, \mathbf{I})$ are input together into the trained cross-attention diffusion model. First, the output vector $X_{hidden}$ of $X_{test}$ after passing through the encoder is computed according to (2) in step 4; next, following (4)-(6) in step 4, $Y_T$ and $X_{hidden}$ are input to the denoising module, and the predicted reservoir parameters $Y_{predict}$ are finally obtained through iterative denoising, completing the reservoir parameter prediction for the target area.
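An inference sketch for step 5, reusing the `p_sample` step sketched above; the label dimensionality and tensor shapes are assumptions.

```python
import torch

def predict_reservoir(encoder, decoder_predict_noise, x_test,
                      betas, alphas, alpha_bars, T=100, label_dim=3):
    x_hidden = encoder(x_test)                          # encoder pass over target-area logs
    y_t = torch.randn(x_test.shape[0], label_dim)       # Y_T ~ N(0, I)
    for t in range(T, 0, -1):                           # iterative denoising
        y_t = p_sample(y_t, t, x_hidden,
                       decoder_predict_noise, betas, alphas, alpha_bars)
    return y_t                                          # predicted POR / PERM / SO
```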
The innovation points of the invention are as follows:
(1) A diffusion model with a Transformer backbone network is constructed for reservoir parameter prediction; the correlations among the logging data features are mined in depth, and efficient, accurate prediction of reservoir parameters is achieved in an end-to-end manner;
(2) A cross attention mechanism is introduced into the diffusion model, interaction between logging data and reservoir parameters is better captured, and accuracy of predicting the reservoir parameters in the iterative denoising process of the model is improved;
(3) For any target area, the model does not need to be corrected with core data, so that low-cost, high-precision prediction is achieved. Meanwhile, the outlier detection strategy reduces the sensitivity of the model to abnormal data.
The beneficial effects are that:
compared with the prior art, the invention has the following beneficial effects:
the invention provides a reservoir prediction method based on a cross attention diffusion model, which takes a generated diffusion model as a reservoir parameter prediction basis, takes the existing logging and coring data of a work area as a data support, and combines a transducer characteristic extraction module and a cross attention module to realize reservoir parameter prediction of any target area with low cost and high precision. Compared with the conventional geological method which relies on mathematical empirical formulas with complicated modeling to calculate reservoir parameters, the method predicts the reservoir parameters in an end-to-end mode, namely, logging data of any target area is directly input into a trained model to predict the reservoir parameters, so that correlation among various characteristics of the logging data is better extracted, the problem that core data are acquired at high cost to correct the model to improve model accuracy is avoided, and meanwhile, the reservoir parameters can be predicted more quickly and accurately.
Drawings
FIG. 1 is a cross-attention diffusion model block diagram for implementing reservoir parameter prediction;
FIG. 2 is a schematic diagram of a noise adding module in a diffusion model;
FIG. 3 is a schematic diagram of a denoising module in a diffusion model;
FIG. 4 is a flow chart of a method of reservoir prediction based on a cross-attention diffusion model.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Examples:
a reservoir prediction method based on a cross attention diffusion model comprises the following steps:
step 1, acquiring the existing logging data and coring data of a working area:
the acquisition of the existing logging data of the working area comprises the following logging parameters: natural Gamma (GR), sonic time difference (AC), borehole diameter (CAL), compensated Neutron (CNL), compensated Density (DEN), natural potential (SP), resistivity log (RT), the 7 logging parameters were selected as the characteristic data; the existing coring data of the work area is used as label data, which comprises the following steps: porosity (POR), permeability (PERM), and oil Saturation (SO), such reservoir parameters would need to be predicted for the uncancelled work area.
Step 2, preprocessing the acquired logging data, and constructing a training and testing sample data set:
(1) As shown in fig. 4, preprocessing the acquired logging data comprises outlier removal and standardization, where outlier removal adopts the robust random cut forest algorithm (Robust Random Cut Forest, RRCF) and includes the following steps:
a) Given a logging data set $X = \{x_1, x_2, \ldots, x_n, \ldots, x_N\}$ containing N sample points, where each sample $x_i$ has dimension 7. First, the first n sample points $X' = \{x_1, x_2, \ldots, x_n\}$ are selected to initialize a robust random cut tree: compute the maximum $\max_d$ and minimum $\min_d$ of all samples in $X'$ in each feature dimension d; randomly select a feature dimension d according to the probability distribution $p_d = \dfrac{\max_d - \min_d}{\sum_{d'}(\max_{d'} - \min_{d'})}$; sample a cut value C on the selected feature dimension d from the uniform distribution on $[\min_d, \max_d]$; split $X'$ into the left subtree $X_{left} = \{x_i \mid x_i \in X', x_{id} \le C\}$ and the right subtree $X_{right} = X' \setminus X_{left}$ according to the selected feature dimension d; repeat the sampling of the cut value C and the splitting on $X_{left}$ and $X_{right}$ until every sample point in $X'$ has been assigned to a leaf node, each leaf node containing exactly one sample point; all the resulting $X_{left}$ and $X_{right}$ finally constitute the initialized robust random cut tree T;
b) Anomaly scores are computed by deleting and inserting sample points. Starting from the (n+1)-th sample point, the sample point $x_{i-n}$ is first deleted from T by replacing the parent of the leaf node to which the sample belongs with its sibling node, where i = n+1. The change in model complexity caused by deleting $x_{i-n}$, i.e. the change in the sum of the depths of all leaf nodes in T, measures the degree of abnormality of that point. The anomaly score of sample point $x_{i-n}$ is therefore computed as:

$s(x_{i-n}, X') = \sum_{y \in X'} f(y, X') - \sum_{y \in X' \setminus \{x_{i-n}\}} f\big(y, X' \setminus \{x_{i-n}\}\big)$

where the function $f(y, \cdot)$ denotes the depth of leaf node y in the tree generated from the given sample set. The closer the anomaly score $s(x_{i-n}, X')$ is to 0, the less likely $x_{i-n}$ is an outlier. After deleting $x_{i-n}$ from T, the sample point $x_i$ is inserted into the set $X' \setminus \{x_{i-n}\}$ from which $x_{i-n}$ was deleted, giving the new sample point set $X' \setminus \{x_{i-n}\} \cup \{x_i\}$. Repeating the robust random cut tree initialization procedure yields a robust random cut tree T'' containing the inserted sample point $x_i$. By deleting sample point $x_{i-n}$, now with i = n+2, the anomaly score of $x_{i-n}$ is computed and recorded;
c) The procedures of initializing the robust random cut tree and deleting and inserting sample points are executed iteratively until the anomaly scores $S = \{s(x_1), s(x_2), \ldots, s(x_N)\}$ of all sample points have been obtained. All anomaly scores are arranged in descending order, $S' = \mathrm{DescendingSort}(S)$; the sample points corresponding to the top 2% of $S'$ are marked as outliers and removed from X, giving the outlier-free logging data set $X_{new}$;
d) The outlier-free logging data $X_{new}$ are standardized as follows:

$X'_{new} = \dfrac{X_{new} - \mu}{\sigma}, \quad \mu = \dfrac{1}{N'}\sum_{i=1}^{N'} x_i, \quad \sigma = \sqrt{\dfrac{1}{N'}\sum_{i=1}^{N'} (x_i - \mu)^2}$

where $X'_{new}$ is the standardized sample value, μ and σ are the mean and standard deviation of $X_{new}$, and N' is the number of logging data samples after outlier removal;
(2) When constructing the training and test sample data sets, 80% of the preprocessed logging data is used as the training set $X_{train}$ and the remaining 20% as the test set $X_{test}$.
Step 3, constructing a cross attention diffusion model network structure:
As shown in fig. 1, the model network structure comprises a 1×1 convolution module, a noise-adding module, a denoising module, a Transformer feature extraction module and a cross-attention module.
1×1 convolution module: the number of convolution kernels is set to 8, and the module maps the input logging data features into a high-dimensional space;
Noise-adding module: in the forward diffusion process, Gaussian noise is gradually added to the original real reservoir parameters (label data) until they become pure Gaussian noise;
Denoising module: a learnable neural network; in the reverse diffusion process, the state of the current time step is denoised by predicting the noise of the current time step, yielding the state of the previous time step;
Transformer feature extraction module: the encoder consists of layer normalization, a multi-head attention layer, a feed-forward neural network, a neuron dropout layer (Dropout) and residual connections; the decoder adds a cross-attention layer on top of the encoder, with the remaining components identical to those of the encoder;
Cross-attention module: composed of a multi-head attention layer, layer normalization and a residual connection, where the key (K) and value (V) required for the multi-head attention computation come from the output of the last encoder layer, and the query (Q) comes from the output of the preceding multi-head attention layer of the decoder.
Step 4, inputting the training sample into a cross attention diffusion model for training:
(1) As shown in fig. 4, the preprocessed logging data serving as training samples, $X_{train} \in \mathbb{R}^{m \times 7 \times 1}$, are input to the 1×1 convolution module, where m denotes the number of logging data training samples and the number of channels is 1. After 8 1×1 convolutions, the embedded vector $X_{embedding} \in \mathbb{R}^{m \times 7 \times 8}$ is obtained, mapping each logging feature into a high-dimensional space;
(2) As shown in fig. 4, the embedded vector $X_{embedding}$ is input to the encoder of the Transformer feature extraction module, and layer normalization is applied over all neurons of the layer to obtain $X_{LN}$:

$X_{LN} = \gamma \odot \dfrac{X_{embedding} - \mathrm{E}[X_{embedding}]}{\sqrt{\mathrm{Var}[X_{embedding}] + \epsilon}} + \beta$ (29)

where E[·] denotes the mean, Var[·] denotes the variance, γ and β are the scaling and translation parameters respectively, and ε takes the value $10^{-8}$. $X_{LN}$ is then fed into the multi-head attention layer of the encoder, which first constructs the matrix vectors Q, K and V:

$Q = X_{LN} W_Q + b_Q$ (30)
$K = X_{LN} W_K + b_K$ (31)
$V = X_{LN} W_V + b_V$ (32)

where $W_Q$, $W_K$ and $W_V$ are three weight matrices and $b_Q$, $b_K$ and $b_V$ are three bias vectors. Next, the number of heads in the multi-head attention layer is set to 8, and the matrix vectors Q, K and V are split into 8 groups, i.e. $Q = [Q_1, Q_2, \ldots, Q_8]$, $K = [K_1, K_2, \ldots, K_8]$ and $V = [V_1, V_2, \ldots, V_8]$, forming 8 different heads $(Q_i, K_i, V_i)$, $1 \le i \le 8$. Each head computes attention in a different vector subspace; for a single head, the attention mechanism is calculated as:

$\mathrm{Attention}(Q_i, K_i, V_i) = \mathrm{softmax}\!\left(\dfrac{Q_i K_i^{T}}{\sqrt{d_k}}\right) V_i$ (33)

where $K_i^{T}$ denotes the transpose of $K_i$ and $d_k$ denotes the dimension of $K_i$. Finally, the attention computed by the different heads is fused:

$X_{Attention} = \mathrm{Concat}\big(\mathrm{Attention}(Q_1, K_1, V_1), \ldots, \mathrm{Attention}(Q_8, K_8, V_8)\big)\, W_O$ (34)

where Concat(·) denotes vector concatenation and $W_O$ is a weight matrix. After the fused feature $X_{Attention}$ is obtained, it is combined with the embedded vector $X_{embedding}$ through a residual connection:

$X'_{Attention} = X_{Attention} + X_{embedding}$ (35)

$X'_{Attention}$ is then input into a feed-forward neural network, which comprises two fully connected layers, a Dropout layer and a GELU activation function. The output of the feed-forward network, $X_{Feed}$, is computed as:

$X_{Feed} = \mathrm{Linear}(\mathrm{GELU}(\mathrm{Dropout}(\mathrm{Linear}(X'_{Attention}))))$ (36)

where Linear denotes a fully connected layer. $X_{Feed}$ is then residually connected with $X'_{Attention}$, finally giving the output vector $X_{hidden}$ of the embedded vector $X_{embedding}$ after passing through the encoder:

$X_{hidden} = X_{Feed} + X'_{Attention}$ (37)
(3) As shown in fig. 4, the real reservoir parameters (label data) $Y_{train}$ corresponding to the training samples are input to a 1×1 convolution module, similarly to (1) in step 4, yielding $Y_{embedding}$. As shown in fig. 2, $Y_{embedding}$ is then input to the noise-adding module, and the forward diffusion process gradually adds Gaussian noise to $Y_{embedding}$ at time steps $t \in \{1, \ldots, 100\}$, giving a set of hidden variables $Y_1, \ldots, Y_{100}$:

$q(Y_t \mid Y_{t-1}) = \mathcal{N}\!\big(Y_t;\ \sqrt{1-\beta_t}\, Y_{t-1},\ \beta_t \mathbf{I}\big)$ (38)

where $\beta_t$ denotes the diffusion rate; in the invention $\beta_t$ takes values in the range [0.2, 1]. $Y_t$ denotes the state of $Y_{embedding}$ at time step t. At any time step t, the relationship between $Y_t$ and the original state $Y_0$ ($Y_{embedding}$) can be expressed as:

$Y_t = \sqrt{\bar{\alpha}_t}\, Y_0 + \sqrt{1-\bar{\alpha}_t}\, z, \qquad z \sim \mathcal{N}(0, \mathbf{I})$ (39)

where $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.
(4) As shown in fig. 4, $Y_t$ and the encoder output vector $X_{hidden}$ are input to the decoder of the denoising module, i.e. the Transformer feature extraction module, to predict the noise of the current time step. First, a multi-head attention computation is performed on $Y_t$ in the same manner as in (2) of step 4, and $Y_t^{Attention}$ at time step t is obtained analogously to formula (35). Next, $Y_t^{Attention}$ and $X_{hidden}$ are input to the cross-attention module, whose matrix vectors Q, K and V are computed as:

$Q = Y_t^{Attention} W_Q + b_Q$ (40)
$K = X_{hidden} W_K + b_K$ (41)
$V = X_{hidden} W_V + b_V$ (42)

When computing the cross-attention, apart from the different construction of the matrix vectors Q, K and V compared with multi-head self-attention, $Y_t^{hidden}$ is obtained in the same manner as in (2) of step 4, analogously to formula (37). This is the noise predicted by the denoising module at time step t, $z_\theta(Y_t^{Attention}, t, X_{hidden}) = Y_t^{hidden}$, where $z_\theta$ denotes the output function with which the denoising module predicts the noise from $Y_t^{hidden}$ during the reverse diffusion process, and θ denotes the parameters of this function;
(5) As shown in fig. 3, based on the noise $z_\theta(Y_t^{Attention}, t, X_{hidden})$ predicted at time step t in (4) of step 4, the state at time step t is denoised to obtain the state $Y_{t-1}$ at time step t−1. The mean and variance of the predicted noise are computed as:

$\mu_\theta(Y_t, t) = \dfrac{1}{\sqrt{\alpha_t}}\left(Y_t - \dfrac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, z_\theta(Y_t^{Attention}, t, X_{hidden})\right)$ (43)

$\sigma_t^2 = \dfrac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\, \beta_t$ (44)

The state at time step t−1 is then sampled as:

$Y_{t-1} = \mu_\theta(Y_t, t) + \sigma_t\, z, \qquad z \sim \mathcal{N}(0, \mathbf{I})$ (45)
(6) As shown in fig. 3, starting from time step t = 100, the noise is iteratively predicted and the previous time step is sampled using the denoising module until t = 0, yielding the predicted reservoir parameters $Y_{predict}$;
(7) The goal of model training is to minimize the squared error between the mean of the predicted noise and the mean of the true noise; the loss function is therefore computed as:

$L = \big\| \mu_{t-1} - \mu_\theta(Y_t, t) \big\|^2$ (46)

where $\mu_{t-1}$ denotes the mean of the true noise at time step t−1, i.e. $\mu_{t-1} = \dfrac{1}{\sqrt{\alpha_t}}\left(Y_t - \dfrac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, z\right)$, with z the Gaussian noise actually added in the forward process.
After the model loss value has been computed with equation (46), the derivatives with respect to the weight coefficient matrices $w = [W_Q, W_K, W_V]$ and the bias vectors $b = [b_Q, b_K, b_V]$ are obtained via the chain rule, and w and b are updated using a stochastic gradient descent algorithm:

$w^{*} = w - \eta\, \dfrac{\partial L}{\partial w}$ (47)

$b^{*} = b - \eta\, \dfrac{\partial L}{\partial b}$ (48)

where $w^{*}$ and $b^{*}$ denote the updated weight coefficient matrices and bias vectors, and η denotes the learning rate, set to 0.0001 in the invention.
Step 5, predicting the reservoir parameters of any target area with the cross-attention diffusion model trained in step 4, comprising the following steps:
(1) Acquire the logging data $X_{test}$ of any target area;
(2) $X_{test}$ and randomly sampled Gaussian noise $Y_T \sim \mathcal{N}(0, \mathbf{I})$ are input together into the trained cross-attention diffusion model. First, the output vector $X_{hidden}$ of $X_{test}$ after passing through the encoder is computed according to (2) in step 4; next, following (4)-(6) in step 4, $Y_T$ and $X_{hidden}$ are input to the denoising module, and the predicted reservoir parameters $Y_{predict}$ are finally obtained through iterative denoising, completing the reservoir parameter prediction for the target area.
The present invention is not limited to the above-described embodiments. Any simple modifications, equivalent changes or variations made to the above embodiments according to the technical substance of the present invention by any person skilled in the art, without departing from the scope of the technical solution of the present invention, still fall within the scope of the technical solution of the present invention.

Claims (1)

1. A reservoir prediction method based on a cross-attention diffusion model, comprising the steps of:
s1: acquiring the existing logging data and coring data of a work area;
s2: preprocessing the acquired logging data, and constructing a training and testing sample data set;
s3: constructing a cross attention diffusion model network structure;
s4: inputting the training sample into a cross attention diffusion model for training;
s5: predicting reservoir parameters of any target area according to the cross attention diffusion model trained in the step S4;
The acquisition of the existing logging data of the work area in step S1 comprises the following logging parameters: natural gamma (GR), acoustic travel time (AC), borehole diameter (CAL), compensated neutron (CNL), compensated density (DEN), spontaneous potential (SP) and resistivity log (RT); at least three of these logging parameters should be selected as feature data; the existing coring data of the work area are used as label data and comprise porosity (POR), permeability (PERM) and oil saturation (SO); these reservoir parameters are the quantities to be predicted for uncored work areas.
Preprocessing the acquired logging data in step S2 comprises outlier removal and standardization, where outlier removal adopts the robust random cut forest algorithm (Robust Random Cut Forest, RRCF) and includes the following steps:
s21: given a log data set x= { X containing N sample points 1 ,x 2 ,…,x n ,…,x N Each sample x i Is M. First, the first n sample points X' = { X are selected 1 ,x 2 ,…,x n Initializing a robust random partition tree: calculating the maximum value of all samples in X' in each characteristic dimension dAnd minimum->According to probability distributionRandomly selecting a characteristic dimension d, wherein +.>Sampling a segmentation value C on the selected feature dimension d, the probability distribution of which is +.>Dividing X' into left subtree X according to the selected feature dimension d left ={x i |x i ∈X′,x id C and right subtree X right =X′\X left The method comprises the steps of carrying out a first treatment on the surface of the At X left And X right The process of sampling the partition value C and partitioning the subtrees is repeatedly performed until all the sample points in X 'are partitioned onto leaf nodes, each leaf node contains only one sample point, and finally all X's are obtained left And X right Constructing an initialized robust random partition tree T;
s22: second, anomaly scores are calculated by deleting and inserting sample points. Starting from the (n+1) th sample point, the (x) th sample point in T is deleted first i-n The parent of the leaf node to which each sample belongs is replaced with its sibling node, where i=n+1. When deleting sample point x i-n And when the model complexity change amount, namely the change value of the sum of the depths of all leaf nodes in the T, is the abnormal degree of the point. Thus, sample point x i-n Is calculated as:
wherein the function f represents the depth of any leaf node y in T generated by the sample set X'. Abnormality score s (x i-n The closer X') is to 0, X i-n The less likely an outlier sample point is. Deleting sample point x from T i-n Then, the sample point x is again i And to delete x i-n Set X' \ { X } i-n Obtaining a new sample point set X' \ { X } i-n }∪{x i }. Repeatedly executing the initialization process of the robust random partition tree to obtain the interpolationInto sample point x i Is a robust random partitioning tree T ". By deleting sample point x i-n Where i=n+2, x is calculated and recorded i-n Is an anomaly score for (2);
s23: finally, the process of initializing the robust random partition tree, deleting and inserting the sample points is iteratively performed until abnormal scores S= { S (X) 1 ),s(x 2 ),…,s(x N ) Arranging all abnormal scores in Descending order S '=Descending Sort (S), setting the first 2% of sample points in S' as abnormal points, and eliminating the corresponding sample points from X to obtain a logging data set X with abnormal values removed new
Carrying out standardization processing on logging data with abnormal values removed, wherein the standardization method comprises the following steps:
wherein X 'is' new Mu and sigma are X respectively as normalized sample values new N' is the number of samples of the log data after outlier removal.
S24: when the training and testing sample data set is constructed in the step S2, 80% of the well logging data after pretreatment is used as a training set X train The remaining 20% of the log data is taken as test set X test
S3, constructing a cross attention diffusion model network structure in the step S3, wherein the cross attention diffusion model network structure comprises the following modules:
1 x 1 convolution module: the number of convolution kernels is set to c, and the convolution kernels are used for mapping the input logging data characteristics into a high-dimensional space;
Noise-adding module: in the forward diffusion process, Gaussian noise is gradually added to the original real reservoir parameters (label data) until they become pure Gaussian noise;
Denoising module: a learnable neural network; in the reverse diffusion process, the state of the current time step is denoised by predicting the noise of the current time step, yielding the state of the previous time step;
Transformer feature extraction module: the encoder consists of layer normalization, a multi-head attention layer, a feed-forward neural network, a neuron dropout layer (Dropout) and residual connections; the decoder adds a cross-attention layer on top of the encoder, with the remaining components identical to those of the encoder;
Cross-attention module: composed of a multi-head attention layer, layer normalization and a residual connection, where the key (K) and value (V) required for the multi-head attention computation come from the output of the last encoder layer, and the query (Q) comes from the output of the preceding multi-head attention layer of the decoder.
S4, inputting the training sample into a cross attention diffusion model for training in the step S4, wherein the method comprises the following steps of:
s41: inputting logging data as training samples after pretreatmentTo a 1X 1 convolution module, wherein m represents the number of training samples of the logging data, d represents the characteristic dimension of the logging data, the number of channels is 1, and the embedded vector +_ of the logging data is obtained after c 1X 1 convolutions>Mapping each feature of the logging data into a high-dimensional space;
s42: will embed vector X embedding Input to encoder of transducer feature extraction module, for X embedding All neurons of the middle layer undergo layer homingIs converted into X LN
Wherein E [. Cndot.]Mean value Var [. Cndot.]The variance is represented, gamma and beta are scaling and translation parameters respectively, epsilon is a minimum value, and the situation that denominator is 0 is prevented; x is to be LN The multi-headed attention layer of the input encoder first builds matrix vectors Q, K and V:
Q=X LN W Q +b Q (6)
K=X LN W K +b K (7)
V=X LN W V +b V (8)
wherein W is Q 、W K And W is V Is three weight matrices, b Q 、b K And b V Is three bias vectors. Second, the number of heads in the multi-head attention layer is set to h, dividing the matrix vectors Q, K and V into h groups, i.e., q= [ Q 1 ,Q 2 ,…,Q h ]、K=[K 1 ,K 2 ,…,K h ]Sum v= [ V 1 ,V 2 ,…,V h ]Make up of h different heads (Q i ,K i ,V i ) I is more than or equal to 1 and less than or equal to h. Each head calculates the attention in a different vector space, respectively, while for a single head its attention mechanism is calculated as:
wherein the method comprises the steps ofRepresent K i Transpose of d k Represent K i Is a dimension of (c). Finally, carrying out feature fusion on the attention calculated by different heads:
wherein the method comprises the steps ofRepresenting vector concatenation, W O Is a weight matrix. Obtaining a fused feature X Attention After that, it is combined with the embedded vector X embedding Residual connection is carried out:
X′ Attention =X Attention +X embedding (11)
and then X' Attention Is input into a feed-forward neural network, which comprises two fully connected layers, a Dropout layer and a GELU activation function. Thus, the output X through the feedforward neural network Feed The calculation is as follows:
X Feed =Linear(GELU(Dropout(Linear(X′ Attention )))) (12)
where Linear represents a full join layer operation. Based on this, X is Feed With X' Attention Residual connection is carried out to finally obtain an embedded vector X embedding Output vector X after passing through encoder hidden
X hidden =X Feed +X′ Attention (13)
S43: true reservoir parameters (tag data) Y to be associated with training samples train Input to a 1×1 convolution block, similar to step S41, to obtain Y embedding . And then Y is added embedding Input to a noise adding module, and the diffusion process gradually changes Y at time steps T E {1, …, T } embedding Adding Gaussian noiseObtaining a group of hidden variables Y 1 ,…,Y T
Wherein beta is t Represents diffusivity, Y t Represents Y embedding State at time step t. At any time step t, Y t And the original state Y 0 (Y embedding ) The relationship of (2) can be expressed as:
wherein alpha is t =1-β t
S44: y is set to t Output vector X of encoder hidden And inputting the noise into a decoder of a denoising module, namely a transducer feature extraction module, and predicting the noise of the current time step. First to Y t A multi-head attention calculation is performed in the same manner as in step S42, and Y for time step t is similarly obtained according to formula (11) t Attention . Next, Y is t Attention And X is hidden Input into the cross attention module, the matrix vectors Q, K and V for cross attention are calculated as:
Q=Y t Attention W Q +b Q (16)
K=X hidden W K +b K (17)
V=X hidden W V +b V (18)
when calculating the cross-attention, Y is similarly obtained according to the formula (13) in the same manner as in step S42 except that the matrix vectors Q, K and V are calculated in a different manner from the multi-head self-attention t hidden I.e. the noise z predicted by the denoising module at time step t θ (Y t Attention ,t,X hidden )=Y t hidden Wherein z is θ Representing the denoising module from Y during back diffusion t hidden Output function of prediction noise, θ represents functionParameters of (2);
s45: noise z predicted at time step t based on step S44 θ (Y t Attention ,t,X hidden ) Denoising the state of time step t to obtain the state of time step t-1Further, the mean and variance of the predicted noise are calculated as:
thus, the state of time step t-1Sampling is as follows:
s46: iteratively predicting noise from time step t=t and completing sampling of the previous time step using a denoising module until t=0, resulting in a predicted reservoir parameter
S47: the training cross attention diffusion model is characterized in that the model training aims at minimizing the square error between the average value of the prediction noise and the average value of the real noise, and then the loss function is calculated as follows:
wherein mu t-1 Representing real noise at time step t-1Mean of (a), i.e
The above calculation model loss function is characterized in that the model loss value is calculated by the equation (22), and then the weight coefficient matrix w= [ W ] is calculated by the chain law Q ,W K ,W V ]And offset vector b= [ b ] Q ,b K ,b V ]Conducting derivative and updating w and b using a random gradient descent algorithm:
wherein w is * 、b * And representing the optimized weight coefficient matrix and the bias vector, wherein eta represents the learning rate.
S5, predicting reservoir parameters of any target area in the step S5, wherein the method comprises the following steps:
s51: acquiring logging data X of any target area test
S52: x is to be test Random sampling Gaussian noiseSimultaneously inputting into a cross attention diffusion model after training, firstly calculating X according to step S42 test Output vector X after passing through encoder hidden The method comprises the steps of carrying out a first treatment on the surface of the Next, Y is selected according to steps S44 to S46 T And X is hidden Input to a denoising module, and finally obtain predicted reservoir parameters Y through iterative denoising predict And finishing reservoir parameter prediction of the target area.
CN202311229800.6A 2023-09-22 2023-09-22 Reservoir prediction method based on cross attention diffusion model Pending CN117236390A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311229800.6A CN117236390A (en) 2023-09-22 2023-09-22 Reservoir prediction method based on cross attention diffusion model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311229800.6A CN117236390A (en) 2023-09-22 2023-09-22 Reservoir prediction method based on cross attention diffusion model

Publications (1)

Publication Number Publication Date
CN117236390A true CN117236390A (en) 2023-12-15

Family

ID=89085752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311229800.6A Pending CN117236390A (en) 2023-09-22 2023-09-22 Reservoir prediction method based on cross attention diffusion model

Country Status (1)

Country Link
CN (1) CN117236390A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117726990A (en) * 2023-12-27 2024-03-19 浙江恒逸石化有限公司 Method and device for detecting spinning workshop, electronic equipment and storage medium
CN117726990B (en) * 2023-12-27 2024-05-03 浙江恒逸石化有限公司 Method and device for detecting spinning workshop, electronic equipment and storage medium
CN117649529A (en) * 2024-01-30 2024-03-05 中国科学技术大学 Logging data interpretation method based on multidimensional signal analysis neural network
CN117893838A (en) * 2024-03-14 2024-04-16 厦门大学 Target detection method using diffusion detection model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination