CN117236390A - Reservoir prediction method based on cross attention diffusion model

Reservoir prediction method based on cross attention diffusion model

Info

Publication number
CN117236390A
CN117236390A (application CN202311229800.6A)
Authority
CN
China
Prior art keywords
attention
data
noise
sample
hidden
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311229800.6A
Other languages
Chinese (zh)
Inventor
武娟
罗仁泽
罗磊
雷璨如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Petroleum University
Original Assignee
Southwest Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Petroleum University filed Critical Southwest Petroleum University
Priority to CN202311229800.6A priority Critical patent/CN117236390A/en
Publication of CN117236390A publication Critical patent/CN117236390A/en
Pending legal-status Critical Current


Abstract

The invention provides a reservoir prediction method based on a cross-attention diffusion model. The method takes a generative diffusion model as the basis of reservoir parameter prediction, uses the existing logging and coring data of a work area as data support, and combines a Transformer feature extraction module with a cross-attention module to better capture the interactive association between logging data and reservoir parameters, improving the accuracy with which the model predicts reservoir parameters during the iterative denoising process. Compared with conventional geological methods that rely on mathematical empirical formulas and cumbersome modeling to calculate reservoir parameters, the method predicts reservoir parameters in an end-to-end manner: logging data of any target area are fed directly into the trained model to predict the reservoir parameters. The correlations among the various logging data features are better extracted, the cost of acquiring core data to correct the model is avoided, and low-cost, high-precision reservoir parameter prediction for any target area is realized.

Description

Reservoir prediction method based on cross attention diffusion model
Technical Field
The invention relates to the technical field of geological reservoir prediction and deep learning, in particular to a reservoir prediction method based on a cross attention diffusion model.
Background
Porosity, permeability and saturation are key parameters describing reservoir physical properties, and they directly affect the storage and production capacity of oil and gas in the reservoir. Predicting reservoir parameters with high accuracy is therefore important: it is the basic task underlying the construction of high-accuracy geologic models, the estimation of reasonable oil and gas reserves, and the determination of efficient development schemes.
At present, traditional reservoir prediction methods mainly comprise semi-quantitative analysis based on curve cross-plots and simple mathematical analysis methods for fitting the parameters to be predicted (mainly factor analysis for optimizing discrimination parameters, Bayesian discriminant analysis, cluster analysis, and the like). However, these conventional methods are mostly based on linear prediction, have low efficiency and high subjectivity, and easily degrade the accuracy of the predicted reservoir parameters. Meanwhile, in actual well logging, because reservoirs are complex and highly heterogeneous, the logging data are affected by noise and abnormal values, which further degrades the prediction of reservoir parameters. Benefiting from the development of deep learning, a data-driven way of thinking provides petroleum geology researchers with a brand-new analysis perspective that can reduce analysis cost and improve the accuracy of reservoir prediction.
The diffusion model is a type of generative model: in the forward diffusion process, noise is gradually added to the input data until the data becomes pure Gaussian noise; in the reverse diffusion process, denoising is performed step by step through sampling, and the model learns how to recover the real input data from Gaussian noise. However, the original diffusion model is aimed at image generation tasks and cannot be applied directly to reservoir parameter prediction. The Transformer network has strong feature extraction capability and can mine the correlations among logging data features well. Meanwhile, the cross-attention mechanism helps capture the interaction between logging data and reservoir parameters by computing the cross-attention between the source data and the target data. Combining the advantages of these networks, a reservoir parameter prediction method driven by logging data is constructed.
Aiming at complex logging response characteristics, the invention proposes a reservoir prediction method based on a cross-attention diffusion model, built on logging data according to the principle of the diffusion model. The method first performs outlier removal and standardization on the logging data, preventing the model from learning wrong information during training. Second, a 1×1 convolution module maps the logging data training samples and their corresponding real reservoir parameters into a high-dimensional vector space, so that different features are better distinguished. The features of the training samples and of the noised real reservoir parameters in the high-dimensional vector space are then extracted by the Transformer backbone network of the diffusion model, and the cross-attention between them is computed to predict the noise of the current time step. Finally, the denoising module iteratively denoises to obtain the final predicted reservoir parameters.
Disclosure of Invention
The invention mainly overcomes the defects of the existing reservoir parameter prediction method, and aims to provide a reservoir prediction method based on a cross attention diffusion model.
In order to achieve the technical purpose, the invention provides the following technical scheme:
a reservoir prediction method based on a cross-attention diffusion model, comprising the steps of:
step 1, acquiring the existing logging data and coring data of a working area:
the acquisition of the existing logging data of the working area comprises the following logging parameters: natural Gamma (GR), sonic time difference (AC), borehole diameter (CAL), compensated Neutron (CNL), compensated Density (DEN), natural potential (SP), resistivity log (RT), at least three or more logging parameters should be selected as the characteristic data; the existing coring data of the work area is used as label data, which comprises the following steps: porosity (POR), permeability (PERM), and oil Saturation (SO), such reservoir parameters would need to be predicted for the uncancelled work area.
Step 2, preprocessing the acquired logging data, and constructing a training and testing sample data set:
(1) Preprocessing the acquired logging data comprises outlier removal and standardization, where outlier removal adopts the robust random cut forest algorithm (Robust Random Cut Forest, RRCF) and includes the following steps:
a) Given a logging data set $X = \{x_1, x_2, \ldots, x_n, \ldots, x_N\}$ containing N sample points, where each sample $x_i$ has dimension M. First, the first n sample points $X' = \{x_1, x_2, \ldots, x_n\}$ are selected to initialize a robust random cut tree: compute the maximum $\max_d$ and minimum $\min_d$ of all samples in $X'$ in each feature dimension d; randomly select a feature dimension d according to the probability distribution $p_d = \dfrac{\max_d - \min_d}{\sum_{d'}(\max_{d'} - \min_{d'})}$; sample a cut value C on the selected feature dimension d from the uniform distribution on $[\min_d, \max_d]$; split $X'$ into the left subtree $X_{left} = \{x_i \mid x_i \in X', x_{id} \le C\}$ and the right subtree $X_{right} = X' \setminus X_{left}$ according to the selected feature dimension d; repeat the sampling of the cut value C and the splitting on $X_{left}$ and $X_{right}$ until every sample point in $X'$ has been assigned to a leaf node, each leaf node containing exactly one sample point; all the resulting $X_{left}$ and $X_{right}$ finally constitute the initialized robust random cut tree T;
b) Anomaly scores are computed by deleting and inserting sample points. Starting from the (n+1)-th sample point, the sample point $x_{i-n}$ is first deleted from T by replacing the parent of the leaf node to which the sample belongs with its sibling node, where i = n+1. The change in model complexity caused by deleting $x_{i-n}$, i.e. the change in the sum of the depths of all leaf nodes in T, measures the degree of abnormality of that point. The anomaly score of sample point $x_{i-n}$ is therefore computed as:

$s(x_{i-n}, X') = \sum_{y \in X'} f(y, X') - \sum_{y \in X' \setminus \{x_{i-n}\}} f\big(y, X' \setminus \{x_{i-n}\}\big)$

where the function $f(y, \cdot)$ denotes the depth of leaf node y in the tree generated from the given sample set. The closer the anomaly score $s(x_{i-n}, X')$ is to 0, the less likely $x_{i-n}$ is an outlier. After deleting $x_{i-n}$ from T, the sample point $x_i$ is inserted into the set $X' \setminus \{x_{i-n}\}$ from which $x_{i-n}$ was deleted, giving the new sample point set $X' \setminus \{x_{i-n}\} \cup \{x_i\}$. Repeating the robust random cut tree initialization procedure yields a robust random cut tree T'' containing the inserted sample point $x_i$. By deleting sample point $x_{i-n}$, now with i = n+2, the anomaly score of $x_{i-n}$ is computed and recorded;
c) The procedures of initializing the robust random cut tree and deleting and inserting sample points are executed iteratively until the anomaly scores $S = \{s(x_1), s(x_2), \ldots, s(x_N)\}$ of all sample points have been obtained. All anomaly scores are arranged in descending order, $S' = \mathrm{DescendingSort}(S)$; the sample points corresponding to the top 2% of $S'$ are marked as outliers and removed from X, giving the outlier-free logging data set $X_{new}$;
d) The outlier-free logging data $X_{new}$ are standardized as follows (a preprocessing sketch is given below):

$X'_{new} = \dfrac{X_{new} - \mu}{\sigma}, \quad \mu = \dfrac{1}{N'}\sum_{i=1}^{N'} x_i, \quad \sigma = \sqrt{\dfrac{1}{N'}\sum_{i=1}^{N'} (x_i - \mu)^2}$

where $X'_{new}$ is the standardized sample value, μ and σ are the mean and standard deviation of $X_{new}$, and N' is the number of logging data samples after outlier removal;
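A minimal preprocessing sketch for step 2(1), assuming the open-source `rrcf` package and using its collusive-displacement score `codisp` as a stand-in for the depth-sum-change anomaly score defined above; the sliding-window size and the detail of scoring the inserted rather than the deleted point are illustrative simplifications.

```python
# Sliding-window RRCF outlier scoring followed by z-score standardization.
import numpy as np
import rrcf

def remove_outliers_and_standardize(X, window=256, drop_frac=0.02):
    """X: (N, M) array of logging curves (e.g. GR, AC, CAL, CNL, DEN, SP, RT)."""
    n_samples = X.shape[0]
    tree = rrcf.RCTree()
    scores = np.zeros(n_samples)
    for i, point in enumerate(X):
        if i >= window:                    # keep a fixed-size window of sample points
            tree.forget_point(i - window)  # delete the oldest point from the tree
        tree.insert_point(point, index=i)  # insert the newest point
        scores[i] = tree.codisp(i)         # anomaly score of the inserted point
    n_drop = int(np.ceil(drop_frac * n_samples))      # top 2% marked as outliers
    keep = np.sort(np.argsort(scores)[: n_samples - n_drop])
    X_new = X[keep]
    mu, sigma = X_new.mean(axis=0), X_new.std(axis=0)  # z-score standardization
    return (X_new - mu) / sigma
```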
(2) When constructing the training and test sample data sets, 80% of the preprocessed logging data is used as the training set $X_{train}$ and the remaining 20% as the test set $X_{test}$.
Step 3, constructing a cross attention diffusion model network structure:
the model network structure comprises a 1 multiplied by 1 convolution module, a noise adding module, a noise removing module, a transducer characteristic extraction module and a cross attention module.
1 x 1 convolution module: the number of convolution kernels is set to c, and the convolution kernels are used for mapping the input logging data characteristics into a high-dimensional space;
and (3) a noise adding module: in the diffusion process, gaussian noise is gradually added to the original real reservoir parameters (tag data)Until it becomes completely pure gaussian noise;
and a denoising module: the method comprises a learnable neural network model, wherein in the back diffusion process, the state of the current time step is denoised by predicting the noise of the current time step to obtain the state of the previous time step;
the transducer feature extraction module: the encoder is composed of a layer normalization, a multi-head attention layer, a feedforward neural network, a neuron discarding layer (Dropout) and residual connection, wherein the decoder is added with a cross attention layer on the basis of the encoder, and the rest components are the same as the encoder;
cross-attention module: the multi-head attention layer, layer normalization and residual connection, wherein the key (K) and value (V) required for multi-head attention calculation are derived from the output of the last layer of the encoder, and the query (Q) is derived from the output of the previous multi-head attention of the decoder.
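As an illustration of the cross-attention module, the following PyTorch sketch wires the query to the decoder state and the key/value to the encoder output; the layer sizes are assumptions, not values fixed by the patent text.

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Multi-head attention + layer normalization + residual connection."""
    def __init__(self, d_model=64, n_heads=8, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, decoder_state, encoder_out):
        # Q comes from the decoder's preceding multi-head attention output;
        # K and V come from the output of the last encoder layer.
        attn_out, _ = self.attn(query=decoder_state,
                                key=encoder_out,
                                value=encoder_out)
        return self.norm(decoder_state + attn_out)   # residual + layer norm
```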
Step 4, inputting the training sample into a cross attention diffusion model for training:
(1) The preprocessed logging data serving as training samples, $X_{train} \in \mathbb{R}^{m \times d \times 1}$, are input to the 1×1 convolution module, where m denotes the number of logging data training samples, d the feature dimension of the logging data, and the number of channels is 1. After c 1×1 convolutions, the embedded vector of the logging data, $X_{embedding} \in \mathbb{R}^{m \times d \times c}$, is obtained, mapping each logging feature into a high-dimensional space;
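One way to realize the 1×1 convolution embedding is a Conv1d with kernel_size=1; the channel count c = 8 mirrors the embodiment and is otherwise an assumption.

```python
import torch
import torch.nn as nn

m, d, c = 32, 7, 8                      # samples, logging features, embedding channels
x_train = torch.randn(m, 1, d)          # (batch, channels=1, feature dimension)
embed = nn.Conv1d(in_channels=1, out_channels=c, kernel_size=1)
x_embedding = embed(x_train)            # (m, c, d): every feature lifted to c channels
```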
(2) The embedded vector $X_{embedding}$ is input to the encoder of the Transformer feature extraction module, and layer normalization is applied over all neurons of the layer to obtain $X_{LN}$:

$X_{LN} = \gamma \odot \dfrac{X_{embedding} - \mathrm{E}[X_{embedding}]}{\sqrt{\mathrm{Var}[X_{embedding}] + \epsilon}} + \beta$ (5)

where E[·] denotes the mean, Var[·] denotes the variance, γ and β are the scaling and translation parameters respectively, and ε is a very small value that prevents the denominator from becoming 0. $X_{LN}$ is then fed into the multi-head attention layer of the encoder, which first constructs the matrix vectors Q, K and V:

$Q = X_{LN} W_Q + b_Q$ (6)
$K = X_{LN} W_K + b_K$ (7)
$V = X_{LN} W_V + b_V$ (8)

where $W_Q$, $W_K$ and $W_V$ are three weight matrices and $b_Q$, $b_K$ and $b_V$ are three bias vectors. Next, the number of heads in the multi-head attention layer is set to h, and the matrix vectors Q, K and V are split into h groups, i.e. $Q = [Q_1, Q_2, \ldots, Q_h]$, $K = [K_1, K_2, \ldots, K_h]$ and $V = [V_1, V_2, \ldots, V_h]$, forming h different heads $(Q_i, K_i, V_i)$, $1 \le i \le h$. Each head computes attention in a different vector subspace; for a single head, the attention mechanism is calculated as:

$\mathrm{Attention}(Q_i, K_i, V_i) = \mathrm{softmax}\!\left(\dfrac{Q_i K_i^{T}}{\sqrt{d_k}}\right) V_i$ (9)

where $K_i^{T}$ denotes the transpose of $K_i$ and $d_k$ denotes the dimension of $K_i$. Finally, the attention computed by the different heads is fused:

$X_{Attention} = \mathrm{Concat}\big(\mathrm{Attention}(Q_1, K_1, V_1), \ldots, \mathrm{Attention}(Q_h, K_h, V_h)\big)\, W_O$ (10)

where Concat(·) denotes vector concatenation and $W_O$ is a weight matrix. After the fused feature $X_{Attention}$ is obtained, it is combined with the embedded vector $X_{embedding}$ through a residual connection:

$X'_{Attention} = X_{Attention} + X_{embedding}$ (11)

$X'_{Attention}$ is then input into a feed-forward neural network, which comprises two fully connected layers, a Dropout layer and a GELU activation function. The output of the feed-forward network, $X_{Feed}$, is computed as:

$X_{Feed} = \mathrm{Linear}(\mathrm{GELU}(\mathrm{Dropout}(\mathrm{Linear}(X'_{Attention}))))$ (12)

where Linear denotes a fully connected layer. $X_{Feed}$ is then residually connected with $X'_{Attention}$, finally giving the output vector $X_{hidden}$ of the embedded vector $X_{embedding}$ after passing through the encoder (a minimal encoder-block sketch follows):

$X_{hidden} = X_{Feed} + X'_{Attention}$ (13)
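A minimal PyTorch sketch of this encoder block, with illustrative dimensions, is:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, d_model=64, n_heads=8, d_ff=256, dropout=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)                    # eq. (5)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(                            # eq. (12): Linear -> Dropout -> GELU -> Linear
            nn.Linear(d_model, d_ff),
            nn.Dropout(dropout),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x_embedding):
        x_ln = self.norm(x_embedding)                        # eq. (5)
        x_attn, _ = self.attn(x_ln, x_ln, x_ln)              # eqs. (6)-(10)
        x_attn = x_attn + x_embedding                        # eq. (11)
        return self.ffn(x_attn) + x_attn                     # eqs. (12)-(13): X_hidden
```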
(3) The real reservoir parameters (label data) $Y_{train}$ corresponding to the training samples are input to a 1×1 convolution module, similarly to (1) in step 4, yielding $Y_{embedding}$. $Y_{embedding}$ is then input to the noise-adding module, and the forward diffusion process gradually adds Gaussian noise to $Y_{embedding}$ at time steps $t \in \{1, \ldots, T\}$, giving a set of hidden variables $Y_1, \ldots, Y_T$:

$q(Y_t \mid Y_{t-1}) = \mathcal{N}\!\big(Y_t;\ \sqrt{1-\beta_t}\, Y_{t-1},\ \beta_t \mathbf{I}\big)$ (14)

where $\beta_t$ denotes the diffusion rate and $Y_t$ denotes the state of $Y_{embedding}$ at time step t. At any time step t, the relationship between $Y_t$ and the original state $Y_0$ ($Y_{embedding}$) can be expressed as:

$Y_t = \sqrt{\bar{\alpha}_t}\, Y_0 + \sqrt{1-\bar{\alpha}_t}\, z, \qquad z \sim \mathcal{N}(0, \mathbf{I})$ (15)

where $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.
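A closed-form noising sketch for equations (14)-(15); the linear β schedule is an illustrative assumption (the embodiment only states the range in which β_t is taken).

```python
import torch

T = 100
betas = torch.linspace(1e-4, 2e-2, T)         # illustrative linear schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)     # cumulative product \bar{alpha}_t

def q_sample(y0, t):
    """Sample Y_t directly from Y_0 for an integer time step t in [1, T] (eq. (15))."""
    noise = torch.randn_like(y0)
    a_bar = alpha_bars[t - 1]
    return torch.sqrt(a_bar) * y0 + torch.sqrt(1.0 - a_bar) * noise, noise
```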
(4) $Y_t$ and the encoder output vector $X_{hidden}$ are input to the decoder of the denoising module, i.e. the Transformer feature extraction module, to predict the noise of the current time step. First, a multi-head attention computation is performed on $Y_t$ in the same manner as in (2) of step 4, and $Y_t^{Attention}$ at time step t is obtained analogously to formula (11). Next, $Y_t^{Attention}$ and $X_{hidden}$ are input to the cross-attention module, whose matrix vectors Q, K and V are computed as:

$Q = Y_t^{Attention} W_Q + b_Q$ (16)
$K = X_{hidden} W_K + b_K$ (17)
$V = X_{hidden} W_V + b_V$ (18)

When computing the cross-attention, apart from the different construction of the matrix vectors Q, K and V compared with multi-head self-attention, $Y_t^{hidden}$ is obtained in the same manner as in (2) of step 4, analogously to formula (13). This is the noise predicted by the denoising module at time step t, $z_\theta(Y_t^{Attention}, t, X_{hidden}) = Y_t^{hidden}$, where $z_\theta$ denotes the output function with which the denoising module predicts the noise from $Y_t^{hidden}$ during the reverse diffusion process, and θ denotes the parameters of this function;
(5) Based on the noise $z_\theta(Y_t^{Attention}, t, X_{hidden})$ predicted at time step t in (4) of step 4, the state at time step t is denoised to obtain the state $Y_{t-1}$ at time step t−1. The mean and variance of the predicted noise are computed as:

$\mu_\theta(Y_t, t) = \dfrac{1}{\sqrt{\alpha_t}}\left(Y_t - \dfrac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, z_\theta(Y_t^{Attention}, t, X_{hidden})\right)$ (19)

$\sigma_t^2 = \dfrac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\, \beta_t$ (20)

The state at time step t−1 is then sampled as (a sampling sketch follows):

$Y_{t-1} = \mu_\theta(Y_t, t) + \sigma_t\, z, \qquad z \sim \mathcal{N}(0, \mathbf{I})$ (21)
(6) Starting from time step t = T, the noise is iteratively predicted and the previous time step is sampled using the denoising module until t = 0, yielding the predicted reservoir parameters $Y_{predict}$;
(7) The goal of model training is to minimize the squared error between the mean of the predicted noise and the mean of the true noise; the loss function is therefore computed as:

$L = \big\| \mu_{t-1} - \mu_\theta(Y_t, t) \big\|^2$ (22)

where $\mu_{t-1}$ denotes the mean of the true noise at time step t−1, i.e. $\mu_{t-1} = \dfrac{1}{\sqrt{\alpha_t}}\left(Y_t - \dfrac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, z\right)$, with z the Gaussian noise actually added in the forward process.
After the model loss value has been computed with equation (22), the derivatives with respect to the weight coefficient matrices $w = [W_Q, W_K, W_V]$ and the bias vectors $b = [b_Q, b_K, b_V]$ are obtained via the chain rule, and w and b are updated using a stochastic gradient descent algorithm:

$w^{*} = w - \eta\, \dfrac{\partial L}{\partial w}$ (23)

$b^{*} = b - \eta\, \dfrac{\partial L}{\partial b}$ (24)

where $w^{*}$ and $b^{*}$ denote the updated weight coefficient matrices and bias vectors, and η denotes the learning rate. A compact training-loop sketch covering (1)-(7) is given below.
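In this sketch, `encoder`, `decoder_predict_noise` and the optimizer are placeholders for the modules described above, `q_sample` reuses the helper sketched earlier, and the loss is written as an MSE on the predicted noise, which matches equation (22) up to a time-step-dependent scaling factor.

```python
import torch

def train_step(encoder, decoder_predict_noise, x_train, y_embedding,
               optimizer, T=100):
    x_hidden = encoder(x_train)                              # step 4(2)
    t = torch.randint(1, T + 1, (1,)).item()                 # random diffusion time step
    y_t, true_noise = q_sample(y_embedding, t)               # step 4(3), eq. (15)
    z_pred = decoder_predict_noise(y_t, t, x_hidden)         # step 4(4), cross-attention decoder
    loss = torch.nn.functional.mse_loss(z_pred, true_noise)  # surrogate for eq. (22)
    optimizer.zero_grad()
    loss.backward()                                          # chain-rule derivatives
    optimizer.step()                                         # SGD update, eqs. (23)-(24)
    return loss.item()
```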
Step 5, predicting the reservoir parameters of any target area with the cross-attention diffusion model trained in step 4, comprising the following steps:
(1) Acquire the logging data $X_{test}$ of any target area;
(2) $X_{test}$ and randomly sampled Gaussian noise $Y_T \sim \mathcal{N}(0, \mathbf{I})$ are input together into the trained cross-attention diffusion model. First, the output vector $X_{hidden}$ of $X_{test}$ after passing through the encoder is computed according to (2) in step 4; next, following (4)-(6) in step 4, $Y_T$ and $X_{hidden}$ are input to the denoising module, and the predicted reservoir parameters $Y_{predict}$ are finally obtained through iterative denoising, completing the reservoir parameter prediction for the target area.
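An inference sketch for step 5, reusing the `p_sample` step sketched above; the label dimensionality and tensor shapes are assumptions.

```python
import torch

def predict_reservoir(encoder, decoder_predict_noise, x_test,
                      betas, alphas, alpha_bars, T=100, label_dim=3):
    x_hidden = encoder(x_test)                          # encoder pass over target-area logs
    y_t = torch.randn(x_test.shape[0], label_dim)       # Y_T ~ N(0, I)
    for t in range(T, 0, -1):                           # iterative denoising
        y_t = p_sample(y_t, t, x_hidden,
                       decoder_predict_noise, betas, alphas, alpha_bars)
    return y_t                                          # predicted POR / PERM / SO
```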
The innovation points of the invention are as follows:
(1) A diffusion model with a Transformer backbone network is constructed for reservoir parameter prediction; the correlations among the logging data features are mined in depth, and efficient, accurate prediction of reservoir parameters is achieved in an end-to-end manner;
(2) A cross attention mechanism is introduced into the diffusion model, interaction between logging data and reservoir parameters is better captured, and accuracy of predicting the reservoir parameters in the iterative denoising process of the model is improved;
(3) For any target area, the model does not need to be corrected with core data, so that low-cost, high-precision prediction is achieved. Meanwhile, the outlier detection strategy reduces the sensitivity of the model to abnormal data.
The beneficial effects are that:
compared with the prior art, the invention has the following beneficial effects:
the invention provides a reservoir prediction method based on a cross attention diffusion model, which takes a generated diffusion model as a reservoir parameter prediction basis, takes the existing logging and coring data of a work area as a data support, and combines a transducer characteristic extraction module and a cross attention module to realize reservoir parameter prediction of any target area with low cost and high precision. Compared with the conventional geological method which relies on mathematical empirical formulas with complicated modeling to calculate reservoir parameters, the method predicts the reservoir parameters in an end-to-end mode, namely, logging data of any target area is directly input into a trained model to predict the reservoir parameters, so that correlation among various characteristics of the logging data is better extracted, the problem that core data are acquired at high cost to correct the model to improve model accuracy is avoided, and meanwhile, the reservoir parameters can be predicted more quickly and accurately.
Drawings
FIG. 1 is a cross-attention diffusion model block diagram for implementing reservoir parameter prediction;
FIG. 2 is a schematic diagram of a noise adding module in a diffusion model;
FIG. 3 is a schematic diagram of a denoising module in a diffusion model;
FIG. 4 is a flow chart of a method of reservoir prediction based on a cross-attention diffusion model.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Examples:
a reservoir prediction method based on a cross attention diffusion model comprises the following steps:
step 1, acquiring the existing logging data and coring data of a working area:
the acquisition of the existing logging data of the working area comprises the following logging parameters: natural Gamma (GR), sonic time difference (AC), borehole diameter (CAL), compensated Neutron (CNL), compensated Density (DEN), natural potential (SP), resistivity log (RT), the 7 logging parameters were selected as the characteristic data; the existing coring data of the work area is used as label data, which comprises the following steps: porosity (POR), permeability (PERM), and oil Saturation (SO), such reservoir parameters would need to be predicted for the uncancelled work area.
Step 2, preprocessing the acquired logging data, and constructing a training and testing sample data set:
(1) As shown in fig. 4, preprocessing the acquired logging data comprises outlier removal and standardization, where outlier removal adopts the robust random cut forest algorithm (Robust Random Cut Forest, RRCF) and includes the following steps:
a) Given a logging data set $X = \{x_1, x_2, \ldots, x_n, \ldots, x_N\}$ containing N sample points, where each sample $x_i$ has dimension 7. First, the first n sample points $X' = \{x_1, x_2, \ldots, x_n\}$ are selected to initialize a robust random cut tree: compute the maximum $\max_d$ and minimum $\min_d$ of all samples in $X'$ in each feature dimension d; randomly select a feature dimension d according to the probability distribution $p_d = \dfrac{\max_d - \min_d}{\sum_{d'}(\max_{d'} - \min_{d'})}$; sample a cut value C on the selected feature dimension d from the uniform distribution on $[\min_d, \max_d]$; split $X'$ into the left subtree $X_{left} = \{x_i \mid x_i \in X', x_{id} \le C\}$ and the right subtree $X_{right} = X' \setminus X_{left}$ according to the selected feature dimension d; repeat the sampling of the cut value C and the splitting on $X_{left}$ and $X_{right}$ until every sample point in $X'$ has been assigned to a leaf node, each leaf node containing exactly one sample point; all the resulting $X_{left}$ and $X_{right}$ finally constitute the initialized robust random cut tree T;
b) Anomaly scores are computed by deleting and inserting sample points. Starting from the (n+1)-th sample point, the sample point $x_{i-n}$ is first deleted from T by replacing the parent of the leaf node to which the sample belongs with its sibling node, where i = n+1. The change in model complexity caused by deleting $x_{i-n}$, i.e. the change in the sum of the depths of all leaf nodes in T, measures the degree of abnormality of that point. The anomaly score of sample point $x_{i-n}$ is therefore computed as:

$s(x_{i-n}, X') = \sum_{y \in X'} f(y, X') - \sum_{y \in X' \setminus \{x_{i-n}\}} f\big(y, X' \setminus \{x_{i-n}\}\big)$

where the function $f(y, \cdot)$ denotes the depth of leaf node y in the tree generated from the given sample set. The closer the anomaly score $s(x_{i-n}, X')$ is to 0, the less likely $x_{i-n}$ is an outlier. After deleting $x_{i-n}$ from T, the sample point $x_i$ is inserted into the set $X' \setminus \{x_{i-n}\}$ from which $x_{i-n}$ was deleted, giving the new sample point set $X' \setminus \{x_{i-n}\} \cup \{x_i\}$. Repeating the robust random cut tree initialization procedure yields a robust random cut tree T'' containing the inserted sample point $x_i$. By deleting sample point $x_{i-n}$, now with i = n+2, the anomaly score of $x_{i-n}$ is computed and recorded;
c) The procedures of initializing the robust random cut tree and deleting and inserting sample points are executed iteratively until the anomaly scores $S = \{s(x_1), s(x_2), \ldots, s(x_N)\}$ of all sample points have been obtained. All anomaly scores are arranged in descending order, $S' = \mathrm{DescendingSort}(S)$; the sample points corresponding to the top 2% of $S'$ are marked as outliers and removed from X, giving the outlier-free logging data set $X_{new}$;
d) The outlier-free logging data $X_{new}$ are standardized as follows:

$X'_{new} = \dfrac{X_{new} - \mu}{\sigma}, \quad \mu = \dfrac{1}{N'}\sum_{i=1}^{N'} x_i, \quad \sigma = \sqrt{\dfrac{1}{N'}\sum_{i=1}^{N'} (x_i - \mu)^2}$

where $X'_{new}$ is the standardized sample value, μ and σ are the mean and standard deviation of $X_{new}$, and N' is the number of logging data samples after outlier removal;
(2) When constructing the training and test sample data sets, 80% of the preprocessed logging data is used as the training set $X_{train}$ and the remaining 20% as the test set $X_{test}$.
Step 3, constructing a cross attention diffusion model network structure:
As shown in fig. 1, the model network structure comprises a 1×1 convolution module, a noise-adding module, a denoising module, a Transformer feature extraction module and a cross-attention module.
1×1 convolution module: the number of convolution kernels is set to 8, and the module maps the input logging data features into a high-dimensional space;
Noise-adding module: in the forward diffusion process, Gaussian noise is gradually added to the original real reservoir parameters (label data) until they become pure Gaussian noise;
Denoising module: a learnable neural network; in the reverse diffusion process, the state of the current time step is denoised by predicting the noise of the current time step, yielding the state of the previous time step;
Transformer feature extraction module: the encoder consists of layer normalization, a multi-head attention layer, a feed-forward neural network, a neuron dropout layer (Dropout) and residual connections; the decoder adds a cross-attention layer on top of the encoder, with the remaining components identical to those of the encoder;
Cross-attention module: composed of a multi-head attention layer, layer normalization and a residual connection, where the key (K) and value (V) required for the multi-head attention computation come from the output of the last encoder layer, and the query (Q) comes from the output of the preceding multi-head attention layer of the decoder.
Step 4, inputting the training sample into a cross attention diffusion model for training:
(1) As shown in fig. 4, the preprocessed logging data serving as training samples, $X_{train} \in \mathbb{R}^{m \times 7 \times 1}$, are input to the 1×1 convolution module, where m denotes the number of logging data training samples and the number of channels is 1. After 8 1×1 convolutions, the embedded vector $X_{embedding} \in \mathbb{R}^{m \times 7 \times 8}$ is obtained, mapping each logging feature into a high-dimensional space;
(2) As shown in fig. 4, the embedded vector $X_{embedding}$ is input to the encoder of the Transformer feature extraction module, and layer normalization is applied over all neurons of the layer to obtain $X_{LN}$:

$X_{LN} = \gamma \odot \dfrac{X_{embedding} - \mathrm{E}[X_{embedding}]}{\sqrt{\mathrm{Var}[X_{embedding}] + \epsilon}} + \beta$ (29)

where E[·] denotes the mean, Var[·] denotes the variance, γ and β are the scaling and translation parameters respectively, and ε takes the value $10^{-8}$. $X_{LN}$ is then fed into the multi-head attention layer of the encoder, which first constructs the matrix vectors Q, K and V:

$Q = X_{LN} W_Q + b_Q$ (30)
$K = X_{LN} W_K + b_K$ (31)
$V = X_{LN} W_V + b_V$ (32)

where $W_Q$, $W_K$ and $W_V$ are three weight matrices and $b_Q$, $b_K$ and $b_V$ are three bias vectors. Next, the number of heads in the multi-head attention layer is set to 8, and the matrix vectors Q, K and V are split into 8 groups, i.e. $Q = [Q_1, Q_2, \ldots, Q_8]$, $K = [K_1, K_2, \ldots, K_8]$ and $V = [V_1, V_2, \ldots, V_8]$, forming 8 different heads $(Q_i, K_i, V_i)$, $1 \le i \le 8$. Each head computes attention in a different vector subspace; for a single head, the attention mechanism is calculated as:

$\mathrm{Attention}(Q_i, K_i, V_i) = \mathrm{softmax}\!\left(\dfrac{Q_i K_i^{T}}{\sqrt{d_k}}\right) V_i$ (33)

where $K_i^{T}$ denotes the transpose of $K_i$ and $d_k$ denotes the dimension of $K_i$. Finally, the attention computed by the different heads is fused:

$X_{Attention} = \mathrm{Concat}\big(\mathrm{Attention}(Q_1, K_1, V_1), \ldots, \mathrm{Attention}(Q_8, K_8, V_8)\big)\, W_O$ (34)

where Concat(·) denotes vector concatenation and $W_O$ is a weight matrix. After the fused feature $X_{Attention}$ is obtained, it is combined with the embedded vector $X_{embedding}$ through a residual connection:

$X'_{Attention} = X_{Attention} + X_{embedding}$ (35)

$X'_{Attention}$ is then input into a feed-forward neural network, which comprises two fully connected layers, a Dropout layer and a GELU activation function. The output of the feed-forward network, $X_{Feed}$, is computed as:

$X_{Feed} = \mathrm{Linear}(\mathrm{GELU}(\mathrm{Dropout}(\mathrm{Linear}(X'_{Attention}))))$ (36)

where Linear denotes a fully connected layer. $X_{Feed}$ is then residually connected with $X'_{Attention}$, finally giving the output vector $X_{hidden}$ of the embedded vector $X_{embedding}$ after passing through the encoder:

$X_{hidden} = X_{Feed} + X'_{Attention}$ (37)
(3) As shown in fig. 4, the real reservoir parameters (label data) $Y_{train}$ corresponding to the training samples are input to a 1×1 convolution module, similarly to (1) in step 4, yielding $Y_{embedding}$. As shown in fig. 2, $Y_{embedding}$ is then input to the noise-adding module, and the forward diffusion process gradually adds Gaussian noise to $Y_{embedding}$ at time steps $t \in \{1, \ldots, 100\}$, giving a set of hidden variables $Y_1, \ldots, Y_{100}$:

$q(Y_t \mid Y_{t-1}) = \mathcal{N}\!\big(Y_t;\ \sqrt{1-\beta_t}\, Y_{t-1},\ \beta_t \mathbf{I}\big)$ (38)

where $\beta_t$ denotes the diffusion rate; in the invention $\beta_t$ takes values in the range [0.2, 1]. $Y_t$ denotes the state of $Y_{embedding}$ at time step t. At any time step t, the relationship between $Y_t$ and the original state $Y_0$ ($Y_{embedding}$) can be expressed as:

$Y_t = \sqrt{\bar{\alpha}_t}\, Y_0 + \sqrt{1-\bar{\alpha}_t}\, z, \qquad z \sim \mathcal{N}(0, \mathbf{I})$ (39)

where $\alpha_t = 1 - \beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.
(4) As shown in fig. 4, $Y_t$ and the encoder output vector $X_{hidden}$ are input to the decoder of the denoising module, i.e. the Transformer feature extraction module, to predict the noise of the current time step. First, a multi-head attention computation is performed on $Y_t$ in the same manner as in (2) of step 4, and $Y_t^{Attention}$ at time step t is obtained analogously to formula (35). Next, $Y_t^{Attention}$ and $X_{hidden}$ are input to the cross-attention module, whose matrix vectors Q, K and V are computed as:

$Q = Y_t^{Attention} W_Q + b_Q$ (40)
$K = X_{hidden} W_K + b_K$ (41)
$V = X_{hidden} W_V + b_V$ (42)

When computing the cross-attention, apart from the different construction of the matrix vectors Q, K and V compared with multi-head self-attention, $Y_t^{hidden}$ is obtained in the same manner as in (2) of step 4, analogously to formula (37). This is the noise predicted by the denoising module at time step t, $z_\theta(Y_t^{Attention}, t, X_{hidden}) = Y_t^{hidden}$, where $z_\theta$ denotes the output function with which the denoising module predicts the noise from $Y_t^{hidden}$ during the reverse diffusion process, and θ denotes the parameters of this function;
(5) As shown in fig. 3, based on the noise $z_\theta(Y_t^{Attention}, t, X_{hidden})$ predicted at time step t in (4) of step 4, the state at time step t is denoised to obtain the state $Y_{t-1}$ at time step t−1. The mean and variance of the predicted noise are computed as:

$\mu_\theta(Y_t, t) = \dfrac{1}{\sqrt{\alpha_t}}\left(Y_t - \dfrac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, z_\theta(Y_t^{Attention}, t, X_{hidden})\right)$ (43)

$\sigma_t^2 = \dfrac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\, \beta_t$ (44)

The state at time step t−1 is then sampled as:

$Y_{t-1} = \mu_\theta(Y_t, t) + \sigma_t\, z, \qquad z \sim \mathcal{N}(0, \mathbf{I})$ (45)
(6) As shown in fig. 3, starting from time step t = 100, the noise is iteratively predicted and the previous time step is sampled using the denoising module until t = 0, yielding the predicted reservoir parameters $Y_{predict}$;
(7) The goal of model training is to minimize the squared error between the mean of the predicted noise and the mean of the true noise; the loss function is therefore computed as:

$L = \big\| \mu_{t-1} - \mu_\theta(Y_t, t) \big\|^2$ (46)

where $\mu_{t-1}$ denotes the mean of the true noise at time step t−1, i.e. $\mu_{t-1} = \dfrac{1}{\sqrt{\alpha_t}}\left(Y_t - \dfrac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, z\right)$, with z the Gaussian noise actually added in the forward process.
After the model loss value has been computed with equation (46), the derivatives with respect to the weight coefficient matrices $w = [W_Q, W_K, W_V]$ and the bias vectors $b = [b_Q, b_K, b_V]$ are obtained via the chain rule, and w and b are updated using a stochastic gradient descent algorithm:

$w^{*} = w - \eta\, \dfrac{\partial L}{\partial w}$ (47)

$b^{*} = b - \eta\, \dfrac{\partial L}{\partial b}$ (48)

where $w^{*}$ and $b^{*}$ denote the updated weight coefficient matrices and bias vectors, and η denotes the learning rate, set to 0.0001 in the invention.
Step 5, predicting the reservoir parameters of any target area with the cross-attention diffusion model trained in step 4, comprising the following steps:
(1) Acquire the logging data $X_{test}$ of any target area;
(2) $X_{test}$ and randomly sampled Gaussian noise $Y_T \sim \mathcal{N}(0, \mathbf{I})$ are input together into the trained cross-attention diffusion model. First, the output vector $X_{hidden}$ of $X_{test}$ after passing through the encoder is computed according to (2) in step 4; next, following (4)-(6) in step 4, $Y_T$ and $X_{hidden}$ are input to the denoising module, and the predicted reservoir parameters $Y_{predict}$ are finally obtained through iterative denoising, completing the reservoir parameter prediction for the target area.
The present invention is not limited to the above-described embodiments. Any simple modifications, equivalent changes or variations made to the above embodiments according to the technical substance of the present invention by any person skilled in the art, without departing from the scope of the technical solution of the present invention, still fall within the scope of the technical solution of the present invention.

Claims (1)

1. A reservoir prediction method based on a cross-attention diffusion model, comprising the steps of:
s1: acquiring the existing logging data and coring data of a work area;
s2: preprocessing the acquired logging data, and constructing a training and testing sample data set;
s3: constructing a cross attention diffusion model network structure;
s4: inputting the training sample into a cross attention diffusion model for training;
s5: predicting reservoir parameters of any target area according to the cross attention diffusion model trained in the step S4;
The acquisition of the existing logging data of the work area in step S1 comprises the following logging parameters: natural gamma (GR), acoustic travel time (AC), borehole diameter (CAL), compensated neutron (CNL), compensated density (DEN), spontaneous potential (SP) and resistivity log (RT); at least three of these logging parameters should be selected as feature data; the existing coring data of the work area are used as label data and comprise porosity (POR), permeability (PERM) and oil saturation (SO); these reservoir parameters are the quantities to be predicted for uncored work areas.
Preprocessing the acquired logging data in step S2 comprises outlier removal and standardization, where outlier removal adopts the robust random cut forest algorithm (Robust Random Cut Forest, RRCF) and includes the following steps:
s21: given a log data set x= { X containing N sample points 1 ,x 2 ,…,x n ,…,x N Each sample x i Is M. First, the first n sample points X' = { X are selected 1 ,x 2 ,…,x n Initializing a robust random partition tree: calculating the maximum value of all samples in X' in each characteristic dimension dAnd minimum->According to probability distributionRandomly selecting a characteristic dimension d, wherein +.>Sampling a segmentation value C on the selected feature dimension d, the probability distribution of which is +.>Dividing X' into left subtree X according to the selected feature dimension d left ={x i |x i ∈X′,x id C and right subtree X right =X′\X left The method comprises the steps of carrying out a first treatment on the surface of the At X left And X right The process of sampling the partition value C and partitioning the subtrees is repeatedly performed until all the sample points in X 'are partitioned onto leaf nodes, each leaf node contains only one sample point, and finally all X's are obtained left And X right Constructing an initialized robust random partition tree T;
s22: second, anomaly scores are calculated by deleting and inserting sample points. Starting from the (n+1) th sample point, the (x) th sample point in T is deleted first i-n The parent of the leaf node to which each sample belongs is replaced with its sibling node, where i=n+1. When deleting sample point x i-n And when the model complexity change amount, namely the change value of the sum of the depths of all leaf nodes in the T, is the abnormal degree of the point. Thus, sample point x i-n Is calculated as:
wherein the function f represents the depth of any leaf node y in T generated by the sample set X'. Abnormality score s (x i-n The closer X') is to 0, X i-n The less likely an outlier sample point is. Deleting sample point x from T i-n Then, the sample point x is again i And to delete x i-n Set X' \ { X } i-n Obtaining a new sample point set X' \ { X } i-n }∪{x i }. Repeatedly executing the initialization process of the robust random partition tree to obtain the interpolationInto sample point x i Is a robust random partitioning tree T ". By deleting sample point x i-n Where i=n+2, x is calculated and recorded i-n Is an anomaly score for (2);
s23: finally, the process of initializing the robust random partition tree, deleting and inserting the sample points is iteratively performed until abnormal scores S= { S (X) 1 ),s(x 2 ),…,s(x N ) Arranging all abnormal scores in Descending order S '=Descending Sort (S), setting the first 2% of sample points in S' as abnormal points, and eliminating the corresponding sample points from X to obtain a logging data set X with abnormal values removed new
Carrying out standardization processing on logging data with abnormal values removed, wherein the standardization method comprises the following steps:
wherein X 'is' new Mu and sigma are X respectively as normalized sample values new N' is the number of samples of the log data after outlier removal.
S24: when the training and testing sample data set is constructed in the step S2, 80% of the well logging data after pretreatment is used as a training set X train The remaining 20% of the log data is taken as test set X test
S3, constructing a cross attention diffusion model network structure in the step S3, wherein the cross attention diffusion model network structure comprises the following modules:
1 x 1 convolution module: the number of convolution kernels is set to c, and the convolution kernels are used for mapping the input logging data characteristics into a high-dimensional space;
Noise-adding module: in the forward diffusion process, Gaussian noise is gradually added to the original real reservoir parameters (label data) until they become pure Gaussian noise;
Denoising module: a learnable neural network; in the reverse diffusion process, the state of the current time step is denoised by predicting the noise of the current time step, yielding the state of the previous time step;
Transformer feature extraction module: the encoder consists of layer normalization, a multi-head attention layer, a feed-forward neural network, a neuron dropout layer (Dropout) and residual connections; the decoder adds a cross-attention layer on top of the encoder, with the remaining components identical to those of the encoder;
Cross-attention module: composed of a multi-head attention layer, layer normalization and a residual connection, where the key (K) and value (V) required for the multi-head attention computation come from the output of the last encoder layer, and the query (Q) comes from the output of the preceding multi-head attention layer of the decoder.
S4, inputting the training sample into a cross attention diffusion model for training in the step S4, wherein the method comprises the following steps of:
s41: inputting logging data as training samples after pretreatmentTo a 1X 1 convolution module, wherein m represents the number of training samples of the logging data, d represents the characteristic dimension of the logging data, the number of channels is 1, and the embedded vector +_ of the logging data is obtained after c 1X 1 convolutions>Mapping each feature of the logging data into a high-dimensional space;
s42: will embed vector X embedding Input to encoder of transducer feature extraction module, for X embedding All neurons of the middle layer undergo layer homingIs converted into X LN
Wherein E [. Cndot.]Mean value Var [. Cndot.]The variance is represented, gamma and beta are scaling and translation parameters respectively, epsilon is a minimum value, and the situation that denominator is 0 is prevented; x is to be LN The multi-headed attention layer of the input encoder first builds matrix vectors Q, K and V:
Q=X LN W Q +b Q (6)
K=X LN W K +b K (7)
V=X LN W V +b V (8)
wherein W is Q 、W K And W is V Is three weight matrices, b Q 、b K And b V Is three bias vectors. Second, the number of heads in the multi-head attention layer is set to h, dividing the matrix vectors Q, K and V into h groups, i.e., q= [ Q 1 ,Q 2 ,…,Q h ]、K=[K 1 ,K 2 ,…,K h ]Sum v= [ V 1 ,V 2 ,…,V h ]Make up of h different heads (Q i ,K i ,V i ) I is more than or equal to 1 and less than or equal to h. Each head calculates the attention in a different vector space, respectively, while for a single head its attention mechanism is calculated as:
wherein the method comprises the steps ofRepresent K i Transpose of d k Represent K i Is a dimension of (c). Finally, carrying out feature fusion on the attention calculated by different heads:
wherein the method comprises the steps ofRepresenting vector concatenation, W O Is a weight matrix. Obtaining a fused feature X Attention After that, it is combined with the embedded vector X embedding Residual connection is carried out:
X′ Attention =X Attention +X embedding (11)
and then X' Attention Is input into a feed-forward neural network, which comprises two fully connected layers, a Dropout layer and a GELU activation function. Thus, the output X through the feedforward neural network Feed The calculation is as follows:
X Feed =Linear(GELU(Dropout(Linear(X′ Attention )))) (12)
where Linear represents a full join layer operation. Based on this, X is Feed With X' Attention Residual connection is carried out to finally obtain an embedded vector X embedding Output vector X after passing through encoder hidden
X hidden =X Feed +X′ Attention (13)
S43: true reservoir parameters (tag data) Y to be associated with training samples train Input to a 1×1 convolution block, similar to step S41, to obtain Y embedding . And then Y is added embedding Input to a noise adding module, and the diffusion process gradually changes Y at time steps T E {1, …, T } embedding Adding Gaussian noiseObtaining a group of hidden variables Y 1 ,…,Y T
Wherein beta is t Represents diffusivity, Y t Represents Y embedding State at time step t. At any time step t, Y t And the original state Y 0 (Y embedding ) The relationship of (2) can be expressed as:
wherein alpha is t =1-β t
S44: y is set to t Output vector X of encoder hidden And inputting the noise into a decoder of a denoising module, namely a transducer feature extraction module, and predicting the noise of the current time step. First to Y t A multi-head attention calculation is performed in the same manner as in step S42, and Y for time step t is similarly obtained according to formula (11) t Attention . Next, Y is t Attention And X is hidden Input into the cross attention module, the matrix vectors Q, K and V for cross attention are calculated as:
Q=Y t Attention W Q +b Q (16)
K=X hidden W K +b K (17)
V=X hidden W V +b V (18)
when calculating the cross-attention, Y is similarly obtained according to the formula (13) in the same manner as in step S42 except that the matrix vectors Q, K and V are calculated in a different manner from the multi-head self-attention t hidden I.e. the noise z predicted by the denoising module at time step t θ (Y t Attention ,t,X hidden )=Y t hidden Wherein z is θ Representing the denoising module from Y during back diffusion t hidden Output function of prediction noise, θ represents functionParameters of (2);
s45: noise z predicted at time step t based on step S44 θ (Y t Attention ,t,X hidden ) Denoising the state of time step t to obtain the state of time step t-1Further, the mean and variance of the predicted noise are calculated as:
thus, the state of time step t-1Sampling is as follows:
s46: iteratively predicting noise from time step t=t and completing sampling of the previous time step using a denoising module until t=0, resulting in a predicted reservoir parameter
S47: the training cross attention diffusion model is characterized in that the model training aims at minimizing the square error between the average value of the prediction noise and the average value of the real noise, and then the loss function is calculated as follows:
wherein mu t-1 Representing real noise at time step t-1Mean of (a), i.e
The above calculation model loss function is characterized in that the model loss value is calculated by the equation (22), and then the weight coefficient matrix w= [ W ] is calculated by the chain law Q ,W K ,W V ]And offset vector b= [ b ] Q ,b K ,b V ]Conducting derivative and updating w and b using a random gradient descent algorithm:
wherein w is * 、b * And representing the optimized weight coefficient matrix and the bias vector, wherein eta represents the learning rate.
S5, predicting reservoir parameters of any target area in the step S5, wherein the method comprises the following steps:
s51: acquiring logging data X of any target area test
S52: x is to be test Random sampling Gaussian noiseSimultaneously inputting into a cross attention diffusion model after training, firstly calculating X according to step S42 test Output vector X after passing through encoder hidden The method comprises the steps of carrying out a first treatment on the surface of the Next, Y is selected according to steps S44 to S46 T And X is hidden Input to a denoising module, and finally obtain predicted reservoir parameters Y through iterative denoising predict And finishing reservoir parameter prediction of the target area.
CN202311229800.6A 2023-09-22 2023-09-22 Reservoir prediction method based on cross attention diffusion model Pending CN117236390A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311229800.6A CN117236390A (en) 2023-09-22 2023-09-22 Reservoir prediction method based on cross attention diffusion model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311229800.6A CN117236390A (en) 2023-09-22 2023-09-22 Reservoir prediction method based on cross attention diffusion model

Publications (1)

Publication Number Publication Date
CN117236390A true CN117236390A (en) 2023-12-15

Family

ID=89085752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311229800.6A Pending CN117236390A (en) 2023-09-22 2023-09-22 Reservoir prediction method based on cross attention diffusion model

Country Status (1)

Country Link
CN (1) CN117236390A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117726990A (en) * 2023-12-27 2024-03-19 浙江恒逸石化有限公司 Method and device for detecting spinning workshop, electronic equipment and storage medium
CN117726990B (en) * 2023-12-27 2024-05-03 浙江恒逸石化有限公司 Method and device for detecting spinning workshop, electronic equipment and storage medium
CN117649529A (en) * 2024-01-30 2024-03-05 中国科学技术大学 Logging data interpretation method based on multidimensional signal analysis neural network
CN117893838A (en) * 2024-03-14 2024-04-16 厦门大学 Target detection method using diffusion detection model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination