CN112633328A - Dense oil reservoir transformation effect evaluation method based on deep learning - Google Patents
- Publication number: CN112633328A (application CN202011403763.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- neural network
- training
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2135 — Feature extraction by transforming the feature space, based on approximation criteria, e.g. principal component analysis
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
Abstract
The invention provides a deep-learning-based method for evaluating the transformation effect of tight oil reservoirs, belonging to the technical field of oil reservoir development. The method first acquires oil reservoir development data from the field and supplementary data from numerical simulation, and combines the two into a sample data set. Discrete and categorical data in the sample set are then one-hot encoded and mapped into a Euclidean space. The data are further reduced in dimensionality, and the feature data set is divided by dimension into a training set and a verification set, which serve as input sets. Different machine learning models are compared, a basic single model of identical structure is replicated into several branches, Xavier initialization is applied against the vanishing-gradient problem that easily arises during neural network training, and several groups of comparison experiments are designed to optimize the model; finally, the influence of different input parameters on the predicted quantities is analyzed. The method improves the accuracy and efficiency of prediction.
Description
Technical Field
The invention relates to the technical field of oil reservoir development, and in particular to a deep-learning-based method for evaluating the transformation effect of tight oil reservoirs.
Background
Improving the reservoir through volume-fracturing technology is key to the efficient development of tight oil reservoirs. The existing approach to evaluating the reservoir transformation effect relies mainly on seismic monitoring, but seismic monitoring is expensive and not yet mature enough for accurate prediction in unconventional oil and gas reservoirs. A method that rapidly and accurately predicts the range and degree of the volume-fracturing transformation area therefore helps formulate a reasonable development policy, allows the reservoir to be better understood, and greatly reduces development cost.
Aiming at the difficulty of accurately evaluating the transformation range and effect when a tight oil reservoir undergoes volume-fracturing transformation, the invention provides a transformation-area range prediction method based on machine learning. The system designed by the method can rapidly predict the transformation range and the equivalent permeability of the transformation area for existing volume-fractured horizontal wells in tight reservoirs; it offers high prediction accuracy, good adaptability and fast calculation, and addresses the high cost and long turnaround of predicting the range and effect of the volume-fracturing transformation area.
Disclosure of Invention
The invention aims to provide a deep-learning-based method for evaluating the transformation effect of tight oil reservoirs.
The method comprises the following steps:
(1) acquiring a sample data set: acquiring oil reservoir development data from the site, acquiring supplementary data from a numerical simulation technology, unifying the same type of data into the same data format, and forming a sample data set;
(2) performing one-hot encoding on the discrete data and categorical data in the sample data set and mapping them into a Euclidean space;
(3) and (3) performing dimensionality reduction on the data: performing dimensionality reduction on the data by adopting a principal component analysis method to obtain a characteristic data set subjected to dimensionality reduction;
(4) dividing the feature data set extracted in step (3), by dimension and in a certain proportion, into a training set and a verification set, which serve as input sets of the single models; the division proportion can be set freely, and whether the division is reasonable is judged from the model's prediction accuracy;
(5) calculating the mean square error (MSE), mean absolute error (MAE) and R-squared value under different machine learning models, and selecting as the basic single model the one whose R-squared exceeds 95% and whose MAE or MSE is smallest;
(6) replicating the basic single model into several branches of identical structure, feeding the single-model input sets divided in step (4) into the branch models for training so that each branch produces output of the same dimension, and then combining the branch outputs as the input data of a final model, thereby constructing an overall model;
(7) performing initialization settings for the overall model: weight initialization, inter-layer correlation, the over-fitting problem and the learning-rate decay problem;
(8) setting a plurality of groups of comparison experiments, and analyzing parameters influencing the prediction capability of the integral model to obtain an optimal model;
(9) training the training set on the basis of the optimal model, and obtaining a prediction result by using the test set;
(10) evaluating the prediction result using the mean absolute error and the coefficient of determination R² as evaluation indices;
(11) analyzing the influence of different input parameters on the predicted content, and finding out the parameters with larger influence on the result by using Sobol global sensitivity analysis.
The sample data set in the step (1) comprises continuous data, discrete data and classified data.
In step (2), the discrete and categorical data are encoded into binary 0/1 data: a feature with N possible values is represented by N bits, of which exactly one is set to 1 and the rest are 0.
The machine learning models in step (5) include the deep neural network (DNN) model, the convolutional neural network (CNN) model and the recurrent neural network (RNN) model with its long short-term memory (LSTM) variant.
Step (7) adopts the Xavier initialization method.
Step (8) adopts two strategies: one keeps the neural network structure unchanged and uses the control-variable method to compare, on the basic model, the influence of the number of training epochs, the activation function and the data set division; the other changes the network structure and analyzes the influence of the number of hidden layers, the number of neurons per layer and the Dropout technique on the network's prediction.
In step (9), 70% of the data set is used as the neural network input, i.e. the training set; the optimal weights and biases are obtained through the three structurally identical neural networks, and the trained model then predicts the remaining 30%, the test set, to check whether the expected standard is reached.
The technical scheme of the invention has the following beneficial effects:
1. The method designs a calculation framework based on machine learning algorithms, predicts the transformation-area range from field data and simulated supplementary data, and improves the accuracy and efficiency of prediction.
2. The method provides a feature-data selection method for dimensionality reduction, compressing the data dimension while maintaining prediction accuracy, which greatly saves calculation time and improves calculation efficiency.
3. The method maps discrete and categorical data into a format a computer can process and expands the features, making the data sources richer and the data volume more sufficient.
4. The method introduces a multi-branch neural network and solves the heterogeneity of multi-source data through branch-wise processing.
5. The method compares and analyzes, one by one, all parameters influencing the model, gives a procedure for selecting the optimal model, offers remedies for the weaknesses of neural network models, and improves prediction accuracy.
6. The method can identify the parameters with the greatest influence on the result, which facilitates subsequent analysis and model optimization and has practical significance for field guidance and analysis.
Drawings
FIG. 1 is a process flow chart of the method for evaluating the transformation effect of the tight oil reservoir based on deep learning.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
The invention provides a dense oil reservoir transformation effect evaluation method based on deep learning.
As shown in fig. 1, the method comprises the steps of:
(1) acquiring a sample data set: acquiring oil reservoir development data from the site, acquiring supplementary data from a numerical simulation technology, unifying the same type of data into the same data format, and forming a sample data set;
(2) performing one-hot encoding on the discrete data and categorical data in the sample data set and mapping them into a Euclidean space;
(3) and (3) performing dimensionality reduction on the data: performing dimensionality reduction on the data by adopting a principal component analysis method to obtain a characteristic data set subjected to dimensionality reduction;
(4) dividing the feature data set extracted in step (3), by dimension and in a certain proportion, into a training set and a verification set, which serve as input sets of the single models; the division proportion can be set freely, and whether the division is reasonable is judged from the model's prediction accuracy;
(5) calculating the mean square error (MSE), mean absolute error (MAE) and R-squared value under different machine learning models, and selecting as the basic single model the one whose R-squared exceeds 95% and whose MAE or MSE is smallest;
(6) replicating the basic single model into several branches of identical structure, feeding the single-model input sets divided in step (4) into the branch models for training so that each branch produces output of the same dimension, and then combining the branch outputs as the input data of a final model, thereby constructing an overall model;
(7) performing initialization settings for the overall model: weight initialization, inter-layer correlation, the over-fitting problem and the learning-rate decay problem;
(8) setting a plurality of groups of comparison experiments, and analyzing parameters influencing the prediction capability of the integral model to obtain an optimal model;
(9) training the training set on the basis of the optimal model, and obtaining a prediction result by using the test set;
(10) evaluating the prediction result using the mean absolute error and the coefficient of determination R² as evaluation indices;
(11) analyzing the influence of different input parameters on the predicted content, and finding out the parameters with larger influence on the result by using Sobol global sensitivity analysis.
The following description is given with reference to specific examples.
(1) Oil reservoir development data are obtained from the field and supplementary data from numerical simulation, and the two are combined into a data set. The data fall mainly into numerical, categorical and discrete categories. Taking a certain block as an example, there are 3960 groups of simulation data, from which six production parameters are extracted as sample data: horizontal well length, single-end fracturing cluster number, fracture width, matrix permeability, daily oil production and cumulative oil production. Daily and cumulative oil production carry 1500 days of data in the time dimension, generating 3960 × 1500-dimensional data; the other four parameters are independent of time, so their dimension is 3960 × 1.
(2) The discrete and categorical data in the sample data set are one-hot encoded and mapped into a Euclidean space: a discrete feature with N possible values is represented by N bits, of which exactly one is 1 and the rest are 0, which avoids the problem that classifiers handle attribute data poorly. For example, wellbore direction is categorical data with eight attributes: east, west, south, north, southeast, southwest, northeast and northwest. Under one-hot encoding, east is represented as 10000000, west as 01000000, south as 00100000, north as 00010000, southeast as 00001000, southwest as 00000100, northeast as 00000010 and northwest as 00000001.
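As a minimal sketch (plain NumPy, not the patent's own implementation), the eight-direction encoding above can be reproduced as follows:

```python
import numpy as np

def one_hot(categories, values):
    """Encode categorical values as rows with exactly one bit set to 1."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)), dtype=int)
    for row, v in enumerate(values):
        out[row, index[v]] = 1
    return out

# The eight wellbore-direction attributes from the example above.
directions = ["east", "west", "south", "north",
              "southeast", "southwest", "northeast", "northwest"]
encoded = one_hot(directions, ["east", "south"])  # east -> 10000000, south -> 00100000
```

Each categorical feature thus expands into as many binary columns as it has values, which is why one-hot encoding enlarges the feature space.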
(3) Dimensionality reduction is performed on the data, mainly by principal component analysis (PCA). The specific operation is as follows. Let the training data have $m$ rows and $n$ feature dimensions, so that the matrix $X$ is an $m \times n$ matrix.

First the features are normalized (centered). The covariance matrix $C$ of $X$ is then the $n \times n$ matrix

$$C = \frac{1}{m} X^{T} X.$$

Next the eigenvectors of the covariance matrix are computed. From $C$ one obtains $n$ linearly independent non-zero eigenvectors $e_1, e_2, \ldots, e_n$; the feature matrix $E = (e_1, e_2, \ldots, e_n)$ composed of these eigenvectors satisfies

$$E^{T} C E = \Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n),$$

where the elements $\lambda_1, \lambda_2, \ldots, \lambda_n$ on the diagonal are the eigenvalues of the covariance matrix, and the column vector at the corresponding position in $E$ is the unit eigenvector belonging to that eigenvalue.

The features are transformed into the new space by $Z = XU$; in that feature space the features are mutually uncorrelated, and the corresponding covariance matrix is diagonal. Sorting the eigenvalues from large to small and taking the first $k$ eigenvectors as the columns of $U$, the data after the compression transform $Z = XU$ satisfy: $X$ is an $m \times n$ matrix, $U$ is an $n \times k$ matrix, and $Z$ is an $m \times k$ matrix.
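The PCA steps above can be sketched in NumPy (an illustrative reduction; the random 6-feature matrix is a hypothetical stand-in for the reservoir data):

```python
import numpy as np

def pca_reduce(X, k):
    """Center features, eigendecompose the covariance C, keep the top-k components."""
    Xc = X - X.mean(axis=0)               # feature normalization (centering)
    C = Xc.T @ Xc / Xc.shape[0]           # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]     # sort eigenvalues from large to small
    U = eigvecs[:, order[:k]]             # n x k matrix of top-k unit eigenvectors
    return Xc @ U                         # Z = XU, an m x k matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))             # hypothetical 6-feature sample set
Z = pca_reduce(X, 2)
```

By construction the columns of Z are mutually uncorrelated, matching the diagonal covariance stated in the text.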
(4) The extracted feature data set is divided by dimension into a training set and a verification set, which serve as input sets. Taking the simulation data as an example, there are two 3960 × 1500 data sets and four 3960 × 1 data sets. When dividing, the two 1500-dimensional sets and the four 1-dimensional sets form two groups, each split at a training:verification ratio of 0.8:0.2 into x_train_1, y_train_1, x_valid_1, y_valid_1 and x_train_2, y_train_2, x_valid_2, y_valid_2, which serve as input sets for the machine learning models below.
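A possible split routine is sketched below (the variable names follow the text; the zero-filled arrays are placeholders for the real 3960-sample data):

```python
import numpy as np

def split(x, y, train_frac=0.8, seed=0):
    """Shuffle rows consistently, then divide into training and verification sets."""
    idx = np.random.default_rng(seed).permutation(len(x))
    cut = int(len(x) * train_frac)
    return x[idx[:cut]], y[idx[:cut]], x[idx[cut:]], y[idx[cut:]]

# Placeholder arrays with the 3960 x 1500 and 3960 x 1 shapes from the example.
x1, y1 = np.zeros((3960, 1500)), np.zeros((3960, 1))
x_train_1, y_train_1, x_valid_1, y_valid_1 = split(x1, y1, train_frac=0.8)
```

With 3960 samples, an 0.8:0.2 ratio gives exactly 3168 training and 792 verification rows.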
(5) Different machine learning models are compared, mainly the deep neural network (DNN) model, the convolutional neural network (CNN) model, and the recurrent neural network (RNN) model with its long short-term memory (LSTM) variant. The DNN is the most basic neural network structure and has a certain universality; it is currently used for shale-content prediction, porosity prediction, well connectivity and similar tasks, with good results. The CNN is currently the best way to extract features in the image domain; a basic CNN consists of convolution, activation and pooling, and its output is a specific feature space for each image. Unlike traditional feed-forward networks, RNNs introduce directed cycles, and the LSTM is an improved RNN that can handle long-range dependence. Because the data set used in the invention contains no image data and correlates only weakly with time series, the most common DNN model is used. Initially, a single neural network is set to 3 hidden layers with 150 neurons each, no Dropout, 200 training epochs, and the ReLU function as activation.
(6) The basic single model is replicated into several branches of identical structure, which are trained separately on the multi-source heterogeneous data and finally merged into an overall model. Taking the data of step (4) as an example, the two 1500-dimensional data sets are horizontally combined into 3000 dimensions and trained with the basic model of step (5), finally yielding 1-dimensional data; the four 1-dimensional data sets are horizontally combined into 4 dimensions and trained through the same basic network, also yielding a 1-dimensional result. The two 1-dimensional outputs serve as the input of a merging model with the same structure as the branch models, and the final training gives the output of the overall model. The overall model thus contains three structurally identical neural networks and can handle input data of different dimensions.
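The branch-and-merge data flow can be illustrated with a bare NumPy forward pass (untrained random weights and shallower networks than the text's 3-hidden-layer base model, purely to show how the dimensions combine):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def mlp_forward(x, Ws, bs):
    """ReLU MLP forward pass; the last layer is linear."""
    for W, b in zip(Ws[:-1], bs[:-1]):
        x = relu(x @ W + b)
    return x @ Ws[-1] + bs[-1]

def init_mlp(sizes, rng):
    Ws = [rng.normal(0.0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
    bs = [np.zeros(b) for b in sizes[1:]]
    return Ws, bs

rng = np.random.default_rng(1)
branch1 = init_mlp([3000, 150, 1], rng)  # takes the merged 3000-dim time-series data
branch2 = init_mlp([4, 150, 1], rng)     # takes the 4-dim static data
merger = init_mlp([2, 150, 1], rng)      # merging model of the same style of structure

x1 = rng.normal(size=(8, 3000))
x2 = rng.normal(size=(8, 4))
z = np.concatenate([mlp_forward(x1, *branch1),
                    mlp_forward(x2, *branch2)], axis=1)  # combine branch outputs
y = mlp_forward(z, *merger)                              # overall model output
```

Each branch reduces its own input dimension to 1, so the merging model always sees a fixed-width input regardless of the original data shapes.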
(7) The Xavier initialization method is adopted against the vanishing-gradient problem that easily arises during neural network training. Suppose the neural network has $N$ hidden layers, $i$ is the index of a hidden layer, $z^{i}$ is the input of layer $i$, $y^{i}$ its output, and $w^{i}$ and $b^{i}$ its weights and biases; then the feed-forward operation of a standard neural network can be described as

$$z^{i+1}_{j} = \sum_{l} w^{i+1}_{jl}\, y^{i}_{l} + b^{i+1}_{j}, \qquad y^{i+1}_{j} = f\!\left(z^{i+1}_{j}\right),$$

where $f(\cdot)$ is the activation function and $j$ is the index of a hidden-layer neuron.

In practice the numbers of inputs $n_{i}$ and outputs $n_{i+1}$ generally differ, so the weight variance is expressed as

$$\operatorname{Var}\!\left(w^{i}\right) = \frac{2}{n_{i} + n_{i+1}}.$$

On this basis the weights of layer $i$ can be initialized from a Gaussian, $w^{i} \sim N\!\left(0, \tfrac{2}{n_{i}+n_{i+1}}\right)$. The initialization can also be done with a uniform distribution:

$$w^{i} \sim U\!\left[-\frac{\sqrt{6}}{\sqrt{n_{i}+n_{i+1}}},\; \frac{\sqrt{6}}{\sqrt{n_{i}+n_{i+1}}}\right].$$
the method can keep the variance of the activation value and the variance of the state gradient of each layer consistent in the propagation process, thereby solving the problem of gradient disappearance.
During training, changes in the weight parameters of earlier layers shift the inputs of later layers; when the network is deep, the correlation between layers becomes strong, and a small change in an early layer accumulates into a large effect later on. Deep neural networks, however, generally require the input variables to be similarly distributed on the training and test data; if the input distribution changes greatly every time the parameters are updated, the network must continually adapt to the new distribution, making training extremely difficult. A batch normalization method is adopted for this problem: by normalizing, the mean and variance of each layer's inputs are fixed during training, which alleviates internal covariate shift and reduces the dependence of the gradients on the parameters.
The specific operation is as follows: for a mini-batch $B = \{x_1, \ldots, x_m\}$,

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad \sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu_B\right)^2, \qquad \hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta.$$
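The batch-normalization transform can be sketched directly (per-feature statistics over a mini-batch; γ and β default to 1 and 0 here):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then scale by gamma and shift by beta."""
    mu = x.mean(axis=0)                       # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)     # normalized activations
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(3.0, 5.0, size=(128, 10))      # a mini-batch with shifted statistics
y = batch_norm(x)
```

Whatever the incoming mean and variance, the normalized output has mean ≈ 0 and variance ≈ 1 per feature, which is what stabilizes the distribution seen by the next layer.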
for the overfitting problem which is easy to generate by the neural network, the Dropout technology can randomly disable partial neurons, so that the neural network structure trained in each iteration is not completely the same, the synergy among the characteristics is weakened, and the problem is solved. Using Dropout techniques, the neural network feed-forward operation is modified to:
There are many ways to realize the optimization; the Adadelta scheme is adopted here, which avoids the learning rate becoming too small to make effective updates.
The Adadelta optimization scheme is as follows: with decay rate $\rho$, parameter $\theta$ and gradient $g_t$,

$$E[g^2]_t = \rho E[g^2]_{t-1} + (1-\rho)\, g_t^2, \qquad \Delta\theta_t = -\frac{\sqrt{E[\Delta\theta^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}}\, g_t,$$

$$E[\Delta\theta^2]_t = \rho E[\Delta\theta^2]_{t-1} + (1-\rho)\, \Delta\theta_t^2, \qquad \theta_{t+1} = \theta_t + \Delta\theta_t.$$
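A minimal NumPy Adadelta loop on a toy quadratic (illustrative only; the ρ and ε values are typical defaults, not taken from the text):

```python
import numpy as np

def adadelta_minimize(grad, x0, rho=0.95, eps=1e-6, steps=500):
    """Adadelta: step sizes come from running averages; no global learning rate."""
    x = np.asarray(x0, dtype=float).copy()
    eg2 = np.zeros_like(x)    # running average E[g^2]
    edx2 = np.zeros_like(x)   # running average E[dx^2]
    for _ in range(steps):
        g = grad(x)
        eg2 = rho * eg2 + (1 - rho) * g * g
        dx = -np.sqrt(edx2 + eps) / np.sqrt(eg2 + eps) * g
        edx2 = rho * edx2 + (1 - rho) * dx * dx
        x = x + dx
    return x

# Toy objective f(x) = (x - 3)^2, gradient 2(x - 3).
x_after = adadelta_minimize(lambda x: 2.0 * (x - 3.0), np.array([0.0]), steps=200)
```

Note that the update magnitude is governed by the ratio of the two running RMS terms, so no hand-tuned global learning rate appears anywhere in the loop.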
(8) Several groups of comparison experiments are designed to realize model optimization. Two strategies are mainly used: one keeps the network structure unchanged and, on the basic model, compares by the control-variable method the influence of the number of training epochs, the activation function and the data set division; the other changes the network structure and analyzes the influence of the number of hidden layers, the number of neurons per layer and the Dropout technique on the network's prediction. Specifically, the Dropout rate is set to five values of 0, 0.3, 0.4, 0.5 and 0.6; since each layer of the base model has 150 neurons, the raw neuron counts per layer after adding the Dropout operation are 150, 500, 375, 300 and 250, respectively. The activation functions are softplus, ReLU and Linear; the number of hidden layers (per single model) is 1, 2, 3, 4 or 5; the number of neurons per layer is 20, 50, 100, 150 or 200; the data set splits are 0.8:0.2, 0.7:0.3, 0.6:0.4 and 0.5:0.5; and the numbers of training epochs are 200, 300, 400, 500, 600 and 700. The two factors of hidden-layer count and neurons per layer are combined orthogonally into 25 schemes: for example, with 1 hidden layer per single model (3 in the overall model), the neurons per layer are set to 20, 50, 100, 150 and 200 in turn and the results recorded; then with 2 hidden layers per single model (6 in the overall model), the same five neuron counts are used, and so on, giving 25 groups of results. The influence of these two factors is weighed jointly according to the evaluation index, and when the index values are equal the scheme with fewer neurons and hidden layers is chosen. The other influencing factors are compared and verified on the basic model set in step (5): for instance, to compare the influence of the activation function, three comparison experiments are set, the activation function is modified on the basic model, and the results are compared by the evaluation index to select the better-performing function. The prediction performance of the network differs greatly across these choices, so each parameter is analyzed and compared and a suitable parameter setting is selected to form the optimal model, improving the efficiency and reliability of the neural network model. Taking a certain block as an example, the comparison experiments give the optimal settings: two branch networks and one merging network of identical structure, each with 5 hidden layers of 50 neurons and the ReLU activation function; 500 training epochs, no Dropout, batch size 128, and a training:verification ratio of 0.7:0.3.
(9) The training set is trained on the basis of the optimal model obtained in step (8): 70% of the data set is used as the input of the neural network (i.e. the training set), the optimal weights and biases are obtained through the neural network with the three-part structure, and the trained model is then used to predict the remaining 30% test set to check whether the expected standard is met.
(10) The mean absolute error (MAE) and the coefficient of determination R² are used as the evaluation indexes and expectation criteria mentioned in step (8) and step (9). MAE and R² are calculated as follows:

MAE = (1/m) Σ_{i=1}^{m} |y_i − ŷ_i|

R² = 1 − Σ_{i=1}^{m} (y_i − ŷ_i)² / Σ_{i=1}^{m} (y_i − ȳ)²

where y_i denotes the actual value, ŷ_i the predicted value, ȳ the mean of the actual values, and m the number of samples.
A smaller MAE value and a larger R² value indicate a better model prediction. Taking the optimal model of step (8) as an example, the measured R² of the prediction result is 95.7% and the MAE is 179.39; the prediction precision is relatively high and the expected effect is achieved.
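The two indexes can be computed directly from their definitions; a minimal NumPy sketch (the function names are illustrative, not from the patent):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: (1/m) * sum(|y_i - yhat_i|)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

A smaller MAE and an R² closer to 1 indicate a better fit, matching the criterion stated above.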
(11) The influence of different input parameters on the predicted quantity is analyzed; Sobol global sensitivity analysis is used to find the parameters with the largest influence on the result, which facilitates subsequent analysis and model optimization. The Sobol sensitivity analysis method is a variance-based Monte Carlo method. Taking the k-dimensional unit hypercube Ω^k = {x | 0 ≤ x_i ≤ 1; i = 1, 2, ..., k} as the input factor space, the central idea of the Sobol method is to decompose the function f(x) into a sum of sub-terms:

f(x) = f_0 + Σ_i f_i(x_i) + Σ_{i<j} f_ij(x_i, x_j) + ... + f_{1,2,...,k}(x_1, x_2, ..., x_k)
the variance of each order sub-term in the above formula is called each order partial variance, i.e. s order partial variance:
the total variance is equal to the sum of the partial variances of the orders:
defining the sensitivity coefficient of each order as the ratio of the partial variance and the total variance of each order, S-order sensitivity Si1,i2,...,isIs defined as:
where S_i is called the first-order sensitivity coefficient of factor x_i and represents the main effect of x_i on the output; S_ij (i ≠ j) is the second-order sensitivity coefficient, indicating the interaction between two factors. The invention considers only the first-order sensitivity coefficients, and the influence of different input parameters on the predicted content is analyzed and judged according to their values.
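For reference, the first-order indices can be estimated by Monte Carlo with the pick-freeze (Saltelli) scheme. The sketch below assumes independent inputs uniform on [0, 1]^k and is not the patent's own code:

```python
import numpy as np

def sobol_first_order(f, k, n=100_000, seed=None):
    """Estimate the first-order Sobol indices S_i = D_i / D for a model f
    whose k inputs are independent and uniform on [0, 1]."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, k))
    B = rng.random((n, k))
    fA, fB = f(A), f(B)
    total_var = np.var(np.concatenate([fA, fB]))
    S = np.empty(k)
    for i in range(k):
        ABi = A.copy()
        ABi[:, i] = B[:, i]              # resample only factor x_i
        S[i] = np.mean(fB * (f(ABi) - fA)) / total_var
    return S

# Linear test model f(x) = x1 + 2*x2: analytically S = (0.2, 0.8)
S = sobol_first_order(lambda X: X[:, 0] + 2.0 * X[:, 1], k=2, n=200_000, seed=0)
```

For a purely additive model the first-order indices sum to 1; interaction (second- and higher-order) effects make the sum fall below 1.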
In step (1), the oilfield field data and the numerical simulation result data are innovatively combined, so that data of different sources and structures are used for prediction simultaneously, thereby taking the multi-source heterogeneous nature of the data into account.
In step (3), a dimension reduction method is innovatively introduced to extract the feature data, removing data weakly correlated with the result as well as noise data, so that the dimensionality of the sample data is reduced and the calculation is accelerated without impairing the prediction effect.
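A minimal SVD-based sketch of the principal component analysis used in step (3) for dimension reduction (the patent does not prescribe an implementation; the function name and shapes are illustrative):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered samples onto the top principal components and
    report the fraction of variance each component explains."""
    Xc = X - X.mean(axis=0)                    # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)        # variance ratio, descending
    return Xc @ Vt[:n_components].T, explained[:n_components]

# Toy usage: 100 samples with 5 features reduced to 2 components
X = np.random.default_rng(1).random((100, 5))
Z, ratio = pca_reduce(X, 2)
```

In practice the number of retained components would be chosen so that the cumulative explained-variance ratio is high enough that the prediction effect is not impaired.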
In step (6), a multi-branch network model is innovatively introduced, which can handle the multi-input, multi-dimension problem of multi-source heterogeneous data.
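The branch-then-merge idea can be illustrated with a toy forward pass; the random weights stand in for trained branch networks, and all shapes here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: field data and simulation data have different widths
field_batch = rng.random((128, 12))      # branch 1: 12 field features
sim_batch = rng.random((128, 30))        # branch 2: 30 simulation features

def branch(x, out_dim=8):
    """Stand-in for a branch network: map any input width to out_dim."""
    W = rng.standard_normal((x.shape[1], out_dim)) / np.sqrt(x.shape[1])
    return np.maximum(x @ W, 0.0)        # ReLU activation

# Each branch emits the same dimensionality, so the outputs can be
# concatenated and fed to the merging network as one input matrix
merged_input = np.concatenate([branch(field_batch), branch(sim_batch)], axis=1)
print(merged_input.shape)                # (128, 16)
```

This is why the text requires every branch to produce output of the same dimensionality: otherwise the merged input matrix could not be formed.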
In step (7), Xavier weight initialization, batch normalization, the Dropout technique and the Adadelta optimizer are innovatively introduced to address nonlinearity, gradient vanishing, internal covariate shift, overfitting and continually decreasing learning rate, thereby constructing a robust overall model.
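Two of the techniques named above can be written out compactly; a NumPy sketch of Xavier (Glorot) uniform initialization and inverted Dropout, for illustration only:

```python
import numpy as np

def xavier_init(fan_in, fan_out, seed=None):
    """Xavier/Glorot uniform: U(-limit, limit), limit = sqrt(6/(fan_in+fan_out)),
    chosen to keep activation variance stable across layers."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def dropout(a, rate, seed=None):
    """Inverted dropout: zero a fraction `rate` of units at training time and
    rescale survivors by 1/(1-rate) so the expected activation is unchanged."""
    rng = np.random.default_rng(seed)
    mask = rng.random(a.shape) >= rate
    return a * mask / (1.0 - rate)

W = xavier_init(150, 150, seed=0)        # a 150-neuron hidden layer
h = dropout(np.ones(1000), rate=0.5, seed=0)
```

Batch normalization and Adadelta are likewise standard components available in any deep learning framework; they are omitted here for brevity.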
In step (8), the influence of the hyper-parameters on the neural network is creatively considered, and a method for selecting the optimal model is provided.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (7)
1. A dense oil reservoir transformation effect evaluation method based on deep learning, characterized in that the method comprises the following steps:
(1) acquiring a sample data set: acquiring oil reservoir development data from the field, acquiring supplementary data through numerical simulation technology, and unifying data of the same type into the same data format to form a sample data set;
(2) carrying out one-hot encoding on the discrete data and classified data in the sample data set, and mapping them into a Euclidean space;
(3) performing dimensionality reduction on the data: adopting a principal component analysis method to obtain a dimension-reduced feature data set;
(4) dividing the characteristic data set extracted in the step (3) into a training set and a verification set according to different dimensions, and respectively using the training set and the verification set as input sets of a single model;
(5) calculating Mean Square Error (MSE), Mean Absolute Error (MAE) and R-square values under different machine learning models, and selecting a model with the R-square value being more than 95% and the MAE or MSE value being the minimum on the basis as a basic single model;
(6) setting a basic single model into a plurality of branches, respectively putting the input sets of the single models divided in the step (4) into branch models for training, finally obtaining the output with the same dimensionality by each branch, and then combining the outputs of the plurality of branches together to be used as the input data of a final model, thereby constructing an integral model;
(7) performing initialization setting of the weights of the overall model, and addressing the inter-layer correlation, the overfitting problem and the learning rate decay problem;
(8) setting at least two groups of comparison experiments, and analyzing parameters influencing the prediction capability of the overall model to obtain an optimal model;
(9) training the training set on the basis of the optimal model, and obtaining a prediction result by using the test set;
(10) using the mean absolute error and the coefficient of determination R² as evaluation indexes to evaluate the prediction result;
(11) analyzing the influence of different input parameters on the predicted content, and finding out the parameters with larger influence on the result by using Sobol global sensitivity analysis.
2. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: the sample data set in the step (1) comprises continuous data, discrete data and classified data.
3. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: specifically, in the step (2), the discrete data and the classified data are encoded and mapped into binary 0/1 data; a feature is represented with as many dimensions as it has distinct values, with exactly one bit set to 1 and all other bits set to 0.
4. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: the machine learning models in the step (5) comprise, among deep neural network models, a long short-term memory neural network, a convolutional neural network model and a recurrent neural network model.
5. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: an Xavier initialization method is adopted in the step (7).
6. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: two strategies are adopted in the step (8): one is to keep the neural network structure unchanged and compare, by a control variable method on the basic model, the influence of the number of training times, the activation function and the data set division on the neural network; the other is to change the neural network structure and analyze the influence of the number of hidden layers, the number of neurons in each layer and the Dropout technique on the prediction effect of the neural network.
7. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: in the step (9), 70% of the data set is used as the input of the neural network, namely the training set; the optimal weights and biases are obtained through the neural network with the three-part structure; and the trained model is used to predict the remaining 30% serving as the test set, to check whether the expected standard is met.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011403763.2A CN112633328A (en) | 2020-12-04 | 2020-12-04 | Dense oil reservoir transformation effect evaluation method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112633328A true CN112633328A (en) | 2021-04-09 |
Family
ID=75307795
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011403763.2A Pending CN112633328A (en) | 2020-12-04 | 2020-12-04 | Dense oil reservoir transformation effect evaluation method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112633328A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109993223A (en) * | 2019-03-26 | 2019-07-09 | 南京道润交通科技有限公司 | Pavement Condition prediction technique, storage medium, electronic equipment |
CN111651890A (en) * | 2020-06-04 | 2020-09-11 | 中南大学 | Data-driven aluminum electrolysis digital twin factory, control method and system |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113359512A (en) * | 2021-06-26 | 2021-09-07 | 华东交通大学 | Component content digital twinning characteristic analysis method in rare earth extraction separation process |
CN113592194A (en) * | 2021-08-23 | 2021-11-02 | 北京科技大学 | Method for establishing a CO2 huff-and-puff effect prediction model and method for evaluating CO2 huff-and-puff effect |
CN113812851A (en) * | 2021-09-09 | 2021-12-21 | 熊猫智慧水务有限公司 | Water age control system for direct drinking water purification equipment |
KR20230084833A (en) * | 2021-12-06 | 2023-06-13 | 이화여자대학교 산학협력단 | Method and apparatus for estimating reservoir parameters based on physical geometry data |
KR102696829B1 (en) * | 2021-12-06 | 2024-08-21 | 이화여자대학교 산학협력단 | Method and apparatus for estimating reservoir parameters based on physical geometry data |
CN115758084A (en) * | 2022-11-21 | 2023-03-07 | 清华大学 | Deep neural network crack quantification method and device and storage medium |
CN115758084B (en) * | 2022-11-21 | 2023-11-14 | 清华大学 | Deep neural network crack quantification method and device and storage medium |
CN116821694A (en) * | 2023-08-30 | 2023-09-29 | 中国石油大学(华东) | Soil humidity inversion method based on multi-branch neural network and segmented model |
CN116821694B (en) * | 2023-08-30 | 2023-12-01 | 中国石油大学(华东) | Soil humidity inversion method based on multi-branch neural network and segmented model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112633328A (en) | Dense oil reservoir transformation effect evaluation method based on deep learning | |
Otchere et al. | Application of gradient boosting regression model for the evaluation of feature selection techniques in improving reservoir characterisation predictions | |
CN112989708B (en) | Well logging lithology identification method and system based on LSTM neural network | |
CN108399201B (en) | Web user access path prediction method based on recurrent neural network | |
CN107122861B (en) | Gas emission quantity prediction method based on PCA-PSO-ELM | |
CN110674841B (en) | Logging curve identification method based on clustering algorithm | |
CN110083125B (en) | Machine tool thermal error modeling method based on deep learning | |
CN110807544B (en) | Oil field residual oil saturation distribution prediction method based on machine learning | |
Liu et al. | Predictive model for water absorption in sublayers using a Joint Distribution Adaption based XGBoost transfer learning method | |
CN112083498A (en) | Multi-wave earthquake oil and gas reservoir prediction method based on deep neural network | |
CN116933192A (en) | Concrete dam operation key part partition monitoring method and model | |
CN114091333A (en) | Shale gas content artificial intelligence prediction method based on machine learning | |
CN111027249B (en) | Machine learning-based inter-well connectivity evaluation method | |
Li et al. | Prediction of shale gas production by hydraulic fracturing in changning area using machine learning algorithms | |
CN112926251B (en) | Landslide displacement high-precision prediction method based on machine learning | |
CN114114414A (en) | Artificial intelligence prediction method for 'dessert' information of shale reservoir | |
CN113095466A (en) | Algorithm of satisfiability model theoretical solver based on meta-learning model | |
Wang et al. | On the Feasibility of An Ensemble Multi-Fidelity Neural Network for Fast Data Assimilation for Subsurface Flow in Porous Media | |
Li et al. | Lithology classification based on set-valued identification method | |
Koochak et al. | A variability aware GAN for improving spatial representativeness of discrete geobodies | |
CN117009758A (en) | Oil well yield prediction method based on PSO-VMD-LSTM | |
Silva et al. | Generative network-based reduced-order model for prediction, data assimilation and uncertainty quantification | |
Gudmundsdottir et al. | Inferring interwell connectivity in fractured geothermal reservoirs using neural networks | |
CN112149311B (en) | Nonlinear multivariate statistical regression logging curve prediction method based on quantity specification | |
Kumar | Surrogate model for field optimization using beta-VAE based regression |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210409