CN112633328A - Dense oil reservoir transformation effect evaluation method based on deep learning - Google Patents
- Publication number: CN112633328A (application CN202011403763.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- neural network
- training
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2135 — Feature extraction by transforming the feature space, based on approximation criteria, e.g. principal component analysis
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
Abstract
The invention provides a deep-learning-based method for evaluating the transformation effect of tight oil reservoirs, belonging to the technical field of oil reservoir development. The method first acquires oil reservoir development data from the field and supplementary data from numerical simulation, and combines the two into a sample data set. Discrete and categorical data in the sample set are then one-hot encoded and mapped into a Euclidean space. The data are further reduced in dimensionality, and the feature data set is divided by dimension into a training set and a verification set, which serve as input sets. Different machine learning models are compared, a basic single model of identical structure is replicated into several branches, Xavier initialization is applied against the vanishing-gradient problem that easily arises during neural network training, and several groups of comparison experiments are designed to optimize the model; finally, the influence of different input parameters on the predicted quantities is analyzed. The method improves the accuracy and efficiency of prediction.
Description
Technical Field
The invention relates to the technical field of oil reservoir development, and in particular to a deep-learning-based method for evaluating the transformation effect of tight oil reservoirs.
Background
Improving the reservoir through volume-fracturing technology is key to the efficient development of tight oil reservoirs. The existing approach to evaluating the reservoir transformation effect relies mainly on seismic monitoring, but seismic monitoring is expensive and not yet mature enough for accurate prediction in unconventional oil and gas reservoirs. A method that rapidly and accurately predicts the range and degree of the volume-fracturing transformation area therefore helps formulate a reasonable development policy, allows the reservoir to be better understood, and greatly reduces development cost.
Aiming at the difficulty of accurately evaluating the transformation range and effect when a tight oil reservoir undergoes volume-fracturing transformation, the invention provides a transformation-area range prediction method based on machine learning. The system designed by the method can rapidly predict the transformation range and the equivalent permeability of the transformation area for existing volume-fractured horizontal wells in tight reservoirs; it offers high prediction accuracy, good adaptability and fast calculation, and addresses the high cost and long turnaround of predicting the range and effect of the volume-fracturing transformation area.
Disclosure of Invention
The invention aims to provide a deep-learning-based method for evaluating the transformation effect of tight oil reservoirs.
The method comprises the following steps:
(1) acquiring a sample data set: acquiring oil reservoir development data from the site, acquiring supplementary data from a numerical simulation technology, unifying the same type of data into the same data format, and forming a sample data set;
(2) performing one-hot encoding on the discrete data and categorical data in the sample data set and mapping them into a Euclidean space;
(3) and (3) performing dimensionality reduction on the data: performing dimensionality reduction on the data by adopting a principal component analysis method to obtain a characteristic data set subjected to dimensionality reduction;
(4) dividing the feature data set extracted in step (3), by dimension and in a certain proportion, into a training set and a verification set, which serve as input sets of the single models; the division proportion can be set freely, and whether the division is reasonable is judged from the model's prediction accuracy;
(5) calculating the mean square error (MSE), mean absolute error (MAE) and R-squared value under different machine learning models, and selecting as the basic single model the one whose R-squared exceeds 95% and whose MAE or MSE is smallest;
(6) replicating the basic single model into several branches of identical structure, feeding the single-model input sets divided in step (4) into the branch models for training so that each branch produces output of the same dimension, and then combining the branch outputs as the input data of a final model, thereby constructing an overall model;
(7) performing initialization settings for the overall model: weight initialization, inter-layer correlation, the over-fitting problem and the learning-rate decay problem;
(8) setting a plurality of groups of comparison experiments, and analyzing parameters influencing the prediction capability of the integral model to obtain an optimal model;
(9) training the training set on the basis of the optimal model, and obtaining a prediction result by using the test set;
(10) evaluating the prediction result using the mean absolute error and the coefficient of determination R² as evaluation indices;
(11) analyzing the influence of different input parameters on the predicted content, and finding out the parameters with larger influence on the result by using Sobol global sensitivity analysis.
The sample data set in the step (1) comprises continuous data, discrete data and classified data.
In step (2), the discrete and categorical data are encoded into binary 0/1 data: a feature with N possible values is represented by N bits, of which exactly one is set to 1 and the rest are 0.
The machine learning models in step (5) include the deep neural network (DNN) model, the convolutional neural network (CNN) model and the recurrent neural network (RNN) model with its long short-term memory (LSTM) variant.
Step (7) adopts the Xavier initialization method.
Step (8) adopts two strategies: one keeps the neural network structure unchanged and uses the control-variable method to compare, on the basic model, the influence of the number of training epochs, the activation function and the data set division; the other changes the network structure and analyzes the influence of the number of hidden layers, the number of neurons per layer and the Dropout technique on the network's prediction.
In step (9), 70% of the data set is used as the neural network input, i.e. the training set; the optimal weights and biases are obtained through the three structurally identical neural networks, and the trained model then predicts the remaining 30%, the test set, to check whether the expected standard is reached.
The technical scheme of the invention has the following beneficial effects:
1. The method designs a calculation framework based on machine learning algorithms, predicts the transformation-area range from field data and simulated supplementary data, and improves the accuracy and efficiency of prediction.
2. The method provides a feature-data selection method for dimensionality reduction, compressing the data dimension while maintaining prediction accuracy, which greatly saves calculation time and improves calculation efficiency.
3. The method maps discrete and categorical data into a format a computer can process and expands the features, making the data sources richer and the data volume more sufficient.
4. The method introduces a multi-branch neural network and solves the heterogeneity of multi-source data through branch-wise processing.
5. The method compares and analyzes, one by one, all parameters influencing the model, gives a procedure for selecting the optimal model, offers remedies for the weaknesses of neural network models, and improves prediction accuracy.
6. The method can identify the parameters with the greatest influence on the result, which facilitates subsequent analysis and model optimization and has practical significance for field guidance and analysis.
Drawings
FIG. 1 is a process flow chart of the method for evaluating the transformation effect of the tight oil reservoir based on deep learning.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
The invention provides a dense oil reservoir transformation effect evaluation method based on deep learning.
As shown in fig. 1, the method comprises the steps of:
(1) acquiring a sample data set: acquiring oil reservoir development data from the site, acquiring supplementary data from a numerical simulation technology, unifying the same type of data into the same data format, and forming a sample data set;
(2) performing one-hot encoding on the discrete data and categorical data in the sample data set and mapping them into a Euclidean space;
(3) and (3) performing dimensionality reduction on the data: performing dimensionality reduction on the data by adopting a principal component analysis method to obtain a characteristic data set subjected to dimensionality reduction;
(4) dividing the feature data set extracted in step (3), by dimension and in a certain proportion, into a training set and a verification set, which serve as input sets of the single models; the division proportion can be set freely, and whether the division is reasonable is judged from the model's prediction accuracy;
(5) calculating the mean square error (MSE), mean absolute error (MAE) and R-squared value under different machine learning models, and selecting as the basic single model the one whose R-squared exceeds 95% and whose MAE or MSE is smallest;
(6) replicating the basic single model into several branches of identical structure, feeding the single-model input sets divided in step (4) into the branch models for training so that each branch produces output of the same dimension, and then combining the branch outputs as the input data of a final model, thereby constructing an overall model;
(7) performing initialization settings for the overall model: weight initialization, inter-layer correlation, the over-fitting problem and the learning-rate decay problem;
(8) setting a plurality of groups of comparison experiments, and analyzing parameters influencing the prediction capability of the integral model to obtain an optimal model;
(9) training the training set on the basis of the optimal model, and obtaining a prediction result by using the test set;
(10) evaluating the prediction result using the mean absolute error and the coefficient of determination R² as evaluation indices;
(11) analyzing the influence of different input parameters on the predicted content, and finding out the parameters with larger influence on the result by using Sobol global sensitivity analysis.
The following description is given with reference to specific examples.
(1) Oil reservoir development data are obtained from the field and supplementary data from numerical simulation, and the two are combined into a data set. The data fall mainly into numerical, categorical and discrete categories. Taking a certain block as an example, there are 3960 groups of simulation data, from which six production parameters are extracted as sample data: horizontal well length, single-end fracturing cluster number, fracture width, matrix permeability, daily oil production and cumulative oil production. Daily and cumulative oil production carry 1500 days of data in the time dimension, generating 3960 × 1500-dimensional data; the other four parameters are independent of time, so their dimension is 3960 × 1.
(2) The discrete and categorical data in the sample data set are one-hot encoded and mapped into a Euclidean space: a discrete feature with N possible values is represented by N bits, of which exactly one is 1 and the rest are 0, which avoids the problem that classifiers handle attribute data poorly. For example, wellbore direction is categorical data with eight attributes: east, west, south, north, southeast, southwest, northeast and northwest. Under one-hot encoding, east is represented as 10000000, west as 01000000, south as 00100000, north as 00010000, southeast as 00001000, southwest as 00000100, northeast as 00000010 and northwest as 00000001.
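As a minimal sketch (plain NumPy, not the patent's own implementation), the eight-direction encoding above can be reproduced as follows:

```python
import numpy as np

def one_hot(categories, values):
    """Encode categorical values as rows with exactly one bit set to 1."""
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)), dtype=int)
    for row, v in enumerate(values):
        out[row, index[v]] = 1
    return out

# The eight wellbore-direction attributes from the example above.
directions = ["east", "west", "south", "north",
              "southeast", "southwest", "northeast", "northwest"]
encoded = one_hot(directions, ["east", "south"])  # east -> 10000000, south -> 00100000
```

Each categorical feature thus expands into as many binary columns as it has values, which is why one-hot encoding enlarges the feature space.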
(3) Dimensionality reduction is performed on the data, mainly by principal component analysis (PCA). The specific operation is as follows. Let the training data have $m$ rows and $n$ feature dimensions, so that the matrix $X$ is an $m \times n$ matrix.

First the features are normalized (centered). The covariance matrix $C$ of $X$ is then the $n \times n$ matrix

$$C = \frac{1}{m} X^{T} X.$$

Next the eigenvectors of the covariance matrix are computed. From $C$ one obtains $n$ linearly independent non-zero eigenvectors $e_1, e_2, \ldots, e_n$; the feature matrix $E = (e_1, e_2, \ldots, e_n)$ composed of these eigenvectors satisfies

$$E^{T} C E = \Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n),$$

where the elements $\lambda_1, \lambda_2, \ldots, \lambda_n$ on the diagonal are the eigenvalues of the covariance matrix, and the column vector at the corresponding position in $E$ is the unit eigenvector belonging to that eigenvalue.

The features are transformed into the new space by $Z = XU$; in that feature space the features are mutually uncorrelated, and the corresponding covariance matrix is diagonal. Sorting the eigenvalues from large to small and taking the first $k$ eigenvectors as the columns of $U$, the data after the compression transform $Z = XU$ satisfy: $X$ is an $m \times n$ matrix, $U$ is an $n \times k$ matrix, and $Z$ is an $m \times k$ matrix.
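The PCA steps above can be sketched in NumPy (an illustrative reduction; the random 6-feature matrix is a hypothetical stand-in for the reservoir data):

```python
import numpy as np

def pca_reduce(X, k):
    """Center features, eigendecompose the covariance C, keep the top-k components."""
    Xc = X - X.mean(axis=0)               # feature normalization (centering)
    C = Xc.T @ Xc / Xc.shape[0]           # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]     # sort eigenvalues from large to small
    U = eigvecs[:, order[:k]]             # n x k matrix of top-k unit eigenvectors
    return Xc @ U                         # Z = XU, an m x k matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))             # hypothetical 6-feature sample set
Z = pca_reduce(X, 2)
```

By construction the columns of Z are mutually uncorrelated, matching the diagonal covariance stated in the text.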
(4) The extracted feature data set is divided by dimension into a training set and a verification set, which serve as input sets. Taking the simulation data as an example, there are two 3960 × 1500 data sets and four 3960 × 1 data sets. When dividing, the two 1500-dimensional sets and the four 1-dimensional sets form two groups, each split at a training:verification ratio of 0.8:0.2 into x_train_1, y_train_1, x_valid_1, y_valid_1 and x_train_2, y_train_2, x_valid_2, y_valid_2, which serve as input sets for the machine learning models below.
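A possible split routine is sketched below (the variable names follow the text; the zero-filled arrays are placeholders for the real 3960-sample data):

```python
import numpy as np

def split(x, y, train_frac=0.8, seed=0):
    """Shuffle rows consistently, then divide into training and verification sets."""
    idx = np.random.default_rng(seed).permutation(len(x))
    cut = int(len(x) * train_frac)
    return x[idx[:cut]], y[idx[:cut]], x[idx[cut:]], y[idx[cut:]]

# Placeholder arrays with the 3960 x 1500 and 3960 x 1 shapes from the example.
x1, y1 = np.zeros((3960, 1500)), np.zeros((3960, 1))
x_train_1, y_train_1, x_valid_1, y_valid_1 = split(x1, y1, train_frac=0.8)
```

With 3960 samples, an 0.8:0.2 ratio gives exactly 3168 training and 792 verification rows.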
(5) Different machine learning models are compared, mainly the deep neural network (DNN) model, the convolutional neural network (CNN) model, and the recurrent neural network (RNN) model with its long short-term memory (LSTM) variant. The DNN is the most basic neural network structure and has a certain universality; it is currently used for shale-content prediction, porosity prediction, well connectivity and similar tasks, with good results. The CNN is currently the best way to extract features in the image domain; a basic CNN consists of convolution, activation and pooling, and its output is a specific feature space for each image. Unlike traditional feed-forward networks, RNNs introduce directed cycles, and the LSTM is an improved RNN that can handle long-range dependence. Because the data set used in the invention contains no image data and correlates only weakly with time series, the most common DNN model is used. Initially, a single neural network is set to 3 hidden layers with 150 neurons each, no Dropout, 200 training epochs, and the ReLU function as activation.
(6) The basic single model is replicated into several branches of identical structure, which are trained separately on the multi-source heterogeneous data and finally merged into an overall model. Taking the data of step (4) as an example, the two 1500-dimensional data sets are horizontally combined into 3000 dimensions and trained with the basic model of step (5), finally yielding 1-dimensional data; the four 1-dimensional data sets are horizontally combined into 4 dimensions and trained through the same basic network, also yielding a 1-dimensional result. The two 1-dimensional outputs serve as the input of a merging model with the same structure as the branch models, and the final training gives the output of the overall model. The overall model thus contains three structurally identical neural networks and can handle input data of different dimensions.
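The branch-and-merge data flow can be illustrated with a bare NumPy forward pass (untrained random weights and shallower networks than the text's 3-hidden-layer base model, purely to show how the dimensions combine):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def mlp_forward(x, Ws, bs):
    """ReLU MLP forward pass; the last layer is linear."""
    for W, b in zip(Ws[:-1], bs[:-1]):
        x = relu(x @ W + b)
    return x @ Ws[-1] + bs[-1]

def init_mlp(sizes, rng):
    Ws = [rng.normal(0.0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
    bs = [np.zeros(b) for b in sizes[1:]]
    return Ws, bs

rng = np.random.default_rng(1)
branch1 = init_mlp([3000, 150, 1], rng)  # takes the merged 3000-dim time-series data
branch2 = init_mlp([4, 150, 1], rng)     # takes the 4-dim static data
merger = init_mlp([2, 150, 1], rng)      # merging model of the same style of structure

x1 = rng.normal(size=(8, 3000))
x2 = rng.normal(size=(8, 4))
z = np.concatenate([mlp_forward(x1, *branch1),
                    mlp_forward(x2, *branch2)], axis=1)  # combine branch outputs
y = mlp_forward(z, *merger)                              # overall model output
```

Each branch reduces its own input dimension to 1, so the merging model always sees a fixed-width input regardless of the original data shapes.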
(7) The Xavier initialization method is adopted against the vanishing-gradient problem that easily arises during neural network training. Suppose the neural network has $N$ hidden layers, $i$ is the index of a hidden layer, $z^{i}$ is the input of layer $i$, $y^{i}$ its output, and $w^{i}$ and $b^{i}$ its weights and biases; then the feed-forward operation of a standard neural network can be described as

$$z^{i+1}_{j} = \sum_{l} w^{i+1}_{jl}\, y^{i}_{l} + b^{i+1}_{j}, \qquad y^{i+1}_{j} = f\!\left(z^{i+1}_{j}\right),$$

where $f(\cdot)$ is the activation function and $j$ is the index of a hidden-layer neuron.

In practice the numbers of inputs $n_{i}$ and outputs $n_{i+1}$ generally differ, so the weight variance is expressed as

$$\operatorname{Var}\!\left(w^{i}\right) = \frac{2}{n_{i} + n_{i+1}}.$$

On this basis the weights of layer $i$ can be initialized from a Gaussian, $w^{i} \sim N\!\left(0, \tfrac{2}{n_{i}+n_{i+1}}\right)$. The initialization can also be done with a uniform distribution:

$$w^{i} \sim U\!\left[-\frac{\sqrt{6}}{\sqrt{n_{i}+n_{i+1}}},\; \frac{\sqrt{6}}{\sqrt{n_{i}+n_{i+1}}}\right].$$
the method can keep the variance of the activation value and the variance of the state gradient of each layer consistent in the propagation process, thereby solving the problem of gradient disappearance.
During training, changes in the weight parameters of earlier layers shift the inputs of later layers; when the network is deep, the correlation between layers becomes strong, and a small change in an early layer accumulates into a large effect later on. Deep neural networks, however, generally require the input variables to be similarly distributed on the training and test data; if the input distribution changes greatly every time the parameters are updated, the network must continually adapt to the new distribution, making training extremely difficult. A batch normalization method is adopted for this problem: by normalizing, the mean and variance of each layer's inputs are fixed during training, which alleviates internal covariate shift and reduces the dependence of the gradients on the parameters.
The specific operation is as follows: for a mini-batch $B = \{x_1, \ldots, x_m\}$,

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad \sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu_B\right)^2, \qquad \hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta.$$
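The batch-normalization transform can be sketched directly (per-feature statistics over a mini-batch; γ and β default to 1 and 0 here):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then scale by gamma and shift by beta."""
    mu = x.mean(axis=0)                       # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)     # normalized activations
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(3.0, 5.0, size=(128, 10))      # a mini-batch with shifted statistics
y = batch_norm(x)
```

Whatever the incoming mean and variance, the normalized output has mean ≈ 0 and variance ≈ 1 per feature, which is what stabilizes the distribution seen by the next layer.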
for the overfitting problem which is easy to generate by the neural network, the Dropout technology can randomly disable partial neurons, so that the neural network structure trained in each iteration is not completely the same, the synergy among the characteristics is weakened, and the problem is solved. Using Dropout techniques, the neural network feed-forward operation is modified to:
There are many ways to realize the optimization; the Adadelta scheme is adopted here, which avoids the learning rate becoming too small to make effective updates.
The Adadelta optimization scheme is as follows: with decay rate $\rho$, parameter $\theta$ and gradient $g_t$,

$$E[g^2]_t = \rho E[g^2]_{t-1} + (1-\rho)\, g_t^2, \qquad \Delta\theta_t = -\frac{\sqrt{E[\Delta\theta^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}}\, g_t,$$

$$E[\Delta\theta^2]_t = \rho E[\Delta\theta^2]_{t-1} + (1-\rho)\, \Delta\theta_t^2, \qquad \theta_{t+1} = \theta_t + \Delta\theta_t.$$
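A minimal NumPy Adadelta loop on a toy quadratic (illustrative only; the ρ and ε values are typical defaults, not taken from the text):

```python
import numpy as np

def adadelta_minimize(grad, x0, rho=0.95, eps=1e-6, steps=500):
    """Adadelta: step sizes come from running averages; no global learning rate."""
    x = np.asarray(x0, dtype=float).copy()
    eg2 = np.zeros_like(x)    # running average E[g^2]
    edx2 = np.zeros_like(x)   # running average E[dx^2]
    for _ in range(steps):
        g = grad(x)
        eg2 = rho * eg2 + (1 - rho) * g * g
        dx = -np.sqrt(edx2 + eps) / np.sqrt(eg2 + eps) * g
        edx2 = rho * edx2 + (1 - rho) * dx * dx
        x = x + dx
    return x

# Toy objective f(x) = (x - 3)^2, gradient 2(x - 3).
x_after = adadelta_minimize(lambda x: 2.0 * (x - 3.0), np.array([0.0]), steps=200)
```

Note that the update magnitude is governed by the ratio of the two running RMS terms, so no hand-tuned global learning rate appears anywhere in the loop.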
(8) Several groups of comparison experiments are designed to realize model optimization. Two strategies are mainly used: one keeps the network structure unchanged and, on the basic model, compares by the control-variable method the influence of the number of training epochs, the activation function and the data set division; the other changes the network structure and analyzes the influence of the number of hidden layers, the number of neurons per layer and the Dropout technique on the network's prediction. Specifically, the Dropout rate is set to five values of 0, 0.3, 0.4, 0.5 and 0.6; since each layer of the base model has 150 neurons, the raw neuron counts per layer after adding the Dropout operation are 150, 500, 375, 300 and 250, respectively. The activation functions are softplus, ReLU and Linear; the number of hidden layers (per single model) is 1, 2, 3, 4 or 5; the number of neurons per layer is 20, 50, 100, 150 or 200; the data set splits are 0.8:0.2, 0.7:0.3, 0.6:0.4 and 0.5:0.5; and the numbers of training epochs are 200, 300, 400, 500, 600 and 700. The two factors of hidden-layer count and neurons per layer are combined orthogonally into 25 schemes: for example, with 1 hidden layer per single model (3 in the overall model), the neurons per layer are set to 20, 50, 100, 150 and 200 in turn and the results recorded; then with 2 hidden layers per single model (6 in the overall model), the same five neuron counts are used, and so on, giving 25 groups of results. The influence of these two factors is weighed jointly according to the evaluation index, and when the index values are equal the scheme with fewer neurons and hidden layers is chosen. The other influencing factors are compared and verified on the basic model set in step (5): for instance, to compare the influence of the activation function, three comparison experiments are set, the activation function is modified on the basic model, and the results are compared by the evaluation index to select the better-performing function. The prediction performance of the network differs greatly across these choices, so each parameter is analyzed and compared and a suitable parameter setting is selected to form the optimal model, improving the efficiency and reliability of the neural network model. Taking a certain block as an example, the comparison experiments give the optimal settings: two branch networks and one merging network of identical structure, each with 5 hidden layers of 50 neurons and the ReLU activation function; 500 training epochs, no Dropout, batch size 128, and a training:verification ratio of 0.7:0.3.
(9) The training set is trained on the basis of the optimal model obtained in step (8): 70% of the data set is used as the input of the neural network (i.e. the training set), the optimal weights and biases are obtained through the neural network with the three-part structure, and the trained model is then used to predict the remaining 30% test set to check whether the expected standard is met.
(10) The mean absolute error (MAE) and the coefficient of determination R² are used as the evaluation indexes and expectation criteria mentioned in step (8) and step (9). MAE and R² are calculated as follows:

MAE = (1/m) Σ_{i=1}^{m} |y_i − ŷ_i|

R² = 1 − Σ_{i=1}^{m} (y_i − ŷ_i)² / Σ_{i=1}^{m} (y_i − ȳ)²

where y_i denotes the actual value, ŷ_i the predicted value, ȳ the mean of the actual values, and m the number of samples.
A smaller MAE value and a larger R² value indicate a better model prediction. Taking the optimal model of step (8) as an example, the measured R² of the prediction result is 95.7% and the MAE is 179.39; the prediction precision is relatively high and the expected effect is achieved.
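The two indexes can be computed directly from their definitions; a minimal NumPy sketch (the function names are illustrative, not from the patent):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: (1/m) * sum(|y_i - yhat_i|)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

A smaller MAE and an R² closer to 1 indicate a better fit, matching the criterion stated above.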
(11) The influence of different input parameters on the predicted quantity is analyzed; Sobol global sensitivity analysis is used to find the parameters with the largest influence on the result, which facilitates subsequent analysis and model optimization. The Sobol sensitivity analysis method is a variance-based Monte Carlo method. Taking the k-dimensional unit hypercube Ω^k = {x | 0 ≤ x_i ≤ 1; i = 1, 2, ..., k} as the input factor space, the central idea of the Sobol method is to decompose the function f(x) into a sum of sub-terms:

f(x) = f_0 + Σ_i f_i(x_i) + Σ_{i<j} f_ij(x_i, x_j) + ... + f_{1,2,...,k}(x_1, x_2, ..., x_k)
the variance of each order sub-term in the above formula is called each order partial variance, i.e. s order partial variance:
the total variance is equal to the sum of the partial variances of the orders:
defining the sensitivity coefficient of each order as the ratio of the partial variance and the total variance of each order, S-order sensitivity Si1,i2,...,isIs defined as:
where S_i is called the first-order sensitivity coefficient of factor x_i and represents the main effect of x_i on the output; S_ij (i ≠ j) is the second-order sensitivity coefficient, indicating the interaction between two factors. The invention considers only the first-order sensitivity coefficients, and the influence of different input parameters on the predicted content is analyzed and judged according to their values.
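For reference, the first-order indices can be estimated by Monte Carlo with the pick-freeze (Saltelli) scheme. The sketch below assumes independent inputs uniform on [0, 1]^k and is not the patent's own code:

```python
import numpy as np

def sobol_first_order(f, k, n=100_000, seed=None):
    """Estimate the first-order Sobol indices S_i = D_i / D for a model f
    whose k inputs are independent and uniform on [0, 1]."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, k))
    B = rng.random((n, k))
    fA, fB = f(A), f(B)
    total_var = np.var(np.concatenate([fA, fB]))
    S = np.empty(k)
    for i in range(k):
        ABi = A.copy()
        ABi[:, i] = B[:, i]              # resample only factor x_i
        S[i] = np.mean(fB * (f(ABi) - fA)) / total_var
    return S

# Linear test model f(x) = x1 + 2*x2: analytically S = (0.2, 0.8)
S = sobol_first_order(lambda X: X[:, 0] + 2.0 * X[:, 1], k=2, n=200_000, seed=0)
```

For a purely additive model the first-order indices sum to 1; interaction (second- and higher-order) effects make the sum fall below 1.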
In step (1), the oilfield field data and the numerical simulation result data are innovatively combined, so that data of different sources and structures are used for prediction simultaneously, thereby taking the multi-source heterogeneous nature of the data into account.
In step (3), a dimension reduction method is innovatively introduced to extract the feature data, removing data weakly correlated with the result as well as noise data, so that the dimensionality of the sample data is reduced and the calculation is accelerated without impairing the prediction effect.
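A minimal SVD-based sketch of the principal component analysis used in step (3) for dimension reduction (the patent does not prescribe an implementation; the function name and shapes are illustrative):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered samples onto the top principal components and
    report the fraction of variance each component explains."""
    Xc = X - X.mean(axis=0)                    # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)        # variance ratio, descending
    return Xc @ Vt[:n_components].T, explained[:n_components]

# Toy usage: 100 samples with 5 features reduced to 2 components
X = np.random.default_rng(1).random((100, 5))
Z, ratio = pca_reduce(X, 2)
```

In practice the number of retained components would be chosen so that the cumulative explained-variance ratio is high enough that the prediction effect is not impaired.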
In step (6), a multi-branch network model is innovatively introduced, which can handle the multi-input, multi-dimension problem of multi-source heterogeneous data.
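The branch-then-merge idea can be illustrated with a toy forward pass; the random weights stand in for trained branch networks, and all shapes here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: field data and simulation data have different widths
field_batch = rng.random((128, 12))      # branch 1: 12 field features
sim_batch = rng.random((128, 30))        # branch 2: 30 simulation features

def branch(x, out_dim=8):
    """Stand-in for a branch network: map any input width to out_dim."""
    W = rng.standard_normal((x.shape[1], out_dim)) / np.sqrt(x.shape[1])
    return np.maximum(x @ W, 0.0)        # ReLU activation

# Each branch emits the same dimensionality, so the outputs can be
# concatenated and fed to the merging network as one input matrix
merged_input = np.concatenate([branch(field_batch), branch(sim_batch)], axis=1)
print(merged_input.shape)                # (128, 16)
```

This is why the text requires every branch to produce output of the same dimensionality: otherwise the merged input matrix could not be formed.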
In step (7), Xavier weight initialization, batch normalization, the Dropout technique and the Adadelta optimizer are innovatively introduced to address nonlinearity, gradient vanishing, internal covariate shift, overfitting and continually decreasing learning rate, thereby constructing a robust overall model.
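Two of the techniques named above can be written out compactly; a NumPy sketch of Xavier (Glorot) uniform initialization and inverted Dropout, for illustration only:

```python
import numpy as np

def xavier_init(fan_in, fan_out, seed=None):
    """Xavier/Glorot uniform: U(-limit, limit), limit = sqrt(6/(fan_in+fan_out)),
    chosen to keep activation variance stable across layers."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def dropout(a, rate, seed=None):
    """Inverted dropout: zero a fraction `rate` of units at training time and
    rescale survivors by 1/(1-rate) so the expected activation is unchanged."""
    rng = np.random.default_rng(seed)
    mask = rng.random(a.shape) >= rate
    return a * mask / (1.0 - rate)

W = xavier_init(150, 150, seed=0)        # a 150-neuron hidden layer
h = dropout(np.ones(1000), rate=0.5, seed=0)
```

Batch normalization and Adadelta are likewise standard components available in any deep learning framework; they are omitted here for brevity.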
In step (8), the influence of the hyper-parameters on the neural network is creatively considered, and a method for selecting the optimal model is provided.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (7)
1. A dense oil reservoir transformation effect evaluation method based on deep learning, characterized in that the method comprises the following steps:
(1) acquiring a sample data set: acquiring oil reservoir development data from the field, acquiring supplementary data through numerical simulation technology, and unifying data of the same type into the same data format to form a sample data set;
(2) carrying out one-hot encoding on the discrete data and classified data in the sample data set, and mapping them into a Euclidean space;
(3) performing dimensionality reduction on the data: adopting a principal component analysis method to obtain a dimension-reduced feature data set;
(4) dividing the characteristic data set extracted in the step (3) into a training set and a verification set according to different dimensions, and respectively using the training set and the verification set as input sets of a single model;
(5) calculating Mean Square Error (MSE), Mean Absolute Error (MAE) and R-square values under different machine learning models, and selecting a model with the R-square value being more than 95% and the MAE or MSE value being the minimum on the basis as a basic single model;
(6) setting a basic single model into a plurality of branches, respectively putting the input sets of the single models divided in the step (4) into branch models for training, finally obtaining the output with the same dimensionality by each branch, and then combining the outputs of the plurality of branches together to be used as the input data of a final model, thereby constructing an integral model;
(7) performing initialization setting of the weights of the overall model, and addressing the inter-layer correlation, the overfitting problem and the learning rate decay problem;
(8) setting at least two groups of comparison experiments, and analyzing parameters influencing the prediction capability of the overall model to obtain an optimal model;
(9) training the training set on the basis of the optimal model, and obtaining a prediction result by using the test set;
(10) using the mean absolute error and the coefficient of determination R² as evaluation indexes to evaluate the prediction result;
(11) analyzing the influence of different input parameters on the predicted content, and finding out the parameters with larger influence on the result by using Sobol global sensitivity analysis.
2. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: the sample data set in the step (1) comprises continuous data, discrete data and classified data.
3. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: specifically, in the step (2), the discrete data and the classified data are encoded and mapped into binary 0/1 data; a feature is represented with as many dimensions as it has distinct values, with exactly one bit set to 1 and all other bits set to 0.
4. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: the machine learning models in the step (5) comprise, among deep neural network models, a long short-term memory neural network, a convolutional neural network model and a recurrent neural network model.
5. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: an Xavier initialization method is adopted in the step (7).
6. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: two strategies are adopted in the step (8): one is to keep the neural network structure unchanged and compare, by a control variable method on the basic model, the influence of the number of training times, the activation function and the data set division on the neural network; the other is to change the neural network structure and analyze the influence of the number of hidden layers, the number of neurons in each layer and the Dropout technique on the prediction effect of the neural network.
7. The tight reservoir reforming effect evaluation method based on deep learning of claim 1, characterized in that: in the step (9), 70% of the data set is used as the input of the neural network, namely the training set; the optimal weights and biases are obtained through the neural network with the three-part structure; and the trained model is used to predict the remaining 30% serving as the test set, to check whether the expected standard is met.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011403763.2A CN112633328A (en) | 2020-12-04 | 2020-12-04 | Dense oil reservoir transformation effect evaluation method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112633328A true CN112633328A (en) | 2021-04-09 |
Family
ID=75307795
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011403763.2A Pending CN112633328A (en) | 2020-12-04 | 2020-12-04 | Dense oil reservoir transformation effect evaluation method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112633328A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109993223A (en) * | 2019-03-26 | 2019-07-09 | 南京道润交通科技有限公司 | Pavement Condition prediction technique, storage medium, electronic equipment |
CN111651890A (en) * | 2020-06-04 | 2020-09-11 | 中南大学 | Data-driven aluminum electrolysis digital twin factory, control method and system |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113359512A (en) * | 2021-06-26 | 2021-09-07 | 华东交通大学 | Component content digital twinning characteristic analysis method in rare earth extraction separation process |
CN113592194A (en) * | 2021-08-23 | 2021-11-02 | 北京科技大学 | Method for establishing a CO2 huff-and-puff effect prediction model and method for evaluating CO2 huff-and-puff effect |
CN113812851A (en) * | 2021-09-09 | 2021-12-21 | 熊猫智慧水务有限公司 | Water age control system for direct drinking water purification equipment |
KR20230084833A (en) * | 2021-12-06 | 2023-06-13 | 이화여자대학교 산학협력단 | Method and apparatus for estimating reservoir parameters based on physical geometry data |
KR102696829B1 (en) * | 2021-12-06 | 2024-08-21 | 이화여자대학교 산학협력단 | Method and apparatus for estimating reservoir parameters based on physical geometry data |
CN115758084A (en) * | 2022-11-21 | 2023-03-07 | 清华大学 | Deep neural network crack quantification method and device and storage medium |
CN115758084B (en) * | 2022-11-21 | 2023-11-14 | 清华大学 | Deep neural network crack quantification method and device and storage medium |
CN116821694A (en) * | 2023-08-30 | 2023-09-29 | 中国石油大学(华东) | Soil humidity inversion method based on multi-branch neural network and segmented model |
CN116821694B (en) * | 2023-08-30 | 2023-12-01 | 中国石油大学(华东) | Soil humidity inversion method based on multi-branch neural network and segmented model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112633328A (en) | Dense oil reservoir transformation effect evaluation method based on deep learning | |
Otchere et al. | Application of gradient boosting regression model for the evaluation of feature selection techniques in improving reservoir characterisation predictions | |
CN112989708B (en) | Well logging lithology identification method and system based on LSTM neural network | |
CN108399201B (en) | Web user access path prediction method based on recurrent neural network | |
CN107122861B (en) | Gas emission quantity prediction method based on PCA-PSO-ELM | |
CN110674841B (en) | Logging curve identification method based on clustering algorithm | |
CN110083125B (en) | Machine tool thermal error modeling method based on deep learning | |
CN110807544B (en) | Oil field residual oil saturation distribution prediction method based on machine learning | |
Liu et al. | Predictive model for water absorption in sublayers using a Joint Distribution Adaption based XGBoost transfer learning method | |
CN112083498A (en) | Multi-wave earthquake oil and gas reservoir prediction method based on deep neural network | |
CN116933192A (en) | Concrete dam operation key part partition monitoring method and model | |
CN114091333A (en) | Shale gas content artificial intelligence prediction method based on machine learning | |
CN111027249B (en) | Machine learning-based inter-well connectivity evaluation method | |
Li et al. | Prediction of shale gas production by hydraulic fracturing in changning area using machine learning algorithms | |
CN112926251B (en) | Landslide displacement high-precision prediction method based on machine learning | |
CN114114414A (en) | Artificial intelligence prediction method for 'dessert' information of shale reservoir | |
CN113095466A (en) | Algorithm of satisfiability model theoretical solver based on meta-learning model | |
Wang et al. | On the Feasibility of An Ensemble Multi-Fidelity Neural Network for Fast Data Assimilation for Subsurface Flow in Porous Media | |
Li et al. | Lithology classification based on set-valued identification method | |
Koochak et al. | A variability aware GAN for improving spatial representativeness of discrete geobodies | |
CN117009758A (en) | Oil well yield prediction method based on PSO-VMD-LSTM | |
Silva et al. | Generative network-based reduced-order model for prediction, data assimilation and uncertainty quantification | |
Gudmundsdottir et al. | Inferring interwell connectivity in fractured geothermal reservoirs using neural networks | |
CN112149311B (en) | Nonlinear multivariate statistical regression logging curve prediction method based on quantity specification | |
Kumar | Surrogate model for field optimization using beta-VAE based regression |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210409