Semi-supervised learning method for evaluating the oil and gas bearing properties of fracture-cavity reservoirs by fusing static and dynamic characteristics
Technical Field
The invention belongs to the field of reservoir engineering. It studies the main controlling factors of the geological structure of fracture-cavity reservoirs while taking the dynamic data of production wells into account, establishes a relation between the seismic energy half-decay time attribute and the oil and gas bearing properties of the reservoir, analyzes high-efficiency well positions, and provides drilling suggestions to the oil field.
Background
In hydrocarbon reservoir evaluation, improving the interpretation coincidence rate has become one of the most challenging topics in oil and gas exploration. In 2004, Lizongjie et al. evaluated reservoir oil and gas bearing properties by interpreting geophysical information. That approach interprets the oil and gas bearing properties of most reservoirs reasonably well, but it identifies them poorly in low-permeability reservoirs, low-resistivity reservoirs, thin layers, and special lithologies such as limestone, dolomite and fractured mudstone. In 2017, Wenqiangqian used fuzzy clustering fusion analysis to establish a nonlinear relation between fused attributes and reservoir characteristics, and predicted the oil and gas enrichment zones of the target reservoir in the four middle-to-inferior segments of the valley structure. In 2019, Schweiqi et al. determined a suitable time window (P1 area) from the reservoir characteristics of a Carboniferous work area, jointly analyzed the correlation and feature strength of different categories of attributes, selected 4 sensitive attributes, and fitted them by multiple linear regression to obtain a multi-attribute fusion prediction map in which the positions of the favorable zones match well.
In recent years, machine learning has attracted wide attention across many applications, and many domestic scholars have obtained useful results by applying it to oil and gas detection. However, seismic attribute data are strongly nonlinear, very large in volume, and contain redundant information and noise, while traditional methods are weak both at expressing nonlinear structure and at processing high-dimensional data.
In summary, although some understanding of fracture-cavity carbonate reservoir development has been gained in China, no mature theory or matching technology has yet been formed. The invention therefore designs a semi-supervised algorithm that combines autoencoder dimensionality reduction with improved CNN-K-means clustering: the dimensionality reduction removes redundant information and noise from the original data, a K-means algorithm with a memory structure constraint recalculates the cluster centers, and the parameters of the semi-supervised convolutional neural network are trained and adjusted, so that the oil and gas bearing properties of the reservoir can be predicted and evaluated.
Disclosure of Invention
To address the above shortcomings, the invention provides a semi-supervised learning method for evaluating the oil and gas bearing properties of fracture-cavity reservoirs that fuses static and dynamic characteristics.
To solve the above technical problems, the invention adopts the following technical solution:
the semi-supervised learning method for evaluating the oil and gas bearing properties of fracture-cavity reservoirs by fusing static and dynamic characteristics comprises the following steps:
step 1, extracting seismic multi-attribute data from the original seismic data;
step 2, checking the integrity of the seismic attribute data and normalizing the data;
step 3, calculating the cumulative oil production of each well, finding the seismic xline and inline coordinates that match the bottom-hole coordinates of each production well, and outputting the energy half-decay time data at those coordinates as labeled samples;
step 4, reducing the dimensionality of the original data with an autoencoder;
step 5, selecting a local region around the dimension-reduced data as training data, and selecting part of the training data as training expansion samples;
step 6, constructing a CNN-K-means semi-supervised learning network model constrained by a memory structure, wherein the model comprises 4 convolutional layers with 3 × 3 convolution kernels, 4 max-pooling layers and 2 fully connected layers, and classifies through a hybrid Softmax/K-Means classifier; training the model on the training data obtained in the previous step to obtain an initial model, and initializing the cluster centers as $C = \{c_1, c_2, \ldots, c_n\}$, where n is the number of label classes;
step 7, inputting the training expansion samples batch by batch, per epoch, into the CNN-K-means semi-supervised learning network constrained by the memory structure, calculating the similarity between each training expansion sample and the cluster center of every class to obtain the class label of the sample, and adding the sample to the labeled sample set of the corresponding class;
step 8, after a batch of training expansion samples has been added, retraining the network with the expanded sample set and obtaining new cluster centers; judging whether all training expansion samples have been added; if so, going to step 9; if not, updating the cluster centers and returning to step 7;
step 9, saving the network model parameters obtained in the previous step to obtain the CNN-K-means semi-supervised learning network model;
step 10, predicting the work-area data with the CNN-K-means semi-supervised learning network model, and outputting the oil and gas bearing evaluation matrix of the study area.
Further, step 3 specifically comprises the following steps:
step 3.1, reading the production dynamic data and calculating the cumulative oil production; combining the static main controlling factors with the dynamic characteristics, production wells with more than 100,000 tons and fully developed fractures and caves are labeled high-yield wells, production wells with less than 30,000 tons located near a river channel are labeled low-yield wells, and the remaining production wells with 30,000 to 100,000 tons are labeled medium-yield wells;
step 3.2, finding the seismic xline and inline coordinates that match the bottom-hole coordinates of each production well, and outputting the energy half-decay time data of the 20 × 20 traces around those coordinates as labeled samples.
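The labeling rule of step 3.1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the boolean flags for the static criteria, and the choice to return `None` for wells that meet a tonnage threshold but not the matching static criterion are all assumptions, since the text does not say how such wells are handled.

```python
from typing import Optional

HIGH_YIELD_TONS = 100_000  # "more than 100,000 tons" in step 3.1
LOW_YIELD_TONS = 30_000    # "less than 30,000 tons" in step 3.1


def cumulative_production(daily_oil_tons):
    """Sum a well's daily oil production record (tons/day) into cumulative tons."""
    return sum(daily_oil_tons)


def label_well(cumulative_oil_tons: float,
               fracture_cave_fully_developed: bool,
               near_river_channel: bool) -> Optional[str]:
    """Return 'high', 'medium' or 'low'; None when the rule leaves the well unlabeled."""
    if cumulative_oil_tons > HIGH_YIELD_TONS:
        # Dynamic threshold met; the static criterion must hold as well (assumed handling).
        return "high" if fracture_cave_fully_developed else None
    if cumulative_oil_tons < LOW_YIELD_TONS:
        return "low" if near_river_channel else None
    return "medium"
```

Keeping the static flags as explicit arguments makes it visible that a label depends on both the production history and the geological interpretation, which is the fusion the method describes.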
Further, step 4 specifically comprises the following steps:
step 4.1, setting the maximum number of training cycles of the autoencoder, the optimizer type, the input dimension and the output dimension;
step 4.2, inputting a batch of energy half-decay time data, wherein the number of samples is $t = N \times M$, the data set is $X = \{X_1, X_2, \ldots, X_t\}$, and each sample is D-dimensional;
step 4.3, setting the numbers of feature units of the two coding layers to $n_1$ and $n_2$ and those of the two decoding layers to $n_2$ and $n_3$, wherein $n_1 = n_3 = D$ and $n_2 = K$; the coding layer and the decoding layer obtain the output data Z through error back-propagation, and the calculation method is as follows:
in the formula, $W_1$ is the weight coefficient of the input layer; $b_1$ is the bias of the input layer; $W_2$ is the weight coefficient of the hidden layer; $b_2$ is the bias of the hidden layer; Z is the output data of the autoencoder neural network and is D-dimensional, matching the input; f is the sigmoid activation function of the neural network; X is the input energy half-decay time data;
step 4.4, calculating the loss function, whose value equals the mean square error between the input energy half-decay time data and the output data and measures the accuracy of the autoencoder; the loss function is:
$$\mathrm{MSE} = E\left[(Z - X)^2\right]$$
step 4.5, incrementing the iteration count by 1 and adjusting the autoencoder parameters according to the loss value; if the iteration count reaches the preset upper limit or the loss value falls below the preset lower limit, going to step 4.6; otherwise returning to step 4.2;
step 4.6, outputting the dimension-reduced K-dimensional data.
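The formula that step 4.3 refers to is not reproduced in this text. One standard form that is consistent with the symbol definitions given there (weights $W_1$, $W_2$, biases $b_1$, $b_2$, sigmoid activation f) is sketched below. It is an assumption, not a quotation of the original formula, and the symbol H for the K-dimensional hidden code is introduced here for illustration only.

```latex
H = f(W_1 X + b_1), \qquad
Z = f(W_2 H + b_2), \qquad
f(u) = \frac{1}{1 + e^{-u}}
```

Under this reading, H is the dimension-reduced data output in step 4.6, and Z is the network output that the mean square error of step 4.4 compares against X.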
Further, step 5 specifically comprises the following steps:
taking a region of size 20 × K around the dimension-reduced data of the autoencoder as training data, and selecting 20% of the training data as training expansion samples.
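A minimal sketch of the 20% split is given below. The text does not say how the expansion samples are chosen, so the seeded random selection, the function name and its parameters are assumptions made for illustration.

```python
import random


def split_expansion_samples(samples, expansion_fraction=0.2, seed=0):
    """Split samples into (training, training_expansion) lists, preserving input order."""
    if not 0.0 < expansion_fraction < 1.0:
        raise ValueError("expansion_fraction must be between 0 and 1")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # reproducible random selection (assumed)
    n_expansion = round(len(samples) * expansion_fraction)
    expansion_idx = set(indices[:n_expansion])
    training = [s for i, s in enumerate(samples) if i not in expansion_idx]
    expansion = [s for i, s in enumerate(samples) if i in expansion_idx]
    return training, expansion
```

Fixing the seed keeps the split reproducible between runs, which helps when comparing models trained with different hyperparameters.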
Further, step 7 specifically comprises the following steps:
inputting the training expansion samples batch by batch, per epoch, into the CNN-K-means semi-supervised learning network constrained by the memory structure; outputting the n feature parameters $x_1, \ldots, x_n$ of each training expansion sample; calculating the similarity distance between the training expansion sample and the cluster center of each class; and selecting the class of the cluster center with the smallest similarity distance as the class label of the training expansion sample, wherein the similarity distance is calculated as follows:
wherein n is the number of clustering parameters, $x_i$ is a feature parameter of the training expansion sample, and $c_i \in C$.
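The similarity distance formula itself is not reproduced in this text. A common choice that fits the surrounding definitions is the Euclidean distance between the feature vector of a sample and each cluster center, sketched below as an assumption. The superscript $(k)$, which indexes the class whose center is being compared, is notation introduced here.

```latex
d\left(x, c^{(k)}\right) = \sqrt{\sum_{i=1}^{n} \left(x_i - c^{(k)}_i\right)^2},
\qquad
\hat{k} = \arg\min_{k} \; d\left(x, c^{(k)}\right)
```

Here $\hat{k}$ is the class label assigned to the training expansion sample, which matches the rule of selecting the cluster center with the smallest similarity distance.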
Further, step 8 specifically comprises the following steps:
the new cluster center is calculated as follows:
$$C_t = \alpha \times C_{t-1} + (1 - \alpha) \times C_{current}$$
in the formula, $C_{t-1}$ is the cluster center obtained from the previous calculation, $C_{current}$ is the cluster center calculated by the network on the expanded training samples, $C_t$ is the new cluster center, and $\alpha$ is a proportionality coefficient.
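The update rule above is an exponential blend of the previous and the newly computed centers. A minimal sketch, assuming the centers are stored as one list of feature values per class (the function name and data layout are illustrative):

```python
def update_centers(previous_centers, current_centers, alpha=0.3):
    """Apply C_t = alpha * C_{t-1} + (1 - alpha) * C_current per class and per feature."""
    if len(previous_centers) != len(current_centers):
        raise ValueError("both center lists must cover the same classes")
    return [
        [alpha * p + (1.0 - alpha) * c for p, c in zip(prev, cur)]
        for prev, cur in zip(previous_centers, current_centers)
    ]
```

A larger α keeps more of the previous center, so a single noisy batch of pseudo-labeled samples moves the center less; a smaller α lets the centers follow the newly expanded sample set more closely. The default of 0.3 is the value reported in the parameter optimization of the example.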
The invention has the beneficial effects that:
Fracture-cavity reservoirs are strongly heterogeneous, so the oil and gas bearing properties of a study area are very difficult to label accurately and training samples are severely insufficient. Traditional evaluation of the oil and gas bearing properties of fracture-cavity reservoirs relies mainly on seismic data; however, seismic data are high-dimensional, so applying machine learning to them directly makes the computation grow exponentially, and sparse data samples make the model accuracy hard to guarantee. Based on the main controlling factors of the geological structure of the fracture-cavity reservoir, and taking the dynamic data of production wells into account, the invention mines the manifold structure features of the seismic attribute data with an autoencoder. On the basis of the fused static and dynamic data features, it builds an oil and gas bearing evaluation model for fracture-cavity reservoirs using deep-learning convolutional neural network semi-supervised learning. To address the fact that traditional unsupervised machine learning does not use prior information and is easily disturbed by noise, a K-means algorithm with a memory structure constraint is proposed to recalculate the cluster centers, and the parameters of the semi-supervised convolutional neural network are trained and adjusted to predict the oil and gas bearing properties of the reservoir. The method overcomes the noise sensitivity of traditional methods and makes full use of prior information such as geological, seismic and production dynamic data to ensure the accuracy of the model.
The present invention will be described in detail below with reference to the accompanying drawings and examples.
Drawings
FIG. 1 is a diagram of the semi-supervised CNN network architecture;
FIG. 2 is a general flow chart of the evaluation method;
FIG. 3 is the oil and gas bearing evaluation map of the study area;
FIG. 4 is the daily production curve of well TK625;
FIG. 5 is the daily water cut curve of well TK625;
FIG. 6 shows the oil and gas detection result for well TK625;
FIG. 7 is the daily production curve of well TK7-638;
FIG. 8 is the water cut curve of well TK7-638;
FIG. 9 shows the oil and gas detection result for well TK7-638.
Detailed Description
The principles and features of the invention are described below with reference to the drawings. The examples are given for illustration only and are not intended to limit the scope of the invention.
As shown in FIGS. 1 and 2, the method of the invention mainly comprises the following parts:
(I) Data preprocessing
1) Reading the original seismic data, extracting seismic multi-attribute data, and selecting attributes related to oil and gas bearing properties, such as the energy half-decay time, whose lateral variation characterizes amplitude anomalies caused by fluid content, unconformities and the like;
2) Checking the integrity of the seismic attribute data and normalizing the data;
3) Reading the production dynamic data and calculating the cumulative oil production; combining static main controlling factors such as the geology and structure of the fracture-cavity reservoir with the dynamic characteristics, production wells with more than 100,000 tons and fully developed fractures and caves are labeled high-yield wells, production wells with less than 30,000 tons located near a river channel are labeled low-yield wells, and the remaining production wells with 30,000 to 100,000 tons are labeled medium-yield wells;
4) Finding the seismic xline and inline coordinates that match the bottom-hole coordinates of each production well, and outputting the energy half-decay time data of the 20 traces near those coordinates as labeled samples.
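Items 2) and 4) above can be sketched as follows. The normalization scheme is not named in the text, so min-max scaling is an assumption; the square window, the nested-list layout `volume[xline][inline]` and the edge handling (shifting the window inward near the survey boundary) are likewise illustrative choices.

```python
def min_max_normalize(values):
    """Scale a flat list of attribute values to [0, 1]; constant input maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def extract_window(volume, xline, inline, size=20):
    """Return a size x size block of traces around the well location.

    volume[x][i] is the trace (a list of samples) at xline index x and inline index i.
    """
    half = size // 2
    n_x, n_i = len(volume), len(volume[0])
    # Shift the window inward when the well sits close to the survey edge.
    x0 = min(max(xline - half, 0), n_x - size)
    i0 = min(max(inline - half, 0), n_i - size)
    if x0 < 0 or i0 < 0:
        raise ValueError("survey is smaller than the requested window")
    return [row[i0:i0 + size] for row in volume[x0:x0 + size]]
```

Each extracted block inherits the class label of the well it surrounds, which is how a small number of wells can yield labeled seismic samples.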
(II) Autoencoder-based seismic data dimensionality reduction module
The seismic data are high-dimensional and the data samples are sparse, which makes the model accuracy hard to guarantee; dimensionality reduction extracts the main components of the data and removes redundant information. The original energy half-decay time data are seismic data of size N × M × D, where N is the number of xlines, M is the number of inlines and D is the depth dimension. An autoencoder reduces the dimensionality to obtain data of size N × M × K, where K is the depth dimension after reduction. The specific steps are as follows:
1) Setting the maximum number of training cycles of the autoencoder, the optimizer type, the input dimension and the output dimension.
2) Inputting the energy half-decay time data, wherein the number of samples is $t = N \times M$, the data set is $X = \{X_1, X_2, \ldots, X_t\}$, and each sample is D-dimensional.
3) Setting the numbers of feature units of the two coding layers to $n_1$ and $n_2$ and those of the two decoding layers to $n_2$ and $n_3$, wherein $n_1 = n_3 = D$ and $n_2 = K$. The coding layer and the decoding layer obtain the output data Z through error back-propagation, and the calculation method is as follows:
in the formula, $W_1$ is the weight coefficient of the input layer; $b_1$ is the bias of the input layer; $W_2$ is the weight coefficient of the hidden layer; $b_2$ is the bias of the hidden layer; Z is the output data of the autoencoder neural network and is D-dimensional; f is the sigmoid activation function of the neural network; X is the input energy half-decay time data.
4) Calculating the loss function. The loss value equals the mean square error between the input energy half-decay time data and the output data and measures the accuracy of the autoencoder:
$$\mathrm{MSE} = E\left[(Z - X)^2\right] \tag{2}$$
5) Iteratively training the autoencoder until the preset upper limit on the number of cycles is reached or the loss value falls below the preset lower limit.
6) Outputting the dimension-reduced K-dimensional data.
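The six steps above can be sketched with a very small D → K → D autoencoder in plain Python. This is an illustrative sketch under stated assumptions, not the implementation used in the experiments (which the document says ran on TensorFlow 1.8): the class and function names, the plain stochastic gradient descent update, the weight initialization range and the default hyperparameters are all chosen here for illustration.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class TinyAutoencoder:
    """D -> K -> D autoencoder with sigmoid activations, trained by plain SGD."""

    def __init__(self, d, k, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(k)]
        self.b1 = [0.0] * k
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(d)]
        self.b2 = [0.0] * d

    def encode(self, x):
        """Return the K-dimensional code of a D-dimensional input."""
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, h):
        """Return the D-dimensional reconstruction of a K-dimensional code."""
        return [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
                for row, b in zip(self.w2, self.b2)]

    def train_step(self, x, lr):
        """One SGD update on a single sample; returns the loss before the update."""
        h = self.encode(x)
        z = self.decode(h)
        d = len(x)
        # Gradient of mean((z - x)^2) with respect to the decoder pre-activations.
        dz = [2.0 * (zj - xj) / d * zj * (1.0 - zj) for zj, xj in zip(z, x)]
        # Back-propagate to the encoder pre-activations before w2 is modified.
        dh = [sum(dz[j] * self.w2[j][i] for j in range(d)) * h[i] * (1.0 - h[i])
              for i in range(len(h))]
        for j in range(d):
            for i in range(len(h)):
                self.w2[j][i] -= lr * dz[j] * h[i]
            self.b2[j] -= lr * dz[j]
        for i in range(len(h)):
            for m in range(d):
                self.w1[i][m] -= lr * dh[i] * x[m]
            self.b1[i] -= lr * dh[i]
        return sum((zj - xj) ** 2 for zj, xj in zip(z, x)) / d


def fit(model, data, max_epochs=200, lr=0.5, loss_floor=1e-4):
    """Train until max_epochs is reached or the mean loss drops below loss_floor."""
    history = []
    for _ in range(max_epochs):
        loss = sum(model.train_step(x, lr) for x in data) / len(data)
        history.append(loss)
        if loss < loss_floor:
            break
    return history
```

After training, `model.encode(x)` gives the K-dimensional representation of a D-dimensional trace, corresponding to the output of step 6). Inputs are expected to be normalized to [0, 1], because the sigmoid output layer cannot reconstruct values outside that range.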
(III) CNN-K-means semi-supervised learning algorithm module with memory structure constraint
Fracture-cavity reservoirs are strongly heterogeneous, the study area is wide and wells are few, so the oil and gas bearing properties are difficult to label accurately and training samples are severely insufficient. To address the tendency of a convolutional neural network to overfit when labeled samples are few, the manifold structure features that the autoencoder extracts from the energy half-decay time data are studied, well labels are calibrated from static and dynamic data, and a CNN-K-means semi-supervised learning algorithm constrained by a memory structure is proposed to train a network model that learns the mapping between different reservoir types and those manifold structure features, yielding the oil and gas bearing evaluation model. The specific steps are as follows:
1) Taking a region of size 20 × K around the dimension-reduced data of the autoencoder corresponding to each well label as training data, where K is the number of channels of the dimension-reduced feature matrix; 20% of the data are set aside as training expansion samples, and the remaining data are used as test samples.
2) Setting the maximum number of network training iterations, the well class labels and the learning rate α, and initializing the parameters of the CNN-K-means semi-supervised learning network constrained by the memory structure (shown in FIG. 1);
3) Inputting the training samples and constructing the CNN-K-means semi-supervised learning network constrained by the memory structure; the model comprises 4 convolutional layers with 3 × 3 convolution kernels, 4 max-pooling layers and 2 fully connected layers, and classifies through a hybrid Softmax/K-Means classifier; training yields the initial model, and the cluster centers are initialized as $C = \{c_1, c_2, \ldots, c_n\}$, where n is the number of label classes;
4) Inputting the training expansion samples batch by batch, per epoch, calculating the similarity between each training expansion sample and the cluster center of every class, and assigning each sample to the class with the highest similarity, wherein the similarity evaluation function is as follows:
in the formula, n is the number of clustering parameters, x is the output feature parameter of an unlabeled sample after it passes through the network, and c is the cluster center calculated in 3);
5) Retraining the convolutional neural network to obtain the current cluster center $C_{current}$, and updating the neural network parameters and the cluster center. To reduce the influence of noise on the cluster center when the training samples are updated, a cluster center update method controlled by a memory structure is proposed:
$$C_t = \alpha \times C_{t-1} + (1 - \alpha) \times C_{current} \tag{4}$$
in the formula, $C_{t-1}$ is the cluster center obtained from the previous calculation, $C_{current}$ is the cluster center calculated by the network on the expanded training samples, $C_t$ is the final cluster center, and $\alpha$ is a proportionality coefficient.
6) Repeating steps 4) and 5) until all training expansion samples have been assigned label classes, and saving the network parameters of the CNN-K-means semi-supervised learning algorithm;
7) Predicting the test samples with the network model obtained in step 6) to obtain the oil and gas bearing evaluation result, as shown in FIG. 3.
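The pseudo-labeling loop of steps 4) to 6) can be sketched without the network itself. In this simplified illustration the feature vectors are assumed to be given (in the method they come from the CNN), retraining the CNN is replaced by recomputing the class means, and the Euclidean distance is an assumed similarity measure; all function names are hypothetical.

```python
import math


def distance(x, c):
    """Euclidean distance between a feature vector and a center (assumed measure)."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))


def class_means(labeled):
    """Mean feature vector per class; `labeled` maps class -> list of feature vectors."""
    return {
        label: [sum(col) / len(vectors) for col in zip(*vectors)]
        for label, vectors in labeled.items()
    }


def expand_labels(labeled, expansion_batches, alpha=0.3):
    """Pseudo-label each batch by nearest center, then blend old and new centers."""
    labeled = {label: list(vectors) for label, vectors in labeled.items()}
    centers = class_means(labeled)
    for batch in expansion_batches:
        for x in batch:
            nearest = min(centers, key=lambda label: distance(x, centers[label]))
            labeled[nearest].append(x)
        # Stand-in for retraining the CNN: the current centers are recomputed directly.
        current = class_means(labeled)
        # Memory-constrained update: C_t = alpha * C_{t-1} + (1 - alpha) * C_current.
        centers = {
            label: [alpha * p + (1.0 - alpha) * c
                    for p, c in zip(centers[label], current[label])]
            for label in centers
        }
    return labeled, centers
```

The function copies its input before adding samples, so the original labeled set stays available for comparison after the expansion.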
Examples
(1) General overview of geological structures
The fracture-cavity carbonate reservoir of the Tahe oilfield consists mainly of dissolution caves, has diverse reservoir types and is strongly heterogeneous. At present, the main oil-producing layer of the oilfield is the Ordovician fracture-cavity carbonate formation. The experimental subject of this study is the production dynamic data of 222 production wells over 15 years (2001 to 2015) in the 67 area of the Tahe oilfield. Multiple attribute data closely related to oil and gas were extracted from the original seismic data.
(2) Software and hardware environment
The experimental software and hardware configuration is shown in Table 3.1. The operating system is Windows 10 (64-bit); the processor is an Intel Core i5-6300HQ at 2.30 GHz; the memory is 12 GB and the hard disk is 480 GB; the deep learning platform is Google's open-source framework TensorFlow 1.8; the programming language is Python 3.6.
Table 3.1 Experimental environment
(3) Experimental data
The experimental data comprise the energy half-decay time seismic data for the 70 ms below the T74 horizon in the 67 area of the Tahe oilfield and the production data of all wells in that area. The area has 222 production wells in total. Some wells were shut in frequently and lack enough production data to form effective samples, so the dynamic data of production wells with complete records were selected as the research objects. Forty-five wells, 15 each of high-yield, medium-yield and low-yield wells, were taken as samples, and the training data were obtained with a sliding window.
(4) Parameter optimization
The semi-supervised learning model for reservoir oil and gas bearing prediction based on a convolutional neural network has two kinds of parameters to adjust. One kind is adjusted automatically by back-propagation during training, mainly the weight coefficients and biases of the connections of the convolutional neural network. The other kind is the hyperparameters, which must be set manually before training and which most strongly affect the prediction accuracy of the model: the loss function, optimization algorithm, output-layer activation function, learning rate, convolution kernel size, number of output features of the fully connected layers, number of hidden layers, number of hidden-layer nodes and number of iterations. The model comprises 4 convolutional layers with 3 × 3 convolution kernels, 4 max-pooling layers and 2 fully connected layers, and classifies through a hybrid Softmax/K-Means classifier. The Adam optimizer is selected, cross entropy is selected as the loss function, a 6-layer network is selected, the number of features of the fully connected layers is tuned, 20000 iterations are used, and the proportionality coefficient α of formula (4) is set to 0.3.
Table 1 Network parameters
(5) Example analysis
1) TK625 well oil-gas-containing property prediction analysis
Well TK625 began production in February 2002. From initial production until November 2003 the oil rate was high and the water cut was low; from November 2003 to September 2008 the water cut rose gradually. The well has a cumulative oil production of 185468 tons and an average daily production of 39.3 tons.
The model predicts that well TK625 lies in a yellow region, which is a favorable reservoir zone. With a cumulative oil production of 185468 tons, it is a high-yield well, so the prediction agrees with actual production, as shown in FIGS. 4 to 6.
2) TK7-638 well oil and gas containing property prediction analysis
Well TK7-638 began production in October 2001. It belongs to a tiny isolated cavity: production was high for the first 10 days and then fell sharply. By April 2007 the water cut was high and the well was shut in. The well has a cumulative oil production of 5868.8 tons and an average daily production of 4.09 tons.
The model predicts that well TK7-638 lies in a blue region, which is an unfavorable reservoir zone. With a cumulative oil production of 5868.8 tons, it is a low-yield well, so the prediction agrees with actual production, as shown in FIGS. 7 to 9.
The foregoing describes the best mode of the invention; details not described herein belong to the common general knowledge of a person of ordinary skill in the art. The scope of protection of the invention is defined by the appended claims, and any equivalent modification based on the technical teaching of the invention also falls within that scope.