Disclosure of Invention
Accordingly, it is an object of the present invention to overcome the above-mentioned deficiencies of the prior art and to provide a method of training a model for reconstructing well log data, comprising:
1) carrying out dimensionality reduction on the collected data of the logging attributes;
2) feeding dummy data for the logging attributes, generated by a generating network G based on its current parameters, into a discrimination network D as one input, with the dimensionality-reduction result as the other input, and adjusting the parameters of the generating network G until the discrimination network D judges the dummy data to be really acquired data;
the generation network G is a convolutional neural network, and the dummy data corresponds to a plane coordinate.
Preferably, according to the method, the discrimination network D is a binary classifier that judges whether the difference between the result of the dimensionality reduction and the dummy data is smaller than a set threshold.
Preferably, according to the method, the discrimination network D uses the mean square error to evaluate said difference.
Preferably, according to the method, wherein step 2) comprises:
2-1) carrying out normalization processing on the result of the dimensionality reduction processing;
2-2) when the mean square error is less than 1.5, taking the current parameters of the generating network G as the parameters of a model for reconstructing logging data.
Preferably, according to the method, the generating network G takes plane coordinates as input for generating logging property values corresponding to the plane coordinates based on its current parameters.
Preferably, according to the method, wherein step 2) comprises:
inputting a random value generated from noise into the generating network G as the plane coordinate, so that the generating network G generates, based on its current parameters and the plane coordinate, a logging attribute value corresponding to that coordinate as the dummy data.
Preferably, according to the method, wherein step 1) comprises: reducing the dimension of the data of one logging attribute at different depths based on the PCA algorithm.
Preferably, according to the method, wherein step 1) comprises: down-sampling the data of one logging attribute at different depths.
Preferably, according to the method, the logging attribute of step 1) is selected from the group consisting of: CARB, CLLB, VAC, VAF90, VAT10, VAT20, VAT30, VAT60, VAT90, VCA, VCILD, VGR, VKRO, VKRW, VPERM, VPOR, VSH, VSP, VSPC, VSW, VSWIR, VSXO, DEN, SPC, RT, RM, SW, SOR, POR, PORT.
A method of reconstructing well log data based on a model generated by any one of the methods above, comprising:
1) inputting the plane coordinates of the area to be predicted into the obtained generation network G;
2) generating, by the generation network G, a logging attribute value corresponding to the plane coordinates of the area to be predicted.
A computer-readable storage medium, in which a computer program is stored which, when executed, is adapted to carry out the method of any of the above.
A system for reconstructing a model of well log data, comprising:
a storage device and a processor;
wherein the storage means is adapted to store a computer program which, when executed by the processor, is adapted to carry out the method of any of the above.
Compared with the prior art, the embodiment of the invention has the advantages that:
the neural network model is trained using logging attributes rather than seismic data as samples, which avoids the situation in which suitable seismic data cannot be obtained because mining began early; moreover, logging attributes are relatively less disturbed by noise, so a neural network model trained in this way predicts better. To make the logging attribute data usable for training, the original logging attribute data, whose volume is huge, is subjected to dimensionality reduction, and the portion of the data that is most beneficial for reconstructing the logging data and can express the nonlinear relationship is retained as training samples. The method trains the model for reconstructing logging data with an adversarial network: through the adversarial game, the discrimination network D is made unable to easily distinguish whether a logging attribute value generated by the generation network G is really explored data, so that the resulting generation network G can generate prediction results closest to the real data.
Detailed Description
The logging data in the present invention mainly include various physical parameters recorded for the resource reservoir, such as spontaneous potential, resistivity, acoustic velocity, and rock bulk density, as well as geological information obtained by processing the directly acquired data, such as lithology, shale content, water saturation, and permeability. These data are called logging attributes, and a curve reflecting how a logging attribute changes with depth is called a logging attribute curve.
Fig. 1 shows the logging attribute curve for a certain logging attribute; it can be seen that the information obtained by repeated detection at different depths accurately reflects how the attribute varies over depth.
Typically, well log attributes for oil exploration comprise over 160 attributes which, according to the national standard specifications, should be sampled once every 0.125 meters of depth in order to accurately characterize the curve variation of one well log attribute. It can be understood that, for each log, the values of the various logging attributes at different depths are acquired, so the data volume is huge. For this reason, existing machine-learning-based log data reconstruction solutions use seismic data rather than logging attributes.
Against this background, and addressing the difficulty of training a machine model directly on logging attributes, the inventors propose to perform dimensionality reduction on the logging attributes and to train an adversarial network model with the processed real data. The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
According to one embodiment of the invention, the machine model used is a generative adversarial network (GAN) model, an unsupervised deep learning model comprising a discrimination network D and a generation network G. GAN theory does not require that both the generation network G and the discrimination network D be neural networks; the main principle is to generate data with the generation network G, have the discrimination network D determine whether the data generated by G is real, and, through the adversarial game, obtain a generator whose output is very similar to the real data.
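The adversarial principle just described can be sketched with a deliberately tiny stand-in, not the convolutional networks of the invention: G's parameters are nudged until a simple D-like statistic can no longer separate generated data from real data. The function names, the moment-matching statistic, and the update rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "real" data: samples from a fixed distribution.
real = rng.normal(loc=2.0, scale=0.5, size=1000)

def generator(noise, theta):
    """Toy generator G: maps noise to data using parameters theta = (shift, scale)."""
    return theta[0] + theta[1] * noise

def discriminator(fake, real):
    """Toy stand-in for D: a statistic measuring how distinguishable the
    fake batch is from the real batch (a real GAN trains D jointly)."""
    return abs(fake.mean() - real.mean()) + abs(fake.std() - real.std())

# Adversarial-style loop: adjust G's parameters until D can no longer
# separate generated data from real data.
theta = np.array([0.0, 1.0])
for _ in range(500):
    noise = rng.normal(size=1000)
    fake = generator(noise, theta)
    score = discriminator(fake, real)
    if score < 0.05:          # D can no longer tell fake from real
        break
    # crude parameter update nudging G toward the real statistics
    theta[0] += 0.1 * (real.mean() - fake.mean())
    theta[1] += 0.1 * (real.std() - fake.std())
```

After the loop, `theta` has moved close to the real distribution's shift and scale, which is the sense in which the generator "wins" the game.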
Fig. 2 illustrates an adversarial network model according to one embodiment of the invention. As shown in fig. 2, for the application scenario of predicting logging attributes, the discrimination network D employs a binary classifier that takes as inputs the real data acquired from logging and the false data generated by the generation network G from high-order noise data, and outputs a prediction label of "true" when the difference between the real and false data is smaller than the set threshold, and "false" otherwise.
The following describes how to train a model for reconstructing well log data based on the above adversarial network.
Referring to FIG. 3, in accordance with an embodiment of the present invention, there is provided a method of training a model for reconstructing well log data, comprising:
Step 1, screening the logging attributes of a log. Based on existing national standards, 160 logging attributes may be collected for a single log. The inventors found that 30 logging attributes are particularly beneficial for reconstructing logging data; their curve codes are: CARB, CLLB, VAC, VAF90, VAT10, VAT20, VAT30, VAT60, VAT90, VCA, VCILD, VGR, VKRO, VKRW, VPERM, VPOR, VSH, VSP, VSPC, VSW, VSWIR, VSXO, DEN, SPC, RT, RM, SW, SOR, POR, PORT. Thus, when training the model, these logging attributes may be filtered out of the raw well log data for use in training.
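A minimal sketch of the screening in step 1, assuming the raw log is held as a mapping from curve code to sampled values (the data layout, function name, and sample numbers are illustrative, not part of the invention):

```python
# The 30 curve codes the description identifies as beneficial for reconstruction.
SELECTED = {
    "CARB", "CLLB", "VAC", "VAF90", "VAT10", "VAT20", "VAT30", "VAT60",
    "VAT90", "VCA", "VCILD", "VGR", "VKRO", "VKRW", "VPERM", "VPOR",
    "VSH", "VSP", "VSPC", "VSW", "VSWIR", "VSXO", "DEN", "SPC", "RT",
    "RM", "SW", "SOR", "POR", "PORT",
}

def screen_attributes(raw_log):
    """Keep only the selected logging-attribute curves from a raw log,
    represented here as a dict mapping curve code -> list of samples."""
    return {code: vals for code, vals in raw_log.items() if code in SELECTED}

# Hypothetical raw log with one kept curve and one discarded curve.
raw = {"VGR": [103.03, 120.46], "UNUSED": [1.0, 2.0]}
print(screen_attributes(raw))   # {'VGR': [103.03, 120.46]}
```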
Step 2, selecting data of a plurality of logs, and down-sampling the data of each logging attribute of each log at different depths.
Since the logging attributes of one log only reflect the geological conditions near that log, data from a plurality of logs is preferably used in order to recover the geological characteristics of a wide area. Preferably, the selected logs are evenly distributed.
The reason for down-sampling in this step is that the logging attribute data obtained under the national standard is sampled once every 0.125 m of depth; down-sampling both reduces the amount of data along the depth axis and filters out near-duplicate data between adjacent sampling points. For example, the AC attribute value at the 0.125 m sample point is 119.6 and at the 0.250 m sample point is 121.4, which are very similar. Preferably, down-sampling retains one sample value per 0.5 m of depth.
The amount of logging attribute data can be greatly reduced through down-sampling, which facilitates training of the model.
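Under the stated source interval (0.125 m) and target interval (0.5 m), the down-sampling of step 2 amounts to keeping every fourth sample; a sketch (the function name and list layout are assumptions):

```python
def downsample(samples, src_step=0.125, dst_step=0.5):
    """Keep one sample per dst_step metres from a series sampled every src_step metres."""
    stride = round(dst_step / src_step)   # 4 for 0.125 m -> 0.5 m
    return samples[::stride]

# Sample depths at the 0.125 m national-standard interval.
depths = [901.500, 901.625, 901.750, 901.875, 902.000, 902.125]
print(downsample(depths))   # [901.5, 902.0]
```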
Taking the two attributes VGR and VAC as an example, the result of down-sampling over depth is shown in Table 1:
TABLE 1
Depth (m) | VGR    | VAC    | Keep/Discard | Normalized VGR | Normalized VAC
901.500   | 103.03 | 323.74 | Keep         | 0.3535         | 0.7374
901.625   | 120.46 | 323.65 | Discard      |                |
901.750   | 138.55 | 323.36 | Discard      |                |
901.875   | 146.82 | 322.90 | Discard      |                |
902.000   | 140.99 | 322.46 | Keep         | 0.6066         | 0.4831
902.125   | 125.84 | 322.25 | Discard      |                |
Here, the logging attribute value at every 0.5 m depth interval is retained and the remaining values are discarded. In addition, given that the values of many logging attributes are relatively large and differ between attributes, the retained values may be normalized for subsequent processing: for example, find the maximum value of VGR and take the ratio of the VGR value at the current depth to that maximum as the normalized result. Normalization helps reduce the influence of differing dimensions and data scales on the subsequent training of the adversarial network.
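The normalization just described, dividing each retained value by the curve's maximum, can be sketched as follows (the helper name and toy input are illustrative; the normalized values in Table 1 come from the inventors' full data set and are not reproduced by this toy input):

```python
def normalize(values):
    """Scale retained attribute values by the curve's maximum, as described above."""
    m = max(values)
    return [v / m for v in values]

# Hypothetical retained VGR values at 0.5 m intervals.
print(normalize([103.03, 140.99]))
```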
Step 3, performing dimensionality reduction on each logging attribute obtained above using the Principal Component Analysis (PCA) method.
PCA is a dimensionality reduction algorithm commonly used in image processing; it is very useful for removing the linearly dependent portions of data and finding the linearly independent portions. The core idea of the algorithm is: given a dissimilarity matrix between high-dimensional data points, find corresponding data in a low-dimensional space such that the dissimilarity matrix between the low-dimensional point pairs matches the given high-dimensional one.
Assume that the dissimilarity matrix between pairs of points in the high-dimensional data set X is Σ = (σ_ij)_{n×n}. PCA aims to find low-dimensional data Y = {y_i | i = 1, 2, …, n} such that the distance d_ij between pairs of points in the low-dimensional data set is as close as possible to σ_ij. If the Euclidean metric is adopted:

σ_ij² = d_ij² = ‖x_i − x_j‖² = x_iᵀx_i − 2 x_iᵀx_j + x_jᵀx_j

where x_i and x_j represent the i-th and j-th attribute values in a well log attribute data set, and d_ij represents the distance between the i-th and j-th log attributes.
The data Y of the logging attributes after PCA dimensionality reduction can be calculated as follows:

Y = S^(1/2) · U_d

where S is a diagonal matrix of the data values x_1, x_2, …, x_n of the logging attribute to be processed, arranged in descending order, and U_d is formed from the corresponding d unit vectors e. According to empirical values, d ∈ [0, 30].
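The computation these formulas describe matches classical scaling: form the doubly centered matrix from squared pairwise distances, eigendecompose it, and keep the top d components (Y = S^(1/2) · U_d). A numpy sketch under that reading; the function name and test data are illustrative assumptions:

```python
import numpy as np

def classical_mds(X, d):
    """Classical scaling: find a d-dimensional Y whose pairwise distances
    approximate those of the high-dimensional rows of X (cf. Y = S^(1/2) U_d)."""
    n = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared pairwise distances
    J = np.eye(n) - np.ones((n, n)) / n                   # centering matrix
    B = -0.5 * J @ D2 @ J                                 # doubly centered Gram matrix
    w, U = np.linalg.eigh(B)                              # eigenvalues ascending
    idx = np.argsort(w)[::-1][:d]                         # indices of top-d eigenvalues
    S = np.sqrt(np.clip(w[idx], 0.0, None))               # S^(1/2) on the diagonal
    return U[:, idx] * S                                  # n x d low-dimensional data

# Four coplanar 3-D points: a 2-D embedding preserves their distances exactly.
X = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
Y = classical_mds(X, 2)
```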
Assuming that the values of VAC are distributed between 250 and 350 and the values of VGR are distributed between 50 and 200, the one-dimensional data shown in Table 2 is obtained after PCA processing:
TABLE 2
Well log attribute | Depth (m) | Normalized value
VAC                | 901.500   | 0.5425
VGR                | 902.000   | 0.7248
Steps 1, 2, and 3 above provide three operations for reducing the dimensionality of the logging attribute data, yielding low-dimensional data for subsequent model training.
Step 4, the real logging data obtained through step 3 is input into the discrimination network D; the generation network G takes randomly generated values as plane coordinate positions and generates the logging attribute values corresponding to those positions, i.e., the false data, which are also input into the discrimination network D. The discrimination network D evaluates whether the mean square error between the real data and the false data is smaller than a set threshold. If not, the parameters of the generation network G are adjusted and this step is repeated until the discrimination network D cannot identify whether the false data is real, i.e., the mean square error is smaller than the set threshold; the parameters of the generation network G at that point are taken as the parameters of the trained model.
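Step 4's stopping logic — generate from random coordinates, measure the discrepancy, adjust G, repeat until the error falls below a threshold — can be sketched with a toy linear generator in place of the convolutional G and a plain MSE in place of the trained discriminator. All names, sizes, and the 0.015 toy threshold are assumptions (the description uses 1.5 on normalized data):

```python
import numpy as np

rng = np.random.default_rng(1)

coords = rng.normal(size=(64, 3))                     # random plane-coordinate inputs
true_w = np.array([0.3, -0.2, 0.5])                   # hidden relation for the toy data
real = coords @ true_w + 0.05 * rng.normal(size=64)   # stand-in for real, reduced data

def G(c, w):
    """Toy generator: a linear map in place of the convolutional network."""
    return c @ w

w = np.zeros(3)
mse = np.inf
for _ in range(2000):
    fake = G(coords, w)
    mse = float(np.mean((fake - real) ** 2))  # discrepancy measure used by D
    if mse < 0.015:                           # stop: D can no longer separate them
        break
    grad = 2.0 * coords.T @ (fake - real) / len(real)
    w -= 0.05 * grad                          # adjust G's parameters and iterate
```

The parameters `w` at the break are the toy analogue of taking the generation network G's parameters as the trained model.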
The generation network G adopts a convolutional neural network model. Preferably, the pooling operations in the generation network G are replaced by deconvolution layers; the deconvolution operation essentially applies a filter that reverses the convolution process, and it extracts feature values better than pooling does. The parameters of the deconvolution layers change during the back-propagation of each iteration, subject to the constraints of the parameters of the associated convolution layers; for example, in the reverse of the convolution, the stride and padding pattern of a deconvolution layer change with each iteration.
In this embodiment, the input of the generation network G is a plane coordinate position generated from noise; it may be generated randomly or by a trained neural network model. In each iteration of adjusting the parameters of the generation network G, the generation network G generates a predicted logging attribute value corresponding to its current parameters and the input plane coordinate position.
The discrimination network D is a binary classifier: it outputs a prediction label of "true" when the difference between the real data acquired from logging and the false data generated by the generation network G from high-order noise data is smaller than the set threshold, and otherwise outputs "false". In other embodiments of the present invention, other models can serve as the discrimination network D, as long as the model can identify the difference between the real data and the dummy data.
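The decision rule of D can be sketched as a threshold test on the mean square error; the function name and batches are illustrative, while 1.5 is the threshold the description uses for normalized attributes:

```python
def discriminate(real_batch, fake_batch, threshold=1.5):
    """Binary classifier D: label 'true' when the MSE between the batches
    is below the set threshold, 'false' otherwise."""
    mse = sum((r - f) ** 2 for r, f in zip(real_batch, fake_batch)) / len(real_batch)
    return "true" if mse < threshold else "false"

print(discriminate([0.5, 0.6], [0.4, 0.7]))   # MSE = 0.01 -> "true"
```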
According to an embodiment of the invention, the mean square error (MSE) is used to measure the difference between the real data and the false data. At the initial stage of training the adversarial network, the MSE can reach tens of thousands, and it decreases continuously toward an ideal range as training progresses. When the logging attributes are normalized, if the mean square error between the network-generated data and the original data is smaller than 1.5, training is considered to have reached the expected target; training stops and the parameters of the generation network G are taken as the training result.
Through the above training process, the logging attribute data generated by the generation network G is sufficient to fool the discrimination network D, so that the discrimination network D cannot distinguish whether logging attribute data from the generation network G was collected from a real log. Such a generation network G can then be used to reconstruct well log data for unknown regions without production logs.
When the model obtained by the method is used to reconstruct well log data for an unknown region without production logging, only the generation network G is used; the discrimination network D is not required. The plane coordinates of the area to be predicted are input into the obtained generation network G, which generates the logging attribute values corresponding to those coordinates.
In this embodiment, the neural network model is trained using logging attributes rather than seismic data as samples, which avoids the situation in which suitable seismic data cannot be obtained because mining began early; moreover, logging attributes are relatively less disturbed by noise, so a neural network model trained in this way predicts better. To make the logging attribute data usable for training, the original logging attribute data, whose volume is huge, is subjected to dimensionality reduction, and the portion of the data that is most beneficial for reconstructing the logging data and can express the nonlinear relationship is retained as training samples. The method trains the model for reconstructing logging data with an adversarial network: through the adversarial game, the discrimination network D is made unable to easily distinguish whether a logging attribute value generated by the generation network G is really explored data, so that the resulting generation network G can generate prediction results closest to the real data.
To verify the effect of the method of the present invention, the inventors conducted simulation experiments. FIG. 4a shows the result of predicting the GR logging attribute using the generation network model obtained by the training method of the present invention, where the ordinate is the GR logging attribute value and the abscissa is the depth value. The model in FIG. 4a was trained for 20,000 iterations, the stopping condition being that training ends when the iteration count reaches 20,000. More iterations might yield a slightly better result, but the model had already converged at 20,000 iterations, so training was ended there. FIG. 4b shows the actually detected logging attribute values corresponding to FIG. 4a. For reconstruction of well log data, the reconstructed curve is usually not an exact fit; a good prediction gives values as close to the true values as possible at the corresponding depths. It can be seen that the depths of the peaks and valleys in FIG. 4a substantially match the values at the corresponding depths in FIG. 4b, meaning that, when exploiting petroleum with reference to the prediction results of the present invention, the situation of drilling at a depth where a predicted peak indicates resources but no petroleum is obtained substantially does not occur; the method provided by the present invention is therefore very beneficial for petroleum logging exploration.
To further quantify the prediction effect, the inventors also calculated the signal-to-noise ratio of the predictions made with the generation network model of the present invention and compared it with the conventional MSE algorithm. The signal-to-noise ratio measures the quality of the data reconstruction result through the ratio between the original values and the error of the predicted values (in the present invention, the reconstructed values). The signal-to-noise ratio calculated here is defined as:

SNR = 10 · log₁₀( ‖z‖² / ‖z − ẑ‖² )

where SNR represents the signal-to-noise ratio, z represents the raw logging data, and ẑ represents the reconstructed logging data (i.e., the predicted values).
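A sketch of the SNR computation, assuming the customary reconstruction-quality form 10 · log₁₀(Σz² / Σ(z − ẑ)²) in decibels; the function name and sample data are illustrative:

```python
import math

def snr(z, z_hat):
    """Signal-to-noise ratio of a reconstruction, in dB (assumed form):
    10 * log10( sum(z^2) / sum((z - z_hat)^2) )."""
    signal = sum(v * v for v in z)
    noise = sum((a - b) ** 2 for a, b in zip(z, z_hat))
    return 10.0 * math.log10(signal / noise)

print(round(snr([1.0, 2.0, 3.0], [1.1, 1.9, 3.0]), 2))   # -> 28.45
```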
Based on the above calculation, the conventional MSE algorithm gives a signal-to-noise ratio of 20.42, while the method of the present invention gives 21.73. The generation model obtained by the method thus yields a better, more ideal prediction result. It should be noted that not all steps described in the above embodiments are necessary; those skilled in the art may make appropriate substitutions, replacements, and modifications according to actual needs.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the spirit and scope of the invention as defined in the appended claims.