Disclosure of Invention
To solve the above problems in the prior art, namely the low prediction accuracy and poor robustness of existing methods for predicting the total organic carbon (TOC) content of shale, a first aspect of the present invention provides a method for predicting shale total organic carbon content based on a deep encoder-decoder network, the method comprising:
Step S100, obtaining logging curves based on logging data of a well location to be predicted in a target shale layer; the logging curves comprise radioactive uranium, radioactive thorium, radioactive potassium, acoustic transit time, compensated neutron, density, litho-density, deep laterolog resistivity, shallow laterolog resistivity and gamma ray logging curves;
Step S200, preprocessing the logging curves to obtain preprocessed logging curves;
Step S300, windowing the preprocessed logging curves and inputting the windowed logging curves into a pre-trained TOC prediction model to obtain the TOC at the window center point as the prediction result of the total organic carbon content at the corresponding depth of the well location to be predicted in the target shale layer;
wherein the TOC prediction model is constructed on the basis of an improved 1D U-Net network; when the same-scale features of the encoder and the decoder in the improved 1D U-Net network are joined by a cross-level (skip) connection, the encoder features are concatenated with the decoder features only after convolution and residual connection processing; in addition, the features output by each decoder are flattened and fused, and the TOC predicted value at the window center point is output through a fully connected layer.
In some preferred embodiments, the preprocessing includes removal of outlier data points and data normalization.
In some preferred embodiments, the TOC prediction model is trained by:
Step A10, collecting logging data of training well locations in the target shale layer and the corresponding TOC content data; obtaining logging curves based on the logging data, and taking the TOC content data as ground-truth labels;
Step A20, preprocessing the logging curves and the corresponding TOC content data, and dividing the preprocessed logging curves into a training set and a verification set according to a set proportion; the logging curves in the training set and their corresponding ground-truth TOC labels are used as training samples; the logging curves in the verification set and their corresponding ground-truth TOC labels are used as verification samples;
Step A30, windowing the preprocessed training samples and inputting the windowed training samples into a pre-constructed TOC prediction model to obtain the TOC at the window center point as the prediction result of the total organic carbon content at the corresponding depth of the target shale layer;
Step A40, calculating a loss value based on the prediction result and the corresponding ground-truth TOC label, and updating the model parameters of the TOC prediction model based on the loss value;
Step A50, in each training period, inputting the windowed verification samples into the TOC prediction model to obtain the prediction results of the total organic carbon content at the corresponding depths of the target shale layer, and computing the mean square error loss against the ground-truth TOC labels of the verification samples; when the mean square error loss is less than a set threshold, model training ends; otherwise, the method proceeds to step A60;
Step A60, repeating steps A10 to A40 until a set cutoff condition is reached, thereby obtaining the trained TOC prediction model, wherein the cutoff condition comprises the number of training iterations and the model accuracy.
In some preferred embodiments, step A60 is followed by a sensitivity analysis of the logging curves:
inputting the windowed verification samples again into the TOC prediction model trained in step A60 to obtain the prediction result corresponding to each logging curve as a first result; the prediction result is the predicted total organic carbon content at the corresponding depth of the target shale layer;
performing gradient backpropagation on the first result to obtain the response value of each logging curve to the TOC prediction model, sorting the response values in descending order, and selecting the logging curves corresponding to the first N response values as the logging curves sensitive to TOC content prediction;
or
inputting the windowed training samples of each logging curve into a corresponding pre-constructed TOC prediction model with 1-dimensional input, and training each of these models;
after training is finished, obtaining the prediction result corresponding to each logging curve through its 1-dimensional-input TOC prediction model and evaluating its accuracy as a second result;
and sorting the second results in descending order, and selecting the logging curves corresponding to the first N second results as the logging curves sensitive to TOC content prediction.
In some preferred embodiments, when the same-scale features of the encoder and the decoder in the improved 1D U-Net network are joined by a cross-level (skip) connection, the concatenation after convolution and residual connection processing is performed as follows:
processing the features encoded by an encoding module in the encoder through N sequentially connected residual convolution modules, and concatenating the processed features with the same-scale features in the decoder as the input of the corresponding decoding module in the decoder;
the residual convolution module comprises a convolution layer and a residual layer (the kernel sizes of both layers are given by formula images that are not reproduced in this text); the residual convolution module performs convolution and residual processing on the input features separately and adds the two results.
In some preferred embodiments, the preprocessed logging curves are windowed according to a formula (not reproduced in this text) in which one symbol denotes the size of the data window and another indexes the individual logging curves.
In some preferred embodiments, the features output by each decoder are flattened and fused as follows: the multi-dimensional features output by each decoder are converted into one-dimensional vectors, which are then concatenated and fused.
In a second aspect of the present invention, a system for predicting shale total organic carbon content based on a deep encoder-decoder network is provided, the system comprising a data acquisition module, a preprocessing module and a prediction module;
the data acquisition module is configured to obtain logging curves based on logging data of a well location to be predicted in a target shale layer; the logging curves comprise radioactive uranium, radioactive thorium, radioactive potassium, acoustic transit time, compensated neutron, density, litho-density, deep laterolog resistivity, shallow laterolog resistivity and gamma ray logging curves;
the preprocessing module is configured to preprocess the logging curves to obtain preprocessed logging curves;
the prediction module is configured to window the preprocessed logging curves and input the windowed logging curves into a pre-trained TOC prediction model to obtain the TOC at the window center point as the prediction result of the total organic carbon content at the corresponding depth of the well location to be predicted in the target shale layer;
wherein the TOC prediction model is constructed on the basis of an improved 1D U-Net network; when the same-scale features of the encoder and the decoder in the improved 1D U-Net network are joined by a cross-level (skip) connection, the encoder features are concatenated with the decoder features only after convolution and residual connection processing; in addition, the features output by each decoder are flattened and fused, and the TOC predicted value at the window center point is output through a fully connected layer.
In a third aspect of the invention, an electronic device is provided, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the processor, the instructions being used to implement the above method for predicting shale total organic carbon content based on a deep encoder-decoder network.
In a fourth aspect of the present invention, a computer-readable storage medium is provided, the computer-readable storage medium storing computer instructions to be executed by a computer to implement the above method for predicting shale total organic carbon content based on a deep encoder-decoder network.
The beneficial effects of the invention are as follows:
The invention improves the accuracy and robustness of predicting the total organic carbon content of the shale layer.
1) The TOC curve is predicted by an improved deep-learning U-Net model. The U-Net model is data-driven: it can extract high-level abstract patterns and mine nonlinear functional mappings, which removes the limitation imposed by manually specified model forms. Its multi-level, multi-scale modeling capability allows multivariate, multi-scale nonlinear features to be fused for comprehensive prediction, so that the TOC prediction pattern is learned automatically and in a distributed manner from all logging parameters, improving the accuracy of TOC prediction;
2) The invention infers the relevance between multiple logging parameters and TOC, either by gradient backpropagation through the trained improved U-Net model or by analyzing the modeling effect of different combinations of logging curves as input, thereby identifying the factors of variation, ignoring irrelevant factors, and achieving better robustness and interpretability.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
The method of the invention for predicting shale total organic carbon content based on a deep encoder-decoder network comprises the following steps:
Step S100, obtaining logging curves based on logging data of a well location to be predicted in a target shale layer; the logging curves comprise radioactive uranium, radioactive thorium, radioactive potassium, acoustic transit time, compensated neutron, density, litho-density, deep laterolog resistivity, shallow laterolog resistivity and gamma ray logging curves;
Step S200, preprocessing the logging curves to obtain preprocessed logging curves;
Step S300, windowing the preprocessed logging curves and inputting the windowed logging curves into a pre-trained TOC prediction model to obtain the TOC at the window center point as the prediction result of the total organic carbon content at the corresponding depth of the well location to be predicted in the target shale layer;
wherein the TOC prediction model is constructed on the basis of an improved 1D U-Net network; when the same-scale features of the encoder and the decoder in the improved 1D U-Net network are joined by a cross-level (skip) connection, the encoder features are concatenated with the decoder features only after convolution and residual connection processing; in addition, the features output by each decoder are flattened and fused, and the TOC predicted value at the window center point is output through a fully connected layer.
To describe the method for predicting shale total organic carbon content based on the deep encoder-decoder network more clearly, the steps of an embodiment of the method are described in detail below with reference to the accompanying drawings.
In the following embodiments, the training process of the TOC prediction model is explained first, and then the process of obtaining a prediction result with the method for predicting shale total organic carbon content based on the deep encoder-decoder network is described in detail.
1. Training of the TOC prediction model, as shown in FIG. 3
Step A10, collecting logging data of training well locations in the target shale layer and the corresponding TOC content data; obtaining logging curves based on the logging data, and taking the TOC content data as ground-truth labels;
In this embodiment, logging data and TOC content data are collected from a plurality of well locations in the target shale layer; the TOC content data are obtained by laboratory analysis of the total organic carbon content of cores taken during drilling and are used as ground-truth labels. Logging curves are obtained from the logging data. In the invention, 10 logging curves are preferably selected for training the TOC prediction model, namely the radioactive uranium, radioactive thorium, radioactive potassium, acoustic transit time, compensated neutron, density, litho-density, deep laterolog resistivity, shallow laterolog resistivity and gamma ray logging curves.
Step A20, preprocessing the logging curves and the corresponding TOC content data, and dividing the preprocessed logging curves into a training set and a verification set according to a set proportion; the logging curves in the training set and their corresponding ground-truth TOC labels are used as training samples; the logging curves in the verification set and their corresponding ground-truth TOC labels are used as verification samples;
In this embodiment, the preprocessing preferably includes removing outlier data points caused by instrument response and the like, and normalizing the data.
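Step A20 can be illustrated with a minimal sketch. The text does not specify the outlier rule, the normalization formula, or the split proportion, so the median-absolute-deviation rule, the min-max scaling, the 0.8 training fraction, and all function names below are assumptions made for illustration only.

```python
from statistics import median


def remove_outliers(rows, labels, k=5.0):
    """Drop depth samples in which any curve lies more than k scaled MADs
    from that curve's median (the outlier rule itself is an assumption)."""
    n_curves = len(rows[0])
    centers, scales = [], []
    for c in range(n_curves):
        column = [row[c] for row in rows]
        m = median(column)
        mad = median([abs(v - m) for v in column])
        centers.append(m)
        # A zero MAD disables the check for that curve instead of dividing by zero.
        scales.append(1.4826 * mad if mad > 0 else float("inf"))
    keep = [i for i, row in enumerate(rows)
            if all(abs(row[c] - centers[c]) <= k * scales[c]
                   for c in range(n_curves))]
    return [rows[i] for i in keep], [labels[i] for i in keep]


def min_max_normalize(rows):
    """Scale every curve to [0, 1]; a constant curve maps to 0."""
    n_curves = len(rows[0])
    lo = [min(row[c] for row in rows) for c in range(n_curves)]
    hi = [max(row[c] for row in rows) for c in range(n_curves)]
    return [[(row[c] - lo[c]) / (hi[c] - lo[c]) if hi[c] > lo[c] else 0.0
             for c in range(n_curves)] for row in rows]


def split_by_proportion(rows, labels, train_fraction=0.8):
    """Divide samples into a training set and a verification set."""
    cut = int(len(rows) * train_fraction)
    return (rows[:cut], labels[:cut]), (rows[cut:], labels[cut:])
```

Each row holds the values of all logging curves at one depth sample, and the label list holds the core-measured TOC at the same depths, so filtering keeps curves and labels aligned.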
Step A30, windowing the preprocessed training samples and inputting the windowed training samples into a pre-constructed TOC prediction model to obtain the TOC at the window center point as the prediction result of the total organic carbon content at the corresponding depth of the target shale layer;
In this embodiment, the preprocessed logging curves are first windowed, and the windowed logging curves are used as the input of the model. The windowing follows a formula (not reproduced in this text) in which one symbol denotes the size of the data window and another indexes the individual logging curves.
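The windowing step can be sketched as follows. This is a hedged illustration, not the windowing formula of the invention: it assumes an odd window size so that the window has a well-defined center sample, and it simply skips depths near the top and bottom of the well where a full window does not fit. The function name `make_windows` is hypothetical.

```python
def make_windows(curves, window_size):
    """Cut multi-channel logging curves into sliding windows.

    curves: list of C lists, one per logging curve, each of length D
            (one value per depth sample).
    Returns (centers, windows): centers[i] is the depth index whose TOC
    is predicted from windows[i], a list of C lists of length window_size.
    """
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size is assumed to be a positive odd number")
    half = window_size // 2
    depth_count = len(curves[0])
    centers = list(range(half, depth_count - half))
    windows = [[curve[i - half:i + half + 1] for curve in curves]
               for i in centers]
    return centers, windows


# Two toy curves sampled at five depths, window of three samples.
centers, windows = make_windows([[0, 1, 2, 3, 4], [10, 11, 12, 13, 14]], 3)
print(centers)     # [1, 2, 3]
print(windows[0])  # [[0, 1, 2], [10, 11, 12]]
```

Each window keeps all channels, so with the 10 preferred logging curves a window has shape 10 by window size.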
The TOC prediction model is an improvement on the classical 1D U-Net network and takes one-dimensional signals of multiple channels (preferably 10 in the invention) as input. The cross-level (skip) connections in the model do not use the classical direct concatenation; instead, the features are concatenated after convolution and residual connection processing, which reduces the gap between the encoder features and the decoder features, as shown in FIG. 4(a) and FIG. 4(b). Specifically: the features encoded by an encoding module in the encoder are processed by N sequentially connected residual convolution modules, and the processed features are concatenated with the same-scale features in the decoder as the input of the corresponding decoding module in the decoder; the residual convolution module comprises a convolution layer and a residual layer (the kernel sizes of both layers are given by formula images that are not reproduced in this text); the residual convolution module performs convolution and residual processing on the input features separately and adds the two results.
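The modified skip connection can be sketched in plain Python. The kernel sizes are not recoverable from the text, so the example assumes a size-3 kernel for the convolution branch and a size-1 kernel for the residual branch; activations and normalization are omitted, and all function names are hypothetical.

```python
def conv1d_same(x, weights, bias):
    """1D convolution with zero padding so the output length equals the input.

    x: [C_in][L]; weights: [C_out][C_in][K] with odd K; bias: [C_out].
    """
    k = len(weights[0][0])
    pad = k // 2
    length = len(x[0])
    out = []
    for o, w_out in enumerate(weights):
        row = []
        for t in range(length):
            acc = bias[o]
            for c, w_in in enumerate(w_out):
                for j, w in enumerate(w_in):
                    idx = t + j - pad
                    if 0 <= idx < length:
                        acc += w * x[c][idx]
            row.append(acc)
        out.append(row)
    return out


def residual_conv_module(x, conv_w, conv_b, res_w, res_b):
    """Apply the convolution branch and the residual branch to the same
    input and add the two results element-wise."""
    conv_branch = conv1d_same(x, conv_w, conv_b)
    res_branch = conv1d_same(x, res_w, res_b)
    return [[a + b for a, b in zip(ra, rb)]
            for ra, rb in zip(conv_branch, res_branch)]


def skip_path(encoder_features, modules):
    """Pass encoder features through N sequentially connected modules."""
    out = encoder_features
    for params in modules:
        out = residual_conv_module(out, *params)
    return out


def concat_channels(processed_encoder, decoder_features):
    """Concatenate along the channel axis (same length assumed)."""
    return processed_encoder + decoder_features
```

The output of `concat_channels` would then feed the decoding module at the same scale; in a real model the weights would be learned rather than fixed.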
The features output by each decoder are then flattened and fused, and the TOC predicted value at the window center point is output through the fully connected layer. Flattening fusion converts the multi-dimensional features output by each decoder into one-dimensional vectors and then concatenates them. This differs from the classical 1D U-Net model, whose output has the same size as its input; here the model outputs a single value for each input window.
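A minimal sketch of the flattening fusion and the output layer follows. It assumes a single fully connected layer with one output unit; the text does not state how many fully connected layers are used, and the function names are hypothetical.

```python
def flatten(feature_map):
    """Convert a [channels][length] feature map into a 1D vector."""
    return [value for channel in feature_map for value in channel]


def flatten_and_fuse(decoder_outputs):
    """Flatten the output of every decoder and concatenate the vectors."""
    fused = []
    for feature_map in decoder_outputs:
        fused.extend(flatten(feature_map))
    return fused


def fully_connected(vector, weights, bias):
    """Single-output fully connected layer: the TOC at the window center."""
    return sum(w * v for w, v in zip(weights, vector)) + bias


# Two decoder outputs at different scales (2 channels x 2, 1 channel x 4).
decoder_outputs = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0, 7.0, 8.0]]]
fused = flatten_and_fuse(decoder_outputs)
print(fused)                                    # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(fully_connected(fused, [0.5] * 8, 1.0))   # 19.0
```

Because decoder outputs at different scales have different lengths, flattening before concatenation is what allows them to be fused into one vector.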
Step A40, calculating a loss value based on the prediction result and the corresponding ground-truth TOC label, and updating the model parameters of the TOC prediction model based on the loss value;
Step A50, in each training period, inputting the windowed verification samples into the TOC prediction model to obtain the prediction results of the total organic carbon content at the corresponding depths of the target shale layer, and computing the mean square error loss against the ground-truth TOC labels of the verification samples; when the mean square error loss is less than a set threshold, model training ends; otherwise, the method proceeds to step A60; the training period is set according to the actual situation.
Step A60, repeating steps A10 to A40 until a set cutoff condition is reached, thereby obtaining the trained TOC prediction model, wherein the cutoff condition comprises the number of training iterations and the model accuracy.
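The control flow of steps A40 to A60 can be sketched with a toy model. A linear model trained by full-batch gradient descent stands in for the improved U-Net, which is an assumption made only to keep the example self-contained; the learning rate, threshold and epoch cap are likewise illustrative values.

```python
def predict(params, x):
    weights, bias = params
    return sum(w * v for w, v in zip(weights, x)) + bias


def mse(params, xs, ys):
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(train_xs, train_ys, val_xs, val_ys,
          lr=0.3, max_epochs=2000, val_threshold=1e-4):
    """Train until the verification MSE drops below val_threshold
    (step A50) or max_epochs is reached (cutoff condition of step A60)."""
    n = len(train_xs[0])
    weights, bias = [0.0] * n, 0.0
    history = []
    for _ in range(max_epochs):
        # Steps A30/A40: forward pass, MSE gradient, parameter update.
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(train_xs, train_ys):
            err = predict((weights, bias), x) - y
            for i in range(n):
                grad_w[i] += 2.0 * err * x[i] / len(train_xs)
            grad_b += 2.0 * err / len(train_xs)
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
        # Step A50: evaluate on the verification samples.
        val_loss = mse((weights, bias), val_xs, val_ys)
        history.append(val_loss)
        if val_loss < val_threshold:
            break
    return (weights, bias), history
```

The returned `history` holds one verification loss per training period, which makes it easy to check which of the two stopping conditions ended training.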
Step A60 is followed by a sensitivity analysis of the logging curves:
inputting the windowed verification samples again into the TOC prediction model trained in step A60 to obtain the prediction result corresponding to each logging curve as a first result; the prediction result is the predicted total organic carbon content at the corresponding depth of the target shale layer;
performing gradient backpropagation on the first result to obtain the response value of each logging curve to the TOC prediction model, sorting the response values in descending order, and selecting the logging curves corresponding to the first N response values as the logging curves sensitive to TOC content prediction;
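The gradient-based ranking can be illustrated as follows. In this sketch, central finite differences stand in for backpropagation (a deep learning framework would normally supply the input gradients), and the response value of a curve is assumed to be the mean absolute gradient over its window positions and over all verification windows; the text does not define the aggregation, so both choices are assumptions.

```python
def input_gradient(model, window, eps=1e-4):
    """Approximate d(prediction)/d(input) for every sample of a window.

    model: callable mapping a [C][W] window to a scalar TOC prediction.
    """
    grads = []
    for c, channel in enumerate(window):
        row = []
        for t in range(len(channel)):
            plus = [list(ch) for ch in window]
            minus = [list(ch) for ch in window]
            plus[c][t] += eps
            minus[c][t] -= eps
            row.append((model(plus) - model(minus)) / (2.0 * eps))
        grads.append(row)
    return grads


def rank_curves_by_response(model, windows, curve_names, top_n):
    """Average the absolute input gradient per curve, sort in descending
    order, and return the top_n curve names with all response values."""
    totals = [0.0] * len(curve_names)
    for window in windows:
        for c, row in enumerate(input_gradient(model, window)):
            totals[c] += sum(abs(g) for g in row) / len(row)
    responses = {name: total / len(windows)
                 for name, total in zip(curve_names, totals)}
    ranked = sorted(responses, key=responses.get, reverse=True)
    return ranked[:top_n], responses
```

For example, a toy model that depends strongly on the first curve and weakly on the second would rank the first curve highest, and a curve the model ignores receives a response of zero.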
or
inputting the windowed training samples of each logging curve into a corresponding pre-constructed TOC prediction model with 1-dimensional input, and training each of these models;
after training is finished, obtaining the prediction result corresponding to each logging curve through its 1-dimensional-input TOC prediction model and evaluating its accuracy as a second result (the accuracy is evaluated from the difference between the prediction result and the ground-truth TOC label corresponding to the logging curve); that is, the second result is the accuracy of the prediction result.
sorting the second results in descending order, and selecting the logging curves corresponding to the first N second results as the logging curves sensitive to TOC content prediction;
or
first, for each single windowed logging curve, constructing, training and evaluating a TOC prediction model with 1-dimensional input, and ranking the curves by model prediction performance. The top-ranked logging curves are then combined as input to a new TOC prediction model with multi-dimensional input (the dimensionality being set by the number of combined curves), which is trained and evaluated; the results of the different combinations are compared, and finally N or more sensitive logging curves are selected for predicting the total organic carbon content of shale. N is preferably set to 3 in the invention.
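The single-curve ranking can be sketched with a deliberately simple stand-in. Here an ordinary least-squares line fitted to one curve replaces the 1-dimensional-input TOC prediction model, and accuracy is assumed to be the negative mean square error on the verification samples; both are assumptions made for illustration, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_x


def line_mse(params, xs, ys):
    slope, intercept = params
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def rank_curves_by_single_input_model(train_curves, train_toc,
                                      val_curves, val_toc, top_n):
    """Train one single-input model per curve, score it on the
    verification samples, and return the top_n curves by accuracy.

    train_curves / val_curves: dict mapping curve name to a list of values
    aligned with the TOC labels.
    """
    accuracy = {}
    for name, values in train_curves.items():
        params = fit_line(values, train_toc)
        # Higher is better, so the MSE is negated before sorting.
        accuracy[name] = -line_mse(params, val_curves[name], val_toc)
    ranked = sorted(accuracy, key=accuracy.get, reverse=True)
    return ranked[:top_n], accuracy
```

The selected curves could then be combined as channels of a multi-input model and compared against other combinations, as the last alternative above describes.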
2. Method for predicting shale total organic carbon content based on the deep encoder-decoder network
Step S100, obtaining logging curves based on logging data of a well location to be predicted in a target shale layer; the logging curves comprise radioactive uranium, radioactive thorium, radioactive potassium, acoustic transit time, compensated neutron, density, litho-density, deep laterolog resistivity, shallow laterolog resistivity and gamma ray logging curves;
In this embodiment, the logging data of the well location to be predicted in the target shale layer are obtained first, and the logging curves are derived from them.
Step S200, preprocessing the logging curves to obtain preprocessed logging curves;
Step S300, windowing the preprocessed logging curves and inputting the windowed logging curves into a pre-trained TOC prediction model to obtain the TOC at the window center point as the prediction result of the total organic carbon content at the corresponding depth of the well location to be predicted in the target shale layer;
wherein the TOC prediction model is constructed on the basis of an improved 1D U-Net network; when the same-scale features of the encoder and the decoder in the improved 1D U-Net network are joined by a cross-level (skip) connection, the encoder features are concatenated with the decoder features only after convolution and residual connection processing; in addition, the features output by each decoder are flattened and fused, and the TOC predicted value at the window center point is output through a fully connected layer.
In this embodiment, the preprocessed logging curves are windowed, and the windowed logging curves are input into the trained TOC prediction model to obtain the TOC value at the window center point, which is used as the prediction result of the total organic carbon content at the corresponding depth of the well location to be predicted in the target shale layer.
A system for predicting shale total organic carbon content based on a deep encoder-decoder network according to a second embodiment of the present invention, as shown in FIG. 2, includes: a data acquisition module 100, a preprocessing module 200 and a prediction module 300;
the data acquisition module 100 is configured to obtain logging curves based on logging data of a well location to be predicted in a target shale layer; the logging curves comprise radioactive uranium, radioactive thorium, radioactive potassium, acoustic transit time, compensated neutron, density, litho-density, deep laterolog resistivity, shallow laterolog resistivity and gamma ray logging curves;
the preprocessing module 200 is configured to preprocess the logging curves to obtain preprocessed logging curves;
the prediction module 300 is configured to window the preprocessed logging curves and input the windowed logging curves into a pre-trained TOC prediction model to obtain the TOC at the window center point as the prediction result of the total organic carbon content at the corresponding depth of the well location to be predicted in the target shale layer;
wherein the TOC prediction model is constructed on the basis of an improved 1D U-Net network; when the same-scale features of the encoder and the decoder in the improved 1D U-Net network are joined by a cross-level (skip) connection, the encoder features are concatenated with the decoder features only after convolution and residual connection processing; in addition, the features output by each decoder are flattened and fused, and the TOC predicted value at the window center point is output through a fully connected layer.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.
It should be noted that the system for predicting shale total organic carbon content based on the deep encoder-decoder network provided in the foregoing embodiment is illustrated only by the division into the functional modules described above. In practical applications, the functions may be allocated to different functional modules as needed; that is, the modules or steps in the embodiments of the present invention may be further decomposed or combined. For example, the modules of the foregoing embodiment may be combined into one module or further split into multiple sub-modules to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention serve only to distinguish the modules or steps and are not to be construed as unduly limiting the present invention.
An electronic device according to a third embodiment of the present invention includes at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the processor, the instructions being used to implement the method for predicting shale total organic carbon content based on a deep encoder-decoder network described above.
A computer-readable storage medium according to a fourth embodiment of the present invention stores computer instructions to be executed by a computer to implement the method for predicting shale total organic carbon content based on a deep encoder-decoder network described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the above-described apparatuses and computer-readable storage media may refer to the corresponding processes in the foregoing method examples, and are not described herein again.
Referring now to FIG. 5, there is illustrated a block diagram of a computer system suitable for use as a server in implementing embodiments of the method, system, and apparatus of the present application. The server shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in FIG. 5, the computer system includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. The RAM 503 also stores various programs and data necessary for system operation. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An Input/Output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read from it is installed into the storage section 508 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. More specific examples of a computer-readable storage medium may include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.