CN113592028A - Method and system for identifying logging fluid by using multi-expert classification committee machine - Google Patents


Info

Publication number
CN113592028A
Authority
CN
China
Prior art keywords
data
logging
training
prediction
logging data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110939640.9A
Other languages
Chinese (zh)
Inventor
谭茂金
白洋
张海涛
石玉江
王长胜
吴静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences Beijing
Original Assignee
China University of Geosciences Beijing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences Beijing
Priority to CN202110939640.9A
Publication of CN113592028A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a system for identifying logging fluids with a multi-expert classification committee machine, belonging to the field of intelligent identification. The method comprises the following steps: S1: acquiring logging data and fluid type label data as input data; S2: performing data cleaning, environmental correction and sensitive logging data screening on the input data to obtain training data; S3: training a plurality of heterogeneous learners on the training data and then integrating them with a combiner to obtain a multi-expert classification committee machine fluid identification model; S4: inputting new logging data into the multi-expert classification committee machine fluid identification model to obtain the final fluid identification result. The invention combines the advantages of the individual intelligent algorithms as far as possible, reduces the risk that a single intelligent algorithm falls into a local minimum or overfits, reduces noise interference in the logging data, and improves the accuracy and robustness of the integrated system.

Description

Method and system for identifying logging fluid by using multi-expert classification committee machine
Technical Field
The invention belongs to the field of intelligent identification, and in particular relates to a method and a system for identifying logging fluids with a multi-expert classification committee machine.
Background
Geophysical logging serves as the "eyes" and "ears" that reach deep into the formation. It measures many physical parameters with high resolution and a large amount of information, and provides continuous, accurate in-situ electrical, acoustic, nuclear and nuclear magnetic measurements for oil and gas reservoir evaluation, chiefly natural gamma ray, deep and shallow resistivity, acoustic, density and neutron logs. With interpretation charts, empirical formulas, volume models and similar tools, the subsurface lithology, the oil and gas reservoirs and the fluid properties can be judged qualitatively, which guides actual oil and gas production and development.
As oil and gas development in China gradually moves toward unconventional and deep, complex reservoirs, conventional geophysical log interpretation and evaluation methods become difficult to apply and unreliable. In low-porosity, low-permeability reservoirs, low-resistivity oil layers and formations with complex lithology in particular, the pore fluid contributes little to the logging response, and the relationship between fluid type and logging response is nonlinear. Intelligent algorithms that construct the nonlinear mapping between logging data and fluid type therefore help improve the capability and accuracy of fluid identification. Commonly used intelligent algorithms include neural networks, support vector machines, decision trees and Bayesian algorithms. These methods take logging data and fluid labels (oil layer, oil-water layer, water-containing layer, water layer, dry layer and so on) as input, automatically optimize the model structure, construct a nonlinear fluid identification model, and predict the fluid type of unknown reservoirs.
Predicting the fluid type of a complex reservoir with a well-performing intelligent model gives small errors and high efficiency. However, training a reliable intelligent model is often difficult: it involves choosing the learning method, optimizing the hyper-parameters and trading off under-fitting against over-fitting, each of which strongly affects model accuracy, robustness and generalization ability. Moreover, according to the no free lunch theorem, no trained model is optimal in accuracy, robustness and generalization ability all at once. Ensemble learning methods that combine several base learners of the same type through a voting mechanism have therefore been proposed and are widely applied in many fields. However, the performance gain of existing ensemble learning depends mainly on the independence of the base learners, and when the amount of training data is small, conventional bootstrap sampling easily saturates the base learners, which degrades the integration effect.
In view of the above, the present invention is particularly proposed.
Disclosure of Invention
The invention aims to provide a method and a system for identifying logging fluids with a multi-expert classification committee machine. Intelligent algorithms of several different types serve as base learners and a voting method serves as the combiner; together they form a committee machine that combines the advantages of the individual intelligent algorithms as far as possible, reduces the risk that a single intelligent algorithm falls into a local minimum or overfits, reduces noise interference in the logging data, and improves the accuracy and robustness of the integrated system.
To achieve the above object, the present invention provides a method for multi-expert classification committee machine logging fluid identification, comprising the following steps:
S1: acquiring logging data and fluid type label data as input data;
S2: performing data cleaning, environmental correction and sensitive logging data screening on the input data to obtain training data;
S3: training a plurality of heterogeneous learners on the training data and then integrating them with a combiner to obtain a multi-expert classification committee machine fluid identification model;
S4: inputting new logging data into the multi-expert classification committee machine fluid identification model to obtain a logging fluid identification result.
Further, in step S2, data cleaning is a process of reviewing and verifying the logging data in order to delete redundant information, correct erroneous information and ensure data consistency. Deleting redundant information comprises deleting resistivity logging data with similar radial formation characteristics and deleting logging data with similar longitudinal formation characteristics; correcting erroneous information comprises removing data from abnormal well sections and logging data strongly affected by noise; ensuring data consistency comprises standardizing the names, units and data types of the different logging data;
the environmental correction removes the influence of the borehole, mud, well deviation and surrounding rock on the quality of the logging data by means of logging interpretation charts or correction formulas, and comprises electrical logging environmental correction based on three-parameter inversion, acoustic logging environmental correction based on a diffuse reflection acoustic path method, neutron logging environmental correction based on reconstructing the neutron log from the acoustic time difference, and density logging environmental correction based on reconstructing the density log from the acoustic time difference.
Further, in step S2, the sensitive logging data screening comprises the following steps:
S201: taking all logging data as input, training with a BP neural network, a probabilistic neural network and a decision tree algorithm, and constructing a pre-trained intelligent model;
S202: inputting the new logging data as prediction data into the pre-trained intelligent model for fluid type prediction to obtain a prediction result A;
S203: disturbing one series of logging data by a certain proportion while keeping the other logging data unchanged, to obtain a disturbed data set;
S204: inputting the disturbed data set into the pre-trained intelligent model for fluid type prediction to obtain a prediction result $B_1$, and recording the degree of influence $\delta(B_1, A)$ of prediction result $B_1$ relative to prediction result A;
S205: disturbing the different series of logging data in turn and inputting them into the pre-trained intelligent model to obtain prediction results $B_2$, $B_3$, ..., $B_m$, and recording the degree of influence $\delta(B_m, A)$ of each relative to prediction result A;
S206: judging the sensitivity of the logging data from the degree to which disturbing each logging series influences the output: the larger the influence, the more sensitive the logging data.
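Steps S201 to S206 can be sketched as a perturbation loop. The following Python sketch is illustrative only and rests on several assumptions that are not stated in the text: the disturbance is taken to be a random relative scaling of one curve, the degree of influence δ(B, A) is taken to be the fraction of samples whose predicted label changes, and `toy_model`, the curve names `RT` and `GR`, and all values are hypothetical.

```python
import random
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]         # one depth sample: curve name -> value
Model = Callable[[Sample], str]   # stand-in for the pre-trained model


def perturb_curve(samples: List[Sample], curve: str, ratio: float,
                  rng: random.Random) -> List[Sample]:
    """Copy the samples, scaling one curve by a random factor within +/- ratio."""
    disturbed = []
    for s in samples:
        t = dict(s)
        t[curve] = s[curve] * (1.0 + rng.uniform(-ratio, ratio))
        disturbed.append(t)
    return disturbed


def influence(pred_b: List[str], pred_a: List[str]) -> float:
    """Assumed delta(B, A): fraction of samples whose predicted label changed."""
    changed = sum(1 for b, a in zip(pred_b, pred_a) if b != a)
    return changed / len(pred_a)


def rank_sensitivity(model: Model, samples: List[Sample], curves: List[str],
                     ratio: float = 0.1, seed: int = 0) -> List[Tuple[str, float]]:
    """Disturb each curve in turn and rank curves by their influence on the output."""
    rng = random.Random(seed)
    baseline = [model(s) for s in samples]                 # prediction result A
    scores = {}
    for curve in curves:                                   # B_1 ... B_m
        disturbed = perturb_curve(samples, curve, ratio, rng)
        scores[curve] = influence([model(s) for s in disturbed], baseline)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical model that looks only at resistivity (RT) and ignores gamma ray (GR).
def toy_model(s: Sample) -> str:
    return "oil" if s["RT"] > 20.0 else "water"


samples = [{"RT": 5.0, "GR": 60.0}, {"RT": 19.0, "GR": 80.0},
           {"RT": 21.0, "GR": 75.0}, {"RT": 40.0, "GR": 50.0}]
ranking = rank_sensitivity(toy_model, samples, ["RT", "GR"], ratio=0.1)
```

Because the toy model never reads `GR`, disturbing that curve cannot change any prediction, so its influence is zero and it would be screened out as insensitive.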
Further, in step S3, the joint training of the multiple heterogeneous learners uses heterogeneous intelligent algorithms that receive the same input data and each perform a hyper-parameter grid search and model training, yielding a plurality of parallel sub-models, wherein the heterogeneous intelligent algorithms include a BP neural network, a probabilistic neural network and a decision tree algorithm;
the integration by the committee machine combiner uses a relative majority voting method, an absolute majority voting method, a weighted voting method and a learning method. The relative majority voting method outputs the result predicted most frequently by the sub-models, and when more than one result shares the highest frequency, the output of the sub-model with the best training performance is taken as the final result; the absolute majority voting method outputs the result whose frequency among the sub-model predictions exceeds half of the total, and refuses to predict when no result exceeds half; the weighted voting method assigns an optimal weight to each type of prediction result and outputs the result with the highest score; the learning method uses an intelligent algorithm to establish a nonlinear mapping between the outputs of all sub-models and the true result, and uses this mapping to guide the combination of the prediction results for unknown data.
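The two unweighted voting rules can be illustrated with a short Python sketch. It is a minimal reading of the description, not the patented implementation: the function names and the `best_index` argument (the position of the sub-model with the best training performance) are hypothetical, and rejection is represented by returning `None`.

```python
from collections import Counter
from typing import List, Optional


def relative_majority(votes: List[str], best_index: int) -> str:
    """Plurality vote over the sub-model predictions.

    If several labels share the highest count, fall back to the output of
    the sub-model with the best training performance, as the text describes.
    """
    counts = Counter(votes)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    if len(winners) == 1:
        return winners[0]
    return votes[best_index]


def absolute_majority(votes: List[str]) -> Optional[str]:
    """Return the label holding more than half of the votes, else None (reject)."""
    label, n = Counter(votes).most_common(1)[0]
    return label if n > len(votes) / 2 else None
```

For example, with votes `["oil", "water", "dry"]` the relative majority rule defers to the best sub-model, while the absolute majority rule refuses to predict because no label exceeds half of the votes.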
Further, in step S4, the multi-expert classification committee machine fluid identification model comprises a plurality of trained sub-models and a combiner. After data cleaning and environmental correction, the new logging data is input into the sub-models for fluid type discrimination, and the combiner outputs the final fluid type discrimination result.
The invention also provides a system for multi-expert classification committee machine logging fluid identification, comprising a training data preparation module, a committee machine training module and a committee machine prediction module;
the training data preparation module performs data cleaning, environmental correction and sensitive logging data screening on the input logging data and fluid type label data to obtain training data;
the committee machine training module performs joint training of multiple heterogeneous learners and committee machine combiner integration on the training data to obtain a multi-expert classification committee machine fluid identification model;
and the committee machine prediction module obtains a logging fluid identification result from the new logging data and the multi-expert classification committee machine fluid identification model.
Further, in the training data preparation module, data cleaning is a process of reviewing and verifying the logging data in order to delete redundant information, correct erroneous information and ensure data consistency. Deleting redundant information comprises deleting resistivity logging data with similar radial formation characteristics and deleting logging data with similar longitudinal formation characteristics; correcting erroneous information comprises removing data from abnormal well sections and logging data strongly affected by noise; ensuring data consistency comprises standardizing the names, units and data types of the different logging data;
the environmental correction removes the influence of the borehole, mud, well deviation and surrounding rock on the quality of the logging data by means of interpretation charts or correction formulas, and comprises electrical logging environmental correction based on three-parameter inversion, acoustic logging environmental correction based on a diffuse reflection acoustic path method, neutron logging environmental correction based on reconstructing the neutron log from the acoustic time difference, and density logging environmental correction based on reconstructing the density log from the acoustic time difference.
Further, in the training data preparation module, the sensitive logging data screening comprises the following steps:
S201: taking all logging data as input, training with a BP neural network, a probabilistic neural network and a decision tree algorithm respectively, and constructing a pre-trained intelligent model;
S202: inputting the new logging data as prediction data into the pre-trained intelligent model for fluid type prediction to obtain a prediction result A;
S203: disturbing one series of logging data while keeping the other logging data unchanged, to obtain a disturbed data set;
S204: inputting the disturbed data set into the pre-trained intelligent model for fluid type prediction to obtain a prediction result $B_1$, and calculating the degree of influence $\delta(B_1, A)$ of prediction result $B_1$ relative to prediction result A;
S205: disturbing the different series of logging data in turn and inputting them into the pre-trained intelligent model to obtain prediction results $B_2$, $B_3$, ..., $B_m$, and calculating the degree of influence $\delta(B_m, A)$ of each relative to prediction result A;
S206: judging the sensitivity of the logging data from the degree to which disturbing each logging series influences the output: the larger the influence, the more sensitive the logging data.
Furthermore, in the committee machine training module, the joint training of the multiple heterogeneous learners uses heterogeneous intelligent algorithms that receive the same input data and each perform a hyper-parameter grid search and model training, yielding a plurality of parallel sub-models, wherein the heterogeneous intelligent algorithms include a BP neural network, a probabilistic neural network and a decision tree algorithm;
the integration by the committee machine combiner uses a relative majority voting method, an absolute majority voting method, a weighted voting method and a learning method. The relative majority voting method outputs the result predicted most frequently by the sub-models, and when more than one result shares the highest frequency, the output of the sub-model with the best training performance is taken as the final result; the absolute majority voting method outputs the result whose frequency among the sub-model predictions exceeds half of the total, and refuses to predict when no result exceeds half; the weighted voting method assigns an optimal weight to each type of prediction result and outputs the result with the highest score; the learning method uses an intelligent algorithm to establish a nonlinear mapping between the outputs of all sub-models and the true result, and uses this mapping to guide the combination of the prediction results for unknown data.
Furthermore, the committee machine prediction module is built around the multi-expert classification committee machine fluid identification model, which comprises a plurality of trained sub-models and a combiner. After data cleaning and environmental correction, the logging data is input into the sub-models for fluid type discrimination, and the combiner outputs the final fluid type discrimination result.
The method and system for multi-expert classification committee machine logging fluid identification provided by the invention have the following beneficial effects. The idea of combining several intelligent algorithms is introduced to construct a multi-expert classification committee machine method and system for logging fluid identification. The core of the system is the committee machine training module, which imitates the workings of a human committee: flexible and reliable decision mechanisms such as the relative majority voting method, the absolute majority voting method, the weighted voting method and the learning method combine several heterogeneous intelligent algorithms, which reduces the risk that the prediction system falls into a local minimum or overfits and improves the robustness of the trained model and the accuracy of the prediction results. Moreover, around the core committee machine algorithm, the training data preparation module and the committee machine prediction module complete the logging fluid identification system, improving the quality of the input data and providing fast and accurate fluid type discrimination results.
Drawings
FIG. 1 is a flow chart of a method of multi-expert Classification Committee machine well logging fluid identification in accordance with the present invention.
FIG. 2 is a schematic diagram of a system for multi-expert Classification Committee machine well logging fluid identification in accordance with the present invention.
FIG. 3 is a graph of the prediction accuracy of the single intelligent algorithms and of the committee machine when 5% Gaussian noise is introduced into the input data according to the present invention.
FIG. 4 is a graph of the prediction accuracy of the single intelligent algorithms and of the committee machine when 10% Gaussian noise is introduced into the input data according to the present invention.
Fig. 5 is a schematic diagram of a fluid discrimination result according to the present invention.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments so that those skilled in the art can better understand the scheme of the invention.
One embodiment of the present invention provides a method for multi-expert classification committee machine logging fluid identification, shown in FIG. 1, which improves the accuracy and robustness of the prediction model through the comprehensive decision of multiple experts. The method comprises the following steps:
S1: acquiring logging data and fluid type label data.
The logging data comprises data such as natural gamma-ray logging, resistivity logging, acoustic logging, density logging, neutron logging and the like. The fluid type tag data includes data for an oil layer, an oil-water layer, a water layer, and a dry layer.
S2: performing data cleaning, environmental correction and sensitive logging data screening on the input data formed by the logging data and the fluid type label data to obtain training data.
Data cleaning is a process of reviewing and verifying the logging data in order to delete redundant information, correct erroneous information and ensure data consistency. Deleting redundant information comprises deleting resistivity logging data with similar radial formation characteristics and deleting logging data with similar longitudinal formation characteristics; correcting erroneous information comprises removing data from abnormal well sections and logging data strongly affected by noise; ensuring data consistency comprises standardizing the names, units and data types of the different logging data.
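The three cleaning tasks can be illustrated with a small Python sketch. Everything specific in it is an assumption made for illustration: the alias table and curve mnemonics are hypothetical, the null marker -999.25 is assumed as the flag for missing samples, and a Pearson correlation threshold is swapped in as one possible way to flag curves that carry nearly the same information. Unit conversion is omitted.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical alias table: several mnemonics mapped to one standard name.
ALIASES = {"AC": "DT", "DTC": "DT", "RD": "RT", "RILD": "RT"}
NULL_VALUE = -999.25   # assumed marker for missing samples


def standardize(sample: Dict[str, object]) -> Dict[str, float]:
    """Consistency: one upper-case name per curve, every value stored as float."""
    return {ALIASES.get(k.upper(), k.upper()): float(v) for k, v in sample.items()}


def is_valid(sample: Dict[str, float]) -> bool:
    """Error removal: reject samples containing null markers or non-finite values."""
    return all(v != NULL_VALUE and math.isfinite(v) for v in sample.values())


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient; 0.0 when either curve is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def redundant_pairs(samples: List[Dict[str, float]],
                    threshold: float = 0.98) -> List[Tuple[str, str]]:
    """Redundancy: curve pairs whose absolute correlation reaches the threshold."""
    names = sorted(samples[0])
    column = {n: [s[n] for s in samples] for n in names}
    return [(a, b) for a, b in combinations(names, 2)
            if abs(pearson(column[a], column[b])) >= threshold]


# Made-up samples: deep (RD) and shallow (RS) resistivity read almost alike.
raw = [{"ac": 250.0, "rd": 10.0, "rs": 9.0},
       {"AC": 240.0, "RD": 20.0, "RS": 19.0},
       {"AC": 260.0, "RD": 30.0, "RS": 29.0},
       {"AC": -999.25, "RD": 25.0, "RS": 24.0}]
clean = [s for s in map(standardize, raw) if is_valid(s)]
pairs = redundant_pairs(clean)
```

In this toy data the sample containing the null marker is dropped, and the two resistivity curves are flagged as a redundant pair, so one of them could be deleted before training.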
The environmental correction removes the influence of environmental factors such as the borehole, mud, well deviation and surrounding rock on the quality of the logging data by means of interpretation charts or correction formulas. It includes, but is not limited to, electrical logging environmental correction based on three-parameter inversion, acoustic logging environmental correction based on a diffuse reflection acoustic path method, neutron logging environmental correction based on reconstructing the neutron log from the acoustic time difference, and density logging environmental correction based on reconstructing the density log from the acoustic time difference.
Sensitive logging data screening further reduces redundant data features on the basis of data cleaning. The specific steps are as follows:
S201: taking all logging data as input, training with intelligent algorithms such as, but not limited to, a BP neural network, a probabilistic neural network and a decision tree, and constructing a pre-trained intelligent model;
S202: inputting the logging data as prediction data into the pre-trained intelligent model for fluid type prediction to obtain a prediction result A;
S203: disturbing one series of logging data by a certain proportion while keeping the other logging data unchanged, to obtain a disturbed data set, wherein the proportion can be adjusted according to actual conditions;
S204: inputting the N groups of disturbed data sets into the pre-trained intelligent model for fluid type prediction to obtain a prediction result $B_1$, and calculating the degree of influence $\delta(B_1, A)$ of prediction result $B_1$ relative to prediction result A, with the calculation formula as follows:
Figure BDA0003214316330000071
S205: disturbing the different series of logging data in turn and inputting them into the pre-trained intelligent model to obtain prediction results $B_2$, $B_3$, ..., $B_m$, and recording the degree of influence $\delta(B_m, A)$ of each relative to prediction result A;
S206: judging the sensitivity of the logging data from the degree to which disturbing each logging series influences the output: the larger the influence, the more sensitive the logging data.
S3: training a plurality of heterogeneous learners on the training data and then integrating them with a combiner to obtain a multi-expert classification committee machine fluid identification model.
In step S3, the heterogeneous learners trained jointly are the main body of the committee machine and include, but are not limited to, intelligent algorithms of any different types, such as a BP neural network, a probabilistic neural network and a decision tree. Generally, the number of heterogeneous learners employed in a committee machine must be greater than 3. The intelligent algorithms receive the same input data and each perform a hyper-parameter grid search and model training, yielding a plurality of parallel sub-models, each of which is the model obtained by training a different intelligent algorithm.
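The joint training step, in which every learner receives the same data and runs its own hyper-parameter grid search, can be sketched in plain Python as follows. The sketch is an illustration under assumptions: only two stand-in learners are shown for brevity (a nearest-neighbour classifier and a probabilistic neural network in its basic Parzen-window form), the grids and the two-feature toy data are made up, and validation accuracy is assumed as the selection criterion.

```python
import math
from collections import Counter, defaultdict
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Model = Callable[[Vector], str]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((p - q) ** 2 for p, q in zip(a, b))


def train_knn(X: List[Vector], y: List[str], k: int) -> Model:
    """Nearest-neighbour learner: majority label among the k closest samples."""
    def predict(x: Vector) -> str:
        nearest = sorted(range(len(X)), key=lambda i: sq_dist(X[i], x))[:k]
        return Counter(y[i] for i in nearest).most_common(1)[0][0]
    return predict


def train_pnn(X: List[Vector], y: List[str], sigma: float) -> Model:
    """Probabilistic neural network: Gaussian kernel density per class."""
    by_class: Dict[str, List[Vector]] = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)

    def predict(x: Vector) -> str:
        def density(points: List[Vector]) -> float:
            return sum(math.exp(-sq_dist(p, x) / (2 * sigma ** 2))
                       for p in points) / len(points)
        return max(by_class, key=lambda c: density(by_class[c]))
    return predict


def accuracy(model: Model, X: List[Vector], y: List[str]) -> float:
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)


def grid_search(trainer, grid: Dict[str, list], train, val) -> Tuple[dict, Model]:
    """Try every hyper-parameter combination; keep the best on validation data."""
    names = sorted(grid)
    best = None
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        model = trainer(train[0], train[1], **params)
        score = accuracy(model, val[0], val[1])
        if best is None or score > best[0]:
            best = (score, params, model)
    return best[1], best[2]


# Made-up two-feature samples; every learner sees the same data.
X_train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y_train = ["water", "water", "water", "oil", "oil", "oil"]
X_val, y_val = [(0.5, 0.5), (5.5, 5.5)], ["water", "oil"]

committee: Dict[str, Model] = {}
best_params: Dict[str, dict] = {}
for name, trainer, grid in [("knn", train_knn, {"k": [1, 3]}),
                            ("pnn", train_pnn, {"sigma": [0.5, 1.0]})]:
    best_params[name], committee[name] = grid_search(
        trainer, grid, (X_train, y_train), (X_val, y_val))
```

The resulting `committee` dictionary holds the parallel sub-models; their individual predictions are what a combiner would then integrate.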
The committee machine combiner is the core of the committee machine, and its integration methods include the relative majority voting method, the absolute majority voting method, the weighted voting method, the learning method and the like. The relative majority voting method outputs the result predicted most frequently by the sub-models, and when more than one result shares the highest frequency, the output of the sub-model with the best training performance is taken as the final result; the absolute majority voting method outputs the result whose frequency among the sub-model predictions exceeds half of the total, and refuses to predict when no result exceeds half; the weighted voting method assigns an optimal weight to each type of prediction result and outputs the result with the highest score; the learning method uses an intelligent algorithm to establish a nonlinear mapping between the outputs of all sub-models and the true result, and uses this mapping to guide the combination of the prediction results for unknown data.
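The weighted voting method can be sketched as below. The text does not say how the "optimal weight" for each type of prediction result is obtained, so this sketch assumes one simple choice: each sub-model's per-class precision on held-out data. The function names and the example labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def class_weights(val_preds: List[str], val_truth: List[str]) -> Dict[str, float]:
    """Assumed weighting: per-class precision of one sub-model on held-out data."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for pred, truth in zip(val_preds, val_truth):
        totals[pred] += 1
        hits[pred] += int(pred == truth)
    return {label: hits[label] / totals[label] for label in totals}


def weighted_vote(votes: List[str], weights: List[Dict[str, float]]) -> str:
    """Sum each sub-model's weight for the label it voted for; highest score wins."""
    scores: Dict[str, float] = defaultdict(float)
    for vote, w in zip(votes, weights):
        scores[vote] += w.get(vote, 0.0)
    return max(scores, key=scores.get)
```

With weights such as `[{"oil": 0.9}, {"water": 0.4}, {"water": 0.4}]`, a single reliable "oil" vote outscores two unreliable "water" votes, which a plain majority vote would not allow.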
S4: inputting new logging data into the multi-expert classification committee machine fluid identification model to obtain a logging fluid identification result.
The multi-expert classification committee machine fluid identification model comprises a plurality of trained sub-models and a combiner. After data cleaning and environmental correction, the new logging data is input into the sub-models for fluid type discrimination, and the combiner outputs the final fluid type discrimination result.
Another embodiment of the present invention provides a system for multi-expert classification committee machine logging fluid identification, which uses logging data to construct a multi-expert classification committee machine fluid identification model and to discriminate fluid types. FIG. 2 is a schematic diagram of the system according to the invention, which improves the accuracy and robustness of the prediction model through the comprehensive decision of multiple experts.
The system of the present invention comprises three main modules: a training data preparation module 1, a committee machine training module 2 and a committee machine prediction module 3.
The training data preparation module 1 performs data cleaning, environmental correction and sensitive logging data screening on the input data formed by the logging data and the fluid type label data to obtain training data.
Data cleaning is a process of reviewing and verifying the logging data in order to delete redundant information, correct erroneous information and ensure data consistency. Deleting redundant information comprises deleting resistivity logging data with similar radial formation characteristics and deleting logging data with similar longitudinal formation characteristics; correcting erroneous information comprises removing data from abnormal well sections and logging data strongly affected by noise; ensuring data consistency comprises standardizing the names, units and data types of the different logging data.
The environmental correction removes the influence of environmental factors such as the borehole, mud, well deviation and surrounding rock on the quality of the logging data by means of interpretation charts or correction formulas. It includes, but is not limited to, electrical logging environmental correction based on three-parameter inversion, acoustic logging environmental correction based on a diffuse reflection acoustic path method, neutron logging environmental correction based on reconstructing the neutron log from the acoustic time difference, and density logging environmental correction based on reconstructing the density log from the acoustic time difference.
Sensitive logging data screening further reduces redundant data features on the basis of data cleaning. The specific steps are as follows:
1) taking all logging data as input, training with intelligent algorithms such as, but not limited to, a BP neural network, a probabilistic neural network and a decision tree, and constructing a pre-trained intelligent model;
2) inputting the logging data as prediction data into the pre-trained intelligent model for fluid type prediction to obtain a prediction result A;
3) disturbing one series of logging data by a certain proportion while keeping the other logging data unchanged, to obtain a disturbed data set, wherein the proportion can be adjusted according to actual conditions;
4) inputting the N groups of disturbed data sets into the pre-trained intelligent model for fluid type prediction to obtain a prediction result $B_1$, and recording the degree of influence $\delta(B_1, A)$ of $B_1$ relative to A, with the calculation formula as follows:
Figure BDA0003214316330000091
5) disturbing the different series of logging data in turn and inputting them into the pre-trained intelligent model to obtain prediction results $B_2$, $B_3$, ..., $B_m$, and calculating the degree of influence $\delta(B_m, A)$ of each prediction result $B_m$ relative to prediction result A;
6) judging the sensitivity of the logging data from the degree to which disturbing each logging series influences the output.
The committee machine training module 2 performs multi-heterogeneous-learner joint training and committee machine combiner integration on the training data. The training data can therefore be divided into two parts, one for the joint training of the heterogeneous learners and one for the integration of the committee machine combiner.
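The two-part division might look like the sketch below. The split ratio and the random shuffling are assumptions made for illustration; the text does not state how the two parts are chosen, and `split_training_data` is a hypothetical helper.

```python
import random


def split_training_data(samples, learner_fraction=0.75, seed=0):
    """Shuffle the samples and split them into a learner part and a combiner part."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * learner_fraction))
    # The first part trains the heterogeneous learners, the second fits the combiner.
    return shuffled[:cut], shuffled[cut:]
```

Keeping the combiner's data separate from the learners' data avoids fitting the combiner on predictions the sub-models have already memorized.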
In the joint training of multiple heterogeneous learners, the heterogeneous learners form the main body of the committee machine and include, but are not limited to, BP neural networks, probabilistic neural networks, decision trees, nearest neighbor algorithms and any other intelligent algorithms of different types. Generally, the number of heterogeneous learners used in a committee machine must be greater than 3. The intelligent algorithms receive the same input data, and each performs hyperparameter grid search and intelligent model training, yielding a plurality of parallel sub-models, each trained by a different intelligent algorithm.
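As a rough illustration of per-learner hyperparameter grid search, the sketch below tunes two deliberately simple learners on the same data: a nearest neighbor classifier (grid over k) and a one-split decision stump (grid over the feature used). Both learners, the hold-out validation scheme and all function names are assumptions chosen to keep the example self-contained; they stand in for the BP neural network, probabilistic neural network and decision tree named in the text.

```python
from collections import Counter


def make_knn(train, k):
    """Nearest neighbor learner: majority label among the k closest training samples."""
    def predict(x):
        ordered = sorted(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
        return Counter(label for _, label in ordered[:k]).most_common(1)[0][0]
    return predict


def make_stump(train, feature):
    """Decision stump: split one feature at its mean, predict the majority label per side."""
    threshold = sum(f[feature] for f, _ in train) / len(train)
    fallback = Counter(y for _, y in train).most_common(1)[0][0]
    left = Counter(y for f, y in train if f[feature] <= threshold)
    right = Counter(y for f, y in train if f[feature] > threshold)
    left_label = left.most_common(1)[0][0] if left else fallback
    right_label = right.most_common(1)[0][0] if right else fallback
    return lambda x: left_label if x[feature] <= threshold else right_label


def accuracy(predict, data):
    """Share of (features, label) pairs predicted correctly."""
    return sum(predict(x) == y for x, y in data) / len(data)


def grid_search(make_learner, grid, train, valid):
    """Train one model per grid value and keep the first with the best validation accuracy."""
    best = None
    for value in grid:
        predict = make_learner(train, value)
        score = accuracy(predict, valid)
        if best is None or score > best[1]:
            best = (value, score, predict)
    return best
```

Calling `grid_search(make_knn, [1, 3], train, valid)` and `grid_search(make_stump, [0, 1], train, valid)` on the same `train` and `valid` lists yields two independently tuned sub-models, which is the parallel structure the paragraph describes.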
In the integration of the committee machine combiner, the combiner is the core of the committee machine, and the available combination methods include relative majority voting, absolute majority voting, weighted voting and the learning method. Relative majority voting outputs the discrimination result that occurs most frequently among the sub-model predictions; when more than one result shares the highest frequency, the output of the sub-model with the best training performance is taken as the final result. Absolute majority voting outputs the discrimination result whose frequency among the sub-model predictions exceeds half of the total, and refuses to predict when no result exceeds half. Weighted voting assigns an optimal weight to each type of prediction result and outputs the result with the highest score. The learning method uses an intelligent algorithm to establish a nonlinear mapping between the outputs of all sub-models and the true results, and uses this mapping to guide the combination of predictions on unknown data.
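The three voting combiners can be sketched as below. The function names are hypothetical, and the weighted variant assumes one weight per sub-model, which is only one possible reading of the weighting described above; how the weights are optimized is not shown.

```python
from collections import Counter


def relative_majority_vote(predictions, best_index):
    """Most frequent label; on a tie, fall back to the best-trained sub-model's output."""
    counts = Counter(predictions)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    if len(winners) == 1:
        return winners[0]
    return predictions[best_index]


def absolute_majority_vote(predictions):
    """Label with more than half of the votes, or None when prediction is refused."""
    label, n = Counter(predictions).most_common(1)[0]
    return label if n > len(predictions) / 2 else None


def weighted_vote(predictions, weights):
    """Sum the (assumed per-sub-model) weights for each label and return the top score."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)
```

For example, with votes `["oil", "water", "dry", "oil"]` the relative majority result is `"oil"`, while the absolute majority combiner returns `None` because two votes out of four do not exceed half.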
The committee machine prediction module 4 is built around the multi-expert classification committee machine fluid identification model, which comprises a plurality of trained sub-models and a combiner. After data cleaning and environmental correction, the logging data are input into the sub-models for fluid type discrimination, and the combiner outputs the final fluid type discrimination result.
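The prediction flow, sub-models first and combiner last, can be sketched in a few lines. The function names are hypothetical, and a simple relative majority vote is used here as the combiner; any other combiner with the same interface could be passed instead.

```python
from collections import Counter


def committee_predict(sub_models, combiner, sample):
    """Collect one vote per trained sub-model, then let the combiner decide."""
    votes = [model(sample) for model in sub_models]
    return combiner(votes)


def plurality(votes):
    """Simple combiner: return the most frequent vote."""
    return Counter(votes).most_common(1)[0][0]
```

Each entry of `sub_models` is assumed to be a callable that maps one cleaned and corrected logging sample to a fluid type.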
The method and system of the present invention for logging fluid identification by a multi-expert classification committee machine are described below with reference to an example.
Logging data together with production and test data from part of an oil field in western China were selected for an experiment on logging fluid identification with the multi-expert classification committee machine. The input logging curves are array induction logging data (AT10, AT20, AT30, AT60, AT90), acoustic time difference (AC), compensated density (DEN), compensated neutron (CNL) and natural gamma ray (GR). The fluid discrimination targets are five reservoir types, namely oil layer, oil-water layer, oil-bearing water layer, water layer and dry layer, and the label data are constructed with one-hot encoding as [10000], [01000], [00100], [00010] and [00001].
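A small sketch of the one-hot label construction follows. The assignment of each fluid type to a particular code position is an assumption made for illustration; the text lists the five codes but does not state which type maps to which code.

```python
# Assumed class order; the source does not specify the mapping of types to codes.
FLUID_TYPES = [
    "oil layer",
    "oil-water layer",
    "oil-bearing water layer",
    "water layer",
    "dry layer",
]


def one_hot(fluid_type):
    """Encode a fluid type as a five-element 0/1 vector."""
    return [1 if name == fluid_type else 0 for name in FLUID_TYPES]


def decode(vector):
    """Map a one-hot (or score) vector back to the fluid type with the largest entry."""
    return FLUID_TYPES[vector.index(max(vector))]
```

With this ordering, `one_hot("oil layer")` gives `[1, 0, 0, 0, 0]` and `decode([0, 0, 0, 0, 1])` gives `"dry layer"`.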
First, the logging data and label data are input into the training data preparation module for preprocessing, namely data cleaning, environmental correction and sensitive logging data screening. Five series of logging data, AT90, AC, DEN, CNL and GR, are finally selected as feature data, the one-hot encoded test data are used as label data, and together they form the training set that is input into the next module for training. Table 1 shows part of the training data used for classification committee machine training.
Table 1. Part of the training data constructed from logging data and test data
Figure BDA0003214316330000111
Then, in the committee machine training module, a plurality of heterogeneous learners such as a BP neural network, a probabilistic neural network and a decision tree are each trained as sub-models on the training data, including hyperparameter optimization and automatic optimization of the model structure, finally yielding a plurality of fluid discrimination sub-models. As for the combination strategy, relative majority voting is generally used in most logging fluid identification problems. When a specific problem calls for it, for example when higher reliability of the output is required, absolute majority voting is selected instead.
Finally, the performance of the multi-expert classification committee machine fluid identification model, in particular its robustness, was tested using the test data. The test uses a noise perturbation method: Gaussian noise in different proportions is introduced into each logging curve in turn, and the perturbed data are input into the trained committee machine fluid identification model for prediction. As shown in Figs. 3-4, comparison with the true labels shows the following. In the noise-free case, the committee machine output accuracy was 97.92%, while the average accuracy of all the individual intelligent algorithms was 87.50%. When Gaussian noise at a 5% ratio was introduced into AT90, AC, DEN, CNL and GR respectively, the accuracy of all the individual intelligent algorithms degraded (85.83%, 72.50%, 59.17%, 84.85%, 83.75%), while the committee machine output accuracy was 97.92%, 79.17%, 64.58%, 93.75% and 87.50%, respectively, higher in every case than the average accuracy of the individual intelligent algorithms. Likewise, when the proportion of introduced noise was increased from 5% to 10%, the accuracy of all the individual intelligent algorithms degraded further (84.17%, 59.58%, 54.58%, 79.58%, 72.92%), while the committee machine output accuracies were 95.83%, 66.67%, 85.42% and 79.17% (Fig. 2). The comparison shows that the multi-expert classification committee machine fluid identification model is more accurate and more robust than a single intelligent algorithm and effectively realizes joint decision-making by multiple heterogeneous learners.
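The noise perturbation test can be sketched as follows. The text does not define how a noise "ratio" translates into noise amplitude, so this sketch assumes the Gaussian standard deviation equals the ratio times the standard deviation of the perturbed curve; `add_gaussian_noise` and `accuracy_under_noise` are hypothetical names.

```python
import random
import statistics


def add_gaussian_noise(values, ratio, rng):
    """Add zero-mean Gaussian noise; assumed scale is ratio times the curve's std."""
    sigma = ratio * statistics.pstdev(values)
    return [v + rng.gauss(0.0, sigma) for v in values]


def accuracy_under_noise(predict, rows, labels, col, ratio, seed=0):
    """Perturb one logging curve with noise and return the resulting accuracy."""
    rng = random.Random(seed)
    noisy_values = add_gaussian_noise([row[col] for row in rows], ratio, rng)
    noisy_rows = []
    for row, value in zip(rows, noisy_values):
        new_row = list(row)  # copy so the original data stay untouched
        new_row[col] = value
        noisy_rows.append(new_row)
    correct = sum(predict(row) == label for row, label in zip(noisy_rows, labels))
    return correct / len(labels)
```

Running `accuracy_under_noise` once per curve and per ratio (for example 0.05 and 0.10), for the committee machine and for each sub-model, produces the kind of comparison reported above.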
In the committee machine prediction module, the multi-expert classification committee machine fluid identification model is used to predict actual logging data, so that fluid discrimination results such as oil layer, oil-water layer, oil-bearing water layer and dry layer can be obtained rapidly. As shown in Fig. 5, tracks 1, 3 and 4 are logging data, track 2 is the depth track, track 5 is the stratigraphic section, and track 6 is the committee machine fluid identification result; from top to bottom, an oil layer (1), an oil-water layer (2), an oil-bearing water layer (3) and a water layer (4) are identified in sequence.
The inventive concept has been explained in detail using a specific example. The above description, which takes an oil reservoir as an example, is only intended to help in understanding the core idea of the invention; it applies equally to gas reservoirs, where fluid identification becomes the discrimination of a gas layer, a gas-water-containing layer, a water layer and the like. It should be understood that any obvious modifications, equivalents and other improvements made by those skilled in the art without departing from the spirit of the present invention fall within the scope of the present invention.

Claims (10)

1. A method for multi-expert classification committee machine well logging fluid identification, comprising the steps of:
S1: acquiring logging data and fluid type label data as input data;
S2: performing data cleaning, environmental correction and sensitive logging data screening on the input data to obtain training data;
S3: inputting the training data into a plurality of heterogeneous learners for training, and then inputting the training data into a combiner for integration to obtain a multi-expert classification committee machine fluid identification model;
S4: inputting new logging data into the multi-expert classification committee machine fluid identification model to obtain a logging fluid identification result.
2. The method of multi-expert classification committee machine well logging fluid identification as claimed in claim 1, wherein in step S2, data cleansing is a process of re-examining and verifying well logging data for removing redundant information, correcting erroneous information and providing data consistency, wherein removing redundant information comprises removing resistivity well logging data with similar radial formation characteristics and removing well logging data with similar longitudinal formation characteristics; correcting the error information comprises removing abnormal well section data and logging data with large noise influence; providing data consistency includes specifying names, units, and data types of different well logging data;
the environment correction is to remove the influence of well bores, mud, well deviation and surrounding rocks on the quality of logging data by using a logging interpretation chart or a correction formula, and comprises electric logging environment correction based on three-parameter inversion, acoustic logging environment correction based on a diffuse reflection acoustic path method, neutron logging environment correction based on an acoustic time difference neutron logging reconstruction method and density logging environment correction based on an acoustic time difference density logging reconstruction method.
3. The method of multi-expert classification committee machine well logging fluid identification as claimed in claim 1, wherein in the step S2, the sensitive well logging data screening comprises the steps of:
S201: taking all logging data as input, training with a BP neural network, a probabilistic neural network and a decision tree algorithm to construct a pre-trained intelligent model;
S202: inputting the new logging data as prediction data into the pre-trained intelligent model for fluid type prediction to obtain prediction result A;
S203: perturbing one series of logging data while keeping the other logging data unchanged to obtain a perturbed data set;
S204: inputting the perturbed data set into the pre-trained intelligent model for fluid type prediction to obtain prediction result B1, and calculating the degree of influence δ(B1, A) of prediction result B1 relative to prediction result A;
S205: perturbing the different series of logging data in turn and inputting them into the pre-trained intelligent model to obtain prediction results B2, B3, ..., Bm, and calculating their degree of influence δ(Bm, A) relative to A;
S206: judging the sensitivity of the logging data according to the degree of influence of perturbing the different logging series on the output, on the basis that the greater the degree of influence, the more sensitive the logging data.
4. The method of claim 1, wherein in step S3, the joint training of multiple heterogeneous learners uses heterogeneous intelligent algorithms that receive the same input data and each perform hyperparameter grid search and intelligent model training to obtain a plurality of parallel sub-models, wherein the heterogeneous intelligent algorithms include a BP neural network, a probabilistic neural network and a decision tree algorithm;
in the combiner integration, a relative majority voting method, an absolute majority voting method, a weighted voting method and a learning method are adopted, wherein the relative majority voting method takes the judgment result with the largest frequency count of the prediction result of each sub-model as output, when more than one result frequency count is the largest, the sub-model with the best training performance is selected to output as the final judgment result, and the absolute majority voting method takes the judgment result with the frequency count of the prediction result of the sub-model exceeding half of the total frequency count as output, and when the condition that the frequency count is over half does not exist, the prediction is refused; the weighted voting method gives the optimal weight to each type of prediction result, and the final judgment result with the highest score is output; the learning method utilizes an intelligent algorithm to establish a nonlinear mapping relation between all sub-model outputs and a real result, and the relation is utilized to guide the combination of unknown data prediction results.
5. The method of claim 1, wherein in step S4, the multi-expert classification committee machine well logging fluid recognition model comprises a plurality of trained submodels and a combiner, the new well logging data is subjected to data cleaning and environmental correction, and then input into the submodels for fluid type discrimination, and the combiner outputs the final fluid type discrimination result.
6. A system for multi-expert classification committee machine logging fluid identification, comprising a training data preparation module, a committee machine training module, and a committee machine prediction module;
the training data preparation module is used for carrying out data cleaning, environmental correction and sensitive logging data screening on the input logging data and the fluid type label data to obtain training data;
the committee machine training module is used for performing multi-heterogeneous learner combined training and committee machine combiner integration on training data to obtain a multi-expert classification committee machine fluid identification model;
and the committee machine prediction module is used for obtaining a logging fluid identification result according to the new logging data and the multi-expert classification committee machine fluid identification model.
7. The method of multi-expert classification committee machine well logging fluid identification as claimed in claim 5, wherein in the training data preparation module, data cleansing is a process of re-examining and verifying well logging data for removing redundant information, correcting erroneous information and providing data consistency, wherein removing redundant information comprises removing resistivity well logging data with similar radial formation characteristics and removing well logging data with similar longitudinal formation characteristics; correcting erroneous information comprises removing abnormal interval data and well logging data with greater noise impact; and providing data consistency comprises normalizing names, units and data types of different well logging data;
the environment correction is to remove the influence of well bores, mud, well deviation and surrounding rocks on the quality of logging data by using an interpretation chart or a correction formula, and comprises electric logging environment correction based on three-parameter inversion, sound wave logging environment correction based on a diffuse reflection acoustic path method, neutron logging environment correction based on a sound wave time difference neutron logging reconstruction method and density logging environment correction based on a sound wave time difference density logging reconstruction method.
8. The method of multi-expert classification committee machine well logging fluid identification as claimed in claim 6, wherein in the training data preparation module, sensitive well logging data screening comprises the steps of:
S201: taking all logging data as input, training with a BP neural network, a probabilistic neural network and a decision tree algorithm to construct a pre-trained intelligent model;
S202: inputting the new logging data as prediction data into the pre-trained intelligent model for fluid type prediction to obtain prediction result A;
S203: perturbing one series of logging data by a certain proportion while keeping the other logging data unchanged to obtain a perturbed data set;
S204: inputting the perturbed data set into the pre-trained intelligent model for fluid type prediction to obtain prediction result B1, and recording the degree of influence δ(B1, A) of prediction result B1 relative to prediction result A;
S205: perturbing the different series of logging data in turn and inputting them into the pre-trained intelligent model to obtain prediction results B2, B3, ..., Bm, and recording their degree of influence δ(Bm, A) relative to A;
S206: judging the sensitivity of the logging data according to the degree of influence of perturbing the different logging series on the output, on the basis that the greater the degree of influence, the more sensitive the logging data.
9. The method of claim 6, wherein in the committee machine training module, the joint training of multiple heterogeneous learners uses heterogeneous intelligent algorithms that receive the same input data and each perform hyperparameter grid search and intelligent model training to obtain a plurality of parallel sub-models, wherein the heterogeneous intelligent algorithms comprise a BP neural network, a probabilistic neural network and a decision tree algorithm;
the committee machine combiner integration adopts a relative majority voting method, an absolute majority voting method, a weighted voting method and a learning method, wherein the relative majority voting method takes as output the judgment result that occurs most frequently among the prediction results of the sub-models, and when more than one result shares the highest frequency, the output of the sub-model with the best training performance is selected as the final judgment result; the absolute majority voting method takes as output the judgment result whose frequency among the sub-model prediction results exceeds half of the total frequency, and refuses prediction when no result exceeds half; the weighted voting method gives the optimal weight to each type of prediction result, and the final judgment result with the highest score is output; the learning method uses an intelligent algorithm to establish a nonlinear mapping relation between all sub-model outputs and the real result, and this relation is used to guide the combination of unknown data prediction results.
10. The method of claim 6, wherein the committee machine prediction module is centered on the multi-expert classification committee machine fluid identification model, which comprises a plurality of trained sub-models and a combiner; after data cleaning and environmental correction, the logging data are input into the sub-models for fluid type discrimination, and the final fluid type discrimination result is output through the combiner.
CN202110939640.9A 2021-08-16 2021-08-16 Method and system for identifying logging fluid by using multi-expert classification committee machine Pending CN113592028A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110939640.9A CN113592028A (en) 2021-08-16 2021-08-16 Method and system for identifying logging fluid by using multi-expert classification committee machine

Publications (1)

Publication Number Publication Date
CN113592028A true CN113592028A (en) 2021-11-02

Family

ID=78258106

Country Status (1)

Country Link
CN (1) CN113592028A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination