CN113988382A - Oil field recovery rate prediction method based on ensemble learning - Google Patents

Publication number
CN113988382A
Authority
CN
China
Prior art keywords
oil field
predictor
prediction
field recovery
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111178615.XA
Other languages
Chinese (zh)
Inventor
安兴菊
任志勇
杨雪玲
闵小翠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huali Vocational College of Technology
Original Assignee
Guangzhou Huali Vocational College of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huali Vocational College of Technology
Priority to CN202111178615.XA
Publication of CN113988382A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/02 Agriculture; Fishing; Mining

Abstract

The invention discloses an oil field recovery rate prediction method based on ensemble learning. Oil field historical data, such as geological factors and development condition factors, are extracted and labeled with the recovery rate to form an oil field historical initial data set. A score is calculated for each feature in the initial data set using the Pearson correlation coefficient, the top k features are retained, and an oil field historical optimal data set is constructed. A plurality of CART decision tree base class predictors are set up according to the differences among oil fields, and the base class predictors are combined by linear weighting to form a strong oil field recovery rate predictor that performs the prediction. The method uses decision trees to mine the internal relation between the geological factors, the development condition factors, and the oil field recovery rate, and establishes an oil field recovery rate predictor suitable for each period of oil field development. Setting up a plurality of base class predictors according to oil field differences and combining them through ensemble learning improves the prediction accuracy of the strong predictor and enables accurate prediction of the oil field recovery rate.

Description

Oil field recovery rate prediction method based on ensemble learning
Technical Field
The invention belongs to the field of geophysical exploration and the field of artificial intelligence, and particularly relates to an oil field recovery rate prediction method based on ensemble learning.
Background
The recovery rate is an important part of evaluating the potential of a water-flooded oil field. Accurate prediction of the recovery rate can effectively guide deployment at each stage of oil field development and support recovery measures that match actual field conditions. As oil field development enters the ultra-high water-cut period, traditional recovery rate calculation methods, such as empirical formulas and reservoir numerical simulation, adapt poorly and produce biased predictions, so a large uncertainty range tends to result. Predicting the oil field recovery rate with machine learning can make full use of the data accumulated during oil field development: the internal relation between oil field data, such as the geological conditions and development conditions of each period, and the oil field recovery rate is analyzed, and an oil field recovery rate predictor suitable for each period of oil field development is established, so that the recovery rate can be predicted accurately.
Geological conditions, development conditions, and other factors differ significantly within an oil field, and a single predictor cannot comprehensively reflect the relation between the oil field recovery rate and the oil field data. To address this problem, a plurality of base class predictors with different parameter structures are set up, and the base class predictors are trained and integrated using ensemble learning. On this basis, an oil field recovery rate prediction method based on ensemble learning is provided.
Disclosure of Invention
The invention provides an oil field recovery rate prediction method based on ensemble learning, aiming to solve the problems that traditional oil field recovery rate calculation methods adapt poorly and give inaccurate predictions in the ultra-high water-cut period of an oil field. The method selects the features that are highly correlated with the oil field recovery rate using a Pearson correlation coefficient feature selection method, uses decision trees to mine the internal relation between factors such as the geological conditions and development conditions of each period and the oil field recovery rate, and establishes an oil field recovery rate predictor suitable for each period of oil field development. A plurality of base class predictors are set up according to the differences among oil fields, trained, and integrated through ensemble learning into a strong predictor, achieving accurate prediction of the oil field recovery rate.
To achieve this purpose, the technical scheme of the invention mainly comprises the following three steps:
1. Extracting and processing oil field historical data:
Oil field historical data, such as geological factors, development condition factors, and recovery rate, are extracted from the oil field database and processed with data cleaning methods such as missing value filling, eliminating data quality problems such as missing values and invalid values. Using a data integration method, the geological factors, development condition factors, and other data are extracted and merged by block and year, and the historical recovery rate is used as the label value to form the oil field historical initial data set.
2. Selecting features highly correlated with the recovery rate:
The oil field historical initial data set contains redundant features and irrelevant features that have a low correlation with oil field recovery rate prediction. These features seriously degrade the performance of the oil field recovery rate predictor and lower its convergence accuracy. A Pearson correlation coefficient feature selection method is used to calculate the correlation between each feature and the oil field recovery rate, and the features highly correlated with the oil field recovery rate are selected to improve the prediction accuracy of the predictor.
The Pearson correlation coefficient is used to calculate a score for each feature in the oil field historical initial data set, denoted PScore. The higher the PScore, the higher the correlation between the feature and the oil field recovery rate. Based on the calculated correlations, the top k features are retained (k is half of the total number of features), and the oil field historical optimal data set is constructed.
3. Predicting oil field recovery rate based on ensemble learning:
The CART decision tree machine learning method is used to mine the internal patterns of the oil field historical optimal data set and to establish an oil field recovery rate predictor suitable for each period of oil field development. Because conditions within an oil field are complex and differ significantly, a single predictor cannot comprehensively reflect the relation between the oil field recovery rate and the oil field data. A plurality of base class predictors are therefore set up according to the differences among oil fields and integrated through ensemble learning: a strong oil field recovery rate predictor is obtained by linear weighting, which avoids the shortcomings of a single predictor and yields a more reasonable prediction result.
Building an ensemble learning predictor:
(1) Set up m CART decision tree predictors and select the mean square error, denoted CMSE, as the evaluation criterion of each predictor. Parameters such as the number of leaf nodes and the tree depth differ between predictors, so the accuracy of the predictors differs to some extent;
(2) Train the m CART decision tree predictors on the oil field historical optimal data set until stable convergence is achieved;
(3) Calculate a weight coefficient Prew for each predictor from its CMSE value after convergence, where Prew is inversely proportional to the CMSE value. Combine the base class predictors by linear weighting to form the strong oil field recovery rate predictor and realize oil field recovery rate prediction.
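As a worked illustration of step (3), one weighting that satisfies the stated inverse proportionality is sketched below. The normalization that makes the weights sum to 1, and the symbols $f_j$ (the j-th base class predictor) and $\hat{y}$ (the output of the strong predictor), are assumptions introduced here for illustration; the text itself only states that Prew is inversely proportional to CMSE.

$$\mathrm{Prew}_j = \frac{1/\mathrm{CMSE}_j}{\sum_{l=1}^{m} 1/\mathrm{CMSE}_l}, \qquad \hat{y}(x) = \sum_{j=1}^{m} \mathrm{Prew}_j \, f_j(x)$$

For example, with m = 3 and hypothetical CMSE values of 0.01, 0.02, and 0.04, the reciprocals are 100, 50, and 25 (sum 175), which would give weights of 4/7 ≈ 0.571, 2/7 ≈ 0.286, and 1/7 ≈ 0.143.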
The invention has the following beneficial effects: decision trees are used to mine the internal relation between the selected factors, such as geological conditions and development conditions, and the oil field recovery rate, and an oil field recovery rate predictor suitable for each period of oil field development is established; through ensemble learning, a plurality of base class predictors are set up according to the differences among oil fields and combined into a strong oil field recovery rate predictor, which improves prediction accuracy and enables accurate prediction of the oil field recovery rate.
Drawings
FIG. 1 is a flow chart of the present invention
Detailed Description
The invention is described in further detail below with reference to FIG. 1:
1. Extracting and processing oil field historical data:
Oil field historical data, such as geological factors, development condition factors, and recovery rate, are extracted from the oil field database and processed with data cleaning methods such as missing value filling, eliminating data quality problems, such as missing values and invalid values, that arise from the long storage period of the oil field historical data. Using a data integration method, the geological factors, development condition factors, and other data are extracted and merged by block and year. The merged data are then normalized to eliminate differences in dimension, and the historical recovery rate is used as the label value to form the oil field historical initial data set.
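The preprocessing described in this step can be sketched in plain Python as follows. This is a minimal illustration and not the patent's implementation: the record layout, the field names (`porosity`, `permeability`, `water_cut`, `recovery`), the sample values, mean filling as the missing-value strategy, and min-max scaling as the normalization method are all assumptions chosen for the example.

```python
from collections import defaultdict

def merge_by_block_year(geology, development, recovery):
    """Join feature records on (block, year) and attach the recovery label."""
    merged = defaultdict(dict)
    for rec in geology + development:
        key = (rec["block"], rec["year"])
        merged[key].update({k: v for k, v in rec.items() if k not in ("block", "year")})
    rows = []
    for rec in recovery:  # keep only keys that have a label value
        key = (rec["block"], rec["year"])
        if key in merged:
            rows.append({"block": key[0], "year": key[1], **merged[key],
                         "recovery": rec["recovery"]})
    return rows

def fill_missing_with_mean(rows, features):
    """Replace None entries with the mean of the known values (assumed strategy)."""
    for f in features:
        known = [r[f] for r in rows if r.get(f) is not None]
        mean = sum(known) / len(known) if known else 0.0
        for r in rows:
            if r.get(f) is None:
                r[f] = mean
    return rows

def min_max_normalize(rows, features):
    """Scale each feature to [0, 1] so that units no longer dominate."""
    for f in features:
        lo = min(r[f] for r in rows)
        hi = max(r[f] for r in rows)
        span = hi - lo
        for r in rows:
            r[f] = (r[f] - lo) / span if span else 0.0
    return rows

# Hypothetical records for two blocks; None marks a missing value.
geology = [
    {"block": "A", "year": 2018, "porosity": 0.20, "permeability": 150.0},
    {"block": "A", "year": 2019, "porosity": None, "permeability": 170.0},
    {"block": "B", "year": 2018, "porosity": 0.30, "permeability": None},
]
development = [
    {"block": "A", "year": 2018, "water_cut": 0.80},
    {"block": "A", "year": 2019, "water_cut": 0.85},
    {"block": "B", "year": 2018, "water_cut": 0.90},
]
recovery = [
    {"block": "A", "year": 2018, "recovery": 0.31},
    {"block": "A", "year": 2019, "recovery": 0.33},
    {"block": "B", "year": 2018, "recovery": 0.28},
]

features = ["porosity", "permeability", "water_cut"]
rows = merge_by_block_year(geology, development, recovery)
rows = fill_missing_with_mean(rows, features)
rows = min_max_normalize(rows, features)
```

Each resulting row holds the normalized features for one block and year together with its recovery label, which corresponds to one sample of the oil field historical initial data set.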
2. Selecting features highly correlated with the recovery rate:
The feature dimension of the oil field historical initial data set is high, and the data set contains redundant features and irrelevant features that have a low correlation with oil field recovery rate prediction. These features can seriously degrade the performance of the oil field recovery rate predictor and lower its convergence accuracy. A Pearson correlation coefficient feature selection method is used to calculate the correlation between each feature and the oil field recovery rate, and the features highly correlated with the oil field recovery rate are selected to improve the prediction accuracy of the predictor.
The Pearson correlation coefficient is used to calculate a score for each feature in the oil field historical initial data set, denoted PScore. The higher the PScore, the higher the correlation between the feature and the oil field recovery rate. Based on the calculated correlations, the top k features are retained (k is half of the total number of features), and the oil field historical optimal data set is constructed. The PScore calculation formula is:
$$\mathrm{PScore}_t = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$
where t indicates that the t-th feature is currently selected, $x_i$ represents the value of the feature at i, $\bar{x}$ represents the mean of the feature, $y_i$ represents the value of the oil field recovery rate at i, and $\bar{y}$ represents the mean of the oil field recovery rate.
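The scoring and top-k selection can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the sample columns, and the choice to rank by the absolute value of the coefficient (so that strongly negative correlations also count as relevant) are not taken from the source, which only says that a higher PScore means a higher correlation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (dev_x * dev_y)

def select_top_k(columns, target, k=None):
    """Rank features by |PScore| (assumption) and keep the top k (default: half)."""
    if k is None:
        k = len(columns) // 2
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical feature columns and recovery labels for four samples.
columns = {
    "porosity": [0.1, 0.2, 0.3, 0.4],
    "water_cut": [0.9, 0.7, 0.8, 0.6],
    "well_count": [5, 5, 5, 5],
    "noise": [0.3, 0.1, 0.4, 0.2],
}
target = [0.20, 0.25, 0.30, 0.35]

selected, scores = select_top_k(columns, target)
print(selected)
```

With four features, k defaults to 2, so only the two columns with the strongest linear relation to the label are kept for the oil field historical optimal data set.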
3. Predicting oil field recovery rate based on ensemble learning:
The decision tree machine learning method is used to mine the characteristics of the oil field historical optimal data set and to establish an oil field recovery rate predictor suitable for each period of oil field development. Because conditions within an oil field are complex and differ significantly, a single predictor cannot comprehensively reflect the relation between the oil field recovery rate and the oil field data. A plurality of base class predictors are therefore set up according to the differences among oil fields and integrated through ensemble learning: a strong oil field recovery rate predictor is obtained by linear weighting, which avoids the shortcomings of a single predictor and yields a more reasonable prediction result.
Building an ensemble learning predictor:
(1) Set up m CART decision tree predictors and select the mean square error, denoted CMSE, as the evaluation criterion of each predictor. Parameters such as the number of leaf nodes and the tree depth differ between predictors, so the accuracy of the predictors differs;
(2) Train the m CART decision tree predictors on the oil field historical optimal data set until stable convergence is achieved;
(3) Calculate a weight coefficient Prew for each predictor from its CMSE value after convergence, where Prew is inversely proportional to the CMSE value, that is, the smaller the CMSE value, the larger Prew. Combine the base class predictors by linear weighting to form the strong oil field recovery rate predictor and realize oil field recovery rate prediction.
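Steps (1) to (3) can be sketched in plain Python as shown below. This is a simplified illustration, not the patent's implementation: the tree is a small CART-style regression tree that varies only `max_depth` and `min_leaf`, CMSE is computed on the training data, the weights are normalized reciprocals of CMSE (one way to satisfy the inverse proportionality), and all names and sample values are hypothetical.

```python
def mse(values):
    """Mean squared deviation of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def build_tree(X, y, max_depth, min_leaf):
    """Grow a regression tree; each split minimizes the summed squared error."""
    leaf = sum(y) / len(y)
    if max_depth == 0 or len(y) < 2 * min_leaf or mse(y) == 0:
        return leaf
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:  # candidate thresholds
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = (len(left) * mse([y[i] for i in left])
                    + len(right) * mse([y[i] for i in right]))
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None:
        return leaf
    _, j, t, left, right = best
    return (j, t,
            build_tree([X[i] for i in left], [y[i] for i in left], max_depth - 1, min_leaf),
            build_tree([X[i] for i in right], [y[i] for i in right], max_depth - 1, min_leaf))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def fit_ensemble(X, y, configs, eps=1e-12):
    """Train one tree per (max_depth, min_leaf) and weight it by 1 / CMSE."""
    trees = [build_tree(X, y, depth, leaf) for depth, leaf in configs]
    cmse = [sum((predict_tree(t, r) - v) ** 2 for r, v in zip(X, y)) / len(y)
            for t in trees]
    inverse = [1.0 / (c + eps) for c in cmse]  # eps guards against CMSE == 0
    total = sum(inverse)
    weights = [w / total for w in inverse]
    return trees, weights, cmse

def predict_ensemble(trees, weights, row):
    """Strong predictor: linearly weighted sum of the base predictions."""
    return sum(w * predict_tree(t, row) for t, w in zip(trees, weights))

# Hypothetical normalized features [porosity, water_cut] and recovery labels.
X = [[0.10, 0.95], [0.15, 0.90], [0.20, 0.92], [0.25, 0.85],
     [0.30, 0.80], [0.35, 0.82], [0.40, 0.75], [0.45, 0.70]]
y = [0.20, 0.22, 0.25, 0.27, 0.30, 0.33, 0.35, 0.36]

configs = [(1, 1), (2, 1), (3, 2)]  # m = 3 predictors with different structures
trees, weights, cmse = fit_ensemble(X, y, configs)
prediction = predict_ensemble(trees, weights, [0.28, 0.83])
print(weights, prediction)
```

Weighting by training error favors deeper trees that fit the training data closely; measuring CMSE on held-out data instead would be a reasonable variation, but the source does not specify which data the CMSE is computed on.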
The foregoing is only a preferred embodiment of the invention. Any person skilled in the art may use the technical solutions described above to modify or change it into equivalent embodiments. Any simple modification, change, or amendment made to the above embodiment according to the technical solutions of the invention, without departing from those technical solutions, falls within the protection scope of the technical solutions of the invention.

Claims (1)

1. An oil field recovery rate prediction method based on ensemble learning, characterized by comprising the following steps:
extracting oil field historical data from an oil field database, extracting and merging geological factors and development condition factors by block and year using a data integration method, and forming an oil field historical initial data set with the historical recovery rate as the label value; to address the problem that the oil field historical initial data set contains redundant features and irrelevant features that have a low correlation with oil field recovery rate prediction, calculating the correlation between each feature and the oil field recovery rate using a Pearson correlation coefficient feature selection method, and selecting the features highly correlated with the oil field recovery rate to form an oil field historical optimal data set; establishing decision trees that mine the oil field historical optimal data set as base class predictors, and integrating a plurality of base class predictors by linear weighting according to the ensemble learning method to obtain a strong oil field recovery rate predictor, thereby overcoming the shortcomings of a single predictor and realizing oil field recovery rate prediction.
CN202111178615.XA 2021-10-13 2021-10-13 Oil field recovery rate prediction method based on ensemble learning Withdrawn CN113988382A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111178615.XA CN113988382A (en) 2021-10-13 2021-10-13 Oil field recovery rate prediction method based on ensemble learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111178615.XA CN113988382A (en) 2021-10-13 2021-10-13 Oil field recovery rate prediction method based on ensemble learning

Publications (1)

Publication Number Publication Date
CN113988382A (en) 2022-01-28

Family

ID=79737965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111178615.XA Withdrawn CN113988382A (en) 2021-10-13 2021-10-13 Oil field recovery rate prediction method based on ensemble learning

Country Status (1)

Country Link
CN (1) CN113988382A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115759363A (en) * 2022-11-01 2023-03-07 昆仑数智科技有限责任公司 Model training method and device, and recovery ratio determining method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220128
