CN117391227A - Oil pumping unit system efficiency prediction method and system based on ensemble learning algorithm - Google Patents

Oil pumping unit system efficiency prediction method and system based on ensemble learning algorithm

Publication number
CN117391227A
Authority
CN
China
Prior art keywords
system efficiency
data
model
pumping unit
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210776243.9A
Other languages
Chinese (zh)
Inventor
王才
张喜顺
赵瑞东
师俊峰
熊春明
孙艺真
邓峰
刘猛
陈诗雯
陈冠宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Petrochina Co Ltd
Original Assignee
Petrochina Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Petrochina Co Ltd filed Critical Petrochina Co Ltd
Priority to CN202210776243.9A priority Critical patent/CN117391227A/en
Publication of CN117391227A publication Critical patent/CN117391227A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • EFIXED CONSTRUCTIONS
    • E21EARTH DRILLING; MINING
    • E21BEARTH DRILLING, e.g. DEEP DRILLING; OBTAINING OIL, GAS, WATER, SOLUBLE OR MELTABLE MATERIALS OR A SLURRY OF MINERALS FROM WELLS
    • E21B43/00Methods or apparatus for obtaining oil, gas, water, soluble or meltable materials or a slurry of minerals from wells
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mining & Mineral Resources (AREA)
  • Geology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Environmental & Geological Engineering (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Fluid Mechanics (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • General Life Sciences & Earth Sciences (AREA)
  • Geochemistry & Mineralogy (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of oil extraction engineering and discloses a pumping unit system efficiency prediction method and system based on an ensemble learning algorithm. A construction unit builds oil extraction big data resource pools, characterized by wellbore data, pumping unit data and production data, for blocks with different properties; a screening unit screens the main control factors affecting system efficiency in the big data resource pool through a machine learning model; and a prediction unit predicts the system efficiency with a plurality of ensemble learning models, taking the screened main control factors as the main characteristic parameters. The method does not depend on solving a complicated pumping unit system efficiency equation. It analyzes pumping unit production big data with an ensemble learning algorithm, fully considers the sensitivity of each production parameter, achieves high prediction accuracy, and meets the development needs of the oilfield Internet of Things. It is significant for promoting low-cost Internet of Things construction, tapping production potential, reducing costs and increasing efficiency.

Description

Oil pumping unit system efficiency prediction method and system based on ensemble learning algorithm
Technical Field
The invention belongs to the field of oil extraction engineering, and particularly relates to a pumping unit system efficiency prediction method and system based on an ensemble learning algorithm.
Background
Pumping wells accumulate massive amounts of data, which contain the information needed for energy consumption evaluation and working system optimization models. Traditional working system optimization is mostly based on physical modeling, and it is difficult to build a physical model that comprehensively accounts for all influencing factors when there are many of them. System efficiency is an intuitive index for analyzing the energy consumption of a pumping unit, and the current common approach is to approximate it by regression based on the system efficiency calculation formula.
The working process of a sucker rod pumping system is one of continuous energy transmission and conversion, and some energy is lost at each transmission step. The energy input to the surface motor, minus the various losses in the system, is the effective energy the system uses to lift the liquid. Let the hydraulic power $P_e$ denote the work done by the pumping unit per unit time to lift a given quantity of liquid to the surface, and let $P_{in}$ denote the motor input power; their ratio is the conventional system efficiency:
$$\eta = \frac{P_e}{P_{in}}$$
According to the working characteristics of the pumping unit system, the system efficiency can be divided into a ground efficiency $\eta_{ground}$ and a downhole efficiency $\eta_{down}$. Defining the power consumed by the polished rod to lift the liquid and overcome the various downhole resistances as the polished rod power $P_{pr}$, then:
$$\eta_{ground} = \frac{P_{pr}}{P_{in}}, \qquad \eta_{down} = \frac{P_e}{P_{pr}}, \qquad \eta = \eta_{ground} \times \eta_{down}$$
The energy loss in the ground part occurs in the motor, the belt and reduction gearbox, and the four-bar linkage, therefore:
$$\eta_{ground} = K \times \eta_1 \times \eta_2 \times \eta_3$$
where $K$ is the payload factor; $\eta_1$ is the motor efficiency; $\eta_2$ is the belt and reduction gearbox efficiency; $\eta_3$ is the four-bar linkage efficiency.
The energy loss in the downhole part occurs in the packing box, the sucker rod, the pump and the tubing string, therefore:
$$\eta_{down} = \eta_4 \times \eta_5 \times \eta_6 \times \eta_7$$
where $\eta_4$ is the packing box efficiency; $\eta_5$ is the sucker rod efficiency; $\eta_6$ is the oil pump efficiency; $\eta_7$ is the tubing string efficiency. The system efficiency of the sucker rod pumping system is therefore:
$$\eta = K \times \eta_1 \times \eta_2 \times \eta_3 \times \eta_4 \times \eta_5 \times \eta_6 \times \eta_7$$
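As a worked illustration of the product formula above, the following minimal Python sketch multiplies the payload factor K by seven component efficiencies. The function name `system_efficiency` and all numeric values are hypothetical and chosen only for illustration; they do not come from the patent.

```python
from math import prod


def system_efficiency(k, component_efficiencies):
    """Return K multiplied by the product of the seven component efficiencies."""
    if len(component_efficiencies) != 7:
        raise ValueError("expected seven component efficiencies")
    return k * prod(component_efficiencies)


# Hypothetical values, for illustration only (not taken from the patent).
K = 0.95                             # payload factor
ground = [0.90, 0.92, 0.95]          # motor, belt and reduction gearbox, four-bar linkage
downhole = [0.97, 0.90, 0.80, 0.95]  # packing box, sucker rod, pump, tubing string

eta = system_efficiency(K, ground + downhole)
print(f"system efficiency = {eta:.3f}")
```

Because the result is a product of many factors that are each below 1, even moderate losses in individual components compound into a low overall efficiency, which is one reason the individual terms are hard to estimate accurately from a physical model.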
The conventional definition of system efficiency shows that many factors affect it, such as the liquid supply capacity of the reservoir, the fluid properties, the well trajectory, and the equipment and operating parameters of the well. Because system efficiency has many influencing factors and the model is complex, it is difficult to establish an accurate physical model. At present, methods that predict pumping unit system efficiency with an ensemble learning algorithm are rarely reported. Establishing a system efficiency prediction model based on big data machine learning is therefore significant for promoting low-cost Internet of Things construction, tapping production potential, reducing costs and increasing efficiency.
Disclosure of Invention
In view of the above problems, in one aspect, the present invention discloses a pumping unit system efficiency prediction method based on an ensemble learning algorithm, the method comprising:
constructing oil extraction big data resource pools, characterized by wellbore data, pumping unit data and production data, for blocks with different properties;
screening the main control factors affecting system efficiency in the big data resource pool through a machine learning model;
and predicting the system efficiency with a plurality of ensemble learning models, taking the screened main control factors as the main characteristic parameters.
Further, the wellbore data includes pump diameter, pump depth, pump efficiency, and actual lift;
the pumping unit data comprise a model, a motor model, a stroke frequency, consumed power, a maximum load, a minimum load, daily power consumption, torque and maximum torque;
the production data comprise daily oil production, accumulated oil production liquid, water content, working fluid level, sinking degree, oil pressure, casing pressure and hundred-meter ton liquid power consumption.
Further, before the big data resource pool is established, the data are analyzed and preprocessed, including analyzing the share of data from each month, the count of each type of data, and the overall data quality.
Further, screening the main control factors affecting system efficiency through the machine learning model comprises the following steps:
calculating the correlation between the system efficiency and each data item in the big data resource pool by using the machine learning model;
drawing a correlation heat map of the system efficiency and each characteristic parameter according to the correlation calculation result;
and, with reference to the correlation heat map, ranking the characteristic parameters by the strength of their correlation with the system efficiency, including a ranking of the characteristic parameters positively correlated with the system efficiency and a ranking of those negatively correlated with it.
Further, the machine learning model includes a Pearson correlation coefficient model.
Further, the ensemble learning model includes a random forest model, an AdaBoost model, a GradientBoosting model, and/or a Bagging model.
Further, the prediction method further includes: comparing the prediction results of the ensemble learning models, and selecting the prediction result of the ensemble learning model with the highest fitting accuracy as the final prediction result.
In another aspect, the invention also discloses a pumping unit system efficiency prediction system based on an ensemble learning algorithm, the prediction system comprising:
a construction unit for constructing oil extraction big data resource pools, characterized by wellbore data, pumping unit data and production data, for blocks with different properties;
a screening unit for screening the main control factors affecting system efficiency in the big data resource pool through a machine learning model;
and a prediction unit for predicting the system efficiency with a plurality of ensemble learning models, taking the screened main control factors as the main characteristic parameters.
Further, the wellbore data includes pump diameter, pump depth, pump efficiency, and actual lift;
the pumping unit data comprise a model, a motor model, a stroke frequency, consumed power, a maximum load, a minimum load, daily power consumption, torque and maximum torque;
the production data comprise daily oil production, accumulated oil production liquid, water content, working fluid level, sinking degree, oil pressure, casing pressure and hundred-meter ton liquid power consumption.
Further, the system further comprises:
a preprocessing unit for analyzing and preprocessing the data before the big data resource pool is built, including analyzing the share of data from each month, the count of each type of data, and the overall data quality;
and a comparison unit for comparing the prediction results of the ensemble learning models and selecting the prediction result of the ensemble learning model with the highest fitting accuracy as the final prediction result.
Further, the screening unit performs the following steps:
calculating the correlation between the system efficiency and each data item in the big data resource pool by using the machine learning model;
drawing a correlation heat map of the system efficiency and each characteristic parameter according to the correlation calculation result;
and, with reference to the correlation heat map, ranking the characteristic parameters by the strength of their correlation with the system efficiency, including a ranking of the characteristic parameters positively correlated with the system efficiency and a ranking of those negatively correlated with it.
Further, the machine learning model includes a Pearson correlation coefficient model; the ensemble learning model includes a random forest model, an AdaBoost model, a GradientBoosting model and/or a Bagging model.
The invention has the beneficial effects that:
the method does not depend on solving a complicated pumping unit system efficiency equation; it analyzes pumping unit production big data with an ensemble learning algorithm, fully considers the sensitivity of each production parameter, achieves high prediction accuracy, meets the development needs of the oilfield Internet of Things, and is significant for promoting low-cost Internet of Things construction, tapping production potential, reducing costs and increasing efficiency;
the invention analyzes and preprocesses the data before establishing the big data resource pool, so that the overall quality and approximate distribution of the data can be understood quickly, which provides a basis for the subsequent screening of main control factors;
after the machine learning model is used to calculate the correlation between the system efficiency and each data item in the big data resource pool, the correlation heat map and the correlation ranking present the correlation of each characteristic parameter visually, so that the main control factors can be screened out more clearly and intuitively.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a schematic diagram of a pumping unit power curve simulation flow in an embodiment of the invention;
FIG. 2 shows a pie chart of data for each month in data analysis in an embodiment of the invention;
FIG. 3 shows a histogram of model number statistics in an embodiment of the invention;
FIG. 4 shows a box plot of feature statistics (features with values < 100) in an embodiment of the invention;
FIG. 5 shows a correlation heat map of the factors affecting system efficiency in an embodiment of the invention;
FIG. 6 shows the correlation ranking of the main control factors of system efficiency in an embodiment of the invention;
FIG. 7 shows the prediction results of the random forest model in an embodiment of the invention;
FIG. 8 shows the prediction results of the AdaBoost model in an embodiment of the invention;
FIG. 9 shows the prediction results of the GradientBoosting model in an embodiment of the invention;
FIG. 10 shows the prediction results of the Bagging model in an embodiment of the invention;
FIG. 11 shows the prediction results of a support vector machine model with a radial basis function kernel in an embodiment of the invention;
FIG. 12 shows the prediction results of a support vector machine model with a polynomial kernel in an embodiment of the invention;
FIG. 13 shows the prediction results of the K-nearest-neighbor model in an embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In machine learning, although combining weak learners (for example, simple support vector machines or decision trees) can in theory achieve good performance, in practice, for various reasons such as reusing experience with common learners, relatively strong learners are often used. A learning algorithm formed by reusing and combining such learners is called ensemble learning. Therefore, in order to improve prediction accuracy, the invention provides a method that predicts pumping unit system efficiency by analyzing production data with an ensemble learning algorithm and performs sensitivity analysis on the factors influencing system efficiency, thereby providing theoretical support for evaluating the working state of the pumping unit and optimizing its working system.
The invention provides a pumping unit system efficiency prediction method based on an ensemble learning algorithm, shown in FIG. 1, which comprises the following steps:
acquiring wellbore data, pumping unit data and production data, and preprocessing them;
forming, from the wellbore data, pumping unit data and production data, an oil extraction big data resource pool that characterizes blocks with different properties, statistically analyzing the data quality, and establishing an effective data analysis set;
screening the main control factors affecting system efficiency through a machine learning model and evaluating their correlation;
and predicting the system efficiency with a plurality of ensemble learning models, taking the main control factors affecting system efficiency as the main characteristic parameters, and selecting the best predicted value.
Based on the prediction method, the invention also constructs a pumping unit system efficiency prediction system based on an ensemble learning algorithm, the prediction system comprising:
a preprocessing unit for analyzing and preprocessing the acquired data, including analyzing the share of data from each month, the count of each type of parameter, and the overall data quality; the data include wellbore data, pumping unit data and production data;
a construction unit for constructing oil extraction big data resource pools, characterized by the wellbore data, pumping unit data and production data, for blocks with different properties;
a screening unit for screening the main control factors affecting system efficiency in the big data resource pool through a machine learning model;
a prediction unit for predicting the system efficiency with a plurality of ensemble learning models, taking the screened main control factors as the main characteristic parameters;
and a comparison unit for comparing the prediction results of the ensemble learning models and selecting the prediction result of the ensemble learning model with the highest fitting accuracy as the final prediction result.
The wellbore data comprise pump diameter, pump depth, pump efficiency and actual lifting; the pumping unit data comprise a model, a motor model, a stroke frequency, consumed power, a maximum load, a minimum load, daily power consumption, torque and maximum torque; the production data comprise daily oil production, accumulated oil production liquid, water content, working fluid level, sinking degree, oil pressure, casing pressure and hundred-meter ton liquid power consumption.
The screening unit performs the following steps:
calculating the correlation between the system efficiency and each data item in the big data resource pool by using the machine learning model;
drawing a correlation heat map of the system efficiency and each characteristic parameter according to the correlation calculation result;
and, with reference to the correlation heat map, ranking the characteristic parameters by the strength of their correlation with the system efficiency, including a ranking of the characteristic parameters positively correlated with the system efficiency and a ranking of those negatively correlated with it.
The machine learning model includes a Pearson correlation coefficient model; the ensemble learning model includes a random forest model, an AdaBoost model, a GradientBoosting model and/or a Bagging model.
The above method is described in detail with reference to specific examples.
This embodiment describes the above process in detail, taking as an example the system efficiency statistics and related production data of 1,850 production wells in the D oil field from October 2019 to February 2020.
(1) Wellbore data, pumping unit data and production data are collected, analyzed and preprocessed.
The collected characteristic parameters comprise daily data on machine model, motor model, pump diameter, pump depth, stroke frequency, daily liquid yield, daily oil yield, water content, working fluid level, sinking degree, oil pressure, casing pressure, consumed power, actual lifting, hundred-meter ton liquid power consumption, maximum load, minimum load, daily power consumption, pump efficiency, torque, maximum torque, load utilization rate, accumulated liquid yield and accumulated oil yield. The share of characteristic parameter data from each month is shown in FIG. 2. As can be seen from FIG. 2, the monthly data shares between October 2019 and February 2020 are: 22.13%, 23.25%, 24.56%, 15.03%.
The distribution of each characteristic parameter is counted programmatically. For example, FIG. 3 shows the statistics of the machine models, which are mainly CYJS8-3-37HB, PCYJY8-3-37HF and DCYJY8-3-37HB. FIG. 4 is a box plot of the features in the data set whose values are below 100; the circles in the plot represent outliers. The plot shows that the data set contains many outliers and that the overall data quality is poor.
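The data analysis in step (1) can be sketched in a few lines of Python. The sketch below computes the share of records per month and flags outliers with the 1.5 × IQR rule that box plots conventionally use for their whiskers. The patent does not state which outlier rule or tooling was used, so the rule, the function names `monthly_share` and `iqr_outliers`, and the sample records are all assumptions made for illustration.

```python
from collections import Counter
from statistics import quantiles


def monthly_share(months):
    """Return the percentage of records that fall in each month."""
    counts = Counter(months)
    total = sum(counts.values())
    return {month: 100.0 * n / total for month, n in counts.items()}


def iqr_outliers(values, whisker=1.5):
    """Return values outside [Q1 - whisker*IQR, Q3 + whisker*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical records (month, pump efficiency in %), not from the patent.
records = [("2019-10", 45.0), ("2019-10", 47.5), ("2019-11", 44.0),
           ("2019-11", 46.0), ("2019-12", 48.0), ("2019-12", 95.0),
           ("2020-01", 43.5), ("2020-02", 46.5)]

shares = monthly_share(month for month, _ in records)
outliers = iqr_outliers([value for _, value in records])
print(shares)
print(outliers)
```

Running the same two checks on every numeric feature gives a quick picture of how evenly the months are represented and how many values would appear as circles in a box plot.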
(2) System efficiency master control factor analysis based on big data analysis
In this embodiment, a Pearson correlation coefficient model is selected for the correlation analysis of the system efficiency and the production parameters. The Pearson correlation coefficient is expressed by the following formula:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where $x_i$ is the value of a characteristic parameter in the i-th data record and $\bar{x}$ is the mean of that characteristic parameter over all records; $y_i$ and $\bar{y}$ are the corresponding values of the second variable. The properties of the correlation coefficient are as follows:
1) When |r| = 1, the variables x and y are perfectly linearly related, and there is a definite functional relationship between x and y.
2) When 0 < |r| < 1, the usual criteria are: 0 < |r| ≤ 0.2 is called low correlation; 0.2 < |r| ≤ 0.6 is called moderate correlation; 0.6 < |r| ≤ 1 is called high correlation.
3) When r > 0, x and y are positively correlated; when r < 0, x and y are negatively correlated.
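The Pearson coefficient and the strength criteria above can be written out directly. The following minimal Python sketch implements the standard Pearson formula and the low, moderate and high bands listed in property 2). The function names, the sample data, and the "none" label for r = 0 (a case the text does not classify) are assumptions made for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def correlation_strength(r):
    """Classify |r| using the bands 0.2 and 0.6 given in the text."""
    magnitude = abs(r)
    if magnitude == 0:
        return "none"  # assumption: the text does not name the r = 0 case
    if magnitude <= 0.2:
        return "low"
    if magnitude <= 0.6:
        return "moderate"
    return "high"


# Hypothetical sample: pump depth (m) against system efficiency (%).
pump_depth = [900.0, 1100.0, 1300.0, 1500.0, 1700.0]
efficiency = [32.0, 30.5, 27.0, 26.0, 22.5]
r = pearson_r(pump_depth, efficiency)
print(round(r, 3), correlation_strength(r))
```

The sign of r gives the direction of the relationship and the band gives its strength, so both pieces of information are available for the ranking step that follows.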
A correlation heat map of the system efficiency and each characteristic parameter is drawn based on the Pearson correlation coefficient model, as shown in FIG. 5. As can be seen from FIG. 5, the characteristic parameter with the highest heat value is the machine model. The characteristics correlated with the system efficiency are then ranked, and the result is shown in FIG. 6. As can be seen from FIG. 6, the characteristics positively correlated with the system efficiency, ranked by correlation strength, are: daily fluid production, pump diameter, cumulative fluid production, cumulative oil production, stroke frequency, water content, torque, oil pressure, daily oil production, stroke, casing pressure, sinking degree, maximum torque and pump efficiency. The characteristics negatively correlated with the system efficiency, ranked by correlation strength, are: hundred-meter ton liquid power consumption, pump depth, working fluid level, actual lifting height, minimum load, motor model, load utilization rate, daily power consumption, model, power consumption and maximum load. This ranking can improve the efficiency and effectiveness of the subsequent model prediction.
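The ranking step can be sketched as follows: given a correlation coefficient for each characteristic parameter, split the parameters into a positive and a negative group and sort each group by strength. The coefficient values below are hypothetical placeholders, since the patent reports only the resulting order and not the numbers, and the function name `rank_by_correlation` is likewise an assumption.

```python
def rank_by_correlation(correlations):
    """Split features by sign of r and sort each group by strength."""
    positive = sorted(
        ((name, r) for name, r in correlations.items() if r > 0),
        key=lambda item: item[1],
        reverse=True,  # strongest positive correlation first
    )
    negative = sorted(
        ((name, r) for name, r in correlations.items() if r < 0),
        key=lambda item: item[1],  # most negative (strongest) first
    )
    return positive, negative


# Hypothetical coefficients against system efficiency (not from the patent).
correlations = {
    "daily fluid production": 0.62,
    "pump diameter": 0.41,
    "casing pressure": 0.05,
    "pump depth": -0.35,
    "hundred-meter ton liquid power consumption": -0.71,
}

positive, negative = rank_by_correlation(correlations)
print([name for name, _ in positive])
print([name for name, _ in negative])
```

A cutoff on |r| or a fixed number of top features could then be applied to each list to pick the main control factors; the patent does not specify such a cutoff, so none is applied here.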
(3) Model building and prediction
This embodiment uses four ensemble learning models, namely a random forest model, an AdaBoost model, a GradientBoosting model and a Bagging model, to predict the efficiency of the pumping unit system. The prediction result of the random forest model is shown in FIG. 7, that of the AdaBoost model in FIG. 8, that of the GradientBoosting model in FIG. 9, and that of the Bagging model in FIG. 10. As can be seen from FIGS. 7-10, all four ensemble learning models have good prediction capability; the AdaBoost model has the lowest fitting accuracy (R² = 0.9616) and the Bagging model the highest (R² = 0.9896). By contrast, a single machine learning prediction model performs only moderately. For example, the prediction result of a support vector machine model with a radial basis function kernel is shown in FIG. 11, and the results of a support vector machine model with a polynomial kernel and of the K-nearest-neighbor model are shown in FIG. 12 and FIG. 13, respectively. Comparing FIGS. 7-10 with FIGS. 11-13 shows that the fitting effect and prediction ability of each ensemble learning method are far better than those of a single machine learning prediction model.
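The comparison of the four ensemble models can be sketched with scikit-learn, whose regressor class names match the model names used above. This is a minimal sketch under stated assumptions: the patent does not say which library, hyperparameters or validation split were used, and the data here are synthetic stand-ins for the screened main control factors, so the scores it prints have no relation to the R² values reported for FIGS. 7-10.

```python
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                              GradientBoostingRegressor, RandomForestRegressor)
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the screened features and system efficiency.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 4))
y = (20.0 + 30.0 * X[:, 0] + 10.0 * X[:, 1] ** 2 - 15.0 * X[:, 2]
     + rng.normal(0.0, 0.5, size=400))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Default hyperparameters; an assumption, not the patent's settings.
models = {
    "RandomForest": RandomForestRegressor(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "GradientBoosting": GradientBoostingRegressor(random_state=0),
    "Bagging": BaggingRegressor(random_state=0),
}

# Fit each model and score it on the held-out data.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Keep the model with the highest fitting accuracy as the final predictor.
best_name = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: R^2 = {score:.4f}")
print("selected:", best_name)
```

Scoring on held-out data rather than on the training set is a design choice of this sketch; it keeps the selection from favoring a model that merely memorizes the training records.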
The model prediction results show that the ensemble learning models predict better than a single machine learning model. In supervised machine learning, sometimes only several models with a preference are available (weak supervised models, each of which performs well only in some respects). Ensemble learning combines several weak supervised models to obtain a better and more comprehensive strong supervised model; even if one weak classifier makes a wrong prediction, the other weak classifiers can correct the error. Ensemble learning is a meta-algorithm that combines several machine learning techniques into one prediction model in order to reduce variance (the bagging method), reduce bias (the boosting method) or improve prediction (the stacking method). Ensemble learning has suitable strategies for data sets of various sizes. If the data set is large, it can be divided into several small data sets, and a model is learned on each and then combined. If the data set is small, the Bootstrap method can be used for sampling to obtain several data sets, on which several models are trained separately and then combined.
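The Bootstrap strategy for small data sets can be illustrated with a short bagging sketch in plain Python: draw resampled data sets with replacement, fit one simple model on each, and average their predictions. The base learner (a least-squares line), the synthetic data and the function names are assumptions chosen to keep the example small; they are not the models used in the embodiment.

```python
import random
from statistics import mean


def bootstrap_sample(data, rng):
    """Draw len(data) points from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_line(points):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: every x identical
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return a, mean_y - a * mean_x


def bagged_predict(models, x):
    """Average the predictions of all fitted lines at x."""
    return mean(a * x + b for a, b in models)


rng = random.Random(0)
# Synthetic data: y = 2x + 1 plus Gaussian noise (illustration only).
data = [(float(x), 2.0 * x + 1.0 + rng.gauss(0.0, 0.3)) for x in range(10)]

models = [fit_line(bootstrap_sample(data, rng)) for _ in range(25)]
prediction = bagged_predict(models, 4.0)
print(f"bagged prediction at x = 4.0: {prediction:.3f}")
```

Each resampled model sees a slightly different data set, so their individual errors differ, and averaging them reduces the variance of the combined prediction.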
Ensemble methods can be divided into two categories:
(1) Sequential ensemble methods, in which the base learners that participate in training are generated one after another (e.g., AdaBoost). The principle of sequential methods is to exploit the dependency between base learners. By assigning a higher weight to the samples that were labeled incorrectly in the previous training round, the overall prediction effect can be improved.
(2) Parallel ensemble methods, in which the base learners that participate in training are generated in parallel (e.g., Random Forest). The principle of parallel methods is to exploit the independence between base learners, so that errors can be reduced significantly by averaging.
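The reweighting idea behind sequential methods can be shown with one round of the classic discrete AdaBoost update for classification. This is a sketch of the textbook rule, not of the embodiment: the patent applies AdaBoost to a regression target and does not describe its update rule, and regression variants weight samples differently. The function name and the five-sample example are hypothetical.

```python
from math import exp, log


def adaboost_reweight(weights, is_wrong):
    """One round of the discrete AdaBoost sample-weight update.

    weights  : current sample weights, assumed to sum to 1
    is_wrong : True where the current base learner mislabeled the sample
    Returns (alpha, new_weights), with new_weights normalized to sum to 1.
    """
    error = sum(w for w, wrong in zip(weights, is_wrong) if wrong)
    if not 0.0 < error < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * log((1.0 - error) / error)  # weight of this base learner
    updated = [w * exp(alpha if wrong else -alpha)
               for w, wrong in zip(weights, is_wrong)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Five samples with equal weight; the base learner mislabels the third one.
weights = [0.2] * 5
is_wrong = [False, False, True, False, False]
alpha, new_weights = adaboost_reweight(weights, is_wrong)
print(round(alpha, 4), [round(w, 3) for w in new_weights])
```

After the update the mislabeled sample carries more weight than the others, so the next base learner in the sequence is pushed to fit it correctly.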
The above examples show that predicting pumping unit system efficiency by analyzing production data with ensemble learning models gives more accurate results, and can provide theoretical support for evaluating the working state of the pumping unit and optimizing its working system.
Although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. An oil pumping unit system efficiency prediction method based on an ensemble learning algorithm, wherein the prediction method comprises the following steps:
constructing oil extraction big data resource pools, characterized by wellbore data, pumping unit data and production data, for blocks with different properties;
screening the main control factors affecting system efficiency in the big data resource pool through a machine learning model;
and predicting the system efficiency with a plurality of ensemble learning models, taking the screened main control factors as the main characteristic parameters.
2. The pumping unit system efficiency prediction method of claim 1, wherein the wellbore data includes pump diameter, pump depth, pump efficiency, and actual lift;
the pumping unit data comprise a model, a motor model, a stroke frequency, consumed power, a maximum load, a minimum load, daily power consumption, torque and maximum torque;
the production data comprise daily oil production, accumulated oil production liquid, water content, working fluid level, sinking degree, oil pressure, casing pressure and hundred-meter ton liquid power consumption.
3. The pumping unit system efficiency prediction method according to claim 1 or 2, wherein,
the data is analyzed and preprocessed before the big data resource pool is established, including analyzing the proportion of data for each month, the quantity of each type of data, and the overall quality of the data.
4. The pumping unit system efficiency prediction method according to claim 1 or 2, wherein,
screening master control factors affecting system efficiency through a machine learning model comprises the following steps:
calculating the correlation between the system efficiency and each data item in the big data resource pool by using a machine learning model;
drawing a correlation heat map of the system efficiency and each characteristic parameter according to the correlation calculation result;
and, with reference to the correlation heat map, sorting the characteristic parameters by the strength of their correlation with the system efficiency, wherein the sorting comprises sorting of characteristic parameters positively correlated with the system efficiency and sorting of characteristic parameters negatively correlated with the system efficiency.
5. The pumping unit system efficiency prediction method of claim 4, wherein,
the machine learning model includes a Pearson correlation coefficient model.
6. The pumping unit system efficiency prediction method according to claim 1 or 2, wherein,
the ensemble learning model comprises a random forest model, an AdaBoost model, a Gradient Boosting model and/or a Bagging model.
7. The pumping unit system efficiency prediction method according to claim 1 or 2, wherein the prediction method further comprises: comparing the prediction results of the ensemble learning models, and selecting the prediction result of the ensemble learning model with the highest fitting precision as the final prediction result.
8. A pumping unit system efficiency prediction system based on an ensemble learning algorithm, wherein the prediction system comprises:
a construction unit, used for constructing an oil production big data resource pool for blocks of different properties, represented by wellbore data, pumping unit data and production data;
a screening unit, used for screening main control factors affecting the system efficiency in the big data resource pool through a machine learning model;
and a prediction unit, used for predicting the system efficiency by taking the screened main control factors as main characteristic parameters and adopting a plurality of ensemble learning models.
9. The pumping unit system efficiency prediction system of claim 8, wherein the wellbore data comprises pump diameter, pump depth, pump efficiency, and actual lift;
the pumping unit data comprise the pumping unit model, motor model, stroke frequency, consumed power, maximum load, minimum load, daily power consumption, torque and maximum torque;
the production data comprise daily oil production, cumulative oil and liquid production, water content, working fluid level, submergence, oil pressure, casing pressure and power consumption per ton of liquid per hundred meters of lift.
10. The pumping unit system efficiency prediction system of claim 8 or 9, wherein the system further comprises:
a preprocessing unit, used for analyzing and preprocessing data before the big data resource pool is built, including analyzing the proportion of data for each month, the quantity of each type of data, and the overall quality of the data;
and a comparison unit, used for comparing the prediction results of the ensemble learning models and selecting the prediction result of the ensemble learning model with the highest fitting precision as the final prediction result.
11. The pumping unit system efficiency prediction system according to claim 8 or 9, wherein,
the screening unit performs the following steps:
calculating the correlation between the system efficiency and each data item in the big data resource pool by using a machine learning model;
drawing a correlation heat map of the system efficiency and each characteristic parameter according to the correlation calculation result;
and, with reference to the correlation heat map, sorting the characteristic parameters by the strength of their correlation with the system efficiency, wherein the sorting comprises sorting of characteristic parameters positively correlated with the system efficiency and sorting of characteristic parameters negatively correlated with the system efficiency.
12. The pumping unit system efficiency prediction system of claim 11, wherein,
the machine learning model comprises a Pearson correlation coefficient model; the ensemble learning model comprises a random forest model, an AdaBoost model, a Gradient Boosting model and/or a Bagging model.
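The screening step recited in the claims (Pearson correlation, followed by separate rankings of positively and negatively correlated parameters) can be sketched in plain Python as follows. The feature names echo the claimed data fields, but the numeric values, the helper names `pearson` and `rank_features`, and the decision to treat constant columns as uncorrelated are assumptions for illustration only; drawing the heat map is omitted.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant column is treated as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Return (corr, positive, negative).

    corr maps each feature name to its correlation with the target;
    positive and negative list feature names, strongest correlation first.
    """
    corr = {name: pearson(values, target) for name, values in features.items()}
    positive = sorted((n for n in corr if corr[n] > 0), key=lambda n: -corr[n])
    negative = sorted((n for n in corr if corr[n] < 0), key=lambda n: corr[n])
    return corr, positive, negative


# Hypothetical toy records for five wells (values are made up).
features = {
    "pump_efficiency": [30.0, 40.0, 50.0, 60.0, 70.0],
    "water_content": [90.0, 85.0, 70.0, 65.0, 50.0],
    "stroke_frequency": [4.0, 6.0, 3.0, 5.0, 4.0],
}
system_efficiency = [12.0, 16.0, 20.0, 24.0, 28.0]

corr, positive, negative = rank_features(features, system_efficiency)
print("positively correlated:", positive)
print("negatively correlated:", negative)
```

The top entries of each ranking would then serve as the main characteristic parameters passed to the ensemble models; how many to keep is not stated in the source.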
CN202210776243.9A 2022-06-29 2022-06-29 Oil pumping unit system efficiency prediction method and system based on ensemble learning algorithm Pending CN117391227A (en)

Publications (1)

Publication Number Publication Date
CN117391227A true CN117391227A (en) 2024-01-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination