CN113379093A - Energy consumption analysis and optimization method for oil gas gathering and transportation system - Google Patents

Publication number
CN113379093A
Authority
CN
China
Prior art keywords
energy consumption
data
oil
transportation system
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010159512.8A
Other languages
Chinese (zh)
Inventor
李振泉
张丁涌
周长敬
王兴武
安学先
高华
孙东
刘文聪
闫恩祥
李红强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Petroleum and Chemical Corp
Sinopec Shengli Oilfield Co Xianhe Oil Production Plant
Original Assignee
China Petroleum and Chemical Corp
Sinopec Shengli Oilfield Co Xianhe Oil Production Plant
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Petroleum and Chemical Corp and Sinopec Shengli Oilfield Co Xianhe Oil Production Plant
Priority to CN202010159512.8A
Publication of CN113379093A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides an energy consumption analysis and optimization method for an oil and gas gathering and transportation system, comprising the following steps: step 1, collecting operation data of the oilfield gathering and transportation system; step 2, preprocessing the collected data; step 3, performing correlation analysis on the data to evaluate its correctness and validity; step 4, performing variance analysis on the parameters of the oilfield gathering and transportation equipment, and performing feature importance analysis through a preliminary fit with the random forest machine learning method; step 5, establishing a machine learning random forest regression model; and step 6, evaluating each candidate scheme with the quantitative energy consumption prediction big data model and ranking the schemes to select the best one. The method can greatly reduce modeling difficulty, shorten the design cycle, and reduce workload, thereby providing a reasonable reference for adjusting and optimizing oil and gas gathering and transportation schemes.

Description

Energy consumption analysis and optimization method for oil gas gathering and transportation system
Technical Field
The invention relates to the technical field of big data application and oil-gas gathering and transportation energy conservation, in particular to an energy consumption analysis and optimization method for an oil-gas gathering and transportation system.
Background
In an oil and gas gathering and transportation system, many nodes influence energy consumption, such as the outlet water temperature of the heating furnace, the water content at the oil outlet of the three-phase separator, and the temperatures at the upper, middle, and lower levels of the settling tank. The conventional energy-saving approach builds an energy conservation model of the gathering and transportation system by combining the system's energy consumption analysis rules with energy balance tests, and then solves the model with an optimization algorithm whose objective is to minimize the power and heat consumption of the system.
The data accumulated in the oil gas gathering and transportation production has the following characteristics:
1. Multi-event excitation, high dimensionality, and strong coupling. Data is acquired frequently and densely, redundant duplicate records exist, and many system parameters influence one another and jointly determine the behavior of the system;
2. Instability. The oil and gas gathering and transportation production system is unstable, and the collected data is easily contaminated by industrial noise;
3. Dynamic behavior and diverse data types. Parameters such as pressure, temperature, flow, and equipment state change continuously over time and include logical, numerical, and other data types;
4. Multiple time scales and incompleteness. Different parameters are sampled at different frequencies, and asynchronous recording may cause data loss;
5. Multiple modes. The gathering and transportation system has both normal operating states and fault conditions.
With traditional simulation technology, establishing an energy consumption analysis and optimization model is complex work with a long research cycle, and the characteristics of the large volume of complex gathering and transportation data must also be taken into account, which further increases the modeling difficulty.
In recent years, big-data-driven machine learning techniques have advanced significantly and have been applied very successfully in language and image processing. With the development of parallel computing architectures, machine learning can now also run online, and its real-time performance and low complexity make it possible to integrate it closely with the analysis and optimization of oilfield gathering and transportation. Big data analysis has already been applied in other energy industries such as electric power and coal, and its application results are particularly evident in the field of electronic information.
In the field of oilfield big data, the production operation rule of the oilfield is analyzed according to the big data by the sectional Zealand and the like so as to solve the problem of oilfield production business; the problems and bottlenecks in management are solved through analysis of oilfield management data. Zhang uses big data analysis technology to carry out application analysis in the aspects of abnormal well automatic identification, sealing driving device matching technology and screw pump oil extraction process system optimization. The types of big data analysis technologies are discussed in the plum garden and the like, and the construction method of the oil field big data analysis platform and the application of the big data analysis technology in oil field production are summarized.
Among domestic patents, few published patents apply machine learning to oil and gas gathering and transportation optimization. The patent "Machine learning-based three-segment continuous stepping heating furnace optimization energy-saving method" (application number: CN 201910526161.7) introduces the time dimension into operating experience and abstracts the rule that heating furnace thermal efficiency decays over time, thereby addressing the deviation in employee experience.
Machine learning is applied more widely in other fields. The patent "Optimization method, device, terminal equipment and storage medium of the machine learning model" (application number: CN201810687251.X) discloses a method for optimizing a machine learning model, comprising: acquiring a plurality of data items to be processed; inputting them into a machine learning model and using the model to screen the items that meet preset conditions as annotation data, where the annotation data comprises training data and verification data; training the machine learning model with the training data to determine a trained model; and updating the machine learning model based at least on the trained model and the verification data. The patent "Method and device for identifying negative financial information based on a machine learning algorithm" (application number: CN 201910789700.6) discloses a method that uses a machine learning algorithm to analyze financial information text written in natural language and judge whether the sentiment it reflects is negative, so that large volumes of information can be processed by computer and a pre-constructed algorithm model can judge more accurately whether a text is negative. The patent with application number CN201610715882.9 discloses an ore visible-light image sorting method based on Adaboost machine learning, which uses Adaboost to learn from, train on, and predict visible-light images of ore. It can closely imitate experienced mineral processing technicians, gives every machine the same accurate sorting experience, avoids the subjectivity and individual differences of manual sorting, can take over work from people or complete work that people cannot, reduces labor intensity, and improves product quality and labor productivity.
However, no fine-grained study of oil and gas gathering and transportation optimization using big data machine learning has been reported. A new energy consumption analysis and optimization method for the oil and gas gathering and transportation system, based on a big data machine learning algorithm, is therefore proposed to solve the above technical problems.
Disclosure of Invention
The invention aims to provide an energy consumption analysis and optimization method for an oil gas gathering and transportation system, which combines big data application with actual requirements of an oil field, reduces gathering and transportation energy consumption and provides more accurate technical support for oil gas gathering and transportation.
The object of the invention can be achieved by the following technical measures: the energy consumption analysis and optimization method of the oil and gas gathering and transportation system comprises the following steps: step 1, collecting operation data of the oilfield gathering and transportation system; step 2, preprocessing the collected data; step 3, performing correlation analysis on the data to evaluate its correctness and validity; step 4, performing variance analysis on the parameters of the oilfield gathering and transportation equipment, and performing feature importance analysis through a preliminary fit with the random forest machine learning method; step 5, establishing a machine learning random forest regression model; and step 6, evaluating each candidate scheme with the quantitative energy consumption prediction big data model and ranking the schemes to select the best one.
The object of the invention can also be achieved by the following technical measures:
In step 1, the operation data of the oilfield gathering and transportation system includes the process parameters and energy consumption data of each component during actual operation, specifically the produced liquid volume handled by the system and the temperature, pressure, and water content of each component.
In step 1, the parameters of the different devices are collected into a data set containing all energy-consuming device parameters, and all structured, unstructured, and semi-structured data on these parameters relevant to the analysis is extracted by full sampling.
In step 2, the data preprocessing performed includes: missing value processing, outlier processing, and data distribution processing.
In step 2, when missing values are processed, each missing value in a data column of the data table is filled by the mean value method, according to the following formula:
$$v_{i,j} = \begin{cases} \dfrac{1}{m}\sum_{k=1}^{m} v_{k,j}, & v_{i,j} = \text{null} \\ v_{i,j}, & \text{otherwise} \end{cases}$$
where null denotes a missing value, i is the row index, j is the column index, m is the number of samples, and $v_{i,j}$ is the value in row i, column j of the data table.
In step 2, when outliers are processed, abnormal data in the data table, that is, values outside the normal range, are identified by visual inspection and deleted; when data distribution processing is performed, the distribution of each parameter is transformed to a normal distribution.
In step 3, every pair of gathering and transportation parameters is analyzed visually.
In step 4, variance analysis calculates the variance of each feature and then deletes every feature whose variance is 0:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
where $\sigma^2$ is the population variance, X is the variable, $\mu$ is the population mean, and N is the number of instances in the population.
In practice, when the population mean is difficult to obtain, sample statistics are used in place of the population parameters, and after correction the sample variance is calculated as:
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
where $S^2$ is the sample variance, X is the variable, $\bar{X}$ is the sample mean, and n is the number of samples.
In step 4, the feature importance analysis is to calculate the contribution of each feature to the final predicted energy consumption, and the greater the contribution, the higher the weight.
In step 5, regression averaging is used when constructing the model. The sampled data set is first divided into a training set, a test set, and a verification set in proportions of 80%, 10%, and 10%; the features are the parameters of each device and the target is energy consumption. The procedure is as follows:
first, the model is trained on the training set using cross-validation and grid search to determine the random forest model parameters;
next, the fit of the model is evaluated on the test set using the root mean square error (RMSE); with E denoting the actual energy consumption and E' the energy consumption predicted by the model, the RMSE is:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(E_i - E'_i\right)^2}$$
where m is the number of predicted samples; the smaller the RMSE, the better the fit;
finally, the model hyper-parameters are adjusted using the verification set: the verification set is fed into the trained model and a learning curve is drawn; the best hyper-parameters are those for which the upper and lower curves are close together and both relatively high.
In step 6, using the machine learning model trained in step 5, candidate values of each adjustable parameter are enumerated, schemes are generated by exhaustive enumeration or by orthogonal experimental design, each scheme is evaluated with the quantitative energy consumption prediction big data model, and the schemes are ranked to select the best one.
The energy consumption analysis and optimization method of the oil and gas gathering and transportation system differs from the conventional approach of establishing an energy conservation model of the system from its energy consumption analysis rules and energy balance tests and then solving that model with an optimization algorithm. It gives energy-saving managers a fast and efficient management tool, allows the gathering and transportation system to operate efficiently, and ultimately saves energy.
The method cleans the collected parameter data of the energy-consuming devices (three-phase separator, settling tank, heating furnace, dewatering pump, stabilizing tower, tower bottom pump, and external delivery pump) and then, following the machine learning modeling process, divides the data set into a training set, a test set, and a verification set. The random forest model is trained on the training set with cross-validation and grid search to determine its hyper-parameters. The fit of the model is evaluated on the test set using the root mean square error (RMSE). Finally, the model parameters are adjusted using the verification set, and the best-performing random forest regression model is used to predict energy consumption. Candidate values of each adjustable parameter are then enumerated, schemes are generated by exhaustive enumeration or by orthogonal experimental design, each scheme is evaluated with the quantitative energy consumption prediction big data model, and the schemes are ranked to select the best one, thereby optimizing the gathering and transportation process and reducing energy consumption. Simulation technology must build a model of each energy-consuming device from a large number of complex formulas; the invention avoids constructing these formulas, so modeling takes less time than simulation.
The invention has the following beneficial effects: to address the high application difficulty, long scheme design cycle, and heavy workload of traditional whole-process numerical simulation, the invention provides a method that fits the equipment parameters of oil and gas gathering and transportation with machine learning and establishes an energy consumption prediction model. With this method, modeling difficulty is greatly reduced, the design cycle is shortened, the workload is reduced, and a reasonable reference is provided for adjusting and optimizing oil and gas gathering and transportation schemes.
Drawings
FIG. 1 is a graph of data correlation analysis in accordance with an embodiment of the present invention;
FIG. 2 is a graph of feature importance analysis in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a random forest in accordance with an embodiment of the present invention;
FIG. 4 is a schematic cross-validation in accordance with an embodiment of the present invention;
FIG. 5 is a diagram illustrating a learning curve according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of model optimization in an embodiment of the present invention;
FIG. 7 is a flowchart of an embodiment of the energy consumption analysis and optimization method of the oil and gas gathering and transportation system according to the present invention.
Detailed Description
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
FIG. 7 is a flowchart of the energy consumption analysis and optimization method of the oil and gas gathering and transportation system of the present invention, which includes the following steps:
In step 101, operation data of the oilfield gathering and transportation system is collected from the oilfield monitoring equipment. The operation data includes the process parameters and energy consumption data of each component during actual operation, for example the produced liquid volume handled by the system and the temperature, pressure, and water content of each component. The following equipment and parameters are considered in this embodiment:
Three-phase separator: inlet flow, level, gun level, outlet oil pressure
Settling tank: liquid level, oil-water interface, volume, quality of pure oil
Heating furnace: inlet pressure, inlet temperature, outlet temperature
Dewatering pump: outlet pressure, variable frequency power frequency
Stabilizing tower: pressure, temperature
Tower bottom pump: outlet pressure, variable frequency power frequency
External delivery pump: outlet pressure, variable frequency power frequency, temperature, instantaneous flow, accumulated flow, power consumption
Other factors: ambient temperature
The data of each device is recorded in its own table. The parameter tables of the different devices are then joined to form a data set containing all energy-consuming device parameters. All structured, unstructured, and semi-structured data on these parameters relevant to the analysis is extracted by full sampling. The sampled table has m rows and n columns, where m is the number of data samples, n is the number of parameters, and m is much larger than n. Because machine learning cannot work with too little data, tens of thousands of samples are collected. The n parameters include the inlet flow, level, gun level, outlet oil pressure, inlet temperature, outlet pressure, variable frequency power frequency, and so on. Since machine learning places high demands on data quality, the data merged in step 101 is further processed in step 102.
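The joining of per-device tables described above can be sketched as follows. This is a minimal illustration only: the device names, parameter names, timestamps, and values are hypothetical, and joining on a shared timestamp is an assumption, since the source does not state the join key. Entries that a device did not record are kept as None so that step 102 can fill them.

```python
from typing import Dict, List, Optional

# Hypothetical layout: one table per device, mapping timestamp -> {parameter: value}.
Table = Dict[str, Dict[str, float]]


def merge_device_tables(tables: Dict[str, Table]) -> List[Dict[str, Optional[object]]]:
    """Join the device tables on timestamp, keeping every timestamp seen in any table."""
    # Collect one column per (device, parameter) pair, in a stable order.
    columns = []
    for device, table in tables.items():
        names = sorted({name for record in table.values() for name in record})
        columns.extend((device, name) for name in names)

    timestamps = sorted(set().union(*(table.keys() for table in tables.values())))
    rows = []
    for ts in timestamps:
        row: Dict[str, Optional[object]] = {"timestamp": ts}
        for device, name in columns:
            # Prefix with the device name so equal parameter names stay distinct;
            # a value the device did not record at this timestamp becomes None.
            row[f"{device}.{name}"] = tables[device].get(ts, {}).get(name)
        rows.append(row)
    return rows


# Made-up sample data for two of the devices listed in this embodiment.
tables = {
    "heating_furnace": {
        "2020-01-01T00:00": {"inlet_temperature": 41.0, "outlet_temperature": 68.5},
        "2020-01-01T00:10": {"inlet_temperature": 40.5, "outlet_temperature": 69.0},
    },
    "external_delivery_pump": {
        "2020-01-01T00:00": {"outlet_pressure": 1.2, "power_consumption": 35.0},
        "2020-01-01T00:10": {"outlet_pressure": 1.3, "power_consumption": 36.5},
        "2020-01-01T00:20": {"outlet_pressure": 1.3, "power_consumption": 36.0},
    },
}
dataset = merge_device_tables(tables)
```

Keeping unmatched timestamps rather than dropping them is a design choice made here to preserve all sampled rows; an inner join would be the alternative if only complete rows are wanted.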
In step 102, data preprocessing is performed on the data merged in step 101. The data preprocessing mainly carries out data cleaning work and aims to improve the data quality. The following processing is performed on the data in this example:
missing value processing: each missing value in a data column of the data table is filled by the mean value method, according to the following formula:
$$v_{i,j} = \begin{cases} \dfrac{1}{m}\sum_{k=1}^{m} v_{k,j}, & v_{i,j} = \text{null} \\ v_{i,j}, & \text{otherwise} \end{cases}$$
where null denotes a missing value, i is the row index, j is the column index, m is the number of samples, and $v_{i,j}$ is the value in row i, column j of the data table;
outlier processing: abnormal data (values outside the normal range) in the data table are identified by visual inspection and deleted;
data distribution processing: the distribution of each parameter is transformed to a normal distribution.
After the data preprocessing is completed, the data analysis can proceed to step 103.
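The missing value and outlier handling of step 102 might look like the sketch below. Several things here are assumptions rather than the patented procedure: the column names and normal ranges are invented, the mean is taken over the non-missing entries of a column, and outliers are removed with fixed range limits instead of the visual inspection the embodiment describes. The distribution transformation is not shown because the source does not say which transform is used.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]


def drop_out_of_range(rows: List[Row], limits: Dict[str, Tuple[float, float]]) -> List[Row]:
    """Delete rows in which any listed parameter lies outside its normal range."""
    kept = []
    for row in rows:
        in_range = all(
            row.get(name) is None or low <= row[name] <= high
            for name, (low, high) in limits.items()
        )
        if in_range:
            kept.append(row)
    return kept


def fill_missing_with_mean(rows: List[Row], columns: List[str]) -> List[Row]:
    """Replace None (the 'null' of the formula) by the mean of the observed column values."""
    filled = [dict(row) for row in rows]  # copy so the input rows are not modified
    for name in columns:
        observed = [row[name] for row in rows if row[name] is not None]
        if not observed:
            continue  # nothing to average; leave the column untouched
        mean = sum(observed) / len(observed)
        for row in filled:
            if row[name] is None:
                row[name] = mean
    return filled


# Made-up sample rows: one missing temperature and one implausible reading.
rows = [
    {"outlet_temperature": 68.0, "outlet_pressure": 1.2},
    {"outlet_temperature": None, "outlet_pressure": 1.3},
    {"outlet_temperature": 70.0, "outlet_pressure": 1.1},
    {"outlet_temperature": 500.0, "outlet_pressure": 1.2},
]
limits = {"outlet_temperature": (0.0, 150.0)}  # hypothetical normal range

clean = fill_missing_with_mean(
    drop_out_of_range(rows, limits),
    ["outlet_temperature", "outlet_pressure"],
)
```

In this sketch the out-of-range rows are removed before filling so that an abnormal reading does not distort the fill value; that ordering is a choice made for the example.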
In step 103, correlation analysis is first performed on the data to evaluate its correctness and validity. In this embodiment the correlations are analyzed visually, as shown in FIG. 1: every pair of gathering and transportation parameters is plotted and examined. FIG. 1 shows that power consumption is positively correlated with ambient temperature, whereas the output temperature has no obvious linear relationship with ambient temperature. Power consumption is therefore strongly correlated with ambient temperature.
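The pairwise analysis of step 103 is done visually in the embodiment; as a numeric complement, the Pearson correlation coefficient of every parameter pair can be computed as sketched below. The use of the Pearson coefficient is an assumption (the source does not name a statistic), and the column names and values are invented for illustration.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def pairwise_correlations(columns: Dict[str, List[float]]) -> Dict[Tuple[str, str], float]:
    """Correlation of every unordered pair of columns."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Made-up readings, chosen only to show one strong and one weak relationship.
columns = {
    "ambient_temperature": [5.0, 10.0, 15.0, 20.0, 25.0],
    "power_consumption": [30.0, 32.0, 34.0, 36.0, 38.0],
    "output_temperature": [68.0, 70.0, 67.0, 71.0, 69.0],
}
correlations = pairwise_correlations(columns)
```

A coefficient near +1 or -1 indicates a strong linear relationship and a value near 0 indicates none, which matches the kind of reading taken from FIG. 1.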
At step 104, analysis of variance of the oilfield gathering equipment parameters and feature importance analysis by preliminary fitting through a random forest machine learning method are performed.
Analysis of variance is to calculate the variance of each feature and then remove features with variance of 0.
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
where $\sigma^2$ is the population variance, X is the variable, $\mu$ is the population mean, and N is the number of instances in the population.
In practice, when the population mean is difficult to obtain, sample statistics are used in place of the population parameters, and after correction the sample variance is calculated as:
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
where $S^2$ is the sample variance, X is the variable, $\bar{X}$ is the sample mean, and n is the number of samples.
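The variance screening can be sketched with Python's standard `statistics` module, where `pvariance` divides by N (population variance) and `variance` divides by n - 1 (corrected sample variance). The feature names and values below are hypothetical.

```python
from statistics import pvariance, variance
from typing import Dict, List


def drop_constant_features(columns: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Keep only the features whose variance is not 0."""
    return {name: values for name, values in columns.items() if pvariance(values) > 0}


# Made-up columns: the first never changes, so it carries no information.
columns = {
    "valve_state": [1.0, 1.0, 1.0, 1.0],
    "outlet_pressure": [1.2, 1.3, 1.1, 1.4],
}
kept = drop_constant_features(columns)

# The two formulas differ only in the divisor (N versus n - 1).
sample = [1.0, 2.0, 3.0, 4.0]
population_variance = pvariance(sample)  # sum of squared deviations / 4
sample_variance = variance(sample)       # sum of squared deviations / 3
```

A feature with zero variance is constant across all samples, so it cannot help distinguish high from low energy consumption and can be removed safely.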
The feature importance analysis is to calculate the contribution of each feature to the final predicted energy consumption, and the greater the contribution, the higher the weight.
The first two analyses identify the main controlling factors (such as the water content of the incoming liquid, the output temperature, and the power consumption), as shown in FIG. 2. Irrelevant factors are eliminated in order to improve model accuracy and reduce model running time. Machine learning modeling is then performed in step 105.
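A preliminary random forest fit for feature importance might be sketched as below, assuming scikit-learn is the library used (the source names no library). The data is synthetic and the feature names are placeholders; `feature_importances_` is scikit-learn's impurity-based importance, which may differ from the exact contribution measure used in the embodiment.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: the target depends strongly on column 0, weakly on column 1,
# and not at all on column 2.
rng = np.random.default_rng(0)
n_samples = 500
X = rng.normal(size=(n_samples, 3))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=n_samples)
feature_names = ["water_content", "output_temperature", "unrelated_sensor"]  # hypothetical

# Preliminary fit; the importances of all features sum to 1.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X, y)

ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
```

Features at the bottom of such a ranking are candidates for removal before the final model is trained.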
In step 105, a machine-learned random forest regression model is established.
The random forest principle is shown in fig. 3:
A random forest is a forest built in a random manner; it consists of many decision trees that are independent of one another. When a new sample arrives, each decision tree in the forest classifies it separately, and the class chosen by the most trees is taken as the final classification result by voting. For regression problems, the random forest outputs the average of the outputs of all decision trees.
In a random forest, each decision tree is "planted" and "grown" in four steps:
(1) let the number of samples in the training set be N; N samples are drawn by repeated sampling with replacement (bootstrap sampling), and the result is used as the training set of that decision tree;
(2) if there are M input variables, each node randomly selects m (m < M) of them and then uses these m variables to determine the best split point. The value of m is kept constant while the decision tree is generated, and the tree is split in the direction of maximum information gain, as given by formula (4);
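$$\mathrm{Gain}(D, a) = \mathrm{Ent}(D) - \sum_{v=1}^{V} \frac{|D^v|}{|D|}\,\mathrm{Ent}(D^v) \tag{4}$$
where D is the data set, a is a feature of the data set (such as the temperature of the settling tank), Gain is the information gain, Ent(D) is the entropy of D, $D^v$ is the subset of D in which a takes its v-th value, $\sum_{v=1}^{V} \frac{|D^v|}{|D|}\,\mathrm{Ent}(D^v)$ is the conditional entropy of D given a, and V is the number of categories.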
(3) Each decision tree is grown to the maximum possible without pruning;
(4) new data is predicted by summing all decision trees (majority voting in classification, averaging in regression).
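The information gain criterion used in step (2) can be illustrated with a few lines of Python. Entropy is defined over discrete classes, so this sketch assumes the target has been binned into categories ("high" and "low" energy consumption) and that the feature is categorical; both are simplifications for illustration, and the labels are invented.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, List


def entropy(labels: List[Hashable]) -> float:
    """Ent(D): Shannon entropy (base 2) of the class labels."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())


def information_gain(feature_values: List[Hashable], labels: List[Hashable]) -> float:
    """Gain(D, a): entropy of D minus the conditional entropy of D given feature a."""
    n = len(labels)
    subsets = defaultdict(list)
    for value, label in zip(feature_values, labels):
        subsets[value].append(label)  # D^v: labels of the samples where a takes value v
    conditional = sum(len(subset) / n * entropy(subset) for subset in subsets.values())
    return entropy(labels) - conditional


# Made-up example: four samples, two energy consumption classes.
labels = ["high", "high", "low", "low"]
tank_temperature = ["hot", "hot", "cold", "cold"]  # separates the classes perfectly
shift = ["day", "night", "day", "night"]           # tells nothing about the class

gain_temperature = information_gain(tank_temperature, labels)
gain_shift = information_gain(shift, labels)
```

A node would split on the feature with the larger gain, here the tank temperature. For continuous regression targets, library implementations commonly use a variance-based criterion instead, so this block should be read as an illustration of the formula rather than of any particular library's split rule.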
Since energy consumption prediction is a regression problem, regression averaging is used when building the model. First, the sampled data set is divided into a training set, a test set, and a verification set in proportions of 80%, 10%, and 10%; the features are the parameters of each device and the target is energy consumption.
Next, the model is trained on the training set using cross-validation and grid search to determine the random forest model parameters. Cross-validation is illustrated in FIG. 4.
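The 80/10/10 split and the cross-validated grid search could be sketched with scikit-learn as follows. The library choice, the synthetic data, the candidate grid, and the 5-fold setting are all assumptions made for illustration; the source specifies only the split proportions and the use of cross-validation with grid search.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the device parameters (X) and energy consumption (y).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=400)

# 80% training; the remaining 20% is split evenly into test and verification sets.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.2, random_state=0)
X_test, X_verify, y_test, y_verify = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0
)

# Hypothetical candidate grid; every combination is scored by 5-fold cross-validation
# on the training set only.
param_grid = {"n_estimators": [20, 50], "max_depth": [4, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

best_model = search.best_estimator_  # refitted on the whole training set
print(search.best_params_)
```

Because the search only ever sees the training set, the test and verification sets remain untouched for the later evaluation and tuning steps.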
The fit of the model is then evaluated on the test set using the root mean square error (RMSE). With E denoting the actual energy consumption and E' the energy consumption predicted by the model, the RMSE is:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(E_i - E'_i\right)^2}$$
where m is the number of predicted samples; a smaller RMSE indicates a better fit.
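As a worked example of this metric, the function below computes the root mean square error of a list of predictions. The function name and the sample values are illustrative only.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between actual (E) and predicted (E') energy consumption."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - p) ** 2 for e, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Made-up values: every prediction is off by exactly 1 unit.
actual_energy = [1.0, 2.0, 3.0, 4.0]
predicted_energy = [2.0, 3.0, 4.0, 5.0]
error = rmse(actual_energy, predicted_energy)
```

The result has the same unit as the energy consumption itself, which makes it easier to interpret than the squared error.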
Finally, the hyper-parameters of the model are adjusted using the verification set. Specifically, the verification set is fed into the trained model and a learning curve is drawn, as shown in FIG. 5. The best hyper-parameters are those for which the upper and lower curves are close together and both relatively high. The hyper-parameters include the number of trees, the depth of the trees, the number of leaf nodes per tree, and so on. Once the model hyper-parameters have been determined and the model trained in step 105, optimal parameter selection can be performed in step 106.
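One way to obtain the points of such a curve is sketched below with scikit-learn. It assumes that the upper curve is the score on the training set and the lower curve the score on the verification set, that the score is the coefficient of determination (R²), and that tree depth is the hyper-parameter being varied; none of these details is stated in the source. The data is synthetic and plotting is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the last 10% plays the role of the verification set.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=400)
X_train, X_verify, y_train, y_verify = train_test_split(
    X, y, test_size=0.1, random_state=0
)

depths = [1, 2, 4, 8]  # hypothetical candidate tree depths
curve = []
for depth in depths:
    model = RandomForestRegressor(n_estimators=30, max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    train_score = model.score(X_train, y_train)      # R^2 on the training set
    verify_score = model.score(X_verify, y_verify)   # R^2 on the verification set
    curve.append((depth, train_score, verify_score))

# Simple selection rule used here: the depth with the highest verification score.
best_depth, best_train, best_verify = max(curve, key=lambda point: point[2])
for depth, train_score, verify_score in curve:
    print(f"depth={depth}: train={train_score:.3f}, verification={verify_score:.3f}")
```

A large gap between the two scores suggests overfitting, while two low scores suggest underfitting, which is why settings where both curves are high and close together are preferred.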
In step 106, the overall flow is as shown in FIG. 6. Using the machine learning model trained in step 105, candidate values of each adjustable parameter are enumerated, schemes are generated by exhaustive enumeration or by orthogonal experimental design, each scheme is evaluated with the quantitative energy consumption prediction big data model, and the schemes are ranked to select the best one.
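The exhaustive variant of this scheme search can be sketched with `itertools.product`. The adjustable parameters, their candidate values, and the `predict_energy` function are hypothetical; in practice `predict_energy` would wrap the random forest model trained in step 105, whereas here a simple formula stands in for it so that the example runs on its own.

```python
from itertools import product
from typing import Callable, Dict, List

Scheme = Dict[str, float]


def rank_schemes(
    adjustable: Dict[str, List[float]],
    predict_energy: Callable[[Scheme], float],
) -> List[Scheme]:
    """Enumerate every combination of candidate values and sort by predicted energy."""
    names = list(adjustable)
    schemes = [
        dict(zip(names, values))
        for values in product(*(adjustable[name] for name in names))
    ]
    return sorted(schemes, key=predict_energy)  # lowest predicted consumption first


def predict_energy(scheme: Scheme) -> float:
    """Stand-in for the trained model: an invented formula, not a real prediction."""
    return 0.8 * scheme["furnace_outlet_temperature"] + 0.5 * abs(
        scheme["pump_frequency"] - 40.0
    )


adjustable = {
    "furnace_outlet_temperature": [60.0, 65.0, 70.0],
    "pump_frequency": [35.0, 40.0, 45.0],
}
ranked = rank_schemes(adjustable, predict_energy)
best_scheme = ranked[0]
```

The number of schemes grows multiplicatively with the number of parameters and candidate values, which is where an orthogonal experimental design becomes attractive as a way to evaluate a representative subset instead.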
The energy consumption analysis and optimization method for the oil and gas gathering and transportation system can greatly reduce modeling difficulty, shorten the design cycle, and reduce workload, thereby providing a reasonable reference for adjusting and optimizing oil and gas gathering and transportation schemes. However, the model used has low interpretability and places high demands on data quality, and it must be trained before it yields good results.

Claims (11)

1. The energy consumption analysis and optimization method of the oil gas gathering and transportation system is characterized by comprising the following steps:
step 1, collecting operation data of an oil field gathering and transportation system;
step 2, preprocessing the collected data;
step 3, carrying out correlation analysis on the data to evaluate the correctness and the validity of the data;
step 4, carrying out variance analysis on the parameters of the oil field gathering and transportation equipment and carrying out feature importance analysis through preliminary fitting of a random forest machine learning method;
step 5, establishing a machine learning random forest regression model;
step 6, evaluating each scheme by using the quantitative energy consumption prediction big data model, and ranking the schemes to select the optimal scheme.
2. The energy consumption analysis and optimization method for an oil and gas gathering and transportation system according to claim 1, wherein in step 1, the operation data of the oil field gathering and transportation system comprises process parameters and energy consumption data of each component in the gathering and transportation system during actual operation, and specifically comprises the liquid production amount processed by the gathering and transportation system and the temperature, pressure and water content of each component.
3. The method for analyzing and optimizing energy consumption of an oil and gas gathering and transportation system according to claim 2, wherein in step 1, parameters of different devices are collected to form a data set including all energy consumption device parameters, and all structured, unstructured and semi-structured data of the oil and gas gathering and transportation energy consumption device parameters related to analysis are extracted in an all-sampling manner.
4. The method for analyzing and optimizing energy consumption of an oil and gas gathering and transportation system as claimed in claim 1, wherein in step 2, the data preprocessing performed comprises: missing value processing, outlier processing, and data distribution processing.
5. The energy consumption analysis and optimization method of an oil and gas gathering and transportation system according to claim 4, wherein in step 2, when missing value processing is performed, the missing values in any data column of the data table that contains missing values are filled by the mean value method, according to the formula:
$$v_{i,j} = \frac{1}{m}\sum_{k=1}^{m} v_{k,j}, \quad \text{if } v_{i,j} = \mathrm{null}$$
where null is the missing value, i is the row index, j is the column index, k is the row index of the summation, m is the number of samples, and $v_{i,j}$ is the value in row i, column j of the data table.
6. The energy consumption analysis and optimization method of an oil and gas gathering and transportation system according to claim 4, wherein in step 2, when abnormal values are processed, abnormal data in the data table, namely values that exceed the normal range, are identified as abnormal points by visual means and deleted; when data distribution processing is carried out, the distribution of each parameter is transformed into a normal distribution.
7. The method for analyzing and optimizing energy consumption of an oil and gas gathering and transportation system according to claim 1, wherein in step 3, gathering and transportation parameters are analyzed in a pairwise visualization manner.
8. The method for analyzing and optimizing energy consumption of an oil and gas gathering and transportation system according to claim 1, wherein in step 4, the analysis of variance calculates the variance of each feature and then deletes the features whose variance is 0:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
where $\sigma^2$ is the population variance, X is the variable, $\mu$ is the population mean, and N is the number of instances in the population;
in actual work, when the population mean is difficult to obtain, sample statistics are used in place of the population parameters, and after correction the sample variance is calculated as:
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
where $S^2$ is the sample variance, X is the variable, $\bar{X}$ is the sample mean, and n is the number of samples.
9. The method for analyzing and optimizing energy consumption of an oil and gas gathering and transportation system according to claim 1, wherein in step 4, the feature importance analysis is to calculate the contribution of each feature to the final predicted energy consumption, and the greater the contribution, the higher the weight.
10. The method for analyzing and optimizing energy consumption of an oil and gas gathering and transportation system according to claim 1, wherein in step 5, regression averaging is used in constructing the model; first, the sampled data set is divided into a training set, a test set and a verification set, with the division proportions of the three data sets set to 80%, 10% and 10%; the features are the parameters of each device and the target is the energy consumption; the method specifically comprises the following steps:
firstly, adopting cross validation and grid search means, training a model by using a training set, and determining random forest model parameters;
evaluating the fitting effect of the model by using the test set, wherein the evaluation criterion is the root mean square error (RMSE); assuming that the real energy consumption is E and the energy consumption predicted by the model is E', the root mean square error is:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(E_i - E'_i\right)^2}$$
where m is the number of predicted samples, and the smaller the RMSE, the better the fit;
finally, adjusting the model hyper-parameters by using the verification set; specifically, the verification set is fed into the trained model and a learning curve is drawn; the best hyper-parameters are those for which the upper and lower curves are close to each other and both relatively high.
11. The energy consumption analysis and optimization method for an oil and gas gathering and transportation system according to claim 1, wherein in step 6, candidate values of each adjustable parameter are enumerated using the machine learning model trained in step 5, schemes are generated by the exhaustive method or by orthogonal experiments, each scheme is evaluated with the quantitative energy consumption prediction big data model, and the schemes are ranked to select the optimal scheme.
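
The mean-value filling of claim 5 and the zero-variance feature deletion of claim 8 can be sketched together as follows. This is an illustration under stated assumptions: missing values are represented by `None`, the mean is taken over the observed (non-missing) values of the column, and the column names and readings are hypothetical.

```python
from statistics import mean, variance

def fill_missing_with_mean(column):
    """Replace None entries with the mean of the observed values in the column."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    col_mean = mean(observed)
    return [col_mean if v is None else v for v in column]

def drop_zero_variance(table):
    """Keep only features whose sample variance (n - 1 denominator) is non-zero."""
    return {name: col for name, col in table.items() if variance(col) != 0}

# Hypothetical data table: one column with a gap, one constant column.
raw = {"inlet_temp": [40.0, None, 44.0], "valve_state": [1.0, 1.0, 1.0]}
filled = {name: fill_missing_with_mean(col) for name, col in raw.items()}
kept = drop_zero_variance(filled)
```

A constant column such as `valve_state` carries no information about energy consumption, so deleting it before model fitting loses nothing.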

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010159512.8A CN113379093A (en) 2020-03-09 2020-03-09 Energy consumption analysis and optimization method for oil gas gathering and transportation system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010159512.8A CN113379093A (en) 2020-03-09 2020-03-09 Energy consumption analysis and optimization method for oil gas gathering and transportation system

Publications (1)

Publication Number Publication Date
CN113379093A true CN113379093A (en) 2021-09-10

Family

ID=77568592

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010159512.8A Pending CN113379093A (en) 2020-03-09 2020-03-09 Energy consumption analysis and optimization method for oil gas gathering and transportation system

Country Status (1)

Country Link
CN (1) CN113379093A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180285788A1 (en) * 2015-10-13 2018-10-04 British Gas Trading Limited System for energy consumption prediction
CN106355208A (en) * 2016-08-31 2017-01-25 广州精点计算机科技有限公司 Data prediction analysis method based on COX model and random survival forest
CN109543203A (en) * 2017-09-22 2019-03-29 山东建筑大学 A kind of Building Cooling load forecasting method based on random forest
CN109063313A (en) * 2018-07-26 2018-12-21 北京交通大学 Calculation Method of Energy Consumption in Train Traction based on machine learning
CN110705774A (en) * 2019-09-26 2020-01-17 汉纳森(厦门)数据股份有限公司 Vehicle energy consumption analysis prediction method and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
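文雯; 刘文哲; 肖祥武; 向春波; 谢小鹏; 姜鑫: "Calculation model for the power supply coal consumption of thermal power units based on big data and a parallel random forest algorithm", 热力发电, no. 09, pages 13 - 18 *
肖祥武 et al.: "Optimization of an energy consumption prediction model based on a big data platform and a parallel random forest algorithm", 华电技术, vol. 40, no. 7, pages 1 - 4 *
黄铠; 冯运凯; 刘建武; 程浩; 张影: "Precise management of the whole industry chain of oil and gas field enterprises based on big data mining", 物流技术, no. 02, pages 108 - 114 *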

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116528270A (en) * 2023-06-27 2023-08-01 杭州电瓦特科技有限公司 Base station energy saving potential evaluation method, device, equipment and storage medium
CN116528270B (en) * 2023-06-27 2023-10-03 杭州电瓦特科技有限公司 Base station energy saving potential evaluation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination