CN116562428A

CN116562428A - Fracturing construction parameter optimization method based on machine learning

Info

Publication number: CN116562428A
Application number: CN202310383694.0A
Authority: CN
Inventors: 曾凡辉; 李宇麟; 胡大淦; 张宇; 郭建春; 王永红; 曾波
Original assignee: Southwest Petroleum University
Current assignee: Southwest Petroleum University
Priority date: 2023-04-12
Filing date: 2023-04-12
Publication date: 2023-08-08

Abstract

The invention discloses a fracturing construction parameter optimization method based on machine learning, comprising the following steps: obtaining the geological factors and engineering factors that influence yield; ranking the geological and engineering factors by importance using a machine learning method, and screening the main control factors according to the ranking; performing dimension reduction on the screened geological and engineering main control factors; determining the weights of the dimension-reduced geological and engineering main control factors using the entropy weight method combined with heuristic automatic search; and, based on the screened main control factors, combining a machine learning method with an optimization algorithm to establish an intelligent fracturing process parameter optimization model with single-well yield as the optimization objective, and solving it to obtain the optimal construction parameters. Compared with the prior art, the method improves calculation efficiency and optimization accuracy.

Description

Fracturing construction parameter optimization method based on machine learning

Technical Field

The invention relates to the field of petroleum and natural gas engineering, in particular to a fracturing construction parameter optimization method based on machine learning.

Background

Hydraulic fracturing is one of the main stimulation methods for low-permeability reservoirs and is widely used in developing unconventional oil and gas resources. Statistically, about one third of fractured wells fail to reach the expected production, which indicates deficiencies in how existing hydraulic fracturing techniques optimize their parameters. Because fracturing optimization is a black-box problem, a scientific optimization design system is needed to guide on-site construction. During fracturing operations, a large amount of valuable fracturing construction and production performance data accumulates on site, and conventional fracturing parameter optimization methods cannot make full use of it.

Data mining and machine learning, by contrast, can make full use of on-site fracturing construction and production performance data. A small-sample supervised machine learning model builds, from a small number of sample points, an approximate model of how the objective function and constraint functions vary with the variables, which allows the main control factors of yield to be analyzed. Optimizing fracturing construction parameters based on machine learning is therefore of important guiding significance for improving gas well yield.

The prior art CN114117654A discloses a machine-learning-based horizontal well perforation optimization design method and device, in which a machine learning automation framework iteratively optimizes a prediction model of horizontal well productivity and construction parameters through hierarchical optimization of algorithms and parameters. However, the inventors found that the input parameters are not subjected to dimension reduction in that patent, so its calculation efficiency and optimization accuracy are unsatisfactory. In addition, prior-art parameter optimization methods extract features in a rather general way, so the pertinence of the feature parameters needs further improvement; they optimize parameters mainly by orthogonal tests, which do not easily yield the global optimum; and because different machine learning methods each have advantages and disadvantages, the machine-learning-based parameter optimization models differ from area to area.

Disclosure of Invention

In view of the above, an object of an embodiment of the present invention is to provide a fracturing construction parameter optimization method based on machine learning.

In order to achieve the technical purpose, the invention provides the following technical scheme.

The fracturing construction parameter optimization method based on machine learning is characterized by comprising the following steps of:

(1) Obtaining geological factors and engineering factors which influence the yield;

(2) Carrying out importance ranking on geological factors and engineering factors affecting the yield by using a machine learning method, and screening yield main control factors according to ranking results;

(3) Performing dimension reduction treatment on the screened geological main control factors and engineering main control factors by using a dimension reduction method;

(4) Determining weights of geological main control factors and engineering main control factors after dimension reduction by utilizing an entropy weight method in combination with heuristic automatic searching, and determining a membership function basic form of the main control factors based on a correlation between the main control factors and yield;

(5) According to the screening result of the main control factors, combining a machine learning method and an optimization algorithm, establishing an intelligent fracturing process parameter optimization model aiming at optimizing the single well yield, and optimizing to obtain the optimal construction parameters.

Further, the geological factors include TOC, porosity, permeability, pressure coefficient, brittleness index, reservoir thickness, minimum horizontal principal stress, drilling encounter rate, and clay content.

Further, the engineering factors include horizontal section length, total liquid volume, total sand volume, average sand ratio, construction displacement, fracture spacing, construction pressure, pump-stop pressure, average sand ratio, ceramsite consumption, and fracturing fluid flowback rate.

Further, step (2) further includes determining the correlations among the geological factors and the engineering factors by Pearson correlation analysis, as follows:

calculating a correlation coefficient

$$r = \frac{\sum_{q=1}^{n}(a_q-\bar{a})(b_q-\bar{b})}{\sqrt{\sum_{q=1}^{n}(a_q-\bar{a})^{2}}\,\sqrt{\sum_{q=1}^{n}(b_q-\bar{b})^{2}}} \tag{1}$$

where $r$ is the correlation coefficient; $q$ is a natural number from 1 to $n$; $n$ is the number of wells; $a$ and $b$ are the two factors whose correlation is analyzed; $a_q$ and $b_q$ are the values of factors $a$ and $b$ for the $q$-th well; and $\bar{a}$ and $\bar{b}$ are the means of the two factors over the $n$ wells.

Further, the machine learning algorithm is at least one of a random forest, an artificial neural network and a support vector machine.

Further, the random forest method includes the following steps:

the feature importance measure is based on variance, and the variance at the $i$-th node is calculated as:

$$\mathrm{mse}(i) = \frac{1}{|D_i|}\sum_{(x_j,\,y_j)\in D_i}\left(y_j - c_i\right)^{2} \tag{2}$$

where $\mathrm{mse}(i)$ is the variance, $D_i$ is the data set at the $i$-th node, $x_j$ is a sample in the data set of node $i$, $y_j$ is the label corresponding to $x_j$, and $c_i$ is the mean of the labels in $D_i$;

let $\mathrm{IMP}_i^{mse}$ denote the average change in variance of the $i$-th feature over all decision tree nodes of the random forest (RF), i.e., the feature importance of the $i$-th feature;

the importance of feature $x_i$ at node $n$ is the change in variance before and after the data at node $n$ is divided into its left and right child nodes $n_l$ and $n_r$:

$$\mathrm{IMP}_{in}^{mse} = \mathrm{mse}(n) - \mathrm{mse}(n_l) - \mathrm{mse}(n_r) \tag{3}$$

where $\mathrm{IMP}_{in}^{mse}$ is the importance of feature $x_i$ at node $n$;

if $N$ is the set of nodes in the $k$-th decision tree at which feature $x_i$ is the split attribute, the importance of $x_i$ in the $k$-th decision tree is:

$$\mathrm{IMP}_{ik}^{mse} = \sum_{n\in N}\mathrm{IMP}_{in}^{mse} \tag{4}$$

where $\mathrm{IMP}_{ik}^{mse}$ is the importance of feature $x_i$ in the $k$-th decision tree;

if there are $K$ trees in the RF, the importance of feature $x_i$ in the whole RF is:

$$\mathrm{IMP}_{i}^{mse} = \sum_{k=1}^{K}\mathrm{IMP}_{ik}^{mse} \tag{5}$$

where $\mathrm{IMP}_i^{mse}$ is the importance of feature $x_i$ in the random forest.

Further, the dimension reduction by principal component analysis comprises the following steps:

assuming that the $n$-dimensional vector $w$ is a mapping vector of the low-dimensional mapping space, the variance of the mapped data is maximized:

$$\max_{w}\ \frac{1}{m}\sum_{i=1}^{m}\left(w^{T}(x_i-\bar{x})\right)^{2} \tag{6}$$

where $m$ is the number of data points participating in dimension reduction, $w$ is a mapping vector of the low-dimensional mapping space, $x_i$ is the vector representation of the $i$-th data point, and $\bar{x}$ is the mean vector of all the data participating in dimension reduction;

assuming that $W$ is the matrix whose column vectors are all the feature mapping vectors, the objective is expressed as:

$$\max_{W}\ \operatorname{tr}\left(W^{T}AW\right),\quad \text{s.t. } W^{T}W = I \tag{7}$$

where $\operatorname{tr}$ is the trace of a matrix, $W$ is the matrix whose column vectors are all the feature mapping vectors, $I$ is the identity matrix, and $A$ is the covariance matrix, given by:

$$A = \frac{1}{m}\sum_{i=1}^{m}(x_i-\bar{x})(x_i-\bar{x})^{T} \tag{8}$$

the output of principal component analysis is $Y = W^{T}X$, and the optimal $W$ is formed by taking the eigenvectors corresponding to the $k$ largest eigenvalues of the covariance matrix as its column vectors.

Further, after the weights of the dimension-reduced geological and engineering main control factors are determined, the basic form of the membership function of each principal component is determined from the correlation between each main control factor and single-well yield, comprising the following steps:

establishing the main control factor membership matrix using three types of membership function (larger-is-better, smaller-is-better, and intermediate); establishing the comprehensive evaluation fuzzy set from the weights and the membership matrix; introducing an evaluation set, and establishing a fuzzy comprehensive evaluation model from the evaluation set and the comprehensive evaluation fuzzy set; and using the established fuzzy comprehensive evaluation model to evaluate the fracturing effect on the dimension-reduced main control factors, obtaining the geological score, construction score, and total score of each well, fitting a function curve of score against yield, and verifying the rationality of the main control factor screening.

The invention provides a fracturing construction parameter optimization method based on machine learning. It integrates Pearson correlation analysis, recursive feature elimination, and the random forest method to screen the main control factors; introduces principal component dimension reduction and similar methods to improve model accuracy; uses fuzzy comprehensive evaluation to verify the rationality of the main control factors; establishes a yield prediction model; and introduces a particle swarm optimization algorithm to establish a construction parameter optimization model, thereby improving calculation efficiency and optimization accuracy.

Drawings

FIG. 1 is a graph of the explained variance ratio of the geological factors in an embodiment of the invention.

FIG. 2 is a graph of the explained variance ratio of the engineering factors in an embodiment of the invention.

FIG. 3 is a graph showing the fit of total score to per-kilometer test yield in an embodiment of the invention.

FIG. 4 is a graph of the difference between total score and geological score in an embodiment of the invention.

Detailed Description

The details of the invention will be more clearly understood in conjunction with the accompanying drawings and description of specific embodiments of the invention. However, the specific embodiments of the invention described herein are for the purpose of illustration only and are not to be construed as limiting the invention in any way. Given the teachings of the present invention, one of ordinary skill in the related art will contemplate any possible modification based on the present invention, and such should be considered to be within the scope of the present invention.

The invention provides a fracturing construction parameter optimization method based on machine learning, comprising steps (1) to (5) described above.

In a preferred embodiment, the geological factors include parameters such as TOC, porosity, permeability, pressure coefficient, brittleness index, reservoir thickness, minimum horizontal principal stress, drilling encounter rate, and clay content, and the engineering factors include parameters such as horizontal section length, total liquid volume, total sand volume, average sand ratio, construction displacement, fracture spacing, construction pressure, pump-stop pressure, average sand ratio, ceramsite consumption, and fracturing fluid flowback rate.

After the geological and engineering factors affecting yield are obtained, Pearson correlation analysis is used to study the relationships among the data, determining whether a relationship exists and how strong it is; the strength of the relationship is measured by the magnitude of the correlation coefficient.

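The correlation coefficient is calculated as follows:

$$r = \frac{\sum_{q=1}^{n}(a_q-\bar{a})(b_q-\bar{b})}{\sqrt{\sum_{q=1}^{n}(a_q-\bar{a})^{2}}\,\sqrt{\sum_{q=1}^{n}(b_q-\bar{b})^{2}}} \tag{1}$$

The numerator computes the covariance of the data sequences of the two factors $a$ and $b$, and the denominator computes the product of the standard deviations of the data sequences of factor $a$ and factor $b$; $\bar{a}$ and $\bar{b}$ are the means of the two sequences over the $n$ wells.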
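As an illustration of the Pearson screening step, the coefficient can be computed directly from two per-well data sequences. The following is a minimal sketch in plain Python; the function name `pearson_r` and the sample values are hypothetical and are not taken from the patent's data.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Numerator: covariance term. Denominator: product of the two spread terms.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical per-well values, for illustration only.
porosity = [4.1, 5.0, 5.6, 6.2, 7.0]
yield_per_km = [6.0, 7.5, 8.1, 9.4, 10.2]
r = pearson_r(porosity, yield_per_km)
```

A value of `r` close to +1 or -1 indicates a strong linear relationship between the two factors, and a value near 0 indicates a weak one.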

In specifically selecting machine learning algorithms, one skilled in the art may select at least one machine learning algorithm in combination with the characteristics of the data and the characteristics of the corresponding algorithm. In a preferred embodiment of the present invention, the machine learning algorithm may be at least one of a random forest, an artificial neural network, and a support vector machine.

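When measuring the importance of each factor, the random forest mainly uses variance as the impurity measure from which feature importance is computed. The variance at the $i$-th node is:

$$\mathrm{mse}(i) = \frac{1}{|D_i|}\sum_{(x_j,\,y_j)\in D_i}\left(y_j - c_i\right)^{2} \tag{2}$$

where $\mathrm{mse}(i)$ is the variance, $D_i$ is the data set at the $i$-th node, $x_j$ is a sample in the data set of node $i$, $y_j$ is the label corresponding to $x_j$, and $c_i$ is the mean of the labels in $D_i$.

Let $\mathrm{IMP}_i^{mse}$ denote the average change in variance of the $i$-th feature over all decision tree nodes of the RF, i.e., the feature importance of the $i$-th feature.

The importance of feature $x_i$ at node $n$ is the change in variance before and after the data at node $n$ is divided into its left and right child nodes $n_l$ and $n_r$:

$$\mathrm{IMP}_{in}^{mse} = \mathrm{mse}(n) - \mathrm{mse}(n_l) - \mathrm{mse}(n_r) \tag{3}$$

where $\mathrm{IMP}_{in}^{mse}$ is the importance of feature $x_i$ at node $n$.

If $N$ is the set of nodes in the $k$-th decision tree at which feature $x_i$ is the split attribute, the importance of $x_i$ in the $k$-th decision tree is:

$$\mathrm{IMP}_{ik}^{mse} = \sum_{n\in N}\mathrm{IMP}_{in}^{mse} \tag{4}$$

where $\mathrm{IMP}_{ik}^{mse}$ is the importance of feature $x_i$ in the $k$-th decision tree.

If there are $K$ trees in the RF, the importance of feature $x_i$ in the whole RF is:

$$\mathrm{IMP}_{i}^{mse} = \sum_{k=1}^{K}\mathrm{IMP}_{ik}^{mse} \tag{5}$$

where $\mathrm{IMP}_i^{mse}$ is the importance of feature $x_i$ in the random forest.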
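The variance-based importance measure can be sketched in a few lines. This is an illustrative sketch only: it assumes the node-level importance is the unweighted difference between the parent variance and the two child variances, and that tree-level values are summed over the forest. Both are assumptions about the exact form; library implementations such as scikit-learn weight the impurity decrease by sample counts. All function names are hypothetical.

```python
def node_mse(labels):
    """Variance of the labels at one node: mean squared deviation from their mean."""
    c = sum(labels) / len(labels)
    return sum((y - c) ** 2 for y in labels) / len(labels)

def split_importance(parent, left, right):
    """Variance change when a node's data is divided into left and right children."""
    return node_mse(parent) - node_mse(left) - node_mse(right)

def forest_importance(per_tree_splits):
    """Sum node-level importances within each tree, then across all trees.

    per_tree_splits holds one list per tree; each list contains
    (parent_labels, left_labels, right_labels) tuples for the nodes at
    which the feature of interest is the split attribute.
    """
    total = 0.0
    for splits in per_tree_splits:
        total += sum(split_importance(p, l, r) for p, l, r in splits)
    return total

# Hypothetical labels: one split that separates the two label groups perfectly.
parent = [1.0, 1.0, 5.0, 5.0]
left, right = [1.0, 1.0], [5.0, 5.0]
imp_node = split_importance(parent, left, right)
imp_forest = forest_importance([[(parent, left, right)], [(parent, left, right)]])
```

Features are then ranked by their forest-level importance, and the top-ranked ones are retained as candidate main control factors.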

Principal component analysis is used for dimension reduction. It combines multiple correlated factors into linearly uncorrelated principal components: by exploiting the correlations among the main control factors, a smaller number of principal components replaces the many original factors while retaining as much of the information they carry as possible, which simplifies the problem. The calculation steps are as follows:

Let the $n$-dimensional vector $w$ be a mapping vector of the low-dimensional mapping space. The variance of the mapped data is maximized:

$$\max_{w}\ \frac{1}{m}\sum_{i=1}^{m}\left(w^{T}(x_i-\bar{x})\right)^{2} \tag{6}$$

where $m$ is the number of data points participating in dimension reduction, $w$ is a mapping vector of the low-dimensional mapping space, $x_i$ is the vector representation of the $i$-th data point, and $\bar{x}$ is the mean vector of all the data participating in dimension reduction.

Let $W$ be the matrix whose column vectors are all the feature mapping vectors; so that the mapping retains as much of the information in the data as possible, the objective is:

$$\max_{W}\ \operatorname{tr}\left(W^{T}AW\right),\quad \text{s.t. } W^{T}W = I \tag{7}$$

where $\operatorname{tr}$ is the trace of a matrix, $W$ is the matrix whose column vectors are all the feature mapping vectors, $I$ is the identity matrix, and $A$ is the covariance matrix, given by:

$$A = \frac{1}{m}\sum_{i=1}^{m}(x_i-\bar{x})(x_i-\bar{x})^{T} \tag{8}$$

The output of principal component analysis is $Y = W^{T}X$. The optimal $W$ is formed by taking the eigenvectors corresponding to the $k$ largest eigenvalues of the covariance matrix as its column vectors, and through this process the original dimension of $X$ is reduced to $k$.
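Why the eigenvectors with the largest eigenvalues are the optimal mapping vectors can be seen from a short Lagrangian argument for a single unit-length mapping vector. The sketch below is a standard derivation, not text from the patent; the multiplier $\lambda$ is notation introduced here.

```latex
% Projected variance written with the covariance matrix A
\frac{1}{m}\sum_{i=1}^{m}\bigl(w^{T}(x_i-\bar{x})\bigr)^{2} = w^{T}Aw,
\qquad
A=\frac{1}{m}\sum_{i=1}^{m}(x_i-\bar{x})(x_i-\bar{x})^{T}

% Maximize w^T A w subject to w^T w = 1
L(w,\lambda)=w^{T}Aw-\lambda\,(w^{T}w-1)

% Stationarity gives the eigenvalue equation
\frac{\partial L}{\partial w}=2Aw-2\lambda w=0
\;\Longrightarrow\; Aw=\lambda w

% The variance attained equals the eigenvalue
w^{T}Aw=\lambda\,w^{T}w=\lambda
```

Every stationary point is therefore an eigenvector of $A$, and the variance it captures equals its eigenvalue, so the $k$ directions that retain the most variance are the eigenvectors belonging to the $k$ largest eigenvalues.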

After the weights of the dimension-reduced geological and engineering main control factors are determined, the basic form of the membership function of each principal component is defined from the correlation between each main control factor and single-well yield.

In a preferred embodiment, three types of membership function (larger-is-better, smaller-is-better, and intermediate) can be used to establish the main control factor membership matrix. The comprehensive evaluation fuzzy set is established from the weights and the membership matrix; an evaluation set is introduced, and a fuzzy comprehensive evaluation model is established from the evaluation set and the comprehensive evaluation fuzzy set; the established model is then used to evaluate the fracturing effect on the dimension-reduced main control factors, giving the geological score, construction score, and total score of each well, after which a function curve of score against yield is fitted to verify the rationality of the main control factor screening.

Specifically, the fuzzy comprehensive evaluation steps are as follows:

(1) Establishing the evaluation factor set

The elements of the factor set are the evaluation indexes that influence the yield of the fractured well. In fractured-well yield evaluation, the factor set is a fuzzy subset consisting of the $n$ principal components of the evaluated well, denoted $F = (F_1, F_2, \ldots, F_n)$.

(2) Establishing an evaluation set

The evaluation set is $v = (v_1, v_2, \ldots, v_n)$. $v$ is a totally ordered sequence, i.e., any two comments in $v$ differ in level. $v$ is the set of evaluation criteria corresponding to the evaluation factors in $F$. In fractured-well yield evaluation, $v$ is the set of yield levels (level I, level II, level III, and level IV) corresponding to each evaluation factor; here $v = [100, 75, 50, 25]$.

(3) Fuzzy weight vector of the evaluation factors

Since the importance of each factor to the evaluation result generally differs, each factor $F_i$ must be given a corresponding weight $w_i$ $(i = 1, 2, \ldots, n)$, and these weights constitute the weight set $W$. Because the weights of the quantitative indexes directly affect the quantitative result, the entropy weight method is introduced to determine the weights, and the weights are then further optimized by heuristic automatic search to obtain the weight matrix $W$.

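In a preferred embodiment, the entropy weight method may be selected for weighting. Its basic steps are as follows:

Let the original data matrix obtained from $n$ evaluation objects on $m$ evaluation indexes be:

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix} \tag{9}$$

where $X$ is the original data matrix and $x_{ij}$ is the value of the $j$-th evaluation object on the $i$-th evaluation index in the original data.

For benefit-type indexes, for which larger is better:

$$r_{ij} = \frac{x_{ij}-\min_{j} x_{ij}}{\max_{j} x_{ij}-\min_{j} x_{ij}} \tag{10}$$

For cost-type indexes, for which smaller is better:

$$r_{ij} = \frac{\max_{j} x_{ij}-x_{ij}}{\max_{j} x_{ij}-\min_{j} x_{ij}} \tag{11}$$

where $r_{ij}$ is the standardized value of the $j$-th evaluation object on the $i$-th evaluation index, $r_{ij}\in[0,1]$.

Standardizing the matrix gives:

$$R = (r_{ij})_{m\times n} \tag{12}$$

where $R$ is the standardized matrix of $X$.

In an evaluation problem with $m$ evaluation indexes and $n$ evaluation objects, the entropy of the $i$-th index is defined as:

$$H_i = -k\sum_{j=1}^{n} f_{ij}\ln f_{ij} \tag{13}$$

where $f_{ij} = r_{ij}\big/\sum_{j=1}^{n} r_{ij}$ and $k = 1/\ln n$; when $f_{ij} = 0$, let $f_{ij}\ln f_{ij} = 0$.

After the entropy of the $i$-th index is defined, the entropy weight of the $i$-th index is defined as:

$$w_i = \frac{1-H_i}{m-\sum_{i=1}^{m} H_i} \tag{14}$$

where:

$$0 \le w_i \le 1,\qquad \sum_{i=1}^{m} w_i = 1$$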
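The entropy weight calculation can be sketched as below, assuming the standard form of the method: min-max standardization per index, entropy scaled by $1/\ln n$, and weights proportional to $1 - H_i$. The function names and the sample matrix are hypothetical.

```python
import math

def normalize(row, benefit=True):
    """Min-max standardize one index across all evaluation objects."""
    lo, hi = min(row), max(row)
    if hi == lo:
        raise ValueError("a constant index carries no information to weight")
    if benefit:  # larger is better
        return [(x - lo) / (hi - lo) for x in row]
    return [(hi - x) / (hi - lo) for x in row]  # smaller is better

def entropy_weights(X, benefit_flags):
    """X[i][j] is the value of index i for object j; returns one weight per index."""
    m, n = len(X), len(X[0])
    if n < 2:
        raise ValueError("need at least two evaluation objects")
    k = 1.0 / math.log(n)
    entropies = []
    for row, flag in zip(X, benefit_flags):
        r = normalize(row, flag)
        total = sum(r)
        h = 0.0
        for r_ij in r:
            f = r_ij / total
            if f > 0:  # f * ln(f) is taken as 0 when f = 0
                h -= k * f * math.log(f)
        entropies.append(h)
    denom = m - sum(entropies)
    return [(1.0 - h) / denom for h in entropies]

# Hypothetical data: two indexes measured on four wells.
X = [[1.0, 2.0, 3.0, 4.0],
     [10.0, 10.0, 10.0, 40.0]]
weights = entropy_weights(X, benefit_flags=[True, True])
```

The second index, whose values are concentrated in a single well, has lower entropy and therefore receives the larger weight.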

In a preferred embodiment, to make the weight coefficients more reasonable and reduce the model prediction error, a heuristic automatic search is further applied to the entropy weight coefficients to obtain the optimal weight coefficients. The steps are as follows:

(a) Determine the initial weights, and set the step factor and the floating range (generally ±10%);

(b) Adjust the parameter weights one at a time, updating the remaining weights after each adjustment. If the model error decreases, continue adjusting the weight in the same direction; if the error increases, return the weight to its previous value. One round is complete when all parameter weights have been adjusted, and the model error at that point is recorded. Repeat the adjustment for multiple rounds, and end the adjustment for the current step factor once the model error has remained unchanged for 10 consecutive rounds;

(c) Halve the step factor and repeat step (b). The algorithm terminates when the weight adjustment has been reduced to a set threshold (optionally $0.01/2^{10}$); the weights at this point are the optimal weights, for which the average relative error of the model is smallest.
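Steps (a) to (c) describe a coordinate-wise search with step halving, which can be sketched as follows. This is one possible reading, not the patent's implementation: "updating the remaining weights" is assumed to mean renormalizing so the weights sum to 1, moves that leave the ±10% band are rejected, and `toy_error` is a hypothetical stand-in for the model's average relative error.

```python
def renormalize(w):
    """Rescale weights so that they sum to 1."""
    total = sum(w)
    return [x / total for x in w]

def heuristic_search(w0, error_fn, step=0.01, min_step=0.01 / 2 ** 10,
                     band=0.10, patience=10):
    w = renormalize(w0)
    lower = [x * (1 - band) for x in w]   # floating range around the start
    upper = [x * (1 + band) for x in w]
    best = error_fn(w)
    while step >= min_step:
        unchanged = 0
        while unchanged < patience:       # rounds at the current step factor
            before = best
            for i in range(len(w)):
                for direction in (1, -1):
                    while True:           # keep moving while the error drops
                        trial = list(w)
                        trial[i] += direction * step
                        trial = renormalize(trial)
                        inside = all(lo <= t <= hi
                                     for lo, t, hi in zip(lower, trial, upper))
                        err = error_fn(trial) if inside else float("inf")
                        if err < best:
                            w, best = trial, err
                        else:
                            break         # revert: keep the previous weights
            unchanged = unchanged + 1 if best == before else 0
        step /= 2                         # halve the step factor
    return w, best

def toy_error(w):
    """Hypothetical error surface with its minimum at weights (0.32, 0.68)."""
    return (w[0] - 0.32) ** 2 + (w[1] - 0.68) ** 2

start = [0.3, 0.7]
w_opt, err_opt = heuristic_search(start, toy_error)
```

In practice `error_fn` would re-score the sample wells with the trial weights and return the resulting prediction error, which makes each evaluation more expensive than in this toy case.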

(4) Determining a single factor evaluation matrix

The membership functions are determined according to the reservoir quality grade (grade I, II, III, or IV) corresponding to each evaluation factor, using a combination of the fuzzy statistics method and the assignment method; three membership function forms are selected: larger-is-better, smaller-is-better, and intermediate.
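The fuzzy comprehensive evaluation steps above can be illustrated with a small sketch. The patent does not give the membership function formulas, so the piecewise-linear forms below are assumptions, as is the weighted-average composition used to turn the membership matrix into a single score; only the evaluation set `[100, 75, 50, 25]` comes from the text.

```python
def larger_type(x, a, b):
    """Rises linearly from 0 at a to 1 at b (larger values are better)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def smaller_type(x, a, b):
    """Falls linearly from 1 at a to 0 at b (smaller values are better)."""
    return 1.0 - larger_type(x, a, b)

def intermediate_type(x, a, b, c, d):
    """Trapezoid: 0 outside [a, d], 1 on [b, c], linear in between."""
    return min(larger_type(x, a, b), smaller_type(x, c, d))

def fuzzy_score(weights, R, grades=(100, 75, 50, 25)):
    """Combine factor weights with membership matrix R (factors x grades)."""
    # Comprehensive evaluation vector: weighted membership in each grade.
    B = [sum(w * row[j] for w, row in zip(weights, R))
         for j in range(len(grades))]
    total = sum(B)
    # Score: membership-weighted average of the grade values.
    return sum(b * g for b, g in zip(B, grades)) / total

# Hypothetical well: factor 1 fully grade I, factor 2 fully grade II.
score = fuzzy_score([0.5, 0.5], [[1, 0, 0, 0], [0, 1, 0, 0]])
```

Applying `fuzzy_score` separately to the geological and engineering principal components would give the geological and construction scores, and applying it to all of them the total score.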

To help those skilled in the art further understand the technical solution, the invention is explained below in connection with a specific oilfield application.

Based on basic parameters of the block sample well, geological and engineering main control factors in the sample well can be obtained by comprehensively screening through Pearson correlation analysis, a recursive feature elimination method and a random forest method by using formulas (1) - (5), and the results are shown in table 1:

TABLE 1 Master factor screening results

The principal component dimension reduction method is introduced, and the sample wells are dimension-reduced using formulas (6), (7), and (8). For the geological features, the cumulative information ratio (cumulative explained variance) of the first 5 principal components reaches 89.7%; since a cumulative ratio of 80% is considered acceptable, 5 principal components are selected to replace the original 10 features. For the engineering features, the cumulative information ratio of the first 4 principal components reaches 87.9%, so 4 principal components are selected to replace the original 10 features. The explained variance ratios of the geological and engineering factors after dimension reduction are shown in FIGS. 1-2, the geological and engineering factor feature matrices are shown in Tables 2-3, and the dimension reduction results of the individual wells, obtained by substituting the sample well main control factor data into these feature matrices, are shown in Table 4.
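The selection of the number of principal components from the cumulative explained variance can be sketched as follows. The function name and the ratio values are hypothetical; the sketch returns the smallest number of components that reaches a threshold, whereas the embodiment above retains somewhat more than the 80% minimum.

```python
def components_needed(explained_ratio, threshold=0.80):
    """Smallest k whose cumulative explained variance reaches the threshold."""
    cumulative = 0.0
    for k, ratio in enumerate(explained_ratio, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k, cumulative
    raise ValueError("threshold is not reached by the given components")

# Hypothetical explained variance ratios, sorted from largest to smallest.
ratios = [0.35, 0.22, 0.15, 0.10, 0.077, 0.05, 0.03]
k80, cum80 = components_needed(ratios)
k85, cum85 = components_needed(ratios, threshold=0.85)
```

Raising the threshold trades a larger model input for less information loss, which is the judgment made when choosing between 4 and 5 components in a case like this one.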

TABLE 2 geological factor characterization matrix

TABLE 3 engineering factor characterization matrix

Original feature	Principal component 1 coefficient	Principal component 2 coefficient	Principal component 3 coefficient	Principal component 4 coefficient
Long 1₁¹ sublayer drilling encounter rate	-0.25	-0.05	-0.72	-0.24
Drilling encounter rate of the 4 m target window above the base of Long 1₁¹	-0.40	-0.28	-0.01	-0.18
Well spacing	0.29	0.31	-0.33	0.35
Wellbore orientation	-0.04	0.44	0.35	-0.68
Average section length	-0.27	0.44	0.30	0.32
Average cluster spacing	0.43	0.00	0.04	-0.05
Total liquid intensity	0.35	0.01	-0.15	-0.46
Total sand-adding intensity	-0.40	0.21	0.00	0.02
Ceramsite proportion	0.08	-0.62	0.37	0.03
Displacement	-0.39	-0.14	-0.04	-0.03

TABLE 4 dimension reduction results for geological and engineering Master control factors

A fuzzy comprehensive evaluation method is introduced to verify the rationality of the screened main control factors. The scores of the sample wells are shown in Table 5, which shows a positive correlation between score and yield, indicating that the screened main control factors are reasonable.

TABLE 5 sample well fuzzy evaluation results

Based on the dimension-reduced sample well data, three machine learning methods are used to build separate prediction models of per-kilometer test yield, and the dimension-reduced data of a given well are substituted into the three models. The comparison of the three machine learning yield prediction models is shown in Table 6. The support vector regression model has the highest prediction accuracy, with an average relative error of only 17.6%, so it is preferred as the prediction model of per-kilometer test yield for this well.
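The model comparison relies on the average relative error of each candidate. A minimal sketch of that comparison is shown below, assuming the error is the mean of |predicted − actual| / |actual| over the test wells; the model names and numbers are hypothetical and do not reproduce Table 6.

```python
def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / |actual| over all wells."""
    return sum(abs(p - a) / abs(a)
               for a, p in zip(actual, predicted)) / len(actual)

def best_model(actual, predictions_by_model):
    """Return the model name with the lowest error, plus all errors."""
    errors = {name: mean_relative_error(actual, pred)
              for name, pred in predictions_by_model.items()}
    return min(errors, key=errors.get), errors

# Hypothetical per-kilometer test yields and model predictions.
actual = [10.0, 8.0]
predictions = {
    "support_vector_regression": [11.0, 8.0],
    "random_forest": [12.0, 6.0],
    "neural_network": [9.0, 10.0],
}
winner, errors = best_model(actual, predictions)
```

The model with the smallest error is then used as the objective function of the construction parameter optimization.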

TABLE 6 Comparison of machine learning yield prediction models

The construction parameters of the well are intelligently optimized using a particle swarm optimization algorithm with 200 initial particles, 2000 iterations, a learning factor of 2, and an inertia factor of 1. The top-ranked schemes by yield after optimization are shown in Table 7. The final optimal construction parameter combination is an average section length of 90 m, an average cluster spacing of 10 m, a total liquid intensity of 26 m³/m, a total sand-adding intensity of 2.8 t/m, and a displacement of 14 m³/min; the corresponding per-kilometer test yield is 12.8×10⁴ m³/(d·km), which is 29.6% higher than the original capacity level.
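A particle swarm search over bounded construction parameters can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the defaults (30 particles, 100 iterations, inertia 0.7, learning factors 1.5) are assumptions chosen so the example runs quickly and converges, and `toy_yield` is a hypothetical stand-in for the trained yield prediction model. The values stated above (200 particles, 2000 iterations, learning factor 2, inertia factor 1) can be passed as arguments instead.

```python
import random

def pso_maximize(objective, bounds, n_particles=30, n_iter=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocity.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep each parameter inside its allowed range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_yield(p):
    """Hypothetical smooth objective with its maximum at (3, -1)."""
    return -((p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2)

best, best_val = pso_maximize(toy_yield, [(-10.0, 10.0), (-10.0, 10.0)])
```

With a real yield model, each element of `bounds` would be the feasible range of one construction parameter, such as section length or cluster spacing, and the returned position would be the recommended parameter combination.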

TABLE 7 Top 5 schemes by yield after optimization

The method integrates Pearson correlation analysis, recursive feature elimination, and the random forest method to screen the main control factors; introduces principal component dimension reduction and similar methods to improve model accuracy; uses fuzzy comprehensive evaluation to verify the rationality of the main control factors; establishes a yield prediction model based on machine learning; and introduces a particle swarm optimization algorithm to establish a construction parameter optimization model. Compared with the prior art, it improves calculation efficiency and optimization accuracy.

The present invention has been described in detail with reference to the embodiments. It should be understood that the present embodiment is a preferred embodiment of the invention; it is not intended to limit the invention to the form disclosed herein, and it is not to be construed as excluding other embodiments. Modifications and simple changes made by those skilled in the art that do not depart from the technical idea and scope of the invention all fall within the protection scope of the technical scheme of the invention.

Claims

1. The fracturing construction parameter optimization method based on machine learning is characterized by comprising the following steps of:

2. The machine learning based fracturing construction parameter optimization method of claim 1, wherein the geological factors comprise TOC, porosity, permeability, pressure coefficient, brittleness index, reservoir thickness, minimum horizontal principal stress, drilling encounter rate, and clay content.

3. The machine learning based fracturing construction parameter optimization method of claim 1, wherein the engineering factors comprise horizontal section length, total liquid volume, total sand volume, average sand ratio, construction displacement, fracture spacing, construction pressure, pump-stop pressure, average sand ratio, ceramsite consumption, and fracturing fluid flowback rate.

4. The machine learning based fracturing construction parameter optimization method of claim 1, wherein step (2) further comprises determining the correlations among the geological factors and the engineering factors by Pearson correlation analysis, as follows:

calculating a correlation coefficient

5. The fracturing construction parameter optimization method based on machine learning of claim 1, wherein the machine learning algorithm is at least one of a random forest, an artificial neural network and a support vector machine.

6. The machine learning based fracturing construction parameter optimization method of claim 1, wherein the step of random forest method comprises:

7. The machine learning based fracturing construction parameter optimization method of claim 1, wherein the dimension reduction by principal component analysis method comprises the steps of:

$$\max_{W}\ \operatorname{tr}\left(W^{T}AW\right),\quad \text{s.t. } W^{T}W = I \tag{7}$$

8. The machine learning based fracturing construction parameter optimization method of claim 1, wherein, after the weights of the dimension-reduced geological and engineering main control factors are determined, the basic form of the membership function of each principal component is determined from the correlation between each main control factor and single-well yield, comprising the following steps: