CN115828130A

CN115828130A - Clustering algorithm-based multi-parameter dominant water flow channel automatic identification method and system

Info

Publication number: CN115828130A
Application number: CN202310107756.5A
Authority: CN
Inventors: 张吉群; 李欣; 贾德利; 王利明; 常军华; 李夏宁; 吴丽; 崔丽宁; 闫林; 张洋; 王全宾
Original assignee: Petrochina Co Ltd
Current assignee: Petrochina Co Ltd
Priority date: 2023-02-14
Filing date: 2023-02-14
Publication date: 2023-03-21

Abstract

The invention discloses a method and a system for automatically identifying multi-parameter dominant water flow channels based on a clustering algorithm, and relates to the technical field of oil exploitation. The method comprises the following steps: step S1: collecting the interlayer parameters of the oil and water wells of an oil field, taking the layer as the unit; step S2: obtaining principal components from the interlayer parameters of the oil and water wells by principal component analysis; and step S3: clustering the principal components with the K-Means clustering algorithm to identify the dominant water flow channels. The invention overcomes limitations of other methods, such as dependence on the experience of engineers and long processing time. It establishes a more effective method for identifying dominant water flow channels, shortens the research period, improves identification accuracy, yields judgments that agree more closely with actual production, and is suitable for wide application.

Description

Clustering algorithm-based multi-parameter dominant water flow channel automatic identification method and system

Technical Field

The invention relates to the technical field of oil exploitation, in particular to a multi-parameter dominant water flow channel automatic identification method and system based on a clustering algorithm.

Background

Most oil fields in China are heterogeneous, multi-layer sandstone oil fields of continental deposition, and reservoir heterogeneity is relatively severe. Water injection is an important means of maintaining pressure and of increasing the oil production rate and the recovery factor in oil field development. In the late high-water-cut stage, however, interlayer, areal and intralayer contradictions become increasingly prominent, so that serious ineffective circulation of injected water occurs. This makes it very difficult to stabilize oil production and control water in the oil field, and directly affects the water injection development effect and the economic benefit. Only when the positions of the dominant water flow channels have been found can practicable measures be taken for deep profile control and flooding in the secondary development of old oil fields.

Existing methods for identifying dominant water flow channels mainly comprise well logging methods, well testing methods and other mathematical methods, each of which currently has certain limitations. The well testing method requires field operations and is costly, and data are available for only some of the wells, which affects the analysis results. The well logging method and the other mathematical methods mostly analyze only some of the parameters, so their accuracy is relatively low and they are not suitable for wide application.

Disclosure of Invention

The invention aims to provide a clustering algorithm-based multi-parameter dominant water flow channel automatic identification method and system that overcome limitations of other methods, such as dependence on the experience of engineers and long processing time. A more effective method for identifying dominant water flow channels is established, the research period is shortened, the identification accuracy is improved, judgments that agree more closely with actual production are obtained, and the method is suitable for wide application. In order to achieve this purpose, the invention provides the following technical scheme:

according to one aspect of the disclosure, a method for automatically identifying a multi-parameter dominant water flow channel based on a clustering algorithm is provided, the method comprising the following steps:

step S1: collecting the interlayer parameters of the oil and water wells of an oil field, taking the layer as the unit;

step S2: obtaining principal components from the interlayer parameters of the oil and water wells by principal component analysis;

step S3: clustering the principal components with the K-Means clustering algorithm to identify the dominant water flow channels.

In one possible embodiment, the parameters include: the inter-well cumulative water injection volume, the flushing time, the instantaneous water injection volume, the water injection rate, the water consumption rate, the water-drive oil volume, the water-drive liquid volume and the water injection intensity.

In a possible implementation, the step S2 of obtaining principal components from the interlayer parameters of the oil and water wells by principal component analysis comprises:

step S21: constructing an n × m sample data matrix X from the sample data and the interlayer parameters of the oil and water wells, wherein n represents the number of samples and m represents the number of interlayer parameters;

step S22: carrying out standardization processing on the sample data matrix X;

step S23: calculating a covariance matrix of the sample data matrix X after the standardization processing;

step S24: calculating an eigenvalue and an eigenvector of the covariance matrix;

step S25: calculating a principal component contribution rate and an accumulated contribution rate based on the eigenvalues and the eigenvectors;

step S26: based on the principal component contribution rate and the cumulative contribution rate, p principal components are obtained.
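Steps S21 to S26 can be sketched in Python with NumPy as follows. This is only an illustrative sketch, not the implementation of the invention: the function name `pca_select`, the use of the sample standard deviation (`ddof=1`) and the handling of the threshold as an argument are assumptions, while the 80% cumulative contribution default follows the detailed description.

```python
import numpy as np

def pca_select(X, threshold=0.80):
    """Illustrative sketch of steps S21-S26 (names are hypothetical).

    X is the n x m sample data matrix. Returns the principal component
    scores (n x p), the contribution rates, the cumulative rates and p.
    Assumes no column of X is constant (otherwise the std is zero).
    """
    X = np.asarray(X, dtype=float)
    # S22: standardize each column (ddof=1 is an assumption).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # S23: m x m covariance matrix of the standardized data.
    R = np.cov(Z, rowvar=False)
    # S24: eigh handles symmetric matrices and returns ascending
    # eigenvalues, so reorder them to descending.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # S25: contribution rate and cumulative contribution rate.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    # S26: smallest p whose cumulative contribution exceeds the threshold.
    p = int(np.searchsorted(cumulative, threshold, side="right")) + 1
    p = min(p, len(eigvals))
    scores = Z @ eigvecs[:, :p]
    return scores, contribution, cumulative, p
```

The returned `scores` array holds one row per sample and one column per retained principal component, which is the input expected by the clustering step.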

In a possible implementation, the step S3: clustering the principal components by using a K-Means clustering algorithm to identify a dominant water flow channel, comprising:

step S31: randomly selecting 3 samples from the layer-level samples as the initial center points;

step S32: respectively calculating the distance from each sample point to the 3 selected center points, and classifying the sample points according to the distances;

step S33: for the sample points in each class, calculating their mean value as the new center point;

step S34: if the recalculated new center points are the same as the original center points, finishing the clustering; if the recalculated new center points differ from the original center points, assigning the new center points to the original center points and repeating step S32 until the calculation is finished;

step S35: dividing the finally obtained clusters into three categories: strong water flow, normal water flow and weak water flow, wherein the strong water flow category gives the identified dominant water flow channels.
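The loop formed by steps S31 to S34 can be sketched as below. The sketch assumes Euclidean distance, a hypothetical function name `kmeans3`, a fixed random seed for reproducibility and an iteration cap; none of these details are specified in the text.

```python
import numpy as np

def kmeans3(F, seed=0, max_iter=100):
    """Illustrative K-Means with 3 clusters on principal component scores F."""
    F = np.asarray(F, dtype=float)
    rng = np.random.default_rng(seed)
    # S31: pick 3 distinct samples as the initial center points.
    centers = F[rng.choice(len(F), size=3, replace=False)]
    for _ in range(max_iter):
        # S32: distance of every sample to every center, shape (n, 3),
        # then assign each sample to its nearest center.
        dist = np.linalg.norm(F[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # S33: the mean of each cluster becomes the new center; an empty
        # cluster keeps its previous center (an assumption of this sketch).
        new_centers = np.array([
            F[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(3)
        ])
        # S34: stop when the centers no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Because the initial centers are chosen at random, different seeds may lead to different final clusters; running the sketch with several seeds is a common precaution.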

In a possible implementation, the step S32 of respectively calculating the distance from each sample point to the 3 selected center points and classifying the sample points according to the distances comprises the following steps:

step S321: calculating the distance from each sample point to the 3 center points according to the following formula:

$$D_{ic}(t) = \sqrt{\sum_{k=1}^{p}\left(f_{ik} - z_{ck}\right)^{2}},\quad c = 1, 2, 3;$$

in the formula, $D_{i1}(t)$ represents the distance from sample $i$ to center point 1 at time $t$; $D_{i2}(t)$ represents the distance from sample $i$ to center point 2 at time $t$; $D_{i3}(t)$ represents the distance from sample $i$ to center point 3 at time $t$; $z_{1k}$ represents the value of principal component $k$ of center point 1; $z_{2k}$ represents the value of principal component $k$ of center point 2; $z_{3k}$ represents the value of principal component $k$ of center point 3; $f_{ik}$ represents the value of principal component $k$ of sample $i$; $i$ denotes the sample number and $k$ denotes the principal component number.

step S322: sequentially comparing the distances from each sample point to the centers, and assigning each sample to the cluster of its nearest center, to obtain 3 new clusters $C_1$, $C_2$ and $C_3$.

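In one possible embodiment, the new center point is calculated as follows:

$$Z_c(t+1) = \frac{1}{|C_c|}\sum_{i \in C_c} f_i,\qquad C_c = \left\{\, i : D_{ic}(t) = \min\big(D_{i1}(t),\, D_{i2}(t),\, D_{i3}(t)\big) \right\},\quad c = 1, 2, 3;$$

in the formula, $Z_1(t+1)$ represents center point 1 at time $t+1$; $Z_2(t+1)$ represents center point 2 at time $t+1$; $Z_3(t+1)$ represents center point 3 at time $t+1$; $D_{i1}(t)$ represents the distance from sample $i$ to center point 1 at time $t$; $D_{i2}(t)$ represents the distance from sample $i$ to center point 2 at time $t$; $D_{i3}(t)$ represents the distance from sample $i$ to center point 3 at time $t$; $C_1$ is the first cluster; $C_2$ is the second cluster; $C_3$ is the third cluster; $f_i$ is the vector of principal component values of sample $i$ and $|C_c|$ is the number of samples in cluster $c$.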

According to one aspect of the present disclosure, there is provided a system for automatically identifying a multi-parameter dominant water flow channel based on a clustering algorithm, the system comprising an acquisition unit, a principal component analysis unit and a clustering algorithm unit, wherein:

the acquisition unit is used for collecting the interlayer parameters of the oil and water wells of the oil field, taking the layer as the unit;

the principal component analysis unit is used for obtaining principal components from the interlayer parameters of the oil and water wells by principal component analysis;

and the clustering algorithm unit is used for clustering the principal components with the K-Means clustering algorithm and identifying the dominant water flow channels.

In a possible embodiment, the parameters in the acquisition unit include: the inter-well cumulative water injection volume, the flushing time, the instantaneous water injection volume, the water injection rate, the water consumption rate, the water-drive oil volume, the water-drive liquid volume and the water injection intensity.

In one possible embodiment, the principal component analysis unit comprises a construction module, a standardization processing module, a first calculation module, a second calculation module, a third calculation module and an acquisition module, wherein:

the construction module is used for constructing an n × m sample data matrix X from the sample data and the interlayer parameters of the oil and water wells, wherein n is the number of samples and m is the number of interlayer parameters;

the standardization processing module is used for carrying out standardization processing on the sample data matrix X;

the first calculation module is used for calculating a covariance matrix of the sample data matrix X after the standardization processing;

the second calculation module is used for calculating an eigenvalue and an eigenvector of the covariance matrix;

the third calculation module is used for calculating the principal component contribution rate and the accumulated contribution rate based on the characteristic value and the characteristic vector;

and the acquisition module is used for acquiring p principal components based on the principal component contribution rate and the accumulated contribution rate.

In a possible implementation, the clustering algorithm unit comprises a random selection module, a classification module, a fourth calculation module, a judgment module and an identification module, wherein:

the random selection module is used for randomly selecting 3 samples from the layer-level samples as the initial center points;

the classification module is used for respectively calculating the distance from each sample point to the 3 selected center points and classifying the sample points according to the distances;

the fourth calculation module is used for calculating, for the sample points in each class, their mean value as the new center point;

the judgment module is used for finishing the clustering if the recalculated new center points are the same as the original center points, and, if the recalculated new center points differ from the original center points, for assigning the new center points to the original center points and repeating step S32 until the calculation is finished;

the identification module is used for dividing the finally obtained clusters into three categories: strong water flow, normal water flow and weak water flow, wherein the strong water flow category gives the identified dominant water flow channels.

The invention has the technical effects and advantages that:

Based on the collected interlayer parameters of the oil and water wells, the principal components are first selected by principal component analysis, and the dominant water flow channels of each layer are then identified automatically by a clustering algorithm. The main influencing factors can thus be selected from multiple parameters, the influence of human factors is reduced while identification is fast, the identification accuracy and prediction speed are improved, and the water injection development of the oil field can be guided effectively.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Drawings

FIG. 1 is a flow chart of a method for automatically identifying a multi-parameter dominant water flow channel based on a clustering algorithm according to an exemplary embodiment of the invention;

FIG. 2 is a flow chart of a principal component analysis method according to an exemplary embodiment of the present invention;

FIG. 3 is a flowchart of a K-means clustering algorithm according to an exemplary embodiment of the present invention;

FIG. 4 is a schematic diagram of an automatic identification system for a multi-parameter dominant water flow channel based on a clustering algorithm according to an exemplary embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

FIG. 1 is a flowchart of an automatic identification method for a multi-parameter dominant water flow channel based on a clustering algorithm according to an exemplary embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:

step S1: collecting the interlayer parameters of the oil and water wells of an oil field, taking the layer as the unit;

In step S1 of the present invention, the interlayer parameters of the oil and water wells are collected for each layer of the oil field, including the inter-well cumulative water injection volume, the flushing time, the instantaneous water injection volume, the water injection rate, the water consumption rate, the water-drive oil volume, the water-drive liquid volume, the water injection intensity and so on; the parameter data of each sublayer at all time points are compiled into a data table, as shown in Table 1.

TABLE 1 Collection of parameters between oil and water well layers in oil field
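One way to hold the collected data is one record per layer and time point, stacked row by row into the sample data matrix. The sketch below is illustrative only: the class name `LayerSample`, the field names and the numerical values are hypothetical and are not taken from Table 1, and units are left unspecified.

```python
from dataclasses import dataclass, astuple
import numpy as np

@dataclass
class LayerSample:
    """One sample: the interlayer parameters of one layer at one time point."""
    cumulative_injection: float      # inter-well cumulative water injection volume
    flushing_time: float
    instantaneous_injection: float
    injection_rate: float
    water_consumption_rate: float
    water_drive_oil: float
    water_drive_liquid: float
    injection_intensity: float

def to_matrix(samples):
    """Stack the samples row by row into the n x m sample data matrix X."""
    return np.array([astuple(s) for s in samples], dtype=float)

# Made-up values, for illustration only.
samples = [
    LayerSample(1200.0, 35.0, 40.0, 1.8, 0.6, 300.0, 900.0, 2.1),
    LayerSample(450.0, 12.0, 15.0, 0.7, 0.3, 160.0, 380.0, 0.9),
]
X = to_matrix(samples)
print(X.shape)  # (2, 8)
```

Each row of `X` is one sample and each of the 8 columns is one interlayer parameter, so n = 2 and m = 8 in this toy case.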

In step S2 of the present invention, principal component analysis is used to determine the main influencing parameters. FIG. 2 is a flowchart of the principal component analysis method according to an exemplary embodiment of the present invention; as shown in FIG. 2, the method includes the following steps:

step S21: first, combining the number of samples ($n$) and the number of interlayer parameters of the oil and water wells ($m$), constructing an $n \times m$ sample data matrix $X$, expressed as:

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{pmatrix} = (X_1, X_2, \dots, X_m);$$

in the formula, $x_{11}$ represents the value of parameter 1 of sample 1; $x_{12}$ represents the value of parameter 2 of sample 1; $x_{1m}$ represents the value of parameter $m$ of sample 1; $x_{21}$ represents the value of parameter 1 of sample 2; $x_{22}$ represents the value of parameter 2 of sample 2; $x_{2m}$ represents the value of parameter $m$ of sample 2; $x_{n1}$ represents the value of parameter 1 of sample $n$; $x_{n2}$ represents the value of parameter 2 of sample $n$; $x_{nm}$ represents the value of parameter $m$ of sample $n$; $X_1$ represents the column vector of the $n$ sample values of parameter 1; $X_2$ represents the column vector of the $n$ sample values of parameter 2; $X_m$ represents the column vector of the $n$ sample values of parameter $m$; $n$ represents the number of samples; $m$ represents the number of interlayer parameters of the oil and water wells.

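Step S22: calculating, column by column, the mean $\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}$ and the sample standard deviation $s_j$, and standardizing the data as

$$x^{*}_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j},\quad i = 1, \dots, n,\; j = 1, \dots, m.$$

Carrying out this standardization processing on the sample data matrix $X$ gives:

$$X^{*} = \begin{pmatrix} x^{*}_{11} & x^{*}_{12} & \cdots & x^{*}_{1m} \\ x^{*}_{21} & x^{*}_{22} & \cdots & x^{*}_{2m} \\ \vdots & \vdots & & \vdots \\ x^{*}_{n1} & x^{*}_{n2} & \cdots & x^{*}_{nm} \end{pmatrix} = (X^{*}_1, X^{*}_2, \dots, X^{*}_m);$$

in the formula, $x^{*}_{11}$ represents the standardized value of parameter 1 of sample 1; $x^{*}_{12}$ represents the standardized value of parameter 2 of sample 1; $x^{*}_{1m}$ represents the standardized value of parameter $m$ of sample 1; $x^{*}_{21}$ represents the standardized value of parameter 1 of sample 2; $x^{*}_{22}$ represents the standardized value of parameter 2 of sample 2; $x^{*}_{2m}$ represents the standardized value of parameter $m$ of sample 2; $x^{*}_{n1}$ represents the standardized value of parameter 1 of sample $n$; $x^{*}_{n2}$ represents the standardized value of parameter 2 of sample $n$; $x^{*}_{nm}$ represents the standardized value of parameter $m$ of sample $n$; $X^{*}_1$ represents the column vector of the standardized values of the $n$ samples of parameter 1; $X^{*}_2$ represents the column vector of the standardized values of the $n$ samples of parameter 2; $X^{*}_m$ represents the column vector of the standardized values of the $n$ samples of parameter $m$; $n$ represents the number of samples; $m$ represents the number of interlayer parameters of the oil and water wells; $x_{ij}$ represents the value of parameter $j$ of sample $i$; $x^{*}_{ij}$ represents the standardized value of parameter $j$ of sample $i$.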
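As a small worked example of the standardization in step S22, the snippet below standardizes a toy matrix with 3 samples and 2 parameters. The numbers are made up, and the use of the sample standard deviation with `ddof=1` is an assumption.

```python
import numpy as np

# Toy data: 3 samples (rows) and 2 parameters (columns).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

mean = X.mean(axis=0)         # column means: [2., 20.]
std = X.std(axis=0, ddof=1)   # sample standard deviations: [1., 10.]
X_std = (X - mean) / std      # standardized matrix

print(X_std)
# [[-1. -1.]
#  [ 0.  0.]
#  [ 1.  1.]]
```

After standardization both columns have zero mean and unit standard deviation, so parameters measured on very different scales contribute comparably to the covariance matrix.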

Step S23: calculating the covariance matrix from the standardized sample data matrix:

$$R = \left(r_{jl}\right)_{m \times m};$$

wherein

$$r_{jl} = \operatorname{cov}\left(X^{*}_j, X^{*}_l\right),\quad j, l = 1, 2, \dots, m;$$

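Step S24: calculating the eigenvalues and eigenvectors of the covariance matrix:

eigenvalues:

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0;$$

eigenvectors:

$$a_i = \left(a_{i1}, a_{i2}, \dots, a_{im}\right)^{T},\quad i = 1, 2, \dots, m;$$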
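For illustration, the eigenvalues and eigenvectors of a small symmetric matrix can be obtained with `numpy.linalg.eigh`. The 2 × 2 matrix below is a made-up example, not data from the invention.

```python
import numpy as np

# Hypothetical covariance matrix of two standardized parameters.
R = np.array([[1.0, 0.6],
              [0.6, 1.0]])

# eigh is intended for symmetric matrices and returns eigenvalues in
# ascending order, so reorder them to descending.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]        # approximately [1.6, 0.4]
eigvecs = eigvecs[:, order]     # column i belongs to eigvals[i]

print(eigvals)
```

The eigenvalues sum to 2, the number of parameters, as expected for a covariance matrix with unit diagonal.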

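step S25: calculating the contribution rate and the cumulative contribution rate of each component:

contribution rate of the $i$-th component:

$$\frac{\lambda_i}{\sum_{k=1}^{m} \lambda_k},\quad i = 1, 2, \dots, m;$$

cumulative contribution rate of the first $i$ components:

$$\frac{\sum_{k=1}^{i} \lambda_k}{\sum_{k=1}^{m} \lambda_k},\quad i = 1, 2, \dots, m;$$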

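step S26: acquiring p principal components according to the cumulative contribution rate.

Step S261: taking the first, second, ..., $p$-th ($p \le m$) principal components, corresponding to the eigenvalues whose cumulative contribution rate exceeds 80%:

$$\frac{\sum_{k=1}^{p} \lambda_k}{\sum_{k=1}^{m} \lambda_k} > 80\%;$$

the $i$-th principal component:

$$F_i = a_{i1} X^{*}_1 + a_{i2} X^{*}_2 + \cdots + a_{im} X^{*}_m,\quad i = 1, 2, \dots, p;$$

where $\lambda_k$ are the eigenvalues of the covariance matrix in descending order, $a_{ij}$ is the $j$-th component of the $i$-th eigenvector, and $X^{*}_j$ is the standardized column vector of parameter $j$.

step S262: at this point, the $m$ influence indexes have been converted into $p$ principal components $F_1, F_2, \dots, F_p$ by principal component analysis.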
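The selection of p can be illustrated with a short numerical example. The eigenvalues below are hypothetical; only the 80% threshold comes from the text.

```python
import numpy as np

# Hypothetical eigenvalues of a 4-parameter problem, sorted descending.
eigvals = np.array([2.5, 1.0, 0.3, 0.2])

contribution = eigvals / eigvals.sum()   # approx. [0.625, 0.25, 0.075, 0.05]
cumulative = np.cumsum(contribution)     # approx. [0.625, 0.875, 0.95, 1.0]

# Smallest p whose cumulative contribution rate exceeds 80%.
p = int(np.argmax(cumulative > 0.80)) + 1
print(p)  # 2
```

In this toy case the first two components already explain 87.5% of the variance, so 4 indexes are reduced to 2 principal components.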

In step S3 of the present invention, the K-Means clustering algorithm is used to cluster the principal components and identify the dominant water flow channels. FIG. 3 is a flowchart of the K-Means clustering algorithm according to an exemplary embodiment of the present invention; as shown in FIG. 3, the algorithm includes the following steps:

step S31: randomly selecting 3 samples from the layer-level samples as the center points $Z$:

$$Z_1(t) = \left(z_{11}, z_{12}, \dots, z_{1p}\right),\quad Z_2(t) = \left(z_{21}, z_{22}, \dots, z_{2p}\right),\quad Z_3(t) = \left(z_{31}, z_{32}, \dots, z_{3p}\right);$$

in the formula, $Z_1(t)$ represents center point 1 at time $t$; $Z_2(t)$ represents center point 2 at time $t$; $Z_3(t)$ represents center point 3 at time $t$; $z_{11}$ represents the value of principal component 1 of center point 1; $z_{12}$ represents the value of principal component 2 of center point 1; $z_{1p}$ represents the value of principal component $p$ of center point 1; $z_{21}$ represents the value of principal component 1 of center point 2; $z_{22}$ represents the value of principal component 2 of center point 2; $z_{2p}$ represents the value of principal component $p$ of center point 2; $z_{31}$ represents the value of principal component 1 of center point 3; $z_{32}$ represents the value of principal component 2 of center point 3; $z_{3p}$ represents the value of principal component $p$ of center point 3.

Step S32: respectively calculating the distance from each sample point to the 3 selected center points, and classifying the sample points according to the distances;

step S321: calculating the distance from each sample point to the 3 center points:

$$D_{ic}(t) = \sqrt{\sum_{k=1}^{p}\left(f_{ik} - z_{ck}\right)^{2}},\quad c = 1, 2, 3;$$

in the formula, $D_{i1}(t)$ represents the distance from sample $i$ to center point 1 at time $t$; $D_{i2}(t)$ represents the distance from sample $i$ to center point 2 at time $t$; $D_{i3}(t)$ represents the distance from sample $i$ to center point 3 at time $t$; $z_{1k}$ represents the value of principal component $k$ of center point 1; $z_{2k}$ represents the value of principal component $k$ of center point 2; $z_{3k}$ represents the value of principal component $k$ of center point 3; $f_{ik}$ represents the value of principal component $k$ of sample $i$.

Step S322: sequentially comparing the distances from each sample point to the centers, and assigning each sample to the cluster of its nearest center, to obtain 3 new clusters $C_1$, $C_2$ and $C_3$.
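A toy example of the assignment in steps S321 and S322 is given below. It assumes Euclidean distance and uses made-up two-dimensional principal component scores and center points.

```python
import numpy as np

# Made-up principal component scores of 5 samples (p = 2).
F = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [5.0, 5.0],
              [6.0, 5.0],
              [10.0, 0.0]])
# Made-up current center points 1, 2 and 3.
centers = np.array([[0.0, 0.0],
                    [5.0, 5.0],
                    [10.0, 0.0]])

# S321: Euclidean distance from every sample to every center, shape (5, 3).
dist = np.sqrt(((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2))
# S322: index of the nearest center for each sample.
labels = dist.argmin(axis=1)
print(labels)  # [0 0 1 1 2]
```

Here index 0 corresponds to center point 1, index 1 to center point 2 and index 2 to center point 3.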

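Step S33: for the sample points in each cluster, calculating their mean value as the new center point:

$$Z_c(t+1) = \frac{1}{|C_c|}\sum_{i \in C_c} f_i,\qquad C_c = \left\{\, i : D_{ic}(t) = \min\big(D_{i1}(t),\, D_{i2}(t),\, D_{i3}(t)\big) \right\},\quad c = 1, 2, 3;$$

in the formula, $Z_1(t+1)$ represents center point 1 at time $t+1$; $Z_2(t+1)$ represents center point 2 at time $t+1$; $Z_3(t+1)$ represents center point 3 at time $t+1$; $D_{i1}(t)$ represents the distance from sample $i$ to center point 1 at time $t$; $D_{i2}(t)$ represents the distance from sample $i$ to center point 2 at time $t$; $D_{i3}(t)$ represents the distance from sample $i$ to center point 3 at time $t$; $C_1$ is the first cluster; $C_2$ is the second cluster; $C_3$ is the third cluster; $f_i$ is the vector of principal component values of sample $i$ and $|C_c|$ is the number of samples in cluster $c$.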
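The center update can be checked on a toy example. The scores and the cluster assignment below are made up for illustration.

```python
import numpy as np

# Made-up principal component scores of 5 samples (p = 2).
F = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [5.0, 5.0],
              [6.0, 5.0],
              [10.0, 0.0]])
# Cluster index (0, 1 or 2) of each sample from the assignment step.
labels = np.array([0, 0, 1, 1, 2])

# New center of each cluster = mean of the samples assigned to it.
new_centers = np.array([F[labels == c].mean(axis=0) for c in range(3)])
print(new_centers)
# [[ 0.   0.5]
#  [ 5.5  5. ]
#  [10.   0. ]]
```

This sketch assumes every cluster contains at least one sample; an empty cluster would need separate handling.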

Step S34: if the center points recalculated at step $t+1$ are the same as the center points at step $t$, finishing the clustering; if the center points at step $t+1$ differ from those at step $t$, assigning the new center points to the original center points, $Z_c(t) \leftarrow Z_c(t+1)$, $c = 1, 2, 3$, and repeating step S32 until the calculation is finished;

step S35: dividing the finally determined clusters into three categories: strong water flow, normal water flow and weak water flow, wherein the strong water flow category gives the identified dominant water flow channels, as listed in Table 2.
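The text does not state how the three clusters are matched to the three categories. One hypothetical rule, shown purely as an assumption, is to rank the clusters by the mean of a chosen indicator such as the inter-well cumulative water injection volume. The function name `name_clusters` and all values below are illustrative.

```python
import numpy as np

def name_clusters(labels, indicator):
    """Map cluster ids 0..2 to category names by descending indicator mean.

    Hypothetical rule, not taken from the source; assumes no cluster is empty.
    """
    labels = np.asarray(labels)
    indicator = np.asarray(indicator, dtype=float)
    means = [indicator[labels == c].mean() for c in range(3)]
    ranked = np.argsort(means)[::-1]   # cluster ids, largest mean first
    names = ("strong water flow", "normal water flow", "weak water flow")
    return {int(c): name for c, name in zip(ranked, names)}

labels = [0, 0, 1, 1, 2]
cum_injection = [100.0, 120.0, 900.0, 1100.0, 400.0]   # made-up values
mapping = name_clusters(labels, cum_injection)
# Samples whose cluster is labeled as strong water flow.
dominant = [i for i, c in enumerate(labels) if mapping[c] == "strong water flow"]
print(mapping)
print(dominant)  # [2, 3]
```

Any such rule should be checked against field knowledge before the strong water flow cluster is treated as the set of dominant water flow channels.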

TABLE 2 Final determined clustering Table

FIG. 4 is a schematic diagram of an automatic identification system for a multi-parameter dominant water flow channel based on a clustering algorithm according to an exemplary embodiment of the present invention. As shown in FIG. 4, the system includes an acquisition unit, a principal component analysis unit and a clustering algorithm unit. The acquisition unit is used for collecting the interlayer parameters of the oil and water wells of the oil field, taking the layer as the unit; the principal component analysis unit is used for obtaining principal components from the interlayer parameters of the oil and water wells by principal component analysis; and the clustering algorithm unit is used for clustering the principal components with the K-Means clustering algorithm and identifying the dominant water flow channels.

Further, the parameters in the acquisition unit include: the inter-well cumulative water injection volume, the flushing time, the instantaneous water injection volume, the water injection rate, the water consumption rate, the water-drive oil volume, the water-drive liquid volume and the water injection intensity. The principal component analysis unit includes a construction module, a standardization processing module, a first calculation module, a second calculation module, a third calculation module and an acquisition module. The construction module is used for constructing an n × m sample data matrix X from the sample data and the interlayer parameters of the oil and water wells, wherein n represents the number of samples and m represents the number of interlayer parameters; the standardization processing module is used for carrying out standardization processing on the sample data matrix X; the first calculation module is used for calculating the covariance matrix of the standardized sample data matrix X; the second calculation module is used for calculating the eigenvalues and eigenvectors of the covariance matrix; the third calculation module is used for calculating the principal component contribution rate and the cumulative contribution rate based on the eigenvalues and eigenvectors; and the acquisition module is used for acquiring p principal components based on the principal component contribution rate and the cumulative contribution rate.
The clustering algorithm unit includes a random selection module, a classification module, a fourth calculation module, a judgment module and an identification module. The random selection module is used for randomly selecting 3 samples from the layer-level samples as the initial center points; the classification module is used for respectively calculating the distance from each sample point to the 3 selected center points and classifying the sample points according to the distances; the fourth calculation module is used for calculating, for the sample points in each class, their mean value as the new center point; the judgment module is used for finishing the clustering if the recalculated new center points are the same as the original center points, and, if they differ, for assigning the new center points to the original center points and repeating step S32 until the calculation is finished; the identification module is used for dividing the finally obtained clusters into three categories: strong water flow, normal water flow and weak water flow, wherein the strong water flow category gives the identified dominant water flow channels.

Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments or portions thereof without departing from the spirit and scope of the invention.

Claims

1. A multi-parameter dominant water flow channel automatic identification method based on a clustering algorithm is characterized by comprising the following steps:

2. The clustering algorithm-based multi-parameter dominant water flow channel automatic identification method according to claim 1, wherein the parameters comprise: the accumulated water injection amount, the scouring time, the instantaneous water injection amount, the water injection speed, the water consumption rate, the water-flooding oil amount, the water-flooding liquid amount and the water injection strength among wells.

3. The method for automatically identifying the multiparameter dominant water flow channel based on the clustering algorithm as claimed in claim 1 or 2, wherein the step S2: based on the oil-water interlayer parameters, obtaining principal components by using a principal component analysis method, wherein the method comprises the following steps:

step S22: carrying out standardization processing on the sample data matrix X;

4. The method for automatically identifying the multiparameter dominant water flow channel based on the clustering algorithm as claimed in claim 1, wherein the step S3: clustering the principal components by using a K-Means clustering algorithm to identify a dominant water flow channel, comprising:

step S34: if the new center point recalculated is the same as the original center point, finishing clustering; if the new center point calculated again is different from the original center point, the new center point is assigned to the original center point, and the step S32 is continuously repeated until the calculation is finished;

5. The method for automatically identifying the multiparameter dominant water flow channel based on the clustering algorithm as claimed in claim 4, wherein the step S32: respectively calculating the distance from each sample point to the selected 3 central points, and classifying the sample points according to the distances, wherein the method comprises the following steps:

$$D_{ic}(t) = \sqrt{\sum_{k=1}^{p}\left(f_{ik} - z_{ck}\right)^{2}},\quad c = 1, 2, 3;$$

in the formula, $D_{i1}(t)$ represents the distance from sample $i$ to center point 1 at time $t$; $D_{i2}(t)$ represents the distance from sample $i$ to center point 2 at time $t$; $D_{i3}(t)$ represents the distance from sample $i$ to center point 3 at time $t$; $z_{1k}$ represents the value of principal component $k$ of center point 1; $z_{2k}$ represents the value of principal component $k$ of center point 2; $z_{3k}$ represents the value of principal component $k$ of center point 3; $f_{ik}$ represents the value of principal component $k$ of sample $i$; $i$ represents the sample number and $k$ represents the principal component number;

$C_1$, $C_2$, $C_3$.

6. The method for automatically identifying the multiparameter dominant water flow channel based on the clustering algorithm as claimed in claim 4, wherein the new central point is calculated as follows:

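$$Z_c(t+1) = \frac{1}{|C_c|}\sum_{i \in C_c} f_i,\qquad C_c = \left\{\, i : D_{ic}(t) = \min\big(D_{i1}(t),\, D_{i2}(t),\, D_{i3}(t)\big) \right\},\quad c = 1, 2, 3;$$

in the formula, $Z_1(t+1)$ represents center point 1 at time $t+1$; $Z_2(t+1)$ represents center point 2 at time $t+1$; $Z_3(t+1)$ represents center point 3 at time $t+1$; $D_{i1}(t)$ represents the distance from sample $i$ to center point 1 at time $t$; $D_{i2}(t)$ represents the distance from sample $i$ to center point 2 at time $t$; $D_{i3}(t)$ represents the distance from sample $i$ to center point 3 at time $t$; $C_1$ is the first cluster; $C_2$ is the second cluster; $C_3$ is the third cluster; $f_i$ is the vector of principal component values of sample $i$ and $|C_c|$ is the number of samples in cluster $c$.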

7. A multi-parameter dominant water flow channel automatic identification system based on a clustering algorithm, characterized by comprising an acquisition unit, a principal component analysis unit and a clustering algorithm unit, wherein:

8. The system for automatically identifying the multi-parameter dominant water flow channel based on the clustering algorithm as claimed in claim 7, wherein the parameters in the acquisition unit comprise: the accumulated water injection amount, the scouring time, the instantaneous water injection amount, the water injection speed, the water consumption rate, the water-flooding oil amount, the water-flooding liquid amount and the water injection strength among wells.

9. The system for automatically identifying the multi-parameter dominant water flow channel based on the clustering algorithm according to claim 7 or 8, wherein the principal component analysis unit comprises a construction module, a standardization processing module, a first calculation module, a second calculation module, a third calculation module and an acquisition module, wherein:

10. The system for automatically identifying the multi-parameter dominant water flow channel based on the clustering algorithm as claimed in claim 7, wherein the clustering algorithm unit comprises a random selection module, a classification module, a fourth calculation module, a judgment module and an identification module, wherein:

the identification module is used for dividing the finally obtained clusters into three categories: strong water flow, normal water flow and weak water flow, wherein the strong water flow category gives the identified dominant water flow channels.