CN110674841A - Logging curve identification method based on clustering algorithm - Google Patents

Logging curve identification method based on clustering algorithm

Publication number
CN110674841A
Authority
CN
China
Prior art keywords
logging
identified
data points
data
principal component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910780696.7A
Other languages
Chinese (zh)
Other versions
CN110674841B (en)
Inventor
周军
姬庆庆
李国军
胡家琦
张娟
朱登明
刘昱晟
王兆其
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China National Petroleum Corp
Institute of Computing Technology of CAS
China Petroleum Logging Co Ltd
Original Assignee
China National Petroleum Corp
Institute of Computing Technology of CAS
China Petroleum Logging Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China National Petroleum Corp, Institute of Computing Technology of CAS, China Petroleum Logging Co Ltd
Priority to CN201910780696.7A
Publication of CN110674841A
Application granted
Publication of CN110674841B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • E FIXED CONSTRUCTIONS
    • E21 EARTH DRILLING; MINING
    • E21B EARTH DRILLING, e.g. DEEP DRILLING; OBTAINING OIL, GAS, WATER, SOLUBLE OR MELTABLE MATERIALS OR A SLURRY OF MINERALS FROM WELLS
    • E21B49/00 Testing the nature of borehole walls; Formation testing; Methods or apparatus for obtaining samples of soil or well fluids, specially adapted to earth drilling or wells
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification

Abstract

The invention discloses a logging curve identification method based on a clustering algorithm, belonging to the field of logging curve identification. The method first applies principal component analysis to reduce the dimensionality of the logging curves, replacing the information they carry with a small number of mutually uncorrelated composite indexes that retain as much of the original multi-index information as possible. It then combines the machine learning idea with K-means cluster analysis to construct a KNN network, using the calibrated layering information as the data labels of the training data, so that the K-means clustering is better guided toward its cluster centers and a true and effective clustering result is obtained. The identification method does not require normalization of the curve data, which greatly reduces the time complexity of the program.

Description

Logging curve identification method based on clustering algorithm
Technical Field
The invention belongs to the field of logging curve identification, and particularly relates to a logging curve identification method based on a clustering algorithm.
Background
Logging technology originated with Schlumberger in France and mainly collects attributes that reflect formation properties, such as radioactivity, acoustic properties and conductivity. Over roughly the last hundred years it has developed from analog logging through digital and numerically controlled logging to imaging logging. It is widely applied in the exploration and development of oil and gas fields and has become an important technical means of helping geological exploration and oil production personnel find and evaluate oil and gas reservoirs. The technology is also increasingly applied to the exploration of other mineral resources. With the continuous development of logging technology in recent years, the information obtained by logging has become more and more abundant. Well logging interpretation has gradually moved from qualitative and semi-quantitative manual interpretation to quantitative, computer-aided interpretation, and the interpretation models and interpretation efficiency have improved to a certain extent. In general, however, well logging interpretation methods still lag behind, with low accuracy and efficiency. As the scale of oil and gas exploration keeps expanding, well logging interpretation faces increasingly complex research objects, and existing methods struggle to meet the rising interpretation requirements.
At present, logging curves are layered mainly by manual interpretation, which consumes a large amount of manpower and material resources and produces results that are easily influenced by the subjective judgment of the interpreter. It is therefore highly desirable to realize automatic layering with computer technology, thereby avoiding human error, reducing labor consumption and improving production efficiency. With the development of computer technology, automatic layering of logging curves has made great progress. Current methods for automatic lithology identification mainly include the probability statistics method, the support vector machine method and the like. The probability statistics method identifies and interprets logs by estimating the posterior probability from the prior and conditional probabilities. Liu Zi Yun et al. first proposed judging lithology with a probability statistics method and obtained a certain effect. Other researchers proposed using a probabilistic neural network model (PNN model) for lithology recognition from logging information; training and testing the model with logging data showed that the PNN model achieves a certain effect in layered logging interpretation. The probability statistics method is suitable for digital logging data whose curves reflect the physical properties of the surrounding rock well, and it can achieve a certain effect when core data are scarce but logging data are plentiful. Its drawbacks are that the prior probability is difficult to obtain and that it is strongly affected by human factors.
The support vector machine (SVM) is a pattern recognition method developed on the basis of statistical learning theory, and it performs well on pattern recognition problems with small sample sizes, nonlinearity and high-dimensional data. Research and experiments have shown that the method is feasible and effective for automatic well logging layering and lithology identification. One study combined the method with key logging curves to layer single-well logging curves and, using the layering result, identified sedimentary facies on the single-well stratigraphic profile, obtaining certain experimental results; Zhang Yan [16], based on the logging characteristics of different lithologies and fluids, studied lithology and fluid identification with the SFLA-SVM method. However, the support vector machine method has difficulty classifying large-scale training samples, and the SVM also has certain difficulty handling multi-class problems. In the experiments of Zhangxihua, Shenkao et al., the method struggled with high-precision well logging layering and produced large errors.
A survey of current research at home and abroad shows that work on adaptive multi-scale layer calibration is still at an exploratory stage, and the following problems remain. Both the probability statistics method and manual calibration and interpretation are easily influenced by human factors, and improving the algorithm does not readily allow the support vector machine method to perform well on high-precision layering. In addition, current logging horizon interpretation is usually studied with only one or two methods, which often leads to unclear horizon differentiation and inaccurate interpretation. Many studies can only divide the logging data into layers and cannot directly identify the specific name of each layer, so manual identification is still needed.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a logging curve identification method based on a clustering algorithm.
In order to achieve the purpose, the invention adopts the following technical scheme to realize the purpose:
a logging curve identification method based on a clustering algorithm comprises the following steps:
1) analyzing the logging curves by adopting a principal component analysis method, and determining m principal component curves according to the contribution rate from high to low; wherein m is the number of logging attributes;
2) using the first a principal component curves of known wells, training with a KNN classification algorithm, with the calibrated classification information serving as the data labels of the training data in machine learning, until the cluster centers are unchanged over two consecutive iterations, thereby forming a training data set; then adding the first a principal component curves of the well to be identified to the training data set as the instances to be identified, and training again to complete the classification.
Further, before the step 1), abnormal value elimination and depth correction processing are carried out on the logging curve.
Further, step 2) comprises the following operations:
201) identifying reservoirs and non-reservoirs;
202) carrying out oil-water-gas layer multi-scale classification identification in a reservoir;
wherein the multi-scale classification of the oil-water-gas layer comprises: the dry layer, the oil-poor layer, the water layer, the oil-gas layer, the oil-containing water layer, the gas-poor layer, the gas-water layer, the oil-water layer and the gas layer.
Further, the first a principal component curves in step 201) are respectively acoustic wave, natural potential and natural gamma.
Further, the specific process of step 201) is as follows:
calculating Euclidean distances from data points of the first a principal component curves of the well to be identified to reservoir and non-reservoir logging data points in the training data set;
selecting k data points closest to the logging data points to be identified;
counting the number of reservoirs and non-reservoirs in the k data points;
and taking the category with the highest occurrence frequency as the category of the logging data points to be identified.
Further, after the category of the logging data points to be identified has been determined, the method includes the following operation:
filtering the prediction results to remove mispredicted points.
Further, the first a principal component curves in step 202) are respectively acoustic wave, natural potential, neutron, density, natural gamma, deep lateral resistivity and shallow lateral resistivity.
Further, the specific process of step 202) is as follows:
adding data points of the first a principal component curves of the reservoir stratum of the well to be identified as an example to be identified into a training data set, and calculating the Euclidean distance from the data points of the example to be identified to the logging data points of the known category;
selecting k data points closest to the example data points to be identified;
counting the number of the occurrences of each prediction category in the k data points;
and the class with the highest class occurrence frequency in the k data points is used as the class of the example data point to be identified.
Further, after the category of the logging data points to be identified has been determined, the method includes the following operation:
filtering the prediction results to remove mispredicted points.
Compared with the prior art, the invention has the following beneficial effects:
The logging curve identification method based on the clustering algorithm first applies principal component analysis to reduce the dimensionality of the logging curves, replacing the information they carry with a small number of mutually uncorrelated composite indexes that retain as much of the original multi-index information as possible. It then combines the machine learning idea with K-means cluster analysis to construct a KNN network, using the calibrated layering information as the data labels of the training data, so that the K-means clustering is better guided toward its cluster centers and a true and effective clustering result is obtained. The identification method does not require normalization of the curve data, which greatly reduces the time complexity of the program.
Furthermore, abnormal value elimination and depth correction are applied to the logging curves. This removes the negative influence of abnormal values and ensures that the different logging attributes all have values at the same depths, which facilitates the extraction and analysis of horizon feature information.
Furthermore, when the oil, water and gas layers are identified, the points already identified as non-reservoir are removed and identification is carried out within the reservoir only, so that curve features are extracted better and higher identification accuracy is obtained. Multi-scale identification is achieved and the different requirements of actual production are met, solving the problem that existing layered identification methods for logging curves work at a single scale only.
Furthermore, using the acoustic (AC), natural potential (SP) and natural gamma (GR) curves as the first a principal component curves allows reservoir and non-reservoir to be identified while increasing the calculation speed.
Drawings
FIG. 1 is a schematic flow chart of a logging curve identification method based on a clustering algorithm according to the present invention;
FIG. 2 is a schematic view of the recognition of the oil-water layer according to the present invention;
FIG. 3 is a schematic diagram showing the well location distribution in the experimental region in example 1;
FIG. 4 is a graph of the reservoir and non-reservoir identification results for well Y189 in example 1.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention is described in further detail below with reference to the accompanying drawings:
Referring to FIG. 1, FIG. 1 is a schematic flow chart of the logging curve identification method based on a clustering algorithm, which includes three steps:
S1, analyzing the logging curves by principal component analysis, and determining m principal component curves in descending order of contribution rate
Owing to the limitations of the logging equipment and the geological structure, abnormal measured values such as 99999 and 0 often appear at the start and end of a logging run. These abnormal values negatively affect automatic layer identification, so they must be removed before layering. In addition, several logging devices are usually used in one run, and they acquire their attributes at different depth intervals; three sampling intervals are common: 0.075 m, 0.1 m and 0.125 m. Because the attribute features of each sampling point are best extracted at a common depth, the selected logging curves must be depth-corrected. The correction ensures that the different logging attributes all have values at the same depths, which facilitates the extraction and analysis of horizon feature information.
With the development of logging equipment, increasingly abundant logging data can be acquired. To reduce the dimensionality of the high-dimensional logging data, the method seeks a small number of linear combinations of the original indexes on the principle of losing as little information as possible. The composite indexes formed by these linear combinations retain as much of the information in the original indexes as possible and are called principal components.
For the logging curve data, a sample comprises m logging attributes and each attribute has n sampling points, forming an m × n matrix. Such a large amount of data is difficult for a computer to process directly. Useful information that characterizes the object must therefore be extracted from the raw data, that is, the key attributes must be found among the m logging attributes, which places high demands on human analysis and observation. To solve this problem effectively, the high-dimensional logging data are reduced in dimension: the information reflected by the many original logging curves is simplified and replaced by a few mutually uncorrelated composite indexes, which should still reflect the original multi-index information as fully as possible. This is achieved by choosing suitable linear combinations of the original indexes.
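The preprocessing described above can be sketched as follows. This is a minimal illustration, not the patented procedure: the sentinel values, the use of linear interpolation for depth correction, the 0.25 m target grid and the helper name `clean_and_resample` are all assumptions made for the example.

```python
import numpy as np

# Assumed sentinel readings that mark invalid samples (illustrative only;
# a real curve may legitimately contain 0, so adjust per curve).
SENTINELS = (99999.0, 0.0)

def clean_and_resample(depth, values, grid, sentinels=SENTINELS):
    """Drop sentinel samples, then linearly interpolate onto a shared depth grid."""
    depth = np.asarray(depth, dtype=float)
    values = np.asarray(values, dtype=float)
    valid = ~np.isin(values, sentinels)
    return np.interp(grid, depth[valid], values[valid])

# Two synthetic curves sampled at different intervals (0.1 m and 0.125 m).
depth_a = np.arange(0, 11) * 0.1
curve_a = 50.0 + 10.0 * depth_a
curve_a[3] = 99999.0                      # inject one abnormal reading
depth_b = np.arange(0, 9) * 0.125
curve_b = 2.0 + depth_b

grid = np.linspace(0.0, 1.0, 5)           # common depth grid (assumed step)
aligned = np.column_stack([
    clean_and_resample(depth_a, curve_a, grid),
    clean_and_resample(depth_b, curve_b, grid),
])
print(aligned.shape)                      # one row per depth, one column per curve
```

After this step every row of `aligned` holds the values of all curves at one depth, which is the layout the later distance calculations assume.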
Suppose $W_1, W_2, \ldots, W_m$ are the logging attributes corresponding to the logging curves, each containing n sampling points: $W_j = (w_{j1}, \ldots, w_{jn})^T$. Let $W = (W_1, W_2, \ldots, W_m)^T$; the data then form the following matrix:

$$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{m1} & w_{m2} & \cdots & w_{mn} \end{pmatrix} \tag{1}$$

where $w_{ji}$ is the value of the j-th logging attribute at the i-th sampling point.
The principal components are expressed as:

$$Y_i = e_{i1} W_1 + e_{i2} W_2 + \cdots + e_{im} W_m, \quad i = 1, 2, \ldots, m \tag{2}$$

where $Y_i$ is the i-th principal component and $e_{ij}$ is the coefficient of the j-th logging attribute in the i-th principal component. The coefficients are determined by the following rules:
1. $Y_i$ and $Y_j$ ($i \neq j$; $i, j = 1, 2, \ldots, m$) are uncorrelated;
2. $Y_1$ has the largest variance among all linear combinations of $W_1, W_2, \ldots, W_m$; $Y_2$ has the largest variance among all linear combinations of $W_1, W_2, \ldots, W_m$ that are uncorrelated with $Y_1$; and in general $Y_i$ has the largest variance among all linear combinations of $W_1, W_2, \ldots, W_m$ that are uncorrelated with $Y_1, \ldots, Y_{i-1}$.
First, the covariance matrix is calculated:

$$\Sigma = (\sigma_{ij})_{m \times m} \tag{3}$$

$$\sigma_{ij} = \operatorname{Cov}(W_i, W_j) = E\big[(W_i - \bar{W}_i)(W_j - \bar{W}_j)\big] \tag{4}$$

where $\sigma_{ij}$ is the covariance between logging attribute i and logging attribute j, and $\bar{W}_i$ and $\bar{W}_j$ are the means of attributes i and j over the n sampling points. $\Sigma$ is a non-negative definite matrix of order m. Solving for its eigenvalues gives $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m \geq 0$, with $e_i = (e_{i1}, e_{i2}, \ldots, e_{im})^T$ the eigenvector corresponding to $\lambda_i$. The contribution ratio of the i-th principal component is:

$$\frac{\lambda_i}{\sum_{k=1}^{m} \lambda_k} \tag{5}$$

Because the logging curves have different dimensions (units), their values are dispersed to different degrees and the calculated variances differ accordingly. To eliminate the influence of the different dimensions, the data are usually standardized, and formula (3) is transformed to obtain:

$$W_i^* = \frac{W_i - \bar{W}_i}{\sqrt{\sigma_{ii}}} \tag{6}$$

$$R = (R_{ij})_{m \times m} \tag{7}$$

$$R_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\,\sigma_{jj}}} \tag{8}$$

where R is the covariance matrix of the standardized variables, that is, the correlation matrix of the original attributes, and $R_{ij}$ is the correlation coefficient between logging attribute i and logging attribute j. Solving for the eigenvalues of R gives $\lambda_1^* \geq \lambda_2^* \geq \cdots \geq \lambda_m^* \geq 0$, with $e_i^* = (e_{i1}^*, e_{i2}^*, \ldots, e_{im}^*)^T$ the eigenvector corresponding to $\lambda_i^*$. The contribution ratio of the i-th principal component is:

$$\frac{\lambda_i^*}{\sum_{k=1}^{m} \lambda_k^*} \tag{9}$$

The well log data used in the invention include 60 different attribute dimensions, such as acoustic (AC), density (DEN) and neutron (CNL). Through the above principal component analysis, seven logging curves are finally selected as the principal component curves for automatic layering: acoustic (AC), natural potential (SP), neutron (CNL), density (DEN), natural gamma (GR), deep lateral resistivity (LLD) and shallow lateral resistivity (LLS).
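The contribution ratios described above can be computed with a standard eigendecomposition. The sketch below is an illustration under assumptions, not the patented implementation: it uses NumPy, synthetic data in place of real logs, and a hypothetical helper `contribution_ratios`. How the patent maps contribution ratios back to individual curves such as AC or SP is not specified in the text, so that step is not shown.

```python
import numpy as np

def contribution_ratios(W, standardize=True):
    """Eigenvalues (descending), eigenvectors and contribution ratios.

    W has shape (m, n): one row per logging attribute, one column per sample.
    """
    W = np.asarray(W, dtype=float)
    # np.corrcoef and np.cov treat each row as one variable, matching W's layout.
    M = np.corrcoef(W) if standardize else np.cov(W)
    eigvals, eigvecs = np.linalg.eigh(M)            # symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)    # guard against tiny negative round-off
    eigvecs = eigvecs[:, order]
    return eigvals, eigvecs, eigvals / eigvals.sum()

# Synthetic stand-in for three logging attributes with 200 samples each.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
W = np.vstack([
    base + 0.05 * rng.normal(size=200),         # two strongly related curves
    2.0 * base + 0.05 * rng.normal(size=200),
    rng.normal(size=200),                       # one unrelated curve
])

eigvals, eigvecs, ratios = contribution_ratios(W)
print(ratios)                                   # largest contribution first, sums to 1

# Principal component curves of the standardized attributes.
Z = (W - W.mean(axis=1, keepdims=True)) / W.std(axis=1, ddof=1, keepdims=True)
Y = eigvecs.T @ Z
```

With `standardize=True` the eigenvalues are those of the correlation matrix, so they sum to the number of attributes m; with `standardize=False` the covariance matrix is used instead.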
S2, reservoir and non-reservoir classification identification
Since non-reservoir intervals occupy most of the logged depth section, they pose a great challenge to identifying and calibrating the individual horizons within the reservoir. To identify the layers within the reservoir accurately, the invention therefore first distinguishes reservoir from non-reservoir.
Because separating reservoir from non-reservoir is a simple two-class problem, the invention uses the three curves with the highest contribution rates from the principal component analysis as the basis for this step: acoustic (AC), natural potential (SP) and natural gamma (GR).
Classification problems of this kind are commonly handled with cluster analysis. Many cluster analysis methods exist for classifying sample data; common ones include K-means clustering, hierarchical clustering and fuzzy clustering. The core idea of cluster analysis is to group the sample data so that samples in the same class are as similar as possible and samples in different classes are as different as possible. Similarity is measured by a specific criterion, generally the distance between samples in the feature space. For the present invention, with m logging attributes, the similarity between two samples in the m-dimensional space can be measured with a distance formula. K-means clustering usually uses the Euclidean distance, and the Euclidean distance between the i-th sample and the j-th sample is:
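$$d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2} \tag{10}$$

where $x_{ik}$ is the value of the k-th logging attribute of the i-th sample.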
Taking the characteristics of the logging data into account, the invention classifies the data with the K-means clustering algorithm, also called the fast clustering method, which is commonly used in cluster analysis. Given a number of classes K for a database of n logging sample points, K initial cluster centers are selected, the Euclidean distance from each sample point to each cluster center is calculated, and each sample point is assigned to the class of its nearest cluster center. After all sample points have been assigned, a new cluster center is obtained for each class by averaging its sample points. The process is iterated, and when the cluster centers from two consecutive iterations no longer change, the iteration is considered finished, yielding the final cluster centers and completing the classification. The method is sensitive to the initial cluster centers: different initial centers can produce different clustering results, and if suitable initial centers are not chosen, a true and effective clustering result may not be obtained.
To address this problem, the method combines the supervised machine learning idea with K-means cluster analysis to construct a KNN network. The logging data contain accurate, manually calibrated layering information, which serves as the data labels of the training data in machine learning. Training the network on these data better guides the K-means clustering toward its cluster centers, so that the layering task for a specified exploratory well can be completed.
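The iteration just described can be sketched in a few lines. This is a generic K-means loop written for illustration, assuming NumPy and a caller-supplied set of initial centers; the function name `k_means` and the toy points are not from the patent.

```python
import numpy as np

def k_means(points, init_centers, max_iter=100):
    """Plain K-means: assign to nearest center, recompute means, stop when centers settle."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iter):
        # Euclidean distance from every point to every center, shape (n, K).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # New center = mean of the points assigned to it (keep old center if empty).
        new_centers = np.array([
            points[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        if np.allclose(new_centers, centers):   # two consecutive iterations agree
            break
        centers = new_centers
    return centers, labels

points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, labels = k_means(points, init_centers=[[0, 0], [10, 10]])
print(labels.tolist())   # [0, 0, 0, 1, 1, 1]
```

Because the result depends on `init_centers`, running the same data with poorly chosen starting centers can converge to a different partition, which is the sensitivity the text points out.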
For the well logging data points to be identified, the following algorithm steps are specifically required:
(1) calculating the distances from the logging data points to be identified to all reservoir and non-reservoir logging data points;
(2) sorting according to the calculated distance in ascending order;
(3) selecting the first k data points which are closest to the logging data points to be identified;
(4) counting the number of occurrences of two types of reservoirs and non-reservoirs in the k data points;
(5) the category with the highest category occurrence frequency in the k points is used as the category of the logging data point to be identified;
(6) filtering the prediction result to remove individual mispredicted points.
After the steps are executed, the input logging curve data can be divided into two types of reservoir and non-reservoir.
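Steps (1) to (5) above amount to a k-nearest-neighbour majority vote, which can be sketched with the Python standard library alone. The feature values below are made-up (AC, SP, GR) triples chosen only to make the example run, and `knn_classify` is a hypothetical helper name.

```python
import math
from collections import Counter

def knn_classify(query, train_points, train_labels, k=3):
    """Label a query point by majority vote among its k nearest labelled points."""
    # (1) Euclidean distance from the query to every labelled point.
    dists = [(math.dist(query, p), label) for p, label in zip(train_points, train_labels)]
    # (2) sort by distance, ascending; (3) keep the k nearest.
    nearest = sorted(dists, key=lambda t: t[0])[:k]
    # (4) count each label among the k neighbours; (5) pick the most frequent.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (AC, SP, GR) samples with calibrated labels.
train_points = [
    (250.0, -40.0, 60.0), (255.0, -35.0, 65.0), (245.0, -45.0, 55.0),
    (300.0, -5.0, 110.0), (310.0, -10.0, 120.0), (305.0, -8.0, 115.0),
]
train_labels = ["reservoir"] * 3 + ["non-reservoir"] * 3

print(knn_classify((252.0, -38.0, 62.0), train_points, train_labels))   # reservoir
print(knn_classify((302.0, -7.0, 112.0), train_points, train_labels))   # non-reservoir
```

For the two-class case an odd k avoids tied votes. Note that with unnormalized curves the attribute with the largest numeric range dominates the Euclidean distance.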
S3, carrying out multi-scale oil-water-gas layer identification in the reservoir
After the logging curve has been divided into reservoir and non-reservoir in the previous step, the data points of the depth sections identified as non-reservoir are removed, and the reservoir sections are then divided at a finer granularity. This amplifies the feature differences among the oil, water and gas layers within the reservoir as far as possible and eliminates the interference of the numerous non-reservoir data points, so that the oil, water and gas layers can be distinguished better.
When distinguishing oil, gas and water layers, the feature information contained in only three logging curves is limited, and a good layering result is difficult to obtain. To solve this problem, the invention uses the seven logging curves selected by principal component analysis in step S1 as the feature sources for distinguishing oil, gas and water layers: acoustic (AC), natural potential (SP), neutron (CNL), density (DEN), natural gamma (GR), deep lateral resistivity (LLD) and shallow lateral resistivity (LLS).
The oil-water-gas layers are divided into 11 categories in total, including the dry layer, poor oil layer, water layer, oil-gas layer, oil-containing water layer, gas-containing water layer, poor gas layer, gas-water layer, oil-water layer and gas layer.
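The two-stage selection of inputs can be sketched as simple array slicing. This is only an illustration: the column order in `CURVES`, the helper names and the toy feature table are assumptions, and the stage-one labels would in practice come from the reservoir and non-reservoir classification of step S2.

```python
import numpy as np

# Assumed column order of the seven selected curves in the feature table.
CURVES = ["AC", "SP", "CNL", "DEN", "GR", "LLD", "LLS"]
STAGE1_CURVES = ["AC", "SP", "GR"]   # curves used for reservoir vs non-reservoir

def stage1_inputs(features):
    """Three-curve features used to separate reservoir from non-reservoir."""
    cols = [CURVES.index(name) for name in STAGE1_CURVES]
    return np.asarray(features, dtype=float)[:, cols]

def stage2_inputs(features, is_reservoir):
    """Seven-curve features of reservoir points only, plus their original row indices."""
    features = np.asarray(features, dtype=float)
    idx = np.flatnonzero(is_reservoir)
    return features[idx], idx

features = np.arange(28, dtype=float).reshape(4, 7)   # 4 depth points, 7 curves
is_reservoir = [False, True, True, False]             # hypothetical stage-one result

x1 = stage1_inputs(features)
x2, idx = stage2_inputs(features, is_reservoir)
print(x1.shape, x2.shape, idx.tolist())
```

Keeping `idx` allows the fine-grained labels predicted in the second stage to be written back to the correct depths of the full curve.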
For the well logging data points to be identified in the reservoir, the following algorithm steps are specifically required to be executed:
(1) calculating the distances from the logging data points to be identified to all the known logging data points;
(2) sorting according to the calculated distance in ascending order;
(3) selecting the first k data points which are closest to the logging data points to be identified;
(4) counting the number of the occurrences of each prediction category in the k data points;
(5) the category with the highest category occurrence frequency in the k points is used as the category of the logging data point to be identified;
(6) filtering the prediction result to remove individual mispredicted points.
After the steps are executed, the input reservoir logging curves can be classified in a fine granularity mode, and oil-water-gas layer identification is completed.
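The text does not say which filter is used in step (6). One plausible choice, shown here purely as an assumption, is a sliding-window majority filter along depth that overwrites an isolated label when a different label holds a strict majority in its neighbourhood; the window size of 5 and the name `majority_filter` are illustrative.

```python
from collections import Counter

def majority_filter(labels, window=5):
    """Replace each label by the strict-majority label in a centred depth window."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        segment = labels[max(0, i - half): i + half + 1]
        top, count = Counter(segment).most_common(1)[0]
        # Keep the original label unless one label holds a strict majority.
        smoothed.append(top if count > len(segment) / 2 else labels[i])
    return smoothed

predicted = ["oil", "oil", "water", "oil", "oil", "oil",
             "water", "water", "water", "oil", "water", "water"]
print(majority_filter(predicted))
# ['oil', 'oil', 'oil', 'oil', 'oil', 'oil', 'water', 'water', 'water', 'water', 'water', 'water']
```

A wider window removes longer runs of mispredicted points but also erases genuinely thin layers, so the window size would need to be matched to the sampling interval and the thinnest layer of interest.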
The main differences between the invention and existing methods are as follows. First, existing layered identification methods for logging curves can only identify layers at a single scale and have difficulty achieving multi-scale identification, whereas the invention achieves multi-scale identification and meets the different requirements of actual production. Second, existing logging curve layered identification methods usually need to normalize the data first; to better expose the curve features, the invention does not normalize the curve data, which also reduces the computational cost of the program to a great extent. Third, when identifying oil-water-gas layers, the non-reservoir data already identified are removed and identification then proceeds within the reservoir, so that curve features are extracted better and a better identification accuracy is obtained.
Example 1
The invention selects logging data from 10 vertical wells in a certain block as experimental data; the well locations are shown in FIG. 2. The reservoir and non-reservoir logging data of 9 wells are used as training data, and the reservoir and non-reservoir horizons of the remaining well, Y189, are identified in the experiments.
S1, analyzing the logging curves by a principal component analysis method and determining m principal component curves in descending order of contribution rate
The logging attributes are input into expression (1) to construct a high-order matrix, and the contribution of each logging attribute to logging curve identification is obtained through expressions (2) to (9), giving the order: natural potential (SP) > acoustic (AC) > neutron (CNL) > density (DEN) > natural gamma (GR) > deep lateral resistivity (LLD) > shallow lateral resistivity (LLS) > other logging attributes. Therefore, the three attributes natural potential (SP), acoustic (AC) and neutron (CNL) are selected as the basis for dividing reservoir and non-reservoir; when dividing oil-water-gas layers within the reservoir, the seven curves natural potential (SP), acoustic (AC), neutron (CNL), density (DEN), natural gamma (GR), deep lateral resistivity (LLD) and shallow lateral resistivity (LLS) are selected as the basis.
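Expressions (1) to (9) are not reproduced in this part of the text, so the following is only a sketch of a standard covariance-based principal component analysis that yields contribution rates sorted from high to low. The function name and the input layout (rows are depth samples, columns are logging attributes) are assumptions, and the sketch does not attempt to reproduce how the patent maps contributions back to individual logging attributes.

```python
import numpy as np


def contribution_rates(log_matrix):
    """Return PCA contribution rates (descending) and the matching component directions."""
    X = np.asarray(log_matrix, dtype=float)
    # Centre each logging attribute; no scaling is applied, since the text
    # states that the curve data are not normalized (assumed to apply here too).
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh is used because the covariance matrix is symmetric; it returns
    # eigenvalues in ascending order, so they are reversed below.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Contribution rate of each component = its eigenvalue / sum of all eigenvalues.
    rates = eigvals / eigvals.sum()
    return rates, eigvecs
```

With m logging attributes the function returns m rates that sum to 1; the first a components could then be kept by truncating both return values.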
S2, reservoir and non-reservoir classification identification
Before the experiment, three curves, acoustic (AC), natural potential (SP) and natural gamma (GR), are selected from the 9 exploratory wells Y220, Y219, Y205, Y194, Y192, Y181, Y148, Y146 and Y45. The depth ranges of the curves differ from well to well but are mostly concentrated between 500 m and 1300 m, and the sampling interval is 0.1 m. The prediction range for the well to be identified, Y189, is 500 m to 1100 m, a total length of 600 m; at the 0.1 m sampling interval, 6000 sampling points need to be predicted in total.
The reservoir and non-reservoir identification results for well Y189 obtained with the KNN method are shown in FIG. 4; owing to space limits, only the results for part of the depth interval are shown. In FIG. 4, the three tracks on the left are the logging curves used by the method of the invention, whose response values vary with depth across different strata. The first track on the right is the reservoir and non-reservoir prediction before filtering, and the second track on the right is the prediction after filtering; color-filled depth intervals belong to the reservoir, and unfilled intervals belong to the non-reservoir. The third track on the right is the manually labeled result, in which labeled depth intervals are reservoir and unlabeled intervals are non-reservoir.
The horizon identification results in Table 1 show that the method accurately identifies most horizons, indicating that it distinguishes reservoirs from non-reservoirs well.
TABLE 1 Y189 well reservoir and non-reservoir layering results
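The text does not say which filter produces the post-filtering track. As one possible reading, a sliding-window majority filter over the depth-ordered predictions would remove isolated mispredicted points; the sketch below assumes that choice, and the function name and window size are hypothetical.

```python
from collections import Counter


def majority_filter(labels, window=5):
    """Replace each label by the most frequent label in a window centred on it."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        # Votes are always counted on the original predictions, not on
        # already-smoothed values, so the result does not depend on scan order.
        counts = Counter(labels[lo:hi])
        best = max(counts.values())
        if counts[labels[i]] == best:
            # Keep the original label when it is (or ties for) the most frequent.
            smoothed.append(labels[i])
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed
```

A single "non-reservoir" sample surrounded by "reservoir" samples is relabelled, while a genuine boundary between two long intervals is left in place.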
Figure BDA0002176484720000141
S3, carrying out multi-scale oil-water-gas layer identification in the reservoir
Building on the identification result of the above experiment, the method can divide the interior of the reservoir at a finer granularity. The reservoir in the experimental area contains 11 layer types, including dry layer, poor oil layer, water layer, oil-gas layer, oil-containing water layer, gas-containing water layer, poor gas layer, gas-water layer, oil-water layer and gas layer. Before the experiment, the classes with few samples are expanded so that the samples of each layer type are balanced and the layering result is more accurate.
During the experiment, the seven logging curves finally selected by the principal component analysis method, namely acoustic (AC), natural potential (SP), neutron (CNL), density (DEN), natural gamma (GR), deep lateral resistivity (LLD) and shallow lateral resistivity (LLS), from the 9 exploratory wells Y220, Y219, Y205, Y194, Y192, Y181, Y148, Y146 and Y45 are used as training data input. Table 2 gives the results of the multi-scale layering method:
TABLE 2 Y44 well multi-scale identification results
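How the under-represented classes are expanded is not described. One common option, assumed here purely for illustration, is random oversampling with replacement until every class matches the largest one; the function name and the seed parameter are hypothetical.

```python
import numpy as np


def oversample_to_balance(features, labels, seed=0):
    """Randomly duplicate samples of small classes until all classes have equal size."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Draw extra indices with replacement to reach the size of the largest class.
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return features[keep], labels[keep]
```

Every original sample is retained; only duplicates of existing samples are added, so no new curve values are synthesized.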
Figure BDA0002176484720000151
For oil-water-gas layer identification in well Y44, Table 2 shows that, although the identification accuracy of the multi-scale identification method designed by the invention is not high for horizons of the oil-gas same-layer type, the method accurately identifies 19 of the 24 layers, an identification accuracy of 79.2%. The experiments show that, in oil-water-gas layer identification, removing the non-reservoir data first and then continuing identification within the reservoir gives a more accurate result.
The above content only illustrates the technical idea of the invention and does not limit its scope of protection; any modification made on the basis of this technical idea falls within the scope of protection of the claims of the invention.
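The accuracy quoted above follows directly from the layer counts given in the text:

```latex
\text{accuracy} = \frac{\text{correctly identified layers}}{\text{total layers}} = \frac{19}{24} \approx 0.792 = 79.2\%
```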

Claims (9)

1. A logging curve identification method based on a clustering algorithm is characterized by comprising the following steps:
1) analyzing the logging curves by adopting a principal component analysis method, and determining m principal component curves according to the contribution rate from high to low; wherein m is the number of logging attributes;
2) training on the first a principal component curves of the known wells with a KNN classification algorithm, taking the calibrated classification information as the data labels of the training data in machine learning, until the clustering centers are unchanged in two consecutive iterations, to form a training data set; then adding the first a principal component curves of the well to be identified to the training data set as instances to be identified, and training again to complete the classification.
2. The method for identifying the logging curve based on the clustering algorithm as claimed in claim 1, wherein the step 1) is preceded by outlier rejection and depth correction processing of the logging curve.
3. The method for identifying a logging curve based on a clustering algorithm according to claim 1, wherein the step 2) comprises the following operations:
201) identifying reservoirs and non-reservoirs;
202) carrying out oil-water-gas layer multi-scale classification identification in a reservoir;
wherein the multi-scale classification of the oil-water-gas layer comprises: the dry layer, the oil-poor layer, the water layer, the oil-gas layer, the oil-containing water layer, the gas-poor layer, the gas-water layer, the oil-water layer and the gas layer.
4. The method for identifying logging curves based on clustering algorithm as claimed in claim 3, wherein the first a principal component curves in step 201) are respectively acoustic wave, natural potential and natural gamma.
5. The method for identifying the logging curve based on the clustering algorithm according to claim 3 or 4, wherein the specific process of the step 201) is as follows:
calculating Euclidean distances from data points of the first a principal component curves of the well to be identified to reservoir and non-reservoir logging data points in the training data set;
selecting k data points closest to the logging data points to be identified;
counting the number of reservoirs and non-reservoirs in the k data points;
and taking the category with the highest occurrence frequency as the category of the logging data points to be identified.
6. The method of claim 5, wherein identifying the category of the log data point to be identified further comprises:
performing filtering processing to remove mispredicted points.
7. The method for identifying a logging curve based on a clustering algorithm as claimed in claim 3, wherein the first a principal component curves in step 202) are respectively acoustic, natural potential, neutron, density, natural gamma, deep lateral resistivity and shallow lateral resistivity.
8. The method for identifying a logging curve based on a clustering algorithm according to claim 3 or 7, wherein the specific process of step 202) is as follows:
adding data points of the first a principal component curves of the reservoir stratum of the well to be identified as an example to be identified into a training data set, and calculating the Euclidean distance from the data points of the example to be identified to the logging data points of the known category;
selecting k data points closest to the example data points to be identified;
counting the number of the occurrences of each prediction category in the k data points;
and the class with the highest class occurrence frequency in the k data points is used as the class of the example data point to be identified.
9. The method of claim 8, wherein identifying the category of the log data point to be identified further comprises:
performing filtering processing to remove mispredicted points.
CN201910780696.7A 2019-08-22 2019-08-22 Logging curve identification method based on clustering algorithm Active CN110674841B (en)

Publications (2)

Publication Number Publication Date
CN110674841A true CN110674841A (en) 2020-01-10
CN110674841B CN110674841B (en) 2022-03-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant