CN110674868A - Stratum lithology identification system and method based on high-dimensional drilling parameter information

Info

Publication number: CN110674868A
Application number: CN201910898862.3A
Authority: CN
Inventors: 张宁; 张幼振; 姚克; 邵俊杰; 孙道明; 李旺年; 钟自成
Original assignee: Xian Research Institute Co Ltd of CCTEG
Current assignee: Xian Research Institute Co Ltd of CCTEG
Priority date: 2019-09-23
Filing date: 2019-09-23
Publication date: 2020-01-10

Abstract

The invention provides a formation lithology identification system and method based on high-dimensional drilling parameter information. The system comprises four parts connected in sequence. A drilling test bench drills data boreholes and obtains high-dimensional drilling parameters, each group of which forms the corresponding training samples and prediction samples. A data dimensionality reduction system performs calculations on the high-dimensional drilling parameters in the training samples, determines the number of principal components, obtains the preliminary number of classes of the training samples and the principal component data set of the training samples, and then obtains the principal component data set of the prediction samples. A data clustering system performs fuzzy kernel clustering on the principal component data set of the training samples, obtains the optimal number of clusters K, and produces the classified training-sample principal component data set. A prediction and identification system establishes a discriminant criterion from the classified training-sample principal component data set, assigns the prediction samples to classes, and obtains the lithology class of each prediction sample.

Description

A formation lithology identification system and method based on high-dimensional drilling parameter information

Technical field

The invention belongs to the field of geotechnical engineering survey and relates to formation lithology identification systems, in particular to a formation lithology identification system and method based on high-dimensional drilling parameter information.

Background

China's territory is vast and its geological conditions are extremely complex. Rock properties and rock-layer combinations in the strata vary markedly in space, which strongly affects the soundness of engineering design and the safety of mining operations. At present, strata are usually identified by geological drilling and geophysical prospecting. Lithology identification suffers from low accuracy and coarse methods, so formation characteristics cannot be identified promptly and accurately, and field engineering practice cannot be guided effectively. Meanwhile, the real-time drilling parameter sets acquired by drilling equipment are large in volume and high in dimension. The lithological characteristics of a formation are usually related to several drilling parameters, and the information carried by the individual parameters overlaps and is in some cases redundant, so multi-parameter linear methods cannot be used for lithology identification, and no suitable data processing and analysis method is available. Researchers in China and abroad have studied drilling parameter data sets with various numerical simulations and experiments. However, judging by comparing one or several index parameters with their thresholds relies on little information, and the thresholds differ from mine to mine; when several indices approach their thresholds to different degrees, there is still no good way to reach a combined judgment. In addition, measured data demand a huge amount of computation and the parameters interfere with one another, so calculation and analysis cannot be carried out in real time and no reliable real-time prediction of the formation can be made.

Summary of the invention

In view of the deficiencies of the prior art, the purpose of the invention is to provide a formation lithology identification system and method based on high-dimensional drilling parameter information, so as to solve the technical problem of insufficient identification accuracy in the prior art.

To solve the above technical problem, the invention adopts the following technical solution:

A formation lithology identification system based on high-dimensional drilling parameter information comprises a drilling test bench, a data dimensionality reduction system, a data clustering system and a prediction and identification system connected in sequence.

The drilling test bench is used to drill data boreholes and obtain high-dimensional drilling parameters; each group of high-dimensional drilling parameters forms the corresponding training samples and prediction samples.

The data dimensionality reduction system is used to perform calculations on the high-dimensional drilling parameters in the training samples and determine the number of principal components, to obtain the preliminary number of classes of the training samples and the principal component data set of the training samples, and from these to obtain the principal component data set of the prediction samples.

The data clustering system is used to perform fuzzy kernel clustering on the principal component data set of the training samples, to obtain the optimal number of clusters K for that data set, and to obtain the classified training-sample principal component data set.

The prediction and identification system is used to establish a discriminant criterion from the classified training-sample principal component data set, assign the prediction samples to classes, and obtain the lithology class of each prediction sample.

The invention also has the following technical features:

The drilling test bench comprises a hydraulic pump station, an operating console, a flushing fluid circulation unit and a data acquisition unit, each connected to the main unit.

The data dimensionality reduction system comprises input terminal 1, data processor 1 and output terminal 1 connected in sequence.

The data clustering system comprises input terminal 2, data processor 2 and output terminal 2 connected in sequence.

The prediction and identification system comprises input terminal 3, data processor 3 and output terminal 3 connected in sequence.

The invention also protects a formation lithology identification method based on high-dimensional drilling parameter information, which uses the formation lithology identification system described above. The high-dimensional drilling parameters include rate of penetration, rotary torque, weight on bit, rotational speed, rotation pressure and mud pump pressure.

The method comprises the following steps:

Step 1: According to the production data of the target formation area, determine the rock-layer characteristics of the typical strata and the prediction range. Drill data boreholes with the main unit of the drilling test bench and obtain high-dimensional drilling parameters through its data acquisition unit; each group of high-dimensional drilling parameters forms the corresponding training samples and prediction samples.

Step 2: Use the data dimensionality reduction system to perform calculations on the high-dimensional drilling parameters in the training samples and obtain the correlation coefficients between the parameters. Then obtain the contribution rate of each preset principal component, sort the contribution rates in descending order, and compute the cumulative contribution rate of each preset principal component. When the cumulative contribution rate of a preset principal component exceeds 90%, the preset principal components up to and including that one are taken as the principal components, which fixes the number of principal components and the corresponding eigenvectors. With the contribution rate of each principal component as its weight, compute a weighted score for each training sample and sort the training samples by weighted score in descending order. Based on the weighted scores, preliminarily classify the training samples and obtain the preliminary number of classes. Use the eigenvectors of the principal components to obtain the principal component data set of the training samples and that of the prediction samples.

Step 3: Use the data clustering system to perform fuzzy kernel clustering on the principal component data set of the training samples obtained in step 2. Take the preliminary number of classes from step 2 as the initial number of clusters, set the fuzziness degree, construct the kernel function, establish the membership matrix, and iteratively optimize the parameters until the clustering of the training-sample principal component data set is complete. This yields the optimal number of clusters K, which is the number of lithology classes, together with the cluster center and the data subset of each lithology class, giving the classified training-sample principal component data set.

Step 4: Use the prediction and identification system to establish a discriminant criterion from the classified training-sample principal component data set obtained in step 3. Using the Mahalanobis distance criterion, compute the Mahalanobis distance between the principal component data of each prediction sample from step 2 and each class of the classified training-sample principal component data set, select the smallest distance, assign the prediction sample to the corresponding class, and thereby obtain the lithology class of the prediction sample.
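The minimum-Mahalanobis-distance rule of step 4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `mahalanobis_classify`, the class labels and the sample values are hypothetical, each class covariance is estimated from that class's own training data, and the pseudo-inverse is an assumption made so that a nearly singular covariance does not break the example.

```python
import numpy as np

def mahalanobis_classify(sample, class_sets):
    """Return (label, distance) of the class with the smallest Mahalanobis distance."""
    sample = np.asarray(sample, dtype=float)
    best_label, best_dist = None, np.inf
    for label, data in class_sets.items():
        data = np.asarray(data, dtype=float)
        center = data.mean(axis=0)              # class center
        cov = np.cov(data, rowvar=False)        # class covariance (sample estimate)
        diff = sample - center
        # pinv is an assumption: it tolerates a singular covariance matrix
        dist = float(np.sqrt(diff @ np.linalg.pinv(cov) @ diff))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist

# Hypothetical classified training data in a two-component principal space
classes = {
    "class_A": [[0, 0], [1, 0], [0, 1], [1, 1]],
    "class_B": [[5, 5], [6, 5], [5, 6], [6, 6]],
}
label_1, dist_1 = mahalanobis_classify([0.8, 0.4], classes)
label_2, dist_2 = mahalanobis_classify([5.2, 5.9], classes)
```

Using a per-class covariance means that a class with widely scattered training data attracts samples from further away than a tight class does, which is the usual reason for preferring this distance over the Euclidean one.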

Compared with the prior art, the invention has the following technical effects:

The invention offers high identification accuracy, short clustering time and fast data processing. It makes effective use of high-dimensional drilling parameter data sets, filters out redundant and erroneous parameter information, and can identify formation lithology and structure in real time. Predicting formation lithology with the proposed method is of practical significance, and it is important for achieving dynamic, intelligent detection of hidden disaster-causing factors in rock strata. The method not only provides accurate information for intelligent formation identification but also offers reference and guidance for predicting rock characteristics in other geotechnical construction such as shaft and roadway engineering and slope engineering, and it provides ideas and methods for handling high-dimensional data in engineering practice.

Description of drawings

FIG. 1 is a schematic structural diagram of the formation lithology identification system based on high-dimensional drilling parameter information of the invention.

FIG. 2 is a schematic structural diagram of the drilling test bench of the invention.

FIG. 3 is a schematic structural diagram of the data dimensionality reduction system of the invention.

FIG. 4 is a schematic structural diagram of the data clustering system of the invention.

FIG. 5 is a schematic structural diagram of the prediction and identification system of the invention.

FIG. 6 is a schematic diagram of the clustering result of the data clustering system in a practical application example of the invention.

The reference numerals in the figures denote: 1, drilling test bench; 2, data dimensionality reduction system; 3, data clustering system; 4, prediction and identification system; 5, hydraulic pump station; 6, operating console; 7, flushing fluid circulation system; 8, main unit; 9, data acquisition system; 10, input terminal 1; 11, data processing system 1; 12, output terminal 1; 13, input terminal 2; 14, data processing system 2; 15, output terminal 2; 16, input terminal 3; 17, data processing system 3; 18, output terminal 3.

The specific content of the invention is explained in further detail below with reference to the embodiments.

Detailed description

The purpose of the invention is to provide a formation lithology identification system and method based on high-dimensional drilling parameter information that can process high-dimensional drilling parameter information, reduce the mutual influence of the drilling parameters, and obtain accurate formation lithology characteristics in real time, providing an effective means for the scientific design and optimization of borehole construction parameters in underground coal mines. It not only provides theoretical support for formation lithology identification but also offers reference and guidance for processing high-dimensional parameter data sets in other geotechnical construction, and provides ideas and methods for fault diagnosis of drilling equipment.

The invention provides a formation lithology identification system based on high-dimensional drilling parameter information, comprising a drilling test bench, a data dimensionality reduction system, a data clustering system and a prediction and identification system connected in sequence.

The data dimensionality reduction system reduces the high-dimensional drilling parameter information set to new, mutually uncorrelated composite principal component parameters while preserving as much of the information in the original data set as possible, and removes the influence of the parameters' units. It performs calculations on the high-dimensional drilling parameters in the training samples and determines the number of principal components, obtains the preliminary number of classes of the training samples and the principal component data set of the training samples, and from these obtains the principal component data set of the prediction samples.

The prediction and identification system uses the Mahalanobis distance criterion to compute the Mahalanobis distance between each prediction sample and each class of the classified training-sample principal component data set, compares the distances, and identifies the lithology of the prediction sample.

The drilling test bench comprises a hydraulic pump station, an operating console, a flushing fluid circulation unit and a data acquisition unit, each connected to the main unit.

The main unit drives the drilling tool to drill the data boreholes.

The hydraulic pump station supplies motive power to the main unit.

The operating console is used to operate and control the main unit.

The data acquisition unit collects and transmits the high-dimensional drilling parameters.

The flushing fluid circulation unit cools the drill bit and discharges the cuttings.

The data dimensionality reduction system comprises input terminal 1, data processor 1 and output terminal 1 connected in sequence.

Input terminal 1 receives the training samples and the prediction samples.

Data processor 1 standardizes the high-dimensional drilling parameter data sets in the training samples and prediction samples, computes the correlation coefficients, contribution rates and cumulative contribution rates, and determines the number of principal components and the corresponding eigenvectors. It obtains the preliminary number of classes of the training samples and the principal component data set of the training samples, and from these the principal component data set of the prediction samples.

Output terminal 1 outputs the preliminary number of classes of the training samples, the principal component data set of the training samples and the principal component data set of the prediction samples.

The data clustering system comprises input terminal 2, data processor 2 and output terminal 2 connected in sequence.

Input terminal 2 receives the preliminary number of classes of the training samples and the principal component data set of the training samples from output terminal 1.

Data processor 2 takes the preliminary number of classes as the initial number of clusters, sets the fuzziness degree, constructs the kernel function, establishes the membership matrix, iteratively optimizes the parameters and computes the cluster centers, yielding the optimal number of clusters K for the training-sample principal component data set and the classified training-sample principal component data set.

Output terminal 2 outputs the classified training-sample principal component data set.

The prediction and identification system comprises input terminal 3, data processor 3 and output terminal 3 connected in sequence.

Input terminal 3 receives the principal component data set of the prediction samples from output terminal 1 and the classified training-sample principal component data set from output terminal 2.

Data processor 3 establishes the discriminant function and the discriminant criterion. Using the Mahalanobis distance criterion, it computes the Mahalanobis distance between the principal component data of each prediction sample and each class of the classified training-sample principal component data set, selects the smallest distance, assigns the prediction sample to the corresponding class, and obtains the lithology class of the prediction sample.

Output terminal 3 outputs the lithology identification result.

In the invention, the accuracy of the system is determined and the system is corrected as follows:

Step 1: Measure physical and mechanical parameters such as the hardness and strength of each prediction sample through field tests to obtain the material properties of the prediction samples on site, determine their actual lithology classes, and compare these with the predicted results to obtain the prediction accuracy.

Step 2: When the prediction accuracy is below 99%, drill further data boreholes with the drilling test bench to enlarge the training sample data set, repeat steps 1 to 3 of the identification method to obtain a new classified training-sample principal component data set, and identify the prediction samples again with step 4 of the method to obtain a new prediction accuracy. The prediction results are optimized by adding training samples in this way, and steps 1 to 4 of the method are repeated until the prediction accuracy is at least 99%.
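The accuracy check that drives this correction loop can be sketched as follows. The function names, labels and counts are hypothetical; only the 99% target comes from the text.

```python
def prediction_accuracy(predicted, actual):
    """Fraction of prediction samples whose predicted class matches the field result."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("predicted and actual must be non-empty and equally long")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

def needs_more_training(predicted, actual, target=0.99):
    """True when the accuracy is below the target, i.e. more data boreholes are needed."""
    return prediction_accuracy(predicted, actual) < target

# Hypothetical field verification of 100 prediction samples, 3 of them misclassified
actual = ["class_A"] * 100
predicted = ["class_A"] * 97 + ["class_B"] * 3
accuracy = prediction_accuracy(predicted, actual)
retrain = needs_more_training(predicted, actual)
```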

Step 3: Using the final optimized classified training-sample principal component data set, carry out drilling in real time, acquire the high-dimensional drilling parameter sets in real time as prediction samples, and obtain the real-time principal component data set of the prediction samples with the data dimensionality reduction system. Through the above steps, the lithology of the formation being drilled can be identified in real time and the operating condition of the drilling equipment can be obtained.

Specific embodiments of the invention are given below. The invention is not limited to these embodiments; all equivalent transformations made on the basis of the technical solution of this application fall within the protection scope of the invention.

Embodiment 1:

Following the above technical solution, as shown in FIG. 1 to FIG. 5, this embodiment provides a formation lithology identification system based on high-dimensional drilling parameter information, comprising a drilling test bench, a data dimensionality reduction system, a data clustering system and a prediction and identification system connected in sequence.

The drilling test bench is used to drill data boreholes and obtain high-dimensional drilling parameters; each group of high-dimensional drilling parameters forms the corresponding training samples and prediction samples.

The data dimensionality reduction system is used to perform calculations on the high-dimensional drilling parameters in the training samples and determine the number of principal components, to obtain the preliminary number of classes of the training samples and the principal component data set of the training samples, and from these to obtain the principal component data set of the prediction samples.

The data clustering system is used to perform fuzzy kernel clustering on the principal component data set of the training samples, to obtain the optimal number of clusters K for that data set, and to obtain the classified training-sample principal component data set.

The prediction and identification system is used to establish a discriminant criterion from the classified training-sample principal component data set, assign the prediction samples to classes, and obtain the lithology class of each prediction sample.

Specifically, the drilling test bench comprises a hydraulic pump station, an operating console, a flushing fluid circulation unit and a data acquisition unit, each connected to the main unit.

Specifically, the data dimensionality reduction system comprises input terminal 1, data processor 1 and output terminal 1 connected in sequence.

Specifically, the data clustering system comprises input terminal 2, data processor 2 and output terminal 2 connected in sequence.

Specifically, the prediction and identification system comprises input terminal 3, data processor 3 and output terminal 3 connected in sequence.

Embodiment 2:

Following the above technical solution, this embodiment provides a formation lithology identification method based on high-dimensional drilling parameter information, which uses the formation lithology identification system of Embodiment 1. The high-dimensional drilling parameters include rate of penetration, rotary torque, weight on bit, rotational speed, rotation pressure and mud pump pressure. The method comprises the following steps:

Step 2: Use the data dimensionality reduction system to perform calculations on the high-dimensional drilling parameters in the training samples and obtain the correlation coefficients between the parameters. Then obtain the contribution rate of each preset principal component, sort the contribution rates in descending order, and compute the cumulative contribution rate of each preset principal component. When the cumulative contribution rate of a preset principal component exceeds 90%, the preset principal components up to and including that one are taken as the principal components, which fixes the number of principal components and the corresponding eigenvectors. With the contribution rate of each principal component as its weight, compute a weighted score for each training sample and sort the training samples by weighted score in descending order. Based on the weighted scores, preliminarily classify the training samples and obtain the preliminary number of classes. Use the eigenvectors of the principal components to obtain the principal component data set of the training samples and that of the prediction samples.

Step 2 comprises the following sub-steps:

Step 2.1, data standardization of the high-dimensional drilling parameters:

So that the results are not affected by units, the high-dimensional drilling parameters are first standardized so that each index has a mean of 0 and a standard deviation of 1. The standardization is:

$$\tilde{\alpha}_{ij} = \frac{\alpha_{ij} - \mu_j}{s_j}$$

where

$$\mu_j = \frac{1}{n}\sum_{i=1}^{n}\alpha_{ij}, \qquad s_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\alpha_{ij} - \mu_j\right)^2}$$

Here $\alpha_{ij}$ is the index value of the j-th high-dimensional drilling parameter of the i-th evaluation object in the data set, $\tilde{\alpha}_{ij}$ is its standardized value, i is the serial number of the evaluation object, j is the serial number of the high-dimensional drilling parameter, n is the number of evaluation objects, $\mu_j$ is the mean of $\alpha_{ij}$ and $s_j$ is the standard deviation of $\alpha_{ij}$.
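As an illustration of step 2.1, the sketch below standardizes each column of a small data matrix. The readings are invented for the example, and the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

def standardize(samples):
    """Column-wise z-score: each parameter gets mean 0 and standard deviation 1."""
    samples = np.asarray(samples, dtype=float)
    mu = samples.mean(axis=0)
    s = samples.std(axis=0, ddof=1)   # ddof=1 (sample standard deviation) is an assumption
    return (samples - mu) / s, mu, s

# Hypothetical readings: rows are evaluation objects, columns are drilling parameters
raw = np.array([[1.2, 310.0, 8.0],
                [0.8, 420.0, 9.5],
                [1.5, 280.0, 7.2],
                [0.6, 510.0, 10.1]])
z, mu, s = standardize(raw)
```

The function also returns `mu` and `s` so that prediction samples acquired later can be standardized with the statistics of the training samples.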

Step 2.2, calculation of the correlation coefficients of the high-dimensional drilling parameters:

Compute the correlation coefficients between the high-dimensional drilling parameters and construct the correlation coefficient matrix. Let the correlation coefficient matrix be $R = (r_{wq})$, where

$$r_{wq} = \frac{1}{n-1}\sum_{i=1}^{n}\tilde{\alpha}_{iw}\,\tilde{\alpha}_{iq}$$

Here $r_{wq}$ is the correlation coefficient between the w-th and the q-th high-dimensional drilling parameters, w and q are serial numbers of the high-dimensional drilling parameters in the data set, $\tilde{\alpha}_{iw}$ is the standardized value of the w-th parameter for the i-th evaluation object, and n is the number of evaluation objects.
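A short sketch of step 2.2, assuming the data have already been standardized with the sample standard deviation; the readings are hypothetical. Under that assumption the product of the standardized matrix with its transpose, divided by n - 1, equals the Pearson correlation matrix.

```python
import numpy as np

def correlation_matrix(z):
    """Correlation matrix R from standardized data z (rows are evaluation objects)."""
    z = np.asarray(z, dtype=float)
    n = z.shape[0]
    return z.T @ z / (n - 1)

# Hypothetical readings, standardized column by column (ddof=1 is an assumption)
raw = np.array([[1.2, 310.0, 8.0],
                [0.8, 420.0, 9.5],
                [1.5, 280.0, 7.2],
                [0.6, 510.0, 10.1]])
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
r = correlation_matrix(z)
```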

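Step 2.3, construction of the preset principal components:

Compute the eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N \ge 0$ of the correlation coefficient matrix $R = (r_{wq})$ and the corresponding eigenvectors $\mu_1, \mu_2, \dots, \mu_N$, where $\mu_j = (\mu_{1j}, \mu_{2j}, \dots, \mu_{Nj})^T$, and form the linear combinations

$$\begin{aligned}
y_1 &= \mu_{11}\tilde{\alpha}_1 + \mu_{21}\tilde{\alpha}_2 + \dots + \mu_{N1}\tilde{\alpha}_N \\
y_2 &= \mu_{12}\tilde{\alpha}_1 + \mu_{22}\tilde{\alpha}_2 + \dots + \mu_{N2}\tilde{\alpha}_N \\
&\;\;\vdots \\
y_N &= \mu_{1N}\tilde{\alpha}_1 + \mu_{2N}\tilde{\alpha}_2 + \dots + \mu_{NN}\tilde{\alpha}_N
\end{aligned}$$

Here $\tilde{\alpha}_j$ is the standardized j-th high-dimensional drilling parameter, N is the number of high-dimensional drilling parameters, and $y_N$ is the index value of the N-th preset principal component.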
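Step 2.3 amounts to an eigendecomposition of the correlation matrix followed by a projection of the standardized data. The sketch below uses `numpy.linalg.eigh`, which returns eigenvalues in ascending order, so they are reordered to match the descending convention of the text. The random test data are purely illustrative.

```python
import numpy as np

def preset_components(z):
    """Eigenvalues (descending), eigenvectors (columns) and scores y = z @ eigenvectors."""
    z = np.asarray(z, dtype=float)
    r = z.T @ z / (z.shape[0] - 1)          # correlation matrix of standardized data
    eigvals, eigvecs = np.linalg.eigh(r)    # symmetric solver, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to lambda_1 >= ... >= lambda_N
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    y = z @ eigvecs                         # column j holds preset component y_j
    return eigvals, eigvecs, y

# Hypothetical data: 50 evaluation objects, 4 parameters, two of them correlated
rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 4))
raw[:, 1] += 0.8 * raw[:, 0]
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
eigvals, eigvecs, y = preset_components(z)
```

The resulting components are mutually uncorrelated and the variance of each equals its eigenvalue, which is why the eigenvalues can serve as a measure of each component's contribution.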

Step 2.4, select the principal components:

Calculate the contribution rate of each preset principal component, whose magnitude reflects the influence of that component, and calculate the cumulative contribution rate of the preset principal components:

$$b_j = \frac{\lambda_j}{\sum_{k=1}^{N}\lambda_k}, \qquad \alpha_j = \frac{\sum_{k=1}^{j}\lambda_k}{\sum_{k=1}^{N}\lambda_k}$$

where b_j denotes the contribution rate of the j-th preset principal component and α_j denotes the cumulative contribution rate of the first j preset principal components.

When the cumulative contribution rate α_j is close to 1, the preset principal components whose eigenvalues bring the cumulative contribution rate above 90% are extracted, and the first p (p ≤ N) index variables y₁, y₂, …, y_p are selected as the p principal components, where p is the serial number of the preset principal component at which the cumulative contribution rate exceeds 90%, that is, the number of principal components.
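The selection rule in steps 2.3 and 2.4 can be illustrated with the short sketch below. It is a sketch under stated assumptions rather than the patent's own code: the function name `select_principal_components` and its `threshold` parameter are hypothetical, and the contribution rate is taken to be each eigenvalue divided by the sum of all eigenvalues.

```python
import numpy as np

def select_principal_components(R, threshold=0.90):
    """Eigen-decompose a correlation matrix and pick the first p components
    whose cumulative contribution rate exceeds `threshold`."""
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort to descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    b = eigvals / eigvals.sum()                # contribution rates b_j
    alpha = np.cumsum(b)                       # cumulative contribution rates

    # Index of the first cumulative rate strictly greater than the threshold.
    first = int(np.searchsorted(alpha, threshold, side="right"))
    p = min(first + 1, len(alpha))
    return p, eigvals, eigvecs, b, alpha
```

The columns `eigvecs[:, :p]` are then the eigenvectors of the retained principal components, and projecting the standardized samples onto them gives the reduced data set.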

Step 2.5, obtain the weighted score of the training samples:

Let the weighted score be Z, then

$$Z = \sum_{j=1}^{p} b_j\,y_j$$

The training samples are sorted from high to low by weighted score. Based on these scores, the training samples are preliminarily classified and the number of preliminary classes, q, is obtained. The principal component data set of the training samples and the principal component data set of the prediction samples are then obtained from the eigenvectors of the principal components.
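As an illustration of step 2.5, the sketch below projects standardized samples onto the retained eigenvectors and ranks them by weighted score. The function name `weighted_scores` is hypothetical, the weights are assumed to be the contribution rates of the retained components, and the rule that turns the ranked scores into the preliminary class count q is not specified in the text, so it is left out.

```python
import numpy as np

def weighted_scores(x, eigvecs, b, p):
    """Project standardized samples onto the first p eigenvectors and
    compute a contribution-weighted score for each sample.

    x       : (n_samples, n_params) standardized data
    eigvecs : (n_params, n_params) eigenvectors as columns, descending order
    b       : contribution rates in the same order
    p       : number of retained principal components
    """
    y = x @ eigvecs[:, :p]          # principal component data set
    z = y @ np.asarray(b)[:p]       # weighted score Z for each sample
    order = np.argsort(-z)          # sample indices from highest to lowest Z
    return y, z, order
```

The same projection `x_pred @ eigvecs[:, :p]` applied to standardized prediction samples would give their principal component data set.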

Step three: the data clustering system performs fuzzy kernel clustering on the principal component data set of the training samples obtained in step two. The number of preliminary classes obtained in step two is used as the initial number of clusters; the fuzziness is set, a kernel function is constructed, and a membership matrix is established. The parameters are optimized by repeated iteration until the clustering of the principal component data set of the training samples is complete, yielding the optimal number of clusters K, which is the number of lithology classes. The cluster center and the corresponding data set of each lithology class are computed at the same time, giving the principal component data set of the training samples with determined classes.

Step three specifically includes the following steps:

Step 3.1, set the initial number of classes to q, set the fuzziness m and the objective function accuracy ε, choose suitable kernel function parameters, and construct the kernel function;

Step 3.2, establish the membership matrix and initialize it;

Step 3.3, calculate the cluster centers;

In the formula, x_i is the i-th sample in the original feature space, i = 1, 2, …, n; μ_iu is the membership degree of the i-th sample x_i to the u-th class, u = 1, …, q, with μ_iu ∈ [0, 1]; m is the fuzziness; v_a is the cluster center of the a-th class in the high-dimensional feature space, a = 1, 2, …, q; and the distance term is the distance of the i-th sample x_i in the high-dimensional feature space.

The objective function is minimized by setting its partial derivative with respect to the membership matrix to zero, which gives the membership degrees μ_rs for r = 1, 2, …, n and s = 1, 2, …, q, where μ_rs is the membership degree of the r-th sample x_r to the s-th class, and the distance term is the distance in the high-dimensional feature space between the r-th sample x_r and the cluster center of the j-th class.

Step 3.4, compare the membership matrices of successive iterations using a matrix norm; if they have converged, stop iterating, otherwise return to step 3.3.

This yields the optimal number of clusters K of the principal component data set of the training samples, which is the number of lithology classes. The cluster center and the corresponding data set of each lithology class are computed at the same time, giving the principal component data set of the training samples with determined classes.
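The iteration in steps 3.1 to 3.4 can be sketched as follows. This is not the patent's exact formulation, whose update equations are not reproduced in the text: it is a common kernel fuzzy c-means variant that uses a Gaussian kernel and keeps the cluster centers in the input (principal component) space. The names `gaussian_kernel`, `kernel_fcm`, `sigma`, `u0`, `max_iter` and `seed` are assumptions introduced for illustration, and the procedure for arriving at the optimal cluster number K is not covered.

```python
import numpy as np

def gaussian_kernel(x, v, sigma):
    """K(x_i, v_a) = exp(-||x_i - v_a||^2 / (2 sigma^2)), shape (n, q)."""
    d2 = ((x[:, None, :] - v[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_fcm(x, q, m=2.0, sigma=1.0, eps=1e-5, max_iter=200, u0=None, seed=0):
    """Kernel fuzzy c-means (Gaussian kernel, input-space centers)."""
    n = x.shape[0]
    if u0 is None:
        # Step 3.2: random membership matrix whose rows sum to 1.
        u = np.random.default_rng(seed).random((n, q))
        u /= u.sum(axis=1, keepdims=True)
    else:
        u = np.asarray(u0, dtype=float)

    # Starting centers: membership-weighted means of the samples.
    w = u ** m
    v = (w.T @ x) / w.sum(axis=0)[:, None]

    for _ in range(max_iter):
        # Step 3.3: update centers with kernel-weighted memberships.
        wk = (u ** m) * gaussian_kernel(x, v, sigma)
        v = (wk.T @ x) / wk.sum(axis=0)[:, None]

        # Membership update from the kernel-induced distance 1 - K.
        dist = np.maximum(1.0 - gaussian_kernel(x, v, sigma), 1e-12)
        inv = dist ** (-1.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)

        # Step 3.4: stop when the membership matrix stops changing
        # (Frobenius norm of the difference below eps).
        converged = np.linalg.norm(u_new - u) < eps
        u = u_new
        if converged:
            break
    return u, v
```

Each sample can then be assigned to the class with the largest membership, `u.argmax(axis=1)`, and the rows of `v` play the role of the cluster centers.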

The Mahalanobis distance discrimination method described in step four includes the following steps:

Each principal component data set of the prediction samples is evaluated pairwise against the principal component data sets of the training samples with determined classes. The procedure is described below using two such training data sets, A and B, as an example.

Step 4.1, calculate the mean vectors and covariance matrices of classes A and B;

ma = mean(A), mb = mean(B), S₁ = cov(A), S₂ = cov(B)

Step 4.2, calculate the overall covariance matrix S, where n₁ and n₂ are the sample sizes of classes A and B respectively, ma is the mean vector of A, mb is the mean vector of B, S₁ is the covariance matrix of A, S₂ is the covariance matrix of B, and S is the overall covariance matrix.

Step 4.3, calculate the difference d between the squared Mahalanobis distances from the principal component data x of the prediction sample to classes A and B;

d = (x − ma) S^{-1} (x − ma)^T − (x − mb) S^{-1} (x − mb)^T

Step 4.4, if d < 0, x belongs to class A; if d > 0, x belongs to class B.
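Steps 4.1 to 4.4 can be sketched as below. The formula for the overall covariance matrix is not reproduced in the text, so this sketch assumes the standard pooled estimate ((n₁ − 1)S₁ + (n₂ − 1)S₂) / (n₁ + n₂ − 2); the function names are hypothetical, and the tie case d = 0, which the text does not address, is arbitrarily assigned to class B here.

```python
import numpy as np

def pooled_covariance(A, B):
    """Pooled covariance of two classes (assumed form, see lead-in)."""
    n1, n2 = len(A), len(B)
    S1 = np.cov(A, rowvar=False)
    S2 = np.cov(B, rowvar=False)
    return ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

def mahalanobis_difference(x, A, B):
    """d = squared Mahalanobis distance to A minus that to B."""
    ma, mb = A.mean(axis=0), B.mean(axis=0)
    S_inv = np.linalg.inv(pooled_covariance(A, B))
    da = (x - ma) @ S_inv @ (x - ma)
    db = (x - mb) @ S_inv @ (x - mb)
    return da - db

def classify(x, A, B):
    """Return "A" if d < 0, otherwise "B" (d == 0 falls to B by assumption)."""
    return "A" if mahalanobis_difference(x, A, B) < 0 else "B"
```

With more than two lithology classes, the same comparison can be repeated pairwise, or equivalently the class with the smallest Mahalanobis distance can be chosen, as the claims describe.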

Application example:

A field test in a coal mine roadway is taken as an example for analysis. A total of 2900 groups of high-dimensional drilling parameters were measured with the drilling test bench, forming 2900 training samples and 50 test samples.

First, the high-dimensional drilling parameter data of the training samples are standardized. To eliminate the influence of the different dimensions (units) of the parameters, the measured high-dimensional drilling parameters are standardized, and the standardized variables for rate of penetration, rotary torque, drilling pressure, rotational speed, rotary pressure, and mud pump pressure are defined as x₁, x₂, x₃, x₄, x₅, x₆ respectively. The preset principal components are defined as y₁, y₂, y₃, y₄, y₅, y₆. The correlation coefficient matrix can then be calculated; the results are shown in Table 1.

Table 1. Correlation coefficients of the training samples

The data dimensionality reduction system then calculates the contribution rate and cumulative contribution rate of each preset principal component, as shown in Table 2.

Table 2. Results for the preset principal components

Following the empirical rule that the cumulative contribution rate should exceed 90%, the number of principal components selected is 3. The eigenvectors of the principal components of the training samples are shown in Table 3.

Table 3. Eigenvectors of the principal components of the training samples

The first principal component mainly reflects x₂ (rotary torque), the second mainly reflects x₁ (rate of penetration), and the third mainly reflects x₅ (rotary pressure). Using Table 3, the training samples can be reduced in dimension to obtain the principal component data set of the training samples. The weighted scores give 3 as the number of preliminary classes of the training samples, and the principal component data set of the prediction samples is then obtained.

By selecting a suitable kernel function and clustering the principal component data set of the training samples with the data clustering system, three principal component data sets of training samples with determined classes are obtained. The cluster centers are v₁ = (0.1262, −1.0885, −0.7335)^T, v₂ = (1.6849, −0.7538, −1.2834)^T and v₃ = (0.4876, −0.8157, 0.7405)^T, corresponding to the sandstone layer, the coal seam and the mudstone layer respectively.

The clustering result is shown in Figure 6. The Mahalanobis distance discrimination method is then used to calculate the Mahalanobis distance between each principal component data set of the prediction samples and the three principal component data sets of the training samples with determined classes. The prediction samples are classified accordingly to obtain the lithology class of each prediction sample, which is compared with its actual lithology class. The prediction is optimized repeatedly until the prediction accuracy is greater than or equal to 99%. The prediction results are shown in Table 4.

Table 4. Prediction results

In this embodiment, although the above methods are illustrated and described as a series of acts for simplicity of explanation, it should be understood and appreciated that the methods are not limited by the order of the acts, because, in accordance with one or more embodiments, some acts may occur in a different order and/or concurrently with other acts that are illustrated and described herein, or that are not illustrated and described herein but would be understood by those skilled in the art.

Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the specific application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

It is noted that references in the specification to "one embodiment", "an embodiment", "an example embodiment", "some embodiments" and the like indicate that the described embodiment may include a particular feature, structure, or characteristic, but not every embodiment necessarily includes that feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Furthermore, when a particular feature, structure, or characteristic is described in connection with an embodiment, implementing that feature, structure, or characteristic in connection with other embodiments, whether or not explicitly described, is within the knowledge of those skilled in the art.

The previous description of the present disclosure is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to the present disclosure will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other variations without departing from the spirit or scope of the present disclosure. Thus, the present disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A stratum lithology recognition system based on high-dimensional drilling parameter information is characterized by comprising a drilling test bed, a data dimension reduction system, a data clustering system and a prediction recognition system which are sequentially connected;

the drilling test bed is used for constructing data holes and obtaining high-dimensional drilling parameters, and each group of high-dimensional drilling parameters respectively form a corresponding training sample and a corresponding prediction sample;

the data dimension reduction system is used for calculating the high-dimensional drilling parameters in the training samples and determining the number of principal components; obtaining the classification number of the preliminary training samples and the principal component data set of the training samples; and further obtaining the principal component data set of the prediction samples;

the data clustering system is used for carrying out fuzzy kernel clustering on the principal component data set of the training samples to obtain the optimal number K of clusters of the principal component data set of the training samples, and obtaining the principal component data set of the training samples with determined classifications;

the prediction identification system is used for establishing a discrimination criterion for the principal component data set of the training samples determined to be classified, classifying the prediction samples and obtaining the lithology categories to which the prediction samples belong.

2. The formation lithology recognition system based on high-dimensional drilling parameter information as claimed in claim 1, wherein the drilling test bench comprises a hydraulic pump station, an operation bench, a flushing liquid circulation unit and a data acquisition unit which are respectively connected with a host machine.

3. The formation lithology recognition system based on high-dimensional drilling parameter information as claimed in claim 1, wherein the data dimension reduction system comprises a first input end, a first data processor and a first output end which are connected in sequence.

4. The system for identifying the lithology of the stratum based on the high-dimensional drilling parameter information as claimed in claim 1, wherein the data clustering system comprises a second input end, a second data processor and a second output end which are connected in sequence.

5. The formation lithology recognition system of claim 1, wherein the predictive recognition system comprises a third input terminal, a third data processor and a third output terminal connected in series.

6. A stratum lithology recognition method based on high-dimensional drilling parameter information, characterized in that the method adopts the stratum lithology recognition system based on the high-dimensional drilling parameter information; the high-dimensional drilling parameters comprise the rate of penetration, the rotary torque, the drilling pressure, the rotational speed, the rotary pressure and the mud pump pressure.

7. The method for identifying the lithology of the stratum based on the high-dimensional drilling parameter information as claimed in claim 6, wherein the method comprises the following steps:

step one, determining the rock stratum characteristics and the prediction range of a typical stratum according to the production information of a target stratum region, constructing a data hole by using the host of the drilling test bed, and obtaining high-dimensional drilling parameters by using the data acquisition unit of the drilling test bed, wherein each group of high-dimensional drilling parameters respectively forms a corresponding training sample and prediction sample;

step two, calculating the high-dimensional drilling parameters in the training samples by using the data dimension reduction system to obtain the correlation coefficients among the high-dimensional drilling parameters, then obtaining the contribution rate of each preset principal component, sorting them from large to small, and obtaining the cumulative contribution rate of each preset principal component; when the cumulative contribution rate of a certain preset principal component is greater than 90%, all preset principal components before the preset principal component are principal components, and finally the number of the principal components and the corresponding eigenvectors are determined; performing a weighted calculation with the contribution rate of each principal component as the weight to obtain the weighted score of each training sample, sorting the training samples from high to low according to the weighted score of each training sample, preliminarily classifying the training samples according to their weighted scores to obtain the classification number of the preliminary training samples, and obtaining the principal component data set of the training samples and the principal component data set of the prediction samples according to the eigenvectors of the principal components;

step three, carrying out fuzzy kernel clustering on the principal component data set of the training samples obtained in step two by using the data clustering system, taking the classification number of the preliminary training samples obtained in step two as the original number of clusters, setting the fuzziness, constructing a kernel function, establishing a membership matrix, and finally completing the clustering of the principal component data set of the training samples by continuously and iteratively optimizing the parameters, so as to obtain the optimal number K of clusters of the principal component data set of the training samples, the optimal number K being the number of lithology classifications, while calculating the cluster center of each lithology classification and the corresponding data set of each lithology classification to obtain the principal component data set of the training samples with determined classifications;

step four, establishing a discrimination criterion for the principal component data set of the training samples with determined classifications obtained in step three by using the prediction recognition system, calculating, by the Mahalanobis distance discrimination method, the Mahalanobis distance between each principal component data set of the prediction samples obtained in step two and the principal component data sets of the training samples with determined classifications, selecting the smallest Mahalanobis distance, and classifying the prediction samples to obtain the lithology categories to which the prediction samples belong.