CN113076700A - SVM-LDA rock burst machine learning prediction model method based on data analysis principle - Google Patents

SVM-LDA rock burst machine learning prediction model method based on data analysis principle

Info

Publication number
CN113076700A
Authority
CN
China
Prior art keywords
rock burst
prediction
model
grade
rock
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110458500.XA
Other languages
Chinese (zh)
Inventor
李克钢
李明亮
秦庆词
娄颖豪
徐港
岳睿
刘博�
李博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology filed Critical Kunming University of Science and Technology
Priority to CN202110458500.XA priority Critical patent/CN113076700A/en
Publication of CN113076700A publication Critical patent/CN113076700A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/10Machine learning using kernel methods, e.g. support vector machines [SVM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an SVM-LDA rock burst machine learning prediction model method based on data analysis principles, and relates to the technical field of geotechnical engineering and underground excavation engineering. The method comprises: collecting multiple groups of rock burst case engineering data from China and abroad; calculating the correlation coefficients of the rock burst prediction indices using the correlation coefficient principle; applying extreme value processing to the original rock burst case engineering data, followed by standardization; introducing the t-distributed stochastic neighbor embedding (T-SNE) method to reduce the data dimensionality for visualization; and determining a rock burst training set and a rock burst intensity grade prediction set by a random cross-validation method, and establishing 6 rock burst intensity grade prediction models. The invention evaluates the prediction effect of the models for each rock burst grade; rather than being limited to a single model, it finds the models that predict one or several rock burst grades best and combines them to predict the rock burst grade. The method is of significance for research on rock burst intensity grade prediction in mines, tunnels, hydropower stations and the like.

Description

SVM-LDA rock burst machine learning prediction model method based on data analysis principle
Technical Field
The invention relates to a SVM-LDA rock burst machine learning prediction model method based on a data analysis principle, and belongs to the technical field of deep geotechnical engineering and underground excavation engineering.
Background
As underground works such as mines, tunnels and hydropower stations extend ever deeper, rock burst, a highly harmful engineering geological disaster, occurs more and more often. Rock burst typically manifests as the sudden ejection and spalling of brittle rock fragments and rib spalling of the surrounding rock. It is random, sudden, uncertain and highly dangerous, and research on rock burst disasters has become one of the important scientific problems urgently awaiting solution in the field of rock mechanics in China. How to predict the occurrence of rock burst accurately has therefore been a continuing effort of many scholars.
Rock burst prediction is a core link between the study of the rock burst mechanism and rock burst prevention and control. A reasonable, effective and accurate prediction method makes it possible to control and avoid rock bursts effectively. Scholars and experts in China and abroad roughly divide rock burst intensity grade prediction methods into single-factor methods and comprehensive multi-factor methods. Single-factor methods predict the rock burst intensity grade with an established rock burst criterion, such as the Russenes criterion, the Turchaninov criterion, the Erlangshan criterion, the Tao Zhenyu criterion, the Hoek criterion and the N-Jhelum criterion. Because many factors induce rock burst and its mechanism is complex, the accuracy and confidence of single-factor discrimination are clearly low, so a large number of scholars have turned to multi-factor methods, which predict the rock burst intensity grade using mathematical methods and intelligent algorithms. These methods are widely applied and have improved the accuracy of rock burst intensity grade prediction, but problems such as low accuracy and poor universality remain.
Disclosure of Invention
Aiming at the defects of the existing rock burst intensity grade prediction method, the invention provides a SVM-LDA rock burst machine learning prediction model method based on a data analysis principle, and aims to predict the rock burst intensity grade classification condition in geotechnical engineering and underground excavation engineering more accurately and with better universality.
The technical scheme adopted by the invention is as follows: a SVM-LDA rock burst machine learning prediction model method based on a data analysis principle comprises the following steps:
step one: constructing a rock burst prediction sample library;
step two: analyzing rock burst case engineering data;
step three: determining the grading condition of the rockburst intensity grade;
step four: optimizing model parameters;
step five: pre-processing rock burst prediction sample data;
step six: establishing a rock burst intensity grade prediction model;
step seven: modeling analysis.
Specifically, in step one, the rock burst prediction sample library is constructed by collecting rock burst prediction literature from China and abroad and selecting multiple groups of mutually independent rock burst case engineering data, with the rock mass stress coefficient σθ/σc, the rock brittleness coefficient σc/σt and the elastic deformation energy coefficient Wet selected as the rock burst prediction indices.
Specifically, the rock burst case engineering data are analyzed in step two as follows: each group of rock burst case engineering data from step one comprises 4 variables, namely the rock mass stress coefficient, the rock brittleness coefficient, the elastic deformation energy coefficient and the actual rock burst grade. The 3 features (rock mass stress coefficient, rock brittleness coefficient and elastic deformation energy coefficient) of each group are reduced to two dimensions by the T-SNE method, and the samples of different actual categories are inspected for obvious boundaries.
Specifically, in step three, the rock burst intensity is divided into 4 grades, i.e. there are 4 classes of prediction result: no rock burst (I), slight rock burst (II), medium rock burst (III) and strong rock burst (IV).
Specifically, step four is as follows: a KNN model, an NB model, a DT model, an RF model, an LDA model and an SVM model are constructed using machine learning algorithm packages and Python tools, and the parameters of the 6 rock burst prediction models are each optimized by grid search.
Specifically, the rock burst prediction sample data are preprocessed in step five as follows: extreme value processing is first applied to the original rock burst case engineering data, followed by standardization, to eliminate the influence of dimension.
Specifically, the process of establishing the rock burst intensity level prediction model in the sixth step is as follows:
(1) using five-fold cross-validation, the rock burst case engineering data samples preprocessed in step five are split so that 80% of the samples form the training set and 20% form the test set;
(2) using the 6 sets of optimized model parameters together with the machine learning algorithm packages, the Python tools are run to obtain the per-grade prediction accuracy of each of the 6 rock burst prediction models, and thereby to judge how accurately each model predicts rock burst intensity grades I to IV;
the above two steps are performed simultaneously.
Specifically, the modeling analysis in step seven is as follows: based on the per-grade prediction accuracies of the 6 rock burst prediction models obtained in step six, the bias of the predictions of the 6 machine learning algorithms relative to the true results is analyzed, i.e. whether the predictions err toward danger (a higher grade) or toward safety (a lower grade).
The invention has the beneficial effects that:
1. the invention provides a SVM-LDA rock burst machine learning prediction model method based on a data analysis principle, which selects a plurality of groups of typical rock burst case engineering data, and establishes 6 rock burst intensity grade prediction models based on 6 machine learning algorithms and a random cross validation method. The selected machine learning algorithm has the advantages of simple logic, easy realization, strong model generalization capability, high training speed, suitability for small samples and the like, and the method for optimizing the model parameters by adopting the grid search mode has certain universality. Determining that no strong correlation exists between variables by means of a correlation coefficient principle, and simultaneously carrying out extreme value processing on original rock burst case engineering data and then carrying out standardized processing to eliminate the influence of dimension;
2. the invention discusses the prediction effect of the model based on each rock burst grade, is not limited to a certain model, but finds the model with better prediction effect on a certain or certain rock burst grades, combines the models, predicts the rock burst grades and provides better guiding significance for the rock burst prediction problem of geotechnical engineering;
3. the invention uses the T-SNE dimension reduction method to visualize the selected rock burst case engineering data in reduced dimensions and to judge whether the samples of each rock burst intensity grade cluster, which provides a theoretical basis for researchers who study this topic with machine learning algorithms in the future.
Drawings
FIG. 1 is a diagram showing a stress coefficient distribution of a rock mass;
FIG. 2 is a graph of a rock brittleness coefficient distribution;
FIG. 3 is a graph of the elastic deformation energy coefficient distribution;
FIG. 4 is a diagram of an actual grade profile of a rock burst;
FIG. 5 is a three-dimensional distribution diagram of actual levels of a rock burst;
FIG. 6 is a sample distribution diagram after dimensionality reduction;
FIG. 7 is a flow chart of a machine learning algorithm model;
FIG. 8 is a graph of LDA and SVM cross-validation accuracy.
Detailed Description
For better clarity of the computing principles, the operation processes and the method advantages of the embodiments of the present invention, the following detailed descriptions of the technical solutions of the present invention are provided with reference to the accompanying drawings.
Example 1: as shown in FIGS. 1 to 8, an SVM-LDA rock burst machine learning prediction model method based on data analysis principles includes the following steps:
Step one: constructing a rock burst prediction sample library.
The invention relates to a method for predicting rock burst degree based on PCA-N, by collecting rock burst prediction related documents (Zhou J, Li X B, Shi XZ. Long-term prediction model of rock burst in underlying concrete using hardware and supporting vector models [ J ]. Safety Science,2012,50(4): 629. J., "DONG L J., XING L I., PENG K. morning prediction of rock burst specification [ J ]. Transactions of Nonrorus Metals Society of China,2013,23(2):472,. Wushun, Zhang bridge, and coal bridge prediction model [ J ]. 19, K. J. (J.),. Safety prediction of rock burst in nuclear coal mine) of rock burst classification [ 19, K. prediction of rock burst classification of rock burst in nuclear coal mine, K.,. 76, K. prediction of rock burst classification [ J.,. 9. sub.K.,. K.,76, K. 5. sub., 2016,26(7): 1995-.
TABLE 1 rock burst case engineering data at home and abroad
(Table 1 is presented as images in the original publication.)
Step two: analyzing the rock burst case engineering data.
(1) Normalization and correlation coefficients
In a multi-index evaluation system, the evaluation indices differ in nature, dimension and order of magnitude. When the indices differ greatly in level, analyzing the raw values exaggerates the role of indices with large values and weakens the role of indices with small values. To ensure reliable results, the raw index data therefore need to be normalized. The invention uses a typical normalization method that maps the data uniformly to the interval [0, 1], with the following formula.
$$x_i^{std} = \frac{x_i - \min(x)}{\max(x) - \min(x)} \tag{1}$$
In the formula: $x_i^{std}$ is the normalized value, min(x) is the minimum value of the variable, max(x) is the maximum value of the variable, and $x_i$ is the actual variable value.
The correlation coefficient was first proposed by Karl Pearson. It is a statistical index reflecting how closely variables are related in a non-deterministic relationship, and it is defined by the formula below. The correlation coefficient R(X, Y) lies in the range [-1, 1]. When R(X, Y) = 0, X and Y are said to be uncorrelated; when |R(X, Y)| = 1, X and Y are said to be completely correlated, and a linear functional relationship exists between them; when |R(X, Y)| < 1, a change in X is accompanied by a partial change in Y, and the larger the absolute value of R(X, Y), the stronger the linear association. |R(X, Y)| > 0.8 is called high correlation, |R(X, Y)| < 0.3 is called low correlation, and anything in between is medium correlation.
$$R(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}} \tag{2}$$
In the formula: cov (X, Y) is the covariance of X and Y, Var [ X ] is the variance of X, Var [ Y ] is the variance of Y, and R (X, Y) is the correlation coefficient of X and Y.
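Formula (2) and the correlation thresholds above can be illustrated with a short Python sketch. The function names `pearson_r` and `correlation_level` are illustrative assumptions and are not taken from the patent; only the standard library is used.

```python
import math

def pearson_r(x, y):
    """Pearson correlation R(X, Y) = Cov(X, Y) / sqrt(Var[X] * Var[Y])."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of covariance and variances cancel, so raw sums suffice.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

def correlation_level(r):
    """Label |R| with the thresholds quoted in the text (0.3 and 0.8)."""
    a = abs(r)
    if a > 0.8:
        return "high"
    if a < 0.3:
        return "low"
    return "medium"

print(pearson_r([1, 2, 3], [2, 4, 6]))   # perfectly linear toy data
print(correlation_level(-0.14))          # a value of the size reported in Table 3
```

Applied to each pair of the three prediction indices, a function of this kind would yield a symmetric matrix with ones on the diagonal.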
(2) T-SNE principle
T-SNE (t-distributed stochastic neighbor embedding) was originally proposed by Laurens van der Maaten and Geoffrey Hinton. It is a commonly used nonlinear dimension reduction algorithm and is generally applied to dimension reduction in manifold learning.
The rock burst case engineering data collected by the invention comprise 145 groups in total, and the samples are mutually independent. There are 4 variables in total: the rock mass stress coefficient (σθ/σc), the rock brittleness coefficient (σc/σt), the elastic deformation energy coefficient (Wet) and the actual rock burst grade (I-IV). The first three are the independent variables of the model, and the actual grade is the dependent variable. To achieve higher accuracy of the rock burst prediction model, the rock burst case engineering data (Table 1) are preprocessed based on formula (1) and formula (2); a basic description of the samples is given in Table 2.
TABLE 2 basic information of variables
Index                σθ/σc    σc/σt    Wet
Sample size          145      145      145
Mean                 0.46     22.36    4.49
Standard deviation   0.21     12.68    2.09
Minimum              0.05     4.48     0.90
25% quantile         0.35     13.98    3.00
50% quantile         0.45     20.40    4.30
75% quantile         0.60     28.43    5.76
Maximum              1.41     80.00    10.90
As can be seen from Table 2, the rock mass stress coefficient has a maximum of 1.41, a median of 0.45 and a minimum of 0.05, showing an obvious right-skewed distribution; the rock brittleness coefficient has a maximum of 80.00, a median of 20.40 and a minimum of 4.48, also showing an obvious right-skewed distribution; the elastic deformation energy coefficient has a maximum of 10.90, a median of 4.30 and a minimum of 0.90, and is also somewhat right-skewed. The table gives only summary values, from which the sample distribution is hard to see directly, so distribution curves of the three variables are drawn to present the distribution characteristics visually, as shown in FIGS. 1-3.
Rock burst prediction results are visually expressed by rock burst grades, and in the existing rock burst prediction evaluation system, the rock burst intensity grades are generally divided into 4 grades such as no rock burst (I), slight rock burst (II), medium rock burst (III) and strong rock burst (IV). In the 145 cases collected this time, the actual grade distribution of the rock burst is shown in fig. 4.
As can be seen from FIG. 4, grade III has the most samples, 57, and grade I has the fewest, 27, so the samples are somewhat imbalanced. However, the ratio of the largest class to the smallest is only slightly greater than 2, so the imbalance is not severe. For a model, strong correlation among the independent variables harms prediction accuracy and stability, so it is important to analyze the correlation among the independent variables.
Using formula (2), the correlation matrix of the rock mass stress coefficient σθ/σc, the rock brittleness coefficient σc/σt and the elastic deformation energy coefficient Wet is shown in Table 3. As can be seen from Table 3, the correlations among the three variables are very small (the largest absolute value is 0.14), so the variables can be regarded as having no significant linear correlation.
TABLE 3 independent variable correlation matrix
Index     σθ/σc    σc/σt    Wet
σθ/σc     1.00     -0.06    0.03
σc/σt     -0.06    1.00     -0.14
Wet       0.03     -0.14    1.00
Since there are only 3 independent variables, i.e. the modeled samples have only three features, the relationship between the independent variables and the dependent variable is first shown in three-dimensional space, see FIG. 5. In three-dimensional space the differences between categories cannot be seen intuitively. The features are therefore reduced to two dimensions to show the sample distribution of each category more clearly.
FIG. 6 shows the relationship between each category and the two features after T-SNE dimension reduction (VAR1 and VAR2 are the two variables after dimension reduction). The reduced data show fairly obvious boundaries between samples of different actual categories. As can be seen from FIG. 6, most of the square points are gathered together, and there are distinct boundaries between the square points and the other points, so it is anticipated that a machine learning algorithm can find these boundaries to distinguish the different rock burst intensity grades. The invention therefore adopts machine learning methods to model and predict the actual categories.
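The reduction of three features to two dimensions can be sketched as follows. This is a minimal illustration that assumes scikit-learn's `TSNE` class; the patent does not name the package it used, and the random matrix `X` is a synthetic stand-in, not the patent's 145 cases. The perplexity value is likewise an arbitrary choice.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Synthetic placeholder for the three normalized indices (NOT the patent's data).
X = rng.random((60, 3))

# Reduce 3 features to 2; perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=15, init="pca", random_state=0)
embedding = tsne.fit_transform(X)  # the two columns play the role of VAR1 and VAR2

print(embedding.shape)
```

Plotting the two columns of `embedding` with one marker per actual grade would give a picture of the kind described for FIG. 6. Because T-SNE is stochastic, a fixed `random_state` is needed for repeatable layouts.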
Step three: determining the rock burst intensity grades.
The rock burst intensity is divided into 4 grades, i.e. there are 4 classes of prediction result, in order: no rock burst (I), slight rock burst (II), medium rock burst (III) and strong rock burst (IV).
Step four: optimizing the model parameters.
The invention determines the model parameters by grid search. Grid search is a common method for determining parameters. Its basic idea is to traverse all candidate values of each parameter within a valid space, form every combination of the values of the different parameters, and finally select the one or more parameter combinations that give the best model performance. One advantage of grid search is that, within the searched grid, the best parameter combination is found reliably. Its disadvantages are also evident. First, grid search is a brute-force traversal, so model training takes a long time when the parameter space is large. Second, the parameter ranges must be set empirically, so the parameters found are not necessarily globally optimal and may be only locally optimal. Note: the invention evaluates the prediction effect of the models for each rock burst grade; rather than being limited to a single model, it finds the models that predict one or several rock burst grades best and combines them to predict the rock burst grade. Optimizing the model parameters by grid search therefore has a certain universality.
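The grid search idea can be sketched with scikit-learn's `GridSearchCV`, shown here for an SVM. This is an assumption-laden example: the patent does not disclose its parameter grids or the package it used, so the grid below is hypothetical, and the data are synthetic rather than the patent's samples.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in: 3 features in [0, 1] and 4 classes (NOT the patent's data).
X = rng.random((120, 3))
y = np.digitize(X[:, 0], [0.25, 0.5, 0.75])  # labels 0..3 derived from feature 0

# Hypothetical grid: every combination (3 x 2 x 2 = 12) is tried.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", 0.1],
}
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

The same pattern would apply to the other five models by swapping the estimator and its grid. The cost grows multiplicatively with each added parameter, which is the training-time drawback noted above.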
Step five: preprocessing the rock burst case engineering data.
The data preprocessing stage comprises extreme value processing and data standardization. According to the univariate analysis, the rock mass stress coefficient and the rock brittleness coefficient both show strongly right-skewed distributions, i.e. the samples contain a small number of extreme values. Rock mass stress coefficient values greater than 0.8 are therefore uniformly set to 0.8, and rock brittleness coefficient values greater than 45 are uniformly set to 45. The data after extreme value processing are then standardized.
After processing with formula (1), the data have a maximum of 1 and a minimum of 0, and the shape of the sample distribution is unchanged by the normalization. Part of the normalized samples are shown in Table 4.
TABLE 4 normalized samples
Serial number   σθ/σc   σc/σt   Wet    Grade
1               1.00    0.60    0.23
2               1.00    0.60    0.23
3               0.92    0.60    0.23
4               1.00    0.36    0.22
5               0.67    0.25    0.55
...
141             0.44    0.32    0.81
142             0.12    0.24    0.04
143             0.47    0.25    0.62
144             0.00    0.22    0.60
145             0.23    0.22    0.82
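The two preprocessing operations of step five, capping extreme values and then applying formula (1), can be sketched in plain Python. The helper names `clip_upper` and `min_max` are illustrative assumptions; the short input lists reuse the summary statistics of Table 2 as a toy column and are not actual samples.

```python
def clip_upper(values, cap):
    """Extreme value processing: assign `cap` to every value above it."""
    return [min(v, cap) for v in values]

def min_max(values):
    """Map values to [0, 1] with (x - min) / (max - min), i.e. formula (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a constant column")
    return [(v - lo) / (hi - lo) for v in values]

# Toy columns built from the Table 2 summary statistics (not real samples).
stress = [0.05, 0.35, 0.45, 0.60, 1.41]
brittleness = [4.48, 13.98, 20.40, 28.43, 80.00]

# Caps of 0.8 and 45 are the thresholds stated in step five.
stress_scaled = min_max(clip_upper(stress, 0.8))
brittleness_scaled = min_max(clip_upper(brittleness, 45))

print([round(v, 2) for v in stress_scaled])
print([round(v, 2) for v in brittleness_scaled])
```

Capping before scaling matters: without the cap, a single value such as 1.41 would compress the bulk of the stress coefficients into the lower part of the [0, 1] interval.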
Step six: establishing the rock burst intensity grade prediction models.
To address rock burst intensity grade prediction, the invention implements existing machine learning algorithms in Python. The algorithms selected are K nearest neighbor (KNN), naive Bayes (NB), decision tree (DT), random forest (RF), support vector machine (SVM) and linear discriminant analysis (LDA), and rock burst intensity grade prediction models based on these 6 machine learning algorithms are established. The flow chart for building the machine learning models is shown in FIG. 7. To improve the prediction stability and fitting capability of the models, the data are usually cross-validated. n-fold cross-validation proceeds as follows:
step 1: divide the data evenly into n parts; step 2: for the n parts, take n-1 parts in turn as the training set and the remaining part as the prediction set, and build a model to predict; step 3: average the n prediction results to obtain the final performance of the model. Typically, to avoid the chance effects of a particular sample partition, the n-fold cross-validation is also repeated m times. The inventors tried cross-validation several times with the aim of reaching the highest model accuracy; considering also that 145 rock burst engineering cases are selected, five-fold cross-validation is appropriate.
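The partitioning in steps 1 and 2 can be sketched in plain Python. The function `k_fold_indices` is an illustrative assumption, not the patent's code; it shuffles the sample indices once and yields one train/prediction split per fold.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(145, k=5))
print([(len(train), len(test)) for train, test in folds])
# Each of the 5 folds has 116 training and 29 prediction samples.
```

With 145 cases and five folds, every sample is predicted exactly once, and each round uses 116 samples for training and 29 for prediction. Repeating the procedure with different seeds corresponds to the repeated cross-validation mentioned above.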
Step seven: modeling analysis.
The rock burst prediction models are trained and evaluated with 6 statistical machine learning methods: K nearest neighbor (KNN), support vector machine (SVM), naive Bayes (NB), decision tree (DT), random forest (RF) and linear discriminant analysis (LDA). When the rock burst intensity grade prediction models are established, 20% of the samples are used as the test set and 80% as the training set. After the split, the training set contains 116 samples and the test set contains 29 samples. To make the rock burst prediction models general and to remove the chance effects of sample partitioning, each model is selected from the models trained under five-fold cross-validation. The predicted grades are compared with the true grades and the prediction results are described in detail. The per-grade prediction accuracy of each model is shown in Table 5, and the per-grade number of correct predictions and the overall accuracy of each model are shown in Table 6. Finally, based on the results of the five-fold cross-validation, the bias of each algorithm's predictions relative to the true results (i.e. whether they err toward danger or toward safety) is studied in depth. From the 6 rock burst machine learning prediction models, the model with the highest accuracy and applicability for rock burst intensity grade I and the model with the highest accuracy and applicability for grades II-IV are then identified.
TABLE 5 model prediction accuracy for each grade
Actual grade   Actual quantity   KNN       SVM       NB        DT        RF        LDA
I              5                 100.00%   100.00%   100.00%   100.00%   100.00%   100.00%
II             8                 75.00%    87.50%    62.50%    75.00%    75.00%    100.00%
III            11                81.82%    100.00%   90.91%    72.73%    100.00%   90.91%
IV             5                 100.00%   80.00%    80.00%    80.00%    60.00%    100.00%
As can be seen from Table 5, all 6 models predict correctly when the actual rock burst intensity grade is I. For actual grade II, linear discriminant analysis (LDA) has the highest accuracy, 100%, followed by the support vector machine (SVM) at 87.5%. For actual grade III, the support vector machine and random forest have the highest accuracy, 100%, followed by linear discriminant analysis and naive Bayes at 90.91%. For actual grade IV, K nearest neighbor and linear discriminant analysis have the highest accuracy, 100%, followed by the support vector machine, naive Bayes and decision tree models at 80%.
TABLE 6 model prediction for each level and model accuracy
Actual grade   Actual quantity   KNN      SVM      NB       DT       RF       LDA
I              5                 5        5        5        5        5        5
II             8                 6        7        5        6        6        8
III            11                9        11       10       8        11       10
IV             5                 5        4        4        4        3        5
Total          29                25       27       24       23       25       28
Accuracy       --                86.21%   93.10%   82.76%   79.31%   86.21%   96.55%
Table 5 analyzes the prediction accuracy of each model for individual grades; Table 6 then gives the overall prediction accuracy of each model. Table 6 shows that linear discriminant analysis has the highest prediction accuracy (96.55%), which is consistent with the expectation of the invention that the samples are approximately linearly separable. It is followed by the SVM (93.10%), which predicts well on small samples that are linearly separable.
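The percentages in Tables 5 and 6 follow directly from the counts of correct predictions. The short Python sketch below recomputes them from the Table 6 counts; the variable names are illustrative, and the assignment of the four rows to grades I-IV follows the discussion of Table 5.

```python
# Correct predictions per actual grade on the 29-sample test set (Table 6).
actual = {"I": 5, "II": 8, "III": 11, "IV": 5}
correct = {
    "KNN": {"I": 5, "II": 6, "III": 9, "IV": 5},
    "SVM": {"I": 5, "II": 7, "III": 11, "IV": 4},
    "NB": {"I": 5, "II": 5, "III": 10, "IV": 4},
    "DT": {"I": 5, "II": 6, "III": 8, "IV": 4},
    "RF": {"I": 5, "II": 6, "III": 11, "IV": 3},
    "LDA": {"I": 5, "II": 8, "III": 10, "IV": 5},
}

total = sum(actual.values())
# Overall accuracy: all correct predictions divided by the test set size.
overall = {m: 100 * sum(c.values()) / total for m, c in correct.items()}
# Per-grade accuracy: correct predictions divided by the samples of that grade.
per_grade = {m: {g: 100 * c[g] / actual[g] for g in actual} for m, c in correct.items()}

for model, acc in sorted(overall.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {acc:.2f}%")
```

For example, LDA predicts 28 of 29 test samples correctly, which gives 96.55%, and the SVM predicts 7 of the 8 grade II samples correctly, which gives 87.50%.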
The invention adopts the random five-fold cross validation, each round of prediction set has 29 samples, and the five rounds of prediction set have 145 prediction levels in total. The predicted grade was compared to the true grade, see table 7. The prediction results are described in detail below.
(1) For the Decision Tree (DT) algorithm, for real samples with a rockburst level i, the ratio of the predicted level i is 85.19%, and the ratio of the predicted level ii is 14.81%. The decision tree will not predict the true grade i to iii or iv with an error substantially within a controllably acceptable range. And (3) predicting the proportion of the rock burst grade II in the real sample with the grade II, predicting the proportion of the rock burst grade II in the real sample with the grade II to be 56%, predicting the proportion of the rock burst grade III in the real sample with the grade III to be 36%, and predicting the proportion of the rock burst grade I in the real sample with the grade I to be 8%. It can be seen that the prediction error of the decision tree is biased towards a higher level. The rock burst level III real sample predicts 70% of level III, 18.57% of level IV and 11.43% of level III or below. And (3) predicting the proportion of the rock burst in the real sample with the grade IV to be 56.52 and predicting the proportion of the rock burst in the real sample with the grade III to be 39.13%. In a comprehensive view, the decision tree is biased to the prediction of danger, the overall error is large, and the prediction effect is poor especially for samples with real grades II-IV.
(2) For the nearest neighbor (KNN) algorithm, the real sample with the rock burst level I predicts the occupation ratio of the level I as high as 96.3%, which is far higher than the level of the decision tree. The rock burst grade of the real sample II is 60% higher than that of the decision tree, and the rock burst grade of the real sample II is 36% equal to that of the decision tree. And (3) for the real samples with the rock burst level III, the ratio of the prediction level III is 75.71%, the real samples are higher than the decision tree, the ratio of the prediction level IV is 15.71%, and no sample is predicted to be I. For real samples with a rock burst level IV, the ratio of the predicted level IV is 78.26%, the ratio of the predicted level III is 21.74%, and no sample is predicted to be I or II. In general, the KNN prediction effect is better than that of a decision tree, but for samples of rock burst grades II-IV, the prediction accuracy of the KNN algorithm is still low.
(3) For Linear Discriminant Analysis (LDA), among real samples of rockburst grade I, 92.59% are predicted as grade I, which is close to KNN. Among real samples of grade II, 84% are predicted as grade II, far higher than for the two preceding algorithms. Among real samples of grade III, 77.14% are predicted as grade III and nearly 15% as grade IV. Among real samples of grade IV, 86.69% are predicted as grade IV, again far higher than for the two preceding algorithms. Overall, LDA performs well on every grade, although every algorithm shows certain limitations on samples with real grade III.
(4) For the Naive Bayes (NB) algorithm, among real samples of rockburst grade I, 92.59% are predicted as grade I, but the remaining 7.41% are predicted as grade III, a bias towards higher-risk prediction. For real samples of grades II-IV, the prediction effect is not as good as that of the LDA algorithm.
(5) For the Random Forest (RF) algorithm, the prediction accuracy for real samples of rockburst grade I is high, but the prediction effect for real samples of grades II-IV is not as good as that of the LDA algorithm.
(6) For the Support Vector Machine (SVM) algorithm, 100% of the real samples of rockburst grade I are predicted as grade I, that is, the algorithm has a high recognition rate for grade I samples, but the prediction effect for real samples of grades II-IV is not as good as that of the LDA algorithm.
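The per-grade percentages discussed in items (1) to (6) are row-normalized confusion-matrix entries: for each real grade, the share of samples assigned to each predicted grade. A minimal Python sketch of that calculation follows. The function name `grade_proportions` and the toy labels are illustrative assumptions and are not data from this application.

```python
from collections import Counter

GRADES = ("I", "II", "III", "IV")

def grade_proportions(y_true, y_pred):
    """For each real grade, return the percentage of samples assigned to each predicted grade."""
    counts = {g: Counter() for g in GRADES}
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    table = {}
    for g in GRADES:
        total = sum(counts[g].values())
        # Leave the row empty when no sample of this real grade is present.
        table[g] = {p: 100.0 * counts[g][p] / total for p in GRADES} if total else {}
    return table

# Toy labels (hypothetical, for illustration only).
y_true = ["I", "I", "II", "II", "III", "IV"]
y_pred = ["I", "II", "II", "III", "III", "IV"]
table = grade_proportions(y_true, y_pred)
```

Each row of `table` sums to 100, so an entry to the right of the diagonal is the share of a real grade that was overestimated and an entry to the left is the share that was underestimated.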
TABLE 7 Accuracy of each machine learning algorithm in predicting each rockburst intensity grade
[Table 7 is provided as an image in the original publication.]
Fig. 8 shows the prediction accuracy of Linear Discriminant Analysis (LDA) and the Support Vector Machine (SVM) in each round of the five-fold cross-validation. As shown in Fig. 8, for rockburst grade I the SVM reaches 100% prediction accuracy in every round, so it identifies grade I samples well, but its prediction accuracy for the other grades fluctuates strongly. By comparison, the prediction accuracy and stability of LDA on rockburst grades II-IV are superior to those of the SVM. On this basis, the results of the two algorithms can be combined when determining the final prediction: when the SVM predicts grade I, the SVM result is adopted; otherwise, the LDA result is adopted.
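The combination rule stated above can be sketched in a few lines of Python. The function name `combine_predictions` and the example labels are hypothetical; the sketch assumes both models have already produced one grade label per sample, in the same order.

```python
def combine_predictions(svm_pred, lda_pred):
    """Take the SVM label where the SVM predicts grade I; otherwise take the LDA label."""
    if len(svm_pred) != len(lda_pred):
        raise ValueError("prediction lists must have the same length")
    return [s if s == "I" else l for s, l in zip(svm_pred, lda_pred)]

# Hypothetical per-sample predictions from the two models.
svm_pred = ["I", "II", "III", "I"]
lda_pred = ["II", "II", "IV", "I"]
combined = combine_predictions(svm_pred, lda_pred)
```

In this toy case the first sample keeps the SVM label I even though LDA says II, and the third sample takes the LDA label IV because the SVM did not predict grade I.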
In practical rockburst engineering, not only the prediction accuracy of the model but also the risk caused by prediction errors must be considered. For example, if the actual rockburst grade is IV but the predicted grade is III, the consequences are more serious. Taking the overall accuracy into account, the Support Vector Machine (SVM) and the linear discriminant model (LDA) are selected for a finer comparison; their prediction results are shown in Table 8.
TABLE 8 Prediction results of the support vector machine and the linear discriminant model
[Table 8 is provided as an image in the original publication.]
As can be seen from Table 8, the SVM accurately predicts the samples whose actual rockburst grades are I and III. For actual grade II, 7 samples are predicted accurately and 1 is predicted as I; for actual grade IV, 4 samples are predicted correctly and 1 is predicted as III. Both misclassifications underestimate the rockburst grade, which can have a serious impact on actual engineering construction. The LDA model accurately predicts rockburst intensity grades I, II, and IV; for actual grade III, 10 samples are predicted accurately and 1 is predicted as IV, that is, the model overestimates the rockburst grade. In actual engineering construction, overestimating the rockburst intensity grade may waste some resources, but safety is guaranteed.
In summary, if the model accuracy is required to be above 90%, both the SVM and the LDA meet the requirement. However, both misclassifications of the SVM underestimate the rockburst grade, which may affect production safety.
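The distinction between underestimating and overestimating the grade can be made explicit by comparing the rank of the predicted grade with the rank of the actual grade. The sketch below is illustrative only: `error_bias` is a hypothetical helper, and the example lists contain only the grades for which the discussion of Table 8 states sample counts (SVM on grades II and IV, LDA on grade III).

```python
GRADE_RANK = {"I": 1, "II": 2, "III": 3, "IV": 4}

def error_bias(y_true, y_pred):
    """Count correct, underestimated, and overestimated predictions."""
    counts = {"correct": 0, "underestimated": 0, "overestimated": 0}
    for t, p in zip(y_true, y_pred):
        diff = GRADE_RANK[p] - GRADE_RANK[t]
        if diff == 0:
            counts["correct"] += 1
        elif diff < 0:
            counts["underestimated"] += 1  # predicted grade is lower than the actual grade
        else:
            counts["overestimated"] += 1   # predicted grade is higher than the actual grade
    return counts

# SVM, actual grades II and IV: 7 correct + 1 predicted as I; 4 correct + 1 predicted as III.
svm_true = ["II"] * 8 + ["IV"] * 5
svm_pred = ["II"] * 7 + ["I"] + ["IV"] * 4 + ["III"]
svm_bias = error_bias(svm_true, svm_pred)

# LDA, actual grade III: 10 correct + 1 predicted as IV.
lda_true = ["III"] * 11
lda_pred = ["III"] * 10 + ["IV"]
lda_bias = error_bias(lda_true, lda_pred)
```

On these subsets the SVM errors are all underestimates and the single LDA error is an overestimate, which is the asymmetry the text relies on when weighing safety.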
Combining the above research, it is not difficult to see that, in terms of prediction accuracy and stability for rockburst grades II to IV, the linear discriminant model (LDA) is more accurate and performs more stably. Because the actual grades of the engineering rockburst cases selected for the engineering application range from II to III, the invention adopts the linear discriminant model (LDA) to predict the rockburst intensity grade for the Jinping II hydropower station, the Jiangbian hydropower station, and the Cangling tunnel. The LDA algorithm is implemented by programming in the Python language, with the code based on the KMeans algorithm package in Python 3.7. The model is applied to these three rockburst-prone projects in China, and the predicted grades are given in Table 9. The results show that the predictions for all three projects agree completely with the actual grades.
TABLE 9 Rockburst intensity grade prediction results for the three projects
[Table 9 is provided as an image in the original publication.]
The rockburst intensity grade prediction method of the invention has good accuracy and generality and can provide useful guidance for rockburst intensity grade prediction problems.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (8)

1. An SVM-LDA rock burst machine learning prediction model method based on a data analysis principle, characterized in that the method comprises the following steps:
step one: constructing a rock burst prediction sample library;
step two: analyzing rock burst case engineering data;
step three: determining the grading condition of the rockburst intensity grade;
step four: optimizing model parameters;
step five: pre-processing rock burst prediction sample data;
step six: establishing a rock burst intensity grade prediction model;
step seven: modeling analysis.
2. The SVM-LDA rock burst machine learning prediction model method based on the data analysis principle as claimed in claim 1, wherein: in step one, the rock burst prediction sample library is constructed by collecting domestic and foreign literature on rock burst prediction and selecting multiple groups of independent domestic and foreign rock burst case engineering data, with the rock stress coefficient σθ/σc, the rock brittleness coefficient σc/σt, and the elastic deformation energy coefficient Wet selected as the rock burst prediction indexes.
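As an illustration of how the three prediction indexes form one sample, the sketch below computes the two ratios from raw strength and stress values and passes Wet through unchanged. It assumes the conventional reading of the symbols (σθ as the tangential stress of the surrounding rock, σc as the uniaxial compressive strength, σt as the uniaxial tensile strength); the function name and the numeric values are hypothetical.

```python
def rockburst_indexes(sigma_theta, sigma_c, sigma_t, wet):
    """Return (stress coefficient, brittleness coefficient, elastic energy coefficient Wet)."""
    if sigma_c <= 0 or sigma_t <= 0:
        raise ValueError("sigma_c and sigma_t must be positive")
    stress_coefficient = sigma_theta / sigma_c  # sigma_theta / sigma_c
    brittleness_coefficient = sigma_c / sigma_t  # sigma_c / sigma_t
    return stress_coefficient, brittleness_coefficient, wet

# Hypothetical values in MPa (Wet is dimensionless); not taken from the sample library.
features = rockburst_indexes(sigma_theta=60.0, sigma_c=120.0, sigma_t=6.0, wet=5.0)
```

The resulting tuple is the 3-feature vector that, together with the actual grade, would make up one record of the sample library.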
3. The SVM-LDA rock burst machine learning prediction model method based on the data analysis principle as claimed in claim 2, wherein: the method for analyzing the rock burst case engineering data in step two is as follows: each group of rock burst case engineering data from step one comprises 4 variables; meanwhile, the 3 features in each group of data collected in step one, namely the stress coefficient, the rock brittleness coefficient, and the elastic deformation energy coefficient, are reduced to two dimensions by the t-SNE method, and it is observed whether samples of different actual categories are separated by obvious boundaries.
4. The SVM-LDA rock burst machine learning prediction model method based on the data analysis principle as claimed in claim 3, wherein: in step three, the rockburst intensity is divided into 4 grades, that is, the rockburst intensity prediction result has 4 classes, which are, in order, no rockburst (I), slight rockburst (II), medium rockburst (III), and strong rockburst (IV).
5. The SVM-LDA rock burst machine learning prediction model method based on the data analysis principle as claimed in claim 4, wherein: step four specifically comprises: constructing a KNN model, an NB model, a DT model, an RF model, an LDA model, and an SVM model using a machine learning algorithm package and the Python tool, and optimizing the parameters of the 6 rock burst prediction models respectively by grid search.
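Grid search evaluates every combination of candidate parameter values and keeps the best-scoring one. The sketch below shows the idea for a single model only, using a small hand-written KNN classifier scored by leave-one-out accuracy. The classifier, the scoring scheme, the parameter grid, and the toy data are all assumptions made for illustration; the application does not state which parameters or scoring were used.

```python
from collections import Counter
from itertools import product

def knn_predict(train_x, train_y, x, k, p):
    """Majority vote among the k nearest neighbours under the Minkowski-p distance."""
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(tx, x)) ** (1.0 / p), ty)
        for tx, ty in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(xs, ys, k, p):
    """Leave-one-out accuracy: predict each sample from all the others."""
    hits = 0
    for i in range(len(xs)):
        train_x, train_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        hits += knn_predict(train_x, train_y, xs[i], k, p) == ys[i]
    return hits / len(xs)

def grid_search(xs, ys, grid):
    """Try every (k, p) combination and keep the first one with the best score."""
    best_params, best_score = None, -1.0
    for k, p in product(grid["k"], grid["p"]):
        score = loo_accuracy(xs, ys, k, p)
        if score > best_score:
            best_params, best_score = {"k": k, "p": p}, score
    return best_params, best_score

# Toy, already-standardized feature pairs (hypothetical).
xs = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
ys = ["I", "I", "I", "III", "III", "III"]
best_params, best_score = grid_search(xs, ys, {"k": [1, 3], "p": [1, 2]})
```

The same loop structure applies to any of the six models: only the parameter grid and the function that scores one parameter combination change.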
6. The SVM-LDA rock burst machine learning prediction model method based on the data analysis principle as claimed in claim 5, wherein: the rock burst prediction sample data are preprocessed in step five as follows: extreme value processing is first performed on the original rock burst case engineering data, followed by standardization, to eliminate the influence of dimension.
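One possible reading of this preprocessing is sketched below for a single feature column: extreme values are clipped to a band around the mean, and the clipped values are then converted to z-scores so that every feature has zero mean and unit variance. The clipping rule (mean ± n_sigma standard deviations), the function names, and the sample values are assumptions; the claim does not specify how extreme values are handled.

```python
from statistics import mean, pstdev

def clip_extremes(values, n_sigma=3.0):
    """Clip values to [mean - n_sigma*std, mean + n_sigma*std] (assumed extreme-value rule)."""
    mu, sd = mean(values), pstdev(values)
    lo, hi = mu - n_sigma * sd, mu + n_sigma * sd
    return [min(max(v, lo), hi) for v in values]

def standardize(values):
    """Z-score standardization: subtract the mean and divide by the standard deviation."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

def preprocess(values, n_sigma=3.0):
    """Extreme-value processing first, then standardization."""
    return standardize(clip_extremes(values, n_sigma))

# Hypothetical stress-coefficient column with one outlier.
column = [0.3, 0.4, 0.5, 0.6, 5.0]
z = preprocess(column, n_sigma=1.5)
```

Applying `preprocess` to each of the three index columns separately puts them on a common scale, which matters for distance-based models such as KNN and for the SVM.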
7. The SVM-LDA rock burst machine learning prediction model method based on the data analysis principle as claimed in claim 6, wherein: the process of establishing the rock burst intensity grade prediction model in step six is as follows:
(1) adopting five-fold cross-validation, in which 80% of the rock burst case engineering data samples preprocessed in step five are taken as the training set and 20% as the test set;
(2) using the optimized parameters of the 6 rock burst prediction models together with the machine learning algorithm package, and running them with the Python tool to obtain the per-grade prediction accuracy of each of the 6 rock burst prediction models, thereby judging the prediction accuracy of the 6 models for rockburst intensity grades I to IV;
the above two steps are performed simultaneously.
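In five-fold cross-validation the samples are divided into five folds; each fold serves once as the 20% test set while the remaining four folds form the 80% training set. A minimal sketch of the index bookkeeping follows. The helper name, the fixed random seed, and the plain (non-stratified) shuffling are assumptions for illustration.

```python
import random

def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) for each of the n_folds rounds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into n_folds folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for k, fold in enumerate(folds) if k != i for j in fold)
        yield train_idx, test_idx

splits = list(five_fold_splits(20))
```

Every sample appears in exactly one test fold, so the per-grade accuracy can be reported round by round, as in Fig. 8, or pooled over all five rounds.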
8. The SVM-LDA rock burst machine learning prediction model method based on the data analysis principle as claimed in claim 7, wherein: the modeling analysis in step seven is as follows: based on the per-grade prediction accuracy results of the 6 rock burst prediction models obtained in step six, the bias of the predictions of the 6 machine learning algorithms relative to the real results is analyzed, that is, whether the predictions are biased towards risk or towards safety.
CN202110458500.XA 2021-04-27 2021-04-27 SVM-LDA rock burst machine learning prediction model method based on data analysis principle Pending CN113076700A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110458500.XA CN113076700A (en) 2021-04-27 2021-04-27 SVM-LDA rock burst machine learning prediction model method based on data analysis principle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110458500.XA CN113076700A (en) 2021-04-27 2021-04-27 SVM-LDA rock burst machine learning prediction model method based on data analysis principle

Publications (1)

Publication Number Publication Date
CN113076700A true CN113076700A (en) 2021-07-06

Family

ID=76618888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110458500.XA Pending CN113076700A (en) 2021-04-27 2021-04-27 SVM-LDA rock burst machine learning prediction model method based on data analysis principle

Country Status (1)

Country Link
CN (1) CN113076700A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210706