CN114386429A - Coal mine disaster risk prediction method and system based on semantic recognition - Google Patents
- Publication number
- CN114386429A; application number CN202111580046.1A
- Authority
- CN
- China
- Prior art keywords
- coal mine
- data
- hidden danger
- risk prediction
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/02—Agriculture; Fishing; Forestry; Mining
Abstract
The application provides a coal mine disaster risk prediction method and a system based on semantic recognition, wherein the method comprises the following steps: extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set; inputting the coal mine risk prediction training set into a logistic regression model for training to obtain a coal mine risk prediction model; extracting coal mine characteristic vectors in the coal mine hidden danger text information and the measuring point data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data; and calling a coal mine risk prediction model to perform risk prediction on the extracted coal mine risk evaluation data. The method and the device fully consider the fusion of the coal mine text data and the measuring point data, establish the dynamic incidence relation between the accident risk and the hidden danger, track the hidden danger risk in the coal mine in real time, improve the safety management level of the coal mine and the hidden danger disposal efficiency, block key links of accident occurrence, and reduce the accident risk of the coal mine disaster.
Description
Technical Field
The application relates to the technical field of coal mine safety risk evaluation, in particular to a coal mine disaster risk prediction method and system based on semantic recognition.
Background
In recent years, as coal mines have strengthened standardized management, the widespread adoption of the coal mine "three-in-one" system (integrated management of coal mine risks, hidden dangers, and safety standardization) has changed the traditional mechanism and process of hidden danger management. With the rollout of underground communication networks and explosion-proof mobile terminals, safety management personnel can enter the risk and hidden danger information found during on-site inspections into the "three-in-one" system through an explosion-proof mobile terminal. The system automatically transmits the hidden danger information to the safety management department, and each department rectifies the hidden dangers and supervises their handling, achieving closed-loop management of hidden danger information. As enterprises adopt the "three-in-one" system, a large amount of structured data and unstructured information is uploaded and aggregated, which provides strong support for coal mines to apply natural language processing technology to mine the associations among accident-related hidden dangers in depth.
However, the existing coal mine disaster risk evaluation and management method has the following technical problems:
First, the data analyzed are of a single type: analysis focuses mainly on coal mine measuring point data, and the fusion of coal mine text data with measuring point data is not fully considered. Analyzing coal mine text data is more complex than analyzing measuring point data. Most results for measuring point data can be obtained by statistical analysis, whereas coal mine text data must be analyzed by combining expert experience, manual labeling, and natural language processing methods and models.
Second, at the present stage coal mine hidden dangers and accidents are only weakly associated. Coal mines currently carry out hidden danger investigation, but no association analysis model has been established between the various hidden dangers and accidents. As a result, the dynamic association between accident risks and hidden dangers cannot be properly captured, and the degree to which a hidden danger affects the corresponding accident risk cannot be shown dynamically and intuitively when the hidden danger is investigated.
Therefore, how to fully consider the fusion of coal mine text data and measuring point data, establish the dynamic incidence relation between accident risk and hidden danger, improve the coal mine safety management level and hidden danger disposal efficiency, block the key link of accident occurrence, and reduce coal mine disaster accident risk is a technical problem which needs to be solved urgently at present.
Disclosure of Invention
The application aims to provide a coal mine disaster risk prediction method and system based on semantic recognition, fusion of coal mine text data and measuring point data is fully considered, a dynamic incidence relation between accident risk and hidden danger is established, the coal mine safety management level and hidden danger disposal efficiency are improved, key accident links are blocked, and coal mine disaster accident risk is reduced.
In order to achieve the purpose, the application provides a coal mine disaster risk prediction method based on semantic recognition, and the method comprises the following steps:
acquiring historical coal mine accident data and coal mine non-accident data;
extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
inputting the coal mine risk prediction training set into a logistic regression model for training to obtain a coal mine risk prediction model;
acquiring coal mine hidden danger text data and measuring point real-time data;
extracting coal mine characteristic vectors in the coal mine hidden danger text data and the measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
and calling a coal mine risk prediction model, and performing risk prediction on the extracted coal mine risk evaluation data to obtain a coal mine risk prediction evaluation result.
The method for constructing the coal mine feature extraction model in advance comprises the following sub-steps:
acquiring a hidden danger text vector training set;
calling a preset classification model, and training a hidden danger text vector training set to obtain a text hidden danger vector corresponding to the hidden danger text;
and acquiring a measuring point hidden danger vector, and combining the measuring point hidden danger vector and the text hidden danger vector to obtain a coal mine characteristic extraction vector.
The coal mine characteristic extraction model extracts coal mine characteristic vectors, and the coal mine characteristic vectors comprise measuring point hidden danger vectors and text hidden danger vectors.
In the method described above, obtaining the hidden danger text vector training set includes the following sub-steps:
acquiring a hidden danger text training set labeled manually;
preprocessing a hidden danger text training set;
and vectorizing the preprocessed hidden danger text training set to obtain a hidden danger text vector training set.
In the method described above, acquiring the measuring point hidden danger vector comprises the following steps:
acquiring real-time data of coal mine measuring points;
and judging the real-time data of the measuring points, and taking the category corresponding to the risk hidden danger of the measuring points as a measuring point hidden danger vector.
The method for acquiring the coal mine risk prediction training set comprises the following steps:
acquiring historical coal mine accident data corresponding to each accident category; acquiring coal mine non-accident data;
extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
wherein the accident category includes: gas accidents, water damage accidents, coal dust accidents, and roof accidents.
In the method described above, the coal mine accident information includes coal mine hidden danger text data and measuring point hidden danger data, wherein the coal mine hidden danger text data are used for extracting text vectors, and the measuring point hidden danger data are used for extracting measuring point vectors.
In the method described above, a logistic regression model is used to learn and optimize the weights of the coal mine feature vectors.
The method comprises the steps of inputting coal mine risk evaluation data into a coal mine risk prediction model; and outputting the predicted coal mine accident category by the coal mine risk prediction model.
The application also provides a coal mine disaster risk prediction system based on semantic recognition, and the system comprises:
the training data acquisition module is used for acquiring historical coal mine accident data and coal mine non-accident data; extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
the model construction module is used for inputting the coal mine risk prediction training set into the logistic regression model for training to obtain a coal mine risk prediction model;
the real-time data acquisition module is used for acquiring coal mine hidden danger text data and measuring point real-time data;
the risk evaluation data acquisition module is used for extracting coal mine characteristic vectors in coal mine hidden danger text data and measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
the risk prediction module is used for calling the coal mine risk prediction model, performing risk prediction on the extracted coal mine risk evaluation data, and acquiring a coal mine risk prediction evaluation result.
The beneficial effects achieved by the application are as follows:
(1) The application makes full use of the collected coal mine information data, which include hidden danger text data and measuring point real-time data. A semantic recognition method is used to obtain the coal mine text hidden danger vector, and a statistical method is used to obtain the coal mine measuring point hidden danger vector; the two types of vectors are fused to obtain the coal mine feature vector. The fusion of coal mine text data and measuring point data is thus fully considered, which improves the accuracy of coal mine risk prediction and reduces the probability of coal mine safety production accidents.
(2) The application establishes an association between hidden danger data and accident data. Once this association is established, the dynamic risk distribution of coal mine accidents and its association with hidden dangers can be better analyzed and mined, providing data support for coal mines to reduce accident risk and respond to major risks in time. Specifically, the application makes full use of the collected accident data, which include gas accidents, water damage accidents, coal dust accidents, roof accidents, etc. Combined with the coal mine feature vectors, a logistic regression model is used to predict coal mine accident risk in real time and obtain a prediction result, which is of great significance for coal mine risk prevention.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art according to the drawings.
Fig. 1 is a flowchart of a coal mine disaster risk prediction method based on semantic recognition according to an embodiment of the present application.
Fig. 2 is a flowchart of a method for pre-constructing a coal mine feature extraction model according to an embodiment of the present application.
Fig. 3 is a flowchart of a method for obtaining a hidden danger text vector training set according to an embodiment of the present application.
Fig. 4 is a schematic diagram of a method for obtaining a coal mine feature vector according to an embodiment of the present application.
Fig. 5 is a schematic diagram of a coal mine risk prediction method according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a coal mine disaster risk prediction system based on semantic recognition according to an embodiment of the present application.
Reference numerals: 10-a training data acquisition module; 20-a model building module; 30-a real-time data acquisition module; 50-a risk prediction module; 100-coal mine disaster risk prediction system.
Detailed Description
The technical solutions in the embodiments of the present application are clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Example one
As shown in fig. 1, the present application provides a coal mine disaster risk prediction method based on semantic recognition, which includes the following steps:
Step S1, constructing a coal mine characteristic extraction model in advance.
As shown in fig. 2, step S1 includes the following sub-steps:
Step S110, acquiring a hidden danger text vector training set.
As shown in fig. 3, step S110 includes the following sub-steps:
Step S111, acquiring a manually labeled hidden danger text training set.
Specifically, the method for acquiring the coal mine hidden danger text training set comprises: acquiring coal mine hidden danger text information from a recent time period (for example, 60 days) and manually labeling it by category.
The categories are hidden dangers that readily lead to coal mine disasters. They are divided by experts according to experience and include: ignition sources, insufficient air volume, abnormal mine water inflow, uncleaned sump, water pump failure, pipeline failure, dust suppression failure, coal dust generation, untimely support, abnormal support work, support failure or reduced support effectiveness, other hidden danger descriptions, etc. A category label is set for each category; the label can be a number, for example 0, 1, 2, 3, etc.
Step S112, preprocessing the hidden danger text training set.
Specifically, the method for preprocessing the hidden danger text training set comprises: performing word segmentation and stop-word removal on the hidden danger text training set.
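The preprocessing step can be illustrated with a minimal sketch. The stop-word list and the `preprocess` helper are hypothetical, and whitespace splitting merely stands in for a real word segmenter, which Chinese hidden danger text would require.

```python
# Hypothetical stop-word list for illustration; a real system would load a
# domain-specific list.
STOP_WORDS = {"the", "is", "of", "a", "in", "and"}


def preprocess(text):
    """Segment a hidden danger description into words and drop stop words.

    Whitespace splitting is an assumption standing in for a real word
    segmentation tool.
    """
    tokens = [tok.strip(".,;:!?").lower() for tok in text.split()]
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


sample = "The water pump in the sump is faulty."
tokens = preprocess(sample)
print(tokens)
```

The resulting token lists are what the vectorization step consumes.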
Step S113, vectorizing the preprocessed hidden danger text training set to obtain a hidden danger text vector training set.
The method for vectorizing the preprocessed hidden danger text training set comprises: computing weights for the text data in the hidden danger text training set with the TF-IDF (term frequency-inverse document frequency) method, thereby vectorizing the text data in the hidden danger text training set.
Specifically, the TF-IDF calculation method is:
$$\text{TF-IDF} = \text{TF} \times \text{IDF}, \qquad tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}, \qquad idf_{i} = \log \frac{|D|}{|\{ j : t_{i} \in d_{j} \}|}$$
wherein TF represents the term frequency, i.e., the frequency with which the term t occurs in a document; IDF represents the inverse document frequency. The main idea of IDF is: the fewer documents contain the term t, that is, the smaller the number n of occurrences of the keyword in the hidden danger texts, the larger the IDF, which indicates that the term t has good category distinguishing capability. $tf_{i,j}$ represents the term frequency of the i-th word in the j-th document; $n_{i,j}$ represents the number of times the i-th word appears in the j-th document; $\sum_{k} n_{k,j}$ represents the total number of occurrences of all words in the j-th document; $idf_{i}$ represents the inverse document frequency of the i-th word; d represents a document; $|D|$ represents the total number of documents; and $|\{ j : t_{i} \in d_{j} \}|$ represents the number of documents containing the word $t_{i}$.
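As a worked illustration of the TF-IDF definitions above, the following sketch computes the weights for a few tokenized documents using only the standard library. The function name `tf_idf` and the toy documents are assumptions made for the example; the natural logarithm is used because the source does not state a base.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())  # all word occurrences in this document
        weights.append({
            word: (n / total) * math.log(n_docs / doc_freq[word])
            for word, n in counts.items()
        })
    return weights


# Toy, already-segmented hidden danger texts (hypothetical).
docs = [
    ["water", "pump", "fault"],
    ["water", "sump", "uncleaned"],
    ["roof", "support", "fault"],
]
weights = tf_idf(docs)
print(weights[0])
```

In this toy corpus "pump" occurs in one document and "water" in two, so "pump" receives the larger weight in the first document, matching the idea that rarer terms distinguish categories better.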
Step S120, calling a preset classification model, and training the hidden danger text vector training set to obtain a text hidden danger vector corresponding to the hidden danger text.
Specifically, the hidden danger text vector training set is divided into a training set and a test set. Training by adopting a training set to obtain a text hidden danger classification model, testing the text hidden danger classification model by adopting a testing set, optimizing a text hidden danger vector, and obtaining the optimized text hidden danger vector. The text hidden danger vector comprises hidden danger characteristics of a plurality of categories.
Preferably, the feature vectors and the class labels (labels) in the hidden danger text vector training set are divided into a training set and a testing set according to the ratio of 8: 2.
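A minimal sketch of the 8:2 split follows. The shuffling, the fixed seed, and the `split_train_test` helper are assumptions added for reproducibility; the source only specifies the ratio.

```python
import random


def split_train_test(samples, train_ratio=0.8, seed=0):
    """Shuffle (feature vector, label) pairs and split them by train_ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Ten toy (feature vector, category label) pairs.
samples = [([float(i)], i % 3) for i in range(10)]
train, test = split_train_test(samples)
print(len(train), len(test))
```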
The training method of the text hidden danger classification model is as follows: binary SVM classifiers are used to train the text hidden danger classification model. For a multi-class problem, a one-versus-rest strategy can be adopted: each class in turn is treated as one class and all other classes as the other class, so U binary SVM classifiers need to be constructed, one for each of the U classes.
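The one-versus-rest strategy can be sketched independently of the SVM itself. The helpers below are hypothetical: `one_vs_rest_labels` builds the binary label set for each class, and `pick_class` selects the class whose binary classifier returns the highest score. Choosing by highest score is a common convention and an assumption here, since the source does not state how the binary outputs are combined.

```python
def one_vs_rest_labels(labels):
    """Build one +1/-1 label list per class: the class itself versus the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def pick_class(scores):
    """Return the class whose binary classifier gave the highest score."""
    return max(scores, key=scores.get)


labels = [0, 2, 1, 0, 2]  # toy category labels
binary_tasks = one_vs_rest_labels(labels)
print(binary_tasks)

# Hypothetical decision scores from the three binary classifiers.
print(pick_class({0: -0.3, 1: 0.8, 2: 0.1}))
```

Each entry of `binary_tasks` would be used to train one binary SVM classifier.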
The optimization problem of solving the optimal classification hyperplane is transformed into the following minimization problem using a kernel function.
The minimization problem is:
wherein K represents a kernel function; X represents a training feature; C represents a penalty factor that prevents overfitting; l represents the total number of training features; n represents the number of times the keywords occur in the hidden danger text; y represents a category label; T represents transposition; s.t. denotes the constraints to be satisfied; p, q and k are parameters; $\alpha_{p}$ and $\alpha_{q}$ represent parameters selected in the range 0 to C; $y_{p}$ represents the p-th category label; $y_{q}$ represents the q-th category label; $X_{p}$ represents the p-th training feature; $X_{q}$ represents the q-th training feature; and $\alpha = [\alpha_{1}, \alpha_{2}, \ldots, \alpha_{l}]$.
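The formula itself is not reproduced in the text. For reference only, the standard soft-margin kernel SVM dual problem, which is consistent with the symbols K, C, l, y, X and α defined here, can be written as below. Whether the application uses exactly this form is an assumption, and the symbols n and k do not appear in it.

```latex
\min_{\alpha} \; \frac{1}{2} \sum_{p=1}^{l} \sum_{q=1}^{l}
    \alpha_{p} \alpha_{q} \, y_{p} y_{q} \, K(X_{p}, X_{q})
    \;-\; \sum_{p=1}^{l} \alpha_{p}
\qquad \text{s.t.} \qquad
\sum_{p=1}^{l} \alpha_{p} y_{p} = 0, \qquad
0 \le \alpha_{p} \le C, \quad p = 1, \ldots, l
```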
Wherein, the conversion from the low-dimensional space to the high-dimensional space is realized through a kernel function. Commonly used kernel functions include: RBF kernel, linear kernel, polynomial kernel and Sigmoid kernel.
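The four kernel functions named above can be written out in a few lines. This is a sketch in their commonly used forms; the parameter names `gamma`, `coef0` and `degree` and their default values are assumptions, not values given in the source.

```python
import math


def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=0.5, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)


x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z), polynomial_kernel(x, z, degree=2),
      rbf_kernel(x, z), sigmoid_kernel(x, z))
```

Each function returns the inner product of the two inputs in an implicit higher-dimensional space, which is how the kernel realizes the low-to-high-dimensional conversion without computing the mapping explicitly.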
Step S130, acquiring the measuring point hidden danger vector, and combining the measuring point hidden danger vector and the text hidden danger vector to obtain a coal mine characteristic extraction model.
Specifically, the method for acquiring the measuring point hidden danger vector comprises the following steps:
Real-time data of the coal mine measuring points are acquired, including gas, oxygen, coal dust, wind speed, mine pressure, etc. A statistical judgment is made on each item of real-time measuring point data to determine whether it corresponds to a high risk. If the real-time data of a measuring point correspond to a high risk, that is, the data constitute a risk hidden danger, the category corresponding to that data is taken as part of the measuring point hidden danger vector; otherwise, the category is not included in the measuring point hidden danger vector.
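One way to picture the judgment of measuring point data is a simple limit check per quantity. The sketch below is an assumption throughout: the source does not specify the statistical test, and the limits in `THRESHOLDS` are hypothetical placeholders rather than regulatory values.

```python
# Hypothetical limits for illustration only: (direction, limit) per quantity.
# "max" flags readings above the limit, "min" flags readings below it.
THRESHOLDS = {
    "gas": ("max", 1.0),
    "oxygen": ("min", 18.0),
    "coal_dust": ("max", 10.0),
    "wind_speed": ("min", 0.25),
    "mine_pressure": ("max", 30.0),
}


def measuring_point_vector(readings):
    """Return a 0/1 entry per quantity: 1 if the reading is judged high risk."""
    vector = []
    for name, (direction, limit) in THRESHOLDS.items():
        value = readings[name]
        risky = value > limit if direction == "max" else value < limit
        vector.append(1 if risky else 0)
    return vector


readings = {"gas": 1.4, "oxygen": 20.5, "coal_dust": 4.0,
            "wind_speed": 0.1, "mine_pressure": 12.0}
point_vector = measuring_point_vector(readings)
print(point_vector)
```

Here the gas and wind speed readings violate their placeholder limits, so only those two categories enter the vector.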
Specifically, the measuring point hidden danger vector and the text hidden danger vector are combined to obtain a coal mine characteristic extraction model for extracting coal mine characteristic vectors.
The coal mine characteristic extraction model extracts coal mine characteristic vectors, namely extracting measuring point hidden danger vectors and text hidden danger vectors. Specifically, when data consistent with a measuring point hidden danger vector or a text hidden danger vector in the coal mine feature extraction model exists in the extracted data, the consistent data is extracted and used as a coal mine feature vector.
Fig. 4 is a flow chart of a method for obtaining a coal mine feature vector. The method for acquiring the coal mine feature vector comprises: acquiring a text hidden danger vector and a measuring point hidden danger vector, and combining the text hidden danger vector and the measuring point hidden danger vector into a coal mine feature vector.
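The combination of the two vectors can be sketched as a concatenation. That choice is an assumption, since the source only says the vectors are combined; the category names and the helper functions are likewise hypothetical.

```python
# Hypothetical subset of text hidden danger categories.
TEXT_CATEGORIES = ["ignition_source", "insufficient_airflow",
                   "abnormal_water_inflow", "pump_fault"]


def text_hazard_vector(predicted_categories):
    """Mark with 1 each text category predicted by the text classifier."""
    return [1 if c in predicted_categories else 0 for c in TEXT_CATEGORIES]


def coal_mine_feature_vector(text_vector, point_vector):
    """Combine the text and measuring point vectors by concatenation."""
    return list(text_vector) + list(point_vector)


text_vector = text_hazard_vector({"insufficient_airflow", "pump_fault"})
point_vector = [1, 0, 0, 1, 0]  # e.g. from a measuring point limit check
feature_vector = coal_mine_feature_vector(text_vector, point_vector)
print(feature_vector)
```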
As shown in fig. 4, the method for acquiring a text hidden danger vector includes: acquiring a manually labeled hidden danger text training set; performing word segmentation and stop-word removal on it to obtain a preprocessed hidden danger text training set; vectorizing the preprocessed training set to obtain a hidden danger text vector training set; training an SVM classification model on the hidden danger text vector training set; and obtaining the text hidden danger vector from the hidden danger text test set and the trained SVM classification model.
The method for acquiring the measuring point hidden danger vector comprises: performing statistical analysis on the real-time measuring point data (gas, oxygen, coal dust, wind speed, mine pressure, etc.) to obtain the measuring point hidden danger vector of items prone to high risk.
Step S2, acquiring historical coal mine accident data and coal mine non-accident data, and extracting coal mine characteristic data in the historical coal mine accident data and the coal mine non-accident data according to a coal mine characteristic extraction model to serve as a coal mine risk prediction training set.
Specifically, historical coal mine accident information corresponding to each accident category is obtained, coal mine non-accident data are obtained, and coal mine characteristic data in the historical coal mine accident data and the coal mine non-accident data are extracted according to a coal mine characteristic extraction model to serve as a coal mine risk prediction training set.
Wherein the accident category includes: gas accidents, water damage accidents, coal dust accidents, roof accidents, etc.
Step S2 includes the following sub-steps:
Step S210, acquiring, for each accident category, coal mine accident information within a period of time (for example, the last half year).
The coal mine accident information comprises coal mine hidden danger text data and measuring point hidden danger data, wherein the coal mine hidden danger text data are used for extracting a text hidden danger category, and the measuring point hidden danger data are used for extracting a measuring point hidden danger category.
Step S220, extracting coal mine non-accident data from the historical coal mine data in proportion.
Negative examples are randomly sampled from the coal mine non-accident data at a positive-to-negative ratio of 1:5, where positive examples are coal mine accident data and negative examples are coal mine non-accident data. In other words, coal mine non-accident data are extracted so that the ratio of the amount of coal mine accident data to the amount of coal mine non-accident data is 1:5.
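The 1:5 sampling can be sketched as follows. The `sample_negatives` helper, the fixed seed, and the record names are assumptions for illustration; sampling is without replacement and is capped at the number of non-accident records available.

```python
import random


def sample_negatives(positives, negatives, ratio=5, seed=0):
    """Randomly draw `ratio` non-accident records per accident record."""
    wanted = min(len(negatives), ratio * len(positives))
    return random.Random(seed).sample(negatives, wanted)


accidents = [f"accident_{i}" for i in range(4)]      # positive examples
non_accidents = [f"normal_{i}" for i in range(100)]  # candidate negatives
negatives = sample_negatives(accidents, non_accidents)
print(len(accidents), len(negatives))
```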
Step S230, extracting, through the coal mine characteristic extraction model, the coal mine characteristic data in the coal mine accident data and the coal mine non-accident data, and obtaining a coal mine risk prediction training set from the coal mine characteristic data.
The coal mine characteristic data comprises a text hidden danger category and a measuring point hidden danger category. And dividing the coal mine characteristic data in the extracted coal mine accident data and coal mine non-accident data into a coal mine risk prediction training set and a coal mine risk prediction testing set. The coal mine risk prediction training set is used for training to obtain a coal mine risk prediction model, and the coal mine risk prediction testing set is used for testing and optimizing the coal mine risk prediction model.
Step S3: the coal mine risk prediction training set is input into a logistic regression model for training to obtain the coal mine risk prediction model.
The logistic regression model is a linear regression model whose output is normalized by a Sigmoid function (the logistic function); it is a machine learning method for solving binary classification problems.
The hypothesis function of the logistic regression model has the following form:
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^{T}x}}$$
where $h_\theta(x)$ represents the hypothesis function, $x$ is the input, $\theta$ is the parameter to be obtained, and $e \approx 2.718$.
The assumption made by the logistic regression model is as follows:
$$P(y = 1 \mid x; \theta) = g(\theta^{T}x) = \frac{1}{1 + e^{-\theta^{T}x}}$$
where $g(\cdot)$ represents the logistic function, $P(y = 1 \mid x; \theta)$ represents the probability that the output variable $y$ equals 1 given the selected parameter $\theta$, and $T$ denotes transposition.
The decision function corresponding to the logistic regression model is:
$$y^{*} = 1, \quad \text{if } P(y = 1 \mid x) > 0.5$$
where $y^{*}$ represents the decision function, $x$ is the input, and $y$ is the output variable.
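The hypothesis function and the 0.5 decision threshold described above can be sketched in a few lines. The function names are hypothetical, and returning 0 when the threshold is not exceeded is the usual binary convention assumed here rather than something the source states.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): estimated probability that y = 1."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def decide(theta, x, threshold=0.5):
    """Predict 1 when P(y=1|x) exceeds the threshold, else 0 (assumed)."""
    return 1 if hypothesis(theta, x) > threshold else 0
```

Because the threshold test is strict, an input with probability exactly 0.5 is classified as 0 in this sketch.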
In logistic regression models, the most common cost function is cross entropy; in optimizing the parameters, the most common method is gradient descent.
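For reference, the cross-entropy cost and the gradient descent update take the following standard textbook form for $m$ training samples. The symbols $m$, $\alpha$ (learning rate), the sample index $i$, and the feature index $j$ are notation introduced here and do not appear in the source.

```latex
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y^{(i)} \log h_\theta\!\left(x^{(i)}\right)
          + \left(1 - y^{(i)}\right) \log\!\left(1 - h_\theta\!\left(x^{(i)}\right)\right) \right]

\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j}
          = \theta_j - \frac{\alpha}{m}\sum_{i=1}^{m}\left(h_\theta\!\left(x^{(i)}\right) - y^{(i)}\right) x_j^{(i)}
```

The second line follows from differentiating $J(\theta)$ with $h_\theta(x) = 1/(1 + e^{-\theta^{T}x})$, which is why the update has the same shape as in linear regression.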
The logistic regression model is used to learn and optimize the weights of the coal mine feature vector, yielding the coal mine risk prediction model and a better prediction effect.
Specifically, the weights of the coal mine feature vector are taken as the parameter θ and the coal mine feature vector is taken as the input; the parameter θ that satisfies the decision function is obtained and used as the optimized weights of the coal mine feature vector. These optimized weights serve as the weight parameters of the coal mine risk prediction model, so that coal mine risk is predicted more accurately.
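A minimal sketch of learning the weights θ by batch gradient descent on the cross-entropy cost is given below. The toy feature layout, the learning rate, and the number of epochs are hypothetical choices for illustration; the source does not specify them.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logistic_regression(X, y, learning_rate=0.5, epochs=2000):
    """Fit weights theta by batch gradient descent on the cross-entropy cost.

    X is a list of feature vectors (a leading 1.0 acts as the bias term),
    y is a list of 0/1 labels. Returns the learned weight list theta.
    """
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for xi, yi in zip(X, y):
            error = sigmoid(sum(t * v for t, v in zip(theta, xi))) - yi
            for j in range(n):
                grad[j] += error * xi[j]
        theta = [t - learning_rate * g / m for t, g in zip(theta, grad)]
    return theta

# Hypothetical toy data: [bias, gas hidden danger flag, roof hidden danger flag]
X = [[1.0, 1, 0], [1.0, 1, 1], [1.0, 0, 0], [1.0, 0, 1], [1.0, 0, 0], [1.0, 1, 0]]
y = [1, 1, 0, 0, 0, 1]
theta = train_logistic_regression(X, y)
predictions = [1 if sigmoid(sum(t * v for t, v in zip(theta, xi))) > 0.5 else 0
               for xi in X]
```

In this toy data the label follows the gas flag, so the learned weight for that feature comes out positive while the bias comes out negative.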
Step S4: coal mine hidden danger text data and measuring point real-time data are acquired.
Step S5: coal mine feature vectors are extracted from the coal mine hidden danger text data and the measuring point real-time data according to the coal mine characteristic extraction model and serve as coal mine risk evaluation data.
In other words, the measuring point hidden danger vector and the text hidden danger vector are extracted to serve as coal mine risk evaluation data.
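One way to picture the risk evaluation data is as the concatenation of a text hidden danger vector and a measuring point hidden danger vector. The sketch below assumes multi-hot encodings and simple concatenation; the category names and function names are hypothetical and are not enumerated in the source.

```python
# Hypothetical category lists, chosen only for illustration.
TEXT_CATEGORIES = ["gas", "water", "coal_dust", "roof"]
POINT_CATEGORIES = ["gas_overlimit", "co_overlimit", "temperature_overlimit"]

def multi_hot(found, categories):
    """Return a 0/1 vector marking which categories were found."""
    unknown = set(found) - set(categories)
    if unknown:
        raise ValueError("unknown categories: %s" % sorted(unknown))
    return [1 if c in found else 0 for c in categories]

def build_feature_vector(text_hazards, point_hazards):
    """Concatenate the text hidden danger vector and the measuring point vector."""
    return (multi_hot(text_hazards, TEXT_CATEGORIES)
            + multi_hot(point_hazards, POINT_CATEGORIES))

features = build_feature_vector({"gas", "roof"}, {"gas_overlimit"})
print(features)
```

Keeping the two parts in a fixed order means each position of the fused vector always refers to the same hidden danger category, which the learned weights rely on.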
Step S6: the coal mine risk prediction model is called to perform risk prediction on the extracted coal mine risk evaluation data and obtain a coal mine risk prediction result.
Step S6 includes the following sub-steps:
Step S610: the coal mine risk evaluation data are input into the coal mine risk prediction model.
Step S620: the coal mine risk prediction model outputs the predicted coal mine accident category, so that corresponding preventive measures can be taken according to the predicted category.
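Since logistic regression is a binary classifier, one plausible way to output an accident category is to keep one binary model per category and report the highest-scoring one. This one-vs-rest arrangement is an assumption made for the sketch below, not something the source specifies; the weights and the function name are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_accident_category(models, features, threshold=0.5):
    """Score each accident category with its own binary model.

    `models` maps category name -> weight list (first weight is the bias).
    Returns (category, probability) for the highest score above the
    threshold, or (None, probability) when no category exceeds it.
    """
    best_category, best_p = None, 0.0
    for category, theta in models.items():
        z = theta[0] + sum(t * v for t, v in zip(theta[1:], features))
        p = sigmoid(z)
        if p > best_p:
            best_category, best_p = category, p
    if best_p > threshold:
        return best_category, best_p
    return None, best_p

# Hypothetical weights for two categories over a 3-element feature vector.
models = {
    "gas accident": [-2.0, 3.0, 0.5, 0.0],
    "roof accident": [-2.0, 0.0, 0.5, 3.0],
}
category, probability = predict_accident_category(models, [1, 0, 0])
```

Returning `None` when no score exceeds the threshold lets the caller distinguish "no accident predicted" from a specific category.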
As shown in fig. 5, the coal mine risk prediction result is obtained as follows: a coal mine risk prediction evaluation result is obtained according to the trained LR (logistic regression) model and the coal mine test set.
As shown in fig. 5, the LR model is trained as follows: coal mine accident data and coal mine non-accident data are acquired, coal mine feature vectors are extracted from these data to serve as the coal mine risk prediction training set, and the LR model, which is used for predicting coal mine risks, is obtained from that training set.
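Evaluating the trained model on the test set could be sketched as below. The source does not name the evaluation metrics, so accuracy, precision, and recall are assumptions, as are the function name and the example labels.

```python
def evaluate(predicted, actual):
    """Return accuracy, precision and recall for 0/1 label lists."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # true positives
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # false positives
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # false negatives
    correct = sum(1 for p, a in pairs if p == a)
    return {
        "accuracy": correct / len(actual) if actual else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical test-set labels: 2 accidents and 10 non-accidents (1:5).
actual    = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = evaluate(predicted, actual)
```

With a 1:5 class ratio, a model that always predicts "no accident" would already reach about 83% accuracy, so recall on the accident class is worth tracking alongside accuracy.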
Example two
As shown in fig. 6, the present application provides a coal mine disaster risk prediction system 100 based on semantic recognition, which includes:
the training data acquisition module 10 is used for acquiring historical coal mine accident data and coal mine non-accident data; extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
the model construction module 20 is used for inputting the coal mine risk prediction training set into the logistic regression model for training to obtain a coal mine risk prediction model;
the real-time data acquisition module 30 is used for acquiring coal mine hidden danger text data and measuring point real-time data;
the risk evaluation data acquisition module 40 is used for extracting coal mine characteristic vectors in coal mine hidden danger text data and measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
and the risk prediction module 50 is used for calling the coal mine risk prediction model, performing risk prediction on the extracted coal mine risk evaluation data and obtaining a coal mine risk prediction evaluation result.
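The cooperation of the five modules can be pictured with the sketch below, in which each module is an injected callable. The class name, method names, and stub modules are hypothetical and serve only to show the data flow between modules 10 to 50.

```python
class RiskPredictionSystem:
    """Wires the five modules together; each module is an injected callable."""

    def __init__(self, load_history, extract_features, train, read_realtime):
        self.load_history = load_history          # data source for module 10
        self.extract_features = extract_features  # shared by modules 10 and 40
        self.train = train                        # module 20
        self.read_realtime = read_realtime        # module 30
        self.model = None

    def build_model(self):
        """Modules 10 and 20: build the training set, then train the model."""
        training_set = [(self.extract_features(rec), label)
                        for rec, label in self.load_history()]
        self.model = self.train(training_set)
        return self.model

    def predict(self):
        """Modules 30, 40 and 50: read live data, extract features, predict."""
        if self.model is None:
            raise RuntimeError("call build_model() first")
        features = self.extract_features(self.read_realtime())
        return self.model(features)

# Stub modules, purely for illustration.
system = RiskPredictionSystem(
    load_history=lambda: [({"gas": 1}, 1), ({"gas": 0}, 0)],
    extract_features=lambda rec: [rec["gas"]],
    train=lambda data: (lambda x: "high risk" if x[0] == 1 else "low risk"),
    read_realtime=lambda: {"gas": 1},
)
system.build_model()
result = system.predict()
```

Sharing one feature extraction callable between training and prediction mirrors the way the same coal mine characteristic extraction model is used by modules 10 and 40.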
The beneficial effects achieved by this application are as follows:
(1) This application makes full use of the collected coal mine information data, which include hidden danger text data and measuring point real-time data. A semantic recognition method is used to obtain the coal mine text hidden danger vector, a statistical method is used to obtain the coal mine measuring point hidden danger vector, and the two vectors are fused to obtain the coal mine feature vector. Because the fusion of coal mine text data and measuring point data is fully considered, the accuracy of coal mine risk prediction is improved and the probability of coal mine safety production accidents is reduced.
(2) This application establishes an association between hidden danger data and accident data. With this association, the dynamic risk distribution of coal mine accidents and the relationships among hidden dangers can be better analyzed and mined, providing data support for coal mines to reduce accident risk and respond to major risks in time. Specifically, this application makes full use of the collected accident data, which include gas accidents, water damage accidents, coal dust accidents, roof accidents and the like, and combines them with the coal mine feature vectors so that a logistic regression model predicts coal mine accident risk in real time; the resulting predictions are of great significance for coal mine risk prevention.
The above description is only an embodiment of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.
Claims (10)
1. A coal mine disaster risk prediction method based on semantic recognition is characterized by comprising the following steps:
acquiring historical coal mine accident data and coal mine non-accident data;
extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
inputting the coal mine risk prediction training set into a logistic regression model for training to obtain a coal mine risk prediction model;
acquiring coal mine hidden danger text data and measuring point real-time data;
extracting coal mine characteristic vectors in the coal mine hidden danger text data and the measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
and calling a coal mine risk prediction model, and performing risk prediction on the extracted coal mine risk evaluation data to obtain a coal mine risk prediction evaluation result.
2. A coal mine disaster risk prediction method based on semantic recognition as claimed in claim 1, wherein the method for pre-constructing a coal mine feature extraction model comprises the following sub-steps:
acquiring a hidden danger text vector training set;
calling a preset classification model, and training a hidden danger text vector training set to obtain a text hidden danger vector corresponding to the hidden danger text;
and acquiring a measuring point hidden danger vector, and combining the measuring point hidden danger vector and the text hidden danger vector to obtain a coal mine characteristic vector.
3. The coal mine disaster risk prediction method based on semantic recognition as recited in claim 2, wherein the coal mine feature extraction model extracts coal mine feature vectors, and the coal mine feature vectors comprise measuring point hidden danger vectors and text hidden danger vectors.
4. The coal mine disaster risk prediction method based on semantic recognition as recited in claim 2, wherein the method for obtaining the text hidden danger vector training set comprises the following sub-steps:
acquiring a hidden danger text training set labeled manually;
preprocessing a hidden danger text training set;
and vectorizing the preprocessed hidden danger text training set to obtain a hidden danger text vector training set.
5. The coal mine disaster risk prediction method based on semantic recognition as recited in claim 2, wherein the method for obtaining the hidden danger vector of the measuring point is as follows:
acquiring real-time data of coal mine measuring points;
and judging the real-time data of the measuring points, and taking the category corresponding to the risk hidden danger of the measuring points as a measuring point hidden danger vector.
6. A coal mine disaster risk prediction method based on semantic recognition as claimed in claim 1, wherein the method for obtaining coal mine risk prediction training set comprises the following steps:
acquiring historical coal mine accident data corresponding to each accident category; acquiring coal mine non-accident data;
extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
wherein the accident category includes: gas accidents, water damage accidents, coal dust accidents, and roof accidents.
7. A coal mine disaster risk prediction method based on semantic recognition as claimed in claim 1, wherein the coal mine accident data comprise coal mine text hidden danger data and measuring point hidden danger data, the coal mine text hidden danger data being used for extracting text hidden danger vectors and the measuring point hidden danger data being used for extracting measuring point hidden danger vectors.
8. The coal mine disaster risk prediction method based on semantic recognition as recited in claim 1, wherein the weights of the coal mine feature vectors are learned by using a logistic regression model to optimize the weights of the coal mine feature vectors.
9. A coal mine disaster risk prediction method based on semantic recognition according to claim 1, characterized in that coal mine risk evaluation data is inputted into a coal mine risk prediction model; and outputting the predicted coal mine accident category by the coal mine risk prediction model.
10. A coal mine disaster risk prediction system based on semantic recognition is characterized by comprising:
the training data acquisition module is used for acquiring historical coal mine accident data and coal mine non-accident data; extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
the model construction module is used for inputting the coal mine risk prediction training set into the logistic regression model for training to obtain a coal mine risk prediction model;
the real-time data acquisition module is used for acquiring coal mine hidden danger text data and measuring point real-time data;
the risk evaluation data acquisition module is used for extracting coal mine characteristic vectors in coal mine hidden danger text data and measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
and the risk prediction module is used for calling the coal mine risk prediction model, performing risk prediction on the extracted coal mine risk evaluation data and acquiring a coal mine risk prediction evaluation result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111580046.1A CN114386429A (en) | 2021-12-22 | 2021-12-22 | Coal mine disaster risk prediction method and system based on semantic recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114386429A true CN114386429A (en) | 2022-04-22 |
Family
ID=81198599
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111580046.1A Pending CN114386429A (en) | 2021-12-22 | 2021-12-22 | Coal mine disaster risk prediction method and system based on semantic recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114386429A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115984193A (en) * | 2022-12-15 | 2023-04-18 | 东北林业大学 | PDL1 expression level detection method fusing histopathology image and CT image |
CN116070725A (en) * | 2022-08-29 | 2023-05-05 | 山东科技大学 | Mining pressure risk prediction method based on logistic regression |
CN116167532A (en) * | 2023-04-26 | 2023-05-26 | 中国安全生产科学研究院 | System optimization method for coal mine industry illegal production behavior prediction system |
CN117422581A (en) * | 2023-11-01 | 2024-01-19 | 中国地质科学院矿产资源研究所 | Mineral resource safety monitoring and early warning method, system, equipment and medium |
CN117726181A (en) * | 2024-02-06 | 2024-03-19 | 山东科技大学 | Collaborative fusion and hierarchical prediction method for typical disaster risk heterogeneous information of coal mine |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116070725A (en) * | 2022-08-29 | 2023-05-05 | 山东科技大学 | Mining pressure risk prediction method based on logistic regression |
CN115984193A (en) * | 2022-12-15 | 2023-04-18 | 东北林业大学 | PDL1 expression level detection method fusing histopathology image and CT image |
CN116167532A (en) * | 2023-04-26 | 2023-05-26 | 中国安全生产科学研究院 | System optimization method for coal mine industry illegal production behavior prediction system |
CN116167532B (en) * | 2023-04-26 | 2023-09-05 | 中国安全生产科学研究院 | System optimization method for coal mine industry illegal production behavior prediction system |
CN117422581A (en) * | 2023-11-01 | 2024-01-19 | 中国地质科学院矿产资源研究所 | Mineral resource safety monitoring and early warning method, system, equipment and medium |
CN117726181A (en) * | 2024-02-06 | 2024-03-19 | 山东科技大学 | Collaborative fusion and hierarchical prediction method for typical disaster risk heterogeneous information of coal mine |
CN117726181B (en) * | 2024-02-06 | 2024-04-30 | 山东科技大学 | Collaborative fusion and hierarchical prediction method for typical disaster risk heterogeneous information of coal mine |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114386429A (en) | Coal mine disaster risk prediction method and system based on semantic recognition | |
Zhong et al. | Hazard analysis: A deep learning and text mining framework for accident prevention | |
CN113254594B (en) | Smart power plant-oriented safety knowledge graph construction method and system | |
Lin et al. | Understanding on-site inspection of construction projects based on keyword extraction and topic modeling | |
CN113095050A (en) | Intelligent ticketing method, system, equipment and storage medium | |
CN112348662B (en) | Risk assessment method and device based on user occupation prediction and electronic equipment | |
CN115544272A (en) | Attention mechanism-based chemical accident cause knowledge graph construction method | |
CN111191452A (en) | Railway text named entity recognition method and device | |
CN112257425A (en) | Power data analysis method and system based on data classification model | |
Wang et al. | Automatic frequency estimation of contributory factors for confined space accidents | |
Luo et al. | A correlation analysis of construction site fall accidents based on text mining | |
Fu et al. | Towards system-theoretic risk management for maritime transportation systems: A case study of the yangtze river estuary | |
Luo et al. | Convolutional neural network algorithm–based novel automatic text classification framework for construction accident reports | |
KR20200001936A (en) | Method for classifying safety document on construction site and Server for performing the same | |
Gangadhari et al. | Application of rough set theory and machine learning algorithms in predicting accident outcomes in the Indian petroleum industry | |
Rupasinghe et al. | Understanding construction site safety hazards through open data: text mining approach | |
CN113901815B (en) | Emergency working condition event detection method based on dam operation log | |
ALawad et al. | Unsupervised machine learning for managing safety accidents in railway stations | |
CN113221556A (en) | Method, device and equipment for identifying potential safety hazard | |
CN112256862A (en) | Data mapping relation establishing method | |
Xu et al. | Identification of construction safety risks based on text mining and LIBSVM method | |
CN117592561B (en) | Enterprise digital operation multidimensional data analysis method and system | |
CN116627915B (en) | Dam emergency working condition event detection method and system based on slot semantic interaction | |
Cao et al. | Identification of causative factors for fatal accidents in the electric power industry using text categorization and catastrophe association analysis techniques | |
CN116109142B (en) | Dangerous waste supervision method, system and device based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||