CN114386429A - Coal mine disaster risk prediction method and system based on semantic recognition - Google Patents

Coal mine disaster risk prediction method and system based on semantic recognition

Info

Publication number
CN114386429A
Authority
CN
China
Prior art keywords
coal mine
data
hidden danger
risk prediction
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111580046.1A
Other languages
Chinese (zh)
Inventor
王鹏
付恩三
田乐逍
陈佳林
疏礼春
王刚
王新会
张倩
汪鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Research Institute Of Emergency Management Department
Original Assignee
Information Research Institute Of Emergency Management Department
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Research Institute Of Emergency Management Department
Priority to CN202111580046.1A
Publication of CN114386429A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02 Agriculture; Fishing; Forestry; Mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Operations Research (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Agronomy & Crop Science (AREA)
  • Animal Husbandry (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Mining & Mineral Resources (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a coal mine disaster risk prediction method and system based on semantic recognition. The method comprises the following steps: extracting coal mine feature data from historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine feature extraction model, to serve as a coal mine risk prediction training set; inputting the coal mine risk prediction training set into a logistic regression model for training to obtain a coal mine risk prediction model; extracting coal mine feature vectors from coal mine hidden danger text information and measuring point data according to the coal mine feature extraction model, to serve as coal mine risk evaluation data; and calling the coal mine risk prediction model to perform risk prediction on the extracted coal mine risk evaluation data. The application fuses coal mine text data with measuring point data, establishes a dynamic association between accident risks and hidden dangers, and tracks hidden danger risks in the coal mine in real time, thereby improving the coal mine safety management level and hidden danger disposal efficiency, blocking key links in the occurrence of accidents, and reducing the risk of coal mine disaster accidents.

Description

Coal mine disaster risk prediction method and system based on semantic recognition
Technical Field
The application relates to the technical field of coal mine safety risk evaluation, in particular to a coal mine disaster risk prediction method and system based on semantic recognition.
Background
In recent years, as coal mines have strengthened their management of standardized construction, the popularization and application of the coal mine "three-in-one" system (integrated management of coal mine risks, hidden dangers and safety standardization) has changed the traditional hidden danger management mechanism and process. With the spread of underground communication networks and explosion-proof mobile terminals, safety management personnel who find risks and hidden dangers during on-site inspection can enter the information into the "three-in-one" system through an explosion-proof mobile terminal. The system automatically transmits the hidden danger information to the safety management department, and each department rectifies the hidden dangers and supervises their handling, so that hidden danger information is managed and disposed of in a closed loop. With the popularization and application of the enterprise "three-in-one" system, a large amount of structured data and unstructured information is uploaded and aggregated, which provides strong support for coal mines to apply natural language processing technology to deeply mine the association relations among accident hidden dangers.
However, the existing coal mine disaster risk evaluation and management method has the following technical problems:
First, the data analyzed are of a single type: analysis is performed mainly on coal mine measuring point data, and the fusion of coal mine text data with measuring point data is not fully considered. Coal mine text data are more complex to analyze than measuring point data: most results for measuring point data can be obtained by statistical analysis, whereas coal mine text data must be analyzed by combining expert experience, manual labeling, and natural language processing methods and models.
Second, at the present stage, coal mine hidden dangers are only weakly associated with accidents. Coal mines currently carry out hidden danger troubleshooting, but no association analysis model has been established between the various hidden dangers and accidents. As a result, the logical relation that dynamically associates accident risks with hidden dangers cannot be realized well, and the degree to which a hidden danger influences the corresponding accident risk cannot be shown dynamically and intuitively when the hidden danger is investigated.
Therefore, how to fully consider the fusion of coal mine text data and measuring point data, establish a dynamic association between accident risks and hidden dangers, improve the coal mine safety management level and hidden danger disposal efficiency, block the key links in the occurrence of accidents, and reduce the risk of coal mine disaster accidents is a technical problem that urgently needs to be solved.
Disclosure of Invention
The application aims to provide a coal mine disaster risk prediction method and system based on semantic recognition that fully consider the fusion of coal mine text data and measuring point data, establish a dynamic association between accident risks and hidden dangers, improve the coal mine safety management level and hidden danger disposal efficiency, block the key links in the occurrence of accidents, and reduce the risk of coal mine disaster accidents.
In order to achieve the purpose, the application provides a coal mine disaster risk prediction method based on semantic recognition, and the method comprises the following steps:
acquiring historical coal mine accident data and coal mine non-accident data;
extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
inputting the coal mine risk prediction training set into a logistic regression model for training to obtain a coal mine risk prediction model;
acquiring coal mine hidden danger text data and measuring point real-time data;
extracting coal mine characteristic vectors in the coal mine hidden danger text data and the measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
and calling a coal mine risk prediction model, and performing risk prediction on the extracted coal mine risk evaluation data to obtain a coal mine risk prediction evaluation result.
In the method described above, constructing the coal mine feature extraction model in advance comprises the following sub-steps:
acquiring a hidden danger text vector training set;
calling a preset classification model and training it on the hidden danger text vector training set to obtain text hidden danger vectors corresponding to the hidden danger texts;
and acquiring measuring point hidden danger vectors, and combining the measuring point hidden danger vectors with the text hidden danger vectors to obtain the coal mine feature extraction model.
The coal mine feature extraction model extracts coal mine feature vectors, each of which comprises a measuring point hidden danger vector and a text hidden danger vector.
In the method described above, acquiring the hidden danger text vector training set comprises the following sub-steps:
acquiring a manually labeled hidden danger text training set;
preprocessing the hidden danger text training set;
and vectorizing the preprocessed hidden danger text training set to obtain the hidden danger text vector training set.
In the method described above, acquiring the measuring point hidden danger vector comprises the following steps:
acquiring real-time data of coal mine measuring points;
and judging the real-time data of the measuring points, and taking the categories that correspond to risk hidden dangers at the measuring points as the measuring point hidden danger vector.
In the method described above, acquiring the coal mine risk prediction training set comprises the following steps:
acquiring historical coal mine accident data corresponding to each accident category; acquiring coal mine non-accident data;
extracting coal mine feature data from the historical coal mine accident data and the coal mine non-accident data according to the coal mine feature extraction model, to serve as the coal mine risk prediction training set;
wherein the accident categories include gas accidents, water damage accidents, coal dust accidents, and roof accidents.
In the method described above, the coal mine accident information includes coal mine hidden danger text data and measuring point hidden danger data, wherein the coal mine hidden danger text data are used to extract text vectors and the measuring point hidden danger data are used to extract measuring point vectors.
In the method described above, a logistic regression model is used to learn and optimize the weights of the coal mine feature vectors.
In the method described above, the coal mine risk evaluation data are input into the coal mine risk prediction model, and the coal mine risk prediction model outputs the predicted coal mine accident category.
The application also provides a coal mine disaster risk prediction system based on semantic recognition, and the system comprises:
the training data acquisition module is used for acquiring historical coal mine accident data and coal mine non-accident data; extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
the model construction module is used for inputting the coal mine risk prediction training set into the logistic regression model for training to obtain a coal mine risk prediction model;
the real-time data acquisition module is used for acquiring coal mine hidden danger text data and measuring point real-time data;
the risk evaluation data acquisition module is used for extracting coal mine characteristic vectors in coal mine hidden danger text data and measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
the risk prediction module is used for calling the coal mine risk prediction model, performing risk prediction on the extracted coal mine risk evaluation data, and acquiring a coal mine risk prediction evaluation result.
The beneficial effects achieved by the application are as follows:
(1) The application makes full use of the collected coal mine information data, which include hidden danger text data and measuring point real-time data. A semantic recognition method is used to obtain coal mine text hidden danger vectors, a statistical method is used to obtain coal mine measuring point hidden danger vectors, and the two types of vectors are fused to obtain coal mine feature vectors. The fusion of coal mine text data and measuring point data is thus fully considered, which improves the accuracy of coal mine risk prediction and reduces the probability of coal mine safety production accidents.
(2) The application establishes an association between hidden danger data and accident data. Once the association is established, the dynamic risk distribution of coal mine accidents and their association with hidden dangers can be better analyzed and mined, which provides data support for coal mines to reduce accident risks and deal with major risks in time. Specifically, the application makes full use of the collected accident data, which cover gas accidents, water damage accidents, coal dust accidents, roof accidents and the like, and combines them with coal mine feature vectors so that a logistic regression model predicts coal mine accident risks in real time and yields prediction results, which is of great significance for coal mine risk prevention.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them.
Fig. 1 is a flowchart of a coal mine disaster risk prediction method based on semantic recognition according to an embodiment of the present application.
Fig. 2 is a flowchart of a method for pre-constructing a coal mine feature extraction model according to an embodiment of the present application.
Fig. 3 is a flowchart of a method for obtaining a hidden danger text vector training set according to an embodiment of the present application.
Fig. 4 is a schematic diagram of a method for obtaining a coal mine feature vector according to an embodiment of the present application.
Fig. 5 is a schematic diagram of a coal mine risk prediction method according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a coal mine disaster risk prediction system based on semantic recognition according to an embodiment of the present application.
Reference numerals: 10-a training data acquisition module; 20-a model building module; 30-a real-time data acquisition module; 50-a risk prediction module; 100-coal mine disaster risk prediction system.
Detailed Description
The technical solutions in the embodiments of the present application are clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Example one
As shown in fig. 1, the present application provides a coal mine disaster risk prediction method based on semantic recognition, which includes the following steps:
Step S1: constructing a coal mine feature extraction model in advance.
As shown in fig. 2, step S1 includes the following sub-steps:
Step S110: acquiring a hidden danger text vector training set.
As shown in fig. 3, step S110 includes the following sub-steps:
Step S111: acquiring a manually labeled hidden danger text training set.
Specifically, the coal mine hidden danger text training set is acquired as follows: text information on coal mine hidden dangers from a recent time period (for example, 60 days) is acquired and manually labeled with categories.
The categories are those that easily cause coal mine disasters and are divided by experts according to experience. They include: ignition source, insufficient air volume, abnormal mine water inflow, sump not cleaned, water pump failure, pipeline failure, dust suppression failure, coal dust generation, untimely support, abnormal support work, support failure or reduced support effect, and other hidden danger descriptions. A category label is set for each category; the label may be a number, for example 0, 1, 2, 3, and so on.
Step S112: preprocessing the hidden danger text training set.
Specifically, the hidden danger text training set is preprocessed by performing word segmentation and stop-word removal on it.
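A minimal sketch of the preprocessing in step S112, assuming the text is already split into whitespace-separated words. Real Chinese hidden danger text would need a dedicated word segmenter, which is not shown here; the stop-word list and the function name `preprocess` are hypothetical and do not come from the application.

```python
# Hypothetical stop-word list; a real system would load a curated list.
STOP_WORDS = {"the", "is", "of", "in", "a", "and"}


def preprocess(text, stop_words=STOP_WORDS):
    """Split a hidden danger description into tokens and drop stop words.

    Whitespace splitting stands in for a real word segmenter (assumption).
    """
    tokens = text.lower().split()
    return [token for token in tokens if token not in stop_words]


tokens = preprocess("The sump is not cleaned")
```

The resulting token lists are the input that the vectorization step expects.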
Step S113: vectorizing the preprocessed hidden danger text training set to obtain a hidden danger text vector training set.
The preprocessed hidden danger text training set is vectorized as follows: the text-type data in the hidden danger text training set are computed with the TF-IDF (term frequency-inverse document frequency) method, which vectorizes the text-type data.
Specifically, TF-IDF is calculated as follows:
TF-IDF = TF × IDF;
$$tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
$$idf_{i} = \log \frac{|D|}{|\{ j : t_{i} \in d_{j} \}|}$$
where TF denotes the term frequency, that is, the frequency with which the entry t occurs in a document, and IDF denotes the inverse text frequency index. The main idea of IDF is that the fewer documents contain the entry t (that is, the smaller the number n of times the keyword occurs in the hidden danger texts), the larger the IDF, which shows that the entry t has good category distinguishing capability. $tf_{i,j}$ denotes the term frequency of the i-th word in the j-th document; $n_{i,j}$ denotes the number of times the i-th word appears in the j-th document; $\sum_{k} n_{k,j}$ denotes the total number of occurrences of all words in the j-th document; $idf_{i}$ denotes the inverse text frequency index of the i-th word; D denotes the documents and $d_{j}$ the j-th document; |D| denotes the total number of documents; and $|\{ j : t_{i} \in d_{j} \}|$ denotes the number of documents that contain the word $t_{i}$.
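The TF-IDF weighting described above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function name `tf_idf`, the natural logarithm, and the toy documents are assumptions, and the application does not state whether any smoothing term is used.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = sum(counts.values())
        weights.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already segmented hidden danger records.
docs = [
    ["gas", "sensor", "alarm"],
    ["water", "pump", "fault"],
    ["gas", "ventilation", "insufficient"],
]
vectors = tf_idf(docs)
```

In this toy input, "gas" appears in two of the three records and therefore receives a lower weight than "sensor", which appears in only one.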
Step S120: calling a preset classification model and training it on the hidden danger text vector training set to obtain text hidden danger vectors corresponding to the hidden danger texts.
Specifically, the hidden danger text vector training set is divided into a training set and a test set. The training set is used to train a text hidden danger classification model, and the test set is used to test that model and optimize the text hidden danger vector, yielding the optimized text hidden danger vector. The text hidden danger vector comprises hidden danger features of a plurality of categories.
Preferably, the feature vectors and category labels in the hidden danger text vector training set are divided into a training set and a test set at a ratio of 8:2.
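The 8:2 division can be sketched as a shuffled split. The shuffling, the fixed seed, and the function name `split_train_test` are assumptions made for illustration; the application only specifies the ratio.

```python
import random


def split_train_test(features, labels, train_ratio=0.8, seed=0):
    """Shuffle paired features and labels, then split them by train_ratio."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)  # seeded for repeatability
    cut = int(len(indices) * train_ratio)
    train = [(features[i], labels[i]) for i in indices[:cut]]
    test = [(features[i], labels[i]) for i in indices[cut:]]
    return train, test


# Ten hypothetical samples: eight go to training, two to testing.
train_set, test_set = split_train_test(list(range(10)), list(range(10)))
```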
The text hidden danger classification model is trained as follows: binary SVM classifiers are used for classification. For a multi-class problem, a one-versus-rest strategy can be adopted: each category in turn is treated as one class and all other categories as the other class, so that U binary SVM classifiers (one per category) need to be constructed.
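The one-versus-rest relabeling can be sketched as follows. Only the label bookkeeping is shown, not the SVM training itself. The +1/-1 encoding and the rule of picking the category whose binary classifier returns the largest decision value are common conventions assumed here, not details stated in the application.

```python
def one_vs_rest_labels(labels):
    """Build one +1/-1 label list per category (that category versus all others)."""
    categories = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in categories}


def predict_one_vs_rest(decision_values):
    """Pick the category whose binary classifier gives the largest decision value."""
    return max(decision_values, key=decision_values.get)


binary_tasks = one_vs_rest_labels([0, 1, 2, 1])
# Hypothetical decision values from three trained binary classifiers.
predicted = predict_one_vs_rest({0: -0.3, 1: 0.8, 2: 0.1})
```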
The optimization problem of solving the optimal classification hyperplane is transformed into the following minimization problem using a kernel function.
The minimization problem is:
$$\min_{\alpha} \; \frac{1}{2} \sum_{p=1}^{l} \sum_{q=1}^{l} y_{p} y_{q} \alpha_{p} \alpha_{q} K(X_{p}, X_{q}) - \sum_{k=1}^{l} \alpha_{k}$$
$$\text{s.t.} \quad y^{T} \alpha = 0, \quad 0 \le \alpha_{p} \le C, \quad p = 1, 2, \ldots, l$$
where K denotes the kernel function; X denotes a training feature; C denotes the penalty factor, which prevents overfitting; l denotes the total number of training features; n denotes the number of times the keyword occurs in the hidden danger text; y denotes the category labels; T denotes transposition; s.t. means "subject to"; p, q and k are indices; $\alpha_{p}$ and $\alpha_{q}$ are parameters selected in the range 0 to C; $y_{p}$ denotes the p-th category label; $y_{q}$ denotes the q-th category label; $X_{p}$ denotes the p-th training feature; $X_{q}$ denotes the q-th training feature; and $\alpha = [\alpha_{1}, \alpha_{2}, \ldots, \alpha_{l}]$.
The kernel function realizes the conversion from the low-dimensional space to the high-dimensional space. Commonly used kernel functions include the RBF kernel, the linear kernel, the polynomial kernel and the Sigmoid kernel.
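The four kernels named above can be written in their usual textbook forms. The parameter names `gamma`, `coef0` and `degree` and their default values are assumptions chosen for illustration; the application does not specify which kernel or parameters are used.

```python
import math


def linear_kernel(x, z):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, z> + coef0) raised to the given degree."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """exp(-gamma * squared Euclidean distance between x and z)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    """tanh(gamma * <x, z> + coef0)."""
    return math.tanh(gamma * linear_kernel(x, z) + coef0)
```

Each function plays the role of K in the minimization problem: it returns a similarity value for a pair of training features without building the high-dimensional mapping explicitly.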
Step S130: acquiring measuring point hidden danger vectors, and combining the measuring point hidden danger vectors with the text hidden danger vectors to obtain the coal mine feature extraction model.
Specifically, the measuring point hidden danger vector is acquired as follows:
Real-time data of coal mine measuring points are acquired, including gas, oxygen, coal dust, wind speed, mine pressure and the like. Statistical judgment is performed on each item of measuring point real-time data to determine whether it corresponds to a high risk. If an item of measuring point real-time data corresponds to a high risk, that is, it constitutes a risk hidden danger, the category corresponding to that data is included in the measuring point hidden danger vector; otherwise, the category is not included.
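One possible reading of this judgment is a per-category rule that flags a reading as high risk. In the sketch below every threshold is a made-up placeholder, not a regulatory limit or a value from the application, and the 0/1 encoding of the measuring point hidden danger vector is likewise an assumption.

```python
# Hypothetical high-risk rules; real limits must come from the applicable
# safety regulations and are NOT given in the application.
HIGH_RISK_RULES = {
    "gas": lambda value: value >= 1.0,
    "oxygen": lambda value: value < 18.0,
    "coal_dust": lambda value: value >= 10.0,
    "wind_speed": lambda value: value < 0.25,
    "mine_pressure": lambda value: value >= 40.0,
}
CATEGORIES = list(HIGH_RISK_RULES)


def measuring_point_vector(readings):
    """Mark each category 1 if its reading is judged high risk, else 0."""
    return [
        1 if name in readings and HIGH_RISK_RULES[name](readings[name]) else 0
        for name in CATEGORIES
    ]


point_vector = measuring_point_vector(
    {"gas": 1.2, "oxygen": 20.5, "wind_speed": 0.1}
)
```

Categories without a reading are treated as not high risk here, which is a further simplification.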
Specifically, the measuring point hidden danger vectors and the text hidden danger vectors are combined to obtain the coal mine feature extraction model, which is used to extract coal mine feature vectors.
Extracting a coal mine feature vector with the coal mine feature extraction model means extracting a measuring point hidden danger vector and a text hidden danger vector. Specifically, when the data being processed contain data consistent with a measuring point hidden danger vector or a text hidden danger vector in the coal mine feature extraction model, the consistent data are extracted as the coal mine feature vector.
Fig. 4 is a flow chart of the method for obtaining a coal mine feature vector. The method comprises: acquiring a text hidden danger vector and a measuring point hidden danger vector, and combining them into a coal mine feature vector.
As shown in fig. 4, the text hidden danger vector is acquired as follows: a manually labeled hidden danger text training set is acquired; word segmentation and stop-word removal are performed on it to obtain a preprocessed hidden danger text training set; the preprocessed set is vectorized to obtain a hidden danger text vector training set; an SVM classification model is trained on the hidden danger text vector training set; and the text hidden danger vector is obtained from the hidden danger text test set and the trained SVM classification model.
The measuring point hidden danger vector is acquired as follows: statistical analysis is performed on the real-time data of the measuring points (gas, oxygen, coal dust, wind speed, mine pressure and the like) to obtain the measuring point hidden danger vector, which covers the items prone to high risk.
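The combination of the two vectors can be sketched as a simple concatenation of indicator vectors. The application does not state how the vectors are encoded or merged, so the 0/1 encoding, the concatenation, and the category identifiers below are all assumptions.

```python
# Hypothetical category lists used only for this illustration.
TEXT_CATEGORIES = ["ignition_source", "insufficient_air_volume", "water_pump_failure"]
POINT_CATEGORIES = ["gas", "oxygen", "coal_dust", "wind_speed", "mine_pressure"]


def indicator_vector(active, categories):
    """Return a 0/1 list marking which categories are active."""
    active = set(active)
    return [1 if name in active else 0 for name in categories]


def coal_mine_feature_vector(text_hazards, point_hazards):
    """Concatenate the text and measuring point hidden danger indicators."""
    return (indicator_vector(text_hazards, TEXT_CATEGORIES)
            + indicator_vector(point_hazards, POINT_CATEGORIES))


features = coal_mine_feature_vector(
    ["insufficient_air_volume"], ["gas", "wind_speed"]
)
```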
Step S2: acquiring historical coal mine accident data and coal mine non-accident data, and extracting coal mine feature data from them according to the coal mine feature extraction model, to serve as a coal mine risk prediction training set.
Specifically, historical coal mine accident information corresponding to each accident category is acquired, coal mine non-accident data are acquired, and coal mine feature data are extracted from the historical coal mine accident data and the coal mine non-accident data according to the coal mine feature extraction model, to serve as the coal mine risk prediction training set.
The accident categories include gas accidents, water damage accidents, coal dust accidents, roof accidents and the like.
Step S2 includes the following sub-steps:
Step S210: for each accident category, acquiring coal mine accident information within a period of time (for example, the last half year).
The coal mine accident information includes coal mine hidden danger text data and measuring point hidden danger data, wherein the coal mine hidden danger text data are used to extract the text hidden danger category and the measuring point hidden danger data are used to extract the measuring point hidden danger category.
Step S220: extracting coal mine non-accident data from the historical coal mine data in proportion.
Negative examples are randomly drawn from the coal mine non-accident data at a positive-to-negative ratio of 1:5, where positive examples are coal mine accident data and negative examples are coal mine non-accident data. In other words, coal mine non-accident data are extracted so that the data volume of coal mine accident data to coal mine non-accident data is 1:5.
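The 1:5 sampling can be sketched with the standard library. Sampling without replacement, the fixed seed, and the function name `sample_negatives` are assumptions; the application only fixes the ratio.

```python
import random


def sample_negatives(accident_records, non_accident_records, ratio=5, seed=0):
    """Randomly draw `ratio` non-accident records per accident record."""
    wanted = ratio * len(accident_records)
    if wanted > len(non_accident_records):
        raise ValueError("not enough non-accident records for the requested ratio")
    # Sampling without replacement, seeded for repeatability.
    return random.Random(seed).sample(non_accident_records, wanted)


# Hypothetical record identifiers.
positives = ["accident_1", "accident_2"]
candidates = [f"normal_{i}" for i in range(20)]
negatives = sample_negatives(positives, candidates)
```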
Step S230: extracting the coal mine feature data from the coal mine accident data and the coal mine non-accident data with the coal mine feature extraction model, and obtaining the coal mine risk prediction training set from the coal mine feature data.
The coal mine feature data comprise a text hidden danger category and a measuring point hidden danger category. The coal mine feature data extracted from the coal mine accident data and the coal mine non-accident data are divided into a coal mine risk prediction training set and a coal mine risk prediction test set. The coal mine risk prediction training set is used to train the coal mine risk prediction model, and the coal mine risk prediction test set is used to test and optimize it.
Step S3: inputting the coal mine risk prediction training set into a logistic regression model for training to obtain a coal mine risk prediction model.
The basic logistic regression model is a linear regression model normalized by a Sigmoid function (logistic function), and is a machine learning method for solving binary classification problems.
The hypothesis function of the logistic regression model has the following form:
$$h_{\theta}(x) = \frac{1}{1 + e^{-\theta^{T} x}}$$
where $h_{\theta}(x)$ denotes the hypothesis function; x is the input; θ is the parameter to be solved; and e ≈ 2.718.
The assumption made by the logistic regression model is as follows:
$$P(y = 1 \mid x; \theta) = g(\theta^{T} x) = \frac{1}{1 + e^{-\theta^{T} x}}$$
where g(·) denotes the logistic function; $P(y = 1 \mid x; \theta)$ denotes the probability, computed with the selected parameter θ, that the output variable y equals 1; and T denotes transposition.
The decision function corresponding to the logistic regression model is:
$$y^{*} = 1, \quad \text{if } P(y = 1 \mid x) > 0.5$$
where $y^{*}$ denotes the decision function; x is the input; and y is the output variable.
In logistic regression models, the most common cost function is cross entropy, and the most common method for optimizing the parameters is gradient descent.
The logistic regression model is used to learn and optimize the weights of the coal mine feature vector, yielding the coal mine risk prediction model and a better prediction effect.
Specifically, the weights of the coal mine feature vector are taken as the parameter θ and the coal mine feature vector as the input; the parameter θ that satisfies the decision function is obtained and used as the optimized weights of the coal mine feature vector. The optimized weights serve as the weight parameters of the coal mine risk prediction model, so that coal mine risk is predicted better.
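The training described in step S3 can be sketched as batch gradient descent on the cross-entropy cost with a Sigmoid hypothesis. The learning rate, the number of epochs, the leading constant 1 used as a bias term, and the toy data are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_proba(theta, x):
    """Probability that y = 1 given input x and weights theta."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def cross_entropy(theta, X, y):
    """Mean cross-entropy cost over the data set."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(theta, xi), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(X)


def train(X, y, learning_rate=0.5, epochs=2000):
    """Learn theta by batch gradient descent on the cross-entropy cost."""
    theta = [0.0] * len(X[0])
    for _ in range(epochs):
        gradient = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            error = predict_proba(theta, xi) - yi
            for j, xij in enumerate(xi):
                gradient[j] += error * xij
        theta = [t - learning_rate * g / len(X) for t, g in zip(theta, gradient)]
    return theta


def decide(theta, x):
    """Decision function: 1 if P(y = 1 | x) > 0.5, otherwise 0."""
    return 1 if predict_proba(theta, x) > 0.5 else 0


# Toy data: a constant 1 (bias) followed by two hidden danger indicators.
X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [1, 0, 0], [1, 0, 0], [1, 0, 0]]
y = [1, 1, 1, 0, 0, 0]
theta = train(X, y)
```

The learned `theta` plays the role of the optimized feature weights, and `decide` applies the 0.5 threshold of the decision function to new coal mine risk evaluation data.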
And step S4, acquiring coal mine hidden danger text data and measuring point real-time data.
And step S5, extracting the coal mine feature vectors from the coal mine hidden danger text data and the measuring point real-time data according to the coal mine feature extraction model, to serve as coal mine risk evaluation data.
In other words, the potential hazard vector of the measuring point and the potential hazard vector of the text are extracted to serve as coal mine risk evaluation data.
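One possible way to build the evaluation data is sketched below. It assumes that both hidden danger vectors are indicator vectors over the same category list, that a measuring point counts as hazardous when a reading exceeds a fixed threshold, and that the two vectors are combined by concatenation. The category names, threshold values, and function names are hypothetical; the source does not specify them.

```python
# Assumed category set, borrowed from the accident categories named in the document.
HAZARD_CATEGORIES = ["gas", "water", "coal_dust", "roof"]

# Hypothetical alarm thresholds per measuring point type; not taken from the source.
THRESHOLDS = {"gas": 1.0, "water": 50.0, "coal_dust": 10.0, "roof": 30.0}

def measuring_point_vector(readings):
    """Set a category to 1 when any of its readings exceeds the threshold."""
    return [
        1 if any(v > THRESHOLDS[c] for v in readings.get(c, [])) else 0
        for c in HAZARD_CATEGORIES
    ]

def text_hazard_vector(predicted_categories):
    """Indicator vector of the categories a text classifier assigned to a report."""
    found = set(predicted_categories)
    return [1 if c in found else 0 for c in HAZARD_CATEGORIES]

def coal_mine_feature_vector(readings, predicted_categories):
    """Combine both vectors by concatenation (an assumed fusion rule)."""
    return measuring_point_vector(readings) + text_hazard_vector(predicted_categories)

readings = {"gas": [0.4, 1.3], "water": [12.0]}
features = coal_mine_feature_vector(readings, ["roof"])
# features == [1, 0, 0, 0, 0, 0, 0, 1]
```

Concatenation keeps the measuring point part and the text part in separate positions, so the logistic regression model can learn a separate weight for each.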
And step S6, calling a coal mine risk prediction model, performing risk prediction on the extracted coal mine risk evaluation data, and obtaining a coal mine risk prediction result.
Step S6 includes the following sub-steps:
and step S610, inputting coal mine risk evaluation data into a coal mine risk prediction model.
And S620, outputting the predicted coal mine accident category by the coal mine risk prediction model. Therefore, corresponding preventive measures are taken according to the predicted coal mine accident category.
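The document describes logistic regression as a binary method but outputs an accident category, and it does not say how the two are reconciled. A one-vs-rest arrangement is one plausible reading, sketched here purely as an assumption: one weight vector per accident category, with the highest-scoring category reported when its probability exceeds 0.5. The weights, the `no_accident` label, and the function name are hypothetical.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_category(models, features, threshold=0.5):
    """Score each category's binary model and return (category, scores)."""
    scores = {
        category: sigmoid(sum(t * x for t, x in zip(theta, features)))
        for category, theta in models.items()
    }
    best = max(scores, key=scores.get)
    label = best if scores[best] > threshold else "no_accident"
    return label, scores

# Hypothetical trained weights; the first feature is a constant bias input.
models = {
    "gas": [-1.0, 3.0, 0.0],
    "roof": [-1.0, 0.0, 3.0],
}
label, scores = predict_category(models, [1, 1, 0])
# label == "gas"
```

With no hazard indicators set (features `[1, 0, 0]`), both scores fall below the threshold and the sketch reports `no_accident`.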
As shown in fig. 5, the method for obtaining the coal mine risk prediction result includes: and obtaining a coal mine risk prediction evaluation result according to the trained LR logistic regression model and the coal mine test set.
As shown in fig. 5, the training method of the LR logistic regression model is: the method comprises the steps of obtaining coal mine accident data and coal mine non-accident data, extracting coal mine feature vectors in the coal mine accident data to serve as a coal mine risk prediction training set, and obtaining an LR logistic regression model according to the coal mine risk prediction training set, wherein the LR logistic regression model is used for predicting coal mine risks.
Example two
As shown in fig. 6, the present application provides a coal mine disaster risk prediction system 100 based on semantic recognition, which includes:
the training data acquisition module 10 is used for acquiring historical coal mine accident data and coal mine non-accident data; extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
the model construction module 20 is used for inputting the coal mine risk prediction training set into the logistic regression model for training to obtain a coal mine risk prediction model;
the real-time data acquisition module 30 is used for acquiring coal mine hidden danger text data and measuring point real-time data;
the risk evaluation data acquisition module 40 is used for extracting coal mine characteristic vectors in coal mine hidden danger text data and measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
and the risk prediction module 50 is used for calling the coal mine risk prediction model, performing risk prediction on the extracted coal mine risk evaluation data and obtaining a coal mine risk prediction evaluation result.
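The module structure can be pictured as a small pipeline object. The sketch below is not the system of the application; the class name, record layout, and stand-in callables are illustrative assumptions meant only to show how modules 10 to 50 hand data to one another.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
Model = Callable[[Vector], float]

@dataclass
class RiskPredictionSystem:
    # Feature extraction model shared by modules 10 and 40.
    extract_features: Callable[[dict], Vector]
    # Training routine used by model construction module 20.
    train_model: Callable[[Sequence[Tuple[Vector, int]]], Model]
    model: Optional[Model] = None

    def build(self, historical_records):
        """Modules 10 and 20: build the training set, then train the model."""
        training_set = [(self.extract_features(r), r["label"]) for r in historical_records]
        self.model = self.train_model(training_set)

    def predict(self, live_record):
        """Modules 30 to 50: extract evaluation data and run the trained model."""
        if self.model is None:
            raise RuntimeError("call build() before predict()")
        return self.model(self.extract_features(live_record))

# Stand-in trainer: ignores the input and returns the share of accident samples.
def base_rate_trainer(training_set):
    rate = sum(label for _, label in training_set) / len(training_set)
    return lambda features: rate

system = RiskPredictionSystem(
    extract_features=lambda record: record["features"],
    train_model=base_rate_trainer,
)
system.build([{"features": [1, 0], "label": 1}, {"features": [0, 0], "label": 0}])
risk = system.predict({"features": [1, 1]})
```

In practice the stand-in trainer would be replaced by a logistic regression training routine and the extractor by the coal mine feature extraction model.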
The beneficial effects achieved by the present application are as follows:
(1) The present application makes full use of the collected coal mine information data, which comprises hidden danger text data and measuring point real-time data. A semantic recognition method is used to obtain the coal mine text hidden danger vector, and a statistical method is used to obtain the coal mine measuring point hidden danger vector; the two types of vectors are fused to obtain the coal mine feature vector. Because the fusion of coal mine text data and measuring point data is fully considered, the accuracy of coal mine risk prediction is improved and the probability of coal mine safety production accidents is reduced.
(2) The present application establishes an association between hidden danger data and accident data. Once the association is established, the dynamic risk distribution of coal mine accidents and the association among hidden dangers can be better analyzed and mined, which provides data support for coal mines to reduce accident risk and respond to major risks in time. Specifically, the present application makes full use of the collected accident data, which covers gas accidents, water damage accidents, coal dust accidents, roof accidents and the like; combined with the coal mine feature vectors, a logistic regression model is used to predict coal mine accident risk in real time and obtain a prediction result, which is of great significance for coal mine risk prevention.
The above description is only an embodiment of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (10)

1. A coal mine disaster risk prediction method based on semantic recognition is characterized by comprising the following steps:
acquiring historical coal mine accident data and coal mine non-accident data;
extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
inputting the coal mine risk prediction training set into a logistic regression model for training to obtain a coal mine risk prediction model;
acquiring coal mine hidden danger text data and measuring point real-time data;
extracting coal mine characteristic vectors in the coal mine hidden danger text data and the measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
and calling a coal mine risk prediction model, and performing risk prediction on the extracted coal mine risk evaluation data to obtain a coal mine risk prediction evaluation result.
2. A coal mine disaster risk prediction method based on semantic recognition as claimed in claim 1, wherein the method for pre-constructing a coal mine feature extraction model comprises the following sub-steps:
acquiring a hidden danger text vector training set;
calling a preset classification model, and training a hidden danger text vector training set to obtain a text hidden danger vector corresponding to the hidden danger text;
and acquiring a measuring point hidden danger vector, and combining the measuring point hidden danger vector and the text hidden danger vector to obtain a coal mine characteristic vector.
3. The coal mine disaster risk prediction method based on semantic recognition as recited in claim 2, wherein the coal mine feature extraction model extracts coal mine feature vectors, and the coal mine feature vectors comprise measuring point hidden danger vectors and text hidden danger vectors.
4. The coal mine disaster risk prediction method based on semantic recognition as recited in claim 2, wherein the method for obtaining the text hidden danger vector training set comprises the following sub-steps:
acquiring a hidden danger text training set labeled manually;
preprocessing a hidden danger text training set;
and vectorizing the preprocessed hidden danger text training set to obtain a hidden danger text vector training set.
5. The coal mine disaster risk prediction method based on semantic recognition as recited in claim 2, wherein the method for obtaining the hidden danger vector of the measuring point is as follows:
acquiring real-time data of coal mine measuring points;
and judging the real-time data of the measuring points, and taking the category corresponding to the risk hidden danger of the measuring points as a measuring point hidden danger vector.
6. A coal mine disaster risk prediction method based on semantic recognition as claimed in claim 1, wherein the method for obtaining coal mine risk prediction training set comprises the following steps:
acquiring historical coal mine accident data corresponding to each accident category; acquiring coal mine non-accident data;
extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
wherein the accident category includes: gas accidents, water damage accidents, coal dust accidents, and roof accidents.
7. A coal mine disaster risk prediction method based on semantic recognition as claimed in claim 1, wherein the coal mine accident data comprises coal mine text hidden danger data and measuring point hidden danger data, wherein the coal mine text hidden danger data is used for extracting text hidden danger vectors, and the measuring point hidden danger data is used for extracting measuring point hidden danger vectors.
8. The coal mine disaster risk prediction method based on semantic recognition as recited in claim 1, wherein the weights of the coal mine feature vectors are learned by using a logistic regression model to optimize the weights of the coal mine feature vectors.
9. A coal mine disaster risk prediction method based on semantic recognition according to claim 1, characterized in that coal mine risk evaluation data is inputted into a coal mine risk prediction model; and outputting the predicted coal mine accident category by the coal mine risk prediction model.
10. A coal mine disaster risk prediction system based on semantic recognition is characterized by comprising:
the training data acquisition module is used for acquiring historical coal mine accident data and coal mine non-accident data; extracting coal mine characteristic data in historical coal mine accident data and coal mine non-accident data according to a pre-constructed coal mine characteristic extraction model to serve as a coal mine risk prediction training set;
the model construction module is used for inputting the coal mine risk prediction training set into the logistic regression model for training to obtain a coal mine risk prediction model;
the real-time data acquisition module is used for acquiring coal mine hidden danger text data and measuring point real-time data;
the risk evaluation data acquisition module is used for extracting coal mine characteristic vectors in coal mine hidden danger text data and measuring point real-time data according to a coal mine characteristic extraction model to serve as coal mine risk evaluation data;
and the risk prediction module is used for calling the coal mine risk prediction model, performing risk prediction on the extracted coal mine risk evaluation data and acquiring a coal mine risk prediction evaluation result.
CN202111580046.1A 2021-12-22 2021-12-22 Coal mine disaster risk prediction method and system based on semantic recognition Pending CN114386429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111580046.1A CN114386429A (en) 2021-12-22 2021-12-22 Coal mine disaster risk prediction method and system based on semantic recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111580046.1A CN114386429A (en) 2021-12-22 2021-12-22 Coal mine disaster risk prediction method and system based on semantic recognition

Publications (1)

Publication Number Publication Date
CN114386429A true CN114386429A (en) 2022-04-22

Family

ID=81198599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111580046.1A Pending CN114386429A (en) 2021-12-22 2021-12-22 Coal mine disaster risk prediction method and system based on semantic recognition

Country Status (1)

Country Link
CN (1) CN114386429A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116070725A (en) * 2022-08-29 2023-05-05 山东科技大学 Mining pressure risk prediction method based on logistic regression
CN115984193A (en) * 2022-12-15 2023-04-18 东北林业大学 PDL1 expression level detection method fusing histopathology image and CT image
CN116167532A (en) * 2023-04-26 2023-05-26 中国安全生产科学研究院 System optimization method for coal mine industry illegal production behavior prediction system
CN116167532B (en) * 2023-04-26 2023-09-05 中国安全生产科学研究院 System optimization method for coal mine industry illegal production behavior prediction system
CN117422581A (en) * 2023-11-01 2024-01-19 中国地质科学院矿产资源研究所 Mineral resource safety monitoring and early warning method, system, equipment and medium
CN117726181A (en) * 2024-02-06 2024-03-19 山东科技大学 Collaborative fusion and hierarchical prediction method for typical disaster risk heterogeneous information of coal mine
CN117726181B (en) * 2024-02-06 2024-04-30 山东科技大学 Collaborative fusion and hierarchical prediction method for typical disaster risk heterogeneous information of coal mine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination