CN110928988B - Method for rapidly estimating risk level of potential safety hazard in factory building - Google Patents
Method for rapidly estimating risk level of potential safety hazard in factory building
- Publication number
- CN110928988B (application CN201911034620.6A)
- Authority
- CN
- China
- Prior art keywords
- potential safety
- safety hazard
- text
- factory building
- database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3335—Syntactic pre-processing, e.g. stopword elimination, stemming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/04—Manufacturing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- General Engineering & Computer Science (AREA)
- General Business, Economics & Management (AREA)
- Databases & Information Systems (AREA)
- Tourism & Hospitality (AREA)
- Data Mining & Analysis (AREA)
- Quality & Reliability (AREA)
- Operations Research (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Educational Administration (AREA)
- Manufacturing & Machinery (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Computational Linguistics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method for quickly estimating the risk level of potential safety hazards in a factory building, which comprises the following steps: step 1, establishing a factory building potential safety hazard database; step 2, pre-training a Chinese word vector model using a Chinese corpus; step 3, dividing the factory building potential safety hazard texts into a training set, a test set and a verification set, and generating word vectors for the potential safety hazard corpus; step 4, feeding the divided and standardized potential safety hazard texts into a BERT neural network model for fine-tuning to obtain a risk level classification model; step 5, when a new potential safety hazard appears in the factory building, acquiring its relevant element information, feeding it into the risk level classification model, and estimating its risk level; step 6, comparing the text similarity between the new potential safety hazard and the potential safety hazards of the same risk level in the database, and evaluating the confidence of the estimated result. With this method, when the production elements of the factory building change, the risk level of a new potential safety hazard can be evaluated rapidly.
Description
Technical Field
The invention relates to the technical field of hazard control in the manufacturing industry, and in particular to a method for quickly estimating the risk level of potential safety hazards in a factory building.
Background
At present, with the rapid development of informatization in China's manufacturing industry and the digitalization of manufacturing workshops, each individual element among the five workshop elements (man, machine, material, method and environment), such as equipment (machine) or products (material), already has a mature control method in the actual production of the traditional manufacturing industry. However, for the actual potential safety hazards in a production plant, it is difficult to grade and evaluate risk systematically and comprehensively, because a hazard may involve all five elements of the plant.
In the actual production process of the current manufacturing industry, the evaluation of potential safety hazards in a factory building faces two main problems. First, because a potential safety hazard involves many elements, most of which are described in text, it is difficult to perform a quantitative grade evaluation. Second, the evaluation has to be carried out by safety experts in the relevant field. For this reason, the traditional manufacturing industry mostly adopts the LEC evaluation method (proposed by the American safety experts K.J. Graham and G.F. Kinney) to evaluate the danger and harm to operators working in an environment with potential safety hazards. The method evaluates the casualty risk of the operator using the product of the index values of three factors related to system risk: L (likelihood, the likelihood of an accident), E (exposure, the frequency with which personnel are exposed to the hazardous environment), and C (consequence, the possible consequences if an accident occurs). The product of the three values, D = L × E × C, is the risk value D of the working condition. However, the method still has a major limitation: it requires manual evaluation by safety experts. When any one of the five elements of the production workshop changes (for example, new equipment is purchased), the related potential safety hazards in the factory building change, and the risk level of the factory building may change with them. A safety expert must then re-analyze and re-evaluate the hazards and assign a new risk level, which makes the evaluation process troublesome and slow.
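The LEC calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the grade boundaries for D are values often quoted for the LEC method rather than anything stated in this document, so they should be replaced by the boundaries a plant actually uses.

```python
from bisect import bisect_right

# Grade boundaries for D. ASSUMPTION: these are values often quoted for the
# LEC method; this document does not list them.
D_BOUNDARIES = (20, 70, 160, 320)


def lec_score(likelihood: float, exposure: float, consequence: float) -> float:
    """Return the LEC risk value D = L * E * C."""
    return likelihood * exposure * consequence


def lec_level(d: float, boundaries=D_BOUNDARIES) -> int:
    """Map D to an integer level from 1 (lowest) to len(boundaries) + 1."""
    return bisect_right(boundaries, d) + 1


# Illustrative index values only, not taken from the source.
d = lec_score(likelihood=3, exposure=6, consequence=15)
level = lec_level(d)
print(d, level)
```

Every change to a workshop element means an expert has to pick new L, E and C values by hand, which is the bottleneck the method sets out to remove.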
Disclosure of Invention
The invention aims to provide a method for quickly estimating the risk level of potential safety hazards in a factory building, which can quickly estimate the risk level of new potential safety hazards caused by changes in any of the five workshop elements during factory production.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a method for quickly estimating the risk level of potential safety hazards in a factory building comprises the following steps:
step 1, establishing a factory building potential safety hazard database, which specifically comprises:
step 1.1, listing all potential safety hazards in the factory building according to its actual production conditions and the five workshop production factors (man, machine, material, method and environment);
step 1.2, analyzing each potential safety hazard of the factory building according to the following six points:
(1) The cause of the potential safety hazard, including human factors and equipment and material factors;
(2) The danger manifestation of the potential safety hazard, namely the actual form the hazard takes in the workshop when it occurs, specifically including environmental factors, equipment and material factors, and personnel factors;
(3) The dangerous consequences of the potential safety hazard, including consequences for personnel, materials and equipment;
(4) The areas involved, namely the areas of the factory building in which the potential safety hazard may occur;
(5) The objects involved, namely which of the five factors (man, machine, material, method and environment) the potential safety hazard involves;
(6) The risk level of the potential safety hazard, which for hazards initially existing in the factory building is evaluated using the LEC method;
step 1.3, entering the six items of information for each potential safety hazard analyzed in step 1.2 into a database to form the factory building potential safety hazard database;
step 2, pre-training a Chinese word vector model using a Chinese corpus;
step 3, dividing the factory building potential safety hazard texts into a training set, a test set and a verification set, and generating word vectors for the potential safety hazard corpus, which specifically comprises:
step 3.1, extracting the text data stored in the factory building potential safety hazard database formed in step 1 using pymysql;
step 3.2, standardizing the extracted potential safety hazard text data, wherein each potential safety hazard uses the format "risk level + text content", with the two parts separated by a tab character ('\t');
step 3.3, performing word segmentation on the potential safety hazard text content using Jieba and, after obtaining the segmentation result, establishing a dedicated stop-word list for the potential safety hazard text content of the specific factory building;
step 3.4, feeding the potential safety hazard text content, after word segmentation and stop-word removal, into the word vector model trained in step 2, and outputting the feature vector of each corresponding word;
step 4, feeding the divided and standardized potential safety hazard texts into a BERT neural network model for fine-tuning to obtain a risk level classification model;
step 5, when a new potential safety hazard appears in the factory building, acquiring its relevant element information according to the six points in step 1.2 to form text information, feeding this into the risk level classification model, and predicting the risk level, which specifically comprises: preprocessing the text information of the new potential safety hazard, feeding the preprocessed text into the fine-tuned risk level classification model, and ranking the candidate risk levels of the new potential safety hazard from highest to lowest according to the risk level probabilities output by the model;
step 6, after the estimated risk level of the new potential safety hazard is obtained, comparing the text similarity between the new potential safety hazard and the potential safety hazards of the same risk level stored in the database, and thereby evaluating the confidence of the estimated risk level of the new potential safety hazard.
Further, the Chinese corpus in step 2 is the Chinese Wikipedia corpus.
Further, the step 2 specifically includes:
step 2.1, downloading a Chinese corpus from the Wikipedia website as the original training data and converting it into simplified characters using the OpenCC tool;
step 2.2, extracting the article content from the fully simplified Chinese corpus using regular expressions and performing word segmentation on it;
step 2.3, training a Word2Vec model on the corpus after word segmentation and stop-word removal.
Further, the feature vector output in step 3.4 is a 64-dimensional feature vector.
Further, the step 4 specifically includes:
step 4.1, establishing a model: building a pre-trained BERT model using Python;
step 4.2, reading the divided potential safety hazard text training set, test set and verification set, and starting to train the potential safety hazard risk level classification model.
Further, the step 6 specifically includes:
step 6.1, after the risk level of the newly appeared potential safety hazard has been estimated, extracting from the potential safety hazard database the text data whose risk level is the same as the highest-probability risk level estimated in step 5;
step 6.2, performing word segmentation and stop-word removal on the text of the newly appeared potential safety hazard and on the same-level potential safety hazard texts extracted from the database, and then feeding the results into the Word2Vec word vector model generated in step 2 to generate the corresponding word vectors;
step 6.3, representing the text vector of each potential safety hazard by the average of the word vectors in that potential safety hazard text;
step 6.4, using the cosine similarity of the text vectors to represent the similarity between the new potential safety hazard and each same-level potential safety hazard text, with the confidence threshold set to 0.5; that is, if the similarity between the new potential safety hazard text and more than 50% of the existing same-level potential safety hazard texts exceeds 0.5, the risk level estimated by the risk level classification model is considered credible; otherwise, selecting the risk level with the next-highest prediction probability from step 5 and repeating the operations of steps 6.2 to 6.3.
Further, in step 6.4, if none of the candidate risk levels meets the confidence threshold, the risk level with the highest prediction probability output by the risk level classification model is selected.
Compared with the prior art, the invention has the remarkable advantages that:
(1) The invention provides a method that does not require manual rating by safety experts, which saves considerable labor cost compared with the traditional evaluation method;
(2) The word vector model is pre-trained on the very large Chinese Wikipedia corpus, which improves its accuracy and makes it more practical when applied to the factory building potential safety hazard database;
(3) With this method, when the production elements of the factory building change, the risk level of a new potential safety hazard can be evaluated rapidly, which helps guide safe production in the factory building.
Drawings
FIG. 1 is a flow chart of a rapid estimation method for the risk level of the potential safety hazard in a factory building.
FIG. 2 is a structural diagram of the pre-trained word vector model Word2Vec.
FIG. 3 is a diagram of the standardized data format of the factory building potential safety hazard text.
Detailed Description
The following describes the implementation of the present invention in detail with reference to specific embodiments.
The invention discloses a method for quickly estimating the risk level of potential safety hazards in a factory building, which comprises the following steps:
Step 1, establishing a factory building potential safety hazard database. This specifically comprises the following steps:
step 1.1, listing all potential safety hazards in the factory building according to its actual production conditions and the five workshop production factors (man, machine, material, method and environment);
step 1.2, analyzing each potential safety hazard of the factory building according to the following six points:
(1) The cause of the potential safety hazard, including human factors and equipment and material factors;
(2) The danger manifestation of the potential safety hazard, namely the actual form the hazard takes in the workshop when it occurs, specifically including environmental factors, equipment and material factors, and personnel factors;
(3) The dangerous consequences of the potential safety hazard, including consequences for personnel, materials and equipment;
(4) The areas involved, namely the areas of the factory building in which the potential safety hazard may occur;
(5) The objects involved, namely which of the five factors (man, machine, material, method and environment) the potential safety hazard involves;
(6) The risk level of the potential safety hazard, which for hazards initially existing in the factory building is evaluated using the LEC method;
step 1.3, entering the six items of information for each potential safety hazard analyzed in step 1.2 into a database to form the factory building potential safety hazard database. The database table is designed as shown in Table 1 (a MySQL database is used by default):
Table 1. Factory building potential safety hazard database table
Field name | Length | Data type | Null constraint | Primary key |
Dangers_id (potential safety hazard number) | 5 | Int | Not null | PK |
Reason (cause of the hazard) | 255 | Varchar | Not null | |
Performance (danger manifestation) | 255 | Varchar | Not null | |
Result (dangerous consequence) | 255 | Varchar | Not null | |
Related_area (areas involved) | 50 | Varchar | Not null | |
Related_object (objects involved) | 20 | Varchar | Not null | |
Level (risk level) | 5 | Int | Not null | |
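The table design in Table 1 can be tried out with the following minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a database server, whereas the description above defaults to MySQL. The table name `hazards` and the helper functions are hypothetical, and the sample row is invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here. Column names and lengths follow Table 1;
# the table name "hazards" is a hypothetical choice.
SCHEMA = """
CREATE TABLE hazards (
    dangers_id     INTEGER      PRIMARY KEY,
    reason         VARCHAR(255) NOT NULL,
    performance    VARCHAR(255) NOT NULL,
    result         VARCHAR(255) NOT NULL,
    related_area   VARCHAR(50)  NOT NULL,
    related_object VARCHAR(20)  NOT NULL,
    level          INTEGER      NOT NULL
)
"""


def create_hazard_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the hazard table."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_hazard(conn, reason, performance, result, related_area,
               related_object, level) -> int:
    """Insert one analyzed hazard (the six points of step 1.2); return its id."""
    cur = conn.execute(
        "INSERT INTO hazards (reason, performance, result, related_area,"
        " related_object, level) VALUES (?, ?, ?, ?, ?, ?)",
        (reason, performance, result, related_area, related_object, level),
    )
    conn.commit()
    return cur.lastrowid


conn = create_hazard_db()
hazard_id = add_hazard(
    conn,
    reason="guard removed by operator",
    performance="exposed rotating shaft",
    result="hand injury",
    related_area="machining area",
    related_object="man, machine",
    level=3,
)
```

With MySQL, the same statements would be issued through a client such as pymysql, using that driver's own placeholder style.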
Step 2, pre-training a Chinese word vector model using a Chinese corpus. This specifically comprises the following steps:
Step 2.1, downloading a Chinese corpus from the Wikipedia website as the original training data. Since the data contains many traditional characters, all of them are converted into simplified characters using the OpenCC tool.
Step 2.2, extracting the article content from the fully simplified Chinese corpus using regular expressions and performing word segmentation on it. The corpus extracted in step 2.1 contains many <doc> and </doc> tags, so this irrelevant content is removed with regular expressions. The articles are then segmented with the Jieba tool in Python. Because the segmented text contains words with no practical meaning, a stop-word removal step is added after segmentation.
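The cleaning in step 2.2 might look like the sketch below. The regular expression and function names are assumptions, not taken from the patent. To keep the example free of third-party packages, the tokenizer is passed in as a function and the demonstration uses `str.split` on pre-spaced text; in practice a Jieba function such as `jieba.lcut` would be passed instead.

```python
import re

# Matches opening <doc ...> and closing </doc> tags left in the extracted
# corpus. The exact tag layout is an assumption.
DOC_TAG = re.compile(r"</?doc[^>]*>")


def strip_doc_tags(text: str) -> str:
    """Remove <doc ...> and </doc> markers, keeping the article text."""
    return DOC_TAG.sub("", text)


def segment(text, tokenizer, stop_words):
    """Tokenize text, then drop whitespace-only tokens and stop words."""
    return [
        tok for tok in tokenizer(text)
        if tok.strip() and tok not in stop_words
    ]


# Toy input: the words are pre-separated by spaces so str.split can stand in
# for a real Chinese tokenizer.
raw = '<doc id="1" title="demo">\n设备 的 防护罩 缺失\n</doc>'
tokens = segment(strip_doc_tags(raw), str.split, stop_words={"的"})
print(tokens)
```

Passing the tokenizer in as a parameter means the same function can be reused in step 3.3 with a plant-specific stop-word list.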
Step 2.3, training a Word2Vec model on the corpus after word segmentation and stop-word removal; the structure of the Word2Vec model is shown in FIG. 2. A few parameters are modified slightly so that the model suits the subsequent steps: first, the word vector dimension is set to 64; second, the training window is set to 5, so that the five preceding and five following adjacent words are considered; third, the minimum word frequency is set to 5, so that a word appearing fewer than five times in the whole corpus is discarded; in addition, the learning rate is set to 0.025; finally, the number of iterations is set to 10.
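The parameters listed in step 2.3 can be collected in one place, as in the sketch below. The keyword names follow the gensim 4.x `Word2Vec` constructor, which is an assumption: the description names the Word2Vec model but not a specific library. The import is done inside the function so that the settings can be inspected without gensim installed.

```python
# Settings from step 2.3. Keyword names assume the gensim 4.x Word2Vec
# constructor; the source does not name a library.
W2V_PARAMS = {
    "vector_size": 64,  # word vector dimension
    "window": 5,        # five words on each side of the target word
    "min_count": 5,     # discard words seen fewer than five times
    "alpha": 0.025,     # initial learning rate
    "epochs": 10,       # number of passes over the corpus
}


def train_word_vectors(sentences):
    """Train Word2Vec on an iterable of token lists (requires gensim)."""
    from gensim.models import Word2Vec  # third-party, imported lazily

    return Word2Vec(sentences=sentences, **W2V_PARAMS)
```

A trained model would then be queried per word, for example `model.wv["word"]`, to obtain its 64-dimensional vector.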
Step 3, dividing the factory building potential safety hazard texts into a training set, a test set and a verification set, and generating word vectors for the potential safety hazard corpus. This specifically comprises the following steps:
Step 3.1, extracting the text data stored in the factory building potential safety hazard database formed in step 1 using pymysql.
Step 3.2, standardizing the extracted potential safety hazard text data according to the data format of FIG. 3, wherein each potential safety hazard uses the format "risk level + text content", with the two parts separated by a tab character ('\t').
Step 3.3, performing word segmentation on the potential safety hazard text content using Jieba and, after obtaining the segmentation result, establishing a dedicated stop-word list for the text of the specific workshop. For example, the word "cause" appears frequently in the potential safety hazard texts of a certain workshop but plays no role in risk level evaluation, so it is added to the stop-word list.
Step 3.4, feeding the text data, after word segmentation and stop-word removal, into the word vector model trained in step 2, and outputting the 64-dimensional feature vector of each corresponding word.
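The record format of step 3.2 and the three-way split of step 3 can be sketched as follows. The 8:1:1 split ratio, the fixed random seed and all function names are assumptions, since the description does not state how the sets are divided.

```python
import random


def format_record(level: int, text: str) -> str:
    """Build one 'risk level<TAB>text content' line."""
    # Collapse tabs and newlines inside the text so they cannot break the
    # one-record-per-line, tab-separated layout.
    clean = " ".join(text.split())
    return f"{level}\t{clean}"


def parse_record(line: str):
    """Split a formatted line back into (level, text)."""
    level, text = line.rstrip("\n").split("\t", 1)
    return int(level), text


def split_dataset(records, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle and split records into training, test and verification sets.

    The 8:1:1 default is an assumption, not a value from the source.
    """
    items = list(records)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_test = int(len(items) * ratios[1])
    return (
        items[:n_train],
        items[n_train:n_train + n_test],
        items[n_train + n_test:],
    )


records = [format_record(i % 3 + 1, f"hazard text {i}") for i in range(10)]
train, test, val = split_dataset(records)
```

Putting the label first and separating it from the text with a single tab lets a classifier's data loader read each line with one split.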
Step 4, feeding the divided and preprocessed potential safety hazard texts into a BERT neural network model for fine-tuning to obtain a risk level classification model. This specifically comprises the following steps:
Step 4.1, establishing a model: building a pre-trained BERT model using Python; for Chinese corpora, the Chinese BERT model is generally adopted.
Step 4.2, reading the divided potential safety hazard text training set, test set and verification set, modifying some of the parameters, and then starting to train the potential safety hazard risk level classification model. The model parameters are modified as follows: the maximum sentence length is set to 50, the learning rate to 2e-5, the number of iterations to 3, and the batch size to 16. Training with these settings yields the fine-tuned risk level classification model.
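The fine-tuning settings of step 4.2 can be written down as a small configuration object, sketched below. The class and field names are hypothetical and not tied to any particular BERT implementation, and reading "number of iterations" as passes over the training set is an assumption. The helper only shows how the batch size and that count determine the number of optimizer steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters from step 4.2 (field names are hypothetical)."""
    max_seq_length: int = 50     # longest sentence length, in tokens
    learning_rate: float = 2e-5
    num_train_epochs: int = 3    # "number of iterations", read as epochs
    batch_size: int = 16


def steps_per_epoch(n_examples: int, cfg: FineTuneConfig) -> int:
    """Number of batches needed to see every example once (ceiling division)."""
    return -(-n_examples // cfg.batch_size)


cfg = FineTuneConfig()
# 1000 training examples is an illustrative figure, not one from the source.
total_steps = steps_per_epoch(1000, cfg) * cfg.num_train_epochs
```

A maximum length of 50 matches the remark in step 6.3 that preprocessed hazard texts are usually within 50 words.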
Step 5, when a new potential safety hazard appears in the factory building, acquiring its relevant element information, feeding it into the risk level classification model, and estimating the risk level. The specific steps are as follows:
preprocessing the text information of the new potential safety hazard, feeding the preprocessed text into the fine-tuned BERT risk level classification model, and ranking the candidate risk levels of the new potential safety hazard from highest to lowest according to the risk level probabilities output by the model.
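The ranking in step 5 can be illustrated as follows, assuming the classifier emits one raw score (logit) per risk level and that a softmax turns the scores into probabilities. The description mentions only the output probabilities, so the softmax step and the function names are assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_levels(logits, levels):
    """Return (level, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    return sorted(zip(levels, probs), key=lambda pair: pair[1], reverse=True)


# Illustrative scores for three risk levels; not real model output.
ranked = rank_levels([0.2, 2.1, 1.0], levels=[1, 2, 3])
candidate_order = [level for level, _ in ranked]
```

Keeping the full ordering rather than only the top level is what allows step 6 to fall back to the next candidate when the confidence check fails.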
Step 6, after the estimated risk level is obtained, comparing the text similarity between the new potential safety hazard and the potential safety hazards of the same risk level already in the database, and thereby evaluating the estimated risk level. This specifically comprises the following steps:
Step 6.1, after the risk level of the newly appeared potential safety hazard has been estimated, extracting from the potential safety hazard database the text data whose risk level is the same as the highest-probability risk level estimated in step 5.
Step 6.2, performing word segmentation and stop-word removal on the text of the newly appeared potential safety hazard and on the same-level potential safety hazard texts extracted from the database, and then feeding the results into the Word2Vec word vector model generated in step 2 to generate the corresponding word vectors.
Step 6.3, because the potential safety hazard texts are mostly short, usually within 50 words after preprocessing, the text vector of each potential safety hazard is represented by the average of the word vectors in that text.
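The averaging in step 6.3 can be sketched with plain Python lists. Skipping words that are missing from the vocabulary, and returning a zero vector when no word is known, are assumptions; the description does not say how such cases are handled. The toy vectors are two-dimensional for readability, whereas the method uses 64 dimensions.

```python
def text_vector(tokens, word_vectors, dim=64):
    """Average the vectors of the tokens that appear in word_vectors."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        # No known word: fall back to a zero vector (an assumption).
        return [0.0] * dim
    # zip(*known) walks the vectors one dimension at a time.
    return [sum(column) / len(known) for column in zip(*known)]


# Toy 2-dimensional vocabulary, invented for illustration.
toy_vectors = {"guard": [1.0, 0.0], "missing": [0.0, 1.0]}
vec = text_vector(["guard", "missing", "unknown"], toy_vectors, dim=2)
```

Averaging discards word order, which is a reasonable trade-off for short texts such as these hazard descriptions.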
Step 6.4, using the cosine similarity of the text vectors to represent the similarity between the new potential safety hazard and each same-level potential safety hazard text, with the confidence threshold set to 0.5. That is, if the similarity between the new potential safety hazard text and more than 50% of the existing same-level potential safety hazard texts exceeds 0.5, the risk level estimated by the risk level classification model is considered credible; otherwise, the risk level with the next-highest prediction probability from step 5 is selected and the operations of steps 6.2 to 6.3 are repeated. If none of the candidate risk levels meets the confidence threshold, the risk level with the highest prediction probability output by the risk level classification model is selected.
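The confidence check and fallback of step 6.4 can be sketched as below, taking already-computed text vectors as input. The function names are hypothetical. Using one threshold value for both the similarity cut-off and the required share of texts follows the single 0.5 figure in the description, and treating a level with no stored texts as not credible is an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_credible(new_vec, peer_vecs, threshold=0.5):
    """True if more than `threshold` of the peers have similarity above `threshold`."""
    if not peer_vecs:
        return False  # no stored texts at this level (an assumption)
    hits = sum(1 for p in peer_vecs if cosine(new_vec, p) > threshold)
    return hits / len(peer_vecs) > threshold


def choose_level(ranked_levels, new_vec, vectors_by_level, threshold=0.5):
    """Walk the levels from most to least probable; keep the first credible one."""
    for level in ranked_levels:
        if is_credible(new_vec, vectors_by_level.get(level, []), threshold):
            return level
    # No level passed the check: fall back to the classifier's top prediction.
    return ranked_levels[0]


# Toy 2-dimensional text vectors, invented for illustration.
vectors_by_level = {
    2: [[0.0, 1.0], [0.1, 1.0], [-1.0, 0.2]],
    3: [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0]],
}
chosen = choose_level([2, 3, 1], [1.0, 0.0], vectors_by_level)
```

In this toy case the classifier's first choice, level 2, fails the check because none of its stored texts resembles the new one, so the next candidate, level 3, is accepted.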
The foregoing shows and describes the general principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (6)
1. A method for quickly estimating the risk level of potential safety hazards in a factory building, characterized by comprising the following steps:
step 1, establishing a factory building potential safety hazard database, which specifically comprises:
step 1.1, listing all potential safety hazards in the factory building according to its actual production conditions and the five workshop production factors (man, machine, material, method and environment);
step 1.2, analyzing each potential safety hazard of the factory building according to the following six points:
(1) The cause of the potential safety hazard, including human factors and equipment and material factors;
(2) The danger manifestation of the potential safety hazard, namely the actual form the hazard takes in the workshop when it occurs, specifically including environmental factors, equipment and material factors, and personnel factors;
(3) The dangerous consequences of the potential safety hazard, including consequences for personnel, materials and equipment;
(4) The areas involved, namely the areas of the factory building in which the potential safety hazard may occur;
(5) The objects involved, namely which of the five factors (man, machine, material, method and environment) the potential safety hazard involves;
(6) The risk level of the potential safety hazard, which for hazards initially existing in the factory building is evaluated using the LEC method;
step 1.3, entering the six items of information for each potential safety hazard analyzed in step 1.2 into a database to form the factory building potential safety hazard database;
step 2, pre-training a Chinese word vector model using a Chinese corpus;
the step 2 specifically comprises:
step 2.1, downloading a Chinese corpus from the Wikipedia website as the original training data and converting it into simplified characters using the OpenCC tool;
step 2.2, extracting the article content from the fully simplified Chinese corpus using regular expressions and performing word segmentation on it;
step 2.3, training a Word2Vec model on the corpus after word segmentation and stop-word removal;
step 3, dividing a plant potential safety hazard text training set, a test set and a verification set, and generating a potential safety hazard corpus word vector, wherein the method specifically comprises the following steps:
step 3.1, extracting text data stored in the plant potential safety hazard database formed in the step 1 by using pymysql;
step 3.2, standardizing the extracted text data of the potential safety hazards, wherein each potential safety hazard adopts a format of danger level + text content and uses \\ t' interval;
3.3, performing word segmentation on the information in the potential safety hazard text content by using Jieba, and establishing a special stop word list aiming at the potential safety hazard text content of a specific factory building after obtaining a word segmentation result;
step 3.4, the potential safety hazard text contents after word segmentation and word stop removal are sent to the word vector model trained in the step 2, and the feature vector of each corresponding word is output;
step 4, sending the divided and standardized potential safety hazard texts into a bert neural network model for fine adjustment to obtain a danger level classification model;
step 5, when a new potential safety hazard appears in the factory building, acquiring the relevant element information according to the six points in step 1.2, forming text information, importing it into the danger level classification model, and estimating the danger level, which specifically comprises: preprocessing the text information of the new potential safety hazard, importing the preprocessed text information into the fine-tuned danger level classification model, and estimating the danger level of the new potential safety hazard in descending order of the danger level probabilities output by the model;
step 6, after the estimated risk level of the new potential safety hazard is obtained, comparing the text similarity between the new potential safety hazard and the potential safety hazards of the same risk level stored in the database, and thereby evaluating the confidence of the estimated risk level of the new potential safety hazard.
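By way of non-limiting illustration, the record format of step 3.2 and the division into training, test and verification sets of step 3 might be sketched as follows. The function names, the 8:1:1 split ratio and the fixed random seed are assumptions made for this sketch only; the claims do not specify them.

```python
import random
from typing import List, Sequence, Tuple


def format_record(level: str, text: str) -> str:
    """Serialise one hazard as 'danger level<TAB>text content' (step 3.2)."""
    # Collapse tabs and newlines inside the text so that each record
    # stays on one line with exactly one separating tab.
    cleaned = " ".join(text.split())
    return f"{level}\t{cleaned}"


def parse_record(line: str) -> Tuple[str, str]:
    """Split a stored record back into (danger level, text content)."""
    level, text = line.rstrip("\n").split("\t", 1)
    return level, text


def split_dataset(
    records: Sequence[str],
    ratios: Tuple[float, float, float] = (0.8, 0.1, 0.1),  # assumed ratio
    seed: int = 0,  # assumed; fixed only to make the split repeatable
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the records and divide them into training, test and verification sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_test = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    verification = shuffled[n_train + n_test:]
    return train, test, verification
```

In such a sketch every record lands in exactly one of the three sets, and any remainder left over by rounding falls into the verification set.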
2. The method according to claim 1, wherein the Chinese corpus in step 2 is a Wikipedia Chinese corpus.
3. The method of claim 1, wherein the feature vector output in step 3.4 is a 64-bit feature vector.
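By way of non-limiting illustration, steps 3.3 and 3.4 (word segmentation, stop-word removal, and output of one feature vector per remaining word) might be sketched as follows. The function name `words_to_vectors`, the pluggable `tokenize` argument and the plain dictionary standing in for the trained word vector model are hypothetical, and skipping words absent from the vocabulary is an assumption, since the claims do not say how such words are handled.

```python
from typing import Callable, Iterable, List, Mapping, Sequence, Set

Vector = Sequence[float]


def words_to_vectors(
    text: str,
    tokenize: Callable[[str], Iterable[str]],
    stop_words: Set[str],
    embeddings: Mapping[str, Vector],
) -> List[Vector]:
    """Segment the text, drop stop words, and return one feature vector per remaining word."""
    vectors: List[Vector] = []
    for word in tokenize(text):
        word = word.strip()
        if not word or word in stop_words:
            continue
        # Assumption: words missing from the trained vocabulary are skipped.
        if word in embeddings:
            vectors.append(embeddings[word])
    return vectors
```

With the tools named in the claims, `tokenize` could for example be `jieba.lcut`, and `embeddings` could be the word vectors of the Word2Vec model trained in step 2; the stop-word set would be the dedicated list established in step 3.3.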
4. The method according to claim 3, wherein the step 4 specifically comprises:
step 4.1, building the model, namely establishing a pre-trained BERT model by using Python;
step 4.2, reading the divided potential safety hazard text training set, test set and verification set, and starting to train the potential safety hazard risk level classification model.
5. The method according to claim 4, wherein the step 6 specifically comprises:
step 6.1, after the danger level of the new potential safety hazard is estimated, extracting from the potential safety hazard database the text data whose danger level is the same as the highest-probability danger level estimated in step 5;
step 6.2, performing word segmentation and stop-word removal on the text of the new potential safety hazard and on the text data of the same-level potential safety hazards extracted from the database, and then importing the results into the Word2Vec word vector model generated in step 2 to generate the corresponding word vectors;
step 6.3, representing the text vector of each potential safety hazard by the average of the word vectors in that potential safety hazard text;
step 6.4, representing the similarity between the new potential safety hazard and each same-level potential safety hazard text by the cosine similarity of their text vectors, wherein the confidence threshold is set to 0.5, that is, if the similarity between the new potential safety hazard text and more than 50% of the existing same-level potential safety hazard texts exceeds 50%, the risk level estimated by the risk level classification model is considered credible; otherwise, selecting the risk level corresponding to the next-highest prediction probability from step 5 and repeating the operations of steps 6.2 to 6.3.
6. The method according to claim 5, wherein in step 6.4, if none of the risk levels passes the confidence threshold, the risk level corresponding to the highest prediction probability output by the risk level classification model is selected.
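By way of non-limiting illustration, the confidence check of steps 6.3 and 6.4 and the fallback of claim 6 might be sketched as follows. All function and parameter names are hypothetical; the word vectors are assumed to have been produced already as in step 6.2, and the stored texts are assumed to be grouped by risk level in a dictionary.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(word_vectors: Sequence[Vector]) -> List[float]:
    """Text vector: element-wise average of the word vectors (step 6.3)."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two text vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_credible(new_vec: Vector, same_level_vecs: Sequence[Vector],
                threshold: float = 0.5) -> bool:
    """Step 6.4: credible if more than 50% of same-level texts have similarity above 0.5."""
    if not same_level_vecs:
        return False
    hits = sum(1 for v in same_level_vecs
               if cosine_similarity(new_vec, v) > threshold)
    return hits / len(same_level_vecs) > threshold


def choose_level(probabilities: Dict[str, float], new_vec: Vector,
                 stored_vecs_by_level: Dict[str, List[Vector]],
                 threshold: float = 0.5) -> str:
    """Try risk levels in descending order of predicted probability."""
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    for level in ranked:
        if is_credible(new_vec, stored_vecs_by_level.get(level, []), threshold):
            return level
    # Claim 6: no level passed the check, so keep the highest-probability level.
    return ranked[0]
```

In this sketch a level with no stored texts is treated as not credible, which is an assumption; the claims do not address an empty comparison set.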
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911034620.6A CN110928988B (en) | 2019-10-29 | 2019-10-29 | Method for rapidly estimating risk level of potential safety hazard in factory building |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110928988A (en) | 2020-03-27 |
CN110928988B (en) | 2022-10-14 |
Family
ID=69849678
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911034620.6A Active CN110928988B (en) | 2019-10-29 | 2019-10-29 | Method for rapidly estimating risk level of potential safety hazard in factory building |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110928988B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112070215B (en) * | 2020-09-10 | 2023-08-29 | 北京理工大学 | Processing method and processing device for dangerous situation analysis based on BP neural network |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107730124A (en) * | 2017-10-20 | 2018-02-23 | 安厦系统科技成都有限责任公司 | A kind of method for carrying out security risk assessment for enterprise or project |
CN109472462A (en) * | 2018-10-18 | 2019-03-15 | 中山大学 | A kind of project risk ranking method and device based on the fusion of multi-model storehouse |
CN110347805A (en) * | 2019-07-22 | 2019-10-18 | 中海油安全技术服务有限公司 | Petroleum industry security risk key element extracting method, device, server and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110928988A (en) | 2020-03-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109165284B (en) | Financial field man-machine conversation intention identification method based on big data | |
CN111198948B (en) | Text classification correction method, apparatus, device and computer readable storage medium | |
CN108628822B (en) | Semantic-free text recognition method and device | |
CN113590764B (en) | Training sample construction method and device, electronic equipment and storage medium | |
CN110807314A (en) | Text emotion analysis model training method, device and equipment and readable storage medium | |
CN106897439A (en) | The emotion identification method of text, device, server and storage medium | |
CN107729403A (en) | Internet information indicating risk method and system | |
CN111984792A (en) | Website classification method and device, computer equipment and storage medium | |
CN110750984B (en) | Command line character string processing method, terminal, device and readable storage medium | |
CN112069307B (en) | Legal provision quotation information extraction system | |
CN112199496A (en) | Power grid equipment defect text classification method based on multi-head attention mechanism and RCNN (Rich coupled neural network) | |
CN114416979A (en) | Text query method, text query equipment and storage medium | |
CN112257425A (en) | Power data analysis method and system based on data classification model | |
CN110928988B (en) | Method for rapidly estimating risk level of potential safety hazard in factory building | |
CN110781673B (en) | Document acceptance method and device, computer equipment and storage medium | |
CN107577738A (en) | A kind of FMECA method by SVM text mining processing datas | |
CN116644183B (en) | Text classification method, device and storage medium | |
CN114722198A (en) | Method, system and related device for determining product classification code | |
CN114625834A (en) | Enterprise industry information determination method and device and electronic equipment | |
CN110110087A (en) | A kind of Feature Engineering method for Law Text classification based on two classifiers | |
Zhang et al. | Similarity judgment of civil aviation regulations based on Doc2Vec deep learning algorithm | |
CN109947932B (en) | Push information classification method and system | |
CN115712715A (en) | Question answering method, device, electronic equipment and storage medium for introduction | |
CN111090723B (en) | Knowledge graph-based recommendation method for safe production content of power grid | |
CN113468882A (en) | Method for identifying similar spare parts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||