CN110928988A - Method for rapidly estimating risk level of potential safety hazard in factory building - Google Patents

Publication number
CN110928988A
Authority
CN
China
Prior art keywords
potential safety
safety hazard
text
factory building
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911034620.6A
Other languages
Chinese (zh)
Other versions
CN110928988B (en)
Inventor
刘庭煜
韦凯翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN201911034620.6A
Publication of CN110928988A
Application granted
Publication of CN110928988B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/3331 - Query processing
    • G06F16/3332 - Query translation
    • G06F16/3335 - Syntactic pre-processing, e.g. stopword elimination, stemming
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 - Operations research, analysis or management
    • G06Q10/0635 - Risk analysis of enterprise or organisation activities
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/04 - Manufacturing
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Manufacturing & Machinery (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Computational Linguistics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for rapidly estimating the risk level of potential safety hazards in a factory building, comprising the following steps: step 1, establishing a factory building potential safety hazard database; step 2, pre-training a Chinese word vector model on a Chinese corpus; step 3, dividing the factory building potential safety hazard texts into a training set, a test set and a validation set, and generating word vectors for the potential safety hazard corpus; step 4, feeding the divided and standardized potential safety hazard texts into a BERT neural network model for fine-tuning to obtain a risk level classification model; step 5, when a new potential safety hazard appears in the factory building, collecting its relevant element information, feeding it into the risk level classification model, and estimating the risk level; and step 6, comparing the text similarity between the new potential safety hazard and the potential safety hazards of the same risk level in the database, and evaluating the confidence of the estimate. With this method, when the production elements of the factory building change, the risk level of a new potential safety hazard can be evaluated rapidly.

Description

Method for rapidly estimating risk level of potential safety hazard in factory building
Technical Field
The invention relates to the technical field of hazard control in the manufacturing industry, and in particular to a method for rapidly estimating the risk level of potential safety hazards in a factory building.
Background
At present, with the rapid informatization of the manufacturing industry and the digitalization of manufacturing workshops in China, mature control methods exist in traditional manufacturing practice for each single element among the five elements of the workshop (man, machine, material, method, and environment), such as equipment (machine) or products (material). However, for the actual potential safety hazards in a production factory building, which arise from all five elements together, it remains difficult to grade and evaluate the risk systematically and comprehensively.
In the actual production process of current manufacturing, the evaluation of potential safety hazards in a factory building faces two main problems. First, because a potential safety hazard involves many elements, most of which are described in text, quantitative grading is difficult. Second, the evaluation must be carried out by safety experts in the relevant field. For this reason, traditional manufacturing mostly adopts the LEC evaluation method (proposed by the American safety experts K. J. Graham and G. F. Kinney) to evaluate the danger to operators working in an environment with potential safety hazards. The method evaluates the casualty risk of the operator as the product of the index values of three factors related to system risk: L (likelihood, the likelihood of an accident), E (exposure, the frequency with which personnel are exposed to the hazardous environment), and C (consequence, the possible consequences if an accident occurs). The product of the three values, D (danger), is the risk score of the working condition, that is, D = L × E × C. However, the method still has a major limitation: it requires manual evaluation by safety experts. When any one of the five elements of the production workshop changes (for example, new equipment is purchased), the related potential safety hazards in the factory building change and their risk levels may change as well. A safety expert must then analyze and evaluate the hazards again and assign a new risk level, which makes the evaluation process cumbersome and slow.
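The LEC product described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `lec_risk_score` and the sample index values are assumptions made for the example and do not come from the patent, which also does not list the scoring tables used to choose L, E, and C.

```python
def lec_risk_score(likelihood: float, exposure: float, consequence: float) -> float:
    """Return the LEC risk score D = L * E * C."""
    factors = (("likelihood", likelihood),
               ("exposure", exposure),
               ("consequence", consequence))
    for name, value in factors:
        # Index values are scores taken from a rating table, so they cannot be negative.
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return likelihood * exposure * consequence


# Illustrative index values only; they are not taken from the patent.
score = lec_risk_score(likelihood=3, exposure=6, consequence=15)
print(score)  # 270
```

Because D is a plain product, a change to any single factor (for example, higher exposure after new equipment is installed) rescales the score directly, which is why every change in the workshop triggers a fresh expert evaluation under the traditional method.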
Disclosure of Invention
The invention aims to provide a method for rapidly estimating the risk level of potential safety hazards in a factory building, which can rapidly estimate the risk level of new potential safety hazards caused by changes in the five workshop elements during production.
To achieve this purpose, the technical solution adopted by the invention is as follows:
A method for rapidly estimating the risk level of potential safety hazards in a factory building comprises the following steps:
step 1, establishing a factory building potential safety hazard database, which specifically comprises:
step 1.1, listing all potential safety hazards in the factory building according to its actual production conditions and the five production elements of the workshop (man, machine, material, method, and environment);
step 1.2, analyzing each potential safety hazard of the factory building according to the following six points:
(1) the cause of the potential safety hazard, including human factors and equipment and material factors;
(2) the hazard manifestation of the potential safety hazard, namely the actual form the hazard takes in the workshop when it occurs, specifically including environmental factors, equipment and material factors, and personnel factors;
(3) the hazardous consequences of the potential safety hazard, including consequences for personnel, materials, and equipment;
(4) the areas involved, namely the areas of the factory building where the potential safety hazard may occur;
(5) the objects involved, namely which of the five elements (man, machine, material, method, and environment) are related to the potential safety hazard;
(6) the risk level of the potential safety hazard, where the LEC method is used to evaluate the potential safety hazards initially existing in the factory building;
step 1.3, entering the six points of information for each potential safety hazard analyzed in step 1.2 into a database to form the factory building potential safety hazard database;
step 2, pre-training a Chinese word vector model on a Chinese corpus;
step 3, dividing the factory building potential safety hazard texts into a training set, a test set and a validation set, and generating word vectors for the potential safety hazard corpus, which specifically comprises:
step 3.1, extracting the text data stored in the factory building potential safety hazard database formed in step 1 by using pymysql;
step 3.2, standardizing the extracted potential safety hazard text data, wherein each potential safety hazard uses the format "risk level + text content", with the two parts separated by a tab character ('\t');
step 3.3, performing word segmentation on the potential safety hazard text content by using Jieba, and, after obtaining the segmentation result, establishing a dedicated stop-word list for the potential safety hazard text content of the specific factory building;
step 3.4, feeding the potential safety hazard text content, after word segmentation and stop-word removal, into the word vector model trained in step 2, and outputting the feature vector of each corresponding word;
step 4, feeding the divided and standardized potential safety hazard texts into a BERT neural network model for fine-tuning to obtain a risk level classification model;
step 5, when a new potential safety hazard appears in the factory building, collecting its relevant element information according to the six points in step 1.2, forming text information, feeding it into the risk level classification model, and predicting the risk level, which specifically comprises: preprocessing the text information of the new potential safety hazard, feeding the preprocessed text information into the fine-tuned risk level classification model, and ranking the candidate risk levels of the new potential safety hazard from highest to lowest probability according to the risk level probabilities output by the model;
step 6, after the estimated risk level of the new potential safety hazard is obtained, comparing the text similarity between the new potential safety hazard and the potential safety hazards of the same risk level stored in the database, and thereby evaluating the confidence of the estimated risk level of the new potential safety hazard.
Further, the Chinese corpus in step 2 is the Chinese Wikipedia corpus.
Further, step 2 specifically comprises:
step 2.1, downloading the Chinese corpus from the Wikipedia website as the original training data and converting it into simplified characters by using the opencc tool;
step 2.2, extracting the article content with regular expressions from the corpus, which has been fully converted into simplified characters, and performing word segmentation on it;
step 2.3, training a Word2Vec model on the corpus after word segmentation and stop-word removal.
Further, the feature vector output in step 3.4 is a 64-dimensional feature vector.
Further, step 4 specifically comprises:
step 4.1, establishing the model, namely building the BERT pre-trained model by using Python;
step 4.2, reading the divided potential safety hazard text training set, test set and validation set, and training the potential safety hazard risk level classification model.
Further, step 6 specifically comprises:
step 6.1, after the risk level of the newly appearing potential safety hazard has been estimated, extracting from the potential safety hazard database the text data whose risk level equals the highest-probability risk level obtained in step 5;
step 6.2, performing word segmentation and stop-word removal on the text of the newly appearing potential safety hazard and on the text data of the same-level potential safety hazards extracted from the database, and then feeding the resulting words into the Word2Vec word vector model generated in step 2 to generate the corresponding word vectors;
step 6.3, representing the text vector of each potential safety hazard by the average of the word vectors in its text;
step 6.4, representing the similarity between the new potential safety hazard and each same-level potential safety hazard text by the cosine similarity of their text vectors, with the confidence threshold set to 0.5; that is, if the similarity between the new potential safety hazard text and more than 50% of the existing same-level potential safety hazard texts exceeds 50%, the risk level estimated by the risk level classification model is considered credible; otherwise, the risk level with the next-highest prediction probability from step 5 is selected and the operations of steps 6.2 to 6.3 are repeated.
Further, in step 6.4, if none of the candidate risk levels passes the confidence threshold, the risk level with the highest prediction probability output by the risk level classification model is selected.
Compared with the prior art, the invention has the following notable advantages:
(1) the invention provides a method that does not require manual rating by safety experts, which saves considerable labor cost compared with traditional evaluation methods;
(2) the word vector model is pre-trained on the large Wikipedia corpus, which improves its accuracy and makes it more practical when applied to the factory building potential safety hazard database;
(3) with this method, when the production elements of the factory building change, the risk level of a new potential safety hazard can be evaluated rapidly, providing guidance for safe production in the factory building.
Drawings
FIG. 1 is a flow chart of the method for rapidly estimating the risk level of potential safety hazards in a factory building.
FIG. 2 is a structural diagram of the pre-trained word vector model Word2Vec.
FIG. 3 is a diagram of the standardized data format of the factory building potential safety hazard text.
Detailed Description
The following describes the implementation of the present invention in detail with reference to specific embodiments.
The invention discloses a method for rapidly estimating the risk level of potential safety hazards in a factory building, which comprises the following steps:
Step 1, establishing a factory building potential safety hazard database, which specifically comprises:
Step 1.1, listing all potential safety hazards in the factory building according to its actual production conditions and the five production elements of the workshop (man, machine, material, method, and environment);
Step 1.2, analyzing each potential safety hazard of the factory building according to the following six points:
(1) the cause of the potential safety hazard, including human factors and equipment and material factors;
(2) the hazard manifestation of the potential safety hazard, namely the actual form the hazard takes in the workshop when it occurs, specifically including environmental factors, equipment and material factors, and personnel factors;
(3) the hazardous consequences of the potential safety hazard, including consequences for personnel, materials, and equipment;
(4) the areas involved, namely the areas of the factory building where the potential safety hazard may occur;
(5) the objects involved, namely which of the five elements (man, machine, material, method, and environment) are related to the potential safety hazard;
(6) the risk level of the potential safety hazard, where the LEC method is used to evaluate the potential safety hazards initially existing in the factory building;
Step 1.3, entering the six points of information for each potential safety hazard analyzed in step 1.2 into a database to form the factory building potential safety hazard database, wherein the database table is designed as shown in the following table (a MySQL database is used by default):
TABLE 1 Factory building potential safety hazard database table
Field name                                     Length   Type      Nullable   Primary key
Dangers_id (potential safety hazard number)    5        Int       Not null   PK
Reason (cause)                                 255      Varchar   Not null
Performance (hazard manifestation)             255      Varchar   Not null
Result (hazardous consequence)                 255      Varchar   Not null
Related_area (areas involved)                  50       Varchar   Not null
Related_object (objects involved)              20       Varchar   Not null
Level (risk level)                             5        Int       Not null
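The table design and the later text extraction can be sketched as follows. This is only a sketch under stated assumptions: the patent uses MySQL with pymysql, whereas the example substitutes Python's built-in `sqlite3` with an in-memory database so that it runs without a server. The table name `dangers`, the helper names, the sample record, and the choice to join the five text fields with spaces are all hypothetical.

```python
import sqlite3

# Column names and lengths follow Table 1. SQLite is only a stand-in for MySQL here,
# and the table name "dangers" is a hypothetical choice for this sketch.
SCHEMA = """
CREATE TABLE dangers (
    dangers_id     INTEGER      NOT NULL PRIMARY KEY,
    reason         VARCHAR(255) NOT NULL,
    performance    VARCHAR(255) NOT NULL,
    result         VARCHAR(255) NOT NULL,
    related_area   VARCHAR(50)  NOT NULL,
    related_object VARCHAR(20)  NOT NULL,
    level          INTEGER      NOT NULL
)
"""


def create_hazard_db() -> sqlite3.Connection:
    """Create an empty in-memory hazard database with the Table 1 layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_hazard(conn: sqlite3.Connection, record: tuple) -> None:
    """Insert one hazard: (id, reason, performance, result, area, object, level)."""
    conn.execute("INSERT INTO dangers VALUES (?, ?, ?, ?, ?, ?, ?)", record)


def fetch_level_and_text(conn: sqlite3.Connection) -> list:
    """Return (level, text) pairs; joining the text fields with spaces is an assumption."""
    rows = conn.execute(
        "SELECT level, reason, performance, result, related_area, related_object "
        "FROM dangers ORDER BY dangers_id"
    )
    return [(row[0], " ".join(row[1:])) for row in rows]


conn = create_hazard_db()
# Hypothetical sample record, for illustration only.
add_hazard(conn, (1, "guard removed", "exposed blade", "hand injury", "cutting area", "machine", 3))
print(fetch_level_and_text(conn))
```

With a real MySQL server, the same statements could be issued through a pymysql connection instead; only the connection setup and the placeholder style (`%s` rather than `?`) would differ.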
Step 2, pre-training a Chinese word vector model on the Chinese Wikipedia corpus, which specifically comprises:
Step 2.1, downloading the Chinese corpus from the Wikipedia website as the original training data. Because the data contains many traditional characters, all of them are converted into simplified characters by using the opencc tool.
Step 2.2, extracting the article content with regular expressions from the corpus, which has been fully converted into simplified characters, and performing word segmentation on it. The corpus extracted in step 2.1 contains many <doc> and </doc> tags, which are irrelevant and must be removed with regular expressions. The articles are then segmented with the Jieba tool in Python. Because segmentation yields some words with no practical meaning, a stop-word removal step is added after segmentation.
Step 2.3, training a Word2Vec model on the corpus after word segmentation and stop-word removal, wherein the structure of the Word2Vec model is shown in FIG. 2. A few parameters are adjusted so that the model can be used in the subsequent steps: the word vector dimension is set to 64; the training window is set to 5, that is, the five words before and the five words after each word are considered; the minimum word frequency is set to 5, that is, a word that appears fewer than five times in the whole corpus is discarded; the learning rate is set to 0.025; and the number of iterations is set to 10.
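Steps 2.2 and 2.3 can be sketched as follows. The patent does not name the Word2Vec library, so the keyword names below follow the gensim 4.x `Word2Vec` constructor as an assumption; the helper names `clean_article`, `tokenize`, and `train_word2vec` are hypothetical. The segmenter is passed in as a function so that the example runs without Jieba, with `str.split` standing in for it on an English sample.

```python
import re
from typing import Callable, Iterable, List, Set

# Matches opening tags such as <doc id="1" title="..."> and closing </doc> tags.
DOC_TAG = re.compile(r"</?doc[^>]*>")

# Values from step 2.3; the keyword names assume gensim 4.x Word2Vec.
WORD2VEC_PARAMS = {
    "vector_size": 64,   # word vector dimension
    "window": 5,         # five words before and five words after
    "min_count": 5,      # discard words seen fewer than five times
    "alpha": 0.025,      # learning rate
    "epochs": 10,        # number of iterations
}


def clean_article(text: str) -> str:
    """Remove <doc ...> and </doc> tags left in the extracted corpus."""
    return DOC_TAG.sub("", text).strip()


def tokenize(text: str, segment: Callable[[str], Iterable[str]],
             stop_words: Set[str]) -> List[str]:
    """Segment cleaned text, then drop empty tokens and stop words."""
    return [w for w in segment(clean_article(text)) if w.strip() and w not in stop_words]


def train_word2vec(sentences: List[List[str]]):
    """Train a model on tokenized sentences (requires the third-party gensim package)."""
    from gensim.models import Word2Vec  # imported lazily so the helpers above run without it
    return Word2Vec(sentences=sentences, **WORD2VEC_PARAMS)


# str.split stands in for a Chinese segmenter in this English illustration.
print(tokenize('<doc id="1" title="t">the cat sat</doc>', str.split, {"the"}))
# ['cat', 'sat']
```

With Jieba installed, `jieba.lcut` can be passed as `segment` for Chinese text, and the resulting token lists can be handed to `train_word2vec`.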
Step 3, dividing the factory building potential safety hazard texts into a training set, a test set and a validation set, and generating word vectors for the potential safety hazard corpus, which specifically comprises:
Step 3.1, extracting the text data stored in the factory building potential safety hazard database formed in step 1 by using pymysql.
Step 3.2, standardizing the extracted potential safety hazard text data according to the data format of FIG. 3, wherein each potential safety hazard uses the format "risk level + text content", with the two parts separated by a tab character (\t).
Step 3.3, performing word segmentation on the specific content of the potential safety hazard by using Jieba, and, after obtaining the segmentation result, establishing a dedicated stop-word list for the text of the specific factory building. For example, the word "cause" may appear frequently in the potential safety hazard text of a certain factory building but contribute nothing to risk level evaluation, so it is added to the stop-word list.
Step 3.4, feeding the text data, after word segmentation and stop-word removal, into the word vector model trained in step 2, and outputting a 64-dimensional feature vector for each corresponding word.
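The standardization in step 3.2 and the division into three sets can be sketched as follows. The patent does not state the split proportions, so the 80/10/10 ratio, the fixed shuffle seed, and the helper names `to_line` and `split_dataset` are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def to_line(level: int, text: str) -> str:
    """Format one hazard as 'risk level<TAB>text content' (step 3.2)."""
    return f"{level}\t{text}"


def split_dataset(lines: Sequence[str], train: float = 0.8, test: float = 0.1,
                  seed: int = 0) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle and split into training, test and validation sets.

    The 80/10/10 default is an assumption; the remainder after the training
    and test shares becomes the validation set.
    """
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = int(len(shuffled) * train)
    n_test = int(len(shuffled) * test)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])


# Hypothetical records, for illustration only.
lines = [to_line(i % 4 + 1, f"hazard text {i}") for i in range(10)]
train_set, test_set, validation_set = split_dataset(lines)
print(len(train_set), len(test_set), len(validation_set))  # 8 1 1
```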
Step 4, feeding the divided and preprocessed potential safety hazard texts into the neural network model for fine-tuning to obtain a risk level classification model, which specifically comprises:
Step 4.1, establishing the model, namely building the BERT pre-trained model by using Python; for a Chinese corpus, the BERT Chinese model is generally used.
Step 4.2, reading the divided potential safety hazard text training set, test set and validation set, and training the potential safety hazard risk level classification model after modifying some parameters. The model parameters are modified as follows: the maximum sentence length is set to 50, the learning rate is set to 2e-5, the number of iterations is set to 3, and the batch size is set to 16. Training with these settings yields the fine-tuned risk level classification model.
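The four fine-tuning settings in step 4.2 can be collected in one place, as in the sketch below. The patent does not say which BERT implementation is used; the parameter names here follow the command-line flags of the `run_classifier.py` script in the google-research BERT repository, which is an assumption, and the helper `as_flags` is hypothetical.

```python
# Values from step 4.2. The key names assume the flags of google-research BERT's
# run_classifier.py; another implementation would use different names.
FINE_TUNE_PARAMS = {
    "max_seq_length": 50,      # maximum sentence length
    "learning_rate": 2e-5,
    "num_train_epochs": 3,     # number of iterations
    "train_batch_size": 16,
}


def as_flags(params: dict) -> list:
    """Render the settings as --name=value command-line arguments."""
    return [f"--{name}={value}" for name, value in params.items()]


print(as_flags(FINE_TUNE_PARAMS))
```

Keeping the values in a single dictionary makes it easy to log them next to each trained model, so that a later retraining on an updated hazard database uses the same settings.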
Step 5, when a new potential safety hazard appears in the factory building, collecting its relevant element information, feeding it into the risk level classification model, and estimating the risk level, which specifically comprises:
preprocessing the text information of the new potential safety hazard, feeding the preprocessed text information into the fine-tuned BERT risk level classification model, and ranking the candidate risk levels of the new potential safety hazard from highest to lowest probability according to the risk level probabilities output by the model.
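The ranking in step 5 amounts to sorting the candidate risk levels by the probability the classifier assigns to each. A minimal sketch follows, in which the function name `rank_levels` and the sample probabilities are hypothetical.

```python
from typing import Dict, List


def rank_levels(probabilities: Dict[int, float]) -> List[int]:
    """Return risk levels ordered from highest to lowest predicted probability."""
    return sorted(probabilities, key=probabilities.get, reverse=True)


# Hypothetical classifier output: risk level -> probability.
print(rank_levels({1: 0.1, 2: 0.6, 3: 0.3}))  # [2, 3, 1]
```

The first element of the ranked list is the initial estimate; the remaining elements are the fallback candidates examined in step 6 if the first one does not pass the confidence check.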
Step 6, after the estimated risk level is obtained, comparing the text similarity between the new potential safety hazard and the potential safety hazards of the same risk level existing in the database, and thereby evaluating the estimated risk level, which specifically comprises:
Step 6.1, after the risk level of the newly appearing potential safety hazard has been estimated, extracting from the potential safety hazard database the text data whose risk level equals the highest-probability risk level obtained in step 5.
Step 6.2, performing word segmentation and stop-word removal on the text of the newly appearing potential safety hazard and on the text data of the same-level potential safety hazards extracted from the database, and then feeding the resulting words into the Word2Vec word vector model generated in step 2 to generate the corresponding word vectors.
Step 6.3, because the potential safety hazard texts are mostly short, usually within 50 words after preprocessing, representing the text vector of each potential safety hazard by the average of the word vectors in its text.
Step 6.4, representing the similarity between the new potential safety hazard and each same-level potential safety hazard text by the cosine similarity of their text vectors, with the confidence threshold set to 0.5. That is, if the similarity between the new potential safety hazard text and more than 50% of the existing same-level potential safety hazard texts exceeds 50%, the risk level estimated by the risk level classification model is considered credible; otherwise, the risk level with the next-highest prediction probability from step 5 is selected and the operations of steps 6.2 to 6.3 are repeated. If none of the candidate risk levels passes the confidence threshold, the risk level with the highest prediction probability output by the risk level classification model is selected.
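The confidence check in steps 6.3 and 6.4 can be sketched in plain Python as follows. The sketch assumes that word vectors are already available as lists of floats and that the same-level text vectors are grouped in a dictionary keyed by risk level; all function names, parameter names, and sample vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(word_vectors: Sequence[Vector]) -> Vector:
    """Text vector = element-wise average of the word vectors (step 6.3)."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_credible(new_vec: Vector, same_level: Sequence[Vector],
                sim_threshold: float = 0.5, share_threshold: float = 0.5) -> bool:
    """True if more than half of the same-level texts have similarity above 0.5."""
    if not same_level:
        return False
    similar = sum(1 for v in same_level if cosine(new_vec, v) > sim_threshold)
    return similar / len(same_level) > share_threshold


def confirm_level(new_vec: Vector, ranked_levels: Sequence[int],
                  vectors_by_level: Dict[int, List[Vector]]) -> int:
    """Walk the levels in descending probability; fall back to the top one."""
    for level in ranked_levels:
        if is_credible(new_vec, vectors_by_level.get(level, [])):
            return level
    return ranked_levels[0]


# Hypothetical 2-dimensional text vectors, for illustration only.
existing = {2: [[0.0, 1.0], [0.0, 1.0]],
            3: [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0]]}
print(confirm_level([1.0, 0.0], [2, 3, 1], existing))  # 3
```

In the sample, level 2 has the highest predicted probability but none of its stored texts resembles the new hazard, so the check moves on to level 3, where two of the three stored texts exceed the similarity threshold.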
The foregoing shows and describes the basic principles, main features, and advantages of the present invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above; the embodiments and the description only illustrate the principle of the invention. Various changes and modifications may be made without departing from the spirit and scope of the invention, and all such changes and modifications fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (7)

1. A method for rapidly estimating the risk level of potential safety hazards in a factory building, characterized by comprising the following steps:
step 1, establishing a factory building potential safety hazard database, which specifically comprises:
step 1.1, listing all potential safety hazards in the factory building according to its actual production conditions and the five production elements of the workshop (man, machine, material, method, and environment);
step 1.2, analyzing each potential safety hazard of the factory building according to the following six points:
(1) the cause of the potential safety hazard, including human factors and equipment and material factors;
(2) the hazard manifestation of the potential safety hazard, namely the actual form the hazard takes in the workshop when it occurs, specifically including environmental factors, equipment and material factors, and personnel factors;
(3) the hazardous consequences of the potential safety hazard, including consequences for personnel, materials, and equipment;
(4) the areas involved, namely the areas of the factory building where the potential safety hazard may occur;
(5) the objects involved, namely which of the five elements (man, machine, material, method, and environment) are related to the potential safety hazard;
(6) the risk level of the potential safety hazard, where the LEC method is used to evaluate the potential safety hazards initially existing in the factory building;
step 1.3, entering the six points of information for each potential safety hazard analyzed in step 1.2 into a database to form the factory building potential safety hazard database;
step 2, pre-training a Chinese word vector model on a Chinese corpus;
step 3, dividing the factory building potential safety hazard texts into a training set, a test set and a validation set, and generating word vectors for the potential safety hazard corpus, which specifically comprises:
step 3.1, extracting the text data stored in the factory building potential safety hazard database formed in step 1 by using pymysql;
step 3.2, standardizing the extracted potential safety hazard text data, wherein each potential safety hazard uses the format "risk level + text content", with the two parts separated by a tab character ('\t');
step 3.3, performing word segmentation on the potential safety hazard text content by using Jieba, and, after obtaining the segmentation result, establishing a dedicated stop-word list for the potential safety hazard text content of the specific factory building;
step 3.4, feeding the potential safety hazard text content, after word segmentation and stop-word removal, into the word vector model trained in step 2, and outputting the feature vector of each corresponding word;
step 4, feeding the divided and standardized potential safety hazard texts into a BERT neural network model for fine-tuning to obtain a risk level classification model;
step 5, when a new potential safety hazard appears in the factory building, collecting its relevant element information according to the six points in step 1.2, forming text information, feeding it into the risk level classification model, and predicting the risk level, which specifically comprises: preprocessing the text information of the new potential safety hazard, feeding the preprocessed text information into the fine-tuned risk level classification model, and ranking the candidate risk levels of the new potential safety hazard from highest to lowest probability according to the risk level probabilities output by the model;
step 6, after the estimated risk level of the new potential safety hazard is obtained, comparing the text similarity between the new potential safety hazard and the potential safety hazards of the same risk level stored in the database, and thereby evaluating the confidence of the estimated risk level of the new potential safety hazard.
2. The method according to claim 1, wherein the Chinese corpus in step 2 is the Wikipedia Chinese corpus.
3. The method according to claim 2, wherein step 2 specifically comprises:
step 2.1, downloading a Chinese corpus from the Wikipedia website as the original training data and converting it to Simplified Chinese characters by using the opencc tool;
step 2.2, extracting the content of the Chinese corpus by using a regular expression, and performing word segmentation on the corpus after it has been fully converted to Simplified Chinese characters;
step 2.3, training a Word2Vec model on the corpus after word segmentation and stop word removal.
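The regular-expression extraction of step 2.2 might look like the sketch below. The claim does not give the pattern, so the choice made here (keep only runs of characters in the basic CJK Unified Ideographs block) is an assumption, as is the helper name `extract_chinese`. Conversion with opencc, word segmentation and Word2Vec training are not shown.

```python
import re

# Runs of CJK Unified Ideographs (basic block U+4E00..U+9FFF).
# Assumption: "content" means Han text only; markup, Latin text,
# digits and punctuation are discarded.
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def extract_chinese(raw):
    """Return the Han-character runs found in a raw corpus line, in order."""
    return HAN_RUN.findall(raw)
```

The resulting runs would then be passed to the word segmenter before stop words are removed.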
4. The method according to claim 3, wherein the feature vector output in step 3.4 is a 64-dimensional feature vector.
5. The method according to claim 4, wherein step 4 specifically comprises:
step 4.1, building the model: constructing a pre-trained BERT model by using Python;
step 4.2, reading the divided potential safety hazard text training set, test set and validation set, and starting to train the potential safety hazard risk level classification model.
6. The method according to claim 5, wherein step 6 specifically comprises:
step 6.1, after the risk level of the new potential safety hazard has been estimated, extracting from the potential safety hazard database the text data whose risk level is the same as the highest-probability risk level estimated in step 5;
step 6.2, performing word segmentation and stop word removal on the text of the new potential safety hazard and on the same-level potential safety hazard texts extracted from the database, and then importing the words into the Word2Vec word vector model generated in step 2 to generate the corresponding word vectors;
step 6.3, representing each potential safety hazard text by a text vector equal to the average of the word vectors in that text;
step 6.4, measuring the similarity between the new potential safety hazard and each same-level potential safety hazard text by the cosine similarity of their text vectors, wherein the confidence threshold is set to 0.5, that is, if the similarity between the new potential safety hazard text and more than 50% of the existing same-level potential safety hazard texts exceeds 50%, the risk level estimated by the risk level classification model is considered credible; otherwise, selecting the risk level with the next-highest prediction probability from step 5 and repeating the operations of steps 6.2 to 6.3.
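Steps 6.3 and 6.4 reduce to a mean of word vectors, a cosine similarity and a fraction test. The sketch below uses plain Python lists in place of the Word2Vec output; the function names are hypothetical, and returning "not credible" when the database holds no same-level texts is an assumption, since the claim does not cover that case.

```python
import math


def mean_vector(word_vectors):
    """Text vector = component-wise average of the word vectors (step 6.3)."""
    dim = len(word_vectors[0])
    return [sum(v[i] for v in word_vectors) / len(word_vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_credible(new_vec, stored_vecs, threshold=0.5):
    """Apply the rule of step 6.4.

    The estimate is credible when more than `threshold` of the stored
    same-level texts have a cosine similarity above `threshold` with the
    new text. An empty database is treated as not credible (assumption).
    """
    if not stored_vecs:
        return False
    hits = sum(1 for v in stored_vecs if cosine(new_vec, v) > threshold)
    return hits / len(stored_vecs) > threshold
```

With three stored texts, for example, at least two of them must score above 0.5 against the new text for the predicted level to be accepted.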
7. The method according to claim 6, wherein in step 6.4, if none of the risk levels meets the confidence threshold, the risk level with the highest prediction probability output by the risk level classification model is selected.
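Taken together, step 5, step 6.4 and claim 7 describe a ranked search with a fallback, which can be sketched as below. The name `choose_level`, the dictionary input and the `credible` callback are illustrative assumptions; in practice the callback would run the similarity check of step 6 for the given level.

```python
def choose_level(probabilities, credible):
    """Pick a risk level from classifier probabilities.

    probabilities: dict mapping risk level -> predicted probability.
    credible: callable taking a risk level and returning True when the
        similarity check accepts that level (hypothetical interface).

    Levels are tried from the highest probability to the lowest; the first
    credible one wins. If none is credible, the highest-probability level
    is returned, as in claim 7.
    """
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    for level in ranked:
        if credible(level):
            return level
    return ranked[0]
```

Passing the check as a callback keeps the ranking logic independent of how the similarity is computed.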
CN201911034620.6A 2019-10-29 2019-10-29 Method for rapidly estimating risk level of potential safety hazard in factory building Active CN110928988B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911034620.6A CN110928988B (en) 2019-10-29 2019-10-29 Method for rapidly estimating risk level of potential safety hazard in factory building

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911034620.6A CN110928988B (en) 2019-10-29 2019-10-29 Method for rapidly estimating risk level of potential safety hazard in factory building

Publications (2)

Publication Number Publication Date
CN110928988A true CN110928988A (en) 2020-03-27
CN110928988B CN110928988B (en) 2022-10-14

Family

ID=69849678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911034620.6A Active CN110928988B (en) 2019-10-29 2019-10-29 Method for rapidly estimating risk level of potential safety hazard in factory building

Country Status (1)

Country Link
CN (1) CN110928988B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107730124A (en) * 2017-10-20 2018-02-23 安厦系统科技成都有限责任公司 A kind of method for carrying out security risk assessment for enterprise or project
CN109472462A (en) * 2018-10-18 2019-03-15 中山大学 A kind of project risk ranking method and device based on the fusion of multi-model storehouse
CN110347805A (en) * 2019-07-22 2019-10-18 中海油安全技术服务有限公司 Petroleum industry security risk key element extracting method, device, server and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070215A (en) * 2020-09-10 2020-12-11 北京理工大学 BP neural network-based dangerous situation analysis processing method and processing device
CN112070215B (en) * 2020-09-10 2023-08-29 北京理工大学 Processing method and processing device for dangerous situation analysis based on BP neural network

Also Published As

Publication number Publication date
CN110928988B (en) 2022-10-14

Similar Documents

Publication Publication Date Title
CN109165284B (en) Financial field man-machine conversation intention identification method based on big data
CN111198948B (en) Text classification correction method, apparatus, device and computer readable storage medium
CN108628822B (en) Semantic-free text recognition method and device
CN105095444A (en) Information acquisition method and device
CN106897439A (en) The emotion identification method of text, device, server and storage medium
CN113590764B (en) Training sample construction method and device, electronic equipment and storage medium
CN107729403A (en) Internet information indicating risk method and system
CN111984792A (en) Website classification method and device, computer equipment and storage medium
CN110750984B (en) Command line character string processing method, terminal, device and readable storage medium
CN104933072A (en) Multi-language internet information analysis method
CN114416979A (en) Text query method, text query equipment and storage medium
CN112257425A (en) Power data analysis method and system based on data classification model
CN110781673B (en) Document acceptance method and device, computer equipment and storage medium
CN110928988B (en) Method for rapidly estimating risk level of potential safety hazard in factory building
CN114625834A (en) Enterprise industry information determination method and device and electronic equipment
CN107577738A (en) A kind of FMECA method by SVM text mining processing datas
CN114722198A (en) Method, system and related device for determining product classification code
CN110110087A (en) A kind of Feature Engineering method for Law Text classification based on two classifiers
CN116644183A (en) Text classification method, device and storage medium
CN109947932B (en) Push information classification method and system
CN115712715A (en) Question answering method, device, electronic equipment and storage medium for introduction
CN113886520B (en) Code retrieval method, system and computer readable storage medium based on graph neural network
CN111090723B (en) Knowledge graph-based recommendation method for safe production content of power grid
CN113468882A (en) Method for identifying similar spare parts
CN112749530A (en) Text encoding method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant