CN114003683A - Alarm condition analysis method based on natural language processing and association rule - Google Patents
- Publication number
- CN114003683A (application CN202111303071.5A)
- Authority
- CN
- China
- Prior art keywords
- alarm
- natural language
- association rules
- data
- processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
Abstract
The invention relates to the technical field of alarm condition analysis, and in particular discloses an alarm condition analysis method based on natural language processing and association rules. The method comprises the following steps: acquiring original data of an alarm receiving and handling service; processing the original data with a natural language processing tool to obtain event triples; matching the event triples with accident factors and binding the structured data in each matched event triple with its accident factor, wherein an accident factor is a mapping relation table between the natural language of an alarm and the digital language of the alarm; mining association rules from the multiple pieces of structured data bound to each accident factor with an association rule mining algorithm to obtain a set with association rules; and processing the set with association rules to obtain an alarm condition analysis result. The method makes effective use of historical alarm condition data for alarm condition analysis.
Description
Technical Field
The invention relates to the technical field of alarm condition analysis, in particular to an alarm condition analysis method based on natural language processing and association rules.
Background
At present, an alarm receiving and processing service generates a large amount of unstructured alarm condition text data. Conventional processing and analysis methods, such as database queries, struggle to mine from this mass of alarm condition information the association relationships that would help alarm condition analysis and judgment, so historical alarm condition data cannot be used effectively to build an analysis and early-warning model.
Disclosure of Invention
The invention provides an alarm condition analysis method based on natural language processing and association rules, which solves the problem in the related art that historical alarm condition data cannot be used effectively for alarm condition analysis and early warning.
As an aspect of the present invention, there is provided an alarm condition analysis method based on natural language processing and association rules, including:
acquiring original data of an alarm receiving and handling service;
processing the original data of the alarm receiving and processing service with a natural language processing tool to obtain event triples;
matching the event triples with accident factors, and binding the structured data in each event triple with its accident factor, wherein an accident factor is a mapping relation table between the natural language of an alarm and the digital language of the alarm;
mining association rules from the multiple pieces of structured data bound to each accident factor according to an association rule mining algorithm to obtain a set with association rules;
and processing the set with association rules to obtain an alarm condition analysis result.
Further, the raw data of the alarm receiving and processing service comprises: structured data and unstructured data, the structured data including an alert ticket number, a data source, an alarm receiver, a jurisdiction, an alert type, an alert time, a treatment result flag, a feedback person, a feedback department, a feedback time, an alert reverse check flag, and an alert check flag; the unstructured data includes alarm content and feedback content.
Further, processing the raw data of the alarm receiving and processing service with a natural language processing tool to obtain event triples includes:
carrying out reduction processing on the structured data and carrying out data cleaning on the unstructured data;
and performing word segmentation, part-of-speech tagging, syntactic structure description and semantic dependency analysis on the cleaned unstructured data with the natural language processing tool, and constructing event triples.
Further, the matching the event triplet with the accident factor and binding the structured data in the event triplet with the accident factor includes:
classifying the accident factors according to the alarm condition types;
matching the classified accident factors with the event triples one by one;
binding the structured data in the matched event triples with the accident factors;
and repeating the steps until all the structured data in the matched event triples are bound with the accident factor.
Further, the mining of association rules for the multiple pieces of structured data bound to each accident factor according to an association rule mining algorithm to obtain a set with association rules includes:
establishing a set of items to be mined for the plurality of pieces of structured data bound by each accident factor, wherein the set of items represents a set of the plurality of pieces of structured data bound by each accident factor;
traversing the item set according to a preset minimum support threshold to obtain a frequent item set;
and traversing the non-empty subset of the frequent item set according to a preset minimum confidence threshold value to obtain a set with association rules.
Further, the traversing the item set according to a preset minimum support threshold to obtain a frequent item set includes:
setting a minimum support threshold;
calculating the support rate of each item set according to a support rate calculation formula;
and traversing the item sets, and if the support rate of the current item set is not less than the minimum support threshold, marking that item set as a frequent item set.
Further, the support rate calculation formula is as follows:
$$\mathrm{Support}(m_j) = \frac{\mathrm{Num}(m_j)}{\mathrm{Num}(D)} \times 100\%$$
wherein $\mathrm{Support}(m_j)$ represents the support rate of the item set $m_j$, $\mathrm{Num}(m_j)$ represents the number of tasks of the structured data $D$ that contain the item set $m_j$, and $\mathrm{Num}(D)$ represents the number of tasks of the structured data $D$.
Further, the traversing the non-empty subset of the frequent item set according to a preset minimum confidence threshold to obtain a set with association rules includes:
setting a minimum confidence threshold;
calculating the confidence of the item sets according to a confidence calculation formula;
traversing the non-empty subsets of each frequent item set, and if the confidence of the current non-empty subset is not less than the minimum confidence threshold, marking it as a set with association rules.
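Further, the confidence calculation formula is:
$$\mathrm{Confidence}(m_a \Rightarrow m_b) = \frac{\mathrm{Support}(m_a \Rightarrow m_b)}{\mathrm{Support}(m_a)}$$
wherein $m_a$ denotes the cause in the structured data $D$, $m_b$ denotes the conclusion in the structured data $D$, $\mathrm{Confidence}(m_a \Rightarrow m_b)$ denotes the confidence that the cause leads to the conclusion, $\mathrm{Support}(m_a \Rightarrow m_b)$ denotes the support rate of the cause leading to the conclusion, and $\mathrm{Support}(m_a)$ denotes the support rate of the cause $m_a$;
the minimum confidence threshold ranges from 70% to 75%.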
Further, the processing the set with the association rule to obtain an alarm analysis result includes:
performing attribute restoration processing on the set with the association rule;
and matching the content subjected to attribute restoration processing with an evaluation factor to obtain an alarm analysis conclusion and conclusion evaluation, wherein the evaluation factor represents a mapping relation table of the alarm analysis conclusion and the conclusion evaluation.
The alarm condition analysis method based on natural language processing and association rules uses a natural language processing tool and an association rule analysis method to establish association analysis between the main events of the alarm condition text and historical alarm condition data. Event triples can be extracted from unstructured text information, and association rules are established for different accident cause types by combining a large amount of historical data. This improves the alarm condition analysis capability of the alarm receiving and processing system and allows accident cause investigation and related corrective actions to be carried out in a more targeted way.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
fig. 1 is a flowchart of an alarm analysis method based on natural language processing and association rules according to the present invention.
Fig. 2 is a schematic diagram illustrating a traffic accident category item set style description provided by the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged under appropriate circumstances in order to facilitate the description of the embodiments of the invention herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
This embodiment provides an alarm condition analysis method based on natural language processing and association rules. Fig. 1 is a flowchart of the method according to an embodiment of the present invention. As shown in fig. 1, the method includes:
S110, acquiring original data of an alarm receiving and processing service;
In the embodiment of the present invention, the raw data of the alarm receiving and processing service is acquired, and the unstructured data and structured data are separated in order to establish an association rule model. The raw data includes structured data and unstructured data. The structured data includes an alert ticket number, a data source, an alarm receiver, a jurisdiction, an alert type, an alert time, a treatment result flag, a feedback person, a feedback department, a feedback time, an alert reverse check flag, and an alert check flag; the unstructured data includes alarm content and feedback content.
For example, the alarm content: "Alarm receiver A received the alarm. The caller (telephone: 13XXXXXXXXX) stated: 'My vehicle (SuBXXXXX) was parked at the roadside near the north gate of XX residential district on Longhu Avenue. A child sitting in the rear seat suddenly opened the vehicle door. Because the road was slippery in the rain, a human-powered tricycle coming from behind could not brake in time and collided with the door. The person on the tricycle fell to the ground, was injured, and was sent to a hospital for treatment.' The two parties to the accident are currently negotiating at the hospital, and because their opinions differ, an alarm was raised."
Feedback content: "Accident police officer B, on-site feedback: handled through negotiation."
Alert ticket number: "001100011011"
Alarm receiver: "alarm receiver A", jurisdiction: "XXXXX"
Alert type: "vehicle and non-motor vehicle"
Alert time: "yyyy-MM-dd", treatment result flag: "1" …
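The separation of structured and unstructured fields described in S110 can be sketched as follows. This is a minimal illustration only: the class `AlarmRecord`, the function `split_record`, and all field names are hypothetical and cover just a subset of the fields listed above.

```python
from dataclasses import dataclass, asdict

# Field names are hypothetical; only a subset of the listed fields is modelled.
UNSTRUCTURED_FIELDS = ("alarm_content", "feedback_content")


@dataclass
class AlarmRecord:
    ticket_number: str
    receiver: str
    jurisdiction: str
    alert_type: str
    alert_time: str
    treatment_result_flag: str
    alarm_content: str      # unstructured free text
    feedback_content: str   # unstructured free text


def split_record(record):
    """Return (structured_fields, unstructured_fields) as two dicts."""
    fields = asdict(record)
    unstructured = {name: fields.pop(name) for name in UNSTRUCTURED_FIELDS}
    return fields, unstructured


example = AlarmRecord(
    ticket_number="001100011011",
    receiver="alarm receiver A",
    jurisdiction="XXXXX",
    alert_type="vehicle and non-motor vehicle",
    alert_time="yyyy-MM-dd",
    treatment_result_flag="1",
    alarm_content="(alarm content text)",
    feedback_content="(feedback content text)",
)
structured, unstructured = split_record(example)
```

The structured part goes on to reduction processing and the unstructured part to data cleaning, as described in S120.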
S120, processing the original data of the alarm receiving and processing service through a tool based on natural language to obtain event triples;
In the embodiment of the present invention, the method specifically includes:
carrying out reduction processing on the structured data and carrying out data cleaning on the unstructured data;
and performing word segmentation, part-of-speech tagging, syntactic structure description and semantic dependency analysis on the cleaned unstructured data with a natural language processing tool, and constructing event triples.
When the structured data is reduced, for example, the alarm condition reverse check flag, the treatment result flag, the alarm condition verification flag and the like are set as Boolean attributes, and the alarm condition type, the data source and the like are set as numerical attributes. The meaning of each specific numerical value is prior knowledge and is maintained on an ongoing basis.
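A minimal sketch of this reduction step is given below. The code table `ALERT_TYPE_CODES`, the field names, and the numeric values are assumptions made for illustration; the source only states that flags become Boolean attributes and that types become numeric codes maintained as prior knowledge.

```python
# Hypothetical code table; in practice it would be maintained as prior knowledge.
ALERT_TYPE_CODES = {
    "vehicle and non-motor vehicle": 1,
    "vehicle and pedestrian": 2,
}

# Hypothetical names for the flag fields that become Boolean attributes.
BOOLEAN_FIELDS = ("reverse_check_flag", "treatment_result_flag", "verification_flag")


def reduce_structured(raw):
    """Convert flag fields to bool and the alert type to its numeric code."""
    reduced = dict(raw)
    for name in BOOLEAN_FIELDS:
        if name in reduced:
            reduced[name] = str(reduced[name]).strip() == "1"
    if "alert_type" in reduced:
        reduced["alert_type"] = ALERT_TYPE_CODES[reduced["alert_type"]]
    return reduced


reduced_example = reduce_structured(
    {"treatment_result_flag": "1", "alert_type": "vehicle and non-motor vehicle"}
)
```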
During data cleaning of the unstructured data, invalid characters are deleted, together with text that the system superimposes automatically and that is irrelevant to modeling, such as the alarm name, the alarm receiver's name, law enforcement requirements and law enforcement equipment information that the system prepends to the description of the alarm content.
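The cleaning step can be illustrated with a few regular expressions. The patterns below are assumptions about what system-superimposed text might look like in an English rendering of the example record; a real deployment would maintain its own patterns.

```python
import re

# Control characters and the Unicode replacement character count as invalid here.
INVALID_CHARS = re.compile(r"[\x00-\x1f\x7f\ufffd]")

# Hypothetical patterns for text the system superimposes before the description.
SYSTEM_TEXT_PATTERNS = [
    re.compile(r"^alarm receiver \S+ receives alarm,\s*", re.IGNORECASE),
    re.compile(r"\(telephone:\s*[0-9X]+\)"),
]


def clean_text(text):
    """Drop invalid characters and system-superimposed text, then tidy spaces."""
    text = INVALID_CHARS.sub(" ", text)
    for pattern in SYSTEM_TEXT_PATTERNS:
        text = pattern.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_text(
    "alarm receiver A receives alarm, caller (telephone: 13XXXXXXXXX) says:\x00 door opened\n"
)
```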
When word segmentation and part-of-speech tagging are performed on the unstructured data, for example, the alarm content is segmented into a word sequence and the part of speech of each word is recognized, that is, verbs, nouns, pronouns, adverbs and the like are identified.
In the syntactic structure description of the unstructured data, the dependency relationships between the words in a sentence are identified, namely the subject-verb relationship, the verb-object relationship, the preposition-object relationship, the coordinate relationship, the indirect-object relationship and the like.
Semantic dependency analysis is then performed on the unstructured data, and the triple of each event is constructed, namely the subject, predicate and object of the event. It is worth noting that the natural language processing of the text data may use, but is not limited to, an open source natural language processing toolkit.
Screening for relations such as SBV (subject-verb) and VOB (verb-object) yields the following triples:
('child', 'open', 'door'), dependencies: SBV, VOB;
('tricycle', 'collision', 'door'), dependencies: SBV, VOB;
('tricycle', 'fall', 'on ground'), dependencies: SBV, POB;
('I', 'send', 'hospital rescue'), dependencies: SBV, POB, VOB;
('both opinion', 'present', 'diverge'), dependencies: SBV, VOB;
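To make the screening step concrete, the sketch below builds subject-predicate-object triples from dependency arcs. The input format (a list of `(word, head_index, relation)` tuples) is a hypothetical simplification and not the output format of any particular toolkit; only the SBV and VOB relations are handled, so triples that rely on POB are out of scope here.

```python
def extract_triples(tokens):
    """Build (subject, predicate, object) triples from dependency arcs.

    tokens: list of (word, head_index, relation); head_index is the 0-based
    position of the head word, or -1 for the root (hypothetical format).
    """
    triples = []
    for verb_index, (verb, _, _) in enumerate(tokens):
        subjects = [w for w, head, rel in tokens if head == verb_index and rel == "SBV"]
        objects = [w for w, head, rel in tokens if head == verb_index and rel == "VOB"]
        for subject in subjects:
            for obj in objects:
                triples.append((subject, verb, obj))
    return triples


# "child open door": 'open' is the root, 'child' its subject, 'door' its object.
parsed = [("child", 1, "SBV"), ("open", -1, "HED"), ("door", 1, "VOB")]
triples = extract_triples(parsed)
```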
S130, matching the event triple with an accident factor, and binding the structured data in the event triple with the accident factor, wherein the accident factor represents a mapping relation table of a natural language of an alarm and a digital language of the alarm;
In the embodiment of the present invention, the method may specifically include:
classifying the accident factors according to the alarm condition types;
matching the classified accident factors with the event triples one by one;
binding the structured data in the matched event triples with the accident factors;
and repeating the steps until all the structured data in the matched event triples are bound with the accident factor.
Specifically, an accident factor δ is introduced and matched against the event triples. In the embodiment of the present invention, the accident factor is a mapping relation table between the natural language of the alarm and the digital language of the alarm. Accident factor data are maintained according to prior knowledge and classified by alarm type, and the accident factors are matched with the event triples according to the recorded alarm type: the accident factors of the relevant category are matched with the event triples one by one. Matching through accident factors avoids two problems of matching specific accident types directly against the alarm text: excessive time consumption, and ambiguity arising because the raw text carries no part-of-speech labels. After matching, the structured information of the accident is bound with the accident factor.
Taking an accident factor δ of list type as an example:
the contents of [open door, push solid line, go backwards, break, escape, roll pedestrian, scrape hit pedestrian, roll over, crash, …, traffic jam] are maintained manually. After matching with the triples obtained in step S120, the structured information of the accident is bound with the matched accident factor. The data after binding expand as follows:
Case time: "yyyy-MM-dd", case location: "XXXXXX", illegal action: "opening or closing a vehicle door so as to obstruct other vehicles and pedestrians", whether escaped: "0", whether there is a scene: "0", whether anyone is injured: "1", on-site traffic situation: "clear", whether the vehicle can move: "1", type of vehicle involved: "02 car", accident type: "vehicle collides with non-motor vehicle", accident factor: "door open" …
And repeating the processing steps on the historical data until all the data are effectively bound with the accident factor.
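The matching and binding of S130 might look like the sketch below. The table `ACCIDENT_FACTORS`, the field names, and the matching rule itself (a factor matches when all of its words appear among a triple's predicate and object) are assumptions; the source does not specify how a factor is compared with a triple.

```python
# Hypothetical accident factor table, classified by alarm type.
ACCIDENT_FACTORS = {
    "vehicle collides with non-motor vehicle": ["open door", "go backwards", "roll over"],
}


def match_factor(alarm_type, triples):
    """Try the factors of this alarm type one by one against the triples."""
    for factor in ACCIDENT_FACTORS.get(alarm_type, []):
        factor_words = set(factor.split())
        for _subject, predicate, obj in triples:
            # Assumed rule: every word of the factor occurs in predicate/object.
            if factor_words <= {predicate, obj}:
                return factor
    return None


def bind(structured, triples):
    """Attach the matched accident factor to the structured record."""
    factor = match_factor(structured["accident_type"], triples)
    if factor is None:
        return None
    return {**structured, "accident_factor": factor}


bound = bind(
    {"accident_type": "vehicle collides with non-motor vehicle", "injured": "1"},
    [("child", "open", "door"), ("tricycle", "collision", "door")],
)
```

Restricting the candidates to the factors of the recorded alarm type is what keeps the one-by-one matching short.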
S140, mining association rules of the multiple pieces of structured data bound by each accident factor according to an association rule mining algorithm to obtain a set with the association rules;
in the embodiment of the present invention, the method specifically includes:
establishing a set of items to be mined for the plurality of pieces of structured data bound by each accident factor, wherein the set of items represents a set of the plurality of pieces of structured data bound by each accident factor;
traversing the item set according to a preset minimum support threshold to obtain a frequent item set;
and traversing the non-empty subset of the frequent item set according to a preset minimum confidence threshold value to obtain a set with association rules.
It should be understood that the association rules are established class by class: the multiple pieces of structured data bound to each accident factor form an item set D (at this point the conversion of the unstructured and semi-structured data into structured data is complete).
Association rules are mined from the multiple pieces of bound structured data corresponding to each accident factor by constructing an association rule mining algorithm (specifically, the Apriori algorithm may be adopted). First, the items of the item set D are designed differently for different accident factors and accident types; for example, the items of the alarm condition item sets of the traffic accident class and of the criminal and public security class are not the same. The item set $D_{trans}$ of the traffic accident class may be designed with, but is not limited to, the following fields: case time, case location, illegal behavior, whether escaped, whether there is a scene, whether anyone is injured, on-site traffic situation, whether the vehicle can move, type of vehicle involved in the accident, type of personnel, accident type, accident factor and the like. $D_{trans} = \{t_1, t_2, t_3, \ldots, t_k\}$, where $k$ is the number of tasks in the item set of that type, $k = \mathrm{Num}(D_{trans})$. A task $t_k$ corresponds to one alarm condition record after the data has been structured, and $m_j$ in $t_k$ represents an item set in $D_{trans}$: $t_k = \{m_1, m_2, m_3, \ldots, m_j, \ldots\}\ (j = 1, 2, 3, \ldots, l)$.
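As an illustration of how the bound records could be turned into tasks, the sketch below groups records by accident factor and encodes every field as a `field=value` item. The encoding and the field names are assumptions chosen for readability; the source only fixes which kinds of fields an item set may contain.

```python
from collections import defaultdict


def build_item_sets(bound_records):
    """Group bound records by accident factor; each record becomes one task."""
    by_factor = defaultdict(list)
    for record in bound_records:
        # Every field becomes one item of the task, encoded as "field=value".
        task = frozenset(f"{key}={value}" for key, value in record.items())
        by_factor[record["accident_factor"]].append(task)
    return dict(by_factor)


# Hypothetical bound records with only a few of the possible fields.
records = [
    {"accident_factor": "open door", "injured": "1", "escaped": "0"},
    {"accident_factor": "open door", "injured": "0", "escaped": "0"},
    {"accident_factor": "go backwards", "injured": "1", "escaped": "1"},
]
item_sets = build_item_sets(records)
```

Each value of `item_sets` then plays the role of one class-specific item set on which mining is run separately.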
Further specifically, traversing the item set according to a preset minimum support threshold to obtain a frequent item set, including:
setting a minimum support threshold;
calculating the support rate of each item set according to a support rate calculation formula;
and traversing the item sets, and if the support rate of the current item set is not less than the minimum support threshold, marking that item set as a frequent item set.
In the embodiment of the present invention, the support rate calculation formula is as follows:
$$\mathrm{Support}(m_j) = \frac{\mathrm{Num}(m_j)}{\mathrm{Num}(D)} \times 100\%$$
wherein $\mathrm{Support}(m_j)$ represents the support rate of the item set $m_j$, $\mathrm{Num}(m_j)$ represents the number of tasks of the structured data $D$ that contain the item set $m_j$, and $\mathrm{Num}(D)$ represents the number of tasks of the structured data $D$.
The search proceeds level by level, each pass using the frequent item sets obtained in the previous pass, until all frequent item sets are obtained.
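The support rate and the first search pass can be sketched as follows. The function names and the toy tasks are illustrative; the support rate is returned as a fraction between 0 and 1 rather than as a percentage.

```python
def support(item_set, tasks):
    """Fraction of tasks that contain every item of item_set."""
    if not tasks:
        raise ValueError("tasks must not be empty")
    items = frozenset(item_set)
    return sum(1 for task in tasks if items <= task) / len(tasks)


def frequent_one_item_sets(tasks, min_support):
    """Return {1-item set: support rate} for the sets meeting the threshold."""
    result = {}
    for item in {i for task in tasks for i in task}:
        candidate = frozenset([item])
        rate = support(candidate, tasks)
        if rate >= min_support:  # "not less than" the minimum support threshold
            result[candidate] = rate
    return result


tasks = [frozenset("ab"), frozenset("a"), frozenset("bc"), frozenset("ac")]
frequent = frequent_one_item_sets(tasks, min_support=0.6)
```

With these four toy tasks, item `a` appears in three of them, so it is the only 1-item set that reaches a 60% threshold.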
Further specifically, the traversing the non-empty subset of the frequent item set according to a preset minimum confidence threshold to obtain a set with association rules includes:
setting a minimum confidence threshold;
calculating the confidence of the item sets according to a confidence calculation formula;
traversing the non-empty subsets of each frequent item set, and if the confidence of the current non-empty subset is not less than the minimum confidence threshold, marking it as a set with association rules.
In the embodiment of the present invention, the confidence calculation formula is:
$$\mathrm{Confidence}(m_a \Rightarrow m_b) = \frac{\mathrm{Support}(m_a \Rightarrow m_b)}{\mathrm{Support}(m_a)}$$
wherein $m_a$ denotes the cause in the structured data $D$, $m_b$ denotes the conclusion in the structured data $D$, $\mathrm{Confidence}(m_a \Rightarrow m_b)$ denotes the confidence that the cause leads to the conclusion, $\mathrm{Support}(m_a \Rightarrow m_b)$ denotes the support rate of the cause leading to the conclusion, and $\mathrm{Support}(m_a)$ denotes the support rate of the cause $m_a$;
the minimum confidence threshold ranges from 70% to 75%.
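The confidence of a rule is the support rate of cause and conclusion together divided by the support rate of the cause alone. A minimal sketch, with illustrative function names and toy tasks, and rates expressed as fractions:

```python
def support(item_set, tasks):
    """Fraction of tasks that contain every item of item_set."""
    items = frozenset(item_set)
    return sum(1 for task in tasks if items <= task) / len(tasks)


def confidence(cause, conclusion, tasks):
    """Support of cause and conclusion together, divided by support of the cause."""
    cause = frozenset(cause)
    cause_support = support(cause, tasks)
    if cause_support == 0:
        return 0.0  # the cause never occurs, so no rule can be supported
    return support(cause | frozenset(conclusion), tasks) / cause_support


# Toy tasks: "no_scene" occurs four times, three of them together with "escape".
tasks = [
    frozenset({"no_scene", "escape"}),
    frozenset({"no_scene", "escape"}),
    frozenset({"no_scene", "escape"}),
    frozenset({"no_scene"}),
]
rule_confidence = confidence({"no_scene"}, {"escape"}, tasks)
keeps_rule = rule_confidence >= 0.70  # lower end of the 70% to 75% range
```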
Specifically, the number of tasks in the item set $D_{trans}$ that contain $m_j$ is the support count of the item set $m_j$, denoted $\mathrm{Num}(m_j)$. The support rate of $m_j$ is then support count / number of tasks × 100%, namely:
$$\mathrm{Support}(m_j) = \frac{\mathrm{Num}(m_j)}{\mathrm{Num}(D_{trans})} \times 100\%$$
A minimum support threshold $\mathrm{Support}_{\min}$ is set. It should be noted that the minimum support threshold may be preset, for example to between 25% and 35%. If the calculated $\mathrm{Support}(m_j)$ is not less than $\mathrm{Support}_{\min}$, then $m_j$ is recorded as a frequent item set.
The pattern description of the item set $D_{trans}$ is shown in detail in fig. 2 (for ease of analysis, time is further discretized into time periods).
For the item set $D_{trans}$, let $m_a$ and $m_b$ denote the cause and the conclusion respectively. The support rate of $m_a \Rightarrow m_b$ is the probability $P(m_a \cap m_b)$, i.e. $\mathrm{Support}(m_a \Rightarrow m_b) = P(m_a \cap m_b)$. In the Apriori algorithm, confidence describes the degree to which the cause $m_a$ leads to the conclusion $m_b$. The confidence of $m_a \Rightarrow m_b$ is the conditional probability that a task in $D_{trans}$ containing $m_a$ also contains $m_b$, i.e.:
$$\mathrm{Confidence}(m_a \Rightarrow m_b) = P(m_b \mid m_a) = \frac{\mathrm{Support}(m_a \Rightarrow m_b)}{\mathrm{Support}(m_a)}$$
A minimum confidence threshold $\mathrm{Confidence}_{\min}$ is set. Likewise, the minimum confidence threshold may be preset, for example to between 70% and 75%.
The data item set $D_{trans}$ is traversed to find the frequent 1-item sets whose support rate is not less than the minimum support threshold, $\mathrm{Support}(m_j) \geq \mathrm{Support}_{\min}$. The frequent 1-item sets are then used to search for the frequent 2-item sets, and so on until all frequent k-item sets are found. Finally, the non-empty subsets of the final frequent item sets are screened again by the minimum confidence threshold $\mathrm{Confidence}_{\min}$ to obtain the final association rule set.
For example,
TABLE 1 search results for frequent 1-item set
| 1-item | Support |
| M1 | 35 |
| M2 | 26 |
| M3 | 48 |
| M4 | 21 |
| M5 | 26 |
| M6 | 27 |
| M7 | 29 |
| M8 | 41 |
TABLE 2 search results for frequent 2-item set
| 2-item | Support |
| M1,M2 | 15 |
| M1,M3 | 13 |
| M1,M4 | 26 |
| M1,M5 | 12 |
| M1,M6 | 11 |
| ...... | ...... |
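The level-wise search (frequent 1-item sets, then 2-item sets, and so on) followed by confidence screening can be sketched as below. This is a compact textbook-style Apriori written for illustration, not the patent's implementation; the function names and toy tasks are assumptions, and rates are fractions rather than percentages.

```python
from itertools import combinations


def apriori(tasks, min_support):
    """Return {frequent item set: support rate}, searched level by level."""
    def rate(candidate):
        return sum(1 for task in tasks if candidate <= task) / len(tasks)

    level = {}
    for item in {i for task in tasks for i in task}:
        candidate = frozenset([item])
        r = rate(candidate)
        if r >= min_support:
            level[candidate] = r

    frequent, k = {}, 1
    while level:
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-item sets, keeping only k-item candidates whose
        # (k-1)-item subsets are all frequent (the Apriori pruning step).
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        pruned = [c for c in joined
                  if all(frozenset(s) in level for s in combinations(c, k - 1))]
        level = {}
        for candidate in pruned:
            r = rate(candidate)
            if r >= min_support:
                level[candidate] = r
    return frequent


def rules(frequent, min_confidence):
    """Split each frequent item set into cause => conclusion and screen it."""
    found = []
    for item_set, rate in frequent.items():
        for size in range(1, len(item_set)):
            for cause in map(frozenset, combinations(sorted(item_set), size)):
                conf = rate / frequent[cause]
                if conf >= min_confidence:
                    found.append((cause, item_set - cause, conf))
    return found


tasks = [frozenset("AB"), frozenset("AB"), frozenset("AB"), frozenset("AC")]
frequent = apriori(tasks, min_support=0.3)
found_rules = rules(frequent, min_confidence=0.7)
```

Looking up `frequent[cause]` is safe because every subset of a frequent item set is itself frequent and has therefore been recorded by an earlier pass.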
S150, processing the set with the association rule to obtain an alarm condition analysis result.
The method specifically comprises the following steps:
performing attribute restoration processing on the set with the association rule;
and matching the content subjected to attribute restoration processing with an evaluation factor to obtain an alarm analysis conclusion and conclusion evaluation, wherein the evaluation factor represents a mapping relation table of the alarm analysis conclusion and the conclusion evaluation.
It should be understood that, after accident factor matching, the data are expressed in the digital language; the attribute restoration processing performed on the set with association rules restores this expression to natural language.
After attribute restoration, the restored content is matched with an evaluation factor, wherein the evaluation factor is a mapping relation table between alarm condition analysis conclusions and conclusion evaluations. Matching the restored content with the evaluation factor therefore yields an alarm condition analysis conclusion and the corresponding conclusion evaluation.
For example, the data type of the evaluation factor γ is a map, in which each key records a conclusion and the corresponding value is the evaluation of that conclusion. For instance, given the rule: cause $m_a$: no scene, vehicle escaped ⇒ conclusion $m_b$: the case location is a road section near XXX primary school (the location attribute is primary school), one key of the map is: case location attribute, road section near a primary school (23_XXXXX primary school, where 23 denotes that the location attribute value is a road section near a primary school), and the value corresponding to that key is "strengthen school perimeter supervision".
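Attribute restoration and evaluation factor matching amount to two table lookups. In the sketch below, the tables `ATTRIBUTE_VALUES` and `EVALUATION_FACTOR`, the key format, and the function name are hypothetical; only the code 23 and the evaluation text come from the example above.

```python
# Hypothetical restoration table: (attribute, numeric code) -> natural language.
ATTRIBUTE_VALUES = {
    ("case_location_attribute", 23): "road section near a primary school",
}

# Hypothetical evaluation factor map: conclusion key -> conclusion evaluation.
EVALUATION_FACTOR = {
    ("case_location_attribute", 23): "strengthen school perimeter supervision",
}


def restore_and_evaluate(conclusion_items):
    """Restore each coded conclusion to text and look up its evaluation."""
    results = []
    for key in conclusion_items:
        text = ATTRIBUTE_VALUES.get(key)
        if text is None:
            continue  # unknown code: nothing to restore
        results.append((f"{key[0]}: {text}", EVALUATION_FACTOR.get(key)))
    return results


analysis = restore_and_evaluate([("case_location_attribute", 23)])
```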
In summary, the alarm condition analysis method based on natural language processing and association rules provided by the embodiment of the invention uses a natural language processing tool and an association rule analysis method to establish association analysis between the main events of the alarm condition text and historical alarm condition data. Event triples can be extracted from unstructured text information, and association rules are established for different accident cause types by combining a large amount of historical data. This improves the alarm condition analysis capability of the alarm receiving and processing system and allows accident cause investigation and related corrective actions to be carried out in a more targeted way.
It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.
Claims (10)
1. An alarm condition analysis method based on natural language processing and association rules is characterized by comprising the following steps:
acquiring original data of an alarm receiving and handling service;
processing the original data of the alarm receiving and processing service by a tool based on natural language to obtain event triples;
matching the event triple with an accident factor, and binding the structured data in the event triple with the accident factor, wherein the accident factor represents a mapping relation table of a natural language of an alarm and a digital language of the alarm;
mining the association rules of the multiple pieces of structured data bound by each accident factor according to an association rule mining algorithm to obtain a set with the association rules;
and processing the set with the association rule to obtain an alarm condition analysis result.
2. The alarm condition analyzing method based on natural language processing and association rules as claimed in claim 1, wherein the raw data of the alarm receiving and processing service comprises: structured data and unstructured data, the structured data including an alert ticket number, a data source, an alarm receiver, a jurisdiction, an alert type, an alert time, a treatment result flag, a feedback person, a feedback department, a feedback time, an alert reverse check flag, and an alert check flag; the unstructured data includes alarm content and feedback content.
3. A method for analyzing an alarm situation based on natural language processing and association rules according to claim 2, wherein the processing the raw data of the alarm receiving and processing service by a natural language based tool to obtain event triples comprises:
carrying out reduction processing on the structured data and carrying out data cleaning on the unstructured data;
and performing word segmentation, part-of-speech tagging, syntactic structure description and semantic dependency analysis on the unstructured data after data cleaning according to a tool based on natural language, and constructing an event triple.
4. A method for alarm analysis based on natural language processing and association rules according to claim 1, wherein the matching the event triples with accident factors and the binding of the structured data in the event triples with the accident factors comprises:
classifying the accident factors according to the alarm condition types;
matching the classified accident factors with the event triples one by one;
binding the structured data in the matched event triples with the accident factors;
and repeating the steps until all the structured data in the matched event triples are bound with the accident factor.
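One way to picture the matching and binding of claim 4 is the sketch below. The accident-factor table, its keywords and numeric codes, and the record layout (`alert_type`, `triple`, `structured`) are all hypothetical; the claims only state that an accident factor maps the natural language of an alarm to a digital representation.

```python
from collections import defaultdict

# Hypothetical accident-factor table, classified by alarm type:
# alarm type -> {natural-language keyword: numeric code}.
ACCIDENT_FACTORS = {
    "traffic": {"collision": 101, "drunk driving": 102},
    "fire": {"smoke": 201, "explosion": 202},
}


def bind_records(records, factors=ACCIDENT_FACTORS):
    """Bind each record's structured data to every accident factor its triple matches.

    Only the factors classified under the record's own alarm type are tried,
    and each of them is compared with the triple one by one.
    """
    bound = defaultdict(list)
    for rec in records:
        candidates = factors.get(rec["alert_type"], {})
        triple_text = " ".join(rec["triple"])
        for keyword, code in candidates.items():
            if keyword in triple_text:
                bound[code].append(rec["structured"])
    return dict(bound)


# Illustrative records; ticket identifiers are made up.
records = [
    {"alert_type": "traffic", "triple": ("car", "collision", "truck"),
     "structured": {"ticket": "A1"}},
    {"alert_type": "fire", "triple": ("caller", "saw", "smoke"),
     "structured": {"ticket": "A2"}},
]
print(bind_records(records))
```

Substring matching is used here only for brevity; the claims do not say how a triple is compared with a factor.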
5. A method for analyzing an alarm situation based on natural language processing and association rules according to claim 1, wherein the mining association rules for the plurality of pieces of structured data bound by each accident factor according to an association rule mining algorithm to obtain a set with association rules comprises:
establishing a set of items to be mined for the plurality of pieces of structured data bound by each accident factor, wherein the set of items represents a set of the plurality of pieces of structured data bound by each accident factor;
traversing the item set according to a preset minimum support threshold to obtain a frequent item set;
and traversing the non-empty subset of the frequent item set according to a preset minimum confidence threshold value to obtain a set with association rules.
6. A method for analyzing an alarm condition based on natural language processing and association rules according to claim 5, wherein traversing the set of items according to a preset minimum support threshold to obtain a frequent item set comprises:
setting a minimum support threshold;
calculating the support rate of each item set within the set of items according to a support rate calculation formula;
and traversing the set of items, and if the support rate of the current item set is not less than the minimum support threshold, marking that item set as a frequent item set.
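The traversal in claim 6 can be illustrated with a small level-wise search in the style of Apriori. The claims do not name a specific algorithm, so this choice, the function names and the sample data are assumptions. Each transaction stands for the structured data of one alarm record, reduced to a set of attribute values.

```python
from itertools import combinations


def support(itemset, transactions):
    """Support rate: share of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Return {item set: support rate} for all item sets meeting `min_support`."""
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    frequent = {}
    while current:
        # Keep the candidates whose support rate is not less than the threshold.
        level = {}
        for candidate in current:
            rate = support(candidate, transactions)
            if rate >= min_support:
                level[candidate] = rate
        frequent.update(level)
        # Join frequent k-item sets pairwise into (k+1)-item candidates.
        keys = list(level)
        size = len(keys[0]) + 1 if keys else 0
        current = list({a | b for a, b in combinations(keys, 2) if len(a | b) == size})
    return frequent


# Illustrative transactions (made-up attribute values).
transactions = [
    {"night", "rain", "collision"},
    {"night", "collision"},
    {"rain", "collision"},
    {"night", "rain"},
]
for itemset, rate in frequent_itemsets(transactions, 0.5).items():
    print(sorted(itemset), rate)
```

Extending only item sets that are already frequent avoids scoring every possible subset, which matters once records carry many attributes.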
7. The alarm analysis method based on natural language processing and association rules of claim 6, wherein the support rate calculation formula is:
$$\mathrm{Support}(m_j) = \frac{\mathrm{Num}(m_j)}{\mathrm{Num}(D)}$$
wherein $\mathrm{Support}(m_j)$ represents the support rate of the item set $m_j$, $\mathrm{Num}(m_j)$ represents the number of transactions in the structured data $D$ that contain the item set $m_j$, and $\mathrm{Num}(D)$ represents the number of transactions in the structured data $D$;
8. The alarm analysis method based on natural language processing and association rules according to claim 5, wherein traversing the non-empty subset of the frequent item set according to a preset minimum confidence threshold to obtain a set with association rules comprises:
setting a minimum confidence threshold;
calculating a confidence of a set of items within the set of items according to a confidence calculation formula;
traversing the non-empty subset of the frequent item set, and if the confidence of the item set of the non-empty subset of the current frequent item set is not less than the minimum confidence threshold, marking the item set as a set with association rules.
9. A method for alarm analysis based on natural language processing and association rules according to claim 8, wherein the confidence calculation formula is:
$$\mathrm{Confidence}(m_a \Rightarrow m_b) = \frac{\mathrm{Support}(m_a \cup m_b)}{\mathrm{Support}(m_a)}$$
wherein $m_a$ denotes the cause in the structured data $D$, $m_b$ denotes the conclusion in the structured data $D$, $\mathrm{Confidence}(m_a \Rightarrow m_b)$ denotes the confidence that the cause $m_a$ leads to the conclusion $m_b$, $\mathrm{Support}(m_a \cup m_b)$ denotes the support rate of the cause $m_a$ and the conclusion $m_b$ occurring together, and $\mathrm{Support}(m_a)$ denotes the support rate of the cause $m_a$;
the minimum confidence threshold value ranges from 70% to 75%.
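Claims 8 and 9 can be sketched as below. The function names and sample data are assumptions. Confidence is computed from raw counts, which equals the ratio of the two support rates because the total number of transactions cancels. The default threshold of 0.70 is the lower end of the 70% to 75% range stated in claim 9.

```python
from itertools import combinations


def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def association_rules(frequent_sets, transactions, min_confidence=0.70):
    """Return (cause, conclusion, confidence) for every rule meeting the threshold.

    For each frequent item set, every non-empty proper subset is tried as the
    cause and the remaining items form the conclusion.
    """
    rules = []
    for itemset in frequent_sets:
        together = count(itemset, transactions)
        for r in range(1, len(itemset)):
            for picked in combinations(sorted(itemset), r):
                cause = frozenset(picked)
                cause_count = count(cause, transactions)
                if cause_count == 0:
                    continue  # undefined confidence; cannot occur for truly frequent sets
                confidence = together / cause_count
                if confidence >= min_confidence:
                    rules.append((cause, itemset - cause, confidence))
    return rules


# Illustrative data (made-up attribute values).
transactions = [
    {"night", "collision"},
    {"night", "collision"},
    {"night", "collision"},
    {"night"},
    {"rain"},
]
for cause, conclusion, conf in association_rules(
        [frozenset({"night", "collision"})], transactions):
    print(sorted(cause), "=>", sorted(conclusion), conf)
```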
10. The alarm analysis method based on natural language processing and association rules according to claim 1, wherein the processing the set with association rules to obtain the alarm analysis result comprises:
performing attribute restoration processing on the set with the association rule;
and matching the content subjected to attribute restoration processing with an evaluation factor to obtain an alarm analysis conclusion and conclusion evaluation, wherein the evaluation factor represents a mapping relation table of the alarm analysis conclusion and the conclusion evaluation.
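The two steps of claim 10 might look like the following sketch. Both lookup tables, their codes and their wording are invented for illustration; the claims only state that attribute restoration is applied and that an evaluation factor maps an analysis conclusion to a conclusion evaluation.

```python
# Hypothetical reverse mapping from numeric codes back to readable attributes.
CODE_TO_ATTRIBUTE = {101: "collision", 301: "night", 302: "rain"}

# Hypothetical evaluation-factor table:
# (restored cause, restored conclusion) -> (analysis conclusion, conclusion evaluation).
EVALUATION_FACTORS = {
    (frozenset({"night"}), frozenset({"collision"})): (
        "collisions are associated with night-time alarms",
        "consider reinforcing night patrols",
    ),
}


def restore_attributes(rule, code_table=CODE_TO_ATTRIBUTE):
    """Attribute restoration: translate a coded (cause, conclusion) rule into words."""
    cause, conclusion = rule
    return (
        frozenset(code_table[c] for c in cause),
        frozenset(code_table[c] for c in conclusion),
    )


def analyse(rule, code_table=CODE_TO_ATTRIBUTE, evaluation=EVALUATION_FACTORS):
    """Return (analysis conclusion, conclusion evaluation), or None if no factor matches."""
    return evaluation.get(restore_attributes(rule, code_table))


coded_rule = (frozenset({301}), frozenset({101}))
print(analyse(coded_rule))
```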
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111303071.5A CN114003683A (en) | 2021-11-04 | 2021-11-04 | Alarm condition analysis method based on natural language processing and association rule |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111303071.5A CN114003683A (en) | 2021-11-04 | 2021-11-04 | Alarm condition analysis method based on natural language processing and association rule |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114003683A (en) | 2022-02-01 |
Family
ID=79927679
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111303071.5A (Pending) CN114003683A (en) | 2021-11-04 | 2021-11-04 | Alarm condition analysis method based on natural language processing and association rule |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114003683A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116821286A (en) * | 2023-08-23 | 2023-09-29 | 北京宝隆泓瑞科技有限公司 | Correlation rule analysis method and system for gas pipeline accidents |
- 2021-11-04: CN application CN202111303071.5A (publication CN114003683A), status: Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||