CN111950199A - Earthquake data structured automation method based on earthquake news event - Google Patents
- Publication number
- CN111950199A
- Authority
- CN
- China
- Prior art keywords
- news
- earthquake
- seismic
- training
- event
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Molecular Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Medical Informatics (AREA)
- Computer Hardware Design (AREA)
- Geometry (AREA)
- Geophysics And Detection Of Objects (AREA)
Abstract
The invention provides an automated method for structuring earthquake data based on earthquake news events. A web crawler crawls a large amount of news data from preset earthquake-related websites; trigger words and event elements in the collected news data are annotated using the BIO tagging scheme; the data set is randomly divided into a training data set and a test data set; and a seismic event extraction model is constructed by combining a bidirectional long short-term memory network (Bi-LSTM) with a conditional random field (CRF). The seismic event extraction model is trained on the annotated training set. During training, the model is tested on the test set, and training ends once the accuracy requirement is met. The trained model is then deployed in practical application: a web crawler crawls earthquake-related websites, the model analyzes each crawled item in real time, and if the item matches the seismic event type, the seismic event elements are extracted and stored in a database.
Description
Technical Field
The invention relates to the technical field of natural language processing, and in particular to an automated method for structuring earthquake data based on earthquake news events.
Background
An earthquake news event generally refers to news content reporting an earthquake that occurred at a particular time and place. It generally consists of a number of elements, including: occurrence time, epicenter position, seismic source depth, magnitude, number of injured people, number of dead people, direct economic loss and the like. Earthquakes occur 10 thousand times each year worldwide, and 542 earthquakes above magnitude 3.0 were recorded in 2018, while the related news reports are countless. Extracting the valuable elements from massive earthquake news reports, then integrating and structuring them, can provide necessary basic information for subsequent earthquake disaster analysis and prediction.
As more information is published on the internet and natural language processing technology develops, it has become practical to acquire raw earthquake news through the network and process it with a natural language model to obtain the corresponding results. Such a scheme acquires relevant earthquake information automatically and makes later retrieval and analysis convenient; because no manual searching and screening is needed, labor cost is greatly reduced, and the scheme has significant big data value.
Disclosure of Invention
In view of the above, the invention provides an automated method for structuring earthquake data based on earthquake news events. A web crawler continuously crawls news from earthquake-related websites, and a trained seismic event extraction model processes each news item to judge whether it describes a seismic event. If it does, the related elements are extracted from the news content and stored in a database, so as to provide necessary basic information for subsequent earthquake disaster analysis and prediction.
In order to achieve the purpose, the invention provides the following technical scheme:
an automated method for seismic data structuring based on seismic news events, substantially comprising the steps of:
step (1): crawling a large amount of news data from preset earthquake-related websites using a web crawler;
step (2): annotating the trigger words and event elements in the collected news data using the BIO tagging scheme;
step (3): randomly dividing the data set into a training data set and a test data set;
step (4): constructing a seismic event extraction model, implemented by combining a bidirectional long short-term memory network (Bi-LSTM) with a conditional random field (CRF);
step (5): training the seismic event extraction model on the annotated training set; during training, the model is tested on the test set, and training ends once the accuracy requirement is met;
step (6): deploying the trained seismic event extraction model in practical application; a web crawler crawls earthquake-related websites, the seismic event extraction model analyzes each crawled item in real time, and if the item matches the seismic event type, the seismic event elements are extracted and stored in a database.
Compared with the prior art, the invention has the beneficial effects that:
the method automatically and accurately extracts seismic events and their event elements from massive internet news data, which facilitates retrieval and analysis and provides necessary basic information for subsequent earthquake disaster analysis and prediction; because no manual searching and screening is needed, labor cost is greatly reduced, and the method has significant big data application and research value.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is an overall flow chart of a seismic data structuring automation method based on seismic news events provided in an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a Bi-LSTM recurrent neural network provided in an embodiment of the present invention;
as shown in FIG. 2, the Bi-LSTM is composed of 2 × n units of identical structure, where n equals the length of the input data. Each unit consists of an input layer, a hidden layer and an output layer. The output of the first unit is used as the input of the second unit, and so on until the last unit finishes the forward calculation; the reverse calculation then proceeds from the last unit back to the first unit. The forward result and the reverse result for the same input are added to obtain each output;
FIG. 3 is a schematic diagram of a single LSTM structure provided in an embodiment of the present invention;
as shown in FIG. 3, the unit includes 4 network layers: the activation functions of two network layers are sigmoid functions, and the activation functions of the other two are hyperbolic tangent (tanh) functions. In addition, 3 gates control how information flows, as marked in FIG. 3 [gate symbols not reproduced in the source text]. The "gate" is the most typical feature of the LSTM recurrent neural network and serves to retain information and filter noise. $x_i$ is the input to the $i$-th recurrent unit, which also takes the cell state $c_{i-1}$ and the activation value $a_{i-1}$ as input; after calculation it outputs $y_i$, the cell state $c_i$ and the activation value $a_i$, and $c_i$ and $a_i$ serve as input to the $(i+1)$-th recurrent unit. The whole process is as follows:
[formulas not reproduced in the source text]
where $W_f$, $W_u$ and $W_t$ are the weight coefficients corresponding to the three steps, $b_f$, $b_u$ and $b_t$ are the bias coefficients, and the symbols labeled in FIG. 3 [not reproduced in the source text] are the corresponding intermediate variables generated during the operation;
FIG. 4 is a schematic diagram of an example of BIO labeling provided in an embodiment of the present invention;
fig. 5 is a schematic diagram of an overall structure of a seismic event extraction model provided in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The overall flow chart of the seismic data structuring automation method based on the seismic news event provided in the embodiment of the invention is shown in fig. 1, and mainly comprises the following steps:
step (1): crawling earthquake-related news from earthquake websites using a web crawler; earthquake news source websites (such as the national earthquake administration, the emergency management department and the provincial earthquake bureaus) are selected in advance and a corresponding XPath is set for each, so that the crawler can automatically download all news in a news list;
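As a rough illustration of step (1), the sketch below applies a per-site XPath to a news listing page and collects the article links. It is a minimal sketch under stated assumptions: the page snippet, the site key `example-seismic-site` and the XPath expression are all made up, and the standard-library `xml.etree.ElementTree` (which supports only a subset of XPath and requires well-formed markup) stands in for whatever HTML parser a real crawler would use.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed listing page. A real crawler would fetch this
# over HTTP and would likely need an HTML-tolerant parser.
LISTING_PAGE = """
<html><body>
  <ul class="news-list">
    <li><a href="/news/1001.html">M5.1 earthquake reported</a></li>
    <li><a href="/news/1002.html">Emergency drill held</a></li>
  </ul>
</body></html>
"""

# Assumed per-site configuration: one XPath per preselected source website.
SITE_XPATHS = {"example-seismic-site": ".//ul[@class='news-list']/li/a"}


def list_news_links(page_source, xpath):
    """Return (url, title) pairs for every anchor element matched by the XPath."""
    root = ET.fromstring(page_source.strip())
    return [(a.get("href"), (a.text or "").strip()) for a in root.findall(xpath)]


links = list_news_links(LISTING_PAGE, SITE_XPATHS["example-seismic-site"])
for url, title in links:
    print(url, title)
```

Keeping the XPath in a per-site table means a new source website can be added by configuration alone, which matches the idea of presetting an XPath for each selected site.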
step (2): marking trigger words and event elements in the collected news data in a BIO marking mode;
the trigger word is a prerequisite: event elements are extracted only if a trigger word is detected and the news item is therefore regarded as a seismic event;
the trigger words, which include the keyword "earthquake", are used to judge whether a news item describes a seismic event; if a trigger word is detected, the item is regarded as a seismic event; the event elements comprise 7 types of content: occurrence time, epicenter position, seismic source depth, magnitude, number of injured people, number of dead people and direct economic loss; "B-event element" marks the beginning of an element, "I-event element" marks the inside of an element, and "O" marks a non-event element; a labeling example is shown in FIG. 4;
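The BIO scheme of step (2) can be illustrated with a small character-level tagger. This is only a sketch: the function name `bio_tags`, the span format `(start, end, label)` with an exclusive end, and the label names `MAG` and `TRIGGER` are assumptions chosen for the example, not names taken from the patent.

```python
def bio_tags(text, spans):
    """Build one BIO tag per character from (start, end, label) spans (end exclusive)."""
    tags = ["O"] * len(text)          # every character starts as a non-event element
    for start, end, label in spans:
        tags[start] = "B-" + label    # first character of the element
        for i in range(start + 1, end):
            tags[i] = "I-" + label    # remaining characters of the element
    return tags


sentence = "M6.0 earthquake"
# Hypothetical annotation: characters 0-3 are the magnitude, 5-14 the trigger word.
annotated_spans = [(0, 4, "MAG"), (5, 15, "TRIGGER")]
tags = bio_tags(sentence, annotated_spans)
for ch, tag in zip(sentence, tags):
    print(repr(ch), tag)
```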
step (3): randomly dividing the annotated news data set into a training data set and a test data set, wherein the test data set accounts for 20%;
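A random 80/20 split such as the one in step (3) might look as follows. The function name `split_dataset` and the fixed `seed` are assumptions added so the example is reproducible; the patent only specifies that the split is random and that the test set accounts for 20%.

```python
import random


def split_dataset(samples, test_ratio=0.2, seed=42):
    """Shuffle the samples and return (train, test) with the given test share."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded only for reproducibility
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# 100 placeholder sample ids stand in for annotated news items.
train_set, test_set = split_dataset(range(100))
print(len(train_set), len(test_set))
```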
step (4): constructing a seismic event extraction model, implemented by combining a Bi-LSTM with a CRF, whose structure is shown in FIG. 5;
(4.1) the characters of the news content are input into the seismic event extraction model; the content may have any length, denoted n; each character is first converted into a corresponding vector $x_i$ by a word2vec module; the word2vec module is a pre-trained open-source character vector library covering common characters such as Chinese characters, English letters and punctuation marks, and the vector $x_i$ corresponding to each character has dimension 100; the vector for each character of the news content is looked up, so the final output of the word2vec module is the n × 100 matrix $(x_1, x_2, \dots, x_n)$, where each $x_i$ is a vector of length 100; this step digitizes the news content;
(4.2) the vectors $x_i$ of the characters from step (4.1) are fed in sequence into the Bi-LSTM module, and recurrent calculation yields the output vector $y_i$ of each LSTM unit; $y_i$ has dimension 17 (7 types of event elements plus 1 type of trigger word, each with the two labels "B-" and "I-", plus the label "O"), and its entries are the probability values of the 17 labels; the final output of the Bi-LSTM module is the n × 17 matrix $(y_1, y_2, \dots, y_n)$, where each $y_i$ is a vector of length 17;
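The two shapes named in steps (4.1) and (4.2) can be checked with a short sketch: a label inventory of 17 tags and an n × 100 character embedding lookup. The label names, the toy embedding table and the choice of a zero vector for unknown characters are assumptions for illustration; a real system would load a pre-trained character vector library instead.

```python
EMBED_DIM = 100

# Hypothetical short names for the 7 event element types.
ELEMENT_TYPES = ["TIME", "EPICENTER", "DEPTH", "MAGNITUDE", "INJURED", "DEAD", "LOSS"]

# "O" plus a B-/I- pair for the trigger word and for each element type: 1 + 8 * 2 = 17.
LABELS = ["O"] + [
    f"{prefix}-{kind}"
    for kind in ["TRIGGER"] + ELEMENT_TYPES
    for prefix in ("B", "I")
]


def embed(text, table):
    """Look up one vector per character; unknown characters get a zero vector (assumption)."""
    unknown = [0.0] * EMBED_DIM
    return [table.get(ch, unknown) for ch in text]


# Toy stand-in for a pre-trained character vector library.
toy_table = {ch: [float(ord(ch) % 7)] * EMBED_DIM for ch in "地震"}
matrix = embed("地震了", toy_table)

print(len(LABELS))                   # number of labels
print(len(matrix), len(matrix[0]))   # n and the embedding dimension
```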
(4.3) the probability values output by each unit in step (4.2) are passed through a CRF layer to obtain the final result path; the CRF layer can add constraints to ensure that the final prediction is valid (for example, "B-Label1 I-Label1" is valid and "B-Label1 I-Label2" is invalid), and the CRF layer learns these constraints automatically from the training data; the CRF is trained and used for prediction by calculating the scores of all possible paths; the score of each possible path is denoted $P_i$, and if there are N paths, the total score of the paths is
[formula not reproduced in the source text]
where one term [symbol not reproduced in the source text] represents the probability of the corresponding label output by the i-th LSTM unit, and the other represents the transition probability from the i-th label to the (i+1)-th label, which is a parameter of the CRF layer and is learned automatically during training;
during training, the loss function is defined as follows:
[formula not reproduced in the source text]
where $P_{RealPath}$ represents the score of the true path (the annotated result);
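The formulas for the path scores and the loss are not legible in this text. For orientation only, a commonly used Bi-LSTM-CRF formulation is sketched below; it is an assumption and not necessarily the patent's exact definition. The symbols $S_k$ (log-score of path $k$), $t_i$ (label at position $i$ on that path), $E$ (emission score from the Bi-LSTM) and $T$ (transition score of the CRF layer) are introduced here for illustration.

```latex
\begin{aligned}
S_k &= \sum_{i=1}^{n} E_{i,\,t_i} \;+\; \sum_{i=1}^{n-1} T_{t_i,\,t_{i+1}},
\qquad P_k = e^{S_k},\\[4pt]
P_{\text{total}} &= P_1 + P_2 + \dots + P_N,\\[4pt]
\text{Loss} &= -\log \frac{P_{\text{RealPath}}}{P_1 + P_2 + \dots + P_N}.
\end{aligned}
```

Under this formulation, minimizing the loss raises the share of the true path's score in the total, which is consistent with choosing the highest-scoring path at prediction time.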
in actual prediction, the path with the highest score is taken as the final result, i.e.
$P_{predict} = \max(P_1, P_2, \dots, P_N)$;
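Enumerating all N paths to find the maximum is rarely done directly; the highest-scoring path is usually found with Viterbi decoding. The sketch below assumes additive emission and transition scores (the patent does not spell out the decoding procedure), and the three labels and all numbers are made up. A strongly negative O to I transition plays the role of a learned constraint.

```python
def viterbi(emissions, transitions):
    """Return the index path with the highest emission + transition score.

    emissions:   n x L list, emissions[i][l] = score of label l at position i
    transitions: L x L list, transitions[p][c] = score of moving from label p to c
    """
    num_labels = len(emissions[0])
    score = list(emissions[0])        # best score of any path ending in each label
    backpointers = []
    for position in emissions[1:]:
        new_score, pointers = [], []
        for cur in range(num_labels):
            best_prev = max(range(num_labels),
                            key=lambda p: score[p] + transitions[p][cur])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][cur] + position[cur])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(range(num_labels), key=lambda l: score[l])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


LABELS = ["O", "B", "I"]
# Position 0 slightly prefers "O", position 1 prefers "I".
emissions = [[1.0, 0.9, 0.0],
             [0.0, 0.0, 1.0]]
# "O" followed by "I" is heavily penalized, as a learned constraint would be.
transitions = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 0.5],
               [0.0, 0.0, 0.3]]
best_path = viterbi(emissions, transitions)
print([LABELS[i] for i in best_path])   # → ['B', 'I']
```

Picking the best label at each position independently would give the invalid sequence O, I here; taking the transition scores into account yields B, I instead.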
step (5): training the seismic event extraction model constructed in step (4);
(5.1) inputting the training samples into the seismic event extraction model in batches;
(5.2) during training, calculating the loss value according to the loss function defined in step (4.3), and continuously updating the weights of the seismic event extraction model by stochastic gradient descent;
"weights" is the general term for the parameters of a deep learning model; they are usually initialized randomly and are updated by learning from samples.
Gradient descent is the most basic weight update method in machine learning.
(5.3) after a large number of training iterations, the loss value output by the seismic event extraction model converges to a low value; thereafter, at the end of each training iteration, the seismic event extraction model is tested on the test data set, the model's predictions are compared with the manual annotations, and the accuracy is calculated (number of correct results / total number of results); if the test accuracy exceeds 97%, the whole training process is finished; otherwise, return to step (5.1) and continue training;
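The stopping rule of step (5.3) can be sketched as a loop that trains, evaluates on the test set, and stops once the accuracy exceeds 97%. Several things here are assumptions: accuracy is computed per character label, `max_epochs` is a safety cap that the patent does not mention, and `ToyTagger` is a stand-in that "learns" one more test sentence per step so the example runs without a real Bi-LSTM-CRF.

```python
def label_accuracy(predicted, gold):
    """Share of labels that match the manual annotation (per-character, an assumption)."""
    correct = sum(p == g for ps, gs in zip(predicted, gold) for p, g in zip(ps, gs))
    total = sum(len(gs) for gs in gold)
    return correct / total


def train_until_accurate(train_step, predict, test_inputs, test_gold,
                         target=0.97, max_epochs=100):
    """Repeat training and testing until the test accuracy exceeds the target."""
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        train_step()                                   # one round of training
        predictions = [predict(x) for x in test_inputs]
        accuracy = label_accuracy(predictions, test_gold)
        if accuracy > target:
            return epoch, accuracy
    return max_epochs, accuracy


class ToyTagger:
    """Stand-in model: each training step makes one more test sentence come out right."""

    def __init__(self, answers):
        self.answers = answers
        self.known = 0

    def train_step(self):
        self.known += 1

    def predict(self, index):
        gold = self.answers[index]
        return gold if index < self.known else ["O"] * len(gold)


test_gold = [["B-MAG", "I-MAG"], ["O", "B-TRIGGER"], ["B-TIME", "O"]]
model = ToyTagger(test_gold)
result = train_until_accurate(model.train_step, model.predict, [0, 1, 2], test_gold)
print(result)   # → (3, 1.0)
```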
step (6): deploying the trained seismic event extraction model in practical application;
(6.1) crawling the earthquake news source websites with a web crawler, extracting the body text of each news item by means of its HTML tags, and filtering out irrelevant content such as pictures and external links;
(6.2) inputting the processed news content into the seismic event extraction model, which outputs the label path with the maximum probability;
(6.3) parsing the label path and judging whether it contains a trigger word label; if so, the event element information contained in the label path is extracted and stored in a database; if not, the news content is discarded and processing continues with the next news item.
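Step (6.3) can be sketched as follows: group the characters of a label path into spans, check for a trigger word, and store the event elements only when one is found. The in-memory SQLite database, the table name `event_elements` and the label names are assumptions for the example; the patent does not specify the database or its schema.

```python
import sqlite3


def extract_spans(chars, tags):
    """Group characters into (label, text) spans by following B-/I- tags."""
    spans, label, buffer = [], None, []
    for ch, tag in zip(chars, tags):
        if tag.startswith("I-") and label == tag[2:]:
            buffer.append(ch)                      # continue the open span
            continue
        if label is not None:
            spans.append((label, "".join(buffer)))  # close the open span
        if tag.startswith("B-"):
            label, buffer = tag[2:], [ch]          # open a new span
        else:
            label, buffer = None, []
    if label is not None:
        spans.append((label, "".join(buffer)))
    return spans


def store_if_earthquake(conn, chars, tags):
    """Store event elements when a trigger word is present; otherwise discard."""
    spans = extract_spans(chars, tags)
    if not any(label == "TRIGGER" for label, _ in spans):
        return False
    conn.executemany(
        "INSERT INTO event_elements(label, value) VALUES (?, ?)",
        [(label, value) for label, value in spans if label != "TRIGGER"],
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_elements(label TEXT, value TEXT)")

text = "M6.0 earthquake"
tags = ["B-MAG"] + ["I-MAG"] * 3 + ["O"] + ["B-TRIGGER"] + ["I-TRIGGER"] * 9
stored = store_if_earthquake(conn, text, tags)
rows = conn.execute("SELECT label, value FROM event_elements").fetchall()
print(stored, rows)   # → True [('MAG', 'M6.0')]
```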
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (5)
1. An automatic seismic data structuring method based on seismic news events is characterized by comprising the following steps:
step (1): crawling earthquake-related news from earthquake websites using a web crawler; earthquake news source websites are selected in advance and a corresponding XPath is set for each, so that the crawler can automatically download all news in a news list;
step (2): annotating the trigger words and event elements in the collected news data using the BIO tagging scheme;
step (3): randomly dividing the annotated news data set into a training data set and a test data set, wherein the test data set accounts for 20%;
step (4): constructing a seismic event extraction model, implemented by combining a Bi-LSTM with a CRF;
step (5): training the seismic event extraction model constructed in step (4);
step (6): deploying the trained seismic event extraction model in practical application.
2. An automated seismic data structuring method based on seismic news events according to claim 1, wherein the trigger word in step (1) is a prerequisite, and the event elements are extracted only if a trigger word is detected and the news item is therefore regarded as a seismic event;
the trigger words, which include the keyword "earthquake", are used to judge whether a news item describes a seismic event; if a trigger word is detected, the item is regarded as a seismic event; the event elements comprise 7 types of content: occurrence time, epicenter position, seismic source depth, magnitude, number of injured people, number of dead people and direct economic loss; "B-event element" marks the beginning of an element, "I-event element" marks the inside of an element, and "O" marks a non-event element.
3. An automated seismic data structuring method based on seismic news events as claimed in claim 1, wherein the specific flow of step (4) is as follows:
(4.1) the characters of the news content are input into the seismic event extraction model; the content may have any length, denoted n; each character is first converted into a corresponding vector $x_i$ by a word2vec module; the word2vec module is a pre-trained open-source character vector library covering common characters such as Chinese characters, English letters and punctuation marks, and the vector $x_i$ corresponding to each character has dimension 100; the vector for each character of the news content is looked up, so the final output of the word2vec module is the n × 100 matrix $(x_1, x_2, \dots, x_n)$, where each $x_i$ is a vector of length 100; this step digitizes the news content;
(4.2) the vectors $x_i$ of the characters from step (4.1) are fed in sequence into the Bi-LSTM module, and recurrent calculation yields the output vector $y_i$ of each LSTM unit; $y_i$ has dimension 17, and its entries are the probability values of the 17 labels; the final output of the Bi-LSTM module is the n × 17 matrix $(y_1, y_2, \dots, y_n)$, where each $y_i$ is a vector of length 17;
(4.3) the probability values output by each unit in step (4.2) are passed through a CRF layer to obtain the final result path; the CRF layer adds constraints to ensure that the final prediction is valid, and the CRF layer learns these constraints automatically from the training data; the CRF is trained and used for prediction by calculating the scores of all possible paths; the score of each possible path is denoted $P_i$, and if there are N paths, the total score of the paths is
[formula not reproduced in the source text]
where one term [symbol not reproduced in the source text] represents the probability of the corresponding label output by the i-th LSTM unit, and the other represents the transition probability from the i-th label to the (i+1)-th label, which is a parameter of the CRF layer and is learned automatically during training;
during training, the loss function is defined as follows:
[formula not reproduced in the source text]
where $P_{RealPath}$ represents the score of the true path;
in actual prediction, the path with the highest score is taken as the final result, i.e.
$P_{predict} = \max(P_1, P_2, \dots, P_N)$.
4. An automated seismic data structuring method based on seismic news events as claimed in claim 3, wherein the specific flow of step (5) is as follows:
(5.1) inputting the training samples into the seismic event extraction model in batches;
(5.2) during training, calculating the loss value according to the loss function defined in step (4.3), and continuously updating the weights of the seismic event extraction model by stochastic gradient descent;
(5.3) after a large number of training iterations, the loss value output by the seismic event extraction model converges to a low value; thereafter, at the end of each training iteration, the seismic event extraction model is tested on the test data set, the model's predictions are compared with the manual annotations, and the accuracy is calculated; if the test accuracy exceeds 97%, the whole training process is finished; otherwise, return to step (5.1) and continue training.
5. An automated seismic data structuring method based on seismic news events according to claim 3 or 4, wherein the specific flow of step (6) is as follows:
(6.1) crawling an earthquake news source website through a web crawler, extracting the text of news by using a hypertext markup language tag, and filtering out irrelevant contents such as pictures and external links;
(6.2) inputting the processed news content into a seismic event extraction model, and outputting a label path with the maximum probability;
(6.3) parsing the label path and judging whether it contains a trigger word label; if so, the event element information contained in the label path is extracted and stored in a database; if not, the news content is discarded and processing continues with the next news item.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010799527.0A CN111950199A (en) | 2020-08-11 | 2020-08-11 | Earthquake data structured automation method based on earthquake news event |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010799527.0A CN111950199A (en) | 2020-08-11 | 2020-08-11 | Earthquake data structured automation method based on earthquake news event |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111950199A true CN111950199A (en) | 2020-11-17 |
Family
ID=73332832
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010799527.0A Pending CN111950199A (en) | 2020-08-11 | 2020-08-11 | Earthquake data structured automation method based on earthquake news event |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111950199A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113468320A (en) * | 2021-07-22 | 2021-10-01 | 中国地震台网中心 | Method and system for quickly visualizing earthquake emergency information |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107239445A (en) * | 2017-05-27 | 2017-10-10 | 中国矿业大学 | The method and system that a kind of media event based on neutral net is extracted |
CN107797993A (en) * | 2017-11-13 | 2018-03-13 | 成都蓝景信息技术有限公司 | A kind of event extraction method based on sequence labelling |
AU2018100678A4 (en) * | 2015-11-05 | 2018-06-14 | Tongji University | News events extracting method and system |
CN108197112A (en) * | 2018-01-19 | 2018-06-22 | 成都睿码科技有限责任公司 | A kind of method that event is extracted from news |
CN109508459A (en) * | 2018-11-06 | 2019-03-22 | 杭州费尔斯通科技有限公司 | A method of extracting theme and key message from news |
CN109635280A (en) * | 2018-11-22 | 2019-04-16 | 园宝科技(武汉)有限公司 | A kind of event extraction method based on mark |
CN109670172A (en) * | 2018-12-06 | 2019-04-23 | 桂林电子科技大学 | A kind of scenic spot anomalous event abstracting method based on complex neural network |
CN110377680A (en) * | 2019-07-11 | 2019-10-25 | 中国水利水电科学研究院 | The method of mountain flood database sharing and update based on web crawlers and semantics recognition |
CN110633409A (en) * | 2018-06-20 | 2019-12-31 | 上海财经大学 | Rule and deep learning fused automobile news event extraction method |
CN110852068A (en) * | 2019-10-15 | 2020-02-28 | 武汉工程大学 | Method for extracting sports news subject term based on BilSTM-CRF |
CN110941692A (en) * | 2019-09-28 | 2020-03-31 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Method for extracting news events of Internet politics outturn class |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2018100678A4 (en) * | 2015-11-05 | 2018-06-14 | Tongji University | News events extracting method and system |
CN107239445A (en) * | 2017-05-27 | 2017-10-10 | 中国矿业大学 | The method and system that a kind of media event based on neutral net is extracted |
CN107797993A (en) * | 2017-11-13 | 2018-03-13 | 成都蓝景信息技术有限公司 | A kind of event extraction method based on sequence labelling |
CN108197112A (en) * | 2018-01-19 | 2018-06-22 | 成都睿码科技有限责任公司 | A kind of method that event is extracted from news |
CN110633409A (en) * | 2018-06-20 | 2019-12-31 | 上海财经大学 | Rule and deep learning fused automobile news event extraction method |
CN109508459A (en) * | 2018-11-06 | 2019-03-22 | 杭州费尔斯通科技有限公司 | A method of extracting theme and key message from news |
CN109635280A (en) * | 2018-11-22 | 2019-04-16 | 园宝科技(武汉)有限公司 | Event extraction method based on marking |
CN109670172A (en) * | 2018-12-06 | 2019-04-23 | 桂林电子科技大学 | Scenic spot anomalous event extraction method based on complex neural network |
CN110377680A (en) * | 2019-07-11 | 2019-10-25 | 中国水利水电科学研究院 | Method for mountain flood database sharing and updating based on web crawlers and semantic recognition |
CN110941692A (en) * | 2019-09-28 | 2020-03-31 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Method for extracting news events of Internet politics outturn class |
CN110852068A (en) * | 2019-10-15 | 2020-02-28 | 武汉工程大学 | Method for extracting sports news subject term based on BiLSTM-CRF |
Non-Patent Citations (4)
Title |
---|
LIXIANG GUO ET AL: "A Practical Approach to Chinese Emergency Event Extraction using BiLSTM-CRF", 2019 5TH INTERNATIONAL CONFERENCE ON BIG DATA AND INFORMATION ANALYTICS, pages 163 - 164 * |
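Fan Hong; Li Huaiyuan; Du Wu; Yang Jiwen: "Research on spatio-temporal information mining of Web earthquake news based on event analysis", Engineering Journal of Wuhan University, no. 02, pages 92 - 97 * |
Jiang Yiqi; Zhao Tongzhou; Chai Yue; Gao Peidong: "Method for extracting sports news subject terms based on BiLSTM-CRF", Journal of Wuhan Institute of Technology, no. 01, pages 106 - 111 * |
Qiu Xipeng: "Neural Networks and Deep Learning", vol. 1, China Machine Press, pages: 147 * |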
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113468320A (en) * | 2021-07-22 | 2021-10-01 | 中国地震台网中心 | Method and system for quickly visualizing earthquake emergency information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112269949B (en) | | Information structuring method based on accident disaster news |
US8666998B2 (en) | | Handling data sets |
CN106557462A (en) | | Named entity recognition method and system |
CN106503055A (en) | | Generation method from structured text to image description |
CN110532398B (en) | | Automatic family map construction method based on multi-task joint neural network model |
CN108984775B (en) | | Public opinion monitoring method and system based on commodity comments |
CN111160005A (en) | | Event prediction method and device based on event evolution knowledge ontology and terminal equipment |
CN113742733B (en) | | Method and device for extracting trigger words of reading and understanding vulnerability event and identifying vulnerability type |
CN110209816A (en) | | Event recognition and classification method, system, device based on adversarial imitation learning |
CN111651983A (en) | | Causal event extraction method based on self-training and noise model |
CN113946677A (en) | | Event identification and classification method based on bidirectional recurrent neural network and attention mechanism |
CN111950199A (en) | | Earthquake data structured automation method based on earthquake news event |
CN109992723B (en) | | User interest tag construction method based on social network and related equipment |
Cordell et al. | | Disaggregating repression: Identifying physical integrity rights allegations in human rights reports |
CN111859074B (en) | | Network public opinion information source influence evaluation method and system based on deep learning |
Alzhrani | | Political Ideology Detection of News Articles Using Deep Neural Networks. |
CN111639494A (en) | | Case affair relation determining method and system |
Zhao et al. | | State and tendency: an empirical study of deep learning question&answer topics on Stack Overflow |
CN114860918A (en) | | Mobile application recommendation method and device fusing multi-source reliable information |
CN116450783A (en) | | Method, system, storage medium and electronic equipment for extracting event facing chapter level |
Curro et al. | | Building usage profiles using deep neural nets |
Kumar et al. | | An Algorithm for Automatic Text Annotation for Named Entity Recognition using spaCy Framework |
CN114357284A (en) | | Crowdsourcing task personalized recommendation method and system based on deep learning |
CN110309285B (en) | | Automatic question answering method, device, electronic equipment and storage medium |
CN109597879B (en) | | Service behavior relation extraction method and device based on 'citation relation' data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||