CN111950199A - Earthquake data structured automation method based on earthquake news event - Google Patents

Earthquake data structured automation method based on earthquake news event

Info

Publication number
CN111950199A
Authority
CN
China
Prior art keywords
news
earthquake
seismic
training
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010799527.0A
Other languages
Chinese (zh)
Inventor
俞一奇
邱彦林
陈尚武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Xujian Science And Technology Co ltd
Original Assignee
Hangzhou Xujian Science And Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Xujian Science And Technology Co ltd
Priority to CN202010799527.0A
Publication of CN111950199A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

The invention provides an automated method for structuring earthquake data based on earthquake news events. A web crawler crawls a large volume of news data from preset earthquake-related websites; trigger words and event elements in the collected news data are annotated using the BIO tagging scheme; the data set is randomly divided into a training data set and a test data set; a seismic event extraction model is constructed by combining a bidirectional long short-term memory network (Bi-LSTM) with a conditional random field (CRF); and the model is trained on the annotated training set. During training, the model is tested on the test set, and training ends once the accuracy requirement is met. The trained seismic event extraction model is then deployed in practical application: the web crawler crawls earthquake-related websites, each crawled item is analyzed in real time by the seismic event extraction model, and, if the item is classified as an earthquake event, its event elements are extracted and stored in a database.

Description

Earthquake data structured automation method based on earthquake news event
Technical Field
The invention relates to the technical field of natural language processing, and in particular to an automated method for structuring earthquake data based on earthquake news events.
Background
An earthquake news event generally refers to news content reporting an earthquake that occurred at a particular time and place. Such an event generally consists of several elements, including: occurrence time, epicenter location, focal depth, magnitude, number of injured, number of dead, direct economic loss and the like. Earthquakes occur many thousands of times each year worldwide, and 542 earthquakes of magnitude 3.0 or above were recorded in 2018, while the related news reports are countless. Extracting the valuable elements from massive earthquake news reports and then integrating and structuring them can provide necessary basic information for subsequent earthquake disaster analysis and prediction.
With the growing openness of information on the internet and the development of natural language processing technology, it has become practical to acquire raw earthquake news through the network and then process it with a natural language model to obtain the corresponding result. This approach acquires relevant earthquake information automatically and facilitates later retrieval and analysis; it requires no manual searching or screening, which greatly reduces labor cost and gives it significant big-data value.
Disclosure of Invention
In view of the above, the invention provides an automated method for structuring earthquake data based on earthquake news events. A web crawler continuously crawls news from earthquake-related websites, and a trained seismic event extraction model processes the news content to judge whether it describes an earthquake event; if it does, the related elements are extracted and stored in a database, so as to provide necessary basic information for subsequent earthquake disaster analysis and prediction.
In order to achieve this purpose, the invention provides the following technical solution:
an automated method for structuring seismic data based on seismic news events, mainly comprising the following steps:
step (1): crawling a large volume of news data from preset earthquake-related websites using a web crawler;
step (2): annotating trigger words and event elements in the collected news data using the BIO tagging scheme;
step (3): randomly dividing the data set into a training data set and a test data set;
step (4): constructing a seismic event extraction model that combines a bidirectional long short-term memory network (Bi-LSTM) with a conditional random field (CRF);
step (5): training the seismic event extraction model on the annotated training set; during training, the model is tested on the test set, and training ends once the accuracy requirement is met;
step (6): deploying the trained seismic event extraction model in practical application; the web crawler crawls earthquake-related websites, each crawled item is analyzed in real time by the seismic event extraction model, and, if the item is classified as an earthquake event, its event elements are extracted and stored in a database.
Compared with the prior art, the invention has the following beneficial effects:
the method automatically and accurately extracts earthquake events and their event elements from massive internet news data, facilitates retrieval and analysis, and provides necessary basic information for subsequent earthquake disaster analysis and prediction; it requires no manual searching or screening, which greatly reduces labor cost and gives it significant value for big-data application and research.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is an overall flow chart of a seismic data structuring automation method based on seismic news events provided in an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a Bi-LSTM recurrent neural network provided in an embodiment of the present invention;
as shown in FIG. 2, the Bi-LSTM is composed of 2 × n units of identical structure, where n equals the length of the input data. Each unit consists of an input layer, a hidden layer and an output layer. The output of the first unit serves as the input of the second unit, and so on, until the last unit completes the forward calculation; the calculation then proceeds in reverse order from the last unit to the first unit to complete the backward calculation; the forward and backward results for the same input are added to obtain each output;
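The two passes described for FIG. 2 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: `step_fwd` and `step_bwd` are hypothetical stand-ins for the forward and backward LSTM units, and the state is reduced to a single number so that the data flow is easy to follow.

```python
def bidirectional(inputs, step_fwd, step_bwd, initial_state):
    """Run a forward and a backward recurrent pass and add the results.

    step_fwd / step_bwd are hypothetical unit functions taking
    (input, previous_state) and returning the new state, which is also
    used as the unit's output in this simplified sketch.
    """
    n = len(inputs)

    # Forward pass: unit 1 feeds unit 2, and so on up to unit n.
    forward = []
    state = initial_state
    for x in inputs:
        state = step_fwd(x, state)
        forward.append(state)

    # Backward pass: start at the last position and move to the first.
    backward = [None] * n
    state = initial_state
    for i in range(n - 1, -1, -1):
        state = step_bwd(inputs[i], state)
        backward[i] = state

    # The description adds the two results for the same input position.
    return [f + b for f, b in zip(forward, backward)]


def running_sum(x, state):
    """Toy unit used only for demonstration."""
    return x + state


outputs = bidirectional([1, 2, 3], running_sum, running_sum, 0)
```

Note that many Bi-LSTM implementations concatenate the forward and backward outputs instead; the addition used here follows the wording of the description.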
FIG. 3 is a schematic diagram of a single LSTM structure provided in an embodiment of the present invention;
as shown in FIG. 3, the unit includes 4 network layers: two use the sigmoid function as their activation function and the other two use the hyperbolic tangent (tanh) function. In addition, 3 gates control how information flows; they are marked in FIG. 3 with the symbols shown in Figures BDA0002626880700000021 and BDA0002626880700000022 (images in the original, not reproduced here). The "gate" is the most characteristic feature of the LSTM recurrent neural network and serves to retain information and filter noise. $x_i$ is the input of the i-th recurrent unit, which also receives the cell state $c_{i-1}$ and the activation value $a_{i-1}$; after calculation, the unit outputs $y_i$, the cell state $c_i$ and the activation value $a_i$, and $c_i$ and $a_i$ serve as inputs to the (i+1)-th recurrent unit. The whole process is given by the six equations in Figures BDA0002626880700000031 to BDA0002626880700000036 (images in the original, not reproduced here),
wherein $W_f$, $W_u$, $W_t$ are the weight coefficients of the three steps and $b_f$, $b_u$, $b_t$ are the corresponding bias coefficients; the symbols marked in FIG. 3 (Figure BDA0002626880700000037, image in the original) correspond to the intermediate variables generated during the calculation;
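Because the equations of the original are only available as images, the following sketch uses the standard LSTM formulation as a stand-in. It is an illustration under stated assumptions, not the patent's exact equations: the gate names, the concatenation of $a_{i-1}$ with $x_i$, and the output-gate parameters `Wo` and `bo` (which the description does not name) are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Return weights @ vector + bias, with weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x, a_prev, c_prev, params):
    """One step of a standard LSTM unit (assumed form, see lead-in).

    x, a_prev, c_prev are lists of floats; params holds the weight
    matrices Wf, Wu, Wt, Wo and the biases bf, bu, bt, bo.
    """
    z = a_prev + x  # concatenation [a_{i-1}, x_i]
    forget = [sigmoid(v) for v in affine(params["Wf"], params["bf"], z)]
    update = [sigmoid(v) for v in affine(params["Wu"], params["bu"], z)]
    candidate = [math.tanh(v) for v in affine(params["Wt"], params["bt"], z)]
    output = [sigmoid(v) for v in affine(params["Wo"], params["bo"], z)]

    # New cell state: keep part of the old state, add part of the candidate.
    c = [f * cp + u * t
         for f, cp, u, t in zip(forget, c_prev, update, candidate)]
    # New activation: gated, squashed cell state.
    a = [o * math.tanh(ci) for o, ci in zip(output, c)]
    return a, c


# Toy parameters: hidden size 1, input size 1, all weights and biases zero.
zero_params = {name: [[0.0, 0.0]] for name in ("Wf", "Wu", "Wt", "Wo")}
zero_params.update({name: [0.0] for name in ("bf", "bu", "bt", "bo")})
a_new, c_new = lstm_step([1.0], [0.0], [2.0], zero_params)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate to 0, so the new cell state is simply half of the previous one, which makes the gating behavior easy to check by hand.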
FIG. 4 is a schematic diagram of an example of BIO labeling provided in an embodiment of the present invention;
fig. 5 is a schematic diagram of an overall structure of a seismic event extraction model provided in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The overall flow chart of the seismic data structuring automation method based on the seismic news event provided in the embodiment of the invention is shown in fig. 1, and mainly comprises the following steps:
step (1): crawling relevant news from earthquake websites using a web crawler; earthquake news source websites (such as the national earthquake administration, the emergency management ministry and the provincial earthquake bureaus) are selected in advance and a corresponding XPath is set for each, so that the crawler can automatically download all the news in the news list;
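A per-site XPath configuration of the kind described in step (1) might look like the sketch below. Everything concrete here is hypothetical: the site URL, the listing markup and the XPath expression are invented for illustration, and the page is embedded as a string rather than downloaded. The standard-library `xml.etree.ElementTree` supports only a subset of XPath and only well-formed markup, so a real crawler would likely need an HTML-tolerant parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed news listing page (a real crawler would fetch it).
LISTING = """<html><body>
<ul class="news-list">
  <li><a href="/news/1.html">M5.1 earthquake reported</a></li>
  <li><a href="/news/2.html">Drill held at local school</a></li>
</ul>
<a href="/about.html">About</a>
</body></html>"""

# Hypothetical configuration: one XPath per preselected news source.
SOURCES = {
    "https://example.org": ".//ul[@class='news-list']/li/a",
}


def list_news(base_url, page, xpath):
    """Return (absolute URL, title) pairs for every link matched by xpath."""
    root = ET.fromstring(page)
    return [(base_url + a.get("href"), (a.text or "").strip())
            for a in root.findall(xpath)]


news_items = []
for site, xpath in SOURCES.items():
    news_items.extend(list_news(site, LISTING, xpath))
```

Only links inside the configured news list are returned; the unrelated "About" link is ignored because it does not match the XPath.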
step (2): annotating trigger words and event elements in the collected news data using the BIO tagging scheme;
the trigger word is a prerequisite: event elements are extracted only if a trigger word is detected and the news item is therefore regarded as a seismic event;
trigger words are used to judge whether a news item is an earthquake event and include the 'earthquake' keyword; if a trigger word is detected, the item is regarded as an earthquake event; the event elements comprise 7 types: occurrence time, epicenter location, focal depth, magnitude, number of injured, number of dead and direct economic loss; "B-event element" marks the beginning of an element, "I-event element" marks the inside of an element, and "O" marks a non-event element; a labeling example is shown in FIG. 4;
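The BIO scheme can be illustrated with a short Python sketch that turns character spans into per-character tags. The label names (`TRIGGER`, `DEPTH`) and the sample sentence are hypothetical; FIG. 4 of the original shows the actual labeling example.

```python
def bio_tags(text, spans):
    """Return one BIO tag per character of text.

    spans is a list of (start, end, label) with end exclusive; characters
    outside every span are tagged "O".
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = "B-" + label          # first character of the element
        for i in range(start + 1, end):
            tags[i] = "I-" + label          # remaining characters
    return tags


sample = "quake at 10km"
sample_tags = bio_tags(sample, [(0, 5, "TRIGGER"), (9, 13, "DEPTH")])
```

Tagging is done per character here because the model described later takes individual characters as input.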
step (3): randomly dividing the annotated news data set into a training data set and a test data set, wherein the test data set accounts for 20%;
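A random 80/20 split as in step (3) can be sketched as follows. The fixed `seed` is an assumption added for reproducibility; the description only requires that the division be random.

```python
import random


def split_dataset(samples, test_ratio=0.2, seed=0):
    """Shuffle samples and return (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded for repeatable splits
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


train_set, test_set = split_dataset(range(100))
```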
step (4): constructing a seismic event extraction model that combines Bi-LSTM and CRF, the structure of which is shown in FIG. 5;
(4.1) the characters of the news content are input into the seismic event extraction model; the content may be of any length, denoted n; each character is first converted into a corresponding vector $x_i$ by a word2vec module; the word2vec module is a pre-trained open-source character vector library covering common characters such as Chinese characters, English letters and punctuation marks, and the vector $x_i$ corresponding to each character has dimension 100; the vector for each character of the news content is looked up, so that the final output of the word2vec module is the n × 100 matrix $(x_1, x_2, \ldots, x_n)$, where each $x_i$ is a vector of length 100; the purpose of this step is to digitize the news content;
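The lookup in step (4.1) amounts to mapping each character to a row of a vector table. The sketch below uses a toy table in place of the pre-trained character vectors, and it maps unknown characters to a zero vector; both choices are assumptions made only for illustration.

```python
DIM = 100  # vector dimension stated in the description


def toy_table(chars, dim=DIM):
    """Hypothetical stand-in for a pre-trained character vector library."""
    return {ch: [float(ord(ch) % 7)] * dim for ch in chars}


def embed(text, table, dim=DIM):
    """Return an n x dim matrix (list of rows), one row per character."""
    unknown = [0.0] * dim   # assumption: unseen characters map to zeros
    return [table.get(ch, unknown) for ch in text]


table = toy_table("地震a")
matrix = embed("地震!", table)
```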
(4.2) the vectors $x_i$ of the characters from step (4.1) are fed in sequence into the Bi-LSTM module, and recurrent calculation yields the output vector $y_i$ of each LSTM unit; the vector $y_i$ has dimension 17 (7 types of event elements plus 1 type of trigger word, each type having the two labels 'B-' and 'I-', plus one 'O' label), and its entries are the probability values of the 17 labels; the final output of the Bi-LSTM module is the n × 17 matrix $(y_1, y_2, \ldots, y_n)$, where each $y_i$ is a vector of length 17;
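The count of 17 labels follows directly from the tagging scheme: (7 + 1) types × 2 prefixes + 1 "O" label. The sketch below enumerates them with hypothetical English label names, and shows a softmax as one common way to turn 17 raw scores into probability values; the description does not state which normalization is used, so the softmax is an assumption.

```python
import math

# Hypothetical names for the 7 event element types and the trigger type.
ELEMENT_TYPES = ["TIME", "EPICENTER", "DEPTH", "MAGNITUDE",
                 "INJURED", "DEAD", "LOSS"]
ALL_TYPES = ["TRIGGER"] + ELEMENT_TYPES

LABELS = ["O"] + [prefix + "-" + name
                  for name in ALL_TYPES for prefix in ("B", "I")]


def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    top = max(scores)                       # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


uniform = softmax([0.0] * len(LABELS))
```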
(4.3) the probability values output by each unit in step (4.2) are passed through a CRF layer to obtain the final result path; the CRF layer adds constraints that ensure the final prediction is valid (for example, 'B-Label1 I-Label1' is valid while 'B-Label1 I-Label2' is invalid), and these constraints are learned automatically by the CRF layer from the training data; the CRF is trained and used for prediction by computing the scores of all possible paths; denoting the score of each possible path by $P_i$ and the number of paths by N, the total score of the paths is given by the formula in Figure BDA0002626880700000041 (image in the original, not reproduced here),
wherein the symbol in Figure BDA0002626880700000042 represents the probability of the corresponding label output by the i-th LSTM unit, and the symbol in Figure BDA0002626880700000043 represents the transition probability from the i-th label to the (i+1)-th label, which is a parameter of the CRF layer and is learned automatically during training;
during training, the loss function is defined by the formula in Figure BDA0002626880700000044 (image in the original, not reproduced here),
wherein $P_{RealPath}$ represents the score of the true path (the annotated result);
in actual prediction, the path with the highest score is taken as the final result, i.e.
$P_{predict} = \max(P_1, P_2, \ldots, P_N)$;
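Since the formulas of the original are only available as images, the sketch below uses the standard linear-chain CRF formulation as a stand-in, and this is an assumption: each path's score is the sum of its emission and transition scores, $P_i$ is taken as the exponential of that score, the loss is the negative logarithm of $P_{RealPath}$ divided by the total over all paths, and prediction uses the Viterbi algorithm to find the highest-scoring path without enumerating all N paths. The function and variable names are hypothetical.

```python
import math


def path_score(emit, trans, path):
    """Sum of emission scores plus transition scores along one label path."""
    score = sum(emit[i][label] for i, label in enumerate(path))
    score += sum(trans[a][b] for a, b in zip(path, path[1:]))
    return score


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_total(emit, trans):
    """log of the sum of exp(score) over all paths (forward algorithm)."""
    k = len(trans)
    alpha = list(emit[0])
    for row in emit[1:]:
        alpha = [row[j] + logsumexp([alpha[i] + trans[i][j] for i in range(k)])
                 for j in range(k)]
    return logsumexp(alpha)


def crf_loss(emit, trans, real_path):
    """Negative log-likelihood of the annotated path."""
    return log_total(emit, trans) - path_score(emit, trans, real_path)


def viterbi(emit, trans):
    """Return the label path with the highest score."""
    k = len(trans)
    score = list(emit[0])
    backpointers = []
    for row in emit[1:]:
        pointers, new_score = [], []
        for j in range(k):
            best = max(range(k), key=lambda i: score[i] + trans[i][j])
            pointers.append(best)
            new_score.append(score[best] + trans[best][j] + row[j])
        backpointers.append(pointers)
        score = new_score
    path = [max(range(k), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy example: 3 positions, 2 labels.
emit = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.4]]
trans = [[0.1, 0.6], [-0.5, 0.2]]
best_path = viterbi(emit, trans)
```

The forward algorithm and Viterbi decoding give the same results as enumerating every path, but their cost grows linearly with the sequence length instead of exponentially.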
step (5): training the seismic event extraction model constructed in step (4);
(5.1) the training samples are input into the seismic event extraction model in batches;
(5.2) during training, a loss value is calculated according to the loss function defined in step (4.3), and the weights of the seismic event extraction model are continuously updated by stochastic gradient descent;
'weights' is the general term for the parameters of a deep learning model; they are usually initialized randomly and are updated by learning from samples.
Gradient descent is the most basic weight-update method in machine learning.
(5.3) after a large number of training iterations, the loss value output by the seismic event extraction model converges to a low level; from then on, after each training iteration the model is tested on the test data set, the model's predictions are compared with the manual annotations, and the accuracy is calculated (number of correct results / total number of results); if the test accuracy exceeds 97%, the whole training process is finished; otherwise, the process returns to step (5.1) and training continues;
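The stopping rule of step (5.3) can be sketched as a loop around two callables. `train_one_iteration` and `evaluate` are hypothetical placeholders for the real training and testing code, the accuracy is computed per label (the description does not say whether "results" means labels or whole news items), and `max_iterations` is an added safeguard that is not part of the description.

```python
def accuracy(predicted, annotated):
    """Number of correct labels divided by the total number of labels."""
    correct = sum(p == g
                  for pred_seq, gold_seq in zip(predicted, annotated)
                  for p, g in zip(pred_seq, gold_seq))
    total = sum(len(gold_seq) for gold_seq in annotated)
    return correct / total


def train_until(train_one_iteration, evaluate, target=0.97, max_iterations=100):
    """Train until the test accuracy exceeds target; return (iterations, accuracy)."""
    acc = 0.0
    for iteration in range(1, max_iterations + 1):
        train_one_iteration()      # steps (5.1) and (5.2)
        acc = evaluate()           # test on the held-out data set
        if acc > target:
            return iteration, acc
    return max_iterations, acc


# Demonstration with canned accuracies instead of a real model.
canned = iter([0.90, 0.96, 0.98])
result = train_until(lambda: None, lambda: next(canned))
```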
step (6): deploying the trained seismic event extraction model in practical application;
(6.1) the earthquake news source websites are crawled by the web crawler, the body text of each news item is extracted using the HTML tags, and irrelevant content such as pictures and external links is filtered out;
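Text extraction as in step (6.1) can be sketched with the standard-library `html.parser` module. The sample page is hypothetical, and dropping the text of every `<a>` element is a simplification of "filtering out external links" chosen for this sketch; pictures carry no text and are therefore dropped automatically.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping scripts, styles and link text."""

    SKIPPED = {"script", "style", "a"}  # assumption: link text is irrelevant

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.parts.append(data.strip())


def extract_text(page):
    parser = TextExtractor()
    parser.feed(page)
    parser.close()
    return " ".join(parser.parts)


PAGE = ('<div><p>A quake struck.</p><img src="x.png">'
        '<a href="http://example.com">more</a><p>Depth 10 km.</p></div>')
body_text = extract_text(PAGE)
```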
(6.2) the processed news content is input into the seismic event extraction model, which outputs the label path with the maximum probability;
(6.3) the label path is parsed to determine whether it contains a trigger word label; if so, the event element information contained in the label path is extracted and stored in the database; if not, the news content is discarded and processing continues with the next news item.
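Step (6.3) can be sketched as decoding the BIO label path into spans and storing them only when a trigger word is present. The table layout, the label names and the use of an in-memory SQLite database are assumptions made for illustration; the description does not specify the database.

```python
import sqlite3


def decode(chars, tags):
    """Group characters into (label, text) spans from a BIO label path."""
    spans, label, buffer = [], None, []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            if label:
                spans.append((label, "".join(buffer)))
            label, buffer = tag[2:], [ch]
        elif tag.startswith("I-") and label == tag[2:]:
            buffer.append(ch)
        else:
            # "O" or an inconsistent "I-" tag closes the current span.
            if label:
                spans.append((label, "".join(buffer)))
            label, buffer = None, []
    if label:
        spans.append((label, "".join(buffer)))
    return spans


def open_db():
    """Create a hypothetical single-table schema in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE element (label TEXT, value TEXT)")
    return conn


def store_if_event(conn, chars, tags):
    """Store event elements if a trigger word was tagged; else discard."""
    spans = decode(chars, tags)
    if not any(label == "TRIGGER" for label, _ in spans):
        return False
    conn.executemany("INSERT INTO element (label, value) VALUES (?, ?)",
                     [span for span in spans if span[0] != "TRIGGER"])
    conn.commit()
    return True


conn = open_db()
chars = list("quake M5.1")
tags = (["B-TRIGGER"] + ["I-TRIGGER"] * 4 + ["O"]
        + ["B-MAGNITUDE"] + ["I-MAGNITUDE"] * 3)
stored = store_if_event(conn, chars, tags)
rows = conn.execute("SELECT label, value FROM element").fetchall()
```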
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. An automated seismic data structuring method based on seismic news events, characterized by comprising the following steps:
step (1): crawling relevant news from earthquake websites using a web crawler; earthquake news source websites are selected in advance and a corresponding XPath is set for each, so that the crawler can automatically download all the news in the news list;
step (2): annotating trigger words and event elements in the collected news data using the BIO tagging scheme;
step (3): randomly dividing the annotated news data set into a training data set and a test data set, wherein the test data set accounts for 20%;
step (4): constructing a seismic event extraction model that combines Bi-LSTM and CRF;
step (5): training the seismic event extraction model constructed in step (4);
step (6): deploying the trained seismic event extraction model in practical application.
2. An automated seismic data structuring method based on seismic news events according to claim 1, wherein the trigger word in step (1) is a prerequisite, and event elements are extracted only if a trigger word is detected and the news item is therefore regarded as a seismic event;
trigger words are used to judge whether a news item is an earthquake event and include the 'earthquake' keyword; if a trigger word is detected, the item is regarded as an earthquake event; the event elements comprise 7 types: occurrence time, epicenter location, focal depth, magnitude, number of injured, number of dead and direct economic loss; "B-event element" marks the beginning of an element, "I-event element" marks the inside of an element, and "O" marks a non-event element.
3. An automated seismic data structuring method based on seismic news events as claimed in claim 1, wherein the specific flow of step (4) is as follows:
(4.1) the characters of the news content are input into the seismic event extraction model; the content may be of any length, denoted n; each character is first converted into a corresponding vector $x_i$ by a word2vec module; the word2vec module is a pre-trained open-source character vector library covering common characters such as Chinese characters, English letters and punctuation marks, and the vector $x_i$ corresponding to each character has dimension 100; the vector for each character of the news content is looked up, so that the final output of the word2vec module is the n × 100 matrix $(x_1, x_2, \ldots, x_n)$, where each $x_i$ is a vector of length 100; the purpose of this step is to digitize the news content;
(4.2) the vectors $x_i$ of the characters from step (4.1) are fed in sequence into the Bi-LSTM module, and recurrent calculation yields the output vector $y_i$ of each LSTM unit; the vector $y_i$ has dimension 17, and its entries are the probability values of the 17 labels; the final output of the Bi-LSTM module is the n × 17 matrix $(y_1, y_2, \ldots, y_n)$, where each $y_i$ is a vector of length 17;
(4.3) the probability values output by each unit in step (4.2) are passed through a CRF layer to obtain the final result path; the CRF layer adds constraints that ensure the final prediction is valid, and these constraints are learned automatically by the CRF layer from the training data; the CRF is trained and used for prediction by computing the scores of all possible paths; denoting the score of each possible path by $P_i$ and the number of paths by N, the total score of the paths is given by the formulas in Figures FDA0002626880690000021 and FDA0002626880690000022 (images in the original, not reproduced here),
wherein the symbol in Figure FDA0002626880690000023 represents the probability of the corresponding label output by the i-th LSTM unit, and the symbol in Figure FDA0002626880690000024 represents the transition probability from the i-th label to the (i+1)-th label, which is a parameter of the CRF layer and is learned automatically during training;
during training, the loss function is defined by the formula in Figure FDA0002626880690000025 (image in the original, not reproduced here),
wherein $P_{RealPath}$ represents the score of the true path;
in actual prediction, the path with the highest score is taken as the final result, i.e.
$P_{predict} = \max(P_1, P_2, \ldots, P_N)$.
4. An automated seismic data structuring method based on seismic news events as claimed in claim 3, wherein the specific flow of step (5) is as follows:
(5.1) the training samples are input into the seismic event extraction model in batches;
(5.2) during training, a loss value is calculated according to the loss function defined in step (4.3), and the weights of the seismic event extraction model are continuously updated by stochastic gradient descent;
(5.3) after a large number of training iterations, the loss value output by the seismic event extraction model converges to a low level; from then on, after each training iteration the model is tested on the test data set, the model's predictions are compared with the manual annotations, and the accuracy is calculated; if the test accuracy exceeds 97%, the whole training process is finished; otherwise, the process returns to step (5.1) and training continues.
5. An automated seismic data structuring method based on seismic news events according to claim 3 or 4, wherein the specific flow of step (6) is as follows:
(6.1) the earthquake news source websites are crawled by the web crawler, the body text of each news item is extracted using the HTML tags, and irrelevant content such as pictures and external links is filtered out;
(6.2) the processed news content is input into the seismic event extraction model, which outputs the label path with the maximum probability;
(6.3) the label path is parsed to determine whether it contains a trigger word label; if so, the event element information contained in the label path is extracted and stored in the database; if not, the news content is discarded and processing continues with the next news item.
CN202010799527.0A 2020-08-11 2020-08-11 Earthquake data structured automation method based on earthquake news event Pending CN111950199A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010799527.0A CN111950199A (en) 2020-08-11 2020-08-11 Earthquake data structured automation method based on earthquake news event

Publications (1)

Publication Number Publication Date
CN111950199A true CN111950199A (en) 2020-11-17

Family

ID=73332832

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010799527.0A Pending CN111950199A (en) 2020-08-11 2020-08-11 Earthquake data structured automation method based on earthquake news event

Country Status (1)

Country Link
CN (1) CN111950199A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113468320A (en) * 2021-07-22 2021-10-01 中国地震台网中心 Method and system for quickly visualizing earthquake emergency information

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239445A (en) * 2017-05-27 2017-10-10 中国矿业大学 The method and system that a kind of media event based on neutral net is extracted
CN107797993A (en) * 2017-11-13 2018-03-13 成都蓝景信息技术有限公司 A kind of event extraction method based on sequence labelling
AU2018100678A4 (en) * 2015-11-05 2018-06-14 Tongji University News events extracting method and system
CN108197112A (en) * 2018-01-19 2018-06-22 成都睿码科技有限责任公司 A kind of method that event is extracted from news
CN109508459A (en) * 2018-11-06 2019-03-22 杭州费尔斯通科技有限公司 A method of extracting theme and key message from news
CN109635280A (en) * 2018-11-22 2019-04-16 园宝科技(武汉)有限公司 A kind of event extraction method based on mark
CN109670172A (en) * 2018-12-06 2019-04-23 桂林电子科技大学 A kind of scenic spot anomalous event abstracting method based on complex neural network
CN110377680A (en) * 2019-07-11 2019-10-25 中国水利水电科学研究院 The method of mountain flood database sharing and update based on web crawlers and semantics recognition
CN110633409A (en) * 2018-06-20 2019-12-31 上海财经大学 Rule and deep learning fused automobile news event extraction method
CN110852068A (en) * 2019-10-15 2020-02-28 武汉工程大学 Method for extracting sports news subject term based on BilSTM-CRF
CN110941692A (en) * 2019-09-28 2020-03-31 西南电子技术研究所(中国电子科技集团公司第十研究所) Method for extracting news events of Internet politics outturn class

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LIXIANG GUO ET AL: "A Practical Approach to Chinese Emergency Event Extraction using BiLSTM-CRF", 2019 5TH INTERNATIONAL CONFERENCE ON BIG DATA AND INFORMATION ANALYTICS, pages 163 - 164 *
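樊红; 李怀远; 杜武; 杨继文: "Research on mining spatio-temporal information from Web earthquake news based on event analysis" (基于事件分析的Web地震新闻时空信息挖掘研究), Engineering Journal of Wuhan University (武汉大学学报(工学版)), no. 02, pages 92 - 97 *
江逸琪; 赵彤洲; 柴悦; 高佩东: "Method for extracting sports news subject terms based on BiLSTM-CRF" (基于BiLSTM-CRF的体育新闻主题词抽取方法), Journal of Wuhan Institute of Technology (武汉工程大学学报), no. 01, pages 106 - 111 *
邱锡鹏: "Neural Networks and Deep Learning" (神经网络与深度学习), vol. 1, China Machine Press (机械工业出版社), pages: 147 *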

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination