CN112784602B - News emotion entity extraction method based on remote supervision - Google Patents

News emotion entity extraction method based on remote supervision

Info

Publication number
CN112784602B
CN112784602B CN202011395972.7A CN202011395972A CN112784602B CN 112784602 B CN112784602 B CN 112784602B CN 202011395972 A CN202011395972 A CN 202011395972A CN 112784602 B CN112784602 B CN 112784602B
Authority
CN
China
Prior art keywords
emotion
news
sentences
sentence
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011395972.7A
Other languages
Chinese (zh)
Other versions
CN112784602A (en)
Inventor
张琨
孙琦
李寻
张李林清
刘志敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN202011395972.7A
Publication of CN112784602A
Application granted
Publication of CN112784602B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a news emotion entity extraction method based on remote supervision, which comprises the following steps: crawling news corpus from official news websites and caching it to a local warehouse; preprocessing the crawled news corpus to obtain news corpus segmented into sentences; constructing a key entity knowledge base, and automatically labeling the news corpus segmented into sentences according to the knowledge base; training the emotion sentence extraction model with the labeled news corpus so that it is able to automatically judge whether an input sentence carries emotion; using the extracted emotion sentences as the training set to train the emotion entity extraction model; crawling news corpus and segmenting it into sentences, inputting the segmented news corpus into the trained emotion sentence extraction model to extract emotion sentences, and inputting the extracted emotion sentences into the trained emotion entity extraction model to obtain emotion entities. The invention uses remote supervision to generate a noisy data set from a large number of samples for model training, which improves model training efficiency.

Description

News emotion entity extraction method based on remote supervision
Technical Field
The invention belongs to the field of computer artificial intelligence, and particularly relates to a news emotion entity extraction method based on remote supervision.
Background
Because of its unique application contexts and text expressions, named entity recognition in the news field has been explored by researchers. Feng Yuntian et al. proposed entity classification principles covering personnel, soldiers, military personnel, military institutions, facilities and the like, and constructed a corpus from standardized texts such as combat paperwork, duty paperwork and military paperwork. A CRF model was trained on a small manually labeled training corpus and then used to recognize entities in an unlabeled test corpus, where it obtained an F value of 90.9%. Other researchers, targeting weapon named entities, established a DNN-based weapon entity recognition model that takes fixed-dimension word vectors and part-of-speech vectors as input and learns context features through nonlinear transformations. The model was trained on a corpus built from 7,500 news items from the world wide web, the Chinese net and other sites, and the F value reached 91.02%. Wang Xuefeng et al. divided named entities into 8 categories (troops, place names, institutions, weapons, facilities, time, environment and quantity) and proposed an entity recognition model based on character-level representation that combines BiLSTM and CRF (character-BiLSTM-CRF); the model was trained on a corpus built from more than 30 undisclosed joint combat exercise scenario documents and command exercise scenario documents, and the F value reached 98%. In addition, researchers have explored generating word vectors with convolutional neural networks and combining them with BiLSTM and CRF to recognize named entities in the news domain.
For named entity recognition in unpublished combat documents, the named entities were divided, based on a nested classification principle, into 5 major classes (positions, troops, personnel, articles and numbers) and 13 subclasses such as place names and establishments; a CNN-BiLSTM-CRF model was adopted, and experiments on a corpus built from 100 unpublished combat documents obtained a higher recall rate and F value.
Traditional emotion entity recognition methods based on rules, dictionaries and statistical learning models rely on rule design and feature engineering. Although they achieve a relatively high recall rate, formulating rules and extracting features requires rich domain knowledge and a great deal of labor, and it is difficult to formulate uniform templates and rules for all problems. In recent years, supported by computing power and distributed text representation technology, emotion entity recognition methods based on deep neural networks (DNN) have made breakthrough progress in the general domain and in specific domains such as law, medicine, biochemistry and finance. Compared with emotion entity recognition research in other fields, emotion entity recognition in the news field faces the following problems and challenges:
Entity recognition tasks often face the problem that entity boundaries are difficult to define. For example, in the insurance field, "Chinese life insurance" may be regarded as one entity or as two entities, "Chinese" and "life insurance". The specialized nature of the news field makes the boundaries between entities even harder to determine. For example, "the British Royal Navy" may be regarded as a single organization entity, or "Britain" may be regarded as a place name entity and "Royal Navy" as an organization entity; "Russian Tu-160 strategic bomber" may be regarded as a single weaponry entity, or "Russian army" may be regarded as an organization entity and "Tu-160 strategic bomber" as a weaponry entity.
Entity recognition tasks also face the phenomenon of entities being expressed in simplified form. Compared with other fields, the uniqueness and specialized nature of the news field make emotion entities obscure once they are abbreviated, and the abbreviations follow no fixed pattern.
Named entity recognition techniques based on CRF and other statistical models rely on domain experts to complete a large amount of manual feature selection; domain named entity recognition methods based on long short-term memory neural networks and similar models need a huge corpus to construct word vectors during model training.
Electronic medical records in the medical field and judgments and indictments in the legal field have strict formats and expression conventions, so rule-based recognition methods can achieve excellent results on them. Social media data, represented by microblogs, is non-standard in expression, contains a large number of colloquial expressions and follows no specific rules, which makes entity recognition difficult.
At present, there is no corpus data set or entity classification standard for the news field, which hinders research on open source information.
Disclosure of Invention
The invention aims to provide a news emotion entity extraction method based on remote supervision.
The technical scheme for realizing the purpose of the invention is as follows: a news emotion entity extraction method based on remote supervision comprises the following steps:
Step 1: adopting a crawler technology to crawl news corpus from official news websites and caching the news corpus to a local warehouse;
Step 2: preprocessing the crawled news corpus to obtain the news corpus segmented into sentences;
Step 3: constructing a key entity knowledge base, and automatically labeling the news corpus segmented into sentences according to the knowledge base;
Step 4: training the emotion sentence extraction model with the labeled news corpus so that it is able to automatically judge whether an input sentence carries emotion;
Step 5: extracting emotion sentences with the model of step 4 and using them as the training set to train the emotion entity extraction model, so that the model is able to extract the holder, the expression object and the event of the emotion in a sentence;
Step 6: crawling news corpus and segmenting it into sentences by the methods of step 1 and step 2, inputting the news corpus segmented into sentences into the trained emotion sentence extraction model to extract emotion sentences, and inputting the extracted emotion sentences into the trained emotion entity extraction model to obtain emotion entities.
Preferably, the specific method for crawling the related news corpus from official news websites is as follows:
acquiring the addresses of news pages related to the event by searching the official websites with keywords and parsing the search results;
parsing the news content at each news page address, acquiring the title, time and specific content of the news, and caching them to a local warehouse.
Preferably, preprocessing the crawled news corpus includes:
cleaning the crawled news corpus, and removing redundant data and dirty data irrelevant to the topic;
dividing the news corpus in the local warehouse into sentences, using punctuation marks as delimiters.
Preferably, the constructed key entity knowledge base is a knowledge base of person, organization, country and event entities.
Preferably, the principle of automatically labeling the news corpus segmented into sentences according to the knowledge base is as follows: when more than n knowledge base entities appear in a sentence, the sentence is labeled as a sentence with emotion, where n is a preset natural number.
Preferably, the emotion sentence extraction model includes a word vector expression layer and a SoftMax classification layer, which are specified as follows:
the word vector expression layer adopts a BERT pre-training model and is used for extracting the features of each word in the news text data segmented into sentences to obtain word features;
the SoftMax classification layer is used for predicting the probability distribution over the output categories and decoding the label, and for judging from the prediction result whether an input sentence is an emotion sentence.
Preferably, the emotion entity extraction model includes a word vector layer, an encoder and a decoder, which are specified as follows:
the word vector layer adopts a BERT pre-training model for obtaining the word vector features of emotion sentences;
the encoder adopts a bidirectional long short-term memory neural network for extracting semantic features of the input text;
the decoder adopts a conditional random field for decoding the semantic features into corresponding labels, and the corresponding entity positions and entity categories are obtained from the predicted label values.
Compared with the prior art, the invention has the remarkable advantages that:
When a large number of unlabeled samples exist, the invention uses remote supervision to generate a noisy data set from those samples for model training, which greatly reduces the cost of manual labeling and improves the efficiency of model training;
To address the problems and challenges specific to the news field, the invention designs an emotion sentence extraction technique based on BERT word vectors, which concentrates entity extraction on a more meaningful range of text and greatly improves the efficiency of entity extraction;
The invention uses an entity extraction network based on multi-model fusion, combined with an expert knowledge base, to extract emotion holders, emotion expression objects and related event information from emotion sentences, laying the groundwork as an upstream task for emotion analysis and public opinion analysis in the news field.
Drawings
FIG. 1 is a flow chart of the present invention.
FIG. 2 is the training and testing flow of the emotion sentence extraction model.
FIG. 3 is the training and testing flow of the emotion entity extraction model.
FIG. 4 is the LSTM structure diagram.
FIG. 5 is the CRF structure diagram.
Detailed Description
A news emotion entity extraction method based on remote supervision, as shown in fig. 1, comprises the following steps:
Step 1: adopting a crawler technology to crawl news corpus from official news websites and caching the news corpus to a local warehouse;
For hot news events, crawler technology is used to crawl the related news corpus from official news websites such as world wide web, internet news, new-bloom daily news and the like. The specific method is as follows: the addresses of news pages related to the event are acquired by searching the official websites with keywords and parsing the search results; the news content at each address is then parsed to acquire the title, time, specific content and other data of the news, and the data is cached in a local warehouse.
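The parsing part of step 1 can be sketched as follows. This is a minimal illustration only: it assumes, hypothetically, that a news page keeps its headline in `<title>`, its timestamp in `<time>` and its body in `<p>` elements, which the patent does not specify. Fetching pages and caching to the local warehouse are left out; `NewsPageParser` and `parse_news` are names introduced here.

```python
from html.parser import HTMLParser


class NewsPageParser(HTMLParser):
    """Collects <title>, the first <time> and all <p> text (assumed page layout)."""

    def __init__(self):
        super().__init__()
        self._current = None   # tag whose text is currently being collected
        self.title = ""
        self.time = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "time", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data.strip()
        elif self._current == "time" and not self.time:
            self.time = data.strip()
        elif self._current == "p":
            self.paragraphs[-1] += data


def parse_news(html):
    """Return the title, time and body text of one news page as a dict."""
    parser = NewsPageParser()
    parser.feed(html)
    parser.close()
    content = "\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return {"title": parser.title, "time": parser.time, "content": content}
```

Each returned dict could then be written to whatever store serves as the local warehouse, for example one JSON file per article.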
Step 2: preprocessing the crawled news corpus to obtain the news corpus segmented into sentences;
The crawled news corpus is read from the local warehouse and cleaned: redundant data and dirty data irrelevant to the topic are removed, and useless repeated sentences in the news are deleted. The cleaned data is stored in a structured manner for training the algorithm models.
The data in the database is then divided into sentences, using punctuation marks such as the full stop, "?", "!", "……" and the closing quotation mark as delimiters.
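A minimal sketch of the sentence division in step 2 is given below. The delimiter set is an assumption made for illustration (Chinese and Western full stop equivalents, question mark, exclamation mark and the ellipsis, with any closing quotation mark kept attached to its sentence); the exact set used by the patent is not fully recoverable from the text. `split_sentences` is a name introduced here.

```python
import re

# Assumed delimiters: sentence-final marks or an ellipsis, followed by optional closing quotes.
_SENTENCE_END = re.compile(r'([。？！?!]|……)([”"’』」]*)')


def split_sentences(text):
    """Split text after each sentence-final mark, keeping the mark with its sentence."""
    marked = _SENTENCE_END.sub(r"\1\2\n", text)
    return [s.strip() for s in marked.split("\n") if s.strip()]
```

Keeping the delimiter attached to the sentence preserves the original punctuation for the later BERT-based models.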
Step 3: constructing a key entity knowledge base, and automatically labeling the news corpus segmented into sentences according to the knowledge base;
A key entity knowledge base of people, organizations, countries, events and the like is established from the data in the local warehouse, and the news segmented into sentences is automatically labeled according to this knowledge base. The labeling principle is as follows: a sentence is labeled as a sentence with emotion when more than n knowledge base entities appear in it, where n is an adjustable parameter. A large amount of noisy training data can be obtained in this remote supervision manner.
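The labeling rule above can be sketched in a few lines. The sketch assumes that entities are matched by plain substring search and that each distinct entity is counted once; the patent does not state either detail. `label_sentence` and `build_noisy_dataset` are hypothetical names.

```python
def label_sentence(sentence, knowledge_base, n=1):
    """Return 1 (emotion sentence) if more than n distinct entities occur, else 0."""
    hits = {entity for entity in knowledge_base if entity in sentence}
    return 1 if len(hits) > n else 0


def build_noisy_dataset(sentences, knowledge_base, n=1):
    """Pair every sentence with its automatically generated (noisy) label."""
    return [(s, label_sentence(s, knowledge_base, n)) for s in sentences]
```

Raising `n` makes the positive class stricter and smaller; lowering it yields more, but noisier, positive examples.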
Step 4: training the emotion sentence extraction model by using the marked news corpus to enable the emotion sentence extraction model to have the capability of automatically judging emotion of an input sentence;
As shown in fig. 2, the news text data segmented into sentences is divided into a training set and a test set at a ratio of 8:2. The emotion sentence extraction model is trained on the training set, and the accuracy and performance of the trained model are analyzed on the test set.
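The split and the accuracy check can be sketched as below. Shuffling before the split and the fixed seed are assumptions added for reproducibility; `split_train_test` and `accuracy` are names introduced for this illustration.

```python
import random


def split_train_test(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and split them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predictions, labels):
    """Fraction of predictions that equal the reference labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)
```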
In a further embodiment, the emotion sentence extraction model includes a word vector expression layer and a SoftMax classification layer.
Specifically, the word vector expression layer adopts a BERT pre-training model. BERT uses a Transformer encoder as its language model and adopts a "masked language model" and a next sentence prediction mechanism to overcome the unidirectionality of most current word vector generation models. The BERT pre-training model extracts the features of each word in the news text data segmented into sentences, $S_i = \{X_{i1}, X_{i2}, \ldots, X_{ik}\}$, to obtain the word features $X_{ij} = (e_1, e_2, \ldots, e_m)$, where $S_i$ represents the $i$-th sentence in the dataset, $X_{ik}$ represents the $k$-th word in the sentence, $X_{ij}$ represents the word vector of the $j$-th word of the $i$-th sentence, and $e_m$ represents the value of the $m$-th dimension of $X_{ij}$. In summary, after a sentence passes through the word vector expression layer, each of its words is represented by an $m$-dimensional word vector, so the sentence can be represented as:
$$S_i = \begin{pmatrix} e_{11} & e_{12} & \cdots & e_{1m} \\ e_{21} & e_{22} & \cdots & e_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ e_{k1} & e_{k2} & \cdots & e_{km} \end{pmatrix}$$
where $S_i$ represents the $i$-th sentence in the dataset and $e_{km}$ represents the value of the $m$-th dimension of the $k$-th word in the $i$-th sentence.
Specifically, the SoftMax classification layer serves as the classifier for emotion sentence classification. It normalizes the output of the network into a probability distribution over the predicted output categories, mapping each output to a value in (0, 1), expressed as:
$$z^{l} = W^{l} a^{l-1} + b^{l}, \qquad \mathrm{softmax}(z_i^{l}) = \frac{e^{z_i^{l}}}{\sum_{j} e^{z_j^{l}}}$$
where $W^{l}$ is the weight matrix, $b^{l}$ is the bias, $a^{l-1}$ is the output of the previous layer, and $z_i^{l}$ is the intermediate value computed at the $i$-th node of layer $l$. The SoftMax layer normalizes the result and decodes the label, and the result determines whether the input sentence is an emotion sentence or a non-emotion sentence.
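The classification step can be illustrated with a small, dependency-free sketch: a linear layer followed by SoftMax, with the most probable class taken as the label. The feature vector stands in for the sentence representation produced by BERT; the weights, the two-class layout (0 = non-emotion, 1 = emotion) and the function names are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Normalize scores into a probability distribution (shifted by max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, bias):
    """features: list of floats; weights: one row per class; bias: one value per class."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs
```

Every probability lies strictly between 0 and 1 and the probabilities sum to 1, matching the normalization described above.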
Through the emotion sentence extraction model, sentences with emotional tendencies are extracted from long news texts.
Step 5: extracting emotion sentences with the model of step 4 and using them as the training set to train the emotion entity extraction model, so that the model is able to extract the holder, the expression object and the event of the emotion in a sentence;
The training and testing flow of the emotion entity extraction model is shown in fig. 3. Based on the extracted emotion sentences, the emotion holders, the expression objects and the events related to the emotion sentences are extracted. The important entities in emotion sentences are identified with a sequence-to-sequence model based on a deep learning algorithm.
In a further embodiment, the emotion entity extraction model consists of three parts: a word vector layer, an encoder, a decoder;
Specifically, the word vector layer also employs a BERT pre-training model. Its input is the emotion sentences extracted by the emotion sentence extraction model, and its output is the word vector representation of those sentences.
Specifically, the encoder employs a bidirectional long short-term memory neural network (Bi-LSTM) to extract semantic features of the input text. LSTM is a special type of recurrent neural network (RNN) that can learn long-term dependency information. All RNNs have the form of a chain of repeating neural network modules. In a standard RNN the repeating module has only a very simple structure, e.g. a single tanh layer, whereas the "memory cells" of LSTM are deliberately designed to avoid the long-term dependency problem. LSTM controls the cell state through carefully designed structures called gates, which remove information from or add information to the cell state. With Bi-LSTM, global feature information of the whole text is obtained through two feature extractors running in opposite directions, which improves the encoder's feature extraction capability over the whole text. The LSTM model is calculated as follows:
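$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$$
$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_{t-1} + b_o)$$
$$h_t = o_t \tanh(c_t)$$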
where i, f, c and o are the input gate, the forget gate, the cell state and the output gate, respectively; W and b are the corresponding weight coefficient matrices and bias terms; σ and tanh are the sigmoid function and the hyperbolic tangent activation function, respectively.
The LSTM model training process can be roughly divided into four steps: ① computing the output value of each LSTM cell according to the expressions above (forward computation); ② computing the error term of each LSTM cell backward, along two back-propagation directions: through time and through the model layers; ③ computing the gradient of each weight from the corresponding error term; ④ updating the weights with a gradient-based optimization algorithm. The LSTM structure is shown in FIG. 4.
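The forward computation in step ① can be sketched directly from the five expressions. To stay short, this illustration uses scalar inputs and states (hidden size 1), so every weight is a single number; a real encoder would use matrices and a deep learning framework. The dictionary keys such as "xi" and the function names are assumptions introduced here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One time step of the LSTM expressions for scalar x_t, h_prev and c_prev."""
    i_t = sigmoid(W["xi"] * x_t + W["hi"] * h_prev + W["ci"] * c_prev + b["i"])
    f_t = sigmoid(W["xf"] * x_t + W["hf"] * h_prev + W["cf"] * c_prev + b["f"])
    c_t = f_t * c_prev + i_t * math.tanh(W["xc"] * x_t + W["hc"] * h_prev + b["c"])
    o_t = sigmoid(W["xo"] * x_t + W["ho"] * h_prev + W["co"] * c_prev + b["o"])
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


def run_lstm(xs, W, b):
    """Run the cell over a sequence, starting from zero states."""
    h, c, outputs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, W, b)
        outputs.append(h)
    return outputs


def run_bilstm(xs, W_fwd, b_fwd, W_bwd, b_bwd):
    """Pair forward and backward hidden states for every position."""
    forward = run_lstm(xs, W_fwd, b_fwd)
    backward = run_lstm(xs[::-1], W_bwd, b_bwd)[::-1]
    return list(zip(forward, backward))
```

The backward pass reads the sequence in reverse and its outputs are reversed again, so position t carries context from both sides.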
Specifically, the decoder employs a conditional random field (CRF). The encoder extracts and encodes the features of the data, and the decoder decodes the features into corresponding labels; the entity positions and entity categories are obtained from the predicted label values. A conditional random field is a Markov random field of the random variable Y given the random variable X. Typically, only linear-chain conditional random fields are used for labeling problems, with conditional probability P(Y|X), where X is the given observation sequence and Y is the label sequence (state sequence) to be predicted. The conditional probability distribution P(Y|X) is called a conditional random field if the following holds for every node v:
$$P(Y_v \mid X, Y_w, w \neq v) = P(Y_v \mid X, Y_w, w \sim v)$$
where $w \sim v$ denotes the nodes $w$ adjacent to $v$ and $w \neq v$ denotes all nodes other than $v$.
The decoder yields the label of each word, and the type and position of each entity are determined from the label categories, which realizes the recognition and extraction of emotion holders, expression objects and events in emotion sentences. In testing, the model reaches an accuracy of 65%. The CRF structure is shown in fig. 5.
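Decoding with a linear-chain CRF is commonly done with the Viterbi algorithm, and the resulting labels are then grouped into entity spans. The sketch below assumes a BIO tagging scheme and score tables (`emissions`, `transitions`) supplied by a trained model; neither the tagging scheme nor these names come from the patent.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][y]: score of tag y at position t.
    transitions[a][b]: score of moving from tag a to tag b.
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])
    back = []
    for emit in emissions[1:]:
        pointers, new_score = [], []
        for y in range(n_tags):
            best_prev = max(range(n_tags), key=lambda a: score[a] + transitions[a][y])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][y] + emit[y])
        back.append(pointers)
        score = new_score
    path = [max(range(n_tags), key=lambda y: score[y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def bio_to_spans(tags):
    """Turn a list of BIO tags into (start, end_exclusive, category) spans."""
    spans, start, kind = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        inside = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not inside:
            spans.append((start, i, kind))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans
```

The transition scores let the decoder rule out invalid label sequences, such as an `I-` tag directly after `O`, which a per-word classifier cannot do.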
Step 6: crawling news corpus and segmenting it into sentences by the methods of step 1 and step 2, inputting the news corpus segmented into sentences into the trained emotion sentence extraction model to extract emotion sentences, and inputting the extracted emotion sentences into the trained emotion entity extraction model to obtain emotion entities.
Steps 1 to 5 train the emotion sentence extraction model and the emotion entity extraction model. In practical application, new news corpus is crawled as in step 1 and preprocessed as in step 2, and the processed long text is segmented into sentences. The sentences are input into the emotion sentence extraction model, which judges whether each input sentence is an emotion sentence. Sentences judged to be emotion sentences are stored in an emotion sentence library. The emotion sentences in the library are then read as the input of the emotion entity extraction model, which yields the positions of all emotion entities in each input sentence. From these positions, the emotion holder, the emotion expression object and the related event contained in the emotion sentence can be extracted.
The invention trains deep learning models through remote supervision to extract emotion entities in news, including emotion holders, emotion expression objects and events. To address the challenges of entity extraction in the news field, a deep learning model based on BERT word vectors is designed; at the same time, automatic labeling combined with an expert knowledge base greatly reduces the cost of manual labeling, which is of great significance.
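The two-stage inference flow can be sketched with the trained models passed in as plain callables. `is_emotion_sentence` and `locate_entities` are hypothetical stand-ins for the two trained models, and whitespace tokenization is an assumption that keeps the example simple.

```python
def run_pipeline(sentences, is_emotion_sentence, locate_entities):
    """Filter emotion sentences, then extract entities from each by position."""
    emotion_library = [s for s in sentences if is_emotion_sentence(s)]
    results = []
    for sentence in emotion_library:
        tokens = sentence.split()        # assumed tokenization for this sketch
        spans = locate_entities(tokens)  # [(start, end_exclusive, category), ...]
        entities = {}
        for start, end, category in spans:
            entities.setdefault(category, []).append(" ".join(tokens[start:end]))
        results.append((sentence, entities))
    return results
```

Because non-emotion sentences are filtered out first, the more expensive entity model only runs on the sentences that matter.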

Claims (5)

1. The news emotion entity extraction method based on remote supervision is characterized by comprising the following steps of:
Step 1: adopting a crawler technology to crawl news corpus from official news websites and caching the news corpus to a local warehouse;
Step 2: preprocessing the crawled news corpus to obtain the news corpus segmented into sentences;
Step 3: constructing a key entity knowledge base, and automatically labeling the news corpus segmented into sentences according to the knowledge base; the constructed key entity knowledge base is a knowledge base of person, organization, country and event entities; the principle of automatically labeling the news corpus segmented into sentences according to the knowledge base is as follows: when more than n knowledge base entities appear in a sentence, the sentence is labeled as a sentence with emotion, where n is a preset natural number;
Step 4: training the emotion sentence extraction model with the labeled news corpus so that it is able to automatically judge whether an input sentence carries emotion;
Step 5: extracting emotion sentences with the model of step 4 and using them as the training set to train the emotion entity extraction model, so that the model is able to extract the holder, the expression object and the event of the emotion in a sentence;
Step 6: crawling news corpus and segmenting it into sentences by the methods of step 1 and step 2, inputting the news corpus segmented into sentences into the trained emotion sentence extraction model to extract emotion sentences, and inputting the extracted emotion sentences into the trained emotion entity extraction model to obtain emotion entities.
2. The news emotion entity extraction method based on remote supervision according to claim 1, wherein the specific method for crawling the related news corpus from official news websites is as follows:
acquiring the addresses of news pages related to the event by searching the official websites with keywords and parsing the search results;
parsing the news content at each news page address, acquiring the title, time and specific content of the news, and caching them to a local warehouse.
3. The method for extracting news emotion entities based on remote supervision according to claim 1, wherein preprocessing the crawled news corpus comprises:
Cleaning the crawled news corpus, and removing redundancy and dirty data irrelevant to the theme;
sentence division is carried out on news corpus in the local warehouse by taking punctuation marks as marks.
4. The news emotion entity extraction method based on remote supervision according to claim 1, wherein the emotion sentence extraction model includes a word vector expression layer and a SoftMax classification layer, which are respectively specified as follows:
the word vector expression layer adopts a BERT pre-training model and is used for extracting characteristics of each word in the news text data segmented into sentences to obtain word characteristics;
The SoftMax classification layer is used for predicting probability distribution on output categories and decoding labels, and judging whether an input sentence is an emotion sentence or not according to a prediction result.
5. The news emotion entity extraction method based on remote supervision according to claim 1, wherein the emotion entity extraction model includes a word vector layer, an encoder and a decoder, and specifically includes:
The word vector layer adopts a BERT pre-training model for obtaining the word vector features of emotion sentences;
The encoder adopts a bidirectional long-short-term memory neural network for extracting semantic features of an input text;
The decoder adopts a conditional random field for decoding semantic features into corresponding labels, and obtains corresponding entity positions and entity categories according to predicted label values.
CN202011395972.7A 2020-12-03 2020-12-03 News emotion entity extraction method based on remote supervision Active CN112784602B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011395972.7A CN112784602B (en) 2020-12-03 2020-12-03 News emotion entity extraction method based on remote supervision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011395972.7A CN112784602B (en) 2020-12-03 2020-12-03 News emotion entity extraction method based on remote supervision

Publications (2)

Publication Number Publication Date
CN112784602A (en) 2021-05-11
CN112784602B (en) 2024-06-14

Family

ID=75750656

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011395972.7A Active CN112784602B (en) 2020-12-03 2020-12-03 News emotion entity extraction method based on remote supervision

Country Status (1)

Country Link
CN (1) CN112784602B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221576B (en) * 2021-06-01 2023-01-13 复旦大学 Named entity identification method based on sequence-to-sequence architecture
CN113255358B (en) * 2021-07-12 2021-09-17 湖南工商大学 Multi-label character relation automatic labeling method based on event remote supervision
CN114970553B (en) * 2022-07-29 2022-11-08 北京道达天际科技股份有限公司 Information analysis method and device based on large-scale unmarked corpus and electronic equipment

Citations (1)

Publication number Priority date Publication date Assignee Title
CN110110335A (en) * 2019-05-09 2019-08-09 南京大学 A kind of name entity recognition method based on Overlay model

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US20080249764A1 (en) * 2007-03-01 2008-10-09 Microsoft Corporation Smart Sentiment Classifier for Product Reviews
CN107783960B (en) * 2017-10-23 2021-07-23 百度在线网络技术(北京)有限公司 Method, device and equipment for extracting information
CN110516067B (en) * 2019-08-23 2022-02-11 北京工商大学 Public opinion monitoring method, system and storage medium based on topic detection
CN110502638B (en) * 2019-08-30 2023-05-16 重庆誉存大数据科技有限公司 Enterprise news risk classification method based on target entity
CN110705300A (en) * 2019-09-27 2020-01-17 上海烨睿信息科技有限公司 Emotion analysis method, emotion analysis system, computer terminal and storage medium
CN111966878B (en) * 2020-08-04 2022-07-01 厦门大学 Public sentiment event reversal detection method based on machine learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110335A (en) * 2019-05-09 2019-08-09 南京大学 A kind of name entity recognition method based on Overlay model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on sentiment orientation analysis of Weibo comments based on BERT and bidirectional LSTM; 谌志群, 鞠婷; 《情报理论与实践》 (Issue 8); 173-178 *

Also Published As

Publication number Publication date
CN112784602A (en) 2021-05-11

Similar Documents

Publication Publication Date Title
CN112784602B (en) News emotion entity extraction method based on remote supervision
Quan et al. An efficient framework for sentence similarity modeling
CN108628935B (en) Question-answering method based on end-to-end memory network
CN111191002B (en) Neural code searching method and device based on hierarchical embedding
CN111444700A (en) Text similarity measurement method based on semantic document expression
CN107562792A (en) A kind of question and answer matching process based on deep learning
CN110674252A (en) High-precision semantic search system for judicial domain
Li et al. A method of emotional analysis of movie based on convolution neural network and bi-directional LSTM RNN
CN110348227B (en) Software vulnerability classification method and system
Xun et al. A survey on context learning
El Desouki et al. A hybrid model for paraphrase detection combines pros of text similarity with deep learning
Kumar et al. An abstractive text summarization technique using transformer model with self-attention mechanism
Shaker et al. Using lstm and gru with a new dataset for named entity recognition in the arabic language
CN111581365B (en) Predicate extraction method
Li et al. Efficient relational sentence ordering network
Xiao et al. Multi-Task CNN for classification of Chinese legal questions
Fu et al. Mixed word representation and minimal Bi-GRU model for sentiment analysis
CN111767388B (en) Candidate pool generation method
CN115034299A (en) Text classification method and device based on convolutional neural network multi-channel feature representation
Lokman et al. A conceptual IR chatbot framework with automated keywords-based vector representation generation
Abdolahi et al. A new method for sentence vector normalization using word2vec
Shuang et al. Combining word order and cnn-lstm for sentence sentiment classification
Mazitov et al. Named entity recognition in Russian using Multi-Task LSTM-CRF
CN114970557B (en) Knowledge enhancement-based cross-language structured emotion analysis method
Wan et al. Aspect-Based Sentiment Analysis with a Position-Aware Multi-head Attention Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant