CN108628828A - Self-attention-based method for jointly extracting opinions and their holders - Google Patents

Self-attention-based method for jointly extracting opinions and their holders Download PDF

Info

Publication number
CN108628828A
CN108628828A CN201810347840.3A
Authority
CN
China
Prior art keywords
viewpoint
holder
sentence
word
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810347840.3A
Other languages
Chinese (zh)
Other versions
CN108628828B (en)
Inventor
李雄
刘春阳
张传新
张旭
王萌
闫昊
唐彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
National Computer Network and Information Security Management Center
Original Assignee
Beihang University
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University, National Computer Network and Information Security Management Center filed Critical Beihang University
Priority to CN201810347840.3A priority Critical patent/CN108628828B/en
Publication of CN108628828A publication Critical patent/CN108628828A/en
Application granted granted Critical
Publication of CN108628828B publication Critical patent/CN108628828B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking

Abstract

The present invention provides a self-attention-based method for jointly extracting opinions and their holders: S1. build a corpus for extracting opinions and their holders; S2. identify sentences that contain opinions; S3. jointly extract each opinion and its holder. Advantages of the present invention: 1. the text classification model avoids extracting sentences that contain no opinion; 2. the joint extraction model for opinions and their holders dispenses with natural language processing stages such as part-of-speech tagging, named entity recognition and syntactic dependency parsing, so errors arising in those stages cannot degrade the extraction results, and the model offers high flexibility and broad coverage; 3. the present invention comprises building a corpus for extracting opinions and their holders, identifying sentences that contain opinions, and jointly extracting opinions and their holders; 4. on top of a bidirectional LSTM, the present invention uses self-attention to combine the advantages of both, so that the representation of the word sequence carries richer semantics and the trained model achieves higher accuracy.

Description

Self-attention-based method for jointly extracting opinions and their holders
Technical field
The present invention relates to natural language processing methods, and in particular to a self-attention-based method for jointly extracting opinions and their holders, which can automatically extract the opinions in Chinese news text together with the holders of those opinions. It belongs to the field of computer science and technology.
Background technology
With the development of Internet technology, the amount of text on the Internet has grown explosively; electronic media have developed rapidly, traditional print media have joined the electronic camp, and the volume of news text has exploded with them. Extracting opinions from text has therefore drawn growing attention from researchers and has become one of the most active research areas in natural language processing. Paradoxically, the explosive growth of online news now hinders access to information. When the volume of news was small, one could quickly read the news, record the opinions expressed, and form a fairly comprehensive picture of an event. Today the volume of news is enormous: reading only part of it yields limited and possibly one-sided information, while reading all of it and tallying the opinion of every expert or organization is infeasible in practice because the data volume is far too large. At present, major news portals and microblogging platforms do provide news summaries that let users grasp the gist of a story quickly and conveniently, but only a minority of trending stories receive such summaries, because they are still written manually by editorial staff. On e-commerce platforms such as Taobao, by contrast, opinion mining and sentiment analysis of product reviews have moved from research into commercial use, saving human labor while helping users obtain review information quickly. The automatic extraction of opinions and their holders from news text, however, is still at the research stage; even so, it is widely applicable and studied in many fields, such as information retrieval, data mining, text mining and Web mining, with applications ranging from computer science to management and sociology. News opinion extraction is thus steadily becoming a research hotspot.
Current work on opinion mining concentrates mainly on product reviews, a task that is in effect fine-grained, multi-aspect sentiment analysis. By granularity, sentiment analysis divides into document-level, sentence-level and phrase-level; by classification scheme, into binary, multi-class and aspect-based. The main task in product-review opinion extraction is to extract the evaluator, the evaluated object and the evaluating words, chiefly by two families of methods, supervised learning and unsupervised learning:
1. supervised learning method
The mainstream of supervised learning is sequence labeling. The best-performing methods at present are the hidden Markov model (Hidden Markov Model, HMM) and the conditional random field (Conditional Random Field, CRF), including variants such as lexicalized HMMs, Skip-CRF and Tree-CRF. Besides these mainstream approaches, syntactic dependency relations can be used to filter candidate evaluation pairs, after which a classifier decides whether a pair is a valid evaluated-object and evaluating-word pair.
2. unsupervised learning method
Unsupervised learning relies mainly on topic models; the two mainstream models are probabilistic latent semantic analysis (Probabilistic Latent Semantic Analysis, PLSA) and latent Dirichlet allocation (Latent Dirichlet Allocation, LDA). Neither was originally designed for opinion extraction, but both can be extended to model additional kinds of information. Methods with good results at present include Sentiment-LDA and MaxEnt-LDA. Some researchers have combined HMM with LDA and proposed the HMM-LDA model, which can discover latent evaluated objects.
Research on news opinion extraction is still relatively sparse. One existing method extracts opinion sentences from bilingual news through sentence-element association: its idea is that a cluster of sentences containing fixed sentence elements and sentiment constitutes opinion sentences. It first applies named entity recognition to label news sentences and obtain a set of sentence elements, then extracts sentiment words with a sentiment lexicon, computes sentence weights from the degree of association between the sentence elements and sentiment words of different news texts, and finally obtains sentence clusters that contain opinion sentences.
Our goal is to extract the opinions in news text together with their holders, a task that is similar to, but not identical with, the tasks above. Extracting opinions and their holders from news text has not yet become a hotspot of natural language processing, and published work is relatively scarce. Templates for opinion sentences can be obtained through named entity recognition and syntactic dependency parsing, but template matching has low coverage and little flexibility: it only captures fixed expressions and cannot adapt to flexible language variation. We therefore propose a self-attention-based method for jointly extracting opinions and their holders, solving this problem and filling a gap in the field.
Different opinion-extraction methods have different limitations. Like supervised learning for other tasks, supervised opinion extraction suffers from labeled data being hard to obtain, from the large number of classes, and from large gaps between the training corpora of different classes. Moreover, as Internet slang spreads, language itself keeps changing: earlier annotations may soon become obsolete, and annotating new data or correcting old data both demand considerable effort.
Unsupervised methods model evaluated objects and evaluating words mainly with topic models, but topic models require complex tuning of many parameters to obtain good results, so training usually converges slowly. Topic models also easily find the evaluations that occur widely across a document collection, but struggle to discover evaluations that occur rarely. In news text, universally shared evaluations, especially by organizations and experts, are in fact rare: experts and organizations tend to voice their own individual views, and such evaluations are easily drowned in the news corpus.
The existing bilingual news opinion-sentence extraction method exploits the relatedness of bilingual news, yet still uses only a basic sentiment lexicon for extracting sentiment words. What this method finally extracts is a small sentiment-bearing passage composed of several sentences, which may contain no evaluation at all, so its accuracy cannot meet the requirements.
Summary of the invention
The purpose of the present invention is to provide a self-attention-based method for jointly extracting opinions and their holders that overcomes the above defects of review opinion extraction and news opinion-sentence extraction. The text classification model of the method effectively avoids extracting sentences that contain no opinion; the joint extraction model for opinions and their holders dispenses with natural language processing stages such as part-of-speech tagging, named entity recognition and syntactic dependency parsing, so errors arising in those stages cannot degrade the extraction results; and because the model involves no manually defined templates, it gains flexibility and coverage.
The self-attention-based method for jointly extracting opinions and their holders of the present invention specifically comprises the following steps:
S1. Build a corpus for extracting opinions and their holders
The corpus comprises two parts: negative samples that contain no opinion, and positive samples that contain an opinion and its holder. Each positive sample carries an annotation of the opinion and its holder and can be expressed as an <original text, opinion holder and opinion> two-tuple, where the opinion-holder-and-opinion part has the format [opinion holder]:[opinion]. The present invention obtains such a corpus through manual annotation.
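As an illustration of the annotation format described above, the following sketch builds one positive and one negative sample and parses the [opinion holder]:[opinion] annotation. The sample sentences are invented for illustration and do not come from the patent's corpus:

```python
# Illustrative sketch of the corpus format: a positive sample is an
# <original text, annotation> two-tuple whose annotation follows the
# "[opinion holder]:[opinion]" convention; a negative sample has none.
# (The example sentences are invented, not drawn from the real corpus.)

def parse_annotation(annotation: str):
    """Split a "[holder]:[opinion]" string into its two parts."""
    holder, opinion = annotation.split(":", 1)
    return holder.strip("[]"), opinion.strip("[]")

positive_sample = (
    "Expert Zhang said the market will keep growing.",  # original text
    "[Expert Zhang]:[the market will keep growing]",    # annotation
)
negative_sample = ("The meeting took place on Tuesday.", None)

holder, opinion = parse_annotation(positive_sample[1])
```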
S2. Identify sentences that contain opinions
Identifying sentences that contain opinions is a binary text classification problem: sentences containing an opinion form the positive class, and sentences containing no opinion form the negative class. The present invention adopts a CNN-based text classification model, whose structure is shown in Fig. 2. The specific implementation steps are:
S21: Obtain word vectors: using Chinese Wikipedia as the corpus, train d-dimensional word vectors with the word2vec model;
S22: Segment sentence s into words and, using the word vectors, express s as a matrix C = <w1, w2, …, wn>, where w1 is the d-dimensional word vector of the first word in s;
S23: Process matrix C with k convolution kernels, each of size x*d, where x is an integer greater than 0 and less than 5; each convolution operation yields an n-dimensional vector;
S24: Apply max pooling to the k n-dimensional vectors obtained in step S23, each vector contributing its maximum value, which finally yields a k-dimensional vector;
S25: Feed the k-dimensional vector obtained in step S24 into a fully connected network for classification;
S26: Train the model; the training and test data are obtained by randomly shuffling the original data and splitting it 80% for training and 20% for testing.
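Steps S22-S25 can be sketched as follows. This is a minimal, dependency-free toy of the convolution-and-max-pooling forward pass with random weights, not the trained model of the invention; note that without padding each feature map has n - x + 1 entries rather than the n stated above:

```python
# Toy forward pass of the CNN classifier front end (steps S22-S24):
# word-vector matrix -> k convolutions of window x over dimension d ->
# max pooling -> a k-dimensional feature vector for the classifier (S25).
import random

random.seed(0)
d, n, k, x = 4, 6, 3, 2  # embedding dim d, sentence length n, k kernels, window x

# S22: the segmented sentence as an n x d matrix of (toy, random) word vectors
C = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
# k convolution kernels, each of size x*d, with random toy weights
kernels = [[[random.uniform(-1, 1) for _ in range(d)] for _ in range(x)]
           for _ in range(k)]

def convolve(C, kernel):
    """Slide an x*d kernel down the sentence matrix (no padding, so the
    feature map has n - x + 1 entries)."""
    win = len(kernel)
    return [sum(C[i + a][b] * kernel[a][b]
                for a in range(win) for b in range(len(kernel[0])))
            for i in range(len(C) - win + 1)]

# S23-S24: one feature map per kernel, then max pooling down to a k-dim vector
feature_maps = [convolve(C, ker) for ker in kernels]
pooled = [max(fm) for fm in feature_maps]  # the k-dim input to the classifier
```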
S3. Jointly extract opinions and their holders
Extracting opinions and their holders means extracting each opinion and its holder from sentences that contain opinions. A single sentence may contain several names and several opinions, so the key problem this task must solve is how to extract and match names and opinions accurately. The present invention uses a bidirectional LSTM to capture forward and backward information in the text, uses self-attention to establish the relationship between each word and its context words, and uses a Pointer Network to extract from the text the words that constitute the <opinion holder, opinion> two-tuple. As shown in Fig. 3, the joint extraction model for opinions and their holders comprises four parts: a word embedding layer, a bidirectional LSTM layer, a self-attention layer and a pointer network layer. The specific implementation steps of the joint extraction are:
S31: Obtain word vectors: using Chinese Wikipedia as the corpus, train d-dimensional word vectors with the word2vec model;
S32: Feed the vectorized sentence <w1, w2, …, wn> into the bidirectional LSTM to obtain context-fused word vectors <h1, h2, …, hn>;
S33: From the context-fused word vectors obtained in step S32, compute for each word w_i the weight α_ij between w_i and every other word w_j, form the weighted vector a'_i, and concatenate a'_i with h_i into a_i as the output of the self-attention layer. The relevant formulas are:
e_ij = W_e · tanh(W_s · h_j + W_a · a'_(i-1))
α_ij = softmax(e_ij)
a'_i = Σ_j α_ij · h_j
a_i = [a'_i ; h_i]
where a'_i denotes the result of the self-attention weighted summation for word w_i, and α_ij denotes the weight between word w_i and another word w_j. α_ij is obtained from e_ij through the softmax function; in the computation of e_ij, W_e, W_s and W_a are parameters to be learned; the last formula denotes vector concatenation.
S34: Feed the output <a1, a2, …, an> of step S33 into the encoder of the Pointer Network and denote the encoder output by <h1, h2, …, hn>; the decoder outputs the input subsequence with the highest probability, which is exactly the jointly extracted opinion and its holder. Following the constructed training corpus, the first word of the output sequence is the opinion holder and the remaining words are the opinion.
S35: Train the model; the training and test data are obtained by randomly shuffling the original data and splitting it 80% for training and 20% for testing.
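The self-attention computation of step S33 can be sketched as follows, assuming the BiLSTM outputs h_1…h_n are already given (random toy vectors here) and collapsing the learned matrices W_e, W_s, W_a to scalars for brevity; a real model learns full matrices:

```python
# Toy self-attention layer (step S33): for each position i, compute scores
# e_ij from h_j and the previous attended vector a'_(i-1), softmax them into
# weights alpha_ij, form the weighted sum a'_i, and concatenate a_i = [a'_i; h_i].
import math
import random

random.seed(1)
n, dim = 4, 3
H = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]  # BiLSTM outputs
w_e, w_s, w_a = 0.5, 1.0, 0.3  # toy scalar stand-ins for W_e, W_s, W_a

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    z = sum(exps)
    return [v / z for v in exps]

A = []                # concatenated outputs a_i = [a'_i; h_i]
a_prev = [0.0] * dim  # a'_0, the initial attended vector
for i in range(n):
    # e_ij = w_e * tanh(w_s * h_j + w_a * a'_(i-1)), reduced to a scalar score
    e = [w_e * math.tanh(sum(w_s * hj + w_a * ap for hj, ap in zip(H[j], a_prev)))
         for j in range(n)]
    alpha = softmax(e)                                  # alpha_ij
    a_prev = [sum(alpha[j] * H[j][t] for j in range(n))  # a'_i, weighted sum
              for t in range(dim)]
    A.append(a_prev + H[i])                             # a_i, a 2*dim vector
```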
The self-attention-based method for jointly extracting opinions and their holders of the present invention has the following advantages and effects:
1. The text classification model of the method effectively avoids extracting sentences that contain no opinion;
2. The joint extraction model for opinions and their holders dispenses with natural language processing stages such as part-of-speech tagging, named entity recognition and syntactic dependency parsing, so errors arising in those stages cannot degrade the extraction results; and because the model involves no manually defined templates, it gains flexibility and coverage;
3. Previous opinion-mining work targets mainly product reviews, so its main goals are to extract the evaluated object and the sentiment toward it; research on extracting opinions and their holders from news text is relatively scarce. Although templates for extracting opinions can be built by combining named entity recognition with syntactic dependency parsing, such templates have low coverage and poor flexibility and can hardly meet the demand. Against these limitations, the present invention proposes a new method for extracting opinions and their holders, comprising building a corpus for extracting opinions and their holders, identifying sentences that contain opinions, and jointly extracting opinions and their holders.
4. The present invention proposes a context-fusing word-sequence representation based on self-attention and a bidirectional LSTM. A bidirectional LSTM alone can fuse context information but cannot model the relations between all pairs of words; self-attention alone loses the sequential features between words. On top of the bidirectional LSTM, the present invention uses self-attention to combine the advantages of both, so that the representation of the word sequence carries richer semantics and the trained model achieves higher accuracy.
Description of the drawings
Fig. 1 is the main flow chart of the method of the present invention.
Fig. 2 is the model of the method of the present invention for identifying sentences that contain opinions.
Fig. 3 is the joint extraction model of the method of the present invention for opinions and their holders.
Detailed description of the embodiments
The technical solution of the present invention is further described below with reference to the accompanying drawings.
The method of the present invention has the characteristics that:
First, opinions of organizations or experts usually appear only in scattered statements in news text. We design a judgment method for organization and expert opinion sentences in news text that can quickly decide whether a paragraph contains an opinion sentence.
Second, to identify and extract evaluation holders and evaluation content in news text effectively, we build an end-to-end neural network model that performs joint extraction of the evaluation content and its holder based on self-attention and a Pointer Network.
In this way we realize a self-attention-based joint extraction method for opinions and their holders.
The task of the present invention comprises three parts: building a corpus for extracting opinions and their holders; training a text classification model to identify sentences that contain opinions; and training a network model that can jointly extract each opinion and its holder from sentences containing opinions. With these tasks completed, the flow for extracting opinions and their holders from a document is as follows: first split the document into sentences to obtain a sentence set; then let the text classification model judge, for every sentence in the set, whether it contains an opinion; if it does, extract the opinion and its holder with the joint extraction model. The main flow of the method is shown in Fig. 1; the specific steps are as follows:
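The document-level flow just described can be sketched as a small pipeline. In this sketch, `classify` and `extract` are trivial placeholders standing in for the trained CNN classifier and the joint extraction model; the example document is invented:

```python
# Hedged sketch of the overall pipeline: split a document into sentences,
# keep those the classifier marks as containing an opinion, then run the
# joint extractor on each to obtain (holder, opinion) pairs.
import re

def split_sentences(doc: str):
    return [s for s in re.split(r"[。！？.!?]", doc) if s.strip()]

def classify(sentence: str) -> bool:   # placeholder for the CNN classifier
    return "said" in sentence

def extract(sentence: str):            # placeholder for the joint extractor
    holder, _, opinion = sentence.partition(" said ")
    return holder.strip(), opinion.strip()

def process(doc: str):
    return [extract(s) for s in split_sentences(doc) if classify(s)]

pairs = process("The sky was clear. Expert Li said growth will continue.")
```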
S1. Build a corpus for extracting opinions and their holders.
The constructed corpus comprises two parts: negative samples that contain no opinion, and positive samples that contain an opinion and its holder. Each positive sample carries an annotation of the opinion and its holder and can be expressed as an <original text, opinion holder and opinion> two-tuple, where the opinion-holder-and-opinion part has the format [opinion holder]:[opinion]. The present invention obtains such a corpus through manual annotation.
S2. Identify sentences that contain opinions.
Identifying sentences that contain opinions is a binary text classification problem: sentences containing an opinion form the positive class, and sentences containing no opinion form the negative class. Deep learning currently achieves good results on text classification; the present invention adopts a CNN-based text classification model, which can take pre-trained word vectors as input, improving the model's portability, and can capture combined features of local words by controlling the size of the convolution window, improving classification accuracy. The structure of this model is shown in Fig. 2; its specific implementation steps are:
S21: Obtain word vectors: using Chinese Wikipedia as the corpus, train d-dimensional word vectors with the word2vec model.
S22: Segment sentence s into words and, using the word vectors, express s as a matrix C = <w1, w2, …, wn>, where w1 is the d-dimensional word vector of the first word in s.
S23: Process matrix C with k convolution kernels, each of size x*d, where x is an integer greater than 0 and less than 5; each convolution operation yields an n-dimensional vector.
S24: Apply max pooling to the k n-dimensional vectors obtained in step S23, each vector contributing its maximum value, which finally yields a k-dimensional vector.
S25: Feed the k-dimensional vector obtained in step S24 into a fully connected network for classification.
S26: Train the model; the training and test data are obtained by randomly shuffling the original data and splitting it 80% for training and 20% for testing.
S3. Jointly extract opinions and their holders.
Extracting opinions and their holders means extracting each opinion and its holder from sentences that contain opinions. A single sentence may contain several names and several opinions, so the key problem this task must solve is how to extract and match names and opinions accurately. The present invention uses a bidirectional LSTM to capture forward and backward information in the text, uses self-attention to establish the relationship between each word and its context words, and uses a Pointer Network to extract from the text the words that constitute the <opinion holder, opinion> two-tuple. As shown in Fig. 3, the joint extraction model for opinions and their holders comprises four parts: a word embedding layer, a bidirectional LSTM layer, a self-attention layer and a pointer network layer. The specific implementation steps of the joint extraction model for opinions and their holders of the present invention are:
S31: Obtain word vectors: using Chinese Wikipedia as the corpus, train d-dimensional word vectors with the word2vec model.
S32: Feed the vectorized sentence <w1, w2, …, wn> into the bidirectional LSTM to obtain context-fused word vectors <h1, h2, …, hn>;
S33: From the context-fused word vectors obtained in step S32, compute for each word w_i the weight α_ij between w_i and every other word w_j, form the weighted vector a'_i, and concatenate a'_i with h_i into a_i as the output of the self-attention layer. The relevant formulas are:
e_ij = W_e · tanh(W_s · h_j + W_a · a'_(i-1))
α_ij = softmax(e_ij)
a'_i = Σ_j α_ij · h_j
a_i = [a'_i ; h_i]
where a'_i denotes the result of the self-attention weighted summation for word w_i, and α_ij denotes the weight between word w_i and another word w_j. α_ij is obtained from e_ij through the softmax function; in the computation of e_ij, W_e, W_s and W_a are parameters to be learned; the last formula denotes vector concatenation.
S34: Feed the output <a1, a2, …, an> of step S33 into the encoder of the Pointer Network and denote the encoder output by <h1, h2, …, hn>; the decoder outputs the input subsequence with the highest probability, which is exactly the jointly extracted opinion and its holder. Following the constructed training corpus, the first word of the output sequence is the opinion holder and the remaining words are the opinion.
S35: Train the model; the training and test data are obtained by randomly shuffling the original data and splitting it 80% for training and 20% for testing.
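The decoding of step S34 can be illustrated with a toy pointer-style selection loop. A real Pointer Network uses a learned LSTM decoder with attention-based pointing and typically masks already-selected positions; this sketch omits both and scores positions with a plain dot product against a toy decoder state:

```python
# Toy pointer-style decoding (step S34): at each step, score every encoder
# position, softmax the scores into a distribution, and "point" at the
# highest-probability input position; the chosen indices form the output
# subsequence (holder first, then the opinion words, per the training corpus).
import math
import random

random.seed(2)
n, dim = 5, 3
enc = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]  # encoder outputs

def point(state, enc):
    """Return the index of the input position with the highest probability."""
    scores = [sum(s * h for s, h in zip(state, hj)) for hj in enc]
    m = max(scores)
    probs = [math.exp(v - m) for v in scores]
    z = sum(probs)
    probs = [p / z for p in probs]
    return max(range(len(enc)), key=lambda j: probs[j])

state = [0.1, -0.2, 0.4]  # toy initial decoder state
picked = []
for _ in range(3):        # emit three pointers
    j = point(state, enc)
    picked.append(j)
    state = enc[j]        # feed the chosen vector back as the next state
```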
The method proposes a novel extraction method for opinions and their holders, comprising three parts: building the corpus, identifying sentences that contain opinions, and jointly extracting opinions and their holders. The text classification model that judges whether a sentence contains an opinion effectively avoids extracting sentences that contain no opinion, and the joint extraction model for opinions and their holders dispenses with natural language processing stages such as part-of-speech tagging, named entity recognition and syntactic dependency parsing, so errors arising in those stages cannot degrade the extraction results; and because the model involves no manually defined templates, it gains flexibility and coverage.
The key points and claimed points of the present invention are the processing method for jointly extracting opinions and their holders and the context-fusing word-sequence representation based on self-attention and a bidirectional LSTM.

Claims (3)

1. A self-attention-based method for jointly extracting opinions and their holders, characterized in that the method specifically comprises the following steps:
S1. Build a corpus for extracting opinions and their holders
The corpus comprises two parts: negative samples that contain no opinion, and positive samples that contain an opinion and its holder; each positive sample carries an annotation of the opinion and its holder and can be expressed as an <original text, opinion holder and opinion> two-tuple, where the opinion-holder-and-opinion part has the format [opinion holder]:[opinion];
S2. Identify sentences that contain opinions
Identifying sentences that contain opinions is a binary text classification problem: sentences containing an opinion form the positive class, and sentences containing no opinion form the negative class;
S3. Jointly extract opinions and their holders
Use a bidirectional LSTM to capture forward and backward information in the text, use self-attention to establish the relationship between each word and its context words, and use a Pointer Network to extract from the text the words that constitute the <opinion holder, opinion> two-tuple.
2. The self-attention-based method for jointly extracting opinions and their holders according to claim 1, characterized in that step S2 specifically uses a CNN-based text classification model, with the following steps:
S21: Obtain word vectors: using Chinese Wikipedia as the corpus, train d-dimensional word vectors with the word2vec model;
S22: Segment sentence s into words and, using the word vectors, express s as a matrix C = <w1, w2, …, wn>, where w1 is the d-dimensional word vector of the first word in s;
S23: Process matrix C with k convolution kernels, each of size x*d, where x is an integer greater than 0 and less than 5; each convolution operation yields an n-dimensional vector;
S24: Apply max pooling to the k n-dimensional vectors obtained in step S23, each vector contributing its maximum value, which finally yields a k-dimensional vector;
S25: Feed the k-dimensional vector obtained in step S24 into a fully connected network for classification;
S26: Train the model; the training and test data are obtained by randomly shuffling the original data and splitting it 80% for training and 20% for testing.
3. The self-attention-based method for jointly extracting opinions and their holders according to claim 1, characterized in that step S3 is specifically implemented as:
S31: Obtain word vectors: using Chinese Wikipedia as the corpus, train d-dimensional word vectors with the word2vec model;
S32: Feed the vectorized sentence <w1, w2, …, wn> into the bidirectional LSTM to obtain context-fused word vectors <h1, h2, …, hn>;
S33: From the context-fused word vectors obtained in step S32, compute for each word w_i the weight α_ij between w_i and every other word w_j, form the weighted vector a'_i, and concatenate a'_i with h_i into a_i as the output of the self-attention layer; the relevant formulas are:
e_ij = W_e · tanh(W_s · h_j + W_a · a'_(i-1))
α_ij = softmax(e_ij)
a'_i = Σ_j α_ij · h_j
a_i = [a'_i ; h_i]
where a'_i denotes the result of the self-attention weighted summation for word w_i, and α_ij denotes the weight between word w_i and another word w_j; α_ij is obtained from e_ij through the softmax function; in the computation of e_ij, W_e, W_s and W_a are parameters to be learned; the last formula denotes vector concatenation;
S34: Feed the output <a1, a2, …, an> of step S33 into the encoder of the Pointer Network and denote the encoder output by <h1, h2, …, hn>; the decoder outputs the input subsequence with the highest probability, which is exactly the jointly extracted opinion and its holder; following the constructed training corpus, the first word of the output sequence is the opinion holder and the remaining words are the opinion;
S35: Train the model; the training and test data are obtained by randomly shuffling the original data and splitting it 80% for training and 20% for testing.
CN201810347840.3A 2018-04-18 2018-04-18 Combined extraction method based on self-attention viewpoint and holder thereof Active CN108628828B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810347840.3A CN108628828B (en) 2018-04-18 2018-04-18 Combined extraction method based on self-attention viewpoint and holder thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810347840.3A CN108628828B (en) 2018-04-18 2018-04-18 Combined extraction method based on self-attention viewpoint and holder thereof

Publications (2)

Publication Number Publication Date
CN108628828A true CN108628828A (en) 2018-10-09
CN108628828B CN108628828B (en) 2022-04-01

Family

ID=63705515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810347840.3A Active CN108628828B (en) 2018-04-18 2018-04-18 Combined extraction method based on self-attention viewpoint and holder thereof

Country Status (1)

Country Link
CN (1) CN108628828B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408630A (en) * 2018-10-17 2019-03-01 杭州世平信息科技有限公司 A method for automatically generating court opinions from descriptions of crime facts
CN109446326A (en) * 2018-11-01 2019-03-08 大连理工大学 Joint extraction method for biomedical events based on a copy mechanism
CN109684449A (en) * 2018-12-20 2019-04-26 电子科技大学 A natural language semantic representation method based on an attention mechanism
CN109783812A (en) * 2018-12-28 2019-05-21 中国科学院自动化研究所 Chinese named entity recognition method and device based on a self-attention mechanism
CN109933792A (en) * 2019-03-11 2019-06-25 海南中智信信息技术有限公司 Reading comprehension method for opinion-type questions based on multi-layer bidirectional LSTM and a verification model
CN109977414A (en) * 2019-04-01 2019-07-05 中科天玑数据科技股份有限公司 A topic analysis system and method for user comments on internet finance platforms
CN110008807A (en) * 2018-12-20 2019-07-12 阿里巴巴集团控股有限公司 A training method, device and equipment for a contract content recognition model
CN110162594A (en) * 2019-01-04 2019-08-23 腾讯科技(深圳)有限公司 Viewpoint generation method and device for text data, and electronic equipment
CN110263319A (en) * 2019-03-21 2019-09-20 国家计算机网络与信息安全管理中心 A scholar viewpoint extraction method based on web page text
CN110334339A (en) * 2019-04-30 2019-10-15 华中科技大学 A sequence labelling model and labelling method based on a location-aware self-attention mechanism
CN110472047A (en) * 2019-07-15 2019-11-19 昆明理工大学 A multi-feature-fusion Chinese-Vietnamese news viewpoint sentence extraction method
CN111428490A (en) * 2020-01-17 2020-07-17 北京理工大学 Weakly supervised learning method for coreference resolution using a language model
CN111666767A (en) * 2020-06-10 2020-09-15 创新奇智(上海)科技有限公司 Data identification method and device, electronic equipment and storage medium
CN112328784A (en) * 2019-08-05 2021-02-05 上海智臻智能网络科技股份有限公司 Data information classification method and device
CN112667808A (en) * 2020-12-23 2021-04-16 沈阳新松机器人自动化股份有限公司 Relation extraction method and system based on the BERT model
CN113139116A (en) * 2020-01-19 2021-07-20 北京中科闻歌科技股份有限公司 BERT-based method, device, equipment and storage medium for extracting viewpoints from media information

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103034626A (en) * 2012-12-26 2013-04-10 上海交通大学 Emotion analyzing system and method
US20130120267A1 (en) * 2011-11-10 2013-05-16 Research In Motion Limited Methods and systems for removing or replacing on-keyboard prediction candidates
CN103678564A (en) * 2013-12-09 2014-03-26 国家计算机网络与信息安全管理中心 Internet product research system based on data mining
CN104778209A (en) * 2015-03-13 2015-07-15 国家计算机网络与信息安全管理中心 Opinion mining method for ten-million-scale news comments


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yu Qi: "Research on Sentiment Analysis Techniques for Chinese Microblogs", China Master's Theses Full-text Database, Information Science and Technology Series *
Bai Jing et al.: "Attention-based BiLSTM-CNN Stance Detection Model for Chinese Microblogs", Computer Applications and Software *


Also Published As

Publication number Publication date
CN108628828B (en) 2022-04-01

Similar Documents

Publication Publication Date Title
CN108628828A A self-attention-based joint extraction method for viewpoints and their holders
CN110633409B Automobile news event extraction method integrating rules and deep learning
CN108573411B Hybrid recommendation method based on deep sentiment analysis of user comments and fusion of multi-source recommendation views
CN107729309B Deep-learning-based Chinese semantic analysis method and device
CN108595708A A knowledge-graph-based text classification method for abnormal information
CN110516067A Public opinion monitoring method, system and storage medium based on topic detection
CN103049435B Fine-grained text sentiment analysis method and device
WO2021114745A1 Named entity recognition method employing affix perception for use in social media
CN109325112B A cross-language sentiment analysis method and apparatus based on emoji
CN106951438A An open-domain event extraction system and method
CN104809176A Entity relation extraction method for Tibetan
CN104881458B A labelling method and device for web page subjects
CN106940726B Automatic creative generation method and terminal based on a knowledge network
CN110489523B Fine-grained sentiment analysis method based on online shopping reviews
CN106599032A Text event extraction method combining sparse coding and a structured perceptron
CN102609427A Public opinion vertical search analysis system and method
CN113157859B Event detection method based on superordinate concept information
CN113987104A Ontology-guided generative event extraction method
Ketmaneechairat et al. Natural language processing for disaster management using conditional random fields
Xian et al. Self-guiding multimodal LSTM—when we do not have a perfect training dataset for image captioning
CN111159412A Classification method and device, electronic equipment and readable storage medium
CN109086355A Hotspot association analysis method and system based on news topic words
CN114048340A Hierarchical-fusion combined-query image retrieval method
CN114065702A Event detection method fusing entity relations and event elements
CN116628328A Web API recommendation method and device based on functional semantics and structural interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant