CN108829662A - Dialogue act recognition method and system based on a conditional random field structured attention network - Google Patents

Dialogue act recognition method and system based on a conditional random field structured attention network

Info

Publication number
CN108829662A
CN108829662A (application CN201810443182.8A)
Authority
CN
China
Prior art keywords
word
random field
conversation
vector
context
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810443182.8A
Other languages
Chinese (zh)
Inventor
陈哲乾
蔡登
杨荣钦
赵洲
何晓飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201810443182.8A priority Critical patent/CN108829662A/en
Publication of CN108829662A publication Critical patent/CN108829662A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a dialogue act recognition method and system based on a conditional random field structured attention network. The recognition method comprises the following steps: (1) combining a memory network, the dialogue semantics are modeled hierarchically by reasoning at the word level, the sentence level, and the dialogue level; (2) a structured attention network divides the dialogue content into structural subsections according to the correlation between dialogue contents; (3) the resulting structured information is applied to a linear-chain conditional random field algorithm, which predicts the current dialogue act from the conversational context. The invention deeply captures the contextual information of an interactive dialogue and, by combining the structured attention network with the conditional random field algorithm, dynamically segments the dialogue content, further improving dialogue act recognition accuracy.

Description

Dialogue act recognition method and system based on a conditional random field structured attention network
Technical field
The present invention relates to the field of dialogue systems in natural language processing, and in particular to a dialogue act recognition method and system based on a conditional random field structured attention network.
Background art
In recent years, as human-computer interaction technology has gradually matured, a large number of products carrying human-computer dialogue systems have entered ordinary households. The appearance of products such as the smartphone assistants Siri and Cortana and the smart speakers Xiao Ai and Tmall Genie has let people deeply experience the convenience and enjoyment that technology brings. At the same time, human-computer dialogue systems have attracted wide attention from researchers in both industry and academia. The main research field of the present invention is one of the indispensable technologies in dialogue systems: dialogue act recognition. Its purpose is, for a segment of dialogue and given the preceding dialogue content as context, to predict the act of the current utterance and thereby identify the speaker's intention. An efficient and accurate dialogue act recognition model must clearly capture the contextual information and track the state of the dialogue content, so as to determine the current speaker's intention and recognize the act. For a machine, accurately identifying the speaker's intended act makes it possible to generate an accurate reply; this is an important technical difficulty in human-computer dialogue systems.
At present, mainstream dialogue act recognition techniques are studied from two main directions. The first defines dialogue act recognition as a multi-class text classification problem. For example, the LSTM-Softmax algorithm proposed by Khanpour et al. in the COLING 2016 paper "Dialogue Act Classification in Domain-Independent Conversations Using a Deep Recurrent Neural Network" combines deep learning with a soft classification algorithm and defines dialogue act recognition as simple multi-class text classification. The RCNN algorithm proposed by Blunsom et al. in the 2013 Computer Science paper "Recurrent Convolutional Neural Networks for Discourse Compositionality" uses hierarchical convolutional neural networks to model sentences and dialogues in layers, extracting deep semantic information before classifying the text. The DRLM-Conditional model proposed by Ji Yangfeng et al. in the NAACL-HLT 2016 paper "A Latent Variable Recurrent Neural Network for Discourse Relation Language Models" likewise combines a neural network structure with a probabilistic graphical model, labeling the sentences of different dialogue acts according to a maximization principle to achieve dialogue act recognition. The second direction defines dialogue act recognition as a structured sequence labeling problem, typically using hidden Markov models or conditional random field models, which can take the contextual relations of the entire dialogue into account. This definition is entirely different from the multi-class text classification setting, in which each dialogue act exists independently: structured sequence labeling considers how the current utterance is influenced by the preceding context, each state being affected by the states before it. For example, the Bi-LSTM-CRF model proposed by Kumar et al. in the paper "Dialogue Act Sequence Labeling using Hierarchical encoder with CRF", published on the Cornell preprint website, places a simple conditional random field on the final output layer of a hierarchical deep neural network to recognize dialogue acts. The paper "Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech" by Stolcke et al., published in the journal Dialogue in 2006, directly proposes combining plain-text feature extraction with a hidden Markov model for structured sequence prediction.
Obviously, defining dialogue act recognition as independent multi-class text classification loses the rich contextual information of the dialogue scene. In a dialogue, the speaker's intention is influenced by the preceding content: when the previous dialogue act is a greeting, the next act is very likely also a greeting; when the previous act is a question, the next act is very likely an answer. The other solution, using conventional conditional random fields or hidden Markov models, can capture the relatedness of the surrounding dialogue content, but it is severely confined to the influence of the immediately preceding state and does not consider the effect of topic division over the whole dialogue on dialogue act recognition. Usually these models flatten the whole dialogue into one long, equally weighted text and do not separate the different topics by dividing the dialogue into subsections. Yet in multi-turn dialogues humans often switch topics, chatting about one topic for a while and then another. Two topics may in fact have no great relevance to each other and should rather be treated independently.
Summary of the invention
The present invention provides a dialogue act recognition method based on a conditional random field structured attention network, which well solves the problem of low recognition accuracy caused by topic shifts within the dialogue and improves the robustness of dialogue act recognition to contextual influence.
A dialogue act recognition method based on a conditional random field structured attention network comprises the following steps:
(1) combining a memory network, performing hierarchical reasoning and semantic modeling of the dialogue semantics at the word level, the sentence level, and the dialogue level;
(2) applying a structured attention network to divide the dialogue content into structural subsections according to the correlation between dialogue contents;
(3) applying the resulting structured information to a linear-chain conditional random field algorithm and predicting the current dialogue act from the conversational context.
The present invention can be understood as the machine understanding the whole dialogue through deep semantic understanding and dividing it into subsections, so that strongly related utterances are put together while weakly related ones are kept apart as far as possible. The structured attention network then dynamically recognizes the relatedness between dialogue acts and, combined with the conditional random field algorithm, achieves structured prediction of dialogue acts.
The present invention performs semantic understanding of the dialogue content and captures its deep semantic information, which is an essential precondition of dialogue act recognition. Since dialogue content naturally has a hierarchical structure, this step models the semantics in layers: words form sentences, sentences form the whole dialogue, and the act of each utterance is determined from the dialogue content.
In step (1), the word-level reasoning formula of the dialogue semantics is as follows:
E = f_concat(E_w, E_a, E_pos, E_ner)
where E is the final complete vector representation of the word, spliced from four word representations of different dimensions; f_concat is the splicing function; E_w is the Word2vec vector of the word, obtained from Google's pretrained English word vector model; E_a is the word vector learned by a recurrent neural network from the word's letter-combination information, the inputs being the individual letters composing the word; E_pos is the part-of-speech information of the word produced by the nltk toolkit; E_ner is the named-entity class information of the word produced by the nltk toolkit.
To achieve semantic understanding, the model must have sufficient comprehension of words. The present invention uses the rich part-of-speech and morphological information of each word to enhance its expressive power in the semantic space.
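As a minimal sketch of the word-level splicing formula E = f_concat(E_w, E_a, E_pos, E_ner), the snippet below concatenates four stand-in vectors. The random placeholders and the dimensions (300/50/16/16) are assumptions for illustration, not values fixed by the patent:

```python
import random

def concat_word_representation(e_w, e_a, e_pos, e_ner):
    """Splice the four word-level views into one vector E (word level of step (1)).

    e_w   : pretrained word2vec-style vector          (random stand-in here)
    e_a   : character-level vector from a char RNN    (stand-in)
    e_pos : POS-tag embedding, e.g. from nltk tags    (stand-in)
    e_ner : named-entity-class embedding              (stand-in)
    """
    return e_w + e_a + e_pos + e_ner  # list concatenation = vector splicing

random.seed(0)
e_w = [random.random() for _ in range(300)]   # word2vec dimension (assumed)
e_a = [random.random() for _ in range(50)]
e_pos = [random.random() for _ in range(16)]
e_ner = [random.random() for _ in range(16)]

E = concat_word_representation(e_w, e_a, e_pos, e_ner)
print(len(E))  # 382 = 300 + 50 + 16 + 16
```

In a real system the four inputs would come from a pretrained embedding table, a character RNN, and nltk's POS/NER tags respectively; only the splicing itself is shown here.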
In step (1), the dialogue-level reasoning of the dialogue semantics proceeds as follows:
(1-1) A bidirectional gated recurrent unit splices the forward and backward hidden representations of each word to obtain the spatial semantic vector of the entire sentence:
U = f_biGRU(E_1, ..., E_n)
where U is the spatial semantic vector of the entire sentence and E_i is the i-th word of the sentence;
(1-2) the semantic representation of the current sentence in its context is obtained as:
C_t = tanh(W_{m-1} C_{t-1} + W_{m+1} C_{t+1} + b_m)
where C_t is the semantic representation of the t-th sentence in the context, C_{t-1} and C_{t+1} are the hidden representations of the preceding and following sentences, W_{m-1}, W_{m+1}, and b_m are parameters obtained by training, and tanh is the activation function; that is, in the context, the t-th sentence is jointly affected by the sentences before and after it;
(1-3) a memory neural network combined with an attention mechanism integrates the two dialogue representations to obtain the final fused dialogue semantics.
Here U only learns an independent hidden vector representation of each utterance and ignores how a sentence is influenced by the context before and after it. To further learn the hidden representation of each utterance in context, the present invention introduces a variable C denoting the hidden representation of the current sentence in the context.
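A small sketch of the context formula C_t = tanh(W_{m-1} C_{t-1} + W_{m+1} C_{t+1} + b_m) follows. Using the raw sentence vectors U as the neighbouring states and zero padding at the dialogue boundaries are assumptions made for illustration, not details fixed by the patent:

```python
import math

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]

def contextual_reps(U, W_prev, W_next, b):
    """C_t = tanh(W_{m-1} C_{t-1} + W_{m+1} C_{t+1} + b_m) for every sentence t,
    taking the raw sentence vectors U as the neighbour states (assumption) and
    zero vectors at the two ends of the dialogue (assumption)."""
    n, d = len(U), len(U[0])
    zero = [0.0] * d
    C = []
    for t in range(n):
        prev_v = U[t - 1] if t > 0 else zero
        next_v = U[t + 1] if t < n - 1 else zero
        pre = [p + q + r for p, q, r in
               zip(matvec(W_prev, prev_v), matvec(W_next, next_v), b)]
        C.append([math.tanh(x) for x in pre])
    return C

d = 3
I = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # identity weights
U = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
C = contextual_reps(U, I, I, [0.0] * d)
print([round(x, 3) for x in C[1]])  # → [0.462, 0.0, 0.462]: middle sentence mixes both neighbours
```

With identity weights the middle sentence's representation is tanh of the sum of its two neighbours, making the joint influence of the preceding and following sentences explicit.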
Step (1-3) proceeds as follows:
(1-3-1) The correlation between the original sentence representation U_t and the contextual semantic representation C is obtained by softmax normalization:
p_{t,i} = softmax(U_t^T C_i)
where U_t^T is the transpose of the original sentence vector and p_{t,i} is the correlation between the original sentence representation U_t and the contextual representation C_i, which can be understood as the attention weight between the two.
(1-3-2) A memory network is introduced to generate the final memory output O_t:
O_t = Σ_i p_{t,i} C_i
The final output O_t can be regarded as the semantic understanding after each hidden state of the contextual representation C has been influenced by the original sentence. Since memory network layers can be stacked arbitrarily, each additional layer can be seen as one layer deeper of understanding of the original sentence.
(1-3-3) After k layers of memory network, the present invention applies a stacking operation: the output O_t^k of the previous layer is added to U_t^k to obtain the next layer's sentence representation:
U_t^{k+1} = U_t^k + O_t^k
where U_t^{k+1} is the final semantic understanding of the dialogue sentence after the influence of the last memory network layer. It means the model's understanding of the current dialogue has passed through multiple layers of complex interaction, fusing the original semantics of the dialogue content with the semantics influenced by context. The above steps ensure that the model has a sufficient understanding of the dialogue semantics.
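The three sub-steps above (softmax attention, memory output, residual stacking) can be sketched as a single memory hop repeated k times. The dot-product score and the exact stacking order are assumptions consistent with standard memory networks, not the patent's exact parameterization:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def memory_hops(u, memory, k=2):
    """k hops of the memory network of step (1-3):
    attention p_i = softmax(u . C_i), output O = sum_i p_i C_i,
    residual stacking u <- u + O (one hop per layer)."""
    for _ in range(k):
        scores = [sum(a * b for a, b in zip(u, c)) for c in memory]
        p = softmax(scores)
        o = [sum(p_i * c[j] for p_i, c in zip(p, memory)) for j in range(len(u))]
        u = [a + b for a, b in zip(u, o)]
    return u

memory = [[1.0, 0.0], [0.0, 1.0]]  # two contextual states C_1, C_2
u = memory_hops([2.0, 0.0], memory, k=1)
print([round(x, 3) for x in u])  # → [2.881, 0.119]
```

The sentence vector attends mostly to the contextual state it already resembles, and the hop adds that weighted memory back onto the original representation.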
In step (2), the structural subsections generally include a greeting subsection, a chat subsection, a question-and-answer subsection, a farewell subsection, and so on. The contextual relations within a subsection are close, while the associations between different subsections are comparatively distant.
The division into subsections is considered to largely determine the accuracy of dialogue act sequence labeling. Let U = {U_1, U_2, ..., U_n} denote the dialogue contents, y = {y_1, y_2, ..., y_n} the act category of each utterance, and z = {z_1, z_2, ..., z_n} discrete latent variables with each z_i ∈ {0, 1}, where 0 represents context-irrelevant and 1 represents context-relevant. The purpose of introducing the structured attention mechanism is that, when predicting a dialogue act, the model can infer the relatedness between the current utterance and its context from the dialogue content and the act types of the context utterances, and thus predict the current act with reference to the acts of the context.
In the dialogue act recognition task, greedily predicting the act category of each utterance in isolation may not yield the optimal solution. Instead, the best label sequence should be determined jointly from the dialogue context and the act labels of the surrounding utterances. This is where the linear-chain conditional random field comes into play.
In step (3), the linear-chain conditional random field algorithm is as follows. For the whole dialogue, the dialogue contents and their corresponding dialogue acts form random variable sequences represented as a linear chain. Given the random variable sequence X, the conditional probability distribution P(Y | X) of the random variable sequence Y constitutes the conditional random field, whose probability distribution can be expressed as:
p(z | U, y; θ) = softmax(Σ_C θ_C(z_C))
where θ_i(z_i) is the potential of the conditional random field at each latent dialogue node, set with a uniform conditional random field configuration.
For each utterance, the correlation in relatedness between this content and the preceding content is summarized. Using the structured marginal probability function p(z_1, ..., z_n | U, y) together with the forward-backward algorithm, the dialogue act distribution p(y_1, ..., y_n, U_1, ..., U_n; θ) of the whole dialogue based on the conditional random field is computed.
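The structured marginals p(z_i = 1 | U, y) mentioned above can be sketched with the forward-backward algorithm over a binary linear chain. The potentials below are hypothetical numbers, not the patent's learned θ, and the shared transition matrix is an assumption:

```python
import math

def chain_marginals(unary, pairwise):
    """Marginals p(z_i = 1) for a linear-chain CRF over binary segmentation
    gates z_i (1 = same subsection as the context), computed by the
    forward-backward algorithm in log space.

    unary[i][s]     : score theta_i(z_i = s)
    pairwise[s][t]  : transition score between consecutive gates (shared)
    """
    n = len(unary)

    def logsumexp(xs):
        m = max(xs)
        return m + math.log(sum(math.exp(x - m) for x in xs))

    # forward pass
    alpha = [unary[0][:]]
    for i in range(1, n):
        alpha.append([unary[i][s] +
                      logsumexp([alpha[-1][t] + pairwise[t][s] for t in range(2)])
                      for s in range(2)])
    # backward pass
    beta = [[0.0, 0.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        beta[i] = [logsumexp([pairwise[s][t] + unary[i + 1][t] + beta[i + 1][t]
                              for t in range(2)])
                   for s in range(2)]
    logZ = logsumexp(alpha[-1])
    return [math.exp(alpha[i][1] + beta[i][1] - logZ) for i in range(n)]

unary = [[0.0, 2.0], [0.0, 0.0], [2.0, 0.0]]   # hypothetical potentials
marg = chain_marginals(unary, [[0.0, 0.0], [0.0, 0.0]])
print([round(m, 3) for m in marg])  # → [0.881, 0.5, 0.119]
```

With zero transition scores the marginals factorize into per-node sigmoids, which makes the toy output easy to check; nonzero transitions would couple neighbouring gates, which is the point of the chain.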
The present invention designs end-to-end training and testing algorithms, learning the parameters of the conditional random field structured attention network by maximum likelihood estimation. Given a training set (U, Y), the log-likelihood function can be expressed as:
L(Θ) = Σ log p(Y | U; Θ)
where Θ denotes the parameters learned by the neural network and L denotes the loss function defined for training.
The objective function of the invention is defined as:
min_Θ −L(Θ) + λ ||Θ||_2^2
where ||Θ||_2^2 denotes the L2 regularization and λ is the tradeoff parameter between the loss function L(Θ) and the regularization term.
In the test phase, the present invention obtains the optimal sequence prediction using the Viterbi algorithm. By dynamic programming, the dialogue act prediction can be obtained as:
y' = argmax_{y ∈ Y} p(y | U, Θ)
where y' is the dialogue act label sequence predicted by the model, and the argmax function takes the maximum item of the conditional random field probability distribution as the prediction result.
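The Viterbi decoding y' = argmax_y p(y | U, Θ) can be sketched as follows. The two-label toy scores are invented for illustration; they also show how joint decoding can differ from greedy per-utterance prediction, the point made above:

```python
def viterbi(unary, trans):
    """y' = argmax_y p(y | U, Theta) for a linear-chain model by dynamic programming.

    unary[i][s] : per-utterance score of dialogue-act label s
    trans[s][t] : transition score from label s to label t
    """
    n, k = len(unary), len(unary[0])
    score = [unary[0][:]]   # best score of any path ending in each label
    back = []               # backpointers
    for i in range(1, n):
        row, ptr = [], []
        for t in range(k):
            best_s = max(range(k), key=lambda s: score[-1][s] + trans[s][t])
            row.append(score[-1][best_s] + trans[best_s][t] + unary[i][t])
            ptr.append(best_s)
        score.append(row)
        back.append(ptr)
    last = max(range(k), key=lambda s: score[-1][s])
    path = [last]
    for ptr in reversed(back):      # follow backpointers
        path.append(ptr[path[-1]])
    path.reverse()
    return path

# toy scores: label 0 = "question", label 1 = "answer"; questions like to be followed by answers
unary = [[2.0, 0.0], [0.5, 0.4], [0.0, 2.0]]
trans = [[0.0, 1.5], [0.5, 0.3]]
print(viterbi(unary, trans))  # → [0, 1, 1]
```

With these scores a greedy per-utterance argmax would output [0, 0, 1], while the jointly decoded sequence is [0, 1, 1]: the transition scores pull the ambiguous middle utterance toward "answer".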
The present invention also provides a dialogue act recognition system based on a conditional random field structured attention network, whose modules comprise:
Word-level representation module: obtains the word2vec pretrained vector, the character-level vector, the part-of-speech vector, and the entity-class vector of each word, and splices these four vectors into the word's final representation vector;
Dialogue-level representation module: uses a deep recurrent neural network to obtain the original semantic representation vector of each sentence and, combining a memory network with the context and the structured attention mechanism, obtains the semantic representation of the whole dialogue;
Act-level representation module: predicts the act category of each utterance from the dialogue content;
Context semantic understanding module: captures the contextual information of the dialogue with a deep recurrent neural network;
Dialogue-state initialization module: initializes the hyperparameters of the dialogue model for the training and test processes;
Conditional random field probability distribution module: computes, when predicting the current dialogue act, the degree of influence of the context dialogue acts on the current utterance;
Test module: outputs the dialogue act prediction results after the model has been trained.
The invention has the following advantages:
1. Unlike previous research, the present invention performs dialogue act recognition from the angle of extending the structured dependencies of the conditional random field. The proposed structured attention network provides a new solution that attends both to the dialogue semantics and to the subsection structure of the dialogue.
2. The proposed hierarchical deep recurrent neural network is combined with a memory-enhancement mechanism to fully model the semantic representation of the dialogue content. The proposed framework can be trained end to end, and the model can easily be extended to different dialogue tasks.
3. On the two popular datasets SwDA and MRDA, experimental results demonstrate that the model outperforms the other baseline algorithms, proving its superiority.
Brief description of the drawings
Fig. 1 is the overall structural diagram of the present invention for contextual semantic understanding;
Fig. 2 is a schematic diagram of the deep recurrent neural network over the dialogue content;
Fig. 3 is a schematic diagram of dialogue subsection division in a real dialogue scene according to the present invention;
Fig. 4 is a schematic diagram of the structured latent semantic division based on the linear-chain conditional random field according to the present invention;
Fig. 5 is a module flowchart of the dialogue act recognition system based on the conditional random field structured attention network according to the present invention;
Fig. 6 is a heat map of the ten major dialogue act labels obtained by the present invention on the SwDA dataset;
Fig. 7 shows the influence of the number of dialogue act categories on dialogue generation after the recognition system of the present invention is combined with a CVAE dialogue generation system.
Specific embodiment
The present invention is further elaborated and illustrated below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the framework of the present invention adopts hierarchical semantic understanding and is divided into three levels:
(a) Word level: for each word, the word2vec pretrained vector, the character-level vector, the part-of-speech vector, and the entity-class vector are obtained and spliced into the word's final representation vector. First, the word2vec vector E_w of each word is obtained from Google's pretrained English word vector model. Second, each word is made up of letters, and different combinations of letters can express the word's root and etymology well; through a deep recurrent neural network the present invention obtains another, letter-level word vector E_a. In addition, each word has part-of-speech information, e.g. adjective, noun, or verb; the corresponding part-of-speech information E_pos is obtained with the nltk toolkit. There is also the entity information of the word, e.g. a place name, a person name, or a specific entity such as a time or event, likewise obtained with the nltk toolkit as E_ner. In this way the semantic information of a word is represented in four different dimensions, and the four vectors are finally connected into a single vector as the word's complete spatial representation.
(b) Dialogue level: a deep recurrent neural network obtains the original semantic representation vector of the sentence, implemented as U = f_biGRU(E_1, ..., E_n). Combined with the context and the structured attention mechanism, the semantic representation of the whole dialogue is obtained, implemented as C_t = tanh(W_{m-1} C_{t-1} + W_{m+1} C_{t+1} + b_m). After the original utterance vectors U and the contextual semantic representation C are obtained, the memory network continually updates the interaction between U and C, further deepening the model's understanding of the dialogue semantics: the model's understanding of the current dialogue has passed through multiple layers of complex interaction, fusing the original semantics of the dialogue content with the semantics influenced by context. These steps ensure that the model has a sufficient understanding of the dialogue semantics.
(c) Act level: unlike previous work that uses the conditional random field algorithm directly, the present invention introduces a structured attention mechanism. The whole dialogue is no longer regarded as one flat article but as structured information composed of different subsections. A dialogue will usually have a greeting subsection, a chat subsection, a question-and-answer subsection, a farewell subsection, and so on. The contextual relations within a subsection are close, while those between subsections are comparatively distant. The division into subsections is considered to largely determine the accuracy of dialogue act sequence labeling. Let U = {U_1, U_2, ..., U_n} denote the dialogue contents, y = {y_1, y_2, ..., y_n} the act category of each utterance, and z = {z_1, z_2, ..., z_n} discrete latent variables with each z_i ∈ {0, 1}, where 0 represents context-irrelevant and 1 represents context-relevant. The purpose of introducing the structured attention mechanism is that, when predicting a dialogue act, the model can infer the relatedness between the current utterance and its context from the dialogue content and the act types of the context utterances, and thus predict the current act with reference to the context acts.
Suppose a dialogue is regarded as an undirected graph with n nodes, each utterance being one node of the graph. The conditional random field obtains the division by learning the latent variable parameters θ_C(z_C) ∈ R. Under this definition, the structured attention probability can be defined as:
p(z | U, y; θ) = softmax(Σ_C θ_C(z_C))
where p(z | U, y; θ), with parameters θ, is the attention probability influenced by the dialogue contents U and act categories y; z_C are the discrete latent variables characterizing whether the context is associated; and θ_C(z_C) is the latent parameter function for dividing subsections.
Correspondingly, the representation C of the whole dialogue content can be expressed through the dialogue contents U and the corresponding latent dialogue act states z:
C = E_{z~p} [f(U, y, z)]
where the labeling function f is defined as f(U, y, z) = Σ_C f_C(U, y, z_C), the latent-state representation of the subsection selection. The dialogue representation C can be understood as being very sensitive to subsection content: it attends to the dialogue content of one dialogue act segment, and all candidate subsection selections are weighted and averaged according to the latent variables z ~ p. Here p is defined as a mapping function of the dialogue contents U and the dialogue acts y.
In practical applications, the present invention defines the labeling function as f_i(U, y, z_i) = 1{z_i = 1} f(U_i, y_i): for context relevant to the dialogue content the labeling function takes the value 1, while for context irrelevant to the dialogue content it is set to 0. Under this definition, the expectation of the whole function expresses, for a dialogue, how much attention weight each utterance should place on the act labels of its context:
C = Σ_i p(z_i = 1 | U, y) f(U_i, y_i)
where C represents the expected subsection attention of the whole dialogue, U_i and y_i are the i-th utterance and its corresponding act type, and p(z_i = 1 | U, y) is the probability that the current utterance and the previous utterance belong to the same subsection. For the whole dialogue, the conditional random field probability distribution can then be expressed through the potentials θ_i(z_i) of each latent dialogue node, using a uniform conditional random field setting.
For each utterance, the correlation in relatedness between this content and the preceding content is summarized. Using the structured marginal probability function p(z_1, ..., z_n | U, y) together with the forward-backward algorithm, the dialogue act distribution p(y_1, ..., y_n, U_1, ..., U_n; θ) of the whole dialogue based on the conditional random field can be computed.
As shown in Fig. 2, in the deep recurrent neural network over the dialogue content, each word carries four kinds of information: the original word, the character level, the part of speech, and the named-entity class. This makes the original semantic representation of the dialogue content more accurate. The specific embodiment is E = f_concat(E_w, E_a, E_pos, E_ner), where E_w is extracted directly from Google's pretrained Word2vec vectors; E_a is the word vector learned by a recurrent neural network from the word's letter-combination information, the letters composing the word being its inputs; E_pos is the part-of-speech information produced by the nltk toolkit; E_ner is the entity-class information produced by the nltk toolkit. After the complete semantic representation of the words is obtained, a deep recurrent neural network performs semantic understanding over the whole sentence of dialogue content.
As shown in Fig. 3, this dialogue segment can be divided into three subsections: the first subsection is a greeting, the second is question answering, and the third is a farewell. These three subsections have little relevance to each other, so conversation activity identification can clearly be performed on them separately.
As shown in Fig. 4, a schematic diagram of structured latent semantic segmentation based on a linear conditional random field, Z_i denotes the structured latent representation of the i-th dialogue, used for semantic segmentation of the dialogue subsections.
As shown in Fig. 5, a conversation activity identification system based on a conditional random field structured attention network is divided into seven modules in total, specifically:
Word layer representation module: for a word, obtains its word2vec pre-trained vector, character-level vector, part-of-speech vector, and entity-class vector, and concatenates these four vectors into the final representation vector of the word.
Dialogue layer representation module: uses a deep recurrent neural network to obtain the original semantic representation vector of each sentence, and combines a memory network with the context and the structured attention mechanism to obtain the semantic representation of the whole dialogue segment.
Behavior layer representation module: predicts the corresponding behavior category of an utterance according to the conversation content.
Context semantic understanding module: uses a deep recurrent neural network to capture contextual information during the dialogue.
Dialogue state initialization module: initializes the hyperparameters of the dialogue model for the training and testing processes.
Conditional random field probability distribution module: uses the properties of the conditional random field to consider, when predicting the current conversation activity, the degree of influence of the context conversation activities on the current dialogue.
Test module: after model training is finished, externally outputs the conversation activity prediction results; through this module, the system can demonstrate the final effect of the algorithm in product form.
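As a rough illustration of how the dialogue layer representation module can combine a sentence vector with its context through attention, the sketch below performs one softmax-weighted memory hop in plain numpy. The shapes, the single hop, and the residual update are illustrative assumptions, not the patent's trained system:

```python
import numpy as np

def memory_hop(u_t, context):
    """One memory-network hop over contextual sentence vectors.

    u_t:     (d,) original sentence representation.
    context: (m, d) contextual sentence representations.
    Returns the next-layer sentence representation u_t + o_t.
    """
    scores = context @ u_t                     # correlation of u_t with each C_j
    scores -= scores.max()                     # stabilize the softmax
    p = np.exp(scores) / np.exp(scores).sum()  # normalized attention weights
    o_t = p @ context                          # attention-weighted memory output
    return u_t + o_t                           # stacked residual update
```

Stacking several such hops (feeding the output back in as the next u_t) mirrors the k-layer memory network described in the dialogue layer representation module.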
The present invention is compared with other current state-of-the-art approaches on two mainstream conversation activity identification datasets, SwDA and MRDA. The two large datasets are introduced as follows:
SwDA corpus: SwDA is a large manually-labeled dataset obtained from 1155 telephone conversation scenarios. In each recording, two strangers were selected at random and exchanged on a randomly chosen topic.
MRDA corpus: MRDA is a dataset recorded from 75 meetings, in which the intention category of every utterance is manually labeled to serve the purpose of conversation activity identification.
The details of the two corpora are given in Table 1:
Table 1
Dataset Classes Vocabulary Training set Validation set Test set
SwDA 42 19k 1003(173k) 112(22k) 19(4k)
MRDA 5 10k 51(76k) 11(15k) 11(15k)
The present invention mainly uses conversation activity accuracy as the evaluation metric. Seven current mainstream conversation activity identification algorithms are compared in total: Bi-LSTM-CRF, DRLM-Conditional, LSTM-Softmax, RCNN, CRF, HMM, and SVM. Table 2 reports the recognition accuracy of the major algorithm models on the SwDA corpus, and Table 3 reports their recognition accuracy on the MRDA corpus.
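The judging metric mentioned above, conversation activity accuracy, is simply the fraction of utterances whose predicted label matches the gold label; a minimal sketch:

```python
def act_accuracy(gold, pred):
    """Fraction of utterances whose predicted conversation activity label
    matches the gold label."""
    assert len(gold) == len(pred)
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)
```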
Table 2
Model Accuracy (%)
Human annotation 84.0
Algorithm of the present invention 80.8
Bi-LSTM-CRF 79.2
DRLM-Conditional 77.0
LSTM-Softmax 75.8
RCNN 73.9
CRF 71.7
HMM 71.0
SVM 70.6
Table 3
Model Accuracy (%)
Algorithm of the present invention 91.4
Bi-LSTM-CRF 90.9
LSTM-Softmax 86.8
CRF 83.9
SVM 81.8
As can be seen from Tables 2 and 3, the conditional random field structured attention network framework proposed by the present invention achieves the best results on both large datasets compared with the other algorithms, which fully illustrates the superiority of the algorithm of the present invention.
In addition, the present invention visualizes the matching of the ten major conversation activity labels on the SwDA dataset. As shown in Fig. 6, the abscissa represents the true conversation activity label and the ordinate represents the conversation activity label predicted by the model of the present invention. The darker the color of a node in the figure, the closer the abscissa value and the ordinate value.
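The visualization described above is built from a label-matching (confusion) matrix: rows are true labels, columns are predicted labels, and the cell count is what is rendered as color depth in Fig. 6. The sketch below counts label co-occurrences with toy integer labels standing in for the ten SwDA tags:

```python
import numpy as np

def label_match_matrix(y_true, y_pred, n_labels):
    """Count matrix m[t, p]: how often true label t was predicted as p."""
    m = np.zeros((n_labels, n_labels), dtype=int)
    for t, p in zip(y_true, y_pred):
        m[t, p] += 1
    return m
```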
Finally, the conversation activity identification system of the present invention is combined with a CVAE dialogue generation system: through conversation activity identification, it assists the dialogue generation system in producing meaningful replies related to the contextual meaning. As shown in Fig. 7, compared with other identification systems, the conversation activity identification system of the present invention is clearly more accurate in the effect of assisting dialogue generation, which also indirectly demonstrates the superiority of the proposed model over other state-of-the-art algorithms and fully proves that the proposed algorithm is more stable than other models in assisting dialogue replies.

Claims (7)

1. A conversation activity identification method based on a conditional random field structured attention network, characterized by comprising the following steps:
(1) combining a memory network, performing hierarchical reasoning over the dialogue semantic information at the word layer, sentence layer, and dialogue layer to build a semantic model;
(2) applying a structured attention network to divide the conversation content into structural subsections according to the correlation between conversation contents;
(3) applying the obtained structured information to a linear conditional random field algorithm, and predicting the current conversation activity based on the context.
2. The conversation activity identification method based on a conditional random field structured attention network according to claim 1, characterized in that in step (1), the word-layer reasoning formula of the dialogue semantic information is as follows:
E = f_concat(E_w, E_a, E_pos, E_ner)
wherein E is the final complete vector representation of the word, concatenated from four kinds of word information of different dimensions; f_concat denotes the concatenation function; E_w denotes the Word2vec vector of the word obtained from the pre-trained Google English word vector model; E_a denotes the word representation vector learned by a recurrent neural network from the individual letters that compose the word; E_pos denotes the word part-of-speech information produced by the NLTK toolkit; and E_ner denotes the word entity-class information produced by the NLTK toolkit.
3. The conversation activity identification method based on a conditional random field structured attention network according to claim 1, characterized in that in step (1), the dialogue-layer reasoning of the dialogue semantic information comprises the following specific steps:
(1-1) using a bidirectional gated recurrent unit, concatenating the forward latent representation and the backward latent representation of each word to obtain the spatial semantic vector representation of the entire sentence, with the formula:
U = f_biGRU(E_1, …, E_n)
wherein U denotes the spatial semantic vector representation of the entire sentence, and E_i denotes the i-th word in the sentence;
(1-2) obtaining the semantic representation of the current sentence in its context, with the formula:
C_t = tanh(W_{m-1} C_{t-1} + W_{m+1} C_{t+1} + b_m)
wherein C_t denotes the semantic representation of the t-th sentence in its context, C_{t-1} and C_{t+1} are the latent representations of the preceding and following sentences, W_{m-1}, W_{m+1}, and b_m are parameters obtained by training, and tanh is the activation function;
(1-3) using a memory network in combination with the attention mechanism to integrate the two kinds of dialogue representations, obtaining the final fused dialogue semantic information.
4. The conversation activity identification method based on a conditional random field structured attention network according to claim 3, characterized in that step (1-3) comprises the following specific steps:
(1-3-1) normalizing via softmax to obtain the correlation between the original sentence representation U_t and the contextual semantic representation C_t:
wherein the transpose of the original sentence vector representation is used, and p_{j,i} denotes the correlation between the original sentence representation U_t and the contextual semantic representation C_t;
(1-3-2) introducing the memory network to generate the final memory output O_t;
(1-3-3) after passing through k layers of the memory network, using a stacking operation, adding the output of the previous memory layer to the previous layer's sentence representation to obtain the final sentence representation of the next layer, with the formula:
wherein the result denotes the final semantic understanding representation of the dialogue sentence obtained under the influence of the last memory network layer.
5. The conversation activity identification method based on a conditional random field structured attention network according to claim 1, characterized in that in step (2), the structural subsections include a greeting subsection, a chat subsection, a question-answering subsection, and a farewell subsection.
6. The conversation activity identification method based on a conditional random field structured attention network according to claim 1, characterized in that in step (3), the linear conditional random field algorithm is specifically:
for the whole dialogue segment, the conditional random field probability distribution is expressed as:
p(z_1, …, z_n | U, y) = softmax( Σ_{i=1}^{n} θ_i(z_i) )
wherein θ_i(z_i) denotes the potential of the conditional random field at each latent dialogue node, with the standard conditional random field setting:
for every piece of conversation content, the correlation between this content and the preceding content is summarized; by using the structured marginal probability function p(z_1, …, z_n | U, y) in conjunction with the forward-backward algorithm, the conversation activity distribution probability of the whole dialogue segment is computed based on the conditional random field:
wherein p(y_1, …, y_n, U_1, …, U_n; θ) represents the conversation activity distribution probability, U_i(y_j) represents the probability that the i-th utterance is predicted as behavior label y_j, Σ represents the distribution probability of a single utterance over the different conversation activities, and Π combines the conversation activity distributions of every utterance in the dialogue into the overall conditional probability.
7. A conversation activity identification system based on a conditional random field structured attention network, characterized by comprising:
a word layer representation module: for obtaining the word2vec pre-trained vector, character-level vector, part-of-speech vector, and entity-class vector of a word, and concatenating these four vectors into the final representation vector of the word;
a dialogue layer representation module: for using a deep recurrent neural network to obtain the original semantic representation vector of a sentence, and combining a memory network with the context and the structured attention mechanism to obtain the semantic representation of the whole dialogue segment;
a behavior layer representation module: for predicting the corresponding behavior category of an utterance according to the conversation content;
a context semantic understanding module: for capturing contextual information during the dialogue using a deep recurrent neural network;
a dialogue state initialization module: for initializing the hyperparameters of the dialogue model for the training and testing processes;
a conditional random field probability distribution module: for computing, when predicting the current conversation activity, the degree of influence of the context conversation activities on the current dialogue;
a test module: for externally outputting the conversation activity prediction results after model training is finished.
CN201810443182.8A 2018-05-10 2018-05-10 A kind of conversation activity recognition methods and system based on condition random field structuring attention network Pending CN108829662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810443182.8A CN108829662A (en) 2018-05-10 2018-05-10 A kind of conversation activity recognition methods and system based on condition random field structuring attention network


Publications (1)

Publication Number Publication Date
CN108829662A true CN108829662A (en) 2018-11-16

Family

ID=64147812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810443182.8A Pending CN108829662A (en) 2018-05-10 2018-05-10 A kind of conversation activity recognition methods and system based on condition random field structuring attention network

Country Status (1)

Country Link
CN (1) CN108829662A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614485A (en) * 2018-11-19 2019-04-12 中山大学 A kind of sentence matching method and device of the layering Attention based on syntactic structure
CN109710930A (en) * 2018-12-20 2019-05-03 重庆邮电大学 A kind of Chinese Resume analytic method based on deep neural network
CN110210037A (en) * 2019-06-12 2019-09-06 四川大学 Category detection method towards evidence-based medicine EBM field
CN110245353A (en) * 2019-06-20 2019-09-17 腾讯科技(深圳)有限公司 Natural language representation method, device, equipment and storage medium
CN110569331A (en) * 2019-09-04 2019-12-13 出门问问信息科技有限公司 Context-based relevance prediction method and device and storage equipment
CN110705340A (en) * 2019-08-12 2020-01-17 广东石油化工学院 Crowd counting method based on attention neural network field
CN110727768A (en) * 2019-10-24 2020-01-24 中国科学院计算技术研究所 Candidate answer sentence generation and natural language selection method and system
CN110853626A (en) * 2019-10-21 2020-02-28 成都信息工程大学 Bidirectional attention neural network-based dialogue understanding method, device and equipment
CN111191415A (en) * 2019-12-16 2020-05-22 山东众阳健康科技集团有限公司 Operation classification coding method based on original operation data
CN111324704A (en) * 2018-12-14 2020-06-23 阿里巴巴集团控股有限公司 Method and device for constructing dialect knowledge base and customer service robot
US20200202887A1 (en) * 2018-12-19 2020-06-25 Disney Enterprises, Inc. Affect-driven dialog generation
CN112115247A (en) * 2020-09-07 2020-12-22 中国人民大学 Personalized dialogue generation method and system based on long-time and short-time memory information
CN112632961A (en) * 2021-03-04 2021-04-09 支付宝(杭州)信息技术有限公司 Natural language understanding processing method, device and equipment based on context reasoning
CN113157919A (en) * 2021-04-07 2021-07-23 山东师范大学 Sentence text aspect level emotion classification method and system
CN113240098A (en) * 2021-06-16 2021-08-10 湖北工业大学 Fault prediction method and device based on hybrid gated neural network and storage medium
CN115017286A (en) * 2022-06-09 2022-09-06 北京邮电大学 Search-based multi-turn dialog system and method
CN110377713B (en) * 2019-07-16 2023-09-15 广州探域科技有限公司 Method for improving context of question-answering system based on probability transition
WO2023231513A1 (en) * 2022-05-31 2023-12-07 华院计算技术(上海)股份有限公司 Conversation content generation method and apparatus, and storage medium and terminal
CN117591662A (en) * 2024-01-19 2024-02-23 川投信息产业集团有限公司 Digital enterprise service data mining method and system based on artificial intelligence

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106777013A (en) * 2016-12-07 2017-05-31 科大讯飞股份有限公司 Dialogue management method and apparatus
CN107092596A (en) * 2017-04-24 2017-08-25 重庆邮电大学 Text emotion analysis method based on attention CNNs and CCR
US20180060301A1 (en) * 2016-08-31 2018-03-01 Microsoft Technology Licensing, Llc End-to-end learning of dialogue agents for information access


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHEQIAN CHEN, RONGQIN YANG, ZHOU ZHAO, DENG CAI, XIAOFEI HE: "Dialogue Act Recognition via CRF-Attentive Structured Network", arXiv (HTTPS://ARXIV.ORG/SEARCH/?QUERY=DIALOGUE+ACT+RECOGNITION+VIA+CRF-ATTENTIVE+STRUCTURED+NETWORK&SEARCHTYPE=ALL&SOURCE=HEADER) *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20181116