CN110390001A - Implementation method and device for opinion-type machine reading comprehension - Google Patents

Implementation method and device for opinion-type machine reading comprehension

Info

Publication number
CN110390001A
CN110390001A
Authority
CN
China
Prior art keywords
context
candidate answer
word vector
encoding result
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910481171.3A
Other languages
Chinese (zh)
Inventor
杨志明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Reflections On Artificial Intelligence Robot Technology (Beijing) Co Ltd
Original Assignee
Reflections On Artificial Intelligence Robot Technology (Beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Reflections On Artificial Intelligence Robot Technology (Beijing) Co Ltd
Priority to CN201910481171.3A
Publication of CN110390001A
Legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G06F16/35 Clustering; Classification
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/25 Fusion techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Abstract

This application discloses an implementation method for opinion-type machine reading comprehension. The method includes: performing information extraction on the fusion result of each context word vector to obtain context information extracted based on the question; taking the contextualized encoding results of the candidate answers as cluster centers and the question-based context information as clustering information, performing dynamic routing iterations, and outputting candidate answers weighted by the context information; and identifying the optimal candidate answer from the context-weighted candidate answers. The present invention dynamically matches the optimal answer for selection-type questions, with high accuracy.

Description

Implementation method and device for opinion-type machine reading comprehension
Technical field
The present invention relates to the field of machine reading comprehension, and in particular to an implementation method for opinion-type machine reading comprehension.
Background art
In machine reading comprehension, it is sometimes necessary to reason over contextual information in order to reach a correct understanding. For example, questions such as "Is it raining today?" or "For a cold, is Xiao Chaihu better, or should one take Ganmaoling granules?" can only be answered by reasoning over contextual information.
Opinion-type questions mainly include polar questions and selection-type questions. The former, such as "is it" and "can it", have the [A or not-A] property; the latter, such as "Xiao Chaihu or cold medicine" and "apple or snow pear", have the [A or B] property.
Most existing reading comprehension models are extractive: they process the question and the context through word embedding, encoding, matching, fusion and similar steps, and then extract the answer span from the context with a pointer network (pointer net). Examples include Multiway Attention Networks for Modeling Sentence Pairs, BiDAF and R-net.
Existing solutions for opinion-type reading comprehension keep the first four steps of a typical reading comprehension model, namely word embedding, encoding, matching and fusion, and finally perform a matrix multiplication between the vector representation extracted from the context for the question and the encoded candidate answers to obtain the probability of each candidate answer being the answer. This approach keeps the candidate answers outside the context, which suits [A, B, cannot determine] classification problems. However, the answer to an opinion-type question is often dynamic and requires reasoning; most existing solutions treat the task as classification, and the classification view can neither dynamically incorporate the knowledge carried by the candidate viewpoints nor reason out the answer dynamically, so it rarely yields satisfactory answers for dynamic selection-type questions.
Summary of the invention
The present invention provides an implementation method for opinion-type machine reading comprehension, which matches the optimal answer for dynamic selection-type questions.
The implementation method for opinion-type machine reading comprehension provided by the invention is realized as follows:
converting the context, the question and the candidate answers contained in a natural language text into word-vector representations, obtaining context word vectors, question word vectors and candidate-answer word vectors respectively;
performing contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively, obtaining a context contextualized encoding result, a question contextualized encoding result and candidate-answer contextualized encoding results;
matching the context contextualized encoding result with the question contextualized encoding result, obtaining a matching result of the context and the question;
fusing each context word vector, the context contextualized encoding result and the matching result of the context and the question, obtaining a fusion result for each context word vector;
performing information extraction on the fusion result of each context word vector, obtaining context information extracted based on the question;
taking the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information, performing dynamic routing iterations and outputting candidate answers weighted by the context information;
identifying the optimal candidate answer from the context-weighted candidate answers.
Wherein performing contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively includes:
encoding the question word vectors with a bidirectional long short-term memory (LSTM) neural network, obtaining the question contextualized encoding result;
taking the hidden-layer state of the bidirectional LSTM used to encode the question word vectors as the initial state of the LSTM networks that encode the context and the candidate answers; encoding the context and the candidate answers respectively with bidirectional LSTM networks, obtaining the context contextualized encoding result and the candidate-answer contextualized encoding results; and taking the candidate-answer contextualized encoding results as the initial capsules of a capsule network.
Wherein performing information extraction on the fusion result of each context word vector to obtain the context information extracted based on the question includes:
extracting context information with a two-dimensional convolutional neural network over windows of a given size, and distilling the extracted information by max pooling to obtain the context information extracted based on the question.
Wherein taking the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information and performing dynamic routing iterations includes:
taking the candidate-answer contextualized encoding results as the initial capsules and as the cluster centers, taking the question-based context information as the clustering information, and performing, through a capsule network, n rounds of dynamic routing iteration between capsules according to the number n of candidate answers, obtaining n capsules that represent the candidate answers, each capsule being a candidate answer weighted by the context information.
Wherein the candidate answers include a first-category answer, a second-category answer, and a third answer indicating that neither the first category nor the second category can be determined;
and identifying the optimal candidate answer from the context-weighted candidate answers includes obtaining the modulus of each capsule vector through a squash-function operation and taking the capsule with the longest modulus as the optimal candidate answer.
The present invention also provides a device for opinion-type machine reading comprehension, the device including:
a word embedding module, which converts the context, the question and the candidate answers contained in a natural language text into word-vector representations, obtaining context word vectors, question word vectors and candidate-answer word vectors respectively;
a context embedding module, which performs contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively, obtaining a context contextualized encoding result, a question contextualized encoding result and candidate-answer contextualized encoding results;
a matching module, which matches the context contextualized encoding result with the question contextualized encoding result, obtaining a matching result of the context and the question;
a fusion module, which fuses each context word vector, the context contextualized encoding result and the matching result of the context and the question, obtaining a fusion result for each context word vector;
an information extraction module, which performs information extraction on the fusion result of each context word vector, obtaining context information extracted based on the question;
a dynamic routing module, which takes the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information, performs dynamic routing iterations, and outputs candidate answers weighted by the context information;
a classification module, which identifies the optimal candidate answer from the context-weighted candidate answers.
Wherein the context embedding module further:
encodes the question word vectors with a bidirectional long short-term memory (LSTM) neural network, obtaining the question contextualized encoding result;
takes the hidden-layer state of the bidirectional LSTM used to encode the question word vectors as the initial state of the LSTM networks that encode the context and the candidate answers; encodes the context and the candidate answers respectively with bidirectional LSTM networks, obtaining the context contextualized encoding result and the candidate-answer contextualized encoding results; and takes the candidate-answer contextualized encoding results as the initial capsules of a capsule network.
The information extraction module further extracts context information with a two-dimensional convolutional neural network over windows of a given size and distills the extracted information by max pooling, obtaining the context information extracted based on the question.
The dynamic routing module further takes the candidate-answer contextualized encoding results as the initial capsules and as the cluster centers, takes the question-based context information as the clustering information, and performs, through a capsule network, n rounds of dynamic routing iteration according to the number n of candidate answers, obtaining n capsules that represent the candidate answers, each capsule being a candidate answer weighted by the context information.
The classification module further obtains the modulus of each capsule vector through a squash-function operation and takes the capsule with the longest modulus as the optimal candidate answer.
The candidate answers include a first-category answer, a second-category answer, and a third answer indicating that neither the first category nor the second category can be determined.
The present invention also provides an electronic device supporting opinion-type machine reading comprehension, the electronic device including a memory and a processor, wherein the memory stores instructions which, when executed by the processor, cause the processor to execute the steps of the implementation method for opinion-type machine reading comprehension.
The present invention also provides a computer storage medium storing a computer program which, when executed by a processor, implements the steps of the implementation method for opinion-type machine reading comprehension.
By taking the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information, performing dynamic routing iterations through a capsule network, and selecting the optimal candidate answer according to the modulus of the capsule vectors, the present invention makes good use of the information in the candidate answers and strengthens the ability to extract answers from the context. Even when the question is not answered directly in the context, the viewpoint can be inferred from side information, which improves the correctness and reliability of machine comprehension. The invention avoids the prior-art limitation of first extracting an answer from the context independently and then comparing it with the candidate answers, improves the intelligence of machine comprehension, and realizes dynamic matching for selection-type questions.
Brief description of the drawings
Fig. 1 is a schematic diagram of a capsule-mrc based model according to an embodiment of the present invention.
Fig. 2 is a flow diagram of an implementation method for opinion-type machine reading comprehension according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a device for opinion-type machine reading comprehension according to an embodiment of the present invention.
Detailed description of the embodiments
In order to make the purpose, technical means and advantages of the application clearer, the application is further described below with reference to the accompanying drawings.
The implementation method for opinion-type machine reading comprehension proposed by the present invention is based on capsule-mrc. It regards the essence of opinion-type reading comprehension as inference by information clustering: candidate answers are generated from the question (for a selection-type question, for example, the candidates are A, B and cannot-determine); the question and the context then exchange information; the candidate answers serve as cluster centers, and the post-interaction context serves as clustering information for capsule dynamic-routing clustering. If a candidate answer gathers more information, that answer is better supported by the context, and its probability of being the answer is correspondingly higher.
Referring to Fig. 1, Fig. 1 is a schematic diagram of a capsule-mrc based model according to an embodiment of the present invention.
The capsule-mrc model comprises, in order, a word embedding layer (word embed layer), a context embedding layer (contextual embed layer), a matching layer (match layer), a fusion layer (fuse layer), an information extraction layer, a dynamic routing layer and a classification layer. Wherein:
The word embedding layer maps text that a machine cannot operate on directly (natural language text and symbols) into a high-dimensional space, realizing word semantics in vector form. In an embodiment of the present invention, the question, the context and the candidate answers are converted into vector representations; for example, each word is mapped to a 300-dimensional vector using word vectors pre-trained with Word2vec. In the figure, the word vectors comprise the context token sequence x1, x2, ..., xT, the question token sequence Q1, ..., Qj, and the candidate-answer token sequences A1, A2, A3.
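The following is a minimal PyTorch sketch of this lookup, not the patent's actual implementation; the vocabulary size and sequence lengths are illustrative assumptions.

import torch
import torch.nn as nn

vocab_size, embed_dim = 50000, 300                 # 300-dim vectors as in the embodiment
embedding = nn.Embedding(vocab_size, embed_dim)    # rows could be initialized from Word2vec

context_ids = torch.randint(0, vocab_size, (1, 120))   # x1..xT
question_ids = torch.randint(0, vocab_size, (1, 20))   # Q1..Qj
answer_ids = torch.randint(0, vocab_size, (3, 4))      # three short candidate answers A1..A3

ctx_emb = embedding(context_ids)    # (1, 120, 300)
q_emb = embedding(question_ids)     # (1, 20, 300)
ans_emb = embedding(answer_ids)     # (3, 4, 300)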
The context embedding layer performs contextualized encoding so that each word carries contextual information. In an embodiment of the present invention, the question word vectors are encoded with a bidirectional long short-term memory (LSTM) neural network; the contextualized encoding output is passed to the matching layer, while the hidden-layer state of the LSTM serves as the initial state of the LSTM networks for the context and the candidate answers. After this state initialization, the context and the candidate answers are likewise encoded with bidirectional LSTM networks. Each encoded candidate answer is treated as an initial capsule of the capsule network; the three capsules represent three different viewpoints. In the figure, the outputs of the context embedding layer comprise the context encodings h1, h2, ..., hT, the question encodings u1, ..., uj, and the candidate-answer encodings capsule1, capsule2, capsule3.
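A minimal PyTorch sketch of this question-conditioned encoding follows; whether the context and the candidate answers share one encoder is not specified in the patent, so sharing it here is an assumption, as are all shapes.

import torch
import torch.nn as nn

embed_dim, hidden = 300, 128
ctx_emb = torch.randn(1, 120, embed_dim)   # context word vectors x1..xT
q_emb = torch.randn(1, 20, embed_dim)      # question word vectors Q1..Qj
ans_emb = torch.randn(3, 4, embed_dim)     # three candidate answers A1..A3

q_lstm = nn.LSTM(embed_dim, hidden, bidirectional=True, batch_first=True)
ca_lstm = nn.LSTM(embed_dim, hidden, bidirectional=True, batch_first=True)

# Encode the question first; its final hidden/cell state initializes the
# encoders for the context and the candidate answers.
q_out, (h_n, c_n) = q_lstm(q_emb)                    # u1..uj: (1, 20, 256)
ctx_out, _ = ca_lstm(ctx_emb, (h_n, c_n))            # h1..hT: (1, 120, 256)
ans_out, _ = ca_lstm(ans_emb, (h_n.repeat(1, 3, 1), c_n.repeat(1, 3, 1)))
init_capsules = ans_out[:, -1, :]                    # (3, 256): one initial capsule per viewpoint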
The matching layer applies question-driven attention over the context, so that content relevant to the question is reinforced and irrelevant text is ignored. In an embodiment of the present invention, BiDAF together with four attention mechanisms (dot, bi-linear, concat and minus) is used to match the question and the context and to extract effective information from the context according to the question. In the figure, each context encoding is matched with the question encodings to obtain the matching results g1, g2, ..., gT of the context and the question.
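The sketch below shows one plausible reading of the four attention scores (dot, bi-linear, concat, minus); the patent does not give their exact formulas or how they are combined with BiDAF, so the parameterization is an assumption.

import torch
import torch.nn as nn
import torch.nn.functional as F

d = 256                                   # contextualized encoding size (2 x LSTM hidden)
W_bi = nn.Linear(d, d, bias=False)        # bi-linear score
v_cat = nn.Linear(2 * d, 1, bias=False)   # concat score
v_min = nn.Linear(d, 1, bias=False)       # minus score

def match(ctx, q):
    """ctx: (B, T, d), q: (B, J, d) -> four (B, T, d) question summaries."""
    B, T, J = ctx.size(0), ctx.size(1), q.size(1)
    dot = torch.bmm(ctx, q.transpose(1, 2))          # (B, T, J) scores
    bil = torch.bmm(W_bi(ctx), q.transpose(1, 2))
    ce = ctx.unsqueeze(2).expand(B, T, J, d)
    qe = q.unsqueeze(1).expand(B, T, J, d)
    cat = v_cat(torch.cat([ce, qe], -1)).squeeze(-1)
    sub = v_min(ce - qe).squeeze(-1)
    # each score is normalized over question positions, then used to pool q
    return [torch.bmm(F.softmax(s, dim=-1), q) for s in (dot, bil, cat, sub)]

g = match(torch.randn(1, 120, d), torch.randn(1, 20, d))   # candidates for g1..gT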
The fusion layer fuses the different levels of information once more, enriching the context representation. In an embodiment of the present invention, the context encodings of the first three layers are concatenated and fused with a bidirectional LSTM. In the figure, the context word vectors x1, x2, ..., xT output by the word embedding layer, the context encodings h1, h2, ..., hT output by the context embedding layer, and the matching results g1, g2, ..., gT output by the matching layer are fused to obtain the fusion results.
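A minimal sketch of this fusion step, assuming the three representations are simply concatenated per position before the bidirectional LSTM (the patent does not spell out the concatenation order or sizes):

import torch
import torch.nn as nn

embed_dim, d = 300, 256
x = torch.randn(1, 120, embed_dim)   # word embedding layer output
h = torch.randn(1, 120, d)           # context embedding layer output
g = torch.randn(1, 120, d)           # matching layer output

fuse_lstm = nn.LSTM(embed_dim + 2 * d, 128, bidirectional=True, batch_first=True)
fused, _ = fuse_lstm(torch.cat([x, h, g], dim=-1))   # (1, 120, 256) fusion results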
The information extraction layer extracts context information based on the question and further condenses the extracted information by max pooling. In an embodiment of the present invention, a two-dimensional convolutional neural network extracts information from the context in a sliding-window fashion, with window sizes ranging from 1 to 10 tokens; max pooling is then applied to the extracted context information, yielding the context information extracted based on the question and realizing the interaction of context information.
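This is the familiar convolve-then-max-pool pattern; the sketch below assumes one filter bank per window size from 1 to 10, with the filter count chosen arbitrarily.

import torch
import torch.nn as nn

d, n_filters = 256, 64
fused = torch.randn(1, 120, d)   # fusion layer output

# One Conv2d per window size 1..10, each filter spanning the full feature
# dimension, followed by max pooling over positions.
convs = nn.ModuleList([nn.Conv2d(1, n_filters, (w, d)) for w in range(1, 11)])
x = fused.unsqueeze(1)   # (1, 1, 120, d): add a channel axis
feats = [torch.relu(c(x)).squeeze(-1).max(dim=-1).values for c in convs]
info = torch.stack(feats, dim=1)   # (1, 10, 64): question-based context information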
The dynamic routing layer takes the candidate answers as cluster centers and the post-interaction context as clustering information, and performs capsule dynamic-routing clustering; if a candidate answer gathers more information, that answer is better supported by the context, and its probability of being the answer is correspondingly higher. In an embodiment of the present invention, a capsule network (CapsulesNet) takes the candidate-answer capsules output by the context embedding layer as cluster centers and the question-based context information as clustering information, performs 3 rounds of dynamic routing iteration, and finally outputs capsule vectors representing the 3 viewpoints, each capsule vector being a candidate answer weighted by the context information; the number of iterations equals the number of candidate answers.
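The sketch below adapts the routing-by-agreement loop of Sabour et al.'s Dynamic Routing Between Capsules so that the answer capsules start as the cluster centers; the patent only names the mechanism, so initializing the routing with the answer encodings in exactly this way is an assumption, and u_hat would in practice come from a learned linear map applied to the extracted context features.

import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    """Shrink a vector's modulus into [0, 1) while keeping its direction."""
    n2 = (s * s).sum(dim, keepdim=True)
    return (n2 / (1.0 + n2)) * s / (n2 + eps).sqrt()

def dynamic_routing(u_hat, answer_caps, n_iters=3):
    """u_hat: (B, M, n, d) votes of M context-information capsules for n answers.
    answer_caps: (B, n, d) candidate-answer encodings used as starting centers."""
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)   # routing logits (B, M, n)
    v = squash(answer_caps)
    for _ in range(n_iters):
        b = b + (u_hat * v.unsqueeze(1)).sum(-1)       # agreement with current centers
        c = F.softmax(b, dim=-1)                       # each feature splits its vote
        v = squash((c.unsqueeze(-1) * u_hat).sum(1))   # (B, n, d) re-weighted answers
    return v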
The classification layer identifies which of the weighted capsule vectors output by the dynamic routing layer answers the question. In an embodiment of the present invention, the three weighted capsule vectors output by the previous layer are scored: for example, the modulus of each capsule vector is obtained through a squash-function operation and the moduli are normalized; the longer a capsule vector's modulus, the more support its viewpoint has gathered from the context, and the greater its probability of being the answer.
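Continuing the routing sketch above (squash and dynamic_routing as defined there, shapes illustrative), the final selection reduces to comparing capsule moduli:

import torch

u_hat = torch.randn(1, 10, 3, 256)         # dummy votes from 10 context capsules
answer_caps = torch.randn(1, 3, 256)       # the three initial answer capsules
v = dynamic_routing(u_hat, answer_caps)    # (1, 3, 256) context-weighted answers

lengths = v.norm(dim=-1)                   # squash bounds each modulus in [0, 1)
probs = lengths / lengths.sum(-1, keepdim=True)   # normalized support per viewpoint
best = probs.argmax(dim=-1)                # the longest capsule is the predicted answer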
Through convolutional neural networks and Dynamic Routing Between Capsules, the embodiment of the present invention performs key-information extraction and dynamic-routing-based information clustering after the fusion layer, and finally outputs multiple capsules representing different viewpoints; these capsules correspond to the candidate answers, and their moduli determine how likely each viewpoint is.
In the embodiment of the present invention, even if the question is not answered directly in the context, the viewpoint can be inferred from side information. Take the question "Is it raining today?": if the passage mentions side information such as "I made a point of taking an umbrella when I went out today", the model will lean towards answering that it is raining; if the passage contains no relevant information, it will tend to answer "cannot determine".
Referring to Fig. 2, Fig. 2 is a flow diagram of an implementation method for opinion-type machine reading comprehension according to an embodiment of the present invention. The method includes the following steps:
Step 201: converting the context, the question and the candidate answers contained in a natural language text into word-vector representations, obtaining context word vectors, question word vectors and candidate-answer word vectors respectively.
Step 202: performing contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively, obtaining a context contextualized encoding result, a question contextualized encoding result and candidate-answer contextualized encoding results.
In this step, for example, the question word vectors are encoded with a bidirectional long short-term memory (LSTM) neural network; the contextualized encoding output is passed to the matching layer, while the hidden-layer state of the LSTM serves as the initial state of the LSTM networks for the context and the candidate answers. After this state initialization, the context and the candidate answers are likewise encoded with bidirectional LSTM networks. Each encoded candidate answer is treated as an initial capsule of the capsule network; the three capsules represent three different viewpoints.
Step 203: matching the context contextualized encoding result with the question contextualized encoding result, obtaining a matching result of the context and the question.
In this step, BiDAF together with the four attention mechanisms dot, bi-linear, concat and minus is used to match the question and the context and to extract effective information from the context according to the question.
Step 204: fusing each context word vector, the context contextualized encoding result and the matching result of the context and the question with a bidirectional LSTM, obtaining a fusion result for each context word vector.
Step 205: performing information extraction on the fusion result of each context word vector, obtaining context information extracted based on the question.
In this step, a two-dimensional convolutional neural network extracts information from the context in a sliding-window fashion, with window sizes ranging from 1 to 10 tokens; max pooling is then applied to the extracted context information, yielding the context information extracted based on the question and realizing the interaction of context information.
Step 206: taking the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information, performing dynamic routing iterations and outputting candidate answers weighted by the context information.
In this step, a capsule network (CapsulesNet) takes the candidate-answer capsules output by the context embedding layer as cluster centers and the question-based context information as clustering information, performs 3 rounds of dynamic routing iteration, and finally outputs capsule vectors representing the 3 viewpoints, each capsule vector being a candidate answer weighted by the context information; the number of iterations equals the number of candidate answers.
Step 207: identifying the optimal candidate answer from the context-weighted candidate answers.
The modulus of each capsule vector is obtained through a squash-function operation; the longer the modulus, the more support the viewpoint has gathered from the context, and the greater its probability of being the answer.
Referring to Fig. 3, Fig. 3 is a schematic diagram of a device for opinion-type machine reading comprehension according to an embodiment of the present invention. The device includes:
a word embedding module, which converts the context, the question and the candidate answers contained in a natural language text into word-vector representations, obtaining context word vectors, question word vectors and candidate-answer word vectors respectively;
a context embedding module, which performs contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively, obtaining a context contextualized encoding result, a question contextualized encoding result and candidate-answer contextualized encoding results;
a matching module, which matches the context contextualized encoding result with the question contextualized encoding result, obtaining a matching result of the context and the question;
a fusion module, which fuses each context word vector, the context contextualized encoding result and the matching result of the context and the question, obtaining a fusion result for each context word vector;
an information extraction module, which performs information extraction on the fusion result of each context word vector, obtaining context information extracted based on the question;
a dynamic routing module, which takes the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information, performs dynamic routing iterations, and outputs candidate answers weighted by the context information;
a classification module, which identifies the optimal candidate answer from the context-weighted candidate answers.
The context embedding module encodes the question word vectors with a bidirectional long short-term memory (LSTM) neural network, obtaining the question contextualized encoding result; it takes the hidden-layer state of the bidirectional LSTM used to encode the question word vectors as the initial state of the LSTM networks that encode the context and the candidate answers, encodes the context and the candidate answers respectively with bidirectional LSTM networks, obtaining the context contextualized encoding result and the candidate-answer contextualized encoding results, and takes the candidate-answer contextualized encoding results as the initial capsules of a capsule network.
Wherein the information extraction module further extracts context information with a two-dimensional convolutional neural network over windows of a given size and distills the extracted information by max pooling, obtaining the context information extracted based on the question.
The dynamic routing module further takes the candidate-answer contextualized encoding results as the initial capsules and as the cluster centers, takes the question-based context information as the clustering information, and performs, through a capsule network, n rounds of dynamic routing iteration according to the number n of candidate answers, obtaining n capsules that represent the candidate answers, each capsule being a candidate answer weighted by the context information.
The classification module further obtains the modulus of each capsule vector through a squash-function operation and takes the capsule with the longest modulus as the optimal candidate answer.
The candidate answers include a first-category answer, a second-category answer, and a third answer indicating that neither the first category nor the second category can be determined.
The above method and device of the present invention can be applied to any electronic device supporting opinion-type machine comprehension. The electronic device includes a memory and a processor, wherein:
the memory stores instructions which, when executed by the processor, cause the processor to execute the steps of any of the above implementation methods for opinion-type machine reading comprehension.
The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk storage. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
An embodiment of the invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the following steps:
converting the context, the question and the candidate answers contained in a natural language text into word-vector representations, obtaining context word vectors, question word vectors and candidate-answer word vectors respectively;
performing contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively, obtaining a context contextualized encoding result, a question contextualized encoding result and candidate-answer contextualized encoding results;
matching the context contextualized encoding result with the question contextualized encoding result, obtaining a matching result of the context and the question;
fusing each context word vector, the context contextualized encoding result and the matching result of the context and the question, obtaining a fusion result for each context word vector;
performing information extraction on the fusion result of each context word vector, obtaining context information extracted based on the question;
taking the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information, performing dynamic routing iterations and outputting candidate answers weighted by the context information;
identifying the optimal candidate answer from the context-weighted candidate answers.
Since the device, network-side device and storage medium embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
Herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing merely describes preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. An implementation method for opinion-type machine reading comprehension, characterized in that the method includes:
converting the context, the question and the candidate answers contained in a natural language text into word-vector representations, obtaining context word vectors, question word vectors and candidate-answer word vectors respectively;
performing contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively, obtaining a context contextualized encoding result, a question contextualized encoding result and candidate-answer contextualized encoding results;
matching the context contextualized encoding result with the question contextualized encoding result, obtaining a matching result of the context and the question;
fusing each context word vector, the context contextualized encoding result and the matching result of the context and the question, obtaining a fusion result for each context word vector;
performing information extraction on the fusion result of each context word vector, obtaining context information extracted based on the question;
taking the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information, performing dynamic routing iterations and outputting candidate answers weighted by the context information;
identifying the optimal candidate answer from the context-weighted candidate answers.
2. The implementation method according to claim 1, characterized in that performing contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively includes:
encoding the question word vectors with a bidirectional long short-term memory (LSTM) neural network, obtaining the question contextualized encoding result;
taking the hidden-layer state of the bidirectional LSTM used to encode the question word vectors as the initial state of the LSTM networks that encode the context and the candidate answers; encoding the context and the candidate answers respectively with bidirectional LSTM networks, obtaining the context contextualized encoding result and the candidate-answer contextualized encoding results; and taking the candidate-answer contextualized encoding results as the initial capsules of a capsule network.
3. The implementation method according to claim 2, characterized in that performing information extraction on the fusion result of each context word vector to obtain the context information extracted based on the question includes:
extracting context information with a two-dimensional convolutional neural network over windows of a given size, and distilling the extracted information by max pooling to obtain the context information extracted based on the question.
4. The implementation method according to claim 2 or 3, characterized in that taking the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information and performing dynamic routing iterations includes:
taking the candidate-answer contextualized encoding results as the initial capsules and as the cluster centers, taking the question-based context information as the clustering information, and performing, through a capsule network, n rounds of dynamic routing iteration between capsules according to the number n of candidate answers, obtaining n capsules that represent the candidate answers, each capsule being a candidate answer weighted by the context information.
5. The implementation method according to claim 4, characterized in that the candidate answers include a first-category answer, a second-category answer, and a third answer indicating that neither the first category nor the second category can be determined;
and identifying the optimal candidate answer from the context-weighted candidate answers includes obtaining the modulus of each capsule vector through a squash-function operation and taking the capsule with the longest modulus as the optimal candidate answer.
6. A device for opinion-type machine reading comprehension, characterized in that the device includes:
a word embedding module, which converts the context, the question and the candidate answers contained in a natural language text into word-vector representations, obtaining context word vectors, question word vectors and candidate-answer word vectors respectively;
a context embedding module, which performs contextualized encoding on the context word vectors, the question word vectors and the candidate-answer word vectors respectively, obtaining a context contextualized encoding result, a question contextualized encoding result and candidate-answer contextualized encoding results;
a matching module, which matches the context contextualized encoding result with the question contextualized encoding result, obtaining a matching result of the context and the question;
a fusion module, which fuses each context word vector, the context contextualized encoding result and the matching result of the context and the question, obtaining a fusion result for each context word vector;
an information extraction module, which performs information extraction on the fusion result of each context word vector, obtaining context information extracted based on the question;
a dynamic routing module, which takes the candidate-answer contextualized encoding results as cluster centers and the question-based context information as clustering information, performs dynamic routing iterations, and outputs candidate answers weighted by the context information;
a classification module, which identifies the optimal candidate answer from the context-weighted candidate answers.
7. The device according to claim 6, characterized in that the context embedding module further:
encodes the question word vectors with a bidirectional long short-term memory (LSTM) neural network, obtaining the question contextualized encoding result;
takes the hidden-layer state of the bidirectional LSTM used to encode the question word vectors as the initial state of the LSTM networks that encode the context and the candidate answers; encodes the context and the candidate answers respectively with bidirectional LSTM networks, obtaining the context contextualized encoding result and the candidate-answer contextualized encoding results; and takes the candidate-answer contextualized encoding results as the initial capsules of a capsule network.
8. The device according to claim 7, characterized in that the information extraction module further extracts context information with a two-dimensional convolutional neural network over windows of a given size and distills the extracted information by max pooling, obtaining the context information extracted based on the question;
the dynamic routing module further takes the candidate-answer contextualized encoding results as the initial capsules and as the cluster centers, takes the question-based context information as the clustering information, and performs, through a capsule network, n rounds of dynamic routing iteration according to the number n of candidate answers, obtaining n capsules that represent the candidate answers, each capsule being a candidate answer weighted by the context information;
the classification module further obtains the modulus of each capsule vector through a squash-function operation and takes the capsule with the longest modulus as the optimal candidate answer;
and the candidate answers include a first-category answer, a second-category answer, and a third answer indicating that neither the first category nor the second category can be determined.
9. An electronic device supporting opinion-type machine reading comprehension, characterized in that the electronic device includes a memory and a processor, wherein the memory stores instructions which, when executed by the processor, cause the processor to execute the steps of the implementation method for opinion-type machine reading comprehension according to any one of claims 1 to 5.
10. A computer storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the implementation method for opinion-type machine reading comprehension according to any one of claims 1 to 5.
CN201910481171.3A 2019-06-04 2019-06-04 Implementation method and device for opinion-type machine reading comprehension Pending CN110390001A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910481171.3A CN110390001A (en) 2019-06-04 2019-06-04 Implementation method and device for opinion-type machine reading comprehension

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910481171.3A CN110390001A (en) 2019-06-04 2019-06-04 Implementation method and device for opinion-type machine reading comprehension

Publications (1)

Publication Number Publication Date
CN110390001A true CN110390001A (en) 2019-10-29

Family

ID=68285191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910481171.3A Pending CN110390001A (en) 2019-06-04 2019-06-04 A kind of viewpoint type machine reads the implementation method understood, device

Country Status (1)

Country Link
CN (1) CN110390001A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046961A (en) * 2019-12-16 2020-04-21 浙江大学 Fault classification method based on bidirectional long-and-short-term memory unit and capsule network
CN111339281A (en) * 2020-03-24 2020-06-26 苏州大学 Answer selection method for reading comprehension choice questions with multi-view fusion
CN112820279A (en) * 2021-03-12 2021-05-18 深圳市臻络科技有限公司 Parkinson disease detection method based on voice context dynamic characteristics

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180300314A1 (en) * 2017-04-12 2018-10-18 Petuum Inc. Constituent Centric Architecture for Reading Comprehension
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
CN109670029A (en) * 2018-12-28 2019-04-23 百度在线网络技术(北京)有限公司 For determining the method, apparatus, computer equipment and storage medium of problem answers

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180300314A1 (en) * 2017-04-12 2018-10-18 Petuum Inc. Constituent Centric Architecture for Reading Comprehension
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
CN109670029A (en) * 2018-12-28 2019-04-23 百度在线网络技术(北京)有限公司 For determining the method, apparatus, computer equipment and storage medium of problem answers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FREEFUIIISMYNAME: "基于capsule的观点型阅读理解模型" (A capsule-based opinion-type reading comprehension model), 6 November 2018, https://github.com/freefuiiismyname/capsule-mrc *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046961A (en) * 2019-12-16 2020-04-21 浙江大学 Fault classification method based on bidirectional long-and-short-term memory unit and capsule network
CN111046961B (en) * 2019-12-16 2022-10-04 浙江大学 Fault classification method based on bidirectional long-time and short-time memory unit and capsule network
CN111339281A (en) * 2020-03-24 2020-06-26 苏州大学 Answer selection method for reading comprehension choice questions with multi-view fusion
CN111339281B (en) * 2020-03-24 2022-04-12 苏州大学 Answer selection method for reading comprehension choice questions with multi-view fusion
CN112820279A (en) * 2021-03-12 2021-05-18 深圳市臻络科技有限公司 Parkinson disease detection method based on voice context dynamic characteristics
CN112820279B (en) * 2021-03-12 2024-02-09 深圳市臻络科技有限公司 Parkinson detection model construction method based on voice context dynamic characteristics

Similar Documents

Publication Publication Date Title
JP6955580B2 (en) Document summary automatic extraction method, equipment, computer equipment and storage media
CN111753081B (en) System and method for text classification based on deep SKIP-GRAM network
US20210390700A1 (en) Referring image segmentation
CN110390021A (en) Drug knowledge mapping construction method, device, computer equipment and storage medium
CN109408631A (en) Drug data processing method, device, computer equipment and storage medium
CN109657226B (en) Multi-linkage attention reading understanding model, system and method
CN107729309A (en) A kind of method and device of the Chinese semantic analysis based on deep learning
US20220414135A1 (en) Detecting affective characteristics of text with gated convolutional encoder-decoder framework
Robert et al. Hybridnet: Classification and reconstruction cooperation for semi-supervised learning
CN110390001A (en) A kind of viewpoint type machine reads the implementation method understood, device
Zhang et al. Patch strategy for deep face recognition
CN112507039A (en) Text understanding method based on external knowledge embedding
CN112199536A (en) Cross-modality-based rapid multi-label image classification method and system
US10949290B2 (en) Validation of a symbol response memory
WO2021208727A1 (en) Text error detection method and apparatus based on artificial intelligence, and computer device
CN115438215B (en) Image-text bidirectional search and matching model training method, device, equipment and medium
KR20150037962A (en) Methods and systems for using state vector data in a state machine engine
CN111914559A (en) Text attribute extraction method and device based on probability graph model and computer equipment
CN111274820B (en) Intelligent medical named entity identification method and device based on neural network
Joshua Thomas et al. A deep learning framework on generation of image descriptions with bidirectional recurrent neural networks
CN115130463A (en) Error correction method, model training method, computer medium, and apparatus
CN112749274A (en) Chinese text classification method based on attention mechanism and interference word deletion
CN109359198A (en) A kind of file classification method and device
CN110955745B (en) Text hash retrieval method based on deep learning
CN111738226B (en) Text recognition method and device based on CNN (convolutional neural network) and RCNN (recursive neural network) models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination