CN108959556A - Neural-network-based entity question answering method, device and terminal - Google Patents
- Publication number
- CN108959556A (application CN201810714445.4A)
- Authority
- CN
- China
- Prior art keywords
- word
- candidate documents
- word vector
- end word
- documents
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention proposes a neural-network-based entity question answering method, device and terminal. The method includes: converting the words contained in a question and in candidate documents into word vectors respectively, and generating a corresponding question word-vector sequence and candidate-document word-vector sequence; inputting the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and outputting a word encoding sequence of the question and a word encoding sequence of the candidate documents; matching the word encoding sequence of the question against the word encoding sequence of the candidate documents to generate a candidate-document representation based on the match information, the candidate-document representation comprising multiple word representations; and selecting a start word and an end word among all the word representations, and generating the entity answer from the start word and the end word. This reduces explicit computation and accumulated error, makes effective use of the semantic representations of the question and the documents, and improves the precision of entity-answer localization.
Description
Technical field
The present invention relates to the field of computers, and in particular to a neural-network-based entity question answering method; it further relates to a neural-network-based entity question answering device and a neural-network-based entity question answering terminal.
Background art
Given the documents relevant to a question, a traditional entity question answering system must run the computations of multiple functional modules, such as question-type analysis, entity recognition, entity-type matching, and context matching. The explicit computations of these functional modules often make the entity question answering system heavyweight, and the final system performance is limited by the accumulated deviations of all the modules. The main disadvantages of traditional entity question answering systems are: (1) each of the above functional modules involves a large amount of key technology such as morphological analysis, syntactic analysis, semantic analysis, and knowledge engineering, making the system computationally heavy; (2) the overall system performance is limited by the individual performance of each module, accumulated errors exist, and sustained performance optimization is difficult.
Summary of the invention
Embodiments of the present invention provide a neural-network-based entity question answering method, device and terminal, to solve at least the above technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a neural-network-based entity question answering method, comprising:
converting the words contained in a question and in candidate documents into word vectors respectively, and generating a corresponding question word-vector sequence and candidate-document word-vector sequence;
inputting the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and outputting a word encoding sequence of the question and a word encoding sequence of the candidate documents;
matching the word encoding sequence of the question against the word encoding sequence of the candidate documents, and generating a candidate-document representation based on the match information, the candidate-document representation comprising multiple word representations;
selecting a start word and an end word among all the word representations, and generating an entity answer from the start word and the end word.
With reference to the first aspect, in a first embodiment of the first aspect, matching the word encoding sequence of the question against the word encoding sequence of the candidate documents and generating a candidate-document representation based on the match information comprises:
computing a question-document similarity between the word vectors of the question and the word vectors of the candidate documents;
computing a document representation based on the question and the similarity from the question-document similarity and the question word-vector sequence;
computing a question-aware document representation from the question-document similarity and the candidate-document word-vector sequence;
computing the candidate-document representation from the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the question-aware document representation.
With reference to the first aspect, in a second embodiment of the first aspect, selecting a start word and an end word among all the word representations and generating an entity answer from the start word and the end word comprises:
inputting each word representation into a fully connected neural network model, and generating a first probability value that the word representation is the start word and a second probability value that the word representation is the end word;
selecting the start word and the end word with a conditional random field algorithm, according to the first probability value and the second probability value corresponding to each word representation;
taking the start word, the end word, and the intermediate words between the start word and the end word as the entity answer.
With reference to the first aspect or any embodiment thereof, in a third embodiment of the first aspect, before converting the words contained in the question and the candidate documents into word vectors, the method comprises:
performing word segmentation on the sentences contained in the question and the candidate documents.
In a second aspect, an embodiment of the present invention provides a neural-network-based entity question answering device, comprising:
a vector conversion module, configured to convert the words contained in a question and in candidate documents into word vectors respectively, and generate a corresponding question word-vector sequence and candidate-document word-vector sequence;
a sequence encoding module, configured to input the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and output a word encoding sequence of the question and a word encoding sequence of the candidate documents;
a question-document matching module, configured to match the word encoding sequence of the question against the word encoding sequence of the candidate documents, and generate a candidate-document representation based on the match information, the candidate-document representation comprising multiple word representations;
an answer generation module, configured to select a start word and an end word among all the word representations, and generate an entity answer from the start word and the end word.
With reference to the second aspect, in a first embodiment of the second aspect, the question-document matching module comprises:
a similarity computation unit, configured to compute the question-document similarity between the word vectors of the question and the word vectors of the candidate documents;
a first document-representation generation unit, configured to compute the document representation based on the question and the similarity from the question-document similarity and the question word-vector sequence;
a second document-representation generation unit, configured to compute the question-aware document representation from the question-document similarity and the candidate-document word-vector sequence;
a candidate-document representation generation unit, configured to compute the match-based candidate-document representation from the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the question-aware document representation.
With reference to the second aspect, in a second embodiment of the second aspect, the answer generation module comprises:
a probability computation unit, configured to input each word representation into the fully connected neural network model, and generate a first probability value that the word representation is the start word and a second probability value that the word representation is the end word;
a word selection unit, configured to select the start word and the end word with the conditional random field algorithm, according to the first probability value and the second probability value corresponding to each word representation;
an answer labeling unit, configured to take the start word, the end word, and the intermediate words between the start word and the end word as the entity answer.
With reference to the second aspect or any embodiment thereof, in a third embodiment of the second aspect, the device further comprises:
a word segmentation module, configured to perform word segmentation on the sentences contained in the question and the candidate documents.
In a third aspect, the structure of a neural-network-based entity question answering terminal includes a processor and a memory. The memory is configured to store a program that supports the neural-network-based entity question answering device in executing the neural-network-based entity question answering method of the first aspect, and the processor is configured to execute the program stored in the memory. The neural-network-based entity question answering device may further include a communication interface for communication between the device and other equipment or networks.
The above functions may be implemented in hardware, or in hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing the computer software instructions used by the neural-network-based entity question answering device, including the program involved in executing the neural-network-based entity question answering method of the first aspect.
One of the above technical solutions has the following advantages or beneficial effects: the long short-term memory network extracts the contextual semantic features of the question and of the candidate documents separately, generating the word encoding sequence of the question and the word encoding sequence of the candidate documents; by matching the word encoding sequence of the question against the word encoding sequence of the candidate documents, a candidate-document representation fusing the match information is obtained, which combines the semantics of the whole text and directly localizes the entity answer. This reduces explicit computation and accumulated error, makes effective use of the semantic representations of the question and the documents, improves the precision of entity-answer localization, facilitates incorporating the background context of the question answering environment, and further improves the timeliness of entity question answering.
The above summary is provided for the purpose of the specification only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments and features described above, further aspects, embodiments and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Description of drawings
In the drawings, unless otherwise specified, the same reference numerals throughout the several figures denote the same or similar components or elements. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting the scope of the invention.
Fig. 1 is a flow chart of a neural-network-based entity question answering method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a neural-network-based entity question answering process provided by an embodiment of the present invention;
Fig. 3 is a block diagram of a neural-network-based entity question answering device provided by an embodiment of the present invention;
Fig. 4 is a structural block diagram of the question-document matching module provided by an embodiment of the present invention;
Fig. 5 is a structural block diagram of the answer generation module provided by an embodiment of the present invention;
Fig. 6 is a computer-readable storage medium provided by an embodiment of the present invention.
Detailed description of embodiments
Hereinafter, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various different ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature rather than restrictive.
Embodiment one
In a specific embodiment, a neural-network-based entity question answering method is provided, as shown in Fig. 1 and Fig. 2, comprising:
Step S100: converting the words contained in a question and in candidate documents into word vectors respectively, and generating a corresponding question word-vector sequence and candidate-document word-vector sequence.
In the representation layer, the words in the question q and the document p are each expressed as vectors. Specifically, at initialization, each word in the question and the document is initialized as a random floating-point vector of fixed dimension; the question word vectors are arranged to form the question word-vector sequence q_emb, and the candidate-document word vectors are arranged to form the candidate-document word-vector sequence p_emb. These serve as the initial representations of the words of the question and the document. During system training, the representations of the question and the document are continuously optimized. In one embodiment, before step S100, the method may include: performing word segmentation on the sentences contained in the question and the candidate documents. The purpose of word segmentation is to make it easy to convert the tokens formed after segmentation into word vectors.
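The representation layer described above can be sketched as a simple embedding lookup. This is an illustrative sketch only: the vector dimension, the initialization distribution, the `build_embeddings` name, and the toy tokens are all assumptions not fixed by the text.

```python
import numpy as np

def build_embeddings(tokens, dim=8, seed=0):
    """Initialize each distinct token as a random fixed-dimension float
    vector and stack them in token order, like q_emb / p_emb above."""
    rng = np.random.default_rng(seed)
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = rng.standard_normal(dim).astype(np.float32)
    # The word-vector sequence keeps one row per token occurrence.
    return np.stack([vocab[t] for t in tokens]), vocab

# Hypothetical segmented question: three tokens -> a (3, 8) sequence.
q_emb, vocab = build_embeddings(["who", "wrote", "hamlet"], dim=8)
```

In training, these initial random vectors would be updated by backpropagation; here they simply stay random.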
Step S200: inputting the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and outputting a word encoding sequence of the question and a word encoding sequence of the candidate documents.
In the encoding layer, a long short-term memory network model (LSTM, Long Short-Term Memory network) is used to encode the question word-vector sequence and the candidate-document word-vector sequence respectively, in order to implicitly extract the contextual semantic features of the question word vectors and the contextual semantic features of the candidate-document word vectors. The contextual semantic features of the question word vectors comprise the semantic features, within a question sentence, of the word vectors that make up that sentence. Likewise, the contextual semantic features of the candidate-document word vectors comprise the semantic features, within a sentence of the candidate documents, of the word vectors that make up that sentence. For example, the output question word encoding sequence q_encode contains the contextual semantic features of the question word vectors, and the candidate-document word encoding sequence p_encode contains the contextual semantic features of the candidate documents.
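The encoding layer can be illustrated with a minimal LSTM cell unrolled over a word-vector sequence. The weight shapes, gate ordering, and random initialization here are assumptions made for the sketch; it is an untrained toy, not the patent's trained model.

```python
import numpy as np

def lstm_encode(x_seq, hidden=4, seed=0):
    """Run a single-layer LSTM over a (length, dim) word-vector sequence
    and return one hidden state per word, like q_encode / p_encode."""
    rng = np.random.default_rng(seed)
    dim = x_seq.shape[1]
    # One stacked weight matrix for the input, forget, output, and cell gates.
    W = rng.standard_normal((4 * hidden, dim + hidden)) * 0.1
    b = np.zeros(4 * hidden)
    h, c = np.zeros(hidden), np.zeros(hidden)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    outputs = []
    for x in x_seq:
        z = W @ np.concatenate([x, h]) + b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell-state update
        h = sigmoid(o) * np.tanh(c)                    # per-word encoding
        outputs.append(h)
    return np.stack(outputs)

p_encode = lstm_encode(np.ones((5, 8)), hidden=4)  # 5 words -> 5 word codes
```

Each output row carries the context of the words read so far, which is what makes the codes "contextual" rather than per-word lookups.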
Step S300: matching the word encoding sequence of the question against the word encoding sequence of the candidate documents, and generating a candidate-document representation based on the match information, the candidate-document representation comprising multiple word representations.
In the matching layer, the purpose of matching the word encoding sequence of the question against the word encoding sequence of the candidate documents is to extract the semantic information relating the question and the candidate documents and, taking the semantics of the whole text into account, to select from all the candidate documents a more accurate answer to the question. The generated candidate-document representation based on the match information contains the match information between the question and the candidate documents, for example the question-document similarity between word vectors, the document representation based on the question and the similarity, and the question-aware document representation.
Step S400: selecting a start word and an end word among all the word representations, and generating an entity answer from the start word and the end word.
In the sequence labeling layer, a fully connected neural network model (FNN) is applied to the match-based candidate-document representation to predict, for the word representation at each position, the probability that it is the start word of the entity answer, the probability that it is an intermediate word of the entity answer, and so on. A linear-chain conditional random field (CRF, Conditional Random Field) model is used to compute the optimal transition parameters, which are decoded to obtain the label of the word at each position, in particular marking the start word and the end word of the entity answer. The start word, the end word, and the words between the start word and the end word are labeled as the entity answer.
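The scoring step of the sequence labeling layer can be sketched as a fully connected layer mapping each word representation to a start score and an end score, normalized over positions with a softmax. The weights here are random stand-ins; the real system would learn them and decode with a CRF rather than read off the raw probabilities.

```python
import numpy as np

def score_positions(M, seed=0):
    """Map each row of the (n, d) candidate-document representation M to a
    start score and an end score, then softmax over positions."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((M.shape[1], 2)) * 0.1   # fully connected layer
    logits = M @ W                                    # (n, 2)
    e = np.exp(logits - logits.max(axis=0))
    probs = e / e.sum(axis=0)                         # normalize over positions
    return probs[:, 0], probs[:, 1]                   # start probs, end probs

M = np.random.default_rng(2).standard_normal((6, 16))
start_p, end_p = score_positions(M)
```

The two probability vectors are the per-word "first probability value" and "second probability value" that the decoding step then consumes.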
The neural-network-based entity question answering method provided in this embodiment extracts the contextual semantic features of the question and of the candidate documents separately through a long short-term memory network, generating the word encoding sequence of the question and the word encoding sequence of the candidate documents; by matching the word encoding sequence of the question against the word encoding sequence of the candidate documents, a candidate-document representation fusing the match information is obtained, which combines the semantics of the whole text and directly localizes the entity answer. This reduces explicit computation and accumulated error, makes effective use of the semantic representations of the question and the documents, improves the precision of entity-answer localization, facilitates incorporating the relevant background of the question answering environment, and further improves the timeliness of entity question answering.
In one embodiment, matching the word encoding sequence of the question against the word encoding sequence of the candidate documents and generating a candidate-document representation based on the match information comprises:
computing the question-document similarity between the word vectors of the question and the word vectors of the candidate documents;
computing the document representation based on the question and the similarity from the question-document similarity and the question word-vector sequence;
computing the question-aware document representation from the question-document similarity and the candidate-document word-vector sequence;
computing the candidate-document representation from the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the question-aware document representation.
For example, the formula for computing the question-document similarity is S_ij = softmax(q_i^encode · p_j^encode), where S_ij denotes the similarity between the i-th word vector of the question and the j-th word vector of the candidate documents. The similarity values computed by this algorithm lie in the range 0 to 1. Next, the document representation based on the question and the similarity is computed as p2q_attention = S · q^encode, where p2q_attention is the overall similarity computation from the document to the question, yielding the document representation based on the question and the similarity, and S is the question-document similarity. Then, the question-aware document representation is computed by first computing b = softmax(max_j S_ij) and then computing q2p_attention = b · p^encode, where q2p_attention is the overall similarity computation from the question to the document, yielding the question-aware document representation; this representation incorporates the implicit information of the question, and b is the weight distribution of the question over the document representation. Finally, the candidate-document representation M is computed as follows:
M = concat(p^encode, p2q_attention, p^encode * p2q_attention, p^encode * q2p_attention).
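The matching-layer formulas above can be sketched in a few lines. The exact shapes and the max-over-question-words form of the weight vector b follow a BiDAF-style reading of these equations; they are assumptions of this sketch, not details fixed by the text.

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def match(p_encode, q_encode):
    """p_encode: (n, h) document word codes; q_encode: (m, h) question codes.
    Returns the (n, 4h) match-based candidate-document representation M."""
    logits = p_encode @ q_encode.T                 # raw word-pair scores, (n, m)
    S = softmax(logits, axis=1)                    # S_ij values in (0, 1)
    p2q = S @ q_encode                             # document-to-question attention, (n, h)
    b = softmax(logits.max(axis=1))                # weight over document words, (n,)
    q2p = np.tile(b @ p_encode, (p_encode.shape[0], 1))  # question-to-document, (n, h)
    return np.concatenate([p_encode, p2q, p_encode * p2q, p_encode * q2p], axis=1)

rng = np.random.default_rng(1)
M = match(rng.standard_normal((6, 4)), rng.standard_normal((3, 4)))
```

Concatenating the raw codes with both attention products is what lets the downstream labeler see the match information alongside the original document semantics.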
In one embodiment, selecting a start word and an end word among all the word representations and generating an entity answer from the start word and the end word comprises:
inputting each word representation into the fully connected neural network model, and generating a first probability value that the word representation is the start word and a second probability value that the word representation is the end word;
selecting the start word and the end word with the conditional random field algorithm, according to the first probability value and the second probability value corresponding to each word representation;
taking the start word, the end word, and the intermediate words between the start word and the end word as the entity answer.
Here, the fully connected neural network model outputs the two classes of start word and end word: each word representation is input into the fully connected neural network model, yielding for each word its two probability values as start word and as end word. Each word representation thus corresponds to a first probability value and a second probability value, and the conditional random field algorithm labels the word representations most likely to be the start word and the end word. Finally, the start word, the end word, and the words between them are labeled as the entity answer.
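The selection step can be illustrated with a greedy stand-in for the CRF decoding: pick the (start, end) pair with start ≤ end that maximizes the product of the first and second probability values. The word list and the probabilities below are made-up illustrative values, and the exhaustive search replaces, rather than implements, the CRF.

```python
import numpy as np

def select_span(start_probs, end_probs):
    """Return the (start, end) index pair with start <= end that maximizes
    start_probs[start] * end_probs[end] -- a greedy stand-in for CRF decoding."""
    best_score, best_span = -1.0, (0, 0)
    for i in range(len(start_probs)):
        for j in range(i, len(end_probs)):
            score = start_probs[i] * end_probs[j]
            if score > best_score:
                best_score, best_span = score, (i, j)
    return best_span

words = ["the", "author", "is", "william", "shakespeare"]
start_probs = np.array([0.05, 0.05, 0.10, 0.70, 0.10])  # first probability values
end_probs   = np.array([0.05, 0.05, 0.10, 0.10, 0.70])  # second probability values
i, j = select_span(start_probs, end_probs)
answer = words[i:j + 1]  # start word, end word, and the words in between
```

The slice `words[i:j + 1]` is exactly the labeling rule above: start word, end word, and the intermediate words between them form the entity answer.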
Embodiment two
In another specific embodiment, a neural-network-based entity question answering device is provided, as shown in Fig. 3, comprising:
a vector conversion module 10, configured to convert the words contained in a question and in candidate documents into word vectors respectively, and generate a corresponding question word-vector sequence and candidate-document word-vector sequence;
a sequence encoding module 20, configured to input the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and output a word encoding sequence of the question and a word encoding sequence of the candidate documents;
a question-document matching module 30, configured to match the word encoding sequence of the question against the word encoding sequence of the candidate documents, and generate a candidate-document representation based on the match information, the candidate-document representation comprising multiple word representations;
an answer generation module 40, configured to select a start word and an end word among all the word representations, and generate an entity answer from the start word and the end word.
In one embodiment, as shown in Fig. 4, the question-document matching module 30 comprises:
a similarity computation unit 31, configured to compute the question-document similarity between the word vectors of the question and the word vectors of the candidate documents;
a first document-representation generation unit 32, configured to compute the document representation based on the question and the similarity from the question-document similarity and the question word-vector sequence;
a second document-representation generation unit 33, configured to compute the question-aware document representation from the question-document similarity and the candidate-document word-vector sequence;
a candidate-document representation generation unit 34, configured to compute the match-based candidate-document representation from the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the question-aware document representation.
In one embodiment, as shown in Fig. 5, the answer generation module 40 comprises:
a probability computation unit 41, configured to input each word representation into the fully connected neural network model, and generate a first probability value that the word representation is the start word and a second probability value that the word representation is the end word;
a word selection unit 42, configured to select the start word and the end word with the conditional random field algorithm, according to the first probability value and the second probability value corresponding to each word representation;
an answer labeling unit 43, configured to take the start word, the end word, and the intermediate words between the start word and the end word as the entity answer.
In one embodiment, the device further comprises:
a word segmentation module, configured to perform word segmentation on the sentences contained in the question and the candidate documents.
Embodiment three
An embodiment of the present invention provides a neural-network-based entity question answering terminal, as shown in Fig. 6, comprising:
a memory 400 and a processor 500, the memory 400 storing a computer program executable on the processor 500. When executing the computer program, the processor 500 implements the neural-network-based entity question answering method of the above embodiments. There may be one or more memories 400 and processors 500.
A communication interface 600 is used for the memory 400 and the processor 500 to communicate with the outside.
The memory 400 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory.
If the memory 400, the processor 500 and the communication interface 600 are implemented independently, they may be interconnected by a bus to complete mutual communication. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Fig. 6, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 400, the processor 500 and the communication interface 600 are integrated on one chip, they may complete mutual communication through an internal interface.
Embodiment four
A computer-readable storage medium stores a computer program; when the program is executed by a processor, any of the neural-network-based entity question answering methods included in embodiment one is implemented.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples" and the like means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, without contradiction, those skilled in the art may combine the features of the different embodiments or examples described in this specification.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means two or more, unless otherwise explicitly and specifically limited.
Any process or method description in a flow chart, or otherwise described herein, may be understood as representing a module, segment or portion of executable instruction code comprising one or more steps for implementing a specific logical function or process; and the scope of the preferred embodiments of the present invention includes other implementations, in which the functions may be executed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which embodiments of the present invention belong.
The logic and/or steps represented in a flow chart or otherwise described herein, for example an ordered list of executable instructions considered to implement logical functions, may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, device or apparatus (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, device or apparatus). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transmit a program for use by, or in connection with, an instruction execution system, device or apparatus. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) with one or more wirings, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented with software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one of the following techniques known in the art, or a combination thereof, may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those skilled in the art will understand that all or part of the steps carried by the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the program, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist alone physically, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any person familiar with the art can readily conceive of various changes or substitutions within the technical scope disclosed by the present invention, and these should all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A neural-network-based entity question answering method, characterized by comprising:
converting the words contained in a question and in candidate documents into word vectors respectively, to generate a corresponding question word-vector sequence and candidate-document word-vector sequence;
inputting the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and outputting a word encoding sequence of the question and a word encoding sequence of the candidate documents;
matching the word encoding sequence of the question against the word encoding sequence of the candidate documents, and generating a candidate-document representation based on the match information, the candidate-document representation comprising a plurality of word representations; and
selecting a start word and an end word among all the word representations, and generating an entity answer according to the start word and the end word.
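For illustration, the four steps of claim 1 can be sketched in plain numpy. Everything here is a hypothetical stand-in: the vocabulary, embedding table, and scoring weights are random rather than trained, and the LSTM encoder of step 2 is replaced by an identity encoding so the sketch stays dependency-free; only the data flow of the claim is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and embedding table (illustrative stand-ins for a trained
# word-embedding model).
vocab = {"who": 0, "wrote": 1, "hamlet": 2, "shakespeare": 3, "the": 4, "play": 5}
embed = rng.normal(size=(len(vocab), 8))

def to_vectors(words):
    # Claim step 1: convert each word into its word vector.
    return np.stack([embed[vocab[w]] for w in words])

question = ["who", "wrote", "hamlet"]
document = ["shakespeare", "wrote", "the", "play", "hamlet"]
q_enc = to_vectors(question)   # question word-vector sequence
d_enc = to_vectors(document)   # candidate-document word-vector sequence
# Claim step 2 would pass both sequences through an LSTM here; this sketch
# keeps the identity encoding.

# Claim step 3: match the two sequences and build one representation per
# document word from the match information (attention-style weighting).
sim = d_enc @ q_enc.T                                        # (5, 3) match scores
attn = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)  # row-wise softmax
word_reprs = np.concatenate([d_enc, attn @ q_enc], axis=1)   # (5, 16)

# Claim step 4 (stand-in scoring): select start and end words, then emit the
# span between them as the entity answer.
start_scores = word_reprs @ rng.normal(size=word_reprs.shape[1])
end_scores = word_reprs @ rng.normal(size=word_reprs.shape[1])
start = int(np.argmax(start_scores))
end = start + int(np.argmax(end_scores[start:]))  # end never precedes start
answer = document[start:end + 1]
```

A trained system would learn the embeddings, the LSTM, and the scoring heads jointly; the sketch only fixes the shapes each step consumes and produces.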
2. The method according to claim 1, wherein matching the word encoding sequence of the question against the word encoding sequence of the candidate documents and generating a candidate-document representation based on the match information comprises:
calculating a question-document similarity between the word vectors of the question and the word vectors of the candidate documents;
calculating a document representation based on the question and the similarity, according to the question-document similarity and the question word-vector sequence;
calculating a question-aware document representation according to the question-document similarity and the candidate-document word-vector sequence; and
calculating the candidate-document representation according to the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the question-aware document representation.
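Claim 2 leaves the exact matching functions open. One common reading, sketched below under that assumption, is bidirectional attention in the style of BiDAF: a similarity matrix feeds both a document-to-question attention and a question-to-document attention, and the three inputs named in the claim are concatenated per word. All shapes and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6                            # embedding width (illustrative)
Q = rng.normal(size=(3, d))      # question word-vector sequence, 3 words
D = rng.normal(size=(5, d))      # candidate-document word-vector sequence, 5 words

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Question-document similarity: one score per (document word, question word).
S = D @ Q.T                                  # shape (5, 3)

# Document representation based on the question and the similarity:
# each document word attends over the question words.
doc2q = softmax(S, axis=1) @ Q               # shape (5, d)

# Question-aware document representation: the question side attends over the
# document words, and the attended vector is tiled to every position.
b = softmax(S.max(axis=1), axis=0)           # one weight per document word
q2d = np.tile(b @ D, (len(D), 1))            # shape (5, d)

# Final per-word candidate-document representation combines all three inputs.
G = np.concatenate([D, doc2q, q2d], axis=1)  # shape (5, 3 * d)
```

Other matching functions (e.g. a trilinear similarity instead of the plain dot product) fit the same claim language; only the concatenation of the three named inputs is taken from the claim itself.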
3. The method according to claim 1, wherein selecting a start word and an end word among all the word representations and generating an entity answer according to the start word and the end word comprises:
inputting each word representation into a fully connected neural network model, and generating a first probability value that the word representation is the start word and a second probability value that the word representation is the end word;
selecting the start word and the end word with a conditional random field algorithm, according to the first probability value and the second probability value corresponding to each word representation; and
taking the start word, the end word, and the intermediate words between the start word and the end word as the entity answer.
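The probability step of claim 3 can be sketched with two linear scoring heads followed by softmaxes. The claim selects the span with a conditional random field; this sketch substitutes the simpler exhaustive search over valid (start ≤ end) spans, which enforces the same ordering constraint on the chosen pair. Weights and sizes are illustrative stand-ins for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 5, 8
word_reprs = rng.normal(size=(n, d))     # word representations from the matching step

# One fully connected scoring head per role (illustrative, untrained weights).
W_start = rng.normal(size=d)
W_end = rng.normal(size=d)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

p_start = softmax(word_reprs @ W_start)  # first probability value per word
p_end = softmax(word_reprs @ W_end)      # second probability value per word

# Span selection: exhaustive search over valid spans stands in for the CRF.
start, end = max(((i, j) for i in range(n) for j in range(i, n)),
                 key=lambda ij: p_start[ij[0]] * p_end[ij[1]])

# The entity answer is the start word, the intermediate words, and the end word.
answer_indices = list(range(start, end + 1))
```

A CRF additionally learns transition scores between the start/end labels, whereas the product `p_start[i] * p_end[j]` treats the two heads as independent; both respect the start-before-end constraint.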
4. The method according to any one of claims 1 to 3, characterized by comprising, before converting the words contained in the question and the candidate documents into word vectors respectively:
performing word segmentation on the sentences contained in the question and the candidate documents.
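The word segmentation of claim 4 can be illustrated with greedy forward maximum matching against a dictionary. This is a minimal stand-in, not the patent's method: a production system would use a dedicated segmenter, and the four-entry dictionary below is purely hypothetical.

```python
def cut_words(sentence, vocab):
    """Greedy forward maximum matching: at each position, take the longest
    dictionary word that matches, falling back to a single character."""
    words, i = [], 0
    max_len = max(len(w) for w in vocab)
    while i < len(sentence):
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + length]
            if length == 1 or piece in vocab:
                words.append(piece)
                i += length
                break
    return words

vocab = {"北京", "天安门", "我", "爱"}
print(cut_words("我爱北京天安门", vocab))  # → ['我', '爱', '北京', '天安门']
```

The resulting word list is what the method of claim 1 converts into the word-vector sequences.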
5. A neural-network-based entity question answering device, characterized by comprising:
a vector conversion module, configured to convert the words contained in a question and in candidate documents into word vectors respectively, to generate a corresponding question word-vector sequence and candidate-document word-vector sequence;
a sequence encoding module, configured to input the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and to output a word encoding sequence of the question and a word encoding sequence of the candidate documents;
a question-document matching module, configured to match the word encoding sequence of the question against the word encoding sequence of the candidate documents, and to generate a candidate-document representation based on the match information, the candidate-document representation comprising a plurality of word representations; and
an answer generation module, configured to select a start word and an end word among all the word representations, and to generate an entity answer according to the start word and the end word.
6. The device according to claim 5, wherein the question-document matching module comprises:
a similarity calculation unit, configured to calculate a question-document similarity between the word vectors of the question and the word vectors of the candidate documents;
a first document representation generation unit, configured to calculate a document representation based on the question and the similarity, according to the question-document similarity and the question word-vector sequence;
a second document representation generation unit, configured to calculate a question-aware document representation according to the question-document similarity and the candidate-document word-vector sequence; and
a candidate-document representation generation unit, configured to calculate the match-based candidate-document representation according to the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the question-aware document representation.
7. The device according to claim 5, wherein the answer generation module comprises:
a probability calculation unit, configured to input each word representation into a fully connected neural network model, and to generate a first probability value that the word representation is the start word and a second probability value that the word representation is the end word;
a word selection unit, configured to select the start word and the end word with a conditional random field algorithm, according to the first probability value and the second probability value corresponding to each word representation; and
an answer annotation unit, configured to take the start word, the end word, and the intermediate words between the start word and the end word as the entity answer.
8. The device according to any one of claims 5 to 7, wherein the device further comprises:
a word segmentation module, configured to perform word segmentation on the sentences contained in the question and the candidate documents.
9. A neural-network-based entity question answering terminal, characterized by comprising:
one or more processors; and
a memory for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-4.
10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810714445.4A CN108959556A (en) | 2018-06-29 | 2018-06-29 | Neural-network-based entity question answering method, device, and terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108959556A true CN108959556A (en) | 2018-12-07 |
Family
ID=64485214
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810714445.4A Pending CN108959556A (en) | 2018-06-29 | 2018-06-29 | Neural-network-based entity question answering method, device, and terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108959556A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109815325A (en) * | 2019-01-18 | 2019-05-28 | 北京百度网讯科技有限公司 | Answer extraction method, apparatus, server, and storage medium |
CN109885672A (en) * | 2019-03-04 | 2019-06-14 | 中国科学院软件研究所 | Question-answering intelligent retrieval system and method for online education |
CN110222345A (en) * | 2019-06-18 | 2019-09-10 | 卓尔智联(武汉)研究院有限公司 | Cloze test answering method and apparatus, electronic device, and storage medium |
CN110245334A (en) * | 2019-06-25 | 2019-09-17 | 北京百度网讯科技有限公司 | Method and apparatus for outputting information |
CN110619042A (en) * | 2019-03-13 | 2019-12-27 | 北京航空航天大学 | Neural-network-based teaching question answering system and method |
CN111881264A (en) * | 2020-09-28 | 2020-11-03 | 北京智源人工智能研究院 | Method and electronic device for long-text retrieval in open-domain question answering tasks |
CN112347229A (en) * | 2020-11-12 | 2021-02-09 | 润联软件系统(深圳)有限公司 | Answer extraction method and apparatus, computer device, and storage medium |
CN112417126A (en) * | 2020-12-02 | 2021-02-26 | 车智互联(北京)科技有限公司 | Question answering method, computing device, and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7933904B2 (en) * | 2007-04-10 | 2011-04-26 | Nelson Cliff | File search engine and computerized method of tagging files with vectors |
CN103229162A (en) * | 2010-09-28 | 2013-07-31 | 国际商业机器公司 | Providing answers to questions using logical synthesis of candidate answers |
CN106778887A (en) * | 2016-12-27 | 2017-05-31 | 努比亚技术有限公司 | Terminal and method for determining sentence label sequences based on conditional random fields |
CN107562792A (en) * | 2017-07-31 | 2018-01-09 | 同济大学 | Deep-learning-based question-answer matching method |
WO2018085710A1 (en) * | 2016-11-04 | 2018-05-11 | Salesforce.Com, Inc. | Dynamic coattention network for question answering |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108959556A (en) | Neural-network-based entity question answering method, device, and terminal | |
Lo Bosco et al. | Deep learning architectures for DNA sequence classification | |
CN110377903B (en) | Sentence-level joint extraction method for entities and relations | |
KR102116518B1 (en) | Apparatus for answering a question based on machine reading comprehension and method for answering a question using thereof | |
CN109284506A (en) | User comment sentiment analysis system and method based on attention convolutional neural networks | |
Yu et al. | Sequential labeling using deep-structured conditional random fields | |
CN108170684A (en) | Text similarity calculation method and system, data query system, and computer program product | |
CN110781306B (en) | Aspect-level sentiment classification method and system for English text | |
CN112163429B (en) | Sentence relevance calculation method, system, and medium combining recurrent networks and BERT | |
CN106294635B (en) | Application search method, and deep neural network model training method and device | |
CN112015868A (en) | Question-answering method based on knowledge graph completion | |
CN109597992B (en) | Question similarity calculation method combining synonym dictionary and word embedding vector | |
CN112800239B (en) | Training method of intention recognition model, and intention recognition method and device | |
Jebbara et al. | Aspect-based relational sentiment analysis using a stacked neural network architecture | |
CN108897852A (en) | Method, device, and equipment for judging conversation content continuity | |
CN110210043A (en) | Text translation method and device, electronic device, and readable storage medium | |
JP2021522569A (en) | Machine learning model with evolving domain-specific lexicon features for text annotation | |
CN106557554B (en) | Artificial-intelligence-based search result display method and device | |
Wang et al. | Summary-aware attention for social media short text abstractive summarization | |
CN109597988A (en) | Cross-lingual lexical sememe prediction method and device, and electronic equipment | |
CN108846125A (en) | Dialogue generation method, device, terminal, and computer-readable storage medium | |
CN110263167A (en) | Medical entity classification model generation method, device, equipment, and readable storage medium | |
CN110399472A (en) | Interview question prompting method, device, computer equipment, and storage medium | |
CN111914553A (en) | Financial information negative subject judgment method based on machine learning | |
Li et al. | Semi-supervised Domain Adaptation for Dependency Parsing via Improved Contextualized Word Representations |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181207 |