CN108959556A - Neural-network-based entity question answering method, device, and terminal - Google Patents

Neural-network-based entity question answering method, device, and terminal

Info

Publication number
CN108959556A
CN108959556A (application CN201810714445.4A)
Authority
CN
China
Prior art keywords
word
candidate document
word vector
ending word
document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810714445.4A
Other languages
Chinese (zh)
Inventor
韦豪杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810714445.4A priority Critical patent/CN108959556A/en
Publication of CN108959556A publication Critical patent/CN108959556A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 - Named entity recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention proposes a neural-network-based entity question answering method, device, and terminal. The method includes: converting the words contained in a question and in a candidate document into word vectors respectively, generating a corresponding question word-vector sequence and candidate-document word-vector sequence; inputting the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and outputting a word encoding sequence of the question and a word encoding sequence of the candidate document; matching the word encoding sequence of the question against the word encoding sequence of the candidate document to generate a match-based candidate-document representation, the candidate-document representation comprising multiple word representations; and selecting a start word and an ending word from among all the word representations and generating the entity answer from the start word and the ending word. This reduces explicit computation and cumulative error, makes effective use of the semantic relationship between question and document, and improves the precision of entity-answer localization.

Description

Neural-network-based entity question answering method, device, and terminal
Technical field
The present invention relates to the field of computing, and in particular to a neural-network-based entity question answering method; it further relates to a neural-network-based entity question answering device and to a neural-network-based entity question answering terminal.
Background art
Given documents relevant to a question, a traditional entity question answering system needs to run a number of functional modules, such as question-type analysis, entity recognition, entity-type matching, and context matching. The explicit computation of these modules tends to make the entity question answering system heavyweight, and the final system quality is limited by the accumulated deviations of all the modules. The main drawbacks of traditional entity question answering systems are: (1) each of the above functional modules involves a large amount of key technology (morphological analysis, syntactic analysis, semantic analysis, knowledge engineering), so the system's computation is very heavy; (2) the overall system quality is limited by the individual quality of each functional module, cumulative errors arise, and sustained quality optimization is difficult.
Summary of the invention
Embodiments of the present invention provide a neural-network-based entity question answering method, device, and terminal, to solve at least the above technical problems in the prior art.
In a first aspect, an embodiment of the invention provides a neural-network-based entity question answering method, comprising:
converting the words contained in a question and in a candidate document into word vectors respectively, generating a corresponding question word-vector sequence and candidate-document word-vector sequence;
inputting the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and outputting a word encoding sequence of the question and a word encoding sequence of the candidate document;
matching the word encoding sequence of the question against the word encoding sequence of the candidate document, and generating a match-based candidate-document representation, the candidate-document representation comprising multiple word representations;
selecting a start word and an ending word from among all the word representations, and generating an entity answer from the start word and the ending word.
With reference to the first aspect, in a first implementation of the first aspect, matching the word encoding sequence of the question against the word encoding sequence of the candidate document and generating the match-based candidate-document representation comprises:
computing a question-document similarity between the word vectors of the question and the word vectors of the candidate document;
computing, from the question-document similarity and the question word-vector sequence, a document representation based on the question and the similarity;
computing, from the question-document similarity and the candidate-document word-vector sequence, a document representation based on question understanding;
computing the candidate-document representation from the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the document representation based on question understanding.
With reference to the first aspect, in a second implementation of the first aspect, selecting a start word and an ending word from among all the word representations and generating an entity answer from the start word and the ending word comprises:
inputting each word representation into a fully connected neural network model, and generating a first probability value that the word representation is the start word and a second probability value that the word representation is the ending word;
selecting the start word and the ending word with a conditional random field algorithm, based on the first and second probability values corresponding to each word representation;
taking the start word, the ending word, and the intermediate words between the start word and the ending word as the entity answer.
With reference to the first aspect or any implementation thereof, in a third implementation of the first aspect, before converting the words contained in the question and the candidate document into word vectors, the method comprises:
performing word segmentation on the sentences contained in the question and the candidate document.
In a second aspect, an embodiment of the invention provides a neural-network-based entity question answering device, comprising:
a vector conversion module, configured to convert the words contained in a question and in a candidate document into word vectors respectively, generating a corresponding question word-vector sequence and candidate-document word-vector sequence;
a sequence encoding module, configured to input the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and to output a word encoding sequence of the question and a word encoding sequence of the candidate document;
a question-document matching module, configured to match the word encoding sequence of the question against the word encoding sequence of the candidate document and to generate a match-based candidate-document representation, the candidate-document representation comprising multiple word representations;
an answer generation module, configured to select a start word and an ending word from among all the word representations and to generate an entity answer from the start word and the ending word.
With reference to the second aspect, in a first implementation of the second aspect, the question-document matching module comprises:
a similarity computation unit, configured to compute the question-document similarity between the word vectors of the question and the word vectors of the candidate document;
a first document-representation generation unit, configured to compute, from the question-document similarity and the question word-vector sequence, the document representation based on the question and the similarity;
a second document-representation generation unit, configured to compute, from the question-document similarity and the candidate-document word-vector sequence, the document representation based on question understanding;
a candidate-document-representation generation unit, configured to compute the match-based candidate-document representation from the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the document representation based on question understanding.
With reference to the second aspect, in a second implementation of the second aspect, the answer generation module comprises:
a probability computation unit, configured to input each word representation into a fully connected neural network model and to generate a first probability value that the word representation is the start word and a second probability value that the word representation is the ending word;
a word selection unit, configured to select the start word and the ending word with a conditional random field algorithm, based on the first and second probability values corresponding to each word representation;
an answer labeling unit, configured to take the start word, the ending word, and the intermediate words between the start word and the ending word as the entity answer.
With reference to the second aspect or any implementation thereof, in a third implementation of the second aspect, the device further comprises:
a word segmentation module, configured to perform word segmentation on the sentences contained in the question and the candidate document.
In a third aspect, the structure of a neural-network-based entity question answering terminal includes a processor and a memory. The memory stores a program that supports the neural-network-based entity question answering device in executing the neural-network-based entity question answering method of the first aspect, and the processor is configured to execute the program stored in the memory. The neural-network-based entity question answering device may further include a communication interface for communication between the device and other equipment or communication networks.
The above functions may be implemented in hardware, or implemented by hardware executing the corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium for storing the computer software instructions used by the neural-network-based entity question answering device, including the program involved in executing the neural-network-based entity question answering method of the first aspect on behalf of the neural-network-based entity question answering device.
One of the above technical solutions has the following advantages or beneficial effects: a long short-term memory network extracts contextual semantic features from the question and the candidate document separately, generating a word encoding sequence of the question and a word encoding sequence of the candidate document; matching the word encoding sequence of the question against the word encoding sequence of the candidate document yields a candidate-document representation that fuses the match information, so the entity answer can be located directly using the semantics of the whole text. This reduces explicit computation and cumulative error, makes effective use of the semantic relationship between question and document, improves the precision of entity-answer localization, facilitates the use of the background context of the question answering setting, and further improves the responsiveness of entity question answering.
The above summary is provided for purposes of description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, identical reference numerals denote identical or similar parts or elements throughout the several figures. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed in accordance with the present invention and should not be taken as limiting the scope of the present invention.
Fig. 1 is a flowchart of a neural-network-based entity question answering method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a neural-network-based entity question answering process provided by an embodiment of the present invention;
Fig. 3 is a block diagram of a neural-network-based entity question answering device provided by an embodiment of the present invention;
Fig. 4 is a structural block diagram of the question-document matching module provided by an embodiment of the present invention;
Fig. 5 is a structural block diagram of the answer generation module provided by an embodiment of the present invention;
Fig. 6 is a computer-readable storage medium provided by an embodiment of the present invention.
Detailed description of the embodiments
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various different ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and the description are to be regarded as illustrative in nature rather than restrictive.
Embodiment one
In one specific implementation, a neural-network-based entity question answering method is provided, as shown in Fig. 1 and Fig. 2, comprising:
Step S100: converting the words contained in the question and the candidate document into word vectors respectively, generating a corresponding question word-vector sequence and candidate-document word-vector sequence.
In the representation layer, the words in the question q and the document p are each given a vectorized representation. Specifically, at initialization each word in the question and in the document is initialized to a random floating-point vector of fixed dimension; the question word vectors are arranged to form the question word-vector sequence q_emb, and the candidate-document word vectors are arranged to form the candidate-document word-vector sequence p_emb. These serve as the initial representations of the words of the question and the document; during training of the system, the representations of the question and the document are then continually optimized. In one implementation, before step S100 the method may include performing word segmentation on the sentences contained in the question and the candidate document. The purpose of word segmentation is to make it easy to convert the tokens formed by segmentation into word vectors.
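As a concrete illustration of this representation layer, the following minimal NumPy sketch initializes each word to a random floating-point vector of fixed dimension and arranges the vectors into sequences playing the role of q_emb and p_emb. The function name, the toy English tokens, and the 64-dimension size are all hypothetical, chosen only for illustration:

```python
import numpy as np

def build_embeddings(tokens, vocab, rng, dim=64):
    """Look up (or randomly initialize) a fixed-dimension float vector
    per token. The same word always maps to the same vector; in a
    trained system these initial vectors would be optimized further."""
    rows = []
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = rng.standard_normal(dim).astype(np.float32)
        rows.append(vocab[tok])
    return np.stack(rows)  # shape: (sequence length, dim)

rng = np.random.default_rng(0)
vocab = {}
# q_emb and p_emb play the role of the question / candidate-document
# word-vector sequences (words assumed already segmented).
q_emb = build_embeddings(["who", "wrote", "hamlet"], vocab, rng)
p_emb = build_embeddings(["hamlet", "was", "written", "by", "shakespeare"], vocab, rng)
```

Sharing one `vocab` dictionary between question and document guarantees that a word appearing in both (here "hamlet") gets a single shared vector, which is what lets the later matching step compare the two sides.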
Step S200: inputting the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and outputting a word encoding sequence of the question and a word encoding sequence of the candidate document.
In the encoding layer, a long short-term memory network model (LSTM, Long Short-Term Memory network) is used to encode the question word-vector sequence and the candidate-document word-vector sequence, with the goal of implicitly extracting the contextual semantic features of the question word vectors and the contextual semantic features of the candidate-document word vectors. The contextual semantic features of the question word vectors comprise the semantic features, within a question sentence, of the word vectors that make up that sentence. Likewise, the contextual semantic features of the candidate-document word vectors comprise the semantic features, within a sentence of the candidate document, of the word vectors that make up that sentence. For example, the output question word encoding sequence q_encode contains the contextual semantic features of the question word vectors, and the candidate-document word encoding sequence p_encode contains the contextual semantic features of the candidate document.
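The encoding step can be sketched as a single-layer LSTM written out by hand. This is a minimal NumPy illustration with random, untrained weights; the function names, gate ordering, and dimensions are assumptions for the sketch, not the patent's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_encode(x, W, U, b):
    """Run a single-layer LSTM over x (shape (T, d_in)) and return the
    hidden-state sequence (T, d_h), i.e. the word encoding sequence.
    W: (4*d_h, d_in), U: (4*d_h, d_h), b: (4*d_h,); the four gate
    blocks are stacked as input, forget, candidate, output."""
    d_h = U.shape[1]
    h = np.zeros(d_h)
    c = np.zeros(d_h)
    states = []
    for x_t in x:
        z = W @ x_t + U @ h + b
        i = sigmoid(z[0 * d_h:1 * d_h])   # input gate
        f = sigmoid(z[1 * d_h:2 * d_h])   # forget gate
        g = np.tanh(z[2 * d_h:3 * d_h])   # candidate cell update
        o = sigmoid(z[3 * d_h:4 * d_h])   # output gate
        c = f * c + i * g                 # new cell state
        h = o * np.tanh(c)                # this word's contextual encoding
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
d_in, d_h = 8, 16
W = 0.1 * rng.standard_normal((4 * d_h, d_in))
U = 0.1 * rng.standard_normal((4 * d_h, d_h))
b = np.zeros(4 * d_h)
p_emb = rng.standard_normal((5, d_in))    # word vectors of a 5-word document
p_encode = lstm_encode(p_emb, W, U, b)    # contextual word encoding sequence
```

Because each hidden state h depends on all earlier inputs through the cell state c, each row of p_encode implicitly carries the left context of its word, which is the "contextual semantic feature" the encoding layer is after.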
Step S300: matching the word encoding sequence of the question against the word encoding sequence of the candidate document, and generating a match-based candidate-document representation, the candidate-document representation comprising multiple word representations.
In the matching layer, the purpose of matching the word encoding sequence of the question against the word encoding sequence of the candidate document is to extract the semantic information relating the question and the candidate document, so that the semantics of the whole text are taken into account and a more accurate answer to the question can be selected from all the candidate documents. The generated match-based candidate-document representation contains the match information between the question and the candidate document, for example the question-document similarity between word vectors, the document representation based on the question and the similarity, and the document representation based on question understanding.
Step S400: selecting a start word and an ending word from among all the word representations, and generating an entity answer from the start word and the ending word.
In the sequence labeling layer, starting from the match-based candidate-document representation, a fully connected neural network model (FNN) is used to make a prediction for the word representation at each position in the candidate-document representation, computing the probability of that word being the start word of the entity answer or an intermediate word of the entity answer. A linear-chain conditional random field (CRF, Conditional Random Field) model is used to compute the optimal transition parameters, and decoding with the optimal transition parameters yields a label for the word at each position. In particular, the start word and the ending word of the entity answer are marked out, and the start word, the ending word, and the words between the start word and the ending word are labeled as the entity answer.
With the neural-network-based entity question answering method provided by this embodiment, a long short-term memory network extracts contextual semantic features from the question and the candidate document separately, generating a word encoding sequence of the question and a word encoding sequence of the candidate document; matching the word encoding sequence of the question against the word encoding sequence of the candidate document yields a candidate-document representation that fuses the match information, so the entity answer can be located directly using the semantics of the whole text. This reduces explicit computation and cumulative error, makes effective use of the semantic relationship between question and document, improves the precision of entity-answer localization, facilitates the use of the relevant background of the question answering setting, and further improves the responsiveness of entity question answering.
In one implementation, matching the word encoding sequence of the question against the word encoding sequence of the candidate document and generating the match-based candidate-document representation comprises:
computing the question-document similarity between the word vectors of the question and the word vectors of the candidate document;
computing, from the question-document similarity and the question word-vector sequence, the document representation based on the question and the similarity;
computing, from the question-document similarity and the candidate-document word-vector sequence, the document representation based on question understanding;
computing the candidate-document representation from the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the document representation based on question understanding.
For example, the formula for computing the question-document similarity is S_ij = softmax(q_encode_i · p_encode_j), where S_ij denotes the similarity between the i-th word vector of the question and the j-th word vector of the candidate document; the similarity values computed with this algorithm lie in the range 0 to 1. Next, the document representation based on the question and the similarity is computed as p2q_attention = S · q_encode, an overall document-to-question similarity computation that yields the document representation based on the question and the similarity, where S is the question-document similarity. Then the document representation based on question understanding is computed by first computing the weight distribution b = softmax(max_j S_ij) and then computing q2p_attention = b · p_encode, an overall question-to-document similarity computation that yields the document representation based on question understanding; the document representation based on question understanding incorporates the implicit information of the question, b being a weight distribution over the document representation given an understanding of the question. Finally, the candidate-document representation M is computed with the following formula:
M = concat(p_encode, p2q_attention, p_encode * p2q_attention, p_encode * q2p_attention).
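A small NumPy sketch of this matching computation follows. Where the source formulas are ambiguous or only partially legible (the orientation of the softmax in S, and the exact form of the weight distribution b), the gaps are filled in BiDAF-style; those choices, together with all names and dimensions, are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def match(q_encode, p_encode):
    """Fuse question and document encodings (BiDAF-style sketch).
    q_encode: (m, d) question word encodings; p_encode: (n, d)
    document word encodings. Returns M with shape (n, 4*d)."""
    raw = p_encode @ q_encode.T                 # (n, m) raw dot-product similarities
    S = softmax(raw, axis=1)                    # normalize over the question words
    p2q = S @ q_encode                          # (n, d) document-to-question attention
    b = softmax(raw.max(axis=1))                # (n,) weight per document word (assumed form)
    q2p_vec = b @ p_encode                      # (d,) question-aware document summary
    q2p = np.tile(q2p_vec, (p_encode.shape[0], 1))  # broadcast to every position
    # Concatenate the four document views into one fused representation.
    return np.concatenate([p_encode, p2q, p_encode * p2q, p_encode * q2p], axis=1)

rng = np.random.default_rng(0)
m, n, d = 3, 5, 16
M = match(rng.standard_normal((m, d)), rng.standard_normal((n, d)))
```

Each row of M is one "word representation" of the candidate document: the word's own encoding plus three question-conditioned views of it, which is what the sequence labeling layer consumes next.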
In one implementation, selecting a start word and an ending word from among all the word representations and generating an entity answer from the start word and the ending word comprises:
inputting each word representation into a fully connected neural network model, and generating a first probability value that the word representation is the start word and a second probability value that the word representation is the ending word;
selecting the start word and the ending word with a conditional random field algorithm, based on the first and second probability values corresponding to each word representation;
taking the start word, the ending word, and the intermediate words between the start word and the ending word as the entity answer.
Here, the fully connected neural network model outputs two classes, start word and ending word. Each word representation is input into the fully connected neural network model, yielding for each word two probability values, one for being the start word and one for being the ending word. Each word representation thus corresponds to a first probability value and a second probability value, and the conditional random field algorithm labels the word representations most likely to serve as the start word and the ending word. Finally, the start word, the ending word, and the span between the start word and the ending word are labeled as the entity answer.
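The data flow of this answer-selection step can be sketched as follows. Two linear heads stand in for the fully connected network's start/ending probability outputs, and an exhaustive argmax over valid spans stands in for CRF decoding; the weights are random and untrained, and all names are hypothetical, so this illustrates only the mechanics, not a trained system:

```python
import numpy as np

def pick_answer_span(M, w_start, w_end):
    """Score every document position as a potential start word / ending
    word with two linear heads, then exhaustively pick the best span
    with start <= end (a simple stand-in for CRF decoding)."""
    start_scores = M @ w_start          # logits for "is the start word"
    end_scores = M @ w_end              # logits for "is the ending word"
    n = len(start_scores)
    best_score, best_span = -np.inf, (0, 0)
    for i in range(n):
        for j in range(i, n):           # enforce start <= end
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best_score, best_span = score, (i, j)
    return best_span

tokens = ["hamlet", "was", "written", "by", "shakespeare"]
rng = np.random.default_rng(1)
M = rng.standard_normal((len(tokens), 12))   # per-word fused representations
w_start = rng.standard_normal(12)
w_end = rng.standard_normal(12)
i, j = pick_answer_span(M, w_start, w_end)
answer = tokens[i:j + 1]   # start word, intermediate words, ending word
```

A real CRF decoder would additionally learn transition parameters between the start/ending labels and decode with Viterbi; the exhaustive loop above keeps the sketch short while preserving the start-before-end constraint that decoding enforces.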
Embodiment two
In another specific implementation, a neural-network-based entity question answering device is provided, as shown in Fig. 3, comprising:
a vector conversion module 10, configured to convert the words contained in a question and in a candidate document into word vectors respectively, generating a corresponding question word-vector sequence and candidate-document word-vector sequence;
a sequence encoding module 20, configured to input the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and to output a word encoding sequence of the question and a word encoding sequence of the candidate document;
a question-document matching module 30, configured to match the word encoding sequence of the question against the word encoding sequence of the candidate document and to generate a match-based candidate-document representation, the candidate-document representation comprising multiple word representations;
an answer generation module 40, configured to select a start word and an ending word from among all the word representations and to generate an entity answer from the start word and the ending word.
In one implementation, as shown in Fig. 4, the question-document matching module 30 comprises:
a similarity computation unit 31, configured to compute the question-document similarity between the word vectors of the question and the word vectors of the candidate document;
a first document-representation generation unit 32, configured to compute, from the question-document similarity and the question word-vector sequence, the document representation based on the question and the similarity;
a second document-representation generation unit 33, configured to compute, from the question-document similarity and the candidate-document word-vector sequence, the document representation based on question understanding;
a candidate-document-representation generation unit 34, configured to compute the match-based candidate-document representation from the candidate-document word-vector sequence, the document representation based on the question and the similarity, and the document representation based on question understanding.
In one implementation, as shown in Fig. 5, the answer generation module 40 comprises:
a probability computation unit 41, configured to input each word representation into a fully connected neural network model and to generate a first probability value that the word representation is the start word and a second probability value that the word representation is the ending word;
a word selection unit 42, configured to select the start word and the ending word with a conditional random field algorithm, based on the first and second probability values corresponding to each word representation;
an answer labeling unit 43, configured to take the start word, the ending word, and the intermediate words between the start word and the ending word as the entity answer.
In one implementation, the device further comprises:
a word segmentation module, configured to perform word segmentation on the sentences contained in the question and the candidate document.
Embodiment three
An embodiment of the invention provides a neural-network-based entity question answering terminal, as shown in Fig. 6, comprising:
a memory 400 and a processor 500, the memory 400 storing a computer program executable on the processor 500. When executing the computer program, the processor 500 implements the neural-network-based entity question answering method of the above embodiments. There may be one or more memories 400 and one or more processors 500.
A communication interface 600 is used for communication between the memory 400 and processor 500 and the outside world.
The memory 400 may include high-speed RAM memory, and may also include non-volatile memory, for example at least one magnetic disk memory.
If the memory 400, the processor 500, and the communication interface 600 are implemented independently, then the memory 400, the processor 500, and the communication interface 600 may be interconnected by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in Fig. 6; this does not mean that there is only one bus or only one type of bus.
Optionally, in a specific implementation, if the memory 400, the processor 500, and the communication interface 600 are integrated on a single chip, then the memory 400, the processor 500, and the communication interface 600 may communicate with one another through internal interfaces.
Embodiment four
An embodiment of the invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements any of the neural-network-based entity question answering methods included in embodiment one.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the features of the different embodiments or examples described in this specification.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance, or as implicitly indicating the number of the technical features indicated. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means two or more, unless specifically and clearly limited otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code comprising one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention also includes other implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in the reverse order, depending on the functionality involved, as should be understood by those skilled in the art to which the embodiments of the present invention pertain.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include: an electrical connection (an electronic device) having one or more wirings, a portable computer diskette (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one, or a combination, of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit with suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps carried by the methods of the above embodiments may be completed by instructing the relevant hardware through a program, which may be stored in a computer-readable storage medium and which, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person familiar with the art could readily conceive within the technical scope disclosed herein shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (10)

1. A neural-network-based entity question answering method, characterized by comprising:
converting the words contained in a question and in a candidate document into word vectors respectively, to generate a corresponding question word-vector sequence and a corresponding candidate-document word-vector sequence;
inputting the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and outputting a word coding sequence of the question and a word coding sequence of the candidate document;
matching the word coding sequence of the question against the word coding sequence of the candidate document, and generating a match-based representation of the candidate document, the candidate-document representation comprising a plurality of word representations;
selecting a start word and an end word among all the word representations, and generating an entity answer according to the start word and the end word.
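The encoding steps of claim 1 (words to word vectors, then an LSTM yielding one coding per word) can be sketched with NumPy. Everything below is an illustrative assumption rather than the patent's actual model: the toy vocabulary, the dimensions, the random weights, and the use of a single-direction LSTM cell in place of the full long short-term memory network model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and embedding table (invented for the example).
vocab = {"who": 0, "wrote": 1, "hamlet": 2, "shakespeare": 3, "play": 4}
emb_dim, hidden = 8, 6
E = rng.normal(size=(len(vocab), emb_dim))  # word-vector (embedding) table

def embed(words):
    # Step 1 of claim 1: each word becomes a word vector.
    return np.stack([E[vocab[w]] for w in words])

def lstm_encode(X, W, U, b):
    # Step 2 of claim 1: an LSTM pass yields one coding per word.
    def sig(z):
        return 1.0 / (1.0 + np.exp(-z))
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    outs = []
    for x in X:
        gates = W @ x + U @ h + b            # all four gates stacked
        i, f, o, g = np.split(gates, 4)
        i, f, o, g = sig(i), sig(f), sig(o), np.tanh(g)
        c = f * c + i * g                    # cell state update
        h = o * np.tanh(c)                   # this word's coding
        outs.append(h)
    return np.stack(outs)

W = rng.normal(size=(4 * hidden, emb_dim))
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

q_enc = lstm_encode(embed(["who", "wrote", "hamlet"]), W, U, b)
d_enc = lstm_encode(embed(["shakespeare", "wrote", "hamlet"]), W, U, b)
print(q_enc.shape, d_enc.shape)  # one 6-dimensional coding per word
```

A deployed system would use pretrained embeddings and a trained, typically bidirectional, LSTM; the point here is only that both the question and the candidate document end up as sequences of fixed-size word codings.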
2. The method according to claim 1, characterized in that matching the word coding sequence of the question against the word coding sequence of the candidate document and generating the match-based candidate-document representation comprises:
calculating a question-document similarity between the word vectors of the question and the word vectors of the candidate document;
calculating a question-and-similarity-based document representation according to the question-document similarity and the question word-vector sequence;
calculating a question-understanding-based document representation according to the question-document similarity and the candidate-document word-vector sequence;
calculating the candidate-document representation according to the candidate-document word-vector sequence, the question-and-similarity-based document representation, and the question-understanding-based document representation.
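The four sub-steps of claim 2 resemble the attention-flow matching used in machine-reading models such as BiDAF. The sketch below is one plausible reading of the claim under that assumption, with random codings and invented shapes, not the patent's exact formulas.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
Q = rng.normal(size=(3, 6))  # question word codings (3 words)
D = rng.normal(size=(5, 6))  # candidate-document word codings (5 words)

# Question-document similarity: one score per (document word, question word) pair.
S = D @ Q.T                              # shape (5, 3)

# "Document representation based on the question and the similarity":
# each document word attends over the question words.
A = softmax(S, axis=1)                   # each row sums to 1
doc2q = A @ Q                            # (5, 6)

# "Document representation understood through the question":
# the most question-relevant document words, broadcast over the document.
w = softmax(S.max(axis=1))               # (5,)
q2doc = np.tile(w @ D, (len(D), 1))      # (5, 6)

# Match-based candidate-document representation: one per-word representation
# built from the document codings and both attended representations.
G = np.concatenate([D, doc2q, D * doc2q, D * q2doc], axis=1)
print(G.shape)  # five word representations of size 24
```

The concatenation at the end matches the claim's structure: the final candidate-document representation is computed from the document's own codings plus the two question-conditioned representations.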
3. The method according to claim 1, characterized in that selecting a start word and an end word among all the word representations and generating an entity answer according to the start word and the end word comprises:
inputting each word representation into a fully connected neural network model, and generating a first probability value that the word representation is the start word and a second probability value that the word representation is the end word;
selecting the start word and the end word using a conditional random field algorithm according to the first probability value and the second probability value corresponding to each word representation;
taking the start word, the end word, and the intermediate words between the start word and the end word as the entity answer.
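Claim 3's answer extraction can be illustrated as follows. The fully connected layer producing the two probability values is shown directly; the conditional random field of the claim is replaced, purely for brevity, by an exhaustive search over spans with start <= end, and the weights and token list are invented for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
n_words, feat = 5, 24
G = rng.normal(size=(n_words, feat))     # per-word match representations

# Fully connected layer: two logits per word (start-word and end-word).
W_fc = rng.normal(size=(feat, 2))
p_start = softmax(G @ W_fc[:, 0])        # first probability value per word
p_end = softmax(G @ W_fc[:, 1])          # second probability value per word

# The claim selects the words with a conditional random field; as a simpler
# stand-in, take the highest-probability span with start <= end.
best, span = -1.0, (0, 0)
for s in range(n_words):
    for e in range(s, n_words):
        if p_start[s] * p_end[e] > best:
            best, span = p_start[s] * p_end[e], (s, e)

tokens = ["shakespeare", "wrote", "hamlet", "in", "london"]  # made-up tokens
answer = tokens[span[0]:span[1] + 1]     # start word, end word, words between
print(span, answer)
```

The final slice mirrors the last step of the claim: the entity answer is the start word, the end word, and every intermediate word between them.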
4. The method according to any one of claims 1 to 3, characterized in that, before converting the words contained in the question and in the candidate document into word vectors respectively, the method comprises:
performing word segmentation on the sentences contained in the question and in the candidate document.
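The word segmentation that claim 4 prepends is needed because Chinese text has no spaces between words, so sentences must be cut into words before word vectors can be looked up. A naive forward maximum-matching segmenter over a toy dictionary illustrates the idea; real systems would use a trained segmenter, and the dictionary below is invented for the example.

```python
# Toy dictionary (illustrative only); real segmenters such as jieba use much
# larger dictionaries plus statistical models.
DICT = {"北京", "大学", "北京大学", "是", "谁", "创办", "的"}

def cut(sentence, max_len=4):
    """Greedy forward maximum matching: at each position take the longest
    dictionary word, falling back to a single character."""
    words, i = [], 0
    while i < len(sentence):
        for l in range(min(max_len, len(sentence) - i), 0, -1):
            if l == 1 or sentence[i:i + l] in DICT:
                words.append(sentence[i:i + l])
                i += l
                break
    return words

print(cut("北京大学是谁创办的"))  # ['北京大学', '是', '谁', '创办', '的']
```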
5. A neural-network-based entity question answering apparatus, characterized by comprising:
a vector conversion module, configured to convert the words contained in a question and in a candidate document into word vectors respectively, to generate a corresponding question word-vector sequence and a corresponding candidate-document word-vector sequence;
a sequence coding module, configured to input the question word-vector sequence and the candidate-document word-vector sequence separately into a long short-term memory network model, and output a word coding sequence of the question and a word coding sequence of the candidate document;
a question-document matching module, configured to match the word coding sequence of the question against the word coding sequence of the candidate document and generate a match-based representation of the candidate document, the candidate-document representation comprising a plurality of word representations;
an answer generation module, configured to select a start word and an end word among all the word representations, and generate an entity answer according to the start word and the end word.
6. The apparatus according to claim 5, characterized in that the question-document matching module comprises:
a similarity calculation unit, configured to calculate a question-document similarity between the word vectors of the question and the word vectors of the candidate document;
a first document representation generation unit, configured to calculate a question-and-similarity-based document representation according to the question-document similarity and the question word-vector sequence;
a second document representation generation unit, configured to calculate a question-understanding-based document representation according to the question-document similarity and the candidate-document word-vector sequence;
a candidate-document representation generation unit, configured to calculate the match-based candidate-document representation according to the candidate-document word-vector sequence, the question-and-similarity-based document representation, and the question-understanding-based document representation.
7. The apparatus according to claim 5, characterized in that the answer generation module comprises:
a probability calculation unit, configured to input each word representation into a fully connected neural network model, and generate a first probability value that the word representation is the start word and a second probability value that the word representation is the end word;
a word selection unit, configured to select the start word and the end word using a conditional random field algorithm according to the first probability value and the second probability value corresponding to each word representation;
an answer marking unit, configured to take the start word, the end word, and the intermediate words between the start word and the end word as the entity answer.
8. The apparatus according to any one of claims 5 to 7, characterized in that the apparatus further comprises:
a word segmentation module, configured to perform word segmentation on the sentences contained in the question and in the candidate document.
9. A neural-network-based entity question answering terminal, characterized by comprising:
one or more processors; and
a memory, configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 4.
10. A computer-readable storage medium storing a computer program, characterized in that, when executed by a processor, the program implements the method according to any one of claims 1 to 4.
CN201810714445.4A 2018-06-29 2018-06-29 Neural-network-based entity question answering method, apparatus and terminal Pending CN108959556A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810714445.4A CN108959556A (en) Neural-network-based entity question answering method, apparatus and terminal


Publications (1)

Publication Number Publication Date
CN108959556A true CN108959556A (en) 2018-12-07

Family

ID=64485214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810714445.4A Pending CN108959556A (en) Neural-network-based entity question answering method, apparatus and terminal

Country Status (1)

Country Link
CN (1) CN108959556A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7933904B2 (en) * 2007-04-10 2011-04-26 Nelson Cliff File search engine and computerized method of tagging files with vectors
CN103229162A (en) * 2010-09-28 2013-07-31 国际商业机器公司 Providing answers to questions using logical synthesis of candidate answers
CN106778887A (en) * 2016-12-27 2017-05-31 努比亚技术有限公司 The terminal and method of sentence flag sequence are determined based on condition random field
CN107562792A (en) * 2017-07-31 2018-01-09 同济大学 A kind of question and answer matching process based on deep learning
WO2018085710A1 (en) * 2016-11-04 2018-05-11 Salesforce.Com, Inc. Dynamic coattention network for question answering


Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815325B (en) * 2019-01-18 2021-12-10 北京百度网讯科技有限公司 Answer extraction method, device, server and storage medium
CN109815325A (en) * 2019-01-18 2019-05-28 北京百度网讯科技有限公司 Answer extracting method, apparatus, server and storage medium
CN109885672A (en) * 2019-03-04 2019-06-14 中国科学院软件研究所 A kind of question and answer mode intelligent retrieval system and method towards online education
CN109885672B (en) * 2019-03-04 2020-10-30 中国科学院软件研究所 Question-answering type intelligent retrieval system and method for online education
CN110619042A (en) * 2019-03-13 2019-12-27 北京航空航天大学 Neural network-based teaching question and answer system and method
CN110222345A (en) * 2019-06-18 2019-09-10 卓尔智联(武汉)研究院有限公司 Cloze Test answer method, apparatus, electronic equipment and storage medium
CN110245334A (en) * 2019-06-25 2019-09-17 北京百度网讯科技有限公司 Method and apparatus for output information
CN110245334B (en) * 2019-06-25 2023-06-16 北京百度网讯科技有限公司 Method and device for outputting information
CN111881264A (en) * 2020-09-28 2020-11-03 北京智源人工智能研究院 Method and electronic equipment for searching long text in question-answering task in open field
CN111881264B (en) * 2020-09-28 2020-12-15 北京智源人工智能研究院 Method and electronic equipment for searching long text in question-answering task in open field
CN112347229B (en) * 2020-11-12 2021-07-20 润联软件系统(深圳)有限公司 Answer extraction method and device, computer equipment and storage medium
CN112347229A (en) * 2020-11-12 2021-02-09 润联软件系统(深圳)有限公司 Answer extraction method and device, computer equipment and storage medium
CN112417126A (en) * 2020-12-02 2021-02-26 车智互联(北京)科技有限公司 Question answering method, computing equipment and storage medium
CN112417126B (en) * 2020-12-02 2024-01-23 车智互联(北京)科技有限公司 Question answering method, computing device and storage medium

Similar Documents

Publication Publication Date Title
CN108959556A (en) Neural-network-based entity question answering method, apparatus and terminal
Lo Bosco et al. Deep learning architectures for DNA sequence classification
CN110377903B (en) Sentence-level entity and relation combined extraction method
KR102116518B1 (en) Apparatus for answering a question based on maching reading comprehension and method for answering a question using thereof
CN109284506A (en) A kind of user comment sentiment analysis system and method based on attention convolutional neural networks
Yu et al. Sequential labeling using deep-structured conditional random fields
CN108170684A (en) Text similarity computing method and system, data query system and computer product
CN110781306B (en) English text aspect layer emotion classification method and system
CN112163429B (en) Sentence correlation obtaining method, system and medium combining cyclic network and BERT
CN106294635B (en) Application program searching method, the training method of deep neural network model and device
CN112015868A (en) Question-answering method based on knowledge graph completion
CN109597992B (en) Question similarity calculation method combining synonym dictionary and word embedding vector
CN112800239B (en) Training method of intention recognition model, and intention recognition method and device
Jebbara et al. Aspect-based relational sentiment analysis using a stacked neural network architecture
CN108897852A (en) Judgment method, device and the equipment of conversation content continuity
CN110210043A (en) Text interpretation method, device, electronic equipment and readable storage medium storing program for executing
JP2021522569A (en) Machine learning model with evolving domain-specific lexicon features for text annotation
CN106557554B (en) The display methods and device of search result based on artificial intelligence
Wang et al. Summary-aware attention for social media short text abstractive summarization
CN109597988A (en) The former prediction technique of vocabulary justice, device and electronic equipment across language
CN108846125A (en) Talk with generation method, device, terminal and computer readable storage medium
CN110263167A (en) Medical bodies method of generating classification model, device, equipment and readable storage medium storing program for executing
CN110399472A (en) Reminding method, device, computer equipment and storage medium are putd question in interview
CN111914553A (en) Financial information negative subject judgment method based on machine learning
Li et al. Semi-supervised Domain Adaptation for Dependency Parsing via Improved Contextualized Word Representations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181207