CN110489538A - Sentence answer method, device and electronic equipment based on artificial intelligence - Google Patents
- Publication number
- CN110489538A CN201910797093.8A
- Authority
- CN
- China
- Prior art keywords
- sentence
- user
- semantic
- corpus
- question sentence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Human Computer Interaction (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Machine Translation (AREA)
Abstract
The present invention provides an artificial-intelligence-based sentence response method, apparatus, electronic device, and storage medium. The method includes: obtaining a user question and identifying the entity words in the user question; determining a target corpus entry in a corpus according to the user question, and determining the semantics of the target corpus entry as the sentence-granularity semantics of the user question; determining the semantic attribute corresponding to each entity word, and arranging the entity words according to the progressive relationship corresponding to the semantic attributes, to obtain the word-granularity semantics of the user question; determining the output semantics of the user question according to the sentence-granularity semantics and the word-granularity semantics; querying a configured knowledge graph according to the output semantics to obtain a semantic result; and generating a response sentence according to the semantic result. The dual-granularity mechanism of the invention improves semantic generalization and applicability to different user questions, so that questions are answered correctly.
Description
Technical field
The present invention relates to artificial intelligence technology, and in particular to an artificial-intelligence-based sentence response method, apparatus, electronic device, and storage medium.
Background
Artificial intelligence (AI, Artificial Intelligence) is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best result. Natural language processing (NLP, Natural Language Processing) is an important direction of artificial intelligence. It studies the theories and methods that enable effective communication between people and computers in natural language, and is a science that integrates linguistics, computer science, and mathematics.
Sentence response is an important branch of natural language processing. It refers to the process of converting a user question into a logical form that a computer can understand and responding according to stored information. In the schemes provided by the related art, a classification model is usually trained on a large amount of high-quality corpus data; the classification model then performs semantic parsing on the user question, and a response is given according to the parsing result. However, for user questions that contain noise or are relatively complex, it is difficult for the classification model to parse the core semantics, so the system may be unable to reply, or the response sentence may not correspond to the user question. In summary, the schemes provided by the related art have poor applicability to different user questions and low response accuracy.
Summary of the invention
Embodiments of the present invention provide an artificial-intelligence-based sentence response method, apparatus, electronic device, and storage medium that can improve applicability to different user questions and improve response accuracy.
The technical solution of the embodiments of the present invention is implemented as follows:
An embodiment of the present invention provides an artificial-intelligence-based sentence response method, comprising:
obtaining a user question, and identifying the entity words in the user question;
determining a target corpus entry in a corpus according to the user question, and determining the semantics of the target corpus entry as the sentence-granularity semantics of the user question;
determining the semantic attribute corresponding to each entity word, and arranging the entity words according to the progressive relationship corresponding to the semantic attributes, to obtain the word-granularity semantics of the user question;
determining the output semantics of the user question according to the sentence-granularity semantics and the word-granularity semantics;
querying a configured knowledge graph according to the output semantics to obtain a semantic result; and
generating a response sentence according to the semantic result.
In the above solution, identifying the entity words in the user question comprises:
performing entity recognition on the user question with a named entity recognition model to obtain a first recognition result;
performing string matching on the user question against a preset dictionary to obtain a second recognition result; and
merging and deduplicating the first recognition result and the second recognition result to obtain the entity words.
In the above solution, querying the configured knowledge graph according to the output semantics to obtain the semantic result comprises:
when the output semantics corresponds to at least two semantic attributes, querying the knowledge graph successively, in the order of the semantic attributes in the output semantics, to obtain attribute results in one-to-one correspondence with the semantic attributes; and
combining the attribute results into the semantic result.
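The successive query over at least two semantic attributes can be pictured with a minimal sketch. It is not the claimed implementation: the toy graph, the names `KG` and `query_chain`, and the choice to feed each attribute result into the next query as its subject are all assumptions made for illustration (the last one is suggested by the complex-question example "the age of Zhang San's wife", but the text does not state it explicitly).

```python
from typing import Dict, List, Tuple

# Toy knowledge graph stored as (entity, attribute) -> attribute value.
# Every entry here is invented for illustration only.
KG: Dict[Tuple[str, str], str] = {
    ("Zhang San", "wife"): "Wang Wu",
    ("Wang Wu", "age"): "30",
    ("Zhang San", "age"): "32",
}


def query_chain(entity: str, attributes: List[str]) -> List[str]:
    """Query the attributes in order and collect one result per attribute.

    Assumption: the result of one query becomes the subject of the next.
    """
    results: List[str] = []
    current = entity
    for attribute in attributes:
        value = KG.get((current, attribute))
        if value is None:
            break  # the chain cannot continue past a missing fact
        results.append(value)
        current = value
    return results


# "the age of Zhang San's wife": attributes in order are wife, then age.
semantic_result = query_chain("Zhang San", ["wife", "age"])
```

Here `semantic_result` holds one attribute result per semantic attribute, in query order; combining them into a response sentence is left to a later step.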
In the above solution, obtaining the user question comprises:
obtaining user speech; and
performing speech recognition on the user speech to obtain the user question.
In the above solution, the method further comprises:
identifying preset symbols in each corpus entry included in the corpus, and deleting the preset symbols;
converting all letters in the corpus entries to either upper case or lower case; and
converting all Chinese characters in the corpus entries to either traditional or simplified characters.
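The three corpus clean-up operations above can be sketched as one normalization function. This is only an illustration under stated assumptions: the symbol set in `SET_SYMBOLS`, the three-character `TRAD_TO_SIMP` table, and the choice of lower case and simplified characters as targets are all hypothetical; a practical system would use a complete conversion table.

```python
import re

# Hypothetical set of symbols to delete (punctuation and whitespace).
SET_SYMBOLS = re.compile(r"[，。？！,.?!\s]")

# Tiny illustrative traditional-to-simplified table; a real table is far larger.
TRAD_TO_SIMP = str.maketrans({"張": "张", "歲": "岁", "齡": "龄"})


def normalize(text: str) -> str:
    """Delete preset symbols, unify letter case, and unify the character form."""
    text = SET_SYMBOLS.sub("", text)      # step 1: delete the preset symbols
    text = text.lower()                   # step 2: all letters to lower case
    return text.translate(TRAD_TO_SIMP)   # step 3: traditional -> simplified


normalized = normalize("張三的年齡？ ABC")
```

Applying the same function to both corpus entries and incoming questions keeps the two sides comparable, which is presumably the point of the normalization.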
An embodiment of the present invention provides an artificial-intelligence-based sentence response apparatus, comprising:
an identification module, configured to obtain a user question and identify the entity words in the user question;
a sentence-granularity processing module, configured to determine a target corpus entry in a corpus according to the user question, and to determine the semantics of the target corpus entry as the sentence-granularity semantics of the user question;
a word-granularity processing module, configured to determine the semantic attribute corresponding to each entity word, and to arrange the entity words according to the progressive relationship corresponding to the semantic attributes, to obtain the word-granularity semantics of the user question;
a semantic output module, configured to determine the output semantics of the user question according to the sentence-granularity semantics and the word-granularity semantics;
a result query module, configured to query a configured knowledge graph according to the output semantics to obtain a semantic result; and
a sentence generation module, configured to generate a response sentence according to the semantic result.
In the above solution, the identification module is further configured to:
perform entity recognition on the user question with a named entity recognition model to obtain a first recognition result;
perform string matching on the user question against a preset dictionary to obtain a second recognition result; and
merge and deduplicate the first recognition result and the second recognition result to obtain the entity words.
In the above solution, the result query module is further configured to:
when the output semantics corresponds to at least two semantic attributes, query the knowledge graph successively, in the order of the semantic attributes in the output semantics, to obtain attribute results in one-to-one correspondence with the semantic attributes; and
combine the attribute results into the semantic result.
In the above solution, the identification module is further configured to:
obtain user speech; and
perform speech recognition on the user speech to obtain the user question.
In the above solution, the artificial-intelligence-based sentence response apparatus further comprises:
a deletion module, configured to identify preset symbols in each corpus entry included in the corpus and delete the preset symbols;
a first conversion module, configured to convert all letters in the corpus entries to either upper case or lower case; and
a second conversion module, configured to convert all Chinese characters in the corpus entries to either traditional or simplified characters.
An embodiment of the present invention provides an electronic device, comprising:
a memory, configured to store executable instructions; and
a processor, configured to implement the artificial-intelligence-based sentence response method provided by the embodiments of the present invention when executing the executable instructions stored in the memory.
An embodiment of the present invention provides a storage medium storing executable instructions which, when executed, cause a processor to implement the artificial-intelligence-based sentence response method provided by the embodiments of the present invention.
The embodiments of the present invention have the following beneficial effects:
For an obtained user question, the embodiments of the present invention on the one hand determine a target corpus entry in the corpus according to the user question, thereby determining the sentence-granularity semantics, and on the other hand obtain the word-granularity semantics according to the semantic attributes of the entity words in the user question. The sentence-granularity semantics and the word-granularity semantics are then combined to determine the final output semantics, and the response is given according to the output semantics. This dual-granularity mechanism improves applicability to different user questions, including both simple and complex questions, and improves response accuracy.
Brief description of the drawings
Fig. 1 is an optional architecture diagram of the artificial-intelligence-based sentence response system provided by an embodiment of the present invention;
Fig. 2 is an optional structural diagram of the server provided by an embodiment of the present invention;
Fig. 3 is an optional structural diagram of the artificial-intelligence-based sentence response apparatus provided by an embodiment of the present invention;
Fig. 4A is an optional flow diagram of the artificial-intelligence-based sentence response method provided by an embodiment of the present invention;
Fig. 4B is another optional flow diagram of the artificial-intelligence-based sentence response method provided by an embodiment of the present invention;
Fig. 5 is an optional structural diagram of the BERT model provided by an embodiment of the present invention;
Fig. 6 is an optional flow diagram of sentence response provided by an embodiment of the present invention;
Fig. 7 is a flow diagram of the sentence question-answering scheme provided by the related art;
Fig. 8 is another flow diagram of the artificial-intelligence-based sentence response method provided by an embodiment of the present invention;
Fig. 9 is a comparison diagram of a response scenario provided by an embodiment of the present invention;
Fig. 10 is another comparison diagram of a response scenario provided by an embodiment of the present invention;
Fig. 11 is a diagram of a response scenario for a complex question provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
In the following description, reference is made to "some embodiments", which describe subsets of all possible embodiments. It should be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with one another where no conflict arises.
In the following description, the terms "first", "second", and "third" are used only to distinguish similar objects and do not denote a particular ordering of the objects. It should be understood that, where permitted, "first", "second", and "third" may be interchanged in a specific order or sequence, so that the embodiments of the present invention described herein can be implemented in an order other than that illustrated or described herein.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by a person skilled in the technical field of the present invention. The terms used herein are only for the purpose of describing the embodiments of the present invention and are not intended to limit the present invention.
Before the embodiments of the present invention are described in further detail, the nouns and terms involved in the embodiments are explained; the following explanations apply to them.
1) Entity word: also called a named entity; an entity with a specific meaning in a sentence, such as a person name, place name, organization name, or proper noun. Entity words in a sentence are usually determined by named entity recognition (NER, Named Entity Recognition).
2) Semantic parsing: the process of converting natural language into a logical form that a machine can understand.
3) ES (ElasticSearch): a Lucene-based search server that provides a distributed, real-time search and analytics engine; used herein to determine candidate corpus entries related to the user question. Lucene is a toolkit for full-text retrieval and search engines.
4) Simple question: a user question that includes one subject and one attribute, such as "Zhang San's age".
5) Complex question: a user question that includes at least two subjects and one attribute, such as "the age of Zhang San's wife".
6) Cold start: the stage in which a product or new feature has just been launched and faces the difficulties of validating market demand, a shortage of data, and a lack of users.
7) Grammatical attribute: includes at least one of the following: the named-entity attribute of an entity word (such as person name, organization name, or university name), the part of speech of an entity word (such as noun, verb, or preposition), and the syntactic role of an entity word in the user question (such as subject, predicate, or object).
8) Semantic attribute: predefined text that carries a specific meaning, such as "local" and "wife".
9) Knowledge graph: describes the entities and concepts that exist in the real world and the relationships between them. It is usually built from structures such as "entity-relation-entity" and "entity-attribute-attribute value"; one "entity-relation-entity" structure is equivalent to one piece of knowledge in the knowledge graph, and the same holds for "entity-attribute-attribute value".
10) Corpus entry: a user question used as linguistic data.
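As a rough illustration of how ES (term 3 above) might be asked for candidate corpus entries, the sketch below only builds a request body in the Elasticsearch query DSL using a `match` query. The field name `text`, the function name `build_candidate_query`, and the default result size are assumptions for illustration; no server is contacted, and the source does not specify the actual query used.

```python
from typing import Any, Dict


def build_candidate_query(question: str, size: int = 10) -> Dict[str, Any]:
    """Build a full-text `match` query body for retrieving candidate corpus entries.

    Assumption: each corpus entry is indexed as a document whose question text
    lives in a field named "text".
    """
    return {
        "size": size,  # number of candidate corpus entries to return
        "query": {
            "match": {
                "text": {"query": question}
            }
        },
    }


body = build_candidate_query("Zhang San's age", size=5)
```

The resulting dictionary would be sent as the body of a search request against the index that holds the corpus; ranking the returned candidates to pick the target corpus entry is a separate step.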
In implementing the embodiments of the present invention, the inventors found that, in the sentence response schemes provided by the related art, a classification model is usually trained on corpus data that is mined online or written manually; the trained classification model then performs semantic parsing on the user question, and a response is given according to the parsing result. Such schemes have the following disadvantages: (1) in the cold-start stage of a product or new feature, the amount of corpus data that can be mined online is very limited, while writing corpus data manually has an excessive labor cost and takes a long time; (2) the semantic generalization ability of the trained model is limited, so for some noisy user questions it is difficult to locate the core semantics; for example, if the user question "Zhang San other township" contains the noise word "people", the model tends to parse "people" as a word with practical meaning when processing the question; (3) the trained model handles complex questions poorly; for a question such as "which university did Li Si's wife graduate from", the model often fails to produce the correct semantics.
Embodiments of the present invention provide an artificial-intelligence-based sentence response method, apparatus, electronic device, and storage medium that can improve applicability to different user questions and improve response accuracy. Exemplary applications of the electronic device provided by the embodiments of the present invention are described below.
Referring to Fig. 1, Fig. 1 is an optional architecture diagram of the artificial-intelligence-based sentence response system 100 provided by an embodiment of the present invention. To support an artificial-intelligence-based sentence response application, terminal devices 400 (terminal device 400-1 and terminal device 400-2 are shown as examples) connect to server 200 through network 300, which may be a wide area network, a local area network, or a combination of the two. Fig. 1 also shows corpus 500, which has a communication connection with server 200.
Terminal device 400 is configured to display the artificial-intelligence-based sentence response application (referred to as the response application) in graphical interface 410 (graphical interface 410-1 and graphical interface 410-2 are shown as examples), and to obtain a user question according to the user's operation in the response application and send the user question to server 200. Server 200 is configured to obtain the user question and identify the entity words in the user question; to obtain corpus entries from corpus 500, determine a target corpus entry among them according to the user question, and determine the semantics of the target corpus entry as the sentence-granularity semantics of the user question; to determine the semantic attribute corresponding to each entity word and arrange the entity words according to the progressive relationship corresponding to the semantic attributes, to obtain the word-granularity semantics of the user question; to determine the output semantics of the user question according to the sentence-granularity semantics and the word-granularity semantics; to query the configured knowledge graph according to the output semantics to obtain a semantic result; and to generate a response sentence according to the semantic result and send the response sentence to terminal device 400. Terminal device 400 is further configured to display the response sentence in the response application of graphical interface 410.
The exemplary application of the electronic device provided by the embodiments of the present invention is described further below. The electronic device may be implemented as any of various types of terminal device, such as a laptop, a tablet computer, a desktop computer, a set-top box, or a mobile device (for example, a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, or a portable gaming device), and may also be implemented as a server. In the following, the electronic device is described taking a server as an example.
Referring to Fig. 2, Fig. 2 is an architecture diagram of server 200 provided by an embodiment of the present invention (which may be, for example, server 200 shown in Fig. 1). Server 200 shown in Fig. 2 includes at least one processor 210, memory 250, at least one network interface 220, and user interface 230. The components of server 200 are coupled through bus system 240. It can be understood that bus system 240 implements connection and communication among these components. In addition to a data bus, bus system 240 includes a power bus, a control bus, and a status signal bus. For clarity, however, all buses are labeled as bus system 240 in Fig. 2.
Processor 210 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
User interface 230 includes one or more output devices 231 that enable media content to be presented, including one or more speakers and/or one or more visual display screens. User interface 230 also includes one or more input devices 232, including user interface components that facilitate user input, such as a keyboard, a mouse, a microphone, a touch-screen display, a camera, and other input buttons and controls.
Memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard disk drives, optical disc drives, and the like. Memory 250 optionally includes one or more storage devices physically remote from processor 210.
Memory 250 includes volatile memory or non-volatile memory, and may include both. The non-volatile memory may be read-only memory (ROM, Read Only Memory), and the volatile memory may be random access memory (RAM, Random Access Memory). The memory 250 described in the embodiments of the present invention is intended to include any suitable type of memory.
In some embodiments, memory 250 can store data to support various operations. Examples of such data include programs, modules, and data structures, or subsets or supersets thereof, as illustrated below.
Operating system 251 includes system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, and a driver layer, which implement various basic services and handle hardware-based tasks;
Network communication module 252 is used to reach other computing devices via one or more (wired or wireless) network interfaces 220; exemplary network interfaces 220 include Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB, Universal Serial Bus);
Presentation module 253 is used to enable information to be presented (for example, a user interface for operating peripheral devices and displaying content and information) via one or more output devices 231 (for example, a display screen or speakers) associated with user interface 230;
Input processing module 254 is used to detect one or more user inputs or interactions from one of the one or more input devices 232 and to translate the detected inputs or interactions.
In some embodiments, the artificial-intelligence-based sentence response apparatus provided by the embodiments of the present invention may be implemented in software. Fig. 2 shows the artificial-intelligence-based sentence response apparatus 255 stored in memory 250, which may be software in the form of a program, a plug-in, or the like, and which includes the following software modules: identification module 2551, sentence-granularity processing module 2552, word-granularity processing module 2553, semantic output module 2554, result query module 2555, and sentence generation module 2556. These modules are logical, and can therefore be combined arbitrarily or further split according to the functions they implement.
The function of each module is described below.
In other embodiments, the artificial-intelligence-based sentence response apparatus provided by the embodiments of the present invention may be implemented in hardware. As an example, the apparatus may be a processor in the form of a hardware decoding processor that is programmed to execute the artificial-intelligence-based sentence response method provided by the embodiments of the present invention. For example, the processor in the form of a hardware decoding processor may use one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.
The artificial-intelligence-based sentence response method provided by the embodiments of the present invention may be executed by the above server, by a terminal device (for example, terminal device 400-1 or terminal device 400-2 shown in Fig. 1), or jointly by the server and the terminal device.
The following describes, with reference to the exemplary application and structure of the electronic device described above, how the electronic device implements the artificial-intelligence-based sentence response method through the embedded artificial-intelligence-based sentence response apparatus.
Referring to Fig. 3 and Fig. 4A, Fig. 3 is a structural diagram of the artificial-intelligence-based sentence response apparatus 255 provided by an embodiment of the present invention, showing the processing flow in which a series of modules implement semantic parsing of the user question and sentence response. Fig. 4A is a flow diagram of the artificial-intelligence-based sentence response method provided by an embodiment of the present invention. The steps shown in Fig. 4A are described below with reference to Fig. 3.
In step 101, a user question is obtained, and the entity words in the user question are identified.
Here, a user question in text form is obtained, and the named entities with practical meaning in the user question are identified as entity words.
In some embodiments, obtaining the user question may be implemented as follows: obtaining user speech, and performing speech recognition on the user speech to obtain the user question.
As an example, referring to Fig. 3, in identification module 2551, in addition to obtaining a user question that the user enters as text, user speech may also be obtained, and automatic speech recognition (ASR, Automatic Speech Recognition) may be performed on the user speech to obtain a user question in text form. This improves the flexibility of obtaining user questions and suits application scenarios such as communication software.
In some embodiments, identifying the entity words in the user question may be implemented as follows: performing entity recognition on the user question with a named entity recognition model to obtain a first recognition result; performing string matching on the user question against a preset dictionary to obtain a second recognition result; and merging and deduplicating the first recognition result and the second recognition result to obtain the entity words.
As an example, referring to Fig. 3, the identification module 2551 has two processing branches for the user question sentence. In one processing branch, entity recognition is performed on the user question sentence through a named entity recognition model to obtain the first recognition result; the named entity recognition model is a machine learning model. To improve recognition, the named entity recognition model may be trained on the corpus entries already in the corpus set and the entity words already determined in those entries, and entity recognition is then performed on the user question sentence with the trained model. The embodiment of the present invention does not limit the specific type of the named entity recognition model; it may be, for example, a conditional random field (CRF, Conditional Random Field) model or a hidden Markov model (HMM, Hidden Markov Model). In the other processing branch, string matching is performed on the user question sentence according to the preset dictionary to obtain the second recognition result. The preset dictionary includes multiple defined entity words and may be built according to the actual application scenario. The string matching may be multi-pattern matching, which means searching the user question sentence simultaneously for the multiple pattern strings (existing words) defined in the preset dictionary. The embodiment of the present invention does not limit how multi-pattern matching is performed; for example, it may be performed by building a dictionary tree (trie) or by building an AC automaton (Aho-Corasick automaton).
After the two processing branches have processed the user question sentence, their results are fused, that is, the first recognition result and the second recognition result are merged and deduplicated to obtain the entity words. For example, if the first recognition result includes word1 and word2 and the second recognition result includes word2 and word3, the entity words obtained after merging and deduplication include word1, word2 and word3. Entity recognition through the named entity recognition model can identify new words that do not appear in the preset dictionary, while string matching against the preset dictionary can match the existing words in the preset dictionary. Combining the two recognition approaches improves the accuracy and validity of the determined entity words.
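
The two-branch recognition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the first recognition result is a hard-coded stand-in for the output of a named entity recognition model, and the function names build_trie, match_dictionary and merge_results are hypothetical. The dictionary branch uses a plain dictionary tree; an AC automaton could play the same role.

```python
from typing import Dict, List


def build_trie(words: List[str]) -> Dict:
    """Build a dictionary tree (trie) from the preset dictionary."""
    root: Dict = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$end"] = word  # multi-character key, cannot clash with a single character
    return root


def match_dictionary(question: str, trie: Dict) -> List[str]:
    """Return every dictionary word found in the question, in order of appearance."""
    found: List[str] = []
    for start in range(len(question)):
        node = trie
        for ch in question[start:]:
            if ch not in node:
                break
            node = node[ch]
            if "$end" in node:
                found.append(node["$end"])
    return found


def merge_results(first: List[str], second: List[str]) -> List[str]:
    """Merge two recognition results and deduplicate, keeping first-seen order."""
    return list(dict.fromkeys(first + second))


# Hypothetical stand-in for the first recognition result (model branch).
first_result = ["word1", "word2"]
second_result = match_dictionary("word1 word2 word3", build_trie(["word2", "word3"]))
entity_words = merge_results(first_result, second_result)
```

Here the model branch contributes word1, which is absent from the dictionary, and the dictionary branch contributes word3, mirroring the example in the text.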
In step 102, a target corpus is determined in the corpus set according to the user question sentence, and the semantics of the target corpus is determined as the section-sentence granularity semantics of the user question sentence.
In the embodiment of the present invention, semantic parsing is performed through a dual-granularity mechanism of section-sentence granularity and word granularity. Specifically, at the section-sentence granularity, a target corpus is determined in the corpus set according to the user question sentence, and the semantics of the target corpus is taken as the semantics of the user question sentence; for ease of distinction, the semantics determined here is named the section-sentence granularity semantics. A corpus entry is a question sentence whose semantics has already been determined; corpus entries may be obtained by crawling online data or by manual writing and are then added to the corpus set.
In some embodiments, between any of the steps, the sentence answering method based on artificial intelligence further includes: identifying the set symbols in each corpus entry included in the corpus set and deleting the set symbols; converting all letters in the corpus entries to upper case or to lower case; and converting all Chinese characters in the corpus entries to traditional Chinese characters or to simplified Chinese characters.
The embodiment of the present invention may preprocess the corpus set. Specifically, the set symbols in each corpus entry included in the corpus set are identified and deleted; a set symbol is a meaningless symbol unrelated to the content of the corpus entry and may be configured according to the actual application scenario, for example as "!", "#" and "*". In addition, all letters in the corpus entries are converted to upper case or to lower case, and all Chinese characters in the corpus entries are converted to traditional or to simplified Chinese characters, which keeps the corpus set uniform.
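
A minimal sketch of this preprocessing is given below. The function name preprocess is hypothetical, lower case and simplified characters are chosen arbitrarily from the two options the text allows, and the two-entry conversion table is only a placeholder: a real system would need a complete traditional-to-simplified table.

```python
SET_SYMBOLS = {"!", "#", "*"}  # the example set symbols named in the text

# Hypothetical, deliberately tiny traditional-to-simplified table.
TRAD_TO_SIMP = {"張": "张", "學": "学"}


def preprocess(entry: str) -> str:
    """Delete set symbols, unify characters to simplified, and lowercase letters."""
    kept = (ch for ch in entry if ch not in SET_SYMBOLS)
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in kept).lower()


cleaned = preprocess("#Hello* 張三!")
```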
In step 103, the semantic attribute corresponding to the entity word is determined, and the entity words are arranged according to the progressive relationship corresponding to the semantic attribute to obtain the word granularity semantics of the user question sentence.
It is worth noting that, when the entity words are arranged according to the progressive relationship corresponding to the semantic attribute, an entity word that corresponds to a semantic attribute is itself replaced with that semantic attribute, which facilitates subsequent parsing.
As an example, referring to Fig. 3, in the N-Gram matching of the word granularity processing module 2553, the progressive relationship of a semantic attribute is constrained through a left/right entity mechanism. For example, for the user question sentence "Which university did Zhang San graduate from", the identified entity words include "Zhang San" and "university". The semantic attribute corresponding to "university" is "graduated school", and the progressive relationship corresponding to this semantic attribute is: left entity [person]/right entity [university]. The entity words are then arranged in their left-to-right order in the user question sentence, and "university" is replaced with "graduated school", giving the word granularity semantics of the user question sentence: Zhang San graduated school. Here, [person] and [university] are grammatical attributes.
In the case of some complex question sentences, the user question sentence may correspond to at least two semantic attributes. For example, for the user question sentence "Which university did Zhang San's wife graduate from", the identified entity words include "Zhang San", "wife" and "university". The semantic attribute corresponding to "wife" is "wife", with the progressive relationship left entity [person]/right entity [person]; the semantic attribute corresponding to "university" is "graduated school", with the progressive relationship left entity [person]/right entity [university]. The entity words are arranged in their left-to-right order in the user question sentence and replaced with the corresponding semantic attributes, giving the word granularity semantics: Zhang San wife graduated school. If at least two entity words whose grammatical attributes match the left entity of a semantic attribute precede the entity word corresponding to that semantic attribute, the at least two entity words are merged in their left-to-right order in the user question sentence. In the above example, "Zhang San" and "wife" both match [person], so "Zhang San" and "wife" are merged and together serve as the left entity of the "graduated school" attribute.
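
The arrangement step can be illustrated with a simplified sketch. Everything named here is an assumption made for illustration: the three lookup tables, the function word_granularity_semantics, and the simplification that only the immediate left neighbour is checked against the left entity (the merging of several left entities described above is not modelled).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical tables for illustration only.
WORD_TO_ATTRIBUTE: Dict[str, str] = {
    "wife": "wife",
    "university": "graduated school",
}
# Progressive relationship per semantic attribute: (left entity, right entity).
PROGRESSIVE_RELATION: Dict[str, Tuple[str, str]] = {
    "wife": ("person", "person"),
    "graduated school": ("person", "university"),
}
GRAMMATICAL_ATTRIBUTE: Dict[str, str] = {
    "Zhang San": "person",
    "wife": "person",
    "university": "university",
}


def word_granularity_semantics(entity_words: List[str]) -> Optional[List[str]]:
    """Keep left-to-right order and replace attribute-bearing words.

    Returns None when a word's left neighbour does not fit the left entity
    of its semantic attribute.
    """
    semantics: List[str] = []
    for index, word in enumerate(entity_words):
        attribute = WORD_TO_ATTRIBUTE.get(word)
        if attribute is None:
            semantics.append(word)  # no semantic attribute: keep the word itself
            continue
        left_type, _right_type = PROGRESSIVE_RELATION[attribute]
        if index == 0:
            return None  # nothing can serve as the left entity
        if GRAMMATICAL_ATTRIBUTE.get(entity_words[index - 1]) != left_type:
            return None
        semantics.append(attribute)
    return semantics


simple = word_granularity_semantics(["Zhang San", "university"])
complex_case = word_granularity_semantics(["Zhang San", "wife", "university"])
```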
In some embodiments, determining the semantic attribute corresponding to the entity word may be implemented as follows: matching the entity word against at least two set semantic dictionaries, where each semantic dictionary corresponds to one semantic attribute; and, when the entity word is successfully matched with a semantic dictionary, determining the semantic attribute corresponding to that semantic dictionary as the semantic attribute of the entity word.
Because language is variable, that is, multiple words may carry the same meaning, the embodiment of the present invention sets multiple semantic attributes and sets one semantic dictionary for each semantic attribute. For example, for the semantic attribute "local", the corresponding semantic dictionary includes words such as "family", "home" and "hometown".
As an example, referring to Fig. 3, in the word granularity processing module 2553, the entity words in the user question sentence are matched against all semantic dictionaries. When an entity word is successfully matched with a semantic dictionary, that is, when the semantic dictionary contains the entity word, the semantic attribute corresponding to that semantic dictionary is determined as the semantic attribute of the entity word. This improves the success rate of determining semantic attributes.
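
A minimal sketch of the dictionary lookup follows. The "local" dictionary comes from the example above; the second dictionary, its contents and the function name find_semantic_attribute are hypothetical and only show that several dictionaries are consulted.

```python
from typing import Dict, Optional, Set

SEMANTIC_DICTIONARIES: Dict[str, Set[str]] = {
    "local": {"family", "home", "hometown"},
    # Hypothetical second dictionary, added for illustration.
    "graduated school": {"university", "college"},
}


def find_semantic_attribute(entity_word: str) -> Optional[str]:
    """Return the attribute whose semantic dictionary contains the entity word."""
    for attribute, dictionary in SEMANTIC_DICTIONARIES.items():
        if entity_word in dictionary:
            return attribute
    return None  # the entity word is a non-matching word
```

An entity word for which the lookup returns None corresponds to what the following paragraphs call a non-matching word.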
In some embodiments, after determining the semantic attribute corresponding to the entity word, the method further includes: determining entity words in the user question sentence that do not correspond to a semantic attribute as non-matching words, and determining the word weights of the non-matching words and of the entity words other than the non-matching words; determining the penalty value of each non-matching word according to its word weight and the sentence length of the user question sentence; determining the sentence score of the user question sentence according to the penalty values and the word weights of the entity words other than the non-matching words; and, when the sentence score is less than a sentence score threshold, determining the word granularity semantics to be empty.
Since the user question sentence may contain noise, and the noise may be identified as entity words, the embodiment of the present invention analyzes the entity words. Specifically, entity words in the user question sentence that do not correspond to a semantic attribute are determined as non-matching words, and the word weights of the non-matching words and of the entity words other than the non-matching words are determined. Word weights may be preset; for example, the word weight of "Zhang San" is set to 0.6 and the word weight of "Li Si" is set to 0.5. Then, the word weight of a non-matching word is divided by the sentence length of the user question sentence to obtain the penalty value of the non-matching word, where the sentence length is the total number of words included in the user question sentence. The word weights of the entity words other than the non-matching words are summed, and the penalty values are subtracted from the sum to obtain the sentence score of the user question sentence. When the sentence score is less than the set sentence score threshold, the identified entity words are determined to be unreliable and the word granularity semantics is determined to be empty; when the sentence score exceeds the sentence score threshold, the entity words other than the non-matching words are arranged according to the progressive relationship corresponding to the semantic attribute to obtain the word granularity semantics of the user question sentence. The sentence score threshold may be set according to the actual application scenario. In this way, word granularity semantics is not computed for user question sentences with too much noise or too many errors, which saves processing resources.
In some embodiments, determining the penalty value of the non-matching word according to its word weight and the sentence length of the user question sentence may be implemented as follows: determining the grammatical attribute of the non-matching word; when the grammatical attribute of the non-matching word is subject, determining that the penalty score of the non-matching word is empty; and, when the grammatical attribute of the non-matching word is not subject, determining the penalty value of the non-matching word according to its word weight and the sentence length of the user question sentence.
In the embodiment of the present invention, non-matching words that have actual meaning can be identified and exempted from penalty. For example, it is set that a non-matching word whose grammatical attribute is person-name subject is not penalized. Suppose the user question sentence is "Zhang San people local", in which the noise is "people". The identified entity words include "Zhang San", "people" and "local", and the entity word corresponding to a semantic attribute is "local", so the non-matching words include "Zhang San" and "people". After it is determined that the grammatical attribute of "Zhang San" is person-name subject and that the grammatical attribute of "people" is not, the penalty score of the non-matching word "Zhang San" is determined to be empty, while for the non-matching word "people" the penalty value is determined from its word weight and the sentence length of the user question sentence. On this basis, assuming the word weight of "local" is 0.8, the word weight of "Zhang San" is 0.6 and the word weight of "people" is 0.5, the sentence score is 0.8 - 0*0.6 ("Zhang San" is not penalized) - 0.5/5 (the penalty value of "people") = 0.7. This effectively separates non-matching words that have actual meaning from those that do not, improving the accuracy of the calculated sentence score.
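
The scoring rule and the worked example above can be reproduced with a short sketch. The function name sentence_score and its parameter names are hypothetical; the sentence length is passed in explicitly (5 in the example) rather than computed, and the set of exempt words stands in for the grammatical-attribute check.

```python
from typing import Dict, Iterable, Set


def sentence_score(
    matched: Iterable[str],
    non_matching: Iterable[str],
    weights: Dict[str, float],
    sentence_length: int,
    exempt: Set[str],
) -> float:
    """Sum the weights of matched entity words, then subtract the penalties."""
    score = sum(weights[word] for word in matched)
    for word in non_matching:
        if word in exempt:
            continue  # e.g. a person-name subject: penalty is empty
        score -= weights[word] / sentence_length  # penalty = word weight / sentence length
    return score


weights = {"local": 0.8, "Zhang San": 0.6, "people": 0.5}
score = sentence_score(["local"], ["Zhang San", "people"], weights, 5, {"Zhang San"})
```

With the weights from the text, the result is 0.7 up to floating-point rounding, which would then be compared with the sentence score threshold.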
In step 104, the output semantics of the user question sentence is determined according to the section-sentence granularity semantics and the word granularity semantics.
After the section-sentence granularity semantics and the word granularity semantics are obtained, one of them is selected as the output semantics of the user question sentence. The embodiment of the present invention does not limit how the selection is made. For example, since word granularity parsing usually works better for complex question sentences, the word granularity semantics is determined as the output semantics when the sentence length of the user question sentence exceeds a sentence length threshold, and the section-sentence granularity semantics is determined as the output semantics when the sentence length does not exceed the sentence length threshold.
In some embodiments, determining the output semantics of the user question sentence according to the section-sentence granularity semantics and the word granularity semantics may be implemented as follows: when the section-sentence granularity semantics is not empty, determining the section-sentence granularity semantics as the output semantics; when the section-sentence granularity semantics is empty and the word granularity semantics is not empty, determining the word granularity semantics as the output semantics; and, when both the section-sentence granularity semantics and the word granularity semantics are empty, outputting a response-failure prompt.
As an example, referring to Fig. 3, the semantic output module 2554 determines the output semantics using a mechanism that gives priority to the section-sentence granularity. When the section-sentence granularity semantics is not empty, it is directly determined as the output semantics. When the section-sentence granularity semantics is empty and the word granularity semantics is not empty, for example when the question sentence is a complex question sentence, the word granularity semantics is determined as the output semantics. When both the section-sentence granularity semantics and the word granularity semantics are empty, the semantics cannot be parsed, so a response-failure prompt is output. The cases in which the section-sentence granularity semantics is empty and in which the word granularity semantics is empty are described in detail elsewhere in this description. By setting granularity priority in this way, simple question sentences use the section-sentence granularity semantics as the output semantics and complex question sentences use the word granularity semantics, which improves the accuracy of the output semantics.
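
The priority mechanism reduces to a short selection function. In this sketch, empty semantics is represented by None, and the function name choose_output_semantics is hypothetical; a None return value stands for the case in which the caller outputs the response-failure prompt.

```python
from typing import Optional


def choose_output_semantics(
    section_sentence: Optional[str], word: Optional[str]
) -> Optional[str]:
    """Prefer section-sentence granularity, fall back to word granularity."""
    if section_sentence is not None:
        return section_sentence
    if word is not None:
        return word
    return None  # both empty: the caller outputs a response-failure prompt
```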
In step 105, a query is performed in a set knowledge graph according to the output semantics to obtain a semantic result.
As an example, referring to Fig. 3, in the result query module 2555, the knowledge graph is queried for knowledge that matches the output semantics to obtain the semantic result. For example, if one piece of knowledge in the knowledge graph is "Zhang San-local-Beijing" and the output semantics is "Zhang San local", this piece of knowledge is found in the knowledge graph according to the output semantics, and "Beijing" in this piece of knowledge is determined as the semantic result.
In some embodiments, querying the set knowledge graph according to the output semantics to obtain the semantic result may also be implemented as follows: when the output semantics corresponds to at least two semantic attributes, querying the knowledge graph successively in the order of the semantic attributes in the output semantics to obtain attribute results in one-to-one correspondence with the semantic attributes; and combining the attribute results into the semantic result.
When the determined output semantics corresponds to at least two semantic attributes, the attribute result corresponding to each semantic attribute is queried in the knowledge graph in turn. For example, for the output semantics "Zhang San wife graduated school", first, according to the semantic attribute "wife", the knowledge graph is queried for who Zhang San's wife is, which determines the first attribute result, assumed here to be "Miss Zhu"; then, on the basis of the first attribute result, according to the semantic attribute "graduated school", the knowledge graph is queried for Miss Zhu's graduated school, which determines the second attribute result. The two attribute results are combined to obtain the final semantic result. It is worth noting that the section-sentence granularity semantics may also have semantic attributes, which are set in the semantics of the corpus entries. This makes the semantic result more fine-grained and suits different ways of asking.
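
The successive query can be sketched with the knowledge graph reduced to a lookup table keyed by (entity, semantic attribute). The table contents beyond the examples in the text ("University X" in particular), the dictionary representation and the function name query_chain are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Toy knowledge graph: (entity, semantic attribute) -> value.
KNOWLEDGE: Dict[Tuple[str, str], str] = {
    ("Zhang San", "local"): "Beijing",
    ("Zhang San", "wife"): "Miss Zhu",
    ("Miss Zhu", "graduated school"): "University X",  # hypothetical value
}


def query_chain(subject: str, attributes: List[str]) -> Optional[List[str]]:
    """Query each attribute in order, feeding each result into the next query."""
    results: List[str] = []
    current = subject
    for attribute in attributes:
        value = KNOWLEDGE.get((current, attribute))
        if value is None:
            return None  # no matching knowledge for this hop
        results.append(value)
        current = value
    return results


attribute_results = query_chain("Zhang San", ["wife", "graduated school"])
```

The returned list holds one attribute result per semantic attribute, in query order, ready to be combined into the semantic result.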
In step 106, an answer statement is generated according to the semantic result.
As an example, referring to Fig. 3, in the sentence generation module 2556, the output semantics and the semantic result may be combined through a set sentence template to obtain the answer statement. For example, the sentence template may be set as "the xx of xx is xx"; when the output semantics is "Zhang San local" and the semantic result is "Beijing", the answer statement "the local of Zhang San is Beijing" is obtained, which improves the user experience. Of course, the semantic result may also be used directly as the answer statement; the embodiment of the present invention does not limit this.
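
Template filling of this kind can be sketched in a few lines. The placeholder names and the function generate_answer are hypothetical; the template wording follows the example in the text.

```python
SENTENCE_TEMPLATE = "the {attribute} of {subject} is {result}"


def generate_answer(subject: str, attribute: str, result: str) -> str:
    """Fill the set sentence template with the output semantics and the result."""
    return SENTENCE_TEMPLATE.format(attribute=attribute, subject=subject, result=result)


answer = generate_answer("Zhang San", "local", "Beijing")
```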
As can be seen from the above exemplary implementation of Fig. 4A, through the dual-granularity mechanism, the embodiment of the present invention improves applicability to different user question sentences, including simple and complex question sentences, and improves the accuracy of the response.
In some embodiments, referring to Fig. 4B, Fig. 4B is another optional schematic flowchart of the sentence answering method based on artificial intelligence provided by an embodiment of the present invention, which is described in conjunction with Fig. 3 and the steps shown in Fig. 4B.
In Fig. 4B, step 102 shown in Fig. 4A may be implemented by step 1021 to step 1024, specifically:
In step 1021, candidate corpora are determined in the corpus set according to the user question sentence.
As an example, referring to Fig. 3, in the section-sentence granularity processing module 2552, related corpus entries are determined in the corpus set according to the user question sentence and taken as candidate corpora.
In some embodiments, determining candidate corpora in the corpus set according to the user question sentence may be implemented as follows: determining the grammatical attribute of the entity word, and replacing the entity word whose grammatical attribute in the user question sentence is subject with a reference word; querying the corpus set for corpus entries related to the replaced user question sentence, and determining the degree of correlation between each such corpus entry and the replaced user question sentence; and determining the corpus entries whose degree of correlation with the replaced user question sentence satisfies a degree-of-correlation condition as candidate corpora.
While the entity words in the user question sentence are being identified, the grammatical attribute of each entity word in the user question sentence is determined through part-of-speech tagging (POS tagging, Part-Of-Speech tagging) and related techniques. Since the corpus entries in the corpus set usually do not contain a subject, the entity word whose grammatical attribute in the user question sentence is subject is replaced with a reference word, such as the letter A, so that it does not adversely affect the process of determining candidate corpora.
Then, the replaced user question sentence is processed by ES. Specifically, the replaced user question sentence may be segmented into words, the corpus set is queried for corpus entries that contain any word of the word segmentation result, and the degree of correlation between each such corpus entry and the replaced user question sentence is determined. Here, the TF-IDF of the words of the word segmentation result that the corpus entry contains may be used as features to obtain the degree of correlation. TF-IDF is the product of TF and IDF: the term frequency (TF, Term Frequency) indicates how frequently the word occurs in the corpus entry, and the inverse document frequency (IDF, Inverse Document Frequency) measures the general importance of the word and may be obtained by dividing the total number of corpus entries in the corpus set by the number of corpus entries containing the word and then taking the base-10 logarithm of the quotient. TF and IDF are both positively correlated with the degree of correlation. Further features may also be introduced when determining the degree of correlation, which is not described in detail here. By using ES to support corpus queries, the embodiment of the present invention can still manage and query massive data quickly when the number of corpus entries in the corpus set grows substantially, for example to the hundreds of millions, ensuring that results for all kinds of complex search conditions are returned within a short time (usually within 1 second). Compared with a traditional relational database such as MySQL, which tends to time out under such queries, this is a marked improvement.
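
The TF-IDF computation described above can be sketched in plain Python. This is not how ES scores documents internally; it only illustrates the formula in the text. Two points are assumptions: TF is taken as the word's count divided by the entry's length, and the degree of correlation is taken as the plain sum of the TF-IDF values of the query words the entry contains. The function names and the toy corpus set are hypothetical.

```python
import math
from typing import List


def tf_idf(word: str, entry: List[str], corpus_set: List[List[str]]) -> float:
    """TF-IDF of a word in one corpus entry, with a base-10 logarithm for IDF."""
    containing = sum(1 for other in corpus_set if word in other)
    if containing == 0 or not entry:
        return 0.0
    tf = entry.count(word) / len(entry)  # assumption: relative frequency
    idf = math.log10(len(corpus_set) / containing)
    return tf * idf


def relevance(query_words: List[str], entry: List[str], corpus_set: List[List[str]]) -> float:
    """Assumed combination: sum of TF-IDF over the query words found in the entry."""
    return sum(tf_idf(w, entry, corpus_set) for w in set(query_words) if w in entry)


# Toy, already-segmented corpus set.
corpus_set = [["A", "wife", "who"], ["A", "son", "name"], ["weather", "today"]]
query = ["A", "wife"]
scores = [relevance(query, entry, corpus_set) for entry in corpus_set]
```

In this toy data the first entry shares both query words and therefore receives the highest score, while the third entry shares none and scores zero.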
A degree-of-correlation condition is set for the degree of correlation; for example, the condition may be the n highest degrees of correlation, where n is an integer greater than 0 and less than 25. The corpus entries whose degree of correlation with the replaced user question sentence satisfies the condition are determined as candidate corpora. For example, if the replaced user question sentence input to ES is "who is the wife of A", the candidate corpora output by ES include "who is the wife of A", "what is the name of the son's wife of A" and "do you know who the wife of A is". Determining candidate corpora in this way achieves a preliminary screening of the corpus entries related to the user question sentence and provides the data basis for subsequently determining the target corpus.
In some embodiments, determining the semantics of the target corpus as the section-sentence granularity semantics of the user question sentence may be implemented as follows: replacing the reference word in the semantics of the target corpus with the entity word whose grammatical attribute is subject, to obtain the section-sentence granularity semantics of the user question sentence.
Given that the corpus entries in the corpus set do not contain a subject, for the determined target corpus, the reference word in the semantics of the target corpus is replaced with the entity word whose grammatical attribute is subject, to obtain the section-sentence granularity semantics of the user question sentence. For example, if the target corpus is "who is the wife of A", the semantics of the target corpus is "A wife", and the entity word whose grammatical attribute is subject is "Zhang San", then the section-sentence granularity semantics of the user question sentence is "Zhang San wife", which ensures that the correct semantic result can subsequently be found in the knowledge graph.
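
The round trip between the subject and the reference word can be sketched as follows. The function names are hypothetical, and the semantics is represented as a list of tokens (an assumption) so that restoring the subject does not touch a letter "A" that happens to occur inside another word.

```python
from typing import List

REFERENCE_WORD = "A"  # the reference word used in the example above


def replace_subject(question: str, subject: str) -> str:
    """Replace the subject entity word with the reference word before retrieval."""
    return question.replace(subject, REFERENCE_WORD)


def restore_subject(semantics: List[str], subject: str) -> List[str]:
    """Put the subject back into the semantics of the target corpus."""
    return [subject if token == REFERENCE_WORD else token for token in semantics]


replaced = replace_subject("who is the wife of Zhang San", "Zhang San")
section_sentence_semantics = restore_subject(["A", "wife"], "Zhang San")
```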
In step 1022, the statement similarity between the user question sentence and the candidate corpus is determined.
In some embodiments, determining the statement similarity between the user question sentence and the candidate corpus may also be implemented as follows: determining a first similarity between the user question sentence and the candidate corpus through a neural network model; determining a second similarity between the user question sentence and the candidate corpus through an extreme gradient boosting model; and determining the statement similarity between the user question sentence and the candidate corpus according to the first similarity and the second similarity.
As an example, referring to Fig. 3, in the section-sentence granularity processing module 2552, the user question sentence and the candidate corpus are input to the neural network model. The neural network model of the embodiment of the present invention may be a Bidirectional Encoder Representations from Transformers (BERT) model, and the similarity between the user question sentence and the candidate corpus is predicted by having the BERT model perform a classification task. The BERT model includes an embedding layer and a fully connected layer: the embedding layer generates the word vectors corresponding to the user question sentence and the word vectors corresponding to the candidate corpus, and the fully connected layer processes the word vectors to generate the similarity between the user question sentence and the candidate corpus. For ease of distinction, the similarity generated by the neural network model is named the first similarity. For example, if the user question sentence is "who is the wife of A" and the candidate corpus is "do you know who the wife of A is", then after they are input to the BERT model, the output of the BERT model, 0.866, is taken as the first similarity. It is worth noting that, when the BERT model is trained in advance, the weight parameters of each layer of the BERT model are adjusted using a training corpus set. The training corpus set differs from the corpus set above: it contains paired corpus entries together with the similarity corresponding to each pair.
In the section-sentence granularity processing module 2552 of Fig. 3, the user question sentence and the candidate corpus are also input to an extreme gradient boosting (XGBoost, eXtreme Gradient Boosting) model. At input time, the user question sentence and the candidate corpus are fused through at least one of the fasttext, TF-IDF and one-hot encoding approaches to generate input features, and the input features are input to the XGBoost model. The XGBoost model performs a classification task on the input features through the classification and regression trees (CART, Classification And Regression Tree) it includes and finally generates the similarity corresponding to the input features. For ease of distinction, the similarity generated by the XGBoost model is named the second similarity. For example, if the user question sentence is "who is the wife of A" and the candidate corpus is "do you know who the wife of A is", input features are generated and input to the XGBoost model, and the output of the XGBoost model, 0.85, is taken as the second similarity. It is worth noting that the embodiment of the present invention does not limit how the input features are generated. For example, when the user question sentence and the candidate corpus contain only letters, the input features may be set as {total character length of the user question sentence, total character length of the candidate corpus, difference between the total character lengths of the user question sentence and the candidate corpus, number of words in the user question sentence, number of words in the candidate corpus}. In addition, when the XGBoost model is trained in advance, ten-fold cross-validation may be used to determine the accuracy of the XGBoost model, and training is determined to be complete when the accuracy is higher than an accuracy threshold. Ten-fold cross-validation means dividing the training corpus set used for training into 10 parts, using 9 parts in turn as training data and the remaining 1 part as test data, and taking the average of the accuracies obtained in the 10 runs as the final accuracy.
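
The letter-only feature set listed above can be computed as in the sketch below. The function name input_features is hypothetical, words are counted by splitting on whitespace, and the length difference is taken as an absolute value; the text does not say whether it is signed, so that choice is an assumption.

```python
from typing import List


def input_features(question: str, candidate: str) -> List[int]:
    """Length-based features for a letter-only question and candidate corpus."""
    q_len, c_len = len(question), len(candidate)
    return [
        q_len,                   # total character length of the user question sentence
        c_len,                   # total character length of the candidate corpus
        abs(q_len - c_len),      # length difference (absolute value is an assumption)
        len(question.split()),   # number of words in the user question sentence
        len(candidate.split()),  # number of words in the candidate corpus
    ]


features = input_features("who is A", "do you know who A is")
```

A vector like this would then be passed to the trained gradient boosting model, which is outside the scope of the sketch.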
The first similarity and the second similarity are combined by weighted summation to obtain the statement similarity. The weights corresponding to the first similarity and the second similarity may be determined according to the actual application scenario; the larger the weight, the more important the corresponding similarity. For example, if the weight of the first similarity is set to 0.6 and the weight of the second similarity is set to 0.4, then in the above example the statement similarity is 0.6*0.866+0.4*0.85 ≈ 0.86. Combining the neural network model and the extreme gradient boosting model in this way improves the accuracy of the determined statement similarity.
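
The weighted summation can be checked with a short sketch; the function name statement_similarity is hypothetical, and the default weights are the example values from the text.

```python
def statement_similarity(
    first: float, second: float, first_weight: float = 0.6, second_weight: float = 0.4
) -> float:
    """Weighted sum of the first (neural network) and second (XGBoost) similarity."""
    return first_weight * first + second_weight * second


similarity = statement_similarity(0.866, 0.85)
```

With the example values the exact sum is 0.8596, which rounds to the 0.86 given in the text.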
In step 1023, when the statement similarity exceeds a first similarity threshold, the candidate corpus is determined as the target corpus, and the semantics of the target corpus is determined as the section-sentence granularity semantics of the user question sentence.
In the embodiment of the present invention, the target corpus is determined by means of the set first similarity threshold. When only one statement similarity exceeds the first similarity threshold, the candidate corpus corresponding to that statement similarity is determined as the target corpus; when at least two statement similarities exceed the first similarity threshold, the candidate corpus corresponding to the highest statement similarity is determined as the target corpus. Then, the semantics of the target corpus is determined as the section-sentence granularity semantics of the user question sentence.
In step 1024, when the statement similarities of all the candidate corpora are less than the first similarity threshold, the target corpus and the section-sentence granularity semantics are determined to be empty.
In the other case, the statement similarities of all candidate corpora are less than the first similarity threshold; the target corpus and the section-sentence granularity semantics are then determined to be empty.
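
Steps 1023 and 1024 together amount to a thresholded arg-max, sketched below. The function name select_target_corpus, the candidate strings and the threshold value 0.8 are hypothetical; an empty target corpus is represented by None.

```python
from typing import Dict, Optional


def select_target_corpus(
    similarities: Dict[str, float], first_threshold: float
) -> Optional[str]:
    """Return the candidate with the highest similarity above the threshold, if any."""
    above = {c: s for c, s in similarities.items() if s > first_threshold}
    if not above:
        return None  # step 1024: target corpus and semantics are empty
    return max(above, key=lambda c: above[c])  # step 1023


target = select_target_corpus(
    {"who is the wife of A": 0.95, "do you know who the wife of A is": 0.86},
    first_threshold=0.8,  # hypothetical threshold value
)
```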
In some embodiments, after step 1022, the method further includes: when the statement similarity between the user question sentence and the candidate corpus is less than the first similarity threshold and greater than a second similarity threshold, submitting the user question sentence to a set auditing party and obtaining the auditing party's audit result for the user question sentence; and, when the audit result is that the sentence is correct, adding the user question sentence to the corpus set; where the first similarity threshold is greater than the second similarity threshold.
Because a model is limited by its training corpus set, it may perform poorly on question types that it was not trained on or was trained on only a few times; that is, even if the user question sentence is fairly similar to a candidate corpus, the similarity produced by the model may not be high. To handle this situation, in addition to the first similarity threshold, the embodiment of the present invention also sets a second similarity threshold, where the first similarity threshold is greater than the second similarity threshold. When the statement similarity between the user question sentence and the candidate corpus is less than the first similarity threshold and exceeds the second similarity threshold, the user question sentence is submitted to a set audit party, for example a manual audit party, and the audit party's audit result for the user question sentence is obtained. When the audit result is a correct sentence, the user question sentence is added to the corpus set, so that the corpus set grows dynamically. In addition, the user question sentence and the candidate corpus may be added as a pair to the training corpus set, so as to train the neural network model and the extreme gradient boosting model and improve the ability of the two models to handle different types of user question sentences.
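The two-threshold routing described above can be sketched as follows. This is only an illustration under stated assumptions: the names `route_question` and `add_if_correct`, the string labels, and the treatment of scores exactly equal to a threshold are hypothetical and are not specified by the source.

```python
from typing import Set


def route_question(score: float, first_threshold: float, second_threshold: float) -> str:
    """Decide what to do with one (user question sentence, candidate corpus) pair."""
    if first_threshold <= second_threshold:
        raise ValueError("the first similarity threshold must be greater than the second")
    if score > first_threshold:
        return "target"   # candidate may become the target corpus
    if score > second_threshold:
        return "audit"    # submit the user question sentence to the audit party
    return "none"         # neither matched nor worth auditing


def add_if_correct(corpus_set: Set[str], question: str, audit_is_correct: bool) -> Set[str]:
    """Grow the corpus set only when the audit result is a correct sentence."""
    if audit_is_correct:
        corpus_set.add(question)
    return corpus_set
```

The band between the two thresholds is what lets the corpus set grow: questions the models score as "nearly similar" are reviewed rather than silently dropped.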
In Fig. 4B, before step 103, the grammatical attribute of the entity word may also be determined in step 107.
While the entity words in the user question sentence are identified, the grammatical attribute of each entity word in the user question sentence is determined by part-of-speech tagging.
In step 108, the grammatical attribute of the entity word is verified against a set grammar template.
As an example, referring to Fig. 3, a grammar template is provided in the words granularity processing module 2553. For example, the grammar template specifies a subject-noun correspondence; the grammatical attribute of an entity word is verified against the grammar template to judge whether it satisfies the subject-noun correspondence.
In step 109, when a grammatical attribute does not conform to the grammar template, the entity word corresponding to that grammatical attribute is deleted.
For example, when the grammatical attribute of an entity word is subject-preposition and the grammar template contains the subject-noun correspondence, it is determined that the grammatical attribute does not conform to the grammar template; the corresponding entity word is removed and is not processed further.
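Steps 107 to 109 can be illustrated with the short sketch below. It assumes, hypothetically, that a grammatical attribute is represented as a (syntactic role, part of speech) pair and that the grammar template is a set of allowed pairs; the source does not specify these data structures, and the example words are invented.

```python
from typing import List, Set, Tuple

# Hypothetical representation: (syntactic role, part of speech).
GrammarAttr = Tuple[str, str]


def filter_entity_words(tagged_words: List[Tuple[str, GrammarAttr]],
                        grammar_template: Set[GrammarAttr]) -> List[str]:
    """Keep only the entity words whose grammatical attribute conforms to the template."""
    return [word for word, attr in tagged_words if attr in grammar_template]


# Usage: a template containing only the subject-noun correspondence.
template = {("subject", "noun")}
tagged = [("Xiao Zhang", ("subject", "noun")), ("of", ("subject", "preposition"))]
kept = filter_entity_words(tagged, template)
```

Here the word tagged subject-preposition is rejected, matching the example in the text.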
From the above exemplary implementation of Fig. 4B, it can be seen that the embodiment of the present invention first screens candidate corpora out of the corpus set and then screens the target corpus out of the candidate corpora, so that the semantics of the target corpus are determined as the section sentence granularity semantics of the user question sentence, which improves the accuracy of the determined section sentence granularity semantics; in addition, filtering the entity words according to the grammar template improves the accuracy of the semantic attributes determined subsequently.
To facilitate understanding, the embodiment of the present invention provides a schematic structural diagram of the BERT model, as shown in Fig. 5. When the first similarity at section sentence granularity is determined, text preprocessing is first performed on the user question sentence and the candidate corpus, including operations such as deleting set symbols and unifying the letter case and the traditional or simplified format of Chinese characters. The user question sentence and the candidate corpus are then segmented separately; the embodiment of the present invention does not limit the segmentation method. For the user question sentence, segmentation yields the result Tok1 ... TokN; for the candidate corpus, segmentation yields the result Tok'1 ... Tok'M, where N and M are integers greater than 0. It is worth noting that two special symbols are also added when the segmentation results are input to the BERT model: [CLS] is the special symbol used for classification output, that is, it indicates that the final output is the similarity between the user question sentence and the candidate corpus; [SEP] is the special symbol used to separate non-contiguous token sequences, that is, to separate the segmentation result of the user question sentence from that of the candidate corpus. Then, in the BERT model, the segmentation results and the special symbols are converted into word vectors by an embedding layer, the word vectors are processed by fully connected layers, and the first similarity between the user question sentence and the candidate corpus, that is, C in Fig. 5, is finally output.
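The preprocessing and input assembly described above can be sketched as follows. The sketch is illustrative only: the symbol list `SET_SYMBOLS`, the function names, and the pre-split tokens are assumptions, traditional/simplified conversion is omitted because it needs an external mapping, and the model itself is not shown. Common BERT implementations also append a final [SEP] after the second sequence; the sketch follows the text and adds only the two symbols it mentions.

```python
from typing import List

# Hypothetical list of "set symbols" to delete during preprocessing.
SET_SYMBOLS = "?!,.;:"


def preprocess(text: str) -> str:
    """Delete the set symbols and unify letter case (lowercase is chosen here)."""
    return text.translate(str.maketrans("", "", SET_SYMBOLS)).lower()


def build_pair_input(question_tokens: List[str], corpus_tokens: List[str]) -> List[str]:
    """Assemble [CLS] Tok1..TokN [SEP] Tok'1..Tok'M as one input sequence."""
    return ["[CLS]"] + question_tokens + ["[SEP]"] + corpus_tokens


# Usage with whitespace segmentation standing in for a real word segmenter.
question = preprocess("Who is Xiao Zhang's WIFE?").split()
candidate = preprocess("Who is his wife?").split()
sequence = build_pair_input(question, candidate)
```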
In the following, exemplary applications of the embodiment of the present invention in some practical application scenarios are described.
Referring to Fig. 6, Fig. 6 is a schematic flowchart of sentence response provided by an embodiment of the present invention. In Fig. 6, the user question sentence is "Who is Xiao Zhang's wife". Through named entity recognition and part-of-speech tagging, the entity words in the user question sentence are obtained as "Xiao Zhang" and "wife", and the grammatical attribute of "Xiao Zhang" is determined to be person-name subject. Assume that the semantic dictionary of the semantic attribute "Zhang San" includes the two words "Zhang San" and "Xiao Zhang", that the semantic dictionary of the semantic attribute "spouse" includes the two words "spouse" and "wife", and that the semantic attribute "Zhang San" has no progressive relationship. The entity words are then arranged according to the progressive relationship corresponding to "wife" and, at the same time, replaced with their corresponding semantic attributes, which gives the output semantics: Zhang San spouse. A subject-predicate-object (SPO, Subject-Predication-Object) triple is constructed from the output semantics, giving <S: Zhang San, P: spouse, O: >. The knowledge graph is queried according to the SPO triple, and the semantic result "Miss Zhu" is obtained. Finally, the answer sentence "The spouse of Zhang San is Miss Zhu." is generated according to a set sentence template, which completes the whole sentence response flow.
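The last three stages of the Fig. 6 flow (building the SPO triple, querying the knowledge graph, and filling a sentence template) can be sketched as below. The in-memory dictionary standing in for the knowledge graph, the function names, and the template wording are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Toy knowledge graph keyed by (subject, predicate); a real system would query a graph store.
KNOWLEDGE_GRAPH: Dict[Tuple[str, str], str] = {("Zhang San", "spouse"): "Miss Zhu"}


def build_spo(output_semantics: Tuple[str, str]) -> Tuple[str, str, Optional[str]]:
    """Turn output semantics such as ("Zhang San", "spouse") into <S, P, O> with O unknown."""
    subject, predicate = output_semantics
    return (subject, predicate, None)


def answer(spo: Tuple[str, str, Optional[str]],
           graph: Dict[Tuple[str, str], str]) -> Optional[str]:
    """Look up the missing object and fill a simple sentence template."""
    subject, predicate, _ = spo
    obj = graph.get((subject, predicate))
    if obj is None:
        return None  # no semantic result; the caller would report an answer failure
    return f"The {predicate} of {subject} is {obj}."


reply = answer(build_spo(("Zhang San", "spouse")), KNOWLEDGE_GRAPH)
```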
Referring to Fig. 7, Fig. 7 is a schematic flowchart of a sentence question answering scheme provided by the related art. In Fig. 7, corpora are generated by crawling online data and by manual writing, added to the corpus set, and the corpora in the corpus set undergo data preprocessing. Then user speech is obtained and converted into a user question sentence by ASR, and named entity recognition is performed on the user question sentence to obtain the entity words in it. The entity words are input to a trained classification model; according to the output of the classification model, the target corpus in the corpus set that is closest to the user question sentence is determined, and the semantics of the target corpus are taken as the output semantics of the user question sentence. A graph query is performed according to the output semantics to obtain a semantic result, and an answer sentence is generated from the semantic result to complete the reply. In this related-art scheme, the target corpus is determined by a classification model, so semantic generalization ability is limited and it is difficult to parse the core semantic information in the user question sentence; for complex question sentences in particular, parsing may be wrong or may fail altogether.
Referring to Fig. 8, Fig. 8 is another schematic flowchart of the artificial-intelligence-based sentence answer method provided by an embodiment of the present invention. In Fig. 8, corpora are likewise generated by crawling online data and by manual writing and are added to the corpus set. However, whereas the related art often requires tens of thousands of corpora, the embodiment of the present invention can perform sentence response on the basis of about one thousand corpora while ensuring a certain accuracy. After the corpus set is built, data preprocessing is performed on the corpora in it; the data preprocessing here includes but is not limited to: deleting set symbols in the corpora, converting all letters in the corpora to uppercase or to lowercase, and converting all Chinese characters in the corpora to traditional or to simplified characters. Then user speech is obtained and converted into a user question sentence by ASR, and named entity recognition is performed on the user question sentence to obtain the entity words in it. The named entity recognition here combines two approaches. In the first, a named entity recognition model, such as a trained conditional random field model, recognizes the user question sentence to obtain entity words. In the second, multi-pattern matching is performed on the user question sentence according to a set dictionary, and the successfully matched words are taken as entity words. The results of the two approaches are merged and deduplicated to obtain the final entity words.
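The merge-and-deduplicate step can be sketched as follows. Two simplifications are made and should be read as assumptions: the model output is passed in as a plain list rather than produced by a conditional random field model, and a naive substring test replaces true multi-pattern matching (which would normally use an algorithm such as Aho-Corasick). The function names and the dictionary contents are hypothetical.

```python
from typing import Iterable, List


def dictionary_match(question: str, dictionary: Iterable[str]) -> List[str]:
    """Naive stand-in for multi-pattern matching: keep dictionary words found in the question."""
    return [word for word in dictionary if word in question]


def merge_entities(model_entities: Iterable[str], dict_entities: Iterable[str]) -> List[str]:
    """Merge both recognition results and remove duplicates, keeping first-seen order."""
    seen = set()
    merged = []
    for word in list(model_entities) + list(dict_entities):
        if word not in seen:
            seen.add(word)
            merged.append(word)
    return merged


question = "Who is the wife of Xiao Zhang"
from_model = ["Xiao Zhang"]  # pretend output of the recognition model
from_dict = dictionary_match(question, ["wife", "Xiao Zhang", "husband"])
entities = merge_entities(from_model, from_dict)
```

Combining the two sources lets dictionary hits cover words the model misses, while deduplication prevents a word found by both from being processed twice.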
The embodiment of the present invention provides a dual-granularity semantic parsing mechanism. At section sentence granularity, the entity word whose grammatical attribute is subject in the user question sentence is replaced with a reference word, and the replaced user question sentence is input to ES. Through the query function of ES, multiple corpora in the corpus set that are related to the replaced user question sentence are obtained, and those whose degree of correlation with the replaced user question sentence satisfies a correlation condition are determined as candidate corpora, which completes the preliminary screening. The user question sentence and the candidate corpora are then input to the trained BERT model and XGBoost model: the BERT model gives the first similarity between the user question sentence and a candidate corpus, and the XGBoost model gives the second similarity. The first similarity and the second similarity are balanced by coefficients, that is, weighted and summed, to obtain the statement similarity.
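The coefficient balancing described above amounts to a weighted sum of the two model scores. The sketch below assumes a single weight `weight` for the first similarity and the complementary weight `1 - weight` for the second; the source states only that the two similarities are weighted and summed, so the complementary form and the default value are assumptions.

```python
def statement_similarity(first_similarity: float,
                         second_similarity: float,
                         weight: float = 0.5) -> float:
    """Weighted sum of the BERT-style score and the XGBoost-style score."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * first_similarity + (1.0 - weight) * second_similarity


# Usage: give the first similarity slightly more influence.
score = statement_similarity(0.8, 0.6, weight=0.75)
```

With complementary weights, the result stays in the same range as the inputs, so it can be compared directly with the similarity thresholds.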
The statement similarity is measured against the set first similarity threshold and second similarity threshold, where the first similarity threshold is greater than the second similarity threshold; in Fig. 8, level1 is the first similarity threshold, level2 is the second similarity threshold, and score is the statement similarity. When the statement similarity exceeds the first similarity threshold, the corresponding candidate corpus is determined as the target corpus, and the semantics of the target corpus are determined as the section sentence granularity semantics of the user question sentence; if the statement similarities of at least two candidate corpora exceed the first similarity threshold, the candidate corpus with the highest statement similarity is determined as the target corpus. When the statement similarities of all candidate corpora are less than the first similarity threshold, the target corpus and the section sentence granularity semantics are determined to be empty. In addition, when there is a statement similarity that is less than the first similarity threshold and exceeds the second similarity threshold, the user question sentence is submitted to a manual audit party, and the manual audit party's audit result for the user question sentence is obtained: when the audit result is a correct sentence, the user question sentence is added to the corpus set; when the audit result is a wrong sentence, the user question sentence is discarded. It is worth noting that when the semantics of the target corpus are determined as the section sentence granularity semantics, the subject in the section sentence granularity semantics is also replaced with the entity word whose grammatical attribute is subject. Through the BERT model and the XGBoost model, the candidate corpora are finely screened, so that the target corpus closest to the user question sentence can be determined.
At words granularity, the grammatical attributes of the entity words are determined and verified against a set N-Gram grammar template, and entity words whose grammatical attributes do not conform to the grammar template are removed. N-Gram matching is then performed: first, the entity words are matched against the semantic dictionaries, and the semantic attribute corresponding to each successfully matched semantic dictionary is determined; the entity words are then arranged according to the progressive relationships corresponding to the semantic attributes and, at the same time, replaced with the semantic attributes, which gives the words granularity semantics of the user question sentence.
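The dictionary-matching part of this step can be sketched as follows, reusing the semantic dictionaries from the Fig. 6 example. The function name and return shape are assumptions, and the sketch deliberately leaves the entity words in their input order: arranging them by progressive relationship is not shown.

```python
from typing import Dict, List, Set, Tuple

# Each semantic dictionary corresponds to one semantic attribute (values from the Fig. 6 example).
SEMANTIC_DICTS: Dict[str, Set[str]] = {
    "Zhang San": {"Zhang San", "Xiao Zhang"},
    "spouse": {"spouse", "wife"},
}


def map_to_attributes(entity_words: List[str],
                      semantic_dicts: Dict[str, Set[str]]) -> Tuple[List[str], List[str]]:
    """Replace each matched entity word with its semantic attribute;
    collect words that match no dictionary as non-matching words."""
    attributes: List[str] = []
    non_matching: List[str] = []
    for word in entity_words:
        attr = next((a for a, words in semantic_dicts.items() if word in words), None)
        if attr is None:
            non_matching.append(word)
        else:
            attributes.append(attr)
    return attributes, non_matching


attrs, unmatched = map_to_attributes(["Xiao Zhang", "wife"], SEMANTIC_DICTS)
```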
For the section sentence granularity semantics and the words granularity semantics, the output semantics are determined with priority given to the section sentence granularity semantics. That is: when the section sentence granularity semantics are not empty, they are determined as the output semantics; when the section sentence granularity semantics are empty and the words granularity semantics are not empty, the words granularity semantics are determined as the output semantics; when both are empty, a prompt that the answer failed is output. The knowledge graph is queried according to the obtained output semantics to obtain a semantic result, an answer sentence is generated from the semantic result, and the reply is made, which completes the whole sentence response flow. Compared with the sentence question answering scheme provided by the related art, the dual-granularity mechanism of the embodiment of the present invention improves applicability to different user question sentences, including simple and complex question sentences, and improves the accuracy of the response.
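The priority rule above is a simple fallback chain, sketched below. Representing "empty" semantics as `None` or an empty string, the function name, and the wording of the failure prompt are assumptions.

```python
from typing import Optional

# Hypothetical wording for the answer-failure prompt.
FAILURE_PROMPT = "Sorry, I could not understand the question."


def choose_output_semantics(section_semantics: Optional[str],
                            words_semantics: Optional[str]) -> Optional[str]:
    """Prefer section sentence granularity semantics, then words granularity semantics."""
    if section_semantics:
        return section_semantics
    if words_semantics:
        return words_semantics
    return None  # both empty: the caller outputs the failure prompt


def respond(section_semantics: Optional[str], words_semantics: Optional[str]) -> str:
    """Return the chosen output semantics, or the failure prompt when both are empty."""
    chosen = choose_output_semantics(section_semantics, words_semantics)
    return FAILURE_PROMPT if chosen is None else chosen
```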
Referring to Fig. 9, Fig. 9 is a comparison schematic diagram of response scenes provided by an embodiment of the present invention. The left part of Fig. 9 shows the response scene generated during sentence response by a chat application that deploys the sentence response scheme provided by the related art. In the response scene on the left, the user inputs the user question sentence "What is your name" in text form; the chat application parses it, obtains the correct output semantics, and replies correctly. However, in question answering process 91, when the user inputs the user question sentence "How tall is Li Xiaoming", the chat application, relying on the classification model, cannot parse out that the core semantics of the user question sentence is "tall", so parsing fails and the answer-failure prompt "I need to study harder; I did not understand what you said" is returned. For the user question sentence "Do you know how tall Li Xiaoming is", the chat application likewise cannot parse it and returns the answer-failure prompt.
The right part of Fig. 9 shows the intelligent response scene generated during sentence response by a chat application that deploys the artificial-intelligence-based sentence answer method of the embodiment of the present invention. In contrast to question answering process 91, where the semantics could not be parsed, in question answering process 92, when the user inputs the user question sentence "How tall is Li Xiaoming", the chat application parses it, determines that the core semantics is "tall" and that the output semantics is Li Xiaoming height, and generates an answer sentence from the output semantics, specifically "Li Xiaoming's height is 187 centimeters". When the user inputs the user question sentence "Do you know how tall Li Xiaoming is", the chat application can likewise parse it correctly, obtain the output semantics Li Xiaoming height, and generate the correct answer sentence.
Referring to Fig. 10, Fig. 10 is another comparison schematic diagram of response scenes provided by an embodiment of the present invention. The left part of Fig. 10 shows the response scene generated during sentence response by a chat application that deploys the sentence response scheme provided by the related art. In the response scene on the left, the user inputs the user question sentence "Zhang San local" in text form; the chat application parses it, obtains the correct output semantics, and outputs the correct answer sentence "The birthplace of Zhang San is Chengdu.". However, in question answering process 101, when the user inputs the noise-containing user question sentence "Zhang San other township", the chat application, relying on the classification model, parses the semantics of the user question sentence incorrectly and consequently replies with a music link.
The right part of Fig. 10 shows the intelligent response scene generated during sentence response by a chat application that deploys the artificial-intelligence-based sentence answer method of the embodiment of the present invention. When the user inputs the user question sentence "How old is Zhang San", the chat application parses it, determines that the core semantics is "age" and that the output semantics is Zhang San age, and generates an answer sentence from the output semantics, specifically "The age of Zhang San is 36.". When the user inputs the user question sentence "Zhang San local", the chat application can parse it correctly, obtain the output semantics Zhang San birthplace, and generate the answer sentence "The birthplace of Zhang San is Chengdu.". In contrast to question answering process 101, where the user question sentence "Zhang San other township" was parsed incorrectly, in question answering process 102, when the user inputs the noise-containing user question sentence "Zhang San other township", the chat application can filter out the noise "people", obtain the output semantics Zhang San birthplace, and generate the answer sentence "The birthplace of Zhang San is Chengdu."
Referring to Fig. 11, Fig. 11 is a schematic diagram of response scenes for complex question sentences provided by an embodiment of the present invention. Both the left and right parts of Fig. 11 show response scenes generated during sentence response by a chat application that deploys the artificial-intelligence-based sentence answer method of the embodiment of the present invention, where the right part of Fig. 11 is the next screen after the left part. In Fig. 11, the chat application obtains user speech entered by voice input and performs speech recognition on it to obtain the user question sentence. In question answering process 111 on the left, when the user question sentence is "the company of Miss Zhang's husband", the chat application parses it and obtains the output semantics Miss Zhang husband company. Queries are made in the order of the semantic attributes: the semantic attribute "husband" is first queried in the knowledge graph, giving the attribute result "Liu"; then, on the basis of that attribute result, the semantic attribute "company" is queried, giving the attribute result "xx store". Finally, the attribute results are combined into the semantic result, and the answer sentence "The husband of Miss Zhang is Liu, and the company of Liu is xx store." is generated.
When the user question sentence is "Which school did Liu's wife graduate from", the chat application parses it and obtains the output semantics Liu wife graduated school. In the order of the semantic attributes, that is, first "wife" and then "graduated school", queries are made successively in the knowledge graph; the attribute result for the semantic attribute "wife" is "Miss Zhang", and the attribute result for the semantic attribute "graduated school" is "so-and-so primary school, so-and-so middle school, so-and-so university". Finally, the attribute results are combined into the semantic result, and the answer sentence "The wife of Liu is Miss Zhang, and the graduated schools of Miss Zhang are so-and-so primary school, so-and-so middle school, so-and-so university." is generated.
In question answering process 112 on the right of Fig. 11, when the user question sentence is "How old is the mayor of Nanjing", the chat application parses it and obtains the output semantics Nanjing mayor age. In the order of the semantic attributes, that is, first "mayor" and then "age", queries are made successively in the knowledge graph; the attribute result for the semantic attribute "mayor" is "Lan so-and-so", and the attribute result for the semantic attribute "age" is "55". Finally, the attribute results are combined into the semantic result, and the answer sentence "The mayor of Nanjing is Lan so-and-so, and the age of Lan so-and-so is 55." is generated.
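The successive queries in these complex-question examples follow one pattern: the attribute result of each query becomes the subject of the next. A minimal sketch, assuming a toy dictionary in place of the knowledge graph and hypothetical function and variable names, is:

```python
from typing import Dict, List, Optional, Tuple

# Toy knowledge graph keyed by (subject, semantic attribute), using values from the Nanjing example.
GRAPH: Dict[Tuple[str, str], str] = {
    ("Nanjing", "mayor"): "Lan so-and-so",
    ("Lan so-and-so", "age"): "55",
}


def chained_query(subject: str, attributes: List[str],
                  graph: Dict[Tuple[str, str], str]) -> Optional[List[Tuple[str, str, str]]]:
    """Query the attributes in order; each attribute result becomes the next subject."""
    results: List[Tuple[str, str, str]] = []
    current = subject
    for attribute in attributes:
        value = graph.get((current, attribute))
        if value is None:
            return None  # the chain breaks: no semantic result
        results.append((current, attribute, value))
        current = value
    return results


# Output semantics "Nanjing mayor age" -> subject "Nanjing", attributes ["mayor", "age"].
steps = chained_query("Nanjing", ["mayor", "age"], GRAPH)
```

The list of (subject, attribute, result) steps is what a sentence template would then combine into the final answer sentence.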
When the user question sentence is "The wife of Zhang San is Miss Zhu", the chat application parses it, retains "Zhang San", whose grammatical attribute is person-name subject, and filters out the non-matching word "Miss Zhu", whose grammatical attribute is not subject. The output semantics obtained is Zhang San wife; the knowledge graph is queried according to the output semantics to obtain the semantic result "Miss Zhu", and the answer sentence "The wife of Zhang San is Miss Zhu." is generated from the semantic result.
When the user question sentence is "Yang is already 18 years old", the chat application parses it, identifies the semantic attribute "age", and obtains the output semantics Yang age; the knowledge graph is queried according to the output semantics to obtain the semantic result "32", and the answer sentence "The age of Yang is 32." is generated from the semantic result.
The following continues with an exemplary structure in which the artificial-intelligence-based sentence answering device 255 provided by an embodiment of the present invention is implemented as software modules. In some embodiments, as shown in Fig. 2, the software modules of the artificial-intelligence-based sentence answering device 255 stored in memory 250 may include: an identification module 2551, configured to obtain a user question sentence and identify the entity words in the user question sentence; a section sentence granularity processing module 2552, configured to determine a target corpus in the corpus set according to the user question sentence and determine the semantics of the target corpus as the section sentence granularity semantics of the user question sentence; a words granularity processing module 2553, configured to determine the semantic attributes corresponding to the entity words and arrange the entity words according to the progressive relationships corresponding to the semantic attributes, to obtain the words granularity semantics of the user question sentence; a semantic output module 2554, configured to determine the output semantics of the user question sentence according to the section sentence granularity semantics and the words granularity semantics; a result query module 2555, configured to query a set knowledge graph according to the output semantics to obtain a semantic result; and a sentence generation module 2556, configured to generate an answer sentence according to the semantic result.
In some embodiments, the section sentence granularity processing module 2552 is further configured to: determine candidate corpora in the corpus set according to the user question sentence; determine the statement similarity between the user question sentence and each candidate corpus; when the statement similarity exceeds the first similarity threshold, determine the candidate corpus as the target corpus, and determine the semantics of the target corpus as the section sentence granularity semantics of the user question sentence; and when the statement similarities of all the candidate corpora are less than the first similarity threshold, determine that the target corpus and the section sentence granularity semantics are empty.
In some embodiments, determining candidate corpora in the corpus set according to the user question sentence includes: determining the grammatical attribute of the entity word, and replacing the entity word whose grammatical attribute is subject in the user question sentence with a reference word; querying the corpus set for corpora related to the replaced user question sentence, and determining the degree of correlation between each such corpus and the replaced user question sentence; and determining, as candidate corpora, the corpora whose degree of correlation with the replaced user question sentence satisfies a correlation condition.
In some embodiments, the section sentence granularity processing module 2552 is further configured to: replace the reference word in the semantics of the target corpus with the entity word whose grammatical attribute is subject, to obtain the section sentence granularity semantics of the user question sentence.
In some embodiments, determining the statement similarity between the user question sentence and the candidate corpus includes: determining a first similarity between the user question sentence and the candidate corpus by a neural network model; determining a second similarity between the user question sentence and the candidate corpus by an extreme gradient boosting model; and determining the statement similarity between the user question sentence and the candidate corpus according to the first similarity and the second similarity.
In some embodiments, the artificial-intelligence-based sentence answering device 255 further includes: an audit module, configured to, when the statement similarity between the user question sentence and the candidate corpus is less than the first similarity threshold and exceeds the second similarity threshold, submit the user question sentence to a set audit party and obtain the audit party's audit result for the user question sentence; and an adding module, configured to add the user question sentence to the corpus set when the audit result is a correct sentence; wherein the first similarity threshold is greater than the second similarity threshold.
In some embodiments, the identification module 2551 is further configured to: perform entity recognition on the user question sentence by a named entity recognition model to obtain a first recognition result; perform string matching on the user question sentence according to a set dictionary to obtain a second recognition result; and merge and deduplicate the first recognition result and the second recognition result to obtain the entity words.
In some embodiments, the artificial-intelligence-based sentence answering device 255 further includes: a grammar determination module, configured to determine the grammatical attribute of the entity word; a verification module, configured to verify the grammatical attribute of the entity word against a set grammar template; and an entity word deletion module, configured to, when a grammatical attribute does not conform to the grammar template, delete the entity word corresponding to that grammatical attribute.
In some embodiments, the artificial-intelligence-based sentence answering device 255 further includes: a word weight determination module, configured to determine entity words in the user question sentence that do not correspond to any semantic attribute as non-matching words, and determine the word weights of the non-matching words and of the entity words other than the non-matching words; a penalty score determination module, configured to determine the penalty score of a non-matching word according to the word weight of the non-matching word and the sentence length of the user question sentence; a score determination module, configured to determine the sentence score of the user question sentence according to the penalty score and the word weights of the entity words other than the non-matching words; and an empty semantics determination module, configured to determine that the words granularity semantics are empty when the sentence score is less than a sentence score threshold.
In some embodiments, determining the penalty score of the non-matching word according to the word weight of the non-matching word and the sentence length of the user question sentence includes: determining the grammatical attribute of the non-matching word; when the grammatical attribute of the non-matching word is subject, determining that the penalty score of the non-matching word is empty; and when the grammatical attribute of the non-matching word is not subject, determining the penalty score of the non-matching word according to the word weight of the non-matching word and the sentence length of the user question sentence.
In some embodiments, the words granularity processing module 2553 is further configured to: match the entity word against at least two set semantic dictionaries, where each semantic dictionary corresponds to one semantic attribute; and when the entity word is successfully matched with a semantic dictionary, determine the semantic attribute corresponding to that semantic dictionary as the semantic attribute of the entity word.
In some embodiments, the semantic output module 2554 is further configured to: when the section sentence granularity semantics are not empty, determine the section sentence granularity semantics as the output semantics; when the section sentence granularity semantics are empty and the words granularity semantics are not empty, determine the words granularity semantics as the output semantics; and when both the section sentence granularity semantics and the words granularity semantics are empty, output a prompt that the answer failed.
In some embodiments, the result query module 2555 is further configured to: when the output semantics correspond to at least two semantic attributes, query the knowledge graph successively according to the order of the semantic attributes in the output semantics, to obtain attribute results in one-to-one correspondence with the semantic attributes; and combine the attribute results into the semantic result.
In some embodiments, the identification module 2551 is further configured to: obtain user speech; and perform speech recognition on the user speech to obtain the user question sentence.
In some embodiments, the artificial-intelligence-based sentence answering device 255 further includes: a deletion module, configured to identify the set symbols in each corpus included in the corpus set and delete the set symbols; a first conversion module, configured to convert all letters in the corpora to uppercase or to lowercase; and a second conversion module, configured to convert all Chinese characters in the corpora to traditional or to simplified characters.
The embodiment of the present invention provides a storage medium storing executable instructions; when the executable instructions are executed by a processor, they cause the processor to perform the method provided by the embodiment of the present invention, for example the artificial-intelligence-based sentence answer method shown in Fig. 4A or Fig. 4B.
In some embodiments, the storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc, or a CD-ROM; it may also be any of various devices that include one or any combination of the above memories.
In some embodiments, the executable instructions may take the form of a program, software, a software module, a script, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to a file in a file system. They may be stored in a portion of a file that holds other programs or data, for example, in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (for example, files that store one or more modules, subprograms, or portions of code).
As an example, the executable instructions may be deployed to execute on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
In summary, the dual-granularity mechanism of the embodiments of the present invention improves the semantic generalization ability of sentence response: a more accurate response sentence can be obtained for both simple and complex questions, which improves the response accuracy. Moreover, whereas the related art usually requires a corpus of ten thousand entries or more, the embodiments of the present invention place a low requirement on the corpus size in the cold-start stage. The inventors verified experimentally that sentence response reaches a certain accuracy once the corpus contains about a thousand entries, and that the reliability of the corpus can be improved continuously through corpus self-growth.
The above are merely embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and scope of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. An artificial-intelligence-based sentence response method, characterized by comprising:
obtaining a user question, and identifying entity words in the user question;
determining a target corpus entry in a corpus according to the user question, and determining the semantics of the target corpus entry as the sentence-granularity semantics of the user question;
determining semantic attributes corresponding to the entity words, and arranging the entity words according to a progressive relationship corresponding to the semantic attributes, to obtain the word-granularity semantics of the user question;
determining the output semantics of the user question according to the sentence-granularity semantics and the word-granularity semantics;
querying a set knowledge graph according to the output semantics to obtain a semantic result; and
generating a response sentence according to the semantic result.
2. The sentence response method according to claim 1, characterized in that determining a target corpus entry in a corpus according to the user question, and determining the semantics of the target corpus entry as the sentence-granularity semantics of the user question, comprises:
determining candidate corpus entries in the corpus according to the user question;
determining the sentence similarity between the user question and each candidate corpus entry;
when the sentence similarity exceeds a first similarity threshold, determining the candidate corpus entry as the target corpus entry, and determining the semantics of the target corpus entry as the sentence-granularity semantics of the user question; and
when the sentence similarity of every candidate corpus entry does not exceed the first similarity threshold, determining that the target corpus entry and the sentence-granularity semantics are empty.
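The threshold test in claim 2 can be illustrated with a short sketch. The claim does not say what happens when several candidates exceed the threshold; taking the highest-scoring one is an assumption made here, and the function name `pick_target_entry` is hypothetical.

```python
from typing import Dict, Optional


def pick_target_entry(
    similarities: Dict[str, float], first_threshold: float
) -> Optional[str]:
    """Return the target corpus entry, or None when the result is empty."""
    target = None
    best_score = first_threshold
    for entry, score in similarities.items():
        # Only scores strictly above the threshold qualify; among those,
        # the highest one wins (assumption, not stated in the claim).
        if score > best_score:
            target, best_score = entry, score
    return target
```

Returning `None` models the empty target corpus entry, from which the empty sentence-granularity semantics follow.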
3. The sentence response method according to claim 2, characterized in that determining candidate corpus entries in the corpus according to the user question comprises:
determining grammatical attributes of the entity words, and replacing the entity word whose grammatical attribute in the user question is subject with a reference word;
querying the corpus for corpus entries related to the replaced user question, and determining the degree of relevance between each such corpus entry and the replaced user question; and
determining the corpus entries whose degree of relevance to the replaced user question satisfies a relevance condition as the candidate corpus entries.
4. The sentence response method according to claim 2, characterized in that determining the sentence similarity between the user question and each candidate corpus entry comprises:
determining a first similarity between the user question and the candidate corpus entry by means of a neural network model;
determining a second similarity between the user question and the candidate corpus entry by means of an extreme gradient boosting model; and
determining the sentence similarity between the user question and the candidate corpus entry according to the first similarity and the second similarity.
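Claim 4 does not fix how the two similarities are combined. One plausible reading, shown here purely as an assumption, is a weighted average; the function name `combine_similarity` and the default weight of 0.5 are hypothetical.

```python
def combine_similarity(first: float, second: float, weight: float = 0.5) -> float:
    """Blend the neural-network score and the gradient-boosting score.

    `weight` is the share given to the first similarity (assumed scheme).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * first + (1.0 - weight) * second
```

With both inputs in [0, 1], the blended value also stays in [0, 1], so it can be compared directly against the first similarity threshold of claim 2.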
5. The sentence response method according to claim 1, characterized in that, before determining the semantic attributes corresponding to the entity words, the method further comprises:
determining grammatical attributes of the entity words;
verifying the grammatical attributes of the entity words against a set grammar template; and
when a grammatical attribute does not conform to the grammar template, deleting the entity word corresponding to that grammatical attribute.
6. The sentence response method according to claim 1, characterized by further comprising:
determining entity words in the user question that do not correspond to any semantic attribute as unmatched words, and determining word weights of the unmatched words and of the entity words other than the unmatched words;
determining a penalty value for each unmatched word according to the word weight of the unmatched word and the sentence length of the user question;
determining a sentence score of the user question according to the penalty values and the word weights of the entity words other than the unmatched words; and
when the sentence score is less than a sentence score threshold, determining that the word-granularity semantics are empty.
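The scoring in claim 6 can be sketched as below. The claim names the inputs of the penalty (word weight and sentence length) but not the formula, so the formula used here, penalty = weight / sentence length and score = sum of matched weights minus total penalty, is an assumption for illustration only; both function names are hypothetical.

```python
from typing import List, Optional


def sentence_score(
    matched_weights: List[float],
    unmatched_weights: List[float],
    sentence_length: int,
) -> float:
    """Score a question from matched word weights minus unmatched penalties."""
    if sentence_length <= 0:
        raise ValueError("sentence_length must be positive")
    # Assumed penalty: each unmatched word's weight scaled by sentence length.
    total_penalty = sum(w / sentence_length for w in unmatched_weights)
    return sum(matched_weights) - total_penalty


def word_semantics_or_empty(
    word_semantics: List[str], score: float, score_threshold: float
) -> Optional[List[str]]:
    """Return None (empty) when the score is below the threshold."""
    return word_semantics if score >= score_threshold else None
```

Whatever formula is actually used, the final step matches the claim: a score below the threshold empties the word-granularity semantics.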
7. The sentence response method according to claim 1, characterized in that determining the semantic attributes corresponding to the entity words comprises:
matching each entity word against at least two set semantic dictionaries, wherein each semantic dictionary corresponds to one semantic attribute; and
when the entity word is successfully matched with a semantic dictionary, determining the semantic attribute corresponding to that semantic dictionary as the semantic attribute of the entity word.
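Dictionary matching as in claim 7 can be sketched with one set of words per semantic attribute. The function name `find_semantic_attribute` and the example dictionaries in the usage note are hypothetical; exact membership is assumed as the matching rule.

```python
from typing import Dict, Optional, Set


def find_semantic_attribute(
    entity_word: str, semantic_dictionaries: Dict[str, Set[str]]
) -> Optional[str]:
    """Return the attribute whose dictionary contains the word, else None."""
    for attribute, words in semantic_dictionaries.items():
        # Assumed matching rule: the entity word is an exact dictionary member.
        if entity_word in words:
            return attribute
    return None
```

For example, with the hypothetical dictionaries `{"hero": {"Arthur"}, "skill": {"dash"}}`, the word `"dash"` maps to the attribute `"skill"`, while a word found in no dictionary returns `None` and would be treated as unmatched.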
8. The sentence response method according to any one of claims 1 to 7, characterized in that determining the output semantics of the user question according to the sentence-granularity semantics and the word-granularity semantics comprises:
when the sentence-granularity semantics are not empty, determining the sentence-granularity semantics as the output semantics;
when the sentence-granularity semantics are empty and the word-granularity semantics are not empty, determining the word-granularity semantics as the output semantics; and
when both the sentence-granularity semantics and the word-granularity semantics are empty, outputting a prompt indicating that the answer failed.
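The fallback order in claim 8 amounts to a three-way branch. In this sketch an empty semantics value is modeled as `None`, and the function name and the wording of the failure prompt are hypothetical.

```python
from typing import Optional, Tuple

FAILURE_PROMPT = "Sorry, the question could not be answered."  # assumed wording


def choose_output(
    sentence_semantics: Optional[str], word_semantics: Optional[str]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (output_semantics, prompt); exactly one of the two is None."""
    if sentence_semantics is not None:
        # Sentence-granularity semantics take priority when available.
        return sentence_semantics, None
    if word_semantics is not None:
        # Fall back to word-granularity semantics.
        return word_semantics, None
    # Both are empty: no output semantics, only the failure prompt.
    return None, FAILURE_PROMPT
```

Giving the sentence-granularity result priority means the corpus match is trusted first, and the word-level arrangement is used only when no corpus entry is similar enough.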
9. An artificial-intelligence-based sentence response device, characterized by comprising:
a recognition module, configured to obtain a user question and identify entity words in the user question;
a sentence-granularity processing module, configured to determine a target corpus entry in a corpus according to the user question, and determine the semantics of the target corpus entry as the sentence-granularity semantics of the user question;
a word-granularity processing module, configured to determine semantic attributes corresponding to the entity words, and arrange the entity words according to a progressive relationship corresponding to the semantic attributes, to obtain the word-granularity semantics of the user question;
a semantic output module, configured to determine the output semantics of the user question according to the sentence-granularity semantics and the word-granularity semantics;
a result query module, configured to query a set knowledge graph according to the output semantics to obtain a semantic result; and
a sentence generation module, configured to generate a response sentence according to the semantic result.
10. An electronic device, characterized by comprising:
a memory, configured to store executable instructions; and
a processor, configured to implement the sentence response method according to any one of claims 1 to 8 when executing the executable instructions stored in the memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910797093.8A CN110489538B (en) | 2019-08-27 | 2019-08-27 | Statement response method and device based on artificial intelligence and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910797093.8A CN110489538B (en) | 2019-08-27 | 2019-08-27 | Statement response method and device based on artificial intelligence and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110489538A true CN110489538A (en) | 2019-11-22 |
CN110489538B CN110489538B (en) | 2020-12-25 |
Family
ID=68554496
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910797093.8A Active CN110489538B (en) | 2019-08-27 | 2019-08-27 | Statement response method and device based on artificial intelligence and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110489538B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105930452A (en) * | 2016-04-21 | 2016-09-07 | 北京紫平方信息技术股份有限公司 | Smart answering method capable of identifying natural language |
CN106777275A (en) * | 2016-12-29 | 2017-05-31 | 北京理工大学 | Entity attribute and property value extracting method based on many granularity semantic chunks |
US20170177715A1 (en) * | 2015-12-21 | 2017-06-22 | Adobe Systems Incorporated | Natural Language System Question Classifier, Semantic Representations, and Logical Form Templates |
US20170371859A1 (en) * | 2015-09-02 | 2017-12-28 | International Business Machines Corporation | Dynamic Portmanteau Word Semantic Identification |
CN108073569A (en) * | 2017-06-21 | 2018-05-25 | 北京华宇元典信息服务有限公司 | A kind of law cognitive approach, device and medium based on multi-layer various dimensions semantic understanding |
CN108549662A (en) * | 2018-03-16 | 2018-09-18 | 北京云知声信息技术有限公司 | The supplement digestion procedure and device of semantic analysis result in more wheel sessions |
CN108804521A (en) * | 2018-04-27 | 2018-11-13 | 南京柯基数据科技有限公司 | A kind of answering method and agricultural encyclopaedia question answering system of knowledge based collection of illustrative plates |
CN109492077A (en) * | 2018-09-29 | 2019-03-19 | 北明智通(北京)科技有限公司 | The petrochemical field answering method and system of knowledge based map |
CN109727041A (en) * | 2018-07-03 | 2019-05-07 | 平安科技(深圳)有限公司 | Intelligent customer service takes turns answering method, equipment, storage medium and device more |
CN110059160A (en) * | 2019-04-17 | 2019-07-26 | 东南大学 | A kind of knowledge base answering method and device based on context end to end |
CN110162675A (en) * | 2018-09-25 | 2019-08-23 | 腾讯科技(深圳)有限公司 | Generation method, device, computer-readable medium and the electronic equipment of answer statement |
Non-Patent Citations (1)
Title |
---|
毛麾等: "基于知识库的问答系统", 《现代计算机》 * |
Cited By (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112287077A (en) * | 2019-12-09 | 2021-01-29 | 北京来也网络科技有限公司 | Statement extraction method and device for combining RPA and AI for document, storage medium and electronic equipment |
CN110929016A (en) * | 2019-12-10 | 2020-03-27 | 北京爱医生智慧医疗科技有限公司 | Intelligent question and answer method and device based on knowledge graph |
CN111026884A (en) * | 2019-12-12 | 2020-04-17 | 南昌众荟智盈信息技术有限公司 | Dialog corpus generation method for improving quality and diversity of human-computer interaction dialog corpus |
CN112988987A (en) * | 2019-12-16 | 2021-06-18 | 科沃斯商用机器人有限公司 | Human-computer interaction method and device, intelligent robot and storage medium |
CN113010768A (en) * | 2019-12-19 | 2021-06-22 | 北京搜狗科技发展有限公司 | Data processing method and device and data processing device |
CN113010768B (en) * | 2019-12-19 | 2024-03-19 | 北京搜狗科技发展有限公司 | Data processing method and device for data processing |
WO2021128246A1 (en) * | 2019-12-27 | 2021-07-01 | 拉克诺德(深圳)科技有限公司 | Voice data processing method, apparatus, computer device and storage medium |
CN111159384B (en) * | 2019-12-31 | 2022-07-08 | 思必驰科技股份有限公司 | Rule-based sentence generation method and device |
CN111159384A (en) * | 2019-12-31 | 2020-05-15 | 苏州思必驰信息科技有限公司 | Rule-based sentence generation method and device |
CN111357015A (en) * | 2019-12-31 | 2020-06-30 | 深圳市优必选科技股份有限公司 | Speech synthesis method, apparatus, computer device and computer-readable storage medium |
CN111357015B (en) * | 2019-12-31 | 2023-05-02 | 深圳市优必选科技股份有限公司 | Text conversion method, apparatus, computer device, and computer-readable storage medium |
CN111259663B (en) * | 2020-01-14 | 2023-05-26 | 北京百度网讯科技有限公司 | Information processing method and device |
US11775776B2 (en) | 2020-01-14 | 2023-10-03 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for processing information |
CN111259663A (en) * | 2020-01-14 | 2020-06-09 | 北京百度网讯科技有限公司 | Information processing method and device |
CN113254606A (en) * | 2020-02-13 | 2021-08-13 | 阿里巴巴集团控股有限公司 | Generative response method, and related method, apparatus, device and medium |
CN111325037B (en) * | 2020-03-05 | 2022-03-29 | 苏宁云计算有限公司 | Text intention recognition method and device, computer equipment and storage medium |
CN111325037A (en) * | 2020-03-05 | 2020-06-23 | 苏宁云计算有限公司 | Text intention recognition method and device, computer equipment and storage medium |
CN111475651B (en) * | 2020-04-08 | 2023-04-07 | 掌阅科技股份有限公司 | Text classification method, computing device and computer storage medium |
CN111475651A (en) * | 2020-04-08 | 2020-07-31 | 掌阅科技股份有限公司 | Text classification method, computing device and computer storage medium |
CN111414746A (en) * | 2020-04-10 | 2020-07-14 | 中国建设银行股份有限公司 | Matching statement determination method, device, equipment and storage medium |
CN111414746B (en) * | 2020-04-10 | 2023-11-07 | 建信金融科技有限责任公司 | Method, device, equipment and storage medium for determining matching statement |
CN111832603A (en) * | 2020-04-15 | 2020-10-27 | 北京嘀嘀无限科技发展有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
CN111738011A (en) * | 2020-05-09 | 2020-10-02 | 完美世界(北京)软件科技发展有限公司 | Illegal text recognition method and device, storage medium and electronic device |
CN111798847A (en) * | 2020-06-22 | 2020-10-20 | 广州小鹏车联网科技有限公司 | Voice interaction method, server and computer-readable storage medium |
CN111785368A (en) * | 2020-06-30 | 2020-10-16 | 平安科技(深圳)有限公司 | Triage method, device, equipment and storage medium based on medical knowledge map |
CN112330379A (en) * | 2020-11-25 | 2021-02-05 | 税友软件集团股份有限公司 | Invoice content generation method and system, electronic equipment and storage medium |
CN112330379B (en) * | 2020-11-25 | 2023-10-31 | 税友软件集团股份有限公司 | Invoice content generation method, invoice content generation system, electronic equipment and storage medium |
CN113191145A (en) * | 2021-05-21 | 2021-07-30 | 百度在线网络技术(北京)有限公司 | Keyword processing method and device, electronic equipment and medium |
CN113191145B (en) * | 2021-05-21 | 2023-08-11 | 百度在线网络技术(北京)有限公司 | Keyword processing method and device, electronic equipment and medium |
CN113657100B (en) * | 2021-07-20 | 2023-12-15 | 北京百度网讯科技有限公司 | Entity identification method, entity identification device, electronic equipment and storage medium |
CN113657100A (en) * | 2021-07-20 | 2021-11-16 | 北京百度网讯科技有限公司 | Entity identification method and device, electronic equipment and storage medium |
CN113672719A (en) * | 2021-09-08 | 2021-11-19 | 中国平安人寿保险股份有限公司 | Conversation auxiliary information pushing method and device, computer equipment and storage medium |
CN114254090A (en) * | 2021-12-08 | 2022-03-29 | 马上消费金融股份有限公司 | Question-answer knowledge base expansion method and device |
CN114676244A (en) * | 2022-05-27 | 2022-06-28 | 深圳市人马互动科技有限公司 | Information processing method, information processing apparatus, and computer-readable storage medium |
CN115510203B (en) * | 2022-09-27 | 2023-09-22 | 北京百度网讯科技有限公司 | Method, device, equipment, storage medium and program product for determining answers to questions |
CN115510203A (en) * | 2022-09-27 | 2022-12-23 | 北京百度网讯科技有限公司 | Question answer determining method, device, equipment, storage medium and program product |
CN115840510A (en) * | 2023-02-21 | 2023-03-24 | 中航信移动科技有限公司 | Input association method, electronic equipment and storage medium for civil aviation intelligent question answering |
Also Published As
Publication number | Publication date |
---|---|
CN110489538B (en) | 2020-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110489538A (en) | Sentence answer method, device and electronic equipment based on artificial intelligence | |
CN116628172B (en) | Dialogue method for multi-strategy fusion in government service field based on knowledge graph | |
CN111026842B (en) | Natural language processing method, natural language processing device and intelligent question-answering system | |
CN115238101B (en) | Multi-engine intelligent question-answering system oriented to multi-type knowledge base | |
KR100533810B1 (en) | Semi-Automatic Construction Method for Knowledge of Encyclopedia Question Answering System | |
US11823074B2 (en) | Intelligent communication manager and summarizer | |
CN110377716A (en) | Exchange method, device and the computer readable storage medium of dialogue | |
TW201832104A (en) | Natural language question answering method and apparatus, and server | |
CN114547329A (en) | Method for establishing pre-training language model, semantic analysis method and device | |
CN109543034B (en) | Text clustering method and device based on knowledge graph and readable storage medium | |
CN111475623A (en) | Case information semantic retrieval method and device based on knowledge graph | |
KR20190015797A (en) | The System and the method of offering the Optimized answers to legal experts utilizing a Deep learning training module and a Prioritization framework module based on Artificial intelligence and providing an Online legal dictionary utilizing a character Strings Dictionary Module that converts legal information into significant vector | |
CN112948534A (en) | Interaction method and system for intelligent man-machine conversation and electronic equipment | |
CN108846138A (en) | A kind of the problem of fusion answer information disaggregated model construction method, device and medium | |
CN112052317A (en) | Medical knowledge base intelligent retrieval system and method based on deep learning | |
KR20200139008A (en) | User intention-analysis based contract recommendation and autocomplete service using deep learning | |
CN107967302A (en) | Game customer service conversational system based on deep neural network | |
CN115714002B (en) | Training method for depression risk detection model, depression symptom early warning method and related equipment | |
CN116244344A (en) | Retrieval method and device based on user requirements and electronic equipment | |
CN110245349A (en) | A kind of syntax dependency parsing method, apparatus and a kind of electronic equipment | |
CN114911915A (en) | Knowledge graph-based question and answer searching method, system, equipment and medium | |
CN113052544A (en) | Method and device for intelligently adapting workflow according to user behavior and storage medium | |
CN114647719A (en) | Question-answering method and device based on knowledge graph | |
Surendran et al. | Conversational AI-A retrieval based chatbot | |
CN113268673B (en) | Method and system for analyzing internet action type information clue |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |