CN106295807B - A method and device for information processing - Google Patents
A method and device for information processing
- Publication number
- CN106295807B (application CN201610710565.8A)
- Authority
- CN
- China
- Prior art keywords
- question
- sample
- user
- questions
- standard
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of human-computer interaction, in particular to a method and a device for processing information in human-computer interaction. The invention provides an information processing method, which comprises the following steps: providing a model sample library, wherein the model sample library comprises sample standard questions and sample extension questions corresponding to each sample standard question; providing a knowledge base, wherein the knowledge base comprises knowledge base standard questions and knowledge base extension questions and answers corresponding to the knowledge base standard questions, and the knowledge base is used for providing answers for user question sentences; determining whether a sample expansion question matched with a user question sentence in a human-computer interaction log exists in the model sample library; if yes, determining whether the standard question corresponding to the user question sentence in the human-computer interaction log is the same as the standard question corresponding to the matched sample extension question; and if not, optimizing the knowledge base. The invention also provides an information processing device and system corresponding to the information processing method.
Description
Technical field
The present invention relates to the technical field of human-computer interaction, and in particular to a method and device for information processing in human-computer interaction.
Background technique
Human-computer interaction is the study of the interaction between a system and its users. The system may be any of various machines, or a computerized system or piece of software. Human-computer interaction makes various artificial intelligence systems possible, for example intelligent customer service systems and voice control systems.
Artificial intelligence semantic recognition is the basis of human-computer interaction: it recognizes human language and converts it into a language that machines can understand. To understand human language, an artificial intelligence semantic recognition system needs a knowledge base. A knowledge learning system organizes massive heterogeneous data into knowledge and merges it into the existing knowledge system.
Artificial intelligence systems use semantic recognition technology to process the original question raised by a user: they determine the standard question corresponding to the original question, and then provide a corresponding answer based on that standard question and on some limited information attached to the original question. The system records how each original question was handled in the form of a log. Each log entry includes the original question raised by the user (the user question), the standard question, and the answer.
Optimizing a knowledge base involves two important steps: picking out the interaction logs that need to be optimized, and optimizing the knowledge base for the selected logs.
In the prior art, interaction logs are selected mainly by manually collecting and sorting a library of correct logs and a library of meaningless logs, comparing these with the daily interaction logs, and filtering out the log content that matches exactly. Every log must be compared manually, which requires a large amount of manual labor. Moreover, when the knowledge base needs to be optimized, professional knowledge operation and maintenance personnel must write a standard question for every log that needs optimizing, which is costly and inefficient.
Summary of the invention
The object of the present invention is to provide a method and device for information processing that overcome the following problem of the conventional technology: a large amount of manual labor is needed to select the interaction logs that need to be optimized. In addition, during information processing the system can automatically recommend standard questions, which further reduces manual labor and improves the efficiency of knowledge base optimization.
To this end, the present invention provides a method of information processing, comprising: providing a model sample library, the model sample library comprising sample standard questions and sample extension questions corresponding to each sample standard question; providing a knowledge base, the knowledge base comprising knowledge base standard questions, and knowledge base extension questions and answers corresponding to each knowledge base standard question, the knowledge base being used to provide answers to user questions; determining whether the model sample library contains a sample extension question that matches a user question in a human-computer interaction log; if so, determining whether the standard question corresponding to the user question in the human-computer interaction log is the same as the sample standard question corresponding to the matched sample extension question; and if they are not the same, optimizing the knowledge base.
In one embodiment, the sample extension questions include knowledge base extension questions, and the sample standard questions include knowledge base standard questions.
In one embodiment, determining whether the model sample library contains a sample extension question that matches the user question comprises: performing a semantic similarity calculation between the user question and the sample extension questions, to determine whether the model sample library contains at least one sample extension question whose semantic similarity to the user question is greater than a first threshold.
In one embodiment, determining whether the standard question corresponding to the user question is the same as the sample standard question corresponding to the matched sample extension question comprises: comparing whether the text of the standard question corresponding to the user question is exactly identical to the text of the sample standard question corresponding to the matched sample extension question.
In one embodiment, if there is a sample extension question whose semantic similarity to the user question is greater than the first threshold and less than 100%, and the standard question corresponding to the user question is the same as the sample standard question corresponding to that sample extension question, the user question and its corresponding standard question are added to the model sample library in association with each other.
In one embodiment, if there are multiple matched sample extension questions, determining whether the standard question corresponding to the user question is the same as the sample standard question corresponding to the matched sample extension question comprises: determining whether the sample standard question corresponding to any one of the matched sample extension questions is the same as the standard question corresponding to the user question.
In one embodiment, optimizing the knowledge base comprises: based on the result of the semantic similarity calculation, recommending the sample standard questions corresponding to the sample extension questions whose semantic similarity to the user question is greater than a second threshold; and adding the sample standard question manually selected from the recommended sample standard questions to the knowledge base in association with the user question.
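The recommendation step in this embodiment can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the triple layout of `scored_samples`, the dictionary shape of the knowledge base, and the "best match first" ordering are all hypothetical and are not specified by the patent.

```python
from typing import Dict, List, Tuple


def recommend_standard_questions(
    scored_samples: List[Tuple[str, str, float]],
    second_threshold: float,
) -> List[str]:
    """Return distinct sample standard questions, best match first.

    scored_samples holds hypothetical triples of
    (sample_extension_question, sample_standard_question, similarity),
    where the similarity comes from an earlier semantic similarity step.
    """
    best: Dict[str, float] = {}
    for _extension, standard, score in scored_samples:
        if score > second_threshold:  # strictly greater, as in the text
            best[standard] = max(score, best.get(standard, 0.0))
    # Ordering by score is an assumption made for this sketch.
    return sorted(best, key=lambda s: best[s], reverse=True)


def add_to_knowledge_base(
    knowledge_base: Dict[str, List[str]],
    selected_standard: str,
    user_question: str,
) -> None:
    """Associate the user question with the manually selected standard question.

    knowledge_base maps a standard question to its extension questions
    (an assumed shape, for illustration only).
    """
    knowledge_base.setdefault(selected_standard, []).append(user_question)
```

A maintainer would pick one entry from the returned list and pass it to `add_to_knowledge_base`; the human choice itself is deliberately left outside the code.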
In one embodiment, the method further comprises: adding the sample standard question manually selected from the recommended sample standard questions to the model sample library in association with the user question.
In one embodiment, if the model sample library contains no sample extension question that matches the user question, a knowledge point corresponding to the user question is created in the knowledge base, the knowledge point comprising a knowledge base standard question, knowledge base extension questions, and an answer.
In one embodiment, the method further comprises: adding the knowledge point created in the knowledge base to the model sample library at the same time.
In one embodiment, performing the semantic similarity calculation between the user question and a sample extension question comprises: segmenting the sample extension question into words and calculating its word and sentence vector values; segmenting the user question into words and calculating its word and sentence vector values; and calculating the degree of correlation between the word and sentence vector values of the sample extension question and those of the user question, to obtain the semantic similarity between the user question and the sample extension question.
In one embodiment, before determining whether the model sample library contains a sample extension question that matches the user question, the method further comprises: preprocessing all user questions in the human-computer interaction log, to filter out the invalid data among them.
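The patent does not say what counts as invalid data. Purely as an illustrative assumption, the sketch below treats a user question as invalid when it contains no letter or digit at all (for example an empty string or punctuation only); the function names and the `user_question` key are hypothetical.

```python
from typing import Dict, List


def is_valid_question(text: str) -> bool:
    # Assumed rule: valid if at least one character is a letter or digit.
    # str.isalnum() is also true for CJK characters, so Chinese text passes.
    return any(ch.isalnum() for ch in text)


def preprocess_logs(logs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop log entries whose user question fails the assumed validity rule."""
    return [log for log in logs if is_valid_question(log.get("user_question", ""))]
```

A real deployment would likely use richer rules (greetings, test input, spam), which the source leaves open.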
The present invention also provides a device for information processing, comprising: a first analysis module for determining whether a model sample library contains a sample extension question that matches a user question in a human-computer interaction log; a second analysis module for determining, in response to the existence of a sample extension question that matches the user question, whether the standard question corresponding to the user question in the human-computer interaction log is the same as the sample standard question corresponding to the matched sample extension question; and an optimization module for optimizing a knowledge base in response to the standard question corresponding to the user question not being the same as the sample standard question corresponding to the matched sample extension question.
In one embodiment, the first analysis module comprises: a semantic similarity calculation module for performing a semantic similarity calculation between the user question and the sample extension questions, to determine whether the model sample library contains at least one sample extension question whose semantic similarity to the user question is greater than a first threshold.
In one embodiment, the second analysis module comprises: a comparison module for comparing whether the text of the standard question corresponding to the user question is exactly identical to the text of the sample standard question corresponding to the matched sample extension question.
In one embodiment, the second analysis module further comprises: an adding module for adding the user question and its corresponding standard question to the model sample library in association with each other, in response to there being a sample extension question whose semantic similarity to the user question is greater than the first threshold and less than 100%, and the standard question corresponding to the user question being the same as the sample standard question corresponding to that sample extension question.
In one embodiment, if there are multiple matched sample extension questions, the second analysis module determines whether the sample standard question corresponding to any one of the matched sample extension questions is the same as the standard question corresponding to the user question.
In one embodiment, the optimization module comprises: a recommending module for recommending, based on the result of the semantic similarity calculation, the sample standard questions corresponding to the sample extension questions whose semantic matching degree with the user question is greater than a second threshold; and an adding module for adding the standard question manually selected from the recommended sample standard questions to the knowledge base in association with the user question.
In one embodiment, the adding module is further used for adding the standard question manually selected from the recommended sample standard questions to the model sample library in association with the user question.
In one embodiment, if the model sample library contains no sample extension question that matches the user question, the adding module creates a knowledge point corresponding to the user question in the knowledge base, the knowledge point comprising a knowledge base standard question, knowledge base extension questions, and an answer.
In one embodiment, the adding module also adds the knowledge point created in the knowledge base to the model sample library at the same time.
In one embodiment, the semantic similarity calculation module comprises: a word and vector calculation module for segmenting the sample extension question into words and calculating its word and sentence vector values, and for segmenting the user question into words and calculating its word and sentence vector values; and a correlation calculation module for calculating the degree of correlation between the word and sentence vector values of the sample extension question and those of the user question, to obtain the semantic similarity between the user question and the sample extension question.
In one embodiment, the device further comprises: a preprocessing module for preprocessing all user questions in the human-computer interaction log before it is determined whether the model sample library contains a sample extension question that matches the user question, to filter out the invalid data among the user questions in the human-computer interaction log.
The present invention also provides a system for information processing, comprising any of the information processing devices described above, and further comprising: a model sample library, the model sample library comprising sample standard questions and sample extension questions corresponding to each sample standard question; and a knowledge base, the knowledge base comprising knowledge base standard questions, and knowledge base extension questions and answers corresponding to each knowledge base standard question, the knowledge base being used to provide answers to user questions.
When selecting the human-computer interaction logs that need to be optimized, the present invention first performs automatic screening with an established model sample library, which filters out a large amount of content that is already known and reduces manual labor. In addition, the system can automatically recommend standard questions for the human-computer interaction logs that need to be optimized, so that a person only has to make a selection, which further reduces manual labor and improves the efficiency of knowledge base optimization.
For a better understanding of the above and other aspects of the present invention, preferred embodiments are described in detail below with reference to the accompanying drawings:
Brief description of the drawings
Fig. 1 is a schematic diagram of the knowledge base of the present invention;
Fig. 2 is a schematic diagram of the model sample library of the present invention;
Fig. 3 is a schematic diagram of the knowledge base optimization process in the information processing method flow of one embodiment of the present invention;
Fig. 4 is a schematic diagram of the information processing device of one embodiment of the present invention.
Specific embodiment
Interaction logs are generated while a user interacts with an intelligent robot. Each interaction log consists of three parts: the user question, the corresponding knowledge base standard question, and the answer. The user question is obtained directly from the user's input; after the question-answering engine parses and recognizes the user question, the corresponding knowledge base standard question and its answer are retrieved. According to the accuracy of the answer the robot gives from the corresponding knowledge point, these interaction logs fall into three kinds: the robot gave no reply to the user question, the robot gave a correct answer, and the robot gave a wrong answer. The robot fails to reply or replies wrongly mainly because its knowledge base lacks the corresponding knowledge point, or because the phrasings of an existing knowledge point are not rich enough. Analyzing the interaction logs generated each day, and extracting the logs in which the robot answered incorrectly because of a missing knowledge point or insufficient phrasings, is therefore a main way to optimize the knowledge base continuously. The method and device provided by the present invention can greatly reduce the manual effort needed to extract the human-computer interaction logs that need to be optimized. The present invention is mainly concerned with the user question and the standard question in an interaction log.
Referring to Fig. 1 and Fig. 2, these figures show some of the objects involved in the information processing of the present invention: the knowledge base and the model sample library.
As shown in Fig. 1, the knowledge base 10 includes at least one knowledge base standard question 101, together with a knowledge base extension question 1011 and an answer corresponding to each knowledge base standard question. Each knowledge base standard question corresponds to one answer, and multiple knowledge base extension questions, 1011 through 101n, may correspond to one knowledge base standard question 101. Because a knowledge base standard question 101 and its answer are in one-to-one correspondence, the present invention is mainly concerned with the processing of knowledge base standard questions and the knowledge base extension questions corresponding to each of them. In general, a knowledge base contains multiple knowledge base standard questions, 101 through 10n. The knowledge base includes multiple knowledge points, each comprising one knowledge base standard question, multiple knowledge base extension questions, and one answer; that is, the different knowledge base extension questions all correspond to the same answer, and the knowledge base standard question corresponds to this answer as well. Usually, one clearly expressed and easily maintained knowledge base extension question is selected from the extension questions of a knowledge point to serve as the standard question of that knowledge point, so the knowledge base standard question is identical to one of the knowledge base extension questions. Note that the number of knowledge base extension questions may be the same or different from one knowledge base standard question to another.
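The knowledge point structure described above can be pictured with a small data model. This is a minimal sketch, not the patent's implementation: the class names, field names, and the choice to key the knowledge base by standard question text are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class KnowledgePoint:
    """One standard question, its extension questions, and one shared answer."""
    standard_question: str
    answer: str
    extension_questions: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The standard question is itself one of the extension questions.
        if self.standard_question not in self.extension_questions:
            self.extension_questions.insert(0, self.standard_question)


@dataclass
class KnowledgeBase:
    """Hypothetical container keyed by standard question text."""
    points: Dict[str, KnowledgePoint] = field(default_factory=dict)

    def add(self, point: KnowledgePoint) -> None:
        self.points[point.standard_question] = point

    def answer_for(self, standard_question: str) -> Optional[str]:
        point = self.points.get(standard_question)
        return point.answer if point else None
```

The same shape, minus the answer, would also fit the model sample library, since the text describes its structure as similar.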
During human-computer interaction, after a user question is received, a semantic similarity calculation is used to obtain from the knowledge base the knowledge base extension question that has the highest semantic similarity to the user question and whose similarity is above a threshold. The answer corresponding to that knowledge base extension question is sent to the user, and the user question is recorded, in association with the knowledge base standard question corresponding to that extension question, as one interaction log.
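The answering and logging behavior in this paragraph might look roughly like the sketch below. Two things are assumptions rather than the patent's method: `difflib.SequenceMatcher` from the Python standard library stands in for the unspecified semantic similarity calculation, and the dictionary layout of knowledge points and log entries (including `None` fields for an unanswered question) is hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(a: str, b: str) -> float:
    # Stand-in (assumption): character-level ratio in [0, 1], not true semantics.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def answer_and_log(
    user_question: str,
    knowledge_base: List[Dict],
    threshold: float = 0.6,
) -> Dict[str, Optional[str]]:
    """Pick the best extension question above the threshold and build a log entry."""
    best_score, best_point = 0.0, None
    for point in knowledge_base:
        for extension in point["extension_questions"]:
            score = similarity(user_question, extension)
            if score > best_score:
                best_score, best_point = score, point
    if best_point is None or best_score <= threshold:
        # No reply: the log keeps the question with no standard question or answer.
        return {"user_question": user_question, "standard_question": None, "answer": None}
    return {
        "user_question": user_question,
        "standard_question": best_point["standard_question"],
        "answer": best_point["answer"],
    }
```

The default threshold of 0.6 is arbitrary; the source only says that a threshold exists.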
As shown in Fig. 2, the model sample library 20 includes at least one sample standard question 201 and one or more corresponding sample extension questions 2011. Its data structure is similar to that of the knowledge base: one sample standard question corresponds to multiple sample extension questions. Usually, one clearly expressed and easily maintained extension question is selected from the multiple sample extension questions to serve as the sample standard question for them, so the sample standard question is identical to one of the sample extension questions. The number of sample extension questions may be the same or different from one sample standard question to another.
Referring to Fig. 3, the knowledge base optimization flow of one embodiment of the present invention comprises the following steps:
Step 301: Start.
Step 302: Determine whether the model sample library contains a sample extension question that matches a user question in the human-computer interaction log.
Step 303: If so, determine whether the standard question corresponding to the user question in the human-computer interaction log is the same as the sample standard question corresponding to the matched sample extension question.
Step 304: If they are not the same, optimize the knowledge base.
In step 302, the model sample library is first searched for a sample extension question that is semantically close to the user question in the human-computer interaction log; if a close one exists, the two are said to match. If there is a match, the user question is considered decidable by the model sample library. Then, in step 303, for a decidable user question, it is determined whether the standard question corresponding to the user question is the same as the standard question corresponding to the matched sample extension question, where "the same" means that the texts are exactly identical. If they are the same, the knowledge base already contains the knowledge point corresponding to the user question, and the knowledge base does not need to be optimized with this log. If they are not the same, neither the model sample library nor the knowledge base has a question corresponding to the content of this interaction log; the interaction log is new content, and the knowledge base needs to be optimized with it, that is, the flow enters step 304. At this point, because the user question in the interaction log is decidable, the one or more sample standard questions corresponding to the one or more sample questions in the model sample library that are semantically close to the user question can be recommended directly to the knowledge maintenance personnel. When there is one, the knowledge maintenance personnel judge whether it is suitable; when there are several, the knowledge maintenance personnel directly select the most suitable one. The sample standard question judged suitable, or selected as most suitable, is then stored in the knowledge base in association with the user question. Manual involvement is thus reduced to simple supervision and management, and the knowledge maintenance personnel who perform it need only be able to read Chinese and have normal logical judgment. Compared with the earlier manual work, which required some knowledge editing experience, this further lowers the threshold for personnel and improves optimization efficiency.
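Steps 302 to 304 can be sketched as a small triage function. The sketch rests on assumptions: `difflib.SequenceMatcher` is only a stand-in for the semantic similarity calculation, the model sample library is modeled as a plain mapping from sample extension question to sample standard question, and the decision labels (`no_match`, `consistent`, `optimize`) are hypothetical names.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple


def char_similarity(a: str, b: str) -> float:
    # Stand-in (assumption) for a real semantic similarity measure.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def triage_log(
    log: Dict[str, str],
    sample_library: Dict[str, str],
    first_threshold: float = 0.8,
    similarity: Callable[[str, str], float] = char_similarity,
) -> Tuple[str, List[str]]:
    """Return (decision, recommended sample standard questions) for one log."""
    # Step 302: collect standard questions of all matching sample extension questions.
    matched = [
        standard
        for extension, standard in sample_library.items()
        if similarity(log["user_question"], extension) > first_threshold
    ]
    if not matched:
        return "no_match", []
    # Step 303: "the same" means exact text equality.
    if log["standard_question"] in matched:
        return "consistent", []
    # Step 304: recommend distinct candidates for a maintainer to choose from.
    return "optimize", list(dict.fromkeys(matched))
```

Only logs that come back as `optimize` (or `no_match`, handled separately in the text) would reach a human, which is where the claimed labor saving comes from.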
Another advantage of the method is that the decision on whether the knowledge base needs to be optimized is made entirely in the local model sample library, without using a cloud knowledge base. This both increases computation speed and saves the expense of the cloud knowledge base.
In one embodiment, the sample extension questions include knowledge base extension questions, and the sample standard questions include knowledge base standard questions. Further, the sample extension questions include all the knowledge base extension questions in the knowledge base, and the sample standard questions include all the knowledge base standard questions in the knowledge base. In this embodiment, the model sample library contains all the knowledge base standard questions and knowledge base extension questions of the knowledge base. The model sample library then judges more accurately whether optimization is needed, which further reduces the workload of subsequent manual selection.
In one embodiment, if the result in step 302 is that the model sample library contains no sample extension question that matches the user question, a knowledge point corresponding to the user question is created in the knowledge base, the knowledge point comprising a knowledge base standard question, knowledge base extension questions, and an answer. In this embodiment, the interaction log is considered undecidable by the model sample library, that is, the knowledge base has no information related to the interaction log, and the knowledge base needs to be optimized with this log. Because the interaction log is undecidable, the knowledge maintenance personnel must themselves add a knowledge point related to the user question, that is, add one knowledge base standard question, multiple knowledge base extension questions, and one answer, to complete the optimization of the knowledge base.
In a preferred embodiment, whether there is a match in step 302 is measured by semantic similarity. A threshold can be set; when the semantic similarity is greater than the first threshold, the user question in the interaction log is considered to match the sample extension question. When enough manual effort is available, the first threshold can be set higher. Otherwise, the first threshold can be set lower to save labor cost.
In one embodiment, determining whether the model sample library contains a sample extension question that matches a user question in the human-computer interaction log is done by a semantic matching degree computation, comprising the steps of: segmenting each sample extension question into words and calculating its word and sentence vector values; segmenting the user question into words and calculating its word and sentence vector values; and calculating the degree of correlation between the word and sentence vector values of each sample extension question and those of the user question, to obtain the semantic similarity between the user question and the sample extension question. There are many ways to compute semantic matching degree, and methods from the prior art can also be used in the present invention.
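Since the text leaves the vector model open, the following is just one possible instance of "segment, vectorize, correlate": whitespace tokenization, bag-of-words count vectors, and cosine similarity. All three choices and the function names are assumptions; Chinese text in particular would need a real word segmenter in place of `str.split`.

```python
import math
from collections import Counter
from typing import List


def tokenize(sentence: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization stands in for word segmentation.
    return sentence.lower().split()


def sentence_vector(sentence: str) -> Counter:
    # Assumption: a bag-of-words count vector stands in for "word and sentence vector values".
    return Counter(tokenize(sentence))


def semantic_similarity(user_question: str, sample_extension_question: str) -> float:
    """Cosine similarity between the two count vectors, in [0, 1]."""
    u = sentence_vector(user_question)
    s = sentence_vector(sample_extension_question)
    dot = sum(count * s[word] for word, count in u.items())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in s.values()))
    return dot / norm if norm else 0.0
```

Identical sentences score approximately 1.0 (up to floating-point rounding) and sentences with no shared words score 0.0, so the first threshold can be read as a fraction between these extremes.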
Because the quality of the model sample library is crucial to the present invention, in another, more preferred embodiment the model sample library is itself optimized, in two ways. First, whenever the knowledge base is optimized, the same content is added to the model sample library. Second, when there is a sample extension question whose semantic similarity to the user question is greater than the first threshold and less than 100%, and the standard question corresponding to the user question is the same as the sample standard question corresponding to that sample extension question, the user question and its corresponding standard question are added to the model sample library in association with each other. The first kind of optimization mainly keeps the content of the model sample library consistent with that of the knowledge base and brings the newest questions and standard questions into the model sample library, so that the next time an interaction log close to the updated content is encountered, the present invention can filter it out directly, without manual judgment or optimization. In the second kind, the knowledge base can already give the correct answer for the current user question, that is, the correct standard question is found, so the interaction log does not need to be optimized into the knowledge base; optimizing the model sample library instead brings more subsequent interaction logs within its decidable range, so that the present invention can handle related interaction logs directly.
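The second optimization path can be sketched as a single guarded update. The function name, the parameter names, and the dictionary model of the sample library (sample extension question mapped to sample standard question) are hypothetical; the similarity score is assumed to have been computed beforehand and expressed as a fraction, so 100% corresponds to 1.0.

```python
from typing import Dict


def maybe_extend_sample_library(
    sample_library: Dict[str, str],
    user_question: str,
    log_standard: str,
    matched_standard: str,
    score: float,
    first_threshold: float = 0.8,
) -> bool:
    """Add the user question as a new sample extension question when it is a near match.

    Returns True if the library was extended.
    """
    near_match = first_threshold < score < 1.0  # above the threshold but not identical
    if near_match and log_standard == matched_standard:
        sample_library[user_question] = log_standard
        return True
    return False
```

A score of exactly 1.0 is skipped because the same wording is already in the library, so adding it again would contribute nothing.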
The information processing method flow of one embodiment of the present invention specifically comprises:
Step 1: starting.
Step 2: determining in model sample library with the presence or absence of the sample to match with user's question sentence in human-computer interaction log
Extension is asked, is entered step 3 if it exists, is otherwise entered step 5.
Step 3: determining that the corresponding standard of user's question sentence described in the human-computer interaction log is asked and expand with matched sample
Whether the corresponding sample standard that exhibition is asked asks identical.4 are entered step if they are the same, otherwise enter step 6.
Step 4: judging whether the semantic similarity that user's question sentence is asked with sample extension is greater than first threshold and is less than
100%, if then entering step 7, otherwise enter step 8.
Step 5: re-creating knowledge point, and with knowledge point optimization knowledge base and model sample library.
Step 6: selection creation of knowledge point, and with knowledge point optimization knowledge base and model sample library.
Step 7: using interactive log content, Optimized model sample library.
Step 8: terminating.
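The branching in steps 2 to 4 can be sketched as a small decision function. This is an illustrative sketch only: the `Action` enum, the `decide` function, and its parameters are hypothetical names, and the matching of a sample extension question (step 2) is assumed to have been done by the caller, who passes `None` when no match exists.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    CREATE_KNOWLEDGE_POINT = 5     # step 5
    SELECT_STANDARD_QUESTION = 6   # step 6
    OPTIMIZE_SAMPLE_LIBRARY = 7    # step 7
    END = 8                        # step 8


def decide(
    matched_sample_standard: Optional[str],
    log_standard: str,
    similarity: float,
    first_threshold: float,
) -> Action:
    """Return the branch chosen by steps 2 to 4 for one interaction log entry."""
    # Step 2: no matching sample extension question exists.
    if matched_sample_standard is None:
        return Action.CREATE_KNOWLEDGE_POINT
    # Step 3: the standard questions differ.
    if matched_sample_standard != log_standard:
        return Action.SELECT_STANDARD_QUESTION
    # Step 4: similar but not identical to the sample extension question.
    if first_threshold < similarity < 1.0:
        return Action.OPTIMIZE_SAMPLE_LIBRARY
    return Action.END
```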
Step 5 includes: a knowledge maintainer actively adds a knowledge point related to the user question, that is, one knowledge base standard question, multiple knowledge base extension questions, and one answer, thereby completing the optimization of the knowledge base; the same knowledge point is also used to optimize the model sample library, except that the optimization of the model sample library uses only the questions and the standard question of the knowledge point. Step 6 includes: one or more standard questions from the model sample library are recommended to the knowledge maintainer, who directly selects one to form a pairing of the user question and a standard question; the pairing is then added to the knowledge base and, at the same time, to the model sample library. In step 7, the user question in the interaction log and its corresponding standard question are added to the model sample library, forming a new correspondence between a sample extension question and a sample standard question.
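The data handled in steps 5 and 7 can be modeled as follows. This is a sketch under stated assumptions: the `KnowledgePoint` class, the list-based knowledge base, the dictionary-based model sample library, and both function names are hypothetical and chosen only to show that the answer stays in the knowledge base while the model sample library receives question and standard question pairs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class KnowledgePoint:
    standard_question: str          # one knowledge base standard question
    extension_questions: List[str]  # multiple knowledge base extension questions
    answer: str                     # one answer


def add_knowledge_point(
    knowledge_base: List[KnowledgePoint],
    sample_library: Dict[str, str],
    point: KnowledgePoint,
) -> None:
    """Step 5: store the whole knowledge point in the knowledge base, but copy
    only the questions and the standard question into the model sample library."""
    knowledge_base.append(point)
    for question in point.extension_questions:
        sample_library[question] = point.standard_question


def add_log_pair(
    sample_library: Dict[str, str], user_question: str, standard_question: str
) -> None:
    """Step 7: the user question becomes a new sample extension question that
    maps to its standard question."""
    sample_library[user_question] = standard_question
```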
The present invention also provides an information processing device 51; see Fig. 4. In one embodiment, the device includes a first analysis module 501, a second analysis module 502, and an optimization module 503. The interaction log first enters the first analysis module 501, which determines whether the model sample library contains a sample extension question that matches the user question in the human-computer interaction log. If so, the log enters the second analysis module 502, which determines whether the standard question corresponding to the user question in the human-computer interaction log is identical to the sample standard question corresponding to the matched sample extension question. If they are not identical, the log enters the optimization module 503, which optimizes the knowledge base.
In another embodiment, see Fig. 4, the first analysis module 501 further includes a semantic similarity calculation module 5011 for calculating the semantic similarity between the user question in the human-computer interaction log and a sample extension question, thereby obtaining the matching degree. The second analysis module 502 includes a comparison module 5021 for comparing whether the text of the standard question corresponding to the user question is completely identical to the text of the sample standard question corresponding to the matched sample extension question. The optimization module 503 further includes a recommendation module 5031 for recommending, based on the result of the semantic similarity calculation module 5011, the sample standard questions corresponding to the sample extension questions whose semantic matching degree with the user question is greater than a second threshold. The optimization module 503 further includes an adding module 5032 for adding the standard question manually selected from the recommended sample standard questions to the knowledge base in association with the user question, while also optimizing the same content into the model sample library.
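The behavior described for the recommendation module 5031 can be sketched as below. The text does not specify the similarity measure, so a word-overlap (Jaccard) score is used here purely as a stand-in; the function names, the dictionary-based model sample library, the ranking by best score, and the threshold value 0.3 are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity (assumption): overlap of lower-cased word sets."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def recommend_standard_questions(
    user_question: str,
    sample_library: Dict[str, str],
    second_threshold: float = 0.3,  # hypothetical value
    similarity: Callable[[str, str], float] = jaccard,
) -> List[str]:
    """Return the distinct sample standard questions whose sample extension
    questions score above the second threshold, best match first."""
    best: Dict[str, float] = {}
    for extension, standard in sample_library.items():
        score = similarity(user_question, extension)
        if score > second_threshold:
            best[standard] = max(score, best.get(standard, 0.0))
    return sorted(best, key=best.get, reverse=True)
```

The knowledge maintainer would then pick one entry from the returned list, and the chosen standard question would be stored together with the user question.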
More preferably, the model sample library 504 is optimized at the same time as the knowledge base. The second analysis module 502 further includes an adding module 5022 for optimizing the content of the interaction log into the model sample library when the semantic similarity between the user question and the sample extension question is greater than the first threshold and less than 100% and the corresponding standard questions are identical. The adding module 5032 is also used to add the standard question manually selected from the sample standard questions recommended by the recommendation module 5031 to the model sample library in association with the user question.
In another embodiment, invalid data in the interaction log is filtered out first: junk data can be removed from the log data according to preset filtering rules, for example data in which a single English letter is repeated more than 5 times. A naive Bayes algorithm can then be used to analyze the log and calculate whether its content falls within the range that the analysis model can decide.
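The example filtering rule can be expressed as a regular expression. The sketch below assumes that an entry counts as junk only when the whole entry consists of one English letter repeated more than 5 times (that is, 6 or more occurrences), and that empty entries are also discarded; both readings, and the function names, are assumptions rather than details given in the text.

```python
import re
from typing import Iterable, List

# One English letter followed by at least 5 repetitions of the same letter,
# so the letter occurs more than 5 times in total.
_REPEATED_LETTER = re.compile(r"([A-Za-z])\1{5,}")


def is_junk(entry: str) -> bool:
    """Return True for empty entries and for entries made up of a single
    English letter repeated more than 5 times."""
    text = entry.strip()
    return not text or _REPEATED_LETTER.fullmatch(text) is not None


def filter_log(entries: Iterable[str]) -> List[str]:
    """Keep only the entries that pass the junk filter."""
    return [entry for entry in entries if not is_junk(entry)]
```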
The present invention also provides an information processing system 52; see Fig. 4. The system includes any of the information processing devices described above, together with a knowledge base 504 and a model sample library 505.
When human-computer interaction logs need to be optimized, the present invention first screens them automatically against the established model sample library, filtering out a large amount of existing knowledge content and reducing the manual labor required. The system can also automatically recommend standard questions for the human-computer interaction logs that need optimization, so that a person only has to make a selection, which further reduces manual labor and improves the efficiency of knowledge base optimization.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (20)
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810620088.5A CN108764480B (en) | 2016-08-23 | 2016-08-23 | Information processing system |
| CN201610710565.8A CN106295807B (en) | 2016-08-23 | 2016-08-23 | A method and device for information processing |
| CN201811074893.9A CN109344237B (en) | 2016-08-23 | 2016-08-23 | Information processing method and device for man-machine interaction |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610710565.8A CN106295807B (en) | 2016-08-23 | 2016-08-23 | A method and device for information processing |
Related Child Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201811074893.9A Division CN109344237B (en) | 2016-08-23 | 2016-08-23 | Information processing method and device for man-machine interaction |
| CN201810620088.5A Division CN108764480B (en) | 2016-08-23 | 2016-08-23 | Information processing system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN106295807A CN106295807A (en) | 2017-01-04 |
| CN106295807B true CN106295807B (en) | 2018-12-21 |
Family ID=57615826
Family Applications (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201610710565.8A Expired - Fee Related CN106295807B (en) | 2016-08-23 | 2016-08-23 | A method and device for information processing |
| CN201811074893.9A Active CN109344237B (en) | 2016-08-23 | 2016-08-23 | Information processing method and device for man-machine interaction |
| CN201810620088.5A Active CN108764480B (en) | 2016-08-23 | 2016-08-23 | Information processing system |
Family Applications After (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201811074893.9A Active CN109344237B (en) | 2016-08-23 | 2016-08-23 | Information processing method and device for man-machine interaction |
| CN201810620088.5A Active CN108764480B (en) | 2016-08-23 | 2016-08-23 | Information processing system |
Country Status (1)
| Country | Link |
|---|---|
| CN (3) | CN106295807B (en) |
Families Citing this family (26)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106886820A (en) * | 2017-02-08 | 2017-06-23 | 深圳市科迈爱康科技有限公司 | Intelligent information processing method and system |
| CN109933777B (en) * | 2017-12-18 | 2024-02-06 | 上海智臻智能网络科技股份有限公司 | Knowledge base expanding device |
| CN110019304B (en) * | 2017-12-18 | 2024-01-05 | 上海智臻智能网络科技股份有限公司 | Methods, storage media and terminals for expanding question and answer knowledge base |
| CN110019305B (en) * | 2017-12-18 | 2024-03-15 | 上海智臻智能网络科技股份有限公司 | Knowledge base expansion method, storage medium, and terminal |
| CN109934347B (en) * | 2017-12-18 | 2024-02-02 | 上海智臻智能网络科技股份有限公司 | Device for expanding question and answer knowledge base |
| CN108345644A (en) * | 2018-01-15 | 2018-07-31 | 阿里巴巴集团控股有限公司 | A kind of method and device of data processing |
| CN109325040B (en) * | 2018-07-13 | 2020-11-10 | 众安信息技术服务有限公司 | A kind of FAQ question and answer library generalization method, device and equipment |
| WO2020047779A1 (en) * | 2018-09-05 | 2020-03-12 | 西门子(中国)有限公司 | Fault analysis method and device and computer readable medium |
| CN109213847A (en) * | 2018-09-14 | 2019-01-15 | 广州神马移动信息科技有限公司 | Layered approach and its device, electronic equipment, the computer-readable medium of answer |
| CN109189912A (en) * | 2018-10-09 | 2019-01-11 | 阿里巴巴集团控股有限公司 | The update method and device of user's consulting statement library |
| CN111382239B (en) * | 2018-12-27 | 2023-06-23 | 上海智臻智能网络科技股份有限公司 | Interaction flow optimization method and device |
| CN111382235A (en) * | 2018-12-27 | 2020-07-07 | 上海智臻智能网络科技股份有限公司 | Question-answer knowledge base optimization method and device |
| CN111400458A (en) * | 2018-12-27 | 2020-07-10 | 上海智臻智能网络科技股份有限公司 | Automatic generalization method and device |
| CN109829051B (en) * | 2019-01-30 | 2023-01-17 | 科大讯飞股份有限公司 | Method and device for screening similar sentences of database |
| CN109992675A (en) * | 2019-01-30 | 2019-07-09 | 阿里巴巴集团控股有限公司 | Information processing method and device |
| CN109947651B (en) * | 2019-03-21 | 2022-08-02 | 上海智臻智能网络科技股份有限公司 | Artificial intelligence engine optimization method and device |
| CN110347807B (en) * | 2019-05-20 | 2023-08-08 | 平安科技(深圳)有限公司 | Problem information processing method and device |
| CN110362665B (en) * | 2019-06-12 | 2021-04-30 | 深圳追一科技有限公司 | Question-answering system and method based on semantic similarity |
| CN112825074A (en) * | 2019-11-20 | 2021-05-21 | 上海智臻智能网络科技股份有限公司 | Automatic question-answering system and device for updating question-answering knowledge base |
| CN110928991A (en) * | 2019-11-20 | 2020-03-27 | 上海智臻智能网络科技股份有限公司 | Method and device for updating question-answer knowledge base |
| CN111125379B (en) * | 2019-12-26 | 2022-12-06 | 科大讯飞股份有限公司 | Knowledge base expansion method and device, electronic equipment and storage medium |
| CN111144098B (en) * | 2019-12-26 | 2023-05-30 | 支付宝(杭州)信息技术有限公司 | Recall method and device for extended question |
| CN112936304B (en) * | 2021-02-02 | 2022-09-16 | 浙江大学 | Self-evolution type service robot system and learning method thereof |
| CN114064874A (en) * | 2021-11-19 | 2022-02-18 | 浙江百应科技有限公司 | Knowledge base problem adding method and device based on vector search engine |
| CN115203369A (en) * | 2022-07-08 | 2022-10-18 | 北京锐安科技有限公司 | A data processing method, apparatus, device and storage medium |
| CN118377858A (en) * | 2024-02-07 | 2024-07-23 | 上海迪爱斯信息技术有限公司 | A method and system for constructing data query prompts for language models |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104679815A (en) * | 2014-12-08 | 2015-06-03 | 北京云知声信息技术有限公司 | Method and system for screening question and answer pairs and updating question and answer database in real time |
| CN105488185A (en) * | 2015-12-01 | 2016-04-13 | 上海智臻智能网络科技股份有限公司 | Method and device for optimizing knowledge base |
| CN105550361A (en) * | 2015-12-31 | 2016-05-04 | 上海智臻智能网络科技股份有限公司 | Log processing method and apparatus, and ask-answer information processing method and apparatus |
| CN105678324A (en) * | 2015-12-31 | 2016-06-15 | 上海智臻智能网络科技股份有限公司 | Similarity calculation-based questions and answers knowledge base establishing method, device and system |
| CN105677783A (en) * | 2015-12-31 | 2016-06-15 | 上海智臻智能网络科技股份有限公司 | Information processing method and device for intelligent question-answering system |
| CN105824797A (en) * | 2015-01-04 | 2016-08-03 | 华为技术有限公司 | Method, device and system evaluating semantic similarity |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101178705A (en) * | 2007-12-13 | 2008-05-14 | 中国电信股份有限公司 | Free-running speech comprehend method and man-machine interactive intelligent system |
| US8504361B2 (en) * | 2008-02-07 | 2013-08-06 | Nec Laboratories America, Inc. | Deep neural networks and methods for using same |
| CN104199825A (en) * | 2014-07-23 | 2014-12-10 | 清华大学 | Information inquiry method and system |
| CN104360994A (en) * | 2014-12-04 | 2015-02-18 | 科大讯飞股份有限公司 | Natural language understanding method and natural language understanding system |
| US20160196490A1 (en) * | 2015-01-02 | 2016-07-07 | International Business Machines Corporation | Method for Recommending Content to Ingest as Corpora Based on Interaction History in Natural Language Question and Answering Systems |
| CN105591882B (en) * | 2015-12-10 | 2018-03-06 | 北京中科汇联科技股份有限公司 | A kind of intelligence machine person to person mixes the method and system of customer service |
| CN105631022B (en) * | 2015-12-29 | 2019-03-05 | 上海智臻智能网络科技股份有限公司 | Information processing method and device |
-
2016
- 2016-08-23 CN CN201610710565.8A patent/CN106295807B/en not_active Expired - Fee Related
- 2016-08-23 CN CN201811074893.9A patent/CN109344237B/en active Active
- 2016-08-23 CN CN201810620088.5A patent/CN108764480B/en active Active
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104679815A (en) * | 2014-12-08 | 2015-06-03 | 北京云知声信息技术有限公司 | Method and system for screening question and answer pairs and updating question and answer database in real time |
| CN105824797A (en) * | 2015-01-04 | 2016-08-03 | 华为技术有限公司 | Method, device and system evaluating semantic similarity |
| CN105488185A (en) * | 2015-12-01 | 2016-04-13 | 上海智臻智能网络科技股份有限公司 | Method and device for optimizing knowledge base |
| CN105550361A (en) * | 2015-12-31 | 2016-05-04 | 上海智臻智能网络科技股份有限公司 | Log processing method and apparatus, and ask-answer information processing method and apparatus |
| CN105678324A (en) * | 2015-12-31 | 2016-06-15 | 上海智臻智能网络科技股份有限公司 | Similarity calculation-based questions and answers knowledge base establishing method, device and system |
| CN105677783A (en) * | 2015-12-31 | 2016-06-15 | 上海智臻智能网络科技股份有限公司 | Information processing method and device for intelligent question-answering system |
Non-Patent Citations (1)
| Title |
|---|
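| Automatic identification of questions to be improved in interactive question answering systems (交互式问答系统中的待改进问题自动识别方法); Ge Liping (葛丽萍); China Master's Theses Full-text Database, Information Science and Technology series; 20150215 (No. 02); I139-158 * |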
Also Published As
| Publication number | Publication date |
|---|---|
| CN109344237A (en) | 2019-02-15 |
| CN106295807A (en) | 2017-01-04 |
| CN108764480B (en) | 2020-07-07 |
| CN109344237B (en) | 2020-11-17 |
| CN108764480A (en) | 2018-11-06 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN106295807B (en) | A method and device for information processing | |
| CN109408821B (en) | Corpus generation method and device, computing equipment and storage medium | |
| CN115080732A (en) | Complaint work order processing method and device, electronic equipment and storage medium | |
| CN110262273A (en) | Household equipment control method and device, storage medium and intelligent household system | |
| CN108595696A (en) | A kind of human-computer interaction intelligent answering method and system based on cloud platform | |
| CN105868179B (en) | An intelligent question answering method and device | |
| CN111429915A (en) | A dispatch system and dispatch method based on speech recognition | |
| KR20200007969A (en) | Information processing methods, terminals, and computer storage media | |
| CN109829045A (en) | A kind of answering method and device | |
| CN106776832B (en) | Processing method, apparatus and system for question and answer interactive log | |
| KR102343407B1 (en) | Apparatus or Method for Detecting Meaningful Intervals using voice and video information | |
| CN111180025B (en) | Method, device and consultation system for representing medical record text vector | |
| CN110689078A (en) | Man-machine interaction method and device based on personality classification model and computer equipment | |
| CN110674276A (en) | Robot self-learning method, robot terminal, device and readable storage medium | |
| CN113628077A (en) | Method for generating non-repeated examination questions, terminal and readable storage medium | |
| CN106485370B (en) | A kind of method and apparatus of information prediction | |
| CN111460106A (en) | An information interaction method, device and device | |
| CN111881330B (en) | Automatic home service scene restoration method and system | |
| CN110362828B (en) | Network information risk identification method and system | |
| CN111209394A (en) | Text classification processing method and device | |
| CN117952128A (en) | Term translation recommendation method, device, electronic equipment and storage medium | |
| KR102507810B1 (en) | Voice-based sales information extraction and lead recommendation method using artificial intelligence, and data analysis apparatus therefor | |
| CN115222282A (en) | Quality inspection scoring method, equipment and storage medium based on correlation | |
| CN118843058B (en) | A comprehensive management system for hearing examination and hearing aid fitting effect evaluation | |
| CN116935230B (en) | Crop pest and disease identification methods, devices, equipment and media |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20181018. Address after: 201803 7, 398 Lane 1555, Jiangxi Road, Jinsha, Jiading District, Shanghai. Applicant after: SHANGHAI XIAOI ROBOT TECHNOLOGY Co.,Ltd.; GUIZHOU XIAOAI ROBOT TECHNOLOGY CO.,LTD. Address before: 201803 7, 398 Lane 1555, Jiangxi Road, Jinsha, Jiading District, Shanghai. Applicant before: SHANGHAI XIAOI ROBOT TECHNOLOGY Co.,Ltd. |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20181221 |