CN109492085A - Answer determination method, apparatus, terminal and storage medium based on data processing
- Publication number: CN109492085A (application CN201811364713.0A)
- Authority: CN (China)
- Prior art keywords: answer, generation, preset, model, candidate answers
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present invention disclose an answer determination method, apparatus, terminal and storage medium based on data processing. The method includes: obtaining an initial question input by a user, and calling a preset retrieval model to determine, from a preset knowledge base, a candidate answer set corresponding to the initial question; calling a preset generation model to determine a generated answer set corresponding to the initial question; calculating, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and calculating a first average value of the at least one candidate matching degree; and determining, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set. Embodiments of the present invention can better determine the target answer, avoid the long-tail problem in the target answer, and ensure the consistency and reasonableness of the target answer.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to an answer determination method, apparatus, terminal and storage medium based on data processing.
Background
Human-computer interaction (Human-Computer Interaction, HCI) refers to the process of exchanging information between a person and a computer, using a certain dialogue language and a certain mode of interaction. With the development of human-computer interaction technology, more and more intelligent products based on it have emerged, such as chat robots. These intelligent products can chat with users and generate corresponding answer information according to the users' questions. At present, however, the answer information that an intelligent product retrieves for a user's question often suffers from the long-tail problem (that is, questions asked only by a minority of users), or its consistency and reasonableness are difficult to guarantee. How to better determine a target answer according to a user's question has therefore become a research hotspot.
Summary of the invention
Embodiments of the present invention provide an answer determination method, apparatus, terminal and computer-readable storage medium based on data processing, which can better determine the target answer, avoid the long-tail problem in the target answer, and ensure the consistency and reasonableness of the target answer.
In one aspect, an embodiment of the present invention provides an answer determination method based on data processing, the method including:
obtaining an initial question input by a user, and calling a preset retrieval model to determine, from a preset knowledge base, a candidate answer set corresponding to the initial question, where the preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer;
calling a preset generation model to determine a generated answer set corresponding to the initial question, where the generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using a training dataset that includes a plurality of questions;
calculating, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and calculating a first average value of the at least one candidate matching degree;
determining, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set.
In another aspect, an embodiment of the present invention provides an answer determination apparatus based on data processing, the apparatus including:
an acquiring unit, configured to obtain an initial question input by a user, and to call a preset retrieval model to determine, from a preset knowledge base, a candidate answer set corresponding to the initial question, where the preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer;
the acquiring unit being further configured to call a preset generation model to determine a generated answer set corresponding to the initial question, where the generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using a training dataset that includes a plurality of questions;
a computing unit, configured to calculate, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and to calculate a first average value of the at least one candidate matching degree;
a determination unit, configured to determine, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set.
In yet another aspect, an embodiment of the present invention provides a terminal. The terminal includes an input device, an output device, a memory and a processor, which are connected to one another. The memory is used to store a computer program that includes program instructions, and the processor is configured to call the program instructions to perform the following steps:
obtaining an initial question input by a user, and calling a preset retrieval model to determine, from a preset knowledge base, a candidate answer set corresponding to the initial question, where the preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer;
calling a preset generation model to determine a generated answer set corresponding to the initial question, where the generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using a training dataset that includes a plurality of questions;
calculating, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and calculating a first average value of the at least one candidate matching degree;
determining, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set.
In yet another aspect, an embodiment of the present invention provides a computer-readable storage medium that stores a computer program. The computer program includes at least one program instruction, which can be loaded by a processor and used to perform the following steps:
obtaining an initial question input by a user, and calling a preset retrieval model to determine, from a preset knowledge base, a candidate answer set corresponding to the initial question, where the preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer;
calling a preset generation model to determine a generated answer set corresponding to the initial question, where the generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using a training dataset that includes a plurality of questions;
calculating, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and calculating a first average value of the at least one candidate matching degree;
determining, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set.
In embodiments of the present invention, after the initial question input by the user is obtained, a preset retrieval model can be called to determine the candidate answer set corresponding to the initial question from a preset knowledge base, and a preset generation model can be called to determine the generated answer set corresponding to the initial question. The candidate matching degree between each candidate answer in the candidate answer set and the initial question is then calculated to obtain at least one candidate matching degree, and the first average value of the at least one candidate matching degree is calculated. Finally, the target answer to be output can be determined from the candidate answer set or the generated answer set according to the first average value. Because both the retrieval model and the generation model are called to determine the target answer, the long-tail problem in the target answer can be avoided and the consistency and reasonableness of the target answer can be ensured. In addition, deciding according to the first average value whether the target answer is taken from the candidate answer set or from the generated answer set guards against mis-retrieval by the retrieval model and thereby improves accuracy.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below.
Fig. 1 is a schematic flowchart of an answer determination method based on data processing provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of an answer determination method based on data processing provided by another embodiment of the present invention;
Fig. 3a is an application scenario diagram of an answer determination method based on data processing provided by an embodiment of the present invention;
Fig. 3b is an application scenario diagram of an answer determination method based on data processing provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an answer determination apparatus based on data processing provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a terminal provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described below with reference to the drawings in the embodiments of the present invention.
An embodiment of the present invention proposes an answer determination method based on data processing, which can be used in chat conversations between a terminal and a user. The terminal here may include, but is not limited to, smart devices such as smartphones, laptop computers, tablet computers and desktop computers, as well as chat devices based on chat conversations, such as chat robots. Specifically, during a chat conversation with the user, the terminal can obtain the initial question that the user inputs in a user interface, then call a preset retrieval model to determine the candidate answer set corresponding to the initial question from a preset knowledge base, and call a preset generation model to determine the generated answer set corresponding to the initial question. The terminal then calculates, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and calculates the first average value of the at least one candidate matching degree. Finally, according to a preset determination rule and the first average value, the terminal can determine the target answer to be output from the candidate answer set or the generated answer set. After the target answer is determined, it can be output in the user interface, thereby realizing the chat conversation with the user.
Refer to Fig. 1, which is a schematic flowchart of an answer determination method based on data processing provided by an embodiment of the present invention. The method can be performed by the terminal described above. As shown in Fig. 1, the method may include the following steps S101-S104:
S101: obtain the initial question input by the user, and call a preset retrieval model to determine the candidate answer set corresponding to the initial question from a preset knowledge base. The preset knowledge base here includes at least one question and one or more answers corresponding to each question, and the candidate answer set here includes at least one candidate answer.
Specifically, when obtaining the initial question input by the user, the terminal can obtain the user's voice information and extract the initial question from it. For example, the user says to the terminal, "Hello, do you know what the components of a computer are?"; the terminal obtains this voice information and extracts from it the initial question "what are the components of a computer". In another embodiment, the terminal can also obtain text information input by the user and extract the initial question from the text information. For example, the terminal can provide the user with a dialog interface in which the user can input the text "Hello, do you know what the components of a computer are?"; the terminal detects the user's input operation, obtains the text information input by the user, and then extracts from it the initial question "what are the components of a computer".
After obtaining the initial question, the terminal can call the preset retrieval model to determine the candidate answer set corresponding to the initial question from the preset knowledge base. The preset retrieval model can be obtained by training based on a preset retrieval algorithm, which may include, but is not limited to, an IR (Information Retrieval) algorithm, the BM25 (Okapi BM25) algorithm, and so on.
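The retrieval step can be illustrated with a minimal Okapi BM25 sketch. Several things here are assumptions made for illustration and are not specified by the source: the knowledge base is a plain dictionary mapping stored questions to answer lists, the initial question is matched against the stored questions, tokenization is whitespace-based, and the helper names (`tokenize`, `bm25_scores`, `retrieve_candidates`) and the parameters `k1`, `b` and `top_k` are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Whitespace tokenization is an assumption made for this sketch.
    return text.lower().split()

def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score every document against the query with the Okapi BM25 formula."""
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(d) / avgdl))
            score += idf * norm
        scores.append(score)
    return scores

def retrieve_candidates(question, knowledge_base, top_k=3):
    """Collect the answers of the top_k stored questions that best match `question`."""
    stored_questions = list(knowledge_base)
    scores = bm25_scores(question, stored_questions)
    ranked = sorted(zip(scores, stored_questions), key=lambda p: p[0], reverse=True)
    candidates = []
    for score, stored in ranked[:top_k]:
        if score > 0:  # skip stored questions that share no term with the query
            candidates.extend(knowledge_base[stored])
    return candidates

# Hypothetical two-entry knowledge base: question -> list of answers.
knowledge_base = {
    "what are the components of a computer": ["screen, keyboard, graphics card, sound card and hard disk"],
    "where is the forbidden city": ["Beijing"],
}
candidate_set = retrieve_candidates("what components does a computer have", knowledge_base)
```

A production retrieval model would typically add an inverted index and language-specific tokenization; the sketch only shows how a ranking function turns an initial question into a candidate answer set.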
S102: call a preset generation model to determine the generated answer set corresponding to the initial question. The generated answer set here includes at least one generated answer.
Specifically, the preset generation model is obtained through model training and optimization using a training dataset that includes a plurality of questions. In one embodiment, the generation model can be a generation model based on attention+seq2seq. After obtaining the initial question, the terminal can call the preset generation model to generate one or more generated answers for the initial question, and then take the set consisting of all the generated answers produced by the preset generation model, or of a preset number of them, as the generated answer set.
S103: calculate, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and calculate the first average value of the at least one candidate matching degree.
In one embodiment, the preset calculation rule may include a rule that determines the candidate matching degree according to the frequency with which each candidate answer and the initial question occur jointly. Specifically, the frequency with which each candidate answer occurs jointly with the initial question can be counted on the basis of the preset knowledge base, and the frequency obtained is taken as the candidate matching degree between that candidate answer and the initial question, which yields at least one candidate matching degree. The first average value of the at least one candidate matching degree is then calculated.
In another embodiment, the preset calculation rule may include a rule that determines the candidate matching degree by scoring each candidate answer. Specifically, a multi-feature statistical machine learning method can be used to score each candidate answer with respect to the initial question, giving a score for each candidate answer; the score of each candidate answer is taken as the candidate matching degree between that candidate answer and the initial question, and the first average value of the at least one candidate matching degree is then calculated. When the multi-feature statistical machine learning method is used to score each candidate answer with respect to the initial question, the scoring can draw on multiple features such as sentence features, language features, lexical pattern features and redundancy features.
It should be noted that if the candidate answer set includes only one candidate answer, the first average value is equal to the candidate matching degree between that candidate answer and the initial question.
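The joint-frequency rule and the first average value can be sketched as follows. The source does not say how the frequency is normalized, so this sketch assumes a relative frequency over a hypothetical log of (question, answer) pairs drawn from the knowledge base; the names `joint_frequency`, `first_average` and `qa_log` are illustrative only.

```python
from statistics import mean

def joint_frequency(question, answer, qa_log):
    """Relative frequency with which `answer` was recorded together with `question`."""
    total = sum(1 for q, _ in qa_log if q == question)
    if total == 0:
        return 0.0
    joint = sum(1 for q, a in qa_log if q == question and a == answer)
    return joint / total

def first_average(question, candidates, qa_log):
    """Return the candidate matching degrees and their mean (the first average value)."""
    degrees = [joint_frequency(question, c, qa_log) for c in candidates]
    return degrees, mean(degrees)

# Hypothetical record of question/answer pairs from the knowledge base.
qa_log = [
    ("where is the forbidden city", "Beijing"),
    ("where is the forbidden city", "Beijing"),
    ("where is the forbidden city", "In the centre of Beijing"),
    ("where is the forbidden city", "Shanghai"),
]
degrees, avg = first_average("where is the forbidden city", ["Beijing", "Shanghai"], qa_log)
```

With a single candidate, `first_average` returns that candidate's matching degree as the average, which matches the note above.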
S104: determine, according to a preset determination rule and the first average value, the target answer to be output from the candidate answer set or the generated answer set.
After the first average value is obtained, the preset determination rule can be applied to the first average value to decide whether the target answer is determined from the candidate answer set or from the generated answer set. In one embodiment, the preset determination rule includes determining the target answer according to the size relationship between the first average value and a preset threshold. Correspondingly, a specific implementation of determining the target answer to be output from the candidate answer set or the generated answer set according to the preset determination rule and the first average value may be as follows. It is judged whether the first average value is greater than the preset threshold; the preset threshold here can be set according to actual business requirements, for example 0.5. If the first average value is greater than the preset threshold, the candidate answer with the highest candidate matching degree is selected from the candidate answer set as the target answer. If the first average value is not greater than the preset threshold, the generated answer with the highest generation matching degree is selected from the generated answer set as the target answer, where the generation matching degree is the matching degree between a generated answer in the generated answer set and the initial question.
Practice shows that sometimes only one candidate answer in the candidate answer set has a high matching degree with the initial question while the matching degrees of the other candidate answers are low. In that case the candidate answer with the high matching degree may have been retrieved by mistake because of a retrieval error in the retrieval model. In other words, the candidate answers in the candidate answer set are inaccurate in this situation, and directly outputting the candidate answer with the high matching degree as the target answer would reduce the accuracy of the target answer. The embodiment of the present invention therefore uses the first average value: deciding, according to the size relationship between the first average value and the preset threshold, whether the target answer is determined from the candidate answer set or from the generated answer set guards against mis-retrieval by the retrieval model to a certain extent and can thus improve the accuracy of the target answer.
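The threshold rule can be sketched in a few lines. The matching degrees are assumed to be already computed and passed in as dictionaries mapping each answer to its matching degree; the function name `select_by_threshold` and the example values are hypothetical.

```python
from statistics import mean

def select_by_threshold(candidate_degrees, generated_degrees, threshold=0.5):
    """Pick the target answer using the first average value and a preset threshold.

    Both arguments map an answer string to its matching degree with the
    initial question (a data layout assumed for this sketch).
    """
    first_avg = mean(list(candidate_degrees.values()))
    # Above the threshold: trust retrieval. Otherwise fall back to generation.
    pool = candidate_degrees if first_avg > threshold else generated_degrees
    return max(pool, key=pool.get)

# One retrieved answer scores high but the rest score low, so the first
# average value is low and the generated answer set is used instead.
suspicious = {"answer A": 0.9, "answer B": 0.1, "answer C": 0.05}
generated = {"generated 1": 0.6, "generated 2": 0.4}
target = select_by_threshold(suspicious, generated)
```

The example reproduces the mis-retrieval case described above: a rule that looked only at the maximum matching degree would output "answer A", whereas the average-based rule selects from the generated answers.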
In another embodiment, the preset determination rule includes determining the target answer according to the size relationship between the first average value and a second average value. Correspondingly, a specific implementation of determining the target answer to be output from the candidate answer set or the generated answer set according to the preset determination rule and the first average value may be as follows. The generation matching degree between each generated answer in the generated answer set and the initial question is calculated to obtain at least one generation matching degree, and the second average value of the at least one generation matching degree is calculated. If the first average value is greater than the second average value, the candidate answer with the highest candidate matching degree is selected from the candidate answer set as the target answer. If the first average value is less than the second average value, the generated answer with the highest generation matching degree is selected from the generated answer set as the target answer. If the first average value is equal to the second average value, either the candidate answer with the highest candidate matching degree is selected from the candidate answer set as the target answer, or the generated answer with the highest generation matching degree is selected from the generated answer set as the target answer.
It should be noted that if the generated answer set includes only one generated answer, the second average value is equal to the generation matching degree between that generated answer and the initial question. This embodiment thus decides, according to the size relationship between the first average value and the second average value, whether the target answer is determined from the candidate answer set or from the generated answer set. It avoids the loss of accuracy that an unreasonable preset threshold would cause and can further improve the accuracy of the target answer.
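The comparison of the two average values can be sketched in the same style. The source allows either set to be used when the averages are equal, so the tie-breaking flag `prefer_candidates_on_tie` is an assumption introduced here, as are the function name and the dictionary layout (answer mapped to matching degree).

```python
from statistics import mean

def select_by_averages(candidate_degrees, generated_degrees, prefer_candidates_on_tie=True):
    """Pick the target answer by comparing the first and second average values."""
    first_avg = mean(list(candidate_degrees.values()))    # candidate matching degrees
    second_avg = mean(list(generated_degrees.values()))   # generation matching degrees
    if first_avg > second_avg:
        pool = candidate_degrees
    elif first_avg < second_avg:
        pool = generated_degrees
    else:
        # Equal averages: the source permits either set; the flag decides.
        pool = candidate_degrees if prefer_candidates_on_tie else generated_degrees
    return max(pool, key=pool.get)

target = select_by_averages(
    {"answer A": 0.9, "answer B": 0.1},
    {"generated 1": 0.7, "generated 2": 0.6},
)
```

Unlike the threshold rule, this variant needs no hand-tuned constant, at the cost of having to compute a matching degree for every generated answer.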
In embodiments of the present invention, after the initial question input by the user is obtained, a preset retrieval model can be called to determine the candidate answer set corresponding to the initial question from a preset knowledge base, and a preset generation model can be called to determine the generated answer set corresponding to the initial question. The candidate matching degree between each candidate answer in the candidate answer set and the initial question is then calculated to obtain at least one candidate matching degree, and the first average value of the at least one candidate matching degree is calculated. Finally, the target answer to be output can be determined from the candidate answer set or the generated answer set according to the first average value. Because both the retrieval model and the generation model are called to determine the target answer, the long-tail problem in the target answer can be avoided and the consistency and reasonableness of the target answer can be ensured. In addition, deciding according to the first average value whether the target answer is taken from the candidate answer set or from the generated answer set guards against mis-retrieval by the retrieval model and thereby improves accuracy.
Refer to Fig. 2, which is a schematic flowchart of another answer determination method based on data processing provided by an embodiment of the present invention. The method can be performed by the terminal described above. As shown in Fig. 2, the method may include the following steps S201-S207:
S201: construct a first training dataset. The first training dataset includes at least one pair of chat question-and-answer corpora collected from at least one question answering system, and each pair of chat question-and-answer corpora includes a question and the corresponding standard answer.
Specifically, one or more question answering systems can be found on the Internet in advance, and one or more pairs of chat question-and-answer corpora can be obtained from them. Each pair is a real, actually existing chat question-and-answer corpus, that is, a corpus consisting of a question that a user once input into a question answering system and the standard answer that the question answering system output for that question. For example, if a user once input "where is the Forbidden City" into a question answering system and the system output the standard answer "Beijing" for this question, then "where is the Forbidden City" and "Beijing" can serve as a real, actually existing chat question-and-answer corpus.
S202: obtain a pre-built initial model, and train the initial model using the first training dataset to obtain an intermediate model.
Specifically, the pre-built initial model may include an encoder model and a decoder model, and when the initial model is built, a bidirectional GRU (Bi-GRU) model is selected as both the encoder model and the decoder model. The Bi-GRU model is a model that can recognize inverted sentence structures. When a user inputs an initial question, the question may have an inverted structure, that is, one that differs from the normal sentence structure; for example, the user inputs "go where today" whereas the normal sentence structure is "where to go today". A Bi-GRU model can recognize an initial question with an inverted sentence structure, which enriches the functions of the preset generation model and improves its robustness.
It should be noted that although the embodiment of the present invention selects the Bi-GRU model as both the encoder model and the decoder model, the architectures of the encoder model and the decoder model are not identical; that is, the model parameters of the encoder model differ from those of the decoder model. Because the model parameters of the encoder model and the decoder model differ, more model parameters need to be trained and updated in the subsequent training of the generation model. This improves the robustness and performance of the generation model, so that the generated answers determined by the trained preset generation model are closer to human language and more realistic.
When the initial model is trained using the first training dataset, the first training dataset can be input into the initial model. After the initial model receives the first training dataset, its encoder model encodes the question in each pair of chat question-and-answer corpora in the first training dataset into a feature vector, and its decoder model then decodes the feature vector to determine the corresponding answer for the question. It is then judged whether this corresponding answer is consistent with the standard answer for the question in the first training dataset. If they are inconsistent, the model parameters of the encoder model and the decoder model in the initial model are updated continually until the corresponding answer that the updated initial model determines for the question is consistent with the standard answer, at which point the updated initial model can be taken as the intermediate model. If they are consistent, the initial model can already accurately determine the corresponding answer for the question, and the initial model can be taken directly as the intermediate model.
S203, obtain a second training dataset according to the first training dataset, and optimize the intermediate model using the second training dataset to obtain the preset generation model.
After the intermediate model is obtained, the second training dataset can be obtained from the intermediate model and the first training dataset. Specifically, obtaining the second training dataset according to the first training dataset may be implemented as follows: for each question in the first training dataset, call the intermediate model in turn to determine the answer corresponding to that question, and take the answers so determined for all questions in the first training dataset as negative samples; obtain the standard answer corresponding to each question in the first training dataset, and take the standard answers of all questions in the first training dataset as positive samples; take the dataset containing the negative samples and the positive samples as the second training dataset. That is, the negative samples in the second training dataset are the answers generated by the intermediate model, and the positive samples are standard answers written in real human language; negative and positive samples are collectively referred to as samples. For example, suppose a question in the first training dataset is "What are the components of a computer?" and the intermediate model determines the corresponding answer to be "1. screen, 2. keyboard, 3. graphics card, 4. sound card, 5. hard disk"; the negative sample is then "1. screen, 2. keyboard, 3. graphics card, 4. sound card, 5. hard disk". If the standard answer for this question in the first training dataset is "A computer generally consists of components such as a screen, keyboard, graphics card, sound card and hard disk", then the positive sample is "A computer generally consists of components such as a screen, keyboard, graphics card, sound card and hard disk".
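The construction of the second training dataset described above can be sketched as follows. This is only a minimal illustration: the function name `build_second_dataset`, the `(question, answer, label)` tuple layout and the stand-in `toy_model` are assumptions made for the example and do not appear in the original text.

```python
from typing import Callable, List, Tuple

# Assumed layout: (question, answer, label); label 1 = positive sample, 0 = negative sample.
Sample = Tuple[str, str, int]


def build_second_dataset(
    first_dataset: List[Tuple[str, str]],
    intermediate_model: Callable[[str], str],
) -> List[Sample]:
    """Pair each question with a generated answer (negative) and its standard answer (positive)."""
    samples: List[Sample] = []
    for question, standard_answer in first_dataset:
        generated_answer = intermediate_model(question)  # answer produced by the intermediate model
        samples.append((question, generated_answer, 0))  # negative sample
        samples.append((question, standard_answer, 1))   # positive sample
    return samples


def toy_model(question: str) -> str:
    """Hypothetical stand-in for the intermediate model."""
    return "1. screen, 2. keyboard, 3. graphics card, 4. sound card, 5. hard disk"


first = [(
    "What are the components of a computer?",
    "A computer generally consists of components such as a screen, keyboard, "
    "graphics card, sound card and hard disk",
)]
second = build_second_dataset(first, toy_model)
```

Keeping the question alongside each sample is a design choice of this sketch; it lets a discriminator condition on the question if desired.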
After the second training dataset is obtained, the intermediate model can be optimized using it to obtain the preset generation model. Specifically, a discriminator model can be introduced during this optimization. The discriminator model can be a binary classification model whose goal is to determine whether each sample in the second training dataset is a positive sample or a negative sample. For any sample in the second training dataset, the discriminator model can be called to discriminate the sample and obtain the probability that it is a positive sample. During discrimination, the discriminator model also learns the differences between positive and negative samples and updates its own model parameters according to the learned differences, thereby improving its discriminating ability.
After the discriminator model is called to obtain the probability that the sample is a positive sample, this probability can be used as the reward function (rewards) of the intermediate model: the larger the value of rewards, the more likely the sample is a positive sample, that is, the closer the sample is to real human language. A reinforcement learning algorithm (the Policy Gradient algorithm) can then be used to calculate the value of the loss function of the intermediate model from rewards, and the model parameters of the intermediate model are optimized so as to reduce the calculated loss value. Next, the intermediate model with optimized parameters generates new answers for the questions in the second training dataset, and these new answers serve as new negative samples. The discriminator model is then called to discriminate the new negative samples and the positive samples and obtain new probabilities; during discrimination, the discriminator model learns the differences between the new negative samples and the positive samples and updates its model parameters again according to the newly learned differences.
After the discriminator model is called to obtain the new probabilities, the new probabilities are used as new rewards, the value of the loss function is calculated from the new rewards using the reinforcement learning algorithm, and the model parameters of the intermediate model are optimized again so as to reduce the calculated loss value. Through repeated adversarial learning between the discriminator model and the intermediate model, the two models tend toward a state of equilibrium: the answers that the intermediate model determines for the questions come arbitrarily close to the standard answers, that is, to real human language, and the discriminator model can no longer distinguish negative samples from positive samples. The intermediate model at equilibrium can be used as the preset generation model.
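The text names the Policy Gradient algorithm but does not give the loss formula. One common form, assumed here purely for illustration, is the REINFORCE-style loss, in which each generated answer's log-probability is weighted by the discriminator's reward. The function name and the toy numbers below are hypothetical.

```python
import math
from typing import List


def policy_gradient_loss(token_probs: List[List[float]], rewards: List[float]) -> float:
    """REINFORCE-style loss: mean over samples of -reward * log p(answer).

    token_probs[i] holds the probabilities the generator assigned to each token
    of its i-th generated answer; rewards[i] is the discriminator's probability
    that this answer is a positive sample.
    """
    if len(token_probs) != len(rewards) or not rewards:
        raise ValueError("need exactly one reward per generated answer")
    total = 0.0
    for probs, reward in zip(token_probs, rewards):
        log_prob = sum(math.log(p) for p in probs)  # log-probability of the whole answer
        total += -reward * log_prob
    return total / len(rewards)


# Two generated answers; the second one received the higher reward (0.9).
loss = policy_gradient_loss([[0.5, 0.5], [0.9, 0.8]], [0.2, 0.9])
```

Under this assumed form, reducing the loss increases the probability of answers that the discriminator rates as close to real human language, which matches the optimization direction described above.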
It can be seen that, after obtaining the intermediate model, the embodiment of the present invention does not directly use the intermediate model as the preset generation model. Instead, it introduces the idea of adversarial learning and optimizes the intermediate model using the second training dataset to obtain the preset generation model. Optimizing the intermediate model further improves the quality of the preset generation model, so that the generated answers determined by calling the preset generation model are closer to real human language and more realistic.
S204, obtain the initial question input by the user, and call a preset retrieval model to determine, from a preset knowledge base, the candidate answer set corresponding to the initial question. The preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer.
The terminal can determine the initial question by obtaining voice information or text information, and then call the preset retrieval model to determine the candidate answer set corresponding to the initial question. Specifically, calling the preset retrieval model to determine the candidate answer set corresponding to the initial question from the preset knowledge base may include the following steps s11-s13:
s11, for the initial question, call the preset retrieval model to perform query processing in the preset knowledge base to determine at least one target question, where at least one target word in the target question matches at least one initial word in the initial question.
Specifically, the initial question can first be split into one or more initial words, and each initial word is then converted into a word vector. Based on the word vector of each initial word, the preset retrieval model is called to find, in the preset knowledge base, the matching words that match that initial word. A target word is an initial word and/or a matching word. Each target word is mapped, by means of an inverted index, to the target questions that contain it, thereby determining the target questions. An inverted index, also called a reverse index, postings file or inverted file, is an indexing method used in full-text search to store the mapping from a word to its storage locations in a document or a group of documents. With an inverted index, the target questions containing a target word can be obtained quickly from the target word, which improves retrieval speed.
For example, suppose the initial question is "What are the components of a computer?". Splitting this initial question yields the initial words "computer" and "component". Each initial word is then converted into a word vector and, based on these word vectors, the preset retrieval model is called to find the matching words in the preset knowledge base, such as "tablet computer", "computer" and "part". The initial words and/or matching words serve as target words, so the target words may include one or more of "computer", "component", "tablet computer", "computer" and "part". Taking the target word "computer" as an example, the inverted index can be used to map this target word, where mapping means looking up the target questions that contain the target word. If the inverted index shows that the question "What is the structure of a computer?" contains the target word "computer", that question can be taken as a target question. As another example, taking the target words "computer" and "part", if the inverted index shows that the question "What parts does a computer have?" contains the target words "computer" and "part", that question can be taken as a target question.
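The inverted-index lookup in step s11 can be sketched as follows. This is a simplified illustration under stated assumptions: questions are tokenized by lowercasing and splitting on whitespace (the original text does not specify the word-splitting method), a question qualifies if it contains any one of the target words, and the names `build_inverted_index` and `find_target_questions` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_inverted_index(questions: List[str]) -> Dict[str, Set[int]]:
    """Map each word to the ids of the knowledge-base questions that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for question_id, question in enumerate(questions):
        # Assumed tokenization: lowercase, drop a trailing "?", split on whitespace.
        for word in question.lower().rstrip("?").split():
            index[word].add(question_id)
    return index


def find_target_questions(
    index: Dict[str, Set[int]], questions: List[str], target_words: List[str]
) -> List[str]:
    """Return every question that contains at least one target word."""
    matched_ids: Set[int] = set()
    for word in target_words:
        matched_ids |= index.get(word.lower(), set())
    return [questions[i] for i in sorted(matched_ids)]


knowledge_base_questions = [
    "What is the structure of a computer?",
    "What parts does a computer have?",
    "How is the weather today?",
]
index = build_inverted_index(knowledge_base_questions)
targets = find_target_questions(index, knowledge_base_questions, ["computer"])
```

Because the index is built once, each lookup touches only the posting sets of the target words rather than scanning every question, which is the retrieval speed-up the text refers to.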
s12, calculate the similarity between each of the at least one target question and the initial question, and determine the target question with the highest similarity.
Specifically, a similarity algorithm can first be used to calculate the similarity between each target question and the initial question, yielding multiple similarities. The similarity algorithm may include, but is not limited to, the BM25 algorithm, a Euclidean-distance similarity algorithm, a cosine similarity algorithm, a Pearson similarity algorithm, and so on. After the similarities are obtained, a ranking function can be used to sort them; the ranking function may include, but is not limited to, a Rank ranking function, a Sort ranking function, an Oracle ranking function, and so on.
The sorting can be in descending or ascending order of similarity. If the similarities are sorted in descending order, the target question with the smallest rank number is determined to be the target question with the highest similarity; if they are sorted in ascending order, the target question with the largest rank number is determined to be the target question with the highest similarity.
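Step s12 can be illustrated with one of the listed options, cosine similarity, followed by a descending sort. This is only a sketch: representing each question as a single vector, the toy vectors and the function names are assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_question(
    initial_vector: List[float], target_vectors: Dict[str, List[float]]
) -> Tuple[str, float]:
    """Sort target questions by similarity, highest first, and return the top one."""
    ranked = sorted(
        ((question, cosine_similarity(initial_vector, vector))
         for question, vector in target_vectors.items()),
        key=lambda item: item[1],
        reverse=True,  # descending order: rank 1 is the most similar question
    )
    return ranked[0]


# Hypothetical two-dimensional question vectors.
best = most_similar_question(
    [1.0, 0.0],
    {"q1": [1.0, 0.0], "q2": [0.0, 1.0], "q3": [1.0, 1.0]},
)
```

With a descending sort, the first element is the target question with the highest similarity, which corresponds to the "smallest rank number" case in the text.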
s13, obtain from the preset knowledge base at least one answer corresponding to the target question with the highest similarity, and determine the candidate answer set according to the at least one answer.
After the target question with the highest similarity has been determined, at least one answer corresponding to it can be obtained from the preset knowledge base, and the candidate answer set is determined according to the at least one answer. In one embodiment, every answer obtained for the target question with the highest similarity is used as a candidate answer; that is, the set of all answers obtained for that target question is determined to be the candidate answer set.
In another embodiment, after at least one answer corresponding to the target question with the highest similarity is obtained from the preset knowledge base, the matching degree between each answer and the initial question can be calculated. A preset number of answers is then chosen in descending order of matching degree to form the candidate answer set. The preset number can be determined according to actual business requirements. For example, if the preset number is 5, the top 5 answers in descending order of matching degree are used as candidate answers, and the set of these candidate answers is determined to be the candidate answer set.
For example, suppose 7 answers are obtained from the preset knowledge base for the target question with the highest similarity, with the following matching degrees: answer A (85%), answer B (25%), answer C (45%), answer D (75%), answer E (80%), answer F (70%) and answer G (60%). In descending order of matching degree: answer A (85%) > answer E (80%) > answer D (75%) > answer F (70%) > answer G (60%) > answer C (45%) > answer B (25%). With a preset number of 5, the top 5 answers are used as candidate answers, that is, "answer A", "answer E", "answer D", "answer F" and "answer G"; the candidate answer set is therefore "answer A, answer E, answer D, answer F and answer G".
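The worked example above amounts to a top-k selection by matching degree. A minimal sketch follows; the function name `top_k_answers` is hypothetical, and the matching degrees are taken from the example as fractions.

```python
from typing import Dict, List


def top_k_answers(match_degrees: Dict[str, float], k: int) -> List[str]:
    """Return the k answers with the highest matching degree, best first."""
    ranked = sorted(match_degrees, key=match_degrees.get, reverse=True)
    return ranked[:k]


# Matching degrees from the example in the text.
degrees = {"A": 0.85, "B": 0.25, "C": 0.45, "D": 0.75, "E": 0.80, "F": 0.70, "G": 0.60}
candidate_set = top_k_answers(degrees, 5)  # ["A", "E", "D", "F", "G"]
```

If fewer than k answers are available, the slice simply returns all of them, so the function also covers the case where the knowledge base holds only a few answers.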
S205, call a preset generation model to determine the generated answer set corresponding to the initial question. The generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using training datasets containing multiple questions.
Specifically, the preset generation model can be called to determine at least one generated answer corresponding to the initial question, and the matching degree between each generated answer and the initial question is calculated according to a preset calculation rule. A preset number of generated answers is then chosen in descending order of matching degree to form the generated answer set; the preset number can be determined according to actual business requirements. For example, if the preset number is 5, the top 5 generated answers in descending order of matching degree form the generated answer set. In other embodiments, the preset generation model can be called to determine at least one generated answer corresponding to the initial question, and the set of all generated answers determined by the preset generation model is used as the generated answer set.
S206, calculate, according to the preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and compute the first average value of the at least one candidate matching degree.
S207, determine, according to a preset determination rule and the first average value, the target answer to be output from the candidate answer set or the generated answer set.
It should be noted that, for steps S206-S207 in this embodiment of the present invention, reference may be made to steps S103-S104 in the foregoing embodiment; details are not repeated here.
In the embodiment of the present invention, after the initial question input by the user is obtained, the preset retrieval model can be called to determine the candidate answer set corresponding to the initial question from the preset knowledge base, and the preset generation model can be called to determine the generated answer set corresponding to the initial question. The candidate matching degree between each candidate answer in the candidate answer set and the initial question is then calculated to obtain at least one candidate matching degree, and the first average value of the at least one candidate matching degree is computed. Finally, the target answer to be output is determined from the candidate answer set or the generated answer set according to the first average value. Because the embodiment of the present invention calls both the retrieval model and the generation model to determine the target answer, it can avoid the long-tail problem in target answers and ensure that the target answer is consistent and reasonable. Moreover, deciding according to the first average value whether to take the target answer from the candidate answer set or from the generated answer set guards against retrieval errors by the retrieval model, thereby improving accuracy.
Referring to Fig. 3a-3b, which are application scenario diagrams of an answer determination method based on data processing provided by an embodiment of the present invention, the user can open a user interface for a chat conversation with the terminal, as shown in Fig. 3a. The user can then input an initial question in this user interface, as shown in Fig. 3b. After the terminal detects the user's input operation, it can obtain the initial question input by the user, call the preset retrieval model to determine the candidate answer set corresponding to the initial question from the preset knowledge base, and call the preset generation model to determine the generated answer set corresponding to the initial question. The candidate matching degree between each candidate answer in the candidate answer set and the initial question is then calculated according to the preset calculation rule to obtain at least one candidate matching degree, and the first average value of the at least one candidate matching degree is computed. Finally, the target answer to be output is determined from the candidate answer set or the generated answer set according to the preset determination rule and the first average value. After the target answer is determined, it can be output in the user interface to carry on the chat conversation with the user, as shown in Fig. 3b. Calling both the retrieval model and the generation model to determine the target answer can avoid the long-tail problem in target answers and ensure that the target answer is consistent and reasonable. Moreover, deciding according to the first average value whether to take the target answer from the candidate answer set or from the generated answer set guards against retrieval errors by the retrieval model, thereby improving accuracy.
Referring to Fig. 4, which is a schematic structural diagram of an answer determining device based on data processing provided by an embodiment of the present invention. As shown in Fig. 4, the device in the embodiment of the present invention may include:
Acquiring unit 101, configured to obtain the initial question input by the user and call a preset retrieval model to determine, from a preset knowledge base, the candidate answer set corresponding to the initial question, where the preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer;
The acquiring unit 101 is further configured to call a preset generation model to determine the generated answer set corresponding to the initial question, where the generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using training datasets containing multiple questions;
Computing unit 102, configured to calculate, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and to compute the first average value of the at least one candidate matching degree;
Determination unit 103, configured to determine, according to a preset determination rule and the first average value, the target answer to be output from the candidate answer set or the generated answer set.
In one embodiment, when determining, according to the preset determination rule and the first average value, the target answer to be output from the candidate answer set or the generated answer set, the determination unit 103 is specifically configured to:
Judge whether the first average value is greater than a preset threshold;
If the first average value is greater than the preset threshold, choose from the candidate answer set the candidate answer with the highest candidate matching degree as the target answer;
If the first average value is not greater than the preset threshold, choose from the generated answer set the generated answer with the highest generation matching degree as the target answer, where the generation matching degree is the matching degree between a generated answer in the generated answer set and the initial question.
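The threshold rule of this embodiment can be sketched as follows. The function name, the dictionary layout (answer text mapped to matching degree) and the sample values are assumptions made for illustration; the text does not state how the preset threshold is chosen.

```python
from typing import Dict


def choose_by_threshold(
    candidates: Dict[str, float], generated: Dict[str, float], threshold: float
) -> str:
    """Pick the best candidate answer if the first average exceeds the threshold,
    otherwise fall back to the best generated answer."""
    first_average = sum(candidates.values()) / len(candidates)
    if first_average > threshold:
        return max(candidates, key=candidates.get)  # highest candidate matching degree
    return max(generated, key=generated.get)        # highest generation matching degree


# Hypothetical matching degrees.
candidate_degrees = {"c1": 0.9, "c2": 0.7}   # first average is about 0.8
generated_degrees = {"g1": 0.6, "g2": 0.65}
answer_low_threshold = choose_by_threshold(candidate_degrees, generated_degrees, 0.75)
answer_high_threshold = choose_by_threshold(candidate_degrees, generated_degrees, 0.85)
```

A low average over the whole candidate set suggests that retrieval as a whole went wrong, which is why the rule looks at the average rather than only at the single best candidate.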
In another embodiment, when determining, according to the preset determination rule and the first average value, the target answer to be output from the candidate answer set or the generated answer set, the determination unit 103 is specifically configured to:
Calculate the generation matching degree between each generated answer in the generated answer set and the initial question to obtain at least one generation matching degree, and compute the second average value of the at least one generation matching degree;
If the first average value is greater than the second average value, choose from the candidate answer set the candidate answer with the highest candidate matching degree as the target answer;
If the first average value is less than the second average value, choose from the generated answer set the generated answer with the highest generation matching degree as the target answer;
If the first average value is equal to the second average value, choose from the candidate answer set the candidate answer with the highest candidate matching degree as the target answer, or choose from the generated answer set the generated answer with the highest generation matching degree as the target answer.
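The comparison of the two averages can be sketched in the same style. Since the text allows either set to be used when the averages are equal, the `prefer_candidates_on_tie` flag below is an assumption introduced for the example, as are the function name and the sample values.

```python
from typing import Dict


def choose_by_averages(
    candidates: Dict[str, float],
    generated: Dict[str, float],
    prefer_candidates_on_tie: bool = True,
) -> str:
    """Compare the first (candidate) and second (generated) averages and pick
    the best answer from the set with the higher average."""
    first_average = sum(candidates.values()) / len(candidates)
    second_average = sum(generated.values()) / len(generated)
    use_candidates = first_average > second_average or (
        first_average == second_average and prefer_candidates_on_tie
    )
    pool = candidates if use_candidates else generated
    return max(pool, key=pool.get)


# Hypothetical matching degrees: averages 0.4 (candidates) versus 0.8 (generated).
picked = choose_by_averages({"c1": 0.5, "c2": 0.3}, {"g1": 0.9, "g2": 0.7})
```

Unlike the threshold rule, this variant needs no tuned constant, at the cost of computing matching degrees for the generated answers as well.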
In another embodiment, when calling the preset retrieval model to determine the candidate answer set corresponding to the initial question from the preset knowledge base, the acquiring unit 101 is specifically configured to:
For the initial question, call the preset retrieval model to perform query processing in the preset knowledge base to determine at least one target question, where at least one target word in the target question matches at least one initial word in the initial question;
Calculate the similarity between each of the at least one target question and the initial question, and determine the target question with the highest similarity;
Obtain from the preset knowledge base at least one answer corresponding to the target question with the highest similarity, and determine the candidate answer set according to the at least one answer.
In another embodiment, when calling the preset generation model to determine the generated answer set corresponding to the initial question, the acquiring unit 101 is specifically configured to:
Call the preset generation model to determine at least one generated answer corresponding to the initial question, and calculate, according to the preset calculation rule, the matching degree between each generated answer and the initial question;
Choose a preset number of generated answers in descending order of matching degree to form the generated answer set.
In another embodiment, the acquiring unit 101 can also be configured to:
Construct a first training dataset, where the first training dataset includes at least one pair of chat question-and-answer corpora, the at least one pair of chat question-and-answer corpora is collected from at least one question answering system, and each pair includes one question and the corresponding standard answer;
Obtain a pre-built initial model, and train the initial model using the first training dataset to obtain an intermediate model;
Obtain a second training dataset according to the first training dataset, and optimize the intermediate model using the second training dataset to obtain the preset generation model.
In another embodiment, when obtaining the second training dataset according to the first training dataset, the acquiring unit 101 is specifically configured to:
For each question in the first training dataset, call the intermediate model in turn to determine the answer corresponding to that question, and take the answers so determined for all questions in the first training dataset as negative samples;
Obtain the standard answer corresponding to each question in the first training dataset, and take the standard answers of all questions in the first training dataset as positive samples;
Take the dataset containing the negative samples and the positive samples as the second training dataset.
In the embodiment of the present invention, after the initial question input by the user is obtained, the preset retrieval model can be called to determine the candidate answer set corresponding to the initial question from the preset knowledge base, and the preset generation model can be called to determine the generated answer set corresponding to the initial question. The candidate matching degree between each candidate answer in the candidate answer set and the initial question is then calculated to obtain at least one candidate matching degree, and the first average value of the at least one candidate matching degree is computed. Finally, the target answer to be output is determined from the candidate answer set or the generated answer set according to the first average value. Because the embodiment of the present invention calls both the retrieval model and the generation model to determine the target answer, it can avoid the long-tail problem in target answers and ensure that the target answer is consistent and reasonable. Moreover, deciding according to the first average value whether to take the target answer from the candidate answer set or from the generated answer set guards against retrieval errors by the retrieval model, thereby improving accuracy.
Based on the answer determination method and device based on data processing described above, an embodiment of the present invention further provides a terminal, which can be used to implement the above answer determination method based on data processing. Referring to Fig. 5, which is a schematic structural diagram of a terminal provided by an embodiment of the present invention. As shown in Fig. 5, the terminal includes an input device 201, an output device 202, a memory 203 and a processor 204. The input device 201, the output device 202 and the memory 203 can be connected to the processor 204. The input device 201 can be used to obtain the initial question input by the user and to send and receive messages, and can correspond to the acquiring unit 101 in the foregoing embodiment. The memory 203 can be used to store a computer program, and the computer program includes program instructions. In another embodiment, the input device 201, the output device 202, the memory 203 and the processor 204 can be interconnected through a bus.
Those of ordinary skill in the art will understand that all or part of the processes in the above method embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and includes at least one program instruction, which is loaded by the processor 204 and used to execute the following steps:
Obtain the initial question input by the user, and call a preset retrieval model to determine, from a preset knowledge base, the candidate answer set corresponding to the initial question, where the preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer;
Call a preset generation model to determine the generated answer set corresponding to the initial question, where the generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using training datasets containing multiple questions;
Calculate, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and compute the first average value of the at least one candidate matching degree;
Determine, according to a preset determination rule and the first average value, the target answer to be output from the candidate answer set or the generated answer set.
In one embodiment, when determining, according to the preset determination rule and the first average value, the target answer to be output from the candidate answer set or the generated answer set, the at least one program instruction can be loaded by the processor 204 and used to execute:
Judge whether the first average value is greater than a preset threshold;
If the first average value is greater than the preset threshold, choose from the candidate answer set the candidate answer with the highest candidate matching degree as the target answer;
If the first average value is not greater than the preset threshold, choose from the generated answer set the generated answer with the highest generation matching degree as the target answer, where the generation matching degree is the matching degree between a generated answer in the generated answer set and the initial question.
In another embodiment, when determining, according to the preset determination rule and the first average value, the target answer to be output from the candidate answer set or the generated answer set, the at least one program instruction can be loaded by the processor 204 and used to execute:
Calculate the generation matching degree between each generated answer in the generated answer set and the initial question to obtain at least one generation matching degree, and compute the second average value of the at least one generation matching degree;
If the first average value is greater than the second average value, choose from the candidate answer set the candidate answer with the highest candidate matching degree as the target answer;
If the first average value is less than the second average value, choose from the generated answer set the generated answer with the highest generation matching degree as the target answer;
If the first average value is equal to the second average value, choose from the candidate answer set the candidate answer with the highest candidate matching degree as the target answer, or choose from the generated answer set the generated answer with the highest generation matching degree as the target answer.
In another embodiment, when calling the preset retrieval model to determine the candidate answer set corresponding to the initial question from the preset knowledge base, the at least one program instruction can be loaded by the processor 204 and used to execute:
For the initial question, call the preset retrieval model to perform query processing in the preset knowledge base to determine at least one target question, where at least one target word in the target question matches at least one initial word in the initial question;
Calculate the similarity between each of the at least one target question and the initial question, and determine the target question with the highest similarity;
Obtain from the preset knowledge base at least one answer corresponding to the target question with the highest similarity, and determine the candidate answer set according to the at least one answer.
In another embodiment, when calling the preset generation model to determine the generated answer set corresponding to the initial question, the at least one program instruction can be loaded by the processor 204 and used to execute:
Call the preset generation model to determine at least one generated answer corresponding to the initial question, and calculate, according to the preset calculation rule, the matching degree between each generated answer and the initial question;
Choose a preset number of generated answers in descending order of matching degree to form the generated answer set.
In another embodiment, the at least one program instruction can also be loaded by the processor 204 and used to execute:
Construct a first training dataset, where the first training dataset includes at least one pair of chat question-and-answer corpora, the at least one pair of chat question-and-answer corpora is collected from at least one question answering system, and each pair includes one question and the corresponding standard answer;
Obtain a pre-built initial model, and train the initial model using the first training dataset to obtain an intermediate model;
Obtain a second training dataset according to the first training dataset, and optimize the intermediate model using the second training dataset to obtain the preset generation model.
In another embodiment, when obtaining the second training dataset according to the first training dataset, the at least one program instruction can be loaded by the processor 204 and used to execute:
For each question in the first training dataset, call the intermediate model in turn to determine the answer corresponding to that question, and take the answers so determined for all questions in the first training dataset as negative samples;
Obtain the standard answer corresponding to each question in the first training dataset, and take the standard answers of all questions in the first training dataset as positive samples;
Take the dataset containing the negative samples and the positive samples as the second training dataset.
In the embodiments of the present invention, after the initial question input by the user is obtained, the preset retrieval model may be called to determine the candidate answer set corresponding to the initial question from the preset knowledge base, and the preset generation model may be called to determine the generated answer set corresponding to the initial question. The candidate matching degree between each candidate answer in the candidate answer set and the initial question is then calculated to obtain at least one candidate matching degree, and the first average value of the at least one candidate matching degree is computed. Finally, the target answer to be output is determined from the candidate answer set or from the generated answer set according to the first average value. Because the embodiments of the present invention call both the retrieval model and the generation model to determine the target answer, the long-tail problem can be avoided and the consistency and reasonableness of the target answer can be ensured. In addition, using the first average value to decide whether the target answer is taken from the candidate answer set or from the generated answer set guards against false retrievals by the retrieval model, thereby improving accuracy.
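The decision based on the first average value can be sketched in two variants that mirror the determination rules recited in claims 2 and 3: comparison with a preset threshold, and comparison with a second average computed over the generated answers. The function names and the tuple layout are assumptions made for illustration, and the tie-break in the second variant is one of the two options the claim allows.

```python
from statistics import mean
from typing import List, Tuple

Scored = Tuple[str, float]  # (answer text, matching degree with the initial question)


def best_answer(scored: List[Scored]) -> str:
    """Return the answer with the highest matching degree."""
    return max(scored, key=lambda pair: pair[1])[0]


def choose_by_threshold(
    candidates: List[Scored], generated: List[Scored], preset_threshold: float
) -> str:
    """Trust retrieval only when its average matching degree exceeds the threshold."""
    first_average = mean(score for _, score in candidates)
    if first_average > preset_threshold:
        return best_answer(candidates)
    return best_answer(generated)


def choose_by_second_average(
    candidates: List[Scored], generated: List[Scored]
) -> str:
    """Compare the retrieval average with the generation average."""
    first_average = mean(score for _, score in candidates)
    second_average = mean(score for _, score in generated)
    # Assumption: on a tie the retrieved candidate is preferred; either choice is permitted.
    if first_average >= second_average:
        return best_answer(candidates)
    return best_answer(generated)
```

Both functions expect non-empty lists, which matches the requirement that each set contains at least one answer.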
An embodiment of the present invention further provides a computer storage medium that stores a computer program. The computer program includes at least one program instruction, which can be loaded by a processor and used to execute the answer determination method based on data processing described above.
The computer storage medium is a memory device for storing programs and data. It can be understood that the computer storage medium here may include a built-in storage medium of the server and may also include an extended storage medium supported by the server. In one embodiment, the computer storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is only some of the embodiments of the present application and certainly cannot be used to limit the scope of rights of the present application. Those skilled in the art can understand all or part of the processes for implementing the above embodiments, and equivalent changes made in accordance with the claims of the present application still fall within the scope covered by the application.
Claims (10)
1. An answer determination method based on data processing, characterized by comprising:
obtaining an initial question input by a user, and calling a preset retrieval model to determine, from a preset knowledge base, a candidate answer set corresponding to the initial question, wherein the preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer;
calling a preset generation model to determine a generated answer set corresponding to the initial question, wherein the generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using a plurality of training datasets containing questions;
calculating, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and computing a first average value of the at least one candidate matching degree;
determining, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set.
2. The method according to claim 1, characterized in that the determining, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set comprises:
judging whether the first average value is greater than a preset threshold;
if the first average value is greater than the preset threshold, selecting the candidate answer with the highest candidate matching degree from the candidate answer set as the target answer;
if the first average value is not greater than the preset threshold, selecting the generated answer with the highest generation matching degree from the generated answer set as the target answer, wherein the generation matching degree is the matching degree between a generated answer in the generated answer set and the initial question.
3. The method according to claim 1, characterized in that the determining, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set comprises:
calculating the generation matching degree between each generated answer in the generated answer set and the initial question to obtain at least one generation matching degree, and computing a second average value of the at least one generation matching degree;
if the first average value is greater than the second average value, selecting the candidate answer with the highest candidate matching degree from the candidate answer set as the target answer;
if the first average value is less than the second average value, selecting the generated answer with the highest generation matching degree from the generated answer set as the target answer;
if the first average value is equal to the second average value, selecting the candidate answer with the highest candidate matching degree from the candidate answer set as the target answer, or selecting the generated answer with the highest generation matching degree from the generated answer set as the target answer.
4. The method according to claim 1, characterized in that the calling a preset retrieval model to determine, from a preset knowledge base, a candidate answer set corresponding to the initial question comprises:
for the initial question, calling the preset retrieval model to perform query processing in the preset knowledge base to determine at least one target question, wherein at least one target word in the target question matches at least one initial word in the initial question;
calculating the similarity between each of the at least one target question and the initial question, and determining the target question with the highest similarity;
obtaining, from the preset knowledge base, at least one answer corresponding to the target question with the highest similarity, and determining the candidate answer set according to the at least one answer.
5. The method according to claim 1, characterized in that the calling a preset generation model to determine a generated answer set corresponding to the initial question comprises:
calling the preset generation model to determine at least one generated answer corresponding to the initial question, and calculating the matching degree between each generated answer and the initial question according to the preset calculation rule;
selecting a preset number of generated answers, in descending order of matching degree, to form the generated answer set.
6. The method according to claim 1, characterized in that the method further comprises:
constructing a first training dataset, wherein the first training dataset includes at least one pair of chat question-and-answer corpora, the at least one pair of chat question-and-answer corpora is collected from at least one question answering system, and each pair includes a question and a corresponding standard answer;
obtaining an initial model constructed in advance, and training the initial model using the first training dataset to obtain an intermediate model;
obtaining a second training dataset according to the first training dataset, and optimizing the intermediate model using the second training dataset to obtain the preset generation model.
7. The method according to claim 6, characterized in that the obtaining a second training dataset according to the first training dataset comprises:
for each question in the first training dataset, calling the intermediate model in turn to determine the answer corresponding to that question, and taking the answers determined for all questions in the first training dataset as negative samples;
obtaining the standard answer corresponding to each question in the first training dataset, and taking the standard answers corresponding to all questions in the first training dataset as positive samples;
taking the dataset that includes the negative samples and the positive samples as the second training dataset.
8. An answer determination apparatus based on data processing, characterized by comprising:
an acquiring unit, configured to obtain an initial question input by a user, and to call a preset retrieval model to determine, from a preset knowledge base, a candidate answer set corresponding to the initial question, wherein the preset knowledge base includes at least one question and one or more answers corresponding to each question, and the candidate answer set includes at least one candidate answer;
the acquiring unit being further configured to call a preset generation model to determine a generated answer set corresponding to the initial question, wherein the generated answer set includes at least one generated answer, and the preset generation model is obtained through model training and optimization using a plurality of training datasets containing questions;
a computing unit, configured to calculate, according to a preset calculation rule, the candidate matching degree between each candidate answer in the candidate answer set and the initial question to obtain at least one candidate matching degree, and to compute a first average value of the at least one candidate matching degree;
a determination unit, configured to determine, according to a preset determination rule and the first average value, a target answer to be output from the candidate answer set or the generated answer set.
9. A terminal, characterized by comprising an input device, an output device, a memory, and a processor, wherein the processor, the input device, the output device, and the memory are connected to one another, the memory is configured to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method according to any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method according to any one of claims 1-7.
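As an illustration of the retrieval steps recited in claim 4, the following sketch queries a small in-memory knowledge base. It is a simplified assumption rather than the patented retrieval model: the knowledge base is a plain dictionary from stored question to answers, word matching is done on whitespace tokens, and Jaccard overlap stands in for the unspecified similarity measure.

```python
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Hypothetical tokenizer: lowercase whitespace tokens."""
    return set(text.lower().split())


def retrieve_candidate_answers(
    question: str, knowledge_base: Dict[str, List[str]]
) -> List[str]:
    """Return the answers of the stored question most similar to the input."""
    query_words = tokenize(question)
    # Target questions share at least one word with the initial question.
    targets = [q for q in knowledge_base if tokenize(q) & query_words]
    if not targets:
        return []

    def similarity(stored: str) -> float:
        words = tokenize(stored)
        return len(words & query_words) / len(words | query_words)

    most_similar = max(targets, key=similarity)
    return list(knowledge_base[most_similar])
```

An empty result signals that no stored question shares a word with the input, in which case only generated answers would be available.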
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811364713.0A CN109492085B (en) | 2018-11-15 | 2018-11-15 | Answer determination method, device, terminal and storage medium based on data processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811364713.0A CN109492085B (en) | 2018-11-15 | 2018-11-15 | Answer determination method, device, terminal and storage medium based on data processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109492085A (en) | 2019-03-19 |
CN109492085B (en) | 2024-05-14 |
Family
ID=65695856
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811364713.0A Active CN109492085B (en) | 2018-11-15 | 2018-11-15 | Answer determination method, device, terminal and storage medium based on data processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109492085B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160247068A1 (en) * | 2013-11-01 | 2016-08-25 | Tencent Technology (Shenzhen) Company Limited | System and method for automatic question answering |
CN106682387A (en) * | 2016-10-26 | 2017-05-17 | 百度国际科技(深圳)有限公司 | Method and device used for outputting information |
CN107133303A (en) * | 2017-04-28 | 2017-09-05 | 百度在线网络技术(北京)有限公司 | Method and apparatus for output information |
CN108491433A (en) * | 2018-02-09 | 2018-09-04 | 平安科技(深圳)有限公司 | Chat answer method, electronic device and storage medium |
CN108509463A (en) * | 2017-02-28 | 2018-09-07 | 华为技术有限公司 | A kind of answer method and device of problem |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110414246A (en) * | 2019-06-19 | 2019-11-05 | 平安科技(深圳)有限公司 | Shared file method for managing security, device, terminal and storage medium |
CN110414246B (en) * | 2019-06-19 | 2023-05-30 | 平安科技(深圳)有限公司 | Shared file security management method, device, terminal and storage medium |
CN110489730A (en) * | 2019-08-14 | 2019-11-22 | 腾讯科技(深圳)有限公司 | Text handling method, device, terminal and storage medium |
CN110543552A (en) * | 2019-09-06 | 2019-12-06 | 网易(杭州)网络有限公司 | Conversation interaction method and device and electronic equipment |
CN110543552B (en) * | 2019-09-06 | 2022-06-07 | 网易(杭州)网络有限公司 | Conversation interaction method and device and electronic equipment |
CN111339275A (en) * | 2020-02-27 | 2020-06-26 | 深圳大学 | Method and device for matching answer information, server and storage medium |
CN111339275B (en) * | 2020-02-27 | 2023-05-12 | 深圳大学 | Answer information matching method, device, server and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109492085B (en) | 2024-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020177282A1 (en) | Machine dialogue method and apparatus, computer device, and storage medium | |
WO2021159632A1 (en) | Intelligent questioning and answering method and apparatus, computer device, and computer storage medium | |
CN109492085A (en) | Method, apparatus, terminal and storage medium are determined based on the answer of data processing | |
CN111737426B (en) | Method for training question-answering model, computer equipment and readable storage medium | |
CN109791549A (en) | Machine customer interaction towards dialogue | |
CN110795542A (en) | Dialogue method and related device and equipment | |
CN111078837A (en) | Intelligent question and answer information processing method, electronic equipment and computer readable storage medium | |
US11120214B2 (en) | Corpus generating method and apparatus, and human-machine interaction processing method and apparatus | |
CN111125328B (en) | Text processing method and related equipment | |
CN112685550B (en) | Intelligent question-answering method, intelligent question-answering device, intelligent question-answering server and computer readable storage medium | |
CN111737439B (en) | Question generation method and device | |
KR20210070904A (en) | Method and apparatus for multi-document question answering | |
CN111897938A (en) | Dialogue robot reply method combining RPA and AI, model training method and device | |
CN113342948A (en) | Intelligent question and answer method and device | |
CN111597821A (en) | Method and device for determining response probability | |
CN112307166B (en) | Intelligent question-answering method and device, storage medium and computer equipment | |
CN111506717B (en) | Question answering method, device, equipment and storage medium | |
CN113033912A (en) | Problem solving person recommendation method and device | |
CN111353290B (en) | Method and system for automatically responding to user inquiry | |
CN111460117A (en) | Dialog robot intention corpus generation method, device, medium and electronic equipment | |
CN116521832A (en) | Dialogue interaction method, device and system, electronic equipment and storage medium | |
CN110413750A (en) | The method and apparatus for recalling standard question sentence according to user's question sentence | |
US20220108071A1 (en) | Information processing device, information processing system, and non-transitory computer readable medium | |
CN116414940A (en) | Standard problem determining method and device and related equipment | |
CN114490974A (en) | Automatic information reply method, device, system, electronic equipment and readable medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |