CN108920654A - Method and apparatus for semantic matching of question-and-answer text - Google Patents

Method and apparatus for semantic matching of question-and-answer text


Publication number
CN108920654A
Authority
CN
China
Prior art keywords
candidate
customer issue
training
matching attribute
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810718708.9A
Other languages
Chinese (zh)
Other versions
CN108920654B (en)
Inventor
李渊
贺国秀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taikang Insurance Group Co Ltd, Taikang Online Property Insurance Co Ltd
Priority to CN201810718708.9A
Publication of CN108920654A
Application granted
Publication of CN108920654B
Legal status: Active
Anticipated expiration


Abstract

This application provides a method for semantic matching of question-and-answer text, including: receiving a customer question; obtaining, according to the customer question, at least two pieces of candidate information corresponding to the customer question, each piece of candidate information including a candidate answer and a candidate question; calculating, according to the candidate information, a first matching factor between the customer question and the candidate answer and a second matching factor between the customer question and the candidate question; calculating a matching value for each piece of candidate information according to its first matching factor and second matching factor; and selecting the candidate answer in the piece of candidate information with the largest matching value as the reply answer to the customer question. Combining the matching degrees between several factors of the candidate information and the customer question improves accuracy.

Description

Method and apparatus for semantic matching of question-and-answer text
Technical field
This application relates to the field of electronic devices and, more specifically, to a method and apparatus for semantic matching of question-and-answer text.
Background
An intelligent question answering system analyzes the true intent of a customer question and returns the correctly matching answer according to the matching degree of candidate question-answer pairs. Such a system mainly consists of customer question understanding, information retrieval and answer generation.
In the text semantic matching techniques of traditional question answering systems, the machine learning models mainly used require manual extraction of text features, which introduces subjective error and leaves the models with insufficient generalization ability. In practical engineering applications, staff must label large amounts of text data, mainly annotating the texts and extracting their feature information according to the annotators' working experience. This makes the quality of the constructed text features low and consumes a great deal of working time. Moreover, traditional question answering systems do not consider question-and-answer semantic matching information comprehensively: a matching value is calculated between the customer question and each candidate answer, and the best-matching candidate answer is used as the reply answer to the customer question. This approach considers only the matching value between the candidate answer and the customer question, so the semantic matching factor is single and accuracy is low.
Summary of the invention
In view of this, this application provides a method for semantic matching of question-and-answer text, which solves the prior-art problem that question answering systems have low accuracy because they rely on a single semantic matching factor.
To achieve the above object, this application provides the following technical solutions:
A method for semantic matching of question-and-answer text, including:
Receiving a customer question;
Obtaining, according to the customer question, at least two pieces of candidate information corresponding to the customer question, each piece of candidate information including a candidate answer and a candidate question;
Calculating, according to the candidate information, a first matching factor between the customer question and the candidate answer and a second matching factor between the customer question and the candidate question;
Calculating a matching value for each piece of candidate information according to the first matching factor and the second matching factor of that piece of candidate information;
Selecting the candidate answer in the piece of candidate information with the largest matching value as the reply answer to the customer question.
In the above method, preferably, calculating, according to the candidate information, the first matching factor between the customer question and the candidate answer and the second matching factor between the customer question and the candidate question includes:
Obtaining the feature vector sequences of the customer question, the candidate question and the candidate answer respectively, the feature vector sequences having contextual local features;
Obtaining a first feature vector similarity score matrix according to the feature vector sequence of the customer question and the feature vector sequence of the candidate answer, and determining the first matching factor between the customer question and the candidate answer according to the first feature vector similarity score matrix;
Obtaining a second feature vector similarity score matrix according to the feature vector sequence of the customer question and the feature vector sequence of the candidate question, and determining the second matching factor between the customer question and the candidate question according to the second feature vector similarity score matrix.
In the above method, preferably, obtaining the feature vector sequences of the customer question, the candidate question and the candidate answer respectively includes:
Performing word segmentation on the customer question, the candidate question and the candidate answer respectively according to a preset terminology dictionary and word segmentation rules, to obtain customer question words, candidate question words and candidate answer words;
Converting the customer question words, candidate question words and candidate answer words into word vectors respectively, to obtain a customer question word vector sequence, a candidate question word vector sequence and a candidate answer word vector sequence;
Using a preset bidirectional long short-term memory network (Bi-LSTM) to capture the contextual local features of the customer question word vector sequence, the candidate question word vector sequence and the candidate answer word vector sequence, to obtain the feature vector sequences of the customer question, the candidate question and the candidate answer respectively.
In the above method, preferably, obtaining the first feature vector similarity score matrix according to the feature vector sequence of the customer question and the feature vector sequence of the candidate answer includes:
Applying a preset text similarity formula to the customer question feature vector sequence and the candidate answer feature vector sequence to obtain the first feature vector similarity score matrix.
In the above method, preferably, determining the first matching factor between the customer question and the candidate answer according to the first feature vector similarity score matrix includes:
Using a preset screening algorithm to select, from the first feature vector similarity score matrix, a preset number of pieces of feature information that satisfy a preset importance condition, forming a text feature vector;
Performing a binary classification judgment of semantic matching on the text feature vector, and using the semantic matching probability corresponding to the judgment result as the first matching factor.
In the above method, preferably, calculating the matching value for each piece of candidate information according to the first matching factor and the second matching factor of that piece of candidate information includes:
Computing a weighted sum of the first matching factor and the second matching factor of any piece of candidate information to obtain the matching value of that piece of candidate information.
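The weighted summation can be sketched as follows. This is a minimal illustration only: the function name and the weights `w_answer` and `w_question` are assumptions, since the application does not state how the weights are chosen.

```python
def matching_value(first_factor, second_factor, w_answer=0.5, w_question=0.5):
    """Weighted sum of the two matching factors of one piece of candidate information.

    first_factor:  customer question vs. candidate answer
    second_factor: customer question vs. candidate question
    The default weights are hypothetical, not taken from the application.
    """
    return w_answer * first_factor + w_question * second_factor


# Example: the answer-side factor is weighted more heavily than the question-side one.
value = matching_value(0.8, 0.6, w_answer=0.7, w_question=0.3)
```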
In the above method, preferably, before receiving the customer question, the method further includes:
Presetting a deep learning model and a calculation rule for calculating the matching value of candidate information from the matching factors, the deep learning model being used to calculate the first matching factor between the customer question and the candidate answer and the second matching factor between the customer question and the candidate question;
Wherein the deep learning model is preset by training it, and the process of training the deep learning model specifically includes:
Obtaining at least two pieces of training candidate information corresponding to a training customer question, each piece of training candidate information including a training candidate question and a training candidate answer;
Obtaining, based on the deep learning model, the feature vector sequences of the training customer question and of the at least two pieces of training candidate information respectively;
Obtaining, based on the deep learning model, a first training feature vector similarity score matrix according to the feature vector sequence of the training customer question and the feature vector sequence of the training candidate answer;
Obtaining, based on the deep learning model, a second training feature vector similarity score matrix according to the feature vector sequence of the training customer question and the feature vector sequence of the training candidate question;
Using the preset screening algorithm of the deep learning model to select, from the first training feature vector similarity score matrix and the second training feature vector similarity score matrix, a preset number of pieces of feature information that satisfy a preset importance condition, forming a training text feature vector;
Using the classifier of the deep learning model to perform a binary classification judgment of semantic matching on the training text feature vector, training the parameters of the deep learning model on the resulting prediction by gradient descent, and outputting a training result;
When the training result satisfies a preset condition, recording the parameters of the deep learning model, so that the deep learning model determines the reply answer to a customer question based on those parameters.
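The gradient descent step can be illustrated on the classifier alone. The sketch below is a simplification under stated assumptions: it trains only a two-class softmax layer on fixed text feature vectors (the application trains the whole deep learning model), it uses a fixed number of epochs in place of the unspecified preset condition, and the toy data, learning rate and function names are hypothetical.

```python
import math


def softmax2(z0, z1):
    """Numerically stable softmax over two logits."""
    m = max(z0, z1)
    e0, e1 = math.exp(z0 - m), math.exp(z1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)


def train(samples, labels, lr=0.5, epochs=200):
    """Full-batch gradient descent; labels: 1 = match, 0 = no match."""
    dim = len(samples[0])
    w = [[0.0] * dim, [0.0] * dim]   # one weight vector per class
    b = [0.0, 0.0]
    losses = []
    n = len(samples)
    for _ in range(epochs):
        gw = [[0.0] * dim, [0.0] * dim]
        gb = [0.0, 0.0]
        loss = 0.0
        for x, y in zip(samples, labels):
            z = [sum(wi * xi for wi, xi in zip(w[c], x)) + b[c] for c in (0, 1)]
            p = softmax2(z[0], z[1])
            loss -= math.log(p[y])                      # cross-entropy
            for c in (0, 1):
                err = p[c] - (1.0 if c == y else 0.0)   # d(loss)/d(z_c)
                gb[c] += err
                for d in range(dim):
                    gw[c][d] += err * x[d]
        for c in (0, 1):
            b[c] -= lr * gb[c] / n
            for d in range(dim):
                w[c][d] -= lr * gw[c][d] / n
        losses.append(loss / n)
    return w, b, losses


# Toy text feature vectors: high scores for matching pairs, low for non-matching.
toy_x = [[0.9, 0.8], [0.8, 0.7], [0.1, 0.2], [0.2, 0.1]]
toy_y = [1, 1, 0, 0]
w, b, losses = train(toy_x, toy_y)
```

The recorded `losses` list plays the role of the training result here; a real stopping rule (for example a loss threshold) would replace the fixed epoch count.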
An apparatus for semantic matching of question-and-answer text, including:
A receiving module, for receiving a customer question;
An obtaining module, for obtaining, according to the customer question, at least two pieces of candidate information corresponding to the customer question, any piece of candidate information containing at least two factors;
A first computing module, for calculating, according to the candidate information, the first matching factor between the customer question and the candidate answer and the second matching factor between the customer question and the candidate question;
A second computing module, for calculating the matching value for each piece of candidate information according to the first matching factor and the second matching factor of that piece of candidate information;
A selecting module, for selecting the candidate answer in the piece of candidate information with the largest matching value as the reply answer to the customer question.
A computer-readable medium storing a computer program which, when executed by a processor, implements the method for semantic matching of question-and-answer text described in any of the above.
An electronic device, including:
One or more processors;
A storage device, for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for semantic matching of question-and-answer text described in any of the above.
As can be seen from the above technical solutions, compared with the prior art, this application provides a method for semantic matching of question-and-answer text, including: receiving a customer question; obtaining, according to the customer question, at least two pieces of candidate information corresponding to the customer question, each piece of candidate information including a candidate answer and a candidate question; calculating, according to the candidate information, a first matching factor between the customer question and the candidate answer and a second matching factor between the customer question and the candidate question; calculating a matching value for each piece of candidate information according to its first matching factor and second matching factor; and selecting the candidate answer in the piece of candidate information with the largest matching value as the reply answer to the customer question. With this method, candidate information composed of multiple factors is analyzed against the customer question to obtain multiple matching factors; the matching value between the corresponding candidate information and the customer question is calculated from those matching factors; and the candidate answer in the piece of candidate information with the largest matching value is selected as the reply answer to the customer question. Because the candidate information contains multiple factors and the matching degree between the candidate information and the customer question is analyzed over all of them, accuracy is improved.
Brief description of the drawings
In order to illustrate the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of this application, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of Embodiment 1 of the method for semantic matching of question-and-answer text provided by this application;
Fig. 2 is a flowchart of Embodiment 2 of the method for semantic matching of question-and-answer text provided by this application;
Fig. 3 is a flowchart of Embodiment 3 of the method for semantic matching of question-and-answer text provided by this application;
Fig. 4 is a flowchart of Embodiment 4 of the method for semantic matching of question-and-answer text provided by this application;
Fig. 5 is a schematic diagram of the processing of a customer question in a specific application scenario of the method for semantic matching of question-and-answer text provided by this application;
Fig. 6 is a structural schematic diagram of an embodiment of the apparatus for semantic matching of question-and-answer text provided by this application;
Fig. 7 is a structural schematic diagram of an embodiment of the electronic device provided by this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
Fig. 1 is a flowchart of Embodiment 1 of the method for semantic matching of question-and-answer text provided by this application. The method is applied to an electronic device that has a question-and-answer text semantic matching function, and includes the following steps:
Step S101: Receive a customer question;
The customer question is a question raised by a customer. Semantic matching is performed on the customer question to obtain, from several candidate answers, the answer that best matches it.
Step S102: Obtain, according to the customer question, at least two pieces of candidate information corresponding to the customer question;
Each piece of candidate information includes a candidate answer and a candidate question.
A database is provided in advance, in which a massive number of preset candidate questions and candidate answers form candidate pairs.
Specifically, the database is searched based on the customer question, and multiple pieces of candidate information corresponding to the customer question can be found. Each piece of candidate information includes a candidate pair composed of one candidate question and one candidate answer.
In a specific implementation, a rough search can be performed in the database based on the customer question to find multiple pieces of candidate information related to the customer question. In subsequent steps, the matching degree between each piece of candidate information and the customer question is calculated to determine the reply answer.
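The application does not say how the rough search is performed. As one hypothetical possibility, the sketch below ranks the candidate pairs of an in-memory database by word overlap (Jaccard similarity) between the customer question and each candidate question; the function name, the sample database and the whitespace tokenization are all assumptions made for illustration.

```python
def coarse_search(customer_question, database, top_n=2):
    """Return the top_n (candidate question, candidate answer) pairs whose
    candidate question shares the most words with the customer question."""
    query = set(customer_question.lower().split())

    def overlap(pair):
        words = set(pair[0].lower().split())
        union = query | words
        return len(query & words) / len(union) if union else 0.0

    return sorted(database, key=overlap, reverse=True)[:top_n]


# Hypothetical database of candidate pairs.
database = [
    ("how do i file a claim", "Submit the claim form online."),
    ("what does the policy cover", "It covers hospital costs."),
    ("how do i cancel a policy", "Call customer service."),
]
candidates = coarse_search("how do i file an insurance claim", database)
```

The result always contains at most `top_n` pairs, so `top_n` should be at least 2 to satisfy the "at least two pieces of candidate information" requirement.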
Step S103: Calculate, according to the candidate information, the first matching factor between the customer question and the candidate answer and the second matching factor between the customer question and the candidate question;
In a specific implementation, a preset algorithm can be applied to the obtained candidate answer and candidate question to calculate the first matching factor between the customer question and the candidate answer and the second matching factor between the customer question and the candidate question.
In a specific implementation, a matching factor may be a numerical value: the value of the first matching factor characterizes the matching degree between the customer question and the candidate answer, and the value of the second matching factor characterizes the matching degree between the candidate question and the customer question.
It should be noted that the process of calculating the matching factors is explained in detail in subsequent embodiments and is not detailed in this embodiment.
Step S104: Calculate the matching value for each piece of candidate information according to the first matching factor and the second matching factor of that piece of candidate information;
The matching factors characterize the matching degree between each of the multiple factors (candidate answer and candidate question) and the customer question. Because the matching factors of multiple factors are taken into account when calculating the matching degree between each piece of candidate information and the customer question, the accuracy of the semantic matching between the candidate information and the customer question is improved.
Step S105: Select the candidate answer in the piece of candidate information with the largest matching value as the reply answer to the customer question.
The matching values of the multiple pieces of candidate information are calculated; the larger the matching value of a piece of candidate information, the higher its matching degree with the customer question.
Therefore, among the matching values of the multiple pieces of candidate information, the candidate answer in the piece of candidate information with the largest matching value is selected as the reply answer to the customer question.
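The selection step amounts to an argmax over the matching values. A minimal sketch, assuming the candidates are held as (candidate question, candidate answer) pairs and the matching values have already been computed (the names and numbers are illustrative):

```python
def select_reply(candidates, values):
    """candidates: list of (candidate question, candidate answer) pairs;
    values: matching value of each candidate, in the same order.
    Returns the candidate answer with the largest matching value."""
    best = max(range(len(candidates)), key=lambda i: values[i])
    return candidates[best][1]


pairs = [("q1", "answer A"), ("q2", "answer B"), ("q3", "answer C")]
reply = select_reply(pairs, [0.41, 0.87, 0.63])
```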
It should be noted that the calculation of the matching value of a piece of candidate information takes into account each of the multiple factors in that candidate information, which improves the precision of the matching degree calculated between the candidate information and the customer question, so the reply answer finally determined is more accurate.
In summary, the method for semantic matching of question-and-answer text provided by this embodiment includes: receiving a customer question; obtaining, according to the customer question, at least two pieces of candidate information corresponding to the customer question, each piece of candidate information including a candidate answer and a candidate question; calculating, according to the candidate information, a first matching factor between the customer question and the candidate answer and a second matching factor between the customer question and the candidate question; calculating a matching value for each piece of candidate information according to its first matching factor and second matching factor; and selecting the candidate answer in the piece of candidate information with the largest matching value as the reply answer to the customer question. With this method, candidate information composed of multiple factors is analyzed against the customer question to obtain multiple matching factors; the matching value between the corresponding candidate information and the customer question is calculated from those matching factors; and the candidate answer in the piece of candidate information with the largest matching value is selected as the reply answer to the customer question. Because the candidate information contains multiple factors and the matching degree between the candidate information and the customer question is analyzed over all of them, accuracy is improved.
Fig. 2 is a flowchart of Embodiment 2 of the method for semantic matching of question-and-answer text provided by this application. The method includes the following steps:
Step S201: Receive a customer question;
Step S202: Obtain, according to the customer question, at least two pieces of candidate information corresponding to the customer question;
Steps S201-S202 are the same as steps S101-S102 in Embodiment 1 and are not repeated in this embodiment.
Step S203: Obtain the feature vector sequences of the customer question, the candidate question and the candidate answer respectively;
The feature vector sequences have contextual local features.
The electronic device processes the customer question, the candidate question and the candidate answer according to preset rules to obtain their feature vector sequences.
Specifically, step S203 includes:
Step S2031: Perform word segmentation on the customer question, the candidate question and the candidate answer respectively according to a preset terminology dictionary and word segmentation rules, to obtain customer question words, candidate question words and candidate answer words;
The terminology dictionary and the word segmentation rules are preset in the electronic device.
Specifically, the terminology dictionary contains a massive number of words from the relevant profession.
For example, if the terminology dictionary is an insurance terminology dictionary, it may include words exclusive to the insurance profession, such as "micro medical insurance".
It should be noted that, in a specific implementation, the terminology dictionary can be updated as needed, so that its words cover the vocabulary involved in the latest professional content.
The terminology dictionary is combined with the preset word segmentation rules to perform word segmentation on the customer question, the candidate question and the candidate answer respectively, obtaining customer question words, candidate question words and candidate answer words.
It should be noted that segmenting the candidate question and the candidate answer may yield the same or different numbers of words; this embodiment places no restriction on this.
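The word segmentation rule is not specified in the application. As an illustration only, the sketch below uses forward maximum matching, a common dictionary-based rule for Chinese text: at each position it takes the longest word found in the terminology dictionary and otherwise falls back to a single character. The function name, the `max_len` limit and the three-word dictionary are assumptions.

```python
def segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position take the longest
    dictionary word; fall back to a single character."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical insurance terminology dictionary and customer question.
terms = {"微医保", "怎么", "理赔"}
tokens = segment("微医保怎么理赔", terms)
```

Adding a new professional term to `terms` changes the segmentation without touching the code, which matches the idea of keeping the dictionary up to date.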
Step S2032: Convert the customer question words, candidate question words and candidate answer words into word vectors respectively, to obtain a customer question word vector sequence, a candidate question word vector sequence and a candidate answer word vector sequence;
The customer question words, candidate question words and candidate answer words are each converted into a word vector sequence, giving the corresponding sequences (customer question word vector sequence, candidate question word vector sequence, candidate answer word vector sequence).
In a specific implementation, the Embedding layer of Keras (a deep learning framework) can be used to perform the word vector conversion.
In a specific implementation, to simplify the computation on word vector sequences, a length threshold is imposed on each sequence: when there are too few words the sequence is padded with 0, and when the number of words exceeds the threshold only the threshold number of words is kept.
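The length limit can be sketched as follows on sequences of word indices, before they are passed to an embedding layer. Padding and truncating at the end of the sequence is an assumption; the application does not say on which side either is done, and the function name is hypothetical.

```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Fix a sequence to max_len: cut extra tokens, fill short ones with pad_id.
    Padding and truncation are applied at the end (an assumption)."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


short = pad_or_truncate([5, 9, 2], max_len=5)
long = pad_or_truncate([5, 9, 2, 7, 4, 8], max_len=5)
```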
Step S2033: Use a preset Bi-LSTM (bidirectional long short-term memory network) to capture the contextual local features of the customer question word vector sequence, the candidate question word vector sequence and the candidate answer word vector sequence, to obtain the feature vector sequences of the customer question, the candidate question and the candidate answer respectively.
The customer question word vector sequence, the candidate question word vector sequence and the candidate answer word vector sequence are each input into the Bi-LSTM for processing.
In a specific implementation, when a word vector sequence is input into the deep neural network to obtain a feature vector sequence, the word vector sequence is input into the Bi-LSTM. Inside the Bi-LSTM, the word vector sequence is first processed to obtain a reversed word vector sequence; the word vector sequence (forward order) and the reversed word vector sequence are then input into two LSTM (long short-term memory) networks respectively; the two LSTMs each output a vector sequence, and the two vector sequences are concatenated to obtain the feature vector sequence.
The formulas of the LSTM network are as follows:
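$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$
$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$
$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$
$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o)$
$h_t = o_t \tanh(c_t)$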
Here, $\sigma$ denotes the sigmoid activation function and $\tanh$ the hyperbolic tangent activation function; $x_t$ denotes the word embedding vector obtained at time t (the t-th word vector input); $i$, $f$, $o$ and $c$ are the activation vectors of the input gate, forget gate, output gate and cell unit respectively, each of the same length as the hidden layer vector $h$. The subscripts of the weight matrices and bias parameters have the obvious meaning: for example, $W_{xi}$ is the weight matrix between the input and the input gate, $W_{hi}$ the weight matrix between the hidden layer and the input gate, and $W_{ci}$ the weight matrix between the cell unit and the input gate; $b_i$ and $b_f$ are the bias parameters of the input gate and the forget gate. The subscript indicates the computation to which a parameter belongs.
Through the learning and training of the above LSTM, the input at time t can learn semantic information from the preceding and following moments. Because a bidirectional long short-term memory network (Bi-LSTM) is used, the input sequence is fed into two LSTM units in the forward and reverse directions, whose output vector sequences are $h_{fw}$ and $h_{bw}$. These are concatenated as $h_t = [h_{fw}, h_{bw}]$ to give the feature vector sequence, which carries contextual local features.
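The formulas and the forward/reverse concatenation can be made concrete with a small pure-Python sketch. Several points are assumptions rather than details from the application: the cell-to-gate weights are treated as diagonal (stored as vectors `wci`, `wcf`, `wco`), products between gate vectors are element-wise, the parameters are random rather than trained, and the reverse-direction outputs are flipped back so that position t holds both directions for the same word. In practice a framework layer would be used instead.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, U, h, b):
    """W x + U h + b for list-based matrices and vectors."""
    return [sum(wr[k] * x[k] for k in range(len(x)))
            + sum(ur[k] * h[k] for k in range(len(h))) + bi
            for wr, ur, bi in zip(W, U, b)]


def lstm_step(p, x, h_prev, c_prev):
    """One step of the LSTM formulas above (diagonal cell-to-gate weights)."""
    n = len(h_prev)
    zi = affine(p["Wxi"], x, p["Whi"], h_prev, p["bi"])
    zf = affine(p["Wxf"], x, p["Whf"], h_prev, p["bf"])
    zc = affine(p["Wxc"], x, p["Whc"], h_prev, p["bc"])
    zo = affine(p["Wxo"], x, p["Who"], h_prev, p["bo"])
    i = [sigmoid(zi[k] + p["wci"][k] * c_prev[k]) for k in range(n)]
    f = [sigmoid(zf[k] + p["wcf"][k] * c_prev[k]) for k in range(n)]
    c = [f[k] * c_prev[k] + i[k] * math.tanh(zc[k]) for k in range(n)]
    o = [sigmoid(zo[k] + p["wco"][k] * c[k]) for k in range(n)]
    h = [o[k] * math.tanh(c[k]) for k in range(n)]
    return h, c


def run_lstm(p, xs, n):
    """Run one LSTM over a sequence and collect the hidden vectors."""
    h, c, out = [0.0] * n, [0.0] * n, []
    for x in xs:
        h, c = lstm_step(p, x, h, c)
        out.append(h)
    return out


def init_params(dim_in, n, rng):
    """Random, untrained parameters (illustration only)."""
    def mat(cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(n)]
    p = {}
    for g in "ifco":
        p["Wx" + g] = mat(dim_in)
        p["Wh" + g] = mat(n)
        p["b" + g] = [0.0] * n
    for g in "ifo":
        p["wc" + g] = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    return p


def bi_lstm(xs, p_fw, p_bw, n):
    h_fw = run_lstm(p_fw, xs, n)
    h_bw = run_lstm(p_bw, xs[::-1], n)[::-1]        # re-align to input order
    return [fw + bw for fw, bw in zip(h_fw, h_bw)]  # h_t = [h_fw, h_bw]


rng = random.Random(0)
words = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [0.0, 0.1, 0.0]]
features = bi_lstm(words, init_params(3, 2, rng), init_params(3, 2, rng), n=2)
```

Each of the four input word vectors yields one feature vector of length 4, twice the hidden size, because the two directions are concatenated.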
Finally, the corresponding feature vector sequences are obtained for the customer question, the candidate answer and the candidate question respectively.
$S_{cq}$, $S_q$ and $S_a$ can be used to denote the feature vector sequences of the customer question, the candidate answer and the candidate question respectively.
Step S204: Obtain a first feature vector similarity score matrix according to the feature vector sequence of the customer question and the feature vector sequence of the candidate answer, and determine the first matching factor between the customer question and the candidate answer according to the first feature vector similarity score matrix;
In this step, the first matching factor, which characterizes the matching value between the customer question and the candidate answer, is calculated.
Specifically, obtaining the first feature vector similarity score matrix according to the feature vector sequence of the customer question and the feature vector sequence of the candidate answer includes: applying a preset text similarity formula to the customer question feature vector sequence and the candidate answer feature vector sequence to obtain the first feature vector similarity score matrix.
In a specific implementation, the text similarity formula can be the inner product formula, the cosine formula, and so on.
This embodiment uses the inner product formula as an example.
Let $s_{cq_i}$ and $s_{q_j}$ denote the i-th and j-th feature vectors of the customer question feature vector sequence $S_{cq}$ and the candidate answer feature vector sequence $S_q$ respectively. The pairwise similarities of the feature vectors are calculated in turn by the following formula:
$\mathrm{sim}_{cq_i q_j} = s_{cq_i} \cdot s_{q_j}$
where $\mathrm{sim}_{cq_i q_j}$ denotes the similarity of the feature vectors $s_{cq_i}$ and $s_{q_j}$.
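With the inner product as the similarity formula, the score matrix has one row per customer question feature vector and one column per candidate answer feature vector. A minimal sketch with made-up two-dimensional feature vectors (the function and variable names are illustrative):

```python
def similarity_matrix(seq_a, seq_b):
    """Entry [i][j] is the inner product of seq_a[i] and seq_b[j]."""
    return [[sum(x * y for x, y in zip(u, v)) for v in seq_b] for u in seq_a]


s_cq = [[1.0, 0.0], [0.5, 0.5]]               # customer question, 2 vectors
s_q = [[1.0, 1.0], [0.0, 2.0], [2.0, 0.0]]    # candidate answer, 3 vectors
sim = similarity_matrix(s_cq, s_q)
```

Swapping the inner product for cosine similarity would only require dividing each entry by the product of the two vector norms.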
Specifically, determining the first matching factor between the customer question and the candidate answer according to the first feature vector similarity score matrix includes:
Step S2041: Use a preset screening algorithm to select, from the first feature vector similarity score matrix, a preset number of pieces of feature information that satisfy a preset importance condition, forming a text feature vector;
k-max pooling (taking the k largest values in a set) can be used for the screening, which yields the k pieces of feature information with the largest values.
In a specific implementation, k can be a small number such as 10. The value of k is not limited to this, however; other positive integers can be used.
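A minimal sketch of this screening step, assuming the k largest scores are taken over the whole matrix and returned in descending order. Zero padding for matrices with fewer than k entries is an added assumption so that the text feature vector always has length k.

```python
def k_max_pooling(matrix, k):
    """Keep the k largest scores of the whole matrix, in descending order;
    pad with 0.0 when the matrix has fewer than k entries."""
    scores = sorted((v for row in matrix for v in row), reverse=True)[:k]
    return scores + [0.0] * (k - len(scores))


text_feature = k_max_pooling([[1.0, 0.0, 2.0], [1.0, 1.0, 1.0]], k=4)
```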
The text feature vector can represent the question-and-answer semantic match between the candidate answer and the customer question.
Step S2042: Perform a binary classification judgment of semantic matching on the text feature vector, and use the semantic matching probability corresponding to the judgment result as the first matching factor.
A softmax classifier can be used for binary classification training on question-and-answer text semantic matching, giving a prediction of whether the candidate answer matches the customer question, and the predicted probability value is output as the matching factor.
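The classification step can be sketched as a linear layer followed by a two-class softmax, returning the probability of the "match" class as the matching factor. The weights and biases below are hypothetical placeholders; in the described method they would be learned during training.

```python
import math


def match_probability(feature, weights, biases):
    """Two-class softmax over a linear layer; returns P(match).
    weights[0]/biases[0] score 'no match', weights[1]/biases[1] score 'match'."""
    logits = [sum(w * x for w, x in zip(row, feature)) + b
              for row, b in zip(weights, biases)]
    top = max(logits)                       # subtract the max for stability
    exps = [math.exp(z - top) for z in logits]
    return exps[1] / sum(exps)


# Hypothetical parameters chosen so that both classes score 0 for this input.
factor = match_probability([2.0, 1.0], [[0.0, 0.0], [1.0, 1.0]], [0.0, -3.0])
```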
Step S205: According to the feature vector sequence of the customer issue and the feature vector sequence of the candidate problem, obtain a second feature vector similarity score matrix, and determine the second matching attribute of the customer issue and the candidate problem according to the second feature vector similarity score matrix;
Step S205 is used to calculate the second matching attribute of the candidate problem and the customer issue.
The calculation process is similar to the process of calculating the first matching attribute of the customer issue and the candidate answers; refer to step S204.
In one embodiment, obtaining the second feature vector similarity score matrix according to the feature vector sequence of the customer issue and the feature vector sequence of the candidate problem includes: applying a preset text similarity calculation formula to the customer issue feature vector sequence and the candidate problem feature vector sequence to obtain the second feature vector similarity score matrix.
In one embodiment, determining the second matching attribute of the customer issue and the candidate problem according to the second feature vector similarity score matrix includes:
Using a preset filtering algorithm to screen, from the second feature vector similarity score matrix, a preset number of pieces of characteristic information that meet a preset importance condition, forming a text feature vector;
Performing a two-class judgment of semantic matching on the text feature vector, and taking the semantic matching probability corresponding to the judgment result as the second matching attribute.
Step S206: Calculate the matching value corresponding to each candidate information according to the first matching attribute and the second matching attribute of each candidate information;
Step S207: Select the candidate answers in the candidate information with the largest matching value as the reply answer to the customer issue.
Wherein, steps S206-207 are consistent with steps S104-105 in embodiment 1 and are not repeated in the present embodiment.
To sum up, in the question-and-answer text semantic matching method provided in this embodiment, calculating, according to the candidate information, the first matching attribute of the customer issue and the candidate answers and the second matching attribute of the customer issue and the candidate problem includes: obtaining the feature vector sequences of the customer issue, the candidate problem and the candidate answers respectively, the feature vector sequences having context local features; obtaining the first feature vector similarity score matrix according to the feature vector sequence of the customer issue and the feature vector sequence of the candidate answers, and determining the first matching attribute of the customer issue and the candidate answers according to the first feature vector similarity score matrix; obtaining the second feature vector similarity score matrix according to the feature vector sequence of the customer issue and the feature vector sequence of the candidate problem, and determining the second matching attribute of the customer issue and the candidate problem according to the second feature vector similarity score matrix. With this method, in the course of processing the term vector sequences of the customer issue, the candidate problem and the candidate answers to obtain the final feature vector sequences, the context local feature information of the text can be obtained and the key global feature information is selected, which benefits the understanding of the deep semantics of the question-and-answer text.
Figure 3 is a flow chart of method embodiment 3 of question-and-answer text semantic matching provided by the present application. The method includes the following steps:
Step S301:Receive customer issue;
Step S302: Obtain at least two pieces of candidate information corresponding to the customer issue according to the customer issue;
Step S303: According to the candidate information, calculate respectively the first matching attribute of the customer issue and the candidate answers, and the second matching attribute of the customer issue and the candidate problem;
Wherein, steps S301-303 are consistent with steps S101-103 in embodiment 1 and are not repeated in the present embodiment.
Step S304: Perform a weighted summation of the first matching attribute and the second matching attribute of any candidate information to obtain the matching value of that candidate information;
Wherein, when calculating the matching value of the candidate information, the weights of the first matching attribute and the second matching attribute are preset values. Specifically, the weights can be determined through training; a subsequent embodiment explains this as part of the deep learning model process, so it is not detailed in the present embodiment.
Specifically, the formula for calculating the matching value of the candidate information is as follows:
$\text{matching value} = \alpha p_1 + \beta p_2$
Wherein, $p_1$ and $p_2$ respectively indicate the first matching attribute and the second matching attribute, and α and β are respectively the weights of the first matching attribute and the second matching attribute.
In the subsequent step, the matching values of the pieces of candidate information can be compared, and the one with the largest value is determined to be the candidate information that best matches the customer issue.
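The weighted summation and the comparison of matching values can be sketched as follows. The function names, the tuple layout, and the example weights and probabilities are illustrative assumptions only.

```python
from typing import List, Tuple

# (candidate answers, first matching attribute p1, second matching attribute p2)
Scored = Tuple[str, float, float]


def matching_value(p1: float, p2: float, alpha: float, beta: float) -> float:
    """Weighted sum of the two matching attributes."""
    return alpha * p1 + beta * p2


def select_reply(candidates: List[Scored], alpha: float, beta: float) -> str:
    """Return the candidate answers whose matching value is largest."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    best = max(candidates, key=lambda c: matching_value(c[1], c[2], alpha, beta))
    return best[0]


scored = [("answer A", 0.9, 0.2), ("answer B", 0.6, 0.8)]
reply = select_reply(scored, alpha=0.5, beta=0.5)
```

With equal weights the second candidate wins because its two attributes are balanced; shifting all weight to the first attribute would favor the first candidate instead, which is why the choice of weights matters.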
Step S305: Select the candidate answers in the candidate information with the largest matching value as the reply answer to the customer issue.
Wherein, step S305 is consistent with step S105 in embodiment 1 and is not repeated in the present embodiment.
To sum up, in the question-and-answer text semantic matching method provided in this embodiment, calculating the matching value corresponding to each candidate information according to the first matching attribute and the second matching attribute of each candidate information includes: performing a weighted summation of the first matching attribute and the second matching attribute of any candidate information to obtain the matching value of that candidate information. Taking the weight of each matching attribute into account improves the accuracy of the semantic matching calculation.
Figure 4 is a flow chart of method embodiment 4 of question-and-answer text semantic matching provided by the present application. The method includes the following steps:
Step S401: Preset a deep learning model and a computation rule for calculating the matching value of the candidate information according to the first matching attribute and the second matching attribute;
Wherein, the deep learning model is used to calculate the first matching attribute of the customer issue and the candidate answers, and the second matching attribute of the customer issue and the candidate problem.
Wherein, the presetting of the deep learning model is realized by training the deep learning model; therefore, before question-and-answer text semantic matching is formally performed on a customer issue, the deep learning model is trained first.
The candidate information includes multiple factors (candidate problem, candidate answers). Correspondingly, in specific implementation, the electronic device is provided with a model corresponding to each factor; therefore, two deep learning models, corresponding to the candidate problem and the candidate answers respectively, need to be trained.
Wherein, the rule for calculating the matching value of the candidate information according to the matching attributes can use a weighted summation formula, in which the weight of each matching attribute can be set manually in advance.
The process of training the deep learning model specifically includes:
Step S01: Obtain at least two pieces of training candidate information corresponding to a training customer issue, each piece of training candidate information including a training candidate problem and training candidate answers;
Wherein, in the training process, the training candidate information and the training customer issue are paired one by one. In specific implementation, the pairing can be done manually, to ensure that the training candidate information paired with a training customer issue actually corresponds to it, which reduces interference.
Step S02: Based on the deep learning model, obtain the feature vector sequences of the training customer issue and of the at least two pieces of training candidate information respectively;
Wherein, during deep learning model training, a terminological dictionary can first be established, so that the training customer issue and the training candidate information are segmented in a domain-specific way according to the terminological dictionary and the word segmentation rules.
In specific implementation, the terminological dictionary can be established in advance according to a preset corpus.
Wherein, the corpus contains a massive amount of language material.
In specific implementation, the corpus can be configured according to the professional domain, and different corpora can be set for different professional domains.
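The source does not say which word segmentation rule is used. As one possible illustration, the sketch below applies forward maximum matching against a small terminological dictionary; the algorithm choice, the function name, and the insurance-related dictionary entries are all assumptions made for the example.

```python
from typing import List, Set


def segment(text: str, dictionary: Set[str]) -> List[str]:
    """Forward maximum matching: at each position take the longest dictionary
    entry that matches, falling back to a single character."""
    max_len = max([len(word) for word in dictionary] + [1])
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical domain terms: "auto insurance", "claim", "claim process".
terminology = {"车险", "理赔", "理赔流程"}
tokens = segment("车险理赔流程", terminology)
```

Because the longest entry is preferred, the domain term for "claim process" is kept as one token rather than being split, which is the kind of domain-aware segmentation a terminological dictionary is meant to support.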
Step S03: Based on the deep learning model, obtain the first training feature vector similarity score matrix according to the feature vector sequence of the training customer issue and the feature vector sequence of the training candidate answers;
Step S04: Based on the deep learning model, obtain the second training feature vector similarity score matrix according to the feature vector sequence of the training customer issue and the feature vector sequence of the training candidate problem;
Step S05: Based on the preset filtering algorithm of the deep learning model, screen, from the first training feature vector similarity score matrix and the second training feature vector similarity score matrix, a preset number of pieces of characteristic information that meet the preset importance condition, forming training text feature vectors;
Step S06: Based on the classifier of the deep learning model, perform a two-class judgment of semantic matching on the training text feature vector, train the parameters of the deep learning model on the obtained prediction results using the gradient descent method, and output a training result;
In the training process, the training candidate information and the training customer issue are paired one by one; after word segmentation, term vector conversion is performed, Bi-LSTM processing yields the feature vector sequences, and two feature vector similarity score matrices are then obtained from the two groups of feature vector sequences (training candidate problem and training customer issue; training candidate answers and training customer issue).
After the preset filtering algorithm screens, from the first training feature vector similarity score matrix and the second training feature vector similarity score matrix, a preset number of pieces of characteristic information that meet the preset importance condition and the training text feature vector is formed, the obtained text feature vector can be input into the deep learning model, specifically into a training layer of the model (such as a fully connected layer), so that a softmax classifier performs two-class training on it; the gradient descent method is then applied to the obtained prediction results (match, no match) to train the parameter values, which are the values taken by the parameters of the deep learning model.
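To make the gradient descent step concrete, the sketch below trains only a two-class softmax layer on toy text feature vectors with full-batch gradient descent and a cross-entropy loss. In the described model the gradients would also update the embedding and Bi-LSTM parameters; the loss function, learning rate, epoch count, and sample data here are all assumptions for illustration.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (text feature vector, label: 1 match, 0 no match)


def class_probs(x: List[float], w: List[List[float]], b: List[float]) -> List[float]:
    """Softmax over the two class logits."""
    logits = [sum(wi * xi for wi, xi in zip(w[c], x)) + b[c] for c in range(2)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return [e / sum(exps) for e in exps]


def train(samples: List[Sample], lr: float = 0.5, epochs: int = 300):
    dim = len(samples[0][0])
    w = [[0.0] * dim for _ in range(2)]
    b = [0.0, 0.0]
    n = len(samples)
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(2)]
        grad_b = [0.0, 0.0]
        for x, y in samples:
            probs = class_probs(x, w, b)
            for c in range(2):
                err = probs[c] - (1.0 if c == y else 0.0)  # dLoss/dlogit_c
                grad_b[c] += err
                for d in range(dim):
                    grad_w[c][d] += err * x[d]
        for c in range(2):  # gradient descent update on the averaged gradient
            b[c] -= lr * grad_b[c] / n
            for d in range(dim):
                w[c][d] -= lr * grad_w[c][d] / n
    return w, b


def predict_match(x: List[float], w: List[List[float]], b: List[float]) -> float:
    return class_probs(x, w, b)[1]


# Toy data: matching pairs have larger pooled similarity scores.
samples = [([2.0, 1.5], 1), ([1.8, 1.2], 1), ([0.3, 0.1], 0), ([0.2, 0.0], 0)]
w, b = train(samples)
```

After training, `predict_match` returns a high probability for feature vectors resembling the matching samples and a low one for vectors near zero.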
Wherein, in the process of training the parameters of the deep learning model, a training result is also output, and it is judged whether the training result meets a preset condition, the preset condition being the condition for stopping training.
In specific implementation, the training result is verified during training. Wherein, the training result can be expressed in numerical form.
Step S07: When the training result meets the preset condition, record the parameters of the deep learning model, so that the deep learning model determines the reply answer to a customer issue based on the parameters.
Wherein, the preset condition is the condition for stopping training: when the training result is at its best, training can be stopped, and the model has reached its optimal state.
Specifically, the training result being at its best means that the numerical training result no longer improves.
Correspondingly, the parameters are recorded when the training result of the deep learning model meets the preset condition.
Wherein, when the parameters of the deep learning model are the recorded parameters, text semantic matching can be performed on a received customer issue to obtain a reply answer with higher accuracy.
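The stopping condition "the training result no longer improves" can be sketched as a simple early-stopping rule. The sketch assumes that a higher numerical result is better and introduces a hypothetical `patience` parameter (the number of consecutive non-improving rounds tolerated); neither detail is stated in the source.

```python
from typing import Sequence


def epoch_to_record(results: Sequence[float], patience: int = 2) -> int:
    """Return the index of the training round whose parameters are recorded.

    Training stops once the result has failed to improve for `patience`
    consecutive rounds; the best round seen so far is the one recorded.
    """
    if not results:
        raise ValueError("at least one training result is required")
    best_epoch, waited = 0, 0
    for epoch in range(1, len(results)):
        if results[epoch] > results[best_epoch]:
            best_epoch, waited = epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch


# Made-up validation results per training round.
recorded = epoch_to_record([0.60, 0.70, 0.75, 0.74, 0.75, 0.90], patience=2)
```

In this example training stops after the fifth round, so the later value 0.90 is never reached; a larger `patience` trades longer training for a lower risk of stopping too early.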
Step S402:Receive customer issue;
Step S403: Obtain at least two pieces of candidate information corresponding to the customer issue according to the customer issue, each candidate information including candidate answers and a candidate problem;
Step S404: According to the candidate information, calculate respectively the first matching attribute of the customer issue and the candidate answers, and the second matching attribute of the customer issue and the candidate problem;
Step S405: Calculate the matching value corresponding to each candidate information according to the first matching attribute and the second matching attribute of each candidate information;
Step S406: Select the candidate answers in the candidate information with the largest matching value as the reply answer to the customer issue.
Wherein, steps S402-406 are consistent with steps S101-105 in embodiment 1 and are not repeated in the present embodiment.
To sum up, the question-and-answer text semantic matching method provided in this embodiment further includes: presetting a deep learning model and a computation rule for calculating the matching value of the candidate information according to the matching attributes, the deep learning model being used to calculate the first matching attribute of the customer issue and the candidate answers and the second matching attribute of the customer issue and the candidate problem. With this method, the deep learning model and the computation rule are configured in advance, which provides the basis for subsequently determining, from the candidate information, the reply answer to a customer issue.
Figure 5 is a schematic diagram of the process of handling a customer issue in a concrete application scenario.
Candidate problems and candidate answers are retrieved from a question-and-answer candidate set. A candidate problem and the customer issue form one group; after word segmentation, the resulting phrases are input into a deep learning model that includes an embedding layer, Bi-LSTM, a text similarity calculation formula, and k-MAX pooling. The candidate problem and the customer issue are each processed by the embedding layer and Bi-LSTM; the feature vector sequences that are output are input into the text similarity calculation formula for feature vector similarity calculation; the result is passed to k-MAX pooling, and after the classifier's prediction the matching attribute p1 is obtained. Correspondingly, the candidate answers and the customer issue form one group; after word segmentation, the phrases are input into another model and processed in the same way as the candidate problem and the customer issue, yielding the matching attribute p2. The matching attributes p1 and p2 are then weighted and summed to obtain the matching degree value between the customer issue and the group consisting of the candidate problem and the candidate answers.
It should be noted that when more than one pair of candidate problem and candidate answers is retrieved from the question-and-answer candidate set, the finally obtained matching degree values are compared, and the candidate answers in the combination with the largest matching degree value are determined to be the reply answer.
Corresponding to the embodiments of the question-and-answer text semantic matching method provided above, the present application also provides a device embodiment that applies the question-and-answer text semantic matching method.
Figure 6 is a structural schematic diagram of a question-and-answer text semantic matching device embodiment provided by the present application, including the following structure: a receiving module 601, an obtaining module 602, a first computing module 603, a second computing module 604 and a selecting module 605;
Wherein, the receiving module 601 is used to receive a customer issue;
Wherein, the obtaining module 602 is used to obtain at least two pieces of candidate information corresponding to the customer issue according to the customer issue, any candidate information containing at least two factors;
Wherein, the first computing module 603 is used to calculate respectively, according to the candidate information, the first matching attribute of the customer issue and the candidate answers and the second matching attribute of the customer issue and the candidate problem;
In specific implementation, the first computing module 603 is provided with a deep learning model for calculating the first matching attribute of the customer issue and the candidate answers and the second matching attribute of the customer issue and the candidate problem.
Specifically, the first computing module 603 is used to obtain the feature vector sequences of the customer issue, the candidate problem and the candidate answers respectively, the feature vector sequences having context local features;
According to the feature vector sequence of the customer issue and the feature vector sequence of the candidate answers, obtain the first feature vector similarity score matrix, and determine the first matching attribute of the customer issue and the candidate answers according to the first feature vector similarity score matrix;
According to the feature vector sequence of the customer issue and the feature vector sequence of the candidate problem, obtain the second feature vector similarity score matrix, and determine the second matching attribute of the customer issue and the candidate problem according to the second feature vector similarity score matrix.
Specifically, the first computing module 603 obtaining the feature vector sequences of the customer issue, the candidate problem and the candidate answers respectively includes:
According to a preset terminological dictionary and word segmentation rules, performing word segmentation on the customer issue, the candidate problem and the candidate answers respectively, to obtain customer issue phrases, candidate problem phrases and candidate answers phrases;
Performing term vector conversion on the customer issue phrases, the candidate problem phrases and the candidate answers phrases respectively, to obtain a customer issue term vector sequence, a candidate problem term vector sequence and a candidate answers term vector sequence;
Using a preset bidirectional long short-term memory network (Bi-LSTM) to capture the context local features of the customer issue term vector sequence, the candidate problem term vector sequence and the candidate answers term vector sequence, to obtain the feature vector sequences of the customer issue, the candidate problem and the candidate answers respectively.
Specifically, the first computing module 603 obtaining the first feature vector similarity score matrix according to the feature vector sequence of the customer issue and the feature vector sequence of the candidate answers includes:
Applying a preset text similarity calculation formula to the customer issue feature vector sequence and the candidate answers feature vector sequence to obtain the first feature vector similarity score matrix.
Specifically, the first computing module 603 determining the first matching attribute of the customer issue and the candidate answers according to the first feature vector similarity score matrix includes:
Using a preset filtering algorithm to screen, from the first feature vector similarity score matrix, a preset number of pieces of characteristic information that meet a preset importance condition, forming a text feature vector;
Performing a two-class judgment of semantic matching on the text feature vector, and taking the semantic matching probability corresponding to the judgment result as the first matching attribute.
Wherein, the second computing module 604 is used to calculate the matching value corresponding to each candidate information according to the first matching attribute and the second matching attribute of each candidate information;
Specifically, the second computing module 604 is used to perform a weighted summation of the first matching attribute and the second matching attribute of any candidate information to obtain the matching value of that candidate information.
Wherein, the selecting module 605 is used to select the candidate answers in the candidate information with the largest matching value as the reply answer to the customer issue.
In specific implementation, the first computing module can use a deep learning model. The deep learning model has components such as an embedding layer, Bi-LSTM, a text similarity calculation formula, and k-MAX pooling.
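The division into modules 601 to 605 can be illustrated with a small Python class in which each module is a pluggable function. This is only a structural sketch: the class and field names are hypothetical, and a simple word-overlap score stands in for the deep learning models so that the example is self-contained.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Candidate = Tuple[str, str]  # (candidate problem, candidate answers)


def word_overlap(a: str, b: str) -> float:
    """Stand-in scorer (Jaccard word overlap), NOT the deep learning model."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


@dataclass
class MatchingDevice:
    retrieve: Callable[[str], List[Candidate]]     # obtaining module 602
    score_answer: Callable[[str, str], float]      # module 603, first attribute
    score_problem: Callable[[str, str], float]     # module 603, second attribute
    alpha: float = 0.5
    beta: float = 0.5

    def reply(self, customer_issue: str) -> str:   # input from receiving module 601
        candidates = self.retrieve(customer_issue)
        if len(candidates) < 2:
            raise ValueError("at least two pieces of candidate information are required")

        def value(c: Candidate) -> float:          # second computing module 604
            problem, answers = c
            return (self.alpha * self.score_answer(customer_issue, answers)
                    + self.beta * self.score_problem(customer_issue, problem))

        return max(candidates, key=value)[1]       # selecting module 605


# Made-up question-and-answer candidate set.
faq = [("how do I file a claim", "Submit the claim form online."),
       ("how do I cancel my policy", "Call support to cancel.")]
device = MatchingDevice(lambda issue: faq, word_overlap, word_overlap)
answer = device.reply("how do I file a claim online")
```

Passing the scorers in as functions keeps the module boundaries visible and would allow the stand-in scorer to be replaced by trained models without changing the selection logic.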
In one embodiment of the present application, based on the foregoing scheme, the receiving module 601 is configured as a communication interface for receiving the customer issue from other structures connected to the device; alternatively, the receiving module 601 may be configured as a mouse, a keyboard, a touch device, or another structure that can be used to input content.
Since each functional module of the question-and-answer text semantic matching device of the example embodiment of the present application corresponds to a step of the example embodiment of the question-and-answer text semantic matching method described above, for details not disclosed in the device embodiment, please refer to the embodiments of the question-and-answer text semantic matching method of the present application.
In specific implementation, the question-and-answer text semantic matching device includes a processor and a memory. The receiving module 601, the obtaining module 602, the first computing module 603, the second computing module 604, the selecting module 605 and so on are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.
The processor includes a kernel, and the kernel retrieves the corresponding program unit from the memory. One or more kernels can be provided, and task scheduling is realized by adjusting kernel parameters.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM) and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
To sum up, in the question-and-answer text semantic matching device provided in this embodiment, candidate information composed of multiple factors is analyzed against the customer issue to obtain multiple matching attributes; the matching value between the candidate information corresponding to the matching attributes and the customer issue is calculated based on the multiple matching attributes; and the candidate answers in the candidate information with the largest matching value are selected as the correct answer corresponding to the customer issue. In this scheme, the candidate information includes multiple factors, and the matching degree between the candidate information and the customer issue is analyzed in combination with the multiple factors, which improves accuracy.
The embodiment of the present application provides a storage medium on which a program is stored; when the program is executed by a processor, the question-and-answer text semantic matching method is implemented.
The embodiment of the present application provides a processor for running a program, wherein the question-and-answer text semantic matching method is executed when the program runs.
Figure 7 is a structural schematic diagram of an electronic device embodiment provided by the present application, including the following structure: a processor 701 and a memory 702;
Wherein, the electronic device includes one or more processors;
A storage device for storing one or more programs, the programs being runnable on the processor.
Wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the question-and-answer text semantic matching method of method embodiments 1-4.
In the present application, the electronic device can be a server, a PC (personal computer), a PAD (tablet computer), a mobile phone, etc.
Specifically, the processor implements the following steps when executing the program:
Receiving a customer issue;
Obtaining at least two pieces of candidate information corresponding to the customer issue according to the customer issue, each candidate information including candidate answers and a candidate problem;
According to the candidate information, calculating respectively the first matching attribute of the customer issue and the candidate answers, and the second matching attribute of the customer issue and the candidate problem;
Calculating the matching value corresponding to each candidate information according to the first matching attribute and the second matching attribute of each candidate information;
Selecting the candidate answers in the candidate information with the largest matching value as the reply answer to the customer issue.
Preferably, calculating respectively, according to the candidate information, the first matching attribute of the customer issue and the candidate answers and the second matching attribute of the customer issue and the candidate problem includes:
Obtaining the feature vector sequences of the customer issue, the candidate problem and the candidate answers respectively, the feature vector sequences having context local features;
According to the feature vector sequence of the customer issue and the feature vector sequence of the candidate answers, obtaining the first feature vector similarity score matrix, and determining the first matching attribute of the customer issue and the candidate answers according to the first feature vector similarity score matrix;
According to the feature vector sequence of the customer issue and the feature vector sequence of the candidate problem, obtaining the second feature vector similarity score matrix, and determining the second matching attribute of the customer issue and the candidate problem according to the second feature vector similarity score matrix.
Preferably, obtaining the feature vector sequences of the customer issue, the candidate problem and the candidate answers respectively includes:
According to a preset terminological dictionary and word segmentation rules, performing word segmentation on the customer issue, the candidate problem and the candidate answers respectively, to obtain customer issue phrases, candidate problem phrases and candidate answers phrases;
Performing term vector conversion on the customer issue phrases, the candidate problem phrases and the candidate answers phrases respectively, to obtain a customer issue term vector sequence, a candidate problem term vector sequence and a candidate answers term vector sequence;
Using a preset bidirectional long short-term memory network (Bi-LSTM) to capture the context local features of the customer issue term vector sequence, the candidate problem term vector sequence and the candidate answers term vector sequence, to obtain the feature vector sequences of the customer issue, the candidate problem and the candidate answers respectively.
Preferably, obtaining the first feature vector similarity score matrix according to the feature vector sequence of the customer issue and the feature vector sequence of the candidate answers includes:
Applying a preset text similarity calculation formula to the customer issue feature vector sequence and the candidate answers feature vector sequence to obtain the first feature vector similarity score matrix.
Preferably, determining the first matching attribute of the customer issue and the candidate answers according to the first feature vector similarity score matrix includes:
Using a preset filtering algorithm to screen, from the first feature vector similarity score matrix, a preset number of pieces of characteristic information that meet a preset importance condition, forming a text feature vector;
Performing a two-class judgment of semantic matching on the text feature vector, and taking the semantic matching probability corresponding to the judgment result as the first matching attribute.
Preferably, calculating the matching value corresponding to each candidate information according to the first matching attribute and the second matching attribute of each candidate information includes:
Performing a weighted summation of the first matching attribute and the second matching attribute of any candidate information to obtain the matching value of that candidate information.
Preferably, before receiving the customer issue, the steps further include:
Presetting a deep learning model and a computation rule for calculating the matching value of the candidate information according to the matching attributes, the deep learning model being used to calculate the first matching attribute of the customer issue and the candidate answers and the second matching attribute of the customer issue and the candidate problem;
Wherein, the presetting of the deep learning model is realized by training the deep learning model, and the process of training the deep learning model specifically includes:
Obtaining at least two pieces of training candidate information corresponding to a training customer issue, each piece of training candidate information including a training candidate problem and training candidate answers;
Based on the deep learning model, obtaining the feature vector sequences of the training customer issue and of the at least two pieces of training candidate information respectively;
Based on the deep learning model, obtaining the first training feature vector similarity score matrix according to the feature vector sequence of the training customer issue and the feature vector sequence of the training candidate answers;
Based on the deep learning model, obtaining the second training feature vector similarity score matrix according to the feature vector sequence of the training customer issue and the feature vector sequence of the training candidate problem;
Based on the preset filtering algorithm of the deep learning model, screening, from the first training feature vector similarity score matrix and the second training feature vector similarity score matrix, a preset number of pieces of characteristic information that meet the preset importance condition, forming training text feature vectors;
Based on the classifier of the deep learning model, performing a two-class judgment of semantic matching on the training text feature vector, training the parameters of the deep learning model on the obtained prediction results using the gradient descent method, and outputting a training result;
When the training result meets the preset condition, recording the parameters of the deep learning model, so that the deep learning model determines the reply answer to a customer issue based on the parameters.
The present application also provides a computer-readable medium on which a computer program is stored; when executed on a data processing device, it is adapted to execute a program initialized with the following method steps:
Receive a customer question;
Obtain, according to the customer question, at least two pieces of candidate information corresponding to the customer question, each piece of candidate information including a candidate answer and a candidate question;
According to the candidate information, separately calculate a first matching attribute between the customer question and the candidate answer and a second matching attribute between the customer question and the candidate question;
Calculate the matching value of each piece of candidate information from the first matching attribute and the second matching attribute of that candidate information;
Select the candidate answer in the candidate information with the largest matching value as the reply answer to the customer question.
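The five steps above can be sketched as a small selection routine. This is a minimal illustration under stated assumptions, not the applicant's implementation: `Candidate`, `token_overlap`, `select_reply`, and the default `combine` function are hypothetical names, and the token-overlap scorer is only a toy stand-in for the deep learning model that the application uses to compute the two matching attributes.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Candidate:
    question: str  # candidate question stored with the answer
    answer: str    # candidate answer


def token_overlap(a: str, b: str) -> float:
    """Toy scorer (assumption): Jaccard overlap of whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_reply(
    customer_question: str,
    candidates: Sequence[Candidate],
    score_answer: Callable[[str, str], float] = token_overlap,
    score_question: Callable[[str, str], float] = token_overlap,
    combine: Callable[[float, float], float] = lambda first, second: first + second,
) -> str:
    """Return the answer of the candidate with the largest matching value."""
    if len(candidates) < 2:
        raise ValueError("at least two pieces of candidate information are required")

    def matching_value(candidate: Candidate) -> float:
        # First matching attribute: customer question vs. candidate answer.
        first = score_answer(customer_question, candidate.answer)
        # Second matching attribute: customer question vs. candidate question.
        second = score_question(customer_question, candidate.question)
        return combine(first, second)

    return max(candidates, key=matching_value).answer
```

Scoring against both the stored question and the stored answer means a candidate whose answer shares little wording with the incoming question can still be selected when its stored question is a close paraphrase.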
Preferably, separately calculating, according to the candidate information, the first matching attribute between the customer question and the candidate answer and the second matching attribute between the customer question and the candidate question includes:
Obtain the feature vector sequences of the customer question, the candidate question, and the candidate answer respectively, the feature vector sequences carrying local context features;
Obtain a first feature vector similarity score matrix from the feature vector sequence of the customer question and the feature vector sequence of the candidate answer, and determine the first matching attribute between the customer question and the candidate answer from the first feature vector similarity score matrix;
Obtain a second feature vector similarity score matrix from the feature vector sequence of the customer question and the feature vector sequence of the candidate question, and determine the second matching attribute between the customer question and the candidate question from the second feature vector similarity score matrix.
Preferably, obtaining the feature vector sequences of the customer question, the candidate question, and the candidate answer respectively includes:
Perform word segmentation on the customer question, the candidate question, and the candidate answer respectively according to a preset term dictionary and word segmentation rules, obtaining customer question word groups, candidate question word groups, and candidate answer word groups;
Convert the customer question word groups, candidate question word groups, and candidate answer word groups into word vectors respectively, obtaining a customer question word vector sequence, a candidate question word vector sequence, and a candidate answer word vector sequence;
Use a preset bidirectional long short-term memory (BiLSTM) network to capture the local context features of the customer question word vector sequence, the candidate question word vector sequence, and the candidate answer word vector sequence, obtaining the feature vector sequences of the customer question, the candidate question, and the candidate answer respectively.
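The first two steps above (dictionary-based word segmentation and word vector conversion) can be sketched as follows. The text does not name a segmentation algorithm, so forward maximum matching is used here purely as an assumption; `segment`, `to_vectors`, the sample dictionary, and the zero vector for unknown words are all hypothetical. The bidirectional LSTM step is not shown; in practice it would be a layer from a deep learning framework applied to the resulting vector sequence.

```python
from typing import Dict, List, Sequence, Set


def segment(text: str, dictionary: Set[str], max_len: int = 4) -> List[str]:
    """Forward maximum matching: at each position take the longest dictionary
    term, falling back to a single character when nothing matches."""
    words: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def to_vectors(
    words: Sequence[str], table: Dict[str, Sequence[float]], dim: int
) -> List[List[float]]:
    """Look up a word vector for each word; unknown words map to a zero vector."""
    unknown = [0.0] * dim
    return [list(table.get(word, unknown)) for word in words]


# Hypothetical insurance-domain term dictionary.
terms = {"重疾险", "理赔", "流程"}
words = segment("重疾险理赔流程", terms)  # three dictionary terms
```

A domain term dictionary keeps product names such as "重疾险" intact as single tokens, which is presumably why the text lists the term dictionary separately from the segmentation rules.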
Preferably, obtaining the first feature vector similarity score matrix from the feature vector sequence of the customer question and the feature vector sequence of the candidate answer includes:
Apply a preset text similarity formula to the feature vector sequence of the customer question and the feature vector sequence of the candidate answer to obtain the first feature vector similarity score matrix.
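A similarity score matrix of this kind can be sketched as follows. The passage only says "a preset text similarity formula"; cosine similarity is assumed here for illustration, and `cosine` and `similarity_matrix` are hypothetical names. Entry (i, j) of the result compares the i-th feature vector of the customer question with the j-th feature vector of the candidate answer.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(
    question_seq: Sequence[Sequence[float]], answer_seq: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Rows follow the question positions, columns follow the answer positions."""
    return [[cosine(q, a) for a in answer_seq] for q in question_seq]
```

The same function applied to the customer question and the candidate question would give the second similarity score matrix.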
Preferably, determining the first matching attribute between the customer question and the candidate answer from the first feature vector similarity score matrix includes:
Use a preset screening algorithm to screen, from the first feature vector similarity score matrix, a preset number of pieces of feature information that satisfy a preset importance condition, forming a text feature vector;
Perform a binary classification judgment of semantic matching on the text feature vector, and take the semantic matching probability corresponding to the judgment result as the first matching attribute.
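One way to read the two steps above is k-max selection followed by a logistic classifier. Both choices are assumptions made for illustration: the passage does not say what the screening algorithm or the importance condition is, nor which classifier is used. `top_k_features`, `match_probability`, and the weights are hypothetical.

```python
import heapq
import math
from typing import List, Sequence


def top_k_features(matrix: Sequence[Sequence[float]], k: int) -> List[float]:
    """Keep the k largest similarity scores (assumed importance condition),
    padding with zeros so the text feature vector always has length k."""
    flat = [score for row in matrix for score in row]
    top = heapq.nlargest(k, flat)
    return top + [0.0] * (k - len(top))


def match_probability(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Binary match / no-match classifier; returns the probability of a match."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


scores = [[0.1, 0.9], [0.5, 0.3]]
features = top_k_features(scores, k=3)
first_matching_attribute = match_probability(features, [1.0, 1.0, 1.0], -1.0)
```

A fixed-length feature vector lets one classifier handle questions and answers of any length, since the similarity matrix itself changes shape with the text.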
Preferably, calculating the matching value of each piece of candidate information from the first matching attribute and the second matching attribute of that candidate information includes:
Compute the weighted sum of the first matching attribute and the second matching attribute of any piece of candidate information to obtain the matching value of that candidate information.
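Written as a formula, the weighted sum and the subsequent selection might look as follows. The symbols are introduced here for illustration only: $m_i^{(1)}$ and $m_i^{(2)}$ denote the first and second matching attributes of the $i$-th piece of candidate information, $s_i$ its matching value, and $w_1$, $w_2$ the weights, whose values this passage does not specify.

```latex
s_i = w_1 \, m_i^{(1)} + w_2 \, m_i^{(2)}, \qquad
i^{*} = \arg\max_{i} \; s_i
```

The candidate answer of the $i^{*}$-th piece of candidate information is then returned as the reply answer.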
Preferably, before receiving the customer question, the method further includes:
Preset a deep learning model and a calculation rule for calculating the matching value of candidate information from the matching attributes, the deep learning model being used to calculate the first matching attribute between the customer question and the candidate answer and the second matching attribute between the customer question and the candidate question;
Wherein the presetting of the deep learning model is implemented by training the deep learning model, and the training process specifically includes:
Obtain at least two pieces of training candidate information corresponding to a training customer question, each piece of training candidate information including a training candidate question and a training candidate answer;
Obtain, based on the deep learning model, the feature vector sequences of the training customer question and of the at least two pieces of training candidate information respectively;
Obtain, based on the deep learning model, a first training feature vector similarity score matrix from the feature vector sequence of the training customer question and the feature vector sequence of the training candidate answer;
Obtain, based on the deep learning model, a second training feature vector similarity score matrix from the feature vector sequence of the training customer question and the feature vector sequence of the training candidate question;
Using the preset screening algorithm of the deep learning model, screen a preset number of pieces of feature information that satisfy a preset importance condition from the first training feature vector similarity score matrix and the second training feature vector similarity score matrix to form a training text feature vector;
Using the classifier of the deep learning model, perform a binary classification judgment of semantic matching on the training text feature vector, train the parameters of the deep learning model on the obtained prediction result by gradient descent, and output a training result;
When the training result satisfies a preset condition, record the parameters of the deep learning model, so that the deep learning model determines the reply answer to a customer question based on those parameters.
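The last two training steps (binary classification trained by gradient descent, with the parameters recorded once a preset condition is met) can be sketched on the classifier alone. This is a simplified illustration under stated assumptions: it trains only a logistic classifier over already-extracted training text feature vectors, whereas the application trains the whole deep learning model; the cross-entropy loss, the loss threshold used as the "preset condition", and the names `predict` and `train` are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (training text feature vector, label), where label 1 means "semantically matched".
Sample = Tuple[Sequence[float], int]


def predict(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Probability that the pair is a semantic match."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(
    samples: Sequence[Sample],
    learning_rate: float = 0.5,
    max_epochs: int = 5000,
    target_loss: float = 0.15,
) -> Tuple[List[float], float, float]:
    """Full-batch gradient descent on the mean binary cross-entropy loss.

    Stops as soon as the loss is at or below target_loss (the assumed preset
    condition) and returns the recorded parameters together with that loss.
    """
    dim = len(samples[0][0])
    n = len(samples)
    weights = [0.0] * dim
    bias = 0.0
    loss = float("inf")
    for _ in range(max_epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        loss = 0.0
        for features, label in samples:
            p = predict(features, weights, bias)
            loss -= label * math.log(p + 1e-12) + (1 - label) * math.log(1.0 - p + 1e-12)
            error = p - label  # gradient of the loss with respect to z
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        loss /= n
        if loss <= target_loss:
            break  # preset condition met: keep the current parameters
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias, loss
```

At inference time the recorded `weights` and `bias` would be reused unchanged to score new customer questions against their candidates.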
Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, the instruction device implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
Memory may include non-permanent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the device provided by an embodiment corresponds to the method provided by that embodiment, it is described relatively briefly; for relevant details, refer to the description of the method.
The above description of the provided embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features provided herein.

Claims (10)

1. A method for semantic matching of question-and-answer text, characterized by comprising:
Receiving a customer question;
Obtaining, according to the customer question, at least two pieces of candidate information corresponding to the customer question, each piece of candidate information including a candidate answer and a candidate question;
According to the candidate information, separately calculating a first matching attribute between the customer question and the candidate answer and a second matching attribute between the customer question and the candidate question;
Calculating the matching value of each piece of candidate information from the first matching attribute and the second matching attribute of that candidate information;
Selecting the candidate answer in the candidate information with the largest matching value as the reply answer to the customer question.
2. The method according to claim 1, characterized in that separately calculating, according to the candidate information, the first matching attribute between the customer question and the candidate answer and the second matching attribute between the customer question and the candidate question comprises:
Obtaining the feature vector sequences of the customer question, the candidate question, and the candidate answer respectively, the feature vector sequences carrying local context features;
Obtaining a first feature vector similarity score matrix from the feature vector sequence of the customer question and the feature vector sequence of the candidate answer, and determining the first matching attribute between the customer question and the candidate answer from the first feature vector similarity score matrix;
Obtaining a second feature vector similarity score matrix from the feature vector sequence of the customer question and the feature vector sequence of the candidate question, and determining the second matching attribute between the customer question and the candidate question from the second feature vector similarity score matrix.
3. The method according to claim 2, characterized in that obtaining the feature vector sequences of the customer question, the candidate question, and the candidate answer respectively comprises:
Performing word segmentation on the customer question, the candidate question, and the candidate answer respectively according to a preset term dictionary and word segmentation rules, obtaining customer question word groups, candidate question word groups, and candidate answer word groups;
Converting the customer question word groups, candidate question word groups, and candidate answer word groups into word vectors respectively, obtaining a customer question word vector sequence, a candidate question word vector sequence, and a candidate answer word vector sequence;
Using a preset bidirectional long short-term memory (BiLSTM) network to capture the local context features of the customer question word vector sequence, the candidate question word vector sequence, and the candidate answer word vector sequence, obtaining the feature vector sequences of the customer question, the candidate question, and the candidate answer respectively.
4. The method according to claim 2, characterized in that obtaining the first feature vector similarity score matrix from the feature vector sequence of the customer question and the feature vector sequence of the candidate answer comprises:
Applying a preset text similarity formula to the feature vector sequence of the customer question and the feature vector sequence of the candidate answer to obtain the first feature vector similarity score matrix.
5. The method according to claim 2, characterized in that determining the first matching attribute between the customer question and the candidate answer from the first feature vector similarity score matrix comprises:
Using a preset screening algorithm to screen, from the first feature vector similarity score matrix, a preset number of pieces of feature information that satisfy a preset importance condition, forming a text feature vector;
Performing a binary classification judgment of semantic matching on the text feature vector, and taking the semantic matching probability corresponding to the judgment result as the first matching attribute.
6. The method according to claim 1, characterized in that calculating the matching value of each piece of candidate information from the first matching attribute and the second matching attribute of that candidate information comprises:
Computing the weighted sum of the first matching attribute and the second matching attribute of any piece of candidate information to obtain the matching value of that candidate information.
7. The method according to claim 1, characterized by further comprising, before receiving the customer question:
Presetting a deep learning model and a calculation rule for calculating the matching value of candidate information from the first matching attribute and the second matching attribute, the deep learning model being used to calculate the first matching attribute between the customer question and the candidate answer and the second matching attribute between the customer question and the candidate question;
Wherein the presetting of the deep learning model is implemented by training the deep learning model, and the training process comprises:
Obtaining at least two pieces of training candidate information corresponding to a training customer question, each piece of training candidate information including a training candidate question and a training candidate answer;
Obtaining, based on the deep learning model, the feature vector sequences of the training customer question and of the at least two pieces of training candidate information respectively;
Obtaining, based on the deep learning model, a first training feature vector similarity score matrix from the feature vector sequence of the training customer question and the feature vector sequence of the training candidate answer;
Obtaining, based on the deep learning model, a second training feature vector similarity score matrix from the feature vector sequence of the training customer question and the feature vector sequence of the training candidate question;
Using the preset screening algorithm of the deep learning model, screening a preset number of pieces of feature information that satisfy a preset importance condition from the first training feature vector similarity score matrix and the second training feature vector similarity score matrix to form a training text feature vector;
Using the classifier of the deep learning model, performing a binary classification judgment of semantic matching on the training text feature vector, training the parameters of the deep learning model on the obtained prediction result by gradient descent, and outputting a training result;
When the training result satisfies a preset condition, recording the parameters of the deep learning model, so that the deep learning model determines the reply answer to a customer question based on those parameters.
8. A device for semantic matching of question-and-answer text, characterized by comprising:
A receiving module, configured to receive a customer question;
An obtaining module, configured to obtain, according to the customer question, at least two pieces of candidate information corresponding to the customer question, any piece of candidate information including at least two factors;
A first computing module, configured to separately calculate, according to the candidate information, a first matching attribute between the customer question and the candidate answer and a second matching attribute between the customer question and the candidate question;
A second computing module, configured to calculate the matching value of each piece of candidate information from the first matching attribute and the second matching attribute of that candidate information;
A selecting module, configured to select the candidate answer in the candidate information with the largest matching value as the reply answer to the customer question.
9. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method for semantic matching of question-and-answer text according to any one of claims 1 to 7.
10. An electronic device, characterized by comprising:
One or more processors;
A storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for semantic matching of question-and-answer text according to any one of claims 1 to 7.
CN201810718708.9A 2018-06-29 2018-06-29 Question and answer text semantic matching method and device Active CN108920654B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810718708.9A CN108920654B (en) 2018-06-29 2018-06-29 Question and answer text semantic matching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810718708.9A CN108920654B (en) 2018-06-29 2018-06-29 Question and answer text semantic matching method and device

Publications (2)

Publication Number Publication Date
CN108920654A true CN108920654A (en) 2018-11-30
CN108920654B CN108920654B (en) 2021-10-29

Family

ID=64423529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810718708.9A Active CN108920654B (en) 2018-06-29 2018-06-29 Question and answer text semantic matching method and device

Country Status (1)

Country Link
CN (1) CN108920654B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102236677A (en) * 2010-04-28 2011-11-09 北京大学深圳研究生院 Question answering system-based information matching method and system
CN102609500A (en) * 2012-02-01 2012-07-25 北京百度网讯科技有限公司 Question push method, question answering system using same and search engine
CN102903008A (en) * 2011-07-29 2013-01-30 国际商业机器公司 Method and system for computer question answering
CN103544216A (en) * 2013-09-23 2014-01-29 Tcl集团股份有限公司 Information recommendation method and system combining image content and keywords
CN104133817A (en) * 2013-05-02 2014-11-05 深圳市世纪光速信息技术有限公司 Online community interaction method and device and online community platform
CN105989040A (en) * 2015-02-03 2016-10-05 阿里巴巴集团控股有限公司 Intelligent question-answer method, device and system
US20170212916A1 (en) * 2016-01-22 2017-07-27 International Business Machines Corporation Duplicate post handling with natural language processing
CN107239574A (en) * 2017-06-29 2017-10-10 北京神州泰岳软件股份有限公司 A kind of method and device of intelligent Answer System knowledge problem matching
CN107315772A (en) * 2017-05-24 2017-11-03 北京邮电大学 The problem of based on deep learning matching process and device
CN107741976A (en) * 2017-10-16 2018-02-27 泰康保险集团股份有限公司 Intelligent response method, apparatus, medium and electronic equipment
CN107818164A (en) * 2017-11-02 2018-03-20 东北师范大学 A kind of intelligent answer method and its system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
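伍浩铖: "Research on Ranking Methods in Community Question Answering Search", China Doctoral and Master's Dissertations Full-text Database (Doctoral), Information Science and Technology Series *
相洋: "Research on Answer Optimization Methods for Question Answering Systems", China Doctoral Dissertations Full-text Database, Information Science and Technology Series *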

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109582970B (en) * 2018-12-12 2023-05-30 科大讯飞华南人工智能研究院(广州)有限公司 Semantic measurement method, semantic measurement device, semantic measurement equipment and readable storage medium
CN109582970A (en) * 2018-12-12 2019-04-05 科大讯飞华南人工智能研究院(广州)有限公司 A kind of semantic measurement method, apparatus, equipment and readable storage medium storing program for executing
CN109597881A (en) * 2018-12-17 2019-04-09 北京百度网讯科技有限公司 Matching degree determines method, apparatus, equipment and medium
CN109800292A (en) * 2018-12-17 2019-05-24 北京百度网讯科技有限公司 The determination method, device and equipment of question and answer matching degree
CN109726396A (en) * 2018-12-20 2019-05-07 泰康保险集团股份有限公司 Semantic matching method, device, medium and the electronic equipment of question and answer text
CN109918673A (en) * 2019-03-14 2019-06-21 湖北亿咖通科技有限公司 Semantic referee method, device, electronic equipment and computer readable storage medium
CN109918673B (en) * 2019-03-14 2021-08-03 湖北亿咖通科技有限公司 Semantic arbitration method and device, electronic equipment and computer-readable storage medium
CN110134775B (en) * 2019-05-10 2021-08-24 中国联合网络通信集团有限公司 Question and answer data generation method and device and storage medium
CN110134775A (en) * 2019-05-10 2019-08-16 中国联合网络通信集团有限公司 Question and answer data creation method and device, storage medium
CN110347813B (en) * 2019-06-26 2021-09-17 北京大米科技有限公司 Corpus processing method and device, storage medium and electronic equipment
CN110347813A (en) * 2019-06-26 2019-10-18 北京大米科技有限公司 A kind of corpus processing method, device, storage medium and electronic equipment
CN110413730A (en) * 2019-06-27 2019-11-05 平安科技(深圳)有限公司 Text information matching degree detection method, device, computer equipment and storage medium
WO2020258506A1 (en) * 2019-06-27 2020-12-30 平安科技(深圳)有限公司 Text information matching degree detection method and apparatus, computer device and storage medium
CN110399473A (en) * 2019-06-28 2019-11-01 阿里巴巴集团控股有限公司 The method and apparatus for determining answer for customer problem
CN110399473B (en) * 2019-06-28 2023-08-29 创新先进技术有限公司 Method and device for determining answers to user questions
CN110489730A (en) * 2019-08-14 2019-11-22 腾讯科技(深圳)有限公司 Text handling method, device, terminal and storage medium
CN111753062A (en) * 2019-11-06 2020-10-09 北京京东尚科信息技术有限公司 Method, device, equipment and medium for determining session response scheme
CN111026840B (en) * 2019-11-26 2023-10-13 腾讯科技(深圳)有限公司 Text processing method, device, server and storage medium
CN111026840A (en) * 2019-11-26 2020-04-17 腾讯科技(深圳)有限公司 Text processing method, device, server and storage medium
CN111108501B (en) * 2019-12-25 2024-02-06 深圳市优必选科技股份有限公司 Context-based multi-round dialogue method, device, equipment and storage medium
CN111108501A (en) * 2019-12-25 2020-05-05 深圳市优必选科技股份有限公司 Context-based multi-turn dialogue method, device, equipment and storage medium
CN111309878A (en) * 2020-01-19 2020-06-19 支付宝(杭州)信息技术有限公司 Retrieval type question-answering method, model training method, server and storage medium
CN111309878B (en) * 2020-01-19 2023-08-22 支付宝(杭州)信息技术有限公司 Search type question-answering method, model training method, server and storage medium
CN111259130B (en) * 2020-02-14 2023-04-07 支付宝(杭州)信息技术有限公司 Method and apparatus for providing reply sentence in dialog
CN111259130A (en) * 2020-02-14 2020-06-09 支付宝(杭州)信息技术有限公司 Method and apparatus for providing reply sentence in dialog
CN112231456A (en) * 2020-10-15 2021-01-15 泰康保险集团股份有限公司 Question generation method and device, storage medium and electronic equipment
CN112231456B (en) * 2020-10-15 2024-02-23 泰康保险集团股份有限公司 Question generation method, device, storage medium and electronic equipment
CN112434514B (en) * 2020-11-25 2022-06-21 重庆邮电大学 Multi-granularity multi-channel neural network based semantic matching method and device and computer equipment
CN112434514A (en) * 2020-11-25 2021-03-02 重庆邮电大学 Multi-granularity multi-channel neural network based semantic matching method and device and computer equipment
CN112989001B (en) * 2021-03-31 2023-05-26 建信金融科技有限责任公司 Question and answer processing method and device, medium and electronic equipment
CN112989001A (en) * 2021-03-31 2021-06-18 建信金融科技有限责任公司 Question and answer processing method, device, medium and electronic equipment
WO2022226879A1 (en) * 2021-04-29 2022-11-03 京东方科技集团股份有限公司 Question and answer processing method and apparatus, electronic device, and computer-readable storage medium
CN113157868A (en) * 2021-04-29 2021-07-23 青岛海信网络科技股份有限公司 Method and device for matching answers to questions based on structured database

Also Published As

Publication number Publication date
CN108920654B (en) 2021-10-29

Similar Documents

Publication Publication Date Title
CN108920654A (en) Method and apparatus for semantic matching of question-and-answer text
CN111724083B (en) Training method and device for financial risk identification model, computer equipment and medium
CN109978060B (en) Training method and device of natural language element extraction model
CN107679082A (en) Question and answer searching method, device and electronic equipment
CN107844533A (en) Intelligent question answering system and analysis method
CN110008984B (en) Target fraud transaction model training method and device based on multitasking samples
CN109299476A (en) Question answering method and device, electronic equipment and storage medium
CN109829065B (en) Image retrieval method, device, equipment and computer readable storage medium
US11568503B2 (en) Systems and methods for determining structured proceeding outcomes
CN114647741A (en) Process automatic decision and reasoning method, device, computer equipment and storage medium
CN114357117A (en) Transaction information query method and device, computer equipment and storage medium
CN112132238A (en) Method, device, equipment and readable medium for identifying private data
CN110990523A (en) Legal document determining method and system
CN110263817B (en) Risk grade classification method and device based on user account
CN110705622A (en) Decision-making method and system and electronic equipment
CN110472063A (en) Social media data processing method, model training method and relevant apparatus
CN111259975B (en) Method and device for generating classifier and method and device for classifying text
CN108733694B (en) Retrieval recommendation method and device
CN109582834B (en) Data risk prediction method and device
CN114254622A (en) Intention identification method and device
CN117034942B (en) Named entity recognition method, device, equipment and readable storage medium
Ali et al. Identifying and Profiling User Interest over time using Social Data
CN113220801B (en) Structured data classification method, device, equipment and medium
CN112395405B (en) Query document sorting method and device and electronic equipment
CN107391591B (en) Data processing method and device and server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Floor 36, Zheshang Building, No. 718 Jianshe Avenue, Jiang'an District, Wuhan, Hubei 430019

Patentee after: TK.CN INSURANCE Co.,Ltd.

Patentee after: TAIKANG INSURANCE GROUP Co.,Ltd.

Address before: Taikang Life Building, 156 Fuxingmennei Street, Xicheng District, Beijing 100031

Patentee before: TAIKANG INSURANCE GROUP Co.,Ltd.

Patentee before: TK.CN INSURANCE Co.,Ltd.
