Summary of the invention
The main purpose of the present invention is to overcome the defect in the prior art that question matching results in intelligent interactive systems are unsatisfactory.
To achieve the above purpose, the invention provides a question matching method in an intelligent interactive system, the method comprising the following steps:
Obtaining a user question input by a user;
Performing word segmentation, stop-word removal and query expansion on the user question to obtain index terms of the user question;
Matching, according to a preset index file, candidate questions related to the index terms of the user question from a question-and-answer base;
Calculating the similarity between the user question and each question in the similar-question sets corresponding to the candidate questions;
Outputting, according to the calculated similarity and a preset rule, the answer corresponding to the question that matches the user question.
Further, the method also comprises:
Expanding each question in the question-and-answer base to form a similar-question set, the similar-question set comprising at least the question itself and similar questions that have the same answer.
Further, the method also comprises:
Establishing the question-and-answer base;
Performing word segmentation on the questions in the question-and-answer base to obtain index terms, and building an index file that records the correspondence between the index terms and the questions.
Preferably, the step of outputting, according to the calculated similarity and the preset rule, the answer corresponding to the question that matches the user question comprises:
If the similar-question sets contain a question whose similarity to the user question is greater than the upper limit of a preset range, outputting the answer corresponding to the candidate question with the greatest similarity to the user question;
Otherwise, if the similar-question sets contain a question whose similarity to the user question falls within the preset range, outputting the candidate questions whose similarity to the user question falls within the preset range for the user to select from, and outputting the answer corresponding to the candidate question selected by the user;
Otherwise, if the similarities between the questions in the similar-question sets and the user question are all less than the lower limit of the preset range, adding the user question to the question-and-answer base and outputting a prompt that no match was found.
Further, after the step of outputting, if the similar-question sets contain a similar question whose similarity falls within the preset range, the candidate questions whose similarity to the user question falls within the preset range for the user to select from, and outputting the answer corresponding to the candidate question selected by the user, the method also comprises:
Adding the user question to the similar-question set corresponding to the candidate question selected by the user.
To achieve the above purpose, the present invention also provides a question matching system in an intelligent interactive system, the system comprising:
A question acquisition module, configured to obtain a user question input by a user;
A question preprocessing module, configured to perform word segmentation, stop-word removal and query expansion on the user question to obtain index terms of the user question;
A candidate question matching module, configured to match, according to a preset index file, candidate questions related to the index terms of the user question from a question-and-answer base;
A similar-question matching module, configured to calculate the similarity between the user question and each question in the similar-question sets corresponding to the candidate questions;
A result output module, configured to output, according to the calculated similarity and a preset rule, the answer corresponding to the question that matches the user question.
Further, the system also comprises:
A similar-question set expansion module, configured to expand each question in the question-and-answer base to form a similar-question set, the similar-question set comprising at least the question itself and similar questions that have the same answer.
Further, the system also comprises:
An index file building module, configured to establish the question-and-answer base, perform word segmentation on the questions in the question-and-answer base to obtain index terms, and build an index file that records the correspondence between the index terms and the questions.
Preferably, the result output module comprises:
A full-match output module, configured to output the answer corresponding to the candidate question with the greatest similarity to the user question when the similar-question sets contain a question whose similarity to the user question is greater than the upper limit of the preset range;
A similar-match output module, configured to, when the similar-question sets contain a question whose similarity falls within the preset range, output the candidate questions whose similarity to the user question falls within the preset range for the user to select from, and output the answer corresponding to the candidate question selected by the user;
An empty-match output module, configured to, when the similarities between the questions in the similar-question sets and the user question are all less than the lower limit of the preset range, add the user question to the question-and-answer base and output a prompt that no match was found.
Preferably, the similar-match output module is also configured to add the user question to the similar-question set corresponding to the candidate question selected by the user.
By adopting the above technical solution, the present invention achieves the following technical effects. Performing word segmentation, stop-word removal and query expansion on the user question prevents words of little relevance in a complex sentence from affecting the question matching result. First matching candidate questions related to the user question from the question-and-answer base according to the preset index file reduces the amount of computation required for question matching. Then calculating the similarity between the user question and each question in the similar-question sets corresponding to the candidate questions avoids the situation in which a question cannot be matched because the same question can be phrased in several different ways, and thus improves the question matching result. Finally, the answer corresponding to the question that matches the user question is output according to the calculated similarity and the preset rule. The scheme as a whole improves the accuracy of question matching results in the intelligent interactive system.
Embodiment
It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.
The principle and spirit of the present invention are described below with reference to several illustrative embodiments. It should be understood that these embodiments are provided only to enable those skilled in the art to better understand and implement the present invention, and not to limit its scope in any way. Rather, they are provided so that this disclosure will be thorough and complete and will fully convey its scope to those skilled in the art.
Those skilled in the art know that embodiments of the present invention may be implemented as a system, apparatus, device, method or computer program product. Accordingly, the present disclosure may be embodied entirely in hardware, entirely in software (including firmware, resident software, microcode, etc.), or in a combination of hardware and software.
The main purpose of the present invention is to overcome the defect in the prior art that question matching results in intelligent interactive systems are unsatisfactory.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of the question matching method in an intelligent interactive system according to a first preferred embodiment of the present invention.
In one embodiment, as shown in Fig. 1, the method comprises the following steps:
S10: obtaining a user question input by a user;
Specifically, this step obtains the user question entered through the input interface of the intelligent interactive system. When a user wants an answer from the intelligent interactive system, the user enters a question on the client; the question may be in audio, text or image form. If the question is in audio or image form, the intelligent interactive system converts it to text form to facilitate the subsequent search for the best-matching question. The input interface may be a client APP, and the method may be carried out on a server, which may be a Web server or another type of server such as an APP server.
S20: performing word segmentation, stop-word removal and query expansion on the user question to obtain index terms of the user question;
Specifically, word segmentation divides the user question into multiple words; the segmentation may call ICTCLAS, the word segmentation tool of the Chinese Academy of Sciences. Stop-word removal removes stop words: a stop-word dictionary may be established in advance, and any word that matches it is removed. The stop words removed include polite expressions (such as "may I ask"), auxiliary words (particles) and other words that contribute little to the meaning of the user question but occur frequently. Query expansion mainly means synonym expansion (for example, "doctor" and "physician", or "father" and "dad"); the synonym thesaurus "Tongyici Cilin" may be used to expand each word obtained from segmenting the user question with its synonyms. After word segmentation, stop-word removal and query expansion, the index terms that are substantively relevant to the user question are obtained. For example, the user question "What is your company's film-reading service workflow like?" yields, after word segmentation, "you", "company", "film reading", "service", "workflow", "is", "what" and a particle. Stop-word removal removes the particle, and query expansion is then performed word by word: "you" is expanded to its synonyms; "company" is expanded to "shop", "firm", "trading company", "office", "foreign firm", "commission agent", "commodity section", "business office", "cooperative society", "businessman", "enterprise" and so on; "film reading" is expanded to "viewing medical image files"; "service" is expanded to "work", "operation", "affairs", "matter", "business" and so on; "workflow" is expanded to "pipeline", "process flow" and so on; "is" is expanded to "correct", "right", "so", "exactly" and so on; and "what" is expanded to "how", "why", "what kind of", "how about" and so on. The words "you", "company", "film reading", "service", "workflow", "is" and "what", together with the synonyms expanded from each of them, serve as the index terms.
S30: matching, according to the preset index file, candidate questions related to the index terms of the user question from the question-and-answer base;
Specifically, after the index terms are obtained in S20, candidate questions related to the index terms of the user question are matched from the question-and-answer base according to the preset index file. The purpose of obtaining candidate questions is to confine subsequent complex processing, such as similarity calculation, to a smaller set of questions. Taking the index terms of the user question as the basic unit, the 10 questions that share the most words with the user question are retrieved from the preset index file as candidate questions (this number is only an example and can be set according to the needs of the actual intelligent interactive system). Continuing with the user question "What is your company's film-reading service workflow like?", using the index terms "you", "company", "film reading", "service", "workflow", "is" and "what" and the synonyms expanded from each word, the preset index file yields candidate questions such as "What, roughly, is the detailed process of the film-reading service?", "What kind of company is the film-reading company vRad?", "What does your business do?", "When did your film-reading service start?", "What company are you?", "How is the diagnosis part of the film-reading service implemented?", "What is the full name of the film-reading service?" and "What is the service platform of the film-reading service?".
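The candidate retrieval in S30 can be sketched as a lookup in an inverted index followed by a ranking on the number of overlapping words. The sketch below is an assumption-laden illustration: the index is a plain dictionary from term to question ids rather than a Lucene index file, and `match_candidates`, `toy_index` and the tie-breaking rule (smaller id first) are hypothetical.

```python
from collections import Counter


def match_candidates(index_terms, inverted_index, top_n=10):
    """Return the ids of the top_n questions sharing the most index terms.

    inverted_index maps each term to the set of question ids containing it.
    Ties are broken by question id, which is an arbitrary choice.
    """
    overlap = Counter()
    for term in set(index_terms):
        for question_id in inverted_index.get(term, ()):
            overlap[question_id] += 1
    ranked = sorted(overlap.items(), key=lambda item: (-item[1], item[0]))
    return [question_id for question_id, _ in ranked[:top_n]]


# Toy index: question 1 contains all three terms, questions 2 and 3 one each.
toy_index = {
    "company": {1, 2},
    "workflow": {1},
    "service": {1, 3},
}
candidates = match_candidates(["company", "workflow", "service"], toy_index, top_n=2)
```

Restricting the later similarity calculation to these few candidates is what keeps the overall cost low, at the price of possibly missing a question that shares no index term with the query.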
S40: calculating the similarity between the user question and each question in the similar-question sets corresponding to the candidate questions;
Specifically, the similarity is calculated between the user question "What is your company's film-reading service workflow like?" and each question in the similar-question sets corresponding to the candidate questions "What, roughly, is the detailed process of the film-reading service?", "What kind of company is the film-reading company vRad?", "What does your business do?", "When did your film-reading service start?", "What company are you?", "How is the diagnosis part of the film-reading service implemented?", "What is the full name of the film-reading service?" and "What is the service platform of the film-reading service?". In practice, the similar-question sets may be established in advance when the intelligent interactive system is designed, or established and refined while the system is in use. Each similar-question set comprises at least the question itself and similar questions that have the same answer. That is, once the candidate questions have been matched, the similarity between the user question and every question in the similar-question sets corresponding to all of the candidate questions is calculated. To calculate the similarity between sentences, the similarity between words may first be calculated with a word similarity algorithm based on HowNet (a common-sense knowledge base that takes the concepts represented by Chinese and English words as its objects of description and whose basic content is the relations between concepts and between the attributes of concepts), and the sentence similarity is then calculated from the word similarities. Methods for calculating sentence similarity belong to the prior art and are not described further here.
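The two-level calculation in S40 (word similarity first, then sentence similarity, applied to every question in each candidate's similar-question set) can be sketched as follows. The text leaves the sentence-level formula to the prior art, so everything here is an assumption: `word_similarity` is an exact-match stand-in for a HowNet-based measure, and the sentence score is a symmetric average of best word matches, chosen only because it is simple to follow.

```python
def word_similarity(a, b):
    """Stand-in for a HowNet-based word similarity: 1.0 on exact match, else 0.0."""
    return 1.0 if a == b else 0.0


def sentence_similarity(tokens_a, tokens_b, word_sim=word_similarity):
    """Average, in both directions, of each word's best match in the other sentence."""
    if not tokens_a or not tokens_b:
        return 0.0

    def directed(source, target):
        return sum(max(word_sim(s, t) for t in target) for s in source) / len(source)

    return (directed(tokens_a, tokens_b) + directed(tokens_b, tokens_a)) / 2


def score_similar_sets(query_tokens, similar_sets):
    """Map each candidate id to the best similarity found in its similar-question set.

    similar_sets maps a candidate id to a non-empty list of tokenized questions
    (the candidate question itself plus its similar questions).
    """
    return {
        candidate_id: max(sentence_similarity(query_tokens, q) for q in questions)
        for candidate_id, questions in similar_sets.items()
    }


scores = score_similar_sets(
    ["a", "b"],
    {1: [["a", "c"], ["a", "b"]], 2: [["x"]]},
)
```

Taking the maximum over the set is what lets a differently phrased similar question, rather than only the candidate question itself, produce the match.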
S50: outputting, according to the calculated similarity and the preset rule, the answer corresponding to the question that matches the user question.
Specifically, the answer corresponding to the question that matches the user question is output according to the similarity calculated in step S40 and the preset rule. The preset rule may be an output rule, based on a preset similarity range, that is designed according to the needs of the intelligent interactive system in its specific field of application and the completeness of the question-and-answer base. Outputting the answer comprises first determining, from the calculated similarity, the candidate question that matches the user question, and then outputting the answer corresponding to that candidate question. To determine the matching candidate question, it is judged, according to the preset similarity range, whether the similar-question sets contain a question whose similarity to the user question falls within the preset range. If so, the answer corresponding to the candidate question to which that question's similar-question set belongs is output; if not, the user question is added directly to the question-and-answer base.
In this embodiment of the present invention, performing word segmentation, stop-word removal and query expansion on the user question prevents words of little relevance in a complex sentence from affecting the question matching result. First matching candidate questions related to the user question from the question-and-answer base according to the preset index file reduces the amount of computation required for question matching. Then calculating the similarity between the user question and each question in the similar-question sets corresponding to the candidate questions avoids the situation in which a question cannot be matched because the same question can be phrased in several different ways, and thus improves the question matching result. Finally, the answer corresponding to the question that matches the user question is output according to the calculated similarity and the preset rule. The scheme as a whole improves the accuracy of question matching results in the intelligent interactive system.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of the question matching method in an intelligent interactive system according to a second preferred embodiment of the present invention.
In one embodiment, as shown in Fig. 2, on the basis of the first preferred embodiment shown in Fig. 1, the question matching method in the intelligent interactive system further comprises, before step S10:
S60: expanding each question in the question-and-answer base to form a similar-question set, the similar-question set comprising at least the question itself and similar questions that have the same answer.
Specifically, to improve the accuracy of the matching result for the user question when the system is in use, a similar-question set is established in advance for every question in the question-and-answer base when the intelligent interactive system is designed; each similar-question set comprises at least the question itself and similar questions that have the same answer. For example, for the matched candidate question "What, roughly, is the detailed process of the film-reading service?", a similar-question set is established that consists of "How does the film-reading service complete its whole workflow?", "What are the main parts of the film-reading service workflow?", "How does the film-reading service operate?", "How is the film-reading service completed?" and the candidate question "What, roughly, is the detailed process of the film-reading service?" itself. Likewise, a similar-question set is established for each of the other matched candidate questions: "What kind of company is the film-reading company vRad?", "What does your business do?", "When did your film-reading service start?", "What company are you?", "How is the diagnosis part of the film-reading service implemented?", "What is the full name of the film-reading service?" and "What is the service platform of the film-reading service?".
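One possible in-memory shape for a question, its answer and its similar-question set is sketched below. The class name `QAEntry`, its fields and the placeholder answer text are hypothetical; the only property taken from the text is that the set always contains the question itself alongside similar questions that share the same answer.

```python
from dataclasses import dataclass, field


@dataclass
class QAEntry:
    """A question, its answer, and the similar questions sharing that answer."""

    question: str
    answer: str
    similar: list = field(default_factory=list)

    def __post_init__(self):
        # The similar-question set must contain at least the question itself.
        others = [s for s in self.similar if s != self.question]
        self.similar = [self.question] + others


entry = QAEntry(
    question="What, roughly, is the detailed process of the film-reading service?",
    answer="(answer text)",
    similar=[
        "How does the film-reading service operate?",
        "How is the film-reading service completed?",
    ],
)
```

Storing the answer once per entry, rather than once per phrasing, keeps the question-and-answer base free of redundant answers while still allowing many phrasings to reach it.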
Referring to Fig. 3, Fig. 3 is a schematic flowchart of the question matching method in an intelligent interactive system according to a third preferred embodiment of the present invention.
In one embodiment, as shown in Fig. 3, on the basis of the second preferred embodiment shown in Fig. 2, the question matching method in the intelligent interactive system further comprises, before step S60:
S70: establishing the question-and-answer base;
Specifically, a database of questions and answers, i.e. the question-and-answer base, is established in advance according to the needs of the intelligent interactive system. In a preferred embodiment, to avoid data redundancy, the questions and answers in the question-and-answer base are in one-to-one correspondence.
S80: performing word segmentation on the questions in the question-and-answer base to obtain index terms, and building an index file that records the correspondence between the index terms and the questions.
Specifically, after the question-and-answer base is established, word segmentation is performed on all questions in it. For example, the question "What is your company's film-reading service workflow like?" in the question-and-answer base yields, after segmentation, "you", "company", "film reading", "service", "workflow", "is", "what" and a particle; the other questions in the question-and-answer base are segmented in the same way. The segmentation may call ICTCLAS, the word segmentation tool of the Chinese Academy of Sciences, to obtain the segmented form of the sentence. The open-source full-text search engine toolkit Lucene is then called with the segmented sentence and the original question as input parameters to obtain the index file that records the correspondence between the index terms and the questions. In step S30, candidate questions related to the user question are matched from the question-and-answer base through the index terms of the user question according to the index file established here. The candidate questions are the 10 questions that share the most words with the user question, retrieved from the index file with the index terms of the user question as the basic unit (this number is only an example and can be set according to the needs of the actual intelligent interactive system). This is therefore only a preliminary matching process. Because the index terms of the user question have undergone word segmentation, stop-word removal and query expansion, the correctness of the question matching result is improved.
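The correspondence between index terms and questions built in S80 is, in essence, an inverted index. The sketch below builds one with plain Python dictionaries as a stand-in for the Lucene index file; it does not use or imitate the Lucene API, and `build_index` and the toy question ids are hypothetical.

```python
from collections import defaultdict


def build_index(tokenized_questions):
    """Build an inverted index from segmented questions.

    tokenized_questions maps a question id to its list of tokens.
    The result maps each term to the set of question ids that contain it.
    """
    index = defaultdict(set)
    for question_id, tokens in tokenized_questions.items():
        for token in tokens:
            index[token].add(question_id)
    return dict(index)


index = build_index(
    {
        1: ["you", "company", "film reading", "service", "workflow", "is", "what"],
        2: ["you", "is", "what", "company"],
    }
)
```

Because the index is built once, ahead of time, a query only has to look up its own index terms instead of scanning every question in the question-and-answer base.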
Referring to Fig. 4, Fig. 4 is a detailed flowchart of step S50 shown in Fig. 1 according to a fourth preferred embodiment of the present invention.
In one embodiment, as shown in Fig. 4, step S50, outputting, according to the calculated similarity and the preset rule, the answer corresponding to the question that matches the user question, comprises:
S501: if the similar-question sets contain a question whose similarity to the user question is greater than the upper limit of the preset range, outputting the answer corresponding to the candidate question with the greatest similarity to the user question;
Specifically, the similarity between the user question and each question in the similar-question sets corresponding to all matched candidate questions is calculated. If it is judged that the similar-question sets contain a question whose similarity to the user question is greater than the upper limit of the preset range, the answer corresponding to the candidate question with the greatest similarity to the user question is output directly. For example, suppose the preset range is 75% to 90% and the calculated similarity between the user question "What is your company's film-reading service workflow like?" and the question "How does the film-reading service complete its whole workflow?" in the similar-question set of the matched candidate question "What, roughly, is the detailed process of the film-reading service?" is 90.5%; the answer corresponding to the candidate question "What, roughly, is the detailed process of the film-reading service?" is then output directly.
S502: otherwise, if the similar-question sets contain a question whose similarity falls within the preset range, outputting the candidate questions whose similarity to the user question falls within the preset range for the user to select from, and outputting the answer corresponding to the candidate question selected by the user;
Specifically, if the condition of S501 is not met, it is judged whether the similar-question sets contain a question whose similarity to the user question falls within the preset range; if so, the candidate questions whose similarity to the user question falls within the preset range are output for the user to select from. For example, suppose the preset range is 75% to 90% and none of the calculated similarities between the user question "What is your company's film-reading service workflow like?" and the questions in the similar-question sets of all matched candidate questions exceeds the upper limit of 90%. It is then judged whether those similar-question sets contain a question whose similarity falls within the preset range; if so, the candidate question corresponding to each similar-question set that contains such a question is output for the user to select from. Suppose, for instance, that in the similar-question set of the matched candidate question "What, roughly, is the detailed process of the film-reading service?", the similar question "How does the film-reading service complete its whole workflow?" has a similarity of 81% and the similar question "What are the main parts of the film-reading service workflow?" has a similarity of 78%. Both questions fall within the preset range and belong to the similar-question set of the same candidate question "What, roughly, is the detailed process of the film-reading service?", so this candidate question is output only once for the user to select from. Likewise, when the similar-question set corresponding to any other matched candidate question contains a similar question whose similarity to the user question falls within the preset range, that candidate question is output according to the same principle for the user to select from, and the answer corresponding to the candidate question selected by the user is output. Giving the user sufficient freedom of choice helps ensure the accuracy of the matching result.
S503: otherwise, if the similarities between the questions in the similar-question sets and the user question are all less than the lower limit of the preset range, adding the user question to the question-and-answer base and outputting a prompt that no match was found.
Specifically, if neither the condition of step S501 nor that of step S502 is met, i.e. the similarities between the questions in the similar-question sets and the user question are all less than the lower limit of the preset range, the user question is added directly to the question-and-answer base. For example, suppose the preset range is 75% to 90% and the calculated similarities between the user question "What is your company's film-reading service workflow like?" and the questions in the similar-question sets corresponding to all matched candidate questions are all less than 75%. This indicates that the question-and-answer base contains no candidate question that matches the user question. The user question is then added to the question-and-answer base to enrich it, and a prompt that no match was found is output. Enriching the question-and-answer base in this way allows a candidate question of high similarity to be matched the next time a user asks a similar question, which improves the accuracy of the matching result.
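The three branches S501 to S503 amount to a threshold rule over the best similarity found for each candidate question. A minimal sketch is given below, using the example range of 75% to 90% from the text as defaults. The function name `decide`, the outcome labels and the choice to treat both range boundaries as inside the range are assumptions made for illustration.

```python
def decide(best_by_candidate, lower=0.75, upper=0.90):
    """Apply the S501/S502/S503 rule.

    best_by_candidate maps a candidate id to the highest similarity found in
    its similar-question set. Returns (outcome, candidate_ids), where outcome
    is "answer" (S501), "choose" (S502) or "empty" (S503).
    """
    if not best_by_candidate:
        return ("empty", [])
    top_id = max(best_by_candidate, key=best_by_candidate.get)
    if best_by_candidate[top_id] > upper:
        # S501: answer directly with the most similar candidate.
        return ("answer", [top_id])
    in_range = [c for c, s in best_by_candidate.items() if lower <= s <= upper]
    if in_range:
        # S502: each candidate appears once, most similar first.
        return ("choose", sorted(in_range, key=lambda c: -best_by_candidate[c]))
    # S503: the caller adds the user question to the question-and-answer base.
    return ("empty", [])


direct = decide({1: 0.905, 2: 0.80})
offered = decide({1: 0.81, 2: 0.78, 3: 0.50})
nothing = decide({1: 0.50})
```

Keying the input by candidate id is what makes a candidate appear only once in the S502 list even when several questions in its similar-question set fall within the range.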
Referring to Fig. 5, Fig. 5 is a detailed flowchart of step S50 shown in Fig. 1 according to a fifth preferred embodiment of the present invention.
In one embodiment, as shown in Fig. 5, on the basis of the flowchart shown in Fig. 4, after step S502 (if the similar-question sets contain a similar question whose similarity falls within the preset range, outputting the candidate questions whose similarity to the user question falls within the preset range for the user to select from, and outputting the answer corresponding to the candidate question selected by the user), the method further comprises:
S504: adding the user question to the similar-question set corresponding to the candidate question selected by the user;
Specifically, step S502 outputs candidate questions for the user to select from. If the candidate questions offered include the question the user needs, the system obtains the candidate question selected by the user and performs step S505; if they do not, the user question is added to the question-and-answer base and a prompt that no match was found is output.
If the candidate questions offered include the question the user needs, the system obtains the candidate question selected by the user and adds the user question to the similar-question set corresponding to that candidate question, thereby enriching the set. Enriching the similar-question sets in this way allows a candidate question of high similarity to be matched quickly the next time a user asks a similar question, which improves the accuracy of the matching result. For example, suppose the candidate questions output for the user question "What is your company's film-reading service workflow like?" are "What, roughly, is the detailed process of the film-reading service?" and "What are the main parts of the film-reading service workflow?", and the user selects "What, roughly, is the detailed process of the film-reading service?". The user question "What is your company's film-reading service workflow like?" is then added to the similar-question set corresponding to the candidate question "What, roughly, is the detailed process of the film-reading service?". The next time a user asks "What is your company's film-reading service workflow like?", the high-similarity candidate question "What, roughly, is the detailed process of the film-reading service?" can be matched accurately, which improves the accuracy of the question matching result.
After the user has selected a specific candidate question, the answer corresponding to the selected candidate question is output to the client for the user's reference.
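The feedback step S504, together with the fallback used when none of the offered candidates suits the user, can be sketched as follows. The names `record_choice`, `similar_sets` and `unmatched` are hypothetical, and storing unmatched questions in a simple list is an assumption about how the question-and-answer base might take in new questions.

```python
def record_choice(user_question, chosen_id, similar_sets, unmatched):
    """Record the user's selection among the offered candidate questions.

    chosen_id is the id of the selected candidate, or None if the user found
    none of the candidates suitable. similar_sets maps a candidate id to its
    list of similar questions; unmatched collects questions to be added to the
    question-and-answer base.
    """
    if chosen_id is None:
        unmatched.append(user_question)
        return "no match"
    questions = similar_sets[chosen_id]
    if user_question not in questions:
        questions.append(user_question)  # S504: enrich the similar-question set
    return "answered"


sets = {7: ["What, roughly, is the detailed process of the film-reading service?"]}
pending = []
first = record_choice("What is your company's film-reading service workflow like?", 7, sets, pending)
second = record_choice("Some unrelated question", None, sets, pending)
```

Once the user question is stored in the chosen candidate's set, an identical later query scores as an exact match against that set and can be answered directly.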
To achieve the above object, the present invention further provides a question matching system in an intelligent interactive system.
Referring to Fig. 6, which is a schematic structural diagram of the question matching system in an intelligent interactive system according to a first preferred embodiment of the present invention, the system comprises:
A question acquisition module 10, configured to obtain the asked question input by the user;
Specifically, this means obtaining the question entered by the user through the input interface of the intelligent interactive system. When the user wants an answer from the intelligent interactive system, the user enters a question on the client; the question may be in audio, text or image form. If the question is in audio or image form, the intelligent interactive system converts it to text to facilitate the subsequent search for the best-matching question. The input interface may be a client APP, and the method may be carried out on a server, which may be a Web server or another type of server, such as an APP server.
A question preprocessing module 20, configured to perform word segmentation, stop-word removal and query expansion on the asked question to obtain the index terms of the asked question;
Specifically, word segmentation means dividing the asked question into a plurality of words; the segmentation may be performed by calling ICTCLAS, the word segmentation tool of the Chinese Academy of Sciences. Stop-word removal means removing stop words: a stop-word dictionary may be built in advance, and any word of the asked question that matches it is removed. The stop words removed include polite expressions (such as "may I ask"), auxiliary words, and other words that occur frequently but contribute little to the meaning of the asked question. Query expansion mainly means synonym expansion (for example, "doctor" and "physician", or "father" and "dad"); the synonym thesaurus "Tongyici Cilin" may be used to expand each word of the segmented question with its synonyms. After word segmentation, stop-word removal and query expansion, the index terms that are essentially relevant to the asked question are obtained. For example, the asked question "What is your company's film-reading business process like?" is segmented into "you", "company", "film reading", "business", "process", "is", "what" and a trailing auxiliary word; the auxiliary word is removed as a stop word, and query expansion is then performed word by word. "You" is expanded with its synonyms; "company" is expanded to "shop", "firm", "trading company", "foreign firm", "commission agent", "business office", "cooperative society", "businessman", "enterprise", etc.; "film reading" is expanded to "reading medical image files"; "business" is expanded to "work", "operation", "affairs", "matters", etc.; "process" is expanded to "pipeline", "technological process", etc.; "is" is expanded to "correct", "right", "so", "errorless", etc.; and "what" is expanded to "how", "why", "what kind of", "how about", etc. The words "you", "company", "film reading", "business", "process", "is" and "what", together with the synonyms expanded from each word, are taken as the index terms.
A candidate question matching module 30, configured to match, from the question-and-answer base and according to a preset index file, candidate questions relevant to the index terms of the asked question;
Specifically, after the question preprocessing module 20 has obtained the index terms of the asked question, candidate questions relevant to those index terms are matched from the question-and-answer base according to the preset index file. The purpose of obtaining candidate questions is to confine subsequent, more costly processing such as similarity calculation to a smaller set of questions. Taking the index terms of the asked question as the basic unit, the 10 questions with the largest number of words overlapping with the asked question are retrieved from the preset index file as candidate questions (this number is only an example and may be set according to the needs of the actual intelligent interactive system). Taking the asked question "What is your company's film-reading business process like?" again as an example, the index terms "you", "company", "film reading", "business", "process", "is" and "what", together with the synonyms expanded from each word, are used to match, according to the preset index file, candidate questions such as "What is the approximate process of the film-reading business?", "What kind of company is the film-reading company vRad?", "What does your business do?", "When did your film-reading business start?", "What company are you?", "How is the diagnosis part of the film-reading business implemented?", "What is the full name of the film-reading business?" and "What is the service platform of the film-reading business?". If no candidate question relevant to the index terms of the asked question is found in the preset index file, the asked question is added to the question-and-answer base and a prompt indicating an empty match is output.
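The overlap-based candidate retrieval described above can be sketched as follows. The sketch assumes the index file can be modelled as a plain dictionary from index term to the set of question identifiers containing it; the function name match_candidates and the tie-breaking by identifier are hypothetical choices, not part of the invention.

```python
from collections import Counter

def match_candidates(terms, index, top_n=10):
    """Return up to top_n question ids sharing the most index terms with the query.

    terms: index terms of the asked question (after expansion)
    index: {term: set of question ids} (assumed model of the index file)
    """
    overlap = Counter()
    for term in set(terms):
        for qid in index.get(term, ()):
            overlap[qid] += 1
    # Most overlapping terms first; ties broken by id so the result is stable.
    ranked = sorted(overlap, key=lambda qid: (-overlap[qid], qid))
    return ranked[:top_n]
```

An empty result corresponds to the case in which no candidate question is found and the asked question is added to the question-and-answer base.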
A similar-question matching module 40, configured to calculate the similarity between the asked question and the questions in the similar-question sets corresponding to the candidate questions;
Specifically, the similarity is calculated between the asked question "What is your company's film-reading business process like?" and the questions in the similar-question sets corresponding to the candidate questions "What is the approximate process of the film-reading business?", "What kind of company is the film-reading company vRad?", "What does your business do?", "When did your film-reading business start?", "What company are you?", "How is the diagnosis part of the film-reading business implemented?", "What is the full name of the film-reading business?" and "What is the service platform of the film-reading business?". In an actual design, the similar-question sets may be established in advance when the intelligent interactive system is designed, or may be established and refined while the system is in use. Each similar-question set comprises at least the question itself and similar questions having the same answer. That is, once the candidate questions have been matched, the similarity between the asked question and every question in the similar-question sets corresponding to all of the candidate questions is calculated. To calculate the similarity between sentences, the similarity between words may first be calculated using a HowNet-based word similarity algorithm, and the sentence similarity is then calculated from the word similarities. Methods for calculating the similarity between sentences belong to the prior art and are not described in detail here.
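The two-stage calculation described above (word similarity first, sentence similarity from it) can be sketched as follows. The text leaves the exact formula to the prior art, so this is an assumption throughout: WORD_SIM is a toy table standing in for a HowNet-based word similarity, and the symmetric average of best word matches is one common way of combining word scores, not necessarily the one intended.

```python
# Toy word-similarity table (hypothetical stand-in for a HowNet-based measure).
WORD_SIM = {frozenset(("company", "firm")): 0.8}

def word_sim(a, b):
    """1.0 for identical words, otherwise a table lookup (0.0 if unknown)."""
    return 1.0 if a == b else WORD_SIM.get(frozenset((a, b)), 0.0)

def sentence_sim(s1, s2):
    """Average, in both directions, of each word's best match in the other sentence."""
    if not s1 or not s2:
        return 0.0
    def directed(src, dst):
        return sum(max(word_sim(w, v) for v in dst) for w in src) / len(src)
    return (directed(s1, s2) + directed(s2, s1)) / 2

def score_candidates(query, similar_sets):
    """Best similarity of the query against each candidate's similar-question set.

    query:        list of words of the asked question
    similar_sets: {candidate id: list of segmented questions} (assumed layout)
    """
    return {
        cid: max((sentence_sim(query, q) for q in questions), default=0.0)
        for cid, questions in similar_sets.items()
    }
```

Keeping only the best score per candidate is enough for the output rule, which compares similarities against a preset range.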
A result output module 50, configured to output, according to the calculated similarity and a preset rule, the answer corresponding to the question that matches the asked question.
Specifically, according to the similarity calculated by the similar-question matching module 40, the answer corresponding to the question that matches the asked question is output according to the preset rule. The preset rule may be an output rule, based on whether the similarity falls within a preset range, that is designed according to the requirements of the intelligent interactive system in its specific field of application and the completeness of the question-and-answer base. Outputting the answer according to the preset rule comprises first determining, according to the calculated similarity, the candidate question that matches the asked question, and then outputting the answer corresponding to that candidate question. When the matching candidate question is determined, it is judged, according to the preset similarity range, whether the similar-question sets contain a question whose similarity to the asked question is within the preset range. If so, the answer of the candidate question corresponding to the similar-question set containing that question is output; if not, the asked question is added directly to the question-and-answer base.
By performing word segmentation, stop-word removal and query expansion on the user's asked question, the embodiment of the present invention prevents weakly relevant words in complex sentences from affecting the question matching result. By first matching candidate questions relevant to the asked question from the question-and-answer base according to the preset index file, it reduces the amount of computation required for question matching. By then calculating the similarity between the asked question and the questions in the similar-question sets corresponding to the candidate questions, it avoids the situation in which a question cannot be matched because the same question can be asked in many different ways, improving the question matching result. Finally, according to the calculated similarity, the answer corresponding to the question that matches the asked question is output according to the preset rule. The overall scheme improves the accuracy of question matching results in the intelligent interactive system.
Referring to Fig. 7, which is a schematic structural diagram of the question matching system in an intelligent interactive system according to a second preferred embodiment of the present invention, the system further comprises:
A similar-question set expansion module 60, configured to expand each question in the question-and-answer base to form a similar-question set, the similar-question set comprising at least the question itself and similar questions having the same answer.
Specifically, to improve the accuracy of matching results for asked questions when the system is in use, a similar-question set is established in advance, when the intelligent interactive system is designed, for every question in the question-and-answer base; each similar-question set comprises at least the question itself and similar questions having the same answer. For example, for the matched candidate question "What is the approximate process of the film-reading business?", a similar-question set is established consisting of "How is the whole film-reading business process completed?", "What are the main parts of the film-reading business process?", "How does the film-reading business operate?", "How is the film-reading business completed?" and the candidate question "What is the approximate process of the film-reading business?" itself. Likewise, similar-question sets are established respectively for the other matched candidate questions "What kind of company is the film-reading company vRad?", "What does your business do?", "When did your film-reading business start?", "What company are you?", "How is the diagnosis part of the film-reading business implemented?", "What is the full name of the film-reading business?" and "What is the service platform of the film-reading business?".
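One possible in-memory layout for the similar-question sets is sketched below. The names build_similar_sets and paraphrases, and the use of question text as the dictionary key, are hypothetical; the only property taken from the description is that every set contains at least the question itself.

```python
def build_similar_sets(qa_base, paraphrases):
    """Build one similar-question set per question in the base.

    qa_base:     {question: answer} (assumed layout)
    paraphrases: {question: list of rephrasings with the same answer}
    Every set starts with the question itself, so it is never empty.
    """
    similar_sets = {}
    for question in qa_base:
        members = [question]
        for rephrasing in paraphrases.get(question, []):
            if rephrasing not in members:
                members.append(rephrasing)
        similar_sets[question] = members
    return similar_sets
```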
Referring to Fig. 8, which is a schematic structural diagram of the question matching system in an intelligent interactive system according to a third preferred embodiment of the present invention, the system further comprises:
An index file construction module 70, configured to establish the question-and-answer base, perform word segmentation on the questions in the question-and-answer base to obtain index terms, and establish an index file recording the correspondence between the index terms and the questions.
Specifically, a question-answer database, i.e. the question-and-answer base, is established in advance according to the requirements of the intelligent interactive system. In a preferred embodiment, to avoid data redundancy, the questions and answers in the question-and-answer base are in one-to-one correspondence. After the question-and-answer base is established, word segmentation is performed on all questions in it. For example, the question "What is your company's film-reading business process like?" in the base is segmented into "you", "company", "film reading", "business", "process", "is", "what" and a trailing auxiliary word, and the other questions in the base are segmented in the same way. The segmentation may be performed by calling ICTCLAS, the word segmentation tool of the Chinese Academy of Sciences, which yields the segmented sentence. The open-source full-text search toolkit Lucene is then called with the segmented sentence and the original question "What is your company's film-reading business process like?" as input parameters to obtain the index file recording the correspondence between the index terms and the questions. In the candidate question matching module 30, candidate questions relevant to the asked question are matched from the question-and-answer base through the index terms of the asked question, according to the index file established here. The candidate questions are the 10 questions with the largest number of words overlapping with the asked question, retrieved from the index file with the index terms of the asked question as the basic unit (this number is only an example and may be set according to the needs of the actual intelligent interactive system). This is therefore only a preliminary matching process. Because the index terms of the asked question have undergone word segmentation, stop-word removal and query expansion, the correctness of the question matching result is improved.
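The correspondence between index terms and questions can be pictured as an inverted index. The sketch below is a plain-dictionary stand-in, not Lucene: the function name build_index, the dictionary layout and the whitespace segmenter used in the usage note are assumptions for illustration only.

```python
from collections import defaultdict

def build_index(qa_base, segment):
    """Map every index term to the set of questions that contain it.

    qa_base: {question: answer} (assumed layout)
    segment: callable turning a question into a list of words
    """
    index = defaultdict(set)
    for question in qa_base:
        for term in segment(question):
            index[term].add(question)
    return dict(index)
```

For example, build_index({"company process": "A1"}, str.split) yields an index in which both "company" and "process" point to the question "company process".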
Referring to Fig. 9, which is a schematic structural diagram of the result output module shown in Fig. 6 according to a fourth preferred embodiment of the present invention, the result output module 50 comprises:
A complete-match output module 501, configured to output, when the similar-question sets contain a question whose similarity to the asked question is greater than the upper limit of the preset range, the answer corresponding to the candidate question with the greatest similarity to the asked question;
Specifically, the similarity between the asked question and the questions in the similar-question sets corresponding to all matched candidate questions is calculated. If it is judged that the similar-question sets contain a question whose similarity to the asked question is greater than the upper limit of the preset range, the answer corresponding to the candidate question with the greatest similarity to the asked question is output directly. For example, if the calculated similarity between the asked question "What is your company's film-reading business process like?" and the question "How is the whole film-reading business process completed?" in the similar-question set of the matched candidate question "What is the approximate process of the film-reading business?" is 90.5% (assuming the preset range is 75% to 90%), the answer corresponding to the candidate question "What is the approximate process of the film-reading business?" is output directly.
A similar-match output module 502, configured to output, when the similar-question sets contain a question whose similarity is within the preset range, the candidate questions whose similarity to the asked question is within the preset range for the user to choose from, and to output the answer corresponding to the candidate question selected by the user;
Specifically, if the condition of the complete-match output module 501 is not met, it is judged whether the similar-question sets contain a question whose similarity to the asked question is within the preset range; if so, the candidate questions whose similarity to the asked question is within the preset range are output for the user to choose from. For example, suppose the calculation shows that none of the questions in the similar-question sets of the candidate questions matched with the asked question "What is your company's film-reading business process like?" has a similarity exceeding the upper limit of 90% (assuming the preset range is 75% to 90%). It is then judged whether those similar-question sets contain a question whose similarity is within the preset range; if so, the candidate questions corresponding to the similar-question sets containing such questions are output for the user to choose from. For instance, suppose the similarity between the asked question and the similar question "How is the whole film-reading business process completed?" in the similar-question set of the matched candidate question "What is the approximate process of the film-reading business?" is 81%, and the similarity between the asked question and the similar question "What are the main parts of the film-reading business process?" in the same set is 78%. Because the two questions within the preset range belong to the similar-question set of the same candidate question "What is the approximate process of the film-reading business?", this candidate question is output only once for the user to choose from. Likewise, whenever the similar-question set of another matched candidate question contains a similar question whose similarity to the asked question is within the preset range, that candidate question is output for the user to choose from according to the same principle, and the answer corresponding to the candidate question selected by the user is output. Giving the user sufficient freedom of choice ensures the accuracy of the matching result.
An empty-match output module 503, configured to add the asked question to the question-and-answer base and output a prompt indicating an empty match when the similarities between the asked question and the questions in the similar-question sets are all less than the lower limit of the preset range.
Specifically, if neither the condition of the complete-match output module 501 nor that of the similar-match output module 502 is met, that is, the similarities between the asked question and the questions in the similar-question sets are all less than the lower limit of the preset range, the asked question is added directly to the question-and-answer base. For example, if the calculated similarities between the asked question "What is your company's film-reading business process like?" and the questions in the similar-question sets corresponding to all matched candidate questions are all less than 75% (assuming the preset range is 75% to 90%), the question-and-answer base contains no candidate question that matches the asked question. In this case, the asked question is added to the question-and-answer base to enrich the questions in it, and a prompt indicating an empty match is output. Enriching the question-and-answer base in this way allows a highly similar candidate question to be matched the next time a user asks a similar question, improving the accuracy of the matching result.
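The three output cases handled by modules 501, 502 and 503 can be sketched as a single decision function. The thresholds 0.75 and 0.90 are the example range from the text; the function name decide, the string labels and the input layout (one best similarity per candidate question) are hypothetical.

```python
def decide(scores, lower=0.75, upper=0.90):
    """Choose the output mode from per-candidate best similarities.

    scores: {candidate id: best similarity within that candidate's
             similar-question set} (assumed layout)
    Returns ("answer", [id])  if some similarity exceeds the upper limit,
            ("choose", [ids]) if some similarity lies within the range,
            ("empty", [])     otherwise.
    """
    if scores:
        best = max(scores, key=scores.get)
        if scores[best] > upper:
            return ("answer", [best])
        in_range = [c for c, s in scores.items() if lower <= s <= upper]
        if in_range:
            # Highest similarity first; each candidate is listed only once.
            return ("choose", sorted(in_range, key=lambda c: -scores[c]))
    return ("empty", [])
```

Because the input holds one score per candidate question, a candidate whose set contains several in-range questions (81% and 78% in the example above) is still offered to the user only once.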
In one embodiment, the similar-match output module 502 is further configured to add the asked question to the similar-question set corresponding to the candidate question selected by the user.
Specifically, the similar-match output module 502 outputs candidate questions for the user to choose from. If the candidate questions include the question the user needs, the system obtains the candidate question selected by the user and adds the asked question to the similar-question set corresponding to that candidate question. If they do not, the asked question is added to the question-and-answer base and a prompt indicating an empty match is output.
If the candidate questions include the question the user needs, the system obtains the candidate question selected by the user and adds the asked question to the similar-question set corresponding to that candidate question, thereby enriching the set. Enriching the similar-question set in this way allows a highly similar candidate question to be matched quickly the next time a user asks a similar question, improving the accuracy of the matching result. For example, suppose the candidate questions output for the asked question "What is your company's film-reading business process like?" are "What is the approximate process of the film-reading business?" and "What are the main parts of the film-reading business process?", and the user selects "What is the approximate process of the film-reading business?". The asked question "What is your company's film-reading business process like?" is then added to the similar-question set corresponding to the candidate question "What is the approximate process of the film-reading business?". The next time a user asks "What is your company's film-reading business process like?", the highly similar candidate question "What is the approximate process of the film-reading business?" can be matched accurately, improving the accuracy of the question matching result.
After the user has selected a specific candidate question, the answer corresponding to the selected candidate question is output to the client for the user's reference.
By performing word segmentation, stop-word removal and query expansion on the user's asked question, the specific embodiment of the present invention prevents weakly relevant words in complex sentences from affecting the question matching result. By first matching candidate questions relevant to the asked question from the question-and-answer base according to the preset index file, it reduces the amount of computation required for question matching. By then calculating the similarity between the asked question and the questions in the similar-question sets corresponding to the candidate questions, it avoids the situation in which a question cannot be matched because the same question can be asked in many different ways, improving the question matching result. Finally, according to the calculated similarity, the answer corresponding to the question that matches the asked question is output according to the preset rule. The overall scheme improves the accuracy of question matching results in the intelligent interactive system.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.