CN105955976A - Automatic answering system and method - Google Patents
Automatic answering system and method
- Publication number
- CN105955976A, CN201610237009.3A
- Authority
- CN
- China
- Prior art keywords
- synonym
- vocabulary
- word
- key
- historical record
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3338—Query expansion
Abstract
The invention discloses an automatic answering system and method. The system comprises: a question receiving unit for receiving a question input by a user; a keyword extraction unit for analyzing the question and extracting question keywords; a synonym expansion unit for performing synonym expansion on the question keywords to obtain synonym-expanded question keywords; a search unit for searching a data storage unit for the historical records that best match the synonym-expanded question keywords; a display unit for displaying the historical records to the user; a result receiving unit for receiving the best matching result selected by the user from the historical records and storing it in the data storage unit; and the data storage unit for storing search index data, historical records and best matching results.
Description
Technical field
The present invention relates to the technical field of data processing in computer information systems, and in particular to an automatic answering system and method.
Background technology
In the era of big data, new customer service channels such as SMS, WeChat and microblogs keep emerging, and the volume of text records produced by enterprise services keeps growing. These records typically contain key information such as customers' questions, complaints and suggestions, together with the answers recorded by service staff. If this large body of historical text records could be matched and analyzed so that an optimal response is provided to the end user automatically and within a short time, service quality would improve considerably, which helps establish a good corporate image.
To this end, the common practice at present is for terminal service staff to search the historical service text records with a search tool and to select the answer with the highest relevance as a reference when replying to the end user. This approach has the following limitations. First, the words end users choose when describing products and services are somewhat arbitrary, so the same concept may be expressed with different words. In addition, when recording a question, terminal service staff may write down different expressions because of aliases, wrongly written characters and the like. Both effects reduce the accuracy of customer service text matching and make data processing inefficient. Second, end-user questions cannot be answered automatically, so replies are slow.
Summary of the invention
In view of the shortcomings of existing answering approaches, the present invention proposes an automatic answering system and method. By analyzing past service text, pairs of interchangeable synonyms are extracted. When the terminal system receives a similar end-user question, synonym expansion is first applied to the words in the question, and an answer is given automatically after a matching search, which shortens the system response time and improves data matching accuracy.
To achieve the above purpose, the present invention proposes an automatic answering system comprising: a question receiving unit for receiving a question input by a user; a keyword extraction unit for analyzing the question and extracting question keywords; a synonym expansion unit for performing synonym expansion on the question keywords to obtain synonym-expanded question keywords; a search unit for searching a data storage unit for the historical records that best match the synonym-expanded question keywords; a display unit for displaying the historical records to the user; a result receiving unit for receiving the best matching result selected by the user from the historical records and storing it in the data storage unit; and the data storage unit for storing search index data, historical records and best matching results.
To achieve the above purpose, the present invention also proposes a method for automatic answering using the automatic answering system, comprising: step 1, receiving a question input by a user; step 2, analyzing the question and extracting question keywords; step 3, performing synonym expansion on the question keywords to obtain synonym-expanded question keywords; step 4, searching the data storage unit for the historical records that best match the synonym-expanded question keywords; step 5, displaying the historical records to the user; step 6, receiving the best matching result selected by the user from the historical records and storing it in the data storage unit.
The automatic answering system and method proposed by the present invention can, through analysis and processing, automatically discover near-synonym pairs or synonym pairs related to terminal service text, automatically perform synonym expansion when an end user enters a question so as to improve matching accuracy, and respond automatically so as to improve the efficiency of answering questions.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and form part of this application; they do not limit the invention. In the drawings:
Fig. 1 is a structural diagram of the automatic answering system according to an embodiment of the present invention.
Fig. 2 is a structural diagram of the data storage unit according to an embodiment of the present invention.
Fig. 3 is a diagram of the data structure in which best matching results are stored according to an embodiment of the present invention.
Fig. 4 is a diagram of the data structure in which near-synonym pairs are stored according to an embodiment of the present invention.
Fig. 5 is a diagram of the data structure in which synonym pairs are stored according to an embodiment of the present invention.
Fig. 6 is a structural diagram of the data analysis unit according to an embodiment of the present invention.
Fig. 7 is a diagram of the approximation degree analysis process according to an embodiment of the present invention.
Fig. 8 is a flowchart of the automatic answering method according to an embodiment of the present invention.
Fig. 9 is a flowchart of the near-synonym pair analysis method according to an embodiment of the present invention.
Fig. 10 is a flowchart of the synonym pair analysis method according to an embodiment of the present invention.
Detailed description of the invention
The technical means adopted by the present invention to achieve its intended purpose are further described below with reference to the drawings and preferred embodiments. As used herein, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the system described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 1 is a structural diagram of the automatic answering system according to an embodiment of the present invention. As shown in Fig. 1, the system includes:
a question receiving unit 100 for receiving a question input by a user;
a keyword extraction unit 200 for analyzing the question and extracting question keywords;
a synonym expansion unit 300 for performing synonym expansion on the question keywords to obtain synonym-expanded question keywords;
a search unit 400 for searching the data storage unit 700 for the historical records that best match the synonym-expanded question keywords;
a display unit 500 for displaying the historical records to the user, where the display unit 500 may present the three best-matching historical records for the user to choose from;
a result receiving unit 600 for receiving the best matching result selected by the user from the historical records and storing it in the data storage unit 700;
a data storage unit 700 for storing search index data, historical records and best matching results.
In this embodiment, the keyword extraction unit 200 extracts question keywords as follows:
The question text is segmented into Chinese words, and the weight TF_IDF of each word in the text is calculated. TF reflects how often a word occurs in the current text: the higher the frequency, the greater the weight. IDF reflects how often a word occurs across the whole collection of texts: the lower the frequency, the greater the weight. A certain number of words with the highest TF_IDF values are extracted as the question keywords.
The TF_IDF value is calculated as follows:
$TF\_IDF_{i,j} = TF_{i,j} \times IDF_i$
where $TF\_IDF_{i,j}$ is the weight of word i in question j;
$TF_{i,j} = n_{i,j} / \sum_k n_{k,j}$ is the term frequency of word i in question j, where $n_{i,j}$ is the number of occurrences of word i in question j and $\sum_k n_{k,j}$ is the total number of occurrences of all words k in question j;
$IDF_i = \log\left(|D| / |\{j : t_i \in d_j\}|\right)$ is the inverse document frequency of word i, where $|D|$ is the total number of questions and $|\{j : t_i \in d_j\}|$ is the number of questions $d_j$ that contain the word $t_i$.
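The keyword extraction step can be illustrated with a minimal Python sketch. It assumes the questions have already been segmented into words (Chinese word segmentation is outside the sketch), uses the natural logarithm without smoothing, and the function name `tf_idf_keywords`, the parameter `top_n` and the sample corpus are made up for illustration; none of them comes from the patent.

```python
import math
from collections import Counter

def tf_idf_keywords(questions, j, top_n=2):
    """Return the top_n words of questions[j], ranked by TF-IDF.

    `questions` is a list of already-segmented questions (lists of words).
    """
    counts = Counter(questions[j])
    total = sum(counts.values())          # sum_k n_{k,j}
    n_docs = len(questions)               # |D|
    scores = {}
    for word, n_ij in counts.items():
        tf = n_ij / total                 # TF_{i,j}
        df = sum(1 for q in questions if word in q)   # |{j : t_i in d_j}|
        idf = math.log(n_docs / df)       # IDF_i (natural log, an assumption)
        scores[word] = tf * idf
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]

# Hypothetical, pre-segmented sample questions.
corpus = [
    ["atm", "swallowed", "card", "how", "to", "recover", "card"],
    ["how", "to", "reset", "password"],
    ["how", "to", "open", "account"],
]
print(tf_idf_keywords(corpus, 0))   # ['card', 'atm']
```

Words such as "how" and "to" occur in every sample question, so their IDF is zero and they are never selected as keywords.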
Fig. 2 is a structural diagram of the data storage unit in this embodiment. As shown in Fig. 2, the data storage unit 700 includes a matching record storage module 710, a search index storage module 720, a terminal service record storage module 730, a near-synonym pair storage module 740 and a synonym pair storage module 750, where:
The matching record storage module 710 stores best matching results. Fig. 3 shows the data structure of a stored best matching result, which records the session ID, the end-user number, the historical question, the question words and the matching time.
The search index storage module 720 stores search index data. An inverted index is built over the historical records for the searcher to query, and it is updated incrementally as the historical records grow.
The terminal service record storage module 730 stores historical records, which include historical questions and the answer text generated by means of near-synonym pairs or synonym pairs.
The near-synonym pair storage module 740 stores near-synonym pairs, each consisting of a synonym-expanded question keyword and a near-synonym word. Fig. 4 shows the data structure of a stored near-synonym pair, which contains the question keyword, the near-synonym word and the degree of association.
The synonym pair storage module 750 stores synonym pairs, each consisting of a synonym-expanded question keyword and a synonym word. Fig. 5 shows the data structure of a stored synonym pair, which contains the question keyword and the synonym word.
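The inverted index kept by the search index storage module can be pictured with a small Python sketch. The class name `InvertedIndex`, its methods and the ranking rule (number of shared keywords) are assumptions chosen for illustration; the patent does not specify how matching degree is scored.

```python
from collections import defaultdict

class InvertedIndex:
    """Maps each word to the ids of the historical records that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_record(self, record_id, words):
        # Incremental update: a new historical record only touches the
        # posting lists of its own words.
        for word in words:
            self.postings[word].add(record_id)

    def search(self, keywords):
        """Return record ids ordered by how many keywords they contain."""
        hits = defaultdict(int)
        for word in set(keywords):
            for record_id in self.postings.get(word, ()):
                hits[record_id] += 1
        return sorted(hits, key=lambda rid: (-hits[rid], rid))

index = InvertedIndex()
index.add_record(1, ["atm", "card", "swallowed"])
index.add_record(2, ["self-service", "atm", "location"])
print(index.search(["atm", "card"]))   # [1, 2]
```

Adding a third record later requires no rebuild, which is the property the incremental update in module 720 relies on.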
Further, with reference to Fig. 1, the automatic answering system also includes a data analysis unit 800 for analyzing the question keywords together with the best matching results, establishing near-synonym pairs or synonym pairs according to the analysis result, and storing them in the data storage unit 700.
Fig. 6 is a structural diagram of the data analysis unit. As shown in Fig. 6, the data analysis unit 800 includes a near-synonym analysis module 810, a retrieval sequence analysis module 820, a pinyin analysis module 830, a co-occurrence analysis module 840 and a click feature analysis module 850, where:
The near-synonym analysis module 810 performs association analysis on the question keywords and the best matching results and, after obtaining near-synonym pairs with a certain degree of association, stores them in the near-synonym pair storage module 740. Such near-synonyms may be words that are conceptually close, or parent and child concepts in a hypernym-hyponym relation.
The near-synonym relation is calculated as follows:
$p_{uj} = \sum_{i \in N(u) \cap S(j,K)} w_{ji} \, r_{ui}$
where $p_{uj}$ is the degree of approximation between question keyword u and best matching result j, $N(u)$ is the set of best matching results matched by the question keyword, $S(j,K)$ is the set of the K other best matching results that have the highest number of matches with best matching result j, $w_{ji}$ is the number of matches between best matching result j and a given word i, and $r_{ui}$ is the number of matches between question keyword u and a given word i.
The degrees of approximation obtained for the word pairs are then normalized using
$y = (x - MinValue) / (MaxValue - MinValue)$,
and near-synonym pairs whose degree of approximation exceeds a set threshold are stored in the near-synonym pair storage module 740.
To explain the function of the near-synonym analysis module 810 more clearly, an example is given below.
Fig. 7 is a diagram of the approximation degree analysis process according to an embodiment of the present invention. As shown in Fig. 7, the question word "ATM" was matched by end users to historical question 1 thirty times and to historical question 2 ten times. Historical question 1 was also matched by the word "self-service facility" 6 times and by "self-service ATM" 8 times. Historical question 2 was also matched by the word "self-service ATM" 12 times and by "card swallowed" 10 times.
Calculating with the above formula and normalizing gives the following degrees of approximation for the word pairs:
ATM / self-service facility: 0.3;
ATM / self-service ATM: 1;
ATM / card swallowed: 0.
If the approximation threshold is 0.2, the pairs "ATM / self-service ATM" and "ATM / self-service facility" are judged to be near-synonyms.
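The Fig. 7 figures can be checked with a short Python sketch. It assumes, in line with step 107 of the near-synonym analysis flow (p = w × r, accumulated over repeated best matching results), that the raw degree for a candidate word is the sum over shared historical questions of the two match counts, followed by min-max normalization. The function name and the dictionary layout are hypothetical.

```python
def near_synonym_degrees(keyword_matches, record_matches):
    """Normalized degree of approximation for each candidate word.

    keyword_matches: {historical question: w}, how often the question
        keyword was matched to that historical question.
    record_matches: {historical question: {word: r}}, how often other
        words were matched to that historical question.
    """
    raw = {}
    for record, w in keyword_matches.items():
        for word, r in record_matches.get(record, {}).items():
            raw[word] = raw.get(word, 0) + w * r   # accumulate p = w * r
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:                                   # avoid division by zero
        return {word: 0.0 for word in raw}
    return {word: (p - lo) / (hi - lo) for word, p in raw.items()}

# Match counts taken from the Fig. 7 example for the keyword "ATM".
atm_matches = {"question 1": 30, "question 2": 10}
other_matches = {
    "question 1": {"self-service facility": 6, "self-service ATM": 8},
    "question 2": {"self-service ATM": 12, "card swallowed": 10},
}
degrees = near_synonym_degrees(atm_matches, other_matches)
print({word: round(d, 2) for word, d in degrees.items()})
# {'self-service facility': 0.31, 'self-service ATM': 1.0, 'card swallowed': 0.0}
```

The raw values are 30×6 = 180, 30×8 + 10×12 = 360 and 10×10 = 100, so normalization gives (180 − 100)/260 ≈ 0.3, 1 and 0, which agrees with the values stated for Fig. 7.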
The retrieval sequence analysis module 820 reads near-synonym pairs from the near-synonym pair storage module 740 and analyzes the probability that the two words of a pair are substituted for each other within a time-ordered set of records in the matching record storage module 710. When an end user enters a question word and does not obtain a satisfactory result, the user often rewrites the question description shortly afterwards using an interchangeable word with the same meaning. Near-synonym pairs whose probability exceeds a set threshold can therefore be judged to be synonym pairs and stored in the synonym pair storage module 750.
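One way to estimate this replacement probability is sketched below in Python. It follows the wording of step 204 of the synonym analysis flow (within one session, the probability of searching the second word after the first), but the exact estimator is not given in the text, so the ratio used here, the function name `rewrite_probability` and the session layout are assumptions.

```python
def rewrite_probability(sessions, first, second):
    """Estimate P(second is searched later | first was searched) per session.

    `sessions` is a list of sessions; each session is a time-ordered list
    of the question words searched in it.
    """
    with_first = 0
    followed = 0
    for queries in sessions:
        if first in queries:
            with_first += 1
            position = queries.index(first)
            if second in queries[position + 1:]:
                followed += 1
    return followed / with_first if with_first else 0.0

# Hypothetical session logs.
sessions = [
    ["ATM", "self-service ATM"],
    ["ATM"],
    ["self-service ATM", "ATM"],
    ["ATM", "cash machine", "self-service ATM"],
]
print(rewrite_probability(sessions, "ATM", "self-service ATM"))   # 0.5
```

Only sessions in which the second word comes after the first are counted, which reflects the idea that the user rewrote the question after an unsatisfactory first result.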
The pinyin analysis module 830 reads near-synonym pairs from the near-synonym pair storage module 740 and analyzes the pronunciation similarity of each pair. To type quickly, end users are quite likely to enter a wrong character with the same pronunciation, for example typing the homophone that means "meagre" in place of "microblog"; the intended meaning is the same. Homophonic near-synonym pairs whose pronunciation similarity exceeds a set threshold can therefore be judged to be synonym pairs and stored in the synonym pair storage module 750. In the pronunciation similarity calculation, $Sim_{dis}$ is the pronunciation similarity, $S_{w_i}$ is the pronunciation string of word $w_i$, $|S_{w_i}|$ is the length of that string, $i = 1, 2$, and $minDis(S_{w_1}, S_{w_2})$ is the minimum edit distance between the two pronunciation strings.
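A pronunciation similarity built from these quantities can be sketched in Python as follows. The sketch assumes one common normalization, 1 − minDis(S_w1, S_w2) / max(|S_w1|, |S_w2|); the text names the symbols but this particular combination is an assumption, as are the function names. Converting Chinese words to pinyin strings is outside the sketch.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions (Levenshtein)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b)  # substitute or keep
            ))
        previous = current
    return previous[-1]

def pronunciation_similarity(pinyin_1, pinyin_2):
    """1 - edit distance / longer length (assumed normalization)."""
    longest = max(len(pinyin_1), len(pinyin_2))
    if longest == 0:
        return 1.0
    return 1 - edit_distance(pinyin_1, pinyin_2) / longest

# Two homophones share the same pinyin string, so the similarity is 1.0.
print(pronunciation_similarity("weibo", "weibo"))   # 1.0
```

Under this normalization the score lies between 0 and 1, so a single threshold can be applied regardless of word length.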
The co-occurrence analysis module 840 reads near-synonym pairs from the near-synonym pair storage module 740 and analyzes the degree to which the two words of a pair co-occur in the terminal service record text stored in the terminal service record storage module 730. When describing a problem, an end user may refer to the same thing in different ways at different points, for example using the full name first and an abbreviation in later mentions. Therefore, if the co-occurrence degree reaches a set threshold, the near-synonym pair can be judged to be a synonym pair and stored in the synonym pair storage module 750. In the co-occurrence calculation, $w_{ij}$ is the co-occurrence degree of word i and word j, $N(i)$ is the set of historical records in which question word i appears, $N(i) \cap N(j)$ is the set of historical records in which word i and word j both appear, and $|N(i)|$ is the number of historical records in which question word i appears.
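A co-occurrence degree based on these sets can be sketched in Python. The sketch assumes the ratio |N(i) ∩ N(j)| / |N(i)|, which uses only the quantities defined in the text; whether the original formula normalizes in exactly this way is an assumption, and the function name and sample records are hypothetical.

```python
def cooccurrence_degree(records, word_i, word_j):
    """|N(i) & N(j)| / |N(i)| over a dict {record id: set of words}."""
    n_i = {rid for rid, words in records.items() if word_i in words}
    n_j = {rid for rid, words in records.items() if word_j in words}
    if not n_i:
        return 0.0
    return len(n_i & n_j) / len(n_i)

# Hypothetical service records, already reduced to word sets.
records = {
    1: {"automated teller machine", "ATM", "card"},
    2: {"ATM", "limit"},
    3: {"automated teller machine", "ATM"},
    4: {"password"},
}
print(cooccurrence_degree(records, "automated teller machine", "ATM"))   # 1.0
```

Note that this ratio is not symmetric: every record with the full name also contains the abbreviation, but only two of the three records containing "ATM" also contain the full name.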
The click feature analysis module 850 reads near-synonym pairs from the near-synonym pair storage module 740 and analyzes them against the terminal service record text stored in the terminal service record storage module 730. It considers the case where word i appears in the query but not in the title of a historical record, while word j appears in the title of the historical record, and calculates the mutual exchange ratio of word i and word j. In that calculation, $C_{ij}$ is the mutual exchange ratio, $wt_i$ denotes that word i appears in the title, and $|wt_i wq_j|$ is the number of cases in which word i appears in the title and word j appears in the query.
Word pairs whose mutual exchange ratio exceeds a set threshold are stored in the synonym pair storage module 750.
The automatic answering system proposed by the present invention answers automatically on the basis of synonym expansion. Its data analysis unit analyzes a large number of historical end-user questions and the corresponding best responses to obtain correlated near-synonym pairs, processes those pairs further to filter out usable synonym pairs, and stores the result in the synonym pair storage module. At service time, the question words are first extracted from the question and expanded by the synonym expansion unit to improve vocabulary coverage; the best historical response records are then retrieved according to the degree of word association and returned to the end user.
Based on the same inventive concept, an embodiment of the present invention also provides an automatic answering method, as described in the following embodiment. Since the principle by which the method solves the problem is similar to that of the system above, the implementation of the method may refer to the implementation of the system, and repeated details are omitted.
Fig. 8 is a flowchart of the automatic answering method according to an embodiment of the present invention. The method may be carried out by the automatic answering system described above and includes:
Step S1: receive a question input by a user.
Step S2: analyze the question and extract question keywords.
Step S3: perform synonym expansion on the question keywords to obtain synonym-expanded question keywords.
Step S4: search the data storage unit for the historical records that best match the synonym-expanded question keywords, where a historical record includes a historical question and the answer text generated by means of near-synonym pairs or synonym pairs.
Step S5: display the historical records to the user; the three best-matching historical records may be shown.
Step S6: receive the best matching result selected by the user from the historical records and store it in the data storage unit.
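Steps S3 to S5 can be sketched in Python under simplifying assumptions: keywords are taken as given (step S2 is not repeated here), the matching degree is the number of shared words, and the function names, the `history` layout and the sample answers are invented for illustration only.

```python
def expand_with_synonyms(keywords, synonym_pairs):
    """Step S3: add the partner of every keyword that has a stored synonym."""
    expanded = list(keywords)
    for a, b in synonym_pairs:
        if a in keywords and b not in expanded:
            expanded.append(b)
        if b in keywords and a not in expanded:
            expanded.append(a)
    return expanded

def top_matches(expanded, history, limit=3):
    """Steps S4 and S5: rank historical records and keep the best `limit`.

    history: {record id: (question words, answer text)}.
    """
    scored = []
    for record_id, (question_words, answer) in history.items():
        score = len(set(expanded) & set(question_words))
        if score:
            scored.append((score, record_id, answer))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(record_id, answer) for _, record_id, answer in scored[:limit]]

# Hypothetical stored data.
synonym_pairs = [("ATM", "self-service ATM")]
history = {
    1: (["self-service ATM", "swallowed", "card"], "Visit the branch with your ID."),
    2: (["password", "reset"], "Use the reset link."),
    3: (["ATM", "location"], "See the branch locator."),
}
expanded = expand_with_synonyms(["ATM", "card"], synonym_pairs)
print(expanded)                        # ['ATM', 'card', 'self-service ATM']
print(top_matches(expanded, history))
# [(1, 'Visit the branch with your ID.'), (3, 'See the branch locator.')]
```

Without the expansion, record 1 would share only "card" with the query and tie with record 3; with it, the record that uses the synonym ranks first, which is the effect the method aims for.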
Fig. 9 is a flowchart of the near-synonym pair analysis method. As shown in Fig. 9, the method includes:
Step 101: obtain the synonym-expanded question keywords and the corresponding best matching results.
Step 102: read the synonym-expanded question keywords one by one.
Step 103: count the number of matches w between the synonym-expanded question keyword and the best matching result.
Step 104: read the best matching results one by one.
Step 105: look up the historical records that matched the best matching result, and read them one by one.
Step 106: count the number of times r that the historical record matched the best matching result.
Step 107: calculate the near-synonym degree between the words, p = w × r; if a repeated best matching result is encountered, accumulate p.
Step 108: read the near-synonym judgment threshold s; if p > s, store the words as a near-synonym pair.
Step 109: determine whether this is the last historical record; if so, go to step 110, otherwise repeat from step 105.
Step 110: determine whether this is the last best matching result; if so, go to step 111, otherwise repeat from step 104.
Step 111: determine whether this is the last synonym-expanded question keyword; if so, the analysis ends, otherwise repeat from step 102.
Fig. 10 is a flowchart of the synonym pair analysis method. As shown in Fig. 10, the method includes:
Step 201: read the near-synonym pairs one by one.
Step 202: calculate the pronunciation approximation degree between the pinyin strings of the near-synonym pair.
Step 203: determine whether the pronunciation approximation degree exceeds a set threshold; if so, go to step 210, otherwise continue with step 204.
Step 204: according to the retrieval sequence, calculate, within the same session, the conditional probability that the second word is searched given that the first word has been searched.
Step 205: determine whether the conditional probability exceeds a set threshold; if so, go to step 210, otherwise continue with step 206.
Step 206: analyze the co-occurrence degree of the first word and the second word.
Step 207: determine whether the co-occurrence degree exceeds a set threshold; if so, go to step 210, otherwise continue with step 208.
Step 208: analyze the click feature of the two words.
Step 209: determine whether the click feature exceeds a set threshold; if so, go to step 210, otherwise end the synonym analysis and judge the pair not to be synonyms.
Step 210: store the synonym pair.
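The decision cascade of steps 202 to 210 can be summarized in a short Python sketch. It takes the four scores as already computed and only shows the order of the checks; the function name, the dictionary keys and the threshold values are hypothetical.

```python
CHECK_ORDER = ("pronunciation", "retrieval", "cooccurrence", "click")

def classify_pair(scores, thresholds):
    """Return (is_synonym, deciding_check) for one near-synonym pair.

    The checks run in the order of Fig. 10; the first score that exceeds
    its threshold accepts the pair, and later checks are skipped.
    """
    for check in CHECK_ORDER:
        if scores[check] > thresholds[check]:
            return True, check
    return False, None

# Hypothetical thresholds and scores for one pair.
thresholds = {"pronunciation": 0.9, "retrieval": 0.5, "cooccurrence": 0.6, "click": 0.4}
scores = {"pronunciation": 0.2, "retrieval": 0.7, "cooccurrence": 0.1, "click": 0.0}
print(classify_pair(scores, thresholds))   # (True, 'retrieval')
```

Because the checks are combined with a logical OR, a pair needs to pass only one of them to be stored as a synonym pair; in a full implementation each score could be computed lazily so that the later analyses are skipped once a pair is accepted.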
The automatic answering system and method proposed by the present invention can, through analysis and processing, automatically discover near-synonym pairs or synonym pairs related to terminal service text, automatically perform synonym expansion when an end user enters a question so as to improve matching accuracy, and respond automatically so as to improve the efficiency of answering questions.
The specific embodiments described above further explain the purpose, technical solutions and beneficial effects of the present invention in detail. It should be understood that the foregoing are only specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (13)
1. An automatic answering system, characterized in that the system comprises:
a question receiving unit for receiving a question input by a user;
a keyword extraction unit for analyzing the question and extracting question keywords;
a synonym expansion unit for performing synonym expansion on the question keywords to obtain synonym-expanded question keywords;
a search unit for searching a data storage unit for the historical records that best match the synonym-expanded question keywords;
a display unit for displaying the historical records to the user;
a result receiving unit for receiving the best matching result selected by the user from the historical records and storing the best matching result in the data storage unit;
the data storage unit for storing search index data, historical records and best matching results.
2. The system according to claim 1, characterized in that the keyword extraction unit analyzing the question and extracting question keywords comprises:
segmenting the question text into Chinese words and calculating the weight TF_IDF of each word in the text, where TF reflects how often a word occurs in the current text, a higher frequency giving a greater weight, and IDF reflects how often a word occurs across the whole collection of texts, a lower frequency giving a greater weight; and extracting a certain number of words with the highest TF_IDF values as the question keywords;
the TF_IDF value being calculated as follows:
$TF\_IDF_{i,j} = TF_{i,j} \times IDF_i$
where $TF\_IDF_{i,j}$ is the weight of word i in question j;
$TF_{i,j} = n_{i,j} / \sum_k n_{k,j}$ is the term frequency of word i in question j, where $n_{i,j}$ is the number of occurrences of word i in question j and $\sum_k n_{k,j}$ is the total number of occurrences of all words k in question j;
$IDF_i = \log\left(|D| / |\{j : t_i \in d_j\}|\right)$ is the inverse document frequency of word i, where $|D|$ is the total number of questions and $|\{j : t_i \in d_j\}|$ is the number of questions $d_j$ that contain the word $t_i$.
3. The system according to claim 1, characterized in that the data storage unit comprises:
a matching record storage module for storing best matching results;
a search index storage module for storing search index data, wherein an inverted index is built over the historical records for the searcher to query and is updated incrementally as the historical records grow;
a terminal service record storage module for storing historical records, the historical records including historical questions and the answer text generated by means of near-synonym pairs or synonym pairs;
a near-synonym pair storage module for storing near-synonym pairs, each consisting of a synonym-expanded question keyword and a near-synonym word;
a synonym pair storage module for storing synonym pairs, each consisting of a synonym-expanded question keyword and a synonym word.
4. The system according to claim 3, characterized in that the system further comprises a data analysis unit for analyzing the question keywords together with the best matching results, establishing near-synonym pairs or synonym pairs according to the analysis result, and storing them in the data storage unit.
System the most according to claim 4, it is characterised in that described data analysis unit includes:
Module analyzed near synonym, and for key to the issue vocabulary and best matching result degree of being associated analysis, nearly justice is closed
The computing formula of system is as follows:
Wherein, pujProblem of representation key vocabularies u and the degree of approximation of best matching result j, N (u) is that problem is closed
The best matching result set that keyword remittance is mated, S (j, K) is its highest with best matching result j matching times
Its K best matching result set, wjiIt is best matching result j and the matching times of a number of vocabulary i, rui
It it is the key to the issue vocabulary u matching times to a number of vocabulary i;
It is normalized calculating the degree of approximation of word pair obtained, near more than a setting threshold value of degree of approximation
Justice word stores near synonym to memory module.
System the most according to claim 5, it is characterised in that described data analysis unit also includes: retrieval
Sequence analysis module, near synonym to memory module read near synonym pair, and analyze described near synonym to
Join the probability being replaced use in record memory module in a time series set, by probability more than a setting threshold value
Near synonym, to being judged to synonym pair, are stored in synonym to memory module.
System the most according to claim 5, it is characterised in that described data analysis unit also includes: phonetic
Analyze module, be used near synonym memory module reads near synonym pair, and analyze the pronunciation similarity of near synonym pair,
By pronunciation similarity more than one setting threshold value unisonance near synonym to being judged to synonym pair, be stored in synonym to storage
Module;Wherein, pronunciation calculating formula of similarity is as follows:
Wherein, SimdisRepresent pronunciation similarity, SwiRepresent the pronunciation character string of wi, | Swi| represent wi pronunciation character
The length of string, i=1,2, minDis (Sw1,Sw2) represent smallest edit distance.
System the most according to claim 5, it is characterised in that described data analysis unit also includes: co-occurrence
Analyze module, be used near synonym memory module reads near synonym pair, and analyze near synonym in terminal service note
The size of co-occurrence degree in the terminal service recording text of address book stored module stores, if co-occurrence degree reaches a setting threshold
Value, it is determined that near synonym, to for synonym pair, are stored in synonym to memory module;Wherein, being total to of two vocabulary is calculated
Existing degree formula is as follows:
Wherein, wijRepresenting the co-occurrence degree of vocabulary i and vocabulary j, N (i) represents the historical record of the vocabulary i that goes wrong
Set;N (i) ∩ N (j) represents historical record set vocabulary i and vocabulary j simultaneously occur;| N (i) | represents the word that goes wrong
The quantity of the historical record set of remittance i.
9. The system according to claim 5, characterized in that the data analysis unit further includes: a click
feature analysis module, configured to read near-synonym pairs from the near-synonym pair storage module and
to analyze, in the terminal service record text stored by the terminal service record storage module, the
cases in which word i appears in the query but not in the title of the historical record while word j appears
in the title of the historical record; the exchange ratio of word i and word j is calculated as follows:
[formula not reproduced in this text]
wherein C_ij denotes the exchange ratio; wt_i denotes that word i appears in the title; and |wt_i wq_j|
denotes the number of records in which word i appears in the title and word j appears in the query;
word pairs whose exchange ratio exceeds a set threshold are stored in the synonym pair storage module.
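The exchange-ratio formula is also missing from this text. As a rough sketch only, the code below assumes the ratio is the number of records whose title contains one word while the query contains the other word (and the title does not), divided by the number of records whose title contains the first word. Both the ratio and the record layout (a pair of word sets) are assumptions made for illustration.

```python
def exchange_ratio(records, title_word: str, query_word: str) -> float:
    """records: iterable of (query_words, title_words) pairs, each a set of words.

    Assumed definition (not taken from the source):
    count(title has title_word, query has query_word, title lacks query_word)
    divided by count(title has title_word).
    """
    title_hits = 0
    exchanged = 0
    for query_words, title_words in records:
        if title_word in title_words:
            title_hits += 1
            if query_word in query_words and query_word not in title_words:
                exchanged += 1              # user asked with one word, clicked a title using the other
    return exchanged / title_hits if title_hits else 0.0
```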
10. A method for automatic answering using the automatic answering system of claim 1, characterized in that
the method includes:
Step 1, receiving a question input by a user;
Step 2, analyzing the question and extracting question keywords;
Step 3, performing synonym expansion on the question keywords to obtain synonym-expanded question keywords;
Step 4, searching the data storage unit for the historical records that best match the synonym-expanded
question keywords;
Step 5, displaying the historical records to the user;
Step 6, receiving the best matching result selected by the user from the historical records, and storing the
best matching result in the data storage unit.
11. The method according to claim 10, characterized in that the historical record includes: historical
questions, and answer text generated from near-synonym pairs or synonym pairs.
12. The method according to claim 11, characterized in that the method of analyzing near-synonym pairs includes:
Step 101, obtaining the synonym-expanded question keywords and the corresponding best matching results;
Step 102, reading the synonym-expanded question keywords one by one;
Step 103, counting the number of matches w between the synonym-expanded question keyword and the best
matching results;
Step 104, reading the best matching results one by one;
Step 105, looking up the historical records that match the best matching result, and reading those
historical records one by one;
Step 106, counting the number of times r that the historical record matches the best matching result;
Step 107, calculating the near-synonymy degree between words, p = w × r, and accumulating p if a repeated
best matching result is encountered;
Step 108, reading the near-synonym judgment threshold s, and storing the words as a near-synonym pair if p > s;
Step 109, judging whether this is the last historical record; if so, performing step 110, otherwise
repeating step 105;
Step 110, judging whether this is the last best matching result; if so, performing step 111, otherwise
repeating step 104;
Step 111, judging whether this is the last synonym-expanded question keyword; if so, ending the analysis,
otherwise repeating step 102.
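One possible reading of steps 101 to 111 is sketched below. It assumes the stored selections are available as (keyword, best result) pairs, takes w and r as the number of times each of two words led to the same best matching result, and accumulates p = w × r over all shared results. The claim leaves the exact counting open, so this interpretation, the log format and the function name are all assumptions.

```python
from collections import Counter, defaultdict


def near_synonym_pairs(selection_log, threshold: float) -> dict:
    """selection_log: iterable of (keyword, best_result_id) pairs.

    Returns {(word_a, word_b): p} for pairs whose accumulated p exceeds the threshold s.
    """
    # How often each keyword led to each best matching result.
    per_result = defaultdict(Counter)
    for keyword, result_id in selection_log:
        per_result[result_id][keyword] += 1

    degree = Counter()
    for counts in per_result.values():
        words = sorted(counts)
        for idx, a in enumerate(words):
            for b in words[idx + 1:]:
                # p = w * r, accumulated across repeated best matching results
                degree[(a, b)] += counts[a] * counts[b]

    return {pair: p for pair, p in degree.items() if p > threshold}
```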
13. The method according to claim 12, characterized in that the method of analyzing synonym pairs includes:
Step 201, reading the near-synonym pairs one by one;
Step 202, calculating the pronunciation similarity between the pinyin character strings of the near-synonym pair;
Step 203, judging whether the pronunciation similarity exceeds a set threshold; if so, performing step 210,
otherwise continuing with step 204;
Step 204, according to the retrieval sequence, calculating the conditional probability that, within the same
session, the second word is searched after the first word has been searched;
Step 205, judging whether the conditional probability exceeds a set threshold; if so, performing step 210,
otherwise continuing with step 206;
Step 206, analyzing the co-occurrence degree of the first word and the second word;
Step 207, judging whether the co-occurrence degree exceeds a set threshold; if so, performing step 210,
otherwise continuing with step 208;
Step 208, analyzing the click feature of the two words;
Step 209, judging whether the click feature exceeds a set threshold; if so, performing step 210, otherwise
ending the synonym analysis and determining that the pair is not a synonym pair;
Step 210, storing the synonym pair.
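The cascade in steps 201 to 210 amounts to trying a sequence of tests and accepting the pair at the first one that exceeds its threshold. The sketch below shows only that control flow; the scoring functions and thresholds are passed in by the caller and are hypothetical placeholders.

```python
def classify_pair(pair, checks):
    """pair: (first_word, second_word).
    checks: ordered list of (name, score_fn, threshold), e.g. pinyin similarity,
    session conditional probability, co-occurrence degree, click feature.

    Returns (True, name_of_passing_check) or (False, None).
    """
    for name, score_fn, threshold in checks:
        if score_fn(*pair) > threshold:     # steps 203 / 205 / 207 / 209
            return True, name               # step 210: store as a synonym pair
    return False, None                      # no check passed: not a synonym pair
```

Ordering the cheaper checks first keeps the more expensive log analyses for pairs that the earlier tests could not settle.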
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610237009.3A CN105955976B (en) | 2016-04-15 | 2016-04-15 | Automatic answering system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105955976A (en) | 2016-09-21 |
CN105955976B (en) | 2019-05-14 |
Family
ID=56917383
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610237009.3A Active CN105955976B (en) | 2016-04-15 | 2016-04-15 | Automatic answering system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105955976B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101174259A (en) * | 2007-09-17 | 2008-05-07 | 张琰亮 | Intelligent interactive request-answering system |
CN101398835A (en) * | 2007-09-30 | 2009-04-01 | 日电(中国)有限公司 | Service selecting system and method, and service enquiring system and method based on natural language |
US20140337329A1 (en) * | 2010-09-28 | 2014-11-13 | International Business Machines Corporation | Providing answers to questions using multiple models to score candidate answers |
CN103902652A (en) * | 2014-02-27 | 2014-07-02 | 深圳市智搜信息技术有限公司 | Automatic question-answering system |
CN104809197A (en) * | 2015-04-24 | 2015-07-29 | 同程网络科技股份有限公司 | On-line question and answer method based on intelligent robot |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106503265A (en) * | 2016-11-30 | 2017-03-15 | 北京赛迈特锐医疗科技有限公司 | Structured search system and its searching method based on weights |
CN106599297A (en) * | 2016-12-28 | 2017-04-26 | 北京百度网讯科技有限公司 | Method and device for searching question-type search terms on basis of deep questions and answers |
CN106649868A (en) * | 2016-12-30 | 2017-05-10 | 首都师范大学 | Method and device for matching between questions and answers |
US11481419B2 (en) | 2017-05-17 | 2022-10-25 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for evaluating matching degree based on artificial intelligence, device and storage medium |
CN107220317A (en) * | 2017-05-17 | 2017-09-29 | 北京百度网讯科技有限公司 | Matching degree appraisal procedure, device, equipment and storage medium based on artificial intelligence |
CN107220317B (en) * | 2017-05-17 | 2020-12-18 | 北京百度网讯科技有限公司 | Matching degree evaluation method, device, equipment and storage medium based on artificial intelligence |
CN107453980A (en) * | 2017-07-26 | 2017-12-08 | 北京小米移动软件有限公司 | Problem response method and device in instant messaging |
WO2019041517A1 (en) * | 2017-08-29 | 2019-03-07 | 平安科技(深圳)有限公司 | Electronic device, question recognition and confirmation method, and computer-readable storage medium |
CN108509474B (en) * | 2017-09-15 | 2022-01-07 | 腾讯科技(深圳)有限公司 | Synonym expansion method and device for search information |
CN108509474A (en) * | 2017-09-15 | 2018-09-07 | 腾讯科技(深圳)有限公司 | Search for the synonym extended method and device of information |
CN110019701A (en) * | 2017-09-18 | 2019-07-16 | 京东方科技集团股份有限公司 | Method, question and answer service system and storage medium for question and answer service |
CN108009253A (en) * | 2017-12-05 | 2018-05-08 | 昆明理工大学 | A kind of improved character string Similar contrasts method |
CN109063060A (en) * | 2018-07-20 | 2018-12-21 | 吴怡 | A kind of semantic net legal advice service robot |
CN109189897B (en) * | 2018-07-27 | 2020-07-31 | 什伯(上海)智能技术有限公司 | Chatting method and chatting device based on data content matching |
CN109299320B (en) * | 2018-10-30 | 2020-09-25 | 上海智臻智能网络科技股份有限公司 | Information interaction method and device, computer equipment and storage medium |
CN109299320A (en) * | 2018-10-30 | 2019-02-01 | 上海智臻智能网络科技股份有限公司 | A kind of information interacting method, device, computer equipment and storage medium |
CN109710732B (en) * | 2018-11-19 | 2021-03-05 | 东软集团股份有限公司 | Information query method, device, storage medium and electronic equipment |
CN109710732A (en) * | 2018-11-19 | 2019-05-03 | 东软集团股份有限公司 | Information query method, device, storage medium and electronic equipment |
CN110222192A (en) * | 2019-05-20 | 2019-09-10 | 国网电子商务有限公司 | Corpus method for building up and device |
CN110442760A (en) * | 2019-07-24 | 2019-11-12 | 银江股份有限公司 | A kind of the synonym method for digging and device of question and answer searching system |
CN110442760B (en) * | 2019-07-24 | 2022-02-15 | 银江技术股份有限公司 | Synonym mining method and device for question-answer retrieval system |
CN113609273A (en) * | 2021-08-12 | 2021-11-05 | 云知声(上海)智能科技有限公司 | Method and device for configuring mechanical speech technology, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN105955976B (en) | 2019-05-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20210106. Address after: 100140, 55, Fuxing Avenue, Xicheng District, Beijing. Patentee after: INDUSTRIAL AND COMMERCIAL BANK OF CHINA; ICBC Technology Co.,Ltd. Address before: 100140, 55, Fuxing Avenue, Xicheng District, Beijing. Patentee before: INDUSTRIAL AND COMMERCIAL BANK OF CHINA |