WO2020233386A1 - AIML-based intelligent question answering method, device, computer equipment and storage medium - Google Patents
- Publication number
- WO2020233386A1 (PCT/CN2020/088052)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- answer
- preset
- question
- text
- information
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Definitions
- This application relates to the technical field of intelligent question answering robots, in particular to AIML-based intelligent question answering methods, devices, computer equipment and storage media.
- AIML: Artificial Intelligence Markup Language
- XML: eXtensible Markup Language
- AIML can realize the interaction between users and question answering robots, but the inventor found that current AIML has the following problems when used in Chinese conversations. First, there is basically no public Chinese rule base, and the rule base is equivalent to the "brain" of the dialogue robot.
- Second, the AIML interpreter does not support Chinese well. For example, English input uses spaces to separate words, while Chinese input is not separated by spaces, which leads to a low question recognition rate and matching rate.
- The main purpose of this application is to provide an AIML-based intelligent question answering method, device, computer equipment and storage medium that enlarge the AIML Chinese rule base, solve the problem of existing AIML's poor support for Chinese, and improve the question recognition rate and matching rate.
- This application proposes an AIML-based intelligent question answering method, including: obtaining question information input by a user, and obtaining text information according to the question information; converting the Chinese in the text information into the same Chinese font to obtain the first text corresponding to the text information, where the Chinese font is Simplified Chinese or Traditional Chinese; deleting the specified symbols in the first text according to preset filtering rules to obtain the second text; performing Chinese word segmentation on the second text according to preset Chinese word segmentation rules to obtain multiple first fields corresponding to the second text.
- This application also proposes an AIML-based intelligent question answering device, which includes: a first acquisition module for acquiring question information input by a user and obtaining text information according to the question information; a conversion module for converting the Chinese in the text information into the same Chinese font to obtain the first text corresponding to the text information, where the Chinese font is Simplified Chinese or Traditional Chinese; a filtering module for deleting the specified symbols in the first text according to preset filtering rules to obtain the second text; a word segmentation module for performing Chinese word segmentation on the second text according to preset Chinese word segmentation rules to obtain multiple first fields corresponding to the second text; a first matching module for performing synonym matching on each first field to obtain a second field corresponding to each first field; a replacement module for replacing, in the second text, the first field corresponding to each second field with that second field to obtain multiple target texts; a second matching module for matching each target text with the preset questions in the preset question and answer file, where the preset question and answer file contains the mapping relationship information between each preset question and its first answer; and a second acquisition module for obtaining, if the target text is successfully matched with a preset question in the preset question and answer file, the first answer corresponding to that preset question, and using the first answer as the answer to the question information.
- This application also proposes a computer device, including a memory and an executor. The memory stores a computer program, and the executor implements the steps of the AIML-based intelligent question answering method when executing the computer program.
- This application also proposes a storage medium on which a computer program is stored. When the computer program is executed by an executor, the steps of the AIML-based intelligent question answering method are realized.
- This application enlarges the Chinese rule base by configuring, in the intelligent question and answer database, the question and answer data table, the synonym table, the professional vocabulary list for Chinese word segmentation, and the correspondence table between Traditional and Simplified Chinese. Normalization processing such as font conversion, special symbol filtering, Chinese word segmentation, synonym matching and text replacement is performed on the text information corresponding to the question information. This enhances AIML's support for Chinese so that AIML can better recognize text information, thereby improving the question recognition rate and matching rate. Preset question and answer files are configured for each business type to ensure data security and avoid mutual interference, so that AIML supports multiple business question answering scenarios at the same time. The answer to the question information is obtained by matching the target text with the preset questions.
- Figure 1 is a schematic diagram of the steps of an AIML-based intelligent question answering method in an embodiment of this application;
- Figure 2 is a schematic structural diagram of an AIML-based intelligent question answering device in an embodiment of this application;
- Figure 3 is a schematic structural diagram of a computer device in an embodiment of this application.
- The AIML-based intelligent question answering method in an embodiment of the present application includes:
- S4 Perform Chinese word segmentation on the second text according to preset Chinese word segmentation rules to obtain multiple first fields corresponding to the second text;
- S7 Match each target text with a preset question in a preset question and answer file, and the preset question and answer file contains the mapping relationship information between the preset question and the first answer;
- The access terminal for the above question information may be a question-and-answer dialogue scenario such as WeChat chat or online question answering and customer service on a website.
- The question information can be input by the user by voice or manually. If it is voice input, the voice is converted into text information through a voice-to-text conversion tool.
- The above question information can include questions of multiple business types.
- The intelligent question and answer file corresponding to each business type can be configured and maintained through the background management system; that is, each business type has its own table of questions and answers. The intelligent question answering robot can therefore support question information of multiple business types at the same time, with the business types kept separate so that they do not interfere with each other. This overcomes the previous limitation that an intelligent question answering robot could support only one business type.
- Converting the above-mentioned Chinese to the same Chinese font means converting the Simplified Chinese in the text information into Traditional Chinese, or converting the Traditional Chinese in the text information into Simplified Chinese.
- The conversion follows the Chinese font of the preset question and answer file. For example, if the preset questions in the preset question and answer file are in Simplified Chinese and the text information corresponding to the question information input by a Hong Kong user is in Traditional Chinese, the Traditional Chinese in the text information is converted to Simplified Chinese.
- In principle, the conversion can rely either on a configuration file that records the correspondence between Traditional and Simplified Chinese or on a database table that records the same correspondence. Because the data in a database table can be flexibly added or modified through the background management system interface, a database table is preferred. The conversion can also be implemented with the help of an open source toolkit, such as zhconverter in Java.
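- The table-driven conversion described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the mapping `TRAD_TO_SIMP` and the function names are hypothetical, and a real deployment would load the full correspondence table from the database or configuration file.

```python
# Hypothetical excerpt of the Traditional -> Simplified correspondence table.
TRAD_TO_SIMP = {"養": "养", "險": "险", "辦": "办", "麼": "么", "處": "处"}


def convert(text, table):
    """Replace every character that has an entry in the table; keep the rest."""
    return "".join(table.get(ch, ch) for ch in text)


def to_file_font(text, file_is_simplified=True):
    """Convert user text to the font used by the preset question and answer file."""
    if file_is_simplified:
        return convert(text, TRAD_TO_SIMP)
    # Reverse lookup for a Traditional-font file (assumes a one-to-one table).
    simp_to_trad = {simp: trad for trad, simp in TRAD_TO_SIMP.items()}
    return convert(text, simp_to_trad)


print(to_file_font("辦理養老保險有什麼好處"))  # 办理养老保险有什么好处
```

- A character-by-character reverse table is lossy in general, because one Simplified character can correspond to several Traditional characters (for example 发 for both 發 and 髮), which is one reason to prefer a maintained table or toolkit over an ad hoc mapping.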
- The above preset filtering rules are processing rules for specified punctuation marks and spaces in the text, added according to Chinese language habits. If spaces appear in the text, the spaces are deleted; dashes in the text are also deleted; and if a foreigner's first name and surname are connected with a "·" sign, the filtering rule also deletes that symbol.
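- A filtering rule of this kind can be expressed as a single regular expression. The sketch below is an assumption about which symbols count as "specified": whitespace, common dash characters, and the middle dot used between the parts of transliterated foreign names. The name `filter_symbols` is hypothetical.

```python
import re

# Assumed symbol set: any whitespace, ASCII hyphen, en dash, em dash,
# and three middle-dot variants (U+00B7, U+2027, U+30FB).
_SPECIFIED_SYMBOLS = re.compile(r"[\s\-\u2013\u2014\u00b7\u2027\u30fb]")


def filter_symbols(first_text):
    """Delete the specified symbols from the first text to obtain the second text."""
    return _SPECIFIED_SYMBOLS.sub("", first_text)


print(filter_symbols("迈克尔\u00b7乔丹 买 养老-保险"))  # 迈克尔乔丹买养老保险
```

- Keeping the symbol set in one pattern makes it easy to move the rule into a configurable table later, in line with the other configurable tables in the application.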
- The above-mentioned preset Chinese word segmentation rules include performing full segmentation and atomic segmentation of the second text at the same time, obtaining the segmentation scheme with the optimal path according to the hidden Markov model and the Viterbi algorithm, and then performing name recognition, system dictionary supplementation, and user self-segmentation.
- The above Chinese word segmentation divides the text information into multiple phrases or fields. Specifically, it can be realized with open source Chinese word segmentation tools, such as Ansj. Ansj supports custom dictionaries, so users can edit the proprietary vocabulary corresponding to a business type, such as the name of a company's insurance product, "eLife Insurance". This lets AIML support question information of different business types and improves the question recognition rate.
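- The effect of a custom dictionary can be illustrated with a much simpler technique than the HMM and Viterbi segmentation named above. The sketch below uses forward maximum matching, which is a stand-in and not what Ansj does; the dictionaries and the function name `segment` are hypothetical.

```python
def segment(text, dictionary):
    """Forward maximum matching: at each position take the longest dictionary word.

    Characters that start no dictionary word are emitted as single-character fields.
    """
    max_len = max((len(word) for word in dictionary), default=1)
    fields, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                fields.append(candidate)
                i += size
                break
    return fields


SYSTEM_DICT = {"办理", "养老", "保险", "好处", "什么"}
USER_DICT = {"养老保险"}  # hypothetical business-specific product name

print(segment("办理养老保险有什么好处", SYSTEM_DICT))
# ['办理', '养老', '保险', '有', '什么', '好处']
print(segment("办理养老保险有什么好处", SYSTEM_DICT | USER_DICT))
# ['办理', '养老保险', '有', '什么', '好处']
```

- With the user dictionary loaded, the product name stays in one field, so later synonym matching and question matching operate on the business term rather than on its fragments.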
- The above-mentioned synonym matching takes the phrases or fields produced by the Chinese word segmentation and obtains the synonyms corresponding to each of them. In principle, it relies on a configuration file or database table that records the correspondence between Chinese vocabulary and its synonyms.
- The above-mentioned replacing of the first field corresponding to the second field according to the second field means replacing a phrase or field in the text information with its corresponding synonym so as to obtain a new text. The first fields in the second text are replaced multiple times, each time replacing one or more of the first fields with the corresponding second fields, thereby obtaining multiple target texts.
- The step S6 of replacing the first field corresponding to the second field according to the second field to obtain multiple target texts includes:
- Step S61 includes replacing one field among all the first fields with its corresponding second field each time, and replacing multiple fields among all the first fields with their corresponding second fields each time. When one first field corresponds to multiple second fields, the first field is replaced with each of the multiple second fields one by one. For example, if the text information is "What are the benefits of applying for pension insurance?", Chinese word segmentation of the text information yields first fields such as "application", "endowment", "insurance", "benefits" and "what", and synonym matching is then performed on them.
- The original text information "What are the benefits of handling endowment insurance" is also used as a target text.
- If the question information entered by the user is in Traditional Chinese, the returned data is also converted to Traditional Chinese and used as the answer to the question information.
- In this way multiple target texts are obtained, so that AIML can better recognize text information and improve the question recognition rate and matching rate.
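- Generating the target texts amounts to taking every combination of "keep the first field" or "use one of its second fields". The sketch below shows this under assumed data: the synonym table and the function name `target_texts` are hypothetical.

```python
from itertools import product


def target_texts(first_fields, synonyms):
    """Return every text obtained by replacing zero or more fields with a synonym.

    The first result is always the original text, which is also a target text.
    """
    options = [[field] + list(synonyms.get(field, [])) for field in first_fields]
    texts = []
    for combination in product(*options):
        text = "".join(combination)
        if text not in texts:  # skip duplicates produced by overlapping synonyms
            texts.append(text)
    return texts


SYNONYMS = {"办理": ["购买"], "好处": ["优点", "益处"]}  # hypothetical synonym table
fields = ["办理", "养老保险", "有", "什么", "好处"]
for text in target_texts(fields, SYNONYMS):
    print(text)
```

- The number of target texts grows multiplicatively with the number of synonyms per field, so a production system would probably cap the number of combinations it generates.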
- The above-mentioned preset question and answer file is a question and answer data table corresponding to multiple service types in the intelligent question and answer database, and the question and answer data table includes a correspondence table between the preset questions and the first answers.
- The above-mentioned preset question and answer files can be configured through the background management system. Specifically, each business person can log in to the background management system with their account. After entering the system, they can configure the question and answer data table, the synonym table, the professional vocabulary list for Chinese word segmentation, the correspondence table between Traditional and Simplified Chinese, and so on, and can modify or delete the data in those tables.
- The background management system then calls the intelligent question answering system interface, which triggers the intelligent question answering system to generate the question and answer data table, the synonym table, the professional vocabulary list for Chinese word segmentation, the correspondence table between Traditional and Simplified Chinese, and so on, or to update the data in those tables.
- Each business person can only see and configure the content related to their own business type, which ensures data security and avoids mutual interference.
- The user may be prompted to select the business type to consult. For example, if the user selects the wealth management business option, the target text is matched against the preset questions in the question and answer data table corresponding to the wealth management business.
- In step S8, if the matching is successful, there is a preset question in the question and answer data table that is the same as the question information, and the first answer corresponding to that preset question is used as the answer to the target text.
- Because the multiple target texts are derived from the same text information, their meanings are basically the same, so usually one first answer is obtained as the answer to the text information. However, the meanings of some target texts may differ, so that a target text matches a first answer inconsistent with the answers of the other target texts. In that case the different first answers matched by the multiple target texts are all returned to the front-end interface for the user's reference.
- step S1 of obtaining question information input by a user and obtaining text information according to the question information includes:
- S12 Perform voice preprocessing on the voice signal to obtain an observation sequence of the voice signal
- The voice signal is composed of the user's voice and environmental noise, and the environmental noise interferes with voice recognition, so voice preprocessing is performed on the voice signal.
- The above-mentioned voice preprocessing divides the voice signal into frames through VAD (Voice Activity Detection) technology and establishes an HMM (Hidden Markov Model) corresponding to the voice signal.
- The user's voice signal is divided into overlapping voice frames according to its period to ensure that the LPC (Linear Predictive Coding) spectrum estimation is correlated from frame to frame. The endpoint detection algorithm is used to find the start and end of the voice; the intensity and the number of zero crossings of each speech frame are then measured to calculate an energy and zero-crossing threshold, thereby removing most of the environmental noise. The speech signal is passed through a low-order low-pass filter to flatten the signal in the frequency domain and weaken the influence of the finite word length effect on the signal during processing. Windowing is performed on each speech frame to reduce the signal discontinuity between the start speech frame and the end speech frame. Autocorrelation analysis is performed on each speech frame to obtain the autocorrelation coefficients, and the Levinson-Durbin algorithm is used to find the LPC coefficients. The LPC coefficients are weighted using a tapered window to obtain the cepstral coefficients, and the cepstral coefficients are used as the feature vector of the speech frame. Furthermore, the time-domain Ce
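- The framing and endpoint-detection part of this preprocessing can be sketched in a few lines. The decision rule and all thresholds below are assumptions for illustration (a double-threshold rule on short-time energy and zero-crossing count); the filtering, windowing and LPC steps are not shown, and the function names are hypothetical.

```python
def split_frames(samples, frame_len, hop):
    """Overlapping frames: consecutive frames share frame_len - hop samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared sample values, used as the intensity of the frame."""
    return sum(x * x for x in frame)


def zero_crossings(frame):
    """Number of sign changes between neighbouring samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def speech_frame_indices(samples, frame_len, hop, energy_high, energy_low, zc_min):
    """Keep a frame if it is loud, or moderately loud with enough zero crossings."""
    kept = []
    for index, frame in enumerate(split_frames(samples, frame_len, hop)):
        energy = short_time_energy(frame)
        if energy >= energy_high or (energy >= energy_low
                                     and zero_crossings(frame) >= zc_min):
            kept.append(index)
    return kept


signal = [0] * 8 + [5, -5, 5, -5, 5, -5, 5, -5] + [0] * 8
print(speech_frame_indices(signal, frame_len=4, hop=2,
                           energy_high=80, energy_low=40, zc_min=1))
# [3, 4, 5, 6, 7]
```

- The first and last kept indices give the start and end of the voice; frames outside that range are treated as silence or noise and dropped before feature extraction.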
- Steps S13 and S14 perform HMM training for each phoneme system in the preset Chinese vocabulary (fields containing preset text): digitized voice sample values are obtained, and preprocessing, feature vector extraction, vector quantization, Baum-Welch modeling and so on are performed to obtain the observation sequence of the speech model corresponding to the preset text. When the observation sequence of the speech signal is matched against the observation sequences of the speech models corresponding to the preset texts, a probability calculation is performed on the observation sequence of each speech signal.
- The maximum likelihood estimation algorithm is used to calculate the maximum probability (that is, the similarity between the above observation sequence and the observation sequence corresponding to the preset text). If the maximum probability is greater than the preset similarity, the preset text whose observation sequence has the greatest probability for the observation sequence of the speech signal is used as the text information corresponding to the question information.
- the step S7 of matching each target text with the preset question in the preset question and answer file includes:
- The above semantic analysis includes performing Chinese word segmentation on the text information, using a statistical language model to determine the optimal word segmentation result, calculating the weight of each term after word segmentation according to a term-weighting method, and extracting the core words in the text according to the weight of each term.
- The language model can be an N-Gram model based on HMM, or a language model based on a recurrent neural network, such as a state-of-the-art language model. Term-weighting methods include TF-IDF, Okapi, MI, ATC, LTU, etc. The more times a term appears in the text, the greater its weight and the more important it is.
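- Of the term-weighting methods listed, TF-IDF is the simplest to show. The sketch below uses a smoothed inverse document frequency, which is one common variant and an assumption here; the corpus and the function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf(doc_terms, corpus):
    """Weight each term by its frequency in the text times a smoothed IDF."""
    counts = Counter(doc_terms)
    n_docs = len(corpus)
    weights = {}
    for term, count in counts.items():
        tf = count / len(doc_terms)
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        weights[term] = tf * idf
    return weights


def core_words(doc_terms, corpus, top_k=2):
    """Return the top_k terms by weight (ties broken by the term itself)."""
    weights = tf_idf(doc_terms, corpus)
    return sorted(weights, key=lambda term: (-weights[term], term))[:top_k]


corpus = [
    ["办理", "养老保险", "有", "什么", "好处"],
    ["办理", "车险", "有", "什么", "优惠"],
    ["什么", "是", "理财"],
]
print(core_words(corpus[0], corpus))
```

- Terms that occur in every preset question, such as "什么", receive the lowest weight, while business terms that distinguish one question from another rise to the top.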
- Each service type corresponds to one or more first preset question and answer files, and the first preset question and answer files are included in the preset question and answer files.
- Matching the target text with the preset question is in fact a text matching process. Text matching models can be divided into single semantic models, multiple semantic models, matching matrix models and deep-level inter-sentence models.
- A single semantic model uses a fully connected, CNN or RNN neural network to encode the two sentences and then calculates the similarity between them, without considering the local structure of the phrases in the sentences; an example is DSSM (Deep Structured Semantic Models). A multiple semantic model interprets sentences at multiple granularities and takes the local structure of the sentence into account; an example is MV-LSTM (MultiView Long Short Term Memory). A matching matrix model calculates the similarity between the different words of the two sentences and then uses a deep network to extract features, considering the interaction between different words and handling the relationships in the sentence more precisely; an example is Text Matching as Image Recognition. A deep-level inter-sentence model uses interaction mechanisms such as attention and a more refined structure to mine the relationships between different words within and between sentences; an example is the state of the art model.
- The similarity between the target text and the question text can be calculated with the similarity algorithm of any one of the above-mentioned text matching models, and it is then judged whether the similarity reaches the preset threshold. If so, the match succeeds; otherwise, the match fails.
- For example, the question information is the text information "What are the benefits of handling endowment insurance?", and the target texts are "What are the benefits of having endowment insurance?" and "What are the advantages of buying endowment insurance?". These are matched against the preset questions in the preset question and answer file: the similarity to each preset question is calculated for "What are the benefits of having pension insurance", "What are the advantages of buying pension insurance" and "What are the benefits of handling pension insurance". If the similarity between one or more target texts and the question "What are the advantages of endowment insurance?" in the first preset question and answer file reaches the preset threshold, the answer corresponding to "What are the advantages of endowment insurance?" is used as the answer to "What are the benefits of handling endowment insurance?".
- Multiple target texts may also match multiple question texts in the first preset question and answer file. Although this is a low-probability event, if it happens, the answers corresponding to the multiple question texts are all used as answers to the question information for the user's reference.
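- The threshold test can be sketched independently of the matching model. In the sketch below, `difflib.SequenceMatcher` from the Python standard library is a placeholder for the neural text matching models named above; the threshold, the question and answer entries and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Placeholder similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a, b).ratio()


def match_answers(target_texts, qa_file, threshold=0.8):
    """Return the first answers of all preset questions matched by any target text."""
    answers = []
    for question, answer in qa_file.items():
        matched = any(similarity(text, question) >= threshold
                      for text in target_texts)
        if matched and answer not in answers:
            answers.append(answer)
    return answers


QA_FILE = {  # hypothetical first preset question and answer file
    "养老保险有什么优点": "退休后可以按月领取养老金。",
    "车险怎么理赔": "请拨打客服电话报案。",
}
targets = ["办理养老保险有什么好处", "购买养老保险有什么优点"]
print(match_answers(targets, QA_FILE))  # ['退休后可以按月领取养老金。']
```

- Returning a list rather than a single answer covers the low-probability case in which the target texts match several preset questions, so all of the answers can be shown to the user.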
- Before step S7 of matching each target text with the preset question in the preset question and answer file, the method includes:
- S071 Receive preset questions and first answers corresponding to various service types respectively;
- S072 Write the preset question and first answer corresponding to the first service type in the first preset question and answer file, where the first service type is included in all service types, and the first preset question and answer file is included in all the preset question and answer files.
- The first preset question and answer file can be configured through the background management system. Specifically, each business person logs in to the background management system (part of the intelligent question answering system) with their account, enters an instruction to create a first preset question and answer file, and enters the preset questions and first answers for the type of business they are responsible for. For example, a business person in charge of the basketball business logs in to the background management system with their account. The background management system identifies the account and, according to the preset permissions corresponding to the account (only the editing and browsing permissions related to the basketball business), displays the editable Q&A data table corresponding to those preset permissions.
- The business person enters the basketball-related preset questions and first answers in the Q&A data table, such as the preset question "What is the jersey number Yao Ming retired with the NBA Rockets?" and the first answer "No. 11". The background management system receives the above-mentioned preset question and first answer and, according to the confirmation instruction entered by the user, generates the first preset question and answer file corresponding to the basketball business from the question and answer data table.
- Likewise, the business staff in charge of the football business can only edit the first preset question and answer file corresponding to the football business. Because of account permissions, each business person can only see and configure the content related to their own business type, which ensures data security, avoids mutual interference, and enables the intelligent question answering system to support multiple business scenarios without interference.
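- The per-account isolation described above can be modelled as a permission check in front of one question and answer file per business type. The class, its method names and the account names below are hypothetical and only illustrate the idea.

```python
class QAStore:
    """One question and answer file per business type, guarded by account permissions."""

    def __init__(self, permissions):
        self._permissions = permissions  # account -> set of business types
        self._files = {}                 # business type -> {preset question: first answer}

    def write(self, account, business_type, question, answer):
        """Store a preset question and first answer if the account may edit the type."""
        if business_type not in self._permissions.get(account, set()):
            raise PermissionError(f"{account} may not edit {business_type}")
        self._files.setdefault(business_type, {})[question] = answer

    def visible_files(self, account):
        """Return copies of only the files the account is allowed to browse."""
        allowed = self._permissions.get(account, set())
        return {bt: dict(qa) for bt, qa in self._files.items() if bt in allowed}


store = QAStore({"alice": {"basketball"}, "bob": {"football"}})
store.write("alice", "basketball",
            "What is the jersey number Yao Ming retired with the NBA Rockets?",
            "No. 11")
print(store.visible_files("alice"))
print(store.visible_files("bob"))  # {}
```

- Because every read and write goes through the same check, adding a new business type only requires a new permission entry, not a new code path.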
- the method includes:
- S703 Analyze the returned data, obtain a number of second answers with higher relevance in the returned data, and use the plurality of second answers as answers corresponding to the text information.
- Data crawling technology is used to obtain the answer to the user's question.
- The above-mentioned data crawling technology is a web crawler, a program or script that automatically crawls the information at a specified URL according to certain rules. Web crawlers are widely used by Internet search engines and similar websites; they automatically collect all the pages they can access in order to obtain or update the content and retrieval methods of these websites.
- The web crawler analyzes the calling address (preset URL address) of the search engine (such as Baidu search) in advance according to the business type in the text information. The program sends to the calling address a search query request that carries the text information and asks for a number of second answers corresponding to the text information, obtains the return data (HTML code) returned by the calling address, and then parses the HTML code with jsoup (a Java HTML parser) to obtain the second answers corresponding to the text information.
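- The parsing step can be sketched without any network access. The application names jsoup for this; the sketch below instead uses `html.parser` from the Python standard library, and it assumes that each answer sits in an element with the class `answer`, which is a hypothetical page structure rather than that of any real search engine.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # tags without end tags


class AnswerParser(HTMLParser):
    """Collects the text of every element carrying the (assumed) class 'answer'."""

    def __init__(self):
        super().__init__()
        self.answers = []
        self._depth = 0    # greater than 0 while inside an answer element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth or "answer" in classes:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:  # the answer element just closed
            self.answers.append("".join(self._buffer).strip())
            self._buffer = []

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def parse_answers(html):
    parser = AnswerParser()
    parser.feed(html)
    parser.close()
    return parser.answers


page = ('<div><p class="answer">Stable <b>income</b> after retirement</p>'
        '<p class="ad">Buy now</p><p class="answer">Tax benefits</p></div>')
print(parse_answers(page))  # ['Stable income after retirement', 'Tax benefits']
```

- Fetching the page is deliberately left out; in a real crawler the HTML string would be the return data of the search query request.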
- The process by which the crawling technology finds the answer corresponding to the question is itself a process of filtering out relevant answers. For example, the answer list displayed when searching for the question "What are the benefits of pension insurance" through Baidu search has itself been screened by the Baidu search engine; but in order to obtain more accurate answers, the most relevant answer in the answer list is taken as the second answer.
- The second answer is thus highly relevant and reliable. The first several answers can also be provided for the user's reference. Relevance can be analyzed according to the position in the search result list, the number of views of the result, the number of likes, and the useful number.
- The answer value = (the preset score value corresponding to the position / the position in the result list) * weight 1 + the number of views of the result * weight 2 + the number of likes * weight 3 + the useful number * weight 4. The value of each answer in the result list is calculated in this way, and the answer with the highest value is used as the second answer.
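- The weighted answer value can be computed as below. The weights and the position score are made-up numbers, and using one constant position score for every rank is a simplification of "the preset score value corresponding to the position"; the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    text: str
    position: int  # 1-based rank in the search result list
    views: int
    likes: int
    useful: int


def answer_value(result, position_score=100.0, weights=(1.0, 0.01, 0.1, 0.5)):
    """(position score / position) * w1 + views * w2 + likes * w3 + useful * w4."""
    w1, w2, w3, w4 = weights
    return ((position_score / result.position) * w1
            + result.views * w2
            + result.likes * w3
            + result.useful * w4)


def pick_second_answer(results):
    """The answer with the highest value is used as the second answer."""
    return max(results, key=answer_value)


results = [
    SearchResult("Answer A", position=1, views=100, likes=5, useful=2),
    SearchResult("Answer B", position=2, views=8000, likes=50, useful=10),
]
print(pick_second_answer(results).text)  # Answer B
```

- In this example the second-ranked result wins because its view, like and useful counts outweigh the position bonus of the first result, which is the kind of re-ranking the formula is meant to allow.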
- The second answer corresponding to the text information is obtained through data crawling technology, so that the machine has a learning function.
- the method includes:
- S706 Based on the user selecting the second answer as a useful option, accumulate the first useful number corresponding to each second answer;
- The second answers are displayed in the form of a list.
- The second answer can also be converted into voice output.
- Each second answer in the above answer list has an option for marking that second answer as useful.
- The user can select the useful option of a second answer that they agree with, which increments the first useful number of that second answer.
- This allows the intelligent question answering system to recommend answers to new questions. For example, the user judges a second answer after viewing it: if the user finds it good, they can choose the useful option, and if they find it bad, they can choose the useless option.
- The second answer with the highest first useful number is used as the answer to subsequent questions; that is, when the first useful number of one of the second answers reaches the preset value, that second answer is added to the preset question and answer file as the answer to the corresponding question, thereby enhancing the learning ability of the question answering robot and making it more intelligent.
- the method further includes:
- When the question answering robot again receives text information as in step S703 and that text information matches the first text information, the first answer corresponding to the first text information in the preset question and answer file is retrieved and returned to the front end.
- The UI displays both a useful option and a useless option for the first answer. Because answers are time-sensitive, it is judged whether the second useful number is less than the useless number. When the second useful number is less than the useless number, the first answer is no longer accepted by users, so the first answer and its corresponding first text information are deleted. For example, people in ancient times believed that the sun moved around the earth, but today people consider this wrong, so the useless number will increase until it finally exceeds the second useful number.
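- The promote-then-retire cycle described above can be sketched as a small vote counter. The class, its method names and the promotion threshold are hypothetical; the sketch only mirrors the two rules stated in the text (promote at a preset useful number, delete once the useful number falls below the useless number).

```python
class FeedbackLearner:
    """Promotes well-rated second answers and retires stored answers users reject."""

    def __init__(self, qa_file, promote_at=3):
        self.qa_file = qa_file       # question text -> stored answer
        self.promote_at = promote_at # preset value for the first useful number
        self._first_useful = {}      # (question, second answer) -> first useful number
        self._votes = {}             # question -> [second useful number, useless number]

    def vote_second_answer(self, question, answer):
        """The user marked a crawled second answer as useful."""
        key = (question, answer)
        self._first_useful[key] = self._first_useful.get(key, 0) + 1
        if (self._first_useful[key] >= self.promote_at
                and question not in self.qa_file):
            self.qa_file[question] = answer
            self._votes[question] = [0, 0]

    def vote_stored_answer(self, question, useful):
        """The user marked a stored answer as useful (True) or useless (False)."""
        if question not in self.qa_file:
            return
        votes = self._votes.setdefault(question, [0, 0])
        votes[0 if useful else 1] += 1
        if votes[0] < votes[1]:  # fewer useful votes than useless votes
            del self.qa_file[question]
            del self._votes[question]


learner = FeedbackLearner({}, promote_at=2)
learner.vote_second_answer("q", "a")
learner.vote_second_answer("q", "a")
print(learner.qa_file)  # {'q': 'a'}
```

- Comparing the two counters on every vote means an outdated answer is removed as soon as rejections outnumber approvals, without a separate clean-up job.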
- the AIML-based intelligent question answering device includes:
- the first obtaining module 1 is used to obtain question information input by a user, and obtain text information according to the question information;
- the conversion module 2 is used to convert the Chinese in the text information into the same Chinese font to obtain the first text corresponding to the text information, and the Chinese font is simplified or traditional Chinese;
- the filtering module 3 is used to delete the specified symbols in the first text according to the preset filtering rules to obtain the second text;
- the word segmentation module 4 is used to perform Chinese word segmentation on the second text according to preset Chinese word segmentation rules to obtain multiple first fields corresponding to the second text;
- the first matching module 5 is configured to perform synonym matching on each first field to obtain a second field corresponding to each first field;
- the replacement module 6 is configured to replace the first field corresponding to the second field in the second text according to the second field to obtain multiple target texts;
- the second matching module 7 is configured to match each target text with the preset question in the preset question and answer file, and the preset question and answer file contains the mapping relationship information between the preset question and the first answer;
- the second obtaining module 8 is configured to obtain the first answer corresponding to the preset question if the target text is successfully matched with the preset question in the preset question and answer file, so as to use the first answer as the answer to the question information.
- The access terminal for the above question information may be a question-and-answer dialogue scenario such as WeChat chat or online question-and-answer customer service on a website.
- The question information can be input by the user by voice or manually. If it is voice input, the voice is converted into text information through a voice-to-text conversion tool.
- The above question information can include questions of multiple business types.
- The intelligent question and answer file corresponding to each business type can be configured and maintained through the background management system; that is, each business type has its own table of questions and answers. The intelligent question answering robot can therefore support question information of multiple business types at the same time, with the business types kept separate so that they do not interfere with each other. This overcomes the previous limitation that an intelligent question answering robot could support only one business type.
- converting the Chinese in the text information into the same Chinese font means converting the simplified Chinese in the text information into traditional Chinese, or converting the traditional Chinese in the text information into simplified Chinese.
- the font is converted to match the preset question and answer file. If the preset questions in the preset question and answer file are in simplified Chinese and the text information corresponding to the question information input by a Hong Kong user is in traditional Chinese, the traditional Chinese of the text information is converted to simplified Chinese.
- the conversion can rely on a configuration file that records the correspondence between traditional Chinese and simplified Chinese, or on a database table that records the same correspondence. Because the data in a database table can be flexibly added or modified through the background management system interface, a database table is preferred. The conversion can be implemented with the help of an open source toolkit, such as zhconverter in Java.
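- the table-driven conversion described above can be sketched as follows. This is a minimal illustration, not the toolkit named in the text: the five-entry `TRAD_TO_SIMP` table and both function names are assumptions, and a real table would hold thousands of pairs loaded from the configuration file or database table.

```python
# Hypothetical correspondence table: traditional character -> simplified character.
# In the described system this data would come from a config file or database table.
TRAD_TO_SIMP = {"養": "养", "險": "险", "麼": "么", "辦": "办", "處": "处"}
SIMP_TO_TRAD = {simp: trad for trad, simp in TRAD_TO_SIMP.items()}


def to_simplified(text: str) -> str:
    """Replace each traditional character that has a table entry; keep the rest."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def to_traditional(text: str) -> str:
    """Reverse direction, used when the answer must be shown in traditional Chinese."""
    return "".join(SIMP_TO_TRAD.get(ch, ch) for ch in text)
```

- a character-by-character table cannot resolve simplified characters that map to several traditional forms, which is one reason a dedicated toolkit may be preferable in practice.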
- in the filtering module 3, the word segmentation module 4 and the first matching module 5 above, the preset filtering rules are processing rules for specified punctuation marks and spaces in the text, defined according to Chinese language habits. If there are spaces in the text, the spaces are deleted. If a dash appears in the text, the dash is also deleted. If a foreigner's given name and surname are connected with a separator symbol such as "·", the filtering rule also deletes that symbol.
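- a minimal sketch of such a filtering rule is shown below. The exact symbol set is an assumption: whitespace, hyphens and dashes, and the middle dot commonly used to join transliterated foreign names. The name `apply_filter_rules` is hypothetical.

```python
import re

# Assumed symbol set: any whitespace, ASCII hyphen, Unicode hyphens/dashes
# (U+2010..U+2015), and middle-dot style separators (U+00B7, U+2022, U+30FB).
_FILTER_PATTERN = re.compile(r"[\s\-\u2010-\u2015\u00b7\u2022\u30fb]")


def apply_filter_rules(first_text: str) -> str:
    """Delete the specified symbols from the first text to obtain the second text."""
    return _FILTER_PATTERN.sub("", first_text)
```

- keeping the symbol set in one pattern makes it straightforward to load the set from configuration instead of hard-coding it.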
- the above-mentioned preset Chinese word segmentation rules include performing full segmentation and atomic segmentation of the second text at the same time, selecting the segmentation plan with the optimal path according to the hidden Markov model and the Viterbi algorithm, and then performing name recognition, system dictionary supplementation, and user-defined segmentation.
- the above Chinese word segmentation divides the text information into multiple phrases or fields. Specifically, it can be realized with open source Chinese word segmentation tools, such as Ansj. Ansj supports custom dictionaries, so users can add the proprietary vocabulary of a business type, such as the name of a company's insurance product, "eLife Insurance". This lets AIML support question information of different business types and improves the question recognition rate.
- the above-mentioned synonym matching looks up the synonyms corresponding to each phrase or field obtained by the Chinese word segmentation, based on a configuration file or database table that records the correspondence between Chinese vocabulary and its synonyms.
- replacing the first field corresponding to the second field according to the second field means replacing a phrase or field in the text information with a corresponding synonym to obtain a new text. The first fields in the second text are replaced multiple times, each time replacing one or more of the first fields with the corresponding second fields, to obtain multiple target texts.
- the aforementioned replacement module 6 includes:
- the replacement unit is used to replace, in the second text, one or more fields in the first field with the corresponding second field to obtain multiple target texts.
- replacing one or more of the first fields with the corresponding second fields includes replacing, each time, a single first field with its corresponding second field, and replacing, each time, several first fields with their corresponding second fields. When a first field corresponds to multiple second fields, the first field is replaced with each of those second fields one by one.
- the text information is "What are the benefits of applying for pension insurance?"
- the text information is segmented into first fields such as "applying", "pension", "insurance", "benefits" and "what", and synonyms are then matched for each first field.
- the original text information "What are the benefits of applying for pension insurance?" is also used as a target text.
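- the replacement step can be sketched as a Cartesian product over the options for each first field, where the options are the field itself plus its synonyms. The function name, the sample segmentation, and the synonym table below are illustrative assumptions; the original text is always the first result.

```python
from itertools import product


def expand_with_synonyms(first_fields, synonyms):
    """Build every target text obtained by replacing zero or more first fields.

    first_fields: ordered list of fields produced by word segmentation.
    synonyms: dict mapping a first field to a list of second fields (synonyms).
    """
    # For each position, the candidates are the original field and its synonyms.
    options = [
        [field] + [s for s in synonyms.get(field, []) if s != field]
        for field in first_fields
    ]
    # dict.fromkeys removes duplicates while preserving generation order.
    return list(dict.fromkeys("".join(combo) for combo in product(*options)))


# Hypothetical segmentation and synonym table for the pension insurance example.
fields = ["办理", "养老", "保险", "有", "什么", "好处"]
table = {"办理": ["购买"], "好处": ["优点"]}
targets = expand_with_synonyms(fields, table)
```

- the number of target texts grows multiplicatively with the number of synonyms per field, so a production system would likely cap the number of replacements or candidates.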
- if the question information entered by the user is in traditional Chinese, the returned data is also converted into traditional Chinese and used as the answer to the question information.
- multiple target texts are obtained, so that AIML can better recognize the text information and improve the question recognition rate and matching rate.
- the preset question and answer file is a question and answer data table corresponding to multiple service types in the intelligent question and answer database, and the question and answer data table includes a corresponding table of the preset question and the first answer.
- the above-mentioned preset question and answer files can be configured through the background management system. Specifically, each business person can log in to the background management system with his account. After entering the system, they can configure the question and answer data table, the synonym table, the professional vocabulary list for Chinese word segmentation, the correspondence table between traditional and simplified Chinese, and so on, and can modify or delete the data in those tables.
- the intelligent question answering system interface is then called, which triggers the intelligent question answering system to generate the question and answer data table, the synonym table, the professional vocabulary list for Chinese word segmentation, the correspondence table between traditional and simplified Chinese, and so on, or to update the data in those tables.
- each business person can only see and configure the relevant content of his business type to ensure data security and avoid mutual interference.
- semantic analysis and grammatical analysis are performed on the text information to determine the type of business consulted in the user's question information according to the keywords in the text information.
- the user may be prompted to select the business type to consult from a list of options.
- the preset question corresponding to the matching target text in the data table.
- if the match is successful in the second obtaining module 8, a preset question that is the same as the question information exists in the question and answer data table, so the first answer corresponding to that preset question is used as the answer to the target text.
- the multiple target texts are all derived from the same text information, so their meanings are basically the same and, in general, a single first answer is obtained as the answer to the text information. However, some target texts may differ in meaning and match a first answer that is inconsistent with the answers of the other target texts. In that case, the different first answers matched by the target texts are all returned to the front-end interface for the user's reference.
- the above-mentioned first obtaining module 1 includes:
- the first acquiring unit is used to acquire the user's voice signal, and the voice signal carries the question information;
- the processing unit is used to perform voice preprocessing on the voice signal to obtain an observation sequence of the voice signal
- the detection unit is used to detect whether the similarity between the observation sequence and the observation sequence corresponding to the preset text is greater than the preset similarity
- the voice signal is composed of the user's voice and environmental noise, and the environmental noise will cause interference to the voice recognition, so the voice signal is voice preprocessed.
- the above-mentioned voice preprocessing is to divide the voice signal into frames through the VAD (Voice Activity Detection) technology and establish an HMM (Hidden Markov Model) model corresponding to the voice signal.
- the user's voice signal is divided into overlapping voice frames according to its period, so that the LPC (Linear Predictive Coding) spectrum estimates are correlated from frame to frame. An endpoint detection algorithm finds the start and end of the speech, and the intensity and the number of zero crossings of each speech frame are then used to calculate an energy and zero-crossing threshold, which removes most of the environmental noise. The speech signal is passed through a low-order low-pass filter to flatten the signal in the frequency domain and weaken the influence of finite word length effects during signal processing. Windowing is performed on each speech frame to reduce the signal discontinuity at the start and end of the frame. Autocorrelation analysis is performed on each speech frame to obtain the autocorrelation coefficients, and the Levinson-Durbin algorithm is used to find the LPC coefficients. The LPC coefficients are weighted using a tapered window to obtain the cepstral coefficients, and the cepstral coefficients are used as the feature vector of the speech frame. Furthermore, the time-domain Ce
- in the matching unit, HMM training is performed with each phoneme in the preset Chinese vocabulary (the fields contained in the preset text) as a unit: digitized voice sample values are obtained and then preprocessed, feature vectors are extracted, vector quantization is applied, and Baum-Welch modeling is performed, which yields the observation sequence of the speech model corresponding to the preset text. When the observation sequence of the speech signal is matched against the observation sequences of the speech models corresponding to the preset texts, a probability is calculated for the observation sequence of each speech signal.
- the maximum likelihood estimation algorithm is used to calculate the maximum probability (that is, the similarity between the above observation sequence and the observation sequence corresponding to the preset text). If the maximum probability is greater than the preset similarity, the preset text whose observation sequence gives the speech signal observation sequence the greatest probability is used as the text information corresponding to the question information.
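- the text does not spell out how the probability is computed. As one plausible reading, the sketch below scores a vector-quantized observation sequence against each preset text's discrete HMM with the standard forward algorithm, then keeps the best-scoring preset text only if its probability exceeds the preset similarity. The function names, the model layout `(start, trans, emit)`, and the threshold value are assumptions.

```python
def forward_probability(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm.

    obs:   non-empty list of symbol indices (vector-quantized feature vectors)
    start: start[i]    = probability of beginning in state i
    trans: trans[i][j] = probability of moving from state i to state j
    emit:  emit[i][k]  = probability of state i emitting symbol k
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    return sum(alpha)


def recognise(obs, models, preset_similarity):
    """Return the preset text whose model best explains obs, or None."""
    best_text, best_p = None, 0.0
    for text, (start, trans, emit) in models.items():
        p = forward_probability(obs, start, trans, emit)
        if p > best_p:
            best_text, best_p = text, p
    return best_text if best_p > preset_similarity else None
```

- real recognizers usually work with log-probabilities to avoid underflow on long sequences; plain probabilities are kept here so that the comparison with the preset similarity stays visible.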
- the above-mentioned second matching module includes:
- the analysis unit is used to perform semantic analysis on the text information to analyze the business type corresponding to the text information;
- the searching unit is configured to find the first preset question and answer file corresponding to the service type in the preset question and answer file based on the service type corresponding to the text information;
- the second obtaining unit is used to obtain the similarity between the target text and the question text in the first preset question and answer file
- the judging unit is used to judge whether the similarity reaches a preset threshold
- the determining unit is configured to determine that the matching is successful if the similarity reaches the preset threshold, and determine that the matching fails if it does not.
- the above semantic analysis includes performing Chinese word segmentation on the text information, using a statistical language model to determine the optimal word segmentation result, and calculating the weight of each term after segmentation according to a term-weighting method.
- the core words in the text are extracted according to the weight of each term;
- the language model can be an N-Gram model based on HMM, or a language model based on a recurrent neural network, such as a state-of-the-art language model; term-weighting methods include TF-IDF, Okapi, MI, ATC, LTU, etc.; the more times a term appears in the text, the greater its weight and the more important it is.
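- of the term-weighting methods listed, TF-IDF is the simplest to illustrate. The sketch below assumes the text has already been segmented into terms and uses a smoothed inverse document frequency; the function names, the smoothing formula, and the tiny corpus are illustrative assumptions rather than details given in the text.

```python
import math
from collections import Counter


def tf_idf(doc_terms, corpus):
    """Weight each term of one segmented text against a corpus of segmented texts."""
    counts = Counter(doc_terms)
    n_docs = len(corpus)
    weights = {}
    for term, count in counts.items():
        doc_freq = sum(1 for doc in corpus if term in doc)
        # Smoothed IDF: terms found in every document still get a positive weight.
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1
        weights[term] = (count / len(doc_terms)) * idf
    return weights


def core_words(doc_terms, corpus, top_k=2):
    """Return the top_k terms by weight (ties broken by term for determinism)."""
    weights = tf_idf(doc_terms, corpus)
    return sorted(weights, key=lambda t: (-weights[t], t))[:top_k]


# Hypothetical segmented questions from three business types.
corpus = [
    ["养老", "保险", "有", "什么", "好处"],
    ["车", "保险", "有", "什么", "优点"],
    ["篮球", "有", "什么", "规则"],
]
```

- terms that occur in every question, such as function words, receive the lowest weights, so the core words tend to be the ones that distinguish a business type.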
- Each service type corresponds to one or more first preset question and answer files, and the first preset question and answer files are included in the preset question and answer files.
- matching the target text with the preset question is actually a process of text matching.
- text matching models can be divided into single semantic models, multiple semantic models, matching matrix models, and deep inter-sentence models.
- the single semantic model encodes the two sentences with a fully connected, CNN or RNN neural network and then calculates the similarity between them, without considering the local structure of the phrases in the sentences, such as DSSM (Deep Structured Semantic Models); the multiple semantic model interprets sentences from a multi-granularity perspective, taking the local structure of the sentence into account, such as MV-LSTM (MultiView Long Short Term Memory); the matching matrix model directly calculates the similarity between the different words of the two sentences and then uses a deep network to extract features, taking the interaction between different words into account and handling the relationships in the sentences more precisely, such as Text Matching as Image Recognition; the deep inter-sentence model uses interaction mechanisms such as attention and a more refined structure to mine the relationships between different words within and between sentences, such as state-of-the-art models.
- the similarity between the target text and the question text can be calculated with the similarity algorithm of any one of the above text matching models, and it is then judged whether the similarity reaches the preset threshold. If so, the match succeeds; otherwise, the match fails.
- for example, the question information is the text information "What are the benefits of applying for pension insurance?"
- the target texts are "What are the benefits of having pension insurance?" and "What are the advantages of buying pension insurance?"
- each target text is matched against the preset questions in the question and answer file: the similarity to the preset questions is calculated for "What are the benefits of having pension insurance", "What are the advantages of buying pension insurance", and "What are the benefits of applying for pension insurance"
- if the similarity between one or more target texts and the question "What are the advantages of pension insurance" in the first preset question and answer file reaches the preset threshold, the answer corresponding to "What are the advantages of pension insurance?" is used as the answer to "What are the benefits of pension insurance?";
- multiple target texts may also match multiple question texts in the first preset question and answer file. Although this is a low-probability event, if it happens, the answers corresponding to those question texts are all used as answers to the question information for the user's reference.
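- the threshold-based matching in this example can be sketched as follows. The neural matching models named in the text are replaced here by a simple character-overlap ratio from Python's standard `difflib`, purely as a stand-in; the function names, the 0.8 threshold, and the sample question and answer file are assumptions.

```python
from difflib import SequenceMatcher


def similarity(target_text, question_text):
    """Stand-in similarity in [0, 1]; a real system would use a text matching model."""
    return SequenceMatcher(None, target_text, question_text).ratio()


def match_targets(target_texts, qa_file, threshold=0.8):
    """Collect the distinct first answers whose preset question matches any target text.

    qa_file: dict mapping preset question -> first answer, for one business type.
    An empty result means the match failed for every target text.
    """
    answers = []
    for question, answer in qa_file.items():
        if any(similarity(t, question) >= threshold for t in target_texts):
            if answer not in answers:
                answers.append(answer)
    return answers


# Hypothetical first preset question and answer file for the insurance business.
qa_file = {"养老保险有什么优点": "A1", "篮球比赛每队几人": "A2"}
targets = ["办理养老保险有什么好处", "购买养老保险有什么优点"]
matched = match_targets(targets, qa_file)
```

- returning a list rather than a single answer mirrors the case described above in which several question texts match and all of their answers are shown to the user.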
- the above device further includes:
- the second receiving module is configured to receive preset questions and first answers corresponding to various service types
- the writing module is used to write the preset question and the first answer corresponding to the first service type in the first preset question and answer file, where the first service type is included in all service types, and the first preset question and answer file is included in All preset question and answer files.
- the above-mentioned first preset question and answer file can be configured through the background management system. Specifically, each business person logs into the background management system (part of the intelligent question answering system) through his account, enters an instruction to create a first preset question and answer file, and enters the preset questions and first answers of the business type he is responsible for. For example, a business person in charge of the basketball business uses his account to log in to the background management system; the background management system identifies the account and, according to the preset permissions corresponding to the account (only the editing and browsing permissions related to the basketball business), displays the editable Q&A data table corresponding to those permissions.
- the business person enters the basketball-related preset questions and first answers in the Q&A data table, such as the preset question "Which jersey number did the NBA's Rockets retire for Yao Ming?" and the first answer "No. 11". The background management system receives the preset question and first answer and, according to the confirmation instruction entered by the user, generates the first preset question and answer file corresponding to the basketball business from the Q&A data table.
- the business staff in charge of the football business can only edit the first question and answer file corresponding to the football business. Due to account permissions, each business person can only see and configure the relevant content of his business type, ensuring data security and avoiding mutual interference, and enabling the intelligent question answering system to support multiple business scenarios without interference.
- the above device further includes:
- the query module is used to send a query request to the preset URL address if the target text fails to match the preset question in the preset question and answer file, and the query request carries text information;
- the first receiving module is used to receive the return data corresponding to the query request
- the parsing module is used to parse the returned data, obtain a number of second answers with the highest relevance in the returned data, and use the number of second answers as answers corresponding to the text information.
- the data crawling technology is used to obtain the answer to the user's question.
- the above-mentioned data crawling technology is a web crawler, which is a program or script that automatically crawls the information at a specified URL according to certain rules. Web crawlers are widely used by Internet search engines and similar websites, and can automatically collect all the page content they can access in order to obtain or update the content and retrieval methods of those websites.
- the calling address of a search engine (such as Baidu search) is analyzed in advance according to the business type in the text information and used as the preset URL address. The crawler program sends to the calling address a search query request that carries the text information and asks for several second answers corresponding to the text information, obtains the return data (HTML code) returned by the calling address, and then parses the HTML code with jsoup (a Java HTML parser) to obtain the second answers corresponding to the text information.
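- the request-and-parse flow can be sketched without any network access as shown below. The text names jsoup, a Java library; this sketch swaps in Python's standard `html.parser` for illustration. The query parameter name `q`, the `div class="result"` markup, and all function and class names are hypothetical, since the real page structure depends on the search engine.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


def build_query_url(preset_url, text):
    """Attach the text information to the preset URL as a query parameter."""
    return preset_url + "?" + urlencode({"q": text})


class ResultParser(HTMLParser):
    """Collect the text of every <div class="result"> element (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._depth = 0  # greater than 0 while inside a result <div>
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth > 0:
            self._depth += 1  # nested <div> inside a result
        elif ("class", "result") in attrs:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def parse_second_answers(html, top_n=3):
    """Return the first top_n result texts found in the returned HTML code."""
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    return parser.results[:top_n]
```

- the parser only matches an exact `class="result"` attribute; handling multiple class names or a different page layout would need a more flexible selector, which is where a library such as jsoup is convenient.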
- finding the answer to a question with crawler technology is itself a process of filtering for relevant answers. For example, the answer list displayed when searching for the question "What are the benefits of pension insurance" through Baidu search has already been screened by the Baidu search engine, but in order to obtain more accurate answers, the most relevant answer in the answer list is taken as the second answer.
- the second answer is highly relevant and reliable.
- the first several answers are provided for the user's reference; their relevance to the question can be analyzed according to the position in the search result list, the number of views of the result, the number of likes, and the number of "useful" marks.
- answer value = (the preset score value corresponding to the position / the position in the result list) × weight 1 + the number of views of the result × weight 2 + the number of likes × weight 3 + the useful number × weight 4. The value of each answer in the result list is calculated in this way, and the answer with the highest value is used as the second answer.
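- the weighted relevance value can be sketched as below. The slash in the formula is read here as a division of a preset score by the 1-based position, which is an assumption; the preset score of 100, the four weights, and all names are likewise illustrative values that the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    position: int  # 1-based position in the search result list
    views: int
    likes: int
    useful: int


def answer_value(c, position_score=100.0, weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted relevance value of one candidate answer."""
    w1, w2, w3, w4 = weights
    return (
        (position_score / c.position) * w1
        + c.views * w2
        + c.likes * w3
        + c.useful * w4
    )


def pick_second_answers(candidates, top_n=1):
    """Return the top_n candidates ordered by descending answer value."""
    return sorted(candidates, key=answer_value, reverse=True)[:top_n]


first = Candidate("a", position=1, views=10, likes=2, useful=1)
second = Candidate("b", position=2, views=200, likes=5, useful=3)
best = pick_second_answers([first, second])[0]
```

- with these sample numbers the lower-ranked result wins because its view count outweighs the position term, which shows why the weights need tuning per data source.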
- the second answer corresponding to the text information is obtained through data crawling technology, so that the machine has a learning function.
- the above device further includes:
- the generation module is used to add several second answers to the preset blank list to generate the answer list and save the answer list;
- the first display module is used for displaying a list of answers and for the user to select each second answer as a useful option
- the first accumulation module is used to accumulate the first useful number corresponding to each second answer based on the user's selection of the second answer as a useful option;
- the first judgment module is used to judge whether the first useful number of the second answer reaches a preset value
- the adding module is used to add, if so, the second answer whose first useful number reaches the preset value and the first text information to the preset question and answer file, where the first text information is the text information corresponding to the second answer and serves as the preset question in the preset question and answer file, and the second answer whose first useful number reaches the preset value serves as the first answer in the preset question and answer file.
- the second answer is displayed in the form of a list.
- the second answer can also be converted into voice output.
- each second answer in the above answer list has a corresponding option for marking that second answer as useful.
- the user can select the useful option for a second answer that they agree with, and the first useful number of that second answer is accumulated.
- the answer with the highest count is used as the answer to subsequent questions; that is, when the first useful number of one of the second answers reaches the preset value, that second answer is added to the preset question and answer file as the answer to the corresponding question, which enhances the learning ability of the question answering robot and makes it more intelligent.
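- the promotion of a second answer into the preset question and answer file can be sketched as a counter with a threshold. The class name, the method name, the dict-based question and answer file, and the preset value of 3 are assumptions made for illustration.

```python
class AnswerList:
    """Saved list of second answers for one piece of text information."""

    def __init__(self, text_information, second_answers, preset_value=3):
        self.text_information = text_information
        self.first_useful = {answer: 0 for answer in second_answers}
        self.preset_value = preset_value

    def mark_useful(self, answer, qa_file):
        """Record one 'useful' selection and promote the answer once it qualifies.

        qa_file: dict mapping preset question -> first answer.
        Returns True if the answer was written to the question and answer file.
        """
        self.first_useful[answer] += 1
        if self.first_useful[answer] >= self.preset_value:
            # The text information becomes the preset question,
            # and the second answer becomes its first answer.
            qa_file[self.text_information] = answer
            return True
        return False
```

- a usage pattern would be to call `mark_useful` from the handler of the "useful" option in the front-end, passing the question and answer file of the relevant business type.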
- the above device further includes:
- the second display module is configured to display the first answer corresponding to the first text information when the text information matches the first text information, and display options for the user to select whether the first answer is useful or useless;
- the second accumulation module is used to accumulate the second useful number and useless number corresponding to the first answer based on the user's choice of the first answer as useful or useless option, and the second useful number is accumulated on the basis of the first useful number;
- the second judgment module is used to judge whether the second useful number is less than the useless number
- the deleting module is used for deleting the first answer and the first text information corresponding to the first answer in the preset question and answer file if the second useful number is less than the useless number.
- when the Q&A robot again receives text information parsed by the parsing module and that text information matches the first text information, the first answer corresponding to the first text information in the preset Q&A file is retrieved and returned to the front-end UI, and both the useful and useless options for the first answer are displayed. Because answers are time-sensitive, it is judged whether the second useful number is less than the useless number. When the second useful number is less than the useless number, the first answer is no longer accepted by users, so the first answer and its corresponding first text information are deleted. For example, people in ancient times believed that the sun moved around the earth, but today people consider this wrong, so the useless number for such an answer will grow until it exceeds the second useful number.
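- the deletion rule can be sketched in the same style. The second useful number starts from the first useful number, and the entry is removed as soon as it falls below the useless number. The `Feedback` type, the function name, and the dict-based storage are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    useful: int       # second useful number, seeded with the first useful number
    useless: int = 0  # useless number


def record_feedback(qa_file, feedback, first_text, is_useful):
    """Update the counters for one answer and delete it if it is no longer accepted.

    qa_file:  dict mapping first text information -> first answer
    feedback: dict mapping first text information -> Feedback
    Returns True if the entry is kept, False if it was deleted.
    """
    entry = feedback[first_text]
    if is_useful:
        entry.useful += 1
    else:
        entry.useless += 1
    if entry.useful < entry.useless:
        del qa_file[first_text]
        del feedback[first_text]
        return False
    return True
```

- once an entry is deleted, the next matching question falls through to the crawler path again, so an outdated answer can be replaced by a newer one.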
- the computer device in an embodiment of the present application includes a memory and an executor.
- the memory stores a computer program.
- the executor implements the steps of the AIML-based intelligent question answering method when the computer program is executed.
- the above-mentioned computer equipment may be a server, and its internal structure may be as shown in FIG. 3.
- the computer equipment includes a processor, a memory, a network interface and a database connected through a system bus.
- the processor of the computer device is used to provide calculation and control capabilities.
- the memory of the computer device includes a non-volatile storage medium and an internal memory.
- the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
- the memory provides an environment for the operation of the operating system and computer readable instructions in the non-volatile storage medium.
- the database of the computer equipment is used to store data such as tasks, database tables, and tables to be processed.
- the network interface of the computer device is used to communicate with an external terminal through a network connection.
- FIG. 3 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
- the storage medium in an embodiment of the present application is a computer-readable storage medium.
- the computer-readable storage medium may be non-volatile or volatile.
- a computer program is stored thereon, and the steps of the AIML-based intelligent question answering method are implemented when the computer program is executed by the executor.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Claims (21)
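- An AIML-based intelligent question answering method, AIML being the Artificial Intelligence Markup Language, comprising: obtaining question information input by a user, and obtaining text information according to the question information; converting the Chinese in the text information into the same Chinese font to obtain a first text corresponding to the text information, the Chinese font being simplified Chinese or traditional Chinese; deleting specified symbols in the first text according to preset filtering rules to obtain a second text; performing Chinese word segmentation on the second text according to preset Chinese word segmentation rules to obtain multiple first fields corresponding to the second text; performing synonym matching on each first field to obtain a second field corresponding to each first field; in the second text, replacing the first field corresponding to the second field according to the second field to obtain multiple target texts; matching each target text with the preset questions in a preset question and answer file, the preset question and answer file containing mapping relationship information between the preset questions and first answers; and if the target text is successfully matched with a preset question in the preset question and answer file, obtaining the first answer corresponding to the preset question, so as to use the first answer as the answer to the question information.
- The AIML-based intelligent question answering method according to claim 1, wherein the step of obtaining question information input by a user and obtaining text information according to the question information comprises: obtaining a voice signal of the user, the voice signal carrying the question information; performing voice preprocessing on the voice signal to obtain an observation sequence of the voice signal; detecting whether the similarity between the observation sequence and the observation sequence corresponding to a preset text is greater than a preset similarity; and if it is greater than the preset similarity, using the preset text as the text information corresponding to the question information.
- The AIML-based intelligent question answering method according to claim 1, wherein the step of matching each target text with the preset questions in the preset question and answer file comprises: performing semantic analysis on the text information to determine the business type corresponding to the text information; based on the business type corresponding to the text information, finding the first preset question and answer file corresponding to the business type in the preset question and answer file; obtaining the similarity between the target text and the question text in the first preset question and answer file; judging whether the similarity reaches a preset threshold; and if the similarity reaches the preset threshold, determining that the matching is successful, and if not, determining that the matching fails.
- The AIML-based intelligent question answering method according to claim 1, wherein the step of, in the second text, replacing the first field corresponding to the second field according to the second field to obtain multiple target texts comprises: in the second text, replacing one or more fields in the first fields with the corresponding second fields to obtain multiple target texts.
- The AIML-based intelligent question answering method according to claim 1, wherein, after the step of matching each target text with the preset questions in the preset question and answer file, the method comprises: if the target text fails to match the preset questions in the preset question and answer file, sending a query request to a preset URL address, the query request carrying the text information; receiving the return data corresponding to the query request; and parsing the return data, obtaining several second answers with the highest relevance in the return data, and using the several second answers as the answers corresponding to the text information.
- The AIML-based intelligent question answering method according to claim 5, wherein, after the step of using the several second answers as the answers corresponding to the text information, the method comprises: adding the several second answers to a preset blank list to generate an answer list, and saving the answer list; displaying the answer list and, for each second answer, an option for the user to select that second answer as useful; based on the user selecting the option that a second answer is useful, accumulating the first useful number corresponding to each second answer; judging whether the first useful number of the second answer reaches a preset value; and if so, adding the second answer whose first useful number reaches the preset value, together with first text information, to the preset question and answer file, the first text information being the text information corresponding to the second answer and serving as a preset question in the preset question and answer file, and the second answer whose first useful number reaches the preset value serving as the first answer in the preset file.
- The AIML-based intelligent question answering method according to claim 6, wherein, after the step of adding the second answer whose useful number reaches the preset value and the text information corresponding to the second answer to the preset question and answer file, the method further comprises: when the text information matches the first text information, displaying the first answer corresponding to the first text information, and displaying options for the user to select the first answer as useful or useless; based on the user selecting the option that the first answer is useful or useless, accumulating the second useful number and the useless number corresponding to the first answer, the second useful number being accumulated on the basis of the first useful number; judging whether the second useful number is less than the useless number; and if the second useful number is less than the useless number, deleting the first answer and the first text information corresponding to the first answer from the preset question and answer file.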
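- A computer device, comprising a memory and an executor, the memory storing a computer program, wherein the executor implements the following steps when executing the computer program: obtaining question information input by a user, and obtaining text information according to the question information; converting the Chinese in the text information into the same Chinese font to obtain a first text corresponding to the text information, the Chinese font being simplified Chinese or traditional Chinese; deleting specified symbols in the first text according to preset filtering rules to obtain a second text; performing Chinese word segmentation on the second text according to preset Chinese word segmentation rules to obtain multiple first fields corresponding to the second text; performing synonym matching on each first field to obtain a second field corresponding to each first field; in the second text, replacing the first field corresponding to the second field according to the second field to obtain multiple target texts; matching each target text with the preset questions in a preset question and answer file, the preset question and answer file containing mapping relationship information between the preset questions and first answers; and if the target text is successfully matched with a preset question in the preset question and answer file, obtaining the first answer corresponding to the preset question, so as to use the first answer as the answer to the question information.
- The computer device according to claim 8, wherein the step of obtaining question information input by a user and obtaining text information according to the question information comprises: obtaining a voice signal of the user, the voice signal carrying the question information; performing voice preprocessing on the voice signal to obtain an observation sequence of the voice signal; detecting whether the similarity between the observation sequence and the observation sequence corresponding to a preset text is greater than a preset similarity; and if it is greater than the preset similarity, using the preset text as the text information corresponding to the question information.
- The computer device according to claim 8, wherein the step of matching each target text with the preset questions in the preset question and answer file comprises: performing semantic analysis on the text information to determine the business type corresponding to the text information; based on the business type corresponding to the text information, finding the first preset question and answer file corresponding to the business type in the preset question and answer file; obtaining the similarity between the target text and the question text in the first preset question and answer file; judging whether the similarity reaches a preset threshold; and if the similarity reaches the preset threshold, determining that the matching is successful, and if not, determining that the matching fails.
- The computer device according to claim 8, wherein the step of, in the second text, replacing the first field corresponding to the second field according to the second field to obtain multiple target texts comprises: in the second text, replacing one or more fields in the first fields with the corresponding second fields to obtain multiple target texts.
- The computer device according to claim 8, wherein, after the step of matching each target text with the preset questions in the preset question and answer file, the steps comprise: if the target text fails to match the preset questions in the preset question and answer file, sending a query request to a preset URL address, the query request carrying the text information; receiving the return data corresponding to the query request; and parsing the return data, obtaining several second answers with the highest relevance in the return data, and using the several second answers as the answers corresponding to the text information.
- The computer device according to claim 12, wherein, after the step of using the several second answers as the answers corresponding to the text information, the steps comprise: adding the several second answers to a preset blank list to generate an answer list, and saving the answer list; displaying the answer list and, for each second answer, an option for the user to select that second answer as useful; based on the user selecting the option that a second answer is useful, accumulating the first useful number corresponding to each second answer; judging whether the first useful number of the second answer reaches a preset value; and if so, adding the second answer whose first useful number reaches the preset value, together with first text information, to the preset question and answer file, the first text information being the text information corresponding to the second answer and serving as a preset question in the preset question and answer file, and the second answer whose first useful number reaches the preset value serving as the first answer in the preset file.
- The computer device according to claim 13, wherein, after the step of adding the second answer whose useful number reaches the preset value and the text information corresponding to the second answer to the preset question and answer file, the steps further comprise: when the text information matches the first text information, displaying the first answer corresponding to the first text information, and displaying options for the user to select the first answer as useful or useless; based on the user selecting the option that the first answer is useful or useless, accumulating the second useful number and the useless number corresponding to the first answer, the second useful number being accumulated on the basis of the first useful number; judging whether the second useful number is less than the useless number; and if the second useful number is less than the useless number, deleting the first answer and the first text information corresponding to the first answer from the preset question and answer file.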
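- A computer storage medium, on which a computer program is stored, wherein the following steps are implemented when the computer program is executed by an executor: obtaining question information input by a user, and obtaining text information according to the question information; converting the Chinese in the text information into the same Chinese font to obtain a first text corresponding to the text information, the Chinese font being simplified Chinese or traditional Chinese; deleting specified symbols in the first text according to preset filtering rules to obtain a second text; performing Chinese word segmentation on the second text according to preset Chinese word segmentation rules to obtain multiple first fields corresponding to the second text; performing synonym matching on each first field to obtain a second field corresponding to each first field; in the second text, replacing the first field corresponding to the second field according to the second field to obtain multiple target texts; matching each target text with the preset questions in a preset question and answer file, the preset question and answer file containing mapping relationship information between the preset questions and first answers; and if the target text is successfully matched with a preset question in the preset question and answer file, obtaining the first answer corresponding to the preset question, so as to use the first answer as the answer to the question information.
- The computer storage medium according to claim 15, wherein the step of obtaining question information input by a user and obtaining text information according to the question information comprises: obtaining a voice signal of the user, the voice signal carrying the question information; performing voice preprocessing on the voice signal to obtain an observation sequence of the voice signal; detecting whether the similarity between the observation sequence and the observation sequence corresponding to a preset text is greater than a preset similarity; and if it is greater than the preset similarity, using the preset text as the text information corresponding to the question information.
- The computer storage medium according to claim 15, wherein the step of matching each target text with the preset questions in the preset question and answer file comprises: performing semantic analysis on the text information to determine the business type corresponding to the text information; based on the business type corresponding to the text information, finding the first preset question and answer file corresponding to the business type in the preset question and answer file; obtaining the similarity between the target text and the question text in the first preset question and answer file; judging whether the similarity reaches a preset threshold; and if the similarity reaches the preset threshold, determining that the matching is successful, and if not, determining that the matching fails.
- The computer storage medium according to claim 15, wherein the step of, in the second text, replacing the first field corresponding to the second field according to the second field to obtain multiple target texts comprises: in the second text, replacing one or more fields in the first fields with the corresponding second fields to obtain multiple target texts.
- The computer storage medium according to claim 15, wherein, after the step of matching each target text with the preset questions in the preset question and answer file, the steps comprise: if the target text fails to match the preset questions in the preset question and answer file, sending a query request to a preset URL address, the query request carrying the text information; receiving the return data corresponding to the query request; and parsing the return data, obtaining several second answers with the highest relevance in the return data, and using the several second answers as the answers corresponding to the text information.
- The computer storage medium according to claim 19, wherein, after the step of using the several second answers as the answers corresponding to the text information, the steps comprise: adding the several second answers to a preset blank list to generate an answer list, and saving the answer list; displaying the answer list and, for each second answer, an option for the user to select that second answer as useful; based on the user selecting the option that a second answer is useful, accumulating the first useful number corresponding to each second answer; judging whether the first useful number of the second answer reaches a preset value; and if so, adding the second answer whose first useful number reaches the preset value, together with first text information, to the preset question and answer file, the first text information being the text information corresponding to the second answer and serving as a preset question in the preset question and answer file, and the second answer whose first useful number reaches the preset value serving as the first answer in the preset file.
- The computer storage medium according to claim 20, wherein, after the step of adding the second answer whose useful number reaches the preset value and the text information corresponding to the second answer to the preset question and answer file, the steps further comprise: when the text information matches the first text information, displaying the first answer corresponding to the first text information, and displaying options for the user to select the first answer as useful or useless; based on the user selecting the option that the first answer is useful or useless, accumulating the second useful number and the useless number corresponding to the first answer, the second useful number being accumulated on the basis of the first useful number; judging whether the second useful number is less than the useless number; and if the second useful number is less than the useless number, deleting the first answer and the first text information corresponding to the first answer from the preset question and answer file.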
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910435063.2 | 2019-05-23 | ||
CN201910435063.2A CN110321416A (zh) | 2019-05-23 | 2019-05-23 | AIML-based intelligent question answering method, apparatus, computer device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020233386A1 (zh) | 2020-11-26 |
Family
ID=68118991
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/088052 WO2020233386A1 (zh) | 2019-05-23 | 2020-04-30 | AIML-based intelligent question answering method, apparatus, computer device, and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110321416A (zh) |
WO (1) | WO2020233386A1 (zh) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
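CN110321416A (zh) * | 2019-05-23 | 2019-10-11 | 深圳壹账通智能科技有限公司 | AIML-based intelligent question answering method, apparatus, computer device, and storage medium |
CN110795548A (zh) * | 2019-10-25 | 2020-02-14 | 招商局金融科技有限公司 | Intelligent question answering method, apparatus, and computer-readable storage medium |
CN111582996B (zh) * | 2020-05-20 | 2023-11-24 | 拉扎斯网络科技(上海)有限公司 | Method and apparatus for displaying business information |
CN113807148B (zh) * | 2020-06-16 | 2024-07-02 | 阿里巴巴集团控股有限公司 | Text recognition and matching method and apparatus, and terminal device |
CN112069230B (zh) * | 2020-09-07 | 2023-10-27 | 中国平安财产保险股份有限公司 | Data analysis method, apparatus, device, and storage medium |
CN112667789B (zh) * | 2020-12-17 | 2024-07-26 | 中国平安人寿保险股份有限公司 | User intent matching method and apparatus, terminal device, and storage medium |
CN116821304B (zh) * | 2023-07-07 | 2023-12-19 | 国网青海省电力公司信息通信公司 | Big-data-based intelligent knowledge question answering system for power supply stations |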
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
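CN105677822A (zh) * | 2016-01-05 | 2016-06-15 | 首都师范大学 | Automatic admissions question answering method and system based on a dialogue robot |
CN107688667A (zh) * | 2017-09-30 | 2018-02-13 | 平安科技(深圳)有限公司 | Intelligent robot customer service method, electronic apparatus, and computer-readable storage medium |
CN109241258A (zh) * | 2018-08-23 | 2019-01-18 | 江苏索迩软件技术有限公司 | Deep-learning intelligent question answering system applied to the taxation field |
CN109325040A (zh) * | 2018-07-13 | 2019-02-12 | 众安信息技术服务有限公司 | FAQ question-and-answer library generalization method, apparatus, and device |
CN110321416A (zh) * | 2019-05-23 | 2019-10-11 | 深圳壹账通智能科技有限公司 | AIML-based intelligent question answering method, apparatus, computer device, and storage medium |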
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9460085B2 (en) * | 2013-12-09 | 2016-10-04 | International Business Machines Corporation | Testing and training a question-answering system |
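CN107066541A (zh) * | 2017-03-13 | 2017-08-18 | 平安科技(深圳)有限公司 | Method and system for processing customer service question-and-answer data |
CN107301865B (zh) * | 2017-06-22 | 2020-11-03 | 海信集团有限公司 | Method and apparatus for determining interactive text in voice input |
CN107220380A (zh) * | 2017-06-27 | 2017-09-29 | 北京百度网讯科技有限公司 | Artificial-intelligence-based question-and-answer recommendation method, apparatus, and computer device |
CN107609101B (zh) * | 2017-09-11 | 2020-10-27 | 远光软件股份有限公司 | Intelligent interaction method, device, and storage medium |
CN109766423A (zh) * | 2018-12-29 | 2019-05-17 | 上海智臻智能网络科技股份有限公司 | Neural-network-based question answering method and apparatus, storage medium, and terminal |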
- 2019-05-23 CN CN201910435063.2A patent/CN110321416A/zh active Pending
- 2020-04-30 WO PCT/CN2020/088052 patent/WO2020233386A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110321416A (zh) | 2019-10-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
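WO2020233386A1 (zh) | | AIML-based intelligent question answering method, apparatus, computer device, and storage medium |
CN112069298B (zh) | | Human-computer interaction method, device, and medium based on semantic web and intent recognition |
Malandrakis et al. | | Distributional semantic models for affective text analysis |
CN102629246B (zh) | | Server for recognizing browser voice commands and browser voice command recognition method |
CN111046656B (zh) | | Text processing method, apparatus, electronic device, and readable storage medium |
CN111708869B (zh) | | Method and apparatus for processing human-machine dialogue |
CN113468302A (zh) | | Combining parameters of multiple search queries that share a line of inquiry |
CN107704453A (zh) | | Text semantic analysis method, text semantic analysis terminal, and storage medium |
US20230394247A1 (en) | | Human-machine collaborative conversation interaction system and method |
CN113505209A (zh) | | Intelligent question answering system for the automotive field |
CN112052324A (zh) | | Intelligent question answering method, apparatus, and computer device |
KR101677859B1 (ko) | | Method for generating a system response using a knowledge base and apparatus for performing the same |
CN109614620B (zh) | | HowNet-based graph model word sense disambiguation method and system |
US11907665B2 (en) | | Method and system for processing user inputs using natural language processing |
CN112765974B (zh) | | Business assistance method, electronic device, and readable storage medium |
CN112115252B (zh) | | Intelligent assisted writing processing method, apparatus, electronic device, and storage medium |
US20220147719A1 (en) | | Dialogue management |
WO2023278052A1 (en) | | Automated troubleshooter |
JP2013190985A (ja) | | Knowledge response system, method, and computer program |
KR101333485B1 (ko) | | Method for building a named-entity dictionary using an online dictionary and apparatus for executing the same |
Aliero et al. | | Systematic review on text normalization techniques and its approach to non-standard words |
CN115017271B (zh) | | Method and system for intelligently generating RPA process component blocks |
CN113761919A (zh) | | Entity attribute extraction method for colloquial short text and electronic apparatus |
Karpagam et al. | | Deep learning approaches for answer selection in question answering system for conversation agents |
CN111046168A (zh) | | Method, apparatus, electronic device, and medium for generating patent summary information |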
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20810174; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20810174; Country of ref document: EP; Kind code of ref document: A1 |
| | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 18/03/2022) |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20810174; Country of ref document: EP; Kind code of ref document: A1 |