WO2020140372A1 - An intent recognition method, recognition device and medium based on a recognition model (一种基于识别模型的意图识别方法、识别设备及介质) - Google Patents


Info

Publication number
WO2020140372A1
Authority
WO
WIPO (PCT)
Prior art keywords
intent
query sentence
word
keyword
intention
Prior art date
Application number
PCT/CN2019/088802
Other languages
English (en)
French (fr)
Inventor
周涛涛
周宝
贾怀礼
王虎
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2020140372A1

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 — Information retrieval of unstructured textual data
    • G06F 16/33 — Querying
    • G06F 16/332 — Query formulation

Definitions

  • the present application relates to the field of artificial intelligence technology, and in particular to an intent recognition method, recognition device and medium based on a recognition model.
  • when robots perform intent recognition, they generally use a multi-class classifier to identify the user's query intent. Because a multi-class classifier normalizes its outputs when predicting, it forces every query into one of its classes: when given an irrelevant query, it still assigns that query to one of the known intents. In other words, current intent recognition methods cannot identify irrelevant queries; even when a query's intent does not belong to any intent the classifier was trained on, the classifier reports one of its own intents, so the user's intent cannot be recognized accurately.
  • Embodiments of the present application provide an intent recognition method, recognition device, and medium based on a recognition model, which helps to improve the accuracy of intent recognition.
  • an embodiment of the present application provides an intent recognition method based on a recognition model, including:
  • the intent recognition model is composed of a plurality of binary classifiers, each of which corresponds to one intent;
  • the intent recognition model is trained on query sentence samples of the intents corresponding to the binary classifiers, and the recognition result is used to indicate the intent of the target query sentence;
  • the intent of the target query sentence is either an intent under one of the binary classifiers or the irrelevant intent.
  • an embodiment of the present application provides an identification device, the identification device including a unit for performing the method of the first aspect.
  • an embodiment of the present application provides another identification device, including a processor and a memory connected to each other, where the memory is used to store a computer program that supports the identification device in performing the above method.
  • the computer program includes program instructions, and the processor is configured to call the program instructions to perform the method of the first aspect described above.
  • the identification device may further include a communication interface and/or a user interface.
  • an embodiment of the present application provides a computer-readable storage medium that stores a computer program, where the computer program includes program instructions which, when executed by a processor, cause the processor to execute the method of the first aspect described above.
  • the embodiment of the present application can perform word segmentation on an acquired query sentence to obtain its word segments, determine the word vector of each segment, select the keywords of the query sentence from those segments, and weight the keywords' word vectors to obtain the feature vector of the query sentence. The intent of the query sentence is then determined by inputting that feature vector into a preset intent recognition model, which helps to improve the accuracy of intent recognition and makes it possible to identify irrelevant intents.
  • FIG. 1 is a schematic flowchart of an intent recognition method based on a recognition model provided by an embodiment of the present application
  • FIG. 2 is a schematic flowchart of another intent recognition method based on a recognition model provided by an embodiment of the present application
  • FIG. 3 is a schematic structural diagram of an identification device provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of another identification device provided by an embodiment of the present application.
  • the technical solution of the present application may be applied to a recognition device, and the recognition device may include a server, a terminal, a robot, or other recognition devices, used for recognizing the intention of the user's query statement.
  • the terminal involved in this application may be a mobile phone, a computer, a tablet, a personal computer, a smart watch, or the like; this application does not limit it.
  • the present application can perform word segmentation on the query sentence on which intent recognition is to be performed, obtain one or more word segments, and determine the word vector of each segment, from which the feature vector of the query sentence is calculated. Alternatively, the keywords of the query sentence can first be selected from those word segments and their word vectors weighted before the feature vector is calculated. The feature vector is then input into the preset intent recognition model to determine the intent of the query sentence, which helps to improve the accuracy of intent recognition. Each step is explained in detail below.
  • FIG. 1 is a schematic flowchart of a method for intent recognition based on a recognition model provided by an embodiment of the present application. Specifically, the method of this embodiment can be applied to the above-mentioned identification device such as a robot. As shown in FIG. 1, the intent recognition method based on the recognition model may include the following steps:
  • the target query sentence may be any sentence on which intent recognition is to be performed, such as any sentence received by a recognition device such as a robot.
  • the sentence may be text, voice, or a sentence in a video.
  • the recognition device may also convert the sentence into a text sentence after acquiring the sentence, so as to quickly implement word segmentation processing on the sentence.
  • the word segmentation method used may be jieba ("stutter") word segmentation, Stanford word segmentation, or another segmentation method, which is not limited in this application.
  • after word segmentation is performed on the target query sentence, the resulting word segments (also called words or terms) that make up the sentence can be either all of its segments, or only part of them, for example the segments that remain after stop words and other meaningless segments are removed. Removing such segments reduces the computational cost of the subsequent word vectors and feature vector and helps to improve the efficiency of intent recognition.
  • a filter list may be preset, containing various stop words and other meaningless words such as "ah", "oh", or "de" (的), so that after the query sentence is segmented, meaningless words such as stop words can be found by matching the segments against the words in the filter list and then removed.
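The filtering step described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: in practice a segmenter such as jieba would produce the word segments, so here the segmented query and the small filter list are stand-in assumptions.

```python
# Illustrative stand-in for the preset filter list of stop words.
STOP_WORDS = {"ah", "oh", "de", "the", "a"}

def filter_segments(segments):
    """Drop word segments that appear in the preset filter list."""
    return [w for w in segments if w not in STOP_WORDS]

# Hypothetical pre-segmented query.
segments = ["what", "is", "the", "weather", "ah"]
print(filter_segments(segments))  # ['what', 'is', 'weather']
```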
  • the target query sentence may also contain only one word segment; that is, segmenting the target query sentence may yield one segment or several. The intent recognition method based on a single segment is the same as the one based on multiple segments; for ease of understanding, this application uses multiple segments as the example.
  • the word vector of each word segment can be calculated.
  • there are many ways to calculate the word vector of each segment. For example, various pre-collected corpora (each of which may itself be segmented) can be used to train a continuous bag-of-words (CBOW) model, using the Word2Vec framework in Gensim to obtain a CBOW model whose input is a corpus and whose output is word vectors. The word vector of each segment is then obtained by inputting the segments into the model. Alternatively, the word vectors may be calculated in any existing manner; this application does not limit how the word vector of a segment is calculated.
  • the keyword list may include one or more keywords, specifically the keywords of one or more intents. During matching, the word segments of the target query sentence are compared against the keywords in the keyword list to detect whether any segment also appears in the list. If one does, the matched keyword is taken as a keyword of the target query sentence, that is, a target keyword: a target keyword is a word segment among the sentence's segments that matches a keyword in the keyword list.
  • the execution order of steps 102 and 103 is not limited: for example, step 103 may be executed before step 102, or steps 102 and 103 may be executed simultaneously. This application does not limit the order.
  • the recognition device may set different weights for the common words among the segments (the segments other than the keywords) and for the keywords, with the keywords' weights set higher than the common words' weights. For example, a weighting coefficient may be set for each keyword to increase its weight, thereby optimizing the feature vector of the query sentence and improving the accuracy of intent recognition. That is, the keyword's word vector is multiplied by k, where k is the weighting coefficient and k is greater than 1, so that the feature vector of the query sentence is biased as far as possible toward the keyword's vector direction. Alternatively, the weight of common words can be reduced (for example, by setting their weighting coefficient between 0 and 1) while the keyword's weight is increased (a coefficient greater than 1) or kept unchanged (a coefficient of 1); in all cases, the keyword's weighting coefficient is greater than the common word's.
  • the vector of the query sentence may refer to the sum of the word vectors of the word segments obtained by segmenting the query sentence.
  • keywords can be selected from the obtained word segments, and the feature vector of the target query sentence can then be determined from the word vectors of the keywords and of the remaining common words. In this way, the reliability of the determined feature vector and, in turn, the accuracy of intent recognition based on it can be improved.
  • the word vector model is used to extract text features, which is more representative than the vector space model.
  • the feature vector of a query sentence such as the target query sentence may be determined from the weighted word vector of each segment (the weighted keyword vectors together with the unweighted common-word vectors); the feature vector of the target query sentence may be the same as the sentence's vector, or it may differ.
  • the recognition device may calculate the sum of the weighted word vectors of the segments and use that sum as the feature vector of the target query sentence; that is, the feature vector may be the sum of the common words' word vectors and the weighted keywords' word vectors.
  • the recognition device may calculate the sum of the weighted word vectors of the segments and divide it by the number of segments, using the ratio as the feature vector of the target query sentence. That is, the feature vector may be the sum of the weighted word vectors divided by the total number of segments, i.e., a normalized sum.
  • the recognition device may calculate the average or root-mean-square value of the weighted word vectors of the segments and use it as the feature vector of the target query sentence; the possibilities are not listed one by one here.
  • assume that segmenting the target query sentence yields n word segments w 1 , w 2 , ..., w n , and that the calculated word vectors are V = (v 1 , v 2 , ..., v n ), where v i is the word vector of segment w i . The vector T of the target query sentence can then be expressed as:
  • T = Σ i=1..n u i · v i
  • where u i is the weighting coefficient of segment w i : u i = k if w i is in the keyword list/set (that is, it is a keyword), and u i = 1 otherwise, where k is a constant greater than 1 (assuming all target keywords share the same coefficient k; in actual use it might be 2, for example). In other words, if the current segment is a keyword, its vector is multiplied by k before being added; if it is not, the original vector is added directly. The feature vector of the target query sentence is then obtained, for example as T itself, or as T/n, and so on.
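The weighted sum described above can be sketched as follows. The word vectors, keyword set, and coefficient k = 2 are illustrative assumptions for the example; the real model would supply trained vectors.

```python
def sentence_vector(segments, word_vectors, keywords, k=2.0, normalize=False):
    """Weighted feature vector: each v_i is scaled by u_i = k if the
    segment is a keyword, u_i = 1 otherwise, then summed (optionally
    divided by n, giving T/n)."""
    dim = len(next(iter(word_vectors.values())))
    t = [0.0] * dim
    for w in segments:
        u = k if w in keywords else 1.0          # weighting coefficient u_i
        t = [ti + u * vi for ti, vi in zip(t, word_vectors[w])]
    if normalize:                                 # T / n variant
        n = len(segments)
        t = [ti / n for ti in t]
    return t

# Hypothetical two-dimensional word vectors.
vecs = {"weather": [1.0, 0.0], "today": [0.0, 1.0]}
print(sentence_vector(["weather", "today"], vecs, {"weather"}, k=2.0))
# [2.0, 1.0]
```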
  • the segments may also be used without weighting: the feature vector of the target query sentence can be determined directly from the word vector of each segment, for example by using the sum of the word vectors as the feature vector, or the ratio of that sum to the number of segments, and so on, which is not repeated here.
  • the intent recognition model may be composed of multiple binary classifiers; that is, it may be a multi-class classifier built from binary classifiers, each of which corresponds to one intent.
  • the intent recognition model may be trained from query sentence samples (corpora) of the intents corresponding to the binary classifiers, and specifically from the feature vectors of those query sentence samples.
  • the recognition result is used to indicate the intent of the target query sentence, which is either an intent under one of the binary classifiers or the irrelevant intent.
  • the recognition result may include any one or more of: information on the intent of the target query sentence, the probability that this intent is the intent under a certain binary classifier, and that binary classifier's intent.
  • the intent information of the target query sentence may refer to text information of the intent of the target query sentence.
  • the user intent indicated by the recognition result of the target query sentence can be used to search an information library for information corresponding to that intent, such as weather information when the intent is a weather query, or ticket information when the intent is a ticket query. The device can then output that information (for example as text, voice, or another form) or send it to the user's terminal for the user to view, to guide the user, and so on.
  • a recognition device such as a robot can thus segment the acquired query sentence to obtain its word segments, determine each segment's word vector, select the query sentence's keywords from the segments, and weight the keywords' vectors to obtain the sentence's feature vector. By inputting the feature vector into the preset intent recognition model, the device can recognize not only the intents the model was trained on but also irrelevant intents, which helps to improve the accuracy of intent recognition.
  • FIG. 2 is a schematic flowchart of another intent recognition method based on a recognition model provided by an embodiment of the present application.
  • the intent recognition method based on the recognition model may include the following steps:
  • the preset sample database may include query sentence samples (corpora) for each intent, and multiple samples may be selected per intent, for example a plurality of query sentences expressing that intent.
  • Each query sentence sample can be composed of text.
  • each query sentence sample may be stored in the sample database in association with its corresponding intent information, such as an intent tag, to facilitate sample search and subsequent model training.
  • the recognition device may perform word segmentation processing on each sample of the intent to obtain multiple word segments after word segmentation.
  • the segmentation method may be jieba ("stutter") word segmentation, Stanford word segmentation, and so on. The segments included in each intent's segment set may be all the segments of all query sentence samples for that intent, or only part of them, for example the segments that remain after stop words and other meaningless segments are removed, so as to reduce computational cost; this is not repeated here.
  • the keyword determination rule can be set in advance.
  • the keyword determination rule may be any one of, or a combination of, a rule based on the TF-IDF value, a rule based on word frequency, a rule based on occurrence count, a rule based on the chi-square test value, and so on; this application does not limit it.
  • after the keywords are determined, a keyword list including them may be generated. All intents' keywords may be stored in one keyword list, or different intents' keywords may be stored in different keyword lists (for example, one keyword list per intent), and intent tags may be set for the different lists.
  • the recognition device may calculate the term frequency-inverse document frequency (TF-IDF) value of each segment in each intent's segment set. The segments whose TF-IDF value exceeds a preset threshold are determined to be that intent's keywords; alternatively, the segments of each intent's set are sorted by TF-IDF value in descending order, and the segments corresponding to the top M TF-IDF values are determined to be that intent's keywords, where M is an integer greater than 0.
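A minimal sketch of the TF-IDF based selection above. The per-intent "documents" are segmented query samples; the corpora and the top-M cutoff are illustrative assumptions, and this version uses the unsmoothed idf = log(N/df), one of several common conventions.

```python
import math
from collections import Counter

def tfidf_keywords(intent_docs, all_docs, top_m):
    """Top-M segments of this intent's set, ranked by TF-IDF.
    intent_docs / all_docs are lists of segmented query samples;
    every intent doc is assumed to appear in all_docs, so df >= 1."""
    tokens = [w for doc in intent_docs for w in doc]
    tf = Counter(tokens)
    total = len(tokens)
    n_docs = len(all_docs)
    scores = {}
    for w, count in tf.items():
        df = sum(1 for doc in all_docs if w in doc)   # document frequency
        scores[w] = (count / total) * math.log(n_docs / df)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_m]

# Hypothetical query samples for two intents.
weather = [["weather", "today"], ["weather", "tomorrow"],
           ["weather", "rain"], ["weather", "sunny"]]
ticket = [["ticket", "price"], ["ticket", "buy"],
          ["ticket", "refund"], ["ticket", "cheap"]]
print(tfidf_keywords(weather, weather + ticket, top_m=1))  # ['weather']
```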
  • the recognition device may calculate the frequency of each segment in each intent's segment set, where a segment's frequency is the ratio of its number of occurrences in the set to the total number of segments in the set (for example, the number of segments remaining after stop words are removed), i.e., its term frequency (TF). The segments whose frequency exceeds a preset frequency threshold are determined to be that intent's keywords; alternatively, the segments are sorted by frequency in descending order and the top N are taken as keywords, where N is an integer greater than 0. In other words, the word frequency of the segments in the set is counted and keywords are selected by frequency, for example segments whose frequency exceeds a preset threshold, or a certain number of the highest-frequency segments, such as the top 6 entries.
  • the recognition device may calculate the number of occurrences of each segment in each intent's segment set. The segments whose count exceeds a preset count threshold are determined to be that intent's keywords; alternatively, the segments are sorted by count in descending order and the segments corresponding to the top E counts are taken as keywords, where E is an integer greater than 0.
  • the recognition device may perform a chi-square test on the segments in each intent's segment set to obtain each segment's chi-square test value. The segments whose chi-square value exceeds a preset verification threshold are determined to be that intent's keywords; alternatively, the segments are sorted by chi-square value in descending order and the segments corresponding to the top F values are taken as keywords, where F is an integer greater than 0.
  • any one of the above keyword determination rules can be used on its own, or several can be combined, for example by taking the segments selected under more than one rule as the intent's keywords. Alternatively, a weight can be set for each selection rule, the keywords selected under each rule can be scored with the corresponding rule's weight, and the segments whose combined score exceeds a preset threshold can be kept. Furthermore, a variety of rules can be used to select each intent's keywords, with the keywords produced under each rule (for example, as separate keyword lists) bound to corresponding usage scenarios; in subsequent keyword matching, the keywords bound to the current scenario are selected for matching. This further improves the reliability and flexibility of the selected keywords and thus the accuracy of intent recognition.
  • the TF or TF-IDF of each segment in a segment set can be calculated as follows: the term frequency (TF) of a given segment refers to the number or frequency of its occurrences under the intent, that is, in the intent's segment set; to avoid a bias toward long documents, the frequency form divides the number of occurrences by the total number of segments in the set. The inverse document frequency (IDF) measures how rare the segment is across all samples, and multiplying TF by IDF gives the segment's TF-IDF value.
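The quantities described above can be written out explicitly. This is a reconstruction from the prose; the exact IDF convention (smoothed or not) is not stated in the text, so the unsmoothed form is an assumption:

```latex
\mathrm{tf}(w, I) = \frac{n_{w,I}}{\sum_{w'} n_{w',I}}, \qquad
\mathrm{idf}(w) = \log \frac{N}{N_w}, \qquad
\text{TF-IDF}(w, I) = \mathrm{tf}(w, I) \cdot \mathrm{idf}(w)
```

where n_{w,I} is the number of occurrences of segment w in intent I's segment set, N is the total number of query sentence samples, and N_w is the number of samples containing w.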
  • the feature vector of each query sentence sample is calculated in the same way as the feature vector of the target query sentence; see the related description of step 104 in the embodiment shown in FIG. 1, which is not repeated here.
  • the weights (weighting coefficients) of different keywords may also be set to differ. For example, the weight of each keyword can be determined from its TF-IDF value, or from its TF (or its occurrence count, chi-square test value, and so on): the larger the keyword's TF, the higher the weighting coefficient set for it, and so on; the options are not listed one by one here. The correspondence between individual TF-IDF values (or TF values, counts, chi-square values, etc.) and weighting coefficients can be preset; alternatively, TF-IDF intervals (or TF intervals, count intervals, chi-square intervals, etc.) can be mapped to weighting coefficients to reduce system storage overhead. The keywords and their weighting coefficients can also be stored together in the keyword list. This improves the reliability and flexibility of determining keyword weights and helps to further improve the accuracy and reliability of intent recognition.
  • the intent recognition model may be composed of multiple binary classifiers corresponding one-to-one to the multiple intents. After the feature vectors of each intent's query sentence samples are obtained, they can be input into the intent recognition model for classification so as to train the binary classifier corresponding to each intent.
  • because a multi-class classifier normalizes its outputs when predicting, an irrelevant query is often forcibly assigned to one of the known categories, so irrelevant queries cannot be identified and the output is inaccurate. This application therefore converts the multi-class classifier into multiple binary classifiers, giving the model the ability to identify irrelevant queries.
  • for each binary classifier, the feature vectors of the corresponding intent's query sentence samples (intent sentences) are input, and the output is the corresponding intent. For example, a support vector machine (SVM) can be used to train each binary classifier, so that multiple binary classifiers are obtained by training.
  • suppose the intent recognition model is composed of 6 binary classifiers, corresponding to weather, food, air tickets, stocks, credit cards, and entertainment. For the weather classifier, the binary classification result is either "the current query is the weather intent" or "the current query is not the weather intent"; for the ticket classifier, it is either "the current query is the ticket intent" or "the current query is not the ticket intent"; the food, stock, credit card, and entertainment classifiers are similar. By inputting the feature vectors of each intent's sentences into the corresponding classifier for training, multiple binary classifiers, and thus the intent recognition model, are obtained.
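The one-binary-classifier-per-intent scheme above can be sketched as follows. The patent trains each binary classifier with an SVM; to keep this example self-contained, a tiny perceptron stands in for the SVM, and the feature vectors and intents are illustrative assumptions.

```python
class Perceptron:
    """Minimal linear binary classifier standing in for an SVM."""
    def __init__(self, dim, epochs=50, lr=0.1):
        self.w = [0.0] * dim
        self.b = 0.0
        self.epochs, self.lr = epochs, lr

    def score(self, x):
        # Positive score -> "is this intent"; negative -> "is not".
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def fit(self, xs, ys):  # ys in {+1, -1}
        for _ in range(self.epochs):
            for x, y in zip(xs, ys):
                if y * self.score(x) <= 0:  # misclassified -> update
                    self.w = [wi + self.lr * y * xi
                              for wi, xi in zip(self.w, x)]
                    self.b += self.lr * y

def train_intent_model(samples):
    """samples: {intent: [feature vectors]} -> {intent: classifier}.
    Each classifier's positives are its own intent's samples; its
    negatives are every other intent's samples."""
    model = {}
    for intent in samples:
        xs, ys = [], []
        for other, vecs in samples.items():
            for v in vecs:
                xs.append(v)
                ys.append(1 if other == intent else -1)
        clf = Perceptron(dim=len(xs[0]))
        clf.fit(xs, ys)
        model[intent] = clf
    return model

# Hypothetical 2-D feature vectors for two intents.
samples = {"weather": [[1.0, 0.0], [0.9, 0.1]],
           "ticket": [[0.0, 1.0], [0.1, 0.9]]}
model = train_intent_model(samples)
print(model["weather"].score([1.0, 0.0]) > 0)  # True
```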
  • during training, the feature vectors of the intent sentences corresponding to a binary classifier can be used as positive samples and the feature vectors of the remaining intents' sentences as negative samples. For example, when training the weather classifier, the feature vectors of weather-intent sentences serve as positive samples and the feature vectors of other intents' sentences, such as air tickets and food, serve as negative samples.
  • when the positive and negative samples are heavily imbalanced, the class with many samples greatly degrades the model's ability to generalize, resulting in a low AUC (area under the curve; the larger the AUC, the better the classifier). To take an extreme example: with only 1 positive sample and 99 negative samples, a classifier that blindly labels every sample negative still achieves 99% accuracy, yet such a classifier is obviously useless and its recognition results are unreliable.
  • some methods can therefore be used to balance the numbers of positive and negative samples before the classifier is trained, improving the accuracy and reliability of the trained model. For example, when positive samples are scarce, positive samples can be added to balance the two classes; likewise, when negative samples are scarce, negative samples can be added.
  • the way to balance positive and negative samples can be as follows:
  • Up-sampling: increase the number of samples in the smaller class by directly copying existing samples; it can be used when a class has few samples.
  • Down-sampling: reduce the number of samples in the larger class by discarding its extra samples; it can be used when a class has many samples.
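The two balancing strategies above can be sketched in a few lines; the sample lists and target sizes are illustrative assumptions.

```python
import random

def upsample(minority, target_size, seed=0):
    """Copy randomly chosen minority samples until target_size is reached."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra

def downsample(majority, target_size, seed=0):
    """Keep a random subset of the majority class."""
    rng = random.Random(seed)
    return rng.sample(majority, target_size)

# Hypothetical imbalanced feature vectors.
positives = [[1.0, 0.2], [0.8, 0.1]]
negatives = [[0.0, 0.9], [0.1, 1.0], [0.2, 0.8], [0.0, 1.1]]
print(len(upsample(positives, len(negatives))))    # 4
print(len(downsample(negatives, len(positives))))  # 2
```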
  • an upsampling method can be used to increase the samples and train the classifier.
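The up- and down-sampling described above can be sketched as follows (a minimal illustration; the function names are ours, not part of the application):

```python
import random

def upsample(minority, target_size, seed=0):
    """Grow the smaller class by duplicating randomly chosen samples."""
    rng = random.Random(seed)
    out = list(minority)
    while len(out) < target_size:
        out.append(rng.choice(minority))
    return out

def downsample(majority, target_size, seed=0):
    """Shrink the larger class by discarding randomly chosen samples."""
    rng = random.Random(seed)
    return rng.sample(majority, target_size)
```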
  • Synthesizing samples: increase the number of samples in the smaller class by generating new samples that combine the features of existing ones. Specifically, a new sample may be generated by randomly selecting some features from each sample, or by selecting specific features in some way (for example, features whose number of occurrences exceeds a threshold, or features between samples whose similarity exceeds a threshold, such as samples whose Euclidean distance is below a threshold), and then splicing the selected features into a new sample, thereby enlarging the smaller class. Unlike up-sampling, which simply copies samples, splicing produces genuinely new samples, which can further improve the reliability of classifier training. For example, the SMOTE (Synthetic Minority Over-sampling Technique) algorithm can be used to synthesize samples.
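The feature-combination idea can be illustrated with the interpolation step at the core of SMOTE (a simplified sketch: full SMOTE interpolates only between nearest neighbors, which is omitted here):

```python
import random

def synthesize(samples, n_new, seed=0):
    """Create new feature vectors by interpolating between pairs of
    existing minority-class vectors (the core idea behind SMOTE)."""
    rng = random.Random(seed)
    new = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)   # pick two existing samples
        gap = rng.random()              # random position between a and b
        new.append([x + gap * (y - x) for x, y in zip(a, b)])
    return new
```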
  • Changing sample weights: increase the weight of the smaller class. A sample category with fewer samples can be multiplied by a weight so that the classifier pays more attention to it. Optionally, the weight may be related to the number of samples: for example, the fewer the samples, the higher the weight; or a fixed weight may be set for any sample category below a certain size, and so on.
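One concrete weighting rule of this kind is the "balanced" heuristic common in machine-learning libraries (an illustration we supply, not a formula from the application): weight each class inversely to its sample count.

```python
def balanced_weights(counts):
    """'Balanced' heuristic: weight = n_samples / (n_classes * class_count),
    so rarer classes receive proportionally higher weights."""
    total = sum(counts.values())
    k = len(counts)
    return {c: total / (k * n) for c, n in counts.items()}
```

With 1 positive and 99 negative samples, the positive class gets a weight 99 times larger than the negative class, counteracting the imbalance described above.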
  • Optionally, a probability threshold can be set for each binary classifier to indicate whether an input query sentence carries the intent corresponding to that classifier.
  • The probability thresholds of the binary classifiers may be the same or different. Further optionally, a threshold can be adjusted according to how the recognition results of a binary classifier (or of the intent recognition model) verify over a preset period, such as the recognition results of one binary classifier over one week. For example, when the verified recognition success rate falls below a preset threshold such as 90%, the probability threshold of that binary classifier is raised, for example by a preset step such as 3%, to improve the accuracy and reliability of intent recognition.
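The threshold-adjustment rule just described might be sketched as follows (the 90% target and 3% step are the example values from the text; the function itself is illustrative):

```python
def adjust_threshold(threshold, success_rate, target=0.90, step=0.03, cap=0.99):
    """If the verified recognition success rate over the window falls
    below the target (e.g. 90%), raise the classifier's probability
    threshold by a fixed step (e.g. 3%), capped below 1.0."""
    if success_rate < target:
        return min(threshold + step, cap)
    return threshold
```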
  • The target keyword may be the word segment, among the plurality of word segments, that matches a keyword in the keyword list.
  • For the specific implementation of steps 206-208, reference may be made to the related descriptions of steps 101-104 in the embodiment shown in FIG. 1 above; details are not repeated here.
  • The weighting coefficients corresponding to the keywords may be the same or different. A weighting coefficient can be set for each keyword, and each intent's keywords can be stored in the keyword list in association with their corresponding weighting coefficients, as described above.
  • When the recognition device weights the word vector of the target keyword according to the preset weighting coefficient, it can determine the weighting coefficient corresponding to the target keyword from the keyword list and weight the target keyword's word vector according to the determined coefficient. Specifically, the weighting coefficient of the keyword in the list that matches the target keyword can be determined and taken as the target keyword's weighting coefficient, and the target keyword's word vector weighted by it. The adjusted feature vector of the target query sentence is then calculated from the weighted word vector of the target keyword and the word vectors of the other word segments among the plurality of word segments.
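The weighted feature-vector computation can be sketched as follows (a minimal illustration; whether to sum or to average over the segment count follows the two variants described elsewhere in this application):

```python
def sentence_vector(tokens, word_vecs, keyword_weights, average=True):
    """Sum the word vectors of all word segments, multiplying each
    keyword's vector by its weighting coefficient k (k > 1 pulls the
    sentence vector toward the keywords); optionally average over the
    number of segments."""
    dim = len(next(iter(word_vecs.values())))
    total = [0.0] * dim
    for t in tokens:
        w = keyword_weights.get(t, 1.0)   # non-keywords keep weight 1
        for i, v in enumerate(word_vecs[t]):
            total[i] += w * v
    if average:
        total = [v / len(tokens) for v in total]
    return total
```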
  • If the probabilities included in the recognition results of all the binary classifiers are lower than the corresponding probability thresholds, it may be determined that the target query sentence is an irrelevant query, and the recognition result of the target query sentence is used to indicate that the intent of the target query sentence is an irrelevant intent. If only one binary classifier's recognition result includes a probability not lower than that classifier's probability threshold, it can be determined that the intent of the target query sentence is that classifier's intent, and the recognition result of the target query sentence is used to indicate as much.
  • If the recognition results of more than one binary classifier include a probability not lower than the corresponding probability threshold, the maximum among those probabilities can be further determined, the intent of the binary classifier corresponding to that maximum probability taken as the intent of the target query sentence, and the recognition result of the target query sentence used to indicate that the intent of the target query sentence is that classifier's intent. Alternatively, the intents of all the classifiers whose probabilities reach their thresholds can be taken as intents of the target query sentence; or the situation can be treated as a scene change, switching to the keyword list corresponding to another rule for keyword matching, determining the weighting coefficients accordingly, and recalculating the feature vector of the target query sentence before performing intent recognition again, and so on, which are not listed here one by one.
  • It should be noted that the calculated feature vector of the target query sentence may be one or more. If there is a single feature vector, the multiple binary classifiers can each judge from that vector whether the intent of the target query sentence is its own intent. If there are multiple feature vectors, then when calculating them, the target keyword corresponding to each intent can be determined from that intent's keyword list, and the feature vector of the target query sentence corresponding to each intent calculated from the target keywords of that intent.
  • Each feature vector is then input into the binary classifier of the corresponding intent (determined, for example, from an intent label or in another way) so that each classifier judges whether the intent of the target query sentence is its own intent; that is, a feature vector is extracted separately for each binary classifier (each intent), and the corresponding classifier makes the decision. This avoids the inaccurate judgments that can arise when the same keyword exists under different intents, and helps further improve the reliability of intent recognition.
  • For example, a recognition device such as a robot can receive a query request input by a user for intent recognition by the model. The request may be in the form of a picture, text, or voice, and can be converted into the corresponding text sentence, namely the target query sentence. The sentence is segmented, stop words among the resulting word segments are removed and keywords determined, and the feature vector of the sentence is calculated (there may be one or more, such as a feature vector for the weather classifier, one for the food classifier, one for the ticket classifier, one for the stock classifier, one for the credit-card classifier, and one for the entertainment classifier); the sentence's feature vector(s) are then input into the model for intent recognition.
  • Each binary classifier then makes its decision, judging whether the query carries the intent corresponding to that classifier; for example, each can output its own intent (such as the weather intent or the food intent) and a probability (such as the forward probability, that is, the probability that the query carries that intent). If the forward probabilities output by all the binary classifiers are below the corresponding thresholds, the user query can be classified as an irrelevant query and information to that effect output, rather than the query being forced into some category.
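The decision logic across the binary classifiers described above can be sketched as follows (illustrative names; `None` stands in for the "irrelevant query" result):

```python
def decide(probs, thresholds):
    """probs: {intent: forward probability from that intent's binary
    classifier}; thresholds: {intent: its probability threshold}.
    Returns the winning intent, or None for an irrelevant query."""
    passing = {i: p for i, p in probs.items() if p >= thresholds[i]}
    if not passing:
        return None                # no classifier claims it: irrelevant
    return max(passing, key=passing.get)  # highest forward probability wins
```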
  • If the forward probability output by exactly one binary classifier is not below the corresponding threshold, the intent corresponding to that category is output as the recognition result, that is, the intent of the query request, and its forward probability, i.e. confidence, can further be output. If several categories qualify, the intent with the highest forward probability among them may be output as the intent of the query request, and the corresponding forward probability further output.
  • In the embodiment of the present application, the recognition device can select query sentence samples of multiple intents from the preset sample database, determine the keywords of those intents, calculate the feature vectors of the query sentence samples based on the keywords, and train the binary classifiers of the corresponding intents with them to obtain the intent recognition model. When recognizing the intent of a target query sentence, the feature vector of the query sentence can be determined according to the keywords of the target query sentence and input into the preset intent recognition model to determine the intent of the query sentence, thereby improving the accuracy of intent recognition.
  • FIG. 3 is a schematic structural diagram of an identification device according to an embodiment of the present application.
  • the identification device (apparatus) of the embodiment of the present application may include a unit for performing the above-mentioned identification model-based intention identification method.
  • The recognition device 300 of this embodiment may include: an acquisition unit 301 and a processing unit 302. Specifically:
  • the obtaining unit 301 is configured to receive a target query sentence input by a user
  • the processing unit 302 is configured to perform word segmentation processing on the target query sentence to obtain multiple word segments constituting the target query sentence; match the multiple word segments with each keyword in a preset keyword list, To determine a target keyword of the target query sentence from the plurality of word segments, the target keyword is a word segment of the plurality of word segments that matches a key in the keyword list;
  • The processing unit 302 is further configured to calculate the word vector of each of the plurality of word segments, weight the word vector of the target keyword according to a preset weighting coefficient, and calculate the feature vector of the target query sentence from the weighted word vector of each word segment;
  • The processing unit 302 is further configured to input the feature vector of the target query sentence into a preset intent recognition model to obtain a recognition result for the target query sentence, wherein the intent recognition model consists of multiple binary classifiers, each binary classifier corresponds to one intent, the intent recognition model is trained on query sentence samples of the intents corresponding to the multiple binary classifiers, the recognition result indicates the intent of the target query sentence, and the intent of the target query sentence is the intent of one of the binary classifiers or an irrelevant intent.
  • the obtaining unit 301 is further used to select query sentence samples of multiple intentions from the preset sample database respectively;
  • The processing unit 302 is also configured to perform word segmentation on the query sentence samples of each intent to obtain the word segment set of each intent's query sentence samples, each intent's segment set including the multiple word segments constituting that intent's query sentence samples; to determine each intent's keywords from its segment set according to a preset keyword determination rule; and to calculate the word vector of each segment, weight the word vectors of each intent's keywords according to a preset weighting coefficient, and calculate the feature vector of each query sentence sample from the weighted word vectors of that sentence's segments;
  • The processing unit 302 is further configured to train on the feature vector of each query sentence sample among the query sentence samples of the multiple intents and the intent corresponding to each sample, so as to obtain the intent recognition model, wherein the intent recognition model consists of multiple binary classifiers corresponding one-to-one to the multiple intents.
  • When the processing unit 302 determines each intent's keywords from its word segment set according to the preset keyword determination rule, it may be specifically configured to: determine the word segments in each intent's segment set whose TF-IDF value exceeds a preset threshold as that intent's keywords; or sort the word segments in each intent's segment set in descending order of frequency and determine the segments with the top N frequencies as that intent's keywords, where N is an integer greater than 0.
  • the identification device further includes a storage unit 303;
  • The processing unit 302 is further configured to set a weighting coefficient for each keyword according to the term frequency-inverse document frequency (TF-IDF) value or frequency corresponding to each intent's keywords, wherein the frequency corresponding to each keyword is the ratio of the number of times the keyword appears in that intent's word segment set to the total number of segments in the set;
  • the storage unit 303 is configured to store the keyword of each intention and the weighting coefficient corresponding to the keyword in the keyword list;
  • When the processing unit 302 weights the word vector of the target keyword according to the preset weighting coefficient, it may be specifically configured to:
  • a weighting coefficient corresponding to the target keyword is determined from the keyword list, and the word vector of the target keyword is weighted according to the determined weighting coefficient.
  • Optionally, the processing unit 302 may be further configured to set probability thresholds for the multiple binary classifiers respectively, each binary classifier's probability threshold indicating whether an input query sentence carries the intent corresponding to that classifier;
  • When the processing unit 302 inputs the feature vector of the target query sentence into the preset intent recognition model to obtain the recognition result of the target query sentence, it may be specifically configured to:
  • input the feature vector into the preset intent recognition model to obtain the recognition results of the multiple binary classifiers included in the model, each classifier's recognition result including the probability that the intent of the target query sentence is that classifier's intent;
  • if the probabilities in the recognition results of all the binary classifiers are lower than the corresponding probability thresholds, determine that the target query sentence is an irrelevant query, the recognition result of the target query sentence indicating that its intent is an irrelevant intent;
  • if one binary classifier's recognition result includes a probability not lower than that classifier's probability threshold, determine that the intent of the target query sentence is that classifier's intent, the recognition result of the target query sentence indicating as much;
  • if more than one recognition result includes a probability not lower than the corresponding probability threshold, determine the maximum among those probabilities and take the intent of the binary classifier corresponding to the maximum probability as the intent of the target query sentence, the recognition result indicating that the intent of the target query sentence is the intent of that classifier.
  • When the processing unit 302 calculates the feature vector of the target query sentence from the weighted word vector of each word segment, it may be specifically configured to: calculate the sum of the weighted word vectors of the word segments and use the sum as the feature vector of the target query sentence; or calculate the sum of the weighted word vectors and use the ratio of the sum to the number of word segments as the feature vector of the target query sentence.
  • the recognition device may implement part or all of the steps in the method for recognizing an intention based on a recognition model in the embodiments shown in FIG. 1 to FIG. 2 through the above unit.
  • the embodiments of the present application are device embodiments corresponding to the method embodiments, and the description of the method embodiments is also applicable to the embodiments of the present application.
  • FIG. 4 is a schematic structural diagram of another identification device provided by an embodiment of the present application.
  • the identification device is used to perform the above method.
  • the identification device 400 in this embodiment may include: one or more processors 401 and a memory 402.
  • the identification device may further include one or more user interfaces 403, and/or one or more communication interfaces 404.
  • the processor 401, the user interface 403, the communication interface 404, and the memory 402 may be connected through the bus 405, or may be connected in other ways.
  • A bus connection is taken as an example in the figure.
  • the memory 402 is used to store a computer program, and the computer program includes program instructions, and the processor 401 is used to execute the program instructions stored in the memory 402.
  • the processor 401 may be used to call the program instructions to perform some or all of the steps in FIGS. 1 to 2 described above.
  • Specifically, the processor 401 may call the program instructions to perform the following steps: calling the user interface 403 to receive a target query sentence input by a user, and performing word segmentation on the target query sentence to obtain a plurality of word segments constituting it; matching the plurality of word segments against the keywords in a preset keyword list to determine the target keyword of the target query sentence from the plurality of segments, the target keyword being the segment among the plurality that matches a keyword in the list; calculating the word vector of each of the plurality of segments, weighting the word vector of the target keyword according to a preset weighting coefficient, and calculating the feature vector of the target query sentence from the weighted word vectors of the segments; and inputting the feature vector of the target query sentence into a preset intent recognition model to obtain the recognition result of the target query sentence, wherein the intent recognition model consists of multiple binary classifiers, each corresponding to one intent, the model is trained on query sentence samples of the corresponding intents, the recognition result indicates the intent of the target query sentence, and that intent is the intent of one of the binary classifiers or an irrelevant intent.
  • Optionally, before inputting the feature vector of the target query sentence into the preset intent recognition model, the processor 401 is further configured to perform the following steps: selecting query sentence samples of multiple intents from a preset sample database, and performing word segmentation on each intent's query sentence samples to obtain each intent's word segment set, each set including the multiple segments that make up that intent's query sentence samples; determining each intent's keywords from its segment set according to a preset keyword determination rule; calculating the word vector of each segment, weighting the word vectors of each intent's keywords according to a preset weighting coefficient, and calculating the feature vector of each query sentence sample from the weighted word vectors of that sentence's segments; and training on the feature vector of each query sentence sample and the intent corresponding to that sample to obtain the intent recognition model, wherein the model consists of multiple binary classifiers corresponding one-to-one to the multiple intents.
  • Optionally, when determining each intent's keywords, the processor 401 may specifically perform the following steps: calculating the term frequency-inverse document frequency (TF-IDF) value of each word segment in each intent's segment set, and determining the segments whose TF-IDF value exceeds a preset threshold as that intent's keywords; or sorting the segments in each intent's segment set in descending order of TF-IDF value and determining the segments with the top M TF-IDF values as that intent's keywords, where M is an integer greater than 0.
  • Alternatively, the processor 401 may calculate the frequency of each word segment in each intent's segment set, the frequency of a segment being the ratio of the number of times it appears in the set to the total number of segments in the set; and determine the segments whose frequency exceeds a preset frequency threshold as that intent's keywords, or sort the segments in descending order of frequency and determine the segments with the top N frequencies as that intent's keywords, where N is an integer greater than 0.
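The frequency-based keyword selection described above can be sketched as follows (illustrative names; TF-IDF-based selection would only change the ranking score):

```python
from collections import Counter

def keywords_by_frequency(segmented_samples, n):
    """Pool the word segments of one intent's query sentence samples,
    rank segments by relative frequency (count / total segments),
    and keep the top-N as that intent's keywords."""
    words = [w for sample in segmented_samples for w in sample]
    counts = Counter(words)
    total = len(words)
    ranked = sorted(counts, key=lambda w: counts[w] / total, reverse=True)
    return ranked[:n]
```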
  • Optionally, the processor 401 may also perform the following steps: setting a weighting coefficient for each keyword according to the TF-IDF value or frequency corresponding to each intent's keywords, the frequency of a keyword being the ratio of the number of times it appears in that intent's segment set to the total number of segments in the set; and storing each intent's keywords and their corresponding weighting coefficients in the keyword list.
  • When weighting the word vector of the target keyword according to the preset weighting coefficient, the processor 401 may specifically perform the following steps: determining the weighting coefficient corresponding to the target keyword from the keyword list, and weighting the word vector of the target keyword according to the determined coefficient.
  • Optionally, the processor 401 may also perform the following steps: setting probability thresholds for the multiple binary classifiers respectively, each binary classifier's probability threshold indicating whether an input query sentence carries the intent corresponding to that classifier;
  • When inputting the feature vector of the target query sentence into the preset intent recognition model to obtain the recognition result of the target query sentence, the processor 401 may specifically perform the following steps: inputting the feature vector into the preset intent recognition model to obtain the recognition results of the multiple binary classifiers included in the model, each classifier's result including the probability that the intent of the target query sentence is that classifier's intent; determining whether the probability included in each classifier's result is lower than that classifier's probability threshold; if the probabilities in all the classifiers' results are lower than the corresponding thresholds, determining that the target query sentence is an irrelevant query, its recognition result indicating that the intent of the target query sentence is an irrelevant intent; and if one classifier's result includes a probability not lower than that classifier's threshold, determining that the intent of the target query sentence is that classifier's intent.
  • Optionally, when calculating the feature vector of the target query sentence from the weighted word vector of each word segment, the processor 401 may specifically perform the following steps: calculating the sum of the weighted word vectors of the word segments and using the sum as the feature vector of the target query sentence; or calculating the sum of the weighted word vectors and using the ratio of the sum to the number of word segments as the feature vector of the target query sentence.
  • The processor 401 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the user interface 403 may include an input device and an output device.
  • the input device may include a touch panel, a microphone, and the like
  • the output device may include a display (LCD, etc.), a speaker, and the like.
  • the communication interface 404 may include a receiver and a transmitter for communicating with other devices.
  • the memory 402 may include a read-only memory and a random access memory, and provide instructions and data to the processor 401. A portion of the memory 402 may also include non-volatile random access memory. For example, the memory 402 may also store the above keyword list, word segmentation, and so on.
  • It can be understood that the processor 401 and the like described in the embodiments of the present application can carry out the implementations described in the method embodiments shown in FIG. 1 to FIG. 2 above, and can also carry out the implementation of each unit described with reference to FIG. 3 of the embodiments of the present application; details are not repeated here.
  • An embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements some or all of the steps of the recognition-model-based intent recognition method described in the embodiments corresponding to FIG. 1 to FIG. 2, and can also realize the functions of the recognition device of the embodiment shown in FIG. 3 or FIG. 4 of the present application; details are not repeated here.
  • An embodiment of the present application further provides a computer program product containing instructions, which when run on a computer, causes the computer to perform some or all of the steps in the above method.
  • The computer-readable storage medium may be an internal storage unit of the recognition device of any of the foregoing embodiments, such as a hard disk or memory of the device. The computer-readable storage medium may also be an external storage device of the recognition device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the device.
  • It should be understood that the sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.


Abstract

The present application discloses an intent recognition method based on a recognition model, a recognition device, and a medium, applied to the field of artificial intelligence. The method includes: receiving a target query sentence input by a user, and performing word segmentation on the target query sentence to obtain a plurality of word segments constituting the target query sentence; determining a target keyword of the target query sentence from the plurality of word segments; calculating the word vector of each of the plurality of word segments, weighting the word vector of the target keyword according to a preset weighting coefficient, and calculating the feature vector of the target query sentence from the weighted word vector of each word segment; and inputting the feature vector of the target query sentence into a preset intent recognition model to obtain a recognition result for the target query sentence. The present application helps improve the accuracy of intent recognition.

Description

Intent recognition method, recognition device and medium based on a recognition model
This application claims priority to Chinese patent application No. 201910015234.6, entitled "Intent recognition method, recognition device and medium based on a recognition model", filed with the China National Intellectual Property Administration on January 4, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of artificial intelligence, and in particular to an intent recognition method, recognition device and medium based on a recognition model.
Background
At present, when robots perform intent recognition, they generally use a multi-class classifier to identify the user's query intent. Because the multi-class classifier normalizes its outputs when predicting intent, it often forcibly assigns an irrelevant query to one of its classes. That is, current intent recognition methods cannot identify irrelevant queries: even when the intent of a query does not belong to any intent of the classifier, the query may still be labeled with one of the classifier's intents, so the user's intent cannot be recognized accurately.
Summary
Embodiments of the present application provide an intent recognition method based on a recognition model, a recognition device, and a medium, which help improve the accuracy of intent recognition.
In a first aspect, an embodiment of the present application provides an intent recognition method based on a recognition model, including:
receiving a target query sentence input by a user, and performing word segmentation on the target query sentence to obtain a plurality of word segments constituting the target query sentence;
matching the plurality of word segments against the keywords in a preset keyword list to determine a target keyword of the target query sentence from the plurality of word segments, the target keyword being the word segment among the plurality that matches a keyword in the keyword list;
calculating the word vector of each of the plurality of word segments, weighting the word vector of the target keyword according to a preset weighting coefficient, and calculating the feature vector of the target query sentence from the weighted word vector of each word segment;
inputting the feature vector of the target query sentence into a preset intent recognition model to obtain a recognition result for the target query sentence, wherein the intent recognition model consists of a plurality of binary classifiers, each binary classifier corresponds to one intent, the intent recognition model is trained on query sentence samples of the intents corresponding to the plurality of binary classifiers, the recognition result indicates the intent of the target query sentence, and the intent of the target query sentence is the intent of one of the binary classifiers or an irrelevant intent.
In a second aspect, an embodiment of the present application provides a recognition device, which includes units for performing the method of the first aspect.
In a third aspect, an embodiment of the present application provides another recognition device, including a processor and a memory connected to each other, wherein the memory is configured to store a computer program supporting the recognition device in performing the above method, the computer program includes program instructions, and the processor is configured to call the program instructions to perform the method of the first aspect. Optionally, the recognition device may further include a communication interface and/or a user interface.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, the computer program including program instructions which, when executed by a processor, cause the processor to perform the method of the first aspect.
Embodiments of the present application perform word segmentation on an acquired query sentence to obtain its word segments and determine the word vector of each segment, determine the query sentence's keywords from those segments and weight the keywords' word vectors, and compute the query sentence's feature vector, which is then input into a preset intent recognition model to determine the intent of the query sentence. This helps improve the accuracy of intent recognition and makes it possible to identify irrelevant intents.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are introduced below.
FIG. 1 is a schematic flowchart of an intent recognition method based on a recognition model according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of another intent recognition method based on a recognition model according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a recognition device according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of another recognition device according to an embodiment of the present application.
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述。
本申请的技术方案可应用于识别设备中,该识别设备可包括服务器、终端、机器人或其他识别设备,用于对用户查询语句的意图进行识别。本申请涉及的终端可以是手机、电脑、平板、个人计算机、智能手表等,本申请不做限定。
具体的，本申请能够通过对待进行意图识别的查询语句进行分词处理，以得到该查询语句的一个或多个分词并确定每个分词的词向量，从而计算得到该查询语句的特征向量，或者，还能够通过进一步从该一个或多个分词中确定出该查询语句的关键词并对关键词的词向量进行加权处理后，计算得到该查询语句的特征向量，进而能够通过将该特征向量输入预置的意图识别模型来确定该查询语句的意图，这就有助于提升意图识别的准确性。以下分别详细说明。
请参见图1,图1是本申请实施例提供的一种基于识别模型的意图识别方法的流程示意图。具体的,本实施例的方法可应用于上述的识别设备如机器人中。如图1所示,该基于识别模型的意图识别方法可以包括以下步骤:
101、接收用户输入的目标查询语句,对该目标查询语句进行分词处理,以得到组成该目标查询语句的多个分词。
其中,该目标查询语句可以是待进行意图识别的任一语句,比如识别设备如机器人接收到的任一语句。可选的,该语句可以是文本,也可以是语音,也可以是视频中的语句。进一步可选的,如果获取到的语句为文本以外的语句,识别设备在获取到该语句之后,还可将该语句转换为文本语句,以便于快速实现对该语句进行分词处理。
可选的,该分词处理对应的分词方法可以为结巴分词或斯坦福分词法或其他分词方法,本申请不做限定。
进一步可选的，对该目标查询语句进行分词处理，得到的组成该目标查询语句的多个分词（还可称为词、词条等等）可以为组成该目标查询语句的所有分词，也可以为组成该目标查询语句的所有分词中的部分分词，比如为该所有分词中去掉停用词或其他无意义的分词后的分词，以便于减小后续的词向量和特征向量的计算开销，有助于提升意图识别效率。例如，可预置一个过滤列表，该过滤列表可包括各种停用词或其他无意义的词，如“啊”、“哦”、“的”等等，从而在对查询语句进行分词后，能够通过与该过滤列表中的词进行匹配对比的方式确定出查询语句中的停用词等无意义的词，并去掉这些词。
可以理解,该目标查询语句可以仅包含一个分词,也即,对该目标查询语句进行分词处理,得到的组成该目标查询语句的分词可以为一个或多个分词。该基于一个分词的意图识别方法与该基于多个分词的意图识别方法相同,为便于理解,本申请以该多个分词为例进行说明。
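作为一个极简的示意（并非本申请的正式实现，其中的停用词表与分词器输出均为为便于说明而假设的示例数据；实际分词可采用结巴分词等工具），上述“分词后过滤停用词”的处理可以用如下Python代码表达：

```python
# 假设的过滤列表：各种停用词或其他无意义的词
STOP_WORDS = {"啊", "哦", "的", "吗", "了"}

def filter_tokens(tokens):
    """通过与过滤列表匹配对比，去掉停用词等无意义的分词。"""
    return [t for t in tokens if t not in STOP_WORDS]

# 假设分词器（如结巴分词）对查询语句的输出
tokens = ["明天", "的", "天气", "怎么样", "啊"]
print(filter_tokens(tokens))  # ['明天', '天气', '怎么样']
```

过滤后的分词即为后续词向量与特征向量计算所使用的“多个分词”。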
102、计算该多个分词中每个分词的词向量。
在获取得到该多个分词之后，即可计算得到每个分词的词向量。分词的词向量的计算方式可以为多种。例如，可使用预先收集的各种语料（可以是各个分词）对连续词袋模型（CBOW模型）结构进行训练，例如采用Gensim中的Word2Vec模型框架，训练得到输入为语料、输出为词向量的CBOW模型。进而可以通过将该多个分词分别输入该模型，以得到各分词的词向量。或者还可采用现有方式计算出每个分词的词向量，对分词的词向量的计算方式，本申请不做限定。
103、将该多个分词与预设的关键词列表中的各关键词进行匹配,以从该多个分词中确定出该目标查询语句的目标关键词。
其中，该关键词列表可包括一个或多个关键词，具体可以为一种或多种意图的关键词。从而在进行匹配时，可将该目标查询语句对应的多个分词分别与该关键词列表中的关键词进行匹配对比（判断是否存在相同的关键词），即检测该多个分词中是否存在该关键词列表中的关键词。如果存在，则可将该匹配的分词作为该目标查询语句的关键词，即目标关键词。也就是说，该目标关键词即为该多个分词中与该关键词列表中的关键词相匹配的分词。
可以理解,该步骤102和103的执行步骤不受限制,例如,在其他可选的实施例中,还可以先执行步骤103,再执行步骤102;又如,该步骤102和103还可同时执行,本申请不做限定。
104、按照预设的加权系数对该目标关键词的词向量进行加权处理,根据该加权处理后的每个分词的词向量计算得到该目标查询语句的特征向量。
可选的，识别设备可通过为分词中的普通词（如该多个分词中除关键词以外的分词）和关键词设置不同的权重，将关键词的权重设置为高于普通词的权重，比如为关键词设置加权系数，以提升该关键词的权重，从而实现对查询语句的特征向量的优化，提升意图识别的准确性。例如，为区分句子中关键词和普通词间不同的重要性，当查询语句中出现关键词时，可将关键词的向量*k（即加权系数为k）加到这个查询语句的向量中，k大于1，以使查询语句的特征向量能够尽可能地向关键词的向量方向偏移。或者，在其他实施例中，还可降低查询语句中普通词的权重（比如将普通词的加权系数设置为0-1之间），而该关键词的权重可以增加（比如将关键词的加权系数设置为大于1）或保持不变（比如将关键词的加权系数设置为1），即关键词的加权系数大于普通词的加权系数。其中，查询语句的向量可以是指该查询语句分词得到的该多个分词的词向量之和。由此，在对查询语句如该目标查询语句进行分词处理之后，可以从得到的该多个分词中选取出关键词后，再根据关键词和其余普通词的词向量确定该目标查询语句的特征向量，从而可提升确定出的查询语句的特征向量的可靠性，进而可提升基于该特征向量的意图识别的准确性。采用词向量模型提取文本特征，比向量空间模型提取特征更有代表性。
其中,查询语句如目标查询语句的特征向量可以是根据加权处理后的每个分词的词向量(包括加权后的关键词的词向量和未加权的普通词的词向量)确定出的,该目标查询语句的特征向量可以与该目标查询语句的向量相同,也可以不同。
例如,在一种可能的实施方式中,识别设备可计算得到该加权处理后的每个分词的词向量的和值,并将该和值作为该目标查询语句的特征向量。也就是说,该目标查询语句的特征向量可以为该多个分词中普通词的词向量和该加权处理后的关键词的词向量的和值。
又如,在一种可能的实施方式中,识别设备可计算得到该加权处理后的每个分词的词向量的和值,并计算得到该和值与该多个分词的数目的比值,将该比值作为该目标查询语句的特征向量。也就是说,该目标查询语句的特征向量可以为该加权处理后该目标查询语句的该多个分词的词向量的和值除以该多个分词的分词总数目的比值,即进行归一化。
又如,在一种可能的实施方式中,识别设备可计算得到该加权处理后的每个分词的词向量的平均值或均方根值等等,将该平均值或均方根值作为该目标查询语句的特征向量,此处不一一列举。
举例来说，假设对该目标查询语句分词处理并过滤停用词等无意义的词之后，得到n个分词（即该多个分词的数目为n），且计算得到的各个分词的词向量如下：

V = (v_1, v_2, …, v_n)

其中，v_i 表示分词 w_i 对应的词向量。
假设预设的关键词列表包括的关键词集合是B，则该目标查询语句的向量T可表示为：

T = Σ_{i=1}^{n} u_i·v_i

其中，u_i 表示分词 w_i 的加权系数：当 w_i∈B（即当前分词在该关键词列表/集合中，表明为关键词）时，u_i=k，计算特征时乘以该加权系数k再向量相加；当 w_i∉B（即当前分词不在该关键词列表/集合中，表明不为关键词）时，则可直接原向量相加，而不乘以加权系数，或者将其加权系数作为1。k是一个大于1的常量（假设所有目标关键词的加权系数均为k），比如实际使用中可设为2或其他值。从而得到该目标查询语句的特征向量，比如将该T作为目标查询语句的特征向量，或者将T/n的值作为目标查询语句的特征向量等等。
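上述“关键词加权后求和（可选再除以分词总数目进行归一化）”的计算可以用如下示意代码表达（其中的词向量与关键词集合均为假设的示例数据）：

```python
def sentence_vector(tokens, word_vecs, keywords, k=2.0, normalize=True):
    """按加权系数k对关键词的词向量加权后逐维求和；
    normalize为True时再除以分词总数目n，得到查询语句的特征向量。"""
    dim = len(next(iter(word_vecs.values())))
    feat = [0.0] * dim
    for t in tokens:
        u = k if t in keywords else 1.0   # 关键词乘以加权系数k，普通词系数为1
        feat = [f + u * x for f, x in zip(feat, word_vecs[t])]
    if normalize:
        n = len(tokens)
        feat = [f / n for f in feat]
    return feat

# 假设已训练好的词向量（实际中可由Word2Vec等模型得到）
word_vecs = {"明天": [1.0, 0.0], "天气": [0.0, 1.0]}
print(sentence_vector(["明天", "天气"], word_vecs, {"天气"}, k=2.0))
# [0.5, 1.0]
```

当该多个分词中不存在关键词时，所有分词的系数均为1，上述计算即退化为普通的求和/求均值。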
进一步可选的，如果该多个分词与该关键词列表中的关键词均不匹配，即该多个分词均不存在于该关键词列表中，则可不对该多个分词进行加权处理，计算得到该多个分词中每个分词的词向量之后，即可基于该每个分词的词向量确定该目标查询语句的特征向量，比如将每个分词的词向量的和值作为该目标查询语句的特征向量，又如将该和值与该多个分词的数目的比值作为该目标查询语句的特征向量等等，此处不赘述。
105、将该目标查询语句的特征向量输入预置的意图识别模型,以得到对该目标查询语句的识别结果。
可选的，该意图识别模型可以是由多个二分类器组成的，也即该意图识别模型可以为由该多个二分类器组成的多分类器。该每个二分类器可对应一个意图，该意图识别模型可以是由该多个二分类器对应的意图的查询语句样本（语料）训练得到，具体可以是该多个二分类器对应的意图的查询语句样本的特征向量训练得到。该识别结果用于指示该目标查询语句的意图，该目标查询语句的意图为任一该二分类器下的意图或无关意图。可选的，该识别结果可包括该目标查询语句的意图的信息、该目标查询语句的意图为某一二分类器下的意图的概率及该二分类器的意图中的任一项或多项。如该目标查询语句的意图的信息可以是指该目标查询语句的意图的文字信息。通过训练多个二分类器组成一个多分类器，不仅能识别出已训练的意图，即该多个二分类器对应的意图，还可以识别出无关意图，从而提升了意图识别的准确性。
进一步可选的,在得到对该目标查询语句的识别结果之后,即可基于该目标查询语句的识别结果指示的用户意图在信息库中查找意图对应的信息,比如意图为天气查询时查找天气信息,又如意图为机票查询时查找机票信息等等,并可输出该信息(比如通过文字输出,或者通过语音输出,或者通过其他方式输出等等)或者向该用户对应的终端发送该信息,以供用户查看,对用户进行引导等等。
在本实施例中,识别设备如机器人能够通过对获取的查询语句进行分词处理,以得到该查询语句的多个分词并确定每个分词的词向量,进而从该多个分词中确定出该查询语句的关键词并对关键词的词向量进行加权处理后,得到该查询语句的特征向量,以通过将该特征向量输入预置的意图识别模型来确定出该查询语句的意图,不仅能够识别出该模型已训练的意图,还能识别出无关意图,这就有助于提升意图识别的准确性。
请参见图2,图2是本申请实施例提供的另一种基于识别模型的意图识别方法的流程示意图。具体的,如图2所示,该基于识别模型的意图识别方法可以包括以下步骤:
201、从预设样本数据库分别选取多种意图的查询语句样本,并分别对每种意图的查询语句样本进行分词处理,以得到每种意图的查询语句样本的分词集合,每种意图的分词集合包括组成该种意图的查询语句样本的多个分词。
其中,该预设样本数据库可包括各意图的查询语句样本(语料),该选取的每一种意图的查询语句样本可以包括多个,如每一种意图可对应一个包括选取的该意图的多个查询语句样本的样本集合。每一个查询语句样本可以由文本组成。可选的,各查询语句样本可以与其对应的意图的信息如意图标签关联存储于该样本数据库中,以便于样本的查找以及后续的模型训练。
在选取出各个意图的样本之后,针对每个意图的样本,识别设备可以对该意图的每个样本进行分词处理,得到分词后的多个分词。其中,该分词的方法可采用结巴分词或斯坦福分词方法等等。
可选的,每个分词集合(词袋)包括的分词可以为选取出的该分词集合对应的意图的样本集合中所有查询语句样本的所有分词,也可以为该所有分词中的部分分词,比如为该所有分词中去掉停用词或其他无意义的分词后的分词,以减小计算开销,此处不赘述。
202、按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词。
其中，该关键词确定规则可预先设置得到。例如，该关键词确定规则可包括基于TF-IDF值的关键词确定规则、基于词频的关键词确定规则、基于次数的关键词确定规则、基于卡方校验值的关键词确定规则等中的任一种规则或多种规则组合确定的规则，本申请不做限定。进一步可选的，在确定出每种意图的关键词之后，可以生成包括该关键词的关键词列表，所有意图的关键词可以存储于同一关键词列表中，或者不同的意图的关键词可以存储于不同的关键词列表中，比如意图的关键词和关键词列表一一对应，还可为不同的关键词列表设置意图标签。
例如，在一种可能的实施方式中，在确定每种意图的关键词时，识别设备可分别计算得到每种意图的分词集合中的每个分词的词频-逆文件频率TF-IDF值，将每种意图的分词集合中TF-IDF值超过预设阈值的分词确定为该种意图的关键词；或者，按照TF-IDF值由大到小的顺序对每种意图的分词集合中的分词进行排序，将该排序前M的TF-IDF值对应的分词确定为该种意图的关键词，其中，M为大于0的整数。也就是说，针对每种意图的分词集合，可对该分词集合中的分词进行TF-IDF计算，根据分词的TF-IDF值选取关键词，比如选取TF-IDF值超过某一预设阈值的分词作为该意图的关键词，或者取出TF-IDF值排序靠前的预设数目的词作为该意图的关键词等等。
又如,在一种可能的实施方式中,在确定每种意图的关键词时,识别设备可分别计算得到每种意图的分词集合中的每个分词在该分词集合出现的频率,每个分词对应的频率为该分词在该分词集合出现的次数与该分词集合的分词总数目(如具体可以是去掉停用词后的分词数目)的比值,如词频(term frequency,TF);将每种意图的分词集合中频率超过预设频率阈值的分词确定为该种意图的关键词;或者,按照频率由大到小的顺序对每种意图的分词集合中的分词进行排序,将该排序前N的频率对应的分词确定为该种意图的关键词,其中,N为大于0的整数。也就是说,针对每种意图的分词集合,可统计分词集合的分词的词频,根据分词的词频选取关键词,比如选取词频超过预设频率阈值的分词作为该意图的关键词,或者取出词频排序靠前的一定数目如排在前6的词条作为该意图的关键词等等。
又如,在一种可能的实施方式中,在确定每种意图的关键词时,识别设备可分别计算得到每种意图的分词集合中的每个分词在该分词集合出现的次数,将每种意图的分词集合中次数超过预设次数阈值的分词确定为该种意图的关键词;或者,按照次数由大到小的顺序对每种意图的分词集合中的分词进行排序,将该排序前E的次数对应的分词确定为该种意图的关键词,其中,E为大于0的整数。
又如,在一种可能的实施方式中,在确定每种意图的关键词时,识别设备可对每种意图的分词集合中的分词进行卡方检验,得到每种意图的分词集合中的每个分词的卡方检验的值,将每种意图的分词集合中卡方检验的值超过预设校验阈值的分词确定为该种意图的关键词;或者,按照卡方检验的值由大到小的顺序对每种意图的分词集合中的分词进行排序,将该排序前F的卡方检验的值对应的分词确定为该种意图的关键词,其中,F为大于0的整数。
可选的,在选取意图的关键词时,可以采用上述任一种关键词确定规则来选取,或者可以通过将几种规则结合来选取,比如将上述的一种或多种规则下选取出的相同关键词作为该意图的关键词;或者,可以为每一种选取规则设置一个权重,将上述的各规则下选取出的关键词结合对应规则的权重进一步筛选出取值大于预设阈值的分词作为该意图的关键词,或将取值靠前的预设数目如前5的分词作为该意图的关键词等等;或者,可分别采用多种规则选取出各意图的关键词,并将各种关键词确定规则对应的关键词(如关键词列表)和对应的使用场景进行绑定,进而在后续进行关键词匹配时,可结合不同的使用场景来选取绑定的关键词进行匹配。从而能够进一步提升选取出的关键词的可靠性和灵活性,以便于提升意图识别的准确性。
例如,分词集合中每个分词的TF或TF-IDF可以通过如下方式计算得到:
词频可以是指某一个给定的词语在该意图中出现的次数或频率,也即分词在所在意图的分词集合出现的次数或频率,比如为了防止它偏向长的文件,该词频可以为该次数除以集合的分词总数目即意图总词数。
TF_w = （分词w在该意图的分词集合中出现的次数）/（该分词集合的分词总数目，即意图总词数）
从而能够计算得到每个词的TF。
进一步的,可计算逆向文件频率(inverse document frequency,IDF),IDF的主要思想是:如果包含分词t的意图越少,IDF越大,则说明分词具有很好的类别区分能力,某个分词对语句的重要性越高,它的TF-IDF值就越大。某一特定分词的IDF,可以由总意图数目除以包含该分词之意图的数目加1的和,再将得到的商取对数得到。比如IDF可以为:
IDF_w = log（总意图数目 /（包含分词w的意图数目 + 1））
计算分词的TF-IDF:
TF-IDF_w = TF_w × IDF_w
从而能够计算得到每个分词的TF-IDF值。
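上述TF、IDF与TF-IDF的计算及“按阈值选取关键词”的过程，可用如下示意代码表达（其中各意图的分词集合与阈值均为假设的示例数据）：

```python
import math

def tfidf_keywords(intent_tokens, threshold=0.2):
    """intent_tokens: {意图: 该意图的分词集合(列表)}；
    返回每种意图中TF-IDF值超过阈值的分词，作为该意图的关键词。"""
    num_intents = len(intent_tokens)
    keywords = {}
    for intent, tokens in intent_tokens.items():
        total = len(tokens)  # 该分词集合的分词总数目
        kw = set()
        for w in set(tokens):
            tf = tokens.count(w) / total                         # 词频 TF
            df = sum(w in ts for ts in intent_tokens.values())   # 包含该分词的意图数目
            idf = math.log(num_intents / (df + 1))               # 逆向文件频率 IDF
            if tf * idf > threshold:
                kw.add(w)
        keywords[intent] = kw
    return keywords

data = {"天气": ["天气", "预报", "天气"], "机票": ["机票", "预订"], "股票": ["股票", "行情"]}
print(tfidf_keywords(data)["天气"])  # {'天气'}
```

实际使用中阈值（或选取排序前M的分词的数目M）可按经验设定，此处的0.2仅为示例。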
203、计算每个分词的词向量,并按照预设的加权系数对每种意图的关键词的词向量进行加权处理,根据该加权处理后的每个查询语句的每个分词的词向量计算得到每个查询语句样本的特征向量。
可选的,该计算每个查询语句样本的特征向量的方式与上述计算目标查询语句的特征向量的方式相同,具体可参见图1所示实施例中步骤104的相关描述,此处不赘述。
可选的,除了关键词和普通词的权重不同之外,关键词之间的权重如该加权系数也可以设置为不同。例如,各关键词的权重可根据关键词的TF-IDF值确定出,如各关键词的TF-IDF值越大,为其设置的加权系数越高;又如,各关键词的权重可根据关键词的TF(或次数或卡方校验值等等)确定出,关键词对应的TF越大,为其设置的加权系数越高,等等,此处不一一列举。具体可预先设置不同TF-IDF值(或TF值或次数或卡方校验值等)及其对应的加权系数的对应关系;或者预先设置TF-IDF区间(或TF区间或次数区间或卡方校验值区间等)及其对应的加权系数的对应关系,以降低系统存储开销。进一步可选的,还可将关键词及其对应的加权系数关联存储至该关键词列表中。从而能够提升关键词的加权系数确定的可靠性和灵活性,有助于进一步提升意图识别的准确性和可靠性。
204、根据该多种意图的查询语句样本中的每个查询语句样本的特征向量及该查询语句样本对应的意图训练得到该意图识别模型。
其中,该意图识别模型可以由多个二分类器组成,该多个二分类器可以和该多种意图一一对应。
在得到每种意图的查询语句样本的特征向量之后，即可将该特征向量输入意图识别模型进行分类，以对各意图对应的二分类器进行训练。由于多分类分类器在进行预测的时候，常常会进行归一化，使得它在对无关查询进行预测时，经常会出现把无关查询强制性分类到某一个相关类别当中，无法识别无关查询，输出的查询结果不准确。由此，本申请可采用将一个多分类分类器转化为多个二分类分类器的方法，使其有能力识别无关查询，具体可针对各二分类器输入各意图的查询语句样本（意图句子）的特征向量，输出为对应的意图，如可采用支持向量机（SVM）训练二分类器，从而训练得到多个二分类分类器。
举例来说,假设该意图识别模型由6个二分类器组成,分别对应天气,美食,机票,股票,信用卡和娱乐。例如天气分类器,它产生的二分类结果为:当前查询是天气意图和当前查询不是天气意图;又如对于机票分类器,产生的二分类结果为:当前查询是机票意图和当前查询不是机票意图,等等,美食、股票、信用卡、娱乐类似,通过输入对应的意图句子的特征向量到相应的分类器进行训练,从而训练得到多个二分类分类器,以得到该意图识别模型。可选的,在进行训练时,对于任一二分类器,还可以将该二分类器对应的意图句子的特征向量作为正样本,将其余意图句子的特征向量作为负样本以实现对该二分类器的训练。以天气分类器为例,可以将属于天气的训练样本如天气意图的意图句子的特征向量作为正样本,属于其他类的训练样本如机票、美食等意图的意图句子的特征向量作为负样本,以实现对该天气分类器的训练。
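上述“将某一意图的样本作为正样本、其余意图的样本作为负样本”的数据构造，可用如下示意代码表达（样本数据为假设；构造出的 X、y 可进一步输入如scikit-learn的SVM等训练二分类器，此处不展开）：

```python
def make_binary_dataset(samples, target_intent):
    """samples: [(特征向量, 意图), ...]；
    为target_intent对应的二分类器构造训练数据：该意图的样本为正样本(1)，其余为负样本(0)。"""
    X = [vec for vec, _ in samples]
    y = [1 if intent == target_intent else 0 for _, intent in samples]
    return X, y

samples = [([0.5, 1.0], "天气"), ([1.2, 0.1], "机票"), ([0.4, 0.9], "天气")]
X, y = make_binary_dataset(samples, "天气")
print(y)  # [1, 0, 1]
```

对每种意图分别执行一次上述构造并各训练一个二分类器，即得到由多个二分类器组成的意图识别模型。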
可选的，在训练分类器时，很多时候存在正负样本不平衡的情况，导致训练出的分类器识别准确性较差，容易对比例大的样本造成过拟合，也就是说预测容易偏向样本数较多的分类，这就大大降低了模型的泛化能力，导致auc（area under curve）很低（auc越大的分类器效果越好）。举个极端的例子：假如正样本只有1个，负样本有99个，那么，即使这个分类器将所有的样本都分为负样本，它的准确率也会有99%。但是这样的分类器显然是无效的，其识别结果不可靠。因此，在训练分类器时，可以采用一些方法平衡正负样本的数量之后，再对分类器进行训练，以提升训练模型的准确性和可靠性。例如，针对正样本较少的情况，可以采用增加正样本的方式来平衡正负样本；又如，针对负样本较少的情况，可以采用一些方式增加负样本来平衡正负样本。可选的，平衡正负样本的方式可以如下：
a、上采样:增加样本数较少的样本,其方式是直接复制原来的样本。比如可以在样本较少时采用。
b、下采样:减少样本数较多的样本,其方式是丢弃这些多余的样本。比如可以在样本较多时采用。
一般来说,样本越多,训练出的模型准确性越高。由此,为了提升意图识别的可靠性,可采用上采样方式增加样本后,对分类器进行训练。
c、合成样本:增加样本数目较少的那一类的样本,合成指的是通过组合已有的样本的各个特征(feature)从而产生新的样本。具体的,该产生新样本的方式可以是从各个feature中随机选出一些feature或者通过一些方式选出某些特定的feature(如出现次数高于阈值的feature,或者样本相似度高于阈值如欧氏距离小于阈值的样本之间的feature等等)之后,将选取的feature拼接成一个新的样本,从而增加了样本数目较少的类别的样本数。不同于上采样是单纯的复制样本,而这里则是拼接得到新的样本,使得能够进一步提升分类器训练的可靠性。例如,可采用SMOTE(Synthetic Minority Over-sampling Technique,合成少数类过采样技术)算法合成新样本,其是根据已知的正样本向量,来生成模拟的正样本向量,通过在相似样本中进行feature的随机选择并拼接出新的样本,加入到训练集之中。
d、改变样本权重:增大样本数较少类别的样本的权重,对于样本数较少的样本类别,可以乘上一个权重,从而让分类器更加关注这一类数目较少的样本。可选的,该样本的权重可以和样本数相关,比如样本越少,权重越高;又如为低于某一数量的样本类别设置一固定的权重,等等。
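以上述方式a（上采样）为例，其“直接复制样本数较少一类的样本”可用如下示意代码表达（样本数据为假设，随机种子仅用于使示例可复现）：

```python
import random

def upsample(minority, majority, seed=0):
    """上采样：通过随机复制少数类样本，使其数目与多数类一致（示意实现）。"""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return minority + extra

pos = [[0.5, 1.0]]                                  # 假设的正样本（少数类）
neg = [[1.2, 0.1], [0.3, 0.2], [0.8, 0.7]]          # 假设的负样本（多数类）
balanced_pos = upsample(pos, neg)
print(len(balanced_pos) == len(neg))  # True
```

下采样、SMOTE合成样本与改变样本权重等方式可按类似思路实现或借助现有库完成，此处不展开。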
205、分别为该多个二分类器设置概率阈值，每个二分类器对应的概率阈值可用于指示输入的查询语句是否为该二分类器对应的意图。
可选的，每个二分类器对应的概率阈值可以相同，也可以不同。进一步可选的，该概率阈值还可根据预设时间段内对二分类器/意图识别模型的识别结果的校验结果进行调整，比如一时间段内如一周内对某一二分类器的识别结果的校验结果为识别成功率低于一预设阈值如90%时，增加该二分类器对应的概率阈值，比如按照预设值如3%增加该概率阈值，以提升意图识别的准确性和可靠性。
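上述“根据校验结果调整概率阈值”的逻辑可用如下示意代码表达（目标成功率90%与步长3%均为文中示例取值）：

```python
def adjust_threshold(threshold, success_rate, target=0.90, step=0.03):
    """若某时间段内二分类器的识别成功率低于目标值，则按预设步长增加其概率阈值。"""
    return threshold + step if success_rate < target else threshold

print(adjust_threshold(0.5, 0.85))  # 成功率85%低于90%，阈值上调
print(adjust_threshold(0.5, 0.95))  # 成功率达标，阈值保持不变
```

实际使用中还可对阈值设置上限，避免阈值被持续上调至过高，此处从略。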
206、接收用户输入的目标查询语句,对该目标查询语句进行分词处理,以得到组成该目标查询语句的多个分词。
207、将该多个分词与预设的关键词列表中的各关键词进行匹配,以从该多个分词中确定出该目标查询语句的目标关键词。
其中，该目标关键词可以为该多个分词中与该关键词列表中的关键词相匹配的分词。
208、计算该多个分词中每个分词的词向量,并按照预设的加权系数对该目标关键词的词向量进行加权处理,根据该加权处理后的每个分词的词向量计算得到该目标查询语句的特征向量。
可选的,该步骤206-208的其余描述可参照上述图1所示实施例中步骤101-104的相关描述,此处不赘述。
进一步可选的，每个关键词对应的加权系数可以相同也可以不同。例如，可根据每种意图的关键词对应的TF-IDF值或频率或次数或卡方校验值等等，为该关键词设置加权系数，并可将每种意图的关键词和该关键词对应的加权系数关联存储至该关键词列表，此处不赘述。进而识别设备在按照预设的加权系数对该目标关键词的词向量进行加权处理时，可以从该关键词列表中确定出与该目标关键词对应的加权系数，并按照确定出的加权系数对该目标关键词的词向量进行加权处理。如果该目标关键词为多个，则可分别从该关键词列表中确定出与各目标关键词匹配的关键词对应的加权系数作为各目标关键词对应的加权系数，并按照各目标关键词的加权系数对各自的词向量进行加权处理。从而基于加权处理后的目标关键词的词向量和该多个分词中的其他分词的词向量计算得到该目标查询语句的特征向量。
209、将该目标查询语句的特征向量输入预置的意图识别模型,以得到该意图识别模型包括的该多个二分类器对该目标查询语句的识别结果,每个二分类器对应的识别结果包括该目标查询语句的意图为该二分类器的意图的概率。
210、分别判断每个二分类器的识别结果包括的概率是否低于该二分类器对应的概率阈值,并根据判断结果确定该目标查询语句的意图。
具体的，如果该多个二分类器即所有二分类器的识别结果包括的概率均低于对应的概率阈值（每个二分类器对应的概率阈值可以相同也可以不同），则可确定该目标查询语句为无关查询，该目标查询语句的识别结果用于指示该目标查询语句的意图为无关意图。如果仅存在一个二分类器的识别结果包括的概率不低于该二分类器对应的概率阈值，则可确定该目标查询语句的意图为该二分类器的意图，该目标查询语句的识别结果用于指示该目标查询语句的意图为该二分类器的意图。如果存在多个二分类器的识别结果包括的概率不低于对应的概率阈值，则可进一步确定该不低于对应的概率阈值的各概率中的最大概率，并可将该最大概率对应的二分类器的意图作为该目标查询语句的意图，该目标查询语句的识别结果可用于指示该目标查询语句的意图为该最大概率对应的二分类器的意图。如果该最大概率存在多个，可将最大概率对应的多个二分类器的意图均作为该目标查询语句的意图；或者，还可将其视为使用场景变更，切换另一种关键词确定规则对应的关键词列表进行关键词匹配以及确定其加权系数以计算得到该目标查询语句的特征向量后再进行意图识别，等等，此处不一一列举。
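上述判决规则（全部低于阈值则为无关意图；否则取不低于阈值的概率中的最大者对应的意图）可用如下示意代码表达（各意图的概率与阈值均为假设的示例数据）：

```python
def decide_intent(probs, thresholds):
    """probs: {意图: 该二分类器输出的正向概率}；thresholds: {意图: 概率阈值}。
    若所有概率均低于对应阈值，判为无关意图；否则返回达标概率中最大者对应的意图。"""
    passed = {i: p for i, p in probs.items() if p >= thresholds[i]}
    if not passed:
        return "无关意图"
    return max(passed, key=passed.get)

probs = {"天气": 0.35, "机票": 0.82, "美食": 0.66}
thresholds = {"天气": 0.5, "机票": 0.5, "美食": 0.5}
print(decide_intent(probs, thresholds))  # 机票
```

当达标概率的最大值存在并列时，可按前述说明输出多个意图或切换关键词确定规则重新识别，示例中未展开。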
可选的，该计算得到的该目标查询语句的特征向量可以为一个也可以为多个。比如该特征向量可以为一个，则可由该多个二分类器基于该特征向量分别判决该目标查询语句的意图是否为自身对应的意图；又如该特征向量可以为多个，在计算该目标查询语句的特征向量时，可分别从各意图对应的关键词列表中确定出各意图对应的目标关键词，并基于各意图对应的目标关键词计算得到各意图对应的该目标查询语句的特征向量，进而通过将各意图对应的特征向量输入到相应的意图的二分类器（比如基于意图标签或其他方式确定该相应意图的二分类器）以判决该目标查询语句的意图是否为自身对应的意图，也即，可通过分别提取出各二分类器的特征向量（即各意图对应的特征向量），由对应的二分类器进行判决输出。由此可以避免不同意图存在相同关键词时带来的判决可能不准确的问题，有助于进一步提升意图识别的可靠性。
举例来说，假设训练完成上述由6个二分类分类器（如天气分类器、美食分类器、机票分类器、股票分类器、信用卡分类器和娱乐分类器）组成的意图识别模型之后，识别设备如机器人可接收用户输入的查询请求，并将其输入该模型以进行意图识别。该请求可以是图片、文字或语音等方式的请求，进而可转换得到该请求对应的文本句子，即目标查询语句，对该句子进行分词，去掉分词得到的各分词中的停用词后可确定其中的关键词，并计算句子的特征向量（可以是一个，也可以是多个，如分别提取得到天气分类器的特征向量、美食分类器的特征向量、机票分类器的特征向量、股票分类器的特征向量、信用卡分类器的特征向量和娱乐分类器的特征向量），进而将句子的特征向量输入该模型进行意图识别。然后模型即各个二分类器可以判决输出，判决是否为各二分类器对应的意图，比如可以输出为自身对应的意图（如天气意图、美食意图等）及其概率（如正向概率，即查询为该意图的概率）。如果所有二分类器输出的正向概率均低于对应的阈值，则可表明用户查询分类为无关查询，可输出该查询为无关查询的信息，而不再强制性将其分类到某一个类别中。如果存在正向概率大于阈值的分类，则将该分类对应的意图作为识别结果，即作为该查询请求的意图进行输出，并可进一步输出其对应的正向概率，即置信度。如果正向概率大于阈值的分类有多个，则可以将该多个分类中正向概率最高的意图作为该查询请求的意图进行输出，并可进一步输出对应的正向概率。在识别出用户意图之后，即可根据该意图向用户返回信息、对客户进行引导等等。
在本实施例中,识别设备能够通过从预设样本数据库分别选取多种意图的查询语句样本,确定出该多种意图的关键词,以基于该关键词计算得到该查询语句样本的特征向量后训练得到意图识别模型,进而在获取到用户输入的目标查询语句之后,能够根据该目标查询语句的关键词确定该查询语句的特征向量,并将该特征向量输入预置的意图识别模型来确定该查询语句的意图,从而提升了意图识别的准确性。
上述方法实施例都是对本申请的基于识别模型的意图识别方法的举例说明,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
请参见图3,图3是本申请实施例提供的一种识别设备的结构示意图。本申请实施例的识别设备(装置)可包括用于执行上述基于识别模型的意图识别方法的单元。具体的,本实施例的识别设备300可包括:获取单元301和处理单元302。其中,
获取单元301,用于接收用户输入的目标查询语句;
处理单元302，用于对所述目标查询语句进行分词处理，以得到组成所述目标查询语句的多个分词；将所述多个分词与预设的关键词列表中的各关键词进行匹配，以从所述多个分词中确定出所述目标查询语句的目标关键词，所述目标关键词为所述多个分词中与所述关键词列表中的关键词相匹配的分词；
处理单元302,还用于计算所述多个分词中每个分词的词向量,并按照预设的加权系数对所述目标关键词的词向量进行加权处理,根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量;
处理单元302,还用于将所述目标查询语句的特征向量输入预置的意图识别模型,以得到对所述目标查询语句的识别结果;其中,所述意图识别模型由多个二分类器组成,每个二分类器对应一个意图,所述意图识别模型由所述多个二分类器对应的意图的查询语句样本训练得到,所述识别结果用于指示所述目标查询语句的意图,所述目标查询语句的意图为任一所述二分类器下的意图或无关意图。
可选的,获取单元301,还用于从预设样本数据库分别选取多种意图的查询语句样本;
处理单元302，还用于分别对每种意图的查询语句样本进行分词处理，以得到每种意图的查询语句样本的分词集合，每种意图的分词集合包括组成该种意图的查询语句样本的多个分词；按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词；计算每个分词的词向量，并按照预设的加权系数对每种意图的关键词的词向量进行加权处理，根据所述加权处理后的每个查询语句的每个分词的词向量计算得到每个查询语句样本的特征向量；
处理单元302,还用于根据所述多种意图的查询语句样本中的每个查询语句样本的特征向量及该查询语句样本对应的意图训练得到所述意图识别模型;其中,所述意图识别模型由多个二分类器组成,所述多个二分类器和所述多种意图一一对应。
可选的,处理单元302在执行所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词时,可具体用于:
分别计算得到每种意图的分词集合中的每个分词的词频-逆文件频率TF-IDF值;
将每种意图的分词集合中TF-IDF值超过预设阈值的分词确定为该种意图的关键词;或者,
按照TF-IDF值由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前M的TF-IDF值对应的分词确定为该种意图的关键词,其中,M为大于0的整数。
可选的,处理单元302在执行所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词时,可具体用于:
分别计算得到每种意图的分词集合中的每个分词在所述分词集合出现的频率,每个分词对应的频率为该分词在所述分词集合出现的次数与所述分词集合的分词总数目的比值;
将每种意图的分词集合中频率超过预设频率阈值的分词确定为该种意图的关键词;或者,
按照频率由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前N的频率对应的分词确定为该种意图的关键词,其中,N为大于0的整数。
可选的,所述识别设备还包括存储单元303;
处理单元302,还用于根据每种意图的关键词对应的词频-逆文件频率TF-IDF值或频率,为该关键词设置加权系数;其中,每个关键词对应的频率为该关键词在该种意图的分词集合出现的次数与所述分词集合的分词总数目的比值;
存储单元303,用于将每种意图的关键词和该关键词对应的加权系数关联存储至所述关键词列表;
处理单元302在执行所述按照预设的加权系数对所述目标关键词的词向量进行加权处理时，可具体用于：
从所述关键词列表中确定出与所述目标关键词对应的加权系数,并按照确定出的加权系数对所述目标关键词的词向量进行加权处理。
可选的,处理单元302,还可用于分别为所述多个二分类器设置概率阈值,每个二分类器对应的概率阈值用于指示输入的查询语句是否为该二分类器对应的意图;
处理单元302在执行所述将所述目标查询语句的特征向量输入预置的意图识别模型,以得到对所述目标查询语句的识别结果时,可具体用于:
将所述目标查询语句的特征向量输入预置的意图识别模型,以得到所述意图识别模型包括的所述多个二分类器对所述目标查询语句的识别结果,每个二分类器对应的识别结果包括所述目标查询语句的意图为该二分类器的意图的概率;
分别判断每个二分类器的识别结果包括的概率是否低于该二分类器对应的概率阈值;
如果所述多个二分类器的识别结果包括的概率均低于对应的概率阈值,确定所述目标查询语句为无关查询,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为无关意图;
如果存在一个二分类器的识别结果包括的概率不低于该二分类器对应的概率阈值,确定所述目标查询语句的意图为该二分类器的意图,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为该二分类器的意图;
如果存在多个二分类器的识别结果包括的概率不低于对应的概率阈值,确定所述不低于对应的概率阈值的概率中的最大概率,并将所述最大概率对应的二分类器的意图作为所述目标查询语句的意图,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为所述最大概率对应的二分类器的意图。
可选的,处理单元302在执行所述根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量时,可具体用于:
计算得到所述加权处理后的每个分词的词向量的和值,并将所述和值作为所述目标查询语句的特征向量;或者,
计算得到所述加权处理后的每个分词的词向量的和值,并计算得到所述和值与所述多个分词的数目的比值,将所述比值作为所述目标查询语句的特征向量。
具体的,该识别设备可通过上述单元实现上述图1至图2所示实施例中的基于识别模型的意图识别方法中的部分或全部步骤。应理解,本申请实施例是对应方法实施例的装置实施例,对方法实施例的描述,也适用于本申请实施例。
请参见图4,图4是本申请实施例提供的另一种识别设备的结构示意图。该识别设备用于执行上述的方法。如图4所示,本实施例中的识别设备400可以包括:一个或多个处理器401和存储器402。可选的,该识别设备还可包括一个或多个用户接口403,和/或,一个或多个通信接口404。上述处理器401、用户接口403、通信接口404和存储器402可通过总线405连接,或者可以通过其他方式连接,图4中以总线方式进行示例说明。其中,存储器402用于存储计算机程序,所述计算机程序包括程序指令,处理器401用于执行存储器402存储的程序指令。其中,处理器401可用于调用所述程序指令执行上述图1至图2中的部分或全部步骤。
例如，处理器401可用于调用所述程序指令执行以下步骤：调用用户接口403接收用户输入的目标查询语句，对所述目标查询语句进行分词处理，以得到组成所述目标查询语句的多个分词；将所述多个分词与预设的关键词列表中的各关键词进行匹配，以从所述多个分词中确定出所述目标查询语句的目标关键词，所述目标关键词为所述多个分词中与所述关键词列表中的关键词相匹配的分词；计算所述多个分词中每个分词的词向量，并按照预设的加权系数对所述目标关键词的词向量进行加权处理，根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量；将所述目标查询语句的特征向量输入预置的意图识别模型，以得到对所述目标查询语句的识别结果；其中，所述意图识别模型由多个二分类器组成，每个二分类器对应一个意图，所述意图识别模型由所述多个二分类器对应的意图的查询语句样本训练得到，所述识别结果用于指示所述目标查询语句的意图，所述目标查询语句的意图为任一所述二分类器下的意图或无关意图。
可选的,处理器401在执行所述将所述目标查询语句的特征向量输入预置的意图识别模型之前,还用于执行以下步骤:从预设样本数据库分别选取多种意图的查询语句样本,并分别对每种意图的查询语句样本进行分词处理,以得到每种意图的查询语句样本的分词集合,每种意图的分词集合包括组成该种意图的查询语句样本的多个分词;按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词;计算每个分词的词向量,并按照预设的加权系数对每种意图的关键词的词向量进行加权处理,根据所述加权处理后的每个查询语句的每个分词的词向量计算得到每个查询语句样本的特征向量;根据所述多种意图的查询语句样本中的每个查询语句样本的特征向量及该查询语句样本对应的意图训练得到所述意图识别模型;其中,所述意图识别模型由多个二分类器组成,所述多个二分类器和所述多种意图一一对应。
可选的,处理器401在执行所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词时,可具体执行以下步骤:分别计算得到每种意图的分词集合中的每个分词的词频-逆文件频率TF-IDF值;将每种意图的分词集合中TF-IDF值超过预设阈值的分词确定为该种意图的关键词;或者,按照TF-IDF值由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前M的TF-IDF值对应的分词确定为该种意图的关键词,其中,M为大于0的整数。
可选的,处理器401在执行所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词时,可具体执行以下步骤:分别计算得到每种意图的分词集合中的每个分词在所述分词集合出现的频率,每个分词对应的频率为该分词在所述分词集合出现的次数与所述分词集合的分词总数目的比值;将每种意图的分词集合中频率超过预设频率阈值的分词确定为该种意图的关键词;或者,按照频率由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前N的频率对应的分词确定为该种意图的关键词,其中,N为大于0的整数。
可选的,处理器401还可执行以下步骤:根据每种意图的关键词对应的词频-逆文件频率TF-IDF值或频率,为该关键词设置加权系数;其中,每个关键词对应的频率为该关键词在该种意图的分词集合出现的次数与所述分词集合的分词总数目的比值;将每种意图的关键词和该关键词对应的加权系数关联存储至所述关键词列表;
处理器401在执行所述按照预设的加权系数对所述目标关键词的词向量进行加权处理时,可具体执行以下步骤:从所述关键词列表中确定出与所述目标关键词对应的加权系数,并按照确定出的加权系数对所述目标关键词的词向量进行加权处理。
可选的,处理器401还可执行以下步骤:分别为所述多个二分类器设置概率阈值,每个二分类器对应的概率阈值用于指示输入的查询语句是否为该二分类器对应的意图;
处理器401在执行所述将所述目标查询语句的特征向量输入预置的意图识别模型,以得到对所述目标查询语句的识别结果时,可具体执行以下步骤:将所述目标查询语句的特征向量输入预置的意图识别模型,以得到所述意图识别模型包括的所述多个二分类器对所述目标查询语句的识别结果,每个二分类器对应的识别结果包括所述目标查询语句的意图为该二分类器的意图的概率;分别判断每个二分类器的识别结果包括的概率是否低于该二分类器对应的概率阈值;如果所述多个二分类器的识别结果包括的概率均低于对应的概率阈值,确定所述目标查询语句为无关查询,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为无关意图;如果存在一个二分类器的识别结果包括的概率不低于该二分类器对应的概率阈值,确定所述目标查询语句的意图为该二分类器的意图,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为该二分类器的意图;如果存在多个二分类器的识别结果包括的概率不低于对应的概率阈值,确定所述不低于对应的概率阈值的概率中的最大概率,并将所述最大概率对应的二分类器的意图作为所述目标查询语句的意图,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为所述最大概率对应的二分类器的意图。
可选的,处理器401在执行所述根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量时,可具体执行以下步骤:计算得到所述加权处理后的每个分词的词向量的和值,并将所述和值作为所述目标查询语句的特征向量;或者,计算得到所述加权处理后的每个分词的词向量的和值,并计算得到所述和值与所述多个分词的数目的比值,将所述比值作为所述目标查询语句的特征向量。
其中,所述处理器401可以是中央处理单元(Central Processing Unit,CPU),该处理器还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
用户接口403可包括输入设备和输出设备,输入设备可以包括触控板、麦克风等,输出设备可以包括显示器(LCD等)、扬声器等。
通信接口404可包括接收器和发射器,用于与其他设备进行通信。
存储器402可以包括只读存储器和随机存取存储器,并向处理器401提供指令和数据。存储器402的一部分还可以包括非易失性随机存取存储器。例如,存储器402还可以存储上述的关键词列表、分词等等。
具体实现中,本申请实施例中所描述的处理器401等可执行上述图1至图2所示的方法实施例中所描述的实现方式,也可执行本申请实施例图3所描述的各单元的实现方式,此处不赘述。
本申请实施例还提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时可实现图1至图2所对应实施例中描述的基于识别模型的意图识别方法中的部分或全部步骤,也可实现本申请图3或图4所示实施例的识别设备的功能,此处不赘述。
本申请实施例还提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述方法中的部分或全部步骤。
所述计算机可读存储介质可以是前述任一实施例所述的识别设备的内部存储单元,例如识别设备的硬盘或内存。所述计算机可读存储介质也可以是所述识别设备的外部存储设备,例如所述识别设备上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。
在本申请中,术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
以上所述,仅为本申请的部分实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到各种等效的修改或替换,这些修改或替换都应涵盖在本申请的保护范围之内。

Claims (20)

  1. 一种基于识别模型的意图识别方法,其特征在于,包括:
    接收用户输入的目标查询语句,对所述目标查询语句进行分词处理,以得到组成所述目标查询语句的多个分词;
    将所述多个分词与预设的关键词列表中的各关键词进行匹配，以从所述多个分词中确定出所述目标查询语句的目标关键词，所述目标关键词为所述多个分词中与所述关键词列表中的关键词相匹配的分词；
    计算所述多个分词中每个分词的词向量,并按照预设的加权系数对所述目标关键词的词向量进行加权处理,根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量;
    将所述目标查询语句的特征向量输入预置的意图识别模型,以得到对所述目标查询语句的识别结果;其中,所述意图识别模型由多个二分类器组成,每个二分类器对应一个意图,所述意图识别模型由所述多个二分类器对应的意图的查询语句样本训练得到,所述识别结果用于指示所述目标查询语句的意图,所述目标查询语句的意图为任一所述二分类器下的意图或无关意图。
  2. 根据权利要求1所述的方法,其特征在于,在所述将所述目标查询语句的特征向量输入预置的意图识别模型之前,所述方法还包括:
    从预设样本数据库分别选取多种意图的查询语句样本,并分别对每种意图的查询语句样本进行分词处理,以得到每种意图的查询语句样本的分词集合,每种意图的分词集合包括组成该种意图的查询语句样本的多个分词;
    按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词;
    计算每个分词的词向量,并按照预设的加权系数对每种意图的关键词的词向量进行加权处理,根据所述加权处理后的每个查询语句的每个分词的词向量计算得到每个查询语句样本的特征向量;
    根据所述多种意图的查询语句样本中的每个查询语句样本的特征向量及该查询语句样本对应的意图训练得到所述意图识别模型;其中,所述意图识别模型由多个二分类器组成,所述多个二分类器和所述多种意图一一对应。
  3. 根据权利要求2所述的方法,其特征在于,所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词,包括:
    分别计算得到每种意图的分词集合中的每个分词的词频-逆文件频率TF-IDF值;
    将每种意图的分词集合中TF-IDF值超过预设阈值的分词确定为该种意图的关键词;或者,
    按照TF-IDF值由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前M的TF-IDF值对应的分词确定为该种意图的关键词,其中,M为大于0的整数。
  4. 根据权利要求2所述的方法,其特征在于,所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词,包括:
    分别计算得到每种意图的分词集合中的每个分词在所述分词集合出现的频率,每个分词对应的频率为该分词在所述分词集合出现的次数与所述分词集合的分词总数目的比值;
    将每种意图的分词集合中频率超过预设频率阈值的分词确定为该种意图的关键词;或者,
    按照频率由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前N的频率对应的分词确定为该种意图的关键词,其中,N为大于0的整数。
  5. 根据权利要求3或4所述的方法,其特征在于,所述方法还包括:
    根据每种意图的关键词对应的词频-逆文件频率TF-IDF值或频率,为该关键词设置加权系数;其中,每个关键词对应的频率为该关键词在该种意图的分词集合出现的次数与所述分词集合的分词总数目的比值;
    将每种意图的关键词和该关键词对应的加权系数关联存储至所述关键词列表;
    所述按照预设的加权系数对所述目标关键词的词向量进行加权处理,包括:
    从所述关键词列表中确定出与所述目标关键词对应的加权系数,并按照确定出的加权系数对所述目标关键词的词向量进行加权处理。
  6. 根据权利要求1所述的方法，其特征在于，所述方法还包括：
    分别为所述多个二分类器设置概率阈值,每个二分类器对应的概率阈值用于指示输入的查询语句是否为该二分类器对应的意图;
    所述将所述目标查询语句的特征向量输入预置的意图识别模型,以得到对所述目标查询语句的识别结果,包括:
    将所述目标查询语句的特征向量输入预置的意图识别模型,以得到所述意图识别模型包括的所述多个二分类器对所述目标查询语句的识别结果,每个二分类器对应的识别结果包括所述目标查询语句的意图为该二分类器的意图的概率;
    分别判断每个二分类器的识别结果包括的概率是否低于该二分类器对应的概率阈值;
    如果所述多个二分类器的识别结果包括的概率均低于对应的概率阈值,确定所述目标查询语句为无关查询,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为无关意图;
    如果存在一个二分类器的识别结果包括的概率不低于该二分类器对应的概率阈值,确定所述目标查询语句的意图为该二分类器的意图,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为该二分类器的意图;
    如果存在多个二分类器的识别结果包括的概率不低于对应的概率阈值,确定所述不低于对应的概率阈值的概率中的最大概率,并将所述最大概率对应的二分类器的意图作为所述目标查询语句的意图,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为所述最大概率对应的二分类器的意图。
  7. 根据权利要求1所述的方法,其特征在于,所述根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量,包括:
    计算得到所述加权处理后的每个分词的词向量的和值,并将所述和值作为所述目标查询语句的特征向量;或者,
    计算得到所述加权处理后的每个分词的词向量的和值,并计算得到所述和值与所述多个分词的数目的比值,将所述比值作为所述目标查询语句的特征向量。
  8. 一种识别设备,其特征在于,包括:获取单元和处理单元;
    所述获取单元,用于接收用户输入的目标查询语句;
    所述处理单元，用于对所述目标查询语句进行分词处理，以得到组成所述目标查询语句的多个分词；将所述多个分词与预设的关键词列表中的各关键词进行匹配，以从所述多个分词中确定出所述目标查询语句的目标关键词，所述目标关键词为所述多个分词中与所述关键词列表中的关键词相匹配的分词；
    所述处理单元,还用于计算所述多个分词中每个分词的词向量,并按照预设的加权系数对所述目标关键词的词向量进行加权处理,根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量;
    所述处理单元,还用于将所述目标查询语句的特征向量输入预置的意图识别模型,以得到对所述目标查询语句的识别结果;其中,所述意图识别模型由多个二分类器组成,每个二分类器对应一个意图,所述意图识别模型由所述多个二分类器对应的意图的查询语句样本训练得到,所述识别结果用于指示所述目标查询语句的意图,所述目标查询语句的意图为任一所述二分类器下的意图或无关意图。
  9. 根据权利要求8所述的识别设备,其特征在于,
    所述获取单元,还用于从预设样本数据库分别选取多种意图的查询语句样本;
    所述处理单元,还用于分别对每种意图的查询语句样本进行分词处理,以得到每种意图的查询语句样本的分词集合,每种意图的分词集合包括组成该种意图的查询语句样本的多个分词;按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词;计算每个分词的词向量,并按照预设的加权系数对每种意图的关键词的词向量进行加权处理,根据所述加权处理后的每个查询语句的每个分词的词向量计算得到每个查询语句样本的特征向量;
    所述处理单元,还用于根据所述多种意图的查询语句样本中的每个查询语句样本的特征向量及该查询语句样本对应的意图训练得到所述意图识别模型;其中,所述意图识别模型由多个二分类器组成,所述多个二分类器和所述多种意图一一对应。
  10. 根据权利要求9所述的识别设备,其特征在于,所述处理单元在执行所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词时,具体用于:分别计算得到每种意图的分词集合中的每个分词的词频-逆文件频率TF-IDF值;将每种意图的分词集合中TF-IDF值超过预设阈值的分词确定为该种意图的关键词;或者,按照TF-IDF值由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前M的TF-IDF值对应的分词确定为该种意图的关键词,其中,M为大于0的整数。
  11. 根据权利要求9所述的识别设备,其特征在于,所述处理单元在执行所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词时,具体用于:分别计算得到每种意图的分词集合中的每个分词在所述分词集合出现的频率,每个分词对应的频率为该分词在所述分词集合出现的次数与所述分词集合的分词总数目的比值;将每种意图的分词集合中频率超过预设频率阈值的分词确定为该种意图的关键词;或者,按照频率由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前N的频率对应的分词确定为该种意图的关键词,其中,N为大于0的整数。
  12. 根据权利要求10或11所述的识别设备,其特征在于,所述识别设备还包括:存储单元;
    所述处理单元,还用于根据每种意图的关键词对应的词频-逆文件频率TF-IDF值或频率,为该关键词设置加权系数;其中,每个关键词对应的频率为该关键词在该种意图的分词集合出现的次数与所述分词集合的分词总数目的比值;
    所述存储单元,用于将每种意图的关键词和该关键词对应的加权系数关联存储至所述关键词列表;
    所述处理单元在执行所述按照预设的加权系数对所述目标关键词的词向量进行加权处理时，具体用于：从所述关键词列表中确定出与所述目标关键词对应的加权系数，并按照确定出的加权系数对所述目标关键词的词向量进行加权处理。
  13. 根据权利要求8所述的识别设备,其特征在于,
    所述处理单元,还用于分别为所述多个二分类器设置概率阈值,每个二分类器对应的概率阈值用于指示输入的查询语句是否为该二分类器对应的意图;
    所述处理单元在执行所述将所述目标查询语句的特征向量输入预置的意图识别模型,以得到对所述目标查询语句的识别结果时,具体用于:将所述目标查询语句的特征向量输入预置的意图识别模型,以得到所述意图识别模型包括的所述多个二分类器对所述目标查询语句的识别结果,每个二分类器对应的识别结果包括所述目标查询语句的意图为该二分类器的意图的概率;分别判断每个二分类器的识别结果包括的概率是否低于该二分类器对应的概率阈值;如果所述多个二分类器的识别结果包括的概率均低于对应的概率阈值,确定所述目标查询语句为无关查询,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为无关意图;如果存在一个二分类器的识别结果包括的概率不低于该二分类器对应的概率阈值,确定所述目标查询语句的意图为该二分类器的意图,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为该二分类器的意图;如果存在多个二分类器的识别结果包括的概率不低于对应的概率阈值,确定所述不低于对应的概率阈值的概率中的最大概率,并将所述最大概率对应的二分类器的意图作为所述目标查询语句的意图,所述目标查询语句的识别结果用于指示所述目标查询语句的意图为所述最大概率对应的二分类器的意图。
  14. 根据权利要求8所述的识别设备,其特征在于,所述处理单元在执行所述根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量时,具体用于:计算得到所述加权处理后的每个分词的词向量的和值,并将所述和值作为所述目标查询语句的特征向量;或者,计算得到所述加权处理后的每个分词的词向量的和值,并计算得到所述和值与所述多个分词的数目的比值,将所述比值作为所述目标查询语句的特征向量。
  15. 一种识别设备,其特征在于,包括处理器和存储器,所述处理器和存储器相互连接,其中,所述存储器用于存储计算机程序,所述计算机程序包括程序指令,所述处理器被配置用于调用所述程序指令,执行以下步骤:
    接收用户输入的目标查询语句,对所述目标查询语句进行分词处理,以得到组成所述目标查询语句的多个分词;
    将所述多个分词与预设的关键词列表中的各关键词进行匹配，以从所述多个分词中确定出所述目标查询语句的目标关键词，所述目标关键词为所述多个分词中与所述关键词列表中的关键词相匹配的分词；
    计算所述多个分词中每个分词的词向量,并按照预设的加权系数对所述目标关键词的词向量进行加权处理,根据所述加权处理后的每个分词的词向量计算得到所述目标查询语句的特征向量;
    将所述目标查询语句的特征向量输入预置的意图识别模型,以得到对所述目标查询语句的识别结果;其中,所述意图识别模型由多个二分类器组成,每个二分类器对应一个意图,所述意图识别模型由所述多个二分类器对应的意图的查询语句样本训练得到,所述识别结果用于指示所述目标查询语句的意图,所述目标查询语句的意图为任一所述二分类器下的意图或无关意图。
  16. 根据权利要求15所述的识别设备，其特征在于，所述处理器在调用所述程序指令执行所述将所述目标查询语句的特征向量输入预置的意图识别模型之前，还用于执行以下步骤：
    从预设样本数据库分别选取多种意图的查询语句样本,并分别对每种意图的查询语句样本进行分词处理,以得到每种意图的查询语句样本的分词集合,每种意图的分词集合包括组成该种意图的查询语句样本的多个分词;
    按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词;
    计算每个分词的词向量,并按照预设的加权系数对每种意图的关键词的词向量进行加权处理,根据所述加权处理后的每个查询语句的每个分词的词向量计算得到每个查询语句样本的特征向量;
    根据所述多种意图的查询语句样本中的每个查询语句样本的特征向量及该查询语句样本对应的意图训练得到所述意图识别模型;其中,所述意图识别模型由多个二分类器组成,所述多个二分类器和所述多种意图一一对应。
  17. 根据权利要求16所述的识别设备,其特征在于,所述处理器在调用所述程序指令执行所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词时,具体执行以下步骤:分别计算得到每种意图的分词集合中的每个分词的词频-逆文件频率TF-IDF值;将每种意图的分词集合中TF-IDF值超过预设阈值的分词确定为该种意图的关键词;或者,按照TF-IDF值由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前M的TF-IDF值对应的分词确定为该种意图的关键词,其中,M为大于0的整数。
  18. 根据权利要求16所述的识别设备,其特征在于,所述处理器在调用所述程序指令执行所述按照预设的关键词确定规则分别从每种意图的分词集合中确定出每种意图的关键词时,具体执行以下步骤:分别计算得到每种意图的分词集合中的每个分词在所述分词集合出现的频率,每个分词对应的频率为该分词在所述分词集合出现的次数与所述分词集合的分词总数目的比值;将每种意图的分词集合中频率超过预设频率阈值的分词确定为该种意图的关键词;或者,按照频率由大到小的顺序对每种意图的分词集合中的分词进行排序,将所述排序前N的频率对应的分词确定为该种意图的关键词,其中,N为大于0的整数。
  19. 根据权利要求17或18所述的识别设备,其特征在于,所述处理器还用于执行以下步骤:
    根据每种意图的关键词对应的词频-逆文件频率TF-IDF值或频率,为该关键词设置加权系数;其中,每个关键词对应的频率为该关键词在该种意图的分词集合出现的次数与所述分词集合的分词总数目的比值;
    将每种意图的关键词和该关键词对应的加权系数关联存储至所述关键词列表;
    所述处理器在调用所述程序指令执行所述按照预设的加权系数对所述目标关键词的词向量进行加权处理时,具体执行以下步骤:从所述关键词列表中确定出与所述目标关键词对应的加权系数,并按照确定出的加权系数对所述目标关键词的词向量进行加权处理。
  20. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储有计算机程序,所述计算机程序包括程序指令,所述程序指令当被处理器执行时使所述处理器执行如权利要求1-7任一项所述的方法。
PCT/CN2019/088802（申请日：2019-05-28），公开号WO2020140372A1（公开日：2020-07-09），Family ID: 66604088

Applications Claiming Priority
- CN201910015234.6（公开号CN109815492A），优先权日：2019-01-04，名称：一种基于识别模型的意图识别方法、识别设备及介质






Legal Events

- NENP: Non-entry into the national phase（Ref country code: DE）
- 32PN: Noting of loss of rights pursuant to Rule 112(1) EPC（EPO Form 1205A dated 18/11/2021）
- 121: The EPO has been informed by WIPO that EP was designated in this application（Ref document number: 19907404, EP, A1）
- 122: PCT application non-entry in European phase（Ref document number: 19907404, EP, A1）