WO2024041350A1 - Intent recognition method and apparatus, electronic device, and storage medium - Google Patents

Intent recognition method and apparatus, electronic device, and storage medium

Info

Publication number
WO2024041350A1
WO2024041350A1 (PCT/CN2023/111242)
Authority
WO
WIPO (PCT)
Prior art keywords
text
intention
recognized
intent
target
Prior art date
Application number
PCT/CN2023/111242
Other languages
English (en)
French (fr)
Inventor
丁隆耀
蒋宁
吴海英
李宽
权佳成
Original Assignee
马上消费金融股份有限公司
Priority date
Filing date
Publication date
Application filed by 马上消费金融股份有限公司
Publication of WO2024041350A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/35 Clustering; Classification
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • This application relates to the field of artificial intelligence, and in particular to an intention recognition method, an intention recognition device, an electronic device, and a storage medium.
  • Using robot customer service to respond to customer questions automatically can save a great deal of human resources and improve communication efficiency.
  • To respond correctly, the robot customer service must first perform intent recognition on the text of the customer's question to clarify the customer's purpose.
  • This application provides an intention recognition method, device, electronic device, and storage medium to improve the accuracy of intention recognition.
  • In a first aspect, this application provides an intention recognition method, which includes: obtaining text to be recognized; performing intent classification processing on the text to be recognized to obtain the intention category of the text to be recognized; if the intention category is determined to be a specified intention category, splicing the text to be recognized with a preset template sentence to obtain a target text, where the preset template sentence is used to represent intention prompt information; and inputting the target text into an intention recognition model to obtain the intention recognition result of the text to be recognized, where the intention recognition model is used to perform intention recognition processing on the text to be recognized based on the intention prompt information.
  • In a second aspect, this application provides a training method for an intent recognition model, which includes: obtaining initial training text, where the intent category of the initial training text is a specified intent category; splicing the initial training text with a preset template sentence to obtain target training text, where the preset template sentence is used to represent intention prompt information; and inputting the target training text into an initial intention recognition model for iterative training to obtain the intent recognition model.
  • In a third aspect, the present application provides an intention recognition method applied to digital humans, which includes: obtaining text to be recognized input by a user; identifying the intention of the text to be recognized according to the intention recognition method described in the first aspect to obtain the user intention; and obtaining, according to the user intention, the target text corresponding to the user intention in the digital human system and displaying the target text.
  • In another aspect, embodiments of the present application provide an intention recognition device, including: a first acquisition unit, configured to acquire text to be recognized; a classification unit, configured to perform intention classification processing on the text to be recognized to obtain the intention category of the text to be recognized; a first splicing unit, configured to splice the text to be recognized with a preset template sentence to obtain a target text if the intention category is determined to be the specified intention category, where the preset template sentence is used to represent intention prompt information; and a first recognition unit, configured to input the target text into an intention recognition model to obtain the intention recognition result of the text to be recognized, where the intention recognition model is used to perform intent recognition processing on the text to be recognized based on the intention prompt information.
  • In another aspect, the present application provides a training device for an intention recognition model, including: a second acquisition unit, configured to acquire initial training text, where the intention category of the initial training text is a specified intention category; a second splicing unit, configured to splice the initial training text with a preset template sentence to obtain target training text, where the preset template sentence is used to represent intention prompt information; and a training unit, configured to input the target training text into an initial intention recognition model for iterative training to obtain the intent recognition model.
  • In another aspect, the present application provides an intention recognition device applied to digital humans, including: a third acquisition unit, configured to acquire text to be recognized input by a user; a second recognition unit, configured to identify the intention of the text to be recognized according to the intention recognition method described in the first aspect to obtain the user intention; and a display unit, configured to obtain, according to the user intention, the target text corresponding to the user intention in the digital human system and display the target text.
  • In another aspect, the present application provides an electronic device, including: a processor; and a memory configured to store computer-executable instructions that, when executed, cause the processor to perform the intention recognition method described in the first aspect, the training method of the intention recognition model described in the second aspect, or the intention recognition method applied to digital humans described in the third aspect.
  • In another aspect, the present application provides a computer-readable storage medium for storing computer-executable instructions that, when executed by a processor, implement the intention recognition method described in the first aspect, the training method of the intention recognition model described in the second aspect, or the intention recognition method applied to digital humans described in the third aspect.
  • Figure 1 is a processing flow chart of an intention recognition method provided by an embodiment of the present application.
  • Figure 2 is a processing flow chart of another intention recognition method provided by an embodiment of the present application.
  • Figure 3 is a processing flow chart of a method for identifying non-long-tail intentions provided by an embodiment of the present application.
  • Figure 4 is a processing flow chart of a method for identifying important long-tail intentions provided by an embodiment of the present application.
  • Figure 5 is a mapping relationship diagram between mask values and intent labels provided by an embodiment of the present application.
  • Figure 6 is a processing flow chart of a response method provided by an embodiment of the present application.
  • Figure 7 is a processing flow chart of a training method for an intention recognition model provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of an intention recognition device provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of a training device for an intention recognition model provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • In practice, the user intention categories involved in customer questions are very rich.
  • Some intention categories occur at low frequency while others occur at high frequency. If a single intention recognition model is trained uniformly on labeled data, then on the one hand, because high-frequency categories have sufficient samples, the model recognizes customer questions of those categories very well; on the other hand, because low-frequency categories lack sufficient samples, the model recognizes customer questions of those categories poorly. A robot that automatically responds based on such intent recognition results therefore answers incorrectly, leaving the customer with a bad experience, and the customer has to transfer to a human agent to get an accurate response.
  • In this way, the robot indirectly increases the workload of manual customer service.
  • embodiments of the present application provide an intention recognition method.
  • the intention recognition method proposed in this application can be executed by an electronic device, specifically by a processor in the electronic device.
  • The electronic devices mentioned here can be terminal devices, such as smartphones, tablets, desktop computers, intelligent voice interaction devices, wearable devices, robots, and vehicle-mounted terminals; or the electronic devices can be servers, such as an independent physical server, a server cluster composed of multiple servers, or a cloud server capable of cloud computing.
  • Referring to FIG. 1, a processing flow chart of an intention recognition method provided by an embodiment of the present application is shown.
  • the intention identification method provided by the embodiment of the present application may specifically include the following steps:
  • Step S102 Obtain the text to be recognized.
  • The text to be recognized can be obtained by acquiring voice data to be recognized and converting it into text form, by receiving text input directly by the user, or by any other method that yields text requiring intent recognition.
  • the text input by the user to be recognized may be referred to as input text.
  • Step S104 Perform intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized.
  • the intentions involved in customer questions can include a small number of main intentions and a large number of long-tail intentions.
  • Main intentions are a smaller number of intentions that together cover a very high share of traffic.
  • Long-tail intentions are a larger number of intentions that each account for very little traffic. Configuring a dedicated response operation for every long-tail intention involves a huge workload for very little return. Moreover, if all texts are handled by a single unified intent recognition model, then, because the number of samples corresponding to long-tail intentions in historical data is much smaller than that of main intentions, the recognition accuracy for long-tail intentions is likely to be lower than for main intentions.
  • different intent recognition models need to be adopted for different intent categories to ensure that intent categories with different characteristics can obtain better intent recognition results.
  • the intent categories of the text to be identified include, but are not limited to: non-long-tail intent, important long-tail intent, and non-important long-tail intent.
  • Non-long-tail intentions can be main intentions, that is, intentions that are few in number but cover extremely high traffic; important long-tail intentions are intentions that are many in number, each occupying extremely low traffic, but of high importance; non-important long-tail intentions are intentions that are many in number, each occupying extremely low traffic, and of low importance.
  • non-long-tail intent is characterized by a smaller number but extremely high traffic coverage.
  • a non-long-tail intention recognition model composed of a pre-trained language model and a multi-layer perceptron can be used for intent recognition.
  • Pre-trained language models include but are not limited to: BERT (Bidirectional Encoder Representations from Transformers) model, or RoBERTa (a Robustly Optimized BERT Pretraining Approach) model, etc.
  • The BERT model is a language representation model built on the Transformer's bidirectional encoder.
  • The training process of the BERT model can be divided into a pre-training part and a model fine-tuning part.
  • The model fine-tuning part further trains the pre-trained BERT model on a downstream task; this paradigm is widely used in tasks such as text classification and text matching.
  • Pre-training and model fine-tuning can be illustrated by the following example. Assume a training set A already exists: the network is first pre-trained on task A to learn the network parameters, which are then saved for later use. When a new task B arrives, the same network structure is adopted; its parameters are initialized from those learned on task A, while the remaining task-specific layers are randomly initialized. The network is then trained on task B's data. Because the loaded parameters keep changing as training on task B continues, this is called "fine-tuning": the parameters are adjusted to make them better suited to the current task B.
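  • The pre-train/fine-tune split above can be sketched in a few lines of PyTorch. This is a minimal illustration rather than the patent's implementation: the backbone architecture, dimensions, class count, and checkpoint path are all assumptions made for the example.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Shared backbone whose parameters are learned on task A (pre-training)."""
    def __init__(self, vocab_size=30000, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)

    def forward(self, ids):
        return self.layer(self.embed(ids)).mean(dim=1)  # pooled sentence vector

# 1) Pre-train on task A, then save the learned backbone parameters.
encoder = Encoder()
# ... pre-training loop on task A goes here ...
torch.save(encoder.state_dict(), "task_a_encoder.pt")  # illustrative path

# 2) Fine-tune on task B: load the saved parameters into the same structure,
#    randomly initialize only the new task-specific head.
encoder_b = Encoder()
encoder_b.load_state_dict(torch.load("task_a_encoder.pt"))
head_b = nn.Linear(256, 5)  # task-B head with 5 illustrative classes, random init
optimizer = torch.optim.AdamW(
    list(encoder_b.parameters()) + list(head_b.parameters()), lr=2e-5)
# Training on task B's data keeps adjusting the loaded parameters: "fine-tuning".
```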
  • The RoBERTa model is similar to the BERT model, with several adjustments on top of BERT: 1) longer training time, larger batch size, and more training data; 2) the next-sentence-prediction loss is removed; 3) longer training sequences; 4) a dynamic masking mechanism. Because it performs better than the BERT model in many scenarios, it is widely used in NLP (Natural Language Processing) tasks.
  • In this way, model fine-tuning of the pre-trained language model can be achieved.
  • With this training method, the more samples are used to train the model, the better the training effect.
  • Non-long-tail intentions are few in number but cover extremely high traffic, that is, they appear very frequently in historical intent data, so a large number of corresponding training samples can be obtained from that data. Consequently, the non-long-tail intention recognition model trained with the model fine-tuning method identifies non-long-tail intentions with high accuracy.
  • Non-important long-tail intentions are many in number, but each occupies extremely little traffic and is of low importance. For them, the accuracy requirements for intent recognition are not high and no complex follow-up steps are involved, so a dedicated non-important long-tail intent recognition model can be configured.
  • The non-important long-tail intent recognition model can use keyword matching directly to achieve intent recognition, and the matching conditions can be set strictly when the keywords are chosen.
  • In some embodiments, performing intent classification processing on the text to be recognized to obtain its intent category includes: counting the historical texts included in a pre-stored historical text collection to obtain a first quantity; determining, within the historical text collection, the number of historical texts that have the same intention as the text to be recognized, to obtain a second quantity; determining, based on the first quantity and the second quantity, the frequency of occurrence of the intention corresponding to the text to be recognized in the historical text collection, to obtain a target frequency value; and determining the intent category of the text to be recognized based on the comparison between the target frequency value and a preset frequency threshold.
  • the pre-stored historical text collection may include multiple historical texts, each historical text corresponding to an intention.
  • Any two historical texts may share the same intention or have different intentions.
  • Two historical texts with the same intention can have completely identical contents or different contents; whether any two historical texts share an intention can be reflected by the similarity between them.
  • the number of historical texts included in the pre-stored historical text collection is calculated to obtain the first quantity; in the historical text collection, the number of historical texts belonging to the same intention as the text to be recognized is determined to obtain the second quantity. Then, through the first quantity and the second quantity, the frequency of occurrence of the intention corresponding to the text to be recognized in the historical text collection can be calculated, and the target frequency value can be obtained.
  • The target frequency value may be the ratio of the second quantity to the first quantity, or may be calculated from a preset coefficient together with the first quantity and the second quantity.
  • In some embodiments, determining the number of historical texts that have the same intention as the text to be recognized to obtain the second quantity includes: calculating the similarity between the text to be recognized and each historical text in the historical text collection to obtain target similarities; if a target similarity is greater than or equal to a preset similarity threshold, determining that the text to be recognized and the historical text corresponding to that target similarity have the same intention; and counting the historical texts that have the same intention as the text to be recognized to obtain the second quantity.
  • For example, suppose the text to be recognized is "I heard that product A is having an event recently? What does that coupon mean?" and historical text 1 is "Is product A having an event recently, and are there any large coupons?". If the target similarity between the text to be recognized and historical text 1 is a%, and a% is greater than the preset similarity threshold A%, it is determined that the text to be recognized and historical text 1 have the same intention.
  • the second quantity can reflect the number of occurrences of the intention of the text to be recognized in the historical text collection.
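  • The counting just described can be sketched as follows, assuming some sentence-embedding function embed() and cosine similarity as the similarity measure; the function names and the similarity threshold are illustrative, not mandated by this application.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def target_frequency(text, history, embed, sim_threshold=0.85):
    """Frequency of the text's intention within the historical text collection."""
    first_quantity = len(history)            # number of historical texts
    text_vec = embed(text)
    second_quantity = sum(                   # texts sharing the same intention
        1 for h in history
        if cosine(text_vec, embed(h)) >= sim_threshold
    )
    return second_quantity / first_quantity  # target frequency value
```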
  • In some embodiments, the intent category of the text to be recognized is one of non-long-tail intent, important long-tail intent, and non-important long-tail intent. Determining the intent category based on the comparison between the target frequency value and the preset frequency threshold includes: if the target frequency value is greater than or equal to the preset frequency threshold, determining the intent category to be a non-long-tail intent; if the target frequency value is less than the preset frequency threshold, judging, according to the business rules corresponding to the text to be recognized, whether the importance parameter of the intention of the text to be recognized is greater than or equal to a preset parameter threshold, where the importance parameter characterizes the importance of that intention; if the importance parameter is greater than or equal to the preset parameter threshold, determining the intent category to be an important long-tail intent; and if the importance parameter is less than the preset parameter threshold, determining the intent category to be a non-important long-tail intent.
  • If the target frequency value is greater than or equal to the preset frequency threshold, the intention corresponding to the text to be recognized appears frequently in the historical text collection, so its intent category can be determined to be a non-long-tail intent; if the target frequency value is less than the preset frequency threshold, the intention appears infrequently, so its intent category can be determined to be a long-tail intent.
  • When the intent category of the text to be recognized is a long-tail intent, different business rules apply in different application scenarios.
  • Business rules can be pre-configured with judgment conditions for how to determine whether an intention is important, and can also be configured with a generation method and preset parameter thresholds for importance parameters.
  • Specifically, it is judged whether the importance parameter of the intention of the text to be recognized is greater than or equal to the preset parameter threshold. If it is, the intention is judged to be of high importance and the intent category of the text to be recognized is determined to be an important long-tail intent; if it is less than the preset parameter threshold, the intention is judged to be of low importance and the intent category is determined to be a non-important long-tail intent.
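  • Putting the two thresholds together, the category decision reduces to a small function; the threshold values below are placeholders, since the application leaves them to business configuration.

```python
def classify_intent_category(target_frequency_value, importance_parameter,
                             freq_threshold=0.01, importance_threshold=0.5):
    """Map frequency and importance to one of the three intent categories."""
    if target_frequency_value >= freq_threshold:
        return "non-long-tail intent"
    if importance_parameter >= importance_threshold:  # from business rules
        return "important long-tail intent"
    return "non-important long-tail intent"
```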
  • Step S106 If it is determined that the intention category is the specified intention category, the text to be recognized and the preset template sentence are spliced to obtain the target text; the preset template sentence is used to represent the intention prompt information.
  • the specified intent category can be a long-tail intent, an important long-tail intent, or other intent categories with a smaller number of samples that can be used for model training.
  • each robot that executes the intention recognition method provided by the embodiments of the present application can be pre-configured with a preset template statement corresponding to the business.
  • the preset template statements can reflect the intention prompt information of the business.
  • the function of the preset template statement is to reconstruct the text to be recognized to generate a target text that includes both the text to be recognized and the intention prompt information.
  • The intention prompt information can be reflected through the mask and the context around the mask.
  • For example, the template of the target text is: x + preset template sentence, where x is the text to be recognized. If the preset template sentence is "Well, I want to [mask].", then the target text obtained by splicing the text to be recognized with the preset template sentence is: "I heard that product A is doing activities recently? What does that coupon mean? Well, I want to [mask]."
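  • The splicing itself is plain string concatenation, as in this sketch (the template wording is illustrative):

```python
PRESET_TEMPLATE = "Well, I want to [mask]."  # preset template sentence

def build_target_text(text_to_recognize: str) -> str:
    """Splice the text to be recognized with the preset template sentence."""
    return text_to_recognize + " " + PRESET_TEMPLATE

print(build_target_text(
    "I heard that product A is doing activities recently? What does that coupon mean?"))
# I heard that product A is doing activities recently? What does that coupon mean? Well, I want to [mask].
```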
  • Prompt learning is a recently proposed paradigm of NLP training. Different from the commonly used "pre-training + model fine-tuning" approach, prompt learning does not adapt the pre-trained language model (LM) to the downstream task through objective engineering; instead, with the help of text prompts, it reformulates the downstream task to look more like the task solved during the original LM training.
  • Intent prompt information is the text prompt used in prompt learning.
  • Step S108 Input the target text into the intention recognition model to obtain the intention recognition result of the text to be recognized.
  • the intention recognition model is used to perform intention recognition processing on the text to be recognized based on the intention prompt information.
  • In some embodiments, the preset template sentence consists of a preset sentence pattern and a mask, and the intent recognition model includes a prediction sub-model and a label determination module connected in sequence. The prediction sub-model is used to perform mask value prediction on the target text to obtain the corresponding mask prediction value; the label determination module is used to determine, based on the mask prediction value corresponding to the target text and the preconfigured mapping relationship between mask values and intent labels, the target intent label that has a mapping relationship with the mask prediction value, and to determine the target intent label as the intention recognition result of the text to be recognized.
  • The preset sentence pattern can be a fixed pattern chosen for the business scenario. For example, if the robot is mainly used to handle customers' inquiries about promotional activities, the fixed pattern can be "I want to ask about ___ discounts"; or, if the electronic device is mainly used to handle customer complaints, the fixed pattern can be "My opinion of ___ is ___".
  • The mask represents an unknown value to be predicted, corresponding to the blank to be filled in the fixed pattern. For example, the preset template sentence can be "I want to ask about [mask]'s discount", or "My opinion about [mask1] is [mask2]". A preset template sentence can contain one or more masks.
  • The prediction is no longer a class index such as 0, 1, or 2; instead, the prediction sub-model selects the answer it considers most likely from a mask value space. The label determination module then maps the mask prediction value to a label space, which contains multiple intent labels, according to the preconfigured mapping relationship between mask values and intent labels.
  • the mask prediction value "ask about coupons” is obtained.
  • the mask prediction value "ask about coupons” can be substituted into [mask] to obtain the substitution result: "I heard that product A is doing activities recently? What does that coupon mean? Well, I want to ask Coupon.”
  • the intent tags with mapping relationships are determined based on the substitution results.
  • the mask prediction value "Ask for Coupon” can also be output directly without substitution.
  • the label determination module directly determines the intention tag with a mapping relationship based on the mask prediction value. For example, the intent label that is mapped to the mask prediction value "ask about coupons” is "ask about coupons.”
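  • The label determination module is therefore little more than a lookup over a preconfigured dictionary; a sketch using the coupon and complaint examples from this application:

```python
# Preconfigured mapping between mask values and intent labels (the mapping
# relationship of Figure 5); the entries mirror the examples in this application.
MASK_VALUE_TO_LABEL = {
    "ask about coupons": "consult about coupons",
    "inquire about coupons": "consult about coupons",
    "report": "customer complaint",
    "complain": "customer complaint",
    "report you": "customer complaint",
}

def determine_label(mask_prediction_value: str) -> str:
    """Return the intent label mapped to a given mask prediction value."""
    return MASK_VALUE_TO_LABEL[mask_prediction_value]
```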
  • In some embodiments, the prediction sub-model is used to: determine, according to the target text, the probability that the mask takes each preset mask value in a preconfigured mask value set, obtaining a prediction probability for each preset mask value; sort the preset mask values by prediction probability to obtain a sorting result; and, based on the sorting result, determine the preset mask value with the highest prediction probability as the mask prediction value corresponding to the target text.
  • the text to be recognized may be text input by the target user.
  • Alternatively, after determining the prediction probability of each preset mask value and sorting the values, the prediction sub-model can take the preset number of mask values with the highest prediction probabilities as mask prediction values. The intent recognition model can then output a preset number of intent recognition results, feed back intention confirmation information carrying these results to the target user, and give a targeted reply after receiving the target user's intent selection instruction.
  • The preset number can be a natural number greater than 1.
  • The target user may be a customer who poses a question to the voice customer service robot, and the question may serve as the text input by the target user.
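  • Ranked mask value prediction of this kind can be sketched with the Hugging Face transformers fill-mask pipeline. This is a single-token illustration: the multi-word mask values discussed above would need span-level decoding, and the model name is an assumption.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

target_text = ("What does that coupon mean? Well, I want to ask about the "
               + fill.tokenizer.mask_token + ".")
# top_k plays the role of the preset number of candidate mask values; a
# targets=[...] argument could further restrict candidates to the mask value set.
for pred in fill(target_text, top_k=3):
    print(pred["token_str"], round(pred["score"], 4))  # value and its probability
```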
  • In some embodiments, the prediction sub-model is obtained by inputting training text data into the initial prediction sub-model for iterative training, where the training text data is obtained by splicing training text with preset template sentences filled with sample mask values.
  • In other words, the embodiment of the present application adopts a prompt learning training method: the training sample input to the initial prediction sub-model is not the text paired with its intention label, but the target text obtained by splicing the text with a preset template sentence whose mask has been filled with a sample mask value.
  • the initial prediction sub-model can be a pre-trained model.
  • the initial prediction sub-model can be a pre-trained language representation model, such as the BERT model, or the RoBERTa model, etc.
  • the initial prediction sub-model can also be a pre-trained open source model, such as the finbert model, or the mengzi-fin model, etc.
  • Pre-training models often use a large amount of sample data to perform cloze tasks in the pre-training stage. Therefore, pre-training models have powerful word-filling capabilities.
  • Given a cloze-style prompt, the pre-trained model can "recall" the corresponding way of answering, which improves its expressive ability even when the number of samples of important long-tail intentions is small.
  • When the prompt learning method is used to train the pre-trained model, the training text obtained by splicing the original text with a preset template sentence filled with a sample mask value is input into the initial prediction sub-model for iterative training. Even with a small number of training texts, good training results can be achieved, so the mask prediction values predicted by the trained intent recognition model in production are more accurate.
  • By contrast, if the fine-tuning method were used to train the pre-trained model, the text to be recognized carrying an intention label would be input into the initial prediction sub-model for iterative training. That training method requires an extremely large number of samples; a small number of samples cannot meet the training needs, which may lead to inaccurate predictions from the trained intent recognition model.
  • In some embodiments, the intention recognition method also includes: feeding back intention confirmation information to the target user, where the intention confirmation information carries multiple intention recognition results; receiving the target user's intention selection instruction and determining the intention selected by the instruction as the target intention; and performing the corresponding intention response operation according to the target intention.
  • the intent confirmation information carries multiple intent recognition results, and multiple intents to be selected can be displayed to the target user through the display interface so that the user can select the true intent.
  • Each intention to be selected corresponds to an intention recognition result.
  • the intent confirmation information may be: Do you want to inquire about: 1. Coupon A; 2. Promotional activities B; 3. Discounts on product C.
  • Target intentions include but are not limited to: consulting, asking for help, complaining, shopping, etc.
  • For example, if the target intention is consulting about coupons, the corresponding intention response operation may be to reply with coupon introduction information; if the target intention is asking for directions, the operation may be to obtain the target user's current location information and destination location information and push a corresponding navigation route to the target user; if the target intention is a customer complaint, the operation may be to reply to the target user with preset comforting words and record the corresponding complaint information; and if the target intention is shopping, the operation may be to push shopping price-comparison information and shopping links to the target user.
  • In some embodiments, multiple intention recognition modules connected in series can also be pre-constructed, namely a "non-long-tail intention recognition module", an "important long-tail intention recognition module" and a "non-important long-tail intention recognition module". If the non-long-tail intention recognition module determines that the intent category of the received text to be recognized is a non-long-tail intent, it performs intent recognition on the text and outputs the intent label; otherwise recognition fails and the text to be recognized is passed on to the important long-tail intention recognition module. If the important long-tail intention recognition module determines that the intent category is an important long-tail intent, it performs intent recognition and outputs the intent label; otherwise the text to be recognized is passed on to the non-important long-tail intention recognition module, which handles it in the same manner.
  • The structure of the non-long-tail intention recognition module can refer to the non-long-tail intention recognition model described above; the important long-tail intention recognition module can refer to the intention recognition model provided in the embodiments of this application; and the non-important long-tail intention recognition module can refer to the keyword-based non-important long-tail intent recognition model described above.
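  • A sketch of the serial cascade, assuming each module is a callable that returns an intent label on success and None on failure (that interface is an assumption made for illustration):

```python
def recognize(text, non_long_tail, important_long_tail, non_important_long_tail):
    """Pass the text through the three recognition modules in series (Figure 2)."""
    for module in (non_long_tail, important_long_tail, non_important_long_tail):
        label = module(text)
        if label is not None:
            return label  # a module succeeded: output its intent label
    return None           # all modules failed: fall back to the bottom answer
```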
  • In summary, the text to be recognized is first obtained; secondly, intention classification processing is performed on the text to be recognized to obtain its intention category; then, if the intention category is determined to be the specified intention category, the text to be recognized is spliced with the preset template sentence to obtain the target text, where the preset template sentence is used to represent the intention prompt information; finally, the target text is input into the intention recognition model to obtain the intention recognition result of the text to be recognized, where the intention recognition model performs intention recognition processing on the text to be recognized based on the intention prompt information.
  • Figure 2 is a processing flow chart of another intention identification method provided by an embodiment of the present application.
  • Step S202 Enter text.
  • Step S202 is equivalent to step S102 in the embodiment of FIG. 1 .
  • the text input in step S202 may be text to be recognized.
  • Step S204 Determine whether the non-long-tail intent recognition module has successfully identified it.
  • the non-long-tail intent recognition module may include a non-long-tail intent recognition model trained by labeling.
  • The non-long-tail intent recognition model has extremely high recognition accuracy for text whose intent category is a non-long-tail intent. If the text is input into the non-long-tail intent recognition model for intent recognition processing and the obtained result characterizes the intent category of the text as a non-long-tail intent, recognition by the non-long-tail intent recognition module is judged successful; if the obtained result characterizes the intent category as something other than a non-long-tail intent, recognition by the module is judged unsuccessful.
  • If the non-long-tail intention recognition module succeeds, step S210 is executed; if it fails, step S206 is executed.
  • Step S206 Determine whether the important long-tail intent identification module has successfully identified it.
  • Whether the important long-tail intent recognition module has succeeded is determined as in step S204. If yes, execute step S210; if not, execute step S208.
  • Step S208 Determine whether the non-important long-tail intention identification module has successfully identified it.
  • Whether the non-important long-tail intent recognition module has succeeded is determined as in step S204. If yes, execute step S210; if not, execute step S212.
  • Step S210 Output the intent tag.
  • Steps S204, S206 and S210 can replace steps S104, S106 and S108 in the embodiment of Figure 1.
  • Specifically, the text to be recognized can be input into the non-long-tail intent recognition module for intent recognition processing to obtain a non-long-tail intent recognition result of the text to be recognized. If that result indicates that recognition by the non-long-tail intent recognition module was unsuccessful, the text to be recognized is spliced with the preset template sentence to obtain the target text, where the preset template sentence is used to represent the intention prompt information.
  • the important long-tail intent recognition module in the embodiment of FIG. 2 may include various structural components in the intent recognition model provided in the embodiment of FIG. 1 and implement the same function.
  • Step S212 Output the fallback answer.
  • The fallback answer can be a transfer to a human agent or another preset response, for example, a prompt message suggesting that the user dial the human customer service number.
  • Figure 3 is a processing flow chart of a method for identifying non-long tail intentions provided by an embodiment of the present application.
  • Non-long-tail intentions are the main intentions, which have the characteristics that the number of intentions accounts for a small proportion of all intentions and occupies a large amount of traffic.
  • the input text can be input into the non-long-tail intent recognition model for intent prediction processing to obtain the intent prediction results.
  • the non-long-tail intent recognition model may include a pre-trained model, a multi-layer perceptron, and a normalized exponential function, that is, a Softmax function, which are connected in sequence.
  • The number of output categories of the Softmax function can be the number of all common questions plus one for "other questions".
  • the intent prediction results include but are not limited to: 1. Checking the loan balance; 2. How to repay early; 3. WeChat deduction issues; 4. What loan products are available, etc.
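  • A minimal sketch of such a non-long-tail model with the transformers library; the encoder checkpoint and the number of common questions are placeholders, not values fixed by this application:

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class NonLongTailClassifier(nn.Module):
    """Pre-trained encoder + multi-layer perceptron + Softmax, with one class
    per common question plus one extra class for "other questions"."""
    def __init__(self, model_name="bert-base-chinese", num_common_questions=30):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_common_questions + 1))

    def forward(self, input_ids, attention_mask):
        pooled = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state[:, 0]
        return torch.softmax(self.mlp(pooled), dim=-1)  # normalized exponential
```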
  • FIG. 4 is a processing flow chart of a method for identifying important long-tail intentions provided by an embodiment of the present application.
  • Step S402 enter text.
  • Step S402 is equivalent to step S102 in the embodiment of FIG. 1 .
  • the input text can be text to be recognized.
  • Step S404 Construct a preset template sentence.
  • Step S406 Construct a mapping relationship.
  • Figure 5 is a mapping relationship diagram between mask values and intent tags provided by the embodiment of the present application.
  • The intent label space includes multiple intent labels, for example, "consult about coupons", "customer complaint", etc.
  • The mask value space includes multiple preset mask values, for example, "ask about coupons", "inquire about coupons", "report", "complain", "report you", etc.
  • The intent label "consult about coupons" can establish mapping relationships with the preset mask values "ask about coupons" and "inquire about coupons"; the intent label "customer complaint" can establish mapping relationships with the preset mask values "report", "complain", and "report you".
  • Step S408 Generate text carrying a preset template sentence.
  • Step S408 is equivalent to step S106 in the embodiment of FIG. 1 .
  • Step S410 perform mask value prediction.
  • Step S412 Based on the mapping relationship, the model prediction result is output.
  • Steps S410 and S412 are equivalent to step S108 in the embodiment of FIG. 1 .
  • embodiments of the present application also provide a response method, which can be applied in the field of artificial intelligence.
  • Figure 6 is a processing flow chart of a response method provided by an embodiment of the present application.
  • Step S602 Convert customer voice questions into text.
  • Step S604 Enter the text into the intention recognition model for intention recognition.
  • Step S606 Map the answer corresponding to the intention.
  • Step S608 Convert the answer into voice output.
  • Step S610 The robot plays the corresponding voice words to answer the customer.
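  • End to end, the Figure 6 flow is a short pipeline; asr, tts, and play below stand in for whatever speech components a deployment actually uses and are assumptions of this sketch:

```python
def answer_customer(audio, asr, intent_model, answer_map, tts, play):
    """Voice question in, voice answer out (steps S602 to S610)."""
    text = asr(audio)                 # S602: convert the voice question to text
    intent = intent_model(text)       # S604: intent recognition on the text
    answer_text = answer_map[intent]  # S606: map the intent to its answer
    speech = tts(answer_text)         # S608: convert the answer to voice
    play(speech)                      # S610: the robot plays the reply
```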
  • embodiments of the present application also provide a training method for an intent recognition model.
  • Figure 7 is a processing flow chart of a method for training an intention recognition model provided by an embodiment of the present application.
  • Step S702 Obtain initial training text; the intention category of the initial training text is the specified intention category.
  • Step S704 The initial training text and the preset template sentence are spliced to obtain the target training text; the preset template sentence is used to represent the intention prompt information.
  • Step S706 Input the target training text into the initial intention recognition model for iterative training to obtain the intention recognition model.
  • In some embodiments, the preset template sentence can be a preset sentence pattern filled with a sample mask value; the intent recognition model includes a prediction sub-model and a label determination module connected in sequence; the prediction sub-model can be obtained by inputting the target training text into the initial prediction sub-model for iterative training; and the label determination module can be obtained by inputting mask prediction values into the initial label determination module for iterative training, where the mask prediction values are generated by the prediction sub-model.
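  • A compact sketch of one prompt-learning training step with the transformers masked-LM interface; the checkpoint and template wording are illustrative, and the sample mask value is assumed to tokenize to a single token so that the masked and filled sequences align:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
optim = torch.optim.AdamW(model.parameters(), lr=2e-5)

def training_step(initial_text: str, sample_mask_value: str) -> float:
    """Splice the text with the template, hide the sample mask value, and
    train the model to predict it back (a cloze-style MLM loss)."""
    prompt = initial_text + " Well, I want to " + tok.mask_token + "."
    filled = initial_text + " Well, I want to " + sample_mask_value + "."
    inputs = tok(prompt, return_tensors="pt")
    labels = tok(filled, return_tensors="pt")["input_ids"]
    labels[inputs["input_ids"] != tok.mask_token_id] = -100  # loss on [MASK] only
    loss = model(**inputs, labels=labels).loss
    loss.backward(); optim.step(); optim.zero_grad()
    return loss.item()
```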
  • In other words, the embodiment of the present application adopts a prompt learning training method: the training sample input to the initial prediction sub-model is not the initial training text paired with its intention label, but the target training text obtained by splicing the initial training text with a preset template sentence whose mask has been filled with a sample mask value.
  • the initial prediction sub-model can be a pre-trained model.
  • the initial prediction sub-model can be a pre-trained language representation model, such as the BERT model, or the RoBERTa model, etc.
  • the initial prediction sub-model can also be a pre-trained open source model, such as the finbert model, or the mengzi-fin model, etc.
  • Pre-training models often use a large amount of sample data to perform cloze tasks in the pre-training stage. Therefore, pre-training models have powerful word-filling capabilities.
  • Given a cloze-style prompt, the pre-trained model can "recall" the corresponding way of answering, which improves its expressive ability even when the number of samples of important long-tail intentions is small.
  • When the prompt learning method is used to train the pre-trained model, that is, when the target training text obtained by splicing the initial training text with a preset template sentence filled with a sample mask value is input into the initial prediction sub-model for iterative training, good training results can be achieved even with a small number of training texts, so that the mask prediction values predicted by the trained intent recognition model in production are more accurate.
  • By contrast, if the fine-tuning method were used to train the pre-trained model, the initial training text carrying an intention label would be input into the initial prediction sub-model for iterative training. That training method requires an extremely large number of samples; a small number of samples cannot meet the training needs, which may lead to inaccurate predictions from the trained intent recognition model.
  • In some embodiments, the label determination module is obtained by inputting mask prediction values into the initial label determination module for iterative training, where the mask prediction values can be generated by inputting the target training text into the prediction sub-model.
  • Based on the same technical concept, embodiments of the present application also provide an intention recognition method applied to digital humans, which includes: obtaining text to be recognized input by a user; identifying the intention of the text to be recognized according to the intention recognition method described above to obtain the user intention; and obtaining, according to the user intention, the target text corresponding to the user intention in the digital human system and displaying the target text.
  • The text to be recognized input by the user includes text entered by the user during interface operations, text obtained by recognizing audio of the user's speech, or text manually typed by the user.
  • In some embodiments, obtaining the target text corresponding to the user intention in the digital human's system according to the user intention includes: searching the digital human's system for content matching the user intention according to the user intention, and using the matched content as the target text. Displaying the target text includes the digital human broadcasting the target text by voice, or displaying the target text on the digital human's display interface.
  • Figure 8 is a schematic diagram of an intention recognition device provided by an embodiment of the present application.
  • This embodiment provides an intention recognition device 800, which includes: a first acquisition unit 801, configured to obtain text to be recognized; a classification unit 802, configured to perform intention classification processing on the text to be recognized to obtain its intention category; a first splicing unit 803, configured to splice the text to be recognized with a preset template sentence to obtain a target text if the intention category is determined to be the specified intention category, where the preset template sentence is used to represent intention prompt information; and a first recognition unit 804, configured to input the target text into an intention recognition model, which performs intent recognition processing on the text to be recognized based on the intention prompt information, to obtain the intention recognition result of the text to be recognized.
  • In some embodiments, the preset template sentence consists of a preset sentence pattern and a mask, and the intent recognition model includes a prediction sub-model and a label determination module connected in sequence. The prediction sub-model is used to predict the value of the mask based on the target text to obtain the corresponding mask prediction value; the label determination module is used to determine, based on the mask prediction value corresponding to the target text and the preconfigured mapping relationship between mask values and intent labels, the target intent label that has a mapping relationship with the mask prediction value, and to determine it as the intention recognition result of the text to be recognized.
  • In some embodiments, the prediction sub-model is used to: determine, according to the target text, the probability that the mask takes each preset mask value in the preconfigured mask value set, obtaining a prediction probability for each preset mask value; sort the preset mask values by prediction probability to obtain a sorting result; and, based on the sorting result, determine the preset mask value with the highest prediction probability as the mask prediction value corresponding to the target text.
  • In some embodiments, the classification unit 802 includes: a calculation subunit, configured to count the historical texts included in the pre-stored historical text collection to obtain a first quantity; a first determination subunit, configured to determine, within the historical text collection, the number of historical texts that have the same intention as the text to be recognized, to obtain a second quantity; a second determination subunit, configured to determine, based on the first quantity and the second quantity, the frequency of occurrence of the intention corresponding to the text to be recognized in the historical text collection, to obtain a target frequency value; and a third determination subunit, configured to determine the intent category of the text to be recognized based on the comparison between the target frequency value and the preset frequency threshold.
  • In some embodiments, the first determination subunit is configured to: calculate the similarity between the text to be recognized and each historical text in the historical text collection to obtain target similarities; if a target similarity is greater than or equal to the preset similarity threshold, determine that the text to be recognized and the corresponding historical text have the same intention; and count the historical texts that have the same intention as the text to be recognized to obtain the second quantity.
  • In some embodiments, the intent category of the text to be recognized is one of non-long-tail intent, important long-tail intent, and non-important long-tail intent, and the third determination subunit is configured to: if the target frequency value is greater than or equal to the preset frequency threshold, determine the intent category of the text to be recognized to be a non-long-tail intent; if the target frequency value is less than the preset frequency threshold, judge, according to the business rules corresponding to the text to be recognized, whether the importance parameter of the intention is greater than or equal to the preset parameter threshold, where the importance parameter characterizes the importance of the intention of the text to be recognized; if it is, determine the intent category to be an important long-tail intent; and if not, determine the intent category to be a non-important long-tail intent.
  • In some embodiments, the intention recognition device 800 further includes: a feedback unit, configured to feed back intention confirmation information to the target user, where the intention confirmation information carries multiple intention recognition results; a receiving unit, configured to receive the target user's intention selection instruction and determine the intention selected by the instruction as the target intention; and an execution unit, configured to perform the corresponding intention response operation according to the target intention.
  • In summary, the intention recognition device includes a first acquisition unit, a classification unit, a first splicing unit and a first recognition unit. The first acquisition unit acquires the text to be recognized; the classification unit performs intent classification processing on the text to be recognized to obtain its intent category; the first splicing unit splices the text to be recognized with the preset template sentence to obtain the target text if the intention category is determined to be the specified intention category, where the preset template sentence is used to represent the intent prompt information; and the first recognition unit inputs the target text into the intent recognition model to obtain the intent recognition result of the text to be recognized, where the intent recognition model performs intent recognition processing on the text to be recognized based on the intent prompt information.
  • By performing intent classification processing on the text to be recognized and determining its intent category, it is possible to decide whether the text belongs to the specified intent category, and only text of the specified intent category is spliced with the preset template sentence to form the input of the intention recognition model. The intention recognition model then identifies the target text based on the intention prompt information represented by the preset template sentence to obtain the intention recognition result, which improves the accuracy of intent recognition for the specified intent category.
  • Figure 9 is a schematic diagram of a training device for an intention recognition model provided by an embodiment of the present application.
  • This embodiment provides a training device 900 for an intention recognition model, including: a second acquisition unit 901, configured to obtain initial training text, where the intention category of the initial training text is a specified intention category; a second splicing unit 902, configured to splice the initial training text with a preset template sentence to obtain target training text, where the preset template sentence is used to represent intention prompt information; and a training unit 903, configured to input the target training text into the initial intention recognition model for iterative training to obtain the intention recognition model.
  • Corresponding to the intention recognition method for digital humans provided above, this embodiment also provides an intention recognition device for digital humans, including: a third acquisition unit, configured to acquire the text to be recognized input by the user; a second recognition unit, configured to identify the intention of the text to be recognized according to the intention recognition method described in the first aspect to obtain the user intention; and a display unit, configured to obtain, according to the user intention, the target text corresponding to the user intention from the digital human system and display the target text.
  • Corresponding to the intention recognition method described above and based on the same technical concept, embodiments of the present application also provide an electronic device used to perform the intention recognition method provided above; corresponding to the training method of the intention recognition model described above, an electronic device used to perform that training method; and corresponding to the intention recognition method for digital humans described above, an electronic device used to perform the intention recognition method applied to digital humans.
  • FIG. 10 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • electronic devices may vary greatly due to different configurations or performance, and may include one or more processors 1001 and a memory 1002.
  • the memory 1002 may store one or more application programs or data; the memory 1002 may be short-term storage or persistent storage.
  • Application programs stored in memory 1002 may include one or more modules (not shown), and each module may include a series of computer-executable instructions in the electronic device.
  • the processor 1001 may be configured to communicate with the memory 1002 and execute a series of computer-executable instructions in the memory 1002 on the electronic device.
  • the electronic device may also include one or more power supplies 1003, one or more wired or wireless network interfaces 1004, one or more input/output interfaces 1005, one or more keyboards 1006, etc.
  • the electronic device includes a memory and one or more programs, wherein the one or more programs are stored in the memory; the one or more programs may include one or more modules, each module may comprise a series of computer-executable instructions for the electronic device, and the programs are configured to be executed by one or more processors.
  • the one or more programs include computer-executable instructions for: obtaining text to be recognized; performing intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized; if the intent category is determined to be the specified intent category, splicing the text to be recognized with the preset template sentence to obtain the target text, where the preset template sentence is used to represent the intent prompt information; and inputting the target text into the intent recognition model to obtain the intent recognition result of the text to be recognized.
  • the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
  • the electronic device includes a memory and one or more programs, wherein the one or more programs are stored in the memory; the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the programs are configured to be executed by one or more processors.
  • the one or more programs include computer-executable instructions for: obtaining initial training text, where the intent category of the initial training text is the specified intent category; splicing the initial training text with the preset template sentence to obtain the target training text, where the preset template sentence is used to represent the intent prompt information; and inputting the target training text into the initial intent recognition model for iterative training to obtain the intent recognition model.
  • the electronic device includes a memory and one or more programs, wherein the one or more programs are stored in the memory; the one or more programs may include one or more modules, each module may comprise a series of computer-executable instructions for the electronic device, and the programs are configured to be executed by one or more processors.
  • the one or more programs include computer-executable instructions for: obtaining text to be recognized input by a user; recognizing the intention of the text to be recognized according to the intention recognition method described in each of the foregoing intention recognition method embodiments to obtain the user intention; and obtaining, according to the user intention, the target text corresponding to the user intention from the digital human system and displaying the target text.
  • Corresponding to the methods described above and based on the same technical concept, embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium provided by this embodiment is used to store computer-executable instructions.
  • when executed by a processor, the computer-executable instructions implement the following process: obtaining the text to be recognized; performing intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized; if the intent category is determined to be the specified intent category, splicing the text to be recognized with the preset template sentence to obtain the target text, where the preset template sentence is used to represent the intent prompt information; and inputting the target text into the intent recognition model to obtain the intent recognition result of the text to be recognized.
  • the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
  • the computer-readable storage medium provided by this embodiment is used to store computer-executable instructions.
  • when executed by a processor, the computer-executable instructions implement the following process: obtaining the initial training text, where the intent category of the initial training text is the specified intent category; splicing the initial training text with the preset template sentence to obtain the target training text, where the preset template sentence is used to represent the intent prompt information; and inputting the target training text into the initial intent recognition model for iterative training to obtain the intent recognition model.
  • this embodiment provides a computer-readable storage medium for storing computer-executable instructions.
  • when executed by a processor, the computer-executable instructions implement the following process: obtaining the text to be recognized input by the user; recognizing the intention of the text to be recognized according to the intention recognition method described in each of the foregoing intention recognition method embodiments to obtain the user intention; and obtaining, according to the user intention, the target text corresponding to the user intention from the digital human system and displaying the target text.
  • embodiments of the present application may be provided as methods, systems or computer program products. Therefore, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present description may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) embodying computer-usable program code therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable device, such that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-persistent storage in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include persistent and non-persistent, removable and non-removable media, and information storage may be implemented by any method or technology.
  • Information may be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic tape cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • Embodiments of the present application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • One or more embodiments of the present specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of this specification provide an intention recognition method, device, electronic device and storage medium. The intention recognition method includes: obtaining text to be recognized; performing intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized; if the intent category is determined to be a specified intent category, splicing the text to be recognized with a preset template sentence to obtain target text, where the preset template sentence is used to represent intent prompt information; and inputting the target text into an intent recognition model to obtain an intent recognition result of the text to be recognized, where the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.

Description

Intention recognition method, device, electronic device and storage medium
Cross-reference
This application claims priority to the Chinese patent application filed with the China Patent Office on August 25, 2022, with application number 202211029991.7 and entitled "Intention recognition method, device, electronic device and storage medium", the entire contents of which are incorporated into this application by reference.
Technical Field
This application relates to the field of artificial intelligence, and in particular to an intention recognition method, device, electronic device and storage medium.
Background
With the development of electronic technology, the use of robots has become more and more common. For example, using robot customer service to automatically answer customer questions can save a lot of human resources and improve communication efficiency. However, before answering a customer question, the robot customer service needs to perform intent recognition based on the text of the customer question to clarify the customer's purpose.
Summary
This application provides an intention recognition method, device, electronic device and storage medium to improve the accuracy of intent recognition.
In one aspect, this application provides an intention recognition method, including: obtaining text to be recognized; performing intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized; if the intent category is determined to be a specified intent category, splicing the text to be recognized with a preset template sentence to obtain target text, where the preset template sentence is used to represent intent prompt information; and inputting the target text into an intent recognition model to obtain an intent recognition result of the text to be recognized, where the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
In one aspect, this application provides a training method for an intent recognition model, including: obtaining initial training text, where the intent category of the initial training text is a specified intent category; splicing the initial training text with a preset template sentence to obtain target training text, where the preset template sentence is used to represent intent prompt information; and inputting the target training text into an initial intent recognition model for iterative training to obtain the intent recognition model.
In one aspect, this application provides an intention recognition method applied to digital humans, including: obtaining text to be recognized input by a user; recognizing the intent of the text to be recognized according to the intention recognition method described in the first aspect to obtain the user intent; and obtaining, according to the user intent, target text corresponding to the user intent in the system of the digital human, and displaying the target text.
In a fourth aspect, embodiments of this application provide an intention recognition device, including: a first acquisition unit, used to obtain text to be recognized; a classification unit, used to perform intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized; a first splicing unit, used to splice the text to be recognized with a preset template sentence to obtain target text if the intent category is determined to be a specified intent category, where the preset template sentence is used to represent intent prompt information; and a first recognition unit, used to input the target text into an intent recognition model to obtain an intent recognition result of the text to be recognized, where the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
In one aspect, this application provides a training device for an intent recognition model, including: a second acquisition unit, used to obtain initial training text, where the intent category of the initial training text is a specified intent category; a second splicing unit, used to splice the initial training text with a preset template sentence to obtain target training text, where the preset template sentence is used to represent intent prompt information; and a training unit, used to input the target training text into an initial intent recognition model for iterative training to obtain the intent recognition model.
In one aspect, this application provides an intention recognition device applied to digital humans, including: a third acquisition unit, used to obtain text to be recognized input by a user; a second recognition unit, used to recognize the intent of the text to be recognized according to the intention recognition method described in the first aspect to obtain the user intent; and a display unit, used to obtain, according to the user intent, target text corresponding to the user intent in the system of the digital human, and to display the target text.
In one aspect, this application provides an electronic device, including: a processor; and a memory configured to store computer-executable instructions which, when executed, cause the processor to perform the intention recognition method described in the first aspect, or the training method of the intent recognition model described in the second aspect, or the intention recognition method applied to digital humans described in the third aspect.
In one aspect, this application provides a computer-readable storage medium for storing computer-executable instructions which, when executed by a processor, implement the intention recognition method described in the first aspect, or the training method of the intent recognition model described in the second aspect, or the intention recognition method applied to digital humans described in the third aspect.
Brief Description of the Drawings
In order to explain the technical solutions in the embodiments of this application or the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some of the embodiments recorded in this specification; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Figure 1 is a processing flowchart of an intention recognition method provided by an embodiment of this application;
Figure 2 is a processing flowchart of another intention recognition method provided by an embodiment of this application;
Figure 3 is a processing flowchart of a method for recognizing non-long-tail intents provided by an embodiment of this application;
Figure 4 is a processing flowchart of a method for recognizing important long-tail intents provided by an embodiment of this application;
Figure 5 is a diagram of the mapping relationship between mask values and intent labels provided by an embodiment of this application;
Figure 6 is a processing flowchart of a response method provided by an embodiment of this application;
Figure 7 is a processing flowchart of a training method for an intent recognition model provided by an embodiment of this application;
Figure 8 is a schematic diagram of an intention recognition device provided by an embodiment of this application;
Figure 9 is a schematic diagram of a training device for an intent recognition model provided by an embodiment of this application;
Figure 10 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions in the embodiments of this application, the technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are only some of the embodiments of this specification, not all of them. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of this application.
In practical applications of robots with automatic response functions (including digital humans), the user intent categories involved in customer questions are very rich. Some intent categories appear with low frequency, while others appear with high frequency. If a corresponding intent recognition model is trained uniformly by labeling, then on the one hand, because the number of samples is sufficient, the model recognizes customer questions of high-frequency intent categories very well; on the other hand, because there are not enough samples, the model recognizes customer questions of low-frequency intent categories poorly. As a result, a robot that answers automatically based on such intent recognition results gives answers irrelevant to the question, leaves customers with a bad experience, and forces customers to transfer to a human agent to obtain an accurate answer, which indirectly increases the workload of human customer service.
In order to overcome the above problems, embodiments of this application provide an intention recognition method.
The intention recognition method proposed in this application can be executed by an electronic device, specifically by a processor in the electronic device. The electronic device mentioned here may be a terminal device, such as a smartphone, a tablet computer, a desktop computer, an intelligent voice interaction device, a wearable device, a robot, a vehicle-mounted terminal, and so on; alternatively, the electronic device may be a server, such as an independent physical server, a server cluster composed of multiple servers, or a cloud server capable of cloud computing.
The intention recognition method proposed in this application will be introduced in detail below through several embodiments.
Referring to Figure 1, which is a processing flowchart of an intention recognition method provided by an embodiment of this application. As shown in Figure 1, the intention recognition method provided by the embodiment of this application may include the following steps:
Step S102: obtain text to be recognized.
Obtaining the text to be recognized may be obtaining voice data to be recognized and converting the voice data into text form to obtain the text to be recognized; it may also be obtaining text to be recognized input by a user; or it may be text with an intent recognition need obtained in other ways. The text to be recognized input by the user may be referred to simply as input text.
Step S104: perform intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized.
Taking the robot's automatic response scenario as an example, the intents involved in customer questions may include a small number of main intents and a large number of long-tail intents. Main intents refer to intents that are few in number but cover extremely high traffic; long-tail intents refer to intents that are large in number but each occupies very little traffic. If a dedicated intent response operation is configured for every long-tail intent, the workload is huge and the cost-effectiveness extremely low; but if intent recognition is performed uniformly on all text, the recognition accuracy for long-tail intents is likely to be lower than for main intents, because the number of samples corresponding to long-tail intents in historical data is far smaller than for main intents. Therefore, different intent recognition models need to be adopted for different intent categories, to ensure that intent categories with different characteristics can all obtain good intent recognition results.
The intent category of the text to be recognized includes, but is not limited to: non-long-tail intents, important long-tail intents, and unimportant long-tail intents.
Among them, a non-long-tail intent may be a main intent, that is, an intent that is few in number but covers extremely high traffic; an important long-tail intent may be an intent that is large in number, occupies very little traffic per intent, and is of strong importance; an unimportant long-tail intent may be an intent that is large in number, occupies very little traffic per intent, and is of weak importance.
On the one hand, non-long-tail intents are few in number but cover extremely high traffic. For non-long-tail intents, a non-long-tail intent recognition model composed of a pretrained language model and a multilayer perceptron can be used for intent recognition. Pretrained language models include, but are not limited to, the BERT (Bidirectional Encoder Representations from Transformers) model, the RoBERTa (a Robustly Optimized BERT Pretraining Approach) model, and so on.
The BERT model is a language representation model based on the Transformer bidirectional encoder. The training process of the BERT model can be divided into a pretraining part and a model fine-tuning part, where the fine-tuning part uses the pretrained BERT model for fine-tuning training; it is widely used in tasks such as text classification and text matching.
Pretraining and fine-tuning can be illustrated with the following example: suppose there is a training set A; the network is first pretrained on A, and the network parameters learned on task A are saved for subsequent training. When a new task B arrives, the same network structure is adopted; on parameter initialization, the parameters learned on task A can be loaded, while the other higher-level parameters are randomly initialized. The network is then trained with the training data of task B. Because the loaded parameters keep changing as training on task B proceeds, this is called "fine-tuning", that is, adjusting the parameters to make them better suited to the current task B.
The RoBERTa model is similar to the BERT model, mainly making several adjustments on top of BERT: 1) longer training time, larger batch size, and more training data; 2) the next-sentence prediction loss is removed; 3) longer training sequences; 4) a dynamic masking mechanism. Because it performs better than the BERT model in many scenarios, it is widely used in NLP (Natural Language Processing) tasks.
By configuring the non-long-tail intent recognition model to include a pretrained language model, a multilayer perceptron, and a normalized exponential function (the Softmax function) connected in sequence, fine-tuning of the pretrained language model can be implemented. Under this training method, when the number of samples used to train the model is large, the training effect is good.
Non-long-tail intents are few in number but cover extremely high traffic, that is, they appear extremely frequently in historical intent data. Therefore, a large number of training samples corresponding to non-long-tail intents can be obtained from historical intent data, and the non-long-tail intent recognition model trained by fine-tuning achieves high accuracy for non-long-tail intents.
On the other hand, unimportant long-tail intents are large in number but each occupies extremely low traffic and is of weak importance. For unimportant long-tail intents, the accuracy requirement of intent recognition is not high and no complex subsequent steps are involved, so an unimportant long-tail intent recognition model can be configured that directly uses keywords for intent recognition; when setting the keywords, the matching conditions can be made stricter.
On yet another hand, important long-tail intents are large in number but each occupies extremely low traffic and is of high importance, so the accuracy requirement of intent recognition is high. If the same model structure as the non-long-tail intent recognition model were used, the small number of samples might lead to poor training results and thus low recognition accuracy for important long-tail intents. Therefore, there is a need to improve the intent recognition accuracy for important long-tail intents.
In a specific embodiment, performing intent classification processing on the text to be recognized to obtain its intent category includes: counting the number of historical texts included in a pre-stored historical text set to obtain a first quantity; determining, in the historical text set, the number of historical texts belonging to the same intent as the text to be recognized, to obtain a second quantity; determining, based on the first quantity and the second quantity, the frequency of occurrence of the intent corresponding to the text to be recognized in the historical text set, to obtain a target frequency value; and determining the intent category of the text to be recognized based on a comparison of the target frequency value with a preset frequency threshold.
The pre-stored historical text set may include multiple historical texts, each corresponding to one intent. Two historical texts may have the same intent or different intents. Two historical texts belonging to the same intent may be texts with identical content or texts with different content. Whether any two historical texts belong to the same intent can be reflected by the similarity between the two texts.
Count the number of historical texts included in the pre-stored historical text set to obtain the first quantity; in the historical text set, determine the number of historical texts belonging to the same intent as the text to be recognized to obtain the second quantity. From the first and second quantities, the frequency of occurrence of the intent corresponding to the text to be recognized in the historical text set can be computed, yielding the target frequency value.
The target frequency value may be the ratio between the first quantity and the second quantity, or may be calculated based on a preset coefficient, the first quantity, and the second quantity.
In one embodiment, determining, in the historical text set, the number of historical texts belonging to the same intent as the text to be recognized to obtain the second quantity includes: calculating the similarity between the text to be recognized and each historical text in the historical text set to obtain a target similarity; if the target similarity is greater than or equal to a preset similarity threshold, determining that the text to be recognized and the historical text corresponding to the target similarity belong to the same intent; and counting the number of historical texts belonging to the same intent as the text to be recognized to obtain the second quantity.
If the similarity between two texts is high, the two texts can be regarded as belonging to the same intent.
For example, the text to be recognized is "I heard product A is running a promotion recently? What does that coupon mean?" and historical text 1 is "Is product A running a promotion recently? Are there large coupons?" The target similarity between the text to be recognized and historical text 1 is a%, and a% is greater than the preset similarity threshold A%, so it is determined that the text to be recognized and historical text 1 belong to the same intent.
Counting the historical texts that belong to the same intent as the text to be recognized yields the second quantity, which reflects how often the intent of the text to be recognized appears in the historical text set.
In a specific embodiment, the intent category of the text to be recognized includes one of a non-long-tail intent, an important long-tail intent, and an unimportant long-tail intent. Determining the intent category of the text to be recognized based on the comparison of the target frequency value with the preset frequency threshold includes: if the target frequency value is greater than or equal to the preset frequency threshold, determining that the intent category of the text to be recognized is a non-long-tail intent; if the target frequency value is less than the preset frequency threshold, judging, according to the business rules corresponding to the text to be recognized, whether the importance parameter of the intent of the text to be recognized is greater than or equal to a preset parameter threshold, where the importance parameter is used to represent the degree of importance of the intent of the text to be recognized; if the importance parameter of the intent of the text to be recognized is greater than or equal to the preset parameter threshold, determining that the intent category is an important long-tail intent; and if the importance parameter is less than the preset parameter threshold, determining that the intent category is an unimportant long-tail intent.
If the target frequency value is greater than or equal to the preset frequency threshold, the intent corresponding to the text to be recognized appears frequently in the historical text set, so its intent category can be determined to be a non-long-tail intent; if the target frequency value is less than the preset frequency threshold, the intent appears infrequently in the historical text set, so its intent category can be determined to be a long-tail intent.
After determining that the intent category of the text to be recognized is a long-tail intent, whether the importance parameter of the intent is greater than or equal to the preset parameter threshold can be judged according to the business rules corresponding to the text to be recognized. Different application scenarios apply different business rules. A business rule may be preconfigured with conditions for judging whether an intent is important, or with a method for generating the importance parameter and a preset parameter threshold. Then, if the importance parameter of the intent of the text to be recognized is greater than or equal to the preset parameter threshold, the importance of the intent is high and the intent category is an important long-tail intent; if the importance parameter is less than the preset parameter threshold, the importance is low and the intent category is an unimportant long-tail intent.
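As an illustrative sketch of this classification step (not part of the claimed method), the following Python fragment assumes a placeholder similarity function and hypothetical thresholds, and takes the target frequency value as the ratio of the second quantity to the first, one of the options described above:

```python
from difflib import SequenceMatcher

# Hypothetical thresholds; a real system would tune these on business data.
SIMILARITY_THRESHOLD = 0.6   # preset similarity threshold
FREQUENCY_THRESHOLD = 0.05   # preset frequency threshold
IMPORTANCE_THRESHOLD = 0.5   # preset parameter threshold

def similarity(a: str, b: str) -> float:
    # Placeholder similarity; an embedding-based cosine similarity could be used instead.
    return SequenceMatcher(None, a, b).ratio()

def classify_intent_category(text: str, history: list[str], importance: float) -> str:
    first_quantity = len(history)                 # texts in the historical text set
    second_quantity = sum(                        # historical texts with the same intent
        1 for h in history if similarity(text, h) >= SIMILARITY_THRESHOLD
    )
    target_frequency = second_quantity / first_quantity if first_quantity else 0.0
    if target_frequency >= FREQUENCY_THRESHOLD:
        return "non-long-tail intent"
    # Long-tail: business rules supply the importance parameter of the intent.
    if importance >= IMPORTANCE_THRESHOLD:
        return "important long-tail intent"
    return "unimportant long-tail intent"
```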
Step S106: if the intent category is determined to be the specified intent category, splice the text to be recognized with the preset template sentence to obtain the target text; the preset template sentence is used to represent intent prompt information.
The specified intent category may be a long-tail intent, an important long-tail intent, or another intent category with a small number of samples available for model training.
In artificial intelligence scenarios, each robot executing the intention recognition method provided by the embodiments of this application may be preconfigured with a preset template sentence corresponding to its business. The preset template sentence can reflect the intent prompt information of that business. The role of the preset template sentence is to reconstruct the text to be recognized, generating a target text that includes both the text to be recognized and the intent prompt information. The intent prompt information can be reflected by a mask and the context of the mask.
For example, the text to be recognized is: x = "I heard product A is running a promotion recently? What does that coupon mean?"
The template of the target text is: x + preset template sentence.
The preset template sentence is: "(Well, I want to [mask].)"
Splicing the text to be recognized with the preset template sentence then yields the target text: "I heard product A is running a promotion recently? What does that coupon mean? Well, I want to [mask]."
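The splicing itself is plain text concatenation; the sketch below assumes a template carrying a single [mask] slot and reuses the translated example sentences:

```python
MASK_TOKEN = "[mask]"

def build_target_text(text_to_recognize: str, template: str) -> str:
    """Splice the text to be recognized with the preset template sentence."""
    assert MASK_TOKEN in template, "the template must carry the mask slot"
    return text_to_recognize + template

text = "I heard product A is running a promotion recently? What does that coupon mean?"
template = " Well, I want to [mask]."
target_text = build_target_text(text, template)
# "I heard product A is running a promotion recently? What does that coupon mean? Well, I want to [mask]."
```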
A brief introduction to prompt learning: prompt learning is a recently proposed new paradigm of NLP training. Unlike the currently common pretraining-plus-fine-tuning approach, prompt learning does not adapt the pretrained language model (LM) to downstream tasks through objective engineering; instead, it reformulates the downstream task so that it looks more like a task solved during the original LM training with the help of a textual prompt.
For example, input: "I like this movie";
output: "positive" or "negative".
If prompt learning is used instead, the task can be turned into a cloze (fill-in-the-blank) task.
For example, input: "I like this movie. Overall, it is a __ movie";
output: "interesting" or "boring".
The intent prompt information is the textual prompt used in prompt learning.
Step S108: input the target text into the intent recognition model to obtain the intent recognition result of the text to be recognized; the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
In a specific embodiment, the preset template sentence consists of a preset sentence pattern and a mask; the intent recognition model includes a prediction sub-model and a label determination module connected in sequence. The prediction sub-model is used to perform value prediction on the mask according to the target text, obtaining the corresponding mask prediction value; the label determination module is used to determine, according to the mask prediction value corresponding to the target text and a preconfigured mapping relationship between mask values and intent labels, the target intent label that has a mapping relationship with the mask prediction value, and to determine the target intent label as the intent recognition result of the text to be recognized.
The preset sentence pattern may be a fixed sentence pattern preset according to the business scenario. For example, if the robot mainly handles customer inquiries about promotions, the fixed pattern may be "I want to ask about ___ discount"; or, if the electronic device mainly handles customer complaints, the fixed pattern may be "My view on ___ is ___"; and so on. A mask can be used to represent an unknown to be predicted, corresponding to the blank region in the fixed pattern. For example, the preset template sentence may be "I want to ask about [mask] discount", or "My view on [mask1] is [mask2]". A preset template sentence may contain one mask or multiple masks.
Unlike ordinary deep learning, the mask prediction value is no longer a numeric class such as 0, 1 or 2; rather, the prediction sub-model selects from a mask-value space the answer it considers most likely. The mask-value space needs to be configured in advance with multiple preset mask values. After the prediction sub-model outputs the mask prediction value, the label determination module is needed to map the mask prediction value into the label space based on a preconfigured mapping relationship between mask values and intent labels; the label space includes multiple intent labels.
For example, through the prediction sub-model, the [mask] in the target text is predicted, yielding the mask prediction value "ask about coupons". In one embodiment, the mask prediction value "ask about coupons" may be substituted into the [mask], giving the substitution result: "I heard product A is running a promotion recently? What does that coupon mean? Well, I want to ask about coupons.", and the label determination module then determines the mapped intent label based on the substitution result. In another embodiment, the mask prediction value "ask about coupons" may be output directly without substitution, and the label determination module determines the mapped intent label directly from the mask prediction value. For example, the intent label mapped to the mask prediction value "ask about coupons" is "coupon inquiry".
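The two stages can be sketched with the Hugging Face transformers library as follows; the checkpoint name, the candidate mask values and the label mapping are illustrative assumptions (the fill-mask pipeline scores single-token candidates, so single-word mask values are used here):

```python
from transformers import pipeline

# Assumed checkpoint; any masked language model with a compatible tokenizer works.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Preconfigured mapping between mask values and intent labels (illustrative).
mask_value_to_label = {
    "ask": "coupon inquiry",
    "complain": "customer complaint",
}

target_text = ("I heard product A is running a promotion recently? "
               "What does that coupon mean? Well, I want to "
               + fill_mask.tokenizer.mask_token + ".")

# Prediction sub-model: restrict scoring to the preset mask-value space.
predictions = fill_mask(target_text, targets=list(mask_value_to_label))
mask_prediction = predictions[0]["token_str"].strip()

# Label determination module: map the mask prediction value to an intent label.
intent_label = mask_value_to_label.get(mask_prediction)
print(mask_prediction, "->", intent_label)
```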
In one embodiment, the prediction sub-model is used to: determine, according to the target text, the probability that the value of the mask is each preset mask value in a preconfigured mask-value set, obtaining the prediction probability corresponding to each preset mask value; sort the preset mask values by the magnitude of the prediction probability to obtain a sorting result; and, based on the sorting result, determine the preset mask value with the highest prediction probability as the mask prediction value corresponding to the target text.
In another embodiment, the text to be recognized may be text input by a target user. The prediction sub-model may determine, according to the target text, the probability that the mask takes each preset mask value in the preconfigured mask-value set, obtaining each preset mask value's prediction probability; sort the preset mask values by prediction probability to obtain a sorting result; and, based on the sorting result, determine the preset number of preset mask values with the highest prediction probabilities as mask prediction values. The intent recognition model can then output the preset number of intent recognition results, so that intent confirmation information carrying the preset number of intent recognition results is fed back to the target user, and a targeted reply is made after receiving the target user's intent selection instruction.
The preset number may be a natural number greater than 1. For example, in the application scenario of a voice customer-service robot, the target user may be a customer who asks the voice customer-service robot a question, and the question may be the text input by the target user.
By determining the preset number of preset mask values with the highest prediction probabilities as mask prediction values and thereby predicting multiple intent recognition results, the most probable intents can be fed back to the target user for selection, so that the target user's own selection operation improves intent recognition accuracy.
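The top-k variant only changes how the scored candidates are consumed; building on the previous sketch, with k and the candidate list again illustrative:

```python
def top_k_mask_predictions(fill_mask, target_text: str,
                           candidates: list[str], k: int = 3) -> list[str]:
    """Score every preset mask value and keep the k most probable ones."""
    scored = fill_mask(target_text, targets=candidates)
    ranked = sorted(scored, key=lambda p: p["score"], reverse=True)
    return [p["token_str"].strip() for p in ranked[:k]]

# The k mask prediction values map to k intent labels, which are fed back to
# the target user as intent confirmation information for selection.
```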
In one embodiment, the prediction sub-model is obtained by inputting training text data into an initial prediction sub-model for iterative training; the training text data is obtained by splicing training text with a preset template sentence filled with a sample mask value.
The embodiments of this application adopt the prompt learning training method. In this training method, the training sample input into the initial prediction sub-model is not the text to be recognized together with its intent label, but the target text obtained by splicing the text to be recognized with a preset template sentence filled with a mask value.
The initial prediction sub-model may be a pretrained model, for example a pretrained language representation model such as the BERT model or the RoBERTa model. Illustratively, in the field of financial question answering, the initial prediction sub-model may also be a pretrained open-source model, such as the finbert model or the mengzi-fin model.
In the pretraining stage, pretrained models often use a large amount of sample data to perform cloze tasks, so they have a strong ability to fill in words. Setting a preset template sentence through prompt learning, so that the pretrained model "recalls" how to give the corresponding answer, can improve the expressive ability of the pretrained model when the number of samples of important long-tail intents is small. On this basis, if the pretrained model is trained with prompt learning, that is, training text spliced with a preset template sentence filled with a sample mask value is input into the initial prediction sub-model for iterative training, a good training effect can be achieved even with a small amount of training text, so that the trained intent recognition model predicts mask values with high accuracy after being put into use.
Conversely, if the pretrained model is trained by fine-tuning, that is, text to be recognized carrying intent labels is input into the initial prediction sub-model for iterative training, an extremely large number of samples is required; a small number of samples cannot meet the training need and is likely to make the trained intent recognition model's predictions inaccurate.
In one embodiment, if the text to be recognized is text input by a target user and there are multiple intent recognition results for the text to be recognized, then after inputting the target text into the intent recognition model for intent recognition processing and obtaining the intent recognition results, the intention recognition method further includes: feeding back intent confirmation information to the target user, where the intent confirmation information carries the multiple intent recognition results; receiving the target user's intent selection instruction, and determining the intent selected by the intent selection instruction as the target intent; and executing the corresponding intent response operation according to the target intent.
The intent confirmation information carries multiple intent recognition results, and multiple intents to be selected can be displayed to the target user through a display interface for the user to choose the real intent. Each intent to be selected corresponds to one intent recognition result. Illustratively, the intent confirmation information may be: "Do you want to ask about: 1. coupon A; 2. promotion B; 3. the discount on product C?"
Receive the target user's intent selection instruction on the display interface, and determine the intent selected by the intent selection instruction as the target intent.
Target intents include, but are not limited to: inquiry, help-seeking, complaint, shopping, and so on. Illustratively, if the target intent is an inquiry about coupons, the corresponding intent response operation may be replying with coupon introduction information; if the target intent is seeking directions when lost, the corresponding operation may be obtaining the target user's current location and destination and pushing a corresponding navigation route to the target user; if the target intent is a customer complaint, the corresponding operation may be replying to the target user with preset placating language and recording the complaint information; if the target intent is shopping, the corresponding operation may be pushing price-comparison information and shopping links to the target user; and so on.
In another embodiment, multiple intent recognition modules connected in series can also be constructed in advance, namely a "non-long-tail intent recognition module", an "important long-tail intent recognition module", and an "unimportant long-tail intent recognition module". If the non-long-tail intent recognition module determines that the intent category of the received text to be recognized is a non-long-tail intent, intent recognition can be performed on the text based on this module and an intent label output; if it determines that the intent category is not a non-long-tail intent, recognition fails and the text is passed on to the important long-tail intent recognition module. If the important long-tail intent recognition module determines that the intent category is an important long-tail intent, intent recognition is performed based on this module and an intent label output; otherwise, recognition fails and the text is passed on to the unimportant long-tail intent recognition module. If the unimportant long-tail intent recognition module determines that the intent category is an unimportant long-tail intent, intent recognition is performed based on this module and an intent label output; otherwise, recognition fails and a fallback answer can be output, for example, "Your question cannot be recognized. Would you like to be transferred to a human agent?"
The structure of the non-long-tail intent recognition module can refer to the non-long-tail intent recognition model described above; the important long-tail intent recognition module can refer to the intent recognition model provided by the embodiments of this application; the unimportant long-tail intent recognition module can refer to the unimportant long-tail intent recognition model described above.
By adopting different intent recognition methods for text to be recognized of different intent categories, the recognition needs of each intent category can be met and recognition accuracy improved.
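The serial arrangement can be read as a chain of recognizers tried in order; the sketch below assumes each module exposes a try-recognize callable returning an intent label or None, an illustrative interface rather than one defined by this application:

```python
from typing import Callable, Optional

Recognizer = Callable[[str], Optional[str]]  # returns an intent label, or None on failure

FALLBACK_ANSWER = ("Your question cannot be recognized. "
                   "Would you like to be transferred to a human agent?")

def cascade_recognize(text: str, modules: list[Recognizer]) -> str:
    """Try the non-long-tail, important long-tail and unimportant long-tail
    intent recognition modules in series; the first success wins."""
    for recognize in modules:
        label = recognize(text)
        if label is not None:
            return label
    return FALLBACK_ANSWER
```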
In the embodiment shown in Figure 1, first, text to be recognized is obtained; second, intent classification processing is performed on the text to be recognized to obtain its intent category; next, if the intent category is determined to be the specified intent category, the text to be recognized is spliced with the preset template sentence to obtain the target text, where the preset template sentence is used to represent the intent prompt information; finally, the target text is input into the intent recognition model to obtain the intent recognition result of the text to be recognized, where the intent recognition model performs intent recognition processing on the text to be recognized based on the intent prompt information. In this way, by performing intent classification on the text to be recognized and determining its intent category, it can be determined whether the text belongs to the specified intent category, so that only text of the specified intent category is spliced with the preset template sentence to form the input data of the intent recognition model. The intent recognition model then recognizes the target text based on the intent prompt information represented by the preset template sentence and obtains the intent recognition result, which improves the accuracy of intent recognition for the specified intent category.
Based on the same technical concept as the method embodiment of Figure 1, an embodiment of this application also provides another intention recognition method. Figure 2 is a processing flowchart of another intention recognition method provided by an embodiment of this application.
Step S202: input text.
Step S202 corresponds to step S102 in the embodiment of Figure 1. The text input in step S202 may be the text to be recognized.
Step S204: judge whether the non-long-tail intent recognition module recognizes successfully.
The non-long-tail intent recognition module may include a non-long-tail intent recognition model trained by labeling, which recognizes text of the non-long-tail intent category with extremely high accuracy. If the text is input into this model for intent recognition processing and the obtained intent recognition result indicates that the text's intent category is a non-long-tail intent, the module is determined to have recognized successfully; if the result indicates that the intent category is something other than a non-long-tail intent, the module is determined to have recognized unsuccessfully.
If the non-long-tail intent recognition module recognizes successfully, perform step S210; otherwise, perform step S206.
Step S206: judge whether the important long-tail intent recognition module recognizes successfully.
Judging whether the important long-tail intent recognition module recognizes successfully can refer to step S204. If yes, perform step S210; if no, perform step S208.
Step S208: judge whether the unimportant long-tail intent recognition module recognizes successfully.
Judging whether the unimportant long-tail intent recognition module recognizes successfully can refer to step S204.
If yes, perform step S210; if no, perform step S212.
Step S210: output the intent label.
Steps S204, S206 and S210 can replace steps S104, S106 and S108 in the embodiment of Figure 1. After the text to be recognized is obtained, it can be input into the non-long-tail intent recognition module for intent recognition processing to obtain a non-long-tail intent recognition result. If this result indicates that the non-long-tail intent recognition module recognized unsuccessfully, the text to be recognized is spliced with the preset template sentence to obtain the target text, where the preset template sentence is used to represent the intent prompt information; the target text is input into the important long-tail intent recognition module for intent recognition processing to obtain an important long-tail intent recognition result; and if this result indicates that the important long-tail intent recognition module recognized successfully, the important long-tail intent recognition result is output as the intent label of the text to be recognized.
The important long-tail intent recognition module in the embodiment of Figure 2 may include the structural components of the intent recognition model provided by the embodiment of Figure 1 and implement the same functions.
Step S212: output the fallback answer.
The fallback answer may be transferring to a human agent, or another preset response, for example generating a prompt suggesting that the user call the human customer-service line.
Based on the same technical concept as the foregoing method embodiments, an embodiment of this application also provides a method for recognizing non-long-tail intents. Figure 3 is a processing flowchart of a method for recognizing non-long-tail intents provided by an embodiment of this application.
Non-long-tail intents, i.e. main intents, account for a small proportion of all intents while occupying a large share of traffic. As shown in Figure 3, the input text can be input into the non-long-tail intent recognition model for intent prediction processing to obtain an intent prediction result. The non-long-tail intent recognition model may include a pretrained model, a multilayer perceptron, and a normalized exponential function (the Softmax function) connected in sequence. The number of classes of the Softmax function may be the number of all frequent questions plus "other questions".
Illustratively, intent prediction results include, but are not limited to: 1. checking the loan balance; 2. how to repay early; 3. WeChat deduction problems; 4. what loan products are available; and so on.
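A minimal PyTorch sketch of such a classifier head on top of a pretrained encoder follows; the checkpoint name and the class count are placeholders (the number of frequent questions plus one "other questions" class):

```python
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

NUM_FREQUENT_QUESTIONS = 4                 # illustrative
NUM_CLASSES = NUM_FREQUENT_QUESTIONS + 1   # plus "other questions"

class NonLongTailIntentModel(nn.Module):
    """Pretrained language model -> multilayer perceptron -> Softmax."""
    def __init__(self, checkpoint: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)
        hidden = self.encoder.config.hidden_size
        self.mlp = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, NUM_CLASSES))

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]        # [CLS] representation
        return self.mlp(cls).softmax(dim=-1)     # class probabilities

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = NonLongTailIntentModel()
batch = tokenizer(["How do I repay early?"], return_tensors="pt")
probs = model(batch["input_ids"], batch["attention_mask"])
```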
Based on the same technical concept as the foregoing method embodiments, an embodiment of this application also provides a method for recognizing important long-tail intents. Figure 4 is a processing flowchart of a method for recognizing important long-tail intents provided by an embodiment of this application.
Step S402: input text.
Step S402 corresponds to step S102 in the embodiment of Figure 1.
The input text may be the text to be recognized.
Step S404: establish the preset template sentence.
Establishing the preset template sentence may be preconfiguring a preset template sentence corresponding to the business.
The syntactic characteristics of the important long-tail intents handled by the robot in its current application scenario can be consolidated to configure the preset template sentence.
Step S406: construct the mapping relationship.
Figure 5 is a diagram of the mapping relationship between mask values and intent labels provided by an embodiment of this application.
As shown in Figure 5, the intent label space includes multiple intent labels, for example "coupon inquiry", "customer complaint", and so on. The mask-value space includes multiple preset mask values, for example "ask about coupons", "ask about the coupon", "report", "complain", "expose you", and so on.
Among them, the intent label "coupon inquiry" can be mapped to the preset mask values "ask about coupons" and "ask about the coupon" respectively; the intent label "customer complaint" can be mapped to the preset mask values "report", "complain" and "expose you" respectively.
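The many-to-one relationship of Figure 5 is naturally represented as a dictionary from preset mask values to intent labels; the entries below mirror the translated examples and are illustrative only:

```python
from typing import Optional

# Several preset mask values may map to the same intent label (Figure 5).
MASK_VALUE_TO_INTENT = {
    "ask about coupons": "coupon inquiry",
    "ask about the coupon": "coupon inquiry",
    "report": "customer complaint",
    "complain": "customer complaint",
    "expose you": "customer complaint",
}

def determine_label(mask_prediction: str) -> Optional[str]:
    """Label determination: map a predicted mask value to its intent label."""
    return MASK_VALUE_TO_INTENT.get(mask_prediction)
```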
Step S408: generate the text carrying the preset template sentence.
Step S408 corresponds to step S106 in the embodiment of Figure 1.
Step S410: perform mask-value prediction.
Step S412: output the model prediction result based on the mapping relationship.
Steps S410 and S412 correspond to step S108 in the embodiment of Figure 1.
Based on the same technical concept as the foregoing method embodiments, an embodiment of this application also provides a response method, which can be applied in the field of artificial intelligence. Figure 6 is a processing flowchart of a response method provided by an embodiment of this application.
Step S602: convert the customer's voice question into text.
Step S604: input the text into the intent recognition model for intent recognition.
Step S606: map the answer corresponding to the intent.
Step S608: convert the answer into voice output.
Step S610: the robot plays the corresponding voice script to answer the customer.
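The end-to-end response flow reduces to a thin pipeline; speech_to_text, recognize_intent and text_to_speech below stand in for whatever ASR, recognition and TTS components a deployment actually uses, and the answer table is illustrative:

```python
# Illustrative answer table mapping intent labels to reply scripts.
INTENT_TO_ANSWER = {
    "coupon inquiry": "The coupon for product A gives a discount on eligible orders.",
    "customer complaint": "We are sorry for the inconvenience; your complaint has been recorded.",
}

def answer_customer(audio: bytes,
                    speech_to_text, recognize_intent, text_to_speech) -> bytes:
    text = speech_to_text(audio)                  # S602: voice question -> text
    intent = recognize_intent(text)               # S604: intent recognition
    answer = INTENT_TO_ANSWER.get(                # S606: map the intent to an answer
        intent, "Would you like to be transferred to a human agent?")
    return text_to_speech(answer)                 # S608/S610: answer -> voice reply
```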
Based on the same technical concept as the foregoing method embodiments, an embodiment of this application also provides a training method for an intent recognition model.
Figure 7 is a processing flowchart of a training method for an intent recognition model provided by an embodiment of this application.
As shown in Figure 7, step S702: obtain initial training text; the intent category of the initial training text is the specified intent category.
Step S704: splice the initial training text with the preset template sentence to obtain the target training text; the preset template sentence is used to represent the intent prompt information.
Step S706: input the target training text into the initial intent recognition model for iterative training to obtain the intent recognition model.
The preset template sentence may be a preset sentence pattern filled with a sample mask value; the intent recognition model includes a prediction sub-model and a label determination module connected in sequence; the prediction sub-model can be obtained by inputting the target training text into an initial prediction sub-model for iterative training; the label determination module can be obtained by inputting mask prediction values into an initial label determination module for iterative training; the mask prediction values are generated by the prediction sub-model.
The embodiments of this application adopt the prompt learning training method. In this training method, the training sample input into the initial prediction sub-model is not the initial training text together with its intent label, but the target training text obtained by splicing the initial training text with a preset template sentence filled with a mask value.
The initial prediction sub-model may be a pretrained model, for example a pretrained language representation model such as the BERT model or the RoBERTa model. Illustratively, in the field of financial question answering, the initial prediction sub-model may also be a pretrained open-source model, such as the finbert model or the mengzi-fin model.
In the pretraining stage, pretrained models often use a large amount of sample data to perform cloze tasks, so they have a strong ability to fill in words. Setting a preset template sentence through prompt learning, so that the pretrained model "recalls" how to give the corresponding answer, can improve the expressive ability of the pretrained model when the number of samples of important long-tail intents is small. On this basis, if the pretrained model is trained with prompt learning, that is, the template training text obtained by splicing the initial training text with a preset template sentence filled with a sample mask value is input into the initial prediction sub-model for iterative training, a good training effect can be achieved even with a small amount of training text, so that the trained intent recognition model predicts mask values with high accuracy after being put into use.
Conversely, if the pretrained model is trained by fine-tuning, that is, initial training text carrying intent labels is input into the initial prediction sub-model for iterative training, an extremely large number of samples is required; a small number of samples cannot meet the training need and is likely to make the trained intent recognition model's predictions inaccurate.
The label determination module is obtained by inputting mask prediction values into the initial label determination module for iterative training. The mask prediction values can be generated by inputting the target training text into the prediction sub-model.
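The prompt-learning step can be sketched as masked-language-model training on the spliced texts, supervising only the mask position with the sample mask value; the checkpoint, optimizer settings and toy sample are illustrative assumptions:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")   # assumed checkpoint
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Target training text: initial training text spliced with the filled template.
text = "What does that coupon mean? Well, I want to [MASK]."
gold_mask_value = "ask"

enc = tokenizer(text, return_tensors="pt")
labels = torch.full_like(enc["input_ids"], -100)   # -100 positions are ignored by the loss
mask_pos = (enc["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)
labels[mask_pos] = tokenizer.convert_tokens_to_ids(gold_mask_value)

model.train()
for _ in range(3):                                  # toy iterative training loop
    out = model(**enc, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```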
Based on the same technical concept as each of the foregoing intention recognition method embodiments, an embodiment of this application also provides an intention recognition method applied to digital humans, including:
obtaining text to be recognized input by a user;
recognizing the intent of the text to be recognized according to the intention recognition method described in any of the above embodiments, to obtain the user intent;
obtaining, according to the user intent, target text corresponding to the user intent in the system of the digital human, and displaying the target text.
In this embodiment, the text to be recognized input by the user includes text to be recognized input by the user while operating the interface, text to be recognized obtained by recognizing audio played by the user's voice, or text to be recognized manually input by the user.
In this embodiment, obtaining, according to the user intent, the target text corresponding to the user intent in the system of the digital human includes: searching the system of the digital human for content matching the user intent according to the user intent, and taking the matched content as the target text; displaying the target text includes the digital human broadcasting the target text, or the digital human showing the target text on the digital human's display interface.
In the above embodiments, an intention recognition method is provided; correspondingly, an intention recognition device is also provided, which is described below with reference to the drawings.
Figure 8 is a schematic diagram of an intention recognition device provided by an embodiment of this application.
This embodiment provides an intention recognition device 800, including: a first acquisition unit 801, used to obtain text to be recognized; a classification unit 802, used to perform intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized; a first splicing unit 803, used to splice the text to be recognized with a preset template sentence to obtain target text if the intent category is determined to be a specified intent category, where the preset template sentence is used to represent intent prompt information; and a first recognition unit 804, used to input the target text into an intent recognition model, so as to perform intent recognition processing on the text to be recognized based on the intent prompt information and obtain the intent recognition result of the text to be recognized.
Optionally, the preset template sentence consists of a preset sentence pattern and a mask; the intent recognition model includes a prediction sub-model and a label determination module connected in sequence; the prediction sub-model is used to perform value prediction on the mask according to the target text to obtain the corresponding mask prediction value; the label determination module is used to determine, according to the mask prediction value corresponding to the target text and a preconfigured mapping relationship between mask values and intent labels, the target intent label that has a mapping relationship with the mask prediction value, and to determine the target intent label as the intent recognition result of the text to be recognized.
Optionally, the prediction sub-model is used to: determine, according to the target text, the probability that the value of the mask is each preset mask value in a preconfigured mask-value set, obtaining the prediction probability corresponding to each preset mask value; sort the preset mask values by the magnitude of the prediction probability to obtain a sorting result; and, based on the sorting result, determine the preset mask value with the highest prediction probability as the mask prediction value corresponding to the target text.
Optionally, the classification unit 802 includes: a calculation subunit, used to count the number of historical texts included in a pre-stored historical text set to obtain a first quantity; a first determination subunit, used to determine, in the historical text set, the number of historical texts belonging to the same intent as the text to be recognized, to obtain a second quantity; a second determination subunit, used to determine, based on the first quantity and the second quantity, the frequency of occurrence of the intent corresponding to the text to be recognized in the historical text set, to obtain a target frequency value; and a third determination subunit, used to determine the intent category of the text to be recognized based on a comparison of the target frequency value with a preset frequency threshold.
Optionally, the first determination subunit is used to: calculate the similarity between the text to be recognized and each historical text in the historical text set to obtain a target similarity; if the target similarity is greater than or equal to a preset similarity threshold, determine that the text to be recognized and the historical text corresponding to the target similarity belong to the same intent; and count the number of historical texts belonging to the same intent as the text to be recognized to obtain the second quantity.
Optionally, the intent category of the text to be recognized includes one of a non-long-tail intent, an important long-tail intent, and an unimportant long-tail intent; the third determination subunit is used to: determine that the intent category of the text to be recognized is a non-long-tail intent if the target frequency value is greater than or equal to the preset frequency threshold; if the target frequency value is less than the preset frequency threshold, judge, according to the business rules corresponding to the text to be recognized, whether the importance parameter of the intent of the text to be recognized is greater than or equal to a preset parameter threshold, where the importance parameter is used to represent the degree of importance of the intent of the text to be recognized; if yes, determine that the intent category of the text to be recognized is an important long-tail intent; if no, determine that the intent category of the text to be recognized is an unimportant long-tail intent.
Optionally, if the text to be recognized is text input by a target user and there are multiple intent recognition results for the text to be recognized, the intention recognition device 800 further includes: a feedback unit, used to feed back intent confirmation information to the target user, where the intent confirmation information carries the multiple intent recognition results; a receiving unit, used to receive the target user's intent selection instruction and determine the intent selected by the intent selection instruction as the target intent; and an execution unit, used to execute the corresponding intent response operation according to the target intent.
The intention recognition device provided by the embodiments of this application includes a first acquisition unit, a classification unit, a first splicing unit and a first recognition unit. The first acquisition unit is used to obtain text to be recognized; the classification unit is used to perform intent classification processing on the text to be recognized to obtain its intent category; the first splicing unit is used to splice the text to be recognized with a preset template sentence to obtain target text if the intent category is determined to be a specified intent category, where the preset template sentence is used to represent intent prompt information; the first recognition unit is used to input the target text into an intent recognition model to obtain the intent recognition result of the text to be recognized, where the intent recognition model performs intent recognition processing on the text to be recognized based on the intent prompt information. In this way, by performing intent classification on the text to be recognized and determining its intent category, it can be determined whether the text belongs to the specified intent category, so that only text of the specified intent category is spliced with the preset template sentence to form the input data of the intent recognition model; the intent recognition model then recognizes the target text based on the intent prompt information represented by the preset template sentence and obtains the intent recognition result, which improves the accuracy of intent recognition for the specified intent category.
In the above embodiments, a training method for an intent recognition model is provided; correspondingly, a training device for an intent recognition model is also provided, which is described below with reference to the drawings.
Figure 9 is a schematic diagram of a training device for an intent recognition model provided by an embodiment of this application.
This embodiment provides a training device 900 for an intent recognition model, including: a second acquisition unit 901, used to obtain initial training text, where the intent category of the initial training text is a specified intent category; a second splicing unit 902, used to splice the initial training text with a preset template sentence to obtain target training text, where the preset template sentence is used to represent intent prompt information; and a training unit 903, used to input the target training text into an initial intent recognition model for iterative training to obtain the intent recognition model.
In the above embodiments, an intention recognition method applied to digital humans is provided; correspondingly, an intention recognition device applied to digital humans is also provided, including: a third acquisition unit, used to obtain text to be recognized input by a user; a second recognition unit, used to recognize the intent of the text to be recognized according to the intention recognition method described in the first aspect, to obtain the user intent; and a display unit, used to obtain, according to the user intent, target text corresponding to the user intent in the system of the digital human, and to display the target text.
Corresponding to the intention recognition method described above and based on the same technical concept, an embodiment of this application also provides an electronic device used to perform the intention recognition method provided above; or, corresponding to the training method of the intent recognition model described above and based on the same technical concept, an embodiment of this application also provides an electronic device used to perform the training method of the intent recognition model provided above; or, corresponding to the intention recognition method applied to digital humans described above and based on the same technical concept, an embodiment of this application also provides an electronic device used to perform the intention recognition method applied to digital humans provided above. Figure 10 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
As shown in Figure 10, the electronic device may vary greatly due to different configurations or performance, and may include one or more processors 1001 and a memory 1002; the memory 1002 may store one or more application programs or data. The memory 1002 may be short-term storage or persistent storage. The application program stored in the memory 1002 may include one or more modules (not shown), and each module may include a series of computer-executable instructions for the electronic device. Furthermore, the processor 1001 may be configured to communicate with the memory 1002 and execute the series of computer-executable instructions in the memory 1002 on the electronic device. The electronic device may also include one or more power supplies 1003, one or more wired or wireless network interfaces 1004, one or more input/output interfaces 1005, one or more keyboards 1006, and so on.
In one embodiment, the electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory; the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the programs are configured to be executed by one or more processors and include computer-executable instructions for: obtaining text to be recognized; performing intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized; if the intent category is determined to be a specified intent category, splicing the text to be recognized with a preset template sentence to obtain target text, where the preset template sentence is used to represent intent prompt information; and inputting the target text into an intent recognition model to obtain an intent recognition result of the text to be recognized, where the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
In another embodiment, the electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory; the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the programs are configured to be executed by one or more processors and include computer-executable instructions for: obtaining initial training text, where the intent category of the initial training text is a specified intent category; splicing the initial training text with a preset template sentence to obtain target training text, where the preset template sentence is used to represent intent prompt information; and inputting the target training text into an initial intent recognition model for iterative training to obtain the intent recognition model.
In one embodiment, the electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory; the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the programs are configured to be executed by one or more processors and include computer-executable instructions for: obtaining text to be recognized input by a user; recognizing the intent of the text to be recognized according to the intention recognition method described in each of the foregoing intention recognition method embodiments to obtain the user intent; and obtaining, according to the user intent, target text corresponding to the user intent in the system of the digital human, and displaying the target text.
Corresponding to the intention recognition method described above, or the training method of the intent recognition model, or the intention recognition method applied to digital humans, and based on the same technical concept, an embodiment of this application also provides a computer-readable storage medium.
In one embodiment, the computer-readable storage medium provided by this embodiment is used to store computer-executable instructions which, when executed by a processor, implement the following process: obtaining text to be recognized; performing intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized; if the intent category is determined to be a specified intent category, splicing the text to be recognized with a preset template sentence to obtain target text, where the preset template sentence is used to represent intent prompt information; and inputting the target text into an intent recognition model to obtain an intent recognition result of the text to be recognized, where the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
In another embodiment, the computer-readable storage medium provided by this embodiment is used to store computer-executable instructions which, when executed by a processor, implement the following process: obtaining initial training text, where the intent category of the initial training text is a specified intent category; splicing the initial training text with a preset template sentence to obtain target training text, where the preset template sentence is used to represent intent prompt information; and inputting the target training text into an initial intent recognition model for iterative training to obtain the intent recognition model.
In one embodiment, the computer-readable storage medium provided by this embodiment is used to store computer-executable instructions which, when executed by a processor, implement the following process: obtaining text to be recognized input by a user; recognizing the intent of the text to be recognized according to the intention recognition method described in each of the foregoing intention recognition method embodiments to obtain the user intent; and obtaining, according to the user intent, target text corresponding to the user intent in the system of the digital human, and displaying the target text.
It should be noted that the embodiments concerning the computer-readable storage medium in this specification are based on the same inventive concept as the embodiments concerning the intention recognition method or the training method of the intent recognition model in this specification; therefore, for the specific implementation of these embodiments, reference may be made to the implementation of the corresponding methods described above, and repeated descriptions are omitted.
Specific embodiments of this specification have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In certain implementations, multitasking and parallel processing are also possible or may be advantageous.
Those skilled in the art should understand that embodiments of this application may be provided as methods, systems, or computer program products. Therefore, embodiments of this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this specification may take the form of a computer program product implemented on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
This specification is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of this specification. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable device to produce a machine, such that the instructions executed by the processor of the computer or other programmable device produce a means for implementing the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable device, such that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-persistent storage in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, and information storage may be implemented by any method or technology. Information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic tape cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the statement "comprising a ..." does not exclude the presence of other identical elements in the process, method, commodity, or device that includes the element.
Embodiments of this application may be described in the general context of computer-executable instructions, such as program modules, executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and so on that perform specific tasks or implement specific abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
Each embodiment in this specification is described in a progressive manner; for identical or similar parts between the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively simple; for relevant parts, reference may be made to the description of the method embodiments.
The above descriptions are only embodiments of this document and are not intended to limit this document. For those skilled in the art, this document may have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of this document shall be included within the scope of the claims of this document.

Claims (12)

  1. An intention recognition method, comprising:
    obtaining text to be recognized;
    performing intent classification processing on the text to be recognized to obtain an intent category of the text to be recognized;
    if the intent category is determined to be a specified intent category, splicing the text to be recognized with a preset template sentence to obtain target text;
    inputting the target text into an intent recognition model to obtain an intent recognition result of the text to be recognized, wherein the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
  2. The method according to claim 1, wherein,
    the method comprises: performing value prediction processing on the mask according to the target text to obtain a corresponding mask prediction value;
    determining, according to the mask prediction value corresponding to the target text and a preconfigured mapping relationship between mask values and intent labels, a target intent label having a mapping relationship with the mask prediction value, and determining the target intent label as the intent recognition result of the text to be recognized.
  3. The method according to claim 2, wherein the method comprises:
    determining, according to the target text, the probability that the value of the mask is each preset mask value in a preconfigured mask-value set, to obtain a prediction probability corresponding to each preset mask value;
    sorting each preset mask value according to the magnitude of the prediction probability to obtain a sorting result;
    based on the sorting result, determining the preset mask value with the highest prediction probability as the mask prediction value corresponding to the target text.
  4. The method according to claim 1, wherein performing intent classification processing on the text to be recognized to obtain the intent category of the text to be recognized comprises:
    counting the number of historical texts included in a pre-stored historical text set to obtain a first quantity;
    determining, in the historical text set, the number of historical texts belonging to the same intent as the text to be recognized, to obtain a second quantity;
    determining, according to the first quantity and the second quantity, a frequency of occurrence of the intent corresponding to the text to be recognized in the historical text set, to obtain a target frequency value;
    determining the intent category of the text to be recognized according to a comparison result of the target frequency value and a preset frequency threshold.
  5. The method according to claim 4, wherein determining, in the historical text set, the number of historical texts belonging to the same intent as the text to be recognized to obtain the second quantity comprises:
    calculating the similarity between the text to be recognized and each historical text in the historical text set to obtain a target similarity;
    if the target similarity is greater than or equal to a preset similarity threshold, determining that the text to be recognized and the historical text corresponding to the target similarity belong to the same intent;
    counting the number of historical texts belonging to the same intent as the text to be recognized to obtain the second quantity.
  6. The method according to claim 4, wherein the intent category of the text to be recognized comprises one of a non-long-tail intent, an important long-tail intent, and an unimportant long-tail intent; and determining the intent category of the text to be recognized according to the comparison result of the target frequency value and the preset frequency threshold comprises:
    if the target frequency value is greater than or equal to the preset frequency threshold, determining that the intent category of the text to be recognized is a non-long-tail intent;
    if the target frequency value is less than the preset frequency threshold, judging, according to business rules corresponding to the text to be recognized, whether an importance parameter of the intent of the text to be recognized is greater than or equal to a preset parameter threshold, wherein the importance parameter is used to represent the degree of importance of the intent of the text to be recognized;
    if the importance parameter of the intent of the text to be recognized is greater than or equal to the preset parameter threshold, determining that the intent category of the text to be recognized is an important long-tail intent;
    if the importance parameter of the intent of the text to be recognized is less than the preset parameter threshold, determining that the intent category of the text to be recognized is an unimportant long-tail intent.
  7. The method according to any one of claims 1 to 6, wherein, if the text to be recognized is text input by a target user and the number of intent recognition results of the text to be recognized is multiple, after inputting the target text into the intent recognition model for intent recognition processing to obtain the intent recognition results of the text to be recognized, the method further comprises:
    feeding back intent confirmation information to the target user, wherein the intent confirmation information carries the multiple intent recognition results;
    receiving an intent selection instruction of the target user, and determining the intent selected by the intent selection instruction as a target intent;
    executing a corresponding intent response operation according to the target intent.
  8. A training method for an intent recognition model, comprising:
    obtaining initial training text, wherein the intent category of the initial training text is a specified intent category;
    splicing the initial training text with a preset template sentence to obtain target training text, wherein the preset template sentence is used to represent intent prompt information;
    inputting the target training text into an initial intent recognition model for iterative training to obtain the intent recognition model.
  9. An intention recognition method applied to digital humans, comprising:
    obtaining text to be recognized input by a user;
    recognizing the intent of the text to be recognized according to the intention recognition method of any one of claims 1 to 7, to obtain a user intent;
    obtaining, according to the user intent, target text corresponding to the user intent in the system of the digital human, and displaying the target text.
  10. An intention recognition device, comprising:
    a first acquisition unit, used to obtain text to be recognized;
    a classification unit, used to perform intent classification processing on the text to be recognized to obtain an intent category of the text to be recognized;
    a first splicing unit, used to splice the text to be recognized with a preset template sentence to obtain target text if the intent category is determined to be a specified intent category, wherein the preset template sentence is used to represent intent prompt information;
    a recognition unit, used to input the target text into an intent recognition model to obtain an intent recognition result of the text to be recognized, wherein the intent recognition model is used to perform intent recognition processing on the text to be recognized based on the intent prompt information.
  11. An electronic device, comprising:
    a processor; and a memory configured to store computer-executable instructions which, when executed, cause the processor to perform the intention recognition method of any one of claims 1 to 7, or the training method of the intent recognition model of claim 8, or the intention recognition method applied to digital humans of claim 9.
  12. A computer-readable storage medium, used to store computer-executable instructions which, when executed by a processor, implement the intention recognition method of any one of claims 1 to 7, or the training method of the intent recognition model of claim 8, or the intention recognition method applied to digital humans of claim 9.
PCT/CN2023/111242 2022-08-25 2023-08-04 Intention recognition method and apparatus, electronic device and storage medium WO2024041350A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211029991.7 2022-08-25
CN202211029991.7A CN117708266A (zh) 2022-08-25 2022-08-25 Intention recognition method and apparatus, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2024041350A1 true WO2024041350A1 (zh) 2024-02-29

Family

ID=90012488

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/111242 WO2024041350A1 (zh) 2023-08-04 Intention recognition method and apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN117708266A (zh)
WO (1) WO2024041350A1 (zh)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210067470A1 (en) * 2019-08-28 2021-03-04 International Business Machines Corporation Methods and systems for improving chatbot intent training
CN112380861A * 2020-11-13 2021-02-19 北京京东尚科信息技术有限公司 Model training method and device, and intent recognition method and device
CN112989035A * 2020-12-22 2021-06-18 平安普惠企业管理有限公司 Method, device and storage medium for recognizing user intent based on text classification
CN114357973A * 2021-12-10 2022-04-15 马上消费金融股份有限公司 Intent recognition method and device, electronic device and storage medium
CN114528844A * 2022-01-14 2022-05-24 中国平安人寿保险股份有限公司 Intent recognition method and device, computer device and storage medium
CN114757176A * 2022-05-24 2022-07-15 上海弘玑信息技术有限公司 Method for obtaining a target intent recognition model and intent recognition method

Also Published As

Publication number Publication date
CN117708266A (zh) 2024-03-15

Similar Documents

Publication Publication Date Title
US11394667B2 (en) Chatbot skills systems and methods
US20210256417A1 (en) System and method for creating data to train a conversational bot
TW201935273A (zh) 語句的使用者意圖識別方法和裝置
CN111428010B (zh) 人机智能问答的方法和装置
CN110019742B (zh) 用于处理信息的方法和装置
JP7488871B2 (ja) 対話推薦方法、装置、電子機器、記憶媒体ならびにコンピュータプログラム
EP4060517A1 (en) System and method for designing artificial intelligence (ai) based hierarchical multi-conversation system
JP6199517B1 (ja) 決定装置、決定方法および決定プログラム
CN114817538B (zh) 文本分类模型的训练方法、文本分类方法及相关设备
CN116863935B (zh) 语音识别方法、装置、电子设备与计算机可读介质
JP7182584B2 (ja) スピーチ理解における解析異常の情報を出力するための方法
CN112487188A (zh) 一种舆情监测方法、装置、电子设备和存储介质
WO2024041350A1 (zh) 意图识别方法、装置、电子设备及存储介质
US11941414B2 (en) Unstructured extensions to rpa
CN115114281A (zh) 查询语句的生成方法和装置,存储介质和电子设备
CN113343668B (zh) 选择题解题方法、装置、电子设备及可读存储介质
CN112131484A (zh) 一种多人会话建立方法、装置、设备和存储介质
CN117972222B (zh) 基于人工智能的企业信息检索方法及装置
CN116776870B (zh) 意图识别方法、装置、计算机设备及介质
WO2024067377A1 (zh) 样本生成方法、装置、电子设备及存储介质
CN116933800B (zh) 一种基于模版的生成式意图识别方法及装置
Agrawal et al. WASABI Contextual BOT
KR102662500B1 (ko) 추론 응답 시간을 기반으로 한 딥러닝 모델 동적 전환 시스템
US11475875B2 (en) Method and system for implementing language neutral virtual assistant
US20230138741A1 (en) Social network adapted response

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23856440

Country of ref document: EP

Kind code of ref document: A1