CN110704641A

CN110704641A - Ten-thousand-level intention classification method and device, storage medium and electronic equipment

Info

Publication number: CN110704641A
Application number: CN201910966234.4A
Authority: CN
Inventors: 韩亚昕; 李航; 宋成业; 曾文佳; 冯梦盈
Original assignee: Zero Rhino (beijing) Technology Co Ltd
Current assignee: Zero Rhino (beijing) Technology Co Ltd
Priority date: 2019-10-11
Filing date: 2019-10-11
Publication date: 2020-01-17
Anticipated expiration: 2039-10-11
Also published as: CN110704641B

Abstract

The application relates to the technical field of artificial intelligence and provides a ten-thousand-level intention classification method and device, a storage medium and electronic equipment. The ten-thousand-level intention classification method comprises the following steps: obtaining the conversation sentences of at least one round of conversation with a user; performing context analysis on the conversation sentences to complete the context information of the conversation; performing semantic analysis on the conversation sentences to obtain a plurality of candidate intentions of the user; and, based on the candidate intentions and the completed context information of the conversation, determining the real intention of the user by using an intention decision model constructed with a reinforcement learning algorithm. The method is built on a new three-layer man-machine conversation framework: the semantic understanding layer obtains a plurality of candidate intentions through semantic analysis, the logical inference layer obtains the completed context information through context analysis, and the decision-making layer uses the intention decision model to make a dynamic decision from the candidate intentions and the completed context information, so that the accuracy of intention classification is high.

Description

Ten-thousand-level intention classification method and device, storage medium and electronic equipment

Technical Field

The application relates to the technical field of artificial intelligence, in particular to a ten-thousand-level intention classification method and device, a storage medium and electronic equipment.

Background

In recent years, perceptual intelligence, represented by deep learning, has reached a new height in automatically mining and recognizing features of video, images, speech and text, approaching the human level and even exceeding it in some application areas. However, perceptual intelligence enabled by deep learning is increasingly close to its 'ceiling', and machines still fall far short in simulating human cognitive ability, that is, cognitive intelligence. How to give machines cognitive intelligence is a major application requirement and technical challenge facing the field of artificial intelligence.

Natural man-machine conversation is currently one of the most classic application scenarios in the field of cognitive intelligence. It resembles conversation between human beings, and how natural it is depends on whether the machine can read the other party's words and expressions and say things that leave the other party satisfied and comfortable. In other words, it depends on whether the machine can, from the specific context state it has grasped, accurately recognize or gain insight into the request or underlying intention of the target user, and then decide on a reply or skill that meets the target user's intention and expectations.

With the rise and spread of applications such as intelligent customer service and intelligent assistants, man-machine conversation technology has developed rapidly. In general, however, most current enterprise-oriented man-machine conversation products recognize user intentions only weakly (for example, they can classify only hundreds of intentions) and can meet only the conversation needs of small and medium-sized enterprises in simple business scenarios. A considerable number of large enterprises that apply man-machine conversation technology to drive business innovation have many application scenarios, complex business types and fine-grained customer needs, and therefore place higher demands on user intention recognition (for example, classifying tens of thousands of intentions), which current man-machine conversation technology cannot meet.

Disclosure of Invention

An embodiment of the present application provides a ten-thousand-level intention classification method and device, a storage medium and electronic equipment, so as to solve the above technical problems.

In order to achieve the above purpose, the present application provides the following technical solutions:

In a first aspect, an embodiment of the present application provides a ten-thousand-level intention classification method, including: obtaining the conversation sentences of at least one round of conversation with a user; performing context analysis on the conversation sentences to complete the context information of the conversation; performing semantic analysis on the conversation sentences to obtain a plurality of candidate intentions of the user; and, based on the plurality of candidate intentions and the completed context information of the conversation, determining the real intention of the user from the plurality of candidate intentions by using an intention decision model constructed with a reinforcement learning algorithm.

The method is built on a new three-layer man-machine conversation framework comprising a semantic understanding layer, a logical inference layer and a decision-making layer. The semantic understanding layer obtains a plurality of candidate intentions through semantic analysis; the logical inference layer obtains the completed context information through context analysis; and the decision-making layer uses an intention decision model built with a reinforcement learning algorithm to make a dynamic decision from the candidate intentions and the completed context information, finally outputting the real intention of the user.
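As a non-limiting illustration, the data flow between the three layers can be sketched as follows. This is a minimal sketch under stated assumptions: the type aliases, the function `classify_intention` and the three stub layers are hypothetical names introduced only for illustration, and the rule in `stub_decide` merely stands in for the reinforcement-learning decision model, which is not reproduced here.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases: each layer is modeled as a plain callable.
Context = Dict[str, str]
SemanticLayer = Callable[[List[str]], List[str]]      # sentences -> candidate intentions
InferenceLayer = Callable[[List[str]], Context]       # sentences -> completed context
DecisionLayer = Callable[[List[str], Context], str]   # candidates + context -> real intention


def classify_intention(
    sentences: List[str],
    understand: SemanticLayer,
    infer: InferenceLayer,
    decide: DecisionLayer,
) -> str:
    """Run the three layers in the order described in the text."""
    context = infer(sentences)          # logical inference layer
    candidates = understand(sentences)  # semantic understanding layer
    return decide(candidates, context)  # decision-making layer


# Stub layers with invented outputs, for illustration only.
def stub_understand(sentences: List[str]) -> List[str]:
    return ["query_balance", "query_arrears_reason"]


def stub_infer(sentences: List[str]) -> Context:
    return {"topic": "arrears_reason"}


def stub_decide(candidates: List[str], context: Context) -> str:
    # Rule-based stand-in (an assumption), not the reinforcement-learning model.
    topic = context.get("topic", "")
    for candidate in candidates:
        if topic and topic in candidate:
            return candidate
    return candidates[0]


result = classify_intention(
    ["why am I in arrears"], stub_understand, stub_infer, stub_decide
)
```

Modeling each layer as a callable keeps the sketch independent of any particular model; real layers would replace the stubs.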

In an implementation manner of the first aspect, performing context analysis on the conversation sentences to complete the context information of the conversation includes: extracting the entities and relations in the conversation sentences; and, according to the extracted entities and relations, performing context analysis on the conversation sentences by using a pre-constructed panoramic graph to complete the context information of the conversation; wherein the panoramic graph comprises a knowledge graph and a case graph of the field to which the conversation relates.

The implementation mode introduces the domain knowledge through the panoramic graph, so that the context information can be effectively supplemented, and the accuracy of the intention decision is improved.
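By way of a hedged example, a panoramic graph can be pictured as a set of (head, relation, tail) triples drawn from a knowledge graph and a case graph, with extracted entities used as lookup keys. The triple representation, the toy triples and the function `related_facts` are assumptions made for illustration; the application does not specify a storage format or a query method.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Invented toy triples; a real panoramic graph would be built from domain data.
KNOWLEDGE_GRAPH: List[Triple] = [
    ("arrears", "belongs_to", "call_charges"),
    ("arrears", "may_be_caused_by", "data_overage"),
]
CASE_GRAPH: List[Triple] = [
    ("arrears", "resolved_in_past_case_by", "top_up"),
]


def related_facts(entities: Set[str], graph: Iterable[Triple]) -> List[Triple]:
    """Return every triple whose head or tail is one of the extracted entities."""
    return [t for t in graph if t[0] in entities or t[2] in entities]


# The panoramic graph combines the knowledge graph and the case graph.
panoramic_graph = KNOWLEDGE_GRAPH + CASE_GRAPH
facts = related_facts({"arrears"}, panoramic_graph)
```

The retrieved triples are the kind of domain knowledge that could be used to supplement the context information of the conversation.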

In one implementation of the first aspect, the context information of the conversation comprises: entity and slot information in the conversation sentences, user emotion information, user portrait information, general knowledge related to the conversation, and scene information of the conversation. Performing context analysis on the conversation sentences by using the pre-constructed panoramic graph to complete the context information of the conversation includes: performing language context analysis on the conversation sentences by using the pre-constructed panoramic graph to complete the entity and slot information in the conversation sentences; performing culture context analysis on the conversation sentences by using the pre-constructed panoramic graph to complete the user emotion information, the user portrait information and the general knowledge related to the conversation; and performing scene context analysis on the conversation sentences by using the pre-constructed panoramic graph to complete the scene information of the conversation.

In the above implementation, the language context information, the culture context information and the scene context information constitute complete context information, which is beneficial to fully describing the language environment in which the dialog occurs, and there is no precedent for simultaneously using the three types of context information for intent classification in the prior art.
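The three types of context information can be represented, purely as an illustrative assumption, by a small nested data structure. The class and field names below are hypothetical and follow the items listed above; the helper `incomplete_parts` is likewise an invented convenience, not part of the described method.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LanguageContext:
    entities: List[str] = field(default_factory=list)
    slots: Dict[str, str] = field(default_factory=dict)


@dataclass
class CultureContext:
    user_emotion: Optional[str] = None
    user_portrait: Dict[str, str] = field(default_factory=dict)
    general_knowledge: List[str] = field(default_factory=list)


@dataclass
class SceneContext:
    scene: Optional[str] = None


@dataclass
class DialogueContext:
    language: LanguageContext = field(default_factory=LanguageContext)
    culture: CultureContext = field(default_factory=CultureContext)
    scene: SceneContext = field(default_factory=SceneContext)

    def incomplete_parts(self) -> List[str]:
        """Name the context types that still hold no information."""
        parts = []
        if not (self.language.entities or self.language.slots):
            parts.append("language")
        if not (self.culture.user_emotion or self.culture.user_portrait
                or self.culture.general_knowledge):
            parts.append("culture")
        if self.scene.scene is None:
            parts.append("scene")
        return parts


ctx = DialogueContext()
ctx.language.slots["amount"] = "thirty yuan"
ctx.scene.scene = "operator customer service"
```

In this example only the culture context remains empty, so `ctx.incomplete_parts()` names it as the part still to be completed.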

In one implementation manner of the first aspect, determining the real intention of the user from the plurality of candidate intentions, based on the plurality of candidate intentions and the completed context information of the conversation, by using an intention decision model constructed with a reinforcement learning algorithm includes: inputting into the intention decision model a pre-constructed panoramic graph, the plurality of candidate intentions, the completed context information of the conversation, the text features of the conversation sentence input by the user in the previous round, the text features of the conversation sentence input by the user in the current round, and the probability distribution of the user's conversation state in the previous round; and obtaining the real intention of the user and the conversation state of the current round output by the intention decision model; wherein the panoramic graph comprises a knowledge graph and a case graph of the field to which the conversation relates.

This implementation describes possible inputs and outputs of the intention decision model. The model makes dynamic decisions, is not limited by the number of intention categories, and does not require a conversation flow to be pre-configured for a limited set of user intentions. It can therefore cope with challenges such as accurate intention recognition and dynamic conversation-state decisions under ten-thousand-level intention classification, and helps realize dynamic decisions in multi-round free conversation.
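The inputs and outputs listed above can be gathered into two records, sketched below under explicit assumptions. The names `DecisionInput`, `DecisionOutput` and `placeholder_policy` are hypothetical; the policy simply picks the candidate with the highest previous-round probability and, as a further simplification, treats dialogue states and intentions as sharing the same labels. It is a stand-in for the learned model, not a reinforcement learning algorithm.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DecisionInput:
    panoramic_graph: List[Tuple[str, str, str]]    # knowledge graph + case graph triples
    candidate_intentions: List[str]
    completed_context: Dict[str, str]
    previous_text_features: List[float]
    current_text_features: List[float]
    previous_state_distribution: Dict[str, float]  # dialogue state -> probability


@dataclass
class DecisionOutput:
    real_intention: str
    current_state: str


def placeholder_policy(x: DecisionInput) -> DecisionOutput:
    """Stand-in for the learned model: pick the candidate with the highest prior."""
    best = max(
        x.candidate_intentions,
        key=lambda intent: x.previous_state_distribution.get(intent, 0.0),
    )
    # Assumption: the dialogue state is labeled by the chosen intention.
    return DecisionOutput(real_intention=best, current_state=best)


decision = placeholder_policy(DecisionInput(
    panoramic_graph=[],
    candidate_intentions=["query_balance", "query_arrears_reason"],
    completed_context={},
    previous_text_features=[],
    current_text_features=[],
    previous_state_distribution={"query_balance": 0.3, "query_arrears_reason": 0.7},
))
```

A trained model would use all six inputs; the placeholder reads only two of them so that the interface can be exercised.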

In an implementation manner of the first aspect, performing semantic analysis on the dialogue sentences to obtain a plurality of candidate intentions of the user includes: performing semantic analysis on the dialogue sentences by using a semantic matching model and/or an intention classification model to obtain the plurality of candidate intentions of the user.

In the semantic understanding layer, either the semantic matching model or the intention classification model may be used alone, or the analysis results of both models may be combined, as required, to obtain the plurality of candidate intentions of the user.

In an implementation manner of the first aspect, performing semantic analysis on the dialog sentence by using a semantic matching model and an intention classification model to obtain a plurality of candidate intentions of the user, including: performing semantic analysis on the dialogue statement by using a semantic matching model to obtain a first intention output by the model; semantic analysis is carried out on the dialogue sentences by using an intention classification model, and a second intention output by the model is obtained; and judging whether the first intention and the second intention are the same, and if the first intention and the second intention are not the same, determining the first intention and the second intention as the candidate intentions.

In one implementation form of the first aspect, the method further comprises: if the first intention and the second intention are the same, determining the first intention as the real intention of the user.

The two implementations above describe the case in which the semantic understanding layer uses a semantic matching model and an intention classification model at the same time. If the two models' predictions of the user intention are inconsistent, the user's intention is unclear, and the two intentions output by the models can be submitted as candidate intentions to the intention decision model for a further decision. If the two predictions are consistent, the intention expressed by the user can essentially be regarded as clear, and it can be output directly as the intention classification result without a further decision by the intention decision model.
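The agreement check between the two models can be sketched as follows. The function name `resolve_predictions` and the intention labels are hypothetical; the two input strings stand for the first intention (from the semantic matching model) and the second intention (from the intention classification model).

```python
from typing import List, Optional, Tuple


def resolve_predictions(
    first_intention: str, second_intention: str
) -> Tuple[Optional[str], List[str]]:
    """Return (real_intention, candidate_intentions).

    If the two models agree, the shared prediction is taken as the real
    intention and no candidates are forwarded. If they disagree, both
    predictions become candidates for the intention decision model.
    """
    if first_intention == second_intention:
        return first_intention, []
    return None, [first_intention, second_intention]


agreed = resolve_predictions("query_balance", "query_balance")
disputed = resolve_predictions("query_balance", "query_arrears_reason")
```

Here `agreed` carries a real intention and an empty candidate list, while `disputed` carries no real intention and two candidates awaiting a further decision.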

In an implementation manner of the first aspect, the performing semantic analysis on the dialog sentence by using a semantic matching model and/or an intention classification model to obtain a plurality of candidate intentions of the user includes: embedding and representing the dialogue sentences based on characters, words and domain knowledge of the domain related to the dialogue; and inputting the result of the embedded representation of the dialogue statement into the semantic matching model and/or the intention classification model to perform semantic analysis on the dialogue statement, so as to obtain a plurality of candidate intentions of the user.

In the implementation mode, the domain knowledge is introduced when the sentence is embedded and expressed, so that the dimensionality of the feature serving as the expression result is increased, and therefore the accuracy of the intention classification result of the semantic understanding layer model is improved, and the accuracy of the final intention decision is improved.
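One simple way to picture an embedded representation that draws on characters, words and domain knowledge is to pool each kind of feature separately and concatenate the results, which increases the dimensionality as described above. The sketch below assumes mean pooling and concatenation; neither is stated in the application, and the lookup tables and their values are invented for illustration.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector], dim: int) -> Vector:
    """Average a list of vectors; return zeros when nothing was found."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_sentence(
    chars: Sequence[str],
    words: Sequence[str],
    entities: Sequence[str],
    char_table: Dict[str, Vector],
    word_table: Dict[str, Vector],
    entity_table: Dict[str, Vector],
    dim: int,
) -> Vector:
    """Concatenate pooled character, word and domain-entity embeddings."""
    embedding: Vector = []
    for units, table in (
        (chars, char_table),
        (words, word_table),
        (entities, entity_table),
    ):
        found = [table[u] for u in units if u in table]
        embedding.extend(mean_pool(found, dim))
    return embedding


# Toy lookup tables with invented values.
vector = embed_sentence(
    chars=["a", "b"],
    words=["ab"],
    entities=["ARREARS"],
    char_table={"a": [1.0, 0.0], "b": [0.0, 1.0]},
    word_table={"ab": [0.5, 0.5]},
    entity_table={"ARREARS": [1.0, 1.0]},
    dim=2,
)
```

The resulting vector has three times the dimensionality of a single lookup table, with the last block contributed by the domain entities.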

In an implementation manner of the first aspect, the obtaining of the dialog sentences of at least one round of dialog with the user includes: acquiring a first dialogue statement input by a user; judging whether statement information is missing in the first dialogue statement; if the statement information is missing, performing at least one round of clarification conversation with the user; wherein the dialog sentences include the first dialog sentence and a clarifying dialog sentence of the at least one round of clarifying dialog.

In this implementation, when information is found to be missing from the user's dialogue sentence, a clarification dialogue can be actively conducted with the user. More valuable information is collected during the clarification dialogue, which facilitates operations such as completing the missing sentence information and completing the context information.

In an implementation manner of the first aspect, the performing semantic analysis on the dialog statement to obtain a plurality of candidate intentions of the user includes: completing information loss in the dialogue sentences by using the completed dialogue context information; and carrying out semantic analysis on the completed dialogue sentences to obtain a plurality of candidate intentions of the user.

In the implementation mode, the missing key information in the dialogue sentences is completed, so that the accuracy of the intention classification result of the semantic understanding layer model is improved, and the accuracy of the final intention decision is improved.

In an implementation manner of the first aspect, the context information of the dialogue includes entity and slot information in the dialogue sentences, and completing the missing information in the dialogue sentences by using the completed dialogue context information includes: using the completed entity and slot information to complete the missing information in the dialogue sentences.
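A minimal sketch of this completion step, assuming that slots are held in dictionaries and that a missing slot is marked with `None`, is given below. The function name `complete_sentence_slots` and the slot names are hypothetical.

```python
from typing import Dict, Optional


def complete_sentence_slots(
    sentence_slots: Dict[str, Optional[str]],
    context_slots: Dict[str, str],
) -> Dict[str, Optional[str]]:
    """Fill slots missing from the sentence with values from the completed context.

    Values stated explicitly in the sentence take precedence; slots that are
    absent from both sources stay None.
    """
    return {
        name: value if value is not None else context_slots.get(name)
        for name, value in sentence_slots.items()
    }


completed = complete_sentence_slots(
    {"origin": None, "destination": "Shanghai", "date": None},
    {"origin": "Beijing", "date": "tomorrow", "cabin": "economy"},
)
```

Only the slots the sentence requires are returned, so the unrelated `cabin` entry in the context is ignored.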

In a second aspect, an embodiment of the present application provides a ten-thousand-level intention classification apparatus, including: the conversation acquisition module is used for acquiring conversation sentences of at least one round of conversation with the user; the context information completion module is used for performing context analysis on the conversation sentences and completing the context information of the conversation; the semantic understanding module is used for carrying out semantic analysis on the conversation sentences to obtain a plurality of candidate intentions of the user; and the intention decision module is used for determining the real intention of the user from the candidate intentions by utilizing an intention decision model constructed based on a reinforcement learning algorithm based on the candidate intentions and the complemented conversation context information.

In a third aspect, an embodiment of the present application provides a computer-readable storage medium, where computer program instructions are stored on the computer-readable storage medium, and when the computer program instructions are read and executed by a processor, the computer program instructions perform the method provided by the first aspect or any one of the possible implementation manners of the first aspect.

In a fourth aspect, an embodiment of the present application provides an electronic device, including: a memory in which computer program instructions are stored, and a processor, where the computer program instructions are read and executed by the processor to perform the method provided by the first aspect or any one of the possible implementation manners of the first aspect.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.

FIG. 1 is an architecture diagram of a natural human-machine interaction technique provided by an embodiment of the present application;

FIG. 2 is a flowchart of a ten-thousand-level intention classification method according to an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of one implementation of the ten-thousand-level intention classification method of FIG. 2;

FIG. 4 is a schematic diagram of one implementation of the logical inference layer of FIG. 3;

FIG. 5 is a schematic diagram of one implementation of the decision-making layer of FIG. 3;

FIG. 6 is a functional block diagram of a ten-thousand-level intention classification apparatus according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element. In this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.

One of the core problems of the natural human-computer conversation technology is user intention recognition, and once a robot (generally, an entity capable of performing human-computer conversation with a user, which may be implemented by software, hardware or a combination of the two) recognizes the intention of the user, a corresponding answer or skill decision can be taken, so as to provide a service satisfying the user. One mainstream implementation of the intention identification is intention classification, that is, several intention categories are preset, and which category the intention that the user wants to express belongs to is determined according to the dialog content of the user.

However, existing intention classification methods are suitable only for hundred-level intention classification and struggle to give satisfactory results for ten-thousand-level intention classification, whereas the method provided by the embodiments of the present application handles ten-thousand-level intention classification well. Here, "hundred-level intentions" means that the number of intentions to be classified is at most in the hundreds, which is common in simple business scenarios of small and medium-sized enterprises. For example, in a scenario where a user books a meal, the number of intentions is limited: asking whether seats are available, what the prices are, whether there is parking, and so on. "Ten-thousand-level intentions" means that the number of intentions to be classified reaches the tens of thousands, which is common in complex business scenarios of large enterprises. For example, when a user trades on an e-commerce platform, the number of intentions is huge: the user expresses different intentions as a buyer and as a seller, and a buyer alone may ask about platform rules, ask about the shopping process, complain about a seller, suggest improvements to the platform, and so on, each of which contains many finer-grained intentions. It is understood that "hundreds" and "tens of thousands" here are approximate figures rather than exact values. In addition, the ten-thousand-level intention classification method provided by the embodiments of the present application can naturally also be used for intention classification below or above the ten-thousand level; for simplicity, the following description mainly takes ten-thousand-level intention classification as an example.

The inventor finds that, aiming at the fine classification requirement of ten thousand intentions, the current man-machine conversation technology mainly has the following bottleneck problems:

(1) Semantic entanglement problem under ten-thousand-level intention classification

For hundred-level intention classification, whether the categories are organized hierarchically or in a single level, intention recognition can match the semantic similarity of sentences using word statistics or distributed vector representations, or can use a classification model trained on a labeled corpus. A traditional semantic understanding framework that combines these two technical schemes, semantic matching and classification models, can accurately recognize more than 90% of user intentions. However, under ten-thousand-level intention classification the services are finely divided, and semantic relevance and word overlap exist between different service requests, so several semantically similar expressions may belong to completely different intention categories. This can be called the semantic entanglement problem under ten-thousand-level intention classification. For example, in an operator's intelligent customer service, the user expressions "how much do I owe", "why am I in arrears" and "why do I owe thirty yuan" all appear to express a "call charge inquiry" request in the "call charge" business category, but actually express three different, finer intentions: "query balance", "query reason for arrears" and "query where the call charges went". Semantic entanglement is common in ten-thousand-level intention classification scenarios, and classification accuracy with the existing technical framework is low, which makes it an important factor restricting the successful application of natural man-machine conversation systems in large-enterprise business scenarios.
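The word overlap behind semantic entanglement can be illustrated with a simple word-set similarity. The sketch below uses Jaccard similarity, chosen here only as a convenient stand-in for similarity based on word statistics; the English utterances are paraphrases of the example above, and the intention labels are hypothetical.

```python
from itertools import combinations
from typing import Set


def word_set(sentence: str) -> Set[str]:
    """Lowercase the sentence and split it into a set of words."""
    return set(sentence.lower().split())


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity: size of the intersection over size of the union."""
    sa, sb = word_set(a), word_set(b)
    return len(sa & sb) / len(sa | sb)


# English paraphrases of the three utterances, each with a different intention.
utterances = {
    "query_balance": "how much do I owe",
    "query_arrears_reason": "why do I owe money",
    "query_charge_destination": "why do I owe thirty yuan",
}

# Pairwise similarity between utterances that belong to different intentions.
scores = {
    (i, j): jaccard(utterances[i], utterances[j])
    for i, j in combinations(utterances, 2)
}
```

Every pair shares several words even though each utterance belongs to a different intention, which is why overlap-based matching alone has difficulty separating them.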

(2) Key information missing problem under ten-thousand-level intention classification

In the initial stage of a natural man-machine conversation project, the labeled business corpus provided by the enterprise is usually small, so it is difficult to train an effective semantic matching model or intention classification model. For hundred-level intention classification, techniques such as data augmentation, pseudo-labeling or bootstrapping can be used to enlarge the labeled corpus for training, or, following the idea of transfer learning, the parameters of a pre-trained model such as BERT can be fine-tuned to improve the accuracy of the semantic matching model or intention classification model. However, a large enterprise with ten-thousand-level intention classification requirements has many complex business types, and its business intentions are frequently updated and adjusted. Corpora for the updated intentions are often scarce or absent, so the key information needed to classify most business intentions is missing, and completing it takes a long time. The statistical characteristics of many business intentions are therefore hard to mine, and the traditional cognitive architecture cannot meet ten-thousand-level intention classification requirements.

(3) Multi-round free dialogue dynamic decision problem under ten-thousand-level intention classification

For hundred-level intention classification, provided that intention classification accuracy is high, multi-round dialogue management is commonly modeled with a finite state machine or another Markov decision process configured as paths in a graph structure. This dialogue management mechanism requires all possible dialogue paths to be preset and assumes a finite set of dialogue states, which is basically adequate for multi-round dialogue under hundred-level intention classification. In multi-round dialogue under ten-thousand-level intention classification, however, many intentions are involved, so requests with multiple intentions and switches between intentions occur frequently, and the dialogue state must be decided dynamically according to the context in which the language is expressed. This can be called the multi-round free dialogue dynamic decision problem. Traditional dialogue decision methods based on a finite, fixed sequence of states have difficulty handling such problems.
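The limitation of preset dialogue paths can be seen in a small finite state machine. The sketch below is illustrative only: the states, intentions and transition table are invented, loosely following the meal-booking scenario mentioned earlier.

```python
from typing import Dict, Optional, Tuple

# Preset dialogue paths: (current state, recognized intention) -> next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("start", "ask_seats"): "seats_answered",
    ("seats_answered", "ask_price"): "price_answered",
    ("price_answered", "ask_parking"): "parking_answered",
}


def next_state(state: str, intention: str) -> Optional[str]:
    """Follow a preset path, or return None when no path was configured."""
    return TRANSITIONS.get((state, intention))


on_path = next_state("start", "ask_seats")     # a configured path
off_path = next_state("start", "ask_parking")  # the user jumps ahead
```

A user who asks about parking first falls outside the configured paths, and the number of entries needed to cover every such jump grows quickly as the number of intentions increases.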

It should be emphasized that the discovery of the above problems is the result of the inventor's practice and careful study. Therefore, both the process of discovering these problems and the solutions to them proposed in the following embodiments should be regarded as the inventor's contribution to the present invention.

The traditional framework mainly performs intention classification through a semantic matching model and an intention classification model from the perspective of semantic understanding. The cognitive intelligence based on the traditional framework mainly solves the semantic understanding and intention classification problems according to statistical characteristics or patterns, and the model based on the statistical characteristics has inherent limitations, namely, the inherent logicality of natural language and the environment of language expression (context for short) are not considered, so that the semantic entanglement, key information loss and multi-round free conversation dynamic decision problems (namely, the three problems described above) under ten thousand level intention classification cannot be solved.

After research, the inventor considers that how to fuse logical expression (namely knowledge representation) and context representation of a natural language symbol space into a statistical feature space for joint learning and modeling is the key for solving the natural man-machine interaction problem under ten thousand level intention classification. To this end, referring to fig. 1, the present application proposes a natural human-machine conversation technology architecture integrating three layers, namely a semantic understanding layer 100 based on a semantic matching model and an intention classification model, a logical inference layer 110 based on a knowledge graph and a case graph, and a decision making layer 120 based on a multi-turn conversation decision of reinforcement learning. In each layer of fig. 1, some (but not all) technical points included in each layer are also shown as rectangular small boxes, and a specific implementation manner of each layer will be described later.

For the semantic understanding layer, entity and relation features of the domain knowledge are additionally embedded into the traditional semantic feature representation based on statistical characteristics (namely, the semantic feature representation based on characters and words), which improves the prediction accuracy of the semantic matching model and the classification model.

For the logical inference layer, a panoramic graph consisting of a domain knowledge graph and a case graph is constructed. Context analysis based on the panoramic graph performs functions such as entity recognition, entity error correction, ellipsis completion and reference resolution, thereby completing the key information of the conversation sentences and the context information of the conversation, and building complete conversation context knowledge containing the three elements of language, culture and scene.

For the decision-making layer, the conversation state is tracked based on the completed conversation context information, and decision tasks such as multi-round conversation state decisions and intention classification are carried out.

How the user's intention is classified on the basis of the technical framework of FIG. 1 is explained below in conjunction with FIG. 2 to FIG. 5. FIG. 2 is a flowchart of a ten-thousand-level intention classification method provided in an embodiment of the present application, FIG. 3 shows a specific implementation of the method in FIG. 2, FIG. 4 shows a specific implementation of the logical inference layer in FIG. 3, and FIG. 5 shows a specific implementation of the decision-making layer in FIG. 3. The methods in FIG. 2 to FIG. 5 may be performed by a robot conducting a man-machine conversation, which will not be repeated below.

Referring first to fig. 2, the method includes:

Step S200: acquire the dialogue sentences of at least one round of dialogue with the user.

In order to recognize the intention of the user, the robot needs to conduct at least one round of dialog with the user to acquire dialog sentences as analysis material, where the dialog sentences can include the words spoken by the user or the words spoken by the robot. The specific manner of the human-machine conversation is not limited. For example, the user may type text on a keyboard to talk with the robot. As another example, the user may talk with the robot by voice, in which case the robot converts the user's voice into text and converts the words it wants to say into synthesized speech for output.

There are various ways to generate at least one round of dialog. For example, if the robot is mainly used for chatting, at least one round of dialog arises naturally in the course of chatting. If the robot is mainly used for customer service, then after the user inputs the first dialog sentence, the robot determines whether the sentence already satisfies the conditions for intent classification (e.g., whether sentence information is missing, whether the context information of the dialog is complete, etc.). If the conditions are satisfied, the intent can be classified directly from the sentence; otherwise, the robot can actively conduct at least one round of clarifying dialog with the user according to a clarification script, and the first dialog sentence input by the user together with the clarifying dialog sentences generated by the at least one round of clarifying dialog are used as the material for classifying the user's intent. A clarifying dialog is a dialog that the robot conducts in order to guide the user to state his or her real intention. For example, the user says "I want to buy an airline ticket", which is a vague expression of business intention. The robot can guide the user to give more details by clarification, for example by replying "When do you want to fly?" or "From where to where do you want to fly?", and as the user answers these questions, the real intention is progressively clarified.

Referring to fig. 3, after the user inputs the first dialog sentence, the sentence may be preprocessed, and the original sentence may be converted into a form more suitable for subsequent processing. For example, the preprocessing may include operations such as de-punctuation, text regularization, word segmentation, and part-of-speech tagging of sentences, and specific methods thereof may refer to the prior art and are not specifically set forth herein.
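As an illustration only, the preprocessing described above might be sketched as follows. The function name `preprocess` is hypothetical, Unicode NFKC folding plus lowercasing is an assumed form of text regularization, and whitespace splitting merely stands in for a real word segmenter; part-of-speech tagging is left out because it requires an external tagger.

```python
import re
import unicodedata


def preprocess(sentence: str) -> list[str]:
    """Normalize, strip punctuation from, and tokenize one dialogue sentence."""
    # Text regularization (assumed form): NFKC folding plus lowercasing.
    text = unicodedata.normalize("NFKC", sentence).lower()
    # De-punctuation: replace every character that is neither a word
    # character nor whitespace with a space.
    text = re.sub(r"[^\w\s]", " ", text)
    # Word segmentation: a whitespace split stands in for a real segmenter.
    return text.split()
```

For a language written without spaces, such as Chinese, the last step would have to be replaced by a dedicated segmentation tool.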

It should be noted that the above "first dialog sentence" does not strictly mean the words when the user talks to the robot for the first time, and the dialog between the user and the robot may be a continuous process, for example, if the user wants the robot to provide several services for the user, after the robot recognizes an intention of the user, the sentence in which the user starts to express the next intention may also be referred to as "first dialog sentence".

For the preprocessed dialogue sentences, the robot can judge whether information is missing through preset rules, for example, whether a sentence contains pronouns (if it does, information is missing) or whether a sentence omits its subject or object (if it does, information is missing). If no information is missing, the user intention is classified by using the model of the semantic understanding layer (see step S202). If information is missing, the dialog sentence does not meet the conditions for intent classification, so the robot can conduct at least one round of clarifying dialog with the user according to a clarification script, the missing information is supplemented during the dialog, and the user intention is then classified by the model of the semantic understanding layer according to the dialog sentence after the information has been supplemented (see step S202). The completion of missing sentence information is explained further in step S201. It should be noted that, because the user inputs new sentences during the clarifying dialog and these new sentences may themselves have missing information, the object of information completion is not limited to the first dialog sentence input by the user but may also be a new sentence input by the user; this is not explicitly shown in fig. 3. In addition, the robot may also preprocess the sentences newly input by the user, which is likewise not explicitly shown in fig. 3.
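The preset rules mentioned above could, for instance, look like the following toy sketch. The word lists `PRONOUNS` and `TRANSITIVE_VERBS` and the function `has_missing_information` are hypothetical; detecting an omitted subject or object reliably would require a syntactic parser, which the positional checks here only approximate.

```python
# Hypothetical rule tables; a real system would use a curated lexicon.
PRONOUNS = {"it", "this", "that", "these", "those", "he", "she", "they", "them"}
TRANSITIVE_VERBS = {"buy", "book", "cancel", "renew"}


def has_missing_information(tokens: list[str]) -> bool:
    """Return True when a preset rule flags the sentence as incomplete."""
    if not tokens:
        return True
    # Rule 1: a pronoun means some referent is not stated in the sentence.
    if any(tok in PRONOUNS for tok in tokens):
        return True
    # Rule 2: a transitive verb in final position suggests an omitted object.
    if tokens[-1] in TRANSITIVE_VERBS:
        return True
    # Rule 3: a transitive verb in first position suggests an omitted subject.
    return tokens[0] in TRANSITIVE_VERBS
```

A sentence flagged by any rule would be routed to the clarifying dialog; an unflagged sentence would go directly to the semantic understanding layer.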

The clarifying dialog may continue until the robot recognizes the real intention of the user. Each time the robot completes the missing information in the dialog sentences obtained so far, the completed sentences may be fed back to the model of the semantic understanding layer for intention classification (see step S202).

The inventors have found through research that missing information is one of the main reasons why ten-thousand-level intentions are difficult to classify accurately. Therefore, for a sentence with incomplete information, the missing information is first supplemented through clarifying dialog and intention classification is performed afterwards, which helps improve the accuracy of the intention classification.

Step S201: perform context analysis on the dialogue sentences to complete the context information of the dialogue.

As mentioned above, the context is the environment in which language is expressed, and the same sentence may express different intentions in different environments. Completing the context information of the dialog is therefore of great significance for improving the accuracy of the intent classification.

After the dialogue sentences of at least one round are obtained in step S200, the entities and relations in the dialogue sentences may be extracted. Then, according to the extracted entities and relations, a pre-constructed panoramic atlas is used to perform context analysis on the dialogue sentences so as to complete the context information of the dialogue. The panoramic atlas includes a knowledge atlas and a case atlas of the domain to which the dialog relates. The knowledge atlas is a knowledge base built on binary relations and is mainly used to describe the entities and concepts that exist in the real world and the relations among them. The case atlas is a knowledge base of event logic and is mainly used to describe the rules and patterns by which events evolve. The domain to which the dialog relates can be the domain in which the robot provides services; for example, for the robot of an insurance company, the domain is insurance. For the construction of knowledge atlases and case atlases, reference may be made to the prior art, and no specific explanation is given here.
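To make the two components of the panoramic atlas more concrete, a minimal in-memory sketch is given below. The class `PanoramicGraph` and its fields are hypothetical: the knowledge atlas is represented as (head, relation, tail) triples, matching the binary-relation description above, and the case atlas as a mapping from an event to the events assumed to follow it. The embodiment does not prescribe any particular storage format.

```python
from dataclasses import dataclass, field


@dataclass
class PanoramicGraph:
    # Knowledge atlas: (head entity, relation, tail entity) triples.
    triples: set[tuple[str, str, str]] = field(default_factory=set)
    # Case atlas: event -> events that typically follow it.
    next_events: dict[str, list[str]] = field(default_factory=dict)

    def related(self, entity: str) -> list[tuple[str, str]]:
        """Return the sorted (relation, tail) pairs whose head is `entity`."""
        return sorted((r, t) for h, r, t in self.triples if h == entity)

    def likely_next(self, event: str) -> list[str]:
        """Return the events recorded as following `event`, if any."""
        return self.next_events.get(event, [])
```

In the insurance example, an extracted entity such as "car insurance" could be looked up with `related`, and an extracted event such as "accident reported" with `likely_next`, to supply information the sentence itself does not state.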

Referring to fig. 4, for the dialogue sentences obtained in step S200 (in fig. 4 the dialogue sentences are obtained through clarifying dialog with the user, but other ways of obtaining the dialogue sentences in step S200 are not excluded), the context information of the dialogue can be completed based on the panoramic atlas. The context information of the dialog may include, but is not limited to, the three parts shown in fig. 4, i.e., language context information, cultural context information, and scene context information, which may be completed by panoramic-atlas-based language context analysis, cultural context analysis, and scene context analysis, respectively. The language context information focuses on describing the content of the sentence itself, and may specifically include the entity and slot information in the dialogue sentence. The cultural context information focuses on describing the state of the user who carries out the dialog, and may specifically include user emotion information (the emotion of the user during the dialog), user portrait information (characteristics such as the identity, attributes and preferences of the user), common knowledge related to the dialog, and the like. The scene context information focuses on describing the external environmental factors in which the dialog occurs, and may specifically include scene information of the dialog, such as the time and place at which the dialog occurs.
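One possible way to hold the three kinds of context information is sketched below. All class and field names are hypothetical and chosen only to mirror the description of fig. 4; the helper `missing_fields` is an assumed convenience for deciding what still needs to be completed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LanguageContext:
    entities: list[str] = field(default_factory=list)
    slots: dict[str, Optional[str]] = field(default_factory=dict)


@dataclass
class CulturalContext:
    emotion: Optional[str] = None
    user_profile: dict[str, str] = field(default_factory=dict)
    common_knowledge: list[str] = field(default_factory=list)


@dataclass
class SceneContext:
    time: Optional[str] = None
    place: Optional[str] = None


@dataclass
class DialogueContext:
    language: LanguageContext = field(default_factory=LanguageContext)
    culture: CulturalContext = field(default_factory=CulturalContext)
    scene: SceneContext = field(default_factory=SceneContext)

    def missing_fields(self) -> list[str]:
        """List the context items that have not been filled in yet."""
        missing = [f"slot:{k}" for k, v in self.language.slots.items() if v is None]
        if self.culture.emotion is None:
            missing.append("emotion")
        if self.scene.time is None:
            missing.append("time")
        if self.scene.place is None:
            missing.append("place")
        return missing
```

A non-empty result from `missing_fields` could be one trigger for the robot to continue the clarifying dialog.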

With continued reference to fig. 3, after completing the three types of context information of < language, culture, and scene > in fig. 4, the completed context information may be output to the decision-making layer (see step S203), and the completion of the context information is crucial to improving the decision accuracy of the intent decision-making model.

On the other hand, still referring to fig. 3, the logical reasoning layer has two functions. The first has been described above: the context information of the dialog is completed by means of the panoramic atlas. The second was mentioned in the description of step S200: when a dialogue sentence has missing information, the missing information in the sentence is completed. Completing the missing information may specifically include performing entity recognition, entity error correction, completion of omitted elements, reference resolution, and the like on the sentence. In one alternative, the completion of the missing sentence information may be implemented by using the completed entity and slot information in fig. 4. As mentioned before, the completed sentence can be fed back to the model in the semantic understanding layer for intent classification (see step S202), which helps improve the accuracy of intent classification.

It should be noted that, although in fig. 3 the clarifying dialog is actively conducted only when the robot determines that a dialogue sentence has missing information, the reason for conducting a clarifying dialog is not limited to missing sentence information. For example, when the robot determines that context information is missing, it may also actively conduct a clarifying dialog with the user and gradually supplement the context information of the dialog during that dialog.
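As a deliberately simplified illustration of reference resolution using completed entity information, the sketch below replaces a pronoun with the most recently identified entity. The function `resolve_pronouns` and its default pronoun set are hypothetical, and substituting the latest entity is an assumed heuristic rather than the method of the embodiment.

```python
from typing import Optional


def resolve_pronouns(
    tokens: list[str],
    last_entity: Optional[str],
    pronouns: frozenset[str] = frozenset({"it", "this", "that"}),
) -> list[str]:
    """Replace each pronoun with the most recent entity, when one is known."""
    if last_entity is None:
        # Nothing to substitute; return an unchanged copy.
        return list(tokens)
    return [last_entity if tok in pronouns else tok for tok in tokens]
```

For example, after a turn that mentions "travel insurance", the follow-up "how much is it" would be completed to "how much is travel insurance" before being fed back to the semantic understanding layer.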

Step S202: perform semantic analysis on the dialogue sentences to obtain a plurality of candidate intentions of the user.

The semantic analysis may employ, but is not limited to, one or both of a semantic matching model and an intent classification model. For example, the former may be a BERT model obtained by transfer learning, and the latter may be a fastText model, a BERT classification model, or the like. Both the semantic matching model and the intention classification model can output probabilities for the intention classes. The implementation of these models is not described further here; reference may be made to the prior art.

Before semantic analysis is performed on a dialogue sentence, the sentence may be converted into a suitable numerical form (e.g., a feature vector) and then input to a model for intent classification. For example, the dialog sentence may be given an embedded representation based on three feature dimensions, namely characters, words, and the domain knowledge of the domain to which the dialog relates (for example, a distributed embedded representation based on the BERT model). Of course, the dialog sentence may also be given an embedded representation based on only the two feature dimensions of characters and words, or other representation methods may be adopted.
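The idea of combining three feature dimensions can be illustrated with toy lookup tables, as in the sketch below. Averaging the vectors of each dimension and concatenating the three averages is an assumption made for brevity; the function names and tables are hypothetical, and a BERT-based representation as mentioned above would be considerably more elaborate.

```python
def mean_vector(vectors: list[list[float]], dim: int) -> list[float]:
    """Average a list of equal-length vectors; return zeros when it is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_sentence(
    chars: list[str],
    words: list[str],
    entities: list[str],
    char_table: dict[str, list[float]],
    word_table: dict[str, list[float]],
    entity_table: dict[str, list[float]],
    dim: int = 2,
) -> list[float]:
    """Concatenate the mean character, word, and domain-entity vectors."""
    features: list[float] = []
    for items, table in (
        (chars, char_table),
        (words, word_table),
        (entities, entity_table),
    ):
        # Items missing from a table are skipped rather than guessed.
        known = [table[item] for item in items if item in table]
        features.extend(mean_vector(known, dim))
    return features
```

Dropping the third table yields the two-dimension variant based on characters and words only.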

The candidate intentions are the intentions that the model of the semantic understanding layer determines the user may be expressing. Because of problems such as semantic entanglement, however, it is difficult for the semantic understanding layer to tell which of them is the real intention of the user, so the plurality of candidate intentions can be input into the intention decision model of the decision-making layer for the final decision. Of course, it is not excluded that the intention the user wants to express is very simple and the semantic understanding layer can determine it directly; in that case the real intention of the user can be output directly, without an intention decision by the decision-making layer.

There are several ways to obtain candidate intents. For example, suppose only one model is used, taking the intention classification model as an example. The model outputs a probability for each intention. If the probability of the most probable intention exceeds a certain threshold (e.g., 80%, 90%, etc.), that intention can be regarded as the true intention of the user, and no candidate intentions need to be output. If the probabilities of the several most probable intentions are close, e.g., the top two intentions have probabilities of 45% and 42% respectively, these intentions can be regarded as candidate intentions.
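The single-model case can be sketched as follows. The function `select_intents` is hypothetical, and the `margin` used to decide that probabilities are "close" is an assumed parameter; the source only gives the 45% versus 42% example and thresholds such as 80% or 90%.

```python
from typing import Optional


def select_intents(
    probs: dict[str, float],
    accept: float = 0.8,
    margin: float = 0.05,
) -> tuple[Optional[str], list[str]]:
    """Return (true_intent, candidates) from one model's probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    top_intent, top_p = ranked[0]
    # Confident case: the best intention exceeds the acceptance threshold.
    if top_p > accept:
        return top_intent, []
    # Ambiguous case: keep every intention within `margin` of the best one.
    candidates = [intent for intent, p in ranked if top_p - p <= margin]
    return None, candidates
```

With probabilities of 45%, 42% and 13%, the first two intentions are returned as candidates and no true intention is output.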

As another example, if the semantic matching model and the intention classification model are used simultaneously, each model outputs the intention with the highest probability. That is, semantic analysis is performed on the dialogue sentence by the semantic matching model to obtain the first intention output by that model, and semantic analysis is performed on the dialogue sentence by the intention classification model to obtain the second intention output by that model. It can then be judged whether the first intention and the second intention are the same. If they are not the same, the intention of the user is not clear at the semantic understanding layer, so the first intention and the second intention can be determined as two candidate intentions and processed further by the decision-making layer. If they are the same, the intention of the user is clear at the semantic understanding layer, so the first intention (or the second intention) can be determined as the real intention of the user and output directly without further processing by the decision-making layer. Of course, in some implementations a model need not output only one intention but may output several intentions with similar probabilities, so that the number of candidate intents in these implementations may exceed two.

Referring to fig. 3, a semantic matching model and an intention classification model are both adopted in fig. 3. The input of the models is the dialogue sentence after information completion (or a dialogue sentence that needs no completion), and the dialogue sentence is given an embedded representation based on characters, words and domain knowledge, where the characters and words used for the embedded representation can be obtained when the sentence is preprocessed (preprocessing can include character segmentation and word segmentation). The candidate intents obtained by fusing the semantic analysis results of the semantic matching model and the intention classification model are input to the intention decision model of the decision-making layer for a further decision. Note that fig. 3 does not show the branch in which the semantic understanding layer directly outputs the true intent of the user.
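The comparison of the two model outputs reduces to a few lines, sketched here with a hypothetical function name `fuse_intents`; it covers only the basic case in which each model outputs a single intention.

```python
from typing import Optional


def fuse_intents(first: str, second: str) -> tuple[Optional[str], list[str]]:
    """Compare the top intention of two models.

    Returns (true_intent, candidates): agreement yields a true intention
    and no candidates; disagreement yields two candidates.
    """
    if first == second:
        return first, []
    return None, [first, second]
```

When each model outputs several intentions, the candidate list could instead be built from the union of both outputs.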

Step S203: based on the candidate intentions and the completed dialogue context information, determine the real intention of the user from the candidate intentions by using an intention decision model constructed based on a reinforcement learning algorithm.

Referring to fig. 3, an intention decision model may be pre-constructed through a reinforcement learning algorithm, where inputs of the model include the completed dialog context information obtained in step S201 and the plurality of candidate intentions obtained in step S202, an output of the model includes a predicted real intention (belonging to one of the plurality of candidate intentions) of the user, and after the real intention of the user is determined, the robot may adopt an appropriate answer or skill decision, where the skill decision may include, but is not limited to, a business corresponding to the real intention expressed by the user.

In fig. 3, both the first dialog sentence input by the user and the user's answers to the robot's clarifying dialog can take many forms, and some answers even fail to address the question asked or lack logic. In the present method, the dialogue sentences are analyzed and reasoned over by using the knowledge representation of the panoramic atlas. On the one hand, key information is completed in sentences whose expression of the user's intention is vague, and the completed sentences are input again to the model of the semantic understanding layer, which improves the classification accuracy of the candidate intentions and ultimately the accuracy of the intention decision model. On the other hand, complete dialogue context information containing the three elements of language, culture and scene is completed, and the completed context information is then input to the intention decision model and combined with the panoramic atlas to make the intention decision, which also helps improve the accuracy of the intention decision model.

In some implementations, the robot may also evaluate the intention output by the model and confirm it as the real intention of the user only when an evaluation condition is satisfied. If the evaluation condition is not satisfied, the dialog with the user may be continued (for example, the clarifying dialog is continued), and after each round of dialog the intention decision model again judges the real intention of the user and the intention is evaluated again, until the evaluation condition is satisfied (this process is not explicitly shown in fig. 3). For example, the intention decision model may predict a probability value for every candidate intention and output the candidate intention with the highest probability value as the true intention of the user, and the evaluation condition may be that the probability value corresponding to the true intention must be greater than a certain threshold (e.g., 80%, 90%, etc.).
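The evaluate-and-continue loop can be sketched as below. The function `decide_with_evaluation` is hypothetical; each element of `rounds` stands for the probability distribution that the intention decision model would produce after one more round of dialog, which is an assumption used here to avoid modelling the dialog itself.

```python
from typing import Iterable, Optional


def decide_with_evaluation(
    rounds: Iterable[dict[str, float]],
    threshold: float = 0.8,
) -> tuple[Optional[str], int]:
    """Return (confirmed_intent, rounds_used); intent is None if never confirmed."""
    used = 0
    for used, probs in enumerate(rounds, start=1):
        # The model's output is the candidate with the highest probability.
        intent, p = max(probs.items(), key=lambda kv: kv[1])
        # Evaluation condition: the probability must exceed the threshold.
        if p > threshold:
            return intent, used
        # Otherwise the dialog continues and the next round is consumed.
    return None, used
```

If no round satisfies the condition, the sketch reports no confirmed intention; how a real system would end such a dialog is not specified in the source.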

Referring to fig. 5, in a specific implementation of the decision-making layer, the intention decision model may be constructed by using, but not limited to, the deep Q-network (DQN) reinforcement learning algorithm proposed by DeepMind. The input of the model may include a pre-constructed panoramic atlas, the plurality of candidate intentions, the completed dialog context information, the text features of the dialog sentence input by the user in the previous round, the text features of the dialog sentence input by the user in the current round, and the probability distribution of the dialog state of the previous round. The output of the model may include the real intention of the user (not shown in fig. 5) and the dialog state of the current round (the dialog state reached after the transition caused by the current round of dialog). As mentioned above, if the robot finds that the classification result of the intention decision model does not satisfy the evaluation condition, the dialog with the user can be continued, so there is an iterative relationship between the dialog states in the input and output of the model.

The inventors have found through research that, in multi-round dialogs between a user and a robot, multiple intentions coexist, intentions switch back and forth, and intentions are expressed vaguely. Conventional methods generally preset a number of intention recognition, intention transition and script response paths for the customer on the basis of a finite state machine model, so that dialog management cannot recognize or respond to an intention expressed by the user that falls outside the preset scheme. In the scenario of ten-thousand-level intention classification, the number of intentions is so large that it is difficult to configure all intention classification and script response paths in advance. The present application uses the panoramic atlas together with an intention decision model built on a reinforcement learning algorithm to dynamically decide the current dialog state of the robot and the real intention of the user according to the relevant context information and the historical dialog states, and can thereby obtain a good intention classification result.
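For orientation only, the sketch below shows two generic ingredients of Q-learning as applied to choosing among candidate intentions: a greedy choice over action values, and the one-step temporal-difference target that DQN regresses toward. A hypothetical linear scorer replaces the deep network, the state vector is assumed to be some numeric encoding of the inputs listed above, and none of the names come from the embodiment.

```python
def q_values(state: list[float], weights: dict[str, list[float]]) -> dict[str, float]:
    """Score every intention with a linear stand-in for the deep Q-network."""
    return {
        intent: sum(w * s for w, s in zip(ws, state))
        for intent, ws in weights.items()
    }


def greedy_intent(
    state: list[float],
    weights: dict[str, list[float]],
    candidates: list[str],
) -> str:
    """Pick the candidate intention with the highest Q-value."""
    q = q_values(state, weights)
    return max(candidates, key=lambda intent: q[intent])


def td_target(
    reward: float,
    next_q: dict[str, float],
    gamma: float = 0.9,
    terminal: bool = False,
) -> float:
    """One-step Q-learning target: r + gamma * max over next-state Q-values."""
    if terminal:
        return reward
    return reward + gamma * max(next_q.values())
```

Restricting the greedy choice to the candidate intentions reflects the statement that the predicted real intention is one of the candidates; the reward design and the state encoding are left open here because the source does not specify them.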

In some implementations, the panoramic atlas may also be dynamically updated to further improve the accuracy of the intent decision.

In summary, the ten-thousand-level intention classification method provided by the embodiment of the application is realized based on a brand-new three-layer man-machine conversation technical framework, and the framework is provided for ten-thousand-level intention classification scenes and specifically comprises a semantic understanding layer, a logical reasoning layer and a decision-making judgment layer. The method comprises the steps of obtaining a plurality of candidate intentions in a semantic understanding layer in a semantic analysis mode, obtaining supplemented context information in a logic reasoning layer in a context analysis mode, utilizing an intention decision model established by a reinforcement learning algorithm in a decision judgment layer, carrying out dynamic decision according to the candidate intentions and the supplemented context information, and finally outputting the real intentions of a user.
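The way the three layers hand results to one another can be summarized in a short orchestration sketch. The function `classify_intent` and its three callable parameters are hypothetical placeholders for the logical reasoning layer, the semantic understanding layer and the decision-making layer respectively.

```python
from typing import Any, Callable, Optional


def classify_intent(
    sentences: list[str],
    analyze_context: Callable[[list[str]], Any],
    analyze_semantics: Callable[[list[str]], tuple[Optional[str], list[str]]],
    decide: Callable[[list[str], Any], str],
) -> str:
    """Wire steps S201 to S203 together for already-acquired sentences."""
    # S201: the logical reasoning layer completes the context information.
    context = analyze_context(sentences)
    # S202: the semantic understanding layer yields a true intention or candidates.
    true_intent, candidates = analyze_semantics(sentences)
    if true_intent is not None:
        # Simple case: no decision by the decision-making layer is needed.
        return true_intent
    # S203: the decision-making layer picks among the candidates using the context.
    return decide(candidates, context)
```

Step S200 (acquiring the sentences, including any clarifying dialog) is assumed to have happened before this function is called.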

In view of the three bottleneck problems in the human-computer interaction field, the ten-thousand-level intent classification method provided by the embodiment of the present application offers corresponding solutions. First, on the basis of the traditional semantic matching model and intention classification model, the symbolic features of the internal logic of natural language are given particular consideration, and a semantic feature representation method that embeds domain knowledge is provided. This improves the accuracy with which the semantic matching model and the intention classification model classify the intention of dialogue sentences and helps alleviate the semantic entanglement problem. In addition, when the user's dialogue sentences and/or the context information are incomplete, at least one round of clarifying dialog is actively conducted with the user to collect more useful information, and the panoramic atlas constructed from domain knowledge is used for analysis and reasoning to complete the missing information in the dialog and the context information of the dialog. This effectively mitigates the semantic entanglement caused by missing sentence information and missing dialog context (manifested as missing context information). Meanwhile, the completed sentence information and context information are ultimately fed to the input of the intention decision model, which helps solve the problem of unsatisfactory intention classification results caused by missing key information.
In addition, an intention decision model based on a reinforcement learning algorithm is proposed. Dynamic decisions can be made through the intention decision model, the intention classes are not limited, and dialog flows do not need to be configured in advance for a limited set of user intentions. Challenging problems such as accurate intention recognition and dynamic decision of dialog states under the requirement of ten-thousand-level intention classification can thus be handled well, and dynamic decision-making in multi-round free dialog can be achieved.

Fig. 6 is a functional block diagram of a ten-thousand-level intent classification apparatus 300 according to an embodiment of the present application. Referring to fig. 6, the ten-thousand-level intention classification device 300 includes: a dialogue acquiring module 310, configured to acquire dialogue sentences of at least one round of dialogue performed with a user; a context information completing module 320, configured to perform context analysis on the dialog sentences and complete the context information of the dialog; a semantic understanding module 330, configured to perform semantic analysis on the dialog statement to obtain multiple candidate intentions of the user; and the intention decision module 340 is used for determining the real intention of the user from the candidate intentions by utilizing an intention decision model constructed based on a reinforcement learning algorithm based on the candidate intentions and the complemented conversation context information.

In one implementation of the ten-thousand-level intent classification apparatus 300, the contextual information completion module 320 performs contextual analysis on the dialog sentence, and completes the contextual information of the dialog sentence, including: extracting entities and relations in the dialogue sentences; according to the extracted entities and the extracted relations, performing context analysis on the conversation sentences by using a pre-constructed panoramic map, and completing context information of the conversation; the panoramic graph comprises a knowledge graph and a case graph of the field related to the conversation.

In one implementation of the ten thousand level intent classification apparatus 300, the contextual information of the dialog includes: entity and slot position information, user emotion information, user portrait information, general knowledge related to conversation and scene information of conversation in the conversation sentence; the context information completing module 320 performs context analysis on the dialog sentences by using a pre-constructed panorama, and completes the context information of the dialog, including: performing language context analysis on the dialogue sentences by using a pre-constructed panoramic map, and completing entity and slot position information in the dialogue sentences; performing culture context analysis on the conversation sentences by using a pre-constructed panoramic map, and completing user emotion information, user portrait information and general knowledge related to conversation; and performing scene context analysis on the conversation sentences by using a pre-constructed panoramic map to complement the scene information of the conversation.

In one implementation of the ten-thousand-level intent classification apparatus 300, the intent decision module 340 determines the true intent of the user from the plurality of candidate intentions by using an intent decision model constructed based on a reinforcement learning algorithm based on the plurality of candidate intentions and the complemented dialog context information, including: inputting a pre-constructed panoramic atlas, the plurality of candidate intentions, complemented conversation context information, text features of conversation sentences input by a user in the previous round, text features of conversation sentences input by the user in the current round and probability distribution of the conversation states of the user in the previous round into the intention decision model, and obtaining real intentions of the user and the conversation states in the current round output by the intention decision model; the panoramic graph comprises a knowledge graph and a case graph of the field related to the conversation.

In one implementation of the ten-thousand-level intention classification apparatus 300, the semantic understanding module 330 performs semantic analysis on the dialog sentence to obtain a plurality of candidate intentions of the user, including: and performing semantic analysis on the dialogue sentences by utilizing a semantic matching model and/or an intention classification model to obtain a plurality of candidate intentions of the user.

In one implementation of the ten-thousand-level intention classification apparatus 300, the semantic understanding module 330 performs semantic analysis on the dialog sentence by using a semantic matching model and an intention classification model to obtain a plurality of candidate intentions of the user, including: performing semantic analysis on the dialogue statement by using a semantic matching model to obtain a first intention output by the model; semantic analysis is carried out on the dialogue sentences by using an intention classification model, and a second intention output by the model is obtained; and judging whether the first intention and the second intention are the same, and if the first intention and the second intention are not the same, determining the first intention and the second intention as the candidate intentions.

In one implementation of the ten thousand level intent classification apparatus 300, the semantic understanding module 330 is further configured to: if the first intention and the second intention are the same, determining the first intention as the real intention of the user.

In one implementation of the ten-thousand-level intention classification apparatus 300, the semantic understanding module 330 performs semantic analysis on the dialog sentence by using a semantic matching model and/or an intention classification model to obtain a plurality of candidate intentions of the user, including: embedding and representing the dialogue sentences based on characters, words and domain knowledge of the domain related to the dialogue; and inputting the result of the embedded representation of the dialogue statement into the semantic matching model and/or the intention classification model to perform semantic analysis on the dialogue statement, so as to obtain a plurality of candidate intentions of the user.

In one implementation of the ten-thousand-level intent classification apparatus 300, the dialogue acquisition module 310 acquires dialogue sentences of at least one round of dialogue performed with a user, including: acquiring a first dialogue sentence input by the user; judging whether sentence information is missing in the first dialogue sentence; and, if sentence information is missing, performing at least one round of clarifying dialog with the user; wherein the dialog sentences include the first dialog sentence and the clarifying dialog sentences of the at least one round of clarifying dialog.

In one implementation of the ten-thousand-level intention classification apparatus 300, the semantic understanding module 330 performs semantic analysis on the dialog sentence to obtain a plurality of candidate intentions of the user, including: completing information loss in the dialogue sentences by using the completed dialogue context information; and carrying out semantic analysis on the completed dialogue sentences to obtain a plurality of candidate intentions of the user.

In one implementation of the ten-thousand-level intent classification apparatus 300, the contextual information of the dialog includes entity and slot information in a dialog sentence, and the semantic understanding module 330 completes the information missing in the dialog sentence by using the completed dialog contextual information, including: using the completed entity and slot information to complete the missing information in the dialogue sentence.

The implementation principle and the resulting technical effect of the ten-thousand-level intention classifying device 300 provided in the embodiment of the present application have been introduced in the foregoing method embodiments, and for the sake of brief description, portions of the device embodiments that are not mentioned may refer to corresponding contents in the method embodiments.

Fig. 7 is a schematic view of an electronic device according to an embodiment of the present application. Referring to fig. 7, the electronic device 400 includes: a processor 410, a memory 420, and a communication interface 430, which are interconnected and in communication with each other via a communication bus 440 and/or other form of connection mechanism (not shown).

There may be one or more memories 420 (only one is shown in the figure). The memory 420 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The processor 410, and possibly other components, may access the memory 420 to read and/or write data.

There may be one or more processors 410 (only one is shown in the figure). The processor 410 may be an integrated circuit chip having signal processing capabilities. The processor 410 may be a general-purpose processor, including a Central Processing Unit (CPU), a Microcontroller Unit (MCU), a Network Processor (NP), or another conventional processor; or a special-purpose processor, including a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

Communication interface 430 includes one or more (only one shown) devices that can be used to communicate directly or indirectly with other devices for data interaction. For example, the communication interface 430 may be an ethernet interface; may be a mobile communications network interface, such as an interface for a 3G, 4G, 5G network; or may be other types of interfaces having data transceiving functions.

One or more computer program instructions may be stored in the memory 420 and read and executed by the processor 410 to implement the ten-thousand-level intention classification method provided by the embodiments of the present application, as well as other desired functions.

It will be appreciated that the configuration shown in fig. 7 is merely illustrative and that electronic device 400 may include more or fewer components than shown in fig. 7 or have a different configuration than shown in fig. 7. The components shown in fig. 7 may be implemented in hardware, software, or a combination thereof. For example, the electronic device 400 may be a single server (or other devices having arithmetic processing capabilities), a combination of a plurality of servers, a cluster of a large number of servers, or the like, and may be either a physical device or a virtual device.

The embodiment of the present application further provides a computer-readable storage medium, where computer program instructions are stored on the computer-readable storage medium, and when the computer program instructions are read and executed by a processor of a computer, the ten-thousand-level intention classification method provided in the embodiment of the present application is executed. The computer-readable storage medium may be implemented as, for example, memory 420 in electronic device 400 in fig. 7.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.

The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims

1. A ten-thousand level intent classification method, comprising:

obtaining a conversation sentence of at least one round of conversation with a user;

performing context analysis on the conversation sentences to complement the context information of the conversation;

performing semantic analysis on the conversation sentences to obtain a plurality of candidate intentions of the user;

and determining the real intention of the user from the candidate intentions by utilizing an intention decision model constructed based on a reinforcement learning algorithm based on the candidate intentions and the complemented conversation context information.

2. The ten-thousand level intention classification method according to claim 1, wherein the performing context analysis on the dialogue sentences to complement the context information of the dialogue comprises:

extracting entities and relations in the dialogue sentences;

according to the extracted entities and the extracted relations, performing context analysis on the conversation sentences by using a pre-constructed panoramic map, and completing context information of the conversation; the panoramic graph comprises a knowledge graph and a case graph of the field related to the conversation.

3. The ten-thousand level intention classification method according to claim 2, wherein the context information of the dialogue comprises: entity and slot information in the dialogue sentences, user emotion information, user portrait information, general knowledge related to the dialogue, and scene information of the dialogue;

the performing context analysis on the dialogue sentences by using the pre-constructed panoramic graph to complement the context information of the dialogue comprises:

performing language context analysis on the dialogue sentences by using a pre-constructed panoramic map, and completing entity and slot position information in the dialogue sentences;

performing culture context analysis on the conversation sentences by using a pre-constructed panoramic map, and completing user emotion information, user portrait information and general knowledge related to conversation;

and performing scene context analysis on the conversation sentences by using a pre-constructed panoramic map to complement the scene information of the conversation.

4. The ten-thousand level intention classification method according to claim 1, wherein the determining the real intention of the user from the candidate intentions based on the candidate intentions and the complemented dialogue context information by using an intention decision model constructed based on a reinforcement learning algorithm comprises:

inputting a pre-constructed panoramic atlas, the plurality of candidate intentions, complemented conversation context information, text features of conversation sentences input by a user in the previous round, text features of conversation sentences input by the user in the current round and probability distribution of the conversation states of the user in the previous round into the intention decision model, and obtaining real intentions of the user and the conversation states in the current round output by the intention decision model; the panoramic graph comprises a knowledge graph and a case graph of the field related to the conversation.

5. The ten-thousand level intention classification method according to claim 1, wherein the semantic analysis of the dialog sentences to obtain a plurality of candidate intentions of the user comprises:

and performing semantic analysis on the dialogue sentences by utilizing a semantic matching model and/or an intention classification model to obtain a plurality of candidate intentions of the user.

6. The ten-thousand level intention classification method according to claim 5, wherein the semantic analysis is performed on the dialog sentences by using a semantic matching model and an intention classification model to obtain a plurality of candidate intentions of the user, and the method comprises the following steps:

performing semantic analysis on the dialogue statement by using a semantic matching model to obtain a first intention output by the model;

semantic analysis is carried out on the dialogue sentences by using an intention classification model, and a second intention output by the model is obtained;

and judging whether the first intention and the second intention are the same, and if the first intention and the second intention are not the same, determining the first intention and the second intention as the candidate intentions.

7. The ten-thousand level intention classification method according to claim 6, further comprising: if the first intention and the second intention are the same, determining the first intention as the real intention of the user.

8. The ten-thousand level intention classification method according to claim 5, wherein the semantic analysis is performed on the dialog sentence by using a semantic matching model and/or an intention classification model to obtain a plurality of candidate intentions of the user, including:

embedding and representing the dialogue sentences based on characters, words and domain knowledge of the domain related to the dialogue;

and inputting the result of the embedded representation of the dialogue statement into the semantic matching model and/or the intention classification model to perform semantic analysis on the dialogue statement, so as to obtain a plurality of candidate intentions of the user.

9. The ten-thousand level intention classification method according to any one of claims 1 to 8, wherein the obtaining dialogue sentences of at least one round of dialogue with the user comprises:

acquiring a first dialogue statement input by a user;

judging whether statement information is missing in the first dialogue statement;

if the statement information is missing, performing at least one round of clarification conversation with the user; wherein the dialog sentences include the first dialog sentence and a clarifying dialog sentence of the at least one round of clarifying dialog.

10. The ten-thousand level intention classification method according to any one of claims 1 to 8, wherein the semantic analysis of the conversational sentence to obtain a plurality of candidate intentions of the user comprises:

completing information loss in the dialogue sentences by using the completed dialogue context information;

and carrying out semantic analysis on the completed dialogue sentences to obtain a plurality of candidate intentions of the user.

11. The method according to claim 10, wherein the contextual information of the dialog includes entity and slot information in a dialog sentence, and the complementing the information missing in the dialog sentence with complemented dialog contextual information includes:

and utilizing the completed entity and slot position information to complete the information loss in the dialogue statement.

12. A ten-thousand level intention classification apparatus, comprising:

the conversation acquisition module is used for acquiring conversation sentences of at least one round of conversation with the user;

the context information completion module is used for performing context analysis on the conversation sentences and completing the context information of the conversation;

the semantic understanding module is used for carrying out semantic analysis on the conversation sentences to obtain a plurality of candidate intentions of the user;

and the intention decision module is used for determining the real intention of the user from the candidate intentions by utilizing an intention decision model constructed based on a reinforcement learning algorithm based on the candidate intentions and the complemented conversation context information.

13. A computer-readable storage medium having stored thereon computer program instructions which, when read and executed by a processor, perform the method of any one of claims 1-11.

14. An electronic device, comprising: a memory and a processor, the memory having stored therein computer program instructions which, when read and executed by the processor, perform the method of any one of claims 1-11.