CN112732882A - User intention identification method, device, equipment and computer readable storage medium - Google Patents

User intention identification method, device, equipment and computer readable storage medium

Info

Publication number
CN112732882A
CN112732882A
Authority
CN
China
Prior art keywords
intention
preset
node
label
success rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011631344.4A
Other languages
Chinese (zh)
Inventor
李志韬
王健宗
程宁
吴天博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011631344.4A priority Critical patent/CN112732882A/en
Priority to PCT/CN2021/084250 priority patent/WO2022141875A1/en
Publication of CN112732882A publication Critical patent/CN112732882A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3346: Query execution using probabilistic model
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application relates to the technical field of intelligent decision making, and provides a user intention identification method, apparatus, device and computer-readable storage medium. The method includes: acquiring text information corresponding to voice data input by a user, and inputting the text information into a preset intention classification model to obtain the output probabilities of a plurality of preset intention labels used for expressing voice intentions; determining a preset number of candidate intention labels from the preset intention labels according to the output probability of each preset intention label; determining the conversation success rate of each intention node in a preset intention knowledge graph; determining the conversation success rate of each candidate intention label according to the conversation success rate of each intention node; and determining the candidate intention label with the highest conversation success rate as the intention label of the voice data input by the user. In this way, the target intention label of the user can be determined by combining the preset intention classification model with the conversation success rate of each intention node in the preset intention knowledge graph.

Description

User intention identification method, device, equipment and computer readable storage medium
Technical Field
The present application relates to the field of intelligent decision making technologies, and in particular, to a method, an apparatus, a device, and a computer-readable storage medium for identifying a user intention.
Background
A dialog system is a human-computer interaction system based on natural language, and intention recognition is an important part of it: it converts the content of the user's utterance into a form a computer can understand, and the recognized intention directly affects whether the robot's next sentence is relevant to what the user expressed and satisfactory to the client. Intention recognition mainly includes two parts: intent detection and semantic slot extraction. Traditional methods for intention recognition, from Hidden Markov Models (HMMs), Conditional Random Fields (CRFs) and Support Vector Machines (SVMs) to the convolutional and recurrent neural networks popular in the last decade, have achieved good experimental results. However, these models achieve good results only with limited contextual input and a large-scale corpus. Moreover, conventional intention recognition simply selects the intention with the highest classifier probability as the final intention; in practical application scenarios, occasional speech recognition errors then cause some intentions to be recognized incorrectly. How to accurately determine the target intention of the user from the user's voice data is therefore a problem to be solved urgently.
Disclosure of Invention
The present application is directed to a method, an apparatus, a device and a computer readable storage medium for identifying user intention, which are used to accurately determine the intention of voice data input by a user.
In a first aspect, the present application provides a user intention identification method, including:
acquiring text information corresponding to voice data input by a user, and inputting the text information into a preset intention classification model to obtain output probabilities of a plurality of preset intention labels for expressing voice intentions;
determining a preset number of candidate intention labels from a plurality of preset intention labels according to the output probability of each preset intention label;
determining a conversation success rate of each intention node in a preset intention knowledge graph, wherein the preset intention knowledge graph is generated according to historical conversation data;
determining the conversation success rate of each candidate intention label according to the conversation success rate of each intention node;
and determining the candidate intention label with the highest conversation success rate as the intention label of the voice data input by the user.
In a second aspect, the present application further provides a user intention identifying apparatus, which includes an obtaining module, a generating module, a screening module, a first determining module, a second determining module, and a third determining module, wherein:
the acquisition module is used for acquiring text information corresponding to voice data input by a user;
the generating module is used for inputting the text information into a preset intention classification model to obtain output probabilities of a plurality of preset intention labels for representing voice intentions;
the screening module is used for determining a preset number of candidate intention labels from a plurality of preset intention labels according to the output probability of each preset intention label;
the first determining module is used for determining the conversation success rate of each intention node in a preset intention knowledge graph, wherein the preset intention knowledge graph is generated according to historical conversation data;
the second determining module is used for determining the conversation success rate of each candidate intention label according to the conversation success rate of each intention node;
the third determining module is configured to determine the candidate intention label with the highest dialog success rate as the intention label of the voice data input by the user.
In a third aspect, the present application also provides a computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the user intent recognition method as described above.
In a fourth aspect, the present application further provides a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements the steps of the user intention identification method as described above.
The application provides a user intention identification method, apparatus, device and computer-readable storage medium. The method includes: obtaining text information corresponding to voice data input by a user, and inputting the text information into a preset intention classification model to obtain the output probabilities of a plurality of preset intention labels used for representing voice intentions; determining a preset number of candidate intention labels from the preset intention labels according to the output probability of each preset intention label; determining the conversation success rate of each intention node in a preset intention knowledge graph; determining the conversation success rate of each candidate intention label according to the conversation success rate of each intention node; and determining the candidate intention label with the highest conversation success rate as the intention label of the voice data input by the user. In this way, the output probabilities of the preset intention labels are obtained through the preset intention classification model, and the target intention label of the user can be accurately determined by combining these output probabilities with the conversation success rate of each intention node in the preset intention knowledge graph.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a schematic flowchart illustrating steps of a user intention identification method according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of a preset intent classification model provided in an embodiment of the present application;
FIG. 3 is a flow diagram illustrating sub-steps of the user intent recognition method of FIG. 1;
FIG. 4 is a schematic view of a scenario of a default intent knowledge graph according to an embodiment of the present application;
fig. 5 is a schematic block diagram of a user intention recognition apparatus according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a sub-module of the user intent recognition apparatus of FIG. 5;
fig. 7 is a schematic block diagram of a structure of a computer device according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
The embodiment of the application provides a user intention identification method, a user intention identification device, user intention identification equipment and a computer readable storage medium. The user intention identification method can be applied to terminal equipment, and the terminal equipment can be electronic equipment such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant and wearable equipment.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating steps of a user intention identification method according to an embodiment of the present application.
As shown in fig. 1, the user intention identifying method includes steps S101 to S105.
Step S101, obtaining text information corresponding to voice data input by a user, inputting the text information to a preset intention classification model, and obtaining output probabilities of a plurality of preset intention labels for expressing voice intentions.
The preset intention classification model is a pre-trained model comprising a plurality of neural network layers, including at least one of the following: a vector extraction layer, a time-delay neural network (TDNN) layer, a ReLU layer, a residual network layer, an addition layer, a recurrent neural network layer, a dropout layer and a Softmax layer.
Specifically, as shown in fig. 2, text information corresponding to the voice data input by the user is obtained and input to the vector extraction layer to obtain a plurality of word vectors; the word vectors are input to the time-delay neural network layer to extract a plurality of word vector features; the word vector features are input to the ReLU layer, which processes them and reduces gradient vanishing to obtain semantic label vectors; the semantic label vectors and the word vectors are input to the addition layer to obtain a plurality of preliminary intention label vectors; the preliminary intention label vectors are input to the recurrent neural network layer to obtain a plurality of candidate intention label vectors; the candidate intention label vectors are input to the dropout layer to obtain a plurality of preset intention label vectors; and the preset intention label vectors are input to the Softmax layer to obtain the output probabilities of the preset intention labels.
It should be noted that the vector extraction layer may be selected according to the actual situation; for example, the vector extraction layer may be a Word2Vec model. A residual network layer is further included between the time-delay neural network layer and the ReLU layer, which makes their parameter processing more accurate; the dropout layer prevents overfitting of the candidate intention label vectors and improves the accuracy of the output intention label vectors.
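As a rough illustration only (not the patent's actual implementation), the staged forward pass described above can be sketched with toy numpy tensors. All layer sizes, the random weights, and the mean-pooling step are assumptions, and the recurrent and dropout stages are collapsed into a single output projection:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - np.max(x))  # shift for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
word_vecs = rng.normal(size=(4, 8))   # 4 tokens, 8-dim Word2Vec-style vectors
W_tdnn = rng.normal(size=(8, 8))      # stand-in for the time-delay NN layer
features = relu(word_vecs @ W_tdnn)   # TDNN features passed through ReLU
added = features + word_vecs          # addition layer: residual-style sum
pooled = added.mean(axis=0)           # pool tokens into one preliminary vector
W_out = rng.normal(size=(8, 10))      # stand-in for the RNN/dropout/output stages
probs = softmax(pooled @ W_out)       # output probabilities of 10 preset labels
```

The point of the sketch is only the shape of the computation: whatever the intermediate layers are, the model ends in a Softmax whose outputs form a probability distribution over the preset intention labels.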
The preset intention classification model may be trained in the following manner: sample text information is obtained and labeled according to the class identifier corresponding to the output probability of the preset intention label to construct sample data, and a neural network model is iteratively trained on this sample data until it converges, yielding the preset intention classification model. Candidate neural network models include convolutional neural networks, recurrent neural networks and recurrent convolutional neural networks; of course, other network models may also be trained to obtain the preset intention classification model, which is not specifically limited in this application.
In one embodiment, text information corresponding to voice data input by a user is obtained and input to a preset intention classification model, and output probabilities of a plurality of preset intention labels are obtained. The output probability of a plurality of preset intention labels is accurately and quickly determined through the preset intention classification model, and the use experience of a user is greatly improved.
In an embodiment, the text information corresponding to the voice data input by the user is obtained as follows: the voice input by the user is acquired and input into a preset speech recognition model to obtain the text information. The preset speech recognition model is a pre-trained neural network model, which is not specifically limited in this application. In other embodiments, text information corresponding to voice data transmitted by another device is obtained as the text information corresponding to the voice data input by the user. It can be understood that there are other ways to obtain the text information corresponding to the user's voice data, which this application does not limit.
Step S102, determining a preset number of candidate intention labels from a plurality of preset intention labels according to the output probability of each preset intention label.
Wherein the candidate intention labels are the intention labels whose meanings are closest to the user's actual intention.
In one embodiment, the plurality of preset intention labels are sorted from largest to smallest output probability to obtain an intention label queue, and preset intention labels are selected from the front of the queue in order until a preset number of candidate intention labels is obtained. The preset number may be set according to the actual situation, which is not specifically limited in this application; for example, it may be set to 5. By arranging the preset intention labels into an intention label queue and then selecting candidate labels by probability, the accuracy and efficiency of selecting candidate intention labels can be improved.
Illustratively, suppose the output probabilities of preset intention labels 1 through 10 are 10%, 20%, 5%, 12%, 7%, 18%, 25%, 14%, 4% and 28% respectively. Sorting the 10 labels from largest to smallest probability gives the intention label queue [preset intention label 10, preset intention label 7, preset intention label 2, preset intention label 6, preset intention label 8, preset intention label 4, preset intention label 1, preset intention label 5, preset intention label 3, preset intention label 9]. With the preset number set to 5, the first 5 labels are selected from the queue, so the candidate intention labels are preset intention labels 10, 7, 2, 6 and 8. By sorting the preset intention labels, the candidate intention labels can be selected quickly.
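The queue-and-select step in this example can be reproduced in a few lines of Python (the label names are hypothetical placeholders):

```python
# Hypothetical output probabilities from the example above.
label_probs = {
    "label_1": 0.10, "label_2": 0.20, "label_3": 0.05, "label_4": 0.12,
    "label_5": 0.07, "label_6": 0.18, "label_7": 0.25, "label_8": 0.14,
    "label_9": 0.04, "label_10": 0.28,
}

# Sort into an intent-label queue from largest to smallest output probability,
# then take the preset number (5 here) of candidate labels from the front.
PRESET_NUMBER = 5
queue = sorted(label_probs, key=label_probs.get, reverse=True)
candidates = queue[:PRESET_NUMBER]
print(candidates)  # ['label_10', 'label_7', 'label_2', 'label_6', 'label_8']
```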
And step S103, determining the conversation success rate of each intention node in a preset intention knowledge graph, wherein the preset intention knowledge graph is generated according to historical conversation data.
The preset intention knowledge graph is generated from historical dialogue data. Specifically, all historical dialogue data are collected and classified, and related dialogue data are linked to each other to obtain the preset intention knowledge graph.
In one embodiment, a preset intention knowledge graph is obtained, wherein the preset intention knowledge graph is generated according to historical dialogue data; and taking the success rate of the intention node corresponding to each intention node in the preset intention knowledge graph as the conversation success rate of each intention node. The conversation success rate of each intention node can be accurately determined through the preset intention knowledge graph.
In an embodiment, as shown in fig. 3, step S103 includes sub-steps S1031 to S1034.
And a substep S1031, obtaining a plurality of flow paths of each intention node from the preset intention knowledge graph and counting the number of flow paths, wherein each flow path comprises a plurality of intention nodes.
Illustratively, as shown in fig. 4, the preset intention knowledge graph includes intention nodes a, b, c, d, e, f and g. The flow paths of intention node a include a→b→c, a→b→e→f, a→b→e→g and a→d→g; the flow paths of intention node b include b→c, b→e→f and b→e→g; the flow paths of intention node e include e→f and e→g; the flow path of intention node d is d→g; and intention nodes c, f and g have no flow paths.
Illustratively, as shown in fig. 4, intention node a has 5 flow paths, intention node b has 4, intention node c has 0, intention node d has 1, intention node e has 3, intention node f has 0 and intention node g has 0.
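A minimal sketch of enumerating the flow paths of a node, assuming a reconstructed edge set for fig. 4. The exact edges are not fully recoverable from the translated text, so the per-node path counts under this assumed graph differ slightly from the example's:

```python
# Assumed adjacency for fig. 4 (an approximation, not the patent's exact graph).
EDGES = {"a": ["b", "d"], "b": ["c", "e"], "c": [], "d": ["g"],
         "e": ["f", "g"], "f": [], "g": []}

def flow_paths(node):
    """Enumerate every flow path from `node` down to a leaf intent node."""
    if not EDGES[node]:
        return []  # leaf nodes have no flow paths of their own
    paths = []
    for nxt in EDGES[node]:
        tails = flow_paths(nxt)
        if tails:
            paths.extend([[node] + t for t in tails])
        else:
            paths.append([node, nxt])
    return paths

print(len(flow_paths("a")))  # 4 paths under this assumed edge set
print(flow_paths("d"))       # [['d', 'g']]
```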
And a substep S1032, determining each flow path whose last intention node carries the preset attribute identifier as a successful flow path.
Wherein the attribute identifier of an intention node is a keyword set according to the actual situation, such as a time, place or event keyword.
In an embodiment, a flow path whose last intention node carries the preset attribute identifier is determined to be a successful flow path. For example, if the preset attribute identifier is a time keyword, then any flow path whose last intention node carries a time identifier is a successful flow path.
And a substep S1033 of counting the number of successful flow paths among the plurality of flow paths of each intention node.
Specifically, the number of successful flow paths among the flow paths of each intention node is determined according to the preset intention knowledge graph.
Illustratively, as shown in fig. 4, when the preset attribute identifier is g, the preset intention knowledge graph is queried for the number of flow paths of each intention node whose last node carries the identifier g: 2 for intention node a, 1 for intention node b, 0 for intention node c, 1 for intention node d, 1 for intention node e, 0 for intention node f and 1 for intention node g. The flow success counts are therefore 2 for intention node a, 1 for intention node b, 0 for intention node c, 1 for intention node d, 1 for intention node e, 0 for intention node f and 1 for intention node g.
And a substep S1034 of calculating, for each intention node, the percentage of successful flow paths among all of its flow paths, and taking the calculated percentage as the conversation success rate of that intention node.
In one embodiment, determining the percentage of the flow success times of each intention node in the corresponding path number; and determining the percentage of the circulation success times of each intention node in the corresponding path number as the conversation success rate of each intention node in the preset intention knowledge graph.
Illustratively, the flow success counts of intention nodes a, b, c, d, e, f and g are 2, 1, 0, 1, 1, 0 and 1 respectively, and their flow path counts are 5, 4, 0, 1, 3, 0 and 0 respectively. The flow success count of intention node a is therefore 40% of its path count, that of intention node b is 25%, that of intention node c is 0%, that of intention node d is 100%, that of intention node e is 33.3%, that of intention node f is 0% and that of intention node g is 100%. Accordingly, the conversation success rates of intention nodes a, b, c, d, e, f and g are determined to be 40%, 25%, 0%, 100%, 33.3%, 0% and 100% respectively.
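Using the counts from the worked example (taken as given), the percentage computation of sub-step S1034 might look like the following sketch. The convention that a node with no flow paths gets a 0% rate is an assumption; note that the example text assigns node g a 100% rate even though it records no flow paths for g, so under this convention g comes out as 0% instead:

```python
# Per-node flow-path counts and flow-success counts from the worked example.
flow_counts    = {"a": 5, "b": 4, "c": 0, "d": 1, "e": 3, "f": 0, "g": 0}
success_counts = {"a": 2, "b": 1, "c": 0, "d": 1, "e": 1, "f": 0, "g": 1}

def conversation_success_rate(node):
    """Flow success count as a percentage of the node's flow-path count
    (nodes with no flow paths are given 0% here, by assumption)."""
    total = flow_counts[node]
    return 100.0 * success_counts[node] / total if total else 0.0

rates = {n: round(conversation_success_rate(n), 1) for n in flow_counts}
# a: 40.0, b: 25.0, c: 0.0, d: 100.0, e: 33.3, f: 0.0
```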
And step S104, determining the conversation success rate of each candidate intention label according to the conversation success rate of each intention node.
Wherein the conversation success rate of a candidate intention label is between 0 and 100 percent, and the larger it is, the higher the probability that a conversation under that candidate intention label succeeds.
In one embodiment, the conversation success rate of each intention node is mapped to the preset intention label corresponding to that node to obtain the conversation success rate of each preset intention label; each candidate intention label is then mapped to its preset intention label, and the conversation success rate of the mapped preset intention label is taken as the conversation success rate of the candidate intention label.
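The two mapping steps can be sketched as dictionary lookups. The label-to-node correspondence below is purely hypothetical, since the patent states only that such a mapping exists, not its concrete form:

```python
# Conversation success rate per intent node (values from the earlier example).
node_rate = {"a": 40.0, "b": 25.0, "c": 0.0, "d": 100.0, "e": 33.3}

# Hypothetical mapping from candidate intent labels to intent nodes.
label_to_node = {"label_10": "a", "label_7": "b", "label_2": "d", "label_6": "e"}

# Success rate of each candidate label = success rate of its mapped node.
candidate_rates = {lbl: node_rate[n] for lbl, n in label_to_node.items()}
```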
And step S105, determining the candidate intention label with the highest conversation success rate as the intention label of the voice data input by the user.
Wherein the target intention label is the intention label closest to the user's actual intention.
In one embodiment, the candidate intention labels are ranked according to the conversation success rate of each candidate intention label to obtain a candidate intention label queue, and the candidate intention label with the highest conversation success rate is selected from the candidate intention label queue as the target intention label of the user. Ranking the candidate intention labels by conversation success rate and selecting the label with the highest success rate as the target intention label greatly improves the accuracy of determining the user's intention.
Illustratively, the conversation success rate of the candidate intention label 1 is 50%, the conversation success rate of the candidate intention label 2 is 25%, the conversation success rate of the candidate intention label 3 is 15%, the conversation success rate of the candidate intention label 4 is 60%, and the conversation success rate of the candidate intention label 5 is 40%, the candidate intention labels 1, 2, 3, 4 and 5 are sorted according to the conversation success rate of the candidate intention labels to obtain a candidate intention label queue, [ candidate intention label 4, candidate intention label 1, candidate intention label 5, candidate intention label 2, candidate intention label 3], and the candidate intention label 4 with the highest conversation success rate is selected from the candidate intention label queue as the target intention label of the user.
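The ranking and selection in this example can be sketched as follows; the label names and success rates mirror the example above.

```python
# Conversation success rates of the five candidate intention labels (from the example)
candidate_rates = {
    "label1": 50.0, "label2": 25.0, "label3": 15.0,
    "label4": 60.0, "label5": 40.0,
}

# Sort the candidates by conversation success rate, highest first,
# to obtain the candidate intention label queue
queue = sorted(candidate_rates, key=candidate_rates.get, reverse=True)

# The head of the queue is taken as the user's target intention label
target_label = queue[0]
```

As in the example, the queue comes out as [label4, label1, label5, label2, label3] and label4 is selected as the target intention label.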
In the user intention identification method provided in the above embodiment, text information corresponding to voice data input by a user is acquired and input into a preset intention classification model to obtain the output probabilities of a plurality of preset intention labels used for representing a voice intention; a preset number of candidate intention labels are then determined from the preset intention labels according to the output probability of each preset intention label; the conversation success rate of each intention node in a preset intention knowledge graph is determined; the conversation success rate of each candidate intention label is determined according to the conversation success rate of each intention node; and finally the candidate intention label with the highest conversation success rate is determined as the intention label of the voice data input by the user. Because the method combines the output probabilities of the preset intention labels obtained through the preset intention classification model with the conversation success rate of each intention node in the preset intention knowledge graph, the target intention label of the user can be determined accurately.
Referring to fig. 5, fig. 5 is a schematic block diagram of a user intention recognition apparatus according to an embodiment of the present application.
As shown in fig. 5, the user intention recognition apparatus 200 includes an acquisition module 210, a generation module 220, a filtering module 230, a first determination module 240, a second determination module 250, and a third determination module 260, wherein,
the obtaining module 210 is configured to obtain text information corresponding to voice data input by a user;
the generating module 220 is configured to input the text information into a preset intention classification model, so as to obtain output probabilities of a plurality of preset intention labels for representing the voice intention;
the screening module 230 is configured to determine a preset number of candidate intention tags from the preset intention tags according to the output probability of each preset intention tag;
the first determining module 240 is configured to determine a conversation success rate of each intention node in a preset intention knowledge graph, where the preset intention knowledge graph is generated according to historical conversation data;
the second determining module 250 is configured to determine a dialog success rate of each candidate intention tag according to the dialog success rate of each intention node;
the third determining module 260 is configured to determine the candidate intention label with the highest dialog success rate as the intention label of the voice data input by the user.
In one embodiment, the screening module 230 is further configured to:
sequencing the preset intention labels from large to small according to the output probability to obtain an intention label queue;
and sequentially selecting the preset intention labels from the intention label queue until a preset number of candidate intention labels are obtained.
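The screening performed by this module can be sketched as follows; the label names, probabilities, and the preset number k are hypothetical.

```python
def select_candidates(probabilities: dict, k: int) -> list:
    """Pick the k preset intention labels with the highest output
    probabilities, in descending order (the intention label queue)."""
    queue = sorted(probabilities, key=probabilities.get, reverse=True)
    return queue[:k]

# Hypothetical output probabilities from the preset intention classification model
probs = {"greet": 0.05, "refund": 0.40, "renew": 0.30,
         "cancel": 0.20, "other": 0.05}
candidates = select_candidates(probs, k=3)
```

With these numbers, the three candidate intention labels are "refund", "renew" and "cancel".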
In an embodiment, the first determining module 240 is further configured to:
acquiring the preset intention knowledge graph;
and taking the success rate of the intention node corresponding to each intention node in the preset intention knowledge graph as the conversation success rate of each intention node.
In one embodiment, as shown in fig. 6, the first determining module 240 includes an obtaining sub-module 241, a counting module 242, a determining sub-module 243, and a calculating module 244, wherein:
the obtaining submodule 241 is configured to obtain a plurality of circulation paths of each intention node from the preset intention knowledge graph;
the counting module 242 is configured to count the number of the plurality of circulation paths, where each circulation path includes a plurality of intention nodes;
the determining submodule 243 is configured to determine a flow path in which the attribute identifier of the last intention node is the preset attribute identifier as a flow successful path;
the counting module 242 is further configured to count the number of successful flow paths in the plurality of flow paths of each intention node;
the calculating module 244 is configured to calculate a percentage of the number of successful flow paths in the multiple flow paths of each intention node to the number of all flow paths, and use the calculated percentage as the conversation success rate of each intention node.
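The path counting performed by the counting and calculating modules can be sketched as follows. The attribute identifier value "success" and the example paths are assumptions of this sketch; the method only requires that a flow path count as successful when its last intention node carries the preset attribute identifier.

```python
# Assumed value of the preset attribute identifier marking a successful end node
PRESET_ATTRIBUTE = "success"

def node_success_rate(paths: list, attributes: dict) -> float:
    """Percentage of a node's flow paths whose final intention node
    carries the preset attribute identifier."""
    if not paths:
        return 0.0
    hits = sum(1 for path in paths
               if attributes.get(path[-1]) == PRESET_ATTRIBUTE)
    return 100.0 * hits / len(paths)

# Hypothetical attribute identifiers and flow paths for intention node a
attributes = {"d": "success", "x": "failure"}
paths_of_a = [["a", "b", "d"], ["a", "c", "x"], ["a", "d"], ["a", "x"]]
rate_a = node_success_rate(paths_of_a, attributes)
```

Two of node a's four flow paths end at a node marked successful, so its conversation success rate is 50%.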
In an embodiment, the second determining module 250 is further configured to:
mapping the conversation success rate of each intention node and a preset intention label corresponding to each intention node to obtain the conversation success rate of each preset intention label;
and mapping the candidate intention label and the preset intention label, and taking the conversation success rate of the mapped preset intention label as the conversation success rate of the candidate intention label.
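The two mapping steps above can be sketched as follows; the node-to-label mapping and the rates are hypothetical, and a candidate label with no mapped preset label is given a 0% rate as a convention of this sketch.

```python
# Hypothetical node-level success rates and node-to-label mapping
node_rates = {"a": 40.0, "b": 25.0, "d": 100.0}
node_to_label = {"a": "refund", "b": "renew", "d": "cancel"}

# Step 1: each preset intention label inherits the conversation success
# rate of the intention node it corresponds to
label_rates = {label: node_rates[node] for node, label in node_to_label.items()}

# Step 2: a candidate intention label takes the conversation success
# rate of the preset intention label it maps onto
def candidate_rate(candidate: str) -> float:
    return label_rates.get(candidate, 0.0)
```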
In an embodiment, the generating module 220 is further configured to:
inputting the text information into the vector extraction layer to obtain a plurality of word vectors;
inputting a plurality of word vectors into the time delay neural network layer, and extracting a plurality of word vector characteristics;
inputting a plurality of word vector features into the ReLU layer to obtain a semantic tag vector;
inputting the semantic label vector and the plurality of word vectors into the summation layer to obtain a plurality of preliminary intention label vectors;
inputting a plurality of the preliminary intention label vectors into the recurrent neural network layer to obtain a plurality of candidate intention label vectors;
inputting the candidate intention label vectors into the dropout layer to obtain a plurality of preset intention label vectors;
and inputting the plurality of preset intention label vectors into the Softmax layer to obtain the output probability of each preset intention label.
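The final Softmax step of this pipeline can be sketched in isolation as follows; the label scores are hypothetical and the upstream layers (vector extraction, time delay neural network, recurrent neural network, dropout) are omitted.

```python
import math

def softmax(scores: list) -> list:
    """Numerically stable softmax: turns per-label scores into output
    probabilities that are positive and sum to one."""
    m = max(scores)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four preset intention labels entering the Softmax layer
label_scores = [2.0, 1.0, 0.5, 0.1]
output_probs = softmax(label_scores)
```

The resulting probabilities sum to 1, and the label with the largest score receives the largest output probability.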
The apparatus provided by the above embodiments may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 7.
Referring to fig. 7, fig. 7 is a schematic block diagram illustrating a structure of a computer device according to an embodiment of the present disclosure.
As shown in fig. 7, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a nonvolatile storage medium and an internal memory.
The non-volatile storage medium may store an operating system and a computer program. The computer program includes program instructions that, when executed, cause a processor to perform any of the user intent recognition methods.
The processor is used for providing calculation and control capability and supporting the operation of the whole computer equipment.
The internal memory provides an environment for running a computer program in the non-volatile storage medium, which, when executed by the processor, causes the processor to perform any of the user intention recognition methods.
The network interface is used for communication. Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
It should be understood that the bus is, for example, an I2C (Inter-Integrated Circuit) bus. The Memory may be a Flash chip, a Read-Only Memory (ROM), a magnetic disk, an optical disk, a USB disk, or a removable hard disk. The Processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
Wherein, in one embodiment, the processor is configured to execute a computer program stored in the memory to implement the steps of:
acquiring text information corresponding to voice data input by a user, and inputting the text information into a preset intention classification model to obtain output probabilities of a plurality of preset intention labels for expressing voice intentions;
determining a preset number of candidate intention labels from a plurality of preset intention labels according to the output probability of each preset intention label;
determining a conversation success rate of each intention node in a preset intention knowledge graph, wherein the preset intention knowledge graph is generated according to historical conversation data;
determining the conversation success rate of each candidate intention label according to the conversation success rate of each intention node;
and determining the candidate intention label with the highest conversation success rate as the intention label of the voice data input by the user.
In one embodiment, the processor, when implementing the determining of the preset number of candidate intention tags from the plurality of preset intention tags according to the output probability of each preset intention tag, is configured to implement:
sequencing the preset intention labels from large to small according to the output probability to obtain an intention label queue;
and sequentially selecting the preset intention labels from the intention label queue until a preset number of candidate intention labels are obtained.
In one embodiment, the processor, in implementing the determining the dialog success rate for each intent node in the preset intent knowledge graph, is configured to implement:
acquiring the preset intention knowledge graph;
and taking the success rate of the intention node corresponding to each intention node in the preset intention knowledge graph as the conversation success rate of each intention node.
In one embodiment, the processor, in implementing the determining the dialog success rate for each intent node in the preset intent knowledge graph, is configured to implement:
acquiring a plurality of circulation paths of each intention node from the preset intention knowledge graph, and counting the number of the circulation paths, wherein each circulation path comprises a plurality of intention nodes;
determining a circulation path with the attribute identifier of the last intention node as a preset attribute identifier as a circulation success path;
counting the number of successful circulation paths in a plurality of circulation paths of each intention node;
and calculating the percentage of the number of successful circulation paths in the plurality of circulation paths of each intention node in the number of all circulation paths, and taking the calculated percentage as the conversation success rate of each intention node.
In one embodiment, the processor, in implementing the determining the conversation success rate for each of the candidate intent tags based on the conversation success rate for each of the intent nodes, is configured to implement:
mapping the conversation success rate of each intention node and a preset intention label corresponding to each intention node to obtain the conversation success rate of each preset intention label;
and mapping the candidate intention label and the preset intention label, and taking the conversation success rate of the mapped preset intention label as the conversation success rate of the candidate intention label.
In one embodiment, the preset intention classification model includes a vector extraction layer, a time delay neural network layer, a ReLU layer, a residual network layer, a summation layer, a recurrent neural network layer, a dropout layer, and a Softmax layer; when implementing inputting the text information into the preset intention classification model to obtain the output probabilities of the plurality of preset intention labels, the processor is configured to implement:
inputting the text information into the vector extraction layer to obtain a plurality of word vectors;
inputting a plurality of word vectors into the time delay neural network layer, and extracting a plurality of word vector characteristics;
inputting a plurality of word vector features into the ReLU layer to obtain a semantic tag vector;
inputting the semantic label vector and the plurality of word vectors into the summation layer to obtain a plurality of preliminary intention label vectors;
inputting a plurality of the preliminary intention label vectors into the recurrent neural network layer to obtain a plurality of candidate intention label vectors;
inputting the candidate intention label vectors into the dropout layer to obtain a plurality of preset intention label vectors;
and inputting the plurality of preset intention label vectors into the Softmax layer to obtain the output probability of each preset intention label.
It should be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the computer device may refer to the corresponding process in the foregoing embodiment of the user intention identification method, and details are not described herein again.
Embodiments of the present application also provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, where the computer program includes program instructions, and a method implemented when the program instructions are executed may refer to the embodiments of the user intention identification method in the present application.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments. While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A user intention recognition method, comprising:
acquiring text information corresponding to voice data input by a user, and inputting the text information into a preset intention classification model to obtain output probabilities of a plurality of preset intention labels for expressing voice intentions;
determining a preset number of candidate intention labels from a plurality of preset intention labels according to the output probability of each preset intention label;
determining a conversation success rate of each intention node in a preset intention knowledge graph, wherein the preset intention knowledge graph is generated according to historical conversation data;
determining the conversation success rate of each candidate intention label according to the conversation success rate of each intention node;
and determining the candidate intention label with the highest conversation success rate as the intention label of the voice data input by the user.
2. The method of claim 1, wherein the determining a preset number of candidate intent tags from a plurality of preset intent tags according to the output probability of each preset intent tag comprises:
sequencing the preset intention labels from large to small according to the output probability to obtain an intention label queue;
and sequentially selecting the preset intention labels from the intention label queue until a preset number of candidate intention labels are obtained.
3. The method of claim 1, wherein determining a conversation success rate for each intent node in a preset intent knowledge graph comprises:
acquiring the preset intention knowledge graph;
and taking the success rate of the intention node corresponding to each intention node in the preset intention knowledge graph as the conversation success rate of each intention node.
4. The method of claim 1, wherein determining a conversation success rate for each intent node in a preset intent knowledge graph comprises:
acquiring a plurality of circulation paths of each intention node from the preset intention knowledge graph, and counting the number of the circulation paths, wherein each circulation path comprises a plurality of intention nodes;
determining a circulation path with the attribute identifier of the last intention node as a preset attribute identifier as a circulation success path;
counting the number of successful circulation paths in a plurality of circulation paths of each intention node;
and calculating the percentage of the number of successful circulation paths in the plurality of circulation paths of each intention node in the number of all circulation paths, and taking the calculated percentage as the conversation success rate of each intention node.
5. The method of any of claims 1-4, wherein the determining a conversation success rate for each of the candidate intent tags from the conversation success rate for each of the intent nodes comprises:
mapping the conversation success rate of each intention node and a preset intention label corresponding to each intention node to obtain the conversation success rate of each preset intention label;
and mapping the candidate intention label and the preset intention label, and taking the conversation success rate of the mapped preset intention label as the conversation success rate of the candidate intention label.
6. The method according to any one of claims 1 to 4, wherein the preset intention classification model includes a vector extraction layer, a time delay neural network layer, a ReLU layer, a residual network layer, a summation layer, a recurrent neural network layer, a dropout layer, and a Softmax layer; the step of inputting the text information into a preset intention classification model to obtain output probabilities of a plurality of preset intention labels includes:
inputting the text information into the vector extraction layer to obtain a plurality of word vectors;
inputting a plurality of word vectors into the time delay neural network layer, and extracting a plurality of word vector characteristics;
inputting a plurality of word vector features into the ReLU layer to obtain a semantic tag vector;
inputting the semantic label vector and the plurality of word vectors into the summation layer to obtain a plurality of preliminary intention label vectors;
inputting a plurality of the preliminary intention label vectors into the recurrent neural network layer to obtain a plurality of candidate intention label vectors;
inputting the candidate intention label vectors into the dropout layer to obtain a plurality of preset intention label vectors;
and inputting the plurality of preset intention label vectors into the Softmax layer to obtain the output probability of each preset intention label.
7. A user intention recognition device is characterized by comprising an acquisition module, a generation module, a screening module, a first determination module, a second determination module and a third determination module, wherein:
the acquisition module is used for acquiring text information corresponding to voice data input by a user;
the generating module is used for inputting the text information into a preset intention classification model to obtain output probabilities of a plurality of preset intention labels for representing voice intentions;
the screening module is used for determining a preset number of candidate intention labels from a plurality of preset intention labels according to the output probability of each preset intention label;
the first determining module is used for determining the conversation success rate of each intention node in a preset intention knowledge graph, wherein the preset intention knowledge graph is generated according to historical conversation data;
the second determining module is used for determining the conversation success rate of each candidate intention label according to the conversation success rate of each intention node;
the third determining module is configured to determine the candidate intention label with the highest dialog success rate as the intention label of the voice data input by the user.
8. The apparatus of claim 7, wherein the first determination module comprises an acquisition sub-module, a statistics module, a determination sub-module, and a calculation module, wherein:
the obtaining submodule is used for obtaining a plurality of circulation paths of each intention node from the preset intention knowledge graph;
the counting module is used for counting the number of the circulation paths, wherein each circulation path comprises a plurality of intention nodes;
the determining submodule is used for determining the circulation path of which the attribute identifier of the last intention node is the preset attribute identifier as a circulation success path;
the counting module is further configured to count the number of successful flow paths in the plurality of flow paths of each intention node;
the calculation module is used for calculating the percentage of the number of successful circulation paths in the circulation paths of each intention node in the number of all the circulation paths, and taking the calculated percentage as the conversation success rate of each intention node.
9. A computer arrangement, characterized in that the computer arrangement comprises a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, carries out the steps of the user intent recognition method according to any of claims 1 to 6.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, wherein the computer program, when being executed by a processor, carries out the steps of the user intent recognition method according to any one of claims 1 to 6.
CN202011631344.4A 2020-12-30 2020-12-30 User intention identification method, device, equipment and computer readable storage medium Pending CN112732882A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011631344.4A CN112732882A (en) 2020-12-30 2020-12-30 User intention identification method, device, equipment and computer readable storage medium
PCT/CN2021/084250 WO2022141875A1 (en) 2020-12-30 2021-03-31 User intention recognition method and apparatus, device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011631344.4A CN112732882A (en) 2020-12-30 2020-12-30 User intention identification method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN112732882A true CN112732882A (en) 2021-04-30

Family

ID=75608457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011631344.4A Pending CN112732882A (en) 2020-12-30 2020-12-30 User intention identification method, device, equipment and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN112732882A (en)
WO (1) WO2022141875A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113377969A (en) * 2021-08-16 2021-09-10 中航信移动科技有限公司 Intention recognition data processing system
CN113593533A (en) * 2021-09-10 2021-11-02 平安科技(深圳)有限公司 Flow node skipping method, device, equipment and medium based on intention recognition
CN114860912A (en) * 2022-05-20 2022-08-05 马上消费金融股份有限公司 Data processing method and device, electronic equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9720981B1 (en) * 2016-02-25 2017-08-01 International Business Machines Corporation Multiple instance machine learning for question answering systems
CN110377911B (en) * 2019-07-23 2023-07-21 中国工商银行股份有限公司 Method and device for identifying intention under dialog framework
CN111695352A (en) * 2020-05-28 2020-09-22 平安科技(深圳)有限公司 Grading method and device based on semantic analysis, terminal equipment and storage medium
CN111897935B (en) * 2020-07-30 2023-04-07 中电金信软件有限公司 Knowledge graph-based conversational path selection method and device and computer equipment
CN111949787B (en) * 2020-08-21 2023-04-28 平安国际智慧城市科技股份有限公司 Automatic question-answering method, device, equipment and storage medium based on knowledge graph

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113377969A (en) * 2021-08-16 2021-09-10 中航信移动科技有限公司 Intention recognition data processing system
CN113377969B (en) * 2021-08-16 2021-11-09 中航信移动科技有限公司 Intention recognition data processing system
CN113593533A (en) * 2021-09-10 2021-11-02 平安科技(深圳)有限公司 Flow node skipping method, device, equipment and medium based on intention recognition
WO2023035524A1 (en) * 2021-09-10 2023-03-16 平安科技(深圳)有限公司 Intention recognition-based process node jump method and apparatus, device, and medium
CN113593533B (en) * 2021-09-10 2023-05-02 平安科技(深圳)有限公司 Method, device, equipment and medium for jumping flow node based on intention recognition
CN114860912A (en) * 2022-05-20 2022-08-05 马上消费金融股份有限公司 Data processing method and device, electronic equipment and storage medium
CN114860912B (en) * 2022-05-20 2023-08-29 马上消费金融股份有限公司 Data processing method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2022141875A1 (en) 2022-07-07

Similar Documents

Publication Publication Date Title
CN109299344B (en) Generation method of ranking model, and ranking method, device and equipment of search results
CN108847241B (en) Method for recognizing conference voice as text, electronic device and storage medium
CN112732882A (en) User intention identification method, device, equipment and computer readable storage medium
WO2020163627A1 (en) Systems and methods for machine learning-based multi-intent segmentation and classification
JP2020520492A (en) Document abstract automatic extraction method, device, computer device and storage medium
CN110853626B (en) Bidirectional attention neural network-based dialogue understanding method, device and equipment
CN112395506A (en) Information recommendation method and device, electronic equipment and storage medium
CN108027814B (en) Stop word recognition method and device
CN111274797A (en) Intention recognition method, device and equipment for terminal and storage medium
JP2010537321A (en) Method and system for optimal selection strategy for statistical classification
CN112036168B (en) Event main body recognition model optimization method, device, equipment and readable storage medium
CN111046656A (en) Text processing method and device, electronic equipment and readable storage medium
CN110890088B (en) Voice information feedback method and device, computer equipment and storage medium
CN112183106A (en) Semantic understanding method and device based on phoneme association and deep learning
CN109726386B (en) Word vector model generation method, device and computer readable storage medium
CN105810192A (en) Speech recognition method and system thereof
CN110309252B (en) Natural language processing method and device
CN113569021A (en) Method for user classification, computer device and readable storage medium
CN110874408B (en) Model training method, text recognition device and computing equipment
CN111400340A (en) Natural language processing method and device, computer equipment and storage medium
CN113139368B (en) Text editing method and system
CN111680514A (en) Information processing and model training method, device, equipment and storage medium
CN115357720A (en) Multi-task news classification method and device based on BERT
CN112541357B (en) Entity identification method and device and intelligent equipment
CN114722832A (en) Abstract extraction method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination