CN114625878A - Intention identification method, interactive system and equipment

Info

Publication number: CN114625878A
Application number: CN202210283675.6A
Authority: CN
Inventors: 陈成
Original assignee: Ping An Life Insurance Company of China Ltd
Current assignee: Ping An Life Insurance Company of China Ltd
Priority date: 2022-03-22
Filing date: 2022-03-22
Publication date: 2022-06-14

Abstract

The invention relates to the technical field of human-computer interaction and provides an intention identification method, an interaction system and equipment. The intention identification method comprises: comparing acquired question information of a user with a preset rule; when the question information does not match the preset rule, retrieving the question information with a distributed full-text search engine to obtain a corpus set semantically related to the question information; calculating the similarity between each corpus in the corpus set and the question information with an enhanced sequential inference model; and determining the intention corresponding to the question information according to the similarity. Compared with a traditional classification algorithm, the method overcomes the sensitivity of intention recognition models to data: questions with few or uniform phrasings are resolved by directly hitting a preset rule, which addresses the poor recall of intention recognition. The method improves the speed, efficiency and accuracy of intention recognition, improves the interaction effect of an intention recognition model applied to outbound-call services, and improves the user experience.

Description

Intention identification method, interactive system and equipment

Technical Field

The invention relates to the technical field of human-computer interaction, in particular to an intention identification method, an interaction system and equipment.

Background

In the related art, a commonly used intention recognition model is mainly a single-task algorithm, and the intention recognition model outputs an intention result. Generally, the intention recognition model is sensitive to data, and for an intention branch with a small amount of data, the model is difficult to make a correct prediction, which may result in a reduction in the recall rate of the whole model. For models with more intentions, accuracy may not be ideal.

Disclosure of Invention

The present invention has been made to solve at least one of the technical problems occurring in the related art to some extent.

Therefore, the intention identification method provided by the embodiment of the invention can effectively improve the accuracy and recall rate of intention identification, is beneficial to improving the interaction effect and improves the user experience.

In order to achieve the technical purpose, the technical scheme adopted by the embodiment of the invention comprises the following steps:

in one aspect, an embodiment of the present invention provides an intention identifying method, including:

acquiring problem information of a user, and matching the problem information with a preset rule;

when the problem information is not matched with the preset rule, retrieving the problem information by using a preset distributed full-text search engine to obtain a corpus set related to the problem information semantics;

calculating the similarity between each corpus in the corpus set and the question information by using a pre-trained enhanced sequential reasoning model;

and determining the corresponding intention of the question information according to the similarity.

Further, in an embodiment of the present invention, before the calculating, by using a pre-trained enhanced sequential inference model, a similarity between each corpus in the corpus set and the question information, the method further includes:

similarity prediction is carried out on each corpus by using the distributed full-text search engine to obtain prediction similarity corresponding to each corpus;

and screening out the corpora whose prediction similarity is greater than a first preset threshold to obtain the filtered corpus set.
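The pre-filtering step above can be sketched as follows. This is only an illustration under stated assumptions: the `RecalledCorpus` record, the `prefilter` function, the intention label attached to each corpus, and the threshold value 0.5 are hypothetical and are not specified by the embodiment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecalledCorpus:
    text: str             # corpus text returned by the search engine
    intent: str           # intention label of the corpus (assumed to be stored with it)
    predicted_sim: float  # prediction similarity reported by the search engine


def prefilter(corpora: List[RecalledCorpus], first_threshold: float) -> List[RecalledCorpus]:
    """Keep only corpora whose prediction similarity is strictly greater than the threshold."""
    return [c for c in corpora if c.predicted_sim > first_threshold]


# Hypothetical recall result; the scores are made up for illustration.
recalled = [
    RecalledCorpus("premium payment method", "premium_payment", 0.82),
    RecalledCorpus("policy fee inquiry", "fee_inquiry", 0.64),
    RecalledCorpus("claim process", "claims", 0.31),
]
kept = prefilter(recalled, first_threshold=0.5)
```

Only the corpora in `kept` would then be passed to the enhanced sequential inference model, which reduces the number of pairwise similarity computations.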

Further, in an embodiment of the present invention, the matching the question information with a preset rule includes:

establishing a rule base by utilizing preset intention branches, wherein each intention branch corresponds to a plurality of rules;

extracting the utterance from the question information and matching the utterance against the rules in the rule base;

the intention recognition method further includes:

when the utterance matches any rule in the rule base, directly determining the intention branch corresponding to the matched rule as the intention corresponding to the question information.

Further, in an embodiment of the present invention, the determining the intention corresponding to the question information according to the similarity includes:

comparing the similarity of each corpus with a second preset threshold;

when the similarity is greater than or equal to the second preset threshold, determining the corpus with the highest similarity in the corpus set as the intention corresponding to the question information;

and when the similarity is smaller than the second preset threshold, determining the intention of the question information by using a text classification algorithm.

Further, in one embodiment of the present invention, the determining the intention of the question information by using a text classification algorithm includes:

comparing the question information with a plurality of preset intention branches by using a preset fast text classifier, and calculating the output probability of each intention branch;

and obtaining the intention corresponding to the problem information according to the output probability.

Further, in an embodiment of the present invention, the deriving the intention corresponding to the question information according to the output probability includes:

and when the output probability is greater than or equal to a third preset threshold value, determining the intention branch corresponding to the output probability as the intention.
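The two-stage decision described above (second preset threshold on the model similarity, then a fallback to the text classifier with a third preset threshold) can be sketched as follows. The function name `decide_intent`, the dictionary inputs, and the threshold values are assumptions for illustration; returning `None` when neither threshold is reached is also an assumption, since this passage does not state what happens in that case.

```python
from typing import Dict, Optional


def decide_intent(
    esim_scores: Dict[str, float],
    classifier_probs: Dict[str, float],
    second_threshold: float,
    third_threshold: float,
) -> Optional[str]:
    """Choose an intention from model similarities, falling back to a text classifier."""
    if esim_scores:
        # Corpus with the highest similarity to the question.
        best_corpus = max(esim_scores, key=esim_scores.get)
        if esim_scores[best_corpus] >= second_threshold:
            return best_corpus
    if classifier_probs:
        # Fallback: intention branch with the highest classifier output probability.
        best_branch = max(classifier_probs, key=classifier_probs.get)
        if classifier_probs[best_branch] >= third_threshold:
            return best_branch
    return None  # assumption: no intention is output when both checks fail


# Similarity is high enough, so the best corpus is used directly.
direct = decide_intent(
    {"premium payment method": 0.91, "policy fee inquiry": 0.62}, {}, 0.8, 0.6
)
# Similarity is too low, so the classifier probabilities decide.
fallback = decide_intent(
    {"policy fee inquiry": 0.42}, {"query_fee": 0.7, "consult_policy": 0.3}, 0.8, 0.6
)
```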

Further, in an embodiment of the present invention, the method further includes:

and determining a training set according to the corpus set, and performing similarity training on the enhanced sequential reasoning model through the training set to obtain the trained enhanced sequential reasoning model.

In another aspect, an embodiment of the present invention provides an interactive system, including:

the rule engine module is used for acquiring the problem information of the user and matching the problem information of the user with a preset rule;

the recall module is used for retrieving the question information by utilizing a preset distributed full-text search engine when the question information is not matched with the preset rule to obtain a corpus set related to the question information semantics;

the similarity calculation module is used for calculating the similarity between each corpus in the corpus set and the question information by utilizing a pre-trained enhanced sequential reasoning model and determining the corresponding intention of the question information according to the similarity;

and the interaction module is used for inquiring the reply information corresponding to the question information according to the intention and sending the reply information to the user.

In another aspect, an embodiment of the present invention provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor executes the computer program to implement the intention identifying method of the above-described embodiment.

In another aspect, an embodiment of the present invention provides a computer-readable storage medium, which stores computer-executable instructions, and is characterized in that the computer-executable instructions are used for executing the intention identification method of the above embodiment.

The embodiment of the invention discloses an intention identification method, which comprises: comparing the acquired question information of a user with a preset rule and judging whether the question information matches the preset rule; when the question information does not match the preset rule, retrieving the question information by using a preset distributed full-text search engine to obtain a corpus set semantically related to the question information; and calculating the similarity between each corpus in the corpus set and the question information by using a pre-trained enhanced sequential inference model, and determining the intention corresponding to the question information according to the similarity. Compared with a traditional classification algorithm, the method overcomes the sensitivity of intention recognition models to data. For questions with few or uniform phrasings, the output result is hit directly by comparison with the preset rule, which effectively improves the efficiency of intention recognition and addresses its poor recall. For questions with many phrasings, the distributed full-text search engine screens out and recalls, from a large number of corpora, the corpora close to the user's semantics, and the enhanced sequential inference model then calculates the similarity between each recalled corpus and the user's question to obtain the user's intention. This improves the speed, efficiency and accuracy of intention recognition, improves the interaction effect of the intention recognition model, and improves the user experience.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description is made on the drawings of the embodiments of the present invention or the related technical solutions in the prior art, and it should be understood that the drawings in the following description are only for convenience and clarity of describing some embodiments in the technical solutions of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a schematic diagram of an implementation environment of an intent recognition method according to an embodiment of the present invention;

FIG. 2 is a flowchart of an intent recognition method according to an embodiment of the present invention;

FIG. 3 is a flow chart of the control logic of the intent recognition method of an embodiment of the present invention;

FIG. 4 is a flow chart of rules engine matching according to an embodiment of the present invention;

FIG. 5 is a flow chart of filtering ES recall results according to an embodiment of the present invention;

FIG. 6 is a flow chart of determining user intent based on similarity according to an embodiment of the present invention;

FIG. 7 is a flowchart of an embodiment of the present invention for outputting results with intent based on FastText computation;

FIG. 8 is a flowchart of an interaction method according to an embodiment of the present invention.

Detailed Description

The invention is further described with reference to the drawings and the specific examples. The described embodiments should not be considered as limiting the invention, and all other embodiments obtained by a person skilled in the art without making any inventive step are within the scope of protection of the present invention.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.

Before further detailed description of the embodiments of the present invention, terms and expressions mentioned in the embodiments of the present invention are explained, and the terms and expressions mentioned in the embodiments of the present invention are applied to the following explanations.

1) A rule engine evolved from the inference engine. It is a component embedded in an application program that separates business decisions from application code and allows those decisions to be written with predefined semantic modules. It receives data input, interprets business rules, and makes business decisions according to those rules. Many organizations are moving from the object-oriented business process management paradigm to a service-oriented approach; in fact, services are becoming an essential element of application development.

2) Business Process Execution Language (BPEL) is the de facto standard for orchestrating services, such as those of a rule engine, and for managing the flawless execution of business processes. It has created many opportunities for more flexible and cost-effective management of business processes.

3) A distributed full-text search engine (distributed search engine) divides the whole network into a plurality of autonomous regions according to criteria such as region, topic, or IP address, and sets up a retrieval server in each autonomous region. Elasticsearch (ES for short) is a Lucene-based search server that is used here as the distributed full-text search engine. It provides distributed, multi-user full-text search through a RESTful web interface, is used in cloud computing, supports real-time search, and is stable, reliable, fast, and easy to install and use. As a distributed, highly scalable, real-time search and data analysis engine, it makes large volumes of data searchable, analyzable, and explorable. Making full use of the horizontal scalability of Elasticsearch makes data more valuable in a production environment.

4) Representational State Transfer (REST) refers to a set of architectural constraints and principles, and if an architecture meets the constraints and principles of REST, it is called a RESTful architecture.

5) An Enhanced Sequential Inference Model (ESIM) is a model that combines BiLSTM with an attention mechanism and performs very well in text matching. Text matching is an important basic problem in Natural Language Processing (NLP): a large number of NLP tasks, such as information retrieval, question answering, dialogue systems, and machine translation, can largely be abstracted as text matching problems.

6) The Long Short-Term Memory network (LSTM) is a kind of Recurrent Neural Network (RNN) specially designed to solve the long-term dependency problem of ordinary RNNs. All RNNs take the form of a chain of repeating neural network modules; in a standard RNN, the repeating module has only a very simple structure. Owing to its design, LSTM is well suited to modeling time-series data, such as text data.

7) A Bidirectional Long Short-Term Memory (BiLSTM) model is composed of a forward LSTM and a backward LSTM, and is commonly used to model context information in natural language processing tasks.

8) A fast text classifier (FastText) is a word-vector and text classification tool whose typical application scenario is supervised text classification. It provides a simple and efficient method for text classification and representation learning, with performance comparable to deep learning models at a higher speed. FastText combines several of the most successful ideas in natural language processing and machine learning: representing sentences with bags of words and bags of n-grams, using subword information, and sharing information across classes through a hidden representation.
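The bag-of-n-grams and subword features mentioned in item 8) can be illustrated with a short sketch. The function names `char_ngrams` and `word_ngrams` are hypothetical helpers, not part of the FastText library; the sketch only shows how such features are enumerated and omits the hashing and embedding lookup that a real classifier performs.

```python
from typing import List


def char_ngrams(word: str, n: int) -> List[str]:
    """Return the character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n] for i in range(len(wrapped) - n + 1)]


def word_ngrams(tokens: List[str], n: int) -> List[str]:
    """Return the contiguous word n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


subwords = char_ngrams("where", 3)
bigrams = word_ngrams(["how", "to", "pay", "premium"], 2)
```

Here `subwords` is `["<wh", "whe", "her", "ere", "re>"]` and `bigrams` is `["how to", "to pay", "pay premium"]`. Subword features of this kind let a classifier relate words that share character sequences, even when a word was not seen during training.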

Intention recognition technology is currently applied in search engines, dialogue systems, the intelligent Internet of Things, robots, and other fields. In a dialogue system, for example, intention recognition determines which business service the user wants or whether the user simply wants to chat, and a corresponding model is then used for processing. The more accurate the intention recognition, the more accurate the reply the user receives and the better the user experience.

In the related art, the intention recognition models commonly used in the industry are mainly single-task algorithms that output an intention result for a user's question; for example, during human-computer interaction, intention recognition allows accurate information to be retrieved and the user's question to be answered. Generally, such a model is sensitive to data: for an intention branch with little data, the model may fail to find the branch corresponding to the user's question and struggles to predict accurately, which reduces the recall rate of the whole model. For models with many intentions, accuracy may not be ideal. In current outbound-call services, the number of intention branches is as high as 30, and application scenarios such as querying fees or consulting policies require higher accuracy.

A traditional intention recognition model classifies labelled data against predefined user intentions. For a newly emerging user intention, labelled data must be collected again and the model retrained, which is time-consuming and labor-intensive. Accuracy and recall also hit bottlenecks, and recognition is inefficient when a large amount of data must be recalled. As a result, the intention output often fails to meet business requirements, the human-computer interaction effect is poor, and the user experience suffers.

To solve the problems of the traditional intention recognition model in the related art, namely low accuracy and recall, low recognition efficiency, and intention output that cannot meet business requirements, the embodiment of the invention provides an intention recognition method, an interaction method, a system and equipment. When there are many intention branches, the distributed full-text search engine screens out and recalls, from a large number of corpora, the corpora close to the user's utterance, and the enhanced sequential inference model then calculates the similarity between each recalled corpus and the user's utterance to obtain the user's intention. This improves the speed, efficiency and accuracy of intention recognition, improves the interaction effect of the intention recognition model applied to outbound-call services, and improves the user experience.

Fig. 1 is a schematic diagram of an implementation environment of an intention identification method according to an embodiment of the present invention. Referring to fig. 1, the software and hardware main body of the implementation environment mainly includes an operation terminal 101 and a server 102, and the operation terminal 101 is connected to the server 102 in a communication manner. The intention recognition method may be separately configured to be executed by the operation terminal 101, may also be separately configured to be executed by the server 102, or may be executed based on the interaction between the operation terminal 101 and the server 102, which may be appropriately selected according to the actual application, and this embodiment is not particularly limited thereto.

Specifically, the operation terminal 101 in the embodiment of the present invention may include, but is not limited to, any one or more of a smart watch, a smart phone, a computer, a Personal Digital Assistant (PDA), an intelligent voice interaction device, an intelligent appliance, or a vehicle-mounted terminal. The server 102 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform. The operation terminal 101 and the server 102 may establish a communication connection through a wireless Network or a wired Network, which uses standard communication technologies and/or protocols, and the Network may be set as the internet, or may be any other Network, such as, but not limited to, a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wired, or wireless Network, a private Network, or any combination of virtual private networks.

Fig. 2 is a flowchart of an intention identification method according to an embodiment of the present invention, where an execution subject of the method may be at least one of an operation terminal 101 or a server 102, and fig. 2 illustrates an example of the intention identification method being configured in the operation terminal.

Referring to fig. 2, the intention identifying method of the embodiment includes, but is not limited to, steps S100 to S400.

Step S100, matching the acquired question information with a preset rule.

It can be understood that, to identify the user's intention, the embodiment acquires the user's question information and compares it with the preset rule to judge whether the two match. The question information can be understood as the user's utterance, which carries the purpose of the user's speech, namely what the user wants to express and what the user wants to do. Taking a smart phone as an example, the user can input a question by voice or text; for instance, "premium" is input by voice and converted into text, the text is then matched against the preset rule, and rule matching determines whether a user intention is hit.

In some embodiments, the matching of the question information with the preset rule may be implemented by a rule engine. Specifically, rules corresponding to the intention branches may be formulated in the rule engine, so that the rule engine has a rule base against which the acquired question information is matched. It can be appreciated that when the question information matches a rule in the rule engine, the intention branch corresponding to that rule is matched, which yields the intention of the user's question. For example, the input question 'premium' can be matched to the intention branch 'the fee the applicant should pay when taking out insurance'; once the user's intention is determined, the corresponding answer is looked up and returned, which effectively improves the efficiency of human-computer interaction.

It can be understood that, in the method of the embodiment, rule matching is performed by the rule engine, so question information with few or relatively uniform phrasings can directly hit the corresponding intention branch and the intention recognition result can be output quickly. This addresses the poor recall of traditional classification algorithms on intention branches with little data and greatly improves the overall speed and efficiency of intention recognition. Since the user's intention is obtained directly once rule matching succeeds, the flow of the intention identification method ends at that point; that is, the subsequent steps S200 to S400 need not be executed.

It should be noted that the method of the embodiment is applied to a human-computer interaction system. Most business processes include a plurality of decision points at which certain conditions are evaluated, and the business processes modify their behavior according to these conditions, or business rules. In effect, the business rules drive the business process. Such rules are typically embedded within the business process itself or within custom Java code. With a rule engine, the business rules are separated from the business processes: rules are exposed as services, and BPEL processes use these services by querying the engine when a decision point is reached. This is more flexible, and rules can be manipulated graphically rather than encoded in a programming language or within a program. Business users can write rules themselves with the tool, and rules can be changed after deployment without the assistance of IT personnel. Since most updates and functional enhancements are performed by business users, maintenance costs can be significantly reduced.

Step S200, when the question information does not match the preset rule, retrieving the question information with a distributed full-text search engine to obtain a corpus set semantically related to the question information.

It can be understood that when the question information does not match the preset rule, the rule base has no intention branch matching the question information. The matching process of the rule engine then ends, and the question information is retrieved with the preset distributed full-text search engine. In other words, an intention branch with little data can be identified quickly by the rule engine, whereas when there are many intention branches and the rule engine finds no match, recognition continues with the distributed full-text search engine, which recalls utterances close in meaning to the user's utterance. The rule engine and the distributed full-text search engine are thus combined to identify the intention.

Specifically, the embodiment retrieves the question information through the sharded search of Elasticsearch, which can also be understood as an Elasticsearch recall (ES recall for short). The working principle of Elasticsearch is as follows. Elasticsearch is distributed: an index can be divided into shards, and each shard can have zero or more replicas. Each node hosts one or more shards and can act as a coordinator that delegates operations to the correct shards; rebalancing and routing are done automatically. Related data is typically stored in the same index, which consists of one or more primary shards and zero or more replica shards.
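The coordinator role described above can be pictured as a scatter-gather step: each shard returns its own best hits, and the coordinator merges them into a global result. The following is a simplified sketch, not the actual Elasticsearch implementation; the functions `search_shard` and `coordinate` are hypothetical, and the per-document scores are made-up stand-ins for the relevance scores a real engine would compute.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (relevance score, document text)


def search_shard(shard_docs: Dict[str, float], k: int) -> List[Hit]:
    """Return the local top-k hits of one shard, best first."""
    return heapq.nlargest(k, ((score, doc) for doc, score in shard_docs.items()))


def coordinate(shards: List[Dict[str, float]], k: int) -> List[Hit]:
    """Coordinator: gather each shard's local top-k and merge into a global top-k."""
    partial: List[Hit] = []
    for shard in shards:
        partial.extend(search_shard(shard, k))
    return heapq.nlargest(k, partial)


# Two toy shards holding pre-scored documents (scores are illustrative only).
shards = [
    {"premium payment method": 0.9, "claim process": 0.2},
    {"policy fee inquiry": 0.7, "change of premium": 0.8},
]
top2 = coordinate(shards, k=2)
```

Because every shard returns at most `k` hits, the coordinator merges at most `k` times the number of shards candidates, regardless of how many documents each shard stores.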

It should be noted that Elasticsearch has its own data store and can execute and combine various types of searches, for example searches over both structured and unstructured data, which makes the search mode more flexible. Specifically, keywords of the utterance are input into Elasticsearch, and a corpus set semantically related to the question information is recalled. In the embodiment, the sharded search of Elasticsearch can quickly find, from a large number of corpora, the 100 utterances closest in meaning to the user's utterance. These 100 utterances can be understood as the recalled corpus set. This preliminary screening quickly narrows the search range and reduces the data volume.
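A recall of the 100 closest utterances could be expressed as a search request body like the one built below. This is a sketch under assumptions: the helper `build_recall_query` and the field name `text` are hypothetical, and the embodiment does not disclose its actual index mapping or query. The body uses the standard `match` full-text query together with the `size` parameter and would be sent to the `_search` endpoint of whichever index stores the corpora.

```python
import json
from typing import Any, Dict


def build_recall_query(question: str, size: int = 100, field: str = "text") -> Dict[str, Any]:
    """Build a full-text `match` query body that returns the top `size` corpora."""
    return {
        "size": size,                            # number of hits to recall
        "query": {"match": {field: question}},   # analyzed full-text match on one field
    }


body = build_recall_query("insurance fee")
print(json.dumps(body, indent=2))
```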

Step S300, calculating the similarity between each corpus in the corpus set and the question information with the enhanced sequential inference model.

It can be understood that the sharded search of Elasticsearch quickly screens out and recalls, from a massive corpus, the corpora most similar to the user's utterance, which effectively reduces the amount of computation; the enhanced sequential inference model then calculates the similarity on the recalled data.

It should be noted that the enhanced sequential inference model used in the embodiment is ESIM, a deep semantic similarity model trained in advance. ESIM calculates the similarity between each corpus recalled by ES and the user's question; the higher the similarity, the better the recalled corpus matches the user's utterance and the closer it is to the user's intention. In this way, the similarity between every corpus in the corpus set and the user's utterance can be calculated by ESIM.
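The embodiment does not detail the internals of ESIM. As background, the published ESIM architecture compares two encoded sentences through soft attention alignment followed by an enhancement step that concatenates each vector with its aligned counterpart, their difference, and their element-wise product. The sketch below illustrates only that step in plain Python, assuming the inputs are already-encoded token vectors; the function names and the toy vectors are hypothetical, and the BiLSTM encoders, pooling, and classifier of a full model are omitted.

```python
import math
from typing import List

Vector = List[float]


def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u: Vector, v: Vector) -> float:
    return sum(x * y for x, y in zip(u, v))


def soft_align(a: List[Vector], b: List[Vector]) -> List[Vector]:
    """For each vector in `a`, return the attention-weighted sum of the vectors in `b`."""
    dim = len(b[0])
    aligned = []
    for a_i in a:
        weights = softmax([dot(a_i, b_j) for b_j in b])
        aligned.append([sum(w * b_j[k] for w, b_j in zip(weights, b)) for k in range(dim)])
    return aligned


def enhance(a: List[Vector], a_tilde: List[Vector]) -> List[Vector]:
    """Concatenate [a; a~; a - a~; a * a~] at every position."""
    return [
        a_i + t_i + [x - y for x, y in zip(a_i, t_i)] + [x * y for x, y in zip(a_i, t_i)]
        for a_i, t_i in zip(a, a_tilde)
    ]


# Toy encoded sequences standing in for encoder outputs.
question = [[1.0, 0.0], [0.0, 1.0]]
corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
q_tilde = soft_align(question, corpus)
features = enhance(question, q_tilde)
```

Each row of `features` has four times the input dimension; in a full model these rows would be composed and pooled before a final layer outputs the similarity score.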

In step S400, the intention corresponding to the question information is determined based on the similarity. This can be understood as comparing the similarities calculated by ESIM and outputting the utterance with the highest similarity as the user's intention, which completes the intention recognition.

Referring to fig. 3, fig. 3 is a flow chart of the control logic of an intention identification method according to an embodiment. First, the rule engine performs rule matching on the user's question and judges whether a rule is matched directly. If the matching succeeds, the matched intention is output directly and intention recognition ends; if not, a corpus set close to the user's utterance is recalled through ES, the similarity is calculated by the ESIM model, and the intention output result is determined according to the similarity.
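The control logic of fig. 3 can be sketched as a function that takes the three stages as pluggable callables. Everything in the sketch is an assumption made for illustration: `recognise_intent` and the toy stand-ins are hypothetical, the toy recall uses word overlap instead of ES, and the toy similarity uses Jaccard overlap instead of a trained ESIM model.

```python
from typing import Callable, List, Optional, Tuple

Candidate = Tuple[str, str]  # (corpus text, intention label)


def recognise_intent(
    question: str,
    match_rules: Callable[[str], Optional[str]],
    recall: Callable[[str], List[Candidate]],
    similarity: Callable[[str, str], float],
) -> Optional[str]:
    """Rule matching first; on a miss, recall candidates and pick the most similar one."""
    hit = match_rules(question)
    if hit is not None:
        return hit  # a rule hit ends recognition immediately
    candidates = recall(question)
    if not candidates:
        return None  # assumption: nothing recalled means no intention is output
    best = max(candidates, key=lambda c: similarity(question, c[0]))
    return best[1]


# Toy stand-ins for the rule engine, the ES recall, and the ESIM model.
CORPUS: List[Candidate] = [
    ("how to pay the premium", "premium_payment"),
    ("check my policy fee", "fee_inquiry"),
]


def toy_rules(q: str) -> Optional[str]:
    return "premium_definition" if q.strip().lower() == "premium" else None


def toy_recall(q: str) -> List[Candidate]:
    words = set(q.lower().split())
    return [c for c in CORPUS if words & set(c[0].split())]


def toy_similarity(x: str, y: str) -> float:
    a, b = set(x.lower().split()), set(y.lower().split())
    return len(a & b) / len(a | b)


by_rule = recognise_intent("premium", toy_rules, toy_recall, toy_similarity)
by_model = recognise_intent("how can I pay my premium", toy_rules, toy_recall, toy_similarity)
```

Passing the stages as callables keeps the control flow independent of any particular rule engine, search engine, or similarity model.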

Compared with a traditional classification algorithm, the method overcomes the sensitivity of intention recognition models to data. For intention branches with little data, the result is output by directly hitting a preset rule, which addresses the poor recall of such branches. When there are many intention branches, ES screens out and recalls, from a large number of corpora, the corpora close to the user's utterance, and the ESIM model then calculates the similarity between each recalled corpus and the user's utterance to obtain the user's intention. This improves the speed, efficiency and accuracy of intention recognition, improves the interaction effect of the intention recognition model applied to outbound-call services, and improves the user experience.

For example, if the rule engine finds no match, ES can recall a corpus set close to the user's utterance. ES can search with "insurance" and "fee" as keywords and obtain utterances such as "insurance payment process", "policy fee inquiry", "change of premium" and "premium payment method". Similarity is then calculated with the ESIM model, and finally "policy non-renewal", as the utterance with the highest similarity, is output as the user's intention, and the intention recognition process ends.

It should be noted that the algorithm of the intention identification method provided by the embodiment of the present invention can be divided into the following modules: a rule engine module, a recall module and a similarity calculation module, each with a corresponding function. The rule engine module is used for matching the acquired question information against the preset rules; the recall module is used for retrieving the question information through the ES to obtain a corpus set related to the semantics of the question information; the similarity calculation module is used for calculating the similarity between the question information and each corpus in the corpus set according to the ESIM model and determining the intention corresponding to the question information according to the similarity.

It can be understood that the intention identification method of the embodiment of the invention is realized based on the combination of the rule engine, the ES recall and the ESIM model, and the algorithm has higher accuracy, recall rate and efficiency for the intention identification as a whole, and is suitable for the outbound service with higher requirements on the intention identification accuracy rate and recall rate.
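The cooperation of the three modules can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function names, the type aliases and the toy stand-ins are all hypothetical, and a simple word-overlap score takes the place of the ESIM model.

```python
from typing import Callable, List, Optional

# Hypothetical interfaces for the three modules described above.
RuleEngine = Callable[[str], Optional[str]]            # question -> intention or None
Recall = Callable[[str], List[str]]                    # question -> recalled corpora
Similarity = Callable[[str, List[str]], List[float]]   # question, corpora -> scores


def identify_intent(question: str, rule_engine: RuleEngine,
                    recall: Recall, similarity: Similarity) -> Optional[str]:
    """Rule matching first; on a miss, recall candidates and pick the most similar."""
    intent = rule_engine(question)
    if intent is not None:
        return intent                     # a preset rule was hit directly
    corpora = recall(question)
    if not corpora:
        return None                       # nothing recalled, no intention found
    scores = similarity(question, corpora)
    best = max(range(len(corpora)), key=lambda i: scores[i])
    return corpora[best]


# Toy stand-ins (assumptions for illustration only).
def toy_rule_engine(question: str) -> Optional[str]:
    q = question.lower()
    if "car insurance" in q and "claim" in q:
        return "query_car_insurance_claim"
    return None


def toy_recall(question: str) -> List[str]:
    # A real recall module would query the ES; this returns a fixed set.
    return ["insurance payment process", "policy fee inquiry", "premium payment method"]


def toy_similarity(question: str, corpora: List[str]) -> List[float]:
    # Word-overlap (Jaccard) score standing in for the ESIM model.
    q = set(question.lower().split())
    scores = []
    for corpus in corpora:
        words = set(corpus.lower().split())
        union = q | words
        scores.append(len(q & words) / len(union) if union else 0.0)
    return scores
```

Passing the modules in as callables keeps each one replaceable, which matches the modular division described above.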

In some embodiments, the above step S100 is further described, which specifically includes, but is not limited to, step S110 to step S130. Referring to FIG. 4, FIG. 4 is a flow diagram of rule engine matching of an embodiment.

Step S110, establishing a rule base by utilizing preset intention branches, wherein each intention branch corresponds to a plurality of rules;

step S120, matching the utterance of the question information with the rules in the rule base;

step S130, when the question information is matched with any rule in the rule base, the intention branch corresponding to the matched rule is directly determined as the intention corresponding to the question information.

In step S110, the embodiment establishes a rule base through the rule engine and defines corresponding rules for each intention branch, each intention branch corresponding to a plurality of rules. A preset rule may be understood as a script template corresponding to an intention, and the preset rules comprise a plurality of such templates.

In step S120, matching the utterance of the question information with the rules in the rule base may be implemented by comparing the user's utterance with the preset script templates to determine whether they match. The term "utterance" here may be understood as the way in which the user uses language to express the intended meaning, including means such as speaking pace, cadence, parallelism and imitation; the user's utterance can be extracted from the question information.

It should be noted that each intention branch corresponds to a plurality of script templates, which increases the probability of a matching hit and thereby improves the recall rate and the flexibility of rule matching. Each script template can comprise keywords and entity types, where an entity type can be insurance-type information such as "life insurance" or "car insurance", and a script template is formed by combining entity types and keywords. For example, the script template may be "car insurance?claim", where "?" represents any character; this template can correspond to an intention branch for inquiring about car insurance claim information, and when the user's utterance matches the template, the user's intention of inquiring about car insurance claim information is matched. Of course, an intention branch may correspond to a plurality of script templates, so that intentions such as "the claim settlement flow of car insurance" and "the claim settlement amount of car insurance" can be matched specifically. It is understood that the preset rules can be set according to the requirements of the actual application scenario and are not further limited.

It can be understood that, in step S130, when the user's utterance matches a preset script template, the corresponding intention is hit, so that the recognition result can be output quickly and the intention recognition ends. Of course, when the user's utterance does not match any preset script template, corpora with higher similarity are retrieved based on ES recall; for details, refer to the identification process of steps S200 to S400, which is not described herein again.
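Steps S110 to S130 can be sketched as follows. This is only an illustration under stated assumptions: the function names and intention-branch names are hypothetical, and the "?" wildcard is interpreted here as any run of characters (possibly empty), whereas the text only states that it represents any character.

```python
import re
from typing import Dict, List, Optional


def compile_template(template: str):
    """Turn a script template such as "car insurance?claim" into a regex.

    Assumption: "?" stands for any run of characters; literal parts are escaped.
    """
    parts = [re.escape(part) for part in template.split("?")]
    return re.compile(".*?".join(parts), re.IGNORECASE)


def build_rule_base(branches: Dict[str, List[str]]):
    """S110: each intention branch maps to a plurality of script templates."""
    return [(intent, compile_template(t))
            for intent, templates in branches.items()
            for t in templates]


def match_rules(utterance: str, rule_base) -> Optional[str]:
    """S120/S130: return the branch of the first template that matches, else None."""
    for intent, pattern in rule_base:
        if pattern.search(utterance):
            return intent
    return None


if __name__ == "__main__":
    rule_base = build_rule_base({
        "query_car_insurance_claim": ["car insurance?claim"],
        "query_life_insurance_premium": ["life insurance?premium"],
    })
    print(match_rules("What is the car insurance claim process?", rule_base))
```

A return value of `None` corresponds to the case in which no template is hit and the ES recall of steps S200 to S400 takes over.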

Referring to FIG. 5, FIG. 5 is a flow diagram of filtering ES recall results according to an embodiment. In some embodiments, the process of intent recognition may also include a step of filtering the ES recall result, so the method of the present invention may further include, but is not limited to, steps S210 to S220.

Step S210, carrying out similarity prediction on the corpora in the corpus set by using a distributed full-text search engine to obtain prediction similarity corresponding to each corpus in the corpus set;

step S220, the linguistic data with the predicted similarity larger than the first preset threshold are screened, and a filtered linguistic data set is obtained.

It should be noted that the ES itself ranks results by similarity score. A threshold may be set for the recalled corpora: the similarity scores are compared with the threshold, only the corpora scoring above the threshold are retained, and similarity calculation is then performed on them, which reduces the calculation amount of the model and improves its efficiency.

Specifically, the corpora obtained through ES recall are pre-scored according to their degree of correlation with the user's utterance to obtain a predicted similarity score. For example, in the embodiment, the 100 utterances closest in meaning to the user's utterance are found through ES recall, each of the 100 utterances is given a predicted similarity score, and the recalled utterances are ranked by score. Screening is then performed according to the ranking: the corpora whose predicted similarity score is greater than a first preset threshold are retained to obtain a filtered corpus set, and the corpora whose predicted similarity score is smaller than the first preset threshold are filtered out. In other words, the ES recall narrows the search scope of utterances, and filtering the recall result further reduces the amount of recalled data.

For example, taking 100 points as the full score of the predicted similarity, which indicates that the recalled utterance is closest to the user's utterance, the first preset threshold may be 80 points. Similarity calculation through the ESIM model is performed only for corpora whose predicted similarity score is greater than 80 points, and corpora scoring less than 80 points are discarded, so that a corpus set with higher similarity is obtained. It should be noted that the first preset threshold may be set according to the requirements of the actual application scenario and is not further limited herein.
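The filtering of steps S210 to S220 can be sketched as follows. The sketch assumes that the predicted similarity scores have already been mapped onto the 0 to 100 scale used in the example (raw search-engine relevance scores are generally not on such a scale); the function name `filter_recalled` is hypothetical.

```python
from typing import List, Tuple


def filter_recalled(scored: List[Tuple[str, float]],
                    first_threshold: float = 80.0) -> List[Tuple[str, float]]:
    """Rank recalled corpora by predicted similarity and keep those above the threshold."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    # Strictly greater than the first preset threshold, as in step S220.
    return [(corpus, score) for corpus, score in ranked if score > first_threshold]


if __name__ == "__main__":
    recalled = [
        ("insurance payment process", 95.0),
        ("policy fee inquiry", 80.0),
        ("change of premium", 60.0),
        ("premium payment method", 88.0),
    ]
    print(filter_recalled(recalled))
    # [('insurance payment process', 95.0), ('premium payment method', 88.0)]
```

Only the filtered set is passed on to the ESIM model, which is where the reduction in calculation amount comes from.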

In step S300, similarity calculation is performed on the question information and the filtered corpus set according to the ESIM model, which reduces the calculation amount of the ESIM model and further improves the efficiency of intention recognition. In addition, the ESIM model employs both BiLSTM and an attention mechanism. It can be understood that the ESIM model uses BiLSTM as the basic module of the inference model: BiLSTM first encodes the input words, learning a representation of each token together with its context information, and BiLSTM is then used again for inference composition to construct the final prediction result, since BiLSTM can well characterize local inference information and its influence in context.
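For reference, the attention step that sits between the two BiLSTM stages can be written as below. This is the formulation commonly given for ESIM and is not spelled out in the embodiment; the symbols are assumptions of this sketch, with \bar{a}_i denoting the BiLSTM encoding of the i-th token of the question information and \bar{b}_j that of the j-th token of a recalled corpus.

```latex
% Attention weights between every token pair
e_{ij} = \bar{a}_i^{\top} \bar{b}_j

% Soft alignment: each token is summarized by the relevant tokens of the other sentence
\tilde{a}_i = \sum_{j} \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{ik})} \, \bar{b}_j ,
\qquad
\tilde{b}_j = \sum_{i} \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{kj})} \, \bar{a}_i

% Enhancement of local inference information, fed to the second BiLSTM
m_a = [\bar{a};\ \tilde{a};\ \bar{a} - \tilde{a};\ \bar{a} \odot \tilde{a}],
\qquad
m_b = [\bar{b};\ \tilde{b};\ \bar{b} - \tilde{b};\ \bar{b} \odot \tilde{b}]
```

The difference and element-wise product terms make the agreement or disagreement between aligned tokens explicit before inference composition.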

In some embodiments, the corpus set recalled by the ES is used as a training set, and the ESIM model is trained for similarity on this training set. It can be understood that, because the training set is formed from the corpus set recalled by the ES and does not change with the intention recognition result, the training efficiency of the model can be improved; the trained ESIM model thus obtained improves both the efficiency and the accuracy of the model.

It can be understood that the ESIM model of the embodiment is used for identifying the intention of the user, so the similarity training of the ESIM model is carried out in a question-answering scenario, and the ESIM model therefore has better interaction prediction performance when applied to an interactive system.

In some embodiments, the above step S400 is further described, which specifically includes, but is not limited to, step S410 to step S430. Referring to fig. 6, fig. 6 is a flowchart of determining a user intention according to a similarity according to an embodiment.

Step S410, acquiring the similarity of each corpus in the corpus set through an enhanced sequential reasoning model;

step S420, comparing the similarity with a second preset threshold;

step S430, when the similarity is greater than or equal to the second preset threshold, determining the corpus with the highest similarity in the corpus set as the intention corresponding to the question information.

It can be understood that, when the similarity is calculated through the ESIM model, each corpus in the corpus set is scored, so that each corpus has a corresponding similarity score. The closeness between a recalled utterance and the user's utterance can be judged from the similarity score: the higher the score, the closer the recalled utterance is to the user's utterance. Although the corpus set has been filtered after ES recall and has relatively high similarity, it must be considered that the utterances recalled by the ES do not necessarily match the user's intention accurately.

In order to ensure that the accuracy reaches a higher level, in steps S420 and S430 of the embodiment, which can be understood with reference to fig. 3, the corpus set is further screened by comparing the similarity score of each corpus with a second preset threshold. The second preset threshold can be understood as the lowest similarity score that satisfies the accuracy requirement. When the similarity score is lower than the second preset threshold, the similarity is low and the accuracy of the intention recognition is also low; conversely, when the similarity score is higher than the second preset threshold, the accuracy of the intention recognition is high and the accuracy requirement is met, so that the intention of the user can be obtained from the corpora whose similarity score is greater than the second preset threshold. The actual value of the second preset threshold can be set according to the actual use scenario.

Specifically, all the corpora in the corpus set may be sorted by similarity score, and the comparison yields one of two cases: the similarity score of at least one corpus in the corpus set is higher than the second preset threshold, or the similarity scores of all the corpora are lower than the second preset threshold. In the former case, the corpus set includes the corpus closest to the user's utterance; according to the ordering of the similarity scores, the corpus with the highest similarity score in the corpus set is taken as the intention corresponding to the question information, that is, the intention of the user can be determined by calculating the similarity through the ESIM model.
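The two cases can be sketched as follows. The function name `select_by_similarity` and the threshold values are hypothetical; the scores are assumed to be the similarity scores produced by the ESIM model, and `None` marks the case in which no corpus reaches the second preset threshold.

```python
from typing import List, Optional, Tuple


def select_by_similarity(scored: List[Tuple[str, float]],
                         second_threshold: float) -> Optional[str]:
    """Return the highest-scoring corpus if it reaches the threshold, else None."""
    if not scored:
        return None
    corpus, score = max(scored, key=lambda pair: pair[1])
    # Step S430: accept when the similarity is greater than or equal to the threshold.
    return corpus if score >= second_threshold else None


if __name__ == "__main__":
    scores = [("policy fee inquiry", 0.62), ("premium payment method", 0.91)]
    print(select_by_similarity(scores, second_threshold=0.85))  # premium payment method
    print(select_by_similarity(scores, second_threshold=0.95))  # None
```

Because only the maximum is compared with the threshold, a full sort of the corpus set is not strictly required for this decision.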

The latter case means that the intention of the user cannot be determined from the corpus set. In this case, some embodiments of the present invention may further determine the intention of the user's question through a text classification algorithm; specifically, the embodiments use FastText for further intention identification.

Referring to FIG. 7, FIG. 7 is a flow chart of outputting a result according to the intention calculated by FastText according to an embodiment. Accordingly, the method of the present invention may further include, but is not limited to, step S510 to step S520.

Step S510, when the similarity is smaller than a second preset threshold value, comparing the problem information with a plurality of preset intention branches, and calculating the output probability of each intention branch;

step S520, when the output probability is greater than or equal to the third preset threshold, determining the intention branch corresponding to the output probability as the intention.

It will be appreciated that FastText is a fast text classification algorithm that has two advantages over conventional classification algorithms: FastText speeds up training and testing while maintaining high precision, and FastText does not need pre-trained word vectors because it can train the word vectors itself. When the user intention cannot be determined after the similarity is calculated by the ESIM model, the embodiment uses the FastText algorithm as a fallback: FastText calculates the output probability of each intention branch, and the user intention is then determined according to the output probability.

For example, FastText can learn the types of policies and classify policy-related utterances into the corresponding category; then, when the user asks "what is my policy", FastText can determine that the user is asking a policy-related question.

Specifically, in step S520, the third preset threshold may be understood as the lowest output probability satisfying the similarity requirement. As can be understood in conjunction with fig. 3, when the output probability is greater than or equal to the third preset threshold, the intention branch corresponding to the output probability is determined as the intention; when a plurality of intention branches have output probabilities higher than the third preset threshold, the intention branch with the highest output probability may be selected as the output result. It can be understood that when the output probability is lower than the third preset threshold, the question is regarded as having no obvious intention, and the intention recognition ends.
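The decision of steps S510 to S520 can be sketched as follows. The output probabilities are assumed to come from a text classifier such as FastText; no classifier call is made here, and the function name, branch names and threshold value are hypothetical.

```python
from typing import Dict, Optional


def fallback_intent(probabilities: Dict[str, float],
                    third_threshold: float) -> Optional[str]:
    """Pick the intention branch with the highest output probability.

    Returns None when even the best branch is below the third preset
    threshold, which the text treats as "no obvious intention".
    """
    if not probabilities:
        return None
    intent, prob = max(probabilities.items(), key=lambda item: item[1])
    return intent if prob >= third_threshold else None


if __name__ == "__main__":
    probs = {"policy_inquiry": 0.72, "premium_payment": 0.21, "claim": 0.07}
    print(fallback_intent(probs, third_threshold=0.6))  # policy_inquiry
    print(fallback_intent(probs, third_threshold=0.8))  # None
```

Selecting the maximum first also covers the case in which several branches exceed the threshold, since only the highest one is output.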

Compared with traditional classification algorithms, the provided intention identification algorithm, based on the combination of the rule engine, the ES recall and the ESIM model, overcomes the sensitivity of the model to data and can accurately identify intention branches with little data. The ES can quickly select and recall the corpora most similar to the user's utterance from a massive corpus, and the ESIM model is then used for similarity training, which improves not only the speed and efficiency of the model but also its accuracy; the FastText fallback ensures better overall coverage of the algorithm.

Compared with traditional classification algorithms, the intention identification method of the embodiment of the invention overcomes the sensitivity of intention recognition models to data. When there are few intention branches, the intention can be output by directly hitting a preset rule, which solves the problem of poor recall of intention branches. When there are many intention branches, the corpora close to the user's utterance are screened out of a large number of corpora through ES recall, and similarity calculation is then performed through the ESIM model to obtain the user's intention, which improves the speed, efficiency and accuracy of intention identification. Fallback processing through FastText better ensures the overall coverage of the algorithm. Together, these improve the interaction effect of the intention identification model applied to the outbound service and improve the user experience.

The embodiment of the invention also provides an interaction method, and similarly, the interaction method can be applied to the implementation environment shown in fig. 1. The interaction method may be executed by being configured separately in the operation terminal 101, or by being configured separately in the server 102, or by being executed based on the interaction between the operation terminal 101 and the server 102, and may be selected appropriately according to the actual application, which is not limited in this embodiment.

Referring to fig. 8, a flowchart of an interaction method according to an embodiment of the present invention is shown. In this embodiment, an operation terminal and a server are taken together as the execution subject. The interaction method includes, but is not limited to, steps S610 to S650.

Step S610, obtaining question information of a user;

step S620, directly outputting the intention of the user when the question information is matched with the preset rule, and executing step S650;

step S630, when the question information is not matched with the preset rule, retrieving the question information by using Elasticsearch to obtain a corpus set related to the semantics of the question information;

step S640, calculating the similarity between each corpus in the corpus set and the question information by using the ESIM (enhanced sequential inference model) model, and determining the intention corresponding to the question information according to the similarity;

and step S650, inquiring the reply information corresponding to the question information according to the intention, and sending the reply information to the user.
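Step S650 can be sketched as a lookup from the identified intention to its reply information. The following is only an illustration: the reply table, the function name `respond` and the default reply used when no intention is identified are assumptions of this sketch, and the `identify` argument stands for whatever procedure implements steps S620 to S640.

```python
from typing import Callable, Dict, Optional

# Hypothetical reply table keyed by intention.
REPLIES: Dict[str, str] = {
    "query_car_insurance_claim": "Your car insurance claim can be tracked in the app.",
    "premium payment method": "Premiums can be paid by bank transfer or card.",
}


def respond(question: str,
            identify: Callable[[str], Optional[str]],
            replies: Dict[str, str],
            default_reply: str = "Sorry, could you rephrase the question?") -> str:
    """Identify the intention of the question and return the corresponding reply."""
    intent = identify(question)                 # steps S620 to S640
    # Assumption: an unknown or missing intention falls back to a default reply.
    return replies.get(intent, default_reply)   # step S650


if __name__ == "__main__":
    always_claim = lambda q: "query_car_insurance_claim"  # stand-in identifier
    print(respond("car insurance claim?", always_claim, REPLIES))
```

Keeping the reply table separate from intention identification allows replies to be updated without retraining or changing the recognition steps.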

In the embodiment of the invention, taking the interaction between the operation terminal and the server as an example to realize the interaction method in the invention, the operation terminal at least has the functions of collecting the voice data of a user, sending the voice data to the server, receiving the text data of the target output statement returned by the server, and converting the text data of the target output statement into audio data for output; the server at least has the functions of receiving voice data sent by the operation terminal, recognizing text content of the voice data to obtain input information, inputting the input information into the trained intention recognition model to obtain a target output sentence, and sending the text data of the target output sentence to the operation terminal. Therefore, the operation terminal can send the collected voice data to the server, interactively predict the text content of the voice data through an intention recognition model in the server, output a target output statement, and play the target output statement to a user through the operation terminal, so that man-machine interaction is performed.

The embodiment of the invention also provides an interactive system, which comprises:

the recall module is used for retrieving the problem information by utilizing a preset distributed full-text search engine when the problem information is not matched with a preset rule to obtain a corpus set related to the semantics of the problem information;

the similarity calculation module is used for calculating the similarity between each corpus of the corpus set and the question information through a pre-trained enhanced sequential reasoning model and determining the corresponding intention of the question information according to the similarity;

It can be understood that the contents of the intention identification method of the above embodiment are all applied to the interactive system of the present embodiment, the functions specifically implemented by the interactive system of the present embodiment are the same as the intention identification method of the above embodiment, and the beneficial effects achieved by the interactive system of the present embodiment are also the same as the beneficial effects achieved by the intention identification method of the above embodiment.

The embodiment of the invention also discloses computer equipment, which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein when the processor executes the computer program, the intention identification method of the embodiment shown in fig. 2 to 7 or the interaction method of the embodiment shown in fig. 8 are realized.

An embodiment of the present invention also discloses a computer-readable storage medium, in which a program executable by a processor is stored, and the program executable by the processor is used for implementing the intention identification method of the embodiment shown in fig. 2 to 7 or the interaction method of the embodiment shown in fig. 8 when being executed by the processor.

It is to be understood that the contents of the intent recognition method in the embodiment shown in fig. 2 to 7 or the interaction method in the embodiment shown in fig. 8 are all applicable to the embodiment of the computer-readable storage medium, the functions implemented by the embodiment of the computer-readable storage medium are the same as the intent recognition method in the embodiment shown in fig. 2 to 7 or the interaction method in the embodiment shown in fig. 8, and the advantages achieved are the same as the advantages achieved by the intent recognition method in the embodiment shown in fig. 2 to 7 or the interaction method in the embodiment shown in fig. 8.

In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.

Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the functions and/or features may be integrated in a single physical device and/or software module, or one or more of the functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.


Claims

1. An intent recognition method, comprising:

2. The method according to claim 1, wherein before calculating the similarity between each corpus in the corpus set and the question information by using a pre-trained enhanced sequential inference model, the method further comprises:

similarity prediction is carried out on each corpus and the problem information by using the distributed full-text search engine, and prediction similarity corresponding to each corpus is obtained;

3. The intention recognition method according to claim 1, wherein the matching of the question information with a preset rule comprises:

extracting dialogs from the question information and matching the dialogs with the rules in the rule base;

the intention identifying method further comprises:

4. The intention identification method according to claim 1, wherein the determining the intention corresponding to the question information according to the similarity includes:

comparing the similarity of each corpus with a second preset threshold;

5. The intent recognition method of claim 4, wherein said determining the intent of the question information using a text classification algorithm comprises:

comparing the problem information with a plurality of preset intention branches by using a preset text classifier, and calculating the output probability of each intention branch;

6. The intention identification method according to claim 5, wherein the deriving the intention corresponding to the question information according to the output probability includes:

and when the output probability is greater than or equal to a third preset threshold, determining the intention branch corresponding to the output probability as the intention.

7. The intention recognition method according to claim 1, further comprising:

8. An interactive system, comprising:

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the intent recognition method of any of claims 1-7 when executing the computer program.

10. A computer-readable storage medium storing computer-executable instructions for performing the intent recognition method of any of claims 1-7.