CN116541493A - Interactive response method, device, equipment and storage medium based on intention recognition

Info

Publication number: CN116541493A
Application number: CN202310379954.7A
Authority: CN
Inventors: 陈琦; 吴振宇; 王建明; 肖京
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2023-04-07
Filing date: 2023-04-07
Publication date: 2023-08-04

Abstract

The embodiments of the application provide an interactive response method, device, equipment and storage medium based on intention recognition, belonging to the technical field of artificial intelligence. The method comprises the following steps: acquiring target interaction information and historical interaction information; inputting the target interaction information into an intention recognition model, and determining a target database according to the obtained target intention classification result; acquiring reference corpus information from the target database according to the target interaction information; and inputting the historical interaction information, the target interaction information and the reference corpus information into a trained response generation model to generate target response content. According to the technical scheme of the embodiments, introducing the historical interaction information expands the model training data, so that the intention recognition model has enough training data; meanwhile, a corresponding target database is selected according to the intention of the target interaction information, so that reference corpus information that better matches the user's intention is obtained, the generated target response is more accurate, and the accuracy of the chat robot in acquiring response content is improved.

Description

Interactive response method, device, equipment and storage medium based on intention recognition

Technical Field

The application relates to the technical field of artificial intelligence, in particular to an interactive response method, device, equipment and storage medium based on intention recognition.

Background

Currently, chat robots have become an important tool for enterprise operation, and they are commonly built in one of two ways: with rules or with deep learning. In the rule-based approach, common question-answer pairs are accumulated according to business experience and matched by keywords or fuzzy search, so that common similar questions can be answered. In the deep learning approach, historical text data is used as the corpus for pre-training a deep learning model, and the trained model matches questions of high similarity and ranks the answers. Accurately understanding the needs of users and providing the corresponding information is the primary problem a chat robot must solve. However, the original corpus accumulated from business experience and historical data is often too small to fully cover the requirements of actual business scenarios, in which a large number of long-tail topic questions arise and industry public opinion changes dynamically. Labeled data that is insufficient in quantity and quality can prevent the chat robot from fully understanding the user's question, which in turn leads to problems such as delayed replies because suitable answers and information cannot be provided.

Disclosure of Invention

The embodiment of the application mainly aims to provide an interactive response method, device, equipment and storage medium based on intention recognition, aiming at improving the accuracy of a chat robot in acquiring response contents.

To achieve the above object, a first aspect of an embodiment of the present application proposes an interactive response method based on intent recognition, the method including:

acquiring target interaction information, and acquiring historical interaction information associated with the target interaction information from a historical content database;

inputting the target interaction information into a trained intention recognition model to obtain a target intention classification result;

determining a target database from a plurality of selectable databases according to the target intention classification result, wherein the intention classification result corresponding to each selectable database is different;

acquiring reference corpus information from a target database according to the target interaction information;

and inputting the historical interaction information, the target interaction information and the reference corpus information into a trained response generation model to generate target response content.

In some embodiments, the intent recognition model comprises a pre-trained model, the intent recognition model being trained by:

acquiring a voice sample, and performing voice recognition on the voice sample to obtain a text sample;

obtaining a plurality of language samples, translating each language sample into a target language, and obtaining a plurality of translated samples, wherein the language categories corresponding to each language sample are different;

obtaining a history sample from the history content database according to the text sample, and adding a sample label associated with the text sample to the history sample;

training the pre-training model according to the sample tags, the history samples, the translation samples and the text samples.

In some embodiments, the pre-training model further comprises a word embedding model, the training the pre-training model according to the sample tags, the history samples, the translation samples, and the text samples comprising:

constructing a plurality of word embedding training samples according to the history samples, the translation samples and the text samples;

inputting each word embedding training sample into the word embedding model, and training the word embedding model at least twice according to each word embedding training sample to obtain at least two word embedding data;

determining text vectors according to the average value of at least two word embedding data, and determining first similarity between each text vector and the corresponding sample label;

and training a text conversion matrix according to all the text vectors and the corresponding first similarity.

In some embodiments, the intent recognition model further includes a fine tuning model, the method further comprising, after said training a text conversion matrix based on all of the text vectors and the corresponding first similarity:

obtaining a test sample, wherein the test sample is marked with a test label in advance;

inputting the test sample into the text conversion matrix to obtain a classification label;

and when the second similarity between the test tag and the classification tag meets a preset threshold, determining that the intention recognition model is trained to be converged.

In some embodiments, the inputting the historical interaction information, the target interaction information and the reference corpus information into the trained answer generation model generates target answer content, including:

generating a plurality of selectable response messages according to the historical interaction information, the target interaction information and the reference corpus information;

determining third similarity among the selectable response messages, and sorting a plurality of selectable response messages according to the order of the third similarity from high to low;

and determining the ordered plurality of selectable response messages as the target response content.

In some embodiments, after the determining the ordered plurality of the selectable answer information as the target answer content, the method further comprises:

acquiring response selection information, and determining target response information from the selectable response information according to the response selection information;

and determining the target response information as a positive sample, determining the optional response information which is not selected as the target response information as a negative sample, and saving the positive sample and the negative sample to the target database.

In some embodiments, the acquiring the target interaction information includes:

acquiring inquiry interaction information;

when the query interaction information is text information, adjusting the query interaction information into the target interaction information according to a preset grammar rule;

and/or when the query interaction information is voice information, performing voice recognition on the query interaction information to obtain topic information and keyword information, and combining the topic information and the keyword information into the target interaction information.

To achieve the above object, a second aspect of the embodiments of the present application proposes an interactive response device based on intention recognition, the device comprising:

the information acquisition module is used for acquiring target interaction information and acquiring historical interaction information associated with the target interaction information from the historical content database;

the intention recognition module is used for inputting the target interaction information into a trained intention recognition model to obtain a target intention classification result;

the database selection module is used for determining a target database from a plurality of selectable databases according to the target intention classification result, and the intention classification result corresponding to each selectable database is different from each other;

the corpus acquisition module is used for acquiring reference corpus information from a target database according to the target interaction information;

and the response generation module is used for inputting the historical interaction information, the target interaction information and the reference corpus information into a trained response generation model to generate target response content.

To achieve the above object, a third aspect of the embodiments of the present application proposes an electronic device, which includes a memory and a processor, the memory storing a computer program, the processor implementing the method according to the first aspect when executing the computer program.

To achieve the above object, a fourth aspect of the embodiments of the present application proposes a storage medium, which is a computer-readable storage medium, storing a computer program, which when executed by a processor implements the method described in the first aspect.

According to the interactive response method, device, equipment and storage medium based on intention recognition provided by the embodiments of the application, target interaction information is acquired, and historical interaction information associated with the target interaction information is acquired from a historical content database; the target interaction information is input into a trained intention recognition model to obtain a target intention classification result; a target database is determined from a plurality of selectable databases according to the target intention classification result, wherein the intention classification result corresponding to each selectable database is different; reference corpus information is acquired from the target database according to the target interaction information; and the historical interaction information, the target interaction information and the reference corpus information are input into a trained response generation model to generate target response content. According to the technical scheme of the embodiments, introducing the historical interaction information expands the model training data, so that the intention recognition model has enough training data; meanwhile, a corresponding target database is selected according to the intention of the target interaction information, so that reference corpus information that better matches the user's intention is obtained, the generated target response content is more accurate, and the accuracy of the chat robot in acquiring reply content is improved.

Drawings

FIG. 1 is a flow chart of an interactive response method based on intent recognition provided in one embodiment of the present application;

FIG. 2 is a training flow diagram of a classification model according to another embodiment of the present application;

fig. 3 is a flowchart of step S204 in fig. 2;

fig. 4 is a flowchart of step S304 in fig. 3;

fig. 5 is a flowchart of step S105 in fig. 1;

fig. 6 is a flowchart of step S503 in fig. 5;

fig. 7 is a flowchart of step S105 in fig. 1;

FIG. 8 is a schematic structural diagram of an interactive response device based on intent recognition according to an embodiment of the present application;

fig. 9 is a schematic hardware structure of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.

First, several terms used in this application are explained:

artificial intelligence (artificial intelligence, AI): is a new technical science for researching and developing theories, methods, technologies and application systems for simulating, extending and expanding the intelligence of people; artificial intelligence is a branch of computer science that attempts to understand the nature of intelligence and to produce a new intelligent machine that can react in a manner similar to human intelligence, research in this field including robotics, language recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information process of consciousness and thinking of people. Artificial intelligence is also a theory, method, technique, and application system that utilizes a digital computer or digital computer-controlled machine to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results.

Natural language processing (natural language processing, NLP): NLP is a branch of artificial intelligence and an interdisciplinary field of computer science and linguistics, often referred to as computational linguistics; it is concerned with processing, understanding and applying human languages (e.g., the target language, English, etc.). Natural language processing includes syntactic parsing, semantic analysis, discourse understanding, and the like. Natural language processing is commonly used in technical fields such as machine translation, handwritten and printed character recognition, voice recognition and text-to-speech conversion, information intent recognition, information extraction and filtering, text classification and clustering, and public opinion analysis and opinion mining, and it involves data mining, machine learning, knowledge acquisition, knowledge engineering, artificial intelligence research, linguistic research related to language computation, and the like.

Information extraction (Information Extraction): a text processing technology that extracts factual information of specified types, such as entities, relations and events, from natural language text and outputs it as structured data. Information extraction is a technique for extracting specific information from text data. Text data is made up of specific units, such as sentences, paragraphs and chapters, and text information is made up of smaller specific units, such as words, phrases, sentences and paragraphs, or combinations of these units. Extracting noun phrases, person names, place names and the like from text data is text information extraction; of course, the information extracted by text information extraction technology can be of various types.

Long short-term memory network (Long Short Term Memory, LSTM): a temporal recurrent neural network suited to processing and predicting important events separated by relatively long intervals and delays in a time series; it is one kind of recurrent neural network (Recurrent Neural Network, RNN). An LSTM neural network has a "gate" structure (including an input gate, a forget gate and an output gate) that can remove information from or add information to the cell state (Cell), so that the LSTM neural network is capable of remembering long-term information.

BERT model: a language model published by Google in 2018 that trains deep bidirectional representations by using bidirectional Transformer encoders in all layers. The BERT model combines the advantages of a plurality of natural language processing models and achieves good results in a plurality of natural language processing tasks. In the related art, the model input vector of the BERT model is the sum of a word vector (Token Embedding), a position vector (Position Embedding) and a sentence vector (Segment Embedding). The word vector is the vectorized representation of a word, the position vector represents the position of the word in the text, and the sentence vector represents the order of sentences in the text.

Based on the above, the embodiment of the application provides an interactive response method, device, equipment and storage medium based on intention recognition, aiming at improving the accuracy of a chat robot in acquiring reply content.

The interactive response method and device based on intention recognition, the electronic device and the storage medium provided in the embodiments of the present application are described in detail through the following embodiments; first, the interactive response method based on intent recognition in the embodiments of the present application is described.

The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.

Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

The embodiment of the application provides an interactive response method based on intention recognition, and relates to the technical field of artificial intelligence. The interactive response method based on intention recognition can be applied to a terminal, to a server side, or to software running in the terminal or the server side. In some embodiments, the terminal may be a smart phone, tablet, notebook, desktop computer, etc.; the server side may be configured as an independent physical server, as a server cluster or distributed system formed by a plurality of physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms; the software may be an application or the like that implements the interactive response method based on intention recognition, but is not limited to the above forms.

The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

Fig. 1 is an optional flowchart of an interactive response method based on intent recognition provided in an embodiment of the present application, and the method in fig. 1 may include, but is not limited to, steps S101 to S105.

Step S101, acquiring target interaction information, and acquiring historical interaction information associated with the target interaction information from a historical content database;

step S102, inputting target interaction information into a trained intention recognition model to obtain a target intention classification result;

step S103, determining a target database from a plurality of selectable databases according to target intention classification results, wherein the intention classification results corresponding to each selectable database are different;

step S104, obtaining reference corpus information from a target database according to the target interaction information;

step S105, the historical interaction information, the target interaction information and the reference corpus information are input into the trained response generation model to generate target response content.
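The order of steps S101 to S105 can be sketched as a small pipeline. This is only an illustrative sketch, not the implementation of the application: the class name `ResponsePipeline`, its field names and the toy lambda stand-ins are all assumptions introduced here, and in practice each callable would wrap a trained model or a database query.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ResponsePipeline:
    """Hypothetical wiring of steps S101 to S105; every component is injected."""
    fetch_history: Callable[[str], List[str]]              # S101: history lookup
    classify_intent: Callable[[str], str]                  # S102: intent recognition model
    databases: Dict[str, List[str]]                        # S103: one database per intent
    retrieve: Callable[[str, List[str]], List[str]]        # S104: corpus retrieval
    generate: Callable[[List[str], str, List[str]], str]   # S105: response generation model

    def respond(self, target: str) -> str:
        history = self.fetch_history(target)               # S101
        intent = self.classify_intent(target)              # S102
        target_db = self.databases[intent]                 # S103
        reference = self.retrieve(target, target_db)       # S104
        return self.generate(history, target, reference)   # S105


# Toy stand-ins so the sketch runs end to end; none of them is a real model.
pipeline = ResponsePipeline(
    fetch_history=lambda target: ["earlier question about premiums"],
    classify_intent=lambda target: "insurance" if "policy" in target else "other",
    databases={
        "insurance": ["A policy can be renewed online."],
        "other": ["Please rephrase the question."],
    },
    retrieve=lambda target, db: db[:1],
    generate=lambda history, target, ref: f"{ref[0]} (history items: {len(history)})",
)
reply = pipeline.respond("How do I renew my policy?")
```

Injecting the components keeps the step order fixed while leaving the choice of models and databases open, which matches the way the embodiments leave those choices unrestricted.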

In step S101 of some embodiments, the information entered by the user may be voice information or text information, depending on the specific type of the chat robot. When it is text information, it may be used directly as the target interaction information, or the target interaction information may be obtained from it by simple keyword extraction; when it is voice information, it may first be converted into text information by a simple voice recognition model and then processed in the manner described above for text.
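The handling of text and voice input in step S101 might look like the following minimal sketch. The function names, the stopword list and the `transcribe` callable are hypothetical; the embodiment does not specify a keyword extraction method or a particular voice recognition model, so a simple stopword filter stands in for the former and an injected callable for the latter.

```python
from typing import Callable, Union

# Illustrative stopword list only; a real system would use a proper keyword extractor.
STOPWORDS = {"the", "a", "an", "is", "to", "do", "i", "my", "how"}


def extract_keywords(text: str) -> str:
    """Keep non-stopword tokens in their original order."""
    tokens = [t.strip("?.,!").lower() for t in text.split()]
    return " ".join(t for t in tokens if t and t not in STOPWORDS)


def to_target_information(query: Union[str, bytes],
                          transcribe: Callable[[bytes], str],
                          use_keywords: bool = False) -> str:
    """Text is used directly (or reduced to keywords); audio bytes are transcribed first."""
    text = transcribe(query) if isinstance(query, bytes) else query
    return extract_keywords(text) if use_keywords else text.strip()


# Usage with a dummy transcriber standing in for a voice recognition model.
from_text = to_target_information("How do I renew my policy?", lambda audio: "", use_keywords=True)
from_voice = to_target_information(b"\x00\x01", lambda audio: " claim status ")
```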

It should be noted that the historical content database may store historical chat data of different users. When a user accesses the chat robot for the first time, the historical chat data of reference users may be selected as the historical interaction information according to the user information, in order to enrich the training data; a reference user may be a user with similar attributes such as age and gender, which is not limited in this embodiment. Introducing historical chat data as the historical interaction information provides a reference for the chat habits and wording habits of the user, so that answers can be generated accurately even with little real-time chat data.
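The fallback to reference users on a first visit can be sketched as follows. The data layout, the function name `select_history` and the `max_age_gap` parameter are assumptions made for illustration; the embodiment only says that reference users share similar attributes such as age and gender.

```python
from typing import Dict, List


def select_history(user: Dict, history_db: Dict[str, List[str]],
                   profiles: Dict[str, Dict], max_age_gap: int = 5) -> List[str]:
    """Return the user's own history, or that of similar users on a first visit."""
    own = history_db.get(user["id"], [])
    if own:
        return own
    borrowed: List[str] = []
    for other_id, profile in profiles.items():
        # Similarity rule is hypothetical: same gender and age within max_age_gap years.
        similar = (profile["gender"] == user["gender"]
                   and abs(profile["age"] - user["age"]) <= max_age_gap)
        if other_id != user["id"] and similar:
            borrowed.extend(history_db.get(other_id, []))
    return borrowed


profiles = {"u1": {"age": 30, "gender": "f"}, "u2": {"age": 60, "gender": "f"}}
history_db = {"u1": ["asked about renewal"], "u2": ["asked about pensions"]}
new_user = {"id": "u3", "age": 32, "gender": "f"}
borrowed_history = select_history(new_user, history_db, profiles)
```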

In step S102 of some embodiments, a specific network structure of the intent recognition model may be selected according to actual requirements, for example, a RoBERTa model, an N-Gram model, etc. may be set for extracting information, and an LSTM model, a BERT model, etc. may be set for classification.

It should be noted that the intention categories may be set in advance according to the function of the chat robot and learned by the model during training. For example, for a chat robot in the insurance industry, the intention categories may include insurance questions, finance questions and other questions. After the target interaction information is obtained, the intention recognition model determines which intention category it belongs to, thereby obtaining the target intention classification result.

In step S103 and step S104 of some embodiments, the corpus information stored in the optional database may be periodically updated by a maintainer to ensure that the corpus conforms to the latest chat style, so as to avoid that the chat robot cannot answer correctly when the user uses trending vocabulary, and improve the user experience of the chat robot.

It should be noted that, after determining the target database, entity retrieval may be performed from the target database according to the target interaction information, so as to obtain the reference corpus information.
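Steps S103 and S104 can be illustrated with a minimal sketch in which each intention category maps to its own database and retrieval ranks entries by overlap with the query. The database contents are made-up placeholders, and token overlap is a simple stand-in chosen here; the embodiment speaks of entity retrieval without fixing a concrete algorithm.

```python
from typing import Dict, List

# Placeholder corpora, one per intention category (contents are invented for illustration).
DATABASES: Dict[str, List[str]] = {
    "insurance": ["A claim can be filed within 30 days.", "Premiums are billed monthly."],
    "finance": ["Fund fees are listed in the prospectus."],
    "other": ["Please describe the question in more detail."],
}


def select_database(intent: str) -> List[str]:
    """S103: pick the database that corresponds to the classified intent."""
    return DATABASES.get(intent, DATABASES["other"])


def retrieve_reference(target: str, database: List[str], top_k: int = 1) -> List[str]:
    """S104: rank entries by the number of lowercase tokens shared with the query."""
    query_tokens = set(target.lower().replace("?", "").split())
    scored = [
        (len(query_tokens & set(entry.lower().rstrip(".").split())), entry)
        for entry in database
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in scored[:top_k] if score > 0]


reference = retrieve_reference("How do I file a claim?", select_database("insurance"))
```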

In step S105 of some embodiments, the answer generation model may be a common answer generator, for example, a generator is constructed by using a GPT-3 model and a T5 model as a basic framework, so that it can generate answer content that is easy to understand and read after the historical interaction information, the target interaction information and the reference corpus information are input.

It should be noted that the answer content may include a plurality of pieces of answer information for customer service personnel to choose from, or the pieces of answer information may be scored according to a certain scoring standard and ranked by score, which is not limited in this embodiment.

In steps S101 to S105 illustrated in the embodiment of the present application, target interaction information is acquired, and historical interaction information associated with the target interaction information is acquired from a historical content database; the target interaction information is input into a trained intention recognition model to obtain a target intention classification result; a target database is determined from a plurality of selectable databases according to the target intention classification result, wherein the intention classification result corresponding to each selectable database is different; reference corpus information is acquired from the target database according to the target interaction information; and the historical interaction information, the target interaction information and the reference corpus information are input into the trained response generation model to generate target response content. According to the technical scheme of the embodiment, introducing the historical interaction information expands the model training data, so that the intention recognition model has enough training data; meanwhile, a corresponding target database is selected according to the intention of the target interaction information, so that reference corpus information that better matches the user's intention is obtained, the generated target response content is more accurate, and the accuracy of the chat robot in acquiring reply content is improved.

In some embodiments, the intent recognition model comprises a pre-training model, referring to fig. 2, the training step of the intent recognition model may include, but is not limited to, steps S201 through S204:

step S201, a voice sample is obtained, and voice recognition is carried out on the voice sample to obtain a text sample;

step S202, a plurality of language samples are obtained, each language sample is translated into a target language, a plurality of translated samples are obtained, and the language categories corresponding to each language sample are different;

step S203, obtaining a history sample from a history content database according to the text sample, and adding a sample label associated with the text sample to the history sample;

step S204, training a pre-training model according to the sample label, the history sample, the translation sample and the text sample.

It should be noted that, in order to broaden the input types supported by the intent recognition network, voice samples and text samples may both be used for training. After a voice sample is obtained, it is converted into a text sample by a common voice recognition module, for example a HuBERT pre-trained model, and model training is performed on the text sample. Voice recognition is a technology well known to those skilled in the art and is not further limited here.

It should be noted that, in order to enrich training data, multiple language types of language samples can be adopted, and training is performed through the seq2seq and LSTM frames, so that machine translation is achieved, the multiple language types of language samples are translated into target languages, and multiple translation samples are obtained, so that the trained intention recognition network can support more language types, and the application range of the chat robot is improved.

It should be noted that, the history samples may be the history chat data described in the above embodiment, and by introducing the history chat data, the number of training samples can be effectively increased, and accuracy of the intention recognition network training can be improved.

After the history samples are obtained, the chat data can be expanded by back-transformation and label data can be added, so that training of the intent recognition network has labels to rely on, which facilitates the subsequent similarity calculation.
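The data preparation in steps S201 to S203 can be sketched as below, under stated assumptions. The function `build_training_data` and all four injected callables (`transcribe`, `translate`, `find_history`, `label_of`) are hypothetical names; they stand in for a voice recognition module, a machine translation model, a history database query and a labeling rule, none of which the embodiment pins down.

```python
from typing import Callable, Dict, List, Tuple


def build_training_data(
    voice_samples: List[bytes],
    language_samples: List[Tuple[str, str]],     # (language code, sentence)
    transcribe: Callable[[bytes], str],          # voice -> text
    translate: Callable[[str, str], str],        # (sentence, source language) -> target language
    find_history: Callable[[str], List[str]],    # text sample -> related history samples
    label_of: Callable[[str], str],              # text sample -> sample label
) -> Dict[str, list]:
    """Collect the three sample sets that step S204 trains on."""
    text_samples = [transcribe(v) for v in voice_samples]                         # S201
    translated = [translate(s, lang) for lang, s in language_samples]             # S202
    history = [(h, label_of(t)) for t in text_samples for h in find_history(t)]   # S203
    return {"text": text_samples, "translated": translated, "history": history}


# Dummy components so the sketch runs; real ones would be trained models.
data = build_training_data(
    voice_samples=[b"renew policy"],
    language_samples=[("fr", "bonjour")],
    transcribe=lambda audio: audio.decode("utf-8"),
    translate=lambda sentence, lang: sentence.upper(),
    find_history=lambda text: [text + " (past chat)"],
    label_of=lambda text: "insurance",
)
```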

In some embodiments, the pre-training model further includes a word embedding model, referring to fig. 3, step S204 may include, but is not limited to, steps S301 to S304:

step S301, constructing a plurality of word embedding training samples according to the history samples, the translation samples and the text samples;

step S302, inputting each word embedding training sample into a word embedding model, and training the word embedding model at least twice according to each word embedding training sample to obtain at least two word embedding data;

step S303, determining text vectors according to the average value of at least two word embedded data, and determining first similarity between each text vector and the corresponding sample label;

step S304, training a text conversion matrix according to all the text vectors and the corresponding first similarity.

After the history samples, the translation samples and the text samples are obtained, a piece of text data can be constructed from each sample, and the word embedding model is then trained on the text data using approaches such as N-gram and RoBERTa. In order to increase the diversity of data, each piece of original data is input twice during training to obtain two pieces of word embedding data, and the final result is the average of the two.
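The idea of passing the same input twice and averaging the results can be illustrated with a toy encoder. This sketch makes two assumptions that are not in the source: a random dropout mask is what makes the two passes differ, and a deterministic pseudo-random vector per token stands in for a trained word embedding model.

```python
import random
from typing import List

DIM = 8  # toy embedding size


def token_vector(token: str) -> List[float]:
    """Deterministic toy embedding; a stand-in for a trained word embedding model."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def encode(tokens: List[str], dropout: float, rng: random.Random) -> List[float]:
    """Mean-pool token vectors after applying a random (inverted) dropout mask."""
    pooled = [0.0] * DIM
    for token in tokens:
        for i, value in enumerate(token_vector(token)):
            keep = rng.random() >= dropout
            pooled[i] += value / (1.0 - dropout) if keep else 0.0
    return [p / len(tokens) for p in pooled]


def text_vector(tokens: List[str], dropout: float = 0.1, seed: int = 0) -> List[float]:
    """Encode the same tokens twice with different masks and average the two results."""
    rng = random.Random(seed)
    first = encode(tokens, dropout, rng)
    second = encode(tokens, dropout, rng)
    return [(a + b) / 2.0 for a, b in zip(first, second)]


vector = text_vector(["renew", "policy"])
```

With `dropout=0.0` the two passes coincide and the average equals a single pass, which is a quick way to check that the averaging itself does not distort the representation.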

In the training process, because the history samples have been expanded with sample labels in advance, the semantic first similarity between each text vector and its sample label can be determined once the text vector has been calculated, and the training result can be verified accordingly, so that the trained word embedding model can convert text into a matrix. After the target interaction information is obtained, the similarity between the target interaction information and each preset label is determined through the text conversion matrix, and the intention classification result is determined according to that similarity.
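The averaging and similarity steps described above can be illustrated with a minimal sketch. It is not the word embedding model of this application: `noisy_embed` is a hypothetical stand-in that returns a deterministic vector plus a small perturbation (mimicking the variation between two passes over the same input), cosine similarity is assumed as the similarity measure, and no text conversion matrix is trained here.

```python
import math
import random


def noisy_embed(text, dim=8, seed=0, noise=0.05):
    """Hypothetical stand-in for one pass of a word embedding model.

    The base vector depends only on the text; the perturbation depends on
    `seed`, so two passes over the same text give slightly different vectors.
    """
    base_rng = random.Random(text)
    base = [base_rng.uniform(-1.0, 1.0) for _ in range(dim)]
    noise_rng = random.Random(seed)
    return [b + noise_rng.gauss(0.0, noise) for b in base]


def text_vector(text, passes=2):
    """Input the same text `passes` times and average the embeddings."""
    runs = [noisy_embed(text, seed=i) for i in range(passes)]
    return [sum(values) / passes for values in zip(*runs)]


def cosine(u, v):
    """Cosine similarity, assumed here as the similarity measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(text, label_vectors):
    """Return the label whose vector is most similar to the text vector."""
    vec = text_vector(text)
    scores = {label: cosine(vec, lv) for label, lv in label_vectors.items()}
    return max(scores, key=scores.get), scores
```

In a real system the label vectors would come from the labelled training samples and the embedding function from the trained model; the sketch only shows how the two passes are averaged and how a similarity to each preset label selects the intention class.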

In some embodiments, the intent recognition model further includes a fine tuning model. Referring to fig. 4, after step S304 is performed, the method may further include, but is not limited to, steps S401 to S403:

step S401, a test sample is obtained, and the test sample is marked with a test label in advance;

step S402, inputting a test sample into a text conversion matrix to obtain a classification label;

step S403, when the second similarity between the test label and the classification label meets a preset threshold, determining that the intention recognition model has been trained to convergence.

It should be noted that after training of the intent recognition model is completed, the accuracy of the model needs to be tested. To this end, a test sample annotated in advance with a test label can be obtained. After the test sample is passed through the text conversion matrix to output a classification label, the second similarity between the classification label and the test label is determined; when the second similarity meets the preset threshold, it can be determined that the intent recognition model has been trained to convergence and can be used for subsequent operations.
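A minimal sketch of the convergence check follows. The application does not specify how the second similarity is computed, so the sketch assumes it is the fraction of test samples whose classification label equals the pre-annotated test label; the function names and the default threshold of 0.9 are illustrative assumptions.

```python
def second_similarity(test_labels, classification_labels):
    """Assumed measure: share of samples whose predicted label matches."""
    if len(test_labels) != len(classification_labels):
        raise ValueError("label lists must have the same length")
    if not test_labels:
        return 0.0
    matches = sum(t == c for t, c in zip(test_labels, classification_labels))
    return matches / len(test_labels)


def has_converged(test_labels, classification_labels, threshold=0.9):
    """The model is treated as converged once the similarity meets the threshold."""
    return second_similarity(test_labels, classification_labels) >= threshold
```

If the check fails, training would continue with further samples before the model is used for subsequent operations.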

Referring to fig. 5, in some embodiments, step S105 may include, but is not limited to, steps S501 to S503:

step S501, generating a plurality of selectable response information according to the historical interaction information, the target interaction information and the reference corpus information;

step S502, determining a third similarity for the selectable response information, and sorting the plurality of selectable response information in descending order of the third similarity;

step S503, determining the ordered plurality of selectable response information as the target response content.

It should be noted that the response generation model may be initialized and trained using GPT-3 and T5 as base frameworks, and the training samples may be the history samples, the translation samples and the text samples obtained in the foregoing embodiments, which effectively increases the amount of sample data.

After the training samples are input into the response generation model, agents (seat personnel) manually choose whether to use each piece of response content generated by the model. Response content that is used is recorded as a positive sample, and response content that is not selected is used as a negative sample for repeated training. Among the response content selected for use, it can also be checked whether the first-ranked response content was used; if not, the similarity between the target text and the generated response can be recalculated in the next round of ranking according to the user's clicks, thereby ensuring the accuracy of the information provided and a more standard speaking style in the generated responses.

It should be noted that, with the above model training, the response generation model can generate a plurality of selectable response information after obtaining the historical interaction information, the target interaction information and the reference corpus information. To facilitate user selection, the selectable response information is ordered by the third similarity, with the one having the highest third similarity placed first, and so on, so that the displayed target response content better matches the user's intention and the user experience is improved.
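The ranking step can be sketched as follows. The application does not state how the third similarity is calculated, so the sketch assumes, purely for illustration, a word-overlap (Jaccard) score between each selectable response and the target interaction information; the function names are hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity, an assumed stand-in for the third similarity."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def rank_responses(target, candidates):
    """Sort selectable responses from highest to lowest similarity."""
    scored = [(jaccard(target, c), c) for c in candidates]
    # list.sort is stable, so ties keep their generation order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]
```

The sorted list as a whole then plays the role of the target response content, with the best-scoring candidate displayed first.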

Referring to fig. 6, in some embodiments, after step S503 is performed, the method may further include, but is not limited to, steps S601 to S602:

step S601, acquiring response selection information, and determining target response information from selectable response information according to the response selection information;

step S602, determining the target response information as a positive sample, determining the selectable response information not selected as the target response information as a negative sample, and saving the positive sample and the negative sample to the target database.

It should be noted that, according to the above embodiment, the selectable response information that is selected can be considered to match the user's intention and can therefore be used as a positive sample for subsequent training, whereas the selectable response information that is not selected is used as a negative sample. With this technical solution, the training samples are continuously updated while the chat robot is in use and better reflect the user's intention, so that the accuracy of the model output can be further improved in subsequent training.
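Steps S601 to S602 can be sketched as below. The record layout (`query`, `response`, `label`) and the use of a plain Python list as the target database are assumptions made for illustration only.

```python
def label_feedback(candidates, selected_index):
    """Split the selectable responses into one positive and the negatives."""
    if not 0 <= selected_index < len(candidates):
        raise IndexError("selected_index is out of range")
    positive = candidates[selected_index]
    negatives = [c for i, c in enumerate(candidates) if i != selected_index]
    return positive, negatives


def save_feedback(database, query, candidates, selected_index):
    """Append labelled samples (1 = positive, 0 = negative) to the database."""
    positive, negatives = label_feedback(candidates, selected_index)
    database.append({"query": query, "response": positive, "label": 1})
    for response in negatives:
        database.append({"query": query, "response": response, "label": 0})
    return database
```

Records accumulated in this way could later be read back as additional training samples for the response generation model.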

Referring to fig. 7, in some embodiments, step S105 may include, but is not limited to, steps S701 through S703:

step S701, acquiring inquiry interaction information;

step S702, when the query interaction information is text information, the query interaction information is adjusted to be target interaction information according to a preset grammar rule;

and/or step S703, when the query interaction information is voice information, performing voice recognition on the query interaction information to obtain topic information and keyword information, and combining the topic information and the keyword information into target interaction information.

After the query interaction information is obtained, for example a sentence of speech or a piece of text input by the user to the chat robot, it is processed according to its type. When the query interaction information is text information, the text can be used directly as the target interaction information for subsequent recognition, or it can be suitably preprocessed: text with low relevance can be deleted, or the text can be rearranged according to a preset grammar rule, which improves the accuracy of the reference corpus information matched subsequently. When the query interaction information is voice information, the topic information and keyword information can be extracted after voice recognition to eliminate the influence of background noise, and the two are combined into the target interaction information.
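The two branches can be sketched as follows. The filler-word list stands in for the "preset grammar rule", and the topic and keywords for the voice branch are assumed to be supplied by a separate speech recognizer that is not shown; all names here are hypothetical.

```python
import re

# Hypothetical low-relevance words removed from text queries.
FILLERS = {"um", "uh", "er", "please", "like"}


def normalize_text(query):
    """Text branch: lowercase, tokenize, and drop low-relevance words."""
    tokens = re.findall(r"[\w']+", query.lower())
    return " ".join(t for t in tokens if t not in FILLERS)


def combine_voice(topic, keywords):
    """Voice branch: merge the topic and keywords, dropping duplicates."""
    merged = []
    for word in [topic, *keywords]:
        word = word.strip().lower()
        if word and word not in merged:
            merged.append(word)
    return " ".join(merged)


def to_target_information(kind, query="", topic="", keywords=()):
    """Dispatch on the type of the query interaction information."""
    if kind == "text":
        return normalize_text(query)
    if kind == "voice":
        return combine_voice(topic, keywords)
    raise ValueError(f"unsupported query type: {kind}")
```

For example, `to_target_information("text", query="Um, please RESET my password!")` yields `"reset my password"`, which is then passed on for intention recognition.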

Referring to fig. 8, the embodiment of the present application further provides an interactive response device based on intent recognition, which may implement the interactive response method based on intent recognition, where the interactive response device 800 based on intent recognition includes:

an information obtaining module 810, configured to obtain target interaction information, and obtain history interaction information associated with the target interaction information from a history content database;

the intention recognition module 820 is configured to input the target interaction information into a trained intention recognition model to obtain a target intention classification result;

the database selection module 830 is configured to determine a target database from a plurality of selectable databases according to the target intention classification result, where the intention classification result corresponding to each selectable database is different from each other;

the corpus acquisition module 840 is configured to acquire reference corpus information from the target database according to the target interaction information;

the response generation module 850 is configured to input the historical interaction information, the target interaction information, and the reference corpus information to a trained response generation model to generate target response content.

The specific implementation manner of the interactive response device based on the intention recognition is basically the same as the specific embodiment of the interactive response method based on the intention recognition, and is not described herein.
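How the five modules cooperate can be sketched as a single class. The intention classifier and response generator are injected as plain callables in place of the trained models, history is looked up by a hypothetical `user_id`, and corpus retrieval uses simple word overlap; none of these choices is specified by the application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class InteractiveResponder:
    history_db: Dict[str, List[str]]                        # history content database
    classify_intent: Callable[[str], str]                   # stands in for the intention recognition model
    databases: Dict[str, List[str]]                         # one selectable database per intention class
    generate: Callable[[List[str], str, List[str]], str]    # stands in for the response generation model

    def respond(self, user_id: str, target: str) -> str:
        history = self.history_db.get(user_id, [])          # information obtaining module 810
        intent = self.classify_intent(target)               # intention recognition module 820
        target_db = self.databases[intent]                  # database selection module 830
        words = set(target.lower().split())
        corpus = [entry for entry in target_db              # corpus acquisition module 840
                  if words & set(entry.lower().split())]
        return self.generate(history, target, corpus)       # response generation module 850
```

Because each intention class maps to its own database, the reference corpus is searched only within entries relevant to the recognized intention before the response is generated.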

The embodiment of the application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the interactive response method based on the intention recognition when executing the computer program. The electronic equipment can be any intelligent terminal including a tablet personal computer, a vehicle-mounted computer and the like.

Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic device according to another embodiment, the electronic device includes:

the processor 901 may be implemented by a general purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc. for executing related programs to implement the technical solutions provided by the embodiments of the present application;

the memory 902 may be implemented in the form of a read-only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM). The memory 902 may store an operating system and other application programs. When the technical solutions provided in the embodiments of the present application are implemented by software or firmware, the relevant program code is stored in the memory 902 and invoked by the processor 901 to perform the interactive response method based on intent recognition of the embodiments of the present application;

An input/output interface 903 for inputting and outputting information;

the communication interface 904 is configured to implement communication interaction between the device and other devices, and may implement communication in a wired manner (e.g. USB, network cable, etc.), or in a wireless manner (e.g. mobile network, Wi-Fi, Bluetooth, etc.);

a bus 905 that transfers information between the various components of the device (e.g., the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);

wherein the processor 901, the memory 902, the input/output interface 903 and the communication interface 904 are communicatively coupled to each other within the device via a bus 905.

The embodiment of the application also provides a storage medium, which is a computer readable storage medium, and the storage medium stores a computer program, and the computer program realizes the interactive response method based on the intention recognition when being executed by a processor.

The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

According to the interactive response method, device, equipment and storage medium based on intention recognition provided by the embodiments of the application, target interaction information is acquired, and historical interaction information associated with the target interaction information is acquired from a historical content database; the target interaction information is input into a trained intention recognition model to obtain a target intention classification result; a target database is determined from a plurality of selectable databases according to the target intention classification result, wherein the intention classification results corresponding to the selectable databases are different from each other; reference corpus information is acquired from the target database according to the target interaction information; and the historical interaction information, the target interaction information and the reference corpus information are input into a trained response generation model to generate target response content. According to the technical scheme of the embodiment, the data for model training is expanded by introducing the historical interaction information, so that the intention recognition model is ensured to have enough training data. Meanwhile, a corresponding target database is selected according to the intention of the target interaction information, so that reference corpus information that better accords with the intention of the user is obtained, the generated target response can be more accurate, and the accuracy of the chat robot in acquiring the response content is improved.

The embodiments described herein are intended to describe the technical solutions of the embodiments of the present application more clearly and do not constitute a limitation on the technical solutions provided by the embodiments of the present application. As those skilled in the art will know, with the evolution of technology and the appearance of new application scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.

It will be appreciated by those skilled in the art that the technical solutions shown in the figures do not constitute limitations of the embodiments of the present application, which may include more or fewer steps than shown, may combine certain steps, or may include different steps.

The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.

The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may represent: only A is present, only B is present, or both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of" or similar expressions means any combination of these items, including any combination of single items or plural items. For example, at least one of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the above-described division of units is merely a logical function division, and there may be other division manners in actual implementation. For instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection shown or discussed between components may be an indirect coupling or communication connection via some interfaces, devices or units, and may be in electrical, mechanical or other form.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied, in essence or in whole or in part, in the form of a software product stored in a storage medium, including multiple instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned storage medium includes various media capable of storing programs, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.

The present embodiments are operational with numerous general purpose or special purpose computer device environments or configurations. For example: personal computers, server computers, hand-held or portable electronic devices, tablet electronic devices, multiprocessor devices, microprocessor-based devices, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above devices or electronic devices, and the like. The application may be described in the general context of computer programs, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing electronic devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

The units involved in the embodiments of the present application may be implemented by means of software, or may be implemented by means of hardware, and the described units may also be provided in a processor. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.

It should be noted that although in the above detailed description several modules or units of an electronic device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit, in accordance with embodiments of the present application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.

From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to cause a computing electronic device (which may be a personal computer, a server, a touch terminal, or a network electronic device, etc.) to perform the method according to the embodiments of the present application.

Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.

It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the above embodiment, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present invention, and these equivalent modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

Claims

1. An interactive response method based on intention recognition, characterized in that the method comprises the following steps:

2. The interactive response method based on intention recognition according to claim 1, wherein the intention recognition model is trained by:

3. The intent recognition based interactive response method of claim 2, wherein the pre-training model further comprises a word embedding model, the training the pre-training model based on the sample tags, the history samples, the translation samples, and the text samples comprising:

4. An interactive response method based on intention recognition according to claim 3, wherein said intention recognition model further comprises a fine tuning model, said method further comprising, after said training a text conversion matrix based on all of said text vectors and said corresponding first similarity:

5. The interactive response method based on intention recognition according to claim 1, wherein the inputting the history interactive information, the target interactive information and the reference corpus information into the trained response generation model generates target response content comprises:

6. The intent recognition based interactive response method according to claim 5, wherein after said determining a plurality of said selectable response information in order as said target response content, said method further comprises:

7. The interactive response method based on intention recognition according to any one of claims 1 to 6, wherein the acquiring the target interactive information comprises:

acquiring inquiry interaction information;

8. An interactive response device based on intent recognition, the device comprising:

9. An electronic device comprising a memory storing a computer program and a processor implementing the intent recognition based interactive response method of any one of claims 1 to 7 when the computer program is executed.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the intent recognition based interactive response method of any one of claims 1 to 7.