WO2020155619A1 - Method and apparatus for chatting with machine with sentiment, computer device and storage medium - Google Patents

Method and apparatus for chatting with machine with sentiment, computer device and storage medium

Info

Publication number
WO2020155619A1
WO2020155619A1 PCT/CN2019/103516 CN2019103516W
Authority
WO
WIPO (PCT)
Prior art keywords
response
model
chat
sentence
chat sentence
Application number
PCT/CN2019/103516
Other languages
French (fr)
Chinese (zh)
Inventor
吴壮伟
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2020155619A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/08 Learning methods

Abstract

Disclosed in embodiments of the present application are a method and apparatus for chatting with a machine with sentiment, a computer device and a storage medium. The method comprises the following steps: obtaining a chat sentence input by a user; inputting the chat sentence into a preset response generation model to obtain an initial response output by the response generation model in response to the chat sentence; inputting the initial response into a preset sentiment generation model to obtain at least two sentiment-carrying candidate responses output by the sentiment generation model in response to the initial response; inputting the candidate responses and the chat sentence into a trained deep reinforcement learning network model to obtain a deep reinforcement learning value of each candidate response; and returning the candidate response with the largest deep reinforcement learning value as a response sentence to the chat sentence. A sentiment-carrying reply is returned for the chat sentence input by the user, making machine chat more natural and human-like.

Description

Method and apparatus for chatting with machine with sentiment, computer device and storage medium
[Cross Reference]
This application is based on, and claims priority to, Chinese invention patent application No. 2019100819896, filed on January 28, 2019 and titled "Emotional machine chat method, apparatus, computer device and storage medium".
[Technical Field]
This application relates to the field of artificial intelligence technology, and in particular to a method and apparatus for machine chat with emotion, a computer device and a storage medium.
[Background]
With the development of artificial intelligence technology, chatbots have gradually emerged. A chatbot is a program that simulates human conversation or chat. It can serve practical purposes such as customer service and consultative question answering, and there are also social chatbots that simply chat with people.
Some chatbots are equipped with a natural language processing system, but more of them simply extract keywords from the input sentence and then retrieve an answer from a database based on those keywords. The replies of such chatbots are usually unremarkable and carry no emotion, and their chat pattern is always the same, so people have little interest in chatting with them and the utilization rate of chatbots is low.
[Summary]
This application provides a method and apparatus for machine chat with emotion, a computer device and a storage medium, to solve the problem that chatbot replies are formulaic and carry no emotion.
A machine chat method with emotion includes the following steps:
obtaining a chat sentence input by a user;
inputting the chat sentence into a preset response generation model, and obtaining an initial response output by the response generation model in response to the chat sentence;
inputting the initial response into a preset emotion generation model, and obtaining at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response; inputting the candidate responses and the chat sentence into a trained deep reinforcement learning network model, and obtaining a deep reinforcement learning value of each candidate response;
returning the candidate response with the largest deep reinforcement learning value as a response sentence to the chat sentence.
A machine chat apparatus with emotion includes:
an obtaining module, configured to obtain a chat sentence input by a user;
a generation module, configured to input the chat sentence into a preset response generation model, and obtain an initial response output by the response generation model in response to the chat sentence;
a processing module, configured to input the initial response into a preset emotion generation model, and obtain at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response;
a calculation module, configured to input the candidate responses and the chat sentence into a trained deep reinforcement learning network model, and obtain a deep reinforcement learning value of each candidate response;
an execution module, configured to return the candidate response with the largest deep reinforcement learning value as a response sentence to the chat sentence.
A computer device includes a memory and a processor. The memory stores computer-readable instructions which, when executed by the processor, cause the processor to implement the following steps: obtaining a chat sentence input by a user;
inputting the chat sentence into a preset response generation model, and obtaining an initial response output by the response generation model in response to the chat sentence;
inputting the initial response into a preset emotion generation model, and obtaining at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response;
inputting the candidate responses and the chat sentence into a trained deep reinforcement learning network model, and obtaining a deep reinforcement learning value of each candidate response;
returning the candidate response with the largest deep reinforcement learning value as a response sentence to the chat sentence. A computer-readable storage medium stores computer-readable instructions which, when executed by a processor, implement the following steps:
obtaining a chat sentence input by a user;
inputting the chat sentence into a preset response generation model, and obtaining an initial response output by the response generation model in response to the chat sentence;
inputting the initial response into a preset emotion generation model, and obtaining at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response; inputting the candidate responses and the chat sentence into a trained deep reinforcement learning network model, and obtaining a deep reinforcement learning value of each candidate response;
returning the candidate response with the largest deep reinforcement learning value as a response sentence to the chat sentence.
The beneficial effects of the embodiments of the present application are as follows: a chat sentence input by a user is obtained; the chat sentence is input into a preset response generation model to obtain an initial response output by the response generation model in response to the chat sentence; the initial response is input into a preset emotion generation model to obtain at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response; the candidate responses and the chat sentence are input into a trained deep reinforcement learning network model to obtain a deep reinforcement learning value of each candidate response; and the candidate response with the largest deep reinforcement learning value is returned as a response sentence to the chat sentence. An emotion-carrying reply is returned for the chat sentence input by the user, which makes machine chat more natural and human-like.
[Description of the Drawings]
In order to describe the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those skilled in the art may obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a machine chat method with emotion according to an embodiment of the present application; FIG. 2 is a schematic flowchart of generating an initial response according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of generating an initial response through a question-and-answer knowledge base according to an embodiment of the present application; FIG. 4 is a schematic flowchart of training an emotion generation model according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of training a deep reinforcement learning network according to an embodiment of the present application;
FIG. 6 is a block diagram of the basic structure of a machine chat apparatus with emotion according to an embodiment of the present application;
FIG. 7 is a block diagram of the basic structure of a computer device according to an embodiment of the present application.
[Detailed Description] In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application.
Some of the procedures described in the specification, the claims and the above drawings of this application contain multiple operations that appear in a specific order, but it should be clearly understood that these operations may be performed out of the order in which they appear herein, or in parallel. Operation numbers such as 101 and 102 are only used to distinguish different operations, and the numbers themselves do not represent any order of execution. In addition, these procedures may include more or fewer operations, and these operations may be performed sequentially or in parallel. It should be noted that terms such as "first" and "second" herein are used to distinguish different messages, devices, modules and the like; they do not represent a sequence, nor do they require "first" and "second" to be of different types.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. Based on the embodiments in the present application, all other embodiments obtained by those skilled in the art without creative effort fall within the protection scope of the present application.
Embodiment
Those skilled in the art can understand that the "terminal" and "terminal device" used herein include both devices with a wireless signal receiver, that is, devices having only a wireless signal receiver without transmitting capability, and devices with receiving and transmitting hardware capable of two-way communication over a two-way communication link. Such devices may include: cellular or other communication devices with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, a pager, Internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. The "terminal" and "terminal device" used herein may be portable, transportable, installed in a vehicle (air, sea and/or land), or suitable for and/or configured to operate locally and/or, in a distributed form, at any location on the earth and/or in space. The "terminal" and "terminal device" used herein may also be a communication terminal, an Internet access terminal or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device) and/or a mobile phone with a music/video playback function, or a device such as a smart TV or a set-top box.
The terminal in this embodiment is the aforementioned terminal. Specifically, referring to FIG. 1, FIG. 1 is a schematic diagram of the basic flow of a machine chat method with emotion according to this embodiment. As shown in FIG. 1, a machine chat method with emotion includes the following steps:
S101: obtaining a chat sentence input by a user;
Language information input by the user is acquired through an interactive page on the terminal. The received information may be text information or voice information; voice information is converted into text information by a speech recognition device.
S102: inputting the chat sentence into a preset response generation model, and obtaining an initial response output by the response generation model in response to the chat sentence.
The response generation model may be a trained Seq2Seq model. The specific training process is to prepare a training corpus, that is, to prepare input sequences and corresponding output sequences, input the input sequences into the Seq2Seq model, calculate the probability of the output sequences, and adjust the parameters of the Seq2Seq model so that, over the whole sample set, the probability that each input sequence produces its corresponding output sequence through the Seq2Seq model is maximized. The process of generating the initial response with the Seq2Seq model is as follows: the chat sentence is first vectorized, for example word vectors are obtained by one-hot vocabulary encoding, and input into the Encoder layer, where the Encoder layer is a multi-layer network whose basic neuron unit is a bidirectional LSTM (Long Short-Term Memory) layer; the output state vector of the encoder is input into the Decoder layer, where the Decoder layer is also a multi-layer neural network whose basic neuron unit is a bidirectional LSTM layer; and the final_state vector output by the Decoder layer is input into a Softmax layer to obtain the initial response content with the highest probability.
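For illustration only, a minimal structural sketch of the encoder-decoder arrangement described above is given below, assuming PyTorch; the vocabulary size, layer sizes and the shortcut of decoding from the encoder states in one pass (rather than token by token) are assumptions introduced for the example, not details taken from the application.

```python
import torch
import torch.nn as nn

class Seq2SeqResponder(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=128, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Encoder: multi-layer network whose basic unit is a bidirectional LSTM layer
        self.encoder = nn.LSTM(embed_dim, hidden_dim, num_layers,
                               bidirectional=True, batch_first=True)
        # Decoder: also built from bidirectional LSTM units; it reads the encoder
        # state vectors directly in this sketch
        self.decoder = nn.LSTM(2 * hidden_dim, hidden_dim, num_layers,
                               bidirectional=True, batch_first=True)
        # Softmax layer over the decoder's final state
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)       # word vectors of the chat sentence
        enc_states, _ = self.encoder(embedded)     # encoder state vectors
        dec_states, _ = self.decoder(enc_states)   # decoder states; the last one is the final_state
        logits = self.out(dec_states[:, -1, :])    # project the final state onto the vocabulary
        return torch.softmax(logits, dim=-1)       # probabilities of the response content

# Example call with one 12-token chat sentence (token ids are placeholders).
probs = Seq2SeqResponder()(torch.randint(0, 5000, (1, 12)))
```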
In some embodiments, the machine chat is applied to a question-answering scenario, and the response generation model used is a question-and-answer knowledge base. Through keyword retrieval, an answer to the question contained in the chat sentence input by the user is obtained, and the answer is returned as the initial response.
In some embodiments, the machine chat is used both for casual conversation with the user and for answering the user's questions. The response generation model is selected by first determining whether the scenario is a question-answering scenario; see FIG. 2 for a detailed description.
S103: inputting the initial response into a preset emotion generation model, and obtaining at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response.
The initial response is input into the preset emotion generation model to obtain the candidate responses output by the emotion generation model. The preset emotion generation model contains at least two emotion generation sub-models and can perform an emotional transformation of the initial response, for example converting an initial response with neutral emotion into a response with positive emotion, or into a response with negative emotion.
Each emotion generation sub-model is based on a pre-trained Seq2Seq model: one emotion generation sub-model is one Seq2Seq model and outputs one emotion-carrying candidate response. Because the Seq2Seq models in the preset emotion generation model are trained on different corpora, the emotions they generate differ, and the emotion-carrying candidate responses they output also differ. The initial response is input into each Seq2Seq model in the preset emotion generation model, and candidate responses carrying various emotions are output. It is worth noting that the Seq2Seq models used here for emotion generation are different from the aforementioned Seq2Seq model used for generating the initial response; see FIG. 4 for the specific training process of a Seq2Seq model used for emotion generation.
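As a small illustration of how the preset emotion generation model could pass the initial response through each of its sub-models, the sketch below assumes every sub-model exposes a rewrite() method and uses placeholder emotion labels; neither detail is specified in the application.

```python
def generate_candidates(initial_response, emotion_submodels):
    """Return one emotion-carrying candidate response per emotion generation sub-model."""
    return [model.rewrite(initial_response) for model in emotion_submodels.values()]

# e.g. emotion_submodels = {"positive": seq2seq_positive, "negative": seq2seq_negative}
```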
S104: inputting the candidate responses and the chat sentence into a trained deep reinforcement learning network model, and obtaining a deep reinforcement learning value of each candidate response.
The generated candidate responses and the chat sentence input by the user are both input into the trained deep reinforcement learning network model to obtain the deep reinforcement learning value of each candidate response. The deep reinforcement learning network combines the perception capability of a deep learning network with the decision-making capability of a reinforcement learning network, and decides which candidate response to adopt by calculating the reinforcement learning value of each candidate response. The deep reinforcement learning network is trained with the following loss function:
L(w) = E[(Q - Q̂)²]
where w denotes the network parameters of the deep reinforcement learning network, Q is the true deep reinforcement learning value, and Q̂ is the deep reinforcement learning value predicted by the deep reinforcement learning network.
The training process of the deep reinforcement learning network is as follows: training samples are prepared, each of which contains an input chat sentence, the candidate responses corresponding to the chat sentence, and the deep reinforcement learning value of each candidate response. The deep reinforcement learning values are labelled according to preset rules; for example, when a candidate response to a chat sentence causes the user to end the conversation directly, the value of that candidate response is labelled low, and when a candidate response to a chat sentence causes a positive change in the emotion of the chat sentence the user inputs in the next round, the value of that candidate response is labelled high.
The training samples are input into the deep reinforcement learning network model to obtain the deep reinforcement learning values predicted by the model; the predicted values and the actual labelled values of the samples are substituted into the above loss function L(w), and the network parameters of the deep reinforcement learning network model are adjusted until L(w) is minimized.
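A minimal sketch of one such parameter update follows, assuming PyTorch and assuming q_network maps an encoded (chat sentence, candidate response) pair to a predicted value; the function and argument names are illustrative, not taken from the application.

```python
import torch

def train_step(q_network, optimizer, encoded_pairs, labelled_q_values):
    """One gradient update on the squared error between labelled and predicted values."""
    q_pred = q_network(encoded_pairs)                     # predicted deep RL values
    loss = torch.mean((labelled_q_values - q_pred) ** 2)  # L(w) = E[(Q - Q_hat)^2]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```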
S105: returning the candidate response with the largest deep reinforcement learning value as the response sentence to the chat sentence. The candidate response with the largest deep reinforcement learning value is regarded as the most suitable reply to the chat sentence currently input by the user. The response sentence is returned to the client terminal and displayed as text on the terminal screen; alternatively, the text information may first be converted into audio and output as speech through the audio output device of the terminal.
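At chat time, this selection step can be sketched as scoring each candidate with the trained network and taking the one with the largest value; encode() is an assumed helper that builds the network input from the chat sentence and a candidate response.

```python
def select_response(q_network, encode, chat_sentence, candidates):
    """Return the candidate response with the largest predicted deep reinforcement learning value."""
    values = [float(q_network(encode(chat_sentence, cand))) for cand in candidates]
    best_index = max(range(len(candidates)), key=lambda i: values[i])
    return candidates[best_index]
```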
As shown in FIG. 2, the preset response generation model contains M response generation sub-models, where M is a positive integer greater than 1. The step of inputting the chat sentence into the preset response generation model and obtaining the initial response includes the following steps:
S111: inputting the chat sentence into a preset scene recognition model, and obtaining a scene output by the scene recognition model in response to the chat sentence.
When machine chat is applied in multiple scenarios, for example both question-answering and non-question-answering scenarios, first recognizing the scene and then determining the corresponding response generation sub-model according to the scene makes the generated response more targeted.
The scene recognition model may determine, based on keywords, whether the scene is a question-answering scene or a non-question-answering scene, for example by checking whether the input chat sentence contains interrogative keywords such as "?", "what", "how much", "where" or "how". A regular-expression matching algorithm may also be used to determine whether the input chat sentence is an interrogative sentence. A regular expression is a logical formula for operating on character strings: predefined specific characters, and combinations of those characters, form a "rule string", and this "rule string" expresses a filtering logic applied to character strings.
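As a hedged illustration of such a check, the sketch below puts a question mark and a few common Chinese interrogative words into one pattern; the actual keyword set and regular expression used are not given in the application.

```python
import re

# Illustrative pattern only: question marks plus a few interrogative words.
QUESTION_PATTERN = re.compile(r"[?？]|什么|多少|哪里|怎么")

def is_question_scene(chat_sentence: str) -> bool:
    """True means a question-answering scene; False means a non-question-answering scene."""
    return bool(QUESTION_PATTERN.search(chat_sentence))
```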
When the input chat sentence is not an interrogative sentence, the scene is determined to be a non-question-answering scene. After recognizing whether the scene is a question-answering scene, the scenes may be further subdivided; for example, non-question-answering scenes may be subdivided into casual chat, praise and complaints, and question-answering scenes may be subdivided into pre-sales consultation, after-sales service and so on. The subdivided scene can be determined through preset keyword lists: a keyword list is preset for each type of subdivided scene, and when a keyword extracted from the input chat sentence matches a word in the keyword list corresponding to a certain subdivided scene, the input chat sentence is considered to correspond to that subdivided scene; see the sketch below.
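A toy sketch of that keyword-list lookup follows; the scene names and keyword lists are placeholders invented for the example, not the preset lists referred to above.

```python
# Placeholder keyword lists; a real system would use the preset list for each subdivided scene.
SCENE_KEYWORDS = {
    "pre_sales_consultation": ["价格", "优惠", "购买"],
    "after_sales_service": ["退货", "维修", "保修"],
}

def fine_grained_scene(keywords):
    """Return the subdivided scene whose keyword list contains one of the extracted keywords."""
    for scene, scene_words in SCENE_KEYWORDS.items():
        if any(word in scene_words for word in keywords):
            return scene
    return None
```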
在一些实施方式中通过预先训练的 LSTM-CNN神经网络模型进行场景识别。 具体地, 对输入的内容, 首先进行中文分词, 采用基本分词库, 依次进入去除 停用词、 标点符号等、 通过词向量模型获得词嵌入向量, 传入基于 LSTM-CNN 的神经网络模型。 即词嵌入向量, 进入多层 LSTM神经单元, 得到各个阶段的 状态向量和输出; 然后, 基于各个阶段的状态向量, 进行卷积操作和池化操作 (CNN) , 得到综合向量指标; 然后将综合向量指标输入 softmax函数, 得到对 应的场景的概率。 取概率最高的场景为输入聊天语句对应的场景。 In some embodiments, a pre-trained LSTM-CNN neural network model is used for scene recognition. Specifically, for the input content, the Chinese word segmentation is performed first, and the basic word segmentation database is used, and the stop words, punctuation marks, etc. are removed sequentially, and the word embedding vector is obtained through the word vector model, and then passed into the neural network model based on LSTM-CNN. That is, the word embedding vector enters the multi-layer LSTM neural unit to obtain the state vector and output of each stage; then, based on the state vector of each stage, perform convolution and pooling operations (CNN) to obtain the integrated vector index; then integrate The vector index is input into the softmax function to obtain the probability of the corresponding scene. The scene with the highest probability is selected as the scene corresponding to the input chat sentence.
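The following is a structural sketch of such an LSTM-CNN classifier, assuming PyTorch; the layer sizes, kernel size and number of scenes are illustrative assumptions. Word embedding vectors pass through stacked LSTM layers, the per-step state vectors are convolved and pooled, and a softmax gives the probability of each scene.

```python
import torch
import torch.nn as nn

class LstmCnnSceneClassifier(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=128, hidden_dim=128,
                 num_scenes=4, kernel_size=3, channels=64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        self.conv = nn.Conv1d(hidden_dim, channels, kernel_size)  # convolution over time steps
        self.pool = nn.AdaptiveMaxPool1d(1)                       # pooling to one combined vector
        self.fc = nn.Linear(channels, num_scenes)

    def forward(self, token_ids):
        states, _ = self.lstm(self.embedding(token_ids))    # state vectors of each stage
        feats = self.conv(states.transpose(1, 2))            # CNN part: convolution
        pooled = self.pool(feats).squeeze(-1)                 # CNN part: pooling
        return torch.softmax(self.fc(pooled), dim=-1)         # probability of each scene

# The scene with the highest probability is taken as the scene of the chat sentence.
scene_probs = LstmCnnSceneClassifier()(torch.randint(0, 5000, (1, 10)))
```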
S112: determining, according to the scene, the response generation sub-model corresponding to the chat sentence. The response generation model presets M response generation sub-models, and the response generation sub-models have a mapping relationship with the scenes. Once the scene of the input chat sentence is determined, the response generation sub-model corresponding to the chat sentence input by the user is determined according to the mapping relationship between scenes and response generation sub-models.
In the embodiments of the present application, the mapping relationship between response generation sub-models and scenes is as follows: when the scene is a question-answering scene, a question-and-answer knowledge base is used as the response generation sub-model, and when the scene is a non-question-answering scene, a trained Seq2Seq model is used.
S113: inputting the chat sentence into the response generation sub-model, and obtaining an initial response output by the response generation sub-model in response to the chat sentence.
The chat sentence is input into the response generation sub-model corresponding to the scene, and the response generation sub-model outputs the initial response in response to the chat sentence. In the embodiments of the present application, when the chat sentence corresponds to a non-question-answering scene, the initial response is generated by the Seq2Seq model (see the description of S102 for the specific process), and when the chat sentence corresponds to a question-answering scene, the process of generating the initial response is shown in FIG. 3.
As shown in FIG. 3, when the chat sentence corresponds to a question-answering scene, the response generation sub-model corresponding to the chat sentence is determined to be a question-and-answer knowledge base; S111 further includes the following steps:
S121: performing word segmentation on the chat sentence to obtain keywords of the chat sentence.
The embodiments of the present application use the bidirectional maximum matching method, which is a dictionary-based word segmentation method. Dictionary-based word segmentation matches the Chinese character string to be analyzed against entries in a machine dictionary according to a certain strategy; if a character string is found in the dictionary, the match succeeds. Dictionary-based segmentation is divided into forward matching and backward matching according to the scanning direction, and into maximum matching and minimum matching according to length. The bidirectional maximum matching method compares the segmentation result obtained by forward maximum matching with the result obtained by backward maximum matching to decide the correct segmentation. Studies show that for about 90.0% of Chinese sentences the forward and backward maximum matching results coincide completely and are correct; for only about 9.0% of sentences the two methods give different results, but one of the two is necessarily correct; and for less than 1.0% of sentences either both methods give the same but wrong (ambiguous) segmentation, or the two methods give different segmentations and neither is correct. Therefore, in order for the segmented words to accurately reflect the meaning of the sentence, the bidirectional maximum matching method is used for word segmentation.
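A minimal sketch of dictionary-based maximum matching is given below; the tiny dictionary and the tie-breaking heuristic (preferring the segmentation with fewer words) are assumptions for illustration, while a production system would use a full segmentation lexicon.

```python
DICTIONARY = {"今天", "天气", "晴", "空气", "质量"}   # illustrative mini-dictionary
MAX_WORD_LEN = 4

def forward_max_match(sentence, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    words, i = [], 0
    while i < len(sentence):
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            if length == 1 or candidate in dictionary:   # longest dictionary word wins
                words.append(candidate)
                i += length
                break
    return words

def backward_max_match(sentence, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    words, j = [], len(sentence)
    while j > 0:
        for length in range(min(max_len, j), 0, -1):
            candidate = sentence[j - length:j]
            if length == 1 or candidate in dictionary:
                words.insert(0, candidate)
                j -= length
                break
    return words

def bidirectional_max_match(sentence):
    """Compare the forward and backward segmentations and keep the better one."""
    fwd, bwd = forward_max_match(sentence), backward_max_match(sentence)
    return fwd if len(fwd) <= len(bwd) else bwd   # simple heuristic: fewer words
```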
After the chat sentence is segmented, the segmentation result may also be matched against a preset stop-word list, and stop words are removed to obtain the keywords of the chat sentence.
S122: retrieving the question-and-answer knowledge base according to the keywords to obtain a retrieval result matching the keywords.
The question-and-answer knowledge base is searched according to the keywords to obtain retrieval results matching the keywords. A third-party search engine may be used to retrieve the question-and-answer knowledge base according to the keywords.
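For illustration, the sketch below ranks knowledge-base entries by simple keyword overlap; the toy knowledge base and the overlap score stand in for the third-party search engine mentioned above and are not part of the application.

```python
# Toy question-and-answer knowledge base; entries are illustrative placeholders.
QA_KNOWLEDGE_BASE = [
    {"question": "退货 流程", "answer": "您可以在订单页面申请退货。"},
    {"question": "保修 期限", "answer": "本产品保修期为一年。"},
]

def retrieve_answer(keywords):
    """Return the answer of the best-matching entry, or None if nothing matches."""
    def score(entry):
        return sum(1 for kw in keywords if kw in entry["question"])
    best = max(QA_KNOWLEDGE_BASE, key=score)
    return best["answer"] if score(best) > 0 else None
```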
S123: returning the retrieval result as the initial response to the chat sentence.
The question-and-answer knowledge base is usually retrieved by keywords and there are multiple retrieval results; in the embodiments of the present application, the top-ranked retrieval result is taken as the initial response to the chat sentence. As shown in FIG. 4, the emotion generation model is based on N pre-trained Seq2Seq models; after being trained, each Seq2Seq model adds a different emotion to the initial response. The training of any one of the Seq2Seq models includes the following steps:
S131: obtaining a training corpus, where the training corpus contains a number of input-sequence and output-sequence pairs, and the output sequence is an expression of the input sequence with a specified emotion type.
The training corpus consists of a number of sequence pairs, each containing an input sequence and an output sequence, where the output sequence expresses the input sequence with the specified emotion type. For example, for the neutral input sequence "Sunny today, temperature 25 degrees, air quality index 20", the expected output sequence is the positive expression "The weather is great today, the temperature is a comfortable 25 degrees, and the air quality is excellent".
S132: inputting the input sequences into the Seq2Seq model, and adjusting the parameters of the Seq2Seq model so that the probability that the Seq2Seq model outputs the output sequence in response to the input sequence is maximized.
The input sequences in the training corpus are input into the Seq2Seq model, and the parameters of each node of the Seq2Seq model are adjusted by gradient descent; the training ends when the probability that the Seq2Seq model outputs the expected output sequences is maximized. The parameter file obtained at this point defines the Seq2Seq model that generates the specified emotion type.
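A hedged sketch of such a training loop for one emotion generation sub-model follows, assuming PyTorch and assuming the model returns per-position vocabulary logits for an input token sequence; maximizing the probability of the target sequence is written here as minimizing token-level cross-entropy by gradient descent.

```python
import torch
import torch.nn as nn

def train_emotion_seq2seq(model, corpus_pairs, epochs=10, lr=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)   # gradient descent
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for input_ids, target_ids in corpus_pairs:            # neutral -> emotional sequence pairs
            logits = model(input_ids)                          # assumed shape: (seq_len, vocab_size)
            loss = loss_fn(logits, target_ids)                 # -log P(output sequence | input sequence)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```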
As shown in FIG. 5, in the embodiments of the present application, the deep reinforcement learning network model is trained through the following steps:
S141: obtaining training samples, where each of the training samples contains an input chat sentence, the candidate responses corresponding to the chat sentence, and the deep reinforcement learning value of each candidate response.
Training samples are prepared; each sample contains an input chat sentence, the candidate responses corresponding to the chat sentence, and the deep reinforcement learning value of each candidate response. The deep reinforcement learning values are labelled according to preset rules; for example, when a candidate response to a chat sentence causes the user to end the conversation directly, the value of that candidate response is labelled low, and when a candidate response to a chat sentence causes a positive change in the emotion of the chat sentence the user inputs in the next round, the value of that candidate response is labelled high.
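A toy sketch of such rule-based labelling follows; the numeric values and the argument names are assumptions introduced for the example rather than figures from the application.

```python
def label_rl_value(user_ended_dialogue, sentiment_shift):
    """Assign a low value when the reply ends the dialogue and a high value when the
    user's next chat sentence shows a positive change in emotion."""
    if user_ended_dialogue:
        return 0.0
    if sentiment_shift > 0:
        return 1.0
    return 0.5
```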
S142: inputting the training samples into the deep reinforcement learning network model, and obtaining the deep reinforcement learning values predicted by the deep reinforcement learning network model.
The training samples are input into the deep reinforcement learning network model to obtain the deep reinforcement learning values predicted by the model. Deep reinforcement learning can be compared to supervised learning; a deep reinforcement learning task is usually described by a Markov decision process: the agent is in an environment, and each state is the agent's perception of the environment. When the agent performs an action, the environment transitions to another state with a certain probability; at the same time, the environment gives the agent a reward according to a reward function.
S143: calculating the value of the loss function L(w) according to the predicted deep reinforcement learning values. The deep reinforcement learning values predicted by the deep reinforcement learning network model and the actual labelled values of the samples are substituted into the above loss function L(w), and the value of the loss function is calculated.
S144: adjusting the network parameters of the deep reinforcement learning network model until the value of the loss function L(w) is minimized.
The goal of training is the convergence of the loss function L(w): when continuing to adjust the network parameters of the deep reinforcement learning network model no longer decreases the value of the loss function but instead increases it, the training ends. The parameter file obtained at this point is the file defining the deep reinforcement learning network model.
To solve the above technical problem, the embodiments of the present application further provide a machine chat apparatus with emotion. For details, refer to FIG. 6, which is a block diagram of the basic structure of the machine chat apparatus with emotion according to this embodiment.
As shown in FIG. 6, a machine chat apparatus with emotion includes an obtaining module 210, a generation module 220, a processing module 230, a calculation module 240 and an execution module 250. The obtaining module 210 is configured to obtain a chat sentence input by a user; the generation module 220 is configured to input the chat sentence into a preset response generation model and obtain an initial response output by the response generation model in response to the chat sentence; the processing module 230 is configured to input the initial response into a preset emotion generation model and obtain at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response; the calculation module 240 is configured to input the candidate responses and the chat sentence into a trained deep reinforcement learning network model and obtain a deep reinforcement learning value of each candidate response; and the execution module 250 is configured to return the candidate response with the largest deep reinforcement learning value as a response sentence to the chat sentence.
In the embodiments of the present application, a chat sentence input by a user is obtained; the chat sentence is input into a preset response generation model to obtain an initial response output by the response generation model in response to the chat sentence; the initial response is input into a preset emotion generation model to obtain at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response; the candidate responses and the chat sentence are input into a trained deep reinforcement learning network model to obtain a deep reinforcement learning value of each candidate response; and the candidate response with the largest deep reinforcement learning value is returned as a response sentence to the chat sentence. An emotion-carrying reply is returned for the chat sentence input by the user, which makes machine chat more natural and human-like.
In some embodiments, the generation module includes a first recognition sub-module, a first confirmation sub-module and a first generation sub-module. The first recognition sub-module is configured to input the chat sentence into a preset scene recognition model and obtain a scene output by the scene recognition model in response to the chat sentence; the first confirmation sub-module is configured to determine, according to the scene, the response generation sub-model corresponding to the chat sentence; and the first generation sub-module is configured to input the chat sentence into the response generation sub-model and obtain an initial response output by the response generation sub-model in response to the chat sentence.
In some embodiments, the first recognition sub-module includes a first matching sub-module, a second confirmation sub-module and a third confirmation sub-module. The first matching sub-module is configured to match the chat sentence against a preset regular expression, where the preset regular expression contains interrogative-sentence features; the second confirmation sub-module is configured to determine that the chat sentence corresponds to a question-answering scene when the chat sentence matches the preset regular expression; and the third confirmation sub-module is configured to determine that the chat sentence corresponds to a non-question-answering scene when the chat sentence does not match the preset regular expression.
In some embodiments, the first generation sub-module includes a first word-segmentation sub-module, a first retrieval sub-module and a first execution sub-module. The first word-segmentation sub-module performs word segmentation on the chat sentence to obtain keywords of the chat sentence; the first retrieval sub-module is configured to retrieve the question-and-answer knowledge base according to the keywords and obtain a retrieval result matching the keywords; and the first execution sub-module is configured to return the retrieval result as the initial response to the chat sentence.
In some embodiments, the emotion generation model in the machine chat apparatus with emotion is based on N pre-trained Seq2Seq models, and the machine chat apparatus with emotion further includes a first acquisition sub-module and a first calculation sub-module. The first acquisition sub-module is configured to obtain a training corpus, where the training corpus contains a number of input-sequence and output-sequence pairs and the output sequence is an expression of the input sequence with a specified emotion type; the first calculation sub-module is configured to input the input sequences into the Seq2Seq model and adjust the parameters of the Seq2Seq model so that the probability that the Seq2Seq model outputs the output sequence in response to the input sequence is maximized.
In some embodiments, the deep reinforcement learning network model in the machine chat apparatus with emotion is trained with the following loss function:
L(w) = E[(Q - Q̂)²]
where w denotes the network parameters of the deep reinforcement learning network, Q is the true deep reinforcement learning value, and Q̂ is the deep reinforcement learning value predicted by the deep reinforcement learning network.
In some embodiments, the machine chat apparatus with emotion further includes a second acquisition sub-module, a second calculation sub-module, a third calculation sub-module and a first adjustment sub-module. The second acquisition sub-module is configured to obtain training samples, where each of the training samples contains an input chat sentence, the candidate responses corresponding to the chat sentence, and the deep reinforcement learning value of each candidate response; the second calculation sub-module is configured to input the training samples into the deep reinforcement learning network model and obtain the deep reinforcement learning values predicted by the deep reinforcement learning network model; the third calculation sub-module is configured to calculate the value of the loss function L(w) according to the predicted deep reinforcement learning values; and the first adjustment sub-module is configured to adjust the network parameters of the deep reinforcement learning network model until the value of the loss function L(w) is minimized.
To solve the above technical problem, the embodiments of the present application further provide a computer device. For details, refer to FIG. 7, which is a block diagram of the basic structure of the computer device of this embodiment and a schematic diagram of its internal structure. As shown in FIG. 7, the computer device includes a processor, a non-volatile storage medium, a memory and a network interface connected through a system bus. The non-volatile storage medium of the computer device stores an operating system, a database and computer-readable instructions, and the database may store a control-information sequence. When the computer-readable instructions are executed by the processor, they cause the processor to implement the machine chat method with emotion of any of the above embodiments. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The memory of the computer device may store computer-readable instructions which, when executed by the processor, cause the processor to execute a machine chat method with emotion. The network interface of the computer device is used to connect to and communicate with a terminal. Those skilled in the art can understand that the structure shown in FIG. 7 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In this embodiment, the processor is configured to execute the specific content of the obtaining module 210, the generation module 220, the processing module 230, the calculation module 240 and the execution module 250 in FIG. 6, and the memory stores the program code and various data required to execute these modules. The network interface is used for data transmission with a user terminal or between servers. The memory in this embodiment stores the program code and data required to execute all the sub-modules of the machine chat method with emotion, and the server can call its program code and data to execute the functions of all the sub-modules.
The computer device obtains a chat sentence input by a user; inputs the chat sentence into a preset response generation model to obtain an initial response output by the response generation model in response to the chat sentence; inputs the initial response into a preset emotion generation model to obtain at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response; inputs the candidate responses and the chat sentence into a trained deep reinforcement learning network model to obtain a deep reinforcement learning value of each candidate response; and returns the candidate response with the largest deep reinforcement learning value as a response sentence to the chat sentence. An emotion-carrying reply is returned for the chat sentence input by the user, which makes machine chat more natural and human-like.
The present application further provides a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the steps of the machine chat method with emotion described in any of the above embodiments.

Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc or a read-only memory (ROM), or a random access memory (RAM). It should be understood that, although the steps in the flowcharts of the drawings are shown in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the drawings may include multiple sub-steps or stages; these sub-steps or stages are not necessarily completed at the same moment but may be performed at different moments, and their order of execution is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps. The above is only a part of the implementations of the present application. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications shall also be regarded as falling within the protection scope of the present application.

Claims
1. An emotional machine chat method, comprising the following steps:
obtaining a chat sentence input by a user;
inputting the chat sentence into a preset response generation model, and obtaining an initial response output by the response generation model in response to the chat sentence;
inputting the initial response into a preset emotion generation model, and obtaining at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response;
inputting the candidate responses and the chat sentence into a trained deep reinforcement learning network model, and obtaining the deep reinforcement learning value of each candidate response; and
returning the candidate response with the largest deep reinforcement learning value as the response sentence of the chat sentence.
2. The emotional machine chat method according to claim 1, wherein the preset response generation model comprises at least two response generation sub-models, and the step of inputting the chat sentence into the preset response generation model and obtaining the initial response comprises the following steps:
inputting the chat sentence into a preset scene recognition model, and obtaining the scene output by the scene recognition model in response to the chat sentence;
determining, according to the scene, the response generation sub-model corresponding to the chat sentence; and
inputting the chat sentence into the response generation sub-model, and obtaining the initial response output by the response generation sub-model in response to the chat sentence.
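For illustration, a minimal Python sketch of this scene-based routing, assuming one response generation sub-model per scene label; the dictionary lookup and method names are assumptions rather than the claimed implementation.

# Hypothetical routing of a chat sentence to a response generation sub-model by scene.
def generate_initial_response(chat_sentence, scene_model, sub_models):
    # sub_models: dict mapping a scene label (e.g. "question_answering") to a sub-model.
    scene = scene_model.recognize(chat_sentence)
    sub_model = sub_models[scene]
    # The chosen sub-model produces the initial response for this chat sentence.
    return sub_model.generate(chat_sentence)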
3. The emotional machine chat method according to claim 2, wherein the preset scene recognition model adopts a regular-expression matching algorithm, and the step of inputting the chat sentence into the preset scene recognition model and obtaining the scene output by the scene recognition model in response to the chat sentence comprises the following steps:
matching the chat sentence against a preset regular expression, wherein the preset regular expression contains interrogative-sentence features;
when the chat sentence matches the preset regular expression, determining that the chat sentence corresponds to a question-answering scene; and
when the chat sentence does not match the preset regular expression, determining that the chat sentence corresponds to a non-question-answering scene.
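A toy Python example of this regular-expression scene recognition; the pattern below (question marks and a few common interrogative words) is only an assumed stand-in for the preset regular expression.

import re

# Assumed interrogative-sentence features; the actual preset expression is not specified here.
QUESTION_PATTERN = re.compile(r"[?？]|什么|怎么|为什么|哪|几|吗|\bhow\b|\bwhat\b|\bwhy\b", re.IGNORECASE)

def recognize_scene(chat_sentence: str) -> str:
    # A match means a question-answering scene; otherwise a non-question-answering scene.
    if QUESTION_PATTERN.search(chat_sentence):
        return "question_answering"
    return "non_question_answering"

print(recognize_scene("明天深圳的天气怎么样？"))   # question_answering
print(recognize_scene("今天心情不错"))             # non_question_answering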
4. The emotional machine chat method according to claim 3, wherein the step of determining, according to the scene, the response generation sub-model corresponding to the chat sentence is:
for the question-answering scene, determining that the response generation sub-model corresponding to the chat sentence is a question-and-answer knowledge base;
and the step of inputting the chat sentence into the response generation sub-model and obtaining the initial response output by the response generation sub-model in response to the chat sentence comprises the following steps:
segmenting the chat sentence into words to obtain keywords of the chat sentence;
retrieving the question-and-answer knowledge base according to the keywords to obtain a retrieval result matching the keywords; and
returning the retrieval result as the initial response of the chat sentence.
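A simplified sketch of the keyword retrieval described in this claim; the toy knowledge base and the naive segmentation are assumptions (a real system would likely use a proper Chinese word segmenter and a full-text index).

# Toy question-and-answer knowledge base keyed by keyword (for illustration only).
QA_KNOWLEDGE_BASE = {
    "退货": "您可以在订单页面申请退货，七天内无理由退换。",
    "发货": "订单通常会在下单后 24 小时内发货。",
}

def segment_keywords(chat_sentence):
    # Placeholder segmentation: keep only words that appear in the knowledge base.
    return [word for word in QA_KNOWLEDGE_BASE if word in chat_sentence]

def retrieve_initial_response(chat_sentence):
    for keyword in segment_keywords(chat_sentence):
        # Return the first retrieval result that matches a keyword.
        return QA_KNOWLEDGE_BASE[keyword]
    return None

print(retrieve_initial_response("请问怎么退货？"))   # prints the matching answer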
5. The emotional machine chat method according to claim 1, wherein the emotion generation model is based on N pre-trained Seq2Seq models, and the training of any one of the Seq2Seq models comprises the following steps:
obtaining a training corpus, the training corpus containing a number of input-sequence and output-sequence pairs, wherein the output sequence is an expression of the input sequence with a specified emotion type; and
inputting the input sequence into the Seq2Seq model and adjusting the parameters of the Seq2Seq model so that the probability that the Seq2Seq model outputs the output sequence in response to the input sequence is maximized.
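This objective amounts to maximum-likelihood training on (input sequence, emotion-specific output sequence) pairs: maximizing the probability of the output sequence is equivalent to minimizing its negative log-likelihood. A condensed sketch under assumed interfaces (log_prob and step are hypothetical) is shown below.

# Sketch of training one emotion-specific Seq2Seq model by maximum likelihood.
def train_emotion_seq2seq(seq2seq, corpus, epochs=10):
    # corpus: iterable of (input_sequence, output_sequence) pairs, where the output
    # sequence expresses the input with one specified emotion type.
    for _ in range(epochs):
        for input_seq, output_seq in corpus:
            loss = -seq2seq.log_prob(input_seq, output_seq)  # negative log-likelihood
            seq2seq.step(loss)  # adjust the model parameters to reduce the loss
    return seq2seq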
6. The emotional machine chat method according to claim 1, wherein the loss function of the deep reinforcement learning network model is L(w) = E[(Q - Q̂)^2], where w is the network parameter, Q is the true deep reinforcement learning value, and Q̂ is the deep reinforcement learning value predicted by the deep reinforcement learning network.
7. The emotional machine chat method according to claim 6, wherein the deep reinforcement learning network model is trained through the following steps:
obtaining training samples, each of which contains an input chat sentence, the candidate responses corresponding to the chat sentence, and the deep reinforcement learning value of each candidate response;
inputting the training samples into the deep reinforcement learning network model, and obtaining the deep reinforcement learning values predicted by the deep reinforcement learning network model;
calculating the value of the loss function L(w) according to the predicted deep reinforcement learning values; and
adjusting the network parameters of the deep reinforcement learning network model until the value of the loss function L(w) is minimized.
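Reading claims 6 and 7 together, training minimizes the squared error between the true and predicted deep reinforcement learning values. A minimal NumPy sketch under assumed interfaces (the predict and update methods of drl_net, and the sample format, are hypothetical):

import numpy as np

def train_drl_model(drl_net, samples, epochs=10, tolerance=1e-6):
    # samples: list of (chat_sentence, candidate_response, true_q) triples.
    for _ in range(epochs):
        predicted = np.array([drl_net.predict(s, c) for s, c, _ in samples])
        true_q = np.array([q for _, _, q in samples])
        loss = float(np.mean((true_q - predicted) ** 2))  # L(w) = E[(Q - Q_hat)^2]
        drl_net.update(loss)  # adjust the network parameters w
        if loss < tolerance:  # stop once the loss is (near) minimal
            break
    return drl_net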
8. An emotional machine chat apparatus, comprising:
an obtaining module, configured to obtain a chat sentence input by a user;
a generation module, configured to input the chat sentence into a preset response generation model and obtain an initial response output by the response generation model in response to the chat sentence;
a processing module, configured to input the initial response into a preset emotion generation model and obtain at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response;
a calculation module, configured to input the candidate responses and the chat sentence into a trained deep reinforcement learning network model and obtain the deep reinforcement learning value of each candidate response; and
an execution module, configured to return the candidate response with the largest deep reinforcement learning value as the response sentence of the chat sentence.
9. The emotional machine chat apparatus according to claim 8, wherein the generation module comprises:
a first recognition sub-module, configured to input the chat sentence into a preset scene recognition model and obtain the scene output by the scene recognition model in response to the chat sentence;
a first confirmation sub-module, configured to determine, according to the scene, the response generation sub-model corresponding to the chat sentence; and
a first generation sub-module, configured to input the chat sentence into the response generation sub-model and obtain the initial response output by the response generation sub-model in response to the chat sentence.
10. The emotional machine chat apparatus according to claim 9, wherein the first recognition sub-module comprises:
a first matching sub-module, configured to match the chat sentence against a preset regular expression, wherein the preset regular expression contains interrogative-sentence features;
a second confirmation sub-module, configured to determine that the chat sentence corresponds to a question-answering scene when the chat sentence matches the preset regular expression; and
a third confirmation sub-module, configured to determine that the chat sentence corresponds to a non-question-answering scene when the chat sentence does not match the preset regular expression.
11. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to implement the following steps:
obtaining a chat sentence input by a user;
inputting the chat sentence into a preset response generation model, and obtaining an initial response output by the response generation model in response to the chat sentence;
inputting the initial response into a preset emotion generation model, and obtaining at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response;
inputting the candidate responses and the chat sentence into a trained deep reinforcement learning network model, and obtaining the deep reinforcement learning value of each candidate response; and
returning the candidate response with the largest deep reinforcement learning value as the response sentence of the chat sentence.
12. The computer device according to claim 11, wherein the preset response generation model comprises at least two response generation sub-models, and the step of inputting the chat sentence into the preset response generation model and obtaining the initial response comprises the following steps:
inputting the chat sentence into a preset scene recognition model, and obtaining the scene output by the scene recognition model in response to the chat sentence;
determining, according to the scene, the response generation sub-model corresponding to the chat sentence; and
inputting the chat sentence into the response generation sub-model, and obtaining the initial response output by the response generation sub-model in response to the chat sentence.
13. The computer device according to claim 12, wherein the preset scene recognition model adopts a regular-expression matching algorithm, and the step of inputting the chat sentence into the preset scene recognition model and obtaining the scene output by the scene recognition model in response to the chat sentence comprises the following steps:
matching the chat sentence against a preset regular expression, wherein the preset regular expression contains interrogative-sentence features;
when the chat sentence matches the preset regular expression, determining that the chat sentence corresponds to a question-answering scene; and
when the chat sentence does not match the preset regular expression, determining that the chat sentence corresponds to a non-question-answering scene.
14. The computer device according to claim 12, wherein the step of determining, according to the scene, the response generation sub-model corresponding to the chat sentence is:
for the question-answering scene, determining that the response generation sub-model corresponding to the chat sentence is a question-and-answer knowledge base;
and the step of inputting the chat sentence into the response generation sub-model and obtaining the initial response output by the response generation sub-model in response to the chat sentence comprises the following steps:
segmenting the chat sentence into words to obtain keywords of the chat sentence;
retrieving the question-and-answer knowledge base according to the keywords to obtain a retrieval result matching the keywords; and
returning the retrieval result as the initial response of the chat sentence.
15. The computer device according to claim 11, wherein the emotion generation model is based on N pre-trained Seq2Seq models, and the training of any one of the Seq2Seq models comprises the following steps:
obtaining a training corpus, the training corpus containing a number of input-sequence and output-sequence pairs, wherein the output sequence is an expression of the input sequence with a specified emotion type; and
inputting the input sequence into the Seq2Seq model and adjusting the parameters of the Seq2Seq model so that the probability that the Seq2Seq model outputs the output sequence in response to the input sequence is maximized.
16. A computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the following steps:
obtaining a chat sentence input by a user;
inputting the chat sentence into a preset response generation model, and obtaining an initial response output by the response generation model in response to the chat sentence;
inputting the initial response into a preset emotion generation model, and obtaining at least two emotion-carrying candidate responses output by the emotion generation model in response to the initial response;
inputting the candidate responses and the chat sentence into a trained deep reinforcement learning network model, and obtaining the deep reinforcement learning value of each candidate response; and
returning the candidate response with the largest deep reinforcement learning value as the response sentence of the chat sentence.
17. The computer-readable storage medium according to claim 16, wherein the preset response generation model comprises at least two response generation sub-models, and the step of inputting the chat sentence into the preset response generation model and obtaining the initial response comprises the following steps:
inputting the chat sentence into a preset scene recognition model, and obtaining the scene output by the scene recognition model in response to the chat sentence;
determining, according to the scene, the response generation sub-model corresponding to the chat sentence; and
inputting the chat sentence into the response generation sub-model, and obtaining the initial response output by the response generation sub-model in response to the chat sentence.
18. The computer-readable storage medium according to claim 17, wherein the preset scene recognition model adopts a regular-expression matching algorithm, and the step of inputting the chat sentence into the preset scene recognition model and obtaining the scene output by the scene recognition model in response to the chat sentence comprises the following steps:
matching the chat sentence against a preset regular expression, wherein the preset regular expression contains interrogative-sentence features;
when the chat sentence matches the preset regular expression, determining that the chat sentence corresponds to a question-answering scene; and
when the chat sentence does not match the preset regular expression, determining that the chat sentence corresponds to a non-question-answering scene.
19. The computer-readable storage medium according to claim 18, wherein the step of determining, according to the scene, the response generation sub-model corresponding to the chat sentence is:
for the question-answering scene, determining that the response generation sub-model corresponding to the chat sentence is a question-and-answer knowledge base;
and the step of inputting the chat sentence into the response generation sub-model and obtaining the initial response output by the response generation sub-model in response to the chat sentence comprises the following steps:
segmenting the chat sentence into words to obtain keywords of the chat sentence;
retrieving the question-and-answer knowledge base according to the keywords to obtain a retrieval result matching the keywords; and
returning the retrieval result as the initial response of the chat sentence.
20. The computer-readable storage medium according to claim 18, wherein the emotion generation model is based on N pre-trained Seq2Seq models, and the training of any one of the Seq2Seq models comprises the following steps:
obtaining a training corpus, the training corpus containing a number of input-sequence and output-sequence pairs, wherein the output sequence is an expression of the input sequence with a specified emotion type; and
inputting the input sequence into the Seq2Seq model and adjusting the parameters of the Seq2Seq model so that the probability that the Seq2Seq model outputs the output sequence in response to the input sequence is maximized.
PCT/CN2019/103516 2019-01-28 2019-08-30 Method and apparatus for chatting with machine with sentiment, computer device and storage medium WO2020155619A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910081989.6A CN109977201B (en) 2019-01-28 2019-01-28 Machine chat method and device with emotion, computer equipment and storage medium
CN201910081989.6 2019-01-28

Publications (1)

Publication Number Publication Date
WO2020155619A1 true WO2020155619A1 (en) 2020-08-06

Family

ID=67076749

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103516 WO2020155619A1 (en) 2019-01-28 2019-08-30 Method and apparatus for chatting with machine with sentiment, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN109977201B (en)
WO (1) WO2020155619A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163078A (en) * 2020-09-29 2021-01-01 彩讯科技股份有限公司 Intelligent response method, device, server and storage medium
CN112560447A (en) * 2020-12-22 2021-03-26 联想(北京)有限公司 Reply information acquisition method and device and computer equipment
CN113360614A (en) * 2021-05-31 2021-09-07 多益网络有限公司 Method, device, terminal and medium for controlling reply emotion of generating type chat robot

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109977201B (en) * 2019-01-28 2023-09-22 平安科技(深圳)有限公司 Machine chat method and device with emotion, computer equipment and storage medium
CN110717022A (en) * 2019-09-18 2020-01-21 平安科技(深圳)有限公司 Robot dialogue generation method and device, readable storage medium and robot
CN112750430A (en) * 2019-10-29 2021-05-04 微软技术许可有限责任公司 Providing responses in automatic chat
CN111241250B (en) * 2020-01-22 2023-10-24 中国人民大学 Emotion dialogue generation system and method
CN111400466A (en) * 2020-03-05 2020-07-10 中国工商银行股份有限公司 Intelligent dialogue method and device based on reinforcement learning
CN111553171B (en) * 2020-04-09 2024-02-06 北京小米松果电子有限公司 Corpus processing method, corpus processing device and storage medium
CN111985216A (en) * 2020-08-25 2020-11-24 武汉长江通信产业集团股份有限公司 Emotional tendency analysis method based on reinforcement learning and convolutional neural network
CN113094490B (en) * 2021-05-13 2022-11-22 度小满科技(北京)有限公司 Session interaction method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140324429A1 (en) * 2013-04-25 2014-10-30 Elektrobit Automotive Gmbh Computer-implemented method for automatic training of a dialogue system, and dialogue system for generating semantic annotations
CN108874972A (en) * 2018-06-08 2018-11-23 青岛里奥机器人技术有限公司 A kind of more wheel emotion dialogue methods based on deep learning
CN108960402A (en) * 2018-06-11 2018-12-07 上海乐言信息科技有限公司 A kind of mixed strategy formula emotion towards chat robots pacifies system
CN109129501A (en) * 2018-08-28 2019-01-04 西安交通大学 A kind of company formula Intelligent household central control robot
CN109977201A (en) * 2019-01-28 2019-07-05 平安科技(深圳)有限公司 Machine chat method, device, computer equipment and storage medium with emotion

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809103B (en) * 2015-04-29 2018-03-30 北京京东尚科信息技术有限公司 A kind of interactive semantic analysis and system
CN106910513A (en) * 2015-12-22 2017-06-30 微软技术许可有限责任公司 Emotional intelligence chat engine
JP6660770B2 (en) * 2016-03-02 2020-03-11 株式会社アイ・ビジネスセンター Dialogue systems and programs
US11580350B2 (en) * 2016-12-21 2023-02-14 Microsoft Technology Licensing, Llc Systems and methods for an emotionally intelligent chat bot
CN110674270B (en) * 2017-08-28 2022-01-28 大国创新智能科技(东莞)有限公司 Humorous generation and emotion interaction method based on artificial intelligence and robot system
CN107679234B (en) * 2017-10-24 2020-02-11 上海携程国际旅行社有限公司 Customer service information providing method, customer service information providing device, electronic equipment and storage medium



Also Published As

Publication number Publication date
CN109977201A (en) 2019-07-05
CN109977201B (en) 2023-09-22


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19912414; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19912414; Country of ref document: EP; Kind code of ref document: A1)