CN116521893A - Control method and control device of intelligent dialogue system and electronic equipment - Google Patents

Info

Publication number
CN116521893A
Authority
CN
China
Prior art keywords
information
current input
keywords
keyword
input information
Prior art date
Legal status
Pending
Application number
CN202310485470.0A
Other languages
Chinese (zh)
Inventor
周镇镇
Current Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310485470.0A
Publication of CN116521893A

Classifications

    • G06F16/367 Information retrieval of unstructured textual data; creation of semantic tools, e.g. ontology or thesauri; Ontology
    • G06F16/3329 Querying; query formulation; natural language query formulation or dialogue systems
    • G06F16/3344 Querying; query processing; query execution using natural language analysis
    • G06F40/35 Handling natural language data; semantic analysis; discourse or dialogue representation
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the application provides a control method, a control device and electronic equipment for an intelligent dialogue system. The method comprises the following steps: acquiring current input information of a user and extracting keywords of the current input information; determining a topic corresponding to the current input information according to the keywords and a historical dialogue information base, where the historical dialogue information base contains historical dialogue information between the user and the intelligent dialogue system; querying a target domain knowledge database for knowledge corresponding to the keywords and the topic to obtain target domain knowledge, where the target domain knowledge database contains theoretical knowledge of the target domain; and inputting prompt word information, which comprises a prompt sentence, the current input information, the topic and the target domain knowledge, into a first pre-training language model, analyzing the prompt word information with the first pre-training language model to determine reply information, and controlling the intelligent dialogue system to output the reply information. The method and the device solve the problem that a chat-type intelligent dialogue system with insufficient understanding ability cannot output accurate reply information.

Description

Control method and control device of intelligent dialogue system and electronic equipment
Technical Field
The embodiment of the application relates to the field of artificial intelligence, in particular to a control method, a control device, a computer readable storage medium and electronic equipment of an intelligent dialogue system.
Background
With the development of natural language processing technology, dialogue systems for various subdivided scenarios have emerged. In different application environments these systems are also called chat robots, dialogue robots, chat companion robots, psychological counseling robots, communication robots, question-answering robots, intelligent assistants and so on; in this application the different terms are collectively referred to as "dialogue systems".
Dialogue systems can be broadly divided into task-oriented dialogue systems and chat-oriented dialogue systems; for a chat-oriented dialogue system, maintaining continuity of the conversation with the user is particularly important. In the prior art, the application CN202011438705.3, entitled "Online marketplace of plug-ins for enhanced dialogue systems", discloses an online marketplace for dialogue system extension elements. A method includes maintaining an online marketplace that may include a plurality of dialogue system extension elements, each of which may include at least one of a dialogue system plug-in, a dialogue system attachment, a dialogue system update and a dialogue system upgrade. The method may further include receiving, from a software developer associated with a dialogue system, a selection of one of the plurality of extension elements, and associating the selected extension element with the software developer's dialogue system.
The application CN201611060135.2, entitled "Method and system for maintaining conversation continuity of a dialogue system", discloses a method and system in which a related topic set belonging to the same topic as the current dialogue is cut out in real time from all historical dialogues according to the current dialogue input; current topic keywords are mined from the related topic set; and the response output is determined according to the current dialogue input and the current topic keywords, so that the most representative current topic keywords are mined from the related topic set cut out in real time.
The application CN202210010887.7, entitled "Epidemic situation information extraction framework construction method based on a pre-training language model", designs a data labeling rule for infected cases and proposes a case information extraction framework built on a pre-training language model. The framework automatically extracts the core elements of a case: a named entity recognition network accurately identifies named entities in the case text and locates key information about the case propagation path, and an entailment type prediction network efficiently predicts the entailment type and judges the main form of the propagation path. The framework realizes a structured representation of the case text, further assisting disease prevention and control experts in formulating interventions against the spread of the novel coronavirus.
As can be seen from the above prior art, current chat-type dialogue systems use natural language models with relatively few parameters, have limited ability to understand user input, have no memory of previous user input, or rely on the conventional frequently-asked-question (FAQ) approach of matching questions only by keywords and outputting canned answers; as a result they often fail to produce replies that satisfy the user or resolve the user's request.
Therefore, how to enable a chat-type dialogue system to trace back long-past dialogue and improve its understanding capability is a problem that currently needs to be solved.
Disclosure of Invention
The embodiments of the application provide a control method, a control device, a computer readable storage medium and electronic equipment for an intelligent dialogue system, which are used to at least solve the problem in the related art that an intelligent dialogue system has insufficient understanding ability and cannot output a reply satisfactory to the user.
According to one embodiment of the present application, there is provided a control method of an intelligent dialog system, including: acquiring current input information of a user, and extracting keywords of the current input information; determining topics corresponding to the current input information according to the keywords and a historical dialogue information base, wherein the historical dialogue information base comprises historical dialogue information of the user and an intelligent dialogue system; inquiring knowledge corresponding to the keywords and the topics in a target domain knowledge database to obtain target domain knowledge, wherein the target domain knowledge database comprises theoretical knowledge of a target domain; inputting prompt word information into a first pre-training language model, analyzing the prompt word information by using the first pre-training language model, determining reply information, and controlling the intelligent dialogue system to output the reply information, wherein the prompt word information comprises prompt sentences, current input information, topics and target domain knowledge, the prompt sentences are preset sentences serving as guide words of the reply information, the first pre-training language model is trained by using a plurality of groups of first data through machine learning, and each group of first data in the plurality of groups of first data comprises: history prompt word information and history reply information.
In an exemplary embodiment, the first pre-training language model includes an embedding layer, conversion layers, a linear layer and a logits layer, and analyzing the prompt word information using the first pre-training language model to determine the reply information includes: acquiring, by the embedding layer, the word embedding and position encoding corresponding to the prompt word information; inputting the word embedding and the position encoding corresponding to the prompt word information into the conversion layers to obtain the feature space corresponding to the word embedding and the position encoding, where each conversion layer includes a multi-head attention layer, a normalization layer and a feedforward neural network layer; compressing the feature space using the linear layer to obtain the compressed feature space; processing the compressed feature space using the logits layer to obtain the probabilities of a plurality of output sentences, where the probability of the output sentences (y_1, y_2, …, y_n) is P(y_1, …, y_n) = ∏_{i=1}^{n} p(y_i | x_1, …, x_i, y_1, …, y_{i-1}), n represents the number of pieces of prompt word information, x_i represents the i-th piece of prompt word information, and y_i represents the i-th output sentence; and taking the output sentence with the highest probability as the reply information.
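For illustration only, the following minimal PyTorch sketch mirrors the decoder stack just described (embedding layer with position encoding, stacked conversion/Transformer layers containing multi-head attention, add & norm and feed-forward sublayers, a linear layer, and a logits output); the layer counts, dimensions and candidate-scoring helper are assumptions of the sketch and not the patent's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FirstPretrainingLM(nn.Module):
    """Embedding layer -> stacked conversion (Transformer) layers -> linear layer -> logits."""
    def __init__(self, vocab_size=50000, d_model=512, n_heads=8, n_layers=6, max_len=2048):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)      # word embedding
        self.pos_emb = nn.Embedding(max_len, d_model)         # position encoding
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)  # attention + add&norm + feed-forward
        self.linear = nn.Linear(d_model, vocab_size)          # compress feature space to vocabulary logits

    def forward(self, token_ids):                             # token_ids: (batch, seq_len)
        seq_len = token_ids.size(1)
        pos = torch.arange(seq_len, device=token_ids.device)
        h = self.tok_emb(token_ids) + self.pos_emb(pos)       # combined embedded representation
        causal = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                       device=token_ids.device), diagonal=1)
        h = self.blocks(h, mask=causal)                       # feature space from the conversion layers
        return self.linear(h)                                 # logits over the vocabulary

def reply_log_prob(model, prompt_ids, reply_ids):
    """Sentence-level log-probability of one candidate reply given the prompt word information."""
    ids = torch.cat([prompt_ids, reply_ids]).unsqueeze(0)
    log_p = F.log_softmax(model(ids)[0, :-1], dim=-1)         # next-token distributions
    targets = ids[0, 1:]
    token_lp = log_p[torch.arange(targets.size(0)), targets]
    return token_lp[-reply_ids.size(0):].sum()                # score only the reply tokens

# The candidate reply with the highest probability would be returned as the reply information.
```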
In an exemplary embodiment, extracting the keywords of the current input information includes: performing part-of-speech screening on all words contained in the current input information to obtain a plurality of words with a preset part of speech, and forming a first candidate keyword group from the plurality of words with the preset part of speech, where the preset part of speech is a part of speech set in advance; deleting repeated first candidate keywords from the first candidate keyword group to obtain a second candidate keyword group; and determining the keyword corresponding to the current input information according to the second candidate keyword group.
In an exemplary embodiment, determining the keyword corresponding to the current input information according to the second candidate keyword group includes: constructing each second candidate keyword in the second candidate keyword group into a keyword sentence of a predetermined sentence pattern, so as to obtain a plurality of second candidate keyword sentences; analyzing the second candidate keyword sentences by using a second pre-training language model to obtain the probability of each second candidate keyword sentence, and determining the second candidate keyword corresponding to the second candidate keyword sentence with the highest probability as the keyword corresponding to the current input information, wherein the second pre-training language model is trained by using a plurality of groups of second data through machine learning, and each group of second data in the plurality of groups of second data comprises: historical keyword sentences and historical keywords.
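As a rough illustration of this scoring step, the sketch below constructs the candidate keyword sentences and compares their total sequence log-likelihoods under an off-the-shelf causal language model via the Hugging Face transformers library; the checkpoint name, the sentence template details and the use of sequence log-likelihood in place of the dedicated logits-layer probability are assumptions of the sketch.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")            # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def sentence_log_likelihood(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)                         # loss = mean token negative log-likelihood
    return -out.loss.item() * ids.size(1)                    # approximate total log-likelihood

def pick_keyword(user_input: str, candidates: list) -> str:
    template = ("Please determine whether the following description is correct: "
                "{kw} is the keyword of the user input information: {msg}")
    scores = {kw: sentence_log_likelihood(template.format(kw=kw, msg=user_input))
              for kw in candidates}
    return max(scores, key=scores.get)                       # candidate sentence with the highest score

print(pick_keyword("How many controllers can my cluster support?",
                   ["cluster", "controllers", "support"]))
```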
In an exemplary embodiment, before determining the topic corresponding to the current input information according to the keywords and the historical dialogue information base, the method further includes: recording a plurality of pieces of historical dialogue information between the user and the intelligent dialogue system, and forming the historical dialogue information base from the historical dialogue information.
In an exemplary embodiment, determining, according to the keyword and the historical dialogue information base, a topic corresponding to the current input information includes: calculating the importance degree of each keyword on a plurality of historical dialogue information in the historical dialogue information base by using a word frequency-inverse document frequency algorithm to obtain a frequency value corresponding to each keyword; and arranging the frequency values according to a preset sequence, and determining the preset number of keywords as the topics corresponding to the current input information.
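For illustration only, a minimal scikit-learn sketch of this TF-IDF topic positioning is given below; treating each historical dialogue entry as one document and taking the top-5 keywords follows the description above, while the summation of per-document TF-IDF values as the keyword's importance and the tokenization details are assumptions of the sketch.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def determine_topics(keywords, history_docs, top_k=5):
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform(history_docs)           # one row per historical dialogue document
    vocab = vectorizer.vocabulary_
    scores = {}
    for kw in keywords:
        col = vocab.get(kw.lower())
        # importance of the keyword over all historical dialogue documents
        scores[kw] = tfidf[:, col].sum() if col is not None else 0.0
    ranked = sorted(scores, key=scores.get, reverse=True)    # descending TF-IDF
    return ranked[:top_k]                                    # predetermined number of keywords as the topic

history = [
    "User: How many controllers can my cluster support?",
    "Xiaoyuan: Hello! I am your intelligent customer service Xiaoyuan. What is your cluster model?",
]
print(determine_topics(["cluster", "controllers", "model"], history))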
In an exemplary embodiment, after controlling the intelligent dialog system to output the reply information, the method further includes: and storing the current input information and the reply information into the historical dialogue information base.
According to another embodiment of the present application, there is provided a control device of an intelligent dialogue system, including: an extraction module, configured to acquire current input information of a user and extract keywords of the current input information; a determining module, configured to determine a topic corresponding to the current input information according to the keywords and a historical dialogue information base, where the historical dialogue information base includes historical dialogue information between the user and the intelligent dialogue system; a query module, configured to query a target domain knowledge database for knowledge corresponding to the keywords and the topic to obtain target domain knowledge, where the target domain knowledge database includes theoretical knowledge of the target domain; and a control module, configured to input prompt word information into a first pre-training language model, analyze the prompt word information using the first pre-training language model to determine reply information, and control the intelligent dialogue system to output the reply information, where the prompt word information includes a prompt sentence, the current input information, the topic and the target domain knowledge, the prompt sentence is a preset sentence serving as a guide for the reply information, the first pre-training language model is trained by machine learning using multiple groups of first data, and each group of first data includes prompt word information and a reply information label.
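For illustration only, a schematic (non-normative) Python sketch of how the four modules of the control device could be wired together is given below; the module interfaces are stubs standing in for the components described elsewhere in this application.

```python
class IntelligentDialogueControlDevice:
    def __init__(self, extraction, determining, query, control):
        self.extraction = extraction      # extracts keywords from the current input information
        self.determining = determining    # determines the topic from keywords + history base
        self.query = query                # looks up target domain knowledge
        self.control = control            # builds prompt word information and runs the first PLM

    def reply(self, current_input: str) -> str:
        keywords = self.extraction.extract(current_input)
        topic = self.determining.topic(keywords)
        knowledge = self.query.lookup(keywords, topic)
        return self.control.respond(current_input, topic, knowledge)
```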
According to a further embodiment of the present application, there is also provided a computer readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
According to a further embodiment of the present application, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to the method and the device, the current input information of the user is first acquired and its keywords are extracted; a topic is determined according to the keywords and the historical dialogue information base; target domain knowledge is obtained by querying the keywords and the topic in the industry domain knowledge base; prompt word information comprising the prompt sentence, the current input information, the topic and the target domain knowledge is input into the first pre-training language model; and the reply information is obtained after the first pre-training language model analyzes the prompt word information. Compared with the prior art, in which an intelligent dialogue system with weak understanding ability cannot trace back historical dialogue information to obtain a reply satisfactory to the user, the intelligent dialogue system of the application can use the first pre-training language model to analyze the current input information of the user according to the historical dialogue information and the target domain knowledge and output accurate reply information, thereby solving the problem of insufficient understanding ability of a chat-type intelligent dialogue system and improving the accuracy of the reply information.
Drawings
Fig. 1 is a hardware block diagram of a mobile terminal of a control method of an intelligent dialogue system according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of controlling an intelligent dialog system according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a control method of a specific intelligent dialog system according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the prompt word information in a specific control method of the intelligent dialog system according to the embodiment of the present application;
FIG. 5 is a schematic structural diagram of a pre-training language model in a control method of a specific intelligent dialog system according to an embodiment of the present application;
fig. 6 is a block diagram of a control device of a specific intelligent dialog system according to an embodiment of the present application.
Wherein the above figures include the following reference numerals:
102. a processor; 104. a memory; 106. a transmission device; 108. and an input/output device.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be performed in a mobile terminal, a computer terminal or similar computing device. Taking the mobile terminal as an example, fig. 1 is a block diagram of a hardware structure of the mobile terminal of a control method of an intelligent dialogue system according to an embodiment of the present application. As shown in fig. 1, a mobile terminal may include one or more (only one is shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA) and a memory 104 for storing data, wherein the mobile terminal may also include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the structure shown in fig. 1 is merely illustrative and not limiting of the structure of the mobile terminal described above. For example, the mobile terminal may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
The memory 104 may be used to store computer programs, such as software programs of application software and modules, such as computer programs corresponding to the control methods of the intelligent dialog system in the embodiments of the present application, and the processor 102 executes the computer programs stored in the memory 104 to perform various functional applications and data processing, i.e., implement the methods described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the mobile terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.
In this embodiment, a control method of an intelligent dialogue system running on the mobile terminal is provided, and fig. 2 is a flowchart of control of an intelligent dialogue system according to an embodiment of the application, as shown in fig. 2, where the flowchart includes the following steps:
step S202, current input information of a user is obtained, and keywords of the current input information are extracted;
specifically, the intelligent dialogue system is a chat-type dialogue system and can also be applied as a chat robot, dialogue robot, chat companion robot, psychological counseling robot, communication robot, question-answering robot, intelligent assistant and the like. When the user sends current input information to the intelligent dialogue system, the system acquires this current input information and extracts its keywords. The current input information of the user is generally one sentence or several sentences, and each sentence generally contains a plurality of words; extracting keywords from the current input information therefore means extracting the main words, which are usually nouns, from these words. The keyword extraction step is explained below.
Step S204, determining topics corresponding to the current input information according to the keywords and a historical dialogue information base, wherein the historical dialogue information base comprises historical dialogue information of the user and an intelligent dialogue system;
specifically, assuming that the user has already had multiple dialogues with the intelligent dialogue system before inputting the current information, these dialogues form a historical dialogue information base, i.e., the historical dialogue information base contains the historical dialogue information between the user and the intelligent dialogue system. After the keywords of the user's current input information are extracted in the above step, the topic of the conversation is determined according to the historical dialogue information base and the keywords. Historical dialogue information is, for example: User: How many controllers can my cluster support? Xiaoyuan: Hello! I am your intelligent customer service Xiaoyuan. May I ask what your cluster model is?
Step S206, inquiring the knowledge corresponding to the keywords and the topics in a target domain knowledge database to obtain target domain knowledge, wherein the target domain knowledge database comprises theoretical knowledge of a target domain;
specifically, the target domain knowledge database contains theoretical knowledge, factual data and expert-summarized experience about the domain in which the user of the terminal engages with the intelligent dialogue system. Target domain knowledge in the target domain knowledge database is retrieved using the distributed open-source search engine Elastic Search (ES). The user's input information each time forms a current input information database, the corresponding topics form a topic database, and the different pieces of target domain knowledge form the target domain knowledge database; every record in the current input information database, the topic database and the domain knowledge database carries a unique index number so that keywords can be searched with text search technology. In this way an inverted index of "topic-document" relationships is constructed. Querying the target domain knowledge database for knowledge corresponding to the keywords and the topic to obtain target domain knowledge means quickly acquiring, through the topic, the list of documents containing the keywords, where the documents are the records in the current input information database, the topic database and the domain knowledge database; the corresponding documents are then quickly looked up according to the keywords and output. The target domain knowledge database supports login by the administrator role of the unified management platform background and add, delete, query and modify operations on the target domain knowledge; in particular, target domain knowledge uploaded by human customer service is reviewed, and knowledge that passes review is added to or modified in the target domain knowledge database. Browsing and query operations are supported for human customer service, who may also supplement target domain knowledge not yet covered by the existing knowledge base or modify knowledge that has changed and upload the newly added or modified content. In particular, considering the length limitation of the prompt word information later input into the pre-training language model, for each query the industry domain knowledge base outputs s pieces of relevant knowledge preferred for content output; in some alternative embodiments, s = 5.
After the keywords and topics are determined, the keywords and topics are queried in a target domain knowledge base, and target domain knowledge corresponding to the keywords and topics is obtained. Knowledge of the target domain such as: AS5300G5& AS5500G5& AS5600G5& AS5800G5& HF5000G5& HF6000G5 supports a maximum of 16 controllers. AS6800G5& HF8000G5 supports a maximum of 32 controllers. AS18000G5-I & HF18000G5-I supports a maximum of 48 controllers.
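For illustration only, the following sketch shows how such a query could look with the Elasticsearch Python client; the endpoint, index name and field names are placeholder assumptions, while the size = 5 limit mirrors the s = 5 setting described above.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")                  # placeholder endpoint

def query_domain_knowledge(keywords, topics, size=5):
    resp = es.search(
        index="target_domain_knowledge",                     # placeholder index name
        query={"bool": {"should": [{"match": {"content": term}}
                                   for term in list(keywords) + list(topics)]}},
        size=size,                                           # output s = 5 preferred knowledge entries
    )
    return [hit["_source"]["content"] for hit in resp["hits"]["hits"]]
```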
Step S208, inputting prompt word information into a first pre-training language model, analyzing the prompt word information by using the first pre-training language model, determining reply information, and controlling the intelligent dialogue system to output the reply information, wherein the prompt word information comprises prompt sentences, the current input information, the topics and the target domain knowledge, the prompt sentences are preset sentences serving as guide words of the reply information, the first pre-training language model is trained by using a plurality of groups of first data through machine learning, and each group of first data in the plurality of groups of first data comprises: history prompt word information and history reply information.
Specifically, after the current input information, the topic corresponding to the current input information and the target domain knowledge are acquired, a prompt sentence is obtained, which can be understood as a guide for the reply information, for example: "Xiaoyuan is a very enthusiastic and warm-hearted customer service agent with an excellent reserve of knowledge in the professional field; she listens to customers' concerns, serves every customer attentively, and consults a great deal of knowledge before meeting a customer." The prompt sentence, the current input information, the topic and the target domain knowledge are then input as prompt word information into the first pre-training language model, so that the first pre-training language model analyzes the prompt word information and the reply information is obtained and output. The prompt word information generally has a length limitation, for example it is limited to 2048 tokens.
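As a rough illustration of assembling the prompt word information and enforcing the 2048-token limit, consider the sketch below; the tokenizer checkpoint and the strategy of dropping the oldest history turns first are assumptions of the sketch.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")            # placeholder checkpoint
MAX_TOKENS = 2048                                            # length limit of the prompt word information

def build_prompt(guide_sentence, domain_knowledge, history, current_input):
    parts = [guide_sentence, *domain_knowledge, *history, current_input]
    prompt = "\n".join(parts)
    # Drop the oldest history turns until the prompt fits within the token limit.
    while len(tokenizer.encode(prompt)) > MAX_TOKENS and history:
        history = history[1:]
        prompt = "\n".join([guide_sentence, *domain_knowledge, *history, current_input])
    return prompt
```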
The pre-training language model can be any reasonably effective pre-trained large model, such as a very large artificial-intelligence language model, which has more parameters and stronger natural language understanding and natural language generation capabilities.
Through the above steps, the current input information of the user is first acquired and its keywords are extracted; a topic is determined according to the keywords and the historical dialogue information base; target domain knowledge is obtained by querying the keywords and the topic in the industry domain knowledge base; the prompt word information comprising the prompt sentence, the current input information, the topic and the target domain knowledge is input into the first pre-training language model; and the first pre-training language model analyzes the prompt word information to obtain the reply information. Compared with the prior art, in which an intelligent dialogue system with weak understanding ability cannot trace back historical dialogue information to obtain a reply satisfactory to the user, the intelligent dialogue system can use the first pre-training language model to analyze the current input information of the user according to the historical dialogue information and the target domain knowledge and output accurate reply information, which solves the problem of insufficient understanding ability of a chat-type intelligent dialogue system and improves the accuracy of the reply information.
The main execution body of the above steps may be a server, a terminal, or the like, but is not limited thereto.
In a specific implementation process, the step S208 may be implemented by the following steps: the first pre-training language model includes an embedding layer, conversion layers, a linear layer and a logits layer, and analyzing the prompt word information using the first pre-training language model to determine the reply information includes: acquiring, by the embedding layer, the word embedding and position encoding corresponding to the prompt word information; inputting the word embedding and the position encoding corresponding to the prompt word information into the conversion layers to obtain the feature space corresponding to the word embedding and the position encoding, where each conversion layer includes a multi-head attention layer, a normalization layer and a feedforward neural network layer; compressing the feature space using the linear layer to obtain the compressed feature space; processing the compressed feature space using the logits layer to obtain the probabilities of a plurality of output sentences, where the probability of the output sentences (y_1, y_2, …, y_n) is P(y_1, …, y_n) = ∏_{i=1}^{n} p(y_i | x_1, …, x_i, y_1, …, y_{i-1}), n represents the number of pieces of prompt word information, x_i represents the i-th piece of prompt word information, and y_i represents the i-th output sentence; and taking the output sentence with the highest probability as the reply information. In this way the prompt word information is analyzed by the first pre-training language model, so that the user's needs can be fully analyzed according to the current input information, the topic, the target domain knowledge and the prompt sentence contained in the prompt word information, and the reply information is obtained.
Specifically, the prompt word information is passed through an embedding layer (input embedding) to obtain the word embedding representation corresponding to the prompt word information; the word embedding representation is combined with the position encoding (or position embedding) representation to obtain the combined embedded representation of the prompt word information; this is then fed into 76 conversion (Transformer) layers, each consisting of a multi-head attention mechanism, an add & norm layer and a feed-forward neural network layer. Finally, the linear layer completes the compression of the feature space dimension, and the logits layer yields the likelihood of each output sentence, the probability of the output sentences (y_1, y_2, …, y_n) being P(y_1, …, y_n) = ∏_{i=1}^{n} p(y_i | x_1, …, x_i, y_1, …, y_{i-1}); the output sentence with the highest probability is taken as the reply information.
In order to extract the keywords in the current input information of the user, in a specific implementation process, the step S202 may be implemented by the following steps: performing part-of-speech screening on all words contained in the current input information to obtain a plurality of words with a preset part of speech, and forming a first candidate keyword group from the plurality of words with the preset part of speech, where the preset part of speech is a part of speech set in advance; deleting repeated first candidate keywords from the first candidate keyword group to obtain a second candidate keyword group; and determining the keyword corresponding to the current input information according to the second candidate keyword group.
Specifically, the keyword extraction uses the language understanding capability of the pre-training language model to extract the keywords directly. The main steps are as follows: for the current input information of the user, part-of-speech screening is performed using the pre-training language model to obtain a first candidate keyword group [keyword 1, keyword 2, …, keyword m], m ≥ 1; data preprocessing operations such as removing duplicates and stop words are applied to the first candidate keyword group to obtain a second candidate keyword group [keyword 1, keyword 2, …, keyword n], n ≥ 1; and the final keywords are then determined according to the second candidate keyword group, as sketched below.
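For illustration only, the sketch below performs the part-of-speech screening and de-duplication; the patent uses the pre-training language model itself for the screening, whereas this stand-in uses the jieba part-of-speech tagger, and the stop-word list and allowed part-of-speech tags are placeholder assumptions.

```python
import jieba.posseg as pseg

STOP_WORDS = {"的", "了", "吗", "请问"}                      # placeholder stop-word list

def candidate_keywords(current_input, allowed_pos=("n", "nz", "vn")):
    # First candidate keyword group: words whose part-of-speech tag is in the preset set.
    first = [pair.word for pair in pseg.cut(current_input) if pair.flag in allowed_pos]
    # Second candidate keyword group: duplicates and stop words removed, order preserved.
    second, seen = [], set()
    for word in first:
        if word not in seen and word not in STOP_WORDS:
            seen.add(word)
            second.append(word)
    return second

print(candidate_keywords("我的集群最多能支持多少个控制器"))
```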
In a specific implementation process, the step S202 may be further implemented by the following steps: determining the keywords corresponding to the current input information according to the second candidate keyword group, including: constructing each second candidate keyword in the second candidate keyword group into a keyword sentence of a predetermined sentence pattern, so as to obtain a plurality of second candidate keyword sentences; analyzing the second candidate keyword sentences by using a second pre-training language model to obtain the probability of each second candidate keyword sentence, and determining the second candidate keyword corresponding to the second candidate keyword sentence with the largest probability as the keyword corresponding to the current input information, wherein the second pre-training language model is trained by using a plurality of groups of second data through machine learning, and each group of second data in the plurality of groups of second data comprises: historical keyword sentences and historical keywords. The method comprises the steps of constructing a second candidate keyword sentence, and inputting the second candidate keyword sentence into a second pre-training language model for analysis, so that more accurate keywords can be obtained.
Specifically, each keyword in the second candidate keyword group [keyword 1, keyword 2, …, keyword n], n ≥ 1, is constructed into a sentence of the following form: "Please determine whether the following description is correct: keyword 1 is the keyword of the information input by the user this time." In this way n sentences of the same pattern are constructed. The n sentences of the same pattern are fed into the second pre-training model: the input passes through an embedding layer (input embedding) to obtain the word embedding representation of the corresponding tokens, which is superimposed with the position encoding (or position embedding) representation to obtain the combined embedded representation of the input. This is then fed into 76 Transformer layers, each consisting of a multi-head attention mechanism, an add & norm layer and a feed-forward neural network layer. Finally, the linear layer completes the compression of the feature space dimension, and the logits layer obtains the likelihood of each second candidate keyword sentence. The logits layer is used here because the possibility of each keyword being a keyword of the information input by the user this time is independent of the others.
In order to enable the intelligent dialogue system to accurately analyze the problem of the user and output more accurate reply information, in a specific implementation process, the step S204 may be further implemented by the following steps: recording a plurality of history dialogue information of the user and the intelligent dialogue system, and generating the history dialogue information into a history dialogue information base.
Specifically, the current input information of the user usually consists of several pieces, and the historical dialogue information may be the information before the current piece, for example: User: How many controllers can my cluster support? Xiaoyuan: Hello! I am your intelligent customer service Xiaoyuan. May I ask what your cluster model is? User: The model is AS5500G5. In this dialogue, the historical dialogue information may be the information first exchanged between the user and the intelligent dialogue system: "User: How many controllers can my cluster support? Xiaoyuan: Hello! I am your intelligent customer service Xiaoyuan. May I ask what your cluster model is?" That is, the historical dialogue information can be the dialogue information preceding the current sentence among the pieces of information input by the user this time, and the intelligent dialogue system obtains the topic of the information input this time according to the historical dialogue information and the keywords.
In a specific implementation process, the step S204 may be further implemented by the following steps: calculating the importance degree of each keyword on a plurality of historical dialogue information in the historical dialogue information base by using a word frequency-inverse document frequency algorithm to obtain a frequency value corresponding to each keyword; and arranging the frequency values according to a preset sequence, and determining the preset number of keywords as the topics corresponding to the current input information. According to the method, the topics are determined through the keywords, so that the problems of the user can be more accurately positioned, and the questions of the user are replied in a targeted manner according to the topics.
Specifically, topic positioning is used for positioning main topics in multiple interactions of a user and an intelligent dialogue system, after keywords of current input information of the user are extracted in the steps, each current input information in a historical dialogue information base is regarded as a document, and the importance degree of each keyword on the document is calculated by using a Term Frequency-inverse document Frequency (Term Frequency-Inverse Document Frequency, TF-IDF) algorithm; the TF-IDF values corresponding to the keywords of each currently input information are arranged in a descending order, and a predetermined number of words arranged at the forefront are taken, and in some alternative embodiments, the predetermined number may be p=5, and the first 5 keywords are used as topics of the document and as labels of the document.
In order to enable the intelligent dialogue system to "trace back" the content of the history dialogue to determine topics according to keywords and history dialogue information, the method further comprises the following steps in the specific implementation process: and storing the current input information and the reply information into the historical dialogue information base.
Specifically, as described above, the current input information of the user generally consists of several sentences; that is, before the current sentence the user and the intelligent dialogue system have already exchanged multiple pieces of dialogue information, and this existing dialogue information is stored in the historical dialogue information base. In other words, after the intelligent dialogue system outputs the reply information, the current input information and the reply information are stored in the historical dialogue information base, so that the intelligent dialogue system has a "memory" function, can "trace back" the historical dialogue information, and determines topics according to the importance of the keywords with respect to the historical dialogue information.
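For illustration only, a minimal sketch of this "memory" step is given below; the JSON-lines file stands in for the historical dialogue information base, and its location and record layout are assumptions of the sketch.

```python
import json
from pathlib import Path

HISTORY_PATH = Path("history_dialogue.jsonl")                # placeholder location of the history base

def store_turn(current_input, reply):
    """Append the current input information and the reply information to the history base."""
    with HISTORY_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"user": current_input, "system": reply}, ensure_ascii=False) + "\n")

def load_history():
    """Read back the historical dialogue information for topic determination on later turns."""
    if not HISTORY_PATH.exists():
        return []
    with HISTORY_PATH.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f]
```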
In order to enable those skilled in the art to more clearly understand the technical solutions of the present application, the implementation process of the control method of the intelligent dialog system of the present application will be described in detail below with reference to specific embodiments.
The embodiment relates to a specific control method of an intelligent dialogue system, as shown in fig. 3 to 5, including the following steps:
step S1: fig. 3 is a schematic structural diagram of a specific control method of the intelligent dialogue system, in which the user inputs information through the terminal interface of a client. The terminal interface supports access from multiple platforms, including iOS, Android, Web, various mini-programs, H5, quick apps and the like. That is, the uplink message is the current input information;
step S2: the unified management platform (intelligent dialogue system) of the dialogue system acquires the current input information of the user, and the keyword extraction module extracts the keywords of the current input information as follows: for the current input information of the user, part-of-speech screening is performed using a pre-training language model to obtain a first candidate keyword group [keyword 1, keyword 2, …, keyword m], m ≥ 1; data preprocessing operations such as removing duplicates and stop words are applied to the first candidate keyword group to obtain a second candidate keyword group [keyword 1, keyword 2, …, keyword n], n ≥ 1. Each keyword in the second candidate keyword group is constructed into the following format: "Please determine whether the following description is correct: keyword 1 is the keyword of the information input by the user this time." In this way n sentences of the same pattern are constructed and fed into the second pre-training model: the input passes through an embedding layer (input embedding) to obtain the word embedding representation of the corresponding tokens, which is superimposed with the position encoding (or position embedding) representation to obtain the combined embedded representation of the input. This is then fed into 76 Transformer layers, each consisting of a multi-head attention mechanism, an add & norm layer and a feed-forward neural network layer. Finally, the linear layer completes the compression of the feature space dimension, the logits layer obtains the probability of each second candidate keyword sentence, and the second candidate keyword corresponding to the second candidate keyword sentence with the highest probability is determined as the keyword corresponding to the current input information;
Step S3: recording a plurality of historical dialogue information of a user and the intelligent dialogue system, and generating the historical dialogue information into a historical dialogue information base;
step S4: regarding each current input information in the historical dialogue information base as a document, and calculating the importance degree of each keyword on the document by using a Term Frequency-inverse document Frequency (TF-IDF) algorithm; the TF-IDF values corresponding to the keywords of each currently input information are arranged in a descending order, and a predetermined number of words arranged at the forefront are taken, wherein in some alternative embodiments, the predetermined number may be p=5, and the first 5 keywords are used as topics of the document and as labels of the document;
step S5: the method comprises the steps that a distributed open source Search Engine (ES) is used for inquiring knowledge corresponding to the keywords and topics in a target domain knowledge database to obtain target domain knowledge, and an administrator and customer service staff can access the target domain knowledge database;
step S6: a prompt sentence, the current input information, the topic and the target domain knowledge are input as prompt word information into the first pre-training language model, the prompt word information being as shown in fig. 4. The prompt sentence represents the first piece of information with which the intelligent dialogue system begins its reply and serves as the guide for the intelligent dialogue system, for example: "Xiaoyuan is a very enthusiastic and warm-hearted customer service agent with an excellent reserve of professional knowledge, able to serve every customer attentively and listen to the customer's concerns. Before meeting a customer, she consults the following relevant knowledge:". The industry-related domain information (target domain knowledge) is: "AS5300G5 & AS5500G5 & AS5600G5 & AS5800G5 & HF5000G5 & HF6000G5: a maximum of 16 controllers. AS6800G5 & HF8000G5: a maximum of 32 controllers. AS18000G5-I & HF18000G5-I: a maximum of 48 controllers. The cluster system is connected through FC switches in the topology; taking four controllers as an example, this storage series supports multi-controller clusters, where each cluster supports a maximum of 48 controllers." The user-related historical dialogue information (historical dialogue information) is: "User: How many controllers can my cluster support? Xiaoyuan: Hello! I am your intelligent customer service Xiaoyuan. May I ask what your cluster model is?" The user current input information (current input information) is: "User: It should be AS5500G5." The prompt word information is input into the first pre-training language model;
Step S7: the first pre-training language model is as shown in fig. 5. The prompt word information is passed through an embedding layer to obtain the word embedding representation of the corresponding tokens; the word embedding representation is combined with the position encoding (or position embedding) representation to obtain the combined embedded representation of the prompt word information; this is then fed into 76 Transformer layers, each consisting of a multi-head attention mechanism, an add & norm layer and a feed-forward neural network layer. Finally, the linear layer completes the compression of the feature space dimension and the logits layer yields the likelihood of each output sentence, the probability of the output sentences (y_1, y_2, …, y_n) being P(y_1, …, y_n) = ∏_{i=1}^{n} p(y_i | x_1, …, x_i, y_1, …, y_{i-1}), where n represents the number of pieces of prompt word information, x_i represents the i-th piece of prompt word information and y_i represents the i-th output sentence; the output sentence with the highest probability is taken as the reply information.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), including several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method of the embodiments of the present application.
In this embodiment, a control device of an intelligent dialogue system is further provided, and the device is used to implement the foregoing embodiments and preferred embodiments, and is not described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 6 is a block diagram of a control device of the intelligent dialogue system according to an embodiment of the application, as shown in fig. 6, the device includes:
the extracting module 22 is configured to obtain current input information of a user, and extract keywords of the current input information;
specifically, the intelligent dialogue system is a chat-type dialogue system and can also be applied as a chat robot, dialogue robot, chat companion robot, psychological counseling robot, communication robot, question-answering robot, intelligent assistant and the like. When the user sends current input information to the intelligent dialogue system, the system acquires this current input information and extracts its keywords. The current input information of the user is generally one sentence or several sentences, and each sentence generally contains a plurality of words; extracting keywords from the current input information therefore means extracting the main words, which are usually nouns, from these words. The keyword extraction device is described in detail below.
A determining module 24, configured to determine a topic corresponding to the current input information according to the keyword and a historical dialogue information base, where the historical dialogue information base includes historical dialogue information of the user and an intelligent dialogue system;
specifically, assuming that the user has already had multiple dialogues with the intelligent dialogue system before inputting the current information, these dialogues form a historical dialogue information base, i.e., the historical dialogue information base contains the historical dialogue information between the user and the intelligent dialogue system. After the device extracts the keywords of the user's current input information, the topic of the conversation is determined according to the historical dialogue information base and the keywords. Historical dialogue information is, for example: User: How many controllers can my cluster support? Xiaoyuan: Hello! I am your intelligent customer service Xiaoyuan. May I ask what your cluster model is?
A query module 26, configured to query a target domain knowledge database for knowledge corresponding to the keywords and the topics, to obtain target domain knowledge, where the target domain knowledge database includes theoretical knowledge of a target domain;
specifically, the target domain knowledge database contains theoretical knowledge, factual data and expert-summarized experience about the domain in which the user of the terminal engages with the intelligent dialogue system. Target domain knowledge in the target domain knowledge database is retrieved using the distributed open-source search engine Elastic Search (ES). The user's input information each time forms a current input information database, the corresponding topics form a topic database, and the different pieces of target domain knowledge form the target domain knowledge database; every record in the current input information database, the topic database and the domain knowledge database carries a unique index number so that keywords can be searched with text search technology. In this way an inverted index of "topic-document" relationships is constructed. Querying the target domain knowledge database for knowledge corresponding to the keywords and the topic to obtain target domain knowledge means quickly acquiring, through the topic, the list of documents containing the keywords, where the documents are the records in the current input information database, the topic database and the domain knowledge database; the corresponding documents are then quickly looked up according to the keywords and output. The target domain knowledge database supports login by the administrator role of the unified management platform background and add, delete, query and modify operations on the target domain knowledge; in particular, target domain knowledge uploaded by human customer service is reviewed, and knowledge that passes review is added to or modified in the target domain knowledge database. Browsing and query operations are supported for human customer service, who may also supplement target domain knowledge not yet covered by the existing knowledge base or modify knowledge that has changed and upload the newly added or modified content. In particular, considering the length limitation of the prompt word information later input into the pre-training language model, for each query the industry domain knowledge base outputs s pieces of relevant knowledge preferred for content output; in some alternative embodiments, s = 5.
After the keywords and topics are determined, they are queried in the target domain knowledge database, and the target domain knowledge corresponding to the keywords and topics is obtained. Target domain knowledge, for example: AS5300G5 & AS5500G5 & AS5600G5 & AS5800G5 & HF5000G5 & HF6000G5 support a maximum of 16 controllers. AS6800G5 & HF8000G5 support a maximum of 32 controllers. AS18000G5-I & HF18000G5-I support a maximum of 48 controllers.
The control module 28 is configured to input prompt word information into a first pre-training language model, analyze the prompt word information using the first pre-training language model to determine reply information, and control the intelligent dialogue system to output the reply information, where the prompt word information includes a prompt sentence, the current input information, the topic, and the target domain knowledge, the prompt sentence is a preset sentence serving as the guide words of the reply information, the first pre-training language model is trained by machine learning using multiple sets of first data, and each set of first data includes: historical prompt word information and historical reply information.
Specifically, after the current input information, the topic corresponding to the current input information, and the target domain knowledge are acquired, a prompt sentence is acquired; it can be understood as the guide words of the reply information, for example: Xiaoyuan is a very enthusiastic and helpful customer-service agent with an excellent knowledge reserve in its professional field; it listens to customers' concerns, serves every customer attentively, and consults plenty of knowledge before meeting customers. The prompt sentence, the current input information, the topic, and the target domain knowledge are then input into the first pre-training language model as the prompt word information, so that the first pre-training language model analyzes the prompt word information, and the reply information is obtained and output. The prompt word information generally has a length limit, for example: limited to 2048 tokens (a token being a basic unit of text in computing).
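A minimal sketch of assembling the prompt word information and generating a reply is given below, assuming the Hugging Face transformers library with "gpt2" purely as a placeholder model; the field layout of the prompt and the 2048-token truncation mirror the description above but are otherwise illustrative.

from transformers import AutoTokenizer, AutoModelForCausalLM

MAX_PROMPT_TOKENS = 2048  # length limit on the prompt word information mentioned above

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder model name
model = AutoModelForCausalLM.from_pretrained("gpt2")

def build_prompt(prompt_sentence, current_input, topics, domain_knowledge):
    # prompt word information = prompt sentence + topics + target domain knowledge + current input
    return "\n".join([
        prompt_sentence,
        "Topics: " + ", ".join(topics),
        "Domain knowledge: " + " ".join(domain_knowledge),
        "User: " + current_input,
        "Reply:",
    ])

def generate_reply(prompt_sentence, current_input, topics, domain_knowledge):
    prompt = build_prompt(prompt_sentence, current_input, topics, domain_knowledge)
    inputs = tokenizer(prompt, truncation=True,
                       max_length=MAX_PROMPT_TOKENS, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128)
    # keep only the newly generated tokens, i.e. the reply information
    reply_ids = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)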
The pre-training language model can be any reasonable and effective pre-trained large model, for example a large language model, which has a large number of parameters and strong natural language understanding and natural language generation capabilities.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
Through the device, the current input information of the user is first obtained and its keywords are extracted; the topic is determined according to the keywords and the historical dialogue information base; target domain knowledge is obtained by querying the target domain knowledge database with the keywords and the topic; the prompt word information consisting of the prompt sentence, the current input information, the topic, and the target domain knowledge is input into the first pre-training language model; and the first pre-training language model analyzes the prompt word information to obtain the reply information. Compared with prior-art devices, whose understanding ability is weak and which can neither trace back historical dialogue information nor produce reply information that satisfies users, this intelligent dialogue system can use the first pre-training language model to analyze the user's current input information in light of the historical dialogue information and the target domain knowledge and output accurate reply information, thereby solving the problem of insufficient understanding ability in intelligent chat dialogue systems and improving the accuracy of the reply information.
In the specific implementation process, the control module includes an acquisition sub-module, an input sub-module, a compression sub-module, a processing sub-module, and an execution sub-module. The acquisition sub-module is used to acquire, with the embedding layer, the word embedding and position coding corresponding to the prompt word information; the input sub-module is used to input the word embedding and position coding corresponding to the prompt word information into the conversion layer to obtain the feature space corresponding to the word embedding and position coding, where the conversion layer includes a multi-head attention mechanism layer, a normalization layer, and a feedforward neural network layer; the compression sub-module is used to compress the feature space with the linear layer to obtain the compressed feature space; the processing sub-module is used to process the compressed feature space with the logic layer to obtain the probabilities of a plurality of output sentences, where the probability of each output sentence $(y_1, y_2, \ldots, y_n)$ is $p(y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} p(y_i \mid x_1, \ldots, x_n, y_1, \ldots, y_{i-1})$; and the execution sub-module is used to take the output sentence with the highest probability as the reply information. The device analyzes the prompt word information through the first pre-training language model, so that the user's requirement can be fully analyzed according to the current input information, the topic, the target domain knowledge, and the prompt sentence in the prompt word information, and the reply information is obtained.
Specifically, an embedding layer (input embedding) is applied to the prompt word information to obtain the word embedding representation of the tokens corresponding to the prompt word information; the word embedding representation is superimposed with the position coding (position embedding, also called positional embedding) to obtain the combined embedded representation of the prompt word information, which is then fed into 76 Transformer layers, each consisting of a multi-head attention mechanism (multi-head attention), a superimposed normalization layer (add & norm), and a feed-forward neural network layer (feed forward). Finally, the linear layer compresses the feature-space dimension, and the logic layer (logits) obtains the likelihood of each output sentence, the probability of each output sentence $(y_1, y_2, \ldots, y_n)$ being $p(y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} p(y_i \mid x_1, \ldots, x_n, y_1, \ldots, y_{i-1})$; the output sentence with the highest probability is taken as the reply information.
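The following PyTorch sketch mirrors that forward pass in a simplified form: word embedding plus position embedding, a stack of Transformer layers (multi-head attention, add & norm, feed-forward), and a linear layer producing the logits. The dimensions are illustrative, only two layers stand in for the 76 described above, and causal masking is omitted for brevity.

import torch
import torch.nn as nn

class TinyPromptLM(nn.Module):
    def __init__(self, vocab_size=50000, d_model=512, n_layers=2, max_len=2048):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)      # word embedding
        self.pos_emb = nn.Embedding(max_len, d_model)          # position embedding
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=8, dim_feedforward=2048,            # multi-head attention + feed forward
            batch_first=True)                                   # add & norm is applied inside each layer
        self.layers = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)           # linear layer producing the logits

    def forward(self, token_ids):
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(positions)   # combined embedded representation
        x = self.layers(x)                                       # Transformer layers
        return self.lm_head(x)                                   # logits over the vocabulary

# The probability of an output sentence is obtained by applying softmax to the
# logits at each step and multiplying the per-token probabilities together.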
In order to extract the keywords in the user's current input information, in a specific implementation process, the extraction module includes a generation sub-module, a deletion sub-module, and a first determination sub-module. The generation sub-module is used to perform part-of-speech screening on all words contained in the current input information to obtain a plurality of words with a preset part of speech and form them into a first candidate keyword group, where the preset part of speech is a part of speech specified in advance; the deletion sub-module is used to delete repeated first candidate keywords in the first candidate keyword group to obtain a second candidate keyword group; and the first determination sub-module is used to determine the keyword corresponding to the current input information according to the second candidate keyword group.
Specifically, keyword extraction uses the language understanding capability of a pre-trained language model to extract keywords directly. The main procedure is as follows: for the user's current input information, part-of-speech screening is performed with the pre-trained language model to obtain a first candidate keyword group [keyword 1, keyword 2, …, keyword m], where m ≥ 1; data preprocessing such as removing duplicates and stop words is then applied to the first candidate keyword group to obtain a second candidate keyword group [keyword 1, keyword 2, …, keyword n], where n ≥ 1; the final keyword is then determined according to the second candidate keyword group.
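A hedged sketch of this flow is shown below. The patent performs the part-of-speech screening with a pre-trained language model; jieba's POS tagger is used here only as a lightweight stand-in, and the preset part-of-speech set and stop-word list are assumptions.

import jieba.posseg as pseg

KEEP_POS = {"n", "nz", "vn", "eng"}        # assumed "preset parts of speech"
STOP_WORDS = {"的", "了", "吗", "请问"}      # assumed stop-word list

def extract_keywords(current_input):
    # first candidate keyword group: words whose part of speech is in the preset set
    first_group = [word for word, flag in pseg.cut(current_input) if flag in KEEP_POS]
    # second candidate keyword group: remove duplicates and stop words
    second_group, seen = [], set()
    for word in first_group:
        if word not in seen and word not in STOP_WORDS:
            seen.add(word)
            second_group.append(word)
    return second_group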
In a specific implementation process, the first determination sub-module includes a construction sub-module and a second determination sub-module. The construction sub-module is used to construct each second candidate keyword in the second candidate keyword group into a keyword sentence of a predetermined sentence pattern, so as to obtain a plurality of second candidate keyword sentences; the second determination sub-module is configured to analyze the second candidate keyword sentences with a second pre-training language model to obtain the probability of each second candidate keyword sentence, and to determine the second candidate keyword corresponding to the second candidate keyword sentence with the highest probability as the keyword corresponding to the current input information, where the second pre-training language model is trained by machine learning using multiple sets of second data, and each set of second data includes: historical keyword sentences and historical keywords. By constructing the second candidate keyword sentences and inputting them into the second pre-training language model for analysis, the device can obtain more accurate keywords.
Specifically, each keyword in the second candidate keyword group [keyword 1, keyword 2, …, keyword n], n ≥ 1, is constructed into the following sentence pattern: "Please determine whether the following description is correct: keyword 1 is the keyword of the information input by the user this time." In this way n sentences of the same pattern are constructed. The n sentences are fed into the second pre-training model: an embedding layer (input embedding) is applied to the input to obtain the word embedding representation of the corresponding tokens, which is superimposed with the position coding (position embedding, also called positional embedding) representation to obtain the combined embedded representation of the input. The result is then fed into 76 Transformer layers, each consisting of a multi-head attention mechanism (multi-head attention), an add & norm layer, and a feed-forward neural network layer (feed forward). Finally, the linear layer compresses the feature-space dimension, and the logic layer (logits) obtains the likelihood of each second candidate keyword sentence. The logic layer is used here because the possibility of each keyword being a keyword of the information input by the user this time is independent of the others.
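As a hedged sketch of this scoring step, the snippet below wraps each candidate in the sentence pattern and ranks the candidates by the total log-probability that a causal language model assigns to each sentence; "gpt2" is a placeholder for the second pre-training model, and scoring by sentence likelihood is one plausible reading of "acquiring the likelihood of each second candidate keyword sentence", not the patent's exact procedure.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")     # placeholder second pre-training model
lm = AutoModelForCausalLM.from_pretrained("gpt2")

def sentence_log_prob(sentence):
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = lm(ids, labels=ids)                 # mean cross-entropy over the predicted tokens
    return -out.loss.item() * (ids.size(1) - 1)   # total log-probability of the sentence

def pick_keyword(second_group):
    pattern = ("Please determine whether the following description is correct: "
               "{kw} is the keyword of the information input by the user.")
    scored = [(sentence_log_prob(pattern.format(kw=kw)), kw) for kw in second_group]
    return max(scored)[1] if scored else None     # keyword of the most probable sentence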
In order to enable the intelligent dialogue system to accurately analyze the user's question and output more accurate reply information, in a specific implementation process, the determining module further includes a generation sub-module, which is used to record a plurality of pieces of historical dialogue information between the user and the intelligent dialogue system and store them in the historical dialogue information base.
Specifically, the user's current input usually consists of multiple pieces of information, and the historical dialogue information may be the information that precedes the current piece, for example: User: How many controllers can my cluster support? Xiaoyuan: Hello! I am your intelligent customer service Xiaoyuan. May I ask which cluster model you are using? User: The model is AS5500G5. In this dialogue, the historical dialogue information may be the first exchange between the user and the intelligent dialogue system: "How many controllers can my cluster support?" and "Hello! I am your intelligent customer service Xiaoyuan. May I ask which cluster model you are using?" That is, the historical dialogue information can be the dialogue information preceding the current sentence among the pieces of information input by the user this time, and the intelligent dialogue system obtains the topic of the information input this time according to the historical dialogue information and the keywords.
In a specific implementation process, the determining module further includes a calculation sub-module and a third determination sub-module. The calculation sub-module is used to calculate, with a term frequency-inverse document frequency algorithm, the importance of each keyword to the plurality of pieces of historical dialogue information in the historical dialogue information base, obtaining a frequency value corresponding to each keyword; the third determination sub-module is used to arrange the frequency values in a preset order and determine a predetermined number of keywords as the topics corresponding to the current input information. By determining the topics through the keywords, the device can locate the user's question more accurately and reply to it in a targeted manner according to the topics.
Specifically, topic positioning is used to locate the main topics across the multiple interactions between the user and the intelligent dialogue system. After the device extracts the keywords of the user's current input information, each piece of current input information in the historical dialogue information base is regarded as a document, and the importance of each keyword to the documents is calculated with the term frequency-inverse document frequency (TF-IDF) algorithm. The TF-IDF values corresponding to the keywords of each piece of current input information are arranged in descending order, and the predetermined number of top-ranked words is taken; in some alternative embodiments the predetermined number may be p = 5, and the first five keywords are used as the topics of the document and as its labels.
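A minimal sketch of this step follows, assuming scikit-learn's TfidfVectorizer and whitespace-separable text; each historical piece of input is treated as one document, and the p = 5 keywords with the largest total TF-IDF weight become the topics. The fallback for an empty history is an assumption.

from sklearn.feature_extraction.text import TfidfVectorizer

def determine_topics(keywords, history_docs, p=5):
    if not history_docs:
        return keywords[:p]                                 # no history yet: fall back to the keywords
    vectorizer = TfidfVectorizer(vocabulary=keywords)       # score only the extracted keywords
    tfidf = vectorizer.fit_transform(history_docs)          # one row per historical "document"
    weights = tfidf.sum(axis=0).A1                          # importance of each keyword over all documents
    ranked = sorted(zip(vectorizer.get_feature_names_out(), weights),
                    key=lambda kw_w: kw_w[1], reverse=True)
    return [kw for kw, _ in ranked[:p]]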
In order to enable the intelligent dialogue system to "trace" the content of the history dialogue so as to determine topics according to keywords and history dialogue information, in a specific implementation process, the device further comprises a storage module, which is used for storing the current input information and the reply information into the history dialogue information base.
Specifically, as described above, the user's current input usually consists of multiple sentences, i.e., before the current sentence the user and the intelligent dialogue system have already exchanged multiple pieces of dialogue information, and these are stored in the historical dialogue information base. After the intelligent dialogue system outputs the reply information, the current input information and the reply information are likewise stored in the historical dialogue information base, so that the intelligent dialogue system has a "memory" function, can "trace back" the historical dialogue information, and determines topics according to the importance of the keywords to the historical dialogue information.
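Tying the pieces together, the sketch below reuses the illustrative helpers from the earlier snippets (extract_keywords, determine_topics, query_domain_knowledge, generate_reply; all hypothetical names) and appends each finished turn to the historical dialogue information base, which is what gives the system its "memory"; the in-memory list standing in for that base is an assumption.

HISTORY = []   # stand-in for the historical dialogue information base
PROMPT_SENTENCE = ("Xiaoyuan is a very enthusiastic and helpful customer-service agent "
                   "with an excellent knowledge reserve in its professional field.")

def handle_turn(current_input):
    keywords = extract_keywords(current_input)                    # extraction module
    history_texts = [turn["user"] for turn in HISTORY]
    topics = determine_topics(keywords, history_texts, p=5)       # determining module 24
    knowledge = query_domain_knowledge(keywords, topics, s=5)     # query module 26
    reply = generate_reply(PROMPT_SENTENCE, current_input,
                           topics, knowledge)                      # control module 28
    HISTORY.append({"user": current_input, "reply": reply})       # storage module
    return reply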
Embodiments of the present application also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the steps of any of the method embodiments described above when run.
In one exemplary embodiment, the computer readable storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing a computer program.
Embodiments of the present application also provide an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
In an exemplary embodiment, the electronic device may further include a transmission device connected to the processor, and an input/output device connected to the processor.
Specific examples in this embodiment may refer to the examples described in the foregoing embodiments and the exemplary implementation, and this embodiment is not described herein.
It will be appreciated by those skilled in the art that the modules or steps of the application described above may be implemented with a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of computing devices; they may be implemented in program code executable by computing devices, so that they may be stored in a storage device and executed by the computing devices; in some cases, the steps shown or described may be performed in a different order than shown or described herein; alternatively, they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit it; various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principles of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method for controlling an intelligent dialog system, comprising:
Acquiring current input information of a user, and extracting keywords of the current input information;
determining topics corresponding to the current input information according to the keywords and a historical dialogue information base, wherein the historical dialogue information base comprises historical dialogue information of the user and an intelligent dialogue system;
inquiring knowledge corresponding to the keywords and the topics in a target domain knowledge database to obtain target domain knowledge, wherein the target domain knowledge database comprises theoretical knowledge of a target domain;
inputting prompt word information into a first pre-training language model, analyzing the prompt word information by using the first pre-training language model, determining reply information, and controlling the intelligent dialogue system to output the reply information, wherein the prompt word information comprises prompt sentences, current input information, topics and target domain knowledge, the prompt sentences are preset sentences serving as guide words of the reply information, the first pre-training language model is trained by using a plurality of groups of first data through machine learning, and each group of first data in the plurality of groups of first data comprises: history prompt word information and history reply information.
2. The method of claim 1, wherein the first pre-trained language model comprises an embedding layer, a conversion layer, a linear layer, and a logic layer, wherein analyzing the hint word information using the first pre-trained language model to determine reply information comprises:
acquiring word embedding and position coding corresponding to the prompt word information by using the embedding layer;
embedding the word corresponding to the prompt word information into a conversion layer according to the position code to obtain a feature space corresponding to the word embedding and the position code, wherein the conversion layer comprises a multi-head attention mechanism layer, a normalization layer and a feedforward neural network layer;
compressing the feature space by using the linear layer to obtain the compressed feature space;
processing the compressed feature space using the logic layer to obtain probabilities of a plurality of output sentences, wherein the probability of each output sentence $(y_1, y_2, \ldots, y_n)$ is $p(y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} p(y_i \mid x_1, \ldots, x_n, y_1, \ldots, y_{i-1})$, where n represents the number of pieces of the prompt word information, $x_i$ represents the i-th piece of prompt word information, and $y_i$ represents the i-th said output sentence;
and taking the output statement with the highest probability as the reply information.
3. The method of claim 1, wherein extracting keywords of the current input information comprises:
Performing part-of-speech screening on all words contained in the current input information to obtain a plurality of words with a preset part of speech, and forming the plurality of words with the preset part of speech into a first candidate keyword group, wherein the preset part of speech is a part of speech set in advance;
deleting repeated first candidate keywords in the first candidate keyword groups to obtain second candidate keyword groups;
and determining the keywords corresponding to the current input information according to the second candidate keyword group.
4. The method of claim 3, wherein determining the keyword corresponding to the current input information from the second candidate keyword group comprises:
constructing each second candidate keyword in the second candidate keyword group into a keyword sentence of a predetermined sentence pattern, so as to obtain a plurality of second candidate keyword sentences;
analyzing the second candidate keyword sentences by using a second pre-training language model to obtain the probability of each second candidate keyword sentence, and determining the second candidate keyword corresponding to the second candidate keyword sentence with the highest probability as the keyword corresponding to the current input information, wherein the second pre-training language model is trained by using a plurality of groups of second data through machine learning, and each group of second data in the plurality of groups of second data comprises: historical keyword sentences and historical keywords.
5. The method of claim 1, further comprising, prior to determining the topic corresponding to the current input information from the keyword and historical dialog information library:
recording a plurality of historical dialogue information of the user and the intelligent dialogue system, and generating the historical dialogue information to a historical dialogue information base.
6. The method of claim 5, wherein determining topics corresponding to the current input information from the keyword and historical dialog information library comprises:
calculating the importance degree of each keyword on a plurality of historical dialogue information in the historical dialogue information base by using a word frequency-inverse document frequency algorithm to obtain a frequency value corresponding to each keyword;
and arranging the frequency values according to a preset sequence, and determining the preset number of keywords as the topics corresponding to the current input information.
7. The method of claim 1, further comprising, after controlling the intelligent dialog system to output the reply message:
and storing the current input information and the reply information into the historical dialogue information base.
8. A control device for an intelligent dialog system, comprising:
the extraction module is used for acquiring current input information of a user and extracting keywords of the current input information;
the determining module is used for determining topics corresponding to the current input information according to the keywords and a historical dialogue information base, wherein the historical dialogue information base comprises historical dialogue information of the user and an intelligent dialogue system;
the query module is used for querying the knowledge corresponding to the keywords and the topics in a target domain knowledge database to obtain target domain knowledge, wherein the target domain knowledge database comprises theoretical knowledge of a target domain;
the control module is configured to input prompt word information into a first pre-training language model, analyze the prompt word information using the first pre-training language model, determine reply information, and control the intelligent dialogue system to output the reply information, wherein the prompt word information comprises a prompt sentence, the current input information, the topic, and the target domain knowledge, the prompt sentence is a preset sentence serving as guide words of the reply information, the first pre-training language model is trained by machine learning using multiple groups of first data, and each group of first data in the multiple groups of first data comprises: historical prompt word information and historical reply information.
9. A computer readable storage medium, characterized in that a computer program is stored in the computer readable storage medium, wherein the computer program, when being executed by a processor, implements the steps of the method according to any of the claims 1 to 7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1 to 7 when the computer program is executed.
CN202310485470.0A 2023-04-28 2023-04-28 Control method and control device of intelligent dialogue system and electronic equipment Pending CN116521893A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310485470.0A CN116521893A (en) 2023-04-28 2023-04-28 Control method and control device of intelligent dialogue system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310485470.0A CN116521893A (en) 2023-04-28 2023-04-28 Control method and control device of intelligent dialogue system and electronic equipment

Publications (1)

Publication Number Publication Date
CN116521893A true CN116521893A (en) 2023-08-01

Family

ID=87389828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310485470.0A Pending CN116521893A (en) 2023-04-28 2023-04-28 Control method and control device of intelligent dialogue system and electronic equipment

Country Status (1)

Country Link
CN (1) CN116521893A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116860949A (en) * 2023-08-21 2023-10-10 人民网股份有限公司 Question-answering processing method, device, system, computing equipment and computer storage medium
CN116860949B (en) * 2023-08-21 2024-04-05 人民网股份有限公司 Question-answering processing method, device, system, computing equipment and computer storage medium
CN116775848A (en) * 2023-08-23 2023-09-19 宁波吉利汽车研究开发有限公司 Control method, device, computing equipment and storage medium for generating dialogue information
CN116775848B (en) * 2023-08-23 2023-11-07 宁波吉利汽车研究开发有限公司 Control method, device, computing equipment and storage medium for generating dialogue information
CN116932703A (en) * 2023-09-19 2023-10-24 苏州元脑智能科技有限公司 User controllable content generation method, device, equipment and medium
CN116932703B (en) * 2023-09-19 2024-01-23 苏州元脑智能科技有限公司 User controllable content generation method, device, equipment and medium
CN117056496A (en) * 2023-10-12 2023-11-14 青岛海尔乐信云科技有限公司 Intelligent customer service interaction data management method based on big data
CN117056496B (en) * 2023-10-12 2024-01-26 青岛海尔乐信云科技有限公司 Intelligent customer service interaction data management method based on big data
CN117453899A (en) * 2023-12-26 2024-01-26 浙江智港通科技有限公司 Intelligent dialogue system and method based on large model and electronic equipment
CN117453899B (en) * 2023-12-26 2024-03-29 浙江智港通科技有限公司 Intelligent dialogue system and method based on large model and electronic equipment
CN117743315A (en) * 2024-02-20 2024-03-22 浪潮软件科技有限公司 Method for providing high-quality data for multi-mode large model system

Similar Documents

Publication Publication Date Title
CN116521893A (en) Control method and control device of intelligent dialogue system and electronic equipment
US20200301954A1 (en) Reply information obtaining method and apparatus
CN112346567B (en) Virtual interaction model generation method and device based on AI (Artificial Intelligence) and computer equipment
CN111310436B (en) Text processing method and device based on artificial intelligence and electronic equipment
CN106407178A (en) Session abstract generation method and device
CN109885810A (en) Nan-machine interrogation's method, apparatus, equipment and storage medium based on semanteme parsing
CN110866093A (en) Machine question-answering method and device
CN112650854B (en) Intelligent reply method and device based on multiple knowledge graphs and computer equipment
CN109101624A (en) Dialog process method, apparatus, electronic equipment and storage medium
JP2019067433A (en) Subject provision system
CN111309887B (en) Method and system for training text key content extraction model
CN112182186A (en) Intelligent customer service operation method, device and system
CN110795913A (en) Text encoding method and device, storage medium and terminal
CN115470338B (en) Multi-scenario intelligent question answering method and system based on multi-path recall
CN116737908A (en) Knowledge question-answering method, device, equipment and storage medium
CN111767394A (en) Abstract extraction method and device based on artificial intelligence expert system
US20230281389A1 (en) Topic suggestion in messaging systems
WO2023278052A1 (en) Automated troubleshooter
CN115803734A (en) Natural language enrichment using action interpretation
CN110427470A (en) Question and answer processing method, device and electronic equipment
CN112966076A (en) Intelligent question and answer generating method and device, computer equipment and storage medium
CN112925895A (en) Natural language software operation and maintenance method and device
CN114372383B (en) Scene fast switching method and system based on VR simulation scene
CN114661864A (en) Psychological consultation method and device based on controlled text generation and terminal equipment
CN112632246A (en) Robot dialogue method and device based on deep learning and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination