CN116737894B - Intelligent robot service system based on model training - Google Patents


Info

Publication number
CN116737894B
CN116737894B (application CN202310646279.XA)
Authority
CN
China
Prior art keywords
feedback information
feedback
encoder
decoder
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310646279.XA
Other languages
Chinese (zh)
Other versions
CN116737894A (en)
Inventor
凌玉飞
张棋光
车浩流
Current Assignee
Shenzhen Keyike Information Technology Co ltd
Original Assignee
Shenzhen Keyike Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Keyike Information Technology Co., Ltd.
Priority to CN202310646279.XA
Publication of CN116737894A
Application granted
Publication of CN116737894B
Legal status: Active

Classifications

    • G06F 16/3329 — Natural language query formulation or dialogue systems
    • G06F 16/335 — Filtering based on additional data, e.g. user or group profiles
    • G06N 3/0455 — Auto-encoder networks; encoder-decoder networks


Abstract

The invention relates to the field of artificial intelligence, and in particular to an intelligent robot service system based on model training. The system comprises a selection module for selecting any question to be answered from a first database; an input module for inputting the question into a language model of the Transformer architecture to form a first feedback list; a receiving module for receiving user feedback while the feedback information is presented in different characterization orders, so as to form a second feedback list; an extraction module for extracting the first feedback information ranked first in the first feedback list and comparing its service satisfaction with that of the first-ranked feedback information in the second feedback list to obtain a comparison result; and an adjustment module that adjusts the tuning strategy of the language model according to the comparison result so that the first feedback information in the first feedback list becomes the optimal feedback information. By adjusting the language model, the invention enables the information the robot system feeds back for a user's question to maximize user satisfaction.

Description

Intelligent robot service system based on model training
Technical Field
The invention relates to the field of artificial intelligence, in particular to an intelligent robot service system based on model training.
Background
With the rapid development of science and technology, artificial intelligence has penetrated many industries, and intelligent machines in some of them can search for and answer simple questions posed by users. These machines can answer questions but still have limitations: in many cases the answers users want should feel warmer and more human. To optimize the answers given by artificial intelligence and improve interaction with people, an artificial intelligence optimization system needs to be designed.
Patent document CN115759123A discloses an intelligent question-answering robot system comprising an acquisition model for acquiring the question posed by a user, a question-answering model, and an output model. The question-answering model finds the stored question that best matches the user's query and gives the corresponding answer, using two techniques: converting words into vectors and the word mover's distance. The output model receives the answer given by the question-answering model and outputs it.
The intelligent question-answering robot of the prior art answers users' questions by searching a database; the feedback information it presents takes a single form, is limited, and interacts poorly with users.
Disclosure of Invention
Therefore, the present invention provides an intelligent robot service system based on model training, which can solve the problems that a robot's feedback information takes a single form, has limitations, and interacts poorly with users.
To achieve the above object, the present invention provides an intelligent robot service system based on model training, comprising: a selection module for selecting any question to be answered from a first database, in which a plurality of questions to be answered are prestored;
an input module, connected to the selection module, for inputting the question to be answered into a language model based on the Transformer architecture and outputting at least one piece of feedback information for that question, a plurality of pieces of feedback information forming a first feedback list, each piece of feedback information comprising at least two information characterizations;
a receiving module for receiving user feedback while the feedback information is presented in different characterization orders, the user feedback representing the user's service satisfaction with the feedback information, and for ordering the feedback information from high to low satisfaction to form a second feedback list;
an extraction module, connected to the input module, for extracting the first feedback information ranked first in the first feedback list and comparing its service satisfaction with that of the first-ranked feedback information in the second feedback list to obtain a comparison result;
an adjustment module, connected to the extraction module, for adjusting the tuning strategy of the Transformer-architecture language model according to the comparison result so that the first feedback information in the first feedback list becomes the optimal feedback information.
Further, the adjustment module comprises an adding unit, a calculation unit and an iteration unit.
The adding unit is used for adding a KL-divergence term to the loss function of the Transformer-architecture language model.
The calculation unit is used for calculating the KL-divergence term of the loss function according to the comparison result, where the KL term r_KL is expressed as r_KL = log(π_RL(y|x) / π_SFT(y|x)), in which π_RL denotes the output distribution of the language model over the first feedback information and π_SFT denotes the output distribution over the feedback information after the adjustment module has tuned the language model.
The iteration unit, connected to the calculation unit, is used for iterating the first feedback information in the first feedback list toward the optimal feedback information by minimizing the KL divergence.
Further, the iteration unit is preset with a standard threshold during iteration. When a comparison result is obtained, if the service satisfaction of the first feedback information in the first feedback list relative to that in the second feedback list is greater than or equal to the standard threshold, the KL-divergence value is reduced repeatedly, so that the Transformer-architecture language model is updated multiple times, until the KL-divergence value is minimal and the first feedback information in the first feedback list is output as the optimal feedback information.
Further, if that service satisfaction is smaller than the standard threshold, the Transformer-architecture language model is updated once so that the KL-divergence value is minimal, and the first feedback information in the first feedback list is output as the optimal feedback information.
Further, the Transformer-architecture language model comprises an input embedding layer, an encoder, a decoder and an output layer.
The input embedding layer converts the word sequence of the input question into vector representations.
The encoder converts the vector representations into the encoder's output vectors.
The decoder converts the encoder's output vectors into query vectors and computes over the query vectors and the encoder's output representations to obtain a contextual representation of the question, yielding the decoder's output vectors.
The output layer maps the decoder's output vectors to the answer to the question.
Further, the encoder comprises an encoder position encoder, an encoder multi-head self-attention mechanism and an encoder forward neural-network layer, and works as follows:
the encoder position encoder encodes the positional information of each word of the question into a vector;
the vectors are input to the encoder's multi-head self-attention mechanism, which establishes the relations between the words of the question within the input sequence;
the encoder forward neural network processes each word's representation from the multi-head self-attention mechanism to obtain the encoder's output vectors.
Further, the decoder comprises a decoder position encoder, a decoder multi-head self-attention mechanism, a multi-head attention mechanism and a decoder forward neural-network layer, and works as follows:
the decoder position encoder encodes the positional information of each word into a vector;
the vectors are input to the decoder's multi-head self-attention mechanism, which establishes the relations between words within the decoder to obtain the decoder's intermediate-layer representation;
the decoder's intermediate-layer representation is input to the multi-head attention mechanism;
the decoder forward neural network processes each word's representation from the multi-head attention mechanism to obtain the decoder's output vectors.
The Transformer-architecture language model works as follows:
the word sequence of the question is input to the input embedding layer and converted into vector representations;
the encoder's multi-head self-attention mechanism processes the words of the question and, from each word's position and semantic relations, computes its importance to the sentence, obtaining the input representation of each word;
the feed-forward fully connected layer encodes the word representations produced by the encoder's multi-head self-attention mechanism, strengthening their semantic expressiveness;
the output vectors produced by the encoder's multi-head self-attention mechanism and the feed-forward fully connected layer are taken as input, and the answer to the question is output through the decoder's attention mechanisms and the multi-layer decoder stack;
a normalized exponential function (softmax) yields the probability distribution over the output answers.
Further, the receiving module aggregates user feedback through big-data statistics, assigns each piece of feedback information a user service-satisfaction score based on the statistical result, and orders the feedback information from high to low satisfaction.
Further, the minimized KL divergence is obtained by a gradient optimization algorithm, which searches for the minimum of the objective function by computing its gradient and continuously adjusting the function parameters so that the objective value keeps decreasing until a local or global minimum is reached.
Compared with the prior art, the intelligent robot service system based on model training has the following beneficial effects. The system comprises the selection module, the input module, the receiving module, the extraction module and the adjustment module. The selection module, by selecting any question to be answered from the first database, reduces repetition and improves the efficiency of the subsequent workflow. The input module outputs at least one piece of feedback information for the selected question through a language model based on the Transformer architecture; because the Transformer-architecture language model supports parallel computation, it reduces the consumption of computing resources. The receiving module orders the feedback information from high to low service satisfaction, improving the scoring consistency of annotators. The extraction module extracts the first feedback information ranked first in the first feedback list and compares its service satisfaction with that in the second feedback list to obtain a comparison result. The adjustment module tunes the Transformer-architecture language model so that the information the robot system feeds back for a user's question maximizes user satisfaction.
Further, the adjustment module comprises the adding unit, the calculation unit and the iteration unit, and can continuously update the Transformer-architecture language model so that its output answers approach the answer with the highest satisfaction.
Further, the iteration unit sets a standard threshold by which the update process of the Transformer-architecture language model can be judged, improving update efficiency and achieving the optimal update result.
Further, the Transformer-architecture language model imposes no position-dependent sequential operations and offers strong modeling capability, generality and extensibility.
Further, through the multi-head self-attention mechanism, the Transformer-architecture language model can attend to different parts of the question from different positions and angles, improving the model's representational capability.
Further, the gradient optimization algorithm can select a reasonable parameter-update direction, improving the efficiency of KL-divergence minimization.
Drawings
FIG. 1 is a schematic diagram of an intelligent robot service system based on model training according to an embodiment of the present invention;
FIG. 2 is another schematic structural diagram of an intelligent robot service system based on model training according to an embodiment of the present invention.
Detailed Description
In order that the objects and advantages of the invention will become more apparent, the invention will be further described with reference to the following examples; it should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that, in the description of the present invention, terms such as "upper," "lower," "left," "right," "inner," "outer," and the like indicate directions or positional relationships based on the directions or positional relationships shown in the drawings, which are merely for convenience of description, and do not indicate or imply that the apparatus or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, it should be noted that, in the description of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Referring to FIG. 1, the present invention provides an intelligent robot service system based on model training, comprising:
the selection module 10, configured to select any question to be answered from a first database in which a plurality of questions to be answered are prestored;
the input module 20, connected to the selection module 10 and configured to input the question to be answered into a language model based on the Transformer architecture and to output at least one piece of feedback information for that question, a plurality of pieces of feedback information forming a first feedback list, each piece of feedback information comprising at least two information characterizations;
the receiving module 30, configured to receive user feedback while the feedback information is presented in different characterization orders, the user feedback representing the user's service satisfaction with the feedback information, and to order the feedback information from high to low satisfaction to form a second feedback list;
the extraction module 40, connected to the input module 20 and configured to extract the first feedback information ranked first in the first feedback list and to compare its service satisfaction with that of the first-ranked feedback information in the second feedback list, obtaining a comparison result;
the adjustment module 50, connected to the extraction module 40 and configured to adjust the tuning strategy of the Transformer-architecture language model according to the comparison result so that the first feedback information in the first feedback list becomes the optimal feedback information.
Specifically, the selection module 10 selects any question to be answered from the first database, which reduces repetition and improves the efficiency of the subsequent workflow. The input module 20 outputs at least one piece of feedback information for the selected question through a language model based on the Transformer architecture; because the model supports parallel computation, it reduces the consumption of computing resources. The receiving module 30 orders the feedback information from high to low service satisfaction, improving the consistency of the satisfaction scores. The extraction module 40 compares the service satisfaction of the first feedback information in the first feedback list with that in the second feedback list, obtaining a comparison result that supports subsequent calculation. The adjustment module 50 tunes the Transformer-architecture language model so that the information the robot system feeds back for a user's question achieves the highest satisfaction, better optimizing the feedback output in human-computer interaction and improving its practicality.
Specifically, referring to FIG. 2, the adjustment module 50 comprises an adding unit 51, a calculation unit 52 and an iteration unit 53.
The adding unit 51 is configured to add a KL-divergence term to the loss function of the Transformer-architecture language model.
The calculation unit 52 is configured to calculate the KL-divergence term of the loss function according to the comparison result, where the KL term r_KL is expressed as r_KL = log(π_RL(y|x) / π_SFT(y|x)), in which π_RL denotes the output distribution of the language model over the first feedback information and π_SFT denotes the output distribution over the feedback information after the adjustment module 50 has tuned the language model.
The iteration unit 53 is connected to the calculation unit 52 and configured to iterate the first feedback information in the first feedback list toward the optimal feedback information by minimizing the KL divergence.
Specifically, the adding unit 51 adds the KL divergence to the loss function of the Transformer-architecture language model to guide its updates: the model is updated with minimal modification, and computing the KL divergence keeps the output of the tuned model close to the output for the first feedback information.
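As a concrete illustration, the KL term described above can be computed from two output distributions. The sketch below is a minimal assumed form using token-level probability tables; the patent does not fix a particular estimator, and the function name is ours.

```python
import math

def kl_penalty(p_rl, p_sft):
    """KL divergence between the model's output distribution p_rl and the
    reference distribution p_sft, each a dict of token -> probability.
    One common realization of the r_KL term; an illustrative assumption."""
    return sum(p * math.log(p / p_sft[t]) for t, p in p_rl.items() if p > 0)

# When the two distributions coincide the penalty vanishes, so minimizing
# it pulls the tuned model's output back toward the reference output.
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.9, "b": 0.1}
print(kl_penalty(p, p))      # 0.0
print(kl_penalty(p, q) > 0)  # True
```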
Specifically, during iteration the iteration unit 53 presets a standard threshold. When the comparison result is obtained, if the service satisfaction of the first feedback information in the first feedback list relative to that in the second feedback list is greater than or equal to the standard threshold, the KL-divergence value is reduced repeatedly, updating the Transformer-architecture language model multiple times, until the KL-divergence value is minimal and the first feedback information in the first feedback list is output as the optimal feedback information.
Specifically, the standard threshold makes it possible to judge the update process of the Transformer-architecture language model, improving update efficiency, achieving the optimal update result, and thereby effectively optimizing the feedback information.
Specifically, if that service satisfaction is smaller than the standard threshold, the Transformer-architecture language model is updated once so that the KL-divergence value is minimal, and the first feedback information in the first feedback list is output as the optimal feedback information.
Specifically, when the service satisfaction is below the standard threshold, the update process of the Transformer-architecture language model is shortened, improving its update efficiency.
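The iteration unit's threshold rule can be sketched as follows. Function name, reduction factor and tolerance are illustrative assumptions; the patent gives no concrete values, and each shrink of the KL value stands in for one update of the language model.

```python
def iterate_until_min_kl(kl_value, threshold, satisfaction, step=0.5, tol=1e-3):
    """At or above the standard threshold: reduce the KL value over multiple
    rounds (multiple model updates). Below it: a single update drives the
    KL value to its minimum. A sketch of the rule, not the patent's code."""
    updates = 0
    if satisfaction >= threshold:
        while kl_value > tol:   # reduce KL repeatedly
            kl_value *= step    # each shrink = one model update
            updates += 1
    else:
        kl_value = 0.0          # one update reaches the minimum
        updates = 1
    return kl_value, updates

print(iterate_until_min_kl(1.0, 0.8, 0.9))  # many small updates
print(iterate_until_min_kl(1.0, 0.8, 0.5))  # a single update
```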
Specifically, the Transformer-architecture language model comprises an input embedding layer, an encoder, a decoder and an output layer.
The input embedding layer converts the word sequence of the input question into vector representations.
The encoder converts the vector representations into the encoder's output vectors.
The decoder converts the encoder's output vectors into query vectors and computes over the query vectors and the encoder's output representations to obtain a contextual representation of the question, yielding the decoder's output vectors.
The output layer maps the decoder's output vectors to the answer to the question.
Specifically, this structure achieves fully parallel computation, a flexible and extensible architecture, good pre-training performance, and the ability to process multi-modal data.
Specifically, the encoder comprises an encoder position encoder, an encoder multi-head self-attention mechanism and an encoder forward neural-network layer, and works as follows:
the encoder position encoder encodes the positional information of each word of the question into a vector;
the vectors are input to the encoder's multi-head self-attention mechanism, which establishes the relations between the words of the question within the input sequence;
the encoder forward neural network processes each word's representation from the multi-head self-attention mechanism to obtain the encoder's output vectors.
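The patent only states that word positions are encoded into vectors; the sinusoidal encoding of the original Transformer, shown below, is one assumed concrete choice for that position encoder.

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal position encoding: even dimensions use sin, odd use cos,
    at wavelengths geometric in the dimension index. An assumed choice;
    the patent does not specify the encoding scheme."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

pe = positional_encoding(4, 8)     # 4 words, model dimension 8
print(len(pe), len(pe[0]))         # 4 8
print(pe[0][0], pe[0][1])          # 0.0 1.0  (sin 0 and cos 0 at position 0)
```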
Specifically, the encoder's self-attention mechanism lets the model process information from all positions of the input sequence in a single computation, improving parallelism and efficiency, while its multi-head form improves the model's ability to learn relations between different positions in the input sequence.
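The relation-building step above is scaled dot-product attention. A single head is sketched below to keep it short; multi-head attention runs several such heads with separate projections and concatenates the results. Matrix shapes and the use of random weights are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # stable exponentials
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """One head of scaled dot-product self-attention over a sequence X
    (seq_len x d_model): every word attends to every other word at once."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise word relations
    return softmax(scores) @ V               # weighted mix of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))                  # 3 words, d_model = 4
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (3, 4): one contextualized vector per word
```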
Specifically, the decoder comprises a decoder position encoder, a decoder multi-head self-attention mechanism, a multi-head attention mechanism and a decoder forward neural-network layer, and works as follows:
the decoder position encoder encodes the positional information of each word into a vector;
the vectors are input to the decoder's multi-head self-attention mechanism, which establishes the relations between words within the decoder to obtain the decoder's intermediate-layer representation;
the decoder's intermediate-layer representation is input to the multi-head attention mechanism;
the decoder forward neural network processes each word's representation from the multi-head attention mechanism to obtain the decoder's output vectors.
In particular, the decoder's multi-head self-attention mechanism improves the model's modeling capability, making it more accurate when processing the input sequence.
Specifically, the Transformer-architecture language model works as follows:
the word sequence of the question is input to the input embedding layer and converted into vector representations;
the encoder's multi-head self-attention mechanism processes the words of the question and, from each word's position and semantic relations, computes its importance to the sentence, obtaining the input representation of each word;
the feed-forward fully connected layer encodes the word representations produced by the encoder's multi-head self-attention mechanism, strengthening their semantic expressiveness;
the output vectors produced by the encoder's multi-head self-attention mechanism and the feed-forward fully connected layer are taken as input, and the answer to the question is output through the decoder's attention mechanisms and the multi-layer decoder stack;
a normalized exponential function (softmax) yields the probability distribution over the output answers.
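The normalized exponential function in the last step maps the decoder's output scores to a probability distribution over candidate answers. A minimal sketch (the logit values are illustrative):

```python
import math

def normalized_exponential(logits):
    """Softmax: exponentiate each score and normalize so the results
    form a probability distribution over the candidate answers."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

probs = normalized_exponential([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])   # three probabilities, highest score first
print(abs(sum(probs) - 1.0) < 1e-12)  # True: they sum to 1
```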
Specifically, the Transformer-architecture language model has strong context awareness when analyzing text and sequence data, capturing long-range dependencies; its structure is efficient, parallel and extensible, and the model infers quickly even on large-scale data sets.
Specifically, the receiving module 30 aggregates users' feedback information through big-data statistics, assigns each piece of feedback information the user's service satisfaction based on the statistical result, and orders the feedback information from high to low by service satisfaction.
Specifically, the user feedback data must be large in volume, so that the satisfaction ranking derived from the statistical result of the user feedback information is more accurate.
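A minimal sketch of the ordering step performed by the receiving module: feedback entries and their satisfaction scores here are hypothetical placeholders, used only to show the high-to-low ordering that forms the second feedback list.

```python
# Hypothetical feedback entries: (feedback information, user service satisfaction).
feedback = [
    ("feedback A", 0.72),
    ("feedback B", 0.91),
    ("feedback C", 0.55),
]

# Order from high to low by service satisfaction to form the second feedback list.
second_feedback_list = sorted(feedback, key=lambda item: item[1], reverse=True)
print(second_feedback_list[0])   # the top-ranked feedback information
```

The entry at index 0 of `second_feedback_list` is then the one the extraction module compares against the first entry of the first feedback list.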
Specifically, the KL divergence is minimized by a gradient optimization algorithm, which searches for the minimum of the objective function by computing its gradient and continuously adjusting the function parameters so that the value of the objective function keeps decreasing until a local or global minimum is reached.
Specifically, the gradient optimization algorithm can select a reasonable parameter-update direction, improving the efficiency of KL-divergence minimization.
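The gradient-descent minimization of a KL divergence can be sketched as follows. This is an illustrative toy, not the patented training procedure: the three-way target distribution and learning rate are assumptions, and the distributions stand in for the π_RL and π_SFT distributions only schematically.

```python
# Toy sketch: gradient descent driving the KL divergence between a fixed target
# distribution and a parameterized softmax distribution toward its minimum.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q):
    # KL(p || q) = sum_i p_i * log(p_i / q_i)
    return float(np.sum(p * np.log(p / q)))

p_target = np.array([0.7, 0.2, 0.1])   # hypothetical target output distribution
logits = np.zeros(3)                   # parameters being adjusted
lr = 0.5                               # learning rate (assumed)

for _ in range(500):
    q = softmax(logits)
    grad = q - p_target                # gradient of KL(p_target || softmax(logits)) w.r.t. logits
    logits -= lr * grad                # step opposite the gradient, reducing the objective

print(kl(p_target, softmax(logits)))   # close to 0 after convergence
```

Each step moves the parameters in the direction that most decreases the KL value, exactly the "continuously adjusting the function parameters" behavior described above; the loop stops near the minimum, where the gradient vanishes.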
Specifically, the model-training-based intelligent robot service system provided by the embodiment of the invention can be used for natural language understanding and generation by an intelligent question-answering robot for credit services. It accurately understands and quickly responds to users' various questions in the credit-service field, reduces the workload of human customer service, improves the speed and accuracy of responses to customer inquiries, and provides users with more professional and convenient financial services.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
The foregoing description is only of the preferred embodiments of the invention and is not intended to limit the invention; various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. An intelligent robot service system based on model training, comprising:
the selection module is used for selecting any question to be fed back from a first database, a plurality of questions to be fed back being prestored in the first database;
the input module is connected with the selection module and used for inputting the question to be fed back into a language model based on the Transformer architecture and outputting at least one piece of feedback information based on the question, wherein a plurality of pieces of feedback information form a first feedback list, and each piece of feedback information comprises at least two information characterizations;
the receiving module is used for receiving user feedback information when the feedback information presents different orders of information characterizations, wherein the user feedback information represents the user's service satisfaction with the feedback information, and for ordering the feedback information from high to low by service satisfaction to form a second feedback list;
the extraction module is connected with the input module and used for extracting the first feedback information at the top of the first feedback list, comparing it with the service satisfaction of the same feedback information in the second feedback list, and obtaining a comparison result;
the adjusting module is connected with the extraction module and used for adjusting, according to the comparison result, the adjustment strategy of the language model of the Transformer architecture so that the first feedback information in the first feedback list becomes the optimal feedback information;
the adjusting module comprises a joining unit, a calculation unit, and an iteration unit;
the joining unit is used for adding a KL divergence term to the loss function of the language model of the Transformer architecture;
the calculation unit is used for calculating the KL divergence of the loss function according to the comparison result, wherein the KL divergence r_KL is expressed as r_KL = D_KL(π_RL ‖ π_SFT) = Σ_y π_RL(y) log(π_RL(y) / π_SFT(y)), where π_RL represents the output distribution probability of the first feedback information output by the language model of the Transformer architecture, and π_SFT represents the output distribution probability of the feedback information after the adjusting module adjusts the language model of the Transformer architecture;
the iteration unit is connected with the calculation unit and used for iterating the first feedback information in the first feedback list into the optimal feedback information by minimizing the KL divergence;
the iteration unit is preset with a standard threshold; in the iteration process, when the comparison result is obtained, if the service satisfaction of the first feedback information in the first feedback list relative to the first feedback information in the second feedback list is greater than or equal to the standard threshold, the KL divergence value is reduced over multiple steps, so that the language model of the Transformer architecture is updated multiple times until the KL divergence value is minimal and the first feedback information in the first feedback list is output as the optimal feedback information;
and if the service satisfaction of the first feedback information in the first feedback list relative to the first feedback information in the second feedback list is smaller than the standard threshold, the language model of the Transformer architecture is updated once so that the KL divergence value is minimal, and the first feedback information in the first feedback list is output as the optimal feedback information.
2. The model-training-based intelligent robot service system of claim 1, wherein the language model of the Transformer architecture comprises: an input embedding layer, an encoder, a decoder, and an output layer;
the input embedding layer is used for converting the word sequence of the input question to be fed back into vector representations;
the encoder is configured to convert the vector representations into the output vector of the encoder;
the decoder is used for converting the output vector of the encoder into a query vector, and computing the query vector against the encoder's output vector representation to obtain a context representation of the question to be fed back, yielding the output vector of the decoder;
the output layer is used for mapping the output vector of the decoder to the answer to the question.
3. The model-training-based intelligent robot service system of claim 2, wherein the encoder comprises an encoder position encoder, an encoder multi-head self-attention mechanism, and an encoder forward neural network layer, and the encoder operates as follows:
encoding the position information of the words in each question to be fed back into a vector using the encoder position encoder;
inputting the vector into the encoder's multi-head self-attention mechanism to establish the relations between the words of the question to be fed back within the input sequence;
and using the encoder forward neural network layer to process the representation of each word of the question to be fed back in the encoder's multi-head self-attention mechanism, obtaining the output vector of the encoder.
4. The model-training-based intelligent robot service system of claim 3, wherein the decoder comprises a decoder position encoder, a decoder multi-head self-attention mechanism, a multi-head attention mechanism, and a decoder forward neural network layer, and the decoder operates as follows:
encoding the position information of the words in each question to be fed back into a vector using the decoder position encoder;
inputting the vector into the decoder's multi-head self-attention mechanism to establish the relations between words within the decoder, obtaining the intermediate-layer representation of the decoder;
inputting the intermediate-layer representation of the decoder into the multi-head attention mechanism;
and using the decoder forward neural network layer to process the representation of each word in the multi-head attention mechanism, obtaining the output vector of the decoder.
5. The model-training-based intelligent robot service system of claim 4, wherein the language model of the Transformer architecture works as follows:
inputting the word sequence of the question to be fed back into the input embedding layer, where it is converted into vector representations;
the encoder's multi-head self-attention mechanism processes the words of the question to be fed back and calculates, according to each word's position and semantic relations, the importance of that word to the sentence, obtaining a representation of each word of the question in the input;
encoding the word representations obtained by the encoder's multi-head self-attention mechanism through a feedforward fully connected layer, enhancing the semantic expressiveness of the representations;
taking the output vector of the encoder's multi-head self-attention mechanism and the feedforward fully connected layer as input, and outputting the answer to the question through the multi-layer decoder and the encoder's multi-head self-attention mechanism;
and obtaining the probability distribution over answers to the question using a normalized exponential function.
6. The model-training-based intelligent robot service system of claim 5, wherein the receiving module aggregates users' feedback information through big-data statistics, assigns each piece of feedback information the user's service satisfaction based on the statistical result, and orders the feedback information from high to low by service satisfaction.
7. The model-training-based intelligent robot service system of claim 6, wherein the KL divergence is minimized by a gradient optimization algorithm that searches for the minimum of the objective function by computing its gradient and continuously adjusting the function parameters so that the value of the objective function keeps decreasing until a local or global minimum is reached.
CN202310646279.XA 2023-06-02 2023-06-02 Intelligent robot service system based on model training Active CN116737894B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310646279.XA CN116737894B (en) 2023-06-02 2023-06-02 Intelligent robot service system based on model training

Publications (2)

Publication Number Publication Date
CN116737894A CN116737894A (en) 2023-09-12
CN116737894B true CN116737894B (en) 2024-02-20

Family

ID=87912554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310646279.XA Active CN116737894B (en) 2023-06-02 2023-06-02 Intelligent robot service system based on model training

Country Status (1)

Country Link
CN (1) CN116737894B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444311A (en) * 2020-02-26 2020-07-24 平安科技(深圳)有限公司 Semantic understanding model training method and device, computer equipment and storage medium
CN111881279A (en) * 2020-07-28 2020-11-03 平安科技(深圳)有限公司 Transformer model-based question answering method, question answering device and storage device
CN112328767A (en) * 2020-11-11 2021-02-05 重庆邮电大学 Question-answer matching method based on BERT model and comparative aggregation framework
CN113032545A (en) * 2021-05-29 2021-06-25 成都晓多科技有限公司 Method and system for conversation understanding and answer configuration based on unsupervised conversation pre-training
WO2023273170A1 (en) * 2021-06-30 2023-01-05 同济人工智能研究院(苏州)有限公司 Welcoming robot conversation method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170124497A1 (en) * 2015-10-28 2017-05-04 Fractal Industries, Inc. System for automated capture and analysis of business information for reliable business venture outcome prediction
US10776581B2 (en) * 2018-02-09 2020-09-15 Salesforce.Com, Inc. Multitask learning as question answering
US11636102B2 (en) * 2019-09-05 2023-04-25 Verizon Patent And Licensing Inc. Natural language-based content system with corrective feedback and training
US11798534B2 (en) * 2020-10-02 2023-10-24 Salesforce.Com, Inc. Systems and methods for a multilingual speech recognition framework

Also Published As

Publication number Publication date
CN116737894A (en) 2023-09-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant