CN116737894A - Intelligent robot service system based on model training - Google Patents
- Publication number
- CN116737894A (application CN202310646279.XA)
- Authority
- CN
- China
- Prior art keywords
- feedback information
- feedback
- encoder
- decoder
- module
- Prior art date
- Legal status (assumed, not a legal conclusion): Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
Abstract
The invention relates to the field of artificial intelligence, in particular to an intelligent robot service system based on model training. The system comprises a selection module for selecting any problem to be fed back from a first database; an input module for inputting the problem to be fed back into a language model of a Transformer architecture to form a first feedback list; a receiving module for receiving user feedback information when the feedback information is presented in different information-characterization orders, thereby forming a second feedback list; an extraction module for extracting the first feedback information at the head of the first feedback list and comparing its service satisfaction with that of the first feedback information in the second feedback list to obtain a comparison result; and an adjustment module that adjusts the adjustment strategy of the language model according to the comparison result, so that the first feedback information in the first feedback list becomes the optimal feedback information. By adjusting the language model, the invention enables the information the robot system feeds back for the problems raised by users to maximize user satisfaction.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to an intelligent robot service system based on model training.
Background
With the rapid development of science and technology, artificial intelligence has penetrated many industries. Intelligent machines in some industries can search for and answer simple questions posed by users; the machines can answer the questions but still have limitations, especially where users want answers that feel warmer and more human. Therefore, in order to optimize the answers given by artificial intelligence and improve interactivity with people, an artificial-intelligence optimizing system needs to be designed.
Patent document CN115759123A discloses an intelligent question-answering robot system, which includes an acquisition model for acquiring a question posed by a user, a question-answering model, and an output model. The question-answering model finds the question that best matches the user query and then gives the corresponding answer, using two techniques: converting words into vectors and computing the word mover's distance. The output model receives the answer given by the question-answering model and outputs it.
The intelligent question-answering robot in the prior art answers questions posed by users by searching a database; the feedback information it presents is single in form, has limitations, and yields a poor interaction effect.
Disclosure of Invention
Therefore, the present invention provides an intelligent robot service system based on model training, which can solve the problems that the robot's feedback information is single in form, limited, and poor in interaction effect.
To achieve the above object, the present invention provides an intelligent robot service system based on model training, comprising: the selection module is used for selecting any problem to be fed back from a first database, and a plurality of problems to be fed back are prestored in the first database;
the input module is connected with the selection module and used for inputting the problem to be fed back into a language model based on a Transformer architecture and outputting at least one piece of feedback information based on the problem to be fed back, wherein a plurality of pieces of feedback information form a first feedback list, and each piece of feedback information comprises at least two information characterizations;
the receiving module is used for receiving user feedback information when the feedback information presents different information characterization sequences, wherein the user feedback information is used for representing service satisfaction degree of a user on the feedback information, and ordering the feedback information from high to low based on the service satisfaction degree to form a second feedback list;
the extraction module is connected with the input module and used for extracting first feedback information positioned at the first position in the first feedback list, comparing the first feedback information with the service satisfaction degree of the first feedback information in the second feedback list and obtaining a comparison result;
the adjusting module is connected with the extracting module and used for adjusting the adjustment strategy of the language model of the Transformer architecture according to the comparison result, so that the first feedback information in the first feedback list is the optimal feedback information.
Further, the adjustment module comprises a joining unit, a calculating unit and an iteration unit;
the adding unit is used for adding a KL divergence into the loss function of the language model of the Transformer architecture as one term of the loss function;
the calculation unit is used for calculating the KL divergence of the loss function according to the comparison result, wherein the KL divergence r KL The expression of (2) isSaid pi RL Output distribution probability representing the first feedback information of the language model outputting the transducer architecture, the pi SFT The output distribution probability of the feedback information after the adjustment module adjusts the language model of the transducer architecture is represented;
the iteration unit is connected with the calculation unit and used for enabling the first feedback information in the first feedback list to be iterated into optimal feedback information by minimizing the KL divergence.
Further, the iteration unit is preset with a standard threshold for the iterative process. When the comparison result is obtained, if the service satisfaction of the first feedback information in the first feedback list relative to the first feedback information in the second feedback list is greater than or equal to the standard threshold, the KL divergence value is reduced step by step over multiple iterations, so that the language model of the Transformer architecture is updated multiple times, until the KL divergence value is minimal and the first feedback information in the first feedback list is output as the optimal feedback information.
Further, if the service satisfaction of the first feedback information in the first feedback list relative to the first feedback information in the second feedback list is smaller than the standard threshold, the language model of the Transformer architecture is updated once to minimize the KL divergence value, and the first feedback information in the first feedback list is output as the optimal feedback information.
Further, the language model of the Transformer architecture includes: an input embedding layer, an encoder, a decoder, and an output layer;
the input embedding layer is used for converting word sequences input into questions to be fed back into vector representations;
the encoder is configured to convert the vector representation into an output vector of the encoder;
the decoder is used for converting the output vector of the encoder into a query vector, and calculating the query vector and the output vector representation of the encoder to obtain the context representation of the problem to be fed back to obtain the output vector of the decoder;
the output layer is used for mapping the output vector of the decoder into the answer of the feedback question.
Further, the encoder comprises an encoder position encoder, an encoder multi-head self-attention mechanism and an encoder forward neural network layer, and the working process of the encoder is as follows:
encoding position information of words in each of the questions to be fed back into a vector by using the encoder position encoder;
inputting the vector into the multi-head self-attention mechanism of the encoder, and establishing the relation between words in the to-be-fed-back problem in an input sequence;
and using the encoder forward neural network to process the representation of the word in each to-be-fed-back problem in the multi-head self-attention mechanism of the encoder to obtain an output vector of the encoder.
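The encoder steps above can be sketched as a single-head scaled dot-product self-attention pass (a simplification of the multi-head mechanism, based on standard Transformer practice); the weight matrices and dimensions below are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a word sequence X.

    X: (seq_len, d_model) position-encoded word vectors.
    Returns (seq_len, d_v) representations relating every word to the others.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise word-to-word relevance
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                  # 4 words, 8-dim embeddings
Wq = rng.normal(size=(8, 8))
Wk = rng.normal(size=(8, 8))
Wv = rng.normal(size=(8, 8))
out = self_attention(X, Wq, Wk, Wv)
```

In the full multi-head mechanism, several such heads with independent weight matrices run in parallel and their outputs are concatenated, which is what lets the encoder relate words from different positions and angles.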
Further, the decoder comprises a decoder position encoder, a decoder multi-head self-attention mechanism, a multi-head attention mechanism and a decoder forward neural network layer, and the working process of the decoder is as follows:
encoding the position information of the words in each of the questions to be fed back into a vector by using the decoder position encoder;
inputting the vector into a multi-head self-attention mechanism of a decoder, and establishing a relation between words in the decoder to obtain an intermediate layer representation of the decoder;
inputting an intermediate layer representation of the decoder into a multi-headed attention mechanism;
the decoder forward neural network is used to process the representation of each word in the multi-headed attention mechanism to obtain the output vector of the decoder.
Further, the working process of the language model of the Transformer architecture is as follows:
inputting word sequences in a problem to be fed back into the input embedding layer to be converted into vector representations;
the multi-head self-attention mechanism of the encoder processes the words of the problem to be fed back, and calculates the importance degree of each word of the problem to be fed back to the sentence of the problem to be fed back according to the current position and semantic relation of the word of the problem to be fed back, so as to obtain the representation of the word of the problem to be fed back in the input process;
encoding the representation of the to-be-fed-back problem word obtained by the multi-head self-attention mechanism of the encoder in input through the feedforward full-connection layer, and enhancing the semantic expression capability of the representation;
taking the output vectors passing through the encoder multi-head self-attention mechanism and the feedforward full-connection layer as input, and outputting the answer to the feedback problem through the decoder self-attention mechanism and the multi-layer decoder;
and obtaining the probability distribution of the answer of the output feedback problem by using a normalized exponential function.
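The normalized exponential function (softmax) mentioned above can be illustrated as follows; the logits are toy values standing in for the output layer's scores over a small answer vocabulary.

```python
import numpy as np

def normalized_exponential(logits):
    """Softmax: maps output-layer logits to a probability distribution
    over candidate answer tokens (subtracting the max for stability)."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

# Illustrative logits over a tiny 4-token answer vocabulary.
probs = normalized_exponential(np.array([2.0, 1.0, 0.1, -1.0]))
```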
Further, the receiving module counts the feedback information of the user through big data, gives the service satisfaction degree of the user on the feedback information based on the statistics result, and sorts the feedback information based on the service satisfaction degree from high to low.
Further, the minimized KL divergence is obtained by a gradient optimization algorithm, which searches for the minimum of the objective function by computing its gradient and continuously adjusting the function parameters, so that the value of the objective function keeps decreasing until a local minimum or a global minimum is reached.
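A minimal illustration of gradient-based KL minimization, assuming a small categorical distribution parameterized by logits; the target distribution, learning rate, and iteration count are illustrative choices, not values from the patent.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q):
    # KL(p || q) for categorical distributions.
    return float(np.sum(p * np.log(p / q)))

# Target distribution (stand-in for the satisfaction-optimal outputs)
# and trainable logits for the model distribution.
target = np.array([0.7, 0.2, 0.1])
logits = np.zeros(3)

lr = 0.5
for _ in range(500):
    q = softmax(logits)
    # Gradient of KL(target || softmax(logits)) w.r.t. the logits is q - target.
    logits -= lr * (q - target)

final_kl = kl(target, softmax(logits))
```

Each step moves the parameters opposite the gradient, so the objective decreases monotonically for a small enough learning rate until a minimum is reached.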
Compared with the prior art, the intelligent robot service system based on model training has the following beneficial effects. The system comprises the selection module, the input module, the receiving module, the extraction module and the adjustment module. The selection module reduces repetition and improves the efficiency of the subsequent working process by selecting any problem to be fed back from the first database. The input module outputs at least one piece of feedback information for the selected problem through a language model based on the Transformer architecture; the language model's parallel computing capability reduces the consumption of computing resources. The receiving module sorts the feedback information from high to low based on service satisfaction, which improves the scoring consistency of labeling personnel. The extraction module extracts the first feedback information at the head of the first feedback list and compares its service satisfaction with that of the first feedback information in the second feedback list to obtain a comparison result. The language model of the Transformer architecture is adjusted through the adjusting module, so that user satisfaction with the information the robot system feeds back for the problems users raise can be maximized.
Further, the adjustment module comprises a joining unit, a calculating unit and an iteration unit, and can continuously update the language model of the Transformer architecture, so that the answer output by the language model approaches the answer with the highest satisfaction.
Further, the iteration unit sets a standard threshold, through which the updating process of the language model of the Transformer architecture can be judged, improving the updating efficiency of the language model and achieving the optimal updating result.
Further, the language model of the Transformer architecture has the advantages of not being limited by position-dependent sequential operations, strong modeling capability, strong universality and strong extensibility.
Further, through the multi-head self-attention mechanism, the language model of the Transformer architecture can focus on different parts of the problem to be fed back from different positions and angles, improving the representation capability of the model.
Further, the gradient optimization algorithm can select a reasonable parameter updating direction, and efficiency of KL divergence minimization is improved.
Drawings
FIG. 1 is a schematic diagram of an intelligent robot service system based on model training according to an embodiment of the present invention;
fig. 2 is a schematic diagram of another structure of an intelligent robot service system based on model training according to an embodiment of the present invention.
Detailed Description
In order that the objects and advantages of the invention will become more apparent, the invention will be further described with reference to the following examples; it should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that, in the description of the present invention, terms such as "upper," "lower," "left," "right," "inner," "outer," and the like indicate directions or positional relationships based on the directions or positional relationships shown in the drawings, which are merely for convenience of description, and do not indicate or imply that the apparatus or elements must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, it should be noted that, in the description of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Referring to fig. 1, the present invention provides an intelligent robot service system based on model training, which includes:
the selection module 10 is configured to select any problem to be fed back from a first database, where a plurality of problems to be fed back are stored in the first database in advance;
the input module 20 is connected with the selection module 10, and is configured to input the to-be-fed-back problem to a language model based on a transducer architecture, output at least one feedback information based on the to-be-fed-back problem, and form a first feedback list by a plurality of feedback information, where the feedback information includes at least two information characterizations;
the receiving module 30 is configured to receive user feedback information when the feedback information presents different information characterization sequences, where the user feedback information is used to represent service satisfaction of a user on the feedback information, and order the feedback information from high to low based on the service satisfaction, so as to form a second feedback list;
the extracting module 40 is connected with the input module 20, and is configured to extract first feedback information located at the first position in the first feedback list, and compare the first feedback information with service satisfaction of the first feedback information in the second feedback list, so as to obtain a comparison result;
the adjustment module 50 is connected to the extraction module 40, and is configured to adjust an adjustment policy of the language model of the transducer architecture according to the comparison result, so that the first feedback information in the first feedback list is the optimal feedback information.
Specifically, the selection module 10 selects any problem to be fed back from the first database, which reduces repetition and improves the efficiency of the subsequent working process; the input module 20 outputs at least one piece of feedback information for the selected problem through a language model based on the Transformer architecture, whose parallel computing capability reduces the consumption of computing resources; the receiving module 30 orders the feedback information from high to low based on service satisfaction, improving the consistency of the satisfaction scores; the extracting module 40 extracts the first feedback information in the first feedback list and compares its service satisfaction with that of the first feedback information in the second feedback list to obtain a comparison result that supports subsequent calculation; the language model of the Transformer architecture is adjusted by the adjusting module 50, so that user satisfaction with the information the robot system feeds back for the problems users raise can be maximized, better optimizing the feedback information output in the human-computer interaction process and improving its practicability.
Specifically, referring to fig. 2, the adjustment module 50 includes a joining unit 51, a calculating unit 52, and an iterating unit 53;
the adding unit 51 is configured to add KL divergence as one of the loss functions in the loss function of the language model of the transducer architecture;
the calculation unit 52 is configured to calculate a KL divergence of the loss function according to the comparison result, wherein the KL divergence r KL The expression of (2) isThe saidπ RL Output distribution probability representing the first feedback information of the language model outputting the transducer architecture, the pi SFT The output distribution probability of the feedback information after the adjustment module 50 adjusts the language model of the transducer architecture is represented;
the iteration unit 53 is connected to the calculation unit 52 for iterating the first feedback information in the first feedback list to the optimal feedback information by minimizing the KL divergence.
Specifically, the adding unit 51 adds the KL divergence to the loss function of the language model of the Transformer architecture to help update the model, so that the language model of the Transformer architecture is updated with minimal modification; by calculating the KL divergence, the output distributions of the first feedback information before and after the adjustment of the language model are kept close.
Specifically, in the iterative process, the iteration unit 53 presets a standard threshold. When the comparison result is obtained, if the service satisfaction of the first feedback information in the first feedback list relative to the first feedback information in the second feedback list is greater than or equal to the standard threshold, the KL divergence value is reduced step by step over multiple iterations, updating the language model of the Transformer architecture multiple times, until the KL divergence value is minimal and the first feedback information in the first feedback list is output as the optimal feedback information.
Specifically, the updating process of the language model of the Transformer architecture can be judged through the standard threshold, which improves the updating efficiency of the language model, achieves the optimal updating result, further optimizes the feedback information effectively, and improves the optimization efficiency.
Specifically, if the service satisfaction of the first feedback information in the first feedback list relative to the first feedback information in the second feedback list is smaller than the standard threshold, the language model of the Transformer architecture is updated once to minimize the KL divergence value, and the first feedback information in the first feedback list is output as the optimal feedback information.
Specifically, when the service satisfaction of the first feedback information in the first feedback list relative to the first feedback information in the second feedback list is smaller than the standard threshold, the updating process of the language model of the Transformer architecture is shortened, improving its updating efficiency.
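The threshold branching described in the paragraphs above might be sketched as follows; the function name, score scale, and return labels are illustrative assumptions, not terms from the patent.

```python
def update_schedule(satisfaction, threshold):
    """Sketch of the iteration unit's branching: satisfaction at or above
    the threshold triggers multiple incremental updates (stepwise KL
    reduction); below it, a single update drives the KL term to its
    minimum directly. Labels and score scale are illustrative."""
    if satisfaction >= threshold:
        return "multiple-updates"
    return "single-update"

decision = update_schedule(0.9, 0.8)
```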
Specifically, the language model of the Transformer architecture includes: an input embedding layer, an encoder, a decoder, and an output layer;
the input embedding layer is used for converting word sequences input into questions to be fed back into vector representations;
the encoder is configured to convert the vector representation into an output vector of the encoder;
the decoder is used for converting the output vector of the encoder into a query vector, and calculating the query vector and the output vector representation of the encoder to obtain the context representation of the problem to be fed back to obtain the output vector of the decoder;
the output layer is used for mapping the output vector of the decoder into the answer of the feedback question.
Specifically, the structure of the language model of the Transformer architecture has the advantages of fully parallel computation, a flexible and extensible structure, good pre-training effect, and the ability to process multi-modal data.
Specifically, the encoder comprises an encoder position encoder, an encoder multi-head self-attention mechanism and an encoder forward neural network layer, and the working process of the encoder is as follows:
encoding position information of words in each of the questions to be fed back into a vector by using the encoder position encoder;
inputting the vector into the multi-head self-attention mechanism of the encoder, and establishing the relation between words in the to-be-fed-back problem in an input sequence;
and using the encoder forward neural network to process the representation of the word in each to-be-fed-back problem in the multi-head self-attention mechanism of the encoder to obtain an output vector of the encoder.
Specifically, the encoder adopts the encoder self-attention mechanism, so that the model can process information of all positions in the input sequence simultaneously in one calculation, parallelism and efficiency of the model are improved, and the encoder simultaneously uses the encoder multi-head self-attention mechanism, so that the capability of the model to learn relations among different positions in the input sequence is improved.
Specifically, the decoder comprises a decoder position encoder, a decoder multi-head self-attention mechanism, a multi-head attention mechanism and a decoder forward neural network layer, and the working process of the decoder is as follows:
encoding the position information of the words in each of the questions to be fed back into a vector by using the decoder position encoder;
inputting the vector into a multi-head self-attention mechanism of a decoder, and establishing a relation between words in the decoder to obtain an intermediate layer representation of the decoder;
inputting an intermediate layer representation of the decoder into a multi-headed attention mechanism;
the decoder forward neural network is used to process the representation of each word in the multi-headed attention mechanism to obtain the output vector of the decoder.
In particular, the decoder uses the decoder multi-headed self-attention mechanism to improve the modeling capability of the model so that it is more accurate in processing the input sequence.
Specifically, the working process of the language model of the Transformer architecture is as follows:
inputting word sequences in a problem to be fed back into the input embedding layer to be converted into vector representations;
the multi-head self-attention mechanism of the encoder processes the words of the problem to be fed back, and calculates the importance degree of each word of the problem to be fed back to the sentence of the problem to be fed back according to the current position and semantic relation of the word of the problem to be fed back, so as to obtain the representation of the word of the problem to be fed back in the input process;
encoding the representation of the to-be-fed-back problem word obtained by the multi-head self-attention mechanism of the encoder in input through the feedforward full-connection layer, and enhancing the semantic expression capability of the representation;
taking the output vectors passing through the encoder multi-head self-attention mechanism and the feedforward full-connection layer as input, and outputting the answer to the feedback problem through the decoder self-attention mechanism and the multi-layer decoder;
and obtaining the probability distribution of the answer of the output feedback problem by using a normalized exponential function.
Specifically, the language model of the transducer architecture has strong context awareness capability when analyzing text and sequence data, so that the model has long-term dependence in learning text and sequence, the structure of the language model of the transducer architecture has high efficiency, parallelism and expandability, and the reasoning speed of the model is fast for a large-scale data set.
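Two of the building blocks in the pipeline above can be illustrated concretely: the sinusoidal position encoding used by the original Transformer, and the normalized exponential function (softmax) that maps decoder outputs to an answer probability distribution. A minimal sketch, where `d_model` and the logit values are chosen purely for illustration:

```python
import math

def positional_encoding(pos, d_model):
    """Sinusoidal position encoding: sin on even dimensions, cos on odd ones."""
    return [
        math.sin(pos / 10000 ** (i / d_model)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / d_model))
        for i in range(d_model)
    ]

def output_distribution(logits):
    """Normalized exponential function (softmax): maps the output layer's
    logits to a probability distribution over candidate answers."""
    m = max(logits)
    es = [math.exp(x - m) for x in logits]
    s = sum(es)
    return [e / s for e in es]
```

For example, `positional_encoding(0, 4)` is `[0.0, 1.0, 0.0, 1.0]`, and `output_distribution` always returns non-negative values summing to 1, which is why it can be read directly as the probability of each candidate answer.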
Specifically, the receiving module 30 aggregates user feedback information through big-data statistics, assigns each piece of feedback information a user service satisfaction score based on the statistical result, and orders the feedback information from high to low by service satisfaction.
Specifically, the user feedback information requires a large amount of data, so the satisfaction ranking derived from the statistics of user feedback information is more accurate.
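The ranking and comparison performed by the receiving and extraction modules can be sketched as follows. The dictionary of satisfaction scores and the function names are hypothetical illustrations of the described behavior, not the patented data format:

```python
def build_second_feedback_list(satisfaction):
    """Order feedback entries from high to low aggregated service satisfaction.
    `satisfaction` maps a feedback identifier to its user score (hypothetical)."""
    return sorted(satisfaction, key=satisfaction.get, reverse=True)

def compare_first(first_list, second_list):
    """Comparison result used by the extraction module: does the model's
    top-ranked feedback match the users' top-ranked feedback?"""
    return first_list[0] == second_list[0]
```

For instance, with scores `{"a": 0.2, "b": 0.9, "c": 0.5}` the second feedback list is `["b", "c", "a"]`, and the comparison result is positive only when the model also ranked `"b"` first.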
Specifically, the minimized KL divergence is obtained by a gradient optimization algorithm. The gradient optimization algorithm searches for the minimum of the objective function by computing its gradient and continuously adjusting the function parameters, so that the value of the objective function keeps decreasing until a local or global minimum is reached.
Specifically, the gradient optimization algorithm can select a reasonable parameter update direction, improving the efficiency of KL divergence minimization.
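As a toy illustration of minimizing a KL divergence by gradient descent, the sketch below fits the logits of a single categorical distribution to a fixed target distribution. In the patent's setting the distributions π_RL and π_SFT come from large language models and the optimization runs over network weights, so this is only a minimal analogue under simplified assumptions:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def kl(p, q):
    """KL divergence between two discrete distributions, KL(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def minimize_kl(theta, q, lr=0.5, steps=200):
    """Gradient descent on logits `theta` so that softmax(theta) approaches q,
    minimizing KL(softmax(theta) || q)."""
    for _ in range(steps):
        p = softmax(theta)
        d = kl(p, q)
        # analytic gradient: dKL/dtheta_i = p_i * (log(p_i / q_i) - KL)
        grad = [pi * (math.log(pi / qi) - d) for pi, qi in zip(p, q)]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta
```

Starting from uniform logits, each update moves the distribution toward the target, so the KL divergence decreases monotonically toward zero, mirroring the iterative update of the language model described above.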
Specifically, the intelligent robot service system based on model training provided by the embodiment of the invention can be used for natural language understanding and generation in an intelligent question-answering robot for credit services. It achieves accurate understanding of and quick response to users' questions in the credit service field, reduces the workload of manual customer service, improves the response speed and accuracy of customer consultation, and provides users with more professional and convenient financial services.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
The foregoing description is only of the preferred embodiments of the invention and is not intended to limit the invention; various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. An intelligent robot service system based on model training, comprising:
the selection module is used for selecting any problem to be fed back from a first database, and a plurality of problems to be fed back are prestored in the first database;
the input module is connected with the selection module and is used for inputting the question to be fed back into a language model based on a Transformer architecture and outputting at least one piece of feedback information based on the question to be fed back, wherein a plurality of pieces of feedback information form a first feedback list, and the feedback information comprises at least two information characterizations;
the receiving module is used for receiving user feedback information when the feedback information presents different information characterization sequences, wherein the user feedback information is used for representing service satisfaction degree of a user on the feedback information, and ordering the feedback information from high to low based on the service satisfaction degree to form a second feedback list;
the extraction module is connected with the input module and used for extracting first feedback information positioned at the first position in the first feedback list, comparing the first feedback information with the service satisfaction degree of the first feedback information in the second feedback list and obtaining a comparison result;
the adjusting module is connected with the extraction module and is used for adjusting the adjustment strategy of the language model of the Transformer architecture according to the comparison result, so that the first feedback information in the first feedback list becomes the optimal feedback information.
2. The model training-based intelligent robot service system of claim 1, wherein the adjustment module comprises a joining unit, a calculation unit, and an iteration unit;
the adding unit is used for adding a KL divergence to the loss function of the language model of the Transformer architecture as one term of the loss function;
the calculation unit is used for calculating the KL divergence of the loss function according to the comparison result, wherein the KL divergence r_KL is expressed as r_KL = KL(π_RL ‖ π_SFT) = Σ π_RL log(π_RL / π_SFT), wherein π_RL represents the output distribution probability of the first feedback information output by the language model of the Transformer architecture, and π_SFT represents the output distribution probability of the feedback information after the adjustment module adjusts the language model of the Transformer architecture;
the iteration unit is connected with the calculation unit and used for enabling the first feedback information in the first feedback list to be iterated into optimal feedback information by minimizing the KL divergence.
3. The intelligent robot service system based on model training according to claim 2, wherein the iteration unit presets a standard threshold for the iterative process; when the comparison result is obtained, if the service satisfaction degree of the first feedback information in the first feedback list compared with that in the second feedback list is greater than or equal to the standard threshold, the KL divergence value is reduced multiple times, so as to update the language model of the Transformer architecture multiple times, until the KL divergence value is minimal and the first feedback information in the first feedback list is output as the optimal feedback information.
4. The intelligent robot service system based on model training according to claim 3, wherein if the service satisfaction degree of the first feedback information in the first feedback list compared with that in the second feedback list is smaller than the standard threshold, the language model of the Transformer architecture is updated once so that the KL divergence value becomes minimal, and the first feedback information in the first feedback list is output as the optimal feedback information.
5. The model training-based intelligent robot service system of claim 4, wherein the language model of the Transformer architecture comprises: an input embedding layer, an encoder, a decoder, and an output layer;
the input embedding layer is used for converting the word sequence of an input question to be fed back into vector representations;
the encoder is configured to convert the vector representations into the output vector of the encoder;
the decoder is used for converting the output vector of the encoder into a query vector, and computing with the query vector and the output vector representation of the encoder to obtain the context representation of the question to be fed back, thereby obtaining the output vector of the decoder;
the output layer is used for mapping the output vector of the decoder to the answer to the feedback question.
6. The model training-based intelligent robot service system of claim 5, wherein the encoder comprises an encoder position encoder, an encoder multi-head self-attention mechanism and an encoder forward neural network layer, and the working process of the encoder is as follows:
encoding the position information of the words in each question to be fed back into a vector by using the encoder position encoder;
inputting the vector into the encoder multi-head self-attention mechanism, and establishing relations between the words of the question to be fed back in the input sequence;
and using the encoder forward neural network to process the representation of the words of each question to be fed back in the encoder multi-head self-attention mechanism, to obtain the output vector of the encoder.
7. The model training-based intelligent robot service system of claim 6, wherein the decoder comprises a decoder position encoder, a decoder multi-head self-attention mechanism, a multi-head attention mechanism, and a decoder forward neural network layer, and the working process of the decoder is as follows:
encoding the position information of the words in each question to be fed back into a vector by using the decoder position encoder;
inputting the vector into the decoder multi-head self-attention mechanism, and establishing relations between the words in the decoder to obtain an intermediate layer representation of the decoder;
inputting the intermediate layer representation of the decoder into the multi-head attention mechanism;
and using the decoder forward neural network to process the representation of each word in the multi-head attention mechanism, to obtain the output vector of the decoder.
8. The intelligent robot service system based on model training of claim 7, wherein the working process of the language model of the Transformer architecture is as follows:
inputting the word sequence of a question to be fed back into the input embedding layer, where it is converted into vector representations;
the encoder multi-head self-attention mechanism processes the words of the question to be fed back and, according to the current position and semantic relations of each word, calculates the importance of each word to the sentence, so as to obtain the representation of each input word;
encoding the word representations obtained by the encoder multi-head self-attention mechanism through the feedforward fully-connected layer, enhancing their semantic expression capability;
taking the output vectors produced by the encoder multi-head self-attention mechanism and the feedforward fully-connected layer as input, and outputting answers to the feedback questions through the self-attention mechanism and the multi-layer decoder;
and obtaining the probability distribution over answers to the feedback question by using a normalized exponential function.
9. The intelligent robot service system based on model training according to claim 8, wherein the receiving module aggregates user feedback information through big-data statistics, assigns each piece of feedback information a user service satisfaction score based on the statistical result, and orders the feedback information from high to low by service satisfaction.
10. The intelligent robot service system based on model training according to claim 9, wherein the minimized KL divergence is obtained by a gradient optimization algorithm that searches for the minimum of the objective function by computing its gradient and continuously adjusting the function parameters, so that the value of the objective function keeps decreasing until a local or global minimum is reached.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310646279.XA CN116737894B (en) | 2023-06-02 | 2023-06-02 | Intelligent robot service system based on model training |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116737894A true CN116737894A (en) | 2023-09-12 |
CN116737894B CN116737894B (en) | 2024-02-20 |
Family
ID=87912554
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310646279.XA Active CN116737894B (en) | 2023-06-02 | 2023-06-02 | Intelligent robot service system based on model training |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116737894B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170124497A1 (en) * | 2015-10-28 | 2017-05-04 | Fractal Industries, Inc. | System for automated capture and analysis of business information for reliable business venture outcome prediction |
US20190251168A1 (en) * | 2018-02-09 | 2019-08-15 | Salesforce.Com, Inc. | Multitask Learning As Question Answering |
CN111444311A (en) * | 2020-02-26 | 2020-07-24 | 平安科技(深圳)有限公司 | Semantic understanding model training method and device, computer equipment and storage medium |
CN111881279A (en) * | 2020-07-28 | 2020-11-03 | 平安科技(深圳)有限公司 | Transformer model-based question answering method, question answering device and storage device |
CN112328767A (en) * | 2020-11-11 | 2021-02-05 | 重庆邮电大学 | Question-answer matching method based on BERT model and comparative aggregation framework |
US20210073215A1 (en) * | 2019-09-05 | 2021-03-11 | Verizon Patent And Licensing Inc. | Natural language-based content system with corrective feedback and training |
CN113032545A (en) * | 2021-05-29 | 2021-06-25 | 成都晓多科技有限公司 | Method and system for conversation understanding and answer configuration based on unsupervised conversation pre-training |
US20220108688A1 (en) * | 2020-10-02 | 2022-04-07 | Salesforce.Com, Inc. | Systems and methods for a multilingual speech recognition framework |
WO2023273170A1 (en) * | 2021-06-30 | 2023-01-05 | 同济人工智能研究院(苏州)有限公司 | Welcoming robot conversation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||