CN112307168B

CN112307168B - Artificial intelligence-based inquiry session processing method and device and computer equipment

Info

Publication number: CN112307168B
Application number: CN202011190094.5A
Authority: CN
Inventors: 李帅
Original assignee: Kangjian Information Technology Shenzhen Co Ltd
Current assignee: Kangjian Information Technology Shenzhen Co Ltd
Priority date: 2020-10-30
Filing date: 2020-10-30
Publication date: 2023-11-07
Anticipated expiration: 2040-10-30
Also published as: CN112307168A

Abstract

The application relates to artificial intelligence and provides an artificial intelligence-based inquiry session processing method and apparatus, a computer device and a storage medium. The method comprises: acquiring session content sent by a target user object when participating in an inquiry session, and determining user attribute information related to the target user object; performing text analysis on the session content to obtain a corresponding session structural expression; determining a personalized text generation model corresponding to an auxiliary inquiry object participating in the inquiry session; taking the session content, the session structural expression and the user attribute information together as input data, processing the input data through the personalized text generation model, and outputting a plurality of candidate reply texts; and screening, through a reverse selector, a target reply text meeting a matching condition from the plurality of candidate reply texts, and replying with the target reply text in the inquiry session through the auxiliary inquiry object. The method can improve reply accuracy.

Description

Artificial intelligence-based inquiry session processing method and device and computer equipment

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to an artificial intelligence-based inquiry session processing method, apparatus, computer device, and storage medium.

Background

With the development of computer technology, artificial intelligence technology is also developing rapidly and is affecting various industries to an unprecedented degree, for example in application fields such as robot customer service and conversation assistants. In these fields, robots mainly replace professional personnel in replying to the business questions that users consult about.

A conversation assistant in the traditional scheme can be regarded as a processing interface to a knowledge base, except that its input is interactive content in natural-language form and its output is response content in text form. Conventional conversation assistants usually respond by matching a suitable answer from the knowledge base based on rules. However, rule-based matching is limited by how the rules are formulated and by the scope of the knowledge base, which leads to inaccurate responses.

Disclosure of Invention

In view of the foregoing, it is desirable to provide an artificial intelligence-based inquiry session processing method, apparatus, computer device, and storage medium that can improve accuracy of response.

An artificial intelligence based inquiry session processing method, the method comprising:

Acquiring session content sent by a target user object when participating in a consultation session, and determining user attribute information related to the target user;

text analysis is carried out on the session content to obtain a corresponding session structural expression;

determining a personalized text generation model corresponding to the auxiliary consultation object participating in the consultation session;

the conversation content, the conversation structural expression and the user attribute information are taken as input data together, the input data is processed through the personalized text generation model, and a plurality of candidate reply texts are output;

and screening target reply texts meeting the matching conditions from the candidate reply texts through a reverse selector, and replying the target reply texts in the inquiry session through the auxiliary inquiry object.

In one embodiment, the text analysis is performed on the session content to obtain a corresponding session structural expression, which includes:

inputting the session content into an intention prediction model, processing the session content through the intention prediction model, and outputting a target intention category corresponding to the session content;

Extracting keywords in the session content, and determining target attribute categories to which the keywords respectively belong;

and constructing the conversation structural expression of the conversation content according to the target intention category, the keywords and the target attribute categories to which the keywords respectively belong.

In one embodiment, the processing the input data by the personalized text generation model and outputting a plurality of candidate reply texts includes:

the conversation content, the conversation structural expression and the user attribute information are respectively encoded through a plurality of parallel encoders in the personalized text generation model, so that a corresponding encoding vector sequence is obtained;

fusing the coded vector sequences output by each encoder through a fusion module in the personalized text generation model to obtain a fused vector sequence;

and decoding the fusion vector sequence through a decoder in the personalized text generation model, and outputting a plurality of candidate reply texts.

In one embodiment, the decoding, by a decoder in the personalized text generation model, the fused vector sequence, and outputting a plurality of candidate reply texts includes:

Acquiring the current attention weight vector corresponding to the fusion vector sequence;

according to the attention weight vector and the fusion vector sequence, calculating to obtain a current content vector;

sequentially calculating to obtain a current decoding hidden layer vector according to the current content vector, a previous decoding hidden layer vector and a word vector of a target word determined in the previous time, and determining the current target word according to the current decoding hidden layer vector and the current content vector;

and outputting a plurality of candidate reply texts based on the plurality of groups of target word sequences obtained by sequential decoding.

In one embodiment, the selecting, by the reverse selector, the target reply text that meets the matching condition from the plurality of candidate reply texts includes:

determining, by the reverse selector, a plurality of first character entities in the conversation content and a respective existing second character entity in each candidate reply text;

and respectively matching the second character entities corresponding to the candidate reply texts with the similarity with the plurality of first character entities through the reverse selector, and taking the candidate reply text corresponding to the second character entity with the largest matching degree as the target reply text.

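In one embodiment, the selecting, by the reverse selector, the target reply text that meets the matching condition from the plurality of candidate reply texts includes:

calculating a first number of times that the question in the session content and each candidate reply text occur together as a question-answer combination pair;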

for each candidate reply text, respectively calculating the second times of occurrence of each candidate reply text in all the historical question-answer combination pairs;

for each candidate reply text, taking the quotient of the first times and the second times corresponding to the corresponding candidate reply text as a reasonable probability value corresponding to the corresponding candidate reply text;

and taking the candidate reply text with the highest reasonable probability value in the plurality of candidate reply texts as a target reply text.

In one embodiment, the training step of the personalized text generation model includes:

acquiring texts belonging to different application fields, and pre-training an initial model through the texts belonging to the different application fields to obtain a text generation model applicable to the general field;

acquiring a generalized inquiry record belonging to the medical field, and retraining a text generation model of the generic field through the generalized inquiry record to obtain a generalized text generation model;

Acquiring a personalized inquiry record corresponding to the auxiliary inquiry object, and retraining the generalized text generation model through the personalized inquiry record to obtain a personalized text generation model corresponding to the auxiliary inquiry object; the text belonging to different application fields and/or the generalized inquiry records and/or the personalized inquiry records belonging to the medical field are stored in the blockchain.

An artificial intelligence based inquiry session processing apparatus, the apparatus comprising:

the acquisition module is used for acquiring session content sent by a target user object when participating in a consultation session and determining user attribute information related to the target user;

the text analysis module is used for carrying out text analysis on the session content to obtain a corresponding session structural expression;

the determining module is used for determining a personalized text generation model corresponding to the auxiliary consultation object participating in the consultation session;

the model processing module is used for taking the conversation content, the conversation structural expression and the user attribute information as input data together, processing the input data through the personalized text generation model and outputting a plurality of candidate reply texts;

And the reply module is used for screening target reply texts meeting the matching conditions from the candidate reply texts through a reverse selector, and replying the target reply texts in the inquiry session through the auxiliary inquiry object.

A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:

A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

According to the artificial intelligence-based inquiry session processing method, apparatus, computer device and storage medium, session content sent by the target user object when participating in the inquiry session is acquired, and user attribute information related to the target user is determined. Text analysis is then performed on the session content to obtain a corresponding session structural expression. The session content, the session structural expression and the user attribute information are processed through the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text meeting the matching condition is then screened out from the plurality of candidate reply texts through the reverse selector, and the target reply text is replied in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of the auxiliary inquiry object. Moreover, because the target reply text is generated based on the session content, the session structural expression and the user attribute information, and is then selected by the reverse selector, it meets the user's needs more closely and accurately. This approach avoids a large number of explicit rules, reduces the difficulty of implementing and maintaining a knowledge base, can respond to a large number of long-tail session requests, improves reply accuracy, and thereby improves customer satisfaction.

Drawings

FIG. 1 is an application environment diagram of an artificial intelligence-based inquiry session processing method in one embodiment;

FIG. 2 is a flow diagram of an artificial intelligence-based inquiry session processing method in one embodiment;

FIG. 3 is a flow diagram of the training steps for a personalized text generation model in one embodiment;

FIG. 4 is a block diagram of an artificial intelligence-based inquiry session processing apparatus in one embodiment;

FIG. 5 is a block diagram of an artificial intelligence-based inquiry session processing apparatus in another embodiment;

FIG. 6 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

The inquiry session processing method based on artificial intelligence is mainly used in the field of natural language processing, and can be applied to an application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The terminal 102 and the server 104 may be used separately to execute the method for processing an artificial intelligence-based inquiry session according to the embodiments of the present application, or may be used to cooperatively execute the method for processing an artificial intelligence-based inquiry session according to the embodiments of the present application. For example, the terminal may acquire session content that is sent by the target user object when participating in the inquiry session, and determine user attribute information related to the target user; text analysis is carried out on the conversation content to obtain a corresponding conversation structural expression; determining a personalized text generation model corresponding to an auxiliary consultation object participating in a consultation session; the conversation content, the conversation structural expression and the user attribute information are taken as input data together, the input data is processed through a personalized text generation model, and a plurality of candidate reply texts are output; and screening target reply texts meeting the matching conditions from the plurality of candidate reply texts through a reverse selector, and replying the target reply texts in a consultation session through an auxiliary consultation object.

The terminal 102 may be, but not limited to, various personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices, and the server 104 may be implemented by a stand-alone server or a server cluster composed of a plurality of servers.

In one embodiment, as shown in FIG. 2, an artificial intelligence-based inquiry session processing method is provided. The method is described as applied to a computer device, which may specifically be the terminal or the server in FIG. 1. The artificial intelligence-based inquiry session processing method comprises the following steps:

Step S202, session content sent by the target user object when participating in the inquiry session is obtained, and user attribute information related to the target user object is determined.

The inquiry session is a session in which the user communicates with the auxiliary inquiry object through a user account, and may specifically be a text session or a voice session. The user attribute information is attribute information related to the target user object, such as the user's sex, age, or native place.

Specifically, the target user can enter a medical inquiry page through the user terminal and input user attribute information on that page. The terminal can then jump to the corresponding session interface for a session with the auxiliary inquiry object, where the user may enter the content to be asked about. The computer device may obtain the user attribute information previously entered by the user, as well as the session content entered by the user.

Step S204, text analysis is performed on the session content to obtain a corresponding session structural expression.

The session structural expression is skeleton information extracted from the session content according to a preset structuring rule. Specifically, the computer device may perform word segmentation on the session content to obtain a corresponding word sequence, and then remove modal particles, meaningless words and the like to obtain a keyword set. The computer device may then determine the corresponding session structural expression from the keyword set.

In one embodiment, text analysis is performed on session content to obtain a corresponding session structural expression, including: inputting the session content into an intention prediction model, processing the session content through the intention prediction model, and outputting a target intention category corresponding to the session content; extracting keywords in session content, and determining target attribute categories to which the keywords respectively belong; and constructing a conversation structural expression of the conversation content according to the target intention category, each keyword and the target attribute category to which each keyword respectively belongs.

Specifically, the computer device may input the session content into the intention prediction model, extract text features of the text corresponding to the session content through the intention prediction model, perform classification based on the extracted text features to obtain a probability for each preset intention category, and output the preset intention category with the highest probability as the target intention category. An intention category is, for example, "confirm symptoms associated with a disease" or "ask about the treatment of a disease", that is, the purpose of the user in the online inquiry.
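The final classification step described above (per-category probabilities, then the highest-probability category) can be sketched as follows. This is a minimal illustration only: the raw scores are hypothetical stand-ins for whatever the trained intention prediction model produces, and the softmax normalization is an assumption, since the text does not state how the probabilities are obtained.

```python
import math

def softmax(scores):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pick_intent(scores, categories):
    """Return the preset intention category with the highest probability."""
    probs = softmax(scores)
    best = max(range(len(categories)), key=lambda i: probs[i])
    return categories[best], probs[best]

INTENT_CATEGORIES = [
    "confirm symptoms associated with a disease",
    "ask about the treatment of a disease",
]

# Hypothetical scores for one piece of session content.
intent, prob = pick_intent([2.1, 0.3], INTENT_CATEGORIES)
print(intent, round(prob, 3))
```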

In one embodiment, the intention prediction model mainly predicts, based on the inquiry session, the intention and purpose with which the user initiates the session, which makes it easier to respond to the user's needs according to that intention. The intention prediction model can be trained in advance on sample texts and the preset intention categories to which the sample texts belong. During training, the computer device inputs a sample text into the intention prediction model, which extracts text features of the sample text and performs classification based on the extracted features to obtain a predicted intention category. The computer device adjusts the model parameters according to the difference between the predicted intention category and the preset intention category to which the sample text belongs. The next group of sample texts is then processed by the intention prediction model with the adjusted parameters, and training continues in this way until a training stop condition is reached, yielding the trained intention prediction model. The training stop condition may specifically be that a preset number of iterations is reached, or that the performance of the intention prediction model reaches a preset level.
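The training procedure above (predict, compare with the labelled category, adjust parameters, stop on an iteration limit or a performance target) can be illustrated with a very small classifier. The sketch below assumes a bag-of-words softmax model and made-up sample texts; the actual model architecture and features are not specified in the text.

```python
import math

def train_intent_model(samples, labels, num_classes, vocab,
                       lr=0.5, max_iters=200, target_acc=1.0):
    """Train a bag-of-words softmax classifier (illustrative stand-in).

    Stops when max_iters is reached or the in-epoch accuracy hits target_acc.
    """
    index = {w: i for i, w in enumerate(vocab)}
    weights = [[0.0] * len(vocab) for _ in range(num_classes)]

    def features(text):
        vec = [0.0] * len(vocab)
        for word in text.lower().split():
            if word in index:
                vec[index[word]] += 1.0
        return vec

    def predict_probs(vec):
        scores = [sum(w * x for w, x in zip(row, vec)) for row in weights]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def predict(text):
        probs = predict_probs(features(text))
        return max(range(num_classes), key=lambda c: probs[c])

    iteration = 0
    for iteration in range(1, max_iters + 1):
        correct = 0
        for text, label in zip(samples, labels):
            vec = features(text)
            probs = predict_probs(vec)
            if max(range(num_classes), key=lambda c: probs[c]) == label:
                correct += 1
            # Adjust parameters by the difference between target and prediction.
            for c in range(num_classes):
                diff = (1.0 if c == label else 0.0) - probs[c]
                for j, x in enumerate(vec):
                    weights[c][j] += lr * diff * x
        if correct / len(samples) >= target_acc:  # performance stop condition
            break
    return predict, iteration

# Hypothetical training data: 0 = confirm symptoms, 1 = ask about treatment.
SAMPLES = ["does hypertension cause tinnitus", "how to treat hypertension",
           "is tinnitus a symptom of diabetes", "how to treat diabetes"]
LABELS = [0, 1, 0, 1]
VOCAB = ["does", "cause", "symptom", "how", "treat", "is"]
predict, iterations_used = train_intent_model(SAMPLES, LABELS, 2, VOCAB)
```

A real system would hold out validation data for the performance check rather than reuse the training samples as this sketch does.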

Further, the computer device may perform text analysis on the session content by means of word segmentation or dictionary matching, extract a keyword set in the session content, and further determine a category to which each keyword in the keyword set belongs, such as a "disease" category, a "related symptom" category, and the like.

Furthermore, the computer device may splice the target intention category, each keyword, and the target attribute category to which each keyword belongs using preset characters to obtain the corresponding structured expression. For example, if the user asks "Does hypertension lead to tinnitus?", the session content is converted into the session structured expression: "intention": "confirm symptoms associated with disease"; "disease": "hypertension"; "related symptoms": "tinnitus".
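The keyword extraction and splicing steps can be sketched as below. The keyword dictionary, the function names and the "; " separator are illustrative assumptions; the intention category is passed in directly rather than predicted.

```python
# Hypothetical dictionary mapping known keywords to attribute categories.
KEYWORD_CATEGORIES = {
    "hypertension": "disease",
    "tinnitus": "related symptoms",
}

def extract_keywords(session_content):
    """Dictionary matching: return (keyword, category) pairs found in the text."""
    text = session_content.lower()
    found = [(kw, cat) for kw, cat in KEYWORD_CATEGORIES.items() if kw in text]
    # Keep the order in which the keywords appear in the text.
    return sorted(found, key=lambda pair: text.index(pair[0]))

def build_structured_expression(intent, keyword_pairs, sep="; "):
    """Splice the intention and keyword/category pairs with a preset separator."""
    parts = [f'"intention": "{intent}"']
    parts += [f'"{cat}": "{kw}"' for kw, cat in keyword_pairs]
    return sep.join(parts)

question = "Does hypertension lead to tinnitus?"
pairs = extract_keywords(question)
expression = build_structured_expression(
    "confirm symptoms associated with disease", pairs)
print(expression)
```

Run on the example question, this reproduces the structured expression given in the text.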

In other embodiments, the computer device may also convert the target intent category into a corresponding vector representation, and convert the plurality of keywords in the conversation content and the respective belonging target attribute category into a corresponding vector representation. Furthermore, the computer device may splice the vector representations to obtain corresponding conversational structured representations.

In the above embodiment, the inquiry intention of the user is predicted by the intention prediction model, and the target attribute category to which each keyword in the session content belongs is determined, so that the session structural expression of the session content is constructed from the target intention category, each keyword, and the target attribute category to which each keyword belongs. The structured expression discards useless information in the session, can be regarded as the skeleton of the session and a minimal representation of it, and also incorporates the user's target intention category, so that the user's question can be answered more accurately.

Step S206, determining a personalized text generation model corresponding to the auxiliary consultation objects participating in the consultation session.

The auxiliary inquiry object is an object that replies to the user's inquiry questions. In the present application it refers to a service account in the computer device; the service account may correspond to a professional doctor, an artificial intelligence object, or the like, which is not limited in the embodiments of the application.

In addition, different auxiliary consultation objects have respective corresponding personalized text generation models. The different personalized text generation models are all obtained by training based on the personalized inquiry records of the corresponding auxiliary inquiry objects, so that the reply text output based on the personalized text generation models has a language style similar to that of the auxiliary inquiry objects and is easy to be accepted by users.

Step S208, the session content, the session structural expression and the user attribute information are taken together as input data, the input data is processed through the personalized text generation model, and a plurality of candidate reply texts are output.

Specifically, the computer device may construct input data according to the session text, the session structural expression, and the user attribute information, and further input the input data to a personalized text generation model, and output a plurality of candidate reply texts through the personalized text generation model.

In one embodiment, the computer device may concatenate the conversation text, conversation structured expressions, and user attribute information with preset characters (such as commas or semicolons) to form the input data.
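A minimal sketch of this concatenation, assuming a semicolon between the three parts and a hypothetical `key=value` rendering of the user attributes (the text only says that preset characters such as commas or semicolons are used):

```python
def build_input_data(session_text, structured_expression, user_attributes, sep=";"):
    """Join session text, structured expression and user attributes into one string."""
    # Render the attributes as comma-separated key=value pairs (an assumption).
    attr_text = ",".join(f"{key}={value}" for key, value in user_attributes.items())
    return sep.join([session_text, structured_expression, attr_text])

input_data = build_input_data(
    "Does hypertension lead to tinnitus?",
    "intention=confirm symptoms,disease=hypertension,related symptoms=tinnitus",
    {"sex": "female", "age": 45},
)
print(input_data)
```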

It should be noted that the personalized text generation model in the present application may specifically have an encoder-decoder structure, and may specifically be a GPT (Generative Pre-Training) model. The personalized text generation model may perform encoding and decoding on the input data, so as to output a plurality of candidate reply texts.

Specifically, the encoder in the personalized text generation model may encode the input data to obtain an encoded vector sequence, and the decoder decodes the encoded vector sequence step by step to obtain target-end vectors. At each step, decoding is performed according to the word vector of the candidate word determined at the previous step, and the candidate word of the current step is determined according to the current target-end vector. It will be appreciated that the decoder may produce multiple candidate words at each decoding step. In this way, the personalized text generation model can combine the candidate words to output a plurality of candidate reply texts.

In one embodiment, session content, session structural expression and user attribute information are taken as input data together, the input data is processed through a personalized text generation model, and a plurality of candidate reply texts are output, including: respectively carrying out coding processing on session content, session structural expression and user attribute information through a plurality of parallel encoders in the personalized text generation model to obtain a corresponding coding vector sequence; fusing the coded vector sequences output by each encoder through a fusion module in the personalized text generation model to obtain a fused vector sequence; decoding the fusion vector sequence through a decoder in the personalized text generation model, and outputting a plurality of candidate reply texts.

Specifically, the personalized text generation model comprises at least a plurality of parallel encoders, a fusion module and a decoder, and the computer device may process the input data through the personalized text generation model. Three parallel encoders respectively encode the session content, the session structural expression and the user attribute information to obtain the encoded vector sequences corresponding to these three dimensions. The fusion module then fuses the encoded vector sequences of the three dimensions to obtain a fused vector sequence. The decoder decodes the fused vector sequence step by step, and a plurality of candidate reply texts are output through a beam search algorithm.
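The data flow through the three parallel encoders and the fusion module can be sketched as follows. Everything here is a toy stand-in: `ToyEncoder` just assigns each token a fixed random vector, and fusion by concatenating the sequences is an assumption, since the text does not say how the fusion module combines them.

```python
import random

class ToyEncoder:
    """Stand-in for one encoder: maps each token to a fixed random vector."""

    def __init__(self, dim, seed):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}

    def encode(self, tokens):
        sequence = []
        for token in tokens:
            if token not in self.table:
                self.table[token] = [self.rng.uniform(-1, 1) for _ in range(self.dim)]
            sequence.append(self.table[token])
        return sequence

def fuse(sequences):
    """Assumed fusion: concatenate the encoded sequences end to end."""
    fused = []
    for sequence in sequences:
        fused.extend(sequence)
    return fused

# One encoder per input: session content, structured expression, user attributes.
encoders = [ToyEncoder(dim=4, seed=s) for s in (1, 2, 3)]
inputs = [
    ["does", "hypertension", "cause", "tinnitus"],
    ["confirm-symptoms", "hypertension", "tinnitus"],
    ["female", "45"],
]
encoded = [enc.encode(tokens) for enc, tokens in zip(encoders, inputs)]
fused = fuse(encoded)
print(len(fused), "fused vectors of dimension", len(fused[0]))
```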

In one embodiment, decoding the fused vector sequence by a decoder in the personalized text generation model, outputting a plurality of candidate reply texts, comprising: acquiring the current attention weight vector corresponding to the fusion vector sequence; according to the attention weight vector and the fusion vector sequence, calculating to obtain the current content vector; sequentially calculating to obtain a current decoding hidden layer vector according to the current content vector, a previous decoding hidden layer vector and a word vector of a candidate word determined in the previous time, and determining the current candidate word according to the current decoding hidden layer vector and the current content vector; and outputting a plurality of candidate reply texts based on the plurality of groups of candidate word sequences obtained by sequential decoding.

It can be understood that the decoding process of the computer device is time-sequential, and in the decoding process, the computer device decodes the fused vector sequence according to the word vector of the candidate word obtained by the previous decoding to obtain the current decoding hidden layer vector, and then determines the current candidate word according to the current decoding hidden layer vector.

In one embodiment, the personalized text generation model may perform processing through an attention mechanism when decoding, that is, obtain a current attention weight vector corresponding to the fused vector sequence, and calculate a current content vector according to the attention weight vector and the fused vector sequence. Specifically, each attention weight in the attention weight vectors is multiplied by the corresponding fusion vector, and then each result of the multiplication is summed to obtain the current content vector.
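The weighted sum that produces the content vector can be written directly. The weights and vectors below are made-up numbers for illustration; in the model the attention weights would be computed at each decoding step.

```python
def content_vector(attention_weights, fused_vectors):
    """Multiply each fused vector by its attention weight and sum the results."""
    if len(attention_weights) != len(fused_vectors):
        raise ValueError("need exactly one attention weight per fused vector")
    context = [0.0] * len(fused_vectors[0])
    for weight, vector in zip(attention_weights, fused_vectors):
        for j, value in enumerate(vector):
            context[j] += weight * value
    return context

weights = [0.5, 0.25, 0.25]                      # hypothetical attention weights
vectors = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]   # hypothetical fused vectors
context = content_vector(weights, vectors)
print(context)  # [1.5, 1.5]
```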

Further, the computer device may calculate a current decoding hidden layer vector according to the current content vector, a previous decoding hidden layer vector, and a word vector of the candidate word determined in the previous time, and determine the current candidate word according to the current decoding hidden layer vector and the current content vector.

In one embodiment, the computer device may determine one or more candidate words according to the current decoding hidden layer vector. Specifically, the decoder may calculate the current output probability sequence according to the decoding hidden layer vector of the current step. The output probability sequence consists of the probabilities that each candidate word in the output-end word set is the target word output at the current step. The personalized text generation model may then select the candidate word with the maximum probability in the output probability sequence as the current candidate word, or select the several candidate words with the highest probabilities (e.g., the top 3) as the current candidate words.

When multiple candidate words are determined at a given step, the computer device may perform multiple groups of decoding in parallel at the next step. The decoder calculates a current decoding hidden layer vector from each candidate word determined at the previous step, so that multiple current decoding hidden layer vectors are obtained, and the current candidate words are then determined from these vectors. This is repeated until multiple groups of candidate word sequences have been decoded. The decoder may then output a plurality of candidate reply texts based on the multiple groups of candidate word sequences.
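Keeping the top few candidate words at each step and decoding them in parallel is essentially beam search. The sketch below assumes a toy next-word table keyed only by the previous word; in the model itself the probabilities would come from the decoding hidden layer vector and the content vector.

```python
import math

# Hypothetical next-word probabilities keyed by the previous word.
NEXT_WORD_PROBS = {
    "<s>": {"yes": 0.6, "it": 0.3, "no": 0.1},
    "yes": {"it": 0.7, "</s>": 0.3},
    "it": {"can": 0.8, "</s>": 0.2},
    "no": {"</s>": 1.0},
    "can": {"</s>": 1.0},
}

def beam_search(beam_width=3, max_len=5):
    """Return candidate texts ordered by total log-probability, best first."""
    beams = [(["<s>"], 0.0)]  # (word sequence, accumulated log-probability)
    finished = []
    for _ in range(max_len):
        expanded = []
        for words, score in beams:
            for word, prob in NEXT_WORD_PROBS[words[-1]].items():
                expanded.append((words + [word], score + math.log(prob)))
        # Keep only the beam_width most probable partial sequences.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for words, score in expanded[:beam_width]:
            if words[-1] == "</s>":
                finished.append((words, score))
            else:
                beams.append((words, score))
        if not beams:
            break
    finished.sort(key=lambda item: item[1], reverse=True)
    # Drop the start and end markers before returning the texts.
    return [" ".join(words[1:-1]) for words, _ in finished]

candidates = beam_search()
print(candidates)
```

With `beam_width=1` the same loop reduces to greedy search and yields a single candidate.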

In one embodiment, when the computer device decodes the fusion vector sequence, a greedy algorithm (greedy search) or a beam search algorithm (beam search) may be used to perform decoding processing, so as to obtain multiple sets of candidate reply texts.

Step S210, a target reply text satisfying the matching condition is screened out from the plurality of candidate reply texts by the reverse selector, and the target reply text is replied in the inquiry session through the auxiliary inquiry object.

Specifically, the computer device may, through the reverse selector, screen out the candidate reply text that matches the session content from the plurality of candidate reply texts as the target reply text. The target reply text is then sent in the inquiry session through the auxiliary inquiry object.

In one embodiment, selecting, by the reverse selector, a target reply text that satisfies the matching condition from the plurality of candidate reply texts includes: determining, by the reverse selector, a plurality of first character entities in the session content and the second character entities present in each candidate reply text; and matching, by the reverse selector, the similarity between the second character entities of each candidate reply text and the plurality of first character entities, and taking the candidate reply text whose second character entities have the highest matching degree as the target reply text.

Specifically, the computer device may perform entity matching on each candidate reply text and the question in the session content proposed by the user, and use, as the target reply text, the candidate reply text with the largest matching degree according to the result of entity matching.
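Entity-based selection can be sketched as below. The entity dictionary and the use of Jaccard overlap as the matching degree are assumptions made for illustration; the text does not specify how entities are recognized or how similarity is measured.

```python
# Hypothetical entity dictionary; a real system would use an entity recognizer.
KNOWN_ENTITIES = {"hypertension", "tinnitus", "headache", "diabetes"}

def extract_entities(text):
    """Return the known entities mentioned in the text."""
    lowered = text.lower()
    return {entity for entity in KNOWN_ENTITIES if entity in lowered}

def matching_degree(question_entities, reply_entities):
    """Jaccard overlap between the two entity sets (assumed similarity measure)."""
    union = question_entities | reply_entities
    if not union:
        return 0.0
    return len(question_entities & reply_entities) / len(union)

def select_by_entities(session_content, candidates):
    """Pick the candidate reply whose entities best match the question's."""
    question_entities = extract_entities(session_content)
    return max(candidates,
               key=lambda c: matching_degree(question_entities, extract_entities(c)))

replies = [
    "Please drink more water.",
    "Hypertension can cause headache.",
    "Yes, hypertension can sometimes cause tinnitus.",
]
target = select_by_entities("Does hypertension lead to tinnitus?", replies)
print(target)
```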

In another embodiment, selecting, by the reverse selector, the target reply text satisfying the matching condition from the plurality of candidate reply texts includes: calculating a first number of times that the question in the session content and each candidate reply text occur together as a question-answer combination pair; for each candidate reply text, calculating a second number of times that the candidate reply text occurs in all historical question-answer combination pairs; for each candidate reply text, taking the quotient of the corresponding first number and second number as the reasonable probability value of that candidate reply text; and taking the candidate reply text with the highest reasonable probability value among the plurality of candidate reply texts as the target reply text.
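The count-based selection can be sketched as follows, assuming the historical question-answer pairs are available as a list of `(question, answer)` tuples and that questions and answers are compared by exact string match. Both are simplifications made for illustration.

```python
from collections import Counter

def select_by_reasonable_probability(question, candidates, history):
    """Score each candidate by count(question, answer) / count(answer)."""
    pair_counts = Counter(history)                      # first number of times
    answer_counts = Counter(answer for _, answer in history)  # second number

    def reasonable_probability(answer):
        second = answer_counts[answer]
        if second == 0:
            return 0.0  # answer never seen in the history
        return pair_counts[(question, answer)] / second

    scores = {c: reasonable_probability(c) for c in candidates}
    best = max(candidates, key=lambda c: scores[c])
    return best, scores

# Hypothetical history of (question, answer) pairs.
HISTORY = [
    ("q1", "see a doctor"), ("q2", "see a doctor"), ("q3", "see a doctor"),
    ("q1", "it can cause tinnitus"), ("q1", "it can cause tinnitus"),
    ("q4", "it can cause tinnitus"),
]
best, scores = select_by_reasonable_probability(
    "q1", ["see a doctor", "it can cause tinnitus"], HISTORY)
print(best, scores)
```

The generic answer "see a doctor" follows many different questions, so its score for any one question is low; this is the penalty on bland answers that the conditional-probability view describes.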

Specifically, the reverse selector may select the target reply text based on the principle of conditional probability: the larger the reasonable probability value, the more reasonable the candidate reply text. In one embodiment, assume that the question text of the user's inquiry is Q (x1 to xm), the structured expression corresponding to the question text is QS, and the user attribute information is BS. The corresponding input text S is then determined from the question text Q, the structured expression QS and the user attribute information BS. During training, the personalized text generation model determines each target word from the candidate words by maximizing the conditional probability of the target characters given the preceding text, where the characters from m+1 to N are the target characters that make up the corresponding candidate reply text. The reverse selector, in turn, selects the target reply text from the plurality of candidate reply texts by maximizing the conditional probability of the question given the answer, that is, the probability of inferring the question in reverse from the answer. Intuitively, maximizing the backward model likelihood penalizes bland answers: a frequent, repetitive answer can be associated with many possible questions, so the probability it assigns to any particular question will be lower.
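The two objectives can be written out as follows. This is a reconstruction under stated assumptions, not a formula taken from the text: the input is assumed to occupy positions 1 to m and the reply T positions m+1 to N, the generation model is assumed to factorize autoregressively, and the symbol $\mathcal{C}$ for the candidate set is introduced here for illustration.

```latex
% Forward (generation) objective: likelihood of the target characters x_{m+1..N}
P(T \mid S) = \prod_{n=m+1}^{N} p\left(x_n \mid x_1, \ldots, x_{n-1}\right)

% Reverse selection over the candidate set \mathcal{C}
T^{*} = \arg\max_{T \in \mathcal{C}} P(Q \mid T)

% Count-based estimate of the reverse probability (the "reasonable probability value")
P(Q \mid T) \approx \frac{\operatorname{count}(Q, T)}{\operatorname{count}(T)}
```

The last line corresponds to the quotient of the first and second numbers of times described in the count-based embodiment.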

In this way, the matching condition of each candidate reply text and the conversation content is determined through the reverse selector, so that the most reasonably matched candidate reply text is output as the target reply text, and the reply accuracy can be greatly improved.

In the above artificial intelligence-based inquiry session processing method, session content sent by a target user object while participating in an inquiry session is acquired, and user attribute information related to the target user object is determined. Text analysis is then performed on the session content to obtain a corresponding session structural expression. The session content, the session structural expression and the user attribute information are processed by the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text satisfying the matching condition is then screened out from the plurality of candidate reply texts by the reverse selector and sent in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of that object. Because the target reply text is generated from the session content, the session structural expression and the user attribute information and is then reverse-selected, it matches the user's needs more closely and accurately. Storing knowledge in the model in this way avoids a large number of explicit rules and reduces the difficulty of implementing and maintaining a knowledge base; it also allows a large number of long-tail session requests to be answered, which improves reply accuracy and, in turn, customer satisfaction.

In one embodiment, referring to fig. 3, the method further comprises a training step for the personalized text generation model, the step specifically comprising:

Step S302, obtaining texts belonging to different application fields, and pre-training an initial model through the texts belonging to different application fields to obtain a text generation model applicable to the general field.

In one embodiment, the training of the personalized text generation model includes three stages. The first stage trains on a large number of texts from different fields, such as novels, lyrics and poems, to obtain a text generation model for the general field. A general-field text generation model trained on such texts acquires good language structure, and its output text meets basic conditions such as sentence fluency. During pre-training, the model parameters can be adjusted according to the difference between the model's predicted output and the text that follows the sample text, until a training stop condition is reached, yielding the text generation model applicable to the general field.

Step S304, acquiring a generalized inquiry record belonging to the medical field, and retraining the text generation model applicable to the general field through the generalized inquiry record to obtain the generalized text generation model of the medical field.

In the second stage, the computer device may perform specialized training on the text generation model applicable to the general field, based on historical inquiry records in the medical field (i.e., the inquiry records of a plurality of different auxiliary inquiry assistants), to obtain the generalized text generation model of the medical field. The specific training steps at this stage include: acquiring historical inquiry records, and processing the question content in each historical inquiry record to obtain a corresponding structured expression; for each historical inquiry record, determining the corresponding user attribute information and the inquiry reply in the record; splicing the question text corresponding to the question content, the structured expression and the user attribute information to form a training sample, and training the text generation model with the inquiry reply as the training label; and constructing a loss function from the difference between the predicted output of the text generation model and the inquiry reply, and adjusting the model parameters through the loss function. In this way, the generalized text generation model P is obtained by training on a large number of historical inquiry records in the medical field. The generalized text generation model obtained through this training captures the text characteristics of the medical field well.
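The splicing of a training sample described above can be sketched as follows. This is an illustration only: the separator token `[SEP]`, the `key=value` serialization and the function name `build_training_sample` are assumptions, since the text does not specify how the three parts are joined.

```python
from typing import Dict, Tuple

SEP = " [SEP] "  # assumed separator; the source does not name one


def build_training_sample(question: str,
                          structured_expression: Dict[str, str],
                          user_attributes: Dict[str, str],
                          reply: str) -> Tuple[str, str]:
    """Splice question text, structured expression and user attributes into one
    input string; the historical inquiry reply serves as the training label."""
    qs = ", ".join(f"{k}={v}" for k, v in sorted(structured_expression.items()))
    bs = ", ".join(f"{k}={v}" for k, v in sorted(user_attributes.items()))
    return SEP.join([question, qs, bs]), reply


# Hypothetical historical inquiry record.
source, label = build_training_sample(
    "My child has had a fever since last night.",
    {"intent": "symptom_inquiry", "symptom": "fever"},
    {"age_group": "child"},
    "What is the highest temperature you have measured?",
)
```

The resulting `(source, label)` pair would then be fed to whatever sequence-to-sequence training loop is used; the loss computation itself is outside the scope of this sketch.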

Step S306, acquiring a personalized inquiry record corresponding to the auxiliary inquiry object, and retraining the generalized text generation model through the personalized inquiry record to obtain a personalized text generation model corresponding to the auxiliary inquiry object; the text belonging to different application fields and/or the generalized inquiry records and/or the personalized inquiry records belonging to the medical field are stored in the blockchain.

In the third stage, on the basis of the generalized text generation model P, the computer device may retrain the model using the personalized inquiry records corresponding to the auxiliary inquiry object, fine-tuning the model parameters to obtain the personalized text generation model PX matched to that auxiliary inquiry object. Each person's manner of expression is reflected in his or her historical sessions as a personal style, and this style is, to a certain extent, captured implicitly in numerical form. The reply style of the personalized text generation model therefore matches the corresponding auxiliary inquiry object more closely.

It should be emphasized that, to further ensure the privacy and security of the training samples, the above-mentioned text belonging to different application fields and/or the generalized query records and/or personalized query records belonging to the medical field may be stored in the blockchain or may be stored in a node of a blockchain.

A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
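The cryptographic linking of data blocks can be illustrated with a minimal hash chain. This sketch covers only the "each block references the hash of the previous block" idea; it has no consensus mechanism, peer-to-peer transport or encryption, and the helper names `block_hash`, `append_block` and `is_valid` are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def block_hash(block: Dict[str, Any]) -> str:
    """SHA-256 digest of a block's canonical JSON serialization."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: List[Dict[str, Any]], records: List[str]) -> None:
    """Append a block that stores the hash of the previous block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "records": records})


def is_valid(chain: List[Dict[str, Any]]) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


chain: List[Dict[str, Any]] = []
append_block(chain, ["general-domain text"])
append_block(chain, ["generalized inquiry record"])
valid_before = is_valid(chain)

chain[0]["records"] = ["tampered"]   # modify an earlier block
valid_after = is_valid(chain)        # the stored prev_hash no longer matches
```

Tampering with an earlier block changes its hash, so the link stored in the next block no longer verifies; this is the anti-counterfeiting property the paragraph refers to.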

In this embodiment, a large amount of domain knowledge can be stored in the model during an offline stage; storing knowledge in this way avoids a large number of explicit rules and reduces the difficulty of implementing and maintaining a knowledge base. This knowledge is shared by all personalized session assistant systems, so that a large number of long-tail session requests can be answered, which improves reply accuracy and, in turn, customer satisfaction. Because a personalized text generation model can be customized for each person from that person's historical sessions, the model's manner of expression is close to the person's own style, which makes its replies more likely to be accepted by that person and improves that person's processing efficiency.

In one embodiment, in actual use, when the personalized text generation model is first put into service for replies, a human-machine combination may be adopted: the staff member corresponding to the auxiliary inquiry object is given a reject button for each generated target reply text, and if the text is not rejected within a preset number of seconds, the target reply text generated by the personalized text generation model is considered reasonable and can be sent directly to the user terminal. Target reply texts rejected by the staff member are used as negative examples for optimizing the personalized text generation model PX and the reverse selector. After a period of human-machine combination, once the rejection frequency is low enough, the personalized text generation model can be used on its own and then further refined through spot checks.
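The reject-within-a-time-window flow can be sketched as follows. To keep the example deterministic, the moment at which staff press reject is passed in as elapsed seconds rather than measured with a real timer; the class `ReviewQueue` and its fields are hypothetical, and treating a rejection exactly at the deadline as valid is an assumed convention.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReviewQueue:
    timeout_seconds: float                      # the preset number of seconds
    negative_examples: List[str] = field(default_factory=list)
    decisions: int = 0
    rejections: int = 0

    def review(self, reply: str, rejected_at: Optional[float]) -> Optional[str]:
        """Return the reply to send, or None if staff rejected it in time.

        rejected_at is the elapsed time in seconds at which staff pressed
        reject, or None if they never did.
        """
        self.decisions += 1
        if rejected_at is not None and rejected_at <= self.timeout_seconds:
            self.rejections += 1
            self.negative_examples.append(reply)  # kept for later optimization
            return None
        return reply

    def rejection_rate(self) -> float:
        """Fraction of reviewed replies that staff rejected."""
        return self.rejections / self.decisions if self.decisions else 0.0


queue = ReviewQueue(timeout_seconds=5.0)
sent = queue.review("Reply A", rejected_at=None)    # never rejected: sent
blocked = queue.review("Reply B", rejected_at=2.0)  # rejected in time: held back
late = queue.review("Reply C", rejected_at=9.0)     # rejected too late: sent
```

A deployment could monitor `rejection_rate()` and switch to unattended operation once it falls below a chosen threshold, which the text leaves unspecified.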

It should be noted that the method of the present application is also applicable to other technical fields and is not constrained by the formulation of rules. From the historical interaction information and the current interaction content, an informative interaction response is generated that meets customers' need for domain expertise. Customer service labor costs can be greatly reduced and customer satisfaction improved.

It should be understood that, although the steps in the flowcharts of figs. 2-3 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in figs. 2-3 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times; nor are these sub-steps or stages necessarily performed sequentially, as they may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in FIG. 4, there is provided an artificial intelligence based inquiry session processing apparatus 400 comprising: an acquisition module 401, a text analysis module 402, a determination module 403, a model processing module 404, and a reply module 405, wherein:

The acquiring module 401 is configured to acquire session content that is sent by the target user object when the target user object participates in the inquiry session, and determine user attribute information related to the target user object.

The text analysis module 402 is configured to perform text analysis on the session content to obtain a corresponding session structural expression.

The determining module 403 is configured to determine a personalized text generation model corresponding to an auxiliary consultation object participating in a consultation session.

The model processing module 404 is configured to take the session content, the session structural expression and the user attribute information together as input data, process the input data through the personalized text generation model, and output a plurality of candidate reply texts.

The reply module 405 is configured to screen out a target reply text meeting the matching condition from the plurality of candidate reply texts through the reverse selector, and reply with the target reply text in the inquiry session through the auxiliary inquiry object.

In one embodiment, the text analysis module 402 is further configured to input the session content into an intent prediction model, process the session content through the intent prediction model, and output a target intent category corresponding to the session content; extracting keywords in session content, and determining target attribute categories to which the keywords respectively belong; and constructing a conversation structural expression of the conversation content according to the target intention category, each keyword and the target attribute category to which each keyword respectively belongs.
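Building the structural expression from an intent category plus categorized keywords can be sketched as below. The intent prediction model is replaced by a stub callable and keyword extraction by a dictionary lookup; both are stand-ins chosen for illustration, and `StructuredExpression` and `build_structured_expression` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StructuredExpression:
    intent: str             # target intent category
    slots: Dict[str, str]   # keyword -> target attribute category


def build_structured_expression(text: str,
                                predict_intent: Callable[[str], str],
                                keyword_categories: Dict[str, str]) -> StructuredExpression:
    """Combine the predicted intent with every lexicon keyword found in the text."""
    lowered = text.lower()
    slots = {kw: cat for kw, cat in keyword_categories.items() if kw in lowered}
    return StructuredExpression(intent=predict_intent(text), slots=slots)


# Hypothetical lexicon and a stub standing in for the intent prediction model.
lexicon = {"fever": "symptom", "ibuprofen": "drug", "rash": "symptom"}


def stub_intent(text: str) -> str:
    return "medication_inquiry" if "take" in text.lower() else "symptom_inquiry"


expr = build_structured_expression("Can I take ibuprofen for a fever?",
                                   stub_intent, lexicon)
```

In a real system the stub would be a trained classifier and the lexicon lookup would be replaced by a proper keyword or entity extractor.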

In one embodiment, the model processing module 404 is further configured to encode the session content, the session structural expression, and the user attribute information respectively by using a plurality of parallel encoders in the personalized text generation model, so as to obtain a corresponding encoding vector sequence; fusing the coded vector sequences output by each encoder through a fusion module in the personalized text generation model to obtain a fused vector sequence; decoding the fusion vector sequence through a decoder in the personalized text generation model, and outputting a plurality of candidate reply texts.

In one embodiment, the model processing module 404 is further configured to obtain a current attention weight vector corresponding to the fused vector sequence; according to the attention weight vector and the fusion vector sequence, calculating to obtain the current content vector; sequentially calculating to obtain a current decoding hidden layer vector according to the current content vector, a previous decoding hidden layer vector and a word vector of a candidate word determined in the previous time, and determining the current candidate word according to the current decoding hidden layer vector and the current content vector; and outputting a plurality of candidate reply texts based on the plurality of groups of candidate word sequences obtained by sequential decoding.
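The content-vector computation in the decoding step can be sketched in plain Python. The text does not say how the attention weights are obtained, so this sketch assumes dot-product scores between the previous decoder hidden state and each fused vector, normalized with softmax; the hidden-state update and word selection are not shown. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def content_vector(decoder_state: Sequence[float],
                   fused: Sequence[Sequence[float]]) -> List[float]:
    """Attention-weighted sum of the fused vector sequence.

    Scores are dot products with the decoder state (an assumed scoring rule).
    """
    scores = [sum(d * f for d, f in zip(decoder_state, vec)) for vec in fused]
    weights = softmax(scores)  # the current attention weight vector
    dim = len(fused[0])
    return [sum(w * vec[i] for w, vec in zip(weights, fused)) for i in range(dim)]


# Toy fused vector sequence with two positions of dimension 2.
fused_sequence = [[1.0, 0.0], [0.0, 1.0]]
uniform_context = content_vector([0.0, 0.0], fused_sequence)
focused_context = content_vector([10.0, 0.0], fused_sequence)
```

With a zero decoder state the weights are uniform and the content vector is the average of the fused vectors; a state aligned with the first position shifts nearly all weight onto it.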

In one embodiment, the reply module 405 is further configured to determine, through the reverse selector, a plurality of first character entities in the session content and the second character entities present in each candidate reply text; and to perform, through the reverse selector, similarity matching between the second character entities of each candidate reply text and the plurality of first character entities, and take the candidate reply text whose second character entities have the highest matching degree as the target reply text.

In one embodiment, the reply module 405 is further configured to count a first number of times that the question in the session content and each candidate reply text appear together as a question-answer combination pair; for each candidate reply text, count a second number of times that the candidate reply text appears in all historical question-answer combination pairs; for each candidate reply text, take the quotient of the corresponding first number of times and second number of times as the reasonable probability value of that candidate reply text; and take the candidate reply text with the highest reasonable probability value among the plurality of candidate reply texts as the target reply text.

Referring to fig. 5, in one embodiment, the apparatus further includes a model training module 406, configured to obtain texts belonging to different application fields, and pre-train the initial model through the texts belonging to the different application fields to obtain a text generation model applicable to the general field; acquire a generalized inquiry record belonging to the medical field, and retrain the text generation model applicable to the general field through the generalized inquiry record to obtain a generalized text generation model of the medical field; and acquire a personalized inquiry record corresponding to the auxiliary inquiry object, and retrain the generalized text generation model through the personalized inquiry record to obtain the personalized text generation model corresponding to the auxiliary inquiry object.

The above artificial intelligence-based inquiry session processing apparatus acquires session content sent by a target user object while participating in an inquiry session, and determines user attribute information related to the target user object. Text analysis is then performed on the session content to obtain a corresponding session structural expression. The session content, the session structural expression and the user attribute information are processed by the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text satisfying the matching condition is then screened out from the plurality of candidate reply texts by the reverse selector and sent in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of that object. Because the target reply text is generated from the session content, the session structural expression and the user attribute information and is then reverse-selected, it matches the user's needs more closely and accurately. Storing knowledge in the model in this way avoids a large number of explicit rules and reduces the difficulty of implementing and maintaining a knowledge base; it also allows a large number of long-tail session requests to be answered, which improves reply accuracy and, in turn, customer satisfaction.

For specific limitations on the artificial intelligence-based inquiry session processing apparatus, reference may be made to the limitations on the artificial intelligence-based inquiry session processing method above, which are not repeated here. Each of the above modules in the apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in the form of hardware, or may be stored in a memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to each module.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements an artificial intelligence-based inquiry session processing method.

It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of: acquiring session content sent by a target user object when participating in a consultation session, and determining user attribute information related to the target user object; text analysis is carried out on the conversation content to obtain a corresponding conversation structural expression; determining a personalized text generation model corresponding to an auxiliary consultation object participating in a consultation session; the conversation content, the conversation structural expression and the user attribute information are taken as input data together, the input data is processed through a personalized text generation model, and a plurality of candidate reply texts are output; and screening target reply texts meeting the matching conditions from the plurality of candidate reply texts through a reverse selector, and replying the target reply texts in a consultation session through an auxiliary consultation object.

In one embodiment, the processor when executing the computer program further performs the steps of: inputting the session content into an intention prediction model, processing the session content through the intention prediction model, and outputting a target intention category corresponding to the session content; extracting keywords in session content, and determining target attribute categories to which the keywords respectively belong; and constructing a conversation structural expression of the conversation content according to the target intention category, each keyword and the target attribute category to which each keyword respectively belongs.

In one embodiment, the processor when executing the computer program further performs the steps of: respectively carrying out coding processing on session content, session structural expression and user attribute information through a plurality of parallel encoders in the personalized text generation model to obtain a corresponding coding vector sequence; fusing the coded vector sequences output by each encoder through a fusion module in the personalized text generation model to obtain a fused vector sequence; decoding the fusion vector sequence through a decoder in the personalized text generation model, and outputting a plurality of candidate reply texts.

In one embodiment, the processor when executing the computer program further performs the steps of: acquiring the current attention weight vector corresponding to the fusion vector sequence; according to the attention weight vector and the fusion vector sequence, calculating to obtain the current content vector; sequentially calculating to obtain a current decoding hidden layer vector according to the current content vector, a previous decoding hidden layer vector and a word vector of a candidate word determined in the previous time, and determining the current candidate word according to the current decoding hidden layer vector and the current content vector; and outputting a plurality of candidate reply texts based on the plurality of groups of candidate word sequences obtained by sequential decoding.

In one embodiment, the processor when executing the computer program further performs the steps of: determining, by the reverse selector, a plurality of first character entities in the session content and the second character entities present in each candidate reply text; and performing, by the reverse selector, similarity matching between the second character entities of each candidate reply text and the plurality of first character entities, and taking the candidate reply text whose second character entities have the highest matching degree as the target reply text.

In one embodiment, the processor when executing the computer program further performs the steps of: counting a first number of times that the question in the session content and each candidate reply text appear together as a question-answer combination pair; for each candidate reply text, counting a second number of times that the candidate reply text appears in all historical question-answer combination pairs; for each candidate reply text, taking the quotient of the corresponding first number of times and second number of times as the reasonable probability value of that candidate reply text; and taking the candidate reply text with the highest reasonable probability value among the plurality of candidate reply texts as the target reply text.

In one embodiment, the processor when executing the computer program further performs the steps of: acquiring texts belonging to different application fields, and pre-training an initial model through the texts belonging to the different application fields to obtain a text generation model applicable to the general field; acquiring a generalized inquiry record belonging to the medical field, and retraining the text generation model applicable to the general field through the generalized inquiry record to obtain a generalized text generation model of the medical field; and acquiring a personalized inquiry record corresponding to the auxiliary inquiry object, and retraining the generalized text generation model through the personalized inquiry record to obtain the personalized text generation model corresponding to the auxiliary inquiry object.

The above computer device acquires session content sent by a target user object while participating in an inquiry session, and determines user attribute information related to the target user object. Text analysis is then performed on the session content to obtain a corresponding session structural expression. The session content, the session structural expression and the user attribute information are processed by the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text satisfying the matching condition is then screened out from the plurality of candidate reply texts by the reverse selector and sent in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of that object. Because the target reply text is generated from the session content, the session structural expression and the user attribute information and is then reverse-selected, it matches the user's needs more closely and accurately. Storing knowledge in the model in this way avoids a large number of explicit rules and reduces the difficulty of implementing and maintaining a knowledge base; it also allows a large number of long-tail session requests to be answered, which improves reply accuracy and, in turn, customer satisfaction.

In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring session content sent by a target user object when participating in a consultation session, and determining user attribute information related to the target user object; text analysis is carried out on the conversation content to obtain a corresponding conversation structural expression; determining a personalized text generation model corresponding to an auxiliary consultation object participating in a consultation session; the conversation content, the conversation structural expression and the user attribute information are taken as input data together, the input data is processed through a personalized text generation model, and a plurality of candidate reply texts are output; and screening target reply texts meeting the matching conditions from the plurality of candidate reply texts through a reverse selector, and replying the target reply texts in a consultation session through an auxiliary consultation object.

With the above storage medium, session content sent by a target user object while participating in an inquiry session is acquired, and user attribute information related to the target user object is determined. Text analysis is then performed on the session content to obtain a corresponding session structural expression. The session content, the session structural expression and the user attribute information are processed by the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text satisfying the matching condition is then screened out from the plurality of candidate reply texts by the reverse selector and sent in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of that object. Because the target reply text is generated from the session content, the session structural expression and the user attribute information and is then reverse-selected, it matches the user's needs more closely and accurately. Storing knowledge in the model in this way avoids a large number of explicit rules and reduces the difficulty of implementing and maintaining a knowledge base; it also allows a large number of long-tail session requests to be answered, which improves reply accuracy and, in turn, customer satisfaction.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory may include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this description.

The above embodiments represent only several implementations of the application. They are described in relative detail, but are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and improvements without departing from the concept of the application, all of which fall within the scope of protection of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. An artificial intelligence-based inquiry session processing method, characterized by comprising the following steps:

acquiring session content sent by a target user object when participating in an inquiry session, and determining user attribute information related to the target user object;

constructing a session structural expression of the session content according to the target intention category, the keywords and the target attribute categories to which the keywords respectively belong;

screening target reply texts meeting matching conditions from the plurality of candidate reply texts through a reverse selector, and replying the target reply texts in the inquiry session through the auxiliary inquiry object;

the training step of the personalized text generation model comprises the following steps:

acquiring a generalized inquiry record belonging to the medical field, and retraining the text generation model applicable to the general field through the generalized inquiry record to obtain a generalized text generation model of the medical field;

and acquiring a personalized inquiry record corresponding to the auxiliary inquiry object, and retraining the generalized text generation model through the personalized inquiry record to obtain a personalized text generation model corresponding to the auxiliary inquiry object.

2. The method of claim 1, wherein taking the session content, the session structured expression, and the user attribute information together as input data, processing the input data through the personalized text generation model, and outputting a plurality of candidate reply texts comprises:

3. The method of claim 2, wherein decoding the fused vector sequence by a decoder in the personalized text generation model and outputting a plurality of candidate reply texts comprises:

sequentially calculating a current decoding hidden layer vector from the current content vector, the previous decoding hidden layer vector, and the word vector of the candidate word determined at the previous step, and determining the current candidate word according to the current decoding hidden layer vector and the current content vector;

and outputting a plurality of candidate reply texts based on the plurality of groups of candidate word sequences obtained by sequential decoding.

4. The method of claim 1, wherein screening, by a reverse selector, a target reply text that satisfies a matching condition from the plurality of candidate reply texts comprises:

5. The method of claim 1, wherein screening, by a reverse selector, a target reply text that satisfies a matching condition from the plurality of candidate reply texts comprises:

6. The method according to any one of claims 1 to 5, characterized in that the text belonging to different application fields and/or the generalized and/or personalized inquiry records belonging to the medical field are stored in a blockchain.

7. An artificial intelligence based inquiry session processing apparatus, the apparatus comprising:

the text analysis module is used for inputting the session content into an intention prediction model, processing the session content through the intention prediction model and outputting a target intention category corresponding to the session content; extracting keywords in the session content, and determining target attribute categories to which the keywords respectively belong; constructing a session structural expression of the session content according to the target intention category, the keywords and the target attribute categories to which the keywords respectively belong;

the reply module is used for screening target reply texts meeting the matching conditions from the candidate reply texts through a reverse selector, and replying the target reply texts in the inquiry session through the auxiliary inquiry object;

the training module is used for acquiring texts belonging to different application fields, and pre-training the initial model through the texts belonging to the different application fields to obtain a text generation model applicable to the general field; acquiring a generalized inquiry record belonging to the medical field, and retraining the text generation model applicable to the general field through the generalized inquiry record to obtain a generalized text generation model of the medical field; and acquiring a personalized inquiry record corresponding to the auxiliary inquiry object, and retraining the generalized text generation model through the personalized inquiry record to obtain a personalized text generation model corresponding to the auxiliary inquiry object.

8. The apparatus according to claim 7, wherein the model processing module is specifically configured to:

9. The apparatus of claim 8, wherein the model processing module is further configured to:

10. The apparatus of claim 7, wherein the reply module is specifically configured to:

11. The apparatus of claim 7, wherein the reply module is specifically configured to:

12. The apparatus according to any of claims 7 to 11, characterized in that the text belonging to different application fields and/or the generalized and/or personalized inquiry records belonging to the medical field are stored in a blockchain.

13. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.

14. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.