CN112307168A - Artificial intelligence-based inquiry session processing method and device and computer equipment

Info

Publication number: CN112307168A
Application number: CN202011190094.5A
Authority: CN
Inventors: 李帅
Original assignee: Kangjian Information Technology Shenzhen Co Ltd
Current assignee: Kangjian Information Technology Shenzhen Co Ltd
Priority date: 2020-10-30
Filing date: 2020-10-30
Publication date: 2021-02-02
Anticipated expiration: 2040-10-30
Also published as: CN112307168B

Abstract

The application relates to artificial intelligence and provides an artificial intelligence based inquiry session processing method and device, computer equipment and a storage medium. The method comprises the following steps: acquiring session content sent by a target user object when participating in an inquiry session, and determining user attribute information related to the target user object; performing text analysis on the conversation content to obtain a corresponding conversation structural expression; determining a personalized text generation model corresponding to an auxiliary inquiry object participating in the inquiry session; taking the conversation content, the conversation structural expression and the user attribute information together as input data, processing the input data through the personalized text generation model, and outputting a plurality of candidate reply texts; and screening out, through a reverse selector, a target reply text meeting the matching condition from the candidate reply texts, and replying with the target reply text in the inquiry session through the auxiliary inquiry object. By adopting the method, reply accuracy can be improved.

Description

Artificial intelligence-based inquiry session processing method and device and computer equipment

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to an inquiry session processing method and apparatus, a computer device, and a storage medium based on artificial intelligence.

Background

With the development of computer technology, artificial intelligence technology is also developing rapidly and is affecting various industries with unprecedented power, for example in application fields such as robot customer service and conversation assistants. In these fields, robots mainly replace professional personnel in replying to the business questions that users consult about.

The conversation assistant in the traditional scheme can be regarded as a processing interface to a knowledge base, except that its input is interactive content in natural language form and its output is response content in text form. Most conventional conversation assistants respond by matching appropriate answers from the knowledge base based on rules. However, rule-based matching is limited by how the rules are formulated and by the coverage of the knowledge base, so responses can be inaccurate.

Disclosure of Invention

In view of the above, there is a need to provide an artificial intelligence based inquiry session processing method, apparatus, computer device and storage medium capable of improving accuracy of reply response.

An artificial intelligence based inquiry session processing method, the method comprising:

acquiring session content sent by a target user object when participating in an inquiry session, and determining user attribute information related to the target user;

performing text analysis on the conversation content to obtain a corresponding conversation structural expression;

determining a personalized text generation model corresponding to an auxiliary inquiry object participating in the inquiry session;

the conversation content, the conversation structural expression and the user attribute information are jointly used as input data, the input data are processed through the personalized text generation model, and a plurality of candidate reply texts are output;

and screening out target reply texts meeting matching conditions from the candidate reply texts through a reverse selector, and replying the target reply texts in the inquiry session through the auxiliary inquiry object.

In one embodiment, the performing text analysis on the conversation content to obtain a corresponding conversation structural expression includes:

inputting the conversation content into an intention prediction model, processing the conversation content through the intention prediction model, and outputting a target intention category corresponding to the conversation content;

extracting keywords in the session content, and determining target attribute categories to which the keywords respectively belong;

and constructing a conversation structural expression of the conversation content according to the target intention category, the keywords and the target attribute categories to which the keywords respectively belong.

In one embodiment, the jointly using the conversation content, the conversation structured expression and the user attribute information as input data, and processing the input data through the personalized text generation model to output a plurality of candidate reply texts includes:

respectively coding the conversation content, the conversation structural expression and the user attribute information through a plurality of parallel coders in the personalized text generation model to obtain corresponding coding vector sequences;

fusing the coding vector sequences output by the encoders through a fusion module in the personalized text generation model to obtain fusion vector sequences;

and decoding the fusion vector sequence through a decoder in the personalized text generation model, and outputting a plurality of candidate reply texts.

In one embodiment, the decoding, by a decoder in the personalized text generation model, the fused vector sequence to output a plurality of candidate reply texts includes:

acquiring the current attention weight vector corresponding to the fusion vector sequence;

calculating to obtain a current content vector according to the attention weight vector and the fusion vector sequence;

calculating to obtain a current decoding hidden layer vector according to the current content vector, the previous decoding hidden layer vector and the word vector of the target word determined at the previous time in sequence, and determining the current target word according to the current decoding hidden layer vector and the current content vector;

and outputting a plurality of candidate reply texts based on a plurality of groups of target word sequences obtained by sequential decoding.

In one embodiment, the screening out, by the reverse selector, a target reply text meeting a matching condition from the plurality of candidate reply texts includes:

determining, by the reverse selector, a plurality of first character entities in the conversation content and respectively existing second character entities in each candidate reply text;

and respectively carrying out similarity matching on the second character entities corresponding to the candidate reply texts and the plurality of first character entities through the reverse selector, and taking the candidate reply text corresponding to the second character entity with the maximum matching degree as a target reply text.

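In one embodiment, the screening out, by the reverse selector, a target reply text meeting a matching condition from the plurality of candidate reply texts includes:

calculating a first number of times that the question in the conversation content and each candidate reply text respectively appear together as a question-answer combination pair;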

for each candidate reply text, respectively calculating the second times of occurrence of each candidate reply text in all the historical question-answer combination pairs;

for each candidate reply text, taking the quotient of the first time and the second time corresponding to the corresponding candidate reply text as a reasonable probability value corresponding to the corresponding candidate reply text;

and taking the candidate reply text with the maximum reasonable probability value in the candidate reply texts as the target reply text.

In one embodiment, the step of training the personalized text generation model comprises:

acquiring texts belonging to different application fields, and pre-training an initial model through the texts belonging to the different application fields to obtain a text generation model suitable for the general field;

acquiring a universal inquiry record belonging to the medical field, and retraining a text generation model of the universal field through the universal inquiry record to obtain a universal text generation model;

acquiring an individualized inquiry record corresponding to the auxiliary inquiry object, and retraining the generalized text generation model through the individualized inquiry record to obtain an individualized text generation model corresponding to the auxiliary inquiry object; wherein the texts belonging to different application fields and/or the generalized and/or personalized inquiry records belonging to the medical field are stored in the blockchain.

An artificial intelligence based inquiry session processing apparatus, the apparatus comprising:

the acquisition module is used for acquiring session content sent by a target user object when participating in an inquiry session and determining user attribute information related to the target user;

the text analysis module is used for performing text analysis on the conversation content to obtain a corresponding conversation structural expression;

the determining module is used for determining a personalized text generation model corresponding to an auxiliary inquiry object participating in the inquiry session;

the model processing module is used for taking the conversation content, the conversation structural expression and the user attribute information as input data together, processing the input data through the personalized text generation model and outputting a plurality of candidate reply texts;

and the reply module is used for screening out target reply texts meeting the matching conditions from the candidate reply texts through a reverse selector and replying the target reply texts in the inquiry session through the auxiliary inquiry object.

A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of the above artificial intelligence based inquiry session processing method when executing the computer program.

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the above artificial intelligence based inquiry session processing method.

According to the above artificial intelligence based inquiry session processing method, apparatus, computer device and storage medium, the session content sent by the target user object when participating in the inquiry session is acquired, and the user attribute information related to the target user is determined. Text analysis is then performed on the conversation content to obtain the corresponding conversation structural expression. The conversation content, the conversation structural expression and the user attribute information are processed through the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text meeting the matching condition is then screened out from the candidate reply texts through a reverse selector and replied in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of the auxiliary inquiry object. Moreover, because the target reply text is generated based on the conversation content, the conversation structural expression and the user attribute information and is then selected in reverse, it fits the user's needs more closely and more accurately. This approach avoids a large number of explicit rules, reduces the difficulty of implementing and maintaining a knowledge base, can respond to a large number of long-tail session requests, improves reply accuracy, and thereby improves customer satisfaction.

Drawings

FIG. 1 is a diagram of an application environment for an artificial intelligence based inquiry session processing method in one embodiment;

FIG. 2 is a schematic flow diagram illustrating an artificial intelligence based inquiry session processing method in one embodiment;

FIG. 3 is a flowchart illustrating the training steps for the personalized text generation model in one embodiment;

FIG. 4 is a block diagram of an artificial intelligence based inquiry session processing apparatus in one embodiment;

FIG. 5 is a block diagram showing an artificial intelligence based inquiry session processing apparatus according to another embodiment;

FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The artificial intelligence based inquiry session processing method is mainly used in the field of natural language processing and can be applied to the application environment shown in fig. 1, in which the terminal 102 communicates with the server 104 via a network. The terminal 102 and the server 104 may each execute the artificial intelligence based inquiry session processing method provided in the embodiments of the present application alone, or may execute it cooperatively. For example, the terminal may obtain session content sent by the target user object while participating in the inquiry session, and determine user attribute information related to the target user; perform text analysis on the conversation content to obtain a corresponding conversation structural expression; determine a personalized text generation model corresponding to an auxiliary inquiry object participating in the inquiry session; take the conversation content, the conversation structural expression and the user attribute information together as input data, process the input data through the personalized text generation model, and output a plurality of candidate reply texts; and screen out, through a reverse selector, a target reply text meeting the matching condition from the candidate reply texts, and reply with the target reply text in the inquiry session through the auxiliary inquiry object.

The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 104 may be implemented by an independent server or a server cluster formed by a plurality of servers.

In one embodiment, as shown in fig. 2, an artificial intelligence based inquiry session processing method is provided. The method is described by taking its application to a computer device as an example, where the computer device may specifically be the terminal or the server in fig. 1. The artificial intelligence based inquiry session processing method comprises the following steps:

Step S202, obtaining the session content sent by the target user object when participating in the inquiry session, and determining the user attribute information related to the target user object.

The inquiry session is a session for performing medical inquiry communication between the user and the auxiliary inquiry object through the user account, and may specifically be a text session or a voice session. The user attribute information is attribute information related to a target user object, such as gender, age, or native place of the user.

Specifically, the target user may enter a medical inquiry page through the user terminal and input user attribute information in the medical inquiry page. The page then jumps to a corresponding session interface for conducting a session with the auxiliary inquiry object, in which the user may enter the content that the user wishes to ask about. The computer device may obtain the user attribute information previously input by the user and the session content input by the user.

Step S204, performing text analysis on the conversation content to obtain a corresponding conversation structural expression.

The conversation structural expression is skeleton information extracted from the conversation content according to a preset structure rule. Specifically, the computer device may perform word segmentation on the conversation content to obtain a corresponding word sequence and then remove modal particles, function words and the like to obtain a keyword set. The computer device may then determine the corresponding conversation structural expression according to the keyword set.

In one embodiment, the text analysis of the conversation content to obtain the corresponding conversation structural expression includes: inputting the conversation content into an intention prediction model, processing the conversation content through the intention prediction model, and outputting a target intention category corresponding to the conversation content; extracting keywords in the session content, and determining target attribute categories to which the keywords respectively belong; and constructing a conversation structural expression of the conversation content according to the target intention category, the keywords and the target attribute categories to which the keywords respectively belong.

Specifically, the computer device may input the conversation content into the intention prediction model, perform text feature extraction on the text corresponding to the conversation content through the intention prediction model, perform classification processing based on the extracted text features to obtain the probability corresponding to each preset intention category, and output the preset intention category with the highest probability as the target intention category. An intention category is, for example, "to confirm symptoms related to a disease" or "to inquire about a treatment plan for a disease", that is, the purpose of the user when performing an online inquiry.
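The final selection step described above can be sketched as follows. This is a minimal illustration, assuming the intention prediction model yields one raw score per preset intention category; the function names `softmax` and `pick_intent`, the category list and the scores are hypothetical and are not taken from the application.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pick_intent(categories, scores):
    """Return the preset intention category with the highest probability."""
    probs = softmax(scores)
    best = max(range(len(categories)), key=lambda i: probs[i])
    return categories[best], probs[best]

# Hypothetical categories and scores, for illustration only.
CATEGORIES = [
    "to confirm symptoms related to a disease",
    "to inquire about a treatment plan for a disease",
]
intent, prob = pick_intent(CATEGORIES, [2.0, 0.5])
```

Here the first category receives the larger score and is therefore output as the target intention category.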

In one embodiment, the intention prediction model mainly predicts, based on the inquiry session, the intention and purpose with which the user initiated the session, so that the user's needs can be responded to on the basis of that intention. The intention prediction model can be obtained by training in advance on sample texts and the preset intention categories to which the sample texts belong. In the training process, the computer device can input a sample text into the intention prediction model, which extracts the text features of the sample text and performs classification processing based on the extracted text features to obtain a predicted intention category. The computer device then adjusts the model parameters according to the difference between the predicted intention category and the preset intention category to which the sample text belongs. The intention prediction model with adjusted parameters then processes the next group of sample texts, and training continues until the training stop condition is reached, yielding the trained intention prediction model. The training stop condition may specifically be that a preset number of iterations is reached, or that the performance of the trained intention prediction model reaches a preset level.

Further, the computer device may perform text analysis on the conversation content in a word segmentation or dictionary matching manner, extract a keyword set in the conversation content, and further determine categories to which each keyword in the keyword set belongs, such as a "disease" category, a "related symptom" category, and the like.

Furthermore, the computer device can splice the target intention category, each keyword and the target attribute category to which each keyword belongs by using preset characters to obtain the corresponding structural expression. For example, if the user asks "Does hypertension lead to tinnitus?", the corresponding conversation structural expression is: "intention": "confirmation of symptoms associated with disease"; "disease": "hypertension"; "associated symptoms": "tinnitus".
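The dictionary matching and splicing steps can be illustrated with a short sketch. It assumes a small hand-written keyword dictionary and a "; " separator as the preset characters; `KEYWORD_CATEGORIES`, `extract_keywords` and `build_structured_expression` are hypothetical names chosen for this example only.

```python
# Hypothetical keyword dictionary mapping keywords to attribute categories.
KEYWORD_CATEGORIES = {
    "hypertension": "disease",
    "tinnitus": "associated symptoms",
}

def extract_keywords(text, dictionary=KEYWORD_CATEGORIES):
    """Dictionary matching: return (category, keyword) pairs in order of appearance."""
    lowered = text.lower()
    hits = [(lowered.find(k), c, k) for k, c in dictionary.items() if k in lowered]
    return [(c, k) for _, c, k in sorted(hits)]

def build_structured_expression(intent, pairs, sep="; "):
    """Splice the intention and the category/keyword pairs with a preset separator."""
    fields = [("intention", intent)] + list(pairs)
    return sep.join(f'"{name}": "{value}"' for name, value in fields)

expr = build_structured_expression(
    "confirmation of symptoms associated with disease",
    extract_keywords("Does hypertension lead to tinnitus?"),
)
```

A real system would presumably use word segmentation and a medical vocabulary rather than substring lookup; the sketch only shows how the three kinds of information end up in one spliced string.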

In other embodiments, the computer device may further convert the target intent categories into corresponding vector representations, and convert the plurality of keywords in the session content and the corresponding affiliated target attribute categories into corresponding vector representations. Further, the computer device can splice the vector expressions to obtain corresponding session structured expressions.

In the above embodiment, the inquiry intention of the user is predicted through the intention prediction model, and the target attribute category to which each keyword in the conversation content belongs is determined based on the conversation content of the user, so that the conversation structural expression of the conversation content is constructed from the target intention category, each keyword, and the target attribute category to which each keyword belongs. The structural expression discards useless information in the conversation and can be regarded as the skeleton of the conversation, that is, its minimal expression. Because the user's target intention category is added to it, the user's question can be responded to and answered more accurately.

Step S206, determining the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session.

The auxiliary inquiry object is an object for replying to the inquiry questions of a user. In the present application it refers to a service account in a computer device; the service account may specifically be used by a professional doctor, by an artificial intelligence object, or the like, which is not limited in the embodiments of the present application.

In addition, different auxiliary inquiry objects have their own corresponding personalized text generation models. Each personalized text generation model is obtained by training on the personalized inquiry records of the corresponding auxiliary inquiry object, so that the reply text output by the model has a language style similar to that of the auxiliary inquiry object and is readily accepted by users.

Step S208, taking the conversation content, the conversation structural expression and the user attribute information as input data together, processing the input data through a personalized text generation model, and outputting a plurality of candidate reply texts.

Specifically, the computer device can construct input data according to the session text, the session structured expression and the user attribute information, further input the input data into the personalized text generation model, and output a plurality of candidate reply texts through the personalized text generation model.

In one embodiment, the computer device may concatenate the session text, the session structured representation, and the user attribute information via preset characters (such as commas or semicolons) to compose the input data.
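As a minimal sketch of this concatenation, assuming a semicolon between the three parts and a comma between individual attributes (the function name `build_input` and the `key=value` attribute format are illustrative assumptions):

```python
def build_input(session_text, structured_expression, user_attributes, sep=";"):
    """Join the three input parts with a preset separator character."""
    # Attributes are rendered as key=value pairs; this format is an assumption.
    attr_text = ",".join(f"{key}={value}" for key, value in user_attributes.items())
    return sep.join([session_text, structured_expression, attr_text])

model_input = build_input(
    "Does hypertension lead to tinnitus?",
    '"disease": "hypertension"',
    {"gender": "female", "age": 45},
)
```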

It should be noted that the model structure of the personalized text generation model in the present application may specifically be an encoder-decoder structure, and may specifically be a GPT (Generative Pre-Training) model. The personalized text generation model may perform encoding and decoding processing on the input data so as to output a plurality of candidate reply texts.

Specifically, an encoder in the personalized text generation model may encode the input data to obtain an encoded vector sequence. A decoder then decodes the encoded vector sequence step by step: at each decoding step, it obtains the current target-end vector from the encoded vector sequence and the word vector of the candidate word determined at the previous step, and determines the current candidate word according to the current target-end vector. It is understood that the decoder may produce multiple candidate words at each decoding step. In this way, the personalized text generation model can combine the candidate words to output a plurality of candidate reply texts.

In one embodiment, taking the conversation content, the conversation structural expression and the user attribute information together as input data, processing the input data through the personalized text generation model, and outputting a plurality of candidate reply texts comprises: respectively encoding the conversation content, the conversation structural expression and the user attribute information through a plurality of parallel encoders in the personalized text generation model to obtain corresponding encoded vector sequences; fusing the encoded vector sequences output by the encoders through a fusion module in the personalized text generation model to obtain a fusion vector sequence; and decoding the fusion vector sequence through a decoder in the personalized text generation model, and outputting a plurality of candidate reply texts.

Specifically, the personalized text generation model comprises at least a plurality of parallel encoders, a fusion module and a decoder. Through three parallel encoders in the model, the computer device encodes the conversation content, the conversation structural expression and the user attribute information respectively to obtain encoded vector sequences corresponding to the three dimensions. The encoded vector sequences of the three dimensions are then fused by the fusion module to obtain a fusion vector sequence. The fusion vector sequence is decoded step by step by the decoder, and a plurality of candidate reply texts are output by a beam search algorithm.

In one embodiment, decoding the fused vector sequence by a decoder in the personalized text generation model to output a plurality of candidate reply texts comprises: acquiring the current attention weight vector corresponding to the fusion vector sequence; calculating to obtain a current content vector according to the attention weight vector and the fusion vector sequence; calculating to obtain a current decoding hidden layer vector according to the current content vector, the previous decoding hidden layer vector and a word vector of a candidate word determined at the previous time in sequence, and determining the current candidate word according to the current decoding hidden layer vector and the current content vector; and outputting a plurality of candidate reply texts based on the plurality of groups of candidate word sequences obtained by sequential decoding.

It can be understood that the decoding process of the computer device is time-sequential, and in the decoding process, the computer device decodes the fused vector sequence according to the word vector of the candidate word obtained by the previous decoding to obtain the current decoding hidden layer vector, and then determines the current candidate word according to the current decoding hidden layer vector.

In one embodiment, when the personalized text generation model performs decoding, it can apply an attention mechanism; that is, the attention weight vector of the current time corresponding to the fusion vector sequence is obtained, and the content vector of the current time is calculated according to the attention weight vector and the fusion vector sequence. Specifically, each attention weight in the attention weight vector may be multiplied by the corresponding fusion vector, and the products are then summed to obtain the current content vector.
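The weighted sum described above can be written out directly. The sketch below uses plain Python lists for clarity; the function name `content_vector` and the toy numbers are assumptions made for illustration, and how the attention weights themselves are computed is left out.

```python
def content_vector(attention_weights, fusion_vectors):
    """Weighted sum of the fusion vectors: c = sum_i a_i * h_i."""
    if len(attention_weights) != len(fusion_vectors):
        raise ValueError("one attention weight is needed per fusion vector")
    dim = len(fusion_vectors[0])
    result = [0.0] * dim
    for weight, vector in zip(attention_weights, fusion_vectors):
        for j in range(dim):
            result[j] += weight * vector[j]
    return result

# Three fusion vectors of dimension 2 with weights that sum to 1.
c = content_vector([0.5, 0.25, 0.25], [[1.0, 0.0], [0.0, 4.0], [2.0, 2.0]])
```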

Furthermore, the computer device may calculate a current decoding hidden layer vector according to the current content vector, a previous decoding hidden layer vector, and a word vector of a candidate word determined previously, and determine a current candidate word according to the current decoding hidden layer vector and the current content vector.

In one embodiment, the computer device may determine one or more candidate words according to the current decoding hidden layer vector. Specifically, the decoder may calculate the output probability sequence of the current decoding step (also referred to as the current time) according to the current decoding hidden layer vector. The output probability sequence is the sequence formed by the probabilities that each candidate word in the output-end word set is the target word output at the current time. Further, the personalized text generation model may select the candidate word with the maximum probability in the output probability sequence as the current candidate word, or select the several candidate words (for example, the top 3) with the highest probability values in the output probability sequence as the current candidate words.

When a plurality of candidate words are determined at the current time, the computer device may perform a plurality of groups of decoding processes in parallel at the next decoding step. The decoder can calculate a current decoding hidden layer vector for each candidate word determined at the previous time. In this way, a plurality of current decoding hidden layer vectors are obtained, and the current candidate words are then determined according to them. These steps are repeated until a plurality of groups of candidate word sequences are obtained by decoding. The decoder may then output a plurality of candidate reply texts based on the plurality of groups of candidate word sequences.

In one embodiment, when decoding the fusion vector sequence, the computer device may use a greedy search algorithm or a beam search algorithm to perform the decoding processing, so as to obtain the candidate reply texts.
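To make the multi-candidate decoding concrete, the following is a minimal beam search sketch. It assumes a callable `next_probs(words)` that stands in for the decoder and returns a word-to-probability mapping for the next position; `beam_search`, `TOY_MODEL` and `toy_next_probs` are hypothetical names, and the toy probabilities are invented for illustration. Setting `beam_width=1` reduces it to greedy search.

```python
import math

def beam_search(next_probs, beam_width=3, max_len=10, end="<eos>"):
    """Keep the beam_width most probable partial sequences at each step."""
    beams = [([], 0.0)]  # (word sequence, accumulated log-probability)
    finished = []
    for _ in range(max_len):
        expanded = []
        for words, score in beams:
            for word, p in next_probs(words).items():
                if p > 0.0:
                    expanded.append((words + [word], score + math.log(p)))
        if not expanded:
            break
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for words, score in expanded[:beam_width]:
            if words[-1] == end:
                finished.append((words[:-1], score))  # drop the end marker
            else:
                beams.append((words, score))
        if not beams:
            break
    finished.extend(beams)  # include sequences cut off at max_len
    finished.sort(key=lambda item: item[1], reverse=True)
    return [" ".join(words) for words, _ in finished[:beam_width]]

# Toy stand-in for the decoder: fixed next-word probabilities.
TOY_MODEL = {
    (): {"yes": 0.6, "possibly": 0.4},
    ("yes",): {"<eos>": 1.0},
    ("possibly",): {"<eos>": 1.0},
}

def toy_next_probs(words):
    return TOY_MODEL.get(tuple(words), {"<eos>": 1.0})

candidates = beam_search(toy_next_probs, beam_width=2)
greedy = beam_search(toy_next_probs, beam_width=1)
```

With a beam width of 2 both toy replies survive as candidates, whereas the greedy setting keeps only the most probable one.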

Step S210, screening out a target reply text meeting the matching condition from the candidate reply texts through a reverse selector, and replying with the target reply text in the inquiry session through the auxiliary inquiry object.

Specifically, the computer device may, through the reverse selector, screen out from the plurality of candidate reply texts the candidate reply text that matches the conversation content as the target reply text. The target reply text is then replied in the inquiry session through the auxiliary inquiry object.

In one embodiment, the step of screening out the target reply text meeting the matching condition from a plurality of candidate reply texts through a reverse selector comprises the following steps: determining, by the reverse selector, a plurality of first character entities in the conversation content and respectively existing second character entities in each candidate reply text; and respectively carrying out similarity matching on the second character entities corresponding to the candidate reply texts and the plurality of first character entities through a reverse selector, and taking the candidate reply text corresponding to the second character entity with the maximum matching degree as a target reply text.

Specifically, the computer device may perform entity matching on each candidate reply text with an inquiry question in the session content provided by the user, and use the candidate reply text with the largest matching degree as the target reply text according to the result of the entity matching.
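The entity-matching selection above can be illustrated with a short sketch. The application does not specify how entities are extracted or how the matching degree is computed, so both are assumptions here: entities are found by lookup in a hypothetical vocabulary, and the matching degree is the Jaccard overlap between the two entity sets.

```python
from typing import Iterable, List, Set


def extract_entities(text: str, vocabulary: Iterable[str]) -> Set[str]:
    """Assumed extractor: return the vocabulary terms that occur in the text."""
    lowered = text.lower()
    return {term for term in vocabulary if term.lower() in lowered}


def jaccard(first: Set[str], second: Set[str]) -> float:
    """Assumed matching degree: size of intersection over size of union."""
    if not first and not second:
        return 0.0
    return len(first & second) / len(first | second)


def select_by_entity_match(session_content: str, candidates: List[str],
                           vocabulary: Iterable[str]) -> str:
    """Pick the candidate whose entities best match those of the session content."""
    vocab = list(vocabulary)
    first_entities = extract_entities(session_content, vocab)
    return max(
        candidates,
        key=lambda reply: jaccard(first_entities, extract_entities(reply, vocab)),
    )


vocabulary = ["headache", "fever", "cough"]
question = "I have had a headache and a fever since yesterday"
candidate_replies = [
    "Please describe your symptoms.",
    "For the headache and fever, rest and monitor your temperature.",
    "Is the cough dry?",
]
target_reply = select_by_entity_match(question, candidate_replies, vocabulary)
```

In this toy example only the second candidate mentions both entities found in the question, so it receives the highest matching degree.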

In another embodiment, the step of screening out, through the reverse selector, the target reply text meeting the matching condition from the plurality of candidate reply texts comprises: calculating, for each candidate reply text, a first number of times that the question in the conversation content and that candidate reply text appear together as a question-answer combination pair; calculating, for each candidate reply text, a second number of times that the candidate reply text appears in all historical question-answer combination pairs; for each candidate reply text, taking the quotient of the corresponding first number of times and second number of times as the reasonable probability value of that candidate reply text; and taking the candidate reply text with the maximum reasonable probability value as the target reply text.
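The count-based selection in this embodiment can be sketched as below. The history format (a list of question and reply pairs) and the names `select_by_count_ratio` and `history` are assumptions made for illustration; a reply that never appears in the history is given a value of 0 here, a case the application does not address.

```python
from collections import Counter
from typing import Dict, List, Tuple


def select_by_count_ratio(question: str, candidates: List[str],
                          history: List[Tuple[str, str]]
                          ) -> Tuple[str, Dict[str, float]]:
    """Score each candidate by (first number of times) / (second number of times)."""
    pair_counts = Counter(history)                          # (question, reply) pairs
    reply_counts = Counter(reply for _, reply in history)   # reply with any question

    def reasonable_probability(reply: str) -> float:
        second = reply_counts[reply]
        if second == 0:
            return 0.0  # assumed handling for replies unseen in the history
        return pair_counts[(question, reply)] / second

    scores = {reply: reasonable_probability(reply) for reply in candidates}
    best = max(candidates, key=scores.__getitem__)
    return best, scores


history = [
    ("q_fever", "Please rest."),
    ("q_cough", "Please rest."),
    ("q_rash", "Please rest."),
    ("q_fever", "Take your temperature twice a day."),
    ("q_fever", "Take your temperature twice a day."),
]
best_reply, reply_scores = select_by_count_ratio(
    "q_fever",
    ["Please rest.", "Take your temperature twice a day."],
    history,
)
```

The generic reply "Please rest." is paired with three different questions, so its quotient for this question is only 1/3, while the more specific reply scores 2/2.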

Specifically, the reverse selector may select the target reply text based on the principle of conditional probability: the higher the reasonable probability value, the more reasonable the candidate reply text. In one embodiment, let the question text asked by the user be Q (x1, ..., xm), let QS be the structured expression corresponding to the question text, and let BS be the user attribute information. The corresponding input text S is determined from the question text Q, the structured expression QS and the user attribute information BS. During training, the personalized text generation model then determines the target characters from the candidate characters by maximizing the conditional probability of the target characters given the input text.

The characters from m+1 to N are the target characters and form the corresponding candidate reply text. To select the target reply text from the plurality of candidate reply texts, the reverse selector may maximize the conditional probability of the question given the answer, that is, the probability of inferring the question in reverse from the candidate reply text. Intuitively, maximizing this backward model likelihood penalizes bland answers: a frequent, repetitive answer can be associated with many possible questions, so its probability under any particular question is lower.
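The formulas referred to in this passage are not reproduced in the text. A plausible reconstruction is given below, assuming the standard autoregressive factorization over the input characters x_1, ..., x_m and the target characters x_{m+1}, ..., x_N; the symbols T_k and T^* for the candidate reply texts and the selected reply are notation introduced here, not taken from the application.

```latex
% Assumed generation objective: likelihood of the target characters
% given the input characters, factorized character by character.
p(x_{m+1}, \dots, x_N \mid x_1, \dots, x_m)
  = \prod_{n=m+1}^{N} p(x_n \mid x_1, \dots, x_{n-1})

% Assumed reverse-selection objective over the candidate reply texts T_k:
% choose the reply from which the question Q is most probable.
T^{*} = \arg\max_{T_k} \; P(Q \mid T_k)
```

Under this reading, the quotient of counts used in the preceding embodiment is an empirical estimate of P(Q | T_k): the number of times Q and T_k occur as a pair divided by the number of times T_k occurs at all.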

Therefore, the matching condition of each candidate reply text with the conversation content is determined through the reverse selector, so that the candidate reply text which is most reasonably matched is output as the target reply text, and the reply accuracy can be greatly improved.

In the above artificial intelligence based inquiry session processing method, the conversation content sent by the target user object when participating in the inquiry session is acquired, and the user attribute information related to the target user object is determined. Text analysis is then performed on the conversation content to obtain the corresponding conversation structural expression. The conversation content, the conversation structural expression and the user attribute information are processed by the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text meeting the matching condition is then screened out from the candidate reply texts by the reverse selector and is replied in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of the auxiliary inquiry object. Because the target reply text is generated from the conversation content, the conversation structural expression and the user attribute information and is then selected in reverse, it is closer to the user's needs and more accurate. Storing domain knowledge in the model in this way avoids a large number of explicit rules, reduces the difficulty of implementing and maintaining a knowledge base, makes it possible to respond to a large number of long-tail session requests, improves reply accuracy, and further improves customer satisfaction.

In one embodiment, referring to fig. 3, the method further includes a step of training the personalized text generation model, where the step specifically includes:

Step S302, obtaining texts belonging to different application fields, and pre-training the initial model through the texts belonging to the different application fields to obtain a text generation model suitable for the general field.

In one embodiment, the training of the personalized text generation model comprises three stages. In the first stage, a text generation model for the general field is obtained by training on a large number of texts from different fields, such as novels, lyrics and poetry. A general-field text generation model trained on such texts can learn good language structure, so that its output text meets basic conditions such as sentence fluency. During pre-training, the model parameters may be adjusted according to the difference between the prediction output of the model and the subsequent text of the sample text until a training stop condition is reached, yielding the text generation model suitable for the general field.

Step S304, acquiring generalized inquiry records belonging to the medical field, and retraining the text generation model suitable for the general field through the generalized inquiry records to obtain a generalized text generation model in the medical field.

In the second stage, the computer device may perform specialized training on the text generation model suitable for the general field based on historical text records in the medical field (that is, the inquiry records of a plurality of different auxiliary inquiry objects), so as to obtain a generalized text generation model in the medical field. The specific training steps at this stage include: acquiring historical inquiry records, and processing the question content in each historical inquiry record to obtain the corresponding structured expression; for each historical inquiry record, determining the corresponding user attribute information and the inquiry reply in the record; splicing the question text corresponding to the question content, the structured expression and the user attribute information to form a training sample, and training the text generation model with the inquiry reply as the training label; and constructing a loss function according to the difference between the prediction output of the text generation model and the inquiry reply, so as to adjust the model parameters through the loss function. In this way, the generalized text generation model P is obtained by training on a large number of historical inquiry records in the medical field. The generalized text generation model obtained by this training can capture the text characteristics of the medical field well.
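The construction of a second-stage training sample can be sketched as follows. The application only states that the question text, the structured expression and the user attribute information are spliced together and that the inquiry reply serves as the label; the separator string, the `key=value` serialization, and the names `TrainingSample` and `build_sample` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict

SEP = " [SEP] "  # hypothetical separator between the three spliced parts


@dataclass
class TrainingSample:
    input_text: str  # spliced question text, structured expression, user attributes
    label: str       # the inquiry reply used as the training label


def serialize(fields: Dict[str, str]) -> str:
    """Assumed flat serialization of a dictionary as 'key=value; key=value'."""
    return "; ".join(f"{key}={value}" for key, value in fields.items())


def build_sample(question_text: str, structured_expression: Dict[str, str],
                 user_attributes: Dict[str, str], inquiry_reply: str
                 ) -> TrainingSample:
    """Splice the three inputs into one training text and attach the label."""
    input_text = SEP.join([
        question_text,
        serialize(structured_expression),
        serialize(user_attributes),
    ])
    return TrainingSample(input_text=input_text, label=inquiry_reply)


sample = build_sample(
    "I have had a fever for two days",
    {"intent": "symptom_consultation", "symptom": "fever", "duration": "two days"},
    {"age": "34", "sex": "female"},
    "Have you measured your temperature, and how high was it?",
)
```

The same splicing would presumably be applied at inference time so that the model sees inputs in the format it was trained on.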

Step S306, acquiring personalized inquiry records corresponding to the auxiliary inquiry objects, and retraining the generalized text generation model through the personalized inquiry records to obtain personalized text generation models corresponding to the auxiliary inquiry objects; wherein the texts belonging to different application fields and/or the generalized and/or personalized inquiry records belonging to the medical field are stored in the blockchain.

In the third stage, starting from the generalized text generation model P, the computer device may retrain the model for each auxiliary inquiry object based on the personalized inquiry records corresponding to that auxiliary inquiry object, fine-tuning the model parameters to obtain a personalized text generation model PX matched with the auxiliary inquiry object. Each person's manner of expression is embodied in that person's historical conversations and constitutes a personal style; here, the personal style is implicitly represented, to a certain extent, in numerical form in the model parameters. The reply style of the personalized text generation model can therefore better match the corresponding auxiliary inquiry object.

It should be emphasized that, to further ensure the privacy and security of the training samples, the texts belonging to different application fields and/or the generalized and/or personalized inquiry records belonging to the medical field may also be stored in a node of a blockchain.

The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.

In the embodiment, a large amount of domain knowledge can be stored in the model in an off-line stage, and the storage mode avoids a large amount of explicit rules and reduces the difficulty of realizing and maintaining the knowledge base. And the knowledge is shared by all the personalized conversation assistant systems, so that a large number of long-tailed conversation requests can be responded, the reply accuracy is improved, and the customer satisfaction is further improved. Because an individual text generation model can be customized for each person according to the individual historical conversation, the expression mode of the model is close to the individual style, the possibility of being accepted by people is improved, and the manual processing efficiency is improved.

In one embodiment, in actual use, when the personalized text generation model is used for replying, a man-machine combination mode may be adopted first: the staff member corresponding to the auxiliary inquiry object is provided with a veto button for the generated target reply text, and if no veto is detected within a preset number of seconds, the target reply text generated by the personalized text generation model is considered reasonable and may be sent directly to the user terminal. A target reply text vetoed by the staff member is used as a negative example for optimizing the personalized text generation model PX and the reverse selector. After the man-machine combination mode has run for a period of time and the number of vetoes is low enough, the personalized text generation model can be used on its own and then further refined through spot checks.
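The veto window described above can be sketched with a timed wait. This is only an illustration under assumptions: the class `VetoWindow`, the function `review_reply` and the in-memory list of negative examples are hypothetical names, and a real deployment would connect the veto button and the message delivery to the actual session system.

```python
import threading
from typing import List, Optional


class VetoWindow:
    """Waits a preset number of seconds for a staff veto on a generated reply."""

    def __init__(self, timeout_seconds: float) -> None:
        self._timeout = timeout_seconds
        self._vetoed = threading.Event()

    def veto(self) -> None:
        """Called when the staff member presses the veto button."""
        self._vetoed.set()

    def approved(self) -> bool:
        """Block until a veto arrives or the window expires; True means no veto."""
        return not self._vetoed.wait(self._timeout)


def review_reply(reply: str, window: VetoWindow,
                 negative_examples: List[str]) -> Optional[str]:
    """Return the reply if it may be sent, else record it as a negative example."""
    if window.approved():
        return reply
    negative_examples.append(reply)
    return None


negatives: List[str] = []

# No veto within the window: the reply is released for sending.
sent = review_reply("Please drink more water.", VetoWindow(0.05), negatives)

# Veto pressed before the window expires: the reply is held back.
vetoed_window = VetoWindow(5.0)
vetoed_window.veto()
held = review_reply("Take antibiotics.", vetoed_window, negatives)
```

The collected negative examples correspond to the vetoed replies that the embodiment uses to optimize the model PX and the reverse selector.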

It should be noted that the method of the present application can also be applied to other technical fields and does not depend on the formulation of rules. Informative interactive responses are generated from historical interaction information and the current interaction content, meeting customers' needs for professional knowledge in the field. This can greatly reduce the labor cost of customer service and improve customer satisfaction.

It should be understood that although the steps in the flow charts of fig. 2-3 are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 2-3 may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in FIG. 4, there is provided an artificial intelligence based interrogation session processing apparatus 400 comprising: an obtaining module 401, a text analysis module 402, a determining module 403, a model processing module 404, and a replying module 405, wherein:

the obtaining module 401 is configured to obtain session content sent by a target user object when participating in an inquiry session, and determine user attribute information related to the target user.

And the text analysis module 402 is configured to perform text analysis on the session content to obtain a corresponding session structural expression.

A determining module 403, configured to determine a personalized text generation model corresponding to an auxiliary inquiry object participating in an inquiry session.

And the model processing module 404 is configured to use the session content, the session structural expression, and the user attribute information together as input data, process the input data through the personalized text generation model, and output a plurality of candidate reply texts.

A reply module 405, configured to screen out, through a reverse selector, a target reply text that meets the matching condition from the plurality of candidate reply texts, and reply with the target reply text in the inquiry session through the auxiliary inquiry object.

In one embodiment, the text analysis module 402 is further configured to input the session content into an intention prediction model, process the session content through the intention prediction model, and output a target intention category corresponding to the session content; extracting keywords in the session content, and determining target attribute categories to which the keywords respectively belong; and constructing a conversation structural expression of the conversation content according to the target intention category, the keywords and the target attribute categories to which the keywords respectively belong.

In one embodiment, the model processing module 404 is further configured to encode the session content, the session structural expression, and the user attribute information respectively through a plurality of parallel encoders in the personalized text generation model to obtain corresponding encoding vector sequences; fusing the coding vector sequences output by each coder through a fusion module in the personalized text generation model to obtain fusion vector sequences; and decoding the fusion vector sequence through a decoder in the personalized text generation model, and outputting a plurality of candidate reply texts.

In one embodiment, the model processing module 404 is further configured to obtain a current attention weight vector corresponding to the fusion vector sequence; calculating to obtain a current content vector according to the attention weight vector and the fusion vector sequence; calculating to obtain a current decoding hidden layer vector according to the current content vector, the previous decoding hidden layer vector and a word vector of a candidate word determined at the previous time in sequence, and determining the current candidate word according to the current decoding hidden layer vector and the current content vector; and outputting a plurality of candidate reply texts based on the plurality of groups of candidate word sequences obtained by sequential decoding.
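The computation of the current content vector from the attention weight vector and the fusion vector sequence can be sketched in plain Python. The application does not say how the attention weights are obtained, so the dot-product scoring against the previous decoding hidden layer vector followed by a softmax is an assumption, as are the function names.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention_weights(prev_hidden: Sequence[float],
                      fused_sequence: Sequence[Sequence[float]]) -> List[float]:
    """Assumed scoring: dot product of the previous hidden vector with each fused vector."""
    scores = [sum(h * f for h, f in zip(prev_hidden, vector))
              for vector in fused_sequence]
    return softmax(scores)


def content_vector(weights: Sequence[float],
                   fused_sequence: Sequence[Sequence[float]]) -> List[float]:
    """Weighted sum of the fusion vectors, one value per dimension."""
    dim = len(fused_sequence[0])
    return [sum(w * vector[d] for w, vector in zip(weights, fused_sequence))
            for d in range(dim)]


fused = [[1.0, 0.0], [0.0, 1.0]]          # toy fusion vector sequence
weights = attention_weights([2.0, 0.0], fused)
context = content_vector(weights, fused)  # leans toward the first fusion vector
```

The resulting content vector, together with the previous decoding hidden layer vector and the word vector of the previously determined candidate word, would then feed the calculation of the current decoding hidden layer vector.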

In one embodiment, the reply module 405 is further configured to determine, through the reverse selector, a plurality of first character entities in the conversation content and the second character entities present in each candidate reply text; and to perform, through the reverse selector, similarity matching between the second character entities corresponding to each candidate reply text and the plurality of first character entities, and take the candidate reply text whose second character entities have the maximum matching degree as the target reply text.

In one embodiment, the reply module 405 is further configured to calculate, for each candidate reply text, a first number of times that the question in the conversation content and that candidate reply text appear together as a question-answer combination pair; calculate, for each candidate reply text, a second number of times that the candidate reply text appears in all historical question-answer combination pairs; for each candidate reply text, take the quotient of the corresponding first number of times and second number of times as the reasonable probability value of that candidate reply text; and take the candidate reply text with the maximum reasonable probability value as the target reply text.

Referring to fig. 5, in one embodiment, the apparatus further includes a model training module 406, configured to obtain texts belonging to different application fields, and pre-train the initial model through the texts belonging to the different application fields to obtain a text generation model suitable for the general field; acquire generalized inquiry records belonging to the medical field, and retrain the text generation model suitable for the general field through the generalized inquiry records to obtain a generalized text generation model in the medical field; and acquire the personalized inquiry records corresponding to the auxiliary inquiry object, and retrain the generalized text generation model through the personalized inquiry records to obtain the personalized text generation model corresponding to the auxiliary inquiry object.

The above artificial intelligence based inquiry session processing device acquires the conversation content sent by the target user object when participating in the inquiry session, and determines the user attribute information related to the target user object. Text analysis is then performed on the conversation content to obtain the corresponding conversation structural expression. The conversation content, the conversation structural expression and the user attribute information are processed by the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text meeting the matching condition is then screened out from the candidate reply texts by the reverse selector and is replied in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of the auxiliary inquiry object. Because the target reply text is generated from the conversation content, the conversation structural expression and the user attribute information and is then selected in reverse, it is closer to the user's needs and more accurate. Storing domain knowledge in the model in this way avoids a large number of explicit rules, reduces the difficulty of implementing and maintaining a knowledge base, makes it possible to respond to a large number of long-tail session requests, improves reply accuracy, and further improves customer satisfaction.

For specific limitations of the artificial intelligence based inquiry session processing device, reference may be made to the above limitations of the artificial intelligence based inquiry session processing method, which are not described herein again. The modules in the above-mentioned artificial intelligence based inquiry session processing device can be wholly or partially realized by software, hardware and the combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an artificial intelligence based interrogation session processing method.

Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program: acquiring session content sent by a target user object when participating in an inquiry session, and determining user attribute information related to the target user object; performing text analysis on the conversation content to obtain a corresponding conversation structural expression; determining a personalized text generation model corresponding to an auxiliary inquiry object participating in an inquiry session; the conversation content, the conversation structural expression and the user attribute information are used as input data together, the input data are processed through a personalized text generation model, and a plurality of candidate reply texts are output; and screening out target reply texts meeting the matching conditions from the candidate reply texts through a reverse selector, and replying the target reply texts in an inquiry session through an auxiliary inquiry object.

In one embodiment, the processor, when executing the computer program, further performs the steps of: inputting the conversation content into an intention prediction model, processing the conversation content through the intention prediction model, and outputting a target intention category corresponding to the conversation content; extracting keywords in the session content, and determining target attribute categories to which the keywords respectively belong; and constructing a conversation structural expression of the conversation content according to the target intention category, the keywords and the target attribute categories to which the keywords respectively belong.

In one embodiment, the processor, when executing the computer program, further performs the steps of: respectively coding the conversation content, the conversation structural expression and the user attribute information through a plurality of parallel coders in the personalized text generation model to obtain corresponding coding vector sequences; fusing the coding vector sequences output by each coder through a fusion module in the personalized text generation model to obtain fusion vector sequences; and decoding the fusion vector sequence through a decoder in the personalized text generation model, and outputting a plurality of candidate reply texts.

In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring the current attention weight vector corresponding to the fusion vector sequence; calculating to obtain a current content vector according to the attention weight vector and the fusion vector sequence; calculating to obtain a current decoding hidden layer vector according to the current content vector, the previous decoding hidden layer vector and a word vector of a candidate word determined at the previous time in sequence, and determining the current candidate word according to the current decoding hidden layer vector and the current content vector; and outputting a plurality of candidate reply texts based on the plurality of groups of candidate word sequences obtained by sequential decoding.

In one embodiment, the processor, when executing the computer program, further performs the steps of: determining, through the reverse selector, a plurality of first character entities in the conversation content and the second character entities present in each candidate reply text; and performing, through the reverse selector, similarity matching between the second character entities corresponding to each candidate reply text and the plurality of first character entities, and taking the candidate reply text whose second character entities have the maximum matching degree as the target reply text.

In one embodiment, the processor, when executing the computer program, further performs the steps of: calculating, for each candidate reply text, a first number of times that the question in the conversation content and that candidate reply text appear together as a question-answer combination pair; calculating, for each candidate reply text, a second number of times that the candidate reply text appears in all historical question-answer combination pairs; for each candidate reply text, taking the quotient of the corresponding first number of times and second number of times as the reasonable probability value of that candidate reply text; and taking the candidate reply text with the maximum reasonable probability value as the target reply text.

In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring texts belonging to different application fields, and pre-training an initial model through the texts belonging to the different application fields to obtain a text generation model suitable for the general field; acquiring generalized inquiry records belonging to the medical field, and retraining the text generation model suitable for the general field through the generalized inquiry records to obtain a generalized text generation model in the medical field; and acquiring the personalized inquiry records corresponding to the auxiliary inquiry object, and retraining the generalized text generation model through the personalized inquiry records to obtain the personalized text generation model corresponding to the auxiliary inquiry object.

The above computer device acquires the conversation content sent by the target user object when participating in the inquiry session, and determines the user attribute information related to the target user object. Text analysis is then performed on the conversation content to obtain the corresponding conversation structural expression. The conversation content, the conversation structural expression and the user attribute information are processed by the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text meeting the matching condition is then screened out from the candidate reply texts by the reverse selector and is replied in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of the auxiliary inquiry object. Because the target reply text is generated from the conversation content, the conversation structural expression and the user attribute information and is then selected in reverse, it is closer to the user's needs and more accurate. Storing domain knowledge in the model in this way avoids a large number of explicit rules, reduces the difficulty of implementing and maintaining a knowledge base, makes it possible to respond to a large number of long-tail session requests, improves reply accuracy, and further improves customer satisfaction.

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring session content sent by a target user object when participating in an inquiry session, and determining user attribute information related to the target user object; performing text analysis on the conversation content to obtain a corresponding conversation structural expression; determining a personalized text generation model corresponding to an auxiliary inquiry object participating in an inquiry session; the conversation content, the conversation structural expression and the user attribute information are used as input data together, the input data are processed through a personalized text generation model, and a plurality of candidate reply texts are output; and screening out target reply texts meeting the matching conditions from the candidate reply texts through a reverse selector, and replying the target reply texts in an inquiry session through an auxiliary inquiry object.

With the above storage medium, the conversation content sent by the target user object when participating in the inquiry session is acquired, and the user attribute information related to the target user object is determined. Text analysis is then performed on the conversation content to obtain the corresponding conversation structural expression. The conversation content, the conversation structural expression and the user attribute information are processed by the personalized text generation model corresponding to the auxiliary inquiry object participating in the inquiry session, and a plurality of candidate reply texts are output. A target reply text meeting the matching condition is then screened out from the candidate reply texts by the reverse selector and is replied in the inquiry session through the auxiliary inquiry object. Because the personalized text generation model corresponds to the auxiliary inquiry object, the output target reply text is closer to the style of the auxiliary inquiry object. Because the target reply text is generated from the conversation content, the conversation structural expression and the user attribute information and is then selected in reverse, it is closer to the user's needs and more accurate. Storing domain knowledge in the model in this way avoids a large number of explicit rules, reduces the difficulty of implementing and maintaining a knowledge base, makes it possible to respond to a large number of long-tail session requests, improves reply accuracy, and further improves customer satisfaction.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.

The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this specification.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. An artificial intelligence based inquiry session processing method, characterized in that the method comprises:

acquiring session content sent by a target user object when participating in an inquiry session, and determining user attribute information related to the target user object;

2. The method of claim 1, wherein the text analyzing the conversation content to obtain a corresponding conversation structural expression comprises:

3. The method of claim 1, wherein the using the conversation content, the conversation structured expression and the user attribute information as input data, and processing the input data through the personalized text generation model to output a plurality of candidate reply texts comprises:

4. The method of claim 3, wherein decoding, by a decoder in the personalized text generation model, the fused vector sequence to output a plurality of candidate reply texts comprises:

sequentially calculating a current decoding hidden layer vector according to the current content vector, the previous decoding hidden layer vector and the word vector of the candidate word determined at the previous time, and determining the current candidate word according to the current decoding hidden layer vector and the current content vector;

and outputting a plurality of candidate reply texts based on the plurality of groups of candidate word sequences obtained by sequential decoding.

5. The method of claim 1, wherein the step of screening out the target reply texts meeting the matching condition from the plurality of candidate reply texts through a reverse selector comprises:

6. The method of claim 1, wherein the step of screening out the target reply texts meeting the matching condition from the plurality of candidate reply texts through a reverse selector comprises:

7. The method according to any one of claims 1 to 6, wherein the training step of the personalized text generation model comprises:

acquiring a generalized inquiry record belonging to the medical field, and retraining the text generation model suitable for the general field through the generalized inquiry record to obtain a generalized text generation model in the medical field;

8. An artificial intelligence based inquiry session processing apparatus, the apparatus comprising:

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.