CN116738233A - Method, device, equipment and storage medium for training model online

Info

Publication number: CN116738233A
Application number: CN202310720687.5A
Authority: CN
Inventors: 姚磊; 应亦丰; 李娜; 张哲�
Original assignee: Advanced New Technologies Co Ltd
Current assignee: Advanced New Technologies Co Ltd
Priority date: 2019-07-05
Filing date: 2019-07-05
Publication date: 2023-09-12
Also published as: CN110457449B; CN110457449A

Abstract

The embodiments of the application provide a method, a device, equipment and a storage medium for training a model online, and relate to the technical field of artificial intelligence. The method comprises the following steps: receiving a session message sent by an initiator of a current session; labeling the session message based on the context information of the current session to obtain the labeling intention of the session message; performing intention recognition on the session message through an intention recognition model to obtain the recognition intention of the session message; and adjusting parameters of the intention recognition model based on the difference between the labeling intention and the recognition intention, so that the difference is less than a first predetermined threshold. The technical scheme of the embodiments can label the conversation content in combination with its context and feed back the prediction result of the model in real time, thereby optimizing the model in real time.

Description

Method, device, equipment and storage medium for training model online

This application is a divisional application of Chinese patent application No. 201910603432.4, filed on July 5, 2019 and entitled "Method, device and equipment for online training of a model".

Technical Field

The present application relates to the field of artificial intelligence, and in particular, to a method for online model training, an apparatus for online model training, and a computer readable storage medium.

Background

With the development of NLU (Natural Language Understanding) technology, man-machine conversation technology is being applied more and more widely.

The man-machine conversation model is an intelligent conversation model obtained based on NLU technology, and the model can communicate with the other party in place of a human. In one technical scheme, after a man-machine session has ended, the session messages of the multiple rounds of the session are acquired, the session messages of each round are labeled independently, and a man-machine conversation model is trained based on the labeled session messages. However, in this technical scheme, on the one hand, the context information is not considered, so it is difficult to label the session content accurately; on the other hand, the results of the man-machine conversation model cannot be fed back in real time, so the model cannot be optimized in real time according to the feedback.

Disclosure of Invention

The embodiment of the application aims to provide a method for online model training, a device for online model training, equipment for online model training and a computer readable storage medium, so as to solve the problems that conversation contents are difficult to accurately mark and real-time feedback and optimization cannot be performed on models.

In order to solve the technical problems, the embodiment of the application is realized as follows:

According to a first aspect of an embodiment of the present application, there is provided a method for training a model online, including: receiving a session message sent by an initiator of a current session; labeling the session message based on the context information of the current session to obtain the labeling intention of the session message; performing intention recognition on the session message through an intention recognition model to obtain a recognition intention of the session message; and adjusting parameters of the intention recognition model based on a difference between the labeling intention and the recognition intention, such that the difference is less than a first predetermined threshold.

In some embodiments of the present application, based on the above scheme, the method further includes: generating an original response message corresponding to the session message through a response generation model based on the recognition intention, wherein the recognition intention comprises an intention attribute; scoring the original response message based on the context information of the current session to obtain a scoring result of the original response message; and if the scoring result is smaller than a second preset threshold value, adjusting the original response message based on the context information of the current session so as to generate a valid response message.

In some embodiments of the present application, based on the above-mentioned scheme, the adjusting the original response message based on the context information of the current session to generate an effective response message includes: adjusting the original response message based on the context information of the current session to generate an intermediate response message; scoring the intermediate response message based on the context information of the current session to obtain a scoring result of the intermediate response message; and if the grading result of the intermediate response message is larger than the second preset threshold value, taking the intermediate response message as the effective response message.

In some embodiments of the present application, based on the above scheme, the method further includes: determining a difference between the original response message and the valid response message; and adjusting parameters of the response generation model based on the difference between the original response message and the valid response message.

In some embodiments of the present application, based on the above-described scheme, the determining a difference between the original response message and the valid response message includes: performing word segmentation on the original response message and the valid response message; generating a word vector of the original response message and a word vector of the valid response message based on the result of the word segmentation; and determining the distance between the word vector of the original response message and the word vector of the valid response message, and taking the distance as the difference between the original response message and the valid response message.

In some embodiments of the present application, based on the above solution, the labeling the session message based on the context information of the current session includes: performing word segmentation on the session message of the current session to obtain a plurality of words; performing lexical, syntactic and grammatical analysis on the plurality of words based on the context information of the current session; and labeling the session message based on the analysis result.

In some embodiments of the present application, based on the above-mentioned scheme, the performing, by using an intention recognition model, the intention recognition on the session message to obtain the recognition intention of the session message includes: performing topic analysis on the session message based on the context of the session message, and determining the topic of the session message; and carrying out intention analysis on the session message based on the theme and the intention recognition model, and determining the recognition intention of the session message.

In some embodiments of the present application, based on the above-described scheme, the adjusting the parameters of the intent recognition model based on the difference between the labeling intent and the recognition intent includes: performing word segmentation on the labeling intention and the recognition intention; determining a word vector corresponding to the labeling intention and a word vector corresponding to the recognition intention based on the result of the word segmentation; determining the distance between the word vector corresponding to the labeling intention and the word vector corresponding to the recognition intention; and adjusting parameters of the intention recognition model based on the distance.

In some embodiments of the present application, based on the above-mentioned scheme, the generating, by a response generation model, an original response message corresponding to the session message based on the recognition intention includes: determining a conversation type of the conversation message based on the identifying intent, the conversation type comprising: question-answering type, task type or chat type; determining a corresponding response generation model based on the session type; an original response message corresponding to the session message is generated based on the determined response generation model.

According to a second aspect of the present application, there is provided an apparatus for training a model online, comprising: the receiving module is used for receiving the session message sent by the initiator of the current session; the labeling module is used for labeling the session message based on the context information of the current session to obtain the labeling intention of the session message; the intention recognition module is used for carrying out intention recognition on the session message through an intention recognition model to obtain the recognition intention of the session message; and the first adjustment module is used for adjusting the parameters of the intention recognition model based on the difference between the labeling intention and the recognition intention so that the difference is smaller than a first preset threshold value.

In some embodiments of the present application, based on the above-mentioned scheme, the apparatus further includes: a response generation module for generating an original response message corresponding to the session message through a response generation model based on the recognition intention, wherein the recognition intention comprises an intention attribute; the scoring module is used for scoring the original response message based on the context information of the current session to obtain a scoring result of the original response message; and the response adjustment module is used for adjusting the original response message based on the context information of the current session to generate an effective response message if the scoring result is smaller than a second preset threshold value.

In some embodiments of the present application, based on the above-mentioned scheme, the response adjustment module includes: an intermediate response generating unit, configured to adjust the original response message based on context information of the current session, and generate an intermediate response message; the intermediate result generating unit is used for scoring the intermediate response message based on the context information of the current session to obtain a scoring result of the intermediate response message; and the effective response generation unit is used for taking the intermediate response message as the effective response message if the grading result of the intermediate response message is larger than the second preset threshold value.

In some embodiments of the present application, based on the above-mentioned scheme, the apparatus further includes: a first difference determining module for determining a difference between the original response message and the valid response message; and the second adjustment module is used for adjusting parameters of the response generation model based on the difference between the original response message and the effective response message.

In some embodiments of the present application, based on the above-described scheme, the first difference determining module includes: the first word segmentation processing unit is used for carrying out word segmentation processing on the original response message and the effective response message; a first word vector generating unit, configured to generate a word vector of the original response message and a word vector of the valid response message based on a result of word segmentation processing; and the distance determining unit is used for determining the distance between the word vector of the original response message and the word vector of the effective response message, and taking the distance as the difference between the original response message and the effective response message.

In some embodiments of the present application, based on the above solution, the labeling module includes: the second word segmentation processing unit is used for carrying out word segmentation processing on the session message of the current session to obtain a plurality of words; a syntax analysis unit for performing lexical, syntactic and grammatical analysis on the plurality of words based on the context information of the current session; and the labeling unit is used for labeling the session message based on the analysis result.

In some embodiments of the present application, based on the above-described scheme, the intention recognition module includes: the topic determination unit is used for performing topic analysis on the session message based on the context of the session message and determining the topic of the session message; and the intention analysis unit is used for carrying out intention analysis on the session message based on the theme and the intention recognition model and determining the recognition intention of the session message.

In some embodiments of the present application, based on the above-mentioned scheme, the first adjustment module includes: the third word segmentation processing unit is used for carrying out word segmentation processing on the labeling intention and the recognition intention; the second word vector generation unit is used for determining a word vector corresponding to the labeling intention and a word vector corresponding to the recognition intention based on the word segmentation processing result; a second distance determining unit, configured to determine a distance between a word vector corresponding to the labeling intention and a word vector corresponding to the recognition intention; and the adjusting unit is used for adjusting the parameters of the intention recognition model based on the distance.

In some embodiments of the present application, based on the above-mentioned scheme, the response generation module includes: a session type determining unit configured to determine a session type of the session message based on the recognition intention, the session type including: question-answering type, task type or chat type; a model determining unit, configured to determine a corresponding response generation model based on the session type; and the original response generation unit is used for generating an original response message corresponding to the session message based on the determined response generation model.

According to a third aspect of an embodiment of the present application, there is provided an apparatus for training a model online, comprising: a processor; and a memory configured to store computer-executable instructions that, when executed, cause the processor to implement the steps of the method for training a model online according to any embodiment of the first aspect above.

According to a fourth aspect of embodiments of the present application, there is provided a storage medium storing computer-executable instructions that, when executed, implement the steps of the method for training a model online according to any embodiment of the first aspect above.

According to the technical scheme provided by the embodiment of the application, on one hand, the conversation message is marked based on the context information of the current conversation, the conversation content can be marked by combining the context, and the marking accuracy is improved; on the other hand, the intention recognition model is used for carrying out intention recognition on the conversation message of the current conversation, and the parameters of the intention recognition model are adjusted based on the difference between the labeling intention of the label and the recognized recognition intention, so that the prediction result of the model can be fed back on line in real time, and the model can be optimized in real time.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 illustrates a flow diagram of a method of online training a model provided in accordance with some embodiments of the application;

FIG. 2 illustrates a flow diagram for generating a valid reply message provided in accordance with some embodiments of the application;

FIG. 3 illustrates a schematic diagram of a response generation model provided in accordance with some embodiments of the application as a decision tree model;

FIG. 4 shows a flow diagram of a method of online training a model provided in accordance with further embodiments of the present application;

FIG. 5 illustrates a schematic block diagram of an apparatus for online training of a model provided in accordance with some embodiments of the application;

FIG. 6 illustrates a schematic block diagram of an apparatus for online training of a model provided in accordance with some embodiments of the application;

FIG. 7 shows a schematic block diagram of an apparatus for online training of a model provided in accordance with further embodiments of the application; and

FIG. 8 illustrates a schematic block diagram of an apparatus for online training of a model provided in accordance with some embodiments of the application.

Detailed Description

In order to make the technical solution of the present application better understood by those skilled in the art, the technical solution of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, shall fall within the scope of the application.

FIG. 1 illustrates a flow diagram of a method of online training a model provided in accordance with some embodiments of the application. The method can be applied to terminal devices, including but not limited to mobile phones, tablet computers, smart speakers, smart watches and desktop computers, and can also be applied to other suitable devices, which is not particularly limited by the present application. The method includes steps S110 to S140, and the method for training a model online in an exemplary embodiment is described in detail below with reference to FIG. 1.

Referring to fig. 1, in step S110, a session message transmitted by an initiator of a current session is received.

In the example embodiment, in a conversation between two parties, the two parties each saying one sentence is defined as one round of conversation, and in the current nth round of conversation the initiator of the current session sends the nth-round session message. For example, in a shopping scenario, the initiator of the current session enters the session message "buy Nike shoes".

In step S120, the session message is labeled based on the context information of the current session, so as to obtain the labeling intention of the session message.

In an example embodiment, word segmentation is performed on the session message of the current session to obtain a plurality of words; lexical, syntactic and grammatical analysis is performed on the plurality of words based on the context information of the current session; and the session message is labeled based on the result of the analysis to obtain the labeling intention of the session message. For example, word segmentation of the session message "buy Nike shoes" of the current session yields the three words "buy", "Nike" and "shoes", and the context information of the current session, such as a shopping scenario, is obtained. Here "buy" is a verb and "Nike" and "shoes" are nouns, so the intention of the session message is shopping and the attributes of the intention include "shoes" and "Nike". The session message "buy Nike shoes" is therefore labeled with the intention "shopping" and the intention attributes "shoes" and "Nike".

Further, in an example embodiment, an intent template may be preset, where the intent template includes a mapping relationship between a predetermined vocabulary and a corresponding intent, and the intent in the session message is labeled based on the mapping relationship, for example, "buy" maps to "shopping", "play" maps to "listen to music", "train ticket" maps to "go out", "hotel" maps to "accommodation", and so on.
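The intent template described above can be sketched as a simple lookup table. This is a minimal illustration only: the four mappings are the examples from the text, while the names `INTENT_TEMPLATE` and `label_intent` and the substring-matching strategy are assumptions and are not specified by the application.

```python
# Hypothetical intent template: predetermined vocabulary -> intention.
# The four entries come from the examples in the text; the rest is assumed.
INTENT_TEMPLATE = {
    "buy": "shopping",
    "play": "listen to music",
    "train ticket": "go out",
    "hotel": "accommodation",
}


def label_intent(message, template=INTENT_TEMPLATE):
    """Return the intention mapped from the longest template phrase
    contained in the message, or None if no phrase matches."""
    text = message.lower()
    # Try longer phrases first so multi-word entries take precedence.
    for phrase in sorted(template, key=len, reverse=True):
        if phrase in text:
            return template[phrase]
    return None


print(label_intent("buy Nike shoes"))
```

Plain substring matching is used only to keep the sketch short; a real labeler would match on the segmented words and use the session context, as the embodiment describes.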

In the example embodiment, Word2Vec may be used to perform word segmentation and part-of-speech tagging on the session message, or other approaches such as GloVe or ELMo may be used, which is not particularly limited by the present application. In addition, in other embodiments, lexical, syntactic, grammatical and intention labeling may be performed on the session message manually to obtain the labeling intention and the intention attributes of the session message. Further, the labeled session message containing the labeling intention is stored in a corpus.

In step S130, the intention recognition is performed on the session message of the current session through the intention recognition model, so as to obtain the recognition intention of the session message.

In an exemplary embodiment, the intent recognition model is a classification model among machine learning models, such as an SVM (Support Vector Machine) model, a CNN (Convolutional Neural Network) model or an LSTM (Long Short-Term Memory) model, but it may also be another suitable classification model, which is not particularly limited by the present application.

Further, in an example embodiment, a corresponding feature vector is generated based on the word labeling result of the session message of the current session, the generated feature vector is input to the intention recognition model, and intention recognition is performed on the session message of the current session by the intention recognition model to obtain the recognition intention of the session message. For example, based on the word labeling results "buy", "shoes" and "Nike" of the session message "buy Nike shoes", the corresponding word vectors are generated as feature vectors and input to the intention recognition model, which outputs the recognition intention of the session message, namely "shopping", together with the attributes of the recognition intention, "shoes" and "Nike".

In step S140, parameters of the intent recognition model are adjusted based on the difference between the labeling intent and the recognition intent such that the difference is less than a first predetermined threshold.

In an example embodiment, the session message of the current session is a training sample, the labeling intention is the labeling result of the training sample, and the recognition intention is the result of the intent recognition model recognizing the training sample. The parameters of the intent recognition model are adjusted based on the difference between the recognition result and the labeling result of the training sample. This difference represents the gap between the predicted value and the true value, that is, the loss function: the smaller the value of the loss function, the smaller the gap between the predicted value and the true value and the more accurate the prediction of the model. When the value of the loss function is smaller than the predetermined threshold, the trained intent recognition model is obtained. The predetermined threshold can be determined according to the amount of sample data and the available computing resources.
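The idea of adjusting parameters on a single incoming sample until the loss drops below a threshold can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method of the application: a tiny linear softmax classifier and a cross-entropy loss are substituted for the unspecified intent recognition model and the word-vector distance, and the class name, learning rate, threshold value and feature vector are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class OnlineIntentClassifier:
    """Hypothetical linear softmax classifier updated one sample at a time."""

    def __init__(self, n_features, intents, lr=0.5):
        self.intents = list(intents)
        self.lr = lr
        self.weights = [[0.0] * n_features for _ in self.intents]

    def predict_proba(self, x):
        scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in self.weights]
        return softmax(scores)

    def predict(self, x):
        probs = self.predict_proba(x)
        return self.intents[probs.index(max(probs))]

    def loss(self, x, labeled_intent):
        # Cross-entropy between the prediction and the labeled intention.
        p = self.predict_proba(x)[self.intents.index(labeled_intent)]
        return -math.log(p)

    def update(self, x, labeled_intent, threshold=0.1, max_steps=1000):
        """Gradient steps on one sample until its loss is below the threshold."""
        target = self.intents.index(labeled_intent)
        for _ in range(max_steps):
            if self.loss(x, labeled_intent) < threshold:
                break
            probs = self.predict_proba(x)
            for k, w in enumerate(self.weights):
                grad = probs[k] - (1.0 if k == target else 0.0)
                for j, x_j in enumerate(x):
                    w[j] -= self.lr * grad * x_j
        return self.loss(x, labeled_intent)


model = OnlineIntentClassifier(n_features=3, intents=["shopping", "chat"])
features = [1.0, 1.0, 0.0]  # assumed feature vector for one session message
final_loss = model.update(features, "shopping")
print(model.predict(features), final_loss < 0.1)
```

The `max_steps` cap is a safeguard added for the sketch so that the loop terminates even when the threshold cannot be reached.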

Specifically, word segmentation is performed on the labeling intention and the recognition intention of the session message of the current session; the word vector corresponding to the labeling intention and the word vector corresponding to the recognition intention are determined based on the result of the word segmentation; the distance between the two word vectors is determined; and the parameters of the intention recognition model are adjusted based on the distance. When the distance is smaller than the predetermined threshold, the recognition result of the intention recognition model is sufficiently accurate and the training of the model has reached the expected target.

It should be noted that the distance between the word vectors may be a Hamming distance, a Euclidean distance or a cosine distance, but the distance in the exemplary embodiment of the present application is not limited thereto and may also be, for example, a Mahalanobis distance or a Manhattan distance.
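For reference, several of the distances named above can be computed as in the following sketch. The function names are chosen for illustration, and the vectors are assumed to be equal-length lists of numbers; the Mahalanobis distance is omitted because it additionally requires a covariance matrix estimated from data.

```python
import math


def hamming(u, v):
    """Number of positions at which the two vectors differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def euclidean(u, v):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine_distance(u, v):
    """One minus the cosine similarity; undefined for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


print(euclidean([0, 0], [3, 4]), manhattan([0, 0], [3, 4]))
```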

According to the method for online training the model in the example embodiment of fig. 1, on one hand, the conversation message is marked based on the context information of the current conversation, the conversation content can be marked in combination with the context, and the accuracy of marking is improved; on the other hand, the intention recognition model is used for carrying out intention recognition on the conversation message of the current conversation, and the parameters of the intention recognition model are adjusted based on the difference between the labeling intention of the label and the recognized recognition intention, so that the prediction result of the model can be fed back on line in real time, and the model can be optimized in real time.

Furthermore, to accurately identify the intention of a session message, in an example embodiment, topic analysis is performed on the session message based on the context of the session message to determine the topic of the session message; and intention analysis is performed on the session message based on the topic and the intention recognition model to determine the recognition intention of the session message. For example, if the contextual dialogue contains tourist attraction information, the topic of the session message is tourism, and intention analysis is performed on the session message based on this topic and the intention recognition model to determine the recognition intention of the session message.

Fig. 2 illustrates a flow diagram for generating a valid reply message provided in accordance with some embodiments of the application.

Referring to fig. 2, in step S210, an original response message corresponding to a session message of a current session is generated through a response generation model based on an identification intention of the session message.

In an example embodiment, the recognition intention and the intention attributes of the session message of the current session are obtained, and an original response message corresponding to the session message is generated based on them. For example, assume that the current session message is "buy Nike shoes", the recognition intention of the session message is "shopping", and the intention attributes are "shoes" and "Nike". A response message corresponding to the session message is generated through a response generation model based on the recognition intention and the intention attributes. The response generation model can be a decision tree model. Referring to FIG. 3, it is first determined that the intention of the user is shopping and that the type of commodity under the shopping intention is shoes; it is then determined whether parameter information of the commodity, such as brand and size, is complete. If the parameter information is complete, the corresponding commodity information is output for the user to select; if it is incomplete, a response message asking for the missing information is output. For example, the commodity "shoes" requires at least two pieces of parameter information, namely brand and size. After the user inputs "buy Nike shoes", the size of the shoes is still missing, so the corresponding response message "What size of shoes do you want?" is generated.
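The decision flow of this example can be sketched as a small slot-filling function. This is an illustration under assumptions: the dictionary layout, the prompt wording and the name `generate_response` are hypothetical, and only the shoes example with the brand and size parameters is taken from the text.

```python
# Hypothetical required parameters per commodity type (from the shoes example).
REQUIRED_ATTRIBUTES = {"shoes": ["brand", "size"]}

# Hypothetical prompts asking for a missing parameter.
PROMPTS = {
    "brand": "Which brand would you like?",
    "size": "What size of shoes do you want?",
}


def generate_response(intent, attributes):
    """Walk the decision branches: intention, commodity type, then parameters."""
    if intent != "shopping":
        return "Sorry, I can only help with shopping."
    product = attributes.get("product")
    if product not in REQUIRED_ATTRIBUTES:
        return "What would you like to buy?"
    # Ask for the first missing parameter, if any.
    for name in REQUIRED_ATTRIBUTES[product]:
        if name not in attributes:
            return PROMPTS[name]
    # All parameters are present: output the commodity information.
    return "Here are the {brand} {product} in size {size}.".format(**attributes)


print(generate_response("shopping", {"product": "shoes", "brand": "Nike"}))
```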

In step S220, the original response message is scored based on the context information of the current session, so as to obtain a scoring result of the original response message.

In an example embodiment, a real response message corresponding to the original response message is obtained from a session database based on the context information of the current session, and the original response message is scored based on the context information and the real response message; a large number of session messages are stored in the session database in advance. For example, the difference between the original response message and the real response message may be determined based on the context information, and the scoring result of the original response message may be determined based on that difference.

In other example embodiments, the scoring result of the original response message is obtained by manually scoring the original response message based on the context information of the current session, for example by judging, based on the context information, whether the original response message understands the context accurately, whether the response connects coherently with the context, whether it can guide the direction of the conversation, and whether it uses human-like language.

In step S230, if the scoring result of the original response message is smaller than a second predetermined threshold, the original response message is adjusted based on the context information of the current session to generate a valid response message.

In an example embodiment, if the scoring result of the original response message is smaller than the second predetermined threshold, the original response message is adjusted based on the context information of the current session to generate an intermediate response message; the intermediate response message is scored based on the context information of the current session to obtain a scoring result of the intermediate response message; and if the scoring result of the intermediate response message is larger than the second predetermined threshold, the intermediate response message is used as the valid response message. For example, based on the context information of the current session, a plurality of corresponding intermediate response messages are queried from the session database and scored in sequence, and if the scoring result of a certain intermediate response message is greater than the second predetermined threshold, that intermediate response message is used as the valid response message.
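The score-then-adjust flow can be sketched as follows. The application does not specify a scoring function, so this sketch assumes a simple token-overlap (Jaccard) score against a reference response; the function names, the candidate list and the threshold value are likewise hypothetical.

```python
def score(response, reference):
    """Assumed scoring function: Jaccard overlap of lowercase tokens, in [0, 1]."""
    a = set(response.lower().split())
    b = set(reference.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def choose_valid_response(original, candidates, reference, threshold):
    """Keep the original response unless it scores below the threshold;
    otherwise return the first candidate scoring above the threshold."""
    if score(original, reference) >= threshold:
        return original
    for candidate in candidates:
        if score(candidate, reference) > threshold:
            return candidate
    return None  # no candidate qualified as a valid response


reference = "what size of shoes do you want"
candidates = ["which color", "what size do you want"]
print(choose_valid_response("ok", candidates, reference, threshold=0.5))
```

Returning `None` when no candidate qualifies is a choice made for the sketch; the text does not say what happens in that case.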

In other embodiments, the original response message may also be manually adjusted based on the context information of the current session to generate an intermediate response message; the intermediate response message is scored based on the context information of the current session to obtain a scoring result of the intermediate response message; and if the scoring result of the intermediate response message is greater than a second predetermined threshold, the intermediate response message is used as the valid response message.

Further, in an example embodiment, a session type of the session message may be determined according to the recognition intention of the session message, the session type being question-and-answer type, task type, or chat type; a corresponding response generation model is determined based on the determined session type, and a response message corresponding to the session message is generated based on the determined response generation model. For example, if the session message of the user's current session is "buying a train ticket" and the recognition intention of the session message is "buying a ticket", the session type is determined to be task type, and the response generation model corresponding to a task-type session is determined to be a decision tree model. If the session message of the user's current session is "how good the tomorrow is" and the recognition intention of the session message is "question", the session type is determined to be question-and-answer type, and the response generation model corresponding to a question-and-answer session is determined to be a search model. If the session message of the user's current session is "bad me" and the recognition intention of the session message is "chat", the session type is determined to be chat type, and the response generation model corresponding to a chat-type session is determined to be a deep learning model.
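The mapping from recognition intention to session type to response generation model can be sketched as a pair of lookup tables. The names `SessionType`, `INTENT_TO_TYPE`, `TYPE_TO_MODEL`, and `select_response_model` are hypothetical, and the models are represented only by their names rather than by real implementations.

```python
from enum import Enum


class SessionType(Enum):
    QA = "question-and-answer"
    TASK = "task"
    CHAT = "chat"


# Illustrative tables built from the examples in the text; a real system
# would cover many more intentions.
INTENT_TO_TYPE = {
    "buying a ticket": SessionType.TASK,
    "question": SessionType.QA,
    "chat": SessionType.CHAT,
}

TYPE_TO_MODEL = {
    SessionType.TASK: "decision tree model",
    SessionType.QA: "search model",
    SessionType.CHAT: "deep learning model",
}


def select_response_model(recognized_intent):
    """Map a recognition intention to its session type and response generation model."""
    session_type = INTENT_TO_TYPE[recognized_intent]
    return session_type, TYPE_TO_MODEL[session_type]
```

Keeping the two mappings separate mirrors the two-step description: the session type is decided first, and the model is chosen from the session type alone.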

Further, in an example embodiment, after the valid response message is obtained, a difference between the original response message and the valid response message is determined, and parameters of the response generation model are adjusted based on that difference. For example, word segmentation is performed on the original response message and the valid response message; word vectors of the original response message and word vectors of the valid response message are generated based on the word segmentation result; the distance between the word vector of the original response message and the word vector of the valid response message is determined and taken as the difference between the two messages; and the parameters of the response generation model are adjusted based on the difference so that the difference is smaller than a preset threshold.
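A minimal sketch of the word-vector distance follows. It assumes whitespace word segmentation, a tiny hand-written embedding table (`EMBEDDINGS`), averaging of word vectors into one message vector, and Euclidean distance; none of these choices are specified by the application, and the function names are hypothetical.

```python
import math

# Hypothetical toy embedding table; a real system would use trained word vectors.
EMBEDDINGS = {
    "please": [1.0, 0.0],
    "wait": [0.0, 1.0],
    "hold": [0.0, 0.8],
}


def message_vector(message, embeddings=EMBEDDINGS):
    """Segment a message on whitespace and average the vectors of known words."""
    vectors = [embeddings[w] for w in message.lower().split() if w in embeddings]
    if not vectors:
        raise ValueError("message contains no known words")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def response_difference(original_response, valid_response):
    """Euclidean distance between the two message vectors, used as the difference."""
    return math.dist(message_vector(original_response), message_vector(valid_response))
```

The resulting distance could then serve as the training signal: parameters of the response generation model are adjusted until the distance falls below the preset threshold.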

FIG. 4 shows a flow diagram of a method of training a model online provided in accordance with further embodiments of the present application.

Referring to fig. 4, in step S410, the initiator of the session sends a session message. In a two-party conversation, one round of dialogue is defined as each of the two parties saying one sentence; in the current nth round of dialogue, the initiator of the current session sends the nth-round session message.

In step S420, word segmentation is performed on the session message of the current session to obtain a plurality of words; lexical, syntactic, and grammatical analysis is performed on the plurality of words based on the context information of the current session; the session message is labeled based on the analysis result to obtain the labeling intention and the intention attribute of the session message; and the labeling result is transmitted to the corpus.

In step S430, the intention recognition is performed on the session message of the current session based on the intention recognition model, and the recognition intention of the session message is obtained. For example, a corresponding feature vector is generated based on a word labeling result of a conversation message of a current conversation, the generated feature vector is input into an intention recognition model, and the intention recognition is performed on the conversation message of the current conversation based on the intention recognition model to obtain the recognition intention of the conversation message.

In step S440, the recognition intention and intention attribute of the session message of the current session are acquired, and the original response message corresponding to the session message is generated based on the recognition intention and intention attribute of the session message.

In step S450, the original response message is scored based on the context information of the current session to obtain a scoring result of the original response message. For example, it is judged based on the context information whether the original response message understands the context accurately, whether the response connects with the context, whether it can guide the direction of the conversation, whether it uses anthropomorphic language, and the like.

In step S460, the original response message is manually adjusted, e.g., modified or rewritten, based on the context information of the current session to generate an intermediate response message; the intermediate response message is scored based on the context information of the current session to obtain a scoring result of the intermediate response message; and if the scoring result of the intermediate response message is greater than a second predetermined threshold, the intermediate response message is used as the valid response message.

In step S470, a valid response message is returned to the counterpart.

FIG. 5 illustrates a schematic block diagram of an apparatus for online training of a model provided in accordance with some embodiments of the application.

Referring to fig. 5, the device for online training the model includes an online labeling module 510, an online training module 520, a feedback module 530, and a labeled corpus 540, where the online labeling module 510 is configured to label a session message and a response message; the online training module 520 is configured to identify an intention of the session message, generate a corresponding response based on the identification result, and train the model based on the labeled session message and the response message; the feedback module 530 is configured to perform feedback adjustment on the intention recognition module based on a difference between the labeling intention and the recognition intention, and perform feedback adjustment on the response generation module based on a difference between the original response message and the real response message.

The online labeling module 510 includes a semantic labeling unit 512, an intention labeling unit 514, and a response labeling unit 516. The semantic labeling unit 512 is configured to perform grammatical, lexical, and syntactic labeling on the session message. For example, word segmentation is performed on the session message "buy Nike shoes" of the current session to obtain three words, "buy", "Nike", and "shoes", where "buy" is a verb and "Nike" and "shoes" are nouns. The intention labeling unit 514 is configured to label the intention of the session message based on the context information of the current session. For example, if the context information of the session message "buy Nike shoes" of the current session is a shopping scene, the intention of the user is determined to be shopping based on the verb "buy" and the shopping scene. The response labeling unit 516 is configured to label the response message, for example, by determining based on the context information whether the response message understands the context accurately, whether the response connects with the context, whether it can guide the direction of the conversation, whether it uses anthropomorphic language, and the like.

The online training module 520 includes an intention recognition unit 522 and a response generation unit 524. The intention recognition unit 522 is configured to recognize the intention of the session message of the current session by using the intention recognition model and to generate a corresponding recognition intention. The intention recognition model is a classification model among machine learning models, for example, an SVM (Support Vector Machine) model, a CNN (Convolutional Neural Network) model, or an LSTM (Long Short-Term Memory) model; it may also be another suitable classification model, which is not particularly limited in the present application. The response generation unit 524 is configured to generate a corresponding response message based on the recognition intention generated by the intention recognition unit 522 and the context information.

The feedback module 530 includes a model recognition efficiency determining unit 532 and a model response efficiency determining unit 534. The model recognition efficiency determining unit 532 is configured to determine a difference between the recognition intention generated by the intention recognition unit 522 and the labeling intention generated by the intention labeling unit 514, and to feed the difference back to the intention recognition unit 522 to adjust the intention recognition model. The model response efficiency determining unit 534 is configured to determine a difference between the response message generated by the response generation unit 524 and the labeled message generated by the response labeling unit 516, and to feed the difference back to the response generation unit 524 to adjust the response generation model.

In an example embodiment of the present application, an apparatus for training a model online is also provided. Referring to fig. 6, the apparatus 600 includes: a receiving module 610, a labeling module 620, an intent identifying module 630, and a first adjusting module 640. The receiving module 610 is configured to receive a session message sent by an initiator of a current session; the labeling module 620 is configured to label the session message based on the context information of the current session, so as to obtain a labeling intention of the session message; the intention recognition module 630 is configured to perform intention recognition on the session message through an intention recognition model, so as to obtain a recognition intention of the session message; the first adjustment module 640 is configured to adjust parameters of the intent recognition model based on a difference between the labeling intent and the recognition intent such that the difference is less than a first predetermined threshold.
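The role of the first adjustment module 640 can be illustrated with a toy online update loop. This is a sketch under stated assumptions, not the method of the application: the intention recognition model is replaced by a small linear softmax classifier over bag-of-words features, and the "difference" between labeling intention and recognition intention is taken to be the cross-entropy loss rather than a word-vector distance. The label set, vocabulary, and all names are hypothetical.

```python
import math

INTENTS = ["chat", "shopping"]     # hypothetical label set
VOCAB = ["buy", "shoes", "hello"]  # hypothetical vocabulary


def featurize(message):
    """Bag-of-words features over the toy vocabulary."""
    words = message.lower().split()
    return [1.0 if term in words else 0.0 for term in VOCAB]


class IntentModel:
    """Linear softmax classifier standing in for the intention recognition model."""

    def __init__(self):
        self.weights = [[0.0] * len(VOCAB) for _ in INTENTS]

    def probabilities(self, x):
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def predict(self, x):
        probs = self.probabilities(x)
        return INTENTS[probs.index(max(probs))]


def adjust_online(model, message, labeled_intent,
                  first_threshold=0.1, lr=0.5, max_steps=1000):
    """Take gradient steps on one labeled message until the difference is small."""
    x = featurize(message)
    target = INTENTS.index(labeled_intent)
    for _ in range(max_steps):
        probs = model.probabilities(x)
        difference = -math.log(probs[target])  # cross-entropy as the "difference"
        if difference < first_threshold:
            break
        for k, row in enumerate(model.weights):
            grad = probs[k] - (1.0 if k == target else 0.0)
            for j, xj in enumerate(x):
                row[j] -= lr * grad * xj
    return -math.log(model.probabilities(x)[target])
```

Calling `adjust_online(IntentModel(), "buy shoes", "shopping")` updates the weights on that single labeled message until the loss drops below the first threshold, which mirrors the receive, label, recognize, adjust sequence of modules 610 to 640.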

In some embodiments of the present application, based on the above scheme, the apparatus 600 further includes: a response generation module for generating an original response message corresponding to the session message through a response generation model based on the recognition intention, wherein the recognition intention comprises an intention attribute; the scoring module is used for scoring the original response message based on the context information of the current session to obtain a scoring result of the original response message; and the response adjustment module is used for adjusting the original response message based on the context information of the current session to generate an effective response message if the scoring result is smaller than a second preset threshold value.

In some embodiments of the present application, based on the above-described scheme, the intention recognition module 630 includes: a topic determination unit 710, configured to perform topic analysis on the session message based on the context of the session message and determine the topic of the session message; and an intention analysis unit 720, configured to perform intention analysis on the session message based on the topic and the intention recognition model and determine the recognition intention of the session message.
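The two-stage flow of units 710 and 720 can be sketched with keyword tables. The tables `TOPIC_KEYWORDS` and `INTENT_RULES` and both function names are hypothetical stand-ins; in particular, the per-topic rules only imitate what a trained intention recognition model would do.

```python
# Hypothetical keyword lists per topic.
TOPIC_KEYWORDS = {
    "shopping": {"buy", "price", "shoes"},
    "travel": {"train", "ticket", "flight"},
}

# Hypothetical per-topic rules standing in for the intention recognition model.
INTENT_RULES = {
    "shopping": {"buy": "purchase goods"},
    "travel": {"buy": "buy a ticket"},
}


def determine_topic(context, message):
    """Pick the topic whose keywords overlap most with the context and message."""
    words = set(" ".join(list(context) + [message]).lower().split())
    best = max(TOPIC_KEYWORDS, key=lambda t: len(words & TOPIC_KEYWORDS[t]))
    return best if words & TOPIC_KEYWORDS[best] else None


def recognize_intent(context, message):
    """Determine the topic first, then analyze the intention within that topic."""
    topic = determine_topic(context, message)
    if topic is None:
        return None, None
    for word in message.lower().split():
        intent = INTENT_RULES[topic].get(word)
        if intent is not None:
            return topic, intent
    return topic, None
```

The sketch shows why the topic matters: the same word "buy" resolves to different intentions depending on whether the surrounding context is about travel or shopping.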

According to the device for online training the model in the example embodiment of fig. 6, on the one hand, the session message is labeled based on the context information of the current session, so that the session content can be labeled in combination with the context, which improves labeling accuracy; on the other hand, intention recognition is performed on the session message of the current session by the intention recognition model, and the parameters of the intention recognition model are adjusted based on the difference between the labeling intention and the recognition intention, so that the prediction result of the model can be fed back online in real time and the model can be optimized in real time.

The device for training the model online provided by the embodiment of the application can realize each process in the embodiment of the method and achieve the same functions and effects, and is not repeated here.

Further, the embodiment of the application also provides equipment for training the model online, as shown in fig. 8.

Devices for training the model online may vary widely in configuration or performance, and may include one or more processors 801 and a memory 802, where the memory 802 may store one or more applications or data. The memory 802 may be transient storage or persistent storage. The application program stored in the memory 802 may include one or more modules (not shown in the figures), each of which may include a series of computer-executable instructions for the device for training the model online. Still further, the processor 801 may be configured to communicate with the memory 802 and to execute, on the device for training the model online, the series of computer-executable instructions in the memory 802. The device for training the model online may also include one or more power supplies 803, one or more wired or wireless network interfaces 804, one or more input/output interfaces 805, one or more keyboards 806, and the like.

In a particular embodiment, an apparatus for online training a model includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer-executable instructions in the apparatus for online training a model, and configured to be executed by one or more processors, the one or more programs comprising computer-executable instructions for: receiving a session message sent by an initiator of a current session; labeling the session message based on the context information of the current session to obtain the labeling intention of the session message; performing intention recognition on the session message through an intention recognition model to obtain a recognition intention of the session message; parameters of the intent recognition model are adjusted based on a difference between the labeling intent and the recognition intent such that the difference is less than a first predetermined threshold.

Optionally, the computer executable instructions, when executed, further comprise: generating an original response message corresponding to the session message through a response generation model based on the recognition intention, wherein the recognition intention comprises an intention attribute; scoring the original response message based on the context information of the current session to obtain a scoring result of the original response message; and if the scoring result is smaller than a second preset threshold value, adjusting the original response message based on the context information of the current session so as to generate a valid response message.

Optionally, the computer executable instructions, when executed, adjust the original response message based on the context information of the current session to generate a valid response message, comprising: adjusting the original response message based on the context information of the current session to generate an intermediate response message; scoring the intermediate response message based on the context information of the current session to obtain a scoring result of the intermediate response message; and if the scoring result of the intermediate response message is greater than the second preset threshold, taking the intermediate response message as the valid response message.

Optionally, the computer executable instructions, when executed, further comprise: determining a difference between the original response message and the valid response message; parameters of the response generation model are adjusted based on differences between the original response message and the valid response message.

Optionally, the computer executable instructions, when executed, determine a difference between the original response message and the valid response message, comprising: performing word segmentation on the original response message and the valid response message; generating word vectors of the original response message and word vectors of the valid response message based on the word segmentation result; and determining the distance between the word vector of the original response message and the word vector of the valid response message, and taking the distance as the difference between the original response message and the valid response message.

Optionally, the computer executable instructions, when executed, annotate the session message based on the context information of the current session, including: word segmentation processing is carried out on the session message of the current session to obtain a plurality of words; performing lexical, syntactic and grammatical analysis on the plurality of words based on the context information of the current session; and labeling the session message based on the analysis result.

Optionally, the computer executable instructions, when executed, perform intent recognition on the session message by an intent recognition model to obtain a recognition intent for the session message, including: performing topic analysis on the session message based on the context of the session message, and determining the topic of the session message; and performing intention analysis on the session message based on the topic and the intention recognition model, and determining the recognition intention of the session message.

Optionally, the computer-executable instructions, when executed, adjust parameters of the intent recognition model based on a difference between the labeling intent and the recognition intent, comprising: word segmentation processing is carried out on the labeling intention and the recognition intention; determining word vectors corresponding to the labeling intents and word vectors corresponding to the recognition intents based on the word segmentation processing results; determining the distance between the word vector corresponding to the labeling intention and the word vector corresponding to the identification intention; and adjusting parameters of the intention recognition model based on the distance.

Optionally, the computer executable instructions, when executed, generate an original response message corresponding to the session message by a response generation model based on the identified intent, comprising: determining a conversation type of the conversation message based on the identifying intent, the conversation type comprising: question-answering type, task type or chat type; determining a corresponding response generation model based on the session type; an original response message corresponding to the session message is generated based on the determined response generation model.

In addition, the embodiment of the present application further provides a storage medium configured to store computer executable instructions. In a specific embodiment, the storage medium may be a USB flash drive, an optical disc, a hard disk, or the like, and the computer executable instructions stored in the storage medium, when executed by a processor, can implement the following procedures: receiving a session message sent by an initiator of a current session; labeling the session message based on the context information of the current session to obtain the labeling intention of the session message; performing intention recognition on the session message through an intention recognition model to obtain a recognition intention of the session message; and adjusting parameters of the intent recognition model based on a difference between the labeling intent and the recognition intent such that the difference is less than a first predetermined threshold.

Optionally, the storage medium stores computer executable instructions that, when executed by the processor, further comprise: generating an original response message corresponding to the session message through a response generation model based on the recognition intention, wherein the recognition intention comprises an intention attribute; scoring the original response message based on the context information of the current session to obtain a scoring result of the original response message; and if the scoring result is smaller than a second preset threshold value, adjusting the original response message based on the context information of the current session so as to generate a valid response message.

Optionally, the computer executable instructions stored on the storage medium, when executed by the processor, adjust the original response message based on the context information of the current session to generate a valid response message, comprising: adjusting the original response message based on the context information of the current session to generate an intermediate response message; scoring the intermediate response message based on the context information of the current session to obtain a scoring result of the intermediate response message; and if the scoring result of the intermediate response message is greater than the second preset threshold, taking the intermediate response message as the valid response message.

Optionally, the storage medium stores computer executable instructions that, when executed by the processor, further comprise: determining a difference between the original response message and the valid response message; parameters of the response generation model are adjusted based on differences between the original response message and the valid response message.

Optionally, the storage medium stores computer executable instructions that, when executed by the processor, determine a difference between the original response message and the valid response message, comprising: performing word segmentation on the original response message and the valid response message; generating word vectors of the original response message and word vectors of the valid response message based on the word segmentation result; and determining the distance between the word vector of the original response message and the word vector of the valid response message, and taking the distance as the difference between the original response message and the valid response message.

Optionally, the computer executable instructions stored on the storage medium, when executed by the processor, annotate the session message based on the context information of the current session, including: word segmentation processing is carried out on the session message of the current session to obtain a plurality of words; performing lexical, syntactic and grammatical analysis on the plurality of words based on the context information of the current session; and labeling the session message based on the analysis result.

Optionally, the computer executable instructions stored on the storage medium, when executed by the processor, perform intent recognition on the session message through an intent recognition model to obtain a recognition intent of the session message, including: performing topic analysis on the session message based on the context of the session message, and determining the topic of the session message; and performing intention analysis on the session message based on the topic and the intention recognition model, and determining the recognition intention of the session message.

Optionally, the computer executable instructions stored on the storage medium, when executed by the processor, adjust parameters of the intent recognition model based on a difference between the labeling intent and the recognition intent, comprising: word segmentation processing is carried out on the labeling intention and the recognition intention; determining word vectors corresponding to the labeling intents and word vectors corresponding to the recognition intents based on the word segmentation processing results; determining the distance between the word vector corresponding to the labeling intention and the word vector corresponding to the identification intention; and adjusting parameters of the intention recognition model based on the distance.

Optionally, the computer executable instructions stored on the storage medium, when executed by the processor, generate an original response message corresponding to the session message by a response generation model based on the identified intent, comprising: determining a conversation type of the conversation message based on the identifying intent, the conversation type comprising: question-answering type, task type or chat type; determining a corresponding response generation model based on the session type; an original response message corresponding to the session message is generated based on the determined response generation model.

The computer readable storage medium provided by the embodiments of the present application can implement each process in the foregoing method embodiments and achieve the same functions and effects, which are not repeated here.

In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs a digital system to "integrate" it onto a PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by merely programming the method flow slightly in the hardware description languages described above and programming it into an integrated circuit.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer readable program code, it is entirely possible to implement the same functionality by logically programming the method steps so that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Alternatively, the means for performing the various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.

The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in the same piece or pieces of software and/or hardware when implementing the present application.

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.

The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims

1. A method of training a model online, comprising:

carrying out intention recognition on a session message of a current session through an intention recognition model to obtain a recognition intention of the session message;

adjusting parameters of the intention recognition model based on a difference between the recognition intention and a labeling intention of the session message such that the difference is less than a first predetermined threshold;

generating, through a response generation model, an original response message corresponding to the session message based on the recognition intention;

generating a valid response message based on the original response message and context information of the current session; and

adjusting parameters of the response generation model based on a difference between the original response message and the valid response message such that the difference is less than a third predetermined threshold.

2. The method as recited in claim 1, further comprising:

receiving the session message sent by an initiator of the current session.

3. The method as recited in claim 1, further comprising:

labeling the session message of the current session based on the context information of the current session to obtain the labeling intention of the session message;

or,

labeling the session message of the current session based on a preset intention template to obtain the labeling intention of the session message.

4. The method of claim 1, wherein generating a valid response message based on the original response message and the context information comprises:

scoring the original response message based on the context information of the current session to obtain a scoring result of the original response message;

and if the scoring result is less than a second predetermined threshold, adjusting the original response message based on the context information of the current session to generate the valid response message.

5. The method of claim 4, wherein adjusting the original response message based on the context information of the current session to generate a valid response message comprises:

adjusting the original response message based on the context information of the current session to generate an intermediate response message;

scoring the intermediate response message based on the context information of the current session to obtain a scoring result of the intermediate response message;

and if the scoring result of the intermediate response message is greater than the second predetermined threshold, taking the intermediate response message as the valid response message.
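
By way of non-limiting illustration only, the scoring and adjustment flow recited in claims 4 and 5 may be sketched as follows. The function names `score_response`, `adjust_response` and `make_valid_response`, the keyword-coverage scoring rule, the append-a-keyword adjustment, the threshold value of 0.6 and the round limit are all assumptions introduced for this sketch; the application does not specify how scoring or adjustment is performed. The claims also leave open what happens when a score equals the threshold; the sketch treats that case as passing.

```python
def score_response(response, context_keywords):
    """Toy score: fraction of (lowercase) context keywords found in the response."""
    if not context_keywords:
        return 1.0
    words = set(response.lower().split())
    hits = sum(1 for keyword in context_keywords if keyword in words)
    return hits / len(context_keywords)


def adjust_response(response, context_keywords):
    """Toy adjustment: append the first context keyword the response lacks."""
    words = set(response.lower().split())
    for keyword in context_keywords:
        if keyword not in words:
            return f"{response} {keyword}"
    return response


def make_valid_response(original, context_keywords, threshold=0.6, max_rounds=10):
    """Keep the original response if it scores well enough, otherwise adjust it."""
    # Claim 4: adjustment is triggered only when the score is below the threshold.
    if score_response(original, context_keywords) >= threshold:
        return original
    candidate = original
    for _ in range(max_rounds):
        # Claim 5: produce an intermediate response and score it again.
        candidate = adjust_response(candidate, context_keywords)
        if score_response(candidate, context_keywords) > threshold:
            break  # the intermediate response is taken as the valid response
    return candidate


valid = make_valid_response("please check", ["refund", "order", "status"])
```

In this sketch the loop stops at the first intermediate response whose score exceeds the threshold, and the round limit merely guards against a response that can never reach it.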

6. The method of claim 4, further comprising, prior to adjusting parameters of the response generation model based on a difference between the original response message and the valid response message:

a difference between the original response message and the valid response message is determined.

7. The method of claim 6, wherein the determining the difference between the original response message and the valid response message comprises:

performing word segmentation on the original response message and the valid response message;

generating a word vector of the original response message and a word vector of the valid response message based on the word segmentation result;

determining a distance between the word vector of the original response message and the word vector of the valid response message, and taking the distance as the difference between the original response message and the valid response message.
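
By way of non-limiting illustration only, the three steps of claim 7 may be sketched as follows. Whitespace splitting stands in for word segmentation (Chinese text would require a dedicated segmenter), bag-of-words count vectors stand in for word vectors, and cosine distance is chosen as the distance; all three choices, as well as the function names, are assumptions of this sketch and are not specified by the application.

```python
import math
from collections import Counter


def segment(text):
    """Stand-in word segmentation: lowercase and split on whitespace."""
    return text.lower().split()


def to_vectors(tokens_a, tokens_b):
    """Build count vectors for both messages over their shared vocabulary."""
    vocabulary = sorted(set(tokens_a) | set(tokens_b))
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    return ([counts_a[w] for w in vocabulary],
            [counts_b[w] for w in vocabulary])


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def message_difference(original, valid):
    """Difference between two response messages, taken as a vector distance."""
    vec_original, vec_valid = to_vectors(segment(original), segment(valid))
    return cosine_distance(vec_original, vec_valid)


difference = message_difference("your order has shipped",
                                "your order has shipped today")
```

A distance near zero indicates that the original response already resembles the valid response, while a distance of one indicates no shared words under this toy representation.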

8. The method of claim 3, wherein the labeling the session message of the current session based on the context information of the current session comprises:

performing word segmentation on the session message of the current session to obtain a plurality of words;

performing lexical, syntactic and grammatical analysis on the plurality of words based on the context information of the current session;

and labeling the session message based on the analysis result.

9. The method according to claim 1, wherein the performing intention recognition on the session message of the current session through the intention recognition model to obtain the recognition intention of the session message comprises:

performing topic analysis on the session message based on the context of the session message of the current session, and determining the topic of the session message;

and performing intention analysis on the session message based on the topic and the intention recognition model to determine the recognition intention of the session message.
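
By way of non-limiting illustration only, the two-stage recognition of claim 9 (topic first, then intention within that topic) may be sketched as follows. The keyword tables `TOPIC_KEYWORDS` and `INTENTS_BY_TOPIC`, the topic and intention names, and the overlap-counting rule are hypothetical; in particular, the keyword lookup merely stands in for the intention recognition model, whose form the application does not fix.

```python
# Hypothetical keyword tables used only for this sketch.
TOPIC_KEYWORDS = {
    "payment": {"pay", "refund", "bill"},
    "travel": {"ticket", "flight", "hotel"},
}
INTENTS_BY_TOPIC = {
    "payment": {
        "request_refund": {"refund", "return"},
        "check_bill": {"bill", "balance"},
    },
    "travel": {
        "book_ticket": {"book", "ticket"},
        "cancel_ticket": {"cancel"},
    },
}


def determine_topic(message, context):
    """Topic analysis: use the message together with its preceding context."""
    words = set((" ".join(context) + " " + message).lower().split())
    return max(TOPIC_KEYWORDS, key=lambda t: len(words & TOPIC_KEYWORDS[t]))


def recognize_intent(message, context):
    """Intention analysis restricted to the intentions of the chosen topic."""
    topic = determine_topic(message, context)
    words = set(message.lower().split())
    candidates = INTENTS_BY_TOPIC[topic]
    intent = max(candidates, key=lambda i: len(words & candidates[i]))
    return topic, intent


topic, intent = recognize_intent("i want it back as a refund",
                                 ["i paid the bill twice"])
```

Restricting the second stage to the intentions of the selected topic is the point of the sketch: the context narrows the candidate set before the message itself is analyzed.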

10. The method of claim 1, wherein the adjusting parameters of the intention recognition model based on a difference between the recognition intention and a labeling intention of the session message comprises:

performing word segmentation on the labeling intention and the recognition intention of the session message;

determining a word vector corresponding to the labeling intention and a word vector corresponding to the recognition intention based on the word segmentation result;

determining a distance between the word vector corresponding to the labeling intention and the word vector corresponding to the recognition intention;

and adjusting parameters of the intention recognition model based on the distance.
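
By way of non-limiting illustration only, adjusting model parameters based on a vector distance until the distance falls below a threshold may be sketched as follows. The sketch assumes that the labeling intention and the recognition intention have already been converted to numeric vectors, and it substitutes a toy linear model trained by gradient descent on the squared Euclidean distance; the model form, the learning rate, the threshold value of 0.05 and all function names are assumptions of this sketch and are not taken from the application.

```python
import math


def predict(weights, features):
    """Toy linear model: map message features to a recognized-intention vector."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def adjust_until_close(weights, features, labeled_vec,
                       threshold=0.05, learning_rate=0.1, max_steps=1000):
    """Update weights in place until the distance drops below the threshold."""
    for step in range(max_steps):
        predicted = predict(weights, features)
        distance = euclidean(predicted, labeled_vec)
        if distance < threshold:
            return distance, step
        # Gradient of 0.5 * ||W x - y||^2 with respect to W[i][j]
        # is (predicted[i] - y[i]) * x[j].
        for i, row in enumerate(weights):
            error = predicted[i] - labeled_vec[i]
            for j in range(len(row)):
                row[j] -= learning_rate * error * features[j]
    return euclidean(predict(weights, features), labeled_vec), max_steps


weights = [[0.0, 0.0], [0.0, 0.0]]
final_distance, steps_used = adjust_until_close(
    weights, features=[1.0, 0.5], labeled_vec=[1.0, 0.0])
```

The stopping test mirrors the requirement that the difference end up below the first predetermined threshold; the step limit is a safeguard added for the sketch.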

11. The method of claim 4, wherein the generating, by a response generation model, an original response message corresponding to the session message based on the recognition intent, comprises:

determining a session type of the session message based on the recognition intention, the session type comprising: a question-answering type, a task type, or a chat type;

determining a corresponding response generation model based on the session type;

and generating the original response message corresponding to the session message based on the determined response generation model.
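
By way of non-limiting illustration only, selecting a response generation model according to the session type may be sketched as follows. The intention names in `INTENT_TO_TYPE`, the default of treating unknown intentions as chat, and the three placeholder functions standing in for response generation models are hypothetical and serve only to show the dispatch.

```python
from enum import Enum


class SessionType(Enum):
    QUESTION_ANSWERING = "question_answering"
    TASK = "task"
    CHAT = "chat"


# Hypothetical mapping from recognized intentions to session types.
INTENT_TO_TYPE = {
    "ask_balance": SessionType.QUESTION_ANSWERING,
    "book_ticket": SessionType.TASK,
}


def determine_session_type(recognized_intent):
    """Unknown intentions are treated as chat in this sketch."""
    return INTENT_TO_TYPE.get(recognized_intent, SessionType.CHAT)


# Placeholder functions standing in for per-type response generation models.
def qa_model(message):
    return f"[qa] answer to: {message}"


def task_model(message):
    return f"[task] next step for: {message}"


def chat_model(message):
    return f"[chat] reply to: {message}"


RESPONSE_MODELS = {
    SessionType.QUESTION_ANSWERING: qa_model,
    SessionType.TASK: task_model,
    SessionType.CHAT: chat_model,
}


def generate_original_response(message, recognized_intent):
    """Pick the model for the session type, then generate the response."""
    session_type = determine_session_type(recognized_intent)
    model = RESPONSE_MODELS[session_type]
    return model(message)


reply = generate_original_response("I need a train ticket", "book_ticket")
```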

12. An apparatus for training a model online, comprising:

an intention recognition module, configured to perform intention recognition on a session message of a current session through an intention recognition model to obtain a recognition intention of the session message;

a first adjustment module, configured to adjust parameters of the intention recognition model based on a difference between the recognition intention and a labeling intention of the session message such that the difference is less than a first predetermined threshold;

an original response generation module, configured to generate, through a response generation model, an original response message corresponding to the session message based on the recognition intention;

an effective response generation module, configured to generate an effective response message based on the original response message and context information of the current session;

and a second adjustment module, configured to adjust parameters of the response generation model based on a difference between the original response message and the effective response message such that the difference is less than a third predetermined threshold.

13. The apparatus as recited in claim 12, further comprising:

a receiving module, configured to receive the session message sent by an initiator of the current session.

14. The apparatus as recited in claim 12, further comprising:

a labeling module, configured to label the session message of the current session based on the context information of the current session to obtain the labeling intention of the session message;

or,

to label the session message of the current session based on a preset intention template to obtain the labeling intention of the session message.

15. The apparatus of claim 12, wherein the effective response generation module comprises:

a scoring module, configured to score the original response message based on the context information of the current session to obtain a scoring result of the original response message;

and a response adjustment module, configured to adjust the original response message based on the context information of the current session to generate the effective response message if the scoring result is less than a second predetermined threshold.

16. The apparatus of claim 15, wherein the response adjustment module comprises:

an intermediate response generating unit, configured to adjust the original response message based on the context information of the current session to generate an intermediate response message;

an intermediate result generating unit, configured to score the intermediate response message based on the context information of the current session to obtain a scoring result of the intermediate response message;

and an effective response generation unit, configured to take the intermediate response message as the effective response message if the scoring result of the intermediate response message is greater than the second predetermined threshold.

17. The apparatus of claim 15, wherein the apparatus further comprises:

a first difference determining module, configured to determine a difference between the original response message and the effective response message.

18. The apparatus of claim 17, wherein the first difference determining module comprises:

a first word segmentation processing unit, configured to perform word segmentation on the original response message and the effective response message;

a first word vector generating unit, configured to generate a word vector of the original response message and a word vector of the effective response message based on the word segmentation result;

and a distance determining unit, configured to determine a distance between the word vector of the original response message and the word vector of the effective response message, and to take the distance as the difference between the original response message and the effective response message.

19. The apparatus of claim 14, wherein the labeling module comprises:

a second word segmentation processing unit, configured to perform word segmentation on the session message of the current session to obtain a plurality of words;

a syntax analysis unit, configured to perform lexical, syntactic and grammatical analysis on the plurality of words based on the context information of the current session;

and a labeling unit, configured to label the session message based on the analysis result.

20. The apparatus of claim 12, wherein the intention recognition module comprises:

a topic determination unit, configured to perform topic analysis on the session message based on the context of the session message of the current session to determine a topic of the session message;

and an intention analysis unit, configured to perform intention analysis on the session message based on the topic and the intention recognition model to determine the recognition intention of the session message.

21. The apparatus of claim 12, wherein the first adjustment module comprises:

a third word segmentation processing unit, configured to perform word segmentation on the labeling intention and the recognition intention of the session message;

a second word vector generation unit, configured to determine a word vector corresponding to the labeling intention and a word vector corresponding to the recognition intention based on the word segmentation result;

a second distance determining unit, configured to determine a distance between the word vector corresponding to the labeling intention and the word vector corresponding to the recognition intention;

and an adjusting unit, configured to adjust parameters of the intention recognition model based on the distance.

22. The apparatus of claim 15, wherein the original response generation module comprises:

a session type determining unit, configured to determine a session type of the session message based on the recognition intention, the session type comprising: a question-answering type, a task type, or a chat type;

a model determining unit, configured to determine a corresponding response generation model based on the session type;

and an original response generation unit, configured to generate the original response message corresponding to the session message based on the determined response generation model.

23. An apparatus for training a model online, comprising: a processor; and a memory configured to store computer-executable instructions that, when executed, cause the processor to implement the method of training a model online according to any one of claims 1 to 11.

24. A storage medium storing computer-executable instructions which, when executed, implement the method of training a model online according to any one of claims 1 to 11.