CN112528679B - Method and device for training intention understanding model, and method and device for intention understanding


Info

Publication number
CN112528679B
Authority
CN
China
Prior art keywords
data
target language
intention
language
model
Prior art date
Legal status
Active
Application number
CN202011500085.1A
Other languages
Chinese (zh)
Other versions
CN112528679A (en)
Inventor
尹坤
刘权
陈志刚
王智国
胡国平
Current Assignee
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN202011500085.1A priority Critical patent/CN112528679B/en
Publication of CN112528679A publication Critical patent/CN112528679A/en
Application granted granted Critical
Publication of CN112528679B publication Critical patent/CN112528679B/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a method and device for training an intention understanding model, and a method and device for intention understanding. The training method comprises the following steps: after target language training data and auxiliary language training data are acquired, both are input into an intention understanding model to obtain the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data output by the model; the model prediction loss of the intention understanding model is determined according to these two predicted intentions; and the intention understanding model is updated according to the model prediction loss, after which the step of inputting the target language training data and the auxiliary language training data into the intention understanding model, together with the subsequent steps, is executed again until a preset stop condition is reached. The intention understanding performance of the intention understanding model can thereby be effectively improved.

Description

Method and device for training intention understanding model, and method and device for intention understanding
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and apparatus for training an intent understanding model, and a method and apparatus for understanding intent.
Background
Currently, some man-machine interaction devices can interact with a user on the basis of user sentences (e.g., speech sentences and/or text sentences) input by the user, so as to assist the user in completing corresponding operations (e.g., route queries, air ticket booking, etc.).
After such a man-machine interaction device receives a user sentence input by a user, it first needs to determine the user intention by performing intention understanding on the sentence, and then carries out the man-machine interaction with the user according to that intention.
However, existing man-machine interaction devices still cannot accurately understand the user intention (especially for user sentences in languages with a small range of use, such as local dialects and less widely spoken languages), so how to accurately understand the user intention remains a technical problem to be solved.
Disclosure of Invention
The main purpose of the embodiments of the present application is to provide an intent understanding model training method and apparatus, and an intent understanding method and apparatus, which can accurately understand the user intent from user sentences, especially from user sentences in languages with a small range of use such as local dialects and less widely spoken languages.
The embodiment of the application provides an intention understanding model training method, which comprises the following steps:
acquiring target language training data and auxiliary language training data;
inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a predicted intention corresponding to the target language training data and a predicted intention corresponding to the auxiliary language training data output by the intention understanding model;
determining model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data;
updating the intention understanding model according to the model prediction loss of the intention understanding model, and continuing to execute the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
In one possible embodiment, the target language training data includes at least one of target language real data, target language translation data, and target language generation data; the target language translation data is obtained by translating auxiliary language real data; the target language generation data is generated from candidate intent data.
In one possible implementation manner, the acquiring process of the target language generating data is:
inputting the candidate intention data into a pre-constructed target language data generation model to obtain target language generation data output by the target language data generation model; the target language data generation model is trained by using target language marking data and auxiliary language marking data.
In one possible implementation manner, the construction process of the target language data generation model includes:
training the model to be trained by using the auxiliary language marking data to obtain an auxiliary language data generation model;
training the auxiliary language data generation model by utilizing the target language marking data to obtain the target language data generation model.
In one possible implementation manner, the process of acquiring the target language translation data is:
determining target language words corresponding to the words to be translated according to a preset word mapping relation and the words to be translated in the auxiliary language real data, and determining the target language translation data according to the auxiliary language real data and the target language words corresponding to the words to be translated; the preset vocabulary mapping relation comprises a corresponding relation between the vocabulary to be translated and a target language vocabulary corresponding to the vocabulary to be translated;
or,
inputting the auxiliary language real data into a pre-constructed language translation model to obtain a translation result corresponding to the auxiliary language real data output by the language translation model, and determining the target language translation data according to the translation result corresponding to the auxiliary language real data.
In one possible implementation manner, the determining the model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data includes:
determining target language prediction loss according to the prediction intention corresponding to the target language training data;
determining an auxiliary language prediction loss according to the prediction intention corresponding to the auxiliary language training data;
and determining the model prediction loss of the intention understanding model according to the target language prediction loss and the auxiliary language prediction loss.
In one possible embodiment, when the target language training data includes target language real data, target language translation data, and target language generation data, and the predicted intention corresponding to the target language training data includes a predicted intention of the target language real data, a predicted intention of the target language translation data, and a predicted intention of the target language generation data, the method further includes:
acquiring the actual intention of the target language real data, the reference intention of the target language translation data, and the actual intention of the target language generation data;
the determining the target language prediction loss according to the prediction intention corresponding to the target language training data comprises the following steps:
according to the predicted intention of the target language real data and the actual intention of the target language real data, determining the predicted loss corresponding to the target language real data;
determining a prediction loss corresponding to the target language translation data according to the prediction intention of the target language translation data and the reference intention of the target language translation data;
determining a prediction loss corresponding to the target language generation data according to the prediction intention of the target language generation data and the actual intention of the target language generation data;
and determining the target language prediction loss according to the prediction loss corresponding to the target language real data, the prediction loss corresponding to the target language translation data and the prediction loss corresponding to the target language generation data.
The embodiment of the application also provides an intention understanding method, which comprises the following steps:
acquiring target language data to be understood;
inputting the target language data to be understood into the intention understanding model to obtain the predicted intention, corresponding to the data to be understood, output by the intention understanding model; the intention understanding model is trained by any implementation of the intention understanding model training method provided by the embodiments of the present application.
The embodiment of the application also provides an intention understanding model training device, which comprises:
the first acquisition unit is used for acquiring target language training data and auxiliary language training data;
the first prediction unit is used for inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data output by the intention understanding model;
a loss determination unit, configured to determine a model prediction loss of the intention understanding model according to a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data;
and the model updating unit is used for updating the intention understanding model according to the model prediction loss of the intention understanding model, returning to the first prediction unit, and continuously executing the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
The embodiment of the application also provides an intention understanding device, which comprises:
the second acquisition unit is used for acquiring target language data to be understood;
the second prediction unit is used for inputting the target language data to be understood into the intention understanding model to obtain the predicted intention, corresponding to the data to be understood, output by the intention understanding model; the intention understanding model is trained by any implementation of the intention understanding model training method provided by the embodiments of the present application.
The embodiment of the application also provides an intention understanding model training device, which comprises: a processor, memory, system bus;
the processor and the memory are connected through the system bus;
the memory is used to store one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform any of the implementations of the intent understanding model training methods provided by the embodiments of the present application.
The embodiment of the application also provides an intention understanding device, which comprises: a processor, memory, system bus;
the processor and the memory are connected through the system bus;
the memory is used to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform any implementation of the intent understanding method provided by the embodiments of the present application.
The embodiment of the application also provides a computer readable storage medium, in which instructions are stored which, when run on a terminal device, cause the terminal device to perform any implementation of the intent understanding model training method provided by the embodiments of the present application, or any implementation of the intent understanding method provided by the embodiments of the present application.
The present embodiments also provide a computer program product, which when executed on a terminal device, causes the terminal device to perform any implementation of the method for training an intent understanding model provided by the present embodiments, or perform any implementation of the method for intent understanding provided by the present embodiments.
Based on the technical scheme, the application has the following beneficial effects:
in the intention understanding model training method provided by the application, after target language training data and auxiliary language training data are acquired, both are input into the intention understanding model to obtain the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data output by the model. The model prediction loss of the intention understanding model is then determined according to these two predicted intentions, the model is updated according to that loss, and the step of inputting the target language training data and the auxiliary language training data into the intention understanding model, together with the subsequent steps, is executed again until a preset stop condition is reached.
Because the intention understanding model is trained on the target language training data, it can accurately identify the user intention described by target language user sentences, so that the user intention can be accurately understood from user sentences in the target language (for example, a local dialect or another language with a small range of use).
Furthermore, since the difference between the auxiliary language and the target language is small, the difference between auxiliary language data and target language data is also small, so the auxiliary language data can be used to expand the training data of the intention understanding model, avoiding the adverse effects of having too little training data. As a result, the intention understanding model trained on both the target language training data and the auxiliary language training data has better intention understanding performance and can understand the user intention from user sentences more accurately, in particular from user sentences in local dialects and other languages with a small range of use.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of an intent understanding model training method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a language translation model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a process for constructing a target language data generation model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an intent understanding model provided in an embodiment of the present application;
FIG. 5 is a training schematic diagram of an intent understanding model provided in an embodiment of the present application;
FIG. 6 is a flow chart of an intent understanding method provided by embodiments of the present application;
FIG. 7 is a training schematic diagram of an intent understanding model applied to Cantonese intent understanding according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an intent understanding model training apparatus provided in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an intent understanding apparatus provided in an embodiment of the present application.
Detailed Description
In studying intention understanding, the inventors found that, in the related art, an intention understanding model may be trained in advance using user history sentences in the target language, so that the trained model can perform intention understanding on target language user sentences input by a user. However, if the range of use of the target language (such as a local dialect or a less widely spoken language) is small, user history sentences in the target language are scarce, so the training data for the intention understanding model is scarce as well, and an intention understanding model trained on such a small amount of data has poor intention understanding performance.
The inventors also found in their study of intention understanding that different languages can be similar to one another. For example, a Chinese local dialect (e.g., Cantonese) differs from the official Chinese language (i.e., Mandarin) only slightly in word order and pronunciation, so the two are relatively similar and the difference between them is small. Therefore, when an intention understanding model is used for intention understanding of user sentences in a target language that is a Chinese local dialect, the training data for the intention understanding model can be expanded using the official Chinese language, so that the model can be trained on user history sentences in both the official language and the local dialect.
Based on the above-mentioned findings of the inventors, in order to solve the technical problems in the background section and the drawbacks of the related art, an embodiment of the present application provides an intent understanding model training method, which includes: acquiring target language training data and auxiliary language training data; inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a predicted intention corresponding to the target language training data and a predicted intention corresponding to the auxiliary language training data output by the intention understanding model; determining model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data; updating the intention understanding model according to the model prediction loss, and returning to the step of inputting the target language training data and the auxiliary language training data into the intention understanding model and the subsequent steps until a preset stop condition is reached.
It can be seen that, because the intent understanding model is trained on the target language training data, it can accurately identify the user intent described by target language user sentences, so that the user intent can be accurately understood from user sentences in the target language (for example, a local dialect or a less widely spoken language). Moreover, because the difference between the auxiliary language and the target language is small, the difference between auxiliary language data and target language data is also small, so the auxiliary language data can be used to expand the training data of the intent understanding model and avoid the adverse effects of having too little training data. The intent understanding model trained on both the target language training data and the auxiliary language training data therefore has better intent understanding performance and can understand the user intent from user sentences more accurately, in particular from user sentences in local dialects and other languages with a small range of use.
In addition, the embodiment of the present application does not limit the execution subject of the intent understanding model training method; for example, the method may be applied to a data processing device such as a terminal device or a server. The terminal device may be a smart phone, a computer, a personal digital assistant (PDA), a tablet computer, or the like. The server may be a standalone server, a cluster server, or a cloud server.
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art without inventive effort based on the embodiments of the present application fall within the scope of protection of the present application.
Method embodiment one
Referring to fig. 1, a flowchart of a method for training an intention understanding model is provided in an embodiment of the present application.
The intention understanding model training method provided by the embodiment of the application comprises S101-S105:
S101: target language training data and auxiliary language training data are acquired.
The target language refers to a language with a small range of use (or a small number of speakers). The embodiment of the present application does not limit the target language; for example, the target language may be a local dialect or a less widely spoken language.
The auxiliary language refers to a language whose difference from the target language is small (i.e., the difference between the auxiliary language and the target language is below a preset difference threshold). For example, since the official Chinese language (i.e., Mandarin) and the Chinese local dialects (e.g., the Northern, Wu, Xiang, Gan, Hakka, Min, and Yue dialect groups) differ only slightly in word order and pronunciation, the difference between the official language and a local dialect is small, and the official Chinese language may therefore serve as the auxiliary language when the target language is a Chinese local dialect.
The auxiliary language training data refers to auxiliary language sentences used for training the intention understanding model, and the auxiliary language training data is used for expanding training data used for the intention understanding model. It should be noted that, the embodiment of the application is not limited to the method of acquiring the auxiliary language training data, for example, the auxiliary language training data may be legally crawled from a web page, or may be read from the user history statement under the auxiliary language stored in the man-machine interaction device.
The target language training data refers to target language sentences used for training the intention understanding model. In addition, the embodiments of the present application are not limited to target language training data, and for example, the target language training data may include at least one of target language real data, target language translation data, and target language generation data. For ease of understanding, the target language real data, target language translation data, and target language generation data are described below, respectively.
The real data of the target language refers to real target language sentences obtained by preset collection means (for example, legal crawling from web pages, reading from user history sentences under the target language stored by the man-machine interaction equipment and the like). It should be noted that the preset collection means may be preset, and the embodiment of the present application is not limited to the preset collection means.
The target language translation data refers to target language sentences obtained through a preset translation means. It should be noted that the preset translation means may be preset, and the embodiment of the present application is not limited to the preset translation means, for example, the preset translation means may be a means for translating based on a preset vocabulary mapping relationship described below, or may be a means for translating based on a language translation model described below. In addition, the embodiment of the present application is not limited to the translation object of the preset translation means, and for example, the translation object may be an auxiliary language sentence because the difference between the auxiliary language and the target language is small.
To facilitate an understanding of the process of obtaining translation data in a target language, two possible implementations are described below.
In a first possible implementation manner, the process of obtaining the translation data of the target language may include steps 11-12:
step 11: and determining a target language vocabulary corresponding to the vocabulary to be translated according to the preset vocabulary mapping relation and the vocabulary to be translated in the auxiliary language real data. The preset vocabulary mapping relation comprises a corresponding relation between a vocabulary to be translated and a target language vocabulary corresponding to the vocabulary to be translated.
The preset vocabulary mapping relation is used for recording the correspondence between target language words and auxiliary language words. For example, when the target language is Cantonese and the auxiliary language is Mandarin, the preset vocabulary mapping relation may record the correspondence between Cantonese words and Mandarin words (as shown in Table 1).
Cantonese vocabulary | Mandarin vocabulary
边度 ("where") | 哪里 ("where")
点样行 ("how to go") | 怎么走 ("how to go")
…… | ……
点样 ("how") | 怎么 ("how")

Table 1: Correspondence between Cantonese vocabulary and Mandarin vocabulary
The auxiliary language real data refers to real auxiliary language sentences acquired and comprises at least one auxiliary language vocabulary. It should be noted that, the embodiment of the application is not limited to the collection manner of the real data of the auxiliary language, for example, the real data of the auxiliary language may be obtained by legal crawling from a web page, or may be read from the user history statement in the auxiliary language stored in the man-machine interaction device. In addition, the embodiment of the application also does not limit the relationship between the auxiliary language real data and the above auxiliary language training data, for example, the auxiliary language real data may be the same as the above auxiliary language training data or may be different from the above auxiliary language training data.
The vocabulary to be translated refers to the auxiliary language words in the auxiliary language real data that need to be translated into target language words, and it can be determined according to the preset vocabulary mapping relation. For example, if the auxiliary language real data is the Mandarin sentence 万达怎么走 ("How do I get to Wanda?"), then, since the preset vocabulary mapping relation shown in Table 1 records that the Mandarin word 怎么走 corresponds to the Cantonese word 点样行, the word 怎么走 in that sentence can be determined as a vocabulary to be translated.
Based on the above-mentioned related content of step 11, after the auxiliary language real data (e.g., the Mandarin sentence 万达怎么走) is obtained, each vocabulary to be translated (e.g., 怎么走) in it can be determined according to the preset vocabulary mapping relation, and the target language word corresponding to each vocabulary to be translated (e.g., the Cantonese word 点样行) can be looked up in the mapping, so that the auxiliary language real data can be translated into its corresponding target language translation data (e.g., the Cantonese sentence 万达点样行).
Step 12: and determining target language translation data according to the auxiliary language real data and target language words corresponding to the words to be translated.
In the embodiment of the present application, after obtaining the target language vocabulary corresponding to each word to be translated in the auxiliary language real data, each word to be translated in the auxiliary language real data may be replaced with the target language vocabulary corresponding to each word to be translated, so as to obtain the target language translation data. For example, when the auxiliary language real data includes N words to be translated, the 1 st word to be translated in the auxiliary language real data may be replaced with a target language word corresponding to the 1 st word to be translated, the 2 nd word to be translated in the auxiliary language real data may be replaced with a target language word corresponding to the 2 nd word to be translated, … … (and so on), the nth word to be translated in the auxiliary language real data may be replaced with a target language word corresponding to the nth word to be translated, so as to obtain target language translation data corresponding to the auxiliary language real data, so that the target language translation data includes the target language word corresponding to the N words to be translated, and the target language translation data does not include the N words to be translated.
Based on the above-mentioned related content of the first possible implementation of obtaining the target language translation data, each word to be translated in the auxiliary language real data can be directly replaced with its corresponding target language word to obtain the target language translation data corresponding to the auxiliary language real data, so that the target language translation data contains the target language words corresponding to the words to be translated. For example, the Mandarin word 怎么走 in the Mandarin sentence 万达怎么走 is directly replaced with the Cantonese word 点样行, yielding the Cantonese sentence 万达点样行.
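A minimal sketch of this first implementation follows; the dictionary entries are a toy stand-in for the preset vocabulary mapping relation of Table 1, and the function name translate_by_mapping is illustrative, not from the patent:

```python
# Dictionary-based translation of auxiliary-language (Mandarin) sentences into
# target-language (Cantonese) sentences, per steps 11-12. The mapping below is
# a toy stand-in for the preset vocabulary mapping relation.
VOCAB_MAPPING = {
    "怎么走": "点样行",   # "how to go"
    "怎么": "点样",       # "how"
    "哪里": "边度",       # "where"
}

def translate_by_mapping(auxiliary_sentence: str) -> str:
    """Replace every vocabulary to be translated with its target language word."""
    target_sentence = auxiliary_sentence
    # Try longer entries first, so that "怎么走" is matched before its prefix "怎么".
    for aux_word in sorted(VOCAB_MAPPING, key=len, reverse=True):
        target_sentence = target_sentence.replace(aux_word, VOCAB_MAPPING[aux_word])
    return target_sentence

print(translate_by_mapping("万达怎么走"))  # -> 万达点样行
```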
In a second possible implementation manner, the process of obtaining the target language translation data may include step 21-step 22:
step 21: inputting the auxiliary language real data into a pre-constructed language translation model to obtain a translation result corresponding to the auxiliary language real data output by the language translation model.
The language translation model is used for translating the auxiliary language sentence into the target language sentence. Wherein the auxiliary language sentence comprises at least one auxiliary language vocabulary and the target language sentence comprises at least one target language vocabulary.
It should be noted that, the language translation model is not limited in the embodiment of the present application, and the language translation model may be any model that can translate an auxiliary language sentence into a target language sentence in the existing or future.
The translation result corresponding to the auxiliary language real data refers to the output obtained when the language translation model translates the auxiliary language real data. The embodiment of the present application does not limit the number of translation results corresponding to one piece of auxiliary language real data. For example, as shown in FIG. 2, when the target language is Cantonese and the auxiliary language is Mandarin, if the auxiliary language real data is the Mandarin sentence 关闭导航 ("turn off navigation"), the language translation model translates it and outputs M translation results (M candidate Cantonese renderings of "turn off navigation"), sequentially ordered according to the recommended sequence, where M is a positive integer.
Based on the above-mentioned related content of step 21, after the auxiliary language real data is obtained, the auxiliary language real data may be directly input into a pre-constructed language translation model, so that the language translation model may translate the auxiliary language real data and output M translation results corresponding to the auxiliary language real data, so that the target language translation data corresponding to the auxiliary language real data may be determined from the M translation results.
Step 22: and determining target language translation data according to the translation result corresponding to the auxiliary language real data.
In this embodiment of the present application, after the M translation results corresponding to the auxiliary language real data are obtained, the target language translation data corresponding to the auxiliary language real data can be screened out of them directly. For example, the Top-1 (highest-ranked) translation result among the M results may be determined as the target language translation data. For another example, in order to improve the diversity of the training data, one translation result may be randomly selected from the Top-1 through Top-G translation results (i.e., the G highest-ranked results) among the M results and determined as the target language translation data, where G is a positive integer.
Based on the above-mentioned related content of the second possible implementation manner of obtaining the target language translation data, the auxiliary language real data may be directly translated by using the pre-constructed language translation model, at least one translation result corresponding to the auxiliary language real data may be output, and the target language translation data corresponding to the auxiliary language real data may be determined according to the at least one translation result.
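A sketch of how steps 21-22 might look follows; translation_model is a hypothetical stand-in for the pre-constructed language translation model, and the default value of G is an arbitrary choice:

```python
import random
from typing import Callable, List

def pick_target_translation(auxiliary_sentence: str,
                            translation_model: Callable[[str], List[str]],
                            g: int = 3) -> str:
    """Turn one piece of auxiliary language real data into target language
    translation data, following steps 21-22."""
    # Step 21: the language translation model returns the M translation
    # results, ordered best-first according to the recommended sequence.
    results = translation_model(auxiliary_sentence)
    # Step 22: either take the Top-1 result, or (to diversify the training
    # data) sample uniformly from the Top-1..Top-G results.
    top_g = results[:min(g, len(results))]
    return random.choice(top_g)
```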
The target language generation data refers to target language sentences generated from candidate intention data. Wherein the candidate intention data is used to describe the intention of the user under a preset application field (i.e., application field of the intention understanding model).
It should be noted that the preset application field may be preset, and the embodiment of the application is not limited to the preset application field, for example, the preset application field may be a navigation technical field. In addition, the embodiment of the present application is not limited to the user intention of the preset application domain, for example, if the preset application domain is the navigation technical domain, the user intention of the preset application domain may include the user intention in the navigation technical domain shown in table 2.
Table 2: List of user intents in the navigation technology field
In addition, the embodiments of the present application do not limit the representation form of the candidate intention data; for example, the candidate intention data may be represented as a two-tuple, i.e., (intention type, intention parameter). The intention type is the type to which the user intention belongs, e.g., positioning or POI search. The intention parameter (also called a slot) is a parameter related to the user intention, e.g., a POI such as Shanghai.
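For illustration only, the two-tuple form of candidate intention data could be modeled as below; the class and field names are assumptions, not part of the patent:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CandidateIntent:
    intent_type: str    # type the user intention belongs to, e.g. "POI search"
    intent_param: str   # the slot value, e.g. a POI such as "Shanghai"

candidate = CandidateIntent(intent_type="POI search", intent_param="Shanghai")
```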
In addition, the embodiment of the present application does not limit the process of acquiring the target language generating data, and for convenience of understanding, a possible implementation of acquiring the target language generating data will be described below.
In one possible implementation manner, the process of acquiring the target language generation data may specifically be: and inputting the candidate intention data into a pre-constructed target language data generation model to obtain target language generation data output by the target language data generation model.
The target language data generation model is used for generating target language generation data according to the candidate intention data, and is trained by utilizing the target language marking data and the auxiliary language marking data.
The target language annotation data refers to target language sentences labeled with real user intention, which are used for training the target language data generation model. In addition, the embodiment of the present application does not limit the representation form of the target language annotation data, for example, the target language annotation data may be represented by a triplet (target language sentence, intention type, intention parameter). In addition, the embodiments of the present application also do not limit the relationship between the target language annotation data and the above target language real data, for example, the target language annotation data may be the same as the above target language real data or may be different from the above target language real data.
The auxiliary language marking data refers to auxiliary language sentences, labeled with the real user intention, that are used for training the target language data generation model. The embodiment of the present application does not limit the representation form of the auxiliary language marking data; for example, it may be represented by a triplet (auxiliary language sentence, intention type, intention parameter). Nor is the relationship between the auxiliary language marking data and the above auxiliary language training data limited; for example, the auxiliary language marking data may be the same as or different from the above auxiliary language training data.
In addition, the target language data generation model may be a generative model. The embodiment of the present application does not limit the generative framework of the target language data generation model; for example, the generative framework may be UniLMv2, proposed by Microsoft.
In addition, the embodiment of the present application is not limited to the process of constructing the target language data generation model, and for convenience of understanding, a possible implementation of constructing the target language data generation model will be described below.
In one possible implementation, the process of constructing the target language data generation model may include steps 31-32:
step 31: training the model to be trained by using the auxiliary language marking data to obtain an auxiliary language data generation model.
The model to be trained is the base model from which the target language data generation model is constructed; after two rounds of training, it becomes the target language data generation model. The embodiment of the present application does not limit the model to be trained; for example, it may be a generative model, and in particular a generative model using UniLMv2 as its generative framework.
The auxiliary language data generation model is used for generating auxiliary language sentences from the candidate intention data. For example, when the candidate intention data is A and the auxiliary language sentence is X, the auxiliary language data generation model generates X from A (i.e., A→X). The auxiliary language data generation model can thus be expressed as formula (1):

X = SC_f(A)  (1)

where X is an auxiliary language sentence (e.g., the Mandarin sentence 明天去上海, "go to Shanghai tomorrow"); A is the candidate intention data; and SC_f(·) is the model function of the auxiliary language data generation model. Note that the embodiment of the present application does not limit the expression of A; for example, A = (I, P), where I is the intention type (e.g., POI search) and P is the intention parameter (e.g., Shanghai).
It should be noted that, the training process of step 31 is not limited in this embodiment of the present application, and any training method that can train the model to be trained into the auxiliary language data generation model may be used to implement the training process.
Step 32: training the auxiliary language data generation model by using the target language marking data to obtain a target language data generation model.
The target language data generation model is used for generating target language sentences from the candidate intention data. For example, when the candidate intention data is A and the target language sentence is L, the target language data generation model generates L from A (i.e., A→L). The target language data generation model can thus be expressed as formula (2):

L = SC_m(A)  (2)

where L is a target language sentence (e.g., the Cantonese sentence 听日去上海, "go to Shanghai tomorrow"); A is the candidate intention data; and SC_m(·) is the model function of the target language data generation model.
The training process of step 32 is likewise not limited, and may be implemented by any existing or future training method that can train the auxiliary language data generation model into the target language data generation model.
It should be further noted that, for the target language data generation model, step 31 is a pre-training process and step 32 is a transfer learning process. For example, as shown in FIG. 3, the construction of a target language data generation model whose generative framework is UniLMv2 may include two stages: pre-training and transfer learning.
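The two-round construction of steps 31-32 might be sketched as follows; the train_step interface and the epoch counts are hypothetical placeholders for whatever UniLMv2-style generative trainer is actually used:

```python
from typing import List, Tuple

def build_target_language_generator(model,
                                    aux_annotated: List[Tuple[str, str]],
                                    tgt_annotated: List[Tuple[str, str]],
                                    pretrain_epochs: int = 10,
                                    transfer_epochs: int = 5):
    """Two-round construction of the target language data generation model.

    Each annotated example is an (intent, sentence) pair: the model learns to
    generate the sentence from the serialized candidate intention data, i.e.
    X = SC_f(A) after step 31 and L = SC_m(A) after step 32.
    """
    # Step 31 (pre-training): train the model to be trained on the auxiliary
    # language annotated data, yielding the auxiliary language data
    # generation model SC_f.
    for _ in range(pretrain_epochs):
        for intent, sentence in aux_annotated:
            model.train_step(source=intent, target=sentence)  # hypothetical API

    # Step 32 (transfer learning): continue training on the target language
    # annotated data, yielding the target language data generation model SC_m.
    for _ in range(transfer_epochs):
        for intent, sentence in tgt_annotated:
            model.train_step(source=intent, target=sentence)  # hypothetical API

    return model
```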
Based on the above-mentioned related content of S101, in the embodiment of the present application, in order to train an intention understanding model with better intention understanding performance, target language training data and auxiliary language training data may both be acquired, so that the intention understanding model can be trained on the two together. Sufficient training data is thereby available during training, which effectively avoids the adverse effects of having too little training data for the intention understanding model.
S102: inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data output by the intention understanding model.
The intention understanding model is used for understanding the intention of the user on the user statement in the target language.
In addition, the embodiments of the present application are not limited to the structure of the intended understanding model, and may be implemented using any model structure that can be intended to be understood, either existing or appearing in the future. For example, as shown in FIG. 4, the intent understanding model may include an encoder (i.e., an encoding layer) and a classifier (i.e., a fully connected layer). The encoder is used for carrying out semantic understanding on the model input data of the intention understanding model to obtain sentence vectors corresponding to the model input data. The classifier is used for determining a prediction intention (i.e., model output data) corresponding to the model input data according to a sentence vector corresponding to the model input data output by the encoder.
It should be noted that the embodiments of the present application do not limit the encoder, and any existing or future semantic understanding model may be used; for example, the encoder may be BERT (Bidirectional Encoder Representations from Transformers). Likewise, the embodiments of the present application do not limit the classifier, and any existing or future classifier may be used; for example, the classifier may be a linear classifier. It can be seen that when the intent understanding model includes an encoder and a classifier, the encoder is BERT, and the classifier is a linear classifier, the intent understanding model may be implemented using formula (3):

y = Linear(BERT(x))  (3)

where y is the model output data of the intent understanding model (i.e., the predicted intent corresponding to the input data x); x is the model input data of the intent understanding model; BERT(·) is the encoding function in the intent understanding model; and Linear(·) is the classification function in the intent understanding model.
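Below is a minimal PyTorch sketch of formula (3), assuming the Hugging Face transformers library; the bert-base-chinese checkpoint, the use of the [CLS] vector as the sentence vector, and the number of intent classes are illustrative assumptions, not choices fixed by the patent:

```python
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class IntentUnderstandingModel(nn.Module):
    """y = Linear(BERT(x)): the encoder produces a sentence vector, and the
    fully connected classifier maps it to scores over intent classes."""

    def __init__(self, num_intents: int, bert_name: str = "bert-base-chinese"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(bert_name)  # encoding layer
        self.classifier = nn.Linear(self.encoder.config.hidden_size, num_intents)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        sentence_vec = outputs.last_hidden_state[:, 0]  # [CLS] vector (assumed pooling)
        return self.classifier(sentence_vec)            # intent logits

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = IntentUnderstandingModel(num_intents=20)        # 20 intents is arbitrary
batch = tokenizer(["万达点样行"], return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])
```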
In addition, the predicted intention corresponding to the target language training data is obtained by intention understanding of the intention understanding model on the target language training data.
In addition, the predicted intention corresponding to the auxiliary language training data is obtained by intention understanding of the auxiliary language training data by an intention understanding model.
Based on the above-mentioned content related to S102, the target language training data and the auxiliary language training data may be input into the intention understanding model, so that the model performs intention understanding on both and outputs the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data. For example, as shown in FIG. 5, when the target language training data includes target language real data, target language translation data, and target language generation data, these three together with the auxiliary language training data may be input into the intention understanding model to obtain the four corresponding predicted intentions output by the model, so that the intention understanding performance of the model can subsequently be evaluated based on these predicted intentions.
S103: and determining model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data.
The model prediction loss is used for representing the intention understanding performance of the intention understanding model. Specifically, the larger the model prediction loss, the worse the intention understanding performance of the intention understanding model; the smaller the model prediction loss, the better the intention understanding performance.
The embodiment of the present application does not limit the calculation manner of the model prediction loss, and for convenience of understanding, the following description is made with reference to examples.
As an example, S103 may specifically include S1031-S1033:
S1031: determining the target language prediction loss according to the predicted intention corresponding to the target language training data.
Wherein the target language prediction loss is used for representing the intention understanding performance of the intention understanding model on the target language sentence. In addition, the embodiment of the present application does not limit the determination manner of the target language prediction loss. For ease of understanding, the following description is provided in connection with one possible embodiment.
In one possible implementation manner, when the target language training data includes target language real data, target language translation data and target language generation data, and the prediction intention corresponding to the target language training data includes the prediction intention of the target language real data, the prediction intention of the target language translation data and the prediction intention of the target language generation data, the process of acquiring the target language prediction loss may specifically include steps 41 to 45:
Step 41: the method comprises the steps of obtaining actual intention of real data of a target language, reference intention of translation data of the target language and actual intention of generation data of the target language.
The actual intention of the target language real data refers to the actual intention described by the target language real data. During training of the intention understanding model, it can serve as label information to guide the training process, so that the predicted intention output by the trained model for the target language real data is as close as possible to that actual intention.
The reference intention of the target language translation data can be used as label information to guide the training process of the intention understanding model, so that the trained predicted intention of the intention understanding model for the output of the target language translation data can be as close to the reference intention of the target language translation data as possible.
It should be noted that, the embodiments of the present application do not limit the reference intention of the target language translation data. In some cases, if the target language translation data is obtained by translating the auxiliary language real data, the reference intention of the target language translation data may be a predicted intention corresponding to the auxiliary language real data. The predicted intention corresponding to the auxiliary language real data is obtained by intention understanding of an intention understanding model aiming at the auxiliary language real data. For example, if the auxiliary language real data is the above auxiliary language training data, the reference intention of the target language translation data may be the predicted intention corresponding to the above auxiliary language training data.
The actual intention of the target language generation data refers to the intention actually described by the target language generation data; if the target language generation data is generated from candidate intention data, its actual intention may be the intention described by that candidate intention data.
In addition, in the training process of the intention understanding model, the actual intention of the target language generation data can be used as label information to guide the training process of the intention understanding model, so that the trained predicted intention of the intention understanding model output aiming at the target language generation data can be as close to the actual intention of the target language generation data as possible.
Step 42: determine the prediction loss corresponding to the target language real data according to the predicted intention of the target language real data and the actual intention of the target language real data.
The prediction loss corresponding to the target language real data is used for representing the intention understanding performance of the intention understanding model on the target language real data.
In addition, the embodiment of the present application does not limit the calculation method of the prediction loss corresponding to the real data of the target language, for example, step 42 may specifically be: as shown in the formula (4), cross Entropy (CE) between the predicted intention of the target language real data and the actual intention of the target language real data is determined as a prediction loss corresponding to the target language real data.
$$L_{\text{real}} = CE\left(\hat{y}_{\text{real}},\ y_{\text{real}}\right) \tag{4}$$

where $L_{\text{real}}$ is the prediction loss corresponding to the target language real data; $\hat{y}_{\text{real}}$ is the predicted intention of the target language real data, i.e., the output of the intention understanding model for the target language real data $x_{\text{real}}$; $y_{\text{real}}$ is the actual intention of the target language real data; and $CE(\cdot)$ is the cross entropy function.
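For a concrete sense of formula (4), a minimal PyTorch sketch might look as follows; the tensor names, shapes and values are illustrative assumptions rather than anything specified by this embodiment.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: a batch of 4 sentences, 10 candidate intents.
real_logits = torch.randn(4, 10)           # model scores for target language real data
real_intents = torch.tensor([2, 0, 7, 3])  # gold intent ids (the actual intentions)

# F.cross_entropy applies log-softmax internally, matching CE(predicted, actual).
loss_real = F.cross_entropy(real_logits, real_intents)
```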
Step 43: determine the prediction loss corresponding to the target language translation data according to the predicted intention of the target language translation data and the reference intention of the target language translation data.
The prediction loss corresponding to the target language translation data is used for representing the intention understanding performance of the intention understanding model on the target language translation data.
In addition, the embodiment of the present application does not limit the calculation method of the prediction loss corresponding to the target language translation data, for example, step 43 may specifically be: as shown in the formula (5), a relative entropy (Kullback-Leibler Divergence, KL divergence) between the predicted intention of the target language translation data and the reference intention of the target language translation data is determined as a prediction loss corresponding to the target language translation data.
$$L_{\text{trans}} = KL\left(\hat{y}_{\text{trans}}\ \middle\|\ y_{\text{trans}}^{ref}\right) \tag{5}$$

where $L_{\text{trans}}$ is the prediction loss corresponding to the target language translation data; $\hat{y}_{\text{trans}}$ is the predicted intention of the target language translation data, i.e., the output of the intention understanding model for the target language translation data $x_{\text{trans}}$; $y_{\text{trans}}^{ref}$ is the reference intention of the target language translation data; and $KL(\cdot\,\|\,\cdot)$ is the KL divergence function.
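Similarly, one common soft-label formulation of formula (5) is sketched below, assuming the reference intention is the model's detached predicted distribution for the corresponding auxiliary language real data; all names are illustrative.

```python
import torch
import torch.nn.functional as F

trans_logits = torch.randn(4, 10)  # model scores for target language translation data
aux_logits = torch.randn(4, 10)    # model scores for the auxiliary language real data

# F.kl_div takes log-probabilities for the prediction and probabilities for
# the reference; detaching the reference keeps it a fixed soft label.
loss_trans = F.kl_div(F.log_softmax(trans_logits, dim=-1),
                      F.softmax(aux_logits, dim=-1).detach(),
                      reduction="batchmean")
```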
Step 44: determine the prediction loss corresponding to the target language generation data according to the predicted intention of the target language generation data and the actual intention of the target language generation data.
The prediction loss corresponding to the target language generation data is used for representing the intention understanding performance of the intention understanding model on the target language generation data.
In addition, the embodiment of the present application does not limit the calculation method of the prediction loss corresponding to the target language generation data, for example, step 44 may specifically be: as shown in the formula (6), cross entropy between the predicted intention of the target language generation data and the actual intention of the target language generation data is determined as a prediction loss corresponding to the target language generation data.
$$L_{\text{gen}} = CE\left(\hat{y}_{\text{gen}},\ y_{\text{gen}}\right) \tag{6}$$

where $L_{\text{gen}}$ is the prediction loss corresponding to the target language generation data; $\hat{y}_{\text{gen}}$ is the predicted intention of the target language generation data, i.e., the output of the intention understanding model for the target language generation data $x_{\text{gen}}$; $y_{\text{gen}}$ is the actual intention of the target language generation data; and $CE(\cdot)$ is the cross entropy function.
Step 45: determine the target language prediction loss according to the prediction losses corresponding to the target language real data, the target language translation data, and the target language generation data.
In the embodiment of the present application, after obtaining the prediction loss corresponding to the real data of the target language, the prediction loss corresponding to the translation data of the target language, and the prediction loss corresponding to the generated data of the target language, the prediction loss corresponding to the real data of the target language, the prediction loss corresponding to the translation data of the target language, and the prediction loss corresponding to the generated data of the target language may be weighted and summed to obtain the prediction loss of the target language. It should be noted that the weights involved in the weighted summation process may be set in advance according to the application scenario.
Based on the above-mentioned related content of step 41 to step 45, if the target language training data includes the target language real data, the target language translation data and the target language generation data, the prediction loss corresponding to the target language real data, the prediction loss corresponding to the target language translation data and the prediction loss corresponding to the target language generation data may be calculated according to the prediction intention of the target language real data, the prediction intention of the target language translation data and the prediction intention of the target language generation data, respectively; and determining the target language prediction loss according to the prediction loss corresponding to the real data of the target language, the prediction loss corresponding to the translation data of the target language and the prediction loss corresponding to the generation data of the target language, so that the target language prediction loss can accurately represent the intention understanding performance of the intention understanding model on the target language statement.
S1032: determine the auxiliary language prediction loss according to the predicted intention corresponding to the auxiliary language training data.
Wherein the auxiliary language prediction penalty is used to characterize the intended understanding performance of the intended understanding model on the auxiliary language statements.
In addition, the embodiment of the present application does not limit the calculation process of the auxiliary language prediction loss, for example, the process of obtaining the auxiliary language prediction loss may specifically include steps 51 to 52:
Step 51: acquire the actual intention corresponding to the auxiliary language training data.
The actual intention corresponding to the auxiliary language training data refers to the actual intention of the auxiliary language training data; in addition, in the training process of the intention understanding model, the actual intention corresponding to the auxiliary language training data can be used as label information to guide the training process of the intention understanding model, so that the trained predicted intention of the intention understanding model output aiming at the auxiliary language training data can be as close as possible to the actual intention corresponding to the auxiliary language training data. Note that, the embodiment of the present application is not limited to the manner of acquiring the actual intent corresponding to the auxiliary language training data.
Step 52: determine the auxiliary language prediction loss according to the predicted intention corresponding to the auxiliary language training data and the actual intention corresponding to the auxiliary language training data.
The embodiment of the application does not limit the calculation method of the auxiliary language prediction loss, and for convenience of understanding, the following description is made with reference to examples.
As an example, step 52 may specifically be: as shown in formula (7), the cross entropy between the predicted intention corresponding to the auxiliary language training data and the actual intention corresponding to the auxiliary language training data is determined as an auxiliary language prediction loss.
$$L_{\text{aux}} = CE\left(\hat{y}_{\text{aux}},\ y_{\text{aux}}\right) \tag{7}$$

where $L_{\text{aux}}$ is the auxiliary language prediction loss; $\hat{y}_{\text{aux}}$ is the predicted intention corresponding to the auxiliary language training data $x_{\text{aux}}$; $y_{\text{aux}}$ is the actual intention corresponding to the auxiliary language training data; and $CE(\cdot)$ is the cross entropy function.
Based on the above-mentioned related content of steps 51 to 52, the cross entropy between the predicted intent corresponding to the auxiliary language training data and the actual intent corresponding to the auxiliary language training data may be determined as an auxiliary language prediction loss, so that the auxiliary language prediction loss may accurately represent the intended understanding performance of the intended understanding model on the auxiliary language sentence.
S1033: determine the model prediction loss of the intention understanding model according to the target language prediction loss and the auxiliary language prediction loss.
In the embodiment of the application, after the target language prediction loss and the auxiliary language prediction loss are obtained, the target language prediction loss and the auxiliary language prediction loss may be weighted and summed to obtain the model prediction loss of the intention understanding model. It should be noted that the weights involved in the weighted summation process may be set in advance according to the application scenario.
Based on the above-mentioned content related to S1031 to S1033, after obtaining the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data by using the intention understanding model, the target language prediction loss may be obtained according to the prediction intention corresponding to the target language training data, and the auxiliary language prediction loss may be obtained according to the prediction intention corresponding to the auxiliary language training data; and then carrying out weighted summation on the target language prediction loss and the auxiliary language prediction loss to obtain the model prediction loss of the intention understanding model, so that the model prediction loss can accurately represent the intention understanding performance of the intention understanding model.
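Taken together, step 45 and S1033 reduce to two nested weighted sums; the sketch below uses placeholder loss values and weights, since the weights are left to the application scenario.

```python
import torch

def weighted_sum(losses, weights):
    # Generic weighted summation used by both step 45 and S1033.
    return sum(w * l for w, l in zip(weights, losses))

# Toy scalar losses standing in for the results of steps 42-44 and step 52.
loss_real, loss_trans, loss_gen = torch.tensor(0.8), torch.tensor(1.2), torch.tensor(0.9)
loss_aux = torch.tensor(0.7)

target_loss = weighted_sum([loss_real, loss_trans, loss_gen], [1.0, 0.5, 0.5])  # step 45
model_loss = weighted_sum([target_loss, loss_aux], [1.0, 1.0])                  # S1033
```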
S104: judge whether the preset stop condition is reached; if so, end the training process of the intention understanding model; if not, perform S105.
The preset stop condition refers to a preset constraint that must be satisfied before training of the intention understanding model stops, and it may be set in advance according to the application scenario. The embodiment of the present application does not limit the preset stop condition: for example, it may be that the model prediction loss of the intention understanding model falls below a first threshold; or that the rate of change of the model prediction loss is less than a second threshold (i.e., the predicted intentions output by the intention understanding model are converging); or that the number of updates of the intention understanding model reaches a third threshold. The first, second and third thresholds may all be preset.
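The three example stop conditions could be checked as in the sketch below; all thresholds are illustrative placeholders, not values from this embodiment.

```python
def should_stop(loss_history, num_updates,
                loss_thresh=0.05, delta_thresh=1e-4, max_updates=100_000):
    """loss_history: model prediction losses of successive training rounds."""
    if loss_history and loss_history[-1] < loss_thresh:
        return True   # loss fell below the first threshold
    if len(loss_history) >= 2 and abs(loss_history[-1] - loss_history[-2]) < delta_thresh:
        return True   # change rate below the second threshold (converging)
    return num_updates >= max_updates  # update count reached the third threshold
```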
In the embodiment of the present application, if the current round of the intention understanding model reaches the preset stop condition, the model already has sufficiently high intention understanding performance, so the training process can be ended directly and the current round of the model saved for performing intention understanding on user sentences in the target language. If the current round does not reach the preset stop condition, the intention understanding performance is still low, so the model is updated according to its model prediction loss, and the intention understanding performance of the updated model is then checked again using the target language training data and the auxiliary language training data.
S105: the intention understanding model is updated according to the model prediction loss of the intention understanding model, and execution returns to S102.
It should be noted that the embodiment of the present application is not limited to the process of updating the model, and may be implemented by any model updating method existing or appearing in the future.
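Since the update method is left open, any standard gradient-based update fits; a minimal PyTorch sketch with a placeholder model might be:

```python
import torch

intent_model = torch.nn.Linear(768, 10)  # placeholder for the intention understanding model
optimizer = torch.optim.Adam(intent_model.parameters(), lr=1e-4)

def update_model(model_loss: torch.Tensor) -> None:
    optimizer.zero_grad()
    model_loss.backward()  # gradients of the model prediction loss
    optimizer.step()       # one parameter update of the model (S105)
```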
Based on the above-mentioned content related to S101 to S105, in the training method of the intention understanding model provided in the present application, after obtaining the target language training data and the auxiliary language training data, the target language training data and the auxiliary language training data are input into the intention understanding model to obtain the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data output by the intention understanding model, and the model prediction loss of the intention understanding model is determined according to the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data; and updating the intention understanding model according to the model prediction loss, and returning to execute the step of inputting the target language training data and the auxiliary language training data into the intention understanding model and the subsequent steps until a preset stop condition is reached.
It can be seen that, because the intention understanding model is trained on the target language training data, it can accurately identify the user intention described by a user sentence in the target language, so that the user intention can be accurately understood from user sentences in the target language (for example, languages with a narrow usage range, such as local dialects and low-resource languages). Moreover, because the difference between the auxiliary language and the target language is small, the difference between auxiliary language data and target language data is also small, so auxiliary language data can be used to expand the training data of the intention understanding model and avoid the adverse effects of having too little training data. An intention understanding model trained on both target language and auxiliary language training data therefore has better intention understanding performance, and in particular can accurately understand user intention from user sentences in local dialects, low-resource languages, and other languages with a narrow usage range.
Based on the above content related to the intention understanding model training method, since the trained intention understanding model can accurately predict the intention of a user sentence in the target language, after the intention understanding model is trained using any embodiment of the training method provided above, it can be used to perform intention understanding on user sentences in the target language. On this basis, the embodiments of the present application also provide an intention understanding method, which is explained below in connection with Method Embodiment Two.
Method Embodiment Two
Referring to fig. 6, a flowchart of a method for understanding intent is provided in an embodiment of the present application.
The intention understanding method provided by the embodiment of the application comprises S601-S602:
S601: obtain the target language data to be understood.
The target language data to be understood refers to target language sentences which need to be understood. In addition, the embodiment of the application does not limit the process of acquiring the data to be understood in the target language. For ease of understanding, the following description is provided in connection with two examples.
Example 1: S601 may specifically be: determine the target language data to be understood according to the target language text data input by the user. Here, target language text data refers to a target language sentence that the user inputs by way of text input (e.g., typed into a text box).
It can be seen that after the target language text data input by the user through the text input mode is obtained, the target language data to be understood can be determined according to the target language text data. For example, the target language text data may be directly determined as the target language data to be understood. For another example, the first processing may be performed on the target language text data, and then the target language text data after the first processing is determined to be the target language data to be understood; the first process may be preset, and the embodiment of the present application is not limited to the first process (for example, the first process may include an error correction process or the like).
Example 2, S601 may specifically include S6011-S6013:
S6011: acquire the target language voice data input by the user.
The target language voice data refers to target language sentences input by a user through a voice input mode. It should be noted that, the embodiment of the present application is not limited to the process of acquiring the voice data of the target language, and may be implemented by any existing or future voice acquisition mode.
S6012: perform voice recognition on the target language voice data to obtain the text data corresponding to the target language voice data.
The text data corresponding to the target language voice data refers to a voice recognition result of the target language voice data; and the text data corresponding to the target language voice data comprises the text information recorded in the target language voice data.
It should be noted that the embodiments of the present application are not limited to the implementation of speech recognition, and may be implemented by any speech recognition method that occurs in the present or future.
S6013: determine the target language data to be understood according to the text data corresponding to the target language voice data.
In this embodiment of the present application, after obtaining text data corresponding to target language voice data, the data to be understood in the target language may be determined according to the text data corresponding to the target language voice data. For example, text data corresponding to the target language voice data may be directly determined as the target language data to be understood. For another example, the text data corresponding to the target language voice data may be first processed for the second time, and then the text data after the second processing is determined as the data to be understood in the target language. The second process may be preset, and embodiments of the present application are not limited to the second process (e.g., the second process includes an error correction process, etc.).
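S6011 to S6013 amount to the small pipeline sketched below; `recognize_speech` and `second_processing` are hypothetical placeholders, since this embodiment does not fix a particular speech recognition system or second processing.

```python
def recognize_speech(audio: bytes) -> str:
    """Placeholder for any existing or future speech recognition system (S6012)."""
    raise NotImplementedError

def second_processing(text: str) -> str:
    """Placeholder second processing, e.g. error correction (S6013)."""
    return text

def speech_to_data_to_understand(audio: bytes) -> str:
    # S6012 then S6013: speech -> text -> target language data to be understood
    return second_processing(recognize_speech(audio))
```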
S602: input the target language data to be understood into the intention understanding model to obtain the predicted intention corresponding to the target language data to be understood output by the intention understanding model.
The intention understanding model is trained using any implementation of the intention understanding model training method provided in the embodiments of the present application. In addition, for the relevant content of the intention understanding model, please refer to Method Embodiment One above.
Based on the above-mentioned content related to S601 to S602, after the target language data to be understood is obtained, the target language data to be understood may be input into the intent understanding model trained by any embodiment of the method for training an intent understanding model provided above, so that the intent understanding model can perform intent understanding on the target language data to be understood, and obtain the predicted intent corresponding to the target language data to be understood output by the intent understanding model. The intention understanding model can accurately identify the user intention described by the user statement in the target language, so that the predicted intention corresponding to the target language to-be-understood data predicted by the intention understanding model can accurately represent the user intention described by the target language to-be-understood data, and the intention understanding accuracy of the user statement in the target language can be effectively improved.
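As a minimal illustration of S601 to S602, inference might look as follows, assuming the trained model maps an encoded sentence to intent scores; the encoder and label list are assumptions, not part of this embodiment.

```python
import torch

@torch.no_grad()
def understand_intent(model, encode, sentence: str, intent_labels: list) -> str:
    """encode: any text-to-tensor encoder matching the model's training setup."""
    scores = model(encode(sentence))              # predicted intent distribution
    return intent_labels[int(scores.argmax(-1))]  # most probable user intent
```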
To facilitate understanding the above intent understanding model training method and intent understanding method, a description is given below in connection with scene embodiments.
Scene embodiment
Assume that the target language is Cantonese; the auxiliary language is Mandarin; the application field of the intention understanding model (i.e., the above preset application field) is the navigation technical field; and the candidate intention data is navigation intention data, which describes user intentions in the navigation technical field. The number of navigation intention data items is not limited in the embodiments of the present application.
Based on the above assumptions, the intention understanding model can be used to perform intention understanding on Cantonese sentences, and its training process may specifically include steps 60 to 69:
Step 60: acquire the Cantonese real data and its actual intention, the Mandarin real data and its actual intention, and the navigation intention data.

Here, the actual intention of the Cantonese real data serves as the tag information of the Cantonese real data, and the actual intention of the Mandarin real data serves as the tag information of the Mandarin real data.

It should be noted that the embodiment of the present application does not limit how the Cantonese real data and its actual intention are acquired; for example, they may be legally crawled from web pages, or read from historical Cantonese dialogues stored in a human-computer interaction device. Likewise, the Mandarin real data and its actual intention may be legally crawled from web pages or read from historical Mandarin dialogues stored in a human-computer interaction device.

It should be further noted that the embodiment of the present application does not limit how the navigation intention data is acquired; for example, it may be generated from navigation intentions legally crawled from web pages, or from navigation history stored in a preset navigation application.
Step 61: perform Mandarin-to-Cantonese (Puyue) translation on the Mandarin real data to obtain the Cantonese translation data.

Here, Puyue translation means translating a Mandarin sentence into a Cantonese sentence. The embodiment of the present application is not limited to a particular implementation of Puyue translation; any method capable of translating Mandarin sentences into Cantonese sentences may be used. For example, a pre-constructed Puyue vocabulary mapping relation, which records the mapping between Mandarin words and Cantonese words, may be used. As another example, the translation may be performed by a pre-constructed language translation model with a Puyue translation function.
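A toy sketch of the vocabulary-mapping route follows; the three mapping entries are common Mandarin-Cantonese word pairs given only for illustration, and a practical system would also handle segmentation and word order.

```python
# Mandarin -> Cantonese word mapping (a tiny stand-in for the Puyue
# vocabulary mapping relation); entries are common word pairs.
PUYUE_MAP = {"的": "嘅", "没有": "冇", "看": "睇"}

def puyue_translate(mandarin_tokens):
    # Replace each Mandarin word that has a Cantonese counterpart.
    return [PUYUE_MAP.get(tok, tok) for tok in mandarin_tokens]

# e.g. puyue_translate(["我", "没有", "看"]) -> ["我", "冇", "睇"]
```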
Step 62: input the navigation intention data into a pre-constructed Cantonese data generation model to obtain the Cantonese generation data.

Here, the Cantonese data generation model can generate Cantonese generation data from the navigation intention data, and its training process is similar to that of the target language data generation model described above.
Step 63: input the Cantonese real data, the Mandarin real data, the Cantonese translation data and the Cantonese generation data into the intention understanding model to obtain the predicted intentions output by the model for each of the four data types.

Step 64: determine the cross entropy between the predicted intention corresponding to the Cantonese real data and the actual intention of the Cantonese real data as the prediction loss corresponding to the Cantonese real data.

Step 65: determine the cross entropy between the predicted intention corresponding to the Mandarin real data and the actual intention of the Mandarin real data as the prediction loss corresponding to the Mandarin real data.

Step 66: determine the KL divergence between the predicted intention corresponding to the Cantonese translation data and the predicted intention corresponding to the Mandarin real data as the prediction loss corresponding to the Cantonese translation data.

Step 67: determine the cross entropy between the predicted intention corresponding to the Cantonese generation data and the navigation intention data as the prediction loss corresponding to the Cantonese generation data.

Step 68: perform a weighted summation of the prediction losses corresponding to the Cantonese real data, the Mandarin real data, the Cantonese translation data and the Cantonese generation data to obtain the model prediction loss of the intention understanding model.
That is, step 68 may utilize equation (8) for the calculation.
$$Loss_{\text{model}} = \alpha L_{\text{real}}^{yue} + \beta L_{\text{real}}^{man} + \gamma L_{\text{trans}}^{yue} + \delta L_{\text{gen}}^{yue} \tag{8}$$

where $Loss_{\text{model}}$ is the model prediction loss of the intention understanding model; $L_{\text{real}}^{yue}$ is the prediction loss corresponding to the Cantonese real data and $\alpha$ is its weight; $L_{\text{real}}^{man}$ is the prediction loss corresponding to the Mandarin real data and $\beta$ is its weight; $L_{\text{trans}}^{yue}$ is the prediction loss corresponding to the Cantonese translation data and $\gamma$ is its weight; and $L_{\text{gen}}^{yue}$ is the prediction loss corresponding to the Cantonese generation data and $\delta$ is its weight.
Step 69: judge whether the preset stop condition is reached; if so, end the training process of the intention understanding model; if not, update the intention understanding model according to its model prediction loss and return to step 63.
Based on the related content of steps 60 to 69, in the embodiment of the present application, the training data required for training the intention understanding model (i.e., the Cantonese real data, Mandarin real data, Cantonese translation data and Cantonese generation data) can be generated by combining the tagged Cantonese real data, the tagged Mandarin real data and the navigation intention data, and the intention understanding model can then be trained on these training data.

Because the difference between Mandarin sentences and Cantonese sentences is small, the difference between the Mandarin real data and the Cantonese real data is also small, so the Mandarin real data and the Cantonese translation data generated from it can be used to expand the training data of the intention understanding model, effectively avoiding the adverse effects of having too little training data and improving the intention understanding performance of the model. In addition, because the navigation intention data accurately describes user intentions in the application field of the intention understanding model, the Cantonese generation data generated from it can represent user intentions in that field more accurately, so an intention understanding model trained with the Cantonese generation data can better predict user intentions in the application field, which likewise improves its intention understanding performance.
Based on the above assumption and the training process of the above intent understanding model, the use process of the intent understanding model includes steps 71-72:
Step 71: acquire the Cantonese data to be understood.

Here, the Cantonese data to be understood refers to the Cantonese sentence on which intention understanding needs to be performed; its acquisition process is similar to that of the target language data to be understood described above.

Step 72: input the Cantonese data to be understood into the intention understanding model trained using steps 60 to 69, and obtain the predicted intention corresponding to the Cantonese data to be understood output by the intention understanding model.

Based on the related content of steps 71 to 72, after the intention understanding model is trained using steps 60 to 69, the Cantonese data to be understood can be directly input into the model, which can then accurately determine the user intention described by that data, thereby improving the accuracy of intention understanding for Cantonese.
Based on the method for training the intention understanding model provided by the embodiment of the method, the embodiment of the application also provides a device for training the intention understanding model, which is explained and illustrated below with reference to the accompanying drawings.
Device Embodiment One

This device embodiment describes an intention understanding model training device; for the relevant content, please refer to the method embodiments above.
Referring to fig. 8, a schematic structural diagram of a model training device for understanding intention is provided in an embodiment of the present application.
The intent understanding model training device 800 provided in the embodiment of the present application includes:
a first obtaining unit 801, configured to obtain target language training data and auxiliary language training data;
a first prediction unit 802, configured to input the target language training data and the auxiliary language training data into an intention understanding model, and obtain a predicted intention corresponding to the target language training data and a predicted intention corresponding to the auxiliary language training data output by the intention understanding model;
a loss determination unit 803 for determining a model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data;
a model updating unit 804, configured to update the intent understanding model according to the model prediction loss of the intent understanding model, and return to the first predicting unit 802 to continue to perform the step of inputting the target language training data and the auxiliary language training data into the intent understanding model until a preset stop condition is reached.
In one possible implementation, the target language training data includes at least one of target language real data, target language translation data, and target language generation data; the target language translation data is obtained by translating auxiliary language real data; the target language generation data is generated from candidate intent data.
In one possible implementation manner, the acquiring process of the target language generation data is:
inputting the candidate intention data into a pre-constructed target language data generation model to obtain target language generation data output by the target language data generation model; the target language data generation model is trained by using target language marking data and auxiliary language marking data.
In one possible implementation manner, the construction process of the target language data generation model includes:
training the model to be trained by using the auxiliary language marking data to obtain an auxiliary language data generation model;
training the auxiliary language data generation model by utilizing the target language marking data to obtain the target language data generation model.
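In outline, this two-stage construction is sequential fine-tuning; the sketch below treats the model and the training routine as opaque placeholders, not as a fixed implementation.

```python
def build_target_language_generator(model, aux_labeled_data, tgt_labeled_data, train_fn):
    train_fn(model, aux_labeled_data)  # stage 1: auxiliary language data generation model
    train_fn(model, tgt_labeled_data)  # stage 2: fine-tune into the target language model
    return model
```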
In one possible implementation manner, the process of obtaining the target language translation data is:
determining target language words corresponding to the words to be translated according to a preset word mapping relation and the words to be translated in the auxiliary language real data, and determining the target language translation data according to the auxiliary language real data and the target language words corresponding to the words to be translated; the preset vocabulary mapping relation comprises a corresponding relation between the vocabulary to be translated and a target language vocabulary corresponding to the vocabulary to be translated;
or,
inputting the auxiliary language real data into a pre-constructed language translation model to obtain a translation result corresponding to the auxiliary language real data output by the language translation model, and determining the target language translation data according to the translation result corresponding to the auxiliary language real data.
In one possible implementation, the loss determination unit 803 includes:
a first determining subunit, configured to determine a target language prediction loss according to a prediction intention corresponding to the target language training data;
a second determining subunit, configured to determine an auxiliary language prediction loss according to the prediction intention corresponding to the auxiliary language training data;
And a third determining subunit, configured to determine a model prediction loss of the intent understanding model according to the target language prediction loss and the auxiliary language prediction loss.
In one possible implementation, when the target language training data includes target language real data, target language translation data, and target language generation data, and the predicted intention corresponding to the target language training data includes a predicted intention of the target language real data, a predicted intention of the target language translation data, and a predicted intention of the target language generation data, the intention understanding model training apparatus 800 further includes:
a third acquisition unit for acquiring an actual intention of the target language real data, a reference intention of the target language translation data, and an actual intention of the target language generation data;
the first determining subunit is specifically configured to: according to the predicted intention of the target language real data and the actual intention of the target language real data, determining the predicted loss corresponding to the target language real data; determining a prediction loss corresponding to the target language translation data according to the prediction intention of the target language translation data and the reference intention of the target language translation data; determining a prediction loss corresponding to the target language generation data according to the prediction intention of the target language generation data and the actual intention of the target language generation data; and determining the target language prediction loss according to the prediction loss corresponding to the target language real data, the prediction loss corresponding to the target language translation data and the prediction loss corresponding to the target language generation data.
Based on the intended understanding method provided by the method embodiments, the embodiments of the present application further provide an intended understanding device, which is explained and illustrated below with reference to the accompanying drawings.
Device Embodiment Two

This device embodiment describes an intention understanding device; for the relevant content, please refer to the method embodiments above.

Referring to fig. 9, a schematic structural diagram of an intention understanding device provided in an embodiment of the present application is shown.
The intent understanding device 900 provided in the embodiment of the present application includes:
a second obtaining unit 901, configured to obtain data to be understood of a target language;
the second prediction unit 902 is configured to input the target language data to be understood into the intent understanding model, and obtain a predicted intent corresponding to the target language data to be understood output by the intent understanding model; the intention understanding model is trained by any implementation mode of the intention understanding model training method provided by the embodiment of the application.
Further, an embodiment of the present application further provides an intent understanding model training apparatus, including: a processor, memory, system bus;
the processor and the memory are connected through the system bus;
The memory is for storing one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform any of the embodiments of the intent understanding model training method described above.
Further, an embodiment of the present application further provides an intent understanding apparatus, including: a processor, memory, system bus;
the processor and the memory are connected through the system bus;
the memory is for storing one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform any of the embodiments of the intent understanding method described above.
Further, the embodiments of the present application also provide a computer readable storage medium having instructions stored therein, which when executed on a terminal device, cause the terminal device to perform any one of the embodiments of the above-described intent understanding model training method, or to perform any one of the embodiments of the above-described intent understanding method.
Further, the embodiments of the present application also provide a computer program product, which when run on a terminal device, causes the terminal device to perform any one of the above-described methods of training an intent understanding model, or to perform any one of the above-described methods of understanding intent.
From the above description of embodiments, it will be apparent to those skilled in the art that all or part of the steps of the above described example methods may be implemented in software plus necessary general purpose hardware platforms. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a server, or a network communication device such as a media gateway, etc.) to perform the methods described in the embodiments or some parts of the embodiments of the present application.
It should be noted that, in the present description, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different manner from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
It is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. An intent understanding model training method, the method comprising:
acquiring target language training data and auxiliary language training data;
inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a predicted intention corresponding to the target language training data and a predicted intention corresponding to the auxiliary language training data output by the intention understanding model;
determining model prediction loss of the intention understanding model according to the prediction intention corresponding to the target language training data and the prediction intention corresponding to the auxiliary language training data; the determining the model prediction loss of the intention understanding model according to the predicted intention corresponding to the target language training data and the predicted intention corresponding to the auxiliary language training data comprises the following steps: determining target language prediction loss according to the prediction intention corresponding to the target language training data; determining an auxiliary language prediction loss according to the prediction intention corresponding to the auxiliary language training data; determining a model predictive loss of the intent understanding model based on the target language predictive loss and the auxiliary language predictive loss;
Updating the intention understanding model according to the model prediction loss of the intention understanding model, and continuing to execute the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
2. The method of claim 1, wherein the target language training data comprises at least one of target language real data, target language translation data, and target language generation data; the target language translation data is obtained by translating auxiliary language real data; the target language generation data is generated from candidate intent data.
3. The method according to claim 2, wherein the target language generation data is obtained by:
inputting the candidate intention data into a pre-constructed target language data generation model to obtain target language generation data output by the target language data generation model; the target language data generation model is trained by using target language marking data and auxiliary language marking data.
4. A method according to claim 3, wherein the process of constructing the target language data generation model comprises:
Training the model to be trained by using the auxiliary language marking data to obtain an auxiliary language data generation model;
training the auxiliary language data generation model by utilizing the target language marking data to obtain the target language data generation model.
5. The method according to claim 2, wherein the obtaining process of the target language translation data is:
determining target language words corresponding to the words to be translated according to a preset word mapping relation and the words to be translated in the auxiliary language real data, and determining the target language translation data according to the auxiliary language real data and the target language words corresponding to the words to be translated; the preset vocabulary mapping relation comprises a corresponding relation between the vocabulary to be translated and a target language vocabulary corresponding to the vocabulary to be translated;
or,
inputting the auxiliary language real data into a pre-constructed language translation model to obtain a translation result corresponding to the auxiliary language real data output by the language translation model, and determining the target language translation data according to the translation result corresponding to the auxiliary language real data.
6. The method of claim 1, wherein when the target language training data includes target language real data, target language translation data, and target language generation data, and the predicted intent corresponding to the target language training data includes a predicted intent of the target language real data, a predicted intent of the target language translation data, and a predicted intent of the target language generation data, the method further comprises:
acquiring actual intention of real data of a target language, reference intention of translation data of the target language and actual intention of generation data of the target language;
the determining the target language prediction loss according to the prediction intention corresponding to the target language training data comprises the following steps:
according to the predicted intention of the target language real data and the actual intention of the target language real data, determining the predicted loss corresponding to the target language real data;
determining a prediction loss corresponding to the target language translation data according to the prediction intention of the target language translation data and the reference intention of the target language translation data;
determining a prediction loss corresponding to the target language generation data according to the prediction intention of the target language generation data and the actual intention of the target language generation data;
And determining the target language prediction loss according to the prediction loss corresponding to the target language real data, the prediction loss corresponding to the target language translation data and the prediction loss corresponding to the target language generation data.
7. An intended understanding method, the method comprising:
acquiring data to be understood of a target language;
inputting the target language to-be-understood data into the intention understanding model to obtain a predicted intention corresponding to the target language to-be-understood data output by the intention understanding model; wherein the intention understanding model is trained using the intention understanding model training method according to any one of claims 1 to 6.
8. An intent understanding model training apparatus, the apparatus comprising:
the first acquisition unit is used for acquiring target language training data and auxiliary language training data;
the first prediction unit is used for inputting the target language training data and the auxiliary language training data into an intention understanding model to obtain a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data output by the intention understanding model;
A loss determination unit, configured to determine a model prediction loss of the intention understanding model according to a prediction intention corresponding to the target language training data and a prediction intention corresponding to the auxiliary language training data; the loss determination unit includes: a first determining subunit, configured to determine a target language prediction loss according to a prediction intention corresponding to the target language training data; a second determining subunit, configured to determine an auxiliary language prediction loss according to the prediction intention corresponding to the auxiliary language training data; a third determining subunit configured to determine a model prediction loss of the intent understanding model according to the target language prediction loss and the auxiliary language prediction loss;
and the model updating unit is used for updating the intention understanding model according to the model prediction loss of the intention understanding model, returning to the first prediction unit, and continuously executing the step of inputting the target language training data and the auxiliary language training data into the intention understanding model until a preset stop condition is reached.
9. An intended understanding device, the device comprising:
the second acquisition unit is used for acquiring the data to be understood of the target language;
The second prediction unit is used for inputting the target language to-be-understood data into the intention understanding model to obtain a prediction intention corresponding to the target language to-be-understood data output by the intention understanding model; wherein the intention understanding model is trained using the intention understanding model training method according to any one of claims 1 to 6.
CN202011500085.1A 2020-12-17 2020-12-17 Method and device for training intention understanding model, and method and device for intention understanding Active CN112528679B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011500085.1A CN112528679B (en) 2020-12-17 2020-12-17 Method and device for training intention understanding model, and method and device for intention understanding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011500085.1A CN112528679B (en) 2020-12-17 2020-12-17 Method and device for training intention understanding model, and method and device for intention understanding

Publications (2)

Publication Number Publication Date
CN112528679A CN112528679A (en) 2021-03-19
CN112528679B true CN112528679B (en) 2024-02-13

Family

ID=75001363

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011500085.1A Active CN112528679B (en) 2020-12-17 2020-12-17 Method and device for training intention understanding model, and method and device for intention understanding

Country Status (1)

Country Link
CN (1) CN112528679B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113223502B (en) * 2021-04-28 2024-01-30 平安科技(深圳)有限公司 Speech recognition system optimization method, device, equipment and readable storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108172218A (en) * 2016-12-05 2018-06-15 中国移动通信有限公司研究院 A kind of pronunciation modeling method and device
CN108920622A (en) * 2018-06-29 2018-11-30 北京奇艺世纪科技有限公司 A kind of training method of intention assessment, training device and identification device
CN109543190A (en) * 2018-11-29 2019-03-29 北京羽扇智信息科技有限公司 A kind of intension recognizing method, device, equipment and storage medium
CN109785824A (en) * 2019-03-15 2019-05-21 科大讯飞股份有限公司 A kind of training method and device of voiced translation model
CN111046674A (en) * 2019-12-20 2020-04-21 科大讯飞股份有限公司 Semantic understanding method and device, electronic equipment and storage medium
CN111368559A (en) * 2020-02-28 2020-07-03 北京字节跳动网络技术有限公司 Voice translation method and device, electronic equipment and storage medium
WO2020177282A1 (en) * 2019-03-01 2020-09-10 平安科技(深圳)有限公司 Machine dialogue method and apparatus, computer device, and storage medium
CN111931517A (en) * 2020-08-26 2020-11-13 腾讯科技(深圳)有限公司 Text translation method and device, electronic equipment and storage medium
CN112016271A (en) * 2019-05-30 2020-12-01 北京三星通信技术研究有限公司 Language style conversion model training method, text processing method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8521507B2 (en) * 2010-02-22 2013-08-27 Yahoo! Inc. Bootstrapping text classifiers by language adaptation

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Julia Kiseleva; Hoang Thanh Lam; Mykola Pechenizkiy; Toon Calders. Predicting Current User Intent with Contextual Markov Models. 2013 IEEE 13th International Conference on Data Mining Workshops, 2014, full text. *
Lu Jinliang; Zhang Jiajun. A translation quality estimation method based on multilingual pre-trained language models. Journal of Xiamen University (Natural Science Edition), No. 02, full text. *
Chen Zhenshun; Liu Jianming. An intention-based neural network dialogue model. Journal of Guilin University of Electronic Technology, No. 05, full text. *

Similar Documents

Publication Title
US20200160836A1 (en) Multi-dialect and multilingual speech recognition
CN112712804B (en) Speech recognition method, system, medium, computer device, terminal and application
WO2019200923A1 (en) Pinyin-based semantic recognition method and device and human-machine conversation system
CN109979432B (en) Dialect translation method and device
CN114580382A (en) Text error correction method and device
CN108536807B (en) Information processing method and device
CN111401064B (en) Named entity identification method and device and terminal equipment
CN111489746B (en) Power grid dispatching voice recognition language model construction method based on BERT
CN115357719B (en) Power audit text classification method and device based on improved BERT model
CN113327574B (en) Speech synthesis method, device, computer equipment and storage medium
Zhu et al. CATSLU: The 1st Chinese audio-textual spoken language understanding challenge
Ghannay et al. Where are we in semantic concept extraction for Spoken Language Understanding?
CN114492470A (en) Commodity title text translation method and device, equipment, medium and product thereof
CN112528679B (en) Method and device for training intention understanding model, and method and device for intention understanding
US11551012B2 (en) Apparatus and method for providing personal assistant service based on automatic translation
CN110717316B (en) Topic segmentation method and device for subtitle dialog flow
CN116821285A (en) Text processing method, device, equipment and medium based on artificial intelligence
CN115116428B (en) Prosodic boundary labeling method, device, equipment, medium and program product
Liu et al. Building Mongolian TTS front-end with encoder-decoder model by using bridge method and multi-view features
Domokos et al. Romanian phonetic transcription dictionary for speeding up language technology development
Kipyatkova et al. Experimenting with attention mechanisms in joint CTC-attention models for Russian speech recognition
Sreeram et al. A Novel Approach for Effective Recognition of the Code-Switched Data on Monolingual Language Model.
CN111090720B (en) Hot word adding method and device
CN111489742B (en) Acoustic model training method, voice recognition device and electronic equipment
Liu et al. The BERT-BiLSTM-CRF question event information extraction method

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant