CN112650834B - Intention model training method and device

Intention model training method and device

Info

Publication number
CN112650834B
CN112650834B (application CN202011573198.4A)
Authority
CN
China
Prior art keywords
intention
data
model
current
intent
Prior art date
Legal status
Active
Application number
CN202011573198.4A
Other languages
Chinese (zh)
Other versions
CN112650834A (en)
Inventor
简仁贤
王海波
马永宁
Current Assignee
Emotibot Technologies Ltd
Original Assignee
Emotibot Technologies Ltd
Priority date
Filing date
Publication date
Application filed by Emotibot Technologies Ltd
Priority to CN202011573198.4A
Publication of CN112650834A
Application granted
Publication of CN112650834B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The application provides an intent model training method and device, the method comprising: constructing a non-intent data set assembled from data unrelated to the intents of the intent model; training with the current intent data set to obtain a current intent model; predicting on the non-intent data set with the current intent model to obtain a current prediction result; and, when the current prediction result satisfies the iteration stop condition, taking the current intent model as the optimal intent model. The technical scheme provided by the embodiments of the application reduces the probability of false triggering of the intent model and yields a trained intent model with good robustness.

Description

Intention model training method and device
Technical Field
The application relates to the technical field of natural language processing, and in particular to an intent model training method and device.
Background
In natural language processing systems, an intent model typically needs to recognize inputs that do not belong to its intents, so as to prevent the model from being falsely triggered with unwanted consequences. Unlike the controlled inputs of an experimental environment, the inputs of real users span an open-ended space and are mixed with all kinds of noise; sometimes they are even malicious, crafted to attack the model for specific purposes.
To reduce the probability of false triggering, the common practice is to write, for the intent model, a portion of corpus that does not belong to its intents and train it together with the corpus that does, so that inputs outside the model's intents can be recognized to some extent. However, this hand-written corpus is very limited compared with the complexity of the real world, and it is difficult to train a model with good robustness from it. Moreover, even if labor cost is ignored and a large corpus is written to address that complexity, a corpus imbalance problem arises: the trained model carries a severe bias and no longer represents its intended intents.
Disclosure of Invention
Therefore, embodiments of the application provide an intent model training method that not only reduces the probability of false triggering of the intent model but also yields a model with good robustness.
An embodiment of the application provides an intent model training method comprising the following steps: constructing a non-intent data set assembled from data unrelated to the intents of the intent model; training with the current intent data set to obtain a current intent model; predicting on the non-intent data set with the current intent model to obtain a current prediction result; and, when the current prediction result satisfies the iteration stop condition, taking the current intent model as the optimal intent model.
In one embodiment, constructing the non-intent data set comprises: randomly selecting articles unrelated to the intents of the intent model; randomly extracting sentences from the articles as non-intent data; and assembling the non-intent data into a non-intent data set.
In one embodiment, constructing the non-intent data set comprises: randomly selecting user query logs unrelated to the intents of the intent model as non-intent data; and assembling the non-intent data into a non-intent data set.
In one embodiment, constructing the non-intent data set comprises: randomly selecting seed questions unrelated to the intents of the intent model; inputting the seed questions into a question-answering model and taking the outputs as non-intent data; and assembling the non-intent data into a non-intent data set.
In one embodiment, inputting the seed questions into the question-answering model and taking the outputs as non-intent data comprises: inputting the seed questions into the question-answering model to obtain similar questions; and inputting the seed questions together with the similar questions into the question-answering model and taking the outputs as non-intent data.
In one embodiment, the first current intent data set is assembled from the initial intent data and N1 items of non-intent data.
In one embodiment, when the current prediction result does not satisfy the iteration stop condition, N2 items of data that caused the condition to fail are selected and added to the current intent data set to form the next current intent data set.
In one embodiment, predicting on the non-intent data set with the current intent model to obtain a current prediction result comprises: the current intent model predicts whether each item in the non-intent data set is related to the current intent model, marking an item as related data when it is; the amount of related data is then counted and taken as the current prediction result.
In one embodiment, selecting N2 items of data that caused the iteration stop condition to fail comprises: calculating the proportion of related data in the non-intent data set and, when the proportion exceeds a preset proportion threshold, selecting N2 items of related data to add to the current intent data set.
An embodiment of the application provides an intent model training device comprising: a construction module for constructing a non-intent data set assembled from data unrelated to the intents of the intent model; a training module for training with the current intent data set to obtain a current intent model; a prediction module for predicting on the non-intent data set with the current intent model to obtain a current prediction result; and an iteration stop module for taking the current intent model as the optimal intent model when the current prediction result satisfies the iteration stop condition.
According to the technical scheme provided by the embodiments of the application, a non-intent data set can be constructed to approximately represent the inexhaustible semantic space unrelated to the intent model; the problem of extreme data imbalance is then addressed by an iterative-sampling training method, finally yielding a high-quality corpus of expressions outside the model's intents. A model trained in this way can reliably recognize, in real scenarios, inputs that do not belong to its intents, and therefore has strong robustness.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It is evident that the drawings in the following description show only some embodiments of the present application, and that other drawings may be obtained from them without inventive effort by a person of ordinary skill in the art. The above and other objects, features and advantages of the present application will become more apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not necessarily drawn to scale; emphasis is instead placed on illustrating the principles of the application.
FIG. 1 is a schematic flow chart of an intent model training method according to an embodiment of the present application;
FIG. 2 is a flowchart of an intent model training method according to another embodiment of the present application;
FIG. 3 is a block diagram of an intent model training apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
Like reference numerals and letters denote like items in the figures; thus, once an item is defined in one figure, it need not be defined or explained again in subsequent figures.
In natural language processing systems, an intent model needs to identify inputs that do not belong to its intents (i.e., "other-class intents"). For example, in an IoT (Internet of Things) scenario, one intent model must decide whether the user is saying "turn off the light" or expressing an "other-class intent" (an expression that does not mean turning off the light). Expressions of "turn off the light", although varied, are relatively limited and concentrated on high-frequency forms. Expressions of "other-class intents", that is, expressions that do not mean turning off the light, are inexhaustible and impossible to enumerate. As a result, in real scenarios the intent model is often falsely triggered: without targeted training, a user saying "turn off the heating" may be recognized as the "turn off the light" intent. Since "turn off the heating" clearly belongs to the "other-class intents", this is a false trigger, followed by an erroneous instruction to turn off the light.
In view of this problem, the common practice is to train the intent model in a targeted manner: some other-class intents are written manually and trained in combination with the model's initial corpus. This alleviates the probability of false triggering to some extent, but the effort of manually composing other-class intents is very high. On the other hand, even if the workload is ignored and a large number of other-class intents are written manually, problems remain. Taking the IoT scenario as an example, suppose an intent recognition model must decide whether the user is saying "turn on the light" or expressing an "other-class intent": one may write financial expressions such as "make a deposit" and "buy a fund", daily-life expressions such as "cook rice" and "cook noodles", and pool them together as the "other-class intents". Manually written other-class intents, however, can never cover all expressions that do not mean turning on the light. One could of course keep writing, regardless of cost, so that the other-class intents approximate all such expressions, but this brings a new problem: the "other-class intents" corpus far exceeds the model's initial corpus, and training on the combination produces a severely biased model without good robustness.
To solve the above problems it is necessary, on the one hand, to change the way other-class intents are acquired so that they can approximately characterize the infinite semantic space; on the other hand, the amounts of the model's initial corpus and the other-class corpus must be balanced. A model trained in this way can, with high probability, identify inputs in real scenarios that do not belong to its intents, improving the user experience. Embodiments of the application provide an intent model training method and device, an electronic device, and a computer-readable storage medium to optimize model training and obtain an optimal intent model. The technique can be implemented with software, hardware, or a combination of the two, as described in detail in the embodiments below.
Referring to FIG. 1, an embodiment of the present application provides an intent model training method applied to an electronic device, the method comprising:
Step S101: construct a non-intent data set assembled from data unrelated to the intents of the intent model.
In this embodiment, the non-intent data set may be constructed as follows: randomly select articles unrelated to the intents of the intent model, for example articles crawled at random from the web, as long as they are unrelated to the intents; then randomly extract sentences from those articles as non-intent data; finally, assemble the non-intent data into a non-intent data set.
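As a minimal sketch of this first construction (assuming the crawling step has already produced a list of article strings; the function name, parameters and sentence splitter are illustrative, not from the patent):

```python
import random
import re

def build_non_intent_set(articles, per_article=5):
    """Randomly extract sentences from intent-unrelated articles to
    approximate the inexhaustible 'other-class intents' space."""
    non_intent_data = []
    for text in articles:
        # Naive sentence split on common Chinese and Western terminators.
        sentences = [s.strip() for s in re.split(r"[。！？!?.]", text) if s.strip()]
        non_intent_data.extend(random.sample(sentences, min(per_article, len(sentences))))
    return non_intent_data
```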
In this embodiment, the non-intent data set may also be constructed as follows: randomly select user query logs unrelated to the intents of the intent model as non-intent data, and assemble them into a non-intent data set. The user query log may be a user record of any system with a query function, such as the query records of a search engine, the query logs of various databases, or the query records of an AI customer service system.
In this embodiment, the non-intent data set may also be constructed as follows: randomly select seed questions unrelated to the intents of the intent model; input the seed questions into a question-answering model and take the outputs as non-intent data; finally, assemble the non-intent data into a non-intent data set. The seed questions may be preset, for example manually written questions unrelated to the intents; a question is then drawn at random and fed into the question-answering model to obtain multiple outputs. In some cases, feeding a seed question into the question-answering model also yields similar questions (questions similar to the seed), so the seed question and its similar questions can be fed into the question-answering model together to obtain even more outputs.
In this embodiment, the non-intent data set may also be constructed as follows: randomly extract one or more seed questions from the preset seed questions, input them into a search engine, and randomly extract the titles shown on the results page, or grab the related questions shown at the bottom of the page, as questions similar to the seeds. Then input the seed and similar questions into the search engine separately and randomly extract content from the results pages as the final outputs, i.e., the non-intent data. Finally, assemble the non-intent data into a non-intent data set.
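A sketch of this search-engine variant; `fetch_titles` and `fetch_snippets` stand in for whatever scraping routines are actually used, since the patent names no concrete search API:

```python
import random

def expand_via_search(seed_questions, fetch_titles, fetch_snippets, n_seeds=1):
    """Seed questions -> similar questions (result-page titles, bottom-of-page
    questions) -> result-page content, collected as non-intent data."""
    seeds = random.sample(seed_questions, n_seeds)
    # Result titles and related questions serve as similar questions.
    similar = [title for q in seeds for title in fetch_titles(q)]
    non_intent_data = []
    for q in seeds + similar:
        snippets = fetch_snippets(q)
        if snippets:
            # Randomly extract content from each results page.
            non_intent_data.append(random.choice(snippets))
    return non_intent_data
```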
A non-intent data set constructed in any of the above ways can already approximate the inexhaustible semantic space of intents unrelated to the intent model (the "other-class intents"). Training such a non-intent data set together with the model's initial intent data yields an intent model that can recognize that inexhaustible space to some extent.
Step S102: train with the current intent data set to obtain a current intent model.
In the embodiment of the application, the first current intent data set is assembled from the initial intent data and N1 items of non-intent data. The initial intent data may consist only of data related to the model's intents, or may combine a large amount of related data with a small amount of unrelated data. Subsequent current intent data sets are generated by the later steps. Each current intent data set is trained to obtain a current intent model; the training may use, for example, fasttext or a deep pyramid CNN (DPCNN).
Clearly, the first current intent data set suffers no data imbalance: N1 can be controlled so that the initial intent data and the sampled non-intent data reach a relatively balanced state, and the first current intent model therefore acquires no large bias.
To balance the first current intent data set further, N1 may be set to the ratio of the total amount of initial intent data to the number of initial intent categories, i.e., the average amount of data per category.
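Under this choice of N1, assembling the first current intent data set might look like the following sketch (the "other" label for non-intent items is an illustrative convention):

```python
import random

def first_current_dataset(initial_intent_data, non_intent_set):
    """initial_intent_data: list of (text, label) pairs across several
    categories; returns the first current intent data set and N1."""
    categories = {label for _, label in initial_intent_data}
    # N1 = total amount of initial intent data / number of categories,
    # i.e. the average per-category size, keeping the classes balanced.
    n1 = len(initial_intent_data) // len(categories)
    sampled = random.sample(non_intent_set, n1)
    return initial_intent_data + [(text, "other") for text in sampled], n1
```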
Step S103: predict on the non-intent data set with the current intent model to obtain a current prediction result.
In the embodiment of the application, the current intent model may predict using, for example, fasttext or a deep pyramid CNN. The current intent model predicts each item in the non-intent data set, yielding for each item a judgment of whether it is related to the current intent model. The electronic device marks items judged related as related data and counts their amount as the current prediction result. Ideally, a perfect model would judge every item in the non-intent data set unrelated to it, but this is very difficult to achieve. The requirement can therefore be relaxed to obtain a relatively good model, i.e., the optimal intent model, by constraining the current prediction result.
Specifically, when the current prediction result does not satisfy the iteration stop condition, N2 items of data that caused the condition to fail are selected and added to the current intent data set to form the next current intent data set. The constraint on the current prediction result may be, for example: judge whether the amount of related data exceeds a preset amount threshold, and, if it does, select N2 items of related data to add to the current intent data set and form the next current intent data set. Alternatively: calculate the proportion of related data in the non-intent data set, judge whether it exceeds a preset proportion threshold, and, if it does, select N2 items of related data to add to the current intent data set and form the next current intent data set.
N1, N2, the preset amount threshold and the preset proportion threshold are all hyperparameters.
N2 may be one fifth of N1, so that even after many iterations the amount of non-intent data in the current intent data set does not grow excessive and stays close to the average per-category amount of the initial intent data, avoiding the data imbalance problem. When the amount of related data is less than one fifth of N1, N2 is set to the amount of related data, i.e., all the related data is added to the current intent data set to form the next current intent data set.
The preset proportion threshold may be set between 1% and 5%. As noted above, since a perfect model is unattainable, the model's prediction result is controlled within a relatively reasonable range so that an optimal intent recognition model can be obtained. Because the application constructs a non-intent data set that approximates the infinite semantic space, even a related-data proportion of 5% means the model robustly recognizes more than 95% of the data; in practical scenarios that 95% still approximates an infinite semantic space, so most inputs outside the model's intents will be recognized.
In the embodiment of the application, the constructed non-intent data set may be shuffled at random and N1 items selected by uniformly distributed random sampling. Likewise, the data that caused the iteration stop condition to fail may be shuffled at random and N2 items selected by uniformly distributed random sampling. In this way, the other-class intents in each round's current intent data set are obtained by random sampling, with no human intervention as a confounding factor, which further improves the generalization of the trained model.
Step S104: when the current prediction result satisfies the iteration stop condition, take the current intent model as the optimal intent model.
In the embodiment of the application, the electronic device may, for example, judge whether the amount of related data exceeds the preset amount threshold, and take the current intent model as the optimal intent model when the amount is less than or equal to the threshold. Alternatively, it may calculate the proportion of related data in the non-intent data set and take the current intent model as the optimal intent model when the proportion is less than or equal to the preset proportion threshold.
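Putting steps S101-S104 together, the whole iteration can be sketched as follows; `train_fn` and `predict_fn` are placeholders for the chosen classifier (the patent mentions fasttext and deep pyramid CNN but mandates neither), and the defaults reflect the hyperparameter choices described above:

```python
import random

def train_optimal_intent_model(initial_intent_data, non_intent_set,
                               train_fn, predict_fn,
                               ratio_threshold=0.05, max_rounds=50):
    """train_fn(dataset) -> model; predict_fn(model, texts) -> list of bool,
    True meaning the text was predicted as one of the model's own intents."""
    categories = {label for _, label in initial_intent_data}
    n1 = len(initial_intent_data) // len(categories)   # average class size
    pool = list(non_intent_set)
    random.shuffle(pool)                               # uniform random sampling
    current = initial_intent_data + [(t, "other") for t in pool[:n1]]

    model = train_fn(current)                          # step S102
    for _ in range(max_rounds):
        flags = predict_fn(model, pool)                # step S103
        related = [t for t, hit in zip(pool, flags) if hit]
        if len(related) / len(pool) <= ratio_threshold:
            return model                               # step S104: stop
        # Otherwise add N2 = min(|related|, N1 / 5) badcases as "other".
        n2 = min(len(related), n1 // 5)
        current = current + [(t, "other") for t in random.sample(related, n2)]
        model = train_fn(current)                      # retrain (step S102)
    return model
```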
Through steps S101-S104, a non-intent data set approximating an inexhaustible semantic space is constructed first, and the problem of extreme data imbalance is then effectively relieved by the sampling-and-iterating training method. Through the combination of the two, the trained intent model gradually learns to recognize the approximately inexhaustible semantic space, greatly reducing misrecognition. In real scenarios, the probability of the intent model being falsely triggered by inaccurate user input, ambient noise and the like is thus greatly reduced.
Referring to FIG. 2, an embodiment of the present application provides an intent model training method applied to an electronic device, which may be used, for example, in an IoT scenario.
In the embodiment of the application, one intent model in the IoT scenario must recognize three intents: "turn on the light", "turn off the light", and "other-class intents", where the other-class intents are all semantic expressions unrelated to turning the light on or off, whose forms are practically inexhaustible. To strengthen the model's ability to recognize other-class intents, the method may proceed as follows:
step 201: randomly extracting a seed problem from the preset seed problems.
The pre-set seed questions may be considered to be written, for example, "i do not want to cook, turn on air conditioner how to operate, turn up the volume a little bit is possible, i do not want to buy a fund, how to buy a high-speed railway ticket to Beijing … …". Then, the seed problem of how to operate the air conditioner is extracted through a random algorithm.
Step 202: the seed questions are input into the search engine to obtain similar questions.
The seed problem of how to operate the air conditioner is input into software with searching functions like hundred degrees, google, knowledgeable, today's top-hat and the like, and a searching page can be obtained. Similar problems to the seed problem can be obtained by randomly extracting some title and bottom similar problems in the search page. To be able to get more similar questions, the similar questions can be re-entered into the search engine, then the title and bottom similar questions can be randomly extracted, and these can also be similar questions.
Step S203: the seed problem and the similar problem are input to the search engine, respectively, output results are obtained as unintended data, and are constructed into unintended data sets.
The seed problem and the similar problem are input into the software with the searching function again, and any content in the searching result page is extracted. All the obtained output results are pooled together to form the unintended dataset.
Step S204: randomly shuffle the non-intent data set and select 100 items from it by uniformly distributed random sampling.
In the embodiment of the present application, the initial intent data may be divided into 10 categories with an average of 100 items each, so 100 items are selected from the non-intent data set as the 11th category, i.e., the other-class intents.
Step S205: assemble the initial intent data and the 100 selected non-intent items into the first current intent data set.
Step S206: train on the current intent data set to obtain a current intent model.
In the embodiment of the application, the first current intent data set is trained to obtain the first current intent model. Likewise, in each subsequent iteration, that round's current intent data set is trained to obtain a current intent model. The training method here may be a deep pyramid CNN.
Step S207: predict on the non-intent data set with the current intent model to obtain a prediction result.
In the embodiment of the present application, the current intent model predicts each item in the non-intent data set; the prediction method may be a deep pyramid CNN. If an item is predicted to be related to the current intent model (i.e., to the turn-on/turn-off intents), it is marked as related data; otherwise it is left unmarked. After all items in the non-intent data set have been predicted, the number marked as related data is counted and its ratio to the size of the non-intent data set is calculated; this ratio is the prediction result.
For the first current intent model, prediction found that 100 of the 500 items in the non-intent data set were related to it (these 100 items are also called badcases), giving a ratio of related data to non-intent data of 20%. The prediction result for the first current intent model is therefore a related-data proportion of 20%.
Step S208: judge whether the prediction result is less than or equal to a preset proportion threshold.
In the embodiment of the application, the preset proportion threshold is set to 5%.
For the first current intent model, the prediction result is a related-data proportion of 20%, far greater than the preset threshold, so step S209 is executed.
In subsequent training and prediction rounds, once the related-data proportion is less than or equal to 5%, step S210 is executed.
Step S209: when the prediction result exceeds the preset proportion threshold, select 20 items of related data to add to the current intent data set, forming the next current intent data set, and execute steps S206-S208 again.
In the embodiment of the application, since the initial intent data averages 100 items per category, 20 items of related data (one fifth of the per-category average) are added to the current intent data set, so the other-class intents in the next current intent data set comprise 120 items, still not far from the per-category average of the initial intent data.
For the first current intent model, 100 items were predicted to be related to it; these 100 items are therefore randomly shuffled, and 20 of them are selected by random sampling and added to the first current intent data set, forming the second current intent data set. Steps S206-S208 are then executed for the second current intent data set.
Step S210: when the prediction result is less than or equal to the preset proportion threshold, take the current intent model as the optimal intent model.
In the embodiment of the present application, steps S206-S208 may be executed many times, i.e., the intent model may be iterated repeatedly through the training and prediction steps.
For the second current intent model, 50 items were predicted to be related, a related-data proportion of 10%, greater than the 5% threshold. The 50 items are therefore randomly shuffled, 20 of them are selected by random sampling and added to the second current intent data set, and the third current intent data set is formed. Repeating this process may produce a fourth current intent data set, a fifth current intent data set, ..., an Nth current intent data set.
For the fourth current intent model, 38 items were predicted to be related, a proportion of 7.6%, still greater than the 5% threshold, so steps S206-S208 are executed again. For the fifth current intent model, 20 items were predicted to be related, a proportion of 4%, below the 5% threshold. At this point, after continuous iteration, the fifth model's related-data proportion is acceptable, that is, the number of badcases has been controlled within an acceptable range. The fifth current intent model is therefore taken as the optimal intent model.
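A hypothetical invocation matching this worked example, reusing the `train_optimal_intent_model` sketch from the previous section (the DPCNN training and inference helpers are assumed, not defined in the patent):

```python
# 10 intent categories of ~100 examples each gives N1 = 100; the 500-item
# non-intent set and the 5% threshold reproduce the iteration trace above.
optimal_model = train_optimal_intent_model(
    initial_intent_data,        # 10 categories x ~100 labelled examples
    non_intent_set,             # 500 search-derived non-intent sentences
    train_fn=train_dpcnn,       # assumed deep pyramid CNN training helper
    predict_fn=predict_dpcnn,   # assumed deep pyramid CNN inference helper
    ratio_threshold=0.05,
)
```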
The non-intent data set described above approximately characterizes an inexhaustible semantic space, containing massive amounts of data that mean neither "turn off the light" nor "turn on the light", so the trained intent model can approximately recognize that space, identifying intents that are neither "turn off" nor "turn on" and greatly reducing misrecognition. In the IoT scenario, the "turn off the light" and "turn on the light" intents are recognized well, and expressions outside them are also recognized as such, reducing the probability of false triggering of the intent model.
Referring to FIG. 3, an embodiment of the present application provides an intent model training apparatus comprising: a construction module 301, a training module 302, a prediction module 303, an iteration stop module 304 and an iteration module 305.
The construction module 301 is configured to construct a non-intent data set assembled from data unrelated to the intents of the intent model.
The training module 302 is configured to train with the current intent data set to obtain a current intent model.
The prediction module 303 is configured to predict on the non-intent data set with the current intent model to obtain a current prediction result.
The iteration stop module 304 is configured to take the current intent model as the optimal intent model when the current prediction result satisfies the iteration stop condition.
In one embodiment, the construction module 301 is configured to randomly select articles unrelated to the intents of the intent model, randomly extract sentences from the articles as non-intent data, and assemble the non-intent data into a non-intent data set.
In one embodiment, the construction module 301 is configured to randomly select user query logs unrelated to the intents of the intent model as non-intent data, and assemble the non-intent data into a non-intent data set.
In one embodiment, the construction module 301 is configured to randomly select seed questions unrelated to the intents of the intent model, input the seed questions into a question-answering model and take the outputs as non-intent data, and assemble the non-intent data into a non-intent data set.
In one embodiment, the construction module 301 is configured to input the seed questions into the question-answering model to obtain similar questions, and to input the seed questions together with the similar questions into the question-answering model, taking the outputs as non-intent data.
In one embodiment, the first current intent data set is assembled from the initial intent data and N1 items of non-intent data.
In an embodiment, the apparatus further comprises an iteration module 305 configured to, when the current prediction result does not satisfy the iteration stop condition, select N2 items of data that caused the condition to fail and add them to the current intent data set to form the next current intent data set.
In one embodiment, the prediction module 303 is configured to predict, with the current intent model, whether each item in the non-intent data set is related to the current intent model, mark an item as related data when it is, and count the amount of related data as the current prediction result.
In an embodiment, the iteration module 305 is configured to calculate the proportion of related data in the non-intent data set and, when the proportion exceeds a preset proportion threshold, select N2 items of related data to add to the current intent data set.
N1, N2 and the preset proportion threshold are hyperparameters.
N1 is the ratio of the total amount of initial intent data to the number of initial intent categories.
When the amount of related data is less than one fifth of N1, N2 is the amount of related data; when it is greater than or equal to one fifth of N1, N2 is one fifth of N1.
The preset proportion threshold is 1% to 5%.
The implementation of the functions and roles of each module in the above apparatus is detailed in the implementation of the corresponding steps of the intent model training method above and is not repeated here.
Referring to FIG. 4, an embodiment of the present application provides an electronic device 400 comprising a processor 401 and a memory 402 for storing instructions executable by the processor 401, wherein the processor 401 is configured to execute the intent model training method of any of the above embodiments.
The processor 401 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and it may implement or execute the methods, steps and logic blocks disclosed in the embodiments of the present application.
The memory 402 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk. The memory 402 also stores one or more modules, each executed by the one or more processors 401 to perform the steps of the intent model training method of any of the above embodiments.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program executable by the processor 401 to perform the intent model training method in any of the above embodiments.
In the several embodiments provided in the present application, the disclosed apparatus and method may also be implemented in other ways. The apparatus embodiments described above are merely illustrative. The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, segment or portion of code comprising one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in a block may occur out of the order noted in the figures; for example, two blocks shown in succession may in fact be executed substantially concurrently, or sometimes in the reverse order, depending on the functionality involved. It will also be noted that each block of the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
If implemented in the form of software functional modules and sold or used as a stand-alone product, the functions may be stored on a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

Claims (6)

1. A method of training an intent model, comprising:
constructing a non-intent data set assembled from data unrelated to intents of the intent model, comprising: randomly selecting seed questions unrelated to the intents of the intent model; inputting the seed questions into a question-answering model and taking the outputs as non-intent data; and assembling the non-intent data into a non-intent data set; wherein inputting the seed questions into the question-answering model and taking the outputs as non-intent data comprises: inputting the seed questions into the question-answering model to obtain similar questions; and inputting the seed questions together with the similar questions into the question-answering model and taking the outputs as non-intent data;
training with a current intent data set to obtain a current intent model, wherein the current intent data set is assembled from initial intent data and N1 items of non-intent data, N1 being the ratio of the total amount of initial intent data to the number of initial intent categories;
predicting on the non-intent data set with the current intent model to obtain a current prediction result; and
when the current prediction result satisfies an iteration stop condition, taking the current intent model as an optimal intent model; when the current prediction result does not satisfy the iteration stop condition, selecting N2 items of data that caused the condition to fail and adding them to the current intent data set to form a next current intent data set, wherein N2 is less than or equal to N1.
2. The method of claim 1, wherein constructing the non-intent data set comprises:
randomly selecting articles unrelated to the intents of the intent model;
randomly extracting sentences from the articles as non-intent data; and
assembling the non-intent data into a non-intent data set.
3. The method of claim 1, wherein constructing the non-intent data set comprises:
randomly selecting user query logs unrelated to the intents of the intent model as non-intent data; and
assembling the non-intent data into a non-intent data set.
4. The method of claim 1, wherein predicting on the non-intent data set with the current intent model to obtain a current prediction result comprises:
predicting, by the current intent model, whether each item in the non-intent data set is related to the current intent model, and marking an item as related data when it is; and
counting the amount of related data and taking it as the current prediction result.
5. The method of claim 4, wherein selecting N2 items of data that caused the iteration stop condition to fail and adding them to the current intent data set comprises:
calculating the proportion of related data in the non-intent data set, and, when the proportion exceeds a preset proportion threshold, selecting N2 items of related data to add to the current intent data set.
6. An intent model training device, comprising:
a construction module for constructing a non-intent data set assembled from data unrelated to intents of an intent model, the construction comprising: randomly selecting seed questions unrelated to the intents of the intent model; inputting the seed questions into a question-answering model and taking the outputs as non-intent data; and assembling the non-intent data into a non-intent data set; wherein inputting the seed questions into the question-answering model and taking the outputs as non-intent data comprises: inputting the seed questions into the question-answering model to obtain similar questions; and inputting the seed questions together with the similar questions into the question-answering model and taking the outputs as non-intent data;
a training module for training with a current intent data set to obtain a current intent model, wherein the current intent data set is assembled from initial intent data and N1 items of non-intent data, N1 being the ratio of the total amount of initial intent data to the number of initial intent categories;
a prediction module for predicting on the non-intent data set with the current intent model to obtain a current prediction result; and
an iteration stop module for taking the current intent model as an optimal intent model when the current prediction result satisfies an iteration stop condition, wherein, when the current prediction result does not satisfy the iteration stop condition, N2 items of data that caused the condition to fail are selected and added to the current intent data set to form a next current intent data set, N2 being less than or equal to N1.
CN202011573198.4A 2020-12-25 2020-12-25 Intention model training method and device Active CN112650834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011573198.4A CN112650834B (en) 2020-12-25 2020-12-25 Intention model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011573198.4A CN112650834B (en) 2020-12-25 2020-12-25 Intention model training method and device

Publications (2)

Publication Number Publication Date
CN112650834A (en) 2021-04-13
CN112650834B (en) 2023-10-03

Family

ID=75363590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011573198.4A Active CN112650834B (en) 2020-12-25 2020-12-25 Intention model training method and device

Country Status (1)

Country Link
CN (1) CN112650834B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019153522A1 (en) * 2018-02-09 2019-08-15 卫盈联信息技术(深圳)有限公司 Intelligent interaction method, electronic device, and storage medium
CN110909136A (en) * 2019-10-10 2020-03-24 百度在线网络技术(北京)有限公司 Satisfaction degree estimation model training method and device, electronic equipment and storage medium
WO2020056621A1 (en) * 2018-09-19 2020-03-26 华为技术有限公司 Learning method and apparatus for intention recognition model, and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
B. Simons. A Bootstrapped Model to Detect Abuse and Intent in White Supremacist Corpora. IEEE, 2020, full text. *
邱云飞; 刘聪. An intent classification optimization method based on co-training (基于协同训练的意图分类优化方法). 现代情报 (Modern Information), 2019, (No. 05), full text. *

Also Published As

Publication number Publication date
CN112650834A (en) 2021-04-13

Similar Documents

Publication Publication Date Title
WO2017084362A1 (en) Model generation method, recommendation method and corresponding apparatuses, device and storage medium
JP2018538587A (en) Risk assessment method and system
Reinanda et al. Mining, ranking and recommending entity aspects
CN105389349A (en) Dictionary updating method and apparatus
CN107844533A (en) A kind of intelligent Answer System and analysis method
JP2021101361A (en) Method, device, apparatus and storage medium for generating event topics
CN112380319B (en) Model training method and related device
CN111385602A (en) Video auditing method, medium and computer equipment based on multi-level and multi-model
CN104967587A (en) Method for identifying malicious account numbers, and apparatus thereof
CN111782637A (en) Model construction method, device and equipment
WO2015084757A1 (en) Systems and methods for processing data stored in a database
Calderón et al. Content-based echo chamber detection on social media platforms
CN110968564A (en) Data processing method and training method of data state prediction model
CN110019556B (en) Topic news acquisition method, device and equipment thereof
CN112650834B (en) Intention model training method and device
Li et al. Research on the application of multimedia entropy method in data mining of retail business
Trushkowsky et al. Getting it all from the crowd
CN115329078B (en) Text data processing method, device, equipment and storage medium
CN112348279B (en) Information propagation trend prediction method, device, electronic equipment and storage medium
WO2023024474A1 (en) Data set determination method and apparatus, and computer device and storage medium
US11449789B2 (en) System and method for hierarchical classification
CN113297854A (en) Method, device and equipment for mapping text to knowledge graph entity and storage medium
CN107885808B (en) Shared resource file anti-cheating method
CN112528021A (en) Model training method, model training device and intelligent equipment
CN110968668A (en) Method and device for calculating similarity of network public sentiment subjects based on hyper-network

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant