CN111400473A - Method and device for training intention recognition model, storage medium and electronic equipment - Google Patents

Method and device for training intention recognition model, storage medium and electronic equipment

Info

Publication number
CN111400473A
CN111400473A (application CN202010191431.6A)
Authority
CN
China
Prior art keywords
data
sample data
intention
user
negative sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010191431.6A
Other languages
Chinese (zh)
Inventor
刘硕
杨玉树
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202010191431.6A priority Critical patent/CN111400473A/en
Publication of CN111400473A publication Critical patent/CN111400473A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a method and device for training an intention recognition model, a storage medium, and an electronic device. The method of training an intention recognition model includes: acquiring stored interaction log data, where the interaction log data comprises multiple sets of user interaction data, and each set of user interaction data includes: user query data, a recommendation intention data list, and user selection data selected by the user from the recommendation intention data list; generating first positive sample data and first negative sample data based on each set of user interaction data; acquiring second positive sample data and second negative sample data from a knowledge base based on preset first configuration information; and training an intention recognition model based on the first positive sample data, the first negative sample data, the second positive sample data, and the second negative sample data; wherein the intention recognition model is used to perform intention recognition on query statements input by the user.

Description

Method and device for training intention recognition model, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for training an intention recognition model, a storage medium, and an electronic device.
Background
With the continuous development of artificial intelligence technology, user intention recognition is widely applied in scenarios such as intelligent customer service and customer service robots. By analyzing the user's intent, the machine interacts with the user in natural language and provides services.
FIG. 1 is a schematic diagram of a smart customer service interaction page shown according to an example. As shown in fig. 1, the machine analyzes the query input by the user through the intention recognition module, performs similarity matching against candidate intentions, selects from the candidates the true intention the user wants to express, and feeds it back to the user.
In the related art, model-based intention recognition methods are used: for example, similarity is computed between the user's query and candidate intentions using a similarity model, and the user's intention is determined by scoring and ranking. Although such methods have strong generalization ability and require no manually configured rules, the accuracy of model-based recognition depends heavily on the training data. Therefore, providing accurate training data for the similarity model has become a key technology for improving the accuracy of intention recognition.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure is directed to a method and an apparatus for training an intention recognition model, a storage medium, and an electronic device, which overcome, at least to some extent, the problem of low accuracy in intention recognition in the related art.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a method of training an intention recognition model, including: acquiring stored interaction log data, where the interaction log data comprises multiple sets of user interaction data, and each set of user interaction data includes: user query data, a recommendation intention data list, and user selection data selected by the user from the recommendation intention data list; generating first positive sample data and first negative sample data based on each set of user interaction data; acquiring second positive sample data and second negative sample data from a knowledge base based on preset first configuration information; and training an intention recognition model based on the first positive sample data, the first negative sample data, the second positive sample data, and the second negative sample data; wherein the intention recognition model is used to perform intention recognition on query statements input by the user.
In an embodiment of the present disclosure, generating the first positive sample data and the first negative sample data based on each set of user interaction data includes: composing the first positive sample data from the user query data and the user selection data; and composing the first negative sample data, based on preset second configuration information, from the user query data combined with all or part of the unselected intention data in the recommendation intention data list.
In an embodiment of the present disclosure, the interaction log data does not include user interaction data from sessions in which the user transferred to manual service after selecting the user selection data from the recommendation intention data list.
In one embodiment of the present disclosure, the second configuration information is used to configure the proportion of the first negative sample data, where the proportion includes the selection proportion of unselected intention data in the recommendation intention data list used to compose the first negative sample data.
In one embodiment of the present disclosure, the second positive sample data includes: sample data composed of each intention in the knowledge base paired with the different expression data under that intention, and sample data composed of the different expression data under the same intention paired with one another. The second negative sample data includes: sample data composed of the expression data under each intention paired with other preselected intentions in the knowledge base, and sample data composed of the expression data under each intention paired with preselected expression data under those other preselected intentions.
In an embodiment of the present disclosure, the first configuration information is used to configure the proportion of the second negative sample data, where the proportion includes: the selection proportion of preselected other intentions and/or the selection proportion of preselected expression data.
In one embodiment of the present disclosure, the intent recognition model includes a similarity model.
According to another aspect of the present disclosure, there is provided an apparatus for training an intention recognition model, including: a data acquisition module, configured to acquire stored interaction log data, where the interaction log data comprises multiple sets of user interaction data, and each set of user interaction data includes: user query data, a recommendation intention data list, and user selection data selected by the user from the recommendation intention data list; a sample generation module, configured to generate first positive sample data and first negative sample data based on each set of user interaction data; a sample acquisition module, configured to acquire second positive sample data and second negative sample data from the knowledge base based on preset first configuration information; and a model training module, configured to train the intention recognition model based on the first positive sample data, the first negative sample data, the second positive sample data, and the second negative sample data; wherein the intention recognition model is used to perform intention recognition on query statements input by the user.
According to yet another aspect of the present disclosure, there is provided a computer apparatus including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the above-described method of training an intent recognition model via execution of the executable instructions.
According to yet another aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described method of training an intent recognition model.
The training method for the intention recognition model provided by the embodiment of the disclosure obtains training data for training the intention recognition model based on the combination of data in the knowledge base and user interaction data. On the one hand, the labor cost generated by manual labeling is avoided, and on the other hand, the accuracy of training data is improved, so that the accuracy of intention identification is higher.
Furthermore, negative samples with different degrees of correlation can be obtained from the user interaction data. Using these together with the completely irrelevant negative samples sampled from the knowledge base as the training set of the intention recognition model lets the two complement each other, further improving the accuracy of intention recognition.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1 is a schematic diagram of a smart customer service interaction page shown according to an example.
Fig. 2 is a schematic structural diagram of a computer system according to an exemplary embodiment of the present disclosure.
FIG. 3 shows a flowchart of a training method of an intent recognition model in an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of a knowledge base shown according to an example.
FIG. 5 is a training diagram of an intent recognition model shown according to an example.
FIG. 6 is a flow chart illustrating another method for training an intent recognition model in an embodiment of the present disclosure.
FIG. 7 is a diagram illustrating a list of user query data and recommendation intent data, according to an example.
FIG. 8 is a schematic diagram illustrating an apparatus for training an intent recognition model according to an embodiment of the present disclosure.
Fig. 9 shows a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
FIG. 10 is a schematic diagram of a computer-readable storage medium in an embodiment of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Further, in the description of the present disclosure, "a plurality" means at least two, e.g., two, three, etc., unless explicitly limited otherwise. The terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
In the related art, the training data can be generated in two ways in general:
one is to manually mark the user's query data with its intentions to form data pairs of query data and intention data. However, in this way, when the training data scale is large, the labor cost required to be consumed is high, and the labeling speed cannot meet the requirement of model iteration updating online.
The other is to obtain training data by sampling the knowledge base in the system: when user query data needs to be matched, candidate intentions are recalled from the knowledge base, and the user query data and each candidate intention form a data pair used as training data for the similarity model. This approach yields a large amount of training data by taking positive samples within the same intention and negative samples across different intentions. Because the knowledge base itself is well suited to positive samples, this approach is accurate when generating positive samples; but because negative samples are drawn across different intentions, they are almost always completely irrelevant. For example, a negative training pair generated from the knowledge base might be "can I apply for a refund" (query) paired with "no salad on the hamburger" (intention). This gives the model being trained the illusion that only completely irrelevant text constitutes a negative sample. In practice, some intentions are related to the query to a certain degree yet do not share its meaning, e.g. "can I apply for a refund" (query) versus "query refund progress" (intention).
The scheme provided by the embodiment of the disclosure provides a training method and a device for an intention recognition model, and based on the combination of data in a knowledge base and user interaction data, training data for training the intention recognition model is obtained, so that on one hand, the labor cost generated by manual labeling is avoided, on the other hand, the accuracy of the training data is improved, and the accuracy of intention recognition is higher.
To facilitate understanding, several terms referred to in the present disclosure are explained below.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a broad range of fields, spanning both hardware-level and software-level technologies. The basic artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big-data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, speech processing, natural language processing, and machine learning/deep learning. Embodiments of the present disclosure generally relate to natural language processing and machine learning/deep learning techniques, among others.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence; it studies theories and methods that enable effective communication between humans and computers using natural language.
Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and more. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and how it can reorganize existing knowledge structures to continuously improve its performance.
The scheme provided by the embodiment of the disclosure relates to technologies such as artificial intelligence intention recognition and machine learning, and is specifically explained by the following embodiments:
fig. 2 is a schematic structural diagram of a computer system according to an exemplary embodiment of the present disclosure. The system comprises: a number of terminals 120 and a server cluster 140.
The terminal 120 may be a mobile terminal such as a mobile phone, a game console, a tablet computer, an e-book reader, smart glasses, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a smart home device, an AR (Augmented Reality) device, or a VR (Virtual Reality) device; alternatively, the terminal 120 may be a personal computer, such as a laptop or desktop computer.
The terminals 120 are connected to the server cluster 140 through a communication network. Optionally, the communication network is a wired network or a wireless network.
The server cluster 140 is a server, or is composed of a plurality of servers, or is a virtualization platform, or is a cloud computing service center. The server cluster 140 is used to provide background services for applications that provide training methods for the intent recognition model.
Optionally, the clients of the application installed in different terminals 120 are the same, or the clients installed on two terminals 120 are clients of the same type of application on different operating system platforms. Based on different terminal platforms, the specific form of the application client may also differ; for example, the client may be a mobile phone client, a PC client, or a World Wide Web (Web) client.
The server cluster 140 obtains interaction data with the user through different clients in the terminal 120.
Those skilled in the art will appreciate that the number of terminals 120 described above may be greater or fewer. For example, the number of the terminals may be only one, or several tens or hundreds of the terminals, or more. The number of terminals and the type of the device are not limited in the embodiments of the present disclosure.
Optionally, the system may further include a management device (not shown in fig. 2), and the management device is connected to the server cluster 140 through a communication network. Optionally, the communication network is a wired network or a wireless network.
The network is typically the Internet, but may be any network, including but not limited to a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wired, or wireless network, a private network, a virtual private network, or any combination thereof.
Hereinafter, the steps of the training method for the intention recognition model in the exemplary embodiment of the present disclosure will be described in more detail with reference to the drawings and the embodiment.
FIG. 3 shows a flowchart of a training method of an intent recognition model in an embodiment of the present disclosure. The method provided by the embodiment of the present disclosure may be executed by any electronic device with computing processing capability, for example, the server cluster 140 in fig. 2. In the following description, the server cluster 140 is used as an execution subject for illustration.
As shown in fig. 3, the training method 10 of the intention recognition model includes:
in step S102, the stored interaction log data is acquired.
Taking the intelligent customer service as an example, the interaction log data records a plurality of groups of interaction data interacted between the user and the intelligent customer service in the past period.
As shown in fig. 2, each set of user interaction data may include, for example: the system comprises user query data, an intelligent customer service recommended intention data list and user selection data selected from the recommended intention data list by a user.
In the process of interacting with the intelligent customer service, if the action of transferring to the manual service occurs, the intention data list recommended by the intelligent customer service for the user may not have the intention expected by the user, and therefore the user is transferred to request the manual service. To improve the accuracy of the data, in some embodiments, the interaction log data may discard data that has a transition to artificial behavior during the interaction, and only retain data that has no transition to artificial behavior.
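As a minimal sketch of this filtering step (the session layout and the "escalated" flag are assumptions for illustration, not part of the disclosure):

```python
# Hypothetical log-filtering sketch: keep only sessions that were resolved
# without escalation to a human agent, as the embodiment above suggests.
def filter_interaction_logs(sessions):
    """Drop sessions in which the user transferred to manual service.

    Each session is a dict; the "escalated" flag is an assumed field name.
    """
    return [s for s in sessions if not s.get("escalated", False)]
```

Escalated sessions are discarded entirely; all remaining sessions can then feed sample generation.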
In step S104, first positive sample data and first negative sample data are generated based on each set of user interaction data.
First positive sample data and first negative sample data are generated from each set of user interaction data. For example, from each set of user interaction data, the user query data and the user selection data may be combined into first positive sample data, and the user query data may be combined with each intention in the recommendation intention list that the user did not select to form different first negative samples.
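Step S104 can be sketched as follows; the record layout ("query", "selected", "recommended") and the `neg_ratio` knob standing in for the second configuration information are assumptions for illustration:

```python
import random

def build_first_samples(interaction, neg_ratio=1.0, rng=random):
    """Compose first positive/negative pairs from one interaction record.

    neg_ratio plays the role of the second configuration information: the
    fraction of unselected recommended intents used as negatives.
    Returns (query, intent, label) triples, label 1 = positive.
    """
    query = interaction["query"]
    selected = interaction["selected"]
    positives = [(query, selected, 1)]
    unselected = [i for i in interaction["recommended"] if i != selected]
    k = max(1, round(len(unselected) * neg_ratio)) if unselected else 0
    negatives = [(query, i, 0) for i in rng.sample(unselected, k)]
    return positives, negatives
```

With `neg_ratio` below 1.0, only a sampled subset of the unselected recommendations becomes negatives, matching the "all or part" wording of the embodiment.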
In step S106, second positive sample data and second negative sample data in the knowledge base are obtained based on the preset first configuration information.
For example, in some embodiments, the second positive sample data includes: in the knowledge base, sample data composed of each intention paired with the different expression data under that intention, and sample data composed of the different expression data under the same intention paired with one another.
The second negative sample data includes: in the knowledge base, sample data composed of the expression data under each intention paired with other preselected intentions, and sample data composed of the expression data under each intention paired with preselected expression data under those other preselected intentions.
FIG. 4 is a schematic diagram of a knowledge base shown according to an example.
As shown in fig. 4, under each category (e.g., takeaway, SaaS, hotel), each intention corresponds to multiple expressions. An intention and each of its corresponding expressions may constitute different second positive samples. For example, the intention "I want to apply for a refund" and its expression 1 "can I apply for a refund" may constitute positive sample 1.
In addition, different expressions corresponding to the same intention may compose further second positive samples. For example, positive sample 2 may be composed of expression 1 "can I apply for a refund" and expression 2 "refund, please", both under the intention "I want to apply for a refund".
Second negative sample data may be formed by pairing the expression data of each intention with preselected expressions of other preselected intentions. As shown in fig. 4, negative sample 1 may be formed from expression 2 "refund, please" (under the intention "I want to apply for a refund") and expression 1 "no salad on the hamburger" (under the intention "I want to add remarks"); negative sample 2 may be formed from expression 2 "refund, please" and expression 1 "how to set up a printer" (under the intention "how to connect a printer").
Furthermore, the expression data of one intention may be paired with another preselected intention itself to form a second negative sample. As in fig. 4, expression 2 "refund, please" (under "I want to apply for a refund") and the intention "I want to add remarks" constitute negative sample 3.
The first configuration information may be used to configure the preselected intentions and the preselected expressions under each preselected intention. For example, it may be configured by proportion: taking the intention "I want to apply for a refund" as an example, a proportion (e.g., 30%, 50%, 70%) of the other intentions may be selected as preselected intentions, and the second negative sample data composed in the manner described above. Alternatively, the preselected intentions may be specified directly by configuration from among the other intentions.
Similarly, the preselected expression in the preselected intention may be determined according to a configuration scale, a direct specification, or the like.
In practical applications, the above proportion may be configured according to practical requirements, and the disclosure is not limited thereto.
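Under these assumptions (a knowledge base modeled as intent → list of expressions, and two ratios standing in for the first configuration information), the knowledge-base sampling of step S106 might look like:

```python
import itertools
import random

def build_kb_samples(kb, intent_ratio=1.0, expr_ratio=1.0, rng=random):
    """Sample second positive/negative (text, text, label) triples.

    kb maps an intent string to its list of expression strings.
    intent_ratio / expr_ratio stand in for the first configuration
    information: the fraction of other intents, and of their expressions,
    preselected for negative pairing.
    """
    positives, negatives = [], []
    for intent, exprs in kb.items():
        # Positives: the intention paired with each of its expressions...
        positives += [(intent, e, 1) for e in exprs]
        # ...and expressions of the same intention paired with each other.
        positives += [(a, b, 1) for a, b in itertools.combinations(exprs, 2)]
        others = [i for i in kb if i != intent]
        n_other = max(1, round(len(others) * intent_ratio)) if others else 0
        for other in rng.sample(others, n_other):
            # Negatives: each expression vs. the other preselected intention...
            negatives += [(e, other, 0) for e in exprs]
            # ...and vs. preselected expressions of that intention.
            k = max(1, round(len(kb[other]) * expr_ratio))
            for oe in rng.sample(kb[other], k):
                negatives += [(e, oe, 0) for e in exprs]
    return positives, negatives
```

With both ratios at 1.0 every cross-intention pairing is produced; lowering them subsamples the negatives, as the proportion configuration describes.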
In step S108, the intention recognition model is trained based on the first positive sample data, the first negative sample data, the second positive sample data, and the second negative sample data.
FIG. 5 is a training diagram of an intent recognition model shown according to an example.
As shown in fig. 5, a training data set of the intention recognition model is constructed based on the first positive sample data, the first negative sample data, the second positive sample data, and the second negative sample data, and the intention recognition model is trained based on the training data set.
The intention recognition model is used for carrying out intention recognition on a query statement input by a user.
In some embodiments, the intent recognition model includes a similarity model, such as DSSM, CDSSM, Seq2Seq, Transformer, or BERT.
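The four sample groups are merged into one training set and fed to whichever similarity model is used. As a hedged illustration of the (text, text) → score interface such a model exposes — a token-overlap toy baseline, not an implementation of DSSM or BERT:

```python
def build_training_set(*sample_groups):
    """Merge the first/second positive and negative sample groups."""
    return [triple for group in sample_groups for triple in group]

def jaccard_similarity(a, b):
    """Toy stand-in for a learned similarity model: token-set overlap.

    A real system would train DSSM/BERT-style encoders on the merged
    training set; this only illustrates the scoring interface.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta | tb) else 0.0
```

At inference time, the model scores the user query against each candidate intention and the candidates are ranked by score, matching the scoring-and-ranking scheme described in the background section.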
The training method for the intention recognition model provided by the embodiment of the disclosure obtains training data for training the intention recognition model based on the combination of data in the knowledge base and user interaction data. On the one hand, the labor cost generated by manual labeling is avoided, and on the other hand, the accuracy of training data is improved, so that the accuracy of intention identification is higher.
FIG. 6 is a flow chart illustrating another method for training an intent recognition model in an embodiment of the present disclosure. Unlike the method for training the intent recognition model shown in fig. 3, the method shown in fig. 6 further provides an example of how to generate the first positive sample data and the first negative sample data based on each set of user interaction data, that is, further provides a specific implementation manner of step S104.
Referring to fig. 6, step S104 includes:
in step S1042, the user query data and the user selection data are combined into first positive sample data.
FIG. 7 is a diagram illustrating a list of user query data and recommendation intent data, according to an example.
In the example shown in fig. 7, when the query data input by the user is "he does not find a location", the recommendation intention data given by the intelligent customer service through the interactive interface includes, for example: "locate address inaccurate", "rider does not know way", "how location is on or off", "delivery slow/timeout", and "how delivery status is queried". The user clicks "rider does not know way" on the interactive interface, that is, the user selection data is "rider does not know way".
The user query data "he does not find a location" and the user selection data "rider does not know a way" may be combined into the first positive sample data, i.e., positive sample 1 in fig. 7.
Positive sample data obtained in this way consists of query data input by the user in real time and the selection the user actually made, so it directly reflects the user's intention, and the accuracy of the positive samples is accordingly high.
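Step S1042 can be sketched in a few lines: each interaction-log record contributes one positive pair made of the user's query and the intent the user actually clicked. The record schema (keys `query`, `recommended`, `selected`) is an assumption introduced here for illustration.

```python
# Hypothetical interaction-log schema; the keys are illustrative only.
logs = [{
    "query": "he does not find a location",
    "recommended": [
        "locate address inaccurate",
        "rider does not know way",
        "how location is on or off",
        "delivery slow/timeout",
        "how delivery status is queried",
    ],
    "selected": "rider does not know way",
}]

def build_first_positive_samples(interaction_logs):
    """One (query, selected_intent, label=1) triple per log record (S1042)."""
    return [(rec["query"], rec["selected"], 1) for rec in interaction_logs]

positives = build_first_positive_samples(logs)
# positives[0] pairs the query with the clicked intent, labeled positive.
```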
In step S1044, based on the preset second configuration information, the user query data and all or part of the unselected intention data in the recommendation intention data list are combined to form first negative sample data.
Still taking fig. 7 as an example, the user query data and all or part of the unselected intention data in the recommendation intention data list are combined into the first negative sample data.
For example, the user query data "he does not find a location" is combined with the unselected intention data "locate address inaccurate", "how location is on or off", and "delivery slow/timeout" to form negative samples 1, 2, and 3, respectively.
Negative sample 1 ("he does not find a location", "locate address inaccurate") may be considered strongly correlated, since the intention "locate address inaccurate" is likely the cause of the query "he does not find a location", and both contain a description related to "location" in their text.
Negative sample 3 ("he does not find a location", "delivery slow/timeout") may also be considered correlated, since the intention "delivery slow/timeout" is, with high probability, a consequence of the situation described in the query "he does not find a location".
Negative sample 2 ("he does not find a location", "how location is on or off") may be considered weakly correlated: although both contain a description related to "location", they are not closely related semantically.
Negative samples with these different degrees of correlation, together with the completely irrelevant negative samples sampled from the knowledge base, serve as the training set of the intention recognition model and complement one another.
The second configuration information may be used, for example, to configure a proportion of the first negative sample data, namely the selection proportion of the unselected intention data in the recommendation intention data list used to compose the first negative sample data.
In other words, the sampling ratio of the first negative sample data can be configured through the second configuration information.
Different strategies can be designed by combining the first configuration information and the second configuration information, so that negative sample data are sampled from the user interaction data and the knowledge base respectively. The embodiment of the present disclosure provides a mechanism for configuring the proportion of negative samples drawn from different sources; the specific configuration may be set according to actual requirements, and the disclosure is not limited thereto.
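As a hedged sketch of how the two pieces of configuration information might govern sampling, the function below draws a configured fraction of the unselected recommendations as first negative samples (second configuration information) and tops them up with a configured number of knowledge-base expressions (first configuration information). The function name, parameters, and record schema are all invented for illustration.

```python
import random

def sample_negatives(record, kb_expressions, unselected_ratio=0.6,
                     kb_count=2, rng=None):
    """Mix negatives from the interaction log and the knowledge base.

    unselected_ratio: fraction of the unselected recommended intents to keep
    kb_count: number of unrelated knowledge-base expressions to add
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    unselected = [i for i in record["recommended"] if i != record["selected"]]
    k = max(1, round(unselected_ratio * len(unselected)))
    first_negatives = [(record["query"], i, 0)
                       for i in rng.sample(unselected, k)]
    kb_negatives = [(record["query"], e, 0)
                    for e in rng.sample(kb_expressions, kb_count)]
    return first_negatives + kb_negatives
```

Varying `unselected_ratio` and `kb_count` corresponds to the different sampling strategies the embodiment describes for the two negative-sample sources.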
It is noted that the above-mentioned figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
FIG. 8 is a schematic diagram illustrating an apparatus for training an intent recognition model according to an embodiment of the present disclosure. The training apparatus of the intention recognition model shown in fig. 8 can be applied to the server cluster 140 in fig. 2, for example.
Referring to fig. 8, the training apparatus 20 for the intention recognition model includes: a data acquisition module 202, a sample generation module 204, a sample acquisition module 206, and a model training module 208.
The data obtaining module 202 is configured to obtain stored interaction log data, where the interaction log data includes multiple sets of user interaction data, and each set of user interaction data includes: user query data, a recommendation intention data list, and user selection data selected by the user from the recommendation intention data list.
The sample generating module 204 is configured to generate first positive sample data and first negative sample data based on each set of user interaction data.
The sample obtaining module 206 is configured to obtain second positive sample data and second negative sample data in the knowledge base based on preset first configuration information.
The model training module 208 is configured to train the intention recognition model based on the first positive sample data, the first negative sample data, the second positive sample data, and the second negative sample data.
The intention recognition model is used for carrying out intention recognition on a query statement input by a user.
In some embodiments, the sample generation module 204 includes: a positive sample generation unit and a negative sample generation unit. The positive sample generating unit is used for forming first positive sample data by the user query data and the user selection data; the negative sample generating unit is used for forming first negative sample data by the user query data and all or part of unselected intention data in the recommendation intention data list respectively based on preset second configuration information.
In some embodiments, the interaction log data does not include user interaction data in which the user was transferred to manual (human) customer service after selecting the user selection data from the recommendation intention data list.
In some embodiments, the second configuration information is used to configure a proportion of the first negative sample data, the proportion including a selection proportion of non-selected intention data in the list of recommended intention data for composing the first negative sample data.
In some embodiments, the second positive sample data includes: sample data formed, in the knowledge base, by pairing each intention with the different expression data under that intention, and sample data formed by pairing the different expression data under the same intention with one another. The second negative sample data includes: sample data formed by pairing the expression data under each intention with other preselected intentions, and sample data formed by pairing the expression data under each intention with preselected expression data under those other preselected intentions.
In some embodiments, the first configuration information is used to configure a proportion of the second negative sample data, the proportion including: a selection proportion of the other preselected intentions and/or a selection proportion of the preselected expression data.
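Under the reading above, the knowledge-base sampling can be sketched as follows: positives pair an intention with each expression under it and pair expressions under the same intention with one another; negatives pair an expression with a different intention or with an expression under a different intention. The toy knowledge base and helper names are assumptions, and for simplicity the "preselected" subsets are taken to be all other intentions and expressions.

```python
from itertools import combinations

# Toy knowledge base mapping each intention to its expression data
# (illustrative content only).
kb = {
    "locate address inaccurate": ["the pin is in the wrong place",
                                  "map shows wrong address"],
    "delivery slow/timeout": ["order is late", "waited over an hour"],
}

def kb_positive_samples(kb):
    """Second positive samples: intention-expression and intra-intention pairs."""
    samples = []
    for intention, exprs in kb.items():
        samples += [(intention, e, 1) for e in exprs]
        samples += [(a, b, 1) for a, b in combinations(exprs, 2)]
    return samples

def kb_negative_samples(kb):
    """Second negative samples: cross-intention pairs."""
    samples = []
    for intention, exprs in kb.items():
        for other, other_exprs in kb.items():
            if other == intention:
                continue
            samples += [(e, other, 0) for e in exprs]
            samples += [(e, oe, 0) for e in exprs for oe in other_exprs]
    return samples
```

A real implementation would subsample these pairs according to the configured proportions rather than enumerating them all.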
In some embodiments, the intent recognition model includes a similarity model.
The training device for the intention recognition model provided by the embodiment of the disclosure obtains the training data by combining data in the knowledge base with user interaction data. On the one hand, this avoids the labor cost of manual labeling; on the other hand, it improves the accuracy of the training data, so that intention recognition is more accurate.
Furthermore, negative samples with different degrees of correlation can be obtained from the user interaction data; used together with the completely irrelevant negative samples sampled from the knowledge base as the training set of the intention recognition model, they complement one another and further improve the accuracy of intention recognition.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method or program product. Accordingly, various aspects of the present disclosure may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
An electronic device 800 according to this embodiment of the disclosure is described below with reference to fig. 9. The electronic device 800 shown in fig. 9 is only an example and should not bring any limitations to the functionality and scope of use of the embodiments of the present disclosure.
As shown in fig. 9, the electronic device 800 is in the form of a general purpose computing device. The components of the electronic device 800 may include, but are not limited to: at least one processing unit 810, at least one storage unit 820, and a bus 830 that couples the various system components including the storage unit 820 and the processing unit 810.
Where the memory unit stores program code, the program code may be executed by the processing unit 810 to cause the processing unit 810 to perform steps according to various exemplary embodiments of the present disclosure as described in the "exemplary methods" section above in this specification. For example, the processing unit 810 may execute S102 as shown in fig. 3, acquiring the stored interaction log data; s104, generating first positive sample data and first negative sample data based on each group of user interaction data; s106, acquiring second positive sample data and second negative sample data in the knowledge base based on preset first configuration information; and S108, training the intention recognition model based on the first positive sample data, the first negative sample data, the second positive sample data and the second negative sample data.
The storage unit 820 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 8201 and/or a cache memory unit 8202, and may further include a read-only memory unit (ROM) 8203.
The storage unit 820 may also include a program/utility 8204 having a set (at least one) of program modules 8205, such program modules 8205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 830 may be any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
Electronic device 800 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, Bluetooth device, etc.), and may also communicate with one or more devices that enable a user to interact with electronic device 800, and/or with any device (e.g., router, modem, etc.) that enables electronic device 800 to communicate with one or more other computing devices.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the disclosure described in the above-mentioned "exemplary methods" section of this specification, when the program product is run on the terminal device.
Referring to fig. 10, a program product 900 for implementing the above method according to an embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++, or the like, as well as conventional procedural programming languages, such as the "C" language or similar programming languages.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. A method for training an intention recognition model, comprising:
acquiring stored interaction log data, wherein the interaction log data comprise a plurality of groups of user interaction data, and each group of user interaction data comprises: user query data, a recommendation intention data list, and user selection data selected by the user from the recommendation intention data list;
generating first positive sample data and first negative sample data based on each group of user interaction data;
acquiring second positive sample data and second negative sample data in a knowledge base based on preset first configuration information; and
training an intention recognition model based on the first positive sample data, the first negative sample data, the second positive sample data, and the second negative sample data;
wherein the intention identification model is used for carrying out intention identification on the query statement input by the user.
2. The method of claim 1, wherein generating first positive sample data and first negative sample data based on the sets of user interaction data comprises:
forming first positive sample data by the user query data and the user selection data; and
based on preset second configuration information, combining the user query data with all or part of the unselected intention data in the recommendation intention data list to form the first negative sample data.
3. The method of claim 2, wherein the interaction log data does not include user interaction data in which the user was transferred to manual customer service after selecting the user selection data from the recommendation intention data list.
4. The method of claim 2, wherein the second configuration information is used to configure a proportion of the first negative sample data, the proportion including a selection proportion of non-selected intention data in the list of recommendation intention data for composing the first negative sample data.
5. The method of claim 1, wherein the second positive sample data comprises: sample data respectively composed of each intention in the knowledge base and the different expression data under that intention, and sample data respectively composed of the different expression data under the same intention paired with one another;
the second negative sample data comprises: sample data respectively composed of the expression data under each intention and other preselected intentions, and sample data respectively composed of the expression data under each intention and preselected expression data under the other preselected intentions.
6. The method of claim 5, wherein the first configuration information is used to configure a proportion of the second negative sample data, and wherein the proportion comprises: a selection proportion of the preselected other intentions and/or a selection proportion of the preselected expression data.
7. The method of any of claims 1-6, wherein the intent recognition model comprises a similarity model.
8. An apparatus for training an intention recognition model, comprising:
the data acquisition module is used for acquiring stored interaction log data, wherein the interaction log data comprise a plurality of groups of user interaction data, and each group of user interaction data comprises: user query data, a recommendation intention data list, and user selection data selected by the user from the recommendation intention data list;
the sample generation module is used for generating first positive sample data and first negative sample data based on each group of user interaction data;
the sample acquisition module is used for acquiring second positive sample data and second negative sample data in the knowledge base based on preset first configuration information; and
the model training module is used for training an intention recognition model based on the first positive sample data, the first negative sample data, the second positive sample data and the second negative sample data;
wherein the intention identification model is used for carrying out intention identification on the query statement input by the user.
9. A computer device, comprising: memory, processor and executable instructions stored in the memory and executable in the processor, characterized in that the processor implements the method according to any of claims 1-7 when executing the executable instructions.
10. A computer-readable storage medium having stored thereon computer-executable instructions, which when executed by a processor, implement the method of any one of claims 1-7.
CN202010191431.6A 2020-03-18 2020-03-18 Method and device for training intention recognition model, storage medium and electronic equipment Pending CN111400473A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010191431.6A CN111400473A (en) 2020-03-18 2020-03-18 Method and device for training intention recognition model, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN111400473A true CN111400473A (en) 2020-07-10

Family

ID=71428842

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010191431.6A Pending CN111400473A (en) 2020-03-18 2020-03-18 Method and device for training intention recognition model, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111400473A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111970150A (en) * 2020-08-20 2020-11-20 北京达佳互联信息技术有限公司 Log information processing method, device, server and storage medium
CN112328891A (en) * 2020-11-24 2021-02-05 北京百度网讯科技有限公司 Method for training search model, method for searching target object and device thereof
CN112417132A (en) * 2020-12-17 2021-02-26 南京大学 New intention recognition method for screening negative samples by utilizing predicate guest information
CN113407689A (en) * 2021-06-15 2021-09-17 北京三快在线科技有限公司 Method and device for model training and business execution
US20220310084A1 (en) * 2021-03-24 2022-09-29 Adobe Inc. Extensible search, content, and dialog management system with human-in-the-loop curation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294341A (en) * 2015-05-12 2017-01-04 阿里巴巴集团控股有限公司 A kind of Intelligent Answer System and theme method of discrimination thereof and device
CN108230007A (en) * 2017-11-28 2018-06-29 北京三快在线科技有限公司 A kind of recognition methods of user view, device, electronic equipment and storage medium
CN109871446A (en) * 2019-01-31 2019-06-11 平安科技(深圳)有限公司 Rejection method for identifying, electronic device and storage medium in intention assessment
CN110377720A (en) * 2019-07-26 2019-10-25 中国工商银行股份有限公司 The more wheel exchange methods of intelligence and system
CN110717536A (en) * 2019-09-30 2020-01-21 北京三快在线科技有限公司 Method and device for generating training sample

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200710