CN117151078A - Social platform-oriented few-sample multi-field user intention recognition method and system - Google Patents

Publication number
CN117151078A
Authority
CN
China
Prior art keywords
domain
data
intention
few
corpus
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311005859.7A
Other languages
Chinese (zh)
Inventor
杜亚军
王旭阳
李显勇
刘佳
李艳丽
陈晓亮
谢春芝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xihua University
Original Assignee
Xihua University
Application filed by Xihua University
Priority to CN202311005859.7A
Publication of CN117151078A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a social platform-oriented few-sample multi-domain user intention recognition method and system, relating to the technical field of natural language processing. The method comprises: acquiring user intention corpora covering multiple domains from a plurality of social platforms, preprocessing the acquired corpora, and classifying the preprocessed corpora by domain; and, in each training iteration, randomly sampling tasks from two different domains of the classified user intention corpus data set to perform domain adversarial training, which comprises: encoding the texts of the two tasks into word vectors, respectively; extracting the domain features of the texts from the two sets of word vectors, respectively, and converting the domain features into domain knowledge; classifying the generated domain knowledge to determine the domain to which it corresponds; enhancing the word vectors of the two tasks with the generated domain knowledge to obtain sentence vectors, respectively; and classifying the sentence vectors to determine the category to which each belongs.

Description

Social platform-oriented few-sample multi-field user intention recognition method and system
Technical Field
The application relates to the technical field of natural language processing, in particular to a social platform-oriented few-sample multi-field user intention recognition method and system.
Background
With the advent of the Internet era, browsing real-time information on major social platforms and posting one's own opinions in real time have become part of people's daily life. When hundreds of millions of users comment on an incident on a social platform, a lack of effective supervision is very likely to lead to the rapid and widespread propagation of negative public opinion. Identifying user intention (especially negative or guiding intention) on social platforms with artificial intelligence techniques therefore has great practical significance and value.
In addition, current social platforms generally carry rich information from many domains. Sina Weibo, for example, has more than 40 classification areas, including finance, photography, automobiles, sports and digital products. In each domain, users discuss topics related to that domain, and they show different intentions toward information from different domains. Users may also use the specialized vocabulary and expressions of a domain when discussing its content. The information of each domain on a social platform therefore differs, and efficiently recognizing user intention in every domain remains a difficult problem.
Disclosure of Invention
To solve the technical problems in the related art, the application provides a social platform-oriented few-sample multi-domain user intention recognition method and system. A model built by the method has a stronger capability of recognizing user intention corpora from different domains, does not suffer a marked drop in recognition performance when the domain changes, and maintains efficient and accurate recognition when recognizing user intention in corpora from sudden public opinion events.
In order to achieve the above purpose, the technical scheme adopted by the application comprises the following steps:
according to a first aspect of the present application, there is provided a social platform-oriented few-sample multi-domain user intention recognition method comprising the steps of:
acquiring user intention corpora covering multiple domains from a plurality of social platforms, preprocessing the acquired user intention corpora, and classifying the preprocessed user intention corpora by domain;
in each training iteration, randomly sampling tasks from two different domains of the classified user intention corpus data set to perform domain adversarial training: encoding the texts of the two tasks into word vectors, respectively; extracting the domain features of the texts from the two sets of word vectors, respectively, and converting the domain features into domain knowledge; classifying the generated domain knowledge to determine the domain to which it corresponds; enhancing the word vectors of the two tasks with the generated domain knowledge to obtain sentence vectors, respectively; and classifying the sentence vectors to determine the category to which each belongs.
Optionally, the fields include a general field and a specific field, the general field including: politics, finance, entertainment, sports, transportation, smart appliances, sports, games, digital, apparel, make-up and home; the specific fields include: event domain, regional domain, custom domain, and religious domain.
Optionally, the method further comprises:
the classified user intention corpora are used as training data, and in each training iteration the training data are sampled from a randomly selected domain.
Optionally, the method further comprises:
in each training iteration, first randomly sampling two different target domains, and then randomly sampling an N-way-K-shot task from each of the two target domains, wherein an N-way-K-shot task contains intention data of only N categories, and each category contains only K labeled samples and Q unlabeled samples;
inputting the two tasks from different domains into a model, and mapping the corresponding user intention corpora into a low-dimensional dense vector space, respectively, to obtain the word vectors corresponding to the two tasks;
and extracting the corresponding domain features from the two sets of word vectors using a bidirectional LSTM, and converting the domain features into the corresponding domain knowledge using a fully connected layer and an activation function.
Optionally, the preprocessing includes data cleaning and stop-word removal.
Optionally, the method specifically includes:
Step S1-1: acquiring user intention corpora covering multiple domains from a plurality of social platforms and preprocessing the corpora;
Step S1-2: dividing the preprocessed user intention corpora by domain: $Data_{train} = \{Dom_1, Dom_2, \ldots, Dom_n\}$;
Step S2-1: randomly sampling a domain $Dom_a$ from the classified user intention corpus data set $Data_{train}$ as a target domain, where $1 \le a \le n$;
Step S2-2: randomly sampling N categories of data from the target domain $Dom_a$;
Step S2-3: randomly sampling K+Q samples from each category;
Step S2-4: dividing the sampled N×(K+Q) samples to obtain a task consisting of a support set S and a query set Q, where S is the support set containing N×K labeled data, Q is the query set containing N×Q unlabeled data, X denotes an intention text, and Y denotes the corresponding category label;
Step S2-5: repeating steps S2-1 to S2-4 to sample another task from another domain $Dom_b$;
Step S3: inputting the data sampled from the different domains for model training;
Step S4: performing word embedding on the texts of the two tasks using the BERT model, to obtain the word vectors corresponding to the texts of the two tasks, respectively;
Step S5: extracting domain features from the word vectors obtained in step S4 using a bidirectional LSTM, to obtain the domain features corresponding to the word vectors of the two tasks, respectively;
Step S6: converting the domain features of the two tasks into the corresponding domain knowledge using a fully connected layer and an activation function, respectively, where softmax(·) is the activation function and $W_1$ and $z_1$ are the weights and bias of the fully connected layer;
Step S7: classifying the domain knowledge of the two tasks using a domain discriminator d(·), to obtain the prediction probability matrix corresponding to each piece of domain knowledge;
Step S8: enhancing the word vectors with the generated domain knowledge to obtain the enhanced sentence vectors of the two tasks, respectively;
Step S9: identifying the sentence vectors of the query sets obtained in step S8, where $p_a$ and $p_b$ are the probability matrices corresponding to the sentence vectors of the two query sets, and $W_2$ and $z_2$ are the weights and bias of the fully connected layer.
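As a reading aid, the data flow of steps S4 to S9 may be summarized roughly as follows for each task $t \in \{a, b\}$. This is only a sketch: the symbols $X_t$, $E_t$, $F_t$, $G_t$, $\hat{D}_t$ and $V_t$ are hypothetical notation introduced here, the enhancement operator of step S8 is not specified in this text and is therefore left abstract as $\mathrm{Enhance}$, and the use of softmax in step S9 is an assumption; only softmax(·), $W_1$, $z_1$, d(·), $p_a$, $p_b$, $W_2$ and $z_2$ come from the steps above.

```latex
\begin{align*}
E_t &= \mathrm{BERT}(X_t) && \text{(S4: word vectors of the task texts)} \\
F_t &= \mathrm{BiLSTM}(E_t) && \text{(S5: domain features)} \\
G_t &= \mathrm{softmax}(W_1 F_t + z_1) && \text{(S6: domain knowledge)} \\
\hat{D}_t &= d(G_t) && \text{(S7: domain prediction probabilities)} \\
V_t &= \mathrm{Enhance}(E_t, G_t) && \text{(S8: enhanced sentence vectors; operator unspecified)} \\
p_t &= \mathrm{softmax}(W_2 V_t + z_2) && \text{(S9: intention probabilities; softmax assumed)}
\end{align*}
```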
According to a second aspect of the present application, there is further provided a social platform-oriented few-sample multi-domain user intention recognition system for performing a social platform-oriented few-sample multi-domain user intention recognition method according to any one of the first aspect of the present application, the system comprising a domain-generalized few-sample multi-domain user intention recognition framework, wherein the domain-generalized few-sample multi-domain user intention recognition framework comprises:
the social platform corpus processing module is used for acquiring multi-domain user intention corpus from a plurality of social platforms, and preprocessing and classifying the user intention corpus;
the task sampling module is used for sampling the tasks used for training and testing in each iteration; for each task, the task sampling module first randomly selects one domain as the target domain and then samples an N-way-K-shot task from the target domain, wherein an N-way-K-shot task contains intention data of only N categories, and each category contains only K labeled samples and Q unlabeled samples; in each training iteration, the model is trained with the N×K labeled data and then tested on the N×Q unlabeled data.
Optionally, the system further comprises a domain adversarial model, wherein the domain adversarial model comprises:
the intention text embedding module is a pre-training model BERT, and in each iteration, the intention text embedding module is used for encoding two randomly sampled N-way-K-shot tasks from different fields so as to convert a user intention corpus into word vectors corresponding to semantic information;
the domain knowledge generator module is used for extracting domain features from the word vectors obtained by the intention text embedding module and converting the domain features into domain knowledge;
the domain discriminator module is used for classifying the domain knowledge generated by the domain knowledge generator module and judging from which of two different domains the domain knowledge is from;
and the intention classifier module is used for classifying sentence vectors of the two query sets and judging the category to which the contained user corpus intention belongs.
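The final stage of the domain knowledge generator (a fully connected layer followed by softmax) and the domain discriminator that consumes its output can be illustrated with a minimal pure-Python sketch. The function names, the toy weights and the choice of a single linear layer plus softmax for the discriminator are all assumptions made for illustration; the BiLSTM feature extractor is replaced here by a fixed example feature vector.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def softmax(x: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def domain_knowledge(feature: Vector, w1: Matrix, z1: Vector) -> Vector:
    """Domain feature -> domain knowledge via fully connected layer + softmax."""
    return softmax(dense(feature, w1, z1))


def discriminate(knowledge: Vector, wd: Matrix, zd: Vector) -> Vector:
    """Hypothetical discriminator d(.): probabilities over the two sampled domains."""
    return softmax(dense(knowledge, wd, zd))


# Toy values (assumptions): a 3-dimensional domain feature and small weights.
feature = [0.2, -0.4, 1.0]
w1 = [[0.1, 0.0, -0.3], [0.5, 0.2, 0.1], [-0.2, 0.4, 0.3], [0.0, -0.1, 0.2]]
z1 = [0.0, 0.1, -0.1, 0.05]
wd = [[0.3, -0.2, 0.1, 0.0], [-0.1, 0.4, 0.2, -0.3]]
zd = [0.0, 0.0]

knowledge = domain_knowledge(feature, w1, z1)
domain_probs = discriminate(knowledge, wd, zd)
```

In a real system the weights would be learned jointly with the encoder during domain adversarial training rather than fixed by hand.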
According to a third aspect of the present application, there is further provided a computer readable storage medium having instructions stored therein, which when run on a terminal, cause the terminal to perform a social platform oriented few-sample multi-domain user intent recognition method as described in any of the technical solutions of the first aspect of the present application.
According to a fourth aspect of the present application, there is further provided an electronic device, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a computer program or instructions to implement the social platform-oriented few-sample multi-domain user intention recognition method according to any one of the first aspect of the present application.
Beneficial effects:
1. In the technical scheme of the application, on the one hand, multi-domain corpora are used as training data and divided by domain, so that the trained model can be applied to different domains and its domain adaptation capability is markedly enhanced. On the other hand, the model is designed on the basis of domain adversarial training and can be trained with user intention corpora from different domains; the resulting target-domain knowledge helps the model adapt better to the target domain. The method is therefore suited to the situation in which brand-new user intention corpora from many domains continuously emerge on social platforms, and it achieves efficient and accurate recognition performance.
2. Other benefits or advantages of the present application will be described in more detail in the detailed description.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed for the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art. Wherein:
FIG. 1 is a flow chart of steps of a social platform-oriented few-sample multi-domain user intent recognition method provided by an exemplary embodiment of the present application;
FIG. 2 is a flowchart illustrating steps of a social platform-oriented few-sample multi-domain user intent recognition method provided by another exemplary embodiment of the present application;
FIG. 3 is a schematic flow diagram of an algorithm for a domain generalized few-sample multi-domain user intent recognition framework provided by an exemplary embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Furthermore, references to the terms "comprising" and "having" and any variations thereof in the description of the present application are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed but may optionally include other steps or elements not listed or inherent to such process, method, article, or apparatus.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g." in an embodiment should not be taken as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In the following, some related terms and techniques involved in the embodiments of the present application are explained.
1) BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained language model that learns strong language representation capability through unsupervised pre-training on a large-scale corpus.
2) LSTM (Long Short-Term Memory) is a recurrent neural network that can retain both long-term and short-term information.
To help those skilled in the art understand the technical solution of the application more clearly and accurately, the related art is first described in more detail below.
With the advent of the Internet era, browsing real-time information on major social platforms (such as Weibo, Xiaohongshu and Douban) and posting one's own opinions in real time have become part of people's daily life. In January 2023, for example, the active users of the mainstream Chinese social platforms Sina Weibo and Xiaohongshu reached 572.51 million and 172.48 million, respectively. When hundreds of millions of users comment on an incident on a social platform, a lack of effective supervision is very likely to lead to the rapid and widespread propagation of negative public opinion. For example, in 2018 a bus in xx city fell into a river. After the first official report on the case, the disaster itself aroused grief among many netizens for the passengers involved, while various rumors and negative comments spread widely and aroused strong anger among netizens. After the second official report (which released the surveillance video of the incident), the authorities had clarified the facts and refuted the rumors, so the negative sentiment was contained, online public opinion was guided in a positive direction, and public emotion gradually calmed. Identifying user intention (especially negative or guiding intention) on social platforms with artificial intelligence techniques therefore has great practical significance and value.
In addition, current social platforms generally carry rich information from many domains, and users may use the specialized vocabulary and expressions of a domain when discussing its content. For example, in the health domain, the xx net published a microblog post reading "xx, deputy chief physician of ophthalmology at xx Hospital: looking at a phone while lying on one's side may cause acute concomitant esotropia." Here "ophthalmology", "acute concomitant esotropia" and similar terms are all professional vocabulary. The information of each domain on a social platform therefore differs, and efficiently recognizing the user intention belonging to each domain remains a difficult problem.
At present, in the related art, a few-sample model is generally trained on a crawled general user intention corpus with a large number of labels; once training is complete, a brand-new user intention corpus can be classified using only a small amount of labeled data.
However, the related art does not account for the domain shift (that is, the difference between data distributions) between the original training data and the new user intention corpus. When data from a new domain are processed, a model trained on the original user intention corpus has difficulty adapting to the difference between data distributions, which results in poor recognition performance. For example, if a model in the related art is trained on a user intention corpus from the entertainment domain, its recognition performance degrades significantly when it processes data from the politics domain. On a social platform, users receive information from many domains and form different intentions toward the information of each domain. User intention also changes with time, environment and social development. An intention recognition model therefore needs a strong capability of recognizing user intention corpora from different domains without a marked drop in recognition performance when the domain changes.
That is, the intention recognition models in the related art have difficulty processing user intention corpora from different domains efficiently and reliably. Their effectiveness and reliability drop markedly on corpora from sudden public opinion events in particular, and they can hardly guarantee efficient and accurate recognition of user intention in such corpora.
In view of the above, the application provides a new user intention corpus recognition model, namely a social platform-oriented few-sample multi-domain user intention recognition method and system. A model built by the method has a stronger capability of recognizing user intention corpora from different domains, avoids a marked drop in recognition performance caused by domain changes, and maintains efficient and accurate recognition when recognizing user intention in corpora from sudden public opinion events.
The technical scheme of the application is described in detail below with reference to the accompanying drawings.
Example 1
Referring to FIG. 1, according to a first aspect of the present application, this embodiment provides a social platform-oriented few-sample multi-domain user intention recognition method, which may include the following steps:
Acquiring user intention corpora covering multiple domains from a plurality of social platforms, preprocessing the acquired user intention corpora, and classifying the preprocessed user intention corpora by domain.
In this embodiment, unlike the related art, the application uses multi-domain corpora as training data and divides the training data by domain. A model trained by the method therefore encounters user intention corpora from many different domains from the start of training, and after training it can be applied to different domains (or data distributions), which effectively enhances its domain adaptation capability.
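Dividing the training data by domain can be sketched as a simple grouping step. The record layout `(domain, intent_label, text)`, the function name `divide_by_domain` and the example records are all hypothetical; the source does not specify how the corpus is stored.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (domain, intent_label, text).
Record = Tuple[str, str, str]


def divide_by_domain(records: Iterable[Record]) -> Dict[str, Dict[str, List[str]]]:
    """Group texts as domain -> intent label -> list of texts."""
    grouped: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for domain, label, text in records:
        grouped[domain][label].append(text)
    # Convert to plain dicts so that a missing key raises instead of being created.
    return {dom: dict(by_label) for dom, by_label in grouped.items()}


# Made-up example records for illustration only.
records = [
    ("finance", "complaint", "fees went up again"),
    ("finance", "inquiry", "how do I open an account"),
    ("sports", "praise", "what a great match"),
    ("finance", "complaint", "the app keeps crashing"),
]
data_train = divide_by_domain(records)
```

Keeping the labels nested under each domain makes the later per-domain, per-category sampling a matter of two dictionary lookups.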
In each training iteration, tasks from two different domains are randomly sampled from the classified user intention corpus data set to perform domain adversarial training: encoding the texts of the two tasks into word vectors, respectively; extracting the domain features of the texts from the two sets of word vectors, respectively, and converting the domain features into domain knowledge; classifying the generated domain knowledge to determine the domain to which it corresponds; enhancing the word vectors of the two tasks with the generated domain knowledge to obtain sentence vectors, respectively; and classifying the sentence vectors to determine the category to which each belongs.
In this embodiment, domain adversarial training is performed on tasks sampled from two different domains, which further improves the model's capability of handling user intention corpora from different domains. After domain adversarial training, a model trained by the method needs only a small amount of labeled data (which may come from a domain the model has not seen) to recognize user intention corpora in any domain with high quality.
In the technical scheme of the application, on the one hand, multi-domain corpora are used as training data and divided by domain, so that the trained model can be applied to different domains and its domain adaptation capability is markedly enhanced. On the other hand, the model is designed on the basis of domain adversarial training and can be trained with user intention corpora from different domains; the resulting target-domain knowledge helps the model adapt better to the target domain. The method is therefore suited to the situation in which brand-new user intention corpora from many domains continuously emerge on social platforms, and it achieves efficient and accurate recognition performance.
In an alternative embodiment, the fields of the application may include the general fields and the specific fields, the general fields including: politics, finance, entertainment, sports, transportation, smart appliances, sports, games, digital, apparel, make-up and home; specific fields include: event domain, regional domain, custom domain, and religious domain.
The general domains are domains related to users' daily life, and the specific domains concern special emergency events (for example, a public health emergency, an emergency centered on a certain region, or a sudden public opinion event centered on a certain custom or a certain religion).
It should further be noted that "social platform" in the application is a generic term covering a number of current mainstream social platforms, for example Sina Weibo, Zhihu and Xiaohongshu. Because users of different social platforms differ in their modes of expression and word habits, corpora from different platforms enrich the training data.
In an optional implementation, the social platform-oriented few-sample multi-domain user intention recognition method of the application may further include the following step: the classified user intention corpora are used as training data, and in each training iteration the training data are sampled from a randomly selected domain. This simulates the model encountering a domain different from those seen before, and requires the model to adapt as far as possible to the data distributions of different domains.
In an optional implementation manner, the social platform-oriented few-sample multi-domain user intention recognition method of the application may further comprise the following steps. In each training iteration, two different target domains are first randomly sampled, and then one N-way-K-shot task is randomly sampled from each of the two target domains. An N-way-K-shot task contains only N categories of intention data, and each category contains only K labeled samples and Q unlabeled samples, where K is a small positive integer (for example, 1, 3, 4, or 5). In each training iteration, the model is trained on the N×K labeled samples and then tested on the N×Q unlabeled samples. This arrangement simulates the few-sample situation: for example, in a sudden public opinion event, some negative user intention corpus categories have only a small amount of annotated data. After a large number of training iterations (usually hundreds or thousands), the model gradually learns how to use a small amount of annotated data to identify intention data from an unknown domain. The two tasks from different domains are input into the model, and the corresponding user intention corpora are each mapped into a low-dimensional dense vector space to obtain the word vectors of the two tasks; the corresponding domain features are then extracted from the two sets of word vectors with a bidirectional LSTM and converted into the corresponding domain knowledge by a fully connected layer and an activation function. Sampling two tasks from different domains in each iteration in this way effectively enhances the effect of the subsequent domain countermeasure training.
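The sampling procedure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the applicant's implementation: the corpus layout (domain, then intention category, then a list of texts), the function names `sample_task` and `sample_iteration`, and the toy corpus are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical corpus layout: domain -> intention category -> list of texts.
Corpus = Dict[str, Dict[str, List[str]]]
Example = Tuple[str, str]  # (intention text, category label)


def sample_task(corpus: Corpus, domain: str, n_way: int, k_shot: int,
                q_query: int, rng: random.Random):
    """Sample one N-way-K-shot task (support set, query set) from one domain."""
    categories = rng.sample(sorted(corpus[domain]), n_way)
    support: List[Example] = []
    query: List[Example] = []
    for label in categories:
        # Draw K+Q distinct texts, then split them into support and query.
        texts = rng.sample(corpus[domain][label], k_shot + q_query)
        support += [(t, label) for t in texts[:k_shot]]
        query += [(t, label) for t in texts[k_shot:]]
    return support, query


def sample_iteration(corpus: Corpus, n_way: int, k_shot: int,
                     q_query: int, rng: random.Random):
    """Sample two tasks from two different, randomly chosen domains."""
    dom_a, dom_b = rng.sample(sorted(corpus), 2)
    task_a = sample_task(corpus, dom_a, n_way, k_shot, q_query, rng)
    task_b = sample_task(corpus, dom_b, n_way, k_shot, q_query, rng)
    return (dom_a, task_a), (dom_b, task_b)


# Toy corpus: 3 domains, 4 categories each, 6 texts per category.
toy_corpus = {
    f"dom{d}": {f"intent{c}": [f"d{d}-c{c}-text{i}" for i in range(6)]
                for c in range(4)}
    for d in range(3)
}
(dom_a, (sup_a, qry_a)), (dom_b, (sup_b, qry_b)) = sample_iteration(
    toy_corpus, n_way=3, k_shot=2, q_query=3, rng=random.Random(0))
print(dom_a, dom_b, len(sup_a), len(qry_a))
```

With N=3, K=2 and Q=3, each task holds 6 support samples and 9 query samples. The query labels are kept here only so that the test on the query set can be scored.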
In an alternative embodiment, the preprocessing in the present application may include data cleaning and stop-word removal.
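As an illustration of the two preprocessing operations named above, the following sketch cleans a post and removes stop words. The cleaning rules (dropping URLs, @mentions and punctuation), the tiny stop-word list, and the whitespace tokenization are assumptions chosen for a short English example; Chinese social media text would additionally need a word segmenter and a Chinese stop-word list.

```python
import re
from typing import Iterable, List, Set

# Hypothetical stop-word list; a real system would load a full list.
STOP_WORDS: Set[str] = {"the", "a", "an", "is", "to", "of", "and"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^\w\s]")


def clean_text(text: str) -> str:
    """Data cleaning: drop URLs, @mentions and punctuation, collapse spaces."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text)
    return " ".join(text.lower().split())


def remove_stop_words(tokens: Iterable[str],
                      stop_words: Set[str] = STOP_WORDS) -> List[str]:
    """Keep only the tokens that are not in the stop-word list."""
    return [tok for tok in tokens if tok not in stop_words]


def preprocess(text: str) -> List[str]:
    """Clean the text, split it on whitespace, and remove stop words."""
    return remove_stop_words(clean_text(text).split())


tokens = preprocess("@user The delivery is late again!! see https://example.com/x")
print(tokens)  # ['delivery', 'late', 'again', 'see']
```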
Referring to fig. 2, in an exemplary embodiment, the social platform-oriented few-sample multi-domain user intention recognition method of the present application may specifically include:
Step S1-1: acquiring user intention corpora in multiple domains from multiple social platforms and preprocessing them; for example, the multi-domain, multi-platform user intention corpus may be crawled by a crawling module and then preprocessed (e.g., data cleaning, stop-word removal, etc.).
Step S1-2: dividing the preprocessed user intention corpus by domain: $Data_{train} = \{Dom_1, Dom_2, \dots, Dom_n\}$; that is, the multi-domain user intention corpus is divided and sorted according to the domain to which each text belongs.
Step S2-1: randomly sampling one domain $Dom_a$ from the classified user intention corpus data set $Data_{train}$ as a target domain, where $1 \le a \le n$;
Step S2-2: randomly sampling N categories of data from the target domain $Dom_a$;
Step S2-3: randomly sampling K+Q samples from each category;
Step S2-4: dividing the sampled N×(K+Q) samples to obtain a task $t_a$ consisting of a support set S containing N×K labeled samples and a query set Q containing N×Q unlabeled samples, where X denotes an intention text and Y denotes the corresponding category label; task $t_a$ thus has its own support-set text and query-set text;
Step S2-5: repeating steps S2-1 to S2-4 to sample another task $t_b$ from another domain $Dom_b$, where task $t_b$ likewise has its own support-set text and query-set text;
Step S3: inputting the data sampled from the different domains into the model for training;
Step S4: performing word embedding on the texts of tasks $t_a$ and $t_b$, respectively, using the BERT model, to obtain the word vectors corresponding to these texts;
Step S5: extracting domain features from the word vectors obtained in step S4, respectively, using a bidirectional LSTM, to obtain the domain features corresponding to each word vector; here the forward and backward LSTMs compute the forward and backward hidden states, respectively, and the two hidden states are concatenated to obtain the domain features.
Step S6: converting the domain features into the corresponding domain knowledge, respectively, using a fully connected layer and an activation function, where softmax(·) is the activation function and $W_1$ and $z_1$ are the weight and bias of the fully connected layer;
Step S7: classifying the domain knowledge, respectively, using a domain discriminator, to obtain the prediction probability matrix corresponding to each piece of domain knowledge, where d(·) is the domain discriminator;
Step S8: enhancing the word vectors with the generated domain knowledge to obtain enhanced sentence vectors, respectively; that is, the computed domain knowledge is used to enhance the BERT word embeddings and to derive a sentence-level representation of each text.
The purpose of step S7 is to apply pressure (the so-called countermeasure) to the training process of the domain knowledge generator in order to improve the quality of the domain knowledge it generates. The domain discriminator consists of three fully connected layers (a basic neural network) with the ReLU activation function. For convenience of description, the present application denotes the domain discriminator as d(·).
Step S9: identifying the sentence vectors of the query sets obtained in step S8, respectively, where $p_a$ and $p_b$ are the probability matrices corresponding to the sentence vectors of the query sets of tasks $t_a$ and $t_b$, respectively, and $W_2$ and $z_2$ are the weight and bias of the fully connected layer.
The purpose of step S9 is to classify the sentence vectors of the two query sets so as to determine the category to which the intention contained in each belongs. It should be noted that a fully connected layer is used as the classifier.
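The domain discriminator (three fully connected layers with ReLU) and the intention classifier (a single fully connected layer) described above can be sketched in plain Python as follows. The layer sizes, the random initialization, the final softmax of the discriminator, and all function names are assumptions made for illustration; a practical implementation would use a deep learning framework and trained weights.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]
Layer = Tuple[Matrix, Vector]  # (weights, bias)


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def softmax(x: Vector) -> Vector:
    shift = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Random weights and zero bias (untrained, illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def domain_discriminator(knowledge: Vector, layers: List[Layer]) -> Vector:
    """Three fully connected layers, ReLU in between, softmax over 2 domains."""
    h = knowledge
    for weights, bias in layers[:-1]:
        h = relu(dense(h, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(h, weights, bias))


def intent_classifier(sentence_vec: Vector, w2: Matrix, z2: Vector) -> Vector:
    """Single fully connected layer followed by softmax over N categories."""
    return softmax(dense(sentence_vec, w2, z2))


rng = random.Random(0)
d_layers = [init_layer(8, 16, rng), init_layer(16, 16, rng), init_layer(16, 2, rng)]
w2, z2 = init_layer(8, 5, rng)  # 5 intention categories (N = 5), assumed
vec = [rng.random() for _ in range(8)]
domain_probs = domain_discriminator(vec, d_layers)
intent_probs = intent_classifier(vec, w2, z2)
print(domain_probs, intent_probs)
```

The discriminator outputs one probability per sampled domain and the classifier one probability per intention category; each output sums to 1.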
In this embodiment, when training the model, the user intention corpus processed in step S1-1 and step S1-2 is used as training data, and a domain countermeasure model is obtained through steps S2 to S9. After training, the domain countermeasure model can use a small amount of labeled data to achieve high-quality recognition of user intention corpus categories from domains that the model has never seen.
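The formulas belonging to steps S4 to S9 are not reproduced in this text. As a rough guide only, the steps could take the following form for task $t_a$ (and identically for $t_b$). The names $E_a$, $F_a$, $G_a$, $\hat{y}^{d}_a$, $S_a$, $S^{q}_a$, and the generic enhancement function $f$ are assumptions introduced here for illustration; only softmax(·), d(·), $W_1$, $z_1$, $W_2$, $z_2$, and $p_a$ come from the description, and the softmax in the last line is likewise assumed.

```latex
\begin{align*}
E_a &= \mathrm{BERT}(X_a)
  && \text{(S4: word vectors)}\\
F_a &= \bigl[\,\overrightarrow{\mathrm{LSTM}}(E_a)\,;\,\overleftarrow{\mathrm{LSTM}}(E_a)\,\bigr]
  && \text{(S5: concatenated forward and backward hidden states)}\\
G_a &= \mathrm{softmax}(W_1 F_a + z_1)
  && \text{(S6: domain knowledge)}\\
\hat{y}^{d}_a &= d(G_a)
  && \text{(S7: domain prediction probabilities)}\\
S_a &= f(E_a, G_a)
  && \text{(S8: enhanced sentence vectors; } f \text{ is not specified here)}\\
p_a &= \mathrm{softmax}(W_2 S^{q}_a + z_2)
  && \text{(S9: intention probabilities for the query set)}
\end{align*}
```

Here $S^{q}_a$ stands for the rows of $S_a$ that belong to the query set of task $t_a$.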
Through the above technical scheme, the social platform-oriented few-sample multi-domain user intention recognition method covers the acquisition and division of user intention corpora from social platforms, task sampling, intention text embedding, countermeasure training based on multi-domain data, and recognition of user intention corpora from different domains. The recognition method is easy to implement and to deploy on a large scale, the training of the domain countermeasure model is efficient and stable, and the multi-domain intention corpus recognition approach is reasonable and effective. The application can effectively recognize user intention corpora in multiple domains when only a small amount of annotated data is available. When sudden, negative public opinion events occur, the application can adapt to and identify user intentions from the different domains involved in the course of the event. In particular, the application maintains a high level of recognition performance for user intentions in different domains, without large fluctuations caused by a change of the intention domain. This greatly helps to achieve stable, high-quality public opinion guidance and control, gives a social platform a basis for the direction of user public opinion guidance, and has high practical and commercial value.
According to a second aspect of the present application, there is further provided a social platform-oriented few-sample multi-domain user intention recognition system for performing the social platform-oriented few-sample multi-domain user intention recognition method according to any one of the first aspect of the present application, the system including a domain-generalized few-sample multi-domain user intention recognition framework, wherein the domain-generalized few-sample multi-domain user intention recognition framework includes:
the social platform corpus processing module is used for acquiring multi-domain user intention corpus from a plurality of social platforms, and preprocessing and classifying the user intention corpus;
the task sampling module is used for sampling the tasks used for training and testing in each iteration; for each task, the task sampling module first randomly selects one domain as the target domain and then samples an N-way-K-shot task from that target domain, wherein an N-way-K-shot task contains only N categories of intention data and each category contains only K labeled samples and Q unlabeled samples; in each training iteration, the model is trained on the N×K labeled samples and then tested on the N×Q unlabeled samples.
In an alternative embodiment, the social platform-oriented few-sample multi-domain user intent recognition system of the present application may further include a domain countermeasure model, wherein the domain countermeasure model includes:
the intention text embedding module is a pre-training model BERT, and in each iteration, the intention text embedding module is used for encoding two randomly sampled N-way-K-shot tasks from different fields so as to convert user intention corpus into word vectors corresponding to semantic information;
the domain knowledge generator module is used for extracting domain features from the word vectors obtained by the intention text embedding module and converting the domain features into domain knowledge;
the domain discriminator module is used for classifying the domain knowledge generated by the domain knowledge generator module and judging which of the two different domains the domain knowledge comes from;
and the intention classifier module is used for classifying sentence vectors of the two query sets and judging the category to which the contained user corpus intention belongs.
In this way, the application comprises one framework (the domain-generalized few-sample multi-domain user intention recognition framework) and one model (the domain countermeasure model), which together are divided into six modules: the social platform corpus processing module, the task sampling module, the intention text embedding module, the domain knowledge generator module, the domain discriminator module, and the intention classifier module. These modules respectively implement the acquisition and arrangement of multi-domain, multi-platform user intention corpus data, task sampling based on a few-sample learning method, intention text embedding based on a pre-trained model, domain-specific knowledge generation based on the domain knowledge generator, domain knowledge identification based on the domain discriminator, and multi-domain intention corpus classification based on the intention classifier. In the social platform corpus processing module and the task sampling module, the model is trained on multi-domain data, the training data are divided by domain, and the corpus of one domain is randomly sampled in each iteration; this arrangement significantly enhances the domain adaptability of the model. In the domain knowledge generator module and the domain discriminator module, the application introduces two tasks from different domains to strengthen the effect of the domain countermeasure training. The application can therefore effectively improve the quality of the domain knowledge extracted by the domain knowledge generator and help the model adapt better to the target domain.
According to the embodiment of the application, the social platform-oriented few-sample multi-domain user intention recognition system may be divided into functional modules or functional units according to the above method example; for example, each functional module or functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated modules may be implemented in hardware or as software functional modules or functional units. The division of modules or units in the embodiment of the present application is schematic and is merely a division by logical function; other division manners are possible in actual implementation.
According to a third aspect of the present application, there is further provided a computer readable storage medium having instructions stored therein, which when run on a terminal, cause the terminal to perform a social platform oriented few-sample multi-domain user intention recognition method as in any of the first aspects of the present application.
The computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a register, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, any suitable combination of the foregoing, or any other form of computer readable storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application specific integrated circuit (ASIC). In embodiments of the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
According to a fourth aspect of the present application, there is further provided an electronic device, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a computer program or instructions to implement a social platform-oriented few-sample multi-domain user intention recognition method as in any of the first aspects of the present application.
The above-described processor may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor may be a central processing unit, a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor may also be a combination that implements computing functions, e.g., a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, etc.
In the several embodiments provided herein, it should be understood that the disclosed systems, modules, and methods may be implemented in other manners. For example, the system embodiments described above are merely illustrative; the division of the units is merely a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The present application is not limited to the above embodiments, and any changes or substitutions within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (10)

1. The social platform-oriented few-sample multi-field user intention recognition method is characterized by comprising the following steps of:
acquiring user intention linguistic data in multiple fields of a plurality of social platforms, preprocessing the acquired user intention linguistic data, and carrying out field classification on the preprocessed user intention linguistic data;
In each training iteration, tasks in two different fields are randomly sampled from the classified user intention corpus data set to perform field countermeasure training: respectively encoding texts of the two tasks into word vectors; extracting the domain features of the text from the two obtained word vectors respectively, and converting the domain features into domain knowledge; classifying the generated domain knowledge and judging the domain corresponding to the domain knowledge; word vectors of the two tasks are correspondingly enhanced respectively by using the generated domain knowledge so as to obtain sentence vectors respectively; and classifying the sentence vectors and judging the category to which the sentence vectors belong.
2. The social platform-oriented few-sample multi-domain user intent recognition method of claim 1, wherein the domains include a common domain and a specific domain, the common domain including: politics, finance, entertainment, sports, transportation, smart appliances, sports, games, digital, apparel, make-up and home; the specific fields include: event domain, regional domain, custom domain, and religious domain.
3. The social platform-oriented few-sample multi-domain user intent recognition method of claim 1, further comprising:
The classified user intention corpus is used as training data, and in each training iteration the data are sampled from a randomly selected domain.
4. The social platform-oriented few-sample multi-domain user intent recognition method of claim 1, further comprising:
in each training iteration, firstly randomly sampling two different target fields, and then randomly sampling an N-way-K-shot task from each of the two target fields, wherein the N-way-K-shot task means that each task only comprises N categories of intention data, and each intention data only comprises K marked samples and Q unmarked samples;
inputting two tasks from different fields into a model, and mapping corresponding user intention corpus into a low-dimensional dense vector space respectively to obtain word vectors corresponding to the two tasks;
and respectively extracting corresponding domain features from the two word vectors by utilizing the bidirectional LSTM, and converting the domain features into corresponding domain knowledge by utilizing the full connection layer and the activation function.
5. The social platform-oriented few-sample multi-domain user intent recognition method of claim 1, wherein the preprocessing includes data cleaning and stop-word removal.
6. The social platform-oriented few-sample multi-domain user intent recognition method of any one of claims 1-5, wherein the method specifically comprises:
step S1-1: acquiring user intention corpus in multiple fields of multiple social platforms and preprocessing the corpus;
Step S1-2: dividing the preprocessed user intention corpus by domain: $Data_{train} = \{Dom_1, Dom_2, \dots, Dom_n\}$;
Step S2-1: randomly sampling one domain $Dom_a$ from the classified user intention corpus data set $Data_{train}$ as a target domain, where $1 \le a \le n$;
Step S2-2: randomly sampling N categories of data from the target domain $Dom_a$;
Step S2-3: randomly sampling K+Q samples from each category;
Step S2-4: dividing the sampled N×(K+Q) samples to obtain a task $t_a$ consisting of a support set S containing N×K labeled samples and a query set Q containing N×Q unlabeled samples, where X denotes an intention text and Y denotes the corresponding category label;
Step S2-5: repeating steps S2-1 to S2-4 to sample another task $t_b$ from another domain $Dom_b$;
Step S3: inputting the data sampled from the different domains into the model for training;
Step S4: performing word embedding on the texts of tasks $t_a$ and $t_b$, respectively, using the BERT model, to obtain the word vectors corresponding to these texts;
Step S5: extracting domain features from the word vectors obtained in step S4, respectively, using a bidirectional LSTM, to obtain the domain features corresponding to each word vector;
Step S6: converting the domain features into the corresponding domain knowledge, respectively, using a fully connected layer and an activation function, where softmax(·) is the activation function and $W_1$ and $z_1$ are the weight and bias of the fully connected layer;
Step S7: classifying the domain knowledge, respectively, using a domain discriminator, to obtain the prediction probability matrix corresponding to each piece of domain knowledge, where d(·) is the domain discriminator;
Step S8: enhancing the word vectors with the generated domain knowledge to obtain enhanced sentence vectors, respectively;
Step S9: identifying the sentence vectors of the query sets obtained in step S8, respectively, where $p_a$ and $p_b$ are the probability matrices corresponding to the sentence vectors of the query sets of tasks $t_a$ and $t_b$, respectively, and $W_2$ and $z_2$ are the weight and bias of the fully connected layer.
7. A social platform-oriented few-sample multi-domain user intent recognition system for performing the social platform-oriented few-sample multi-domain user intent recognition method of any one of claims 1-6, the system comprising a domain-generalized few-sample multi-domain user intent recognition framework, wherein the domain-generalized few-sample multi-domain user intent recognition framework comprises:
The social platform corpus processing module is used for acquiring multi-domain user intention corpus from a plurality of social platforms, and preprocessing and classifying the user intention corpus;
the task sampling module is used for sampling tasks used for training and testing in each iteration, for each task, the task sampling module firstly randomly selects one field, and then samples an N-way-K-shot task from the target field, wherein the N-way-K-shot task means that each task only comprises N categories of intention data, and each intention data only comprises K marked samples and Q unmarked samples; also, in each training iteration, the model is required to be trained with n×k labeled data, after which a test is performed on n×q unlabeled data.
8. The social platform-oriented few-sample multi-domain user intent recognition system of claim 7, further comprising a domain countermeasure model, wherein the domain countermeasure model comprises:
the intention text embedding module is a pre-training model BERT, and in each iteration, the intention text embedding module is used for encoding two randomly sampled N-way-K-shot tasks from different fields so as to convert a user intention corpus into word vectors corresponding to semantic information;
The domain knowledge generator module is used for extracting domain features from the word vectors obtained by the intention text embedding module and converting the domain features into domain knowledge;
the domain discriminator module is used for classifying the domain knowledge generated by the domain knowledge generator module and judging from which of two different domains the domain knowledge is from;
and the intention classifier module is used for classifying sentence vectors of the two query sets and judging the category to which the contained user corpus intention belongs.
9. A computer readable storage medium, having instructions stored therein, which when run on a terminal, cause the terminal to perform the social platform oriented few-sample multi-domain user intent recognition method as claimed in any of claims 1-6.
10. An electronic device comprising a processor and a communication interface coupled to the processor for executing a computer program or instructions to implement the social platform-oriented few-sample multi-domain user intent recognition method of any one of claims 1-6.
CN202311005859.7A 2023-08-10 2023-08-10 Social platform-oriented few-sample multi-field user intention recognition method and system Pending CN117151078A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311005859.7A CN117151078A (en) 2023-08-10 2023-08-10 Social platform-oriented few-sample multi-field user intention recognition method and system

Publications (1)

Publication Number Publication Date
CN117151078A true CN117151078A (en) 2023-12-01

Family

ID=88883367

Country Status (1)

Country Link
CN (1) CN117151078A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination