CN112434736A - Deep active learning text classification method based on pre-training model - Google Patents

Deep active learning text classification method based on pre-training model

Info

Publication number
CN112434736A
Authority
CN
China
Prior art keywords
model
training
marking
classifier
data
Prior art date
Legal status
Pending
Application number
CN202011332730.3A
Other languages
Chinese (zh)
Inventor
尹学渊
祁松茂
江天宇
陈洪宇
Current Assignee
Chengdu Potential Artificial Intelligence Technology Co ltd
Original Assignee
Chengdu Potential Artificial Intelligence Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Chengdu Potential Artificial Intelligence Technology Co ltd filed Critical Chengdu Potential Artificial Intelligence Technology Co ltd
Priority to CN202011332730.3A priority Critical patent/CN112434736A/en
Publication of CN112434736A publication Critical patent/CN112434736A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a deep active learning text classification method based on a pre-training model. The method combines a pre-training model trained on a large amount of general text and uses the semantic encoding of a text obtained by the pre-training model as the input feature of a classifier. A classifier is then constructed and trained on an initial training set; based on this initial model and on the to-be-labeled sample selection strategy and data supplement strategy provided by the invention, training iterates continuously with the participation of manual labeling until the maximum number of iterations is reached or the labeling budget is exhausted; the resulting model is then embedded in a specific product and used for inference. By this method, higher recognition accuracy can be obtained at lower data-labeling cost while the diversity of the training samples is ensured.

Description

Deep active learning text classification method based on pre-training model
Technical Field
The invention relates to the technical field of automatic text classification, in particular to a deep active learning text classification method based on a pre-training model.
Background
Various text classification methods have been proposed in order to classify texts automatically. For example, the patent with publication No. CN110263173A provides a machine learning method and apparatus for rapidly improving text classification performance, but when determining which samples need manual labeling, that method uses a threshold to divide the samples into two parts, automatically labeled samples and samples requiring manual labeling: a higher threshold tends to increase the labeling cost, while a lower threshold increases the risk of introducing erroneous (automatic) labels. The patent with publication No. CN107169001A discloses a text classification model optimization method based on crowdsourcing feedback and active learning, but that method only selects the samples the model is most uncertain about for labeling in each round and then trains iteratively, which is not suitable for deep models, and active learning that labels only uncertain samples easily leads to a lack of diversity in the final training samples. The patent with publication No. CN110826470A provides a fundus image left/right eye identification method based on deep active learning, but that method fixes the threshold for selecting active learning labeling samples to a constant value (0.4-0.6), which easily makes the amount of data to label very large in the first several iterations of the active learning process and thus increases the labeling cost. It is therefore desirable to provide a scheme that achieves higher recognition accuracy at lower data labeling cost while ensuring the diversity of the training samples.
Disclosure of Invention
The invention aims to provide a deep active learning text classification method based on a pre-training model, so as to achieve the technical effect of obtaining higher recognition accuracy at lower data labeling cost while ensuring the diversity of the training samples.
The invention provides a deep active learning text classification method based on a pre-training model, which comprises: obtaining an input text and analyzing the language of the input text; performing semantic encoding on the input text through a pre-training model corresponding to the language, and acquiring the corresponding feature vector as the semantic encoding feature of the input text; and inputting the feature vector into a classifier for classification processing to obtain the corresponding classification result. The classifier is obtained by training through the following steps: selecting a classifier pre-training model according to the task type of the input text; labeling or pseudo-labeling the sample data and dividing the data set into three sets, namely the manually labeled data set D_L, the unlabeled data set D_U, and the high-confidence data set D_H; using the current manually labeled data set D_L as the training set to train the classifier pre-training model, and, after every preset number of iterations, using the current D_L and the current high-confidence data set D_H together as the training set; saving the classification model when training finishes, and returning the data in the current D_H to the unlabeled data set D_U to await the next round of the active learning iteration; and selecting the classification model obtained when the iterations finish or the labeling budget is exhausted as the final model of the classifier.
Further, training the classifier also includes: selecting samples for manual labeling based on uncertainty, and supplementing the manually labeled data set D_L with them.
Further, the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L adopt lowest-confidence sampling, which includes: computing the lc_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the lc_i values in ascending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling. The lc_i value is calculated as:
lc_i = max_j p(y_i = j | x_i; w)
where p is the probability given by the model; i indexes the input samples; j indexes the candidate classes; w denotes the parameters of the classifier model; x_i is the input of the i-th sample, i.e. the representation vector of the i-th sample obtained from the pre-training model; and y_i is the predicted class of the i-th sample.
Further, the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L adopt margin (difference) sampling, which includes: computing the ms_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the ms_i values in ascending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling. The ms_i value is calculated as:
ms_i = p(y_i = j_1 | x_i; w) - p(y_i = j_2 | x_i; w)
where p is the probability given by the model; i indexes the input samples; j_1 is the class with the highest probability in the analysis result; j_2 is the class with the lowest probability in the analysis result; and w denotes the parameters of the classifier model.
Further, the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L adopt entropy-based sampling, which includes: computing the en_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the en_i values in descending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling. The en_i value is calculated as:
en_i = - Σ_{j=1}^{m} p(y_i = j | x_i; w) log p(y_i = j | x_i; w)
where p is the probability given by the model; i indexes the input samples; j indexes the classes; w denotes the parameters of the classifier model; and m is the total number of classes.
Further, training the classifier also includes: calculating the information entropy of each unlabeled sample and comparing it with a set threshold; if the information entropy of an unlabeled sample is smaller than the threshold, the sample is marked as a high-confidence sample, the class with the highest probability in the classification model is assigned to it as a pseudo label, and the sample is added to the high-confidence data set D_H.
Further, the method also includes dynamically adjusting the threshold after each round of the active learning iteration is completed: the threshold for round t is computed from δ_0, the initial threshold, and dr, the decay rate of the threshold, where t denotes the round of the active learning iteration (the update formula is reproduced only as an image in the original publication).
Further, the pre-training model is a bert model; the feature vector is a fixed-dimension vector obtained by averaging the output of the n-th layer from the end of the bert model over the token dimension; or the feature vector is the maximum of the output of the n-th layer from the end of the bert model over the token dimension; or the feature vector is the fixed-dimension vector obtained by averaging the output of the n-th layer from the end of the bert model over the token dimension, spliced with the maximum of the output of the n-th layer from the end of the bert model over the token dimension, to form a d-dimensional vector; where the value of d is determined by the specific pre-training model type and its hyper-parameters, and n is an integer greater than 1.
Further, the pre-training model is a bert model; the feature vector is the vector corresponding to the [CLS] position in the last layer of the bert model.
The beneficial effects the invention can achieve are as follows: the invention combines a pre-training model trained on a large amount of general text and uses the semantic encoding of the text obtained by the pre-training model as the input feature of the classifier. A classifier is then constructed and trained on an initial training set; based on this initial model and on the to-be-labeled sample selection strategy and data supplement strategy provided by the invention, training iterates continuously with the participation of manual annotation until the maximum number of iterations is reached or the labeling budget is exhausted; the resulting model is then embedded in a specific product to execute the inference process. By this method, higher recognition accuracy can be obtained at lower data-labeling cost while the diversity of the training samples is ensured.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic flow chart of a deep active learning text classification method based on a pre-training model according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a classifier training process according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, fig. 1 is a schematic flow chart of a deep active learning text classification method based on a pre-training model according to an embodiment of the present invention.
The applicant has found through research that existing active learning methods for text classification generally select only the samples the model is most uncertain about for labeling and then train iteratively. Labeling only uncertain samples in active learning causes the training samples to lack diversity, and too few samples are not enough to support training a deep model, so such methods are not well suited to deep learning. The embodiment of the invention provides a deep active learning text classification method based on a pre-training model: by using both newly labeled samples selected by uncertainty and high-confidence samples in each iteration, the method meets the sample-size requirement for updating a deep model, reinforces the knowledge already learned while reducing the labeling cost, and thereby solves the above problems.
In an embodiment, the method for classifying a text based on deep active learning of a pre-training model according to the embodiment of the present invention includes:
Step S101: obtaining an input text and analyzing the language of the input text.
In one embodiment, the computer system first obtains the input text and then analyzes it to determine its language.
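As a concrete illustration (not part of the patent), the language-analysis step could use an off-the-shelf language detector; the langdetect package below is an assumption made purely for the sketch.

```python
# A minimal sketch of the language-analysis step. The patent does not name a
# specific tool; the langdetect package is assumed here purely for illustration.
from langdetect import detect

def detect_language(text: str) -> str:
    """Return a language code such as 'zh-cn' or 'en' for the input text."""
    return detect(text)

print(detect_language("基于预训练模型的深度主动学习文本分类方法"))  # e.g. 'zh-cn'
```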
Step S102: performing semantic encoding on the input text through a pre-training model corresponding to the language, and acquiring the corresponding feature vector.
In one embodiment, after the language of the input text is confirmed, the text may be semantically encoded using a pre-trained public model (e.g., bert or albert) in the corresponding language, and the corresponding feature vector is obtained as the semantic encoding feature of the input text.
Illustratively, the pre-training model may be a bert model; the feature vector is a fixed-dimension vector obtained by averaging the output of the n-th layer from the end of the bert model over the token dimension; or the feature vector is the maximum of the output of the n-th layer from the end of the bert model over the token dimension; or the feature vector is the fixed-dimension vector obtained by averaging the output of the n-th layer from the end of the bert model over the token dimension, spliced with the maximum of that output over the token dimension, to form a d-dimensional vector. The feature vector may also be the vector corresponding to the [CLS] position in the last layer of the bert model.
In the implementation process, the value of d is determined by the specific pre-training model type and the hyper-parameters thereof; the value of n is an integer greater than 1, and is preferably set to 2; the user can also adjust the value of n according to actual requirements.
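A hedged sketch of this semantic-encoding step is given below using the Hugging Face transformers library; the model name "bert-base-chinese", the helper function encode, and the default n = 2 are illustrative assumptions rather than requirements of the patent.

```python
# A hedged sketch of the semantic-encoding step using the Hugging Face
# transformers library; the model name and helper below are illustrative
# assumptions, not part of the patent.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")

@torch.no_grad()
def encode(text: str, n: int = 2) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    outputs = encoder(**inputs, output_hidden_states=True)
    layer = outputs.hidden_states[-n]            # n-th layer from the end, shape (1, seq_len, hidden)
    mean_vec = layer.mean(dim=1)                 # average over the token dimension
    max_vec = layer.max(dim=1).values            # maximum over the token dimension
    # cls_vec = outputs.last_hidden_state[:, 0]  # alternative: the last-layer [CLS] vector
    # Concatenating the mean and max pooled vectors gives a d-dimensional feature.
    return torch.cat([mean_vec, max_vec], dim=-1).squeeze(0)
```

For a base-size bert model (hidden size 768), the concatenated mean/max feature has d = 1536, while using only the mean, the max, or the last-layer [CLS] vector gives d = 768.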
Step S103: inputting the feature vector into a classifier for classification processing to obtain the corresponding classification result.
Referring to fig. 2, fig. 2 is a schematic diagram of a classifier training process according to an embodiment of the present invention.
In one embodiment, the classifier can be trained by:
Step 1031: selecting a classifier pre-training model according to the task type of the input text.
In one embodiment, a generic text classification task can use the pre-trained model directly, while for a domain-specific text task the pre-trained model can first be further fine-tuned on a dataset related to the current task and then used as the feature extractor.
Step 1032: labeling or pseudo-labeling the sample data and dividing the data set into three sets, namely the manually labeled data set D_L, the unlabeled data set D_U, and the high-confidence data set D_H.
Step 1033: using the current manually labeled data set D_L as the training set to train the classifier pre-training model; in addition, after every preset number of iterations, using the current manually labeled data set D_L and the current high-confidence data set D_H together as the training set to train the classifier pre-training model; saving the classification model when training finishes, and returning the data in the current high-confidence data set D_H to the unlabeled data set D_U to await the next round of the active learning iteration.
In one embodiment, every t rounds the current manually labeled data set D_L and the current high-confidence data set D_H may be used together as the training set to train the classifier pre-training model; t may take a value of 1-5, with a larger value chosen when the model is larger and a smaller value when the labeling cost is a sensitive concern.
It should be noted that the value range of t is not limited to the range described in the implementation process, and the user may expand the value range according to the actual use condition, that is, the value of t may also be greater than 5.
Step 1034: selecting the classification model obtained when the iterations finish or the labeling budget is exhausted as the final model of the classifier.
In one embodiment, when the maximum number of iterations is reached or the labeling budget is exhausted, the current classification model is used as the final model of the classifier.
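The control flow of steps 1032-1034 can be summarized in the following skeleton. The probability model returned by train_fn, the annotation callback label_fn, the choice of least-confidence sampling, and the threshold-decay schedule are all illustrative assumptions; only the loop structure follows the steps above.

```python
from typing import Callable, List, Tuple

import numpy as np

def active_learning_loop(
    D_L: List[Tuple[str, int]],          # manually labeled (text, class) pairs
    D_U: List[str],                      # unlabeled texts
    train_fn: Callable[[List[Tuple[str, int]]], Callable[[List[str]], np.ndarray]],
    label_fn: Callable[[List[str]], List[Tuple[str, int]]],
    max_iters: int = 20,
    budget: int = 10000,
    K: int = 500,
    t: int = 3,
    delta0: float = 0.08,
    dr: float = 0.002,
):
    D_H: List[Tuple[str, int]] = []      # high-confidence pseudo-labeled pairs
    model = train_fn(D_L)
    for it in range(1, max_iters + 1):
        # Every t rounds train on D_L together with the current D_H, otherwise on D_L only.
        model = train_fn(D_L + D_H if it % t == 0 else D_L)
        D_H = []                         # pseudo-labeled data "returns" to the unlabeled pool
        if budget <= 0 or not D_U:
            break
        probs = model(D_U)               # shape (len(D_U), num_classes)
        # Uncertainty-based selection (least confidence): lowest top-class probability first.
        picked_idx = set(np.argsort(probs.max(axis=1))[: min(K, budget)].tolist())
        picked = [x for j, x in enumerate(D_U) if j in picked_idx]
        D_L = D_L + label_fn(picked)     # manual annotation of the selected samples
        budget -= len(picked)
        D_U = [x for j, x in enumerate(D_U) if j not in picked_idx]
        if not D_U:
            continue
        # Pseudo-label high-confidence samples: prediction entropy below a decaying threshold.
        rem_probs = model(D_U)
        entropy = -(rem_probs * np.log(rem_probs + 1e-12)).sum(axis=1)
        threshold = delta0 / (1.0 + dr * it)   # assumed inverse-time decay of the threshold
        D_H = [(D_U[j], int(rem_probs[j].argmax())) for j in range(len(D_U))
               if entropy[j] < threshold]
    return model
```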
In one embodiment, training the classifier further comprises: selecting samples for manual labeling based on uncertainty, and supplementing the manually labeled data set D_L with them.
Illustratively, the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L adopt lowest-confidence sampling, which includes: computing the lc_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the lc_i values in ascending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling; K may take a value of 200-2000. The lc_i value is calculated as:
lc_i = max_j p(y_i = j | x_i; w)
where p is the probability given by the model; i indexes the input samples; j indexes the candidate classes; w denotes the parameters of the classifier model; x_i is the input of the i-th sample, i.e. the representation vector of the i-th sample obtained from the pre-training model; and y_i is the predicted class of the i-th sample.
It should be noted that the value range of K is only a preferred value range provided by the embodiment of the present invention, and the value range of K may also be correspondingly expanded according to an actual situation in an actual use process, that is, K may be less than 200 or greater than 2000.
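A minimal sketch of lowest-confidence sampling, assuming the class probabilities for the unlabeled pool are available as a numpy matrix; the function name and interface are assumptions.

```python
# A short sketch of lowest-confidence sampling on a matrix of predicted class
# probabilities; the helper name and numpy-based interface are assumptions.
import numpy as np

def least_confidence_select(probs: np.ndarray, K: int) -> np.ndarray:
    """probs has shape (N, m): class probabilities for the N samples in D_U.
    lc_i is the probability of the most likely class; the indices of the K
    samples with the smallest lc_i (ascending order) are returned."""
    lc = probs.max(axis=1)
    return np.argsort(lc)[:K]
```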
In one embodiment, the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L may instead adopt margin (difference) sampling, which includes: computing the ms_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the ms_i values in ascending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling; K may take a value of 200-2000. The ms_i value is calculated as:
ms_i = p(y_i = j_1 | x_i; w) - p(y_i = j_2 | x_i; w)
where p is the probability given by the model; i indexes the input samples; j_1 is the class with the highest probability in the analysis result; j_2 is the class with the lowest probability in the analysis result; and w denotes the parameters of the classifier model.
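A corresponding sketch of the margin (difference) sampling mode is shown below. Note that the text defines j_2 as the lowest-probability class, whereas classical margin sampling uses the second-highest class; the sketch follows the text as written.

```python
# A sketch of the margin (difference) sampling mode, following the definition
# of ms_i given above; the helper name is an assumption.
import numpy as np

def margin_select(probs: np.ndarray, K: int) -> np.ndarray:
    """probs has shape (N, m). ms_i = p(j1) - p(j2); smaller margins indicate
    more ambiguous samples, so the first K after ascending sort are returned."""
    sorted_p = np.sort(probs, axis=1)          # ascending within each row
    ms = sorted_p[:, -1] - sorted_p[:, 0]      # highest minus lowest class probability
    return np.argsort(ms)[:K]
```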
In one embodiment, the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L may also adopt entropy-based sampling, which includes: computing the en_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the en_i values in descending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling; K may take a value of 200-2000. The en_i value is calculated as:
en_i = - Σ_{j=1}^{m} p(y_i = j | x_i; w) log p(y_i = j | x_i; w)
where p is the probability given by the model; i indexes the input samples; j indexes the classes; w denotes the parameters of the classifier model; m is the total number of classes; x_i is the input of the i-th sample, i.e. the representation vector of the i-th sample obtained from the pre-training model; and y_i is the predicted class of the i-th sample.
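A matching sketch of entropy-based sampling on the same probability matrix; the helper name is again an assumption.

```python
# A sketch of the entropy-based sampling mode on a matrix of predicted class
# probabilities; the helper name is an assumption.
import numpy as np

def entropy_select(probs: np.ndarray, K: int) -> np.ndarray:
    """probs has shape (N, m). en_i = -sum_j p * log p; the K samples with the
    highest entropy (descending order) are selected for manual labeling."""
    en = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(-en)[:K]
```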
In one embodiment, training the classifier further includes: calculating the information entropy of each unlabeled sample and comparing it with a set threshold; if the information entropy of an unlabeled sample is smaller than the threshold, the sample is marked as a high-confidence sample, the class with the highest probability in the classification model is assigned to it as a pseudo label, and the sample is added to the high-confidence data set D_H. The information entropy of an unlabeled sample can be calculated in the same way as en_i above. Further, when collecting high-confidence samples, the set threshold can be dynamically adjusted after each round of the active learning iteration is completed: the threshold for round t is computed from δ_0, the initial threshold, whose value may range from 0.05 to 0.1, and dr, the decay rate of the threshold, whose value may range from 0.001 to 0.0035, where t denotes the round of the active learning iteration (the update formula is reproduced only as an image in the original publication).
In addition, as with K above, the value ranges of δ_0 and dr are only the preferred ranges provided by the embodiment of the invention, and the user may expand them according to actual application requirements; that is, δ_0 may also be less than 0.05 or greater than 0.1, and dr may also be less than 0.001 or greater than 0.0035.
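A small sketch of the pseudo-labeling step with a decaying threshold is given below; the inverse-time decay is an assumption made for illustration, since the original update formula is reproduced only as an image.

```python
# A sketch of the pseudo-labeling step. The inverse-time decay of the threshold
# is an assumption made for illustration; the function name is also assumed.
import numpy as np

def pseudo_label(probs: np.ndarray, t: int, delta0: float = 0.08, dr: float = 0.002):
    """probs has shape (N, m) for the unlabeled pool; t is the current round.
    Returns the indices of high-confidence samples and their pseudo labels."""
    en = -(probs * np.log(probs + 1e-12)).sum(axis=1)   # information entropy per sample
    threshold = delta0 / (1.0 + dr * t)                  # assumed decay schedule
    idx = np.where(en < threshold)[0]                    # high-confidence samples
    return idx, probs[idx].argmax(axis=1)                # most probable class as pseudo label
```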
In summary, the embodiment of the present invention provides a deep active learning text classification method based on a pre-training model, which comprises: obtaining an input text and analyzing its language; performing semantic encoding on the input text through a pre-training model corresponding to that language, and acquiring the corresponding feature vector as the semantic encoding feature of the input text; and inputting the feature vector into a classifier for classification processing to obtain the corresponding classification result. The classifier is trained as follows: a classifier pre-training model is selected according to the task type of the input text; sample data are labeled or pseudo-labeled and the data set is divided into three sets, namely the manually labeled data set D_L, the unlabeled data set D_U, and the high-confidence data set D_H; the current D_L is used as the training set to train the classifier pre-training model, and after every preset number of iterations the current D_L and the current D_H are used together as the training set; the classification model is saved when training finishes, and the data in the current D_H are returned to D_U to await the next round of the active learning iteration; the classification model obtained when the iterations finish or the labeling budget is exhausted is selected as the final model of the classifier. In this way, higher recognition accuracy can be obtained at lower data-labeling cost while the diversity of the training samples is ensured.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A deep active learning text classification method based on a pre-training model is characterized by comprising the following steps:
acquiring an input text and analyzing the language of the input text;
performing semantic coding on the input text through a pre-training model corresponding to the language, and acquiring a corresponding feature vector as a semantic coding feature of the input text;
inputting the feature vector into a classifier for classification processing to obtain the corresponding classification result; wherein the classifier is obtained by training through the following steps:
selecting a classifier pre-training model according to the task type of the input text;
labeling or pseudo-labeling the sample data and dividing the data set into three sets, namely the manually labeled data set D_L, the unlabeled data set D_U, and the high-confidence data set D_H;
using the current manually labeled data set D_L as the training set to train the classifier pre-training model, and, after every preset number of iterations, using the current manually labeled data set D_L and the current high-confidence data set D_H together as the training set to train the classifier pre-training model; saving the classification model when training finishes, and returning the data in the current high-confidence data set D_H to the unlabeled data set D_U to await the next round of the active learning iteration;
and selecting the classification model obtained when the iterations finish or the labeling budget is exhausted as the final model of the classifier.
2. The method of claim 1, wherein the classifier, when trained, further comprises:
selecting samples for manual labeling based on uncertainty, and supplementing the manually labeled data set D_L with them.
3. The method of claim 2, wherein the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L adopt lowest-confidence sampling, comprising: computing the lc_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the lc_i values in ascending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling; the lc_i value being calculated as:
lc_i = max_j p(y_i = j | x_i; w)
where p is the probability given by the model; i indexes the input samples; j indexes the candidate classes; w denotes the parameters of the classifier model; x_i is the input of the i-th sample, i.e. the representation vector of the i-th sample obtained from the pre-training model; and y_i is the predicted class of the i-th sample.
4. The method of claim 2, wherein the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L adopt margin (difference) sampling, comprising: computing the ms_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the ms_i values in ascending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling; the ms_i value being calculated as:
ms_i = p(y_i = j_1 | x_i; w) - p(y_i = j_2 | x_i; w)
where p is the probability given by the model; i indexes the input samples; j_1 is the class with the highest probability in the analysis result; j_2 is the class with the lowest probability in the analysis result; and w denotes the parameters of the classifier model.
5. The method of claim 2, wherein the uncertainty-based selection of samples for manual labeling and the supplementing of the manually labeled data set D_L adopt entropy-based sampling, comprising: computing the en_i value of each sample in the unlabeled data set D_U with the current classification model, arranging the en_i values in descending order, selecting the first K samples for manual labeling, and adding them to the manually labeled data set D_L after labeling; the en_i value being calculated as:
en_i = - Σ_{j=1}^{m} p(y_i = j | x_i; w) log p(y_i = j | x_i; w)
where p is the probability given by the model; i indexes the input samples; j indexes the classes; w denotes the parameters of the classifier model; and m is the total number of classes.
6. The method of claim 1, wherein the classifier, when trained, further comprises:
calculating the information entropy of each unlabeled sample and comparing it with a set threshold; if the information entropy of an unlabeled sample is smaller than the threshold, marking the sample as a high-confidence sample, assigning the class with the highest probability in the classification model to it as a pseudo label, and adding the sample to the high-confidence data set D_H.
7. The method of claim 6, further comprising:
dynamically adjusting the threshold after each round of the active learning iteration is completed, the threshold for round t being computed from δ_0 and dr, where δ_0 is the initial threshold, dr is the decay rate of the threshold, and t is the round of the active learning iteration (the update formula is reproduced only as an image in the original publication).
8. The method of claim 1, wherein the pre-trained model is a bert model; the feature vector is a fixed-dimension vector obtained by averaging the output of the n-th layer from the end of the bert model over the token dimension; or the feature vector is the maximum of the output of the n-th layer from the end of the bert model over the token dimension; or the feature vector is the fixed-dimension vector obtained by averaging the output of the n-th layer from the end of the bert model over the token dimension, spliced with the maximum of the output of the second layer from the end of the bert model over the token dimension, to form a d-dimensional vector; wherein the value of d is determined by the specific pre-training model type and its hyper-parameters, and n is an integer greater than 1.
9. The method of claim 1, wherein the pre-trained model is a bert model; the feature vector is the vector corresponding to the [CLS] position in the last layer of the bert model.
CN202011332730.3A 2020-11-24 2020-11-24 Deep active learning text classification method based on pre-training model Pending CN112434736A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011332730.3A CN112434736A (en) 2020-11-24 2020-11-24 Deep active learning text classification method based on pre-training model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011332730.3A CN112434736A (en) 2020-11-24 2020-11-24 Deep active learning text classification method based on pre-training model

Publications (1)

Publication Number Publication Date
CN112434736A true CN112434736A (en) 2021-03-02

Family

ID=74697457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011332730.3A Pending CN112434736A (en) 2020-11-24 2020-11-24 Deep active learning text classification method based on pre-training model

Country Status (1)

Country Link
CN (1) CN112434736A (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050234955A1 (en) * 2004-04-15 2005-10-20 Microsoft Corporation Clustering based text classification
CN108228569A (en) * 2018-01-30 2018-06-29 武汉理工大学 A kind of Chinese microblog emotional analysis method based on Cooperative Study under the conditions of loose
US20200286002A1 (en) * 2019-03-05 2020-09-10 Kensho Technologies, Llc Dynamically updated text classifier
CN110263350A (en) * 2019-03-08 2019-09-20 腾讯科技(深圳)有限公司 Model training method, device, computer readable storage medium and computer equipment
CN110110080A (en) * 2019-03-29 2019-08-09 平安科技(深圳)有限公司 Textual classification model training method, device, computer equipment and storage medium
CN110188197A (en) * 2019-05-13 2019-08-30 北京一览群智数据科技有限责任公司 It is a kind of for marking the Active Learning Method and device of platform
CN110990576A (en) * 2019-12-24 2020-04-10 用友网络科技股份有限公司 Intention classification method based on active learning, computer device and storage medium
CN111414942A (en) * 2020-03-06 2020-07-14 重庆邮电大学 Remote sensing image classification method based on active learning and convolutional neural network
CN111507378A (en) * 2020-03-24 2020-08-07 华为技术有限公司 Method and apparatus for training image processing model
CN111914061A (en) * 2020-07-13 2020-11-10 上海乐言信息科技有限公司 Radius-based uncertainty sampling method and system for text classification active learning

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269226A (en) * 2021-04-14 2021-08-17 南京大学 Picture selection and annotation method based on local and global information
CN114328936A (en) * 2022-03-01 2022-04-12 支付宝(杭州)信息技术有限公司 Method and device for establishing classification model

Similar Documents

Publication Publication Date Title
CN111897908B (en) Event extraction method and system integrating dependency information and pre-training language model
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN110134757B (en) Event argument role extraction method based on multi-head attention mechanism
CN108415977B (en) Deep neural network and reinforcement learning-based generative machine reading understanding method
CN111382565B (en) Emotion-reason pair extraction method and system based on multiple labels
CN109635108B (en) Man-machine interaction based remote supervision entity relationship extraction method
CN111709242B (en) Chinese punctuation mark adding method based on named entity recognition
CN111274790B (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN111813954B (en) Method and device for determining relationship between two entities in text statement and electronic equipment
CN113298151A (en) Remote sensing image semantic description method based on multi-level feature fusion
CN111241816A (en) Automatic news headline generation method
CN113505200B (en) Sentence-level Chinese event detection method combined with document key information
CN111738007A (en) Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network
CN110046356B (en) Label-embedded microblog text emotion multi-label classification method
CN112131345B (en) Text quality recognition method, device, equipment and storage medium
CN112434736A (en) Deep active learning text classification method based on pre-training model
CN112036168A (en) Event subject recognition model optimization method, device and equipment and readable storage medium
CN116661805B (en) Code representation generation method and device, storage medium and electronic equipment
CN111832603A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN113836896A (en) Patent text abstract generation method and device based on deep learning
CN111695053A (en) Sequence labeling method, data processing device and readable storage medium
CN112990196A (en) Scene character recognition method and system based on hyper-parameter search and two-stage training
CN116167379A (en) Entity relation extraction method based on BERT and entity position information
CN113535928A (en) Service discovery method and system of long-term and short-term memory network based on attention mechanism
CN113919358A (en) Named entity identification method and system based on active learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination