CN113139053B - Text classification method based on self-supervision contrast learning - Google Patents

Info

Publication number
CN113139053B
CN113139053B (application CN202110406702.XA)
Authority
CN
China
Prior art keywords
sample
text
classification model
texts
self
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110406702.XA
Other languages
Chinese (zh)
Other versions
CN113139053A (en)
Inventor
程良伦
王德培
张伟文
李睿濠
谭骏铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202110406702.XA priority Critical patent/CN113139053B/en
Publication of CN113139053A publication Critical patent/CN113139053A/en
Application granted granted Critical
Publication of CN113139053B publication Critical patent/CN113139053B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F16/35 Clustering; Classification (G06F16/30: information retrieval of unstructured textual data)
    • G06N3/044 Recurrent networks, e.g. Hopfield networks (G06N3/04: neural-network architectures)
    • G06N3/045 Combinations of networks (G06N3/04: neural-network architectures)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a text classification method based on self-supervised contrastive learning, relating to the technical field of natural language processing. The method comprises the following steps: acquiring sample texts and the category label corresponding to each sample text; dividing the sample texts into a training set, a verification set, and a test set, and constructing an initial classification model; preprocessing all sample texts; inputting all the preprocessed sample texts into the initial classification model, and pre-training the initial classification model with a self-supervised contrastive learning method based on the sample texts in the training set; adjusting the pre-trained initial classification model with the sample texts in the verification set; testing the adjusted initial classification model with the sample texts in the test set to obtain a final classification model; and inputting the text to be classified into the final classification model to obtain its classification result. The invention achieves fast learning with a small amount of labeled data, at low data cost and with accurate classification results.

Description

Text classification method based on self-supervision contrast learning
Technical Field
The invention relates to the technical field of natural language processing, in particular to a text classification method based on self-supervision contrast learning.
Background
Currently, most text classification techniques are based on deep neural networks, which require large amounts of labeled data to classify text. Obtaining large amounts of labeled data entails high economic overhead and intensive manual labor, and labeling accuracy is hard to guarantee. As the application fields of machine learning gradually expand, labeled data in many domains is severely scarce. Self-supervised learning has made great progress in image processing tasks and can improve the generalization performance of a model while requiring less data and fewer labels. How to apply self-supervised learning in the field of natural language processing therefore remains an open problem.
Chinese patent CN112395419A, published on 23 February 2021, provides a training method and device for a text classification model, and a text classification method and device, wherein the method comprises the following steps: determining a first vector group and a second vector group set according to the first sample text of the sample text set and the label set; inputting the first vector group and the second vector group set into a word-level attention layer to obtain a third vector set and a fourth vector set; inputting the third vector set and the fourth vector set into a sentence-level attention layer to obtain a first text vector set related to the tag set; inputting the first text vector set into a fully connected layer to obtain a prediction label of the first text; and training the text classification model based on the prediction labels and the first label group corresponding to the first sample text in the label set until a training stop condition is reached. These steps improve the accuracy of the text classification model to a certain extent, but training the model requires acquiring a large number of accurately labeled sample texts and label sets, so the data cost is high; moreover, label accuracy directly affects classification accuracy.
Disclosure of Invention
The invention provides a text classification method based on self-supervised contrastive learning, which achieves fast learning with a small amount of labeled data, classifies texts to be classified, and offers low data cost and accurate classification results.
In order to solve the technical problems, the technical scheme of the invention is as follows:
the invention provides a text classification method based on self-supervision contrast learning, which comprises the following steps:
s1: acquiring sample texts and category labels corresponding to each sample text; dividing the sample text into a training set, a verification set and a test set and constructing an initial classification model;
s2: preprocessing all sample texts;
s3: inputting all the preprocessed sample texts into an initial classification model, and pre-training the initial classification model by using a self-supervision contrast learning method based on the sample texts in a training set; adjusting the initial classification model after pre-training by using sample texts in the verification set; testing the adjusted initial classification model by using the sample text in the test set to obtain a final classification model;
s4: and inputting the text to be classified into a final classification model to obtain a classification result of the text to be classified.
Preferably, the sample text is obtained from an existing Cnews dataset.
Preferably, the category labels corresponding to the sample texts are obtained by manual labeling, by semi-automatic labeling with auxiliary tools, or by fully automatic labeling using rules and dictionaries.
Preferably, the specific preprocessing method is as follows:
Text sentence segmentation: the text is split into sentences according to punctuation marks;
Word segmentation: Chinese is segmented into words according to semantics, and English is split into words by spaces;
Stop word removal: stop words, punctuation marks, and numbers that contribute little to classification are removed.
Preferably, in the step S3, the specific method for obtaining the final classification model is as follows:
s3.1: based on the preprocessed sample texts, word vector representation forms of all the sample texts are obtained;
s3.2: extracting features of all sample texts in the word vector representation form;
s3.3: pooling operation is carried out on the sample text after feature extraction, and a pooled training set, verification set and test set are obtained;
s3.4: based on the sample text in the pooled training set, pre-training the initial classification model by using a self-supervision learning method; continuously adjusting the initial classification model by using the sample text in the pooled verification set through setting a first loss function, and finishing adjustment when the value of the first loss function is minimum;
s3.5: testing the adjusted initial classification model by using sample texts in the pooled test data set; and setting a second loss function, and when the value of the second loss function is minimum, completing the test to obtain a final classification model.
Preferably, in S3.1, the specific method for obtaining the word vector representation form of the sample text is as follows:
Word vector training is carried out on all the preprocessed sample texts using a word embedding technique, and each sample text is vectorized and encoded as x_i = {w_1, w_2, …, w_j}, where x_i represents the vector of the i-th sample text and w_j represents the word vector of the j-th word in the i-th sample text.
Preferably, in S3.2, feature extraction is performed on all sample texts using a multi-layer CNN, and the sample texts are divided into positive-class and negative-class sample texts according to the features.
Preferably, in S3.3, the pooling operation, specifically, the maximum pooling operation, is performed on the sample text after feature extraction.
Preferably, in S3.4, the first loss function is:

L_1 = -log[ exp(f^T f^+) / ( exp(f^T f^+) + Σ_{m=1}^{N} exp(f^T f_m) ) ]

where x represents a sample text vector, x^+ represents a positive-class sample text vector, x_m represents the m-th negative-class sample text vector, N represents the number of negative-class sample texts, f represents the encoder, f^T represents the transposed encoding of the sample text vector x, f_m represents the encoding result of the m-th negative-class sample text, f^+ represents the encoding result of the positive-class sample text, and exp() represents the exponential function with base e.
Preferably, in S3.5, the second loss function is the cross-entropy:

L_2 = -Σ_i Σ_{c=1}^{C} y_{i,c} log(ŷ_{i,c})

where C represents the number of categories of the sample texts, c is a particular category, y_i represents the true label of the i-th sample text, and ŷ_i represents the predicted label of the i-th sample text.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
according to the method, a small amount of sample texts with category labels are divided into a training set, a verification set and a test set, firstly, based on sample texts of the training set, an initial classification model is pre-trained by using a self-supervision comparison learning method, the initial classification model which is trained is adjusted by using sample texts of the verification set, and finally, the adjusted initial classification model is tested by using sample texts of the test set, so that a final classification model is obtained; inputting the text to be classified into a final classification model to obtain a classification result of the text to be classified. In the complete training process, the invention uses a small amount of sample text with category labels as input to train the initial classification model, thereby greatly reducing the dependence on a large amount of data with accurate labels, reducing the repeated labor of manual labels, realizing quick learning under a small amount of data with labels, having low data cost and accurate classification result.
Drawings
FIG. 1 is a flow chart of a text classification method based on self-supervised contrast learning according to an embodiment;
FIG. 2 is a flow chart of a method of obtaining a final classification model according to an embodiment;
FIG. 3 is a schematic diagram of the learning speed and classification accuracy of the conventional TextRNN with 500 sample texts per class;
FIG. 4 is a schematic diagram of the learning speed and classification accuracy of the conventional TextCNN with 500 sample texts per class;
FIG. 5 is a schematic diagram of the learning speed and classification accuracy of the conventional SCL-RNN with 500 sample texts per class;
FIG. 6 is a schematic diagram of the learning speed and classification accuracy of the method of the embodiment with 500 sample texts per class;
FIG. 7 is a schematic diagram of the learning speed and classification accuracy of the conventional TextRNN with 1000 sample texts per class;
FIG. 8 is a schematic diagram of the learning speed and classification accuracy of the conventional TextCNN with 1000 sample texts per class;
FIG. 9 is a schematic diagram of the learning speed and classification accuracy of the conventional SCL-RNN with 1000 sample texts per class;
FIG. 10 is a schematic diagram of the learning speed and classification accuracy of the method of the embodiment with 1000 sample texts per class.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Examples
The invention provides a text classification method based on self-supervised contrastive learning, which achieves fast learning with a small amount of labeled data, classifies texts to be classified, and offers low data cost and accurate classification results.
In order to solve the technical problems, the technical scheme of the invention is as follows:
the embodiment provides a text classification method based on self-supervision contrast learning, as shown in fig. 1, the method comprises the following steps:
s1: acquiring sample texts and category labels corresponding to each sample text; dividing the sample text into a training set, a verification set and a test set and constructing an initial classification model; in the embodiment, an initial classification model is constructed by using an unsupervised clustering method;
s2: preprocessing all sample texts;
s3: inputting all the preprocessed sample texts into an initial classification model, and pre-training the initial classification model by using a self-supervision contrast learning method based on the sample texts in a training set; adjusting the initial classification model after pre-training by using sample texts in the verification set; testing the adjusted initial classification model by using the sample text in the test set to obtain a final classification model;
s4: and inputting the text to be classified into a final classification model to obtain a classification result of the text to be classified.
The sample text is obtained from an existing Cnews dataset.
The category labels corresponding to the sample texts are obtained by manual labeling, by semi-automatic labeling with auxiliary tools, or by fully automatic labeling using rules and dictionaries.
The specific method for preprocessing comprises the following steps:
Text sentence segmentation: the text is split into sentences according to punctuation marks;
Word segmentation: Chinese is segmented into words according to semantics, and English is split into words by spaces;
Stop word removal: stop words, punctuation marks, and numbers that contribute little to classification are removed.
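As an illustrative sketch only, the three preprocessing steps above might be implemented as follows; the jieba segmenter and the tiny stop-word list are assumptions, since the embodiment names no particular tools:

```python
import re
import jieba  # assumed Chinese word segmenter; the embodiment does not name one

STOP_WORDS = {"的", "了", "是", "在"}  # illustrative stop-word list only

def preprocess(text: str) -> list[list[str]]:
    # Text sentence segmentation: split into sentences on punctuation marks.
    sentences = [s for s in re.split(r"[。！？!?.]", text) if s.strip()]
    tokenized = []
    for sentence in sentences:
        # Word segmentation: Chinese by semantics via jieba; English tokens
        # fall out of the same call, separated at spaces.
        words = jieba.lcut(sentence)
        # Stop word removal: drop stop words, punctuation marks, and numbers.
        words = [w for w in words
                 if w.strip()
                 and w not in STOP_WORDS
                 and not re.fullmatch(r"[\W\d_]+", w)]
        tokenized.append(words)
    return tokenized
```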
As shown in fig. 2, in the step S3, a specific method for obtaining the final classification model is as follows:
s3.1: based on the preprocessed sample texts, word vector representation forms of all the sample texts are obtained;
Word vector training is carried out on all the preprocessed sample texts using a word embedding technique, and each sample text is vectorized and encoded as x_i = {w_1, w_2, …, w_j}, where x_i represents the vector of the i-th sample text and w_j represents the word vector of the j-th word in the i-th sample text.
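A minimal sketch of this word-embedding step, assuming gensim's Word2Vec as the embedding technique (the embodiment does not name a specific one); the toy corpus and the 100-dimensional size are illustrative assumptions:

```python
from gensim.models import Word2Vec

# Toy tokenized corpus standing in for the preprocessed sample texts.
corpus = [["股市", "小幅", "上涨"], ["球队", "夺得", "冠军"]]

# Train word vectors over the corpus.
model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1)

# Vectorize the i-th sample text as x_i = {w_1, w_2, ..., w_j}.
x_0 = [model.wv[word] for word in corpus[0]]
```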
S3.2: extracting features of all sample texts in the word vector representation form;
Features of all the sample texts are extracted with a multi-layer CNN, and the sample texts are divided into positive-class sample texts and negative-class sample texts according to the features. In this embodiment, the basic parameters of the CNN network are set as follows: the input dimension is 500, dropout = 0.2, filter = 256, kernel_size = 3/4/5, and the ReLU activation function is used.
S3.3: performing maximum pooling operation on the sample text after feature extraction to obtain a pooled training set, a pooled verification set and a pooled test set;
s3.4: based on the sample text in the pooled training set, pre-training the initial classification model by using a self-supervision learning method; continuously adjusting the initial classification model by using the sample text in the pooled verification set through setting a first loss function, and finishing adjustment when the value of the first loss function is minimum;
The first loss function is:

L_1 = -log[ exp(f^T f^+) / ( exp(f^T f^+) + Σ_{m=1}^{N} exp(f^T f_m) ) ]

where x represents a sample text vector, x^+ represents a positive-class sample text vector, x_m represents the m-th negative-class sample text vector, N represents the number of negative-class sample texts, f represents the encoder, f^T represents the transposed encoding of the sample text vector x, f_m represents the encoding result of the m-th negative-class sample text, f^+ represents the encoding result of the positive-class sample text, and exp() represents the exponential function with base e.
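A minimal sketch of this first loss function under the definitions above; PyTorch and dot-product similarity are assumptions:

```python
import torch

def first_loss(f_x: torch.Tensor, f_pos: torch.Tensor,
               f_negs: torch.Tensor) -> torch.Tensor:
    # f_x: (d,) encoding f of the sample text vector x
    # f_pos: (d,) encoding f+ of the positive-class sample text
    # f_negs: (N, d) encodings f_m of the N negative-class sample texts
    pos = torch.exp(f_x @ f_pos)           # exp(f^T f+)
    negs = torch.exp(f_negs @ f_x).sum()   # sum over exp(f^T f_m)
    return -torch.log(pos / (pos + negs))
```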
S3.5: testing the adjusted initial classification model by using sample texts in the pooled test data set; and setting a second loss function, and when the value of the second loss function is minimum, completing the test to obtain a final classification model.
The second loss function is:

L_2 = -Σ_i Σ_{c=1}^{C} y_{i,c} log(ŷ_{i,c})

where C represents the number of categories of the sample texts, c is a particular category, y_i represents the true label of the i-th sample text, and ŷ_i represents the predicted label of the i-th sample text.
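Likewise, a minimal sketch of the second loss function as multi-class cross-entropy over one-hot labels; PyTorch and the mean reduction over samples are assumptions:

```python
import torch

def second_loss(y_true: torch.Tensor, y_pred: torch.Tensor) -> torch.Tensor:
    # y_true: (num_texts, C) one-hot true labels y_i
    # y_pred: (num_texts, C) predicted class probabilities for each sample text
    eps = 1e-12  # numerical-stability guard (implementation detail, not in the patent)
    return -(y_true * torch.log(y_pred + eps)).sum(dim=1).mean()
```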
In an implementation, sample texts are obtained from the Cnews dataset, taking 500 and 1000 sample texts per class respectively, with parameter settings kept consistent. The text classification method based on self-supervised contrastive learning of this embodiment (SCL-CNN) is compared with the conventional TextRNN, TextCNN, and SCL-RNN. When the number of sample texts per class is 500, the results are shown in the following table:
Model (LR=0.001)   Batch-size   Epochs   Acc
TextRNN            64           50       55.6
TextCNN            64           50       83.0
SCL-RNN            64           50       91.7
SCL-CNN            64           50       97.7
When the number of sample texts per class is 1000, the results are shown in the following table:
Model (LR=0.001)   Batch-size   Epochs   Acc
TextRNN            64           50       57.45
TextCNN            64           50       85.7
SCL-RNN            64           50       95.35
SCL-CNN            64           50       97.6
Batch-size and Epochs in the tables are parameters kept consistent across methods, and Acc denotes classification accuracy. As the tables show, the classification accuracy of the text classification method based on self-supervised contrastive learning (SCL-CNN) provided by this embodiment is higher than that of the conventional TextRNN, TextCNN, and SCL-RNN. Moreover, increasing the sample texts from 500 to 1000 per class brings no further improvement in classification accuracy, indicating that the method can accurately classify texts to be classified with only a small amount of labeled data.
The left side of each of FIGS. 3-6 shows the learning speed with 500 sample texts per class, with Epochs on the abscissa and the value of the second LOSS function on the ordinate; when training with the method provided by this embodiment, the inflection point appears at a far smaller Epochs value than with the conventional TextRNN, TextCNN, and SCL-RNN. The right side of each of FIGS. 3-6 shows the classification accuracy with 500 sample texts per class, with Epochs on the abscissa and classification accuracy on the ordinate; the maximum classification accuracy is reached sooner when training with the method provided by this embodiment.
When the sample texts are increased from 500 to 1000 per class, the left side of each of FIGS. 7-10 shows the learning speed with 1000 sample texts per class; again, the inflection point appears at a far smaller Epochs value with the method of this embodiment than with the conventional TextRNN, TextCNN, and SCL-RNN. The right side of each of FIGS. 7-10 shows the classification accuracy with 1000 sample texts per class; the maximum classification accuracy is again reached sooner with the method of this embodiment.
In summary, the learning speed of the method provided by this embodiment is far faster than that of the conventional TextRNN, TextCNN, and SCL-RNN.
Comparing FIG. 6 and FIG. 10, when the sample texts are increased from 500 to 1000 per class, neither the Epochs value at the inflection point nor the Epochs value at which the maximum classification accuracy is reached increases significantly, which indicates that the method achieves fast learning with a small amount of labeled data.
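For illustration only, the sketches above can be tied together into a hypothetical pre-training loop; every identifier here comes from the earlier sketches, the toy data is random, and none of this is the patent's reference implementation:

```python
import torch

# `encoder` and `first_loss` are the sketches defined above.
# A toy (anchor, positive, negatives) triple stands in for the pooled training set (S3.4).
pretrain_batches = [(torch.randn(500, 100),       # anchor sample text x
                     torch.randn(500, 100),       # positive-class sample text x+
                     torch.randn(3, 500, 100))]   # N = 3 negative-class sample texts

optimizer = torch.optim.Adam(encoder.parameters(), lr=0.001)  # LR from the tables

for epoch in range(50):  # Epochs = 50, matching the tables above
    for x, x_pos, x_negs in pretrain_batches:
        f_x = encoder(x.unsqueeze(0)).squeeze(0)        # (d,)
        f_pos = encoder(x_pos.unsqueeze(0)).squeeze(0)  # (d,)
        f_negs = encoder(x_negs)                        # (N, d)
        loss = first_loss(f_x, f_pos, f_negs)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```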
It is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications will be apparent to those of ordinary skill in the art from the above description. It is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the protection scope of the claims of the invention.

Claims (7)

1. A text classification method based on self-supervised contrast learning, the method comprising the steps of:
s1: acquiring sample texts and category labels corresponding to each sample text; dividing the sample text into a training set, a verification set and a test set and constructing an initial classification model;
s2: preprocessing all sample texts;
s3: inputting all the preprocessed sample texts into an initial classification model, and pre-training the initial classification model by using a self-supervision contrast learning method based on the sample texts in a training set; adjusting the initial classification model after pre-training by using sample texts in the verification set; testing the adjusted initial classification model by using the sample text in the test set to obtain a final classification model; the specific method comprises the following steps:
s3.1: based on the preprocessed sample texts, word vector representation forms of all the sample texts are obtained;
s3.2: extracting features of all sample texts in the word vector representation form;
s3.3: pooling operation is carried out on the sample text after feature extraction, and a pooled training set, verification set and test set are obtained;
s3.4: based on the sample text in the pooled training set, pre-training the initial classification model by using a self-supervision learning method; continuously adjusting the initial classification model by using the sample text in the pooled verification set through setting a first loss function, and finishing adjustment when the value of the first loss function is minimum;
the first loss function is:

L_1 = -log[ exp(f^T f^+) / ( exp(f^T f^+) + Σ_{m=1}^{N} exp(f^T f_m) ) ]

where x represents a sample text vector, x^+ represents a positive-class sample text vector, x_m represents the m-th negative-class sample text vector, N represents the number of negative-class sample texts, f represents the encoder, f^T represents the transposed encoding of the sample text vector x, f_m represents the encoding result of the m-th negative-class sample text, f^+ represents the encoding result of the positive-class sample text, and exp() represents the exponential function with base e;
s3.5: testing the adjusted initial classification model by using sample texts in the pooled test data set; setting a second loss function, and when the value of the second loss function is minimum, completing the test to obtain a final classification model;
the second loss function is:

L_2 = -Σ_i Σ_{c=1}^{C} y_{i,c} log(ŷ_{i,c})

where C represents the number of categories of the sample texts, c is a particular category, y_i represents the true label of the i-th sample text, and ŷ_i represents the predicted label of the i-th sample text;
s4: and inputting the text to be classified into a final classification model to obtain a classification result of the text to be classified.
2. A method of classifying text based on self-supervised contrast learning as recited in claim 1, wherein the sample text is obtained from an existing Cnews dataset.
3. The text classification method based on self-supervised contrast learning as set forth in claim 2, wherein the category labels corresponding to the sample texts are obtained by manual labeling, by semi-automatic labeling with auxiliary tools, or by fully automatic labeling using rules and dictionaries.
4. A method of classifying text based on self-supervised contrast learning as recited in claim 3, wherein the preprocessing includes sentence segmentation, word segmentation, and stop word removal for the sample text.
5. The text classification method based on self-supervised contrast learning of claim 4, wherein in S3.1, the specific method for obtaining the word vector representation form of the sample text is as follows:
Word vector training is carried out on all the preprocessed sample texts using a word embedding technique, and each sample text is vectorized and encoded as x_i = {w_1, w_2, …, w_j}, where x_i represents the vector of the i-th sample text and w_j represents the word vector of the j-th word in the i-th sample text.
6. The text classification method based on self-supervised contrast learning as claimed in claim 5, wherein in S3.2, feature extraction is performed on all sample texts using a multi-layer CNN, and the sample texts are divided into positive-class and negative-class sample texts according to the features.
7. The text classification method based on self-supervised contrast learning of claim 6, wherein in S3.3, the pooling operation, specifically, the maximum pooling operation, is performed on the sample text after feature extraction.
CN202110406702.XA 2021-04-15 2021-04-15 Text classification method based on self-supervision contrast learning Active CN113139053B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110406702.XA CN113139053B (en) 2021-04-15 2021-04-15 Text classification method based on self-supervision contrast learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110406702.XA CN113139053B (en) 2021-04-15 2021-04-15 Text classification method based on self-supervision contrast learning

Publications (2)

Publication Number Publication Date
CN113139053A CN113139053A (en) 2021-07-20
CN113139053B (en) 2024-03-05

Family

ID=76812978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110406702.XA Active CN113139053B (en) 2021-04-15 2021-04-15 Text classification method based on self-supervision contrast learning

Country Status (1)

Country Link
CN (1) CN113139053B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642618B (en) * 2021-07-27 2024-03-01 上海展湾信息科技有限公司 Method and equipment for training screw device state prediction model
CN114330312B (en) * 2021-11-03 2024-06-14 腾讯科技(深圳)有限公司 Title text processing method, title text processing device, title text processing program, and recording medium
CN114357168B (en) * 2021-12-31 2022-08-02 成都信息工程大学 Text classification method
CN114548321B (en) * 2022-03-05 2024-06-25 昆明理工大学 Self-supervision public opinion comment viewpoint object classification method based on contrast learning
CN117421595A (en) * 2023-10-25 2024-01-19 广东技术师范大学 System log anomaly detection method and system based on deep learning technology

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191732A (en) * 2020-01-03 2020-05-22 天津大学 Target detection method based on full-automatic learning
CN111274405A (en) * 2020-02-26 2020-06-12 北京工业大学 Text classification method based on GCN
CN111897961A (en) * 2020-07-22 2020-11-06 深圳大学 Text classification method and related components of wide neural network model
CN111950269A (en) * 2020-08-21 2020-11-17 清华大学 Text statement processing method and device, computer equipment and storage medium
CN112100387A (en) * 2020-11-13 2020-12-18 支付宝(杭州)信息技术有限公司 Training method and device of neural network system for text classification
CN112101328A (en) * 2020-11-19 2020-12-18 四川新网银行股份有限公司 Method for identifying and processing label noise in deep learning
CN112348792A (en) * 2020-11-04 2021-02-09 广东工业大学 X-ray chest radiography image classification method based on small sample learning and self-supervision learning
CN112381116A (en) * 2020-10-21 2021-02-19 福州大学 Self-supervision image classification method based on contrast learning
CN112464879A (en) * 2020-12-10 2021-03-09 山东易视智能科技有限公司 Ocean target detection method and system based on self-supervision characterization learning
WO2021051560A1 (en) * 2019-09-17 2021-03-25 平安科技(深圳)有限公司 Text classification method and apparatus, electronic device, and computer non-volatile readable storage medium

Also Published As

Publication number Publication date
CN113139053A (en) 2021-07-20


Legal Events

Code   Description
PB01   Publication
SE01   Entry into force of request for substantive examination
GR01   Patent grant