CN110399482B - Text classification method, model and device - Google Patents
- Publication number
- CN110399482B (application CN201910492286.2A)
- Authority
- CN
- China
- Prior art keywords
- feature vector
- inputting
- model
- vector
- layer
- Prior art date
- Legal status: Active (an assumption by Google Patents, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a text classification method, model and device. The method comprises the following steps: converting the text to be classified into a word vector V1; inputting the word vector V1 into the convolution part of a CNN model, which outputs a feature vector V3; inputting the feature vector V3 into a first pooling layer, which outputs a feature vector V4; inputting the word vector V1 into a second pooling layer, which outputs a feature vector V5; merging the feature vector V4 and the feature vector V5 into a feature vector V6; and inputting the feature vector V6 into a full connection layer, which outputs the text classification of the text to be classified. The method combines an RNN model with a CNN model and improves the accuracy of the text classification result.
Description
Technical Field
The invention relates to the field of computers, and in particular to a text classification method, a text classification model and a text classification device.
Background
With the development of the internet and social media, a great amount of text information now exists on the network, including Wikipedia entries, academic articles, news reports and various after-sales service comments, and much of it carries valuable information. Existing text classification technology can extract specific information from such text: sentiment analysis of after-sales comments reveals how satisfied consumers are with a product or service, classification of news data distinguishes the field of each report, and sentence classification of Wikipedia data yields relations for a knowledge graph. In short, text classification is an extremely important technology. Common methods at present include traditional text classifiers such as SVM, nearest neighbor and decision trees, as well as deep learning models.
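As a point of reference for the traditional methods mentioned above, a nearest-neighbor text classifier over bag-of-words vectors can be sketched in a few lines of Python. This is a toy baseline for illustration only, not part of the invention, and the training comments and labels are invented:

```python
from collections import Counter
import math

# Toy labeled after-sales comments (hypothetical examples).
train = [
    ("great product works well", "positive"),
    ("excellent service very happy", "positive"),
    ("terrible quality broke fast", "negative"),
    ("awful support never again", "negative"),
]

def bow(text):
    # Bag-of-words vector: word -> count.
    return Counter(text.split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text):
    v = bow(text)
    # Nearest neighbor: the label of the most similar training document.
    return max(train, key=lambda tl: cosine(v, bow(tl[0])))[1]

print(classify("great service very happy"))  # positive
print(classify("quality is terrible"))       # negative
```

Deep learning models replace the hand-built bag-of-words representation with learned word vectors, which is where the method of this patent starts.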
Currently, popular deep learning models include the RNN (Recurrent Neural Network), the CNN (Convolutional Neural Network) and the Transformer.
The RNN is good at classifying long text sequences. The CNN was first applied to image processing and later to natural language tasks; its advantage is that it recognizes local text information well. The Transformer is a new generation of encoder proposed by Google: it removes the RNN's dependence on the preceding sequence state and outperforms the RNN and CNN in most language processing tasks, but it performs worse on small and medium data sets, its training is unstable, and its long-distance dependency modeling is not as good as that of the traditional RNN.
Disclosure of Invention
In view of the above, the present invention provides a text classification method, a text classification model and a text classification device that remedy the shortcomings of the existing deep learning models.
The invention provides a text classification method, which comprises the following steps:
Converting the text to be classified into a word vector V1;
inputting the word vector V1 into a convolution part of the CNN model, outputting a feature vector V3 by the convolution part of the CNN model, inputting the feature vector V3 into a first pooling layer, and outputting a feature vector V4 by the first pooling layer; and inputting the word vector V1 into a second pooling layer, the second pooling layer outputting a feature vector V5;
merging the feature vector V4 and the feature vector V5 into a feature vector V6;
and inputting the feature vector V6 into a full connection layer, the full connection layer outputting the text classification of the text to be classified.
The present invention also provides a text classification model, which includes:
vector conversion layer: for converting the text to be classified into a word vector V1;
a feature extraction layer: for inputting the word vector V1 into the convolution part of the CNN model, which outputs the feature vector V3, and the feature vector V3 into the first pooling layer, which outputs the feature vector V4; and, for inputting the word vector V1 into the second pooling layer, the second pooling layer outputting the feature vector V5;
a characteristic merging layer: for merging the feature vector V4 and the feature vector V5 into a feature vector V6;
full connection layer: for inputting the feature vector V6 into the full connection layer, the full connection layer outputting the text classification of the text to be classified.
The present invention also provides a non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps in the text classification method described above.
The invention also provides a text classification device which is characterized by comprising a processor and the non-transitory computer readable storage medium.
According to the text classification method, the series structure of the RNN and the CNN and the modeling mode with the characteristics of the series structure are utilized to obtain richer word vector characteristics with different semantic levels, and the classification accuracy is improved.
The method or the model combines the excellent long sequence modeling capability of the RNN and the advantages of the local modeling of the CNN, and the classification effect in most text classification tasks is superior to that of the traditional RNN and CNN models.
Compared with a transformer, the text classification method or the model training of the invention is stable, and because the model parameters are fewer, the invention only needs less hardware resource overhead.
Drawings
FIG. 1 is a first flowchart of a text classification method according to the present invention;
FIG. 2 is a second flowchart of the text classification method of the present invention;
FIG. 3 is a first block diagram of the text classification method of the present invention;
fig. 4 is a second structural diagram of the text classification method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The invention provides a text classification method. As shown in FIG. 1, the method comprises:
S11: converting the text to be classified into a word vector V1;
S13, which includes S13-1, S13-2 and S13-3;
S13-1: inputting the word vector V1 into the convolution part of the CNN model, the convolution part outputting a feature vector V3;
S13-2: inputting the feature vector V3 into a first pooling layer, the first pooling layer outputting a feature vector V4;
S13-3: inputting the word vector V1 into a second pooling layer, the second pooling layer outputting a feature vector V5;
S15: merging the feature vector V4 and the feature vector V5 into a feature vector V6;
S17: inputting the feature vector V6 into a full connection layer, the full connection layer outputting the text classification of the text to be classified.
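The flow S11 to S17 can be sketched end to end in NumPy. This is a minimal illustration, not the patented implementation: the weights are random stand-ins, the convolution part is reduced to a single width-3 kernel bank, both pooling layers are taken to be max pooling, and the merge in S15 is assumed to be concatenation:

```python
import numpy as np

rng = np.random.default_rng(0)

# S11: the text to be classified as a sequence of word vectors V1 (seq_len x embed_dim).
seq_len, embed_dim = 10, 8
V1 = rng.standard_normal((seq_len, embed_dim))

# S13-1: convolution part of the CNN model -- a 1-D convolution over time with
# kernel width 3 and n_filters output channels (weights are random stand-ins).
n_filters, k = 6, 3
W_conv = rng.standard_normal((k, embed_dim, n_filters))
V3 = np.stack([
    sum(V1[t + j] @ W_conv[j] for j in range(k))   # one (n_filters,) row per window
    for t in range(seq_len - k + 1)
])                                                 # shape (seq_len - k + 1, n_filters)

# S13-2: first pooling layer -- max over the time axis.
V4 = V3.max(axis=0)                                # (n_filters,)

# S13-3: second pooling layer -- max over time, applied directly to V1.
V5 = V1.max(axis=0)                                # (embed_dim,)

# S15: merge V4 and V5 (assumed here to be concatenation).
V6 = np.concatenate([V4, V5])                      # (n_filters + embed_dim,)

# S17: full connection layer followed by softmax over n_classes text classes.
n_classes = 4
W_fc = rng.standard_normal((V6.size, n_classes))
logits = V6 @ W_fc
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(V6.shape, probs.shape)  # (14,) (4,)
```

The point of the two parallel branches is visible in the shapes: V4 summarizes local n-gram features found by the convolution, while V5 summarizes the raw word vectors directly, and the full connection layer sees both.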
The convolution part of the CNN model comprises one convolution layer or a plurality of convolution layers in series, each convolution layer consisting of a multi-scale convolution-kernel convolution followed by a pooling layer.
Each of the first pooling layer and the second pooling layer may be a maximum pooling layer or an average pooling layer.
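The difference between the two pooling choices can be seen on a small feature map (the numbers are invented; each column is one channel, and pooling runs over the time axis):

```python
import numpy as np

# A feature map of 5 time steps x 3 channels.
V3 = np.array([[1., 4., 2.],
               [3., 0., 5.],
               [2., 2., 2.],
               [0., 6., 1.],
               [4., 1., 3.]])

# Maximum pooling keeps the strongest response of each channel: 4, 6, 5.
max_pooled = V3.max(axis=0)

# Average pooling keeps the mean response of each channel: 2.0, 2.6, 2.6.
avg_pooled = V3.mean(axis=0)
```

Max pooling is the more common choice for text, since it asks "did this feature fire anywhere in the sentence?", while average pooling smooths the response over the whole sequence.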
Optionally, as shown in fig. 2, the following may be added between S11 and S13:
S12: inputting the word vector V1 into a BLSTM (Bidirectional Long Short-Term Memory) model, the BLSTM model outputting a feature vector V1';
accordingly, S13-1 is adjusted to: inputting the feature vector V1' into the convolution part of the CNN model, the convolution part outputting a feature vector V3;
and S13-3 is adjusted to: inputting the feature vector V1' into the second pooling layer, the second pooling layer outputting a feature vector V5.
the BLSTM can bidirectionally extract the correlation of long-distance words in the text to be classified, and is favorable for improving the accuracy of later text classification.
According to the text classification method, the series structure of the RNN and the CNN and the modeling mode with the characteristics of the series structure are utilized to obtain richer word vector characteristics with different semantic levels, and the classification accuracy is improved.
The method combines the excellent long sequence modeling capability of the RNN and the advantages of the local modeling of the CNN, and the classification effect in most text classification tasks is superior to that of the traditional RNN and CNN models.
Compared with a transformer, the text classification method is stable in training and only needs less hardware resource overhead because of less model parameters.
The present invention also provides a text classification model, as shown in fig. 3, comprising: the device comprises a vector conversion layer, a feature extraction layer, a feature merging layer and a full connection layer.
Vector conversion layer: for converting the text to be classified into a word vector V1;
the characteristic extraction layer comprises a convolution part of the CNN model, a first pooling layer and a second pooling layer;
convolution part of CNN model: for inputting the word vector V1 into the convolution portion of the CNN model, which outputs a feature vector V3;
a first pooling layer: for inputting the feature vector V3 into the first pooling layer, which outputs the feature vector V4;
a second pooling layer: for inputting the word vector V1 into the second pooling layer, which outputs the feature vector V5;
a characteristic merging layer: for merging the feature vector V4 and the feature vector V5 into a feature vector V6;
full connection layer: inputting the feature vector V6 into a full connection layer; and the full connection layer outputs the text classification of the text to be classified.
As shown in fig. 4, between the vector conversion layer and the feature extraction layer, there may be further included:
a BLSTM model: the BLSTM model takes the word vector V1 as input and outputs the feature vector V1'.
Accordingly, the function of the convolution part of the CNN model is adjusted to: inputting the feature vector V1' into the convolution part of the CNN model, the convolution part outputting a feature vector V3;
and the function of the second pooling layer is adjusted to: inputting the feature vector V1' into the second pooling layer, the second pooling layer outputting the feature vector V5.
The present invention also provides a non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps in the text classification method described above.
The invention also provides a text classification device which is characterized by comprising a processor and the non-transitory computer readable storage medium.
It should be noted that the embodiment of the text classification model or apparatus of the present invention has the same principle as the embodiment of the text classification method, and the related parts can be referred to each other.
The above description is only exemplary of the present invention and should not be taken as limiting the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (4)
1. A method of text classification, the method comprising:
converting the text to be classified into a word vector V1;
inputting the word vector V1 into a convolution part of a CNN model, the convolution part of the CNN model outputting a feature vector V3, inputting the feature vector V3 into a first pooling layer, the first pooling layer outputting a feature vector V4; and, inputting the word vector V1 into a second pooling layer, the second pooling layer outputting a feature vector V5;
merging the feature vector V4 and the feature vector V5 into a feature vector V6;
inputting the feature vector V6 into a full connection layer, and outputting the text classification of the text to be classified by the full connection layer;
the inputting of the word vector V1 into the convolution portion of the CNN model includes: inputting the word vector V1 into a BLSTM model, the BLSTM model outputting a feature vector V1 ', and inputting the feature vector V1' into a convolution part of a CNN model;
and/or, the inputting the word vector V1 into the second pooling layer comprises: inputting the word vector V1 into a BLSTM model, the BLSTM model outputting a feature vector V1 ', the feature vector V1' into a second pooling layer;
the BLSTM model is a shared model.
2. A text classification model, the model comprising:
vector conversion layer: for converting the text to be classified into a word vector V1;
a feature extraction layer: for inputting the word vector V1 into the convolution part of a CNN model, which outputs a feature vector V3, inputting the feature vector V3 into a first pooling layer, which outputs a feature vector V4; and, for inputting the word vector V1 into a second pooling layer, the second pooling layer outputting a feature vector V5;
a characteristic merging layer: for merging the feature vector V4 and feature vector V5 into a feature vector V6;
full connection layer: inputting the feature vector V6 into the full connection layer, and outputting the text classification of the text to be classified by the full connection layer;
the inputting of the word vector V1 into the convolution portion of the CNN model includes: inputting the word vector V1 into a BLSTM model, the BLSTM model outputting a feature vector V1 ', and inputting the feature vector V1' into a convolution part of a CNN model;
and/or, the inputting the word vector V1 into the second pooling layer comprises: inputting the word vector V1 into a BLSTM model, the BLSTM model outputting a feature vector V1', and inputting the feature vector V1' into a second pooling layer;
the BLSTM model is a shared model.
3. A non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps in the text classification method of claim 1.
4. A text classification apparatus comprising a processor and the non-transitory computer readable storage medium of claim 3.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN201910492286.2A (CN110399482B) | 2019-06-06 | 2019-06-06 | Text classification method, model and device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN201910492286.2A (CN110399482B) | 2019-06-06 | 2019-06-06 | Text classification method, model and device |
Publications (2)
| Publication Number | Publication Date |
| --- | --- |
| CN110399482A | 2019-11-01 |
| CN110399482B | 2021-12-03 |
Family
ID=68323125
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN201910492286.2A (CN110399482B, active) | Text classification method, model and device | 2019-06-06 | 2019-06-06 |
Country Status (1)
| Country | Link |
| --- | --- |
| CN | CN110399482B (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN107885853A * | 2017-11-14 | 2018-04-06 | Tongji University | A combined text classification method based on deep learning |
| CN109241283A * | 2018-08-08 | 2019-01-18 | Guangdong University of Technology | A text classification method based on a multi-angle capsule network |
| US10304208B1 * | 2018-02-12 | 2019-05-28 | Avodah Labs, Inc. | Automated gesture identification using neural networks |
- 2019-06-06: application CN201910492286.2A filed in China (patent CN110399482B, active)
Non-Patent Citations (3)
| Title |
| --- |
| Text sentiment analysis based on feature fusion of CNN and BiLSTM networks; Li Yang, Dong Hongbin; Journal of Computer Applications; 2018-11-30; Vol. 38, No. 11; pp. 3075-3080 * |
| Li Yang, Dong Hongbin. Text sentiment analysis based on feature fusion of CNN and BiLSTM networks. Journal of Computer Applications. 2018, Vol. 38, No. 11. * |
| Localized bidirectional long short-term memory for text classification; Wan Shengxian, Lan Yanyan; Journal of Chinese Information Processing; 2017-05-31; Vol. 31, No. 3; pp. 62-68 * |
Also Published As
| Publication number | Publication date |
| --- | --- |
| CN110399482A | 2019-11-01 |
Legal Events
| Date | Code | Title |
| --- | --- | --- |
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |
| | GR01 | Patent grant |