CN110399482B - Text classification method, model and device - Google Patents

Text classification method, model and device

Info

Publication number
CN110399482B
CN110399482B (application CN201910492286.2A)
Authority
CN
China
Prior art keywords
feature vector
inputting
model
vector
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910492286.2A
Other languages
Chinese (zh)
Other versions
CN110399482A (en)
Inventor
杨志明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ideepwise Artificial Intelligence Robot Technology Beijing Co ltd
Original Assignee
Ideepwise Artificial Intelligence Robot Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ideepwise Artificial Intelligence Robot Technology Beijing Co ltd
Priority to CN201910492286.2A
Publication of CN110399482A
Application granted
Publication of CN110399482B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a text classification method, model and device. The method comprises: converting the text to be classified into a word vector V1; inputting the word vector V1 into the convolution part of a CNN model, which outputs a feature vector V3; inputting the feature vector V3 into a first pooling layer, which outputs a feature vector V4; in parallel, inputting the word vector V1 into a second pooling layer, which outputs a feature vector V5; merging the feature vector V4 and the feature vector V5 into a feature vector V6; and inputting the feature vector V6 into a fully connected layer, which outputs the text classification of the text to be classified. By combining an RNN model with a CNN model, the method improves the accuracy of the text classification result.

Description

Text classification method, model and device
Technical Field
The invention relates to the field of computers, in particular to a text classification method, a text classification model and a text classification device.
Background
With the development of the internet and social media, a great amount of text information now exists on the network, including Wikipedia entries, academic articles, news reports and all kinds of after-sales service comments, and this text contains a great amount of valuable information. Existing text classification technology can extract specific information from it: for example, sentiment analysis of after-sales comments reveals how satisfied consumers are with a product or service, classifying news data distinguishes the field a news report belongs to, and classifying sentences of Wikipedia data yields the relations in a knowledge graph. In summary, text classification is an extremely important technology. At present, the more common methods include traditional text classification methods such as SVM, nearest neighbor and decision trees, as well as deep learning models.
Currently, popular deep learning models include the RNN (Recurrent Neural Network), the CNN (Convolutional Neural Network) and the Transformer.
The RNN is good at classifying long text sequences. The CNN was first applied to image processing and later to other artificial intelligence fields; its advantage is that it recognizes local text information well. The Transformer is a new generation of encoder proposed by Google: it removes the RNN's dependence on the preceding state of the sequence and performs better than the RNN and CNN in most artificial intelligence tasks. However, the Transformer performs worse on small and medium data sets, its training is very unstable, and its long-distance dependency modeling is not as good as the traditional RNN's.
Disclosure of Invention
In view of the above, the present invention provides a text classification method, a text classification model and a text classification device that overcome the shortcomings of the existing deep learning models.
The invention provides a text classification method, which comprises the following steps:
converting the text to be classified into a word vector V1;
inputting the word vector V1 into the convolution part of a CNN model, which outputs a feature vector V3; inputting the feature vector V3 into a first pooling layer, which outputs a feature vector V4; and inputting the word vector V1 into a second pooling layer, which outputs a feature vector V5;
merging the feature vector V4 and the feature vector V5 into a feature vector V6;
and inputting the feature vector V6 into a fully connected layer, which outputs the text classification of the text to be classified.
The present invention also provides a text classification model, which includes:
vector conversion layer: for converting the text to be classified into a word vector V1;
a feature extraction layer: for inputting the word vector V1 into the convolution part of the CNN model, which outputs the feature vector V3, and the feature vector V3 into the first pooling layer, which outputs the feature vector V4; and, for inputting the word vector V1 into the second pooling layer, the second pooling layer outputting the feature vector V5;
a characteristic merging layer: for merging the feature vector V4 and the feature vector V5 into a feature vector V6;
full connection layer: for receiving the feature vector V6 and outputting the text classification of the text to be classified.
The present invention also provides a non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps in the text classification method described above.
The invention also provides a text classification device which is characterized by comprising a processor and the non-transitory computer readable storage medium.
According to the text classification method of the invention, the series structure of RNN and CNN, together with a modeling mode matched to that structure, yields richer word vector features at different semantic levels and improves classification accuracy.
The method (or model) combines the RNN's excellent long-sequence modeling capability with the CNN's strength in local modeling; its classification effect in most text classification tasks is superior to that of the traditional RNN and CNN models.
Compared with the Transformer, the text classification method or model of the invention trains stably and, because it has fewer model parameters, requires less hardware resource overhead.
Drawings
FIG. 1 is a first flowchart of a text classification method according to the present invention;
FIG. 2 is a second flowchart of the text classification method of the present invention;
FIG. 3 is a first block diagram of the text classification method of the present invention;
fig. 4 is a second structural diagram of the text classification method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The invention provides a text classification method, as shown in figure 1, the method comprises
S11: converting the text to be classified into a word vector V1;
S13, which includes S13-1, S13-2 and S13-3:
S13-1: inputting the word vector V1 into the convolution part of the CNN model, which outputs a feature vector V3;
S13-2: inputting the feature vector V3 into a first pooling layer, which outputs a feature vector V4;
S13-3: inputting the word vector V1 into a second pooling layer, which outputs a feature vector V5;
S15: merging the feature vector V4 and the feature vector V5 into a feature vector V6;
S17: inputting the feature vector V6 into a fully connected layer, which outputs the text classification of the text to be classified.
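The data flow of S11 to S17 can be sketched numerically as follows. This is a minimal illustration, not the patent's implementation: all sizes (sequence length, embedding dimension, filter and class counts), the random weights, the single width-3 convolution, and the choice of max pooling and concatenation for the merge are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: T time steps, D embedding dims, F conv filters, C classes.
T, D, F, C = 10, 8, 6, 3

# S11: the text to be classified, already converted into word vectors V1.
V1 = rng.standard_normal((T, D))

# S13-1: convolution part of the CNN model (here a single width-3 1-D
# convolution over the time axis) outputs feature vector V3.
W = rng.standard_normal((3, D, F)) * 0.1
V3 = np.stack([np.tensordot(V1[t:t + 3], W, axes=([0, 1], [0, 1]))
               for t in range(T - 2)])          # shape (T - 2, F)

# S13-2: first pooling layer (max over time) outputs V4.
V4 = V3.max(axis=0)                             # shape (F,)

# S13-3: second pooling layer applied directly to V1 outputs V5.
V5 = V1.max(axis=0)                             # shape (D,)

# S15: merge V4 and V5 into V6 (here by concatenation).
V6 = np.concatenate([V4, V5])                   # shape (F + D,)

# S17: fully connected layer maps V6 to class scores for the text.
W_fc = rng.standard_normal((F + D, C)) * 0.1
scores = V6 @ W_fc
predicted_class = int(scores.argmax())          # index of the predicted text class
```

Running the sketch end to end traces the path V1 → V3 → V4 (and, in parallel, V1 → V5) → V6 → classification.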
The convolution part of the CNN model comprises one convolution layer or several convolution layers in series, and each convolution layer consists of a multi-scale convolution (convolution kernels of several widths) followed by a pooling layer.
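A multi-scale convolution of this kind can be sketched as follows; the kernel widths (2, 3 and 4), the filter count and the random weights are illustrative assumptions, with max pooling standing in for the pooling layer that follows each convolution:

```python
import numpy as np

rng = np.random.default_rng(1)
T, D, F = 12, 8, 4   # hypothetical sizes: time steps, embedding dims, filters per scale
V1 = rng.standard_normal((T, D))

def conv1d(x, width, filters, rng):
    """Valid 1-D convolution over the time axis with `filters` output channels."""
    W = rng.standard_normal((width, x.shape[1], filters)) * 0.1
    return np.stack([np.tensordot(x[t:t + width], W, axes=([0, 1], [0, 1]))
                     for t in range(x.shape[0] - width + 1)])

# Multi-scale convolution: kernels of widths 2, 3 and 4 run in parallel,
# each followed by max pooling over time, and the results concatenated.
pooled = [conv1d(V1, w, F, rng).max(axis=0) for w in (2, 3, 4)]
features = np.concatenate(pooled)   # shape (3 * F,): one block per kernel width
```

Each kernel width captures n-gram-like local patterns of a different span, which is why the patent combines several scales in one layer.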
The first pooling layer and the second pooling layer may each be a max pooling layer or an average pooling layer.
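The difference between the two pooling choices is easy to see on a toy feature map: max pooling keeps the strongest activation of each channel over time, while average pooling keeps the mean.

```python
import numpy as np

# A toy feature map with 4 time steps and 3 channels.
V3 = np.array([[1.0, -2.0, 0.5],
               [3.0,  0.0, 0.5],
               [2.0,  4.0, 0.5],
               [0.0,  2.0, 0.5]])

# Max pooling over the time axis: strongest activation per channel.
max_pooled = V3.max(axis=0)    # [3.0, 4.0, 0.5]

# Average pooling over the time axis: mean activation per channel.
avg_pooled = V3.mean(axis=0)   # [1.5, 1.0, 0.5]
```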
Optionally, as shown in fig. 2, the following may be added between S11 and S13:
S12: inputting the word vector V1 into a BLSTM (Bidirectional Long Short-Term Memory) model, which outputs a feature vector V1';
accordingly, S13-1 is adjusted to: inputting the feature vector V1' into the convolution part of the CNN model, which outputs the feature vector V3;
and S13-3 is adjusted to: inputting the feature vector V1' into the second pooling layer, which outputs the feature vector V5.
The BLSTM can extract, in both directions, the correlation between long-distance words in the text to be classified, which helps improve the accuracy of the subsequent text classification.
According to the text classification method of the invention, the series structure of RNN and CNN, together with a modeling mode matched to that structure, yields richer word vector features at different semantic levels and improves classification accuracy.
The method combines the RNN's excellent long-sequence modeling capability with the CNN's strength in local modeling; its classification effect in most text classification tasks is superior to that of the traditional RNN and CNN models.
Compared with the Transformer, the text classification method of the invention trains stably and, because it has fewer model parameters, requires less hardware resource overhead.
The present invention also provides a text classification model, as shown in fig. 3, comprising: the device comprises a vector conversion layer, a feature extraction layer, a feature merging layer and a full connection layer.
Vector conversion layer: for converting the text to be classified into a word vector V1;
the characteristic extraction layer comprises a convolution part of the CNN model, a first pooling layer and a second pooling layer;
convolution part of CNN model: for inputting the word vector V1 into the convolution portion of the CNN model, which outputs a feature vector V3;
a first pooling layer: for inputting the feature vector V3 into the first pooling layer, which outputs the feature vector V4;
a second pooling layer: for inputting the word vector V1 into the second pooling layer, which outputs the feature vector V5;
a characteristic merging layer: for merging the feature vector V4 and the feature vector V5 into a feature vector V6;
full connection layer: for receiving the feature vector V6 and outputting the text classification of the text to be classified.
As shown in fig. 4, between the vector conversion layer and the feature extraction layer, there may be further included:
the BLMST model, the BLMST model input word vector V1, outputs the feature vector V1'.
Accordingly, the applicability of the convolution portion of the CNN model is adjusted to: the convolution part is used for inputting the characteristic vector V1' into the CNN model and outputting a characteristic vector V3;
the second pooling layer suitability was adjusted to: for inputting the feature vector V1' into the second pooling layer, which outputs the feature vector V5.
The present invention also provides a non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps in the text classification method described above.
The invention also provides a text classification device which is characterized by comprising a processor and the non-transitory computer readable storage medium.
It should be noted that the embodiment of the text classification model or apparatus of the present invention has the same principle as the embodiment of the text classification method, and the related parts can be referred to each other.
The above description is only exemplary of the present invention and should not be taken as limiting the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (4)

1. A method of text classification, the method comprising:
converting the text to be classified into a word vector V1;
inputting the word vector V1 into a convolution part of a CNN model, the convolution part of the CNN model outputting a feature vector V3, inputting the feature vector V3 into a first pooling layer, the first pooling layer outputting a feature vector V4; and, inputting the word vector V1 into a second pooling layer, the second pooling layer outputting a feature vector V5;
merging the feature vector V4 and the feature vector V5 into a feature vector V6;
inputting the feature vector V6 into a fully connected layer, and outputting the text classification of the text to be classified by the fully connected layer;
the inputting of the word vector V1 into the convolution portion of the CNN model includes: inputting the word vector V1 into a BLSTM model, the BLSTM model outputting a feature vector V1', and inputting the feature vector V1' into the convolution part of the CNN model;
and/or, the inputting of the word vector V1 into the second pooling layer comprises: inputting the word vector V1 into the BLSTM model, the BLSTM model outputting the feature vector V1', and inputting the feature vector V1' into the second pooling layer;
the BLSTM model is a shared model.
2. A text classification model, the model comprising:
vector conversion layer: for converting the text to be classified into a word vector V1;
a feature extraction layer: for inputting the word vector V1 into the convolution part of a CNN model, which outputs a feature vector V3, inputting the feature vector V3 into a first pooling layer, which outputs a feature vector V4; and, for inputting the word vector V1 into a second pooling layer, the second pooling layer outputting a feature vector V5;
a characteristic merging layer: for merging the feature vector V4 and feature vector V5 into a feature vector V6;
full connection layer: for receiving the feature vector V6 and outputting the text classification of the text to be classified;
the inputting of the word vector V1 into the convolution portion of the CNN model includes: inputting the word vector V1 into a BLSTM model, the BLSTM model outputting a feature vector V1', and inputting the feature vector V1' into the convolution part of the CNN model;
and/or, the inputting of the word vector V1 into the second pooling layer comprises: inputting the word vector V1 into the BLSTM model, the BLSTM model outputting the feature vector V1', and inputting the feature vector V1' into the second pooling layer;
the BLSTM model is a shared model.
3. A non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps in the text classification method of claim 1.
4. A text classification apparatus comprising a processor and the non-transitory computer readable storage medium of claim 3.
CN201910492286.2A 2019-06-06 2019-06-06 Text classification method, model and device Active CN110399482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910492286.2A CN110399482B (en) 2019-06-06 2019-06-06 Text classification method, model and device

Publications (2)

Publication Number Publication Date
CN110399482A CN110399482A (en) 2019-11-01
CN110399482B true CN110399482B (en) 2021-12-03

Family

ID=68323125

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910492286.2A Active CN110399482B (en) 2019-06-06 2019-06-06 Text classification method, model and device

Country Status (1)

Country Link
CN (1) CN110399482B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107885853A (en) * 2017-11-14 2018-04-06 同济大学 A kind of combined type file classification method based on deep learning
CN109241283A (en) * 2018-08-08 2019-01-18 广东工业大学 A kind of file classification method based on multi-angle capsule network
US10304208B1 (en) * 2018-02-12 2019-05-28 Avodah Labs, Inc. Automated gesture identification using neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Text Sentiment Analysis Based on Feature Fusion of CNN and BiLSTM Networks (基于CNN和BiLSTM网络特征融合的文本情感分析); 李洋, 董红斌; Journal of Computer Applications (计算机应用), Nov. 2018, vol. 38, no. 11, pp. 3075-3080 *
Localized Bidirectional Long Short-Term Memory for Text Classification (用于文本分类的局部化双向长短时记忆); 万圣贤, 兰艳艳; Journal of Chinese Information Processing (中文信息学报), May 2017, vol. 31, no. 3, pp. 62-68 *

Also Published As

Publication number Publication date
CN110399482A (en) 2019-11-01

Similar Documents

Publication Publication Date Title
CN109086303B (en) Intelligent conversation method, device and terminal based on machine reading understanding
WO2018006727A1 (en) Method and apparatus for transferring from robot customer service to human customer service
CN112164391B (en) Statement processing method, device, electronic equipment and storage medium
US20190163742A1 (en) Method and apparatus for generating information
CN112685565A (en) Text classification method based on multi-mode information fusion and related equipment thereof
US20210271823A1 (en) Content generation using target content derived modeling and unsupervised language modeling
JP2023535709A (en) Language expression model system, pre-training method, device, device and medium
KR102576344B1 (en) Method and apparatus for processing video, electronic device, medium and computer program
CN106227792B (en) Method and apparatus for pushed information
CN110162766B (en) Word vector updating method and device
CN111783903B (en) Text processing method, text model processing method and device and computer equipment
CN112084307B (en) Data processing method, device, server and computer readable storage medium
CN113158554B (en) Model optimization method and device, computer equipment and storage medium
CN111831826A (en) Training method, classification method and device of cross-domain text classification model
CN113392179A (en) Text labeling method and device, electronic equipment and storage medium
CN111625715A (en) Information extraction method and device, electronic equipment and storage medium
WO2020006488A1 (en) Corpus generating method and apparatus, and human-machine interaction processing method and apparatus
CN114817478A (en) Text-based question and answer method and device, computer equipment and storage medium
Song Sentiment analysis of Japanese text and vocabulary learning based on natural language processing and SVM
CN110969005A (en) Method and device for determining similarity between entity corpora
CN116881462A (en) Text data processing, text representation and text clustering method and equipment
CN110399482B (en) Text classification method, model and device
CN116975221A (en) Text reading and understanding method, device, equipment and storage medium
JP2023554210A (en) Sort model training method and apparatus for intelligent recommendation, intelligent recommendation method and apparatus, electronic equipment, storage medium, and computer program
CN113010664A (en) Data processing method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant