CN109871448B - Short text classification method and system - Google Patents


Info

Publication number
CN109871448B
Authority
CN
China
Prior art keywords
convolution
feature
text
dimensional
multidimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910191018.7A
Other languages
Chinese (zh)
Other versions
CN109871448A (en
Inventor
朱芬红
朱巧明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN201910191018.7A priority Critical patent/CN109871448B/en
Publication of CN109871448A publication Critical patent/CN109871448A/en
Application granted granted Critical
Publication of CN109871448B publication Critical patent/CN109871448B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method and a system for classifying short texts. The method comprises the following steps: performing text processing on the short text to be classified to obtain a text vector matrix; extracting features from the text vector matrix with a convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and compressing each multidimensional convolution feature to obtain its corresponding one-dimensional essential feature; calculating a weight value for each one-dimensional essential feature; and, for each multidimensional convolution feature, weighting it according to the weight value of its one-dimensional essential feature and determining the category of the short text to be classified from the resulting adaptive convolution feature. In this scheme, multidimensional convolution features of the short text are extracted from different angles, and a weight is calculated for the feature of each angle. The adaptive convolution feature used to determine the short text category is computed from these weights, which improves the accuracy of short text classification.

Description

Short text classification method and system
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a system for classifying short texts.
Background
With the development of science and technology, the use of neural network algorithms to construct classification models for classifying short texts gradually becomes one of the mainstream classification modes, wherein the classification model based on convolutional neural networks achieves better performance on short text classification.
A classification model based on a convolutional neural network algorithm is constructed as follows. A short text training data set is taken as the input of the network model; text features are extracted from multiple angles using multiple convolution kernels; the largest value in each convolution feature is taken, and these values are spliced together as the final feature representation of the text; a fully connected layer then predicts the category from this representation. The model parameters are optimized continuously by back propagation so that the predicted category distribution approaches the real category distribution until the model converges, yielding a short text classification model that fits the training data and generalizes well. A short text expresses the full meaning of a sentence with few words, so its wording is concise, and each word may express a topic category from a different angle. However, when a convolutional neural network processes multi-angle features, it treats every angle feature as equally important to the text representation and splices all angle features together directly. As a result, the text features are not sufficiently discriminative and the text feature representation is not sufficiently informative, which reduces classification accuracy.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide a method and a system for classifying short texts, so as to solve the problems of weak distinction between text features extracted by the existing classification model formed based on the convolutional neural network and insufficient informativity, which often results in low classification accuracy.
In order to achieve the above object, the embodiment of the present invention provides the following technical solutions:
the first aspect of the embodiment of the invention discloses a method for classifying short texts, which comprises the following steps:
performing text processing on short texts to be classified to obtain a text vector matrix;
extracting features of the text vector matrix based on a convolutional neural network to obtain multidimensional convolutional features corresponding to a plurality of different angles, and compressing each multidimensional convolutional feature to obtain one-dimensional essential features corresponding to each multidimensional convolutional feature, wherein each angle corresponds to one multidimensional convolutional feature;
calculating a weight value of each one-dimensional essential feature aiming at each one-dimensional essential feature;
and weighting the multidimensional convolution characteristics according to the weight value corresponding to the one-dimensional essential characteristics, and determining the category of the short text to be classified by utilizing the obtained self-adaptive convolution characteristics.
Preferably, the text processing of the short text to be classified to obtain a text vector matrix includes:
performing word segmentation processing and de-stop word processing on the short text to be classified to obtain a first word list;
filtering low-frequency words in the first vocabulary to obtain a second vocabulary;
numbering the words in the second vocabulary to obtain a first text sequence containing the corresponding relation between the words and the word numbers;
based on the sequence length of the first text sequence, zero padding processing or truncation processing or no processing is carried out on the first text sequence, so as to obtain a second text sequence;
and mapping the second text sequence to obtain a text vector matrix based on the word vector matrix, wherein the word vector matrix is obtained from a pre-trained word vector model.
Preferably, the feature extraction is performed on the text vector matrix based on the convolutional neural network to obtain multidimensional convolutional features corresponding to a plurality of different angles, and each multidimensional convolutional feature is compressed to obtain one-dimensional essential features corresponding to each multidimensional convolutional feature, which comprises:
based on each convolution kernel preset in the convolution neural network, carrying out convolution operation on the text vector matrix to obtain multidimensional convolution features corresponding to a plurality of different angles, wherein each convolution kernel corresponds to an angle;
and compressing each multidimensional convolution feature through an average pooling operation to obtain a one-dimensional essential feature corresponding to each multidimensional convolution feature.
Preferably, the calculating, for each one-dimensional essential feature, a weight value of each one-dimensional essential feature includes:
randomly initializing a parameter matrix and a bias vector;
optimizing the one-dimensional essential features based on network parameters aiming at each one-dimensional essential feature to obtain a first importance set containing the importance of each one-dimensional essential feature in the short text to be classified, wherein the network parameters comprise a convolution kernel, a parameter matrix and a bias vector;
based on RELU activation function, setting the useless one-dimensional essential features in the first importance set to zero to obtain a second importance set;
and compressing the second importance set based on a sigmoid function to obtain a weight value of each one-dimensional essential feature.
Preferably, the weighting the multidimensional convolution feature based on the weight value corresponding to the one-dimensional essential feature for each multidimensional convolution feature to obtain an adaptive convolution feature for determining the short text category to be classified includes:
performing a maximum pooling operation on each multidimensional convolution feature to obtain an optimal convolution feature corresponding to each multidimensional convolution feature;
weighting the weight value corresponding to each one-dimensional essential feature and the optimal convolution feature to obtain a plurality of weighted convolution features;
performing splicing treatment on the plurality of weighted convolution features to obtain a self-adaptive convolution feature;
inputting the self-adaptive convolution characteristics into a pre-constructed classification sub-model, and determining the class of the short text to be classified based on the class corresponding to the largest weighted convolution characteristic in the self-adaptive convolution characteristics.
In a second aspect, an embodiment of the present invention discloses a system for classifying short text, the system including:
the first processing unit is used for carrying out text processing on the short text to be classified to obtain a text vector matrix;
the second processing unit is used for extracting the characteristics of the text vector matrix based on a convolutional neural network to obtain multidimensional convolutional characteristics corresponding to a plurality of different angles, and compressing each multidimensional convolutional characteristic to obtain one-dimensional essential characteristics corresponding to each multidimensional convolutional characteristic, wherein each angle corresponds to one multidimensional convolutional characteristic;
the calculating unit is used for calculating, for each one-dimensional essential feature, a weight value of the one-dimensional essential feature;
the classification unit is used for weighting the multidimensional convolution characteristics according to the weight value corresponding to the one-dimensional essential characteristics aiming at each multidimensional convolution characteristic, and determining the category of the short text to be classified by utilizing the obtained self-adaptive convolution characteristics.
Preferably, the first processing unit includes:
the first processing module is used for carrying out word segmentation processing and de-stopping word processing on the short text to be classified to obtain a first word list;
the filtering module is used for filtering the low-frequency words in the first word list to obtain a second word list;
the numbering module is used for numbering words in the second word list to obtain a first text sequence containing the corresponding relation of the words and the word numbers;
the second processing module is used for carrying out zero padding processing or truncation processing or no processing on the first text sequence based on the sequence length of the first text sequence to obtain a second text sequence;
and the mapping module is used for mapping the second text sequence to obtain the text vector matrix based on the word vector matrix, wherein the word vector matrix is obtained from a pre-trained word vector model.
Preferably, the second processing unit includes:
the operation module is used for carrying out convolution operation on the text vector matrix based on each convolution kernel preset in the convolution neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and each convolution kernel corresponds to an angle;
and the compression module is used for compressing each multidimensional convolution feature through an average pooling operation to obtain a one-dimensional essential feature corresponding to each multidimensional convolution feature.
Preferably, the calculation unit includes:
the initialization module is used for randomly initializing the parameter matrix and the bias vector;
the optimization module is used for optimizing the one-dimensional essential features based on network parameters aiming at each one-dimensional essential feature to obtain a first importance set containing the importance of each one-dimensional essential feature in the short text to be classified, wherein the network parameters comprise a convolution kernel, a parameter matrix and a bias vector;
the processing module is used for setting the useless one-dimensional essential features in the first importance set to zero based on the RELU activation function to obtain a second importance set;
and the compression module is used for compressing the second importance set based on a sigmoid function to obtain a weight value of each one-dimensional essential feature.
Preferably, the classifying unit includes:
the operation module is used for carrying out maximum pooling operation on the multidimensional convolution characteristics aiming at each multidimensional convolution characteristic to obtain an optimal convolution characteristic corresponding to each multidimensional convolution characteristic;
the weighting module is used for carrying out weighting processing on the weight value corresponding to each one-dimensional essential feature and the optimal convolution feature to obtain a plurality of weighted convolution features;
the splicing module is used for carrying out splicing treatment on the plurality of weighted convolution characteristics to obtain a self-adaptive convolution characteristic;
the determining module is used for inputting the self-adaptive convolution characteristic into a pre-constructed classifying sub-model, and determining the class of the short text to be classified based on the class corresponding to the largest weighted convolution characteristic in the self-adaptive convolution characteristic.
Based on the method and the system for classifying short texts provided by the embodiments of the invention, the method comprises the following steps: performing text processing on the short text to be classified to obtain a text vector matrix; extracting features from the text vector matrix with a convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and compressing each multidimensional convolution feature to obtain its corresponding one-dimensional essential feature; calculating a weight value for each one-dimensional essential feature; and, for each multidimensional convolution feature, weighting it according to the weight value of its one-dimensional essential feature and determining the category of the short text to be classified from the resulting adaptive convolution feature. In this scheme, multidimensional convolution features of the short text are extracted from different angles, and a weight is calculated for the feature of each angle. The adaptive convolution feature used to determine the short text category is computed from these weights, which improves the accuracy of short text classification.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for classifying short text according to an embodiment of the present invention;
FIG. 2 is a flowchart of obtaining a text vector matrix according to an embodiment of the present invention;
FIG. 3 is a flowchart of calculating a weight value of a one-dimensional essential feature according to an embodiment of the present invention;
FIG. 4 is a flowchart of obtaining an adaptive convolution feature according to an embodiment of the present invention;
FIG. 5 is a block diagram of a system for classifying short text according to an embodiment of the present invention;
FIG. 6 is a block diagram of a system for classifying short text according to an embodiment of the present invention;
FIG. 7 is a block diagram of a system for classifying short text according to an embodiment of the present invention;
FIG. 8 is a block diagram of a system for classifying short text according to an embodiment of the present invention;
FIG. 9 is a block diagram of a system for classifying short text according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In the present disclosure, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As known from the background art, when the convolutional neural network processes multi-angle features, it considers that the importance degree of each angle feature to the text representation is consistent, and the various angle features are directly spliced, which may result in poor distinction of the text features and insufficient informativity of the text feature representation, so that the classification accuracy is reduced.
Therefore, the embodiment of the invention provides a method and a system for classifying short texts, which are characterized in that the multi-dimensional convolution characteristics of the short texts are extracted from different angles, and the weights of the multi-dimensional convolution characteristics of the different angles in the short texts are calculated. An adaptive convolution characteristic for determining short text categories is computed based on the weights. And determining the category of the short text to be classified by utilizing the self-adaptive convolution characteristic so as to improve the accuracy of classifying the short text.
Referring to fig. 1, a flowchart of a method for classifying short text according to an embodiment of the present invention is shown, including the following steps:
Step S101: Perform text processing on the short text to be classified to obtain a text vector matrix.
In the specific implementation process of step S101, before classifying the short text to be classified, the short text to be classified needs to be processed to obtain the text vector matrix.
Step S102: Extract features from the text vector matrix based on a convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and compress each multidimensional convolution feature to obtain a one-dimensional essential feature corresponding to each multidimensional convolution feature.
In the specific implementation process of step S102, first, based on each convolution kernel preset in the convolutional neural network, a convolution operation is performed on the text vector matrix, so as to obtain multidimensional convolution features corresponding to a plurality of different angles. And then, compressing each multidimensional convolution feature through an average pooling operation to obtain a one-dimensional essential feature corresponding to each multidimensional convolution feature.
It should be noted that, in the convolutional neural network, although the window size of the convolutional kernel is the same, the parameters of each convolutional kernel are different, so each convolutional kernel corresponds to an angle, and each angle corresponds to a multidimensional convolutional feature. For example, if there are 10 convolution kernels with different parameters in the convolutional neural network, the convolution operation on the text vector can obtain multidimensional convolution features corresponding to 10 different angles.
It should be noted that, each convolution feature is a semantic block set including multiple concept levels, and meaning deviation of each semantic block representation in each semantic block set may be large, so that the above-mentioned average pooling operation needs to be used to perform addition processing and averaging processing on the semantic block set to obtain one-dimensional essential features of the semantic block set.
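The convolution and average pooling of step S102 can be sketched as follows. This is a minimal illustration in plain Python, not the patented implementation: the function names `conv_feature` and `average_pool`, the toy text matrix, and the two kernels are all assumptions made for the example, and no activation function is applied because the text does not mention one.

```python
def conv_feature(text_matrix, kernel, bias=0.0):
    """Slide one kernel (window x dim weights) down the rows of the text matrix.

    Returns one value per window position; this list plays the role of the
    multidimensional convolution feature for one angle (one kernel).
    """
    window = len(kernel)
    feature = []
    for start in range(len(text_matrix) - window + 1):
        total = bias
        for r in range(window):
            for x, w in zip(text_matrix[start + r], kernel[r]):
                total += x * w
        feature.append(total)
    return feature


def average_pool(feature):
    """Add up the values of one convolution feature and average them."""
    return sum(feature) / len(feature)


# Toy input (assumed): 4 words, each with a 2-dimensional word vector.
text_matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Two kernels with the same window size (2) but different parameters,
# so each one corresponds to a different angle.
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [-0.5, -0.5]],
]

features = [conv_feature(text_matrix, k) for k in kernels]
essential = [average_pool(f) for f in features]
```

Each entry of `essential` is a single number summarizing one kernel's output, which is what the later weighting step takes as input.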
Step S103: For each one-dimensional essential feature, calculate a weight value of the one-dimensional essential feature.
In the specific implementation process of step S103, the weight value of each one-dimensional essential feature is calculated by formula (1). In formula (1), $a$ is the one-dimensional essential feature, $W_1$ is the initialized parameter matrix, $b_1$ is the bias vector, $\delta$ is the RELU activation function, and $\sigma$ is the sigmoid activation function.
$$g = \sigma(\delta(a \cdot W_1 + b_1)) \qquad (1)$$
Step S104: For each multidimensional convolution feature, weight the multidimensional convolution feature according to the weight value of its corresponding one-dimensional essential feature, and determine the category of the short text to be classified using the resulting adaptive convolution feature.
In the specific implementation process of step S104, the multidimensional convolution feature is weighted based on the weight value corresponding to the one-dimensional essential feature, so as to obtain an adaptive convolution feature, and the adaptive convolution feature is input into a classification submodel, so that the classification of the short text to be classified can be determined.
In the embodiment of the invention, the weight of the multi-dimensional convolution characteristics of different angles in the short text is calculated by extracting the multi-dimensional convolution characteristics of the short text from different angles. And based on the weight calculation, obtaining self-adaptive convolution characteristics for determining the category of the short text, determining the category of the short text to be classified by utilizing the self-adaptive convolution characteristics, and improving the accuracy of short text classification.
Referring to fig. 2, a flowchart of obtaining a text vector matrix according to an embodiment of the present invention is shown, where the process of obtaining a text vector matrix in step S101 includes the following steps:
Step S201: Perform word segmentation and stop-word removal on the short text to be classified to obtain a first vocabulary.
In the specific implementation process of step S201, word segmentation is performed on the short text to be classified to obtain a plurality of words. Stop words are then removed from these words based on a pre-acquired stop word list, finally yielding the first vocabulary.
The words in the first vocabulary are typically arranged in word segmentation order.
Stop-word removal removes words that carry no real meaning. Stop words are generally understood to be the function words of a human language, which have no actual meaning compared with other words.
Step S202: Filter the low-frequency words out of the first vocabulary to obtain a second vocabulary.
In the specific implementation process of step S202, words whose frequency of use is lower than a threshold are filtered out of the first vocabulary. For example, if words used fewer than 2 times are preset as low-frequency words, words used fewer than 2 times are filtered out of the first vocabulary.
It should be noted that the threshold may also be 3, 4, 5, or the like.
Step S203: Number the words in the second vocabulary to obtain a first text sequence containing the correspondence between words and word numbers.
In the specific implementation process of step S203, the words in the second vocabulary are numbered based on a preset numbering rule, so that the correspondence between words and word numbers is obtained. It should be noted that each distinct word in the second vocabulary receives only one number; for example, if the word "we" appears 5 times, "we" is numbered only once.
Optionally, the preset numbering rule is to number the same words in the second vocabulary according to the extracted word segmentation order.
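Steps S201 to S203 can be sketched as below. This is an illustration under stated assumptions rather than the method as filed: whitespace splitting stands in for a real word segmenter, word frequencies are counted over a small example corpus, the number 0 is reserved for padding, and the names `build_vocab` and `encode` are hypothetical.

```python
from collections import Counter


def build_vocab(texts, stop_words, min_count=2):
    """Return (word -> number mapping, tokenized texts).

    Numbers start at 1; 0 is left free for padding (an assumption).
    """
    # Step S201 stand-in: whitespace segmentation, then stop-word removal.
    tokenized = [[w for w in t.split() if w not in stop_words] for t in texts]
    # Step S202: count usage so low-frequency words can be filtered out.
    counts = Counter(w for tokens in tokenized for w in tokens)
    # Step S203: number each distinct word once, in order of first appearance.
    word_to_id = {}
    for tokens in tokenized:
        for w in tokens:
            if counts[w] >= min_count and w not in word_to_id:
                word_to_id[w] = len(word_to_id) + 1
    return word_to_id, tokenized


def encode(tokens, word_to_id):
    """Replace each kept word by its number; filtered words are dropped."""
    return [word_to_id[w] for w in tokens if w in word_to_id]


texts = ["we like the short text", "we classify the short text", "we test"]
vocab, tokenized = build_vocab(texts, stop_words={"the"})
sequence = encode(tokenized[0], vocab)
```

In this example "we" occurs three times but receives a single number, and words used only once ("like", "classify", "test") are filtered out as low-frequency words.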
Step S204: Based on the sequence length of the first text sequence, perform zero padding, truncation, or no processing on the first text sequence to obtain a second text sequence.
In the specific implementation process of step S204, when a short text classification model is built, the text sequences input to the model must have a uniform length. Therefore, when the length of an input text sequence differs from the standard length, the text sequence must be processed: zero padding is performed when the text sequence is shorter than the standard length, truncation is performed when it is longer than the standard length, and no processing is performed when it is exactly the standard length. Zero padding means filling zeros into a text sequence shorter than the standard length until its length matches the standard length. Truncation means cutting words off a text sequence longer than the standard length until its length matches the standard length. For example, if the standard length is set to 5 words and a text sequence has 7 words, the last two words in the text sequence are cut off.
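The three cases of step S204 fit in a few lines. The sketch below assumes that zeros are appended at the end of the sequence (the text does not say where they go) and uses a hypothetical function name, `pad_or_truncate`.

```python
def pad_or_truncate(sequence, standard_length, pad_id=0):
    """Bring a sequence of word numbers to exactly standard_length entries."""
    if len(sequence) < standard_length:
        # Zero padding: append pad_id until the standard length is reached.
        return sequence + [pad_id] * (standard_length - len(sequence))
    # Truncation keeps the first standard_length entries; a sequence that
    # already has the standard length is returned unchanged in content.
    return sequence[:standard_length]


short_seq = pad_or_truncate([4, 8], 5)
long_seq = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], 5)
exact_seq = pad_or_truncate([9, 9, 9, 9, 9], 5)
```

The second call mirrors the example in the text: a 7-word sequence with a standard length of 5 loses its last two words.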
Step S205: Map the second text sequence based on the word vector matrix to obtain the text vector matrix.
In the specific implementation process of step S205, word vectors are obtained from a pre-trained Word2vec model, and the correspondence between word numbers and word vectors is constructed to obtain the word vector matrix. The second text sequence is then mapped through the word vector matrix to obtain the text vector matrix.
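The mapping in step S205 amounts to a row lookup. In the sketch below, the small hand-written `word_vectors` table stands in for vectors taken from a pre-trained Word2vec model, row 0 is assumed to be an all-zero padding vector, and `to_text_matrix` is a hypothetical helper name.

```python
def to_text_matrix(sequence, word_vectors):
    """Look up the word vector for every word number in the sequence."""
    return [list(word_vectors[word_id]) for word_id in sequence]


# Word vector matrix (assumed values): row i holds the vector of word number i.
word_vectors = [
    [0.0, 0.0],   # row 0: padding
    [0.1, 0.3],   # word number 1
    [0.5, -0.2],  # word number 2
    [0.4, 0.9],   # word number 3
]

# A second text sequence of standard length 5, already zero-padded.
text_matrix = to_text_matrix([1, 3, 2, 0, 0], word_vectors)
```

The result has one row per position in the sequence and one column per word vector dimension, which is the shape the convolution kernels operate on.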
In the embodiment of the invention, the text vector matrix is obtained by processing the short text to be classified, and the multi-dimensional convolution characteristics corresponding to different angles are obtained by extracting the characteristics of the text vector matrix from different angles. Based on the multidimensional convolution characteristics, self-adaptive convolution characteristics for determining the category of the short text to be classified are obtained, and the category of the short text to be classified is determined by utilizing the self-adaptive convolution characteristics, so that the accuracy of short text classification can be improved.
For the process of calculating the weight value of the one-dimensional essential feature in step S103 of FIG. 1, refer to FIG. 3, which shows a flowchart of calculating the weight value of the one-dimensional essential feature according to an embodiment of the present invention, including the following steps:
Step S301: Randomly initialize the parameter matrix and the bias vector.
Step S302: optimizing the one-dimensional essential features based on network parameters for each one-dimensional essential feature to obtain a first importance set containing the importance of each one-dimensional essential feature in the short text to be classified.
In the specific implementation of step S302, the network parameters include a convolution kernel, the parameter matrix, and a bias vector. And optimizing each one-dimensional essential feature based on the network parameters to obtain the importance of each one-dimensional essential feature in the short text to be classified, namely obtaining the first importance set, namely the importance distribution.
This is described by formula (1), in which the term $(a \cdot W_1 + b_1)$ characterizes the importance distribution corresponding to the one-dimensional essential feature $a$.
Step S303: Based on the RELU activation function, set the useless one-dimensional essential features in the first importance set to zero to obtain a second importance set.
In the specific implementation process of step S303, the RELU activation function is applied to the first importance set: a value less than or equal to 0 is set to 0, and a value greater than 0 is kept unchanged. This processing yields the second importance set.
Step S304: Compress the second importance set based on a sigmoid function to obtain a weight value of each one-dimensional essential feature.
In the specific implementation process of step S304, the sigmoid function compresses each value in the second importance set; a value of 0 in the second importance set is mapped to one half. The compression yields the weight value of each one-dimensional essential feature.
According to the embodiment of the invention, the weights of the multidimensional convolution features at different angles in the short text are calculated, the self-adaptive convolution features for determining the category of the short text are obtained based on the weight calculation, and the category of the short text to be classified is determined by utilizing the self-adaptive convolution features, so that the accuracy of short text classification can be improved.
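Steps S301 to S304 can be illustrated with a minimal NumPy sketch. It assumes the one-dimensional essential features are collected into a vector `a` and that the parameter matrix `w1` and bias vector `b1` have compatible shapes. The function name `feature_weights`, the sizes, and the initialization scale are illustrative assumptions, and formula (1) may involve parameters beyond those named here.

```python
import numpy as np

def feature_weights(a, w1, b1):
    """Weights of the one-dimensional essential features (steps S302 to S304)."""
    # Step S302: first importance set, the term (a x w_1 + b_1) of formula (1).
    first_importance = a @ w1 + b1
    # Step S303: RELU sets values <= 0 to zero and keeps positive values.
    second_importance = np.maximum(first_importance, 0.0)
    # Step S304: the sigmoid compresses the result; zeros map to one half.
    return 1.0 / (1.0 + np.exp(-second_importance))

# Step S301: random initialization (sizes and scale are assumptions).
rng = np.random.default_rng(0)
num_features = 4
w1 = rng.normal(scale=0.1, size=(num_features, num_features))
b1 = rng.normal(scale=0.1, size=num_features)

a = rng.normal(size=num_features)  # stand-in for real essential features
weights = feature_weights(a, w1, b1)
```

Because the sigmoid is applied to non-negative values, every weight produced by this sketch is at least 0.5 and at most 1.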
For the process of obtaining the adaptive convolution feature in step S104 of fig. 1, refer to fig. 4, which shows a flowchart for obtaining the adaptive convolution feature according to an embodiment of the present invention. The process includes the following steps:
Step S401: for each multidimensional convolution feature, perform a maximum pooling operation on the multidimensional convolution feature to obtain the optimal convolution feature corresponding to it.
In the specific implementation of step S401, each multidimensional convolution feature contains a plurality of convolution features, and the maximum pooling operation extracts the largest of them as the optimal convolution feature corresponding to that multidimensional convolution feature.
Step S402: weight each optimal convolution feature by the weight value of the corresponding one-dimensional essential feature to obtain a plurality of weighted convolution features.
In the specific implementation of step S402, the optimal convolution feature of each multidimensional convolution feature is weighted by the weight value of the corresponding one-dimensional essential feature, yielding a plurality of weighted convolution features.
It should be noted that each multidimensional convolution feature corresponds to one one-dimensional essential feature.
Step S403: splice the plurality of weighted convolution features to obtain the adaptive convolution feature.
In the specific implementation of step S403, the weighted convolution features obtained in step S402 are spliced (concatenated) to obtain the adaptive convolution feature.
Step S404: input the adaptive convolution feature into a pre-constructed classification sub-model, and determine the category of the short text to be classified based on the category corresponding to the largest weighted convolution feature in the adaptive convolution feature.
In the specific implementation of step S404, text category labels covering a plurality of text categories are built into the classification sub-model in advance. The adaptive convolution feature is input into the classification sub-model, the largest weighted convolution feature in the adaptive convolution feature is determined, the adaptive convolution feature is compared with the text category labels, and the text category corresponding to the largest weighted convolution feature is taken as the category of the short text to be classified.
In the embodiment of the invention, based on the weight value corresponding to each one-dimensional essential feature, the multidimensional convolution feature is weighted and spliced to obtain an adaptive convolution feature, and the adaptive convolution feature is used as the input of the classification submodel to determine the classification of the short text to be classified, so that the accuracy of short text classification can be improved.
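Steps S401 to S404 can be sketched as follows. The sketch treats each multidimensional convolution feature as a one-dimensional NumPy array and models the classification sub-model as a single fully connected layer followed by picking the largest score; both choices, along with the function names, labels, and parameter values, are assumptions for illustration rather than details confirmed by the text.

```python
import numpy as np

def adaptive_convolution_feature(feature_maps, weights):
    # Step S401: max pooling keeps the largest value of each feature map.
    optimal = [np.max(fm) for fm in feature_maps]
    # Step S402: weight each optimal feature by its feature's weight value.
    weighted = [np.atleast_1d(w * o) for w, o in zip(weights, optimal)]
    # Step S403: splice (concatenate) the weighted features.
    return np.concatenate(weighted)

def classify(adaptive, fc_weight, fc_bias, labels):
    # Step S404 (assumed form): fully connected layer, then largest score.
    scores = adaptive @ fc_weight + fc_bias
    return labels[int(np.argmax(scores))]

feature_maps = [np.array([0.1, 0.9, 0.3]), np.array([2.0, -1.0])]
weights = np.array([0.5, 1.0])
adaptive = adaptive_convolution_feature(feature_maps, weights)  # [0.45, 2.0]

labels = ["sports", "finance"]  # hypothetical text category labels
fc_weight = np.eye(2)           # hypothetical trained parameters
fc_bias = np.zeros(2)
category = classify(adaptive, fc_weight, fc_bias, labels)
```

With the identity matrix standing in for trained parameters, the scores equal the weighted features themselves, so the example selects the category aligned with the largest weighted convolution feature.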
Corresponding to the short text classification method provided in the above embodiment of the present invention, and referring to fig. 5, an embodiment of the present invention further provides a system for short text classification, shown there as a structural block diagram. The system includes: a first processing unit 501, a second processing unit 502, a computing unit 503, and a classifying unit 504.
The first processing unit 501 is configured to perform text processing on the short text to be classified to obtain a text vector matrix.
The second processing unit 502 is configured to perform feature extraction on the text vector matrix based on a convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and perform compression processing on each multidimensional convolution feature to obtain one-dimensional essential features corresponding to each multidimensional convolution feature, where each angle corresponds to one multidimensional convolution feature. For details of calculating the one-dimensional essential features, see the corresponding content of step S102 disclosed in fig. 1 in the above embodiment of the present invention.
The computing unit 503 is configured to calculate, for each one-dimensional essential feature, the weight value of that feature. For the calculation process of the weight value, refer to the content corresponding to step S103 disclosed in fig. 1 in the above embodiment of the present invention.
The classifying unit 504 is configured to weight the multidimensional convolution feature based on a weight value corresponding to the one-dimensional essential feature for each multidimensional convolution feature, and determine the category of the short text to be classified by using the obtained adaptive convolution feature.
In this embodiment of the invention, multidimensional convolution features of the short text are extracted from different angles, and the weights of the multidimensional convolution features of the different angles in the short text are calculated. Based on the calculated weights, an adaptive convolution feature for determining the category of the short text is obtained, and the category of the short text to be classified is determined using this adaptive convolution feature, which improves the accuracy of short text classification.
Referring to fig. 6, there is shown a block diagram of a system for classifying short text according to an embodiment of the present invention, where the first processing unit 501 includes: a first processing module 5011, a filtering module 5012, a numbering module 5013, a second processing module 5014, and a mapping module 5015.
The first processing module 5011 is configured to perform word segmentation and stop-word removal on the short text to be classified to obtain a first vocabulary. For details, refer to the content corresponding to step S201 disclosed in fig. 2 in the above embodiment of the present invention.
The filtering module 5012 is configured to filter the low-frequency words out of the first vocabulary to obtain a second vocabulary. For details, refer to the content corresponding to step S202 disclosed in fig. 2 in the above embodiment of the present invention.
The numbering module 5013 is configured to number the words in the second vocabulary to obtain a first text sequence containing the correspondence between words and word numbers. For details of numbering the words in the second vocabulary, refer to the content corresponding to step S203 disclosed in fig. 2 in the above embodiment of the present invention.
The second processing module 5014 is configured to zero-pad, truncate, or leave unchanged the first text sequence, depending on its sequence length, to obtain a second text sequence. For details of processing the first text sequence, refer to the content corresponding to step S204 disclosed in fig. 2 in the above embodiment of the present invention.
The mapping module 5015 is configured to map the second text sequence to obtain a text vector matrix based on a word vector matrix, where the word vector matrix is obtained from a word vector model trained in advance.
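The processing chain carried out by modules 5011 to 5015 can be sketched as below. The sketch assumes the text has already been segmented into tokens, reserves number 0 for padding, and uses a random matrix as a stand-in for the word vector matrix of a pre-trained word vector model. The function names, the sample corpus, the stop-word set, and the frequency threshold are all hypothetical.

```python
from collections import Counter
import numpy as np

def build_vocab(token_lists, stop_words, min_count):
    # Remove stop words, then drop low-frequency words and number the rest.
    counts = Counter(t for tokens in token_lists for t in tokens
                     if t not in stop_words)
    kept = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i + 1 for i, w in enumerate(kept)}  # 0 is kept for padding

def to_sequence(tokens, vocab, max_len):
    # Map words to numbers, truncate long sequences, zero-pad short ones.
    ids = [vocab[t] for t in tokens if t in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))

def to_text_matrix(sequence, word_vectors):
    # Look up one word vector per position to form the text vector matrix.
    return word_vectors[np.asarray(sequence)]

corpus = [["the", "good", "phone", "good", "price"],
          ["good", "phone"],
          ["the", "bad", "price"]]
vocab = build_vocab(corpus, stop_words={"the"}, min_count=2)
sequence = to_sequence(["the", "bad", "price", "good"], vocab, max_len=4)

# Stand-in for a pre-trained word vector matrix (5 dimensions per word).
word_vectors = np.random.default_rng(0).normal(size=(len(vocab) + 1, 5))
word_vectors[0] = 0.0  # padding row
text_matrix = to_text_matrix(sequence, word_vectors)
```

Here "bad" occurs only once and is filtered out as a low-frequency word, so the sample text is reduced to two numbered words followed by two padding positions.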
In this embodiment of the invention, the short text to be classified is processed to obtain a text vector matrix, and features are extracted from the text vector matrix from different angles to obtain the multidimensional convolution feature corresponding to each angle. Based on the multidimensional convolution features, an adaptive convolution feature for determining the category of the short text to be classified is obtained, and the category is determined using this adaptive convolution feature, which can improve the accuracy of short text classification.
Referring to fig. 7, there is shown a block diagram of a system for classifying short text according to an embodiment of the present invention, where the second processing unit 502 includes:
The operation module 5021 is configured to perform a convolution operation on the text vector matrix based on each convolution kernel preset in the convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, where each convolution kernel corresponds to one angle.
The compression module 5022 is configured to compress each multidimensional convolution feature through an average pooling operation to obtain the one-dimensional essential feature corresponding to each multidimensional convolution feature.
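The convolution and average pooling performed by these two modules can be sketched as follows. The sketch assumes each convolution kernel spans the full word vector width and slides over word positions, producing one value per window, and it treats kernels of different heights as the different angles. The kernel values, sizes, and function names are illustrative assumptions.

```python
import numpy as np

def convolve_text(text_matrix, kernel):
    # Slide the kernel over word positions; one value per window.
    height = kernel.shape[0]
    steps = text_matrix.shape[0] - height + 1
    return np.array([np.sum(text_matrix[i:i + height] * kernel)
                     for i in range(steps)])

def essential_feature(feature_map):
    # Average pooling: sum the values, then divide by their count.
    return feature_map.sum() / feature_map.size

text_matrix = np.arange(12, dtype=float).reshape(4, 3)  # 4 words, 3 dims
kernels = [np.ones((2, 3)), np.ones((3, 3))]            # two "angles"
feature_maps = [convolve_text(text_matrix, k) for k in kernels]
essentials = [essential_feature(fm) for fm in feature_maps]
```

Each kernel yields one feature map and therefore one essential feature, which matches the one-to-one pairing between convolution features and essential features used later for weighting.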
Referring to fig. 8, there is shown a block diagram of a system for classifying short text according to an embodiment of the present invention, the computing unit 503 includes: an initialization module 5031, an optimization module 5032, a processing module 5033, and a compression module 5034.
The initialization module 5031 is configured to randomly initialize the parameter matrix and the bias vector.
The optimization module 5032 is configured to optimize, for each one-dimensional essential feature, the feature based on network parameters to obtain a first importance set containing the importance of each one-dimensional essential feature in the short text to be classified, where the network parameters include a convolution kernel, the parameter matrix, and the bias vector. For details of the first importance set, refer to the content corresponding to step S302 disclosed in fig. 3 in the above embodiment of the present invention.
The processing module 5033 is configured to set the useless one-dimensional essential features in the first importance set to zero based on the RELU activation function, so as to obtain a second importance set. For details of the processing of the first importance set, please refer to the corresponding content of step S303 disclosed in fig. 3 in the above embodiment of the present invention.
The compression module 5034 is configured to compress the second importance set based on a sigmoid function to obtain a weight value of each one-dimensional essential feature.
According to the embodiment of the invention, the weights of the multidimensional convolution features at different angles in the short text are calculated, the self-adaptive convolution features for determining the category of the short text are obtained based on the weight calculation, and the category of the short text to be classified is determined by utilizing the obtained self-adaptive convolution features, so that the accuracy of short text classification can be improved.
Referring to fig. 9, there is shown a block diagram of a system for classifying short text according to an embodiment of the present invention, the classifying unit 504 includes: an operation module 5041, a weighting module 5042, a stitching module 5043 and a determination module 5044.
The operation module 5041 is configured to perform a maximum pooling operation on each multi-dimensional convolution feature to obtain an optimal convolution feature corresponding to each multi-dimensional convolution feature.
The weighting module 5042 is configured to weight each optimal convolution feature by the weight value of the corresponding one-dimensional essential feature to obtain a plurality of weighted convolution features. For the specific process of obtaining the weighted convolution features, refer to the content corresponding to step S402 disclosed in fig. 4 in the above embodiment of the present invention.
The splicing module 5043 is configured to splice the plurality of weighted convolution features to obtain the adaptive convolution feature.
The determining module 5044 is configured to input the adaptive convolution feature into a pre-constructed classification sub-model and to determine the category of the short text to be classified based on the category corresponding to the largest weighted convolution feature in the adaptive convolution feature. For the specific process of determining the category of the short text to be classified, refer to the content corresponding to step S404 disclosed in fig. 4 in the above embodiment of the present invention.
The system for short text classification disclosed in the embodiment of the invention is equivalent to a classification model formed by a convolutional neural network (CNN) and a fully connected layer, where the CNN extracts the features of the text to be classified and a classification sub-model constructed in the fully connected layer determines the category of the text to be classified. The specific functions and principles are the same as the corresponding content disclosed for fig. 5 to fig. 9 in the foregoing embodiment of the present invention and are not described here again.
In summary, the embodiments of the invention provide a method and a system for short text classification. The method includes: performing text processing on the short text to be classified to obtain a text vector matrix; extracting features from the text vector matrix based on a convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and compressing each multidimensional convolution feature to obtain the corresponding one-dimensional essential feature; calculating a weight value for each one-dimensional essential feature; and, for each multidimensional convolution feature, weighting the multidimensional convolution feature by the weight value of the corresponding one-dimensional essential feature and determining the category of the short text to be classified using the resulting adaptive convolution feature. In this scheme, multidimensional convolution features of the short text are extracted from different angles and their weights in the short text are calculated. The adaptive convolution feature for determining the short text category is obtained from these weights, which improves the accuracy of short text classification.
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and the relevant parts of the method embodiment may be consulted. The system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A method of short text classification, the method comprising:
performing text processing on short texts to be classified to obtain a text vector matrix;
performing feature extraction on the text vector matrix based on a convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and summing and averaging each multidimensional convolution feature through an average pooling operation to obtain the one-dimensional essential feature corresponding to each multidimensional convolution feature, wherein each angle corresponds to one multidimensional convolution feature;
for each one-dimensional essential feature, calculating a weight value of the one-dimensional essential feature;
for each multidimensional convolution feature, weighting the multidimensional convolution feature based on the weight value corresponding to the one-dimensional essential feature, and determining the category of the short text to be classified using the obtained adaptive convolution feature;
wherein the calculating, for each one-dimensional essential feature, a weight value of the one-dimensional essential feature comprises:
randomly initializing a parameter matrix and a bias vector;
for each one-dimensional essential feature, optimizing the one-dimensional essential feature based on network parameters to obtain a first importance set containing the importance of each one-dimensional essential feature in the short text to be classified, wherein the network parameters comprise a convolution kernel, the parameter matrix, and the bias vector;
based on a RELU activation function, setting the useless one-dimensional essential features in the first importance set to zero to obtain a second importance set; and
compressing the second importance set based on a sigmoid function to obtain the weight value of each one-dimensional essential feature.
2. The method according to claim 1, wherein the text processing of the short text to be classified to obtain a text vector matrix comprises:
performing word segmentation and stop-word removal on the short text to be classified to obtain a first vocabulary;
filtering low-frequency words in the first vocabulary to obtain a second vocabulary;
numbering the words in the second vocabulary to obtain a first text sequence containing the corresponding relation between the words and the word numbers;
based on the sequence length of the first text sequence, zero padding processing or truncation processing or no processing is carried out on the first text sequence, so as to obtain a second text sequence;
and mapping the second text sequence to obtain a text vector matrix based on the word vector matrix, wherein the word vector matrix is obtained from a pre-trained word vector model.
3. The method according to claim 1, wherein the feature extraction is performed on the text vector matrix based on the convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and the adding processing and the averaging processing are performed on each multidimensional convolution feature through an averaging pooling operation to obtain one-dimensional essential features corresponding to each multidimensional convolution feature, including:
based on each convolution kernel preset in the convolution neural network, carrying out convolution operation on the text vector matrix to obtain multidimensional convolution features corresponding to a plurality of different angles, wherein each convolution kernel corresponds to an angle;
summing and averaging each multidimensional convolution feature through an average pooling operation to obtain the one-dimensional essential feature corresponding to each multidimensional convolution feature.
4. The method according to claim 1, wherein for each of the multi-dimensional convolution features, weighting the multi-dimensional convolution feature based on a weight value corresponding to the one-dimensional essential feature to obtain an adaptive convolution feature for determining the short text category to be classified, comprises:
for each multidimensional convolution feature, performing a maximum pooling operation on the multidimensional convolution feature to obtain the optimal convolution feature corresponding to each multidimensional convolution feature;
weighting each optimal convolution feature by the weight value corresponding to each one-dimensional essential feature to obtain a plurality of weighted convolution features;
splicing the plurality of weighted convolution features to obtain an adaptive convolution feature;
inputting the self-adaptive convolution characteristics into a pre-constructed classification sub-model, and determining the class of the short text to be classified based on the class corresponding to the largest weighted convolution characteristic in the self-adaptive convolution characteristics.
5. A system for short text classification, the system comprising:
the first processing unit is used for carrying out text processing on the short text to be classified to obtain a text vector matrix;
the second processing unit is used for performing feature extraction on the text vector matrix based on a convolutional neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and for summing and averaging each multidimensional convolution feature through an average pooling operation to obtain the one-dimensional essential feature corresponding to each multidimensional convolution feature, wherein each angle corresponds to one multidimensional convolution feature;
a computing unit, used for calculating, for each one-dimensional essential feature, a weight value of the one-dimensional essential feature;
the classification unit is used for weighting, for each multidimensional convolution feature, the multidimensional convolution feature based on the weight value corresponding to the one-dimensional essential feature, and determining the category of the short text to be classified using the obtained adaptive convolution feature;
the second processing unit includes:
the operation module is used for carrying out convolution operation on the text vector matrix based on each convolution kernel preset in the convolution neural network to obtain multidimensional convolution features corresponding to a plurality of different angles, and each convolution kernel corresponds to an angle;
the compression module is used for compressing each multidimensional convolution feature through an average pooling operation to obtain the one-dimensional essential feature corresponding to each multidimensional convolution feature.
6. The system of claim 5, wherein the first processing unit comprises:
the first processing module is used for performing word segmentation and stop-word removal on the short text to be classified to obtain a first vocabulary;
the filtering module is used for filtering low-frequency words in the first vocabulary to obtain a second vocabulary;
the numbering module is used for numbering the words in the second vocabulary to obtain a first text sequence containing the correspondence between words and word numbers;
the second processing module is used for carrying out zero padding processing or truncation processing or no processing on the first text sequence based on the sequence length of the first text sequence to obtain a second text sequence;
and the mapping module is used for mapping the second text sequence to obtain the text vector matrix based on the word vector matrix, wherein the word vector matrix is obtained from a pre-trained word vector model.
7. The system of claim 5, wherein the computing unit comprises:
the initialization module is used for randomly initializing a parameter matrix and a bias vector;
the optimization module is used for optimizing, for each one-dimensional essential feature, the one-dimensional essential feature based on network parameters to obtain a first importance set containing the importance of each one-dimensional essential feature in the short text to be classified, wherein the network parameters comprise a convolution kernel, the parameter matrix, and the bias vector;
the processing module is used for setting the useless one-dimensional essential features in the first importance set to zero based on the RELU activation function to obtain a second importance set;
and the compression module is used for compressing the second importance set based on a sigmoid function to obtain a weight value of each one-dimensional essential feature.
8. The system of claim 5, wherein the classification unit comprises:
the operation module is used for performing, for each multidimensional convolution feature, a maximum pooling operation on the multidimensional convolution feature to obtain the optimal convolution feature corresponding to each multidimensional convolution feature;
the weighting module is used for weighting each optimal convolution feature by the weight value corresponding to each one-dimensional essential feature to obtain a plurality of weighted convolution features;
the splicing module is used for splicing the plurality of weighted convolution features to obtain an adaptive convolution feature;
the determining module is used for inputting the self-adaptive convolution characteristic into a pre-constructed classifying sub-model, and determining the class of the short text to be classified based on the class corresponding to the largest weighted convolution characteristic in the self-adaptive convolution characteristic.
Application number: CN201910191018.7A; priority date: 2019-03-12; filing date: 2019-03-12; title: Short text classification method and system; status: Active; publication: CN109871448B (en)

Publications (2)

Publication Number Publication Date
CN109871448A (en) 2019-06-11
CN109871448B (en) 2023-08-15

Family

ID=66920376

Country Status (1)

Country Link
CN (1) CN109871448B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant