CN109376241B - DenseNet-based telephone appeal text classification algorithm for power field - Google Patents
- Publication number
- CN109376241B CN109376241B CN201811208673.0A CN201811208673A CN109376241B CN 109376241 B CN109376241 B CN 109376241B CN 201811208673 A CN201811208673 A CN 201811208673A CN 109376241 B CN109376241 B CN 109376241B
- Authority
- CN
- China
- Prior art keywords
- text
- telephone
- appeal
- classified
- classification algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a DenseNet-based telephone appeal text classification algorithm for the power field, and belongs to the technical field of text classification algorithms. A text classifier is obtained by preprocessing the text to be classified, performing data augmentation, establishing a vocabulary dictionary, matching word-vector ids, reducing word-vector dimensionality, splicing feature values, and randomly permuting and recombining the spliced feature values; the classifier is then used to classify the text. The algorithm effectively remedies the shortcomings of traditional algorithms and adapts well to the characteristics of power appeal texts, such as strong domain specificity, large differences in length, and mixed characters and numbers. It reduces model complexity while preserving classification accuracy, achieves fast and accurate classification of telephone appeal texts in the power field, and well satisfies the classification requirements.
Description
Technical Field
The invention relates to the technical field of text classification algorithms, in particular to a telephone appeal text classification algorithm facing the power field based on DenseNet.
Background
With the expansion and improvement of power grid construction, the number of power grid users keeps growing. To guarantee a stable power supply and improve user satisfaction, grid companies have built telephone feedback platforms through which users can consult on services, report power faults, evaluate the grid company, and raise opinions or complaints. To make better use of this platform for grid construction and service, the telephone appeal texts need to be classified. Existing methods generally classify texts with a convolutional neural network model, but such methods require a fairly comprehensive corpus and produce only a single kind of output feature, which is a serious drawback when classifying short texts such as power-field telephone appeals. To mitigate these drawbacks, the feature output can be enriched by adding max-pooling layers and using filters of different sizes, but this again requires a larger corpus, and the additional filter sizes increase the number of trainable parameters. Alternatively, the flow of text features can be changed: densely connected convolutional networks let shallow features flow into deep layers, increasing the diversity of feature learning and improving the classification effect. However, this deepens the network, requires training a huge number of parameters, is sensitive to the sparsity of text features, and classifies slowly, so it cannot well satisfy the requirement of classifying telephone appeal texts in the power field.
Disclosure of Invention
In order to overcome the defects and shortcomings in the prior art, the invention provides a DenseNet-based telephone appeal text classification algorithm facing the power field, which is low in model complexity and good in classification effect.
In order to achieve the technical purpose, the telephone appeal text classification algorithm facing the power field based on the DenseNet comprises the following steps,
s1, obtaining a telephone appeal text to be classified;
s2, preprocessing the telephone appeal text acquired in the step S1;
s3, performing data augmentation according to the telephone appeal text preprocessed in the step S2;
s4, establishing a vocabulary dictionary according to the data augmented in the step S3;
s5, performing word vector id matching according to the vocabulary dictionary established in the step S4;
s6, performing word vector dimension reduction on the word vectors matched in the step S5;
s7, adopting ResNet and DenseNet-BC to perform 1 x 1 convolution layer processing on the word vector subjected to dimension reduction in the step S6, and splicing eigenvalues with the same size obtained after convolution layer processing;
s8, randomly arranging the characteristic values spliced in the step S7 to obtain high-level characteristics;
and S9, classifying the telephone appeal texts by using the high-level features obtained in the step S8 to achieve the purpose of classification.
Preferably, the preprocessing performed on the telephone appeal text to be classified in the step S2 includes deduplication, denoising, stop-word removal and text word segmentation.
Preferably, in the step S2, the telephone appeal text to be classified is deduplicated using the Euclidean distance.
Preferably, in the step S2, the telephone appeal text to be classified is denoised based on the hash value of the DOM tree.
Preferably, in the step S2, stop-word removal from the telephone appeal text to be classified is realized by creating a stop-word lexicon dedicated to the power field.
Preferably, in the step S2, the jieba language model is used to segment the telephone appeal text to be classified so as to realize text word segmentation.
Preferably, in step S4, the vocabulary dictionary is built by using a double array trie tree method.
Preferably, in step S6, principal component analysis dimensionality reduction is performed on the one-hot form word vector.
Preferably, in step S7, the eigenvalues are spliced by the formula

C_k = [R_k, D_k],  x_{k+1} = H(C_k)

wherein R_k is the feature value obtained after processing with the 1 × 1 convolutional layer of ResNet, D_k is the feature value obtained after processing with the 1 × 1 convolutional layer of DenseNet-BC, C_k denotes the spliced feature value, x_{k+1} denotes the input of the (k + 1)-th layer, and H denotes the activation function.
After the technical scheme is adopted, the telephone appeal text classification algorithm facing the power field based on the DenseNet has the following advantages:
1. the DenseNet-based telephone appeal text classification algorithm facing the power field can effectively make up for the defects of the traditional algorithm, well adapt to the characteristics of strong speciality, large length difference, mixed characters and numbers and the like of the power appeal text, reduce the complexity of a model on the premise of ensuring the classification accuracy, realize the rapid and accurate classification of the telephone appeal text in the power field, and well meet the classification requirement.
The preprocessing mainly includes cleaning and normalization and aims to improve the quality of the text data, and thereby the execution efficiency of the classification. Data augmentation transforms the original data so that the amount of training data grows even when little raw data is available, alleviating the sparse-feature problem of power-field telephone appeal texts. Building the vocabulary dictionary from the augmented data effectively improves space utilization and efficiency and shortens training time. Word-vector id matching against the established dictionary assigns one word vector to each word, avoiding repeated training of word vectors and thus effectively reducing the parameters, complexity and training time of the network. Word-vector dimension reduction lowers the dimensionality of the word vectors, avoiding the excessive model parameters caused by over-high dimensionality and reducing both the parameter learning and the complexity of the model. Splicing the two groups of processed feature values of the same size realizes edge-feature expression and shallow-feature flow while reducing the flow of redundant features, cutting unnecessary feature learning and parameter iteration. Randomly recombining the spliced features prevents overfitting of the model, and using the resulting high-level features as input improves the classification accuracy of the model. Finally, the mixed high-level features serve as the input of the neural network to classify the telephone appeal texts, effectively improving classification speed and accuracy.
2. The preprocessing of the telephone appeal text to be classified comprises deduplication, denoising, stop-word removal and text word segmentation. Deduplication is realized with the Euclidean distance: the distance between texts is computed, and of any pair of near-duplicate texts only one is kept, improving deduplication accuracy. Denoising removes the parts of the text irrelevant to classification as noise, which helps improve classification accuracy. For stop-word removal, the words in the text are compared one by one with the words in the stop-word lexicon, and stop words are deleted from the text, improving data quality. The jieba language model is used to segment the text into words, so that reasonable data augmentation can later be performed on the segmented words.
3. Since a one-hot word vector has as many dimensions as there are words in the vocabulary, dimension reduction must be applied to word vectors of this form to avoid a dimension explosion. Principal component analysis realizes this by computing the eigenvalues of the covariance matrix of the word vectors, selecting the several largest eigenvalues as principal components, and multiplying the original word vectors by the eigenvector matrix corresponding to the selected eigenvalues to obtain the reduced word vectors.
Drawings
Fig. 1 is a schematic flowchart of a telephone appeal text classification algorithm for the power domain based on DenseNet according to an embodiment of the present invention;
FIG. 2 is a time-error rate line graph of EPCT text classification by several models in an embodiment of the present invention;
FIG. 3 is a time-error rate line graph of several models classifying THUCNews text in an embodiment of the present invention;
FIG. 4 is a graph of error rate versus training data size for several models training EPCT text in accordance with an embodiment of the present invention;
FIG. 5 is a graph of error rate versus training data size for training the THUCNews text by several models in an embodiment of the present invention;
FIG. 6 is a histogram of the computation time for classifying EPCT texts by several models according to the embodiment of the present invention;
fig. 7 is a histogram of operation time for several models to classify the THUCNews text according to the embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific examples. It is to be understood that the following terms "upper," "lower," "left," "right," "longitudinal," "lateral," "inner," "outer," "vertical," "horizontal," "top," "bottom," and the like are used merely to indicate an orientation or positional relationship relative to one another as illustrated in the drawings, merely to facilitate describing and simplifying the invention, and are not intended to indicate or imply that the device/component so referred to must have a particular orientation or be constructed and operated in a particular orientation, and therefore are not to be considered limiting of the invention.
Example one
As shown in fig. 1, a DenseNet-based telephone appeal text classification algorithm for the power field according to an embodiment of the present invention includes the following steps,
s1, obtaining a telephone appeal text to be classified;
s2, preprocessing the telephone appeal text acquired in the step S1;
s3, performing data augmentation according to the telephone appeal text preprocessed in the step S2;
s4, establishing a vocabulary dictionary according to the data augmented in the step S3;
s5, performing word vector id matching according to the vocabulary dictionary established in the step S4;
s6, performing word vector dimension reduction on the word vectors matched in the step S5;
s7, adopting ResNet and DenseNet-BC to perform 1 x 1 convolution layer processing on the word vector subjected to dimension reduction in the step S6, and splicing eigenvalues with the same size obtained after convolution layer processing;
s8, randomly arranging the characteristic values spliced in the step S7 to obtain high-level characteristics;
and S9, classifying the telephone appeal texts by using the high-level features obtained in the step S8 to achieve the purpose of classification.
In the step S1, the telephone appeal text to be classified may be obtained from the telephone feedback platform or by similar means.
In the step S2, the preprocessing of the telephone appeal text to be classified includes the following steps,
Step S201, deduplication: the Euclidean distance between texts is computed, and of any pair of near-duplicate texts only one is kept, improving deduplication accuracy;
Step S202, denoising: hash values based on the DOM tree are used to denoise the telephone appeal text to be classified, removing the parts of the text irrelevant to classification as noise;
Step S203, stop-word removal: a stop-word lexicon dedicated to the power field is created, the words in the text are compared one by one with the words in the lexicon, and stop words are deleted from the text, improving data quality;
Step S204, text word segmentation: the jieba language model is used to segment the telephone appeal text to be classified, so that reasonable data augmentation can be performed in the subsequent step S3 on the words obtained by segmentation.
In step S3, specialized vocabulary of the power field is added to the data so as to increase the model's ability to generalize over the data.
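The patent does not specify the augmentation operation beyond adding specialized power-field vocabulary; one common realization, sketched here under that assumption, is synonym substitution driven by a hypothetical domain term map:

```python
import random

def augment(tokens, domain_synonyms, n_copies=2, seed=0):
    # Step S3 sketch: generate extra training samples by replacing words
    # with power-domain synonyms wherever the (hypothetical) map has an
    # entry; words without an entry are kept unchanged.
    rng = random.Random(seed)
    copies = [list(tokens)]
    for _ in range(n_copies):
        copies.append([rng.choice(domain_synonyms.get(t, [t])) for t in tokens])
    return copies
```

Each original text thus yields `n_copies + 1` training samples, which is one way to achieve the goal stated above of enlarging the training data from little raw data.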
In the step S4, the vocabulary dictionary is built from the augmented data using the double-array trie method, which effectively improves space utilization and efficiency and shortens training time.
In the step S5, word-vector id matching is performed against the established vocabulary dictionary, i.e. one word vector is matched to each word; this avoids repeated training of word vectors and thus effectively reduces the parameters, complexity and training time of the network.
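Steps S4 and S5 can be illustrated with a plain dict standing in for the double-array trie (the trie yields the same word → id mapping with better space efficiency; the `<PAD>`/`<UNK>` ids and the padding behavior below are assumptions, not taken from the patent):

```python
def build_vocab(token_lists, pad="<PAD>", unk="<UNK>"):
    # Step S4 sketch: assign a unique integer id to every word seen in
    # the corpus. The patent stores the dictionary in a double-array
    # trie; a dict shows the same id-assignment logic.
    vocab = {pad: 0, unk: 1}
    for toks in token_lists:
        for t in toks:
            vocab.setdefault(t, len(vocab))
    return vocab

def to_ids(tokens, vocab, max_len=600):
    # Step S5 sketch: map each word to its id, truncating/padding to
    # max_len (600 matches the sentence-length limit in Table 2).
    ids = [vocab.get(t, vocab["<UNK>"]) for t in tokens[:max_len]]
    return ids + [vocab["<PAD>"]] * (max_len - len(ids))
```

Because every word is looked up rather than re-embedded from scratch, each word's vector is trained only once, which is the parameter saving the step describes.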
In step S6, since a one-hot word vector has as many dimensions as there are words in the vocabulary, dimension reduction must be applied to word vectors of this form to avoid a dimension explosion. Principal component analysis realizes this by computing the eigenvalues of the covariance matrix of the word vectors, selecting the several largest eigenvalues as principal components, and multiplying the original word vectors by the eigenvector matrix corresponding to the selected eigenvalues to obtain the reduced word vectors. Reducing the dimensionality of the word vectors avoids the excessive model parameters caused by over-high dimensionality, reduces the parameter learning of the model, and lowers model complexity.
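The principal-component reduction just described can be sketched directly in NumPy — centre the data, eigendecompose the covariance matrix, and project onto the top-k eigenvectors:

```python
import numpy as np

def pca_reduce(X, k):
    # Step S6 sketch: X holds one word vector per row.
    Xc = X - X.mean(axis=0)                          # centre the data
    cov = np.cov(Xc, rowvar=False)                   # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # k largest components
    return Xc @ top                                  # reduced word vectors
```

The choice of k (the number of retained principal components) is not given in the patent and would be tuned against the embedding size in Table 2.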
In the above step S7, the eigenvalues are spliced by the formula

C_k = [R_k, D_k],  x_{k+1} = H(C_k)

wherein R_k is the feature value obtained after processing with the 1 × 1 convolutional layer of ResNet, D_k is the feature value obtained after processing with the 1 × 1 convolutional layer of DenseNet-BC, C_k denotes the spliced feature value, x_{k+1} denotes the input of the (k + 1)-th layer, and H denotes the activation function.
Splicing the two groups of processed feature values of the same size realizes edge-feature expression and shallow-feature flow while reducing the flow of redundant features, cutting unnecessary feature learning and parameter iteration.
In step S8, the features after the stitching are randomly combined to prevent overfitting of the model, and the classification accuracy of the model can be improved by using the obtained high-level features as input.
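Steps S7 and S8 can be sketched together in NumPy. The 1 × 1 convolutions themselves are omitted: the sketch starts from two same-size feature maps R_k and D_k, and ReLU is assumed for the activation H (the patent only says "activation function"):

```python
import numpy as np

def splice_and_shuffle(R_k, D_k, seed=0):
    # Step S7: C_k = [R_k, D_k] -- concatenate the two same-size feature
    # maps along the channel axis, then apply H (assumed ReLU here),
    # giving x_{k+1} = H(C_k).
    C_k = np.concatenate([R_k, D_k], axis=-1)
    x_next = np.maximum(C_k, 0.0)
    # Step S8: randomly permute the spliced feature channels to obtain
    # the high-level features used as classifier input.
    rng = np.random.default_rng(seed)
    return x_next[..., rng.permutation(x_next.shape[-1])]
```

The fixed `seed` is only for reproducibility of the sketch; in training the permutation would be drawn fresh as a regularizer.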
In the step S9, a text classifier is formed from the high-level features obtained in the step S8; the mixed high-level features are used as the input of the neural network to classify the telephone appeal texts, effectively improving classification speed and accuracy.
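A softmax output layer is one straightforward realization of the final classifier in step S9 (the weight matrix W and bias b below are hypothetical trained parameters, not values from the patent):

```python
import numpy as np

def classify(features, W, b):
    # Step S9 sketch: linear layer + softmax over the class scores,
    # returning the most probable class index for each text.
    logits = features @ W + b
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    probs = e / e.sum(axis=-1, keepdims=True)
    return probs.argmax(axis=-1)
```

For the EPCT data set this layer would have 7 output columns, matching the 7 classes in Table 1.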
To examine the effect of the classification algorithm of the present embodiment, the present embodiment also designed the following experiment.
The hardware configuration of the experimental environment is 4 GB RAM and an Nvidia GeForce GTX 970M with 3 GB of video memory; the integrated environment is Anaconda3 (64-bit) + Python 3.6 + Spyder, and the experimental framework is TensorFlow 1.1.0.
Experimental data: to evaluate the model more thoroughly, data sets differing in field, scale and number of classes were selected for the experiment; their characteristics are shown in Table 1. THUCNews is a standard news text classification data set, while EPCT (electric power complaint text) consists of annual acceptance text data from the 95598 service hotline.
TABLE 1 data set characteristic information Table
| Name | Number of classes | Number of texts | Average text length | Training/validation/test | Field |
|---|---|---|---|---|---|
| THUCNews | 20 | 20000 | 236 | 12000/4000/4000 | News |
| EPCT | 7 | 5000 | 93 | 12000/4000/4000 | Power-field appeals |
Model parameter configuration: because the classification algorithm of the invention performs splicing on the premise that the feature values are of the same size, a 1 × 1 convolutional layer is added after the 3 × 3 convolutional layer and the 2 × 2 average-pooling layer to change the size of the feature map; the relevant model parameter values are set out in Table 2.
TABLE 2 parameter value settings of the model
| Parameter name | Parameter value |
|---|---|
| Embedding layer size | 64 |
| Sentence length upper limit | 600 |
| Number of words | 500 |
| Hidden layer size | 128 |
| Batch size | 64 |
| Number of … | 10 |
Evaluation indexes: the error rate, the F1 score and the model running time are adopted as evaluation indexes so as to evaluate the model from multiple angles.
Model comparison: the performance of the one-hot and word2vec word-vector models and of different combined models is first evaluated in terms of error rate; the comparison is given in Table 3. As Table 3 shows, the classification algorithm of this embodiment obtains better results than the other algorithms on both data sets; on the EPCT data in particular the error rate is as low as 7.63%.
TABLE 3 error Rate comparison of several model Process datasets
| Model combination | THUCNews (%) | EPCT (%) |
|---|---|---|
| one-hot + CNN | 11.47 | 9.5 |
| word2vec + CNN | 8.46 | 8.21 |
| one-hot + DenseNet | 8.34 | 7.92 |
| word2vec + DenseNet | 8.21 | 7.75 |
| Classification algorithm of this embodiment | 8.06 | 7.63 |
Next, for the splicing operation that this embodiment improves, the best-performing F1 scores before and after splicing are selected as the evaluation result, as shown in Table 4. As Table 4 shows, the model using the splicing operation achieves better results in multiple categories.
TABLE 4 comparison of F1 scores before and after splicing
In addition, as can be seen from fig. 2 and fig. 3, the improved classification algorithm of this embodiment also performs well in terms of model efficiency: its training error rate is as low as 7.5% when classifying the EPCT text and as low as 8.6% when classifying the THUCNews text.
The trend graph of the error rate and the scale of the training data for the training sets of different scales is shown in fig. 4 and 5. As can be seen from fig. 4 and 5, the model provided by the present invention has significant advantages in both data sets, and particularly, when processing an EPCT data set, a good effect can be obtained even when the training data size is not large.
Finally, the efficiency of the model is evaluated using its running time as the index. As shown in fig. 6 and fig. 7, compared with the one-hot + DenseNet model, the model of this embodiment shortens the running time by about 40% when processing the EPCT text and by about 35% when processing the THUCNews text. The classification algorithm of this embodiment can therefore classify telephone appeal texts in the power field quickly, accurately and efficiently, and better satisfies the classification requirements.
It can be understood that, in the present embodiment, reference may be made to the prior art for a specific method for performing deduplication processing on a to-be-classified telephone appeal text by using the euclidean distance.
It can be understood that, in the present embodiment, reference may be made to the prior art for a specific method for denoising a to-be-classified telephone appeal text by using a hash value based on a DOM tree.
It can be understood that, in the present embodiment, the prior art may be referred to for a specific method for segmenting the words of the telephone appeal text to be classified by using the jieba language model.
It is understood that, in the present embodiment, reference may be made to the prior art for a specific method for building a vocabulary dictionary by using the double-array trie method.
It is understood that, in the present embodiment, the specific method for performing principal component analysis dimension reduction on the word vector in the one-hot form may refer to the prior art.
Besides the preferred embodiments described above, the invention has other embodiments; various changes and modifications made by those skilled in the art without departing from the spirit of the invention shall fall within the scope of the invention as defined by the claims.
Claims (8)
1. A DenseNet-based telephone appeal text classification algorithm facing the power field is characterized by comprising the following steps,
s1, obtaining a telephone appeal text to be classified;
s2, preprocessing the telephone appeal text acquired in the step S1;
s3, performing data augmentation according to the telephone appeal text preprocessed in the step S2;
s4, establishing a vocabulary dictionary according to the data augmented in the step S3;
s5, performing word vector id matching according to the vocabulary dictionary established in the step S4;
s6, performing word vector dimension reduction on the word vectors matched in the step S5;
s7, adopting ResNet and DenseNet-BC to perform 1 × 1 convolution layer processing on the word vector after dimension reduction in the step S6, and splicing the eigenvalues of the same size obtained after convolution layer processing by Formula I:

C_k = [R_k, D_k],  x_{k+1} = H(C_k)   (Formula I)

wherein R_k is the feature value obtained after processing with the 1 × 1 convolutional layer of ResNet, D_k is the feature value obtained after processing with the 1 × 1 convolutional layer of DenseNet-BC, C_k denotes the spliced feature value, x_{k+1} denotes the input of the (k + 1)-th layer, and H denotes the activation function;
s8, randomly arranging the characteristic values spliced in the step S7 to obtain high-level characteristics;
and S9, classifying the telephone appeal texts by using the high-level features obtained in the step S8 to achieve the purpose of classification.
2. The telephone appeal text classification algorithm according to claim 1, wherein the preprocessing of the telephone appeal text to be classified in the step S2 includes deduplication, denoising, stop-word removal and text word segmentation.
3. The telephone appeal text classification algorithm according to claim 2, wherein in the step S2, the Euclidean distance is used to deduplicate the telephone appeal text to be classified.
4. The telephone appeal text classification algorithm according to claim 2, wherein in the step S2, the telephone appeal text to be classified is denoised using a hash value based on the DOM tree.
5. The telephone appeal text classification algorithm according to claim 2, wherein in the step S2, stop-word removal from the telephone appeal text to be classified is realized by creating a stop-word lexicon dedicated to the power field.
6. The telephone appeal text classification algorithm according to claim 2, wherein in the step S2, the jieba language model is adopted to segment the telephone appeal text to be classified so as to realize text word segmentation.
7. The telephony appeal text classification algorithm of claim 1, wherein the vocabulary dictionary is built in the step S4 by using a double array trie method.
8. The telephony appeal text classification algorithm according to claim 1, wherein in the step S6, principal component analysis dimensionality reduction is performed on the one-hot form word vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811208673.0A CN109376241B (en) | 2018-10-17 | 2018-10-17 | DenseNet-based telephone appeal text classification algorithm for power field |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811208673.0A CN109376241B (en) | 2018-10-17 | 2018-10-17 | DenseNet-based telephone appeal text classification algorithm for power field |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109376241A CN109376241A (en) | 2019-02-22 |
CN109376241B true CN109376241B (en) | 2020-09-18 |
Family
ID=65400603
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811208673.0A Active CN109376241B (en) | 2018-10-17 | 2018-10-17 | DenseNet-based telephone appeal text classification algorithm for power field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109376241B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111050315B (en) * | 2019-11-27 | 2021-04-13 | 北京邮电大学 | Wireless transmitter identification method based on multi-core bidirectional network
CN113553844B (en) * | 2021-08-11 | 2023-07-25 | 四川长虹电器股份有限公司 | Domain identification method based on prefix tree features and convolutional neural network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105975573A (en) * | 2016-05-04 | 2016-09-28 | 北京广利核系统工程有限公司 | KNN-based text classification method |
CN108009284A (en) * | 2017-12-22 | 2018-05-08 | 重庆邮电大学 | Legal text classification method using semi-supervised convolutional neural networks
CN108563791A (en) * | 2018-04-29 | 2018-09-21 | 华中科技大学 | Construction quality complaint text classification method and system
CN108596329A (en) * | 2018-05-11 | 2018-09-28 | 北方民族大学 | Three-dimensional model classification method based on end-to-end deep ensemble learning network
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1049030A1 (en) * | 1999-04-28 | 2000-11-02 | SER Systeme AG Produkte und Anwendungen der Datenverarbeitung | Classification method and apparatus |
- 2018-10-17: CN application CN201811208673.0A filed (granted as CN109376241B, legal status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN109376241A (en) | 2019-02-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104167208B (en) | Speaker recognition method and device | |
CN105022754B (en) | Object classification method and device based on social network | |
CN110175221B (en) | Spam SMS identification method combining word vectors with machine learning | |
AU2017243270A1 (en) | Method and device for extracting core words from commodity short text | |
CN107895000B (en) | Cross-domain semantic information retrieval method based on convolutional neural network | |
CN111460148A (en) | Text classification method and device, terminal equipment and storage medium | |
CN104102919A (en) | Image classification method capable of effectively preventing convolutional neural network from being overfit | |
CN110826618A (en) | Personal credit risk assessment method based on random forest | |
CN104538035B (en) | Fisher supervector-based speaker recognition method and system | |
CN107818173B (en) | Vector space model-based Chinese false comment filtering method | |
CN108846047A (en) | Convolutional-feature-based image retrieval method and system | |
CN113239690A (en) | Chinese text intention identification method based on integration of Bert and fully-connected neural network | |
CN109376241B (en) | DenseNet-based telephone appeal text classification algorithm for power field | |
CN112347246B (en) | Self-adaptive document clustering method and system based on spectrum decomposition | |
CN107526792A (en) | Rapid keyword extraction method for Chinese question sentences | |
CN112989052B (en) | Chinese news long text classification method based on combination-convolution neural network | |
CN111782804A (en) | TextCNN-based same-distribution text data selection method, system and storage medium | |
CN115456043A (en) | Classification model processing method, intent recognition method, device and computer equipment | |
CN114420151B (en) | Speech emotion recognition method based on parallel tensor decomposition convolutional neural network | |
CN111245820A (en) | Phishing website detection method based on deep learning | |
CN114266249A (en) | Mass text clustering method based on birch clustering | |
CN109858035A (en) | Sentiment classification method, device, electronic equipment and readable storage medium | |
CN113743079A (en) | Text similarity calculation method and device based on co-occurrence entity interaction graph | |
WO2023147299A1 (en) | Systems and methods for short text similarity based clustering | |
CN111125304A (en) | Word2vec-based automatic patent text classification method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||