CN109376241B - DenseNet-based telephone appeal text classification algorithm for power field - Google Patents

Publication number
CN109376241B
Authority
CN
China
Prior art keywords
text
telephone
appeal
classified
classification algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811208673.0A
Other languages
Chinese (zh)
Other versions
CN109376241A (en)
Inventor
王亿
陆岷
章晨璐
汪宇杰
李豪帅
吴亦灵
孔锋峰
邱海锋
陈杰
翁利国
陈辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Hangzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Zhejiang Zhongxin Electric Power Engineering Construction Co Ltd
Original Assignee
State Grid Corp of China SGCC
Hangzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Zhejiang Zhongxin Electric Power Engineering Construction Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Hangzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd, and Zhejiang Zhongxin Electric Power Engineering Construction Co Ltd
Priority to CN201811208673.0A
Publication of CN109376241A
Application granted
Publication of CN109376241B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a DenseNet-based telephone appeal text classification algorithm for the power field, belonging to the technical field of text classification algorithms. A text classifier is obtained by applying preprocessing, data augmentation, vocabulary dictionary construction, word vector id matching, word vector dimension reduction, feature value splicing, and random permutation and combination of the spliced feature values to the text to be classified, and the text is then classified with this classifier. The algorithm makes up for the shortcomings of traditional algorithms and adapts well to the characteristics of power appeal texts, such as highly specialized vocabulary, large differences in length, and mixed characters and digits. It reduces model complexity while maintaining classification accuracy, achieves fast and accurate classification of telephone appeal texts in the power field, and meets the classification requirements well.

Description

DenseNet-based telephone appeal text classification algorithm for power field
Technical Field
The invention relates to the technical field of text classification algorithms, and in particular to a DenseNet-based telephone appeal text classification algorithm for the power field.
Background
As power grid construction spreads and improves, the number of grid users keeps growing. To guarantee a stable power supply and improve user satisfaction, the power grid company has built a telephone feedback platform through which users can ask about services, report power faults, evaluate the company, and submit opinions or complaints. To support the company's construction and service work through this platform, the telephone appeal texts need to be classified. Existing methods generally classify text with a convolutional neural network model, but this approach requires a fairly comprehensive corpus and produces a single kind of output feature, so it performs poorly on short texts such as telephone appeal texts in the power field. One way to address these shortcomings is to increase the feature output by adding max pooling layers and using filters of different sizes; however, this also requires a larger corpus, and the filters of different sizes increase the number of trainable parameters of the model. Another way is to change how text features flow: a densely connected convolutional network lets shallow features flow into deep layers, which increases the diversity of feature learning and improves the classification effect. However, this deepens the network, requires a very large number of trainable parameters, is sensitive to the sparsity of text features, and classifies slowly, so it cannot adequately meet the requirements of classifying telephone appeal texts in the power field.
Disclosure of Invention
To overcome the defects and shortcomings of the prior art, the invention provides a DenseNet-based telephone appeal text classification algorithm for the power field that has low model complexity and a good classification effect.
To achieve this technical purpose, the DenseNet-based telephone appeal text classification algorithm for the power field comprises the following steps:
S1, obtaining a telephone appeal text to be classified;
S2, preprocessing the telephone appeal text obtained in step S1;
S3, performing data augmentation on the telephone appeal text preprocessed in step S2;
S4, building a vocabulary dictionary from the data augmented in step S3;
S5, performing word vector id matching according to the vocabulary dictionary built in step S4;
S6, performing word vector dimension reduction on the word vectors matched in step S5;
S7, applying 1 × 1 convolution layer processing with ResNet and DenseNet-BC to the word vectors reduced in dimension in step S6, and splicing the feature values of the same size obtained after the convolution layer processing;
S8, randomly permuting the feature values spliced in step S7 to obtain high-level features;
S9, classifying the telephone appeal text using the high-level features obtained in step S8.
Preferably, the preprocessing of the telephone appeal text to be classified in step S2 includes deduplication, denoising, stop-word removal, and text word segmentation.
Preferably, in step S2, the telephone appeal text to be classified is deduplicated using the Euclidean distance.
Preferably, in step S2, the telephone appeal text to be classified is denoised using a hash value based on the DOM tree.
Preferably, in step S2, stop-word removal is performed on the telephone appeal text to be classified by creating a stop-word list dedicated to the power field.
Preferably, in step S2, the jieba word segmenter is used to segment the telephone appeal text to be classified into words.
Preferably, in step S4, the vocabulary dictionary is built using the double-array trie method.
Preferably, in step S6, principal component analysis (PCA) dimension reduction is performed on the one-hot word vectors.
Preferably, in step S7, the feature values are spliced by the formula
Figure BDA0001831873250000031
where R_k denotes the feature value obtained after 1 × 1 convolution layer processing with ResNet, D_k denotes the feature value obtained after 1 × 1 convolution layer processing with DenseNet-BC, C_k denotes the feature value after splicing, x_{k+1} denotes the input to the (k+1)-th layer, and H denotes the activation function.
With the above technical scheme, the DenseNet-based telephone appeal text classification algorithm for the power field has the following advantages:
1. The algorithm makes up for the shortcomings of traditional algorithms and adapts well to the characteristics of power appeal texts, such as highly specialized vocabulary, large differences in length, and mixed characters and digits. It reduces model complexity while maintaining classification accuracy, achieves fast and accurate classification of telephone appeal texts in the power field, and meets the classification requirements well.
Preprocessing mainly consists of cleaning and normalization and aims to improve the quality of the text data so that classification runs more efficiently. Data augmentation transforms the original data to enlarge the training set when little data is available, which alleviates the sparse-feature problem of telephone appeal texts in the power field. Building the vocabulary dictionary from the augmented data improves space utilization and efficiency and shortens training time. Word vector id matching against the vocabulary dictionary matches one word vector to each word and avoids training word vectors repeatedly, which reduces the parameters, complexity, and training time of the network. Reducing the dimension of the word vectors avoids the excessive number of model parameters that high-dimensional word vectors would cause, so the model has fewer parameters to learn and is less complex. Splicing the two groups of processed feature values of the same size enables edge feature expression and shallow feature flow while reducing the flow of redundant features, which cuts unnecessary feature learning and parameter iteration. Randomly combining the spliced features helps prevent the model from overfitting, and using the resulting high-level features as input improves the classification accuracy of the model. Using the mixed high-level features as the input of the neural network classifies the telephone appeal texts and improves both the speed and the accuracy of classification.
2. The preprocessing of the telephone appeal text to be classified includes deduplication, denoising, stop-word removal, and text word segmentation. Deduplication uses the Euclidean distance: the Euclidean distance between texts is computed and, among texts that are close to one another, only one is kept, which improves deduplication accuracy. Denoising removes the parts of a text that are irrelevant to classification as noise, which helps improve classification accuracy. For stop-word removal, each word in the text is compared with the words in the stop-word list and deleted from the text if it is a stop word, which improves data quality. Word segmentation uses the jieba word segmenter, so that the subsequent steps can perform reasonable data augmentation based on the resulting words.
3. A one-hot word vector has as many dimensions as there are words, so word vectors in this form must be reduced in dimension to avoid a dimension explosion. PCA dimension reduction computes the eigenvalues of the covariance matrix of the word vectors, selects the several largest eigenvalues as principal components, and multiplies the original word vectors by the matrix of eigenvectors corresponding to the selected eigenvalues to obtain the reduced word vectors.
Drawings
FIG. 1 is a schematic flowchart of the DenseNet-based telephone appeal text classification algorithm for the power field according to an embodiment of the present invention;
FIG. 2 is a line graph of error rate versus time for several models classifying the EPCT text in an embodiment of the present invention;
FIG. 3 is a line graph of error rate versus time for several models classifying the THUCNews text in an embodiment of the present invention;
FIG. 4 is a graph of error rate versus training data size for several models trained on the EPCT text in an embodiment of the present invention;
FIG. 5 is a graph of error rate versus training data size for several models trained on the THUCNews text in an embodiment of the present invention;
FIG. 6 is a histogram of the computation time of several models classifying the EPCT text in an embodiment of the present invention;
FIG. 7 is a histogram of the computation time of several models classifying the THUCNews text in an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific examples. It is to be understood that the following terms "upper," "lower," "left," "right," "longitudinal," "lateral," "inner," "outer," "vertical," "horizontal," "top," "bottom," and the like are used merely to indicate an orientation or positional relationship relative to one another as illustrated in the drawings, merely to facilitate describing and simplifying the invention, and are not intended to indicate or imply that the device/component so referred to must have a particular orientation or be constructed and operated in a particular orientation, and therefore are not to be considered limiting of the invention.
Example one
As shown in FIG. 1, the DenseNet-based telephone appeal text classification algorithm for the power field according to an embodiment of the present invention includes the following steps:
S1, obtaining a telephone appeal text to be classified;
S2, preprocessing the telephone appeal text obtained in step S1;
S3, performing data augmentation on the telephone appeal text preprocessed in step S2;
S4, building a vocabulary dictionary from the data augmented in step S3;
S5, performing word vector id matching according to the vocabulary dictionary built in step S4;
S6, performing word vector dimension reduction on the word vectors matched in step S5;
S7, applying 1 × 1 convolution layer processing with ResNet and DenseNet-BC to the word vectors reduced in dimension in step S6, and splicing the feature values of the same size obtained after the convolution layer processing;
S8, randomly permuting the feature values spliced in step S7 to obtain high-level features;
S9, classifying the telephone appeal text using the high-level features obtained in step S8.
In step S1, the telephone appeal text to be classified may be obtained by calling the platform or by similar means.
In step S2, the preprocessing of the telephone appeal text to be classified includes the following steps:
Step S201, deduplication: the telephone appeal texts to be classified are deduplicated using the Euclidean distance; the Euclidean distance between texts is computed and, among texts that are close to one another, only one is kept, which improves deduplication accuracy;
Step S202, denoising: the telephone appeal text to be classified is denoised using a hash value based on the DOM tree, and the parts of the text that are irrelevant to classification are removed as noise;
Step S203, stop-word removal: a stop-word list dedicated to the power field is created; each word in the text is compared with the words in the stop-word list and deleted from the text if it is a stop word, which removes stop words and improves data quality;
Step S204, text word segmentation: the jieba word segmenter is used to segment the telephone appeal text to be classified into words, so that reasonable data augmentation can be performed in the subsequent step S3 based on the resulting words.
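The stop-word removal and deduplication sub-steps can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the texts have already been segmented into word lists, represents each text as a bag-of-words count vector, and uses a hypothetical stop-word list and a hypothetical distance threshold, none of which are specified in the patent.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the patent builds a power-field list but does not publish it.
STOP_WORDS = {"the", "please", "hello"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop-word list (cf. step S203)."""
    return [t for t in tokens if t not in stop_words]


def euclidean_distance(a, b):
    """Euclidean distance between the bag-of-words count vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    return math.sqrt(sum((ca[w] - cb[w]) ** 2 for w in set(ca) | set(cb)))


def deduplicate(docs, threshold=1.0):
    """Keep a text only if it is farther than `threshold` from every kept text (cf. step S201).

    The threshold value is an assumption made for this sketch.
    """
    kept = []
    for doc in docs:
        if all(euclidean_distance(doc, k) > threshold for k in kept):
            kept.append(doc)
    return kept


docs = [
    ["power", "outage", "in", "district"],
    ["power", "outage", "in", "district"],   # exact duplicate of the first text
    ["meter", "reading", "error"],
]
unique_docs = deduplicate(docs)              # the duplicate is dropped
cleaned = remove_stop_words(["please", "check", "the", "meter"])
```

With these inputs the second text is at distance 0 from the first and is discarded, while the third text is kept.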
In step S3, specialized vocabulary from the power field is added to the data to improve the model's ability to generalize over the data.
In step S4, the vocabulary dictionary is built from the augmented data using the double-array trie method, which improves space utilization and efficiency and shortens training time.
In step S5, word vector id matching is performed against the vocabulary dictionary, that is, one word vector is matched to each word. This avoids training word vectors repeatedly and reduces the parameters, complexity, and training time of the network.
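Steps S4 and S5 can be sketched as follows. This sketch substitutes a plain Python dict for the double-array trie named in the patent, so it shows only the word-to-id mapping, not the trie's storage layout. The reserved ids and the function names are hypothetical; the defaults of 500 words and a sentence length limit of 600 are taken from Table 2.

```python
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # hypothetical reserved ids for padding and unknown words


def build_vocab(token_lists, max_words=500):
    """Map the most frequent words to integer ids (cf. step S4)."""
    counts = Counter(t for tokens in token_lists for t in tokens)
    vocab = {"<PAD>": PAD_ID, "<UNK>": UNK_ID}
    for word, _ in counts.most_common(max_words - len(vocab)):
        vocab[word] = len(vocab)
    return vocab


def to_ids(tokens, vocab, max_len=600):
    """Look up each word's id, then truncate or pad to `max_len` (cf. step S5)."""
    ids = [vocab.get(t, UNK_ID) for t in tokens[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


vocab = build_vocab([["power", "outage"], ["power", "meter"]])
ids = to_ids(["power", "fuse"], vocab, max_len=4)  # "fuse" is not in the vocabulary
```

Each id would then index one row of a word vector table, so every word is matched to exactly one vector.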
In step S6, a one-hot word vector has as many dimensions as there are words, so word vectors in this form must be reduced in dimension to avoid a dimension explosion. PCA dimension reduction computes the eigenvalues of the covariance matrix of the word vectors, selects the several largest eigenvalues as principal components, and multiplies the original word vectors by the matrix of eigenvectors corresponding to the selected eigenvalues to obtain the reduced word vectors. Reducing the dimension of the word vectors avoids an excessive number of model parameters caused by high-dimensional word vectors, so the model has fewer parameters to learn and is less complex.
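The PCA procedure described for step S6 can be sketched with NumPy. This is an illustrative sketch, not the patent's code: the function name, the toy vocabulary size of 6, and the target dimension of 3 are assumptions, and the vectors are mean-centered before projection, which is standard PCA practice that the patent text does not spell out.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the eigenvectors with the largest eigenvalues."""
    X_centered = X - X.mean(axis=0)                  # centering is an assumption of this sketch
    cov = np.cov(X_centered, rowvar=False)           # covariance matrix of the word vectors
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigh returns eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]   # indices of the largest eigenvalues
    return X_centered @ eigvecs[:, top]


vocab_size = 6                                       # toy vocabulary size
one_hot = np.eye(vocab_size)                         # one one-hot word vector per row
reduced = pca_reduce(one_hot, n_components=3)        # shape (6, 3)
```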
In step S7, the feature values are spliced by the formula
Figure BDA0001831873250000071
where R_k denotes the feature value obtained after 1 × 1 convolution layer processing with ResNet, D_k denotes the feature value obtained after 1 × 1 convolution layer processing with DenseNet-BC, C_k denotes the feature value after splicing, x_{k+1} denotes the input to the (k+1)-th layer, and H denotes the activation function.
Splicing the two groups of processed feature values of the same size enables edge feature expression and shallow feature flow while reducing the flow of redundant features, which cuts unnecessary feature learning and parameter iteration.
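The splicing formula survives only as an image, so the sketch below rests on an assumption about its form: that C_k is the channel-wise concatenation of R_k and D_k and that x_{k+1} = H(C_k). The residual form of the ResNet branch, the ReLU activation, the weight shapes, and all function names are likewise assumptions made for illustration. The channel size of 64 matches the embedding size in Table 2.

```python
import numpy as np


def conv1x1(x, weight):
    """A 1 x 1 convolution over a (length, channels) sequence is a per-position linear map."""
    return x @ weight


def relu(x):
    """Stand-in for the activation function H (assumed to be ReLU here)."""
    return np.maximum(x, 0.0)


def splice_block(x, w_res, w_dense):
    """Assumed splicing step: C_k = [R_k, D_k], then x_{k+1} = H(C_k)."""
    r_k = x + conv1x1(x, w_res)     # ResNet-style branch: identity shortcut plus 1 x 1 conv
    d_k = conv1x1(x, w_dense)       # DenseNet-BC-style branch: 1 x 1 bottleneck conv
    assert r_k.shape == d_k.shape, "branches must have the same size before splicing"
    c_k = np.concatenate([r_k, d_k], axis=-1)
    return relu(c_k)


rng = np.random.default_rng(0)
x = rng.normal(size=(10, 64))                # 10 positions, 64 channels
w_res = rng.normal(size=(64, 64)) * 0.1
w_dense = rng.normal(size=(64, 64)) * 0.1
x_next = splice_block(x, w_res, w_dense)     # shape (10, 128)
```

The shape check mirrors the requirement in the text that the two groups of feature values have the same size before they are spliced.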
In step S8, the spliced features are randomly combined to help prevent the model from overfitting, and using the resulting high-level features as input can improve the classification accuracy of the model.
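One possible reading of the random arrangement in step S8 is a permutation of the channel axis of the spliced features. The sketch below follows that reading, which the patent does not confirm. The seeded generator, the function name, and the choice to return the permutation for reuse are assumptions.

```python
import numpy as np


def shuffle_features(features, seed=0):
    """Apply one random permutation to the last (channel) axis of the spliced features."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(features.shape[-1])
    return features[..., perm], perm


spliced = np.arange(12, dtype=float).reshape(3, 4)   # toy spliced features: 3 positions, 4 channels
shuffled, perm = shuffle_features(spliced)
```

Returning `perm` lets the same ordering be applied again later, for example at inference time, if a fixed ordering is wanted.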
In step S9, the high-level features obtained in step S8 form the text classifier: the mixed high-level features are used as the input of the neural network to classify the telephone appeal text, which improves both the speed and the accuracy of classification.
To examine the effect of the classification algorithm of the present embodiment, the present embodiment also designed the following experiment.
The hardware of the experimental environment is 4 GB RAM and an NVIDIA GeForce GTX 970M with 3 GB of video memory; the software environment is Anaconda3 (64-bit) + Python 3.6 + Spyder; and the experimental framework is TensorFlow 1.1.0.
Experimental data: to evaluate the model better, datasets that differ in field, data scale, and number of classes were selected for the experiment; their characteristics are shown in Table 1. THUCNews is a standard news text classification dataset, and EPCT (electric power appeal text) comprises the 95598 annual acceptance text data.
Table 1. Dataset characteristics
Name       Number of classes   Number of texts   Average text length   Training/validation/test   Field
THUCNews   20                  20000             236                   12000/4000/4000            News
EPCT       7                   5000              93                    12000/4000/4000            Appeals in the power field
Model parameter configuration: because the classification algorithm of the invention splices feature values only when they have the same size, a 1 × 1 convolution layer is added after a 3 × 3 convolution layer and a 2 × 2 average pooling layer to change the size of the feature map. The relevant model parameter values are listed in Table 2.
Table 2. Model parameter values
Parameter name                   Parameter value
Size of embedding layer          64
Upper limit of sentence length   600
Number of words                  500
Hidden layer size                128
Batch size                       64
Number of iterations             10
Evaluation indexes: the error rate, the F1 score, and the model running time are used as evaluation indexes, so that the model is evaluated from multiple angles.
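For reference, the first two indexes can be computed as in the sketch below, which uses the standard definitions: the error rate is the fraction of misclassified texts, and the per-class F1 score is the harmonic mean of precision and recall. The function names and the toy labels are illustrative and do not come from the patent's experiments.

```python
def error_rate(y_true, y_pred):
    """Fraction of texts whose predicted class differs from the true class."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 score of one class: harmonic mean of its precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for three classes (not data from the patent).
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
err = error_rate(y_true, y_pred)          # one of five texts is misclassified
f1_class1 = f1_for_class(y_true, y_pred, label=1)
```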
Model comparison: the one-hot and word2vec word vector models and different model combinations are compared in terms of error rate; the results are shown in Table 3. As Table 3 shows, the classification algorithm of this embodiment outperforms the other algorithms on both datasets; in particular, its error rate on the EPCT data is as low as 7.63%.
Table 3. Error rates (%) of several models on the two datasets
Model combination                               THUCNews   EPCT
one-hot + CNN                                   11.47      9.5
word2vec + CNN                                  8.46       8.21
one-hot + DenseNet                              8.34       7.92
word2vec + DenseNet                             8.21       7.75
Classification algorithm of this embodiment     8.06       7.63
Next, to assess the splicing operation introduced in this embodiment, the best F1 scores before and after splicing are selected as the evaluation result, as shown in Table 4. As Table 4 shows, the model using the splicing operation of this embodiment achieves better results in multiple categories.
TABLE 4 comparison of F1 scores before and after splicing
Figure BDA0001831873250000091
In addition, FIG. 2 and FIG. 3 show that the improvements of the classification algorithm in this embodiment also benefit model efficiency: after training, the error rate of the classification algorithm of this embodiment can be as low as 7.5% on the EPCT text and as low as 8.6% on the THUCNews text.
FIG. 4 and FIG. 5 show how the error rate changes with training data size for training sets of different sizes. The model provided by the invention has a clear advantage on both datasets; in particular, on the EPCT dataset it achieves good results even when the training data size is not large.
Finally, the efficiency of the model is evaluated using running time as the index. As FIG. 6 and FIG. 7 show, compared with the one-hot + DenseNet model, the model of the classification algorithm of this embodiment shortens the running time by about 40% on the EPCT text and by about 35% on the THUCNews text. The classification algorithm of this embodiment can therefore classify telephone appeal texts in the power field quickly, accurately, and efficiently, and better meets the classification requirements.
It can be understood that, in the present embodiment, reference may be made to the prior art for a specific method for performing deduplication processing on a to-be-classified telephone appeal text by using the euclidean distance.
It can be understood that, in the present embodiment, reference may be made to the prior art for a specific method for denoising a to-be-classified telephone appeal text by using a hash value based on a DOM tree.
It can be understood that, in the present embodiment, the prior art may be referred to for a specific method for segmenting the words of the telephone appeal text to be classified by using the jieba language model.
It is understood that, in the present embodiment, reference may be made to the prior art for a specific method for building a vocabulary dictionary by using the double-array trie method.
It is understood that, in the present embodiment, the specific method for performing principal component analysis dimension reduction on the word vector in the one-hot form may refer to the prior art.
In addition to the preferred embodiment described above, the present invention has other embodiments. Those skilled in the art can make various changes and modifications based on the present invention, and such changes and modifications, provided they do not depart from the spirit of the present invention, fall within the scope of the present invention defined in the claims.

Claims (8)

1. A DenseNet-based telephone appeal text classification algorithm for the power field, characterized by comprising the following steps:
S1, obtaining a telephone appeal text to be classified;
S2, preprocessing the telephone appeal text obtained in step S1;
S3, performing data augmentation on the telephone appeal text preprocessed in step S2;
S4, building a vocabulary dictionary from the data augmented in step S3;
S5, performing word vector id matching according to the vocabulary dictionary built in step S4;
S6, performing word vector dimension reduction on the word vectors matched in step S5;
S7, applying 1 × 1 convolution layer processing with ResNet and DenseNet-BC to the word vectors reduced in dimension in step S6, and splicing the feature values of the same size obtained after the convolution layer processing using formula I,
Figure FDA0002483705880000011
where R_k denotes the feature value obtained after 1 × 1 convolution layer processing with ResNet, D_k denotes the feature value obtained after 1 × 1 convolution layer processing with DenseNet-BC, C_k denotes the feature value after splicing, x_{k+1} denotes the input to the (k+1)-th layer, and H denotes the activation function;
S8, randomly permuting the feature values spliced in step S7 to obtain high-level features;
S9, classifying the telephone appeal text using the high-level features obtained in step S8.
2. The telephone appeal text classification algorithm according to claim 1, wherein the preprocessing of the telephone appeal text to be classified in step S2 includes deduplication, denoising, stop-word removal, and text word segmentation.
3. The telephone appeal text classification algorithm according to claim 2, wherein in step S2 the telephone appeal text to be classified is deduplicated using the Euclidean distance.
4. The telephone appeal text classification algorithm according to claim 2, wherein in step S2 the telephone appeal text to be classified is denoised using a hash value based on the DOM tree.
5. The telephone appeal text classification algorithm according to claim 2, wherein in step S2 stop-word removal is performed on the telephone appeal text to be classified by creating a stop-word list dedicated to the power field.
6. The telephone appeal text classification algorithm according to claim 2, wherein in step S2 the jieba word segmenter is used to segment the telephone appeal text to be classified into words.
7. The telephone appeal text classification algorithm according to claim 1, wherein in step S4 the vocabulary dictionary is built using the double-array trie method.
8. The telephone appeal text classification algorithm according to claim 1, wherein in step S6 principal component analysis dimension reduction is performed on the one-hot word vectors.
CN201811208673.0A 2018-10-17 2018-10-17 DenseNet-based telephone appeal text classification algorithm for power field Active CN109376241B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811208673.0A CN109376241B (en) 2018-10-17 2018-10-17 DenseNet-based telephone appeal text classification algorithm for power field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811208673.0A CN109376241B (en) 2018-10-17 2018-10-17 DenseNet-based telephone appeal text classification algorithm for power field

Publications (2)

Publication Number Publication Date
CN109376241A CN109376241A (en) 2019-02-22
CN109376241B true CN109376241B (en) 2020-09-18

Family

ID=65400603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811208673.0A Active CN109376241B (en) 2018-10-17 2018-10-17 DenseNet-based telephone appeal text classification algorithm for power field

Country Status (1)

Country Link
CN (1) CN109376241B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111050315B (en) * 2019-11-27 2021-04-13 北京邮电大学 Wireless transmitter identification method based on multi-core two-way network
CN113553844B (en) * 2021-08-11 2023-07-25 四川长虹电器股份有限公司 Domain identification method based on prefix tree features and convolutional neural network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105975573A (en) * 2016-05-04 2016-09-28 北京广利核系统工程有限公司 KNN-based text classification method
CN108009284A (en) * 2017-12-22 2018-05-08 重庆邮电大学 Legal text classification method using semi-supervised convolutional neural networks
CN108563791A (en) * 2018-04-29 2018-09-21 华中科技大学 Method and system for classifying construction quality complaint texts
CN108596329A (en) * 2018-05-11 2018-09-28 北方民族大学 Three-dimensional model classification method based on an end-to-end deep ensemble learning network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1049030A1 (en) * 1999-04-28 2000-11-02 SER Systeme AG Produkte und Anwendungen der Datenverarbeitung Classification method and apparatus

Also Published As

Publication number Publication date
CN109376241A (en) 2019-02-22

Similar Documents

Publication Publication Date Title
CN104167208B (en) Speaker recognition method and device
CN105022754B (en) Object classification method and device based on social network
CN110175221B (en) Junk short message identification method by combining word vector with machine learning
AU2017243270A1 (en) Method and device for extracting core words from commodity short text
CN107895000B (en) Cross-domain semantic information retrieval method based on convolutional neural network
CN111460148A (en) Text classification method and device, terminal equipment and storage medium
CN104102919A (en) Image classification method capable of effectively preventing convolutional neural network from overfitting
CN110826618A (en) Personal credit risk assessment method based on random forest
CN104538035B (en) Speaker recognition method and system based on Fisher supervectors
CN107818173B (en) Vector space model-based Chinese false comment filtering method
CN108846047A (en) Picture retrieval method and system based on convolution features
CN113239690A (en) Chinese text intention identification method based on integration of BERT and fully-connected neural network
CN109376241B (en) DenseNet-based telephone appeal text classification algorithm for power field
CN112347246B (en) Self-adaptive document clustering method and system based on spectrum decomposition
CN107526792A (en) Rapid keyword extraction method for Chinese question sentences
CN112989052B (en) Chinese news long text classification method based on combination-convolution neural network
CN111782804A (en) TextCNN-based same-distribution text data selection method, system and storage medium
CN115456043A (en) Classification model processing method, intent recognition method, device and computer equipment
CN114420151B (en) Speech emotion recognition method based on parallel tensor decomposition convolutional neural network
CN111245820A (en) Phishing website detection method based on deep learning
CN114266249A (en) Mass text clustering method based on BIRCH clustering
CN109858035A (en) Sentiment classification method, device, electronic equipment and readable storage medium
CN113743079A (en) Text similarity calculation method and device based on co-occurrence entity interaction graph
WO2023147299A1 (en) Systems and methods for short text similarity based clustering
CN111125304A (en) Word2vec-based patent text automatic classification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant