CN109840279A - Text classification method based on convolutional recurrent neural network - Google Patents
Text classification method based on convolutional recurrent neural network
- Publication number
- CN109840279A (application CN201910025175.0A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- indicate
- input
- text
- formula
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The present invention discloses a text classification method based on a convolutional recurrent neural network. The method makes full use of the strength of convolutional neural networks at extracting local features to perform feature extraction on text, while using the memory capability of LSTM to link the extracted contextual features and better represent the semantic information of the text. The method not only achieves good classification results on English datasets but also attains high classification accuracy on Chinese datasets.
Description
Technical field
The present invention relates to text classification methods, and specifically to a text classification method based on a convolutional recurrent neural network.
Background technique
With the rapid development of deep learning, convolutional neural networks and recurrent neural networks have achieved great success in a variety of machine learning tasks. Convolutional neural networks, for example, are widely used in computer vision and handle tasks such as image classification, object detection, and image segmentation quite maturely; they have also been applied to speech recognition. Recurrent neural networks are another important branch of deep learning and are mainly used to process sequence problems. The long short-term memory network (LSTM) is a specific type of recurrent neural network that can capture the contextual information of a sequence; it is widely used for time-series problems such as speech recognition and machine translation.
In recent years, for sequence-data problems, more and more researchers have combined convolutional and recurrent neural networks. This hybrid model is called a convolutional recurrent neural network (CRNN) and can be described simply as a convolutional neural network followed by a recurrent neural network. In this model the convolutional neural network is mainly used to extract features, while the recurrent neural network links the contextual feature information together. The model has so far been applied to music classification, hyperspectral data classification, bird audio detection, and similar tasks.
Convolutional recurrent neural network models are equally applicable to text classification. In text classification, a convolutional neural network can flexibly extract the features of the text; and since the classification result is influenced by the entire content of the text, linking the extracted features with a long short-term memory network yields a better representation of the text and thus better classification. Therefore, the present invention classifies text with a convolutional recurrent neural network and compares it against other classification methods on Chinese and English experimental datasets.
Summary of the invention
The technical problem to be solved by the present invention is to provide a text classification method based on a convolutional recurrent neural network. The method first applies a convolutional network to the input text to extract several groups of features, pools each group to pick out the important features of the text, then fuses the extracted features, feeds them into an LSTM network, and outputs the classification result through a fully connected layer.
To solve this technical problem, the present invention adopts the following technical scheme: a text classification method based on a convolutional recurrent neural network, characterized by comprising the following steps:
S01), the sample data, a text sequence, is converted into a word-vector matrix that serves as the input of the convolutional layer;
S02), a convolution operation is performed on the input data with convolution kernels of multiple scales; the height of the feature map after convolution is computed with formula 1. During the convolution operation, each local feature of the input is first computed with a single convolution kernel according to formula 2; the computed features are then stacked vertically with formula 3; finally an activation function is applied to the result to obtain the final convolution features, as in formula 4:

H_2 = ⌊(H_1 − F + 2P)/S⌋ + 1 (1),

h1_F(i) = f(W_F · X(i:i+F−1) + b) (2),

h1_F = [h1_F(1); h1_F(2); …; h1_F(H_2)] (3),

hr1_F = relu(h1_F) (4),

where H_2 is the height of the feature map after convolution, H_1 is the height of the input before convolution, F is the height of the convolution kernel, P is the padding size, S is the stride, ⌊·⌋ denotes rounding down, W_F is a convolution kernel of height F, X(i:i+F−1) is the local feature vector covering features i through i+F−1 of the sample input vector, and b is the bias;
S03), max pooling (MaxPooling1D) is applied to the convolution results to extract the important features of the text, and the pooled results are then concatenated with the Concatenate function to serve as the input of the LSTM layer, as shown in formulas 5 and 6:

hrp1_F = max(hr1_F) (5),

h1 = [hrp1_{F=2}, hrp1_{F=3}] (6);
S04), the text feature sequences processed by the different convolution kernels are used as the input of the LSTM network; the LSTM network can represent the semantic information of the text more accurately and thus achieve better text classification. The calculation at each time step of the LSTM network is as follows:

f_t = σ(W_f·[h_{t−1}, h1_t] + b_f) (7),

i_t = σ(W_i·[h_{t−1}, h1_t] + b_i) (8),

c̃_t = tanh(W_c·[h_{t−1}, h1_t] + b_c) (9),

c_t = f_t ∘ c_{t−1} + i_t ∘ c̃_t (10),

o_t = σ(W_o·[h_{t−1}, h1_t] + b_o) (11),

h_t = o_t ∘ tanh(c_t) (12),
where f_t is the forget gate, σ is the sigmoid function, W_f is the weight matrix of the forget gate, [h_{t−1}, h1_t] denotes the combination of the two vectors into one longer vector, h_{t−1} is the output of the LSTM network at the previous time step, h1_t is the convolved-and-pooled output h1 at time step t, and b_f is the bias of the forget gate; i_t is the input gate, W_i the weight matrix of the input gate, and b_i the bias of the input gate; c̃_t is the candidate cell state of the current input, computed from the previous output and the current input, with weight matrix W_c and bias b_c; c_t is the cell state at the current time, obtained by multiplying the forget gate f_t by the cell state c_{t−1} of the previous time step and adding the input gate i_t multiplied by the candidate cell state c̃_t, so that the long-term memory c_{t−1} of the LSTM is combined with the current memory c̃_t to form the new cell state c_t; o_t is the output gate, W_o the weight matrix of the output gate, and b_o the bias of the output gate; h_t is the final output, determined jointly by the cell state c_t and the output gate o_t.
Further, the method also comprises step S05): a fully connected layer is added whose output dimension is the number of classes in the training set, and the probability that a sample belongs to each class is computed with the Softmax function:

p(i) = exp(y(i)) / Σ_k exp(y(k)) (13),

where y(i) is the value of the i-th neuron of the output layer, y(k) is the value of the k-th neuron of the output layer, and exp is the exponential function with base e.
Further, step S01 comprises the following detailed steps: (1) a word-segmentation operation is performed on the Chinese training dataset; (2) a dictionary and the mapping between dictionary and indices are built; (3) the text sequences are mapped to index sequences; (4) the sequences of all samples are processed to the same length, which can be achieved by zero-padding or truncation; (5) word embedding is performed with pre-trained word vectors. If the sample sequence length is M and the pre-trained word-vector dimension is N, then after word embedding each sample is converted into an M×N word-vector matrix that serves as the input of the convolutional layer.
Further, in step S02, a one-dimensional convolutional layer is used to convolve the input; the kernel heights take the two scales 2 and 3, the number of kernels is 256, and the activation function is the ReLU function.
Further, a Batch Normalization layer is added between steps S02 and S03 to normalize the data and speed up the convergence of the model.
Further, a Dropout layer is added between steps S04 and S05, which randomly disconnects a specified proportion of neuron connections to prevent overfitting.
Beneficial effects of the present invention: based on convolutional neural networks and the recurrent neural network LSTM, the present invention proposes a text classification method based on a convolutional recurrent neural network. The method makes full use of the strength of convolutional neural networks at extracting local features to perform feature extraction on text, while using the memory capability of LSTM to link the extracted contextual features and better represent the semantic information of the text. The method not only achieves good classification results on English datasets but also attains high classification accuracy on Chinese datasets.
Detailed description of the invention
Fig. 1 is the structure diagram of the convolutional recurrent neural network;
Fig. 2 is the structure diagram of the convolutional neural network;
Fig. 3 is the structure diagram of the LSTM network.
Specific embodiment
The present invention is further illustrated below with reference to the drawings and specific embodiments.
Embodiment 1
This embodiment discloses a text classification method based on a convolutional recurrent neural network. The method is based on a convolutional recurrent neural network model which, as shown in Figure 1, comprises an input layer, a word-embedding layer, a convolutional layer, a pooling layer, a long short-term memory (LSTM) network layer, and a fully connected layer. The model first applies a convolutional network to the input text to extract several groups of features, pools each group to pick out the important features of the text, then fuses the extracted features, feeds them into the LSTM network, and outputs the classification result through the fully connected layer.
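As an illustration of this architecture, below is a minimal sketch in Keras. This is an assumption-laden reconstruction: the patent names Conv1D, MaxPooling1D, Concatenate, LSTM, and Dropout (which match the Keras API) and gives the hyperparameters of section 4.3, but publishes no code; the pooling window and concatenation axis in particular are illustrative choices.

```python
# Minimal sketch of the CRNN text classifier of Figure 1, assuming the Keras
# API; pool size and concatenation axis are illustrative, not fixed by the patent.
from tensorflow.keras.layers import (Input, Embedding, Conv1D, BatchNormalization,
                                     MaxPooling1D, Concatenate, LSTM, Dropout, Dense)
from tensorflow.keras.models import Model

M, N, VOCAB, NUM_CLASSES = 100, 100, 20000, 5   # sizes following section 4.3

inp = Input(shape=(M,), dtype='int32')          # index sequence of length M
emb = Embedding(VOCAB, N)(inp)                  # word-embedding layer: M x N matrix

branches = []
for F in (2, 3):                                # the two kernel heights of step S02
    c = Conv1D(256, F, activation='relu')(emb)  # feature-map height M - F + 1
    c = BatchNormalization()(c)                 # normalization between S02 and S03
    c = MaxPooling1D(pool_size=2)(c)            # step S03: keep salient features
    branches.append(c)

# Both branches now have time length 49, so they can be concatenated along the
# feature axis into one sequence that serves as the LSTM input (steps S03/S04).
merged = Concatenate()(branches)
h = LSTM(70)(merged)                            # step S04: 70 units per section 4.3
h = Dropout(0.5)(h)                             # dropout between S04 and S05
out = Dense(NUM_CLASSES, activation='softmax')(h)  # step S05: class probabilities

model = Model(inp, out)
model.summary()
```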
The specific steps of this method are as follows:
S01), the sample data, a text sequence, is converted into a word-vector matrix that serves as the input of the convolutional layer;
In text classification a sample is usually a text sequence, so before it is fed to the neural network it must be represented as a word-vector matrix. Because the sample lengths differ in text classification, the samples must be processed to the same length before word embedding; the sample length depends on the size of the dataset (let the sample length be M). Here word embedding is performed with pre-trained word vectors whose dimension is denoted N, so that each sample can be represented as an M×N word-vector matrix that serves as the input of the convolutional layer.
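A minimal sketch of this preprocessing, assuming the Keras tokenizer utilities and the jieba segmenter that this document mentions for Chinese text; the toy corpus and all sizes are illustrative:

```python
# Sketch of step S01: segment, index, pad, then embed. Assumes Keras
# preprocessing utilities and jieba; corpus and sizes are illustrative.
import jieba
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

texts = ["这是一个文本分类的例子", "卷积循环神经网络用于文本分类"]   # toy corpus
segmented = [" ".join(jieba.lcut(t)) for t in texts]    # (1) word segmentation

tokenizer = Tokenizer(num_words=20000)                  # (2) dictionary and indices
tokenizer.fit_on_texts(segmented)
index_seqs = tokenizer.texts_to_sequences(segmented)    # (3) text -> index sequence

M = 100
padded = pad_sequences(index_seqs, maxlen=M,            # (4) pad or truncate to M
                       padding='post', truncating='post')
# (5) the Embedding layer of the model above then maps each sample to an
# M x N word-vector matrix; pre-trained vectors can be supplied via `weights=`.
```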
S02), in order to represent the semantic features of the text more accurately, this embodiment convolves the input data with convolution kernels of multiple scales, applies a max-pooling operation to the convolution results to extract the important features of the text, and then concatenates the pooled results as the input of the LSTM layer; the structure of the convolutional neural network is shown in Figure 2.
In this embodiment a one-dimensional convolutional layer (Conv1D) is used to convolve the input; the kernel heights take the two scales 2 and 3, the number of kernels is 256, and the activation function is the ReLU function. The text length is here taken as 100, so after convolution the heights of the feature maps are 99 and 98 respectively (computed with formula 1), and the feature-map dimensions after convolution are (99, 256) and (98, 256):
H_2 = ⌊(H_1 − F + 2P)/S⌋ + 1 (1),

where H_2 is the height of the feature map after convolution, H_1 is the height of the input before convolution, F is the height of the convolution kernel, P is the padding size (0 here), S is the stride (1 here), and ⌊·⌋ denotes rounding down.
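A quick check of formula 1 with the numbers above (the helper function is ours, for illustration only):

```python
def conv_out_height(h1, f, p=0, s=1):
    """Formula 1: feature-map height after a one-dimensional convolution."""
    return (h1 - f + 2 * p) // s + 1

assert conv_out_height(100, 2) == 99   # kernel height 2
assert conv_out_height(100, 3) == 98   # kernel height 3
```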
During convolution feature extraction, each local feature of the input is first computed with a single convolution kernel (formula 2); the computed features are then stacked vertically (formula 3); finally an activation function is applied to the result to obtain the final convolution features (formula 4):
h1_F(i) = f(W_F · X(i:i+F−1) + b) (2),

h1_F = [h1_F(1); h1_F(2); …; h1_F(H_2)] (3),

hr1_F = relu(h1_F) (4),

where W_F is a convolution kernel of height F, X(i:i+F−1) is the local feature vector covering features i through i+F−1 of the sample input vector, and b is the bias.
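A minimal NumPy sketch of formulas 2 to 4 for a single kernel, given for illustration (the patent's actual implementation is not published):

```python
import numpy as np

def conv_single_kernel(X, W_F, b):
    """Formulas 2-4: slide one kernel of height F over the input X (H1 x N),
    stack the responses vertically, then apply the ReLU activation."""
    F = W_F.shape[0]
    H2 = X.shape[0] - F + 1                        # formula 1 with P=0, S=1
    h1 = np.array([np.sum(W_F * X[i:i + F]) + b    # formula 2: one local window
                   for i in range(H2)])            # formula 3: vertical stack
    return np.maximum(h1, 0.0)                     # formula 4: ReLU

X = np.random.randn(100, 100)    # one sample: a 100 x 100 word-vector matrix
W = np.random.randn(2, 100)      # one convolution kernel of height 2
print(conv_single_kernel(X, W, 0.1).shape)   # (99,), matching formula 1
```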
S03), max pooling (MaxPooling1D) is applied to the convolution results to extract the important features of the text, and the pooled results are then concatenated with the Concatenate function to serve as the input of the LSTM layer, as shown in formulas 5 and 6:

hrp1_F = max(hr1_F) (5),

h1 = [hrp1_{F=2}, hrp1_{F=3}] (6).
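Formulas 5 and 6 sketched in NumPy; note that formula 5 reads as a global maximum per feature map, whereas the Keras sketch earlier assumed a windowed MaxPooling1D, since the patent names the layer but does not fix the window:

```python
# Sketch of formulas 5 and 6 (illustrative dimensions).
import numpy as np

hr1_2 = np.random.rand(99, 256)   # ReLU feature maps from the height-2 kernels
hr1_3 = np.random.rand(98, 256)   # ReLU feature maps from the height-3 kernels

hrp1_2 = hr1_2.max(axis=0)        # formula 5: strongest response per kernel
hrp1_3 = hr1_3.max(axis=0)

h1 = np.concatenate([hrp1_2, hrp1_3])   # formula 6: fuse the pooled features
print(h1.shape)                          # (512,)
```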
S04), exploiting the ability of the long short-term memory network (LSTM) to capture contextual information, the text feature sequences processed by the different convolution kernels are used as the input of the LSTM network, which can represent the semantics of the text more accurately and thus achieve better text classification. The structure of the LSTM network is shown in Figure 3.
The calculation at each time step of the LSTM network is as follows:

f_t = σ(W_f·[h_{t−1}, h1_t] + b_f) (7),

i_t = σ(W_i·[h_{t−1}, h1_t] + b_i) (8),

c̃_t = tanh(W_c·[h_{t−1}, h1_t] + b_c) (9),

c_t = f_t ∘ c_{t−1} + i_t ∘ c̃_t (10),

o_t = σ(W_o·[h_{t−1}, h1_t] + b_o) (11),

h_t = o_t ∘ tanh(c_t) (12),
where f_t is the forget gate, σ is the sigmoid function, W_f is the weight matrix of the forget gate, [h_{t−1}, h1_t] denotes the combination of the two vectors into one longer vector, h_{t−1} is the output of the LSTM network at the previous time step, h1_t is the convolved-and-pooled output h1 at time step t, and b_f is the bias of the forget gate; i_t is the input gate, W_i the weight matrix of the input gate, and b_i the bias of the input gate; c̃_t is the candidate cell state of the current input, computed from the previous output and the current input, with weight matrix W_c and bias b_c; c_t is the cell state at the current time, obtained by multiplying the forget gate f_t by the cell state c_{t−1} of the previous time step and adding the input gate i_t multiplied by the candidate cell state c̃_t, so that the long-term memory c_{t−1} of the LSTM is combined with the current memory c̃_t to form the new cell state c_t; o_t is the output gate, W_o the weight matrix of the output gate, and b_o the bias of the output gate; h_t is the final output, determined jointly by the cell state c_t and the output gate o_t.
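One LSTM time step, formulas 7 through 12, sketched in NumPy (the dimensions follow section 4.3; the initialization is illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(h_prev, c_prev, x_t, W, b):
    """Formulas 7-12; W and b hold the parameters of the four gates,
    keyed 'f', 'i', 'c', 'o'; z is the concatenation [h_{t-1}, h1_t]."""
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W['f'] @ z + b['f'])         # (7)  forget gate
    i = sigmoid(W['i'] @ z + b['i'])         # (8)  input gate
    c_tilde = np.tanh(W['c'] @ z + b['c'])   # (9)  candidate cell state
    c = f * c_prev + i * c_tilde             # (10) new cell state
    o = sigmoid(W['o'] @ z + b['o'])         # (11) output gate
    h = o * np.tanh(c)                       # (12) final output
    return h, c

units, feat = 70, 512                        # 70 LSTM units per section 4.3
rng = np.random.default_rng(0)
W = {k: 0.1 * rng.standard_normal((units, units + feat)) for k in 'fico'}
b = {k: np.zeros(units) for k in 'fico'}
h, c = lstm_step(np.zeros(units), np.zeros(units), rng.standard_normal(feat), W, b)
```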
S05), to prevent overfitting, several Dropout layers with rate 0.5 are added to the model. The model ends with fully connected layers; the output dimension of the last fully connected layer is the number of classes in the dataset, and the probability that a sample belongs to each class is computed with the softmax function, as in formula 13:

p(i) = exp(y(i)) / Σ_k exp(y(k)) (13),

where y(i) is the value of the i-th neuron of the output layer, y(k) is the value of the k-th neuron of the output layer, and exp is the exponential function with base e.
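Formula 13 in NumPy (the maximum is subtracted for numerical stability, a standard refinement the formula itself does not spell out):

```python
import numpy as np

def softmax(y):
    e = np.exp(y - y.max())     # formula 13, stabilized
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # class probabilities summing to 1
```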
Embodiment 2
This embodiment evaluates the proposed convolutional recurrent neural network model on 2 Chinese datasets and 5 commonly used English text-classification datasets. The Chinese datasets are drawn from CNKI paper data collected by the authors; the 5 English datasets come from the paper by Zhang et al. and cover different classification tasks such as sentiment analysis, topic classification, and news categorization. The training-set sizes range from 120K to 1.4M, and the number of classes in the classification tasks ranges from 4 to 14. The specific dataset information is shown in the table below.
Table 1: Text classification dataset information

| Dataset | Training data | Test data | Classes | Classification task | Language |
|---|---|---|---|---|---|
| Paper Data Set 1 | 160000 | 40000 | 5 | Document classification | CH |
| Paper Data Set 2 | 320000 | 80000 | 10 | Document classification | CH |
| AG's news | 120000 | 7600 | 4 | News categorization | EN |
| Sogou news | 450000 | 60000 | 5 | News categorization | EN |
| DBPedia | 560000 | 70000 | 14 | Ontology classification | EN |
| Yelp Review Full | 650000 | 50000 | 5 | Sentiment analysis | EN |
| Yahoo! Answers | 1400000 | 60000 | 10 | Topic classification | EN |
Paper Data Set: the academic-paper datasets come from CNKI academic papers collected by the authors. Dataset 1 contains 5 document classes: clinical medicine, mathematics, power industry, biology, and vocational education. For each class, 40000 records are chosen as experimental data, of which 80% serve as training data and 20% as test data. Dataset 2 contains 10 document classes: chemistry, light-industry handicrafts, animal husbandry and veterinary medicine, pharmacy, news and media, railway transportation, pediatrics, sports, physics, and agricultural economics; for each class, 40000 records are likewise chosen as experimental data, with 80% as training data and 20% as test data.
AG's news corpus: AG is a collection of more than 1,000,000 news articles gathered by ComeToMyHead from more than 2000 news sources over several years of activity. The dataset is intended for any non-commercial use in data mining (classification, clustering) and information retrieval (ranking, search). The AG news topic-classification dataset was built from this corpus by Zhang et al. for their character-level convolutional neural network text-classification experiments. It selects the 4 largest classes from the original corpus, World, Sports, Business, and Sci/Tech, with 30000 training samples and 1900 test samples per class. Each sample contains 3 columns: class index (1 to 4), title, and description.
Sogou news corpus: the Sogou news topic-classification dataset was selected by Zhang et al. from SogouCA and SogouCS for their character-level convolutional neural network text-classification experiments. It selects the 5 largest classes from the original corpus, sports, finance, entertainment, automobile, and technology, with 90000 training samples and 12000 test samples per class. The dataset was originally a Chinese dataset, but Zhang et al. used the pypinyin package together with the jieba segmentation system to convert the Chinese data into Pinyin text. Each sample likewise contains 3 columns: class index (1 to 5), title, and content.
DBPedia ontology dataset: DBpedia is a crowd-sourced community effort to extract structured content from Wikipedia [24]. The DBpedia ontology dataset was constructed by selecting 14 non-overlapping classes from DBpedia 2014: Company, EducationalInstitution, Artist, Athlete, OfficeHolder, MeanOfTransportation, Building, NaturalPlace, Village, Animal, Plant, Album, Film, and WrittenWork. From each of the 14 ontology classes, 40000 training samples and 5000 test samples are randomly selected. The fields of the dataset include the class index (1 to 14) and the title and abstract of each Wikipedia article.
Yelp Review Full: the Yelp review dataset was obtained from the 2015 Yelp Dataset Challenge. The original review dataset contains 5 star ratings, 1 through 5. The Yelp review dataset is built by randomly selecting 130000 training samples and 10000 test samples from the reviews of each star rating. Each sample contains the star index (1 to 5) and the review content.
Yahoo! Answers dataset: the Yahoo! Answers dataset comes from the Yahoo! Webscope dataset. The Yahoo! Webscope corpus contains 4483032 questions and their answers. The Yahoo! Answers topic-classification dataset was built by selecting the 10 largest classes from the original corpus; the topic classes are Society & Culture, Science & Mathematics, Health, Education & Reference, Computers & Internet, Sports, Business & Finance, Entertainment & Music, Family & Relationships, and Politics & Government. Each class contains 140000 training samples and 6000 test samples. Each sample contains the class index (1 to 10), question title, question content, and best answer.
4.2 Benchmark models
Relatively classical classification models of recent years are chosen as benchmark models and compared with the proposed convolutional recurrent neural network classification model. On the 2 self-built Chinese academic-paper datasets, the classical fastText and HAN classification models are chosen as benchmark models. On the 5 general English datasets, the benchmark models include both traditional classification models and neural-network-based models. The traditional models are mainly linear methods, and their results are given in the paper by Zhang et al. The neural-network-based models include char-CNN, fastText, and VDCNN, whose results are given in the papers by Zhang et al., Joulin et al., and Conneau et al. respectively. The above benchmark models all use the same experimental datasets, so to evaluate the proposed model further, the proposed model is likewise tested on these datasets.
4.3 Model parameter settings
Pre-trained word vectors are used to embed the input text and are fine-tuned during model training; the word-vector dimension is 100; the maximum sentence length of each sample depends on the text; the dictionary size differs with the dataset and is usually set to 20000; a portion of the data with ratio 0.1 is selected as the cross-validation set; the Dropout ratio is 0.5; the convolution kernel sizes are 2 and 3 and the number of kernels is 256; the number of neurons in the LSTM layer is 70; the Adam optimization method is used with the learning rate set to 1e-4; the batch size is set to 256.
4.4 experimental results and analysis
The proposed convolutional recurrent neural network text-classification model is tested on the above data and compared with the benchmark models. In addition, so that the proposed model can achieve the best text-classification effect, experiments are run with different numbers of convolution kernels: 64, 128, 256, and 512. The specific experimental results are shown in Tables 2 and 3.
Table 2: Experimental results for different numbers of convolution kernels

Table 3: Text classification experimental results
The experimental results in Table 2 show that, within a certain range, the accuracy of text classification keeps improving as the number of convolution kernels increases, and the classification effect is best when the number of kernels is 256. In addition, the experimental results in Table 3 show that the proposed model not only achieves good classification results on the Chinese datasets but also exceeds the other benchmark models in classification accuracy on the AG's news corpus and the DBPedia ontology dataset. In summary, the proposed model is applicable not only to the classification of Chinese datasets but equally to the classification of English datasets.
The present invention combines the strength of convolutional neural networks at extracting local features with the memory capability of the recurrent neural network LSTM to propose a text classification method based on a convolutional recurrent neural network, and tests the proposed model on 2 Chinese datasets and 5 commonly used English datasets. The experimental results show that the proposed model not only attains high classification accuracy on the Chinese datasets but also classifies well on the English datasets.
The above describes only the basic principle and preferred embodiments of the present invention; improvements and substitutions made by those skilled in the art according to the present invention fall within the scope of protection of the present invention.
Claims (6)
1. A text classification method based on a convolutional recurrent neural network, characterized by comprising the following steps:
S01), the sample data, a text sequence, is converted into a word-vector matrix that serves as the input of the convolutional layer;
S02), a convolution operation is performed on the input data with convolution kernels of multiple scales; the height of the feature map after convolution is computed with formula 1; during the convolution operation, each local feature of the input is first computed with a single convolution kernel according to formula 2, the computed features are then stacked vertically with formula 3, and finally an activation function is applied to the result to obtain the final convolution features, as in formula 4:

H_2 = ⌊(H_1 − F + 2P)/S⌋ + 1 (1),

h1_F(i) = f(W_F · X(i:i+F−1) + b) (2),

h1_F = [h1_F(1); h1_F(2); …; h1_F(H_2)] (3),

hr1_F = relu(h1_F) (4),

where H_2 is the height of the feature map after convolution, H_1 is the height of the input before convolution, F is the height of the convolution kernel, P is the padding size, S is the stride, ⌊·⌋ denotes rounding down, W_F is a convolution kernel of height F, X(i:i+F−1) is the local feature vector covering features i through i+F−1 of the sample input vector, and b is the bias;
S03), max pooling (MaxPooling1D) is applied to the convolution results to extract the important features of the text, and the pooled results are then concatenated with the Concatenate function to serve as the input of the LSTM layer, as shown in formulas 5 and 6:

hrp1_F = max(hr1_F) (5),

h1 = [hrp1_{F=2}, hrp1_{F=3}] (6);

S04), the text feature sequences processed by the different convolution kernels are used as the input of the LSTM network; the LSTM network can represent the semantic information of the text more accurately and thus achieve better text classification; the calculation at each time step of the LSTM network is as follows:

f_t = σ(W_f·[h_{t−1}, h1_t] + b_f) (7),

i_t = σ(W_i·[h_{t−1}, h1_t] + b_i) (8),

c̃_t = tanh(W_c·[h_{t−1}, h1_t] + b_c) (9),

c_t = f_t ∘ c_{t−1} + i_t ∘ c̃_t (10),

o_t = σ(W_o·[h_{t−1}, h1_t] + b_o) (11),

h_t = o_t ∘ tanh(c_t) (12),

where f_t is the forget gate, σ is the sigmoid function, W_f is the weight matrix of the forget gate, [h_{t−1}, h1_t] denotes the combination of the two vectors into one longer vector, h_{t−1} is the output of the LSTM network at the previous time step, h1_t is the convolved-and-pooled output h1 at time step t, and b_f is the bias of the forget gate; i_t is the input gate, W_i the weight matrix of the input gate, and b_i the bias of the input gate; c̃_t is the candidate cell state of the current input, computed from the previous output and the current input, with weight matrix W_c and bias b_c; c_t is the cell state at the current time, obtained by multiplying the forget gate f_t by the cell state c_{t−1} of the previous time step and adding the input gate i_t multiplied by the candidate cell state c̃_t, so that the long-term memory c_{t−1} of the LSTM is combined with the current memory c̃_t to form the new cell state c_t; o_t is the output gate, W_o the weight matrix of the output gate, and b_o the bias of the output gate; h_t is the final output, determined jointly by the cell state c_t and the output gate o_t.
2. The text classification method based on a convolutional recurrent neural network according to claim 1, characterized in that: the method further comprises step S05), in which a fully connected layer is added whose output dimension is the number of classes in the training set, and the probability that a sample belongs to each class is computed with the Softmax function:

p(i) = exp(y(i)) / Σ_k exp(y(k)) (13),

where y(i) is the value of the i-th neuron of the output layer, y(k) is the value of the k-th neuron of the output layer, and exp is the exponential function with base e.
3. The text classification method based on a convolutional recurrent neural network according to claim 1, characterized in that step S01 further comprises the following detailed steps: (1) a word-segmentation operation is performed on the Chinese training dataset; (2) a dictionary and the mapping between dictionary and indices are built; (3) the text sequences are mapped to index sequences; (4) the sequences of all samples are processed to the same length; (5) word embedding is performed with pre-trained word vectors; if the sample sequence length is M and the pre-trained word-vector dimension is N, then after word embedding each sample is converted into an M×N word-vector matrix that serves as the input of the convolutional layer.
4. The text classification method based on a convolutional recurrent neural network according to claim 1, characterized in that: in step S02 a one-dimensional convolutional layer is used to convolve the input, the kernel heights take the two scales 2 and 3, the number of kernels is 256, and the activation function is the ReLU function.
5. The text classification method based on a convolutional recurrent neural network according to claim 1, characterized in that: a Batch Normalization layer is added between steps S02 and S03 to normalize the data and speed up the convergence of the model.
6. The text classification method based on a convolutional recurrent neural network according to claim 1, characterized in that: a Dropout layer is added between steps S04 and S05, which randomly disconnects a specified proportion of neuron connections to prevent overfitting.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910025175.0A CN109840279A (en) | 2019-01-10 | 2019-01-10 | Text classification method based on convolutional recurrent neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910025175.0A CN109840279A (en) | 2019-01-10 | 2019-01-10 | Text classification method based on convolutional recurrent neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109840279A true CN109840279A (en) | 2019-06-04 |
Family
ID=66883776
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910025175.0A Pending CN109840279A (en) | 2019-01-10 | 2019-01-10 | File classification method based on convolution loop neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109840279A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110347826A (en) * | 2019-06-17 | 2019-10-18 | 昆明理工大学 | A method of Laos's words and phrases feature is extracted based on character |
CN110399455A (en) * | 2019-06-05 | 2019-11-01 | 福建奇点时空数字科技有限公司 | A kind of deep learning data digging method based on CNN and LSTM |
CN110569358A (en) * | 2019-08-20 | 2019-12-13 | 上海交通大学 | Model, method and medium for learning long-term dependency and hierarchical structure text classification |
CN110569400A (en) * | 2019-07-23 | 2019-12-13 | 福建奇点时空数字科技有限公司 | Information extraction method for personnel information modeling based on CNN and LSTM |
CN110717330A (en) * | 2019-09-23 | 2020-01-21 | 哈尔滨工程大学 | Word-sentence level short text classification method based on deep learning |
CN110765785A (en) * | 2019-09-19 | 2020-02-07 | 平安科技(深圳)有限公司 | Neural network-based Chinese-English translation method and related equipment thereof |
CN111078833A (en) * | 2019-12-03 | 2020-04-28 | 哈尔滨工程大学 | Text classification method based on neural network |
CN111310801A (en) * | 2020-01-20 | 2020-06-19 | 桂林航天工业学院 | Mixed dimension flow classification method and system based on convolutional neural network |
CN111460100A (en) * | 2020-03-30 | 2020-07-28 | 中南大学 | Criminal legal document and criminal name recommendation method and system |
CN111459927A (en) * | 2020-03-27 | 2020-07-28 | 中南大学 | CNN-LSTM developer project recommendation method |
CN112597311A (en) * | 2020-12-28 | 2021-04-02 | 东方红卫星移动通信有限公司 | Terminal information classification method and system based on low-earth-orbit satellite communication |
CN112989052A (en) * | 2021-04-19 | 2021-06-18 | 北京建筑大学 | Chinese news text classification method based on combined-convolutional neural network |
CN113297364A (en) * | 2021-06-07 | 2021-08-24 | 吉林大学 | Natural language understanding method and device for dialog system |
CN113378556A (en) * | 2020-02-25 | 2021-09-10 | 华为技术有限公司 | Method and device for extracting text keywords |
CN114207605A (en) * | 2019-10-31 | 2022-03-18 | 深圳市欢太科技有限公司 | Text classification method and device, electronic equipment and storage medium |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169035A (en) * | 2017-04-19 | 2017-09-15 | 华南理工大学 | A kind of file classification method for mixing shot and long term memory network and convolutional neural networks |
CN108595632A (en) * | 2018-04-24 | 2018-09-28 | 福州大学 | A kind of hybrid neural networks file classification method of fusion abstract and body feature |
CN108763216A (en) * | 2018-06-01 | 2018-11-06 | 河南理工大学 | A kind of text emotion analysis method based on Chinese data collection |
Non-Patent Citations (1)
Title |
---|
CHUNTING ZHOU et al.: "A C-LSTM Neural Network for Text Classification", Computer Science *
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399455A (en) * | 2019-06-05 | 2019-11-01 | 福建奇点时空数字科技有限公司 | A kind of deep learning data digging method based on CNN and LSTM |
CN110347826A (en) * | 2019-06-17 | 2019-10-18 | 昆明理工大学 | A method of Laos's words and phrases feature is extracted based on character |
CN110569400A (en) * | 2019-07-23 | 2019-12-13 | 福建奇点时空数字科技有限公司 | Information extraction method for personnel information modeling based on CNN and LSTM |
CN110569358A (en) * | 2019-08-20 | 2019-12-13 | 上海交通大学 | Model, method and medium for learning long-term dependency and hierarchical structure text classification |
CN110765785A (en) * | 2019-09-19 | 2020-02-07 | 平安科技(深圳)有限公司 | Neural network-based Chinese-English translation method and related equipment thereof |
CN110765785B (en) * | 2019-09-19 | 2024-03-22 | 平安科技(深圳)有限公司 | Chinese-English translation method based on neural network and related equipment thereof |
CN110717330A (en) * | 2019-09-23 | 2020-01-21 | 哈尔滨工程大学 | Word-sentence level short text classification method based on deep learning |
CN114207605A (en) * | 2019-10-31 | 2022-03-18 | 深圳市欢太科技有限公司 | Text classification method and device, electronic equipment and storage medium |
CN111078833A (en) * | 2019-12-03 | 2020-04-28 | 哈尔滨工程大学 | Text classification method based on neural network |
CN111078833B (en) * | 2019-12-03 | 2022-05-20 | 哈尔滨工程大学 | Text classification method based on neural network |
CN111310801A (en) * | 2020-01-20 | 2020-06-19 | 桂林航天工业学院 | Mixed dimension flow classification method and system based on convolutional neural network |
CN113378556A (en) * | 2020-02-25 | 2021-09-10 | 华为技术有限公司 | Method and device for extracting text keywords |
CN113378556B (en) * | 2020-02-25 | 2023-07-14 | 华为技术有限公司 | Method and device for extracting text keywords |
CN111459927A (en) * | 2020-03-27 | 2020-07-28 | 中南大学 | CNN-LSTM developer project recommendation method |
CN111459927B (en) * | 2020-03-27 | 2022-07-08 | 中南大学 | CNN-LSTM developer project recommendation method |
CN111460100A (en) * | 2020-03-30 | 2020-07-28 | 中南大学 | Criminal legal document and criminal name recommendation method and system |
CN112597311B (en) * | 2020-12-28 | 2023-07-11 | 东方红卫星移动通信有限公司 | Terminal information classification method and system based on low-orbit satellite communication |
CN112597311A (en) * | 2020-12-28 | 2021-04-02 | 东方红卫星移动通信有限公司 | Terminal information classification method and system based on low-earth-orbit satellite communication |
CN112989052A (en) * | 2021-04-19 | 2021-06-18 | 北京建筑大学 | Chinese news text classification method based on combined-convolutional neural network |
CN112989052B (en) * | 2021-04-19 | 2022-03-08 | 北京建筑大学 | Chinese news long text classification method based on combination-convolution neural network |
CN113297364A (en) * | 2021-06-07 | 2021-08-24 | 吉林大学 | Natural language understanding method and device for dialog system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109840279A (en) | Text classification method based on convolutional recurrent neural network | |
CN107967318A (en) | Chinese short text subjective item automatic scoring method and system using LSTM neural networks | |
CN109740154A (en) | A kind of online comment fine granularity sentiment analysis method based on multi-task learning | |
CN106445919A (en) | Sentiment classifying method and device | |
CN110083700A (en) | A kind of enterprise's public sentiment sensibility classification method and system based on convolutional neural networks | |
CN106599933A (en) | Text emotion classification method based on the joint deep learning model | |
CN113033610B (en) | Multi-mode fusion sensitive information classification detection method | |
CN108090099B (en) | Text processing method and device | |
CN110825850B (en) | Natural language theme classification method and device | |
CN109977199A (en) | A reading comprehension method based on an attention pooling mechanism | |
Pacha et al. | Towards self-learning optical music recognition | |
CN112347766A (en) | Multi-label classification method for processing microblog text cognition distortion | |
CN109062958B (en) | Primary school composition automatic classification method based on TextRank and convolutional neural network | |
Fei et al. | Beyond prompting: Making pre-trained language models better zero-shot learners by clustering representations | |
Smitha et al. | Meme classification using textual and visual features | |
Jishan et al. | Natural language description of images using hybrid recurrent neural network | |
CN113033180B (en) | Automatic generation service system for Tibetan reading problem of primary school | |
Kasthuri et al. | An artificial bee colony and pigeon inspired optimization hybrid feature selection algorithm for twitter sentiment analysis | |
CN114925198B (en) | Knowledge-driven text classification method integrating character information | |
Li et al. | Multilingual toxic text classification model based on deep learning | |
Mouri et al. | An empirical study on bengali news headline categorization leveraging different machine learning techniques | |
Alvarado et al. | Detecting Disaster Tweets using a Natural Language Processing technique | |
Rawat et al. | A Systematic Review of Question Classification Techniques Based on Bloom's Taxonomy | |
Alsharhan | Natural Language Generation and Creative Writing A Systematic Review | |
Alhabeeb et al. | An Investigation into Indonesian Students' Opinions on Educational Reforms through the Use of Machine Learning and Sentiment Analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190604 |