CN109241287A - Text classification model and method based on reinforcement learning and capsule network - Google Patents
Text classification model and method based on reinforcement learning and capsule network Download PDF Info
- Publication number
- CN109241287A (application CN201811109798.8A)
- Authority
- CN
- China
- Prior art keywords
- capsule
- reinforcement learning
- word
- network
- classification model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Abstract
The present invention relates to the technical field of natural language processing and text classification, and more particularly to a text classification model and method based on reinforcement learning and a capsule network. The present invention takes the reinforcement learning algorithm Actor-Critic and the capsule network CapsNet as its basic framework: the capsule network extracts features from the text, and reinforcement learning determines the connections between capsule layers. The innovation of the invention is to introduce reinforcement learning to learn the routing relations between capsule network layers, and to introduce the capsule network to solve the multi-label classification task in text classification models. By exploiting the advantage of capsule networks on multi-label classification tasks, better results are achieved when the model is applied to multi-label text classification; the trial-and-error mechanism of reinforcement learning is used to learn better routing connections.
Description
Technical field
The present invention relates to the technical field of natural language processing and text classification, and more particularly to a text classification model and method based on reinforcement learning and a capsule network.
Background art
Feature learning is a fundamental problem in artificial intelligence, and feature extraction is especially important in natural language processing. Within natural language processing, text classification is a very basic and common process that depends heavily on feature learning. Unlike the image domain, the semantic logic of text is harder to capture and express, which makes classification tasks on text more difficult. The ultimate goal of general artificial intelligence in natural language processing is for machines to understand human language, that is, to understand the semantic information of text so as to carry out assigned tasks. Text classification is a basic task within machine understanding of text semantics and therefore has important research significance. Current text classification models do not perform as well on multi-label text classification tasks as on single-label tasks, whereas text classification models based on capsule networks have a clear advantage in this respect; their routing algorithm is, to some extent, an unsupervised clustering algorithm, and reinforcement learning performs well at clustering.
Mainstream text representation methods can be broadly divided into four classes. 1. Bag-of-words feature models are text representation methods that ignore the order of words in a sentence: each word in the sentence is encoded, and the length of a word's feature vector equals the size of the bag of words. For example, the DAN model proposed by Mohit Iyyer et al. splits a sentence into word tokens and feeds them into a neural network; these tokens do not retain their original positional information. The fastText model proposed by Joulin et al. passes words directly through a lookup table into a neural network model and likewise does not consider word order. 2. Sequence representation models do consider word order, for example the Convolutional Neural Network and the Recurrent Neural Network, but one of their shortcomings is that they do not use sentence structure information. Kim et al. proposed the TextCNN model, which uses the convolution property of a CNN to combine the information of K adjacent words, thereby retaining partial word-order information within the sentence; however, it does not capture the structural information of words, and every word is treated identically regardless of its part of speech. 3. Structural feature models consider sentence structure information. For example, the tree-structured LSTM model proposed by Tai et al. modifies the original chain LSTM into a tree structure following the syntax tree, thereby retaining the structural information of the words in a sentence; the Recursive Autoencoder model proposed by Qian et al. also uses a pre-built syntax tree to construct structural features of sentences. 4. Models based on the attention mechanism: since each constituent of a sentence contributes a different share of its semantics, the attention mechanism assigns relatively high scores to specific words; compared with unmarked words, attention raises the contribution of particular words to the semantic description. Yang et al. proposed the HAN model based on the attention mechanism, which encodes words and then applies attention, improving text classification performance.
The shortcomings of the above prior art are as follows. 1. Bag-of-words models: they do not consider the order structure within a sentence. A text classification model based on bag-of-words collects the words appearing in a sentence and stores their frequencies, but such a model discards much sentence information; from the perspective of information theory, more information can bring better results. 2. Sequence representation models: they use only the information in the sequential order of a sentence, and for relationships between words that span across other words, sequence models have no way of capturing the information; sequence representation models therefore also lose information to some extent. 3. Structural feature models: they use a pre-built syntax tree to construct the structural features of words; although grammatical information is used, this does not capture semantically targeted structure. 4. Attention-based models: they have a clear advantage in capturing correspondences between input and output, but attention-based text models do not consider word-order information, which is a serious drawback for natural language, because word order carries a great deal of information.
Summary of the invention
In order to overcome at least one of the drawbacks of the prior art described above, the present invention provides a text classification model and method based on reinforcement learning and a capsule network, combining a capsule network with reinforcement learning to improve its performance on multi-label text classification.
For multi-label text classification tasks, capsule networks have a clear advantage. First, TextCNN is used to obtain the word order between words and partial structural information; after its features are obtained, the capsule network connects these features and can fuse the word-order and structural information. Within the capsule network, the fused feature information can be parsed by the neural network into multi-label classification results. Reinforcement learning, used as the routing mechanism between capsule network layers, can to some extent obtain a better connection pattern. Therefore, a text classification model combining reinforcement learning and a capsule network can remedy the word-order and structure defects that other models cannot solve.
The technical scheme of the present invention is as follows: the invention takes the reinforcement learning algorithm Actor-Critic and the capsule network CapsNet as its basic framework; the capsule network extracts text features, and reinforcement learning determines the connections between capsule layers. The concrete implementation steps are as follows.
The invention is divided into two parts: the reinforcement learning framework and the capsule network framework.
The reinforcement learning framework comprises:
State: the current state, which mainly contains the environment in which the Agent is situated and the Agent's own state; concretely, it is the connection relationship between two capsule layers, and the states of all capsule-layer connections constitute the state of one step in reinforcement learning.
Action: the action of the Agent, which here is mainly whether a connection exists between capsule layers, or the connection probability.
Reward: the reward obtained by the Agent, generally divided into immediate reward and future reward; for text classification the reward is the classification performance, generally measured with metrics such as F1 and Precision.
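Purely as an illustration of this State/Action/Reward framing, the routing problem can be sketched as a toy environment; the class name, shapes and stand-in loss below are invented for the example and are not the patent's implementation:

```python
import numpy as np

class RoutingEnv:
    """Toy RL framing of capsule-layer routing (illustrative only).

    State  : the matrix of connection weights between two capsule layers.
    Action : modify one connection weight (or connection probability).
    Reward : the negated classification loss under the current weights.
    """

    def __init__(self, n_in=4, n_out=3, seed=0):
        self._rng = np.random.default_rng(seed)
        self.weights = self._rng.random((n_in, n_out))  # State

    def state(self):
        return self.weights.copy()

    def step(self, i, j, delta, loss_fn):
        """Apply one Action (tweak weight i,j) and return the Reward (-loss)."""
        self.weights[i, j] = np.clip(self.weights[i, j] + delta, 0.0, 1.0)
        return -loss_fn(self.weights)


# Stand-in loss: how far each output capsule's incoming weights are from 1.
env = RoutingEnv()
loss = lambda w: float(np.abs(w.sum(axis=0) - 1.0).sum())
r0 = env.step(0, 0, 0.1, loss)
assert env.state().shape == (4, 3) and r0 <= 0
```

In the actual invention the reward would instead be a metric such as F1 or Precision on the classification output.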
The capsule network framework comprises the following steps:
S1. Split the original raw text into words or characters by word segmentation, and convert them into embedding-form words or characters using a lookup table;
S2. Following the TextCNN method, convolve the embedding-form words or characters to obtain the Primary Capsules;
S3. Connect the Primary Capsules, after Routing, with the next Capsule Layer, and then with a Fully Connected Network, outputting the probability of each label;
S4. Modify, via the BP algorithm, the weights of the Fully Connected Layer and the representation of each word in the Embedding layer's lookup table.
In step S1, the detailed process is:
S11. First initialize the lookup table from existing word embeddings, where the embedding depth is 300; words that do not occur in the pretrained embeddings are initialized to 0 or to random numbers between 0 and 1;
S12. Then, by lookup, convert each word or character of the raw text into embedding format; in this way each sentence of the raw text is converted into a matrix of shape long*300.
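The lookup of steps S11-S12 can be illustrated with a minimal sketch; the vocabulary, sample sentence and function names below are invented for the example (only the embedding depth of 300 and the random initialization of unseen words come from the text):

```python
import numpy as np

DIM = 300                                   # embedding depth per step S11
rng = np.random.default_rng(0)

# Toy "pretrained" embeddings; a real table would be loaded from word vectors.
pretrained = {"text": rng.random(DIM), "classification": rng.random(DIM)}
lookup = dict(pretrained)

def embed_sentence(tokens):
    """Convert a tokenized sentence into a (len(tokens), DIM) matrix (S12).

    Tokens absent from the pretrained table get a random vector in [0, 1),
    matching the initialization described in step S11.
    """
    rows = []
    for tok in tokens:
        if tok not in lookup:
            lookup[tok] = rng.random(DIM)   # random init for unseen word
        rows.append(lookup[tok])
    return np.stack(rows)

matrix = embed_sentence(["text", "classification", "capsule"])
assert matrix.shape == (3, DIM)             # the long*300 matrix of S12
```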
In step S2, the detailed process is:
S21. Convolve the long*300 matrix obtained in step S1 with convolution kernels of sizes 3*300, 4*300 and 5*300 respectively to obtain the corresponding feature vectors; the number of kernels of each size is 32;
S22. From the Vector_Size (VS) * 32 features produced in step S21, obtain the Primary Capsules of shape VS*32*16 through a 32*32*16 transition matrix; here the dimension of each Capsule is 16.
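Steps S21-S22 can be illustrated with the following sketch; the sentence length and the random kernel and transition-matrix values are toy stand-ins, and only the kernel sizes 3/4/5*300, the 32 filters per size and the capsule dimension 16 come from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
L, DIM, N_FILTERS, CAP_DIM = 10, 300, 32, 16

sentence = rng.random((L, DIM))             # the long*300 matrix from step S1

def conv_features(x, k, n_filters):
    """Valid 1-D convolution with n_filters random kernels of shape k*DIM (S21)."""
    kernels = rng.standard_normal((n_filters, k, DIM))
    return np.stack([[np.sum(x[i:i + k] * kern)   # one window x one kernel
                      for kern in kernels]
                     for i in range(x.shape[0] - k + 1)])   # (L-k+1, n_filters)

# Kernel heights 3, 4, 5 with 32 filters each, concatenated over positions.
feats = np.concatenate([conv_features(sentence, k, N_FILTERS)
                        for k in (3, 4, 5)])               # (VS, 32)
VS = feats.shape[0]

# S22: a 32 -> 32*16 transition matrix reshaped into primary capsules.
W = rng.standard_normal((N_FILTERS, N_FILTERS * CAP_DIM))
primary_caps = (feats @ W).reshape(VS, N_FILTERS, CAP_DIM)
assert primary_caps.shape == (VS, 32, 16)   # the VS*32*16 Primary Capsules
```

Here VS = (10-3+1) + (10-4+1) + (10-5+1) = 21 windows for a 10-word sentence.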
In step S3, the detailed process is:
S31. Given the VS*32*16 Primary Capsules obtained in S22, set up a 32*16*16 Capsule Layer; 16 filters are used here;
S32. Take the VS*32*32*16 weight-matrix values as the State in reinforcement learning; each action modifies weight values therein;
S33. Compute the 32*16*16 Capsule Layer from the weight matrix, flatten it through a Fully Connected neural network, and then pass it through a Softmax layer to obtain the probability of each label;
S34. Compare with the correct result, take the resulting loss value as the Reward in reinforcement learning, and improve the weight matrix of S32 using the A3C algorithm.
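The patent optimizes the routing weight matrix with the A3C algorithm. As a greatly simplified, single-process stand-in for steps S32-S34 (and for Fig. 1, where the Critic evaluates the current weight matrix from the loss and the Actor randomly modifies weights), the loop can be sketched as follows; the shapes, the stand-in loss and the acceptance rule are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Routing weights between capsule layers: the RL State (a toy stand-in for
# the patent's VS*32*32*16 weight matrix).
weights = rng.random((4, 3))
target = np.full_like(weights, 0.5)        # stand-in "good" routing

def loss_fn(w):
    # Stand-in for the classification loss compared in step S34.
    return float(((w - target) ** 2).sum())

start_loss = loss_fn(weights)

for _ in range(200):
    value = -loss_fn(weights)              # Critic: value of current State
    noise = rng.standard_normal(weights.shape) * 0.05   # Actor: random tweak
    reward = -loss_fn(weights + noise)     # Reward: -loss after the Action
    if reward > value:                     # positive advantage: keep the tweak
        weights = weights + noise

assert loss_fn(weights) < start_loss       # routing weights improved
```

A3C would instead run several such actor-learners asynchronously and update actor and critic networks by gradient, but the State/Action/Reward roles are the same.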
Compared with the prior art, the beneficial effects are: the innovation of the invention is to introduce reinforcement learning to learn the routing relations between capsule network layers, and to introduce the capsule network to solve the multi-label classification task in text classification models. By exploiting the advantage of capsule networks on multi-label classification tasks, better results are achieved when applied to multi-label text classification; the trial-and-error mechanism of reinforcement learning is used to learn better routing connections.
Description of the drawings
Fig. 1 is a schematic diagram of the reinforcement learning part of the present invention.
Fig. 2 is a schematic diagram of the capsule network part of the present invention.
Specific embodiments
The attached figures are for illustrative purposes only and shall not be understood as limiting this patent. To better illustrate this embodiment, certain components in the figures may be omitted, enlarged or reduced, and do not represent the size of the actual product. For those skilled in the art, the omission of some known structures and their descriptions in the figures is understandable. The positional relationships described in the figures are for illustration only and shall not be understood as limiting this patent.
As shown in Fig. 1:
S1: the state is the weight values of the current weight matrix, and the Critic evaluates the value of the current weight matrix according to the finally obtained loss value;
S2: the Policy is the operation of modifying the weight matrix; the Actor randomly chooses the value by which to modify the weight matrix;
S3: according to the operation of S2, a new loss value is obtained.
As shown in Fig. 2:
S1: split the original raw text into words or characters by word segmentation, and convert them into embedding-form words or characters using a lookup table;
S2: following the TextCNN method, convolve the embedding-form words or characters to obtain the Primary Capsules;
S3: connect the Primary Capsules, after Routing, with the next Capsule Layer, and then with a Fully Connected Network, outputting the probability of each label;
S4: modify, via the BP algorithm, the weights of the Fully Connected Layer and the representation of each word in the Embedding layer's lookup table.
Obviously, the above embodiment of the present invention is merely an example for clearly illustrating the present invention and is not a limitation of the embodiments of the present invention. For those of ordinary skill in the art, other variations or changes in different forms may also be made on the basis of the above description. There is no need, and no way, to exhaust all embodiments. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall be included within the protection scope of the claims of the present invention.
Claims (5)
1. A text classification model based on reinforcement learning and a capsule network, characterized by comprising a reinforcement learning framework and a capsule network framework;
the reinforcement learning framework comprises:
State: the current state, mainly containing the environment in which the Agent is situated and the Agent's own state;
Action: the action of the Agent, which here is mainly whether a connection exists between capsule layers, or the connection probability;
Reward: the reward obtained by the Agent, divided into immediate reward and future reward.
2. The method of the text classification model based on reinforcement learning and a capsule network according to claim 1, characterized in that the capsule network framework comprises the following steps:
S1. splitting the original raw text into words or characters by word segmentation, and converting them into embedding-form words or characters using a lookup table;
S2. following the TextCNN method, convolving the embedding-form words or characters to obtain the Primary Capsules;
S3. connecting the Primary Capsules, after Routing, with the next Capsule Layer, and then with a Fully Connected Network, outputting the probability of each label;
S4. modifying, via the BP algorithm, the weights of the Fully Connected Layer and the representation of each word in the Embedding layer's lookup table.
3. The method of the text classification model based on reinforcement learning and a capsule network according to claim 2, characterized in that in step S1 the detailed process is:
S11. first initializing the lookup table from existing word embeddings, where the embedding depth is 300, and words that do not occur are initialized to 0 or to random numbers between 0 and 1;
S12. then, by lookup, converting each word or character of the raw text into embedding format, so that each sentence of the raw text is converted into a matrix of shape long*300.
4. The method of the text classification model based on reinforcement learning and a capsule network according to claim 2, characterized in that in step S2 the detailed process is:
S21. convolving the long*300 matrix obtained in step S1 with convolution kernels of sizes 3*300, 4*300 and 5*300 respectively to obtain the corresponding feature vectors, the number of kernels of each size being 32;
S22. from the Vector_Size (VS) * 32 features produced in step S21, obtaining the Primary Capsules of shape VS*32*16 through a 32*32*16 transition matrix, the dimension of each Capsule here being 16.
5. The method of the text classification model based on reinforcement learning and a capsule network according to claim 4, characterized in that in step S3 the detailed process is:
S31. given the VS*32*16 Primary Capsules obtained in S22, setting up a 32*16*16 Capsule Layer, 16 filters being used here;
S32. taking the VS*32*32*16 weight-matrix values as the State in reinforcement learning, each action modifying weight values therein;
S33. computing the 32*16*16 Capsule Layer from the weight matrix, flattening it through a Fully Connected neural network, and then passing it through a Softmax layer to obtain the probability of each label;
S34. comparing with the correct result, taking the resulting loss value as the Reward in reinforcement learning, and improving the weight matrix of S32 using the A3C algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811109798.8A CN109241287B (en) | 2018-09-21 | 2018-09-21 | Text classification model and method based on reinforcement learning and capsule network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109241287A true CN109241287A (en) | 2019-01-18 |
CN109241287B CN109241287B (en) | 2021-10-15 |
Family
ID=65057494
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811109798.8A Active CN109241287B (en) | 2018-09-21 | 2018-09-21 | Text classification model and method based on reinforcement learning and capsule network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109241287B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110250274A1 (en) * | 2008-09-19 | 2011-10-13 | Shaked Ze Ev | Estriol formulations |
CN106919702A (en) * | 2017-02-14 | 2017-07-04 | Beijing Time Co., Ltd. | Document-based keyword pushing method and device |
JP2018041443A (en) * | 2016-07-14 | 2018-03-15 | Seaside Japan Co., Ltd. | Deep learning artificial neural network-based task provision platform |
CN108170736A (en) * | 2017-12-15 | 2018-06-15 | NARI Group Co., Ltd. | A document rapid-scanning qualitative method based on a recurrent attention mechanism |
Non-Patent Citations (2)
Title |
---|
PER-ARNE ANDERSEN: "Deep Reinforcement Learning using Capsules", Master's thesis, Department of ICT, Faculty of Engineering and Science, University of Agder, 2018 * |
XI OUYANG ET AL.: "Audio-Visual Emotion Recognition with Capsule-like Feature Representation and Model-Based Reinforcement Learning", 2018 FIRST ASIAN CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII ASIA) * |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110059601A (en) * | 2019-04-10 | 2019-07-26 | Xi'an Jiaotong University | An intelligent fault diagnosis method based on multi-feature extraction and fusion |
CN110046671A (en) * | 2019-04-24 | 2019-07-23 | Jilin University | A text classification method based on a capsule network |
CN110188195A (en) * | 2019-04-29 | 2019-08-30 | Suning.com Group Co., Ltd. | A text intent recognition method, apparatus and device based on deep learning |
CN110188195B (en) * | 2019-04-29 | 2021-12-17 | Nanjing Xingyun Digital Technology Co., Ltd. | Text intent recognition method, apparatus and device based on deep learning |
CN110111365A (en) * | 2019-05-06 | 2019-08-09 | Shenzhen University | Deep-learning-based training method and apparatus, and target tracking method and apparatus |
CN112084327B (en) * | 2019-06-14 | 2024-04-16 | International Business Machines Corporation | Classification of sparsely labeled text documents while preserving semantics |
US11455527B2 (en) | 2019-06-14 | 2022-09-27 | International Business Machines Corporation | Classification of sparsely labeled text documents while preserving semantics |
CN112084327A (en) * | 2019-06-14 | 2020-12-15 | International Business Machines Corporation | Classification of sparsely labeled text documents while preserving semantics |
CN110263855B (en) * | 2019-06-20 | 2021-12-14 | Shenzhen University | Method for classifying images using common-basis capsule projection |
CN110263855A (en) * | 2019-06-20 | 2019-09-20 | Shenzhen University | A method for image classification using common-basis capsule projection |
CN110866113A (en) * | 2019-09-30 | 2020-03-06 | Zhejiang University | Text classification method based on a sparse self-attention mechanism for fine-tuning a BERT model |
CN110866113B (en) * | 2019-09-30 | 2022-07-26 | Zhejiang University | Text classification method based on a sparse self-attention mechanism for fine-tuning a BERT model |
CN110570425A (en) * | 2019-10-18 | 2019-12-13 | Beijing Institute of Technology | Lung nodule analysis method and device based on a deep reinforcement learning algorithm |
CN110570425B (en) * | 2019-10-18 | 2023-09-08 | Beijing Institute of Technology | Pulmonary nodule analysis method and device based on a deep reinforcement learning algorithm |
CN111046181B (en) * | 2019-12-05 | 2023-04-07 | Guizhou University | Actor-critic method for automatic classification induction |
CN111046181A (en) * | 2019-12-05 | 2020-04-21 | Guizhou University | Actor-critic algorithm for automatic classification induction |
CN111274425A (en) * | 2020-01-20 | 2020-06-12 | Ping An Technology (Shenzhen) Co., Ltd. | Medical image classification method, apparatus, medium and electronic device |
CN111274425B (en) * | 2020-01-20 | 2023-08-22 | Ping An Technology (Shenzhen) Co., Ltd. | Medical image classification method, apparatus, medium and electronic device |
CN111460160A (en) * | 2020-04-02 | 2020-07-28 | Fudan University | Event clustering method for streaming text data based on reinforcement learning |
CN111460160B (en) * | 2020-04-02 | 2023-08-18 | Fudan University | Event clustering method for streaming text data based on reinforcement learning |
CN111402014A (en) * | 2020-06-04 | 2020-07-10 | Jiangsu Institute of Quality and Standardization | Capsule-network-based e-commerce defective product prediction method |
CN113190681A (en) * | 2021-03-02 | 2021-07-30 | Northeastern University | Fine-grained text classification method based on capsule network masked memory attention |
CN113190681B (en) * | 2021-03-02 | 2023-07-25 | Northeastern University | Fine-grained text classification method based on capsule network masked memory attention |
Also Published As
Publication number | Publication date |
---|---|
CN109241287B (en) | 2021-10-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109241287A (en) | Text classification model and method based on reinforcement learning and capsule network | |
CN112199511B (en) | Cross-language multi-source vertical domain knowledge graph construction method | |
Liu et al. | Implicit discourse relation classification via multi-task neural networks | |
CN105930314B (en) | Text summary generation system and method based on an encoder-decoder deep neural network | |
CN108182295A (en) | An enterprise knowledge graph attribute extraction method and system | |
Dohare et al. | Text summarization using abstract meaning representation | |
CN102591988B (en) | Short text classification method based on semantic graphs | |
CN107967261A (en) | Interactive question semanteme understanding method in intelligent customer service | |
CN107704892A (en) | A kind of commodity code sorting technique and system based on Bayesian model | |
CN103123618B (en) | Text similarity acquisition methods and device | |
CN103559199B (en) | Method for abstracting web page information and device | |
CN109840322A (en) | A cloze-style reading comprehension analysis model and method based on reinforcement learning | |
CN111538848A (en) | Knowledge representation learning method fusing multi-source information | |
CN109359297A (en) | A kind of Relation extraction method and system | |
CN110197284A (en) | A fake address recognition method, apparatus and device | |
CN107357785A (en) | Topic feature word extraction method and system, and sentiment polarity determination method and system | |
CN108932278A (en) | Interactive method and system based on semantic frame | |
WO2022241913A1 (en) | Heterogeneous graph-based text summarization method and apparatus, storage medium, and terminal | |
CN111143553A (en) | Method and system for identifying specific information of real-time text data stream | |
CN109920476A (en) | miRNA-disease association prediction method based on a chaotic game algorithm | |
CN114490953B (en) | Method for training event extraction model, method, device and medium for extracting event | |
CN107908757A (en) | Website classification method and system | |
CN110334340B (en) | Semantic analysis method and device based on rule fusion and readable storage medium | |
CN111814450A (en) | Aspect-level emotion analysis method based on residual attention | |
Gao et al. | A hybrid GCN and RNN structure based on attention mechanism for text classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||