CN114842233A - Sequence random network image classification method - Google Patents
Sequence random network image classification method
- Publication number
- CN114842233A (Application CN202110131810.0A)
- Authority
- CN
- China
- Prior art keywords
- layer
- image
- block
- bilstm
- resnet
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of image classification and recognition, and in particular to an image classification method using a sequence random network. To address the shortcomings of existing image classification and retrieval methods, the invention provides a random-network image classification model, called BiLSTM-TDN, that captures one-dimensional deep feature sequences of images. The BiLSTM-TDN image classification model consists of a bidirectional long short-term memory module (BiLSTM) and several Tanh-Dropout blocks. If the target training database consists of airplane images, the method can accurately and automatically identify the airplane type; if it consists of animal images, the method can accurately and automatically identify the animal type.
Description
Technical Field
The invention relates to the field of image classification and identification, in particular to an image classification method using a sequence random network.
Background
Image recognition and classification are used very widely: for example, identifying the type of a car or an animal, or identifying a person from a face image. Today, most systems use deep convolutional neural networks (such as ResNet and DenseNet) to extract image features and then recognize the image content. In recent years, Transformer image recognition models based on the attention mechanism have also appeared. Although these models greatly improve recognition performance, application requirements keep rising, and these methods still struggle to meet practical needs.
Making networks deeper and wider places ever-higher demands on hardware, and training a deep model often takes several days or even tens of days; yet the performance improvement obtained in return is only marginal, or at least falls short of expectations. One fundamental reason the performance of existing network models is limited is that they ignore the correlation between image sequences. A class in a training set usually contains many sample images that are highly similar to one another; the features extracted by a deep network therefore also have strong correlation, and exploiting this correlation can effectively improve classification.
Disclosure of Invention
To address the shortcomings of existing image classification and retrieval methods, the invention provides a random-network image classification model, called BiLSTM-TDN, that captures one-dimensional deep feature sequences of images. The BiLSTM-TDN image classification model consists of a bidirectional long short-term memory module (BiLSTM) and several Tanh-Dropout blocks (TD blocks for short). BiLSTM is a recurrent neural network: at each time step t, the input is fed simultaneously to two long short-term memory (LSTM) networks running in opposite directions, and the output is determined jointly by the LSTMs of both directions. BiLSTM is used here to learn long-term dependencies among the feature sequences. A TD block combines a hyperbolic-tangent (Tanh) activation layer with a Dropout layer. The hyperbolic tangent activation function tanh is expressed as follows:

tanh(x) = sinh(x) / cosh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
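The tanh identity just stated can be checked numerically (a minimal pure-Python sketch, not part of the patent; the function name is ours):

```python
import math

def tanh_from_ratio(x: float) -> float:
    """Compute tanh(x) as sinh(x) / cosh(x), i.e. (e^x - e^-x) / (e^x + e^-x)."""
    return math.sinh(x) / math.cosh(x)

# The ratio form agrees with the standard library's tanh.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(tanh_from_ratio(x) - math.tanh(x)) < 1e-12
```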
where sinh and cosh are the hyperbolic sine and hyperbolic cosine functions, respectively. The Dropout layer is commonly used in deep convolutional networks: during training it randomly retains a subset of the weight set W with retention probability p, i.e., each weight is discarded with probability 1 − p. This is expressed as follows:

y = W|_p * x

where x is the input of the layer, y is its output, and W|_p denotes the randomly retained weights. At classification (inference) time, no weights are dropped and the output is instead scaled by the retention probability p:

y = p * W * x
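The two formulas above can be sketched in plain Python (a minimal illustration with our own function names; the training-mode output is random, while the inference-mode output is the deterministic expectation p * W . x):

```python
import random

def dropout_train(W, x, p, rng):
    """Training mode: keep each weight with retention probability p
    (discard it with probability 1 - p), then apply the kept weights to x."""
    kept = [w if rng.random() < p else 0.0 for w in W]
    return sum(w * xi for w, xi in zip(kept, x))

def dropout_infer(W, x, p):
    """Inference mode: no sampling; scale the full output by p so it
    matches the expected value of the training-mode output."""
    return p * sum(w * xi for w, xi in zip(W, x))

W = [0.5, -1.0, 2.0]
x = [1.0, 1.0, 1.0]
# Inference output is deterministic: p * (W . x) = 0.6 * 1.5 = 0.9
assert abs(dropout_infer(W, x, 0.6) - 0.9) < 1e-12
```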
For classification, a ResNet-101 network model pre-trained on ImageNet is first transfer-learned on the target database, and the features of all images in the database are then extracted. The image features are one-dimensional; the BiLSTM-TDN is trained on these features, and the trained BiLSTM-TDN is finally used for image classification. If the target training database consists of airplane images, the method can accurately and automatically identify the airplane type; if it consists of animal images, the method can accurately and automatically identify the animal type.
Drawings
FIG. 1 is a diagram of a structure of a sequential stochastic network model according to the present invention.
Fig. 2 is a schematic diagram of a TD block structure in a sequential random network.
FIG. 3 is a flow chart of the classification implemented by the method of the present invention.
Detailed Description
FIG. 1 shows the sequence random network BiLSTM-TDN of the present invention. From left to right, the first layer is the sequence data input layer (1, the feature sequence input layer); the second layer is the Dropout layer (2), with retention probability p set to 0.6; the third layer is the BiLSTM layer; the fourth through seventh layers are intermediate layers composed of TD blocks, namely 4-TD block-1, 4-TD block-2, 4-TD block-3, and 4-TD block-4, where the retention probability p in 4-TD block-1 and 4-TD block-2 is set to 0.6, and in 4-TD block-3 and 4-TD block-4 is set to 0.5; the eighth layer is the fully connected layer (5), whose number of nodes equals the number of categories in the training data set; and the ninth layer is the Softmax output layer (6), which performs the image classification. FIG. 2 shows the structure of a TD block in the BiLSTM-TDN; a TD block consists of one Tanh layer and one Dropout layer.
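The nine-layer structure described above can be summarized as a simple configuration table (a hedged plain-Python sketch; the dictionary keys are ours, and `num_classes` is a placeholder for the number of categories in the training set):

```python
num_classes = 10  # placeholder: equals the number of categories in the training set

# Layer-by-layer summary of BiLSTM-TDN as described for FIG. 1.
bilstm_tdn_layers = [
    {"layer": 1, "type": "feature-sequence input"},
    {"layer": 2, "type": "Dropout", "p": 0.6},
    {"layer": 3, "type": "BiLSTM"},
    {"layer": 4, "type": "TD block-1", "p": 0.6},  # each TD block = Tanh + Dropout
    {"layer": 5, "type": "TD block-2", "p": 0.6},
    {"layer": 6, "type": "TD block-3", "p": 0.5},
    {"layer": 7, "type": "TD block-4", "p": 0.5},
    {"layer": 8, "type": "fully connected", "nodes": num_classes},
    {"layer": 9, "type": "Softmax output"},
]

assert len(bilstm_tdn_layers) == 9
```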
The method of the invention is further explained below with reference to the drawings; the specific implementation steps are as follows:
step 1, performing transfer learning on a ResNet-101 model pre-trained on ImageNet: first initializing the ResNet-101 model with the network parameters trained on the large ImageNet database, and then performing transfer learning on the specific target database to obtain a ResNet-101 transfer model;
step 2, extracting one-dimensional features of the images using the ResNet-101 transfer model and combining them into a feature vector database: computing the features of each image in the target database with the ResNet-101 transfer model, each image feature being a 2048-dimensional feature vector; and assembling the image features of the target database into a feature vector database;
step 3, training the BiLSTM-TDN network on the feature vector database;
and step 4, classifying and recognizing images with the trained BiLSTM-TDN network: given an image, extracting its 2048-dimensional feature vector with the ResNet-101 transfer model and inputting the feature vector into the network to identify the category of the image.
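The inference step above can be sketched end-to-end in pure Python (a hedged illustration: the 2048-dimensional feature would come from the ResNet-101 transfer model, which is mocked here with random values, and the classifier weights are random placeholders rather than trained BiLSTM-TDN parameters):

```python
import math
import random

FEATURE_DIM = 2048  # dimension of the ResNet-101 feature vector

def extract_feature(image_id, rng):
    """Placeholder for the ResNet-101 transfer model's 2048-d feature extractor."""
    return [rng.random() for _ in range(FEATURE_DIM)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify(feature, weights):
    """Linear layer followed by softmax; returns the predicted class index."""
    logits = [sum(w * f for w, f in zip(row, feature)) for row in weights]
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)

rng = random.Random(0)
num_classes = 3
weights = [[rng.uniform(-1, 1) for _ in range(FEATURE_DIM)] for _ in range(num_classes)]
feature = extract_feature("img-001", rng)
pred = classify(feature, weights)
assert 0 <= pred < num_classes
```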
Claims (2)
1. An image recognition and classification method, characterized in that it comprises the following steps:
step 1, performing transfer learning on a ResNet-101 model pre-trained on ImageNet: first initializing the ResNet-101 model with the network parameters trained on the large ImageNet database, and then performing transfer learning on a specific target database to obtain a ResNet-101 transfer model;
step 2, extracting one-dimensional features of the images using the ResNet-101 transfer model and combining them into a feature vector database: computing the features of each image in the target database with the ResNet-101 transfer model, each image feature being a 2048-dimensional feature vector; and assembling the image features of the target database into a feature vector database;
step 3, training a sequence random network BiLSTM-TDN on the feature vector database;
and step 4, classifying and recognizing images with the trained sequence random network BiLSTM-TDN: given an image, extracting its 2048-dimensional feature vector with the ResNet-101 transfer model and inputting the feature vector into the network to identify the category of the image.
2. The image recognition and classification method according to claim 1, characterized in that a sequence random network BiLSTM-TDN is designed, in which the first layer is the sequence data input layer (1, the feature sequence input layer); the second layer is the Dropout layer (2), with retention probability p set to 0.6; the third layer is the BiLSTM layer; the fourth through seventh layers are intermediate layers composed of TD blocks, namely 4-TD block-1, 4-TD block-2, 4-TD block-3, and 4-TD block-4, where the retention probability p in 4-TD block-1 and 4-TD block-2 is set to 0.6, and in 4-TD block-3 and 4-TD block-4 is set to 0.5; the eighth layer is the fully connected layer (5), whose number of nodes equals the number of categories in the training data set; the ninth layer is the Softmax output layer (6), which performs the image classification; and the TD block consists of one Tanh layer and one Dropout layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110131810.0A CN114842233A (en) | 2021-02-01 | 2021-02-01 | Sequence random network image classification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110131810.0A CN114842233A (en) | 2021-02-01 | 2021-02-01 | Sequence random network image classification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114842233A true CN114842233A (en) | 2022-08-02 |
Family
ID=82561077
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110131810.0A Pending CN114842233A (en) | 2021-02-01 | 2021-02-01 | Sequence random network image classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114842233A (en) |
- 2021-02-01: CN202110131810.0A patent/CN114842233A/en, active, Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20220802 |