CN114842233A - Sequence random network image classification method - Google Patents

Info

Publication number
CN114842233A
Authority
CN
China
Prior art keywords
layer
image
block
bilstm
resnet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110131810.0A
Other languages
Chinese (zh)
Inventor
李朝荣
覃凤清
曾安平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yibin University
Original Assignee
Yibin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yibin University
Priority to CN202110131810.0A
Publication of CN114842233A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of image classification and recognition, and in particular to an image classification method using a sequence random network. To address the shortcomings of conventional image classification and retrieval methods, the invention provides a random network image classification model, called BiLSTM-TDN, that captures sequences of one-dimensional deep image features. The BiLSTM-TDN image classification model consists of a bidirectional long short-term memory (BiLSTM) module and several Tanh-Dropout blocks. If the target training database consists of airplane images, the method can accurately and automatically identify the airplane type; if the target training database consists of animal images, the method can accurately and automatically identify the animal type.

Description

Sequence random network image classification method
Technical Field
The invention relates to the field of image classification and identification, in particular to an image classification method using a sequence random network.
Background
Image recognition and classification are widely used, for example to identify the type of a car, the species of an animal, or a person's identity from a face. Most current approaches use deep convolutional neural network models (such as ResNet and DenseNet) to extract image features and then recognize the image content. In recent years, Transformer image recognition models based on the attention mechanism have also appeared. Although these models greatly improve recognition performance, they still struggle to meet practical needs as application requirements keep rising.
As networks become deeper and wider, models place ever higher demands on hardware, and training a deep model usually takes several days or even dozens of days; yet the performance gained in exchange is only marginal, or at least falls short of expectations. One fundamental reason for the limited performance of existing network models is that they ignore the correlation between images in a sequence. A training set usually contains many sample images per class, and these samples are highly similar; the features that a deep network extracts from them are also strongly correlated, and exploiting this correlation can effectively improve classification.
Disclosure of Invention
To address the shortcomings of existing image classification and retrieval methods, the invention provides a random network image classification model, called BiLSTM-TDN, that captures sequences of one-dimensional deep image features. The BiLSTM-TDN image classification model consists of a bidirectional long short-term memory (BiLSTM) module and several Tanh-Dropout blocks (TD blocks for short). BiLSTM is a recurrent neural network: at each time step t, the input is fed simultaneously to two long short-term memory (LSTM) networks running in opposite directions, and the output is determined jointly by the LSTMs in both directions. BiLSTM is used here to learn long-term dependencies among the feature sequences. A TD block combines a hyperbolic tangent (Tanh) activation layer with a Dropout layer. The hyperbolic tangent activation function tanh is expressed as follows:
$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
where sinh and cosh are the hyperbolic sine and hyperbolic cosine functions, respectively. The Dropout layer is commonly used in deep convolutional networks. During training, it randomly retains a subset of the weight set W, keeping each weight with retention probability p; that is, a portion of the weights is discarded with probability 1-p. This is expressed as follows:
$$y = W|_{p}\, x$$
where x is the input to the layer and y is the output of the layer. During classification, all weights are used and the output is multiplied by the probability p, which is expressed as follows:
$$y = p\, W x$$
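The two Dropout formulas above can be checked numerically. The following is a minimal sketch in plain Python, not the patent's implementation: the function names (`tanh`, `matvec`, `dropout_train`, `dropout_classify`) and the use of nested lists for W are assumptions made for illustration. It keeps each weight with probability p during training and scales the full output by p during classification, so that the classification output equals the expected training output.

```python
import math
import random


def tanh(x):
    """Hyperbolic tangent written as sinh(x) / cosh(x), as in the text."""
    return math.sinh(x) / math.cosh(x)


def matvec(W, x):
    """Plain matrix-vector product W * x for a list-of-rows matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dropout_train(W, x, p, rng):
    """Training mode: keep each weight with retention probability p (y = W|p * x)."""
    kept = [[w if rng.random() < p else 0.0 for w in row] for row in W]
    return matvec(kept, x)


def dropout_classify(W, x, p):
    """Classification mode: use all weights and scale the output by p (y = p * W * x)."""
    return [p * v for v in matvec(W, x)]
```

For example, with `W = [[1.0, 2.0], [3.0, 4.0]]`, `x = [1.0, 1.0]`, and `p = 0.6`, averaging `dropout_train` over many random draws approaches the value returned by `dropout_classify`, which is why the scaling by p is applied at classification time.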
During classification, a ResNet-101 network model pre-trained on ImageNet is first transfer-learned on the target database, and the features of all images in the database are then extracted. Each image feature is a one-dimensional vector. BiLSTM-TDN is then trained on these features, and the trained BiLSTM-TDN is finally used for image classification. If the target training database consists of airplane images, the method can accurately and automatically identify the airplane type; if the target training database consists of animal images, the method can accurately and automatically identify the animal type.
Drawings
FIG. 1 is a structural diagram of the sequence random network model of the present invention.
FIG. 2 is a schematic diagram of the TD block structure in the sequence random network.
FIG. 3 is a flow chart of classification using the method of the present invention.
Detailed Description
FIG. 1 shows the sequence random network BiLSTM-TDN of the present invention. From left to right, the first layer is the sequence data input layer (1, the feature sequence input layer); the second layer is a Dropout layer (2) with the retention probability p set to 0.6; the third layer is the BiLSTM layer; the fourth to seventh layers are intermediate layers composed of TD blocks (4), namely TD block-1, TD block-2, TD block-3, and TD block-4, where the retention probability p is set to 0.6 in TD block-1 and TD block-2 and to 0.5 in TD block-3 and TD block-4; the eighth layer is a fully connected layer (5) whose number of nodes equals the number of categories in the training data set; the ninth layer is a Softmax output layer (6), which performs the image classification. FIG. 2 shows the structure of a TD block in BiLSTM-TDN: a TD block consists of one Tanh layer and one Dropout layer.
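The nine-layer arrangement described for FIG. 1 can be written down as a small layer table, together with a classification-time pass through layers four to nine. This is a sketch in plain Python under stated assumptions, not the patent's code: the names `Layer`, `bilstm_tdn_layers`, `td_block_classify`, and `classify_head` are hypothetical, the BiLSTM itself is not implemented (its output vector `h` is taken as given), and Dropout inside a TD block is assumed to scale the activations by p at classification time.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Layer:
    index: int                         # position in the network, left to right
    kind: str                          # layer type as named in the description
    keep_prob: Optional[float] = None  # retention probability p, where Dropout is used


def bilstm_tdn_layers() -> List[Layer]:
    """Layer order and retention probabilities as listed in the description."""
    return [
        Layer(1, "feature sequence input"),
        Layer(2, "Dropout", 0.6),
        Layer(3, "BiLSTM"),
        Layer(4, "TD block-1 (Tanh + Dropout)", 0.6),
        Layer(5, "TD block-2 (Tanh + Dropout)", 0.6),
        Layer(6, "TD block-3 (Tanh + Dropout)", 0.5),
        Layer(7, "TD block-4 (Tanh + Dropout)", 0.5),
        Layer(8, "fully connected (nodes = number of classes)"),
        Layer(9, "Softmax output"),
    ]


def td_block_classify(h, keep_prob):
    """Tanh followed by Dropout in classification mode (assumed: scale by p)."""
    return [keep_prob * math.tanh(v) for v in h]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def classify_head(h, W, b):
    """Pass one BiLSTM output vector h through layers 4 to 9 (classification mode)."""
    for layer in bilstm_tdn_layers()[3:7]:  # the four TD blocks
        h = td_block_classify(h, layer.keep_prob)
    logits = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
    return softmax(logits)
```

Here `W` has one row per class and `b` one bias per class, matching the statement that the fully connected layer has as many nodes as there are categories; the returned list contains one probability per class.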
The method of the invention is further explained below with reference to the drawings. The specific implementation steps are as follows:
Step 1, performing transfer learning on a ResNet-101 model pre-trained on ImageNet: first, the ResNet-101 model is initialized with network parameters trained on the large ImageNet database, and transfer learning is then performed on a specific target database to obtain a ResNet-101 transfer model;
Step 2, extracting one-dimensional features of the images with the ResNet-101 transfer model and combining them into a feature vector database: the ResNet-101 transfer model is used to compute the features of each image in the target database, each image feature being a 2048-dimensional feature vector; the image features of the target database are then assembled into a feature vector database;
Step 3, training the BiLSTM-TDN network on the feature vector database;
Step 4, classifying and recognizing images with the trained BiLSTM-TDN network: given an image, its 2048-dimensional feature vector is extracted with the ResNet-101 transfer model, and the feature vector is input into the network to identify the category of the image.
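The data flow of Steps 2 and 4 can be sketched as follows. This is an illustrative skeleton in plain Python, not the patent's implementation: `extract` stands in for the ResNet-101 transfer model and `predict` for the trained BiLSTM-TDN network, and both are hypothetical callables supplied by the caller. Only the 2048-dimensional feature size is taken from the text.

```python
from typing import Callable, List, Sequence, Tuple

FEATURE_DIM = 2048  # feature vector size stated in Step 2

Feature = List[float]


def build_feature_database(
    images: Sequence[object],
    labels: Sequence[int],
    extract: Callable[[object], Feature],
) -> List[Tuple[Feature, int]]:
    """Step 2: compute one feature vector per image and pair it with its label."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    database = []
    for image, label in zip(images, labels):
        feature = extract(image)  # hypothetical stand-in for the ResNet-101 transfer model
        if len(feature) != FEATURE_DIM:
            raise ValueError(f"expected {FEATURE_DIM} features, got {len(feature)}")
        database.append((feature, label))
    return database


def classify_image(
    image: object,
    extract: Callable[[object], Feature],
    predict: Callable[[Feature], int],
) -> int:
    """Step 4: extract the feature vector of one image and pass it to the classifier."""
    feature = extract(image)
    if len(feature) != FEATURE_DIM:
        raise ValueError(f"expected {FEATURE_DIM} features, got {len(feature)}")
    return predict(feature)  # hypothetical stand-in for the trained BiLSTM-TDN
```

Steps 1 and 3 (transfer learning and network training) are outside this sketch; they would produce the `extract` and `predict` callables that the two functions consume.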

Claims (2)

1. An image recognition and classification method, characterized in that it comprises the following steps:
Step 1, performing transfer learning on a ResNet-101 model pre-trained on ImageNet: first, the ResNet-101 model is initialized with network parameters trained on the large ImageNet database, and transfer learning is then performed on a specific target database to obtain a ResNet-101 transfer model;
Step 2, extracting one-dimensional features of the images with the ResNet-101 transfer model and combining them into a feature vector database: the ResNet-101 transfer model is used to compute the features of each image in the target database, each image feature being a 2048-dimensional feature vector; the image features of the target database are then assembled into a feature vector database;
Step 3, training the sequence random network BiLSTM-TDN on the feature vector database;
Step 4, classifying and recognizing images with the trained sequence random network BiLSTM-TDN: given an image, its 2048-dimensional feature vector is extracted with the ResNet-101 transfer model, and the feature vector is input into the network to identify the category of the image.
2. The image recognition and classification method according to claim 1, characterized in that a sequence random network BiLSTM-TDN is designed, in which the first layer is the sequence data input layer (1, the feature sequence input layer); the second layer is a Dropout layer (2) with the retention probability p set to 0.6; the third layer is the BiLSTM layer; the fourth to seventh layers are intermediate layers composed of TD blocks (4), namely TD block-1, TD block-2, TD block-3, and TD block-4, where the retention probability p is set to 0.6 in TD block-1 and TD block-2 and to 0.5 in TD block-3 and TD block-4; the eighth layer is a fully connected layer (5) whose number of nodes equals the number of categories in the training data set; the ninth layer is a Softmax output layer (6), which performs the image classification; and the TD block consists of one Tanh layer and one Dropout layer.
CN202110131810.0A 2021-02-01 2021-02-01 Sequence random network image classification method Pending CN114842233A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110131810.0A CN114842233A (en) 2021-02-01 2021-02-01 Sequence random network image classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110131810.0A CN114842233A (en) 2021-02-01 2021-02-01 Sequence random network image classification method

Publications (1)

Publication Number Publication Date
CN114842233A (en) 2022-08-02

Family

ID=82561077

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110131810.0A Pending CN114842233A (en) 2021-02-01 2021-02-01 Sequence random network image classification method

Country Status (1)

Country Link
CN (1) CN114842233A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20220802