CN111488907B - Robust image recognition method based on dense PCANet - Google Patents
Robust image recognition method based on dense PCANet
- Publication number
- CN111488907B CN202010147376.0A
- Authority
- CN
- China
- Prior art keywords
- feature
- dense
- image
- steps
- atlas
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
A robust image recognition method based on dense PCANet comprises two stages: robust feature extraction and nearest-neighbor classification based on the chi-square distance. The robust feature extraction stage uses dense connection of feature maps and dense coding of pattern maps. Dense connection means that the features output by all convolution layers are combined to form wider convolution-layer features; dense coding means that, when convolution layers are used for pattern coding, a smaller jump amplitude is used so that the pattern maps reflect the correlation between the feature maps as much as possible. The classification stage comprises the following steps: step 1, computing the distance from the image to be recognized to each training image in the high-dimensional histogram feature space based on the chi-square distance; step 2, taking the class label of the training sample with the smallest distance as the class label of the image to be recognized. The method can effectively handle occlusion, illumination change, resolution difference and other variations in the image to be recognized, thereby effectively improving the recognition rate for shifted images.
Description
Technical Field
The invention relates to the fields of image processing and pattern recognition, and in particular to robust image recognition when there is a large difference between the image to be recognized and the training images; it is mainly used for processing and recognizing real-world images.
Background
Recently, in the fields of computer vision and image recognition, deep neural networks (Deep Neural Network, DNN), represented by convolutional neural networks (Convolutional Neural Networks, CNN), have achieved great success. On some publicly available datasets, the classification ability of leading-edge deep learning methods even exceeds that of humans, for example: face verification accuracy on the LFW face database, image classification accuracy on ImageNet, and handwritten digit recognition accuracy on MNIST. In practice, however, the image to be recognized often differs greatly in "distribution" or "structure" from the training images, which causes DNN to make large-scale recognition errors; this phenomenon is called "Covariate Shift" in the field of deep learning.
Disclosure of Invention
In order to overcome the defect of a low image recognition rate caused by covariate shift in existing image recognition methods, the invention provides a robust image recognition method based on Dense PCANet (DPCANet). DPCANet can effectively solve the recognition problems caused by covariate shift, and can greatly improve image recognition performance in particular when the images to be recognized exhibit large shifts such as occlusion, illumination change and resolution difference.
The technical solution adopted by the invention to solve the above technical problem is as follows:
a robust image recognition method based on dense PCANet, comprising the steps of:
Step 1: select J images A = {A_1, …, A_J} as the training set, with corresponding class labels; let Y = {Y_1, …, Y_K} be the set of K images to be recognized, i.e., the test set. Here each A_j and Y_k denotes an image over the real numbers with C_0 ∈ {1, 3} channels and size m×n;
wherein the mean of each feature block is subtracted, the b-th (b ∈ {1, 2, …, mn}) feature block of size k×k is extracted from the c-th channel of the feature map, and vec(·) denotes the operation of stretching a matrix into a column vector;
Step 4: if the stage flag indicates that the network is in the testing stage, jump to step 7; otherwise, execute the next step;
Step 5: compute the principal directions of the mean-removed feature blocks, where the i′-th principal direction is the i′-th eigenvector of their covariance matrix, with corresponding eigenvalue λ_i′ and the eigenvalues sorted in descending order;
Step 7: compute the feature map set X^(l+1) of the (l+1)-th convolution layer as follows: 7.1) project the mean-removed feature blocks onto the principal directions; 7.2) reorganize the elements of the projection result into the feature map set X^(l+1). Here, the row/column subscript expression denotes the column vector formed by rows a through b of column c of a matrix, a % b denotes the remainder of a divided by b, ⌊a⌋ denotes rounding the real number a down, and mat_{m×n}(v) denotes rearranging an arbitrary column vector v of length mn into an m×n matrix;
Step 9: let l = l + 1 and repeat steps 3 to 8 above until l = L, where L denotes the predetermined maximum number of convolution layers;
Step 10: apply dense coding to the feature map set F to obtain the pattern map set P = {P_{i,β}}, i = 1, …, N; β = 1, …, B, where P_{i,β} (β ∈ {1, …, B}) denotes the β-th pattern map of the i-th sample and is obtained from the feature maps in the feature map subset F_i; T denotes the number of channels participating in the encoding of a single pattern map; τ (1 ≤ τ ≤ T) is the step size between the groups of feature maps selected for consecutive pattern maps; and USF(·) denotes the unit step function (Unit Step Function, USF), which binarizes the input value by comparison with 0;
Step 11: extract histogram features H from the pattern map set P: H = [H_i], i = 1, …, N, where H_i = [H_{i,1}, …, H_{i,B}]^T and H_{i,β} = Qhist(P_{i,β}); Qhist(P_{i,β}) denotes dividing the pattern map P_{i,β} into Q blocks and extracting a histogram from each block, each histogram using 2^T bins, i.e., counting, within each block, the frequency with which the code values of the pattern map fall into each of the 2^T bins;
Step 14: compute the metric matrix M = [M_{i,j}], i = 1, …, J; j = 1, …, K, where M_{i,j} is the chi-square distance between the histogram feature of the i-th training image and that of the j-th test image: M_{i,j} = Σ_{d=1}^{D} (H^A_i(d) − H^Y_j(d))^2 / (H^A_i(d) + H^Y_j(d)), wherein D denotes the length of the histogram feature vectors H^A_i and H^Y_j, H^A_i(d) denotes the d-th element of H^A_i, and H^Y_j(d) denotes the d-th element of H^Y_j;
Step 15: compute the class label Id = [Id_i], i = 1, …, K, of each sample in the test set Y: Id_i is the class label of the training sample indexed by minIndx(M_i), wherein M_i denotes the i-th column vector of the metric matrix M and minIndx(·) returns the index of the smallest element of M_i.
Further, in step 7, the feature map set X^(l+1) of the (l+1)-th convolution layer is calculated as follows:
7.2) reorganize the elements of the projection result into the feature map set X^(l+1), with c = j % C_{l+1}; here, the row/column subscript expression denotes the column vector formed by rows a through b of column c of a matrix, a % b denotes the remainder of a divided by b, ⌊a⌋ denotes rounding the real number a down, and mat_{m×n}(v) denotes rearranging an arbitrary column vector v of length mn into an m×n matrix.
The technical conception of the invention is as follows: when there are large shifts such as occlusion, illumination change and resolution difference between the images to be recognized and the training-set images, the recognition performance of existing neural network models often drops sharply, and PCANet can handle these problems relatively well. However, PCANet does not fully exploit the learned features: (1) PCANet uses only the feature maps output by the last convolution layer to generate the subsequent pattern maps and histogram features; (2) when performing pattern coding, PCANet uses a large jump and cannot fully exploit the correlation between the feature maps. To solve these problems, the invention, inspired by DenseNet, introduces dense connection and dense coding into the PCANet network model so as to enrich the features extracted by PCANet as much as possible and thereby improve its robustness. Dense connection means: combining the features output by all convolution layers to form wider convolution-layer features. Dense coding means: when using convolution layers for pattern coding, a smaller jump amplitude is used so that the pattern maps reflect the correlation between the feature maps as much as possible.
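By way of illustration only, the following minimal NumPy sketch shows the dense-connection idea under the assumption that each convolution stage yields a stack of feature maps of equal spatial size; the function name and array shapes are illustrative, not part of the original disclosure.

```python
import numpy as np

def dense_connect(per_layer_maps):
    """Union of the feature maps produced by every convolution layer.

    per_layer_maps: list of arrays, each of shape (n_maps_l, m, n).
    Plain PCANet would keep only the last entry; dense connection keeps all.
    """
    return np.concatenate(per_layer_maps, axis=0)

# Example: two stages with 4 and 8 maps of size 32x32 give 12 maps in total.
layer1 = np.random.rand(4, 32, 32)
layer2 = np.random.rand(8, 32, 32)
F_i = dense_connect([layer1, layer2])
print(F_i.shape)  # (12, 32, 32)
```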
The beneficial effects of the invention are mainly as follows: the method can more effectively handle occlusion, illumination change, resolution difference and other variations in the image to be recognized, thereby effectively improving the recognition rate for shifted images.
Drawings
Fig. 1 shows the feature map extraction process of the dense PCANet of the invention, in which the convolution operator corresponds to step 7 of the disclosure, the union operator performs dense connection on the feature maps, and the block-histogram operator extracts the block histogram features of the pattern maps, corresponding to step 11 of the disclosure;
FIG. 2 is a classification process of dense PCANet according to the present invention;
FIG. 3 shows test-set and training-set samples from the AR face database, where (a) shows test set I samples, (b) shows test set II samples, (c) shows test set III samples, and (d) shows training-set samples;
FIG. 4 is a process for extracting feature blocks from a feature map, where (a) is the original feature map, (b) is boundary zero padding, (c) is feature block selection, and (d) is the selected multi-channel feature block;
FIG. 5(a) is a schematic diagram of the vec(·) operator stretching a matrix into a column vector, and FIG. 5(b) is a schematic diagram of the mat_{m×n}(·) operator rearranging a column vector back into a matrix;
Fig. 6 shows the pattern maps generated by four networks: (a) PCANet, (b) DPCANet-1, (c) DPCANet-2, (d) DPCANet-3, where DPCANet-1 denotes DPCANet that applies only dense connection to the feature maps, DPCANet-2 denotes DPCANet that applies only dense coding to the pattern maps, and DPCANet-3 denotes DPCANet that applies both dense connection to the feature maps and dense coding to the pattern maps.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to Figs. 1 to 6, a robust image recognition method based on Dense PCANet (DPCANet) comprises the following steps:
Step 1: select J images A = {A_1, …, A_J} as the training set, with corresponding class labels; let Y = {Y_1, …, Y_K} be the set of K images to be recognized, i.e., the test set. Here each A_j and Y_k denotes an image over the real numbers with C_0 ∈ {1, 3} channels and size m×n; specifically, C_0 = 1 denotes a grayscale image and C_0 = 3 denotes an RGB image. Fig. 3 shows training-set samples from the AR face database and three sample subsets of images to be recognized;
wherein the mean of each feature block is subtracted, the b-th (b ∈ {1, 2, …, mn}) feature block of size k×k is extracted from the c-th channel of the feature map, and vec(·) denotes the operation of stretching a matrix into a column vector. Here, the size of the selected feature blocks, i.e., the size of the learned PCA filters, is typically k = 3 or k = 5. When selecting the feature blocks, note that the number of feature blocks selected from a single feature map on a single channel should equal the size m×n of that feature map; to achieve this, feature blocks are selected downward and rightward at an interval of 1, and the boundaries of the feature map are padded with 0, as shown in Fig. 4(a)-(b). Fig. 5(a) gives an example of the vec(·) operation;
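A minimal NumPy sketch of this block-extraction step is given below for illustration; the stride-1 scan, zero padding and per-block mean removal follow the description above, while the function name and the column-wise vec ordering are assumptions.

```python
import numpy as np

def extract_blocks(feature_map, k=3):
    """Return the (m*n) x (k*k) matrix of vectorised, mean-removed k x k blocks.

    feature_map: (m, n) single-channel map. The map is zero-padded so that
    exactly m*n blocks are taken with stride 1, and each block has its own
    mean subtracted; vec(.) is taken here as a column-wise stretch.
    """
    m, n = feature_map.shape
    pad = k // 2
    padded = np.pad(feature_map, pad, mode="constant")   # boundary zero padding
    blocks = np.empty((m * n, k * k))
    for r in range(m):
        for c in range(n):
            patch = padded[r:r + k, c:c + k]
            v = patch.flatten(order="F")                 # vec(.): stretch block
            blocks[r * n + c] = v - v.mean()             # remove the block mean
    return blocks
```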
Step 4: if the stage flag indicates that the network is in the testing stage, jump to step 7; otherwise, execute the next step;
Step 5: compute the principal directions of the mean-removed feature blocks, where the i′-th principal direction is the i′-th eigenvector of their covariance matrix, with corresponding eigenvalue λ_i′ and the eigenvalues sorted in descending order;
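For illustration, a minimal sketch of this step, assuming the mean-removed feature blocks of all training images have been stacked row-wise into a single matrix (the normalization by the number of blocks is an assumption):

```python
import numpy as np

def pca_filters(blocks, num_filters):
    """blocks: (num_blocks, k*k) mean-removed feature blocks pooled over the
    training images; returns the num_filters leading eigenvectors of their
    covariance matrix (eigenvalues sorted in descending order)."""
    cov = blocks.T @ blocks / blocks.shape[0]       # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # descending order
    return eigvecs[:, order[:num_filters]]          # principal directions W
```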
Step 7: compute the feature map set X^(l+1) of the (l+1)-th convolution layer as follows: 7.1) project the mean-removed feature blocks onto the principal directions; 7.2) reorganize the elements of the projection result into the feature map set X^(l+1). Here, the row/column subscript expression denotes the column vector formed by rows a through b of column c of a matrix, a % b denotes the remainder of a divided by b, ⌊a⌋ denotes rounding the real number a down, and mat_{m×n}(v) denotes rearranging an arbitrary column vector v of length mn into an m×n matrix; Fig. 5(b) shows an example of the mat_{m×n}(·) operation;
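A possible NumPy sketch of steps 7.1)-7.2) for one channel of one image is shown below; it assumes the block ordering produced by the extraction sketch above (row-major over pixel positions), which stands in for the index bookkeeping of step 7.2).

```python
import numpy as np

def convolve_with_filters(blocks, W, m, n):
    """blocks: (m*n, k*k) mean-removed blocks of one channel of one image;
    W: (k*k, C_next) PCA filters. Projecting every block onto every filter
    and refolding each projection column with mat_{m x n} yields C_next
    feature maps of size m x n (the PCA 'convolution' of step 7)."""
    proj = blocks @ W                                           # (m*n, C_next)
    maps = [proj[:, j].reshape(m, n) for j in range(W.shape[1])]  # mat_{m x n}(.)
    return np.stack(maps)                                       # (C_next, m, n)
```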
Step 9: let l = l + 1 and repeat steps 3 to 8 above until l = L, where L denotes the predetermined maximum number of convolution layers; usually L ∈ {2, 3} can be taken;
Step 10: apply dense coding to the feature map set F to obtain the pattern map set P = {P_{i,β}}, i = 1, …, N; β = 1, …, B, where P_{i,β} (β ∈ {1, …, B}) denotes the β-th pattern map of the i-th sample and is obtained from the feature maps in the feature map subset F_i; T denotes the number of channels participating in the encoding of a single pattern map, typically T = 8; τ (1 ≤ τ ≤ T) controls the step size between the groups of feature maps selected for consecutive pattern maps, typically τ = T/2; and USF(·) denotes the unit step function (Unit Step Function, USF), which binarizes the input value by comparison with 0;
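The following sketch illustrates the dense coding of step 10 under one plausible reading of the grouping: T consecutive feature maps are binarised with the unit step function (taken here as 1 for non-negative inputs) and packed into an integer code, and consecutive groups start τ maps apart; the exact index arithmetic of the patent may differ.

```python
import numpy as np

def dense_encode(F_i, T=8, tau=4):
    """F_i: (num_maps, m, n) feature maps of one sample (the dense-connection
    output). Returns the (B, m, n) pattern maps; tau < T gives overlapping,
    'dense' coding, while tau = T corresponds to plain PCANet."""
    num_maps = F_i.shape[0]
    patterns = []
    for start in range(0, num_maps - T + 1, tau):
        group = F_i[start:start + T]                  # T maps per pattern map
        bits = (group >= 0).astype(np.int64)          # USF: compare with 0
        weights = 2 ** np.arange(T).reshape(T, 1, 1)  # 2^(t-1) weighting
        patterns.append((bits * weights).sum(axis=0))  # code values in [0, 2^T)
    return np.stack(patterns)
```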
Step 11: extract histogram features H from the pattern map set P: H = [H_i], i = 1, …, N, where H_i = [H_{i,1}, …, H_{i,B}]^T and H_{i,β} = Qhist(P_{i,β}); Qhist(P_{i,β}) denotes dividing the pattern map P_{i,β} into Q blocks and extracting a histogram from each block, each histogram using 2^T bins, i.e., counting, within each block, the frequency with which the code values of the pattern map fall into each of the 2^T bins;
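For step 11, a minimal sketch of the block-histogram operator Qhist is given below; splitting the pattern map into Q horizontal strips is an assumed block layout, since the text only fixes the number of blocks Q and the number of histogram bins 2^T.

```python
import numpy as np

def qhist(pattern_map, Q=8, T=8):
    """Split one pattern map into Q blocks (here: horizontal strips) and
    return the concatenated 2^T-bin histograms of the code values."""
    hists = []
    for block in np.array_split(pattern_map, Q, axis=0):
        h, _ = np.histogram(block, bins=2 ** T, range=(0, 2 ** T))
        hists.append(h)
    return np.concatenate(hists)   # feature vector of length Q * 2^T
```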
Step 14: compute the metric matrix M = [M_{i,j}], i = 1, …, J; j = 1, …, K, where M_{i,j} is the chi-square distance between the histogram feature of the i-th training image and that of the j-th test image: M_{i,j} = Σ_{d=1}^{D} (H^A_i(d) − H^Y_j(d))^2 / (H^A_i(d) + H^Y_j(d)), wherein D denotes the length of the histogram feature vectors H^A_i and H^Y_j, H^A_i(d) denotes the d-th element of H^A_i, and H^Y_j(d) denotes the d-th element of H^Y_j;
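A minimal sketch of the chi-square metric of step 14 follows; the small eps added to the denominator is a numerical safeguard assumed here, not mentioned in the text.

```python
import numpy as np

def chi_square_metric(H_train, H_test, eps=1e-10):
    """H_train: (J, D) training histogram features; H_test: (K, D) test
    histogram features. Returns the J x K chi-square distance matrix M."""
    M = np.empty((H_train.shape[0], H_test.shape[0]))
    for j, h in enumerate(H_test.astype(float)):
        diff = H_train - h
        M[:, j] = (diff ** 2 / (H_train + h + eps)).sum(axis=1)
    return M
```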
Step 15: compute the class label Id = [Id_i], i = 1, …, K, of each sample in the test set Y: Id_i is the class label of the training sample indexed by minIndx(M_i), wherein M_i denotes the i-th column vector of the metric matrix M and minIndx(·) returns the index of the smallest element of M_i.
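And a sketch of the nearest-neighbor assignment of step 15, assuming the training class labels are available as an array aligned with the rows of M:

```python
import numpy as np

def classify(M, train_labels):
    """M: (J, K) metric matrix; train_labels: length-J array of class labels
    of the training samples. Each test sample receives the label of the
    training sample with the smallest chi-square distance (minIndx of the
    corresponding column of M)."""
    nearest = np.argmin(M, axis=0)             # closest training sample per column
    return np.asarray(train_labels)[nearest]   # predicted class of each test sample
```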
Table 1 compares, for the training and test sets given in Fig. 3, the recognition rates of three versions of DPCANet (DPCANet-1, DPCANet-2, DPCANet-3) with existing methods (VGG-Face, LCNN, PCANet). It can be seen that DPCANet-1 to DPCANet-3 all outperform PCANet; DPCANet-1 and DPCANet-2 each perform better on some test sets and worse on others, while DPCANet-3 has the best recognition performance, and its advantage is especially significant when the resolution of the images to be recognized is low.
Table 1.
Claims (2)
1. A robust image recognition method based on dense PCANet, the method comprising the steps of:
Step 1: select J images A = {A_1, …, A_J} as the training set, with corresponding class labels; let Y = {Y_1, …, Y_K} be the set of K images to be recognized, i.e., the test set. Here each A_j and Y_k denotes an image over the real numbers with C_0 ∈ {1, 3} channels and size m×n;
Step 2: initialize parameters and input data: set a flag indicating the stage in which the network is located, i.e., whether the network is in the training stage or in the testing stage; let l = 0, where l indicates the layer index of the input images or feature maps within the network; initialize the layer-0 input from the input images, where N = J; let F = {F_1, …, F_N} denote the set of feature maps generated by the convolution layers, initialized to the empty set;
wherein the mean of each feature block is subtracted, the b-th (b ∈ {1, 2, …, mn}) feature block of size k×k is extracted from the c-th channel of the feature map, and vec(·) denotes the operation of stretching a matrix into a column vector;
Step 4: if the stage flag indicates that the network is in the testing stage, jump to step 7; otherwise, execute steps 5-6;
Step 5: compute the principal directions of the mean-removed feature blocks, where the i′-th principal direction is the i′-th eigenvector of their covariance matrix, with corresponding eigenvalue λ_i′ and the eigenvalues sorted in descending order;
Step 7: compute the feature map set X^(l+1) of the (l+1)-th convolution layer;
Step 9: let l = l + 1 and repeat steps 3 to 8 above until l = L, where L denotes the predetermined maximum number of convolution layers;
Step 10: apply dense coding to the feature map set F to obtain the pattern map set P = {P_{i,β}}, i = 1, …, N; β = 1, …, B, where P_{i,β} (β ∈ {1, …, B}) denotes the β-th pattern map of the i-th sample and is obtained from the feature maps in the feature map subset F_i; T denotes the number of channels participating in the encoding of a single pattern map; τ (1 ≤ τ ≤ T) is the step size used to control the interval at which the feature maps are acquired; and USF(·) denotes the unit step function, which binarizes the input value by comparison with 0;
Step 11: extract histogram features H from the pattern map set P: H = [H_i], i = 1, …, N, where H_i = [H_{i,1}, …, H_{i,B}]^T and H_{i,β} = Qhist(P_{i,β}); Qhist(P_{i,β}) denotes dividing the pattern map P_{i,β} into Q blocks and extracting a histogram from each block, each histogram using 2^T bins, i.e., counting, within each block, the frequency with which the code values of the pattern map fall into each of the 2^T bins;
Step 14: compute the metric matrix M = [M_{i,j}], i = 1, …, J; j = 1, …, K, where M_{i,j} is the chi-square distance between the histogram feature of the i-th training image and that of the j-th test image: M_{i,j} = Σ_{d=1}^{D} (H^A_i(d) − H^Y_j(d))^2 / (H^A_i(d) + H^Y_j(d)), wherein D denotes the length of the histogram feature vectors H^A_i and H^Y_j, H^A_i(d) denotes the d-th element of H^A_i, and H^Y_j(d) denotes the d-th element of H^Y_j;
Step 15: compute the class label Id = [Id_i], i = 1, …, K, of each sample in the test set Y: Id_i is the class label of the training sample indexed by minIndx(M_i), wherein M_i denotes the i-th column vector of the metric matrix M and minIndx(·) returns the index of the smallest element of M_i.
2. The robust image recognition method based on dense PCANet as recited in claim 1, wherein in said step 7, the feature map set X^(l+1) of the (l+1)-th convolution layer is calculated as follows:
7.2) reorganize the elements of the projection result into the feature map set X^(l+1), with c = j % C_{l+1}; here, the row/column subscript expression denotes the column vector formed by rows a through b of column c of a matrix, a % b denotes the remainder of a divided by b, ⌊a⌋ denotes rounding the real number a down, and mat_{m×n}(v) denotes rearranging an arbitrary column vector v of length mn into an m×n matrix.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010147376.0A CN111488907B (en) | 2020-03-05 | 2020-03-05 | Robust image recognition method based on dense PCANet |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010147376.0A CN111488907B (en) | 2020-03-05 | 2020-03-05 | Robust image recognition method based on dense PCANet |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111488907A CN111488907A (en) | 2020-08-04 |
CN111488907B true CN111488907B (en) | 2023-07-14 |
Family
ID=71811696
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010147376.0A Active CN111488907B (en) | 2020-03-05 | 2020-03-05 | Robust image recognition method based on dense PCANet |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111488907B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191411B (en) * | 2021-04-22 | 2023-02-07 | 杭州卓智力创信息技术有限公司 | Electronic sound image file management method based on photo group |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105447473A (en) * | 2015-12-14 | 2016-03-30 | 江苏大学 | PCANet-CNN-based arbitrary attitude facial expression recognition method |
CN107194375A (en) * | 2017-06-20 | 2017-09-22 | 西安电子科技大学 | Video sequence sorting technique based on three-dimensional principal component analysis network |
CN109410251A (en) * | 2018-11-19 | 2019-03-01 | 南京邮电大学 | Method for tracking target based on dense connection convolutional network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10713563B2 (en) * | 2017-11-27 | 2020-07-14 | Technische Universiteit Eindhoven | Object recognition using a convolutional neural network trained by principal component analysis and repeated spectral clustering |
- 2020-03-05 CN CN202010147376.0A patent/CN111488907B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105447473A (en) * | 2015-12-14 | 2016-03-30 | 江苏大学 | PCANet-CNN-based arbitrary attitude facial expression recognition method |
CN107194375A (en) * | 2017-06-20 | 2017-09-22 | 西安电子科技大学 | Video sequence sorting technique based on three-dimensional principal component analysis network |
CN109410251A (en) * | 2018-11-19 | 2019-03-01 | 南京邮电大学 | Method for tracking target based on dense connection convolutional network |
Non-Patent Citations (3)
Title |
---|
Chan, T.H., et al. PCANet: A Simple Deep Learning Baseline for Image Classification? IEEE Transactions on Image Processing, 2015, 24(12): 5017-5032. *
Zhiwen Huang, et al. Medical Image Classification Using a Light-Weighted Hybrid Neural Network Based on PCANet and DenseNet. IEEE Access, 2020, 8: 24697-24712. *
Zhang Xingrui. Research on the application of random sampling techniques to 2D-LDA and PCANet face recognition algorithms. Information Science and Technology, 2020, (2), full text. *
Also Published As
Publication number | Publication date |
---|---|
CN111488907A (en) | 2020-08-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113343707B (en) | Scene text recognition method based on robustness characterization learning | |
CN108416377A (en) | Information extracting method in block diagram and device | |
US10311322B2 (en) | Character information recognition method based on image processing | |
CN108229588B (en) | Machine learning identification method based on deep learning | |
CN102779157B (en) | Method and device for searching images | |
CN109740606A (en) | A kind of image-recognizing method and device | |
CN111488907B (en) | Robust image recognition method based on dense PCANet | |
CN113052859A (en) | Super-pixel segmentation method based on self-adaptive seed point density clustering | |
CN112686104A (en) | Deep learning-based multi-vocal music score identification method | |
CN115908421A (en) | Active learning medical image segmentation method based on superpixels and diversity | |
CN111310820A (en) | Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration | |
CN106709501B (en) | Scene matching area selection and reference image optimization method of image matching system | |
CN114387454A (en) | Self-supervision pre-training method based on region screening module and multi-level comparison | |
CN109389173B (en) | M-CNN-based test paper score automatic statistical analysis method and device | |
CN115410059B (en) | Remote sensing image part supervision change detection method and device based on contrast loss | |
CN108492345B (en) | Data block dividing method based on scale transformation | |
CN111488906B (en) | Low-resolution image recognition method based on channel correlation PCANet | |
CN111488905B (en) | Robust image recognition method based on high-dimensional PCANet | |
CN106056575A (en) | Image matching method based on object similarity recommended algorithm | |
US9092688B2 (en) | Assisted OCR | |
CN115984639A (en) | Intelligent detection method for fatigue state of part | |
CN111209872B (en) | Real-time rolling fingerprint splicing method based on dynamic programming and multi-objective optimization | |
CN111382703B (en) | Finger vein recognition method based on secondary screening and score fusion | |
CN113887737A (en) | Sample set automatic generation method based on machine learning | |
CN113223098A (en) | Preprocessing optimization method for image color classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||