CN104376312A - Face recognition method based on bag-of-words compressed sensing feature extraction - Google Patents

Face recognition method based on bag-of-words compressed sensing feature extraction

Info

Publication number
CN104376312A
Authority
CN
China
Prior art keywords
image, feature, face recognition, scale, features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410739127.5A
Other languages
Chinese (zh)
Other versions
CN104376312B (en)
Inventor
周凯
元昌安
郑彦
宋文展
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Original Assignee
Guangxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi University
Priority to CN201410739127.5A
Publication of CN104376312A
Application granted
Publication of CN104376312B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face recognition method based on bag-of-words compressed sensing feature extraction, used in a face recognition system. The method comprises the following steps: extraction of the scale-invariant feature transform features of an image, feature coding, fusion of features of different scales, and classification. Compared with the original bag-of-words model, the method is simple, practical and effective: after the key features of an image are extracted by the scale-invariant feature transform, no clustering center or learned dictionary is needed; a random matrix is used instead. Encoding the key features through the random matrix saves a large amount of time and avoids the heavy loss of spatial information suffered by the original bag-of-words model. The method withstands changes such as illumination, occlusion and expression of the user's face, achieves a high recognition rate at a high running speed, and, in face recognition on the challenging AR face database, greatly improves the recognition rate while running in real time.

Description

Face recognition method based on bag-of-words compressed sensing feature extraction
Technical Field
The invention relates to machine vision and image processing technology, in particular to a face recognition method.
Background
In existing face recognition systems, lighting brightness, face pose, glasses and the like are challenging problems, and feature extraction is a key step of image preprocessing. Existing face recognition methods are various; in methods based on bag-of-words features, for example, the bag-of-words model discards the spatial information of the features and leaves them unordered, so the recognition rate of the algorithm is low, and the K-means clustering in the bag-of-words model takes a long time, so the whole algorithm runs slowly.
Disclosure of Invention
The invention aims to provide a face recognition method based on bag-of-words compressed sensing feature extraction that improves the performance of the algorithm and runs faster.
In order to achieve this purpose, the technical scheme of the method is as follows:
a face recognition method based on bag-of-words compressed sensing feature extraction, used in a face recognition system, characterized in that the recognition comprises the following steps:
step one, extracting image features using the scale-invariant feature transform
(1) Let the image be $I(x, y)$; convolving the image with a Gaussian kernel function gives the scale space at different scales. The formula is:

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y) \tag{1}$$

where $(x, y)$ denotes the pixel position, $L(x, y, \sigma)$ denotes the scale space, and $\sigma$ denotes the scale-space factor;

(2) after the scale space of the image is obtained, a difference-of-Gaussian pyramid method is adopted, i.e., extreme points are searched in the space $D(x, y, \sigma)$ obtained by convolving the image with the difference-of-Gaussian function; the formula of $D(x, y, \sigma)$ is:

$$D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma) \tag{2}$$

where $k$ is the factor separating two adjacent scale spaces;

(3) key points are determined from the extreme points, and each key point is assigned a direction to achieve rotation invariance of the image; for each point $(x, y)$ of the smoothed image $L$, the gradient magnitude and direction are computed:

$$m(x, y) = \sqrt{\big(L(x+1, y) - L(x-1, y)\big)^2 + \big(L(x, y+1) - L(x, y-1)\big)^2} \tag{3}$$

$$\theta(x, y) = \arctan\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)} \tag{4}$$
(4) a neighborhood around the key point is selected, and a histogram is formed from the gradients of all the points in this region, centered on the key point, with each gradient Gaussian-weighted. The neighborhood is divided into four sub-regions, and eight directions are taken in each sub-region, thereby obtaining the scale-invariant feature transform of the image;
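To make step one concrete, here is a minimal Python sketch of equations (1)-(4): a Gaussian scale space and difference-of-Gaussian pyramid built with NumPy/SciPy, plus per-pixel gradient magnitude and direction. The parameter values (sigma = 1.6, k = sqrt(2), five scales) are illustrative assumptions, not values fixed by the patent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(image, sigma=1.6, k=2 ** 0.5, n_scales=5):
    """Eqs. (1)-(2): Gaussian scale space L and difference-of-Gaussian D."""
    L = [gaussian_filter(image.astype(float), sigma * k ** i)
         for i in range(n_scales)]
    D = [L[i + 1] - L[i] for i in range(n_scales - 1)]  # D = L(k*sigma) - L(sigma)
    return L, D

def gradient_mag_ori(L):
    """Eqs. (3)-(4): gradient magnitude and direction of a smoothed image.
    np.roll wraps at the borders, a simplification for this sketch; arctan2
    is the quadrant-aware form of the arctan in eq. (4)."""
    dx = np.roll(L, -1, axis=1) - np.roll(L, 1, axis=1)  # L(x+1,y) - L(x-1,y)
    dy = np.roll(L, -1, axis=0) - np.roll(L, 1, axis=0)  # L(x,y+1) - L(x,y-1)
    return np.sqrt(dx ** 2 + dy ** 2), np.arctan2(dy, dx)

img = np.random.rand(83, 60)   # stand-in for a normalized face image
L, D = dog_pyramid(img)
m, theta = gradient_mag_ori(L[0])
```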
step two, feature coding
An image is segmented into $M$ blocks. After blocking, the local feature of each block is obtained by the scale-invariant feature transform; then, using the idea of compressed sensing, the system randomly generates a random dictionary $B$, and the feature codes are obtained through sparse representation.
If the system generates a random dictionary $B$ for an image divided into $M$ blocks, and $y_i$ ($i = 1, \dots, M$) is the scale-invariant feature transform feature of the $i$-th block, the feature code of each local block of the image is obtained by formula (5), as follows:

$$\hat{x}_i = \arg\min_{x}\; \|y_i - Bx\|_2^2 + \lambda \|x\|_1 \tag{5}$$

where $\lambda$ is a constant and $\hat{x}_i$ is the desired feature code;
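A minimal sketch of the feature coding of formula (5), assuming an l1-regularized least-squares solver (scikit-learn's Lasso) as the sparse-representation step; the dimensions and the constant lambda (alpha below) are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
d, n, M = 128, 256, 16            # SIFT dim, dictionary atoms, blocks per image
B = rng.standard_normal((d, n))   # random dictionary: no clustering, no learning
B /= np.linalg.norm(B, axis=0)    # unit-norm atoms
Y = rng.standard_normal((d, M))   # stand-in SIFT features, one column per block

# Lasso minimizes (1/(2m))||y - Bx||^2 + alpha*||x||_1, so alpha plays the
# role of lambda in eq. (5) up to a constant scaling factor.
coder = Lasso(alpha=0.01, max_iter=5000)
C = np.column_stack([coder.fit(B, Y[:, i]).coef_ for i in range(M)])
print(C.shape)                    # (n, M): one sparse code per block
```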
Step three, fusing features of different scales in the image
Using formula (5), the feature coding matrix of an image is obtained as $C = [\hat{x}_1, \hat{x}_2, \dots, \hat{x}_M]$, where $\hat{x}_i$ is the coefficient corresponding to the $i$-th block. The max-pooling method for fusing the coefficients is defined as:

$$z_j = \max\big\{|c_{j1}|, |c_{j2}|, \dots, |c_{jM}|\big\} \tag{6}$$

where $z_j$ is the $j$-th element of the pooling vector $z$, and $c_{ji}$ denotes the element in row $j$, column $i$ of the coefficient coding matrix $C$;
finally, a space pyramid matching algorithm is used, namely, a pair of images are segmented intoAnddifferent blocks can carry out feature coding on the subregions with different spatial positions and scales, if a spatial pyramid matching algorithm is used for obtaining the maximum pool of the scales asThen, the feature vectors of different scales and regions are connected in series to obtain the feature vector of the image;
step four, classifying
After the feature vector of each image has been obtained by the above feature extraction, classification is carried out with a kernel sparse representation method. The kernel function is the histogram intersection kernel, whose expression is:

$$\kappa(p, q) = \sum_{i=1}^{n} \min(p_i, q_i) \tag{7}$$

where $p$ and $q$ are two $n$-dimensional feature vectors, and $p_i$, $q_i$ are the elements of these feature vectors;
If the training set obtained after image feature extraction is $X = [x_1, x_2, \dots, x_N]$ and $y$ is a test sample, then, taking a single test sample $y$ as an example, $y$ can be represented over the matrix $X$ in the kernel-induced space as:

$$\hat{\alpha} = \arg\min_{\alpha}\; \|\Phi(y) - \Phi(X)\alpha\|_2^2 + \lambda \|\alpha\|_1 \tag{8}$$

where $\Phi$ is the mapping induced by the kernel function $\kappa$, and $\alpha$ is the sparse coefficient in the high-dimensional feature projection space. Expanding the above formula gives:

$$\hat{\alpha} = \arg\min_{\alpha}\; \Big( \kappa(y, y) - 2\,K(y, X)\,\alpha + \alpha^{\top} K(X, X)\,\alpha + \lambda \|\alpha\|_1 \Big) \tag{9}$$

where $K(y, X) = [\kappa(y, x_1), \dots, \kappa(y, x_N)]$ collects the histogram intersection kernel values between the test sample $y$ and the training samples, and $K(X, X)$ is the kernel matrix of the training set. Solving formula (9) yields the coefficient $\hat{\alpha}$; finally, the class with the minimum residual is found:

$$\operatorname{identity}(y) = \arg\min_{c}\; \big\| \Phi(y) - \Phi(X)\,\delta_c(\hat{\alpha}) \big\|_2 \tag{10}$$

where $\delta_c(\hat{\alpha})$ denotes the sparse representation coefficients corresponding to class $c$.
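A minimal sketch of step four under stated assumptions: the histogram intersection kernel of formula (7), a plain ISTA iteration as one possible solver for formula (9), and the minimum-residual rule of formula (10). The solver choice, lambda, iteration count, and data sizes are illustrative assumptions, not the patent's settings.

```python
import numpy as np

def hik(P, Q):
    """Eq. (7): histogram intersection kernel matrix between columns of P and Q."""
    return np.array([[np.minimum(p, q).sum() for q in Q.T] for p in P.T])

def kernel_sparse_code(K_xx, k_yx, lam=0.01, n_iter=500):
    """ISTA on eq. (9): min_a  -2*k_yx.a + a.K_xx.a + lam*||a||_1."""
    alpha = np.zeros(K_xx.shape[0])
    step = 1.0 / (2 * np.linalg.norm(K_xx, 2))   # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = 2 * (K_xx @ alpha - k_yx)
        z = alpha - step * grad
        alpha = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0)  # soft threshold
    return alpha

def classify(K_xx, k_yx, k_yy, alpha, labels):
    """Eq. (10): pick the class whose coefficients give the smallest residual."""
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        a_c = np.where(labels == c, alpha, 0.0)  # delta_c(alpha)
        residuals.append(k_yy - 2 * k_yx @ a_c + a_c @ K_xx @ a_c)
    return classes[int(np.argmin(residuals))]

rng = np.random.default_rng(0)
X = rng.random((50, 20))              # 20 training features (nonnegative, like pooled codes)
y = rng.random(50)                    # one test feature
labels = np.repeat(np.arange(4), 5)   # 4 classes, 5 training samples each
K_xx, k_yx = hik(X, X), hik(y[:, None], X).ravel()
alpha = kernel_sparse_code(K_xx, k_yx)
print(classify(K_xx, k_yx, y.sum(), alpha, labels))  # kappa(y, y) = sum(y) for HIK
```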
The invention has the characteristics and advantages that:
1. Compared with the original bag-of-words model, the method is simple, practical and more effective. After the scale-invariant feature transform extracts the image keypoint features, the method no longer finds a clustering center or learns a dictionary; instead, the keypoint features are encoded by a random matrix. This saves a large amount of time and, unlike the original bag-of-words method, does not lose a large amount of spatial information, so the recognition rate is greatly improved;
2. The method draws on the idea of compressed sensing: solving the sparse coefficient matrix of the compressed sensing scheme reconstructs the original image well, and the advantages of the spatial pyramid model and max pooling make the algorithm more stable;
3. The method overcomes well the influence of illumination, occlusion and expression changes of the face on face recognition, achieving both a high recognition rate and a high running speed. In face recognition on the challenging AR database, compared with the classical bag-of-words model, the method overcomes the influence of these factors, greatly improves the face recognition rate, and runs in real time.
Drawings
FIG. 1 shows 7 frontal face images with varying illumination, expression and camouflage.
Detailed Description
The invention will be further explained with reference to the drawings.
The present invention is explained in detail through a specific example. In a face recognition system, the simulation is performed in MATLAB; the experimental platform is an i5 processor with 2.4 GHz main frequency and 2 GB of memory. The protection scope of the invention is not limited to the following implementation example.
Fig. 1 shows 7 frontal face images with varying illumination, expression and camouflage. The first image is a normal image; the second has a changed facial expression; the third has changed illumination; the fourth wears glasses; the fifth wears glasses under changed illumination; the sixth wears a scarf; and the seventh wears a scarf under changed illumination. This example is run on the AR database, a public and very challenging face database. The AR database contains 2600 frontal face images with different illumination, expression and camouflage changes, covering 100 people with 26 images each. The database is divided into two parts. Images 1-7 of the first part show expression and illumination changes and are used here as the training set (700 images); then images 8-10 (wearing glasses) and images 11-13 (wearing a scarf) of the first part and of the second part are taken as test sets (300 images each). To reduce computation, each face image is normalized to 83×60 pixels.
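A hypothetical preprocessing sketch for the split just described, assuming OpenCV for loading and resizing; the directory layout, file-name pattern, and the width/height orientation of the 83×60 target are assumptions for illustration only.

```python
import glob
import cv2  # opencv-python

def load_normalized(paths, size=(60, 83)):
    """Grayscale-load each image and resize to 83x60 pixels.
    cv2.resize takes (width, height); treating the faces as 83 high by
    60 wide is an assumption."""
    return [cv2.resize(cv2.imread(p, cv2.IMREAD_GRAYSCALE), size)
            for p in sorted(paths)]

def person_images(part, indices):
    """Collect paths under a hypothetical layout AR/<part>/person*/imgNN.png."""
    paths = []
    for i in indices:
        paths += glob.glob(f"AR/{part}/person*/img{i:02d}.png")
    return paths

train = load_normalized(person_images("part1", range(1, 8)))        # images 1-7
test_glasses = load_normalized(person_images("part1", range(8, 11)))   # images 8-10
test_scarf = load_normalized(person_images("part1", range(11, 14)))    # images 11-13
```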
First, in Matlab, all images are segmented into blocks at the different pyramid scales, and the features of each block are then extracted by the scale-invariant feature transform, giving the block features of the training set and of the test set, where each image is composed of blocks 1 to N.
Then, for the training set and the test set, the feature code of each block of every image is obtained by formula (5). All blocks of an image are fused by formula (6) using the spatial pyramid matching and max-pooling methods, finally giving the feature vector of each image in the training set and the test set.
The kernel is then used to project into a high-dimensional feature space: according to the kernel function of formula (7), the kernel matrices between the training set and itself and between the test set and the training set are computed.
Formula (9) is then used to compute, for each test sample, the sparse coefficient matrix with respect to the training samples.
Finally according to the sparse coefficient matrixBy finding the minimum residual errorTo distinguish the classification:
(11)
in the formulaIs shown asClass-corresponding sparse representation coefficients.
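Tying the steps together, here is a compact, self-contained sketch of the embodiment's feature pipeline on synthetic data (random block features stand in for real SIFT descriptors); all sizes and constants are illustrative assumptions. The resulting nonnegative pooled vectors are what would feed the histogram intersection kernel and the classifier sketched under step four.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
d, n, M = 128, 256, 16                 # SIFT dim, dictionary atoms, blocks per image
B = rng.standard_normal((d, n))
B /= np.linalg.norm(B, axis=0)         # shared random dictionary
coder = Lasso(alpha=0.01, max_iter=2000)

def image_vector(blocks):
    """Steps two and three for one image at a single pyramid scale:
    sparse-code each block over B (eq. 5), then max-pool (eq. 6)."""
    C = np.column_stack([coder.fit(B, blocks[:, i]).coef_ for i in range(M)])
    return np.abs(C).max(axis=1)

train = np.column_stack([image_vector(rng.random((d, M))) for _ in range(12)])
test = image_vector(rng.random((d, M)))
print(train.shape, test.shape)         # (n, 12) and (n,)
# train/test are nonnegative after pooling, as the histogram intersection
# kernel of step four expects.
```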
The experimental results are shown in Table 1, from which it can be seen that the recognition rate of the method of the present invention is clearly superior to that of the existing method. On the first part of the AR face library, the recognition rates for wearing glasses and wearing a scarf both exceed 97%. Although the recognition rate drops on the second part, because the experimental training set uses only the first 7 images of the first part, the method is still more than 7% higher than the existing algorithm. The Time column of the table gives the average processing time per image; the method of the present invention also takes less time than the existing method.
TABLE 1 Comparison of the recognition rates of the two algorithms on the AR database

Method                          Glasses 1  Scarf 1  Glasses 2  Scarf 2  Time (s)
Bag-of-words (existing method)  81.35      80.34    73.37      62.03    0.1800
Method of the invention         98.32      97.33    80.96      87.02    0.1001
Therefore, the method can be widely applied in real life, and the experiments show that it has good robustness.

Claims (1)

1. A face recognition method based on bag-of-words compressed sensing feature extraction, used in a face recognition system, characterized in that the recognition steps are as follows:
step one, extracting image features using the scale-invariant feature transform;
(1) let the image be $I(x, y)$; convolving the image with a Gaussian kernel function gives the scale space at different scales; the formula is:

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y) \tag{1}$$

where $(x, y)$ denotes the pixel position, $L(x, y, \sigma)$ denotes the scale space, and $\sigma$ denotes the scale-space factor;

(2) after the scale space of the image is obtained, a difference-of-Gaussian pyramid method is adopted, i.e., extreme points are searched in the space $D(x, y, \sigma)$ obtained by convolving the image with the difference-of-Gaussian function; the formula of $D(x, y, \sigma)$ is:

$$D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma) \tag{2}$$

where $k$ is the factor separating two adjacent scale spaces;

(3) key points are determined from the extreme points, and each key point is assigned a direction to achieve rotation invariance of the image; for each point $(x, y)$ of the smoothed image $L$, the gradient magnitude and direction are computed:

$$m(x, y) = \sqrt{\big(L(x+1, y) - L(x-1, y)\big)^2 + \big(L(x, y+1) - L(x, y-1)\big)^2} \tag{3}$$

$$\theta(x, y) = \arctan\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)} \tag{4}$$
(4) selecting a neighborhood around the key point, and forming a histogram from the gradients of all the points in the neighborhood, centered on the key point;
each gradient in the neighborhood is Gaussian-weighted;
the neighborhood is divided into four sub-regions, and eight directions are taken in each sub-region;
thereby obtaining the scale-invariant feature transform of the image;
step two, feature coding
an image is segmented into $M$ blocks; after blocking, the local feature of each block is obtained by the scale-invariant feature transform; then, using the idea of compressed sensing, the system randomly generates a random dictionary $B$, and the feature codes are obtained through sparse representation;
if the system generates a random dictionary $B$ for an image divided into $M$ blocks, and $y_i$ ($i = 1, \dots, M$) is the scale-invariant feature transform feature of the $i$-th block, the feature code of each local block of the image is obtained by formula (5), as follows:

$$\hat{x}_i = \arg\min_{x}\; \|y_i - Bx\|_2^2 + \lambda \|x\|_1 \tag{5}$$

where $\lambda$ is a constant and $\hat{x}_i$ is the desired feature code;
Step three, fusing features of different scales in the image
using formula (5), the feature coding matrix of an image is obtained as $C = [\hat{x}_1, \hat{x}_2, \dots, \hat{x}_M]$, where $\hat{x}_i$ is the coefficient corresponding to the $i$-th block; the max-pooling method for fusing the coefficients is defined as:

$$z_j = \max\big\{|c_{j1}|, |c_{j2}|, \dots, |c_{jM}|\big\} \tag{6}$$

where $z_j$ is the $j$-th element of the pooling vector $z$, and $c_{ji}$ denotes the element in row $j$, column $i$ of the coefficient coding matrix $C$;
finally, a spatial pyramid matching algorithm is used, i.e., each image is segmented into blocks at several different scales, so that sub-regions at different spatial positions and scales can be feature-coded; the max pool of each scale is obtained with the spatial pyramid matching algorithm, and the feature vectors of the different scales and regions are concatenated to obtain the feature vector of the image;
step four, classifying
after the feature vector of each image has been obtained by the above feature extraction, classification is carried out with a kernel sparse representation method; the kernel function is the histogram intersection kernel, whose expression is:

$$\kappa(p, q) = \sum_{i=1}^{n} \min(p_i, q_i) \tag{7}$$

where $p$ and $q$ are two $n$-dimensional feature vectors, and $p_i$, $q_i$ are the elements of these feature vectors;
if the training set obtained after image feature extraction is $X = [x_1, x_2, \dots, x_N]$ and $y$ is a test sample, then, taking a single test sample $y$ as an example, $y$ can be represented over the matrix $X$ in the kernel-induced space as:

$$\hat{\alpha} = \arg\min_{\alpha}\; \|\Phi(y) - \Phi(X)\alpha\|_2^2 + \lambda \|\alpha\|_1 \tag{8}$$

where $\Phi$ is the mapping induced by the kernel function $\kappa$, and $\alpha$ is the sparse coefficient in the high-dimensional feature projection space; expanding the above formula gives:

$$\hat{\alpha} = \arg\min_{\alpha}\; \Big( \kappa(y, y) - 2\,K(y, X)\,\alpha + \alpha^{\top} K(X, X)\,\alpha + \lambda \|\alpha\|_1 \Big) \tag{9}$$

where $K(y, X) = [\kappa(y, x_1), \dots, \kappa(y, x_N)]$ collects the histogram intersection kernel values between the test sample $y$ and the training samples, and $K(X, X)$ is the kernel matrix of the training set; solving formula (9) yields the coefficient $\hat{\alpha}$, and finally the class with the minimum residual is found:

$$\operatorname{identity}(y) = \arg\min_{c}\; \big\| \Phi(y) - \Phi(X)\,\delta_c(\hat{\alpha}) \big\|_2 \tag{10}$$

where $\delta_c(\hat{\alpha})$ denotes the sparse representation coefficients corresponding to class $c$.
CN201410739127.5A 2014-12-08 2014-12-08 Face identification method based on bag of words compressed sensing feature extraction Active CN104376312B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410739127.5A CN104376312B (en) 2014-12-08 2014-12-08 Face identification method based on bag of words compressed sensing feature extraction


Publications (2)

Publication Number Publication Date
CN104376312A true CN104376312A (en) 2015-02-25
CN104376312B CN104376312B (en) 2019-03-01

Family

ID=52555210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410739127.5A Active CN104376312B (en) 2014-12-08 2014-12-08 Face identification method based on bag of words compressed sensing feature extraction

Country Status (1)

Country Link
CN (1) CN104376312B (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102637251A (en) * 2012-03-20 2012-08-15 华中科技大学 Face recognition method based on reference features
CN103310208A (en) * 2013-07-10 2013-09-18 西安电子科技大学 Identifiability face pose recognition method based on local geometrical visual phrase description
CN103745200A (en) * 2014-01-02 2014-04-23 哈尔滨工程大学 Facial image identification method based on word bag model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Filipe Magalhaes et al.: "Compressive Sensing Based Face Detection without Explicit Image Reconstruction Using Support Vector Machines", International Conference on Image Analysis & Recognition *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105488491A (en) * 2015-12-23 2016-04-13 西安电子科技大学 Human body sleep posture detection method based on pyramid matching histogram intersection kernel
CN106056135A (en) * 2016-05-20 2016-10-26 北京九艺同兴科技有限公司 Human body motion classification method based on compression perception
CN108229330A (en) * 2017-12-07 2018-06-29 深圳市商汤科技有限公司 Face fusion recognition methods and device, electronic equipment and storage medium
CN108960201A (en) * 2018-08-01 2018-12-07 西南石油大学 A kind of expression recognition method extracted based on face key point and sparse expression is classified
CN109800719A (en) * 2019-01-23 2019-05-24 南京大学 Low resolution face identification method based on sub-unit and compression dictionary rarefaction representation
CN109800719B (en) * 2019-01-23 2020-08-18 南京大学 Low-resolution face recognition method based on sparse representation of partial component and compression dictionary

Also Published As

Publication number Publication date
CN104376312B (en) 2019-03-01

Similar Documents

Publication Publication Date Title
CN110097051B (en) Image classification method, apparatus and computer readable storage medium
Cherian et al. Riemannian dictionary learning and sparse coding for positive definite matrices
Paisitkriangkrai et al. Pedestrian detection with spatially pooled features and structured ensemble learning
Wang et al. Bag of contour fragments for robust shape classification
CN110659589B (en) Pedestrian re-identification method, system and device based on attitude and attention mechanism
Anami et al. A comparative study of suitability of certain features in classification of bharatanatyam mudra images using artificial neural network
CN104376312B (en) Face identification method based on bag of words compressed sensing feature extraction
CN105956560A (en) Vehicle model identification method based on pooling multi-scale depth convolution characteristics
Wang et al. Review of image low-level feature extraction methods for content-based image retrieval
CN104966081B (en) Spine image-recognizing method
Zeng et al. Curvature bag of words model for shape recognition
CN106096658B (en) Aerial Images classification method based on unsupervised deep space feature coding
Zhao et al. Bisecting k-means clustering based face recognition using block-based bag of words model
CN110334715A (en) A kind of SAR target identification method paying attention to network based on residual error
Li et al. Place recognition based on deep feature and adaptive weighting of similarity matrix
CN109034213B (en) Hyperspectral image classification method and system based on correlation entropy principle
CN104504368A (en) Image scene recognition method and image scene recognition system
Singh et al. Leaf identification using feature extraction and neural network
CN111199558A (en) Image matching method based on deep learning
CN111325275A (en) Robust image classification method and device based on low-rank two-dimensional local discriminant map embedding
Giraddi et al. Flower classification using deep learning models
CN110826534B (en) Face key point detection method and system based on local principal component analysis
CN112784722A (en) Behavior identification method based on YOLOv3 and bag-of-words model
Huang et al. Human emotion recognition based on face and facial expression detection using deep belief network under complicated backgrounds
CN105069402A (en) Improved RSC algorithm for face identification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant