CN108805216A - Face image processing method based on deep feature fusion - Google Patents
- Publication number
- CN108805216A CN108805216A CN201810630864.XA CN201810630864A CN108805216A CN 108805216 A CN108805216 A CN 108805216A CN 201810630864 A CN201810630864 A CN 201810630864A CN 108805216 A CN108805216 A CN 108805216A
- Authority
- CN
- China
- Prior art keywords
- feature
- image
- training
- sift
- extraction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Abstract
The embodiment of the present invention discloses a face image processing method based on deep feature fusion, which can improve generalization ability on limited data sets. The method includes: expanding a facial expression image data set using data augmentation, while extracting the SIFT features of each image in the facial expression image data set; and using the bag-of-words (BoW) representation of the SIFT features as the extracted shallow features, concatenating them with the deep features of the corresponding image into a single feature vector, and training an SVM classifier.
Description
Technical field
The present invention relates to the field of image recognition, and in particular to a face image processing method based on deep feature fusion.
Background
Research on facial expression recognition dates back to the 1970s; early work concentrated mainly on psychology and biology.
The traditional classification-based facial expression recognition pipeline comprises several steps: face detection, feature extraction, and pattern classification. The face detection module detects and locates the face; the expression feature extraction module extracts descriptive information characterizing the expression from the face sub-image; the pattern classification module analyzes the output of the previous module and assigns the expression to the corresponding category according to a classification criterion. Among these, facial feature extraction is the most important part of an expression recognition system, and the quality of recognition depends primarily on the quality of the features.
Researchers generally spend a great deal of time and effort on extracting and selecting better features. Such hand-designed features not only consume a large amount of researchers' time but also depend strongly on the training data; they are often unreliable and unstable, and are easily disturbed, so their generalization in the field of expression recognition leaves room for improvement.
Invention content
The embodiment of the present invention provides a face image processing method based on deep feature fusion that can improve generalization ability on limited data sets.
The embodiment of the present invention adopts the following technical scheme.
A face image processing method based on deep feature fusion, including:
expanding a facial expression image data set using data augmentation, while extracting the SIFT features of each image in the facial expression image data set;
using the bag-of-words (BoW) representation of the SIFT features as the extracted shallow features, concatenating them with the deep features of the corresponding image into a single feature vector, and training an SVM classifier.
Optionally, the facial expression image data set is the CK+ data set, and expanding the facial expression image data set using data augmentation includes:
obtaining the CK+ data set containing multiple face images, dividing it into training, validation, and test sets in an 8:1:1 ratio, and ensuring that no subject identity appears in more than one set;
preprocessing all images in the data set: performing face detection on each face picture with the Haar-feature-based AdaBoost face detector, cropping the face region, and removing background influence;
performing spatial normalization on the images using the OpenCV vision library, adjusting the line between the two eyes to be horizontal and aligning the faces to the same position;
performing histogram equalization on all images to enhance contrast and weaken brightness differences caused by illumination; finally normalizing all images to 70*70 pixels;
expanding the data set by applying geometric transformations to the raw images: for each 70*70 training image, cropping the 64*64 regions at the upper-left, upper-right, lower-left, lower-right, and center, and horizontally mirroring each cropped image, thereby expanding the training set 10 times.
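The five-crop-plus-mirror augmentation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation; the image shapes follow the 70*70 to 64*64 scheme stated in the text:

```python
import numpy as np

def augment(img, crop=64):
    """Five 64x64 crops (four corners + center) of a 70x70 image,
    each paired with its horizontal mirror -> 10 images total."""
    h, w = img.shape[:2]
    c = (h - crop) // 2
    offsets = [(0, 0), (0, w - crop), (h - crop, 0), (h - crop, w - crop), (c, c)]
    out = []
    for y, x in offsets:
        patch = img[y:y + crop, x:x + crop]
        out.append(patch)
        out.append(patch[:, ::-1])  # horizontal mirror
    return out

crops = augment(np.zeros((70, 70), dtype=np.uint8))
print(len(crops))  # 10 images per training image, i.e. a 10x expansion
```

Each source image thus yields ten 64*64 training images, matching the 10x expansion factor claimed for the training set.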
Optionally, extracting the SIFT features of each image in the facial expression image data set includes:
extracting SIFT features from all expanded training set images, each SIFT keypoint descriptor being a 4*4*8 = 128-dimensional vector;
after feature extraction, converting each sample into an n*128 feature matrix, where n is the number of extracted feature points; 20 SIFT descriptors are extracted per image, so the shallow feature of each training sample is represented by a 20*128 matrix;
building a dictionary from all feature points of the training set: clustering all SIFT features with the k-means algorithm to obtain K cluster centers, which constitute the dictionary, with K = 500, i.e., 500 visual words.
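Building the visual-word dictionary by clustering the pooled SIFT descriptors can be sketched with scikit-learn's KMeans. Synthetic 128-dimensional vectors stand in for real SIFT descriptors here, and a small K is used for speed; the patent specifies K = 500:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for the pooled training-set descriptors: n points, 128-D like SIFT.
descriptors = rng.random((400, 128)).astype(np.float32)

K = 8  # illustrative; the patent uses K = 500 visual words
kmeans = KMeans(n_clusters=K, n_init=10, random_state=0).fit(descriptors)
dictionary = kmeans.cluster_centers_  # shape (K, 128): the visual words
print(dictionary.shape)
```

The resulting `dictionary` rows are the cluster centers, i.e. the visual words against which each image's descriptors are later quantized.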
Optionally, using the bag-of-words (BoW) representation of the SIFT features as the extracted shallow features includes:
representing the SIFT features of each image as a single feature vector with the BoW model: computing, by minimum distance, which visual word in the dictionary each SIFT feature belongs to, counting the number of feature points falling into each word, and obtaining a statistical histogram for each image; this histogram is a 500-dimensional feature vector, which is the image's BoW representation with respect to the dictionary.
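The minimum-distance assignment and histogram count described above amount to the following NumPy sketch, under the same toy assumptions as the k-means example (a real pipeline would use the 500-word dictionary and real SIFT descriptors):

```python
import numpy as np

def bow_histogram(descriptors, dictionary):
    """Assign each descriptor to its nearest visual word and count occurrences."""
    # Squared Euclidean distance from every descriptor to every dictionary word.
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                        # nearest word per descriptor
    hist = np.bincount(words, minlength=len(dictionary))
    return hist                                      # K-dim BoW vector for the image

rng = np.random.default_rng(1)
dictionary = rng.random((8, 128))      # toy dictionary (patent: 500 words)
descriptors = rng.random((20, 128))    # 20 SIFT descriptors per image, as specified
hist = bow_histogram(descriptors, dictionary)
print(hist.sum())  # 20: every descriptor falls into exactly one word
```

With the patent's parameters the histogram has 500 bins and its entries sum to 20, the number of descriptors per image.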
Optionally, the method further includes:
fine-tuning a pre-trained AlexNet model on the CK+ training set and extracting the features of the model's fully connected layer as the learned deep features.
Optionally, fine-tuning the pre-trained AlexNet model on the CK+ training set and extracting the features of the fully connected layer as the learned deep features includes:
using an AlexNet convolutional neural network model pre-trained on ImageNet (AlexNet-CNN) and fine-tuning its parameters on the training set; the number of nodes in the last fully connected layer of AlexNet is changed to 500, the initial learning rate is 0.001, and iteration stops when the validation set recognition rate no longer improves, yielding a CNN model for extracting deep features.
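The stopping rule "stop iterating when the validation recognition rate no longer improves" is ordinary early stopping. A framework-independent sketch follows; the patience value is an assumption, since the patent does not state one:

```python
def early_stop_epoch(val_accuracies, patience=1):
    """Return the epoch at which training would stop: the first epoch after
    `patience` consecutive epochs without improvement over the best accuracy."""
    best, since_best = float("-inf"), 0
    for epoch, acc in enumerate(val_accuracies):
        if acc > best:
            best, since_best = acc, 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return len(val_accuracies) - 1

# Validation accuracy per fine-tuning epoch (toy numbers):
print(early_stop_epoch([0.60, 0.72, 0.80, 0.79, 0.78]))  # stops at epoch 3
```

In a real fine-tuning loop the `val_accuracies` list would be filled epoch by epoch, and the model weights from the best epoch would be kept.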
Optionally, the method further includes:
using the obtained CNN model to extract the features of the last fully connected layer of the training set images as deep features; each image yields a 500-dimensional feature vector.
Optionally, the method further includes:
concatenating the extracted 500-dimensional shallow features with the extracted 500-dimensional deep features to obtain a fused feature, and normalizing all feature vectors to the range [-1, 1].
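Concatenating the 500-dimensional shallow and deep vectors and scaling into [-1, 1] can be sketched as below. Per-dimension min-max scaling is one common reading of "normalize to [-1, 1]"; the patent does not specify the exact scheme:

```python
import numpy as np

def fuse(shallow, deep):
    """Concatenate shallow (BoW) and deep (CNN fc) features -> fused vectors."""
    return np.concatenate([shallow, deep], axis=1)

def scale_to_unit_interval(X):
    """Min-max scale each dimension of X into [-1, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return 2.0 * (X - lo) / span - 1.0

rng = np.random.default_rng(2)
shallow = rng.random((6, 500))   # BoW histograms for 6 samples
deep = rng.random((6, 500))      # CNN fc-layer features for 6 samples
fused = scale_to_unit_interval(fuse(shallow, deep))
print(fused.shape)  # (6, 1000)
```

Each sample thus ends up as a single 1000-dimensional fused vector whose components all lie within [-1, 1].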
Optionally, training with the SVM classifier includes:
training a support vector machine on the fused features of the training sample data set: selecting the optimal parameters c and g via grid.py cross-validation, and then running svmtrain with the obtained optimal c and g, a linear classifier, and the RBF kernel on the feature vectors of the entire training data set, obtaining the final target detection model.
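The grid.py step from the LIBSVM toolchain cross-validates over the penalty c and kernel parameter g; the equivalent search in scikit-learn looks roughly like this (toy data stands in for the fused feature vectors, and the parameter grids are illustrative):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.random((60, 10))             # stand-in for fused feature vectors
y = rng.integers(0, 2, size=60)      # stand-in for expression labels

# C and gamma correspond to LIBSVM's -c (penalty) and -g (RBF coefficient).
grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
    cv=3,
)
grid.fit(X, y)
model = grid.best_estimator_         # SVM retrained with the selected C and gamma
print(sorted(grid.best_params_))     # ['C', 'gamma']
```

As with LIBSVM, the best (c, g) pair found by cross-validation is then used to train the final model on the whole training set.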
Optionally, the method further includes:
at test time, likewise extracting the BoW-represented SIFT features of the test set images as shallow features and the features of the last fully connected layer of the CNN model as deep features, concatenating the two feature vectors into a fused feature vector and normalizing it, feeding this fused feature vector to the SVM, and taking the output class label as the recognition result.
With the above technical scheme, the face image processing method based on deep feature fusion expands the facial expression image data set using data augmentation while extracting the SIFT features of each image in the data set; it uses the BoW representation of the SIFT features as the extracted shallow features, concatenates them with the deep features of the corresponding image into a single feature vector, and trains an SVM classifier, thereby improving generalization ability on limited data sets.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the disclosure.
Description of the drawings
The drawings herein are incorporated into and form part of this specification; they show embodiments consistent with the present invention and, together with the specification, explain its principles.
Fig. 1 is a flow chart of the face image processing method based on deep feature fusion according to an embodiment of the present invention.
Specific implementation
Example embodiments are described in detail here and illustrated in the accompanying drawings. In the following description, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described below do not represent all embodiments consistent with the present invention; rather, they are merely examples of devices and methods consistent with some aspects of the invention as detailed in the appended claims.
The embodiment of the present invention addresses two limitations: features obtained by traditional extraction methods are not robust to complex variations such as illumination, expression, and pose; and when trained on limited data sets, ordinary convolutional neural networks lack feature extraction capacity, cannot effectively represent facial expression features, and suffer reduced generalization ability. A feature representation based on deep feature fusion is therefore proposed: the SIFT descriptors of a face image and the fully-connected-layer feature vector of a convolutional neural network are extracted separately, the two features are concatenated into a new representation, and an SVM classifier performs the recognition. The SIFT features assist and strengthen the CNN features, so the fused feature generalizes well on limited data sets.
Embodiment 1
As shown in Fig. 1, the embodiment of the present invention provides a face image processing method based on deep feature fusion, including:
11. expanding a facial expression image data set using data augmentation, while extracting the SIFT (scale-invariant feature transform) features of each image in the facial expression image data set;
12. using the bag-of-words (BoW) representation of the SIFT features as the extracted shallow features, concatenating them with the deep features of the corresponding image into a single feature vector, and training an SVM (support vector machine) classifier.
Optionally, the facial expression image data set is the CK+ data set, and expanding the facial expression image data set using data augmentation includes:
obtaining the CK+ data set containing multiple face images, dividing it into training, validation, and test sets in an 8:1:1 ratio, and ensuring that no subject identity appears in more than one set;
preprocessing all images in the data set: performing face detection on each face picture with the Haar-feature-based AdaBoost face detector, cropping the face region, and removing background influence;
performing spatial normalization on the images using OpenCV (Open Source Computer Vision Library), adjusting the line between the two eyes to be horizontal and aligning the faces to the same position;
performing histogram equalization on all images to enhance contrast and weaken brightness differences caused by illumination; finally normalizing all images to 70*70 pixels;
expanding the data set by applying geometric transformations to the raw images: for each 70*70 training image, cropping the 64*64 regions at the upper-left, upper-right, lower-left, lower-right, and center, and horizontally mirroring each cropped image, thereby expanding the training set 10 times.
Optionally, extracting the SIFT features of each image in the facial expression image data set includes:
extracting SIFT features from all expanded training set images, each SIFT keypoint descriptor being a 4*4*8 = 128-dimensional vector;
after feature extraction, converting each sample into an n*128 feature matrix, where n is the number of extracted feature points; 20 SIFT descriptors are extracted per image, so the shallow feature of each training sample is represented by a 20*128 matrix;
building a dictionary from all feature points of the training set: clustering all SIFT features with the k-means algorithm to obtain K cluster centers, which constitute the dictionary, with K = 500, i.e., 500 visual words.
Optionally, using the bag-of-words (BoW) representation of the SIFT features as the extracted shallow features includes:
representing the SIFT features of each image as a single feature vector with the BoW model: computing, by minimum distance, which visual word in the dictionary each SIFT feature belongs to, counting the number of feature points falling into each word, and obtaining a statistical histogram for each image; this histogram is a 500-dimensional feature vector, which is the image's BoW representation with respect to the dictionary.
Optionally, the method further includes:
fine-tuning a pre-trained AlexNet model on the CK+ training set and extracting the features of the model's fully connected layer as the learned deep features.
Optionally, fine-tuning the pre-trained AlexNet model on the CK+ training set and extracting the features of the fully connected layer as the learned deep features includes:
using an AlexNet convolutional neural network model pre-trained on ImageNet (AlexNet-CNN) and fine-tuning its parameters on the training set; the number of nodes in the last fully connected layer of AlexNet is changed to 500, the initial learning rate is 0.001, and iteration stops when the validation set recognition rate no longer improves, yielding a CNN model for extracting deep features.
Optionally, the method further includes:
using the obtained CNN model to extract the features of the last fully connected layer of the training set images as deep features; each image yields a 500-dimensional feature vector.
Optionally, the method further includes:
concatenating the extracted 500-dimensional shallow features with the extracted 500-dimensional deep features to obtain a fused feature, and normalizing all feature vectors to the range [-1, 1].
Optionally, training with the SVM classifier includes:
training a support vector machine on the fused features of the training sample data set: selecting the optimal parameters c (penalty coefficient) and g (kernel parameter) via grid.py cross-validation, and then running svmtrain with the obtained optimal c and g, a linear classifier, and the RBF (radial basis function) kernel on the feature vectors of the entire training data set, obtaining the final target detection model.
Optionally, the method further includes:
at test time, likewise extracting the BoW-represented SIFT features of the test set images as shallow features and the features of the last fully connected layer of the CNN model as deep features, concatenating the two feature vectors into a fused feature vector and normalizing it, feeding this fused feature vector to the SVM, and taking the output class label as the recognition result.
The face image processing method based on deep feature fusion of the embodiment of the present invention expands the facial expression image data set using data augmentation while extracting the SIFT features of each image in the data set; it uses the BoW representation of the SIFT features as the extracted shallow features, concatenates them with the deep features of the corresponding image into a single feature vector, and trains an SVM classifier, thereby improving generalization ability on limited data sets.
Embodiment 2
This embodiment describes the face image processing method based on deep feature fusion in detail. The data set used is the CK+ standard data set, containing 510 facial expression images covering seven basic emotions: anger, disgust, fear, neutral, happiness, sadness, and surprise. The CK+ training set is expanded with data augmentation; a pre-trained AlexNet model is fine-tuned on the CK+ training set, and the features of its fully connected layer are extracted as the learned deep features; at the same time the SIFT features of each image are extracted, and their BoW representation is used as the shallow features; the deep and shallow features of the corresponding image are concatenated into a single feature vector, on which an SVM classifier is trained. Finally each test set image is recognized by the trained SVM model, and the generalization ability and robustness of the feature are verified on several mainstream data sets.
The method includes the following steps:
201. Obtain the CK+ data set of 510 face images; divide it into training, validation, and test sets in an 8:1:1 ratio, ensuring that no subject identity appears in more than one set.
202. Preprocess all images in the data set. First perform face detection on each face picture with the Haar-feature-based AdaBoost face detector, crop the face region, and remove background influence; perform spatial normalization with the OpenCV vision library, adjusting the line between the two eyes to be horizontal and aligning the faces to the same position; perform histogram equalization on all images to enhance contrast and weaken brightness differences caused by illumination; finally normalize all images to 70*70 pixels.
203. Expand the data set by applying geometric transformations to the raw images: for each 70*70 training image, crop the 64*64 regions at the upper-left, upper-right, lower-left, lower-right, and center, and horizontally mirror each cropped image, expanding the training set 10 times.
204. Extract SIFT features from all expanded training set images; each SIFT keypoint descriptor is a 4*4*8 = 128-dimensional vector. After feature extraction, each sample is converted into an n*128 feature matrix, where n is the number of extracted feature points. Twenty SIFT descriptors are extracted per image, so the shallow feature of each training sample is represented by a 20*128 matrix.
205. Build a dictionary from all feature points of the training set: cluster all SIFT features with the k-means algorithm to obtain K cluster centers, which constitute the dictionary, with K = 500, i.e., 500 visual words.
206. Represent the SIFT features of each image as a single feature vector with the BoW model: compute, by minimum distance, which visual word in the dictionary each SIFT feature belongs to, count the number of feature points falling into each word, and obtain a statistical histogram for each image; this histogram is a 500-dimensional feature vector, which is the image's BoW representation with respect to the dictionary.
207. Use an AlexNet convolutional neural network model pre-trained on ImageNet (AlexNet-CNN) and fine-tune its parameters on the training set; change the number of nodes in the last fully connected layer of AlexNet to 500, set the initial learning rate to 0.001, and stop iterating when the validation set recognition rate no longer improves, obtaining a CNN model for extracting deep features.
208. Use the CNN model obtained in step 207 to extract the features of the last fully connected layer of the training set images as deep features; each image yields a 500-dimensional feature vector.
209. Concatenate the 500-dimensional shallow features extracted in step 206 with the 500-dimensional deep features extracted in step 208 to obtain a fused feature, and normalize all feature vectors to the range [-1, 1].
210. Train a support vector machine on the fused features of the training sample data set: select the optimal parameters c and g via grid.py cross-validation, and then run svmtrain with the obtained optimal c and g, a linear classifier, and the RBF kernel on the feature vectors of the entire training data set, obtaining the final target detection model.
211. In the test phase, likewise extract the BoW-represented SIFT features of the test set images as shallow features and the features of the last fully connected layer of the CNN model as deep features, concatenate the two feature vectors into a fused feature vector and normalize it, feed this fused feature vector to the SVM, and take the output class label as the recognition result.
The embodiment of the present invention fine-tunes a pre-trained AlexNet model on the CK+ training set and extracts the features of its fully connected layer as the learned deep features; at the same time it extracts the SIFT features of each image and uses their BoW representation as the shallow features; the shallow and deep features are concatenated into a new feature vector and normalized, serving as the new image representation on which the SVM classifier is trained. Compared with manual features, a CNN can learn richer high-level representations of an image but generally requires sufficient data; although manual features such as SIFT do not match the recognition performance of a CNN, they do not need large amounts of training data to produce useful features. The method thereby overcomes the insufficient expressive power of CNN deep features when facial expression data is scarce, using traditional features to assist the deep features and improve recognition performance on small data.
The fused feature proposed by the embodiment outperforms both the shallow SIFT-BoW feature and the deep feature extracted by the CNN; on the CK+ data set, fusing in the shallow feature improves the recognition rate of the deep feature by about 2%. A cross-data-set experiment further verifies the generalization ability of the feature: an SVM classifier trained on fused features extracted from the CK+ data set and tested on the JAFFE data set reaches a recognition rate of 47.8%, a clear improvement over classical schemes, and the overall result is good.
The face image processing method based on deep feature fusion of the embodiment of the present invention expands the facial expression image data set using data augmentation while extracting the SIFT features of each image in the data set; it uses the BoW representation of the SIFT features as the extracted shallow features, concatenates them with the deep features of the corresponding image into a single feature vector, and trains an SVM classifier, thereby improving generalization ability on limited data sets.
The embodiments of the present invention have been described above. The description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application, or improvements over technology in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Those skilled in the art will readily arrive at other embodiments of the disclosure after considering the specification and practicing what is disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles and include common knowledge or conventional techniques in the art not disclosed herein.
Claims (10)
1. a kind of face image processing process based on depth Fusion Features, which is characterized in that including:
Facial expression image data set is expanded using data enhancement methods, while extracting the facial expression image data
Concentrate the Scale invariant features transform SIFT feature of each image;
Using the SIFT feature that bag of words bag of words Bow is indicated as the shallow-layer feature of extraction, correspondence image is obtained
The series connection of depth feature is a feature vector, and is trained using support vector machines grader.
2. The method according to claim 1, characterized in that the facial expression image data set is the CK+ data set, and expanding the facial expression image data set using data augmentation comprises:
obtaining the CK+ data set containing multiple face images, partitioning it into training, validation, and test sets at an 8:1:1 ratio while ensuring that subject identities do not overlap across the sets;
preprocessing all images in the data set: performing face detection on each face picture with the Adaboost face detector based on Haar features, and cropping out the face region to remove background interference;
performing spatial normalization on the images with the open-source computer vision library OpenCV, rotating each face so that the line connecting the two eyes is horizontal and aligning the faces to the same position;
applying histogram equalization to all images to enhance contrast and weaken the effect of brightness differences caused by illumination; finally normalizing all images to 70*70 pixels;
expanding the data set by geometric transformation of the raw image data: from every 70*70 training image, cropping the five 64*64 regions at the top-left, top-right, bottom-left, bottom-right, and center, and horizontally mirroring each cropped image, thereby expanding the training set 10-fold.
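The ten-crop expansion described in the claim above can be sketched in NumPy (an illustrative sketch, not the patent's code; the `ten_crop` helper and the zero-valued test image are assumptions for demonstration):

```python
import numpy as np

def ten_crop(img):
    """Expand one 70x70 training image into ten 64x64 crops: the five
    fixed regions (four corners + center) plus their horizontal mirrors."""
    h, w = img.shape[:2]
    assert (h, w) == (70, 70)
    c = 64
    offsets = [(0, 0), (0, w - c), (h - c, 0), (h - c, w - c),
               ((h - c) // 2, (w - c) // 2)]   # UL, UR, LL, LR, center
    crops = [img[y:y + c, x:x + c] for y, x in offsets]
    crops += [np.fliplr(p) for p in crops]     # add horizontal mirrors
    return crops

crops = ten_crop(np.zeros((70, 70), dtype=np.uint8))
```

Each source image thus yields 10 training samples, which is where the 10-fold expansion of the training set comes from.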
3. The method according to claim 1, characterized in that extracting the SIFT features of each image in the facial expression image data set comprises:
extracting SIFT features from all expanded training set images, where each SIFT keypoint descriptor is a vector of 4*4*8=128 dimensions;
after feature extraction, converting each sample into an n*128-dimensional feature matrix, where n is the number of extracted feature points; it is specified that 20 SIFT descriptors are extracted per image, so the shallow feature of each training sample is represented by a 20*128-dimensional vector;
building a dictionary from all feature points extracted from the training set: clustering all SIFT features with the k-means clustering algorithm to obtain K cluster centers, which constitute the dictionary; K=500 is specified, i.e. 500 visual words.
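The dictionary-building step can be sketched with scikit-learn's k-means (a sketch under assumptions: random stand-in descriptors and a small K in place of the claimed K=500):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
descriptors = rng.random((400, 128))  # stand-in for all 128-D SIFT descriptors
K = 8                                 # stand-in for the claimed K = 500
kmeans = KMeans(n_clusters=K, n_init=10, random_state=0).fit(descriptors)
dictionary = kmeans.cluster_centers_  # shape (K, 128): the K visual words
```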
4. The method according to claim 1, characterized in that using the SIFT features represented with the BoW model as the extracted shallow features comprises:
representing the SIFT features of each image as a single feature vector with the BoW model: determining by the minimum-distance method which visual word in the dictionary each SIFT feature belongs to, and counting the number of feature points falling on each word to obtain the statistical histogram of every image; this histogram can be represented by a 500-dimensional feature vector, which is the image's BoW representation over the dictionary.
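The minimum-distance assignment and histogram counting above can be sketched as follows (illustrative; a small random dictionary stands in for the 500-word one):

```python
import numpy as np

def bow_histogram(features, dictionary):
    """Assign each descriptor to its nearest visual word (minimum Euclidean
    distance) and count how many descriptors fall on each word."""
    # pairwise squared distances, shape (n_descriptors, n_words)
    d = ((features[:, None, :] - dictionary[None, :, :]) ** 2).sum(-1)
    words = d.argmin(axis=1)             # nearest word index per descriptor
    return np.bincount(words, minlength=len(dictionary))

rng = np.random.default_rng(1)
dictionary = rng.random((8, 128))        # stand-in for the 500-word dictionary
feats = rng.random((20, 128))            # the 20 SIFT descriptors of one image
hist = bow_histogram(feats, dictionary)  # the image's BoW vector
```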
5. The method according to claim 1, characterized by further comprising:
fine-tuning a pre-trained AlexNet model on the CK+ training set and extracting the features of the model's fully connected layer as the learned deep features.
6. The method according to claim 5, characterized in that fine-tuning the pre-trained AlexNet model on the CK+ training set and extracting the features of the model's fully connected layer as the learned deep features comprises:
fine-tuning on the training set the parameters of an AlexNet convolutional neural network (AlexNet-CNN) pre-trained on ImageNet, with the number of nodes in the last fully connected layer of AlexNet modified to 500 and an initial learning rate of 0.001; when the recognition rate on the validation set no longer improves, stopping iteration to obtain the CNN model used for deep feature extraction.
7. The method according to claim 6, characterized by further comprising:
extracting the features of the last fully connected layer of the obtained CNN model for the training set images as the deep features, each single image yielding a 500-dimensional feature vector.
8. The method according to claim 4, characterized by further comprising:
concatenating the extracted 500-dimensional shallow features with the extracted 500-dimensional deep features to obtain a fused feature, and normalizing all feature vectors into the range [-1, 1].
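The fusion and normalization step can be sketched as follows (a sketch: per-vector min-max scaling is one plausible reading of "normalized into [-1, 1]"; libsvm-style per-dimension scaling over the training set would be an equally valid alternative):

```python
import numpy as np

def fuse_and_scale(shallow, deep):
    """Concatenate shallow (BoW) and deep (CNN) features row-wise and
    min-max scale every fused vector into [-1, 1]."""
    fused = np.concatenate([shallow, deep], axis=1)  # (N, 1000) for 500 + 500
    lo = fused.min(axis=1, keepdims=True)
    hi = fused.max(axis=1, keepdims=True)
    return 2 * (fused - lo) / (hi - lo) - 1

rng = np.random.default_rng(2)
fused = fuse_and_scale(rng.random((4, 500)), rng.random((4, 500)))
```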
9. The method according to claim 1, characterized in that training with the SVM classifier comprises:
training a support vector machine SVM on the fused features of the training sample data set: selecting the optimal parameters c and g by cross-validation with grid.py, then running svmtrain with the obtained optimal parameters c and g to train on the feature vectors of the entire training data set with a linear classifier and the radial basis function (RBF) kernel, obtaining the final target detection model.
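The grid.py + svmtrain workflow of libsvm has a direct scikit-learn analogue, sketched here on toy data (the feature values and labels below are stand-ins, not expression data):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.random((60, 10))                 # stand-in fused feature vectors
y = (X[:, 0] > 0.5).astype(int)          # toy class labels

# cross-validated search for the optimal C and gamma ("c and g")
grid = GridSearchCV(SVC(kernel="rbf"),
                    {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1]}, cv=3)
grid.fit(X, y)

# retrain an RBF-kernel SVM on the whole training set with the best params
model = SVC(kernel="rbf", **grid.best_params_).fit(X, y)
```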
10. The method according to any one of claims 1 to 9, characterized by further comprising:
at test time, likewise using the BoW-represented SIFT features as the shallow features and extracting the features of the last fully connected layer of the CNN model from the test data set images as the deep features; concatenating the two feature vectors into a fused feature vector and normalizing it; taking this fused feature vector as the input of the SVM, whose output class label is the recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810630864.XA CN108805216A (en) | 2018-06-19 | 2018-06-19 | Face image processing process based on depth Fusion Features |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810630864.XA CN108805216A (en) | 2018-06-19 | 2018-06-19 | Face image processing process based on depth Fusion Features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108805216A true CN108805216A (en) | 2018-11-13 |
Family
ID=64083492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810630864.XA Pending CN108805216A (en) | 2018-06-19 | 2018-06-19 | Face image processing process based on depth Fusion Features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108805216A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109978067A (en) * | 2019-04-02 | 2019-07-05 | 北京市天元网络技术股份有限公司 | A kind of trade-mark searching method and device based on convolutional neural networks and Scale invariant features transform |
CN110008876A (en) * | 2019-03-26 | 2019-07-12 | 电子科技大学 | A kind of face verification method based on data enhancing and Fusion Features |
CN110210329A (en) * | 2019-05-13 | 2019-09-06 | 高新兴科技集团股份有限公司 | A kind of method for detecting human face, device and equipment |
CN110516988A (en) * | 2019-07-08 | 2019-11-29 | 国网浙江省电力有限公司金华供电公司 | One kind being fitted Power Material method based on neural network |
CN111259913A (en) * | 2020-01-14 | 2020-06-09 | 哈尔滨工业大学 | Cell spectral image classification method based on bag-of-word model and textural features |
CN111401390A (en) * | 2019-01-02 | 2020-07-10 | 中国移动通信有限公司研究院 | Classifier method and device, electronic device and storage medium |
CN111860039A (en) * | 2019-04-26 | 2020-10-30 | 四川大学 | Cross-connection CNN + SVR-based street space quality quantification method |
CN112668482A (en) * | 2020-12-29 | 2021-04-16 | 中国平安人寿保险股份有限公司 | Face recognition training method and device, computer equipment and storage medium |
CN112883880A (en) * | 2021-02-25 | 2021-06-01 | 电子科技大学 | Pedestrian attribute identification method based on human body structure multi-scale segmentation, storage medium and terminal |
CN113792574A (en) * | 2021-07-14 | 2021-12-14 | 哈尔滨工程大学 | Cross-data-set expression recognition method based on metric learning and teacher student model |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106156793A (en) * | 2016-06-27 | 2016-11-23 | 西北工业大学 | Extract in conjunction with further feature and the classification method of medical image of shallow-layer feature extraction |
CN107729835A (en) * | 2017-10-10 | 2018-02-23 | 浙江大学 | A kind of expression recognition method based on face key point region traditional characteristic and face global depth Fusion Features |
- 2018-06-19: Application CN201810630864.XA filed in China (CN) as CN108805216A, status: Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106156793A (en) * | 2016-06-27 | 2016-11-23 | 西北工业大学 | Extract in conjunction with further feature and the classification method of medical image of shallow-layer feature extraction |
CN107729835A (en) * | 2017-10-10 | 2018-02-23 | 浙江大学 | A kind of expression recognition method based on face key point region traditional characteristic and face global depth Fusion Features |
Non-Patent Citations (4)
Title |
---|
MIKAEL JORDA ET AL.: "Emotion classification on face images", 《HTTP://CS229.STANFORD.EDU/PROJ2015/》 * |
TEE CONNIE ET AL.: "Facial Expression Recognition Using a Hybrid CNN–SIFT Aggregator", 《MULTI-DISCIPLINARY TRENDS IN ARTIFICIAL INTELLIGENCE》 * |
SUN Xiao et al.: "Facial Expression Recognition Based on ROI-KNN Convolutional Neural Networks", Acta Automatica Sinica * |
ZHANG Xiaoming: "Research on Facial Expression Recognition Based on SIFT Features", China Masters' Theses Full-text Database, Information Science and Technology * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111401390B (en) * | 2019-01-02 | 2023-04-07 | 中国移动通信有限公司研究院 | Classifier method and device, electronic device and storage medium |
CN111401390A (en) * | 2019-01-02 | 2020-07-10 | 中国移动通信有限公司研究院 | Classifier method and device, electronic device and storage medium |
CN110008876A (en) * | 2019-03-26 | 2019-07-12 | 电子科技大学 | A kind of face verification method based on data enhancing and Fusion Features |
CN109978067A (en) * | 2019-04-02 | 2019-07-05 | 北京市天元网络技术股份有限公司 | A kind of trade-mark searching method and device based on convolutional neural networks and Scale invariant features transform |
CN111860039B (en) * | 2019-04-26 | 2022-08-02 | 四川大学 | Cross-connection CNN + SVR-based street space quality quantification method |
CN111860039A (en) * | 2019-04-26 | 2020-10-30 | 四川大学 | Cross-connection CNN + SVR-based street space quality quantification method |
CN110210329A (en) * | 2019-05-13 | 2019-09-06 | 高新兴科技集团股份有限公司 | A kind of method for detecting human face, device and equipment |
CN110516988A (en) * | 2019-07-08 | 2019-11-29 | 国网浙江省电力有限公司金华供电公司 | One kind being fitted Power Material method based on neural network |
CN111259913A (en) * | 2020-01-14 | 2020-06-09 | 哈尔滨工业大学 | Cell spectral image classification method based on bag-of-word model and textural features |
CN112668482A (en) * | 2020-12-29 | 2021-04-16 | 中国平安人寿保险股份有限公司 | Face recognition training method and device, computer equipment and storage medium |
CN112668482B (en) * | 2020-12-29 | 2023-11-21 | 中国平安人寿保险股份有限公司 | Face recognition training method, device, computer equipment and storage medium |
CN112883880A (en) * | 2021-02-25 | 2021-06-01 | 电子科技大学 | Pedestrian attribute identification method based on human body structure multi-scale segmentation, storage medium and terminal |
CN112883880B (en) * | 2021-02-25 | 2022-08-19 | 电子科技大学 | Pedestrian attribute identification method based on human body structure multi-scale segmentation, storage medium and terminal |
CN113792574A (en) * | 2021-07-14 | 2021-12-14 | 哈尔滨工程大学 | Cross-data-set expression recognition method based on metric learning and teacher student model |
CN113792574B (en) * | 2021-07-14 | 2023-12-19 | 哈尔滨工程大学 | Cross-dataset expression recognition method based on metric learning and teacher student model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108805216A (en) | Face image processing process based on depth Fusion Features | |
CN109359538B (en) | Training method of convolutional neural network, gesture recognition method, device and equipment | |
CN106096557B (en) | A kind of semi-supervised learning facial expression recognizing method based on fuzzy training sample | |
CN103942550B (en) | A kind of scene text recognition methods based on sparse coding feature | |
Jin et al. | Face detection using template matching and skin-color information | |
CN103984948B (en) | A kind of soft double-deck age estimation method based on facial image fusion feature | |
KR20200000824A (en) | Method for recognizing facial expression based on deep-learning model using center-dispersion loss function | |
CN109145871B (en) | Psychological behavior recognition method, device and storage medium | |
CN109190561B (en) | Face recognition method and system in video playing | |
CN108776774A (en) | A kind of human facial expression recognition method based on complexity categorization of perception algorithm | |
Shirbhate et al. | Sign language recognition using machine learning algorithm | |
CN108182409A (en) | Biopsy method, device, equipment and storage medium | |
CN110188708A (en) | A kind of facial expression recognizing method based on convolutional neural networks | |
CN104504383B (en) | A kind of method for detecting human face based on the colour of skin and Adaboost algorithm | |
Zhao et al. | Semantic parts based top-down pyramid for action recognition | |
CN105373777A (en) | Face recognition method and device | |
JP2021193610A (en) | Information processing method, information processing device, electronic apparatus and storage medium | |
CN106960181A (en) | A kind of pedestrian's attribute recognition approach based on RGBD data | |
CN111694959A (en) | Network public opinion multi-mode emotion recognition method and system based on facial expressions and text information | |
CN107992807A (en) | A kind of face identification method and device based on CNN models | |
Paul et al. | Extraction of facial feature points using cumulative histogram | |
Prabhu et al. | Facial Expression Recognition Using Enhanced Convolution Neural Network with Attention Mechanism. | |
Booysens et al. | Ear biometrics using deep learning: A survey | |
CN109508660A (en) | A kind of AU detection method based on video | |
CN111738177B (en) | Student classroom behavior identification method based on attitude information extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 2018-11-13 |