CN108805216A - Facial image processing method based on deep feature fusion - Google Patents

Facial image processing method based on deep feature fusion

Info

Publication number
CN108805216A
Authority
CN
China
Prior art keywords
feature
image
training
sift
extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810630864.XA
Other languages
Chinese (zh)
Inventor
孙晓
夏平平
吕曼
丁帅
杨善林
田芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN201810630864.XA
Publication of CN108805216A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation with fixed number of clusters, e.g. K-means clustering
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present invention discloses a facial image processing method based on deep feature fusion that can improve generalization ability on a limited data set. The method includes: augmenting a facial expression image data set using data enhancement methods while extracting the SIFT features of every image in the data set; and using the SIFT features represented by the bag-of-words (BoW) model as the extracted shallow features, concatenating the deep features obtained for the corresponding image into a single feature vector, and training an SVM classifier.

Description

Facial image processing method based on deep feature fusion
Technical field
The present invention relates to the field of image recognition, and in particular to a facial image processing method based on deep feature fusion.
Background art
Research on expression recognition can be traced back to the 1970s; early studies concentrated mainly on psychology and biology.
A traditional classification-based facial expression recognition pipeline comprises several steps: face detection, feature extraction, and pattern classification. The face detection module detects and localizes the face; the expression feature extraction module extracts, from the face sub-image, descriptive information that characterizes the expression; and the pattern classification module analyzes the output of the previous module and assigns the expression to the corresponding class according to classification criteria. Among these, facial feature extraction is the most important part of an expression recognition system, and the quality of the recognition result depends primarily on the quality of the features.
Researchers generally spend a great deal of time and effort on extracting and using better features. These hand-designed features not only consume a large amount of researcher time but also depend strongly on the training data; they are often unreliable and unstable, and are easily disturbed, so their generalization in the expression recognition field still leaves room for improvement.
Summary of the invention
An embodiment of the present invention provides a facial image processing method based on deep feature fusion that can improve generalization ability on a limited data set.
The embodiment of the present invention adopts the following technical scheme:
A facial image processing method based on deep feature fusion, including:
augmenting a facial expression image data set using data enhancement methods while extracting the SIFT features of every image in the facial expression image data set;
using the SIFT features represented by the bag-of-words (BoW) model as the extracted shallow features, concatenating the deep features obtained for the corresponding image into a single feature vector, and training with an SVM classifier.
Optionally, the facial expression image data set is the CK+ data set, and augmenting the facial expression image data set using data enhancement methods includes:
obtaining the CK+ data set containing multiple facial images, splitting the data set 8:1:1 into a training set, a validation set, and a test set, and ensuring that no subject identity appears in more than one set;
preprocessing all images in the data set: performing face detection on every face picture with the Haar-feature-based AdaBoost face detection method, and cropping the face region to remove background influence;
spatially normalizing the images with the OpenCV vision library, rotating each face so that the line connecting the two eyes stays horizontal and aligning the faces to the same position;
applying histogram equalization to all images to enhance contrast and weaken the brightness differences caused by illumination, and finally normalizing all images to 70×70 pixels;
extending the data set by applying geometric transformations to the raw image data: for every 70×70 training image, cropping the 64×64 regions at its top-left, top-right, bottom-left, bottom-right, and center, and horizontally mirroring each cropped image, expanding the training set 10-fold.
Optionally, extracting the SIFT features of every image in the facial expression image data set includes:
extracting SIFT features from all training-set images after expansion, each SIFT keypoint descriptor being a 4×4×8 = 128-dimensional vector;
after feature extraction, converting each sample into an n×128 feature matrix, where n is the number of extracted feature points; 20 SIFT feature descriptors are specified per image, so the shallow feature of each training sample is represented as a 20×128-dimensional vector;
building a dictionary from all feature points of the training set: clustering all SIFT features with the k-means clustering algorithm to obtain K cluster centers, which constitute the dictionary; K = 500 is specified, i.e. 500 visual words.
Optionally, using the SIFT features represented by the bag-of-words (BoW) model as the extracted shallow features includes:
expressing the SIFT features of each image as one feature vector with the BoW model: computing, by the minimum-distance rule, which visual word in the dictionary each SIFT feature belongs to, counting the number of feature points that fall on each word, and obtaining a statistical histogram for each image; this histogram can be represented as a 500-dimensional feature vector, which is the image's BoW representation over the dictionary.
Optionally, the method further includes:
fine-tuning a pre-trained AlexNet model on the CK+ training set, and extracting the features of the model's fully connected layer as the learned deep features.
Optionally, fine-tuning the pre-trained AlexNet model on the CK+ training set and extracting the features of the model's fully connected layer as the learned deep features includes:
fine-tuning the parameters on the training set using the AlexNet convolutional neural network model (AlexNet-CNN) pre-trained on ImageNet, changing the number of nodes in the last fully connected layer of AlexNet to 500 with an initial learning rate of 0.001, and stopping iteration when the validation-set recognition rate no longer improves, yielding the CNN model used to extract deep features.
Optionally, the method further includes:
extracting, with the obtained CNN model, the features of the last fully connected layer for the training-set images as deep features, each single image yielding a 500-dimensional feature vector.
Optionally, the method further includes:
concatenating the extracted 500-dimensional shallow features with the extracted 500-dimensional deep features to obtain a fused feature, and normalizing all feature vectors to the range [-1, 1].
Optionally, training with the SVM classifier includes:
training a support vector machine on the fused features of the training sample data set: selecting the optimal parameters c and g by grid.py cross-validation, then having svmtrain train on the feature vectors of the entire training data set with the obtained optimal c and g, a linear classifier, and the RBF kernel function, obtaining the final target detection model.
Optionally, the method further includes:
at test time, likewise extracting the BoW-represented SIFT features from the test data set images as shallow features, extracting the features of the last fully connected layer of the CNN model as deep features, concatenating the two feature vectors into a fused feature vector and normalizing it, and feeding this fused feature vector to the SVM, whose output class label is the recognition result.
With the above technical scheme, the facial image processing method based on deep feature fusion augments a facial expression image data set using data enhancement methods while extracting the SIFT features of every image in the data set; it uses the BoW-represented SIFT features as the extracted shallow features, concatenates the deep features obtained for the corresponding image into a single feature vector, and trains an SVM classifier, thereby improving generalization ability on a limited data set.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Description of the drawings
The drawings herein are incorporated into and form part of this specification; they show embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the invention.
Fig. 1 is a flowchart of the facial image processing method based on deep feature fusion according to an embodiment of the present invention.
Detailed description of the embodiments
Example embodiments are described in detail here, with examples illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following example embodiments do not represent all embodiments consistent with the present invention; rather, they are merely examples of devices and methods consistent with some aspects of the invention as detailed in the appended claims.
Embodiments of the present invention address two limitations: features extracted by traditional methods are poorly robust to complex variations in illumination, expression, and pose; and an ordinary convolutional neural network trained on a limited data set has insufficient feature extraction capacity, cannot effectively represent facial expression features, and therefore suffers in generalization ability. A feature representation based on deep feature fusion is proposed: SIFT feature descriptors and the fully connected layer feature vector of a convolutional neural network are extracted separately from the facial image and concatenated into a new feature representation, which is then classified with an SVM. Through the assistance and reinforcement that the SIFT features provide to the CNN features, the fused feature achieves good generalization ability on a limited data set.
Embodiment 1
As shown in Fig. 1, an embodiment of the present invention provides a facial image processing method based on deep feature fusion, including:
11. Augment a facial expression image data set using data enhancement methods, while extracting the SIFT (Scale-Invariant Feature Transform) features of every image in the facial expression image data set.
12. Use the SIFT features represented by the BoW (bag-of-words) model as the extracted shallow features, concatenate the deep features obtained for the corresponding image into a single feature vector, and train an SVM (Support Vector Machine) classifier.
Optionally, the facial expression image data set is the CK+ data set, and augmenting the facial expression image data set using data enhancement methods includes:
obtaining the CK+ data set containing multiple facial images, splitting the data set 8:1:1 into a training set, a validation set, and a test set, and ensuring that no subject identity appears in more than one set;
preprocessing all images in the data set: performing face detection on every face picture with the Haar-feature-based AdaBoost face detection method, and cropping the face region to remove background influence;
spatially normalizing the images with OpenCV (Open Source Computer Vision Library), rotating each face so that the line connecting the two eyes stays horizontal and aligning the faces to the same position;
applying histogram equalization to all images to enhance contrast and weaken the brightness differences caused by illumination, and finally normalizing all images to 70×70 pixels;
extending the data set by applying geometric transformations to the raw image data: for every 70×70 training image, cropping the 64×64 regions at its top-left, top-right, bottom-left, bottom-right, and center, and horizontally mirroring each cropped image, expanding the training set 10-fold.
Optionally, extracting the SIFT features of every image in the facial expression image data set includes:
extracting SIFT features from all training-set images after expansion, each SIFT keypoint descriptor being a 4×4×8 = 128-dimensional vector;
after feature extraction, converting each sample into an n×128 feature matrix, where n is the number of extracted feature points; 20 SIFT feature descriptors are specified per image, so the shallow feature of each training sample is represented as a 20×128-dimensional vector;
building a dictionary from all feature points of the training set: clustering all SIFT features with the k-means clustering algorithm to obtain K cluster centers, which constitute the dictionary; K = 500 is specified, i.e. 500 visual words.
Optionally, using the SIFT features represented by the bag-of-words (BoW) model as the extracted shallow features includes:
expressing the SIFT features of each image as one feature vector with the BoW model: computing, by the minimum-distance rule, which visual word in the dictionary each SIFT feature belongs to, counting the number of feature points that fall on each word, and obtaining a statistical histogram for each image; this histogram can be represented as a 500-dimensional feature vector, which is the image's BoW representation over the dictionary.
Optionally, the method further includes:
fine-tuning a pre-trained AlexNet model on the CK+ training set, and extracting the features of the model's fully connected layer as the learned deep features.
Optionally, fine-tuning the pre-trained AlexNet model on the CK+ training set and extracting the features of the model's fully connected layer as the learned deep features includes:
fine-tuning the parameters on the training set using the AlexNet convolutional neural network model (AlexNet-CNN) pre-trained on ImageNet, changing the number of nodes in the last fully connected layer of AlexNet to 500 with an initial learning rate of 0.001, and stopping iteration when the validation-set recognition rate no longer improves, yielding the CNN model used to extract deep features.
Optionally, the method further includes:
extracting, with the obtained CNN model, the features of the last fully connected layer for the training-set images as deep features, each single image yielding a 500-dimensional feature vector.
Optionally, the method further includes:
concatenating the extracted 500-dimensional shallow features with the extracted 500-dimensional deep features to obtain a fused feature, and normalizing all feature vectors to the range [-1, 1].
Optionally, training with the SVM classifier includes:
training a support vector machine on the fused features of the training sample data set: selecting the optimal parameters c (the penalty coefficient) and g (the kernel function parameter) by grid.py cross-validation, then having svmtrain train on the feature vectors of the entire training data set with the obtained optimal c and g, a linear classifier, and the RBF (Radial Basis Function) kernel function, obtaining the final target detection model.
Optionally, the method further includes:
at test time, likewise extracting the BoW-represented SIFT features from the test data set images as shallow features, extracting the features of the last fully connected layer of the CNN model as deep features, concatenating the two feature vectors into a fused feature vector and normalizing it, and feeding this fused feature vector to the SVM, whose output class label is the recognition result.
The facial image processing method based on deep feature fusion of the embodiment of the present invention augments a facial expression image data set using data enhancement methods while extracting the SIFT features of every image in the data set; it uses the BoW-represented SIFT features as the extracted shallow features, concatenates the deep features obtained for the corresponding image into a single feature vector, and trains an SVM classifier, thereby improving generalization ability on a limited data set.
Embodiment 2
This embodiment describes the facial image processing method based on deep feature fusion of the present invention in detail. The data set used in this embodiment is the CK+ standard data set, comprising 510 facial expression images covering the seven basic emotions of anger, disgust, fear, neutral, happiness, sadness, and surprise. The CK+ training set is augmented using certain data enhancement methods; a pre-trained AlexNet model is fine-tuned on the CK+ training set, and the features of the model's fully connected layer are extracted as the learned deep features; at the same time the SIFT features of each image are extracted, and the SIFT features represented by the BoW model serve as the extracted shallow features; the deep features obtained for the corresponding image are concatenated into a single feature vector, and an SVM classifier is trained. Finally, single test-set images are recognized with the trained SVM model, and the generalization ability and robustness of this feature are verified on several mainstream data sets.
The method includes the following steps:
201. Obtain the CK+ data set of 510 facial images in total, split the data set 8:1:1 into a training set, a validation set, and a test set, and ensure that no subject identity appears in more than one set.
202. Preprocess all images in the data set. First, perform face detection on every face picture with the Haar-feature-based AdaBoost face detection method, and crop the face region to remove background influence; spatially normalize the images with the OpenCV vision library, rotating each face so that the line connecting the two eyes stays horizontal and aligning the faces to the same position; apply histogram equalization to all images to enhance contrast and weaken the brightness differences caused by illumination; finally, normalize all images to 70×70 pixels.
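The patent publishes no source code; the following Python sketch is one illustrative reading of step 202 (the cascade file, function names, and the omission of the eye-alignment rotation are our assumptions), showing Haar/AdaBoost face detection, cropping, histogram equalization, and resizing to 70×70 with OpenCV:

```python
import cv2

# OpenCV ships a pre-trained Haar cascade (an AdaBoost detector on Haar features).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def preprocess(path):
    """Detect, crop, equalize, and resize one face image (illustrative sketch)."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None                                     # no face found; caller skips it
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest detection
    face = cv2.equalizeHist(gray[y:y + h, x:x + w])     # crop + reduce illumination bias
    return cv2.resize(face, (70, 70))                   # normalize size to 70x70 pixels
```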
203. Extend the data set by applying geometric transformations to the raw image data: for every 70×70 training image, crop the 64×64 regions at its top-left, top-right, bottom-left, bottom-right, and center, and horizontally mirror each cropped image, expanding the training set 10-fold.
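A minimal sketch of the 10-fold expansion in step 203, assuming 70×70 grayscale NumPy arrays as input: five 64×64 crops (the four corners plus the center) and a horizontal mirror of each.

```python
import numpy as np

def augment(img70):
    """Ten 64x64 samples from one 70x70 image: 5 crops x (original + mirror)."""
    crops = []
    for y, x in [(0, 0), (0, 6), (6, 0), (6, 6), (3, 3)]:  # TL, TR, BL, BR, center
        c = img70[y:y + 64, x:x + 64]
        crops.extend([c, np.fliplr(c)])  # horizontal mirroring doubles each crop
    return crops
```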
204. Extract SIFT features from all training-set images after expansion; each SIFT keypoint descriptor is a 4×4×8 = 128-dimensional vector. After feature extraction, each sample is converted into an n×128 feature matrix, where n is the number of extracted feature points. 20 SIFT feature descriptors are specified per image, so the shallow feature of each training sample is represented as a 20×128-dimensional vector.
205. Build a dictionary using all feature points of the training set: cluster all SIFT features with the k-means clustering algorithm to obtain K cluster centers; these cluster centers constitute the dictionary. K = 500 is specified, i.e. 500 visual words.
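Steps 204 and 205 might look as follows in Python (assumptions on our part: OpenCV's SIFT implementation and k-means routine are used, and `train_images` holds the augmented 64×64 training crops). Twenty descriptors are requested per image, and all training descriptors are clustered into K = 500 visual words:

```python
import cv2
import numpy as np

sift = cv2.SIFT_create(nfeatures=20)  # 20 descriptors per image, as specified

def sift_descriptors(img):
    _, desc = sift.detectAndCompute(img, None)
    return desc  # (n, 128) array; each keypoint descriptor is 4x4x8 = 128-d

# Stack every training descriptor and cluster into K = 500 visual words.
all_desc = np.vstack([d for d in (sift_descriptors(im) for im in train_images)
                      if d is not None]).astype(np.float32)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-4)
_, _, dictionary = cv2.kmeans(all_desc, 500, None, criteria, 5, cv2.KMEANS_PP_CENTERS)
# `dictionary` has shape (500, 128): one row per cluster centre, i.e. per visual word
```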
206. Express the SIFT features of each image as one feature vector with the BoW model: compute, by the minimum-distance rule, which visual word in the dictionary each SIFT feature belongs to, count the number of feature points falling on each word, and obtain a statistical histogram for each image; this histogram can be represented as a 500-dimensional feature vector, which is the image's BoW representation over the dictionary.
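The BoW encoding of step 206 reduces to a nearest-word vote per descriptor; a sketch, continuing the names above:

```python
def bow_vector(desc, dictionary):
    """500-d histogram: how many descriptors fall on each visual word."""
    hist = np.zeros(len(dictionary), dtype=np.float32)
    if desc is not None:
        for d in desc:  # minimum-distance rule: assign to the nearest cluster centre
            hist[np.argmin(np.linalg.norm(dictionary - d, axis=1))] += 1
    return hist
```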
207. Using the AlexNet convolutional neural network model (AlexNet-CNN) pre-trained on ImageNet, fine-tune the parameters on the training set; the number of nodes in the last fully connected layer of AlexNet is changed to 500, the initial learning rate is 0.001, and iteration stops when the validation-set recognition rate no longer improves, yielding the CNN model used to extract deep features.
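The patent names no training framework; below is a PyTorch sketch of step 207 under one reading (our assumptions: the penultimate fully connected layer is the one resized to 500 nodes so it can serve as the feature layer, a 7-way output head covers the seven basic emotions, and SGD with momentum 0.9 is the optimizer):

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.alexnet(weights="IMAGENET1K_V1")  # pre-trained on ImageNet
model.classifier[4] = nn.Linear(4096, 500)       # FC feature layer resized to 500 nodes
model.classifier[6] = nn.Linear(500, 7)          # 7 basic emotions (assumption)

optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)  # initial lr 0.001
criterion = nn.CrossEntropyLoss()

def train_epoch(loader):
    model.train()
    for images, labels in loader:  # fine-tune all parameters on the training set
        optimizer.zero_grad()
        criterion(model(images), labels).backward()
        optimizer.step()
# An outer loop (not shown) stops training once the validation recognition rate plateaus.
```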
208. Use the CNN model obtained in step 207 to extract the features of the last fully connected layer for the training-set images as deep features; each single image yields a 500-dimensional feature vector.
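Continuing the sketch, the 500-dimensional deep feature of step 208 is read out as the activation of the resized layer (taken here after its ReLU; the patent does not specify whether the activation is read before or after the nonlinearity):

```python
feature_extractor = nn.Sequential(
    model.features, model.avgpool, nn.Flatten(),
    *list(model.classifier.children())[:6])  # through the 500-node layer and its ReLU

@torch.no_grad()
def deep_feature(img_tensor):                 # img_tensor: (3, 224, 224) float tensor
    model.eval()                              # disables dropout in the classifier
    return feature_extractor(img_tensor.unsqueeze(0)).squeeze(0)  # shape (500,)
```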
209. Concatenate the 500-dimensional shallow features extracted in step 206 with the 500-dimensional deep features extracted in step 208 to obtain a fused feature, and normalize all feature vectors to the range [-1, 1].
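For step 209, one reasonable reading of the normalization is per-dimension [-1, 1] scaling with training-set statistics, in the spirit of LIBSVM's svm-scale (an assumption; `X_bow` and `X_deep` are the stacked 500-d shallow and deep features of all training samples):

```python
X_fused = np.hstack([X_bow, X_deep])        # (num_samples, 1000) fused features

def fit_scaler(X_train):
    """Per-dimension minimum and span of the fused training features."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return lo, np.where(hi > lo, hi - lo, 1.0)  # guard against zero-width dimensions

def scale(X, lo, span):
    return 2.0 * (X - lo) / span - 1.0      # maps each dimension into [-1, 1]

lo, span = fit_scaler(X_fused)
X_scaled = scale(X_fused, lo, span)
```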
210. Train a support vector machine on the fused features of the training sample data set: select the optimal parameters c and g by grid.py cross-validation, then have svmtrain train on the feature vectors of the entire training data set with the obtained optimal c and g, a linear classifier, and the RBF kernel function, obtaining the final target detection model.
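The patent relies on LIBSVM's grid.py and svmtrain; an equivalent scikit-learn sketch (an assumption, reusing grid.py's default exponential search ranges for C and gamma, with `y_train` the expression labels) is:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Cross-validated search over C (penalty coefficient) and gamma (RBF kernel parameter).
param_grid = {"C":     [2.0 ** k for k in range(-5, 16, 2)],
              "gamma": [2.0 ** k for k in range(-15, 4, 2)]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X_scaled, y_train)           # fused, scaled training features + labels
svm_model = search.best_estimator_      # final model refit with the optimal C and gamma
```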
211. In the test phase, likewise extract the BoW-represented SIFT features from the test data set images as shallow features, and extract the features of the last fully connected layer of the CNN model as deep features; concatenate the two feature vectors into a fused feature vector and normalize it; feed this fused feature vector to the SVM, and take the output class label as the recognition result.
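Tying the sketches together, a hypothetical end-to-end recognition function for one test image (all names refer to the sketches above; upscaling the 64×64 face to AlexNet's 224×224 input is our assumption, since the patent does not state how the crops are fed to the network):

```python
from torchvision import transforms

to_tensor = transforms.Compose([            # grayscale 64x64 -> (3, 224, 224) tensor
    transforms.ToPILImage(), transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)), transforms.ToTensor()])

def recognize(path):
    face = preprocess(path)                            # 70x70 equalized face crop
    face = cv2.resize(face, (64, 64))                  # match the training crop size
    shallow = bow_vector(sift_descriptors(face), dictionary)  # 500-d BoW histogram
    deep = deep_feature(to_tensor(face)).numpy()              # 500-d CNN feature
    fused = scale(np.concatenate([shallow, deep])[None, :], lo, span)
    return svm_model.predict(fused)[0]                 # output class label = result
```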
The embodiment of the present invention fine-tunes a pre-trained AlexNet model on the CK+ training set and extracts the features of the model's fully connected layer as the learned deep features; at the same time it extracts the SIFT features of each image and uses the SIFT features represented by the BoW model as the extracted shallow features; the shallow and deep features are concatenated into a new feature vector and normalized, serving as the new feature representation of the image for training the SVM classifier. A CNN can learn a richer, higher-level feature representation of an image than manual features, but generally requires a sufficient amount of data; conversely, although the recognition performance of hand-crafted features such as SIFT is inferior to that of a CNN, they do not need large amounts of training data to produce useful features. The method therefore overcomes the insufficient expressive power of CNN deep features when facial expression data sets are limited, using traditional features to assist the deep features and improve recognition performance under small data.
The fused feature proposed in the embodiment of the present invention outperforms both the SIFT-BoW shallow feature and the deep feature extracted by the CNN: on the CK+ data set, fusing in the shallow feature improves the recognition rate of the deep feature by about 2%. A cross-data-set experiment further verifies the generalization ability of this feature: an SVM classifier trained on fused features extracted from the CK+ data set achieves a recognition rate of 47.8% when tested on the JAFFE data set, a clear improvement over classical schemes, with good overall results.
The facial image processing method based on deep feature fusion of the embodiment of the present invention augments a facial expression image data set using data enhancement methods while extracting the SIFT features of every image in the data set; it uses the BoW-represented SIFT features as the extracted shallow features, concatenates the deep features obtained for the corresponding image into a single feature vector, and trains an SVM classifier, thereby improving generalization ability on a limited data set.
The various embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The terminology used herein is chosen to best explain the principles of the embodiments, their practical application, or improvements over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Other embodiments of the disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles and include common knowledge or conventional techniques in the art not disclosed in this disclosure.

Claims (10)

1. A facial image processing method based on deep feature fusion, characterized by including:
augmenting a facial expression image data set using data enhancement methods while extracting the scale-invariant feature transform (SIFT) features of every image in the facial expression image data set;
using the SIFT features represented by the bag-of-words (BoW) model as the extracted shallow features, concatenating the deep features obtained for the corresponding image into a single feature vector, and training with a support vector machine (SVM) classifier.
2. The method according to claim 1, characterized in that the facial expression image data set is the CK+ data set, and augmenting the facial expression image data set using data enhancement methods includes:
obtaining the CK+ data set containing multiple facial images, splitting the data set 8:1:1 into a training set, a validation set, and a test set, and ensuring that no subject identity appears in more than one set;
preprocessing all images in the data set: performing face detection on every face picture with the Haar-feature-based AdaBoost face detection method, and cropping the face region to remove background influence;
spatially normalizing the images with the open-source computer vision library OpenCV, rotating each face so that the line connecting the two eyes stays horizontal and aligning the faces to the same position;
applying histogram equalization to all images to enhance contrast and weaken the brightness differences caused by illumination, and finally normalizing all images to 70×70 pixels;
extending the data set by applying geometric transformations to the raw image data: for every 70×70 training image, cropping the 64×64 regions at its top-left, top-right, bottom-left, bottom-right, and center, and horizontally mirroring each cropped image, expanding the training set 10-fold.
3. The method according to claim 1, characterized in that extracting the SIFT features of every image in the facial expression image data set includes:
extracting SIFT features from all training-set images after expansion, each SIFT keypoint descriptor being a 4×4×8 = 128-dimensional vector;
after feature extraction, converting each sample into an n×128 feature matrix, where n is the number of extracted feature points; 20 SIFT feature descriptors are specified per image, so the shallow feature of each training sample is represented as a 20×128-dimensional vector;
building a dictionary from all feature points of the training set: clustering all SIFT features with the k-means clustering algorithm to obtain K cluster centers, which constitute the dictionary; K = 500 is specified, i.e. 500 visual words.
4. The method according to claim 1, characterized in that using the SIFT features represented by the bag-of-words (BoW) model as the extracted shallow features includes:
expressing the SIFT features of each image as one feature vector with the BoW model: computing, by the minimum-distance rule, which visual word in the dictionary each SIFT feature belongs to, counting the number of feature points that fall on each word, and obtaining a statistical histogram for each image; this histogram can be represented as a 500-dimensional feature vector, which is the image's BoW representation over the dictionary.
5. The method according to claim 1, characterized by further including:
fine-tuning a pre-trained AlexNet model on the CK+ training set, and extracting the features of the model's fully connected layer as the learned deep features.
6. The method according to claim 5, characterized in that fine-tuning the pre-trained AlexNet model on the CK+ training set and extracting the features of the model's fully connected layer as the learned deep features includes:
fine-tuning the parameters on the training set using the AlexNet convolutional neural network model (AlexNet-CNN) pre-trained on ImageNet, changing the number of nodes in the last fully connected layer of AlexNet to 500 with an initial learning rate of 0.001, and stopping iteration when the validation-set recognition rate no longer improves, yielding the CNN model used to extract deep features.
7. The method according to claim 6, characterized by further including:
extracting, with the obtained CNN model, the features of the last fully connected layer for the training-set images as deep features, each single image yielding a 500-dimensional feature vector.
8. The method according to claim 4, characterized by further including:
concatenating the extracted 500-dimensional shallow features with the extracted 500-dimensional deep features to obtain a fused feature, and normalizing all feature vectors to the range [-1, 1].
9. The method according to claim 1, characterized in that training with the SVM classifier includes:
training a support vector machine on the fused features of the training sample data set: selecting the optimal parameters c and g by grid.py cross-validation, then having svmtrain train on the feature vectors of the entire training data set with the obtained optimal c and g, a linear classifier, and the radial basis function (RBF) kernel, obtaining the final target detection model.
10. The method according to any one of claims 1 to 9, characterized by further including:
at test time, likewise extracting the BoW-represented SIFT features from the test data set images as shallow features, extracting the features of the last fully connected layer of the CNN model as deep features, concatenating the two feature vectors into a fused feature vector and normalizing it, feeding this fused feature vector to the SVM, and taking the output class label as the recognition result.
CN201810630864.XA 2018-06-19 2018-06-19 Facial image processing method based on deep feature fusion Pending CN108805216A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810630864.XA CN108805216A (en) 2018-06-19 2018-06-19 Facial image processing method based on deep feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810630864.XA CN108805216A (en) 2018-06-19 2018-06-19 Facial image processing method based on deep feature fusion

Publications (1)

Publication Number Publication Date
CN108805216A true CN108805216A (en) 2018-11-13

Family

ID=64083492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810630864.XA Pending (en) Facial image processing method based on deep feature fusion

Country Status (1)

Country Link
CN (1) CN108805216A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978067A * 2019-04-02 2019-07-05 北京市天元网络技术股份有限公司 A trademark retrieval method and device based on convolutional neural networks and scale-invariant feature transform
CN110008876A * 2019-03-26 2019-07-12 电子科技大学 A face verification method based on data augmentation and feature fusion
CN110210329A * 2019-05-13 2019-09-06 高新兴科技集团股份有限公司 A face detection method, device and equipment
CN110516988A * 2019-07-08 2019-11-29 国网浙江省电力有限公司金华供电公司 A neural-network-based method for fitting power materials
CN111259913A * 2020-01-14 2020-06-09 哈尔滨工业大学 Cell spectral image classification method based on the bag-of-words model and texture features
CN111401390A (en) * 2019-01-02 2020-07-10 中国移动通信有限公司研究院 Classifier method and device, electronic device and storage medium
CN111860039A * 2019-04-26 2020-10-30 四川大学 Street space quality quantification method based on cross-connected CNN + SVR
CN112668482A (en) * 2020-12-29 2021-04-16 中国平安人寿保险股份有限公司 Face recognition training method and device, computer equipment and storage medium
CN112883880A (en) * 2021-02-25 2021-06-01 电子科技大学 Pedestrian attribute identification method based on human body structure multi-scale segmentation, storage medium and terminal
CN113792574A * 2021-07-14 2021-12-14 哈尔滨工程大学 Cross-data-set expression recognition method based on metric learning and a teacher-student model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156793A * 2016-06-27 2016-11-23 西北工业大学 Medical image classification method combining deep feature extraction and shallow feature extraction
CN107729835A * 2017-10-10 2018-02-23 浙江大学 An expression recognition method based on fusing traditional features of facial key-point regions with global deep features of the face

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156793A * 2016-06-27 2016-11-23 西北工业大学 Medical image classification method combining deep feature extraction and shallow feature extraction
CN107729835A * 2017-10-10 2018-02-23 浙江大学 An expression recognition method based on fusing traditional features of facial key-point regions with global deep features of the face

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MIKAEL JORDA ET AL.: "Emotion classification on face images", http://cs229.stanford.edu/proj2015/ *
TEE CONNIE ET AL.: "Facial Expression Recognition Using a Hybrid CNN–SIFT Aggregator", Multi-disciplinary Trends in Artificial Intelligence *
孙晓 et al.: "Facial expression recognition based on ROI-KNN convolutional neural networks", Acta Automatica Sinica (自动化学报) *
张晓明: "Research on facial expression recognition based on SIFT features", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401390B (en) * 2019-01-02 2023-04-07 中国移动通信有限公司研究院 Classifier method and device, electronic device and storage medium
CN111401390A (en) * 2019-01-02 2020-07-10 中国移动通信有限公司研究院 Classifier method and device, electronic device and storage medium
CN110008876A * 2019-03-26 2019-07-12 电子科技大学 A face verification method based on data augmentation and feature fusion
CN109978067A * 2019-04-02 2019-07-05 北京市天元网络技术股份有限公司 A trademark retrieval method and device based on convolutional neural networks and scale-invariant feature transform
CN111860039B * 2019-04-26 2022-08-02 四川大学 Street space quality quantification method based on cross-connected CNN + SVR
CN111860039A * 2019-04-26 2020-10-30 四川大学 Street space quality quantification method based on cross-connected CNN + SVR
CN110210329A * 2019-05-13 2019-09-06 高新兴科技集团股份有限公司 A face detection method, device and equipment
CN110516988A * 2019-07-08 2019-11-29 国网浙江省电力有限公司金华供电公司 A neural-network-based method for fitting power materials
CN111259913A * 2020-01-14 2020-06-09 哈尔滨工业大学 Cell spectral image classification method based on the bag-of-words model and texture features
CN112668482A (en) * 2020-12-29 2021-04-16 中国平安人寿保险股份有限公司 Face recognition training method and device, computer equipment and storage medium
CN112668482B (en) * 2020-12-29 2023-11-21 中国平安人寿保险股份有限公司 Face recognition training method, device, computer equipment and storage medium
CN112883880A (en) * 2021-02-25 2021-06-01 电子科技大学 Pedestrian attribute identification method based on human body structure multi-scale segmentation, storage medium and terminal
CN112883880B (en) * 2021-02-25 2022-08-19 电子科技大学 Pedestrian attribute identification method based on human body structure multi-scale segmentation, storage medium and terminal
CN113792574A * 2021-07-14 2021-12-14 哈尔滨工程大学 Cross-data-set expression recognition method based on metric learning and a teacher-student model
CN113792574B * 2021-07-14 2023-12-19 哈尔滨工程大学 Cross-data-set expression recognition method based on metric learning and a teacher-student model

Similar Documents

Publication Publication Date Title
CN108805216A (en) Facial image processing method based on deep feature fusion
CN109359538B (en) Training method of convolutional neural network, gesture recognition method, device and equipment
CN106096557B (en) A semi-supervised-learning facial expression recognition method based on fuzzy training samples
CN103942550B (en) A scene text recognition method based on sparse coding features
Jin et al. Face detection using template matching and skin-color information
CN103984948B (en) A soft two-layer age estimation method based on fused facial image features
KR20200000824A (en) Method for recognizing facial expression based on deep-learning model using center-dispersion loss function
CN109145871B (en) Psychological behavior recognition method, device and storage medium
CN109190561B (en) Face recognition method and system in video playing
CN108776774A (en) A facial expression recognition method based on a complexity perception classification algorithm
Shirbhate et al. Sign language recognition using machine learning algorithm
CN108182409A (en) Living-body detection method, device, equipment and storage medium
CN110188708A (en) A facial expression recognition method based on convolutional neural networks
CN104504383B (en) A face detection method based on skin color and the AdaBoost algorithm
Zhao et al. Semantic parts based top-down pyramid for action recognition
CN105373777A (en) Face recognition method and device
JP2021193610A (en) Information processing method, information processing device, electronic apparatus and storage medium
CN106960181A (en) A pedestrian attribute recognition method based on RGBD data
CN111694959A (en) Network public opinion multi-mode emotion recognition method and system based on facial expressions and text information
CN107992807A (en) A face recognition method and device based on CNN models
Paul et al. Extraction of facial feature points using cumulative histogram
Prabhu et al. Facial Expression Recognition Using Enhanced Convolution Neural Network with Attention Mechanism.
Booysens et al. Ear biometrics using deep learning: A survey
CN109508660A (en) A video-based AU detection method
CN111738177B (en) Student classroom behavior identification method based on posture information extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181113)