WO2017107957A1 - 人脸图像的检索方法及装置 - Google Patents

人脸图像的检索方法及装置 Download PDF

Info

Publication number
WO2017107957A1
WO2017107957A1 (PCT/CN2016/111533; CN2016111533W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature
face
face image
attribute
Prior art date
Application number
PCT/CN2016/111533
Other languages
English (en)
French (fr)
Other versions
WO2017107957A9 (zh)
Inventor
陆平
霍静
贾霞
刘金羊
刘明
张媛媛
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Publication of WO2017107957A1 publication Critical patent/WO2017107957A1/zh
Publication of WO2017107957A9 publication Critical patent/WO2017107957A9/zh

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/172: Classification, e.g. identification

Definitions

  • the present disclosure relates to the field of face image recognition, for example, to a method and apparatus for searching a face image.
  • Face recognition technology is widely used, for example in assisting public security departments in criminal investigation, automatic machine identification, video surveillance tracking and recognition, facial expression analysis, and so on.
  • Face recognition methods include template matching, example learning, neural networks, hidden-Markov-model-based methods, and support-vector-machine-based methods.
  • features obtained by simply processing a large amount of image data can be defined as low-level features, and description features such as lines, polygons, and patterns are defined as high-level features.
  • Image Principal Component Analysis (PCA) features, wavelet transform features and some statistical features are all low-level features, while the results of face component shape analysis are high-level features.
  • Face recognition using attributes such as male, female, smiling, black hair, and glasses can give good results.
  • face recognition can also be performed using the similarity data with a face.
  • the Labeled Faces in the Wild (LFW) dataset and Columbia University's Public Figures Face Database (PubFig) are two public datasets whose images were obtained in uncontrolled environments. Differences in pose, expression, and illumination in these two datasets can have a large impact on face recognition.
  • the conventional methods in the related art use only low-level features for face recognition, resulting in a poor face retrieval effect.
  • the present disclosure provides a method and a device for retrieving a face image, which avoid the poor retrieval results that arise in the related art when face recognition relies only on low-level features.
  • the present disclosure provides a method for retrieving a face image, including: calculating, by a preset rule, a matching degree between the feature vector of an image to be retrieved and the feature vector of each face image in a gallery, and retrieving one or more images matching the image to be retrieved according to the matching degree, wherein the feature vector includes an attribute feature and a similarity feature.
  • before the matching degree between the feature vector of the image to be retrieved and the feature vector of each face image in the gallery is calculated by the preset rule, the method further includes:
  • the attribute feature and the similarity feature are used as feature vectors of each face image in the gallery.
  • training the underlying features of the face images in the gallery to obtain the attribute features of the face images includes:
  • Detecting key points in each face image in the gallery, wherein the key points include: the four corners of the eyes, the tip of the nose, and the two corners of the mouth;
  • the attribute classifier is used to classify a plurality of the face underlying features of different regions to obtain different types of the attribute features.
  • training the underlying features of the reference face images in the gallery to obtain the similarity features includes:
  • Detecting the key points of the first predetermined number of reference face images, wherein the key points include: the four corners of the eyes, the tip of the nose, and the two corners of the mouth;
  • the similarity feature is obtained by classifying the data set using a similarity classifier.
  • the attribute classifier and the similarity classifier comprise: a support vector machine SVM classifier.
  • calculating, by using a preset rule, the matching degree between the feature vector of the image to be retrieved and the feature vector of each face image in the gallery, and retrieving one or more images matching the image to be retrieved according to the matching degree, includes:
  • Performing distance calculation on the feature vector of the image to be retrieved and the feature vector of each face image in the library, and the method for calculating the distance includes: a cosine distance method or a Euclidean distance method;
  • the present disclosure also provides a retrieval device for a face image, including:
  • a search module configured to calculate, by using a preset rule, a matching degree of a feature vector of the image to be retrieved and a feature vector of each face image in the library, and retrieve one or more matching the image to be retrieved according to the matching degree An image, wherein the feature vector includes an attribute feature and a similarity feature.
  • the searching device further includes:
  • a first semantic feature extraction module configured to train an underlying feature of the face image in the gallery to obtain an attribute feature of the face image
  • a second semantic feature extraction module configured to train a bottom feature of the reference face image to obtain a similarity feature
  • the processing module is configured to use the attribute feature and the similarity feature as feature vectors of each face image in the gallery.
  • the first semantic feature extraction module includes:
  • a first detecting unit configured to detect a key point in the face image in the gallery, wherein the key points include: four corners of the eyes, a tip of the nose, and two ends of the mouth;
  • a first processing unit configured to divide a region of the face image according to the key point, and extract a bottom feature of the face corresponding to the different region;
  • the first semantic feature extraction unit is configured to use the attribute classifier to classify the face underlying features of the different regions to obtain the first number of attribute features of different types.
  • the second semantic feature extraction module includes:
  • a second detecting unit configured to detect the key points of the first predetermined number of reference face images, wherein the key points include: the four corners of the eyes, the tip of the nose, and the two corners of the mouth;
  • a second processing unit is configured to divide the face image according to the key point, and extract a bottom feature of the face corresponding to the different area, to obtain a data set corresponding to a different area of the face;
  • the second semantic feature extraction unit is configured to perform classification learning on the data set by using a similarity classifier to obtain the similarity feature.
  • the attribute classifier and the similarity classifier comprise: a support vector machine SVM classifier.
  • the searching module includes:
  • An acquiring unit configured to acquire a feature vector of the image to be retrieved and a feature vector of each face image in the library
  • a calculating unit configured to perform distance calculation on a feature vector of the image to be retrieved and a feature vector of each face image in the library, where the distance calculation method comprises: a cosine distance method or an Euclidean distance method;
  • a retrieval unit configured to sort the plurality of calculation results from large to small and to select, from the sorted results, the face images corresponding to the top second predetermined number of calculation results as the images matching the image to be retrieved.
  • the present disclosure also provides a computer readable storage medium storing computer executable instructions arranged to perform the methods described above.
  • the present disclosure also provides an electronic device, including:
  • at least one processor; and a memory connected to the at least one processor;
  • the memory stores instructions executable by the at least one processor; when executed by the at least one processor, the instructions cause the at least one processor to perform the method described above.
  • one or more images matching the image to be retrieved are retrieved by comparing the feature vector of the image to be retrieved with the feature vectors of the face images in the gallery.
  • the feature vector includes attribute features and similarity features, both of which are high-level features, so the retrieved results match the image to be retrieved closely. This reduces the poor face retrieval effect caused in the related art by using only low-level features for face recognition, and improves the efficiency and matching accuracy of face retrieval.
  • FIG. 1 is a flowchart of a method for retrieving a face image according to an embodiment of the present invention
  • FIG. 2 is a block diagram showing the structure of a face image retrieval device according to an embodiment of the present invention
  • FIG. 3 is a block diagram 1 of an optional structure of a face image retrieval device according to an embodiment of the present invention.
  • FIG. 4 is a block diagram 2 of an optional structure of a face image retrieval device according to an embodiment of the present invention.
  • FIG. 5 is a block diagram 3 of an optional structure of a face image retrieval device according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of face key point detection according to an alternative embodiment of the present invention.
  • Figure 7 is a schematic diagram of a coordinate system of an alternative embodiment of the present invention.
  • FIGS. 8a-8b are schematic diagrams of comparison before and after rotation alignment of a face image according to an alternative embodiment of the present invention.
  • FIG. 9 is a schematic diagram of similarity image region segmentation according to an alternative embodiment of the present invention.
  • FIG. 10 is a schematic diagram of an attribute or similarity feature classifier learning and feature extraction process in an alternative embodiment of the present invention.
  • FIG. 11 is a schematic diagram of a picture storage and retrieval process according to an alternative embodiment of the present invention.
  • FIG. 12 is a schematic diagram showing the hardware structure of an electronic device according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a method for retrieving a face image according to an embodiment of the present invention.
  • step 110 the underlying features of the face image in the gallery are trained to obtain a first number of attribute features of the face image.
  • step 120 the underlying features of the first predetermined number of reference face images in the gallery are trained to obtain a second number of similarity features.
  • step 130 the first number of attribute features and the second number of similarity features are used as feature vectors for each face image in the gallery.
  • step 140: the matching degree between the feature vector of the image to be retrieved and the feature vector of each face image in the gallery is calculated by using a preset rule, and one or more images matching the image to be retrieved are retrieved according to the matching degree.
  • the gallery is the set of face images stored in the database of the face retrieval system, and the reference face images are a subset of the face images in the gallery.
  • in steps 110 to 140, the underlying features of the face images are trained to obtain a first number of attribute features; in addition, the underlying features of the first predetermined number of reference face images are trained to obtain a second number of similarity features, which describe each face relative to the reference faces rather than through its own attributes.
  • the image to be retrieved is compared with the feature vectors of the face images in the gallery; the feature vector includes attribute features and similarity features, which are high-level features, so the retrieved results match the image to be retrieved closely. This solves the problem in the related art that using only low-level features for face recognition yields a poor retrieval effect, and improves the efficiency and matching accuracy of face retrieval.
  • step 1101 key points in each face image in the gallery are detected, wherein the key points include: four corners of the eyes, a tip of the nose, and both ends of the mouth.
  • the key points listed above are only the optional key points of this alternative embodiment; other key points, such as the hair, chin, and ears, may also be used. In other words, any points that characterize the human face are acceptable.
  • step 1102 the face image is divided into regions according to the key points, and the face bottom features corresponding to the different regions are extracted.
  • the attribute classifier is used to classify the plurality of face underlying features of different regions to obtain the first number of attribute features of different types.
  • the first number may, for example, take the value 69.
  • the face attribute features may include: male, female, smiling, black hair, glasses, and the like, all of which describe semantic features of the face. The goal of the face attribute classifier is to classify a face image and determine whether it has a specific attribute. That is, in this application scenario, based on the above steps, classifiers trained for 69 attribute features such as smile, black hair, and glasses can be used to represent the face features.
  • attribute feature extraction means extracting the attribute features of the face through the attribute classifiers obtained by training; that is, 69 attribute values are calculated by the 69 trained attribute classifiers and spliced together to form the image attribute feature.
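The splicing of per-classifier attribute values into one feature vector described above can be sketched as follows. This is a minimal illustration assuming scikit-learn-style binary SVMs; the attribute names and the dict layout are illustrative assumptions, not the patent's interface:

```python
import numpy as np
from sklearn.svm import LinearSVC

def extract_attribute_vector(region_features, classifiers):
    """Concatenate one decision value per trained attribute classifier.

    region_features maps an attribute name to the 1-D low-level feature
    vector of the face region that attribute uses; classifiers maps the
    same names to fitted binary SVMs.  Names here are illustrative.
    """
    values = [clf.decision_function(region_features[name].reshape(1, -1))[0]
              for name, clf in classifiers.items()]
    return np.asarray(values)  # one value per attribute, e.g. 69 values
```

With 69 fitted classifiers the returned vector has 69 entries, one signed decision value per attribute.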
  • steps 1101 to 1103 may be performed as follows:
  • to extract the underlying features for the attributes, face detection may be performed on each face image in the gallery, the key points are located to obtain the key point information of the face image, the face image is rotated into alignment, and the face image is segmented into regions according to the requirements of each attribute (for example, the region corresponding to the glasses attribute is the eye region, and the region corresponding to the white-hair attribute is the hair region); different attributes may need different numbers of regions, and the underlying features effective for each region are extracted from the segmented regions;
  • the extracted underlying attribute features are divided into two equal parts, half for training and half for testing (this is just an example; other ratios are possible and may be chosen according to the situation). If an attribute uses multiple segmented regions, feature splicing can be performed first; for example, the earrings attribute can use the left and right ear regions. The corresponding Support Vector Machine (SVM) attribute classifiers, such as a smile attribute classifier, a black-hair attribute classifier, and a glasses attribute classifier, are then learned from the underlying features;
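The train/test split and feature splicing described above can be sketched as follows. This is a hedged illustration, not the patent's implementation: it uses scikit-learn's `LinearSVC`, and the function name, the two-region splicing (e.g. left and right ear regions for an earrings attribute), and the 50/50 split are assumptions taken from the surrounding text:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler

def train_attribute_classifier(left_feats, right_feats, labels):
    """Train one binary attribute SVM after splicing two region features.

    left_feats/right_feats: (n_samples, d) low-level features from two
    segmented regions; labels: 0/1 attribute labels.  Half the data is
    used for training and half for testing, mirroring the split above.
    """
    X = np.hstack([left_feats, right_feats])      # feature splicing
    n = len(labels) // 2
    scaler = StandardScaler().fit(X[:n])          # normalization kept for reuse
    clf = LinearSVC(max_iter=10000).fit(scaler.transform(X[:n]), labels[:n])
    accuracy = clf.score(scaler.transform(X[n:]), labels[n:])
    return clf, scaler, accuracy
```

The fitted model and the normalization object would both be saved, matching the text's note that the model file and feature normalization file are kept for later feature extraction.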
  • the underlying feature of the first predetermined number of reference face images is trained in step 120 to obtain a second number of similarity features.
  • the following may be implemented.
  • step 1201: the key points of the first predetermined number of reference face images are detected, wherein the key points include: the four corners of the eyes, the tip of the nose, and the two corners of the mouth.
  • the key points mentioned above are only optional points of the alternative embodiment, and other key points: hair, chin, ears, etc. are all possible.
  • step 1202 the reference face image is divided according to the key points, and the bottom layer features corresponding to the different regions are extracted to obtain a data set corresponding to the specified semantic feature.
  • step 1203 the data set is classified by the similarity classifier to obtain a second number of similarity features.
  • the first predetermined number involved in the foregoing may take a value of 10. Based on the foregoing steps 1201 to 1203, in the application scenario, the training process of the similarity classifier may be as follows:
  • each reference person is processed separately: all face images of the reference person are taken as positive samples and an equal number of other face images are selected as negative samples, so that one dataset is constructed per reference person;
  • features are extracted for each dataset according to the following process: face detection and key point positioning are performed, the face image is rotated into alignment, and four sub-blocks (eyes, eyebrows, nose, and mouth) are segmented from each face image; the underlying features are extracted for the four sub-blocks, transforming the dataset into four new sub-datasets, namely an eye dataset, an eyebrow dataset, a mouth dataset, and a nose dataset;
  • each sub-dataset of each reference person is divided: half of the data is used for training and the other half for testing.
  • the SVM model is learned on the training set, the classification effect of the similarity classifier is verified on the test set, and the model files and feature normalization files generated by training are saved for subsequent similarity feature extraction. The trained similarity classifiers are then called to calculate similarity values for the four sub-blocks, yielding 40 similarity values.
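The similarity feature extraction above can be sketched as follows: one fitted binary SVM per (reference person, sub-block) pair, so 10 reference persons times 4 sub-blocks gives the 40 values the text mentions. The nested dict-of-classifiers layout and the names are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import LinearSVC

def extract_similarity_vector(subblock_features, similarity_classifiers):
    """Concatenate one similarity value per (reference person, sub-block).

    subblock_features maps a sub-block name (eyes/eyebrows/nose/mouth)
    to its 1-D low-level feature; similarity_classifiers is a list with
    one dict of fitted binary SVMs per reference person.
    """
    values = []
    for person_clfs in similarity_classifiers:    # one entry per reference person
        for block, clf in person_clfs.items():    # one classifier per sub-block
            x = subblock_features[block].reshape(1, -1)
            values.append(clf.decision_function(x)[0])
    return np.asarray(values)                     # 10 persons * 4 blocks = 40 values
```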
  • the attribute classifier and the similarity classifier involved in this embodiment may be selected as an SVM attribute classifier.
  • the calculation, by the preset rule involved in step 140, of the matching degree between the feature vector of the image to be retrieved and the feature vector of each face image in the gallery, and the retrieval of one or more matching images according to the matching degree, may be implemented as follows.
  • step 1401 a feature vector of the image to be retrieved and a feature vector of each face image in the library are obtained.
  • step 1402: a distance between the feature vector of the image to be retrieved and the feature vector of each face image in the gallery is calculated.
  • in this alternative embodiment, the distance may be calculated by a cosine distance method or a Euclidean distance method.
  • step 1403: the plurality of calculation results are sorted from large to small, and the face images corresponding to the top second predetermined number of results are selected from the sorted results as the images matching the image to be retrieved.
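Steps 1401 to 1403 can be sketched end to end as follows, using cosine similarity (one of the two distance methods named above); the function name and the top-k interface are illustrative assumptions:

```python
import numpy as np

def retrieve_top_k(query_vec, gallery_vecs, k=5):
    """Rank gallery face feature vectors by cosine similarity to the query.

    Returns the indices of the k best matches and their similarity values,
    sorted from large to small.  cos(theta) lies in [-1, +1]; the closer
    to +1, the more similar the two face images.
    """
    q = query_vec / np.linalg.norm(query_vec)
    G = gallery_vecs / np.linalg.norm(gallery_vecs, axis=1, keepdims=True)
    sims = G @ q                    # cosine of angle with each gallery vector
    order = np.argsort(-sims)       # sort from large to small
    return order[:k], sims[order[:k]]
```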
  • steps 1401 to 1403 may be performed as follows: after the feature vector composed of the attribute values and similarity values of the face image is obtained, face retrieval based on the combined face attribute features and similarity features may be carried out. The obtained 69 attribute values and 40 similarity values are spliced together as the feature vector of each face image, and the weight of each dimension of the feature vector is optimized using the Large Margin Nearest Neighbors (LMNN) algorithm.
  • the similarity of two faces can be calculated using the feature vector and weight.
  • the cosine of the angle between two vectors is used as the similarity value between them; the value of cos θ lies in [-1, +1], and the closer it is to +1, the more similar the two face images are.
  • the technical solution of the present invention can be embodied in the form of a software product stored in a storage medium (such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc), which includes one or more instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device) to perform the method described in the embodiments of the present invention.
  • a device for retrieving a face image is provided, which is used to implement the above embodiments and optional embodiments; what has already been described will not be repeated.
  • the term "module" may refer to software, hardware, or a combination of software and hardware that implements a predetermined function.
  • the apparatus includes: a first semantic feature extraction module 22, a second semantic feature extraction module 24, a processing module 26, and a retrieval module 28.
  • the first semantic feature extraction module 22 is configured to train the underlying features of the face images in the gallery to obtain the first number of attribute features of the face images.
  • the second semantic feature extraction module 24 is coupled to the first semantic feature extraction module 22 and configured to train the underlying features of the first predetermined number of reference facial images to obtain a second number of similarity features.
  • the processing module 26 is coupled to the second semantic feature extraction module 24 and configured to use the first number of attribute features and the second number of similarity features as feature vectors of each face image in the library.
  • the retrieval module 28 is coupled to the processing module 26, and configured to calculate, by using a preset rule, a matching degree between a feature vector of the image to be retrieved and a feature vector of each face image in the library, and retrieve the location according to the matching degree. One or more images that match the retrieved image are described.
  • the first semantic feature extraction module 22 includes: a first detection unit 32, a first processing unit 34, and a first semantic feature extraction unit 36.
  • the first detecting unit 32 is configured to detect key points in the face image in the gallery, wherein the key points are the four corners of the eyes, the tip of the nose, and the ends of the mouth.
  • the first processing unit 34 is coupled to the first detecting unit 32, and is configured to divide the face image according to the key point, and extract the bottom feature of the face corresponding to the different area.
  • the first semantic feature extraction unit 36 is coupled to the first processing unit 34 and configured to use the attribute classifier to classify the face underlying features of the different regions to obtain the different types of attribute features.
  • the second semantic feature extraction module 24 includes: a second detection unit 42, a second processing unit 44, and a second semantic feature extraction unit 46.
  • the second detecting unit 42 is configured to detect the key points of the first predetermined number of reference face images, wherein the key points include: the four corners of the eyes, the tip of the nose, and the two corners of the mouth.
  • the second processing unit 44 is coupled to the second detecting unit 42 and configured to divide the face image according to the key point, and extract the bottom layer features corresponding to the different regions to obtain data corresponding to the specified semantic feature. set.
  • the second semantic feature extraction unit 46 is coupled to the second processing unit 44 and configured to perform classification learning on the data set by using the similarity classifier to obtain a second number of similarity features.
  • the attribute classifier and the similarity classifier involved in this embodiment are support vector machine SVM classifiers.
  • FIG. 5 is a block diagram 3 of an optional structure of a face image retrieval apparatus according to an embodiment of the present invention.
  • the retrieval module 28 includes an acquisition unit 52, a calculation unit 54, and a retrieval unit 56.
  • the obtaining unit 52 is configured to acquire a feature vector of the image to be retrieved and a feature vector of each face image in the library.
  • the calculation unit 54 is coupled to the acquisition unit 52 and configured to perform distance calculation on the feature vector of the image to be retrieved and the feature vector of each face image in the library.
  • the method for calculating the distance includes a cosine distance method or an Euclidean distance method.
  • the retrieval unit 56 is coupled to the calculation unit 54 and configured to sort the plurality of calculation results from large to small, and to select, from the sorted results, the face images corresponding to the top second predetermined number of calculation results as the images matching the image to be retrieved.
  • the above modules can be implemented by software or hardware.
  • the above modules may be located in one or more storage media in the form of software modules.
  • the above modules may all be located in the same processor; or, the above modules may be located in multiple processors.
  • the present invention provides a method for extracting high-level semantic features of a face. The method extracts the high-level semantic features through an attribute classifier and a similarity classifier, and performs a face similarity measure by combining face attribute features and similarity features, thereby implementing similar-face retrieval. It comprises three parts: face attribute classifier learning and face attribute feature acquisition, face similarity classifier learning and similarity feature acquisition, and face retrieval based on the attribute features and similarity features. Each part is described in detail below.
  • the face attribute classifier learning and the face attribute feature acquiring method may include: a face attribute classifier learning mode, an attribute classifier training, a face similarity classifier learning, and a similarity classifier training.
  • the face attributes include: male, female, smile, black hair, glasses, etc., which describe the semantic features of the face.
  • the face attribute classifier can classify the face image and determine whether the face image has a specific attribute.
  • classifiers trained for 69 attribute features, such as smile, black hair, and glasses, are used to represent the face features.
  • attribute feature extraction means extracting the attribute features of the face through the attribute classifiers obtained by training.
  • 69 attribute values are calculated for the face image by the 69 trained attribute classifiers and spliced together to form the attribute feature of the face image.
  • the training process of the attribute classifier can include:
  • Face detection is performed on each face image in the annotation set and the key points are located to obtain the key point information of the face image; the face image is rotated into alignment, and the face image can be segmented according to the requirement of each attribute (for example, the region corresponding to the glasses attribute is the eye region, and the region corresponding to the white-hair attribute is the hair region); different attributes may need to be segmented into different numbers of regions, and the underlying features of each segmented region are extracted (for example, with the Local Binary Pattern (LBP) algorithm or the Gabor algorithm).
  • the underlying attribute features extracted from the positive and negative images in the annotation set are divided into two equal parts, half for training and half for testing. If an attribute uses multiple segmented regions, feature splicing can be performed first; for example, the earrings attribute can use the left and right ear regions. The corresponding SVM attribute classifiers, such as a smile attribute classifier, a black-hair attribute classifier, and a glasses attribute classifier, are learned from the underlying features, and 69 attribute classifiers are generated.
  • the models are trained and the features are normalized on the training set, and the model files and feature normalization files generated by training are saved;
  • the attribute classification effect is verified on the test set according to the classification values of the attribute classifiers.
  • the face similarity classifier learning process can include:
  • the training goal of the similarity classifier is to train, for each reference person, a facial-feature similarity classifier.
  • using the facial-feature similarity classifier, a new face image can be classified to judge whether its facial features are similar to those of the reference person.
  • the training process of the similarity classifier can include:
  • the feature extraction process of the attribute classifier and the similarity classifier is similar and may include:
  • face detection can be performed on the face image according to the same process as in the attribute classifier training, key point positioning, key point information of the face image is obtained, and the picture is rotated and aligned according to the requirements of each attribute.
  • the segmentation of the face image is performed, and the attribute classifier obtained by the training is called to perform classification, and the attribute classifier value is obtained, and all the attribute classification values are spliced to obtain the face attribute feature of the input image;
  • the process is similar to the extracted attribute feature, which may include: performing face detection on the face image, positioning the key point, and aligning the face image rotation, respectively, on each face image Dividing out four sub-blocks of eyes, eyebrows, nose and mouth, extracting the underlying features for each of the four sub-blocks, calling the training to obtain the similarity classifier to calculate the similarity values for the four sub-blocks, and splicing all the similarity values to obtain the input. Face similarity characteristics of the image;
  • the attribute features and similarity features of the input image are spliced, yielding a feature vector composed of the image's 69 attribute values and 40 similarity values.
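The splicing step above can be sketched as follows — a minimal NumPy illustration in which the classifier outputs are placeholder zeros rather than real classifier results:

```python
import numpy as np

# Placeholder outputs: in the pipeline described here these would come from
# the 69 trained attribute classifiers and the 40 trained similarity classifiers.
attribute_values = np.zeros(69)
similarity_values = np.zeros(40)

# Splice both groups into the image's single feature vector.
feature_vector = np.concatenate([attribute_values, similarity_values])
print(feature_vector.shape)  # (109,)
```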
  • face retrieval based on the combined face attribute features and similarity features can then be performed as follows:
  • the 69 attribute values and 40 similarity values are spliced to form the feature vector of each face image, and the weight of each dimension of the feature vector is optimized with the Large Margin Nearest Neighbor (LMNN) algorithm.
  • the similarity of two faces can be calculated using the feature vector and weight.
  • the cosine of the angle between the two vectors can be used as the similarity value; cos θ ranges over [-1, +1], and the closer it is to +1, the more similar the faces in the two face images.
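A minimal sketch of this cosine similarity measure (NumPy; the function name is ours, not from the patent):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two feature vectors; result lies in [-1, +1]."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # a zero vector carries no directional information
    return float(np.dot(a, b) / denom)
```

Identical vectors give +1 and opposite vectors give -1; for the 109-dimensional face vectors described here, values near +1 indicate a likely match.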
  • the technical solutions adopted by each component of the algorithm in this alternative embodiment may include: face key point detection, image preprocessing, image area segmentation, feature extraction, and classifier training.
  • FIG. 6 is a schematic diagram of face key point detection according to an alternative embodiment of the present invention.
  • fast face key point detection uses flandmark, an open-source facial landmark detector;
  • the detected points can be 7 key points: the four corners of the eyes, the tip of the nose, and the two corners of the mouth.
  • Image preprocessing rotates and aligns the original face image. From the detected key point data, the positions of both pupils can be located. Since the pupils of a correctly rotated face should lie on one straight line, i.e. the x coordinates of the two pupils should be equal, the rotation angle can be computed. The rotated image is saved at 250 pixels by 250 pixels, with any missing area filled with black.
  • the coordinate system used in this alternative embodiment differs from the usual one: the horizontal direction is the y axis (running left to right) and the vertical direction is the x axis (running top to bottom).
  • FIG. 7 is a schematic illustration of the coordinate system of an alternative embodiment of the present invention; as shown in FIG. 7, the ellipses represent the positions of the eyes.
  • suppose the coordinates of the left and right eyes are (plx, ply) and (prx, pry), the midpoint of the line between the eyes is (mx, my), and the distance between the pupils is d; the image scaling ratio is then ratio = d/dd (dd may default to 75). The face image can be processed by rotating it θ degrees so that the line between the eyes is aligned with the y axis; scaling the image so that the distance between the eyes is dd; and translating the face image so that the midpoint between the eyes moves to (mx, my).
  • FIG. 8a-8b are schematic diagrams comparing a face image before and after rotation alignment according to an alternative embodiment of the present invention, where FIG. 8a shows the image before rotation and FIG. 8b after rotation;
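Under the stated convention (y horizontal, x vertical), the alignment parameters can be computed as below. This is a sketch under our own naming; in particular, target_mid is an assumed target position for the eye midpoint, since the document leaves it open:

```python
import math

def alignment_params(left_eye, right_eye, dd=75.0, target_mid=(125.0, 105.0)):
    """Rotation angle (degrees), scale factor and translation for face alignment.

    Eye points are (x, y) in the document's convention: x runs top-to-bottom,
    y runs left-to-right, so level eyes share the same x value. target_mid is
    where the eye midpoint should land after alignment (an assumed value).
    """
    plx, ply = left_eye
    prx, pry = right_eye
    d = math.hypot(prx - plx, pry - ply)                     # inter-pupil distance
    theta = math.degrees(math.atan2(prx - plx, pry - ply))   # tilt of the eye line
    ratio = d / dd                                           # scaling ratio d/dd
    mx, my = (plx + prx) / 2.0, (ply + pry) / 2.0            # eye midpoint
    shift = (target_mid[0] - mx, target_mid[1] - my)
    return theta, ratio, shift
```

For level eyes the tilt is 0 and no rotation is needed; a non-zero tilt gives the angle θ to rotate by.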
  • Image region segmentation, taking the eyes as an example: find the key point coordinate pLeftIndex of the left eye corner and the key point coordinate pRightIndex of the right eye corner, and compute their midpoint from the two coordinates.
  • Taking this midpoint as the center of a rectangle, the distance from the center to the rectangle's left and right borders is centerToLeft, the distance to its upper and lower borders is centerToUp, and the width and height of the cropped image are defined as width and height; from the center position together with centerToUp and centerToLeft, the coordinates of the upper-left corner of the segmentation region are obtained.
  • From the upper-left corner coordinates and the width and height information, a face image patch containing the segmented eye region is obtained.
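A sketch of that rectangle computation (variable names follow the document; the default half-extents are illustrative, not values fixed by the patent):

```python
def eye_crop_box(p_left, p_right, center_to_left=40, center_to_up=20):
    """Upper-left corner plus width/height of the eye crop rectangle.

    p_left / p_right are (row, col) eye-corner key points; their midpoint is
    the rectangle centre, and center_to_left / center_to_up are the distances
    from the centre to the left/right and top/bottom borders.
    """
    cx = (p_left[0] + p_right[0]) // 2   # centre row
    cy = (p_left[1] + p_right[1]) // 2   # centre column
    top = cx - center_to_up
    left = cy - center_to_left
    width = 2 * center_to_left
    height = 2 * center_to_up
    return top, left, width, height
```

The patch would then be cut out as `image[top:top + height, left:left + width]`.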
  • Feature extraction can use the Gabor wavelet transform to extract features from image blocks.
  • The Gabor wavelet transform involves a relatively small amount of data processing and can meet the real-time requirements of the system.
  • The wavelet transform is insensitive to illumination changes and can tolerate a certain degree of image rotation and deformation; when recognition uses the angle-based cosine distance, the feature template does not need to correspond strictly to the feature under test, which improves the robustness of the system. Therefore, the Gabor wavelet transform can be used to extract image features in the face recognition process.
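A compact sketch of such Gabor feature extraction, using only NumPy: real-part Gabor kernels at four orientations, reduced to one mean-response value each. The parameter defaults are illustrative rather than values from the patent, and production code would typically use a multi-scale filter bank (e.g. via OpenCV's getGaborKernel):

```python
import numpy as np

def gabor_kernel(ksize=9, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Real part of a Gabor kernel: a Gaussian envelope times a cosine wave."""
    half = ksize // 2
    y, x = np.meshgrid(np.arange(-half, half + 1), np.arange(-half, half + 1))
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / lambd + psi))

def gabor_features(patch, thetas=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Crude Gabor feature: mean absolute filter response per orientation."""
    feats = []
    for theta in thetas:
        k = gabor_kernel(theta=theta)
        h, w = patch.shape
        kh, kw = k.shape
        # 'valid' correlation via explicit sliding windows (no SciPy dependency)
        resp = np.array([[np.sum(patch[i:i + kh, j:j + kw] * k)
                          for j in range(w - kw + 1)]
                         for i in range(h - kh + 1)])
        feats.append(np.abs(resp).mean())
    return np.array(feats)
```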
  • SVM classifier training uses LIBSVM.
  • LIBSVM is a simple, easy-to-use, fast and efficient software package for SVM pattern recognition and regression; it is a library implementing the support vector machine (SVM) algorithm. With LIBSVM, one can train a classification model on a data set and use the model to predict the class labels of a test data set. An SVM classifier finds a separating hyperplane that maximizes the margin in feature space, separating the different classes.
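LIBSVM itself is driven through its svm_train()/svm_predict()-style interface; as a self-contained stand-in, the sketch below trains a linear max-margin classifier by stochastic sub-gradient descent on the regularized hinge loss (the Pegasos scheme). It illustrates the same ±1-labelled training described here, but is not LIBSVM's actual SMO solver:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=300, seed=0):
    """Minimise lam/2 * ||w||^2 + hinge loss by Pegasos-style updates.

    X: (n, d) features; y: (n,) labels in {-1, +1}. The bias is absorbed as
    a constant feature. Returns (w, b).
    """
    Xa = np.hstack([X, np.ones((len(X), 1))])
    rng = np.random.default_rng(seed)
    w = np.zeros(Xa.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(Xa)):
            t += 1
            eta = 1.0 / (lam * t)
            w *= 1.0 - eta * lam             # shrink (regularization step)
            if y[i] * (Xa[i] @ w) < 1.0:     # point inside the margin: push it out
                w += eta * y[i] * Xa[i]
    return w[:-1], w[-1]

def predict(w, b, X):
    """Return +1/-1 class labels, analogous to what svmpredict() reports."""
    return np.where(X @ w + b >= 0.0, 1, -1)
```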
  • Specific implementation of the attribute classifiers and similarity classifiers: the attribute classifiers use the LFW (Labeled Faces in the Wild) data set. For each of the 69 attributes in Table 1, 1000 face images matching the attribute and 1000 not matching it can be selected from the LFW data set (no fewer than this number is recommended), and each face image is labeled: +1 if it matches the attribute, -1 if it does not. The region corresponding to each attribute is segmented, as shown in Table 2; since some attributes can share the same segmentation region, the 69 attributes can be reduced to 19 corresponding regions. Gabor wavelet transform features are extracted from the face images used, and each attribute is trained with LIBSVM.
  • The similarity classifiers can use the PubFig (Public Figures Face Database) data set: 10 representative people are selected from the PubFig image library, with no fewer than 150 face images per person, and the four regions required by the similarity classifiers (eyes, eyebrows, nose and mouth) are segmented.
  • FIG. 9 is a schematic diagram of similarity image region segmentation according to an alternative embodiment of the present invention. As shown in FIG. 9, 40 regions can be obtained in total, and features are extracted from each. Taking the eye region of the first reference person as an example, his own eye regions are taken as positive examples, labeled +1.
  • Face images of people other than this reference person are selected from the gallery (preferably several different people), roughly equal in number to all of the reference person's face images; their eye regions are segmented and features are extracted from them as negative examples, labeled -1.
  • Gabor wavelet transform features are likewise extracted from the segmented regions, and 40 similarity classifiers are obtained by training with LIBSVM.
  • Computing the attribute features and similarity features of an input picture, together with feature weight learning, may include:
  • performing face key point detection, image preprocessing, image region segmentation and feature extraction on the input picture, obtaining 19 region features from the 19 regions;
  • for the 69 attribute and 40 similarity classifier models, the corresponding values can be obtained with LIBSVM's svmpredict() function, yielding the 69 attribute values and 40 similarity values of the image; combining the attribute values and similarity values gives the feature vector of the picture; and
  • on a labeled image set, the LMNN algorithm is used to optimize the weight of each dimension of the feature vector over the attribute features and similarity features extracted from the image set.
  • each face picture in the search gallery is processed to obtain its 109-dimensional feature vector, and the newly input picture to be retrieved is processed in the same way to obtain its 109-dimensional feature vector;
  • ⁇ x, y> represents the inner product between two vectors x and y
  • ||·|| represents the modulus of a vector; the similarity sim(v_i, v_j) = <v_i, v_j> / (||v_i|| · ||v_j||) takes values in [-1, +1], and the closer it is to +1, the more similar the faces in the two face pictures.
  • the cosine of the angle is computed between the feature vector of the picture to be retrieved and the feature vectors of all face images in the search gallery, and the top N pictures with the largest cosine values are taken (N may default to 1000).
  • The people in the selected top-ranked pictures are considered the most likely to be the same person as in the input picture.
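The ranking step can be sketched as follows (NumPy). Folding the learned per-dimension weights into the cosine by rescaling each axis by √w is one plausible reading; the document does not spell out exactly how the LMNN weights enter the cosine:

```python
import numpy as np

def retrieve_top_n(query, gallery, weights=None, n=1000):
    """Rank gallery rows by weighted cosine similarity to the query vector.

    query: (d,) feature vector; gallery: (m, d) matrix of gallery vectors;
    weights: optional per-dimension weights (e.g. learned with LMNN),
    defaulting to all ones. Returns (indices, similarities), best first.
    """
    w = np.ones_like(query) if weights is None else np.asarray(weights, dtype=float)
    q = query * np.sqrt(w)       # rescaling each axis by sqrt(w) turns the
    g = gallery * np.sqrt(w)     # weighted cosine into a plain cosine
    q = q / np.linalg.norm(q)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)
    sims = g @ q
    order = np.argsort(-sims)[:n]
    return order, sims[order]
```

A gallery row identical to the query ranks first with similarity 1.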
  • FIG. 10 is a schematic diagram of the attribute or similarity feature classifier learning and feature extraction process according to an alternative embodiment of the present invention; as shown in FIG. 10, the process includes a classifier training stage and a feature extraction stage.
  • The steps of the attribute or similarity classifier training process include: a labeled gallery; fast face key point detection; image preprocessing; image region segmentation; feature extraction; and SVM classifier training and testing.
  • The steps of the attribute or similarity feature extraction process may include: a test or query picture; fast face key point detection; image preprocessing; image region segmentation; feature extraction; and computing the attribute or similarity values.
  • The difference between the attribute classifiers and the similarity classifiers lies in the SVM training and testing part: in an attribute classifier, images matching the attribute are labeled +1 and those that do not are labeled -1; in a similarity classifier, the current reference person is labeled +1 and other people are labeled -1.
  • FIG. 11 is a schematic diagram of a picture warehousing and retrieval process according to an alternative embodiment of the present invention. As shown in FIG. 11, the process may include a process of storing pictures into the database and a retrieval process;
  • the process of storing pictures into the database may include: reading the search gallery; computing the attribute and similarity feature values; combining the attribute and similarity features; and writing them into a database of portrait attribute and similarity features.
  • The retrieval process may include: taking the picture to be retrieved; computing its attribute and similarity feature values; combining the attribute and similarity features; comparing features against the database; and returning the retrieval results or the most similar pictures.
  • The final feature vector may be a 109-dimensional vector containing 69 attribute values and 40 similarity values.
  • Embodiments of the present invention also provide a non-transitory computer readable storage medium.
  • the non-transitory computer readable storage medium stores computer executable instructions, and the computer executable instructions may be configured to perform a retrieval method of any of the above face images.
  • the present disclosure also provides a hardware structure diagram of an electronic device.
  • the electronic device includes:
  • at least one processor 120 (FIG. 12 takes one processor 120 as an example) and a memory 121; the electronic device may further include a communication interface 122 and a bus 123.
  • the processor 120, the communication interface 122, and the memory 121 can complete communication with each other through the bus 123.
  • Communication interface 122 can be used for information transmission.
  • the processor 120 can call the logic instructions in the memory 121 to perform a retrieval method of the face image.
  • When the logic instructions in the memory 121 described above are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer readable storage medium.
  • the memory 121 is a computer readable storage medium, and can be used to store a software program, a computer executable program, a program instruction or a module corresponding to the method in the embodiment of the present disclosure.
  • The processor 120 performs functional applications and data processing by running the software programs, instructions or modules stored in the memory 121, that is, implements the face image retrieval method.
  • the memory 121 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application required for at least one function; the storage data area may store data created according to usage of the terminal device, and the like. Further, the memory 121 may include a high speed random access memory, and may also include a nonvolatile memory.
  • The technical solution of the present disclosure may be embodied in the form of a software product stored in a storage medium, including one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present disclosure.
  • The foregoing storage medium may be a non-transitory storage medium, including a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk or an optical disk, or any other medium that can store program code; it may also be a transitory storage medium.
  • The modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be centralized on a single computing device or distributed over a network of multiple computing devices. They may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by a computing device, and the steps shown or described may be performed in an order different from the one herein; alternatively, they may each be fabricated as a separate integrated circuit module, or multiple modules or steps among them may be fabricated as a single integrated circuit module.
  • The face image retrieval method and apparatus provided by the present disclosure mitigate the poor face retrieval performance that results in the related art from performing face recognition with only low-level features, improving the efficiency and matching accuracy of face retrieval.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A face image retrieval method and apparatus. The method includes: computing, according to a preset rule, the degree of matching between the feature vector of an image to be retrieved and the feature vector of each face image in a gallery, and retrieving, according to the matching degree, one or more images that match the image to be retrieved, wherein the feature vectors include attribute features and similarity features.

Description

人脸图像的检索方法及装置 技术领域
本公开涉及人脸图像识别领域,例如,涉及一种人脸图像的检索方法及装置。
背景技术
随着社会的不断发展以及多方面对于快速有效的自动身份验证的迫切要求,生物特征识别技术在近几十年中得到了飞速的发展。其中人脸识别技术的研究吸引了大批研究者。人脸识别技术应用非常广泛,如协助公安部门刑侦破案,机器自动进行身份验证,视频监控跟踪识别,人脸面部表情分析等等。当前很多国家展开了有关人脸识别的研究,人脸识别的方法包括模板匹配、示例学习、神经网络、基于隐马尔可夫模型的方法以及基于支持向量机的方法。
在计算机人脸识别中,可以将那些通过大量图像数据简单处理后获得的特征定义为低层次特征,而将线、面、模式等描述特征定义为高层次特征。图像主成分分析(Principal Component Analysis,PCA)特征、小波变换特征及一些统计特征均属低层次特征的范畴,而人脸部件形状分析的结果则为高层次特征。采用男性,女性,微笑,黑发,带眼镜等属性进行人脸识别能获得不错的结果。此外,利用和一个人脸的相似性数据也可以进行人脸识别。户外脸部检测图库(Labeled Faces in the Wild,LFW)和哥伦比亚大学公众人物脸部图库(Public Figures Face Database,Pubfig)是两个独立的公共数据集,图库中的图片都是在非受控环境下获取的。这两个数据集中的姿势、表情、光照等不同会对人脸识别造成很大影响。相关技术中传统的方法只使用低层次特征进行人脸识别,导致人脸检索效果不佳。
针对相关技术中的上述,目前尚未存在有效的解决方案。
发明内容
本公开提供了一种人脸图像的检索方法及装置,避免了相关技术中使用低层次特征进行人脸识别,导致人脸检索效果不佳的现象。
本公开提供了一种人脸图像的检索方法,包括:通过预设规则计算待检索 图像的特征向量与图库中每幅人脸图像的特征向量的匹配度,依据所述匹配度检索出与所述待检索图像匹配的一个或多个图像,其中,所述特征向量包括属性特征和相似性特征。
可选地,通过预设规则计算待检索图像的特征向量与图库中每幅人脸图像的特征向量的匹配度之前,所述方法还包括:
对所述图库中的人脸图像的底层特征进行训练,得到所述人脸图像的属性特征;
对参考人脸图像的底层特征进行训练,得到相似性特征;以及
将所述属性特征与所述相似性特征作为所述图库中每幅人脸图像的特征向量。
可选地,所述对所述图库中的人脸图像的底层特征进行训练,得到所述人脸图像的属性特征包括:
对所述图库中每幅人脸图像中的关键点进行检测,其中,所述关键点包括:双眼的四个眼角、鼻尖以及嘴巴两端;
依据所述关键点对所述人脸图像进行区域的划分,并抽取得到与不同区域对应的底层特征;以及
利用属性分类器对不同区域的多个所述人脸底层特征进行分类学习得到不同类型的所述属性特征。
可选地,对图库中的参考人脸图像的底层特征进行训练,得到相似性特征包括:
对所述第一预定数量的参考人脸图像的关键点进行检测,其中,所述关键点包括:双眼四个眼角、鼻尖以及嘴巴两端;
依据所述关键点对所述人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征,得到与人脸不同区域对应的数据集;以及
利用相似性分类器对所述数据集进行分类学习得到所述相似性特征。
可选地,所述属性分类器和所述相似性分类器包括:支持向量机SVM分类器。
可选地,通过预设规则计算待检索图像的特征向量与所述图库中每幅人脸 图像的特征向量的匹配度,依据所述匹配度检索出与所述待检索图像匹配的一个或多个图像包括:
获取所述待检索图像的特征向量与所述图库中每幅人脸图像的特征向量;
对所述待检索图像的特征向量与所述图库中每幅人脸图像的特征向量进行距离计算,所述距离计算的方法包括:余弦距离方法或欧式距离方法;以及
对多个计算结果按照从大到小的规则进行排序,并从排序后的计算结果中选择取值靠前的第二预定数量的计算结果对应的人脸图像作为所述待检索图像的匹配图像。
本公开还提供一种人脸图像的检索装置,包括:
检索模块,设置为通过预设规则计算待检索图像的特征向量与图库中每幅人脸图像的特征向量的匹配度,依据所述匹配度检索出与所述待检索图像匹配的一个或多个图像,其中,所述特征向量包括属性特征和相似性特征。
可选地,所述检索装置还包括:
第一语义特征提取模块,设置为对图库中的人脸图像的底层特征进行训练得到所述人脸图像的属性特征;
第二语义特征提取模块,设置为对参考人脸图像的底层特征进行训练得到相似性特征;以及
处理模块,设置为将所述属性特征与相似性特征作为所述图库中每幅人脸图像的特征向量。
可选地,所述第一语义特征提取模块包括:
第一检测单元,设置为对所述图库中的人脸图像中的关键点进行检测,其中,所述关键点包括:双眼的四个眼角、鼻尖以及嘴巴两端;
第一处理单元,设置为依据所述关键点对所述人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征;以及
第二语义特征提取单元,设置为利用属性分类器对不同区域的多个所述人脸底层特征进行分类学习得到不同类型的所述第一数量的属性特征。
可选地,所述第二语义特征提取模块包括:
第二检测单元,设置为对所述参考人脸图像的第一预定数量的关键点进行 检测,其中,所述关键点包括:双眼四个眼角、鼻尖以及嘴巴两端;
第二处理单元,设置为依据所述关键点对所述人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征,得到与人脸不同区域对应的数据集;以及
第二语义特征提取单元,设置为利用相似性分类器对所述数据集进行分类学习得到所述相似性特征。
可选地,所述属性分类器和所述相似性分类器包括:支持向量机SVM分类器。
可选地,所述检索模块包括:
获取单元,设置为获取所述待检索图像的特征向量与所述图库中每幅人脸图像的特征向量;
计算单元,设置为对所述待检索图像的特征向量与所述图库中每幅人脸图像的特征向量进行距离计算,所述距离计算的方法包括:余弦距离方法或欧式距离方法;以及
检索单元,设置为对多个计算结果按照从大到小的规则进行排序,并从排序后的计算结果中选择取值靠前的第二预定数量的计算结果对应的人脸图像作为所述待检索图像的匹配图像。
本公开还提供了一种计算机可读存储介质,存储有计算机可执行指令,所述计算机可执行指令设置为执行上述的方法。
本公开还提供了一种电子设备,包括:
至少一个处理器;以及
与所述至少一个处理器通信连接的存储器;其中,
所述存储器存储有可被所述至少一个处理器执行的指令,所述指令被所述至少一个处理器执行,以使所述至少一个处理器执行上述方法。
在本公开中,通过待检索图像与图库中人脸图像的特征向量比较,检索出与待检索图像匹配的一个或多个图像,即是通过待检索图像与图库中的人脸图像的特征向量进行比较,该特征向量包括属性特征与相似性特征,而属性特征与相似性特征都属于高层次特征,因此匹配出来的结果与待检索图像的匹配度 高,减少了相关技术中使用低层次特征进行人脸识别,导致人脸检索效果不佳的现象,提高了人脸检索的效率与匹配度。
附图说明
此处所说明的附图用来提供对本公开的可选理解,构成本申请的一部分,本公开的示意性实施例及说明用于解释本发明,并不构成对本公开的不当限定。在附图中:
图1是本发明实施例的人脸图像的检索方法的流程图;
图2是本发明实施例的人脸图像的检索装置的结构框图;
图3是本发明实施例的人脸图像的检索装置的可选结构框图一;
图4是本发明实施例的人脸图像的检索装置的可选结构框图二;
图5是本发明实施例的人脸图像的检索装置的可选结构框图三;
图6是本发明可选实施例的人脸关键点检测的示意图;
图7是本发明可选实施例的坐标系统示意图;
图8a-8b是本发明可选实施例的人脸图像旋转对齐之前和之后的对比示意图;
图9是本发明可选实施例的相似性图像区域分割的示意图;
图10是本发明可选实施例的属性或相似性特征分类器学习以及特征提取过程示意图;
图11是本发明可选实施例的图片入库以及检索流程示意图;以及
图12是本发明实施例的电子设备的硬件结构示意图。
具体实施方式
下文中将参考附图并结合实施例来详细说明本发明。需要说明的是,在不冲突的情况下,本申请中的实施例及实施例中的特征可以相互组合。
需要说明的是,本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。
本实施例提供了一种人脸图像的检索方法,图1是本发明实施例的人脸图像的检索方法的流程图。
在步骤110中,对图库中的人脸图像的底层特征进行训练得到人脸图像的第一数量的属性特征。
在步骤120中,对图库中第一预定数量的参考人脸图像的底层特征进行训练得到第二数量的相似性特征。
在步骤130中,将第一数量的属性特征与第二数量的相似性特征作为图库中每幅人脸图像的特征向量。
在步骤140中,通过预设规则计算待检索图像的特征向量与所述图库中每幅人脸图像的特征向量的匹配度,依据所述匹配度检索出与所述待检索图像匹配的一个或多个图像。
其中,所述图库为人脸检索系统中写入数据库中的人脸图像集合,参考人脸图像为图库中的人脸图像子集合。
通过步骤110至步骤140可知,在本实施例中采用的是对人脸图像的底层特征进行训练得到第一数量的属性特征,此外还对第一预定数量参考人脸图像的指定语义特征进行训练得到第二数量的指定语义特征相对于图库中的人脸图像的属性特征的相似性特征。通过得到的第一数量的属性特征与第二数量的相似性特征得到每幅人脸图像的特征向量,比较待检索图像与图库中图像的特征向量,检索出与待检索图像匹配的一个或多个图像。也就是说,在本实施例中,是通过待检索图像与图库中的人脸图像的特征向量进行比较,该特征向量包括属性特征与相似性特征,而属性特征与相似性特征都属于高层次特征,因此匹配出来的结果与待检索图像的匹配度高,从而解决了相关技术中使用低层次特征进行人脸识别,导致人脸检索效果不佳的问题,提高了人脸检索的效率与匹配度。
对于上述步骤110中涉及到的对图库中的人脸图像的底层特征进行训练得到第一数量的人脸图像的属性特征的方式,在本实施例的可选实施方式中,可以通过如下方式来实现。
在步骤1101中,对图库中每幅人脸图像中的关键点进行检测,其中,关键点包括:双眼的四个眼角、鼻尖以及嘴巴两端。
需要说明的是,上述中涉及到的关键点仅仅是本可选实施例的可选关键点,其他关键点:头发、下巴、耳朵等等都是可以的。也就是说,只要是人脸上的特征都是可以的。
在步骤1102中,依据关键点对人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征。
在步骤1103中,利用属性分类器对不同区域的多个人脸底层特征进行分类学习得到不同类型的第一数量的属性特征。
上述步骤1101至步骤1103,在本实施例的应用场景中,该第一数量可选取值为69,因为人脸属性特征可以包括:男性,女性,微笑,黑发,带眼镜等等,这些都描述了人脸的语义特征,而人脸属性分类器的目标是对人脸图像进行分类,判断该人脸图像是否具有特定的属性。即,在应用场景中,基于上述步骤可以训练微笑、黑发、戴眼镜等69个属性特征的分类器用于表示人脸特征。而属性特征提取就是通过训练得到的属性特征分类器提取人脸的属性特征,也就是通过69个训练好的属性分类器对图像计算得到69个属性值,拼接形成该图像属性特征。
基于上述描述,在本实施例的可选实施方式中,该步骤1101至步骤S1103中涉及到的方式可以是:
抽取属性的底层特征,可以对图库中的每张人脸图像进行人脸检测,关键点定位,获得人脸图像的关键点信息,将人脸图像旋转对齐,根据属性需求对人脸图像进行区域分割(例如眼镜属性对应的区域为眼睛区域,白头发属性对应的区域为头发区域),不同的属性可能需要分割出不同数目的区域,对分割出的区域提取出该属性该区域有效的底层特征;
将抽取出的底层属性特征分为数量相等的两部分,一半用于训练,一半用于测试(当然这仅仅是举例说明,其他比例也是可以的,可以根据情况进行划分),如果一个属性使用了多个分割区域可以进行特征拼接,如是否带耳环这个属性可以使用到左右两边耳朵区域,对该底层特征学习相应的支持向量机(Support Vector Machine,SVM)属性分类器,如笑脸属性分类器、黑头发属性分类器、眼镜属性分类器等共69个属性分类器;以及
对依据属性分类器的分类值验证属性分类效果。
步骤120中涉及到的对第一预定数量的参考人脸图像的底层特征进行训练得到第二数量的相似性特征,在本实施例的可选实施方式中,可以通过如下方式来实现。
在步骤1201中,对第一预定数量的参考人脸图像的关键点进行检测,其中,关键点包括:双眼四个眼角、鼻尖以及嘴巴两端。
与上述步骤1101中涉及到的关键点一样,上述中涉及到的关键点仅仅是本可选实施例的可选关键点,其他关键点:头发、下巴、耳朵等等都是可以的。
在步骤1202中,依据关键点对参考人脸图像进行区域划分,抽取得到与不同区域对应的人脸底层特征,得到与指定语义特征对应的数据集。
在步骤1203中,利用相似性分类器对数据集进行分类学习得到第二数量的相似性特征。
其中,在上述涉及到的第一预定数量可以取值为10,基于此,上述步骤1201至步骤1203,在应用场景中,相似性分类器的训练过程可以如下:
选取例如10个参考人,分别对每个参考人单独处理,将每个参考人所有的人脸图片作为正样本,并选择同等数量的其他人脸图片作为负样本,以参考人为单位构成一个数据集;
对每个数据集按照如下过程处理抽取特征,即,可以进行人脸检测以及关键点定位,并将人脸图片旋转对齐,在每张人脸图片上分别分割出眼睛、眉毛、鼻子和嘴巴四个子块,对四个子块分别抽取底层特征,将这个数据集转化成4个新的子数据集,即眼睛数据集、眉毛数据集、嘴巴数据集和鼻子数据集;以及
将每个参考人的每个子数据集进行划分,一半数据作为训练,另一半数据作测试,在训练集上学习SVM模型,在测试集上验证相似性分类器分类效果,将训练产生的模型文件以及特征归一化文件进行保存,用于后续的相似性特征提取。调用训练得到相似性分类器对四个子块分别计算相似性数值,得到40个相似性值。
本实施例中涉及到的属性分类器和相似性分类器可选为SVM属性分类器。
此外,在本实施例的可选实施方式中,步骤140中涉及到的通过预设规则 计算待检索图像的特征向量与图库中每幅人脸图像的特征向量的匹配度,依据匹配度检索出与待检索图像匹配的一个或多个图像,可以通过如下方式来实现。
在步骤1401中,获取待检索图像的特征向量与图库中每幅人脸图像的特征向量。
在步骤1402中,对待检索图像的特征向量与图库中每幅人脸图像的特征向量进行距离计算,其中,该距离计算的方法,在本可选实施方式可以是余弦距离方法或欧式距离方法。
在步骤1403中,对多个计算结果按照从大到小的规则进行排序,并从排序后的计算结果中选择取值靠前的第二预定数量的计算结果对应的人脸图像作为待检索图像的匹配图像。
在应用场景中,上述步骤1401至步骤1403可以是:在得到人脸图像的属性值和相似性值构成的特征向量后,可以进行基于组合人脸属性特征和相似性特征的人脸检索,将获得的69个属性值和40个相似性值拼接作为每幅人脸图像的特征向量,使用大间隔最近邻居(Large Margin Nearest Neighbors,LMNN)算法优化特征向量每一维的权重。使用特征向量和权重便可计算两张人脸的相似度。本实施例采用两个向量夹角的余弦(Cosine)作为向量之间的相似性数值,cos θ取值范围在[-1,+1],越接近于+1,代表两张人脸图片中的人脸越相似。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例的方法可借助软件加通用硬件平台的方式来实现,当然也可以通过硬件。基于这样的理解,本发明的技术方案本质上可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如只读存储器(Read-Only Memory,ROM)或随机存取存储器(RandomAccess Memory,RAM)、磁碟、光盘)中,包括一个或多个指令用以使得一台终端设备(可以是手机,计算机,服务器,或者网络设备等)执行本发明实施例所述的方法。
在本实施例中还提供了一种人脸图像的检索装置,该装置用于实现上述实施例及可选实施方式,已经进行过说明的不再赘述。如以下所使用的,术语“模块”可以实现预定功能的软件或硬件(或者,软件和硬件)的组合。
图2是本发明实施例的人脸图像的检索装置的结构框图,如图2所示,该装置包括:第一语义特征提取模块22,第二语义特征提取模块24,处理模块26以及检索模块28。其中,第一语义特征提取模块22,设置为对图库中的人脸图 像的底层特征进行训练得到第一数量的人脸图像的属性特征。第二语义特征提取模块24,与第一语义特征提取模块22耦合连接,设置为对第一预定数量的参考人脸图像的底层特征进行训练得到第二数量的相似性特征。处理模块26,与第二语义特征提取模块24耦合连接,设置为将第一数量的属性特征与第二数量的相似性特征作为图库中每幅人脸图像的特征向量。检索模块28,与处理模块26耦合连接,设置为通过预设规则计算待检索图像的特征向量与所述图库中每幅人脸图像的特征向量的匹配度,依据所述匹配度检索出与所述待检索图像匹配的一个或多个图像。
图3是本发明实施例的人脸图像的检索装置的可选结构框图一,如图3所示,第一语义特征提取模块22包括:第一检测单元32,第一处理单元34以及第二语义特征提取单元36。其中,第一检测单元32,设置为对图库中的人脸图像中的关键点进行检测,其中,关键点为双眼的四个眼角、鼻尖以及嘴巴两端。第一处理单元34,与第一检测单元32耦合连接,设置为依据关键点对人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征。第二语义特征提取单元36,与第一处理单元34耦合连接,设置为利用属性分类器对不同区域的多个人脸底层特征进行分类学习得到不同类型的属性特征。
图4是本发明实施例的人脸图像的检索装置的可选结构框图二,如图4所示,第二语义特征提取模块24包括:第二检测单元42,第二处理单元44以及第二语义特征提取单元46。其中,第二检测单元42,设置为对第一预定数量的参考人脸图像的关键点进行检测,其中,关键点包括:双眼四个眼角、鼻尖以及嘴巴两端。第二处理单元44,与第二检测单元42耦合连接,设置为依据关键点对人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征,得到与指定语义特征对应的数据集。第二语义特征提取单元46,与第二处理单元44耦合连接,设置为利用相似性分类器对数据集进行分类学习得到第二数量的相似性特征。
可选地,本实施例中涉及到的属性分类器和相似性分类器为支持向量机SVM分类器。
图5是本发明实施例的人脸图像的检索装置的可选结构框图三,如图5所示,检索模块28包括:获取单元52,计算单元54以及检索单元56。获取单元52,设置为获取待检索图像的特征向量与图库中每幅人脸图像的特征向量。计 算单元54,与获取单元52耦合连接,设置为对待检索图像的特征向量与图库中每幅人脸图像的特征向量进行距离计算,所述距离计算的方法包括:余弦距离方法或欧式距离方法。检索单元56,与计算单元54耦合连接,设置为对多个计算结果按照从大到小的规则进行排序,并从排序后的计算结果中选择取值靠前的第二预定数量的计算结果对应的人脸图像作为待检索图像的匹配图像。
上述模块是可以通过软件或硬件来实现的。对于软件实现方式,上述模块可以以软件模块的形式位于一个或多个存储介质中。对于硬件实现方式,上述模块可以均位于同一处理器中;或者,上述模块可以分别位于多个处理器中。
下面结合本发明的可选实施例对本公开进行举例说明;
本可选实施例提供了一种人脸高层语义特征的提取方法,该方法通过属性分类器以及相似性分类器提取人脸的高层语义特征,通过组合人脸属性特征和相似性特征进行人脸相似性度量,实现相似人脸检索,其中,包含人脸属性分类器学习以及人脸属性特征获取方法,人脸相似性分类器学习以及相似性特征获取方法,基于人脸属性特征和相似性特征的人脸检索三个部分,下面对该三个部分进行详细说明。
1.人脸属性分类器学习以及人脸属性特征获取方法可以包括:人脸属性分类器学习方式、属性分类器的训练、人脸相似性分类器学习以及相似性分类器的训练。
在人脸属性分类器学习方式的过程中,人脸属性包括:男性,女性,微笑,黑发,带眼镜等,它描述了人脸的语义特征。人脸属性分类器的可以对人脸图像进行分类,判断该人脸图像是否具有特定属性。例如,在本公开中,共训练了微笑、黑发、戴眼镜等69个属性特征的分类器用于人脸特征的表示。属性特征提取就是通过训练得到的属性特征分类器提取人脸的属性特征,在本实施例中也就是通过69个训练好的属性分类器对人脸图像计算得到69个属性值,拼接形成该人脸图像属性特征。
属性分类器的训练过程可以包括:
获得该属性的标注图像,对每个属性选择一定规模的正例样本和负例样本人脸图片(样本人脸图片对该属性表现明显),以此作为该属性的标注集;
对标注集中的每张人脸图像进行人脸检测,关键点定位,获得人脸图像的 关键点信息,将人脸图片旋转对齐,可以根据属性需求对人脸图像进行区域分割(例如眼镜属性对应的区域为眼睛区域,白头发属性对应的区域为头发区域),不同的属性可能需要分割出不同数目的区域,对分割出的区域提取出该属性该区域有效的底层特征(如采用局部二值模式(Local Binary Pattern,LBP)算法,Gabor算法等)。
将之前标注集中的正负例图像抽取出的底层属性特征分为数量相等的两部分,一半用于训练,一半用于测试,如果一个属性使用了多个分割区域可以先进行特征拼接,如是否带耳环这个属性可以使用到左右两边耳朵区域,对该底层特征学习相应的SVM属性分类器,如笑脸属性分类器、黑头发属性分类器、眼镜属性分类器等共69个属性分类器,此外生成训练集上的特征归一化文件,将训练产生的模型文件以及特征归一化文件进行保存;以及
对测试集进行同样的归一化处理以及特征拼接后,依据属性分类器的分类值验证属性分类效果。
人脸相似性分类器学习过程可以包括:
相似性分类器的训练目标是训练参考人的五官相似性分类器,根据五官相似分类器对新的人脸图像进行分类,可以判断该人脸的五官与参考人的五官是否相似。
相似性分类器的训练过程可以包括:
选取多个(例如10个)参考人,分别对每个参考人单独处理,将每个参考人所有的图片作为正样本,并选择同等数量的其他人脸图片作为负样本,以参考人为单位构成一个数据集;
进行人脸检测,关键点定位,并将人脸图片旋转对齐,在每张人脸图片上分别分割出眼睛、眉毛、鼻子和嘴巴四个子块,对四个子块分别抽取底层特征(如LBP,Gabor等),将这个数据集转化成4个新的子数据集,即眼睛数据集、眉毛数据集、嘴巴数据集和鼻子数据集;以及
将每个参考人的多个子数据集进行划分,一半数据作为训练,另一半数据作测试,在训练集上学习SVM模型,在测试集上验证相似性分类器分类效果,将训练产生的模型文件以及特征归一化文件进行保存,用于后续的相似性特征提取。
在属性特征和相似性特征提取中,属性分类器和相似性分类器的特征提取过程是类似的,可以包括:
对于一幅输入图像,可以按照和属性分类器训练中一样的过程对人脸图像进行人脸检测,关键点定位,获得人脸图像的关键点信息,将图片旋转对齐,根据每个属性的需求对人脸图像进行区域分割,并调用训练得到的属性分类器进行分类,得到属性分类器数值,将所有的属性分类数值进行拼接,得到输入图像的人脸属性特征;
对于一幅输入图像提取相似性特征,其过程同提取属性特征类似,可以包括:对人脸图像进行人脸检测,关键点定位,并将人脸图片旋转对齐,在每张人脸图片上分别分割出眼睛、眉毛、鼻子和嘴巴四个子块,对四个子块分别抽取底层特征,调用训练得到相似性分类器对四个子块分别计算相似性数值,将所有的相似性数值进行拼接,得到输入图像的人脸相似性特征;以及
对输入图像的属性特征和相似性特征进行拼接,这样就得到了这张图像的69个属性值和40个相似性值构成的一个特征向量。
在基于属性特征和相似性特征组合的人脸检索中,在得到人脸图像的属性值和相似性值构成的特征向量后,可以进行基于组合人脸属性特征和相似性特征的人脸检索,如下:
将获得的69个属性值和40个相似性值拼接作为每幅人脸图像的特征向量,使用LMNN算法优化特征向量每一维的权重。使用特征向量和权重便可计算两张人脸的相似度。本公开中,可以采用两个向量夹角的余弦(Cosine)作为向量之间的相似性数值,cosθ取值范围在[-1,+1],越接近于+1,代表两张人脸图片中的人脸越相似。
下面结合附图对本发明可选实施例进行详细的说明。
本可选实施例中的算法每个部件采用的技术方案可以包括:人脸关键点检测、图像预处理、图像区域分割、特征抽取、以及分类器训练。
(1)关键点检测;
图6是本发明可选实施例的人脸关键点检测的示意图,如图6所示,本可选实施例中采用flandmark(开源实现面部地标探测器)进行快速人脸关键点检测,检测点可以为眼角、鼻尖和嘴巴两端这7个关键点。
(2)图像预处理;
图像预处理是对原始人脸图像的旋转和对齐。根据得到的关键点的数据可以定位得到双眼瞳孔位置信息。由于旋转后的人脸的瞳孔应该在一条直线上,即瞳孔坐标的X值应该相等,进而可以计算旋转的角度。旋转后的图像保存为250像素*250像素大小,不足的部分用黑色填充。
本可选实施例中采用的坐标系统与通常的坐标系不同,水平方向从左向右为y轴,垂直方向从上向下为x轴。
假设一幅图像,左右眼睛的坐标分别为(plx,ply)和(prx,pry),两眼之间连线的中点坐标为(mx,my),两眼瞳孔之间的距离为d。此时图像放缩比例ratio=d/dd(dd可以默认为75)。两眼连线与y轴之间的夹角为θ,两眼连线的斜率为k。图7是本发明可选实施例的坐标系统示意图,如图7所示,椭圆代表眼睛的位置。
要从原人脸图像中分割出符合人脸标准图像,可以对人脸图像进行以下处理:旋转θ度,使得两眼连线与y轴重合;进行图像缩放,使得两眼距离为dd;以及移动人脸图像,使得两眼中点移动到(mx,my)。
图8a-8b是本发明可选实施例的人脸图像旋转对齐之前和之后的对比示意图,其中,图8a是旋转之前,图8b是旋转之后;
(3)区域分割;
图像区域分割,以分割眼睛为例,找到左眼角的关键点坐标pLeftIndex和右眼角的关键点坐标pRightIndex,根据这两点的坐标可以计算出它们的中点
Figure PCTCN2016111533-appb-000001
以这个中点为矩形的中心点,定义中心点到矩形左右边界的距离为centerToLeft,到矩形上下边界的距离为centerToUp,以及定义图像的宽为width,高为height,根据中心点位置以及centerToUp和centerToLeft可以得到分割区域左上角的坐标,根据左上角坐标以及宽和高的信息,就得到了包含分割的眼睛区域信息的人脸图像。
(4)特征提取;
特征抽取可以用Gabor小波变换提取图像块的特征。在特征提取方面,Gabor小波变换处理的数据量较少,能满足系统的实时性要求,小波变换对光照变化不敏感,且能容忍一定程度的图像旋转和变形,当采用基于夹角余弦距离进行识别时,特征模式与待测特征不需要严格的对应,故能提高系统的鲁棒性。因此,在人脸识别的过程中可以采用Gabor小波变换方法对图像进行特征提取。
(5)SVM分类器;
SVM分类器训练使用了LIBSVM。LIBSVM是一个简单、易于使用和快速有效的SVM模式识别与回归的软件包,是一个实现了支持向量机SVM算法的库。使用LIBSVM中,可以训练一个数据集获得分类模型,以及使用模型预测测试数据集的类标。SVM分类器可以在特征空间中找到一个最大化间隔的分离超平面,将不同的类分开。
2.属性分类器相似性分类器具体实施方式;在属性分类器的实施方式中,属性分类器采用了LFW(Labeled Faces in the Wild)数据集,对表1中69个属性中的每个属性,可以从LFW数据集中挑选出符合这个属性和不符合这个属性的人脸图片各1000张(建议不少于这个数目),对每张人脸图片都进行标记,如符合该属 性标记为+1,如不符合则标记为-1。分割出属性对应的区域,见表2。因为一些属性可以使用相同的分割区域,所以69个属性可以缩减对应到19个区域。对所用人脸图片抽取Gabor小波变换特征。使用LIBSVM对每个属性进行训练。
表1
Male/男性 Eyes Open/睁眼
Asian/亚洲人 Big Nose/大鼻子
White/白人 Pointy Nose/尖鼻子
Black/黑人 Big Lips/大嘴唇
Child/儿童 Mouth Closed/张嘴
Youth/青年 Mouth Slightly Open/轻微张嘴
Middle Aged/中年 Mouth Wide Open/张大嘴
Senior/较年长者 Teeth Not Visible/牙齿不可见
Black Hair/黑发 No Beard/没胡须
Blond Hair/金发碧眼 Goatee/山羊胡子
Brown Hair/棕发 Round Jaw/圆下巴
Bald/秃顶 Double Chin/双下巴
No Eyewear/不戴眼镜 Wearing Hat/戴帽子
Eyeglasses/戴眼镜 Oval Face/椭圆形脸
Sunglasses/戴太阳镜 Square Face/方脸
Mustache/胡子 Round Face/圆脸
Smiling/笑 Frowning/皱眉
Narrow Eyes/窄眼 Chubby/丰满
Blurry/模糊 Gray Hair/灰发
Harsh Lighting/光照刺目 Bags Under Eyes/眼袋
Soft Lighting/光照柔和 Heavy Makeup/浓妆
Curly Hair/卷毛 Rosy Cheeks/玫瑰色面颊
Wavy Hair/波浪形头发 Shiny Skin/皮肤有光泽
Straight Hair/直发 Pale Skin/皮肤苍白
Receding Hairline/高发际线 Five O’Clock Shadow/满脸胡须
Bangs/刘海 Strong Nose-Mouth Lines/明显的鼻嘴间线
Sideburns/连鬓胡子 Wearing Lipstick/涂口红
Fully Visible Forehead/裸露额头 Flushed Face/面露激动
Partially Visible Forehead/半露额头 High Cheekbones/高颧骨
Obstructed Forehead/不露额头 Brown Eyes/灰色眼睛
Bushy Eyebrows/眉毛浓密 Wearing Earrings/戴耳环
Arched Eyebrows/弧形眉毛 Wearing Necktie/戴领带
Posed Photo/摆姿势 Wearing Necklace/戴项链
Attractive Man/吸引人的男性 Indian/印度人
Attractive Woman/吸引人的女性  
表2
Figure PCTCN2016111533-appb-000002
Figure PCTCN2016111533-appb-000003
Figure PCTCN2016111533-appb-000004
Figure PCTCN2016111533-appb-000005
Figure PCTCN2016111533-appb-000006
Figure PCTCN2016111533-appb-000007
在相似性分类器的实施方式中,相似性分类器可以采用了PubFig(Public Figures Face Database)数据集,从PubFig图像库中选出10个具有代表性的人,每个人的人脸图片要不少于150张,分割出相似分类器所需的眼睛、眉毛、鼻子和嘴巴四个区域,图9是本发明可选实施例的相似性图像区域分割的示意图,如图9所示,这样总共可以得到40个区域,分别提取特征。以第一个参考人的眼睛区域为例,把他自己的眼睛区域作为正例,标记为+1。另外从图库中挑选出不是这个人的人脸图片(建议是多个不同人),张数大致等于这个参考人所有人脸图片的张数,分割出眼睛区域,提取特征,作为负例,标记为-1。同样对分割的区域提取Gabor小波变换特征。使用LIBSVM训练得到40个相似性分类器。
3.计算输入图片的属性特征和相似性特征以及特征权重学习,可以包括:
对输入图片进行人脸关键点检测,图像预处理,图像区域分割,特征抽取操作,19个区域得到19个区域特征;
对69个属性和40个相似性分类器模型,通过LIBSVM的svmpredict()函数可以得到对应的值,这样就得到了这张图像的69个属性值和40个相似性值,对属性值和相似性值进行组合,得到这幅图片的特征向量
Figure PCTCN2016111533-appb-000008
以及
在输入的已标记人员类标的图片集上(采用的LFW数据集),对图片集上提取的属性特征和相似性特征,使用LMNN算法优化特征向量每一维的权重,得到
Figure PCTCN2016111533-appb-000009
4.人脸检索实施过程;
对检索图库所有人脸图片进行处理得到每张人脸图片的109维特征向量;
对新输入的待检索图片也同样处理得到109维特征向量;以及
计算人脸图片相似性,这里采用夹角余弦,对于两个向量
Figure PCTCN2016111533-appb-000010
Figure PCTCN2016111533-appb-000011
Figure PCTCN2016111533-appb-000012
其中,上述公式中,<x,y>表示求两个向量x和y之间的内积,||·||表示求向量的模,sim(vi,vj)取值范围在[-1,+1],越接近于+1,代表两张人脸图片中的人脸越相似。对待检索图片的特征向量和检索库中所有人脸图片的特征向量求夹角余弦,取余弦值最大的前N(可以默认为1000)张图片。被选出的取值靠前的图片中的人被认为最可能和输入图片中的是同一人。
本可选实施例中的人脸属性分类器学习以及属性特征获取,人脸相似性分类器学习以及相似性特征获取的流程,图10是本发明可选实施例的属性或相似性特征分类器学习以及特征提取过程示意图,如图10所示,该提取过程包括:属性或相似性分类器训练过程和属性或相似性分类器提取过程。
其中,属性或相似性分类器训练过程的步骤包括:
带标记图库;
快速人脸关键点检测;
图像预处理;
图像区域分割;
特征抽取;以及
SVM分类训练和测试。
属性或相似性分类器提取过程的步骤可以包括:
测试或检索图片;
快速人脸关键点检测;
图像预处理;
图像区域分割;
特征抽取;以及
计算属性或相似性值。
需要说明的是,属性分类器和相似性分类器的区别在于SVM分类器训练和测试部分,属性分类器中符合该属性标记为+1,如不符合则标记为-1;相似性分类器中当前参考人标记为+1,其他人标记为-1。
图11是本发明可选实施例的图片入库以及检索流程示意图,如图11所示,该过程可以包括:图片存入数据库流程和检索流程;
其中,图片存入数据库流程可以包括:
检索图库;
计算属性和相似性特征值;
属性或相似性特征组合;以及
人像属性或相似性特征数据库;该步骤执行完之后执行属性或相似性特征组合。
检索流程可以包括:
待检索图片;
计算属性和相似性特征值;
属性或相似性特征组合;
特征比对;以及
检索结果或最相似图片。
需要说明的是,最终特征向量可以是包含69个属性值和40个相似性值的109维向量。
本发明的实施例还提供了一种非暂态计算机可读存储介质。可选地,在本实施例中,上述非暂态计算机可读存储介质存储有计算机可执行指令,所述计算机可执行指令可以被设置为执行上述任一人脸图像的检索方法。
本公开还提供了一种电子设备的硬件结构示意图。参见图12,该电子设备包括:
至少一个处理器(Processor)120,图12中以一个处理器120为例;和存储器(Memory)121,还可以包括通信接口(Communications Interface)122和总线123。其中,处理器120、通信接口122、存储器121可以通过总线123完成相互间的通信。通信接口122可以用于信息传输。处理器120可以调用存储器121中的逻辑指令,以执行人脸图像的检索方法。
此外,上述的存储器121中的逻辑指令可以通过软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。
存储器121作为一种计算机可读存储介质,可用于存储软件程序、计算机可执行程序,如本公开实施例中的方法对应的程序指令或模块。处理器120通过运行存储在存储器121中的软件程序、指令或模块,从而执行功能应用以及数据处理,即实现人脸图像的检索方法。
存储器121可包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序;存储数据区可存储根据终端设备的使用所创建的数据等。此外,存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器。
本公开的技术方案可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括一个或多个指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本公开实施例所述方法的全部或部分步骤。而前述的存储介质可以是非暂态存储介质,包括:U盘、移动硬盘、ROM、RAM、磁碟或者光盘等多种可以存储程序代码的介质,也可以是暂态存储介质。
本领域的技术人员应该明白,上述的本发明的模块或步骤可以用通用的计算装置来实现,它们可以集中在单个的计算装置上,或者分布在多个计算装置所组成的网络上,可选地,它们可以用计算装置可执行的程序代码来实现,从而,可以将它们存储在存储装置中由计算装置来执行,并且在可以以不同于此处的顺序执行所示出或描述的步骤,或者将它们分别制作成多个集成电路模块,或者将它们中的多个模块或步骤制作成单个集成电路模块来实现。
工业实用性
本公开提供的人脸图像的检索方法及装置,减少了相关技术中使用低层次特征进行人脸识别,导致人脸检索效果不佳的现象,提高了人脸检索的效率与匹配度。

Claims (13)

  1. 一种人脸图像的检索方法,包括:
    通过预设规则计算待检索图像的特征向量与图库中每幅人脸图像的特征向量的匹配度,依据所述匹配度检索出与所述待检索图像匹配的一个或多个图像,其中,所述特征向量包括属性特征和相似性特征。
  2. 根据权利要求1所述的方法,通过预设规则计算待检索图像的特征向量与图库中每幅人脸图像的特征向量的匹配度之前,所述方法还包括:
    对所述图库中的人脸图像的底层特征进行训练,得到所述人脸图像的属性特征;
    对参考人脸图像的底层特征进行训练,得到相似性特征;以及
    将所述属性特征与所述相似性特征作为所述图库中每幅人脸图像的特征向量。
  3. 根据权利要求2所述的方法,其中,所述对所述图库中的人脸图像的底层特征进行训练,得到所述人脸图像的属性特征包括:
    对所述图库中每幅人脸图像中的关键点进行检测,其中,所述关键点包括:双眼的四个眼角、鼻尖以及嘴巴两端;
    依据所述关键点对所述人脸图像进行区域的划分,并抽取得到与不同区域对应的底层特征;以及
    利用属性分类器对不同区域的多个所述人脸底层特征进行分类学习得到不同类型的所述属性特征。
  4. 根据权利要求2所述的方法,其中,对图库中的参考人脸图像的底层特征进行训练,得到相似性特征包括:
    对所述第一预定数量的参考人脸图像的关键点进行检测,其中,所述关键点包括:双眼四个眼角、鼻尖以及嘴巴两端;
    依据所述关键点对所述人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征,得到与人脸不同区域对应的数据集;以及
    利用相似性分类器对所述数据集进行分类学习得到所述相似性特征。
  5. 根据权利要求3或4所述的方法,其中,所述属性分类器和所述相似性分类器包括:支持向量机SVM分类器。
  6. 根据权利要求1所述的方法,其中,通过预设规则计算待检索图像的特征向量与所述图库中每幅人脸图像的特征向量的匹配度,依据所述匹配度检索出与所述待检索图像匹配的一个或多个图像包括:
    获取所述待检索图像的特征向量与所述图库中每幅人脸图像的特征向量;
    对所述待检索图像的特征向量与所述图库中每幅人脸图像的特征向量进行距离计算,所述距离计算的方法包括:余弦距离方法或欧式距离方法;以及
    对多个计算结果按照从大到小的规则进行排序,并从排序后的计算结果中选择取值靠前的第二预定数量的计算结果对应的人脸图像作为所述待检索图像的匹配图像。
  7. 一种人脸图像的检索装置,包括:
    检索模块,设置为通过预设规则计算待检索图像的特征向量与图库中每幅人脸图像的特征向量的匹配度,依据所述匹配度检索出与所述待检索图像匹配的一个或多个图像,其中,所述特征向量包括属性特征和相似性特征。
  8. 根据权利要求6所述的装置,还包括:
    第一语义特征提取模块,设置为对图库中的人脸图像的底层特征进行训练得到所述人脸图像的属性特征;
    第二语义特征提取模块,设置为对参考人脸图像的底层特征进行训练得到相似性特征;以及
    处理模块,设置为将所述属性特征与相似性特征作为所述图库中每幅人脸图像的特征向量。
  9. 根据权利要求7所述的装置,其中,所述第一语义特征提取模块包括:
    第一检测单元,设置为对所述图库中的人脸图像中的关键点进行检测,其中,所述关键点包括:双眼的四个眼角、鼻尖以及嘴巴两端;
    第一处理单元,设置为依据所述关键点对所述人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征;以及
    第二语义特征提取单元,设置为利用属性分类器对不同区域的多个所述人脸底层特征进行分类学习得到不同类型的所述第一数量的属性特征。
  10. 根据权利要求6所述的装置,其中,所述第二语义特征提取模块包括:
    第二检测单元,设置为对所述参考人脸图像的第一预定数量的关键点进行检测,其中,所述关键点包括:双眼四个眼角、鼻尖以及嘴巴两端;
    第二处理单元,设置为依据所述关键点对所述人脸图像进行区域的划分,并抽取得到与不同区域对应的人脸底层特征,得到与人脸不同区域对应的数据集;以及
    第二语义特征提取单元,设置为利用相似性分类器对所述数据集进行分类学习得到所述相似性特征。
  11. 根据权利要求9或10所述的装置,其中,所述属性分类器和所述相似性分类器包括:支持向量机SVM分类器。
  12. 根据权利要求8所述的装置,其中,所述检索模块包括:
    获取单元,设置为获取所述待检索图像的特征向量与所述图库中每幅人脸图像的特征向量;
    计算单元,设置为对所述待检索图像的特征向量与所述图库中每幅人脸图 像的特征向量进行距离计算,所述距离计算的方法包括:余弦距离方法或欧式距离方法;以及
    检索单元,设置为对多个计算结果按照从大到小的规则进行排序,并从排序后的计算结果中选择取值靠前的第二预定数量的计算结果对应的人脸图像作为所述待检索图像的匹配图像。
  13. 一种计算机可读存储介质,存储有计算机可执行指令,所述计算机可执行指令设置为执行权利要求1-6中任一项的方法。
PCT/CN2016/111533 2015-12-22 2016-12-22 人脸图像的检索方法及装置 WO2017107957A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510963793.1 2015-12-22
CN201510963793.1A CN106909870A (zh) 2015-12-22 2015-12-22 人脸图像的检索方法及装置

Publications (2)

Publication Number Publication Date
WO2017107957A1 true WO2017107957A1 (zh) 2017-06-29
WO2017107957A9 WO2017107957A9 (zh) 2017-09-08

Family

ID=59089016

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/111533 WO2017107957A1 (zh) 2015-12-22 2016-12-22 人脸图像的检索方法及装置

Country Status (2)

Country Link
CN (1) CN106909870A (zh)
WO (1) WO2017107957A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113205040A * 2017-08-09 2021-08-03 北京市商汤科技开发有限公司 Face image processing method and device, and electronic equipment
CN107679474A * 2017-09-25 2018-02-09 北京小米移动软件有限公司 Face matching method and device
CN107622256A * 2017-10-13 2018-01-23 四川长虹电器股份有限公司 Intelligent photo album system based on face recognition technology
CN108932321B * 2018-06-29 2020-10-23 金蝶软件(中国)有限公司 Face image retrieval method and device, computer equipment, and storage medium
CN109993102B * 2019-03-28 2021-09-17 北京达佳互联信息技术有限公司 Similar face retrieval method and device, and storage medium
CN110147776B * 2019-05-24 2021-06-11 北京百度网讯科技有限公司 Method and device for determining positions of face key points
CN111914649A * 2020-07-01 2020-11-10 珠海大横琴科技发展有限公司 Face recognition method and device, electronic equipment, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102495999A * 2011-11-14 2012-06-13 深圳市奔凯安全技术有限公司 Face recognition method
CN102968626A * 2012-12-19 2013-03-13 中国电子科技集团公司第三研究所 Face image matching method
CN104883548A * 2015-06-16 2015-09-02 金鹏电子信息机器有限公司 Method and system for capturing and processing faces in surveillance video
CN105100735A * 2015-08-31 2015-11-25 张慧 Intelligent personnel monitoring and management system and management method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8311294B2 * 2009-09-08 2012-11-13 Facedouble, Inc. Image classification and information retrieval over wireless digital networks and the internet
GB0818089D0 * 2008-10-03 2008-11-05 Eastman Kodak Co Interactive image selection method
CN102622590B * 2012-03-13 2015-01-21 上海交通大学 Identity recognition method based on face and fingerprint collaboration
CN103853795A * 2012-12-07 2014-06-11 中兴通讯股份有限公司 Method and device for constructing a picture index based on an N-gram model
US9235567B2 * 2013-01-14 2016-01-12 Xerox Corporation Multi-domain machine translation model adaptation
CN103824052B * 2014-02-17 2017-05-03 北京旷视科技有限公司 Face feature extraction and recognition method based on multi-level semantic features
CN104732602B * 2015-02-04 2017-02-22 四川长虹电器股份有限公司 Attendance method based on cloud face and expression recognition

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109753850A * 2017-11-03 2019-05-14 富士通株式会社 Training method and training device for a face recognition model
CN109753850B * 2017-11-03 2022-10-25 富士通株式会社 Training method and training device for a face recognition model
CN108256438A * 2017-12-26 2018-07-06 杭州魔点科技有限公司 Video surveillance method and device based on back-end face recognition
CN110032912A * 2018-01-11 2019-07-19 富士通株式会社 Face verification method and device, and computer storage medium
CN108985198A * 2018-07-02 2018-12-11 四川斐讯信息技术有限公司 Cosine distance calculation method based on big-data feature vectors
CN110909562A * 2018-09-14 2020-03-24 传线网络科技(上海)有限公司 Video review method and device
CN109472269A * 2018-10-17 2019-03-15 深圳壹账通智能科技有限公司 Image feature configuration and verification method and device, computer equipment, and medium
CN111382286A * 2018-12-27 2020-07-07 深圳云天励飞技术有限公司 Data processing method and related products
CN111652015A * 2019-03-27 2020-09-11 上海铼锶信息技术有限公司 Method and system for selecting key faces in a picture
CN111652015B * 2019-03-27 2024-04-26 上海铼锶信息技术有限公司 Method and system for selecting key faces in a picture
CN110866466A * 2019-10-30 2020-03-06 平安科技(深圳)有限公司 Face recognition method and device, storage medium, and server
CN110866466B * 2019-10-30 2023-12-26 平安科技(深圳)有限公司 Face recognition method and device, storage medium, and server
CN112800819A * 2019-11-14 2021-05-14 深圳云天励飞技术有限公司 Face recognition method and device, and electronic equipment
CN110942014A * 2019-11-22 2020-03-31 浙江大华技术股份有限公司 Fast face recognition retrieval method and device, server, and storage device
CN110929064B * 2019-11-29 2024-02-09 交通银行股份有限公司 Face recognition sample library and retrieval method
CN110929064A * 2019-11-29 2020-03-27 交通银行股份有限公司 Face recognition sample library and retrieval method
CN111079688A * 2019-12-27 2020-04-28 中国电子科技集团公司第十五研究所 Infrared-image-based liveness detection method for face recognition
CN111444374B * 2020-04-09 2023-05-02 上海依图网络科技有限公司 Human body retrieval system and method
CN111444374A * 2020-04-09 2020-07-24 上海依图网络科技有限公司 Human body retrieval system and method
CN111597894A * 2020-04-15 2020-08-28 杭州东信北邮信息技术有限公司 Face database updating method based on face detection technology
CN111597894B * 2020-04-15 2023-09-15 新讯数字科技(杭州)有限公司 Face database updating method based on face detection technology
CN111931567A * 2020-07-01 2020-11-13 珠海大横琴科技发展有限公司 Human body recognition method and device, electronic equipment, and storage medium
CN111931567B * 2020-07-01 2024-05-28 珠海大横琴科技发展有限公司 Human body recognition method and device, electronic equipment, and storage medium
CN112069908B * 2020-08-11 2024-04-05 西安理工大学 Pedestrian re-identification method based on co-occurrence attributes
CN112069908A * 2020-08-11 2020-12-11 西安理工大学 Pedestrian re-identification method based on co-occurrence attributes
CN112084904A * 2020-08-26 2020-12-15 武汉普利商用机器有限公司 Face search method and device, and storage medium
CN112163456A * 2020-08-28 2021-01-01 北京中科虹霸科技有限公司 Identity recognition model training method, testing method, recognition method, and device
CN112163456B * 2020-08-28 2024-04-09 北京中科虹霸科技有限公司 Identity recognition model training method, testing method, recognition method, and device
CN112633119A * 2020-12-17 2021-04-09 北京赢识科技有限公司 Human attribute recognition method and device, electronic equipment, and medium
CN113052064B * 2021-03-23 2024-04-02 北京思图场景数据科技服务有限公司 Attention detection method based on face orientation, facial expression, and pupil tracking
CN113052064A * 2021-03-23 2021-06-29 北京思图场景数据科技服务有限公司 Attention detection method based on face orientation, facial expression, and pupil tracking
WO2022205259A1 * 2021-04-01 2022-10-06 京东方科技集团股份有限公司 Face attribute detection method and device, storage medium, and electronic equipment
CN113158939B * 2021-04-29 2022-08-23 南京甄视智能科技有限公司 Method and system for recognizing occluded parts of a face
CN113158939A * 2021-04-29 2021-07-23 南京甄视智能科技有限公司 Method and system for recognizing occluded parts of a face
CN113535899B * 2021-07-07 2024-02-27 西安康奈网络科技有限公司 Automatic judgment method for the sentiment orientation of Internet information
CN113535899A * 2021-07-07 2021-10-22 西安康奈网络科技有限公司 Automatic judgment method for the sentiment orientation of Internet information
CN113674177B * 2021-08-25 2024-03-26 咪咕视讯科技有限公司 Automatic lip makeup method for portraits, and device, equipment, and storage medium
CN113674177A * 2021-08-25 2021-11-19 咪咕视讯科技有限公司 Automatic lip makeup method for portraits, and device, equipment, and storage medium

Also Published As

Publication number Publication date
CN106909870A (zh) 2017-06-30
WO2017107957A9 (zh) 2017-09-08

Similar Documents

Publication Publication Date Title
WO2017107957A1 (zh) Method and apparatus for retrieving a face image
Kumar et al. Attribute and simile classifiers for face verification
CN107169455B Face attribute recognition method based on deep local features
González-Briones et al. A multi-agent system for the classification of gender and age from images
Kumar et al. Real time face recognition using adaboost improved fast PCA algorithm
KR101760258B1 (ko) 얼굴 인식 장치 및 그 방법
US8064653B2 (en) Method and system of person identification by facial image
CN105205480B Human eye localization method and system in complex scenes
Asteriadis et al. Facial feature detection using distance vector fields
Tome et al. Identification using face regions: Application and assessment in forensic scenarios
CN106778450B Face recognition method and device
CN108629336B Facial attractiveness computation method based on facial landmark recognition
Cevikalp et al. Face and landmark detection by using cascade of classifiers
CN111126240B Three-channel feature-fusion face recognition method
CN103632147A System and method for standardized semantic description of facial features
WO2021196721A1 Method and device for adjusting a cabin environment
Singh et al. Face detection and eyes extraction using sobel edge detection and morphological operations
Paul et al. Extraction of facial feature points using cumulative histogram
WO2022213396A1 Cat facial individual identification device and method, computer equipment, and storage medium
Hebbale et al. Real time COVID-19 facemask detection using deep learning
CN104008364A Face recognition method
Shanmugavadivu et al. Rapid face detection and annotation with loosely face geometry
Mayer et al. Adjusted pixel features for robust facial component classification
Barbu An automatic face detection system for RGB images
JP2013008093A Image recognition device, image recognition method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16877767

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16877767

Country of ref document: EP

Kind code of ref document: A1