WO2015149534A1 - Face recognition method and apparatus based on Gabor binary pattern - Google Patents

Face recognition method and apparatus based on Gabor binary pattern

Info

Publication number
WO2015149534A1
WO2015149534A1 (PCT application PCT/CN2014/093450)
Authority
WO
WIPO (PCT)
Prior art keywords
image
scale
lgbp
binary
pixel
Prior art date
Application number
PCT/CN2014/093450
Other languages
English (en)
French (fr)
Inventor
贲圣兰
王慕妮
姜耀国
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2015149534A1 publication Critical patent/WO2015149534A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/467Encoded features or binary features, e.g. local binary patterns [LBP]

Definitions

  • Embodiments of the present invention relate to image processing and pattern recognition technologies, and in particular, to a face recognition method and apparatus based on Gabor binary mode.
  • Because face recognition technology is intuitive and non-replicable, it is widely used in security inspection systems, access control systems, time and attendance systems, intelligent robot systems, and virtual game systems.
  • the basic concept is to detect the face region from an image or video containing a face; select and extract feature descriptors with strong face discrimination; then design a classifier according to the selected features to realize face recognition.
  • FIG. 1 is a diagram showing an example of summing pixel values of three pixel points in a Gabor filter response image in eight directions corresponding to a certain scale in the prior art.
  • the recognition of the face using the above recognition technology may result in the loss of image texture hopping characteristics, and ultimately the recognition ability is low.
  • Embodiments of the present invention provide a face recognition method and apparatus based on a Gabor binary mode to improve the ability to discriminate a face.
  • an embodiment of the present invention provides a face recognition device based on a Gabor binary mode, including:
  • a threshold determining module, configured to apply the Fisher criterion to all first filtered response images of the same scale in the same direction in the training image set, acquire a discrimination factor at that scale of that direction, and determine the pixel point threshold at each scale of each direction according to the discrimination factors at each scale of each direction;
  • a filtering processing module, configured to perform Gabor filtering of each scale in each direction on the image to be processed according to at least two preset directions and at least two preset scales, to obtain a second filtered response image at each scale of each direction;
  • a determining module, configured to determine, for the second filtered response image at each scale of each direction obtained by the filtering processing module, the local Gabor binary pattern (LGBP) binary map corresponding to each second filtered response image according to the pixel point threshold of the same direction and scale as that second filtered response image;
  • An acquiring module configured to acquire a feature vector of each of the LGBP binary graphs obtained by the determining module, and acquire a feature vector of the image to be processed according to a feature vector of each of the LGBP binary graphs;
  • An identification module configured to obtain, according to the feature vector of the image to be processed acquired by the acquiring module and the feature vector of any training image in the training image set, the similarity between the image to be processed and the training image in the training image set And according to the similarity threshold, the recognition result is obtained.
  • each first filtered response image in the training image set is taken as a sample having the same direction and scale as that first filtered response image, and the first filtered response images of the same scale in the same direction belonging to training images of the same target in the training image set are taken as the intra-class samples at that scale of that direction; the threshold determining module acquires the discrimination factor at that scale of that direction specifically as follows:
  • An identification factor at the scale of the direction is calculated based on the intra-class discrete matrix at the scale of the direction and the inter-class discrete matrix at the scale of the direction.
  • the threshold determining module determines, according to the discriminating factor at each scale of each direction, a pixel point threshold of each scale of each direction:
  • the pixel point threshold t at the scale of the direction in which the first filtered response image is located is calculated according to formula (2), where t is the pixel point threshold at that scale of that direction, W is the discrimination factor at that scale of that direction, and the remaining constant in the formula is a real number greater than 1.
  • In a second aspect, an embodiment of the present invention provides a face recognition method based on the Gabor binary pattern, including:
  • the pixel point threshold is obtained by applying the Fisher criterion to all first filtered response images of the same scale in the same direction in the training image set to acquire the discrimination factor at that scale of that direction, and is determined according to the discrimination factor at that scale of that direction;
  • each first filtered response image in the training image set is taken as a sample having the same direction and scale as that first filtered response image, and the first filtered response images of the same scale in the same direction belonging to training images of the same target in the training image set are taken as the intra-class samples at that scale of that direction; acquiring the discrimination factor at that scale of that direction is specifically:
  • An identification factor at the scale of the direction is calculated based on the intra-class discrete matrix at the scale of the direction and the inter-class discrete matrix at the scale of the direction.
  • determining the pixel point threshold at each scale of each direction according to the discrimination factor is specifically: calculating the pixel point threshold t at the scale of the direction in which the first filtered response image is located according to formula (2), where t is the pixel point threshold at that scale of that direction, W is the discrimination factor at that scale of that direction, and the remaining constant in the formula is a real number greater than 1.
  • In the embodiments of the present invention, the pixel point threshold at one scale of one direction is obtained from the discrimination factor of the first filtered response images at that same scale in that same direction, and the pixel point thresholds obtained in the training phase are then used to extract the LGBP texture features of the image to be recognized. This improves the robustness of the LGBP texture features, thereby ensuring that the extracted LGBP texture features have strong discrimination ability and improving the face recognition capability.
  • FIG. 1 is a diagram showing an example of summing pixel values of three pixel points in a Gabor filtered response image in eight directions corresponding to a certain scale in the prior art
  • FIG. 2 is a schematic structural diagram of Embodiment 1 of a face recognition apparatus based on the Gabor binary pattern according to the present invention;
  • FIG. 3 is a schematic structural diagram of Embodiment 2 of a face recognition apparatus based on the Gabor binary pattern according to the present invention;
  • FIG. 4 is a diagram showing example images before and after Gabor filtering in Embodiment 2 of the face recognition apparatus based on the Gabor binary pattern;
  • FIG. 5 is a diagram showing an example before and after fusion in Embodiment 2 of the face recognition apparatus based on the Gabor binary pattern;
  • FIG. 6 is a diagram showing an example of calculating an LGBP binary value in Embodiment 2 of a face recognition method based on the Gabor binary pattern according to the present invention;
  • FIG. 7 is a schematic structural diagram of Embodiment 3 of a face recognition apparatus based on the Gabor binary pattern according to the present invention;
  • FIG. 8 is a flowchart of Embodiment 1 of a face recognition method based on the Gabor binary pattern according to the present invention.
  • FIG. 2 is a schematic structural diagram of Embodiment 1 of a face recognition apparatus based on Gabor binary mode according to the present invention.
  • the embodiment of the invention provides a face recognition apparatus based on the Gabor binary pattern, which can be integrated in a communication device, where the communication device can be any terminal device such as a mobile phone, a personal computer (PC), a laptop, or a server.
  • the apparatus 10 of this embodiment includes: a threshold determination module 11, a filter processing module 12, a determination module 13, an acquisition module 14, and an identification module 15.
  • the threshold determining module 11 is configured to apply the Fisher criterion to all first filtered response images of the same scale in the same direction in the training image set, acquire the discrimination factor at that scale of that direction, and determine the pixel point threshold at each scale of each direction according to the discrimination factors at each scale of each direction;
  • the filtering processing module 12 is configured to perform Gabor filtering of each scale in each direction on the image to be processed according to the preset at least two directions and at least two scales, to obtain a second filtered response image at each scale of each direction;
  • the determining module 13 is configured to determine, for the second filtered response image at each scale of each direction obtained by the filtering processing module 12, the LGBP binary map corresponding to that second filtered response image according to the pixel point threshold (provided by the threshold determining module 11) of the same direction and scale as that second filtered response image.
  • here, the LGBP binary map corresponding to a second filtered response image is the LGBP binary map having the same direction and scale as that second filtered response image.
  • the obtaining module 14 is configured to acquire the feature vector of each LGBP binary map obtained by the determining module 13 and obtain the feature vector of the image to be processed according to the feature vectors of the LGBP binary maps; the identification module 15 is configured to obtain the similarity between the image to be processed and a training image in the training image set according to the feature vector of the image to be processed acquired by the obtaining module 14 and the feature vector of that training image, and obtain the recognition result according to the similarity threshold.
  • the similarity threshold of each scale in each direction and the pixel threshold of each scale in each direction are obtained according to the training image set in the training phase.
  • the similarity threshold is obtained from the feature vector of the training image in the training image set.
  • the filtering processing module 12 is specifically configured to convolve the image to be processed with the kernel function of the Gabor filter to obtain the second filtered response image at each scale in each direction, where the scale and direction values in the kernel function are set according to actual needs.
  • convolving the image to be processed with the kernel function specifically means convolving the image to be processed with the Gabor filter kernel function of every scale value and direction value set above, thereby obtaining the second filtered response image at each scale of each direction. For example, when there are 5 scales and 8 directions, the 5 scale values (0, 1, 2, 3, 4) and 8 direction values (0, 1, 2, 3, 4, 5, 6, 7) are traversed to obtain 40 kernel functions, and each kernel function is used to convolve the image to be processed, yielding 40 second filtered response images over 5 scales and 8 directions.
  • the kernel function of the Gabor filter can be:
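The kernel expression itself appears only as an image in the original filing and is not reproduced above. The following is a minimal sketch, assuming the commonly used complex Gabor kernel parameterized by 5 scales and 8 orientations as described in the text; the constants kmax, f and sigma are illustrative assumptions rather than values taken from the patent.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(scale, direction, ksize=31, kmax=np.pi / 2, f=np.sqrt(2), sigma=2 * np.pi):
    """Complex Gabor kernel for one (scale, direction) pair.

    Uses the commonly cited form k = kmax / f**scale, phi = pi * direction / 8;
    the exact kernel in the patent is not reproduced here, so treat this as an assumption.
    """
    k = kmax / (f ** scale)
    phi = np.pi * direction / 8.0
    kx, ky = k * np.cos(phi), k * np.sin(phi)
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k2 = kx ** 2 + ky ** 2
    envelope = (k2 / sigma ** 2) * np.exp(-k2 * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma ** 2 / 2.0)
    return envelope * carrier

def gabor_responses(image, n_scales=5, n_directions=8):
    """Convolve the image with every kernel; returns a dict keyed by (direction, scale)."""
    responses = {}
    for v in range(n_scales):
        for u in range(n_directions):
            kern = gabor_kernel(v, u)
            resp = fftconvolve(image.astype(float), kern, mode="same")
            responses[(u, v)] = np.abs(resp)  # the magnitude response is typically used
    return responses
```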
  • the determining module 13 may further include: a threshold acquiring unit 131, configured to acquire the pixel point threshold at the scale of the direction in which the second filtered response image is located; and a determining unit 132, configured to obtain, for each second filtered response image, the LGBP binary sequence corresponding to each pixel of that second filtered response image according to the pixel point threshold at the scale of the direction in which that second filtered response image is located, and obtain the LGBP binary map corresponding to that second filtered response image according to the LGBP binary sequences of its pixels.
  • the binary image uses a binary value to represent the pixel value in the second filtered response image.
  • the binary mode LGBP binary map corresponding to the second filtered response image is specifically: the second filtered response image has the same scale LGBP binary graph in the same direction.
  • the determining unit 132 obtains the LGBP binary sequence corresponding to each pixel point in the second filtered response image specifically as follows: when any pixel in the second filtered response image is taken as a central pixel point c, the LGBP binary value S(u_b, i_c, t) of any surrounding pixel point b in the neighborhood of c is obtained according to formula (1), where u_b is the pixel value of the surrounding pixel point b, i_c is the pixel value of the central pixel point c, and t is the pixel point threshold at the direction and scale in which the second filtered response image is located; the binary sequence corresponding to the pixel point is the binary sequence consisting of the binary values of its surrounding pixels.
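Formula (1) referenced above is likewise not reproduced in this text. The sketch below assumes one plausible reading of the thresholded comparison, S(u_b, i_c, t) = 1 when u_b - i_c >= t and 0 otherwise, and packs the 8 neighbor bits of radius 1 into a per-pixel code; edge pixels are handled by edge padding here instead of the bilinear interpolation mentioned later in the text, purely to keep the sketch short.

```python
import numpy as np

def lgbp_binary_map(response, t):
    """Per-pixel 8-bit LGBP code for one filtered response image.

    Assumption: S(u_b, i_c, t) = 1 when u_b - i_c >= t, else 0. Formula (1) in the
    patent is not reproduced in this text, so this comparison rule is illustrative.
    """
    padded = np.pad(response.astype(float), 1, mode="edge")
    h, w = response.shape
    # 8 neighbors of radius 1, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        s = (neighbor - response >= t).astype(np.uint8)  # binary value of this neighbor
        codes |= s << bit                                 # pack the 8-bit binary sequence
    return codes
```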
  • the LGBP is a local binary pattern (LBP) that is further extracted on the filtered response image obtained by Gabor filtering, and reflects the micro texture structure of the image to be processed after Gabor filtering.
  • LBP describes the texture with a pattern of local regions, and each pixel is marked by a code value formed by the original texture of the local neighborhood that best matches it.
  • the face recognition device needs to train the existing image collection, which is called a training phase.
  • the face recognition device obtains the pixel threshold of each of the above directions and each scale according to the training image set.
  • the determining module 13 calculates, for the second filtered response image at each scale in each direction, the LGBP binary value corresponding to each pixel in that second filtered response image according to the pixel point threshold of the same direction and scale as that second filtered response image.
  • compared with determining the LGBP binary value directly from the gray value of the central pixel, the present invention calculates the pixel point threshold of each scale of each direction from the first filtered response images of the training image set and extracts the LGBP binary values with this threshold, so that LGBP texture features with stronger discrimination ability can be extracted. Setting the pixel point threshold therefore improves the robustness of the extracted LGBP texture features, giving the face recognition apparatus a high discrimination capability.
  • in the acquisition by the obtaining module 14 of the feature vector of each LGBP binary map obtained by the determining module 13, the feature vector of any one LGBP binary map is acquired as follows: the LGBP binary map sent by the determining module 13 is divided into area blocks of a preset size; the neighborhood binary sequence of each pixel in each area block is converted into a decimal value, which is used as the LGBP coded value of that pixel, where the neighborhood binary sequence of a pixel consists of the binary values of its surrounding pixels; the largest LGBP coded value over all area blocks is used as the total dimension of the vector corresponding to each area block, and the number of LGBP coded values equal to n-1 in an area block is used as the value of the n-th dimension of that area block's vector; the values of all dimensions of the vector corresponding to an area block constitute the LGBP histogram of that area block, where n is an arbitrary integer between 1 and the maximum LGBP coded value.
  • for example, the first dimension of each vector indicates the number of LGBP coded values equal to 0 in the corresponding area block, the second dimension the number equal to 1, the third dimension the number equal to 2, and so on.
  • the obtaining module 14 acquires the feature vector of the image to be processed according to the feature vectors of the LGBP binary maps, specifically by concatenating the feature vectors of the LGBP binary maps to obtain the feature vector of the image to be processed.
  • the shape of the area block may be any shape such as a rectangular block or a square block; the words "first" and "second" in the first filtered response image and the second filtered response image are used only to distinguish the Gabor-filtered response images of the training images from those of the image to be recognized.
  • in the embodiments of the present invention, "the scale of a direction" denotes a certain scale in a certain direction, such as the first scale of the first direction (for example, the top-left filtered response image among the 40 filtered response images in FIG. 4), which can equally be called the first direction of the first scale. For the filtered response images, LGBP binary maps, discrimination factors, pixel point thresholds, and so on, the expression "a certain scale of a certain direction" therefore has the same meaning as "a certain direction of a certain scale"; what matters are the specific values of direction and scale. For instance, the second scale of the first direction is the same as the first direction of the second scale, and the fifth scale of the second direction is the same as the second direction of the fifth scale; they differ only when the values of the direction or scale differ.
  • the identification module 15 may be specifically configured to: obtain the training in the image to be processed and the training image set according to the feature vector of the image to be processed acquired by the acquiring module 14 and the feature vector of any training image in the training image set by using the histogram intersection method. The similarity of the image; and based on the similarity threshold, the recognition result is obtained. Specifically, determining that the similarity is greater than or equal to the similarity threshold, and determining that the image to be processed is the same target image as the training image for acquiring the similarity; or determining that the similarity is less than the similarity threshold, and determining the image to be processed An image that is a different target from the training image used to acquire the similarity.
  • the embodiment of the present invention extracts the LGBP texture features of the image to be recognized using the pixel point thresholds obtained in the training phase (pixel point thresholds calculated from the discrimination factors of the first filtered response images of each direction and scale in the training image set), which improves the robustness of the LGBP texture features, thereby ensuring that the extracted LGBP texture features have strong discrimination ability.
  • the image to be identified is authenticated one by one with each training image to determine whether the image to be identified is included in the training image set.
  • the image to be processed may include an image to be identified and an image in the training image set.
  • the number of training images in the training image set is arbitrary and is not limited here; the size of the training image set is set according to the actual scene. For the purpose of distinguishing the training images used for training, each training image is marked with an independent identifier (ID).
  • FIG. 3 is a schematic structural diagram of Embodiment 2 of a face recognition apparatus based on the Gabor binary pattern according to the present invention. As shown in FIG. 3, this embodiment may further include a pre-processing module 21 on the basis of the foregoing embodiment.
  • in the training phase, the images in the above training image set are processed as follows:
  • the filtering processing module 12 is configured to perform Gabor filtering on each training image.
  • for the Gabor filtering, a Gabor kernel function with a scale factor of 5 (that is, the scale can be 0, 1, 2, 3 or 4) and a direction factor of 8 (that is, the direction can be 0, 1, 2, 3, 4, 5, 6 or 7) is determined; each training image in the training image set is convolved with the Gabor kernel functions to obtain the first filtered response image at each of the 8 directions and each of the 5 scales. That is, each training image in the training image set is replaced by 40 (5*8) first filtered response images, where 40 means 8 directions at each of the 5 scales, as shown in FIG. 4.
  • the threshold determining module 11 uses Fisher's criterion for all first filtered response images in the same direction and the same scale in the training image set to acquire the discriminating factors in the direction and scale, and the discriminating factors according to the scales in all directions. Determine the pixel threshold at each scale for each direction.
  • each of the first filtered response images is taken as a sample having the same direction and scale as the first filtered response image.
  • the first filtered response images of the same direction and scale belonging to the training images of the same target in the training image set are taken as the intra-class samples of that direction and scale. Taking 5 scales and 8 directions as an example, if a target has 8 training images, the intra-class samples of the first scale of the first direction for that target are the first filtered response images of the first scale in the first direction of those 8 training images.
  • the intra-class discrete matrix in the direction and scale and the inter-class discrete matrix in the direction and scale are used to calculate the discriminating factor at the scale of the direction.
  • the intra-class dispersion matrix S w and the inter-class dispersion matrix S b between all the first filtered response images are calculated as follows:
  • S_w and S_b are obtained according to formula (3) and formula (4), the ratio of the inter-class dispersion to the intra-class dispersion is calculated, and that ratio is used as the discrimination factor at that direction and scale.
  • the discrimination factor in the direction and the scale is: an identification factor corresponding to the first filtered response image in the direction and the scale.
  • the threshold determining module 11 determines the pixel point threshold of each scale of each direction according to the discrimination factors at each scale of each direction, that is, it determines the pixel point threshold having the same direction and scale as each discrimination factor. Taking 5 scales and 8 directions as an example, each training image has 40 first filtered response images, so the threshold determining module can acquire 40 discrimination factors and 40 pixel point thresholds, where the discrimination factors and the pixel point thresholds correspond one to one.
  • the discrimination factor of the first filtered response images at a certain direction and scale may be inversely proportional to the pixel point threshold used for extracting the LGBP binary values of the filtered response images at that same direction and scale.
  • the threshold determining module 11 determines the pixel point threshold at each scale of each direction according to the discrimination factor at each scale of each direction, specifically by calculating the pixel point threshold t at the scale of the direction in which the first filtered response image is located according to formula (2), where t is the pixel point threshold at that scale of that direction, W is the discrimination factor at that scale of that direction, and the remaining constant in the formula is a real number greater than 1.
  • the determination of the pixel point threshold in the embodiment of the present invention is based on using the ratio of the inter-class dispersion to the intra-class dispersion as the discrimination factor at a certain direction and scale; the larger the discrimination factor, the larger the inter-class dispersion and the smaller the intra-class dispersion of the samples at that direction and scale, and the stronger their discrimination ability; otherwise the discrimination ability is poor. The accuracy of recognition can thereby be improved.
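Formulas (2), (3) and (4) are not reproduced in this text, so the sketch below only mirrors the described behavior: a Fisher-style ratio of inter-class to intra-class dispersion per direction and scale (summarized as scalars rather than the full scatter matrices), and a threshold that decreases as the discrimination factor grows, using the illustrative mapping t = lam / W with lam a real number greater than 1.

```python
import numpy as np

def discrimination_factor(samples_by_target):
    """Fisher-style discrimination factor W for one (direction, scale).

    `samples_by_target` maps a target id to the list of its filtered response images
    at this direction and scale. Scatter is summarized as a scalar (trace-like) value
    rather than the scatter matrices of formulas (3)-(4), which are not reproduced here.
    """
    class_means = {tid: np.mean(np.stack(imgs), axis=0)
                   for tid, imgs in samples_by_target.items()}
    all_imgs = [img for imgs in samples_by_target.values() for img in imgs]
    overall_mean = np.mean(np.stack(all_imgs), axis=0)

    s_w = sum(np.sum((img - class_means[tid]) ** 2)
              for tid, imgs in samples_by_target.items() for img in imgs)
    s_b = sum(len(imgs) * np.sum((class_means[tid] - overall_mean) ** 2)
              for tid, imgs in samples_by_target.items())
    return s_b / (s_w + 1e-12)

def pixel_threshold(w, lam=2.0):
    """Map the discrimination factor to a pixel threshold.

    The patent's formula (2) is not reproduced here; t = lam / W (lam > 1) is used
    only to reflect the stated inverse relationship between the factor and the threshold.
    """
    return lam / (w + 1e-12)
```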
  • the determining module 13 determines the LGBP binary map corresponding to each first filtered response image according to the preset pixel point threshold of the same direction and scale as that first filtered response image, specifically as follows: for each first filtered response image, each pixel is taken as the central pixel of a neighborhood, the neighborhood radius is set to 1 and the number of surrounding pixels to 8, and the LGBP binary sequence of the central pixel is extracted. With each pixel in the first filtered response image taken as a central pixel, the extracted LGBP binary sequence corresponding to that central pixel is used as the value of that pixel in the LGBP binary map. For pixels located at the edge of the image, bilinear interpolation is used to complement the neighborhood. The neighborhood radius and the number of surrounding pixels here are only examples and are not limiting.
  • the first filtered response image is augmented by interpolation.
  • the expanded pixel points are indicated by dashed boxes in FIG.
  • the pixel point X11 in the upper left corner of the first filtered response image is used as the starting pixel point, and each pixel point is traversed laterally (ie, X11-->...-->X17-->X21-->...-->X27 -->...-->X57).
  • each pixel point LGBP binary sequence is calculated separately.
  • in FIG. 6, the calculation process of the LGBP binary value is described by taking the pixel X11 as an example.
  • t is a pixel point threshold corresponding to the first filtered response image.
  • the conventional LGBP binary value extraction method, which uses the gray value of the central pixel as the pixel point threshold, is sensitive to noise: if the pixel value of the central pixel of a neighborhood lies between the pixel values of its neighboring pixels, then comparing each neighboring pixel with the central pixel value one by one does encode the jump information, but the resulting LGBP binary value also exaggerates minor texture differences within the neighborhood, which is why the method is said to be sensitive to noise.
  • the embodiment of the present invention calculates the pixel point threshold from the discrimination factor, as shown in formula (2), and uses it to extract the LGBP binary values of the training images, so as to ensure the rationality of the pixel point threshold.
  • the LGBP binary value extraction method provided by the invention suppresses texture responses whose gray values are very close, improves the robustness of the texture feature extraction, and improves the discrimination ability of the extracted LGBP binary values.
  • the determining module 13 is further configured to merge, for each scale, the LGBP binary maps of each of the at least two directions at that same scale, obtaining the fused LGBP binary map of each scale.
  • in this embodiment, the determining module 13 first acquires the LGBP binary map corresponding to each second filtered response image, that is, the LGBP binary map of each of the 8 directions at each of the 5 scales, 40 LGBP binary maps in total.
  • the determining module then fuses the LGBP binary maps of the 8 directions at the same scale, obtaining 5 fused LGBP binary maps, that is, each scale corresponds to one merged LGBP binary map.
  • the determining module 13 merges, for each scale, the LGBP binary maps of each of the at least two directions at that same scale to obtain the fused LGBP binary map of each scale, specifically: the determining module 13 merges the LGBP binary maps of all directions at the same scale in a bitwise-OR manner to obtain the fused LGBP binary map of each scale.
  • the determining module 13 fuses the LGBP binary maps of all directions at the same scale; the fusion method is shown in FIG. 5.
  • FIG. 5 only enumerates three corresponding bits of the binary sequence of one corresponding pixel in the LGBP binary map of each direction at the same scale; the other binary values are handled analogously.
  • an OR operation is first performed on each pair of corresponding binary values in the binary sequences of the corresponding pixel in the LGBP binary maps of the first and second directions; the result of that OR operation is then ORed with the corresponding binary value in the binary sequence of the corresponding pixel in the LGBP binary map of the third direction; and so on, until the running OR result has been ORed with the corresponding binary value in the binary sequence of the corresponding pixel in the LGBP binary map of the eighth direction, which gives the merged LGBP binary map.
  • after fusion, each training image in the training image set is represented by a number of fused LGBP binary maps equal to the number of scales (for example, 5); that is, each scale has one merged LGBP binary map.
  • because the corresponding bits of the binary sequences of each pixel in the LGBP binary maps of the different directions at the same scale are fused in a bitwise-OR manner, if jump information appears in a certain first filtered response image, that is, if a binary value is "1", then the result of ORing it with the binary values at the same position in the other first filtered response images at that scale must also be "1"; in other words, the jump information at that position is retained.
  • the extracted LGBP binary values have thus been filtered for discriminative information, yielding information with stronger discrimination ability.
  • the fusion processing can effectively reduce the calculation amount of face recognition under the premise of ensuring the identification ability of the feature data.
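A minimal sketch of the bitwise-OR fusion across directions at one scale, assuming each LGBP binary map is stored as an 8-bit code image as in the earlier sketch:

```python
import numpy as np

def fuse_directions(lgbp_maps_one_scale):
    """Bitwise-OR fusion of the LGBP binary maps of all directions at one scale.

    `lgbp_maps_one_scale` is a list of 2-D uint8 arrays (one per direction, e.g. 8 of
    them); any bit that is 1 in at least one direction stays 1 in the fused map, so the
    jump information of every direction is retained.
    """
    fused = np.zeros_like(lgbp_maps_one_scale[0])
    for m in lgbp_maps_one_scale:
        fused |= m
    return fused

# With 5 scales and 8 directions this turns 40 LGBP binary maps into 5 fused maps,
# e.g. (codes is a hypothetical dict keyed by (direction, scale)):
# fused_maps = [fuse_directions([codes[(u, v)] for u in range(8)]) for v in range(5)]
```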
  • the obtaining, by the obtaining module 14 , the feature vector of each of the LGBP binary graphs obtained by the determining module 13 is: obtaining the feature vector of the LGBP binary graph after each scale fusion obtained by the determining module 13 , and transmitting the feature vector to the threshold determination Module 11.
  • in this embodiment, the obtaining module 14 divides the fused LGBP binary map of each scale into area blocks of a preset size; the size of the area block may be preset according to the actual size of the LGBP binary map, for example 4*8. The binary sequence corresponding to each pixel in each area block is converted into a decimal value, which is used as the LGBP coded value of that pixel. The maximum LGBP coded value over all pixels in all area blocks is used as the total dimension of the vector corresponding to each area block, and the number of LGBP coded values equal to n-1 in an area block is used as the value of the n-th dimension of that area block's vector; the values of all dimensions of the vector corresponding to an area block constitute the LGBP histogram of that area block, where n is any integer between 1 and the maximum LGBP coded value.
  • for example, if the maximum coded value over all area blocks is 59, and in one area block there are 4 pixels with coded value 3, 10 pixels with coded value 6 and 9 pixels with coded value 59, with the remaining coded values being 0, then the LGBP histogram of that block is a vector with a total length of 59 dimensions, in which those counts occupy the corresponding dimensions.
  • the LGBP histograms of the area blocks are concatenated to form the LGBP histogram of the LGBP binary map.
  • for example, if the LGBP binary map of a certain scale is divided into two area blocks, the LGBP histograms of the two area blocks are concatenated in series to form the LGBP histogram of that LGBP binary map.
  • the above example only explains the manner of concatenation; the specific vector dimensions and the values of the elements in the vectors are not limited to the above example.
  • the feature vectors of the LGBP binary maps are concatenated to obtain the feature vector of the image to be processed; the concatenation here can also refer to the above example for ease of understanding.
  • in the training phase, the obtaining module 14 concatenates the LGBP histograms corresponding to the fused LGBP binary maps of each scale in series to form the LGBP histogram corresponding to the training image; the LGBP histogram corresponding to the training image is the feature vector of that training image.
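A sketch of the block-wise histogram and concatenation described above; the 4*8 block size follows the example in the text, while the exact indexing convention for the histogram dimensions (the off-by-one between the n-th dimension and code n-1) is ambiguous in this extraction, so a plain bin count over the code values is used.

```python
import numpy as np

def block_histograms(fused_map, block_shape=(4, 8), n_bins=None):
    """Block-wise LGBP histogram of one fused LGBP binary map.

    block_shape follows the 4*8 example in the text; n_bins defaults to the maximum
    code value plus one, as a reasonable reading of the described convention.
    """
    if n_bins is None:
        n_bins = int(fused_map.max()) + 1
    bh, bw = block_shape
    h, w = fused_map.shape
    hists = []
    for y in range(0, h - bh + 1, bh):
        for x in range(0, w - bw + 1, bw):
            block = fused_map[y:y + bh, x:x + bw]
            hists.append(np.bincount(block.ravel(), minlength=n_bins)[:n_bins])
    return np.concatenate(hists)

def image_feature_vector(fused_maps, n_bins=256):
    """Concatenate the block histograms of the fused LGBP binary map of every scale."""
    return np.concatenate([block_histograms(m, n_bins=n_bins) for m in fused_maps])
```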
  • the threshold determining module 11 receives the feature vectors of the respective training images transmitted by the obtaining module 14. Further, the threshold determining module 11 may be specifically configured to: arbitrarily combine the training images in the training image set according to a cross-validation criterion, dividing the training images in the training image set into images to be trained and test images; calculate the similarity between the feature vectors of the images to be trained and of the test images using the histogram intersection method; take each similarity in turn as a candidate threshold and compute the accuracy rate and false positive rate of each group; and determine the similarity threshold according to the accuracy rate and false positive rate of each group. Specifically, the accuracy rate and the false positive rate of each group are traversed; when the absolute value of (accuracy rate + false positive rate - 1) of a group is smallest, the corresponding similarity is taken as the optimal similarity threshold of that group, and the average of the optimal similarity thresholds of all groups is taken as the similarity threshold of the training image set.
  • the training of the similarity threshold is exemplified as follows: according to the feature vectors of the training images obtained above, the training image set is divided into ten parts according to the cross-validation criterion; any one part is selected as the test image set and the rest as the set of images to be trained; this is repeated ten times, giving ten groups of images to be trained and test images, so that each of the ten parts is tested once.
  • each group contains images to be trained and test images; the histogram intersection method is used to calculate the similarity between the feature vector of each image to be trained and each test image in each group; and for each group, each similarity between an image to be trained and a test image is taken in turn as the candidate similarity threshold.
  • for each candidate threshold, the true positive rate (TPR), the proportion of positive instances identified as positive, and the false positive rate (FPR), the proportion of negative instances identified as positive, are computed; the minimum absolute value of FPR - (1 - TPR) is used as the criterion to calculate the optimal similarity threshold of each group, and finally the mean of the ten optimal thresholds is used as the similarity threshold of the identification phase, i.e. the similarity threshold used by the identification module 15.
  • here, TP is the number of times images of the same person are correctly recognized as the same person, FN is the number of times images of the same person are recognized as different persons, FP is the number of times images of different persons are recognized as the same person, and TN is the number of times images of different persons are recognized as different persons.
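A sketch of the histogram intersection similarity and of the per-group threshold selection by minimizing |FPR - (1 - TPR)|; the normalization of the intersection and the exhaustive scan over candidate thresholds are assumptions, and the ten-fold grouping itself is omitted.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Histogram intersection similarity: sum of element-wise minima, normalized.

    The normalization choice is an assumption; the patent only names the histogram
    intersection method.
    """
    return np.minimum(h1, h2).sum() / max(h2.sum(), 1e-12)

def choose_similarity_threshold(similarities, same_person):
    """Pick the similarity threshold minimizing |FPR - (1 - TPR)| for one group.

    `similarities` are pairwise similarity scores, `same_person` the matching boolean
    labels. TPR = TP / (TP + FN) and FPR = FP / (FP + TN), as defined in the text.
    """
    similarities = np.asarray(similarities, dtype=float)
    same_person = np.asarray(same_person, dtype=bool)
    best_thr, best_gap = None, np.inf
    for thr in np.unique(similarities):
        predicted_same = similarities >= thr
        tp = np.sum(predicted_same & same_person)
        fn = np.sum(~predicted_same & same_person)
        fp = np.sum(predicted_same & ~same_person)
        tn = np.sum(~predicted_same & ~same_person)
        tpr = tp / max(tp + fn, 1)
        fpr = fp / max(fp + tn, 1)
        gap = abs(fpr - (1.0 - tpr))
        if gap < best_gap:
            best_gap, best_thr = gap, thr
    return best_thr
```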
  • in the identification phase, the image to be recognized is processed in order to calculate its feature vector; the feature vector of the image to be recognized is then compared one by one with the feature vector of each training image, the similarity between the two is calculated and compared with the similarity threshold, and it is thereby determined whether the image to be recognized belongs to the same person as a certain training image. Specifically:
  • the identification module 15 obtains, according to the feature vector of the image to be processed and the feature vector of each training image in the training image set, the similarity between the image to be processed and that training image, identifies the image to be processed according to the similarity threshold obtained by the threshold determining module 11, and obtains the recognition result.
  • the filtering processing module 12 is configured to perform Gabor filtering of each scale of each direction on the image to be recognized according to the preset at least two directions and at least two scales, to obtain a second filtered response image at each scale of each direction;
  • the determining module 13 is configured to determine, according to the pixel threshold value in the same direction and scale for the second filtered response image, the second filtered response image for each scale of each direction obtained by the filtering processing module 12 LGBP binary map corresponding to each of the second filtered response images;
  • the obtaining module 14 is configured to obtain a feature vector of each of the LGBP binary graphs obtained by the determining module 13, and acquire a feature vector of the image to be processed according to a feature vector of each of the LGBP binary graphs;
  • the identification module 15 is configured to obtain, according to the feature vector of the image to be processed acquired by the acquiring module 14 and the feature vector of any training image in the training image set, the similarity between the image to be processed and the training image in the training image set, and according to The similarity threshold is obtained, and the recognition result is obtained.
  • the manner in which the determining module 13 determines the LGBP binary maps and the manner in which the obtaining module acquires the feature vector of the image to be processed are no different from the manners used by the determining module and the obtaining module in the training phase, so the embodiments of the present invention do not describe the training phase and the identification phase in further detail here.
  • the identification module 15 is specifically configured to: obtain a to-be-processed image and a training image set according to the feature vector of the image to be processed acquired by the acquiring module and the feature vector of any training image in the training image set by using a histogram intersection method The similarity of the training image; and based on the similarity threshold, Get the recognition result.
  • the identification module 15 obtains the recognition result according to the similarity threshold, specifically: determining that the similarity is greater than or equal to the similarity threshold, and determining that the to-be-processed image is the same target as the training image for acquiring the similarity. And determining that the similarity is less than the similarity threshold, and determining that the image to be processed and the training image for acquiring similarity are images of different targets.
  • the changes in the face image include internal changes and external changes: the internal changes are caused by different human identities and belong to the essential attributes of the human face; and the external changes are caused by different external conditions. , including lighting, gestures, expressions, age, etc., reflecting different image acquisition conditions.
  • the ideal face description feature should only reflect the intrinsic changes of the face, but not the external changes. Therefore, on the basis of the above, further, the face recognition device 20 includes a pre-processing module 21 for performing pre-processing on the image to be processed, and transmitting the pre-processed image to be processed to the filter processing module 12, wherein the pre-processing Includes face area acquisition, face alignment processing, and lighting pre-processing.
  • the pre-processing module 21 is specifically configured to: in the image to be processed, obtain the distance between the two eyes according to the coordinates of the human eyes; and, according to the distance, cut out the area in the image to be processed where the forehead, eyes, nose, mouth and chin are located.
  • the pre-processing module 21 is specifically configured to: calculate an angle between the two-eye line and the horizontal line in the image to be processed; according to the angle, rotate the image to be processed, so that two images in the image to be processed The eye line is in a horizontal position.
  • the pre-processing module 21 is specifically configured to: use a Gaussian filter and/or a gamma correction illumination pre-processing method to make the illumination intensity of the image to be processed uniform.
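A sketch of the three pre-processing steps (alignment by the eye line, face-region cropping from the eye distance, and illumination normalization with a Gaussian filter and gamma correction); the cropping proportions and the gamma/sigma values are illustrative assumptions, since the patent only names the operations.

```python
import numpy as np
from scipy.ndimage import rotate, gaussian_filter

def align_face(image, left_eye, right_eye):
    """Rotate the image so the line through both eyes becomes horizontal.

    Eye coordinates are (x, y) tuples; how they are detected is outside this sketch.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))
    return rotate(image, angle, reshape=False, mode="nearest")

def crop_face(image, left_eye, right_eye, scale=2.0):
    """Crop a region covering forehead, eyes, nose, mouth and chin from the eye distance.

    The proportions (a window about twice the eye distance, centred between the eyes)
    are illustrative assumptions; the patent only states that the region is cut
    according to the eye distance.
    """
    d = np.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])
    cx = int((left_eye[0] + right_eye[0]) / 2)
    cy = int((left_eye[1] + right_eye[1]) / 2)
    half = int(scale * d / 2)
    return image[max(cy - half, 0):cy + int(1.4 * half), max(cx - half, 0):cx + half]

def normalize_illumination(image, gamma=0.5, sigma=1.0):
    """Gamma correction followed by light Gaussian smoothing, as one possible combination."""
    img = image.astype(float) / max(float(image.max()), 1e-12)
    img = img ** gamma
    return gaussian_filter(img, sigma=sigma)
```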
  • the face recognition apparatus of this embodiment determines the pixel point threshold for LGBP binary map extraction using the Fisher criterion, and further fuses the LGBP binary maps by a bitwise-OR operation, which improves the discrimination ability of the LGBP binary maps and reduces the amount of computation on the LGBP binary maps while retaining the jump information of the LGBP binary patterns of each direction and increasing the recognition rate.
  • the pre-processing module, the filtering processing module, the determining module, the obtaining module, and the threshold determining module may be used in a training phase, and the face recognition device acquires the pixel threshold and the similarity threshold offline and/or online.
  • the preprocessing module, the filtering processing module, the determining module, the obtaining module, and the identifying module are usable for obtaining the recognition result online by the face recognition device during the identification phase.
  • the above modules may be integrated into one face recognition apparatus, or may be split between an apparatus used in the training phase and an apparatus used in the identification phase; the invention is not limited in this respect.
  • FIG. 7 is a schematic structural diagram of Embodiment 3 of a face recognition apparatus based on Gabor binary mode according to the present invention.
  • the device can be integrated in a communication device, wherein the communication device can be any terminal device such as a mobile phone, a PC, a notebook computer, or a server.
  • the apparatus 70 of this embodiment includes a processor 71 and a memory 72.
  • the processor 71 is configured to apply the Fisher criterion to all first filtered response images of the same scale in the same direction in the training image set, acquire the discrimination factor at that scale of that direction, and determine the pixel point threshold at each scale of each direction according to the discrimination factors at each scale of each direction;
  • the memory 72 is coupled to the processor 71 and is configured to store the pixel point threshold at each scale of each direction, the correspondence between the pixel point thresholds and the preset directions and scales, and the similarity threshold;
  • the processor 71 may be further configured to perform Gabor filtering of each scale of each direction on the image to be processed according to the preset at least two directions and at least two scales to obtain the second filtered response image at each scale of each direction; to determine, for the second filtered response image at each scale of each direction, the LGBP binary map corresponding to each second filtered response image according to the pixel point threshold of the same direction and scale as that second filtered response image; to acquire the feature vector of each LGBP binary map and obtain the feature vector of the image to be processed from them; and to obtain the similarity between the image to be processed and a training image in the training image set and the recognition result according to the similarity threshold.
  • the processor of the embodiment of the present invention is further configured to execute the foregoing steps of the threshold determining module, the filtering processing module, the determining module, the obtaining module and the identification module, and the embodiments of the present invention do not detail them again here.
  • FIG. 8 is a flowchart of Embodiment 1 of a face recognition method based on Gabor binary mode according to the present invention.
  • An embodiment of the present invention provides a face recognition method based on a Gabor binary mode, which may be performed by the above-mentioned face recognition device, and the device may be integrated in a communication device, where the communication device may be a mobile phone, a PC, or a notebook computer. Or any terminal device such as a server.
  • the face recognition method based on the Gabor binary mode includes:
  • the pixel point threshold is obtained by applying the Fisher criterion to all first filtered response images of the same scale in the same direction in the training image set to acquire the discrimination factor at that scale of that direction, and is determined according to the discrimination factor at that scale of that direction.
  • the method of the embodiment of the present invention may be performed by the apparatus shown in FIG. 2, FIG. 3 or FIG. 7.
  • the implementation principle and the technical effect are similar, and details are not described herein again.
  • each of the first filtered response images in the training image set is a sample having the same direction and scale as the first filtered response image
  • the training image of the same target in the training image set is a first filtered response image at the same scale in the same direction as an intra-class sample of the scale of the direction
  • the acquiring the identification factor of the scale of the direction specifically: calculating a pixel mean value of the intra-class sample in the direction of each target in the training image set, and all the training image set a pixel average of all samples of the target in the direction and the scale; determining the location based on the pixel mean at the scale of the direction and the pixel average at the scale of the direction An intra-class discrete matrix at the scale of the direction and an inter-class discrete matrix at the scale of the direction; the intra-class discrete matrix and the direction according to the direction of the direction An inter-class discrete matrix at the scale, the discriminant factor at the scale of the direction is calculated.
  • determining the pixel point threshold at each scale of each direction according to the discrimination factor at each scale of each direction may specifically be: calculating the pixel point threshold t at the scale of the direction in which the first filtered response image is located according to formula (2), where t is the pixel point threshold at that scale of that direction, W is the discrimination factor at that scale of that direction, and the remaining constant in the formula is a real number greater than 1.
  • the S802 may include: acquiring a pixel threshold corresponding to the second filtered response image, where the pixel threshold corresponding to the second filtered response image is the second filtered response image. a pixel point threshold in the direction and the scale; for each of the second filter response images, obtaining, according to the pixel threshold corresponding to the second filtered response image, corresponding to each pixel point of the second filtered response image An LGBP binary sequence, and obtaining an LGBP binary map corresponding to the second filtered response image according to the LGBP binary sequence corresponding to each pixel of the second filtered response image.
  • obtaining the LGBP binary sequence corresponding to each pixel point in the second filtered response image is specifically: when any pixel in the second filtered response image is taken as a central pixel point c, the LGBP binary value S(u_b, i_c, t) of any surrounding pixel point b in the neighborhood of c is obtained according to formula (1), where u_b is the pixel value of the surrounding pixel point b, i_c is the pixel value of the central pixel point c, and t is the pixel point threshold at the direction and scale in which the second filtered response image is located; the binary sequence corresponding to the pixel point is the binary sequence consisting of the binary values of its surrounding pixels.
  • the process of acquiring the feature vector of any one LGBP binary map includes: dividing the LGBP binary map into area blocks of a preset size; converting the neighborhood binary sequence of each pixel in each area block into a decimal value as the LGBP coded value of that pixel, where the neighborhood binary sequence of a pixel consists of the binary values of its surrounding pixels; taking the maximum LGBP coded value over all area blocks as the total dimension of the vector corresponding to each area block, and taking the number of LGBP coded values equal to n-1 in an area block as the value of the n-th dimension of that area block's vector, the values of all dimensions of the vector corresponding to the area block constituting the LGBP histogram of that area block, where n is an arbitrary integer between 1 and the maximum LGBP coded value; and concatenating the LGBP histograms of the area blocks to obtain the feature vector of the LGBP binary map. Acquiring the feature vector of the image to be processed according to the feature vectors of the LGBP binary maps is specifically: concatenating the feature vectors of the LGBP binary maps to obtain the feature vector of the image to be processed.
  • the method may further include: for each scale, merging the LGBP binary maps of each of the at least two directions at that same scale to obtain the fused LGBP binary map of each scale;
  • accordingly, acquiring the feature vector of each LGBP binary map is specifically: acquiring the feature vector of the fused LGBP binary map of each scale.
  • merging the LGBP binary maps of each of the at least two directions at the same scale to obtain the fused LGBP binary map of each scale is specifically: merging the LGBP binary maps of each of the at least two directions at the same scale in a bitwise-OR manner to obtain the fused LGBP binary map of each scale.
  • the S804 may include: using a histogram intersection method, acquiring a similarity between the image to be processed and the training image in the training image set according to the feature vector of the image to be processed and the feature vector of any training image in the training image set; According to the similarity threshold, the recognition result is obtained.
  • the obtaining the recognition result may be specifically: determining that the similarity is greater than or equal to the similarity threshold, and determining that the image to be processed is the same target as the training image for acquiring the similarity; or determining that the similarity is less than The similarity threshold is determined, and an image in which the image to be processed and the training image for acquiring similarity are different targets are determined.
  • the method may further include: arbitrarily combining the training images in the training image set according to the cross-checking criterion, and dividing the training images in the training image set into the to-be-trained image and the test image. Using the histogram intersection method, calculating the similarity between the image to be trained and the feature vector of the test image; using each similarity as the threshold, statistical accuracy and false positive rate; according to the accuracy and false positive rate in each group , determine the similarity threshold.
  • determining the similarity threshold according to the accuracy rate and the false positive rate of each group may include: traversing the accuracy rate and the false positive rate of each group; when the absolute value of (accuracy rate + false positive rate - 1) of a group is smallest, taking the corresponding similarity as the optimal similarity threshold of that group; and taking the average of the optimal similarity thresholds of the groups as the similarity threshold of the training image set.
  • the face recognition method based on the Gabor binary pattern may further include: pre-processing the image to be processed, where the pre-processing may include face region acquisition, face alignment processing, and illumination pre-processing.
  • the obtaining of the facial region may include: obtaining, in the image to be processed, a distance between the two eyes according to the coordinates of the human eye; and, according to the distance, intercepting an area where the forehead, the eyes, the nose, the mouth, and the chin are located in the image to be processed.
  • the face alignment processing may include: calculating an angle between the two-eye line and the horizontal line in the image to be processed; and rotating the image to be processed according to the angle, so that the two-eye line in the image to be processed is in a horizontal position.
  • the illumination pre-processing may include: using a Gaussian filter and/or a gamma correction illumination pre-processing method to uniformize the illumination intensity of the image to be processed.
  • the LGBP texture feature in the image to be recognized is extracted by using the pixel threshold obtained by the training phase, and the robustness of the LGBP texture feature is improved, thereby ensuring that the extracted LGBP texture feature has strong discriminating ability and improving the face. Identification ability.
  • the aforementioned program may be stored in a computer-readable storage medium.
  • when executed, the program performs the steps of the foregoing method embodiments; and the foregoing storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
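
As an illustration of the bitwise-OR fusion referred to above, the sketch below fuses the LGBP binary maps of all directions at one scale so that a jump bit set in any single direction is preserved. It is only a minimal sketch, assuming the maps are stored as 8-bit integer arrays of identical size; the function and variable names are hypothetical.

```python
# Hypothetical sketch: fuse the LGBP code maps of all directions at one scale
# with a bitwise OR, so that a jump (bit = 1) in any single direction is kept.
import numpy as np

def fuse_directions(code_maps_one_scale):
    """code_maps_one_scale: list of H x W uint8 LGBP code maps, one per direction."""
    return np.bitwise_or.reduce(np.stack(code_maps_one_scale), axis=0)

# e.g. one fused map per scale, given a dict code_maps[(scale, direction)]:
# fused = [fuse_directions([code_maps[(v, u)] for u in range(8)]) for v in range(5)]
```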
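
The histogram-intersection matching and threshold decision of S804 could be sketched as follows; the normalisation of the intersection score is an assumption made for illustration, not a detail taken from the embodiment.

```python
# Hypothetical sketch: histogram-intersection similarity and threshold decision.
import numpy as np

def histogram_intersection(h1, h2):
    """Similarity of two concatenated LGBP histograms (here normalised to [0, 1])."""
    return np.minimum(h1, h2).sum() / max(h1.sum(), 1)

def recognise(probe_vec, train_vec, sim_threshold):
    sim = histogram_intersection(probe_vec, train_vec)
    return sim >= sim_threshold   # True: same target as the training image; False: different
```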
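
The threshold-selection rule described above (per group, keep the candidate threshold that minimises the absolute value of accuracy rate plus false positive rate minus 1, then average over groups) might be sketched like this; the pairing of similarities with same/different-target labels is hypothetical.

```python
# Hypothetical sketch: pick the optimal similarity threshold of one cross-validation group.
import numpy as np

def best_threshold(similarities, same_target):
    """similarities: pair similarities of one group; same_target: matching boolean labels."""
    similarities = np.asarray(similarities, dtype=float)
    same_target = np.asarray(same_target, dtype=bool)
    best_t, best_score = None, np.inf
    for t in similarities:                      # each similarity is tried as the threshold
        pred = similarities >= t
        tpr = (pred & same_target).sum() / max(same_target.sum(), 1)
        fpr = (pred & ~same_target).sum() / max((~same_target).sum(), 1)
        score = abs(tpr + fpr - 1)              # |accuracy rate + false positive rate - 1|
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# The similarity threshold of the training image set is then the mean of the
# per-group optima, e.g. np.mean([best_threshold(s, y) for s, y in groups]).
```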
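
The preprocessing listed above (face region acquisition from the eye distance, alignment by rotating the eye line to the horizontal, and illumination normalisation) could look roughly like the sketch below; the crop proportions, the gamma value and the Gaussian kernel size are illustrative assumptions rather than values taken from the embodiment.

```python
# Hypothetical preprocessing sketch: eye-based alignment, cropping and gamma correction.
import cv2
import numpy as np

def preprocess(gray, left_eye, right_eye, gamma=0.4):
    (lx, ly), (rx, ry) = left_eye, right_eye
    # Face alignment: rotate so that the line between the two eyes becomes horizontal.
    angle = float(np.degrees(np.arctan2(ry - ly, rx - lx)))
    centre = ((lx + rx) / 2.0, (ly + ry) / 2.0)
    rot = cv2.getRotationMatrix2D(centre, angle, 1.0)
    aligned = cv2.warpAffine(gray, rot, (gray.shape[1], gray.shape[0]))
    # Face region acquisition: crop a window proportional to the eye distance
    # (the ratios below are illustrative only).
    d = float(np.hypot(rx - lx, ry - ly))
    x0, y0 = int(centre[0] - d), int(centre[1] - 0.6 * d)
    crop = aligned[max(y0, 0):int(y0 + 2.2 * d), max(x0, 0):int(x0 + 2.0 * d)]
    # Illumination preprocessing: gamma correction plus light Gaussian smoothing.
    norm = (crop.astype(np.float32) / 255.0) ** gamma
    return cv2.GaussianBlur((norm * 255).astype(np.uint8), (3, 3), 0)
```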

Abstract

Embodiments of the present invention provide a face recognition method and apparatus based on the Gabor binary pattern. The apparatus includes: a threshold determining module, which uses the Fisher criterion to obtain the discriminant factors of all first filter response images in a training image set and determines, according to the discriminant factors, the pixel thresholds for each scale of each direction; a filter processing module, which performs Gabor filtering on an image to be processed to obtain second filter response images for each preset direction and scale; a determining module, which determines the LGBP binary map of each second filter response image according to the pixel threshold corresponding to that second filter response image; an obtaining module, which obtains the feature vector of the image to be processed according to the LGBP binary maps; and a recognition module, which obtains the similarity between the image to be processed and a training image according to the above feature vector and the feature vector of any training image in the training image set, and obtains a recognition result according to a similarity threshold. The embodiments of the present invention can improve the ability to discriminate faces.

Description

基于Gabor二值模式的人脸识别方法及装置 技术领域
本发明实施例涉及图像处理与模式识别技术,尤其涉及一种基于Gabor二值模式的人脸识别方法及装置。
背景技术
由于人脸识别技术具有直观性与不可复制性,因此,被广泛应用于安检系统、门禁系统、考勤系统、智能机器人系统以及虚拟游戏系统等。其基本概念是从一幅包含人脸的图像或视频中检测出人脸区域;选择并提取人脸区分性较强的特征描述符;然后根据所选特征设计分类器,实现人脸的识别。
现有技术中,采用基于二维多尺度局部珈波二进制模式(Multi-scale Block Local Gabor Binary Patterns,简称:MB-LGBP)特征的表情识别及其光照检测的方法对人脸进行识别。该方法利用Gabor小波滤波以及局部二值模式(Local Binary Patterns,简称:LBP)相结合的方法进行人脸表情识别,其关键步骤在于对同一尺度,不同方向下的Gabor滤波响应图像中的对应像素点的像素值求和,从而减少Gabor滤波响应图像的数量,在此基础上进一步提取每一尺度的LBP二值模式,作为最终的特征数据,输入到向量分类器中进行表情分类。其中,求和示例如图1所示,图1为现有技术对某一尺度对应的八个方向下的Gabor滤波响应图像中的三个像素点的像素值求和示例图。
但采用上述识别技术进行人脸的识别,可能造成图像纹理跳变特征的丢失,最终导致鉴别能力低下。
发明内容
本发明实施例提供一种基于Gabor二值模式的人脸识别方法及装置,以提高对人脸的鉴别能力。
第一方面,本发明实施例提供一种基于Gabor二值模式的人脸识别装置,包括:
阈值确定模块,用于对训练图像集合中的在同一方向的同一尺度的所有 第一滤波响应图像采用费希尔Fisher准则,获取所述方向的所述尺度下的鉴别因子,并根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各尺度下的像素点阈值;
滤波处理模块,用于根据预设的至少两个方向与至少两个尺度,对待处理图像进行每一个方向的各个尺度的Gabor滤波处理,获得每一个方向的各个尺度下的第二滤波响应图像;
确定模块,用于对于所述滤波处理模块获得的每一个方向的各个尺度下所述第二滤波响应图像,依据针对与所述第二滤波响应图像具有相同方向和尺度下的像素点阈值,确定与每一个所述第二滤波响应图像对应的二进制模式LGBP二进制图;
获取模块,用于获取所述确定模块得到的每一所述LGBP二进制图的特征向量,根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量;
识别模块,用于根据所述获取模块获取的所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取所述待处理图像与训练图像集合中该训练图像的相似度,并根据相似度阈值,得到识别结果。
结合第一方面,在第一方面的第一种可能的实现方式中,将所述训练图像集合中每一个所述第一滤波响应图像作为与该第一滤波响应图像具有相同方向和尺度的样本,将所述训练图像集合中同一目标的训练图像的在同一方向的同一尺度的第一滤波响应图像作为该方向的该尺度的类内样本,所述阈值确定模块获取所述方向的所述尺度下的鉴别因子具体为:
计算所述训练图像集合中每一目标所述方向的所述尺度下类内样本的像素均值,及所述训练图像集合中所有目标在所述方向的所述尺度下的所有样本的像素平均值;
根据所述方向的所述尺度下的所述像素均值和所述方向的所述尺度下的所述像素平均值,确定所述方向的所述尺度下的类内离散矩阵和所述方向的所述尺度下的类间离散矩阵;
根据所述方向的所述尺度下的所述类内离散矩阵和所述方向的所述尺度下的类间离散矩阵,计算所述方向的所述尺度下的鉴别因子。
结合第一方面或第一方面的第一种可能的实现方式,在第一方面的第二 种可能的实现方式中,所述阈值确定模块根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各个尺度的像素点阈值具体为:
根据如下公式,计算所述第一滤波响应图像所在方向的尺度下的像素点阈值t为:
Figure PCTCN2014093450-appb-000001
其中,t为与W同方向的尺度下的第一滤波响应图像的像素点阈值,α为大于1的实数,W为所述第一滤波响应图像所在的方向的尺度下的鉴别因子。
第二方面,本发明实施例提供一种基于珈波Gabor二值模式的人脸识别方法,包括:
根据预设的至少两个方向与至少两个尺度,对待处理图像进行每一个方向的各个尺度的Gabor滤波处理,获得每一个方向的各个尺度下的第二滤波响应图像;
对于每一个方向的各个尺度下所述第二滤波响应图像,依据针对与所述第二滤波响应图像具有相同方向和尺度下的像素点阈值,确定与每一个所述第二滤波响应图像对应的二进制模式LGBP二进制图;所述像素点阈值为:对训练图像集合中的在同一方向的同一尺度的所有第一滤波响应图像采用费希尔Fisher准则,获取所述方向的所述尺度下的鉴别因子,并根据所述方向的尺度下的所述鉴别因子确定的;
获取每一所述LGBP二进制图的特征向量,根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量;
根据所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取所述待处理图像与训练图像集合中该训练图像的相似度,并根据相似度阈值,得到识别结果。
结合第二方面,在第二方面的第一种可能的实现方式中,将所述训练图像集合中每一个所述第一滤波响应图像作为与该第一滤波响应图像具有相同方向和尺度的样本,将所述训练图像集合中同一目标的训练图像的在同一方向的同一尺度下的第一滤波响应图像作为该方向的该尺度的类内样本,所述获取所述方向的所述尺度的鉴别因子,具体为:
计算所述训练图像集合中每一目标所述方向的所述尺度下类内样本的像素均值,及所述训练图像集合中所有目标在所述方向的所述尺度下的所有样 本的像素平均值;
根据所述方向的所述尺度下的所述像素均值和所述方向的所述尺度下的所述像素平均值,确定所述方向的所述尺度下的类内离散矩阵和所述方向的所述尺度下的类间离散矩阵;
根据所述方向的所述尺度下的所述类内离散矩阵和所述方向的所述尺度下的类间离散矩阵,计算所述方向的所述尺度下的的鉴别因子。
结合第二方面或第二方面的第一种可能的实现方式,在第二方面的第二种可能的实现方式中,所述根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各个尺度下的像素点阈值具体为:
根据如下公式,计算所述第一滤波响应图像所在方向的尺度下的像素点阈值t为:
Figure PCTCN2014093450-appb-000002
其中,t为与W同方向的尺度下的第一滤波响应图像的像素点阈值,α为大于1的实数,W为所述第一滤波响应图像所在的方向的尺度下的鉴别因子。
本发明实施例中,对于一个方向的一个尺度下的像素点阈值是通过在同一方向的同一尺度下的第一滤波响应图像的鉴别因子获取的,则通过使用训练阶段获取的一个方向的一个尺度下的像素点阈值提取待识别图像中的LGBP纹理特征,可以提高LGBP纹理特征的健壮性,进而确保所提取的LGBP纹理特征具有较强的鉴别能力,提高对人脸的鉴别能力。
附图说明
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图做一简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1为现有技术对某一尺度对应的八个方向下的Gabor滤波响应图像中的三个像素点的像素值求和示例图;
图2为本发明基于Gabor二值模式的人脸识别装置实施例一的结构示意图;
图3为本发明基于Gabor二值模式的人脸识别装置实施例二的结构示 意图;
图4为本发明基于Gabor二值模式的人脸识别装置实施例二中Gabor滤波前后图像示例图;
图5为本发明基于Gabor二值模式的人脸识别装置实施例二中融合前后示例图;
图6为本发明基于Gabor二值模式的人脸识别方法实施例二中计算LGBP二进制值的示例图;
图7为本发明基于Gabor二值模式的人脸识别装置实施例三的结构示意图;
图8为本发明基于Gabor二值模式的人脸识别方法实施例一的流程图。
具体实施方式
为使本发明实施例的目的、技术方案和优点更加清楚,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。
图2为本发明基于Gabor二值模式的人脸识别装置实施例一的结构示意图。本发明实施例提供一种基于Gabor二值模式的人脸识别装置,该装置可以集成在通信设备中,其中,通信设备可以为手机、个人计算机(Personal Computer,简称:PC)、笔记本电脑或服务器等任意终端设备。如图2所示,本实施例的装置10包括:阈值确定模块11、滤波处理模块12、确定模块13、获取模块14和识别模块15。
其中,阈值确定模块11用于对训练图像集合中的在同一方向的同一尺度的所有第一滤波响应图像采用费希尔(Fisher)准则,获取所述方向的所述尺度下的鉴别因子,并根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各尺度下的像素点阈值;滤波处理模块12用于根据预设的至少两个方向与至少两个尺度,对待处理图像进行每一个方向的各个尺度的Gabor滤波处理,获得每一个方向的各个尺度下的第二滤波响应图像;确定模块13用于 对于滤波处理模块12获得的每一个方向的各个尺度下第二滤波响应图像,依据针对与该第二滤波响应图像具有相同方向和尺度下的像素点阈值(由阈值确定模块11提供),确定与每一个第二滤波响应图像对应的二进制模式(Local Gabor Binary Patterns,简称:LGBP)二进制图;所述与第二滤波响应图像对应的Gabor LGBP二进制图为:与所述第二滤波响应图像具有相同方向和尺度的LGBP二进制图。
获取模块14用于获取确定模块13得到的每一LGBP二进制图的特征向量,根据每一LGBP二进制图的特征向量获取待处理图像的特征向量;识别模块15用于根据获取模块14获取的待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取所述待处理图像与训练图像集合中该训练图像的相似度,并根据相似度阈值,得到识别结果。
其中,每一方向的各尺度的相似度阈值和每一方向的各尺度的像素点阈值均为训练阶段根据训练图像集合获得的。相似度阈值为根据训练图像集合中训练图像的特征向量获得的。
具体地,滤波处理模块12具体用于:对待处理图像使用Gabor滤波的核函数做卷积处理,获得每一个方向的各个尺度下的第二滤波响应图像;其中,核函数中的尺度及方向的取值为根据实际需求设定的。其中,将待处理图像使用核函数做卷积处理具体是指:将待处理图像,使用上述所设定的每一尺度取值和方向取值的Gabor滤波的核函数做卷积处理,从而得到每一个方向的各个尺度下的第二滤波响应图像。例如,当尺度取值为5,方向取值为8是,则遍历五个尺度(0,1,2,3,4)和八个方向(0,1,2,3,4,5,6,7),得到40个核函数,并利用每一核函数分别对待处理图像做卷积,从而得到5个尺度、8个方向的40个第二滤波响应图像。
其中,Gabor滤波的核函数可以为:
Figure PCTCN2014093450-appb-000003
其中,z=(x,y)表示空间域像素的位置,即待处理图像中各像素点的坐标值;||·||表示求取范数,
Figure PCTCN2014093450-appb-000004
以及
Figure PCTCN2014093450-appb-000005
中,ν的取值指示了Gabor核函数的尺度,μ的取值指示了Gabor核函数的方向, K表示总的方向数,
Figure PCTCN2014093450-appb-000006
决定了高斯窗口的大小。
在现有的实现有,一般设定5个不同的尺度,即υ∈{0,1,...,4},8个方向,即μ∈{0,1,...,7},共40个Gabor滤波核函数,确定σ=2π,kmax=π/2,
Figure PCTCN2014093450-appb-000007
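
As a rough illustration of the filter bank just described (5 scales, 8 orientations, σ = 2π, kmax = π/2), the sketch below builds a bank of complex Gabor kernels and convolves an image with them. It assumes the standard DC-compensated Gabor wavelet commonly used with LGBP features and a spacing factor f = √2; the exact kernel of the embodiment is the one given by the formula above, so treat this only as an approximation.

```python
# Hypothetical sketch of the Gabor filter bank (5 scales x 8 orientations); the
# kernel below is the standard Gabor wavelet, assumed to match the formula above.
import numpy as np
from scipy.signal import fftconvolve

SIGMA, K_MAX, F, K_DIRS = 2 * np.pi, np.pi / 2, np.sqrt(2), 8

def gabor_kernel(scale, direction, size=31):
    k = K_MAX / (F ** scale)                       # wave number for this scale (nu)
    phi = np.pi * direction / K_DIRS               # orientation for this direction (mu)
    kx, ky = k * np.cos(phi), k * np.sin(phi)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    norm_k2, norm_z2 = kx ** 2 + ky ** 2, x ** 2 + y ** 2
    envelope = (norm_k2 / SIGMA ** 2) * np.exp(-norm_k2 * norm_z2 / (2 * SIGMA ** 2))
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-SIGMA ** 2 / 2)   # DC compensation
    return envelope * carrier

def gabor_responses(image):
    """Return the 40 filter response magnitudes, keyed by (scale, direction)."""
    image = image.astype(np.float64)
    return {(v, u): np.abs(fftconvolve(image, gabor_kernel(v, u), mode="same"))
            for v in range(5) for u in range(K_DIRS)}
```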
同时参考图2和图3,确定模块13可以包括:阈值获取单元131,用于获取该第二滤波响应图像所在的方向的尺度下的像素点阈值;确定单元132,用于针对每一个所述第二滤波响应图像,根据所述与该第二滤波响应图像所在方向的尺度的像素点阈值,获得该第二滤波响应图像的各像素点对应的LGBP二进制序列,并根据该第二滤波响应图像的各像素点对应的LGBP二进制序列得到该第二滤波响应图像对应的LGBP二进制图。其中,二进制图即用二进制值表征第二滤波响应图像中的像素值。所述第二滤波响应图像对应的二进制模式LGBP二进制图具体为:所述第二滤波响应图像具有相同方向下的相同尺度的LGBP二进制图。
进一步的,确定单元132获得所述第二滤波响应图像中各像素点对应的LGBP二进制序列具体为:根据如下公式获得所述第二滤波响应图像中各所述像素点作为中心像素点时对应邻域中的任一周围像素点的LGBP二进制值:
Figure PCTCN2014093450-appb-000008
其中,ub表示所述第二滤波响应图像中任一像素点作为中心像素点c时,该中心像素点c所在邻域中的一个周围像素点b的像素值;ic表示该中心像素点c的像素值,t表示所述第二滤波响应图像所在的方向和尺度下的像素点阈值,S(ub,ic,t)表示中心像素点c所在邻域中的该任一个周围像素点的二进制值;所述像素点对应的二进制序列为该像素点的各周围像素点的二进制值组成的二进制序列
具体地,LGBP是在经过Gabor滤波后得到的滤波响应图像上进一步提取得出的局部二值模式(Local Binary Patterns,简称:LBP),体现了待处理图像经过Gabor滤波后的微观纹理结构。其中,LBP是用一个局部区域的模式来描述纹理,每个像素点由一个与之最匹配的局部邻域的原始纹理形成的码值来标记。
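
For illustration, the sketch below computes a thresholded LGBP code map for one filter response image using an 8-point neighbourhood of radius 1. It assumes the comparison S(u_b, i_c, t) = 1 when u_b >= i_c + t and 0 otherwise (the neighbour must exceed the centre pixel by at least the trained threshold t); the exact comparison of the embodiment is defined by the formula above, and the border handling here is simplified relative to the bilinear interpolation described later.

```python
# Hypothetical sketch: thresholded LGBP code map of one filter response image.
# Assumes S(u_b, i_c, t) = 1 if u_b >= i_c + t else 0; the embodiment's exact
# comparison is given by the formula in the description.
import numpy as np

# 8-neighbourhood offsets, radius 1, in a fixed order (bit 7 down to bit 0).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lgbp_code_map(response, t):
    """Return an H x W array of 8-bit LGBP codes for one response image."""
    padded = np.pad(response, 1, mode="edge")   # simplified border handling
    h, w = response.shape
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(OFFSETS):
        neighbour = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        codes |= ((neighbour >= response + t).astype(np.uint8) << (7 - bit))
    return codes
```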
实际场景中,对待处理图像进行人脸识别之前,人脸识别装置还需要对已有的图像集合进行训练,称之为训练阶段。其中,在训练阶段,人脸识别装置根据训练图像集合获得上述每一方向以及各尺度的像素点阈值。
进一步的,本发明实施例中,确定模块13对于每一个方向的各个尺度下第二滤波响应图像,根据针对与所述第二滤波响应图像具有相同方向和尺度下的像素点阈值,计算该第二滤波响应图像中每个像素点对应的LGBP二进制值。相比于现有技术中根据各像素点所对应的中心像素点的像素值,确定其对应的LGBP二进制值的方式,由于本发明根据训练图像集合中每一第一滤波响应图像在所有方向的各尺度下的第一滤波响应图像的鉴别因子计算得到的像素点阈值,提取LGBP二进制值,从而能够提取出更具有鉴别能力的LGBP纹理特征。因此,该像素点阈值的设置目的在于提高所提取的LGBP纹理特征的健壮性,从而实现人脸识别装置的高鉴别能力。
获取模块14获取确定模块13得到的每一所述LGBP二进制图的特征向量中,任一个所述LGBP二进制图的特征向量的获取过程包括:采用预设大小的区域块,对其所接收的确定模块13发送的LGBP二进制图进行区域划分;将各所述区域块中每一像素点的邻域二进制序列转换成十进制值,作为该像素点的LGBP编码值;所述像素点的邻域二进制序列由该像素点的各周围像素点的二进制值组成;以所有所述区域块中的最大LGBP编码值作为每一个所述区域块所对应向量的总维度,将所述区域块内LGBP编码值为n-1的LGBP编码值的个数作为该区域块对应向量中第n维的取值;所述区域块对应的向量的各维度的取值组成该区域块对应的LGBP直方图;其中,n为1到最大LGPB编码值之间的任意整数;串联各所述区域块的LGBP直方图,得到所述LGBP二进制图的特征向量。比如,各向量中第1维表示该向量对应的区域块内LGBP编码值为0的个数,第2维表示区域块内LGBP编码值为1的个数,第3维表示区域块内LGBP编码值为2的个数,依次类推到总维度为止;进一步的,所述获取模块13根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量具体为:串联各LGBP二进制图的特征向量,得到所述待处理图像的特征向量。
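
A minimal sketch of the block-histogram construction described in the preceding paragraph: the LGBP code map is divided into fixed-size blocks, each block is described by a histogram over its LGBP code values, and the block histograms are concatenated into the feature vector. The 4*8 block size follows the example given later in the text; deriving the histogram length from the largest code value (here, largest value + 1 bins) is an illustrative choice.

```python
# Hypothetical sketch: block-wise LGBP histograms concatenated into one feature vector.
import numpy as np

def lgbp_feature_vector(code_map, block_shape=(4, 8), n_bins=None):
    """Split the code map into blocks and concatenate the per-block histograms."""
    if n_bins is None:
        n_bins = int(code_map.max()) + 1    # bins derived from the largest code value
    bh, bw = block_shape
    h, w = code_map.shape
    histograms = []
    for y in range(0, h - bh + 1, bh):
        for x in range(0, w - bw + 1, bw):
            block = code_map[y:y + bh, x:x + bw]
            histograms.append(np.bincount(block.ravel(), minlength=n_bins)[:n_bins])
    return np.concatenate(histograms)

# The feature vector of the whole image is, in turn, the concatenation of the
# vectors of all (fused) LGBP code maps, e.g. np.concatenate([v0, v1, ..., v7]).
```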
需要说明的是,在本发明任一实施例中,区域块的形状可以为矩形块或方形块等任意形状;第一滤波响应图像及第二滤波响应图像中的“第一”与 “第二”仅为区分训练图像和待识别图像通过Gabor滤波后的响应图像。
另外,值得说明的是,本发明实施例中方向的尺度表示某一方向上的某一尺度,如第一方向的第一尺度(如图4中40个滤波响应图像中最左上角的滤波响应图像,而针对这个滤波响应图像,也可以说成是第一尺度的第一方向。因此,对于表征滤波响应图像、LGBP二进制图、鉴别因子、像素点阈值等,某一方向的某一尺度的表达和某一尺度的某一方向的表达具有相同的意义。主要是方向和尺度的具体取值。比如,第一方向的第二尺度与第二尺度的第一方向,第二方向的第五尺度和第五尺度的第二方向,等具有相同的意义。如果方向或尺度的取值不同,则不同。
识别模块15可以具体用于:采用直方图交叉法,根据获取模块14获取的待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度;并根据相似度阈值,得到识别结果。具体的,确定该相似度大于或等于相似度阈值,并确定待处理图像与用于获取相似度的训练图像为同一目标的图像;或,确定该相似度小于相似度阈值,并确定待处理图像与用于获取相似度的训练图像为不同目标的图像。
本发明实施例通过使用训练阶段获取的像素点阈值(根据训练图像集合中每一第一滤波响应图像在所有方向的各尺度下的第一滤波响应图像的鉴别因子计算得到的像素点阈值)提取待识别图像中的LGBP纹理特征,提高LGBP纹理特征的健壮性,进而确保所提取的LGBP纹理特征具有较强的鉴别能力。
在以下实施例中,结合人脸识别的训练阶段和识别阶段,对本发明实施例提供的人脸识别装置的实际功能和技术效果进行详细说明,以验证待识别图像是否存在于训练图像集合中,即将待识别图像与每一幅训练图像作一一认证,以判定该待识别图像是否包含于上述训练图像集合中。其中,待处理图像可以包括待识别图像和训练图像集合中的图像,训练图像集合中训练图像的个数为任意数值,在这里不对其进行限定,在具体应用时训练图像集合的大小根据实际场景设定。为便于区分,用于训练的训练图像集合,每张训练图像标记有独立的标识符(Identifier,简称:ID)。
图3为本发明基于Gabor二值模式的人脸识别装置实施例二的结构示意 图。如图3所示,该实施例在上述实施例的基础上,还可以包括预处理模块21。
一、在训练阶段,对上述训练图像集合中的图像进行处理。具体地:
滤波处理模块12用于对训练图像进行Gabor滤波处理。Gabor滤波过程:确定尺度因子为5(即尺度的取值可以为0,1,2,3和4)与方向因子为8(即方向的取值可以为0,1,2,3,4,5,6,7)的Gabor核函数;分别将训练图像集合中的每一训练图像,与Gabor核函数作卷积,得到每一方向(共8个方向)的各尺度下的(共5个尺度)的第一滤波响应图像,即训练图像集合中的每一训练图像都由40(5*8)张第一滤波响应图像来代替,其中,40指的是5个尺度上每个尺度均有8个方向的图像,共有40张,如图4所示。
阈值确定模块11对训练图像集合中的在同一方向和同一尺度的所有第一滤波响应图像采用Fisher准则,获取所述方向和尺度下的鉴别因子,并根据所有方向的各尺度下的该鉴别因子确定每一方向的各尺度下的像素点阈值。
具体的,将每一个第一滤波响应图像作为与该第一滤波响应图像具有相同方向和尺度的样本。将训练图像集合中同一目标的训练图像的在同一方向和尺度下的第一滤波响应图像作为在该方向和尺度下的类内样本,用jt表示第j目标的第t张训练图像在该方向和尺度下的第一滤波响应图像的各像素点的像素值,其中t=1,……,K,K为每个目标的训练图像总张数。作为一个例子,一个目标有8张训练图像,以5个方向8个尺度为例,而j为第一方向的第一尺度的第一滤波响应图像。计算训练图像集合中每一个目标在该方向和尺度下的类内样本的像素均值mj(即在与类内样本相同的方向和尺度下的像素平均值)及训练图像集合中所有目标在该方向和尺度下的样本的像素平均值M(即在与类内样本相同的方向和尺度下的所有样本的像素平均值),其中j=1,……,N,N为训练图像集合中目标的总数;根据该方向和尺度下的像素均值和该方向和尺度下的像素平均值,确定所述第一滤波图像在所述方向和尺度下的类内离散矩阵和类间离散矩阵;根据所述方向和尺度下的类内离散矩阵和所述方向和尺度下的类间离散矩阵,计算所述方向的尺度下的鉴别因子。
根据xt、mj以及M,计算所有第一滤波响应图像之间的类内离散度矩 阵Sw以及类间离散度矩阵Sb,计算公式如下:
Figure PCTCN2014093450-appb-000009
Figure PCTCN2014093450-appb-000010
根据公式(3)和公式(4)得到Sw和Sb,计算二者的比值Sw/Sb,将该比值作为该方向和尺度下的鉴别因子。所述该方向和尺度下的鉴别因子为:在该方向和尺度下的第一滤波响应图像对应的鉴别因子。
进一步的,阈值确定模块11根据每一方向的各尺度下的鉴别因子确定每一方向的各个尺度的像素点阈值,也就是确定与鉴别因子具有相同方向的相同尺度的像素点阈值。以5个方向8个尺度为例,每一张训练图像共40个第一滤波响应图像。则阈值确定模块可以获取40个鉴别因子,以及40个像素点阈值。其中,鉴别因子和像素点阈值一一对应。
具体的,可以按照某一方向尺度上第一滤波响应图像的鉴别因子与对应方向尺度上该第一滤波响应图像的LGBP二进制值提取的像素点阈值成反比的原则,确定每一个方向尺度上第一滤波响应图像的LGBP二进制提取的像素点阈值。可选地,阈值确定模块11根据每一方向的各尺度下的鉴别因子确定每一方向的各个尺度的像素点阈值具体为:根据如下公式,计算第一滤波响应图像所在方向的尺度下像素点阈值t为:
Figure PCTCN2014093450-appb-000011
其中,t为与W同方向的尺度下的第一滤波响应图像的像素点阈值,α为大于1的实数,W为第一滤波响应图像所在的方向的尺度下的鉴别因子。采用公式(5)计算像素点阈值的目的是为保证像素点阈值的准确性。当W>1时,公式(5)可进一步简化为取
Figure PCTCN2014093450-appb-000012
从上可以看出,本发明实施例中像素点阈值的确定根据了以类间离散度与类内离散度的比值作为某个方向尺度下的鉴别因子计算;该鉴别因子越大,说明该尺度方向下的样本类间离散度较大且类内离散度较小,更具有鉴别能力;否则鉴别能力较差。从而可以提高识别的准确率。
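
As an illustration of how a discriminant factor for one direction and one scale might be computed from the training set, the sketch below accumulates scalar within-class and between-class scatters of the first filter response images (a simplification of the scatter matrices in the text). Treating the factor as the between-class to within-class ratio is an assumption consistent with the explanation above; the step from this factor W to the pixel threshold t follows formula (5) of the description, which is not reproduced here.

```python
# Hypothetical sketch: per-(direction, scale) discriminant factor from the first
# filter response images of the training set. The between/within ratio and the
# class weighting are assumptions; the threshold t itself follows formula (5).
import numpy as np

def discriminant_factor(samples_by_target):
    """samples_by_target: list of arrays, one per target, each K x H x W (K images)."""
    class_means = [s.mean(axis=0) for s in samples_by_target]          # m_j
    overall_mean = np.mean(np.concatenate(samples_by_target), axis=0)  # M
    s_w = sum(((s - m) ** 2).sum() for s, m in zip(samples_by_target, class_means))
    s_b = sum(len(s) * ((m - overall_mean) ** 2).sum()
              for s, m in zip(samples_by_target, class_means))
    return s_b / max(s_w, 1e-12)
```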
进一步的,对于滤波处理模块12获得的每一第一滤波响应图像,确定模块13从依据针对与所述该第一滤响应图像具有相同方向和尺度下的预设像素点阈值,确定与每一个所述第一滤波响应图像对应的LGBP二进制图具体包括:对每一第一滤波响应图像,以每个像素点分别作为一个邻域的中心像素点,确定邻域半径为1,周围像素点个数为8,提取该中心像素点的LGBP二进制序例。将第一滤波响应图像中每一像素点作为中心像素点,提取该中心像素点对应的LGBP二进制序列作为LGBP二进制图中该像素点的值。对位于图像边缘的像素点,则采用双线性插值法,补全该邻域。其中,上述邻域半径和周围像素点的个数仅为示例,不以此为限。
为了计算第一滤波响应图像中边缘像素点的LGBP二进制序列,利用插值法扩充第一滤波响应图像。在图6中用虚线框表示扩充的像素点。作为一个例子,以第一滤波响应图像左上角像素点X11作为起始像素点,横向遍历每个像素点(即X11-->…-->X17-->X21-->…-->X27-->…-->X57)。在遍历的过程中,分别计算每个像素点LGBP二进制序列。
在图6中,以计算X11像素点作为例子,描述LGBP二进制值的计算过程。其中,t为该第一滤波响应图像对应的像素点阈值。
以中心像素点的灰度值作为像素点阈值的LGBP二进制值提取方法,对噪音比较敏感。如果该邻域的中心像素点的像素值位于这些邻域像素点的像素值之间,那么将邻域像素点分别与该中心像素点的像素值作比较,就会包含跳变信息,呈现出的LGBP二进制值扩大了该邻域内的纹理信息,因此说该方法对噪音比较敏感。如果对中心像素点的像素值加上一个合适的值再作为阈值,促使在该情况下,使阈值都大于或小于邻域像素点的像素值,那么将邻域像素点与该中心像素点的像素值作比较时,就不存在跳变信息,这样就可以抑制部分像素值非常接近的纹理信息。因此,本发明实施例根据鉴别因子计算像素点阈值来提取训练图像的LGBP二进制值,就是为了保证像素点阈值的合理性,如公式(2)所示。本发明提供的LGBP二进制值提取方法不仅抑制部分灰度值非常接近的纹理信息,提高纹理特征提取的健壮性,更提高了所提取的LGBP二进制值的鉴别能力。
确定模块13进一步可用于针对每一尺度下至少两个方向中的每个方向的所述LGBP二进制图,融合同一尺度下所述至少两个方向中每个方向的所 述LGBP二进制图,得到每一尺度融合后的LGBP二进制图。以图4为例,针对5个方向8个尺度的第二滤波响应图像,确定模块13先获取与每一个所述第二滤波响应图像对应的二进制模式LGBP二进制图,即获取5个方向中每一个方向的8个尺度的LGBP二进制图,共40个LGBP二进制图。然后,确定模块融合同一尺度下的每个方向(有5个方向)的LGBP二进制图,得到8个融合后的LGBP二进制图,即每个尺度对应一个融合后的LGBP二进制图。
可选地,确定模块13针对每一尺度,融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图具体为:确定模块13以按位相或的方式,融合同一尺度下所有方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图。
具体地,确定模块13融合同一尺度下,所有方向的LGBP二进制图,融合方法如图5所示。参考图5,图中只列举,同一尺度下,各个方向的LGBP二进制图中对应的一个像素点的二进制序列中的对应三位二进制值,其他二进制值依次类推。首先对第一方向与第二方向的LGBP二进制图对应像素点的二进制序列中的每一对相互对应的二进制值进行“或”操作;然后将这个“或”操作的结果与第三方向的LGBP二进制图对应像素点的二进制序列中的对应二进制值再做“或”操作;依次类推,直到将“或”操作的结果与第八方向上的LGBP二进制图对应像素点的二进制序列中的对应二进制值做“或”操作,作为融合后的LGBP二进制图
对所有尺度下的LGBP二进制图做上述处理,得到各尺度的融合后的LGBP二进制度。即训练图像集合中的每一训练图像由确定张数(即尺度的数量,如5))的融合后的LGBP二进制图组成。也就是说:每个尺度分别有一张融合后的LGBP二进制图。对同一尺度在不同方向下的各第一滤波响应图像的LGBP二进制图中的各像素点的二进制序列的第一个二进制值采用“按位相或”作融合的方法,如果某个第一滤波响应图像上出现跳变信息,即存在二进制值为“1”,则通过与该尺度下其他第一滤波响应图像中对应该跳变信息的位置的二进制值“相或”的结果必为“1”,即保留了该位置上的跳变信息。此外,在鉴别因子以及像素点阈值的作用下,提取的LGBP二进制值已经进行了鉴别信息的筛选,得到的是鉴别能力更强的信息。在此基础 上进行融合处理,可以实现在保证特征数据的鉴别能力的前提下有效降低人脸识别的计算量。
获取模块14获取确定模块13得到的每一所述LGBP二进制图的特征向量具体为:获取确定模块13得到的每一尺度融合后的LGBP二进制图的特征向量,并将该特征向量发送给阈值确定模块11。
具体地,获取模块14对每个尺度的融合后的LGBP二进制图采用预设大小的区域块进行分块;所述预设大小的区域块的大小可以根据LGBP二进制图实际大小预先设定的,例如可以为4*8;将每一个区域块中每个像素点对应的二进制序列转换成十进制值,并将该十进制值作为该像素点的LGBP编码值;以所有区域块中所有像素点的最大LGBP编码值作为每一个区域块所对应向量的总维度,将区域块内LGBP编码值为n-1的LGBP编码值的个数作为该区域块对应向量中第n维的取值;所述区域块对应的向量的各维度的取值组成该区域块对应的LGBP直方图。其中n为1到最大LGBP编码值之间的任意整数。例如,所有区域块中最大编码值为59,其中一个区域块中有4个编码值3,有10个编码值6,有9个编码值59,而其他值为0,则LGBP直方图为:
Figure PCTCN2014093450-appb-000013
向量的总长度为59维;串联每一个区域块的LGBP直方图,构成该LGBP二进制图的LGBP直方图。又比如,某一尺度的LGBP二进制图划分为2个区域块,每一区域块对应的LGBP直方图均为
Figure PCTCN2014093450-appb-000014
则串联这2个区域块的LGBP直方图,构成该LGBP二进制图的LGBP直方图,即
Figure PCTCN2014093450-appb-000015
在这里,上述例子仅为说明串联的方式,具体的向量维度和向量中各元素的取值不以上述例子为限;另外,串联各LGBP二进制图的特征向量得到所述待处理图像的特征向量中串联方式也可参考上述例子,以便于理解。
进一步地,获取模块14串联各尺度的融合后的LGBP二进制图各自对应的LGBP直方图,构成该训练图像对应LGBP直方图。所述该训练图像对应的LGBP直方图即为对训练图像的特征向量。
阈值确定模块11接收获取模块14发送的各训练图像的特征向量。进一步地,阈值确定模块11可具体用于:按照十字交叉验证准则,任意组合训练图像集合中的训练图像,将训练图像集合中的训练图像分为待训练图像和测试图像;采用直方图交叉法,计算各待训练图像与测试图像的特征向量的相似度;依次以每一相似度作为阈值,统计每组中的准确率及误判率;根据每组中的准确率与误判率,确定相似度阈值。具体的,遍历每组中准确率与误判率,若一组中的准确率与误判率相加再减1后的绝对值最小时,将其对应的相似度阈值作为该组的最优相似度阈值;取各个组最优相似度值的平均值,作为该训练图像集合的相似度阈值。
以下举例说明相似度阈值的训练:根据上述得到的各训练图像的特征向量,按照十字交叉验证准则,将训练图像集合分为十份,选择其中任意一份作为测试图像集合,余下作为待训练图像集合,重复十次,构成十组待训练图像与测试图像的集合,使得十份中的每一份都作过测试图像。如此,每一组都包含待训练图像与测试图像;采用直方图交叉法计算每组中每一对待训练图像与测试图像的特征向量之间的相似度;依次以每一对待训练图像和测试图像的相似度作为待定相似度阈值,根据公式(6)和公式(7)统计所识别出的正实例占所有正实例的比例(true positive rate,简称:TPR)以及错认为正实例的负实例占所有负实例的比例(false positive rate,简称:FPR)值;以FPR-(1-TPR)的绝对值最小为准则,计算出每组最优相似度阈值,最终取十组最优阈值的均值作为识别阶段的相似度阈值,即识别模块15用于识别的相似度阈值。
TPR = TP / (TP + FN)    （6）
FPR = FP / (FP + TN)    （7）
其中,TP为将同一个人的图像正确的识别出来的次数,FN是将同一个人的图像识别为不同人的次数,FP是将不同人的图像识别为同一个人的图像的次数,TN是将不同人的图像识别为不同人的图像的次数。
二、在识别阶段,对待识别图像进行处理,目的在于计算待识别图像的特征向量,将待识别图像与每一训练图像的特征向量作一一认证,计算两者的相似度,与相似度阈值进行比较以确定该待识别图像与某一训练图像是否属于同一个人。具体地:
与训练阶段相比,在识别阶段,只需利用训练阶段阈值确定模块11获取的所述每一方向的各尺度下的像素点阈值,及相似度阈值;识别模块15基于待处理图像的特征向量和训练图像集合中各训练图像的特征向量,获取待处理图像与训练图像集合中训练图像的相似度,并根据阈值确定模块11得到的相似度阈值,对待处理图像进行识别,获得识别结果。
以下说明识别阶段各模块的用途:
滤波处理模块12用于根据预设的至少两个方向与至少两个尺度,对待处理图像进行每一个方向的各个尺度的Gabor滤波处理,获得每一个方向的各个尺度下的第二滤波响应图像;
确定模块13用于对于滤波处理模块12获得的每一个方向的各个尺度下所述第二滤波响应图像,依据针对与所述第二滤波响应图像具有相同方向和尺度下的像素点阈值,确定与每一个所述第二滤波响应图像对应的LGBP二进制图;
获取模块14用于获取确定模块13得到的每一所述LGBP二进制图的特征向量,根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量;
识别模块15用于根据获取模块14获取的所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度,并根据相似度阈值,得到识别结果。
进一步的,在识别阶段,确定模块13确定二进制模式LGBP二进制图的方法,以及获取模块获取所述待处理图像的特征向量的方法,与在训练阶段确定模块和获取模块的执行的方法不同。本发明实施例并不针对训练阶段和识别阶段作详细的区分介绍。
识别模块15具体用于:采用直方图交叉法,根据所述获取模块获取的所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度;并根据相似度阈值, 得到识别结果。
其中,识别模块15根据相似度阈值,得到识别结果具体为:确定所述相似度大于或等于所述相似度阈值,并确定所述待处理图像与用于获取相似度的训练图像为同一目标的图像;或,确定所述相似度小于所述相似度阈值,并确定所述待处理图像与用于获取相似度的训练图像为不同目标的图像。
还需说明的是,人脸图像的变化包括内在变化和外在变化:内在变化是由于人的身份不同引起的,属于人脸的本质属性;而外在变化是由于外界条件的不同而引起的,包括光照、姿态、表情、年龄等,反映了不同的图像采集条件。理想的人脸描述特征应该只反映人脸的内在变化,而对外在变化不敏感。因此,在上述基础上,进一步地,人脸识别装置20包括预处理模块21,用于对待处理图像进行预处理,并将预处理后的待处理图像发送给滤波处理模块12,其中,预处理包括脸部区域获取、人脸对齐处理及光照预处理。
若预处理为脸部区域获取,则预处理模块24具体用于:在待处理图像中,根据人眼坐标,获取两眼间距离;根据距离,截取待处理图像中的前额、眼睛、鼻子、嘴巴以及下巴所在的区域。
若预处理为人脸对齐处理,则预处理模块21具体用于:计算待处理图像中两眼连线与水平线之间的夹角;根据该夹角,旋转待处理图像,使得待处理图像中两眼连线位于水平位置。
若预处理为光照预处理,则预处理模块21具体用于:采用高斯滤波和/或伽马(Gamma)矫正光照预处理方法,使待处理图像的光照强度变均匀。
本发明实施例中,人脸识别装置通过Fisher准则确定LGBP二进制图提取的像素点阈值,另通过对LGBP二进制图进行“按位相或”的计算进行融合,提高LGBP二进制图的鉴别能力,在降低LGBP二进制图计算量的同时保留各个方向上的LGBP二值模式的跳变信息,从而提高识别率。
需要说明的是,在上述实施例中,预处理模块、滤波处理模块、确定模块、获取模块和阈值确定模块可用于训练阶段,人脸识别装置离线和/或在线获取像素点阈值和相似度阈值;预处理模块、滤波处理模块、确定模块、获取模块和识别模块可用于在识别阶段,人脸识别装置在线获取识别结果。其中,上述各模块可以集成在一个人脸识别装置中,也可用于训练阶段的装置和用于识别阶段的装置分立设置,本发明不对其进行限制。
图7为本发明基于Gabor二值模式的人脸识别装置实施例三的结构示意图。该装置可以集成在通信设备中,其中,通信设备可以为手机、PC、笔记本电脑或服务器等任意终端设备。如图7所示,本实施例的装置70包括:处理器71和存储器72。
其中,处理器71用于对训练图像集合中的在同一方向的同一尺度的所有第一滤波响应图像采用Fisher准则,获取所述方向的所述尺度下的鉴别因子,并根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各尺度下的像素点阈值;存储器72与处理器71连接,用于存储所述每一方向的各尺度下的像素点阈值、各像素点阈值与预设方向的尺度的对应关系,以及相似度阈值;处理器71还可以用于根据预设的至少两个方向与至少两个尺度,对待处理图像进行每一个方向的各个尺度的Gabor滤波处理,获得每一个方向的各个尺度下的第二滤波响应图像;对于每一个方向的各个尺度下所述第二滤波响应图像,依据针对与所述第二滤波响应图像具有相同方向和尺度下的像素点阈值,确定与每一个所述第二滤波响应图像对应的LGBP二进制图;获取每一所述LGBP二进制图的特征向量,根据每一所述LGBP二进制图的特征向量获取待处理图像的特征向量;及,根据所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度,并根据所述相似度阈值,得到识别结果。
此外,本发明实施例的处理器,还用于执行上述阈值确定模块,滤波处理模块,确定模块,获取模块以及识别模块执行的各个步骤,本发明实施例在此不在一一详细。
图8为本发明基于Gabor二值模式的人脸识别方法实施例一的流程图。本发明实施例提供一种基于Gabor二值模式的人脸识别方法,该方法可以由上述人脸识别装置执行,该装置可以集成在通信设备中,其中,通信设备可以为手机、PC、笔记本电脑或服务器等任意终端设备。如图8所示,该基于Gabor二值模式的人脸识别方法包括:
S801、根据预设的至少两个方向与至少两个尺度,对待处理图像进行每一个方向的各个尺度的Gabor滤波处理,获得每一个方向的各个尺度下的第二滤波响应图像。
S802、对于每一个方向的各个尺度下第二滤波响应图像,依据针对与第 二滤波响应图像具有相同方向和尺度下的像素点阈值,确定与每一个第二滤波响应图像对应的LGBP二进制图。
其中,该像素点阈值为:所述像素点阈值为:对训练图像集合中的在同一方向的同一尺度的所有第一滤波响应图像采用费希尔Fisher准则,获取所述方向的所述尺度下的鉴别因子,并根据所述方向的尺度下的所述鉴别因子确定的。
S803、获取每一LGBP二进制图的特征向量,根据每一LGBP二进制图的特征向量获取待处理图像的特征向量。
S804、根据待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度,并根据相似度阈值,得到识别结果。
本发明实施例的方法,可以由图2、图3或图7所示装置执行,其实现原理和技术效果类似,此处不再赘述。
在上述实施例中,将所述训练图像集合中每一个所述第一滤波响应图像作为与该第一滤波响应图像具有相同方向和尺度的样本,将所述训练图像集合中同一目标的训练图像的在同一方向的同一尺度下的第一滤波响应图像作为该方向的该尺度的类内样本,
所述获取所述方向的所述尺度的鉴别因子,具体为:计算所述训练图像集合中每一目标所述方向的所述尺度下类内样本的像素均值,及所述训练图像集合中所有目标在所述方向和所述尺度下的所有样本的像素平均值;根据所述方向的所述尺度下的所述像素均值和所述方向的所述尺度下的所述像素平均值,确定所述方向的所述尺度下的类内离散矩阵和所述方向的所述尺度下的类间离散矩阵;根据所述方向的所述尺度下的所述类内离散矩阵和所述方向的所述尺度下的类间离散矩阵,计算所述方向的所述尺度下的的鉴别因子。
可选地,根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各个尺度下的像素点阈值可以具体为:根据如下公式,计算第一滤波响应图像所在方向的尺度下的像素点阈值t为:
Figure PCTCN2014093450-appb-000018
其中,t为与W同方向的尺度下的第一滤波响应图像的像素点阈值,α为大 于1的实数,W所述第一滤波响应图像所在的方向的尺度下的鉴别因子。
在上述实施例的基础上,S802可以包括:获取与所述第二滤波响应图像对应的像素点阈值,所述与所述第二滤波响应图像对应的像素点阈值为该第二滤波响应图像所在的方向和尺度下的像素点阈值;针对每一个所述第二滤波响应图像,根据所述与该第二滤波响应图像对应的像素点阈值,获得该第二滤波响应图像的各像素点对应的LGBP二进制序列,并根据该第二滤波响应图像的各像素点对应的LGBP二进制序列得到该第二滤波响应图像对应的LGBP二进制图。
进一步地,所述获得所述第二滤波响应图像中各像素点对应的LGBP二进制序列具体为:根据如下公式获得所述第二滤波响应图像中各所述像素点作为中心像素点时对应邻域中的任一周围像素点的LGBP二进制值:
Figure PCTCN2014093450-appb-000019
其中,ub表示所述第二滤波响应图像中任一像素点作为中心像素点c时,该中心像素点c所在邻域中的一个周围像素点b的像素值;ic表示该中心像素点c的像素值,t表示所述第二滤波响应图像所在的方向和尺度下的像素点阈值,S(ub,ic,t)表示中心像素点c所在邻域中的该任一个周围像素点的二进制值;所述像素点对应的二进制序列为该像素点的各周围像素点的二进制值组成的二进制序列
在上述基础上,步骤803中获取每一所述LGBP二进制图的特征向量中,任一个所述LGBP二进制图的特征向量的获取过程包括:
采用预设大小的区域块,对所述LGBP二进制图进行区域划分;将各所述区域块中每一像素点的邻域二进制序列转换成十进制值,作为该像素点的LGBP编码值;所述像素点的邻域二进制序列由该像素点的各周围像素点的二进制值组成;以所有所述区域块中的最大LGBP编码值作为每一个所述区域块所对应向量的总维度,将所述区域块内LGBP编码值为n-1的LGBP编码值的个数作为该区域块对应向量中第n维的取值;所述区域块对应的向量的各维度的取值组成该区域块对应的LGBP直方图;其中,n为1到最大LGPB编码值之间的任意整数;串联各所述区域块的LGBP直方图,得到所述LGBP 二进制图的特征向量;所述根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量具体为:串联各LGBP二进制图的特征向量,得到所述待处理图像的特征向量。
可选地,S803之前,该方法还可以包括:针对每一尺度至少两个方向中的每个方向的所述LGBP二进制图,融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图;所述获取每一所述LGBP二进制图的特征向量具体为:获取所述每一尺度融合后的LGBP二进制图的特征向量。
其中,所述融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图具体为:以按位相或的方式,融合同一尺度下所述至少两个方向中每个方向的的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图。
进一步地,S804可以包括:采用直方图交叉法,根据待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度;并根据相似度阈值,得到识别结果。
其中,根据相似度阈值,得到识别结果可以具体为:确定相似度大于或等于相似度阈值,并确定待处理图像与用于获取相似度的训练图像为同一目标的图像;或,确定相似度小于相似度阈值,并确定待处理图像与用于获取相似度的训练图像为不同目标的图像。
在上述实施例中,S804之前,所述方法还可以包括:按照十字交叉验证准则,任意组合所述训练图像集合中的训练图像,将训练图像集合中的训练图像分为待训练图像和测试图像;采用直方图交叉法,计算各待训练图像与测试图像的特征向量的相似度;依次以每一相似度作为阈值,统计准确率及误判率;根据每组中的准确率与误判率,确定相似度阈值。
其中,根据每组中的准确率与误判率,确定相似度阈值,可以包括:遍历每组中所述准确率与误判率,若一组中的准确率与误判率相加再减1后的绝对值最小时,将其对应的相似度阈值作为该组的最优相似度阈值;取各个组的最优相似度阈值的平均值,作为所述训练图像集合的相似度阈值。
在上述实施例中,S801之前,基于Gabor二值模式的人脸识别方法还可以包括:对待处理图像进行预处理,预处理可以包括脸部区域获取、人脸对 齐处理及光照预处理。
其中,脸部区域获取可以包括:在待处理图像中,根据人眼坐标,获取两眼间距离;根据距离,截取待处理图像中的前额、眼睛、鼻子、嘴巴以及下巴所在的区域。
人脸对齐处理可以包括:计算待处理图像中两眼连线与水平线之间的夹角;根据夹角,旋转待处理图像,使得待处理图像中两眼连线位于水平位置。
光照预处理可以包括:采用高斯滤波和/或Gamma矫正光照预处理方法,使待处理图像的光照强度变均匀。
本发明实施例通过使用训练阶段获取的像素点阈值提取待识别图像中的LGBP纹理特征,提高LGBP纹理特征的健壮性,进而确保所提取的LGBP纹理特征具有较强的鉴别能力,提高对人脸的鉴别能力。
本领域普通技术人员可以理解:实现上述各方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成。前述的程序可以存储于一计算机可读取存储介质中。该程序在执行时,执行包括上述各方法实施例的步骤;而前述的存储介质包括:ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上各实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述各实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的范围。

Claims (24)

  1. 一种基于珈波Gabor二值模式的人脸识别装置,其特征在于,包括:
    阈值确定模块,用于对训练图像集合中的在同一方向的同一尺度的所有第一滤波响应图像采用费希尔Fisher准则,获取所述方向的所述尺度下的鉴别因子,并根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各尺度下的像素点阈值;
    滤波处理模块,用于根据预设的至少两个方向与至少两个尺度,对待处理图像进行每一个方向的各个尺度的Gabor滤波处理,获得每一个方向的各个尺度下的第二滤波响应图像;
    确定模块,用于对于所述滤波处理模块获得的每一个方向的各个尺度下所述第二滤波响应图像,依据针对与所述第二滤波响应图像具有相同方向和尺度下的像素点阈值,确定与每一个所述第二滤波响应图像对应的二进制模式LGBP二进制图;
    获取模块,用于获取所述确定模块得到的每一所述LGBP二进制图的特征向量,根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量;
    识别模块,用于根据所述获取模块获取的所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取所述待处理图像与训练图像集合中该训练图像的相似度,并根据相似度阈值,得到识别结果。
  2. 根据权利要求1所述的装置,其特征在于,将所述训练图像集合中每一个所述第一滤波响应图像作为与该第一滤波响应图像具有相同方向和尺度的样本,将所述训练图像集合中同一目标的训练图像的在同一方向的同一尺度的第一滤波响应图像作为该方向的该尺度的类内样本,所述阈值确定模块获取所述方向的所述尺度下的鉴别因子具体为:
    计算所述训练图像集合中每一目标所述方向的所述尺度下类内样本的像素均值,及所述训练图像集合中所有目标在所述方向的所述尺度下的所有样本的像素平均值;
    根据所述方向的所述尺度下的所述像素均值和所述方向的所述尺度下的所述像素平均值,确定所述方向的所述尺度下的类内离散矩阵和所述方向的所述尺度下的类间离散矩阵;
    根据所述方向的所述尺度下的所述类内离散矩阵和所述方向的所述尺度下的类间离散矩阵,计算所述方向的所述尺度下的鉴别因子。
  3. 根据权利要求1或2所述的装置,其特征在于,所述阈值确定模块根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各个尺度的像素点阈值具体为:
    根据如下公式,计算所述第一滤波响应图像所在方向的尺度下的像素点阈值t为:
    Figure PCTCN2014093450-appb-100001
    其中,t为与W同方向的尺度下的第一滤波响应图像的像素点阈值,α为大于1的实数,W为所述第一滤波响应图像所在的方向的尺度下的鉴别因子。
  4. 根据权利要求1-3任一项所述的装置,其特征在于,所述确定模块包括:
    阈值获取单元,用于获取该第二滤波响应图像所在的方向的尺度下的像素点阈值;
    确定单元,用于针对每一个所述第二滤波响应图像,根据所述与该第二滤波响应图像所在方向的尺度的像素点阈值,获得该第二滤波响应图像的各像素点对应的LGBP二进制序列,并根据该第二滤波响应图像的各像素点对应的LGBP二进制序列得到该第二滤波响应图像对应的LGBP二进制图。
  5. 根据权利要求4所述的装置,其特征在于,所述确定单元获得所述第二滤波响应图像中各像素点对应的LGBP二进制序列具体为:
    根据如下公式获得所述第二滤波响应图像中各所述像素点作为中心像素点时对应邻域中的任一周围像素点的LGBP二进制值:
    Figure PCTCN2014093450-appb-100002
    其中,ub表示所述第二滤波响应图像中任一像素点作为中心像素点c时,该中心像素点c所在邻域中的一个周围像素点b的像素值;ic表示该中心像素点c的像素值,t表示所述第二滤波响应图像所在的方向和尺度下的像素点阈值,S(ub,ic,t)表示中心像素点c所在邻域中的该任一个周围像素点的二进制值;所述像素点对应的二进制序列为该像素点的各周围像素点的二进制值组 成的二进制序列。
  6. 根据权利要求4或5所述的装置,其特征在于,所述获取模块获取所述确定模块得到的每一所述LGBP二进制图的特征向量中,任一个所述LGBP二进制图的特征向量的获取过程包括:
    采用预设大小的区域块,对其所接收的所述确定模块得到的所述LGBP二进制图进行区域划分;
    将各所述区域块中每一像素点对应的二进制序列转换成十进制值,作为该像素点的LGBP编码值;
    以所有所述区域块中的最大LGBP编码值作为每一个所述区域块所对应向量的总维度,将所述区域块内LGBP编码值为n-1的LGBP编码值的个数作为该区域块对应向量中第n维的取值;所述区域块对应的向量的各维度的取值组成该区域块对应的LGBP直方图;其中,n为1到最大LGPB编码值之间的任意整数;
    串联各所述区域块的LGBP直方图,得到所述LGBP二进制图的特征向量;
    所述获取模块根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量具体为:串联各LGBP二进制图的特征向量,得到所述待处理图像的特征向量。
  7. 根据权利要求1-6任一项所述的装置,其特征在于,所述确定模块进一步用于针对每一尺度下至少两个方向中的每个方向的所述LGBP二进制图,融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图;
    所述获取模块获取所述确定模块得到的每一所述LGBP二进制图的特征向量具体为:获取所述确定模块得到的每一尺度融合后的LGBP二进制图的特征向量。
  8. 根据权利要求7所述的装置,其特征在于,所述确定模块针融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图具体为:所述确定模块以按位相或的方式,融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图。
  9. 根据权利要求1-8任一项所述的装置,其特征在于,所述识别模块具体用于:
    采用直方图交叉法,根据所述获取模块获取的所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度;并根据相似度阈值,得到识别结果。
  10. 根据权利要求9所述的装置,其特征在于,所述识别模块根据相似度阈值,得到识别结果具体为:
    确定所述相似度大于或等于所述相似度阈值,并确定所述待处理图像与用于获取相似度的训练图像为同一目标的图像;或
    确定所述相似度小于所述相似度阈值,并确定所述待处理图像与用于获取相似度的训练图像为不同目标的图像。
  11. 根据权利要求1-10任一项所述的装置,其特征在于,所述阈值确定模块还用于:
    按照十字交叉验证准则,任意组合所述训练图像集合中的训练图像,将所述训练图像集合中的训练图像分为待训练图像和测试图像;
    采用直方图交叉法,计算各所述待训练图像与所述测试图像的特征向量的相似度;
    依次以每一相似度作为阈值,统计该组的准确率及误判率;
    根据每组中的准确率与误判率,确定相似度阈值。
  12. 根据权利要求11所述的装置,其特征在于,所述阈值确定模块根据每组中的准确率与误判率,确定相似度阈值具体为:
    遍历每组中所述准确率与误判率,若一组中的准确率与误判率相加再减1后的绝对值最小时,将其对应的相似度阈值作为该组的最优相似度阈值;
    取各个组的最优相似度阈值的平均值,作为所述训练图像集合的相似度阈值。
  13. 一种基于珈波Gabor二值模式的人脸识别方法,其特征在于,包括:
    根据预设的至少两个方向与至少两个尺度,对待处理图像进行每一个方向的各个尺度的Gabor滤波处理,获得每一个方向的各个尺度下的第二滤波响应图像;
    对于每一个方向的各个尺度下所述第二滤波响应图像,依据针对与所述 第二滤波响应图像具有相同方向和尺度下的像素点阈值,确定与每一个所述第二滤波响应图像对应的二进制模式LGBP二进制图;所述像素点阈值为:对训练图像集合中的在同一方向的同一尺度的所有第一滤波响应图像采用费希尔Fisher准则,获取所述方向的所述尺度下的鉴别因子,并根据所述方向的尺度下的所述鉴别因子确定的;
    获取每一所述LGBP二进制图的特征向量,根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量;
    根据所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取所述待处理图像与训练图像集合中该训练图像的相似度,并根据相似度阈值,得到识别结果。
  14. 根据权利要求13所述的方法,其特征在于,将所述训练图像集合中每一个所述第一滤波响应图像作为与该第一滤波响应图像具有相同方向和尺度的样本,将所述训练图像集合中同一目标的训练图像的在同一方向的同一尺度下的第一滤波响应图像作为该方向的该尺度的类内样本,所述获取所述方向的所述尺度的鉴别因子,具体为:
    计算所述训练图像集合中每一目标所述方向的所述尺度下类内样本的像素均值,及所述训练图像集合中所有目标在所述方向的所述尺度下的所有样本的像素平均值;
    根据所述方向的所述尺度下的所述像素均值和所述方向的所述尺度下的所述像素平均值,确定所述方向的所述尺度下的类内离散矩阵和所述方向的所述尺度下的类间离散矩阵;
    根据所述方向的所述尺度下的所述类内离散矩阵和所述方向的所述尺度下的类间离散矩阵,计算所述方向的所述尺度下的的鉴别因子。
  15. 根据权利要求13或14所述的方法,其特征在于,所述根据每一方向的各尺度下的所述鉴别因子确定所述每一方向的各个尺度下的像素点阈值具体为:
    根据如下公式,计算所述第一滤波响应图像所在方向的尺度下的像素点阈值t为:
    Figure PCTCN2014093450-appb-100003
    其中,t为与W同方向的尺度下的第一滤波响应图像的像素点阈值,α为大 于1的实数,W为所述第一滤波响应图像所在的方向的尺度下的鉴别因子。
  16. 根据权利要求13-15任一项所述的方法,其特征在于,所述对于每一个方向的各个尺度下所述第二滤波响应图像,依据针对与所述第二滤波响应图像具有相同方向和尺度下的像素点阈值,确定与每一个所述第二滤波响应图像对应的LGBP二进制图,包括:
    获取与所述第二滤波响应图像对应的像素点阈值,所述与所述第二滤波响应图像对应的像素点阈值为该第二滤波响应图像所在的方向和尺度下的像素点阈值;
    针对每一个所述第二滤波响应图像,根据所述与该第二滤波响应图像对应的像素点阈值,获得该第二滤波响应图像的各像素点对应的LGBP二进制序列,并根据该第二滤波响应图像的各像素点对应的LGBP二进制序列得到该第二滤波响应图像对应的LGBP二进制图。
  17. 根据权利要求16所述的方法,其特征在于,所述获得所述第二滤波响应图像中各像素点对应的LGBP二进制序列具体为:
    根据如下公式获得所述第二滤波响应图像中各所述像素点作为中心像素点时对应邻域中的任一周围像素点的LGBP二进制值:
    Figure PCTCN2014093450-appb-100004
    其中,ub表示所述第二滤波响应图像中任一像素点作为中心像素点c时,该中心像素点c所在邻域中的一个周围像素点b的像素值;ic表示该中心像素点c的像素值,t表示所述第二滤波响应图像所在的方向和尺度下的像素点阈值,S(ub,ic,t)表示中心像素点c所在邻域中的该任一个周围像素点的二进制值;所述像素点对应的二进制序列为该像素点的各周围像素点的二进制值组成的二进制序列。
  18. 根据权利要求16或17所述的方法,其特征在于,所述获取每一所述LGBP二进制图的特征向量中,任一个所述LGBP二进制图的特征向量的获取过程包括:
    采用预设大小的区域块,对所述LGBP二进制图进行区域划分;
    将各所述区域块中每一像素点的邻域二进制序列转换成十进制值,作为该像素点的LGBP编码值;
    以所有所述区域块中的最大LGBP编码值作为每一个所述区域块所对应向量的总维度,将所述区域块内LGBP编码值为n-1的LGBP编码值的个数作为该区域块对应向量中第n维的取值;所述区域块对应的向量的各维度的取值组成该区域块对应的LGBP直方图;其中,n为1到最大LGPB编码值之间的任意整数;
    串联各所述区域块的LGBP直方图,得到所述LGBP二进制图的特征向量;
    所述根据每一所述LGBP二进制图的特征向量获取所述待处理图像的特征向量具体为:串联各LGBP二进制图的特征向量,得到所述待处理图像的特征向量。
  19. 根据权利要求13-18任一项所述的方法,其特征在于,所述获取每一所述LGBP二进制图的特征向量之前,所述方法还包括:
    针对每一尺度至少两个方向中的每个方向的所述LGBP二进制图,融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图;
    所述获取每一所述LGBP二进制图的特征向量具体为:获取所述每一尺度融合后的LGBP二进制图的特征向量。
  20. 根据权利要求19所述的方法,其特征在于,所述融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图具体为:以按位相或的方式,融合同一尺度下所述至少两个方向中每个方向的所述LGBP二进制图,得到每一尺度融合后的LGBP二进制图。
  21. 根据权利要求13-20任一项所述的方法,其特征在于,所述根据所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度,并根据相似度阈值,得到识别结果,包括:
    采用直方图交叉法,根据所述待处理图像的特征向量以及训练图像集合中任一训练图像的特征向量,获取待处理图像与训练图像集合中该训练图像的相似度;并根据相似度阈值,得到识别结果。
  22. 根据权利要求21所述的方法,其特征在于,所述根据相似度阈值,得到识别结果具体为:
    确定所述相似度大于或等于所述相似度阈值,并确定所述待处理图像与用于获取相似度的训练图像为同一目标的图像;或
    确定所述相似度小于所述相似度阈值,并确定所述待处理图像与用于获取相似度的训练图像为不同目标的图像。
  23. 根据权利要求13-22任一项所述的方法,其特征在于,所述根据相似度阈值,得到识别结果之前,所述方法还包括:
    按照十字交叉验证准则,任意组合所述训练图像集合中的训练图像,将所述训练图像集合中的训练图像分为待训练图像和测试图像;
    采用直方图交叉法,计算各所述待训练图像与所述测试图像的特征向量的相似度;
    依次以每一相似度作为阈值,统计准确率及误判率;
    根据每组中的准确率与误判率,确定相似度阈值。
  24. 根据权利要求23所述的方法,其特征在于,所述根据每组中的准确率与误判率,确定相似度阈值,包括:
    遍历每组中所述准确率与误判率,若一组中的准确率与误判率相加再减1后的绝对值最小时,将其对应的相似度阈值作为该组的最优相似度阈值;
    取各个组的最优相似度阈值的平均值,作为所述训练图像集合的相似度阈值。
PCT/CN2014/093450 2014-03-31 2014-12-10 基于Gabor二值模式的人脸识别方法及装置 WO2015149534A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410126927.XA CN103902977B (zh) 2014-03-31 2014-03-31 基于Gabor二值模式的人脸识别方法及装置
CN201410126927.X 2014-03-31

Publications (1)

Publication Number Publication Date
WO2015149534A1 true WO2015149534A1 (zh) 2015-10-08

Family

ID=50994289

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/093450 WO2015149534A1 (zh) 2014-03-31 2014-12-10 基于Gabor二值模式的人脸识别方法及装置

Country Status (2)

Country Link
CN (1) CN103902977B (zh)
WO (1) WO2015149534A1 (zh)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902977B (zh) * 2014-03-31 2017-04-05 华为技术有限公司 基于Gabor二值模式的人脸识别方法及装置
CN104183029A (zh) * 2014-09-02 2014-12-03 济南大学 一种便携式快速人群考勤方法
CN104361357B (zh) * 2014-11-07 2018-02-06 北京途迹科技有限公司 基于图片内容分析的相片集分类系统及分类方法
CN105678208B (zh) * 2015-04-21 2019-03-08 深圳Tcl数字技术有限公司 提取人脸纹理的方法及装置
CN105469080B (zh) * 2016-01-07 2018-09-25 东华大学 一种人脸表情识别方法
CN106507199A (zh) * 2016-12-20 2017-03-15 深圳Tcl数字技术有限公司 电视节目推荐方法及装置
CN106934350A (zh) * 2017-02-21 2017-07-07 东南大学 一种基于Gabor张量的MLFDA人脸识别方法
CN107392142B (zh) * 2017-07-19 2020-11-13 广东工业大学 一种真伪人脸识别方法及其装置
CN108596250B (zh) * 2018-04-24 2019-05-14 深圳大学 图像特征编码方法、终端设备及计算机可读存储介质
CN108875629B (zh) * 2018-06-14 2021-06-04 电子科技大学 基于多样本特征融合的掌静脉识别方法
CN112148981A (zh) * 2020-09-29 2020-12-29 广州小鹏自动驾驶科技有限公司 同人识别方法、装置、设备和存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080107311A1 (en) * 2006-11-08 2008-05-08 Samsung Electronics Co., Ltd. Method and apparatus for face recognition using extended gabor wavelet features
CN102024141A (zh) * 2010-06-29 2011-04-20 上海大学 基于Gabor小波变换和局部二值模式优化的人脸识别方法
CN102663426A (zh) * 2012-03-29 2012-09-12 东南大学 一种基于小波多尺度分析和局部三值模式的人脸识别方法
CN103902977A (zh) * 2014-03-31 2014-07-02 华为技术有限公司 基于Gabor二值模式的人脸识别方法及装置

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU, ZHONGHUA ET AL.: "Face Recognition Based on Multi-scale Block Local Binary Pattern", COMPUTER SCIENCE, vol. 36, no. 11, 30 November 2009 (2009-11-30), pages 293 - 295 and 299 *
ZHANG, WENCHAO ET AL.: "Histogram Sequence of Local Gabor Binary Pattern for Face Description and Identification", JOURNAL OF SOFTWARE, vol. 17, no. 12, 31 December 2006 (2006-12-31), pages 2508 - 2517, XP055229190 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108520215A (zh) * 2018-03-28 2018-09-11 电子科技大学 基于多尺度联合特征编码器的单样本人脸识别方法
CN109376754B (zh) * 2018-08-31 2023-08-04 平安科技(深圳)有限公司 图像处理方法、装置、计算机设备及存储介质
CN109376754A (zh) * 2018-08-31 2019-02-22 平安科技(深圳)有限公司 图像处理方法、装置、计算机设备及存储介质
CN109377444B (zh) * 2018-08-31 2023-10-24 平安科技(深圳)有限公司 图像处理方法、装置、计算机设备及存储介质
CN109859053A (zh) * 2018-11-08 2019-06-07 平安科技(深圳)有限公司 图像查重的方法、装置、计算机设备及存储介质
CN109859053B (zh) * 2018-11-08 2023-08-29 平安科技(深圳)有限公司 图像查重的方法、装置、计算机设备及存储介质
CN109947756A (zh) * 2019-03-18 2019-06-28 成都好享你网络科技有限公司 用于增广数据的数据清洗方法、装置和设备
CN110321858B (zh) * 2019-07-08 2022-06-14 北京字节跳动网络技术有限公司 视频相似度确定方法、装置、电子设备及存储介质
CN110321858A (zh) * 2019-07-08 2019-10-11 北京字节跳动网络技术有限公司 视频相似度确定方法、装置、电子设备及存储介质
CN112069993A (zh) * 2020-09-04 2020-12-11 西安西图之光智能科技有限公司 基于五官掩膜约束的密集人脸检测方法及系统和存储介质
CN112069993B (zh) * 2020-09-04 2024-02-13 西安西图之光智能科技有限公司 基于五官掩膜约束的密集人脸检测方法及系统和存储介质
CN113065530B (zh) * 2021-05-12 2023-05-30 曼德电子电器有限公司 人脸识别方法和装置、介质、设备
CN113065530A (zh) * 2021-05-12 2021-07-02 曼德电子电器有限公司 人脸识别方法和装置、介质、设备
CN116957524A (zh) * 2023-09-21 2023-10-27 青岛阿斯顿工程技术转移有限公司 一种技术转移过程中人才信息智能管理方法及系统
CN116957524B (zh) * 2023-09-21 2024-01-05 青岛阿斯顿工程技术转移有限公司 一种技术转移过程中人才信息智能管理方法及系统

Also Published As

Publication number Publication date
CN103902977A (zh) 2014-07-02
CN103902977B (zh) 2017-04-05

Similar Documents

Publication Publication Date Title
WO2015149534A1 (zh) 基于Gabor二值模式的人脸识别方法及装置
Chakraborty et al. An overview of face liveness detection
Dagnes et al. Occlusion detection and restoration techniques for 3D face recognition: a literature review
WO2017106996A1 (zh) 一种人脸识别的方法以及人脸识别装置
CN110569756A (zh) 人脸识别模型构建方法、识别方法、设备和存储介质
Kawulok et al. Precise multi-level face detector for advanced analysis of facial images
CN108416291B (zh) 人脸检测识别方法、装置和系统
WO2014146415A1 (zh) 人脸识别方法和设备
CN105335719A (zh) 活体检测方法及装置
Paul et al. Extraction of facial feature points using cumulative histogram
US20230206700A1 (en) Biometric facial recognition and liveness detector using ai computer vision
CN112686191B (zh) 基于人脸三维信息的活体防伪方法、系统、终端及介质
CN111611849A (zh) 一种用于门禁设备的人脸识别系统
Das et al. Multi-angle based lively sclera biometrics at a distance
Chen et al. 3d face mask anti-spoofing via deep fusion of dynamic texture and shape clues
Deng et al. Attention-aware dual-stream network for multimodal face anti-spoofing
Li et al. Solving a special type of jigsaw puzzles: Banknote reconstruction from a large number of fragments
Masaoud et al. A review paper on ear recognition techniques: models, algorithms and methods
CN112800941B (zh) 基于非对称辅助信息嵌入网络的人脸反欺诈方法及系统
Shu et al. Face anti-spoofing based on weighted neighborhood pixel difference pattern
Paul et al. Automatic adaptive facial feature extraction using CDF analysis
Lin et al. Face detection algorithm based on multi-orientation gabor filters and feature fusion
CN112380966A (zh) 基于特征点重投影的单眼虹膜匹配方法
Tao et al. A Novel Illumination-Insensitive Feature Extraction Method
Paul et al. Extraction of facial feature points using cumulative distribution function by varying single threshold group

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14888357

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase
122 Ep: pct application non-entry in european phase

Ref document number: 14888357

Country of ref document: EP

Kind code of ref document: A1