WO2017024963A1 - Image recognition method, metric learning method, image source identification method and device - Google Patents

Image recognition method, metric learning method, image source identification method and device

Info

Publication number
WO2017024963A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
model
feature
similarity
source
Application number
PCT/CN2016/092785
Other languages
English (en)
French (fr)
Inventor
易东
刘荣
张帆
张伦
楚汝峰
Original Assignee
Alibaba Group Holding Limited
Application filed by Alibaba Group Holding Limited
Publication of WO2017024963A1 publication Critical patent/WO2017024963A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Definitions

  • the present application relates to pattern recognition technology, and in particular to an image recognition method and apparatus.
  • the application also provides a metric learning method and device, and an image source identification method and device.
  • Face recognition is one of the hot topics in the fields of pattern recognition, image processing, machine vision, neural networks and cognitive science. Face recognition generally refers to the computer technology that extracts discriminative visual features from a face image and uses them to determine the identity of the face. It can be divided into two categories: face identification and face verification. Face identification refers to determining the identity of a face image, that is, determining whose image a given face image is; face verification refers to determining whether the identity of a face image is that of a claimed person.
  • Existing face recognition technology usually involves two main research directions: feature learning and metric learning.
  • the purpose of feature learning is to transform the face image into a more separable and more discriminative form, while metric learning learns from training samples a metric model or metric function that evaluates the distance or similarity between samples. Among metric learning methods, the joint Bayesian face is in widespread use at present; it is derived from probabilistic discriminant analysis under a Gaussian assumption.
  • the main processes of face recognition include a training process and a recognition process.
  • the training process refers to solving the parameters of the similarity metric model using a face image training set; this process is also called the metric learning process, and the face image training set consists of face images and identity labels (identifying which images come from the same person and which from different people).
  • the recognition process refers to first collecting a face image registration set for querying.
  • the registration set usually consists of face images, identity labels and identity information; its source is generally single and its quality good. The features of the face image to be recognized are then compared with the features of the samples in the registration set, and the trained similarity metric model is used to calculate the similarity between the features of the face image to be recognized and the registered-image features, thereby determining the identity corresponding to the face image to be recognized.
  • the basic assumption of the joint Bayesian face is that the face samples x and y participating in the comparison obey the same Gaussian distribution. In a specific application, the image sources in the registration set are usually controllable, while the sources of the face images to be recognized are more complex and varied and their quality uneven, for example: video screenshots, scanned pictures, photo stickers, etc. That is, the images in the registration set and the images to be recognized may come from different sources, so the face samples participating in the comparison may not satisfy the requirement of obeying the same Gaussian distribution (also known as the asymmetric face situation).
  • in this case, existing face recognition technology usually cannot cope well, resulting in low recognition accuracy that cannot meet the needs of applications.
  • in recognition applications for other object images, the same problems arise due to differing image sources (i.e., asymmetric object images).
  • the embodiment of the present application provides an image recognition method and apparatus to solve the problem that existing image recognition technology has low accuracy when recognizing object images of variable source.
  • the embodiment of the present application further provides a metric learning method and apparatus, and an image source identification method and apparatus.
  • the application provides an image recognition method, including: acquiring an object image to be identified; extracting an object feature of the object image to be identified; and selecting, from a pre-trained set of metric models, a similarity metric model corresponding to the source category of the object image to be identified, and calculating the similarity between the object feature and a registered-image object feature, as the basis for outputting an object recognition result;
  • wherein the set of metric models includes at least one similarity metric model, and different similarity metric models respectively correspond to different source categories of object images.
  • optionally, each similarity metric model corresponding to a different source category in the set of metric models is trained using a reference object image training set belonging to a preset source category and comparison object image training sets corresponding to the different source categories.
  • optionally, the object images in the reference object image training set belong to the same source category as the registration images.
  • optionally, before the similarity metric model is selected, the source category of the object image to be identified is determined using a pre-trained object image source classification model, with the object feature as input.
  • optionally, the object image source classification model is a multi-class classification model trained by one of the following algorithms: a Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  • optionally, the similarity metric model includes: an asymmetric metric model established under the assumption that the object features participating in the comparison obey respective Gaussian distributions.
  • optionally, the asymmetric metric model includes: an asymmetric metric model based on the joint Bayesian face, trained by: extracting object features of each image in the reference object image training set belonging to the preset source category, as a reference feature sample set; extracting object features of each image in the comparison object image training set belonging to a specific source category, as a comparison feature sample set; establishing an asymmetric metric model containing parameters under the assumption that the object features participating in the comparison obey respective Gaussian distributions; and, according to the samples in the above two types of feature sample sets and the identity labels identifying whether samples belong to the same object, solving the parameters in the asymmetric metric model and completing the training of the model.
  • optionally, the asymmetric metric model corresponding to a particular source category is as follows:

    f(x, y) = x^T A x + y^T B y − 2 x^T G y

  • where the sample x = μ_x + ε_x, with μ_x and ε_x obeying zero-mean Gaussian distributions whose covariance matrices are S_xx and T_xx respectively; the sample y = μ_y + ε_y, with μ_y and ε_y obeying zero-mean Gaussian distributions whose covariance matrices are S_yy and T_yy respectively; S_xy and S_yx are the cross-covariance matrices between X and Y; f(x, y) is the similarity calculated on the basis of the intra-class/inter-class log likelihood ratio; and A, B and G are parameters obtained from the above covariance matrices.
  • the solving of the parameters in the asymmetric metric model includes: solving S_xx, T_xx, S_yy, T_yy, S_xy, and S_yx.
  • optionally, the solving of the parameters in the asymmetric metric model includes: estimating the parameters in the model by means of divergence matrices, or iteratively solving the parameters in the model using an expectation maximization algorithm.
  • optionally, the calculating of the similarity between the object feature and the registered-image object feature includes: calculating the similarity between the object feature and a registered-image object feature corresponding to a specific identity; or,
  • the calculating of the similarity between the object feature and the registered-image object feature includes: calculating the similarities between the object feature and the registered-image object features within a specified range.
  • optionally, the extracting of the object feature of the object image to be identified includes: extracting the object feature using a local binary pattern algorithm, extracting the object feature using a Gabor wavelet transform algorithm, or extracting the object feature using a deep convolutional network.
  • the object image to be identified includes: a face image to be recognized; and the object feature includes: a face feature.
  • optionally, the source categories include: an ID photo, a life photo, a video screenshot, a scanned image, a remake image, or a monitoring image.
  • an image recognition apparatus including:
  • An image obtaining unit configured to acquire an object image to be identified
  • a feature extraction unit configured to extract an object feature of the object image to be identified
  • a similarity calculation unit configured to select, from a pre-trained set of metric models, a similarity metric model corresponding to the source category of the object image to be identified, and to calculate the similarity between the object feature and the registered-image object feature, as the basis for outputting the object recognition result;
  • the similarity calculation unit includes:
  • a metric model selection subunit configured to select, from the pre-trained metric model set, a similarity metric model corresponding to the source category of the object image to be identified;
  • the calculation execution subunit is configured to calculate the similarity between the object feature and the registered image object feature by using the similarity measure model selected by the measurement model selection subunit as a basis for outputting the object recognition result.
  • optionally, the device further includes: a metric model training unit configured to train, using a reference object image training set belonging to a preset source category and comparison object image training sets corresponding to different source categories, the respective similarity metric models corresponding to the different source categories in the set of metric models.
  • optionally, the device further includes: a source category determining unit configured to determine, with the object feature as input and before the similarity calculation unit is triggered, the source category of the object image to be identified using the pre-trained object image source classification model.
  • optionally, the device further includes: a source classification model training unit configured to train the object image source classification model, before the source category determining unit is triggered, using one of the following algorithms: a Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  • optionally, the device further includes: a metric model training unit configured to train each similarity metric model in the set of metric models, the similarity metric model comprising: an asymmetric metric model established on the basis of the joint Bayesian face, under the assumption that the object features participating in the comparison obey respective Gaussian distributions;
  • the metric model training unit trains the above asymmetric metric model corresponding to a particular source category by means of the following subunits:
  • a reference sample extraction subunit configured to extract object features of each image in the reference object image training set belonging to the preset source category, as a reference feature sample set;
  • a comparison sample extraction subunit configured to extract object features of each image in the comparison object image training set belonging to the particular source category, as a comparison feature sample set;
  • a metric model establishing subunit configured to establish an asymmetric metric model containing parameters, under the assumption that the object features participating in the comparison obey respective Gaussian distributions;
  • the model parameter solving subunit is configured to solve the parameters in the asymmetric metric model according to the samples in the two types of feature sample sets and the identity tags that identify whether the samples belong to the same object, and complete the training of the model.
  • the model parameter solving subunit is specifically configured to estimate a parameter in the model by using a divergence matrix, or iteratively solve a parameter in the model by using an expectation maximization algorithm.
  • the calculation execution subunit is specifically configured to calculate a similarity between the object feature and a registered image object feature corresponding to a specific identity
  • the device also includes:
  • a first threshold comparison unit configured to determine whether the similarity is greater than a preset threshold
  • a first recognition result output unit configured to determine, when the output of the first threshold comparison unit is YES, that the object image to be identified and the registration image corresponding to the specific identity belong to the same object, and to output this determination as the object recognition result.
  • the calculation execution subunit is specifically configured to calculate a similarity between the object feature and a registered image object feature within a specified range
  • the device also includes:
  • a second threshold comparison unit configured to determine whether a maximum value of the calculated similarities is greater than a preset threshold
  • a second recognition result output unit configured to determine, when the output of the second threshold comparison unit is YES, that the object image to be identified is successfully matched among the registered images within the specified range, and to output the relevant identity information of the registered image corresponding to the maximum value as the object recognition result.
  • optionally, the feature extraction unit is specifically configured to extract the object feature using a local binary pattern algorithm, extract the object feature using a Gabor wavelet transform algorithm, or extract the object feature using a deep convolutional network.
  • the application also provides a metric learning method, including: extracting object features of each image in a reference object image training set belonging to the same source category, as a reference feature sample set; extracting object features of each image in a comparison object image training set belonging to the same source category, different from that of the reference object images, as a comparison feature sample set; establishing an asymmetric metric model containing parameters under the assumption that the object features participating in the comparison obey respective Gaussian distributions; and
  • solving the parameters in the asymmetric metric model using the samples in the above two types of feature sample sets.
  • the asymmetric metric model includes: an asymmetric metric model based on joint Bayesian faces;
  • optionally, the asymmetric metric model is as follows:

    f(x, y) = x^T A x + y^T B y − 2 x^T G y

  • where the sample x = μ_x + ε_x, with μ_x and ε_x obeying zero-mean Gaussian distributions whose covariance matrices are S_xx and T_xx respectively; the sample y = μ_y + ε_y, with μ_y and ε_y obeying zero-mean Gaussian distributions whose covariance matrices are S_yy and T_yy respectively; S_xy and S_yx are the cross-covariance matrices between X and Y; and f(x, y) is the similarity calculated on the basis of the intra-class/inter-class log likelihood ratio;
  • the solving of the parameters in the asymmetric metric model includes: solving S_xx, T_xx, S_yy, T_yy, S_xy, and S_yx.
  • optionally, the solving of the parameters in the asymmetric metric model includes: iteratively solving the parameters in the model using an expectation maximization algorithm.
  • the reference object image and the comparison object image comprise: a face image; and the object feature comprises: a face feature.
  • the application further provides a metric learning device, including:
  • a reference sample extraction unit configured to extract object features of each image in the reference object image training set belonging to the same source category, as a reference feature sample set
  • a comparison sample extraction unit configured to extract object features of each image in the comparison object image training set belonging to the same source category, different from that of the reference object images, as a comparison feature sample set;
  • An asymmetric metric model establishing unit is configured to establish an asymmetric metric model including parameters under the assumption that the object features of the participating alignments obey the respective Gaussian distribution;
  • the metric model parameter solving unit is configured to solve the parameters in the asymmetric metric model by using the samples in the two types of feature sample sets.
  • the metric model established by the asymmetric metric model establishing unit comprises: an asymmetric metric model based on joint Bayesian faces.
  • the metric model parameter solving unit is specifically configured to estimate a parameter in the model by using a divergence matrix, or iteratively solve a parameter in the model by using an expectation maximization algorithm.
  • the application also provides an image source identification method, including: collecting object image sets belonging to different source categories, and extracting object features from them to form a training sample set; training an object image source classification model using the object feature samples in the training sample set and their source categories; extracting an object feature from the object image to be classified; and, taking the extracted object feature as input, using the object image source classification model to identify the source category of the object image to be classified.
  • optionally, the object image source classification model is a multi-class classification model trained by one of the following algorithms: a Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  • the object image includes: a face image; and the object feature includes: a face feature.
  • an image source identification device including:
  • a training sample collection unit for collecting object image sets belonging to different source categories, and extracting object features to form a training sample set
  • a classification model training unit configured to use the object feature sample in the training sample set and its source category to train an image source classification model
  • a to-be-classified feature extraction unit configured to extract object features from the object image to be classified;
  • the source category identifying unit is configured to use the object feature extracted by the to-be-classified feature extracting unit as an input, and use the object image source classification model to identify a source category of the object image to be classified.
  • the object image classification model includes: a multi-class classification model
  • the classification model training unit is specifically configured to train the object image classification model by using a Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  • the image recognition method provided by the present application first acquires an object image to be identified and extracts an object feature of it; it then selects, from a pre-trained set of metric models, the similarity metric model corresponding to the source category of the object image to be identified, and calculates the similarity between the object feature and the registered-image object feature as the basis for outputting the object recognition result.
  • because image recognition with this method does not use a single similarity metric model, but selects a pre-trained similarity metric model corresponding to the source category of the object image to be identified, the asymmetric object image recognition problem can be handled effectively, and the recognition of object images of variable source is more robust and more accurate.
  • the metric learning method provided by the present application establishes an asymmetric metric model containing parameters, under the assumption that the features participating in the comparison obey respective Gaussian distributions, and solves the parameters in the asymmetric metric model using sets of object image feature samples from different sources, thus completing the construction of the asymmetric metric model.
  • the method modifies the assumption made in traditional image recognition technology: the two object samples x and y participating in the comparison may each obey their own Gaussian distribution without sharing parameters. On this basis, a similarity metric model for identifying asymmetric objects is learned from sample sets of different source categories, providing a basis for high-performance object recognition adapted to various image sources.
  • the image source identification method provided by the present application first extracts object features from object image sets belonging to different source categories to form a training sample set, and trains an object image source classification model using the object feature samples in the training sample set and their source categories; it then takes the object feature extracted from the object image to be classified as input and uses the object image source classification model to identify the source category of the object image to be classified.
  • the method can effectively identify the source category of an object image, thereby providing a basis for selecting the correct similarity metric model in the object recognition process and ensuring the correctness of the recognition result.
  • FIG. 1 is a flow chart of an embodiment of an image recognition method provided by the present application.
  • FIG. 2 is a schematic diagram of a training process of a metric model set provided by an embodiment of the present application.
  • FIG. 3 is a process flowchart of training an asymmetric metric model provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of performing face recognition using a metric model set provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an embodiment of an image recognition apparatus provided by the present application.
  • FIG. 6 is a flowchart of an embodiment of a metric learning method provided by the present application.
  • FIG. 7 is a schematic diagram of an embodiment of a metric learning device provided by the present application.
  • FIG. 8 is a flowchart of an embodiment of an image source identification method provided by the present application.
  • FIG. 9 is a schematic diagram of an embodiment of an image source identification device provided by the present application.
  • it should be noted that the application field of the technical solution of the present application is not limited to face recognition; the technical solution provided by the present application can also be used in recognition applications for other object images.
  • existing image recognition technology generally does not consider the source of the object image and uses a single similarity metric model for recognition. Aiming at the situation where the object images to be identified have complex sources and uneven quality, the technical solution of the present application proposes a new idea for image recognition: similarity metric models corresponding to different source categories are pre-trained, and in a specific application the similarity metric model corresponding to the source category of the object image to be identified is selected for recognition. In this way the recognition problem of asymmetric object images can be handled, and the recognition of object images belonging to different source categories is more robust and more accurate.
  • an object image generally refers to an image whose main display content (for example, the foreground image constituting the image subject) is an object such as a face or one of various articles.
  • object images from different sources usually refer to images whose object features follow different data distributions due to different acquisition methods or different acquisition devices. Different sources may include: video screenshots, scanned images, remake images, and the like.
  • in the following embodiments, face image recognition is mainly described as an example.
  • FIG. 1 is a flowchart of an embodiment of an image recognition method of the present application. The method includes the following steps:
  • Step 101 Train a similarity measure model corresponding to different source categories to form a set of measurement models.
  • various source categories include, but are not limited to, a certificate photo, a life photo, a video screenshot, a scanned image, a remake image, or a monitoring screen.
  • in this embodiment, the similarity metric models corresponding to different source categories may be trained first, and all the trained similarity metric models together constitute a set of metric models; each member of the set, i.e. each similarity metric model, corresponds to a different source category of face images.
  • given two face samples x and y, a similarity metric model is used to evaluate the similarity between the two. It can usually be represented by a metric function f(x, y, P), where P denotes the parameters of the model.
  • the purpose of training is to solve the parameters P of the metric model on the basis of a given training set; once P is determined, the model is trained.
  • the training process can be repeated multiple times to obtain multiple metric functions, each metric function being applied to face images of a different source category.
  • the training set consists of three parts: a reference face image training set X belonging to a preset source category, which serves as the training benchmark; a comparison face image training set Y corresponding to a specific source category; and identity labels Z identifying which images come from the same person and which from different people. The overall training loop is sketched below.
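  • the following is a purely schematic Python sketch of that loop; `train_asymmetric_metric` is a hypothetical helper standing for steps 101-1 to 101-4 described below, and all names are illustrative rather than the patent's own:

```python
def train_metric_model_set(X_feats, X_ids, comparison_sets):
    """X_feats/X_ids: features and person ids of the reference set X.
    comparison_sets: {source_category: (Y_feats, Y_ids)} -- one comparison
    training set Y per source category to be supported.
    Returns {source_category: trained metric model parameters P}."""
    model_set = {}
    for category, (Y_feats, Y_ids) in comparison_sets.items():
        # Hypothetical per-category training procedure (steps 101-1..101-4).
        model_set[category] = train_asymmetric_metric(X_feats, X_ids,
                                                      Y_feats, Y_ids)
    return model_set
```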
  • FIG. 2 is a schematic diagram of a training process of a metric model set.
  • a similarity metric model may be established using different algorithms. In this embodiment, the similarity metric model is established on the basis of the commonly used joint Bayesian face, and the established model is called an asymmetric metric model.
  • the process of training the asymmetric metric model is further described below with reference to FIG. 3, which includes:
  • Step 101-1 Extract face features of each image in the reference face image training set belonging to the preset source category as a reference feature sample set.
  • the face images in the reference face image training set X serving as the training benchmark are usually collected under a controllable environment; the preset source category may be, for example, ID photos, or another source category whose image quality is generally good.
  • after the reference face image training set is collected, the face features of each image may be extracted as samples, the so-called face samples, and all the samples together constitute the reference feature sample set. For how to extract face features, see the description in step 103 below.
  • Step 101-2 Extract facial features of each image of the matching face image training set belonging to the specific source category as a comparison feature sample set.
  • the specific source category is usually different from the source category of the reference face image training set X; for example, if X consists of ID photos taken in a controllable environment, the face images in the comparison face image training set Y may be photos taken in an uncontrollable environment.
  • after the comparison face image training set is collected, the face features of each image may be extracted as samples, and all the samples together constitute the comparison feature sample set. For how to extract face features, see the description in step 103 below.
  • Step 101-3 Under the assumption that the face features participating in the comparison obey the respective Gaussian distribution, an asymmetric metric model including parameters is established.
  • This embodiment improves on the basis of the traditional joint Bayesian face and establishes an asymmetric metric model. For ease of understanding, a brief description of the Bayesian face and the joint Bayesian face is given first.
  • Bayesian face is usually short for the classic Bayesian face recognition method. This method uses the difference between the features of two face images as the pattern vector: if the two images belong to the same person, the difference is called an intra-class mode, otherwise an inter-class mode. This converts the multi-class problem of face recognition into a two-class problem. For any two face samples x and y, if the log likelihood ratio obtained on the basis of the intra-class/inter-class modes is greater than a preset threshold, they can be judged to be the same person.
  • the joint Bayesian face builds on the Bayesian face and establishes a two-dimensional model of the joint probability distribution of x and y, representing each face sample as the sum of two independent latent variables: variation between different faces plus variation within the same face. A similarity metric model based on the log likelihood ratio is then obtained by training on a large number of samples. It should be noted that although both of the above Bayesian face techniques were proposed for face image recognition, they can also be applied to the recognition of other object images.
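  • for reference, the classic joint Bayesian face formulation can be summarized as follows (standard notation from the published method, not the patent's own symbols):

```latex
x = \mu + \varepsilon,\qquad \mu \sim N(0, S_\mu),\qquad \varepsilon \sim N(0, S_\varepsilon)
r(x, y) = \log \frac{P(x, y \mid H_I)}{P(x, y \mid H_E)}
```

  • here H_I and H_E denote the intra-class (same person) and inter-class (different people) hypotheses, and both samples share the same S_μ and S_ε; it is exactly this shared-distribution assumption that the asymmetric model below relaxes.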
  • the recognition accuracy of the joint Bayesian face is higher than that of the classic Bayesian face, but its basic assumption is that the face samples x and y participating in the comparison obey the same Gaussian distribution. In a specific application, the sources of the images in the registration set are usually controllable, while the sources of the face images to be recognized are more complex and their quality uneven; that is, the face samples participating in the comparison may not satisfy the requirement of obeying the same Gaussian distribution.
  • the joint Bayesian face technique usually does not handle this situation well, and its recognition accuracy is low.
  • to address this, the inventor of the present application proposes an asymmetric metric model, and a metric learning method that trains it using face image training sets of different source categories, based on a modification of the joint Bayesian face hypothesis. It is called an "asymmetric" metric model because the two face samples compared by the model may come from face images of different source categories; and because the data-distribution differences caused by the different source categories are taken into account in the modeling, more accurate face recognition results can be obtained from the similarity estimated by the model.
  • in the asymmetric metric model, a sample x from the reference source category is represented as x = μ_x + ε_x, where the identity variable μ_x and the intra-person variation ε_x obey zero-mean Gaussian distributions with covariance matrices S_xx and T_xx respectively; likewise, a sample y from the comparison source category is represented as y = μ_y + ε_y, with μ_y and ε_y obeying zero-mean Gaussian distributions with covariance matrices S_yy and T_yy. Since x and y are sums of zero-mean Gaussian latent variables, their joint distribution also obeys a Gaussian distribution.
  • concatenating the X and Y spaces, a sample pair is represented as {x, y}; the mean of this joint random variable is still 0, and its covariance is analyzed in two cases. Under the intra-class hypothesis (x and y belong to the same person), the joint covariance is

    Σ_I = [ S_xx + T_xx    S_xy
            S_yx           S_yy + T_yy ]

    and under the inter-class hypothesis (different people), x and y are independent, so

    Σ_E = [ S_xx + T_xx    0
            0              S_yy + T_yy ]

    where S_xy and S_yx are the cross-covariance matrices between X and Y. The log likelihood ratio of the two hypotheses then yields, up to an additive constant, the quadratic similarity of Equation 1: f(x, y) = x^T A x + y^T B y − 2 x^T G y.
  • Step 101-4 According to the samples in the above two types of feature sample sets, and the identity labels identifying whether samples belong to the same person, solve the parameters in the asymmetric metric model and complete the training of the model.
  • the main task of training the asymmetric metric model is to solve the parameters A, B and G in the model expression shown in Equation 1, and the derivation in step 101-3 shows that these three parameters can be obtained from S_xx, T_xx, S_yy, T_yy, S_xy and S_yx through specific operations. Therefore, the core of training the asymmetric metric model is to solve the above covariance matrices and cross-covariance matrices; this is sketched below.
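  • under the covariance analysis of step 101-3, A, B and G can be obtained from the six matrices as in the following numpy sketch; the additive constant of the log likelihood ratio is dropped since it does not affect threshold comparisons, and all names are illustrative rather than the patent's own:

```python
import numpy as np

def solve_abg(S_xx, T_xx, S_yy, T_yy, S_xy):
    """Derive the quadratic-form parameters A, B, G of Equation 1 from the
    covariance matrices (S_yx = S_xy.T)."""
    dx = S_xx.shape[0]
    # Joint covariance under the intra-class (same person) hypothesis.
    sigma_i = np.block([[S_xx + T_xx, S_xy],
                        [S_xy.T,      S_yy + T_yy]])
    inv_i = np.linalg.inv(sigma_i)
    P, Q, R = inv_i[:dx, :dx], inv_i[:dx, dx:], inv_i[dx:, dx:]
    # Inter-class hypothesis: x, y independent, block-diagonal covariance.
    A = np.linalg.inv(S_xx + T_xx) - P
    B = np.linalg.inv(S_yy + T_yy) - R
    G = Q
    return A, B, G

def asym_similarity(x, y, A, B, G):
    """f(x, y) = x'Ax + y'By - 2 x'Gy, the similarity of Equation 1
    up to constant scale and offset."""
    return x @ A @ x + y @ B @ y - 2.0 * (x @ G @ y)
```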
  • in this embodiment, the face samples in the reference feature sample set X and the comparison feature sample set Y are used to solve the above parameters by estimating divergence (scatter) matrices, which is described in detail below.
  • for the reference feature sample set X, let C be the number of classes (face samples belonging to the same person form one class), n_i the number of samples in the i-th class, x_ij the j-th sample of the i-th class, m_x the mean of all samples, and m_x,i the mean of the i-th class. The between-class divergence matrix then estimates the identity covariance,

    S_xx ≈ (1/C) Σ_i (m_x,i − m_x)(m_x,i − m_x)^T

    and the within-class divergence matrix estimates the intra-person variation covariance,

    T_xx ≈ (1/N) Σ_i Σ_j (x_ij − m_x,i)(x_ij − m_x,i)^T

    where N is the total number of samples.
  • S_yy and T_yy are estimated in the same way from the comparison feature sample set Y, with m_y the mean of all its samples and m_y,i the mean of its i-th class; the cross-covariance S_xy can be estimated from the class means of the two sets paired by identity, with S_yx = S_xy^T.
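  • the following is a minimal numpy sketch of this divergence-matrix estimation; the normalizations follow the formulas above, the pairing of class means used for S_xy is one plausible realization (the patent text does not spell it out), and all function names are illustrative:

```python
import numpy as np

def estimate_covariances(feats, labels):
    """Estimate identity (between-class) and variation (within-class)
    covariances for one feature sample set via divergence matrices.
    feats: (n, d) feature array; labels: (n,) person identities."""
    m_all = feats.mean(axis=0)
    d = feats.shape[1]
    S = np.zeros((d, d))   # between-class scatter -> S_xx (identity)
    T = np.zeros((d, d))   # within-class scatter  -> T_xx (variation)
    classes = np.unique(labels)
    for c in classes:
        xc = feats[labels == c]
        mc = xc.mean(axis=0)
        diff = (mc - m_all)[:, None]
        S += diff @ diff.T                 # unweighted class-mean scatter
        T += (xc - mc).T @ (xc - mc)
    return S / len(classes), T / len(feats)

def estimate_cross_covariance(fx, lx, fy, ly):
    """Estimate S_xy from class means of the two sets, paired by person;
    only persons present in both sets contribute. S_yx = S_xy.T."""
    mx, my = fx.mean(axis=0), fy.mean(axis=0)
    common = np.intersect1d(np.unique(lx), np.unique(ly))
    S_xy = np.zeros((fx.shape[1], fy.shape[1]))
    for c in common:
        dx = fx[lx == c].mean(axis=0) - mx
        dy = fy[ly == c].mean(axis=0) - my
        S_xy += dx[:, None] @ dy[None, :]
    return S_xy / len(common)
```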
  • after the above covariance matrices and cross-covariance matrices are solved, the values of the parameters A, B and G can be further calculated according to the derivation in step 101-3; substituting these parameter values into Equation 1 yields the trained asymmetric metric model.
  • in this embodiment, the parameters of the asymmetric metric model are solved by estimating divergence matrices over a large number of face samples. In other embodiments, the expectation maximization algorithm adopted by the traditional joint Bayesian face may also be used to solve the parameters in the model through multiple rounds of iteration; this can likewise implement the technical solution of the present application.
  • this embodiment establishes similarity metric models corresponding to different source categories by modifying the hypothesis underlying the joint Bayesian face. In other embodiments, other methods or techniques may also be used to establish the similarity metric models, for example Canonical Correlation Analysis (CCA), Asymmetric Deep Metric Learning (ADML), or Multimodal Restricted Boltzmann Machines. Whichever algorithm or technique is used, establishing and training corresponding similarity metric models for object images of different sources does not depart from the core of the present application, and falls within its protection scope.
  • Step 102 Acquire an image of a face to be recognized.
  • the face image to be recognized generally refers to a face image whose identity is to be determined; it is generally collected in an uncontrollable environment and has many possible source categories, which may include: a life photo, a remake poster, a remake TV image, a monitoring image, a scanned image, etc.
  • the face image to be recognized can be obtained in various ways, for example: photographed by a camera or a mobile terminal device, downloaded from an Internet resource database, scanned by a scanner, or received from a client (for example, a mobile terminal device or a desktop computer) that uploads the face image to be recognized by wire or wirelessly.
  • Step 103 Extract a facial feature of the face image to be recognized.
  • in specific implementations, the face feature can be extracted directly from the face image to be recognized. Alternatively, the specific location of the face can first be detected against the image background, for example using a skin-color-based detection method, a shape-based detection method, or a statistical-theory-based detection method, to determine where the face lies in the image, and the face feature is then extracted from the face image at that location.
  • the process of extracting features is a process of converting a face image into a vector.
  • This vector is called a face feature.
  • the face feature has strong discriminative power on face images from different people and is robust to external interference factors.
  • in specific implementations, various feature extraction methods can be used, such as Local Binary Patterns (LBP), the Gabor wavelet transform algorithm, and deep convolutional networks, weighing recognition accuracy against execution performance as required.
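  • as one concrete possibility for the LBP option above, the following sketch uses scikit-image; the grid size and LBP parameters are illustrative assumptions, not values taken from the patent:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_feature(gray_face, P=8, R=1.0, grid=(7, 7)):
    """Compute uniform LBP codes over a grayscale face image, then
    concatenate per-cell code histograms into one feature vector."""
    codes = local_binary_pattern(gray_face, P, R, method="uniform")
    n_bins = P + 2                      # uniform LBP yields P + 2 code values
    h, w = codes.shape
    ch, cw = h // grid[0], w // grid[1]
    hists = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            cell = codes[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw]
            hist, _ = np.histogram(cell, bins=n_bins, range=(0, n_bins),
                                   density=True)
            hists.append(hist)
    return np.concatenate(hists)        # face feature vector
```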
  • Step 104 Determine, by using a pre-trained face image source classification model, a source category of the to-be-recognized face image.
  • in specific implementations, the source category of the face image to be recognized may be determined according to the manner in which the image was obtained in step 102: for example, a face image obtained by taking an ordinary photo with a camera has the source category of a life photo, and if the face image was acquired by scanning with a scanner, its source category is a scanned image. If the face image to be recognized carries information describing its source category, the source category may also be determined according to that information.
  • otherwise, the method described in this step may be adopted: the source category of the face image to be recognized is determined using the face image source classification model.
  • in specific implementations, the face image source classification model is a multi-class classification model (also referred to as a multi-class classifier), and may be pre-trained before this step is performed; for example, this embodiment uses the Softmax regression algorithm to train the classification model. The training process is further explained below.
  • first, face image sets belonging to K different source categories are collected, and a face feature is extracted from each face image to form a training sample set; each sample in the training sample set consists of two parts: a face feature and its source category label. For a given face feature x, the probability of belonging to the k-th class is as follows:

    p(z = k | x; θ) = exp(θ_k^T x) / Σ_{j=1..K} exp(θ_j^T x)

  • where θ = (θ_1, ..., θ_K) are the parameters of the model, which can be solved by minimizing the following objective function over the m training samples:

    J(θ) = −(1/m) Σ_{i=1..m} Σ_{k=1..K} 1{z_i = k} · log p(z_i = k | x_i; θ)
  • after training, the face feature of the image to be recognized is taken as input, and the trained model outputs the probability of each of the K source categories; the source category corresponding to the maximum probability is the source category to which the face image to be recognized belongs.
  • in this embodiment, the face image source classification model is implemented using the Softmax algorithm. In other embodiments, other methods different from the foregoing algorithm may also be used, for example a multi-class SVM algorithm or a random forest algorithm; these are likewise acceptable.
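  • a compact sketch of such a source classifier, using multinomial logistic regression (equivalent to Softmax regression); scikit-learn is one possible implementation, and all names are illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_source_classifier(feats, src_labels):
    """feats: (n, d) face features; src_labels: (n,) integers in 0..K-1,
    one per source category (e.g. ID photo, life photo, video screenshot).
    lbfgs minimizes the negative log likelihood objective given above."""
    clf = LogisticRegression(multi_class="multinomial", solver="lbfgs",
                             max_iter=500)
    clf.fit(feats, src_labels)
    return clf

def predict_source_category(clf, feat):
    probs = clf.predict_proba(feat.reshape(1, -1))[0]
    return int(np.argmax(probs))        # category of maximum probability
```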
  • Step 105 Select a similarity metric model corresponding to the source category of the to-be-recognized face image from the pre-trained metric model set, and calculate a similarity between the facial feature and the registered image facial feature as The basis for outputting face recognition results.
  • the registration images generally refer to the face images in the face image registration set used for querying in a specific application.
  • the images in the face image registration set are usually collected in a controllable environment; their source is usually single and their quality usually good, for example second-generation ID card photos or enrollment photos, and the scale of the set is relatively large, ranging from tens of thousands to tens of millions of images.
  • in specific implementations, the face image registration set and the reference face image training set used to train the similarity metric models in step 101 may use images of the same source category, for example both using ID photos.
  • after the registration set is collected, the face features of each face image may be extracted, and the face image, the face feature, and the corresponding identity label and identity information are stored in a registration image database.
  • the identity information generally refers to information capable of identifying an individual identity corresponding to a face image, such as a name, an identity ID, and the like.
  • the pre-trained set of metric models includes K similarity metric models, each similarity metric model corresponding to a different source category.
  • in this step, the corresponding similarity metric model is selected from the set of metric models; for example, if the source category of the face image to be recognized is a scanned image, then the similarity metric model pre-trained for the scanned-image source category is selected. The selected model is used to calculate the similarity between the face feature of the face image to be recognized and the registered-image face features, and finally the face recognition result is output according to the similarity.
  • FIG. 4 is a schematic diagram of the processing procedure in the specific example.
  • when this step calculates the similarity between the face feature and the registered-image face features, there are two different cases, which are described separately below.
  • face verification generally refers to determining whether the identity of a face image is that of a specific person. In this case, the identity information of the specific person, such as a digital identifier (identity ID) representing the identity, is usually provided together with the face image to be recognized. The registration image database can be queried according to the identity information to obtain the registered-image face feature corresponding to that identity, and the similarity between the face feature of the face image to be recognized and the registered-image face feature acquired from the database is then calculated. If the similarity is greater than a preset threshold, it can be determined that the face image to be recognized and the registration image belong to the same person, that is, the identity of the face image to be recognized is indeed the specific person, and this determination is output as the face recognition result.
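  • the verification flow can be sketched as follows; `similarity` stands for the metric model selected for the query image's source category (for example `asym_similarity` above with its trained A, B, G), and the remaining names are illustrative:

```python
def verify_face(query_feat, claimed_id, registry, similarity, threshold):
    """1:1 verification: registry maps identity ID -> registered face
    feature. Returns True iff the query matches the claimed identity."""
    reg_feat = registry[claimed_id]      # query the registration database
    score = similarity(query_feat, reg_feat)
    return score > threshold             # same person iff above threshold
```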
  • face identification generally refers to determining the identity of the face image to be recognized, that is, determining whose image it specifically is. In this case, this step may calculate the similarities between the face feature of the face image to be recognized and the registered-image face features within a specified range: for example, the comparison may be made one by one against the registered-image face features in a pre-established registration image database, or part of the registered-image face features in the registration image database may be selected according to a preset strategy for comparison, and the corresponding similarities calculated. If the maximum of the calculated similarities is greater than a preset threshold, the relevant identity information of the registered image corresponding to that maximum, for example its identity ID, or identity information such as a name, is output as the face recognition result.
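  • the identification flow over a registry (optionally pre-filtered by some strategy) can be sketched in the same illustrative terms:

```python
def identify_face(query_feat, registry, similarity, threshold):
    """1:N identification over {identity ID: registered feature}.
    Returns the best-matching identity, or None if even the maximum
    similarity does not exceed the threshold."""
    best_id, best_score = None, float("-inf")
    for identity, reg_feat in registry.items():   # compare one by one
        score = similarity(query_feat, reg_feat)
        if score > best_score:
            best_id, best_score = identity, score
    return best_id if best_score > threshold else None
```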
  • it should be noted that step 101 is the training process of the set of metric models. In specific implementations, each similarity metric model in the set can be used repeatedly once trained, and does not have to be retrained each time a face image to be recognized is acquired. In addition, step 104 is not always necessary: if the source category of the image to be recognized is known, or the image to be recognized itself carries a source category label, step 104 may be omitted.
  • the above embodiment takes face recognition as an example and details the specific implementation process of the image recognition method provided by the present application. In specific applications, the image recognition method provided by the present application can also be applied to the recognition of other object images (for example, images containing various articles); identification of luggage images is described below as an example.
  • in specific implementations, similarity metric models corresponding to different image source categories may be trained from a reference luggage image training set and comparison luggage image training sets corresponding to the different source categories. After the image of the luggage to be identified is acquired, the luggage feature is first extracted from it; a similarity metric model corresponding to the source category of the luggage image to be identified is then selected, the similarity between the luggage feature and the registered-image luggage features is calculated, and the recognition result for the luggage image to be identified is output according to the similarity: for example, whether the luggage image to be identified and the registration image corresponding to a specific identity belong to the same piece of luggage, or the relevant identity information of the luggage image to be identified.
  • identity information for articles such as luggage may typically include one or a combination of the following: manufacturer, brand information, model information, and the like.
  • in summary, the image recognition method provided by the present application does not adopt a single similarity metric model when performing object image recognition, but uses a pre-trained similarity metric model corresponding to the source category of the object image to be identified. It can therefore effectively handle the recognition problem of asymmetric object images, and the recognition of object images of variable source is more robust and more accurate.
  • FIG. 5 is a schematic diagram of an embodiment of an image recognition apparatus according to the present application. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.
  • the device embodiments described below are merely illustrative.
  • An image recognition apparatus of this embodiment includes: a metric model training unit 501 configured to train, using a reference object image training set belonging to a preset source category and comparison object image training sets corresponding to different source categories, the respective similarity metric models corresponding to the different source categories in the set of metric models; an image acquisition unit 502 configured to acquire the object image to be identified; a feature extraction unit 503 configured to extract the object feature of the object image to be identified; a source category determining unit 504 configured to determine, with the object feature as input, the source category of the object image to be identified using a pre-trained object image source classification model; and a similarity calculation unit 505 configured to select, from the pre-trained set of metric models, the similarity metric model corresponding to the source category of the object image to be identified, and to calculate the similarity between the object feature and the registered-image object feature as the basis for outputting the object recognition result;
  • the similarity calculation unit includes:
  • a metric model selection subunit configured to select, from the pre-trained metric model set, a similarity metric model corresponding to the source category of the object image to be identified;
  • the calculation execution subunit is configured to calculate the similarity between the object feature and the registered image object feature by using the similarity measure model selected by the measurement model selection subunit as a basis for outputting the object recognition result.
  • optionally, the device further includes: a source classification model training unit configured to train the object image source classification model, before the source category determining unit is triggered, using one of the following algorithms: a Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  • optionally, the metric model training unit is specifically configured to train asymmetric metric models corresponding to the different source categories, where each asymmetric metric model is a metric model established on the basis of the joint Bayesian face, under the assumption that the object features participating in the comparison obey respective Gaussian distributions;
  • the metric model training unit trains an asymmetric metric model corresponding to a particular source category by means of the following subunits:
  • a reference sample extraction subunit configured to extract object features of each image in the reference object image training set belonging to the preset source category, as a reference feature sample set;
  • a comparison sample extraction subunit configured to extract object features of each image in the comparison object image training set belonging to the particular source category, as a comparison feature sample set;
  • a metric model establishing subunit configured to establish an asymmetric metric model containing parameters, under the assumption that the object features participating in the comparison obey respective Gaussian distributions;
  • the model parameter solving subunit is configured to solve the parameters in the asymmetric metric model according to the samples in the two types of feature sample sets and the identity tags that identify whether the samples belong to the same object, and complete the training of the model.
  • the model parameter solving subunit is specifically configured to estimate a parameter in the model by using a divergence matrix, or iteratively solve a parameter in the model by using an expectation maximization algorithm.
  • the calculation execution subunit is specifically configured to calculate a similarity between the object feature and a registered image object feature corresponding to a specific identity
  • the device also includes:
  • a first threshold comparison unit configured to determine whether the similarity is greater than a preset threshold
  • a first recognition result output unit configured to determine, when the output of the first threshold comparison unit is YES, that the object image to be identified and the registration image corresponding to the specific identity belong to the same object, and to output this determination as the object recognition result.
  • the calculation execution subunit is specifically configured to calculate a similarity between the object feature and a registered image object feature within a specified range
  • the device also includes:
  • a second threshold comparison unit configured to determine whether a maximum value of the calculated similarities is greater than a preset threshold
  • a second recognition result output unit configured to determine, when the output of the second threshold comparison unit is YES, that the object image to be identified is successfully matched among the registered images within the specified range, and to output the relevant identity information of the registered image corresponding to the maximum value as the object recognition result.
  • optionally, the feature extraction unit is specifically configured to extract the object feature using a local binary pattern algorithm, extract the object feature using a Gabor wavelet transform algorithm, or extract the object feature using a deep convolutional network.
  • FIG. 6 is a flowchart of an embodiment of a metric learning method provided by the present application. The parts of the embodiment that are identical to the steps of the image recognition method embodiment are not described again. The differences are mainly described below.
  • a metric learning method provided by the present application includes:
  • Step 601 Extract object features of each image in the reference object image training set belonging to the same source category as a reference feature sample set.
  • Step 602 Extract object features of each image in the comparison object image training set, which belongs to the same source category, different from that of the reference object images, as the comparison feature sample set.
  • Step 603 Establish an asymmetric metric model including parameters under the assumption that the object features participating in the comparison obey the respective Gaussian distribution.
  • in this embodiment, the asymmetric metric model includes: an asymmetric metric model based on the joint Bayesian face; the asymmetric metric model is as follows:

    f(x, y) = x^T A x + y^T B y − 2 x^T G y

    with the samples and parameters as described in the image recognition method embodiment above.
  • Step 604 Solve the parameters in the asymmetric metric model using the samples in the two types of feature sample sets.
  • the samples in the two types of feature sample sets can be used to solve various parameters in the model by using an algorithm or a method corresponding to the established model.
  • the parameters in the model can be estimated using the divergence matrix according to the samples in the two types of feature sample sets and the identity tag information identifying whether the samples belong to the same object.
  • the metric learning method provided in this embodiment may be used to learn a similarity metric model for asymmetric face images, in which case the reference object images and the comparison object images include: face images; and the object features include: face features. In specific implementations, the metric learning method provided by this embodiment may also be used to learn the similarity metric models of other asymmetric object images.
  • the metric learning method provided by the present application modifies the assumption made in traditional image recognition technology: the two object samples x and y participating in the comparison may each obey their own Gaussian distribution without sharing parameters. On this basis, a similarity metric model for identifying asymmetric objects is learned from sample sets belonging to different source categories, thereby providing a basis for high-performance object recognition adapted to a variety of image sources.
  • FIG. 7 is a schematic diagram of an embodiment of a metric learning device of the present application. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.
  • the device embodiments described below are merely illustrative.
  • the metric learning device of this embodiment includes: a reference sample extracting unit 701 configured to extract object features of each image in the reference object image training set belonging to the same source category, as a reference feature sample set; a comparison sample extracting unit 702 configured to extract object features of each image in the comparison object image training set belonging to the same source category, different from that of the reference object images, as a comparison feature sample set; an asymmetric metric model establishing unit 703 configured to establish an asymmetric metric model containing parameters under the assumption that the object features participating in the comparison obey respective Gaussian distributions; and a metric model parameter solving unit 704 configured to solve the parameters in the asymmetric metric model using the samples in the two types of feature sample sets.
  • the metric model established by the asymmetric metric model establishing unit comprises: an asymmetric metric model based on joint Bayesian faces.
  • the metric model parameter solving unit is specifically configured to estimate a parameter in the model by using a divergence matrix, or iteratively solve a parameter in the model by using an expectation maximization algorithm.
  • FIG. 8 is a flowchart of an embodiment of an image source identification method provided by the present application. The same parts of the embodiment are the same as those of the foregoing embodiment, and the differences are described below.
  • An image source identification method provided by the present application includes:
  • Step 801 Collect object image sets belonging to different source categories, and extract object features from them to form a training sample set.
  • Step 802 Train the object image classification model by using the object feature sample and the source category in the training sample set.
  • the object image classification model is usually a multi-class classification model.
  • the object image classification model may be trained by using the following algorithm: Softmax algorithm, multi-class SVM algorithm, or random forest algorithm.
  • Step 803 Extract the object feature from the object image to be classified.
  • Step 804 Taking the object feature extracted as an input, and using the object image source classification model to identify a source category of the object image to be classified.
  • the image source identification method provided in this embodiment may be used to identify the source category of face images, in which case the object images include: face images; the object features include: face features; and the pre-trained object image source classification model refers to a face image source classification model. In specific implementations, this method can also be used to identify the source categories of other object images.
  • the image source identification method provided by the present application can effectively identify the source category of the object image, thereby providing a basis for selecting a correct similarity measure model in the object image recognition process, and ensuring the correctness of the recognition result.
  • the above describes an image source identification method; correspondingly, the present application further provides an image source identification device.
  • FIG. 9 is a schematic diagram of an embodiment of an image source identification device according to the present application. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.
  • the device embodiments described below are merely illustrative.
  • An image source identification device of this embodiment includes: a training sample collection unit 901 configured to collect object image sets belonging to different source categories and extract object features from them to form a training sample set; a classification model training unit 902 configured to train the object image source classification model using the object feature samples in the training sample set and their source categories; a to-be-classified feature extraction unit 903 configured to extract the object feature from the object image to be classified; and a source category identification unit 904 configured to identify, with the object feature extracted by the to-be-classified feature extraction unit as input, the source category of the object image to be classified using the object image source classification model.
  • the object image classification model includes: a multi-class classification model
  • optionally, the classification model training unit is specifically configured to train the object image source classification model using a Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer readable media include both permanent and non-permanent, removable and non-removable media; information storage may be implemented by any method or technology.
  • the information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory, or other Memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassette, magnetic tape storage or other magnetic storage device or any other non-transportable medium, available for Stores information that can be accessed by the computing device.
  • As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.


Abstract

Disclosed are an image recognition method and device, a metric learning method and device, and an image source identification method and device. The image recognition method includes: acquiring an object image to be recognized; extracting an object feature from the object image to be recognized; selecting, from a pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized, and computing the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result; wherein the metric model set contains at least one similarity metric model, and different similarity metric models respectively correspond to different source categories of object images. Because recognition with this method does not rely on a single similarity metric model, it can effectively handle asymmetric object image recognition, and it is more robust and more accurate when recognizing object images of varied origin.

Description

Image recognition method, metric learning method, image source identification method, and devices

This application claims priority to Chinese Patent Application No. 201510490041.8, entitled "Image recognition method, metric learning method, image source identification method and devices" and filed on August 11, 2015, the entire contents of which are incorporated herein by reference.
Technical Field

The present application relates to pattern recognition, and in particular to an image recognition method and device. The present application also provides a metric learning method and device, and an image source identification method and device.
Background

Face recognition has in recent years been one of the most active research topics in pattern recognition, image processing, machine vision, neural networks, and cognitive science. Face recognition usually refers to computer techniques that extract discriminative visual features from a face image and use them to determine the identity of the face. It falls into two categories: face identification and face verification. Face identification determines whose image a given face image is; face verification determines whether the identity of a face image is that of a claimed person.

Existing face recognition techniques typically involve two main research directions: feature learning and metric learning. Feature learning aims to transform a face image into a more separable, more discriminative form, while metric learning learns, from training samples, a metric model or metric function for evaluating the distance or similarity between samples. Among metric learning methods, the Joint Bayesian face, derived from probabilistic discriminant analysis under a Gaussian assumption, is currently in wide use.

Face recognition comprises two main processes: training and recognition. The training process uses a face image training set to solve for the parameters of a similarity metric model (a process also called metric learning); the training set consists of face images together with identity labels indicating which images come from the same person and which from different people. The recognition process first collects a face image registration set for querying; the registration set usually consists of face images, identity labels, and identity information, generally comes from a fairly uniform source, and is of good quality. The features of the face image to be recognized are then compared with the features of the registration-set samples: the trained similarity metric model computes the similarity between the features of the face image to be recognized and the features of the registered images, from which the identity corresponding to the face image to be recognized is determined.

The basic assumption of the Joint Bayesian face is that the compared face samples x and y follow the same Gaussian distribution. In specific applications, however, while the sources of registration-set images are usually controlled, the sources of face images to be recognized are varied and of uneven quality, e.g., video screenshots, scanned pictures, or photo-booth stickers. The registration images and the images to be recognized may thus come from different sources, so the compared face samples may not satisfy the same-Gaussian requirement (so-called asymmetric faces). Existing face recognition techniques usually handle this case poorly, yielding low recognition accuracy that cannot meet application needs. The same problem, caused by differing image sources (i.e., asymmetric object images), also arises in recognition applications for other object images.
Summary of the Invention

Embodiments of the present application provide an image recognition method and device to solve the problem that existing image recognition techniques have low recognition accuracy for object images of varied origin. Embodiments of the present application also provide a metric learning method and device, and an image source identification method and device.

The present application provides an image recognition method, comprising:

acquiring an object image to be recognized;

extracting an object feature from the object image to be recognized;

selecting, from a pre-trained metric model set, a similarity metric model corresponding to the source category of the object image to be recognized, and computing the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result;

wherein the metric model set contains at least one similarity metric model, and different similarity metric models respectively correspond to different source categories of object images.
Optionally, the similarity metric models of the metric model set corresponding to different source categories are trained using a reference object image training set belonging to a preset source category together with comparison object image training sets corresponding to the different source categories.

Optionally, the object images in the reference object image training set and the registered images belong to the same source category.

Optionally, before the step of selecting, from the pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized, the following operation is performed:

taking the object feature as input, determining the source category of the object image to be recognized using a pre-trained object image source classification model.

Optionally, the object image source classification model is a multi-class classification model trained with one of the following algorithms: the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
Optionally, the similarity metric model comprises: an asymmetric metric model built under the assumption that the compared object features follow their own respective Gaussian distributions.

Optionally, the asymmetric metric model comprises: an asymmetric metric model based on the Joint Bayesian face;

the asymmetric metric model corresponding to a particular source category is trained by the following steps:

extracting the object features of the images in a reference object image training set belonging to a preset source category, as a reference feature sample set;

extracting the object features of the images in a comparison object image training set belonging to the particular source category, as a comparison feature sample set;

under the assumption that the compared object features follow their own respective Gaussian distributions, building an asymmetric metric model containing parameters;

solving for the parameters of the asymmetric metric model from the samples of the two feature sample sets and identity labels indicating whether samples belong to the same object, completing the training of the model.
Optionally, the asymmetric metric model corresponding to a particular source category is as follows:

$$r(x,y) = x^{T}Ax + y^{T}By - 2\,x^{T}Gy$$

$$A = (S_{xx}+T_{xx})^{-1}-E$$

$$B = (S_{yy}+T_{yy})^{-1}-F$$

$$G = -\left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}S_{xy}(S_{yy}+T_{yy})^{-1}$$

$$E = \left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}$$

$$F = \left(S_{yy}+T_{yy}-S_{yx}(S_{xx}+T_{xx})^{-1}S_{xy}\right)^{-1}$$

where a sample of the reference feature sample set X is assumed to be $x = \mu_x + \varepsilon_x$, with $\mu_x$ and $\varepsilon_x$ following zero-mean Gaussian distributions with covariance matrices $S_{xx}$ and $T_{xx}$; a sample of the comparison feature sample set Y is assumed to be $y = \mu_y + \varepsilon_y$, with $\mu_y$ and $\varepsilon_y$ following zero-mean Gaussian distributions with covariance matrices $S_{yy}$ and $T_{yy}$; $S_{xy}$ and $S_{yx}$ are the cross-covariance matrices between X and Y; and $r(x,y)$ is the similarity computed from the intra-class/inter-class log-likelihood ratio;

solving for the parameters of the asymmetric metric model comprises: solving for $S_{xx}$, $T_{xx}$, $S_{yy}$, $T_{yy}$, $S_{xy}$, and $S_{yx}$.
Optionally, solving for the parameters of the asymmetric metric model comprises:

estimating the parameters of the model using scatter matrices; or

solving for the parameters of the model iteratively using an expectation-maximization algorithm.

Optionally, computing the similarity between the object feature and object features of registered images comprises:

computing the similarity between the object feature and the object feature of a registered image corresponding to a particular identity;

and after the similarity computation, performing the following operations:

determining whether the similarity is greater than a preset threshold;

if so, determining that the object image to be recognized and the registered image corresponding to the particular identity belong to the same object, and outputting this determination as the object recognition result.

Optionally, computing the similarity between the object feature and object features of registered images comprises:

computing the similarities between the object feature and the object features of registered images within a specified range;

and after the similarity computation, performing the following operations:

determining whether the maximum of the computed similarities is greater than a preset threshold;

if so, determining that the object image to be recognized has been matched among the registered images within the specified range, and outputting, as the object recognition result, the identity information associated with the registered image corresponding to the maximum.
Optionally, extracting the object feature of the object image to be recognized comprises:

extracting the object feature using a local binary pattern algorithm; or

extracting the object feature using a Gabor wavelet transform algorithm; or

extracting the object feature using a deep convolutional network.

Optionally, the object image to be recognized comprises: a face image to be recognized; and the object feature comprises: a face feature.

Optionally, the source categories include:

ID photo, everyday photo, video screenshot, scanned image, recaptured image, or surveillance frame.
Correspondingly, the present application also provides an image recognition device, comprising:

an image acquisition unit, configured to acquire an object image to be recognized;

a feature extraction unit, configured to extract an object feature from the object image to be recognized;

a similarity computation unit, configured to select, from a pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized, and to compute the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result;

wherein the similarity computation unit comprises:

a metric model selection subunit, configured to select, from the pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized; and

a computation execution subunit, configured to compute, using the similarity metric model selected by the metric model selection subunit, the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result.
Optionally, the device comprises:

a metric model training unit, configured to train the similarity metric models of the metric model set corresponding to different source categories using a reference object image training set belonging to a preset source category together with comparison object image training sets corresponding to the different source categories.

Optionally, the device comprises:

a source category determination unit, configured to determine, before the similarity computation unit is triggered, the source category of the object image to be recognized using a pre-trained object image source classification model with the object feature as input.

Optionally, the device comprises:

a source classification model training unit, configured to train, before the source category determination unit is triggered, the object image source classification model using one of the following algorithms: the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
Optionally, the device comprises:

a metric model training unit, configured to train the similarity metric models of the metric model set, the similarity metric models comprising: asymmetric metric models built on the basis of the Joint Bayesian face under the assumption that the compared object features follow their own respective Gaussian distributions;

the metric model training unit trains the asymmetric metric model corresponding to a particular source category through the following subunits:

a reference sample extraction subunit, configured to extract the object features of the images in a reference object image training set belonging to a preset source category, as a reference feature sample set;

a comparison sample extraction subunit, configured to extract the object features of the images in a comparison object image training set belonging to the particular source category, as a comparison feature sample set;

a metric model building subunit, configured to build, under the assumption that the compared object features follow their own respective Gaussian distributions, an asymmetric metric model containing parameters; and

a model parameter solving subunit, configured to solve for the parameters of the asymmetric metric model from the samples of the two feature sample sets and identity labels indicating whether samples belong to the same object, completing the training of the model.

Optionally, the model parameter solving subunit is specifically configured to estimate the parameters of the model using scatter matrices, or to solve for the parameters of the model iteratively using an expectation-maximization algorithm.
Optionally, the computation execution subunit is specifically configured to compute the similarity between the object feature and the object feature of a registered image corresponding to a particular identity;

the device further comprises:

a first threshold comparison unit, configured to determine whether the similarity is greater than a preset threshold; and

a first recognition result output unit, configured to determine, when the output of the first threshold comparison unit is yes, that the object image to be recognized and the registered image corresponding to the particular identity belong to the same object, and to output this determination as the object recognition result.

Optionally, the computation execution subunit is specifically configured to compute the similarities between the object feature and the object features of registered images within a specified range;

the device further comprises:

a second threshold comparison unit, configured to determine whether the maximum of the computed similarities is greater than a preset threshold; and

a second recognition result output unit, configured to determine, when the output of the second threshold comparison unit is yes, that the object image to be recognized has been matched among the registered images within the specified range, and to output, as the object recognition result, the identity information associated with the registered image corresponding to the maximum.

Optionally, the feature extraction unit is specifically configured to extract the object feature using a local binary pattern algorithm, a Gabor wavelet transform algorithm, or a deep convolutional network.
In addition, the present application also provides a metric learning method, comprising:

extracting the object features of the images in a reference object image training set belonging to a single source category, as a reference feature sample set;

extracting the object features of the images in a comparison object image training set belonging to a single source category different from that of the reference object images, as a comparison feature sample set;

under the assumption that the compared object features follow their own respective Gaussian distributions, building an asymmetric metric model containing parameters;

solving for the parameters of the asymmetric metric model using the samples of the two feature sample sets.
Optionally, the asymmetric metric model comprises: an asymmetric metric model based on the Joint Bayesian face;

the asymmetric metric model is as follows:

$$r(x,y) = x^{T}Ax + y^{T}By - 2\,x^{T}Gy$$

$$A = (S_{xx}+T_{xx})^{-1}-E$$

$$B = (S_{yy}+T_{yy})^{-1}-F$$

$$G = -\left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}S_{xy}(S_{yy}+T_{yy})^{-1}$$

$$E = \left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}$$

$$F = \left(S_{yy}+T_{yy}-S_{yx}(S_{xx}+T_{xx})^{-1}S_{xy}\right)^{-1}$$

where a sample of the reference feature sample set X is assumed to be $x = \mu_x + \varepsilon_x$, with $\mu_x$ and $\varepsilon_x$ following zero-mean Gaussian distributions with covariance matrices $S_{xx}$ and $T_{xx}$; a sample of the comparison feature sample set Y is assumed to be $y = \mu_y + \varepsilon_y$, with $\mu_y$ and $\varepsilon_y$ following zero-mean Gaussian distributions with covariance matrices $S_{yy}$ and $T_{yy}$; $S_{xy}$ and $S_{yx}$ are the cross-covariance matrices between X and Y; and $r(x,y)$ is the similarity computed from the intra-class/inter-class log-likelihood ratio;

solving for the parameters of the asymmetric metric model comprises: solving for $S_{xx}$, $T_{xx}$, $S_{yy}$, $T_{yy}$, $S_{xy}$, and $S_{yx}$.
Optionally, solving for the parameters of the asymmetric metric model comprises:

estimating the parameters of the model using scatter matrices; or

solving for the parameters of the model iteratively using an expectation-maximization algorithm.

Optionally, the reference object images and the comparison object images comprise: face images; and the object features comprise: face features.
Correspondingly, the present application also provides a metric learning device, comprising:

a reference sample extraction unit, configured to extract the object features of the images in a reference object image training set belonging to a single source category, as a reference feature sample set;

a comparison sample extraction unit, configured to extract the object features of the images in a comparison object image training set belonging to a single source category different from that of the reference object images, as a comparison feature sample set;

an asymmetric metric model building unit, configured to build, under the assumption that the compared object features follow their own respective Gaussian distributions, an asymmetric metric model containing parameters; and

a metric model parameter solving unit, configured to solve for the parameters of the asymmetric metric model using the samples of the two feature sample sets.

Optionally, the metric model built by the asymmetric metric model building unit comprises: an asymmetric metric model based on the Joint Bayesian face.

Optionally, the metric model parameter solving unit is specifically configured to estimate the parameters of the model using scatter matrices, or to solve for the parameters of the model iteratively using an expectation-maximization algorithm.
In addition, the present application also provides an image source identification method, comprising:

collecting object image sets belonging to different source categories, and extracting object features from them to form a training sample set;

training an object image source classification model using the object feature samples in the training sample set and their source categories;

extracting an object feature from an object image to be classified;

taking the extracted object feature as input, identifying the source category of the object image to be classified using the object image source classification model.

Optionally, the object image source classification model is a multi-class classification model trained with one of the following algorithms: the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.

Optionally, the object image comprises: a face image; and the object feature comprises: a face feature.
Correspondingly, the present application also provides an image source identification device, comprising:

a training sample collection unit, configured to collect object image sets belonging to different source categories and extract object features from them to form a training sample set;

a classification model training unit, configured to train an image source classification model using the object feature samples in the training sample set and their source categories;

a to-be-classified feature extraction unit, configured to extract an object feature from an object image to be classified; and

a source category identification unit, configured to identify, taking the object feature extracted by the to-be-classified feature extraction unit as input, the source category of the object image to be classified using the object image source classification model.

Optionally, the object image source classification model comprises: a multi-class classification model;

the classification model training unit is specifically configured to train the object image source classification model using the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
Compared with the prior art, the present application has the following advantages:

The image recognition method provided by the present application first acquires an object image to be recognized and extracts its object feature, then selects, from a pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized, and computes the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result. Because recognition does not rely on a single similarity metric model but instead uses a pre-trained similarity metric model matched to the source category of the object image to be recognized, the method can effectively handle the recognition of asymmetric object images and is more robust and more accurate when recognizing object images of varied origin.

The metric learning method provided by the present application builds, under the assumption that the compared face features follow their own respective Gaussian distributions, an asymmetric metric model containing parameters, and solves for the parameters of the asymmetric metric model using sets of object image feature samples of different sources, thereby completing the construction of the asymmetric metric model. The method modifies the assumption made in traditional image recognition techniques, namely that the two compared object samples x and y may follow their own respective Gaussian distributions without having to share parameters, and on that basis learns, from sample sets belonging to different source categories, a similarity metric model for recognizing asymmetric objects, providing a foundation for high-performance object recognition that adapts to a variety of image sources.

The image source identification method provided by the present application first extracts object features from object image sets belonging to different source categories to form a training sample set, trains an object image source classification model using the object feature samples in the training sample set and their source categories, and then, taking an object feature extracted from an object image to be classified as input, identifies the source category of the object image to be classified using the object image source classification model. The method can effectively identify the source category of an object image, thereby providing a basis for selecting the correct similarity metric model during object recognition and ensuring the correctness of the recognition result.
Brief Description of the Drawings

FIG. 1 is a flowchart of an embodiment of an image recognition method provided by the present application;

FIG. 2 is a schematic diagram of the metric model set training process provided by an embodiment of the present application;

FIG. 3 is a processing flowchart for training an asymmetric metric model provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of face recognition using a metric model set provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of an embodiment of an image recognition device provided by the present application;

FIG. 6 is a flowchart of an embodiment of a metric learning method provided by the present application;

FIG. 7 is a schematic diagram of an embodiment of a metric learning device provided by the present application;

FIG. 8 is a flowchart of an embodiment of an image source identification method provided by the present application;

FIG. 9 is a schematic diagram of an embodiment of an image source identification device provided by the present application.
Detailed Description

Many specific details are set forth in the following description to facilitate a full understanding of the present application. The present application can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its substance; the present application is therefore not limited by the specific implementations disclosed below.

The present application provides an image recognition method and device, a metric learning method and device, and an image source identification method and device, each described in detail in the embodiments below.

Although the technical solution of the present application is presented against the background of face recognition, its field of application is not limited to face recognition; the technical solution provided by the present application can equally be adopted in recognition applications for other object images.

Existing image recognition techniques usually ignore the source of the object image and recognize with a single similarity metric model. Addressing the varied sources and uneven quality of object images to be recognized, the technical solution of the present application proposes a new approach to image recognition: similarity metric models corresponding to different source categories are trained in advance, and at application time the similarity metric model corresponding to the source category of the object image to be recognized is selected for recognition. The approach can thus handle the recognition of asymmetric object images and is more robust and more accurate for object images belonging to different source categories.

An object image usually means an image whose main displayed content (e.g., the foreground that forms the image subject) is an object such as a face or an article. Object images of different sources usually means images whose object features follow different data distributions owing to factors such as different collection methods or collection devices; different sources may include video screenshots, scanned images, recaptured images, and so on.

Since face image recognition is currently a widespread application, the embodiments of the present application focus on face image recognition.
Referring to FIG. 1, a flowchart of an embodiment of an image recognition method of the present application, the method comprises the following steps:

Step 101: train similarity metric models corresponding to different source categories, forming a metric model set.

For the face images of this embodiment, the various source categories include, but are not limited to: ID photos, everyday photos, video screenshots, scanned images, recaptured images, or surveillance frames.

Before performing face recognition with this technical solution, similarity metric models corresponding to different source categories may first be trained; all trained similarity metric models together form a metric model set, and each member of the set, i.e., each similarity metric model, corresponds to a different source category of face images.

Given two face feature samples (face samples for short) x and y belonging to different source categories, the similarity metric model evaluates the similarity between them. In a specific implementation, the similarity metric model can usually be expressed as a metric function f(x, y, P), where P denotes the model parameters. The purpose of training is to solve for the parameters P of the metric model on a given training set; once P is determined, training is complete.

For the multiple source categories of face images, the training process can be repeated several times, yielding multiple metric functions, each applicable to face images of a different source category. When training the metric model for a particular source category, the training set consists of three parts: a reference face image training set X belonging to a preset source category and serving as the training baseline, a comparison face image training set Y corresponding to the particular source category, and identity labels Z indicating which images come from the same person and which from different people. Given one training set (X, Y, Z), one metric function f(x, y, P) for the (X, Y) space can be trained. Fixing the training set X and substituting training sets Y_k belonging to different source categories yields multiple metric functions f_k(x, y, P), k = 1…K, where K is the number of training sets Y, i.e., the number of image source categories. See FIG. 2, a schematic diagram of the metric model set training process; the shape of this training loop is sketched in code below.
The training process having been outlined above, the specific steps for training the similarity metric model corresponding to a particular source category are described next: feature extraction, model building, and parameter solving. In a specific implementation, different algorithms can be used to build the similarity metric model; for ease of understanding, this embodiment builds the model on the basis of the widely used Joint Bayesian face and calls the resulting model an asymmetric metric model. The process of training the asymmetric metric model is further explained below with reference to FIG. 3, and comprises:

Step 101-1: extract the face features of the images in a reference face image training set belonging to a preset source category, as a reference feature sample set.

In a specific implementation, the face images in the reference face image training set X serving as the training baseline are usually collected under controlled conditions; the preset source category may be ID photos, or another source category whose image quality is generally good. After the reference face image training set is collected, the face feature of each image is extracted as a sample (the so-called face sample), and all samples together form the reference feature sample set. For how face features are extracted, see the description under step 103 below.

Step 101-2: extract the face features of the images in a comparison face image training set belonging to the particular source category, as a comparison feature sample set.

The particular source category may differ from that of the reference face image training set X; for example, X may consist of ID photos collected under controlled conditions, while the face images in the comparison face image training set Y may be everyday photos collected under uncontrolled conditions. After the comparison face image training set is collected, the face feature of each image is extracted as a sample, and all samples together form the comparison feature sample set. For how face features are extracted, see the description under step 103 below.

Step 101-3: under the assumption that the compared face features follow their own respective Gaussian distributions, build an asymmetric metric model containing parameters.

This embodiment improves on the traditional Joint Bayesian face and builds an asymmetric metric model. For ease of understanding, the Bayesian face and the Joint Bayesian face are briefly explained first.

"Bayesian face" usually abbreviates the classic Bayesian face recognition method, which uses the difference between the features of two face images as the pattern vector; if the two images belong to the same person the pattern is called intra-class, otherwise inter-class, thereby converting the multi-class face recognition problem into a binary classification problem. For any two face samples x and y, if the log-likelihood ratio computed from the intra-class/inter-class patterns exceeds a preset threshold, they can be judged to be the same person.

The Joint Bayesian face builds, on top of the Bayesian face, a two-dimensional model of the joint probability distribution of x and y, and represents each face sample as the sum of two independent latent variables: the variation between different faces plus the variation of the same face. A similarity metric model based on the log-likelihood ratio is then trained from a large number of samples. Although these two Bayesian face techniques were proposed for face image recognition, they can also be applied to the recognition of other object images.

The recognition accuracy of the Joint Bayesian face improves on the classic Bayesian face, but its basic assumption is that the compared face samples x and y follow the same Gaussian distribution. In specific applications the sources of registration-set images are usually controlled, while the sources of face images to be recognized are varied and of uneven quality; the compared face samples may therefore fail to satisfy the same-Gaussian requirement, a situation the Joint Bayesian face technique usually handles poorly, with low recognition accuracy.
To address this problem, the inventors of the present application modified the assumption of the Joint Bayesian face and proposed an asymmetric metric model, together with a metric learning method that trains on face image training sets of different source categories. The model is called "asymmetric" because the face images corresponding to the two compared face samples may belong to different source categories; since the modeling accounts for the difference in data distributions caused by the differing source categories, similarities estimated with this model yield more accurate face recognition results.

The asymmetric metric model rests on the following assumption: the two compared face samples x and y may follow their own respective Gaussian distributions, without having to share parameters. Suppose a sample x of the reference feature sample set X can be expressed as the sum of two independent random variables, $x = \mu_x + \varepsilon_x$, where $\mu_x$ represents the randomness contributed by the sample's identity label and $\varepsilon_x$ the randomness contributed by other factors such as pose, expression, and illumination; $\mu_x$ and $\varepsilon_x$ are assumed to follow zero-mean Gaussian distributions with covariance matrices $S_{xx}$ and $T_{xx}$.

Similarly, a sample y of the comparison face image training set Y can be expressed as the sum of two independent random variables, $y = \mu_y + \varepsilon_y$, where $\mu_y$ represents the randomness contributed by the sample's identity label and $\varepsilon_y$ the randomness contributed by other factors; $\mu_y$ and $\varepsilon_y$ are assumed to follow zero-mean Gaussian distributions with covariance matrices $S_{yy}$ and $T_{yy}$.
Since both x and y follow Gaussian distributions, their joint distribution is also Gaussian. Concatenating the X and Y spaces, a joint sample is written {x, y}; this random variable still has zero mean, and its covariance is analyzed in two cases.

1) For samples of the same person (intra-class), the covariance matrix is

$$\Sigma_I = \begin{pmatrix} S_{xx}+T_{xx} & S_{xy} \\ S_{yx} & S_{yy}+T_{yy} \end{pmatrix}$$

where $S_{xy}$ and $S_{yx}$ are the cross-covariance matrices between X and Y. Its inverse has the form

$$\Sigma_I^{-1} = \begin{pmatrix} E & G \\ G^{T} & F \end{pmatrix}$$

which gives:

$$E = \left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}$$

$$G = -\left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}S_{xy}(S_{yy}+T_{yy})^{-1}$$

$$F = \left(S_{yy}+T_{yy}-S_{yx}(S_{xx}+T_{xx})^{-1}S_{xy}\right)^{-1}$$

2) For samples of different persons (inter-class), the covariance matrix is

$$\Sigma_E = \begin{pmatrix} S_{xx}+T_{xx} & 0 \\ 0 & S_{yy}+T_{yy} \end{pmatrix}$$

and its inverse has the form

$$\Sigma_E^{-1} = \begin{pmatrix} (S_{xx}+T_{xx})^{-1} & 0 \\ 0 & (S_{yy}+T_{yy})^{-1} \end{pmatrix}$$

On the basis of this derivation, for any two samples x and y the intra-class/inter-class log-likelihood ratio is used to evaluate their similarity: the larger its value, the more likely x and y are the same person. The asymmetric metric model is therefore built as

$$r(x,y) = x^{T}\!\left[(S_{xx}+T_{xx})^{-1}-E\right]\!x + y^{T}\!\left[(S_{yy}+T_{yy})^{-1}-F\right]\!y - 2\,x^{T}Gy$$

Letting

$$A = (S_{xx}+T_{xx})^{-1}-E, \qquad B = (S_{yy}+T_{yy})^{-1}-F,$$

the asymmetric metric model simplifies to the following representation:

$$r(x,y) = x^{T}Ax + y^{T}By - 2\,x^{T}Gy \qquad \text{(Formula 1)}$$
Step 101-4: solve for the parameters of the asymmetric metric model from the samples of the two feature sample sets and the identity labels indicating whether samples belong to the same person, completing the training of the model.

The main task in training the asymmetric metric model is solving for the parameters A, B, and G in the model expression of Formula 1. As the derivation in step 101-3 shows, these three parameters are obtained by specific operations on $S_{xx}$, $T_{xx}$, $S_{yy}$, $T_{yy}$, $S_{xy}$, and $S_{yx}$, so the core of training the asymmetric metric model is solving for these covariance and cross-covariance matrices. This embodiment uses the large numbers of face samples in the reference feature sample set X and the comparison feature sample set Y and solves for the parameters by estimating scatter matrices, as detailed below.

From the reference feature sample set X and the identity label information (indicating whether different face samples belong to the same person), $S_{xx}$ is approximated by the between-class scatter matrix and $T_{xx}$ by the within-class scatter matrix:

$$S_{xx} \approx \frac{1}{C}\sum_{i=1}^{C}\left(m_x^i - m_x\right)\left(m_x^i - m_x\right)^{T}$$

$$T_{xx} \approx \frac{1}{C}\sum_{i=1}^{C}\frac{1}{n_i^x}\sum_{x \in X_i}\left(x - m_x^i\right)\left(x - m_x^i\right)^{T}$$

where C is the number of classes (face samples of the same person form one class), $X_i$ is the set of class-i samples, $n_i^x$ is the number of class-i samples, $m_x$ is the mean of all samples, and $m_x^i$ is the mean of the class-i samples.

Likewise, from the comparison feature sample set Y and the identity label information, $S_{yy}$ is approximated by the between-class scatter matrix and $T_{yy}$ by the within-class scatter matrix:

$$S_{yy} \approx \frac{1}{C}\sum_{i=1}^{C}\left(m_y^i - m_y\right)\left(m_y^i - m_y\right)^{T}$$

$$T_{yy} \approx \frac{1}{C}\sum_{i=1}^{C}\frac{1}{n_i^y}\sum_{y \in Y_i}\left(y - m_y^i\right)\left(y - m_y^i\right)^{T}$$

where C is the number of classes, $Y_i$ is the set of class-i samples, $n_i^y$ is the number of class-i samples, $m_y$ is the mean of all samples, and $m_y^i$ is the mean of the class-i samples.

Likewise, the cross-covariance matrices between X and Y are estimated as

$$S_{xy} \approx \frac{1}{C}\sum_{i=1}^{C}\left(m_x^i - m_x\right)\left(m_y^i - m_y\right)^{T}, \qquad S_{yx} = S_{xy}^{T}$$

Having solved for $S_{xx}$, $T_{xx}$, $S_{yy}$, $T_{yy}$, $S_{xy}$, and $S_{yx}$ by scatter-matrix estimation in this way, the values of the parameters A, B, and G are then computed following the derivation of step 101-3 and substituted into Formula 1, yielding the trained asymmetric metric model.
Thus far, steps 101-1 through 101-4 have described the specific steps for training the asymmetric metric model corresponding to a particular source category. In a specific implementation, for the K source categories of face images the above steps can be carried out for each category, obtaining K similarity metric models respectively corresponding to the different source categories.

It should be noted that this embodiment solves for the parameters of the asymmetric metric model by estimating scatter matrices over large numbers of face samples. In other implementations, the parameters of the model may instead be solved for iteratively, over multiple rounds, using the expectation-maximization algorithm adopted by the traditional Joint Bayesian face, which likewise realizes the technical solution of the present application.

Furthermore, this embodiment builds the similarity metric models corresponding to different source categories by modifying the assumption of the Joint Bayesian face. In other implementations, other methods or techniques may be used to build the similarity metric model, for example Canonical Correlation Analysis (CCA), Asymmetric Deep Metric Learning (ADML), or methods based on Multimodal Restricted Boltzmann Machines. Whatever algorithm or technique is adopted, as long as corresponding similarity metric models are built and trained for face images of different sources, the core of the present application is not departed from, and the approach falls within its scope of protection.
Step 102: acquire a face image to be recognized.

The face image to be recognized usually means a face image whose identity is to be determined. It is generally collected under uncontrolled conditions and comes from many source categories, which may include: everyday photos, recaptured posters, recaptured television frames, surveillance frames, scanned images, and so on.

In a specific implementation, the face image to be recognized can be acquired in many ways, for example: captured with a camera or with a mobile terminal device equipped with one, downloaded from an Internet resource database, scanned with a scanner, or received from a client (e.g., a mobile terminal device or a desktop computer) that uploads it over a wired or wireless connection.
Step 103: extract the face feature of the face image to be recognized.

Since the face usually occupies most of the face image to be recognized, the face feature can be extracted from the image directly. To improve recognition accuracy, the specific position of the face may first be detected against the image background, for example with a skin-color-based detection method, a shape-based detection method, or a detection method based on statistical theory, and the face feature then extracted from the face image at that position.

Feature extraction is the process of converting the face image into a vector called the face feature; a good face feature discriminates strongly between face images from different people while being robust to external interference. In a specific implementation, various feature extraction methods can be used, such as the Local Binary Patterns (LBP) algorithm, the Gabor wavelet transform algorithm, and deep convolutional networks; considering recognition accuracy and runtime performance, extracting face features with a deep convolutional network is the preferred implementation of this embodiment.
Step 104: determine the source category of the face image to be recognized using a pre-trained face image source classification model.

In a specific implementation, the source category of the face image to be recognized may be determined from the way the image was acquired: a face image of everyday life taken with a camera has source category "everyday photo"; a face image obtained with a scanner has source category "scanned image". Moreover, for a face image to be recognized that is obtained from a client or from the network, if the image carries pre-annotated source information, its source category can be determined from that information.

For a face image to be recognized whose source category cannot be obtained in these or similar ways, the method of this step can be used: determine the source category of the face image to be recognized using a face image source classification model.

The face image source classification model is a multi-class classification model (also called a multi-class classifier). In a specific implementation, the face image source classification model can be trained in advance, before this step is performed; this embodiment, for example, trains the classification model with the Softmax regression algorithm, and the training process is explained below.

First, face image sets belonging to K different source categories are collected, and a face feature is extracted from each face image to form a training sample set. Each sample of the training sample set consists of two parts, a face feature and its corresponding source category label, written {y_i, s_i} (i = 1…N), where y_i is the face feature, s_i is the source category label, and N is the number of samples.
With the Softmax regression method, for a given face feature the probability that it belongs to class k takes the form

$$P(s = k \mid y) = \frac{\exp\left(\theta_k^{T} y\right)}{\sum_{j=1}^{K} \exp\left(\theta_j^{T} y\right)}$$

where θ are the parameters of the model, which can be solved for by minimizing the following objective function:

$$J(\theta) = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} 1\{s_i = k\}\,\log P(s_i = k \mid y_i)$$

where 1{·} is the indicator function, whose value is 1 when the bracketed expression holds and 0 otherwise. In a specific implementation, for the given training set {y_i, s_i} (i = 1…N), an iterative optimization algorithm (e.g., gradient descent) can be used to minimize the objective function J(θ) and solve for the parameters θ, whereupon the face image source classification model is trained.

This step can take the face feature of the face image to be recognized as input and use the trained face image source classification model to compute the probability P(s = k | y) that the feature belongs to each source category; the source category corresponding to the maximum is the source category to which the face image to be recognized belongs.

This embodiment implements the face image source classification model with the Softmax algorithm; in other implementations, other approaches different from the above algorithm may also be used, for example a multi-class SVM algorithm or a random forest algorithm.
Step 105: select, from the pre-trained metric model set, the similarity metric model corresponding to the source category of the face image to be recognized, and compute the similarity between the face feature and face features of registered images as the basis for outputting a face recognition result.

A registered image usually means a face image in the face image registration set used for querying in a specific application. The images of the face image registration set are usually collected under controlled conditions, generally come from a fairly uniform source, and are usually of good quality, e.g., second-generation ID card photos or enrollment photos; the set is fairly large, ranging from tens of thousands up to tens of millions of images. To further improve the recognition accuracy of this technical solution, the face image registration set and the reference face image training set used in step 101 to train the similarity metric models may use images of the same source category, e.g., both ID photos.

In a specific implementation, after the images composing the face image registration set are collected, the face feature of each face image can be extracted, and the face image, face feature, and corresponding identity label and identity information stored in a registration image database for querying, with the correspondences among these kinds of information established. Identity information usually means information that identifies the personal identity corresponding to a face image, e.g., a name or an identity ID; a possible record layout is sketched below.
Since the metric model set for face recognition was trained in advance in step 101, in one specific example of this embodiment the pre-trained metric model set contains K similarity metric models, each corresponding to a different source category, of the form f_k(x, y, P), k = 1…K, where the parameters P were already solved for in step 101.

This step selects the corresponding similarity metric model from the metric model set according to the source category of the face image to be recognized. For example, if the source category of the face image to be recognized is "scanned image", this step selects the similarity metric model pre-trained for the scanned-image source category, uses the selected model to compute the similarity between the face feature of the face image to be recognized and the face features of registered images, and finally outputs the face recognition result according to the similarity. See FIG. 4, a schematic diagram of the processing in this specific example.
In a specific implementation, depending on the application requirements of face recognition, this step computes the similarity between the face feature and face features of registered images in two different cases, described separately below (both decision rules are sketched in code after case (2)).

(1) Face verification.

Face verification usually means judging whether the identity of a face image is that of a particular person. In this application scenario, the identity information of the particular person, for example a numeric identifier representing the identity (an identity ID), is usually known in advance. The registration image database is queried with that identity information to obtain the registered-image face feature corresponding to the identity, and the similarity between the face feature of the face image to be recognized and the registered-image face feature retrieved from the database is then computed. If the similarity is greater than a preset threshold, it can be determined that the face image to be recognized and the registered image belong to the same person, i.e., the identity of the face image to be recognized is indeed the particular person, and this determination is output as the face recognition result.

(2) Face identification.

Face identification usually means identifying the identity of the face image to be recognized, i.e., determining whose image it is. In this application scenario, this step can compute the similarities between the face feature of the face image to be recognized and the face features of registered images within a specified range: it may compare against all registered-image face features in the pre-built registration image database one by one, or select part of them according to a preset strategy, and compute the corresponding similarities. If the maximum of the computed similarities is greater than a preset threshold, it can be determined that the face image to be recognized has been matched among the registered images within the specified range, i.e., the face image to be recognized belongs to the registered image collection of the specified range, and the identity information associated with the registered image corresponding to the maximum, e.g., its identity ID or name, is output as the face recognition result.
Thus far, steps 101 through 105 have described a specific implementation of the face recognition method provided by this embodiment. It should be noted that not all of the above steps are required when implementing this method. Step 101 is the training process of the metric model set; normally, once the similarity metric models of the set are trained, they can be reused without retraining for each acquired face image to be recognized. Likewise, step 104 is not required: if the source category can be learned from the way the image to be recognized was acquired, or the image itself carries a source category annotation, step 104 may be skipped.
The embodiment above takes face recognition as an example to describe in detail a specific implementation of the image recognition method provided by the present application. In practical applications, the image recognition method provided by the present application may also be applied to the recognition of other object images (e.g., images containing various articles); luggage image recognition is briefly described below as an example.

Similarity metric models corresponding to different image source categories may be trained in advance from a reference luggage image training set and comparison luggage image training sets corresponding to different source categories. After a luggage image to be recognized is acquired, the luggage feature is first extracted from it; the similarity metric model corresponding to the source category of the luggage image to be recognized is then selected, the similarity between the luggage feature and luggage features of registered images is computed, and the recognition result for the luggage image to be recognized is output according to the similarity, for example: whether the luggage image to be recognized and a registered image corresponding to a particular identity show the same piece of luggage, or the identity information associated with the luggage image to be recognized. For articles such as luggage, identity information may typically include one or a combination of: manufacturer, brand information, and model information.
In summary, the image recognition method provided by the present application does not use a single similarity metric model when recognizing object images; instead, it selects a pre-trained similarity metric model corresponding to the source category of the object image to be recognized. It can therefore effectively handle the recognition of asymmetric object images and is more robust and more accurate when recognizing object images of varied origin.
The embodiment above provides an image recognition method; correspondingly, the present application also provides an image recognition device. See FIG. 5, a schematic diagram of an embodiment of an image recognition device of the present application. Since the device embodiment is essentially similar to the method embodiment, it is described relatively simply; for the relevant parts, refer to the description of the method embodiment. The device embodiment described below is merely illustrative.

An image recognition device of this embodiment comprises: a metric model training unit 501, configured to train the similarity metric models of the metric model set corresponding to different source categories using a reference object image training set belonging to a preset source category together with comparison object image training sets corresponding to the different source categories; an image acquisition unit 502, configured to acquire an object image to be recognized; a feature extraction unit 503, configured to extract an object feature from the object image to be recognized; a source category determination unit 504, configured to determine, with the object feature as input, the source category of the object image to be recognized using a pre-trained object image source classification model; and a similarity computation unit 505, configured to select, from the pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized, and to compute the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result;

wherein the similarity computation unit comprises:

a metric model selection subunit, configured to select, from the pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized; and

a computation execution subunit, configured to compute, using the similarity metric model selected by the metric model selection subunit, the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result.
Optionally, the device comprises:

a source classification model training unit, configured to train, before the source category determination unit is triggered, the object image source classification model using one of the following algorithms: the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.

Optionally, the metric model training unit is specifically configured to train asymmetric metric models corresponding to different source categories, an asymmetric metric model being a metric model built on the basis of the Joint Bayesian face under the assumption that the compared object features follow their own respective Gaussian distributions;

the metric model training unit trains the asymmetric metric model corresponding to a particular source category through the following subunits:

a reference sample extraction subunit, configured to extract the object features of the images in a reference object image training set belonging to a preset source category, as a reference feature sample set;

a comparison sample extraction subunit, configured to extract the object features of the images in a comparison object image training set belonging to the particular source category, as a comparison feature sample set;

a metric model building subunit, configured to build, under the assumption that the compared object features follow their own respective Gaussian distributions, an asymmetric metric model containing parameters; and

a model parameter solving subunit, configured to solve for the parameters of the asymmetric metric model from the samples of the two feature sample sets and identity labels indicating whether samples belong to the same object, completing the training of the model.

Optionally, the model parameter solving subunit is specifically configured to estimate the parameters of the model using scatter matrices, or to solve for the parameters of the model iteratively using an expectation-maximization algorithm.
Optionally, the computation execution subunit is specifically configured to compute the similarity between the object feature and the object feature of a registered image corresponding to a particular identity;

the device further comprises:

a first threshold comparison unit, configured to determine whether the similarity is greater than a preset threshold; and

a first recognition result output unit, configured to determine, when the output of the first threshold comparison unit is yes, that the object image to be recognized and the registered image corresponding to the particular identity belong to the same object, and to output this determination as the object recognition result.

Optionally, the computation execution subunit is specifically configured to compute the similarities between the object feature and the object features of registered images within a specified range;

the device further comprises:

a second threshold comparison unit, configured to determine whether the maximum of the computed similarities is greater than a preset threshold; and

a second recognition result output unit, configured to determine, when the output of the second threshold comparison unit is yes, that the object image to be recognized has been matched among the registered images within the specified range, and to output, as the object recognition result, the identity information associated with the registered image corresponding to the maximum.

Optionally, the feature extraction unit is specifically configured to extract the object feature using a local binary pattern algorithm, a Gabor wavelet transform algorithm, or a deep convolutional network.
In addition, the present application also provides a metric learning method. See FIG. 6, a flowchart of an embodiment of a metric learning method provided by the present application; parts identical to the image recognition method embodiment above are not repeated, and the differences are described below. The metric learning method provided by the present application comprises:

Step 601: extract the object features of the images in a reference object image training set belonging to a single source category, as a reference feature sample set.

Step 602: extract the object features of the images in a comparison object image training set belonging to a single source category different from that of the reference object images, as a comparison feature sample set.

Step 603: under the assumption that the compared object features follow their own respective Gaussian distributions, build an asymmetric metric model containing parameters.

The asymmetric metric model comprises an asymmetric metric model based on the Joint Bayesian face, as follows:

$$r(x,y) = x^{T}Ax + y^{T}By - 2\,x^{T}Gy$$

Step 604: solve for the parameters of the asymmetric metric model using the samples of the two feature sample sets.

This step can use the samples of the two feature sample sets and solve for the parameters of the model with an algorithm or approach appropriate to the model that was built. For example, for the asymmetric metric model based on the Joint Bayesian face, the parameters of the model can be estimated with scatter matrices from the samples of the two feature sample sets and the identity label information indicating whether samples belong to the same object, or solved for iteratively with an expectation-maximization algorithm; the scatter-matrix route is sketched below.
The metric learning method provided by this embodiment can be used to learn a similarity metric model for asymmetric face images; in that application scenario, the reference object images and the comparison object images are face images, and the object features are face features. Of course, in practical applications the metric learning method provided by this embodiment may also be used to learn similarity metric models for other asymmetric object images.

The metric learning method provided by the present application modifies the assumption made in traditional image recognition techniques: the two compared object samples x and y may follow their own respective Gaussian distributions without having to share parameters. On that basis it learns, from sample sets belonging to different source categories, a similarity metric model for recognizing asymmetric objects, thereby providing a foundation for high-performance object recognition that adapts to a variety of image sources.
The embodiment above provides a metric learning method; correspondingly, the present application also provides a metric learning device. See FIG. 7, a schematic diagram of an embodiment of a metric learning device of the present application. Since the device embodiment is essentially similar to the method embodiment, it is described relatively simply; for the relevant parts, refer to the description of the method embodiment. The device embodiment described below is merely illustrative.

A metric learning device of this embodiment comprises: a reference sample extraction unit 701, configured to extract the object features of the images in a reference object image training set belonging to a single source category, as a reference feature sample set; a comparison sample extraction unit 702, configured to extract the object features of the images in a comparison object image training set belonging to a single source category different from that of the reference object images, as a comparison feature sample set; an asymmetric metric model building unit 703, configured to build, under the assumption that the compared object features follow their own respective Gaussian distributions, an asymmetric metric model containing parameters; and a metric model parameter solving unit 704, configured to solve for the parameters of the asymmetric metric model using the samples of the two feature sample sets.

Optionally, the metric model built by the asymmetric metric model building unit comprises: an asymmetric metric model based on the Joint Bayesian face.

Optionally, the metric model parameter solving unit is specifically configured to estimate the parameters of the model using scatter matrices, or to solve for the parameters of the model iteratively using an expectation-maximization algorithm.
In addition, the present application also provides an image source identification method. See FIG. 8, a flowchart of an embodiment of an image source identification method provided by the present application; parts identical to the embodiments above are not repeated, and the differences are described below. The image source identification method provided by the present application comprises:

Step 801: collect object image sets belonging to different source categories, and extract object features from them to form a training sample set.

Step 802: train an object image source classification model using the object feature samples in the training sample set and their source categories.

The object image source classification model is usually a multi-class classification model; in a specific implementation, it may be trained with one of the following algorithms: the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.

Step 803: extract an object feature from the object image to be classified.

Step 804: taking the extracted object feature as input, identify the source category of the object image to be classified using the object image source classification model.
The image source identification method provided by this embodiment can be used to identify the source category of a face image; in that application scenario, the object image comprises a face image, the object feature comprises a face feature, and the pre-trained object image source classification model refers to a face image source classification model. Of course, in practical applications this method may also be used to identify the source categories of other object images.

The image source identification method provided by the present application can effectively identify the source category of an object image, thereby providing a basis for selecting the correct similarity metric model during object image recognition and ensuring the correctness of the recognition result.
The embodiment above provides an image source identification method; correspondingly, the present application also provides an image source identification device. See FIG. 9, a schematic diagram of an embodiment of an image source identification device of the present application. Since the device embodiment is essentially similar to the method embodiment, it is described relatively simply; for the relevant parts, refer to the description of the method embodiment. The device embodiment described below is merely illustrative.

An image source identification device of this embodiment comprises: a training sample collection unit 901, configured to collect object image sets belonging to different source categories and extract object features from them to form a training sample set; a classification model training unit 902, configured to train an object image source classification model using the object feature samples in the training sample set and their source categories; a to-be-classified feature extraction unit 903, configured to extract an object feature from the object image to be classified; and a source category identification unit 904, configured to identify, taking the object feature extracted by the to-be-classified feature extraction unit as input, the source category of the object image to be classified using the object image source classification model.

Optionally, the object image source classification model comprises: a multi-class classification model;

the classification model training unit is specifically configured to train the object image source classification model using the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
Although the present application is disclosed above through preferred embodiments, they are not intended to limit it. Any person skilled in the art may make possible changes and modifications without departing from the spirit and scope of the present application; the scope of protection of the present application shall therefore be that defined by its claims.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.

Claims (35)

  1. An image recognition method, characterized by comprising:
    acquiring an object image to be recognized;
    extracting an object feature from the object image to be recognized;
    selecting, from a pre-trained metric model set, a similarity metric model corresponding to the source category of the object image to be recognized, and computing the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result;
    wherein the metric model set contains at least one similarity metric model, and different similarity metric models respectively correspond to different source categories of object images.
  2. The image recognition method according to claim 1, characterized in that the similarity metric models of the metric model set corresponding to different source categories are trained using a reference object image training set belonging to a preset source category together with comparison object image training sets corresponding to the different source categories.
  3. The image recognition method according to claim 2, characterized in that the object images in the reference object image training set and the registered images belong to the same source category.
  4. The image recognition method according to claim 1, characterized in that, before the step of selecting, from the pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized, the following operation is performed:
    taking the object feature as input, determining the source category of the object image to be recognized using a pre-trained object image source classification model.
  5. The image recognition method according to claim 4, characterized in that the object image source classification model is a multi-class classification model trained with one of the following algorithms:
    the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  6. The image recognition method according to claim 1, characterized in that the similarity metric model comprises: an asymmetric metric model built under the assumption that the compared object features follow their own respective Gaussian distributions.
  7. The image recognition method according to claim 6, characterized in that the asymmetric metric model comprises: an asymmetric metric model based on the Joint Bayesian face;
    the asymmetric metric model corresponding to a particular source category is trained by the following steps:
    extracting the object features of the images in a reference object image training set belonging to a preset source category, as a reference feature sample set;
    extracting the object features of the images in a comparison object image training set belonging to the particular source category, as a comparison feature sample set;
    under the assumption that the compared object features follow their own respective Gaussian distributions, building an asymmetric metric model containing parameters;
    solving for the parameters of the asymmetric metric model from the samples of the two feature sample sets and identity labels indicating whether samples belong to the same object, completing the training of the model.
  8. The image recognition method according to claim 7, characterized in that the asymmetric metric model corresponding to a particular source category is as follows:
    $$r(x,y) = x^{T}Ax + y^{T}By - 2\,x^{T}Gy$$
    $$A = (S_{xx}+T_{xx})^{-1}-E$$
    $$B = (S_{yy}+T_{yy})^{-1}-F$$
    $$G = -\left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}S_{xy}(S_{yy}+T_{yy})^{-1}$$
    $$E = \left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}$$
    $$F = \left(S_{yy}+T_{yy}-S_{yx}(S_{xx}+T_{xx})^{-1}S_{xy}\right)^{-1}$$
    where a sample of the reference feature sample set X is assumed to be $x = \mu_x + \varepsilon_x$, with $\mu_x$ and $\varepsilon_x$ following zero-mean Gaussian distributions with covariance matrices $S_{xx}$ and $T_{xx}$; a sample of the comparison feature sample set Y is assumed to be $y = \mu_y + \varepsilon_y$, with $\mu_y$ and $\varepsilon_y$ following zero-mean Gaussian distributions with covariance matrices $S_{yy}$ and $T_{yy}$; $S_{xy}$ and $S_{yx}$ are the cross-covariance matrices between X and Y; and $r(x,y)$ is the similarity computed from the intra-class/inter-class log-likelihood ratio;
    solving for the parameters of the asymmetric metric model comprises: solving for $S_{xx}$, $T_{xx}$, $S_{yy}$, $T_{yy}$, $S_{xy}$, and $S_{yx}$.
  9. The image recognition method according to claim 7, characterized in that solving for the parameters of the asymmetric metric model comprises:
    estimating the parameters of the model using scatter matrices; or
    solving for the parameters of the model iteratively using an expectation-maximization algorithm.
  10. The image recognition method according to claim 1, characterized in that computing the similarity between the object feature and object features of registered images comprises:
    computing the similarity between the object feature and the object feature of a registered image corresponding to a particular identity;
    after the similarity computation, the following operations are performed:
    determining whether the similarity is greater than a preset threshold;
    if so, determining that the object image to be recognized and the registered image corresponding to the particular identity belong to the same object, and outputting this determination as the object recognition result.
  11. The image recognition method according to claim 1, characterized in that computing the similarity between the object feature and object features of registered images comprises:
    computing the similarities between the object feature and the object features of registered images within a specified range;
    after the similarity computation, the following operations are performed:
    determining whether the maximum of the computed similarities is greater than a preset threshold;
    if so, determining that the object image to be recognized has been matched among the registered images within the specified range, and outputting, as the object recognition result, the identity information associated with the registered image corresponding to the maximum.
  12. The image recognition method according to any one of claims 1-11, characterized in that extracting the object feature of the object image to be recognized comprises:
    extracting the object feature using a local binary pattern algorithm; or
    extracting the object feature using a Gabor wavelet transform algorithm; or
    extracting the object feature using a deep convolutional network.
  13. The image recognition method according to any one of claims 1-11, characterized in that the object image to be recognized comprises: a face image to be recognized; and the object feature comprises: a face feature.
  14. The image recognition method according to claim 13, characterized in that the source categories include:
    ID photo, everyday photo, video screenshot, scanned image, recaptured image, or surveillance frame.
  15. An image recognition device, characterized by comprising:
    an image acquisition unit, configured to acquire an object image to be recognized;
    a feature extraction unit, configured to extract an object feature from the object image to be recognized;
    a similarity computation unit, configured to select, from a pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized, and to compute the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result;
    wherein the similarity computation unit comprises:
    a metric model selection subunit, configured to select, from the pre-trained metric model set, the similarity metric model corresponding to the source category of the object image to be recognized; and
    a computation execution subunit, configured to compute, using the similarity metric model selected by the metric model selection subunit, the similarity between the object feature and object features of registered images as the basis for outputting an object recognition result.
  16. The image recognition device according to claim 15, characterized by comprising:
    a metric model training unit, configured to train the similarity metric models of the metric model set corresponding to different source categories using a reference object image training set belonging to a preset source category together with comparison object image training sets corresponding to the different source categories.
  17. The image recognition device according to claim 15, characterized by comprising:
    a source category determination unit, configured to determine, before the similarity computation unit is triggered, the source category of the object image to be recognized using a pre-trained object image source classification model with the object feature as input.
  18. The image recognition device according to claim 17, characterized by comprising:
    a source classification model training unit, configured to train, before the source category determination unit is triggered, the object image source classification model using one of the following algorithms: the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  19. The image recognition device according to claim 15, characterized by comprising:
    a metric model training unit, configured to train the similarity metric models of the metric model set, the similarity metric models comprising: asymmetric metric models built on the basis of the Joint Bayesian face under the assumption that the compared object features follow their own respective Gaussian distributions;
    the metric model training unit training the asymmetric metric model corresponding to a particular source category through the following subunits:
    a reference sample extraction subunit, configured to extract the object features of the images in a reference object image training set belonging to a preset source category, as a reference feature sample set;
    a comparison sample extraction subunit, configured to extract the object features of the images in a comparison object image training set belonging to the particular source category, as a comparison feature sample set;
    a metric model building subunit, configured to build, under the assumption that the compared object features follow their own respective Gaussian distributions, an asymmetric metric model containing parameters; and
    a model parameter solving subunit, configured to solve for the parameters of the asymmetric metric model from the samples of the two feature sample sets and identity labels indicating whether samples belong to the same object, completing the training of the model.
  20. The image recognition device according to claim 19, characterized in that the model parameter solving subunit is specifically configured to estimate the parameters of the model using scatter matrices, or to solve for the parameters of the model iteratively using an expectation-maximization algorithm.
  21. The image recognition device according to claim 15, characterized in that the computation execution subunit is specifically configured to compute the similarity between the object feature and the object feature of a registered image corresponding to a particular identity;
    the device further comprising:
    a first threshold comparison unit, configured to determine whether the similarity is greater than a preset threshold; and
    a first recognition result output unit, configured to determine, when the output of the first threshold comparison unit is yes, that the object image to be recognized and the registered image corresponding to the particular identity belong to the same object, and to output this determination as the object recognition result.
  22. The image recognition device according to claim 15, characterized in that the computation execution subunit is specifically configured to compute the similarities between the object feature and the object features of registered images within a specified range;
    the device further comprising:
    a second threshold comparison unit, configured to determine whether the maximum of the computed similarities is greater than a preset threshold; and
    a second recognition result output unit, configured to determine, when the output of the second threshold comparison unit is yes, that the object image to be recognized has been matched among the registered images within the specified range, and to output, as the object recognition result, the identity information associated with the registered image corresponding to the maximum.
  23. The image recognition device according to any one of claims 15-22, characterized in that the feature extraction unit is specifically configured to extract the object feature using a local binary pattern algorithm, a Gabor wavelet transform algorithm, or a deep convolutional network.
  24. A metric learning method, characterized by comprising:
    extracting the object features of the images in a reference object image training set belonging to a single source category, as a reference feature sample set;
    extracting the object features of the images in a comparison object image training set belonging to a single source category different from that of the reference object images, as a comparison feature sample set;
    under the assumption that the compared object features follow their own respective Gaussian distributions, building an asymmetric metric model containing parameters;
    solving for the parameters of the asymmetric metric model using the samples of the two feature sample sets.
  25. The metric learning method according to claim 24, characterized in that the asymmetric metric model comprises: an asymmetric metric model based on the Joint Bayesian face;
    the asymmetric metric model being as follows:
    $$r(x,y) = x^{T}Ax + y^{T}By - 2\,x^{T}Gy$$
    $$A = (S_{xx}+T_{xx})^{-1}-E$$
    $$B = (S_{yy}+T_{yy})^{-1}-F$$
    $$G = -\left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}S_{xy}(S_{yy}+T_{yy})^{-1}$$
    $$E = \left(S_{xx}+T_{xx}-S_{xy}(S_{yy}+T_{yy})^{-1}S_{yx}\right)^{-1}$$
    $$F = \left(S_{yy}+T_{yy}-S_{yx}(S_{xx}+T_{xx})^{-1}S_{xy}\right)^{-1}$$
    where a sample of the reference feature sample set X is assumed to be $x = \mu_x + \varepsilon_x$, with $\mu_x$ and $\varepsilon_x$ following zero-mean Gaussian distributions with covariance matrices $S_{xx}$ and $T_{xx}$; a sample of the comparison feature sample set Y is assumed to be $y = \mu_y + \varepsilon_y$, with $\mu_y$ and $\varepsilon_y$ following zero-mean Gaussian distributions with covariance matrices $S_{yy}$ and $T_{yy}$; $S_{xy}$ and $S_{yx}$ are the cross-covariance matrices between X and Y; and $r(x,y)$ is the similarity computed from the intra-class/inter-class log-likelihood ratio;
    solving for the parameters of the asymmetric metric model comprises: solving for $S_{xx}$, $T_{xx}$, $S_{yy}$, $T_{yy}$, $S_{xy}$, and $S_{yx}$.
  26. The metric learning method according to claim 25, characterized in that solving for the parameters of the asymmetric metric model comprises:
    estimating the parameters of the model using scatter matrices; or
    solving for the parameters of the model iteratively using an expectation-maximization algorithm.
  27. The metric learning method according to any one of claims 24-26, characterized in that the reference object images and the comparison object images comprise: face images; and the object features comprise: face features.
  28. A metric learning device, characterized by comprising:
    a reference sample extraction unit, configured to extract the object features of the images in a reference object image training set belonging to a single source category, as a reference feature sample set;
    a comparison sample extraction unit, configured to extract the object features of the images in a comparison object image training set belonging to a single source category different from that of the reference object images, as a comparison feature sample set;
    an asymmetric metric model building unit, configured to build, under the assumption that the compared object features follow their own respective Gaussian distributions, an asymmetric metric model containing parameters; and
    a metric model parameter solving unit, configured to solve for the parameters of the asymmetric metric model using the samples of the two feature sample sets.
  29. The metric learning device according to claim 28, characterized in that the metric model built by the asymmetric metric model building unit comprises: an asymmetric metric model based on the Joint Bayesian face.
  30. The metric learning device according to claim 29, characterized in that the metric model parameter solving unit is specifically configured to estimate the parameters of the model using scatter matrices, or to solve for the parameters of the model iteratively using an expectation-maximization algorithm.
  31. An image source identification method, characterized by comprising:
    collecting object image sets belonging to different source categories, and extracting object features from them to form a training sample set;
    training an object image source classification model using the object feature samples in the training sample set and their source categories;
    extracting an object feature from an object image to be classified;
    taking the extracted object feature as input, identifying the source category of the object image to be classified using the object image source classification model.
  32. The image source identification method according to claim 31, characterized in that the object image source classification model is a multi-class classification model trained with one of the following algorithms:
    the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
  33. The image source identification method according to claim 31 or 32, characterized in that the object image comprises: a face image; and the object feature comprises: a face feature.
  34. An image source identification device, characterized by comprising:
    a training sample collection unit, configured to collect object image sets belonging to different source categories and extract object features from them to form a training sample set;
    a classification model training unit, configured to train an image source classification model using the object feature samples in the training sample set and their source categories;
    a to-be-classified feature extraction unit, configured to extract an object feature from an object image to be classified; and
    a source category identification unit, configured to identify, taking the object feature extracted by the to-be-classified feature extraction unit as input, the source category of the object image to be classified using the object image source classification model.
  35. The image source identification device according to claim 34, characterized in that the object image source classification model comprises: a multi-class classification model;
    the classification model training unit being specifically configured to train the object image source classification model using the Softmax algorithm, a multi-class SVM algorithm, or a random forest algorithm.
PCT/CN2016/092785 2015-08-11 2016-08-02 Image recognition method, metric learning method, image source identification method and devices WO2017024963A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510490041.8 2015-08-11
CN201510490041.8A CN106446754A (zh) 2015-08-11 2015-08-11 Image recognition method, metric learning method, image source identification method and devices

Publications (1)

Publication Number Publication Date
WO2017024963A1 (zh) 2017-02-16

Family

ID=57982977

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/092785 WO2017024963A1 (zh) 2015-08-11 2016-08-02 Image recognition method, metric learning method, image source identification method and devices

Country Status (2)

Country Link
CN (1) CN106446754A (zh)
WO (1) WO2017024963A1 (zh)


Also Published As

Publication number Publication date
CN106446754A (zh) 2017-02-22

