WO2019232862A1 - Mouth model training method, mouth recognition method, apparatus, device, and medium - Google Patents

Mouth model training method, mouth recognition method, apparatus, device, and medium

Info

Publication number
WO2019232862A1
WO2019232862A1 (PCT/CN2018/094289, CN2018094289W)
Authority
WO
WIPO (PCT)
Prior art keywords
mouth
sample data
face image
training
image sample
Prior art date
Application number
PCT/CN2018/094289
Other languages
English (en)
French (fr)
Inventor
戴磊
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019232862A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Definitions

  • The present application relates to the field of computer technology, and in particular to a mouth model training method, a mouth recognition method, an apparatus, a device, and a medium.
  • With the rapid development of artificial intelligence, facial feature localization and recognition has received extensive attention and has become a hot topic in the field of artificial intelligence. Conventionally, existing facial feature point recognition algorithms can mark the positions of different organs, such as the eyes, ears, mouth, or nose, in a face picture; even when the corresponding part is partially occluded (by glasses, hair, a hand covering the mouth, and so on), the algorithm can still identify the relative positions of the different parts and provide the corresponding picture. However, some image processing tasks require an unoccluded mouth image, and the mouth pictures identified by conventional facial feature point recognition algorithms cannot filter out occluded pictures, which easily introduces errors and is not conducive to further processing.
  • A mouth model training method includes: obtaining a face image sample, marking the face image sample to obtain face image sample data, and extracting a feature vector of the face image sample from the face image sample data, where the face image sample data includes face image samples and annotation data; dividing the face image sample data into training sample data and verification sample data; training a support vector machine classifier with the training sample data to obtain a critical surface of the support vector machine classifier; calculating a vector distance between the feature vector of each verification sample in the verification sample data and the critical surface; obtaining a preset true class rate or a preset false positive class rate, and obtaining a classification threshold according to the vector distance and the annotation data corresponding to the verification sample data; and obtaining a mouth judgment model according to the classification threshold.
  • a mouth model training device includes:
  • a facial image sample data acquisition module is configured to acquire a facial image sample, mark the facial image sample to obtain facial image sample data, and extract a feature vector of the facial image sample from the facial image sample data.
  • a face image sample data division module configured to divide the face image sample data into training sample data and verification sample data
  • a critical surface acquisition module configured to train a support vector machine classifier using the training sample data to obtain a critical surface of the support vector machine classifier
  • a vector distance calculation module configured to calculate a vector distance between a feature vector of a verification sample and the critical surface in the verification sample data
  • a classification threshold obtaining module configured to obtain a preset true classification rate or a preset false positive classification rate, and obtain a classification threshold according to the vector distance and the labeled data corresponding to the verification sample data;
  • the mouth judgment model acquisition module is configured to obtain a mouth judgment model according to the classification threshold.
  • A mouth recognition method includes: obtaining a face picture to be identified, and using a facial feature point detection algorithm to obtain a forward mouth area image; performing normalization processing on the forward mouth area image to obtain a mouth image to be identified; and inputting the mouth image to be identified into a mouth judgment model trained by the above mouth model training method for recognition, to obtain a recognition result.
  • a mouth identification device includes:
  • a face picture to be identified acquisition module, configured to obtain a face picture to be identified and use a facial feature point detection algorithm to obtain a forward mouth area image;
  • a to-be-recognized mouth image acquisition module configured to perform normalization processing on the forward mouth area image to obtain the to-be-recognized mouth image
  • a recognition result acquisition module is configured to input the mouth image to be identified into a mouth judgment model trained by the mouth model training method to recognize and obtain a recognition result.
  • A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the steps of the above mouth model training method are implemented, so that a mouth judgment model is obtained.
  • One or more non-volatile readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the above mouth model training method, so that a mouth judgment model is obtained.
  • FIG. 1 is a schematic diagram of an application environment of a mouth model training method and a mouth recognition method according to an embodiment of the present application
  • FIG. 2 is an implementation flowchart of a mouth model training method provided by an embodiment of the present application
  • FIG. 3 is an implementation flowchart of step S10 in a mouth model training method provided by an embodiment of the present application;
  • FIG. 4 is an implementation flowchart of step S30 in a mouth model training method provided by an embodiment of the present application;
  • FIG. 5 is an implementation flowchart of step S15 in a mouth model training method provided by an embodiment of the present application;
  • FIG. 6 is an implementation flowchart of step S50 in a mouth model training method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a mouth model training device according to an embodiment of the present application.
  • FIG. 8 is an implementation flowchart of a mouth recognition method provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a mouth recognition device provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a computer device according to an embodiment of the present application.
  • The mouth model training method provided by the present application can be applied in the application environment shown in FIG. 1, in which a client communicates with a server through a network; the server receives training sample data sent by the client and establishes a mouth judgment classification model, and further receives verification samples sent by the client to perform mouth model training.
  • the client can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server can be implemented by an independent server or a server cluster composed of multiple servers.
  • the method is applied to the server in FIG. 1 as an example for description, and includes the following steps:
  • S10 Obtain a facial image sample, mark the facial image sample to obtain facial image sample data, and extract a feature vector of the facial image sample from the facial image sample data, where the facial image sample data includes face image samples and annotation data.
  • the face image sample data is mouth image data used for model training.
  • The feature vector of a face image sample refers to a vector used to characterize the image information features of each face image sample in the face image sample data, for example a HOG (Histogram of Oriented Gradient) feature vector, an LBP (Local Binary Patterns) feature vector, or a PCA (Principal Component Analysis) feature vector.
  • Feature vectors can represent image information with simple data and avoid repeated extraction operations in subsequent training processes.
  • a HOG feature vector of a face image sample can be extracted. Because the HOG feature vector of the face image sample is described by the gradient of the local information of the face image sample, extracting the HOG feature vector of the face image sample can avoid the influence of factors such as geometric deformation and light change on the training of the mouth model.
  • Marking a face image sample refers to dividing the face image samples into positive samples (unoccluded mouth images) and negative samples (occluded mouth images) according to the content of the samples; the face image sample data is obtained after these two kinds of sample data are labeled respectively.
  • the face image samples include positive samples and negative samples. Understandably, the face image sample data includes face image samples and annotation data.
  • the number of negative samples is 2-3 times the number of positive samples, which can make the sample information more comprehensive and improve the accuracy of model training.
  • In this embodiment, the face image sample data is acquired for subsequent model training, and occluded mouth images are included as face image samples for training, which reduces the false detection rate.
  • the face image sample data includes, but is not limited to, a face image sample collected in advance and a face image sample stored in a commonly used face database in a memory in advance.
  • S20 Divide the face image sample data into training sample data and verification sample data.
  • the training sample data is sample data for learning, and a classifier is established by matching some parameters, that is, using the face image samples in the training sample data to train a machine learning model to determine the parameters of the machine learning model.
  • Validation sample data is sample data used to verify the resolving power (such as recognition rate) of a trained machine learning model.
  • Optionally, 70%-75% of the face image sample data is used as training sample data, and the rest is used as verification sample data.
  • In a specific embodiment, 300 positive samples and 700 negative samples, 1000 face image samples in total, are combined into the face image sample data, of which 260 samples are used as verification sample data and 740 samples are used as training sample data.
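Purely as an illustration of this split (none of the names, library calls, or numbers below are prescribed by the patent; the 300/700 composition and 740/260 split mirror the example above), a scikit-learn style sketch might look like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins for the HOG feature vectors of 1000 face image samples:
# 300 positive (label 1, unoccluded mouth) and 700 negative (label -1, occluded mouth).
X = np.random.rand(1000, 864)
y = np.array([1] * 300 + [-1] * 700)

# Keep ~74% for training and ~26% for verification, preserving the class ratio.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, train_size=0.74, stratify=y, random_state=0)
print(X_train.shape, X_val.shape)   # (740, 864) (260, 864)
```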
  • S30 Use the training sample data to train the support vector machine classifier to obtain the critical surface of the support vector machine classifier.
  • A Support Vector Machine (SVM) classifier is a discriminative classifier defined by a classification critical surface and is used to classify data or perform regression analysis.
  • The critical surface is a classification surface that can correctly separate the positive samples from the negative samples while maximizing the distance between the two classes.
  • Specifically, according to the characteristics of the face image sample data, a suitable kernel function is selected, and a kernel function operation is performed on the feature vectors of the training sample data, so that the feature vectors are mapped into a high-dimensional feature space in which they become linearly separable; a critical surface is thereby obtained and used as the classification surface that separates the positive samples from the negative samples.
  • In other words, given the training sample data as input, the support vector machine classifier outputs a critical surface with which the training sample data are classified. Obtaining the critical surface simplifies the classification process of the support vector machine classifier.
  • In this embodiment, a critical surface with good classification ability is obtained by training the support vector machine classifier on the feature vectors of the face image samples, which improves the efficiency of mouth model training.
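A minimal sketch of this training step, assuming a linear kernel and scikit-learn (the library choice, dummy data, and parameter values are illustrative assumptions, not part of the patent):

```python
import numpy as np
from sklearn.svm import SVC

# Dummy training split standing in for the HOG feature vectors and labels.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(740, 864))
y_train = np.where(rng.random(740) < 0.3, 1, -1)   # 1 = unoccluded, -1 = occluded

clf = SVC(kernel="linear", C=1.0)   # C is the penalty parameter discussed below
clf.fit(X_train, y_train)

# For a linear kernel the critical surface is g(x) = w.x + b.
w = clf.coef_[0]
b = clf.intercept_[0]
```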
  • S40 Calculate a vector distance between the feature vector of each verification sample in the verification sample data and the critical surface.
  • The verification sample data is pre-stored face image sample data used for verification, which includes positive sample data (unoccluded mouth images) and negative sample data (occluded mouth images); the verification samples are obtained after these two kinds of sample data are labeled respectively.
  • the feature vector of the verification sample refers to a feature vector obtained by extracting a feature vector from the verification sample.
  • the feature vectors of the verification samples include, but are not limited to, HOG feature vectors, LBP feature vectors, and PCA feature vectors.
  • The vector distance between the feature vector of a verification sample and the critical surface refers to the distance, in the mathematical sense, between the directed line segment corresponding to the feature vector and the plane corresponding to the critical surface, that is, the distance from a line to a plane in the mathematical sense; this distance is a numerical value, and it is the vector distance. Assuming the critical surface is expressed as \(g(x) = wx + b\), where \(w\) is a multi-dimensional vector \(w = [w_1, w_2, w_3, \ldots, w_n]\), the vector distance from a feature vector \(x\) to the critical surface is \(d = |g(x)| / \|w\| = |wx + b| / \|w\|\), where \(\|w\|\) denotes the norm of \(w\), i.e. \(\|w\| = \sqrt{w_1^2 + w_2^2 + \cdots + w_n^2}\). By calculating this vector distance for each verification sample, the closeness of each verification sample to the class it belongs to can be compared intuitively.
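A toy numerical check of the distance formula above (all numbers are made up for illustration):

```python
import numpy as np

w = np.array([0.4, -1.2, 0.3])    # normal vector of the critical surface g(x) = w.x + b
b = 0.5
x = np.array([1.0, 0.2, -0.7])    # feature vector of one verification sample

g_x = np.dot(w, x) + b
vector_distance = abs(g_x) / np.linalg.norm(w)   # d = |w.x + b| / ||w||
print(round(vector_distance, 4))                 # 0.3462
```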
  • S50 Obtain a preset true class rate or a preset false positive class rate, and obtain a classification threshold based on the vector distance and the labeled data corresponding to the verification sample data.
  • The preset true class rate refers to a preset ratio of the number of positive samples that are correctly judged as positive to the total number of positive samples; the preset false positive class rate refers to a preset ratio of the number of negative samples that are wrongly judged as positive to the total number of negative samples.
  • In this embodiment, the true class rate is the ratio of unoccluded-mouth face image samples that are judged as unoccluded to the total number of unoccluded-mouth face image samples, and the false positive class rate is the ratio of occluded-mouth face image samples that are judged as unoccluded to the total number of occluded-mouth face image samples.
  • It is easy to understand that the higher the true class rate or the lower the false positive class rate, the more stringent the classification requirement and the more application scenarios it can accommodate.
  • Preferably, when the preset true class rate in this embodiment is 95%, or the preset false positive class rate is 5%, a good classification effect can be obtained for a variety of application scenarios; by setting the true class rate or false positive class rate reasonably, the adaptability of the support vector machine classifier is better extended.
  • the preset true class rate or the preset false positive class rate here is the preferred range of this application, but it can be set according to the needs of the actual application occasion, and there is no limitation here.
  • the classification threshold is a critical value used to classify samples. Specifically, when samples are classified, a judgment that is lower than the classification threshold is a positive sample, and a judgment that is higher than the classification threshold is a negative sample.
  • the annotation data corresponding to the verification sample data refers to the annotation of the verification sample, for example, a positive sample is marked as 1 and a negative sample is marked as -1.
  • the classification threshold is calculated according to a preset true class rate or a preset false positive class rate.
  • For example, assume the preset false positive class rate is 10% and there are 15 verification samples S1, S2, ..., S15, of which 5 are positive samples and 10 are negative samples, and the vector distances between the feature vectors of the 10 negative samples and the critical surface are 1, 2, ..., 10 respectively; when the classification threshold lies in the interval [1, 2], for example 1.5, the preset false positive class rate of 10% can be satisfied.
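A hypothetical numerical sketch of this check, using the distances from the example above (nothing here is mandated by the patent):

```python
import numpy as np

# Vector distances between the 10 negative verification samples and the critical surface.
# By the convention above, a sample below the threshold is judged positive (unoccluded).
neg_distances = np.arange(1, 11)        # 1, 2, ..., 10
preset_fpr = 0.10

def false_positive_class_rate(threshold):
    return np.mean(neg_distances < threshold)

print(false_positive_class_rate(1.5))   # 0.1 -> a threshold of 1.5 meets the preset rate
print(false_positive_class_rate(2.5))   # 0.2 -> too many occluded mouths pass as unoccluded
```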
  • S60 Obtain a mouth judgment model according to the classification threshold.
  • The mouth judgment model is a model for judging whether the mouth position in a face image sample is occluded.
  • Specifically, after the classification threshold is determined, the vector distance between the feature vector of a face image sample and the critical surface of the support vector machine classifier is compared with the classification threshold, and the face image sample is classified according to the comparison result, thereby determining whether the mouth position in the face image sample is occluded or unoccluded. Therefore, once the classification threshold is given, the mouth judgment model is established: after a face image to be identified is input into the mouth judgment model, a yes-or-no classification result is given directly according to the classification threshold, which avoids repeated training and improves the efficiency of mouth model training.
  • In this embodiment, a face image sample is first obtained and marked to obtain face image sample data, and the feature vectors of the face image samples are extracted; the face image sample data is then divided into training sample data and verification sample data. The training sample data is used to train the support vector machine classifier to obtain its critical surface, which simplifies the classification process. The vector distance between the feature vector of each verification sample and the critical surface of the support vector machine classifier is then calculated, so that the closeness of each verification sample to the class it belongs to can be compared intuitively. A preset true class rate or preset false positive class rate is obtained in order to extend the adaptability of the support vector machine classifier, the classification threshold is obtained from the vector distances and the annotation data corresponding to the verification sample data, and finally the mouth judgment model is obtained, which avoids repeated training and improves the efficiency of mouth model training.
  • In an embodiment, as shown in FIG. 3, step S10, i.e. extracting the feature vector of the facial image sample from the facial image sample data, specifically includes the following steps:
  • S11 Use a facial feature point detection algorithm to obtain facial feature points, where the facial feature points include a left mouth corner point, a right mouth corner point, and a nasal tip point.
  • the facial feature point detection algorithm refers to an algorithm for detecting facial features and marking position information.
  • Face feature points refer to points used to mark the contours of the eyes, nose, and mouth, such as corner points, nose points, and mouth corner points.
  • the face feature point detection algorithm includes, but is not limited to, a face feature point detection algorithm based on deep learning, a face feature point detection algorithm based on a model, or a face feature point detection algorithm based on cascade shape regression.
  • Optionally, the facial feature points can be obtained by using the Haar-feature-based Viola-Jones algorithm that comes with OpenCV.
  • OpenCV is a cross-platform computer vision library that can run on the Linux, Windows, Android, and Mac OS operating systems. It consists of a series of C functions and a small number of C++ classes, and it also provides interfaces for languages such as Python, Ruby, and MATLAB.
  • Many general algorithms in image processing and computer vision are implemented in it, and the Haar-feature-based Viola-Jones algorithm is one of its facial feature point detection algorithms.
  • A Haar feature is a feature that reflects the gray-level changes of an image and the differences between pixel sub-modules. Haar features are divided into three categories: edge features, linear features, and center-diagonal features.
  • The Viola-Jones algorithm is a method for face detection based on the Haar feature values of a face.
  • Specifically, the input face image sample data is obtained and preprocessed, then skin color region segmentation, facial feature region segmentation, and facial feature region classification are performed in sequence, and finally the Haar-feature-based Viola-Jones algorithm performs matching calculations against the facial feature region classification to obtain the facial feature point information of the face image.
  • a left mouth corner point, a right mouth corner point, and a nasal tip point of a face image sample are obtained by using a facial feature point detection algorithm, so as to determine the area where the mouth of the face image sample is located according to the position information of these feature points.
  • the left mouth corner point, right mouth corner point, and nasal tip point mentioned in this step refer to the three characteristic points corresponding to the mouth.
  • S12 Perform forward adjustment on the face image sample according to the left mouth corner point and the right mouth corner point.
  • Forward adjustment normalizes the orientation of the facial feature points so that they are set to a forward orientation.
  • In this embodiment, forward adjustment refers to adjusting the left mouth corner point and the right mouth corner point onto the same horizontal line (that is, making the vertical coordinates of the left mouth corner point and the right mouth corner point equal), thereby normalizing the mouth feature points to the same orientation, avoiding the influence of orientation changes of the training samples on model training, and improving the robustness of the face image samples to orientation changes.
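One way such a forward adjustment could be realized with OpenCV (rotating about the midpoint of the two mouth corners is an assumption of this sketch, not a requirement of the patent):

```python
import cv2
import numpy as np

def forward_adjust(image, left_corner, right_corner):
    """Rotate the face image so that the two mouth corner points share the same ordinate."""
    (x1, y1), (x2, y2) = left_corner, right_corner
    angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))       # tilt of the mouth line
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    rotation = cv2.getRotationMatrix2D(center, angle, 1.0)
    h, w = image.shape[:2]
    return cv2.warpAffine(image, rotation, (w, h))

# Tiny usage example on a dummy image with hypothetical mouth corner coordinates.
face = np.zeros((200, 200, 3), dtype=np.uint8)
upright = forward_adjust(face, (60, 130), (140, 150))
```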
  • S13 Construct a rectangular area of the mouth according to the left mouth corner point, the right mouth corner point, and the nasal tip point.
  • The mouth rectangular area is a rectangular area that contains the mouth image.
  • In a specific embodiment, the position coordinates of the left mouth corner point, the right mouth corner point, and the nasal tip point are located using the facial feature point detection algorithm, and the mouth rectangular area is constructed as follows: the abscissa of the left mouth corner point is the left boundary, the abscissa of the right mouth corner point is the right boundary, the ordinate of the nasal tip point is the upper boundary, and the ordinate of the left mouth corner point (or of the right mouth corner point) plus the vertical distance from the nasal tip point to the left mouth corner point is the lower boundary; the rectangular area defined by these four coordinates is the mouth rectangular area.
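A minimal sketch of this construction (pixel coordinates with y increasing downward; the sample points are invented for illustration):

```python
def mouth_rectangle(left_corner, right_corner, nose_tip):
    """Return the mouth rectangle as (left, top, right, bottom) from the three feature points."""
    (lx, ly), (rx, ry), (nx, ny) = left_corner, right_corner, nose_tip
    left = lx                  # abscissa of the left mouth corner point
    right = rx                 # abscissa of the right mouth corner point
    top = ny                   # ordinate of the nasal tip point
    bottom = ly + (ly - ny)    # left-corner ordinate plus nose-tip-to-corner vertical distance
    return left, top, right, bottom

print(mouth_rectangle((60, 140), (140, 142), (100, 110)))   # (60, 110, 140, 170)
```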
  • S14 Perform image normalization processing on the rectangular area of the mouth to obtain a normalized rectangular area of the mouth.
  • Normalization processing refers to performing a series of transformations on the image to be processed so that it is converted into a corresponding standard form, for example image size normalization, image grayscale normalization, and so on.
  • the normalization process refers to normalizing the size of the rectangular area of the mouth.
  • the rectangular area of the mouth is set to a fixed size according to the resolution of the face image sample.
  • the rectangular area of the mouth can be set to a Size (48,32) rectangle, that is, a rectangular area with a length of 48 pixels and a width of 32 pixels.
  • S15 Extract a HOG feature vector according to the normalized mouth rectangular area.
  • The HOG (Histogram of Oriented Gradient) feature vector is a vector used to describe the gradient direction information of a local area of an image. This feature is strongly affected by changes in image size and position, so fixing the input image range makes the calculated HOG feature vectors more uniform; model training can then focus on the difference between unoccluded and occluded mouth images without having to account for changes in mouth position, which makes training more convenient. At the same time, the HOG feature vector itself focuses on image gradient features rather than color features and is not greatly affected by illumination changes or geometric deformation; therefore, extracting HOG feature vectors is a convenient and efficient way to extract feature vectors from the face image samples.
  • Depending on the classification and detection target, the feature extraction also differs; generally, color, texture, and shape are used as target features. According to the accuracy requirements for detecting mouth images, this embodiment chooses to use a shape feature, namely the HOG feature vector of the training samples.
  • In this embodiment, the facial feature point detection algorithm is used to obtain the left mouth corner point, right mouth corner point, and nasal tip point among the facial feature points; the sample image is then adjusted to the forward orientation to improve the robustness of the face picture to orientation changes; next, the mouth rectangular area is constructed and subjected to image normalization to obtain the normalized mouth rectangular area, which is beneficial to the subsequent training of the support vector machine model; finally, the HOG feature vector of the normalized mouth rectangular area is extracted, so that feature vectors can be extracted conveniently and efficiently from the face image samples in the face image sample data.
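One possible realization of this step (resize to the fixed 48x32 window, then compute a HOG descriptor with scikit-image; the dummy crop and the cell/block sizes are assumptions consistent with the cell-unit example given further below):

```python
import cv2
import numpy as np
from skimage.feature import hog

# Dummy grayscale stand-in for a forward-adjusted mouth crop.
mouth_crop = np.random.randint(0, 256, (40, 60), dtype=np.uint8)

# Size normalization to the fixed 48x32 (width x height) window described above.
normalized = cv2.resize(mouth_crop, (48, 32))

# HOG descriptor: 4x4-pixel cells, 2x2 cells per block, 9 orientation bins.
feature_vector = hog(normalized, orientations=9, pixels_per_cell=(4, 4),
                     cells_per_block=(2, 2), block_norm="L2-Hys")
print(feature_vector.shape)
```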
  • In an embodiment, as shown in FIG. 4, step S30, i.e. training the support vector machine classifier with the training sample data to obtain the critical surface of the support vector machine classifier, specifically includes the following steps:
  • S31 Obtain the kernel function of the support vector machine classifier and the penalty parameter of the support vector machine classifier, and solve for the Lagrange multipliers \(\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_l^*)^T\) and the decision threshold \(b\) from the following optimization problem:
    \[ \min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{l}\alpha_i \qquad \text{s.t.}\quad \sum_{i=1}^{l}\alpha_i y_i = 0,\ \ 0 \le \alpha_i \le C,\ \ i = 1, 2, \ldots, l \]
  • In the formula, s.t. is the abbreviation for the constraint conditions, min means taking the minimum of the expression under the constraint conditions, \(K(x_i, x_j)\) is the kernel function of the support vector machine classifier, \(C\) is the penalty parameter of the support vector machine classifier with \(C > 0\), \(\alpha_i\) and the Lagrange multiplier \(\alpha_i^*\) are in a conjugate relationship, \(x_i\) is the feature vector of a training sample, \(l\) is the number of feature vectors of the training sample data, and \(y_i\) is the label of the training sample data.
  • The kernel function is the kernel function in the support vector machine classifier and is used to perform the kernel function operation on the feature vectors of the training samples input during training; it includes, but is not limited to, a linear kernel function, a polynomial kernel function, a Gaussian kernel function, and a radial-basis kernel function. Because the support vector machine classifier in this embodiment is linearly separable, a linear kernel function is preferably used, i.e. \(K(x_i, x_j) = (x_i \cdot x_j)\); the linear kernel has few parameters and a fast operation speed, which suits linearly separable cases. Since this is a two-class problem, the label \(y_i\) is either 1 or -1: \(y_i = 1\) if the face image sample is a positive sample, and \(y_i = -1\) if it is a negative sample.
  • The penalty parameter C is a fixed value used to optimize the support vector machine classifier; it can address the classification problem of sample skew, which occurs when the numbers of samples in the two classes (or in multiple classes) participating in classification differ greatly, for example 10,000 positive samples versus 100 negative samples, so that the positive samples are distributed over a wide range. To counter sample skew, the value of C can be increased reasonably according to the ratio of the number of positive samples to the number of negative samples; the larger C is, the smaller the fault tolerance of the classifier.
  • The decision threshold b is a real number used to determine the critical value for decision classification in the support vector machine classifier.
  • Specifically, after a suitable kernel function \(K(x_i, x_j)\) is obtained and a suitable penalty parameter C is set, the above formula is used to perform the kernel function operation on the feature vectors of the training sample data, and the optimization problem is solved; that is, the values of the Lagrange multipliers that minimize the objective after the kernel function operation are found, giving \(\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_l^*)^T\). Then a component \(\alpha_j^*\) of \(\alpha^*\) lying in the open interval \((0, C)\) is selected, and the value of b is calculated according to \(b^* = y_j - \sum_{i=1}^{l} \alpha_i^* y_i K(x_i, x_j)\).
  • In practice, the training program first extracts and saves the feature vectors of the samples, so that the extracted features can be reused while the training parameters are continuously adjusted over multiple training runs, saving time and allowing training parameters that meet the requirements to be found as soon as possible. In this way, the false positive rate and accuracy for a given category can be adjusted without repeatedly retraining the model, which improves model training efficiency.
  • S32 According to the Lagrange multipliers \(\alpha^*\) and the decision threshold b, the critical surface \(g(x)\) of the support vector machine classifier is obtained using the following formula: \(g(x) = \sum_{i=1}^{l} \alpha_i^* y_i K(x, x_i) + b^*\). In this embodiment, the critical surface \(g(x)\) is obtained so that subsequent face image samples can be classified against it without repeatedly training the model, which improves the efficiency of model training.
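For a linear kernel, the critical surface assembled from the dual solution can be checked against a library implementation; the sketch below (scikit-learn exposes the quantities alpha_i* * y_i as dual coefficients) is purely illustrative and uses made-up data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = np.where(X[:, 0] + 0.3 * rng.normal(size=200) > 0, 1, -1)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# dual_coef_ stores alpha_i* * y_i for the support vectors, so for a linear kernel
# g(x) = sum_i alpha_i* y_i K(x, x_i) + b reduces to w.x + b with w as below.
w = clf.dual_coef_ @ clf.support_vectors_      # shape (1, n_features)
b = clf.intercept_[0]

x_new = rng.normal(size=10)
g_manual = (w @ x_new)[0] + b
g_library = clf.decision_function(x_new.reshape(1, -1))[0]
print(np.isclose(g_manual, g_library))         # True
```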
  • In an embodiment, as shown in FIG. 5, step S15, i.e. extracting the HOG feature vector according to the normalized mouth rectangular area, specifically includes the following steps:
  • S151 Divide the normalized mouth rectangular area into cell units, and calculate the magnitude and direction of the gradient at each pixel of each cell unit.
  • Depending on the division scheme, the manner of dividing the normalized mouth rectangular area also differs, and adjacent sub-regions may or may not overlap.
  • Cell units are connected sub-regions of the image, that is, each sub-region is composed of multiple cell units; for example, for a 48*32 normalized mouth rectangular area, assuming a cell unit is 4*4 pixels and 2*2 cell units make up a sub-region, this normalized mouth rectangular area has 6*4 sub-regions.
  • The gradient direction range of each cell unit, from 0° to 180°, is divided into 9 intervals, so a 9-dimensional vector can be used to describe a cell unit.
  • Specifically, with \(H(x, y)\) denoting the pixel value at pixel \((x, y)\), \(G_x(x, y)\) the horizontal gradient at pixel \((x, y)\), and \(G_y(x, y)\) the vertical gradient at pixel \((x, y)\), the magnitude and direction of the pixel gradient are \(G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}\) and \(\alpha(x, y) = \arctan\big(G_y(x, y) / G_x(x, y)\big)\), where \(G(x, y)\) is the magnitude of the pixel gradient and \(\alpha(x, y)\) is the direction angle of the pixel gradient.
  • S152 Compute the gradient histogram from the magnitude and direction of the pixel gradients of each cell unit.
  • The gradient histogram is a histogram obtained by statistically accumulating the magnitudes and directions of the pixel gradients and is used to characterize the gradient information of each cell unit. Specifically, the gradient direction range of each cell unit, from 0° to 180°, is first divided into 9 direction blocks, that is, 0°-20° is the first direction block, 20°-40° is the second direction block, and so on, with 160°-180° being the ninth direction block. Then the direction block in which the gradient direction of each pixel of the cell unit falls is determined, and the magnitude of that pixel's gradient is accumulated into that direction block.
  • For example, if the gradient direction of a pixel falls in the third direction block, the bin value of the third direction block in the gradient histogram is increased by the magnitude of that pixel's gradient; processing all pixels of the cell unit in this way yields the gradient histogram of the cell unit.
  • S153 Concatenate the gradient histograms to obtain the HOG feature vector.
  • Concatenation refers to merging the gradient histograms of all cell units, from left to right and from top to bottom, to obtain the HOG feature vector of the normalized mouth rectangular area.
  • In this embodiment, the normalized mouth rectangular area is divided into several small regions, the gradient histograms of these small regions are calculated, and finally the gradient histograms corresponding to the small regions are concatenated to obtain the gradient histogram of the entire normalized mouth rectangular area, which is used as the feature vector describing the face image sample.
  • Since the HOG feature vector itself focuses on image gradient features rather than color features and is not affected much by illumination changes, extracting HOG feature vectors makes it easy and efficient to recognize mouth images.
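A toy NumPy sketch of steps S151 to S153 for a single cell unit, written only to make the binning concrete (the 4x4 cell and nine 20° bins follow the description above; the central-difference gradients and border handling are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
cell = rng.integers(0, 256, (6, 6)).astype(float)    # one 4x4 cell unit plus a 1-pixel border

# Horizontal and vertical gradients of the inner 4x4 cell (central differences).
gx = cell[1:-1, 2:] - cell[1:-1, :-2]
gy = cell[2:, 1:-1] - cell[:-2, 1:-1]
magnitude = np.hypot(gx, gy)
direction = np.degrees(np.arctan2(gy, gx)) % 180     # unsigned direction in [0, 180)

# Accumulate each pixel's gradient magnitude into one of nine 20-degree direction blocks.
bins = (direction // 20).astype(int)                 # 0..8
histogram = np.zeros(9)
np.add.at(histogram, bins.ravel(), magnitude.ravel())
print(histogram)   # 9-dimensional descriptor of one cell unit; cell histograms are concatenated
```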
  • In an embodiment, as shown in FIG. 6, step S50, i.e. obtaining a preset true class rate or a preset false positive class rate and obtaining a classification threshold according to the vector distance and the annotation data corresponding to the verification sample data, specifically includes the following steps:
  • S51 Draw an ROC curve according to the vector distances and the annotation data corresponding to the verification sample data.
  • The ROC curve (receiver operating characteristic curve) is a comprehensive indicator reflecting sensitivity and specificity as continuous variables, and it is a method of revealing the relationship between sensitivity and specificity.
  • For a support vector machine classifier, the ROC curve shows the relationship between the true class rate and the false positive class rate; the closer the curve is to the upper-left corner, the higher the accuracy of the classifier.
  • For the face image sample data, each sample is either a positive sample or a negative sample, and the classifier predicts it as either positive or negative, so four situations can occur: if a face image sample is a positive sample and is also predicted as positive, it is a true positive (TP); if it is a negative sample but is predicted as positive, it is a false positive (FP); if it is a negative sample and is predicted as negative, it is a true negative (TN); and if it is a positive sample but is predicted as negative, it is a false negative (FN).
  • the true class rate characterizes the ratio of positive instances identified by the classifier to all positive instances.
  • the false positive rate (FPR) characterizes the proportion of negative instances that the classifier mistakes for positive samples to all negative instances.
  • The process of drawing the ROC curve is as follows: according to the vector distances between the feature vectors of the verification samples and the critical surface, together with the corresponding annotation data of the verification samples, the true class rate and the false positive class rate are obtained at a series of candidate thresholds.
  • The ROC curve takes the false positive class rate as the horizontal axis and the true class rate as the vertical axis; the points, that is, the (false positive class rate, true class rate) pairs obtained at the candidate thresholds, are connected into a curve, and the area under the curve can then be calculated, with a larger area indicating better discriminative ability.
  • Preferably, an ROC curve drawing tool can be used for plotting; for example, the ROC curve can be drawn using the plotSVMroc(true_labels, predict_labels, classnumber) function in MATLAB, where true_labels are the correct labels and predict_labels are the labels produced by the classification judgment.
  • In this embodiment, the vector distance distribution, that is, the distribution of how close each verification sample is to the critical surface, together with the annotation data of the verification samples, is used to obtain the true class rates and false positive class rates of the verification sample data, and the ROC curve is then drawn based on these rates.
  • S52 Obtain a classification threshold on the horizontal axis of the ROC curve according to a preset true class rate or a preset false positive class rate.
  • the preset true class rate or preset false positive class rate is set according to actual use needs.
  • After the server obtains the preset true class rate or preset false positive class rate, the false positive class rate represented by the horizontal axis of the ROC curve and the true class rate represented by the vertical axis are compared with the preset values; that is, the preset true class rate or preset false positive class rate is used as the classification criterion for the verification sample data.
  • The classification threshold is determined from the horizontal axis of the ROC curve according to this criterion, so that in subsequent model training different classification thresholds can be selected for different scenarios directly from the ROC curve, which avoids repeated training and improves the efficiency of model training.
  • In this embodiment, after the vector distances between the feature vectors of the verification samples and the critical surface are calculated, the true class rate and the false positive class rate of the verification sample data can be obtained according to the corresponding annotation data, and the ROC curve can be drawn.
  • The classification threshold is then obtained from the horizontal axis of the ROC curve according to the preset true class rate or the preset false positive class rate, so that in subsequent model training different classification thresholds can be selected for different scenarios through the ROC curve, which avoids repeated training and improves the efficiency of model training.
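An illustrative sketch of S51 and S52 with scikit-learn; the distances, labels, and 5% preset false positive class rate are invented for the example, and the sign flip only adapts the "below the threshold means positive" convention used above to the library's "higher score means more positive" convention:

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
# 260 verification samples: +1 = unoccluded mouth, -1 = occluded mouth.
labels = np.where(rng.random(260) < 0.4, 1, -1)
# Vector distances to the critical surface; positives tend to lie closer to it here.
distance = np.where(labels == 1,
                    np.abs(rng.normal(0.8, 0.5, 260)),
                    np.abs(rng.normal(3.0, 1.0, 260)))

fpr, tpr, thresholds = roc_curve(labels, -distance, pos_label=1)

preset_fpr = 0.05
idx = np.searchsorted(fpr, preset_fpr, side="right") - 1   # last point with FPR <= preset
classification_threshold = -thresholds[idx]                # back to a distance threshold
print(fpr[idx], tpr[idx], classification_threshold)
```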
  • FIG. 7 shows a principle block diagram of a mouth model training device corresponding to the mouth model training method in the embodiment.
  • the mouth model training device includes a face image sample data acquisition module 10, a face image sample data division module 20, a critical surface acquisition module 30, a vector distance calculation module 40, a classification threshold acquisition module 50, and a mouth judgment Model acquisition module 60.
  • The implementation functions of the face image sample data acquisition module 10, the face image sample data division module 20, the critical surface acquisition module 30, the vector distance calculation module 40, the classification threshold acquisition module 50, and the mouth judgment model acquisition module 60 correspond one-to-one to the corresponding steps of the mouth model training method in the above embodiment.
  • the detailed description of each functional module is as follows:
  • a face image sample data obtaining module 10 configured to obtain a face image sample, and mark the face image sample to obtain the face image sample data; and extract a feature vector of the face image sample from the face image sample data.
  • the facial image sample data includes facial image samples and annotation data;
  • a face image sample data division module 20, configured to divide the face image sample data into training sample data and verification sample data;
  • a critical surface acquisition module 30 is configured to train a support vector machine classifier using training sample data to obtain a critical surface of the support vector machine classifier;
  • the vector distance calculation module 40 is configured to calculate a vector distance between a feature vector of a verification sample and a critical surface in the verification sample data;
  • a classification threshold obtaining module 50 configured to obtain a preset true class rate or a preset false positive class rate, and obtain a classification threshold according to a vector distance and labeled data corresponding to the verification sample data;
  • the mouth judgment model acquisition module 60 is configured to obtain a mouth judgment model according to a classification threshold.
  • the facial image sample data acquisition module 10 includes a facial feature point acquisition unit 11, a forward adjustment unit 12, a mouth rectangular region construction unit 13, a mouth rectangular region acquisition unit 14, and a feature vector extraction unit 15.
  • a facial feature point acquiring unit 11 is configured to acquire a facial feature point by using a facial feature point detection algorithm, and the facial feature point includes a left mouth corner point, a right mouth corner point, and a nasal tip point;
  • a forward adjustment unit 12 configured to perform forward adjustment on a face image sample according to a left mouth corner point and a right mouth corner point;
  • the mouth rectangular area construction unit 13 is configured to construct a mouth rectangular area according to the left mouth corner point, the right mouth corner point, and the nasal tip point;
  • the mouth rectangular area acquiring unit 14 is configured to perform image normalization processing on the mouth rectangular area to obtain a normalized mouth rectangular area;
  • a feature vector extraction unit 15 is configured to extract a HOG feature vector according to a normalized rectangular area of the mouth.
  • the feature vector extraction unit 15 includes a pixel gradient acquisition subunit 151, a gradient histogram acquisition subunit 152, and a HOG feature vector acquisition subunit 153.
  • a pixel gradient acquisition subunit 151 configured to divide the normalized mouth rectangular area into cell units, and calculate the size and direction of each pixel gradient of the cell unit;
  • the gradient histogram acquisition subunit 152 is used to count the gradient histogram of the magnitude and direction of each pixel gradient of the cell unit;
  • the HOG feature vector acquisition subunit 153 is used to concatenate gradient histograms to obtain a HOG feature vector.
  • the critical surface acquisition module 30 includes a parameter acquisition unit 31 and a critical surface acquisition unit 32.
  • The parameter obtaining unit 31 is used for obtaining the kernel function of the support vector machine classifier and the penalty parameter of the support vector machine classifier, and for solving for the Lagrange multipliers \(\alpha^*\) and the decision threshold b using the following formula:
    \[ \min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{l}\alpha_i \qquad \text{s.t.}\quad \sum_{i=1}^{l}\alpha_i y_i = 0,\ \ 0 \le \alpha_i \le C,\ \ i = 1, 2, \ldots, l \]
  • In the formula, s.t. is the abbreviation for the constraint conditions, min means taking the minimum of the expression under the constraint conditions, \(K(x_i, x_j)\) is the kernel function of the support vector machine classifier, \(C\) is the penalty parameter of the support vector machine classifier with \(C > 0\), \(\alpha_i\) and the Lagrange multiplier \(\alpha_i^*\) are in a conjugate relationship, \(x_i\) is the feature vector of a training sample, \(l\) is the number of feature vectors of the training sample data, and \(y_i\) is the label of the training sample data.
  • The critical surface acquisition unit 32 is configured to obtain, according to the Lagrange multipliers \(\alpha^*\) and the decision threshold b, the critical surface \(g(x)\) of the support vector machine classifier using the following formula: \(g(x) = \sum_{i=1}^{l} \alpha_i^* y_i K(x, x_i) + b^*\).
  • the classification threshold acquisition module 50 includes a ROC curve drawing unit 51 and a classification threshold acquisition unit 52.
  • the ROC curve drawing unit 51 is configured to draw an ROC curve according to the vector distance and the labeled data corresponding to the verification sample data;
  • the classification threshold acquiring unit 52 is configured to acquire a classification threshold on a horizontal axis of the ROC curve according to a preset true class rate or a preset false positive class rate.
  • Each module in the above-mentioned mouth model training device may be implemented in whole or in part by software, hardware, and a combination thereof.
  • The above modules may be embedded, in the form of hardware, in the processor of the computer device or be independent of it, or may be stored, in the form of software, in the memory of the computer device, so that the processor can call and execute the operations corresponding to the above modules.
  • a mouth recognition method is provided.
  • the mouth recognition method can also be applied in the application environment as shown in FIG. 1, where a computer device communicates with a server through a network.
  • the client communicates with the server through the network, and the server receives the face picture to be identified sent by the client for mouth recognition.
  • the client can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server can be implemented by an independent server or a server cluster composed of multiple servers.
  • the method is applied to the server in FIG. 1 as an example for description, and includes the following steps:
  • S70 Obtain a face picture to be identified, and use a facial feature point detection algorithm to obtain a forward mouth area image.
  • The face picture to be identified refers to a face picture on which mouth recognition needs to be performed.
  • the face image can be obtained by collecting a face picture in advance, or directly obtaining a face picture from a face database, such as an AR face database.
  • The face picture to be identified may contain an unoccluded mouth or an occluded mouth, and a facial feature point detection algorithm is used to obtain a forward mouth area image.
  • The implementation process of using the facial feature point detection algorithm to obtain the forward mouth area image is the same as that of steps S11 to S13, and details are not repeated here.
  • S80 Perform normalization processing on the forward image of the mouth area to obtain the mouth image to be identified.
  • the mouth image to be identified refers to a forward mouth region image after the normalization process is implemented.
  • By performing the normalization processing, the recognition efficiency can be improved: the normalized mouth image to be identified is transformed into a unified standard form, which prevents attributes in large numeric ranges from over-dominating attributes in small numeric ranges in the support vector machine classifier and also avoids numerical complexity during calculation.
  • the implementation process of normalizing the forward mouth area image is the same as step S14, and details are not described herein again.
  • S90 Input the mouth image to be identified into the mouth judgment model trained by the above mouth model training method for recognition, and obtain a recognition result.
  • The recognition result is the result obtained by using the mouth judgment model to recognize the mouth image to be identified, and it covers two cases: the mouth image to be identified is an unoccluded mouth image, or the mouth image to be identified is an occluded mouth image. Specifically, the mouth image to be identified is input into the mouth judgment model for recognition, so as to obtain the recognition result.
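Tying the recognition steps together, a hypothetical end-to-end sketch; forward_adjust and mouth_rectangle are the helper sketches given earlier, and clf/threshold stand for the trained linear-kernel classifier and the classification threshold obtained above (all of it illustrative, not the patent's reference implementation):

```python
import cv2
import numpy as np
from skimage.feature import hog

def recognize_mouth(face_img, left_corner, right_corner, nose_tip, clf, threshold):
    """Return True if the mouth in the face picture is judged to be unoccluded."""
    upright = forward_adjust(face_img, left_corner, right_corner)          # S70: orientation
    left, top, right, bottom = mouth_rectangle(left_corner, right_corner, nose_tip)
    crop = cv2.cvtColor(upright[top:bottom, left:right], cv2.COLOR_BGR2GRAY)
    crop = cv2.resize(crop, (48, 32))                                      # S80: normalization
    feat = hog(crop, orientations=9, pixels_per_cell=(4, 4),
               cells_per_block=(2, 2), block_norm="L2-Hys")
    # S90: distance to the critical surface compared with the classification threshold.
    distance = abs(clf.decision_function([feat])[0]) / np.linalg.norm(clf.coef_[0])
    return distance <= threshold      # below the threshold -> positive (unoccluded) sample
```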
  • FIG. 9 shows a principle block diagram of a mouth recognition device corresponding to the mouth recognition method in the embodiment.
  • The mouth recognition device includes a face picture to be identified acquisition module 70, a mouth image to be identified acquisition module 80, and a recognition result acquisition module 90.
  • The functions of the face picture to be identified acquisition module 70, the mouth image to be identified acquisition module 80, and the recognition result acquisition module 90 correspond to the steps of the mouth recognition method in the above embodiment, and each functional module is described in detail as follows:
  • the face picture to be identified acquisition module 70 is used to obtain a face picture to be identified and to use a facial feature point detection algorithm to obtain a forward mouth area image;
  • the to-be-recognized mouth image acquisition module 80 is configured to perform normalization processing on the forward mouth area image to obtain the to-be-recognized mouth image;
  • a recognition result acquisition module 90 is configured to input a mouth image to be recognized into a mouth judgment model trained by a mouth model training method to recognize and obtain a recognition result.
  • Each module in the above-mentioned mouth identification device may be implemented in whole or in part by software, hardware, and a combination thereof.
  • The above modules may be embedded, in the form of hardware, in the processor of the computer device or be independent of it, or may be stored, in the form of software, in the memory of the computer device, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 10.
  • the computer device includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in a non-volatile storage medium.
  • the database of the computer device is used to store the feature vector of the face image sample data and the mouth model training data in the mouth model training method.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • The computer-readable instructions are executed by the processor to implement the mouth model training method, or, when executed by the processor, to realize the functions of the modules/units of the mouth recognition device in the above embodiment.
  • In an embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the steps of the mouth model training method of the foregoing embodiment are implemented, for example steps S10 to S60 shown in FIG. 2, or the steps of the mouth recognition method of the foregoing embodiment are implemented, for example steps S70 to S90 shown in FIG. 8.
  • Alternatively, when the processor executes the computer-readable instructions, the functions of the modules/units of the mouth model training device of the foregoing embodiment are implemented, for example modules 10 to 60 shown in FIG. 7, or the functions of the modules/units of the mouth recognition device of the foregoing embodiment are implemented, for example modules 70 to 90 shown in FIG. 9. To avoid repetition, details are not repeated here.
  • In an embodiment, one or more non-volatile readable storage media storing computer-readable instructions are provided; when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to execute the steps of the mouth model training method of the foregoing embodiment, or the steps of the mouth recognition method of the foregoing embodiment, or to realize the functions of the modules/units of the mouth model training device of the foregoing embodiment, or the functions of the modules/units of the mouth recognition device of the foregoing embodiment. To avoid repetition, details are not repeated here.

Abstract

A mouth model training method, a mouth recognition method, an apparatus, a device, and a medium. The method includes: obtaining face image samples and marking the face image samples to obtain face image sample data, and extracting feature vectors of the face image samples (S10); dividing the face image sample data into training sample data and verification sample data (S20); training a support vector machine classifier with the training sample data to obtain a critical surface of the support vector machine classifier (S30); calculating the vector distance between the feature vector of each verification sample in the verification sample data and the critical surface (S40); obtaining a preset true class rate or a preset false positive class rate, and obtaining a classification threshold according to the vector distances and the annotation data corresponding to the verification sample data (S50); and obtaining a mouth judgment model according to the classification threshold (S60). With this mouth model training method, a mouth judgment model with high accuracy in judging whether the mouth is occluded can be obtained.

Description

Mouth model training method, mouth recognition method, apparatus, device, and medium
This application is based on, and claims priority to, the Chinese invention patent application No. 201810574521.6 filed on June 6, 2018 and entitled "Mouth model training method, mouth recognition method, apparatus, device, and medium".
Technical Field
The present application relates to the field of computer technology, and in particular to a mouth model training method, a mouth recognition method, an apparatus, a device, and a medium.
Background Art
With the rapid development of artificial intelligence, facial feature localization and recognition has received extensive attention and has become a hot topic in the field of artificial intelligence. Conventionally, existing facial feature point recognition algorithms can mark the positions of different organs, such as the eyes, ears, mouth, or nose, in a face picture; even when the corresponding part is partially occluded (by glasses, hair, a hand covering the mouth, and so on), the algorithm can still identify the relative positions of the different parts and provide the corresponding picture. However, some image processing tasks require an unoccluded mouth image, and the mouth pictures identified by conventional facial feature point recognition algorithms cannot filter out occluded pictures, which easily introduces errors and is not conducive to further processing.
Summary
In view of this, it is necessary to address the above technical problem by providing a mouth model training method, apparatus, computer device, and storage medium that can improve the efficiency of model training.
In addition, it is also necessary to provide a mouth recognition method which, after training with the mouth model training method, uses the trained model to recognize mouth pictures, so as to improve the accuracy of mouth recognition.
A mouth model training method includes:
obtaining a face image sample, marking the face image sample to obtain face image sample data, and extracting a feature vector of the face image sample from the face image sample data, where the face image sample data includes face image samples and annotation data;
dividing the face image sample data into training sample data and verification sample data;
training a support vector machine classifier with the training sample data to obtain a critical surface of the support vector machine classifier;
calculating a vector distance between the feature vector of each verification sample in the verification sample data and the critical surface;
obtaining a preset true class rate or a preset false positive class rate, and obtaining a classification threshold according to the vector distance and the annotation data corresponding to the verification sample data; and
obtaining a mouth judgment model according to the classification threshold.
A mouth model training apparatus includes:
a face image sample data acquisition module, configured to obtain a face image sample, mark the face image sample to obtain face image sample data, and extract a feature vector of the face image sample from the face image sample data, where the face image sample data includes face image samples and annotation data;
a face image sample data division module, configured to divide the face image sample data into training sample data and verification sample data;
a critical surface acquisition module, configured to train a support vector machine classifier with the training sample data to obtain a critical surface of the support vector machine classifier;
a vector distance calculation module, configured to calculate a vector distance between the feature vector of each verification sample in the verification sample data and the critical surface;
a classification threshold acquisition module, configured to obtain a preset true class rate or a preset false positive class rate, and obtain a classification threshold according to the vector distance and the annotation data corresponding to the verification sample data; and
a mouth judgment model acquisition module, configured to obtain a mouth judgment model according to the classification threshold.
A mouth recognition method includes:
obtaining a face picture to be recognized, and using a facial feature point detection algorithm to obtain a forward mouth area image;
normalizing the forward mouth area image to obtain a mouth image to be recognized; and
inputting the mouth image to be recognized into a mouth judgment model trained by the above mouth model training method for recognition, and obtaining a recognition result.
A mouth recognition apparatus includes:
a face picture to be recognized acquisition module, configured to obtain a face picture to be recognized and use a facial feature point detection algorithm to obtain a forward mouth area image;
a mouth image to be recognized acquisition module, configured to normalize the forward mouth area image to obtain a mouth image to be recognized; and
a recognition result acquisition module, configured to input the mouth image to be recognized into the mouth judgment model trained by the above mouth model training method for recognition, and obtain a recognition result.
A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the processor implements the steps of the above mouth model training method, or, when the processor executes the computer-readable instructions, the following steps are implemented:
obtaining a face image sample, marking the face image sample to obtain face image sample data, and extracting a feature vector of the face image sample from the face image sample data, where the face image sample data includes face image samples and annotation data;
dividing the face image sample data into training sample data and verification sample data;
training a support vector machine classifier with the training sample data to obtain a critical surface of the support vector machine classifier;
calculating a vector distance between the feature vector of each verification sample in the verification sample data and the critical surface;
obtaining a preset true class rate or a preset false positive class rate, and obtaining a classification threshold according to the vector distance and the annotation data corresponding to the verification sample data; and
obtaining a mouth judgment model according to the classification threshold.
One or more non-volatile readable storage media storing computer-readable instructions are provided; when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to perform the following steps:
obtaining a face image sample, marking the face image sample to obtain face image sample data, and extracting a feature vector of the face image sample from the face image sample data, where the face image sample data includes face image samples and annotation data;
dividing the face image sample data into training sample data and verification sample data;
training a support vector machine classifier with the training sample data to obtain a critical surface of the support vector machine classifier;
calculating a vector distance between the feature vector of each verification sample in the verification sample data and the critical surface;
obtaining a preset true class rate or a preset false positive class rate, and obtaining a classification threshold according to the vector distance and the annotation data corresponding to the verification sample data; and
obtaining a mouth judgment model according to the classification threshold.
The details of one or more embodiments of the present application are set forth in the following drawings and description. Other features and advantages of the present application will become more apparent from the specification, the drawings, and the claims.
Brief Description of the Drawings
In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and other drawings can be obtained from them by a person of ordinary skill in the art without creative effort.
FIG. 1 is a schematic diagram of an application environment of the mouth model training method and the mouth recognition method provided by an embodiment of the present application;
FIG. 2 is an implementation flowchart of the mouth model training method provided by an embodiment of the present application;
FIG. 3 is an implementation flowchart of step S10 in the mouth model training method provided by an embodiment of the present application;
FIG. 4 is an implementation flowchart of step S30 in the mouth model training method provided by an embodiment of the present application;
FIG. 5 is an implementation flowchart of step S15 in the mouth model training method provided by an embodiment of the present application;
FIG. 6 is an implementation flowchart of step S50 in the mouth model training method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the mouth model training apparatus provided by an embodiment of the present application;
FIG. 8 is an implementation flowchart of the mouth recognition method provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of the mouth recognition apparatus provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a computer device provided by an embodiment of the present application.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
The mouth model training method provided by the present application can be applied in the application environment shown in FIG. 1, in which a client communicates with a server through a network; the server receives training sample data sent by the client and establishes a mouth judgment classification model, and further receives verification samples sent by the client to perform mouth model training. The client can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server can be implemented by an independent server or by a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, the method is described by taking its application to the server in FIG. 1 as an example, and includes the following steps:
S10: Obtain a face image sample, mark the face image sample to obtain face image sample data, and extract a feature vector of the face image sample from the face image sample data, where the face image sample data includes face image samples and annotation data.
The face image sample data is mouth image data used for model training. The feature vector of a face image sample refers to a vector used to characterize the image information features of each face image sample in the face image sample data, for example a HOG (Histogram of Oriented Gradient) feature vector, an LBP (Local Binary Patterns) feature vector, or a PCA (Principal Component Analysis) feature vector. Feature vectors can represent image information with simple data and avoid repeated extraction operations in subsequent training.
Preferably, in this embodiment, the HOG feature vector of a face image sample can be extracted. Because the HOG feature vector of a face image sample is described by the gradient of the local information of the face image sample, extracting the HOG feature vector can avoid the influence of factors such as geometric deformation and illumination changes on the training of the mouth model. Marking a face image sample refers to dividing the face image samples into positive samples (unoccluded mouth images) and negative samples (occluded mouth images) according to the content of the samples; the face image sample data is obtained after these two kinds of sample data are labeled respectively. The face image samples include positive samples and negative samples; understandably, the face image sample data includes the face image samples and the annotation data. Preferably, the number of negative samples is 2-3 times the number of positive samples, which makes the sample information more comprehensive and improves the accuracy of model training.
In this embodiment, the face image sample data is acquired for subsequent model training, and occluded mouth images are included as face image samples for training, which reduces the false detection rate.
Optionally, the face image sample data includes, but is not limited to, face image samples collected in advance and face image samples from commonly used face databases stored in memory in advance.
S20: Divide the face image sample data into training sample data and verification sample data.
The training sample data is sample data used for learning; a classifier is established by fitting some parameters, that is, the face image samples in the training sample data are used to train a machine learning model in order to determine the parameters of the machine learning model. The verification sample data is sample data used to verify the discrimination ability (such as the recognition rate) of the trained machine learning model. Optionally, 70%-75% of the face image sample data is used as training sample data, and the rest is used as verification sample data. In a specific embodiment, 300 positive samples and 700 negative samples, 1000 face image samples in total, are combined into the face image sample data, of which 260 samples are used as verification sample data and 740 samples are used as training sample data.
S30: Train a support vector machine classifier with the training sample data to obtain a critical surface of the support vector machine classifier.
A Support Vector Machine (SVM) classifier is a discriminative classifier defined by a classification critical surface and is used to classify data or perform regression analysis. The critical surface is a classification surface that can correctly separate the positive samples from the negative samples while maximizing the distance between the two classes. Specifically, according to the characteristics of the face image sample data, a suitable kernel function is selected, and a kernel function operation is performed on the feature vectors of the training sample data, so that the feature vectors are mapped into a high-dimensional feature space in which they become linearly separable; a critical surface is thereby obtained and used as the classification surface that separates the positive samples from the negative samples. Specifically, given the training sample data as input, the support vector machine classifier outputs a critical surface with which the training sample data are classified. Obtaining the critical surface simplifies the classification process of the support vector machine classifier.
In this embodiment, a critical surface with good classification ability is obtained by training the support vector machine classifier on the feature vectors of the face image samples, which improves the efficiency of mouth model training.
S40: Calculate a vector distance between the feature vector of each verification sample in the verification sample data and the critical surface.
The verification sample data is pre-stored face image sample data used for verification, which includes positive sample data (unoccluded mouth images) and negative sample data (occluded mouth images); the verification samples are obtained after these two kinds of sample data are labeled respectively. The feature vector of a verification sample is the feature vector obtained by performing feature vector extraction on the verification sample.
The feature vectors of the verification samples include, but are not limited to, HOG feature vectors, LBP feature vectors, and PCA feature vectors.
The vector distance between the feature vector of a verification sample in the verification sample data and the critical surface refers to the distance, in the mathematical sense, between the directed line segment corresponding to the feature vector of the verification sample and the plane corresponding to the critical surface, that is, the distance from a line to a plane in the mathematical sense; this distance is a numerical value, and it is the vector distance. Assuming the critical surface is expressed as \(g(x) = wx + b\), where \(w\) is a multi-dimensional vector that can be written as \(w = [w_1, w_2, w_3, \ldots, w_n]\), the vector distance from a feature vector \(x\) to the critical surface is
\[ d = \frac{|g(x)|}{\|w\|} = \frac{|wx + b|}{\|w\|}, \]
where \(\|w\|\) denotes the norm of \(w\), i.e.
\[ \|w\| = \sqrt{w_1^2 + w_2^2 + \cdots + w_n^2}. \]
By calculating the vector distance between the feature vector of each verification sample in the verification sample data and the critical surface, the closeness of each verification sample to the class it belongs to can be compared intuitively.
S50: Obtain a preset true class rate or a preset false positive class rate, and obtain a classification threshold according to the vector distance and the annotation data corresponding to the verification sample data.
The preset true class rate refers to a preset ratio of the number of positive samples that are correctly judged as positive to the total number of positive samples, and the preset false positive class rate refers to a preset ratio of the number of negative samples that are wrongly judged as positive to the total number of negative samples. In this embodiment, the true class rate is the ratio of unoccluded-mouth face image samples that are judged as unoccluded to the total number of unoccluded-mouth face image samples, and the false positive class rate is the ratio of occluded-mouth face image samples that are judged as unoccluded to the total number of occluded-mouth face image samples. It is easy to understand that the higher the true class rate or the lower the false positive class rate, the more stringent the classification requirement and the more application scenarios it can accommodate. Preferably, when the preset true class rate in this embodiment is 95%, or the preset false positive class rate is 5%, a good classification effect can be obtained for a variety of application scenarios; by setting the true class rate or false positive class rate reasonably, the adaptability of the support vector machine classifier is better extended.
It should be understood that the preset true class rate or preset false positive class rate given here is the preferred range of this application; it can be set according to the needs of the actual application and is not limited here.
The classification threshold is a critical value used to classify samples: specifically, when samples are classified, a sample below the classification threshold is judged to be a positive sample, and a sample above the classification threshold is judged to be a negative sample.
Specifically, the annotation data corresponding to the verification sample data refers to the labels of the verification samples, for example a positive sample is labeled 1 and a negative sample is labeled -1. After the vector distances between the feature vectors of the verification samples and the critical surface and the annotation data of the verification samples are obtained, the classification threshold is calculated according to the preset true class rate or preset false positive class rate.
For example, assume the preset false positive class rate is 10% and there are 15 verification samples S1, S2, ..., S15, of which 5 are positive samples and 10 are negative samples, and the vector distances between the feature vectors of the 10 negative samples and the critical surface are 1, 2, ..., 10 respectively; when the classification threshold lies in the interval [1, 2], for example 1.5, the preset false positive class rate of 10% can be satisfied.
S60: Obtain a mouth judgment model according to the classification threshold.
Specifically, the mouth judgment model is a model for judging whether the mouth position in a face image sample is occluded. After the classification threshold is determined, the vector distance between the feature vector of the face image sample data and the critical surface of the support vector machine classifier is compared with the classification threshold, the face image sample data is classified according to the comparison result, and it is thereby determined whether the mouth position in the face image sample is occluded or unoccluded. Therefore, once the classification threshold is given, the mouth judgment model is established: after a face image to be recognized is input into the mouth judgment model, a yes-or-no classification result is given directly according to the classification threshold, which avoids repeated training and improves the efficiency of mouth model training.
In this embodiment, a face image sample is first obtained and marked to obtain face image sample data, and the feature vectors of the face image samples in the face image sample data are extracted; the face image sample data is then divided into training sample data and verification sample data. The training sample data is used to train the support vector machine classifier to obtain its critical surface, which simplifies the classification process. The vector distance between the feature vector of each verification sample in the verification sample data and the critical surface of the support vector machine classifier is then calculated, so that the closeness of each verification sample to the class it belongs to can be compared intuitively. A preset true class rate or preset false positive class rate is obtained in order to extend the adaptability of the support vector machine classifier, the classification threshold is obtained according to the vector distances and the annotation data corresponding to the verification sample data, and finally the mouth judgment model is obtained, which avoids repeated training and improves the efficiency of mouth model training.
在一实施例中,如图3所示,步骤S10中,即提取人脸图像样本数据中的人脸图像样本的特征向量,具体包括如下步骤:
S11:采用人脸特征点检测算法获取人脸特征点,人脸特征点包括:左嘴角点、右嘴角点和鼻尖点。
其中,人脸特征点检测算法是指用于检测人脸五官特征并标记出位置信息的算法。人脸特征点是指眼角点、鼻翼点和嘴角点等用于标志眼、鼻和嘴等脸部轮廓的点。具体地,人脸特征点检测算法包括但不限于根据深度学习的人脸特征点检测算法、根据模型的人脸特征点检测算法或者根据级联形状回归的人脸特征点检测算法等。
可选地,可以采用OpenCV自带的根据Haar特征的Viola-Jones算法获取人脸特征点。其中,OpenCV是一个跨平台计算机视觉库,可以运行在Linux、Windows、Android和Mac OS操作系统上,由一系列C函数和少量C++类构成,同时提供了Python、Ruby、MATLAB等语言的接口,实现了图像处理和计算机视觉方面的很多通用算法,而根据Haar特征的Viola-Jones算法是其中一种人脸特征点检测算法。Haar特征是一种反映图像的灰度变化的特征,是反映像素分模块差值的一种特征。Haar特征分为三类:边缘特征、线性特征和中心-对角线特征。Viola-Jones算法是根据人脸的Haar特征值进行人脸检测的方法。
具体地,获取输入的人脸图像样本数据,对人脸图像样本数据进行预处理,接着依次进行肤色区域分割、人脸特征区域分割和人脸特征区域分类的步骤,最后根据Haar特征的Viola-Jones算法与人脸特征区域分类进行匹配计算,得到人脸图像的人脸特征点信息。
本实施例中,通过采用人脸特征点检测算法获取到人脸图像样本的左嘴角点、右嘴角点和鼻尖点,以便根据这几个特征点的位置信息确定人脸图像样本的嘴巴所在区域。可以理解地,本步骤中提及的左嘴角点、右嘴角点和鼻尖点是指嘴巴对应的三个特征点。
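下面给出一个示意性的Python片段(非本申请原文):先用OpenCV自带的Haar级联做人脸检测,再借助一个特征点检测器获取左嘴角点、右嘴角点和鼻尖点。此处以dlib的68点模型为例,模型文件名与特征点编号均为假设,应以实际所用模型为准:

```python
# 示意:Viola-Jones(Haar级联)人脸检测 + 68点模型定位嘴角点与鼻尖点
import cv2
import dlib

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # 假设已准备好该模型文件

img = cv2.imread("face_sample.jpg")                       # 假设的样本图片路径
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    shape = predictor(gray, dlib.rectangle(x, y, x + w, y + h))
    left_corner = (shape.part(48).x, shape.part(48).y)    # 68点模型中左嘴角约为第48点
    right_corner = (shape.part(54).x, shape.part(54).y)   # 右嘴角约为第54点
    nose_tip = (shape.part(30).x, shape.part(30).y)       # 鼻尖约为第30点
```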
S12:根据左嘴角点和右嘴角点对人脸图像样本进行正向调整。
其中,正向调整是对人脸特征点的方位进行规范化并设置为正向的调整。本实施例中,正向调整是指将左嘴角点和右嘴角点调整在同一水平线上(即左嘴角点和右嘴角点的纵坐标相等),从而将嘴巴特征点规范化到同一方位,以避免训练样本方位变化对模型训练的影响,提高人脸图像样本对方位变化的鲁棒性。
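该正向调整可用如下示意性的Python片段实现(非本申请原文,仅为一种可能的做法):

```python
# 示意:根据左右嘴角点旋转图像,使两嘴角位于同一水平线上
import cv2
import numpy as np

def align_by_mouth_corners(img, left_corner, right_corner):
    (lx, ly), (rx, ry) = left_corner, right_corner
    angle = np.degrees(np.arctan2(ry - ly, rx - lx))       # 两嘴角连线与水平线的夹角
    center = ((lx + rx) / 2.0, (ly + ry) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)        # 绕连线中点旋转,使连线变为水平
    return cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
```

需要注意的是,旋转后左、右嘴角点及鼻尖点的坐标也应随同一变换更新,再用于后续的嘴巴矩形区域构建。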
S13:根据左嘴角点、右嘴角点和鼻尖点构建嘴巴矩形区域。
其中,嘴巴矩形区域是指包括嘴巴图像的一个矩形区域,在一具体实施方式中,采用人脸特征点检测算法定位出左嘴角点、右嘴角点和鼻尖点的位置坐标,嘴巴矩形区域以左嘴角点的横坐标为左侧坐标,以右嘴角点的横坐标为右侧坐标,以鼻尖点的纵坐标为上侧坐标,以左嘴角点纵坐标(或者右嘴角点纵坐标)加上鼻尖点到左嘴角点垂直方向的距离为下侧坐标,以这四个点坐标(左侧坐标、右侧坐标、上侧坐标和下侧坐标)构成的矩形区域即为嘴巴矩形区域。
S14:对嘴巴矩形区域进行图像归一化处理,得到归一化嘴巴矩形区域。
其中,归一化处理是指对待处理的图像进行一系列变换以使待处理的图像转换成相应的标准形式。如图像的尺寸归一化、图像的灰度归一化等。优选地,归一化处理是指对嘴巴矩形区域进行尺寸归一化。具体地,将嘴巴矩形区域依据人脸图像样本的分辨率设置为固定尺寸,例如:嘴巴矩形区域可以设置为Size(48,32)矩形,即长为48像素,宽为32像素的矩形区域,通过将嘴巴矩形区域设置为固定尺寸,以便后续减少特征向量提取的复杂度。
容易理解地,对嘴巴矩形区域进行图像归一化处理,有利于后续支持向量机模型的训练,能够避免大数值区间的属性过分支配了小数值区间的属性,而且还能避免计算过程中数值复杂度。
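步骤S13与S14可以用如下示意性的Python片段表示(非本申请原文,坐标取整、边界裁剪等细节从略):

```python
# 示意:由左嘴角点、右嘴角点和鼻尖点构建嘴巴矩形区域,并尺寸归一化为48x32
import cv2

def crop_mouth_region(aligned_img, left_corner, right_corner, nose_tip, size=(48, 32)):
    (lx, ly), (rx, ry), (nx, ny) = left_corner, right_corner, nose_tip
    left, right = int(lx), int(rx)            # 左、右侧坐标取两嘴角横坐标
    top = int(ny)                             # 上侧坐标取鼻尖纵坐标
    bottom = int(ly + (ly - ny))              # 下侧坐标:嘴角纵坐标加上鼻尖到嘴角的垂直距离
    mouth = aligned_img[top:bottom, left:right]
    return cv2.resize(mouth, size)            # 归一化为长48像素、宽32像素的固定尺寸
```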
S15:根据归一化嘴巴矩形区域提取HOG特征向量。
HOG(Histogram of Oriented Gradient)特征向量,是用于描述图像局部区域的梯度方向信息的向量,该特征受图像尺寸、位置等变化影响较大,输入图像范围固定使计算得到的HOG特征向量更统一,模型训练时可以更多关注无遮挡嘴巴图像与有遮挡嘴巴图像的区别,而不需要注意嘴巴位置的变化,训练更方便;同时HOG特征向量本身关注的是图像梯度特征而不是颜色特征,受光照变化以及几何形状变化的影响不大,因此,提取HOG特征向量能够方便高效地对人脸图像样本进行特征向量的提取。其中,根据分类检测目标的不同,特征提取的方式也不同,一般是将颜色、纹理以及形状作为目标特征。根据对检测嘴巴图像准确度的要求,本实施例选择采用形状特征,即采用训练样本的HOG特征向量。
在本实施例中,采用人脸特征点检测算法获取人脸特征点的左嘴角点、右嘴角点和鼻尖点;然后对样本图像进行正向调整,以提高人脸图片对方向变化的鲁棒性,接着构建嘴巴矩形区域并对嘴巴矩形区域进行图像归一化处理,得到归一化嘴巴矩形区域,有利于后续支持向量机模型的训练,最后提取归一化嘴巴矩形区域HOG特征向量,从而方便高效地对人脸图像样本数据中的人脸图像样本进行特征向量的提取。
在一实施例中,如图4所示,步骤S30中,即采用训练样本数据训练支持向量机分类器,得到支持向量机分类器的临界面,具体包括如下步骤:
S31:获取支持向量机分类器的核函数和支持向量机分类器的惩罚参数,采用以下公式求解拉格朗日乘子 \alpha^{*}=(\alpha_{1}^{*},\alpha_{2}^{*},\cdots,\alpha_{l}^{*})^{T} 和决策阈值b:
\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i}
s.t.\ \sum_{i=1}^{l}\alpha_{i}y_{i}=0,\quad 0\le\alpha_{i}\le C,\ i=1,2,\cdots,l
式中,s.t.是数学公式中约束条件的缩写,min是指在约束条件下取代数式 \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i} 的最小值,K(x_i,x_j)为支持向量机分类器的核函数,C为支持向量机分类器的惩罚参数,C>0,α_i与拉格朗日乘子 \alpha^{*} 是共轭关系,x_i为训练样本数据的特征向量,l为训练样本数据的特征向量的个数,y_i为训练样本数据的标注。
其中,核函数是支持向量机分类器中的核函数,用于对训练支持向量机分类器过程中输入的训练样本的特征向量进行核函数运算,支持向量机分类器的核函数包括但不限于线性核函数、多项式核函数、高斯核函数和径向基核函数等,因为本实施例中的支持向量机分类器是线性可分的,优选地,本实施例中采用线性核函数作为支持向量机分类器中的核函数,因此K(x_i,x_j)=(x_i·x_j),线性核函数具有参数少、运算速度快的特点,适用于线性可分的情况。y_i为训练样本数据的标注,因为是支持向量机分类器的二分类问题,因此y_i可以为1或者-1两类,若人脸图像样本为正样本则y_i=1,若人脸图像样本为负样本则y_i=-1。
惩罚参数C是用于对支持向量机分类器进行优化的参数,是一个确定数值。可以解决样本偏斜的分类问题,具体地,参与分类的两个类别(也可以指多个类别)样本数量差异很大,例如正样本有10000个而负样本有100个,如此会产生样本偏斜问题,此时正样本分布范围广,为解决样本偏斜问题,具体地,可依据正样本数量与负样本数量的比例合理增大C的取值。C越大,表示分类器的容错性小。决策阈值b用于确定支持向量机分类器过程中的决策分类的临界值,是一个实数。
具体地,通过获取合适的核函数K(x_i,x_j),并设定合适的惩罚参数C,采用公式
\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i},\quad s.t.\ \sum_{i=1}^{l}\alpha_{i}y_{i}=0,\ 0\le\alpha_{i}\le C
对训练样本数据的特征向量与核函数进行核函数运算后,求解最优问题,即求取拉格朗日乘子 \alpha^{*} 的值,使得核函数运算后的结果(上式目标函数)达到最小,得到 \alpha^{*}=(\alpha_{1}^{*},\alpha_{2}^{*},\cdots,\alpha_{l}^{*})^{T};然后,确定开区间(0,C)范围中的 \alpha^{*} 的分量 \alpha_{j}^{*},并根据
b=y_{j}-\sum_{i=1}^{l}y_{i}\alpha_{i}^{*}K(x_{i},x_{j})
计算b值。求解了支持向量机分类器中的拉格朗日乘子 \alpha^{*} 和决策阈值b,从而获取较好的参数,以便构建高效的支持向量机分类器。
S32:根据拉格朗日乘子 \alpha^{*} 和决策阈值b,采用如下公式,得到支持向量机分类器的临界面g(x):
g(x)=\sum_{i=1}^{l}\alpha_{i}^{*}y_{i}K(x_{i},x)+b
通过训练支持向量机分类器得到拉格朗日乘子 \alpha^{*} 和决策阈值b后,即调整训练样本的拉格朗日乘子 \alpha^{*} 和决策阈值b这两个参数后,代入到上述公式中,即得到支持向量机分类器的临界面。
容易理解地,通过计算得到临界面,以便后续人脸图像样本根据临界面对训练样本分类,训练程序先提取并保存样本的特征向量,从而可以在不断调整训练参数多次训练过程中节省提取特征的时间,尽快得到符合要求的训练参数。这样可以调整临界面对某一分类的误报率和准确率,而不需要经常重复训练模型,提高了模型训练效率。
本实施例中,首先获取合适的核函数K(x_i,x_j),并设定合适的惩罚参数C,将训练样本数据的特征向量与核函数进行核函数运算,求解支持向量机分类器中的拉格朗日乘子 \alpha^{*} 和决策阈值b,从而获取较好的参数,构建支持向量机分类器,然后将拉格朗日乘子 \alpha^{*} 和决策阈值b这两个参数代入到公式 g(x)=\sum_{i=1}^{l}\alpha_{i}^{*}y_{i}K(x_{i},x)+b 中,得到临界面g(x),以便后续人脸图像样本根据临界面对训练样本数据分类,而不需要经常重复训练模型,提高了模型训练的效率。
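作为示意,S31至S32所述的求解过程在工程上通常直接调用现成的支持向量机实现,下面给出一个基于scikit-learn的Python片段(非本申请原文,变量名为举例假设),其线性核SVC在内部求解的即为上述对偶问题:

```python
# 示意:用线性核和惩罚参数C训练支持向量机分类器,得到临界面 g(x)=w·x+b
import numpy as np
from sklearn.svm import SVC

clf = SVC(kernel="linear", C=1.0)              # 线性核 K(xi,xj)=(xi·xj),惩罚参数C
clf.fit(X_train, y_train)                      # X_train:训练样本HOG特征向量,y_train:标注(1/-1)

w = clf.coef_[0]                               # 临界面的法向量w
b = clf.intercept_[0]                          # 决策阈值b
# 验证样本的特征向量到临界面的向量距离 = (w·x+b)/||w||
val_dist = clf.decision_function(X_val) / np.linalg.norm(w)
```

若正负样本数量悬殊(样本偏斜),也可以按常见做法通过SVC的class_weight参数对两类样本分别加权,这与正文中按样本比例调整惩罚参数C的思路是一致的。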
在一实施例中,如图5所示,步骤S15中,即根据归一化嘴巴矩形区域提取HOG特征向量,具体包括如下步骤:
S151:将归一化嘴巴矩形区域划分成细胞单元,并计算细胞单元的每个像素梯度的大小和方向。
具体地,根据实际需要及对支持向量机分类器的要求不同,对归一化嘴巴矩形区域划分的方式也不同。子区域与子区域可重叠也可以不重叠。细胞单元是指图像的连通子区域,即每个子区域是由多个细胞单元组成,例如,一幅48*32的归一化嘴巴矩形区域,假设一个细胞单元为4*4像素,将2*2个细胞组成一个子区域,那么这个归一化嘴巴矩形区域有6*4个子区域。每个细胞单元的梯度方向区间0°到180°分成了9个区间,因此可以用一个9维向量描述一个细胞单元。
获取归一化嘴巴矩形区域每个像素梯度的大小和方向具体过程为:首先获取每个像素的梯度,假如像素为(x,y),其梯度计算公式如下:
G_{x}(x,y)=H(x+1,y)-H(x-1,y)
G_{y}(x,y)=H(x,y+1)-H(x,y-1)
其中,G_x(x,y)为像素(x,y)的水平方向梯度,G_y(x,y)为像素(x,y)的垂直方向梯度,H(x,y)为像素(x,y)的灰度值。然后采用以下公式计算该像素的梯度大小:
G(x,y)=\sqrt{G_{x}(x,y)^{2}+G_{y}(x,y)^{2}}
其中,G(x,y)为像素梯度的大小。
最后,采用以下公式计算像素梯度的方向:
\alpha(x,y)=\arctan\left(\frac{G_{y}(x,y)}{G_{x}(x,y)}\right)
其中,α(x,y)为像素梯度的方向角。
S152:统计细胞单元的每个像素梯度的大小和方向的梯度直方图。
其中,梯度直方图是指对像素梯度的大小和方向进行统计得到的直方图,用于表征每个细胞单元的梯度信息。具体地,首先将每个细胞单元的梯度方向从0°到180°均匀地分成9个方向块,即0°-20°是第一个方向块,20°-40°第二个方向块,依此类推,160°-180°为第九个方向块。然后判断细胞单元的像素梯度的方向所在的方向块,并加上该方向块的像素梯度的大小。例如:一个细胞单元的某一像素的方向落在40°-60°,就将梯度直方图第三个方向上的像素值加上该方向的像素梯度的大小,从而得到该细胞单元的梯度直方图。
S153:串联梯度直方图,得到HOG特征向量。
其中,串联是指对各个细胞单元的梯度直方图按照自左向右、自上向下的顺序将所有梯度直方图合并,从而得到归一化嘴巴矩形区域的HOG特征向量。
本实施例中,通过将归一化嘴巴矩形区域分成若干个小区域,然后计算各个小区域的梯度直方图,最后将各个小区域对应的梯度直方图串联一起,得到整幅归一化嘴巴矩形区域的梯度直方图,用于描述人脸图像样本的特征向量,同时HOG特征向量本身关注的是图像梯度特征而不是颜色特征,受光照变化影响不大。提取HOG特征向量能够方便高效地对嘴巴图像进行识别。
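为说明S151至S153的流程,下面给出一个手工实现的简化HOG计算示意(非本申请原文,未做块归一化,仅演示划分细胞单元、统计梯度直方图、串联这三步):

```python
# 示意:对48x32的归一化嘴巴矩形区域(灰度图)计算简化的HOG特征向量
import numpy as np

def simple_hog(gray, cell=4, bins=9):
    gray = gray.astype(np.float32)
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]          # 水平方向梯度 Gx(x,y)
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]          # 垂直方向梯度 Gy(x,y)
    mag = np.sqrt(gx ** 2 + gy ** 2)                  # 梯度大小
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0      # 梯度方向,折算到0°-180°

    h, w = gray.shape
    hist = []
    for i in range(0, h, cell):                       # 按4x4像素划分细胞单元
        for j in range(0, w, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            idx = np.minimum((a // (180.0 / bins)).astype(int), bins - 1)
            hist.append(np.bincount(idx, weights=m, minlength=bins))  # 9维方向直方图
    return np.concatenate(hist)                       # 串联各细胞单元的直方图

# 对48x32的嘴巴区域:共(32/4)*(48/4)=96个细胞单元,特征维数为96*9=864
```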
在一实施例中,如图6所示,步骤S50中,即获取预设真正类率或假正类率,根据向量距离和与验证样本数据对应的标注数据获取分类阈值,具体包括如下步骤:
S51:根据向量距离和与验证样本数据对应的标注数据绘制ROC曲线。
其中,ROC曲线指受试者工作特征曲线/接收器操作特性曲线(receiver operating characteristic curve),是反映敏感性和特异性连续变量的综合指标,是用构图法揭示敏感性和特异性的相互关系。本实施例中,ROC曲线显示的是支持向量机分类器真正类率和假正类率之间的关系,该曲线越靠近左上角分类器的准确性越高。
验证样本数据中的样本已被标注为正样本(positive)或负样本(negative)。在对验证样本数据中的人脸图像数据进行分类的过程中,会出现四种情况:如果人脸图像数据是正样本并且也被预测成正样本,即为真正类(True positive,TP);如果人脸图像数据是负样本但被预测成正样本,称之为假正类(False positive,FP)。相应地,如果人脸图像数据是负样本并且被预测成负样本,称之为真负类(True negative,TN);正样本被预测成负样本则为假负类(False negative,FN)。
真正类率(true positive rate,TPR)刻画的是分类器所识别出的正实例占所有正实例的比例,计算公式为TPR=TP/(TP+FN)。假正类率(false positive rate,FPR)刻画的是分类器错认为正样本的负实例占所有负实例的比例,计算公式为FPR=FP/(FP+TN)。
ROC曲线的绘制过程为:根据验证样本的特征向量与临界面的向量距离和对应的验证样本数据的标注,获得不同分类阈值下的真正类率和假正类率,ROC曲线以假正类率为横轴,以真正类率为纵轴,连接各点(即各分类阈值下的真正类率和假正类率)绘制曲线,然后计算曲线下的面积,面积越大,判断价值越高。
在一具体实施方式中,可通过ROC曲线绘制工具进行绘制,具体地,通过matlab中的plotSVMroc(true_labels,predict_labels,classnumber)函数绘制ROC曲线。其中,true_labels为正确的标记,predict_labels为分类判断的标记,classnumber为分类类别的数量,本实施例因为是正负样本的二分类问题,因此classnumber=2。具体地,计算验证样本数据的特征向量与临界面的向量距离后,根据向量距离的分布情况(即各个验证样本数据与临界面的接近程度的分布范围),并根据对应的验证样本数据的标注,能够获取到验证样本数据的真正类率和假正类率,然后依据验证样本数据的真正类率和假正类率绘制ROC曲线。
S52:根据预设真正类率或预设假正类率在ROC曲线的横轴上获取分类阈值。
具体地,预设真正类率或预设假正类率通过实际的使用需要而进行设置,服务端在获取到预设真正类率或预设假正类率后,通过ROC曲线中的横轴表示的假正类率和纵轴表示的真正类率与预设真正类率或预设假正类率比较大小,即预设真正类率或预设假正类率作为对测试样本数据进行分类的标准,从ROC曲线的横轴上依据分类标准确定分类阈值,从而使得后续模型训练中通过ROC曲线可以根据不同的场景选取不同的分类阈值,避免重复训练的需要,提高模型训练的效率。
本实施例中,首先计算验证样本数据的特征向量与临界面的向量距离,并根据对应的验证样本数据的标注获取验证样本数据的真正类率和假正类率,然后依据验证样本数据的真正类率和假正类率绘制ROC曲线。通过预设真正类率或预设假正类率从ROC曲线的横轴上获取分类阈值,从而使得后续模型训练中通过ROC曲线可以根据不同的场景选取不同的分类阈值,避免重复训练的需要,提高模型训练的效率。
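下面给出一个示意性的Python片段(非本申请原文,假设使用scikit-learn,val_dist、y_val为前文示意代码中的验证样本向量距离与标注),用于计算ROC曲线所需的真正类率/假正类率,并按预设假正类率选取分类阈值。这里沿用带符号距离“正样本一侧取值更大”的约定;若按正文中“低于阈值判为正样本”的约定,对距离取相反数即可:

```python
# 示意:由验证样本的向量距离与标注计算ROC曲线,并按预设假正类率选取分类阈值
import numpy as np
from sklearn.metrics import roc_curve

fpr, tpr, thresholds = roc_curve(y_val, val_dist, pos_label=1)   # fpr随阈值降低单调递增

preset_fpr = 0.05                                                # 预设假正类率5%
idx = np.searchsorted(fpr, preset_fpr, side="right") - 1         # 满足fpr<=5%的最后一个点
classify_threshold = thresholds[idx]
print("FPR=%.3f, TPR=%.3f, 分类阈值=%.3f" % (fpr[idx], tpr[idx], classify_threshold))
```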
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
图7示出与实施例中嘴巴模型训练方法一一对应的嘴巴模型训练装置的原理框图。如图7所示,该嘴巴模型训练装置包括人脸图像样本数据获取模块10、人脸图像样本数据划分模块20、临界面获取模块30、向量距离计算模块40、分类阈值获取模块50和嘴巴判断模型获取模块60。其中,人脸图像样本数据获取模块10、人脸图像样本数据划分模块20、临界面获取模块30、向量距离计算模块40、分类阈值获取模块50和嘴巴判断模型获取模块60的实现功能与实施例中嘴巴模型训练方法对应的步骤一一对应,各功能模块详细说明如下:
人脸图像样本数据获取模块10,用于获取人脸图像样本,并对人脸图像样本进行标记以得到人脸图像样本数据,及提取人脸图像样本数据中的人脸图像样本的特征向量,其中,人脸图像样本数据包括人脸图像样本和标注数据;
人脸图像样本数据划分模块20,用于将人脸图像样本数据划分为训练样本数据和验证样本数据;
临界面获取模块30,用于采用训练样本数据训练支持向量机分类器,得到支持向量机分类器的临界面;
向量距离计算模块40,用于计算验证样本数据中的验证样本的特征向量与临界面的向量距离;
分类阈值获取模块50,用于获取预设真正类率或预设假正类率,根据向量距离和与验证样本数据对应的标注数据获取分类阈值;
嘴巴判断模型获取模块60,用于根据分类阈值,获取嘴巴判断模型。
具体地,人脸图像样本数据获取模块10包括人脸特征点获取单元11、正向调整单元12、嘴巴矩形区域构建单元13、嘴巴矩形区域获取单元14和特征向量提取单元15。
人脸特征点获取单元11,用于采用人脸特征点检测算法获取人脸特征点,该人脸特征点包括:左嘴角点、右嘴角点和鼻尖点;
正向调整单元12,用于根据左嘴角点和右嘴角点对人脸图像样本进行正向调整;
嘴巴矩形区域构建单元13,用于根据左嘴角点、右嘴角点和鼻尖点构建嘴巴矩形区域;
嘴巴矩形区域获取单元14,用于对嘴巴矩形区域进行图像归一化处理,得到归一化嘴巴矩形区域;
特征向量提取单元15,用于根据归一化嘴巴矩形区域提取HOG特征向量。
具体地,特征向量提取单元15包括像素梯度获取子单元151、梯度直方图获取子单元152和HOG特征向量获取子单元153。
像素梯度获取子单元151,用于将归一化嘴巴矩形区域划分成细胞单元,并计算细胞单元的每个像素梯度的大小和方向;
梯度直方图获取子单元152,用于统计细胞单元的每个像素梯度的大小和方向的梯度直方图;
HOG特征向量获取子单元153,用于串联梯度直方图,得到HOG特征向量。
具体地,临界面获取模块30包括参数获取单元31和临界面获取单元32。
参数获取单元31,用于获取支持向量机分类器的核函数和支持向量机分类器的惩罚参数,采用以下公式求解拉格朗日乘子 \alpha^{*}=(\alpha_{1}^{*},\alpha_{2}^{*},\cdots,\alpha_{l}^{*})^{T} 和决策阈值b:
\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i}
s.t.\ \sum_{i=1}^{l}\alpha_{i}y_{i}=0,\quad 0\le\alpha_{i}\le C,\ i=1,2,\cdots,l
式中,s.t.是数学公式中约束条件的缩写,min是指在约束条件下取代数式 \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i} 的最小值,K(x_i,x_j)为支持向量机分类器的核函数,C为支持向量机分类器的惩罚参数,C>0,α_i与拉格朗日乘子 \alpha^{*} 是共轭关系,x_i为训练样本数据的特征向量,l为训练样本数据的特征向量的个数,y_i为训练样本数据的标注;
临界面获取单元32,用于根据拉格朗日乘子 \alpha^{*} 和决策阈值b,采用如下公式,得到支持向量机分类器的临界面g(x):
g(x)=\sum_{i=1}^{l}\alpha_{i}^{*}y_{i}K(x_{i},x)+b
具体地,分类阈值获取模块50包括ROC曲线绘制单元51和分类阈值获取单元52。
ROC曲线绘制单元51,用于根据向量距离和与验证样本数据对应的标注数据绘制ROC曲线;
分类阈值获取单元52,用于根据预设真正类率或预设假正类率在ROC曲线的横轴上获取分类阈值。
关于嘴巴模型训练装置的具体限定可以参见上文中对于嘴巴模型训练方法的限定,在此不再赘述。上述嘴巴模型训练装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。
在一实施例中,提供一种嘴巴识别方法,该嘴巴识别方法也可以应用在如图1的应用环境中,其中,客户端通过网络与服务端进行通信,服务端接收客户端发送的待识别人脸图片,进行嘴巴识别。其中,客户端可以但不限于是各种个人计算机、笔记本电脑、智能手机、平板电脑和便携式可穿戴设备。服务端可以用独立的服务器或者是多个服务器组成的服务器集群来实现。
在一个实施例中,如图8所示,以该方法应用于图1中的服务端为例进行说明,包括如下步骤:
S70:获取待识别人脸图片,采用人脸特征点检测算法获取正向的嘴巴区域图像。
其中,待识别人脸图片是指需要进行嘴巴识别的人脸图片。具体地,获取人脸图像可通过预先采集人脸图片,或者直接从人脸库中获取人脸图片,例如AR人脸库。
本实施例中,待识别人脸图片包括无遮挡嘴巴图片和有遮挡嘴巴图片,并采用人脸特征点检测算法获取正向的嘴巴区域图像。该采用人脸特征点检测算法获取正向的嘴巴区域图像的实现过程和步骤S11至步骤S13的方法相同,在此不再赘述。
S80:对正向的嘴巴区域图像进行归一化处理,得到待识别嘴巴图像。
其中,待识别嘴巴图像是指实现了归一化处理后的正向的嘴巴区域图像,通过对正向的嘴巴区域图像进行归一化处理,可以提高识别效率。具体地,归一化处理得到的待识别嘴巴图像因为变换到统一的标准形式,从而避免了支持向量机分类器中的大数值区间的属性过分支配了小数值区间的属性,而且还能避免计算过程中数值复杂度。可选地,对正向的嘴巴区域图像进行归一化处理的实现过程和步骤S14相同,在此不再赘述。
S90:将待识别嘴巴图像输入到如步骤S10至步骤S60中的嘴巴模型训练方法训练得到的嘴巴判断模型进行识别,获取识别结果。
其中,识别结果是指对待识别嘴巴图像采用嘴巴判断模型进行识别所得到的结果,包括两种情形:待识别嘴巴图像是无遮挡的嘴巴图像和待识别嘴巴图像是有遮挡的嘴巴图像。具体地,将待识别嘴巴图像输入到嘴巴判断模型进行识别,以获取识别结果。
本实施例中,先获取待识别人脸图片,采用人脸特征点检测算法获取正向的嘴巴区域图像,再对正向的嘴巴区域图像进行归一化处理,得到待识别嘴巴图像,以便将归一化处理后的待识别嘴巴图像输入到嘴巴判断模型进行识别,获取识别结果,快速识别出该人脸图片中嘴巴有无遮挡,提高识别效率,从而避免影响后续的图像处理过程。
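将上述各步骤串起来,嘴巴识别的整体流程可示意如下(非本申请原文;detect_landmarks、to_gray等函数名均为假设,分别对应前文各步骤的示意片段;w、b、classify_threshold为训练阶段得到的参数):

```python
# 示意:嘴巴识别整体流程(特征点检测、正向调整、嘴巴区域归一化、HOG提取、阈值判断)
import numpy as np

def recognize_mouth(img, w, b, classify_threshold):
    left, right, nose = detect_landmarks(img)              # 假设的特征点检测函数(见S11处示意)
    aligned = align_by_mouth_corners(img, left, right)     # 正向调整(见S12处示意)
    mouth = crop_mouth_region(aligned, left, right, nose)  # 嘴巴矩形区域构建与归一化(见S13、S14处示意)
    feat = simple_hog(to_gray(mouth))                      # HOG特征向量(见S15处示意),to_gray为假设的灰度化函数
    dist = (np.dot(w, feat) + b) / np.linalg.norm(w)       # 到临界面的向量距离
    # 按带符号距离越大越接近正样本(无遮挡)的约定与分类阈值比较
    return "无遮挡的嘴巴图像" if dist >= classify_threshold else "有遮挡的嘴巴图像"
```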
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
图9示出与实施例中嘴巴识别方法一一对应的嘴巴识别装置的原理框图。如图9所示,该嘴巴识别装置包括待识别人脸图片获取模块70、待识别嘴巴图像获取模块80和识别结果获取模块90。其中,待识别人脸图片获取模块70、待识别嘴巴图像获取模块80和识别结果获取模块90的实现功能与实施例中嘴巴识别方法对应的步骤一一对应,各功能模块详细说明如下:
待识别人脸图片获取模块70,用于获取待识别人脸图片,采用人脸特征点检测算法获取正向的嘴巴区域图像;
待识别嘴巴图像获取模块80,用于对正向的嘴巴区域图像进行归一化处理,得到待识别嘴巴图像;
识别结果获取模块90,用于将待识别嘴巴图像输入到嘴巴模型训练方法训练得到的嘴巴判断模型进行识别,获取识别结果。
关于嘴巴识别装置的具体限定可以参见上文中对于嘴巴识别方法的限定,在此不再赘述。上述嘴巴识别装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。
在一个实施例中,提供了一种计算机设备,该计算机设备可以是服务器,其内部结构图可以如图10所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络接口和数据库。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统、计算机可读指令和数据库。该内存储器为非易失性存储介质中的操作系统和计算机可读指令的运行提供环境。该计算机设备的数据库用于存储嘴巴模型训练方法中的人脸图像样本数据的特征向量和嘴巴模型训练数据。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机可读指令被处理器执行时以实现一种嘴巴模型训练方法。或者,该计算机可读指令被处理器执行时实现实施例中嘴巴识别装置中各模块/单元的功能
在一个实施例中,提供了一种计算机设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机可读指令,处理器执行计算机可读指令时实现上述实施例嘴巴模型训练方法的步骤,例如图2所示的步骤S10至步骤S60。或者处理器执行计算机可读指令时实现上述实施例嘴巴识别方法的步骤,例如图8所示的步骤S70至步骤S90。或者,处理器执行计算机可读指令时实现上述实施例嘴巴模型训练装置的各模块/单元的功能,例如图7所示的模块10至模块60。或者,处理器执行计算机可读指令时实现上述实施例嘴巴识别装置的各模块/单元的功能,例如图9所示的模块70至模块90。为避免重复,这里不再赘述。
一个或多个存储有计算机可读指令的非易失性可读存储介质,该计算机可读指令被一个或多个处理器执行时,使得一个或多个处理器执行上述实施例嘴巴模型训练方法的步骤,或者计算机可读指令被一个或多个处理器执行时实现上述实施例嘴巴识别方法的步骤,或者,计算机可读指令被一个或多个处理器执行时实现上述实施例嘴巴模型训练装置的各模块/单元的功能,或者,计算机可读指令被一个或多个处理器执行时实现上述实施例嘴巴识别装置的各模块/单元的功能,为避免重复,这里不再赘述。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机可读指令来指令相关的硬件来完成,所述的计算机可读指令可存储于一非易失性计算机可读取存储介质中,该计算机可读指令在执行时,可包括如上述各方法的实施例的流程。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,仅以上述各功能单元、模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能单元、模块完成。
以上所述实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围,均应包含在本申请的保护范围之内。

Claims (20)

  1. 一种嘴巴模型训练方法,其特征在于,包括:
    获取人脸图像样本,并对所述人脸图像样本进行标记以得到人脸图像样本数据,及提取所述人脸图像样本数据中的人脸图像样本的特征向量,其中,人脸图像样本数据包括人脸图像样本和标注数据;
    将所述人脸图像样本数据划分为训练样本数据和验证样本数据;
    采用所述训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面;
    计算所述验证样本数据中的验证样本的特征向量与所述临界面的向量距离;
    获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值;
    根据所述分类阈值,获取嘴巴判断模型。
  2. 如权利要求1所述的嘴巴模型训练方法,其特征在于,所述提取所述人脸图像样本数据中的人脸图像样本的特征向量,包括:
    采用人脸特征点检测算法获取人脸特征点,所述人脸特征点包括:左嘴角点、右嘴角点和鼻尖点;
    根据所述左嘴角点和所述右嘴角点对所述人脸图像样本进行正向调整;
    根据所述左嘴角点、所述右嘴角点和所述鼻尖点构建嘴巴矩形区域;
    对所述嘴巴矩形区域进行图像归一化处理,得到归一化嘴巴矩形区域;
    根据所述归一化嘴巴矩形区域提取HOG特征向量。
  3. 如权利要求1所述的嘴巴模型训练方法,其特征在于,所述采用训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面,包括:
    获取所述支持向量机分类器的核函数和所述支持向量机分类器的惩罚参数,采用以下公式求解拉格朗日乘子 \alpha^{*}=(\alpha_{1}^{*},\alpha_{2}^{*},\cdots,\alpha_{l}^{*})^{T} 和决策阈值b:
    \min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i}
    s.t.\ \sum_{i=1}^{l}\alpha_{i}y_{i}=0,\quad 0\le\alpha_{i}\le C,\ i=1,2,\cdots,l
    式中,s.t.是数学公式中约束条件的缩写,min是指在约束条件下取代数式 \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i} 的最小值,K(x_i,x_j)为所述支持向量机分类器的核函数,C为所述支持向量机分类器的惩罚参数,C>0,α_i与所述拉格朗日乘子 \alpha^{*} 是共轭关系,x_i为所述训练样本数据的特征向量,l为所述训练样本数据的特征向量的个数,y_i为所述训练样本数据的标注;
    根据所述拉格朗日乘子 \alpha^{*} 和所述决策阈值b,采用如下公式,得到所述支持向量机分类器的临界面g(x):
    g(x)=\sum_{i=1}^{l}\alpha_{i}^{*}y_{i}K(x_{i},x)+b。
  4. 如权利要求2所述的嘴巴模型训练方法,其特征在于,所述根据所述归一化嘴巴矩形区域提取HOG特征向量,包括:
    将归一化嘴巴矩形区域划分成细胞单元,并计算所述细胞单元的每个像素梯度的大小和方向;
    统计所述细胞单元的每个像素梯度的大小和方向的梯度直方图;
    串联所述梯度直方图,得到所述HOG特征向量。
  5. 如权利要求1所述的嘴巴模型训练方法,其特征在于,所述获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值,包括:
    根据所述向量距离和与验证样本数据对应的标注数据绘制ROC曲线;
    根据所述预设真正类率或预设假正类率在所述ROC曲线的横轴上获取分类阈值。
  6. 一种嘴巴识别方法,其特征在于,包括:
    获取待识别人脸图片,采用人脸特征点检测算法获取正向的嘴巴区域图像;
    对所述正向的嘴巴区域图像进行归一化处理,得到待识别嘴巴图像;
    将所述待识别嘴巴图像输入到如权利要求1-5任一项所述嘴巴模型训练方法训练得到的嘴巴判断模型进行识别,获取识别结果。
  7. 一种嘴巴模型训练装置,其特征在于,包括:
    人脸图像样本数据获取模块,用于获取人脸图像样本,并对所述人脸图像样本进行标记以得到人脸图像样本数据,及提取所述人脸图像样本数据中的人脸图像样本的特征向量,其中,人脸图像样本数据包括人脸图像样本和标注数据;
    人脸图像样本数据划分模块,用于将所述人脸图像样本数据划分为训练样本数据和验证样本数据;
    临界面获取模块,用于采用所述训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面;
    向量距离计算模块,用于计算所述验证样本数据中的验证样本的特征向量与所述临界面的向量距离;
    分类阈值获取模块,用于获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值;
    嘴巴判断模型获取模块,用于根据所述分类阈值,获取嘴巴判断模型。
  8. 如权利要求7所述的嘴巴模型训练装置,其特征在于,所述人脸图像样本数据获取模块,包括:
    采用人脸特征点检测算法获取人脸特征点,所述人脸特征点包括:左嘴角点、右嘴角点和鼻尖点;
    根据所述左嘴角点和所述右嘴角点对所述人脸图像样本进行正向调整;
    根据所述左嘴角点、所述右嘴角点和所述鼻尖点构建嘴巴矩形区域;
    对所述嘴巴矩形区域进行图像归一化处理,得到归一化嘴巴矩形区域;
    根据所述归一化嘴巴矩形区域提取HOG特征向量。
  9. 如权利要求7所述的嘴巴模型训练装置,其特征在于,所述临界面获取模块,包括:
    获取所述支持向量机分类器的核函数和所述支持向量机分类器的惩罚参数,采用以下公式求解拉格朗日乘子 \alpha^{*}=(\alpha_{1}^{*},\alpha_{2}^{*},\cdots,\alpha_{l}^{*})^{T} 和决策阈值b:
    \min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i}
    s.t.\ \sum_{i=1}^{l}\alpha_{i}y_{i}=0,\quad 0\le\alpha_{i}\le C,\ i=1,2,\cdots,l
    式中,s.t.是数学公式中约束条件的缩写,min是指在约束条件下取代数式 \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i} 的最小值,K(x_i,x_j)为所述支持向量机分类器的核函数,C为所述支持向量机分类器的惩罚参数,C>0,α_i与所述拉格朗日乘子 \alpha^{*} 是共轭关系,x_i为所述训练样本数据的特征向量,l为所述训练样本数据的特征向量的个数,y_i为所述训练样本数据的标注;
    根据所述拉格朗日乘子 \alpha^{*} 和所述决策阈值b,采用如下公式,得到所述支持向量机分类器的临界面g(x):
    g(x)=\sum_{i=1}^{l}\alpha_{i}^{*}y_{i}K(x_{i},x)+b。
  10. 一种嘴巴识别装置,其特征在于,包括:
    待识别人脸图片获取模块,用于获取待识别人脸图片,采用人脸特征点检测算法获取正向的嘴巴区域图像;
    待识别嘴巴图像获取模块,用于对所述正向的嘴巴区域图像进行归一化处理,得到待识别嘴巴图像;
    识别结果获取模块,用于将所述待识别嘴巴图像输入到嘴巴判断模型进行识别,获取识别结果,其中,所述嘴巴判断模型是采用如下训练方法训练得到:
    获取人脸图像样本,并对所述人脸图像样本进行标记以得到人脸图像样本数据,及提取所述人脸图像样本数据中的人脸图像样本的特征向量,其中,人脸图像样本数据包括人脸图像样本和标注数据;
    将所述人脸图像样本数据划分为训练样本数据和验证样本数据;
    采用所述训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面;
    计算所述验证样本数据中的验证样本的特征向量与所述临界面的向量距离;
    获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值;
    根据所述分类阈值,获取嘴巴判断模型。
  11. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令,其特征在于,所述处理器执行所述计算机可读指令时实现如下步骤:
    获取人脸图像样本,并对所述人脸图像样本进行标记以得到人脸图像样本数据,及提取所述人脸图像样本数据中的人脸图像样本的特征向量,其中,人脸图像样本数据包括人脸图像样本和标注数据;
    将所述人脸图像样本数据划分为训练样本数据和验证样本数据;
    采用所述训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面;
    计算所述验证样本数据中的验证样本的特征向量与所述临界面的向量距离;
    获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值;
    根据所述分类阈值,获取嘴巴判断模型。
  12. 如权利要求11所述的计算机设备,其特征在于,所述提取所述人脸图像样本数据中的人脸图像样本的特征向量,包括:
    采用人脸特征点检测算法获取人脸特征点,所述人脸特征点包括:左嘴角点、右嘴角点和鼻尖点;
    根据所述左嘴角点和所述右嘴角点对所述人脸图像样本进行正向调整;
    根据所述左嘴角点、所述右嘴角点和所述鼻尖点构建嘴巴矩形区域;
    对所述嘴巴矩形区域进行图像归一化处理,得到归一化嘴巴矩形区域;
    根据所述归一化嘴巴矩形区域提取HOG特征向量。
  13. 如权利要求11所述的计算机设备,其特征在于,所述采用训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面,包括:
    获取所述支持向量机分类器的核函数和所述支持向量机分类器的惩罚参数,采用以下公式求解拉格朗日乘子 \alpha^{*}=(\alpha_{1}^{*},\alpha_{2}^{*},\cdots,\alpha_{l}^{*})^{T} 和决策阈值b:
    \min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i}
    s.t.\ \sum_{i=1}^{l}\alpha_{i}y_{i}=0,\quad 0\le\alpha_{i}\le C,\ i=1,2,\cdots,l
    式中,s.t.是数学公式中约束条件的缩写,min是指在约束条件下取代数式 \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i} 的最小值,K(x_i,x_j)为所述支持向量机分类器的核函数,C为所述支持向量机分类器的惩罚参数,C>0,α_i与所述拉格朗日乘子 \alpha^{*} 是共轭关系,x_i为所述训练样本数据的特征向量,l为所述训练样本数据的特征向量的个数,y_i为所述训练样本数据的标注;
    根据所述拉格朗日乘子 \alpha^{*} 和所述决策阈值b,采用如下公式,得到所述支持向量机分类器的临界面g(x):
    g(x)=\sum_{i=1}^{l}\alpha_{i}^{*}y_{i}K(x_{i},x)+b。
  14. 如权利要求11所述的计算机设备,其特征在于,所述获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值,包括:
    根据所述向量距离和与验证样本数据对应的标注数据绘制ROC曲线;
    根据所述预设真正类率或预设假正类率在所述ROC曲线的横轴上获取分类阈值。
  15. 一种计算机设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令,其特征在于,所述处理器执行所述计算机可读指令时实现如下步骤:
    获取待识别人脸图片,采用人脸特征点检测算法获取正向的嘴巴区域图像;
    对所述正向的嘴巴区域图像进行归一化处理,得到待识别嘴巴图像;
    将所述待识别嘴巴图像输入到嘴巴判断模型进行识别,获取识别结果,其中,所述嘴巴判断模型是采用如下训练方法训练得到:
    获取人脸图像样本,并对所述人脸图像样本进行标记以得到人脸图像样本数据,及提取所述人脸图像样本数据中的人脸图像样本的特征向量,其中,人脸图像样本数据包括人脸图像样本和标注数据;
    将所述人脸图像样本数据划分为训练样本数据和验证样本数据;
    采用所述训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面;
    计算所述验证样本数据中的验证样本的特征向量与所述临界面的向量距离;
    获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值;
    根据所述分类阈值,获取嘴巴判断模型。
  16. 一个或多个存储有计算机可读指令的非易失性可读存储介质,其特征在于,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行如下步骤:
    获取人脸图像样本,并对所述人脸图像样本进行标记以得到人脸图像样本数据,及提取所述人脸图像样本数据中的人脸图像样本的特征向量,其中,人脸图像样本数据包括人脸图像样本和标注数据;
    将所述人脸图像样本数据划分为训练样本数据和验证样本数据;
    采用所述训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面;
    计算所述验证样本数据中的验证样本的特征向量与所述临界面的向量距离;
    获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值;
    根据所述分类阈值,获取嘴巴判断模型。
  17. 如权利要求16所述的非易失性可读存储介质,其特征在于,所述提取所述人脸图像样本数据中的人脸图像样本的特征向量,包括:
    采用人脸特征点检测算法获取人脸特征点,所述人脸特征点包括:左嘴角点、右嘴角点和鼻尖点;
    根据所述左嘴角点和所述右嘴角点对所述人脸图像样本进行正向调整;
    根据所述左嘴角点、所述右嘴角点和所述鼻尖点构建嘴巴矩形区域;
    对所述嘴巴矩形区域进行图像归一化处理,得到归一化嘴巴矩形区域;
    根据所述归一化嘴巴矩形区域提取HOG特征向量。
  18. 如权利要求16所述的非易失性可读存储介质,其特征在于,所述采用训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面,包括:
    获取所述支持向量机分类器的核函数和所述支持向量机分类器的惩罚参数,采用以下公式求解拉格朗日乘子 \alpha^{*}=(\alpha_{1}^{*},\alpha_{2}^{*},\cdots,\alpha_{l}^{*})^{T} 和决策阈值b:
    \min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i}
    s.t.\ \sum_{i=1}^{l}\alpha_{i}y_{i}=0,\quad 0\le\alpha_{i}\le C,\ i=1,2,\cdots,l
    式中,s.t.是数学公式中约束条件的缩写,min是指在约束条件下取代数式 \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum_{i=1}^{l}\alpha_{i} 的最小值,K(x_i,x_j)为所述支持向量机分类器的核函数,C为所述支持向量机分类器的惩罚参数,C>0,α_i与所述拉格朗日乘子 \alpha^{*} 是共轭关系,x_i为所述训练样本数据的特征向量,l为所述训练样本数据的特征向量的个数,y_i为所述训练样本数据的标注;
    根据所述拉格朗日乘子 \alpha^{*} 和所述决策阈值b,采用如下公式,得到所述支持向量机分类器的临界面g(x):
    g(x)=\sum_{i=1}^{l}\alpha_{i}^{*}y_{i}K(x_{i},x)+b。
  19. 如权利要求16所述的非易失性可读存储介质,其特征在于,所述获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值,包括:
    根据所述向量距离和与验证样本数据对应的标注数据绘制ROC曲线;
    根据所述预设真正类率或预设假正类率在所述ROC曲线的横轴上获取分类阈值。
  20. 一个或多个存储有计算机可读指令的非易失性可读存储介质,其特征在于,所述计算机可读指令被一个或多个处理器执行时,使得所述一个或多个处理器执行如下步骤:
    获取待识别人脸图片,采用人脸特征点检测算法获取正向的嘴巴区域图像;
    对所述正向的嘴巴区域图像进行归一化处理,得到待识别嘴巴图像;
    将所述待识别嘴巴图像输入到嘴巴判断模型进行识别,获取识别结果,其中,所述嘴巴判断模型是采用如下训练方法训练得到:
    获取人脸图像样本,并对所述人脸图像样本进行标记以得到人脸图像样本数据,及提取所述人脸图像样本数据中的人脸图像样本的特征向量,其中,人脸图像样本数据包括人脸图像样本和标注数据;
    将所述人脸图像样本数据划分为训练样本数据和验证样本数据;
    采用所述训练样本数据训练支持向量机分类器,得到所述支持向量机分类器的临界面;
    计算所述验证样本数据中的验证样本的特征向量与所述临界面的向量距离;
    获取预设真正类率或预设假正类率,根据所述向量距离和与验证样本数据对应的标注数据获取分类阈值;
    根据所述分类阈值,获取嘴巴判断模型。
PCT/CN2018/094289 2018-06-06 2018-07-03 嘴巴模型训练方法、嘴巴识别方法、装置、设备及介质 WO2019232862A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810574521.6 2018-06-06
CN201810574521.6A CN108985155A (zh) 2018-06-06 2018-06-06 嘴巴模型训练方法、嘴巴识别方法、装置、设备及介质

Publications (1)

Publication Number Publication Date
WO2019232862A1 true WO2019232862A1 (zh) 2019-12-12

Family

ID=64540788

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/094289 WO2019232862A1 (zh) 2018-06-06 2018-07-03 嘴巴模型训练方法、嘴巴识别方法、装置、设备及介质

Country Status (2)

Country Link
CN (1) CN108985155A (zh)
WO (1) WO2019232862A1 (zh)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991641A (zh) * 2019-12-17 2020-04-10 合肥鼎盛锦业科技有限公司 一种油藏类型分析方法、装置及电子设备
CN111160169A (zh) * 2019-12-18 2020-05-15 中国平安人寿保险股份有限公司 人脸检测方法、装置、设备及计算机可读存储介质
CN111177811A (zh) * 2019-12-24 2020-05-19 武汉理工光科股份有限公司 一种应用于云平台的消防点位自动布图的方法
CN111353526A (zh) * 2020-02-19 2020-06-30 上海小萌科技有限公司 一种图像匹配方法、装置以及相关设备
CN111476271A (zh) * 2020-03-10 2020-07-31 杭州易现先进科技有限公司 图标识别的方法、装置、系统、计算机设备和存储介质
CN111580062A (zh) * 2020-05-25 2020-08-25 西安电子科技大学 基于双圆极化高分辨一维距离像hrrp的目标鉴别方法
CN111583225A (zh) * 2020-05-08 2020-08-25 京东方科技集团股份有限公司 缺陷检测方法、装置及存储介质
CN111652713A (zh) * 2020-07-01 2020-09-11 中国银行股份有限公司 权益风控建模方法和装置
CN111914908A (zh) * 2020-07-14 2020-11-10 浙江大华技术股份有限公司 一种图像识别模型训练方法、图像识别方法及相关设备
CN111915561A (zh) * 2020-06-30 2020-11-10 西安理工大学 一种基于图像识别和机器学习的螺栓状态监测方法
CN112101257A (zh) * 2020-09-21 2020-12-18 北京字节跳动网络技术有限公司 训练样本生成方法、图像处理方法、装置、设备和介质
CN112116525A (zh) * 2020-09-24 2020-12-22 百度在线网络技术(北京)有限公司 换脸识别方法、装置、设备和计算机可读存储介质
CN112232271A (zh) * 2020-10-29 2021-01-15 上海有个机器人有限公司 一种基于激光的人流检测方法以及设备
CN112529888A (zh) * 2020-12-18 2021-03-19 平安科技(深圳)有限公司 基于深度学习的人脸图像评估方法、装置、设备及介质
CN112884040A (zh) * 2021-02-19 2021-06-01 北京小米松果电子有限公司 训练样本数据的优化方法、系统、存储介质及电子设备
CN113313406A (zh) * 2021-06-16 2021-08-27 吉林大学 一种电动汽车运行大数据的动力电池安全风险评估方法

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222724B (zh) * 2019-05-15 2023-12-19 平安科技(深圳)有限公司 一种图片实例检测方法、装置、计算机设备及存储介质
CN110569721B (zh) * 2019-08-01 2023-08-29 平安科技(深圳)有限公司 识别模型训练方法、图像识别方法、装置、设备及介质
CN110490164B (zh) * 2019-08-26 2022-05-24 北京达佳互联信息技术有限公司 生成虚拟表情的方法、装置、设备及介质
CN110569826B (zh) * 2019-09-18 2022-05-24 深圳市捷顺科技实业股份有限公司 一种人脸识别方法、装置、设备及介质
CN113095146A (zh) * 2021-03-16 2021-07-09 深圳市雄帝科技股份有限公司 基于深度学习的嘴部状态分类方法、装置、设备和介质
CN113110833A (zh) * 2021-04-15 2021-07-13 成都新希望金融信息有限公司 机器学习模型可视化建模方法、装置、设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105069448A (zh) * 2015-09-29 2015-11-18 厦门中控生物识别信息技术有限公司 一种真假人脸识别方法及装置
CN105095856A (zh) * 2015-06-26 2015-11-25 上海交通大学 基于掩膜的有遮挡人脸识别方法
US9251402B2 (en) * 2011-05-13 2016-02-02 Microsoft Technology Licensing, Llc Association and prediction in facial recognition
CN106022317A (zh) * 2016-06-27 2016-10-12 北京小米移动软件有限公司 人脸识别方法及装置
CN107992783A (zh) * 2016-10-26 2018-05-04 上海银晨智能识别科技有限公司 人脸图像处理方法及装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9251402B2 (en) * 2011-05-13 2016-02-02 Microsoft Technology Licensing, Llc Association and prediction in facial recognition
CN105095856A (zh) * 2015-06-26 2015-11-25 上海交通大学 基于掩膜的有遮挡人脸识别方法
CN105069448A (zh) * 2015-09-29 2015-11-18 厦门中控生物识别信息技术有限公司 一种真假人脸识别方法及装置
CN106022317A (zh) * 2016-06-27 2016-10-12 北京小米移动软件有限公司 人脸识别方法及装置
CN107992783A (zh) * 2016-10-26 2018-05-04 上海银晨智能识别科技有限公司 人脸图像处理方法及装置

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991641A (zh) * 2019-12-17 2020-04-10 合肥鼎盛锦业科技有限公司 一种油藏类型分析方法、装置及电子设备
CN110991641B (zh) * 2019-12-17 2024-03-05 合肥鼎盛锦业科技有限公司 一种油藏类型分析方法、装置及电子设备
CN111160169A (zh) * 2019-12-18 2020-05-15 中国平安人寿保险股份有限公司 人脸检测方法、装置、设备及计算机可读存储介质
CN111160169B (zh) * 2019-12-18 2024-03-15 中国平安人寿保险股份有限公司 人脸检测方法、装置、设备及计算机可读存储介质
CN111177811A (zh) * 2019-12-24 2020-05-19 武汉理工光科股份有限公司 一种应用于云平台的消防点位自动布图的方法
CN111353526A (zh) * 2020-02-19 2020-06-30 上海小萌科技有限公司 一种图像匹配方法、装置以及相关设备
CN111476271A (zh) * 2020-03-10 2020-07-31 杭州易现先进科技有限公司 图标识别的方法、装置、系统、计算机设备和存储介质
CN111476271B (zh) * 2020-03-10 2023-07-21 杭州易现先进科技有限公司 图标识别的方法、装置、系统、计算机设备和存储介质
CN111583225A (zh) * 2020-05-08 2020-08-25 京东方科技集团股份有限公司 缺陷检测方法、装置及存储介质
CN111580062A (zh) * 2020-05-25 2020-08-25 西安电子科技大学 基于双圆极化高分辨一维距离像hrrp的目标鉴别方法
CN111915561A (zh) * 2020-06-30 2020-11-10 西安理工大学 一种基于图像识别和机器学习的螺栓状态监测方法
CN111652713A (zh) * 2020-07-01 2020-09-11 中国银行股份有限公司 权益风控建模方法和装置
CN111652713B (zh) * 2020-07-01 2024-02-27 中国银行股份有限公司 权益风控建模方法和装置
CN111914908A (zh) * 2020-07-14 2020-11-10 浙江大华技术股份有限公司 一种图像识别模型训练方法、图像识别方法及相关设备
CN111914908B (zh) * 2020-07-14 2023-10-24 浙江大华技术股份有限公司 一种图像识别模型训练方法、图像识别方法及相关设备
CN112101257A (zh) * 2020-09-21 2020-12-18 北京字节跳动网络技术有限公司 训练样本生成方法、图像处理方法、装置、设备和介质
CN112101257B (zh) * 2020-09-21 2022-05-31 北京字节跳动网络技术有限公司 训练样本生成方法、图像处理方法、装置、设备和介质
CN112116525A (zh) * 2020-09-24 2020-12-22 百度在线网络技术(北京)有限公司 换脸识别方法、装置、设备和计算机可读存储介质
CN112232271A (zh) * 2020-10-29 2021-01-15 上海有个机器人有限公司 一种基于激光的人流检测方法以及设备
CN112232271B (zh) * 2020-10-29 2023-09-12 上海有个机器人有限公司 一种基于激光的人流检测方法以及设备
CN112529888A (zh) * 2020-12-18 2021-03-19 平安科技(深圳)有限公司 基于深度学习的人脸图像评估方法、装置、设备及介质
CN112529888B (zh) * 2020-12-18 2024-04-30 平安科技(深圳)有限公司 基于深度学习的人脸图像评估方法、装置、设备及介质
CN112884040A (zh) * 2021-02-19 2021-06-01 北京小米松果电子有限公司 训练样本数据的优化方法、系统、存储介质及电子设备
CN112884040B (zh) * 2021-02-19 2024-04-30 北京小米松果电子有限公司 训练样本数据的优化方法、系统、存储介质及电子设备
CN113313406B (zh) * 2021-06-16 2023-11-21 吉林大学 一种电动汽车运行大数据的动力电池安全风险评估方法
CN113313406A (zh) * 2021-06-16 2021-08-27 吉林大学 一种电动汽车运行大数据的动力电池安全风险评估方法

Also Published As

Publication number Publication date
CN108985155A (zh) 2018-12-11

Similar Documents

Publication Publication Date Title
WO2019232866A1 (zh) 人眼模型训练方法、人眼识别方法、装置、设备及介质
WO2019232862A1 (zh) 嘴巴模型训练方法、嘴巴识别方法、装置、设备及介质
US11256905B2 (en) Face detection method and apparatus, service processing method, terminal device, and storage medium
US10956719B2 (en) Depth image based face anti-spoofing
US11182592B2 (en) Target object recognition method and apparatus, storage medium, and electronic device
WO2020024400A1 (zh) 课堂监控方法、装置、计算机设备及存储介质
WO2019128646A1 (zh) 人脸检测方法、卷积神经网络参数的训练方法、装置及介质
US8818034B2 (en) Face recognition apparatus and methods
US7912253B2 (en) Object recognition method and apparatus therefor
Sun et al. Face detection based on color and local symmetry information
WO2022027912A1 (zh) 一种人脸姿态检测方法、装置、终端设备及存储介质
WO2020119450A1 (zh) 基于面部图片的风险识别方法、装置、计算机设备及存储介质
CN109165589B (zh) 基于深度学习的车辆重识别方法和装置
CN106778450B (zh) 一种面部识别方法和装置
WO2015149534A1 (zh) 基于Gabor二值模式的人脸识别方法及装置
US11361587B2 (en) Age recognition method, storage medium and electronic device
WO2021051611A1 (zh) 基于人脸可见性的人脸识别方法、系统、装置及存储介质
US8090151B2 (en) Face feature point detection apparatus and method of the same
WO2020187160A1 (zh) 基于级联的深层卷积神经网络的人脸识别方法及系统
KR102530516B1 (ko) 자동 수어 인식 방법 및 시스템
CN112784712B (zh) 一种基于实时监控的失踪儿童预警实现方法、装置
Dave et al. Face recognition in mobile phones
Kawulok Energy-based blob analysis for improving precision of skin segmentation
Mannan et al. Classification of degraded traffic signs using flexible mixture model and transfer learning
CN110175500B (zh) 指静脉比对方法、装置、计算机设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18921887

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12/03/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18921887

Country of ref document: EP

Kind code of ref document: A1