CN111652260A - Method and system for selecting number of face clustering samples - Google Patents

Publication number
CN111652260A
Authority
CN
China
Prior art keywords
face
training set
clustering
cluster
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910363240.0A
Other languages
Chinese (zh)
Other versions
CN111652260B (en)
Inventor
薛圆圆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Re Sr Information Technology Co ltd
Original Assignee
Shanghai Re Sr Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Re Sr Information Technology Co ltd
Priority to CN201910363240.0A
Publication of CN111652260A
Application granted
Publication of CN111652260B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of face recognition and discloses a method for selecting the number of face clustering samples. The method comprises the following steps: constructing a face test set and a plurality of face training sets, each face training set containing a different number of face images; clustering the face training sets to obtain a plurality of corresponding cluster centers; calculating the cosine distance between each cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to each cluster center; and obtaining the number range of the face clustering samples according to the cosine distance mean value and root mean square value corresponding to each cluster center. The invention also discloses a corresponding system for selecting the number of face clustering samples. The invention thus provides a way to select the number of face clustering samples while ensuring a good clustering effect.

Description

Method and system for selecting number of face clustering samples
Technical Field
The invention relates to the technical field of face recognition, in particular to a method and a system for selecting the number of face cluster samples.
Background
Face recognition is a biometric technology that identifies a person from facial feature information. It covers a series of related techniques: capturing images or video streams containing faces with a camera, automatically detecting and tracking the faces in those images, and then recognizing the detected faces. In a face recognition product, several face pictures must be added to register a face model. The registered face model is generally a feature vector, so one person may end up with several face feature vectors; the feature vectors therefore need to be clustered so that each person corresponds to a single feature vector. A face clustering algorithm clusters the feature vectors extracted from several photos of the same person to find their cluster center, such that the sum of squared distances from the cluster center to the respective images is minimal. Patent application publication No. CN 108875778A discloses a face clustering method that includes: determining a clustering mode based on the number of images to be clustered, wherein the clustering mode describes how many images are taken from the images to be clustered for each round of face clustering; and, based on the determined clustering mode, taking the corresponding number of images from the images to be clustered each time and performing face clustering until all images to be clustered have been processed.
Generally, a clustering algorithm gives a good clustering effect when many images are available. However, a user provides only a few photos for clustering when registering a face. The above patent application provides a technical scheme for face clustering but does not provide a method for selecting the number of face clustering samples.
Therefore, how to select and evaluate the number of face clustering samples while ensuring a good clustering effect is a technical problem to be solved.
Disclosure of Invention
The invention aims to provide a method and a system for selecting the number of face clustering samples that can ensure a good clustering effect.
In order to achieve the above object, the present invention provides a method for selecting the number of face clustering samples, wherein the method comprises: S1, constructing a face test set and a plurality of face training sets, wherein the number of face images of each face training set is different; S2, clustering the face training sets to obtain a plurality of corresponding cluster centers; S3, calculating the cosine distance between each cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to each cluster center; and S4, obtaining the number range of the face clustering samples according to the cosine distance mean value and root mean square value corresponding to each cluster center. This technical scheme uses the mean value and the root mean square value of the cosine distances between a cluster center and the test set images as indexes for evaluating the number of face clustering samples.
Preferably, the step S1 includes: constructing an original face image set containing face images of a plurality of persons, and performing face detection and cropping on all face images in the original face image set; and selecting a preset number of face images of the same person from the original face image set, wherein the selected face images form the face test set.
Preferably, the step S1 includes: constructing a first face training set, wherein the number of face images of the first face training set is N1; constructing a second face training set, wherein the number of face images of the second face training set is N2; constructing a third face training set, wherein the number of face images of the third face training set is N3; constructing a fourth face training set, wherein the number of face images of the fourth face training set is N4; wherein N1 > N2 > N3 > N4, N1 > 10, N2 ≤ 10, and N4 ≥ 3.
Preferably, the step S2 includes: performing convolution and feature extraction on each face image in the first face training set according to a convolutional neural network model to generate a first feature vector group corresponding to the first face training set, and performing K-means clustering on the first feature vector group to obtain a first cluster center corresponding to the first face training set; performing convolution and feature extraction on each face image in the second face training set according to the convolutional neural network model to generate a second feature vector group corresponding to the second face training set, and performing K-means clustering on the second feature vector group to obtain a second cluster center corresponding to the second face training set; performing convolution and feature extraction on each face image in the third face training set according to the convolutional neural network model to generate a third feature vector group corresponding to the third face training set, and performing K-means clustering on the third feature vector group to obtain a third cluster center corresponding to the third face training set; and performing convolution and feature extraction on each face image in the fourth face training set according to the convolutional neural network model to generate a fourth feature vector group corresponding to the fourth face training set, and performing K-means clustering on the fourth feature vector group to obtain a fourth cluster center corresponding to the fourth face training set.
Preferably, the step S3 includes: the mean value of the cosine distances is calculated according to formula 1:
$$\mathrm{mean} = \frac{1}{n}\sum_{i=1}^{n} d_i \qquad (1)$$
wherein mean is the mean value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the cluster center and the feature vector of the i-th face image in the face test set.
Preferably, the step S3 further includes: the root mean square value of the cosine distances is calculated according to formula 2:
Figure BDA0002047474020000032
wherein var is the root mean square value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the cluster center and the feature vector of the i-th face image in the face test set.
Preferably, the step S3 further includes: calculating the cosine distance between the first cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to the first cluster center according to the formula 1 and the formula 2; calculating the cosine distance between the second cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to the second cluster center according to the formula 1 and the formula 2; calculating the cosine distance between the third cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to the third cluster center according to the formula 1 and the formula 2; and calculating the cosine distance between the fourth cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to the fourth cluster center according to the formula 1 and the formula 2.
Preferably, the step S3 further includes: respectively calculating the cosine distance between the feature vector of each face image in the fourth face training set and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to each face image in the fourth face training set according to the formula 1 and the formula 2.
Preferably, the step S4 includes: the number range of the face clustering samples is [3, 10]. This technical scheme provides a number range for the face clustering samples within which a good clustering effect can be obtained, and it guides how many face photos a user needs to provide during face registration.
In order to achieve the above object, the present invention provides a system for selecting the number of face clustering samples, the system comprising: a training set module, used for constructing a face test set and a plurality of face training sets, wherein the number of face images of each face training set is different; a clustering module, used for clustering the face training sets to obtain a plurality of corresponding cluster centers; a calculation module, used for calculating the cosine distance between each cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to each cluster center; and an evaluation module, used for obtaining the number range of the face clustering samples according to the cosine distance mean value and root mean square value corresponding to each cluster center. This technical scheme uses the mean value and the root mean square value of the cosine distances between a cluster center and the test set images as indexes for evaluating the number of face clustering samples.
Compared with the prior art, the method and the system for selecting the number of face clustering samples have the following beneficial effects: the mean value and the root mean square value of the cosine distances between the cluster center and the test set images are used as indexes for evaluating the number of face clustering samples; a number range for the face clustering samples is provided within which a good clustering effect can be obtained, guiding how many face photos a user needs to provide during face registration; and because the user provides only a small number of images for face clustering during face registration, the scheme is highly practical and improves the user experience.
Drawings
Fig. 1 is a flowchart illustrating a method for selecting a number of face cluster samples according to an embodiment of the present invention.
Fig. 2 is a block diagram of a system for selecting the number of face cluster samples according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments shown in the drawings. In the drawings, structurally identical elements are represented by like reference numerals, and structurally or functionally similar elements are represented by like reference numerals throughout the several views. The size and thickness of each component shown in the drawings are arbitrarily illustrated, and the present invention is not limited to the size and thickness of each component. The thickness of the components may be exaggerated where appropriate in the figures to improve clarity.
As shown in fig. 1, according to an embodiment of the present invention, the present invention provides a method for selecting a number of face cluster samples, where the method includes:
S1, constructing a face test set and a plurality of face training sets, wherein the number of face images in each face training set is different;
S2, clustering the face training sets to obtain a plurality of corresponding cluster centers;
S3, calculating the cosine distance between each cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to each cluster center;
S4, obtaining the number range of the face clustering samples according to the cosine distance mean value and root mean square value corresponding to each cluster center.
Step S1 is: constructing a face test set and a plurality of face training sets, wherein the number of face images of each face training set is different. Specifically, an original face image set containing face images of a plurality of persons is constructed, and face detection and cropping are performed on all face images in the original face image set to produce face images of the same standard size. A preset number of face images of the same person are then selected from the original face image set to form the face test set. According to an embodiment of the present invention, the original face image set contains more than 112 samples of each face, and 96 face images of the same person are selected from it as the face test set.
According to an embodiment of the present invention, the step S1 further includes: constructing a first face training set, wherein the number of face images of the first face training set is N1; constructing a second face training set, wherein the number of face images of the second face training set is N2; constructing a third face training set, wherein the number of face images of the third face training set is N3; constructing a fourth face training set, wherein the number of face images of the fourth face training set is N4; wherein N1 > N2 > N3 > N4, N1 > 10, N2 ≤ 10, and N4 ≥ 3. According to a preferred embodiment of the present invention, N1 is 16, N2 is 10, N3 is 5, and N4 is 3. 16 face images belonging to the same person are selected from the original face image set to form the first face training set. Similarly, 10 face images belonging to the same person are selected from the original face image set to form the second face training set, 5 face images belonging to the same person are selected to form the third face training set, and 3 face images belonging to the same person are selected to form the fourth face training set.
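The set construction in step S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_sets`, the file names, and the random shuffle are hypothetical, and the sketch assumes that the training sets are nested subsets drawn from images that are not in the test set (the source does not say whether the sets overlap). Note that 96 test images plus the largest training set of 16 images gives 112, which matches the embodiment's requirement of more than 112 samples per face.

```python
import random


def build_sets(image_ids, test_size=96, train_sizes=(16, 10, 5, 3), seed=0):
    """Split one person's images into a test set and several training sets."""
    needed = test_size + max(train_sizes)
    if len(image_ids) < needed:
        raise ValueError(f"need at least {needed} images, got {len(image_ids)}")
    shuffled = list(image_ids)
    random.Random(seed).shuffle(shuffled)
    test_set = shuffled[:test_size]
    pool = shuffled[test_size:]  # images never used for testing
    # Assumption: each training set is a prefix of the same pool (nested sets).
    train_sets = {n: pool[:n] for n in train_sizes}
    return test_set, train_sets


# Hypothetical file names for one person's detected and cropped face images.
image_ids = [f"person_a_{i:03d}.jpg" for i in range(120)]
test_set, train_sets = build_sets(image_ids)
```

Keeping the test images out of every training set means the distances measured later are not biased by images the cluster center was built from.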
Step S2 is: clustering the face training sets to obtain a plurality of corresponding cluster centers. A cluster center is generated by extracting the face features of a face training set and clustering the extracted face features. According to an embodiment of the present invention, the step S2 includes: performing convolution and feature extraction on each face image in the first face training set according to a convolutional neural network model to obtain a feature vector corresponding to each face image and generate a first feature vector group corresponding to the first face training set; and performing K-means clustering on the first feature vector group to obtain a first cluster center corresponding to the first face training set. Performing convolution and feature extraction on each face image in the second face training set according to the convolutional neural network model to obtain a feature vector corresponding to each face image and generate a second feature vector group corresponding to the second face training set; and performing K-means clustering on the second feature vector group to obtain a second cluster center corresponding to the second face training set. Performing convolution and feature extraction on each face image in the third face training set according to the convolutional neural network model to obtain a feature vector corresponding to each face image and generate a third feature vector group corresponding to the third face training set; and performing K-means clustering on the third feature vector group to obtain a third cluster center corresponding to the third face training set. Performing convolution and feature extraction on each face image in the fourth face training set according to the convolutional neural network model to obtain a feature vector corresponding to each face image and generate a fourth feature vector group corresponding to the fourth face training set; and performing K-means clustering on the fourth feature vector group to obtain a fourth cluster center corresponding to the fourth face training set.
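Because each training set holds images of one person and yields one cluster center, the clustering in step S2 can be sketched under the assumption that K-means is run with a single cluster (k = 1); the source does not state k. With one cluster and squared Euclidean distance, the K-means center is simply the component-wise mean of the feature vectors. The function name `cluster_center` and the toy 2-D vectors below are hypothetical, and the convolutional feature extraction is not shown.

```python
def cluster_center(vectors):
    """Return the component-wise mean of equal-length feature vectors.

    Assumption: K-means with k = 1, whose center is the arithmetic mean.
    """
    if not vectors:
        raise ValueError("at least one feature vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all feature vectors must have the same length")
    count = len(vectors)
    return [sum(v[j] for v in vectors) / count for j in range(dim)]


# Toy stand-ins for feature vectors produced by a face embedding network.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
center = cluster_center(features)
```

The mean is also the point that minimizes the sum of squared distances to the vectors, which is the objective the background section attributes to the face clustering algorithm.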
Step S3 is: calculating the cosine distance between each cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to each cluster center. According to an embodiment of the present invention, the step S3 includes: the mean value of the cosine distances is calculated according to formula 1:
$$\mathrm{mean} = \frac{1}{n}\sum_{i=1}^{n} d_i \qquad (1)$$
wherein mean is the mean value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the cluster center and the feature vector of the i-th face image in the face test set.
According to an embodiment of the present invention, the step S3 further includes: the root mean square value of the cosine distances is calculated according to formula 2:
Figure BDA0002047474020000072
wherein var is the root mean square value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the cluster center and the feature vector of the i-th face image in the face test set.
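The statistics of step S3 can be sketched as follows. Several points are assumptions rather than facts from the source: cosine distance is taken as one minus cosine similarity, and because the image of formula 2 is not reproduced here, the sketch reports two candidate readings of the "root mean square value": the plain root mean square of the distances (`rms`) and the root mean square deviation from the mean (`rms_dev`). The function names and the toy vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_stats(center, test_vectors):
    """Summarize the cosine distances from one cluster center to the test set."""
    distances = [cosine_distance(center, t) for t in test_vectors]
    n = len(distances)
    mean = sum(distances) / n  # formula 1
    # Two assumed readings of formula 2 (the source image is unavailable):
    rms = math.sqrt(sum(d * d for d in distances) / n)
    rms_dev = math.sqrt(sum((d - mean) ** 2 for d in distances) / n)
    return {
        "max": max(distances),
        "min": min(distances),
        "mean": mean,
        "rms": rms,
        "rms_dev": rms_dev,
    }


# Toy example: one test vector equals the center, the other is orthogonal to it.
stats = distance_stats([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Running the same function once per cluster center gives one row of statistics per training-set size, which is the shape of data the evaluation in step S4 compares.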
According to an embodiment of the present invention, the step S3 further includes: calculating the cosine distance between the first cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to the first cluster center according to the formula 1 and the formula 2; calculating the cosine distance between the second cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to the second cluster center according to the formula 1 and the formula 2; calculating the cosine distance between the third cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to the third cluster center according to the formula 1 and the formula 2; and calculating the cosine distance between the fourth cluster center and the feature vector of each face image in the face test set, and obtaining the mean value and root mean square value of the cosine distances corresponding to the fourth cluster center according to the formula 1 and the formula 2.
According to an embodiment of the present invention, the step S3 further includes: respectively calculating the cosine distance between the feature vector of each face image in the fourth face training set and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to each face image in the fourth face training set according to the formula 1 and the formula 2.
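The comparison between clustering and not clustering can be sketched by computing the mean cosine distance to the test set once for each individual training image and once for the cluster center. This is an illustration under assumptions: cosine distance is taken as one minus cosine similarity, the center is the component-wise mean (K-means with k = 1), and the 2-D vectors are toy values, not data from the patent.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_distance(vector, test_vectors):
    """Mean cosine distance from one vector to every test vector."""
    return sum(cosine_distance(vector, t) for t in test_vectors) / len(test_vectors)


# Toy stand-ins for a 3-image training set and a small test set.
train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
test = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]

# Assumed k = 1 cluster center: the component-wise mean of the training vectors.
center = [sum(column) / len(train) for column in zip(*train)]

per_image = [mean_distance(v, test) for v in train]  # no clustering
clustered = mean_distance(center, test)              # with clustering
```

In this toy case the center does at least as well as the best single image and clearly better than the two off-axis images; whether that holds for real face embeddings is what the patent's second table is meant to show.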
Step S4 is: obtaining the number range of the face clustering samples according to the cosine distance mean value and root mean square value corresponding to each cluster center.
According to an embodiment of the present invention, table 1 shows the maximum cosine distance, the minimum cosine distance, the mean value of the cosine distances and the root mean square value of the cosine distances corresponding to each cluster center. N1 was set to 16, N2 to 10, N3 to 5, and N4 to 3. The first clustering center corresponds to a first face training set, and the number of the face images of the first face training set is 16. The second cluster center corresponds to a second face training set, and the number of face images in the second face training set is 10. The third cluster center corresponds to a third face training set, and the number of face images in the third face training set is 5. The fourth clustering center corresponds to a fourth face training set, and the number of the face images in the fourth face training set is 3.
Figure BDA0002047474020000081
Figure BDA0002047474020000091
TABLE 1
As can be seen from table 1, the larger the number of clustering samples, the smaller the mean value and the root mean square value of the corresponding cosine distances, and the better the clustering effect. The mean value and the root mean square value of the cosine distances between the cluster center and the face images in the face test set are therefore used as the evaluation indexes for the number of face clustering samples. As can also be seen from the table, once the number of face clustering samples exceeds 10, the rate at which the mean value and the root mean square value of the cosine distances decrease drops markedly, so the upper limit of the number of samples a user provides during face registration is set to 10; that is, the upper limit of the number range of the face clustering samples is 10.
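The source reads the upper limit off table 1 by inspection. One way to make that judgment explicit, offered only as a hypothetical rule and not as the patent's procedure, is to compute the reduction in mean cosine distance per additional sample between consecutive sample counts and stop where it falls below a chosen threshold. The function name `pick_upper_limit`, the threshold, and the numbers below are all illustrative; the values are synthetic and are not the figures from table 1.

```python
def pick_upper_limit(mean_by_count, min_gain_per_sample):
    """Return the largest sample count still reached with a worthwhile gain.

    mean_by_count maps a sample count to the mean cosine distance measured
    for the cluster center built from that many samples.
    """
    counts = sorted(mean_by_count)
    upper = counts[0]
    for prev, curr in zip(counts, counts[1:]):
        # Reduction in mean distance per extra sample between two set sizes.
        gain = (mean_by_count[prev] - mean_by_count[curr]) / (curr - prev)
        if gain < min_gain_per_sample:
            break
        upper = curr
    return upper


# Synthetic values for illustration only; NOT the figures from table 1.
synthetic_means = {3: 0.40, 5: 0.30, 10: 0.20, 16: 0.19}
upper = pick_upper_limit(synthetic_means, min_gain_per_sample=0.01)
```

With these synthetic numbers the gain from 10 to 16 samples is far smaller than the earlier gains, so the sketch stops at 10. The same check could be applied to the root mean square values.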
According to an embodiment of the present invention, table 2 shows the maximum cosine distance, the minimum cosine distance, the mean value of the cosine distances and the root mean square value of the cosine distances corresponding to each face image in the fourth face training set.
Figure BDA0002047474020000092
TABLE 2
As can be seen from table 2, even with a small number of face clustering samples, clustering gives a better result than not clustering. With 3 face clustering samples the clustering effect is clearly better than the non-clustering effect, so the lower limit of the number range of the face clustering samples is 3. Therefore, the number range of the face clustering samples is [3, 10], and within this range, the larger the number of face clustering samples, the better the clustering effect.
This technical scheme uses the mean value and the root mean square value of the cosine distances between the cluster center and the test set images as indexes for evaluating the number of face clustering samples, and provides a number range for the face clustering samples within which a good clustering effect can be obtained, guiding how many face photos a user needs to provide during face registration. Because the user provides only a small number of images for face clustering during face registration, the scheme is highly practical and improves the user experience.
As shown in fig. 2, in another embodiment, the present invention further provides a system for selecting a number of face cluster samples, where the system includes:
a training set module 20, configured to construct a face test set and construct a plurality of face training sets, where the number of face images in each face training set is different;
the clustering module 21 is configured to cluster the face training sets to obtain a plurality of corresponding clustering centers;
the calculation module 22 is configured to calculate a cosine distance between each cluster center and a feature vector of each face image in the face test set, and obtain a mean value and a root mean square value of the cosine distance corresponding to each cluster center;
and the evaluation module 23 is configured to obtain a number range of face cluster samples according to the cosine distance mean value and the root mean square value corresponding to each cluster center.
The training set module 20 is configured to construct a face test set and a plurality of face training sets, where the number of face images in each face training set is different. According to a specific embodiment of the invention, the training set module constructs a first face training set, wherein the number of face images of the first face training set is N1; a second face training set, wherein the number of face images of the second face training set is N2; a third face training set, wherein the number of face images of the third face training set is N3; and a fourth face training set, wherein the number of face images of the fourth face training set is N4; wherein N1 > N2 > N3 > N4, N1 > 10, N2 ≤ 10, and N4 ≥ 3.
The clustering module 21 is configured to cluster the face training sets to obtain a plurality of corresponding clustering centers. The clustering center is generated by extracting the face features of the face training set and clustering the extracted face features. According to a specific embodiment of the present invention, the clustering module clusters the four face training sets to generate respective corresponding clustering centers, that is, a first clustering center corresponding to the first face training set, a second clustering center corresponding to the second face training set, a third clustering center corresponding to the third face training set, and a fourth clustering center corresponding to the fourth face training set.
The calculation module 22 is configured to calculate the cosine distance between each cluster center and the feature vector of each face image in the face test set, and to obtain the mean value and root mean square value of the cosine distances corresponding to each cluster center according to formulas 1 and 2 of the above method embodiment. According to a specific embodiment of the present invention, the calculation module also calculates the cosine distance between the feature vector of each face image in the fourth face training set and the feature vector of each face image in the face test set, and obtains the mean value and root mean square value of the cosine distances corresponding to each face image in the fourth face training set according to formulas 1 and 2.
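The calculation performed by this module can be sketched as follows. Two points are assumptions rather than statements from the source: cosine distance is taken as one minus cosine similarity, and "root mean square" is read literally as the square root of the mean of the squared distances. The function names and toy vectors are hypothetical.

```python
import math

def cosine_distance(u, v):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def distance_stats(center, test_features):
    """Mean and root mean square of the distances from one center to every test feature."""
    distances = [cosine_distance(center, f) for f in test_features]
    n = len(distances)
    mean = sum(distances) / n
    # Literal root mean square (assumption about the source's formula 2).
    rms = math.sqrt(sum(d * d for d in distances) / n)
    return mean, rms

# Toy vectors: one test feature aligned with the center, one orthogonal to it.
center = [1.0, 0.0]
test_features = [[1.0, 0.0], [0.0, 1.0]]
mean, rms = distance_stats(center, test_features)
```

Applying `distance_stats` to a single training image's feature vector instead of a cluster center gives the per-image statistics described for the fourth face training set.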
The evaluation module 23 is configured to obtain the number range of face cluster samples according to the cosine distance mean value and root mean square value corresponding to each cluster center. The mean value and the root mean square value of the cosine distances between a cluster center and the face images in the face test set serve as the evaluation indexes for the number of face cluster samples: the more cluster samples there are, the smaller the mean value and root mean square value of the cosine distances, and the better the clustering effect. The number range of face cluster samples is [3, 10]; within this range, the larger the number of face cluster samples, the better the clustering.
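The source does not spell out how the evaluation module turns the two indexes into a range. One hypothetical reading is to keep extending the range upward for as long as adding samples lowers both the mean and the root mean square. The function name `improving_range` and the numbers below are invented for illustration and are not results from the patent.

```python
def improving_range(stats_by_size):
    """stats_by_size maps sample count -> (mean, rms); returns (low, high).

    Starting from the smallest count, the range is extended while each larger
    count strictly lowers both the mean and the root mean square distance.
    """
    sizes = sorted(stats_by_size)
    low = high = sizes[0]
    for smaller, larger in zip(sizes, sizes[1:]):
        mean_s, rms_s = stats_by_size[smaller]
        mean_l, rms_l = stats_by_size[larger]
        if mean_l < mean_s and rms_l < rms_s:
            high = larger
        else:
            break  # more samples no longer improve both indexes
    return low, high

# Invented numbers for illustration only.
stats = {2: (0.40, 0.42), 4: (0.35, 0.37), 8: (0.30, 0.32), 16: (0.31, 0.33)}
sample_range = improving_range(stats)
```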
The above technical scheme uses the mean value and root mean square value of the cosine distances between the cluster center and the test set images as indexes for evaluating the number of face cluster samples, and provides a number range of face cluster samples within which a good clustering effect can be obtained. This range guides the number of face photo samples that a user needs to provide during face registration.
While the invention has been described in detail in the foregoing with reference to the drawings and examples, such illustration and description are to be considered illustrative or exemplary and not restrictive. The invention is not limited to the disclosed embodiments. In the claims, the word "comprising" does not exclude other elements or steps, and the word "a" or "an" or "a particular plurality" should be understood to mean at least one or at least a particular plurality. Any reference signs in the claims shall not be construed as limiting the scope. Other variations to the above-described embodiments can be understood and effected by those skilled in the art, without inventive effort, from a study of the drawings, the description and the appended claims, and such variations still fall within the scope of the invention as claimed.

Claims (10)

1. A method for selecting the number of face cluster samples is characterized by comprising the following steps:
S1, constructing a face test set and a plurality of face training sets, wherein the number of face images in each face training set is different;
S2, clustering the face training sets to obtain a plurality of corresponding clustering centers;
S3, calculating the cosine distance between each cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to each cluster center;
and S4, acquiring the number range of the face cluster samples according to the cosine distance mean value and the root mean square value corresponding to each cluster center.
2. The method for selecting the number of face cluster samples according to claim 1, wherein the step S1 comprises:
constructing an original face image set with face images of multiple persons, and carrying out face detection and cutting on all the face images in the original face image set;
and selecting a preset number of face images of the same person from the original face image set, wherein the selected face images form the face test set.
3. The method for selecting the number of face cluster samples according to claim 1, wherein the step S1 further comprises:
constructing a first face training set, wherein the number of face images of the first face training set is N1;
constructing a second face training set, wherein the number of face images of the second face training set is N2;
constructing a third face training set, wherein the number of face images of the third face training set is N3;
constructing a fourth face training set, wherein the number of the face images of the fourth face training set is N4;
wherein, N1, N2, N3, N4, N1 and 10, N2 is less than or equal to 10, and N4 is more than or equal to 3.
4. The method for selecting the number of face cluster samples according to claim 3, wherein the step S2 comprises:
performing convolution and feature extraction on each face image in the first face training set according to a convolution neural network model to generate a first feature vector group corresponding to the first face training set, and performing K-means clustering on the first feature vector group to obtain a first clustering center corresponding to the first face training set;
performing convolution and feature extraction on each face image in the second face training set according to a convolution neural network model to generate a second feature vector group corresponding to the second face training set, and performing K-means clustering on the second feature vector group to obtain a second clustering center corresponding to the second face training set;
performing convolution and feature extraction on each face image in the third face training set according to a convolution neural network model to generate a third feature vector group corresponding to the third face training set, and performing K-means clustering on the third feature vector group to obtain a third clustering center corresponding to the third face training set;
and performing convolution and feature extraction on each face image in the fourth face training set according to a convolutional neural network model to generate a fourth feature vector group corresponding to the fourth face training set, and performing K-means clustering on the fourth feature vector group to obtain a fourth clustering center corresponding to the fourth face training set.
5. The method for selecting the number of face cluster samples according to claim 4, wherein the step S3 comprises:
the calculation formula of the mean value of the cosine distances is shown as formula 1:
$$\mathrm{mean} = \frac{1}{n}\sum_{i=1}^{n} d_i$$
wherein mean is the mean value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the cluster center and the feature vector of the i-th face image in the face test set.
6. The method for selecting the number of face cluster samples according to claim 5, wherein the step S3 further comprises:
the calculation formula of the root mean square value of the cosine distance is shown as formula 2:
Figure FDA0002047474010000022
wherein var is the root mean square value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the cluster center and the feature vector of the i-th face image in the face test set.
7. The method for selecting the number of face cluster samples according to claim 6, wherein the step S3 further comprises:
calculating the cosine distance between the first cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to the first cluster center according to the formula 1 and the formula 2;
calculating the cosine distance between the second cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to the second cluster center according to the formula 1 and the formula 2;
calculating the cosine distance between the third cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to the third cluster center according to the formula 1 and the formula 2;
and calculating the cosine distance between the fourth cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to the fourth cluster center according to the formula 1 and the formula 2.
8. The method for selecting the number of face cluster samples according to claim 7, wherein the step S3 further comprises:
respectively calculating the cosine distance between the feature vector of each face image in the fourth face training set and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to each face image in the fourth face training set according to the formula 1 and the formula 2.
9. The method for selecting the number of face cluster samples according to claim 8, wherein the step S4 comprises:
the number of face cluster samples is in the range [3, 10].
10. A system for selecting a number of face cluster samples, the system comprising:
the training set module is used for constructing a face test set and a plurality of face training sets, wherein the number of face images in each face training set is different;
the clustering module is used for clustering the face training sets to obtain a plurality of corresponding clustering centers;
the computing module is used for computing the cosine distance between each cluster center and the feature vector of each face image in the face test set, and acquiring the mean value and root mean square value of the cosine distance corresponding to each cluster center;
and the evaluation module is used for acquiring the number range of the face clustering samples according to the cosine distance mean value and the root mean square value corresponding to each clustering center.
CN201910363240.0A 2019-04-30 2019-04-30 Face clustering sample number selection method and system Active CN111652260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910363240.0A CN111652260B (en) 2019-04-30 2019-04-30 Face clustering sample number selection method and system

Publications (2)

Publication Number Publication Date
CN111652260A (en) 2020-09-11
CN111652260B (en) 2023-06-20

Family

ID=72346281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910363240.0A Active CN111652260B (en) 2019-04-30 2019-04-30 Face clustering sample number selection method and system

Country Status (1)

Country Link
CN (1) CN111652260B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013109625A1 (en) * 2012-01-17 2013-07-25 Alibaba Group Holding Limited Image index generation based on similarities of image features
CN105512620A (en) * 2015-11-30 2016-04-20 北京天诚盛业科技有限公司 Convolutional neural network training method and apparatus for face recognition
CN106250821A (en) * 2016-07-20 2016-12-21 南京邮电大学 The face identification method that a kind of cluster is classified again
CN106845421A (en) * 2017-01-22 2017-06-13 北京飞搜科技有限公司 Face characteristic recognition methods and system based on multi-region feature and metric learning
WO2019011093A1 (en) * 2017-07-12 2019-01-17 腾讯科技(深圳)有限公司 Machine learning model training method and apparatus, and facial expression image classification method and apparatus
CN108549883A (en) * 2018-08-06 2018-09-18 国网浙江省电力有限公司 A kind of face recognition methods again

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李振东; 钟勇; 张博言; 曹冬平: "Massive face image retrieval based on deep feature clustering" *
黎明; 吴陈: "Image feature extraction based on an improved clustering algorithm" *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101238A (en) * 2020-09-17 2020-12-18 浙江商汤科技开发有限公司 Clustering method and device, electronic equipment and storage medium
CN113052079A (en) * 2021-03-26 2021-06-29 重庆紫光华山智安科技有限公司 Regional passenger flow statistical method, system, equipment and medium based on face clustering
CN116541726A (en) * 2023-07-06 2023-08-04 中国科学院空天信息创新研究院 Sample size determination method, device and equipment for vegetation coverage estimation
CN116541726B (en) * 2023-07-06 2023-09-19 中国科学院空天信息创新研究院 Sample size determination method, device and equipment for vegetation coverage estimation

Also Published As

Publication number Publication date
CN111652260B (en) 2023-06-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant