CN111652260B - Face clustering sample number selection method and system - Google Patents
- Publication number
- CN111652260B (application CN201910363240.0A)
- Authority
- CN
- China
- Prior art keywords
- face
- clustering
- training set
- feature vector
- cosine distance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to the technical field of face recognition and discloses a method for selecting the number of face clustering samples, which comprises the following steps: constructing a face test set and a plurality of face training sets, wherein the number of face images of each face training set is different; clustering the face training sets to obtain a plurality of corresponding clustering centers; calculating the cosine distance between each clustering center and the feature vector of each face image in the face test set, and obtaining the mean value and the root mean square value of the cosine distances corresponding to each clustering center; and acquiring the range of the number of face clustering samples according to the cosine distance mean value and the root mean square value corresponding to each clustering center. Correspondingly, the invention also discloses a system for selecting the number of face clustering samples. The invention thus provides a way to choose the number of face clustering samples that ensures a good clustering effect.
Description
Technical Field
The invention relates to the technical field of face recognition, in particular to a method and a system for selecting the number of face clustering samples.
Background
Face recognition is a biometric technology that identifies a person based on facial feature information: a camera collects images or video streams containing faces, automatically detects and tracks the faces in the images, and then recognizes the detected faces. In a face recognition product, several face pictures must be added to register a face model. The registered face model is generally a feature vector, and the problem arises that one person corresponds to several face feature vectors, so the face feature vectors need to be clustered so that one person corresponds to a unique feature vector. The goal of a face clustering algorithm is to find, by clustering, the cluster center of the feature vectors extracted from several photos of the same person, such that the sum of squared distances from the cluster center to the respective images is minimized. The patent application with publication number CN 108875778A discloses a face clustering method comprising the following steps: determining a clustering mode based on the number of images to be clustered, wherein the clustering mode describes how many images are acquired from the images to be clustered for each round of face clustering; and acquiring, based on the determined clustering mode, the corresponding number of images from the images to be clustered each time and performing face clustering until all images to be clustered have been processed.
Generally, a clustering algorithm produces a better clustering effect when more images are available, but a user provides only a small number of photos for clustering at the time of face registration. The above patent application provides a technical solution for face clustering, but it does not provide a method for selecting the number of face clustering samples.
Therefore, how to select and evaluate the number of face clustering samples while ensuring a good clustering effect has become a technical problem that needs to be solved.
Disclosure of Invention
The invention aims to provide a method and a system for selecting the number of face clustering samples that can ensure a good clustering effect.
In order to achieve the above object, the present invention provides a method for selecting the number of face clustering samples, the method comprising: constructing a face test set and a plurality of face training sets, wherein the number of face images of each face training set is different; clustering the face training sets to obtain a plurality of corresponding clustering centers; calculating the cosine distance between each clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to each clustering center; and acquiring the range of the number of face clustering samples according to the cosine distance mean value and the root mean square value corresponding to each clustering center. This technical scheme uses the mean value and the root mean square value of the cosine distances between a clustering center and the test-set images as indexes for evaluating the number of face clustering samples.
Preferably, the step S1 includes: constructing an original face image set containing face images of a plurality of persons, and carrying out face detection and cropping on all face images in the original face image set; and selecting a preset number of face images of the same person from the original face image set, wherein the selected face images form the face test set.
Preferably, the step S1 includes: constructing a first face training set, wherein the number of face images of the first face training set is N1; constructing a second face training set, wherein the number of face images of the second face training set is N2; constructing a third face training set, wherein the number of face images of the third face training set is N3; constructing a fourth face training set, wherein the number of face images of the fourth face training set is N4; wherein N1 > N2 > N3 > N4, N1 > 10, N2 ≤ 10, and N4 ≥ 3.
Preferably, the step S2 includes: carrying out convolution and feature extraction on each face image in the first face training set according to a convolutional neural network model, generating a first feature vector group corresponding to the first face training set, and carrying out K-means clustering on the first feature vector group to obtain a first clustering center corresponding to the first face training set; carrying out convolution and feature extraction on each face image in the second face training set according to the convolutional neural network model, generating a second feature vector group corresponding to the second face training set, and carrying out K-means clustering on the second feature vector group to obtain a second clustering center corresponding to the second face training set; carrying out convolution and feature extraction on each face image in the third face training set according to the convolutional neural network model, generating a third feature vector group corresponding to the third face training set, and carrying out K-means clustering on the third feature vector group to obtain a third clustering center corresponding to the third face training set; and carrying out convolution and feature extraction on each face image in the fourth face training set according to the convolutional neural network model, generating a fourth feature vector group corresponding to the fourth face training set, and carrying out K-means clustering on the fourth feature vector group to obtain a fourth clustering center corresponding to the fourth face training set.
Preferably, the step S3 includes: the calculation formula of the mean value of the cosine distances is formula 1:

mean = (1/n) · Σ_{i=1}^{n} d_i    (formula 1)

wherein mean is the mean value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the clustering center and the feature vector of the i-th face image in the face test set.
Preferably, the step S3 further includes: the root mean square value of the cosine distances is calculated as formula 2:

var = sqrt( (1/n) · Σ_{i=1}^{n} d_i^2 )    (formula 2)

wherein var is the root mean square value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the clustering center and the feature vector of the i-th face image in the face test set.
Preferably, the step S3 further includes: calculating the cosine distance between the first clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the first clustering center according to the formulas 1 and 2; calculating the cosine distance between the second clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the second clustering center according to the formulas 1 and 2; calculating the cosine distance between the third clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the third clustering center according to the formulas 1 and 2; and calculating the cosine distance between the fourth clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the fourth clustering center according to the formulas 1 and 2.
Preferably, the step S3 further includes: calculating, respectively, the cosine distance between the feature vector of each face image in the fourth face training set and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to each face image in the fourth face training set according to the formulas 1 and 2.
Preferably, the step S4 includes: the number of face clustering samples lies in the range [3, 10]. This technical scheme provides a range for the number of face clustering samples that achieves a good clustering effect and guides how many face photos a user needs to provide at face registration.
In order to achieve the above object, the present invention also provides a system for selecting the number of face clustering samples, the system comprising: a training set module, used for constructing a face test set and a plurality of face training sets, wherein the number of face images of each face training set is different; a clustering module, used for clustering the face training sets to obtain a plurality of corresponding clustering centers; a computing module, used for computing the cosine distance between each clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to each clustering center; and an evaluation module, used for acquiring the range of the number of face clustering samples according to the cosine distance mean value and the root mean square value corresponding to each clustering center. This technical scheme uses the mean value and the root mean square value of the cosine distances between a clustering center and the test-set images as indexes for evaluating the number of face clustering samples.
Compared with the prior art, the method and the system for selecting the number of face clustering samples have the following beneficial effects: the mean value and the root mean square value of the cosine distances between a clustering center and the test-set images are used as indexes for evaluating the number of face clustering samples; a range for the number of face clustering samples is given that achieves a good clustering effect and guides how many face photos a user needs to provide at face registration; and the scheme fits the scenario in which a user provides only a small number of images for face clustering at registration, so it has high practicability and improves the user experience.
Drawings
Fig. 1 is a flow chart of a method for selecting a number of face clustering samples according to an embodiment of the present invention.
Fig. 2 is a block diagram showing the components of a system for selecting the number of face clustering samples in one embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the specific embodiments shown in the drawings. In the drawings, identical structural elements are denoted by the same reference numerals, and components having a similar structure or function are denoted by similar reference numerals. The dimensions and thicknesses of the components shown in the drawings are drawn arbitrarily, and the invention does not limit the dimensions or thicknesses of the components. For clarity of illustration, the thickness of some components is exaggerated in places in the drawings.
In one embodiment of the present invention as shown in fig. 1, the present invention provides a method for selecting a number of face clustering samples, the method comprising:
S1, constructing a face test set and constructing a plurality of face training sets, wherein the number of face images of each face training set is different;
S2, clustering the face training sets to obtain a plurality of corresponding clustering centers;
S3, calculating cosine distances between each clustering center and the feature vector of each face image in the face test set, and obtaining the mean value and the root mean square value of the cosine distances corresponding to each clustering center;
S4, obtaining the range of the number of face clustering samples according to the cosine distance mean value and the root mean square value corresponding to each clustering center.
The step S1 is as follows: a face test set is constructed, and a plurality of face training sets are constructed, wherein the number of face images of each face training set is different. Specifically, an original face image set containing face images of a plurality of persons is constructed, and face detection and cropping are performed on all face images in the original face image set to produce face images of a consistent standard size. A preset number of face images of the same person are then selected from the original face image set, and the selected face images form the face test set. According to a specific embodiment of the present invention, the number of samples of each face in the original face image set is greater than 112, and 96 face images of the same person are selected from the original face image set as the face test set.
According to an embodiment of the present invention, the step S1 further includes: constructing a first face training set, wherein the number of face images of the first face training set is N1; constructing a second face training set, wherein the number of face images of the second face training set is N2; constructing a third face training set, wherein the number of face images of the third face training set is N3; and constructing a fourth face training set, wherein the number of face images of the fourth face training set is N4; wherein N1 > N2 > N3 > N4, N1 > 10, N2 ≤ 10, and N4 ≥ 3. According to a preferred embodiment of the present invention, N1 is set to 16, N2 to 10, N3 to 5, and N4 to 3. Sixteen face images belonging to the same person are selected from the original face image set and form the first face training set. Similarly, 10 face images of the same person are selected from the original face image set and form the second face training set, 5 face images of the same person form the third face training set, and 3 face images of the same person form the fourth face training set.
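As an illustration of this embodiment of step S1, the sketch below splits one person's cropped face images into a 96-image face test set and four face training sets of sizes 16, 10, 5 and 3. It is a minimal sketch under assumptions of its own: the directory layout, the file extension and the function name are not specified by the invention.

```python
import random
from pathlib import Path

def build_sets(person_dir, test_size=96, train_sizes=(16, 10, 5, 3), seed=0):
    """Split one person's cropped face images into a face test set and
    several face training sets of different sizes (illustrative sketch)."""
    images = sorted(Path(person_dir).glob("*.jpg"))
    # the embodiment assumes more than 112 samples per person (96 + 16)
    assert len(images) > 112, "not enough face images for this person"
    rng = random.Random(seed)
    rng.shuffle(images)
    test_set = images[:test_size]                # 96 images -> face test set
    remaining = images[test_size:]
    training_sets = [remaining[:n] for n in train_sizes]  # N1=16, N2=10, N3=5, N4=3
    return test_set, training_sets
```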
The step S2 is as follows: the face training sets are clustered to obtain the plurality of corresponding clustering centers. Face features are extracted from each face training set, and the extracted features are clustered to generate a clustering center. According to an embodiment of the present invention, the step S2 includes: carrying out convolution and feature extraction on each face image in the first face training set according to a convolutional neural network model, obtaining a feature vector for each face image, and generating a first feature vector group corresponding to the first face training set; and carrying out K-means clustering on the first feature vector group to obtain a first clustering center corresponding to the first face training set. Likewise, convolution and feature extraction are carried out on each face image in the second face training set according to the convolutional neural network model to obtain a feature vector for each face image and generate a second feature vector group corresponding to the second face training set, and K-means clustering is carried out on the second feature vector group to obtain a second clustering center corresponding to the second face training set. Convolution and feature extraction are carried out on each face image in the third face training set according to the convolutional neural network model to obtain a feature vector for each face image and generate a third feature vector group corresponding to the third face training set, and K-means clustering is carried out on the third feature vector group to obtain a third clustering center corresponding to the third face training set. Convolution and feature extraction are carried out on each face image in the fourth face training set according to the convolutional neural network model to obtain a feature vector for each face image and generate a fourth feature vector group corresponding to the fourth face training set, and K-means clustering is carried out on the fourth feature vector group to obtain a fourth clustering center corresponding to the fourth face training set.
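A minimal sketch of the clustering in step S2 is given below. The invention does not fix the convolutional neural network architecture, so the feature extractor is left as an assumed `embed` function, and scikit-learn's KMeans with a single cluster stands in for the K-means step over one person's feature vector group.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_center(feature_vectors, seed=0):
    """K-means over one training set's feature vector group; with a single
    person per set, one cluster center per training set is returned."""
    X = np.asarray(feature_vectors, dtype=np.float64)   # shape (num_images, dim)
    km = KMeans(n_clusters=1, n_init=10, random_state=seed).fit(X)
    return km.cluster_centers_[0]

# usage sketch, `embed` being the assumed CNN feature extractor:
# centers = [cluster_center([embed(img) for img in ts]) for ts in training_sets]
```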
The step S3 is as follows: the cosine distance between each clustering center and the feature vector of each face image in the face test set is calculated, and the mean value and the root mean square value of the cosine distances corresponding to each clustering center are obtained. According to an embodiment of the present invention, the step S3 includes: the calculation formula of the mean value of the cosine distances is formula 1:

mean = (1/n) · Σ_{i=1}^{n} d_i    (formula 1)

wherein mean is the mean value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the clustering center and the feature vector of the i-th face image in the face test set;
according to an embodiment of the present invention, the step S3 further includes: the root mean square value of the cosine distance is calculated as formula 2:
wherein var is root mean square value of cosine distance, n is number of face images in the face test set, and d i The cosine distance between the clustering center and the feature vector of the ith face image in the face test set is obtained.
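The two formulas can be written out in code as below. Because the original formula images are not reproduced in this text, two reading assumptions are made explicit: the cosine distance is taken as 1 minus the cosine similarity, and the root mean square value is the literal root mean square of the distances d_i.

```python
import numpy as np

def cosine_distance(u, v):
    """Cosine distance taken as 1 - cosine similarity (assumed convention)."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def distance_statistics(center, test_features):
    """Mean (formula 1) and root mean square value (formula 2) of the cosine
    distances between one clustering center and every test-set feature vector."""
    d = np.array([cosine_distance(center, f) for f in test_features])
    mean = d.mean()                   # formula 1: (1/n) * sum(d_i)
    var = np.sqrt(np.mean(d ** 2))    # formula 2 (assumed RMS form): sqrt((1/n) * sum(d_i^2))
    return mean, var
```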
According to an embodiment of the present invention, the step S3 further includes: calculating the cosine distance between the first clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the first clustering center according to the formulas 1 and 2; calculating the cosine distance between the second clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the second clustering center according to the formulas 1 and 2; calculating the cosine distance between the third clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the third clustering center according to the formulas 1 and 2; and calculating the cosine distance between the fourth clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the fourth clustering center according to the formulas 1 and 2.
According to an embodiment of the present invention, the step S3 further includes: calculating, respectively, the cosine distance between the feature vector of each face image in the fourth face training set and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to each face image in the fourth face training set according to the formulas 1 and 2.
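The evaluation described in the two paragraphs above can be sketched as follows, reusing `cluster_center` and `distance_statistics` from the earlier sketches; the function name is an assumption for illustration.

```python
def evaluate(centers, fourth_train_features, test_features):
    """Per-center statistics (clustered case) and per-image statistics for
    the fourth, smallest training set (unclustered baseline)."""
    per_center = [distance_statistics(c, test_features) for c in centers]
    per_single_image = [distance_statistics(f, test_features)
                        for f in fourth_train_features]
    return per_center, per_single_image
```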
The step S4 is as follows: the range of the number of face clustering samples is obtained according to the cosine distance mean value and the root mean square value corresponding to each clustering center.
According to an embodiment of the present invention, Table 1 shows the maximum cosine distance, the minimum cosine distance, the mean value of the cosine distances, and the root mean square value of the cosine distances corresponding to each clustering center. N1 is set to 16, N2 to 10, N3 to 5, and N4 to 3. The first clustering center corresponds to the first face training set, whose number of face images is 16; the second clustering center corresponds to the second face training set, whose number of face images is 10; the third clustering center corresponds to the third face training set, whose number of face images is 5; and the fourth clustering center corresponds to the fourth face training set, whose number of face images is 3.
TABLE 1
As can be seen from Table 1, the larger the number of clustering samples, the smaller the mean value and the root mean square value of the corresponding cosine distances, and the better the clustering effect. The mean value and the root mean square value of the cosine distances between a clustering center and the face images in the face test set are therefore used as the evaluation indexes for the number of face clustering samples: the more clustering samples there are, the smaller the cosine distance mean value and root mean square value, and the better the clustering effect. The table also shows that once the number of face clustering samples exceeds 10, the rate at which the cosine distance mean value and root mean square value decrease drops significantly; therefore, the upper limit of the number of samples a user provides at face registration is set to 10, that is, the upper limit of the range of the number of face clustering samples is 10.
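The paragraph above gives only a qualitative criterion (the decrease slows markedly once the sample number exceeds 10). The sketch below shows one way such a knee could be read off (sample count, mean cosine distance) pairs; the 0.5 slowdown threshold and the function name are assumptions made for illustration and do not come from the invention.

```python
def pick_upper_limit(sample_counts, mean_distances, slowdown_ratio=0.5):
    """Return the sample count after which the per-sample improvement in the
    mean cosine distance collapses (illustrative knee detection only)."""
    pairs = sorted(zip(sample_counts, mean_distances))
    # improvement in mean cosine distance per additional sample, per segment
    gains = [((m0 - m1) / (n1 - n0), n1)
             for (n0, m0), (n1, m1) in zip(pairs, pairs[1:])]
    for (g_prev, n_knee), (g_next, _) in zip(gains, gains[1:]):
        if g_prev > 0 and g_next < slowdown_ratio * g_prev:
            return n_knee            # e.g. 10 when the 10 -> 16 gain collapses
    return pairs[-1][0]              # no clear knee: fall back to the largest count
```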
According to an embodiment of the present invention, Table 2 shows the maximum cosine distance, the minimum cosine distance, the mean value of the cosine distances, and the root mean square value of the cosine distances corresponding to each face image in the fourth face training set.
TABLE 2
As can be seen from Table 2, clustering even a small number of face samples gives a better effect than performing no clustering at all. With only 3 face clustering samples, the clustering effect is already clearly better than without clustering, so the lower limit of the range of the number of face clustering samples is 3. The number of face clustering samples therefore lies in the range [3, 10], and within this range, the larger the number of face clustering samples, the better the clustering.
According to the above technical scheme, the mean value and the root mean square value of the cosine distances between a clustering center and the test-set images are used as the indexes for evaluating the number of face clustering samples, and a range for the number of face clustering samples is given that achieves a good clustering effect and guides how many face photos a user needs to provide at face registration. The scheme fits the scenario in which a user provides only a small number of images for face clustering at registration, so it has high practicability and improves the user experience.
In another embodiment, as shown in fig. 2, the present invention further provides a system for selecting a number of face clustering samples, where the system includes:
the training set module 20 is configured to construct a face test set, and construct a plurality of face training sets, each of which has a different number of face images;
a clustering module 21, configured to cluster the plurality of face training sets to obtain a plurality of corresponding clustering centers;
the calculating module 22 is configured to calculate cosine distances between each cluster center and feature vectors of each face image in the face test set, and obtain an average value and a root mean square value of the cosine distances corresponding to each cluster center;
and the evaluation module 23 is configured to obtain a number range of face clustering samples according to the cosine distance average value and the root mean square value corresponding to each clustering center.
The training set module 20 is configured to construct a face test set, and construct a plurality of face training sets, each of which has a different number of face images. According to a specific embodiment of the invention, the training set module constructs a first face training set, wherein the number of face images of the first face training set is N1; constructing a second face training set, wherein the number of face images of the second face training set is N2; constructing a third face training set, wherein the number of face images of the third face training set is N3; constructing a fourth face training set, wherein the number of face images of the fourth face training set is N4; wherein N1 > N2 > N3 > N4, N1 > 10, N2 ≤ 10, and N4 ≥ 3.
The clustering module 21 is configured to cluster the face training sets to obtain a plurality of corresponding clustering centers. The face feature extraction is carried out on the face training set, and the extracted face feature is clustered to generate a clustering center. According to a specific embodiment of the present invention, the clustering module clusters the four face training sets respectively to generate respective corresponding clustering centers, that is, a first clustering center corresponding to the first face training set, a second clustering center corresponding to the second face training set, a third clustering center corresponding to the third face training set, and a fourth clustering center corresponding to the fourth face training set.
The calculating module 22 is configured to calculate cosine distances between each cluster center and feature vectors of each face image in the face test set, and obtain an average value and a root mean square value of the cosine distances corresponding to each cluster center. According to a specific embodiment of the present invention, the calculating module calculates cosine distances between each cluster center and a feature vector of each face image in the face test set, and obtains an average value and a root mean square value of the cosine distances corresponding to each cluster center according to equations 1 and 2 in the above method embodiment. According to a specific embodiment of the present invention, the calculation module calculates cosine distances between the feature vector of each face image in the fourth face training set and the feature vector of each face image in the face test set, and obtains an average value and a root mean square value of the cosine distances corresponding to each face image in the fourth face training set according to the formulas 1 and 2.
The evaluation module 23 is configured to obtain a number range of face clustering samples according to the cosine distance average value and the root mean square value corresponding to each clustering center. According to the method, the average value and the root mean square of the cosine distances between the clustering center and the face images of the face test set are used as the evaluation indexes of the number of face clustering samples, and the more the clustering samples are, the smaller the average value and the root mean square value of the cosine distances are, so that the clustering effect is better. The number of face clustering samples is in the range of [3,10], and the more the number of face clustering samples is in the number range, the better the clustering is.
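As a rough illustration of how the four modules cooperate, the sketch below wires the helper functions from the earlier sketches into one object; the class name, method name and `embed_fn` parameter are assumptions made for the example, not part of the invention.

```python
class FaceClusterSampleEvaluator:
    """Training-set, clustering, computing and evaluation modules in one
    object (illustrative sketch reusing the helpers defined above)."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn                      # assumed CNN feature extractor

    def run(self, person_dir):
        # training set module: build the test set and the four training sets
        test_set, training_sets = build_sets(person_dir)
        test_features = [self.embed_fn(p) for p in test_set]
        # clustering module: one cluster center per training set
        centers = [cluster_center([self.embed_fn(p) for p in ts])
                   for ts in training_sets]
        # computing module: mean and RMS cosine distance per cluster center
        stats = [distance_statistics(c, test_features) for c in centers]
        # the evaluation module would read the sample-number range off these statistics
        return stats
```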
According to the technical scheme, the average value and the root mean square value of the cosine distance between the clustering center and the test set image are used as indexes for evaluating the number of face clustering samples, the number range of the face clustering samples is provided, a good clustering effect can be obtained, and the number of the face photos required to be provided by a user during face registration is guided.
While the invention has been described in detail in the foregoing drawings and embodiments, such illustration and description are to be considered illustrative or exemplary and not restrictive; the invention is not limited to the disclosed embodiments. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. Any reference signs in the claims shall not be construed as limiting the scope. Other variations to the above-described embodiments can be understood and effected by those skilled in the art in light of the figures, the description, and the appended claims, without departing from the scope of the invention as defined in the claims.
Claims (7)
1. A method for selecting a number of face clustering samples, the method comprising the steps of:
S1, constructing a face test set and constructing a plurality of face training sets, wherein the number of face images of each face training set is different;
S2, clustering the face training sets to obtain a plurality of corresponding clustering centers;
S3, calculating cosine distances between each clustering center and the feature vector of each face image in the face test set, and obtaining the mean value and the root mean square value of the cosine distances corresponding to each clustering center;
S4, acquiring the range of the number of face clustering samples according to the cosine distance mean value and the root mean square value corresponding to each clustering center;
wherein, the step S1 includes:
constructing an original face image set with face images of a plurality of people, and carrying out face detection and cropping on all face images in the original face image set;
selecting a preset number of face images of the same person from the original face image set, wherein the selected face images form the face test set;
constructing a first face training set, wherein the number of face images of the first face training set is N1;
constructing a second face training set, wherein the number of face images of the second face training set is N2;
constructing a third face training set, wherein the number of face images of the third face training set is N3;
constructing a fourth face training set, wherein the number of face images of the fourth face training set is N4;
wherein N1 > N2 > N3 > N4, N1 > 10, N2 ≤ 10, and N4 ≥ 3;
the step S2 includes:
carrying out convolution and feature extraction on each face image in the first face training set according to a convolutional neural network model, generating a first feature vector group corresponding to the first face training set, and carrying out K-means clustering on the first feature vector group to obtain a first clustering center corresponding to the first face training set;
carrying out convolution and feature extraction on each face image in the second face training set according to the convolutional neural network model, generating a second feature vector group corresponding to the second face training set, and carrying out K-means clustering on the second feature vector group to obtain a second clustering center corresponding to the second face training set;
carrying out convolution and feature extraction on each face image in the third face training set according to the convolutional neural network model, generating a third feature vector group corresponding to the third face training set, and carrying out K-means clustering on the third feature vector group to obtain a third clustering center corresponding to the third face training set;
and carrying out convolution and feature extraction on each face image in the fourth face training set according to the convolutional neural network model, generating a fourth feature vector group corresponding to the fourth face training set, and carrying out K-means clustering on the fourth feature vector group to obtain a fourth clustering center corresponding to the fourth face training set.
2. The method for selecting the number of face clustering samples according to claim 1, wherein the step S3 includes:
the calculation formula of the mean value of the cosine distances is formula 1:

mean = (1/n) · Σ_{i=1}^{n} d_i    (formula 1)

wherein mean is the mean value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the clustering center and the feature vector of the i-th face image in the face test set.
3. The method for selecting the number of face clustering samples according to claim 2, wherein the step S3 further comprises:
the root mean square value of the cosine distances is calculated as formula 2:

var = sqrt( (1/n) · Σ_{i=1}^{n} d_i^2 )    (formula 2)

wherein var is the root mean square value of the cosine distances, n is the number of face images in the face test set, and d_i is the cosine distance between the clustering center and the feature vector of the i-th face image in the face test set.
4. The method for selecting a number of face clustering samples according to claim 3, wherein the step S3 further comprises:
calculating the cosine distance between the first clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distance corresponding to the first clustering center according to the formulas 1 and 2;
calculating the cosine distance between the second clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the second clustering center according to the formulas 1 and 2;
calculating the cosine distance between the third clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distances corresponding to the third clustering center according to the formulas 1 and 2;
and calculating the cosine distance between the fourth clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distance corresponding to the fourth clustering center according to the formulas 1 and 2.
5. The method for selecting the number of face clustering samples according to claim 4, wherein the step S3 further comprises:
and respectively calculating the cosine distance between the feature vector of each face image in the fourth face training set and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distance corresponding to each face image in the fourth face training set according to the formulas 1 and 2.
6. The method for selecting the number of face clustering samples according to claim 5, wherein the step S4 includes:
the number of face clustering samples is in the range [3, 10].
7. A system for selecting a number of face clustering samples, wherein the system performs a method for selecting a number of face clustering samples according to any one of claims 1 to 6, the system comprising: the training set module is used for constructing a face test set and a plurality of face training sets, and the number of face images of each face training set is different;
the clustering module is used for clustering the face training sets to obtain a plurality of corresponding clustering centers;
the computing module is used for computing the cosine distance between each clustering center and the feature vector of each face image in the face test set, and acquiring the mean value and the root mean square value of the cosine distance corresponding to each clustering center;
an evaluation module, configured to obtain the range of the number of face clustering samples according to the cosine distance mean value and the root mean square value corresponding to each clustering center.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910363240.0A CN111652260B (en) | 2019-04-30 | 2019-04-30 | Face clustering sample number selection method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910363240.0A CN111652260B (en) | 2019-04-30 | 2019-04-30 | Face clustering sample number selection method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111652260A CN111652260A (en) | 2020-09-11 |
CN111652260B true CN111652260B (en) | 2023-06-20 |
Family
ID=72346281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910363240.0A Active CN111652260B (en) | 2019-04-30 | 2019-04-30 | Face clustering sample number selection method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111652260B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112101238A (en) * | 2020-09-17 | 2020-12-18 | 浙江商汤科技开发有限公司 | Clustering method and device, electronic equipment and storage medium |
CN113052079B (en) * | 2021-03-26 | 2022-01-21 | 重庆紫光华山智安科技有限公司 | Regional passenger flow statistical method, system, equipment and medium based on face clustering |
CN116541726B (en) * | 2023-07-06 | 2023-09-19 | 中国科学院空天信息创新研究院 | Sample size determination method, device and equipment for vegetation coverage estimation |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013109625A1 (en) * | 2012-01-17 | 2013-07-25 | Alibaba Group Holding Limited | Image index generation based on similarities of image features |
CN105512620A (en) * | 2015-11-30 | 2016-04-20 | 北京天诚盛业科技有限公司 | Convolutional neural network training method and apparatus for face recognition |
CN106250821A (en) * | 2016-07-20 | 2016-12-21 | 南京邮电大学 | The face identification method that a kind of cluster is classified again |
CN106845421A (en) * | 2017-01-22 | 2017-06-13 | 北京飞搜科技有限公司 | Face characteristic recognition methods and system based on multi-region feature and metric learning |
CN108549883A (en) * | 2018-08-06 | 2018-09-18 | 国网浙江省电力有限公司 | A kind of face recognition methods again |
WO2019011093A1 (en) * | 2017-07-12 | 2019-01-17 | 腾讯科技(深圳)有限公司 | Machine learning model training method and apparatus, and facial expression image classification method and apparatus |
- 2019-04-30: application CN201910363240.0A filed in China (CN); granted as CN111652260B, status Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013109625A1 (en) * | 2012-01-17 | 2013-07-25 | Alibaba Group Holding Limited | Image index generation based on similarities of image features |
CN105512620A (en) * | 2015-11-30 | 2016-04-20 | 北京天诚盛业科技有限公司 | Convolutional neural network training method and apparatus for face recognition |
CN106250821A (en) * | 2016-07-20 | 2016-12-21 | 南京邮电大学 | The face identification method that a kind of cluster is classified again |
CN106845421A (en) * | 2017-01-22 | 2017-06-13 | 北京飞搜科技有限公司 | Face characteristic recognition methods and system based on multi-region feature and metric learning |
WO2019011093A1 (en) * | 2017-07-12 | 2019-01-17 | 腾讯科技(深圳)有限公司 | Machine learning model training method and apparatus, and facial expression image classification method and apparatus |
CN108549883A (en) * | 2018-08-06 | 2018-09-18 | 国网浙江省电力有限公司 | A kind of face recognition methods again |
Non-Patent Citations (2)
Title |
---|
Li Zhendong; Zhong Yong; Zhang Boyan; Cao Dongping. Massive face image retrieval based on deep feature clustering. Journal of Harbin Institute of Technology, 2018, (11), full text. *
Li Ming; Wu Chen. Image feature extraction based on an improved clustering algorithm. Information & Communications, 2017, (03), full text. *
Also Published As
Publication number | Publication date |
---|---|
CN111652260A (en) | 2020-09-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108229674B (en) | Training method and device of neural network for clustering, and clustering method and device | |
CN108304435B (en) | Information recommendation method and device, computer equipment and storage medium | |
CN111652260B (en) | Face clustering sample number selection method and system | |
CN108776768A (en) | Image recognition method and device | |
CN107886507B (en) | A kind of salient region detecting method based on image background and spatial position | |
CN104933414B (en) | A kind of living body faces detection method based on WLD-TOP | |
CN107909104A (en) | The face cluster method, apparatus and storage medium of a kind of picture | |
CN105631404B (en) | The method and device that photo is clustered | |
US20200154392A1 (en) | Generating wireless network access point models using clustering techniques | |
CN108960260B (en) | Classification model generation method, medical image classification method and medical image classification device | |
CN109214428A (en) | Image partition method, device, computer equipment and computer storage medium | |
CN106575280B (en) | System and method for analyzing user-associated images to produce non-user generated labels and utilizing the generated labels | |
CN111814620A (en) | Face image quality evaluation model establishing method, optimization method, medium and device | |
CN103262118A (en) | Attribute value estimation device, attribute value estimation method, program, and recording medium | |
CN103353881B (en) | Method and device for searching application | |
CN103020589B (en) | A kind of single training image per person method | |
CN106897700B (en) | Single-sample face recognition method and system | |
CN108921140A (en) | Pedestrian's recognition methods again | |
CN110751069A (en) | Face living body detection method and device | |
CN104966075B (en) | A kind of face identification method and system differentiating feature based on two dimension | |
CN108960142A (en) | Pedestrian based on global characteristics loss function recognition methods again | |
CN109886239B (en) | Portrait clustering method, device and system | |
CN111860529A (en) | Image preprocessing method, system, device and medium | |
CN113705310A (en) | Feature learning method, target object identification method and corresponding device | |
CN110929583A (en) | High-detection-precision face recognition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||