CN110866136B - Face image stacking method and device, electronic equipment and readable storage medium - Google Patents

Face image stacking method and device, electronic equipment and readable storage medium

Info

Publication number
CN110866136B
CN110866136B (application CN201911105931.7A)
Authority
CN
China
Prior art keywords
heap
pile
processing object
threshold
stacking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911105931.7A
Other languages
Chinese (zh)
Other versions
CN110866136A (en)
Inventor
唐琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4u Beijing Technology Co ltd
Original Assignee
Shanghai Tianli Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Tianli Intelligent Technology Co ltd filed Critical Shanghai Tianli Intelligent Technology Co ltd
Priority to CN201911105931.7A priority Critical patent/CN110866136B/en
Publication of CN110866136A publication Critical patent/CN110866136A/en
Application granted granted Critical
Publication of CN110866136B publication Critical patent/CN110866136B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F 16/583: Information retrieval of still image data; retrieval characterised by metadata automatically derived from the content
    • G06F 16/535: Information retrieval of still image data; querying; filtering based on additional data, e.g. user or group profiles
    • G06F 18/214: Pattern recognition; analysing; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 40/168: Recognition of human faces; feature extraction; face representation
    • G06V 40/172: Recognition of human faces; classification, e.g. identification

Abstract

The invention provides a human face image pile-dividing method and device, electronic equipment and a readable storage medium, wherein the method comprises the following steps: acquiring different pile-in thresholds and pile-out thresholds; selecting a face image to be processed from the first sample set as a processing object; when judging that the processing object can be classified into one heap of the heap splitting result according to the heap classification threshold value, classifying the processing object into the heap, and updating the heap; when judging that a new heap needs to be opened for the processing object according to the heap opening threshold value, opening a new heap, classifying the processing object into the new heap, and updating the heap division result; and selecting another face image to be processed from the first sample set, updating the processing object by using the face image to be processed, and repeating the process until all the face images to be processed in the first sample set are processed. The invention can make the opening and the pile-up of the face image more accurate and reliable.

Description

Face image stacking method and device, electronic equipment and readable storage medium
Technical Field
The invention relates to the field of photo management, in particular to a human face image stacking method and device, electronic equipment and a readable storage medium.
Background
With the increase of social activities of people, more and more photos are contained in a personal photo album in the smart phone, more and more people are contained in the photo album, and the size, angle, illumination and the like of the face of each photo are changed greatly. In order to facilitate a user to search all photos of a certain person from a large number of photos, the smart phone needs to provide a photo sorting function to automatically stack the photos in the photo album according to the faces of the people.
Stacking of face images is an extension of face comparison. Face comparison takes two faces, judges whether they are similar, and measures the degree of similarity with a single value. The typical face comparison pipeline comprises face detection, face key point detection, face alignment, face feature extraction, and calculation of the cosine distance between the face feature vectors. Generally, after the features of the two faces are extracted, the classic cosine distance serves as the measure of similarity: the closer the value is to 0, the more similar the two faces are; the closer the value is to 1, the more dissimilar they are.
Given vectors a and b, the cosine distance d is defined as:
d(a, b) = 1 − (a · b) / (‖a‖ ‖b‖)
Generally, when face comparison is performed, a single threshold is determined: the two faces are judged similar when the distance is smaller than the threshold and dissimilar when it is larger.
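As a concrete illustration of the comparison just described, the following Python sketch computes the cosine distance between two feature vectors and applies a single threshold; the function names and the threshold value 0.4 are illustrative assumptions, not values given in this description.

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine distance: close to 0 for similar faces, close to 1 for dissimilar ones.
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def single_threshold_similar(a: np.ndarray, b: np.ndarray, threshold: float = 0.4) -> bool:
    # Single-threshold decision: similar iff the distance falls below the threshold.
    return cosine_distance(a, b) < threshold
```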
This method works normally when the environment is controllable. When the size, angle and illumination of the face vary greatly, however, it is difficult to separate similar from dissimilar faces with a single threshold, and a large ambiguous zone remains between the two in which similarity cannot be decided reliably.
Stacking a face image involves two actions: merging into a heap and opening a heap. Merging refers to adding a sample to an existing heap; opening refers to creating a new heap and placing the sample in it. A wrong opening easily splits the samples of one person across several heaps; a wrong merge easily makes a heap impure, so that one heap contains face samples of several people. Therefore, when sorting a face album, the similarity judgment for sample pairs must be accurate and reliable, and the unreliability of a single threshold can cause wrong merges or wrong openings.
Disclosure of Invention
One of the objectives of the present invention is to overcome at least some of the deficiencies in the prior art, and to provide a face image stacking method and apparatus, an electronic device, and a readable storage medium, so that heap opening and stacking are more accurate and reliable.
The technical scheme provided by the invention is as follows:
a method for stacking face images comprises the following steps: acquiring different pile-in thresholds and pile-out thresholds; selecting a face image to be processed from the first sample set as a processing object; when judging that the processing object can be classified into one heap of the heap splitting result according to the heap classification threshold value, classifying the processing object into the heap, and updating the heap; when judging that a new heap needs to be opened for the processing object according to the heap opening threshold value, opening a new heap, classifying the processing object into the new heap, and updating the heap division result; and selecting another face image to be processed from the first sample set, updating the processing object by using the face image to be processed, and repeating the process until all the face images to be processed in the first sample set are processed.
Further, the obtaining different stacking thresholds and opening thresholds includes: acquiring precision recall curves of similar pairs and precision recall curves of dissimilar pairs; and determining a pile-returning threshold according to the precision recall curves of the similar pairs, and determining a pile-opening threshold according to the precision recall curves of the dissimilar pairs.
Further, the acquiring the precision recall curves of the similar pairs and the precision recall curves of the dissimilar pairs includes: acquiring training samples comprising N similar pairs and N dissimilar pairs of face images, and labeling a real value of each sample; calculating the similarity of each sample according to the feature vectors of the two faces in each sample; traversing the thresholds within the threshold range, and calculating the precision value and recall value of the similar pair and the precision value and recall value of the dissimilar pair corresponding to each threshold; obtaining the precision recall curve of the similar pairs according to the precision values and recall values of the similar pairs corresponding to all the thresholds; and obtaining the precision recall curve of the dissimilar pairs according to the precision values and recall values of the dissimilar pairs corresponding to all the thresholds.
Further, the calculating the precision value and the recall value of the similar pair and the precision value and the recall value of the dissimilar pair corresponding to each threshold includes: predicting whether the samples are similar according to the threshold and the similarity of each sample to obtain a predicted value of the sample; and calculating the precision value and the recall value of the similar pair and the precision value and the recall value of the dissimilar pair corresponding to the threshold value according to the real values and the predicted values of all the samples.
Further, when it is determined that the processing object can be classified into one heap of the heap split result according to the heap classification threshold, classifying the processing object into the heap, and updating the heap, includes: when a heap which can form a similar pair with the processing object exists in the heap splitting result, the processing object is classified into a heap which forms a similar pair with the processing object, and the central vector of the heap which forms a similar pair with the processing object is updated;
wherein the step of judging whether the processing object and a heap form a similar pair is as follows: acquiring a feature vector of the processing object; calculating the similarity between the feature vector and the central vector of the heap; and, since a smaller similarity value indicates higher similarity, when the similarity is smaller than the stacking threshold, the processing object and the heap form a similar pair.
Further, when it is determined according to the heap opening threshold that a new heap needs to be opened for the processing object, opening a new heap and sorting the processing object into the new heap includes: acquiring a feature vector of the processing object; calculating the similarity between the feature vector and the central vector of each heap; since a larger similarity value indicates higher dissimilarity, when the similarity is greater than the heap opening threshold, the processing object and the heap form a dissimilar pair; and when the processing object forms dissimilar pairs with all heaps, opening a new heap, classifying the processing object into the new heap, and taking the feature vector of the processing object as the central vector of the new heap.
Further, after all the face images to be processed in the first sample set are processed, the method includes: collecting all unsuccessfully piled face images in the first sample set into a second sample set; acquiring a second pile-returning threshold and a second pile-opening threshold; wherein the second stacking threshold requires lower stacking accuracy than the stacking threshold, or the second opening threshold requires lower opening accuracy than the opening threshold; when the second sample set is not empty, selecting a face image to be processed from the second sample set as a second processing object; when judging that the second processing object can be classified into one heap of the heap splitting result according to the second heap classification threshold value, classifying the second processing object into the heap, and updating the heap; when judging that a new heap needs to be opened for the second processing object according to the second heap opening threshold value, opening a new heap, classifying the second processing object into the new heap, and updating the heap division result; and selecting another face image to be processed from the second sample set, updating the second processing object by using the face image to be processed, and repeating the process until all the face images to be processed in the second sample set are processed.
Further, after all the face images to be processed in the second sample set are processed, the method includes: collecting all unsuccessfully stacked face images in the second sample set into a third sample set; when the ratio of the number of face images in the third sample set to the number of face images in the first sample set exceeds a preset ratio, acquiring a third stacking threshold and a third opening threshold; wherein the third stacking threshold requires lower stacking accuracy than the second stacking threshold, and the third opening threshold is equal to the second opening threshold; selecting a face image to be processed from the third sample set as a third processing object; when it is judged according to the third stacking threshold that the third processing object can be classified into one heap of the stacking result, classifying the third processing object into the heap, and updating the heap; when it is judged according to the third opening threshold that a new heap needs to be opened for the third processing object, opening a new heap, classifying the third processing object into the new heap, and updating the stacking result; and selecting another face image to be processed from the third sample set, updating the third processing object with it, and repeating the process until all the face images to be processed in the third sample set are processed.
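The multi-stage processing above can be sketched as a funnel in Python: each stage relaxes the stacking threshold and re-processes only the leftovers of the previous stage, while heaps and center vectors persist across stages. The single-pass logic is inlined so the sketch stands alone; the threshold pairs are illustrative assumptions rather than values from this description, and the preset-ratio check that gates the third stage is omitted for brevity.

```python
import numpy as np

def cos_dist(a, b):
    # Cosine distance between two feature vectors.
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def funnel_stack(features, stages=((0.35, 0.60), (0.45, 0.60), (0.60, 0.60))):
    # stages: (stacking threshold, opening threshold) pairs; later stages are looser.
    heaps, centers = [], []
    pending = list(range(len(features)))
    for stack_th, open_th in stages:
        leftovers = []
        for idx in pending:
            f = features[idx]
            dists = [cos_dist(f, c) for c in centers]
            if dists and min(dists) < stack_th:       # merge into the nearest heap
                best = int(np.argmin(dists))
                heaps[best].append(idx)
                members = np.stack([features[i] for i in heaps[best]])
                centers[best] = members.mean(axis=0)  # refresh the center vector
            elif not dists or min(dists) > open_th:   # dissimilar to every heap
                heaps.append([idx])
                centers.append(np.array(f, dtype=float))
            else:
                leftovers.append(idx)                 # ambiguous: retry next stage
        pending = leftovers
        if not pending:
            break
    return heaps, pending
```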
Further, the stacking result is evaluated in terms of the stacking score, the stacking accuracy, and the stacking recall rate.
Further, calculating the score of the heap includes: acquiring a main identifier, a main identifier sample number and a punishment coefficient of each pile in the pile dividing result; obtaining an ideal score of each pile according to the number of primary identification samples of each pile and the score of each sample of the primary identification; punishing the ideal score of the heap according to the punishment coefficient of the heap to obtain the score of the heap; and obtaining the scores of the piles according to the scores of all the piles.
Further, the obtaining of the primary identifier, the primary identifier sample number, and the punishment coefficient of the heap in the result of the heap splitting includes: counting the sample numbers of different characters in the heap; when the ratio of the number of samples of a person to the total number of samples of the pile is greater than a preset threshold, taking the person as a main identifier of the pile, wherein the number of samples of the main identifier in the pile is the number of samples of the main identifier of the pile; obtaining a purity punishment coefficient of the heap according to the proportion of the main identification sample number of the heap to the total sample number of the heap; counting the frequency of the main marks of the piles in all the piles to obtain the complete penalty coefficient of the piles; and obtaining the punishment coefficient of the heap according to the pure punishment coefficient of the heap and the complete punishment coefficient of the heap.
Further, the accuracy of the stacking is calculated according to the following formula:
P_f = (1/N) · Σ_{i=1..N} ( N_{i-main-id} / Q_i )
wherein P_f is the stacking accuracy, N is the number of heaps, Q_i is the number of samples of the ith heap, i-main-id is the main identifier of the ith heap, and N_{i-main-id} is the main-identifier sample number of the ith heap;
the recall rate of the heap according to the following formula:
Figure GDA0003686697420000052
wherein R is f For recall in the heap, M is the number of real persons in the first sample set, D j The number of face images of the jth person.
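A minimal evaluation sketch follows, assuming a main-identifier ratio of 0.5 and reconstructing the precision and recall aggregation from the variables listed above (per-heap purity averaged over heaps; per-person best single-heap coverage averaged over persons); the stacking score and its penalty coefficients are omitted, and all names are illustrative.

```python
from collections import Counter

def evaluate_stacks(heaps, truth, main_ratio=0.5):
    # heaps: list of lists of sample indices; truth: sample index -> person id.
    per_heap_purity = []
    best_per_person = Counter()
    for heap in heaps:
        counts = Counter(truth[i] for i in heap)
        person, n_main = counts.most_common(1)[0]
        # A heap has a main identifier only if one person dominates it.
        purity = n_main / len(heap)
        per_heap_purity.append(purity if purity > main_ratio else 0.0)
        for p, c in counts.items():
            best_per_person[p] = max(best_per_person[p], c)
    persons = Counter(truth.values())
    p_f = sum(per_heap_purity) / len(heaps)  # stacking accuracy
    r_f = sum(best_per_person[p] / d for p, d in persons.items()) / len(persons)  # recall
    return p_f, r_f
```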
The invention also provides a pile-dividing device of the face image, which comprises: the parameter acquisition module is used for acquiring different pile-in thresholds and pile-out thresholds; the object updating module is used for selecting a face image to be processed from the first sample set as a processing object; the heap returning module is used for returning the processing object into one heap of the heap dividing result and updating the heap when judging that the processing object can be returned into one heap according to the heap returning threshold value; the heap opening module is used for opening a new heap when judging that a new heap needs to be opened for the processing object according to the heap opening threshold value, classifying the processing object into the new heap, and updating the heap splitting result; the object updating module is further configured to select another face image to be processed from the first sample set, update the processing object with the selected face image, and repeat the above process until all the face images to be processed in the first sample set are processed.
The present invention also provides an electronic device comprising: a memory for storing a computer program; and the processor is used for running the computer program to realize the human face image stacking method in any one of the preceding aspects.
The present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of human face image stacking according to any one of the preceding claims.
The human face image pile-dividing method and device, the electronic equipment and the readable storage medium provided by the invention can at least bring the following beneficial effects:
1. the invention is based on bilateral threshold value pile dividing, and because the pile opening threshold value and the pile returning threshold value have high precision, the pile returning and the pile opening are ensured to have high precision, and simultaneously, the same face can be effectively prevented from being divided into a plurality of piles.
2. The invention improves the accuracy and recall of the pile separation by a multi-stage funnel type processing method.
3. The invention provides a pile dividing evaluation method, which is used for evaluating the score of pile dividing, the precision of pile dividing and the recall rate of pile dividing and is beneficial to finding a pile dividing method with ideal pile dividing effect (pure pile dividing and accurate pile opening) and better pile dividing parameters.
Drawings
The above features, technical features, advantages and implementations of a face image stacking method and apparatus, an electronic device, and a readable storage medium will be further described in the following detailed description of preferred embodiments with reference to the accompanying drawings.
FIG. 1 is a flow chart of one embodiment of a method for stacking face images of the present invention;
FIG. 2 is a flow chart of another embodiment of a method for stacking face images of the present invention;
FIG. 3 is a flow chart of another embodiment of a method for stacking face images according to the present invention;
FIG. 4 is a flow chart of another embodiment of a method for stacking face images of the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of a human face image stacking apparatus according to the present invention;
FIG. 6 is a schematic diagram of an architecture of the parameter acquisition module of FIG. 5;
FIG. 7 is a schematic structural diagram of another embodiment of a human face image stacking apparatus according to the present invention;
FIG. 8 is a schematic structural diagram of another embodiment of a human face image stacking apparatus according to the present invention;
FIG. 9 is a schematic structural diagram of another embodiment of a human face image stacking apparatus according to the present invention;
FIG. 10 is a schematic diagram of one configuration of the heap evaluation module of FIG. 9;
fig. 11 is a schematic structural diagram of an embodiment of an electronic device of the present invention.
The reference numbers illustrate:
100: parameter acquisition module; 200: object updating module; 300: stacking module; 400: heap opening module; 310: similar-pair judgment unit; 410: dissimilar-pair judgment unit; 500: second-stage processing module; 600: third-stage processing module; 700: stacking evaluation module.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will be made with reference to the accompanying drawings. It is obvious that the drawings in the following description are only some examples of the invention, and that for a person skilled in the art, without inventive effort, other drawings and embodiments can be derived from them.
For the sake of simplicity, the drawings only schematically show the parts relevant to the present invention and do not represent the actual structure of a product. In addition, to keep the drawings concise and understandable, components having the same structure or function are in some drawings only schematically illustrated or only partly labeled. In this document, "one" covers not only "only one" but also "more than one".
In an embodiment of the present invention, as shown in fig. 1, a method for stacking face images includes:
step S100 acquires different pile-up threshold values and pile-up opening threshold values.
Specifically, to overcome the unreliability of a single threshold, a bilateral threshold is adopted, the two thresholds being different: whether a sample can be merged is judged according to the stacking threshold, and whether a new heap should be opened is judged according to the opening threshold. Because of the bilateral threshold, both opening and stacking have high reliability and accuracy.
The heap threshold and the heap open threshold may be set empirically.
Optionally, in order to obtain more reliable stacking threshold values and stacking threshold values, precision recall curves of similar pairs and precision recall curves of dissimilar pairs are obtained according to the training samples, and then reliable stacking threshold values and stacking threshold values are found through the precision recall curves (ROC curves for short).
A threshold that guarantees high precision with a certain recall is found on the similar-pair ROC curve and taken as the stacking threshold; similarly, a threshold that guarantees high precision with a certain recall is found on the dissimilar-pair ROC curve and taken as the opening threshold. For example, the stacking threshold and the opening threshold are determined at a precision of 98% or more and a recall of 95% or more.
Further, training samples comprising N similar pairs and N dissimilar pairs of face images are acquired, and the real value of each sample is labeled; each sample contains two face images: the two faces of a similar-pair sample belong to the same person, and the two faces of a dissimilar-pair sample belong to different persons.
And acquiring the feature vectors of the two human faces in each sample. The feature vector of the face image can be extracted through the deep learning model. And calculating the similarity of the two human faces according to the feature vectors of the two human face images in each sample to serve as the similarity of the sample. For example, the cosine distance or the euclidean distance of the feature vectors of the two faces is used as the similarity of the two faces, and a smaller similarity value indicates that the two faces are more similar.
Traversing the threshold values in the threshold value range, and calculating the precision value and the recall value of the similar pair and the precision value and the recall value of the dissimilar pair corresponding to each threshold value; obtaining ROC curves of the similar pairs according to the precision values and the recall values of the similar pairs corresponding to all the threshold values; and obtaining ROC curves of the dissimilar pairs according to the precision values and the recall values of the dissimilar pairs corresponding to all the thresholds.
Further, the precision value and recall value of the similar pair and the precision value and recall value of the dissimilar pair corresponding to each threshold value can be calculated by the following method: predicting whether the samples are similar according to a threshold and the similarity of each sample to obtain a predicted value; and calculating the precision value and the recall value of the similar pair and the precision value and the recall value of the dissimilar pair corresponding to the threshold value according to the real values and the predicted values of all the samples.
In an example, N pairs of similar pair samples and N pairs of dissimilar pair samples are taken to construct training data with balanced distribution. Defining a label (i.e. the true value of the exemplar) for each pair of exemplars, the label being 1 (i.e. the true value being 1) if it is a similar pair exemplar; if it is a dissimilar pair sample, the label is 0 (i.e., true value is 0).
The cosine distance between the feature vectors of the two faces in a sample expresses the similarity of the sample, and whether the two faces are similar is predicted according to this similarity and the threshold, giving the predicted value of the sample.
The recall value Rs of the similar pairs under a certain threshold (the proportion of samples correctly predicted as similar pairs among all actual similar-pair samples) is calculated according to the following formula:
Rs = (1/N) · Σ_{i=1..2N} L(d_i < th) · L(g_i = 1)
where th is the threshold, d_i is the similarity of the ith pair of samples, N is the number of similar-pair samples, g_i is the label of the ith pair of samples, and L(·) is the indicator function: L(True) = 1, L(False) = 0.
The precision value Ps of the similar pairs under a certain threshold (the proportion of truly similar pairs among all samples predicted as similar pairs) is calculated according to the following formula:
Ps = ( Σ_{i=1..2N} L(d_i < th) · L(g_i = 1) ) / ( Σ_{i=1..2N} L(d_i < th) )
The recall value Rn of the dissimilar pairs under a certain threshold (the proportion of samples correctly predicted as dissimilar pairs among all actual dissimilar-pair samples) is calculated according to the following formula:
Rn = (1/N) · Σ_{i=1..2N} L(d_i ≥ th) · L(g_i = 0)
The precision value Pn of the dissimilar pairs under a certain threshold (the proportion of truly dissimilar pairs among all samples predicted as dissimilar pairs) is calculated according to the following formula:
Pn = ( Σ_{i=1..2N} L(d_i ≥ th) · L(g_i = 0) ) / ( Σ_{i=1..2N} L(d_i ≥ th) )
For example, with a threshold range of 0 to 1, the threshold th is stepped from 0 in increments of 0.05. The precision value and recall value of the similar pairs and of the dissimilar pairs corresponding to each threshold are calculated as above; the ROC curve of the similar pairs is obtained from the precision and recall values of all the similar pairs, and the ROC curve of the dissimilar pairs from the precision and recall values of all the dissimilar pairs. Each ROC curve comprises two curves: a threshold-precision curve and a threshold-recall curve.
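The traversal above can be sketched in Python as follows; the function names are arbitrary, and the 98%/95% targets in `pick_thresholds` merely follow the example given earlier in this description.

```python
import numpy as np

def pair_pr_curves(d, g, thresholds):
    # d: per-sample similarities (cosine distances); g: labels (1 similar, 0 dissimilar).
    d, g = np.asarray(d), np.asarray(g)
    n_sim, n_dis = int((g == 1).sum()), int((g == 0).sum())
    rows = []
    for th in thresholds:
        pred_sim = d < th
        tp_s = int(np.sum(pred_sim & (g == 1)))     # similar pairs predicted similar
        tp_n = int(np.sum(~pred_sim & (g == 0)))    # dissimilar pairs predicted dissimilar
        ps = tp_s / max(int(pred_sim.sum()), 1)     # precision of similar pairs
        rs = tp_s / max(n_sim, 1)                   # recall of similar pairs
        pn = tp_n / max(int((~pred_sim).sum()), 1)  # precision of dissimilar pairs
        rn = tp_n / max(n_dis, 1)                   # recall of dissimilar pairs
        rows.append((th, ps, rs, pn, rn))
    return rows

def pick_thresholds(rows, min_p=0.98, min_r=0.95):
    # Stacking threshold: the loosest th keeping similar-pair precision and recall high;
    # opening threshold: the loosest th keeping dissimilar-pair precision and recall high.
    stack_th = max((th for th, ps, rs, _, _ in rows if ps >= min_p and rs >= min_r), default=None)
    open_th = min((th for th, _, _, pn, rn in rows if pn >= min_p and rn >= min_r), default=None)
    return stack_th, open_th
```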
Step S200 selects a face image to be processed from the first sample set as a processing object.
Specifically, the first sample set is a set including a plurality of face images, for example, an album including a plurality of photos, where at least one face image is provided on each photo; or, a set of pictures containing images of a human face. And performing stacking according to the face image. If a plurality of face images exist on the photo, the photo is subjected to the pile-dividing processing according to the plurality of face images, for example, if a photo has two persons A and B, the photo is classified into the pile where the person A is located and the pile where the person B is located.
Step S300 determines whether or not the processing object can be sorted into one of the piles of the sorted result based on the sorting threshold.
Specifically, when there is a heap in the heap result that can form a similar pair with the processing object, the processing object can be classified into a heap of the heap result.
Further, the step of judging that one heap can form a similar pair with the processing object comprises the following steps:
the feature vector of the processing object can be obtained through a deep learning model. Calculating the similarity of the feature vector to the center vector of the heap. The classical cosine distance can be used as the measure of similarity, the closer the value is to 0, the more similar the two parties are compared, and the closer the value is to 1, the more dissimilar the two parties are compared; when the similarity is less than the heap threshold, it can be predicted that the processing object and the heap form a similar pair.
Step S310, when the processing objects can be classified into a heap, classifying the processing objects into the heap, and updating the heap.
Specifically, when a heap in the heap splitting result can form a similar pair with the processing object, the processing object is sorted into that heap and the heap's center vector is updated. The center vector of a heap may be obtained as the mean or a weighted average of the feature vectors of all the processing objects in the heap.
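The sorting-in step and the center-vector update can be sketched as follows; this is a minimal illustration under our own naming, using the plain mean of the member vectors as the center (one of the two options mentioned above):

```python
import numpy as np

def cosine_distance(a, b):
    """Cosine distance; 0 means identical direction, larger means more dissimilar."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

class Heap:
    """One cluster: member feature vectors plus a running center vector
    (here the plain mean of the members)."""
    def __init__(self, first_vector):
        self.members = [np.asarray(first_vector, float)]
        self.center = self.members[0].copy()

    def add(self, vector):
        self.members.append(np.asarray(vector, float))
        self.center = np.mean(self.members, axis=0)  # update the center vector

def try_sort_into_heap(vector, heap, sort_threshold):
    """Sort the object into the heap and return True when they form a similar
    pair, i.e. the cosine distance is below the heap-sorting threshold."""
    if cosine_distance(vector, heap.center) < sort_threshold:
        heap.add(vector)
        return True
    return False
```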
In step S320, when the processing object cannot be placed in any heap, it is determined whether a new heap needs to be opened for the processing object according to the heap opening threshold.
Specifically, when the processing object is not successfully sorted into a heap, whether a new heap needs to be opened for it is judged according to the heap-opening threshold. Further, judging whether a new heap needs to be opened for the processing object comprises the following steps:
The feature vector of the processing object is obtained, and its similarity to the center vector of a heap is calculated. The classical cosine distance can be used as the similarity measure: the greater the value, the more dissimilar the two compared parties. When the similarity is greater than the heap-opening threshold, the processing object and the heap form a dissimilar pair.
When the processing object forms a dissimilar pair with every existing heap, a new heap needs to be opened for it.
In step S400, when a new heap needs to be opened for the processing object, a new heap is opened, the processing object is placed in the new heap, and the result of the heap splitting is updated.
Specifically, when a new heap is opened for the processing object, the feature vector of the processing object is taken as the center vector of the new heap.
Step S500 selects another face image to be processed from the first sample set, updates the processing object with the selected face image, and then goes to step S300, and repeats the above process until all the face images to be processed in the first sample set are processed.
Specifically, the heap-sorting processing and the heap-opening processing can be performed in parallel or in series; fig. 1 is merely an example of serial processing (heap sorting first, then heap opening). The heap opening may equally be performed first and the heap sorting afterwards.
For example, the result of the heap is initialized to an empty set. The similarity is expressed in terms of cosine distance.
Selecting a first face image to be processed, opening a new pile (namely pile 1) for the first face image because the pile dividing result is an empty set, and taking the feature vector of the face image as the central vector of the new pile; and updating the heap splitting result.
A second face image to be processed is selected, and the cosine distance between its feature vector and the center vector of heap 1 is calculated to obtain the similarity between the second face image and heap 1. When the similarity is smaller than the heap-sorting threshold, the second face image is sorted into heap 1, and the center vector and element count of heap 1 are updated. When the similarity is greater than the heap-opening threshold, a new heap 2 is opened, the second face image is placed into heap 2, and its feature vector is taken as the center vector of heap 2. When the similarity falls into neither case, it cannot be determined whether the second face image forms a similar or a dissimilar pair with heap 1, so heap splitting fails for the second face image.
A third face image to be processed is selected. Its similarity to heap 1 is calculated, and whether it can be sorted into heap 1 is judged according to the heap-sorting threshold. If not, and heap 2 exists, the same procedure judges whether it can be sorted into heap 2. When it can be sorted into neither heap 1 nor heap 2, whether to open a new heap is judged according to the heap-opening threshold: when the similarities of the third face image to both heap 1 and heap 2 are greater than the heap-opening threshold, a new heap is opened.
Various concrete implementations exist; for example, the similarity between the third face image and each heap is calculated and the minimum is compared with the thresholds: when the minimum is smaller than the heap-sorting threshold, the third face image can be sorted into a heap, and when the minimum is greater than the heap-opening threshold, a new heap can be opened.
And repeating the processes until all the face images to be processed are processed.
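The single pass walked through above can be sketched end to end as follows (our naming; images whose minimum distance falls between the two thresholds are returned as leftovers for the final cleanup step):

```python
import numpy as np

def bilateral_split(vectors, sort_th, open_th):
    """One pass of bilateral-threshold heap splitting over cosine distance.
    Returns (heaps, leftovers): heaps hold member indices and a center
    vector; leftovers are indices that fell between the two thresholds."""
    def cos_dist(a, b):
        return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    heaps = []          # each heap: {"members": [...], "center": vector}
    leftovers = []
    for idx, v in enumerate(vectors):
        v = np.asarray(v, float)
        dists = [cos_dist(v, h["center"]) for h in heaps]
        if dists and min(dists) < sort_th:            # similar pair: sort in
            h = heaps[int(np.argmin(dists))]
            h["members"].append(idx)
            h["center"] = np.mean([vectors[i] for i in h["members"]], axis=0)
        elif not dists or min(dists) > open_th:       # dissimilar to all: open
            heaps.append({"members": [idx], "center": v.copy()})
        else:                                         # undecided: keep for later
            leftovers.append(idx)
    return heaps, leftovers
```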
Finally, all face images that were not successfully heaped are processed, for example by collecting them into one designated heap or by sorting each into the heap with the closest similarity, until one complete heap splitting is finished.
In this embodiment, the cosine distance is used to express the similarity, and this expression means that smaller values indicate more similarity, and larger values indicate less similarity, so in this case, the heap opening threshold is greater than the heap sorting threshold. Obviously, the similarity can be expressed in an opposite manner, for example, the similarity = 1-cosine distance, in which a larger value indicates more similarity, and a smaller value indicates less similarity, and the heap opening threshold is smaller than the heap sorting threshold. The present application supports the above two expression modes of similarity.
This embodiment performs heap splitting based on bilateral thresholds. Because the heap-opening threshold and the heap-sorting threshold each guarantee high precision, both heap sorting and heap opening are highly accurate, and splitting the same face across several heaps is effectively avoided.
In an embodiment of the present invention, as shown in fig. 2, a method for stacking face images includes:
on the basis of the previous embodiment, the following steps are added:
step S610 collects all unsuccessfully piled face images in the first sample set into a second sample set.
Specifically, because bilateral-threshold heap splitting leaves an unprocessed region between the heap-opening threshold and the heap-sorting threshold, some face images may not be successfully heaped; the unsuccessfully heaped face images in the first sample set are collected to form the second sample set.
Step S620 acquires a second heap-sorting threshold and a second heap-opening threshold, wherein the second heap-sorting threshold requires a lower sorting precision than the heap-sorting threshold, or the second heap-opening threshold requires a lower opening precision than the heap-opening threshold.
Specifically, when the cosine distance is used to express similarity, a high heap-opening threshold guarantees the opening precision, and a low heap-sorting threshold guarantees the sorting precision.
The process of stacking the first sample set is denoted as a first stage process. In the first stage, the number of samples in each heap is not large, and the central vector of each heap does not have good representativeness and distinctiveness, at this time, a relatively high heap opening threshold value and a relatively low heap sorting threshold value need to be selected, so that a good basis can be laid for the heap in the first stage. After the first stage, a sample set of unsuccessful stacking, i.e., a second sample set, is obtained.
With the good basis of the first stage, the second stage can appropriately lower the pile-opening threshold (e.g., adjust the pile-opening requirement downward by 0.05), and/or appropriately raise the pile-up threshold (e.g., adjust the pile-up requirement upward by 0.05), so that most of the facial images in the second sample set can be successfully piled up.
Step S630, when the second sample set is not empty, performs heap splitting on the face images of the second sample set according to the second heap-sorting threshold and the second heap-opening threshold.
The process of the heap processing of the face image of the second sample set is as follows:
selecting a face image to be processed from the second sample set as a second processing object;
when judging that a second processing object can be classified into one heap of the heap splitting result according to a second heap classification threshold value, classifying the second processing object into the heap, and updating the heap;
when judging that a new heap needs to be opened for the second processing object according to the second heap opening threshold value, opening a new heap, classifying the second processing object into the new heap, and updating the heap division result;
and selecting another face image to be processed from the second sample set, updating a second processing object by using the face image to be processed, and repeating the process until all the face images to be processed in the second sample set are processed.
Specifically, on the result of the first stage of the heap splitting, the second sample set is subjected to the heap splitting, that is, the second stage of the heap splitting is performed. The second stage of the process is similar to the first stage and will not be repeated here.
After the second stage processing, there may still be some face images that have not been successfully piled up, and these face images are, for example, put together in a certain pile or put together in a pile with the closest similarity respectively until a complete piling up is completed.
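The staged processing can be organized as a small driver loop. The sketch below uses our own names and a toy one-dimensional pass in place of the face-feature pass; each stage reuses the heaps built so far and hands its leftovers to the next, more relaxed stage:

```python
def run_funnel(samples, stages, one_pass):
    """Multi-stage funnel-style heap splitting (sketch). `stages` lists
    (sort_th, open_th) pairs, tightest first; `one_pass(samples, heaps,
    sort_th, open_th)` must sort samples into `heaps` in place and return
    the samples it could not place. Later stages see only the leftovers."""
    heaps, pending = [], list(samples)
    for sort_th, open_th in stages:
        if not pending:
            break
        pending = one_pass(pending, heaps, sort_th, open_th)
    return heaps, pending

def one_pass_1d(samples, heaps, sort_th, open_th):
    """Toy 1-D pass: each heap is a list of numbers; the distance to a heap
    is the gap to its mean. Below sort_th: sort in; above open_th: open a
    new heap; otherwise the sample is a leftover for the next stage."""
    leftovers = []
    for x in samples:
        dists = [abs(x - sum(h) / len(h)) for h in heaps]
        if dists and min(dists) < sort_th:
            heaps[dists.index(min(dists))].append(x)
        elif not dists or min(dists) > open_th:
            heaps.append([x])
        else:
            leftovers.append(x)
    return leftovers
```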
The embodiment adopts two-stage stack separation treatment, and further improves the accuracy and recall of stack separation.
In an embodiment of the present invention, as shown in fig. 3, a method for stacking facial images includes:
on the basis of the previous embodiment, the following steps are added:
step S710 collects all unsuccessfully piled face images in the second sample set into a third sample set.
Step S720, when the ratio of the number of face images in the third sample set to the number of face images in the first sample set exceeds a preset ratio, acquires a third heap-sorting threshold and a third heap-opening threshold, wherein the third heap-sorting threshold requires a lower sorting precision than the second heap-sorting threshold, and the third heap-opening threshold is equal to the second heap-opening threshold.
Specifically, after the second-stage processing, few samples usually remain unheaped. If, however, the ratio of the second stage's unsuccessfully heaped sample set (i.e., the third sample set) to the first sample set exceeds a predetermined ratio, such as 5%, a third stage can be started. The third stage keeps the second stage's heap-opening threshold unchanged, to fully guarantee the opening precision, and further relaxes the heap-sorting threshold, for example moving it gradually from the value corresponding to the original 98% precision to the value corresponding to 95% precision.
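Mapping a target precision such as 98% or 95% to a concrete threshold can be done with the threshold-precision curve obtained during parameter acquisition; a minimal sketch (our naming, assuming the cosine-distance convention so that precision falls as the threshold grows):

```python
def threshold_for_precision(curve, target):
    """Return the loosest (largest) threshold whose similar-pair precision
    still meets `target`. `curve` is a list of (threshold, precision) pairs
    sorted by ascending threshold; returns None when no point qualifies."""
    chosen = None
    for th, precision in curve:
        if precision >= target:
            chosen = th   # keep the largest qualifying threshold
    return chosen
```

Relaxing the heap-sorting threshold from the 98%-precision value toward the 95%-precision value then amounts to calling this with a lower target.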
And step S730, performing pile-dividing processing on the face image of the third sample set according to the third pile-dividing threshold and the third pile-opening threshold.
The process of stacking the face images of the third sample set is as follows:
selecting a face image to be processed from the third sample set as a third processing object;
when judging that the third processing object can be classified into one heap of the heap splitting result according to a third heap classification threshold value, classifying the third processing object into the heap, and updating the heap;
when judging that a new heap needs to be opened for the third processing object according to the third heap opening threshold, opening a new heap, putting the third processing object into the new heap, and updating the heap splitting result;
and selecting another face image to be processed from the third sample set, updating a third processing object by using the face image to be processed, and repeating the process until all the face images to be processed in the third sample set are processed.
Specifically, on the result of the second stage of stacking, the third sample set is subjected to stacking processing, that is, stacking processing of the third stage is performed. The process of the third stage of the process is similar to the first stage and will not be repeated here.
After the third stage of processing, there may still be some face images that have not been successfully piled up, and these face images are, for example, collectively put into a certain pile or respectively put into a pile with the closest similarity until a complete pile-up is completed.
In the embodiment, three-stage stacking processing is adopted, so that the stacking precision and recall are further improved. The multi-stage funnel type processing method can ensure extremely high stacking precision and can also well ensure the processing capacity of the sample.
In an embodiment of the present invention, as shown in fig. 4, a method for stacking facial images includes:
on the basis of the previous embodiment, the following steps are added: step S800 evaluates the result of the stacking from the score of the stacking, the accuracy of the stacking, and the recall rate of the stacking.
Specifically, heap splitting of a face album pursues three objectives. First, each face heap should be pure; other faces should not be mixed into it. Second, all faces of one person should fall into as few heaps as possible; a person's face samples should not be scattered over several heaps. Third, as many different people as possible should be mined from the album, rather than only the faces that are easy to heap.
According to the above objectives, three evaluation indexes are designed: the score of the heap splitting, the precision of the heap splitting, and the recall rate of the heap splitting.
And calculating the score of the heap aiming at the result of the heap splitting, wherein the calculation process is as follows:
1. Count, for each heap in the heap splitting result, the main identifier, the main-identifier sample number, the purity penalty coefficient, and the completeness penalty coefficient:
Count the sample numbers of the different persons in the heap; when the ratio of one person's sample number to the heap's total sample number is greater than a preset threshold, that person is taken as the main identifier of the heap, and that person's sample number in the heap is the heap's main-identifier sample number.
The purity penalty coefficient of the heap is obtained from the proportion of the heap's main-identifier sample number to the heap's total sample number. The frequency with which the heap's main identifier appears across all heaps is counted to obtain the heap's completeness penalty coefficient. The penalty coefficient of the heap is then obtained from its purity penalty coefficient and completeness penalty coefficient.
2. The ideal score of each heap is obtained from the heap's main-identifier sample number and the per-sample score of the heap's main identifier.
The ideal score of the heap is penalized by the heap's penalty coefficient to obtain the score of the heap;
and the score of the current heap splitting is obtained from the scores of all heaps.
For the heap splitting result, the precision P_f of the heap splitting is calculated according to the following formula:

$$P_f = \frac{\sum_{i=1}^{N} N_{i\text{-}main\text{-}id}}{\sum_{i=1}^{N} Q_i}$$

where N is the number of heaps, Q_i is the sample number of the i-th heap, main-id is the main identifier of the i-th heap, and N_{i-main-id} is the main-identifier sample number of the i-th heap;
the recall rate R_f of the heap splitting is calculated according to the following formula:

$$R_f = \frac{\sum_{i=1}^{N} N_{i\text{-}main\text{-}id}}{\sum_{j=1}^{M} D_j}$$

where M is the number of real persons in the first sample set and D_j is the number of face images of the j-th person.
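Heap-splitting precision and recall can be sketched as below (our naming; we read the recall denominator as the total image count, so heaps lacking a main identifier and images never successfully heaped both count against recall):

```python
from collections import Counter

def split_precision_recall(heaps, all_labels, main_ratio=0.8):
    """Precision = main-identifier samples / samples actually placed in heaps;
    recall = main-identifier samples / all face images in the sample set.
    `heaps` are lists of ground-truth identity labels; `all_labels` covers
    every image, including ones that were never successfully heaped."""
    main_total = 0
    placed_total = 0
    for members in heaps:
        placed_total += len(members)
        label, n = Counter(members).most_common(1)[0]
        if n > main_ratio * len(members):   # heap has a main identifier
            main_total += n
    precision = main_total / placed_total if placed_total else 0.0
    recall = main_total / len(all_labels) if all_labels else 0.0
    return precision, recall
```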
And evaluating the stacking result according to the score of the stacking, the accuracy of the stacking and the recall rate of the stacking.
As an example, assume that the test set contains M persons (i.e., M classes, each class labeled with an id) and D samples in total, the j-th person containing D_j samples; then

$$D = \sum_{j=1}^{M} D_j$$
Considering class imbalance, the score of each sample of each class is defined as follows (thus ensuring that the total score of every class is balanced):

$$S_{id\text{-}each\text{-}sample} = \frac{1}{M \cdot D_{id}}$$

where D_{id} is the sample number of the class labeled id.
Heap splitting is performed on the test set, with the following result: D faces in total are split into N heaps, the sample number of the i-th heap being Q_i.
The score S of the heap splitting is calculated from the labeling result and the heap splitting result of the test set:
1) Calculate the main identifier (main id), the main-identifier sample number N_{i-main-id}, and the penalty coefficient γ_i of each heap in the heap splitting result.
The main identifier (main id) of a heap is defined such that the sample number corresponding to the main id exceeds 80% of the heap's total sample number. When the main id is determined, the heap's main-identifier sample number N_{i-main-id} is determined; when no main id exists, N_{i-main-id} = 0.
The penalty coefficient γ_i of the i-th heap mainly comprises two parts: the heap purity penalty coefficient p_i and the heap completeness penalty coefficient g_{main-id}.
The purity penalty coefficient of a heap is defined as the proportion of the heap's main-identifier sample number to the heap's total sample number:

$$p_i = \frac{N_{i\text{-}main\text{-}id}}{Q_i}$$
To calculate the completeness penalty coefficient of a heap, the main id of each heap is computed, and the frequency N_{id} with which each id appears as a main id is counted; this frequency measures how often the samples of one person are split across several heaps. The completeness penalty coefficient of the heap is defined as:

$$g_{main\text{-}id} = \frac{1}{N_{id}}$$
The penalty coefficient of the heap is γ_i = g_{main-id} · p_i.
2) Calculate the score S of the heap splitting:

$$S = \sum_{i=1}^{N} \gamma_i \, S_i, \qquad S_i = N_{i\text{-}main\text{-}id} \cdot S_{id\text{-}each\text{-}sample}$$

where N is the heap number of the heap splitting result and i denotes the heap index; γ_i is the penalty coefficient of the i-th heap, S_i is the ideal score of the i-th heap, and N_{i-main-id} is the main-identifier sample number of the i-th heap.
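The score computation of steps 1) and 2) can be sketched as follows (our naming; each heap is given as a list of ground-truth identity labels, and the per-sample score is passed in as a callable so any class-balancing rule can be plugged in):

```python
from collections import Counter

def split_score(heaps, per_sample_score, main_ratio=0.8):
    """Score a heap splitting: for each heap find the main id (an identity
    holding strictly more than `main_ratio` of the heap), apply the purity
    penalty p_i = N_main / Q_i and the completeness penalty g = 1 / (number
    of heaps sharing that main id), and sum the penalized ideal scores."""
    mains = []
    for members in heaps:
        label, n = Counter(members).most_common(1)[0]
        mains.append((label, n) if n > main_ratio * len(members) else (None, 0))
    freq = Counter(label for label, _ in mains if label is not None)
    total = 0.0
    for members, (label, n_main) in zip(heaps, mains):
        if label is None:
            continue                            # no main id: the heap scores 0
        p_i = n_main / len(members)             # purity penalty coefficient
        g = 1.0 / freq[label]                   # completeness penalty coefficient
        total += g * p_i * n_main * per_sample_score(label)
    return total
```

Splitting one person across two heaps halves each heap's contribution through the completeness penalty, matching the intent of N_{id}.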
The precision P_f and the recall rate R_f of the heap splitting are calculated from the heap splitting result of the test set.
Fig. 4 shows the heap splitting evaluation performed after the three-stage heap splitting processing; in practice, the evaluation may be performed after each stage of processing.
By evaluating the heap splitting after each processing stage, the per-stage indexes show that every stage maintains extremely high precision, indicating that each stage's processing is highly reliable, while the recall keeps improving as the process deepens, indicating that the method's sample processing capacity is also excellent.
In addition, optimal heap-sorting and heap-opening thresholds can be found by searching according to the three heap splitting evaluation indexes, yielding a better heap splitting result. For example, within the threshold range, the heap-opening threshold and the heap-sorting threshold are traversed with a certain step, the heap splitting result of each threshold pair is evaluated, and the pair corresponding to the best result is selected as the optimal parameter values.
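The parameter search just described can be sketched generically; `evaluate` is assumed to run a full heap splitting with the candidate threshold pair and return a scalar quality score built from the three indexes:

```python
def search_thresholds(evaluate, step=0.05, lo=0.0, hi=1.0):
    """Grid-search the (sort_threshold, open_threshold) pair: traverse both
    thresholds over [lo, hi] in `step` increments, keep sort <= open (the
    cosine-distance convention), and return the best-scoring pair.
    `evaluate(sort_th, open_th)` must return a scalar (higher is better)."""
    best, best_pair = float("-inf"), None
    n = int(round((hi - lo) / step))
    for i in range(n + 1):
        sort_th = lo + i * step
        for j in range(i, n + 1):             # enforce open_th >= sort_th
            open_th = lo + j * step
            score = evaluate(sort_th, open_th)
            if score > best:
                best, best_pair = score, (sort_th, open_th)
    return best_pair, best
```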
This embodiment provides a heap splitting evaluation method that evaluates the score, the precision, and the recall rate of the heap splitting, which helps find a heap splitting method with an ideal effect (pure and complete heaps) and better heap splitting parameters.
In an embodiment of the present invention, as shown in fig. 5 and 6, a face image stacking apparatus includes:
and the parameter acquisition module 100 is used for acquiring different stacking thresholds and stacking opening thresholds.
Specifically, to overcome the unreliability of a single threshold, bilateral thresholds are used, namely a heap-sorting threshold and a heap-opening threshold: whether an object can be sorted into a heap is judged according to the heap-sorting threshold, and whether a new heap can be opened is judged according to the heap-opening threshold. Both thresholds may be set empirically.
Optionally, the parameter obtaining module 100 includes:
the sample labeling unit 110 is configured to obtain training samples including N similar pair of face images and N dissimilar pair of face images, and label a true value of each sample.
And the similarity calculation unit 120 is configured to calculate the similarity of each sample according to the feature vectors of the two human faces in each sample.
An accuracy recall value calculating unit 130, configured to traverse the thresholds within the threshold range, and calculate an accuracy value and a recall value of a similar pair and an accuracy value and a recall value of a dissimilar pair corresponding to each threshold;
the calculation of the precision value and the recall value of the similar pair and the precision value and the recall value of the dissimilar pair corresponding to each threshold specifically includes: predicting whether the samples are similar according to a threshold and the similarity of each sample to obtain a predicted value of the sample; and calculating the precision value and the recall value of the similar pair and the precision value and the recall value of the dissimilar pair corresponding to the threshold value according to the real values and the predicted values of all the samples.
A curve generating unit 140, configured to obtain precision recall curves of the similar pairs according to the precision values and recall values of the similar pairs corresponding to all the thresholds; and obtaining the precision recall curve of the dissimilar pairs according to the precision values and the recall values of the dissimilar pairs corresponding to all the thresholds.
And the parameter acquiring unit 150 is configured to determine a pile-returning threshold according to the precision recall curves of the similar pairs, and determine a pile-opening threshold according to the precision recall curves of the dissimilar pairs.
And an object updating module 200, configured to select a face image to be processed from the first sample set as a processing object.
A heap returning module 300, configured to determine whether the processing object can be returned to a heap of the heap splitting result according to the heap returning threshold; when the processing object can be classified into a heap, the processing object is classified into the heap, and the heap is updated.
Further, the heap returning module 300 is configured to determine whether the processing object and a heap of the heap splitting result form a similar pair according to the heap returning threshold; when a heap exists in the heap splitting result and can form a similar pair with the processing object, the processing object is classified into the heap which forms the similar pair with the processing object, and the central vector of the heap which forms the similar pair with the processing object is updated.
The stacking module comprises a similar pair judgment unit 310; the similar pair determining unit 310 is configured to determine whether the processing object and a heap form a similar pair.
Further, the similar pair determining unit 310 is configured to obtain the feature vector of the processing object and calculate the similarity between the feature vector and the center vector of the heap; when the similarity is smaller than the heap-sorting threshold (a smaller similarity value indicating higher similarity), the processing object and the heap form a similar pair.
A heap opening module 400, configured to determine whether a new heap needs to be opened for the processing object according to the heap opening threshold; and when a new heap needs to be opened for the processing object, opening a new heap, classifying the processing object into the new heap, and updating the heap splitting result.
Further, the heap opening module 400 is configured to open a new heap when the processing object and all heaps form dissimilar pairs, sort the processing object into the new heap, and use the feature vector of the processing object as the central vector of the new heap.
The stacking module comprises a dissimilar pair judgment unit 410; the dissimilar pair determining unit 410 is configured to determine whether the processing object and a heap form a dissimilar pair.
Further, the dissimilar pair determining unit 410 is configured to obtain the feature vector of the processing object and calculate the similarity between the feature vector and the center vector of the heap; when the similarity is greater than the heap-opening threshold (a larger similarity value indicating higher dissimilarity), the processing object and the heap form a dissimilar pair.
The object updating module 200 is further configured to select another face image to be processed from the first sample set, update the processing object with the selected face image, and repeat the above process until all the face images to be processed in the first sample set are processed.
This embodiment performs heap splitting based on bilateral thresholds. Because the heap-opening threshold and the heap-sorting threshold each guarantee high precision, both heap sorting and heap opening are highly accurate, and splitting the same face across several heaps is effectively avoided.
In another embodiment of the present invention, as shown in fig. 7, a face image stacking apparatus includes:
on the basis of the foregoing embodiment, a two-stage processing module 500 is added.
A two-stage processing module 500, configured to collect all unsuccessfully piled face images in the first sample set into a second sample set; acquiring a second pile-returning threshold and a second pile-opening threshold; wherein the second stacking threshold requires lower stacking accuracy than the stacking threshold, or the second opening threshold requires lower opening accuracy than the opening threshold; and when the second sample set is not empty, performing pile-dividing processing on the face image of the second sample set according to a second pile-up threshold and a second pile-opening threshold.
The process of the heap processing of the face image of the second sample set is as follows:
selecting a face image to be processed from the second sample set as a second processing object; when judging that a second processing object can be classified into one heap of the heap splitting result according to a second heap classification threshold value, classifying the second processing object into the heap, and updating the heap; when judging that a new heap needs to be opened for the second processing object according to a second heap opening threshold value, opening a new heap, classifying the second processing object into the new heap, and updating a heap division result; and selecting another face image to be processed from the second sample set, updating a second processing object by using the face image to be processed, and repeating the process until all the face images to be processed in the second sample set are processed.
The embodiment adopts two-stage stack separation treatment, and further improves the accuracy and recall of stack separation.
In an embodiment of the present invention, as shown in fig. 8, a face image stacking apparatus includes:
a three-stage processing module 600 is added to the foregoing embodiments.
A three-stage processing module 600, configured to collect all unsuccessfully heaped face images of the second sample set into a third sample set; when the ratio of the number of face images in the third sample set to the number of face images in the first sample set exceeds a preset ratio, acquire a third heap-sorting threshold and a third heap-opening threshold, wherein the third heap-sorting threshold requires a lower sorting precision than the second heap-sorting threshold, and the third heap-opening threshold is equal to the second heap-opening threshold; and perform heap splitting on the face images of the third sample set according to the third heap-sorting threshold and the third heap-opening threshold.
The process of stacking the face images of the third sample set is as follows:
selecting a face image to be processed from the third sample set as a third processing object; when judging that the third processing object can be classified into one heap of the heap splitting result according to the third heap classification threshold value, classifying the third processing object into the heap, and updating the heap; when judging that a new heap needs to be opened for the third processing object according to the third heap opening threshold value, opening a new heap, putting the third processing object into the new heap, and updating the heap splitting result; and selecting another face image to be processed from the third sample set, updating a third processing object by using the face image to be processed, and repeating the process until all the face images to be processed in the third sample set are processed.
The embodiment adopts three-stage stacking processing, and further improves the stacking precision and recall. The multi-stage funnel type processing method can not only guarantee extremely high stacking precision, but also well guarantee the processing capacity of the sample.
In an embodiment of the present invention, as shown in fig. 9 and 10, a face image stacking apparatus includes: on the basis of the foregoing embodiment, a heap evaluation module 700 is added.
And the pile dividing evaluation module 700 is used for evaluating the pile dividing result from the score of the pile dividing, the precision of the pile dividing and the recall rate of the pile dividing.
The heap evaluation module 700 includes:
a score calculating unit 710, configured to obtain a primary identifier, a primary identifier sample number, and a punishment coefficient of each heap in the result of the heap splitting; obtaining a punishment coefficient of the heap according to the pure punishment coefficient of the heap and the complete punishment coefficient of the heap; obtaining an ideal score of the pile according to the number of the samples of the main mark and the score of each sample of the main mark; punishing the ideal score of the heap according to the punishment coefficient of the heap to obtain the score of the heap; and obtaining the scores of the piles according to the scores of all the piles.
a precision calculating unit 720, configured to calculate the stacking precision according to the following formula:
[formula image: stacking precision P_f]
wherein P_f is the stacking precision, N is the number of heaps, Q_i is the number of samples in the i-th heap, i-main-id is the main identifier of the i-th heap, and N_{i-main-id} is the number of main-identifier samples of the i-th heap;
a recall rate calculating unit 730, configured to calculate the stacking recall rate according to the following formula:
[formula image: stacking recall rate R_f]
wherein R_f is the stacking recall rate, M is the number of real persons in the first sample set, and D_j is the number of face images of the j-th person.
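The precision and recall formulas themselves are published as images and cannot be recovered from this text. One plausible reading consistent with the variable definitions (P_f, N, Q_i, N_{i-main-id} for precision; R_f, M, D_j for recall) is sketched below; the exact formulas in the patent may differ, so treat both functions as assumptions.

```python
from collections import Counter

def stacking_precision(heaps_labels):
    """Assumed form: mean per-heap purity, P_f = (1/N) * sum_i N_{i-main-id} / Q_i."""
    ratios = []
    for h in heaps_labels:
        _, n_main = Counter(h).most_common(1)[0]  # main-identifier sample count
        ratios.append(n_main / len(h))
    return sum(ratios) / len(ratios)

def stacking_recall(heaps_labels, person_counts):
    """Assumed form: for each of the M persons j, the share of its D_j images
    captured in heaps whose main identifier is j, averaged over the persons."""
    captured = Counter()
    for h in heaps_labels:
        main_id, _ = Counter(h).most_common(1)[0]
        captured[main_id] += sum(1 for lbl in h if lbl == main_id)
    return sum(captured[p] / d for p, d in person_counts.items()) / len(person_counts)
```

Under this reading, a stray image merged into the wrong heap lowers precision, while an image left out of its person's main heap lowers recall.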
It should be noted that the embodiments of the face image stacking apparatus and of the face image stacking method provided above are based on the same inventive concept and achieve the same technical effects. For other details of the apparatus embodiment, reference may therefore be made to the description of the foregoing method embodiment.
In one embodiment of the invention, as shown in FIG. 11, an electronic device 800 includes a memory 810 and a processor 820. The memory 810 is used to store a computer program 830. The processor 820, when running the computer program 830, implements the method of stacking face images as described above.
As an example, the processor 820, when executing the computer program, implements steps S100 to S500 described above. As another example, the processor 820, when executing the computer program, implements the functions of the modules and units in the face image stacking apparatus described above, for instance the parameter obtaining module 100, the object updating module 200, the stacking module 300 and the heap opening module 400.
Alternatively, the computer program 830 may be divided into one or more modules/units as needed to implement the present invention. Each module/unit may be a series of computer program instruction segments capable of performing a particular function, the instruction segments describing the execution of the computer program in the face image stacking apparatus. As an example, the computer program may be divided into the modules/units of the virtual device, such as the parameter obtaining module, the object updating module and the heap opening module.
The processor implements the face image stacking method by executing the computer program. The processor may be, as required, a central processing unit, a graphics processing unit, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array, a general-purpose processor or another logic device.
The memory may be any internal storage unit and/or external storage device capable of storing data and programs, such as a plug-in hard disk, a smart media card (SMC), a Secure Digital (SD) card or a flash card. The memory is used to store the computer program and the other programs and data of the face image stacking apparatus.
The electronic device 800 may be any computer device, such as a desktop computer, a portable computer, a server, etc. Electronic device 800 may also include input-output devices, display devices, network access devices, and bus 840, among others, as desired. The electronic device 800 may also be a single-chip microcomputer or a computing device that integrates a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU).
It will be understood by those skilled in the art that the division into the above units and modules is made for convenience of illustration and description; according to application requirements, these units and modules may be further split or combined, that is, the internal structure of the device/apparatus may be re-partitioned to implement the same functions. Each unit or module may be a separate physical unit, or two or more of them may be integrated into one physical unit, and their functions may be implemented by hardware and/or software functional units. The couplings or communication connections between units, components and modules may be direct or indirect, realized through buses or interfaces, and may be electrical, mechanical or of another form. Accordingly, the specific names of the units and modules serve only for convenience of description and distinction and do not limit the scope of protection of the present application.
In an embodiment of the present invention, a computer-readable storage medium has a computer program stored thereon, and the computer program, when executed by a processor, implements the face image stacking method of the foregoing embodiments. That is, when part or all of the technical solutions of the embodiments of the present invention that contribute to the prior art are embodied as a computer software product, the product is stored in a computer-readable storage medium. The computer-readable storage medium may be any entity or device capable of carrying the computer program code, such as a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory or a random access memory.
It should be noted that the above embodiments can be freely combined as needed. The foregoing describes only preferred embodiments of the present invention; for those skilled in the art, various modifications and refinements can be made without departing from the principle of the present invention, and such modifications and refinements shall also fall within the scope of protection of the present invention.

Claims (20)

1. A method for stacking face images is characterized by comprising the following steps:
acquiring a stacking threshold and an opening threshold that are different from each other, wherein the stacking threshold is a threshold for the stacking decision, stacking referring to the process of merging a sample into an existing heap; the opening threshold is a threshold for the opening decision, opening referring to the process of newly creating a heap and merging a sample into it;
selecting a face image to be processed from the first sample set as a processing object;
when a heap exists in the stacking result that can form a similar pair with the processing object, merging the processing object into the heap that forms the similar pair with it, and updating the central vector of that heap;
when the processing object forms dissimilar pairs with all heaps, opening a new heap, merging the processing object into the new heap, and taking the feature vector of the processing object as the central vector of the new heap;
wherein the judgment that the processing object and a heap form a similar pair or a dissimilar pair comprises the following steps:
acquiring a feature vector of the processing object;
calculating the similarity of the feature vector and the central vector of the heap;
when the similarity is smaller than the stacking threshold, the processing object and the heap form a similar pair, a smaller similarity value indicating a higher degree of similarity;
when the similarity is greater than the opening threshold, the processing object and the heap form a dissimilar pair, a larger similarity value indicating a higher degree of dissimilarity;
and selecting another face image to be processed from the first sample set, updating the processing object by using the face image to be processed, and repeating the process until all the face images to be processed in the first sample set are processed.
2. The method for stacking face images according to claim 1, wherein the acquiring of the different stacking and opening thresholds comprises:
acquiring a precision-recall curve of similar pairs and a precision-recall curve of dissimilar pairs;
and determining the stacking threshold according to the precision-recall curve of similar pairs, and the opening threshold according to the precision-recall curve of dissimilar pairs.
3. The method of claim 2, wherein the acquiring of the precision-recall curve of similar pairs and the precision-recall curve of dissimilar pairs comprises:
acquiring training samples comprising N similar face-image pairs and N dissimilar face-image pairs, and labeling the true value of each sample;
calculating the similarity of each sample according to the feature vectors of the two human faces in each sample;
traversing the thresholds within the threshold range, and calculating, for each threshold, the precision value and recall value of similar pairs and of dissimilar pairs;
obtaining the precision-recall curve of similar pairs according to the precision values and recall values of similar pairs corresponding to all the thresholds;
and obtaining the precision-recall curve of dissimilar pairs according to the precision values and recall values of dissimilar pairs corresponding to all the thresholds.
4. The method of claim 3, wherein the calculating of the precision value and recall value of similar pairs and of dissimilar pairs corresponding to each threshold comprises:
predicting, according to the threshold and the similarity of each sample, whether the sample is similar, to obtain a predicted value of the sample;
and calculating, according to the true values and predicted values of all samples, the precision value and recall value of similar pairs and of dissimilar pairs corresponding to the threshold.
5. The method for stacking face images according to claim 1, wherein, after all face images to be processed in the first sample set are processed, the method further comprises:
collecting all face images in the first sample set that were not successfully stacked into a second sample set;
acquiring a second stacking threshold and a second opening threshold; wherein the second stacking threshold requires lower stacking accuracy than the stacking threshold, or the second opening threshold requires lower opening accuracy than the opening threshold;
when the second sample set is not empty, selecting a face image to be processed from the second sample set as a second processing object;
when it is determined according to the second stacking threshold that the second processing object can be merged into one heap of the stacking result, merging the second processing object into that heap and updating the heap;
when it is determined according to the second opening threshold that a new heap needs to be opened for the second processing object, opening a new heap, merging the second processing object into it, and updating the stacking result;
and selecting another face image to be processed from the second sample set, updating the second processing object with it, and repeating the above process until all face images to be processed in the second sample set are processed.
6. The method according to claim 5, wherein, after all face images to be processed in the second sample set are processed, the method further comprises:
collecting all face images in the second sample set that were not successfully stacked into a third sample set;
when the ratio of the number of face images in the third sample set to the number of face images in the first sample set exceeds a preset ratio, acquiring a third stacking threshold and a third opening threshold; wherein the third stacking threshold requires lower stacking accuracy than the second stacking threshold, and the third opening threshold is equal to the second opening threshold;
selecting a face image to be processed from the third sample set as a third processing object;
when it is determined according to the third stacking threshold that the third processing object can be merged into one heap of the stacking result, merging the third processing object into that heap and updating the heap;
when it is determined according to the third opening threshold that a new heap needs to be opened for the third processing object, opening a new heap, merging the third processing object into it, and updating the stacking result;
and selecting another face image to be processed from the third sample set, updating the third processing object with it, and repeating the above process until all face images to be processed in the third sample set are processed.
7. The method for stacking face images according to any one of claims 1 to 6, wherein:
and evaluating the stacking result in terms of the stacking score, the stacking precision and the stacking recall rate.
8. The method of claim 7, wherein the calculating of the stacking score comprises:
acquiring the main identifier, the number of main-identifier samples and the penalty coefficient of each heap in the stacking result;
obtaining the ideal score of each heap according to its number of main-identifier samples and the score of each main-identifier sample;
penalizing the ideal score of the heap with the penalty coefficient of the heap to obtain the score of the heap;
and obtaining the stacking score according to the scores of all heaps.
9. The method of claim 8, wherein the acquiring of the main identifier, the number of main-identifier samples and the penalty coefficient of a heap comprises:
counting the numbers of samples of the different persons in the heap;
when the ratio of the number of samples of one person to the total number of samples in the heap is greater than a preset threshold, taking that person as the main identifier of the heap, the number of that person's samples in the heap being the number of main-identifier samples of the heap;
obtaining the purity penalty coefficient of the heap according to the ratio of its number of main-identifier samples to its total number of samples;
counting the number of occurrences of the heap's main identifier among the main identifiers of all heaps to obtain the completeness penalty coefficient of the heap;
and obtaining the penalty coefficient of the heap according to its purity penalty coefficient and completeness penalty coefficient.
10. The method of claim 7, wherein:
calculating the stacking precision according to the following formula:
[formula image: stacking precision P_f]
wherein P_f is the stacking precision, N is the number of heaps, Q_i is the number of samples in the i-th heap, i-main-id is the main identifier of the i-th heap, and N_{i-main-id} is the number of main-identifier samples of the i-th heap;
and calculating the stacking recall rate according to the following formula:
[formula image: stacking recall rate R_f]
wherein R_f is the stacking recall rate, M is the number of real persons in the first sample set, and D_j is the number of face images of the j-th person.
11. An apparatus for stacking face images, characterized by comprising:
the system comprises a parameter acquisition module, a pile opening judgment module and a pile sorting module, wherein the parameter acquisition module is used for acquiring different pile sorting thresholds and pile opening thresholds, the pile sorting threshold is a threshold for pile sorting judgment, and the pile sorting is to merge a sample into an existing pile; the opening threshold is a threshold for opening judgment, and opening refers to opening a pile again and merging a sample into the pile;
the object updating module is used for selecting a face image to be processed from the first sample set as a processing object;
the pile-grouping module is used for grouping the processing object into a pile forming a similar pair with the processing object and updating a central vector of the pile forming the similar pair with the processing object when one pile can form the similar pair with the processing object in the pile-dividing result;
the heap opening module is used for opening a new heap when the processing object and all heaps form dissimilar pairs, putting the processing object into the new heap, and taking the characteristic vector of the processing object as the central vector of the new heap;
wherein the judgment that the processing object and a heap form a similar pair or a dissimilar pair comprises the following steps:
acquiring a feature vector of the processing object;
calculating the similarity of the feature vector and the central vector of the heap;
when the similarity is smaller than the stacking threshold, the processing object and the heap form a similar pair, a smaller similarity value indicating a higher degree of similarity;
when the similarity is greater than the opening threshold, the processing object and the heap form a dissimilar pair, a larger similarity value indicating a higher degree of dissimilarity;
the object updating module is further configured to select another to-be-processed face image from the first sample set, update the processing object with the selected to-be-processed face image, and repeat the above process until all to-be-processed face images in the first sample set are processed.
12. The apparatus for stacking face images according to claim 11, wherein the parameter obtaining module comprises:
a curve generating unit, configured to acquire the precision-recall curve of similar pairs and the precision-recall curve of dissimilar pairs;
and a parameter obtaining unit, configured to determine the stacking threshold according to the precision-recall curve of similar pairs and the opening threshold according to the precision-recall curve of dissimilar pairs.
13. The apparatus for stacking face images according to claim 12, wherein the parameter obtaining module further comprises:
the system comprises a sample marking unit, a training unit and a real value marking unit, wherein the sample marking unit is used for acquiring training samples comprising N similar pair face images and N dissimilar pair face images and marking the real value of each sample;
the similarity calculation unit is used for calculating the similarity of each sample according to the feature vectors of the two human faces in each sample;
a precision-recall value calculating unit, configured to traverse the thresholds within the threshold range and calculate, for each threshold, the precision value and recall value of similar pairs and of dissimilar pairs;
the curve generating unit is further configured to obtain the precision-recall curve of similar pairs according to the precision values and recall values of similar pairs corresponding to all the thresholds, and to obtain the precision-recall curve of dissimilar pairs according to the precision values and recall values of dissimilar pairs corresponding to all the thresholds.
14. The apparatus for stacking face images according to claim 11, further comprising:
a two-stage processing module, configured to collect all face images in the first sample set that were not successfully stacked into a second sample set; acquire a second stacking threshold and a second opening threshold, wherein the second stacking threshold requires lower stacking accuracy than the stacking threshold, or the second opening threshold requires lower opening accuracy than the opening threshold; when the second sample set is not empty, select a face image to be processed from the second sample set as a second processing object; when it is determined according to the second stacking threshold that the second processing object can be merged into one heap of the stacking result, merge the second processing object into that heap and update the heap; when it is determined according to the second opening threshold that a new heap needs to be opened for the second processing object, open a new heap, merge the second processing object into it, and update the stacking result; and select another face image to be processed from the second sample set, update the second processing object with it, and repeat the above process until all face images to be processed in the second sample set are processed.
15. The apparatus for stacking face images according to claim 14, further comprising:
a three-stage processing module, configured to collect all face images in the second sample set that were not successfully stacked into a third sample set; when the ratio of the number of face images in the third sample set to the number of face images in the first sample set exceeds a preset ratio, acquire a third stacking threshold and a third opening threshold, wherein the third stacking threshold requires lower stacking accuracy than the second stacking threshold, and the third opening threshold is equal to the second opening threshold; select a face image to be processed from the third sample set as a third processing object; when it is determined according to the third stacking threshold that the third processing object can be merged into one heap of the stacking result, merge the third processing object into that heap and update the heap; when it is determined according to the third opening threshold that a new heap needs to be opened for the third processing object, open a new heap, merge the third processing object into it, and update the stacking result; and select another face image to be processed from the third sample set, update the third processing object with it, and repeat the above process until all face images to be processed in the third sample set are processed.
16. The apparatus for stacking face images according to any one of claims 11 to 15, further comprising:
and the pile dividing evaluation module is used for evaluating the pile dividing result from the score of the pile dividing, the precision of the pile dividing and the recall rate of the pile dividing.
17. The apparatus for stacking face images according to claim 16, wherein the stacking evaluation module comprises:
a score calculating unit, configured to acquire the main identifier, the number of main-identifier samples and the penalty coefficient of each heap in the stacking result; obtain the penalty coefficient of a heap from its purity penalty coefficient and its completeness penalty coefficient; obtain the ideal score of the heap from the number of main-identifier samples and the score of each main-identifier sample; penalize the ideal score of the heap with its penalty coefficient to obtain the score of the heap; and obtain the stacking score from the scores of all heaps.
18. The apparatus for stacking face images according to claim 16, wherein the stacking evaluation module further comprises:
a precision calculating unit, configured to calculate the stacking precision according to the following formula:
[formula image: stacking precision P_f]
wherein P_f is the stacking precision, N is the number of heaps, Q_i is the number of samples in the i-th heap, i-main-id is the main identifier of the i-th heap, and N_{i-main-id} is the number of main-identifier samples of the i-th heap;
and a recall rate calculating unit, configured to calculate the stacking recall rate according to the following formula:
[formula image: stacking recall rate R_f]
wherein R_f is the stacking recall rate, M is the number of real persons in the first sample set, and D_j is the number of face images of the j-th person.
19. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the method of stacking face images according to any one of claims 1 to 10 when running the computer program.
20. A computer-readable storage medium having stored thereon a computer program, characterized in that:
the computer program, when executed by a processor, implements a method of stacking facial images according to any of claims 1 to 10.
CN201911105931.7A 2019-11-13 2019-11-13 Face image stacking method and device, electronic equipment and readable storage medium Active CN110866136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911105931.7A CN110866136B (en) 2019-11-13 2019-11-13 Face image stacking method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911105931.7A CN110866136B (en) 2019-11-13 2019-11-13 Face image stacking method and device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN110866136A CN110866136A (en) 2020-03-06
CN110866136B true CN110866136B (en) 2022-10-18

Family

ID=69654255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911105931.7A Active CN110866136B (en) 2019-11-13 2019-11-13 Face image stacking method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN110866136B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116719962B (en) * 2023-08-11 2023-10-27 世优(北京)科技有限公司 Image clustering method and device and electronic equipment
CN116737974B (en) * 2023-08-16 2023-11-03 世优(北京)科技有限公司 Method and device for determining threshold value for face image comparison and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201503000A (en) * 2013-03-11 2015-01-16 Yahoo Inc Automatic image piling
AU2014218444A1 (en) * 2014-08-29 2016-03-17 Canon Kabushiki Kaisha Dynamic feature selection for joint probabilistic recognition
CN107622256A (en) * 2017-10-13 2018-01-23 四川长虹电器股份有限公司 Intelligent album system based on facial recognition techniques
CN108182394A (en) * 2017-12-22 2018-06-19 浙江大华技术股份有限公司 Training method, face identification method and the device of convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Exploring album structure for face recognition in online social networks;Jason Hochreiter等;《Image and Vision Computing》;20141031;第751-760页 *

Also Published As

Publication number Publication date
CN110866136A (en) 2020-03-06

Similar Documents

Publication Publication Date Title
CN110163033B (en) Positive sample acquisition method, pedestrian detection model generation method and pedestrian detection method
CN102542058B (en) Hierarchical landmark identification method integrating global visual characteristics and local visual characteristics
KR100444776B1 (en) Image texture retrieving method and apparatus thereof
US20110085728A1 (en) Detecting near duplicate images
CN107122382B (en) Patent classification method based on specification
CN108897775A (en) A kind of rapid image identifying system and method based on perceptual hash
JP2014232533A (en) System and method for ocr output verification
CN106228129A (en) A kind of human face in-vivo detection method based on MATV feature
WO2006075902A1 (en) Method and apparatus for category-based clustering using photographic region templates of digital photo
CN110866136B (en) Face image stacking method and device, electronic equipment and readable storage medium
CN103262118A (en) Attribute value estimation device, attribute value estimation method, program, and recording medium
CN106557521A (en) Object indexing method, object search method and object indexing system
CN106815362A (en) One kind is based on KPCA multilist thumbnail Hash search methods
CN106228554A (en) Fuzzy coarse central coal dust image partition methods based on many attribute reductions
CN106845358A (en) A kind of method and system of handwritten character characteristics of image identification
KR100647337B1 (en) Method and apparatus for category-based photo clustering using photographic region templates of digital photo
WO2015146113A1 (en) Identification dictionary learning system, identification dictionary learning method, and recording medium
CN110458094B (en) Equipment classification method based on fingerprint similarity
CN109948677B (en) Touchi attack detection method based on mixed characteristic values
CN108288061A (en) A method of based on the quick positioning tilt texts in natural scene of MSER
CN111428064B (en) Small-area fingerprint image fast indexing method, device, equipment and storage medium
CN104966109A (en) Medical laboratory report image classification method and apparatus
JP2013117861A (en) Learning device, learning method and program
CN111178367A (en) Feature determination device and method for adapting to multiple object sizes
US20040042663A1 (en) Method, apparatus, and program for similarity judgment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220921

Address after: 200120 Pudong New Area, Shanghai, China (Shanghai) free trade trial area, No. 3, 1 1, Fang Chun road.

Applicant after: Shanghai Tianli Intelligent Technology Co.,Ltd.

Address before: 200335 room 1446, 1st floor, building 8, 33 Guangshun Road, Changning District, Shanghai

Applicant before: FANTASY POWER (SHANGHAI) CULTURE COMMUNICATION Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230804

Address after: 4B28-29, Ritan International Trade Center, No. 17 Ritan North Road, Chaoyang District, Beijing, 100020

Patentee after: 4U (BEIJING) TECHNOLOGY CO.,LTD.

Address before: 200120 Pudong New Area, Shanghai, China (Shanghai) free trade trial area, No. 3, 1 1, Fang Chun road.

Patentee before: Shanghai Tianli Intelligent Technology Co.,Ltd.

TR01 Transfer of patent right