CN109829433A - Facial image recognition method, device, electronic equipment and storage medium

Info

Publication number
CN109829433A
Authority
CN
China
Prior art keywords
clustering
face
face images
obtaining
cluster
Legal status
Granted
Application number
CN201910101153.8A
Other languages
Chinese (zh)
Other versions
CN109829433B (en)
Inventor
杨磊
詹晓航
陈大鹏
闫俊杰
吕健勤
林达华
Current Assignee
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Application filed by Beijing Sensetime Technology Development Co Ltd
Priority to CN201910101153.8A
Publication of CN109829433A
Priority to PCT/CN2019/104449 (published as WO2020155627A1)
Priority to TW108141047A (published as TWI754855B)
Application granted
Publication of CN109829433B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

This disclosure relates to a face image recognition method and apparatus, an electronic device, and a storage medium. The method includes: obtaining a plurality of face images; performing feature extraction on the plurality of face images to obtain a plurality of feature vectors respectively corresponding to the plurality of face images; obtaining a plurality of target objects to be recognized according to the plurality of feature vectors; and evaluating the plurality of target objects to be recognized to obtain categories of the plurality of face images. With the present disclosure, clustering can be achieved on massive unlabeled data, thereby realizing face recognition.

Description

Face image recognition method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method and an apparatus for recognizing a face image, an electronic device, and a storage medium.
Background
In the related art, the clustering process is supervised clustering when the input data is labeled, and unsupervised clustering when it is not. Most existing clustering methods are unsupervised, and their clustering quality is poor.
In face recognition applications, most of the massive face data available is unlabeled. How to cluster such unlabeled data so that face recognition can be performed is a technical problem to be solved.
Disclosure of Invention
The present disclosure provides a face image recognition technical scheme.
According to a first aspect of the present disclosure, there is provided a face image recognition method, the method including:
obtaining a plurality of face images;
extracting the features of the face images to obtain a plurality of feature vectors corresponding to the face images respectively;
obtaining a plurality of target objects to be identified according to the plurality of feature vectors;
and evaluating the target objects to be recognized to obtain the categories of the face images.
In a possible implementation manner, the performing feature extraction on the plurality of face images to obtain a plurality of feature vectors corresponding to the plurality of face images respectively includes:
and extracting the features of the face images according to a feature extraction network to obtain a plurality of feature vectors corresponding to the face images respectively.
In a possible implementation manner, the obtaining a plurality of target objects to be identified according to the plurality of feature vectors includes:
obtaining a face relation graph according to the feature extraction network and the feature vectors;
and clustering the face relation graph to obtain the target objects to be identified.
In a possible implementation, the feature extraction network further comprises a self-learning process.
In a possible implementation manner, the method further comprises the following steps: the feature extraction network performs back propagation according to the first loss function to obtain a self-learned feature extraction network; and clustering the face relation graph according to the self-learned feature extraction network to obtain the target objects to be recognized.
In a possible implementation manner, the evaluating the target objects to be recognized to obtain a plurality of categories of face images includes:
and evaluating the target objects to be identified according to the cluster evaluation parameters to obtain the classes of the face images.
In a possible implementation manner, the evaluating the target objects to be recognized according to the cluster evaluation parameters to obtain categories of the face images, including:
and evaluating the target objects to be identified in the clustering network according to the clustering evaluation parameters to obtain the categories of the face images.
In a possible implementation manner, the evaluating the target objects to be recognized in the cluster network according to the cluster evaluation parameters to obtain the categories of the face images includes:
correcting the cluster evaluation parameters according to the cluster network to obtain corrected cluster evaluation parameters;
and evaluating the target objects to be identified according to the corrected cluster evaluation parameters to obtain the classes of the face images.
In a possible implementation manner, the method further includes: performing back propagation according to a second loss function of the clustering network to obtain a self-learned clustering network;
correcting the cluster evaluation parameters according to the self-learned cluster network to obtain corrected cluster evaluation parameters;
and evaluating the target objects to be identified according to the corrected cluster evaluation parameters to obtain the classes of the face images.
In a possible implementation, after the plurality of target objects to be recognized are evaluated to obtain the categories of the plurality of face images, the method further includes:
extracting a plurality of face images in the category, and extracting, from the plurality of face images, a first face image that meets a preset clustering condition.
In a possible implementation, after the plurality of target objects to be recognized are evaluated to obtain the categories of the plurality of face images, the method further includes:
extracting a plurality of face images in the category, and determining a second face image with overlapped clusters from the plurality of face images;
and performing overlap removal processing on the second face image.
According to a second aspect of the present disclosure, there is provided a training method of a face recognition neural network, the method including:
obtaining a first data set comprising a plurality of face image data;
obtaining a second data set by performing feature extraction on the plurality of face image data;
and carrying out cluster detection on the second data set to obtain the categories of a plurality of face images.
In a possible implementation manner, the obtaining a second data set by performing feature extraction on the plurality of face image data includes:
extracting the features of the face image data to obtain a plurality of feature vectors;
obtaining K nearest neighbors according to the similarity between each of the plurality of feature vectors and its adjacent feature vectors, and obtaining a plurality of first adjacency graphs according to the K nearest neighbors;
iterating the plurality of first adjacency graphs according to super nodes to obtain a plurality of clustering results;
and forming the second data set according to the plurality of clustering results.
In a possible implementation manner, iterating the plurality of first adjacency graphs according to the super nodes to obtain a plurality of clustering results includes:
dividing the plurality of first adjacency graphs into a plurality of connected domains according to a preset threshold, and determining the connected domains as the super nodes;
obtaining K nearest neighbors according to the similarity between each of the super nodes and its adjacent super nodes, and obtaining a plurality of second adjacency graphs to be processed according to the K nearest neighbors;
and continuing to iterate the determination of super nodes on the plurality of second adjacency graphs to be processed, stopping the iteration when it reaches a second threshold interval, to obtain a plurality of clustering results.
In a possible implementation manner, performing cluster detection on the second data set to obtain a plurality of categories of facial images includes:
performing back propagation according to a loss function of the clustering network to obtain a self-learned clustering network;
correcting the cluster evaluation parameters according to the self-learned cluster network to obtain corrected cluster evaluation parameters;
and carrying out cluster quality evaluation on a plurality of cluster results in the second data set according to the corrected cluster evaluation parameters to obtain the categories of a plurality of face images.
In a possible implementation, the method further comprises: after cluster detection is carried out on the second data set to obtain the categories of a plurality of face images,
predicting a probability value for each node in the plurality of clustering results in the second data set to determine a probability of whether each node in the plurality of clustering results belongs to noise.
In a possible implementation, the method further comprises: after cluster detection is carried out on the second data set to obtain the categories of a plurality of face images,
evaluating a plurality of clustering results in the second data set according to a clustering network and cluster evaluation parameters to obtain cluster quality evaluation results, and ranking the plurality of clustering results from high to low cluster quality according to the cluster quality evaluation results to obtain a ranking result;
and determining, according to the ranking result, a clustering result with the highest cluster quality from the plurality of clustering results as a final clustering result.
According to a third aspect of the present disclosure, there is provided a face image recognition apparatus, the apparatus comprising:
a first obtaining unit configured to obtain a plurality of face images;
the feature extraction unit is used for extracting features of the face images to obtain a plurality of feature vectors corresponding to the face images respectively;
the second obtaining unit is used for obtaining a plurality of target objects to be identified according to the plurality of feature vectors;
and the evaluation unit is used for evaluating the target objects to be recognized to obtain the classes of the face images.
In a possible implementation manner, the feature extraction unit is configured to:
and extracting the features of the face images according to a feature extraction network to obtain a plurality of feature vectors corresponding to the face images respectively.
In a possible implementation manner, the second obtaining unit is configured to:
obtaining a face relation graph according to the feature extraction network and the feature vectors;
and clustering the face relation graph to obtain the target objects to be identified.
In a possible implementation, the feature extraction network further comprises a self-learning process.
In a possible implementation mode, the feature extraction network performs back propagation according to a first loss function to obtain a self-learned feature extraction network;
the second obtaining unit is configured to:
and clustering the face relation graph according to the self-learned feature extraction network to obtain the target objects to be recognized.
In a possible implementation, the evaluation unit is configured to:
and evaluating the target objects to be identified according to the cluster evaluation parameters to obtain the classes of the face images.
In a possible implementation, the evaluation unit is configured to:
and evaluating the target objects to be identified in the clustering network according to the clustering evaluation parameters to obtain the categories of the face images.
In a possible implementation, the evaluation unit is configured to:
correcting the cluster evaluation parameters according to the cluster network to obtain corrected cluster evaluation parameters;
and evaluating the target objects to be identified according to the corrected cluster evaluation parameters to obtain the classes of the face images.
In a possible implementation manner, the clustering network further performs back propagation according to a second loss function of the clustering network to obtain a self-learned clustering network;
the evaluation unit is configured to:
correcting the cluster evaluation parameters according to the self-learned cluster network to obtain corrected cluster evaluation parameters;
and evaluating the target objects to be identified according to the corrected cluster evaluation parameters to obtain the classes of the face images.
In a possible implementation, the apparatus further includes: an extraction unit for:
and extracting a plurality of face images in the category, and extracting the face images meeting preset clustering conditions from the plurality of face images.
In a possible implementation, the apparatus further includes: a de-overlap unit to:
extracting a plurality of face images in the category, and determining a second face image with overlapped clusters from the plurality of face images;
and performing overlap removal processing on the second face image.
According to a fourth aspect of the present disclosure, there is provided an apparatus for training a face recognition neural network, the apparatus including:
a data set obtaining unit for obtaining a first data set including a plurality of face image data;
the data feature extraction unit is used for extracting features of the face image data to obtain a second data set;
and the cluster detection unit is used for carrying out cluster detection on the second data set to obtain the categories of the plurality of face images.
In a possible implementation manner, the data feature extraction unit is configured to:
extracting the features of the face image data to obtain a plurality of feature vectors;
obtaining K nearest neighbors according to the similarity between each of the plurality of feature vectors and its adjacent feature vectors, and obtaining a plurality of first adjacency graphs according to the K nearest neighbors;
iterating the plurality of first adjacency graphs according to super nodes to obtain a plurality of clustering results;
and forming the second data set according to the plurality of clustering results.
In a possible implementation manner, the data feature extraction unit is configured to:
dividing the plurality of first adjacency graphs into a plurality of connected domains according to a preset threshold, and determining the connected domains as the super nodes;
obtaining K nearest neighbors according to the similarity between each of the super nodes and its adjacent super nodes, and obtaining a plurality of second adjacency graphs to be processed according to the K nearest neighbors;
and continuing to iterate the determination of super nodes on the plurality of second adjacency graphs to be processed, stopping the iteration when it reaches a second threshold interval, to obtain a plurality of clustering results.
In a possible implementation manner, the cluster detection unit is configured to:
performing back propagation according to a loss function of the clustering network to obtain a self-learned clustering network;
correcting the cluster evaluation parameters according to the self-learned cluster network to obtain corrected cluster evaluation parameters;
and carrying out cluster quality evaluation on a plurality of cluster results in the second data set according to the corrected cluster evaluation parameters to obtain the categories of a plurality of face images.
In a possible implementation, the apparatus further includes: a first processing unit to:
predicting a probability value for each node in the plurality of clustering results in the second data set to determine a probability of whether each node in the plurality of clustering results belongs to noise.
In a possible implementation, the apparatus further includes: a second processing unit to:
evaluating a plurality of clustering results in the second data set according to a clustering network and cluster evaluation parameters to obtain cluster quality evaluation results, and ranking the plurality of clustering results from high to low cluster quality according to the cluster quality evaluation results to obtain a ranking result;
and determining, according to the ranking result, a clustering result with the highest cluster quality from the plurality of clustering results as a final clustering result.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: performing the method of any of the above.
According to a sixth aspect of the present disclosure, there is provided a computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method of any one of the above.
In the embodiments of the present disclosure, a plurality of face images are obtained; feature extraction is performed on the plurality of face images to obtain a plurality of feature vectors respectively corresponding to the plurality of face images; a plurality of target objects to be recognized are obtained according to the plurality of feature vectors; and the plurality of target objects to be recognized are evaluated to obtain the categories of the plurality of face images. With the embodiments of the present disclosure, feature extraction is performed on a plurality of face images to obtain a plurality of feature vectors, and the clustering process, in which the plurality of target objects to be recognized obtained from the plurality of feature vectors are evaluated to obtain the categories of the face images, is supervised clustering.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flowchart of a face image recognition method according to an embodiment of the present disclosure.
Fig. 2 shows a flow chart of a face image recognition method according to an embodiment of the present disclosure.
Fig. 3 shows a flow diagram of a training method according to an embodiment of the present disclosure.
FIG. 4 shows a block diagram of a training model to which a training method according to an embodiment of the present disclosure is applied.
Fig. 5 shows a schematic diagram of an adjacency graph according to an embodiment of the present disclosure.
Fig. 6 shows a schematic diagram of clustered categories according to an embodiment of the present disclosure.
Fig. 7 shows a schematic diagram of cluster detection and segmentation according to an embodiment of the present disclosure.
Fig. 8 shows a block diagram of a face recognition apparatus according to an embodiment of the present disclosure.
Fig. 9 shows a block diagram of a face recognition neural network training device according to an embodiment of the present disclosure.
Fig. 10 shows a block diagram of an electronic device according to an embodiment of the disclosure.
FIG. 11 shows a block diagram of an electronic device in accordance with an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Although face recognition has developed rapidly, improvements in face recognition performance depend heavily on large-scale labeled data. A large number of face pictures can easily be downloaded from the internet, but the cost of fully labeling these pictures is extremely high. Therefore, the processing efficiency of face recognition can be improved by exploiting this unlabeled data through unsupervised or semi-supervised learning. If unlabeled data is assigned "pseudo labels" by clustering, and these "pseudo labels" are then fed into a supervised-learning framework for training, clustering performance can be improved. However, such methods are usually unsupervised clustering and rely on simple assumptions. For example, K-means implicitly assumes that the samples of each class are distributed around a center, while spectral clustering requires that the clustered classes be as balanced in size as possible. Methods such as hierarchical clustering and approximate rank-order clustering are likewise unsupervised and group unlabeled data (such as face image data) based on simple assumptions. Especially in large-scale practical problems, this seriously restricts the improvement of clustering performance and correspondingly restricts the processing efficiency of face recognition.
The embodiments of the present disclosure use the strong expressive power of graph convolutional networks to capture common patterns in face image data, and use these patterns to partition unlabeled data (such as face image data). The graph convolutional network may be part of a framework for graph-convolution-based face clustering. The framework adopts a design similar to Mask R-CNN, a convolutional neural network (CNN) approach that applies deep learning to the detection of target objects: the clustering network of the disclosed embodiments clusters the face images, and a mask is then used to train the clustering network. These training steps may be accomplished by an iterative, super-node-based proposal generator together with a graph detection network and a graph segmentation network. The training steps of the disclosed embodiments can be applied to any adjacency graph and are not limited to the grid of a 2D image. The disclosed embodiments are a supervised clustering approach: based on a graph convolutional learning model, clustering is represented as a detection-and-segmentation pipeline built on the graph convolutional network. Clusters with complex structures can be processed, the accuracy of clustering large-scale face data is improved, unlabeled data (such as face image data) can be handled, and the processing efficiency of face recognition is improved.
Fig. 1 shows a flowchart of a face image recognition method according to an embodiment of the present disclosure. The method is applied to a face recognition apparatus; for example, it may be executed by a terminal device or another processing device, where the terminal device may be user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a wearable device, or the like. In some possible implementations, the face image recognition method may be implemented by a processor calling computer-readable instructions stored in a memory. As shown in fig. 1, the process includes:
and step S101, obtaining a plurality of face images.
In a possible implementation manner of the present disclosure, the plurality of face images may be from the same image or from a plurality of images respectively.
Step S102, extracting the features of the face images to obtain a plurality of feature vectors corresponding to the face images respectively.
In a possible implementation manner of the present disclosure, feature extraction may be performed on the plurality of face images according to a feature extraction network, so as to obtain a plurality of feature vectors respectively corresponding to the plurality of face images. Networks other than the feature extraction network may also be used; any network capable of performing feature extraction falls within the scope of the present disclosure.
And S103, obtaining a plurality of target objects to be identified according to the plurality of feature vectors.
In a possible implementation manner of the present disclosure, a face relationship graph may be obtained according to a feature extraction network and the plurality of feature vectors, and the plurality of target objects to be recognized are obtained after the face relationship graph is clustered. The feature extraction network comprises a self-learning process, and the feature extraction network performs back propagation according to the first loss function to obtain the self-learned feature extraction network. And clustering the face relation graph according to the self-learned feature extraction network to obtain the target objects to be recognized.
In one example, the plurality of face images are input into the feature extraction network, which may be a first graph convolutional neural network. In the feature extraction network, the plurality of face images are converted into a plurality of feature vectors respectively corresponding to the plurality of images, a face relation graph (such as an adjacency graph in a clustering algorithm) obtained from the plurality of feature vectors is optimized, and the plurality of target objects to be recognized are obtained according to the optimized result. The optimization process is realized by back-propagating the feature extraction network according to the first loss function. A target object to be recognized may be a clustering result to be processed; such clustering results are likely to be the desired results, and they are further evaluated by the cluster evaluation parameters to obtain the final clustering result.
And step S104, evaluating the target objects to be recognized to obtain the classes of the face images.
In a possible implementation manner of the present disclosure, the target objects to be recognized may be evaluated according to the cluster evaluation parameter, so as to obtain the categories of the face images. For example, the target objects to be recognized are evaluated in the cluster network according to the cluster evaluation parameters, so as to obtain the categories of the face images.
In a possible implementation manner of the present disclosure, the evaluating the target objects to be recognized according to the cluster evaluation parameters in the cluster network to obtain categories of the face images includes:
correction method
And correcting the cluster evaluation parameters according to the cluster network to obtain corrected cluster evaluation parameters, and evaluating the target objects to be identified according to the corrected cluster evaluation parameters to obtain the categories of the face images.
Second, correction mode after self-learning of the clustering network
Back propagation is performed according to a second loss function of the clustering network to obtain a self-learned clustering network, and the cluster evaluation parameters are corrected according to the self-learned clustering network to obtain corrected cluster evaluation parameters. The plurality of target objects to be recognized are then evaluated according to the corrected cluster evaluation parameters to obtain the categories of the face images.
In one example, the plurality of target objects to be recognized are input into a clustering network, which may be a second graph convolutional neural network. The cluster evaluation parameters are optimized in the clustering network, and the plurality of target objects to be recognized are evaluated according to the optimized cluster evaluation parameters to obtain the categories of the face images. The optimization process is realized by back-propagating the clustering network according to the second loss function.
With the embodiments of the present disclosure, feature extraction is performed on the plurality of face images to obtain the plurality of feature vectors respectively corresponding to them, the plurality of target objects to be recognized are obtained according to the plurality of feature vectors, and a feature extraction network is used for the learning of feature extraction. The plurality of target objects to be recognized are evaluated by the cluster evaluation parameters to obtain the categories for face recognition, and a clustering network is used for cluster learning. Through such feature extraction and cluster learning, clustering can still be achieved for massive unlabeled face images, with a good face recognition effect.
In a possible implementation manner of the present disclosure, the plurality of target objects to be recognized are evaluated in the clustering network according to the cluster evaluation parameters to obtain the categories of the face images.
In a possible implementation manner of the present disclosure, the cluster evaluation parameter includes: the first parameter and/or the second parameter. Wherein, the first parameter (e.g. IoU) is used to characterize the proportion of the intersection of the plurality of clustering results and the real category in the union of the plurality of clustering results and the real category, that is, in the evaluation of clustering quality, the proximity of the plurality of clustering results and the real category is represented by the first parameter. The second parameter (IoP) is used for representing the proportion of the intersection of the plurality of clustering results and the real category in the plurality of clustering results, namely, the purity of the plurality of clustering proposals is represented by the second parameter in the evaluation of the clustering quality.
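These two parameters can be made concrete with a minimal sketch, assuming that a clustering result (proposal) and its ground-truth category are represented as sets of node indices; the function names are illustrative and not taken from the disclosure:

    def cluster_iou(proposal: set, truth: set) -> float:
        """First parameter (IoU): |P ∩ T| / |P ∪ T|, the closeness of the
        proposal to the real category."""
        return len(proposal & truth) / len(proposal | truth)

    def cluster_iop(proposal: set, truth: set) -> float:
        """Second parameter (IoP): |P ∩ T| / |P|, the purity of the proposal."""
        return len(proposal & truth) / len(proposal)

    # Usage: a 5-node proposal, 4 of whose nodes carry the ground-truth label.
    p = {0, 1, 2, 3, 9}
    t = {0, 1, 2, 3, 4, 5}
    print(cluster_iou(p, t))  # 4/7 ≈ 0.571
    print(cluster_iop(p, t))  # 4/5 = 0.8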
In one example, a plurality of first images (original face pictures extracted from the same image or from a plurality of images) are obtained, where the first images are unlabeled image data. A first clustering mode (a conventional existing clustering mode) for face clustering is obtained according to a first graph convolutional neural network, the first clustering mode is applied to the plurality of first images for clustering learning, and a second clustering mode (learning how to perform cluster detection and cluster segmentation) is obtained using a second graph convolutional neural network. The plurality of first images are clustered according to the second clustering mode to obtain a clustering result (the categories for face recognition), and faces are recognized according to the clustering result. The plurality of face images in each category belong to the same person, and face images in different categories belong to different persons.
Fig. 2 shows a flowchart of a face image recognition method according to an embodiment of the present disclosure. The method is applied to a face recognition apparatus; for example, it may be executed by a terminal device or another processing device, where the terminal device may be user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a wearable device, or the like. In some possible implementations, the face image recognition method may be implemented by a processor calling computer-readable instructions stored in a memory. As shown in fig. 2, the process includes:
step S201, a plurality of face images are obtained.
In an example, the plurality of face images may be from the same image or from a plurality of images respectively.
Step S202, extracting the features of the plurality of face images to obtain a plurality of feature vectors corresponding to the plurality of face images respectively, and obtaining a plurality of target objects to be identified according to the plurality of feature vectors.
In one example, the plurality of face images are input into the feature extraction network, which may be a first graph convolutional neural network. In the feature extraction network, the plurality of face images are converted into a plurality of feature vectors respectively corresponding to the plurality of images, a face relation graph (such as an adjacency graph in a clustering algorithm) obtained from the plurality of feature vectors is optimized, and the plurality of target objects to be recognized are obtained according to the optimized result. The optimization process is realized by back-propagating the feature extraction network according to the first loss function. A target object to be recognized may be a clustering result to be processed; such clustering results are likely to be the desired results, and they need to be evaluated by the cluster evaluation parameters to obtain the final clustering result.
Step S203, the target objects to be identified are evaluated through the cluster evaluation parameters, and the categories of the face images are obtained.
In one example, the plurality of target objects to be recognized are input into a clustering network, which may be a second graph convolutional neural network. The cluster evaluation parameters are optimized in the clustering network, and the plurality of target objects to be recognized are evaluated according to the optimized cluster evaluation parameters to obtain the categories of the face images. The optimization process is realized by back-propagating the clustering network according to the second loss function.
And S204, extracting a plurality of face images in the category, and extracting a first face image which meets a preset clustering condition from the plurality of face images.
In one example, the plurality of face images in the category are extracted, face images with abnormal clustering are determined from the plurality of face images and deleted, and the remaining face images are the first face images, among the plurality of face images, that meet the preset clustering condition.
By adopting the embodiment of the disclosure, the target objects to be identified can be evaluated through cluster detection to obtain a first cluster result with cluster quality meeting a preset condition, and then the face images with abnormal clusters in the first cluster result are deleted through cluster segmentation, so that the cluster processing for purifying the first cluster result is realized.
In a possible implementation manner of the present disclosure, the method further includes: the human face image overlap removal processing specifically comprises the following steps: and extracting a plurality of face images in the category, extracting a first face image which meets a preset clustering condition from the plurality of face images, then extracting the plurality of face images in the category, and determining a second face image with overlapped clusters from the plurality of face images. And performing de-overlapping processing on the second face image.
It should be noted that the facial image de-overlapping processing is not limited to be performed after the plurality of facial images in the category are extracted, and may be performed before the plurality of facial images in the category are extracted, as long as the cluster quality can be improved.
For the above face recognition application, feature extraction learning and the training of the cluster learning network need to be performed in advance. The training process is as follows.
Fig. 3 shows a flowchart of a training method of a face recognition neural network according to an embodiment of the present disclosure, and as shown in fig. 3, the flowchart includes:
step S301, a first data set including a plurality of face image data is obtained.
Step S302, a second data set is obtained by extracting the characteristics of the plurality of face image data.
In a possible implementation manner of the present disclosure, the second data set is formed by a plurality of clustering results obtained from the first adjacency graphs representing the semantic relationships of the face image data; in short, the second data set is a set of clustering results.
In a possible implementation manner of the present disclosure, the plurality of face image data are input to a feature extraction network, which may be a first graph convolutional neural network. Features of the face image data are extracted in the first graph convolutional neural network to obtain a plurality of feature vectors; the similarity (such as cosine similarity) between each feature vector and its adjacent feature vectors is compared to obtain K nearest neighbors, and a plurality of first adjacency graphs are obtained according to the K nearest neighbors. For example, this may be handled by an adjacency graph construction module.
In a possible implementation manner of the present disclosure, the plurality of first adjacency graphs may be iteratively optimized according to super nodes in the first graph convolutional neural network. In the iterative optimization process, the plurality of first adjacency graphs are divided, according to a preset threshold, into a plurality of connected domains of a preset size, and the connected domains are determined as the super nodes. The similarity between each super node and its adjacent super nodes is compared, for example by comparing the cosine similarity between the center of each super node and the centers of adjacent super nodes, to obtain K nearest neighbors, and a plurality of second adjacency graphs to be processed are obtained according to the K nearest neighbors. The iterative optimization process of determining super nodes continues on the plurality of second adjacency graphs to be processed, yielding a plurality of clustering results. A set formed by a plurality of such super nodes of different scales is a clustering result, which may also be referred to as a clustering proposal. For example, this may be handled by a cluster proposal module.
And S303, carrying out clustering detection on the second data set to obtain the classes of a plurality of face images.
In a possible implementation manner of the present disclosure, back propagation may be performed according to a loss function of a clustering network to obtain a self-learned clustering network, and the cluster evaluation parameter is corrected according to the self-learned clustering network to obtain a corrected cluster evaluation parameter. And carrying out cluster quality evaluation on a plurality of cluster results in the second data set according to the corrected cluster evaluation parameters to obtain the categories of a plurality of face images.
In one example, the plurality of clustering results may be input to a second graph convolutional neural network, where a first parameter of the cluster evaluation parameters is optimized. The first parameter (e.g., IoU) is used to characterize the proportion of the intersection of the plurality of clustering results and the real category in the union of the plurality of clustering results and the real category. That is, in the evaluation of cluster quality, the closeness of the plurality of clustering results to the real category is expressed by the first parameter. Cluster detection is performed according to the optimized first parameter to obtain a first cluster quality evaluation result for the plurality of clustering results. For example, this may be handled by a cluster detection module.
In another example, the plurality of clustering results may be input to a second graph convolutional neural network, in which a second parameter of the cluster evaluation parameters is optimized. The second parameter (e.g., IoP) is used to characterize the proportion of the intersection of the plurality of clustering results and the real category in the plurality of clustering results; that is, in the evaluation of cluster quality, the purity of the plurality of clustering proposals is expressed by the second parameter. Cluster detection is performed according to the optimized second parameter to obtain a second cluster quality evaluation result for the plurality of clustering results. For example, this may be handled by a cluster detection module.
In a possible implementation manner of the present disclosure, after performing cluster detection on the second data set and obtaining the categories of the plurality of facial images, the method further includes: predicting a probability value for each node in the plurality of clustering results in the second data set to determine a probability of whether each node in the plurality of clustering results belongs to noise.
In one example, a probability value is predicted for each node in the plurality of clustering results in the second graph convolution neural network to determine a probability of whether each node in the plurality of clustering results belongs to noise. For example, it can be processed by a cluster segmentation module.
In a possible implementation manner of the present disclosure, after cluster detection is performed on the second data set and the categories of the plurality of face images are obtained, the method further includes: evaluating the plurality of clustering results in the second data set according to the clustering network and the cluster evaluation parameters to obtain cluster quality evaluation results, and ranking the plurality of clustering results from high to low cluster quality according to the cluster quality evaluation results to obtain a ranking result; and determining, according to the ranking result, the clustering result with the highest cluster quality from the plurality of clustering results as the final clustering result.
In one example, the process includes the following:
firstly, inputting a plurality of clustering results into a second graph convolution neural network, and optimizing a first parameter in the clustering evaluation parameters in the second graph convolution neural network. A first parameter (e.g., IoU) is used to characterize a proportion of the intersection of the plurality of clustered results and the real category in the union of the plurality of clustered results and the real category. That is, in the evaluation of the cluster quality, the closeness of the plurality of cluster results to the real category is expressed by the first parameter. And performing clustering detection according to the optimized first parameter to obtain a first clustering quality evaluation result aiming at the plurality of clustering results.
And secondly, inputting the plurality of clustering results into a second graph convolution neural network, and optimizing a second parameter in the clustering evaluation parameters in the second graph convolution neural network. The second parameter (IoP) is used for representing the proportion of the intersection of the plurality of clustering results and the real category in the plurality of clustering results, namely, the purity of the plurality of clustering proposals is represented by the second parameter in the evaluation of the clustering quality. And performing clustering detection according to the optimized second parameter to obtain a second clustering quality evaluation result aiming at the plurality of clustering results.
And thirdly, in the second graph convolutional neural network, the plurality of clustering results are ranked from high to low cluster quality according to the first cluster quality evaluation result and/or the second cluster quality evaluation result to obtain a ranking result. The clustering result with the highest cluster quality is determined from the plurality of clustering results according to the ranking result as the final clustering result. For example, this may be handled by a de-overlap module.
Application example:
users have collected a large number of unlabelled face images on the network and want to group together pictures in which the faces are identical. In this case, the user may learn a face clustering manner of clustering on the adjacency graph by using the embodiment of the present disclosure, so as to divide the acquired label-free face images into some mutually disjoint categories. The face images in each category belong to the same person, and the face images in different categories belong to different persons. After the category is obtained through a face clustering mode, face recognition can be achieved.
Fig. 4 is a block diagram of a training model applied by the training method according to the embodiment of the present disclosure, and the face clustering method can be processed by an adjacency graph constructing module, a cluster proposal generating module, a cluster detecting module, a cluster dividing module, and a de-overlapping module in the block diagram. Briefly, for the adjacency graph building block: the input data is an original face image in a data set, and the output data is an adjacent map representing semantic relations of all pictures. For the cluster proposal generation module: the input data is an adjacency graph, and the output is a series of clustering proposals. For the cluster detection module: the input data is the clustering proposal, and the output is the quality of the clustering proposal. For the cluster segmentation module: the input is a clustering proposal, and the output is the probability of whether each node in the clustering proposal belongs to noise. For the de-overlap module: the input is the clustering proposal and the quality of the clustering proposal, and the output is the clustering result.
First, the adjacency graph construction module 11: the input of this module is the original pictures (such as face images) in the data set, and its output is an adjacency graph representing the semantic relationships of all the pictures. The module adopts a common deep convolutional network structure, such as ResNet-50. It converts each picture into a feature vector through the deep convolutional network, and then computes the k nearest neighbors of each feature vector through cosine similarity. The feature vector obtained from each picture is regarded as the feature of a node, and the adjacency relation between every two pictures is taken as an edge, thereby obtaining the adjacency graph constructed from all the data. The working principle of k-nearest neighbors is as follows: there is a sample data set in which the characteristic attributes of each object are known, as is the class to which each object belongs. For an object to be detected whose classification is unknown, each of its characteristic attributes is compared with the corresponding attributes of the data in the sample data set, and the classification labels of the most similar objects (nearest neighbors) are then extracted by the algorithm. Generally, only the first k most similar objects in the sample data set are selected.
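As an illustration of this module, the following is a minimal numpy sketch of building the k-nearest-neighbor adjacency graph from feature vectors via cosine similarity. The function name, the choice of k, and the dense-matrix representation are assumptions made for clarity; a million-node graph would use sparse structures:

    import numpy as np

    def build_knn_adjacency(features: np.ndarray, k: int = 5) -> np.ndarray:
        """Build a symmetric 0/1 k-NN adjacency matrix from row-wise
        feature vectors using cosine similarity."""
        # Cosine similarity is the dot product of L2-normalized vectors.
        f = features / np.linalg.norm(features, axis=1, keepdims=True)
        sim = f @ f.T
        np.fill_diagonal(sim, -np.inf)  # exclude self-neighbors
        n = features.shape[0]
        adj = np.zeros((n, n), dtype=np.int8)
        nbrs = np.argsort(-sim, axis=1)[:, :k]  # k most similar neighbors per node
        for i in range(n):
            adj[i, nbrs[i]] = 1
        # Symmetrize: an edge exists if either endpoint selected the other.
        return np.maximum(adj, adj.T)

    # Usage with random stand-in embeddings (a real system would use deep
    # features, e.g. from ResNet-50 as the module suggests).
    emb = np.random.randn(100, 256).astype(np.float32)
    A = build_knn_adjacency(emb, k=5)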
Secondly, the cluster proposal generation module 12: the input of this module is the adjacency graph, and the output is a series of clustering proposals. For an input adjacency graph, the module first divides it, according to a predetermined threshold, into a series of connected domains of consistent size, and defines these connected domains as "super nodes". Taking the center of each super node as a node, the k nearest neighbors among all the centers can be computed, forming an adjacency graph again. On this basis, super nodes with a larger receptive field can be generated, sensing a wider view. This process can be iterated to form a series of "super nodes" of different scales; the set of these "super nodes" constitutes a clustering proposal.
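A minimal sketch of this proposal generation follows, assuming a dense pairwise-similarity matrix: edges below a threshold are dropped, the resulting connected domains become "super nodes", and an oversized domain is re-split with a stricter threshold. The threshold step size, size cap, and function name are illustrative assumptions:

    import numpy as np
    from collections import deque

    def supernodes(sim: np.ndarray, threshold: float, max_size: int):
        """Keep only edges with similarity above `threshold`, then return
        the connected components ("super nodes") found by BFS; components
        larger than `max_size` are recursively re-split with a stricter
        threshold."""
        n = sim.shape[0]
        keep = sim > threshold
        np.fill_diagonal(keep, False)
        seen, comps = np.zeros(n, dtype=bool), []
        for s in range(n):
            if seen[s]:
                continue
            comp, q = [], deque([s])
            seen[s] = True
            while q:
                u = q.popleft()
                comp.append(u)
                for v in np.flatnonzero(keep[u]):
                    if not seen[v]:
                        seen[v] = True
                        q.append(v)
            if len(comp) > max_size:
                # Oversized connected domain: re-split with a higher threshold.
                sub = supernodes(sim[np.ix_(comp, comp)], threshold + 0.05, max_size)
                comps += [[comp[i] for i in c] for c in sub]
            else:
                comps.append(comp)
        return comps

    # Usage on cosine similarities of toy normalized features.
    rng = np.random.default_rng(1)
    f = rng.standard_normal((30, 8))
    f /= np.linalg.norm(f, axis=1, keepdims=True)
    proposals = supernodes(f @ f.T, threshold=0.3, max_size=10)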
The cluster detection module 13: the module inputs the clustering proposal and outputs the quality of the clustering proposal. The module adopts the structure of the graph convolution neural network. To describe the quality of the clustering proposal, two parameters are first introduced. A first parameter or first index (IoU) describes the proportion of the intersection of the cluster proposal and the real category in the union of the cluster proposal and the real category, which represents the closeness of the cluster proposal and the real category; the second parameter, or second indicator (IoP), describes the proportion of the intersection of the cluster proposal and the true category in the cluster proposal, indicating the purity of the cluster proposal. During the training phase, the graph convolution neural network is trained by optimizing the mean square error of the predicted IoU and IoP versus the true IoU and IoP. During the testing phase, all cluster proposals will go through the graph convolution neural network to get the predicted IoU and IoP.
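The training objective can be sketched as follows, assuming one mean-pooled graph-convolution layer of the standard form D^-1·(A+I)·X·W followed by a linear head that predicts (IoU, IoP); the description does not fix the exact layer form, so this architecture is an assumption:

    import numpy as np

    def gcn_d_forward(A: np.ndarray, X: np.ndarray, W1: np.ndarray, W2: np.ndarray):
        """One graph-convolution layer followed by mean pooling and a
        linear head predicting the proposal's [IoU, IoP]."""
        A_hat = A + np.eye(A.shape[0])                 # add self-loops
        D_inv = 1.0 / A_hat.sum(axis=1, keepdims=True)
        H = np.maximum(D_inv * (A_hat @ X) @ W1, 0.0)  # ReLU(D^-1 A X W1)
        g = H.mean(axis=0)                             # pool node features into a graph feature
        return g @ W2                                  # predicted [iou, iop]

    def mse_loss(pred: np.ndarray, true_iou: float, true_iop: float) -> float:
        """Mean squared error of predicted vs. true IoU and IoP."""
        return float(np.mean((pred - np.array([true_iou, true_iop])) ** 2))

    # Toy usage on a 4-node proposal with 16-dimensional node features.
    rng = np.random.default_rng(0)
    A = (rng.random((4, 4)) > 0.5).astype(float)
    A = np.maximum(A, A.T)                             # undirected proposal subgraph
    X = rng.standard_normal((4, 16))
    W1 = rng.standard_normal((16, 8))
    W2 = rng.standard_normal((8, 2))
    loss = mse_loss(gcn_d_forward(A, X, W1, W2), true_iou=0.7, true_iop=0.9)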
The cluster segmentation module 14: the module inputs the clustering proposal and outputs the probability of whether each node in the clustering proposal belongs to the noise. The module is similar to the cluster detection module in structure, and also adopts the structure of a graph convolution neural network. The module predicts a probability value for each node in the clustering plan to indicate whether the node belongs to noise in the clustering plan. For the cluster proposal with low IoP in the cluster detection module, i.e. the cluster proposal with low purity, the module will refine the cluster proposal.
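The refinement step can be sketched as below, assuming the per-node noise probabilities have already been produced by the segmentation network; the 0.5 threshold is an illustrative assumption:

    import numpy as np

    def refine_proposal(nodes, noise_prob, thresh: float = 0.5):
        """Keep only the nodes of a proposal whose predicted noise
        probability falls below `thresh`; `noise_prob` stands in for
        the per-node output of the segmentation network."""
        nodes = np.asarray(nodes)
        return nodes[np.asarray(noise_prob) < thresh].tolist()

    # Usage: node 42 is judged to be noise and discarded.
    print(refine_proposal([7, 13, 42], [0.1, 0.2, 0.9]))  # [7, 13]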
The overlap elimination module 15: the module inputs the clustering proposal and the quality of the clustering proposal, and outputs the clustering result. The module carries out overlap removal processing on the overlapped clustering proposal to obtain a final clustering result. The module firstly sorts the clustering proposals according to the quality of the clustering proposals, selects nodes in the clustering proposals from high to low according to the sorting result, and finally, each node belongs to the clustering proposal with the highest quality.
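The de-overlap rule lends itself to a direct sketch, assuming proposals are lists of node indices with associated quality scores (e.g., the predicted IoU from the detection module):

    def de_overlap(proposals, qualities):
        """Process proposals from highest to lowest quality; each node is
        claimed by the best-quality proposal containing it, so the output
        clusters are disjoint."""
        order = sorted(range(len(proposals)), key=lambda i: qualities[i], reverse=True)
        assigned, clusters = set(), []
        for i in order:
            cluster = [n for n in proposals[i] if n not in assigned]
            if cluster:
                assigned.update(cluster)
                clusters.append(cluster)
        return clusters

    # Usage: the overlapping node 2 goes to the higher-quality proposal.
    print(de_overlap([[1, 2, 3], [2, 4]], [0.9, 0.6]))  # [[1, 2, 3], [4]]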
Fig. 5 is a schematic diagram illustrating an adjacency graph according to an embodiment of the present disclosure, where the picture in fig. 5 is a sample and shows a difference between the embodiment of the present disclosure and a related art in terms of implementation of clustering. Fig. 5 includes two different categories, where nodes in the target object 401 belong to a first category and nodes in the target object 402 belong to a second category. With the related art clustering method 31, a category with a complex internal structure (the second category identified by 402) cannot be processed because of the dependence on a specific clustering strategy. By adopting the embodiment of the disclosure, the quality of different clustering proposals can be evaluated through the structure of the clustering learning category, and the category (the second category identified by 402) with a complex internal structure can be classified, so that the high-quality clustering proposal is output to obtain a correct clustering result.
Fig. 6 shows a schematic diagram of categories obtained by clustering according to an embodiment of the present disclosure, and in fig. 6, four categories found by using the embodiment of the present disclosure are shown. According to the real labels, all the nodes in fig. 6 belong to the same real category, and the distance between two nodes in fig. 6 is inversely proportional to the similarity of the two nodes. This picture shows that categories with complex structures can be handled using embodiments of the present disclosure, for example: the category has a structure of two subgraphs and a structure with dense connection and sparse connection coexisting. Each target object in fig. 6, such as target object identified by 501, target object identified by 502, target object identified by 503, and target object identified by 504, belong to the same category, which is also called cluster.
In an example, in order to cope with the complex structure of cluster patterns in large-scale face clustering, cluster learning can be performed on a graph convolutional network based on the adjacency graph by using the embodiment of the disclosure. In particular, cluster detection and cluster segmentation are integrated on the adjacency graph to solve the cluster learning problem. Given a face data set, the facial features of each face in the data set are extracted by a trained Convolutional Neural Network (CNN), forming a set of feature vectors. When constructing the adjacency graph, the K nearest neighbors of each sample are found using cosine similarity. Through the connections between neighbors, the adjacency graph of the whole data set is obtained; equivalently, the adjacency graph can be represented by a symmetric adjacency matrix. The adjacency graph is a large graph with millions of nodes. From the adjacency graph, the desired properties of the clusters can be stated: 1) images in different clusters have different labels; 2) images in the same cluster have the same label.
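A possible construction of such an adjacency matrix is sketched below (an illustrative sketch only; the array names and the value of K are assumptions, and a real million-node graph would use approximate nearest-neighbor search and sparse storage rather than a dense matrix):

import numpy as np

def knn_adjacency(features, k=5):
    # features: (N, D) matrix of face feature vectors, with k < N
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T                      # cosine similarity between all pairs
    np.fill_diagonal(sim, -np.inf)     # exclude self-matches
    adj = np.zeros_like(sim)
    for i in range(len(sim)):
        nn = np.argsort(sim[i])[-k:]   # indices of the K most similar samples
        adj[i, nn] = sim[i, nn]
    return np.maximum(adj, adj.T)      # symmetric adjacency matrix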
Fig. 7 shows a schematic diagram of cluster detection and segmentation according to an embodiment of the present disclosure. The "clustering results" exist in the form of clusters (also called classes), such as the individual clusters shown in fig. 6; in this example they are all referred to as "clusters". The initial clustering result input to cluster detection may also be called a clustering proposal, since it is produced by a proposal generator. In fig. 7, the clustering framework includes three modules: the proposal generator, GCN-D and GCN-S. The proposal generator produces clustering proposals, i.e. subgraphs of the adjacency graph that are likely to be clusters. GCN-D and GCN-S form a two-stage procedure: high-quality clustering proposals are selected first and then refined by eliminating the noise within them. Specifically, cluster detection is performed by GCN-D, which takes the proposals generated by the proposal generator as input and predicts IoU and IoP to assess how likely a proposal is to constitute a desired cluster. Segmentation is then performed by GCN-S to refine the selected proposals: for each proposal, GCN-S estimates the noise probability of each node and screens the proposal by discarding the outliers. The clusters finally output are the desired clusters, so high-quality clusters can be obtained effectively.
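The two-stage flow of fig. 7 can be summarized as follows (a hedged sketch; the three modules are stubbed as plain callables, and every name and threshold here is an assumption rather than the patent's API):

def cluster_faces(adjacency, features, generate_proposals, gcn_d, gcn_s,
                  iop_low=0.3, iop_high=0.7):
    proposals = generate_proposals(adjacency)               # super-node based proposals
    scored = [(p, *gcn_d(p, features)) for p in proposals]  # gcn_d returns (IoU, IoP)
    scored.sort(key=lambda t: t[1], reverse=True)           # rank by predicted IoU
    kept = []
    for proposal, iou, iop in scored:
        if iop_low <= iop <= iop_high:                      # impure: refine by segmentation
            noise = gcn_s(proposal, features)               # per-node noise probabilities
            proposal = [n for n, p_n in zip(proposal, noise) if p_n < 0.5]
        kept.append(proposal)
    return kept                                             # de-overlap is applied afterwards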
With respect to the clustering proposals, this example does not process the large adjacency graph directly but first generates clustering proposals, which greatly reduces the computational cost since only a limited number of cluster candidates need to be evaluated. Proposal generation is based on super nodes, and the super nodes together compose the clustering proposals, i.e. the clustering proposals in fig. 7 are generated from the super nodes. A super node is a subgraph of the adjacency graph that contains a small number of nodes, each closely connected to every other node. Connected components can therefore represent super nodes, but a connected component derived directly from the adjacency graph may be too large; for this reason, edges within each super node whose affinity value falls below a threshold are deleted, and the size of each super node is limited to a maximum value. In general, an adjacency graph of 1M nodes may be divided into 50K super nodes, each containing 20 nodes on average. The nodes in one super node most likely belong to the same person, while the samples of one person may be distributed over several super nodes. For an application scene of target detection (in particular face recognition), this is a multi-scale clustering scheme: close relations are established between the centers of several super nodes, and the lines connecting the centers serve as edges.
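Super-node generation can be sketched as follows (an illustrative sketch under stated assumptions: the threshold schedule, the maximum size of 20, and the use of scipy's connected-component routine are ours, not prescribed by the patent):

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def supernodes(adj, max_size=20, step=0.05):
    groups, pending = [], [(np.arange(len(adj)), 0.0)]
    while pending:
        nodes, threshold = pending.pop()
        sub = adj[np.ix_(nodes, nodes)].copy()
        sub[sub < threshold] = 0.0                      # prune low-affinity edges
        n_comp, labels = connected_components(csr_matrix(sub), directed=False)
        for c in range(n_comp):
            members = nodes[labels == c]
            if len(members) <= max_size or threshold >= 1.0:
                groups.append(members)                  # small enough: accept as a super node
            else:
                pending.append((members, threshold + step))  # re-split with a stricter threshold
    return groups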
In cluster detection, the present example contemplates a GCN-D module based on graph convolutional networks (GCN), which selects high-quality clusters from the clustering proposals generated by the proposal generator. The quality of a cluster is measured by the two parameters above, the IoU and IoP scores, computed as shown in equations (1) and (2):

$$\mathrm{IoU}(P) = \frac{|P \cap \hat{P}|}{|P \cup \hat{P}|} \qquad (1)$$

$$\mathrm{IoP}(P) = \frac{|P \cap \hat{P}|}{|P|} \qquad (2)$$

where $\hat{P}$ is the true cluster and $P$ is a cluster proposed by the proposal generator.
It is assumed that high-quality clusters exhibit certain structural patterns among their nodes, and the GCN-D module identifies such clusters. For example, given a clustering proposal $P_i$, the GCN-D module takes the features of its nodes (denoted $F_0(P_i)$) and the adjacency submatrix (denoted $A(P_i)$) as input and predicts the IoU and IoP scores. The GCN underlying the GCN-D module comprises $L$ layers, each computed as shown in formula (3); the diagonal degree matrix $\tilde{D}$ is computed as shown in formula (4):

$$F_{l+1}(P_i) = \sigma\left(\tilde{D}(P_i)^{-1}\big(A(P_i)+I\big)F_l(P_i)W_l\right) \qquad (3)$$

$$\tilde{D}_{ii}(P_i) = \sum_{j}\big(A(P_i)+I\big)_{ij} \qquad (4)$$

where $F_l(P_i)$ denotes the node features at the $l$-th layer of the network and $W_l$ is a learnable parameter of the $l$-th layer.
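One layer of such a network can be written down directly (a minimal numpy sketch of the formulation in formulas (3) and (4), with ReLU assumed as the non-linearity; a trained implementation would of course learn W rather than receive it):

import numpy as np

def gcn_layer(A, F, W):
    # A: (n, n) adjacency submatrix of the proposal
    # F: (n, d_in) node features, W: (d_in, d_out) learnable weights
    A_hat = A + np.eye(len(A))                   # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))     # inverse degree matrix, formula (4)
    return np.maximum(D_inv @ A_hat @ F @ W, 0)  # aggregate, transform, ReLU: formula (3)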
Class labels are provided for the training data set so that the true IoU and IoP can be obtained; the GCN-D module is then trained on the mean square error between the true and predicted values, so that it learns to give accurate predictions. During inference, the trained GCN-D module is used to predict the IoU and IoP scores of each clustering proposal generated by the proposal generator. A fixed number of high-quality proposals is then retained according to the IoU scores, and the IoP score is used at the next stage to determine whether a proposal needs further refinement.
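The training objective amounts to a plain mean-squared-error loss over the two scores (a hedged sketch; the array shapes are assumptions):

import numpy as np

def gcn_d_loss(pred, target):
    # pred, target: (batch, 2) arrays of (IoU, IoP) scores
    return float(np.mean((pred - target) ** 2))

pred = np.array([[0.8, 0.9], [0.4, 0.6]])
true = np.array([[0.7, 1.0], [0.5, 0.5]])
print(gcn_d_loss(pred, true))  # 0.01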
The clustering proposals retained by the GCN-D module may still contain outliers, also called cluster-anomalous values, which need to be eliminated. For this purpose, cluster segmentation is performed by a GCN-based GCN-S module to exclude the anomalous values from the clustering proposal. The structure of the GCN-S module is similar to that of the GCN-D module; the main difference is that the GCN-S module outputs a probability value for each node rather than predicting a quality score for the whole proposal.
To train the GCN-S module to identify outliers, nodes whose label differs from the majority label could be treated as outliers. However, the GCN-S module can learn different segmentation patterns as long as each segmentation result contains the nodes of one class, whether or not it carries the majority label. Specifically, one node is randomly selected as a seed: nodes with the same label as the seed are treated as positive nodes, while all other nodes are treated as outliers. Following this principle, seeds are selected randomly over multiple iterations, producing multiple sets of training samples, each sample containing a set of feature vectors. The GCN-S module is trained with node-wise binary cross-entropy as the loss function. During inference, multiple random seeds are drawn for each generated clustering proposal, and only the prediction with the largest number of positive nodes (at a threshold of 0.5) is kept. This strategy avoids the misguidance that arises when a random seed happens to correspond to very few positive nodes. For the GCN-S module, clustering proposals whose IoP lies between the thresholds 0.3 and 0.7 may be retained for refinement.
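The seed-based labelling can be sketched as follows (illustrative only; the function name, the number of iterations and the RNG handling are assumptions):

import numpy as np

def seed_training_targets(node_labels, n_iters=5, seed=0):
    # node_labels: ground-truth labels of the nodes in one proposal
    rng = np.random.default_rng(seed)
    labels = np.asarray(node_labels)
    targets = []
    for _ in range(n_iters):
        s = rng.integers(len(labels))                      # draw a random seed node
        targets.append((labels == labels[s]).astype(int))  # 1 = positive, 0 = outlier
    return targets

print(seed_training_targets([0, 0, 1, 0, 2], n_iters=3))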
After the clustering proposals produced by the proposal generator have been refined through cluster detection and cluster segmentation, different clusters may still overlap with each other, i.e. share certain nodes, which would harm subsequent face recognition training. A quick de-overlap can be performed by sorting the proposals in descending order of IoU score, collecting them sequentially from the sorted result, and modifying each proposal by deleting the nodes that already appeared in earlier, higher-ranked proposals.
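The de-overlap step itself is a short greedy pass (a minimal sketch assuming proposals are lists of node ids with one quality score per proposal):

def de_overlap(proposals, scores):
    order = sorted(range(len(proposals)), key=lambda i: scores[i], reverse=True)
    assigned, clusters = set(), []
    for i in order:
        kept = [n for n in proposals[i] if n not in assigned]  # drop nodes already claimed
        if kept:
            assigned.update(kept)
            clusters.append(kept)
    return clusters

# Node 2 is shared by both proposals; it stays with the higher-scoring one
print(de_overlap([[1, 2, 3], [2, 4]], scores=[0.9, 0.6]))  # [[1, 2, 3], [4]]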
It is understood that the above-mentioned method embodiments of the present disclosure can be combined with one another to form combined embodiments without departing from the principles and logic; for reasons of space, the details are not repeated in the present disclosure.
It will be understood by those skilled in the art that, in the methods of the present disclosure, the order in which the steps are written does not imply a strict order of execution or any limitation on the implementation; the specific order of execution of the steps should be determined by their function and possible inherent logic.
In addition, the present disclosure also provides a face recognition apparatus, a training apparatus for a face recognition neural network, an electronic device, a computer-readable storage medium, and a program, each of which can be used to implement any of the face image recognition methods and training methods for a face recognition neural network provided by the present disclosure; for the corresponding technical solutions, refer to the corresponding descriptions in the methods section, which are not repeated here.
Fig. 8 shows a block diagram of a face recognition apparatus according to an embodiment of the present disclosure. In fig. 8, the apparatus includes: a first obtaining unit 41, configured to obtain a plurality of face images; a feature extraction unit 42, configured to perform feature extraction on the plurality of face images to obtain a plurality of feature vectors respectively corresponding to the plurality of face images; a second obtaining unit 43, configured to obtain a plurality of target objects to be identified according to the plurality of feature vectors; and an evaluation unit 44, configured to evaluate the plurality of target objects to be recognized to obtain the categories of the plurality of face images.
In a possible implementation manner of the present disclosure, the feature extraction unit is configured to extract features of the plurality of face images according to a feature extraction network, to obtain the plurality of feature vectors respectively corresponding to the plurality of face images.
In a possible implementation manner of the present disclosure, the second obtaining unit is configured to obtain a face relation graph according to the feature extraction network and the plurality of feature vectors, and to cluster the face relation graph to obtain the plurality of target objects to be identified.
In a possible implementation manner of the present disclosure, the feature extraction network further involves a self-learning process: the feature extraction network performs back propagation according to a first loss function to obtain a self-learned feature extraction network. The second obtaining unit is configured to cluster the face relation graph according to the self-learned feature extraction network to obtain the plurality of target objects to be identified.
In a possible implementation manner of the present disclosure, the evaluation unit is configured to evaluate the plurality of target objects to be identified according to cluster evaluation parameters, to obtain the categories of the plurality of face images.
In a possible implementation manner of the present disclosure, the evaluation unit is configured to evaluate the plurality of target objects to be identified in a clustering network according to the cluster evaluation parameters, to obtain the categories of the plurality of face images.
In a possible implementation manner of the present disclosure, the evaluation unit is configured to correct the cluster evaluation parameters according to the clustering network to obtain corrected cluster evaluation parameters, and to evaluate the plurality of target objects to be identified according to the corrected cluster evaluation parameters to obtain the categories of the plurality of face images.
In a possible implementation manner of the present disclosure, the clustering network further performs back propagation according to a second loss function of the clustering network to obtain a self-learned clustering network. The evaluation unit is configured to correct the cluster evaluation parameters according to the self-learned clustering network to obtain corrected cluster evaluation parameters, and to evaluate the plurality of target objects to be identified according to the corrected cluster evaluation parameters to obtain the categories of the plurality of face images.
In a possible implementation manner of the present disclosure, the apparatus further includes an extraction unit configured to extract a plurality of face images in a category, and to extract, from the plurality of face images, a first face image that meets a preset clustering condition.
In a possible implementation manner of the present disclosure, the apparatus further includes a de-overlap unit configured to extract a plurality of face images in a category, determine, from the plurality of face images, a second face image whose clusters overlap, and perform de-overlap processing on the second face image.
Fig. 9 is a block diagram illustrating a training apparatus for a face recognition neural network according to an embodiment of the present disclosure. In fig. 9, the apparatus includes: a data set obtaining unit 51, configured to obtain a first data set including a plurality of face image data; a data feature extraction unit 52, configured to obtain a second data set by performing feature extraction on the plurality of face image data; and a cluster detection unit 53, configured to perform cluster detection on the second data set to obtain the categories of a plurality of face images.
In a possible implementation manner of the present disclosure, the data feature extraction unit is configured to: perform feature extraction on the plurality of face image data to obtain a plurality of feature vectors; obtain K nearest neighbors according to the similarity between each of the plurality of feature vectors and its adjacent feature vectors, and obtain a plurality of first adjacency graphs according to the K nearest neighbors; iterate over the plurality of first adjacency graphs according to super nodes to obtain a plurality of clustering results; and form the second data set from the plurality of clustering results.
In a possible implementation manner of the present disclosure, the data feature extraction unit is configured to: divide the plurality of first adjacency graphs into a plurality of connected domains according to a preset threshold value, and determine the connected domains as the super nodes; obtain K nearest neighbors according to the similarity between each of the super nodes and its adjacent super nodes, and obtain a plurality of second adjacency graphs to be processed according to the K nearest neighbors; and continue the super-node-determining iteration on the plurality of second adjacency graphs to be processed until the iteration reaches a second threshold interval range, then stop the iteration to obtain the plurality of clustering results.
In a possible implementation manner of the present disclosure, the cluster detection unit is configured to: perform back propagation according to the loss function of the clustering network to obtain a self-learned clustering network; correct the cluster evaluation parameters according to the self-learned clustering network to obtain corrected cluster evaluation parameters; and perform cluster quality evaluation on the plurality of clustering results in the second data set according to the corrected cluster evaluation parameters, to obtain the categories of the plurality of face images.
In a possible implementation manner of the present disclosure, the apparatus further includes a first processing unit configured to predict a probability value for each node in the plurality of clustering results in the second data set, so as to determine the probability that each node in the plurality of clustering results is noise.
In a possible implementation manner of the present disclosure, the apparatus further includes a second processing unit configured to: evaluate the plurality of clustering results in the second data set according to the clustering network and the cluster evaluation parameters to obtain cluster quality evaluation results; sort the plurality of clustering results in descending order of cluster quality according to the cluster quality evaluation results, to obtain a sorting result; and determine, according to the sorting result, the clustering result with the highest cluster quality among the plurality of clustering results as the final clustering result.
In some embodiments, the functions of, or the modules included in, the apparatus provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments; for their specific implementation, reference may be made to the description of those method embodiments, which, for brevity, is not repeated here.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the above-mentioned method. The computer readable storage medium may be a non-volatile computer readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
The electronic device may be provided as a terminal, server, or other form of device.
Fig. 10 is a block diagram illustrating an electronic device 800 in accordance with an example embodiment. For example, the electronic device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or another such terminal.
Referring to fig. 10, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front-facing camera and/or a rear-facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 800 is in an operation mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of the electronic device 800. For example, the sensor assembly 814 may detect the open/closed state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800. The sensor assembly 814 may also detect a change in the position of the electronic device 800 or of a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the electronic device 800 to perform the above-described methods.
Fig. 11 is a block diagram illustrating an electronic device 900 in accordance with an example embodiment. For example, the electronic device 900 may be provided as a server. Referring to fig. 11, the electronic device 900 includes a processing component 922, which further includes one or more processors, and memory resources, represented by memory 932, for storing instructions, such as application programs, executable by the processing component 922. The application programs stored in the memory 932 may include one or more modules each corresponding to a set of instructions. Further, the processing component 922 is configured to execute the instructions to perform the above-described methods.
The electronic device 900 may also include a power component 926 configured to perform power management of the electronic device 900, a wired or wireless network interface 950 configured to connect the electronic device 900 to a network, and an input/output (I/O) interface 958. The electronic device 900 may operate based on an operating system stored in the memory 932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 932, is also provided that includes computer program instructions executable by the processing component 922 of the electronic device 900 to perform the above-described method.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions to implement aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or the technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A face image recognition method is characterized by comprising the following steps:
obtaining a plurality of face images;
extracting the features of the face images to obtain a plurality of feature vectors corresponding to the face images respectively;
obtaining a plurality of target objects to be identified according to the plurality of feature vectors;
and evaluating the target objects to be recognized to obtain the categories of the face images.
2. The method according to claim 1, wherein the extracting features of the plurality of face images to obtain a plurality of feature vectors corresponding to the plurality of face images respectively comprises:
and extracting the features of the face images according to a feature extraction network to obtain a plurality of feature vectors corresponding to the face images respectively.
3. The method of claim 1, wherein obtaining a plurality of target objects to be identified according to the plurality of feature vectors comprises:
obtaining a face relation graph according to the feature extraction network and the feature vectors;
and clustering the face relation graph to obtain the target objects to be identified.
4. A training method of a face recognition neural network is characterized by comprising the following steps:
obtaining a first data set comprising a plurality of face image data;
obtaining a second data set by performing feature extraction on the plurality of face image data;
and carrying out cluster detection on the second data set to obtain the categories of a plurality of face images.
5. The method of claim 4, wherein the obtaining a second data set by feature extraction of the plurality of face image data comprises:
extracting the features of the face image data to obtain a plurality of feature vectors;
obtaining K nearest neighbors according to the similarity between each feature vector of the plurality of feature vectors and its adjacent feature vectors, and obtaining a plurality of first adjacency graphs according to the K nearest neighbors;
iterating the first adjacency graphs according to the super nodes to obtain a plurality of clustering results;
and forming the second data set according to the plurality of clustering results.
6. The method of claim 5, wherein iterating the plurality of first adjacency graphs according to the super nodes to obtain a plurality of clustering results comprises:
dividing the plurality of first adjacency graphs into a plurality of connected domains according to a preset threshold value, and determining the connected domains as the super nodes;
obtaining K nearest neighbors according to the similarity between each super node of the super nodes and its adjacent super nodes, and obtaining a plurality of second adjacency graphs to be processed according to the K nearest neighbors;
and continuing the iteration of determining the super nodes on the plurality of second adjacency graphs to be processed until the iteration reaches a second threshold interval range, then stopping the iteration to obtain the plurality of clustering results.
7. An apparatus for recognizing a face image, the apparatus comprising:
a first obtaining unit configured to obtain a plurality of face images;
the feature extraction unit is used for extracting features of the face images to obtain a plurality of feature vectors corresponding to the face images respectively;
the second obtaining unit is used for obtaining a plurality of target objects to be identified according to the plurality of feature vectors;
and the evaluation unit is used for evaluating the target objects to be recognized to obtain the classes of the face images.
8. An apparatus for training a face recognition neural network, the apparatus comprising:
a data set obtaining unit for obtaining a first data set including a plurality of face image data;
the data feature extraction unit is used for extracting features of the face image data to obtain a second data set;
and the cluster detection unit is used for carrying out cluster detection on the second data set to obtain the categories of the plurality of face images.
9. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: perform the method of any one of claims 1 to 3 or claims 4 to 6.
10. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method of any one of claims 1 to 3 or claims 4 to 6.
CP02 Change in the address of a patent holder