CN112766421A - Face clustering method and device based on structure perception - Google Patents

Face clustering method and device based on structure perception

Info

Publication number
CN112766421A
CN112766421A (application CN202110272409.9A; granted publication CN112766421B)
Authority
CN
China
Prior art keywords
graph
neighbor
face
sampling
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110272409.9A
Other languages
Chinese (zh)
Other versions
CN112766421B (en)
Inventor
周杰
鲁继文
沈帅
李万华
朱政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202110272409.9A priority Critical patent/CN112766421B/en
Publication of CN112766421A publication Critical patent/CN112766421A/en
Application granted granted Critical
Publication of CN112766421B publication Critical patent/CN112766421B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a face clustering method and device based on structure perception, wherein the method comprises the following steps: acquiring a plurality of face images to be processed, extracting the face features of each face image to be processed based on a pre-trained convolutional neural network model, and constructing a K neighbor graph according to the face features of each face image to be processed; inputting the K neighbor graph into a pre-trained edge score prediction model to obtain the score of each edge in the K neighbor graph, wherein the edge score prediction model is obtained by sampling a K neighbor graph with a structure-preserving subgraph sampling strategy and training a graph convolutional neural network on the sampled subgraphs; and performing a first pruning operation on the K neighbor graph according to the scores of the edges in the K neighbor graph to obtain face clusters for the plurality of face images to be processed. This solves the technical problem of insufficient face clustering accuracy in the related art.

Description

Face clustering method and device based on structure perception
Technical Field
The application relates to the technical field of artificial intelligence and deep learning in the technical field of image processing, in particular to a face clustering method and device based on structure perception.
Background
The development of face recognition technology relies on the availability of large-scale face datasets. In recent years, face datasets have grown ever larger, but as their size increases, so does the cost of labeling them.
The face clustering algorithm is an effective way to reduce labeling cost; however, in the related art, the clustering accuracy of existing clustering models still needs improvement when faced with large-scale real-world face data.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, a first objective of the present application is to provide a face clustering method based on structure perception to improve face clustering accuracy.
A second object of the present application is to propose a face clustering device based on structure perception.
A third object of the present application is to provide an electronic device.
A fourth object of the present application is to propose a non-transitory computer readable storage medium.
A fifth object of the present application is to propose a computer program product.
In order to achieve the above object, an embodiment of a first aspect of the present application provides a face clustering method, including:
acquiring a plurality of face images to be processed, extracting the face features of each face image to be processed based on a pre-trained convolutional neural network model, and constructing a K neighbor graph according to the face features of each face image to be processed;
inputting the K neighbor graph into a pre-trained edge score prediction model to obtain the score of each edge in the K neighbor graph; the edge score prediction model is obtained by sampling a K neighbor graph by using a structure-preserving sub-graph sampling strategy and training a graph convolution neural network by using a sub-graph obtained by sampling;
and performing first pruning operation on the K neighbor graph according to the scores of all edges in the K neighbor graph to obtain face clusters aiming at the plurality of face images to be processed.
To achieve the above object, an embodiment of a second aspect of the present application provides an apparatus, including:
the first acquisition module is used for acquiring a plurality of face images to be processed;
the characteristic extraction module is used for extracting the face characteristic of each face image to be processed based on a pre-trained convolutional neural network model;
the construction module is used for constructing a K neighbor graph according to the face features of each face image to be processed;
the prediction module is used for inputting the K neighbor graph into a pre-trained edge score prediction model to obtain the score of each edge in the K neighbor graph; the edge score prediction model is obtained by sampling a K neighbor graph by using a structure-preserving sub-graph sampling strategy and training a graph convolution neural network by using a sub-graph obtained by sampling;
and the pruning operation module is used for carrying out first pruning operation on the K neighbor graph according to the scores of all edges in the K neighbor graph to obtain the face clusters aiming at the plurality of face images to be processed.
According to a third aspect of the present application, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for face clustering based on structure perception according to the first aspect of the present application.
According to a fourth aspect of the present application, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method for face clustering based on structure perception of the first aspect of the present application.
According to a fifth aspect of the present application, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method for structure perception based face clustering according to the first aspect.
According to the technical scheme of the embodiment of the application, the features of the face images are extracted through the convolutional neural network to obtain accurate face features, and the K neighbor graph is generated. Subgraphs are sampled from the K neighbor graph through a structure-preserving subgraph sampling strategy; the subgraphs can represent the intra-cluster and inter-cluster relationships in the K neighbor graph, and the edge score prediction model trained on these subgraphs is used to perform the first pruning operation on the K neighbor graph, so that the face clustering is more accurate.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow diagram of a method for face clustering based on structure perception according to an embodiment of the present application;
FIG. 2 is a flow diagram of a method for face clustering based on structure perception according to another embodiment of the present application;
FIG. 3 is a schematic illustration of a second pruning operation according to one embodiment of the present application;
FIG. 4 is a flow diagram of a method for face clustering based on structure perception according to yet another embodiment of the present application;
FIG. 5 is a schematic diagram of the output of the edge score prediction model according to an embodiment of the present application;
FIG. 6 is a block diagram of a face clustering device based on structure perception according to an embodiment of the present application;
fig. 7 is a block diagram of a face clustering device based on structure perception according to another embodiment of the present application;
fig. 8 is a block diagram of an electronic device for implementing a method for face clustering based on structure perception according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The following describes a face clustering method and apparatus based on structure perception according to an embodiment of the present application with reference to the drawings.
Fig. 1 is a flowchart of a face clustering method based on structure perception according to an embodiment of the present application. It should be noted that the face clustering method based on structure perception in the embodiments of the present application can be applied to the face clustering device based on structure perception in the embodiments of the present application, and the device can be configured on the electronic device in the embodiments of the present application.
As shown in fig. 1, the method for clustering faces based on structure perception may include:
step 101, obtaining a plurality of face images to be processed, extracting face features of each face image to be processed based on a pre-trained convolutional neural network model, and constructing a K neighbor image according to the face features of each face image to be processed.
In some embodiments of the present application, face images belonging to the same class can be connected together through a K neighbor graph, where each node in the K neighbor graph represents a face image, and an edge between two nodes indicates that the corresponding face images are likely to belong to the same class. Constructing the K neighbor graph requires two inputs: the face feature of each face image to be processed and the value of K. The face features can be extracted from the face images to be processed through a pre-trained convolutional neural network model; the value of K may be set based on experience and/or the number of face images to be processed, and it affects the quality of the constructed K neighbor graph.
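As an illustration only (not part of the claimed method), the following is a minimal sketch of how such a K neighbor graph could be built from extracted face features; the use of scikit-learn, cosine similarity, and the value of K are assumptions made for the example.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_knn_graph(features: np.ndarray, k: int = 80) -> np.ndarray:
    """Build a K neighbor graph from face features.

    features: (N, D) array, one row of face features per face image.
    Returns an (N, N) boolean adjacency matrix where adj[i, j] is True
    if j is among the k nearest neighbors of i (excluding i itself).
    """
    # Normalize so that cosine similarity is a meaningful neighbor metric.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    nn = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(feats)
    _, idx = nn.kneighbors(feats)              # idx[:, 0] is each node itself
    n = feats.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), k)
    adj[rows, idx[:, 1:].reshape(-1)] = True   # connect each node to its k neighbors
    return adj
```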
It can be understood that, due to the limitations of the pre-trained convolutional neural network model and the choice of K, the K neighbor graph not only connects face images of the same class but also, with a certain probability, connects face images of different classes, so the face clustering accuracy of the K neighbor graph at this point cannot meet the requirement. In order to obtain accurate face clusters, further operations need to be performed on the K neighbor graph to remove the edges that wrongly connect face images of different classes.
Step 102, inputting the K neighbor graph into a pre-trained edge score prediction model to obtain scores of all edges in the K neighbor graph; the edge score prediction model is obtained by sampling a K neighbor graph by using a structure-preserving sub-graph sampling strategy and training a graph convolution neural network by using a sub-graph obtained by sampling.
It can be understood that, by scoring every edge in the K neighbor graph, the scores of all edges can be obtained, so that the pruning operation on the K neighbor graph can be realized by filtering on these scores, thereby obtaining more accurate face clusters.
In some embodiments of the present application, an edge score prediction model may be trained in advance in order to score each edge in the K neighbor graph. In some cases, due to the limitation of hardware computing power, a structure-preserving subgraph sampling strategy can be used to sample the K neighbor graph, and the edge score prediction model is then trained on the sampled subgraphs. In this way, an edge score prediction model suitable for the K neighbor graph can be obtained while saving computing power. Generally, in the K neighbor graph, nodes with close connections can be regarded as one cluster; there may be connections between different clusters, but these inter-cluster connections are usually not close.
For example, the structure-preserving subgraph sampling strategy may randomly sample the same number of nodes from each cluster in the K neighbor graph and preserve the connections between the sampled nodes to obtain a subgraph. Such a subgraph keeps the connections of the nodes within each cluster as well as the connections between clusters. Understandably, each edge in the subgraph connects two nodes, and the two nodes connected by one edge can be regarded as a node pair. An edge score prediction model may be obtained from the subgraph, and its initial model includes but is not limited to: a graph convolutional neural network, or a graph convolutional neural network followed by a multilayer perceptron. When the edge score prediction model is trained, the input of the model is the whole subgraph, and the output of the model is the edge score corresponding to each edge in the subgraph. It is understood that when a node pair belongs to the same class, the edge score corresponding to the node pair is 1; when the node pair does not belong to the same class, the edge score corresponding to the node pair is 0. The training may also be supervised using loss functions including, but not limited to: an exponential loss function and a cross entropy loss function.
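For illustration only, the following is a simplified sketch of an edge score predictor of the kind described above: one graph convolution step followed by a small classifier that scores each node pair. The layer sizes, the mean aggregation, and the two-layer classifier are assumptions made for the sketch, not the specific architecture of the patent.

```python
import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    """Graph convolution plus a pairwise classifier that scores edges in [0, 1]."""

    def __init__(self, in_dim: int, hid_dim: int = 256):
        super().__init__()
        self.gconv = nn.Linear(in_dim * 2, hid_dim)      # combines node and neighborhood features
        self.classifier = nn.Sequential(
            nn.Linear(hid_dim * 2, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, 1),
        )

    def forward(self, x, adj, edges):
        """
        x:     (N, D) node features of the subgraph
        adj:   (N, N) row-normalized adjacency matrix of the subgraph
        edges: (E, 2) long tensor of node-index pairs (the edges to score)
        """
        agg = adj @ x                                     # aggregate neighbor features
        h = torch.relu(self.gconv(torch.cat([x, agg], dim=1)))
        pair = torch.cat([h[edges[:, 0]], h[edges[:, 1]]], dim=1)
        return torch.sigmoid(self.classifier(pair)).squeeze(-1)  # one score per edge
```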
And 103, performing first pruning operation on the K neighbor graph according to the scores of all edges in the K neighbor graph to obtain face clusters aiming at a plurality of face images to be processed.
In some embodiments of the present application, in order to make the face clustering in the K neighbor graph more accurate, a first pruning operation may be performed on the K neighbor graph. The first pruning operation can be completed with the pre-trained edge score prediction model: the model scores each edge in the K neighbor graph, a threshold can be preset, and the score is compared with the threshold. When the score is greater than or equal to the threshold, the two nodes connected by the edge are considered to belong to the same class, and the edge can be kept; when the score is smaller than the threshold, the two nodes connected by the edge are considered not to belong to the same class, and the edge is removed. It is understood that edges connecting nodes of the same class may be called true edges, and edges connecting nodes of different classes may be called false edges. Through the first pruning operation, face clusters for the plurality of face images to be processed can be obtained.
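A minimal sketch of this first pruning step is shown below, assuming that the final clusters are read off as the connected components of the pruned graph; the threshold value of 0.5 and the use of SciPy are assumptions made for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def first_pruning(edges: np.ndarray, scores: np.ndarray, num_nodes: int,
                  threshold: float = 0.5):
    """Keep edges whose score reaches the threshold and return a cluster label per face.

    edges:  (E, 2) node-index pairs of the K neighbor graph
    scores: (E,)  edge scores predicted by the edge score prediction model
    """
    kept = edges[scores >= threshold]                    # drop low-score ("false") edges
    graph = csr_matrix(
        (np.ones(len(kept)), (kept[:, 0], kept[:, 1])),
        shape=(num_nodes, num_nodes),
    )
    # Each connected component of the pruned graph is treated as one face cluster.
    _, labels = connected_components(graph, directed=False)
    return kept, labels
```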
According to the face clustering method based on the structure perception, the face images to be processed are clustered according to the face features extracted by the convolutional neural network model, and a K neighbor graph is constructed. Through a deep learning technology, more accurate face features can be obtained, so that the face clustering of the constructed K neighbor graph is more accurate. The subgraph of the K-neighbor graph can represent the relationship of each face image in one cluster in the K-neighbor graph and can also represent the relationship between different clusters in the K-neighbor graph. After the graph convolution neural network is trained, a score prediction model can be obtained, pruning operation on the K neighbor graph is achieved through the score prediction model, false edges are removed, and more accurate face clustering is obtained.
In a second embodiment of the present application, based on the first embodiment, a second pruning operation may be performed after the first pruning in order to make the face clustering more accurate. Optionally, step 201 and step 202 below are further performed after the first pruning operation is carried out on the K neighbor graph according to the scores of the edges in the K neighbor graph.
Fig. 2 is a flowchart of a face clustering method based on structure perception according to another embodiment of the present application and illustrates how the second pruning operation is performed. The method specifically includes:
step 201, calculating the intimacy between every two nodes in the K neighbor graph after the first pruning operation.
In some embodiments of the present application, an intimacy can be defined between every two nodes of the K neighbor graph. The intimacy reflects whether the two nodes belong to the same class: the higher the intimacy, the higher the probability that the two nodes belong to the same class; the lower the intimacy, the lower that probability. Generally, two nodes connected by an inter-cluster edge are unlikely to belong to the same class, whereas two nodes connected by an intra-cluster edge are likely to belong to the same class.
For example, one way to compute the intimacy is to determine a range diameter according to the number of nodes in a cluster and, taking the cluster center as the center of a circle, regard nodes within the range diameter as having sufficient intimacy and nodes outside the range diameter as having insufficient intimacy.
In some embodiments of the present application, the intimacy between every two nodes may also be calculated by the following formula:
intimacy(N1, N2) = (k / n1 + k / n2) / 2
wherein n1 indicates the number of edges connected to node N1, n2 indicates the number of edges connected to node N2, and k indicates the number of neighbor nodes shared by node N1 and node N2.
Fig. 3 is a diagram illustrating a second pruning operation according to an embodiment of the present application, in which node A has nine edges and node B has eight edges, with one neighbor node common to both, while node C has seven edges and node D has ten edges, with six common neighbor nodes. The intimacy of A and B is therefore (1/9 + 1/8) / 2 ≈ 0.12, and the intimacy of C and D is (6/7 + 6/10) / 2 ≈ 0.73.
In some embodiments of the present application, the intimacy computation can also be implemented with matrix operations. Let A ∈ R^(N×N) be the adjacency matrix corresponding to the K neighbor graph. The numbers of common neighbor nodes of all node pairs are given by C = A·Aᵀ, where each element C_ij represents the number of common neighbors of nodes N_i and N_j. The intimacy values are then
E = (diag(sum0)·C + C·diag(sum1)) / 2
wherein sum0 = vec((Σ_j a_·j)^(-1)) and sum1 = vec((Σ_i a_i·)^(-1)); that is, sum0 is obtained by summing the matrix A over its rows and taking the reciprocal of each element before vectorizing, and sum1 is obtained by summing the matrix A over its columns and taking the reciprocal of each element before vectorizing, with vec(·) denoting vectorization.
And 202, performing second pruning operation on the K neighbor graph subjected to the first pruning operation according to the intimacy between every two nodes.
In some embodiments of the present application, the second pruning operation may be performed on the K neighbor graph obtained from the first pruning operation and can be implemented by presetting an intimacy threshold. When the intimacy between two nodes is greater than or equal to the threshold, the edge connecting the two nodes is kept; when the intimacy between two nodes is smaller than the threshold, the edge connecting the two nodes is removed.
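Under the matrix formulation described above, a minimal NumPy sketch of the intimacy computation and the second pruning could look as follows; the threshold value of 0.7 is an assumption made for the example.

```python
import numpy as np

def second_pruning(adj: np.ndarray, threshold: float = 0.7) -> np.ndarray:
    """Second pruning based on node intimacy.

    adj: (N, N) symmetric 0/1 adjacency matrix of the graph after the first pruning.
    Returns a pruned adjacency matrix keeping only edges whose endpoints
    have intimacy >= threshold.
    """
    a = adj.astype(float)
    common = a @ a.T                        # common[i, j]: number of shared neighbors k
    deg = a.sum(axis=1)                     # n_i: number of edges connected to node i
    inv = np.divide(1.0, deg, out=np.zeros_like(deg), where=deg > 0)
    # intimacy[i, j] = (k / n_i + k / n_j) / 2
    intimacy = 0.5 * (inv[:, None] * common + common * inv[None, :])
    keep = (intimacy >= threshold) & (a > 0)
    return a * keep
```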
According to the face clustering method based on structure perception, some false edges may still exist after the first pruning operation, and because these false edges wrongly connect different classes, they affect the face clustering accuracy. Therefore, the second pruning operation can be performed on the K neighbor graph after the first pruning operation according to the intimacy between every two nodes. Through the second pruning operation, the resulting face clustering is more accurate and the structure of the K neighbor graph is clearer. After the second pruning, most of the wrong edges in the K neighbor graph are deleted, and the face clustering information read from the K neighbor graph is more accurate.
In a third embodiment of the present application, based on the above embodiments, the edge score prediction model may be obtained by training in advance through the following steps.
These steps are illustrated in fig. 4, which is a flowchart of a face clustering method based on structure perception according to yet another embodiment of the present application and specifically includes:
step 401, a training sample set is obtained, where the training sample set includes a plurality of face sample images.
It can be understood that, a training sample set is usually required for performing deep learning model training, and the purpose of the present application is to obtain face clusters, and therefore, in some embodiments of the present application, the training sample set may include a plurality of face sample images.
And step 402, extracting the face features of each face sample image based on the convolutional neural network model.
It is to be understood that the subgraphs used for training the graph convolutional neural network may be sampled from the K neighbor graph. Constructing the K neighbor graph requires the face features of the face sample images. In some embodiments of the present application, a convolutional neural network model may be used to extract the face feature of each face sample image, where the face feature may be a vector.
And step 403, constructing K neighbor graph samples according to the face features of each face sample image.
It can be understood that each face sample image has its corresponding face features, and the K neighbor graph samples can be constructed according to these face features.
And step 404, sampling the K neighbor graph sample based on a structure-preserving subgraph sampling strategy to obtain a subgraph obtained after sampling, training a graph convolution neural network by using the subgraph obtained after sampling to obtain a network parameter, and generating an edge score prediction model according to the network parameter.
In some embodiments of the present application, a structure-preserving subgraph sampling strategy is used to sample the K neighbor graph samples to obtain the sampled subgraphs. A subgraph may be obtained by randomly selecting a certain number of face images from each cluster of the K neighbor graph, or by the following steps (a schematic sketch follows step four below):
step one, randomly selecting M clusters from K neighbor image samples as sampling seeds.
It is to be understood that M clusters can be randomly selected from the K-neighbor image samples as sampling seeds in units of clusters in order to model edges between face images within the clusters. The number of sampling seeds may be one or more.
Step two, for each seed cluster, selecting N nearest neighbor clusters of each seed cluster, and taking a graph formed by the M clusters and the N nearest neighbor clusters as a first sub-graph S1.
In some embodiments of the present application, in order to model the relationships between clusters, N nearest neighbor clusters may be selected for each seed cluster. The nearest neighbor clusters may be determined by a similarity measure, for example the cosine similarity between cluster centers. In order to model intra-cluster and inter-cluster relationships simultaneously, the M clusters and the N nearest neighbor clusters may be combined into a first sub-graph S1.
And step three, randomly selecting K1 clusters from the first sub-graph S1 to construct a second sub-graph S2.
In some embodiments of the present application, K1 clusters may be randomly selected from the first sub-graph S1 to construct a second sub-graph S2 in order to generalize the selected clusters.
And step four, randomly selecting K2 nodes from the second subgraph S2 as the subgraph obtained after sampling.
In some embodiments of the present application, to perform generalization on the nodes in the cluster, K2 nodes may be randomly selected from the second sub-graph S2 as the sampled sub-graph.
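Purely as an illustration of the four steps above, a schematic NumPy sketch of the sampling procedure is given below; the values of M, N, K1 and K2, the use of mean feature vectors as cluster centers, and the cosine similarity between centers are assumptions made for the example.

```python
import numpy as np

def sample_subgraph(features, cluster_ids, m=8, n=3, k1=6, k2=5000, rng=None):
    """Structure-preserving subgraph sampling.

    features:    (N, D) face features of the training set
    cluster_ids: (N,)   ground-truth cluster id of each node
    Returns the indices of the nodes forming the sampled subgraph.
    """
    if rng is None:
        rng = np.random.default_rng()
    clusters = np.unique(cluster_ids)
    centers = np.stack([features[cluster_ids == c].mean(axis=0) for c in clusters])
    centers /= np.linalg.norm(centers, axis=1, keepdims=True)

    # Step one: randomly pick M seed clusters.
    seeds = rng.choice(len(clusters), size=m, replace=False)
    # Step two: for each seed, add its N nearest neighbor clusters (cosine similarity
    # between cluster centers); together they form the first sub-graph S1.
    sim = centers[seeds] @ centers.T
    neighbors = np.argsort(-sim, axis=1)[:, 1:n + 1]     # skip the seed itself
    s1 = np.unique(np.concatenate([seeds, neighbors.reshape(-1)]))
    # Step three: randomly keep K1 of those clusters to build the second sub-graph S2.
    s2 = rng.choice(s1, size=min(k1, len(s1)), replace=False)
    # Step four: randomly keep K2 nodes inside the chosen clusters.
    node_idx = np.where(np.isin(cluster_ids, clusters[s2]))[0]
    return rng.choice(node_idx, size=min(k2, len(node_idx)), replace=False)
```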
The subgraphs obtained by the structure-preserving subgraph sampling strategy can then be used to train the graph convolutional neural network, and the training method may comprise the following steps (a minimal sketch follows step three below):
Step one, inputting the subgraph obtained after sampling into the graph convolutional neural network to obtain a score prediction value for each edge in the subgraph.
In some embodiments of the present application, the score of an edge should be 1 when the nodes connected by the edge belong to the same class, and 0 when the face images connected by the edge belong to different classes.
And step two, calculating a loss value between the predicted score of each edge and the ground-truth score of the corresponding edge based on a cross entropy loss function.
In some embodiments of the present application, there may be a loss value between the predicted score of each edge and its ground-truth score, and a cross entropy loss function may be used to calculate this loss value.
And step three, training the graph convolution neural network according to the loss value.
It will be appreciated that the graph convolution neural network may be trained on the loss values.
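As an illustrative sketch of these three steps only, one training step on a sampled subgraph could look as follows, reusing the EdgeScorer sketch above and binary cross entropy as the loss; the optimizer and any hyperparameters are assumptions.

```python
import torch

def train_step(model, optimizer, x, adj, edges, same_class):
    """One training step of the edge score prediction model on a sampled subgraph.

    x:          (N, D) node features, adj: (N, N) normalized adjacency of the subgraph
    edges:      (E, 2) node pairs connected in the subgraph
    same_class: (E,)   ground-truth edge scores: 1.0 if both faces share a label, else 0.0
    """
    model.train()
    optimizer.zero_grad()
    pred = model(x, adj, edges)                      # step one: predicted edge scores
    loss = torch.nn.functional.binary_cross_entropy(pred, same_class)  # step two: cross entropy
    loss.backward()                                  # step three: train with the loss value
    optimizer.step()
    return loss.item()
```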
Through training, the network parameters can be obtained, and the edge score prediction model can be generated according to the obtained network parameters. The output of the edge score prediction model is shown in fig. 5, where the horizontal axis represents the output of the edge score prediction model: the closer the output is to 0, the smaller the probability that the two face images connected by the edge belong to the same class; the closer it is to 1, the greater the probability that they belong to the same class.
According to the face clustering method based on structure perception, the K neighbor graph is constructed, and the structure-preserving subgraph sampling strategy is used to process the K neighbor graph and obtain subgraphs, from which the edge score prediction model is obtained. In this method, the subgraphs preserve the important structural information of the whole K neighbor graph, keeping the highly correlated connections within clusters and the weakly correlated connections between clusters. Randomness is also introduced for generalization, which can enhance the performance of the model.
According to the embodiment of the application, the application also provides a face clustering device based on structure perception.
Fig. 6 is a block diagram of a face clustering device based on structure perception according to an embodiment of the present application. As shown in fig. 6, the face clustering apparatus 600 based on structure perception may include: a first obtaining module 601, a feature extraction module, a constructing module 602, a predicting module 603 and a pruning operation module 604.
Specifically, the first obtaining module 601 is configured to obtain a plurality of face images to be processed;
the feature extraction module is used for extracting the face features of each face image to be processed based on a pre-trained convolutional neural network model;
a constructing module 602, configured to construct a K neighbor graph according to the face features of each face image to be processed;
the prediction module 603 is configured to input the K neighbor graph into a pre-trained edge score prediction model, and obtain scores of each edge in the K neighbor graph; the edge score prediction model is obtained by sampling a K neighbor graph by using a structure-preserving sub-graph sampling strategy and training a graph convolution neural network by using a sub-graph obtained by sampling;
and the pruning operation module 604 is configured to perform a first pruning operation on the K neighbor graph according to the score of each edge in the K neighbor graph, so as to obtain face clusters for the plurality of face images to be processed.
Fig. 7 is a block diagram of a face clustering device based on structure perception according to another embodiment of the present application. In some embodiments of the present application, as shown in fig. 7, the face clustering apparatus based on structure perception further includes: a pre-training module 705.
Specifically, the pre-training module 705 is configured to pre-train the edge score prediction model; wherein the pre-training module is specifically configured to: acquire a training sample set, wherein the training sample set comprises a plurality of face sample images; extract the face features of each face sample image based on the convolutional neural network model; construct K neighbor graph samples according to the face features of each face sample image; and sample the K neighbor graph samples based on the structure-preserving subgraph sampling strategy to obtain the sampled subgraphs, train the graph convolutional neural network by using the sampled subgraphs to obtain network parameters, and generate the edge score prediction model according to the network parameters.
Wherein modules 701 to 704 in fig. 7 have the same functions and structures as modules 601 to 604 in fig. 6.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
There is also provided, in accordance with an embodiment of the present application, an electronic device, a readable storage medium, and a computer program product.
FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the respective methods and processes described above, such as the face clustering method based on structure perception. For example, in some embodiments, the face clustering method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When loaded into the RAM 803 and executed by the computing unit 801, the computer program may perform one or more steps of the face clustering method described above. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the face clustering method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present application may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The Server can be a cloud Server, also called a cloud computing Server or a cloud host, and is a host product in a cloud computing service system, so as to solve the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service ("Virtual Private Server", or simply "VPS"). The server may also be a server of a distributed system, or a server incorporating a blockchain.
According to the face clustering method based on the structure perception, the face images to be processed are clustered according to the face features extracted by the convolutional neural network model, and a K neighbor graph is constructed. Through a deep learning technology, more accurate face features can be obtained, so that the face clustering of the constructed K neighbor graph is more accurate. The subgraph of the K-neighbor graph can represent the relationship of each face image in one cluster in the K-neighbor graph and can also represent the relationship between different clusters in the K-neighbor graph. After the graph convolution neural network is trained, a score prediction model can be obtained, the first pruning operation on the K neighbor graph is realized through the score prediction model, the edges which are connected in error are removed, and more accurate face clustering is obtained.
After the first pruning operation, the K neighbor graph can be subjected to second pruning operation, false edges are removed through intimacy degree calculation, the obtained face clustering is more accurate, and the structure of the K neighbor graph is clearer. After the second pruning, most false edges in the K neighbor graph are deleted, and the face clustering information read from the K neighbor graph is more accurate.
In the process of constructing the K neighbor graph, the structure-preserving subgraph sampling strategy can be used to process the K neighbor graph and obtain subgraphs, from which the edge score prediction model is obtained. In this method, the subgraphs preserve the important structural information of the whole K neighbor graph, keeping the highly correlated connections within clusters and the weakly correlated connections between clusters. Randomness is also introduced for generalization, which can enhance the performance of the model.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A face clustering method based on structure perception is characterized by comprising the following steps:
acquiring a plurality of face images to be processed, extracting the face features of each face image to be processed based on a pre-trained convolutional neural network model, and constructing a K neighbor graph according to the face features of each face image to be processed;
inputting the K neighbor graph into a pre-trained edge score prediction model to obtain the score of each edge in the K neighbor graph; the edge score prediction model is obtained by sampling a K neighbor graph by using a structure-preserving sub-graph sampling strategy and training a graph convolution neural network by using a sub-graph obtained by sampling;
and performing first pruning operation on the K neighbor graph according to the scores of all edges in the K neighbor graph to obtain face clusters aiming at the plurality of face images to be processed.
2. The method of claim 1, wherein after the first pruning operation on the K-neighbor graph according to the scores of the edges in the K-neighbor graph, the method further comprises:
calculating the intimacy between every two nodes in the K neighbor graph subjected to the first pruning operation;
and carrying out secondary pruning operation on the K neighbor graph subjected to the primary pruning operation according to the intimacy between every two nodes.
3. The method of claim 2, wherein the intimacy between every two nodes is calculated by the following formula:
intimacy(N1, N2) = (k / n1 + k / n2) / 2
wherein n1 indicates the number of edges connected to node N1, n2 indicates the number of edges connected to node N2, and k indicates the number of neighbor nodes shared by node N1 and node N2.
4. The method according to any one of claims 1 to 3, wherein the edge score prediction model is obtained by training in advance by adopting the following steps:
acquiring a training sample set, wherein the training sample set comprises a plurality of face sample images;
extracting the face features of each face sample image based on the convolutional neural network model;
constructing K neighbor graph samples according to the face features of each face sample image;
sampling K neighbor graph samples based on the structure-preserving sub-graph sampling strategy to obtain sub-graphs obtained after sampling, training a graph convolution neural network by using the sub-graphs obtained after sampling to obtain network parameters, and generating the edge score prediction model according to the network parameters.
5. The method of claim 4, wherein the sampling K-neighbor graph samples based on the structure-preserving sub-graph sampling strategy to obtain sampled sub-graphs comprises:
randomly selecting M clusters from the K neighbor graph samples as sampling seeds;
for each seed cluster, selecting N nearest neighbor clusters of each seed cluster, and taking a graph formed by the M clusters and the N nearest neighbor clusters as a first sub-graph S1;
randomly selecting K1 clusters from the first sub-graph S1 to construct a second sub-graph S2;
and randomly selecting K2 nodes from the second subgraph S2 as the subgraph obtained after sampling.
6. The method of claim 4, wherein the training of the graph convolution neural network using the sampled subgraph comprises:
inputting the sub-graph obtained after sampling into the graph convolution neural network to obtain a fraction prediction value of each edge in the sub-graph;
calculating loss values between the fraction predicted values of the edges and the real fraction values of the corresponding edges based on a cross entropy loss function;
and training the graph convolution neural network according to the loss value.
7. A face clustering device based on structure perception is characterized by comprising:
the first acquisition module is used for acquiring a plurality of face images to be processed;
the characteristic extraction module is used for extracting the face characteristic of each face image to be processed based on a pre-trained convolutional neural network model;
the construction module is used for constructing a K neighbor graph according to the face features of each face image to be processed;
the prediction module is used for inputting the K neighbor graph into a pre-trained edge score prediction model to obtain the score of each edge in the K neighbor graph; the edge score prediction model is obtained by sampling a K neighbor graph by using a structure-preserving sub-graph sampling strategy and training a graph convolution neural network by using a sub-graph obtained by sampling;
and the pruning operation module is used for carrying out first pruning operation on the K neighbor graph according to the scores of all edges in the K neighbor graph to obtain the face clusters aiming at the plurality of face images to be processed.
8. The apparatus of claim 7, wherein the pruning operation module is further configured to:
after carrying out first pruning operation on the K neighbor graph according to the scores of all edges in the K neighbor graph, calculating the intimacy between every two nodes in the K neighbor graph subjected to the first pruning operation;
and carrying out secondary pruning operation on the K neighbor graph subjected to the primary pruning operation according to the intimacy between every two nodes.
9. The apparatus of claim 7 or 8, further comprising:
the pre-training module is used for pre-training the edge score prediction model; wherein the pre-training module is specifically configured to:
acquiring a training sample set, wherein the training sample set comprises a plurality of face sample images;
extracting the face features of each face sample image based on the convolutional neural network model;
constructing K neighbor graph samples according to the face features of each face sample image;
sampling K neighbor graph samples based on the structure-preserving sub-graph sampling strategy to obtain sub-graphs obtained after sampling, training a graph convolution neural network by using the sub-graphs obtained after sampling to obtain network parameters, and generating the edge score prediction model according to the network parameters.
10. The apparatus of claim 9, wherein the pre-training module is specifically configured to:
randomly selecting M clusters from the K neighbor graph samples as sampling seeds;
for each seed cluster, selecting N nearest neighbor clusters of each seed cluster, and taking a graph formed by the M clusters and the N nearest neighbor clusters as a first sub-graph S1;
randomly selecting K1 clusters from the first sub-graph S1 to construct a second sub-graph S2;
and randomly selecting K2 nodes from the second subgraph S2 as the subgraph obtained after sampling.
CN202110272409.9A 2021-03-12 2021-03-12 Face clustering method and device based on structure perception Active CN112766421B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110272409.9A CN112766421B (en) 2021-03-12 2021-03-12 Face clustering method and device based on structure perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110272409.9A CN112766421B (en) 2021-03-12 2021-03-12 Face clustering method and device based on structure perception

Publications (2)

Publication Number Publication Date
CN112766421A true CN112766421A (en) 2021-05-07
CN112766421B CN112766421B (en) 2024-09-24

Family

ID=75691348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110272409.9A Active CN112766421B (en) 2021-03-12 2021-03-12 Face clustering method and device based on structure perception

Country Status (1)

Country Link
CN (1) CN112766421B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080304755A1 (en) * 2007-06-08 2008-12-11 Microsoft Corporation Face Annotation Framework With Partial Clustering And Interactive Labeling
WO2020119053A1 (en) * 2018-12-11 2020-06-18 平安科技(深圳)有限公司 Picture clustering method and apparatus, storage medium and terminal device
CN110458078A (en) * 2019-08-05 2019-11-15 高新兴科技集团股份有限公司 A kind of face image data clustering method, system and equipment
WO2021027193A1 (en) * 2019-08-12 2021-02-18 佳都新太科技股份有限公司 Face clustering method and apparatus, device and storage medium
CN112101086A (en) * 2020-07-24 2020-12-18 南京航空航天大学 Face clustering method based on link prediction

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361402A (en) * 2021-06-04 2021-09-07 北京百度网讯科技有限公司 Training method of recognition model, method, device and equipment for determining accuracy
CN113361402B (en) * 2021-06-04 2023-08-18 北京百度网讯科技有限公司 Training method of recognition model, method, device and equipment for determining accuracy
CN113901904A (en) * 2021-09-29 2022-01-07 北京百度网讯科技有限公司 Image processing method, face recognition model training method, device and equipment
CN114511905A (en) * 2022-01-20 2022-05-17 哈尔滨工程大学 Face clustering method based on graph convolution neural network
CN115083003A (en) * 2022-08-23 2022-09-20 浙江大华技术股份有限公司 Clustering network training and target clustering method, device, terminal and storage medium
CN115083003B (en) * 2022-08-23 2022-11-22 浙江大华技术股份有限公司 Clustering network training and target clustering method, device, terminal and storage medium

Also Published As

Publication number Publication date
CN112766421B (en) 2024-09-24

Similar Documents

Publication Publication Date Title
CN113657465B (en) Pre-training model generation method and device, electronic equipment and storage medium
CN112766421B (en) Face clustering method and device based on structure perception
CN113642431B (en) Training method and device of target detection model, electronic equipment and storage medium
CN112560985B (en) Neural network searching method and device and electronic equipment
CN113222942A (en) Training method of multi-label classification model and method for predicting labels
CN112862005B (en) Video classification method, device, electronic equipment and storage medium
CN114677565B (en) Training method and image processing method and device for feature extraction network
CN114648676A (en) Point cloud processing model training and point cloud instance segmentation method and device
CN113313053A (en) Image processing method, apparatus, device, medium, and program product
CN113657289A (en) Training method and device of threshold estimation model and electronic equipment
CN110796135A (en) Target positioning method and device, computer equipment and computer storage medium
CN112580666A (en) Image feature extraction method, training method, device, electronic equipment and medium
CN113033458A (en) Action recognition method and device
CN114399780B (en) Form detection method, form detection model training method and device
CN113887630A (en) Image classification method and device, electronic equipment and storage medium
CN113963011A (en) Image recognition method and device, electronic equipment and storage medium
CN112561061A (en) Neural network thinning method, apparatus, device, storage medium, and program product
CN114898881A (en) Survival prediction method, device, equipment and storage medium
CN113536751B (en) Processing method and device of form data, electronic equipment and storage medium
CN115439916A (en) Face recognition method, apparatus, device and medium
CN115116080A (en) Table analysis method and device, electronic equipment and storage medium
CN114943608A (en) Fraud risk assessment method, device, equipment and storage medium
CN115019057A (en) Image feature extraction model determining method and device and image identification method and device
CN114330576A (en) Model processing method and device, and image recognition method and device
CN113590774A (en) Event query method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant