CN114511905A - Face clustering method based on graph convolution neural network - Google Patents

Face clustering method based on graph convolution neural network

Info

Publication number
CN114511905A
CN114511905A CN202210066025.6A CN202210066025A CN114511905A CN 114511905 A CN114511905 A CN 114511905A CN 202210066025 A CN202210066025 A CN 202210066025A CN 114511905 A CN114511905 A CN 114511905A
Authority
CN
China
Prior art keywords
node
nodes
density
graph
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210066025.6A
Other languages
Chinese (zh)
Inventor
初妍
李龙
赵庆超
莫士奇
李思纯
李松
时洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN202210066025.6A priority Critical patent/CN114511905A/en
Publication of CN114511905A publication Critical patent/CN114511905A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention belongs to the technical field of face clustering, and particularly relates to a face clustering method based on a graph convolution neural network. First, features are extracted from the face data, each face feature is taken as a node, and the local density values of all nodes are calculated. The data are then divided into high-density nodes and low-density nodes based on the local density values; each high-density node is connected to high-density nodes among its nearest neighbors to form a plurality of cluster centers. An adaptive subgraph is constructed for the low-density nodes and used as the input of a graph convolution neural network to predict the connectivity between nodes. Finally, the two parts are combined, and edges that do not meet the requirement are cut off by pseudo-label propagation to obtain the final clustering result. The invention divides the data into two parts based on density and constructs subgraphs for inference only on the low-density part, which improves clustering efficiency; at the same time, the adaptive subgraph extracts richer context information, making the inference on the subgraph more accurate and improving clustering accuracy.

Description

Face clustering method based on graph convolution neural network
Technical Field
The invention belongs to the technical field of face clustering, and particularly relates to a face clustering method based on a graph convolution neural network.
Background
With the development of science and technology, face recognition technology has gradually become widespread and is applied in fields such as aerospace, entertainment, education and security, and its performance keeps improving. However, the high accuracy of current face recognition depends on large-scale labeled face data sets: the more labeled face data are available, the better the effect. Although unlabeled data can easily be obtained from the internet, manual labeling is time-consuming and expensive. Face clustering is a good choice for exploiting this unlabeled data.
At present, face clustering technology is widely applied; common scenarios include grouping and tagging faces in photo albums, data cleaning during the construction of large-scale data sets, and management of new retail customers. Traditional clustering methods have limitations: Spectral Clustering is sensitive to parameter selection, K-means requires the data set to be convex, and DBSCAN requires a uniform data distribution. Their results on face clustering are often poor, and their time complexity is too high on large-scale face data sets. In contrast, many link prediction methods place no requirement on the data structure and can obtain better clustering results. However, link prediction also suffers from high time complexity, especially when facing large-scale face data sets. A link prediction method takes each face image, after feature extraction, as a node, constructs a subgraph for each node as input, and infers over the subgraph with a graph convolution neural network to judge the connectivity between nodes and thus obtain clusters. The constructed subgraphs, however, are highly overlapping, which not only wastes computing resources on redundancy but also hurts clustering efficiency.
Disclosure of Invention
The invention aims to provide a face clustering method based on a graph convolutional neural network.
A face clustering method based on a graph convolution neural network comprises the following steps:
step 1: processing an original face data set, and extracting face features;
step 2: regarding each face feature as a node, and calculating k neighbor nodes of the nearest neighbor of each node by adopting nearest neighbor search;
step 3: calculating the local density value of each node;
for each node i, traverse its k nearest neighbor nodes and count how many of them lie at a Euclidean distance to node i smaller than the truncation distance E0; this count is the local density value ρi of node i, and the process is expressed as:
ρi = Σ_{j∈kNN(i)} χ(E0 − dij)
where E0 represents the preset truncation distance and dij represents the Euclidean distance between node i and node j;
χ(x) = 1 if x > 0, and χ(x) = 0 otherwise;
step 3: sorting the nodes in descending order of local density value, and dividing the nodes into high-density nodes and low-density nodes according to a set threshold;
step 4: the high-density nodes form the central areas of clusters, have strong connectivity, and belong to the strongly connected region; traverse each node of the strongly connected region: if, among the m nodes nearest to a node, there exists a node that also belongs to the strongly connected region and the distance between the two nodes is smaller than the truncation distance E0, the two nodes are assigned to the same high-density cluster center; this process continues until all nodes in the strongly connected region have been traversed, so that the high-density nodes form different cluster centers; m is a preset value not greater than k;
step 5: processing the low-density nodes, constructing an adaptive subgraph for them, and reasoning over the adaptive subgraph with a graph convolution neural network to obtain the connectivity between nodes;
step 5.1: constructing an adaptive subgraph for the low-density nodes; after construction, a feature matrix F = [f1, f2, f3, ..., f|V|]^T and an adjacency matrix A ∈ R^(|V|×|V|) are obtained; V represents the set of nodes in the adaptive subgraph;
step 5.2: inputting the feature matrix and the adjacency matrix into the graph convolution neural network; the graph convolution network aggregates node context information to judge whether a node is connected to the central pivot node, and the graph convolution layer outputs a transformed feature matrix Y;
Y=σ([F||GF]W)
where W is a learnable weight parameter matrix; G is the aggregation matrix, computed from the adjacency matrix A and the diagonal matrix Λ; the operator || denotes concatenation of matrices along the feature dimension; σ denotes the ReLU activation function;
step 5.3: inputting the feature matrix Y into a full connection layer of the graph convolution network for classification operation;
y=softmax(Wx+b)
wherein y represents the probability of a positive or negative sample; b is a bias term; the softmax function performs a normalization operation; x represents a feature vector in the feature matrix Y;
step 5.4: establishing an edge between the central node and the first-order neighbor, wherein the connectivity of the edge is the probability that the neighbor node is a positive sample;
step 6: combining the results obtained in step 4 and step 5, cutting off edges that do not meet the requirement through multiple iterations; the remaining connected components form the final clusters.
Further, the method for constructing the adaptive sub-graph for the low-density nodes in the step 5.1 specifically includes:
step 5.1.1: for a low-density node p, forming a node set V by a second-order neighbor of the low-density node p;
step 5.1.2: traversing all first-order neighbor nodes of the node p, calculating the maximum value of the Euclidean distance between each node and k nearest neighbor nodes of each node, and calculating the average value of the maximum values to obtain a self-adaptive distance threshold dis;
step 5.1.3: adding edges to the adaptive subgraph; the finally obtained adaptive subgraph is composed of the node set V and the edge set E and is denoted (V, E);
for any node q in the node set V, potential edges are searched among the k nearest neighbors of q; if a node r appears in V and its Euclidean distance to node q is smaller than dis, the edge (q, r) is added to the edge set E; for node pairs whose Euclidean distance is larger than dis, the value at the corresponding position of the adjacency matrix is directly set to 0 and no edge is added.
Further, in step 6, the results obtained in step 4 and step 5 are combined to obtain a set of edges; the weight of each edge is the final prediction result and indicates the likelihood that a connection exists between the pair of nodes. To obtain the final clusters, a pseudo-label propagation strategy is used: in each iteration, edges with weight smaller than a threshold are cut off, and the instances connected by the remaining edges form clusters; if a cluster is larger than the predefined maximum cluster size, it is added to a queue so that it can be processed in the next iteration. During the iterations the edge-cutting threshold gradually increases, and the iteration stops when the queue is empty, since this means that the clustering is complete.
The invention has the beneficial effects that:
Firstly, features are extracted from the face data, the features are taken as nodes, and the local density values of all nodes are calculated; then, based on the local density values, the data are divided into high-density nodes and low-density nodes, and each high-density node is connected to high-density nodes among its nearest neighbors to form a plurality of cluster centers; an adaptive subgraph is constructed for the low-density nodes, encoded as an adjacency matrix, and used as the input of a graph convolution neural network to predict the connectivity between nodes; finally, the two parts are combined and pseudo-label propagation is applied to cut off edges that do not meet the requirement, yielding the final clustering result. The invention divides the data into two parts based on density and constructs subgraphs for inference only on the low-density part, which reduces the input data scale of the graph convolution neural network, reduces the number of subgraphs that need to be inferred, and improves clustering efficiency; at the same time, the adaptive subgraph extracts richer context information, making the inference on the subgraph more accurate and improving clustering accuracy.
Drawings
FIG. 1 is a diagram of the construction of an adaptive subgraph in the present invention.
Fig. 2 is a framework diagram of the present invention.
Fig. 3 is a flow chart of the present invention.
FIG. 4 is a pseudo-code diagram of a specific algorithm of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention aims to solve the face clustering problem, overcome the defects of the prior art, and provide a face clustering method based on a graph convolutional neural network. The invention takes each face image, after feature extraction, as a node and divides the data into two parts based on the local density value of the node: a high-density part and a low-density part. The high-density part lies near the cluster centers and has strong connectivity, so it belongs to the strongly connected region. An adaptive subgraph only needs to be constructed for the low-density nodes; a graph convolution neural network is used to reason over the constructed subgraph and judge the connectivity of edges, and finally the final clustering is obtained by combining the high-density and low-density parts and performing multiple iterations.
A face clustering method based on a graph convolution neural network comprises the following steps:
step 1: processing an original face data set, and extracting face features;
step 2: regarding each face feature as a node, and calculating k neighbor nodes of the nearest neighbor of each node by adopting nearest neighbor search;
step 3: calculating the local density value of each node;
for each node i, traverse its k nearest neighbor nodes and count how many of them lie at a Euclidean distance to node i smaller than the truncation distance E0; this count is the local density value ρi of node i, and the process is expressed as:
ρi = Σ_{j∈kNN(i)} χ(E0 − dij)
where E0 represents the preset truncation distance and dij represents the Euclidean distance between node i and node j;
χ(x) = 1 if x > 0, and χ(x) = 0 otherwise;
step 3: sorting the nodes in descending order of local density value, and dividing the nodes into high-density nodes and low-density nodes according to a set threshold;
step 4: the high-density nodes form the central areas of clusters, have strong connectivity, and belong to the strongly connected region; traverse each node of the strongly connected region: if, among the m nodes nearest to a node, there exists a node that also belongs to the strongly connected region and the distance between the two nodes is smaller than the truncation distance E0, the two nodes are assigned to the same high-density cluster center; this process continues until all nodes in the strongly connected region have been traversed, so that the high-density nodes form different cluster centers; m is a preset value not greater than k;
step 5: processing the low-density nodes, constructing an adaptive subgraph for them, and reasoning over the adaptive subgraph with a graph convolution neural network to obtain the connectivity between nodes;
step 5.1: constructing an adaptive subgraph for the low-density nodes; after construction, a feature matrix F = [f1, f2, f3, ..., f|V|]^T and an adjacency matrix A ∈ R^(|V|×|V|) are obtained; V represents the set of nodes in the adaptive subgraph;
the method for constructing the self-adaptive sub-graph for the low-density nodes comprises the following specific steps:
step 5.1.1: for a low-density node p, forming a node set V by a second-order neighbor of the low-density node p;
step 5.1.2: traversing all first-order neighbor nodes of the node p, calculating the maximum value of the Euclidean distance between each node and k nearest neighbor nodes of each node, and calculating the average value of the maximum values to obtain a self-adaptive distance threshold dis;
step 5.1.3: adding edges to the adaptive subgraph; the finally obtained adaptive subgraph is composed of the node set V and the edge set E and is denoted (V, E);
for any node q in the node set V, potential edges are searched among the k nearest neighbors of q; if a node r appears in V and its Euclidean distance to node q is smaller than dis, the edge (q, r) is added to the edge set E; for node pairs whose Euclidean distance is larger than dis, the value at the corresponding position of the adjacency matrix is directly set to 0 and no edge is added;
step 5.2: inputting the feature matrix and the adjacency matrix into the graph convolution neural network; the graph convolution network aggregates node context information to judge whether a node is connected to the central pivot node, and the graph convolution layer outputs a transformed feature matrix Y;
Y=σ([F||GF]W)
where W is a learnable weight parameter matrix; G is the aggregation matrix, computed from the adjacency matrix A and the diagonal matrix Λ; the operator || denotes concatenation of matrices along the feature dimension; σ denotes the ReLU activation function;
step 5.3: inputting the feature matrix Y into a full connection layer of the graph convolution network for classification operation;
y=softmax(Wx+b)
wherein y represents the probability of a positive or negative sample; b is a bias term; the softmax function performs a normalization operation; x represents a feature vector in the feature matrix Y;
step 5.4: establishing an edge between the central node and the first-order neighbor, wherein the connectivity of the edge is the probability that the neighbor node is a positive sample;
step 6: combining the results obtained in step 4 and step 5, and finally cutting off the edges that do not meet the requirements through multiple iterations; the remaining connected components form the final clusters.
In step 6, the results obtained in step 4 and step 5 are combined to obtain a set of edges; the weight of each edge is the final prediction result and represents the likelihood that a connection exists between the pair of nodes. To obtain the final clusters, a pseudo-label propagation strategy is used: in each iteration, edges with weight smaller than a threshold are cut off, and the instances connected by the remaining edges form clusters; if a cluster is larger than the predefined maximum cluster size, it is added to a queue so that it can be processed in the next iteration. During the iterations the edge-cutting threshold gradually increases, and the iteration stops when the queue is empty, since this means that the clustering is complete.
Example 1:
Step 1: process the original face data set, extract the face features, and regard each feature as a node. After feature extraction by a convolutional neural network, the original face data set is expressed as X = [x1, x2, ..., xN]^T ∈ R^(N×D), where xi represents the feature of the i-th node, N represents the number of images, and D represents the feature dimension.
Step 2: according to the features extracted in Step 1, compute the k nearest neighbor nodes of each node using nearest-neighbor search. When the nearest-neighbor search of the nodes is completed, the density partition can be performed next; the local density value of each node needs to be calculated first. The local density value of a node is calculated as follows:
ρi = Σ_{j∈kNN(i)} χ(E0 − dij)
where E0 denotes the preset truncation distance, dij denotes the Euclidean distance between node i and node j, and ρi denotes the local density value of node i; χ is an indicator function calculated as follows:
χ(x) = 1 if x > 0, and χ(x) = 0 otherwise
That is, for each node, its k nearest nodes are traversed and the number of them lying within the truncation distance of the pivot node is counted. After the local density values of all nodes have been calculated, the node density values are expressed as d = [d1, d2, ..., dN].
Step 3: divide the nodes into two parts according to density: a high-density part and a low-density part; the high-density part is processed to form a plurality of cluster centers.
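For illustration, the nearest-neighbor search and local density computation of Steps 2 and 3 can be sketched in Python as follows; the brute-force distance computation and the helper names are assumptions chosen for clarity, not a prescribed implementation.

```python
import numpy as np

def knn_search(X, k):
    """Brute-force k-nearest-neighbor search over feature vectors.

    Returns, for every node, the indices and Euclidean distances of its
    k nearest neighbors (the node itself is excluded).
    """
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                 # exclude self-matches
    nbr_idx = np.argsort(dists, axis=1)[:, :k]
    nbr_dist = np.take_along_axis(dists, nbr_idx, axis=1)
    return nbr_idx, nbr_dist

def local_density(nbr_dist, e0):
    """Local density rho_i: how many of the k nearest neighbors of node i
    lie within the truncation distance e0."""
    return (nbr_dist < e0).sum(axis=1)

# Toy usage with random stand-ins for extracted face features
X = np.random.randn(100, 256).astype(np.float32)
X /= np.linalg.norm(X, axis=1, keepdims=True)       # L2-normalize features
nbr_idx, nbr_dist = knn_search(X, k=10)
rho = local_density(nbr_dist, e0=1.0)                # d = [d1, d2, ..., dN]
```

For large-scale data an approximate nearest-neighbor library would normally replace the brute-force search, but the density definition stays the same.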
Step 4: process the low-density nodes and construct an adaptive subgraph (ASG) for each of them; after construction, a feature matrix F = [f1, f2, f3, ..., f|V|]^T and an adjacency matrix A ∈ R^(|V|×|V|) are obtained. The feature matrix and the adjacency matrix are input into the graph convolution neural network, which aggregates node context information so that it can be judged whether a connection exists between a node and the central pivot node. The input of the graph convolution layer is the feature matrix F and the adjacency matrix A, the output is the transformed feature matrix Y, and the specific calculation is as follows:
Y=σ([F||GF]W)
where F ∈ R^(N×din) is the input feature matrix, N is the number of nodes in the subgraph, din is the feature dimension of the input, dout is the feature dimension of the output, and W ∈ R^((2·din)×dout) is a learnable weight parameter matrix; G is the aggregation matrix, computed from the adjacency matrix A and the diagonal matrix Λ. The operator || denotes concatenation of matrices along the feature dimension, and σ denotes the ReLU activation function. The transformed features are then classified by a fully connected layer, taking the transformed features Y as input; the specific calculation is as follows:
y=softmax(Wx+b)
y represents the probability of positive and negative samples, W is a learnable parameter, b is a bias term, and softmax performs a normalization operation, which is calculated as follows:
softmax(z)j = exp(zj) / Σ_{i=1..k} exp(zi)
where zj is the input score corresponding to the j-th class and k is the total number of classes. Finally, an edge is established between the central node and each first-order neighbor, and the connectivity of the edge is the probability that the neighbor node is a positive sample.
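A compact numerical sketch of the graph convolution layer and the fully connected classifier described above is given below; mean aggregation G = Λ^(-1)A is assumed here (the patent only specifies that G is an aggregation matrix built from A and a diagonal matrix Λ), and the random weights stand in for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)           # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gcn_layer(F, A, W):
    """One layer of Y = sigma([F || G F] W) with assumed mean aggregation."""
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                             # guard isolated nodes
    G = A / deg                                     # row-normalized adjacency
    return relu(np.concatenate([F, G @ F], axis=1) @ W)

def classify(Y, W_fc, b_fc):
    """Fully connected layer + softmax: positive/negative probabilities."""
    return softmax(Y @ W_fc + b_fc)

# Toy subgraph: 5 nodes with 16-dimensional features
F = rng.standard_normal((5, 16)).astype(np.float32)
A = (rng.random((5, 5)) < 0.4).astype(np.float32)
np.fill_diagonal(A, 0.0)

W1 = rng.standard_normal((32, 8)).astype(np.float32)   # (2*d_in) x d_out
W_fc = rng.standard_normal((8, 2)).astype(np.float32)  # d_out x 2 classes
b_fc = np.zeros(2, dtype=np.float32)

probs = classify(gcn_layer(F, A, W1), W_fc, b_fc)       # shape (5, 2)
```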
Step 5: combine the results obtained in Step 3 and Step 4 to obtain a set of edges, where the weight of each edge is the final prediction result and represents the likelihood that a connection exists between the pair of nodes. To obtain the final clusters, a pseudo-label propagation strategy is used. In each iteration, edges with weights below a threshold are cut off, and the instances connected by the remaining edges constitute clusters; if one of the clusters is larger than the predefined maximum cluster size, it is added to the queue for processing in the next iteration. In the iterative process, the edge-cutting threshold is gradually increased. The iteration stops when the queue is empty, since this means that the clustering has been completed.
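The iterative pseudo-label propagation can be sketched as follows; the threshold schedule, the queue handling, and the parameter values are illustrative assumptions.

```python
import numpy as np
from collections import deque

def pseudo_label_propagation(num_nodes, edges, start_tau=0.5, step=0.05,
                             max_cluster_size=50):
    """Cut low-weight edges iteratively and grow clusters from what remains.

    `edges` is a list of (u, v, weight) tuples combining the high-density
    cluster-center links with the GCN-predicted links.  Components larger
    than max_cluster_size are re-queued and re-cut at a higher threshold.
    """
    labels = np.full(num_nodes, -1)
    next_label = 0
    queue = deque([(list(range(num_nodes)), start_tau)])

    while queue:
        nodes, tau = queue.popleft()
        node_set = set(nodes)
        adj = {u: [] for u in nodes}
        for u, v, w in edges:                       # keep confident edges only
            if u in node_set and v in node_set and w >= tau:
                adj[u].append(v)
                adj[v].append(u)
        seen = set()
        for s in nodes:                             # connected components (BFS)
            if s in seen:
                continue
            comp, frontier = [s], deque([s])
            seen.add(s)
            while frontier:
                x = frontier.popleft()
                for y in adj[x]:
                    if y not in seen:
                        seen.add(y)
                        comp.append(y)
                        frontier.append(y)
            if len(comp) > max_cluster_size:
                queue.append((comp, tau + step))    # retry with higher threshold
            else:
                labels[comp] = next_label           # final cluster
                next_label += 1
    return labels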
Further, the specific process in Step 3 of dividing the data into high-density and low-density parts based on density and forming a plurality of cluster centers from the high-density part is as follows (a sketch is given after this list):
The first step: sort the nodes in descending order of local density value and divide the data into two parts by a threshold: high-density nodes and low-density nodes; the high-density nodes lie in the cluster center areas, have strong connectivity, and belong to the strongly connected region;
The second step: traverse each node of the strongly connected region; if, among the m nodes nearest to a node, there exists a node that also belongs to the strongly connected region and the distance between the two nodes is smaller than the truncation distance E0, the two nodes are assigned to the same high-density cluster center; this process continues until all nodes in the strongly connected region have been traversed, and the high-density parts form different cluster centers.
Further, the specific process in Step 4 of constructing an adaptive subgraph for a low-density node is as follows (see the sketch after this list):
The first step: given a node p, its second-order neighbors form the node set V; the adaptive subgraph consists of the nodes in V, and the ASG is denoted G(V, E), where V represents the node set of the subgraph and E represents the edge set of the subgraph;
The second step: compute the adaptive distance threshold dis; traverse all first-order neighbor nodes of node p, calculate for each of them the maximum Euclidean distance to its k nearest neighbor nodes, and average these maxima to obtain the final adaptive distance threshold dis;
The third step: add edges to the subgraph; for any node q in the node set V, search for potential edges among the k nearest neighbors of q; if a node r appears in V and its distance to node q is less than dis, add the edge (q, r) to E; for node pairs whose distance is greater than dis, directly set the corresponding position of the adjacency matrix to 0 and do not add an edge.
Compared with the prior art, the method divides data into two parts based on density, only constructs sub-graphs for reasoning on the low-density part, reduces the input data scale of the graph convolution neural network, reduces the number of sub-graphs needing to be reasoned, improves the clustering efficiency, and simultaneously, the self-adaptive sub-graphs can extract more abundant context information, so that the reasoning on the sub-graphs is more accurate, and the clustering accuracy is improved.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (3)

1. A face clustering method based on a graph convolution neural network is characterized by comprising the following steps:
step 1: processing an original face data set, and extracting face features;
step 2: regarding each face feature as a node, and calculating k neighbor nodes of the nearest neighbor of each node by adopting nearest neighbor search;
step 3: calculating the local density value of each node;
for each node i, traverse its k nearest neighbor nodes and count how many of them lie at a Euclidean distance to node i smaller than the truncation distance E0; this count is the local density value ρi of node i, and the process is expressed as:
ρi = Σ_{j∈kNN(i)} χ(E0 − dij)
where E0 represents the preset truncation distance and dij represents the Euclidean distance between node i and node j;
χ(x) = 1 if x > 0, and χ(x) = 0 otherwise;
step 3: sorting the nodes in descending order of local density value, and dividing the nodes into high-density nodes and low-density nodes according to a set threshold;
step 4: the high-density nodes form the central areas of clusters, have strong connectivity, and belong to the strongly connected region; traverse each node of the strongly connected region: if, among the m nodes nearest to a node, there exists a node that also belongs to the strongly connected region and the distance between the two nodes is smaller than the truncation distance E0, the two nodes are assigned to the same high-density cluster center; this process continues until all nodes in the strongly connected region have been traversed, so that the high-density nodes form different cluster centers; m is a preset value not greater than k;
step 5: processing the low-density nodes, constructing an adaptive subgraph for them, and reasoning over the adaptive subgraph with a graph convolution neural network to obtain the connectivity between nodes;
step 5.1: constructing an adaptive subgraph for the low-density nodes; after construction, a feature matrix F = [f1, f2, f3, ..., f|V|]^T and an adjacency matrix A ∈ R^(|V|×|V|) are obtained; V represents the set of nodes in the adaptive subgraph;
step 5.2: inputting the feature matrix and the adjacency matrix into the graph convolution neural network; the graph convolution network aggregates node context information to judge whether a node is connected to the central pivot node, and the graph convolution layer outputs a transformed feature matrix Y;
Y=σ([F||GF]W)
where W is a learnable weight parameter matrix; G is the aggregation matrix, computed from the adjacency matrix A and the diagonal matrix Λ; the operator || denotes concatenation of matrices along the feature dimension; σ denotes the ReLU activation function;
step 5.3: inputting the feature matrix Y into a full connection layer of the graph convolution network for classification operation;
y=softmax(Wx+b)
wherein y represents the probability of a positive or negative sample; b is a bias term; the softmax function performs a normalization operation; x represents a feature vector in the feature matrix Y;
step 5.4: establishing an edge between the central node and the first-order neighbor, wherein the connectivity of the edge is the probability that the neighbor node is a positive sample;
step 6: combining the results obtained in step 4 and step 5, cutting off edges that do not meet the requirement through multiple iterations; the remaining connected components form the final clusters.
2. The face clustering method based on the graph convolution neural network of claim 1, wherein: the method for constructing the adaptive subgraph for the low-density nodes in step 5.1 specifically comprises the following steps:
step 5.1.1: for a low-density node p, forming a node set V by a second-order neighbor of the low-density node p;
step 5.1.2: traversing all first-order neighbor nodes of the node p, calculating the maximum value of the Euclidean distance between each node and k nearest neighbor nodes of each node, and calculating the average value of the maximum values to obtain a self-adaptive distance threshold dis;
step 5.1.3: adding edges to the adaptive subgraph; the finally obtained adaptive subgraph is composed of the node set V and the edge set E and is denoted (V, E);
for any node q in the node set V, potential edges are searched among the k nearest neighbors of q; if a node r appears in V and its Euclidean distance to node q is smaller than dis, the edge (q, r) is added to the edge set E; for node pairs whose Euclidean distance is larger than dis, the value at the corresponding position of the adjacency matrix is directly set to 0 and no edge is added.
3. The face clustering method based on the graph convolution neural network of claim 1, wherein: in step 6, a set of edges is obtained after combining the results obtained in step 4 and step 5, and the weight of each edge is the final prediction result and represents the likelihood that a connection exists between the pair of nodes; to obtain the final clusters, a pseudo-label propagation strategy is used: edges with weights smaller than a threshold are cut off in each iteration, and the instances connected by the remaining edges form clusters; if one of the clusters is larger than the predefined maximum cluster size, it is added to a queue so that it can be processed in the next iteration; in the iterative process, the edge-cutting threshold gradually increases, and the iteration stops when the queue is empty, since this means that the clustering has been completed.
CN202210066025.6A 2022-01-20 2022-01-20 Face clustering method based on graph convolution neural network Pending CN114511905A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210066025.6A CN114511905A (en) 2022-01-20 2022-01-20 Face clustering method based on graph convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210066025.6A CN114511905A (en) 2022-01-20 2022-01-20 Face clustering method based on graph convolution neural network

Publications (1)

Publication Number Publication Date
CN114511905A true CN114511905A (en) 2022-05-17

Family

ID=81550675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210066025.6A Pending CN114511905A (en) 2022-01-20 2022-01-20 Face clustering method based on graph convolution neural network

Country Status (1)

Country Link
CN (1) CN114511905A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115858725A (en) * 2022-11-22 2023-03-28 广西壮族自治区通信产业服务有限公司技术服务分公司 Method and system for screening text noise based on unsupervised graph neural network
CN115858725B (en) * 2022-11-22 2023-07-04 广西壮族自治区通信产业服务有限公司技术服务分公司 Text noise screening method and system based on unsupervised graph neural network

Similar Documents

Publication Publication Date Title
WO2019238109A1 (en) Fault root cause analysis method and apparatus
CN103345645B (en) Commodity image class prediction method towards net purchase platform
CN106528874B (en) The CLR multi-tag data classification method of big data platform is calculated based on Spark memory
CN107391772B (en) Text classification method based on naive Bayes
CN110348526B (en) Equipment type identification method and device based on semi-supervised clustering algorithm
CN112699953B (en) Feature pyramid neural network architecture searching method based on multi-information path aggregation
CN112685504B (en) Production process-oriented distributed migration chart learning method
CN104239553A (en) Entity recognition method based on Map-Reduce framework
CN113378913A (en) Semi-supervised node classification method based on self-supervised learning
CN108763496A (en) A kind of sound state data fusion client segmentation algorithm based on grid and density
CN110751027B (en) Pedestrian re-identification method based on deep multi-instance learning
CN107145516A (en) A kind of Text Clustering Method and system
JP5754310B2 (en) Identification information providing program and identification information providing apparatus
CN110751191A (en) Image classification method and system
CN114511905A (en) Face clustering method based on graph convolution neural network
CN113344128B (en) Industrial Internet of things self-adaptive stream clustering method and device based on micro clusters
CN112597399B (en) Graph data processing method and device, computer equipment and storage medium
Jiang et al. On spectral graph embedding: A non-backtracking perspective and graph approximation
CN112508363A (en) Deep learning-based power information system state analysis method and device
CN114142923A (en) Optical cable fault positioning method, device, equipment and readable medium
CN111159508A (en) Anomaly detection algorithm integration method and system based on algorithm diversity
CN107862073B (en) Web community division method based on node importance and separation
CN116304518A (en) Heterogeneous graph convolution neural network model construction method and system for information recommendation
CN114265954B (en) Graph representation learning method based on position and structure information
WO2023273171A1 (en) Image processing method and apparatus, device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination