CN113672751B - Background similar picture clustering method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113672751B
Authority
CN
China
Prior art keywords
graph
affinity
nodes
frequency
strong
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110729370.9A
Other languages
Chinese (zh)
Other versions
CN113672751A (en
Inventor
田春霖
蒋泽锟
严宋扬
阮书宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Xinxin Information Technology Co ltd
Original Assignee
Xi'an Xinxin Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Xinxin Information Technology Co ltd
Priority to CN202110729370.9A
Publication of CN113672751A
Application granted
Publication of CN113672751B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification

Abstract

The invention discloses a method and a device for clustering background-similar pictures, an electronic device, and a storage medium. The clustering method comprises the following steps: constructing an undirected graph G, wherein the undirected graph G is represented by an adjacency matrix and the pictures are the nodes of the undirected graph G; removing all nodes with a core degree smaller than k0 from the undirected graph G to obtain a plurality of subgraphs G1, wherein each subgraph G1 is a strong relationship cluster and k0 is the turning point of an affinity-frequency relationship graph; and dividing the first non-strong relationship nodes into the corresponding subgraphs G1 according to a high confidence threshold of the affinity-frequency relationship graph, wherein the high confidence threshold is the highest point of the affinity-frequency relationship graph. The invention uses a strong correlation subgraph algorithm to mine the strong correlations between different pictures: on the basis of the obtained adjacency matrix, it uses the uncorrelated subgraph algorithm to find the strong relationships of the corresponding entities, clusters the pictures, and solves the problem of 'normal pictures'.

Description

Background similar picture clustering method and device, electronic equipment and storage medium
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a method and a device for clustering background similar pictures, an electronic device and a storage medium.
Background
At present, background clustering of pictures generally applies common clustering algorithms, such as k-means, k-medoids, spectral clustering, and affinity propagation, to the feature information.
These common clustering algorithms cope poorly with the problem of "normal pictures" in picture background clustering. "Normal pictures" are pictures taken while handling business normally; they make up the majority of the pictures, do not belong to any fraudulent cluster, and can influence the final result to a great extent.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides a method and an apparatus for clustering background-similar pictures, an electronic device, and a storage medium. The technical problem to be solved by the invention is realized by the following technical scheme:
A clustering method for background-similar pictures comprises the following steps:
constructing an undirected graph G, wherein the undirected graph G is represented by an adjacency matrix, and pictures are nodes of the undirected graph G;
removing all nodes with a core degree smaller than k0 from the undirected graph G to obtain a plurality of subgraphs G1, wherein each subgraph G1 is a strong relationship cluster, and k0 is the turning point of an affinity-frequency relationship graph;
and dividing first non-strong relationship nodes into the corresponding subgraphs G1 according to a high confidence threshold of the affinity-frequency relationship graph, wherein the high confidence threshold is the highest point of the affinity-frequency relationship graph.
In an embodiment of the present invention, all the nodes with a core degree smaller than k0 in the undirected graph G are removed, resulting in a subgraph G1, which includes:
obtaining a relation graph of affinity and frequency;
calculating turning points of the affinity and frequency relation graph;
and all nodes with the core degrees smaller than the turning points of the affinity-frequency relation graph in the undirected graph G are removed to obtain the sub-graphs G1.
In one embodiment of the present invention, obtaining an affinity vs. frequency graph comprises:
counting the affinity of each node to the other nodes;
and obtaining a relation graph of the affinity and the frequency according to all the affinities and all the frequencies.
In one embodiment of the present invention, calculating the turning point of the affinity-frequency relationship graph comprises:
calculating the turning point of the affinity-frequency relationship graph by the Pettitt algorithm.
In an embodiment of the present invention, partitioning a first non-strong relationship node into the corresponding sub-graph G1 according to a high confidence threshold of the affinity and frequency relation graph includes:
finding the high confidence threshold in the affinity vs. frequency graph;
counting the nodes with the affinity greater than the high confidence threshold value in the first non-strong relationship nodes to obtain second non-strong relationship nodes;
calculating the affinities between the second non-strongly related node and the nodes in all the sub-graph G1 to obtain the minimum affinity m of the second non-strongly related node and the node in each sub-graph G1;
merging the second non-strong relationship node into the sub-graph G1 corresponding to the maximum affinity of all the minimum affinities m.
In an embodiment of the present invention, after dividing the first non-strong relationship node into the corresponding sub-graph G1 according to the high confidence threshold of the affinity-frequency relationship graph, the method further includes:
and obtaining the confidence of each node according to a confidence calculation formula.
In one embodiment of the present invention, the confidence calculation formula is:
[Formula image BDA0003138758870000031, not reproduced in this text]
wherein p represents the confidence, x represents the affinity between the node and its corresponding strong relationship cluster, and t1 and t2 are thresholds taken from the affinity-frequency relationship graph: where t denotes the highest point or the turning point of the graph, t takes the affinity corresponding to that point.
An embodiment of the present invention further provides a device for clustering background similar pictures, including:
a construction module, configured to construct an undirected graph G, wherein the undirected graph G is represented by an adjacency matrix, and pictures are nodes of the undirected graph G;
a removing module, configured to remove all nodes with a core degree smaller than k0 in the undirected graph G to obtain a plurality of subgraphs G1, where the subgraph G1 is a strong relationship cluster, and k0 is a turning point of an affinity-frequency relationship graph;
and the clustering module is used for dividing a first non-strong relation node into the corresponding sub-graph G1 according to a high confidence threshold of the affinity and frequency relation graph, wherein the high confidence threshold is the highest point of the affinity and frequency relation graph.
An embodiment of the present invention further provides an electronic device, including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
a processor, configured to implement the steps of the clustering method for background similar pictures according to any of the above embodiments when the computer program is executed.
An embodiment of the present invention further provides a storage medium, where a computer program is stored in the storage medium, and when the computer program is executed by a processor, the steps of the clustering method for the background similar pictures described in any of the above embodiments are implemented.
The invention has the beneficial effects that:
the invention combines a strong correlation sub-graph algorithm to mine the strong correlation relationship between different pictures, uses the uncorrelated sub-graph algorithm to find the strong relationship of corresponding entities on the basis of the obtained adjacent matrix, clusters the pictures and solves the problem of 'normal pictures'.
The invention provides a background similar picture background clustering method based on a strong relation subgraph, which can detect similar background pictures generated in various different scenes on the basis of an obtained adjacency matrix and provide information which can be referred to by a user.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
Fig. 1 is a schematic flow chart of a method for clustering background similar pictures according to an embodiment of the present invention;
Fig. 2 is a graph of affinity versus frequency according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a background similar picture clustering apparatus according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Example one
Referring to fig. 1, fig. 1 is a schematic flow chart illustrating a method for clustering background similar pictures according to an embodiment of the present invention. The embodiment of the invention provides a clustering method of background similar pictures, which specifically comprises the following steps 1-4, wherein:
Step 1: construct an undirected graph G, wherein the undirected graph G is represented by an adjacency matrix, and the pictures are the nodes of the undirected graph G.
Specifically, an adjacency matrix must first be obtained. The adjacency matrix represents a graph, which consists of nodes and edges and describes the relationships between the nodes. Here each picture is a node of the graph, and each edge carries the degree of association between two pictures: the greater the degree of association, the greater the similarity between the pictures. This embodiment therefore uses the adjacency matrix to represent the undirected graph G.
The adjacency matrix may be a picture distance matrix or an affinity matrix.
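As an illustration of how a picture distance matrix could be turned into an affinity matrix, the following minimal Python sketch may help. The mapping 1 / (1 + d) is an assumption chosen only because it decreases with distance and avoids division by zero; the text does not specify the exact relation. The function name `distance_to_affinity` and the choice to leave the diagonal at zero are likewise hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def distance_to_affinity(dist: Matrix) -> Matrix:
    """Convert a pairwise picture distance matrix into a symmetric affinity matrix.

    Assumption: affinity = 1 / (1 + distance). The source only says that
    affinity decreases as distance grows; this particular mapping is illustrative.
    """
    n = len(dist)
    aff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Average the two directions so the graph is undirected.
            d = 0.5 * (dist[i][j] + dist[j][i])
            a = 1.0 / (1.0 + d)
            aff[i][j] = a
            aff[j][i] = a
    # The diagonal stays 0.0: a picture is not treated as its own neighbour.
    return aff
```

For example, a distance of 1.0 maps to an affinity of 0.5 and a distance of 3.0 maps to 0.25 under this assumed mapping.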
Step 2: remove all nodes with a core degree smaller than k0 from the undirected graph G to obtain a plurality of subgraphs G1, wherein each subgraph G1 is a strong relationship cluster, and k0 is the turning point of the affinity-frequency relationship graph.
In a particular embodiment, step 2 may comprise steps 2.1 to 2.3, wherein:
Step 2.1: obtain the affinity-frequency relationship graph (see Fig. 2).
In this embodiment, step 2.1 may specifically include steps 2.11 to 2.12, where:
Step 2.11: count the affinity from each node to the other nodes, wherein the affinity is an element of the adjacency matrix and is inversely proportional to the distance between the nodes.
Specifically, this embodiment counts the affinity between every two nodes in the adjacency matrix, i.e., the affinity from each node to all the other nodes.
Step 2.12: obtain the affinity-frequency relationship graph from all the affinities and all the frequencies.
Specifically, the frequency is the number of occurrences of each element in the adjacency matrix, and referring to fig. 2, a graph of the relationship between affinity and frequency can be constructed on the basis of obtaining all the affinities and the frequency.
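The affinity-frequency relationship can be sketched as a histogram over the off-diagonal entries of the adjacency matrix. Because real-valued affinities rarely repeat exactly, this sketch groups them into equal-width bins; the binning, the parameter `bins`, and the function name `affinity_frequency` are assumptions that do not come from the text.

```python
from typing import List, Tuple


def affinity_frequency(aff: List[List[float]], bins: int = 10) -> Tuple[List[float], List[int]]:
    """Return (bin centres, counts) for the pairwise affinities of an undirected graph.

    Only the upper triangle is read, so each picture pair is counted once.
    """
    n = len(aff)
    values = [aff[i][j] for i in range(n) for j in range(i + 1, n)]
    if not values or bins < 1:
        raise ValueError("need at least two nodes and one bin")
    lo, hi = min(values), max(values)
    # Fall back to a unit width when all affinities are identical.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    centers = [lo + (k + 0.5) * width for k in range(bins)]
    return centers, counts
```

Plotting `counts` against `centers` would give a curve of the kind shown in Fig. 2, under the binning assumption stated above.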
Step 2.2: calculate the turning point of the affinity-frequency relationship graph.
Specifically, the turning point of the affinity-frequency relationship graph, i.e., threshold point 2 in Fig. 2, is calculated by the Pettitt algorithm.
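For readers unfamiliar with it, the Pettitt change-point test can be sketched in a few lines of Python. The statistic U_t sums the signs of all differences between values before and after a candidate split, and the split with the largest |U_t| is reported together with the usual approximate p-value. Applying the test to the frequency values ordered by affinity is an assumption; the text does not say which series is tested. The function name `pettitt_change_point` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def pettitt_change_point(series: Sequence[float]) -> Tuple[int, int, float]:
    """Return (t, K, p), where series[:t] and series[t:] are the two segments.

    K = max_t |U_t| and p is approximately 2 * exp(-6 K^2 / (n^3 + n^2)).
    This direct implementation is cubic in n, which is fine for a short histogram.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")

    def sgn(v: float) -> int:
        return (v > 0) - (v < 0)

    best_t, best_k = 0, -1
    for t in range(1, n):
        # Compare every value in the first segment with every value in the second.
        u = sum(sgn(series[i] - series[j]) for i in range(t) for j in range(t, n))
        if abs(u) > best_k:
            best_k, best_t = abs(u), t
    p = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_t, best_k, p
```

For the series `[1, 1, 1, 1, 5, 5, 5, 5]` the sketch reports the split at index 4 with K = 16. The affinity at the reported index would then serve as the turning point under the assumption above.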
Step 2.3: remove all nodes in the undirected graph G whose core degree is smaller than the turning point of the affinity-frequency relationship graph to obtain a plurality of subgraphs G1.
Specifically, the core degree of each node in the undirected graph G is compared with the turning point of the affinity-frequency relationship graph, and the nodes whose core degree is smaller than the turning point are removed. The removed nodes are the first non-strong relationship nodes, and the remaining nodes are strong relationship nodes. Among the remaining strong relationship nodes, each group of nodes interconnected through edges forms a subgraph G1, and each subgraph G1 is a strong relationship entity, that is, a strong relationship cluster.
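One way to read this step is as standard k-core peeling followed by a connected-components pass, sketched below in Python. The text does not state how a real-valued affinity matrix becomes an unweighted graph or how the turning point maps to the integer k, so both are left as explicit inputs here: `edge_threshold` (an edge exists when the affinity is at least this value) and `k` are hypothetical parameters, as is the function name.

```python
from typing import List, Tuple


def strong_relation_clusters(
    aff: List[List[float]], edge_threshold: float, k: int
) -> Tuple[List[List[int]], List[int]]:
    """Peel nodes with fewer than k surviving neighbours, then split into components.

    Returns (clusters, removed): clusters are the connected components of the
    k-core, and removed lists the peeled (non-strong relationship) nodes.
    """
    n = len(aff)
    # Assumption: binarize the affinity matrix with edge_threshold.
    neighbors = [
        {j for j in range(n) if j != i and aff[i][j] >= edge_threshold}
        for i in range(n)
    ]
    alive = set(range(n))
    changed = True
    while changed:  # repeat until no node falls below degree k
        changed = False
        for v in list(alive):
            if len(neighbors[v] & alive) < k:
                alive.discard(v)
                changed = True
    clusters: List[List[int]] = []
    seen = set()
    for start in sorted(alive):
        if start in seen:
            continue
        component, stack = [], [start]
        seen.add(start)
        while stack:  # depth-first search restricted to surviving nodes
            v = stack.pop()
            component.append(v)
            for w in neighbors[v] & alive:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        clusters.append(sorted(component))
    removed = sorted(set(range(n)) - alive)
    return clusters, removed
```

With two tightly connected triangles and one node attached to a single triangle vertex, `k = 2` keeps both triangles as separate clusters and removes the dangling node.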
Step 3: divide the first non-strong relationship nodes into the corresponding subgraphs G1 according to the high confidence threshold of the affinity-frequency relationship graph, wherein the high confidence threshold is the highest point of the affinity-frequency relationship graph.
The strong relationship entities obtained by strong relationship mining are in effect individual high-confidence fraud clusters. Outside them there remain a large number of non-strong relationship entities, which appear as single nodes on the graph, so these nodes need to be selectively merged into the clusters.
In a particular embodiment, step 3 may particularly comprise steps 3.1 to 3.4, wherein:
Step 3.1: find the high confidence threshold in the affinity-frequency relationship graph.
Specifically, the highest point, i.e. the threshold point 1 in fig. 2, is found in the affinity-frequency relationship diagram, and the affinity corresponding to this highest point serves as the high confidence threshold.
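If the affinity-frequency relationship graph is held as two parallel lists, locating the high confidence threshold amounts to taking the affinity at the largest frequency. The following short Python sketch assumes that representation; the names `centers`, `counts`, and `high_confidence_threshold` are hypothetical.

```python
from typing import Sequence


def high_confidence_threshold(centers: Sequence[float], counts: Sequence[int]) -> float:
    """Return the affinity at which the frequency curve peaks.

    If several affinities share the maximum frequency, the first one is returned.
    """
    if not counts or len(centers) != len(counts):
        raise ValueError("centers and counts must be non-empty and of equal length")
    peak = max(range(len(counts)), key=lambda idx: counts[idx])
    return centers[peak]
```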
Step 3.2: count the nodes whose affinity is greater than the high confidence threshold among the first non-strong relationship nodes to obtain second non-strong relationship nodes.
In this embodiment, the first non-strong relationship nodes are the nodes removed in step 2.3, and these nodes need to be moved into the corresponding strong relationship clusters. Therefore, among all first non-strong relationship nodes, those whose affinity is greater than the high confidence threshold are identified and taken as the second non-strong relationship nodes.
Step 3.3: calculate the affinity between each second non-strong relationship node and the nodes in every subgraph G1 to obtain the minimum affinity m between the second non-strong relationship node and each subgraph G1.
Specifically, for each second non-strong relationship node, the affinity between that node and every node in each subgraph G1 is calculated. The minimum affinity m is the minimum of the affinities between one second non-strong relationship node and all nodes of one subgraph G1, so each pair of second non-strong relationship node and subgraph G1 has one value of m.
Step 3.4: merge each second non-strong relationship node into the subgraph G1 corresponding to the largest of its minimum affinities m.
Specifically, each second non-strong relationship node has one minimum affinity m with each subgraph G1. The largest m is selected from these values, and the node is assigned to the subgraph G1 corresponding to that largest m, which completes the clustering of the second non-strong relationship nodes.
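Steps 3.2 to 3.4 can be sketched as a max-min assignment in Python. The text does not say which affinity of a removed node is compared with the high confidence threshold; this sketch assumes a node qualifies when its affinity to at least one clustered node exceeds the threshold. That rule, the function name `merge_nodes`, and its parameter names are assumptions.

```python
from typing import Dict, List, Tuple


def merge_nodes(
    aff: List[List[float]],
    clusters: List[List[int]],
    removed: List[int],
    high_conf: float,
) -> Tuple[List[List[int]], Dict[int, int]]:
    """Assign qualifying removed nodes to the cluster with the largest minimum affinity.

    Returns (merged clusters, assignment), where assignment maps node -> cluster index.
    """
    members = [list(c) for c in clusters]  # snapshot of the strong relationship clusters
    assignment: Dict[int, int] = {}
    for v in removed:
        # Step 3.2 (assumed rule): keep v only if some affinity exceeds the threshold.
        if not any(aff[v][u] > high_conf for c in members for u in c):
            continue
        # Step 3.3: minimum affinity m between v and each cluster.
        min_aff = [min(aff[v][u] for u in c) for c in members]
        # Step 3.4: choose the cluster with the largest m.
        assignment[v] = max(range(len(members)), key=lambda idx: min_aff[idx])
    merged = [
        sorted(c + [v for v, target in assignment.items() if target == idx])
        for idx, c in enumerate(members)
    ]
    return merged, assignment
```

Using the minimum affinity per cluster means a node joins a cluster only if it is reasonably close to every member, not just to one of them. In this sketch all candidates are evaluated against the original clusters, so the order in which removed nodes are processed does not affect the result.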
Step 4: obtain the confidence of each node according to a confidence calculation formula.
The confidence represents the probability that the corresponding node, that is, a given picture, belongs to a given cluster. After node merging, the confidence of each node is calculated according to the following confidence calculation formula:
[Formula image BDA0003138758870000081, not reproduced in this text]
wherein p represents the confidence, x represents the affinity between the node and its corresponding strong relationship cluster, and t1 and t2 are thresholds taken from the affinity-frequency relationship graph: where t denotes the highest point or the turning point of the graph, t takes the affinity corresponding to that point.
According to the above formula, the present embodiment can calculate the confidence value of each node, thereby giving more information to the user.
According to the method, the threshold point 1 and the threshold point 2 are determined through the affinity-frequency relation graph, the pictures are sequentially merged into different clustering clusters, the affinity-frequency relation graph can adaptively reduce the interference of noise data on the result, particularly the interference of a 'normal picture', and the identification precision is improved.
The picture background clustering method provided by the invention is mainly used for clustering fraudulent background pictures and comprises three main steps: strong relationship mining, node merging, and confidence calculation.
The invention processes the obtained adjacency matrix with the uncorrelated subgraph algorithm to obtain the strong relationship entities, and finally finds the most probable fraudulent picture clusters.
The invention combines a strong correlation sub-graph algorithm to mine the strong correlation relationship between different pictures, uses the uncorrelated sub-graph algorithm to find the strong relationship of corresponding entities on the basis of the obtained adjacent matrix, clusters the pictures and solves the problem of 'normal pictures'.
The invention provides a background similar picture background clustering method based on a strong relation subgraph, which can detect similar background pictures generated in various different scenes on the basis of an obtained adjacency matrix and provide information which can be referred to by a user.
Example two
Referring to fig. 3, fig. 3 is a schematic diagram of a clustering device for background similar pictures according to an embodiment of the present invention. The clustering device for the background similar pictures comprises:
a construction module, configured to construct an undirected graph G, wherein the undirected graph G is represented by an adjacency matrix, and pictures are nodes of the undirected graph G;
a removing module, configured to remove all nodes with a core degree smaller than k0 in the undirected graph G to obtain a plurality of subgraphs G1, where the subgraph G1 is a strong relationship cluster, and k0 is a turning point of an affinity-frequency relationship graph;
and the clustering module is used for dividing a first non-strong relation node into the corresponding sub-graph G1 according to a high confidence threshold of the affinity and frequency relation graph, wherein the high confidence threshold is the highest point of the affinity and frequency relation graph.
In an embodiment of the present invention, the removing module may be specifically configured to obtain an affinity-frequency relationship graph; calculating turning points of the affinity and frequency relation graph; and all nodes with the core degrees smaller than the turning points of the affinity and frequency relation graph in the undirected graph G are removed to obtain a plurality of sub-graphs G1.
In an embodiment of the present invention, the clustering module may be specifically configured to find the high confidence threshold in the affinity-frequency relationship graph; counting the nodes with the affinity greater than the high confidence threshold value in the first non-strong relationship nodes to obtain second non-strong relationship nodes; calculating affinities between the second non-strongly related node and nodes in all of the subgraph G1 to obtain a minimum affinity m of the second non-strongly related node and each of the subgraph G1; merging the second non-strongly related node into the sub-graph G1 corresponding to the maximum affinity of all the minimum affinities m.
The clustering device for the background similar pictures provided by the embodiment of the invention can execute the method embodiment, and the implementation principle and the technical effect are similar, so that the implementation principle and the technical effect are not described again.
Example three
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device 1100 comprises: the system comprises a processor 1101, a communication interface 1102, a memory 1103 and a communication bus 1104, wherein the processor 1101, the communication interface 1102 and the memory 1103 are communicated with each other through the communication bus 1104;
a memory 1103 for storing a computer program;
the processor 1101, when executing the computer program, implements the above method steps.
The processor 1101, when executing the computer program, implements the following steps:
step 1, constructing an undirected graph G, wherein the undirected graph G is represented by an adjacency matrix, and pictures are nodes of the undirected graph G;
step 2, removing all nodes with a core degree smaller than k0 from the undirected graph G to obtain a plurality of subgraphs G1, wherein each subgraph G1 is a strong relationship cluster, and k0 is the turning point of an affinity-frequency relationship graph;
step 3, dividing first non-strong relationship nodes into the corresponding subgraphs G1 according to a high confidence threshold of the affinity-frequency relationship graph, wherein the high confidence threshold is the highest point of the affinity-frequency relationship graph.
The electronic device provided by the embodiment of the present invention can execute the above method embodiments, and the implementation principle and technical effect are similar, which are not described herein again.
Example four
Yet another embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
step 1, constructing an undirected graph G, wherein the undirected graph G is represented by an adjacency matrix, and pictures are nodes of the undirected graph G;
step 2, removing all nodes with a core degree smaller than k0 from the undirected graph G to obtain a plurality of subgraphs G1, wherein each subgraph G1 is a strong relationship cluster, and k0 is the turning point of an affinity-frequency relationship graph;
step 3, dividing first non-strong relationship nodes into the corresponding subgraphs G1 according to a high confidence threshold of the affinity-frequency relationship graph, wherein the high confidence threshold is the highest point of the affinity-frequency relationship graph.
The computer-readable storage medium provided by the embodiment of the present invention may implement the above method embodiments, and the implementation principle and technical effect are similar, which are not described herein again.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus (device), or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "module" or "system". Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The computer program is stored/distributed on a suitable medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
In the description of the present invention, it is to be understood that the terms "first", "second" and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples described in this specification can be combined by those skilled in the art.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (7)

1. A clustering method of background similar pictures is characterized by comprising the following steps:
constructing an undirected graph G, wherein the undirected graph G is represented by an adjacent matrix and comprises nodes and edges, pictures are the nodes of the undirected graph G, the edges are the correlation degrees among the pictures, and the larger the correlation degree is, the larger the similarity among the pictures is;
removing all nodes with the core degree smaller than k0 in the undirected graph G to obtain a plurality of sub-graphs G1, wherein the removed nodes are first non-strong relationship nodes, the sub-graph G1 is a strong relationship cluster, and k0 is a turning point of an affinity and frequency relationship graph;
dividing a first non-strong relation node into the corresponding sub-graph G1 according to a high confidence threshold of the affinity and frequency relation graph, wherein the affinity is an element of an adjacent matrix, the affinity is inversely proportional to the distance between the nodes, the frequency is the occurrence frequency of each element in the adjacent matrix, and the high confidence threshold is the highest point of the affinity and frequency relation graph;
obtaining the confidence of each node according to a confidence calculation formula, wherein the confidence calculation formula is as follows:
[Formula image FDA0003643174190000011, not reproduced in this text]
wherein p represents the confidence, x represents the affinity between the node and its corresponding strong relationship cluster, and t1 and t2 are thresholds taken from the affinity-frequency relationship graph: where t denotes the highest point or the turning point of the graph, t takes the affinity corresponding to that point;
dividing a first non-strong relation node into the corresponding sub-graph G1 according to the high confidence threshold of the affinity and frequency relation graph, including:
finding the high confidence threshold in the affinity vs. frequency graph;
counting the nodes with the affinity greater than the high confidence threshold value in the first non-strong relationship nodes to obtain second non-strong relationship nodes;
calculating the affinities between the second non-strongly related node and the nodes in all the sub-graph G1 to obtain the minimum affinity m of the second non-strongly related node and the node in each sub-graph G1;
merging the second non-strong relationship node into the sub-graph G1 corresponding to the maximum affinity in all the minimum affinities m.
2. The method for clustering background similar pictures according to claim 1, wherein removing all nodes with a core degree smaller than k0 in the undirected graph G to obtain the sub-graphs G1 comprises:
obtaining the affinity-frequency relation graph;
calculating the turning point of the affinity-frequency relation graph;
removing all nodes in the undirected graph G whose core degree is smaller than the turning point of the affinity-frequency relation graph, to obtain the sub-graphs G1.
3. The method of claim 2, wherein obtaining the affinity-frequency relation graph comprises:
counting the affinity of each node to the other nodes;
obtaining the affinity-frequency relation graph according to all the affinities and all the frequencies.
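One way to picture the affinity-frequency relation graph of claim 3 is as a histogram of the pairwise affinities. The sketch below is only an illustration under stated assumptions: affinities are taken to lie in [0, 1], the graph is binned with `numpy.histogram`, and the helper names `affinity_frequency` and `peak_affinity` are hypothetical.

```python
import numpy as np

def affinity_frequency(affinity, bins=10):
    """Return (bin centers, counts) for the pairwise affinities of a symmetric matrix."""
    rows, cols = np.triu_indices_from(affinity, k=1)  # each unordered pair once, no diagonal
    values = affinity[rows, cols]
    # ASSUMPTION: affinities are normalized to the interval [0, 1].
    counts, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2
    return centers, counts

def peak_affinity(centers, counts):
    """Affinity at the highest point of the graph (used as the high confidence threshold)."""
    return float(centers[np.argmax(counts)])

# Hypothetical 4-picture example.
A = np.array([
    [1.00, 0.15, 0.15, 0.85],
    [0.15, 1.00, 0.15, 0.55],
    [0.15, 0.15, 1.00, 0.15],
    [0.85, 0.55, 0.15, 1.00],
])
centers, counts = affinity_frequency(A)
print(counts, peak_affinity(centers, counts))
```

The number of bins controls how smooth the graph is. It is a free parameter of this sketch and is not specified by the claims.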
4. The method of claim 2, wherein calculating the turning point of the affinity-frequency relation graph comprises:
calculating the turning point of the affinity-frequency relation graph by the Petit algorithm.
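The claim does not define the "Petit algorithm". The sketch below assumes that it refers to the Pettitt change-point test, a rank-based test that locates the single most likely shift in a sequence. Applying it to the frequency values ordered by affinity is also an assumption, and the function name `pettitt_change_point` is hypothetical.

```python
import numpy as np

def pettitt_change_point(series):
    """Pettitt test: return (change position, statistic K, approximate p-value).

    The change position is the number of samples that precede the detected shift.
    """
    x = np.asarray(series, dtype=float)
    n = len(x)
    # signs[i, j] = sgn(x_i - x_j)
    signs = np.sign(x[:, None] - x[None, :])
    # U_t = sum over i <= t, j > t of sgn(x_i - x_j), for t = 1 .. n-1
    u = np.array([signs[:t, t:].sum() for t in range(1, n)])
    k = int(np.argmax(np.abs(u)))
    stat = float(abs(u[k]))
    # Standard large-sample approximation of the significance level.
    p_value = float(min(1.0, 2.0 * np.exp(-6.0 * stat**2 / (n**3 + n**2))))
    return k + 1, stat, p_value

# Hypothetical frequency sequence with an obvious shift after the fifth value.
position, stat, p_value = pettitt_change_point([1, 1, 1, 1, 1, 5, 5, 5, 5, 5])
print(position, stat, p_value)
```

Under these assumptions, the affinity value of the bin at the returned position would serve as the turning point of the graph.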
5. A background similar picture clustering device, characterized by comprising:
a construction module, configured to construct an undirected graph G, wherein the undirected graph G is represented by an adjacency matrix and comprises nodes and edges, the pictures are the nodes of the undirected graph G, the edges are the degrees of correlation between the pictures, and a larger degree of correlation indicates a greater similarity;
a removing module, configured to remove all nodes with a core degree smaller than k0 in the undirected graph G to obtain a plurality of sub-graphs G1, wherein the removed nodes are first non-strong-relation nodes, each sub-graph G1 is a strong-relation cluster, and k0 is a turning point of the affinity-frequency relation graph;
a clustering module, configured to divide the first non-strong-relation node into the corresponding sub-graph G1 according to a high confidence threshold of the affinity-frequency relation graph, wherein the affinity is an element of the adjacency matrix, the affinity is inversely proportional to the distance between the nodes, the frequency is the number of occurrences of each element in the adjacency matrix, and the high confidence threshold is the highest point of the affinity-frequency relation graph;
obtaining the confidence of each node according to a confidence calculation formula, wherein the confidence calculation formula is as follows:
Figure FDA0003643174190000031
wherein p represents the confidence, x represents the affinity of the node belonging to the corresponding strong-relation cluster, and t1 and t2 each represent the highest point of the affinity-frequency relation graph or the turning point of the affinity-frequency relation graph, each t taking the affinity value corresponding to that highest point or turning point;
wherein dividing the first non-strong-relation node into the corresponding sub-graph G1 according to the high confidence threshold of the affinity-frequency relation graph comprises:
finding the high confidence threshold in the affinity-frequency relation graph;
selecting, from the first non-strong-relation nodes, the nodes whose affinity is greater than the high confidence threshold, to obtain second non-strong-relation nodes;
calculating the affinities between each second non-strong-relation node and the nodes in every sub-graph G1, to obtain, for each sub-graph G1, the minimum affinity m between the second non-strong-relation node and the nodes of that sub-graph;
merging the second non-strong-relation node into the sub-graph G1 corresponding to the largest of all the minimum affinities m.
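The removing module of claim 5 can be sketched as a standard k-core pruning followed by a connected-component split. This is an illustrative reading, not the patented implementation. It assumes a boolean adjacency matrix (for example, obtained by thresholding affinities), treats k0 as a given integer without deriving it from the turning point, and uses the hypothetical function name `k_core_subgraphs`.

```python
import numpy as np

def k_core_subgraphs(adjacency, k0):
    """Peel nodes of degree < k0 repeatedly; return (sub-graphs, removed nodes)."""
    adj = np.asarray(adjacency, dtype=bool).copy()
    np.fill_diagonal(adj, False)  # ignore self-loops
    alive = np.ones(len(adj), dtype=bool)
    while True:
        degree = (adj & alive[None, :]).sum(axis=1)  # neighbours that are still alive
        weak = alive & (degree < k0)
        if not weak.any():
            break
        alive &= ~weak
    # Split the surviving nodes into connected components (the sub-graphs G1).
    subgraphs, seen = [], set()
    for start in np.flatnonzero(alive):
        start = int(start)
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in np.flatnonzero(adj[node] & alive):
                neighbour = int(neighbour)
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        subgraphs.append(sorted(component))
    removed = [int(i) for i in np.flatnonzero(~alive)]
    return subgraphs, removed

# Hypothetical graph: two triangles, plus node 6 attached only to node 0.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (0, 6)]
adj = np.zeros((7, 7), dtype=bool)
for a, b in edges:
    adj[a, b] = adj[b, a] = True
subgraphs, removed = k_core_subgraphs(adj, k0=2)
print(subgraphs, removed)
```

The removed nodes returned here correspond to the first non-strong-relation nodes that the clustering module later tries to merge back into a sub-graph.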
6. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any one of claims 1-4 when executing the computer program.
7. A storage medium, characterized in that a computer program is stored in the storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1-4.
CN202110729370.9A 2021-06-29 2021-06-29 Background similar picture clustering method and device, electronic equipment and storage medium Active CN113672751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110729370.9A CN113672751B (en) 2021-06-29 2021-06-29 Background similar picture clustering method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113672751A CN113672751A (en) 2021-11-19
CN113672751B true CN113672751B (en) 2022-07-01

Family

ID=78538340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110729370.9A Active CN113672751B (en) 2021-06-29 2021-06-29 Background similar picture clustering method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113672751B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107943934A (en) * 2017-11-23 2018-04-20 北京天广汇通科技有限公司 Relationship strength determines method and apparatus
CN108629593A (en) * 2018-04-28 2018-10-09 招商银行股份有限公司 Fraudulent trading recognition methods, system and storage medium based on deep learning
CN111310768A (en) * 2020-01-20 2020-06-19 安徽大学 Saliency target detection method based on robustness background prior and global information
CN111860584A (en) * 2020-06-11 2020-10-30 石家庄铁路职业技术学院 Graph classification method and device
CN112529115A (en) * 2021-02-05 2021-03-19 支付宝(杭州)信息技术有限公司 Object clustering method and system
CN112580668A (en) * 2020-12-24 2021-03-30 西安深信科创信息技术有限公司 Background fraud detection method and device and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8560605B1 (en) * 2010-10-21 2013-10-15 Google Inc. Social affinity on the web
WO2013181222A2 (en) * 2012-05-29 2013-12-05 Battelle Memorial Institute Method of analyzing a graph with a covariance-based clustering algorithm using a modified laplacian pseudo-inverse matrix
CN112069964A (en) * 2020-08-31 2020-12-11 天津大学 Abnormal person relation network mining method based on image recognition technology

Also Published As

Publication number Publication date
CN113672751A (en) 2021-11-19

Similar Documents

Publication Publication Date Title
CN109859054B (en) Network community mining method and device, computer equipment and storage medium
Wan et al. An algorithm for multidimensional data clustering
US8886649B2 (en) Multi-center canopy clustering
US9031999B2 (en) System and methods for generation of a concept based database
Isaksson et al. SOStream: Self organizing density-based clustering over data stream
CN107070867B (en) Network flow abnormity rapid detection method based on multilayer locality sensitive hash table
JP2006338313A (en) Similar image retrieving method, similar image retrieving system, similar image retrieving program, and recording medium
US20060195423A1 (en) System and method for temporal data mining
CN111444363A (en) Picture retrieval method and device, terminal equipment and storage medium
CN111291768A (en) Image feature matching method and device, equipment and storage medium
CN111553215A (en) Personnel association method and device, and graph convolution network training method and device
Winter et al. Fast indexing strategies for robust image hashes
CN111460234A (en) Graph query method and device, electronic equipment and computer readable storage medium
WO2019119635A1 (en) Seed user development method, electronic device and computer-readable storage medium
EP3067804A1 (en) Data arrangement program, data arrangement method, and data arrangement apparatus
US20190050672A1 (en) INCREMENTAL AUTOMATIC UPDATE OF RANKED NEIGHBOR LISTS BASED ON k-th NEAREST NEIGHBORS
CN110083731B (en) Image retrieval method, device, computer equipment and storage medium
Chehreghani Efficient computation of pairwise minimax distance measures
CN110717086A (en) Mass data clustering analysis method and device
CN113672751B (en) Background similar picture clustering method and device, electronic equipment and storage medium
WO2012159320A1 (en) Method and device for clustering large-scale image data
CN105760442B (en) Characteristics of image Enhancement Method based on database neighborhood relationships
CN113572721A (en) Abnormal access detection method and device, electronic equipment and storage medium
CN115757896A (en) Vector retrieval method, device, equipment and readable storage medium
CN114693943A (en) Non-maximum suppression acceleration method, system and equipment for target detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant