CN112765414A - Graph embedding vector generation method and graph embedding-based community discovery method - Google Patents


Info

Publication number
CN112765414A
CN112765414A (application CN202110079198.7A)
Authority
CN
China
Prior art keywords
vertex
vector
graph
graph embedding
generating
Prior art date
Legal status
Pending
Application number
CN202110079198.7A
Other languages
Chinese (zh)
Inventor
于东晓
张喜连
罗琦
Current Assignee
Shandong University
Original Assignee
Shandong University
Priority date
Filing date
Publication date
Application filed by Shandong University
Priority to CN202110079198.7A
Publication of CN112765414A
Status: Pending

Classifications

    • G06F16/9024 Indexing; data structures therefor; storage structures: graphs; linked lists
    • G06F16/906 Clustering; classification
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G06F18/23213 Non-hierarchical clustering techniques with a fixed number of clusters, e.g. K-means clustering
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates


Abstract

The invention belongs to the technical field of data processing and relates to a method for generating graph embedding vectors and a graph-embedding-based community discovery method. The method for generating graph embedding vectors comprises the following steps: acquiring the core value of each vertex; acquiring the neighborhood structure information of the vertices and calculating the similarity between vertices; generating vertex sequences based on the similarity between each vertex and its neighbors; and performing word vector training on the vertex sequences to generate an embedding vector for each vertex. In this method, the neighborhood structure information of a vertex is preserved through its core value information, so that structurally similar vertices lie closer together in the embedding space. Communities are then discovered by clustering or classifying the obtained graph embedding vectors.

Description

Graph embedding vector generation method and graph embedding-based community discovery method
Technical Field
The invention belongs to the technical field of data processing, and relates to a method for generating a graph embedding vector and a community discovery method based on graph embedding.
Background
In the internet era, deep learning techniques have been applied to hundreds of practical problems over the past few years, from computer vision to natural language processing. Graph databases are also increasingly used in social networking, e-commerce and other fields because of their excellent performance in handling relationships between data items. In a graph network, a subgraph corresponding to a subset of nodes with relatively close internal connections is called a community, and the process of finding such community structures in a graph is called community discovery. Naturally, combining deep learning with community discovery on graph data has become a subject of research. However, raw graph data cannot be fed directly into a deep learning model; it first needs to be converted into sequence data.
Graph embedding techniques embed a graph into a vector space, representing it as low-dimensional vectors while preserving the structural information of the graph. Current graph embedding techniques can be roughly divided into three types: those based on matrix factorization, those based on random walks, and those based on neural network models, with specific algorithms such as LINE, DeepWalk and SDNE. Most existing methods treat the similarity between vertex pairs as the feature information of the vertices, but do not consider the neighborhood structure information of the vertices.
The core value of a vertex reflects, to some extent, the neighborhood structure of that vertex. A vertex with core value k has at least k neighbors, each of degree greater than or equal to k. Moreover, a vertex with core value k lies in the k-core subgraph, a dense subgraph in which every vertex has degree at least k. In summary, the core value of a vertex reflects the extent to which it belongs to a dense subgraph, and can therefore reflect the neighborhood structure of the vertex.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a core-value-based method for generating graph embedding vectors and a graph-embedding-based community discovery method.
In order to achieve the above purpose, one of the technical solutions provided by the present invention is: a method of generating graph embedding vectors, the method comprising:
acquiring a core value of a vertex;
acquiring neighborhood structure information of the vertexes, and calculating the similarity between the vertexes;
generating a vertex sequence based on the similarity of the vertex and the adjacent neighbor thereof;
and carrying out word vector training on the vertex sequence to generate an embedded vector of each vertex.
As a preferred embodiment of the present invention, the core value of a vertex is calculated by:
calculating the degrees of all vertices;
selecting a vertex of minimum degree, the core value of that vertex being its current degree;
and traversing the neighbors of the vertex selected in the previous step and, for every neighbor whose degree is greater than that of the selected vertex, subtracting 1 from the neighbor's degree.
Further preferably, the similarity between vertices is calculated by:
acquiring the sets of vertices at distance 1, 2, …, k from the vertex u, i.e. the hop neighbor sets N_1(u), N_2(u), …, N_k(u) of u;
acquiring, for each hop neighbor set N_k(u), the distribution of the core values of its vertices, represented as a vector l_u^k, where l_u^k[t] is the number of vertices in N_k(u) whose core value is t;
multiplying the vector of each hop neighbor set of u by a decay coefficient and summing them into an overall vector d_u; the larger the hop count, i.e. the larger k, the smaller the influence of that neighborhood information on the structure around the vertex, and therefore the smaller the decay coefficient;
and calculating the Euclidean distance between the vectors corresponding to the vertices u and v, and from it the similarity between the two vertices.
Further preferably, the vertex sequences are generated by:
calculating the probability of walking from a vertex to each adjacent vertex according to the similarity between the vertices;
and generating vertex sequences with a random walk model, using the obtained probabilities as the preset transition probabilities.
In order to further achieve the object of the present invention, the present invention also provides a graph embedding-based community discovery method, including:
acquiring a training sample set;
obtaining a graph embedding vector corresponding to each training sample, wherein the graph embedding vector is generated by adopting the method for generating the graph embedding vector provided by the invention;
and taking the graph embedding vector of the training sample as input data, and training a preset network model to obtain a community discovery model.
The preset network model is a clustering algorithm model, similar vertex vectors are summarized together, a plurality of different cluster classes are further obtained, and each cluster class represents a community.
Further preferably, under the condition that the labels of the training samples are known, each vertex vector is allocated to a community, and then the community to which the new vertex belongs is predicted to complete community discovery.
The invention has the following beneficial effects. Compared with the prior art, the invention provides a core-value-based method for generating graph embedding vectors, together with clustering-based and classification-based methods for processing the obtained graph embedding vectors into a number of distinct communities whose internal vertices are closely connected and highly similar, thereby achieving the goal of community discovery.
Drawings
FIG. 1 is a schematic flow chart of a community discovery method based on graph embedding according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a method for generating a kernel-based graph embedding vector according to an embodiment of the present invention;
fig. 3 is an exemplary diagram of data samples in the method for generating a graph embedding vector according to this embodiment;
fig. 4 is a schematic flowchart of the k-means cluster-based community discovery provided in this embodiment;
fig. 5 is a schematic flowchart of the KNN classification-based community discovery provided in this embodiment.
Detailed Description
In order to facilitate an understanding of the invention, the invention is described in more detail below with reference to the accompanying drawings and specific examples. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Embodiment 1. This embodiment provides a core-value-based method for generating graph embedding vectors; the flow is shown in Fig. 2. First, the core values of all vertices are computed by core decomposition. Then the neighborhood structure information of the vertices is acquired, and the structural similarity between pairs of vertices is calculated from it. Next, the probability of walking from a vertex to each of its neighbors is computed from this structural similarity, and vertex sequences are generated by biased walks. Finally, word vector training is performed on the resulting vertex sequences to generate an embedding vector for each vertex.
Based on the overall process, the method comprises the following steps:
s201, calculating the kernel values of all vertexes:
specifically, the kernel values of all vertices are calculated in the graph using a kernel value decomposition method. The method for nuclear value decomposition specifically comprises the following steps: firstly, calculating degrees of all vertexes; secondly, selecting a vertex with the minimum degree, wherein the core value of the vertex is the value of the degree; thirdly, traversing the neighbor vertex of the vertex in the second step, and if the degree of a certain neighbor vertex is greater than that of the vertex, subtracting 1 from the degree of the neighbor vertex; fourthly, repeating the second step and the third step.
S202, acquiring the k-hop neighbor sets of a vertex:
Specifically, the k-hop neighbors of a vertex u are the vertices at distance k from u. Taking vertex u as an example, its multi-hop neighbor sets are obtained by a breadth-first traversal of the graph starting from u. The first layer consists of the neighbors directly connected to u, at distance 1; these are the 1-hop neighbors of u, denoted N_1(u). The vertices directly connected to a 1-hop neighbor of u and at distance 2 from u are its 2-hop neighbors, denoted N_2(u). By analogy, the vertices at distance k from u form the k-hop neighbor set of u, denoted N_k(u).
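The breadth-first collection of hop neighbor sets can be sketched as below (an illustrative helper, with assumed names, returning {k: N_k(u)} for k = 1..K):

```python
from collections import deque

def k_hop_neighbors(adj, u, K):
    """BFS from u; hops[k] is the set of vertices at exactly
    distance k from u, i.e. N_k(u)."""
    dist = {u: 0}
    hops = {k: set() for k in range(1, K + 1)}
    q = deque([u])
    while q:
        v = q.popleft()
        if dist[v] == K:        # no need to expand beyond K hops
            continue
        for w in adj[v]:
            if w not in dist:   # first visit gives the shortest distance
                dist[w] = dist[v] + 1
                hops[dist[w]].add(w)
                q.append(w)
    return hops
```

On a path A-B-C-D, the hop sets of A are N_1 = {B}, N_2 = {C}, N_3 = {D}.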
S203, obtaining the core value distribution vectors over the hop neighbor sets of a vertex:
Specifically, the core value distribution of the vertices in each hop neighbor set of u is counted and represented as a vector, which captures part of the structural information of u. For the vertices in N_k(u), a D-dimensional vector l_u^k represents the distribution of their core values, where D is the maximum core value of any vertex in the graph and l_u^k[i] is the number of vertices whose core value is i. Each l_u^k thus encodes the k-hop neighborhood structure information of u; the vectors obtained for the different values of k are integrated into a comprehensive vector c_u that represents the overall neighborhood structure information of u, and two vertices are considered structurally similar if their vectors are close in distance:

c_u = Σ_k δ_{k-1} · l_u^k

where δ ∈ (0, 1) is a coefficient controlling the influence of the neighbors at different hops on the comprehensive vector; by choosing decreasing δ values, vertices farther from vertex u contribute less to the structural information of u. For example, as shown in Fig. 3, the 1-hop neighbors of vertex A are B, C and E, i.e. N_1(A) = {B, C, E}. The core values of B and C are both 2 and the core value of E is 1, i.e. there are two vertices with core value 2 and one vertex with core value 1, so l_A^1 = (1, 2). The 2-hop neighbors of A are D and F, where D has core value 2 and F has core value 1, so l_A^2 = (1, 1). Integrating the two vectors with δ_0 = 1 and δ_1 = 0.5 gives c_A = 1·(1, 2) + 0.5·(1, 1) = (1.5, 2.5).
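The distribution vectors and their integration can be sketched as follows, reproducing the worked example for vertex A (only the core values and hop sets stated in the text are used; the full edge structure of Fig. 3 is not needed, and the function names are assumptions):

```python
def core_distribution(vertex_set, core, D):
    """l[t-1] = number of vertices in the set with core value t."""
    l = [0] * D
    for v in vertex_set:
        l[core[v] - 1] += 1
    return l

def combined_vector(hop_sets, core, D, deltas):
    """c_u = sum over k of delta_{k-1} * l_u^k (componentwise)."""
    c = [0.0] * D
    for k, vs in sorted(hop_sets.items()):
        l = core_distribution(vs, core, D)
        c = [ci + deltas[k - 1] * li for ci, li in zip(c, l)]
    return c
```

With core values B=2, C=2, E=1, D=2, F=1, hop sets N_1(A)={B,C,E}, N_2(A)={D,F}, D=2 and decay coefficients (1, 0.5), this yields c_A = [1.5, 2.5], matching the example above.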
S204, calculating the similarity between two vertices from their core value distribution vectors:
Specifically, the structure vectors c_u and c_v of two vertices u and v are compared through a similarity function s(u, v) that decreases with the Euclidean distance between them (one such choice is s(u, v) = e^(−‖c_u − c_v‖)). The larger the distance between c_u and c_v, the more dissimilar the structures of vertex u and vertex v; the smaller the distance, the more similar their structures, which in turn suggests that u and v may play similar roles and functions in the network.
S205, calculating the probability of walking from a vertex to each adjacent vertex, and obtaining vertex sequences by biased walks:
Specifically, a random walk model can be used to capture the topological structure of the graph. A random walk selects a vertex of the graph as its first step and moves to a neighboring vertex according to a preset transition probability; taking the vertex reached after each step as the new starting point, it again moves to a neighbor according to the preset transition probability, and so on until a preset stopping condition is met, yielding a number of random walk sequences for each vertex.
Specifically, in the present embodiment, the preset transition probability is defined as:

p(u, v) = s(u, v) / Σ_{w∈N(u)} s(u, w)

where s(u, v) is the structural similarity between vertex u and vertex v, Σ_{w∈N(u)} s(u, w) is the sum of the structural similarities between u and all of its neighbors, and p(u, v) is the probability of walking from u to v. The higher the structural similarity between u and v, the greater the probability of walking from u to v.
Specifically, the preset stopping condition of the random walk consists of the maximum walk length and the number of walk sequences generated per vertex.
Specifically, in one implementation of this embodiment, the random walks for a given vertex proceed as follows: first, starting from that vertex, move to a neighboring vertex according to the preset probability; second, taking the vertex reached as the new starting point, again move to a neighbor according to the preset probability, and so on until the preset condition is met, yielding several vertex sequences for that vertex.
For example, with 10 random walk sequences per vertex and a walk length of 80, sampling 10 walks from every vertex of the graph produces random vertex sequences of length 80; these sequences implicitly encode high-order adjacency relations between vertices.
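The similarity-biased walk of step S205 can be sketched as below. The exponential similarity is one plausible choice (the text only requires a function decreasing in the Euclidean distance between structure vectors), and all names are illustrative:

```python
import math
import random

def similarity(cu, cv):
    # s(u, v) decreasing in the distance between structure vectors;
    # exp(-distance) is an assumed concrete form.
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(cu, cv)))
    return math.exp(-d)

def transition_probs(u, adj, c):
    # p(u, v) = s(u, v) / sum over neighbors w of s(u, w).
    w = {v: similarity(c[u], c[v]) for v in adj[u]}
    total = sum(w.values())
    return {v: s / total for v, s in w.items()}

def biased_walk(start, adj, c, length, rng):
    # Repeatedly step to a neighbor with the preset transition
    # probability until the walk reaches the given length.
    walk = [start]
    while len(walk) < length:
        probs = transition_probs(walk[-1], adj, c)
        verts = list(probs)
        walk.append(rng.choices(verts, weights=[probs[v] for v in verts])[0])
    return walk
```

A neighbor whose structure vector matches the current vertex exactly receives a strictly larger transition probability than one at positive distance, which is exactly the bias the method intends.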
And S206, carrying out word vector training on the vertex sequence to generate an embedded vector of each vertex.
Specifically, after the random walk sequences of all vertices have been obtained in step S205, graph embedding is performed on the vertices with a word vector model, yielding the embedding vector of each vertex.
Specifically, in this embodiment, the embedding can be produced with a Skip-Gram model, which treats each "target word, context word" combination as a new observation and predicts the context words from the target word.
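The "target, context" observations that Skip-Gram trains on can be extracted from the walk sequences as sketched below (an illustrative helper; in practice the training itself would be delegated to a word-vector library such as gensim's Word2Vec with the skip-gram option):

```python
def skipgram_pairs(walks, window):
    """Every (target, context) combination within the given window
    of each walk is one Skip-Gram training observation."""
    pairs = []
    for walk in walks:
        for i, target in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((target, walk[j]))
    return pairs
```

For the single walk [a, b, c] and window 1, the observations are (a,b), (b,a), (b,c), (c,b).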
Embodiment 2. The flow of the graph-embedding-based community discovery method provided in this embodiment is shown in Fig. 1, taking k-means clustering as an example. Fig. 4 is a flowchart of this method, which comprises the following steps:
S401, obtaining the graph embedding vectors.
Specifically, the embedding vectors of the training sample data are obtained with the core-value-based method for generating graph embedding vectors of Embodiment 1.
S402, embedding vectors of a graph of a training sample as input data, and randomly selecting k vertex vectors as a clustering center by adopting a k-means clustering method.
Specifically, k vertex vectors are randomly selected as centers of the clusters, and the k centers respectively belong to k different communities.
And S403, distributing other vertex vectors to communities where the nearest clustering centers are located, and recalculating each clustering center.
Specifically, for each vertex vector in the data set, the distance between the vertex vector and each vertex vector serving as a cluster center is calculated, and the vertex vector is assigned to the community in which the cluster center closest to the vertex vector is located.
Specifically, the distance between vectors can be calculated with the Manhattan distance, the Euclidean distance, or another distance measure.
Specifically, all k cluster centers after completion of community allocation are recalculated.
S404, judging whether the distance between each new cluster center and the corresponding old one is smaller than a threshold Δ.
Given the threshold Δ, judge whether the distance between each recalculated cluster center and the original one is smaller than Δ. If so, the cluster centers have barely moved and the algorithm has converged; otherwise, repeat steps S402 to S404.
S405, outputting k communities.
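Steps S402 to S405 can be sketched as a plain k-means loop over the embedding vectors (a minimal illustration with assumed names; a library implementation would normally be used):

```python
import math
import random

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k, tol=1e-4, max_iter=100, seed=0):
    rng = random.Random(seed)
    # S402: k randomly chosen vectors serve as the initial centers.
    centers = rng.sample(vectors, k)
    for _ in range(max_iter):
        # S403: assign every vector to its nearest cluster center...
        clusters = [[] for _ in range(k)]
        for v in vectors:
            i = min(range(k), key=lambda j: euclid(v, centers[j]))
            clusters[i].append(v)
        # ...then recompute each center as the mean of its community.
        new_centers = [
            [sum(xs) / len(c) for xs in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        # S404: stop once no center moved more than the threshold.
        if max(euclid(a, b) for a, b in zip(centers, new_centers)) < tol:
            break
        centers = new_centers
    return clusters  # S405: the k communities
```

On two well-separated groups of points the loop converges in a few iterations to the expected two communities.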
Embodiment 3. The graph-embedding-based community discovery method provided in this embodiment takes KNN classification as an example. Fig. 5 is a flowchart of this method, which comprises the following steps:
s501, obtaining a graph embedding vector.
Specifically, the embedding vectors of the training sample data are obtained with the core-value-based method for generating graph embedding vectors of Embodiment 1.
S502, taking the graph embedding vectors of the training samples as input data and, with the KNN classification method, calculating the distance between a new vector and every vector in the training set.
Specifically, every vector in the training set is labelled, i.e. the community to which it belongs is known. For each new vector whose community is unknown, its distance to every vector in the training set is calculated; the Euclidean distance, the Manhattan distance or another distance measure may be used.
S503, selecting the K vectors at the smallest distance and counting the frequencies of their communities.
Specifically, the computed distances between the new vector and the training vectors are sorted in increasing order and the first K vectors are selected. The occurrence frequencies of the communities to which these K vectors belong are tallied, and the most frequent community is taken as the community of the new vector.
S504, determining the community of each new vertex vector.
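Steps S502 to S504 can be sketched as below (an illustrative helper with assumed names; the training set is taken to be a list of (vector, community) pairs, and Euclidean distance is used):

```python
import math
from collections import Counter

def knn_community(new_vec, train, K):
    """Return the most frequent community among the K training
    vectors nearest to new_vec."""
    def euclid(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # S502/S503: sort the labelled vectors by distance, keep the
    # first K, and tally the communities they belong to.
    nearest = sorted(train, key=lambda t: euclid(new_vec, t[0]))[:K]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

A new vector near the labelled 'c1' vectors is assigned to 'c1', and one coinciding with a 'c2' vector to 'c2'.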

Claims (7)

1. A method for generating graph embedding vectors, comprising:
acquiring a core value of a vertex;
acquiring neighborhood structure information of the vertexes, and calculating the similarity between the vertexes;
generating a vertex sequence based on the similarity of the vertex and the adjacent neighbor thereof;
and carrying out word vector training on the vertex sequence to generate an embedded vector of each vertex.
2. The method for generating graph embedding vectors according to claim 1, wherein the core value of a vertex is calculated by:
calculating the degrees of all vertices;
selecting a vertex of minimum degree, the core value of that vertex being its current degree;
and traversing the neighbors of the vertex selected in the previous step and, for every neighbor whose degree is greater than that of the selected vertex, subtracting 1 from the neighbor's degree.
3. The method for generating graph embedding vectors according to claim 2, wherein the similarity between vertices is calculated by:
obtaining the k-hop neighbor sets N_1(u), …, N_k(u) of the vertex u;
obtaining, for each set N_k(u), the core value distribution vector l_u^k of its vertices, where l_u^k[t] is the number of vertices in N_k(u) whose core value is t;
multiplying the vector of each hop neighbor set of u by a decay coefficient and summing them into an overall vector d_u; the larger the hop count, i.e. the larger k, the smaller the influence of that neighborhood information on the structure around the vertex, and therefore the smaller the decay coefficient;
and calculating the Euclidean distance between the vectors corresponding to the vertex u and the vertex v, and from it the similarity between the two vertices.
4. The method for generating graph embedding vectors according to claim 3, wherein the vertex sequences are generated by:
calculating the probability of walking from a vertex to each adjacent vertex according to the similarity between the vertices;
and generating vertex sequences with a random walk model, using the obtained probabilities as the preset transition probabilities.
5. A graph embedding-based community discovery method is characterized by comprising the following steps:
acquiring a training sample set;
obtaining a graph embedding vector corresponding to each training sample, wherein the graph embedding vector is generated by adopting the method of any one of claims 1-4;
and taking the graph embedding vector of the training sample as input data, and training a preset network model to obtain a community discovery model.
6. The graph-embedding-based community discovery method according to claim 5, wherein the preset network model is a clustering algorithm model, similar vertex vectors are summarized together, and a plurality of different cluster classes are obtained, wherein each cluster class represents a community.
7. The graph-embedding-based community discovery method according to claim 6, wherein, when the labels of the training samples are known, each vertex vector is assigned to a community, and the community to which a new vertex belongs is then predicted to complete community discovery.
CN202110079198.7A 2021-01-21 2021-01-21 Graph embedding vector generation method and graph embedding-based community discovery method Pending CN112765414A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110079198.7A CN112765414A (en) 2021-01-21 2021-01-21 Graph embedding vector generation method and graph embedding-based community discovery method


Publications (1)

Publication Number Publication Date
CN112765414A true CN112765414A (en) 2021-05-07

Family

ID=75702097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110079198.7A Pending CN112765414A (en) 2021-01-21 2021-01-21 Graph embedding vector generation method and graph embedding-based community discovery method

Country Status (1)

Country Link
CN (1) CN112765414A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115061836A (en) * 2022-08-16 2022-09-16 浙江大学滨海产业技术研究院 Micro-service splitting method based on graph embedding algorithm for interface layer
CN115061836B (en) * 2022-08-16 2022-11-08 浙江大学滨海产业技术研究院 Micro-service splitting method based on graph embedding algorithm for interface layer


Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210507)