WO2020078370A1 - Community search method - Google Patents

Publication number
WO2020078370A1
Authority
WO
WIPO (PCT)
Prior art keywords
community
node
nodes
search
result
Prior art date
Application number
PCT/CN2019/111419
Other languages
English (en)
Chinese (zh)
Inventor
王朝坤
竺俊超
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学
Publication of WO2020078370A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying

Definitions

  • the present disclosure relates to the field of search technology, for example, to a community search method.
  • a community generally refers to a subgraph whose internal nodes are more densely connected to one another than to nodes outside the subgraph.
  • the community structure excavated from the network helps people to recommend friends, identify criminal groups and predict protein functions.
  • Community search, also called local community discovery, refers to finding the community that contains one or more given nodes. Compared with global community discovery, it pays more attention to the local network structure and returns more personalized community results.
  • the current community search method is mainly based on k-clique, k-core, k-truss and other specific topological structures.
  • a community found by a k-truss-based method must satisfy the following properties: 1. every edge is contained in at least k-2 triangles; 2. any two edges can reach each other through a series of adjacent triangles.
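The first k-truss property (every edge lies in at least k-2 triangles) can be checked directly from adjacency sets. The following is a minimal sketch, not the method of the disclosure: the function names `edge_support` and `satisfies_truss_support` and the toy graph `clique4` are illustrative assumptions, and the second property (triangle connectivity) is not checked.

```python
def edge_support(adj):
    """Map each undirected edge to the number of triangles that contain it.

    `adj` maps every node to the set of its neighbours (assumed symmetric).
    """
    support = {}
    for u in adj:
        for v in adj[u]:
            edge = frozenset((u, v))
            if edge not in support:
                # Each common neighbour of u and v closes one triangle on (u, v).
                support[edge] = len(adj[u] & adj[v])
    return support


def satisfies_truss_support(adj, k):
    """Check property 1 only: every edge lies in at least k - 2 triangles."""
    return all(s >= k - 2 for s in edge_support(adj).values())


# Hypothetical toy graph: a clique on the four nodes 1, 2, 3, 4.
clique4 = {n: {1, 2, 3, 4} - {n} for n in (1, 2, 3, 4)}
```

In the toy clique every edge has two common neighbours, so the support condition holds for k = 4 and fails for k = 5.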
  • a typical method records the truss values of all edges adjacent to each node, organizes the neighbors of each node into a tree-structured index, called the TCP-index, according to the truss values of those edges, and finally, given a node and a value of k, repeatedly finds neighbor nodes in the index that can extend the community until no further extension is possible, which yields a k-truss community containing the given node.
  • a typical method that considers both the topological structure and the node attributes first adds edges to the original graph according to the attribute similarity between nodes, thereby constructing a TA-graph, and then conducts a k-truss-based community search on the TA-graph to obtain the community containing the given node.
  • a technical problem to be solved by the embodiments of the present disclosure is to provide a community search method to solve the problems in the prior art.
  • the community search method includes:
  • the nodes are mapped to node variables, and the corresponding search conditions are written;
  • a single-item community search is performed for each search term
  • the node is mapped to a node variable according to the user's demand for community search, and writing out the corresponding search conditions includes:
  • the nodes that are not allowed to appear in the community are decorated with logical negation, and the nodes that the community must contain are left undecorated;
  • the nodes that must appear in the community at the same time are connected with logical AND, and the nodes that the community must contain and the nodes that are not allowed to appear are also connected with logical AND;
  • when the community must contain at least one of several nodes, those nodes are connected with logical OR.
  • the conversion of the search condition into multiple search terms includes:
  • the performing a single-item community search for each search item includes:
  • the nodes that must be included in the community and the nodes that are not allowed to be included are organized into the necessary node set and the prohibited node set as the input of the single-condition community search process;
  • the extracted one or more common node variables are sorted into the necessary node set and the prohibited node set, according to whether they may appear in the community, as the input of the single-item community search process, and the remaining part is used to judge the output results;
  • the single-item community search uses the necessary node set and the prohibited node set to search the network graph for community results, so that the resulting community contains the nodes of the necessary node set and does not contain the nodes of the prohibited node set; it has three implementation methods, namely: community search after filtering, weighted filtering, and filtering while searching.
  • the filtered community search method includes:
  • the weighted filtering method includes:
  • the necessary node set is classified into the community result C;
  • a02: according to the given network graph, obtain the derived (induced) subgraph corresponding to the community result C. If the derived subgraph of the community result C has only one connected component and every node in the derived subgraph has a degree greater than or equal to the given threshold k, then stop and return the community result;
  • the nodes in the same component are grouped into the same group;
  • the neighbor nodes of all nodes in the community result C are classified into the candidate node set Candidate, and the nodes that already exist in the community result C are excluded;
  • a06: for each node in the candidate node set Candidate, record the number a of distinct connected components it is connected to, the number b of nodes in the community result C it is connected to, and the degree d of the node in the given network graph, and then sort the nodes in the candidate node set in multi-key descending order according to a, b, and d;
  • step a07: if the degree of the top-ranked candidate node c is less than the threshold k, remove node c from the candidate node set Candidate and go to step a06; otherwise, add node c to the community result, add the neighbor nodes of node c to the candidate node set Candidate, remove node c from the candidate node set Candidate, and go to step a02;
  • the nodes of the entire network graph are included in the community result C;
  • the performing community search on the new network graph with the necessary node set as input includes:
  • if the derived subgraph of the community result C has only one connected component and every node in the derived subgraph has a degree greater than or equal to the given threshold k, then stop and return the community result;
  • the nodes in the same component are grouped into the same group;
  • the neighbor nodes of all nodes in the community result C are classified into the candidate node set Candidate, and the nodes that already exist in the community result C are excluded;
  • step b07: if the degree of the top-ranked candidate node c is less than the threshold k, remove node c from the candidate node set Candidate and go to step b06; otherwise, add node c to the community result, add the neighbor nodes of node c to the candidate node set Candidate, remove node c from the candidate node set Candidate, and go to step b02;
  • the manner of searching while filtering includes:
  • the nodes in the same component are grouped into the same group;
  • the neighbor nodes of all nodes in the community result C are classified into the candidate node set Candidate, and the nodes and banned nodes that already exist in the community result C are excluded from it;
  • c06: for each node in the candidate node set Candidate, record the number a of distinct connected components it is connected to, the number b of nodes in the community result C it is connected to, and the number d of non-prohibited nodes it is connected to in the given network graph; then sort the nodes in the candidate node set in multi-key descending order according to a, b, and d;
  • step c07: if the degree of the top-ranked candidate node c is less than the threshold k, remove node c from the candidate node set Candidate and go to step c06; otherwise, add node c to the community result C, add the neighbor nodes of node c to the candidate node set Candidate, remove node c from the candidate node set Candidate, and go to step c02;
  • step c10: delete from the community result C the node whose degree in the derived subgraph of the community result C is lower than k. If the deleted node is a member of the necessary node set, stop and return the empty set; otherwise go to step c09.
  • the present disclosure has the following advantages:
  • the present disclosure proposes a community search method that expresses search conditions as unified Boolean expressions, which makes it convenient for users to express personalized search needs and facilitates community search under complex conditions; because the search process takes into account the nodes that the user does not want to appear in the community, the community results obtained better match the user's expectations and are more personalized; because the requirement that the community contain at least one of several given nodes is supported, one search condition may yield multiple different community results that all satisfy it, which gives the user a richer choice of results; and a variety of implementation methods are provided, which can be chosen according to actual needs.
  • FIG. 1 is a flowchart of an embodiment of a community search method of the present disclosure
  • FIG. 3 is a flowchart of another embodiment of the community search method of the present disclosure.
  • FIG. 6 is a flowchart of yet another embodiment of the community search method of the present disclosure.
  • FIG. 7 is a flowchart of yet another embodiment of the community search method of the present disclosure.
  • FIG. 8 is a flowchart of another embodiment of the community search method of the present disclosure.
  • FIG. 1 is a flowchart of an embodiment of a community search method of the present disclosure. As shown in FIG. 1, the community search method includes:
  • the node variable is also called Boolean variable, and the search condition is expressed by Boolean expression
  • FIG. 2 is a flowchart of another embodiment of the community search method of the present disclosure. As shown in FIG. 2, mapping the nodes to node variables according to the user's demand for community search and writing out the corresponding search conditions includes:
  • the nodes that are not allowed to appear in the community are decorated with logical negation, and the nodes that the community must contain are left undecorated.
  • the symbol of logical negation is "¬";
  • the nodes that must appear in the community at the same time are connected by logical AND, and the nodes that the community must contain and the nodes that are not allowed to appear are also connected by logical AND; the symbol of logical AND is "∧", for example: the Boolean expression A∧B∧¬C indicates that the user wants the community to include node A and node B and not to include node C;
  • when the community must contain at least one of several nodes, those nodes are connected by logical OR.
  • the symbol of logical OR is "∨", for example: the Boolean expression A∨B∨C indicates that the user wants the community to contain at least one of node A, node B, and node C.
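To make the Boolean search conditions concrete, the sketch below encodes a condition as a disjunction (list) of conjunctions, where each literal is a pair `(node, wanted)` and `wanted=False` stands for logical negation. This encoding and the names `term_holds` and `condition_holds` are assumptions made for illustration; the disclosure does not prescribe a data structure.

```python
def term_holds(community, term):
    """A term is a conjunction of literals (node, wanted).

    wanted=True  : the node must be in the community.
    wanted=False : the node must not be in the community (logical negation).
    """
    return all((node in community) == wanted for node, wanted in term)


def condition_holds(community, condition):
    """A condition is a disjunction (list) of terms."""
    return any(term_holds(community, term) for term in condition)


# "A AND B AND NOT C": one term with three literals.
cond_and = [[("A", True), ("B", True), ("C", False)]]

# "A OR B OR C": three terms with one literal each.
cond_or = [[("A", True)], [("B", True)], [("C", True)]]
```

With this encoding, a community {A, B, X} satisfies the first condition, while {A, B, C} does not because it contains the excluded node C.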
  • FIG. 3 is a flowchart of yet another embodiment of the community search method of the present disclosure. As shown in FIG. 3, the conversion of search conditions into multiple search terms includes:
  • each conjunction of the simplest AND-OR formula (minimal disjunctive normal form) is taken as a search term, for example: the simplest AND-OR formula (A∧B)∨(C∧D) contains two search terms, (A∧B) and (C∧D). If several search terms share the same node variable, they can be merged into a new search term, for example: the search terms (A∧B) and (A∧¬D) contain the same node variable A, so the common node variable A can be extracted and the two conjunctions merged into A∧(B∨¬D). This reduces the number of search terms, and therefore the number of subsequent single-item community searches, which saves time;
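The merging of search terms that share node variables can be sketched as factoring the common literals out of a group of conjunctions. This is an illustrative sketch only: the literal encoding `(node, wanted)`, the function name `merge_terms`, and the second example term "A AND NOT D" (inferred from the later mention of node B and node D) are assumptions.

```python
def merge_terms(terms):
    """Factor the literals shared by every term out of a group of terms.

    Each term is a collection of literals (node, wanted). Returns
    (common, remainders): `common` is the shared conjunction and
    `remainders` lists what is left of each term, read as a disjunction.
    """
    terms = [frozenset(t) for t in terms]
    common = frozenset.intersection(*terms)
    remainders = [t - common for t in terms]
    return common, remainders


# (A AND B) OR (A AND NOT D)  ->  A AND (B OR NOT D)
t1 = {("A", True), ("B", True)}
t2 = {("A", True), ("D", False)}
common, remainders = merge_terms([t1, t2])
```

After merging, only one single-item search (for the common part) is needed instead of two, and the remainders are checked against each result afterwards.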
  • FIG. 4 is a flowchart of yet another embodiment of the community search method of the present disclosure. As shown in FIG. 4, performing a single-item community search for each search item includes:
  • the nodes that must be included in the community and the nodes that are not allowed to be included are organized into the necessary node set and the prohibited node set as the input of the single-item community search process, because the search term contains only nodes that must appear in the community and nodes that are not allowed to appear;
  • the extracted one or more common node variables are organized into the necessary node set and the prohibited node set, according to whether they may appear in the community, as the input of the single-item community search process, and the remaining part is used as a discriminant on the output results, for example: for the search term A∧(B∨¬D) obtained by merging two search terms,
  • the necessary node set {A} is used as the input of the single-item community search process, that is, to find the communities containing node A, and
  • the remaining part B∨¬D is used as the discriminant on the output, that is, to determine whether a community result contains node B or does not contain node D;
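The split of a merged search term into search inputs and an output discriminant can be sketched as follows. The literal encoding `(node, wanted)` and the names `split_term` and `passes_discriminant` are hypothetical; the example values mirror the case of a necessary node set containing only node A and a discriminant that accepts communities containing node B or lacking node D.

```python
def split_term(common):
    """Split a conjunction of literals into (necessary, prohibited) node sets."""
    necessary = {node for node, wanted in common if wanted}
    prohibited = {node for node, wanted in common if not wanted}
    return necessary, prohibited


def passes_discriminant(community, remainders):
    """True if the community satisfies at least one leftover conjunction."""
    return any(
        all((node in community) == wanted for node, wanted in rem)
        for rem in remainders
    )


common = {("A", True)}                         # shared part of the merged term
remainders = [{("B", True)}, {("D", False)}]   # "B OR NOT D"
necessary, prohibited = split_term(common)
```

A single-item search would then be run with `necessary` and `prohibited` as input, and each community it returns would be kept only if `passes_discriminant` accepts it.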
  • the single-item community search described above uses the necessary node set and the prohibited node set to search the network graph for community results, so that the resulting community contains the nodes of the necessary node set and does not contain the nodes of the prohibited node set; it has three implementation methods, namely: community search after filtering, weighted filtering, and filtering while searching.
  • the filtered community search method includes:
  • FIG. 5 is a flowchart of another embodiment of the community search method of the present disclosure. As shown in FIG. 5, the weighted filtering method includes:
  • FIG. 6 is a flowchart of another embodiment of the community search method of the present disclosure. As shown in FIG. 6, performing community search on the new network graph using the necessary node set as input includes:
  • step 507: if the degree of the top-ranked candidate node c is less than the threshold k, remove node c from the candidate node set Candidate and go to step 506; otherwise, add node c to the community result, add the neighbor nodes of node c to the candidate node set Candidate, remove node c from the candidate node set Candidate, and go to step 502;
  • step 510: delete from the community result C the node whose degree in the derived subgraph of the community result C is lower than k. If the deleted node is a member of the necessary node set, stop and return the empty set; otherwise go to step 509.
  • FIG. 7 is a flowchart of another embodiment of the community search method of the present disclosure. As shown in FIG. 7, the performing community search on the new network graph using the necessary node set as input includes:
  • the necessary node set is classified into the community result C;
  • the nodes in the same component are grouped into the same group;
  • the node c is removed from the candidate node set Candidate, go to step 606; otherwise, node c is added to the community result, the neighbor nodes of node c are added to the candidate node set Candidate, node c is removed from the candidate node set Candidate, and go to step 602;
  • step 609: delete from the community result C the node whose degree in the derived subgraph of the community result C is lower than k. If the deleted node is a member of the necessary node set, stop and return the empty set; otherwise go to step 609.
  • FIG. 8 is a flowchart of another embodiment of the community search method of the present disclosure. As shown in FIG. 8, the manner of searching while filtering includes:
  • the necessary node set is classified into the community result C;
  • the nodes in the same component are grouped into the same group;
  • the node c is removed from the candidate node set Candidate, go to step 706; otherwise, node c is added to the community result C, the neighbor nodes of node c are added to the candidate node set Candidate, and node c is removed from the candidate node set Candidate;
  • step 710: delete the node whose degree in the derived subgraph of the community result C is lower than k. If the deleted node is a member of the necessary node set, stop and return the empty set; otherwise go to step 709.
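The filtering-while-searching steps can be approximated by the greedy sketch below. It is a simplified reading of the steps, not the disclosed procedure: several steps are not fully specified in this text, so the stopping rule when candidates run out, the `rejected` set, the final connectivity check, and all function and variable names are assumptions. The graph is given as a dictionary of neighbour sets.

```python
def components(adj, nodes):
    """Connected components of the subgraph of `adj` induced by `nodes`."""
    nodes, seen, comps = set(nodes), set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend((adj[n] & nodes) - comp)
        seen |= comp
        comps.append(comp)
    return comps


def search_while_filtering(adj, necessary, prohibited, k):
    """Grow community C from `necessary` without ever using `prohibited` nodes."""
    necessary = set(necessary)
    allowed = set(adj) - set(prohibited)
    if not necessary <= allowed:
        return set()
    community, rejected = set(necessary), set()
    while True:
        comps = components(adj, community)
        # Stop when C is connected and every node has degree >= k inside C.
        if len(comps) == 1 and all(len(adj[n] & community) >= k for n in community):
            return community
        # Candidates: non-prohibited neighbours of C not yet in C or rejected.
        candidates = set().union(*(adj[n] for n in community)) & allowed
        candidates -= community | rejected
        if not candidates:
            break

        def rank(c):
            a = sum(1 for comp in comps if adj[c] & comp)  # components touched
            b = len(adj[c] & community)                    # neighbours inside C
            d = len(adj[c] & allowed)                      # non-prohibited degree
            return (a, b, d)

        top = max(candidates, key=rank)
        if len(adj[top] & allowed) < k:
            rejected.add(top)      # degree too low: drop the candidate
        else:
            community.add(top)
    # Assumed fallback: peel nodes whose degree inside C is below k.
    while True:
        low = {n for n in community if len(adj[n] & community) < k}
        if not low:
            break
        if low & necessary:
            return set()           # a necessary node cannot be kept
        community -= low
    return community if len(components(adj, community)) == 1 else set()


# Hypothetical toy graph: clique {1, 2, 3, 4}, node 5 tied to 1, 2, 3, node 6 to 1.
toy = {1: {2, 3, 4, 5, 6}, 2: {1, 3, 4, 5}, 3: {1, 2, 4, 5},
       4: {1, 2, 3}, 5: {1, 2, 3}, 6: {1}}
```

On the toy graph, searching from node 1 with node 5 prohibited and k = 3 grows the community to the clique {1, 2, 3, 4}; raising k to 4 leaves no admissible candidate, so the empty set is returned.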

Abstract

According to the invention, a community search method comprises: mapping a node to a node variable according to a user's community search requirement, and writing out a corresponding search condition (10); converting the search condition into multiple search terms (20); performing a single-condition community search for each search term (30); and combining the community search results of each single condition, obtaining a combination of the community results, and returning this combination (40). In the method, all search conditions are converted into Boolean expressions, so that a user can easily express a personalized search requirement and perform community searches under complex conditions. The method takes into account nodes that a user wishes to exclude from the community, and thus better meets the user's needs. Since a community is allowed to contain at least one of several given nodes, one search condition can lead to multiple different community results that satisfy the search condition, offering the user more options when selecting results. Several different implementation methods are provided, which can be chosen according to actual requirements.
PCT/CN2019/111419 2018-10-16 2019-10-16 Community search method WO2020078370A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811205006.7A CN109543077B (zh) 2018-10-16 2018-10-16 社区搜索方法
CN201811205006.7 2018-10-16

Publications (1)

Publication Number Publication Date
WO2020078370A1 (fr) 2020-04-23

Family

ID=65843813

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/111419 WO2020078370A1 (fr) 2018-10-16 2019-10-16 Community search method

Country Status (2)

Country Link
CN (1) CN109543077B (fr)
WO (1) WO2020078370A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543077B (zh) * 2018-10-16 2020-07-31 清华大学 社区搜索方法
CN113254797B (zh) * 2021-04-19 2022-09-20 江汉大学 一种社交网络社区的搜索方法、装置以及处理设备
CN116485587B (zh) * 2023-04-21 2024-04-09 深圳润高智慧产业有限公司 社区服务获取方法与提供方法、电子设备、存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120317142A1 (en) * 2009-09-11 2012-12-13 University Of Maryland, College Park Systmen and method for data management in large data networks
CN104636978A (zh) * 2015-02-12 2015-05-20 西安电子科技大学 一种基于多标签传播的重叠社区检测方法
CN106796611A (zh) * 2014-08-29 2017-05-31 邻客音公司 用于生成搜索查询的用户接口
CN109543077A (zh) * 2018-10-16 2019-03-29 清华大学 社区搜索方法

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170032044A1 (en) * 2006-11-14 2017-02-02 Paul Vincent Hayes System and Method for Personalized Search While Maintaining Searcher Privacy
KR20130098772A (ko) * 2012-02-28 2013-09-05 삼성전자주식회사 토픽 기반 커뮤니티 인덱스 생성장치, 토픽 기반 커뮤니티 검색장치, 토픽 기반 커뮤니티 인덱스 생성방법 및 토픽 기반 커뮤니티 검색방법
CN103425662B (zh) * 2012-05-16 2017-08-25 腾讯科技(深圳)有限公司 一种网络社区中的信息搜索方法和装置
US9461876B2 (en) * 2012-08-29 2016-10-04 Loci System and method for fuzzy concept mapping, voting ontology crowd sourcing, and technology prediction
US9652875B2 (en) * 2012-10-29 2017-05-16 Yahoo! Inc. Systems and methods for generating a dense graph
CN105224555B (zh) * 2014-06-12 2019-12-10 北京搜狗科技发展有限公司 一种搜索的方法、装置和系统
JP6332243B2 (ja) * 2015-11-18 2018-05-30 カシオ計算機株式会社 情報処理装置、電子機器及びプログラム
JP6697247B2 (ja) * 2015-11-18 2020-05-20 カシオ計算機株式会社 情報処理装置、プログラム、及び検索表示方法
CN106530039A (zh) * 2016-10-26 2017-03-22 深圳市亿家信息科技有限公司 一种智能化社区的数据处理实现方法及系统
CN108268603A (zh) * 2017-12-22 2018-07-10 中国电子科技集团公司第三十研究所 一种基于核心成员识别的社区发现方法
CN108319728A (zh) * 2018-03-15 2018-07-24 深圳大学 一种基于k-star的频繁社区搜索方法及系统

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120317142A1 (en) * 2009-09-11 2012-12-13 University Of Maryland, College Park Systmen and method for data management in large data networks
CN106796611A (zh) * 2014-08-29 2017-05-31 邻客音公司 用于生成搜索查询的用户接口
CN104636978A (zh) * 2015-02-12 2015-05-20 西安电子科技大学 一种基于多标签传播的重叠社区检测方法
CN109543077A (zh) * 2018-10-16 2019-03-29 清华大学 社区搜索方法

Also Published As

Publication number Publication date
CN109543077B (zh) 2020-07-31
CN109543077A (zh) 2019-03-29

Similar Documents

Publication Publication Date Title
WO2020078370A1 (fr) Community search method
US6556710B2 (en) Image searching techniques
CN112215837B (zh) 多属性图像语义分析方法和装置
JP2006018829A (ja) 自動分類生成
Mishra et al. Far efficient K-means clustering algorithm
US10135723B2 (en) System and method for supervised network clustering
CN105868366B (zh) 基于概念关联的概念空间导航方法
Huang et al. A link density clustering algorithm based on automatically selecting density peaks for overlapping community detection
Fahim A clustering algorithm based on local density of points
Petkos et al. Graph-based multimodal clustering for social multimedia
CN111061763A (zh) 用于生成规则引擎的规则执行计划的方法及装置
CN112214684B (zh) 一种种子扩展的重叠社区发现方法及装置
JP2003256427A (ja) 画像検索装置
Zhang et al. Selecting the optimal groups: Efficiently computing skyline k-cliques
WO2023206960A1 (fr) Procédé et appareil de recommandation de produit reposant sur un filtrage basé sur le contenu et un filtrage collaboratif, et dispositif informatique
WO2023178767A1 (fr) Procédé et appareil de détection de risque d'entreprise basés sur un graphe de connaissances de mégadonnées d'enquête de solvabilité d'entreprise
Mohotti et al. An efficient ranking-centered density-based document clustering method
KR100427603B1 (ko) 데이터 분류체계 구축방법
Bouanaka et al. An approach for an optimized web service selection based on skyline
US6671402B1 (en) Representing an image with weighted joint histogram
Min et al. Optimal sub-reducts in the dynamic environment
Vu et al. Density-based clustering with side information and active learning
Nguyen Scalable classification method based on rough sets
Pratibha et al. A context-aware recommender engine for smart kitchen
Peng et al. RBPR: Role-based Bayesian Personalized Ranking for Heterogeneous One-Class Collaborative Filtering.

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19873794

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19873794

Country of ref document: EP

Kind code of ref document: A1