WO2023058151A1 - Subgraph matching device, subgraph matching method, and program - Google Patents

Subgraph matching device, subgraph matching method, and program

Info

Publication number
WO2023058151A1
Authority
WO
WIPO (PCT)
Prior art keywords
vertex
vertices
search
reserved
function
Prior art date
Application number
PCT/JP2021/036970
Other languages
English (en)
Japanese (ja)
Inventor
淳也 新井
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to PCT/JP2021/036970
Publication of WO2023058151A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 - Subject matter not provided for in other groups of this subclass

Definitions

  • the present invention relates to subgraph matching technology.
  • as a method of solving the subgraph matching problem, which is one of the problems in graph theory, there is a backtracking-based search algorithm described in Non-Patent Document 1.
  • the subgraph matching problem and the function Search, which represents the search algorithm of Non-Patent Document 1, will be described in detail below.
  • the two graphs that are the inputs to the subgraph matching problem are both simple undirected graphs with labels at their vertices.
  • the vertex v ∈ V_G of the data graph G is also referred to as the data vertex
  • the vertex u ∈ V_Q of the query graph Q is also referred to as the query vertex.
  • V_G = {v_0, v_1, ..., v_{m-1}}.
  • the query vertex numbers are assigned in the order of connection. That is, query vertices other than query vertex u_0, that is, query vertices u_1, .
  • An embedding M: V_Q → V_G that satisfies the following three constraints is called an isomorphic embedding.
  • (1) Label constraint: for any u_i ∈ V_Q, L_Q(u_i) = L_G(M[u_i]).
  • (2) Edge constraint: for any (u_i, u_j) ∈ E_Q, (M[u_i], M[u_j]) ∈ E_G.
  • (3) Injective constraint: for any u_i, u_j ∈ V_Q, if i ≠ j then M[u_i] ≠ M[u_j].
  • the embedding M: V_Q → V_G may also be expressed as a binary relation M ⊆ V_Q × V_G.
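The three constraints above can be checked mechanically. The following Python sketch assumes a representation that is not part of the source: each graph is a pair of dictionaries (vertex labels, adjacency sets), and an embedding M is a dictionary from query vertices to data vertices. The function name and the small example graphs are hypothetical.

```python
from typing import Dict, Set, Tuple

# Assumed representation (not from the source): (labels, adjacency sets).
Graph = Tuple[Dict[int, str], Dict[int, Set[int]]]


def is_isomorphic_embedding(M: Dict[int, int], query: Graph, data: Graph) -> bool:
    """Check the label, edge, and injective constraints for a full embedding M."""
    q_labels, q_adj = query
    g_labels, g_adj = data
    # M must assign every query vertex.
    if set(M) != set(q_labels):
        return False
    # (1) Label constraint: L_Q(u_i) = L_G(M[u_i]).
    if any(q_labels[u] != g_labels[M[u]] for u in M):
        return False
    # (2) Edge constraint: every query edge maps onto a data edge.
    for u in q_adj:
        for w in q_adj[u]:
            if M[w] not in g_adj[M[u]]:
                return False
    # (3) Injective constraint: no data vertex is used twice.
    return len(set(M.values())) == len(M)


# Hypothetical example: both graphs are paths labelled a - b - a.
query: Graph = ({0: "a", 1: "b", 2: "a"}, {0: {1}, 1: {0, 2}, 2: {1}})
data: Graph = ({0: "a", 1: "b", 2: "a"}, {0: {1}, 1: {0, 2}, 2: {1}})

ok = is_isomorphic_embedding({0: 0, 1: 1, 2: 2}, query, data)
# Labels and edges are fine here, but u_0 and u_2 share data vertex 0.
clash = is_isomorphic_embedding({0: 0, 1: 1, 2: 0}, query, data)
```

In the second call only the injective constraint fails, which is the kind of violation the rest of the description is concerned with.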
  • M[u_0] = v_2
  • V_Q[:i] represents a subset {u_j
  • N(v) represents the set of neighboring vertices of vertex v.
  • N^-(u_i) represents the forward adjacent vertex set {u_j ∈ N(u_i)
  • N^+(u_i) represents the backward adjacent vertex set {u_j ∈ N(u_i)
  • d(v) represents the degree of vertex v (i.e., the number of vertices adjacent to vertex v).
  • ran(M) represents the range of the embedding M, that is, the set of vertices v for which there exists u_i that satisfies M[u_i] = v.
  • C(u_k) represents the set of vertices (hereinafter referred to as candidate vertices) to which vertex u_k ∈ V_Q can be assigned.
  • LDF: label and degree filtering
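The source names LDF without spelling out the rule. A minimal sketch, assuming the usual meaning of label and degree filtering (a data vertex v is a candidate for u_k when it has the same label and d(v) ≥ d(u_k)); this rule, the function name, and the example graphs are assumptions, not text from the source.

```python
from typing import Dict, Set, Tuple

# Assumed representation (not from the source): (labels, adjacency sets).
Graph = Tuple[Dict[int, str], Dict[int, Set[int]]]


def ldf_candidates(query: Graph, data: Graph) -> Dict[int, Set[int]]:
    """Return C(u) for every query vertex u under the assumed LDF rule."""
    q_labels, q_adj = query
    g_labels, g_adj = data
    return {
        u: {
            v
            for v in g_labels
            # Same label, and degree at least that of the query vertex.
            if g_labels[v] == q_labels[u] and len(g_adj[v]) >= len(q_adj[u])
        }
        for u in q_labels
    }


# Hypothetical example: query path a - b - a; data path a - b - a - b.
query: Graph = ({0: "a", 1: "b", 2: "a"}, {0: {1}, 1: {0, 2}, 2: {1}})
data: Graph = (
    {0: "a", 1: "b", 2: "a", 3: "b"},
    {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}},
)

C = ldf_candidates(query, data)
```

Data vertex 3 carries label b but has degree 1, so it is filtered out of C(u_1), whose query vertex has degree 2.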
  • the search for isomorphic embeddings by the function Search(M) will be described below.
  • the search is started by calling the function Search with the argument M, which represents the embedding, set to the empty set ∅.
  • the algorithm of the function Search(M) checks whether the embedding M satisfies the injective constraint on the 2nd line, the label constraint on the 6th line, and the edge constraint on the 7th line.
  • from Search(∅), Search({(u_0, v)}) (where v ∈ C(u_0)) is called recursively. Thereafter, Search({(u_0, v)}) is performed for all v ∈ C(u_0), and all isomorphic embeddings are reported.
  • Search({(u_0, v_1), (u_1, v_2)}) is executed first of the two function calls. After it finishes, Search({(u_0, v_1), (u_1, v_3)}) is executed.
  • the execution process of the function Search can be expressed as a search tree shown in FIG.
  • the node v_a at height u_i of the search tree represents the addition of (u_i, v_a) to M, and the parent-child relationship between nodes represents a recursive call of the function Search (see line 8 of the algorithm). An x at the upper left of a node indicates that the search was discontinued without reporting an embedding.
  • for example, the node v_2 on the right side at height u_3 has an x at its upper left: Search({(u_0, v_1), (u_1, v_2), (u_2, v_4), (u_3, v_2)}) terminated because the two vertices u_1 and u_3 are both assigned to vertex v_2, which violates the injective constraint (see line 2 of the algorithm).
  • the algorithm reports only the set {(u_0, v_1), (u_1, v_3), (u_2, v_4), (u_3, v_2), (u_4, v_0)} as an isomorphic embedding.
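The backtracking search described above can be sketched as follows. This is not the algorithm listing of Non-Patent Document 1, which is not reproduced in the text; it is a minimal Python rendering of the steps named here (injectivity check on entry, candidates standing in for the label check, edge check against already assigned neighbours, then the recursive call). The list encoding of M, the call counter, and the example graphs are assumptions.

```python
from typing import Dict, List, Set


def search_baseline(
    M: List[int],
    q_adj: Dict[int, Set[int]],
    g_adj: Dict[int, Set[int]],
    candidates: Dict[int, Set[int]],
    found: List[Dict[int, int]],
    stats: Dict[str, int],
) -> None:
    """M[i] is the data vertex assigned to query vertex u_i (assumed encoding)."""
    stats["calls"] += 1
    # Injective constraint, checked only after the call has been made.
    if len(set(M)) != len(M):
        return
    k = len(M)
    if k == len(q_adj):
        found.append(dict(enumerate(M)))  # report a full embedding
        return
    for v in sorted(candidates[k]):  # label constraint via C(u_k)
        # Edge constraint against already assigned neighbours of u_k.
        if all(v in g_adj[M[i]] for i in q_adj[k] if i < k):
            search_baseline(M + [v], q_adj, g_adj, candidates, found, stats)


# Hypothetical example: query path a - b - a; data path a - b - a - b.
q_adj = {0: {1}, 1: {0, 2}, 2: {1}}
g_adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
candidates = {0: {0, 2}, 1: {1}, 2: {0, 2}}  # assumed to be precomputed

found: List[Dict[int, int]] = []
stats = {"calls": 0}
search_baseline([], q_adj, g_adj, candidates, found, stats)
```

On this small input the search makes nine calls and reports two embeddings; two of the nine calls exist only to discover that a data vertex was reused.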
  • as can be seen in FIG. 3, the search algorithm of Non-Patent Document 1 generates many recursive calls that cannot find isomorphic embeddings. As a result, the calculation time increases.
  • the purpose of the present invention is to provide a technique for efficiently solving the subgraph matching problem.
  • FIG. 1 is a diagram showing an example of a data graph G;
  • FIG. 2 is a diagram showing an example of a query graph Q;
  • FIG. 3 is a diagram showing an example of a search tree according to the prior art;
  • FIG. 4 is a diagram showing a candidate vertex set and a reserved vertex set in a subgraph matching problem with the data graph G in FIG. 1 and the query graph Q in FIG. 2 as inputs;
  • FIG. 5 is a diagram showing an example of a search tree according to the invention;
  • FIG. 6 is a block diagram showing the configuration of a subgraph matching device 100;
  • FIG. 7 is a flowchart showing the operation of the subgraph matching device 100;
  • FIG. 8 is a diagram showing an example of the functional configuration of a computer that implements each device.
  • ^ (caret) represents a superscript, and _ (underscore) represents a subscript.
  • x^{y_z} means that y_z is a superscript to x.
  • x_{y_z} means that y_z is a subscript to x.
  • the central idea of the present invention is to prune recursive calls that will cause future injective constraint violations by analyzing the connections between candidate vertices. This is based on the following observation. As an example, focus on the recursive calls made from the function call Search({(u_0, v_1), (u_1, v_2)}) in the search tree of FIG. Execution of these recursive calls is terminated by a violation of the injective constraint, because each call reuses either vertex v_1 or vertex v_2 in its argument. The present invention reduces the number of recursive calls by clarifying the conditions under which such searches are terminated.
  • the reserved vertex set for (u_k, v) is a set of data vertices that are always used as assignment targets for the remaining query vertices when (u_k, v) is included in the embedding. Specifically, it is defined as follows.
  • the reserved vertex set may contain redundant vertices.
  • V_G is a reserved vertex set for any (u_k, v) where v ∈ C(u_k).
  • pruning of recursive calls is performed when R[u_k, v] ⊆ ran(M), so the reserved vertex set R[u_k, v] is preferably a set with a small number of elements.
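The pruning condition R[u_k, v] ⊆ ran(M) is a plain subset test, which also shows why small reserved vertex sets are preferable. In the sketch below, the function name, the dictionary encoding of M, and the example values are assumptions made for illustration.

```python
from typing import Dict, Set


def should_prune(reserved: Set[int], M: Dict[int, int]) -> bool:
    """Return True when the reserved vertex set is contained in ran(M)."""
    ran_M = set(M.values())
    return reserved <= ran_M  # subset test R ⊆ ran(M)


# Hypothetical state: data vertices V_G = {0, 1, 2, 3}; u_0 -> 0 and u_1 -> 1.
V_G = {0, 1, 2, 3}
M = {0: 0, 1: 1}

small = should_prune({0}, M)       # a one-element set is already covered
mixed = should_prune({0, 2}, M)    # vertex 2 is still free, so no pruning
trivial = should_prune(V_G, M)     # V_G is valid but not yet covered
```

The trivial reserved set V_G is only covered once every data vertex has been used, so it gives almost no pruning, whereas a one-element set triggers as soon as that vertex appears in the embedding.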
  • lemma suggests an upper bound on the number of vertices.
  • FIG. 4 shows, using one table for each vertex u_k, the candidate vertex set C(u_k) and the corresponding reserved vertex set.
  • the first row of the table represents the candidate vertex set.
  • the second row of the table represents the corresponding reserved vertex set.
  • edges between tables represent adjacency relationships between candidate vertices. For example, vertex v_3 is adjacent to vertex v_4, so the second column of the table for vertex u_1 is connected to the first column of the table for vertex u_2.
  • R[u_4, v_1] = {v_1}.
  • V_Q[:3] contains only one vertex with label a (specifically, vertex u_0), so R[u_3, v_2] satisfies condition (1), and R[u_3, v_2] is overwritten with {v_2}.
  • Algorithm 3: Search for isomorphic embeddings
  • the search by Algorithm 3 enables efficient subgraph matching.
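The listing of Algorithm 3 did not survive in this text, so the following is only a sketch of the search as the abstract summarizes it: set k = |M|, stop if M is not injective, add M to S when k = n, and otherwise recurse on each v ∈ C(u_k) that is adjacent to the images of the already assigned neighbours of u_k and whose reserved vertex set is not contained in ran(M). How the reserved vertex sets are computed is not shown here, so the sketch uses the placeholder R[u_k, v] = {v}; that choice, the list encoding of M, the call counter, and the example graphs are all assumptions.

```python
from typing import Dict, List, Set


def search_pruned(
    M: List[int],
    q_adj: Dict[int, Set[int]],
    g_adj: Dict[int, Set[int]],
    C: Dict[int, Set[int]],
    R: Dict[int, Dict[int, Set[int]]],
    S: List[Dict[int, int]],
    stats: Dict[str, int],
) -> None:
    """M[i] is the data vertex assigned to query vertex u_i (assumed encoding)."""
    stats["calls"] += 1
    k = len(M)
    used = set(M)  # ran(M)
    if len(used) != k:  # M is not injective
        return
    if k == len(q_adj):
        S.append(dict(enumerate(M)))  # add the full embedding to S
        return
    for v in sorted(C[k]):
        adjacent = all(v in g_adj[M[i]] for i in q_adj[k] if i < k)
        # Recurse only if the reserved set is NOT contained in ran(M).
        if adjacent and not (R[k][v] <= used):
            search_pruned(M + [v], q_adj, g_adj, C, R, S, stats)


# Hypothetical example: query path a - b - a; data path a - b - a - b.
q_adj = {0: {1}, 1: {0, 2}, 2: {1}}
g_adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
C = {0: {0, 2}, 1: {1}, 2: {0, 2}}  # assumed to be precomputed
# Placeholder reserved sets: {v} is covered exactly when v is already used.
R = {k: {v: {v} for v in C[k]} for k in C}

S: List[Dict[int, int]] = []
stats = {"calls": 0}
for v0 in sorted(C[0]):  # Search({(u_0, v)}) for each v in C(u_0)
    search_pruned([v0], q_adj, g_adj, C, R, S, stats)
```

With the placeholder reserved sets, this run makes six calls and finds both embeddings; the calls that would reuse a data vertex are never issued. Smaller or better-informed reserved sets would only change the contents of R, not the search function.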
  • FIG. 6 is a block diagram showing the configuration of the subgraph matching device 100.
  • FIG. 7 is a flow chart showing the operation of the subgraph matching device 100.
  • subgraph matching device 100 includes initialization unit 110, first set calculation unit 120, second set calculation unit 130, search unit 140, output unit 150, and recording unit 190.
  • the recording unit 190 is a component that appropriately records information necessary for the processing of the subgraph matching device 100.
  • the initialization unit 110 performs initialization. For example, it sets the empty set ∅ as the initial value of the set S that will contain all isomorphic embeddings.
  • L Q (u i )
  • L G (v') ⁇ ⁇
  • (where V Q [:k] ⁇ u j
  • the function Search(M) has an assignment part that assigns
  • the output unit 150 outputs the set S calculated at S140.
  • FIG. 8 is a diagram showing an example of a functional configuration of a computer 2000 that implements each of the devices described above.
  • the processing in each device described above can be performed by causing the recording unit 2020 to read a program for causing the computer 2000 to function as each device described above, and causing the control unit 2010, the input unit 2030, the output unit 2040, and the like to operate.
  • the apparatus of the present invention includes, for example, as a single hardware entity: an input unit to which a keyboard or the like can be connected; an output unit to which a liquid crystal display or the like can be connected; a communication unit to which a communication device (for example, a communication cable) capable of communicating with the outside of the hardware entity can be connected; a CPU (Central Processing Unit); RAM and ROM as memory; an external storage device such as a hard disk; and a bus that connects the input unit, output unit, communication unit, CPU, RAM, ROM, and external storage device so that data can be exchanged among them.
  • the hardware entity may be provided with a device (drive) capable of reading and writing a recording medium such as a CD-ROM.
  • a physical entity with such hardware resources includes a general purpose computer.
  • the external storage device of the hardware entity stores the program necessary for realizing the functions described above and the data required for the processing of this program (storage is not limited to the external storage device; for example, the program may be stored in a ROM, which is a read-only storage device). Data obtained by the processing of these programs is stored as appropriate in the RAM, the external storage device, or the like.
  • in the hardware entity, each program stored in the external storage device (or ROM, etc.) and the data necessary for processing each program are read into memory as needed, and are interpreted, executed, and processed by the CPU as appropriate. As a result, the CPU realizes predetermined functions (the structural units referred to above as ... unit, ... means, and so on).
  • a program that describes this process can be recorded on a computer-readable recording medium.
  • Any computer-readable recording medium may be used, for example, a magnetic recording device, an optical disk, a magneto-optical recording medium, a semiconductor memory, or the like.
  • specifically, for example, a hard disk device, flexible disk, magnetic tape, or the like can be used as the magnetic recording device; a DVD (Digital Versatile Disc), DVD-RAM (Random Access Memory), CD-ROM (Compact Disc Read Only Memory), CD-R (Recordable) / RW (ReWritable), or the like as the optical disc; an MO (Magneto-Optical disc) or the like as the magneto-optical recording medium; and an EEP-ROM (Electronically Erasable and Programmable-Read Only Memory) or the like as the semiconductor memory.
  • distribution of this program is carried out, for example, by selling, transferring, or lending portable recording media such as DVDs and CD-ROMs on which the program is recorded.
  • the program may be distributed by storing the program in the storage device of the server computer and transferring the program from the server computer to other computers via the network.
  • a computer that executes such a program, for example, first stores the program recorded on a portable recording medium, or the program transferred from the server computer, in its own storage device. When executing the processing, the computer reads the program stored in its own storage device and executes processing according to the read program. As another execution form of this program, the computer may read the program directly from the portable recording medium and execute processing according to it, or, each time the program is transferred from the server computer to this computer, it may sequentially execute processing according to the received program. The above-described processing may also be executed by a so-called ASP (Application Service Provider) type service, which does not transfer the program from the server computer to this computer and realizes the processing functions only through execution instructions and result acquisition. Note that the program in this embodiment includes information that is used for processing by a computer and that is equivalent to a program (data that is not a direct instruction to the computer but has the property of prescribing the processing of the computer, etc.).
  • a hardware entity is configured by executing a predetermined program on a computer, but at least part of these processing contents may be implemented by hardware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Provided is a technique for efficiently solving a subgraph matching problem. The subgraph matching device comprises: a first set calculation unit that calculates a candidate vertex set C(u_k) (where k = 0, ..., n-1) for a vertex u_k; a second set calculation unit that calculates a reserved vertex set R[u_k, v] (where k = 0, ..., n-1, v ∈ C(u_k)) for (u_k, v); and a search unit that calculates a set S including all isomorphic embeddings by executing Search({(u_0, v)}) (where v ∈ C(u_0)). The function Search(M) sets k = |M| and terminates execution if M is not injective, or adds M to the set S as an element if M is injective and the value of the variable k is equal to n. Otherwise, a recursive call unit executes Search(M ∪ {(u_k, v)}) for each v ∈ C(u_k) that satisfies both of the following conditions: v ∈ N(M[u_i]) for every vertex u_i ∈ N^-(u_k); and the reserved vertex set R[u_k, v] is not included in ran(M).
PCT/JP2021/036970 2021-10-06 2021-10-06 Subgraph matching device, subgraph matching method, and program WO2023058151A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/036970 WO2023058151A1 (fr) 2021-10-06 2021-10-06 Subgraph matching device, subgraph matching method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/036970 WO2023058151A1 (fr) 2021-10-06 2021-10-06 Subgraph matching device, subgraph matching method, and program

Publications (1)

Publication Number Publication Date
WO2023058151A1 (fr) 2023-04-13

Family

ID=85803310

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/036970 WO2023058151A1 (fr) 2021-10-06 2021-10-06 Subgraph matching device, subgraph matching method, and program

Country Status (1)

Country Link
WO (1) WO2023058151A1 (fr)

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANG ZHENGWEI; FU ADA WAI-CHEE; LIU RUIFENG: "Diversified Top-k Subgraph Querying in a Large Graph", THE 2021 12TH INTERNATIONAL CONFERENCE ON E-BUSINESS, MANAGEMENT AND ECONOMICS, ACMPUB27, NEW YORK, NY, USA, 14 June 2016 (2016-06-14) - 15 November 2021 (2021-11-15), New York, NY, USA, pages 1167 - 1182, XP058880797, ISBN: 978-1-4503-8715-6, DOI: 10.1145/2882903.2915216 *

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21959893; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21959893; Country of ref document: EP; Kind code of ref document: A1)