WO2016091174A1 - Method and apparatus for searching graph data (图数据的搜索方法和装置) - Google Patents

Method and apparatus for searching graph data

Info

Publication number
WO2016091174A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
query
graph
protocol
nodes
Prior art date
Application number
PCT/CN2015/096845
Other languages
English (en)
French (fr)
Inventor
樊文飞
王欣
吴颖徽
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP15866451.6A (EP3223169A4)
Publication of WO2016091174A1
Priority to US15/618,587 (US9798774B1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24537 Query rewriting; Transformation of operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Definitions

  • Embodiments of the present invention relate to computer technologies, and in particular, to a method and apparatus for searching for graph data.
  • Graph data is data whose items are related to one another. Because of this, computers often have to analyze the full data set, and obtaining accurate search results consumes large amounts of time and computer storage resources.
  • BlinkDB data sampling query
  • Embodiments of the present invention provide a method and apparatus for searching for graph data that search the graph data effectively while avoiding the resource waste that such a search otherwise causes.
  • An embodiment of the present invention provides a method for searching for graph data, including:
  • the query request includes a query condition carrying a start graph node, and is used to query a first to-be-queried graph node in the graph data set that matches the query condition; the graph data set includes the start graph node, a plurality of to-be-queried graph nodes, the association relationships between the start graph node and the plurality of to-be-queried graph nodes, and the association relationships between each of the to-be-queried graph nodes and the other to-be-queried graph nodes;
  • the protocol subgraph includes the start graph node, the first to-be-queried graph node that matches the query condition, and the association relationship between the start graph node and the first to-be-queried graph node;
  • the protocol subgraph is queried using the query condition to obtain the first to-be-queried graph node.
  • the filtering out, according to the query condition and a preset available resource condition, of the second to-be-queried graph node in the graph data set that does not satisfy the query condition, together with the association relationship that includes the second to-be-queried graph node, to obtain a protocol subgraph, includes:
  • the query topology includes a plurality of query nodes and the query topology relationship between each of the query nodes and the other query nodes;
  • filtering out, according to the query topology relationship between the query nodes in the query topology, a preset first access cost of accessing the first to-be-queried graph node, and the available resource condition, a second to-be-queried graph node in the graph data set whose access cost exceeds the first access cost, together with the association relationship that includes the second to-be-queried graph node, to obtain the protocol subgraph; the resources occupied by the protocol subgraph do not exceed the available resource condition.
  • the filtering out, according to the query topology relationship between the query nodes in the query topology, the preset first access cost of accessing the first to-be-queried graph node, and the available resource condition, of a second to-be-queried graph node in the graph data set whose access cost exceeds the first access cost, together with the association relationship that includes the second to-be-queried graph node, to obtain the protocol subgraph, includes:
  • reading a query node stored in the storage space and a graph node matching the query node, where the storage space stores the query nodes in the query topology and the graph nodes matching them; the query nodes include a start query node, the graph nodes include the start graph node or a to-be-queried graph node, and the start graph node matches the start query node;
  • the read graph node is added to the protocol subgraph, and it is determined that the resources occupied by the protocol subgraph do not exceed the available resource condition;
  • the access cost of each to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic protocol parameter is used to control the number of to-be-queried graph nodes in the access sequence.
  • the association relationship that includes the second to-be-queried graph node, to obtain the protocol subgraph, including:
  • Step A: set the number of graph nodes in the protocol subgraph to zero, set the number of query nodes and matching graph nodes stored in the storage space to zero, and set the dynamic protocol parameter to a first preset value;
  • Step B: store the start query node in the query topology and the start graph node in the storage space; the start graph node matches the start query node;
  • Step C: read a query node stored in the storage space and the graph node matching the query node;
  • Step D: determine whether the read graph node is included in the protocol subgraph; the read graph node is the start graph node or a to-be-queried graph node;
  • Step E: if the read graph node is not included in the protocol subgraph, add the read graph node to the protocol subgraph and determine that the resources occupied by the protocol subgraph do not exceed the available resource condition;
  • Step F: calculate the access cost of each to-be-queried graph node adjacent to the read graph node according to the query topology relationship between the query nodes in the query topology, filter out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include it, and output, according to the dynamic protocol parameter, an access sequence that is stored in the storage space; the access cost of each to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic protocol parameter is used to control the number of to-be-queried graph nodes in the access sequence;
  • Step G: determine whether the storage space is empty;
  • Step H: if the storage space is not empty, return to Step C until the number of query nodes and matching graph nodes stored in the storage space is zero; if the storage space is empty, determine whether the protocol subgraph has changed;
  • Step I: if it is determined that the protocol subgraph has not changed, end the calculation to obtain the protocol subgraph.
  • Step A: set the number of graph nodes in the protocol subgraph to zero, set the number of query nodes and matching graph nodes stored in the storage space to zero, and set the dynamic protocol parameter to a first preset value;
  • Step B: store the start query node in the query topology and the start graph node in the storage space; the start graph node matches the start query node;
  • Step C: read a query node stored in the storage space and the graph node matching the query node, and mark the query node and matching graph node that have been read in the storage space;
  • Step D: determine whether the read graph node is included in the protocol subgraph; the read graph node is the start graph node or a to-be-queried graph node;
  • Step E: if the read graph node is not included in the protocol subgraph, add the read graph node to the protocol subgraph and determine that the resources occupied by the protocol subgraph do not exceed the available resource condition;
  • Step F: calculate the access cost of each to-be-queried graph node adjacent to the read graph node according to the query topology relationship between the query nodes in the query topology, filter out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include it, and output, according to the dynamic protocol parameter, an access sequence that is stored in the storage space; the access cost of each to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic protocol parameter is used to control the number of to-be-queried graph nodes in the access sequence;
  • Step G: determine whether the storage space contains an unmarked query node and a graph node matching the unmarked query node;
  • Step H: if the storage space contains an unmarked query node and a graph node matching it, return to Step C until every query node and matching graph node stored in the storage space is marked; if every query node and matching graph node stored in the storage space is marked, determine whether the protocol subgraph has changed;
  • Step I: if it is determined that the protocol subgraph has not changed, end the calculation to obtain the protocol subgraph.
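Steps A to I can be sketched as a small worklist loop. The following Python is only an illustration under stated assumptions, not the patented implementation: the names `build_protocol_subgraph`, `labels`, `query_edges`, `b_first` and `b_step` are hypothetical, a graph node is assumed to match a query node when its label equals the query node's name, the available resource condition is modelled as a maximum node count `budget`, and the dynamic protocol parameter `b` simply caps how many neighbours are pushed per read.

```python
def build_protocol_subgraph(graph, labels, query_edges, start_query, start_node,
                            cost, max_cost, budget, b_first=1, b_step=1):
    """Return the node set of a resource-bounded subgraph (illustrative only).

    graph:       dict node -> set of adjacent nodes
    labels:      dict node -> label used to match query nodes (assumption)
    query_edges: dict query node -> iterable of adjacent query nodes
    cost:        dict node -> access cost; max_cost is the first access cost
    budget:      maximum number of nodes the subgraph may hold (assumption)
    """
    subgraph = set()                     # Step A: empty subgraph
    b = b_first                          # Step A: dynamic parameter
    while True:
        size_before = len(subgraph)
        storage = [(start_query, start_node)]   # Step B
        marked = set()                   # pairs already read (marking variant)
        while storage:                   # Steps G and H: loop until empty
            q, v = storage.pop()         # Step C: read a (query, graph) pair
            if (q, v) in marked:
                continue
            marked.add((q, v))
            if v not in subgraph:        # Step D
                if len(subgraph) >= budget:
                    return subgraph      # budget reached: stop early
                subgraph.add(v)          # Step E
            # Step F: keep neighbours that match an adjacent query node and
            # whose access cost does not exceed max_cost; push at most b.
            candidates = sorted(
                (cost[w], w, q2)
                for q2 in query_edges.get(q, ())
                for w in graph.get(v, ())
                if labels[w] == q2 and cost[w] <= max_cost
            )
            storage.extend((q2, w) for _, w, q2 in candidates[:b])
        if len(subgraph) == size_before:  # Step I: no change, finished
            return subgraph
        b += b_step                       # otherwise retry with a larger b
```

The sketch checks the budget before adding a node, so the returned set never exceeds it; the edges of the subgraph would be the original edges whose two endpoints are both in the returned set.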
  • the determining whether the protocol sub-graph changes also includes:
  • the method further includes:
  • the calculation is ended to obtain the protocol subgraph.
  • An embodiment of the present invention provides a method for searching for graph data, including:
  • the query request includes a query condition that carries a start graph node and a termination graph node, and is used to query a first to-be-queried graph node in the graph data set that matches the query condition;
  • the graph data set includes the start graph node, a plurality of to-be-queried graph nodes, the termination graph node, and the association relationships among the start graph node, the termination graph node, and the plurality of to-be-queried graph nodes;
  • the searching of the landmark node tree according to the query condition to obtain the first to-be-queried graph node includes:
  • the second to-be-queried graph node that does not satisfy the query condition in the graph data set is filtered out according to the query condition in the query request and the preset available resource condition, so as to obtain the protocol subgraph, and the protocol subgraph is queried using the query condition to obtain the required first to-be-queried graph node.
  • The embodiment of the present invention can generate a corresponding protocol subgraph in real time based on the query condition and obtain the query result from that subgraph, which improves the accuracy with which the computer searches graph data. In the filtering process of the embodiment, the graph nodes of the graph data set are filtered by the query condition to generate the protocol subgraph.
  • The generated protocol subgraph is held dynamically in memory and does not occupy the computer's disk storage, which reduces the computer's storage overhead. Further, the method provided by the embodiment of the present invention filters out, according to the query condition and the preset available resource condition, the second to-be-queried graph node in the graph data set that does not satisfy the query condition and the association relationship that includes it, to obtain a protocol subgraph, and queries the protocol subgraph using the query condition to obtain the required first to-be-queried graph node.
  • It can be seen that the embodiment of the invention can return relatively accurate query results under different resource constraints, thereby overcoming the defect that traditional query techniques cannot return query results under different resource constraints.
  • The present invention provides a search device for graph data, including:
  • an obtaining module, configured to obtain a query request, where the query request includes a query condition carrying a start graph node and is used to query a first to-be-queried graph node in the graph data set that matches the query condition;
  • the graph data set includes the start graph node, a plurality of to-be-queried graph nodes, the association relationships between the start graph node and the plurality of to-be-queried graph nodes, and the association relationships between each of the to-be-queried graph nodes and the other to-be-queried graph nodes;
  • a processing module, configured to filter out, according to the query condition and a preset available resource condition, a second to-be-queried graph node in the graph data set that does not satisfy the query condition and the association relationship that includes the second to-be-queried graph node, to obtain a protocol subgraph, and to query the protocol subgraph using the query condition to obtain the first to-be-queried graph node; the protocol subgraph includes the start graph node, the first to-be-queried graph node that matches the query condition, and the association relationship between the start graph node and the first to-be-queried graph node.
  • the processing module is configured to generate a query topology according to the query condition and, according to the query topology relationship between the query nodes in the query topology, a preset first access cost of accessing the first to-be-queried graph node, and the available resource condition, to filter out a second to-be-queried graph node in the graph data set whose access cost exceeds the first access cost and the association relationship that includes the second to-be-queried graph node, to obtain the protocol subgraph; the query topology includes a plurality of query nodes and the query topology relationship between each of the query nodes and the other query nodes, and the resources occupied by the protocol subgraph do not exceed the available resource condition.
  • the processing module is configured to read a query node stored in a storage space and the graph node matching the query node, and to determine whether the read graph node is included in the protocol subgraph; if the read graph node is not included in the protocol subgraph, to add it to the protocol subgraph and determine that the resources occupied by the protocol subgraph do not exceed the available resource condition; then to calculate, according to the query topology relationship between the query nodes in the query topology, the access cost of each to-be-queried graph node adjacent to the read graph node, to filter out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include it, and to output, according to the preset dynamic protocol parameter, an access sequence that is stored in the storage space; the storage space stores the query nodes in the query topology and the graph nodes matching them, the query no
  • the processing module is configured to set the number of graph nodes in the protocol subgraph to zero, set the number of query nodes and matching graph nodes stored in the storage space to zero, set the dynamic protocol parameter to the first preset value, and store the start query node in the query topology and the start graph node in the storage space; after reading a query node stored in the storage space and the graph node matching it, to determine whether the protocol subgraph includes the read graph node; if the read graph node is not included in the protocol subgraph, to add it to the protocol subgraph and, after determining that the resources occupied by the protocol subgraph do not exceed the available resource condition, to calculate the access cost of each to-be-queried graph node adjacent to the read graph node according to the query topology relationship between the query nodes in the query topology.
  • the parameter, outputting an access sequence that is stored in the storage space; and further to determine whether the storage space is empty and, if it is not, to continue reading the query nodes stored in the storage space and their matching graph nodes until the number of stored query nodes and matching graph nodes is zero; if the storage space is empty, to determine whether the protocol subgraph has changed and, on determining that it has not, to end the calculation to obtain the protocol subgraph; the start graph node matches the start query node, the read graph node is the start graph node or a to-be-queried graph node, and the access cost of each to-be-queried graph node in the access sequence does not exceed the first access cost.
  • the processing module is configured to set the number of graph nodes in the protocol subgraph to zero, set the number of query nodes and matching graph nodes stored in the storage space to zero, set the dynamic protocol parameter to the first preset value, store the start query node in the query topology and the start graph node in the storage space, read a query node stored in the storage space and the graph node matching it, and mark the query node and matching graph node that have been read in the storage space; then to determine whether the read graph node is included in the protocol subgraph; if the read graph node is not included in the protocol subgraph, to add it to the protocol subgraph and determine that the resources occupied by the protocol subgraph do not exceed the available resource condition, and further, according to the query topology relationship between the query nodes in the query topology, calculates
  • the processing module is further configured, when it determines that the protocol subgraph has changed, to store the start query node and the start graph node in the storage space again, adjust the value of the dynamic protocol parameter to a second preset value, and continue reading the query nodes stored in the storage space and their matching graph nodes until the number of stored query nodes and matching graph nodes is zero, or until every query node and matching graph node stored in the storage space is marked.
  • the processing module is further configured to end the calculation and obtain the protocol subgraph if the resources occupied by the protocol subgraph exceed the available resource condition.
  • An embodiment of the present invention provides a device for searching for graph data, including:
  • an obtaining module, configured to obtain a query request, where the query request includes a query condition that carries a start graph node and a termination graph node and is used to query a first to-be-queried graph node in the graph data set that matches the query condition;
  • the graph data set includes the start graph node, a plurality of to-be-queried graph nodes, the termination graph node, and the association relationships among the start graph node, the termination graph node, and the plurality of to-be-queried graph nodes;
  • a processing module, configured to determine the landmark nodes of the graph data set according to the betweenness centrality of the plurality of graph nodes in the graph data set and a preset available resource condition, to establish a landmark node tree from the landmark nodes, and to search the landmark node tree according to the query condition to obtain the first to-be-queried graph node; the landmark node tree includes landmark nodes having a hierarchical relationship.
  • the processing module is configured to acquire, according to the query condition, auxiliary information of each landmark node in the landmark node tree, to determine from that auxiliary information a path policy for acquiring the first to-be-queried graph node, and then to search the landmark node tree according to the path policy to obtain the first to-be-queried graph node.
  • The method and device for searching graph data determine the landmark nodes according to the betweenness centrality of the graph nodes in the graph data set and the preset available resource condition, establish a landmark node tree, and then search the landmark node tree according to the query condition to determine a first to-be-queried graph node that satisfies the query condition.
  • Because the landmark node tree is used when searching for the first to-be-queried graph node, the paths and graph nodes visited during the search are all valid paths and graph nodes. This avoids invalid searches when the computer acquires the first to-be-queried graph node, which saves the computer's time resources and improves search efficiency.
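The landmark selection described above can be illustrated with a short sketch. It assumes that the centrality measure named in the text corresponds to betweenness centrality, and it models the preset available resource condition as a cap `max_landmarks` on how many landmarks may be kept; both are assumptions, and the function names are hypothetical. Building the landmark node tree and the path policy are not shown. Centrality is computed with Brandes' algorithm on an unweighted graph, and for an undirected graph each pair of endpoints is counted in both directions, which does not affect the ranking.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised betweenness centrality for an unweighted graph.

    adj: dict node -> iterable of adjacent nodes.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                          # nodes in BFS visiting order
        preds = {v: [] for v in adj}        # shortest-path predecessors
        sigma = {v: 0 for v in adj}         # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:             # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):           # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


def select_landmarks(adj, max_landmarks):
    """Pick the highest-centrality nodes, at most max_landmarks of them."""
    scores = betweenness(adj)
    ranked = sorted(adj, key=lambda v: (-scores[v], v))  # ties broken by name
    return ranked[:max_landmarks]
```

Ranking by centrality and cutting at a fixed count is only one way to respect a resource bound; a size-in-bytes budget would work the same way with a different stopping rule.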
  • FIG. 1 is a schematic structural diagram of a distributed computing system provided by the present invention.
  • FIG. 2 is a schematic flowchart of Embodiment 1 of a method for searching for graph data provided by the present invention.
  • FIG. 3 is a first schematic diagram of a graph data set provided by the present invention.
  • FIG. 5 is a schematic flowchart of Embodiment 2 of a method for searching for graph data provided by the present invention.
  • FIG. 6 is a schematic diagram of a query topology structure provided by the present invention.
  • FIG. 7 is a schematic flowchart diagram of Embodiment 3 of a method for searching for graph data provided by the present invention.
  • FIG. 8 is a schematic flowchart diagram of Embodiment 4 of a method for searching for graph data provided by the present invention.
  • FIG. 9 is a schematic flowchart diagram of Embodiment 5 of a method for searching for graph data provided by the present invention.
  • FIG. 10 is a schematic flowchart of Embodiment 6 of a method for searching for graph data provided by the present invention.
  • FIG. 11 is a second schematic diagram of a graph data set provided by the present invention.
  • FIG. 12 is a schematic diagram of a landmark node tree provided by the present invention.
  • FIG. 13 is a schematic flowchart diagram of Embodiment 7 of a method for searching for graph data provided by the present invention.
  • FIG. 14 is a schematic structural diagram of Embodiment 1 of a search device for graph data according to an embodiment of the present invention.
  • FIG. 15 is a schematic structural diagram of Embodiment 2 of a search device for graph data according to an embodiment of the present invention.
  • FIG. 16 is a schematic structural diagram of Embodiment 1 of a search device for graph data according to an embodiment of the present disclosure.
  • FIG. 17 is a schematic structural diagram of Embodiment 2 of a search device for graph data according to an embodiment of the present invention.
  • The embodiments of the present invention apply to scenarios in which large-scale graph data is searched, and in particular to scenarios in which a computing node in a distributed computing system searches graph data.
  • the distributed computing system includes at least one computing node, which may be, for example, a computer, a server in a computer, or a user-oriented communication device.
  • the distributed computing system can refer to the system architecture diagram shown in FIG. 1.
  • the central node is a computing node that receives a user query command, and the central node can split the query command input by the user into different query requests, and The split query request is sent to the corresponding computing node, so that other computing nodes in the distributed computing system can search for data according to the query request split by the central node.
  • the query request split by the central node may also include its own corresponding query request, that is, the central node may also search for data according to its corresponding query request.
  • the technical solutions of the following embodiments are all introduced by using a computer as an execution subject.
  • FIG. 2 is a schematic flowchart diagram of Embodiment 1 of a method for searching for graph data provided by the present invention. As shown in Figure 2, the method includes:
  • S101: Acquire a query request, where the query request includes a query condition that carries a start graph node and is used to query a first to-be-queried graph node in the graph data set that matches the query condition;
  • the graph data set includes the start graph node, a plurality of to-be-queried graph nodes, the association relationships between the start graph node and the plurality of to-be-queried graph nodes, and the association relationships between each of the to-be-queried graph nodes and the other to-be-queried graph nodes.
  • The computer obtains a query request from the user.
  • The query request may be configured on the computer by the user, or may be sent to the computer through another device, for example through the central node shown in FIG. 1.
  • The query request may include a query condition carrying a start graph node, and is used to query the first to-be-queried graph node in the graph data set that matches the query condition.
  • The data in the graph data set may be stored in the form of graph nodes, so the graph data set may include a start graph node, a plurality of to-be-queried graph nodes, and the association relationships between the start graph node and the plurality of to-be-queried graph nodes.
  • An association relationship in the graph data set is an edge formed between the start graph node and a to-be-queried graph node.
  • A to-be-queried graph node in the graph data set is a graph node that is to be searched, that is, queried.
  • Both the start graph node and the to-be-queried graph nodes in the graph data set are represented by data, so an association relationship between the start graph node and a to-be-queried graph node is also an association relationship between data.
  • There may be one or more first to-be-queried graph nodes.
  • For example, the query condition in the query request is "find all bicycle enthusiasts who know michael's hiking club members and LA bicycle club members."
  • michael is the start graph node of the graph data set, and all graph nodes other than michael are to-be-queried graph nodes.
  • In FIG. 3, HG represents the hiking club, hg represents a member of the hiking club (a to-be-queried graph node in the graph data set), CC represents the bicycle club, and cc represents a member of the bicycle club.
  • The connecting lines between michael and the other to-be-queried graph nodes shown in FIG. 3 are the association relationships between the start graph node and the to-be-queried graph nodes in the graph data set; that is, each association relationship is an edge formed between the start graph node and a to-be-queried graph node.
  • S102 Filter, according to the query condition and the preset available resource condition, a second check graph node that does not satisfy the query condition in the graph data set, and an association relationship that includes the second graph to be checked, to obtain a protocol subgraph;
  • the protocol submap includes the start graph node, a first to-be-viewed graph node that matches the query condition, and an association between the start graph node and the first graph view node relationship.
  • the computer parses the query request and obtains the query condition in the query request, and then determines, according to the query condition and the available resource condition preset by the computer, the second in the graph data set that does not satisfy the query condition. Obtaining an association relationship between the to-be-viewed node and the second to-be-viewed node, and filtering the second to-be-viewed node that does not satisfy the query condition and the association relationship that includes the second to-be-viewed node, thereby obtaining a protocol Subgraph.
  • the protocol subgraph includes a start graph node and a first to-be-viewed graph node that matches the above query condition, and an association relationship between the start graph node and the first graph to be checked node.
  • the protocol subgraph The graphic mode may be in the form of a mapping set having an association relationship, or may be another form that can represent an association relationship between the starting graph node and the first graph node to be inspected, as long as the protocol subgraph can Enables the computer to quickly search for the desired result based on the query criteria.
  • The available resource condition may be a threshold on the resource size, a range of resource sizes, or an upper limit on the resource size.
  • The computer may filter out one or more second to-be-checked graph nodes that do not satisfy the query condition.
  • The “filtering” operation in the embodiments of the present invention means that the computer filters out, or ignores, the second to-be-checked graph nodes that do not satisfy the query condition and the association relationships that include them, so that a protocol subgraph is generated from the remaining graph nodes in the graph data set. The generated protocol subgraph is dynamically cached in memory and is released after the computer has obtained the first to-be-checked graph nodes from it.
  • Because the computer does not need to write the protocol subgraph from memory to disk, disk storage overhead is saved. Because the computer searches for the first to-be-checked graph nodes directly in the in-memory protocol subgraph, no disk-to-memory IO is required, so processing time is short and the efficiency of searching graph data is effectively improved.
  • The “association relationship between the start graph node and a to-be-checked graph node” in the graph data set refers only to the edges formed between graph nodes in the graph data set, whereas an association relationship in the protocol subgraph includes all the to-be-checked graph nodes and edges on the path from the start graph node to a first to-be-checked graph node.
  • The preset available resource condition constrains the size of the resources occupied by the protocol subgraph.
  • Continuing the example of FIG. 3, the computer determines, according to the query condition and the preset available resource condition, that hg_1, hg_2, cc_3, cl_1 and cl_2 do not satisfy the query condition. It then deletes hg_1, hg_2, cc_3, cl_1 and cl_2 (the second to-be-checked graph nodes) and the association relationships that include them, thereby generating the protocol subgraph shown in FIG. 4.
  • The protocol subgraph includes the first to-be-checked graph nodes cl_n and cl_{n-1} that satisfy the query condition, and the association relationships between the first to-be-checked graph nodes and the start graph node. These association relationships include the three to-be-checked graph nodes hg_m, cc_1 and cc_2 as well as the edges connecting these three nodes to the start graph node and to the first to-be-checked graph nodes. The storage resources occupied by the protocol subgraph do not exceed the preset available resource condition.
  • That is, when the computer filters out the second to-be-checked graph nodes that do not satisfy the query condition and the association relationships that include them, it considers both the query condition and the preset available resource condition.
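  • The filtering of S102 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the edge list only loosely follows FIG. 3 (the exact edges are assumed), the function name `filter_graph` is hypothetical, and the available resource condition is assumed to be an upper limit on the number of graph nodes.

```python
# Adjacency list loosely modeled on FIG. 3; the exact edges are assumed.
GRAPH = {
    "michael": ["hg_1", "hg_2", "hg_m", "cc_1", "cc_2", "cc_3"],
    "hg_1": ["cl_1"], "hg_2": ["cl_1"], "cc_3": ["cl_2"],
    "hg_m": ["cl_n", "cl_n-1"], "cc_1": ["cl_n", "cl_n-1"], "cc_2": ["cl_n"],
    "cl_1": [], "cl_2": [], "cl_n": [], "cl_n-1": [],
}


def filter_graph(graph, second_nodes, max_nodes):
    """Drop the second to-be-checked graph nodes and every edge that touches them."""
    subgraph = {
        node: [n for n in neighbors if n not in second_nodes]
        for node, neighbors in graph.items()
        if node not in second_nodes
    }
    # Assumption: the available resource condition is an upper limit on node count.
    if len(subgraph) > max_nodes:
        raise MemoryError("protocol subgraph exceeds the available resource condition")
    return subgraph


g_q = filter_graph(GRAPH, {"hg_1", "hg_2", "cc_3", "cl_1", "cl_2"}, max_nodes=8)
print(sorted(g_q))
```

  • The result lives only in memory as an ordinary dictionary, which mirrors the statement that the protocol subgraph is cached in memory and released after the query.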
  • S103: Query the protocol subgraph using the query condition to obtain the first to-be-checked graph nodes.
  • Each time a new query condition is received, a new protocol subgraph is dynamically generated and the query is performed on that new protocol subgraph, which improves the accuracy of searching graph data. Further, the method provided by the embodiments of the present invention can return an accurate query result under any resource limit, overcoming the inability of traditional query techniques to return a query result under an arbitrary resource limit.
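  • As a rough illustration of S103, the sketch below evaluates the example query directly on an in-memory protocol subgraph. The subgraph edges, the `CIRCLE` labels and the function name `first_nodes` are assumptions made for the example; an edge from a to b is simply read as "a and b are acquainted".

```python
# Protocol subgraph of the example (edges assumed) and the circle of each person.
G_Q = {
    "michael": ["hg_m", "cc_1", "cc_2"],
    "hg_m": ["cl_n", "cl_n-1"], "cc_1": ["cl_n", "cl_n-1"], "cc_2": ["cl_n"],
    "cl_n": [], "cl_n-1": [],
}
CIRCLE = {"hg_m": "HG", "cc_1": "CC", "cc_2": "CC", "cl_n": "CL", "cl_n-1": "CL"}


def first_nodes(g_q, circle, start):
    """Return CL members linked to both an HG and a CC acquaintance of start."""
    linked_circles = {}
    for friend in g_q[start]:
        for person in g_q[friend]:
            if circle[person] == "CL":
                linked_circles.setdefault(person, set()).add(circle[friend])
    return sorted(p for p, circles in linked_circles.items() if {"HG", "CC"} <= circles)


print(first_nodes(G_Q, CIRCLE, "michael"))
```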
  • In the method for searching graph data provided by this embodiment, the second to-be-checked graph nodes in the graph data set that do not satisfy the query condition are filtered out according to the query condition in the query request and the preset available resource condition to obtain a protocol subgraph, and the required first to-be-checked graph nodes are obtained from that protocol subgraph. On the one hand, the filtering process itself does not occupy the computer's storage resources, and the generated protocol subgraph is held dynamically in memory rather than on disk, which reduces storage overhead. On the other hand, the protocol subgraph is generated in real time for each query condition and the query result is obtained from it, which improves the accuracy of searching graph data. Further, the method can return an accurate query result under any resource limit, overcoming the inability of traditional query techniques to return a query result under an arbitrary resource limit.
  • FIG. 5 is a schematic flowchart of Embodiment 2 of the method for searching graph data provided by the present invention. Building on the above embodiment, this embodiment concerns the specific process by which the computer obtains the protocol subgraph. Specifically, S102 includes:
  • S201: Generate a query topology according to the query condition. The query topology includes a plurality of query nodes and the query topology relationship between each query node and the other query nodes.
  • The query topology generated by the computer according to the query condition may be a query pattern.
  • Each query node in the query topology has a query topology relationship with other query nodes.
  • For example, under the query condition of the example shown in FIG. 3 (find all the cycling enthusiasts who are acquainted with both members of michael's hiking group and members of the LA cycling club), the computer knows that the cycling enthusiasts it is looking for (that is, the first to-be-checked graph nodes) are in the CL social circle.
  • The computer sets Michael as the starting query node. (To distinguish it from michael in the graph data set, the capitalized Michael denotes a query node, meaning “find a person called michael”, whereas michael denotes the actual graph node in the graph data set. The starting query node Michael and the start graph node michael match each other: the start graph node michael is the person that the starting query node Michael is looking for.)
  • The computer uses HG, CC and CL as the other query nodes; that is, the person to be found must be in CL and must be acquainted with members of both HG and CC.
  • Once the computer has constructed the query topology, it queries according to the query topology relationships between the query nodes. For example, the computer only needs to look at people in HG, CC and CL and no longer needs to look in other circles.
  • Constructing the query topology therefore reduces query time and saves the computer's processing resources.
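  • One possible in-memory representation of the query topology of this example is sketched below. The dictionary layout, the `CIRCLE` labels, the extra person "bob" in an unrelated circle "XX", and the helper `candidate_neighbors` are all hypothetical; the sketch only shows how the topology lets the search skip circles other than HG, CC and CL.

```python
# Query topology of the example: Michael -> HG, Michael -> CC, HG -> CL, CC -> CL.
QUERY_TOPOLOGY = {
    "Michael": ["HG", "CC"],
    "HG": ["CL"],
    "CC": ["CL"],
    "CL": [],
}

# Assumed circle of each graph node; "bob" belongs to an unrelated circle.
CIRCLE = {"hg_1": "HG", "hg_m": "HG", "cc_1": "CC", "cl_n": "CL", "bob": "XX"}


def candidate_neighbors(query_node, graph_neighbors):
    """Keep only neighbors whose circle follows query_node in the topology."""
    allowed = set(QUERY_TOPOLOGY[query_node])
    return [g for g in graph_neighbors if CIRCLE.get(g) in allowed]


print(candidate_neighbors("Michael", ["hg_1", "bob", "cc_1", "cl_n"]))
```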
  • S202: According to the query topology relationships between the query nodes in the query topology, a preset first access cost of the first to-be-checked graph nodes, and the available resource condition, filter out of the graph data set the second to-be-checked graph nodes whose access cost exceeds the first access cost, together with the association relationships that include those nodes, to obtain the protocol subgraph; the resources occupied by the protocol subgraph do not exceed the available resource condition.
  • Specifically, the computer presets a first access cost for accessing a first to-be-checked graph node; that is, the access cost incurred when the computer searches for a first to-be-checked graph node starting from the start graph node must not exceed the first access cost.
  • According to the query topology relationships between the query nodes in the query topology generated above, the first access cost, and the preset available resource condition, the computer filters out of the graph data set the second to-be-checked graph nodes whose access cost exceeds the first access cost and the association relationships that include them, to obtain the protocol subgraph.
  • Continuing the example of FIG. 3, the computer determines that no cycling enthusiast who is acquainted with both members of michael's HG and members of the LA cycling club can be reached from the start graph node via hg_1, hg_2 or cc_3 (the access cost can be regarded as infinite, and thus exceeds the first access cost). The computer therefore filters out hg_1, hg_2, cc_3, cl_1 and cl_2 (the second to-be-checked graph nodes) and the association relationships that include them, to generate a protocol subgraph, which can be seen in FIG.
  • The protocol subgraph includes the first to-be-checked graph nodes cl_n and cl_{n-1} and the association relationships between them and the start graph node. These association relationships include the three to-be-checked graph nodes hg_m, cc_1 and cc_2 as well as the edges connecting these three nodes to the start graph node and to the first to-be-checked graph nodes. In other words, an association relationship between a first to-be-checked graph node and the start graph node comprises the to-be-checked graph nodes on the path from the start graph node to that first to-be-checked graph node and the edges that make up the path. The storage resources occupied by the protocol subgraph do not exceed the preset available resource condition.
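  • The text does not define how the access cost is computed. Purely for illustration, the sketch below assumes that the cost of a graph node is the number of hops to the nearest first to-be-checked graph node (infinite if none is reachable) and that the matching nodes are known in advance; the edge list and all names are likewise assumptions.

```python
import math
from collections import deque

# Adjacency list loosely modeled on FIG. 3; the exact edges are assumed.
GRAPH = {
    "michael": ["hg_1", "hg_2", "hg_m", "cc_1", "cc_2", "cc_3"],
    "hg_1": ["cl_1"], "hg_2": ["cl_1"], "cc_3": ["cl_2"],
    "hg_m": ["cl_n", "cl_n-1"], "cc_1": ["cl_n", "cl_n-1"], "cc_2": ["cl_n"],
    "cl_1": [], "cl_2": [], "cl_n": [], "cl_n-1": [],
}
MATCHES = {"cl_n", "cl_n-1"}  # assumed to be known, for illustration only


def access_cost(graph, node, matches):
    """Hops from node to the nearest match; infinity if none is reachable."""
    seen, queue = {node}, deque([(node, 0)])
    while queue:
        current, dist = queue.popleft()
        if current in matches:
            return dist
        for nxt in graph[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return math.inf


FIRST_ACCESS_COST = 2  # hypothetical bound
second_nodes = {v for v in GRAPH if access_cost(GRAPH, v, MATCHES) > FIRST_ACCESS_COST}
print(sorted(second_nodes))
```

  • Under these assumptions the nodes with infinite cost are exactly hg_1, hg_2, cc_3, cl_1 and cl_2, the nodes the example filters out.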
  • In the method for searching graph data provided by this embodiment, the query topology, the preset first access cost and the available resource condition are used to filter out of the graph data set the second to-be-checked graph nodes whose access cost exceeds the first access cost and the association relationships that include them, thereby obtaining the protocol subgraph. The filtering process does not occupy the computer's storage resources, and the generated protocol subgraph is held dynamically in memory rather than on disk, which reduces storage overhead. In addition, the protocol subgraph is generated in real time for each query condition and the query result is obtained from it, which improves the accuracy of searching graph data. Further, the method can return an accurate query result under any resource limit, overcoming the inability of traditional query techniques to return a query result under an arbitrary resource limit.
  • FIG. 7 is a schematic flowchart of Embodiment 3 of the method for searching graph data provided by the present invention.
  • This embodiment concerns a specific implementation in which the computer determines the protocol subgraph according to the query topology, the first access cost and the preset available resource condition.
  • Specifically, S202 includes:
  • S301: Read a query node stored in a storage space and the graph node that matches it. The storage space stores query nodes of the query topology and the graph nodes that match them; the query nodes include the starting query node, the graph nodes include the start graph node or to-be-checked graph nodes, and the start graph node matches the starting query node.
  • Specifically, the computer reads, according to the query topology, a query node stored in the storage space and the graph node that matches it.
  • The query nodes here include the starting query node and, optionally, other query nodes that have a query topology relationship with the starting query node and help the computer search for the first to-be-checked graph nodes.
  • The graph nodes in the storage space may include the start graph node that matches the starting query node, and may also include to-be-checked graph nodes of the protocol subgraph. In other words, the graph nodes stored in the storage space may be the to-be-checked graph nodes that remain after the computer's filtering, and the computer can reach the first to-be-checked graph nodes through them.
  • S302: Determine whether the read graph node is included in the protocol subgraph; if it is not, add the read graph node to the protocol subgraph and determine that the resources occupied by the protocol subgraph do not exceed the available resource condition.
  • Specifically, when the computer determines that the protocol subgraph does not include the graph node read from the storage space, it adds the read graph node to the protocol subgraph and then determines whether the memory resources occupied by the protocol subgraph exceed the available resource condition.
  • The available resource condition here may be a resource size threshold, a range of resource sizes, or an upper limit on the resource size.
  • The resources occupied by the protocol subgraph are computer memory resources.
  • S303: Calculate, according to the query topology relationships between the query nodes in the query topology, the access cost of each to-be-checked graph node adjacent to the read graph node; filter out the second to-be-checked graph nodes whose access cost exceeds the first access cost and the association relationships that include them; and, according to a preset dynamic protocol parameter, output an access sequence to be stored in the storage space. The access cost of every to-be-checked graph node in the access sequence does not exceed the first access cost, and the dynamic protocol parameter controls the number of to-be-checked graph nodes in the access sequence.
  • Specifically, when the computer determines that the resources occupied by the protocol subgraph, after the read graph node has been added, do not exceed the available resource condition, it calculates, according to the query topology relationships between the query nodes, the access cost of each to-be-checked graph node adjacent to the read graph node (hereinafter, a neighbor to-be-checked graph node), that is, the cost of reaching a first to-be-checked graph node from that neighbor.
  • The computer then filters out the second to-be-checked graph nodes, which are the neighbor to-be-checked graph nodes whose access cost exceeds the first access cost, together with the association relationships that include them.
  • From the remaining neighbor to-be-checked graph nodes, whose access cost does not exceed the first access cost, the computer determines, according to the preset dynamic protocol parameter, the access sequence to be stored in the storage space; the dynamic protocol parameter determines the number of graph nodes in the access sequence. Note that every graph node in the access sequence is a neighbor graph node whose access cost does not exceed the first access cost.
  • After determining the access sequence, the computer stores it in the storage space, that is, stores the graph nodes of the access sequence in the storage space, so that it can execute S301 again and determine the next graph node to add to the protocol subgraph.
  • In the method for searching graph data provided by this embodiment, the graph nodes that can be added to the protocol subgraph are determined from the graph data set according to the query topology, the preset first access cost and the available resource condition, thereby obtaining the protocol subgraph, and the first to-be-checked graph nodes are found by searching the protocol subgraph. Because the protocol subgraph is cached in memory, it can be queried directly from memory without IO operations, which effectively improves the efficiency of searching graph data. After a search is completed, the computer releases the protocol subgraph generated for it; when the next search is performed, the computer searches the graph data again according to the new query request. That is, the computer generates a protocol subgraph in real time for each query condition and obtains an accurate search result from it, which improves the precision of searching graph data.
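  • The selection of the access sequence in S303 can be sketched as follows. The cost values are invented, and choosing the cheapest neighbors first (ties broken by name) is an assumption: the text only states that the dynamic protocol parameter controls how many to-be-checked graph nodes enter the access sequence.

```python
import math


def access_sequence(neighbor_costs, first_access_cost, dynamic_param):
    """Return at most dynamic_param neighbors whose cost is within the bound."""
    kept = [(cost, node) for node, cost in neighbor_costs.items()
            if cost <= first_access_cost]
    kept.sort()  # assumption: cheapest first, ties broken by node name
    return [node for _, node in kept[:dynamic_param]]


# Hypothetical access costs of michael's neighbor to-be-checked graph nodes.
costs = {"hg_1": math.inf, "hg_2": math.inf, "hg_m": 1,
         "cc_1": 1, "cc_2": 1, "cc_3": math.inf}

print(access_sequence(costs, first_access_cost=2, dynamic_param=3))
print(access_sequence(costs, first_access_cost=2, dynamic_param=2))
```

  • A smaller parameter narrows the search and a larger one widens it, while neighbors with infinite cost never enter the sequence.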
  • FIG. 8 is a schematic flowchart of Embodiment 4 of the method for searching graph data provided by the present invention.
  • This embodiment concerns another specific implementation in which the computer determines the protocol subgraph according to the query topology, the first access cost and the preset available resource condition.
  • Specifically, S202 includes:
  • S401: Set the number of graph nodes in the protocol subgraph to zero, set the number of query nodes and matching graph nodes stored in the storage space to zero, and set the dynamic protocol parameter to a first preset value.
  • Specifically, the computer initializes the protocol subgraph, the storage space used to cache query nodes and their matching graph nodes, and the dynamic protocol parameter: the number of graph nodes in the protocol subgraph is set to zero, the number of query nodes and matching graph nodes in the storage space is set to zero, and the dynamic protocol parameter is set to the first preset value.
  • The first preset value controls the number of to-be-checked graph nodes in the access sequence described below, and is chosen so that the computer can find a first to-be-checked graph node using as few resources as possible.
  • S402: Store the starting query node of the query topology and the start graph node in the storage space; the start graph node matches the starting query node.
  • Specifically, the starting query node and the start graph node form a point pair (starting query node, start graph node), and the computer stores this point pair in the storage space.
  • The storage space may be organized as a stack or may be a storage module, as long as it has the “first in, last out” property, that is, the computer reads data from the storage space on a first-in, last-out basis.
  • S403: Read a query node stored in the storage space and the graph node that matches it, and delete the read query node and its matching graph node from the storage space.
  • When S403 is executed for the first time, the query node read by the computer is the starting query node and the graph node read is the start graph node. On later executions, the query node and graph node read in S403 depend on the actual state of the storage space; the way the computer reads graph nodes from the storage space is illustrated by the example given later.
  • S404: Determine whether the read graph node is included in the protocol subgraph, where the read graph node is the start graph node or a to-be-checked graph node. If yes, execute S405; if no, execute S406.
  • S405: Mark the protocol subgraph as unchanged, and execute S407.
  • S406: Add the read graph node to the protocol subgraph, and determine whether the resources occupied by the protocol subgraph exceed the available resource condition; if yes, execute S411; if not, execute S407.
  • That is, after adding the read graph node to the protocol subgraph, the computer needs to determine whether the resources occupied by the protocol subgraph exceed the available resource condition.
  • S407: Calculate, according to the query topology relationships between the query nodes in the query topology, a second access cost of each to-be-checked graph node adjacent to the read graph node; filter out the second to-be-checked graph nodes whose access cost exceeds the first access cost and the association relationships that include them; and output, according to the dynamic protocol parameter, an access sequence to be stored in the storage space. The access cost of every to-be-checked graph node in the access sequence does not exceed the first access cost, and the dynamic protocol parameter controls the number of to-be-checked graph nodes in the access sequence.
  • Specifically, according to the query topology relationships between the query nodes in the query topology, the computer calculates the access cost of each to-be-checked graph node adjacent to the graph node read in S403 (hereinafter, a neighbor to-be-checked graph node).
  • The computer determines whether the access cost of a neighbor to-be-checked graph node exceeds the first access cost. If it does, the neighbor is a second to-be-checked graph node, and the computer filters it out together with the association relationships that include it. If it does not, the computer adds the neighbor to the access sequence (because that graph node may itself be a first to-be-checked graph node or may help the computer find one).
  • The computer determines the number of point pairs added to the access sequence according to the dynamic protocol parameter, where a point pair consists of a query node and a graph node that matches it.
  • S408: Determine whether the storage space is empty. If so, execute S409; if not, return to S403 and repeat until the number of query nodes and matching graph nodes stored in the storage space is zero, and then execute S409.
  • S410: Store the starting query node and the start graph node that matches it in the storage space again, adjust the dynamic protocol parameter to a second preset value, and then return to S403.
  • Optionally, adjusting the dynamic protocol parameter may mean increasing it from its initial value, so that more point pairs are added to the access sequence; this widens the computer's search range and yields a more accurate result.
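  • The loop of FIG. 8 might be sketched as below. Several points are assumptions rather than the patented procedure: the graph edges, the `CIRCLE` labels and the `COST` table are invented; the query topology is assumed to be acyclic; S409 and S411 are not reproduced in the text above, so the sketch simply stops when a pass adds no new graph node or when the list of preset parameter values is exhausted; and the resource check is made before a node is added so that the node limit is never exceeded.

```python
import math

INF = math.inf
GRAPH = {  # edges assumed, loosely following FIG. 3
    "michael": ["hg_1", "hg_2", "hg_m", "cc_1", "cc_2", "cc_3"],
    "hg_1": ["cl_1"], "hg_2": ["cl_1"], "cc_3": ["cl_2"],
    "hg_m": ["cl_n", "cl_n-1"], "cc_1": ["cl_n", "cl_n-1"], "cc_2": ["cl_n"],
    "cl_1": [], "cl_2": [], "cl_n": [], "cl_n-1": [],
}
CIRCLE = {"michael": "Michael", "hg_1": "HG", "hg_2": "HG", "hg_m": "HG",
          "cc_1": "CC", "cc_2": "CC", "cc_3": "CC",
          "cl_1": "CL", "cl_2": "CL", "cl_n": "CL", "cl_n-1": "CL"}
TOPOLOGY = {"Michael": ["HG", "CC"], "HG": ["CL"], "CC": ["CL"], "CL": []}
COST = {"michael": 2, "hg_1": INF, "hg_2": INF, "cc_3": INF, "cl_1": INF,
        "cl_2": INF, "hg_m": 1, "cc_1": 1, "cc_2": 1, "cl_n": 0, "cl_n-1": 0}


def build_protocol_subgraph(first_access_cost, max_nodes, preset_values):
    g_q = []                                        # S401: empty protocol subgraph
    for b in preset_values:                         # S401, then S410 on later passes
        storage = [("Michael", "michael")]          # S402 / S410: initial point pair
        changed = False
        while storage:                              # S408: repeat until storage is empty
            query_node, graph_node = storage.pop()  # S403: read and delete
            if graph_node not in g_q:               # S404
                if len(g_q) + 1 > max_nodes:        # assumption: stop before exceeding
                    return g_q
                g_q.append(graph_node)              # S406
                changed = True
            # S407: keep neighbors allowed by the topology and within the cost bound
            kept = sorted(
                (COST[n], n) for n in GRAPH[graph_node]
                if CIRCLE[n] in TOPOLOGY[query_node] and COST[n] <= first_access_cost
            )
            storage.extend((CIRCLE[n], n) for _, n in kept[:b])
        if not changed:                             # assumed stop test (S409 not shown)
            break
    return g_q


print(build_protocol_subgraph(first_access_cost=2, max_nodes=8, preset_values=[1, 3]))
```

  • With the preset values [1, 3], the first pass finds only three graph nodes; raising the parameter on the second pass widens the search to all six nodes of the example subgraph.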
  • FIG. 9 is a schematic flowchart of Embodiment 5 of the method for searching graph data provided by the present invention.
  • This embodiment concerns another specific implementation in which the computer determines the protocol subgraph according to the query topology relationships between the query nodes in the query topology, the first access cost and the preset available resource condition. As shown in FIG. 9, S202 specifically includes:
  • S501: Set the number of graph nodes in the protocol subgraph to zero, set the number of query nodes and matching graph nodes stored in the storage space to zero, and set the dynamic protocol parameter to a first preset value.
  • S502: Store the starting query node of the query topology and the start graph node in the storage space; the start graph node matches the starting query node.
  • S503: Read a query node stored in the storage space and the graph node that matches it, and mark the read query node and its matching graph node in the storage space.
  • S504: Determine whether the read graph node is included in the protocol subgraph, where the read graph node is the start graph node or a to-be-checked graph node. If yes, execute S505; otherwise, execute S506.
  • S505: Mark the protocol subgraph as unchanged, and execute S507.
  • S506: Add the read graph node to the protocol subgraph, and determine whether the resources occupied by the protocol subgraph exceed the available resource condition; if yes, execute S511; if not, execute S507.
  • S507: Calculate, according to the query topology relationships between the query nodes in the query topology, the access cost of each to-be-checked graph node adjacent to the read graph node; filter out the second to-be-checked graph nodes whose access cost exceeds the first access cost and the association relationships that include them; and output, according to the dynamic protocol parameter and the first access cost, an access sequence to be stored in the storage space. The access cost of every to-be-checked graph node in the access sequence does not exceed the first access cost, and the dynamic protocol parameter controls the number of to-be-checked graph nodes in the access sequence.
  • S508: Determine whether the storage space contains an unmarked query node and a graph node that matches it. If yes, return to S503 and repeat until every query node and matching graph node stored in the storage space has been marked, and then execute S509; if not, execute S509.
  • Specifically, the computer reads a query node and its matching graph node from the storage space and adds the read graph node to the protocol subgraph, and therefore marks, in the storage space, the graph nodes that have been added to the protocol subgraph and the query nodes that match them. The computer then needs to determine whether the storage space still contains an unmarked query node and a graph node that matches it, that is, whether the storage space still contains graph nodes that have not been added to the protocol subgraph.
  • If the storage space contains an unmarked query node and a graph node that matches it, the computer returns to S503 and repeats until every query node and matching graph node stored in the storage space has been marked.
  • S510: Store the starting query node and the start graph node that matches it in the storage space again, adjust the dynamic protocol parameter to a second preset value, and then return to S503.
  • Optionally, adjusting the dynamic protocol parameter may mean increasing it from its initial value, so that more point pairs are added to the access sequence; this widens the computer's search range and yields a more accurate result.
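  • The main difference from FIG. 8 is that point pairs are marked in the storage space instead of being deleted. A minimal sketch of such a storage space is given below; the class name `MarkedStorage` and its methods are hypothetical, and reading the most recently stored unmarked pair first is an assumption made to mirror the stack-like behaviour of the previous embodiment.

```python
class MarkedStorage:
    """Storage space that keeps every point pair and flags the ones already read."""

    def __init__(self):
        self._entries = []  # each entry is [point_pair, marked]

    def store(self, pair):
        self._entries.append([pair, False])

    def read_and_mark(self):
        """S503: return the most recently stored unmarked pair and mark it."""
        for entry in reversed(self._entries):
            if not entry[1]:
                entry[1] = True
                return entry[0]
        return None

    def has_unmarked(self):
        """S508: report whether any stored pair is still unmarked."""
        return any(not marked for _, marked in self._entries)

    def __len__(self):
        return len(self._entries)


storage = MarkedStorage()
storage.store(("Michael", "michael"))
storage.store(("HG", "hg_m"))
first = storage.read_and_mark()
second = storage.read_and_mark()
print(first, second, storage.has_unmarked(), len(storage))
```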
  • S401-S411 in the embodiment shown in FIG. 8 and S501-S511 in the embodiment shown in FIG. 9 are specific implementations by which the computer determines the protocol subgraph according to the query topology, the first access cost and the preset available resource condition. To make the flowcharts of FIG. 8 and FIG. 9 easier to follow, the example of FIG. 3 is continued here in more detail. Because the processes of FIG. 8 and FIG. 9 are similar, only the loop of the flowchart in FIG. 8 is worked through, in the following nine steps A to I.
  • Query the neighbor to-be-checked graph nodes of the start graph node michael in the graph data set (hg_1, hg_2, hg_m, cc_1, cc_2 and cc_3) and calculate their access costs. Because the access costs of hg_m, cc_1 and cc_2 do not exceed the preset first access cost, the computer, taking the dynamic protocol parameter into account, obtains the access sequence (HG, hg_m) (CC, cc_1) (CC, cc_2) and stores these point pairs in S in order; the content of S at this time is shown in Table 1.
  • the neighbor to-be-viewed nodes in the point pair of the above access sequence are all nodes
  • Step B Following step (5) of the above step A, the computer continues the following process:
  • (3) Calculate, according to the query node HG, the access costs of the neighbor to-be-checked graph nodes of hg_m, filter out the second to-be-checked graph nodes, and determine the access sequence from the remaining to-be-checked graph nodes according to the first access cost and the dynamic protocol parameter.
  • Taking the dynamic protocol parameter into account, the computer obtains the access sequence (CL, cl_n) (CL, cl_{n-1}) and stores these two point pairs in S; the content of S at this time is shown in Table 2.
  • The computer determines that S is not empty and continues to read to-be-checked graph nodes from S; at this time G_Q includes michael and hg_m.
  • Step C Following step (4) of the above step B, the computer continues the following process:
  • The access sequence is then determined from the remaining to-be-checked graph nodes. That is, according to the position of the query node CL in the query topology (CL has no neighbor query nodes), the computer queries the neighbor to-be-checked graph nodes of the graph node cl_{n-1} in the graph data set. Because cl_{n-1} has no neighbor to-be-checked graph nodes, the computer determines that the access sequence is empty; the content of S at this time is shown in Table 3.
  • At this time G_Q includes michael, hg_m and cl_{n-1}.
  • Step D Following step (4) of the above step C, the computer continues the following process:
  • At this time G_Q includes michael, hg_m, cl_{n-1} and cl_n.
  • Step E Following step (4) of the above step D, the computer continues the following process:
  • Query the neighbor to-be-checked graph nodes of the graph node cc_1 in the graph data set (cl_n and cl_{n-1}) and calculate their access costs. Because these access costs do not exceed the preset first access cost, the computer, taking the dynamic protocol parameter into account, obtains the access sequence (CL, cl_n) (CL, cl_{n-1}) and stores these two point pairs in S; the content of S at this time is shown in Table 5.
  • The computer determines that S is not empty and continues to read to-be-checked graph nodes from S; at this time G_Q includes michael, hg_m, cl_{n-1}, cl_n and cc_1.
  • Step F Following step (4) of the above step E, the computer continues the following process:
  • The access sequence is then determined from the remaining to-be-checked graph nodes. That is, according to the position of the query node CL in the query topology, the computer queries the neighbor to-be-checked graph nodes of the graph node cl_{n-1} in the graph data set. Because cl_{n-1} has no neighbor to-be-checked graph nodes, the computer determines that the access sequence is empty; the content of S at this time is shown in Table 6.
  • The computer determines that S is not empty and continues to read to-be-checked graph nodes from S; at this time G_Q includes michael, hg_m, cl_{n-1}, cl_n and cc_1.
  • Step G Following step (4) of the above step F, the computer continues the following process:
  • The computer determines that S is not empty and continues to read to-be-checked graph nodes from S; at this time G_Q includes michael, hg_m, cl_{n-1}, cl_n and cc_1.
  • Step H Following step (4) of the above step G, the computer continues the following process:
  • The computer obtains the access sequence (CL, cl_n) in combination with the dynamic protocol parameter above and stores this pair in S; the content of S at this point is shown in Table 8.
  • G_Q includes michael, hg_m, cl_{n-1}, cl_n, cc_1, and cc_2.
  • Step I: Following step (4) of step H above, the computer continues with the following process:
  • The computer determines that S is empty, and G_Q includes michael, hg_m, cl_{n-1}, cl_n, cc_1, and cc_2.
  • G_Q comprises michael, hg_m, cl_{n-1}, cl_n, cc_1, and cc_2, where michael is the start graph node in the protocol subgraph G_Q, cl_{n-1} and cl_n are the first to-be-viewed graph nodes in G_Q, and hg_m, cc_1, and cc_2 are graph nodes lying on the association relationships between the start graph node and the first to-be-viewed graph nodes in G_Q.
  • The method for searching graph data provided by this embodiment determines, according to the query topology, the preset first access cost, and the available resource condition, which graph nodes of the graph data set can be added to the protocol subgraph, thereby obtaining the protocol subgraph, and then searches the protocol subgraph for the first to-be-viewed graph node.
  • Because the protocol subgraph is cached in the computer's memory, it can be queried from memory without IO operations, which effectively improves the efficiency with which the computer searches graph data. After a search is completed, the computer releases the previously generated protocol subgraph.
  • When the next search is performed, the computer searches the graph data again according to the new query request. That is, in this embodiment the computer generates the protocol subgraph in real time according to the query condition and obtains accurate search results in real time from that subgraph, so the method provided by this embodiment improves the search precision for graph data.
  • FIG. 10 is a schematic flowchart diagram of Embodiment 6 of a method for searching for graph data provided by the present invention.
  • the method according to the embodiment of the present invention is still applicable to the distributed computing system shown in FIG. 1 above.
  • This embodiment still takes a computer as an execution subject as an example.
  • This embodiment relates to the specific process in which the computer determines, by means of landmark nodes in the graph data, the first to-be-viewed graph node that matches the query condition.
  • the method includes:
  • S601 Acquire a query request, where the query request includes a query condition carrying a start graph node and a termination graph node, and the query request is used to request a query of the graph data set for a first to-be-viewed graph node that matches the query condition. The graph data set includes the start graph node, a plurality of to-be-viewed graph nodes, the termination graph node, and the association relationships among the start graph node, the termination graph node, and the plurality of to-be-viewed graph nodes.
  • the computer obtains a query request from the user.
  • the query request may be configured by the user to the computer, or may be sent by the user to the computer through other devices.
  • the query request may include a query condition carrying a start graph node and a termination graph node, and the query request is used to query a first graph view node in the graph data set that matches the query condition.
  • The association relationships in the graph data set are the edges formed among the start graph node, the termination graph node, and the to-be-viewed graph nodes.
  • There may be one or more first to-be-viewed graph nodes.
  • For example, the query condition in the query request is "Can I get to know the cyclist Eric through a friend?"
  • In FIG. 11, michael is the start graph node of the graph data, Eric is the termination graph node, and all other graph nodes are to-be-viewed graph nodes.
  • The connection lines between michael, Eric, and the other to-be-viewed graph nodes shown in FIG. 11 are the association relationships between the start graph node and the to-be-viewed graph nodes in the graph data.
  • S602 Determine landmark nodes of the graph data set according to the betweenness centrality of a plurality of graph nodes in the graph data set and a preset available resource condition, and establish a landmark node tree according to the landmark nodes, where the landmark node tree includes landmark nodes that have a hierarchical relationship.
  • The computer determines the betweenness centrality of each graph node according to the association relationships of all graph nodes in the graph data set (including the start graph node, the termination graph node, and the to-be-viewed graph nodes), and determines the landmark nodes of the graph data set according to each graph node's betweenness centrality and the preset available resource condition. In other words, a landmark node is a node that lies on many shortest paths between other graph nodes.
  • The landmark node tree is established according to the betweenness centrality of each landmark node and the association relationships in the graph data set, and it includes a plurality of landmark nodes that have a hierarchical relationship.
  • For example, cl_3, cl_4, cl_5, and cl_6 in FIG. 11 are all landmark nodes; cl_4 has the largest betweenness centrality and is the core node.
  • The computer establishes the landmark node tree according to the association relationships of the graph nodes in the graph data set. Specifically, if a landmark node a can reach a landmark node b, or b can reach a, the computer constructs an edge (b, a) and adds it to the landmark node tree.
  • the landmark node tree can be seen in FIG. 12, and the landmark node tree includes other nodes in the graph data set in addition to the landmark node.
  • the preset available resource condition is used to constrain the resource size occupied by the landmark node tree, that is, the resource occupied by the constructed landmark node tree cannot exceed the preset available resource condition.
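The selection of landmark nodes in S602 can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy graph is hypothetical (it is not FIG. 11), the available resource condition is modeled simply as a cap `max_landmarks` on the number of landmarks, and the attachment rule (hang each landmark under the first higher-ranked landmark it is connected to by reachability) is an assumption made for illustration. Betweenness centrality is computed with Brandes' algorithm for unweighted directed graphs.

```python
from collections import deque

def betweenness(adj):
    """Brandes-style betweenness for an unweighted directed graph.
    adj maps every node to the list of its successors."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                          # nodes in BFS visiting order
        preds = {v: [] for v in adj}        # shortest-path predecessors
        paths = {v: 0 for v in adj}         # number of shortest paths from s
        dist = {v: -1 for v in adj}
        paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dep = {v: 0.0 for v in adj}
        for w in reversed(order):           # farthest nodes first
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1 + dep[w])
            if w != s:
                score[w] += dep[w]
    return score

def reach(adj, src):
    """Set of nodes reachable from src (src included)."""
    seen, queue = {src}, deque([src])
    while queue:
        for w in adj[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def build_landmark_tree(adj, max_landmarks):
    """Pick the highest-centrality nodes (hypothetical budget: a node count)
    and link each one to a higher-ranked landmark it can reach or be reached from."""
    score = betweenness(adj)
    ranked = sorted(adj, key=lambda v: (-score[v], v))
    landmarks = [v for v in ranked[:max_landmarks] if score[v] > 0]
    reachable = {l: reach(adj, l) for l in landmarks}
    parent = {}
    for a in landmarks:
        parent[a] = next((b for b in list(parent)
                          if a in reachable[b] or b in reachable[a]), None)
    return landmarks, parent

# Hypothetical toy graph, loosely modeled on the node names used in the text.
graph = {
    "michael": ["hg1", "hg2"], "x": ["cl4"],
    "hg1": ["cl3"], "hg2": ["cl3"],
    "cl3": ["cl4"], "cl4": ["cl5", "cl6"],
    "cl5": [], "cl6": ["eric"], "eric": [],
}
landmarks, parent = build_landmark_tree(graph, max_landmarks=3)
print(landmarks)  # ['cl4', 'cl3', 'cl6']
print(parent)     # {'cl4': None, 'cl3': 'cl4', 'cl6': 'cl4'}
```

In this toy graph cl4 lies on the most shortest paths, so it becomes the root, which mirrors the role of the core node described above. A byte-size budget or a different tie-breaking rule would fit the same structure.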
  • S603 Search the landmark node tree according to the query condition to obtain the first to-be-viewed node.
  • The computer learns the start graph node from the query condition and knows the termination graph node of the graph data set, so it can search the association relationships in the landmark node tree according to the query condition to determine whether the termination graph node is reachable from the start graph node. When it is reachable, the computer outputs the landmark nodes through which the start graph node reaches the termination graph node Eric; these landmark nodes are the first to-be-viewed graph nodes.
  • For example, the first to-be-viewed graph nodes determined by the computer may be cl_3, cl_4, and cl_6.
  • The method for searching graph data provided by this embodiment determines landmark nodes according to the betweenness centrality of the graph nodes in the graph data set and the preset available resource condition, establishes a landmark node tree, and then searches the landmark node tree according to the query condition to determine the first to-be-viewed graph node that satisfies the query condition. Because the landmark node tree is used in the search, the paths and graph nodes traversed while searching for the first to-be-viewed graph node are all valid ones, which avoids invalid searches when the computer obtains the first to-be-viewed graph node, saves the computer's time resources, and improves search efficiency.
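A minimal Python sketch of the lookup in S603 follows. It is an illustration under stated assumptions, not the patented search: instead of walking a tree, it keeps for each landmark a precomputed set of nodes it can reach and a set of nodes that can reach it, and the class name `LandmarkIndex`, its method `via`, and the toy graph are all hypothetical.

```python
from collections import deque

def reach(adj, src):
    """Set of nodes reachable from src (src included)."""
    seen, queue = {src}, deque([src])
    while queue:
        for w in adj[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

class LandmarkIndex:
    """Hypothetical index: per-landmark reachability in both directions."""
    def __init__(self, adj, landmarks):
        radj = {v: [] for v in adj}          # reversed edges
        for v, succs in adj.items():
            for w in succs:
                radj[w].append(v)
        self.landmarks = list(landmarks)
        self.below = {l: reach(adj, l) for l in self.landmarks}   # l reaches these
        self.above = {l: reach(radj, l) for l in self.landmarks}  # these reach l

    def via(self, start, end):
        """Landmarks l such that start can reach l and l can reach end."""
        return [l for l in self.landmarks
                if start in self.above[l] and end in self.below[l]]

# Hypothetical toy graph, not FIG. 11 itself.
graph = {
    "michael": ["hg1", "hg2"], "x": ["cl4"],
    "hg1": ["cl3"], "hg2": ["cl3"],
    "cl3": ["cl4"], "cl4": ["cl5", "cl6"],
    "cl5": [], "cl6": ["eric"], "eric": [],
}
index = LandmarkIndex(graph, ["cl3", "cl4", "cl5", "cl6"])
print(index.via("michael", "eric"))  # ['cl3', 'cl4', 'cl6']
print(index.via("cl5", "eric"))      # []
```

A non-empty result shows that the termination graph node is reachable through the listed landmarks. An empty result only means that no landmark lies on a connecting path, so a complete system would need a fallback search for that case.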
  • FIG. 13 is a schematic flowchart diagram of Embodiment 7 of a method for searching for graph data provided by the present invention.
  • the embodiment relates to a specific process in which a computer searches for a landmark node tree according to a query condition to obtain a first to-be-viewed node.
  • the foregoing S603 specifically includes:
  • S701 Acquire auxiliary information of each landmark node in the landmark tree according to the query condition.
  • The computer acquires auxiliary information of each landmark node in the landmark node tree according to the query condition. The auxiliary information may be whether the landmark node can reach the termination graph node, the access cost or search time the landmark node needs to reach the termination graph node, the size of the resources occupied by the landmark node, or other information that helps the computer obtain the first to-be-viewed graph node.
  • S702 Determine a path policy for acquiring the first to-be-viewed node according to the auxiliary information of the landmark node.
  • The path policy may be used to help the computer select an optimal path for obtaining the first to-be-viewed graph node, or to indicate to the computer which paths in the landmark node tree do not lead from the start graph node to the termination graph node.
  • S703 Search the landmark node tree according to the path policy to obtain the first to-be-viewed node.
  • The method for searching graph data provided by this embodiment determines landmark nodes according to the betweenness centrality of the graph nodes in the graph data set and the preset available resource condition, establishes a landmark node tree, and then searches the landmark node tree according to the query condition to determine the first to-be-viewed graph node that satisfies the query condition. Because the landmark node tree is used in the search, the paths and graph nodes traversed while searching for the first to-be-viewed graph node are all valid ones, which avoids invalid searches when the computer obtains the first to-be-viewed graph node, saves the computer's time resources, and improves search efficiency.
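Steps S701 to S703 can be sketched as follows. This is only one possible reading: the auxiliary information is reduced to two hypothetical fields (`reaches_end` and a hop-count `cost`), and the path policy is assumed to be "skip landmarks that cannot reach the termination graph node, visit the others in order of increasing cost". The function names and the toy graph are illustrative.

```python
from collections import deque

def hops_to(adj, end):
    """Hop count to `end` for every node that can reach it (BFS on reversed edges)."""
    radj = {v: [] for v in adj}
    for v, succs in adj.items():
        for w in succs:
            radj[w].append(v)
    dist, queue = {end: 0}, deque([end])
    while queue:
        v = queue.popleft()
        for u in radj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def path_policy(adj, landmarks, end):
    """Collect auxiliary information (S701) and derive a visiting policy (S702)."""
    dist = hops_to(adj, end)
    aux = {l: {"reaches_end": l in dist, "cost": dist.get(l)} for l in landmarks}
    visit = sorted((l for l in landmarks if aux[l]["reaches_end"]),
                   key=lambda l: (aux[l]["cost"], l))
    skip = [l for l in landmarks if not aux[l]["reaches_end"]]
    return aux, visit, skip

# Hypothetical toy graph, not FIG. 11 itself.
graph = {
    "michael": ["hg1", "hg2"], "x": ["cl4"],
    "hg1": ["cl3"], "hg2": ["cl3"],
    "cl3": ["cl4"], "cl4": ["cl5", "cl6"],
    "cl5": [], "cl6": ["eric"], "eric": [],
}
aux, visit, skip = path_policy(graph, ["cl3", "cl4", "cl5", "cl6"], "eric")
print(visit)  # ['cl6', 'cl4', 'cl3']
print(skip)   # ['cl5']
```

The search in S703 would then follow `visit` and ignore `skip`. Search time or occupied resources, which the text also lists as auxiliary information, could replace the hop count as the sort key.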
  • the aforementioned program can be stored in a computer readable storage medium.
  • the program when executed, performs the steps including the foregoing method embodiments; and the foregoing storage medium includes various media that can store program codes, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • FIG. 14 is a schematic structural diagram of Embodiment 1 of a search device for graph data according to an embodiment of the present invention.
  • The search device 101 for graph data can be integrated on a compute node in the distributed computing system described above.
  • the apparatus includes an acquisition module 10 and a processing module 11.
  • The acquisition module 10 is configured to acquire a query request, where the query request includes a query condition carrying a start graph node, and the query request is used to query the graph data set for a first to-be-viewed graph node that matches the query condition; the graph data set includes the start graph node, a plurality of to-be-viewed graph nodes, the association relationships between the start graph node and the plurality of to-be-viewed graph nodes, and the association relationships between each of the plurality of to-be-viewed graph nodes and the other to-be-viewed graph nodes;
  • the processing module 11 is configured to filter the location according to the query condition and the preset available resource condition a second to-be-viewed graph node that does not satisfy the query condition and an association relationship that includes the second to-be-viewed graph node in the graph data set to obtain a protocol subgraph, and query the protocol subgraph by using the query condition Obtaining the first to-be-viewed graph node; wherein the protocol sub-graph includes the start graph node, a first to-be-checked
  • The processing module 11 is specifically configured to generate a query topology according to the query condition and, according to the query topology relationships between the query nodes in the query topology, a preset first access cost of accessing the first to-be-viewed graph node, and the available resource condition, filter out of the graph data set the second to-be-viewed graph nodes whose access cost exceeds the first access cost and the association relationships that include those second to-be-viewed graph nodes, to obtain the protocol subgraph; where the query topology includes a plurality of query nodes and the query topology relationships between each of the plurality of query nodes and the other query nodes, and the resources occupied by the protocol subgraph do not exceed the available resource condition.
  • the processing module 11 is configured to read a query node stored in the storage space and a graph node that matches the query node, and determine whether the read graph node includes the read graph node. If the read graph node is not included in the protocol submap, the read graph node is added to the protocol submap, and it is determined that the resource occupied by the protocol submap does not exceed the available
  • the resource condition further calculates an access cost of the to-be-checked node adjacent to the read graph node according to the query topology relationship between the query nodes in the query topology, and filters the access cost to exceed the a second to-be-viewed node of the first access cost and an association relationship of the second to-be-viewed node including the access cost exceeding the first access cost, and outputting the storage to the storage space according to a preset dynamic protocol parameter Access sequence; wherein the storage space stores a query node in the query topology and a graph node matching the query node, and the query node includes a start Inquiring a node, the
  • the processing module 11 is specifically configured to set the number of graph nodes in the protocol submap to zero, and the number of query nodes stored in the storage space and the query. After the number of graph nodes matched by the node is set to zero, and the dynamic protocol parameter is set to the first preset value, the initial query node and the start graph node in the query topology are stored to the storage space.
  • a second to-be-viewed node whose cost exceeds the first access cost and an association relationship of the second to-be-checked node that includes the access cost exceeding the first access cost, and outputs the storage to the location according to the dynamic protocol parameter Determining an access sequence of the storage space; and further determining whether there is an unmarked query node and a graph node matching the unmarked query node in the storage space;
  • the storage space has an unmarked query node and a graph node matching the unmarked query node, and then continues to read the query node stored in the storage space and the graph node matched with the query node until the The query node stored in the storage space and the graph node matching the query node are marked; if the query node stored in the storage space and the graph node matching the query node are marked with null, then further Determining whether the protocol subgraph changes, and when determining that the protocol subgraph has not changed, ending the calculation to obtain the protocol subgraph; wherein the starting graph node is opposite to the starting query no
  • processing module 11 is further configured to: if it is determined that the protocol submap changes, re-storing the initial query node and the start graph node of the query node to the storage space, and The value of the dynamic protocol parameter is adjusted to a second preset value, and the query node stored in the storage space and the graph node matching the query node are continuously read until the number of query nodes stored in the storage space is The number of graph nodes matching the query node is zero or until the query node stored in the storage space and the graph node matching the query node are marked.
  • processing module 11 is further configured to: if the resource occupied by the protocol subgraph exceeds the available resource condition, end the calculation to obtain the protocol submap.
  • For the search device for graph data provided by this embodiment, reference may be made to the foregoing method embodiments; its implementation principle and technical effects are similar and are not described here again.
  • FIG. 15 is a schematic structural diagram of Embodiment 2 of a search device for graph data according to an embodiment of the present invention.
  • The search device 102 for graph data can be integrated on a compute node in the distributed computing system described above.
  • the apparatus includes an acquisition module 20 and a processing module 21.
  • The acquisition module 20 is configured to acquire a query request, where the query request includes a query condition carrying a start graph node and a termination graph node, and the query request is used to request a query of the graph data set for a first to-be-viewed graph node that matches the query condition; the graph data set includes the start graph node, a plurality of to-be-viewed graph nodes, the termination graph node, and the association relationships among the start graph node, the termination graph node, and the plurality of to-be-viewed graph nodes;
  • The processing module 21 is configured to determine landmark nodes of the graph data set according to the betweenness centrality of a plurality of graph nodes in the graph data set and a preset available resource condition, establish a landmark node tree according to the landmark nodes, and search the landmark node tree according to the query condition to obtain the first to-be-viewed graph node; where the landmark node tree includes landmark nodes that have a hierarchical relationship.
  • The processing module 21 is specifically configured to acquire auxiliary information of each landmark node in the landmark node tree according to the query condition, determine, according to the auxiliary information of the landmark nodes, a path policy for obtaining the first to-be-viewed graph node, and then search the landmark node tree according to the path policy to obtain the first to-be-viewed graph node.
  • FIG. 16 is a schematic structural diagram of Embodiment 1 of a search device for graph data according to an embodiment of the present invention.
  • The search device 103 for graph data may be a compute node in the distributed computing system described above.
  • the device includes a processor 30, a memory 31, and a user interface 32, which are connected by a bus 33.
  • the device provided by the embodiment of the present invention may further include a communication interface or the like for communicating with other devices.
  • the device shown in FIG. 16 may specifically be an electronic device such as a mobile phone, a tablet computer, a desktop computer, a portable computer, or a server.
  • the bus 33 is used to implement connection communication between the processor 30, the memory 31, and the user interface 32.
  • the bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, or an EISA (Extended Industry Standard Architecture) bus.
  • The bus may be one or more physical lines; when it is a plurality of physical lines, it may be divided into an address bus, a data bus, a control bus, and the like.
  • User interface 32 is used to receive the user's actions or to present the page to the user.
  • The search device for graph data may acquire a query request through the user interface 32, thereby causing the processor 30 to perform the corresponding operations according to the query request.
  • the memory 31 is used to store computer programs, and may include applications and operating system programs.
  • the processor 30 is configured to read a computer program from the memory 31 and perform the following operations, specifically:
  • the query request includes a query condition carrying a start graph node, where the query request is used to query a first to-be-viewed graph node in the graph data set that matches the query condition; Include the start graph node, the plurality of graph view nodes, and an association relationship between the start graph node and the plurality of graph view nodes and each of the plurality of graph view nodes to be checked The relationship between the graph node and other nodes to be inspected;
  • the protocol subgraph includes the start graph node, a first to-be-viewed graph node that matches the query condition, and an association relationship between the start graph node and the first graph to be checked node;
  • the protocol subgraph is queried by the query condition to obtain the first to-be-viewed node.
  • The processor 30 is specifically configured to generate a query topology according to the query condition, where the query topology includes a plurality of query nodes and the query topology relationships between each of the plurality of query nodes and the other query nodes; and, according to the query topology relationships between the query nodes in the query topology, a preset first access cost of accessing the first to-be-viewed graph node, and the available resource condition, to filter out of the graph data set the second to-be-viewed graph nodes whose access cost exceeds the first access cost and the association relationships that include those second to-be-viewed graph nodes, to obtain the protocol subgraph; where the resources occupied by the protocol subgraph do not exceed the available resource condition.
  • the processor 30 is configured to read a query node stored in the storage space and a graph node that matches the query node, and determine whether the read graph node is included in the protocol submap. If the read graph node is not included in the protocol submap, the read graph node is added to the protocol submap, and it is determined that the resource occupied by the protocol submap does not exceed
  • the available resource condition further calculates an access cost of the to-be-viewed node adjacent to the read graph node according to the query topology relationship between the query nodes in the query topology, and filters the access cost a second to-be-viewed node that exceeds the first access cost and an association relationship that includes the second to-be-viewed node whose access cost exceeds the first access cost, and outputs the storage to the location according to the preset dynamic protocol parameter
  • An access sequence of the storage space wherein the storage space stores a query node in the query topology and a graph node matching the query node, and the query node includes a start query node, where the graph
  • the processor 30 is specifically configured to set the number of graph nodes in the protocol submap to zero, and the number and the number of query nodes stored in the storage space. After the number of graph nodes matched by the query node is set to zero, and the dynamic protocol parameter is set to a first preset value, the initial query node and the start graph node in the query topology are stored in the Storing a storage space, and after reading the query node stored in the storage space and the graph node matching the query node, determining whether the read graph node is included in the protocol submap; If the read graph node is not included in the sub-picture, the read graph node is further added to the protocol sub-graph, and it is determined that the resource occupied by the protocol sub-graph does not exceed the available resource condition.
  • the processor 30 is specifically configured to set the number of graph nodes in the protocol submap to zero, and the number of query nodes stored in the storage space. After the number of graph nodes matching the query node is set to zero, and the dynamic protocol parameter is set to the first preset value, the start query node and the start graph node in the query topology are stored to The storage space, and reading a query node stored in the storage space and a graph node matching the query node, and matching the read query node in the storage space with the query node After the graph node is marked, it is further determined whether the read graph node is included in the protocol submap; if the read graph node is not included in the protocol submap, the read graph is further Adding a node to the protocol submap, and determining that the resource occupied by the protocol submap does not exceed the available resource condition, and further according to a query topology between query nodes in the query topology After calculating an access cost of the to-be-
  • The processor 30 is further configured to: if it is determined that the protocol subgraph has changed, store the start query node of the query nodes and the start graph node to the storage space again, adjust the value of the dynamic protocol parameter to a second preset value, and continue to read the query nodes stored in the storage space and the graph nodes matching the query nodes, until the number of query nodes stored in the storage space and the number of graph nodes matching the query nodes are zero, or until all query nodes stored in the storage space and the graph nodes matching them are marked.
  • the processor 30 is further configured to: if the resource occupied by the protocol subgraph exceeds the available resource condition, end the calculation to obtain the protocol submap.
  • The search device for graph data may perform the foregoing method embodiments, and the computer program it includes may be divided into modules in the form described in the foregoing device embodiments, divided into other modules, or not divided into modules at all.
  • FIG. 17 is a schematic structural diagram of Embodiment 2 of a search device for graph data according to an embodiment of the present invention.
  • The search device 104 for graph data may be a compute node in the distributed computing system described above.
  • the device includes a processor 40, a memory 41, and a user interface 42, all of which are connected by a bus 43.
  • the device provided by the embodiment of the present invention may further include a communication interface or the like for communicating with other devices.
  • the device shown in FIG. 17 may specifically be an electronic device such as a mobile phone, a tablet computer, a desktop computer, a portable computer, or a server.
  • the bus 43 is used to implement connection communication between the processor 40, the memory 41, and the user interface 42.
  • The bus may be an ISA bus, a PCI bus, or an EISA bus.
  • the bus may be one or more physical lines, and when it is a plurality of physical lines, it may be divided into an address bus, a data bus, a control bus, and the like.
  • User interface 42 is used to receive the user's actions or to present the page to the user.
  • The search device for graph data may acquire a query request through the user interface 42, thereby causing the processor 40 to perform the corresponding operations according to the query request.
  • the memory 41 is used to store computer programs, and may include applications and operating system programs.
  • the processor 40 is configured to read a computer program from the memory 41 and perform the following operations, specifically:
  • the query request includes a query condition carrying a start graph node and a termination graph node, and the query request is used to request a query of the graph data set for a first to-be-viewed graph node that matches the query condition;
  • the graph data set includes the start graph node, a plurality of to-be-viewed graph nodes, the termination graph node, and the association relationships among the start graph node, the termination graph node, and the plurality of to-be-viewed graph nodes;
  • the landmark node tree includes a landmark node having a hierarchical relationship.
  • The processor 40 is specifically configured to acquire auxiliary information of each landmark node in the landmark node tree according to the query condition, determine, according to the auxiliary information of the landmark nodes, a path policy for obtaining the first to-be-viewed graph node, and then search the landmark node tree according to the path policy to obtain the first to-be-viewed graph node.
  • The search device for graph data may perform the foregoing method embodiments, and the computer program it includes may be divided into modules in the form described in the foregoing device embodiments, divided into other modules, or not divided into modules at all.

Abstract

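A method and an apparatus for searching graph data. The method includes: acquiring a query request, where the query request includes a query condition carrying a start graph node, the query request is used to query a graph data set for a first to-be-viewed graph node that matches the query condition, and the graph data set includes the start graph node, a plurality of to-be-viewed graph nodes, the association relationships between the start graph node and the plurality of to-be-viewed graph nodes, and the association relationships between each of the plurality of to-be-viewed graph nodes and the other graph nodes (S101); filtering out of the graph data set, according to the query condition and a preset available resource condition, second to-be-viewed graph nodes that do not satisfy the query condition and the association relationships that include the second to-be-viewed graph nodes, to obtain a protocol subgraph, where the protocol subgraph includes the start graph node, the first to-be-viewed graph node that matches the query condition, and the association relationship between the start graph node and the first to-be-viewed graph node (S102); and querying the protocol subgraph with the query condition to obtain the first to-be-viewed graph node (S103). The method improves the efficiency of searching graph data and saves the storage resources and time resources of the computer.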

Description

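Method and Apparatus for Searching Graph Data
Technical Field
Embodiments of the present invention relate to computer technologies, and in particular, to a method and an apparatus for searching graph data.
Background
With the continuous development of computer technology, the amount of data processed by computers keeps growing, and the current big-data era is also an era in which graph data flourishes; graph data here means data items that have association relationships with one another. Accordingly, a computer often has to analyze the full volume of big data, consuming a large amount of time and storage resources to obtain an exact search result.
To avoid the resource consumption of the traditional search mechanism, the prior art proposes a data sampling query technique (BlinkDB), which continuously samples the original graph data with a specific sampling algorithm, builds and maintains graph data samples, and then obtains the corresponding search results.
However, the prior-art data sampling query technique (BlinkDB) needs additional storage overhead to maintain the graph data samples, which wastes the computer's storage resources to a large extent.
Summary
Embodiments of the present invention provide a method and an apparatus for searching graph data, which search graph data effectively while avoiding the waste of resources caused when searching graph data.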
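According to a first aspect, an embodiment of the present invention provides a method for searching graph data, including:
acquiring a query request, where the query request includes a query condition carrying a start graph node, and the query request is used to query a graph data set for a first to-be-viewed graph node that matches the query condition; the graph data set includes the start graph node, a plurality of to-be-viewed graph nodes, the association relationships between the start graph node and the plurality of to-be-viewed graph nodes, and the association relationships between each of the plurality of to-be-viewed graph nodes and the other to-be-viewed graph nodes;
filtering out of the graph data set, according to the query condition and a preset available resource condition, second to-be-viewed graph nodes that do not satisfy the query condition and the association relationships that include the second to-be-viewed graph nodes, to obtain a protocol subgraph; the protocol subgraph includes the start graph node, the first to-be-viewed graph node that matches the query condition, and the association relationship between the start graph node and the first to-be-viewed graph node; and
querying the protocol subgraph with the query condition to obtain the first to-be-viewed graph node.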
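With reference to the first aspect, in a first possible implementation of the first aspect, the filtering out of the graph data set, according to the query condition and the preset available resource condition, second to-be-viewed graph nodes that do not satisfy the query condition and the association relationships corresponding to the second to-be-viewed graph nodes, to obtain a protocol subgraph, includes:
generating a query topology according to the query condition, where the query topology includes a plurality of query nodes and the query topology relationships between each of the plurality of query nodes and the other query nodes; and
filtering out of the graph data set, according to the query topology relationships between the query nodes in the query topology, a preset first access cost of accessing the first to-be-viewed graph node, and the available resource condition, second to-be-viewed graph nodes whose access cost exceeds the first access cost and the association relationships that include the second to-be-viewed graph nodes, to obtain the protocol subgraph; where the resources occupied by the protocol subgraph do not exceed the available resource condition.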
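With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the filtering out of the graph data set, according to the query topology relationships between the query nodes in the query topology, the preset first access cost of accessing the first to-be-viewed graph node, and the available resource condition, second to-be-viewed graph nodes whose access cost exceeds the first access cost and the association relationships that include the second to-be-viewed graph nodes, to obtain the protocol subgraph, includes:
reading a query node stored in a storage space and the graph node matching that query node, where the storage space stores query nodes of the query topology and the graph nodes matching those query nodes, the query nodes include a start query node, the graph nodes include the start graph node or the to-be-viewed graph nodes, and the start graph node matches the start query node;
determining whether the protocol subgraph includes the read graph node;
if the protocol subgraph does not include the read graph node, adding the read graph node to the protocol subgraph and determining that the resources occupied by the protocol subgraph do not exceed the available resource condition; and
calculating, according to the query topology relationships between the query nodes in the query topology, the access costs of the to-be-viewed graph nodes adjacent to the read graph node, filtering out the second to-be-viewed graph nodes whose access cost exceeds the first access cost and the association relationships that include those second to-be-viewed graph nodes, and outputting, according to a preset dynamic protocol parameter, an access sequence that is stored to the storage space; where the access cost of every to-be-viewed graph node in the access sequence does not exceed the first access cost, and the dynamic protocol parameter is used to control the number of to-be-viewed graph nodes in the access sequence.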
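With reference to the first possible implementation of the first aspect, in a third possible implementation of the first aspect, the filtering out of the graph data set, according to the query topology relationships between the query nodes in the query topology, the preset first access cost of accessing the first to-be-viewed graph node, and the available resource condition, second to-be-viewed graph nodes whose access cost exceeds the first access cost and the association relationships that include the second to-be-viewed graph nodes, to obtain the protocol subgraph, includes:
Step A: setting the number of graph nodes in the protocol subgraph to zero, setting the number of query nodes stored in a storage space and the number of graph nodes matching the query nodes to zero, and setting a dynamic protocol parameter to a first preset value;
Step B: storing the start query node of the query topology and the start graph node to the storage space, where the start graph node matches the start query node;
Step C: reading a query node stored in the storage space and the graph node matching that query node;
Step D: determining whether the protocol subgraph includes the read graph node, where the read graph node is the start graph node or a to-be-viewed graph node;
Step E: if the protocol subgraph does not include the read graph node, adding the read graph node to the protocol subgraph and determining that the resources occupied by the protocol subgraph do not exceed the available resource condition;
Step F: calculating, according to the query topology relationships between the query nodes in the query topology, the access costs of the to-be-viewed graph nodes adjacent to the read graph node, filtering out the second to-be-viewed graph nodes whose access cost exceeds the first access cost and the association relationships that include those second to-be-viewed graph nodes, and outputting, according to the dynamic protocol parameter, an access sequence that is stored to the storage space; where the access cost of every to-be-viewed graph node in the access sequence does not exceed the first access cost, and the dynamic protocol parameter is used to control the number of to-be-viewed graph nodes in the access sequence;
Step G: determining whether the storage space is empty;
Step H: if the storage space is not empty, returning to step C, until the number of query nodes stored in the storage space and the number of graph nodes matching the query nodes are zero; if the storage space is empty, determining whether the protocol subgraph has changed;
Step I: if it is determined that the protocol subgraph has not changed, ending the calculation to obtain the protocol subgraph.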
结合第一方面的第一种可能的实施方式,在第一方面的第四种可能的实施方式中,所述根据所述查询拓扑结构中的查询节点之间的查询拓扑关系和预设的访问所述第一待查图节点的第一访问代价以及所述可用资源条件,过滤所述图数据集合中其访问代价超过所述第一访问代价的第二待查图节点和包含所述第二待查图节点对应的关联关系,以得到所述规约子图,包括:
步骤A:将所述规约子图中的图节点数目设置为零,将存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目设置为零,并将动态规约参数设为第一预设值;
步骤B:将所述查询拓扑结构中的起始查询节点和所述起始图节点存储至所述存储空间;所述起始图节点与所述起始查询节点相匹配;
步骤C:读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点,并对所述存储空间中已读取的查询节点和与所述查询节点匹配的图节点打标记;
步骤D:判断所述规约子图中是否包括所述读取的图节点;其中,所述读取的图节点包括所述起始图节点或所述待查图节点;
步骤E:若所述规约子图中不包括所述读取的图节点,则将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源不超过所述可用资源条件;
步骤F:根据所述查询拓扑结构中的查询节点之间的查询拓扑关系计算与所述读取的图节点相邻的待查图节点的访问代价,并过滤所述访问代价超过所述第一访问代价的第二待查图节点以及包含所述访问代价超过 所述第一访问代价的第二待查图节点的关联关系,并根据所述动态规约参数输出存储至所述存储空间的访问序列;其中,所述访问序列中的待查图节点的访问代价均不超过所述第一访问代价,所述动态规约参数用于控制所述访问序列中的待查图节点的数目;
步骤G:判断所述存储空间中是否存在未打标记的查询节点和与所述未打标记的查询节点匹配的图节点;
步骤H:若所述存储空间中存在未打标记的查询节点和与所述未打标记的查询节点匹配的图节点,则返回执行步骤C,直至所述存储空间存储的查询节点和与所述查询节点匹配的图节点均打有标记为止;若所述存储空间存储的查询节点和与所述查询节点匹配的图节点均打有标记,则判断所述规约子图是否发生变化;
步骤I:若判断所述规约子图没有发生变化,则结束计算,以得到所述规约子图。
结合第一方面的第三种可能的实施方式或第一方面的第四种可能的实施方式,在第一方面的第五种可能的实施方式中,所述判断所述规约子图是否发生变化之后,还包括:
若判断所述规约子图发生变化,则将所述查询节点的起始查询节点和所述起始图节点重新存储至所述存储空间,并将所述动态规约参数的值调整为第二预设值;
执行步骤C。
结合第一方面的第三种可能的实施方式至第一方面的第五种可能的实施方式中的任一项,在第一方面的第六种可能的实施方式中,所述方法还包括:
若所述规约子图所占的资源超过所述可用资源条件,则结束计算,以得到所述规约子图。
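According to a second aspect, an embodiment of the present invention provides a graph data search method, including:
obtaining a query request, where the query request includes a query condition carrying a start graph node and an end graph node and is used to request a query of a graph data set for first to-be-queried graph nodes that match the query condition; the graph data set includes the start graph node, multiple to-be-queried graph nodes, the end graph node, and the association relationships among the start graph node, the end graph node and the multiple to-be-queried graph nodes;
determining landmark nodes of the graph data set according to the betweenness centrality of multiple graph nodes in the graph data set and a preset available resource condition, and building a landmark node tree from the landmark nodes, where the landmark node tree includes landmark nodes that have a hierarchical relationship; and
searching the landmark node tree according to the query condition to obtain the first to-be-queried graph nodes.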
结合第二方面,在第二方面的第一种可能的实施方式中,根据所述查询条件搜索所述路标节点树,以得到所述第一待查图节点,包括:
根据所述查询条件获取所述路标节点树中各个路标节点的辅助信息;
根据所述路标节点的辅助信息确定用于获取所述第一待查图节点的路径策略;
根据所述路径策略搜索所述路标节点树,以得到所述第一待查图节点。
本发明实施例提供的图数据的搜索方法和装置中,根据查询请求中的查询条件和预设的可用资源条件过滤图数据集合中不满足查询条件的第二待查图节点,以得到规约子图,通过所述查询条件查询所述规约子图,以得到所需要的第一待查图节点,可见本发明实施例可以基于查询条件实时生成对应的规约子图,并根据实时生成的规约子图得到查询结果,提高了计算机搜索图数据的准确性,而且在本发明实施例采用查询条件对图数据集合的图节点进行过滤以生成规约子图的过滤过程中不会占用计算机的存储资源,并且所生成的规约子图是动态存储在内存中,其并不需要占用计算机的磁盘存储资源,因此降低了计算机的存储开销;进一步地,本发明实施例提供的方法中根据所述查询条件和预设的可用资源条件过滤所述图数据集合中不满足所述查询条件的第二待查图节点和包含所述第二待查图节点的关联关系,以得到规约子图,通过所述查询条件查询所述规约子图,以得到所需要的第一待查图节点,可见本发明实施例可以支持在不同的资源限制下得到较为精确的查询结果,从而克服了传统查询技术无法在不同资源限制下均能返回查询结果的缺陷。
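According to a third aspect, the present invention provides a graph data search apparatus, including:
an obtaining module, configured to obtain a query request, where the query request includes a query condition carrying a start graph node and is used to query a graph data set for first to-be-queried graph nodes that match the query condition; the graph data set includes the start graph node, multiple to-be-queried graph nodes, the association relationships between the start graph node and the multiple to-be-queried graph nodes, and the association relationships between each of the to-be-queried graph nodes and the other to-be-queried graph nodes; and
a processing module, configured to filter out of the graph data set, according to the query condition and a preset available resource condition, second to-be-queried graph nodes that do not satisfy the query condition and the association relationships that contain the second to-be-queried graph nodes, to obtain a reduced subgraph, and to query the reduced subgraph with the query condition to obtain the first to-be-queried graph nodes; the reduced subgraph includes the start graph node, the first to-be-queried graph nodes that match the query condition, and the association relationships between the start graph node and the first to-be-queried graph nodes.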
结合第三方面,在第三方面的第一种可能的实施方式中,所述处理模块,具体用于根据所述查询条件生成查询拓扑结构,并根据所述查询拓扑结构中的查询节点之间的查询拓扑关系和预设的访问所述第一待查图节点的第一访问代价以及所述可用资源条件,过滤所述图数据集合中其访问代价超过所述第一访问代价的第二待查图节点和包含所述第二待查图节点的关联关系,以得到所述规约子图;其中,所述查询拓扑结构包括多个查询节点,以及所述多个查询节点中的每个查询节点与其它查询节点之间的查询拓扑关系,所述规约子图所占的资源不超过所述可用资源条件。
结合第三方面的第一种可能的实施方式,在第三方面的第二种可能的实施方式中,所述处理模块,具体用于读取存储空间中存储的查询节点和与所述查询节点匹配的图节点,判断所述规约子图中是否包括所述读取的图节点,若所述规约子图中不包括所述读取的图节点,则将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源不超过所述可用资源条件,则进一步根据所述查询拓扑结构中的查询节点之间的查询拓扑关系计算与所述读取的图节点相邻的待查图节点的访问代价,并过滤所述访问代价超过所述第一访问代价的第二待查图节点以及包含所述访问代价超过所述第一访问代价的第二待查图节点的关联关系,并根据预设的动态规约参数输出存储至所述存储空间的访问序列;其中,所述存储空间中存储有所述查询拓扑结构中的查询节点和与所述查询节点相匹配的 图节点,所述查询节点包括起始查询节点,所述图节点包括所述起始图节点或所述待查图节点,所述起始图节点与所述起始查询节点相匹配;所述访问序列中的待查图节点的访问代价均不超过所述第一访问代价,所述动态规约参数用于控制所述访问序列中的待查图节点的数目。
结合第三方面的第一种可能的实施方式,在第三方面的第三种可能的实施方式中,所述处理模块,具体用于将所述规约子图中的图节点数目设置为零,将存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目设置为零,并将动态规约参数设为第一预设值后,将所述查询拓扑结构中的起始查询节点和所述起始图节点存储至所述存储空间,并在读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点后,判断所述规约子图中是否包括所述读取的图节点;若所述规约子图中不包括所述读取的图节点,则进一步将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源不超过所述可用资源条件后,进一步根据所述查询拓扑结构中的查询节点之间的查询拓扑关系计算与所述读取的图节点相邻的待查图节点的访问代价,过滤所述访问代价超过所述第一访问代价的第二待查图节点以及包含所述访问代价超过所述第一访问代价的第二待查图节点的关联关系,并根据所述动态规约参数输出存储至所述存储空间的访问序列;并进一步判断所述存储空间是否为空,若所述存储空间非空,则继续读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点,直至所述存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目为零为止;若所述存储空间为空,则进一步判断所述规约子图是否发生变化,并在判断所述规约子图没有发生变化时,结束计算,以得到所述规约子图;其中,所述起始图节点与所述起始查询节点相匹配,所述读取的图节点包括所述起始图节点或所述待查图节点,所述访问序列中的待查图节点的访问代价均不超过所述第一访问代价,所述动态规约参数用于控制所述访问序列中的待查图节点的数目。
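With reference to the first possible implementation of the third aspect, in a fourth possible implementation of the third aspect, the processing module is specifically configured to: set the number of graph nodes in the reduced subgraph to zero, set the number of query nodes and of matching graph nodes held in the storage space to zero, and set the dynamic reduction parameter to a first preset value; then store the start query node of the query topology structure and the start graph node in the storage space; read the query node and the matching graph node stored in the storage space and mark the query node and matching graph node that have been read; and then determine whether the reduced subgraph includes the read graph node. If the reduced subgraph does not include the read graph node, the processing module adds the read graph node to the reduced subgraph and determines that the resources occupied by the reduced subgraph do not exceed the available resource condition; it then computes, according to the query topology relationships between the query nodes in the query topology structure, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filters out the second to-be-queried graph nodes whose access cost exceeds the first access cost together with the association relationships that contain them, and outputs, according to the dynamic reduction parameter, the access sequence to be stored in the storage space. The processing module then determines whether the storage space contains an unmarked query node and a graph node matching it. If so, it continues reading query nodes and matching graph nodes from the storage space until all of them are marked. If all query nodes and matching graph nodes in the storage space are marked, it determines whether the reduced subgraph has changed and, when it has not, ends the computation to obtain the reduced subgraph. Here the start graph node matches the start query node, the read graph node is the start graph node or a to-be-queried graph node, the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter controls the number of to-be-queried graph nodes in the access sequence.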
结合第三方面的第三种可能的实施方式或第三方面的第四种可能的实施方式,在第三方面的第五种可能的实施方式中,所述处理模块,还用于若判断所述规约子图发生变化,则将所述查询节点的起始查询节点和所述起始图节点重新存储至所述存储空间,并将所述动态规约参数的值调整为第二预设值,并继续读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点,直至所述存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目为零为止或直至所述存储空间存储的查询节点和与所述查询节点匹配的图节点均打有标记为止。
结合第三方面的第三种可能的实施方式至第三方面的第五种可能的实施方式中的任一项,在第三方面的第六种可能的实施方式中,所述处理模块,还用于若所述规约子图所占的资源超过所述可用资源条件,则结束计算,以得到所述规约子图。
第四方面,本发明实施例提供一种图数据的搜索装置,包括:
获取模块,用于获取查询请求;其中,所述查询请求包括携带起始图节点和终止图节点的查询条件,所述查询请求用于请求查询图数据集合中与所述查询条件匹配的第一待查图节点;所述图数据集合包括所述起始图节点、多个待查图节点、所述终止图节点,以及,所述起始图节点、所述终止图节点以及所述多个待查图节点之间的关联关系;
处理模块,用于根据所述图数据集合中的多个图节点的中间度核心性和预设的可用资源条件确定所述图数据集合的路标节点,并根据所述路标节点建立路标节点树,并根据所述查询条件搜索所述路标节点树,以得到所述第一待查图节点;其中,所述路标节点树包括具有层次关系的路标节点。
结合第四方面,在第四方面的第一种可能的实施方式中,所述处理模块,具体用于根据所述查询条件获取所述路标节点树中各个路标节点的辅助信息,并根据所述路标节点的辅助信息确定用于获取所述第一待查图节点的路径策略后,根据所述路径策略搜索所述路标节点树,以得到所述第一待查图节点。
本发明实施例提供的图数据的搜索方法和装置,通过根据确定图数据集合中图节点的中间度核心性和预设的可用资源条件确定路标节点,并建立路标节点树,然后根据查询条件搜索该路标节点树确定满足该查询条件的第一待查图节点。由于在搜索第一待查图节点的过程中采用了路标节点树,使得搜索第一待查图节点所经过的路径和图节点均为直接有效的路径和图节点,避免了计算机在获取第一待查图节点时的无效搜索,节省了计算机的时间资源,提高了搜索效率。
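Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a distributed computing system according to the present invention;
Fig. 2 is a schematic flowchart of Embodiment 1 of the graph data search method according to the present invention;
Fig. 3 is a first schematic diagram of a graph data set according to the present invention;
Fig. 4 is a schematic diagram of a reduced subgraph according to the present invention;
Fig. 5 is a schematic flowchart of Embodiment 2 of the graph data search method according to the present invention;
Fig. 6 is a schematic diagram of a query topology structure according to the present invention;
Fig. 7 is a schematic flowchart of Embodiment 3 of the graph data search method according to the present invention;
Fig. 8 is a schematic flowchart of Embodiment 4 of the graph data search method according to the present invention;
Fig. 9 is a schematic flowchart of Embodiment 5 of the graph data search method according to the present invention;
Fig. 10 is a schematic flowchart of Embodiment 6 of the graph data search method according to the present invention;
Fig. 11 is a second schematic diagram of a graph data set according to the present invention;
Fig. 12 is a schematic diagram of a landmark node tree according to the present invention;
Fig. 13 is a schematic flowchart of Embodiment 7 of the graph data search method according to the present invention;
Fig. 14 is a schematic structural diagram of Embodiment 1 of the graph data search apparatus according to an embodiment of the present invention;
Fig. 15 is a schematic structural diagram of Embodiment 2 of the graph data search apparatus according to an embodiment of the present invention;
Fig. 16 is a schematic structural diagram of Embodiment 1 of the graph data search device according to an embodiment of the present invention;
Fig. 17 is a schematic structural diagram of Embodiment 2 of the graph data search device according to an embodiment of the present invention.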
具体实施方式
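To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.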
本发明实施例适用于搜索大规模图数据的场景,其具体适用于应用于分布式计算系统中的计算节点进行图数据搜索的场景。该分布式计算系统包括至少一个计算节点,该计算节点例如可以是计算机,也可以是计算机中的服务器,还可以是面向用户的通信设备。该分布式计算系统可以参见图1所示的系统架构图,可选的,中心节点为接收用户查询命令的计算节点,该中心节点可以将用户输入的查询命令拆分成不同的查询请求,并将所拆分的查询请求发送给对应的计算节点,从而使得分布式计算系统中的其他计算节点可以根据中心节点拆分的查询请求搜索数据。当然,中心节点所拆分的查询请求中也可以包括自己对应的查询请求,即中心节点也可以根据自身对应的查询请求搜索数据。可选的,下述实施例的技术方案均以计算机作为执行主体来介绍。
图2为本发明提供的图数据的搜索方法实施例一的流程示意图。如图2所示,该方法包括:
S101:获取查询请求;其中,所述查询请求包括携带起始图节点的查询条件,所述查询请求用于查询图数据集合中与所述查询条件匹配的第一待查图节点;所述图数据集合包括所述起始图节点、多个待查图节点以及所述起始图节点与所述多个待查图节点之间的关联关系以及所述多个待查图节点中的每个待查图节点与其它待查图节点之间的关联关系。
具体的,计算机获取用户的查询请求。可选的,该查询请求可以是用户配置给计算机的,还可以是用户通过其他设备发送给计算机的,例如通过图1所示的中心节点发送给该计算机的。该查询请求中可以包括携带起始图节点的查询条件,并且该查询请求用于查询图数据集合与该查询条件匹配的第一待查图节点。需要说明的是,图数据集合中所包括的数据可以以图节点的形式存储,故图数据集合中可以包括起始图节点、多个待查图节点以及起始图节点与所述多个待查图节点之间的关联关系以及所述多个待查图节点中的每个待查图节点与其它待查图节点之间的关联关系。该关联关系在图数据集合中指的是起始图节点与各个待查图节点构成的边。另外,图数据集合中的待查图节点为待搜索图节点或待查询图节点。
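It should be noted that both the start graph node and the to-be-queried graph nodes in the graph data set are represented by data, and the association relationships between the start graph node and the to-be-queried graph nodes are likewise represented by association relationships between data. In addition, there may be one or more first to-be-queried graph nodes.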
例如,参见图3所示的图数据集合,假设查询请求中的查询条件为“找到所有认识michael所在的徒步旅行俱乐部成员和LA自行车俱乐部成员的自行车爱好者”。图3中michael为图数据集合的起始图节点,除michael之外的所有图节点均为待查图节点。图3中的HG代表的是徒步旅行俱乐部,hg代表的是徒步旅行俱乐部的成员,即为图数据集合中的待查图节点,CC代表的是自行车俱乐部,cc代表的是自行车俱乐部的成员,其也是图数据集合中的待查图节点。另外,图3中所示的michael与其他待查图节点之间的连接线即为图数据集合中的起始图节点与待查图节点之间的关联关系,该关联关系即为起始图节点与各个待查图节点构成的边。
S102:根据所述查询条件和预设的可用资源条件过滤所述图数据集合中不满足所述查询条件的第二待查图节点和包含所述第二待查图节点的关联关系,以得到规约子图;所述规约子图包括所述起始图节点、与所述查询条件匹配的第一待查图节点以及所述起始图节点与所述第一待查图节点之间的关联关系。
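Specifically, after obtaining the query request, the computer parses it to learn the query condition. According to the query condition and the available resource condition preset on the computer, it then determines the second to-be-queried graph nodes in the graph data set that do not satisfy the query condition, together with the association relationships that contain them, and filters them out to obtain the reduced subgraph. The reduced subgraph includes the start graph node, the first to-be-queried graph nodes that match the query condition, and the association relationships between the start graph node and the first to-be-queried graph nodes. Optionally, the reduced subgraph may take the form of a graph, of a set of mappings with association relationships, or any other form that can represent the association relationships between the start graph node and the first to-be-queried graph nodes, provided that it lets the computer quickly find the desired result according to the query condition. The available resource condition may be a threshold, a range, or an upper limit on resource size. The computer may filter out one or more second to-be-queried graph nodes that do not satisfy the query condition.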
需要说明的是,本发明实施例涉及的“过滤”操作是指计算机从图数据集合中将不满足查询条件的第二待查图节点和第二待查图节点对应的关联关系进行筛选或忽略,从而将图数据集合中剩余的待查图节点生成规约子图,并且将所生成的规约子图动态缓存在内存中,待计算机根据该规约子图获取到第一待查图节点之后,将该规约子图进行释放。也就是说,计算机并不需要将该规约子图从内存写入磁盘,节省了计算机磁盘的存储开销;并且,计算机直接从内存中的规约子图中搜索得到第一待查节点,不用执行从磁盘到内存的IO操作,因此所花费的时间处理资源小,有效的提高了计算机搜索图数据的效率。
另外,上述图数据集合中所述的“起始图节点与所述待查图节点之间的关联关系”仅指的是图数据集合中所有图节点构成的边,而规约子图中的关联关系包括从起始图节点到第一待查图节点所经过的所有待查图节点和边。另外,上述预设的可用资源条件是用于约束规约子图所占资源的大小的。
继续参见图3所举的例子,计算机根据查询条件和预设的可用资源判断hg1、hg2、cc3以及hg1、hg2下的cl1、cl2不满足上述查询条件,则将hg1、hg2、cc3、cl1、cl2以及这些第二待查图节点对应的关联关系进行删除,从而生成规约子图,该规约子图可以参见图4所示。该规约子图中包括满足查询条件的第一待查图节点cln、cln-1以及这些第一待查图节点与起始图节点之间的关联关系(该关联关系既包括hgm、cc1、cc2这三个待查图节点,也包括这三个待查图节点与起始图节点、第一待查图节点构成的边),并且该规约子图所占的存储资源不超过上述预设的可用资源条件。
也就是说,计算机在过滤不满足查询条件的第二待查图节点以及该第二待查图节点对应的关联关系时,是基于查询条件和预设的可用资源条件两方面共同考虑的。
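The filtering step S102 can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the function `reduce_graph`, the use of a node count (`max_nodes`) as a stand-in for the preset available resource condition, and the edge list, which only approximates Fig. 3 as far as the text describes it (the hg1-cl1 and hg2-cl2 edges in particular are guesses). The source does not prescribe this data layout.

```python
# Hypothetical edge list loosely following the Fig. 3 example in the text.
EDGES = [
    ("michael", "hg1"), ("michael", "hg2"), ("michael", "hgm"),
    ("michael", "cc1"), ("michael", "cc2"), ("michael", "cc3"),
    ("hg1", "cl1"), ("hg2", "cl2"),
    ("hgm", "cln"), ("hgm", "cln-1"),
    ("cc1", "cln"), ("cc1", "cln-1"), ("cc2", "cln"),
]
# Nodes assumed to satisfy the query condition (taken from the Fig. 4 description).
MATCHING = {"hgm", "cc1", "cc2", "cln", "cln-1"}


def reduce_graph(edges, start, satisfies, max_nodes):
    """Keep the start node plus nodes that satisfy the query condition,
    never holding more than max_nodes nodes, and drop every edge that
    touches a filtered-out node. The result lives only in memory."""
    kept = {start}
    for u, v in edges:
        for node in (u, v):
            if node != start and satisfies(node) and len(kept) < max_nodes:
                kept.add(node)
    kept_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, kept_edges


kept, kept_edges = reduce_graph(EDGES, "michael", lambda n: n in MATCHING, max_nodes=10)
print(sorted(kept))
print(len(kept_edges))
```

With this toy input the second-type nodes hg1, hg2, cc3, cl1 and cl2 and their edges disappear, which mirrors the Fig. 4 description. A real implementation would decide `satisfies` from the query topology and access costs described in the later embodiments rather than from a precomputed set.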
S103:通过所述查询条件查询所述规约子图,以得到所述第一待查图节点。
现有技术在搜索对应的图节点上的数据时,往往是通过采集样本,并将采集到的样本静态存储至磁盘中,计算机接收到的所有的查询请求都是 基于磁盘中存储的样本进行查询的,其不仅造成计算机存储资源的浪费,另一方面,由于所有的查询请求都是基于磁盘中同一个样本查询,查询精度也不高;但是,在本发明实施例中,采用的是根据查询条件对图数据集合的图节点进行过滤的方法,在过滤过程中不会占用计算机的存储资源,并且所生成的规约子图是动态存储在内存中,其并不需要占用计算机的磁盘存储资源,因此降低了计算机的存储开销;另一方面,在搜索过程中,在根据某一个查询条件查询完成之后,计算机会释放规约子图,当计算机下一时刻重新接收到的一个新的查询请求后,动态生成新的规约子图,并基于该新的规约子图进行查询,因此,本发明实施例提供的方法,提高了计算机搜索图数据的准确性。进一步地,本发明实施例提供的方法可以在任意资源限制下返回查询精确的查询结果,克服了传统查询技术无法在任何资源限制下均返回查询结果的缺陷。
本发明实施例提供的图数据的搜索方法,根据查询请求中的查询条件和预设的可用资源过滤图数据集合中不满足查询条件的第二待查图节点,获得规约子图,并根据该规约子图获得所需要的第一待查图节点,即在本发明实施例采用查询条件对图数据集合的图节点进行过滤以生成规约子图的方法,在过滤过程中不会占用计算机的存储资源,并且所生成的规约子图是动态存储在内存中,其并不需要占用计算机的磁盘存储资源,因此降低了计算机的存储开销;另一方面,本发明实施例基于不同的查询条件实时生成规约子图,并根据实时生成的规约子图得到查询结果,提高了计算机搜索图数据的准确性;进一步地,本发明实施例提供的方法可以在任意资源限制下返回查询精确的查询结果,克服了传统查询技术无法在任何资源限制下均返回查询结果的缺陷。
图5为本发明提供的图数据的搜索方法实施例二的流程示意图。在上述实施例的基础上,本实施例涉及的是计算机获取规约子图的具体过程。进一步地,上述S102具体包括:
S201:根据所述查询条件生成查询拓扑结构;所述查询拓扑结构包括多个查询节点,以及所述多个查询节点中的每个查询节点与其它查询节点之间的查询拓扑关系。
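Specifically, the query topology structure that the computer generates from the query condition may take the form of a query pattern graph, in which every query node has a query topology relationship with the other query nodes.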
为了便于理解查询拓扑结构,此处以一个简单的例子来进行说明。参见图6所示的查询拓扑结构,计算机根据上述图3所示的例子中的查询条件“找到所有认识michael所在的徒步旅行俱乐部成员和LA自行车俱乐部成员的自行车爱好者”,获知自身所要找的自行车爱好者(即第一待查图节点)是HG与CC中的成员共同认识的人,且这个人属于自行车爱好者的社交圈(CL)。因此,计算机会将Michael设置为起始查询节点(为了便于与图数据集合中的michael区分,这里采用Michael,该Michael的实际含义是一个查询节点,该查询节点的意义为要找到一个叫michael的人,michael指的是图数据集合中实际的图节点,即起始查询节点Michael与起始图节点michael是相互匹配的,起始图节点michael就是起始查询节点Michael所要查找的人);之后,计算机会将HG、CC以及CL作为其他的查询节点,即计算机要找的人必须是CL中的人,且这个人必须是HG、CC中的成员共同认识的人。故,计算机就此构建了查询拓扑结构,之后在查询过程中可以根据该查询拓扑结构中的查询节点之间的查询拓扑关系进行查询,例如计算机只需要找HG、CC以及CL中的人,不再需要查询其他的社交圈。查询拓扑结构的构建,减少了计算的查询时间,节省了计算机的时间处理资源。
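As a rough illustration of the query topology structure in Fig. 6, the sketch below stores it as an adjacency dictionary and uses it to pair query nodes with graph nodes. The neighbor lists of Michael, HG, CC and CL follow the worked example later in the description; the `LABELS` mapping, the graph adjacency and the helper `candidate_pairs` are hypothetical names introduced only for this example.

```python
# Query topology from the example: Michael -> HG, CC; HG -> CL; CC -> CL.
QUERY_TOPOLOGY = {"Michael": ["HG", "CC"], "HG": ["CL"], "CC": ["CL"], "CL": []}

# Hypothetical graph data: adjacency of graph nodes and the query node each one matches.
GRAPH = {
    "michael": ["hg1", "hg2", "hgm", "cc1", "cc2", "cc3"],
    "hgm": ["cln", "cln-1"],
    "cc1": ["cln", "cln-1"],
    "cc2": ["cln"],
}
LABELS = {
    "michael": "Michael",
    "hg1": "HG", "hg2": "HG", "hgm": "HG",
    "cc1": "CC", "cc2": "CC", "cc3": "CC",
    "cln": "CL", "cln-1": "CL",
}


def candidate_pairs(query_node, graph_node):
    """Return (query neighbor, graph neighbor) pairs allowed by the topology:
    a graph neighbor is kept only if it matches a neighbor of query_node."""
    return [
        (q, g)
        for q in QUERY_TOPOLOGY[query_node]
        for g in GRAPH.get(graph_node, [])
        if LABELS.get(g) == q
    ]


print(candidate_pairs("Michael", "michael"))
print(candidate_pairs("HG", "hgm"))
```

The point of the structure is that the search only ever follows edges that the query pattern allows, so social circles other than HG, CC and CL are never visited.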
S202:根据所述查询拓扑结构中的查询节点之间的查询拓扑关系和预设的访问所述第一待查图节点的第一访问代价以及所述可用资源条件,过滤所述图数据集合中超过所述第一访问代价的第二待查图节点和包含所述第二待查图节点的关联关系,以得到所述规约子图;其中,所述规约子图所占的资源不超过所述可用资源条件。
具体的,计算机为了以最少的资源获得第一待查图节点,会预设一个访问该第一待查图节点的第一访问代价,即计算机从起始图节点查找到第一待查图节点时的访问代价不得超过该第一访问代价。
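The computer therefore uses the query topology relationships between the query nodes in the generated query topology structure, the first access cost for accessing the first to-be-queried graph nodes, and the preset available resource condition to filter out of the graph data set the second to-be-queried graph nodes whose access cost exceeds the first access cost and the association relationships that contain them, to obtain the reduced subgraph.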
例如,继续参见图3所举的例子,计算机判断从起始图节点经过hg1或hg2或cc3无法查找到既认识michael所在的HG的成员又认识LA自行车俱乐部的成员的自行车爱好者(这里的“既认识michael所在的HG的成员又认识LA自行车俱乐部的成员的自行车爱好者”即就是计算机所要查找的第一待查图节点),则其访问代价可以看作是无穷大,因而也就超过了第一访问代价,故计算机会将hg1、hg2、cc3、cl1、cl2(hg1、hg2、cc3、cl1、cl2即为第二待查图节点)以及包含这些第二待查图节点的关联关系过滤掉,从而生成规约子图,该规约子图可以参见上述图4所示。该规约子图中包括第一待查图节点cln、cln-1以及这些第一待查图节点与起始图节点之间的关联关系(该关联关系既包括hgm、cc1、cc2这三个待查图节点,也包括这三个待查图节点与起始图节点、第一待查图节点构成的边,即第一待查图节点与起始图节点之间的关联关系包括起始图节点到第一待查图节点的路径上的待查图节点和构成该路径的边),并且该规约子图所占的存储资源没有超过上述预设的可用资源条件。
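The description does not give a formula for the access cost, only that a node from which no answer can be reached is treated as having infinite cost. The sketch below is therefore a toy stand-in, not the patented cost model: it assumes the answer nodes are already known (`TARGETS`), measures cost as the hop count to the nearest answer, and keeps only neighbors whose cost stays within `first_cost`. The graph and all names are hypothetical.

```python
import math
from collections import deque

# Hypothetical directed adjacency loosely following Fig. 3.
GRAPH = {
    "michael": ["hg1", "hg2", "hgm", "cc1", "cc2", "cc3"],
    "hg1": ["cl1"], "hg2": ["cl2"],
    "hgm": ["cln", "cln-1"],
    "cc1": ["cln", "cln-1"],
    "cc2": ["cln"],
}
TARGETS = {"cln", "cln-1"}  # assumed answers, used only to illustrate the cost


def access_cost(graph, node, targets):
    """Breadth-first hop count from node to the nearest target,
    or math.inf when no target is reachable."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if current in targets:
            return dist[current]
        for nxt in graph.get(current, ()):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return math.inf


def filter_neighbors(graph, node, targets, first_cost):
    """Drop neighbors whose access cost exceeds the first access cost."""
    return [n for n in graph.get(node, ())
            if access_cost(graph, n, targets) <= first_cost]


print(filter_neighbors(GRAPH, "michael", TARGETS, first_cost=1))
```

Under these assumptions hg1, hg2 and cc3 receive infinite cost and are filtered, matching the outcome described for Fig. 4. In practice the cost would have to be estimated without knowing the answers in advance.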
本发明实施例提供的图数据的搜索方法,采用查询拓扑结构和预设的第一访问代价以及可用资源条件,过滤图数据集合中超过第一访问代价的第二待查图节点和包含第二待查图节点的关联关系,从而得到规约子图,其在过滤过程中不会占用计算机的存储资源,并且所生成的规约子图是动态存储在内存中,其并不需要占用计算机的磁盘存储资源,因此降低了计算机的存储开销;另一方面,本发明实施例基于不同的查询条件实时生成规约子图,并根据实时生成的规约子图得到查询结果,提高了计算机搜索图数据的准确性;进一步地,本发明实施例提供的方法可以在任意资源限制下返回查询精确的查询结果,克服了传统查询技术无法在任何资源限制下均返回查询结果的缺陷。
图7为本发明提供的图数据的搜索方法实施例三的流程示意图。在上述实施例的基础上,本实施例涉及的是计算机根据查询拓扑结构和第一访问代价以及预设的可用资源条件确定规约子图的具体实现过程。进一步地,如图7所示,上述S202具体包括:
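S301: Read a query node and a graph node matching the query node from a storage space, where the storage space stores query nodes of the query topology structure and graph nodes that match those query nodes, the query nodes include a start query node, the graph nodes include the start graph node or the to-be-queried graph nodes, and the start graph node matches the start query node.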
具体的,计算机在根据查询生成查询拓扑结构之后,根据该查询拓扑结构读取存储空间中存储的查询节点和与该查询节点匹配的图节点。这里的查询节点包括起始查询节点,可选的,还可以包括其他的与起始查询节点具有查询拓扑,并且可以协助计算机搜索到第一待查图节点的查询节点。存储空间中还可以包括与起始查询节点匹配的起始图节点,而且还可以包括可能写入规约子图中的待查图节点,也就是说计算机存储空间存储的这些图节点可以为经过计算机过滤筛选后剩余的待查图节点,计算机通过这些待查图节点可以搜索到第一待查图节点。
需要说明的是,当存储空间存储的图节点为待查图节点时,这些待查图节点实际上是通过下述S303的步骤添加至存储空间的,也就是说该S301实际上与下述S302和S303构成了一个循环执行过程,具体过程参见下述介绍。
S302:判断所述规约子图中是否包括所述读取的图节点;若所述规约子图中不包括所述读取的图节点,则将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源不超过所述可用资源条件。
具体的,当计算机判断规约子图中并不包括上述从存储空间读取的图节点,则将上述所读取的图节点添加至规约子图中,并且,在将上述读取的图节点添加至规约子图后,计算机进一步判断该规约子图所占的内存资源是否超过了上述可用资源的条件。
需要说明的是,这里的可用资源条件可以为资源大小的阈值,还可以为资源大小的范围值,还可以为资源大小的限值。另外,规约子图所占的资源指的是所占计算机内存的资源。
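S303: Compute, according to the query topology relationships between the query nodes in the query topology structure, the access costs of the to-be-queried graph nodes adjacent to the read graph node; filter out the second to-be-queried graph nodes whose access cost exceeds the first access cost, together with the association relationships that contain them; and output, according to a preset dynamic reduction parameter, the access sequence to be stored in the storage space. The access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter controls the number of to-be-queried graph nodes in the access sequence.
Specifically, when the computer determines that the reduced subgraph, with the read graph node added, does not occupy more resources than the available resource condition allows, it computes, according to the query topology relationships between the query nodes in the query topology structure determined above, the access costs of the to-be-queried graph nodes adjacent to the read graph node (referred to below as neighbor to-be-queried graph nodes), that is, the access cost from each neighbor to-be-queried graph node to the first to-be-queried graph node.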
之后,计算机过滤掉上述访问代价超过第一访问代价的第二待查图节点,该第二待查图节点即就是从上述邻居待查图节点中筛选的访问代价超过第一访问代价的待查图节点,并且,计算机还过滤掉包含该第二待查图节点的关联关系。
当计算机将第二待查图节点和包含第二待查图节点的关联关系过滤掉之后,计算机可以根据预设的动态规约参数从剩余的访问代价不超过第一访问代价的邻居待查图节点中确定存储至所述存储空间的访问序列,该动态规约参数决定了访问序列中图节点的数目。需要说明的是,访问序列中的图节点均为访问代价不超过第一访问代价的邻居待查图节点。
当计算机确定了可以存储至存储空间的访问序列之后,将该访问序列存储至存储空间,即将该访问序列中的图节点存储至存储空间,从而使得计算机可以再次执行S301,进而确定添加至规约子图中的图节点,从而得到规约子图。
本发明实施例提供的图数据的搜索方法,通过根据查询拓扑结构、预设的第一访问代价和可用资源条件,从图数据集合中确定可以添加至规约子图的图节点,从而得到规约子图,进而通过规约子图搜索到第一待查图节点。本发明实施例提供的方法,由于规约子图缓存在计算机内存中,不需通过IO操作就可以从内存中查询规约子图从而得到结果,其有效提高了计算机搜索图数据的效率;并且,在一次搜索完成后,计算机会释放之前生成的规约子图,待下次搜索时,计算机重新根据新的查询请求搜索图数据,即本发明实施例中计算机是根据查询条件实时生成规约子图,并实时根据规约子图获得精确的搜索结果,故本发明实施例提供的方法提高了 图数据的搜索精度。
图8为本发明提供的图数据的搜索方法实施例四的流程示意图。在上述实施例的基础上,本实施例涉及的是计算机根据查询拓扑结构和第一访问代价以及预设的可用资源确定规约子图的另一具体实现过程。进一步地,如图8所示,上述S202具体包括:
S401:将所述规约子图中的图节点数目设置为零,将存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目设置为零,并将动态规约参数设为第一预设值。
具体的,计算机在生成查询拓扑结构之后,可以将规约子图、用于缓存查询节点和与查询节点匹配的图节点的存储空间、以及动态规约参数进行初始化,即将规约子图中的图节点数目设置为零,将存储空间存储的查询节点的数目和与查询节点匹配的图节点的数目设置为零,并将动态规约参数设为第一预设值。因此,此时的规约子图为空,存储空间为空,上述第一预设值为控制下述访问序列中的待查图节点数目的参数,该参数可以使得计算机以最少的资源搜索到第一待查图节点。
S402:将所述查询拓扑结构中的起始查询节点和所述起始图节点存储至所述存储空间;所述起始图节点与所述起始查询节点相匹配。
需要说明的是,起始查询节点与起始图节点构成点对(起始查询节点,起始图节点),计算机将该点对存储至存储空间。可选的,在本实施例中,该存储空间可以为一堆栈结构的存储空间,也可以为存储模块,只要该存储空间具有“先入后出”的特点即可,即计算机从该存储空间获取数据时是依照“先入后出”原则进行的。
S403:读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点,并将所读取的查询节点和与所述查询节点匹配的图节点从所述存储空间删除。
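Specifically, following S402, the storage space at this point contains only the pair formed by the start query node and the start graph node, so the query node read in S403 is the start query node and the graph node read is the start graph node. When the storage space also contains other query nodes and graph nodes, the query node and graph node read in S403 depend on the actual situation. For how the computer reads graph nodes from the storage space, see the example in the later embodiment.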
S404:判断所述规约子图中是否包括所述读取的图节点;其中,所述读取的图节点包括所述起始图节点或所述待查图节点。若是,则执行S405;若否,则执行S406。
S405:标记规约子图未发生变化,执行S407。
S406:将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源是否超过可用资源条件;若是,则执行S411;若否,则执行S407。
具体的,当计算机将所读取的图节点添加至规约子图中后,为了防止规约子图所占的存储资源超过可用资源条件,因此,计算机需要判断规约子图所占的资源是否超过可用资源条件。
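S407: Compute, according to the query topology relationships between the query nodes in the query topology structure, the access costs of the to-be-queried graph nodes adjacent to the read graph node; filter out the second to-be-queried graph nodes whose access cost exceeds the first access cost, together with the association relationships that contain them; and output, according to the dynamic reduction parameter, the access sequence to be stored in the storage space. The access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter controls the number of to-be-queried graph nodes in the access sequence.
Specifically, when the computer determines that the resources occupied by the reduced subgraph do not exceed the available resource condition, it computes, according to the query topology relationships between the query nodes, the access costs of the to-be-queried graph nodes adjacent to the graph node read in S403 (referred to below as neighbor to-be-queried graph nodes). The computer checks whether the access cost of each neighbor to-be-queried graph node exceeds the first access cost. If it does, that neighbor is a second to-be-queried graph node, and the computer filters it out together with the association relationships that contain it. If it does not, the computer adds the neighbor to the access sequence, because this graph node may be a first to-be-queried graph node or a graph node that helps the computer find one. The computer also uses the dynamic reduction parameter to determine how many node pairs enter the access sequence, where a node pair consists of a query node and a graph node that matches it. Once the access sequence is determined, the computer stores it, that is, the to-be-queried graph nodes in it, in the storage space.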
S408:判断所述存储空间是否为空;若是,则执行S409;若否,则执行S403,直至所述存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目为零为止,并在确定所述存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目为零之后,执行S409。
S409:判断所述规约子图是否发生变化;若是,则执行S410;若否,则执行S411。
S410:将所述查询节点的起始查询节点和所述起始图节点重新存储至所述存储空间,并将所述动态规约参数的值调整为第二预设值,之后,继续返回执行S403。
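Specifically, a change in the reduced subgraph indicates that other neighbor to-be-queried graph nodes whose access cost does not exceed the first access cost have not yet been added to the reduced subgraph, which implicitly means that the computer's search scope is too small. Adjusting the dynamic reduction parameter here may therefore mean increasing its initial value so that more node pairs enter the access sequence, which widens the computer's search scope and yields an accurate result.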
S411:结束计算,确定所述规约子图。
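The loop S401 to S411 can be sketched in Python as follows. This is an illustrative reading rather than the definitive procedure: the storage space is modeled as a list used as a last-in-first-out stack, the resource condition as a node count checked before a node is added, the second preset value as a doubling of the dynamic reduction parameter `b`, and a per-pass `expanded` set is added so that the sketch terminates even on cyclic topologies. The graph, labels and cost table are hypothetical data modeled on the Fig. 3 example.

```python
import math

TOPOLOGY = {"Michael": ["HG", "CC"], "HG": ["CL"], "CC": ["CL"], "CL": []}
GRAPH = {
    "michael": ["hg1", "hg2", "hgm", "cc1", "cc2", "cc3"],
    "hg1": ["cl1"], "hg2": ["cl2"],
    "hgm": ["cln", "cln-1"], "cc1": ["cln", "cln-1"], "cc2": ["cln"],
}
LABELS = {"michael": "Michael", "hg1": "HG", "hg2": "HG", "hgm": "HG",
          "cc1": "CC", "cc2": "CC", "cc3": "CC",
          "cl1": "CL", "cl2": "CL", "cln": "CL", "cln-1": "CL"}
COST = {"hgm": 1, "cc1": 1, "cc2": 1, "cln": 0, "cln-1": 0}  # assumed costs


def cost(node):
    return COST.get(node, math.inf)


def build_reduced_subgraph(start, first_cost, max_nodes, b=2):
    start_q, start_g = start
    gq = set()                                   # S401: empty reduced subgraph
    while True:
        stack = [(start_q, start_g)]             # S402 / S410: (re)store the start pair
        expanded, changed = set(), False
        while stack:                             # S408: loop until storage is empty
            q, g = stack.pop()                   # S403: read and remove a pair
            if (q, g) in expanded:
                continue
            expanded.add((q, g))
            if g not in gq:                      # S404
                if len(gq) >= max_nodes:         # resource condition reached: S411
                    return gq
                gq.add(g)                        # S406
                changed = True
            # S407: neighbors allowed by the topology and within the first access cost
            seq = [(nq, ng) for nq in TOPOLOGY[q] for ng in GRAPH.get(g, ())
                   if LABELS.get(ng) == nq and cost(ng) <= first_cost]
            seq.sort(key=lambda pair: cost(pair[1]))
            stack.extend(reversed(seq[:b]))      # b limits the access sequence length
        if not changed:                          # S409: nothing new in this pass
            return gq
        b *= 2                                   # S410: widen the search and repeat


print(sorted(build_reduced_subgraph(("Michael", "michael"), first_cost=1, max_nodes=10)))
```

On this toy input the first pass (b = 2) misses cc2, the second pass (b = 4) adds it, and the third pass finds nothing new, so the result contains michael, hgm, cc1, cc2, cln and cln-1. Lowering `max_nodes` makes the function stop early and return a smaller subgraph, which is how the sketch reflects answering under a tighter resource limit.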
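Fig. 9 is a schematic flowchart of Embodiment 5 of the graph data search method according to the present invention. Building on the embodiment shown in Fig. 5, this embodiment concerns another specific implementation in which the computer determines the reduced subgraph according to the query topology relationships between the query nodes in the query topology structure, the first access cost, and the preset available resources. Further, as shown in Fig. 9, S202 specifically includes: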
S501:将所述规约子图中的图节点数目设置为零,将存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目设置为零,并将动态规约参数设为第一预设值。
S502:将所述查询拓扑结构中的起始查询节点和所述起始图节点存储至所述存储空间;所述起始图节点与所述起始查询节点相匹配。
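S503: Read a query node and a graph node matching the query node from the storage space, and mark the query node and the matching graph node that have been read in the storage space.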
S504:判断所述规约子图中是否包括所述读取的图节点;其中,所述读取的图节点包括所述起始图节点或所述待查图节点;若是,则执行S505;若否,则执行S506。
S505:标记规约子图未发生变化,执行S507。
S506:将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源是否超过可用资源条件;若是,则执行S511;若否,则执行S507。
S507:根据所述查询拓扑结构中的查询节点之间的查询拓扑关系计算与所述读取的图节点相邻的待查图节点的访问代价,并过滤所述访问代价超过所述第一访问代价的第二待查图节点以及包含所述访问代价超过所述第一访问代价的第二待查图节点的关联关系,并根据所述动态规约参数和所述第一访问代价输出存储至所述存储空间的访问序列;其中,所述访问序列中的待查图节点的访问代价均不超过所述第一访问代价,所述动态规约参数用于控制所述访问序列中的待查图节点的数目。
具体的,上述S501至S507具体过程可以参照上述图8所示的实施例四的内容,其具体执行过程类似,在此不再赘述。
S508:判断所述存储空间中是否存在未打标记的查询节点和与所述未打标记的查询节点匹配的图节点;若是,则执行S503,直至所述存储空间存储的查询节点和与所述查询节点匹配的图节点均打有标记为止,并在确定所述存储空间中存储的查询节点和与所述查询节点匹配的图节点均打有标记之后,执行S509;若否,则执行S509。
具体的,按照上述所述,计算机会从存储空间读取查询节点以及与该查询节点匹配的图节点,并将所读取的查询节点以及与该查询节点匹配的图节点添加至规约子图中,因此计算机会对存储空间中已添加至规约子图中的图节点以及与该图节点匹配的查询节点打上标记。故,计算机需要判断存储空间中是否存在未打标记的查询节点和与所述未打标记的查询节点匹配的图节点,即判断存储空间中是否还有未添加至规约子图的图节点。若存储空间中存在未打标记的查询节点和与该未打标记的查询节点匹配的图节点,则计算机返回执行上述S503,直至存储空间存储的查询节点和与该查询节点匹配的图节点均打有标记为止。
S509:判断所述规约子图是否发生变化;若是,则执行S510;若否,则执行S511。
S510:将所述查询节点的起始查询节点和所述起始图节点重新存储至所述存储空间,并将所述动态规约参数的值调整为第二预设值,之后,继续返回执行S503。
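Specifically, a change in the reduced subgraph indicates that other neighbor to-be-queried graph nodes whose access cost does not exceed the first access cost have not yet been added to the reduced subgraph, which implicitly means that the computer's search scope is too small. Adjusting the dynamic reduction parameter here may therefore mean increasing its initial value so that more node pairs enter the access sequence, which widens the computer's search scope and yields an accurate result.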
S511:结束计算,以得到所述规约子图。
上述图8所示实施例中的S401-S411和图9所示实施例中的S501-S511均是计算机根据查询拓扑结构和第一访问代价以及预设的可用资源条件确定规约子图的具体实现过程,为了便于理解图8和图9所示的流程图,此处继续以上述图3所示出的例子进行更具体的说明。由于图8和图9过程类似,因此,此处仅将图8所示的流程图中的循环操作以具体的例子示出,具体参见下述A-I九大步骤。
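Step A:
(1) Set the number of graph nodes in the reduced subgraph to zero (the reduced subgraph is denoted GQ below), set the number of query nodes and matching graph nodes held in the storage space to zero (the storage space is denoted S below), and set the dynamic reduction parameter, which controls the number of graph nodes in the access sequence, to the first preset value. Then store the start query node and the start graph node in S, that is, store (Michael, michael) in S.
(2) Fetch the pair (Michael, michael) cached in S.
(3) Determine that the start graph node michael has not been added to GQ, and update GQ by adding michael to it.
(4) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node Michael and graph node michael, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node Michael in the query topology structure (Michael's neighbor query nodes are HG and CC), look up the neighbor to-be-queried graph nodes of start graph node michael in the graph data set (hg1, hg2, hgm, cc1, cc2 and cc3) and compute their access costs. The access costs of hgm, cc1 and cc2 do not exceed the preset first access cost, so the computer, taking the dynamic reduction parameter into account, obtains the access sequence (HG, hgm), (CC, cc1), (CC, cc2) and stores these pairs in S in order; the contents of S at this point are shown in Table 1. Every neighbor to-be-queried graph node in these pairs can lead precisely to a first to-be-queried graph node.
Table 1: contents of S after step A (table image PCTCN2015096845-appb-000001 not reproduced)
(5) The computer therefore determines that S is not empty and goes on to fetch the to-be-queried graph nodes in S. At this point GQ contains michael.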
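Step B: After sub-step (5) of step A, the computer continues as follows:
(1) Fetch the pair (HG, hgm) cached in S.
(2) Determine that hgm has not been added to GQ, and update GQ by adding hgm to it.
(3) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node HG and graph node hgm, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node HG in the query topology structure (HG's neighbor query node is CL), look up the neighbor to-be-queried graph nodes of graph node hgm in the graph data (cln and cln-1) and compute their access costs, which do not exceed the preset first access cost. Taking the dynamic reduction parameter into account, the computer obtains the access sequence (CL, cln), (CL, cln-1) and stores these two pairs in S; the contents of S at this point are shown in Table 2.
Table 2: contents of S after step B (table image PCTCN2015096845-appb-000002 not reproduced)
(4) The computer therefore determines that S is not empty and goes on to fetch the to-be-queried graph nodes in S. At this point GQ contains michael and hgm.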
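Step C: After sub-step (4) of step B, the computer continues as follows:
(1) Fetch the pair (CL, cln-1) cached in S.
(2) Determine that cln-1 has not been added to GQ, and update GQ by adding cln-1 to it.
(3) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node CL and graph node cln-1, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node CL in the query topology structure (CL has no neighbor query nodes), look up the neighbor to-be-queried graph nodes of graph node cln-1 in the graph data. Because cln-1 has no neighbor to-be-queried graph nodes, the computer determines that the access sequence is empty; the contents of S at this point are shown in Table 3.
Table 3: contents of S after step C (table image PCTCN2015096845-appb-000003 not reproduced)
(4) The computer therefore determines that S is not empty and goes on to fetch the to-be-queried graph nodes in S. At this point GQ contains michael, hgm and cln-1.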
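Step D: After sub-step (4) of step C, the computer continues as follows:
(1) Fetch the pair (CL, cln) cached in S.
(2) Determine that cln has not been added to GQ, and update GQ by adding cln to it.
(3) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node CL and graph node cln, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node CL in the query topology structure (CL has no neighbor query nodes), look up the neighbor to-be-queried graph nodes of graph node cln in the graph data. Because cln has no neighbor to-be-queried graph nodes, the computer determines that the access sequence is empty; the contents of S at this point are shown in Table 4.
Table 4: contents of S after step D (table image PCTCN2015096845-appb-000004 not reproduced)
(4) The computer therefore determines that S is not empty and goes on to fetch the to-be-queried graph nodes in S. At this point GQ contains michael, hgm, cln-1 and cln.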
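Step E: After sub-step (4) of step D, the computer continues as follows:
(1) Fetch the pair (CC, cc1) cached in S.
(2) Determine that cc1 has not been added to GQ, and update GQ by adding cc1 to it.
(3) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node CC and graph node cc1, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node CC in the query topology structure (CC's neighbor query node is CL), look up the neighbor to-be-queried graph nodes of graph node cc1 in the graph data set (cln and cln-1) and compute their access costs, which do not exceed the preset first access cost. Taking the dynamic reduction parameter into account, the computer obtains the access sequence (CL, cln), (CL, cln-1) and stores these two pairs in S; the contents of S at this point are shown in Table 5.
Table 5: contents of S after step E (table image PCTCN2015096845-appb-000005 not reproduced)
(4) The computer therefore determines that S is not empty and goes on to fetch the to-be-queried graph nodes in S. At this point GQ contains michael, hgm, cln-1, cln and cc1.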
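Step F: After sub-step (4) of step E, the computer continues as follows:
(1) Fetch the pair (CL, cln-1) cached in S.
(2) Determine that cln-1 has already been added to GQ, and mark GQ as unchanged.
(3) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node CL and graph node cln-1, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node CL in the query topology structure, look up the neighbor to-be-queried graph nodes of graph node cln-1 in the graph data set. Because cln-1 has no neighbor to-be-queried graph nodes, the computer determines that the access sequence is empty; the contents of S at this point are shown in Table 6.
Table 6: contents of S after step F (table image PCTCN2015096845-appb-000006 not reproduced)
(4) The computer therefore determines that S is not empty and goes on to fetch the to-be-queried graph nodes in S. At this point GQ contains michael, hgm, cln-1, cln and cc1.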
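Step G: After sub-step (4) of step F, the computer continues as follows:
(1) Fetch the pair (CL, cln) cached in S.
(2) Determine that cln has already been added to GQ, and mark GQ as unchanged.
(3) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node CL and graph node cln, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node CL in the query topology structure (CL has no neighbor query nodes), look up the neighbor graph nodes of graph node cln in the graph data set. Because cln has no neighbor to-be-queried graph nodes, the computer determines that the access sequence is empty; the contents of S at this point are shown in Table 7.
Table 7: contents of S after step G (table image PCTCN2015096845-appb-000007 not reproduced)
(4) The computer therefore determines that S is not empty and goes on to fetch the to-be-queried graph nodes in S. At this point GQ contains michael, hgm, cln-1, cln and cc1.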
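Step H: After sub-step (4) of step G, the computer continues as follows:
(1) Fetch the pair (CC, cc2) cached in S.
(2) Determine that cc2 has not been added to GQ, and update GQ by adding cc2 to it.
(3) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node CC and graph node cc2, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node CC in the query topology structure (CC's neighbor query node is CL), look up the neighbor to-be-queried graph node of graph node cc2 in the graph data set (cln) and compute its access cost, which does not exceed the preset first access cost. Taking the dynamic reduction parameter into account, the computer obtains the access sequence (CL, cln) and stores this pair in S; the contents of S at this point are shown in Table 8.
Table 8: contents of S after step H (table image PCTCN2015096845-appb-000008 not reproduced)
(4) The computer therefore determines that S is not empty and goes on to fetch the to-be-queried graph nodes in S. At this point GQ contains michael, hgm, cln-1, cln, cc1 and cc2.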
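Step I: After sub-step (4) of step H, the computer continues as follows:
(1) Fetch the pair (CL, cln) cached in S.
(2) Determine that cln has already been added to GQ, and mark GQ as unchanged.
(3) Dynamically compute the access costs of the neighbor to-be-queried graph nodes of query node CL and graph node cln, filter out the second to-be-queried graph nodes according to the first access cost, and determine the access sequence from the remaining to-be-queried graph nodes according to the dynamic reduction parameter. That is, according to the topology of query node CL in the query topology structure (CL has no neighbor query nodes), look up the neighbor to-be-queried graph nodes of graph node cln in the graph data. Because cln has no neighbor to-be-queried graph nodes, the computer determines that the access sequence is empty, and S is now empty.
(4) The computer therefore determines that S is empty. At this point GQ contains michael, hgm, cln-1, cln, cc1 and cc2.
(5) Determine whether GQ has changed. Steps H and I show that GQ has not changed, so the computation ends and GQ is determined to contain michael, hgm, cln-1, cln, cc1 and cc2. Here michael is the start graph node of the reduced subgraph GQ, cln-1 and cln are the first to-be-queried graph nodes in GQ, and hgm, cc1 and cc2 are graph nodes on the association relationships between the start graph node and the first to-be-queried graph nodes in GQ.
(6) The computer determines the first to-be-queried graph nodes from GQ.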
本发明实施例提供的图数据的搜索方法,通过根据查询拓扑结构、预设的第一访问代价和可用资源条件,从图数据集合中确定可以添加至规约子图的图节点,从而得到规约子图,进而通过规约子图搜索到第一待查图节点。本发明实施例提供的方法,由于规约子图缓存在计算机内存中,不需通过IO操作就可以从内存中查询规约子图从而得到结果,其有效提高了计算机搜索图数据的效率;并且,在一次搜索完成后,计算机会释放之前生成的规约子图,待下次搜索时,计算机重新根据新的查询请求搜索图数据,即本发明实施例中计算机是根据查询条件实时生成规约子图,并实时根据规约子图获得精确的搜索结果,故本发明实施例提供的方法提高了图数据的搜索精度。
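Fig. 10 is a schematic flowchart of Embodiment 6 of the graph data search method according to the present invention. The method in this embodiment still applies to the distributed computing system shown in Fig. 1, and the computer is again taken as the executing entity. This embodiment concerns the specific process in which the computer determines, through landmark nodes in the graph data, the first to-be-queried graph nodes that match the query condition. As shown in Fig. 10, the method includes: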
S601:获取查询请求;其中,所述查询请求包括携带起始图节点和终止图节点的查询条件,所述查询请求用于请求查询图数据集合中与所述查询条件匹配的第一待查图节点;所述图数据集合包括所述起始图节点、多个待查图节点、所述终止图节点,以及,所述起始图节点、所述终止图节点以及所述多个待查图节点之间的关联关系。
具体的,计算机获取用户的查询请求。可选的,该查询请求可以是用户配置给计算机的,还可以是用户通过其他设备发送给计算机的。该查询请求中可以包括携带起始图节点和终止图节点的查询条件,并且该查询请求用于查询图数据集合中与该查询条件匹配的第一待查图节点。需要说明的是,图数据集合中的关联关系指的是起始图节点、终止图节点以及各个待查图节点构成的边。上述第一待查图节点可以为一个,也可以为多个。
例如,参见图11所示的图数据集合,假设查询请求中的查询条件为“我能否通过朋友认识自行车运动员Eric”,图11中michael为图数据的起始图节点,除michael之外的所有图节点均为待查图节点,Eric为图数据的终止图节点。图11中所示的michael、Eric与其他待查图节点之间的连接线即为图数据中的起始图节点与待查图节点之间的关联关系。
S602:根据图数据集合中的多个图节点的中间度核心性和预设的可用资源条件确定所述图数据集合的路标节点,并根据所述路标节点建立路标节点树;其中,所述路标节点树包括具有层次关系的路标节点。
具体的,计算机根据图数据集合中所有图节点(包括起始图节点、终止图节点和待查图节点)的关联关系确定每个图节点的中间度核心性(betweenness centrality),根据每个图节点的betweenness centrality和预设的可用资源条件确定图数据集合的路标节点。也就是说,该路标节点为位于其他图节点的多条最短路径上的节点。
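After determining the landmark nodes of the graph data set, the computer builds a landmark node tree according to the betweenness centrality of each landmark node and the association relationships in the graph data set; the tree includes multiple landmark nodes that have a hierarchical relationship. In the example of Fig. 11, cl3, cl4, cl5 and cl6 are all landmark nodes, and cl4 has the highest betweenness centrality and serves as the most central landmark node. The computer then builds the landmark node tree from these four landmark nodes according to the association relationships of the graph nodes in the graph data set. Specifically, if landmark node a can reach landmark node b, or b can reach a, an edge (b, a) is constructed and added to the landmark node tree. The landmark node tree is shown in Fig. 12; besides the landmark nodes, it also includes other nodes of the graph data set.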
需要说明的是,上述预设的可用资源条件是用来约束路标节点树所占的资源大小的,即构建的路标节点树所占的资源不能超过该预设的可用资源条件。
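The landmark selection in S602 can be illustrated with a small, self-contained computation of betweenness centrality. The sketch uses Brandes' algorithm for unweighted graphs, which the source does not name; it is simply one standard way to obtain the quantity. The function names, the parameter `k` (a stand-in for the available resource condition) and the five-node path graph are assumptions for illustration, and tree construction is left out.

```python
from collections import deque

# Hypothetical undirected path graph a - b - c - d - e, as adjacency lists.
PATH = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes). Each unordered pair is
    counted twice on an undirected graph, which does not affect the ranking."""
    bc = dict.fromkeys(graph, 0.0)
    for source in graph:
        order = []                                  # nodes in BFS visiting order
        preds = {v: [] for v in graph}              # shortest-path predecessors
        sigma = dict.fromkeys(graph, 0)             # number of shortest paths
        dist = dict.fromkeys(graph, -1)
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while order:                                # accumulate in reverse BFS order
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                bc[w] += delta[w]
    return bc


def pick_landmarks(graph, k):
    """Return the k most central nodes; k bounds the size of the landmark set."""
    bc = betweenness(graph)
    return sorted(bc, key=bc.get, reverse=True)[:k]


print(pick_landmarks(PATH, 3))
```

On the path graph the middle node c lies on the most shortest paths and is selected first, which corresponds to the role cl4 plays in the Fig. 11 example.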
S603:根据所述查询条件搜索所述路标节点树,以得到所述第一待查图节点。
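Continuing the example of Fig. 11, the computer knows the start graph node in the query condition and the end graph node of the graph data set. It can therefore search the association relationships in the landmark node tree according to the query condition, determine whether the end graph node is reachable from the start graph node and, if it is, output the landmark nodes passed on the way from the start graph node to the end graph node Eric. These landmark nodes are the first to-be-queried graph nodes. In the example of Fig. 11 and Fig. 12, the first to-be-queried graph nodes determined by the computer may be cl3, cl4 and cl6.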
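One way to picture a landmark-based reachability check is sketched below. It is not the landmark node tree of Fig. 12: it assumes a simpler index that records, for each landmark, the set of nodes reachable from it, and a budgeted breadth-first search from the start node that stops as soon as it meets a landmark known to reach the end node. The chain graph, the choice of cl4 as the only landmark, and the names `build_index`, `query` and `max_visits` are all hypothetical.

```python
from collections import deque

# Hypothetical directed chain standing in for part of Fig. 11.
GRAPH = {"michael": ["cl3"], "cl3": ["cl4"], "cl4": ["cl6"], "cl6": ["Eric"], "Eric": []}


def reachable_from(graph, source):
    """All nodes reachable from source, including source itself."""
    seen, queue = {source}, deque([source])
    while queue:
        v = queue.popleft()
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def build_index(graph, landmarks):
    """Precompute, per landmark, the set of nodes it can reach."""
    return {lm: reachable_from(graph, lm) for lm in landmarks}


def query(graph, index, start, end, max_visits):
    """Search from start for at most max_visits node visits. Returns
    (True, landmark) when a landmark that reaches end is met, (True, None)
    when end is met directly, and (False, None) when the budget runs out
    or nothing is found."""
    seen, queue, visits = {start}, deque([start]), 0
    while queue and visits < max_visits:
        v = queue.popleft()
        visits += 1
        if v == end:
            return True, None
        if v in index and end in index[v]:
            return True, v
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False, None


index = build_index(GRAPH, ["cl4"])
print(query(GRAPH, index, "michael", "Eric", max_visits=10))
```

The landmark lets the search stop two hops early in this toy case; with a very small `max_visits` the function gives up before reaching the landmark and reports no answer within the budget.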
本发明实施例提供的图数据的搜索方法,通过根据确定图数据集合中图节点的中间度核心性和预设的可用资源条件确定路标节点,并建立路标节点树,然后根据查询条件搜索该路标节点树确定满足该查询条件的第一待查图节点。由于在搜索第一待查图节点的过程中采用了路标节点树,使得搜索第一待查图节点所经过的路径和图节点均为直接有效的路径和图节点,避免了计算机在获取第一待查图节点时的无效搜索,节省了计算机的时间资源,提高了搜索效率。
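Fig. 13 is a schematic flowchart of Embodiment 7 of the graph data search method according to the present invention. Building on the embodiment shown in Fig. 10, this embodiment concerns the specific process in which the computer searches the landmark node tree according to the query condition to obtain the first to-be-queried graph nodes. As shown in Fig. 13, S603 specifically includes: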
S701:根据所述查询条件获取路标节点树中各个路标节点的辅助信息。
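Specifically, after building the landmark node tree, the computer obtains auxiliary information for each landmark node in the tree according to the query condition. Optionally, the auxiliary information may be whether the landmark node can reach the end graph node, the access cost or search time needed for the landmark node to reach the end graph node, the amount of resources the landmark node occupies, or other information that helps the computer obtain the first to-be-queried graph nodes.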
S702:根据所述路标节点的辅助信息确定用于获取所述第一待查图节点的路径策略。
可选的,该路径策略可以用于协助计算机选择一条最优路径获取第一待查图节点,或者向计算机指示路标节点树中哪些路径使得起始图节点不可达终止图节点。
S703:根据所述路径策略搜索所述路标节点树,以得到所述第一待查图节点。
本发明实施例提供的图数据的搜索方法,通过根据确定图数据集合中图节点的中间度核心性和预设的可用资源条件确定路标节点,并建立路标节点树,然后根据查询条件搜索该路标节点树确定满足该查询条件的第一待查图节点。由于在搜索第一待查图节点的过程中采用了路标节点树,使得搜索第一待查图节点所经过的路径和图节点均为直接有效的路径和图节点,避免了计算机在获取第一待查图节点时的无效搜索,节省了计算机的时间资源,提高了搜索效率。
本领域普通技术人员可以理解:实现上述各方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成。前述的程序可以存储于一计算机可读取存储介质中。该程序在执行时,执行包括上述各方法实施例的步骤;而前述的存储介质包括:ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
图14为本发明实施例提供的图数据的搜索装置实施例一的结构示意图。该图数据的搜索装置101可以集成在上述分布式计算系统中的计算节点上。如图14所示,该装置包括:获取模块10和处理模块11。
其中,获取模块10,用于获取查询请求;其中,所述查询请求包括携带起始图节点的查询条件,所述查询请求用于查询图数据集合中与所述查询条件匹配的第一待查图节点;所述图数据集合包括所述起始图节点、多个待查图节点以及所述起始图节点与所述多个待查图节点之间的关联关 系以及所述多个待查图节点中的每个待查图节点与其它待查图节点之间的关联关系;处理模块11,用于根据所述查询条件和预设的可用资源条件过滤所述图数据集合中不满足所述查询条件的第二待查图节点和包含所述第二待查图节点的关联关系,以得到规约子图,并通过所述查询条件查询所述规约子图,以得到所述第一待查图节点;其中,所述规约子图包括所述起始图节点、与所述查询条件匹配的第一待查图节点以及所述起始图节点与所述第一待查图节点之间的关联关系。
本发明提供的图数据的搜索装置,可以参见上述方法实施例,其实现原理和技术效果类似,在此不再赘述。
进一步地,上述处理模块11,具体用于根据所述查询条件生成查询拓扑结构,并根据所述查询拓扑结构中的查询节点之间的查询拓扑关系和预设的访问所述第一待查图节点的第一访问代价以及所述可用资源条件,过滤所述图数据集合中其访问代价超过所述第一访问代价的第二待查图节点和包含所述第二待查图节点的关联关系,以得到所述规约子图;其中,所述查询拓扑结构包括多个查询节点,以及所述多个查询节点中的每个查询节点与其它查询节点之间的查询拓扑关系,所述规约子图所占的资源不超过所述可用资源条件。
更进一步地,上述处理模块11,具体用于读取存储空间中存储的查询节点和与所述查询节点匹配的图节点,判断所述规约子图中是否包括所述读取的图节点,若所述规约子图中不包括所述读取的图节点,则将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源不超过所述可用资源条件,则进一步根据所述查询拓扑结构中的查询节点之间的查询拓扑关系计算与所述读取的图节点相邻的待查图节点的访问代价,并过滤所述访问代价超过所述第一访问代价的第二待查图节点以及包含所述访问代价超过所述第一访问代价的第二待查图节点的关联关系,并根据预设的动态规约参数输出存储至所述存储空间的访问序列;其中,所述存储空间中存储有所述查询拓扑结构中的查询节点和与所述查询节点相匹配的图节点,所述查询节点包括起始查询节点,所述图节点包括所述起始图节点或所述待查图节点,所述起始图节点与所述起始查询节点相匹配;所述访问序列中的待查图节点的访问代价均不超过所述第一访问代 价,所述动态规约参数用于控制所述访问序列中的待查图节点的数目。
本发明提供的图数据的搜索装置,可以参见上述方法实施例,其实现原理和技术效果类似,在此不再赘述。
可选的,在上述实施例的基础上,所述处理模块11,具体用于将所述规约子图中的图节点数目设置为零,将存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目设置为零,并将动态规约参数设为第一预设值后,将所述查询拓扑结构中的起始查询节点和所述起始图节点存储至所述存储空间,并在读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点后,判断所述规约子图中是否包括所述读取的图节点;若所述规约子图中不包括所述读取的图节点,则进一步将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源不超过所述可用资源条件后,进一步根据所述查询拓扑结构中的查询节点之间的查询拓扑关系计算与所述读取的图节点相邻的待查图节点的访问代价,过滤所述访问代价超过所述第一访问代价的第二待查图节点以及包含所述访问代价超过所述第一访问代价的第二待查图节点的关联关系,并根据所述动态规约参数输出存储至所述存储空间的访问序列;并进一步判断所述存储空间是否为空,若所述存储空间非空,则继续读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点,直至所述存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目为零为止;若所述存储空间为空,则进一步判断所述规约子图是否发生变化,并在判断所述规约子图没有发生变化时,结束计算,以得到所述规约子图;其中,所述起始图节点与所述起始查询节点相匹配,所述读取的图节点包括所述起始图节点或所述待查图节点,所述访问序列中的待查图节点的访问代价均不超过所述第一访问代价,所述动态规约参数用于控制所述访问序列中的待查图节点的数目。
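Optionally, building on the foregoing embodiment, the processing module 11 is specifically configured to: set the number of graph nodes in the reduced subgraph to zero, set the number of query nodes and of matching graph nodes held in the storage space to zero, and set the dynamic reduction parameter to a first preset value; then store the start query node of the query topology structure and the start graph node in the storage space; read the query node and the matching graph node stored in the storage space and mark the query node and matching graph node that have been read; and then determine whether the reduced subgraph includes the read graph node. If the reduced subgraph does not include the read graph node, the processing module adds the read graph node to the reduced subgraph and determines that the resources occupied by the reduced subgraph do not exceed the available resource condition; it then computes, according to the query topology relationships between the query nodes in the query topology structure, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filters out the second to-be-queried graph nodes whose access cost exceeds the first access cost together with the association relationships that contain them, and outputs, according to the dynamic reduction parameter, the access sequence to be stored in the storage space. The processing module then determines whether the storage space contains an unmarked query node and a graph node matching it. If so, it continues reading query nodes and matching graph nodes from the storage space until all of them are marked. If all query nodes and matching graph nodes in the storage space are marked, it determines whether the reduced subgraph has changed and, when it has not, ends the computation to obtain the reduced subgraph. Here the start graph node matches the start query node, the read graph node is the start graph node or a to-be-queried graph node, the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter controls the number of to-be-queried graph nodes in the access sequence.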
进一步地,上述处理模块11,还用于若判断所述规约子图发生变化,则将所述查询节点的起始查询节点和所述起始图节点重新存储至所述存储空间,并将所述动态规约参数的值调整为第二预设值,并继续读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点,直至所述存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目为零为止或直至所述存储空间存储的查询节点和与所述查询节点匹配的图节点均打有标记为止。
更进一步地,上述处理模块11,还用于若所述规约子图所占的资源超过所述可用资源条件,则结束计算,以得到所述规约子图。
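For the graph data search apparatus provided by the present invention, refer to the foregoing method embodiments; the implementation principles and technical effects are similar and are not repeated here.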
图15为本发明实施例提供的图数据的搜索装置实施例二的结构示意图。该图数据的搜索装置102可以集成在上述分布式计算系统中的计算节点上。如图15所示,该装置包括:获取模块20和处理模块21。
其中,获取模块20,用于获取查询请求;其中,所述查询请求包括携带起始图节点和终止图节点的查询条件,所述查询请求用于请求查询图数据集合中与所述查询条件匹配的第一待查图节点;所述图数据集合包括所述起始图节点、多个待查图节点、所述终止图节点,以及,所述起始图节点、所述终止图节点以及所述多个待查图节点之间的关联关系;处理模块21,用于根据所述图数据集合中的多个图节点的中间度核心性和预设的可用资源条件确定所述图数据集合的路标节点,并根据所述路标节点建立路标节点树,并根据所述查询条件搜索所述路标节点树,以得到所述第一待查图节点;其中,所述路标节点树包括具有层次关系的路标节点。
本发明提供的图数据的搜索装置,可以参见上述方法实施例,其实现原理和技术效果类似,在此不再赘述。
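Further, the processing module 21 is specifically configured to obtain auxiliary information for each landmark node in the landmark node tree according to the query condition, determine from that auxiliary information a path policy for obtaining the first to-be-queried graph nodes, and then search the landmark node tree according to the path policy to obtain the first to-be-queried graph nodes.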
图16为本发明实施例提供的图数据的搜索设备实施例一的结构示意图。该图数据的搜索设备103可以为上述分布式计算系统中的计算节点,如图16所示,该设备包括处理器30、存储器31以及用户接口32,三者通过总线33连接。当然,除此之外,本发明实施例提供的设备还可以包括用于与其它设备通信的通信接口等。图16所示的设备具体可以为手机、平板电脑、台式计算机、便携式计算机、服务器等电子设备。
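The bus 33 implements the connection and communication among the processor 30, the memory 31 and the user interface 32. The bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may consist of one or more physical lines; when there are multiple physical lines, they may be divided into an address bus, a data bus, a control bus, and so on.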
用户接口32用于接收用户的操作,或向用户展示页面。例如,在本发明实施例中,该图数据的搜索设备可以通过用户接口32获取查询请求,从而使得处理器30根据该查询请求执行相应的操作。
存储器31用于存储计算机程序,可以包括应用程序和操作系统程序。
处理器30,用于从存储器31中读取计算机程序并用于执行以下操作,具体为:
获取查询请求;其中,所述查询请求包括携带起始图节点的查询条件,所述查询请求用于查询图数据集合中与所述查询条件匹配的第一待查图节点;所述图数据集合包括所述起始图节点、多个待查图节点以及所述起始图节点与所述多个待查图节点之间的关联关系以及所述多个待查图节点中的每个待查图节点与其它待查图节点之间的关联关系;
根据所述查询条件和预设的可用资源条件过滤所述图数据集合中不满足所述查询条件的第二待查图节点和包含所述第二待查图节点的关联关系,以得到规约子图;所述规约子图包括所述起始图节点、与所述查询条件匹配的第一待查图节点以及所述起始图节点与所述第一待查图节点之间的关联关系;
通过所述查询条件查询所述规约子图,以得到所述第一待查图节点。
进一步地,上述处理器30,具体用于根据所述查询条件生成查询拓扑结构;所述查询拓扑结构包括多个查询节点,以及所述多个查询节点中的每个查询节点与其它查询节点之间的查询拓扑关系,并根据所述查询拓扑结构中的查询节点之间的查询拓扑关系和预设的访问所述第一待查图节点的第一访问代价以及所述可用资源条件,过滤所述图数据集合中其访问代价超过所述第一访问代价的第二待查图节点和包含所述第二待查图节点的关联关系,以得到所述规约子图;其中,所述规约子图所占的资源不超过所述可用资源条件。
更进一步地,上述处理器30,具体用于读取存储空间中存储的查询节点和与所述查询节点匹配的图节点,判断所述规约子图中是否包括所述读取的图节点,若所述规约子图中不包括所述读取的图节点,则将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源不超过 所述可用资源条件,则进一步根据所述查询拓扑结构中的查询节点之间的查询拓扑关系计算与所述读取的图节点相邻的待查图节点的访问代价,并过滤所述访问代价超过所述第一访问代价的第二待查图节点以及包含所述访问代价超过所述第一访问代价的第二待查图节点的关联关系,并根据预设的动态规约参数输出存储至所述存储空间的访问序列;其中,所述存储空间中存储有所述查询拓扑结构中的查询节点和与所述查询节点相匹配的图节点,所述查询节点包括起始查询节点,所述图节点包括所述起始图节点或所述待查图节点,所述起始图节点与所述起始查询节点相匹配;所述访问序列中的待查图节点的访问代价均不超过所述第一访问代价,所述动态规约参数用于控制所述访问序列中的待查图节点的数目。
可选的,作为本发明实施例的一种具体实现方式,上述处理器30,具体用于将所述规约子图中的图节点数目设置为零,将存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目设置为零,并将动态规约参数设为第一预设值后,将所述查询拓扑结构中的起始查询节点和所述起始图节点存储至所述存储空间,并在读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点后,判断所述规约子图中是否包括所述读取的图节点;若所述规约子图中不包括所述读取的图节点,则进一步将所述读取的图节点添加至所述规约子图中,并确定所述规约子图所占的资源不超过所述可用资源条件后,进一步根据所述查询拓扑结构中的查询节点之间的查询拓扑关系计算与所述读取的图节点相邻的待查图节点的访问代价,过滤所述访问代价超过所述第一访问代价的第二待查图节点以及包含所述访问代价超过所述第一访问代价的第二待查图节点的关联关系,并根据所述动态规约参数输出存储至所述存储空间的访问序列;并进一步判断所述存储空间是否为空,若所述存储空间非空,则继续读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点,直至所述存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目为零为止;若所述存储空间为空,则进一步判断所述规约子图是否发生变化,并在判断所述规约子图没有发生变化时,结束计算,以得到所述规约子图;其中,所述起始图节点与所述起始查询节点相匹配,所述读取的图节点包括所述起始图节点或所述待查图节点,所述访问序列中的待查图节点的访 问代价均不超过所述第一访问代价,所述动态规约参数用于控制所述访问序列中的待查图节点的数目。
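Optionally, as another specific implementation of this embodiment of the present invention, the processor 30 is specifically configured to: set the number of graph nodes in the reduced subgraph to zero, set the number of query nodes and of matching graph nodes held in the storage space to zero, and set the dynamic reduction parameter to a first preset value; then store the start query node of the query topology structure and the start graph node in the storage space; read the query node and the matching graph node stored in the storage space and mark the query node and matching graph node that have been read; and then determine whether the reduced subgraph includes the read graph node. If the reduced subgraph does not include the read graph node, the processor adds the read graph node to the reduced subgraph and determines that the resources occupied by the reduced subgraph do not exceed the available resource condition; it then computes, according to the query topology relationships between the query nodes in the query topology structure, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filters out the second to-be-queried graph nodes whose access cost exceeds the first access cost together with the association relationships that contain them, and outputs, according to the dynamic reduction parameter, the access sequence to be stored in the storage space. The processor then determines whether the storage space contains an unmarked query node and a graph node matching it. If so, it continues reading query nodes and matching graph nodes from the storage space until all of them are marked. If all query nodes and matching graph nodes in the storage space are marked, it determines whether the reduced subgraph has changed and, when it has not, ends the computation to obtain the reduced subgraph. Here the start graph node matches the start query node, the read graph node is the start graph node or a to-be-queried graph node, the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter controls the number of to-be-queried graph nodes in the access sequence.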
进一步地,上述处理器30,还用于若判断所述规约子图发生变化,则将所述查询节点的起始查询节点和所述起始图节点重新存储至所述存储 空间,并将所述动态规约参数的值调整为第二预设值,并继续读取所述存储空间中存储的查询节点和与所述查询节点匹配的图节点,直至所述存储空间存储的查询节点的数目和与所述查询节点匹配的图节点的数目为零为止或直至所述存储空间存储的查询节点和与所述查询节点匹配的图节点均打有标记为止。
更进一步地,上述处理器30,还用于若所述规约子图所占的资源超过所述可用资源条件,则结束计算,以得到所述规约子图。
本实施例涉及的图数据搜索设备可以执行上述方法实施例,其中包含的计算机程序可以按照前述装置实施例介绍的模块形式划分,也可以按照其它模块划分方式,也可以不划分模块。具体实现方法和技术效果参考前述方法实施例,在此不再赘述。
图17为本发明实施例提供的图数据的搜索设备实施例二的结构示意图。该图数据的搜索设备104可以为上述分布式计算系统中的计算节点,如图17所示,该设备包括处理器40、存储器41以及用户接口42,三者通过总线43连接。当然,除此之外,本发明实施例提供的设备还可以包括用于与其它设备通信的通信接口等。图17所示的设备具体可以为手机、平板电脑、台式计算机、便携式计算机、服务器等电子设备。
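The bus 43 implements the connection and communication among the processor 40, the memory 41 and the user interface 42. The bus may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may consist of one or more physical lines; when there are multiple physical lines, they may be divided into an address bus, a data bus, a control bus, and so on.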
用户接口42用于接收用户的操作,或向用户展示页面。例如,在本发明实施例中,该图数据的搜索设备可以通过用户接口42获取查询请求,从而使得处理器40根据该查询请求执行相应的操作。
存储器41用于存储计算机程序,可以包括应用程序和操作系统程序。
处理器40,用于从存储器41中读取计算机程序并用于执行以下操作,具体为:
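Obtain a query request; determine landmark nodes of the graph data set according to the betweenness centrality of multiple graph nodes in the graph data set and a preset available resource condition; build a landmark node tree from the landmark nodes; and search the landmark node tree according to the query condition to obtain the first to-be-queried graph node. The query request includes a query condition carrying a start graph node and an end graph node, and is used to request a query for the first to-be-queried graph node in the graph data set that matches the query condition. The graph data set includes the start graph node, multiple to-be-queried graph nodes, the end graph node, and the association relationships among the start graph node, the end graph node, and the multiple to-be-queried graph nodes. The landmark node tree includes landmark nodes that have a hierarchical relationship.
Further, the processor 40 is specifically configured to obtain, according to the query condition, auxiliary information of each landmark node in the landmark node tree; determine, according to the auxiliary information of the landmark nodes, a path policy for obtaining the first to-be-queried graph node; and then search the landmark node tree according to the path policy to obtain the first to-be-queried graph node.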
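As a rough illustration of landmark selection by betweenness centrality under a resource budget, the following Python sketch computes betweenness with Brandes' algorithm, keeps the top `budget` nodes as landmarks, and answers a start-to-end reachability query through them. Several simplifications are assumptions of this sketch and are not taken from the patent: the landmarks are kept in a flat index rather than a hierarchical landmark node tree, the auxiliary information is reduced to each landmark's ancestor and descendant sets, and a `False` answer only means that no landmark witnesses a path. The function names are hypothetical.

```python
from collections import deque

def betweenness(graph):
    """Brandes' algorithm for an unweighted directed graph (node -> successors)."""
    bc = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack, preds = [], {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:                      # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

def reachable_from(graph, src):
    """All nodes reachable from src, including src itself."""
    seen, queue = {src}, deque([src])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def build_landmark_index(graph, budget):
    """Keep the `budget` most central nodes with their ancestors and descendants."""
    bc = betweenness(graph)
    landmarks = sorted(graph, key=lambda v: (-bc[v], v))[:budget]
    reverse = {v: [] for v in graph}
    for v, outs in graph.items():
        for w in outs:
            reverse[w].append(v)
    return {m: (reachable_from(reverse, m), reachable_from(graph, m))
            for m in landmarks}

def reaches(index, s, t):
    """True if some landmark lies on a path from s to t."""
    return any(s in back and t in fwd for back, fwd in index.values())

# Toy graph (illustrative assumption): c is the most central node.
g = {"a": ["b"], "b": ["c"], "c": ["d", "f"], "d": [], "e": ["c"], "f": []}
index = build_landmark_index(g, budget=1)
```

With a budget of one landmark the index consists of node `c` only, so `reaches(index, "a", "f")` succeeds while a pair such as `("a", "b")`, whose path does not pass through `c`, is not witnessed. A larger budget trades memory for fewer such misses.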
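The graph data search device in this embodiment can execute the foregoing method embodiments. The computer program it contains may be divided into modules in the manner described in the foregoing apparatus embodiments, divided into modules in another manner, or not divided into modules at all. For the specific implementation and technical effects, refer to the foregoing method embodiments; details are not repeated here.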
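Finally, it should be noted that the foregoing embodiments are merely intended to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.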

Claims (18)

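  1. A graph data search method, comprising:
    obtaining a query request, wherein the query request includes a query condition carrying a start graph node, and the query request is used to query a graph data set for a first to-be-queried graph node that matches the query condition; the graph data set includes the start graph node, multiple to-be-queried graph nodes, the association relationships between the start graph node and the multiple to-be-queried graph nodes, and the association relationships between each of the multiple to-be-queried graph nodes and the other to-be-queried graph nodes;
    filtering out, according to the query condition and a preset available resource condition, each second to-be-queried graph node in the graph data set that does not satisfy the query condition and the association relationships that include the second to-be-queried graph node, to obtain a reduced subgraph, wherein the reduced subgraph includes the start graph node, the first to-be-queried graph node that matches the query condition, and the association relationship between the start graph node and the first to-be-queried graph node; and
    querying the reduced subgraph by using the query condition, to obtain the first to-be-queried graph node.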
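  2. The method according to claim 1, wherein the filtering out, according to the query condition and a preset available resource condition, each second to-be-queried graph node in the graph data set that does not satisfy the query condition and the association relationships that include the second to-be-queried graph node, to obtain a reduced subgraph, comprises:
    generating a query topology according to the query condition, wherein the query topology includes multiple query nodes and the query topology relationships between each of the multiple query nodes and the other query nodes; and
    filtering out, according to the query topology relationships between the query nodes in the query topology, a preset first access cost for accessing the first to-be-queried graph node, and the available resource condition, each second to-be-queried graph node in the graph data set whose access cost exceeds the first access cost and the association relationships that include the second to-be-queried graph node, to obtain the reduced subgraph, wherein the resources occupied by the reduced subgraph do not exceed the available resource condition.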
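  3. The method according to claim 2, wherein the filtering out, according to the query topology relationships between the query nodes in the query topology, a preset first access cost for accessing the first to-be-queried graph node, and the available resource condition, each second to-be-queried graph node in the graph data set whose access cost exceeds the first access cost and the association relationships that include the second to-be-queried graph node, to obtain the reduced subgraph, comprises:
    reading a query node stored in a storage space and the graph node matching the query node, wherein the storage space stores query nodes in the query topology and the graph nodes matching those query nodes, the query nodes include a start query node, the graph nodes include the start graph node or a to-be-queried graph node, and the start graph node matches the start query node;
    determining whether the reduced subgraph includes the read graph node;
    if the reduced subgraph does not include the read graph node, adding the read graph node to the reduced subgraph, and determining that the resources occupied by the reduced subgraph do not exceed the available resource condition; and
    calculating, according to the query topology relationships between the query nodes in the query topology, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filtering out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include such a node, and outputting, according to a preset dynamic reduction parameter, an access sequence that is stored into the storage space, wherein the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter is used to control the number of to-be-queried graph nodes in the access sequence.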
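  4. The method according to claim 2, wherein the filtering out, according to the query topology relationships between the query nodes in the query topology, a preset first access cost for accessing the first to-be-queried graph node, and the available resource condition, each second to-be-queried graph node in the graph data set whose access cost exceeds the first access cost and the association relationships that include the second to-be-queried graph node, to obtain the reduced subgraph, comprises:
    Step A: setting the number of graph nodes in the reduced subgraph to zero, setting the number of query nodes stored in a storage space and the number of graph nodes matching the query nodes to zero, and setting a dynamic reduction parameter to a first preset value;
    Step B: storing the start query node in the query topology and the start graph node into the storage space, wherein the start graph node matches the start query node;
    Step C: reading a query node stored in the storage space and the graph node matching the query node;
    Step D: determining whether the reduced subgraph includes the read graph node, wherein the read graph node includes the start graph node or a to-be-queried graph node;
    Step E: if the reduced subgraph does not include the read graph node, adding the read graph node to the reduced subgraph, and determining that the resources occupied by the reduced subgraph do not exceed the available resource condition;
    Step F: calculating, according to the query topology relationships between the query nodes in the query topology, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filtering out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include such a node, and outputting, according to the dynamic reduction parameter, an access sequence that is stored into the storage space, wherein the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter is used to control the number of to-be-queried graph nodes in the access sequence;
    Step G: determining whether the storage space is empty;
    Step H: if the storage space is not empty, returning to step C, until the number of query nodes stored in the storage space and the number of graph nodes matching the query nodes are zero; if the storage space is empty, determining whether the reduced subgraph has changed; and
    Step I: if it is determined that the reduced subgraph has not changed, ending the calculation to obtain the reduced subgraph.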
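  5. The method according to claim 2, wherein the filtering out, according to the query topology relationships between the query nodes in the query topology, a preset first access cost for accessing the first to-be-queried graph node, and the available resource condition, each second to-be-queried graph node in the graph data set whose access cost exceeds the first access cost and the association relationships that include the second to-be-queried graph node, to obtain the reduced subgraph, comprises:
    Step A: setting the number of graph nodes in the reduced subgraph to zero, setting the number of query nodes stored in a storage space and the number of graph nodes matching the query nodes to zero, and setting a dynamic reduction parameter to a first preset value;
    Step B: storing the start query node in the query topology and the start graph node into the storage space, wherein the start graph node matches the start query node;
    Step C: reading a query node stored in the storage space and the graph node matching the query node, and marking the query node and the matching graph node that have been read from the storage space;
    Step D: determining whether the reduced subgraph includes the read graph node, wherein the read graph node includes the start graph node or a to-be-queried graph node;
    Step E: if the reduced subgraph does not include the read graph node, adding the read graph node to the reduced subgraph, and determining that the resources occupied by the reduced subgraph do not exceed the available resource condition;
    Step F: calculating, according to the query topology relationships between the query nodes in the query topology, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filtering out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include such a node, and outputting, according to the dynamic reduction parameter, an access sequence that is stored into the storage space, wherein the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter is used to control the number of to-be-queried graph nodes in the access sequence;
    Step G: determining whether the storage space contains an unmarked query node and a graph node matching the unmarked query node;
    Step H: if the storage space contains an unmarked query node and a graph node matching the unmarked query node, returning to step C, until all query nodes stored in the storage space and the graph nodes matching them are marked; if all query nodes stored in the storage space and the graph nodes matching them are marked, determining whether the reduced subgraph has changed; and
    Step I: if it is determined that the reduced subgraph has not changed, ending the calculation to obtain the reduced subgraph.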
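  6. The method according to claim 4 or 5, wherein after the determining whether the reduced subgraph has changed, the method further comprises:
    if it is determined that the reduced subgraph has changed, storing the start query node among the query nodes and the start graph node into the storage space again, and adjusting the value of the dynamic reduction parameter to a second preset value; and
    performing step C.
  7. The method according to any one of claims 4 to 6, wherein the method further comprises:
    if the resources occupied by the reduced subgraph exceed the available resource condition, ending the calculation to obtain the reduced subgraph.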
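  8. A graph data search method, comprising:
    obtaining a query request, wherein the query request includes a query condition carrying a start graph node and an end graph node, and the query request is used to request a query for a first to-be-queried graph node in a graph data set that matches the query condition; the graph data set includes the start graph node, multiple to-be-queried graph nodes, the end graph node, and the association relationships among the start graph node, the end graph node, and the multiple to-be-queried graph nodes;
    determining landmark nodes of the graph data set according to the betweenness centrality of multiple graph nodes in the graph data set and a preset available resource condition, and building a landmark node tree from the landmark nodes, wherein the landmark node tree includes landmark nodes that have a hierarchical relationship; and
    searching the landmark node tree according to the query condition, to obtain the first to-be-queried graph node.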
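  9. The method according to claim 8, wherein the searching the landmark node tree according to the query condition, to obtain the first to-be-queried graph node, comprises:
    obtaining, according to the query condition, auxiliary information of each landmark node in the landmark node tree;
    determining, according to the auxiliary information of the landmark nodes, a path policy for obtaining the first to-be-queried graph node; and
    searching the landmark node tree according to the path policy, to obtain the first to-be-queried graph node.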
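  10. A graph data search apparatus, comprising:
    an obtaining module, configured to obtain a query request, wherein the query request includes a query condition carrying a start graph node, and the query request is used to query a graph data set for a first to-be-queried graph node that matches the query condition; the graph data set includes the start graph node, multiple to-be-queried graph nodes, the association relationships between the start graph node and the multiple to-be-queried graph nodes, and the association relationships between each of the multiple to-be-queried graph nodes and the other to-be-queried graph nodes; and
    a processing module, configured to filter out, according to the query condition and a preset available resource condition, each second to-be-queried graph node in the graph data set that does not satisfy the query condition and the association relationships that include the second to-be-queried graph node, to obtain a reduced subgraph, and to query the reduced subgraph by using the query condition, to obtain the first to-be-queried graph node, wherein the reduced subgraph includes the start graph node, the first to-be-queried graph node that matches the query condition, and the association relationship between the start graph node and the first to-be-queried graph node.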
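  11. The apparatus according to claim 10, wherein the processing module is specifically configured to generate a query topology according to the query condition, and to filter out, according to the query topology relationships between the query nodes in the query topology, a preset first access cost for accessing the first to-be-queried graph node, and the available resource condition, each second to-be-queried graph node in the graph data set whose access cost exceeds the first access cost and the association relationships that include the second to-be-queried graph node, to obtain the reduced subgraph, wherein the query topology includes multiple query nodes and the query topology relationships between each of the multiple query nodes and the other query nodes, and the resources occupied by the reduced subgraph do not exceed the available resource condition.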
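  12. The apparatus according to claim 11, wherein the processing module is specifically configured to: read a query node stored in a storage space and the graph node matching the query node; determine whether the reduced subgraph includes the read graph node; if the reduced subgraph does not include the read graph node, add the read graph node to the reduced subgraph and, upon determining that the resources occupied by the reduced subgraph do not exceed the available resource condition, calculate, according to the query topology relationships between the query nodes in the query topology, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filter out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include such a node, and output, according to a preset dynamic reduction parameter, an access sequence that is stored into the storage space; wherein the storage space stores query nodes in the query topology and the graph nodes matching those query nodes, the query nodes include a start query node, the graph nodes include the start graph node or a to-be-queried graph node, and the start graph node matches the start query node; the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost; and the dynamic reduction parameter is used to control the number of to-be-queried graph nodes in the access sequence.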
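  13. The apparatus according to claim 11, wherein the processing module is specifically configured to: set the number of graph nodes in the reduced subgraph to zero, set the number of query nodes stored in a storage space and the number of graph nodes matching the query nodes to zero, and set a dynamic reduction parameter to a first preset value; then store the start query node in the query topology and the start graph node into the storage space; after reading a query node stored in the storage space and the graph node matching that query node, determine whether the reduced subgraph includes the read graph node; if the reduced subgraph does not include the read graph node, add the read graph node to the reduced subgraph and, after determining that the resources occupied by the reduced subgraph do not exceed the available resource condition, calculate, according to the query topology relationships between the query nodes in the query topology, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filter out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include such a node, and output, according to the dynamic reduction parameter, an access sequence that is stored into the storage space; further determine whether the storage space is empty; if the storage space is not empty, continue to read the query nodes stored in the storage space and the graph nodes matching the query nodes, until the number of query nodes stored in the storage space and the number of graph nodes matching the query nodes are zero; if the storage space is empty, further determine whether the reduced subgraph has changed, and when it is determined that the reduced subgraph has not changed, end the calculation to obtain the reduced subgraph; wherein the start graph node matches the start query node, the read graph node includes the start graph node or a to-be-queried graph node, the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter is used to control the number of to-be-queried graph nodes in the access sequence.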
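  14. The apparatus according to claim 11, wherein the processing module is specifically configured to: set the number of graph nodes in the reduced subgraph to zero, set the number of query nodes stored in a storage space and the number of graph nodes matching the query nodes to zero, and set a dynamic reduction parameter to a first preset value; then store the start query node in the query topology and the start graph node into the storage space; after reading a query node stored in the storage space and the graph node matching that query node, and marking the query node and the matching graph node that have been read from the storage space, further determine whether the reduced subgraph includes the read graph node; if the reduced subgraph does not include the read graph node, add the read graph node to the reduced subgraph and determine that the resources occupied by the reduced subgraph do not exceed the available resource condition; after calculating, according to the query topology relationships between the query nodes in the query topology, the access costs of the to-be-queried graph nodes adjacent to the read graph node, filter out each second to-be-queried graph node whose access cost exceeds the first access cost together with the association relationships that include such a node, and output, according to the dynamic reduction parameter, an access sequence that is stored into the storage space; further determine whether the storage space contains an unmarked query node and a graph node matching the unmarked query node; if it does, continue to read the query nodes stored in the storage space and the graph nodes matching the query nodes, until all query nodes stored in the storage space and the graph nodes matching them are marked; if all query nodes stored in the storage space and the graph nodes matching them are marked, further determine whether the reduced subgraph has changed, and when it is determined that the reduced subgraph has not changed, end the calculation to obtain the reduced subgraph; wherein the start graph node matches the start query node, the read graph node includes the start graph node or a to-be-queried graph node, the access cost of every to-be-queried graph node in the access sequence does not exceed the first access cost, and the dynamic reduction parameter is used to control the number of to-be-queried graph nodes in the access sequence.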
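  15. The apparatus according to claim 13 or 14, wherein the processing module is further configured to: if it is determined that the reduced subgraph has changed, store the start query node among the query nodes and the start graph node into the storage space again, adjust the value of the dynamic reduction parameter to a second preset value, and continue to read the query nodes stored in the storage space and the graph nodes matching the query nodes, until the number of query nodes stored in the storage space and the number of graph nodes matching the query nodes are zero, or until all query nodes stored in the storage space and the graph nodes matching them are marked.
  16. The apparatus according to any one of claims 13 to 15, wherein the processing module is further configured to end the calculation, to obtain the reduced subgraph, if the resources occupied by the reduced subgraph exceed the available resource condition.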
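  17. A graph data search apparatus, comprising:
    an obtaining module, configured to obtain a query request, wherein the query request includes a query condition carrying a start graph node and an end graph node, and the query request is used to request a query for a first to-be-queried graph node in a graph data set that matches the query condition; the graph data set includes the start graph node, multiple to-be-queried graph nodes, the end graph node, and the association relationships among the start graph node, the end graph node, and the multiple to-be-queried graph nodes; and
    a processing module, configured to determine landmark nodes of the graph data set according to the betweenness centrality of multiple graph nodes in the graph data set and a preset available resource condition, build a landmark node tree from the landmark nodes, and search the landmark node tree according to the query condition, to obtain the first to-be-queried graph node, wherein the landmark node tree includes landmark nodes that have a hierarchical relationship.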
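  18. The apparatus according to claim 17, wherein the processing module is specifically configured to obtain, according to the query condition, auxiliary information of each landmark node in the landmark node tree; determine, according to the auxiliary information of the landmark nodes, a path policy for obtaining the first to-be-queried graph node; and then search the landmark node tree according to the path policy, to obtain the first to-be-queried graph node.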
PCT/CN2015/096845 2014-12-09 2015-12-09 Graph data search method and apparatus WO2016091174A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP15866451.6A EP3223169A4 (en) 2014-12-09 2015-12-09 Search method and apparatus for graph data
US15/618,587 US9798774B1 (en) 2014-12-09 2017-06-09 Graph data search method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410751268.9A CN104504003B (zh) 2014-12-09 2014-12-09 Graph data search method and apparatus
CN201410751268.9 2014-12-09

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/618,587 Continuation US9798774B1 (en) 2014-12-09 2017-06-09 Graph data search method and apparatus

Publications (1)

Publication Number Publication Date
WO2016091174A1 true WO2016091174A1 (zh) 2016-06-16

Family

ID=52945401

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/096845 WO2016091174A1 (zh) 2014-12-09 2015-12-09 Graph data search method and apparatus

Country Status (4)

Country Link
US (1) US9798774B1 (zh)
EP (1) EP3223169A4 (zh)
CN (1) CN104504003B (zh)
WO (1) WO2016091174A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299337A (zh) * 2018-10-19 2019-02-01 南威软件股份有限公司 一种基于迭代的图搜索方法
CN110168523A (zh) * 2016-10-28 2019-08-23 微软技术许可有限责任公司 改变监测跨图查询
US11748506B2 (en) 2017-02-27 2023-09-05 Microsoft Technology Licensing, Llc Access controlled graph query spanning

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504003B (zh) 2014-12-09 2018-03-13 北京航空航天大学 图数据的搜索方法和装置
CN106354729B (zh) * 2015-07-16 2020-01-07 阿里巴巴集团控股有限公司 一种图数据处理方法、装置和系统
CN107622057A (zh) * 2016-07-13 2018-01-23 阿里巴巴集团控股有限公司 一种查找任务的方法和装置
CN108614821B (zh) * 2016-12-09 2020-10-13 中国地质调查局发展研究中心 地质资料互联互查系统
CN106991195B (zh) * 2017-04-28 2020-08-11 南京大学 一种分布式的子图枚举方法
CN108880835B (zh) * 2017-05-09 2021-08-27 创新先进技术有限公司 数据分析方法及装置、计算机存储介质
CN108153883B (zh) * 2017-12-26 2022-02-18 北京百度网讯科技有限公司 搜索方法和装置、计算机设备、程序产品以及存储介质
WO2019232956A1 (en) * 2018-06-08 2019-12-12 Zhejiang Tmall Technology Co., Ltd. Parallelization of graph computations
CN108959584B (zh) * 2018-07-09 2023-02-10 清华大学 一种基于社区结构的处理图数据的方法及装置
CN110889000B (zh) * 2018-09-10 2022-08-16 百度在线网络技术(北京)有限公司 用于输出信息的方法和装置
CN111104560B (zh) * 2018-10-10 2022-06-07 福建天泉教育科技有限公司 资源检索方法及计算机可读存储介质
CN110377667B (zh) * 2019-06-17 2023-05-02 深圳壹账通智能科技有限公司 关联图谱展示方法、装置、计算机设备和存储介质
CN110413848B (zh) * 2019-07-19 2022-04-15 上海赜睿信息科技有限公司 一种数据检索方法、电子设备和计算机可读存储介质
JP7239433B2 (ja) * 2019-10-02 2023-03-14 ヤフー株式会社 情報処理装置、情報処理方法、及び情報処理プログラム
CN111177486B (zh) * 2019-12-19 2020-09-08 四川蜀天梦图数据科技有限公司 一种分布式图计算过程中的消息传递方法和装置
CN112214616B (zh) * 2020-10-20 2024-02-23 北京明略软件系统有限公司 知识图谱流畅展示方法、装置
EP4235460A1 (en) * 2022-02-23 2023-08-30 Celonis SE Method for filtering a graph
US11934276B2 (en) 2022-03-19 2024-03-19 Dell Products L.P. Enabling incremental backup operations targeting bare-metal recovery and system-state recovery data and metadata
US11809277B1 (en) * 2022-04-22 2023-11-07 Dell Products L.P. Topological view and insights of organization information technology environment based on bare-metal recovery and system-state recovery data and metadata

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521332A (zh) * 2011-12-06 2012-06-27 北京航空航天大学 基于强模拟的图模式匹配方法、装置及系统
US8316060B1 (en) * 2005-01-26 2012-11-20 21st Century Technologies Segment matching search system and method
CN102819536A (zh) * 2011-09-27 2012-12-12 金蝶软件(中国)有限公司 树型数据处理方法及装置
CN103377236A (zh) * 2012-04-26 2013-10-30 中兴通讯股份有限公司 一种用于分布式数据库的连接查询方法及系统
CN104504003A (zh) * 2014-12-09 2015-04-08 北京航空航天大学 图数据的搜索方法和装置

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8370363B2 (en) * 2011-04-21 2013-02-05 Microsoft Corporation Hybrid neighborhood graph search for scalable visual indexing
US9355478B2 (en) * 2011-07-15 2016-05-31 Hewlett Packard Enterprise Development Lp Reflecting changes to graph-structured data
CN102332009B (zh) * 2011-09-02 2013-09-04 北京大学 一种大规模数据集上的关系查询方法
CN102662974B (zh) * 2012-03-12 2014-02-26 浙江大学 一种基于邻接节点树的网络图索引方法
EP2731023B1 (en) * 2012-11-12 2015-03-25 Software AG Method and system for processing graph queries
US9116738B2 (en) * 2012-11-13 2015-08-25 International Business Machines Corporation Method and apparatus for efficient execution of concurrent processes on a multithreaded message passing system
US9053210B2 (en) * 2012-12-14 2015-06-09 Microsoft Technology Licensing, Llc Graph query processing using plurality of engines
US9268950B2 (en) * 2013-12-30 2016-02-23 International Business Machines Corporation Concealing sensitive patterns from linked data graphs
US10019536B2 (en) * 2014-07-15 2018-07-10 Oracle International Corporation Snapshot-consistent, in-memory graph instances in a multi-user database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3223169A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110168523A (zh) * 2016-10-28 2019-08-23 微软技术许可有限责任公司 改变监测跨图查询
CN110168523B (zh) * 2016-10-28 2023-07-21 微软技术许可有限责任公司 改变监测跨图查询
US11748506B2 (en) 2017-02-27 2023-09-05 Microsoft Technology Licensing, Llc Access controlled graph query spanning
CN109299337A (zh) * 2018-10-19 2019-02-01 南威软件股份有限公司 一种基于迭代的图搜索方法

Also Published As

Publication number Publication date
CN104504003B (zh) 2018-03-13
US9798774B1 (en) 2017-10-24
US20170286484A1 (en) 2017-10-05
EP3223169A1 (en) 2017-09-27
EP3223169A4 (en) 2018-03-07
CN104504003A (zh) 2015-04-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15866451

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2015866451

Country of ref document: EP