WO2021083239A1 - Method, apparatus, device, and storage medium for querying graph data (一种进行图数据查询的方法、装置、设备及存储介质)



Publication number
WO2021083239A1
WO2021083239A1 · PCT/CN2020/124541 · CN2020124541W
Authority
WO
WIPO (PCT)
Prior art keywords
node
variable
variable node
nodes
target
Prior art date
Application number
PCT/CN2020/124541
Other languages
English (en)
French (fr)
Inventor
邹磊 (Zou Lei)
林殷年 (Lin Yinnian)
苏勋斌 (Su Xunbin)
Original Assignee
北京大学 (Peking University)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University (北京大学)
Publication of WO2021083239A1 publication Critical patent/WO2021083239A1/zh

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/901 - Indexing; Data structures therefor; Storage structures
    • G06F16/9024 - Graphs; Linked lists
    • G06F16/903 - Querying
    • G06F16/90335 - Query processing
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present invention belongs to the technical field of information search and query, relates to large-scale data search acceleration technology, and in particular to a method, apparatus, device, and storage medium for querying graph data.
  • A graph database is a database that stores data through a graph structure of nodes and edges: the nodes represent the stored data items, and the edges represent the relationships that exist between them. For example, two nodes joined by an edge may store the data of two different social accounts (account ID, gender, hobbies, etc.), and the edge between them may express that the two accounts follow each other.
  • A graph database can provide a data query function, that is, all subgraphs with a specified structure can be retrieved from the graph database.
  • For example, the graph to be queried may have a triangular structure composed of three nodes and the corresponding edges.
  • In the related art, the query function mainly works by traversing the adjacency list of each node in turn, according to the structure of the graph to be queried (the edges existing between its nodes), so as to find matching subgraphs in the graph database. A node's adjacency list records the other nodes connected to that node by edges, and may also record the edges between those other nodes.
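As an illustration of the adjacency-list idea and the social-account example above, a minimal Python sketch; the graph, node names, and the mutual-follow check are all hypothetical, not taken from the patent:

```python
# Hypothetical toy graph: each node ID maps to its adjacency list
# (the nodes reachable from it by an edge).
graph = {
    "a": ["b", "c"],  # account a follows b and c
    "b": ["a"],       # b follows a back, so a and b follow each other
    "c": [],
}

def mutual_followers(g):
    """Return sorted pairs (u, v) where each appears in the other's adjacency list."""
    pairs = set()
    for u, nbrs in g.items():
        for v in nbrs:
            if u in g.get(v, []):
                pairs.add(tuple(sorted((u, v))))
    return sorted(pairs)
```

Finding every such pair by scanning each node's adjacency list is exactly the per-node traversal the patent seeks to accelerate.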
  • To improve on this, the present invention provides a method, apparatus, device, and storage medium for querying graph data, which can improve the efficiency of graph data queries. The technical solution is as follows:
  • a method for querying graph data includes:
  • obtaining a query instruction, where the query instruction carries graph information to be queried, and the graph information to be queried includes the type of at least one edge between a plurality of variable nodes;
  • based on the type of the at least one edge and pre-stored node connection relationship information corresponding to each type in the target graph, determining at least one node group in the target graph that satisfies the type of the at least one edge, where the node connection relationship information corresponding to a type is used to indicate the nodes connected by edges of that type;
  • Determining the at least one node group in the target graph that satisfies the at least one edge type, based on the type of the at least one edge and the pre-stored node connection relationship information corresponding to each type in the target graph, includes:
  • for each variable node in the query instruction, based on the types of the edges connected to the variable node in the query instruction, determining at least one candidate node in the target graph that satisfies those edge types, forming a candidate node set corresponding to the variable node;
  • At least one node group in the target graph that meets the type of the at least one edge is determined.
  • Determining, based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph, at least one node group in the target graph that satisfies the type of the at least one edge includes:
  • determining an ordering of the multiple variable nodes, where each variable node except the first one in the ordering has an edge to at least one variable node that precedes it;
  • selecting candidate nodes one by one from the first candidate node set corresponding to the first variable node in the ordering; each time a candidate node is selected, setting the selected candidate node as the reference node and the first variable node as the reference variable node;
  • setting the target candidate node as the reference node and the next variable node as the reference variable node, and determining whether the next variable node is the last variable node in the ordering; if it is not the last, proceeding to the variable node that follows the newly set reference variable node in the ordering; if it is the last, determining the currently selected candidate nodes as a node group satisfying the type of the at least one edge in the target graph, and transmitting the node group to the processor.
  • Determining the ordering of the multiple variable nodes based on the candidate node set corresponding to each variable node, the graph information to be queried, and a preset ordering rule includes:
  • in the unselected variable node set, determining the first variable node, whose corresponding candidate node set has the smallest number of nodes, and moving the first variable node to the selected variable node set;
  • from the unselected variable node set, repeatedly selecting the second variable node that has an edge to a node in the selected variable node set and whose corresponding candidate node set has the smallest number of nodes, and moving the second variable node to the selected variable node set, until the unselected variable node set is empty;
  • the order in which the variable nodes are moved to the selected variable node set is determined as the order of the plurality of variable nodes.
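The greedy ordering rule just described can be sketched as follows. This is a minimal Python illustration only; the function name, the input encoding (candidate sets as a dict, query edges as frozensets), and the tie-break by variable name are assumptions of the sketch, not specified by the patent:

```python
def order_variables(candidates, edges):
    """Greedy variable ordering.

    candidates: dict mapping variable node -> set of candidate data nodes
    edges: set of frozenset({u, v}) query-graph edges between variable nodes

    Rule: start with the variable whose candidate set is smallest; then
    repeatedly pick, among unselected variables that have an edge to some
    already-selected variable, the one with the smallest candidate set.
    """
    unselected = set(candidates)
    # First variable: globally smallest candidate set (name breaks ties).
    first = min(unselected, key=lambda v: (len(candidates[v]), v))
    order = [first]
    unselected.remove(first)
    while unselected:
        adjacent = [v for v in unselected
                    if any(frozenset({v, u}) in edges for u in order)]
        nxt = min(adjacent, key=lambda v: (len(candidates[v]), v))
        order.append(nxt)
        unselected.remove(nxt)
    return order
```

Starting from the rarest variable keeps the early candidate enumeration small, which is the point of the preset ordering rule.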
  • The method is applied to natural language intelligent question answering queries: the data corresponding to the nodes in the target graph are the persons, events, and things in the natural language question answering data, and the types of the edges are the relationships between those persons, events, and things.
  • In a second aspect, a device for querying graph data is provided, and the device includes:
  • An acquiring module configured to acquire a query instruction, wherein the query instruction carries graph information to be queried, and the graph information to be queried includes the type of at least one edge between a plurality of variable nodes;
  • a determining module configured to determine at least one node group in the target graph that satisfies the at least one edge type based on the type of the at least one edge and pre-stored node connection relationship information corresponding to each type in the target graph, Wherein, the node connection relationship information corresponding to the type is used to indicate nodes connected by edges of the type;
  • a feedback module configured to feed back a response to the query instruction based on the at least one node group.
  • for each variable node in the query instruction, based on the types of the edges connected to the variable node in the query instruction, determining at least one candidate node in the target graph that satisfies those edge types, forming a candidate node set corresponding to the variable node;
  • At least one node group in the target graph that meets the type of the at least one edge is determined.
  • determining an ordering of the multiple variable nodes, where each variable node except the first one in the ordering has an edge to at least one variable node that precedes it;
  • selecting candidate nodes one by one from the first candidate node set corresponding to the first variable node in the ordering; each time a candidate node is selected, setting the selected candidate node as the reference node and the first variable node as the reference variable node;
  • setting the target candidate node as the reference node and the next variable node as the reference variable node, and determining whether the next variable node is the last variable node in the ordering; if it is not the last, proceeding to the variable node that follows the newly set reference variable node in the ordering; if it is the last, determining the currently selected candidate nodes as a node group satisfying the type of the at least one edge in the target graph, and transmitting the node group to the processor.
  • In a third aspect, a computer device is provided, which includes a processor, an FPGA, and a memory.
  • The memory stores at least one instruction, which is loaded and executed by the processor and/or the FPGA to implement the operations performed by the above method for querying graph data.
  • In a fourth aspect, a computer-readable storage medium is provided, in which at least one instruction is stored; the instruction is loaded and executed by a processor and/or an FPGA to implement the operations performed by the above method for querying graph data.
  • With the above scheme, the node connection relationship information corresponding to each type of edge is obtained and stored in advance. The nodes corresponding to the multiple variable nodes can then be filtered according to this pre-stored information and the variable nodes and edge types included in the graph information carried by the query instruction, yielding at least one node group satisfying the at least one edge type without traversing every node, which improves the efficiency of graph data queries.
  • FIG. 1 is a schematic diagram of a CSR structure (format) used in an embodiment of the present invention
  • FIG. 2 is a schematic diagram of the implementation of the BID&BS compression method adopted in the embodiment of the present invention.
  • FIG. 3 is a structural block diagram of the acceleration device/accelerator provided by the present invention.
  • FIG. 4 is a block diagram of the Kernel structure of the 0th layer of the acceleration device provided by the present invention.
  • FIG. 5 is a flowchart of the intersection calculation in the query process according to an embodiment of the present invention.
  • The invention provides a method for querying graph data, which can be implemented in an FPGA-CPU heterogeneous environment and serves as a large-scale data query acceleration method for a graph database.
  • Accelerated graph database queries can support many application scenarios based on graph data, in particular scenarios that need to quickly find a fixed pattern in graph data. For example, the shareholding relationships between companies can be expressed as graph data.
  • Fraud detection is an application scenario that is very suitable for graph databases. In modern fraud and various types of financial crimes, such as bank fraud, credit card fraud, e-commerce fraud, insurance fraud, etc., fraudsters usually use methods such as changing their identity to achieve the purpose of evading risk control rules.
  • the graph data can establish a user tracking perspective that tracks the overall situation.
  • When the data scale is large, the time cost of such queries is unacceptable; with the acceleration described herein, the algorithm running speed can be increased by more than 2 times.
  • graph database support is needed to quickly obtain the information needed to parse and answer natural language questions from graph data.
  • machine learning tools such as sentence vector coding, sentence analysis, word meaning filtering, emotion recognition, text clustering and classification are used to convert natural language into SPARQL that can be recognized by a graph database.
  • The graph database is used to retrieve a knowledge base stored as triples in the background, covering common sense, ancient poems, life events, music, and other fields, with a scale of nearly 100 million edges. There are about 200,000 such visits every day, with about 5 concurrent connections during peak periods and an average of 10 requests per second. The delay of this step accounts for about 30% of the overall delay.
  • With the acceleration, this latency share can be reduced to about 15% and the peak throughput doubled, thereby helping the company achieve a response within seconds.
  • Here, large-scale data refers to a database whose join processing involves intersection operations between candidate node lists and adjacency lists.
  • the graph database can be used in applications such as natural language question and answer query acceleration.
  • the specific implementation takes the representation of natural language knowledge base as graph data as an example, including the following operations:
  • Nodes usually represent all subjects in the knowledge base that can have interrelationships, such as people, things, and places, while edges represent the interrelationships between subjects, such as spouse, birthplace, location, etc.
  • the attributes of a node usually represent the inherent characteristics of the entity corresponding to the node, for example, the person's age, gender, birthday, name of the place, and so on.
  • the attributes of the edges usually represent the characteristics of the relationship. For example, in a spouse relationship, there may be attributes such as start time and end time.
  • a certain graph data format is adopted to convert the data into graph data.
  • the RDF format has detailed format specifications for the definition of nodes and edges, as well as the definition of their respective attributes, and data conversion can be carried out according to the format specifications.
  • the method for converting natural language into SPARQL query in the graph database may include the following steps:
  • the elements included in the natural language may have corresponding nodes in the graph database, but the name of the node and the natural language may not be exactly the same in literal content.
  • the label of the corresponding node may be "Li Bai (Poet of Tang Dynasty)".
  • the two need to be linked.
  • the commonly used methods basically rely on the information in the graph database, and use the method of pattern matching or deep learning to perform entity recognition.
  • dependency relationship refers to the semantic relationship between natural language entities.
  • a dependency relationship corresponds to two nodes and an edge in the graph data.
  • the method of dependency tree generation is commonly used to determine the dependency.
  • machine learning methods can be used to generate queries that can be identified by the graph database.
  • Graph database query is a very basic graph data operation. Whether it is to provide users with query itself or query-based application interface, there are certain requirements for graph database query. When the scale of graph data is huge, it takes a lot of time and computing resources to perform the Join operation on the graph.
  • the Join operation on the graph is similar to the Join operation of a table in a relational database. They both look for matching items in the two sets based on certain conditions. The difference is that in relational databases, equivalence conditions are usually used to determine whether the elements match, while the Join operation in the graph database needs to determine whether the elements match by judging whether there is a relationship between the elements. Compared with the Join operation in a relational database, the Join operation in a graph database involves more storage, reading and calculation operations, and is therefore more complicated.
  • The purpose of the Join operation on the graph is to compute subgraph isomorphism.
  • user queries can be expressed as query graphs.
  • the execution of the query is equivalent to finding a subgraph that has an isomorphic relationship with the query graph in the entire data graph.
  • The graph isomorphism problem is defined as follows: two simple graphs G and H are isomorphic if and only if there is a one-to-one correspondence σ mapping the nodes 1…n of G to the nodes 1…n of H, such that any two nodes i and j in G are connected if and only if the corresponding nodes σ(i) and σ(j) in H are connected. If G and H are directed graphs, the definition of isomorphism further requires that for any two connected nodes i and j in G, the edge (i, j) and its corresponding edge (σ(i), σ(j)) in H have the same direction.
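The undirected case of this definition can be checked directly by brute force over all bijections. The following Python sketch is for illustration only (it is exponential in n, and the encoding of edges as frozensets is an assumption of the sketch, not of the patent):

```python
from itertools import permutations

def are_isomorphic(g_edges, h_edges, n):
    """Decide isomorphism of two simple undirected graphs on nodes 0..n-1.

    g_edges, h_edges: sets of frozenset({i, j}) edges.
    Searches for a bijection pi such that {i, j} is an edge of G
    if and only if {pi(i), pi(j)} is an edge of H.
    """
    for pi in permutations(range(n)):
        if all((frozenset({pi[i], pi[j]}) in h_edges)
               == (frozenset({i, j}) in g_edges)
               for i in range(n) for j in range(i + 1, n)):
            return True
    return False
```

Subgraph matching as used in query execution is the harder variant: the query graph must be mapped into a (usually much larger) data graph, which is why the patent invests in candidate filtering and ordered joins rather than naive enumeration.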
  • An FPGA is a parallel computing device that can execute many operations at a time, while traditional ASICs, DSPs, and even CPUs compute largely serially, processing one instruction stream at a time. To speed up an ASIC or CPU, the usual method is to increase the clock frequency, so the main frequencies of ASICs and CPUs are generally higher.
  • FPGAs generally have a low main frequency, but for certain tasks a large number of relatively low-speed parallel units is more efficient than a small number of high-speed serial units.
  • There is no instruction-by-instruction "calculation" inside the FPGA: the final result is produced almost directly by the configured circuit, as in an ASIC, so execution efficiency is greatly improved.
  • RDF: Resource Description Framework
  • SPO: Subject-Predicate-Object, a data model
  • An RDF graph is composed of nodes and edges. Nodes represent entities/resources and attributes, while edges represent the relationships between entities and between entities and attributes. Generally speaking, the source node of an edge in the graph is called the subject, the label on the edge is called the predicate, and the node pointed to is called the object.
  • this list is called the adjacency list.
  • the adjacency list is divided into an out-edge adjacency list and an in-edge adjacency list, which respectively represent the adjacency list when the node is the subject or the object.
  • Frequent random access to storage results in low operating efficiency. Therefore, by analogy with the CSR (Compressed Sparse Row) storage format for sparse matrices, researchers proposed the CSR and CSC (Compressed Sparse Column) storage formats for the adjacency matrix of a graph.
  • As shown in FIG. 1, the CSR storage format consists of two arrays, C and E.
  • The E array is formed by concatenating the adjacency lists of all nodes end to end. Since a graph database system usually assigns node IDs to nodes, the adjacency lists of all nodes can be concatenated in ascending order of node ID.
  • The number of elements in the C array is the number of nodes in the graph plus 1. For each element except the last, the value of the i-th element equals the position in the E array of the first element of the adjacency list of the node with node ID i; the value of the last element equals the number of elements in the E array.
  • When the adjacency lists are out-edge adjacency lists, the format is called CSR; when they are in-edge adjacency lists, it is called CSC. Since the array C represents the offsets of the adjacency lists in E, the array C is also called the offset array.
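A minimal sketch of building the C and E arrays exactly as described above; the function name and the input encoding (adjacency lists as a dict) are illustrative choices of this sketch:

```python
def build_csr(n, adjacency):
    """Build the CSR arrays for a graph with nodes 0..n-1.

    adjacency: dict mapping node ID -> list of neighbour IDs.
    Returns (C, E): C has n + 1 entries, where C[i] is the offset in E of
    node i's adjacency list and C[n] equals len(E); E is the concatenation
    of all adjacency lists in ascending order of node ID.
    """
    C, E = [0], []
    for node in range(n):
        E.extend(adjacency.get(node, []))  # nodes with no edges contribute nothing
        C.append(len(E))
    return C, E
```

Node i's neighbours are then the slice `E[C[i]:C[i + 1]]`, which is the contiguous-access property that motivates the format.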
  • the technical scheme of the present invention includes: a data preprocessing part, a CPU control unit part and an FPGA calculation unit part.
  • the present invention uses Verilog language to write the FPGA-side computing unit, and C++ to write the CPU-side control unit program.
  • The development and operation of the present invention are based on the U200 FPGA board sold by Xilinx and its supporting runtime and development environment.
  • The U200 is equipped with 4×16 GB of memory and 2175×18K of on-chip storage. Therefore, when the present invention is implemented on the U200, the adjacency lists are divided into 128 groups and 128 intersection calculations are performed in parallel.
  • the supporting development environment can map the hardware logic corresponding to the Verilog language to the FPGA hardware.
  • The present invention is based on the open source graph database system gStore and targets the 5-billion-edge-scale LUBM database and the join queries involved in its corresponding benchmark, realizing a CPU-FPGA heterogeneous graph database query accelerator.
  • the processing of the data preprocessing part is as follows:
  • the graph data format applicable to the present invention is the RDF format.
  • the adjacency matrix of the graph data is stored in the CSR format.
  • the relationship between nodes is represented by triples.
  • the values of subject, object, and predicate in the triples need to be matched.
  • the CSR format cannot distinguish neighbors linked by different predicates. Therefore, in the present invention, the graph data is divided according to predicates. Extract all edges with the same predicate to form a subgraph, and generate a separate CSR structure (format) for each subgraph. During query processing, additional parameters are used to determine which CSR data will be read.
  • The scale of a subgraph obtained for a predicate may be much smaller than that of the original graph, making the offset array in the CSR very sparse and wasting a lot of storage space. Therefore, for each subgraph, the present invention maintains a mapping structure in advance. For a given subgraph, let the number of nodes with degree greater than or equal to 1 be n; renumber them as 0…n-1 in ascending order of ID, construct the offset array of the CSR under the new numbering, and leave the E array unchanged, i.e., the elements in the adjacency lists are still the node IDs before renumbering.
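A sketch of this predicate-wise partitioning with dense renumbering, under simplifying assumptions of this sketch: triples are given as (subject, predicate, object) ID tuples, and only out-edges are handled:

```python
def partition_by_predicate(triples):
    """Split triples into one compact CSR per predicate.

    For each predicate, only subjects with out-degree >= 1 get a new dense
    number (0..n-1 in ascending order of original ID); the offset array C is
    built under the new numbering, while E keeps the original object IDs.
    """
    subgraphs = {}
    for s, p, o in triples:
        subgraphs.setdefault(p, {}).setdefault(s, []).append(o)
    result = {}
    for p, adj in subgraphs.items():
        old_ids = sorted(adj)  # subjects that actually occur, ascending
        remap = {old: new for new, old in enumerate(old_ids)}
        C, E = [0], []
        for old in old_ids:
            E.extend(sorted(adj[old]))  # E still holds pre-renumbering IDs
            C.append(len(E))
        result[p] = {"remap": remap, "C": C, "E": E}
    return result
```

Because C only has entries for occurring subjects, the offset array stays dense even when a predicate touches a tiny fraction of the graph.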
  • the present invention adopts a binary bit string-based data compression method, which is called the BID&BS structure.
  • For a computer, the fastest operations are binary bit operations. The most intuitive idea is therefore to represent each adjacency list that constitutes the E array as a binary string: if the i-th bit is 1, the corresponding adjacency list contains node ID i; if the i-th bit is 0, it does not. Intersection then reduces to a bitwise AND: if the j-th bit of both strings is 1, both adjacency lists include node ID j.
  • The above binary string can further be divided into multiple blocks. Each block is given a unique Block ID (BID), and within each block a binary string called the Bit Stride (BS) indicates which elements of the set are present. In this way, blocks containing no elements can be skipped without calculation, achieving data compression and alleviating the data sparsity problem to a certain extent. For blocks with at least one element, a merge method can be used: first match equal BIDs by comparison, then perform a bitwise AND on the corresponding BS values to obtain the result.
  • BID: Block ID
  • BS: Bit Stride, the per-block binary string
  • Let each BID be g bits and the length of each BS be s bits, for a complete set of a given size. Each block is allocated a unique BID; since the size of g is controllable, there is no need to worry about insufficient BID space.
  • According to the above, the E array of the CSR format for each predicate-divided subgraph can be represented with the BID&BS structure.
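A minimal sketch of the BID&BS encoding and the merge-based intersection described above; the stride of 8 bits and the function names are illustrative choices of this sketch, not values from the patent:

```python
def encode_bid_bs(ids, stride=8):
    """Encode a set of node IDs as sorted (BID, BS) pairs.

    BID = id // stride; BS is a bitmask of the ids falling in that block.
    Blocks with no elements are simply omitted (the compression step).
    """
    blocks = {}
    for i in ids:
        blocks[i // stride] = blocks.get(i // stride, 0) | (1 << (i % stride))
    return sorted(blocks.items())

def intersect_bid_bs(a, b, stride=8):
    """Merge-join two (BID, BS) lists on BID, AND the BS words, decode IDs."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i][0] < b[j][0]:
            i += 1
        elif a[i][0] > b[j][0]:
            j += 1
        else:  # matching BID: bitwise AND of the strides
            bits = a[i][1] & b[j][1]
            for k in range(stride):
                if bits >> k & 1:
                    out.append(a[i][0] * stride + k)
            i += 1
            j += 1
    return out
```

The merge scan skips whole non-matching blocks, and the per-block AND is a single word operation, which is what makes this layout attractive for the FPGA intersection units.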
  • the CPU control unit part and FPGA calculation unit part are as follows:
  • The graph data is divided into multiple CSR structures according to predicate ID, each CSR corresponding to one predicate ID. Since the graph is directed, two sets of CSRs are needed: one set stores the CSR structures of the out-edges and the other stores the CSR structures of the in-edges. Since the two sets do not affect each other, they are stored in the storage units of the FPGA card. To make it easy for the FPGA computing unit to access the stored CSRs, the present invention maps the discontinuous node IDs in each CSR's offset array to continuous node IDs according to the mapping structure of the data preprocessing part.
  • Step 101 Obtain a query instruction, where the query instruction carries graph information to be queried, and the graph information to be queried includes the type of at least one edge between a plurality of variable nodes.
  • Technicians can write query instructions according to actual needs, or generate corresponding query instructions from query requests input by users. A query instruction carries the graph information to be queried, which can include the number of variable nodes to be queried and the types of the edges that exist between the variable nodes.
  • Step 102: Based on the at least one edge type and the pre-stored node connection relationship information corresponding to each type in the target graph, determine at least one node group in the target graph that satisfies the at least one edge type, and feed back a response to the query instruction based on the at least one node group.
  • The edge type can be the ID of a predicate in the RDF, and the node connection relationship information corresponding to an edge type indicates the nodes connected by edges of that type, i.e., the adjacency matrix of each subgraph produced by the data preprocessing part and transmitted by the CPU to the FPGA in the CSR structure (the E array represented by the BID&BS structure, and the C array converted according to the mapping structure).
  • A node group includes a plurality of nodes and the edges existing between them, i.e., one query result obtained for the query instruction.
  • At least one node group satisfying the type of at least one edge in the target graph can be determined by the FPGA and the CPU, i.e., the query result of the corresponding query instruction is obtained and then fed back to the technician or user. Correspondingly, the processing can be as follows:
  • Step 201: For each variable node in the query instruction, based on the types of the edges connected to the variable node in the query instruction, determine at least one candidate node in the target graph that satisfies those edge types, forming a candidate node set corresponding to the variable node.
  • the CPU can filter all the nodes that may correspond to each variable node in the pre-stored target graph according to the types of edges existing between the variable nodes in the query instruction.
  • Step 202 Determine at least one node group in the target graph that meets the type of at least one edge based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph.
  • For any two variable nodes with an edge between them, the node pairs having the corresponding edge can be determined within their two candidate node sets according to the pre-stored node connection relationship information for each type in the target graph. Then the next variable node having an edge to either variable node of the pair is determined, and corresponding node groups (of three nodes) are formed from each node pair and the candidate node set of that next variable node. Next, the next variable node with an edge to any variable node of the three-node group is determined, and node groups of four nodes are formed from the three-node groups and that variable node's candidate node set, and so on, until the candidate node set of the last variable node has been combined, yielding at least one node group containing N nodes.
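The incremental join just described amounts to a depth-first backtracking search over the variable ordering. The following single-threaded Python sketch illustrates it; the data-edge encoding as (node, node, type) tuples and the symmetric connectivity check are simplifying assumptions of this sketch (the patent's implementation instead distributes these intersections across FPGA processing units):

```python
def enumerate_matches(order, candidates, query_edges, data_edges):
    """Enumerate all node groups matching the query graph.

    order: variable nodes in the chosen search order
    candidates: dict variable node -> set of candidate data-node IDs
    query_edges: dict frozenset({u, v}) -> edge type between variables u, v
    data_edges: set of (a, b, type) tuples of the data graph
    """
    results = []

    def connected(a, b, t):
        # Treat data edges as symmetric for this illustration.
        return (a, b, t) in data_edges or (b, a, t) in data_edges

    def extend(i, binding):
        if i == len(order):
            results.append(dict(binding))  # a complete node group
            return
        var = order[i]
        for node in sorted(candidates[var]):
            # Keep only nodes connected, with the right edge type,
            # to every already-bound neighbour variable.
            ok = all(connected(node, binding[prev], t)
                     for prev in order[:i]
                     for t in [query_edges.get(frozenset({var, prev}))]
                     if t is not None)
            if ok:
                binding[var] = node
                extend(i + 1, binding)
                del binding[var]

    extend(0, {})
    return results
```

Each level of the recursion corresponds to one join step in the text: the candidate set of the next variable node is intersected with the neighbours of the nodes already chosen.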
  • this application provides a device for querying graph data in a heterogeneous environment based on FPGA-CPU.
  • the processing of step 202 above can be implemented based on this device, and the corresponding processing is as follows:
  • Step 301 The CPU determines the ordering of multiple variable nodes based on the candidate node set corresponding to each variable node, the graph information to be queried, and a preset ordering rule.
  • the processing to determine the sorting of multiple variable nodes can be as follows:
  • The ordering of the multiple variable nodes can be determined from the number of candidate nodes in each variable node's candidate node set. First, the first variable node, whose candidate node set has the smallest number of candidate nodes, is moved to the selected variable node set and determined as the first variable node in the ordering. Then, among the unselected variable nodes that have an edge to a variable node in the selected variable node set, the one whose candidate node set is smallest is determined as the second variable node, moved to the selected variable node set, and becomes the second in the ordering. Next, among the unselected variable nodes that have an edge to a variable node in the selected set (i.e., the first or second variable node), the one whose candidate node set is smallest is determined as the third variable node, moved to the selected set, and becomes the third in the ordering; and so on, until all variable nodes have been ordered.
  • the CPU can determine the execution order of multiple FPGAs that perform the join operation according to the corresponding ordering, and then transmit the candidate node set of the corresponding variable node to the corresponding FPGA.
  • the candidate node set corresponding to the first two variable nodes in the ranking can be transmitted to the first FPGA, and then the candidate node set corresponding to each subsequent variable node can be transmitted to the corresponding FPGA.
  • each FPGA can perform multiple join operations, i.e., technicians can divide an FPGA into multiple FPGA processing units.
  • after the CPU determines the ranking of all the variable nodes, it can determine, according to that ranking, the execution sequence of the FPGA processing units of the multiple FPGAs.
  • the candidate node set of each variable node is then transmitted to the corresponding FPGA, and the different FPGA processing units in the FPGA process the corresponding join operations.
  • Step 302: Select candidate nodes one by one from the first candidate node set corresponding to the first variable node in the ranking. Each time a candidate node is selected, set it as a reference node and set the first variable node as the reference variable node; determine the variable node next in the ranking after the most recently set reference variable node; and, among the already set reference variable nodes, determine the target reference variable node having an edge to that next variable node.
  • in this way the target reference variable node having an edge to the next variable node can be determined.
  • for example, x, y, and z are variable nodes and p1, p2, and p3 are edge types; the corresponding order is x-y-z.
  • with variable node x set as the reference variable node, the next variable node is variable node y, and the target reference variable node is variable node x.
  • Step 303: Determine the target type of the edge between the next variable node and the target reference variable node, and determine the target candidate node set corresponding to the next variable node.
  • the target type of the edge between the next variable node and the target reference variable node is the type of the edge between the variable node x and the variable node y, which is p1.
  • the target candidate node set corresponding to the next variable node is the candidate node set corresponding to the next variable node sent by the CPU to the corresponding FPGA.
  • Step 304: Based on the node connection relationship information corresponding to the target type, determine whether the target candidate node set includes a target candidate node that has an edge of the target type to the reference node corresponding to the target reference variable node.
  • the adjacency matrix of the subgraph corresponding to the target type can be determined from the node connection relationship information for that type; then, using this adjacency matrix, the target candidate nodes having an edge of the target type to the reference node corresponding to the target reference variable node can be found in the target candidate node set.
  • Step 305: If such a node exists, set the target candidate node as a reference node, set the next variable node as the reference variable node, and determine whether the next variable node is the last variable node in the ranking. If it is not the last, return to determining the variable node next in the ranking after the most recently set reference variable node; if it is the last, determine the currently set reference nodes as a node group in the target graph satisfying the type of the at least one edge, and transfer the node group to the processor.
  • in implementation, if a target candidate node exists, it is set as the reference node, the next variable node is set as the reference variable node, and it is determined whether the next variable node is the last variable node in the ranking.
  • if it is not the last variable node in the ranking, the corresponding calculation result is sent to the CPU, and the CPU sends it to the next FPGA in the execution order of the FPGAs.
  • the next FPGA can then continue to execute step 302. If the next FPGA is the last FPGA in the sequence, the last FPGA may send its calculation result to the CPU after performing steps 302-305, and the CPU may determine it as a node group satisfying the type of the at least one edge in the target graph.
  • the FPGA in the foregoing steps 302-305 may also be an FPGA processing unit, and the corresponding processing flow is similar to the processing procedures of the foregoing steps 302-305, and will not be repeated here.
  • the CPU can be connected to multiple FPGAs, and different FPGAs can be used with different candidate node sets to determine the nodes matching the query instruction; that is, different FPGAs can perform a two-table join (determining whether an edge of the target type exists between the target node and another node), a three-table join (determining whether edges of the target type exist between the target node and two other nodes), a four-table join, and so on.
  • the following takes a three-variable query with three variable nodes as an example to illustrate the solution:
  • Step 3051: Determine the candidate node set corresponding to each variable node, determine the ranking of the variable nodes, and determine, according to the ranking, the corresponding join operations and the FPGAs that execute them.
  • the type of the edge between variable node x and variable node y is p1
  • the type of edge between variable node y and variable node z is p2
  • the type of edge between variable node z and variable node x is p3
  • the order of the corresponding variable nodes is variable node x, variable node y, variable node z
  • the corresponding join operations can be divided into a two-table join between the candidate node sets of variable nodes x and y, and a three-table join between the candidate node set of variable node z and the candidate node sets of variable nodes x and y.
  • Step 3052 The CPU sends the set of corresponding candidate nodes to the FPGA that performs the corresponding join operation; wherein, the node connection relationship information corresponding to each type in the target graph can be pre-stored in the FPGA.
  • the node connection relationship information is divided, and the divided node connection relationship information is stored in different DRAM (Dynamic Random Access Memory) memory unit ports in the FPGA.
  • Step 3053: The first FPGA executes the calculation-unit program of the two-table join. For a two-table join, the CPU control unit transmits two candidate node sets to the FPGA.
  • the former candidate node set needs its node IDs mapped to the index IDs of the corresponding CSR,
  • while the latter candidate node set is compressed, i.e., the entire set is represented as a binary string according to the BID&BS structure.
  • in the binary string corresponding to the candidate node set, a 1 in the k-th bit means that the node with ID k appears in the set
  • every w bits are divided into a group, and the binary bits of the group ID are concatenated with the w bits to represent a group of adjacent vertex IDs.
  • finally, the FPGA calculation unit returns the intermediate result of the two-table join.
  • for example, if the latter candidate node set is S_00={0,2,3,5,6,7,64,65,66} and the E array in the corresponding node connection relationship information corresponds to S_1={000001100,000011000,001000100}
  • then according to the BID&BS structure, S_00 can be expressed as S_01={000001101,000011110,100000111}
  • a bitwise AND of S_01 and S_1 gives the result R_0={000001100,000011000}, i.e., the nodes with IDs 2, 3, and 7 in the latter candidate node set have corresponding edges with the former candidate node set; then, from the corresponding node IDs, the E array, and the mapping structure, the node IDs in the former candidate node set having edges to node IDs 2, 3, and 7 can be determined.
  • after the first FPGA obtains its calculation result, it can send the result to the CPU, which sends it to the second FPGA to execute the calculation-unit program of the three-table join.
  • Step 3054: The second FPGA executes the calculation-unit program of the three-table join. The third table may need to join with both of the first two tables. Because each item of the intermediate result of the two-table join received by the CPU-side control unit is a node pair whose second item is a set of vertex IDs, that set needs to be decompressed and then mapped to the index IDs of the CSR; it is passed as new input to the second FPGA for the three-table join, which joins the candidate node set corresponding to variable node z with the input node pairs. The corresponding calculation result is then sent to the CPU, which processes it to obtain the final query result.
  • the FPGA in the above steps 3051-3054 may also be an FPGA processing unit, and the corresponding processing flow is similar to the processing procedure in the above steps 3051-3054, and will not be repeated here.
  • if an FPGA result contains N elements, the CPU can map the first N-1 elements of the corresponding structure to node IDs on the offset array of the CSR format according to the pre-stored mapping structure, and then map the node IDs back to their numbers in the graph database for output.
  • the Nth element is a BID&BS block.
  • the CPU can decompress the Nth element to obtain the corresponding node IDs; the obtained node IDs, together with the node IDs corresponding to the previous N-1 elements, form one final query result.
  • the calculation unit of the two-table join and the calculation unit of the three-table join can correspond to two FPGA cards.
  • the two computing units can be executed independently, and each accesses different CSR structures (formats) without conflict.
  • data flows between the FPGA and the CPU as a stream. That is, as soon as an intermediate result is produced by one layer's calculation, it can be passed immediately to the next layer, without waiting for all intermediate results to be calculated before the next layer begins; i.e., the third and fourth steps can execute simultaneously, greatly improving parallelism.
  • FIG. 3 is a structural block diagram of an acceleration device/accelerator corresponding to the method for querying graph data provided by the present invention.
  • the control unit program on the CPU writes the CSR data required to calculate the join into the memory of the FPGA hardware.
  • Join is divided into multiple levels, and data is transferred between the levels in a stream.
  • the control unit program on the CPU side needs to transmit to the FPGA the candidate point table and control parameters required for this layer's calculation. Likewise, the control unit program transmits them in stream form to realize end-to-end data flow.
  • the module structure at each layer is shown in Figure 4.
  • the adjacency list read from the FPGA memory and the candidate point list obtained from the CPU are passed to multiple modules that process the intersection of two or more tables.
  • the specific number of parallel modules depends on the size of the data set and the specific configuration of the FPGA hardware.
  • FIG. 5 shows the flow of intersection calculation in the query process of the embodiment of the present invention.
  • the incoming candidate point table will be equally divided into N parts, which is equivalent to putting into N buckets. And every time an adjacency list is received, according to the range of node IDs in N buckets, each element can be put into the corresponding bucket, and then the calculation of merging and intersection is performed in each bucket. After a valid result is obtained, the result is passed back to the CPU.
  • the large-scale database module is used to store large-scale graph data sets expressed in RDF (Resource Description Framework) format.
  • the number of nodes is tens of millions or more, and the number of edges is more than 100 million.
  • the LUBM (the Lehigh University Benchmark) data set (used to test the performance of the present invention) contains 5 billion edges and about 900 million nodes.
  • Another example is the dbpedia data set extracted from Wikipedia, including 3 billion edges and 16 million nodes.
  • Such a large-scale data set has higher requirements for the performance of a single-machine graph database.
  • the present invention chooses gStore (https://github.com/pkumod/gStore), developed at the university, to provide graph database software support, because of its better single-machine query performance on large-scale data.
  • the present invention provides a large-scale data query acceleration method for graph databases based on an FPGA-CPU heterogeneous environment, together with its implementation apparatus on FPGA. It can be applied to database queries whose join processing involves intersection operations between the candidate point table and adjacency lists, solving the problem of fast querying on large-scale datasets and accelerating graph database queries. It can be widely used in application fields based on graph data processing, such as social networking, financial risk control, Internet of Things applications, relationship analysis, IT operations and maintenance, and recommendation engines.
  • the present invention can be combined with a graph database system to improve the query response speed of the graph database system. The method has been applied to intelligent natural-language question-answering queries.


Abstract

Disclosed are a method, apparatus, device, and storage medium for querying graph data. Large-scale data to be queried is represented as a large-scale graph dataset in the Resource Description Framework (RDF) format, and query acceleration is implemented on an FPGA-CPU heterogeneous environment, solving the problem of fast querying over large-scale datasets and accelerating graph database queries; the method is widely applicable to application technology fields based on graph data processing. The method is applied to intelligent natural-language question-answering queries. Implementations show that with the method of the present invention the query speedup is more than two times and can reach ten times, better meeting the needs of applications with strict response-time requirements.

Description

Method, apparatus, device, and storage medium for querying graph data
This application claims priority to Chinese Patent Application No. 201911029459.3, filed on October 28, 2019 and entitled "Large-scale data query acceleration apparatus and method based on an FPGA-CPU heterogeneous environment", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention belongs to the technical field of information search and query, relates to acceleration of large-scale data search, and in particular to a method, apparatus, device, and storage medium for querying graph data.
Background
With the development of Internet technology, graph databases are used ever more widely: automatic suggestion in search engines, fraud detection in e-commerce transaction networks, pattern discovery in social networks, and so on all require graph database support. A graph database stores data in a graph structure of nodes and edges, where a node can represent a piece of data and an edge can represent a relationship between stored data items. For example, two nodes joined by an edge may store data of two different social accounts (account ID, gender, hobbies, etc.), and the edge between the two nodes may indicate that the two accounts follow each other.
A graph database can provide a query function, i.e., all graphs with a specified structure can be queried in the graph database. For example, in a graph database storing social accounts, finding the common friends of accounts corresponds to a query graph whose structure is a triangle formed by three nodes and their edges. In the related art, the query function is mainly implemented by traversing, according to the structure of the query graph (the edges between its nodes), the adjacency list of every node in the graph database in turn, so as to find multiple subgraphs with the same structure as the query graph. A node's adjacency list records the other nodes connected to it by edges, and may also record the edges between those other nodes.
In the course of implementing this application, the inventors found that the prior art has at least the following problem:
When performing a query in a graph database, the adjacency list of every node in the graph must be traversed, which occupies a large amount of processing resources and severely degrades the efficiency of graph data queries.
Summary
To overcome the above shortcomings of the prior art, the present invention provides a method, apparatus, device, and storage medium for querying graph data that can improve the efficiency of graph data queries. The technical solution is as follows:
In a first aspect, a method for querying graph data is provided, the method comprising:
obtaining a query instruction, wherein the query instruction carries query graph information, the query graph information comprising the type of at least one edge between a plurality of variable nodes;
determining, based on the type of the at least one edge and pre-stored node connection relationship information corresponding to each type in a target graph, at least one node group in the target graph satisfying the type of the at least one edge, wherein the node connection relationship information corresponding to a type indicates the nodes connected by edges of that type;
providing feedback on the query instruction based on the at least one node group.
Optionally, the determining, based on the type of the at least one edge and the pre-stored node connection relationship information corresponding to each type in the target graph, at least one node group in the target graph satisfying the type of the at least one edge comprises:
for each variable node in the query instruction, determining, based on the types of the edges connected to the variable node in the query instruction, at least one candidate node in the target graph satisfying the types of the edges connected to the variable node, to form the candidate node set corresponding to the variable node;
determining, based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph, at least one node group in the target graph satisfying the type of the at least one edge.
Optionally, the determining, based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph, at least one node group in the target graph satisfying the type of the at least one edge comprises:
determining an ordering of the plurality of variable nodes based on the candidate node set corresponding to each variable node, the query graph information, and a preset ordering rule, wherein in the ordering an edge exists between every variable node other than the first and at least one variable node ranked before it;
selecting candidate nodes one by one from the first candidate node set corresponding to the first variable node in the ordering, and each time a candidate node is selected, setting the selected candidate node as a reference node and setting the first variable node as a reference variable node;
determining the variable node next in the ordering after the most recently set reference variable node;
determining, among the reference variable nodes already set, a target reference variable node having an edge to the next variable node;
determining the target type of the edge between the next variable node and the target reference variable node, and determining the target candidate node set corresponding to the next variable node;
determining, based on the node connection relationship information corresponding to the target type, whether the target candidate node set includes a target candidate node having an edge of the target type to the reference node corresponding to the target reference variable node;
if so, setting the target candidate node as a reference node, setting the next variable node as a reference variable node, and determining whether the next variable node is the last variable node in the ordering; if it is not the last variable node in the ordering, returning to the step of determining the variable node next in the ordering after the most recently set reference variable node; if it is the last variable node in the ordering, determining the currently set reference nodes as one node group in the target graph satisfying the type of the at least one edge, and transmitting the node group to the processor.
Optionally, the determining the ordering of the plurality of variable nodes based on the candidate node set corresponding to each variable node, the query graph information, and the preset ordering rule comprises:
establishing an unselected variable node set and a selected variable node set;
adding the plurality of variable nodes to the unselected variable node set;
determining, in the unselected variable node set, the first variable node whose corresponding candidate node set has the fewest nodes, and moving the first variable node to the selected variable node set;
selecting one by one, in the unselected variable node set, a second variable node that has an edge to a node in the selected variable node set and whose corresponding candidate node set has the fewest nodes, and moving the second variable node to the selected variable node set, until the unselected variable node set is empty;
determining the order in which the variable nodes were moved to the selected variable node set as the ordering of the plurality of variable nodes.
Optionally, the method is applied to intelligent natural-language question-answering queries, the data corresponding to the nodes in the target graph are the persons, events, and things in the natural-language question-answering data, and the edge types are the relationships between the persons, events, and things.
In a second aspect, an apparatus for querying graph data is provided, the apparatus comprising:
an obtaining module configured to obtain a query instruction, wherein the query instruction carries query graph information, the query graph information comprising the type of at least one edge between a plurality of variable nodes;
a determining module configured to determine, based on the type of the at least one edge and pre-stored node connection relationship information corresponding to each type in a target graph, at least one node group in the target graph satisfying the type of the at least one edge, wherein the node connection relationship information corresponding to a type indicates the nodes connected by edges of that type;
a feedback module configured to provide feedback on the query instruction based on the at least one node group.
Optionally, the determining module is configured to:
for each variable node in the query instruction, determine, based on the types of the edges connected to the variable node in the query instruction, at least one candidate node in the target graph satisfying the types of the edges connected to the variable node, to form the candidate node set corresponding to the variable node;
determine, based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph, at least one node group in the target graph satisfying the type of the at least one edge.
Optionally, the determining module is configured to:
determine an ordering of the plurality of variable nodes based on the candidate node set corresponding to each variable node, the query graph information, and a preset ordering rule, wherein in the ordering an edge exists between every variable node other than the first and at least one variable node ranked before it;
select candidate nodes one by one from the first candidate node set corresponding to the first variable node in the ordering, and each time a candidate node is selected, set the selected candidate node as a reference node and set the first variable node as a reference variable node;
determine the variable node next in the ordering after the most recently set reference variable node;
determine, among the reference variable nodes already set, a target reference variable node having an edge to the next variable node;
determine the target type of the edge between the next variable node and the target reference variable node, and determine the target candidate node set corresponding to the next variable node;
determine, based on the node connection relationship information corresponding to the target type, whether the target candidate node set includes a target candidate node having an edge of the target type to the reference node corresponding to the target reference variable node;
if so, set the target candidate node as a reference node, set the next variable node as a reference variable node, and determine whether the next variable node is the last variable node in the ordering; if it is not the last variable node in the ordering, return to the step of determining the variable node next in the ordering after the most recently set reference variable node; if it is the last variable node in the ordering, determine the currently set reference nodes as one node group in the target graph satisfying the type of the at least one edge, and transmit the node group to the processor.
In a third aspect, a computer device is provided, comprising a processor, an FPGA, and a memory, the memory storing at least one instruction that is loaded and executed by the processor and/or the FPGA to implement the operations performed by the method for querying graph data described above.
In a fourth aspect, a computer-readable storage medium is provided, storing at least one instruction that is loaded and executed by a processor and/or an FPGA to implement the operations performed by the method for querying graph data described above.
The technical solutions provided by the embodiments of this application have the following beneficial effects:
By partitioning the target graph in advance according to the types of its edges, node connection relationship information is obtained for each edge type; then, according to the node connection relationship information and the type of at least one edge between variable nodes included in the query graph information carried by the query instruction, at least one node group satisfying the query instruction is determined in the target graph. In this way, the nodes corresponding to the plurality of variable nodes can be filtered using the pre-stored node connection relationship information of each edge type to obtain at least one node group satisfying the type of the at least one edge, without traversing every node of the target graph in turn, which improves the efficiency of graph data queries.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the CSR structure (format) used in embodiments of the present invention;
FIG. 2 is a schematic diagram of an implementation of the BID&BS compression method used in embodiments of the present invention;
FIG. 3 is a structural block diagram of the acceleration apparatus/accelerator provided by the present invention;
FIG. 4 is a block diagram of the Kernel structure of layer 0 of the acceleration apparatus provided by the present invention;
FIG. 5 is a flow block diagram of the intersection computation in the query process of an embodiment of the present invention.
Detailed Description
The present invention is further described below through embodiments with reference to the drawings, without limiting the scope of the invention in any way.
The present invention provides a method for querying graph data that can be implemented on an FPGA-CPU heterogeneous environment, serving as a large-scale data query acceleration method for graph databases. Accelerating graph database queries supports many application scenarios based on graph data. One class of scenarios requires quickly finding certain fixed patterns in graph data: for example, shareholding relationships between companies can be expressed in the form of graph data. Fraud detection is an application scenario very well suited to graph databases. In modern fraud and various types of financial crime, such as bank fraud, credit card fraud, e-commerce fraud, and insurance fraud, fraudsters typically evade risk-control rules by means such as changing their identities. However, fraudsters can rarely change all of the associated relationships in the network, and can rarely perform the same operations synchronously across all involved network groups to evade risk control, whereas graph data can provide a global user-tracking perspective. Without an acceleration apparatus, however, the time cost is unacceptable; with the acceleration apparatus, the algorithm runs more than 2 times faster. Another class of scenarios requires support for custom real-time queries: many research institutions and commercial companies build domain-specific or open-domain natural-language question-answering systems on top of graph data (usually knowledge graphs). Underneath these question-answering systems, graph database support is needed to quickly obtain from the graph data the information required to parse and answer natural-language questions. For example, in the intelligent question-answering framework developed by an AI company, after machine-learning tools such as sentence-vector encoding, sentence-pattern parsing, word-sense filtering, sentiment recognition, and text clustering and classification convert natural language into SPARQL that the graph database can recognize, the graph database is used to search a knowledge base stored in the backend as triples, covering common sense, classical poetry, life events, music, and other domains, with a scale approaching one hundred million edges. There are about 200,000 such accesses per day, with about 5 concurrent connections at peak and an average of 10 requests per second; this step accounts for about 30% of the total latency. After the technical solution of the present invention is used to accelerate graph database queries, the latency share can be reduced to 15% and peak throughput doubled, helping the company achieve sub-second response times.
In the above method for querying graph data, the large-scale data is a database whose join processing involves intersection operations between candidate point tables and adjacency lists, such as a graph database, and the method can be applied to applications such as natural-language question-answering query acceleration. Taking the representation of a natural-language knowledge base as graph data as an example, the specific implementation includes the following operations:
1) Determine the actual meanings of the nodes and edges in the graph data. In a natural-language knowledge base, nodes usually represent all subjects in the knowledge base that can enter into relationships and have attributes, such as people, things, and places, while edges represent the relationships between subjects, such as being spouses, place of birth, or location.
2) Determine the attributes of the nodes and edges in the graph data. A node's attributes usually represent inherent characteristics of the corresponding entity, e.g., a person's age, gender, and birthday, or a place's name. An edge's attributes usually represent characteristics carried by the relationship; for example, a spousal relationship may have attributes such as start time and end time.
3) According to the above definitions, convert the data into graph data using a chosen graph data format. Taking RDF as an example, the RDF format has detailed rules for defining nodes and edges and their respective attributes; the data is converted according to those rules.
In the prior art, a method for converting natural language into a SPARQL query on a graph database may include the following steps:
1) Perform entity recognition, associating elements in the natural language with nodes in the graph database.
Elements included in natural language may have corresponding nodes in the graph database, but the node names and the natural language are not necessarily literally identical. For example, when natural language mentions "Li Bai", the corresponding node in the graph database may be labeled "Li Bai (Tang dynasty poet)"; the two need to be linked. Currently common methods essentially rely on information in the graph database, using pattern matching or deep learning for entity recognition.
2) Determine dependency relations.
A dependency relation is a semantic relation between entities in natural language. Usually, one dependency relation corresponds to two nodes and one edge in the graph data. Dependency-tree generation is currently the common method for determining dependency relations.
3) Generate the query.
Using the entity information and dependency relation information obtained above, a query recognizable by the graph database can be generated through machine learning methods.
Graph database querying is a very fundamental graph data operation; whether queries are provided to users directly or through query-based application interfaces, certain requirements are placed on graph database queries. When the graph data is huge, join operations on the graph consume large amounts of time and computing resources. A join on a graph is similar to a table join in a relational database: both find matching items in two sets according to certain conditions. The difference is that a relational database usually uses equality conditions to determine whether elements match, whereas a join in a graph database determines matches by judging whether relationships exist between elements. Compared with joins in relational databases, joins in graph databases involve more storage reads and computation and are therefore more complex.
In essence, the purpose of performing join operations on a graph is to compute subgraph isomorphism. In most graph data, a user query can be represented as a query graph, and executing the query is equivalent to finding, in the whole data graph, the subgraphs that are isomorphic to the query graph. In computer theory, the isomorphism problem is defined as follows: two simple graphs G and H are isomorphic if and only if there exists a one-to-one correspondence σ mapping nodes 1…n of G to nodes 1…n of H, such that any two nodes i and j of G are connected if and only if the corresponding nodes σ(i) and σ(j) of H are connected. If G and H are directed graphs, the definition of isomorphism further requires that for any two connected nodes i and j of G, the edge (i, j) has the same direction as its counterpart (σ(i), σ(j)) in H. FPGAs offer high parallel computing efficiency: an FPGA performs parallel computation and can execute an algorithm with multiple instructions at once, whereas traditional ASICs, DSPs, and even CPUs compute serially, processing one instruction set at a time; to speed up, ASICs and CPUs mostly increase frequency, so their clock rates are generally high. Although FPGAs generally run at lower frequencies, for some special tasks a large number of relatively slow parallel units is more efficient than a small number of fast ones. Moreover, in a sense there is no "computation" inside an FPGA at all: the final result is almost delivered directly by the circuit, as in an ASIC, so execution efficiency is greatly improved.
RDF (Resource Description Framework) is essentially a data model. It provides a unified standard for describing entities/resources — simply put, a method and means of representing things. RDF is formally represented as SPO (Subject-Predicate-Object) triples, sometimes also called statements; in knowledge graphs we also call one triple a piece of knowledge.
RDF consists of nodes and edges: nodes represent entities/resources and attributes, while edges represent the relationships between entities and between entities and attributes. Generally, the source node of an edge in the graph is called the subject, the label on the edge the predicate, and the node pointed to the object.
In the present invention, the lists of all nodes linked to arbitrary nodes in the graph must be read randomly and continuously. In graph theory, such a list is called an adjacency list. In a directed graph, adjacency lists are divided into out-edge adjacency lists and in-edge adjacency lists, representing a node's adjacency list when it acts as subject or object, respectively. However, in computer systems, continuous random accesses to storage units lead to poor running efficiency. Therefore, drawing on the CSR (Compressed Sparse Row) storage format for sparse matrices, researchers proposed the CSR storage format and the CSC (Compressed Sparse Column) storage format for the adjacency matrix of a graph.
The CSR storage format of FIG. 1 consists of two arrays, C and E. The E array is formed by concatenating the adjacency lists of all nodes end to end. Since graph database systems usually assign node IDs to nodes, the adjacency lists of all nodes can be combined in node-ID order. The number of elements of the C array is the number of nodes in the graph plus 1: except for the last element of C, the value of the i-th element equals the position in array E of the first element of the adjacency list of the node with ID i, and the value of the last element equals the number of elements in E. When the adjacency lists are out-edge adjacency lists, the format is called CSR; when they are in-edge adjacency lists, it is called CSC. Since array C records the offsets of the adjacency lists within E, array C is also called the offset array.
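The C and E arrays just described can be sketched in a few lines of C++ (the CPU-side control unit is written in C++ elsewhere in this document). This is an illustrative reconstruction, not the patent's actual code; the struct and function names are invented for the example:

```cpp
#include <cstdint>
#include <vector>

// Build the CSR arrays C (offsets) and E (concatenated adjacency lists)
// described above. adj[i] is the adjacency list of the node with ID i.
struct CSR {
    std::vector<uint32_t> C;  // |V| + 1 offsets into E
    std::vector<uint32_t> E;  // all adjacency lists, concatenated by node ID
};

CSR build_csr(const std::vector<std::vector<uint32_t>>& adj) {
    CSR g;
    g.C.reserve(adj.size() + 1);
    g.C.push_back(0);
    for (const auto& list : adj) {
        g.E.insert(g.E.end(), list.begin(), list.end());
        g.C.push_back(static_cast<uint32_t>(g.E.size()));
    }
    return g;  // neighbors of node i are E[C[i]] .. E[C[i+1]-1]
}
```

For the adjacency lists {1,2}, {2}, {}, {0} this yields C = {0,2,3,3,4} and E = {1,2,2,0}, matching the definition above: C[i] is the offset of node i's list in E, and the last element of C equals the number of elements in E.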
The technical solution of the present invention comprises: a data preprocessing part, a CPU control unit part, and an FPGA computing unit part. In a specific implementation, the FPGA-side computing units are written in the Verilog language and the CPU-side control unit program in C++. The development and operation of the present invention are based on the U200 FPGA card sold by Xilinx and its accompanying runtime and development environments. The U200 is equipped with 4×16 GB of memory and 2175×18K of on-chip storage. Therefore, when implementing the present invention on the U200, the adjacency lists are split into 128 groups for 128-way parallel intersection computation. The accompanying development environment can map the hardware logic corresponding to the Verilog code onto the FPGA hardware.
Based on the open-source graph database system gStore, the present invention implements a CPU-FPGA heterogeneous graph database query accelerator for the 5-billion-edge LUBM database and the join queries involved in its corresponding benchmark.
The processing of the data preprocessing part is as follows:
At present there are multiple standard formats for storing and representing graph data, among which the RDF format is widely used. The graph data format applicable to the present invention is RDF. During computation, the adjacency matrix of the graph data is stored in the CSR format.
In the RDF format, relationships between nodes are represented by triples. When processing a user query, the values of the subject, object, and predicate of a triple all need to be matched. However, the CSR format cannot distinguish neighbors linked by different predicates. Therefore, in the present invention, the graph data is partitioned according to predicate: all edges with the same predicate are extracted to form a subgraph, and a separate CSR structure (format) is generated for each subgraph. During query processing, an additional parameter determines which CSR's data will be read.
However, the scale of a subgraph obtained by predicate may be far smaller than that of the original graph, making the offset array in its CSR very sparse and wasting a large amount of storage space. Therefore, for each subgraph the present invention maintains a mapping structure in advance. For all nodes of a given subgraph with degree at least 1 — let their number be n — the nodes are renumbered 0…n-1 in ascending ID order, and the offset array of the CSR is then built under the new numbering, while the E array remains unchanged, i.e., the elements of the adjacency lists are still the node IDs before renumbering.
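A minimal sketch of this renumbering in C++ (assuming the degree ≥ 1 node IDs of the subgraph are available in ascending order; the names are illustrative, not the patent's interfaces):

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Per-subgraph mapping structure described above: nodes of the predicate
// subgraph with degree >= 1 are renumbered 0..n-1 in ascending ID order,
// so the subgraph's offset array is dense while the E array keeps the
// original node IDs.
std::unordered_map<uint32_t, uint32_t>
build_index_map(const std::vector<uint32_t>& nodesWithEdges /* sorted IDs */) {
    std::unordered_map<uint32_t, uint32_t> toIndex;
    for (uint32_t i = 0; i < nodesWithEdges.size(); ++i)
        toIndex[nodesWithEdges[i]] = i;  // original ID -> dense index ID
    return toIndex;
}
```

For example, if only nodes 3, 7, and 42 of a predicate subgraph have edges, they are mapped to index IDs 0, 1, and 2, and the subgraph's offset array needs only 4 entries instead of 43.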
Meanwhile, to increase the parallelism of the algorithm on FPGA hardware and reduce computational complexity, the present invention adopts a data compression scheme based on binary bit strings, called the BID&BS structure.
From the hardware perspective, binary bit operations run fastest. To compute intersections with bit operations, the most intuitive approach is to represent the element values of each adjacency list constituting the E array as a binary string, where a 1 in bit i means node ID i is present in the corresponding adjacency list and a 0 in bit i means it is absent. Then, for two adjacency lists each represented as a binary string, a single bitwise AND of the two strings yields the node IDs included in both lists: in the binary string obtained from the bitwise AND, if the value of bit j is 1, both adjacency lists include node ID j.
Since the vast majority of graphs in practical applications are sparse, with low average degree, the binary strings contain large numbers of consecutive zeros, which hurts performance. The above binary string can therefore be split into multiple blocks: each part is given a unique Block ID (BID), and inside each block a binary string called the Bit Stride (BS) indicates the presence or absence of set elements. In this way, a block containing no elements can simply be dropped and excluded from computation, achieving compression and alleviating the data sparsity problem to some extent. For blocks with at least one element, a merge method can be used: equal BIDs are first found by comparison, and the corresponding BSs are then ANDed bitwise to obtain the result. For convenience, let each BID be g bits long, each BS be s bits long, and the universe contain Ω elements; clearly there are then Ω/s distinct blocks, each assigned a unique BID. Since the size of g is controllable, there is no concern about running out of BID space.
For example, suppose there are two sets S_0={0,2,3,5,6,7,64,65,66} and S_1={2,3,7,18}, with s=4, g=5, and Ω=128; the generated BID&BS structures are shown in FIG. 2.
Thus, in the data preprocessing part, the E array of the CSR format corresponding to each predicate-partitioned subgraph can be represented using the BID&BS structure provided above.
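The BID&BS compression and the merge-style intersection described above can be sketched as follows — a simplified software model with s = 4 and the BID kept as an ordinary integer rather than a g-bit field; the names and interfaces are illustrative, not the patent's implementation:

```cpp
#include <cstdint>
#include <map>
#include <vector>

// BID&BS layout assumed here: one entry per non-empty block. Element x
// lives in block x / s; bit (x % s) of that block's BS marks its presence.
// Blocks whose BS would be all zeros are simply not stored.
using Blocks = std::map<uint32_t, uint8_t>;  // BID -> BS (s = 4 used here)

Blocks compress(const std::vector<uint32_t>& set, uint32_t s = 4) {
    Blocks b;
    for (uint32_t x : set) b[x / s] |= uint8_t(1u << (x % s));
    return b;
}

// Merge-style intersection: match equal BIDs, then AND the Bit Strides.
std::vector<uint32_t> intersect(const Blocks& a, const Blocks& b, uint32_t s = 4) {
    std::vector<uint32_t> out;
    for (const auto& [bid, bs] : a) {
        auto it = b.find(bid);
        if (it == b.end()) continue;       // block absent -> no work at all
        uint8_t common = bs & it->second;  // one bitwise AND per block
        for (uint32_t i = 0; i < s; ++i)
            if (common & (1u << i)) out.push_back(bid * s + i);
    }
    return out;
}
```

With S_0={0,2,3,5,6,7,64,65,66} and S_1={2,3,7,18} from the example above, the intersection is {2,3,7}: only blocks 0, 1, and 16 of S_0 are stored at all, and block 16 is skipped outright because S_1 has no matching BID.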
The CPU control unit part and the FPGA computing unit part are as follows:
1. Loading phase:
Data is read from the local index into host memory. Since a two-table join is based on a particular predicate, the graph data is partitioned by predicate ID into multiple CSR structures (formats), one CSR per predicate ID. Because the graph is directed, two groups of CSRs are needed: one group storing out-edge CSR structures (formats) and one storing in-edge CSR structures (formats). Since the two kinds of CSR do not affect each other, they are stored in the memory units of the FPGA card. To facilitate access to the stored CSRs by the FPGA computing units, the present invention can use the mapping structure from the data preprocessing part to map the discontinuous node IDs in each CSR's offset array to continuous node IDs.
2. Query execution phase:
Step 101: Obtain a query instruction, wherein the query instruction carries query graph information, the query graph information including the type of at least one edge between a plurality of variable nodes.
In implementation, a technician may write the query instruction according to actual needs, or a corresponding query instruction may be generated from a query request input by a user. The query instruction may carry query graph information, which may include the number of variable nodes to be queried and the types of the edges existing between the variable nodes.
Step 102: Based on the type of the at least one edge and pre-stored node connection relationship information corresponding to each type in the target graph, determine at least one node group in the target graph satisfying the type of the at least one edge, and provide feedback on the query instruction based on the at least one node group.
Here, the edge type may be a predicate ID in RDF. The node connection relationship information corresponding to an edge type indicates the nodes connected by edges of that type, i.e., the adjacency matrices of the subgraphs (the E array represented in the BID&BS structure and the C array converted via the mapping structure), stored in CSR structure, that were transmitted by the CPU to the FPGA in the data preprocessing part. A node group includes multiple nodes and the edges between them, i.e., one query result obtained from the query instruction.
In the present invention, the FPGA and the CPU can determine at least one node group in the target graph satisfying the type of the at least one edge, i.e., obtain the query results corresponding to the query instruction, and then feed the query results back to the technician or user. The corresponding processing may be as follows:
Step 201: For each variable node in the query instruction, based on the types of the edges connected to the variable node in the query instruction, determine at least one candidate node in the target graph satisfying those edge types, forming the candidate node set corresponding to the variable node.
In implementation, after the CPU obtains the query instruction, it can filter, among all the nodes of the pre-stored target graph that each variable node may correspond to, according to the types of the edges between variable nodes in the query instruction. For example, if the query instruction is S={x,p1,y; y,p2,z; z,p3,x}, where x, y, z are variable nodes and p1, p2, p3 are edge types, then the candidate node set of variable node x can be filtered from the nodes of the target graph according to p3 and p1, the candidate node set of variable node y according to p1 and p2, and the candidate node set of variable node z according to p2 and p3.
Step 202: Based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph, determine at least one node group in the target graph satisfying the type of the at least one edge.
In implementation, after the candidate node set of each variable node is obtained, the node pairs with a corresponding edge can be determined between two candidate node sets according to the pre-stored node connection relationship information of each type in the target graph. Then the next variable node having an edge to the variable node corresponding to either node of a pair is determined, and node groups (of three nodes) are determined from the already determined node pairs and the nodes of the candidate node set of that next variable node; then the next variable node having an edge to a variable node corresponding to a node in a group (of three nodes) is determined, and node groups (of four nodes) are determined from the already determined groups (of three nodes) and the candidate node set of that next variable node; and so on, until node groups containing N nodes are determined from the groups of N-1 nodes and the candidate node set of the last variable node, i.e., at least one node group in the target graph satisfying the type of the at least one edge is obtained.
Optionally, this application provides a graph data query apparatus based on an FPGA-CPU heterogeneous environment; the processing of step 202 above can be implemented on this apparatus, with the corresponding processing as follows:
Step 301: The CPU determines the ordering of the plurality of variable nodes based on the candidate node set corresponding to each variable node, the query graph information, and a preset ordering rule.
In the ordering, an edge exists between every variable node other than the first and at least one variable node ranked before it. The processing of determining the ordering of the plurality of variable nodes based on the candidate node set of each variable node, the query graph information, and the preset ordering rule may be as follows:
Establish an unselected variable node set and a selected variable node set; add the plurality of variable nodes to the unselected variable node set; in the unselected variable node set, determine the first variable node whose candidate node set has the fewest nodes, and move the first variable node to the selected variable node set; in the unselected variable node set, select one by one a second variable node that has an edge to a node in the selected variable node set and whose candidate node set has the fewest nodes, and move the second variable node to the selected variable node set, until the unselected variable node set is empty; determine the order in which the variable nodes were moved to the selected variable node set as the ordering of the plurality of variable nodes.
In implementation, the ordering of the variable nodes can be determined from the numbers of candidate nodes in their candidate node sets. First, the first variable node, whose candidate node set has the fewest candidate nodes, is moved to the selected variable node set and determined as the first variable node in the ordering. Then, among the at least one variable node having an edge to a variable node in the selected set (i.e., the first variable node), the second variable node whose candidate node set has the fewest candidate nodes is determined, moved to the selected variable node set, and determined as the second variable node in the ordering. Then, among the at least one variable node having an edge to a variable node in the selected set (i.e., the first or second variable node), the third variable node whose candidate node set has the fewest candidate nodes is determined, moved to the selected variable node set, and determined as the third variable node in the ordering; and so on, until the ordering of all the variable nodes is obtained.
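The greedy selection above can be sketched as follows (assuming the query graph is connected, as a triangle query is; `edges` and `candCount` are illustrative inputs, not the patent's data structures):

```cpp
#include <cstddef>
#include <vector>

// Greedy ordering rule described above: repeatedly pick, among the
// not-yet-selected variable nodes adjacent to an already selected one
// (any node for the first pick), the one with the smallest candidate set.
// edges[u] lists the variable nodes sharing an edge with u; candCount[u]
// is the size of u's candidate node set. Assumes a connected query graph.
std::vector<int> order_variables(const std::vector<std::vector<int>>& edges,
                                 const std::vector<size_t>& candCount) {
    size_t n = candCount.size();
    std::vector<bool> selected(n, false);
    std::vector<int> order;
    while (order.size() < n) {
        int best = -1;
        for (int v = 0; v < (int)n; ++v) {
            if (selected[v]) continue;
            bool connected = order.empty();  // first pick: any node qualifies
            for (int u : edges[v]) connected = connected || selected[u];
            if (!connected) continue;
            if (best < 0 || candCount[v] < candCount[best]) best = v;
        }
        selected[best] = true;
        order.push_back(best);
    }
    return order;
}
```

For the triangle x-y-z with candidate-set sizes 5, 2, 9, this returns the order y, x, z: y has the smallest candidate set overall, and of y's neighbors, x has fewer candidates than z.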
After determining the ordering of all the variable nodes, the CPU can determine, according to the ordering, the execution order of the multiple FPGAs performing the join operations, and then transmit the candidate node sets of the corresponding variable nodes to the corresponding FPGAs. The candidate node sets of the first two variable nodes in the ordering can be transmitted to the first FPGA, and the candidate node set of each subsequent variable node transmitted to the corresponding FPGA.
In addition, each FPGA can perform multiple join operations, i.e., a technician can divide an FPGA into multiple FPGA processing units. After determining the ordering of all the variable nodes, the CPU can determine, according to the ordering, the execution order of the multiple FPGA processing units of the multiple FPGAs, and then transmit the candidate node sets of the corresponding variable nodes to the corresponding FPGAs, where different FPGA processing units within an FPGA handle the corresponding join operations.
Step 302: Select candidate nodes one by one from the first candidate node set corresponding to the first variable node in the ordering; each time a candidate node is selected, set the selected candidate node as a reference node and set the first variable node as a reference variable node; determine the variable node next in the ordering after the most recently set reference variable node; and, among the reference variable nodes already set, determine a target reference variable node having an edge to the next variable node.
In implementation, the variable node next in the ordering after the most recently set reference variable node can be determined, and then, among the already set reference variable nodes, the target reference variable node having an edge to the next variable node is determined. For example, continuing the example of step 201, x, y, z are variable nodes, p1, p2, p3 are edge types, and the order corresponding to the variable nodes x, y, z is x-y-z; then variable node x can be set as the reference variable node, the next variable node is variable node y, and the target reference variable node is variable node x.
Step 303: Determine the target type of the edge between the next variable node and the target reference variable node, and determine the target candidate node set corresponding to the next variable node.
Corresponding to the example in step 302, the target type of the edge between the next variable node and the target reference variable node is the type of the edge between variable nodes x and y, i.e., p1. The target candidate node set of the next variable node is the candidate node set of the next variable node sent by the CPU to the corresponding FPGA.
Step 304: Based on the node connection relationship information corresponding to the target type, determine whether the target candidate node set includes a target candidate node having an edge of the target type to the reference node corresponding to the target reference variable node.
In implementation, the adjacency matrix of the subgraph corresponding to the target type can be determined from the node connection relationship information corresponding to the target type; then, according to that adjacency matrix, the target candidate nodes having an edge of the target type to the reference node corresponding to the target reference variable node can be determined in the target candidate node set.
Step 305: If such a node exists, set the target candidate node as a reference node, set the next variable node as a reference variable node, and determine whether the next variable node is the last variable node in the ordering. If it is not the last variable node in the ordering, return to determining the variable node next in the ordering after the most recently set reference variable node; if it is the last variable node in the ordering, determine the currently set reference nodes as one node group in the target graph satisfying the type of the at least one edge, and transmit the node group to the processor.
In implementation, if a target candidate node exists, the target candidate node is set as a reference node and the next variable node is set as a reference variable node, and it is determined whether the next variable node is the last variable node in the ordering. If it is not the last variable node in the ordering, the corresponding computation result is sent to the CPU, which sends it to the next FPGA in the execution order of the FPGAs, and the next FPGA can continue performing step 302. If the next FPGA is the last FPGA in the order, then after performing steps 302-305 the last FPGA can send the corresponding computation result to the CPU, which determines it as a node group in the target graph satisfying the type of the at least one edge.
It should be noted that the FPGA in steps 302-305 above may also be an FPGA processing unit; the corresponding processing flow is similar to that of steps 302-305 and is not repeated here.
That is, the CPU can be connected to multiple FPGAs, and different FPGAs can be used to determine, in different candidate node sets, the nodes matching the query instruction: different FPGAs can perform a two-table join (determining whether an edge of the target type exists between the target node and another node), a three-table join (determining whether edges of the target types exist between the target node and two other nodes), a four-table join, and so on. The solution is illustrated below using a three-variable query with three variable nodes:
Step 3051: Determine the candidate node set corresponding to each variable node, determine the ordering of the variable nodes, and determine, according to the ordering of the variable nodes, the corresponding join operations and the FPGAs that execute them.
For example, if the type of the edge between variable nodes x and y is p1, the type of the edge between variable nodes y and z is p2, the type of the edge between variable nodes z and x is p3, and the order of the variable nodes is variable node x, variable node y, variable node z, then the corresponding join operations can be divided into a two-table join between the candidate node sets of variable nodes x and y, and a three-table join between the candidate node set of variable node z and the candidate node sets of variable nodes x and y.
Step 3052: The CPU sends the corresponding candidate node sets to the FPGAs that perform the corresponding join operations. The node connection relationship information corresponding to each type in the target graph can be pre-stored in the FPGAs; to improve data access efficiency, the node connection relationship information can be partitioned, and the partitioned node connection relationship information stored at different DRAM (Dynamic Random Access Memory) memory unit ports of the FPGA.
Step 3053: The first FPGA executes the computing-unit program of the two-table join. For a two-table join, the CPU control unit transmits two candidate node sets to the FPGA. The former candidate node set needs its node IDs mapped to the index IDs of the corresponding CSR; the latter is compressed, i.e., the entire candidate node set is represented as a binary string according to the BID&BS structure: in the binary string corresponding to the candidate node set, a 1 in bit k means that the node with ID k appears in the candidate node set; every w bits are then divided into a group, and the binary bits of the group ID are concatenated with the w bits to represent a group of adjacent vertex IDs (groups whose w bits are all 0 need not be stored). Finally, the FPGA computing unit returns the intermediate result of the two-table join. For example, if the latter candidate node set is S_00={0,2,3,5,6,7,64,65,66}, and in the corresponding node connection relationship information the E array corresponds to S_1={000001100,000011000,001000100}, then according to the BID&BS structure S_00 can be represented as S_01={000001101,000011110,100000111}; a bitwise AND of S_01 and S_1 gives the result R_0={000001100,000011000}, i.e., the nodes with IDs 2, 3, and 7 in the latter candidate node set have corresponding edges with the former candidate node set. Then, from the corresponding node IDs, the E array, and the mapping structure, the node IDs in the former candidate node set having edges to node IDs 2, 3, and 7 respectively can be determined. After the first FPGA obtains the corresponding computation result, it can send the result to the CPU, which sends the computation result to the second FPGA to execute the computing-unit program of the three-table join.
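Ignoring the BID&BS compression for clarity, the core of the two-table join in step 3053 can be modeled in software like this. This is a sketch only — the real computing unit works on compressed bit strings in hardware, and the struct and function names here are invented:

```cpp
#include <cstdint>
#include <unordered_set>
#include <utility>
#include <vector>

// Two-table-join sketch: for each candidate of the first variable node,
// intersect its adjacency list under the join's predicate with the second
// variable node's candidate set; every match yields one (first, second)
// node pair of the intermediate result.
struct CSR { std::vector<uint32_t> C, E; };  // predicate subgraph in CSR form

std::vector<std::pair<uint32_t, uint32_t>>
two_table_join(const CSR& g,
               const std::vector<uint32_t>& candA,
               const std::unordered_set<uint32_t>& candB) {
    std::vector<std::pair<uint32_t, uint32_t>> result;
    for (uint32_t a : candA)                  // a is already a CSR index ID
        for (uint32_t i = g.C[a]; i < g.C[a + 1]; ++i)
            if (candB.count(g.E[i]))          // neighbor is also a B candidate
                result.push_back({a, g.E[i]});
    return result;
}
```

For example, with adjacency lists {1,2}, {2}, {} (C = {0,2,3,3}, E = {1,2,2}), candidates {0,1} for the first variable node, and {2} for the second, the intermediate result is the pairs (0,2) and (1,2).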
Step 3054: The second FPGA executes the computing-unit program of the three-table join. The third table may need to join with both of the first two tables. Since each item of the intermediate result of the two-table join received by the CPU-side control unit is a node pair whose second item is a set of vertex IDs, that set needs to be decompressed and then mapped to the index IDs of the CSR, and is passed as new input to the second FPGA for the three-table join, which computes the three-table join of the candidate node set of variable node z with the input node pairs. The corresponding computation result is then sent to the CPU, which processes it to obtain the final query result.
It should be noted that the FPGA in steps 3051-3054 above may also be an FPGA processing unit; the corresponding processing flow is similar to that of steps 3051-3054 and is not repeated here.
If a result computed by the FPGA has N elements, the CPU can, according to the pre-stored mapping structure, map the first N-1 elements of the corresponding structure to node IDs on the offset array of the CSR format, and then map those node IDs back to their numbers in the graph database for output. The N-th element is a BID&BS block; the CPU can decompress the N-th element to obtain the corresponding node IDs, and the obtained node IDs, together with the node IDs corresponding to the first N-1 elements, constitute one final query result.
Because the control unit and the computing units use data streaming, the computing unit of the two-table join and the computing unit of the three-table join can correspond to two FPGA cards. For the LUBM (the Lehigh University Benchmark) benchmark, the two computing units can execute independently, each accessing different CSR structures (formats) without conflict. In addition, data is transferred between the FPGA and the CPU as a stream: as soon as one layer's computation produces an intermediate result, it can be passed immediately to the next layer, without waiting for all intermediate results to be computed before the next layer's computation starts — i.e., the third and fourth steps can execute simultaneously, greatly improving parallelism. User queries with three or more variables can be handled with the same logic, only requiring more FPGA hardware resources, which demonstrates the scalability of this design. FIG. 3 is a structural block diagram of the acceleration apparatus/accelerator corresponding to the graph data query method provided by the present invention. In the loading phase, the control unit program on the CPU writes the CSR data needed to compute the join into the memory of the FPGA hardware. In the query execution phase, under a specific execution order, the join is split into multiple layers, with data passed between layers as a stream: as soon as one layer produces an intermediate result, it is passed immediately to the next layer, without waiting for all intermediate results to be computed, greatly improving parallelism. For each layer, the CPU-side control unit program needs to transmit to the FPGA the candidate point table and control parameters required for that layer's computation; likewise, the control unit program transmits them in stream form to realize end-to-end data flow.
Taking layer 0 as an example, the module structure of each layer is shown in FIG. 4. The adjacency lists read from FPGA memory and the candidate point table obtained from the CPU side are passed to multiple modules that compute intersections of two or more tables. The specific number of parallel modules depends on the size of the dataset and the specific configuration of the FPGA hardware.
Also taking layer 0 as an example, FIG. 5 shows the flow of the intersection computation in the query process of an embodiment of the present invention. The incoming candidate point table is divided equally into N parts, equivalent to placing them into N buckets. Each time an adjacency list is received, each of its elements can be placed into the corresponding bucket according to the node-ID ranges of the N buckets, and a merge-based intersection computation is then performed within each bucket. Once a valid result is obtained, the result is passed back to the CPU.
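The N-bucket flow of FIG. 5 can be modeled in software as follows — a sequential sketch of behavior that runs as N parallel hardware units on the FPGA; the `universe` and `n` parameters are illustrative:

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Range-partition the candidate table into n buckets up front; route each
// incoming adjacency-list element to its bucket by ID range; then intersect
// each bucket independently (in hardware, the n buckets run in parallel).
std::vector<uint32_t> bucketed_intersect(const std::vector<uint32_t>& candidates,
                                         const std::vector<uint32_t>& adjacency,
                                         uint32_t universe, uint32_t n) {
    uint32_t width = (universe + n - 1) / n;  // node IDs per bucket
    std::vector<std::vector<uint32_t>> candB(n), adjB(n);
    for (uint32_t c : candidates) candB[c / width].push_back(c);
    for (uint32_t a : adjacency)  adjB[a / width].push_back(a);
    std::vector<uint32_t> result;
    for (uint32_t b = 0; b < n; ++b) {        // per-bucket merge-intersect
        std::sort(candB[b].begin(), candB[b].end());
        std::sort(adjB[b].begin(), adjB[b].end());
        std::set_intersection(candB[b].begin(), candB[b].end(),
                              adjB[b].begin(), adjB[b].end(),
                              std::back_inserter(result));
    }
    return result;
}
```

Reusing the running example, intersecting the candidate table {0,2,3,5,6,7,64,65,66} with the adjacency list {2,3,7,18} over a 128-ID universe and 4 buckets yields {2,3,7}; only bucket 0 does any intersection work, since the other buckets hold elements from at most one side.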
In a specific implementation, the large-scale database module is used to store large-scale graph datasets expressed in the RDF (Resource Description Framework) format, with tens of millions of nodes or more and over a hundred million edges. For example, the LUBM (the Lehigh University Benchmark) dataset, used to test the performance of the present invention, contains 5 billion edges and about 900 million nodes; another example is the dbpedia dataset extracted from Wikipedia, containing 3 billion edges and 16 million nodes. Datasets of such scale place high demands on the performance of a single-machine graph database. The present invention chose gStore (https://github.com/pkumod/gStore), developed at the university, to provide graph database software support, because of its good single-machine query performance on large-scale data.
The present invention provides a large-scale data query acceleration method for graph databases based on an FPGA-CPU heterogeneous environment and its implementation apparatus on FPGA, applicable to database queries whose join processing involves intersection operations between candidate point tables and adjacency lists. It solves the problem of fast querying over large-scale datasets and accelerates graph database queries, and can be widely applied in application technology fields based on graph data processing such as social networks, financial risk control, IoT applications, relationship analysis, IT operations and maintenance, and recommendation engines. By adjusting the input and output formats, the present invention can be combined with a graph database system to improve the query response speed of the graph database system. When the method is applied to intelligent natural-language question-answering queries, the persons, events, and things in the natural-language question-answering data are recognized as entities and correspondingly represented as nodes in the RDF format; entity attributes are defined as node attributes, and the relationships between entities and between entities and attributes are represented as edges in the RDF format. Specific implementations show that with the technical solution of the present invention the query speedup is more than two times and can reach ten times, better meeting the needs of applications with strict response-time requirements, for example enabling real-time graph pattern query discovery.
It should be noted that the purpose of publishing the embodiments is to help further understand the present invention, but those skilled in the art will understand that various substitutions and modifications are possible without departing from the spirit and scope of the present invention and the appended claims. Therefore, the present invention should not be limited to the content disclosed in the embodiments; the scope of protection claimed by the present invention is defined by the claims.

Claims (10)

  1. A method for querying graph data, characterized in that the method comprises:
    obtaining a query instruction, wherein the query instruction carries query graph information, the query graph information comprising the type of at least one edge between a plurality of variable nodes;
    determining, based on the type of the at least one edge and pre-stored node connection relationship information corresponding to each type in a target graph, at least one node group in the target graph satisfying the type of the at least one edge, wherein the node connection relationship information corresponding to a type indicates the nodes connected by edges of that type;
    providing feedback on the query instruction based on the at least one node group.
  2. The method according to claim 1, characterized in that the determining, based on the type of the at least one edge and the pre-stored node connection relationship information corresponding to each type in the target graph, at least one node group in the target graph satisfying the type of the at least one edge comprises:
    for each variable node in the query instruction, determining, based on the types of the edges connected to the variable node in the query instruction, at least one candidate node in the target graph satisfying the types of the edges connected to the variable node, to form the candidate node set corresponding to the variable node;
    determining, based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph, at least one node group in the target graph satisfying the type of the at least one edge.
  3. The method according to claim 2, characterized in that the determining, based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph, at least one node group in the target graph satisfying the type of the at least one edge comprises:
    determining an ordering of the plurality of variable nodes based on the candidate node set corresponding to each variable node, the query graph information, and a preset ordering rule, wherein in the ordering an edge exists between every variable node other than the first and at least one variable node ranked before it;
    selecting candidate nodes one by one from the first candidate node set corresponding to the first variable node in the ordering, and each time a candidate node is selected, setting the selected candidate node as a reference node and setting the first variable node as a reference variable node;
    determining the variable node next in the ordering after the most recently set reference variable node;
    determining, among the reference variable nodes already set, a target reference variable node having an edge to the next variable node;
    determining the target type of the edge between the next variable node and the target reference variable node, and determining the target candidate node set corresponding to the next variable node;
    determining, based on the node connection relationship information corresponding to the target type, whether the target candidate node set includes a target candidate node having an edge of the target type to the reference node corresponding to the target reference variable node;
    if so, setting the target candidate node as a reference node, setting the next variable node as a reference variable node, and determining whether the next variable node is the last variable node in the ordering; if it is not the last variable node in the ordering, returning to the step of determining the variable node next in the ordering after the most recently set reference variable node; if it is the last variable node in the ordering, determining the currently set reference nodes as one node group in the target graph satisfying the type of the at least one edge, and transmitting the node group to the processor.
  4. The method according to claim 3, characterized in that the determining the ordering of the plurality of variable nodes based on the candidate node set corresponding to each variable node, the query graph information, and the preset ordering rule comprises:
    establishing an unselected variable node set and a selected variable node set;
    adding the plurality of variable nodes to the unselected variable node set;
    determining, in the unselected variable node set, the first variable node whose corresponding candidate node set has the fewest nodes, and moving the first variable node to the selected variable node set;
    selecting one by one, in the unselected variable node set, a second variable node that has an edge to a node in the selected variable node set and whose corresponding candidate node set has the fewest nodes, and moving the second variable node to the selected variable node set, until the unselected variable node set is empty;
    determining the order in which the variable nodes were moved to the selected variable node set as the ordering of the plurality of variable nodes.
  5. The method according to claim 1, characterized in that the method is applied to intelligent natural-language question-answering queries, the data corresponding to the nodes in the target graph are the persons, events, and things in the natural-language question-answering data, and the edge types are the relationships between the persons, events, and things.
  6. An apparatus for performing a graph data query, characterized in that the apparatus comprises:
    an obtaining module, configured to obtain a query instruction, wherein the query instruction carries query graph information, and the query graph information comprises at least one edge type between a plurality of variable nodes;
    a determining module, configured to determine, based on the at least one edge type and pre-stored node connection relationship information corresponding to each type in a target graph, at least one node group in the target graph that satisfies the at least one edge type, wherein the node connection relationship information corresponding to a type indicates the nodes connected by edges of that type;
    a feedback module, configured to respond to the query instruction based on the at least one node group.
  7. The apparatus according to claim 6, characterized in that the determining module is configured to:
    for each variable node in the query instruction, determine, based on the types of the edges connected to the variable node in the query instruction, at least one candidate node in the target graph that satisfies the types of the edges connected to the variable node, to form a candidate node set corresponding to the variable node; and
    determine, based on the candidate node set corresponding to each variable node and the pre-stored node connection relationship information corresponding to each type in the target graph, the at least one node group in the target graph that satisfies the at least one edge type.
  8. The apparatus according to claim 7, characterized in that the determining module is configured to:
    determine an ordering of the plurality of variable nodes based on the candidate node set corresponding to each variable node, the query graph information, and a preset ordering rule, wherein, in the ordering, each variable node other than the first variable node has an edge to at least one variable node ranked before it;
    select candidate nodes one by one from a first candidate node set corresponding to the first variable node in the ordering, and each time a candidate node is selected, set the selected candidate node as a base node and set the first variable node as a base variable node;
    determine the variable node next in the ordering after the most recently set base variable node;
    determine, among the base variable nodes that have been set, a target base variable node that has an edge to the next variable node;
    determine a target type of the edge between the next variable node and the target base variable node, and determine a target candidate node set corresponding to the next variable node;
    determine, based on the node connection relationship information corresponding to the target type, whether the target candidate node set includes a target candidate node that has an edge of the target type to the base node corresponding to the target base variable node;
    if so, set the target candidate node as a base node, set the next variable node as a base variable node, and determine whether the next variable node is the last variable node in the ordering; if it is not the last variable node in the ordering, return to determining the variable node next in the ordering after the most recently set base variable node; if it is the last variable node in the ordering, determine the currently set base nodes as one node group in the target graph that satisfies the at least one edge type, and transmit the node group to the processor.
  9. A computer device, characterized in that the computer device comprises a processor, an FPGA, and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor and/or the FPGA to implement the operations performed by the graph data query method according to any one of claims 1 to 5.
  10. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, the instruction being loaded and executed by a processor and/or an FPGA to implement the operations performed by the graph data query method according to any one of claims 1 to 5.
PCT/CN2020/124541 2019-10-28 2020-10-28 Graph data query method, apparatus, device, and storage medium WO2021083239A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911029459.3A CN110990638B (zh) 2019-10-28 2019-10-28 Large-scale data query acceleration apparatus and method based on an FPGA-CPU heterogeneous environment
CN201911029459.3 2019-10-28

Publications (1)

Publication Number Publication Date
WO2021083239A1 true WO2021083239A1 (zh) 2021-05-06

Family

ID=70082620

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124541 WO2021083239A1 (zh) 2019-10-28 2020-10-28 Graph data query method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN110990638B (zh)
WO (1) WO2021083239A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210233181A1 (en) * 2018-08-06 2021-07-29 Ernst & Young Gmbh Wirtschaftsprüfungsgesellschaft System and method of determining tax liability of entity
CN113220710A (zh) * 2021-05-11 2021-08-06 Beijing Baidu Netcom Science and Technology Co., Ltd. Data query method and apparatus, electronic device, and storage medium
CN113626594A (zh) * 2021-07-16 2021-11-09 Shanghai Qiwang Network Technology Co., Ltd. Multi-agent-based method and system for building an operation and maintenance knowledge base
CN113837777A (zh) * 2021-09-30 2021-12-24 Zhejiang Chuanglin Technology Co., Ltd. Graph-database-based anti-fraud control method, apparatus, system, and storage medium
CN114186100A (zh) * 2021-10-08 2022-03-15 Alipay (Hangzhou) Information Technology Co., Ltd. Data storage and query method and apparatus, and database system
CN114298674A (zh) * 2021-12-27 2022-04-08 Sichuan Qiruike Technology Co., Ltd. Job rotation system and method based on complex-rule rotation assignment computation

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110990638B (zh) * 2019-10-28 2023-04-28 Peking University Large-scale data query acceleration apparatus and method based on an FPGA-CPU heterogeneous environment
CN111553834B (zh) * 2020-04-24 2023-11-03 Shanghai Jiao Tong University FPGA-based concurrent graph data preprocessing method
CN111241356B (zh) * 2020-04-26 2020-08-11 Tencent Technology (Shenzhen) Co., Ltd. Data search method, apparatus, and device based on a simulated quantum algorithm
CN111538854B (zh) * 2020-04-27 2023-08-08 Beijing Baidu Netcom Science and Technology Co., Ltd. Search method and apparatus
CN111625558A (zh) * 2020-05-07 2020-09-04 Suzhou Inspur Intelligent Technology Co., Ltd. Server architecture, database query method therefor, and storage medium
CN112069216A (zh) * 2020-09-18 2020-12-11 Shandong Chaoyue CNC Electronics Co., Ltd. FPGA-based join algorithm implementation method, system, apparatus, and medium
US20220129770A1 (en) * 2020-10-23 2022-04-28 International Business Machines Corporation Implementing relation linking for knowledge bases
CN112463870B (zh) * 2021-02-03 2021-05-04 Nanjing New Dynamic Information Technology Co., Ltd. FPGA-based database SQL acceleration method
CN115544069B (zh) * 2022-09-26 2023-06-20 Shandong Inspur Scientific Research Institute Co., Ltd. Reconfigurable database query acceleration processor and system
CN117155400A (zh) * 2023-10-30 2023-12-01 Shandong Inspur Database Technology Co., Ltd. Row data decompression method and system, and FPGA heterogeneous data acceleration card

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017134582A (ja) * 2016-01-27 2017-08-03 Yahoo Japan Corporation Graph index search device and operation method thereof
CN109271458A (zh) * 2018-09-14 2019-01-25 Nanwei Software Co., Ltd. Graph-database-based relationship network query method and system
CN109726305A (zh) * 2018-12-30 2019-05-07 Information Science Academy, China Electronics Technology Group Corporation Graph-structure-based complex relational data storage and retrieval method
CN110334159A (zh) * 2019-05-29 2019-10-15 Suning Financial Services (Shanghai) Co., Ltd. Relationship-graph-based information query method and apparatus
CN110990638A (zh) * 2019-10-28 2020-04-10 Peking University Large-scale data query acceleration apparatus and method based on an FPGA-CPU heterogeneous environment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8983990B2 (en) * 2010-08-17 2015-03-17 International Business Machines Corporation Enforcing query policies over resource description framework data
US8756237B2 (en) * 2012-10-12 2014-06-17 Architecture Technology Corporation Scalable distributed processing of RDF data
CN106528648B (zh) * 2016-10-14 2019-10-15 Fuzhou University Distributed RDF keyword approximate search method combined with the Redis in-memory database
US11120082B2 (en) * 2018-04-18 2021-09-14 Oracle International Corporation Efficient, in-memory, relational representation for heterogeneous graphs
CN109325029A (zh) * 2018-08-30 2019-02-12 Tianjin University Sparse-matrix-based RDF data storage and query method
CN110109898B (zh) * 2019-04-23 2023-04-18 Chaoyue Technology Co., Ltd. Hash join acceleration method and system based on FPGA on-chip BRAM

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210233181A1 (en) * 2018-08-06 2021-07-29 Ernst & Young Gmbh Wirtschaftsprüfungsgesellschaft System and method of determining tax liability of entity
US11983780B2 (en) * 2018-08-06 2024-05-14 Ey Gmbh & Co. Kg Wirtschaftsp Üfungsgesell Schaft System and method of determining tax liability of entity
CN113220710A (zh) * 2021-05-11 2021-08-06 Beijing Baidu Netcom Science and Technology Co., Ltd. Data query method and apparatus, electronic device, and storage medium
CN113220710B (zh) * 2021-05-11 2024-06-04 Beijing Baidu Netcom Science and Technology Co., Ltd. Data query method and apparatus, electronic device, and storage medium
CN113626594A (zh) * 2021-07-16 2021-11-09 Shanghai Qiwang Network Technology Co., Ltd. Multi-agent-based method and system for building an operation and maintenance knowledge base
CN113626594B (zh) * 2021-07-16 2023-09-01 Shanghai Qiwang Network Technology Co., Ltd. Multi-agent-based method and system for building an operation and maintenance knowledge base
CN113837777A (zh) * 2021-09-30 2021-12-24 Zhejiang Chuanglin Technology Co., Ltd. Graph-database-based anti-fraud control method, apparatus, system, and storage medium
CN113837777B (zh) * 2021-09-30 2024-02-20 Zhejiang Chuanglin Technology Co., Ltd. Graph-database-based anti-fraud control method, apparatus, system, and storage medium
CN114186100A (zh) * 2021-10-08 2022-03-15 Alipay (Hangzhou) Information Technology Co., Ltd. Data storage and query method and apparatus, and database system
CN114186100B (zh) * 2021-10-08 2024-05-31 Alipay (Hangzhou) Information Technology Co., Ltd. Data storage and query method and apparatus, and database system
CN114298674A (zh) * 2021-12-27 2022-04-08 Sichuan Qiruike Technology Co., Ltd. Job rotation system and method based on complex-rule rotation assignment computation
CN114298674B (zh) * 2021-12-27 2024-04-12 Sichuan Qiruike Technology Co., Ltd. Job rotation system and method based on complex-rule rotation assignment computation

Also Published As

Publication number Publication date
CN110990638B (zh) 2023-04-28
CN110990638A (zh) 2020-04-10

Similar Documents

Publication Publication Date Title
WO2021083239A1 (zh) Graph data query method, apparatus, device, and storage medium
CN110727839B (zh) Semantic parsing of natural language queries
US20210097089A1 (en) Knowledge graph building method, electronic apparatus and non-transitory computer readable storage medium
US20140172914A1 (en) Graph query processing using plurality of engines
US10296524B1 (en) Data virtualization using leveraged semantic knowledge in a knowledge graph
CN106021457B (zh) Keyword-based RDF distributed semantic search method
US11068512B2 (en) Data virtualization using leveraged semantic knowledge in a knowledge graph
CN101154228A (zh) Segmented pattern matching method and apparatus therefor
CN114218400A (zh) Semantics-based data lake query system and method
WO2021047373A1 (zh) Big-data-based column data processing method, device, and medium
CN112580357A (zh) Semantic parsing of natural language queries
De Virgilio Smart RDF data storage in graph databases
Cong Personalized recommendation of film and television culture based on an intelligent classification algorithm
US20200065395A1 (en) Efficient leaf invalidation for query execution
CN112970011A (zh) Recording lineage in query optimization
CN112784017A (zh) Archive cross-modal data feature fusion method based on principal affinity representation
US10990881B1 (en) Predictive analytics using sentence data model
Vasilyeva et al. Leveraging flexible data management with graph databases
CN109063048A (zh) Data cleaning method and apparatus based on knowledge-base graph matching
Arasu et al. Towards a domain independent platform for data cleaning
Wang et al. A scalable parallel chinese online encyclopedia knowledge denoising method based on entry tags and spark cluster
CN113901278A (zh) Data search method and apparatus based on global multi-probing and adaptive termination
US20170031909A1 (en) Locality-sensitive hashing for algebraic expressions
Abdallah et al. Towards a gml-enabled knowledge graph platform
Banerjee et al. Natural language querying and visualization system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20880538

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20880538

Country of ref document: EP

Kind code of ref document: A1