CN114691721A - Graph data query method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN114691721A
Authority
CN
China
Prior art keywords
data
file
target
vertex
index file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210296632.1A
Other languages
Chinese (zh)
Inventor
浦世亮
范小辉
姜伟浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202210296632.1A
Publication of CN114691721A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a graph data query method and apparatus, an electronic device, and a storage medium, and relate to the technical field of data query. The method includes: receiving a query request for a target data file in graph data; identifying a target partition in a target node of the graph data according to the query request and a preset correspondence, wherein the graph data comprises a plurality of nodes and each node comprises a plurality of partitions; acquiring an index file of the target partition, wherein the index file comprises storage location information of a plurality of data files; and querying the index file to obtain the target data file. The storage location of the target data is thus identified through a pre-stored index file, so that the target data can be found quickly.

Description

Graph data query method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data query technologies, and in particular, to a graph data query method and apparatus, an electronic device, and a storage medium.
Background
Graph data is now used more and more widely. The design of the graph data storage and index structure is an important part of a graph database and plays a decisive role in its performance; different data storage and indexing schemes often differ in performance by orders of magnitude.
However, the inventors have found that the storage layer of most current non-native graph databases stores vertex data and edge data as key-value pairs, with the graph database adopting an architecturally mature key-value database as its storage layer. No storage structure optimization specific to graphs is performed, so the storage space occupied is large, and the limitations of the storage structure design also prevent efficient neighbor traversal queries.
Disclosure of Invention
Embodiments of the invention aim to provide a graph data query method and apparatus, an electronic device, and a storage medium, so as to enable fast querying of graph data. The specific technical solutions are as follows:
In a first aspect of the embodiments of the present application, a graph data query method is provided, the method including:
receiving a query request for a target data file in graph data;
identifying a target partition in a target node of the graph data according to the query request and a preset correspondence, wherein the graph data comprises a plurality of nodes, and each node comprises a plurality of partitions;
acquiring an index file of the target partition, wherein the index file comprises storage location information of a plurality of data files;
and querying the index file to obtain the target data file.
Optionally, each partition stores a plurality of vertex data files and edge data files, the query request of the target data file includes identification information of the target vertex data file, and the index file includes identification information of the vertex data file and storage location information of the corresponding vertex data file;
the obtaining of the target data file according to the index file query comprises:
obtaining the storage position information of the target data file according to the index file query through the identification information of the target vertex data file;
and acquiring the target data file according to the storage position information of the target data file.
Optionally, the index file further includes identification information of the vertex data file, storage location information of the corresponding vertex data file, and storage location information of edge data files of a plurality of edges adjacent to the vertex;
the obtaining of the target data file according to the index file query comprises:
through the identification information of the target vertex data file, the storage position information of the edge data files of a plurality of adjacent edges of the target vertex is obtained through the query of the index file;
and searching the edge data files of a plurality of adjacent edges of the target vertex from the current partition according to the storage position information of the edge data files, wherein the edge data files of the adjacent edges in the index file are stored in a physically continuous storage space.
Optionally, the obtaining of the target data file according to the index file query includes:
obtaining the target data file from the index file by binary search.
Optionally, the index file is a multi-level tree index structure and includes a plurality of pieces of identification information sorted in a preset order, and the obtaining of the target data file from the index file by binary search includes:
obtaining the target data file by binary search over the index file in the preset order.
Optionally, before receiving the query request of the target data file in the graph data, the method further includes:
acquiring a write-in request of a user, wherein the write-in request comprises data to be written;
performing data classification on the data to be written to obtain a corresponding classification result, wherein the classification result is at least one of a vertex data file and an edge data file;
and storing the data to be written to a designated position according to the classification result, and establishing an index file according to the storage position, wherein the index file is at least one of a vertex index file and an edge index file.
Optionally, the storing the data to be written to the designated location according to the classification result, and establishing an index file according to the storage location includes:
dividing the vertex data file and the edge data file into different partitions according to the classification result;
sorting according to the identification information of the data files in each partition to obtain a corresponding sorting result;
and generating a corresponding index file according to the sorting result.
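As an illustration only, the write path described above (classify, partition, sort, build an index) might be sketched in Python as follows. The record shape, the CRC32-based `partition_of` rule, the value of `NUM_PARTITIONS`, and the in-memory lists standing in for data files are assumptions of this sketch and are not specified by the application.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed value, for illustration only


def partition_of(vertex_id: str) -> int:
    """Hypothetical partition rule: a stable hash of the vertex ID."""
    return zlib.crc32(vertex_id.encode("utf-8")) % NUM_PARTITIONS


def classify(record: dict) -> str:
    """Classify a record as vertex or edge data (assumed record shape)."""
    return "edge" if "src" in record and "dst" in record else "vertex"


def build_partitions(records):
    """Classify, partition, sort by ID, and build a position index."""
    buckets = defaultdict(lambda: {"vertex": [], "edge": []})
    for rec in records:
        kind = classify(rec)
        # In this sketch an edge is placed with its source vertex, so that
        # a vertex and its outgoing edges share a partition.
        owner = rec["src"] if kind == "edge" else rec["id"]
        buckets[partition_of(owner)][kind].append(rec)

    partitions = {}
    for pid, data in buckets.items():
        vertices = sorted(data["vertex"], key=lambda r: r["id"])
        edges = sorted(data["edge"], key=lambda r: (r["src"], r["type"], r["dst"]))
        # Index: sorted (ID, position) pairs. The position is a list offset
        # standing in for a byte offset in a data file.
        vertex_index = [(v["id"], pos) for pos, v in enumerate(vertices)]
        partitions[pid] = {
            "vertices": vertices,
            "edges": edges,
            "vertex_index": vertex_index,
        }
    return partitions
```

Because the index is generated from the sorted data, its entries are already ordered by ID, which is what later allows a binary search over it.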
In a second aspect of the embodiments of the present application, there is provided an apparatus for querying graph data, where the apparatus includes:
the request receiving module is used for receiving a query request of a target data file in the graph data;
a partition identification module, configured to identify and obtain a target partition in target nodes in the graph data according to the query request and a preset correspondence, where the graph data includes multiple nodes, and each node includes multiple partitions;
the file acquisition module is used for acquiring an index file of the target partition, wherein the index file comprises storage position information of a plurality of data files;
and the file query module is used for obtaining a target data file according to the index file query.
Optionally, each partition stores a plurality of vertex data files and edge data files, the query request of the target data file includes identification information of the target vertex data file, and the index file includes identification information of the vertex data file and storage location information of the corresponding vertex data file;
the file query module comprises:
the position information query submodule is used for obtaining the storage position information of the target data file according to the index file query through the identification information of the target vertex data file;
and the target file acquisition submodule is used for acquiring the target data file according to the storage position information of the target data file.
Optionally, the index file further includes identification information of the vertex data file, storage location information of the corresponding vertex data file, and storage location information of edge data files of a plurality of edges adjacent to the vertex;
the file query module comprises:
the adjacent position information acquisition submodule is used for inquiring the storage position information of the edge data files of a plurality of adjacent edges of the target vertex according to the index file through the identification information of the target vertex data file;
and the adjacent data file searching submodule is used for searching the edge data files of a plurality of adjacent edges of the target vertex from the current partition according to the storage position information of the edge data files, wherein the edge data files of the adjacent edges in the index file are stored in a physically continuous storage space.
Optionally, the index file is a multi-level tree index structure, and the file query module is specifically configured to obtain a target data file by binary search query according to the index file.
Optionally, the index file includes a plurality of identification information sorted according to a preset order,
the file query module is specifically configured to obtain a target data file through binary search query according to the index file and a preset sequence.
Optionally, the apparatus further comprises:
a request acquisition module, configured to acquire a write request of a user, wherein the write request comprises data to be written;
the classification result acquisition module is used for carrying out data classification on the data to be written to obtain a corresponding classification result, wherein the classification result is at least one of a vertex data file and an edge data file;
and the data writing module is used for storing the data to be written to a designated position according to the classification result and establishing an index file according to the storage position, wherein the index file is at least one of a vertex index file and an edge index file.
Optionally, the data writing module includes:
the partition division submodule is used for dividing the vertex data file and the edge data file into different partitions according to the classification result;
the data sorting submodule is used for sorting according to the identification information of the data files in each partition to obtain a corresponding sorting result;
and the index generation submodule is used for generating a corresponding index file according to the sorting result.
In another aspect, an embodiment of the present application further provides an electronic device including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
and the processor is configured to implement the steps of any of the above graph data query methods when executing the program stored in the memory.
In another aspect of the embodiments of the present application, a computer-readable storage medium is further provided, in which a computer program is stored; when executed by a processor, the computer program implements the steps of any of the above graph data query methods.
In another aspect of the embodiments of the present application, a computer program product containing instructions is further provided, which, when run on a computer, causes the computer to perform the steps of any of the above graph data query methods.
The embodiment of the invention has the following beneficial effects:
With the graph data query method and apparatus, electronic device, and storage medium provided by the embodiments of the invention, a query request for a target data file in graph data is received; a target partition in a target node of the graph data is identified according to the query request and a preset correspondence, wherein the graph data comprises a plurality of nodes and each node comprises a plurality of partitions; an index file of the target partition is acquired, wherein the index file comprises storage location information of a plurality of data files; and the index file is queried to obtain the target data file. The storage location of the target data is thus identified through a pre-stored index file, so that the target data can be found quickly.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other embodiments from them.
FIG. 1 is a schematic flowchart of a graph data query method according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of querying a target data file according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an example of data storage according to an embodiment of the present application;
FIG. 4 is a schematic diagram of another example of data storage according to an embodiment of the present application;
FIG. 5 is another schematic flowchart of querying a target data file according to an embodiment of the present application;
FIG. 6 is a diagram illustrating an example of an index data structure according to an embodiment of the present application;
FIG. 7 is a diagram illustrating another example of an index data structure according to an embodiment of the present application;
FIG. 8 is a schematic flowchart of creating an index file according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an apparatus for querying graph data according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived from the embodiments given herein by one of ordinary skill in the art, are within the scope of the invention.
First, technical terms that may be used in the embodiments of the present application are explained:
Graph: an abstract data structure for representing associations between objects, described using vertices, which represent objects, and edges, which represent relationships between objects.
Native graph database: a graph database is a type of NoSQL (non-relational) database that uses graph theory to store relationship information between entities. A native graph database is one in which data storage, graph query, and related aspects are optimized specifically for graphs, as distinct from a graph database that stores its data in an external storage component (such as a KV (Key-Value) database).
Data storage: the manner in which vertex and edge data are stored in a graph database.
Index: a data structure that accelerates queries for vertices and edges.
In a first aspect of the embodiments of the present application, a method for querying graph data is provided, including:
receiving a query request of a target data file in graph data;
identifying and obtaining a target partition in a target node in graph data through the query request and a preset corresponding relation, wherein the graph data comprises a plurality of nodes, and each node comprises a plurality of partitions;
acquiring an index file of a target partition, wherein the index file comprises storage position information of a plurality of data files;
and inquiring according to the index file to obtain the target data file.
Therefore, by the method of the embodiment of the application, the query request of the target graph data can be received; identifying the storage position of the target data through a pre-stored index file; and acquiring target data according to the storage position. Therefore, the storage position of the target data is identified through the pre-stored index file, and the target data is quickly searched.
Referring to fig. 1, fig. 1 is a schematic flowchart of a graph data query method provided in an embodiment of the present application, including:
Step S11, a query request for a target data file in the graph data is received.
The graph data in the embodiment of the present application is data represented in a graph structure. The query request is a request to query a file in the graph data. For example, when the method is applied to a graph database, the graph database supports large-capacity vertex and edge data storage through a data partitioning technique: the vertex and edge data are stored on a plurality of different nodes, each node stores only part of the data, and a query request is sent to the node that stores the requested graph data. A graph database can serve as an analysis platform for graph relationships, and graph query (random query of vertices or edges, query of the incoming and outgoing edges of a vertex, and multi-level depth traversal) is its core function.
The method of the embodiment of the application is applied to an intelligent terminal and can be implemented by the intelligent terminal; specifically, the intelligent terminal may be a computer, a server, or the like.
Step S12, a target partition in a target node of the graph data is identified according to the query request and the preset correspondence.
The graph data includes a plurality of nodes, each node includes a plurality of partitions, and each partition includes a plurality of vertex data and the edge data files of the edges adjacent to the vertices in that partition.
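The application does not say how the preset correspondence in step S12 is defined. As one possible reading, assuming a hash-based partition policy, the mapping from an identifier to a node and partition might look like the Python sketch below. The `PARTITION_TO_NODE` table, the node names, and the CRC32 rule are all hypothetical.

```python
import zlib

# Hypothetical cluster layout: which node hosts which partition.
PARTITION_TO_NODE = {0: "node-0", 1: "node-0", 2: "node-1", 3: "node-1"}


def locate(vertex_id: str) -> tuple[str, int]:
    """Map a vertex ID to (target node, target partition).

    Assumed policy: a stable hash of the ID selects the partition, and a
    preset table maps the partition to the node that stores it.
    """
    partition = zlib.crc32(vertex_id.encode("utf-8")) % len(PARTITION_TO_NODE)
    return PARTITION_TO_NODE[partition], partition
```

A stable hash is used here, rather than Python's built-in `hash`, so that the same ID maps to the same partition across processes.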
Step S13, an index file of the target partition is acquired, where the index file includes storage location information of a plurality of data files.
The index file may include the storage locations corresponding to different target data. To find the storage location of the target data file, the index file may be searched using the identification information of the target data file; the identification information may specifically be a preset unique identifier.
Step S14, the target data file is obtained by querying the index file.
In the embodiment of the application, obtaining the target data file by querying the index file may consist of looking up the corresponding location in the index file and then reading the corresponding data from that location. For example, the position of graph data such as a vertex or an edge in the data file can be found from the index file, and the graph data being sought can then be read from the data file.
When the method of the embodiment of the application is applied to querying data within a node, the target data may be obtained from the node according to the identified storage location. When the method is applied to querying edge data files, the edge data files of the plurality of edges adjacent to the target vertex may be searched for in the current partition according to the storage location information of the edge data files.
Therefore, by the method of the embodiment of the application, the query request of the target graph data can be received; identifying the storage position of the target data through a pre-stored index file; and acquiring target data according to the storage position. Therefore, the storage position of the target data is identified through the pre-stored index file, and the target data is quickly searched.
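Steps S13 and S14 amount to an index lookup followed by a positioned read. A minimal Python sketch of that pattern is given below, assuming length-prefixed records and using an in-memory `io.BytesIO` in place of a data file on disk. The record format and the function names are assumptions of this example, not part of the application.

```python
import bisect
import io
import struct


def write_records(records):
    """Write (id, payload) records sorted by ID; return data file and index."""
    data = io.BytesIO()
    index = []  # sorted list of (id, byte offset into the data file)
    for rec_id, payload in sorted(records):
        index.append((rec_id, data.tell()))
        body = payload.encode("utf-8")
        data.write(struct.pack("<I", len(body)))  # 4-byte length prefix
        data.write(body)
    return data, index


def query(data, index, rec_id):
    """Binary-search the index, then do one positioned read of the data file."""
    ids = [i for i, _ in index]
    pos = bisect.bisect_left(ids, rec_id)
    if pos == len(ids) or ids[pos] != rec_id:
        return None  # ID not present in this partition
    data.seek(index[pos][1])
    (length,) = struct.unpack("<I", data.read(4))
    return data.read(length).decode("utf-8")
```

For example, after `data, index = write_records([("v2", "bob"), ("v1", "alice")])`, calling `query(data, index, "v1")` returns `"alice"`, and a lookup of an unknown ID returns `None` without touching the data file.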
Optionally, referring to fig. 2, each partition stores a plurality of vertex data files and edge data files, the query request of the target data file includes identification information of the target vertex data file, and the index file includes identification information of the vertex data file and storage location information of the corresponding vertex data file;
step S14 is to obtain a target data file according to the index file query, including:
step S141, obtaining the storage position information of the target data file according to the index file query by the identification information of the target vertex data file;
step S142, acquiring the target data file according to the storage location information of the target data file.
In the embodiment of the application, the index file is divided into a vertex index file and an edge index file, each storing a plurality of pieces of index data: the vertex index file stores the positions of vertices in the data file, and the edge index file stores the positions of edges of different types in the data file. The data in a node may thus be of two kinds, vertex and edge data files and vertex and edge index files, and vertex data and the edge data associated with a vertex are stored in the same partition; see FIG. 3. The organization of the vertex and edge files in a distributed environment is shown in FIG. 4, in which Node_i represents the i-th node in the cluster environment, Pn represents partition n, and the partition number of each partition is determined by the partition policy.
Optionally, referring to fig. 5, the index file further includes identification information of the vertex data file, storage location information of the corresponding vertex data file, and storage location information of the edge data file of the plurality of edges adjacent to the vertex;
step S14 is to obtain a target data file according to the index file query, including:
step S143, obtaining the storage position information of the edge data files of a plurality of adjacent edges of the target vertex according to the index file query by the identification information of the target vertex data file;
step S144, according to the storage location information of the edge data file, searching for the edge data files of multiple adjacent edges of the target vertex from the current partition, where the edge data files of adjacent edges in the index file are stored in a physically continuous storage space.
In the embodiment of the present application, each vertex stored in the vertex data file may include a vertex ID (label) and attribute data of the vertex. The vertex ID is a unique identifier for the vertex in the external interface, e.g., a graph database client can query for the corresponding vertex data by the vertex ID. The vertex index file stores the position of each vertex in the vertex data file, and the data of the vertex is obtained according to the position of the data file through inquiring the index.
In the embodiment of the present application, edges are stored in two files, an edge data file and an edge index file, whose structure is shown in FIG. 4. In the figure, the symbol in formula image BDA0003563650300000091 denotes the j-th edge of the i-th vertex, and l_k denotes the position in the edge data file of all edges of the k-th vertex. For example, for vertex 0 the corresponding edge index file records the position of the edge in formula image BDA0003563650300000092; the symbol in formula image BDA0003563650300000093 is the m1-th edge of vertex 0; the symbol in formula image BDA0003563650300000094 is the m2-th edge of vertex 1; the symbols in formula images BDA0003563650300000095 and BDA0003563650300000096 are the 1st and 2nd edges of vertex n, respectively; and the symbol in formula image BDA0003563650300000097 is the mk-th edge of vertex n.
The edge data file stores all data of each edge. The vertex ID needs to be stored only once, and each edge needs to store only its type and direction, the ID of the vertex at its other end, and its attribute data. All edge data of the same vertex are not only stored in a continuous physical storage space but also ordered by edge type, so that all neighbors of the vertex, or only the neighbors reached through a specific edge type, can be obtained quickly through the index file.
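The layout just described resembles an adjacency list packed into one array. The Python sketch below shows one way it could be organized, with a list standing in for the edge data file and a dictionary standing in for the edge index file. The tuple layout, the `"all"` and `"types"` keys, and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def build_edge_store(edges):
    """edges: iterable of (src, edge_type, direction, dst, props)."""
    by_vertex = defaultdict(list)
    for src, etype, direction, dst, props in edges:
        # The source vertex ID is kept once, as the index key, not per edge.
        by_vertex[src].append((etype, direction, dst, props))

    edge_data = []   # stands in for the edge data file
    edge_index = {}  # vertex ID -> ranges into edge_data
    for src in sorted(by_vertex):
        # Edges of one vertex are contiguous and ordered by edge type.
        group = sorted(by_vertex[src], key=lambda e: (e[0], e[2]))
        start = len(edge_data)
        type_ranges = {}
        for etype, direction, dst, props in group:
            pos = len(edge_data)
            first, _ = type_ranges.get(etype, (pos, pos))
            type_ranges[etype] = (first, pos + 1)
            edge_data.append((etype, direction, dst, props))
        edge_index[src] = {"all": (start, len(edge_data)), "types": type_ranges}
    return edge_data, edge_index


def neighbors(edge_data, edge_index, vertex_id, etype=None):
    """Return neighbor IDs of a vertex, optionally for one edge type."""
    entry = edge_index.get(vertex_id)
    if entry is None:
        return []
    if etype is None:
        start, end = entry["all"]
    else:
        start, end = entry["types"].get(etype, (0, 0))
    # One contiguous slice corresponds to one sequential read of the file.
    return [dst for _, _, dst, _ in edge_data[start:end]]
```

Because the edges of a vertex are ordered by type, a per-type query is also a single contiguous range, so no filtering pass over the vertex's other edges is needed.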
The method reorganizes the graph data storage structure with two design goals, reducing storage space and supporting depth traversal queries. A vertex and its incoming and outgoing edges can be stored on the same node, and edge data can be stored in a physically continuous storage space, so that neighbor vertices can be obtained efficiently and depth traversal queries can be performed. In addition, with a tree index structure optimized for graph queries, the position of vertex or edge data can be located directly by querying the index, which reduces the number of file reads and improves the performance of random vertex or edge queries.
Optionally, the index file is a multi-level tree index structure, and obtaining the target data file by querying the index file includes: obtaining the target data file from the index file by binary search. Specifically, the index file may be divided into a vertex index file and an edge index file, each storing a plurality of pieces of index data; the vertex index file may store the positions of vertices in the data file, and the edge index file may store the positions of edges of different types in the data file. The index file makes it possible to locate vertex or edge data in the data file quickly: to query vertex or edge data, it is only necessary to query the index file and then read the data file once.
Optionally, the index file includes a plurality of identification information sorted according to a preset order, and the target data file is obtained through binary search query according to the index file, including: and according to the index file, obtaining a target data file through binary search query according to a preset sequence.
In the embodiment of the present application, the structure of the index data may be as shown in FIG. 6. The ID is the unique identifier of a vertex or an edge, and one piece of index data consists of two parts, Head and Data, where int denotes an integer type, byte denotes a byte, SkipList denotes a skip list, and BlockPos denotes a block position. Because adjacent IDs in the vertex and edge data files may share a prefix, only the part of the current index entry that differs from the previous one needs to be recorded, and each piece of index data stores a batch of IDs with a common prefix, which reduces the amount of data stored.
For example, referring to FIG. 7, in the embodiment of the present application the index data forms a tree structure, and binary search reduces the number of times the index file must be queried. Only the first (lowest) layer stores IDs together with positions in the data file; each higher-layer index stores IDs together with positions pointing to nodes in the layer below. Take a query for the ID "abd200" as an example, and assume that the index file consists of three layers of indexes. The third-layer index is queried first, starting from its first index data "ab": "abd200" is larger than "ab", so the search continues along the third layer until an entry larger than the ID is reached, which establishes that "abd200" is larger than "ab" and smaller than "ac". Similarly, "abd" is found in the second-layer index and "abd101-200" in the lowest layer, so the position in the data file of the data with ID "abd200" is obtained quickly.
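The descent through the layers can be sketched in Python as below. This is a simplified model under stated assumptions: each upper-layer entry holds the first ID of a fixed-size block in the layer beneath it plus that block's starting position, and `fanout`, `build_levels`, and `lookup` are names invented for this example. The actual layout in FIG. 7 may differ.

```python
import bisect


def build_levels(entries, fanout=2):
    """entries: sorted (id, data position) pairs. Returns layers, lowest first.

    Each upper-layer entry is (first id of a lower-layer block, block start).
    """
    levels = [entries]
    while len(levels[-1]) > fanout:
        lower = levels[-1]
        upper = [(lower[i][0], i) for i in range(0, len(lower), fanout)]
        levels.append(upper)
    return levels


def lookup(levels, key, fanout=2):
    """Descend from the top layer, binary-searching one block per layer."""
    start, end = 0, len(levels[-1])
    for depth in range(len(levels) - 1, -1, -1):
        level = levels[depth]
        # Copying the block's keys keeps the sketch short; a real index
        # would search the block in place.
        keys = [k for k, _ in level[start:end]]
        i = bisect.bisect_right(keys, key) - 1  # last entry with id <= key
        if i < 0:
            return None
        found_id, pos = level[start + i]
        if depth == 0:
            return pos if found_id == key else None
        start, end = pos, min(pos + fanout, len(levels[depth - 1]))
    return None
```

Each layer narrows the search to one block of the layer below, so only a few index entries are examined per lookup before the single read of the data file.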
In the embodiment of the application, all the edge data of a vertex is not only stored in a contiguous physical storage space but is also stored in order of edge type, so that all neighbors of the vertex, or only the neighbors reached through a specific edge type, can be obtained quickly through the index file.
By the above method, the storage structures of the point and edge data and of the index file are optimized for graph query scenarios. Compared with a key-value database, the storage structure of the point and edge data uses storage space more efficiently and occupies less of it. The index data stores not only the positions of edges in the data file but also the positions of the edges of each edge type, which improves the performance of random edge queries.
Optionally, iteration and query over vertices and over edges follow similar designs and implementations. Taking edges as an example, the edge data iterator is implemented mainly by reading the edge index file to obtain the position of the edge data in the data file and then reading the edge data file sequentially during iteration. The edge data can be located quickly through the edge index file, and iteration performs well because the edge data is read sequentially and the designed storage structure is compact. A neighbor query for a vertex mainly looks up all edges of the vertex by the vertex ID: the start position of the vertex's edges in the edge data file is computed, and the data file is then read to obtain all of the edge data.
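A minimal sketch of such a neighbor query follows, assuming an in-memory list stands in for the edge data file and a dictionary stands in for the edge index file. Edges are sorted by starting vertex and then by edge type, so the edges of one vertex form one contiguous run that can be read in a single slice. The names `build_edge_store` and `neighbors` and the sample edges are hypothetical.

```python
from itertools import groupby


def build_edge_store(edges):
    """edges: iterable of (src, edge_type, dst).

    Returns (edge_file, edge_index), where edge_index maps each src to a list
    of (edge_type, start, count) runs inside edge_file.
    """
    edge_file = sorted(edges)  # by src, then edge type, then dst
    edge_index = {}
    pos = 0
    for (src, etype), group in groupby(edge_file, key=lambda e: (e[0], e[1])):
        count = len(list(group))
        edge_index.setdefault(src, []).append((etype, pos, count))
        pos += count
    return edge_file, edge_index


def neighbors(edge_file, edge_index, src, edge_type=None):
    """Return neighbor IDs of src, optionally restricted to one edge type."""
    runs = edge_index.get(src, [])
    if edge_type is not None:
        runs = [run for run in runs if run[0] == edge_type]
    if not runs:
        return []
    # All runs of one vertex are adjacent, so one sequential read suffices.
    start = runs[0][1]
    end = runs[-1][1] + runs[-1][2]
    return [dst for _, _, dst in edge_file[start:end]]


# Hypothetical sample edges.
edges = [("v1", "follows", "v2"), ("v2", "likes", "v3"),
         ("v1", "likes", "v3"), ("v1", "follows", "v4")]
edge_file, edge_index = build_edge_store(edges)
print(neighbors(edge_file, edge_index, "v1"))
print(neighbors(edge_file, edge_index, "v1", "likes"))
```

Keeping one run per edge type in the index is what allows a type-restricted query to skip directly to the relevant part of the run instead of filtering every edge of the vertex.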
By the above method, the graph query scenario is optimized: compared with a key-value database, the storage structure of the point and edge data uses storage space more efficiently and occupies less of it. The index data stores not only the positions of edges in the data file but also the positions of the edges of each edge type, which improves the performance of random edge queries.
Optionally, referring to fig. 8, before receiving a query request of a target data file in graph data, the method further includes:
step S81, acquiring a write-in request of a user, wherein the write-in request comprises data to be written in;
step S82, data classification is carried out on data to be written in to obtain a corresponding classification result, wherein the classification result is at least one of a vertex data file and an edge data file;
and step S83, storing the data to be written to the designated position according to the classification result, and establishing an index file according to the storage position, wherein the index file is at least one of a vertex index file and an edge index file.
Optionally, storing the data to be written to the designated location according to the classification result, and creating an index file according to the storage location, including: dividing the vertex data file and the edge data file into different partitions according to the classification result; sorting according to the identification information of the data files in each partition to obtain a corresponding sorting result; and generating a corresponding index file according to the sorting result.
Specifically, in the embodiment of the application, the storage node may acquire the write request of the user, parse the request, and classify the data, so that the vertex data and the edge data are stored in the vertex data file and the edge data file respectively, together with the corresponding vertex index file and edge index file. The storage node divides the vertex data and the edge data into different partitions according to the unique identifier of the vertex, where each piece of edge data is placed in the same partition as the starting point of the edge. The vertex data is then sorted by the unique identifier of the vertex, and each piece of data is stored sequentially in the vertex data file, where the vertex data includes the unique identifier of the vertex and its attribute data. The vertex index file stores the positions of the vertex data in the vertex data file and is organized into a tree structure according to the unique identifiers of the vertices. The edge data is sorted by the unique identifier of the starting point of the edge, and each piece of edge data is stored sequentially in the edge data file, so that all edge data with the same starting point is stored contiguously. The edge data includes the type of the edge, the starting point and end point of the edge, the unique identifier of the edge, and the attribute data of the edge. The edge index file stores the position of each starting point's edge data in the edge data file and is likewise organized into a tree structure according to the unique identifier of the starting point.
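The write path described above can be sketched as follows. This is an illustration under stated assumptions: partitioning uses a CRC32 hash of the vertex ID (the application only says the partition is chosen by the vertex identifier), records are serialized as JSON lines into byte buffers that stand in for the data files, and the index is kept as a flat sorted list rather than a tree. The names `partition_of` and `write_batch` are hypothetical.

```python
import json
import zlib


def partition_of(vertex_id, num_partitions):
    """Assumed partitioning rule: stable hash of the vertex ID."""
    return zlib.crc32(vertex_id.encode("utf-8")) % num_partitions


def write_batch(vertices, edges, num_partitions=4):
    """vertices: list of (vertex_id, attrs); edges: list of (src, etype, dst, attrs)."""
    parts = [{"v": [], "e": []} for _ in range(num_partitions)]
    for vid, attrs in vertices:
        parts[partition_of(vid, num_partitions)]["v"].append((vid, attrs))
    for edge in edges:
        # An edge goes to the partition of its starting point.
        parts[partition_of(edge[0], num_partitions)]["e"].append(edge)

    result = []
    for part in parts:
        vertex_file, vertex_index = bytearray(), []
        for vid, attrs in sorted(part["v"], key=lambda r: r[0]):
            vertex_index.append((vid, len(vertex_file)))  # byte offset
            vertex_file += (json.dumps([vid, attrs]) + "\n").encode("utf-8")

        edge_file, edge_index = bytearray(), []
        for rec in sorted(part["e"], key=lambda r: (r[0], r[1], r[2])):
            # Record only the offset of the first edge of each starting point.
            if not edge_index or edge_index[-1][0] != rec[0]:
                edge_index.append((rec[0], len(edge_file)))
            edge_file += (json.dumps(list(rec)) + "\n").encode("utf-8")

        result.append({"vertex_file": vertex_file, "vertex_index": vertex_index,
                       "edge_file": edge_file, "edge_index": edge_index})
    return result


# Hypothetical sample data, written into a single partition for clarity.
vertices = [("b", {"n": 2}), ("a", {"n": 1})]
edges = [("a", "likes", "b", {}), ("a", "follows", "b", {})]
store = write_batch(vertices, edges, num_partitions=1)
print(store[0]["vertex_index"], store[0]["edge_index"])
```

Because both files are written in sorted order, the index entries are produced already sorted, which is the precondition for the binary search used at query time.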
In a second aspect of the embodiments of the present application, there is provided an apparatus for querying graph data, referring to fig. 9, including:
a request receiving module 901, configured to receive a query request for a target data file in graph data;
a partition identifying module 902, configured to identify a target partition in target nodes in graph data according to a query request and a preset corresponding relationship, where the graph data includes multiple nodes, and each node includes multiple partitions;
a file obtaining module 903, configured to obtain an index file of a target partition, where the index file includes storage location information of multiple data files;
and the file query module 904 is configured to query the index file to obtain a target data file.
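How the four modules cooperate can be sketched as a short routing and lookup routine. The application does not specify the form of the preset correspondence, so this sketch assumes a CRC32 hash from vertex ID to partition and a fixed table from partition to node. The names `route`, `store_vertex`, `query_vertex`, `NUM_PARTITIONS`, and `PARTITION_TO_NODE` are hypothetical, and a dictionary stands in for each partition's index file.

```python
import zlib

NUM_PARTITIONS = 8  # assumed partition count
# Assumed "preset correspondence": partition number -> storage node name.
PARTITION_TO_NODE = {p: f"node-{p % 3}" for p in range(NUM_PARTITIONS)}


def route(vertex_id):
    """Return (node, partition) for a vertex ID using a stable hash."""
    partition = zlib.crc32(vertex_id.encode("utf-8")) % NUM_PARTITIONS
    return PARTITION_TO_NODE[partition], partition


def store_vertex(cluster, vertex_id, record):
    """Append a record to its partition and note its position in the index."""
    node, partition = route(vertex_id)
    index, data_file = cluster.setdefault(node, {}).setdefault(partition, ({}, []))
    index[vertex_id] = len(data_file)
    data_file.append(record)


def query_vertex(cluster, vertex_id):
    """Identify the partition, consult its index, then read the data file once."""
    node, partition = route(vertex_id)
    index, data_file = cluster.get(node, {}).get(partition, ({}, []))
    position = index.get(vertex_id)
    return None if position is None else data_file[position]


cluster = {}
store_vertex(cluster, "abd200", {"id": "abd200"})
print(query_vertex(cluster, "abd200"))
```

Only the index of the one target partition is consulted, so the cost of a query does not grow with the number of partitions held by other nodes.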
Optionally, each partition stores a plurality of vertex data files and edge data files, the query request of the target data file includes identification information of the target vertex data file, and the index file includes identification information of the vertex data file and storage location information of the corresponding vertex data file;
a file query module comprising:
the position information query submodule is used for obtaining the storage position information of the target data file according to the index file query by the identification information of the target vertex data file;
and the target file acquisition submodule is used for acquiring the target data file according to the storage position information of the target data file.
Optionally, the index file further includes identification information of the vertex data file, storage location information of the corresponding vertex data file, and storage location information of edge data files of a plurality of edges adjacent to the vertex;
a file query module comprising:
the adjacent position information acquisition submodule is used for inquiring the storage position information of the edge data files of a plurality of adjacent edges of the target vertex according to the index file through the identification information of the target vertex data file;
and the adjacent data file searching submodule is used for searching the current partition for the edge data files of the plurality of adjacent edges of the target vertex according to the storage location information of the edge data files.
Optionally, the index file is a multi-level tree index structure, and the file query module is specifically configured to obtain the target data file by binary search query according to the index file.
Optionally, the index file is a multi-level tree index structure, the index file includes a plurality of identification information sorted according to a preset order,
and the file query module is specifically used for obtaining the target data file through binary search query according to the index file and the preset sequence.
Optionally, the apparatus further comprises:
the request acquisition module is used for acquiring a write request of a user, wherein the write request comprises data to be written;
the classification result acquisition module is used for performing data classification on data to be written to obtain a corresponding classification result, wherein the classification result is at least one of a vertex data file and an edge data file;
and the data writing module is used for storing the data to be written to the designated position according to the classification result and establishing an index file according to the storage position, wherein the index file is at least one of a vertex index file and an edge index file.
Optionally, the data writing module includes:
the partition division submodule is used for dividing the vertex data file and the edge data file into different partitions according to the classification result;
the data sorting submodule is used for sorting according to the identification information of the data files in each partition to obtain a corresponding sorting result;
and the index generation submodule is used for generating a corresponding index file according to the sorting result.
Therefore, the device of the embodiment of the application can receive the query request of the target graph data; identifying the storage position of the target data through a pre-stored index file; and acquiring target data according to the storage position. Therefore, the storage position of the target data is identified through the pre-stored index file, and the target data is quickly searched.
The embodiment of the present invention further provides an electronic device, as shown in fig. 10, which includes a processor 1001, a communication interface 1002, a memory 1003 and a communication bus 1004, wherein the processor 1001, the communication interface 1002 and the memory 1003 complete mutual communication through the communication bus 1004,
a memory 1003 for storing a computer program;
the processor 1001 is configured to implement the following steps when executing the program stored in the memory 1003:
receiving a query request of a target data file in graph data;
identifying and obtaining a target partition in a target node in graph data through the query request and a preset corresponding relation, wherein the graph data comprises a plurality of nodes, and each node comprises a plurality of partitions;
acquiring an index file of a target partition, wherein the index file comprises storage position information of a plurality of data files;
and inquiring according to the index file to obtain the target data file.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.
In another embodiment provided by the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the graph data query method of any of the above embodiments.
In another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to execute the graph data query method of any one of the above embodiments.
In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention occur in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on differences from other embodiments. In particular, for the apparatus, the electronic device, the storage medium and the computer program product embodiment, since they are substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (12)

1. A method for querying graph data, the method comprising:
receiving a query request of a target data file in graph data;
identifying and obtaining a target partition in a target node in the graph data according to the query request and a preset corresponding relation, wherein the graph data comprises a plurality of nodes, and each node comprises a plurality of partitions;
acquiring an index file of the target partition, wherein the index file comprises storage position information of a plurality of data files;
querying the index file with the identification information of the target vertex data file to obtain the storage position information of the target data file; acquiring the target data file according to the storage position information of the target data file, wherein each partition stores a plurality of vertex data files and edge data files, the query request of the target data file comprises identification information of the target vertex data file, and the index file comprises the identification information of the vertex data file and the storage position information of the corresponding vertex data file;
and/or,
through the identification information of the target vertex data file, the storage position information of the edge data files of a plurality of adjacent edges of the target vertex is obtained through the query of the index file; and searching the edge data files of a plurality of adjacent edges of the target vertex from the current partition according to the storage position information of the edge data files, wherein the edge data files of the adjacent edges in the index file are stored in a physically continuous storage space, and the index file further comprises identification information of the vertex data file, storage position information of the corresponding vertex data file and storage position information of the edge data files of the plurality of adjacent edges of the vertex.
2. The method of claim 1, wherein the index file is a multi-level tree index structure, and the querying the target data file according to the index file comprises:
and according to the index file, obtaining a target data file through binary search query.
3. The method of claim 2, wherein the index file includes a plurality of identification information sorted in a preset order, and obtaining the target data file by a binary search query according to the index file includes:
and according to the index file, obtaining a target data file through binary search query according to a preset sequence.
4. The method of claim 1, wherein prior to receiving the query request for the target data file in the graph data, the method further comprises:
acquiring a write-in request of a user, wherein the write-in request comprises data to be written;
performing data classification on the data to be written to obtain a corresponding classification result, wherein the classification result is at least one of a vertex data file and an edge data file;
and storing the data to be written to a designated position according to the classification result, and establishing an index file according to the storage position, wherein the index file is at least one of a vertex index file and an edge index file.
5. The method according to claim 4, wherein the storing the data to be written to a designated location according to the classification result and creating an index file according to the storage location comprises:
dividing the vertex data file and the edge data file into different partitions according to the classification result;
sorting according to the identification information of the data files in each partition to obtain a corresponding sorting result;
and generating a corresponding index file according to the sorting result.
6. An apparatus for querying graph data, the apparatus comprising:
the request receiving module is used for receiving a query request of a target data file in the graph data;
a partition identification module, configured to identify and obtain a target partition in target nodes in the graph data according to the query request and a preset correspondence, where the graph data includes multiple nodes, and each node includes multiple partitions;
the file acquisition module is used for acquiring an index file of the target partition, wherein the index file comprises storage position information of a plurality of data files;
the file query module is used for querying the index file with the identification information of the target vertex data file to obtain the storage position information of the target data file; acquiring the target data file according to the storage position information of the target data file, wherein each partition stores a plurality of vertex data files and edge data files, the query request of the target data file comprises identification information of the target vertex data file, and the index file comprises the identification information of the vertex data file and the storage position information of the corresponding vertex data file; and/or querying the index file with the identification information of the target vertex data file to obtain the storage position information of the edge data files of a plurality of adjacent edges of the target vertex; and searching the edge data files of a plurality of adjacent edges of the target vertex from the current partition according to the storage position information of the edge data files, wherein the edge data files of the adjacent edges in the index file are stored in a physically continuous storage space, and the index file further comprises identification information of the vertex data file, storage position information of the corresponding vertex data file and storage position information of the edge data files of the plurality of adjacent edges of the vertex.
7. The apparatus of claim 6, wherein the index file is a multi-level tree index structure;
the file query module is specifically used for obtaining a target data file through binary search query according to the index file.
8. The apparatus of claim 7, wherein the index file includes a plurality of identification information sorted in a preset order,
the file query module is specifically configured to obtain a target data file through binary search query according to the index file and a preset sequence.
9. The apparatus of claim 6, further comprising:
the request acquisition module is used for acquiring a write request of a user, wherein the write request comprises data to be written;
the classification result acquisition module is used for carrying out data classification on the data to be written to obtain a corresponding classification result, wherein the classification result is at least one of a vertex data file and an edge data file;
and the data writing module is used for storing the data to be written to a designated position according to the classification result and establishing an index file according to the storage position, wherein the index file is at least one of a vertex index file and an edge index file.
10. The apparatus of claim 9, wherein the data writing module comprises:
the partition division submodule is used for dividing the vertex data file and the edge data file into different partitions according to the classification result;
the data sorting submodule is used for sorting according to the identification information of the data files in each partition to obtain a corresponding sorting result;
and the index generation submodule is used for generating a corresponding index file according to the sorting result.
11. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any one of claims 1 to 5 when executing a program stored in the memory.
12. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-5.
CN202210296632.1A 2022-03-24 2022-03-24 Graph data query method and device, electronic equipment and storage medium Pending CN114691721A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210296632.1A CN114691721A (en) 2022-03-24 2022-03-24 Graph data query method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210296632.1A CN114691721A (en) 2022-03-24 2022-03-24 Graph data query method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114691721A true CN114691721A (en) 2022-07-01

Family

ID=82138752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210296632.1A Pending CN114691721A (en) 2022-03-24 2022-03-24 Graph data query method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114691721A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115033599A (en) * 2022-08-12 2022-09-09 深圳市洞见智慧科技有限公司 Graph query method, system and related device based on multi-party security
CN115033599B (en) * 2022-08-12 2022-11-11 深圳市洞见智慧科技有限公司 Graph query method, system and related device based on multi-party security
CN115481298A (en) * 2022-11-14 2022-12-16 阿里巴巴(中国)有限公司 Graph data processing method and electronic equipment
CN116403684A (en) * 2023-06-08 2023-07-07 杭州医策科技有限公司 Digital pathological image loading method and device
CN116403684B (en) * 2023-06-08 2023-08-11 杭州医策科技有限公司 Digital pathological image loading method and device
CN117235120A (en) * 2023-11-09 2023-12-15 支付宝(杭州)信息技术有限公司 Hypergraph data storage and query method and device with time sequence characteristics

Similar Documents

Publication Publication Date Title
CN114691721A (en) Graph data query method and device, electronic equipment and storage medium
CN107491487B (en) Full-text database architecture and bitmap index creation and data query method, server and medium
CN109325032B (en) Index data storage and retrieval method, device and storage medium
CN111801665B (en) Hierarchical Locality Sensitive Hash (LSH) partition index for big data applications
CN112287182A (en) Graph data storage and processing method and device and computer storage medium
CN109033278B (en) Data processing method and device, electronic equipment and computer storage medium
CA3033173A1 (en) Systems, methods, and data structures for high-speed searching or filtering of large datasets
US20150058352A1 (en) Thin database indexing
CN112765405B (en) Method and system for clustering and inquiring spatial data search results
US11327985B2 (en) System and method for subset searching and associated search operators
CN114491172B (en) Rapid retrieval method, device and equipment for tree structure nodes and storage medium
US11853279B2 (en) Data storage using vectors of vectors
CN112434027A (en) Indexing method and device for multi-dimensional data, computer equipment and storage medium
US20230252012A1 (en) Method for indexing data
CN113656397A (en) Index construction and query method and device for time series data
CN113779286B (en) Method and device for managing graph data
CN116126864A (en) Index construction method, data query method and related equipment
CN110889424B (en) Vector index establishing method and device and vector retrieving method and device
KR102354343B1 (en) Spatial indexing method and apparatus for blockchain-based geospatial data
CN115809268B (en) Adaptive query method and device based on fragment index
CN110825747A (en) Information access method, device and medium
Yagoubi et al. Radiussketch: massively distributed indexing of time series
CN115495462A (en) Batch data updating method and device, electronic equipment and readable storage medium
CN114416741A (en) KV data writing and reading method and device based on multi-level index and storage medium
CN114398373A (en) File data storage and reading method and device applied to database storage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination