CN114020781A - Query task optimization method based on scientific and technological consultation large-scale graph data - Google Patents

Publication number
CN114020781A
Authority
CN
China
Prior art keywords
query
node
nodes
optimization method
query task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111316037.1A
Other languages
Chinese (zh)
Inventor
鄂海红
宋美娜
梁静茹
刘雨薇
魏秋实
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202111316037.1A priority Critical patent/CN114020781A/en
Publication of CN114020781A publication Critical patent/CN114020781A/en
Priority to PCT/CN2022/087215 priority patent/WO2023077731A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology

Abstract

A query task optimization method, system, and storage medium based on scientific and technological consultation large-scale graph data. The method acquires the identifier of a query task and selects a corresponding query optimization method according to that identifier, where the query optimization methods include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views. The graph database is then queried using the selected query optimization method, and the query result is output. Because the query optimization method is selected according to the identifier of the query task, the flexibility of querying is improved. In addition, the query optimization methods improve the efficiency of query tasks in different scenarios over scientific and technological consultation large-scale graph data, reduce the computational complexity of queries, and shorten query time.

Description

Query task optimization method based on scientific and technological consultation large-scale graph data
Technical Field
The application relates to the field of large-scale graph data query, in particular to a query task optimization method and device based on scientific and technological consultation large-scale graph data and a storage medium.
Background
Querying graph data is one of the most fundamental problems in the field of knowledge graphs. Efficient query processing over large-scale graph data is therefore generally required so that users can obtain query results quickly.
Although query optimization techniques for graph data have advanced considerably, problems remain. For example, graph partitioning, a technique used for graph query optimization, can split graph data across multiple servers, but communication between the servers is costly and adds processing overhead. Moreover, most query optimization techniques are designed for social network graph data and are not suitable for the complex topological structure of graph data in scientific and technological consultation scenarios. How to optimize query tasks over scientific and technological consultation large-scale graph data is therefore a problem that urgently needs to be solved.
Disclosure of Invention
The application provides a query task optimization method, system, and storage medium based on scientific and technological consultation large-scale graph data.
An embodiment of a first aspect of the present application provides a method for optimizing a query task based on scientific and technological consulting large-scale graph data, including:
acquiring an identifier of a query task;
selecting a corresponding query optimization method according to the identifier of the query task, wherein the query optimization methods include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views;
and querying the graph database by using the query optimization method, and outputting a query result.
An embodiment of a second aspect of the present application provides a query task optimization system based on scientific and technological consulting large-scale graph data, including:
an acquisition module, configured to acquire the identifier of the query task;
a selection module, configured to select a corresponding query optimization method according to the identifier of the query task, wherein the query optimization methods include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views;
and a display module, configured to query the graph database by using the query optimization method and to output a query result.
A computer storage medium provided in an embodiment of the third aspect of the present application, where the computer storage medium stores computer-executable instructions; the computer executable instructions, when executed by a processor, are capable of performing the method of the first aspect as described above.
A computer device according to an embodiment of a fourth aspect of the present application includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method according to the first aspect is implemented.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
the query task optimization method, system, and storage medium based on scientific and technological consultation large-scale graph data acquire the identifier of a query task and select a corresponding query optimization method according to that identifier, where the query optimization methods include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views; the graph database is then queried using the selected query optimization method, and the query result is output. Because the query optimization method is selected according to the identifier of the query task, the flexibility of querying is improved. In addition, the query optimization methods improve the efficiency of query tasks in different scenarios over scientific and technological consultation large-scale graph data, reduce the computational complexity of queries, and shorten query time.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a query task optimization method based on scientific and technological consulting large-scale graph data according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a query task optimization system based on scientific and technological consulting large-scale graph data according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The following describes a query task optimization method and system based on scientific and technological consulting large-scale graph data according to an embodiment of the present application with reference to the accompanying drawings.
Example one
Fig. 1 is a schematic flowchart of a query task optimization method based on scientific and technological consulting large-scale graph data according to an embodiment of the present application, and as shown in fig. 1, the method may include:
Step 101, obtaining the identifier of the query task.
It should be noted that, in the embodiment of the present disclosure, the query task may include an organization, a talent, and an industry chain. In the embodiment of the present disclosure, the organization may be an ID of a company, and the talent may be a person
In the embodiment of the present disclosure, the identifier of the query task may be obtained according to the content of the query task. For example, in the embodiment of the present disclosure, assuming that the query task is to view company and patent conditions associated with a certain person, the identifier of the query task is obtained.
Step 102, selecting a corresponding query optimization method according to the identifier of the query task, wherein the query optimization methods include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views.
In the embodiment of the present disclosure, different identifiers correspond to different query optimization methods, and the corresponding query optimization method may be selected according to the identifier of the query task.
In the embodiment of the present disclosure, the query optimization methods may include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views.
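The selection step can be pictured as a lookup from task identifier to optimization strategy. The following is a minimal sketch only: the identifier strings, the function names, and the placeholder return values are all hypothetical, since the source does not list concrete identifier values or interfaces.

```python
from typing import Callable, Dict

# Placeholder strategies: each one only tags the query so the dispatch is visible.
# All names below are illustrative assumptions, not taken from the source.
def bidirectional_bfs_strategy(query: str) -> str:
    return f"bidirectional-bfs:{query}"

def cardinality_reduction_strategy(query: str) -> str:
    return f"cardinality-reduction:{query}"

def pattern_advancement_strategy(query: str) -> str:
    return f"pattern-advancement:{query}"

def materialized_view_strategy(query: str) -> str:
    return f"materialized-view:{query}"

# Hypothetical identifiers mapped to the four optimization methods.
OPTIMIZERS: Dict[str, Callable[[str], str]] = {
    "path_between_entities": bidirectional_bfs_strategy,
    "distinct_tuples": cardinality_reduction_strategy,
    "negative_filter": pattern_advancement_strategy,
    "hot_aggregate": materialized_view_strategy,
}

def select_optimizer(task_id: str) -> Callable[[str], str]:
    """Return the strategy registered for a query task identifier."""
    try:
        return OPTIMIZERS[task_id]
    except KeyError:
        raise ValueError(f"no optimization method registered for {task_id!r}") from None
```

A table-driven dispatch keeps the mapping between identifiers and strategies in one place, so a new query scenario can be supported by adding one entry.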
Further, in the embodiment of the present disclosure, the graph traversal expansion order strategy is adjusted for the actual query scenarios of scientific and technological consultation by designing a bidirectional BFS expansion order. The search starts from both the starting point and the end point; as soon as one direction reaches a position already visited by the other direction (that is, some state has been visited from both directions), a shortest path connecting the starting point and the end point has been found. Because the two searches converge and meet near the midpoint of the path, the number of nodes visited by the bidirectional BFS is on the order of $2 \times n^{m/2+1}$.
Specifically, in the embodiment of the present disclosure, adjusting the graph traversal expansion order policy may include the following steps:
S11, inputting a source entity node and a target entity node, together with the intermediate entity node type mtype and the path pattern pattern;
S12, initializing two node sets S1 and S2, wherein S1 is initialized to the input source entity node and S2 is initialized to the input target entity node;
S13, calculating the expansion order of the bidirectional BFS from pattern and mtype, with pattern1 denoting the expansion order from the left end and pattern2 denoting the expansion order from the right end;
S14, if S1 or S2 is not empty, continuing with step S15; otherwise, executing step S111;
S15, letting S be the set of nodes expanded at the current layer;
S16, swapping S1 and S2, so that expansion alternates between the left end and the right end;
S17, for each node in the set S1, expanding the next-layer neighbor nodes of the node according to the pattern, denoted next_nodes;
S18, checking each node in next_nodes; if the node is already in the set S, a path has been found, and step S111 is executed;
S19, adding all next_nodes of the nodes expanded at this layer to the set S, copying the set S to S1, and storing the path;
S110, returning to step S14;
S111, ending.
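Steps S11 to S111 can be sketched as an alternating two-frontier search. This is an illustrative sketch under simplifying assumptions, not the patented procedure itself: the graph is an undirected adjacency dictionary, the pattern and mtype constraints of S13 are omitted, and the names `bidirectional_bfs` and `_join` are hypothetical. The loop also stops as soon as either frontier is empty, since no meeting is possible after that.

```python
from typing import Dict, Hashable, List, Optional

Graph = Dict[Hashable, List[Hashable]]

def bidirectional_bfs(graph: Graph, source, target) -> Optional[list]:
    """Return a path from source to target, or None if they are not connected."""
    if source == target:
        return [source]
    # Parent maps double as visited sets and allow the path to be rebuilt.
    parents_fwd = {source: None}
    parents_bwd = {target: None}
    frontier_fwd, frontier_bwd = [source], [target]
    forward_turn = True
    while frontier_fwd and frontier_bwd:
        if forward_turn:
            frontier, parents, other = frontier_fwd, parents_fwd, parents_bwd
        else:
            frontier, parents, other = frontier_bwd, parents_bwd, parents_fwd
        next_nodes = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor in parents:
                    continue  # already visited from this side
                parents[neighbor] = node
                if neighbor in other:  # the two searches have met
                    return _join(neighbor, parents_fwd, parents_bwd)
                next_nodes.append(neighbor)
        if forward_turn:
            frontier_fwd = next_nodes
        else:
            frontier_bwd = next_nodes
        forward_turn = not forward_turn  # alternate left-end and right-end expansion
    return None

def _join(meet, parents_fwd, parents_bwd) -> list:
    """Stitch source..meet (forward parents) to meet..target (backward parents)."""
    path, node = [], meet
    while node is not None:
        path.append(node)
        node = parents_fwd[node]
    path.reverse()
    node = parents_bwd[meet]
    while node is not None:
        path.append(node)
        node = parents_bwd[node]
    return path
```

Keeping one parent map per direction makes the meeting test a single dictionary lookup and lets the full path be reconstructed without storing every partial path.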
For example, in the embodiment of the present disclosure, the query task gives an industry chain label tag and person information person, and queries, starting from tag, the sub-industry chain labels, the patents belonging to those sub-industry chain labels, the companies owning those patents, and the persons related to those companies through employment or investment. In the constructed scientific and technological consultation knowledge graph, the path industry chain label - sub-industry chain label - patent produces 146284 intermediate patent nodes. If these 146284 patents are expanded with a unidirectional BFS, the intermediate results explode and query performance is seriously degraded.
If the bidirectional BFS graph traversal expansion order optimization strategy of the embodiment of the present disclosure is used, the search proceeds from both the starting point and the end point, that is, along the two directions industry chain label - sub-industry chain label - patent and person - company - patent. The 146284 intermediate patent nodes produced along the industry chain label - sub-industry chain label - patent direction are loaded into a hash table. A result set is then generated in the reverse direction from the person node along the person - company - patent path, and the two are intersected to find the paths that connect the starting point and the end point and satisfy the conditions. The time complexity is only O(n).
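The hash-table intersection described above can be illustrated as follows. This is a sketch with hypothetical names and toy inputs: it assumes the patents reachable from the industry chain label have already been collected, and that the person's companies and each company's patents are available as plain Python collections.

```python
from typing import Dict, List, Set, Tuple

def connect_via_patents(
    patents_from_tag: List[str],
    companies_of_person: List[str],
    patents_of_company: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Return (company, patent) pairs at which the two half-paths meet."""
    # Left half-path: build the hash table once from the tag-side patents.
    patent_index: Set[str] = set(patents_from_tag)
    matches = []
    # Right half-path: walk person -> company -> patent and probe the table.
    for company in companies_of_person:
        for patent in patents_of_company.get(company, []):
            if patent in patent_index:  # average-case O(1) membership test
                matches.append((company, patent))
    return matches
```

Each patent on the person side is checked with one set lookup, so the cost grows linearly with the sizes of the two half-results rather than with their product.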
Further, in the embodiment of the present disclosure, cardinality denotes the number of unique values after deduplication; for example, column cardinality refers to the number of distinct values contained in a column. This number directly affects model compression and the performance of the engine when scanning. It is therefore desirable to minimize cardinality in order to reduce the time required for queries.
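As a concrete reading of this definition, the cardinality of a column is simply the size of its set of values. The helper name below is hypothetical and the sample data is invented for illustration.

```python
from typing import Hashable, Iterable

def column_cardinality(values: Iterable[Hashable]) -> int:
    """Number of distinct values in a column."""
    return len(set(values))

# Five rows, but only three distinct company IDs.
company_column = ["c1", "c2", "c1", "c3", "c2"]
distinct_companies = column_cardinality(company_column)  # 3
```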
In the embodiment of the present disclosure, the cardinality reduction may include the following steps:
S21, inputting a source entity node and a path pattern pattern;
S22, letting next_nodes be the set of nodes of the next expansion layer, initialized to the next-layer neighbor nodes obtained by expanding the source entity node according to the pattern;
S23, deduplicating next_nodes;
S24, letting q be a node queue, initialized to next_nodes;
S25, if q is not empty, continuing with step S26; otherwise, executing step S212;
S26, setting size to the current number of nodes in the queue;
S27, if size is not zero, continuing with step S28; otherwise, executing step S211;
S28, popping one node from the current queue;
S29, expanding the next-layer neighbor nodes next_nodes of the node according to the pattern;
S210, adding next_nodes to the queue q;
S211, if the pattern has been fully traversed, continuing with step S212; otherwise, executing step S25;
S212, ending.
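The layer-by-layer expansion with early deduplication can be sketched as below. The sketch makes several assumptions that are not in the source: edges are stored in a dictionary keyed by (node, edge label), each pattern step is a tuple of admissible edge labels, and deduplication is applied after every layer rather than only after the first. The function name and the sample labels are hypothetical.

```python
from typing import Dict, List, Tuple

# (node, edge label) -> neighbor list; parallel edges may repeat a neighbor.
Edges = Dict[Tuple[str, str], List[str]]

def expand_with_dedup(
    edges: Edges, source: str, pattern: List[Tuple[str, ...]]
) -> List[List[str]]:
    """Expand layer by layer along `pattern`, deduplicating every layer."""
    layers = []
    frontier = [source]
    for labels in pattern:
        next_nodes = []
        for node in frontier:
            for label in labels:
                next_nodes.extend(edges.get((node, label), []))
        # dict.fromkeys drops duplicates while keeping first-seen order,
        # so repeated nodes are never expanded again in the next layer.
        frontier = list(dict.fromkeys(next_nodes))
        layers.append(frontier)
    return layers
```

Deduplicating before the next expansion means each distinct node is expanded once per layer, which is where the saving in intermediate results comes from.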
For example, in the embodiment of the present disclosure, in the knowledge graph of an actual scientific and technological consultation scenario, two nodes may be connected by multiple edges or by edges of different types. For instance, three relationships exist between a "company" node and a "person" node: "company-investor", "company-major shareholder-person", and "company-employed person". Therefore, looking up the "person" nodes adjacent to a company may reach the same "person" node through more than one of these three relationships, producing duplicate nodes. These redundant duplicates increase cardinality, and when the duplicated "person" nodes continue to search for their adjacent nodes, the traversal is repeated, which increases the number of intermediate nodes and lengthens the query time. Therefore, in the embodiment of the present disclosure, an early-DISTINCT (deduplicate in advance) optimization strategy is used to reduce cardinality.
Specifically, in the embodiment of the present disclosure, the query task in the scientific and technological consultation scenario is as follows: given a person, find the associated companies, the patents owned by those companies, and the industry chain labels to which those patents belong, and output the non-duplicate (company, patent, industry chain label) tuples that conform to the path. The embodiment of the present disclosure uses the early-DISTINCT strategy to reduce cardinality: deduplication is performed immediately after duplicate nodes are generated, that is, immediately after the traversal from the "person" node reaches the "company" nodes. In this way, 201 intermediate company nodes containing duplicates are reduced to 131 distinct company nodes, which reduces the number of intermediate nodes generated and shortens the subsequent traversal time.
Further, in the embodiment of the present disclosure, target data needs to be acquired and screened according to business conditions; this process is the filtering step of a data query. Large-scale graph query tasks contain a large number of filtering operations, and the filtering conditions used in them, such as basic comparison operators (for example, < and >), logical operations (AND, OR, NOT), and pattern matching, are necessary for obtaining accurate data.
In the embodiment of the present disclosure, the pattern advancement may include the following steps:
S31, inputting a source entity node, a path pattern pattern, and a filter pattern filter_pattern;
S32, initializing the pattern advancement set filter_nodes;
S33, letting q be a node queue, initialized to the input source entity node;
S34, if q is not empty, continuing with step S35; otherwise, executing step S313;
S35, initializing size to the current number of nodes in the queue;
S36, if size is not zero, continuing with step S37; otherwise, executing step S312;
S37, popping one node from the current queue;
S38, expanding the next-layer neighbor nodes next_nodes of the node according to the pattern;
S39, determining whether the node type of the current next_nodes is the node type of filter_nodes; if so, continuing with step S310; otherwise, executing step S311;
S310, traversing each next_node in the next_nodes set, and filtering out the node if it is in the filter_nodes set;
S311, adding next_nodes to the queue q;
S312, if the pattern has been fully traversed, continuing with step S313; otherwise, executing step S35;
S313, ending.
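The steps above can be sketched as a layered expansion that drops excluded nodes as soon as a layer of the filtered type is reached. The data layout is an assumption made for illustration: edges are keyed by (node, edge label), each pattern step carries the type of the nodes it reaches, and `filter_nodes` is assumed to have been computed beforehand from the filter pattern. All names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edges = Dict[Tuple[str, str], List[str]]

def expand_with_early_filter(
    edges: Edges,
    source: str,
    pattern: List[Tuple[str, str]],  # (edge label, type of the nodes reached)
    filter_type: str,
    filter_nodes: Set[str],
) -> List[str]:
    """Expand along `pattern`, removing filtered nodes at the matching layer."""
    frontier = [source]
    for label, reached_type in pattern:
        next_nodes = []
        for node in frontier:
            next_nodes.extend(edges.get((node, label), []))
        if reached_type == filter_type:
            # Set membership replaces a per-node traversal of the filter pattern.
            next_nodes = [n for n in next_nodes if n not in filter_nodes]
        frontier = next_nodes
    return frontier
```

Filtering at the layer where the excluded type first appears prevents the excluded nodes from being expanded any further.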
For example, in the embodiment of the present disclosure, the query task in the scientific and technological consultation scenario is as follows: given industry chain label information tag, query the companies associated with tag and the patents owned by those companies, subject to a filtering condition that the company must not have an abnormal operation record, that is, the pattern company - abnormal operation must not exist; the non-duplicate (company, patent) tuples are output.
Specifically, pattern advancement in the embodiment of the present disclosure replaces the traversal operation in the pattern with an efficient set lookup. The company - abnormal operation pattern is evaluated in advance, and the IDs of the companies associated with "abnormal operation" nodes are placed in a hash table. The filtering condition then only needs to check whether a "company" node exists in the hash table; if it does not, the company has no abnormal operation record. The set lookup requires only 3292 probes of O(1) time complexity each, which improves query efficiency.
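A minimal sketch of this precomputation follows, assuming the company - abnormal operation edges are available as (company ID, abnormal record ID) pairs. The function names and sample IDs are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

def build_abnormal_index(abnormal_edges: Iterable[Tuple[str, str]]) -> Set[str]:
    """Collect the IDs of companies that have an abnormal operation edge."""
    return {company_id for company_id, _record_id in abnormal_edges}

def keep_normal_companies(
    company_ids: Iterable[str], abnormal_index: Set[str]
) -> List[str]:
    """Keep companies absent from the index, one hash lookup per company."""
    return [cid for cid in company_ids if cid not in abnormal_index]

# Built once in advance, then reused for every candidate company.
abnormal_index = build_abnormal_index([("c2", "a1"), ("c2", "a2"), ("c5", "a3")])
normal = keep_normal_companies(["c1", "c2", "c3"], abnormal_index)
```

Here `normal` contains `c1` and `c3`: each candidate costs a single set probe instead of a traversal to its neighboring abnormal operation nodes.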
Further, in the embodiment of the present disclosure, the materialized view is mainly used to precompute and store the results of time-consuming operations such as table joins or aggregations, so that these operations can be avoided when the query task is executed later and the query result can be obtained quickly. In the scientific and technological consultation scenario, the materialized view greatly improves query performance for hot questions that repeatedly use the same query result, because the data can be read quickly from the materialized view.
For example, in the embodiment of the present disclosure, in a scientific and technological consultation scenario, the query task gives industry chain label information tag and queries, starting from tag, its sub-industry chain labels and the companies belonging to those sub-industry chain labels. Taking each sub-industry chain label as the starting node, the task then queries the paths that pass through patents and finally reach company nodes, and counts the company information and the number of patents that conform to the pattern. Querying each company separately is time-consuming. With the materialized view method of the embodiment of the present disclosure, the patents owned by each company can be obtained in advance, the industry chain labels to which each patent belongs can be determined and aggregated, and the number of patents under each industry chain label can be written into an attribute of the "company-industry chain label" edge. The precomputed materialized view thus improves query efficiency.
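The precomputation in this example can be sketched as an aggregation keyed by (company, industry chain label). The dictionary-based inputs and the function name are assumptions made for illustration; in a graph database the resulting counts would be stored as an attribute on the "company-industry chain label" edge.

```python
from collections import Counter
from typing import Dict, List, Tuple

def build_patent_count_view(
    company_patents: Dict[str, List[str]],
    patent_tags: Dict[str, List[str]],
) -> Dict[Tuple[str, str], int]:
    """Precompute the number of patents per (company, industry chain label)."""
    view: Counter = Counter()
    for company, patents in company_patents.items():
        for patent in patents:
            for tag in patent_tags.get(patent, []):
                view[(company, tag)] += 1
    return dict(view)

# Built once; later queries read the count with a single dictionary lookup.
view = build_patent_count_view(
    {"c1": ["p1", "p2"], "c2": ["p3"]},
    {"p1": ["t1"], "p2": ["t1", "t2"], "p3": ["t2"]},
)
```

The trade-off is the usual one for materialized views: the view must be refreshed when patents or labels change, in exchange for avoiding the company - patent - label traversal at query time.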
Step 103, querying the graph database by using the query optimization method, and outputting a query result.
In the embodiment of the present disclosure, the query optimization method in step 102 is used to query the graph database, and the query result is output. And, in embodiments of the present disclosure, the query results may include associations between nodes in a graph database.
The query task optimization method based on scientific and technological consultation large-scale graph data acquires the identifier of a query task and selects a corresponding query optimization method according to that identifier, where the query optimization methods include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views; the graph database is then queried using the selected query optimization method, and the query result is output. Because the query optimization method is selected according to the identifier of the query task, the flexibility of querying is improved. In addition, the query optimization methods improve the efficiency of query tasks in different scenarios over scientific and technological consultation large-scale graph data, reduce the computational complexity of queries, and shorten query time.
Fig. 2 is a schematic structural diagram of a query task optimization system based on scientific and technological consulting large-scale graph data according to an embodiment of the present application, and as shown in fig. 2, the system may include:
an obtaining module 201, configured to obtain an identifier of a query task;
a selection module 202, configured to select a corresponding query optimization method according to the identifier of the query task, where the query optimization methods include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views;
a display module 203, configured to query the graph database by using the query optimization method and to output a query result.
In the embodiment of the present disclosure, the query task may include an organization, a talent, and an industrial chain.
A computer storage medium provided in an embodiment of the third aspect of the present application, where the computer storage medium stores computer-executable instructions; the computer executable instructions, when executed by a processor, are capable of performing the method of the first aspect as described above.
A computer device according to an embodiment of a fourth aspect of the present application includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method according to the first aspect is implemented.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (9)

1. A query task optimization method based on scientific and technological consultation large-scale graph data is characterized by comprising the following steps:
acquiring an identifier of a query task;
selecting a corresponding query optimization method according to the identifier of the query task, wherein the query optimization methods include adjusting the graph traversal expansion order strategy, cardinality reduction, pattern advancement, and materialized views;
and querying the graph database by using the query optimization method, and outputting a query result.
2. The method of claim 1, wherein the query task comprises an organization, a talent, and an industry chain.
3. The query task optimization method of claim 1, wherein adjusting the graph traversal expansion order strategy comprises:
S11, inputting a source entity node and a target entity node, and inputting an intermediate entity node type mtype and a path pattern pattern;
S12, initializing two node sets S1 and S2, wherein S1 is initialized to the input source entity node and S2 is initialized to the input target entity node;
S13, calculating the expansion order of the bidirectional BFS from pattern and mtype, wherein pattern1 denotes the expansion order from the left end and pattern2 denotes the expansion order from the right end;
S14, if S1 or S2 is not empty, continuing to step S15; otherwise, executing step S111;
S15, letting S be the set of nodes expanded at the current layer;
S16, exchanging S1 and S2, so that expansion alternates between the left end and the right end;
S17, for each node in the set S1, expanding the next-layer neighbor nodes of the node according to the pattern, denoted next_nodes;
S18, checking each node in next_nodes, and if the node is in the set S, a path has been found and step S111 is executed;
S19, adding the next_nodes of all nodes expanded at the current layer to the set S, copying the set S to S1, and storing the path;
S110, returning to step S14;
and S111, ending.
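Steps S11 to S111 can be illustrated with a simplified bidirectional BFS. This is a sketch under stated assumptions, not the claimed procedure itself: the graph is an undirected adjacency dict `adj`, `node_type` maps each node to its type, `pattern` is a list of node types from source to target, and the two searches are assumed to meet on the first layer whose type equals `mtype`. The function names `expand` and `bidirectional_pattern_search` are hypothetical.

```python
def expand(adj, node_type, frontier, wanted_type):
    """Expand one BFS layer, keeping only neighbours of the wanted type.

    frontier maps each reached node to the path that reached it.
    """
    next_nodes = {}
    for node, path in frontier.items():
        for nb in adj.get(node, ()):
            if node_type[nb] == wanted_type and nb not in next_nodes:
                next_nodes[nb] = path + [nb]
    return next_nodes


def bidirectional_pattern_search(adj, node_type, source, target, pattern, mtype):
    """Return one path matching `pattern`, or None if there is none.

    Assumption: both searches meet on the first layer of type `mtype`.
    """
    i = pattern.index(mtype)
    # pattern1: types visited from the left end; pattern2: from the right end.
    todo = [pattern[1:i + 1], pattern[i:-1][::-1]]
    frontiers = [{source: [source]}, {target: [target]}]
    side = 0
    while todo[0] or todo[1]:
        if todo[side]:
            wanted = todo[side].pop(0)
            frontiers[side] = expand(adj, node_type, frontiers[side], wanted)
            if not frontiers[side]:
                return None  # one side died out, so no matching path exists
        side = 1 - side  # alternate between the left and right ends
    for node, left_path in frontiers[0].items():
        if node in frontiers[1]:
            right_path = frontiers[1][node]  # runs target -> ... -> node
            return left_path + right_path[::-1][1:]
    return None
```

Expanding from both ends keeps each frontier shallower than a single search from the source would, which is the usual motivation for a bidirectional BFS. The meeting check here is done once, after both sides reach the `mtype` layer, whereas step S18 checks during expansion.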
4. The query task optimization method of claim 1, wherein reducing cardinality comprises:
S21, inputting a source entity node and a path pattern pattern;
S22, letting next_nodes be the node set of the next layer of expansion, initialized to the next-layer neighbor nodes of the source entity node expanded according to the pattern;
S23, removing duplicates from next_nodes;
S24, letting q be a node queue, initialized to next_nodes;
S25, if q is not empty, continuing to step S26; otherwise, executing step S212;
S26, setting size to the number of nodes currently in the queue;
S27, if size is not zero, continuing to step S28; otherwise, executing step S211;
S28, popping a node from the current queue;
S29, expanding the next-layer neighbor nodes next_nodes of the node according to the pattern;
S210, adding next_nodes to the queue q;
S211, if the pattern has been fully traversed, continuing to step S212; otherwise, executing step S25;
and S212, ending.
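The effect of removing duplicates from next_nodes can be seen by counting how many nodes are expanded with and without deduplication. The sketch below is illustrative only: the function name `expand_by_pattern`, the `dedup` flag, the adjacency-dict graph, and the choice to deduplicate at every layer (the claim states it explicitly only in step S23) are assumptions. Here `pattern` lists the node types of the successive layers after the source.

```python
def expand_by_pattern(adj, node_type, source, pattern, dedup=True):
    """Layer-by-layer expansion from source along a list of node types.

    Returns (final_layer, expansions), where expansions counts how many
    nodes were expanded in total.
    """
    frontier = [source]
    expansions = 0
    for wanted in pattern:
        next_nodes = []
        for node in frontier:
            expansions += 1
            next_nodes.extend(
                nb for nb in adj.get(node, ()) if node_type[nb] == wanted
            )
        if dedup:
            # dict.fromkeys drops repeats while preserving first-seen order.
            next_nodes = list(dict.fromkeys(next_nodes))
        frontier = next_nodes
    return frontier, expansions
```

When several intermediate nodes share a neighbor, for example two researchers who co-authored the same paper, the deduplicated run expands that neighbor once instead of once per incoming path, so the saving grows with every further layer.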
5. The query task optimization method of claim 1, wherein advancing the pattern comprises:
S31, inputting a source entity node, a path pattern pattern, and a filter pattern filter_pattern;
S32, initializing a pattern-advance set filter_nodes;
S33, letting q be a node queue, initialized to the input source entity node;
S34, if q is not empty, continuing to step S35; otherwise, executing step S313;
S35, initializing size to the number of nodes currently in the queue;
S36, if size is not zero, continuing to step S37; otherwise, executing step S312;
S37, popping a node from the current queue;
S38, expanding the next-layer neighbor nodes next_nodes of the node according to the pattern;
S39, determining whether the node type of the current next_nodes is the node type of filter_nodes, and if so, continuing to step S310; otherwise, executing step S311;
S310, traversing each next_node in the set next_nodes, and filtering out the node if next_node is in the set filter_nodes;
S311, adding next_nodes to the queue q;
S312, if the pattern has been fully traversed, continuing to step S313; otherwise, executing step S35;
and S313, ending.
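Advancing the pattern amounts to applying the filter during traversal, at the layer whose node type matches, rather than after all paths have been expanded. The sketch below makes several assumptions that the claim does not spell out: `filter_nodes` is supplied already computed (the claim derives it from filter_pattern), nodes found in `filter_nodes` are dropped, and the names `expand_with_early_filter` and `filter_type` are hypothetical. As before, `pattern` lists the node types of the layers after the source.

```python
def expand_with_early_filter(adj, node_type, source, pattern,
                             filter_type, filter_nodes):
    """Expand along pattern, dropping filtered nodes as soon as they appear.

    Nodes of type filter_type that are in filter_nodes are removed at the
    layer where they are reached, so their neighbours are never expanded.
    """
    frontier = [source]
    for wanted in pattern:
        next_nodes = []
        for node in frontier:
            for nb in adj.get(node, ()):
                if node_type[nb] != wanted:
                    continue
                if wanted == filter_type and nb in filter_nodes:
                    continue  # filtered early, before any further expansion
                next_nodes.append(nb)
        frontier = next_nodes
    return frontier
```

Because a filtered node never enters the queue, none of its downstream neighbors are visited, which is where the saving over filtering the final result comes from.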
6. A query task optimization system based on scientific and technological consultation large-scale graph data is characterized by comprising:
the acquisition module is used for acquiring the identifier of the query task;
the selection module is used for selecting a corresponding query optimization method according to the identifier of the query task, wherein the query optimization method comprises adjusting the graph traversal expansion order strategy, reducing cardinality, advancing the pattern, and materializing a view;
and the display module is used for querying the graph database by using the query optimization method and outputting a query result.
7. The query task optimization system of claim 6, wherein the query task comprises an organization, a talent, an industry chain.
8. A computer storage medium, wherein the computer storage medium stores computer-executable instructions; the computer-executable instructions, when executed by a processor, are capable of performing the method of any of claims 1-5.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to any one of claims 1-5 when executing the program.
CN202111316037.1A 2021-11-08 2021-11-08 Query task optimization method based on scientific and technological consultation large-scale graph data Pending CN114020781A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111316037.1A CN114020781A (en) 2021-11-08 2021-11-08 Query task optimization method based on scientific and technological consultation large-scale graph data
PCT/CN2022/087215 WO2023077731A1 (en) 2021-11-08 2022-04-15 Query task optimization method based on science and technology consultation large-scale graph data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111316037.1A CN114020781A (en) 2021-11-08 2021-11-08 Query task optimization method based on scientific and technological consultation large-scale graph data

Publications (1)

Publication Number Publication Date
CN114020781A true CN114020781A (en) 2022-02-08

Family

ID=80062381

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111316037.1A Pending CN114020781A (en) 2021-11-08 2021-11-08 Query task optimization method based on scientific and technological consultation large-scale graph data

Country Status (2)

Country Link
CN (1) CN114020781A (en)
WO (1) WO2023077731A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190236188A1 (en) * 2018-01-31 2019-08-01 Salesforce.Com, Inc. Query optimizer constraints
GB201813561D0 (en) * 2018-08-21 2018-10-03 Shapecast Ltd Machine learning optimisation method
US11392623B2 (en) * 2019-12-11 2022-07-19 Oracle International Corporation Hybrid in-memory BFS-DFS approach for computing graph queries against heterogeneous graphs inside relational database systems
CN114020781A (en) * 2021-11-08 2022-02-08 北京邮电大学 Query task optimization method based on scientific and technological consultation large-scale graph data

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023077731A1 (en) * 2021-11-08 2023-05-11 北京邮电大学 Query task optimization method based on science and technology consultation large-scale graph data
CN114880504A (en) * 2022-07-08 2022-08-09 支付宝(杭州)信息技术有限公司 Graph data query method, device and equipment
CN114880504B (en) * 2022-07-08 2023-03-31 支付宝(杭州)信息技术有限公司 Graph data query method, device and equipment
WO2024008117A1 (en) * 2022-07-08 2024-01-11 支付宝(杭州)信息技术有限公司 Graph data query method, apparatus and device

Also Published As

Publication number Publication date
WO2023077731A1 (en) 2023-05-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination