CN111159316B - Relational database query method, device, electronic equipment and storage medium - Google Patents

Relational database query method, device, electronic equipment and storage medium

Info

Publication number
CN111159316B
CN111159316B CN202010093639.4A CN202010093639A CN111159316B CN 111159316 B CN111159316 B CN 111159316B CN 202010093639 A CN202010093639 A CN 202010093639A CN 111159316 B CN111159316 B CN 111159316B
Authority
CN
China
Prior art keywords
node
nodes
connection
queue
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010093639.4A
Other languages
Chinese (zh)
Other versions
CN111159316A (en)
Inventor
陈亮
刘海清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010093639.4A
Publication of CN111159316A
Application granted
Publication of CN111159316B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution
    • G06F16/24553: Query execution of query operations
    • G06F16/24558: Binary matching operations
    • G06F16/2456: Join operations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2457: Query processing with adaptation to user needs
    • G06F16/24578: Query processing with adaptation to user needs using ranking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a relational database query method and apparatus, an electronic device, and a storage medium, and relates to the field of databases. The method comprises: determining the nodes where the data queried by a user is located, where each node represents a data table in a relational database; adding the set formed by the determined nodes to a first queue as a candidate set, and executing the following first predetermined processing: taking a candidate set out of the first queue on a first-in, first-out basis; if a connection tree can be generated from the candidate set, completing the query according to the connection tree; otherwise, selecting nodes that meet the expansion requirement, adding each selected node to the candidate set to obtain an expanded candidate set, and adding each expanded candidate set to the first queue; and, if the first queue is not empty, repeating the first predetermined processing. Applying this scheme simplifies user operation and lowers the barrier to use.

Description

Relational database query method and device, electronic equipment and storage medium
Technical Field
The present application relates to computer application technologies, and in particular, to a relational database query method and apparatus, an electronic device, and a storage medium in the field of databases.
Background
A relational database is a database that organizes data using the relational model, storing data in rows and columns. It can be viewed as a collection of data tables together with the connection (join) relationships between them.
A user retrieves data from a database through a query. The user can write a query statement and submit it directly to the database for execution to obtain the data. In this approach, however, the user must specify every data table involved in the query and the connection relationships among those tables. The tables involved may include not only the tables where the queried data is located but also intermediate tables that link them. For example, if the queried data is located in data table 1 and data table 2, and the two can only be connected through data table 3, then data table 3 is an intermediate table.
This approach requires the user to fully understand the data model and to be familiar with database query methods. That is usually expected of developers but is difficult for ordinary users, which raises the barrier to use.
Disclosure of Invention
In view of this, the present application provides a method, an apparatus, an electronic device, and a storage medium for querying a relational database.
A method of relational database querying, comprising:
determining a node where data queried by a user is located, wherein the node represents a data table in a relational database;
adding the set formed by the determined nodes to a first queue as a candidate set, and executing the following first predetermined processing:
taking a candidate set out of the first queue on a first-in, first-out basis;
if a connection tree can be generated from the candidate set, completing the query according to the connection tree;
if a connection tree cannot be generated from the candidate set, selecting nodes that meet the expansion requirement, adding each selected node to the candidate set to obtain an expanded candidate set, and adding each expanded candidate set to the first queue; and, if the first queue is not empty, repeating the first predetermined processing.
According to a preferred embodiment of the present application, the method further comprises: generating a directed graph corresponding to the relational database, where the relational database consists of data tables and connection relationships among the data tables, the data tables are abstracted as nodes in the directed graph, and the connection relationships are abstracted as edges between the nodes; and determining, for the candidate set, whether a connection tree can be generated according to the directed graph.
According to a preferred embodiment of the present application, the connection relationships include: inner connection, full outer connection, left connection, and right connection;
abstracting the connection relationships as edges between nodes includes: for any two nodes, abstracting the connection relationship as a unidirectional edge when it is a left connection or a right connection, and as a bidirectional edge when it is an inner connection or a full outer connection.
According to a preferred embodiment of the present application, selecting nodes that meet the expansion requirement includes: selecting nodes that are not in the candidate set and have not yet been selected.
According to a preferred embodiment of the present application, the method further comprises: for any expanded candidate set, determining whether it has already been added to the first queue; if so, discarding it, and if not, adding it to the first queue.
According to a preferred embodiment of the present application, selecting nodes that meet the expansion requirement includes: selecting, from the upstream nodes of the nodes in the candidate set, nodes that are not in the candidate set and have not yet been selected.
According to a preferred embodiment of the present application, the method further comprises: adding the obtained expanded candidate sets to the first queue in descending order of the priority weights of the added nodes.
According to a preferred embodiment of the present application, completing the query according to the connection tree if a connection tree can be generated from the candidate set includes:
traversing the nodes in the candidate set;
for each traversed node, determining whether a connection tree can be generated with that node as the root node; if so, completing the query according to the connection tree, and if not, traversing the next node;
and, if no node yields a connection tree when used as the root node, determining that a connection tree cannot be generated from the candidate set.
According to a preferred embodiment of the present application, determining, for each traversed node, whether a connection tree can be generated with that node as the root node includes:
adding the node to a second queue and to a connection tree, and performing the following second predetermined processing:
taking a node out of the second queue on a first-in, first-out basis;
selecting, from the downstream nodes of that node, nodes that are in the candidate set and not yet in the connection tree, and adding each selected node to the second queue and to the connection tree;
determining whether the second queue is empty, and if not, repeating the second predetermined processing;
and, if the second queue is empty, determining whether the nodes in the connection tree include all nodes in the candidate set; if so, determining that a connection tree can be generated with the node as the root node, and if not, traversing the next node.
According to a preferred embodiment of the present application, traversing the nodes in the candidate set includes: traversing the nodes in the candidate set in descending order of node priority weight.
According to a preferred embodiment of the present application, adding each selected node to the second queue and to the connection tree includes: adding the selected nodes to the second queue and the connection tree in descending order of node priority weight.
A relational database query apparatus comprising: a node determining unit and a query processing unit;
the node determining unit is configured to determine a node where data queried by a user is located, the node representing a data table in a relational database;
the query processing unit is configured to add the set formed by the determined nodes to a first queue as a candidate set, and to execute the following first predetermined processing: taking a candidate set out of the first queue on a first-in, first-out basis; if a connection tree can be generated from the candidate set, completing the query according to the connection tree; if a connection tree cannot be generated from the candidate set, selecting nodes that meet the expansion requirement, adding each selected node to the candidate set to obtain an expanded candidate set, and adding each expanded candidate set to the first queue; and, if the first queue is not empty, repeating the first predetermined processing.
According to a preferred embodiment of the present application, the apparatus further comprises a preprocessing unit configured to generate a directed graph corresponding to the relational database, where the relational database consists of data tables and connection relationships among the data tables, the data tables are abstracted as nodes in the directed graph, and the connection relationships are abstracted as edges between the nodes;
and the query processing unit determines, for the candidate set, whether a connection tree can be generated according to the directed graph.
According to a preferred embodiment of the present application, the connection relationship includes: inner connection, full outer connection, left connection and right connection;
for any two nodes, the preprocessing unit abstracts the connection relationship as a unidirectional edge when it is a left connection or a right connection, and as a bidirectional edge when it is an inner connection or a full outer connection.
According to a preferred embodiment of the present application, the nodes that meet the expansion requirement include: nodes that are not in the candidate set and have not yet been selected.
According to a preferred embodiment of the present application, the query processing unit is further configured to determine, for any expanded candidate set, whether it has already been added to the first queue; if so, to discard it, and if not, to add it to the first queue.
According to a preferred embodiment of the present application, the nodes that meet the expansion requirement include: nodes that are upstream of the nodes in the candidate set, are not in the candidate set, and have not yet been selected.
According to a preferred embodiment of the present application, the query processing unit is further configured to add the obtained expanded candidate sets to the first queue in descending order of the priority weights of the added nodes.
According to a preferred embodiment of the present application, the query processing unit traverses the nodes in the candidate set and, for each traversed node, determines whether a connection tree can be generated with that node as the root node; if so, it completes the query according to the connection tree, and if not, it traverses the next node; if no node yields a connection tree when used as the root node, it determines that a connection tree cannot be generated from the candidate set.
According to a preferred embodiment of the present application, the query processing unit adds each traversed node to a second queue and to a connection tree, and performs the following second predetermined processing: taking a node out of the second queue on a first-in, first-out basis; selecting, from the downstream nodes of that node, nodes that are in the candidate set and not yet in the connection tree, and adding each selected node to the second queue and to the connection tree; determining whether the second queue is empty, and if not, repeating the second predetermined processing; and, if the second queue is empty, determining whether the nodes in the connection tree include all nodes in the candidate set; if so, determining that a connection tree can be generated with the node as the root node, and if not, traversing the next node.
According to a preferred embodiment of the present application, the query processing unit traverses the nodes in the candidate set in descending order of node priority weight.
According to a preferred embodiment of the present application, the query processing unit adds the selected nodes to the second queue and the connection tree in descending order of node priority weight.
An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method as described above.
A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method as described above.
One embodiment in the above application has the following advantages or benefits. The required connection tree can be generated automatically, through candidate set selection and expansion, from the nodes where the data queried by the user is located, and the query can then be completed based on that connection tree, which simplifies user operation and lowers the barrier to use. All candidate sets are expanded from the set formed by the nodes where the queried data is located, which ensures that those nodes always appear in the final connection tree. Expansion proceeds in the order of zero added nodes, then one added node, then two added nodes, and so on, so that as few nodes as possible are introduced beyond those where the queried data is located; this avoids enlarging the data range of the query as far as possible and improves query efficiency. When a candidate set is expanded, nodes can be selected only from the upstream nodes of the nodes in the candidate set, which reduces the number of useless candidate sets and further improves query efficiency. Operations such as adding expanded candidate sets to the first queue, traversing the nodes in a candidate set, and adding nodes to the second queue can be performed in descending order of node priority weight, which ensures the idempotency of the query and facilitates fine-grained control. Other effects of the above alternatives are described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flow chart of an embodiment of a relational database query method according to the present application;
FIG. 2 is a flowchart of an embodiment of a method for generating a connection tree according to the present application;
FIG. 3 is a flowchart of an embodiment of a method for determining whether a connection tree can be generated using any traversed node as a root node according to the present application;
FIG. 4 is a schematic diagram illustrating an effect of a node traversal order on a result according to the present application;
FIG. 5 is a schematic diagram illustrating the effect of the node join order on the result according to the present application;
FIG. 6 is a schematic diagram illustrating an exemplary embodiment of a relational database query apparatus 600 according to the present disclosure;
FIG. 7 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In addition, it should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
Fig. 1 is a flowchart of an embodiment of a relational database query method according to the present application. As shown in fig. 1, the following detailed implementation is included.
In 101, a node where the data queried by the user is located is determined, and the node represents a data table in the relational database.
At 102, the set of determined nodes is added to the first queue as a candidate set, and the first predetermined process shown in 103 is executed.
In 103, a candidate set is taken out of the first queue on a first-in, first-out basis. If a connection tree can be generated from the candidate set, the query is completed according to the connection tree. If not, nodes that meet the expansion requirement are selected; each selected node is added to the candidate set to obtain an expanded candidate set, and each expanded candidate set is added to the first queue. If the first queue is not empty, the first predetermined processing is repeated.
In this embodiment, a directed graph corresponding to a relational database may be generated first, where the relational database is composed of data tables and connection relationships between the data tables, and the data tables are abstracted into nodes in the directed graph, and the connection relationships are abstracted into edges between the nodes, that is, the relational database is abstracted into one directed graph.
In a relational database, there are four main types of connection relationship (join) between data tables: inner connection (inner join), full outer connection (full outer join), left connection (left join), and right connection (right join). For inner and full outer connections, the order of the two tables has no effect on the query; that is, connecting A to B is equivalent to connecting B to A. Left and right connections are directional: the connection must go from one table to the other, and reversing the order changes the meaning of the query. Therefore, for any two nodes, a left or right connection can be abstracted as a unidirectional edge, and an inner or full outer connection can be abstracted as a bidirectional edge.
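The abstraction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_graph`, the triple format, and the table names are hypothetical, and the edge direction chosen for left and right connections (from the table whose rows are preserved to the other table) is an assumption, since the text only says that such edges are unidirectional.

```python
def build_graph(joins):
    """Map (left_table, right_table, join_type) triples to adjacency sets.

    The result maps each node to the set of its downstream nodes.
    Assumption: "A LEFT JOIN B" gives the edge A -> B, and
    "A RIGHT JOIN B" gives the edge B -> A.
    """
    downstream = {}
    for left, right, kind in joins:
        downstream.setdefault(left, set())
        downstream.setdefault(right, set())
        if kind in ("inner", "full"):
            # Order does not matter, so the edge is bidirectional.
            downstream[left].add(right)
            downstream[right].add(left)
        elif kind == "left":
            downstream[left].add(right)
        elif kind == "right":
            downstream[right].add(left)
        else:
            raise ValueError(f"unknown join type: {kind}")
    return downstream
```

Every table appears as a key even when it has no outgoing edge, so the key set of the result is the full node set of the directed graph.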
In a relational database query, different data tables need to be connected. For example, a base table can be selected and other tables connected to it one after another, so that the set of tables connected to the base table keeps growing; each table connected later can be connected to any table already in the set and is then added to the set. Under the abstraction above, this connection procedure becomes an algorithm for generating a connection tree, in which the base table is the root node and the tree grows as further nodes are connected to it. The data queried by the user is mapped to fields in data tables, and these fields may be distributed across different data tables.
The nodes (that is, data tables) where the data queried by the user is located must be used in the query, so they must be included in the connection tree. To generate a connection tree, it may also be necessary to introduce some non-essential nodes, that is, nodes other than those where the queried data is located. However, each introduced node enlarges the data range corresponding to the query, so to keep the data range as small as possible, as few non-essential nodes as possible should be introduced.
In this embodiment, candidate sets can be tried in a breadth-first manner: for each candidate set, it is determined whether a connection tree can be generated. If so, a solution has been found; if every candidate set has been tried without success, there is no solution.
FIG. 2 is a flowchart of an embodiment of a method for generating a connection tree according to the present application. As shown in fig. 2, the following detailed implementation is included.
In 201, the set composed of the nodes where the data queried by the user is located is added to the first queue as a candidate set.
The nodes where the queried data is located can be determined in an existing manner. There may be one such node or several, and usually there are several. The determined nodes form a set, which serves as the initial candidate set and is added to the first queue.
At 202, it is determined whether the first queue is empty, if so, 203 is performed, otherwise 204 is performed.
At 203, it is determined that the connection tree cannot be generated, and the flow ends.
If the first queue is empty, it indicates that all the alternative sets have been tried, and a connection tree cannot be generated.
At 204, a candidate set is taken from the first queue on a first-in-first-out basis.
The first retrieved candidate set is the initial candidate set described in 201.
In 205, it is determined whether a connection tree can be generated from the candidate set; if so, 206 is performed, otherwise 207 is performed.
Whether a connection tree can be generated from the candidate set can be determined according to the directed graph; the specific implementation is described below.
In 206, the query is completed according to the generated connection tree, and the process is ended.
For example, a query statement may be generated according to the connection tree, and the required data is queried through the query statement and returned to the user.
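As a rough illustration of turning a connection tree into a query statement, the sketch below renders a SQL string from a root table and a list of tree edges. The source does not specify how the statement is built, so everything here is an assumption: the function name `render_query`, the edge list format, and the `join_specs` mapping from an edge to a join keyword and an ON condition are all hypothetical.

```python
def render_query(root, tree_edges, join_specs, columns):
    """Render a SQL string from a connection tree.

    root:       name of the root (base) table
    tree_edges: (parent, child) pairs in the order the nodes joined the tree
    join_specs: hypothetical mapping (parent, child) -> (keyword, condition)
    columns:    qualified column names requested by the user
    """
    lines = [f"SELECT {', '.join(columns)}", f"FROM {root}"]
    for parent, child in tree_edges:
        keyword, condition = join_specs[(parent, child)]
        lines.append(f"{keyword} {child} ON {condition}")
    return "\n".join(lines)
```

Because edges are listed in the order in which nodes joined the tree, every parent table has already been introduced by the time one of its children is joined.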
At 207, it is determined whether there are nodes that meet the expansion requirement; if not, 202 is performed again, and if so, 208 is performed.
At 208, each selected node is added to the candidate set to obtain an expanded candidate set, each expanded candidate set is added to the first queue, and then 202 is performed again.
Nodes that meet the expansion requirement can be selected from the directed graph; for example, nodes that are not in the candidate set and have not yet been selected.
For example, suppose the candidate set includes node 1, node 2, and node 3. If node 4 is selected first, an expanded candidate set including node 1, node 2, node 3, and node 4 is obtained. The next selection must then be made from nodes other than node 1, node 2, node 3, and node 4; if node 5 is selected, another expanded candidate set including node 1, node 2, node 3, and node 5 is obtained.
In practical applications, there may be one, multiple or zero nodes meeting the expansion requirement.
The resulting expanded candidate sets can be added to the first queue. Preferably, for any expanded candidate set, it is first determined whether it has already been added to the first queue; if so, it is discarded, and if not, it is added to the first queue.
For example, suppose two candidate sets are taken out of the first queue in turn, referred to for convenience as the first candidate set and the second candidate set, where the first includes node 1, node 2, and node 3 and the second includes node 1, node 2, and node 4. Expanding the first candidate set with node 4 yields an expanded candidate set including node 1, node 2, node 3, and node 4, which is added to the first queue. Expanding the second candidate set with node 3 would yield the same set; because it has already been added to the first queue, it can be discarded.
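The expansion and duplicate check can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `expand` and the use of a `seen` set of `frozenset` values to record every candidate set ever queued are assumptions, and sorting the nodes is only a stand-in for a defined selection order.

```python
def expand(candidate, all_nodes, seen):
    """Return the new candidate sets obtained by adding one node each.

    candidate: frozenset of nodes in the current candidate set
    all_nodes: set of every node in the directed graph
    seen:      set of frozensets already added to the first queue;
               updated in place
    """
    expanded = []
    for node in sorted(all_nodes - candidate):
        new_set = frozenset(candidate | {node})
        if new_set in seen:
            continue  # already queued once, so discard the duplicate
        seen.add(new_set)
        expanded.append(new_set)
    return expanded
```

With nodes 1 to 4, expanding {1, 2, 3} produces {1, 2, 3, 4}, and a later expansion of {1, 2, 4} produces nothing new, matching the example in the text.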
It can be seen that all candidate sets in this embodiment are expanded from the set formed by the nodes where the data queried by the user is located, which ensures that those nodes appear in the final connection tree. Expansion proceeds in the order of zero added nodes, then one, then two, and so on, so that as few nodes as possible are introduced beyond those where the queried data is located. This avoids enlarging the data range of the query as far as possible and improves query efficiency.
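The overall flow of fig. 2 can be sketched as a breadth-first search over candidate sets. The code below is a simplified illustration under assumptions: the names `covers` and `search` are hypothetical, `downstream` is assumed to map every node to the set of its downstream nodes, a plain reachability test stands in for connection tree generation, and alphabetical order stands in for priority weights.

```python
from collections import deque


def covers(root, candidate, downstream):
    """Check whether a BFS from root, restricted to candidate, reaches all of it."""
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in downstream.get(node, ()):
            if nxt in candidate and nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return visited == candidate


def search(required, downstream):
    """Return (root, candidate_set) for the first workable set, or None."""
    all_nodes = set(downstream)
    start = frozenset(required)
    queue = deque([start])   # the "first queue"
    seen = {start}           # every candidate set ever queued
    while queue:
        candidate = queue.popleft()
        for root in sorted(candidate):
            if covers(root, candidate, downstream):
                return root, candidate
        for node in sorted(all_nodes - candidate):
            expanded = candidate | {node}
            if expanded not in seen:
                seen.add(expanded)
                queue.append(expanded)
    return None
```

Because the queue is first-in, first-out and each expansion adds exactly one node, all sets with k extra nodes are tried before any set with k + 1 extra nodes, so the first solution found uses the fewest extra tables. For two queried tables that are linked only through a third table, the search returns that third table as the root.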
In addition, preferably, when nodes that meet the expansion requirement are selected, they can be selected from the upstream nodes of the nodes in the candidate set, again choosing only nodes that are not in the candidate set and have not yet been selected.
Because the connection tree in this embodiment is generated by breadth-first traversal, adding a node that is downstream of a node in the candidate set, or that has no direct upstream or downstream relationship with any node in it, does not help to generate a connection tree. Node selection can therefore be optimized by choosing only upstream nodes of the nodes in the candidate set, which avoids generating many useless candidate sets and further improves query efficiency.
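The upstream-only restriction can be illustrated with a small helper. The function name `upstream_candidates` and the adjacency format (each node mapped to the set of its downstream nodes) are assumptions made for this sketch.

```python
def upstream_candidates(candidate, downstream):
    """Return nodes outside candidate that have an edge into a node in it."""
    result = set()
    for node, targets in downstream.items():
        # A node is upstream of the candidate set if at least one of
        # its downstream nodes belongs to the candidate set.
        if node not in candidate and targets & candidate:
            result.add(node)
    return result
```

In a graph with edges a -> b, c -> a, b -> d and e -> d, the candidate set {a, b} has only c as an expansion choice: d is purely downstream of the set, and e has no direct edge into it.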
In addition, preferably, the expanded candidate sets may be added to the first queue in descending order of the priority weight of the node added to each set.
The order in which the expanded candidate sets are added to the first queue determines the order in which they are later taken out of it, and therefore affects the final result. Priority weights can thus be set for the individual nodes according to actual needs, and the expanded candidate sets can be added to the first queue in descending order of the priority weight of the added node.
In practical applications, queries are usually required to be idempotent, that is, querying multiple times under the same conditions must return the same data, which requires the generated connection tree to be the same whenever the structure is the same. In addition, in some cases the relevant personnel need finer-grained control over the generated connection tree, for example to favor a certain connection mode. Setting priority weights for the individual nodes both ensures the idempotency of the query and facilitates such fine-grained control.
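The ordering by priority weight can be illustrated with a short sketch. The weights, the function name `expand_in_priority_order`, and the tie-breaking by node name are assumptions made for the example; the present application only states that higher-weighted nodes are handled first.

```python
from typing import Dict, FrozenSet, Iterable, List


def expand_in_priority_order(
    candidate: FrozenSet[str],
    selected: Iterable[str],
    weight: Dict[str, int],
) -> List[FrozenSet[str]]:
    """Build one expanded set per selected node, highest priority weight first."""
    # Ties are broken by node name (an assumption) so that repeated runs
    # always enqueue the expanded sets in the same order.
    ordered = sorted(selected, key=lambda node: (-weight.get(node, 0), node))
    return [candidate | {node} for node in ordered]


expanded_sets = expand_in_priority_order(
    frozenset({"a"}), ["b", "c", "d"], {"b": 5, "c": 9, "d": 5}
)
for expanded in expanded_sets:
    print(sorted(expanded))
```

Because the order depends only on the weights and the node names, the same input always yields the same sequence, which is what the idempotency requirement asks for.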
For any alternative set taken out from the first queue, when determining whether the connection tree can be generated, the breadth-first traversal can be performed in the directed graph by starting from each node in the alternative set (taking the node as a root node), and if all the nodes in the alternative set can be traversed, the connection tree is found.
Specifically, each node in the candidate set may be traversed, and whether a connection tree can be generated by using the node as a root node is determined for each traversed node, if yes, the query is completed according to the generated connection tree, if not, the next node is traversed, and if each node as a root node cannot generate a connection tree, it is determined that a connection tree cannot be generated according to the candidate set.
Fig. 3 is a flowchart of an embodiment of the method for determining whether a junction tree can be generated by using any traversed node as a root node according to the present application. As shown in fig. 3, the following detailed implementation is included.
At 301, for any node traversed, the node is added to a second queue and to the connection tree.
The traversed node is taken as the root node and is added to the second queue and to the connection tree as initialization.
At 302, a node is taken from the second queue on a first-in-first-out basis.
The node taken out for the first time is the root node described in 301.
In 303, the nodes that are located in the candidate set and not located in the connection tree are selected from the nodes downstream of the node, and the selected nodes are added to the second queue and the connection tree, respectively.
Each downstream node of the node may be traversed; for each such downstream node, it is determined whether the downstream node is located in the candidate set and not yet located in the connection tree, and if so, the downstream node is added to the second queue and to the connection tree.
At 304, it is determined whether the second queue is empty, and if not, 302 is repeated, and if so, 305 is performed.
In 305, it is determined whether all nodes in the alternative set are contained in the junction tree, if so, 306 is performed, otherwise, 307 is performed.
At 306, it is determined that a connection tree can be generated with the node as the root node, and the flow ends.
If the connection tree contains all the nodes in the candidate set, every node in the candidate set can be reached from the root node, and the connection tree has been generated successfully.
In 307, it is determined that the connection tree cannot be generated with the node as the root node, and the flow ends.
Thereafter, the next node in the alternative set may be traversed and the process illustrated in FIG. 3 repeated. And if the connection tree cannot be generated by taking each node in the alternative set as a root node, determining that the connection tree cannot be generated according to the alternative set.
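The flow of 301 to 307, together with the traversal over the nodes of the candidate set, can be sketched as below. This is a hedged illustration only: the adjacency-dictionary layout, the representation of the connection tree as a list of (parent, child) pairs, and the names `tree_from_root` and `tree_for_candidate` are assumptions, and nodes are visited in name order here as a stand-in for the priority weights.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Graph = Dict[str, Set[str]]      # assumed: node -> directly downstream nodes
Tree = List[Tuple[str, str]]     # assumed: (parent, child) pairs


def tree_from_root(edges: Graph, candidate: FrozenSet[str], root: str) -> Optional[Tree]:
    """Return a connection tree rooted at `root`, or None if one cannot be generated."""
    in_tree = {root}                      # 301: root joins the connection tree
    tree: Tree = []
    second_queue = deque([root])          # 301: root joins the second queue
    while second_queue:                   # 304: loop until the queue is empty
        node = second_queue.popleft()     # 302: first in, first out
        for child in sorted(edges.get(node, ())):
            # 303: keep downstream nodes in the candidate set, not yet in the tree
            if child in candidate and child not in in_tree:
                in_tree.add(child)
                tree.append((node, child))
                second_queue.append(child)
    # 305 to 307: success only if every node of the candidate set was reached
    return tree if in_tree == candidate else None


def tree_for_candidate(edges: Graph, candidate: FrozenSet[str]) -> Optional[Tuple[str, Tree]]:
    """Try each node of the candidate set as root; return the first tree found."""
    for root in sorted(candidate):
        tree = tree_from_root(edges, candidate, root)
        if tree is not None:
            return root, tree
    return None


edges: Graph = {"orders": {"users", "items"}, "users": set(), "items": set()}
print(tree_for_candidate(edges, frozenset({"orders", "users", "items"})))
print(tree_for_candidate(edges, frozenset({"users", "items"})))
```

In the second call neither node can reach the other, so no connection tree is found and the candidate set would have to be expanded.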
Preferably, when traversing the nodes in the candidate set, the nodes may be traversed in descending order of their priority weights. Fig. 4 is a schematic diagram illustrating the influence of the node traversal order on the result according to the present application. As shown in Fig. 4, the generated connection tree differs depending on whether node B or node C is traversed first.
In addition, in 303, when adding the selected nodes to the second queue and the connection tree, the selected nodes may be added to the second queue and the connection tree, respectively, in order from the highest priority weight of the nodes to the lowest priority weight of the nodes. Fig. 5 is a schematic diagram illustrating an influence of a node adding order on a result according to the present application. As shown in fig. 5, if node B is first added, node B is first taken out from the second queue, and if node C is first added, node C is first taken out from the second queue, and accordingly, the generated connection tree results are different.
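The effect of the adding order can be reproduced with a small sketch. The graph below is a hypothetical one chosen for illustration and is not necessarily the graph of Fig. 5; the function name `bfs_tree`, the weights, and the (parent, child) representation of the connection tree are likewise assumptions.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def bfs_tree(
    edges: Dict[str, Set[str]],
    candidate: Set[str],
    root: str,
    weight: Dict[str, int],
) -> List[Tuple[str, str]]:
    """Breadth-first traversal that adds downstream nodes highest weight first."""
    in_tree = {root}
    tree: List[Tuple[str, str]] = []
    second_queue = deque([root])
    while second_queue:
        node = second_queue.popleft()
        children = [c for c in edges.get(node, ()) if c in candidate]
        # Higher priority weight first; node name breaks ties (an assumption).
        for child in sorted(children, key=lambda c: (-weight.get(c, 0), c)):
            if child not in in_tree:
                in_tree.add(child)
                tree.append((node, child))
                second_queue.append(child)
    return tree


# Hypothetical graph: A -> B, A -> C, B -> D, C -> D.
edges = {"A": {"B", "C"}, "B": {"D"}, "C": {"D"}, "D": set()}
nodes = {"A", "B", "C", "D"}

print(bfs_tree(edges, nodes, "A", {"B": 2, "C": 1}))  # D is attached under B
print(bfs_tree(edges, nodes, "A", {"B": 1, "C": 2}))  # D is attached under C
```

Whichever of B and C enters the second queue first is also taken out first, so it is the one that claims D as its child.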
If no connection tree can be generated by the above method, the query cannot be completed. To avoid this, it should be ensured when the relational database is built that a connection tree can be generated for the candidate set containing all the data tables, so that a connection tree can be generated in every case (in the worst case, the candidate set is expanded to the full set). For this purpose, a validity check may be performed in advance to determine whether a connection tree can be generated for the full set.
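One possible form of such a validity check is sketched below, under the assumption that "a connection tree can be generated for the full set" means that at least one node reaches every other node along directed edges. The names `reachable_from` and `full_set_is_valid` and the adjacency-dictionary layout are hypothetical.

```python
from collections import deque
from typing import Dict, Set


def reachable_from(edges: Dict[str, Set[str]], root: str) -> Set[str]:
    """Return every node reachable from `root` by breadth-first traversal."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def full_set_is_valid(edges: Dict[str, Set[str]]) -> bool:
    """True if some node can serve as the root of a tree spanning all nodes."""
    nodes = set(edges) | {child for children in edges.values() for child in children}
    return any(reachable_from(edges, root) == nodes for root in nodes)


print(full_set_is_valid({"orders": {"users", "items"}, "users": set(), "items": set()}))
print(full_set_is_valid({"users": set(), "items": set()}))
```

Running the check once when the schema is defined, rather than at query time, keeps the cost out of the query path.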
It should be noted that for simplicity of description, the above-mentioned method embodiments are described as a series of acts, but those skilled in the art should understand that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In summary, with the scheme of the above method embodiment, the required connection tree can be generated automatically, through candidate set selection, expansion and the like, from the nodes where the data queried by the user is located, and the query can then be completed based on the connection tree, which simplifies user operation and lowers the threshold for use. All candidate sets are obtained by expanding the set formed by the nodes where the queried data is located, which ensures that those nodes always appear in the final connection tree; and because expansion proceeds in order of the number of added nodes (0, then 1, then 2, and so on), as few other nodes as possible are introduced, which avoids enlarging the data range covered by the query as far as possible and improves query efficiency. When a candidate set is expanded, nodes may be selected only from the upstream nodes of the nodes in the candidate set, which avoids generating many useless candidate sets and further improves query efficiency. When expanded candidate sets are added to the first queue, when nodes in a candidate set are traversed, and when nodes are added to the second queue, the operations may be performed in descending order of the priority weights of the nodes, which ensures the idempotency of the query and facilitates fine-grained control.
The above is a description of method embodiments, and the embodiments of the present application are further described below by way of apparatus embodiments.
Fig. 6 is a schematic structural diagram of an embodiment of a relational database query apparatus 600 according to the present application. As shown in Fig. 6, the apparatus includes a node determining unit 601 and a query processing unit 602.
The node determining unit 601 is configured to determine a node where the data queried by the user is located, where the node represents a data table in the relational database.
The query processing unit 602 is configured to add the set formed by the determined nodes to the first queue as a candidate set, and to execute the following first predetermined processing: taking out a candidate set from the first queue according to a first-in first-out principle; if a connection tree can be generated according to the candidate set, completing the query according to the connection tree; if no connection tree can be generated according to the candidate set, selecting nodes meeting the expansion requirement, adding, for each selected node, the node to the candidate set to obtain an expanded candidate set, and adding the expanded candidate set to the first queue; and, if the first queue is not empty, repeating the first predetermined processing.
The apparatus shown in Fig. 6 may further include a preprocessing unit 603, configured to generate a directed graph corresponding to the relational database, where the relational database consists of data tables and connection relationships between the data tables, the data tables are abstracted into nodes in the directed graph, and the connection relationships are abstracted into edges between the nodes.
Accordingly, the query processing unit 602 may determine whether the connection tree can be generated according to the directed graph for the alternative set. In addition, the selection of the nodes meeting the expansion requirement can be the selection of the nodes meeting the expansion requirement from the directed graph.
The connection relationship may include: inner connection (inner join), full outer connection (full outer join), left connection (left join), and right connection (right join).
For any two nodes, the preprocessing unit 603 may abstract the connection relationship into a unidirectional edge when the connection relationship is a left connection or a right connection, and into a bidirectional edge when the connection relationship is an inner connection or a full outer connection.
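The abstraction performed by the preprocessing unit can be sketched as follows. The direction chosen for the unidirectional edges is an assumption of this example (the edge points from the table whose rows are all preserved to the other table); this passage does not state the direction. The tuple layout, the join-type strings, and the name `build_graph` are also hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed input layout: (left_table, right_table, join_type).
Join = Tuple[str, str, str]


def build_graph(joins: Iterable[Join]) -> Dict[str, Set[str]]:
    """Turn declared join relationships into an adjacency dictionary."""
    edges: Dict[str, Set[str]] = {}
    for left, right, kind in joins:
        edges.setdefault(left, set())
        edges.setdefault(right, set())
        if kind == "left":
            # Assumption: a left join yields the edge left -> right.
            edges[left].add(right)
        elif kind == "right":
            # Assumption: a right join yields the edge right -> left.
            edges[right].add(left)
        elif kind in ("inner", "full_outer"):
            # Symmetric joins become a bidirectional edge.
            edges[left].add(right)
            edges[right].add(left)
        else:
            raise ValueError(f"unknown join type: {kind}")
    return edges


graph = build_graph([
    ("orders", "users", "left"),
    ("users", "profiles", "inner"),
    ("items", "orders", "right"),
])
for table in sorted(graph):
    print(table, "->", sorted(graph[table]))
```

Storing a bidirectional edge as two directed entries lets the traversal code treat all four join types uniformly.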
The nodes selected by query processing unit 602 that meet the expansion requirement may refer to nodes that are not in the candidate set and have not been selected. Preferably, the node meeting the expansion requirement may refer to an unselected node which is not located in the candidate set and is selected from nodes upstream of the nodes in the candidate set.
For any extended candidate set, query processing unit 602 may further determine whether it has already been added to the first queue, and if so, may discard the extended candidate set, and if not, may add the extended candidate set to the first queue.
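The duplicate check can be sketched with immutable sets. This example assumes that "already added to the first queue" covers every candidate set that was ever enqueued, including those already taken out again, and it tracks them in a separate `seen` set; the name `enqueue_if_new` is hypothetical.

```python
from collections import deque
from typing import Deque, FrozenSet, Set


def enqueue_if_new(
    first_queue: Deque[FrozenSet[str]],
    seen: Set[FrozenSet[str]],
    expanded: FrozenSet[str],
) -> bool:
    """Add `expanded` to the queue unless an equal set was enqueued before."""
    if expanded in seen:
        return False          # discard the duplicate
    seen.add(expanded)
    first_queue.append(expanded)
    return True


first_queue: Deque[FrozenSet[str]] = deque()
seen: Set[FrozenSet[str]] = set()

# The same set reached along two expansion paths is enqueued only once.
print(enqueue_if_new(first_queue, seen, frozenset({"a", "b", "c"})))
print(enqueue_if_new(first_queue, seen, frozenset({"c", "b", "a"})))
print(len(first_queue))
```

Using `frozenset` makes two sets with the same members compare and hash equal regardless of the order in which the nodes were added.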
In addition, the query processing unit 602 may add the obtained expanded candidate sets to the first queue in sequence according to the priority weights of the added nodes from high to low.
In order to determine whether a connection tree can be generated according to the candidate set, query processing unit 602 may traverse each node in the candidate set, and separately determine, for each traversed node, whether a connection tree can be generated by using the node as a root node, if yes, the current query may be completed according to the connection tree, if no, a next node may be traversed, and if each node as a root node cannot generate a connection tree, it may be determined that a connection tree cannot be generated according to the candidate set.
Specifically, for each traversed node, the query processing unit 602 may add the node to the second queue and to the connection tree, and may perform the following second predetermined processing: taking out a node from the second queue according to a first-in first-out principle; selecting, from the downstream nodes of that node, the nodes that are located in the candidate set and not located in the connection tree, and adding each selected node to the second queue and to the connection tree; and determining whether the second queue is empty, and if not, repeating the second predetermined processing. If the second queue is empty, it is determined whether the connection tree contains all the nodes in the candidate set; if so, it is determined that the connection tree can be generated with the node as the root node, and if not, the next node may be traversed.
The query processing unit 602 may traverse the nodes in the candidate set according to the order from high priority weight to low priority weight of the nodes.
In addition, the query processing unit 602 may further add the selected nodes to the second queue and the connection tree in sequence according to the priority weights of the nodes from high to low.
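Taken together, the behavior of the query processing unit 602 described above can be sketched end to end. This is an illustrative sketch under stated assumptions, not the implementation of the present application: the graph is an adjacency dictionary of downstream nodes, the connection tree is a list of (parent, child) pairs, ties between equal priority weights are broken by node name, and the names `find_connection_tree` and `_tree_from_root` are hypothetical.

```python
from collections import deque
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Set, Tuple

Graph = Dict[str, Set[str]]      # assumed: node -> directly downstream nodes
Tree = List[Tuple[str, str]]     # assumed: (parent, child) pairs


def _tree_from_root(edges: Graph, candidate: FrozenSet[str], root: str,
                    order: Callable[[str], tuple]) -> Optional[Tree]:
    """Second predetermined processing: breadth-first traversal from `root`."""
    in_tree = {root}
    tree: Tree = []
    second_queue = deque([root])
    while second_queue:
        node = second_queue.popleft()
        for child in sorted(edges.get(node, ()), key=order):
            if child in candidate and child not in in_tree:
                in_tree.add(child)
                tree.append((node, child))
                second_queue.append(child)
    return tree if in_tree == candidate else None


def find_connection_tree(edges: Graph, queried: Iterable[str],
                         weight: Optional[Dict[str, int]] = None
                         ) -> Optional[Tuple[str, Tree]]:
    """First predetermined processing: expand candidate sets until a tree is found."""
    weights = weight or {}

    def order(node: str) -> tuple:
        return (-weights.get(node, 0), node)   # high weight first, then name

    start = frozenset(queried)
    first_queue = deque([start])
    seen = {start}
    while first_queue:
        candidate = first_queue.popleft()
        for root in sorted(candidate, key=order):
            tree = _tree_from_root(edges, candidate, root, order)
            if tree is not None:
                return root, tree
        # Expand only with upstream nodes that are not yet in the candidate set.
        upstream = {u for u, down in edges.items()
                    if u not in candidate and down & candidate}
        for node in sorted(upstream, key=order):
            expanded = candidate | {node}
            if expanded not in seen:           # discard duplicates
                seen.add(expanded)
                first_queue.append(expanded)
    return None


edges: Graph = {"orders": {"users", "items"}, "items": {"suppliers"},
                "users": set(), "suppliers": set()}
print(find_connection_tree(edges, {"users", "suppliers"}))
```

In this hypothetical schema the queried tables `users` and `suppliers` are not directly connected, so the sketch first tries each one-node expansion and only succeeds once both `items` and `orders` have been added.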
For the specific work flow of the apparatus embodiment shown in fig. 6, reference is made to the related description in the foregoing method embodiment, and details are not repeated.
In summary, with the scheme of the above apparatus embodiment, the required connection tree can be generated automatically, through candidate set selection, expansion and the like, from the nodes where the data queried by the user is located, and the query can then be completed based on the connection tree, which simplifies user operation and lowers the threshold for use. All candidate sets are obtained by expanding the set formed by the nodes where the queried data is located, which ensures that those nodes always appear in the final connection tree; and because expansion proceeds in order of the number of added nodes (0, then 1, then 2, and so on), as few other nodes as possible are introduced, which avoids enlarging the data range covered by the query as far as possible and improves query efficiency. When a candidate set is expanded, nodes may be selected only from the upstream nodes of the nodes in the candidate set, which avoids generating many useless candidate sets and further improves query efficiency. When expanded candidate sets are added to the first queue, when nodes in a candidate set are traversed, and when nodes are added to the second queue, the operations may be performed in descending order of the priority weights of the nodes, which ensures the idempotency of the query and facilitates fine-grained control.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 7 is a block diagram of an electronic device according to the method of the embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the applications described and/or claimed herein.
As shown in Fig. 7, the electronic device includes: one or more processors Y01, a memory Y02, and interfaces for connecting the various components, including a high speed interface and a low speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information for a graphical user interface on an external input/output device (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing some of the necessary operations (e.g., as an array of servers, a group of blade servers, or a multi-processor system). In Fig. 7, one processor Y01 is taken as an example.
The memory Y02 is a non-transitory computer readable storage medium provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform the methods provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the methods provided herein.
The memory Y02, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the methods in the embodiments of the present application. The processor Y01 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory Y02, that is, implements the method in the above-described method embodiment.
The memory Y02 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory Y02 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory Y02 may optionally include a memory remotely located from the processor Y01, and these remote memories may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, blockchain networks, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device Y03 and an output device Y04. The processor Y01, the memory Y02, the input device Y03, and the output device Y04 may be connected by a bus or other means, and are exemplified by being connected by a bus in fig. 7.
The input device Y03 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, or other input devices. The output device Y04 may include a display apparatus, an auxiliary lighting device, a tactile feedback device (e.g., a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display, a light emitting diode display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific integrated circuits, computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a cathode ray tube or a liquid crystal display monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks, wide area networks, blockchain networks, and the internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments are not intended to limit the scope of the present disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (20)

1. A method for querying a relational database, comprising:
determining a node where data inquired by a user is located, wherein the node represents a data table in a relational database;
adding the determined set formed by the nodes into a first queue as a candidate set, and executing the following first predetermined processing:
taking out a candidate set from the first queue according to a first-in first-out principle;
if a connection tree can be generated according to the alternative set, finishing the query according to the connection tree;
if the connection tree cannot be generated according to the alternative set, selecting nodes meeting the expansion requirement, adding the nodes into the alternative set aiming at any selected node to obtain an expanded alternative set, and adding the expanded alternative set into the first queue; if the first queue is not empty, the first predetermined processing is repeatedly executed, wherein the selecting the nodes meeting the expansion requirement comprises: and selecting nodes which are not positioned in the alternative set and are not selected from the nodes upstream of the nodes in the alternative set.
2. The method of claim 1,
the method further comprises the following steps: generating a directed graph corresponding to the relational database, wherein the relational database consists of data tables and connection relations among the data tables, the data tables are abstracted into nodes in the directed graph, and the connection relations are abstracted into edges among the nodes; and determining whether a connection tree can be generated or not according to the directed graph aiming at the alternative set.
3. The method of claim 2,
the connection relationship includes: inner connection, full outer connection, left connection and right connection;
abstracting the connection relationship as an edge between nodes includes: for any two nodes, when the connection relationship is a left connection or a right connection, the connection relationship is abstracted into a unidirectional edge, and when the connection relationship is an internal connection or a full external connection, the connection relationship is abstracted into a bidirectional edge.
4. The method of claim 1,
the method further comprises the following steps: and for any expanded alternative set, determining whether the expanded alternative set is added into the first queue, if so, discarding the expanded alternative set, and if not, adding the expanded alternative set into the first queue.
5. The method of claim 1,
the method further comprises the following steps: and sequentially adding the obtained expanded alternative sets into the first queue according to the priority weight of the added nodes from high to low.
6. The method of claim 2,
if the connection tree can be generated according to the alternative set, completing the query according to the connection tree includes:
traversing each node in the alternative set;
respectively determining whether a connection tree can be generated by taking the node as a root node or not aiming at each traversed node, if so, finishing the query according to the connection tree, and if not, traversing the next node;
and if all the nodes are taken as root nodes and the connection tree cannot be generated, determining that the connection tree cannot be generated according to the alternative set.
7. The method of claim 6,
the respectively determining whether the connection tree can be generated by taking the node as a root node for each traversed node comprises:
adding the node to a second queue and a junction tree, and performing a second predetermined process of:
taking out a node from the second queue according to a first-in first-out principle;
selecting nodes which are positioned in the alternative set and are not positioned in the connection tree from downstream nodes of the nodes, and adding each selected node into the second queue and the connection tree respectively;
determining whether the second queue is empty, and if not, repeatedly executing the second predetermined processing;
and if the second queue is empty and the connection tree does not contain all the nodes in the alternative set, traversing the next node.
8. The method of claim 6,
the traversing the nodes in the candidate set comprises: and traversing each node in the alternative set according to the priority weight of the node from high to low.
9. The method of claim 7,
the adding the selected nodes into the second queue and the connection tree respectively comprises: and sequentially adding the selected nodes into the second queue and the connection tree according to the priority weights of the nodes from high to low.
10. A relational database query apparatus, comprising: a node determining unit and a query processing unit;
the node determining unit is used for determining a node where data inquired by a user is located, and the node represents a data table in a relational database;
the query processing unit is configured to add a set formed by the determined nodes as a candidate set to a first queue, and execute the following first predetermined processing: taking out a candidate set from the first queue according to a first-in first-out principle; if a connection tree can be generated according to the alternative set, finishing the query according to the connection tree; if the connection tree cannot be generated according to the alternative set, selecting nodes meeting the expansion requirement, adding the nodes into the alternative set aiming at any selected node to obtain an expanded alternative set, and adding the expanded alternative set into the first queue; if the first queue is not empty, the first predetermined processing is repeatedly executed, wherein the selecting nodes meeting the expansion requirement comprises: and selecting nodes which are not positioned in the alternative set and are not selected from the nodes upstream of the nodes in the alternative set.
11. The apparatus of claim 10,
the device further comprises: the preprocessing unit is used for generating a directed graph corresponding to the relational database, the relational database consists of data tables and connection relations among the data tables, the data tables are abstracted into nodes in the directed graph, and the connection relations are abstracted into edges among the nodes;
and the query processing unit determines whether a connection tree can be generated or not according to the directed graph aiming at the alternative set.
12. The apparatus of claim 11,
the connection relationship includes: inner connection, full outer connection, left connection and right connection;
the preprocessing unit abstracts the connection relation into a unidirectional edge when the connection relation is a left connection or a right connection and abstracts the connection relation into a bidirectional edge when the connection relation is an internal connection or a full external connection aiming at any two nodes.
13. The apparatus of claim 10,
the query processing unit is further configured to determine, for any expanded candidate set, whether the candidate set has been added to the first queue, if yes, discard the expanded candidate set, and if not, add the expanded candidate set to the first queue.
14. The apparatus of claim 10,
the query processing unit is further configured to add, in order from high to low, the obtained expanded candidate sets to the first queue according to the priority weight of the added node.
15. The apparatus of claim 11,
the query processing unit traverses each node in the alternative set, respectively determines, for each traversed node, whether a connection tree can be generated by taking the node as a root node, if so, completes the query according to the connection tree, and if not, traverses the next node; and if no node, taken as a root node, can generate a connection tree, determines that the connection tree cannot be generated according to the alternative set.
16. The apparatus of claim 15,
the query processing unit adds each traversed node to a second queue and a connection tree, respectively, and performs the following second predetermined processing: taking out a node from the second queue according to a first-in first-out principle; selecting nodes which are positioned in the alternative set and are not positioned in the connection tree from downstream nodes of the nodes, and adding each selected node into the second queue and the connection tree respectively; determining whether the second queue is empty, and if not, repeatedly executing the second predetermined processing; and if the second queue is empty and the connection tree does not contain all the nodes in the alternative set, traversing the next node.
17. The apparatus of claim 15,
the query processing unit traverses the nodes in the candidate set in descending order of their priority weights.
18. The apparatus of claim 16,
the query processing unit adds the selected nodes to the second queue and the connection tree one by one, in descending order of their priority weights.
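The procedure in claims 15 to 18 amounts to a breadth-first search from each possible root. The following Python sketch is one way to read it, not the patented implementation. The names `build_connection_tree`, `downstream`, and `weight` are hypothetical, the tree is represented as a child-to-parent mapping for illustration, and ties in priority weight are broken by node name purely to keep the result deterministic.

```python
from collections import deque


def build_connection_tree(candidate, downstream, weight):
    """Return a child -> parent mapping covering candidate, or None."""
    candidate = set(candidate)

    def order(node):
        # High priority weight first; node name breaks ties (assumption).
        return (-weight.get(node, 0), node)

    # Claims 15 and 17: try each node as the root, highest weight first.
    for root in sorted(candidate, key=order):
        parent = {root: None}          # nodes currently in the connection tree
        second_queue = deque([root])
        while second_queue:
            node = second_queue.popleft()  # first in, first out
            # Claim 16: downstream nodes in the candidate set, not yet in the tree.
            picked = [n for n in downstream.get(node, ())
                      if n in candidate and n not in parent]
            # Claim 18: add the selected nodes from high to low weight.
            for child in sorted(picked, key=order):
                parent[child] = node
                second_queue.append(child)
        if set(parent) == candidate:
            return parent              # the tree covers the whole candidate set
    return None                        # no root yields a connection tree


downstream = {
    "orders": {"users", "items"},
    "users": {"regions"},
    "regions": {"users"},
    "items": set(),
}
weight = {"orders": 1, "users": 4, "items": 2, "regions": 3}
tree = build_connection_tree({"orders", "users", "regions"}, downstream, weight)
```

In this example `users` and `regions` are tried as roots first because of their higher weights, but neither can reach `orders`, so the tree is rooted at `orders`.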
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
20. A non-transitory computer readable storage medium storing computer instructions, wherein,
the computer instructions are for causing the computer to perform the method of any one of claims 1-9.
CN202010093639.4A 2020-02-14 2020-02-14 Relational database query method, device, electronic equipment and storage medium Active CN111159316B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010093639.4A CN111159316B (en) 2020-02-14 2020-02-14 Relational database query method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010093639.4A CN111159316B (en) 2020-02-14 2020-02-14 Relational database query method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111159316A CN111159316A (en) 2020-05-15
CN111159316B CN111159316B (en) 2023-03-14

Family

ID=70565695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010093639.4A Active CN111159316B (en) 2020-02-14 2020-02-14 Relational database query method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111159316B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111865970B (en) * 2020-07-17 2022-09-16 北京百度网讯科技有限公司 Method and apparatus for implementing interface idempotency

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2743462C (en) * 1999-07-30 2012-10-16 Basantkumar John Oommen A method of generating attribute cardinality maps
US6801904B2 (en) * 2001-10-19 2004-10-05 Microsoft Corporation System for keyword based searching over relational databases
CN100452047C (en) * 2005-12-27 2009-01-14 国际商业机器公司 System and method for executing search in a relational database
CN105224566B (en) * 2014-06-25 2019-03-01 国际商业机器公司 The method and system of injunctive graphical query is supported on relational database
US9507824B2 (en) * 2014-08-22 2016-11-29 Attivio Inc. Automated creation of join graphs for unrelated data sets among relational databases
CN104573039A (en) * 2015-01-19 2015-04-29 北京航天福道高技术股份有限公司 Keyword search method of relational database
CN107092656B (en) * 2017-03-23 2019-12-03 中国科学院计算技术研究所 A kind of tree data processing method and system
US11138195B2 (en) * 2017-08-31 2021-10-05 Salesforce.Com, Inc. Systems and methods for translating n-ary trees to binary query trees for query execution by a relational database management system

Also Published As

Publication number Publication date
CN111159316A (en) 2020-05-15

Similar Documents

Publication Publication Date Title
WO2022037039A1 (en) Neural network architecture search method and apparatus
CN110806923A (en) Parallel processing method and device for block chain tasks, electronic equipment and medium
CN111177476B (en) Data query method, device, electronic equipment and readable storage medium
US11886410B2 (en) Database live reindex
EP3896580A1 (en) Method and apparatus for generating conversation, electronic device, storage medium and computer program product
CN104268295A (en) Data query method and device
CN111488492B (en) Method and device for searching graph database
CN110619002A (en) Data processing method, device and storage medium
EP3825865A2 (en) Method and apparatus for processing data
CN111177339A (en) Dialog generation method and device, electronic equipment and storage medium
CN112527474A (en) Task processing method and device, equipment, readable medium and computer program product
CN111291082B (en) Data aggregation processing method, device, equipment and storage medium
CN111159316B (en) Relational database query method, device, electronic equipment and storage medium
CN111259090A (en) Graph generation method and device of relational data, electronic equipment and storage medium
CN111666417B (en) Method, device, electronic equipment and readable storage medium for generating synonyms
CN111428489B (en) Comment generation method and device, electronic equipment and storage medium
CN111414487B (en) Method, device, equipment and medium for associated expansion of event theme
CN111259058B (en) Data mining method, data mining device and electronic equipment
CN112069155A (en) Data multidimensional analysis model generation method and device
CN111290714A (en) Data reading method and device
CN111177479A (en) Method and device for acquiring feature vectors of nodes in relational network graph
CN111680508B (en) Text processing method and device
CN112270412B (en) Network operator processing method and device, electronic equipment and storage medium
CN112817965A (en) Data splicing method and device, electronic equipment and storage medium
CN111523000A (en) Method, device, equipment and storage medium for importing data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant