CN114090617A - Plan execution method, device, database node and medium - Google Patents

Publication number
CN114090617A
Authority
CN
China
Prior art keywords
plan
operator
execution
sending
executed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111442810.9A
Other languages
Chinese (zh)
Inventor
宋鑫
韩朱忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a plan execution method, a plan execution device, a database node and a medium. The method comprises the following steps: performing partition pruning on a send operator in a plan to be executed, based on pruning information in a query statement, to obtain a partition pruning result, wherein the partition pruning result comprises an execution node position of the send operator; constructing an executable plan based on the plan to be executed and the partition pruning result; and determining an execution mode of a data distribution operator in the executable plan according to the execution node position of the send operator, and executing the executable plan according to the execution mode. Because the execution mode of the data distribution operator is selected according to the execution node position of the send operator, and the executable plan is executed in the selected mode, the method can effectively improve the efficiency of plan execution.

Description

Plan execution method, device, database node and medium
Technical Field
The embodiment of the invention relates to the technical field of distributed databases, in particular to a plan execution method, a plan execution device, a database node and a medium.
Background
One architecture for a distributed database cluster is as follows: all nodes in the cluster are peers, and each node provides functions such as plan generation and plan execution. Data is distributed among the nodes in the cluster according to a certain rule. After a user request R reaches a certain node X, if the data required by R resides only on node X, the query plan can be executed to completion locally on node X; if the data required by R involves nodes other than node X, the query plan must be completed cooperatively by node X and those other nodes. In a distributed database cluster, the former type of query plan is usually called a local plan (Local Plan), and the latter type is called a distributed plan (Distributed Plan) or a remote plan (Remote Plan). A local plan involves no data exchange or transmission operations and can be executed more efficiently; a distributed plan introduces communication overhead during execution, but it adapts to any distribution of data across nodes and therefore has a wider range of application.
When existing distributed database products reuse plans, one approach is to execute every query plan as a distributed plan, regardless of whether it is actually a local plan or a distributed plan. This adds unnecessary communication overhead when a query plan that is actually local is executed, sacrificing execution efficiency for data on the local node. The other approach is to distinguish local plans from distributed plans and to store and match multiple plans for the same query statement when the data is located on different nodes, which makes the number of cached plans excessively large.
Therefore, how to reduce the number of plan caches and improve the efficiency of plan execution is a technical problem to be solved at present.
Disclosure of Invention
The embodiment of the invention provides a plan execution method, a plan execution device, a database node and a medium, which are used for improving the efficiency of plan execution.
In a first aspect, an embodiment of the present invention provides a plan execution method, including:
performing partition pruning on a send operator in a plan to be executed, based on pruning information in a query statement, to obtain a partition pruning result, wherein the partition pruning result comprises an execution node position of the send operator;
constructing an executable plan based on the plan to be executed and the partition pruning result;
and determining an execution mode of a data distribution operator in the executable plan according to the execution node position of the send operator, and executing the executable plan according to the execution mode.
In a second aspect, an embodiment of the present invention further provides a plan execution device, including:
a pruning module, configured to perform partition pruning on a send operator in a plan to be executed, based on pruning information in a query statement, to obtain a partition pruning result, wherein the partition pruning result comprises an execution node position of the send operator;
a construction module, configured to construct an executable plan based on the plan to be executed and the partition pruning result;
and an execution module, configured to determine an execution mode of a data distribution operator in the executable plan according to the execution node position of the send operator, and to execute the executable plan according to the execution mode.
In a third aspect, an embodiment of the present invention further provides a database node, including:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the plan execution method provided by the embodiment of the invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the plan execution method provided by the embodiment of the present invention.
The embodiment of the invention provides a plan execution method, a plan execution device, a database node and a medium. First, partition pruning is performed on a send operator in a plan to be executed, based on pruning information in a query statement, to obtain a partition pruning result, wherein the partition pruning result comprises an execution node position of the send operator; then an executable plan is constructed based on the plan to be executed and the partition pruning result; finally, an execution mode of a data distribution operator in the executable plan is determined according to the execution node position of the send operator, and the executable plan is executed according to the execution mode. In this technical scheme, the execution mode of the data distribution operator is selected according to the execution node position of the send operator and the executable plan is executed in the selected mode, so that plan execution efficiency can be effectively improved.
Drawings
Fig. 1A is a schematic diagram of a distributed database cluster architecture according to an embodiment of the present invention;
Fig. 1B is a schematic diagram illustrating an implementation of a query plan corresponding to a query statement Q1 according to an embodiment of the present invention;
Fig. 1C is a schematic diagram illustrating an implementation of a query plan corresponding to a query statement Q2 according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating a plan execution method according to an embodiment of the present invention;
Fig. 3 is a flowchart illustrating a plan execution method according to a second embodiment of the present invention;
Fig. 4 is a schematic diagram illustrating implementation of plan cache matching according to a second embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a plan execution device according to a third embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a database node according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like. In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
The term "include" and variations thereof as used herein are intended to be open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment".
It should be noted that the concepts of "first", "second", etc. mentioned in the present invention are only used for distinguishing corresponding contents, and are not used for limiting the order or interdependence relationship.
It is noted that the modifiers "a", "an", and "the" in the present invention are illustrative rather than limiting, and those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.
Building on existing plan reuse methods, the embodiment of the invention provides a method that improves plan reuse efficiency in a distributed database environment by changing the execution mode of the data distribution operator: when the data is located on the local node, the data distribution operator is executed in pipeline mode; when the data is located on nodes other than the local node, the data distribution operator is executed in normal mode. In this way plan reuse efficiency is improved and the number of cached plans is reduced.
Fig. 1A is a schematic diagram of a distributed database cluster architecture according to an embodiment of the present invention. As shown in Fig. 1A, a distributed database cluster may contain n (n > 1) nodes, such as bp1, bp2, …, bpn (a node may be understood as a server). Each node has its own Central Processing Unit (CPU), memory, and disk; the nodes share no storage and can be connected through a high-speed network. A user can connect to any of the nodes bp1 to bpn through a client (e.g., client 1, …, client n). The client may connect to a node directly, or a third-party component may select the node to connect to according to load balancing or a preset routing policy; this is not limited herein.
The user table data of the database may be distributed over one or more nodes. For example, a range-partitioned table T1 is created with two partitions, p1 and p2, which are stored on nodes bp1 and bp2, respectively. The table creation statement for T1 may be expressed as:
create table T1(c1 int, c2 int) partition by range(c1)
(partition p1 values less than(100) storage(on bp1),
partition p2 values less than(200) storage(on bp2));
This statement creates table T1 with columns c1 and c2 and partitions it by the range of c1: rows whose c1 value is less than 100 (i.e., c1 < 100) fall into partition p1 and are stored on node bp1, and rows whose c1 value is at least 100 and less than 200 (i.e., 100 ≤ c1 < 200) fall into partition p2 and are stored on node bp2.
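The mapping from a c1 value to a partition and its storage node can be sketched in Python as follows. This is a minimal illustration based only on the definition of T1 given here; the names T1_BOUNDS, T1_PARTITIONS, and locate_t1_partition are hypothetical and are not part of the described database.

```python
from bisect import bisect_right

# Exclusive upper bounds of the range partitions of T1, in ascending order,
# paired with (partition name, storage node). Hypothetical metadata layout.
T1_BOUNDS = [100, 200]
T1_PARTITIONS = [("p1", "bp1"), ("p2", "bp2")]


def locate_t1_partition(c1: int) -> tuple[str, str]:
    """Return (partition, node) for a c1 value of table T1."""
    # bisect_right finds the first bound strictly greater than c1,
    # which matches the "values less than(bound)" semantics.
    idx = bisect_right(T1_BOUNDS, c1)
    if idx == len(T1_BOUNDS):
        raise ValueError(f"c1={c1} is outside every partition range of T1")
    return T1_PARTITIONS[idx]


print(locate_t1_partition(50))   # ('p1', 'bp1')
print(locate_t1_partition(150))  # ('p2', 'bp2')
```

A sorted list of upper bounds with binary search is only one possible representation; it is chosen here because it mirrors the "values less than" clauses directly.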
An ordinary (non-partitioned) table T2 is created, and all data in T2 is stored on node bp3. The table creation statement for T2 may be expressed as:
create table T2(d1 int, d2 int) storage(on bp3);
This statement creates table T2 with columns d1 and d2, all of whose data is stored on node bp3.
When the user connects to node bp1 through client 1 and issues a query statement Q1, Q1 may be expressed as select * from T1 where c1 = 50, which queries table T1 for the rows whose c1 value is 50. According to the definition of table T1, the data satisfying the query condition (i.e., c1 = 50) is located in partition p1, which is stored on node bp1; partition pruning is performed in the planning stage.
Fig. 1B is a schematic diagram illustrating an implementation of a query plan corresponding to a query statement Q1 according to an embodiment of the present invention. As shown in Fig. 1B, SCAN(T1_p1) denotes the scan operator (SCAN), which scans the data of partition p1 of table T1; FILTER(c1 = 50) denotes the filter operator (FILTER), which filters data according to the filter condition (here the query condition c1 = 50); PROJECT denotes the projection operator, which computes the query item expressions returned to the client. Specifically, the SCAN operator first scans the data of partition p1 of table T1 and passes the rows up to the FILTER operator; the FILTER operator applies the filter condition c1 = 50 and passes the rows that satisfy it up to the PROJECT operator; finally, the PROJECT operator completes the query and returns the query result to the client.
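The bottom-up data flow of Fig. 1B can be sketched with Python generators. The sample rows in T1_P1 and all function names are assumptions made for illustration only; a real executor would read the partition from storage.

```python
from typing import Iterable, Iterator

Row = dict  # a row is modeled as a column-name -> value mapping

# Hypothetical contents of partition p1 of T1 (sample data for illustration).
T1_P1 = [{"c1": 10, "c2": 1}, {"c1": 50, "c2": 2}, {"c1": 50, "c2": 3}]


def scan(partition: Iterable[Row]) -> Iterator[Row]:
    """SCAN: produce every row of the partition."""
    yield from partition


def filter_rows(child: Iterable[Row], c1_value: int) -> Iterator[Row]:
    """FILTER: keep only the rows that satisfy c1 = c1_value."""
    return (row for row in child if row["c1"] == c1_value)


def project(child: Iterable[Row], columns: list[str]) -> Iterator[tuple]:
    """PROJECT: compute the query items returned to the client."""
    return (tuple(row[col] for col in columns) for row in child)


def run_q1() -> list[tuple]:
    """Local plan of Q1: SCAN(T1_p1) -> FILTER(c1 = 50) -> PROJECT."""
    return list(project(filter_rows(scan(T1_P1), 50), ["c1", "c2"]))


print(run_q1())  # [(50, 2), (50, 3)]
```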
When the user changes the query condition slightly, a query statement Q2 is obtained. Q2 may be expressed as select * from T1 where c1 = 150, which queries table T1 for the rows whose c1 value is 150. According to the definition of table T1, the data satisfying the query condition (i.e., c1 = 150) is located in partition p2, which is stored on node bp2. Because the node on which the queried data resides has changed, the query plan of Q1 shown in Fig. 1B cannot be reused directly for Q2.
Fig. 1C is a schematic diagram illustrating an implementation of a query plan corresponding to a query statement Q2 according to an embodiment of the present invention. As shown in Fig. 1C, SCAN(T1_p2) denotes the SCAN operator, which scans the data of partition p2 of table T1; FILTER(c1 = 150) denotes the FILTER operator, which filters data according to the filter condition (here the query condition c1 = 150); SEND denotes the send operator and RECV denotes the receive operator, which send and receive data, respectively. Compared with the query plan of Q1, the query plan of Q2 additionally contains a SEND operator and a RECV operator, which transfer data across nodes (e.g., between nodes bp1 and bp2). Specifically, a local plan consisting of the SEND-FILTER-SCAN operators is first sent to the node corresponding to the query condition c1 = 150, that is, node bp2. There the SCAN operator scans the data of partition p2 of table T1 and passes the rows to the FILTER operator; the FILTER operator applies the filter condition c1 = 150 and passes the rows that satisfy it up to the SEND operator. The RECV operator waits to receive the data transferred by the SEND operator. Finally, after the SEND operator has sent all the data and the RECV operator has received it, the RECV operator passes the data to the PROJECT operator, which completes the query and returns the query result to the client.
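The extra SEND and RECV steps in the plan of Q2 can be sketched as follows, assuming that an in-memory queue stands in for the network link between bp2 and bp1 and that pickle stands in for the wire format. The sample rows, the END sentinel, and all function names are hypothetical.

```python
import pickle
from queue import Queue

END = None  # sentinel marking the end of the data stream (assumption)

# Hypothetical rows of partition p2 of T1, stored on node bp2.
T1_P2 = [{"c1": 150, "c2": 7}, {"c1": 180, "c2": 8}]


def run_remote_fragment(channel: Queue, c1_value: int) -> None:
    """Fragment shipped to bp2: SCAN(T1_p2) -> FILTER -> SEND."""
    for row in T1_P2:                       # SCAN
        if row["c1"] == c1_value:           # FILTER
            channel.put(pickle.dumps(row))  # SEND: serialize and transmit
    channel.put(END)                        # tell RECV that all data was sent


def run_local_fragment(channel: Queue) -> list[tuple]:
    """Fragment on bp1: RECV -> PROJECT."""
    result = []
    while (msg := channel.get()) is not END:   # RECV: wait for the next message
        row = pickle.loads(msg)
        result.append((row["c1"], row["c2"]))  # PROJECT
    return result


def run_q2() -> list[tuple]:
    channel: Queue = Queue()  # stands in for the network between bp2 and bp1
    run_remote_fragment(channel, 150)
    return run_local_fragment(channel)


print(run_q2())  # [(150, 7)]
```

The serialization and queue hops are exactly the work that is unnecessary when the data already resides on the node that received the query.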
Clearly, if Q1 directly adopted the plan of Q2, the query would still complete correctly, but the cost of sending the SEND-FILTER-SCAN local plan and the communication cost between SEND and RECV would be added.
In existing distributed database systems, for different partition pruning results of the same query statement, either the Q2-style query plan is used uniformly, or two separate sets of cached plans, local and distributed, are maintained according to the partition pruning result. The former wastes performance unnecessarily when the partition happens to be local, and the latter makes the plan cache excessively large.
The embodiment of the invention provides a method in which the plan generation stage does not distinguish local plans from distributed plans and uniformly retains the data distribution operators (i.e., the SEND/RECV operators); in the plan reuse stage, the execution mode of the SEND/RECV operators is determined according to their actual partition positions (i.e., the nodes on which they are actually located), and the plan is executed in that mode. For example, for the query plans corresponding to Q1 and Q2 above, the plan generation stage in the embodiment of the present invention uniformly produces the query plan corresponding to Q2. Before the plan is executed, the node position of each SEND operator is calculated; if all calculated node positions are the local node, the execution mode of all SEND/RECV operators in the query plan is set to pipeline mode; otherwise, the execution mode of the SEND/RECV operators is set to normal mode. In this way plan reuse efficiency is improved and the number of cached plans is reduced.
Embodiment One
Fig. 2 is a flowchart of a plan execution method according to the first embodiment of the present invention. The method is applicable to cases in which a plan is executed by changing the execution mode of a data distribution operator so as to improve plan execution efficiency. The method may be executed by a plan execution apparatus, which may be implemented in software and/or hardware and is generally integrated in any device that provides database node management functions.
As shown in Fig. 2, the plan execution method according to the first embodiment of the present invention includes the following steps:
S110, performing partition pruning on a send operator in the plan to be executed, based on pruning information in the query statement, to obtain a partition pruning result, wherein the partition pruning result comprises the execution node position of the send operator.
In this embodiment, the query statement may refer to a query statement input by a user through a client. The pruning information may refer to a query condition in the query statement, such as c1 = 50 or c1 = 150 in the examples above. The plan to be executed may be an existing plan matched from the plan cache according to the query statement; if no plan is matched in the plan cache, the plan to be executed may be a query plan newly generated from the query statement.
The send operator refers to the SEND operator. Partition pruning may refer to the process of pruning the partitions under a SEND operator according to the pruning information to obtain the corresponding partition and partition position; the partition position obtained in this process is the partition pruning result. The partition may be partition p1 or p2 in the examples above, and the partition position is the node on which the partition is located, such as node bp1 for p1 or node bp2 for p2. The execution node position is the node on which the SEND operator performs data transmission: the node on which the SEND operator's partition is located is the node on which the SEND operator executes, i.e., its execution node position. The partition pruning result therefore includes the execution node position of the SEND operator. The plan to be executed may include a plurality of SEND operators, each of which has its own partition pruning result, that is, its own execution node position.
S120, constructing an executable plan based on the plan to be executed and the partition pruning result.
In this embodiment, the executable plan may refer to a plan constructed by adding the obtained partition pruning result to the plan to be executed. The executable plan therefore contains each SEND operator together with its execution node position.
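One possible way to represent steps S110 and S120 is sketched below. The classes SendOp, PendingPlan, and ExecutablePlan, the PARTITION_MAP metadata, and the use of a single integer as the pruning key are assumptions for illustration; the embodiment does not prescribe these structures.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SendOp:
    op_id: int
    table: str  # table scanned beneath this SEND operator


@dataclass(frozen=True)
class PendingPlan:
    """Plan to be executed: reusable, carries no node positions."""
    sql: str
    send_ops: tuple[SendOp, ...]


@dataclass
class ExecutablePlan:
    """Pending plan plus the partition pruning result of each SEND operator."""
    plan: PendingPlan
    send_nodes: dict[int, str] = field(default_factory=dict)  # op_id -> node


# Hypothetical partition metadata: table -> [(exclusive upper bound, node)].
PARTITION_MAP = {"T1": [(100, "bp1"), (200, "bp2")]}


def prune(send_op: SendOp, key_value: int) -> str:
    """S110: return the execution node position of one SEND operator."""
    for upper_bound, node in PARTITION_MAP[send_op.table]:
        if key_value < upper_bound:
            return node
    raise ValueError(f"{key_value} matches no partition of {send_op.table}")


def build_executable_plan(plan: PendingPlan, key_value: int) -> ExecutablePlan:
    """S120: attach the pruning result to the plan to be executed."""
    nodes = {op.op_id: prune(op, key_value) for op in plan.send_ops}
    return ExecutablePlan(plan, nodes)


cached = PendingPlan("select * from T1 where c1 = ?", (SendOp(1, "T1"),))
print(build_executable_plan(cached, 50).send_nodes)   # {1: 'bp1'}
print(build_executable_plan(cached, 150).send_nodes)  # {1: 'bp2'}
```

Keeping the node positions outside the pending plan is what allows the same cached plan object to serve both query conditions in this sketch.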
S130, determining an execution mode of a data distribution operator in the executable plan according to the execution node position of the send operator, and executing the executable plan according to the execution mode.
In this embodiment, the data distribution operator may refer to the SEND/RECV operator, which may be understood as an operator pair formed by a SEND operator and a RECV operator and used for sending and receiving data. The execution mode may refer to the mode in which the SEND/RECV operator runs when the plan is executed, and the execution mode of the SEND/RECV operator in the executable plan can be determined according to the execution node position of the SEND operator.
Optionally, the execution mode includes a pipeline mode and a normal mode.
Pipeline mode means that the SEND/RECV operator is treated as a pipeline operator: data delivered by its child operator is passed directly to the corresponding parent operator without any other processing. A parent operator may be understood as the operator to which data is passed during plan execution. Taking Fig. 1C as an example, the SEND operator passes data to the RECV operator, and the RECV operator passes data to the PROJECT operator; that is, the parent operator of the SEND operator is the RECV operator, and the parent operator of the RECV operator is the PROJECT operator. It follows that the parent operator of the SEND/RECV operator as a whole is the PROJECT operator.
A child operator is defined relative to a parent operator and may be understood as the operator from which data originates during plan execution. Again taking Fig. 1C as an example, the SEND operator receives data from the FILTER operator, and the RECV operator receives data from the SEND operator; that is, the child operator of the SEND operator is the FILTER operator, and the child operator of the RECV operator is the SEND operator. It follows that the child operator of the SEND/RECV operator as a whole is the FILTER operator.
Normal mode means that the SEND operator of a SEND/RECV operator transfers data to the RECV operator across nodes.
Given the execution mode, data transmission by the SEND/RECV operator is controlled according to that mode, and the executable plan is executed accordingly.
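The difference between the two execution modes can be sketched as follows. In this hedged illustration, serialization with pickle stands in for cross-node transfer, and wire_log is a hypothetical list that records the size of each transmitted payload; neither is taken from the embodiment.

```python
import pickle
from enum import Enum
from typing import Iterable, Iterator


class ExecMode(Enum):
    PIPELINE = "pipeline"
    NORMAL = "normal"


def send_recv(child: Iterable[dict], mode: ExecMode,
              wire_log: list[int]) -> Iterator[dict]:
    """SEND/RECV pair placed between a child operator and a parent operator."""
    if mode is ExecMode.PIPELINE:
        # Pipeline mode: hand rows from the child straight to the parent,
        # with no serialization and no communication.
        yield from child
        return
    for row in child:
        payload = pickle.dumps(row)    # SEND side: serialize for the network
        wire_log.append(len(payload))  # simulated transmission
        yield pickle.loads(payload)    # RECV side: deserialize for the parent


rows = [{"c1": 50, "c2": 2}]
log: list[int] = []
print(list(send_recv(rows, ExecMode.PIPELINE, log)), len(log))  # rows, 0
print(list(send_recv(rows, ExecMode.NORMAL, log)), len(log))    # rows, 1
```

Both modes deliver the same rows to the parent operator; only the amount of work done in between differs.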
The plan execution method of this embodiment first performs partition pruning on a send operator in a plan to be executed, based on pruning information in a query statement, to obtain a partition pruning result, wherein the partition pruning result comprises an execution node position of the send operator; it then constructs an executable plan based on the plan to be executed and the partition pruning result; finally, it determines the execution mode of the data distribution operator in the executable plan according to the execution node position of the send operator and executes the executable plan according to the execution mode. Because the execution mode of the data distribution operator is selected according to the execution node position of the send operator, and the executable plan is executed in the selected mode, the method can effectively improve the efficiency of plan execution.
Embodiment Two
Fig. 3 is a schematic flowchart of a plan execution method according to a second embodiment of the present invention, which further refines the above embodiments. This embodiment describes in detail the processes of determining the execution mode of a data distribution operator in an executable plan and of executing the executable plan according to the execution mode. For technical details not described in this embodiment, refer to any of the above embodiments.
As shown in Fig. 3, the plan execution method provided in the second embodiment of the present invention includes the following steps:
S210, performing plan cache matching on the obtained query statement.
S220, determining a plan to be executed according to the matching result.
In this embodiment, the plan cache may refer to existing execution plans, cached in the database, for certain query statements. The plan to be executed may refer to a plan that is waiting to be executed.
For example, performing plan cache matching on the obtained query statement and determining the plan to be executed according to the matching result may be understood as follows: for a query statement obtained from the client, the plan cache is searched for an existing execution plan that can be used for the query statement, and the outcome of the search is the matching result. If a corresponding existing execution plan is found, it is used as the plan to be executed; if none is found, a plan is generated from the query statement and used as the plan to be executed.
Optionally, determining the plan to be executed according to the matching result includes: if the matching result is that the matching is successful, acquiring the matched plan from the plan cache as a plan to be executed; and if the matching result is matching failure, performing plan generation and plan optimization on the query statement, and taking the obtained plan as a plan to be executed.
The plan to be executed is determined according to the matching result of plan cache matching. If the matching succeeds, that is, a corresponding existing execution plan is found in the plan cache, the matched plan obtained from the plan cache is used as the plan to be executed. If the matching fails, that is, no corresponding existing execution plan is found in the plan cache, plan generation and plan optimization are performed on the query statement, and the resulting plan is used as the plan to be executed.
Fig. 4 is a schematic diagram illustrating implementation of plan cache matching according to a second embodiment of the present invention. As shown in Fig. 4, the obtained query statement is first parsed; plan cache matching is then performed on the query statement according to the parsing result; finally, if the matching succeeds, the matched execution plan is used as the plan to be executed; if the matching fails, semantic analysis, plan generation, and plan optimization are performed on the query statement, and the resulting plan is used as the plan to be executed.
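The matching flow of Fig. 4 can be sketched as a dictionary lookup. How statements are matched against cached plans is not specified in this embodiment; the sketch assumes that statements differing only in integer literals share one cache entry, and that a newly generated plan is stored for later reuse. The names cache_key and PlanCache and the plan string are hypothetical.

```python
import re

_INT_LITERAL = re.compile(r"\b\d+\b")  # standalone integers, not the 1 in "T1"


def cache_key(sql: str) -> str:
    """Normalize case and whitespace and replace integer literals with '?'."""
    return _INT_LITERAL.sub("?", " ".join(sql.lower().split()))


class PlanCache:
    def __init__(self) -> None:
        self._plans: dict[str, str] = {}
        self.generated = 0  # number of times plan generation actually ran

    def get_pending_plan(self, sql: str) -> str:
        key = cache_key(sql)
        plan = self._plans.get(key)
        if plan is None:  # matching failed: generate, optimize, and cache
            plan = self._generate_and_optimize(key)
            self._plans[key] = plan
        return plan       # matching succeeded, or a freshly generated plan

    def _generate_and_optimize(self, key: str) -> str:
        self.generated += 1
        # Placeholder for semantic analysis, plan generation and optimization;
        # the SEND/RECV operators are always retained in the generated plan.
        return f"PROJECT <- RECV <- SEND <- FILTER <- SCAN [{key}]"


cache = PlanCache()
plan_q1 = cache.get_pending_plan("select * from T1 where c1 = 50")
plan_q2 = cache.get_pending_plan("select * from T1 where c1 = 150")
print(plan_q1 is plan_q2, cache.generated)  # True 1
```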
S230, performing partition clipping on the sending operator in the plan to be executed based on the clipping information in the query statement to obtain a partition clipping result, wherein the partition clipping result includes the execution node position of the sending operator.
S240, constructing an executable plan based on the plan to be executed and the partition clipping result.
In the present embodiment, an executable plan is constructed based on the plan to be executed and the partition clipping result. In the process of converting the plan to be executed into the executable plan, the execution node position after partition clipping can be calculated for all the SEND operators encountered, and on the basis, the execution mode of the SEND/RECV operators in the executable plan is determined according to the execution node position result.
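To make the notion of an execution node position after partition clipping more concrete, the following sketch assumes a table that is range-partitioned on an integer key, with each partition stored on one node. The Partition type, the clip_partitions function and the restriction to an equality predicate are illustrative assumptions; the embodiment does not prescribe how clipping information is represented.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Partition:
    low: int    # inclusive lower bound of the partition key
    high: int   # exclusive upper bound of the partition key
    node: str   # node that stores this partition


def clip_partitions(partitions: List[Partition],
                    key_equals: Optional[int]) -> Set[str]:
    """Return the execution node positions left after partition clipping.

    key_equals stands for the clipping information in the query statement
    (here only an equality predicate on the partition key). None means the
    statement has no usable predicate, so no partition can be clipped.
    """
    if key_equals is None:
        return {p.node for p in partitions}
    return {p.node for p in partitions if p.low <= key_equals < p.high}


partitions = [Partition(0, 100, "node1"), Partition(100, 200, "node2")]
clipped = clip_partitions(partitions, 150)
unclipped = clip_partitions(partitions, None)
```

The resulting node set is what the later steps compare against the local node to choose the execution mode of the SEND/RECV operator.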
S250, judging whether the execution node positions of the sending operator are all local nodes; if so, executing S260; if not, executing S270.
S260, determining that the execution mode of the data distribution operator in the executable plan is the pipeline mode.
In this embodiment, if the execution node locations of the sending operator are all local nodes, it may be determined that the execution mode of the data distribution operator (i.e., SEND/RECV operator) in the executable plan is the pipe mode.
S270, determining that the execution mode of the data distribution operator in the executable plan is the normal mode.
In this embodiment, if the execution node positions of the sending operator are not all local nodes, for example, when some or all of them are nodes other than the local node, the execution mode of the SEND/RECV operator in the executable plan may be determined to be the normal mode.
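The decision made in S250 to S270 can be sketched as a single function. The names ExecMode and choose_exec_mode are hypothetical, and treating an empty set of execution node positions as the pipe case is an assumption that the text does not address.

```python
from enum import Enum
from typing import Iterable


class ExecMode(Enum):
    PIPE = "pipe"
    NORMAL = "normal"


def choose_exec_mode(send_node_positions: Iterable[str],
                     local_node: str) -> ExecMode:
    """Pick the execution mode of the SEND/RECV operator.

    Pipe mode is chosen only when every execution node position of the
    sending operator is the local node; otherwise normal mode is chosen.
    Assumption: an empty position set also yields pipe mode, because
    all() over an empty iterable is True.
    """
    if all(node == local_node for node in send_node_positions):
        return ExecMode.PIPE
    return ExecMode.NORMAL


all_local = choose_exec_mode({"node1"}, local_node="node1")
mixed = choose_exec_mode({"node1", "node2"}, local_node="node1")
all_remote = choose_exec_mode({"node2"}, local_node="node1")
```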
S280, executing the executable plan according to the execution mode.
In this embodiment, different execution modes may correspond to different ways of executing the executable plan.
Optionally, executing the executable plan according to the execution mode includes: if the execution mode is the pipeline mode, executing the executable plan in the stand-alone mode, wherein the data distribution operator serves as a pipeline operator and is used for sending the received data to the corresponding parent operator.
The stand-alone mode may refer to executing the executable plan directly at the local node, without sending the executable plan to any node. If the execution mode is the pipeline mode, the executable plan can be executed in the stand-alone mode, in which the SEND/RECV operator in the executable plan acts as a pipeline operator: it performs no processing on the data and directly sends the received data to the corresponding parent operator.
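A pass-through operator of this kind can be sketched with Python generators. The operator functions scan, filter_rows and pipe and the row layout are illustrative assumptions; the only point shown is that, in pipeline mode, the data distribution operator hands each row from its child to its parent unchanged.

```python
from typing import Callable, Iterable, Iterator, Tuple

Row = Tuple[int, str]


def scan(table: Iterable[Row]) -> Iterator[Row]:
    """Child operator: read rows from a local table."""
    yield from table


def filter_rows(child: Iterator[Row],
                predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Child operator: keep only rows that satisfy the predicate."""
    return (row for row in child if predicate(row))


def pipe(child: Iterator[Row]) -> Iterator[Row]:
    """SEND/RECV in pipeline mode: forward each row unchanged to the parent."""
    for row in child:
        yield row


table = [(1, "a"), (2, "b"), (3, "c")]
# The whole plan runs on the local node; nothing is sent over the network.
result = list(pipe(filter_rows(scan(table), lambda row: row[0] >= 2)))
```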
Optionally, executing the executable plan according to the execution mode includes: if the execution mode is the normal mode, sending the sub-plan taking the sending operator as a root node to the corresponding execution node position; sending the data uploaded by the corresponding child operator to the corresponding receiving operator according to the destination node setting through the sending operator; after the receiving operator receives the data, the data is sent to the corresponding parent operator until the receiving stop condition is met; wherein, the receiving stop condition is that the data received by the receiving operator contains a stop mark.
In the example of the plan shown in fig. 1C, the sub-plan with the SEND operator as the root node can be understood as the partial plan composed of the SEND-FILTER-SCAN operators. The destination node may refer to the node where the receiving operator (i.e., the RECV operator) is located, and that node may be the node visited by the user, that is, the node that first obtains the user's query statement from the client.
Specifically, if the execution mode is the normal mode, the sub-plan with the SEND operator as the root node may be sent to the execution node position corresponding to the SEND operator; then, the data uploaded by the corresponding child operator is sent to the corresponding RECV operator according to the destination node setting through the SEND operator; and finally, after receiving the data, the RECV operator sends the data to the corresponding parent operator until the receiving stop condition is met.
The reception stop condition may be that the data received by the RECV operator contains a stop mark. The stop mark may be a mark attached to the last data sent by the SEND operator to indicate that the data is the last data sent. The stop mark may be set flexibly by a person skilled in the art, for example as a mark formed by a string of characters; its specific form is not limited herein. If the data received by the RECV operator contains the stop mark, the data is the last data sent by the SEND operator, and the RECV operator can stop receiving after receiving it.
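The normal-mode exchange between a SEND operator and a RECV operator, including the stop mark, can be sketched as follows. An in-process queue stands in for the network connection between nodes, and the Message type, the batch size and the boolean form of the stop mark are assumptions made for illustration only.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Message:
    rows: List[int]
    stop: bool = False  # stop mark: True only on the last data sent


def send_operator(child_rows: Iterable[int],
                  channel: "queue.Queue[Message]",
                  batch_size: int = 2) -> None:
    """SEND: forward data uploaded by the child operator in batches."""
    batch: List[int] = []
    for row in child_rows:
        if len(batch) == batch_size:
            channel.put(Message(batch))
            batch = []
        batch.append(row)
    # The last message carries the remaining rows together with the stop mark.
    channel.put(Message(batch, stop=True))


def recv_operator(channel: "queue.Queue[Message]") -> Iterator[int]:
    """RECV: pass rows to the parent operator until the stop mark arrives."""
    while True:
        msg = channel.get()
        yield from msg.rows
        if msg.stop:
            return


channel: "queue.Queue[Message]" = queue.Queue()
sender = threading.Thread(target=send_operator, args=(range(1, 6), channel))
sender.start()
received = list(recv_operator(channel))
sender.join()
```

Attaching the mark to the final batch, rather than sending a separate end message, follows the description that the stop mark accompanies the last data sent; an empty child still produces one empty message with the mark set so that the receiver terminates.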
The plan execution method provided by the second embodiment of the present invention details the process of determining the execution mode of the data distribution operator in the executable plan and executing the executable plan according to the execution mode. The method does not distinguish between a local plan and a remote plan in the plan generation stage, and the SEND/RECV operator is uniformly retained in the generated plan. When the plan is reused, the execution mode of the SEND/RECV operator is determined according to the actual execution node position of the SEND operator. Plan reuse can thus be exploited effectively and the number of cached plans reduced, thereby improving plan execution efficiency.
Example three
Fig. 5 is a schematic structural diagram of a plan execution device according to a third embodiment of the present invention, where the plan execution device may be implemented by software and/or hardware. As shown in fig. 5, the device includes: a clipping module 310, a construction module 320, and an execution module 330;
the clipping module 310 is configured to perform partition clipping on a sending operator in a plan to be executed based on clipping information in a query statement, so as to obtain a partition clipping result, where the partition clipping result includes an execution node position of the sending operator;
a constructing module 320, configured to construct an executable plan based on the plan to be executed and the partition clipping result;
an executing module 330, configured to determine an execution mode of a data distribution operator in the executable plan according to the execution node location of the sending operator, and execute the executable plan according to the execution mode.
In this embodiment, the device first uses the clipping module to perform partition clipping on the sending operator in the plan to be executed based on the clipping information in the query statement, obtaining a partition clipping result that includes the execution node position of the sending operator; it then uses the construction module to construct an executable plan based on the plan to be executed and the partition clipping result; finally, it uses the execution module to determine the execution mode of the data distribution operator in the executable plan according to the execution node position of the sending operator, and to execute the executable plan according to the execution mode. By changing the execution mode of the data distribution operator according to the execution node position of the sending operator and executing the executable plan according to the chosen execution mode, the device can effectively improve the efficiency of plan execution.
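The cooperation of the three modules can be sketched as a small composition in which each module is supplied as a function. The class name PlanExecutionDevice, the function signatures and the stub modules used in the example are hypothetical; they only illustrate the order in which the modules are invoked.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class PlanExecutionDevice:
    clip: Callable[[str, str], Set[str]]        # clipping module 310
    construct: Callable[[str, Set[str]], Dict]  # construction module 320
    execute: Callable[[Dict], str]              # execution module 330

    def run(self, plan_to_execute: str, query: str) -> str:
        # 1. Partition clipping yields the execution node positions.
        positions = self.clip(plan_to_execute, query)
        # 2. The executable plan is built from the plan and the positions.
        executable_plan = self.construct(plan_to_execute, positions)
        # 3. The execution module chooses the mode and runs the plan.
        return self.execute(executable_plan)


# Stub modules, for illustration only.
device = PlanExecutionDevice(
    clip=lambda plan, query: {"local"},
    construct=lambda plan, positions: {"plan": plan, "positions": positions},
    execute=lambda p: "pipe" if p["positions"] == {"local"} else "normal",
)
mode = device.run("SEND-FILTER-SCAN", "select * from t where id = 1")
```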
Optionally, the apparatus further comprises:
the matching module is used for performing plan cache matching on the obtained query statement before partition clipping is performed on the sending operator in the plan to be executed based on the clipping information in the query statement;
and the determining module is used for determining the plan to be executed according to the matching result.
Optionally, the determining module is specifically configured to perform the following:
if the matching result is that the matching is successful, acquiring the matched plan from the plan cache as a plan to be executed;
and if the matching result is matching failure, performing plan generation and plan optimization on the query statement, and taking the obtained plan as a plan to be executed.
Optionally, the execution mode includes a pipe mode and a normal mode.
Optionally, when the operation of "determining the execution mode of the data distribution operator in the executable plan according to the execution node position of the sending operator" is executed, the execution module 330 specifically includes:
a first determining unit, configured to determine that an execution mode of a data distribution operator in the executable plan is a pipeline mode if execution node positions of the sending operator are all local nodes;
a second determining unit, configured to determine that the execution mode of the data distribution operator in the executable plan is the normal mode if the execution node positions of the sending operator are not all local nodes.
Optionally, the executing module 330, when executing the operation of "executing the executable plan according to the execution mode", specifically includes:
and the first execution unit is used for executing the executable plan according to the stand-alone mode if the execution mode is the pipeline mode, wherein the data distribution operator is used as a pipeline operator and used for sending the received data to the corresponding parent operator.
Optionally, the executing module 330, when executing the operation of "executing the executable plan according to the execution mode", specifically includes:
the first sending unit is used for sending the sub-plan taking the sending operator as a root node to the corresponding execution node position if the execution mode is the normal mode;
the second sending unit is used for sending, through the sending operator, the data uploaded by the corresponding child operator to the corresponding receiving operator according to the destination node setting;
a third sending unit, configured to send the data to a corresponding parent operator after the receiving operator receives the data until a reception stop condition is satisfied;
wherein the receiving stop condition is that the data received by the receiving operator contains a stop mark.
The plan execution device can execute the plan execution method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 6 is a schematic structural diagram of a database node according to a fourth embodiment of the present invention. As shown in fig. 6, the database node according to the fourth embodiment of the present invention includes: a storage device 42 and one or more processors 41. The number of processors 41 in the database node may be one or more, and one processor 41 is taken as an example in fig. 6. The storage device 42 is used to store one or more programs; the one or more programs are executed by the one or more processors 41, such that the one or more processors 41 implement the plan execution method according to any of the embodiments of the present invention.
The database node may further include: a communication device 43, an input device 44 and an output device 45.
The processor 41, the storage means 42, the communication means 43, the input means 44 and the output means 45 in the database node may be connected by a bus or other means, as exemplified by the bus connection in fig. 6.
The storage device 42 in the database node serves as a computer-readable storage medium for storing one or more programs, which may be software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the plan execution method provided in the first or second embodiment of the present invention (for example, the modules in the plan execution device shown in fig. 5, including the clipping module 310, the construction module 320, and the execution module 330). The processor 41 executes the various functional applications and data processing of the database node by running the software programs, instructions and modules stored in the storage device 42, that is, it implements the plan execution method in the above method embodiments.
The storage device 42 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the database node, and the like. Further, the storage 42 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, storage 42 may further include memory located remotely from processor 41, which may be connected to the database nodes over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The communication means 43 may comprise a receiver and a transmitter. The communication device 43 is configured to perform information transmission and reception communication in accordance with control of the processor 41.
The input device 44 is operable to receive entered numeric or character information and to generate key signal inputs relating to user settings and function controls of the database node. The output device 45 may include a display device such as a display screen.
When the one or more programs included in the above database node are executed by the one or more processors 41, the programs perform the following operations: performing partition clipping on a sending operator in a plan to be executed based on clipping information in a query statement to obtain a partition clipping result, wherein the partition clipping result includes an execution node position of the sending operator; constructing an executable plan based on the plan to be executed and the partition clipping result; and determining an execution mode of a data distribution operator in the executable plan according to the execution node position of the sending operator, and executing the executable plan according to the execution mode.
Example five
A fifth embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs a plan execution method that includes: performing partition clipping on a sending operator in a plan to be executed based on clipping information in a query statement to obtain a partition clipping result, wherein the partition clipping result includes an execution node position of the sending operator; constructing an executable plan based on the plan to be executed and the partition clipping result; and determining an execution mode of a data distribution operator in the executable plan according to the execution node position of the sending operator, and executing the executable plan according to the execution mode.
Optionally, the program, when executed by the processor, may further be used to perform the plan execution method according to any of the embodiments of the present invention.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM), a flash Memory, an optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. A computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take a variety of forms, including, but not limited to: an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic, or Radio Frequency (RF), etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method of plan execution, the method comprising:
performing partition clipping on a sending operator in a plan to be executed based on clipping information in a query statement to obtain a partition clipping result, wherein the partition clipping result comprises an execution node position of the sending operator;
constructing an executable plan based on the plan to be executed and the partition cutting result;
and determining an execution mode of a data distribution operator in the executable plan according to the execution node position of the sending operator, and executing the executable plan according to the execution mode.
2. The method of claim 1, prior to performing partition clipping on a send operator in the plan to be executed based on clipping information in the query statement, further comprising:
carrying out plan cache matching on the obtained query statement;
and determining the plan to be executed according to the matching result.
3. The method according to claim 2, wherein the determining the plan to be executed according to the matching result comprises:
if the matching result is that the matching is successful, acquiring the matched plan from the plan cache as a plan to be executed;
and if the matching result is matching failure, performing plan generation and plan optimization on the query statement, and taking the obtained plan as a plan to be executed.
4. The method of claim 1, wherein the execution modes include a pipe mode and a normal mode.
5. The method of claim 4, wherein determining an execution mode of a data distribution operator in the executable plan based on an execution node location of the send operator comprises:
if the execution node positions of the sending operator are all local nodes, determining that the execution mode of the data distribution operator in the executable plan is the pipe mode;
otherwise, determining that the execution mode of the data distribution operator in the executable plan is a normal mode.
6. The method of claim 4, wherein said executing the executable plan according to the execution mode comprises:
and if the execution mode is the pipeline mode, executing the executable plan according to the stand-alone mode, wherein the data distribution operator is used as a pipeline operator and used for sending the received data to the corresponding parent operator.
7. The method of claim 4, wherein said executing the executable plan according to the execution mode comprises:
if the execution mode is the normal mode, sending the sub-plan taking the sending operator as a root node to the corresponding execution node position;
sending the data uploaded by the corresponding child operator to the corresponding receiving operator according to the destination node setting through the sending operator;
after the receiving operator receives the data, the data is sent to a corresponding parent operator until a receiving stop condition is met;
wherein the receiving stop condition is that the data received by the receiving operator contains a stop mark.
8. A plan execution apparatus, comprising:
the clipping module is used for performing partition clipping on a sending operator in a plan to be executed based on clipping information in a query statement to obtain a partition clipping result, wherein the partition clipping result comprises an execution node position of the sending operator;
the construction module is used for constructing an executable plan based on the plan to be executed and the partition cutting result;
and the execution module is used for determining an execution mode of a data distribution operator in the executable plan according to the execution node position of the sending operator, and executing the executable plan according to the execution mode.
9. A database node, comprising:
one or more processors;
storage means for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the plan execution method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, carries out the plan execution method as claimed in any one of claims 1 to 7.
CN202111442810.9A 2021-11-30 2021-11-30 Plan execution method, device, database node and medium Pending CN114090617A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111442810.9A CN114090617A (en) 2021-11-30 2021-11-30 Plan execution method, device, database node and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111442810.9A CN114090617A (en) 2021-11-30 2021-11-30 Plan execution method, device, database node and medium

Publications (1)

Publication Number Publication Date
CN114090617A true CN114090617A (en) 2022-02-25

Family

ID=80305920

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111442810.9A Pending CN114090617A (en) 2021-11-30 2021-11-30 Plan execution method, device, database node and medium

Country Status (1)

Country Link
CN (1) CN114090617A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116775698A (en) * 2023-08-23 2023-09-19 本原数据(北京)信息技术有限公司 Partition clipping method and device for database, computer equipment and storage medium
CN116775698B (en) * 2023-08-23 2023-11-24 本原数据(北京)信息技术有限公司 Partition clipping method and device for database, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination