WO2014040426A1 - Query processing method and apparatus - Google Patents

Query processing method and apparatus

Info

Publication number
WO2014040426A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
recoverable
executing
query request
data node
Prior art date
Application number
PCT/CN2013/076366
Other languages
English (en)
French (fr)
Inventor
曹莉
吴向阳
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2014040426A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries

Definitions

  • The present invention relates to database technology, and in particular, to a query processing method and apparatus.
  • A parallel database is a data management technique that accelerates query processing by distributing data evenly across multiple data nodes and executing queries on those nodes in parallel.
  • Parallel databases are mainly used for massive data storage. As the number of data nodes increases, the probability of exceptions during query processing increases.
  • In existing approaches, when an exception occurs, the control node that controls each data node to perform the query operation abandons the query processing that the database is performing. After the error correction processing ends, the control node again controls each data node to execute the user's query request from the beginning.
  • Embodiments of the present invention provide a query processing method and apparatus for improving the efficiency of parallel database query processing.
  • An embodiment of the present invention provides a query processing method, including:
  • Before sending, to each data node in the system, the first step information for executing the client query request, the method further includes:
  • generating the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
  • The recoverable information includes at least one piece of recoverable point information;
  • the step following the nearest recoverable point is the initial step of re-executing the query request.
  • Before sending, to each data node in the system, the first step information for executing the client query request, the method further includes:
  • recording first recoverable point information that is located before the third step and is closest to the third step.
  • Determining, according to the preset recoverable information, the second step of executing the query request includes:
  • taking the step following the first recoverable point as the second step.
  • After recording the first recoverable point information that is located before the third step and is closest to the third step, and before sending, to each data node in the system, the first step information for executing the client query request, the method further includes:
  • recording second recoverable point information that is located before the fourth step and is closest to the fourth step.
  • Determining, according to the preset recoverable information, the second step of executing the query request includes:
  • taking the step following the second recoverable point as the second step.
  • an embodiment of the present invention provides a control node, including:
  • a first sending module configured to send, to each data node in the system, first step information for executing a client query request
  • a receiving module, configured to receive a failure message sent by at least one of the data nodes, where the failure message is used to indicate that execution of the first step failed or was abnormal;
  • a processing module, configured to determine, according to the preset recoverable information, a second step of executing the query request, where the second step is the initial step of re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed;
  • a second sending module configured to send the second step information to each of the data nodes.
  • The processing module is further configured to generate the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
  • The control node further includes: a storage module, configured to store the recoverable information, where the recoverable information includes at least one piece of recoverable point information, and the step following the nearest recoverable point is the initial step of re-executing the query request.
  • The first sending module is further configured to send, to each data node, third step information for executing the query request;
  • the receiving module is further configured to receive a first success message sent by each of the data nodes, where the first success message is used to indicate that execution of the third step succeeded;
  • the storage module is further configured to record first recoverable point information that is located before the third step and is closest to the third step.
  • The processing module determining, according to the preset recoverable information, the second step of executing the query request includes: taking the step following the first recoverable point as the second step.
  • The first sending module is further configured to send, to each data node, fourth step information for executing the query request;
  • the receiving module is further configured to receive a second success message sent by each of the data nodes, where the second success message is used to indicate that execution of the fourth step succeeded;
  • the storage module is further configured to record second recoverable point information that is located before the fourth step and is closest to the fourth step.
  • The processing module determining, according to the preset recoverable information, the second step of executing the query request includes: taking the step following the second recoverable point as the second step.
  • an embodiment of the present invention provides a control node, including: a transmitter, configured to send, to each data node in the system, first step information for executing a client query request;
  • a receiver, configured to receive a failure message sent by at least one of the data nodes, where the failure message is used to indicate that execution of the first step failed or was abnormal;
  • a processor, configured to determine, according to the preset recoverable information, a second step of executing the query request, where the second step is the initial step of re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed;
  • the transmitter is further configured to send the second step information to each of the data nodes.
  • The processor is further configured to generate the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
  • The control node further includes: a memory, configured to store the recoverable information, where the recoverable information includes at least one piece of recoverable point information, and the step following the nearest recoverable point is the initial step of re-executing the query request.
  • The transmitter is further configured to send, to each data node, third step information for executing the query request;
  • the receiver is further configured to receive a first success message sent by each of the data nodes, where the first success message is used to indicate that execution of the third step succeeded;
  • the memory is further configured to record first recoverable point information that is located before the third step and is closest to the third step.
  • The processor determining, according to the preset recoverable information, the second step of executing the query request includes: taking the step following the first recoverable point as the second step.
  • The transmitter is further configured to send, to each data node, fourth step information for executing the query request;
  • the receiver is further configured to receive a second success message sent by each of the data nodes, where the second success message is used to indicate that execution of the fourth step succeeded;
  • the memory is further configured to record second recoverable point information that is located before the fourth step and is closest to the fourth step.
  • The processor determining, according to the preset recoverable information, the second step of executing the query request includes: taking the step following the second recoverable point as the second step.
  • While the control node controls each data node in the system to perform query processing, when a certain step fails to execute, the control node analyzes the error information and determines, according to the currently saved recoverable information, the initial step of re-executing the query request. After a step in the query process fails, execution therefore need not start again from the beginning, which improves the efficiency of parallel database query processing.
  • FIG. 1 is a flowchart of Embodiment 1 of a query processing method according to the present invention.
  • FIG. 2 is a flowchart of Embodiment 2 of a query processing method according to the present invention.
  • FIG. 3 is a flowchart of Embodiment 3 of a query processing method according to the present invention.
  • FIG. 4 is a flowchart of Embodiment 4 of a query processing method according to the present invention.
  • FIG. 5 is a schematic diagram of Embodiment 5 of a query processing method according to the present invention.
  • FIG. 6 is a flowchart of Embodiment 5 of a query processing method according to the present invention.
  • FIG. 7 is a schematic structural diagram of Embodiment 1 of a control node according to the present invention.
  • FIG. 8 is a schematic structural diagram of Embodiment 2 of a control node according to the present invention.
  • FIG. 9 is a schematic structural diagram of Embodiment 3 of a control node according to the present invention.
  • FIG. 10 is a schematic structural diagram of Embodiment 4 of a control node according to the present invention.
  • The technical solutions in the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
  • FIG. 1 is a flowchart of Embodiment 1 of a query processing method according to the present invention. As shown in FIG. 1, the method in this embodiment may include:
  • S101 Send, to each data node in the system, first step information for executing a client query request.
  • The control node may generate execution plan steps according to the client's query request, and the control node sends the step information to each data node in the system before each step is performed.
  • The first step above refers to any step that the control node sends to the data nodes during execution of the query task.
  • The failure message is used to indicate that execution of the first step failed or was abnormal.
  • The control node delivers an execution step task to each data node in the system, and each data node executes the task delivered by the control node. After the execution succeeds, the data node sends the execution result to the control node; when an execution exception occurs, the data node sends a failure message to the control node.
  • If the control node receives a failure message sent by any data node, execution of the step task has failed.
  • The control node analyzes the error information and determines the initial step of re-executing the query request according to the currently saved recoverable information.
  • The second step described above is the initial step of re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed.
  • The second step information is sent to each data node in the system.
  • In this embodiment, after receiving a failure message sent by any data node, the control node analyzes the error information and determines the initial step of re-executing the query request according to the currently saved recoverable information. As a result, after a step in the query process fails, execution need not start again from the beginning, which improves the efficiency of parallel database query processing.
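  • The flow of Embodiment 1 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `dispatch` and `second_step`, and the modelling of a data node as a Python callable, are assumptions introduced here.

```python
from typing import Callable, List, Optional

# Assumption: a data node is modelled as a function that takes a step name
# and returns True on success or False on failure/abnormality.
DataNode = Callable[[str], bool]


def dispatch(step: str, nodes: List[DataNode]) -> bool:
    """Send step information to every data node.

    The step counts as failed if any single data node reports a failure.
    """
    results = [node(step) for node in nodes]
    return all(results)


def second_step(plan: List[str], last_saved: Optional[str]) -> str:
    """Return the initial step for re-executing the query request.

    last_saved is the recoverable point whose result is currently saved,
    or None if nothing has been saved yet.
    """
    if last_saved is None:
        return plan[0]  # nothing saved, so re-execution starts from the beginning
    # Otherwise resume with the step that follows the saved recoverable point.
    return plan[plan.index(last_saved) + 1]
```

  • With a plan of steps A, B, C and a saved recoverable point at B, a failure reported for step C would lead the control node to resend step C rather than step A.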
  • FIG. 2 is a flowchart of Embodiment 2 of a query processing method according to the present invention.
  • The control node evaluates the cost of executing the query processing task according to the steps of the query request, and generates recoverable information. Therefore, before the control node sends the first step information for executing the client query request to each data node in the system, the following operations may also be included:
  • The control node generates execution plan steps according to the client's query request. Each step describes a specific database operation, such as scanning or joining data tables. The control node evaluates the resource consumption corresponding to each step of executing the query request, for example according to the system resource consumption when the data is retrieved and the size of the data set matching the retrieval condition, and generates the recoverable information according to a certain algorithm. The recoverable information may include information such as the number and location of the recoverable points.
  • In this embodiment, the control node generates recoverable information, such as the number of recoverable points and their locations among the execution steps, by evaluating the system resource consumption of each execution plan step of the query request performed by the data nodes. While the control node controls the data nodes in the system to execute the query, when a certain execution step fails, the error information is analyzed and, according to the currently saved recoverable information, the initial step of re-executing the query request is determined, so that execution need not start again from the beginning, which improves the efficiency of parallel database query processing.
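  • As an illustration of generating recoverable information from per-step resource estimates, the sketch below uses a simple cost threshold. The source only says "a certain algorithm", so the threshold rule, the `RecoverableInfo` structure, and the function name are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RecoverableInfo:
    """Number and location of recoverable points (hypothetical structure)."""
    points: List[str]

    @property
    def count(self) -> int:
        return len(self.points)


def generate_recoverable_info(step_costs: Dict[str, float],
                              threshold: float) -> RecoverableInfo:
    """Build recoverable information from estimated per-step costs.

    Assumption: a step becomes a recoverable point when its estimated
    resource consumption reaches the threshold. The source does not
    specify the actual selection algorithm.
    """
    # Dict insertion order is the plan order, so points stay in plan order.
    points = [step for step, cost in step_costs.items() if cost >= threshold]
    return RecoverableInfo(points=points)
```

  • For example, with estimated costs where only steps B and D are expensive, this rule would mark B and D as the recoverable points.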
  • the recoverable information may include at least one recoverable point information
  • The step following the nearest recoverable point is the initial step of re-executing the query request.
  • The control node evaluates the resource consumption corresponding to each step of executing the query request and sets at least one recoverable point. When a certain step fails, the control node analyzes the error information, searches the recoverable points whose execution results have been saved for the one closest to the failed execution step, and selects the step following that recoverable point as the initial step of re-executing the query request.
  • In this embodiment, when a certain execution step fails, the recoverable point closest to that execution step is found among the recoverable points whose execution results have been saved, and the step following that recoverable point is selected as the initial step of re-executing the query request. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
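  • The search for the nearest saved recoverable point can be sketched as below. The function name `restart_step` and the representation of steps as strings are assumptions for illustration only.

```python
from typing import List


def restart_step(plan: List[str], saved_points: List[str], failed_step: str) -> str:
    """Return the step from which to re-execute after failed_step fails.

    plan: ordered step names of the execution plan.
    saved_points: recoverable points whose execution results are saved.
    """
    failed_idx = plan.index(failed_step)
    # Keep only saved recoverable points located before the failed step.
    candidates = [plan.index(p) for p in saved_points if plan.index(p) < failed_idx]
    if not candidates:
        return plan[0]  # no usable recoverable point: start from the beginning
    # The closest point is the one with the largest index; resume right after it.
    return plan[max(candidates) + 1]
```

  • If only point B is saved and step E fails, this sketch resumes from step C; if both B and D are saved, it resumes from step E itself.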
  • FIG. 3 is a flowchart of Embodiment 3 of a query processing method according to the present invention.
  • As shown in FIG. 3, on the basis of the second embodiment of the method shown in FIG. 2, other execution steps may exist before the first step fails or becomes abnormal. Therefore, before the first step information for executing the client query request is sent to each data node in the system, the following operations may also be included:
  • The third step for executing the query request is any step before the first step. The control node sends the third step information for executing the query request to each data node in the system and waits for the data nodes to send the execution result information.
  • S302 Receive a first success message sent by each data node.
  • The first success message is used to indicate that execution of the third step succeeded; each data node in the system sends the execution result to the control node.
  • Record first recoverable point information that is located before the third step and is closest to the third step. Specifically, after the third step is successfully executed, the control node records the first recoverable point information that is located before the third step and is closest to the third step. The recoverable point information includes the specific results of the steps that have been performed, so that when a step fails, the step following the recoverable point can be selected as the initial step of re-executing the query request.
  • In this embodiment, when execution of the first step fails or is abnormal, the step following the first recoverable point may be selected, according to the first recoverable point information, as the initial step of re-executing the query request. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
  • Determining, according to the preset recoverable information, the second step of executing the query request includes: taking the step following the first recoverable point as the second step.
  • The first recoverable point information is the recoverable point information that is located before the third step and is closest to the third step, and the third step is any step before the first step. Therefore, when the first step fails to execute, the step following the first recoverable point can be selected as the second step, that is, as the initial step of re-executing the query request.
  • In this embodiment, when execution of the first step fails or is abnormal, the step following the first recoverable point may be selected as the second step, that is, as the initial step of re-executing the query request, which avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
  • FIG. 4 is a flowchart of Embodiment 4 of a query processing method according to the present invention.
  • The recoverable point can also be updated. Therefore, after the first recoverable point information that is located before the third step and is closest to the third step is recorded, and before the first step information for executing the client query request is sent to each data node in the system, the following operations may also be included:
  • The fourth step is a step before the first step and after the first recoverable point. The control node sends the fourth step information for executing the query request to each data node in the system and waits for the data nodes to send the execution result information.
  • S402 Receive a second success message sent by each data node.
  • The second success message is used to indicate that execution of the fourth step succeeded; each data node in the system sends the execution result to the control node.
  • The control node records second recoverable point information that is located before the fourth step and is closest to the fourth step. The recoverable point information includes the specific results of the steps that have been performed, so that when a step fails, the step following the recoverable point can be selected as the initial step of re-executing the query request.
  • Because the second recoverable point is located after the first recoverable point, the second recoverable point information includes the specific results of the executed steps recorded in the first recoverable point information. Therefore, the second recoverable point information may be used to overwrite the first recoverable point information.
  • In this embodiment, when execution of the first step fails or is abnormal, the step following the second recoverable point may be selected, according to the second recoverable point information, as the initial step of re-executing the query request. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.
  • Determining, according to the preset recoverable information, the second step of executing the query request includes: taking the step following the second recoverable point as the second step.
  • The second recoverable point information is the recoverable point information that is located before the fourth step and is closest to the fourth step, and the fourth step is a step before the first step and after the first recoverable point. Therefore, when the first step fails to execute, the step following the second recoverable point can be selected as the second step.
  • In this embodiment, when execution of the first step fails or is abnormal, the step following the second recoverable point may be selected as the second step, that is, as the initial step of re-executing the query request, which avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.
  • FIG. 5 is a schematic diagram of Embodiment 5 of a query processing method according to the present invention.
  • As shown in FIG. 5, the control node 500 generates execution plan steps 503 according to the client's query request, and generates recoverable point information by evaluating the resource consumption corresponding to each step, where the first recoverable point 501 and the second recoverable point 502 correspond to step B and step D, respectively.
  • The specific execution steps 504 are as follows:
  • Step A: The control node first sends the step A task to each data node. If each data node performs step A successfully, the control node delivers the step B task;
  • Step B: If each data node performs step B successfully, the control node first records the information of the first recoverable point 501 and then issues the step C task to each data node;
  • Step C: If a data node is abnormal or fails during execution of step C, the control node analyzes the error information and, according to the currently saved information of the first recoverable point 501, re-issues the step C task to each data node instead of starting again from step A;
  • Step D: The control node issues the step D task to each data node. If the execution succeeds, the control node records the information of the second recoverable point 502, overwriting the information of the first recoverable point 501; if the execution fails, execution resumes from step C;
  • Step E: If execution of step E fails, the control node executes step E again according to the currently recorded information of the most recent recoverable point, the second recoverable point 502, instead of starting from step C or from step A.
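  • The step A to step E walkthrough above can be simulated with a small loop. This is a sketch under stated assumptions: `run_plan` is a hypothetical helper, failures are injected through a dictionary rather than coming from real data nodes, and only one recoverable point is kept at a time because a later point overwrites the earlier one.

```python
from typing import Dict, List, Optional, Set


def run_plan(plan: List[str], recoverable: Set[str],
             failures: Dict[str, int]) -> List[str]:
    """Execute the plan, resuming after the saved recoverable point on failure.

    failures maps a step name to the number of times it fails before it
    succeeds. Returns the trace of step executions, including failed ones.
    """
    remaining = dict(failures)
    saved: Optional[int] = None  # index of the single saved recoverable point
    trace: List[str] = []
    i = 0
    while i < len(plan):
        step = plan[i]
        trace.append(step)
        if remaining.get(step, 0) > 0:
            remaining[step] -= 1
            # Resume from the step after the saved point, or from the start.
            i = 0 if saved is None else saved + 1
            continue
        if step in recoverable:
            saved = i  # a later recoverable point overwrites the earlier one
        i += 1
    return trace
```

  • With recoverable points at B and D, a single failure of step E yields the trace A, B, C, D, E, E, while a single failure of step D yields A, B, C, D, C, D, E, matching the behaviour described for steps D and E.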
  • FIG. 6 is a flowchart of Embodiment 5 of a query processing method according to the present invention.
  • As shown in FIG. 6, the control node stores the recoverable point information of the execution plan and the recoverable point state of the currently executed task, where the first recoverable point 501 and the second recoverable point 502 correspond to step B and step D, respectively.
  • Assuming that execution of step E fails, the interaction process between the client, the control node, and the data nodes is as follows:
  • the client sends a query request message to the control node.
  • the control node generates an execution plan step and recoverable point information.
  • the control node generates an execution plan step according to the query request of the client, and generates recoverable information by evaluating the resource consumption corresponding to each step, wherein the recoverable information includes the number, location, and the like of the recoverable points.
  • the control node sends the step A information to each data node.
  • the control node sends a task message performing step A to each data node.
  • the data node performs step A.
  • Each data node performs task step A.
  • the data node sends the execution result of step A to the control node.
  • Each data node sends an execution result to the control node.
  • the control node sends step B information to each data node.
  • the control node sends a task message to perform step B to each data node.
  • the data node performs step B.
  • Each data node performs task step B.
  • the data node sends the execution result to the control node.
  • Each data node transmits the execution result of step B to the control node.
  • the control node records the success of the first recoverable point and stores the specific result data of the execution step.
  • the control node sends step C information to each data node.
  • the control node sends a task message to perform step C to each data node.
  • the data node performs step C.
  • Each data node performs task step C.
  • the data node sends the execution result to the control node.
  • Each data node transmits the execution result of step C to the control node.
  • the control node sends step D information to each data node.
  • the control node sends a task message to perform step D to each data node.
  • the data node performs step D.
  • Each data node performs task step D.
  • the data node sends the execution result to the control node.
  • Each data node transmits the execution result of step D to the control node.
  • the control node updates the recoverable point information, records the information of the second recoverable point, and overwrites the information of the first recoverable point.
  • the control node sends the step E information to each data node.
  • the control node sends a task message performing step E to each data node.
  • the data node performs step E.
  • Each data node performs task step E.
  • The data node fails to perform step E and returns failure information to the control node.
  • Each data node sends an execution result to the control node. If the control node receives a step E failure message from any one of the data nodes, it determines that step E has failed.
  • Step E is re-executed from recoverable point D.
  • After step E fails, the control node analyzes the cause of the error and, because the recorded recoverable point state is D, re-executes step E from D, instead of executing step C from B or executing step A from the beginning.
  • the control node resends the step E information to each data node.
  • the control node resends the task message performing step E to each data node.
  • The data node re-executes step E.
  • Each data node re-executes task step E.
  • the data node sends the execution result to the control node.
  • Each data node sends the execution result of step E to the control node.
  • the control node returns the result to the client.
  • the control node returns the final query processing result to the client.
  • In this embodiment, the control node generates the recoverable point information of the execution steps while generating the execution plan, and records and manages the recoverable point state.
  • When a step fails, the control node determines the most recent recoverable point according to the recoverable point state information and resumes execution from that recoverable point, which avoids repeated processing of massive data and improves the efficiency of parallel database query processing.
  • the student table has three fields: sid, name, sex.
  • the sc table has three fields: sid, cn, and score.
  • the table definition can refer to the following statement:
  • Table 1 is the student table; the specific table data is shown in Table 1.
  • Table 2 is the sc table; the specific table data is shown in Table 2.
  • the table data is stored on three data nodes.
  • the query statement can refer to the following statement:
  • Table 3 is the query processing table.
  • The query plan for the student table and the sc table is divided into four steps, as shown in Table 3.
  • the control node processing is:
  • Step 1: First, perform step 1. If the execution succeeds, the control node records the recoverable point state as step 1; otherwise, the control node analyzes the cause of the abnormality and decides whether to re-execute or report an error;
  • Step 2: When step 2 is executed, if the execution fails, the control node analyzes the cause of the abnormality and re-executes step 2 according to the current most recent recoverable point state; otherwise, the control node overwrites the recoverable point state as step 2 and continues with steps 3 and 4;
  • Step 3: If execution of step 3 fails, the control node analyzes the cause of the abnormality and directly re-executes step 3 according to the latest stored recoverable point information;
  • After all the steps are completed, the control node deletes the recoverable point information.
  • In the above process, step 1 mainly creates a temporary table, scans the underlying data table, and stores the result in the temporary table.
  • Step 2 joins the temporary table with the sc table. Both steps are time-consuming and, as the number of data nodes increases, are the steps most likely to encounter exceptions. Therefore, steps 1 and 2 can be set as recoverable points, which avoids repeated operations on massive data during query processing and improves the efficiency of parallel database query processing.
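  • The bookkeeping in the four-step example (record the recoverable point state, overwrite it, and delete it when the query completes) might be sketched as follows. The class `RecoverablePointStore` and its method names are hypothetical and are not taken from the source.

```python
from typing import Any, Optional, Tuple


class RecoverablePointStore:
    """Holds at most one recoverable point: (step number, saved result)."""

    def __init__(self) -> None:
        self._point: Optional[Tuple[int, Any]] = None

    def record(self, step: int, result: Any) -> None:
        # Overwrite any earlier point; only the most recent one is kept.
        self._point = (step, result)

    def resume_step(self) -> int:
        # Step to re-execute on failure: the one after the saved point,
        # or step 1 if no recoverable point has been recorded.
        return 1 if self._point is None else self._point[0] + 1

    def clear(self) -> None:
        # Delete the recoverable point information once all steps are done.
        self._point = None
```

  • After step 1 is recorded, a failure in step 2 resumes at step 2; after step 2 overwrites the state, a failure in step 3 resumes at step 3, without rebuilding the temporary table or repeating the join.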
  • FIG. 7 is a schematic structural diagram of Embodiment 1 of a control node according to the present invention. As shown in FIG. 7, the control node of this embodiment may include: a first sending module 701, a receiving module 702, a processing module 703, and a second sending module 704.
  • The first sending module 701 is configured to send, to each data node in the system, first step information for executing a client query request. The receiving module 702 is configured to receive a failure message sent by at least one data node, where the failure message indicates that execution of the first step failed or was abnormal. The processing module 703 is configured to determine, according to preset recoverable information, a second step for executing the query request, where the second step is the starting step for re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed. The second sending module 704 is configured to send the second step information to each data node.
  • The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 1; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 1; details are not repeated here.
  • In this embodiment, after receiving a failure message from any data node, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. When a step fails during query processing, execution therefore need not restart from the beginning, which improves the efficiency of parallel database query processing.
  • In the control node above, the processing module 703 may be further configured to generate the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
  • The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 2; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 2; details are not repeated here.
  • In this embodiment, the control node evaluates the system resources consumed by each planned step of the query request on the data nodes and generates recoverable information such as the number and positions of recoverable points among the execution steps. While the control node directs the data nodes in executing the query request, if an execution step fails, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. Execution therefore need not restart from the beginning after a failure, which improves the efficiency of parallel database query processing.
  • FIG. 8 is a schematic structural diagram of Embodiment 2 of a control node according to the present invention. As shown in FIG. 8, on the basis of Embodiment 1 of the control node shown in FIG. 7, the control node of this embodiment may further include a storage module 801.
  • In this embodiment, the storage module 801 may be used to store the recoverable information. The recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.
  • Specifically, the processing module 703 evaluates the resource consumption corresponding to each step of the query request and sets at least one recoverable point. When a step fails, the processing module 703 analyzes the error information, finds, among the recoverable points whose execution results are stored in the storage module 801, the one closest to the failed step, and selects the step following that recoverable point as the starting step for re-executing the query request.
  • In this embodiment, setting recoverable points makes it possible, when a step fails, to find the closest recoverable point whose execution result has been saved and to restart from the step that follows it. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
  • In the control node above, the first sending module 701 may be further configured to send, to each data node, third step information for executing the query request; the receiving module 702 may be further configured to receive a first success message sent by each data node, where the first success message indicates that the third step was executed successfully; and the storage module 801 may be further used to record the first recoverable point information, that is, the recoverable point located before the third step and closest to it.
  • The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 3; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 3; details are not repeated here.
  • In this embodiment, a first recoverable point is set before the first step. When a step after the first recoverable point fails, the step following the first recoverable point can be selected, according to the first recoverable point information, as the starting step for re-executing the query request. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
  • In the control node above, the processing module 703 determining, according to the preset recoverable information, the second step for executing the query request may specifically include: taking the step immediately following the first recoverable point as the second step.
  • Specifically, the first recoverable point information describes the recoverable point located before the third step and closest to it, and the third step is any step before the first step. Therefore, when the first step fails, the processing module 703 can select the step following the first recoverable point as the second step, that is, as the starting step for re-executing the query request.
  • In this embodiment, because a first recoverable point is set before the first step, the step following the first recoverable point can be selected as the second step when the first step fails. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
  • In the control node above, the first sending module 701 may be further configured to send, to each data node, fourth step information for executing the query request; the receiving module 702 may be further configured to receive a second success message sent by each data node, where the second success message indicates that the fourth step was executed successfully; and the storage module 801 may be further used to record the second recoverable point information, that is, the recoverable point located before the fourth step and closest to it.
  • The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 4; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 4; details are not repeated here.
  • In this embodiment, a second recoverable point is recorded after the first recoverable point. When a step after the second recoverable point fails, the step following the second recoverable point can be selected, according to the second recoverable point information, as the starting step for re-executing the query request. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.
  • In the control node above, the processing module 703 determining, according to the preset recoverable information, the second step for executing the query request may specifically include: taking the step immediately following the second recoverable point as the second step.
  • Specifically, the second recoverable point information describes the recoverable point located before the fourth step and closest to it, and the fourth step is a step before the first step and after the first recoverable point. Therefore, when the first step fails, the processing module 703 can select the step following the second recoverable point as the second step, that is, as the starting step for re-executing the query request.
  • In this embodiment, because a second recoverable point is set after the first recoverable point, the step following the second recoverable point can be selected as the second step when the first step fails. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.
  • FIG. 9 is a schematic structural diagram of Embodiment 3 of a control node according to the present invention. As shown in FIG. 9, the control node of this embodiment may include: a transmitter 901, a receiver 902, and a processor 903.
  • The transmitter 901 is configured to send, to each data node in the system, first step information for executing a client query request. The receiver 902 is configured to receive a failure message sent by at least one data node, where the failure message indicates that execution of the first step failed or was abnormal. The processor 903 is configured to determine, according to preset recoverable information, a second step for executing the query request, where the second step is the starting step for re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed. The transmitter 901 is further configured to send the second step information to each data node.
  • The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 1; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 1; details are not repeated here.
  • In this embodiment, after receiving a failure message from any data node, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. When a step fails during query processing, execution therefore need not restart from the beginning, which improves the efficiency of parallel database query processing.
  • In the control node above, the processor 903 may be further configured to generate the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
  • The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 2; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 2; details are not repeated here.
  • In this embodiment, the control node evaluates the system resources consumed by each planned step of the query request on the data nodes and generates recoverable information such as the number and positions of recoverable points among the execution steps. While the control node directs the data nodes in executing the query request, if an execution step fails, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. Execution therefore need not restart from the beginning after a failure, which improves the efficiency of parallel database query processing.
  • FIG. 10 is a schematic structural diagram of Embodiment 4 of a control node according to the present invention. As shown in FIG. 10, on the basis of Embodiment 3 of the control node shown in FIG. 9, the control node of this embodiment may further include a memory 1001.
  • In this embodiment, the memory 1001 may be used to store the recoverable information. The recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.
  • Specifically, the processor 903 evaluates the resource consumption corresponding to each step of the query request and sets at least one recoverable point. When a step fails, the processor 903 analyzes the error information, finds, among the recoverable points whose execution results are stored in the memory 1001, the one closest to the failed step, and selects the step following that recoverable point as the starting step for re-executing the query request.
  • In this embodiment, setting recoverable points makes it possible, when a step fails, to find the closest recoverable point whose execution result has been saved and to restart from the step that follows it. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
  • In the control node above, the transmitter 901 may be further configured to send, to each data node, third step information for executing the query request; the receiver 902 may be further configured to receive a first success message sent by each data node, where the first success message indicates that the third step was executed successfully; and the memory 1001 may be further used to record the first recoverable point information, that is, the recoverable point located before the third step and closest to it.
  • The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 3; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 3; details are not repeated here.
  • In this embodiment, a first recoverable point is set before the first step. When a step after the first recoverable point fails, the step following the first recoverable point can be selected, according to the first recoverable point information, as the starting step for re-executing the query request. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
  • In the control node above, the processor 903 determining, according to the preset recoverable information, the second step for executing the query request may specifically include: taking the step immediately following the first recoverable point as the second step.
  • Specifically, the first recoverable point information describes the recoverable point located before the third step and closest to it, and the third step is any step before the first step. Therefore, when the first step fails, the processor 903 can select the step following the first recoverable point as the second step, that is, as the starting step for re-executing the query request.
  • In this embodiment, because a first recoverable point is set before the first step, the step following the first recoverable point can be selected as the second step when the first step fails. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
  • In the control node above, the transmitter 901 may be further configured to send, to each data node, fourth step information for executing the query request; the receiver 902 may be further configured to receive a second success message sent by each data node, where the second success message indicates that the fourth step was executed successfully; and the memory 1001 may be further used to record the second recoverable point information, that is, the recoverable point located before the fourth step and closest to it.
  • The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 4; its implementation principle and technical effect are similar.
  • In this embodiment, a second recoverable point is recorded after the first recoverable point. When a step after the second recoverable point fails, the step following the second recoverable point can be selected, according to the second recoverable point information, as the starting step for re-executing the query request. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.
  • In the control node above, the processor 903 determining, according to the preset recoverable information, the second step for executing the query request may specifically include: taking the step immediately following the second recoverable point as the second step.
  • Specifically, the second recoverable point information describes the recoverable point located before the fourth step and closest to it, and the fourth step is a step before the first step and after the first recoverable point. Therefore, when the first step fails, the processor 903 can select the step following the second recoverable point as the second step, that is, as the starting step for re-executing the query request.
  • In this embodiment, because a second recoverable point is set after the first recoverable point, the step following the second recoverable point can be selected as the second step when the first step fails. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.
  • The aforementioned program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

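Embodiments of the present invention provide a query processing method and apparatus. The method includes: sending, to each data node in a system, first step information for executing a client query request; receiving a failure message sent by at least one data node, where the failure message indicates that execution of the first step failed or was abnormal; determining, according to preset recoverable information, a second step for executing the query request, where the second step is the starting step for re-executing the query request and the recoverable information includes the steps that need not be repeated when the query request is re-executed; and sending the second step information to each data node. The query processing method and apparatus provided by the embodiments of the present invention can improve the efficiency of parallel database query processing.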

Description

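Query processing method and apparatus

Technical Field

Embodiments of the present invention relate to database technology, and in particular to a query processing method and apparatus.

Background

A parallel database is a data management technology that accelerates query processing by distributing data evenly across multiple data nodes and executing queries on those nodes in parallel. Parallel databases are mainly used for massive data storage; as the number of data nodes grows, the probability of an exception occurring during query processing increases.
In the prior art, if an exception occurs while a parallel database is processing a user's query request, the control node that directs the data nodes to perform the query operation abandons the query processing in progress. After error correction completes, the control node again directs the data nodes to execute the user's query request.
However, because the storage capacity of a parallel database is usually large, having the database re-execute the user's query request from scratch makes query processing inefficient and wastes system resources.

Summary of the Invention

Embodiments of the present invention provide a query processing method and apparatus for improving the efficiency of parallel database query processing.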
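In one aspect, an embodiment of the present invention provides a query processing method, including:
sending, to each data node in a system, first step information for executing a client query request;
receiving a failure message sent by at least one of the data nodes, where the failure message indicates that execution of the first step failed or was abnormal;
determining, according to preset recoverable information, a second step for executing the query request, where the second step is the starting step for re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed;
sending the second step information to each data node.
With reference to the first aspect, before the sending, to each data node in the system, of the first step information for executing the client query request, the method further includes:
generating the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
With reference to the first aspect, the recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.
With reference to the first aspect, before the sending, to each data node in the system, of the first step information for executing the client query request, the method further includes:
sending, to each data node, third step information for executing the query request;
receiving a first success message sent by each data node, where the first success message indicates that the third step was executed successfully;
recording first recoverable point information, the first recoverable point being located before the third step and closest to the third step.
With reference to the first aspect, the determining, according to preset recoverable information, of the second step for executing the query request includes:
taking the step immediately following the first recoverable point as the second step.
With reference to the first aspect, after the recording of the first recoverable point information and before the sending, to each data node in the system, of the first step information for executing the client query request, the method further includes:
sending, to each data node, fourth step information for executing the query request;
receiving a second success message sent by each data node, where the second success message indicates that the fourth step was executed successfully;
recording second recoverable point information, the second recoverable point being located before the fourth step and closest to the fourth step.
With reference to the first aspect, the determining, according to preset recoverable information, of the second step for executing the query request includes:
taking the step immediately following the second recoverable point as the second step.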
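In another aspect, an embodiment of the present invention provides a control node, including:
a first sending module, configured to send, to each data node in a system, first step information for executing a client query request;
a receiving module, configured to receive a failure message sent by at least one of the data nodes, where the failure message indicates that execution of the first step failed or was abnormal;
a processing module, configured to determine, according to preset recoverable information, a second step for executing the query request, where the second step is the starting step for re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed;
a second sending module, configured to send the second step information to each data node.
With reference to the second aspect, the processing module is further configured to generate the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
With reference to the second aspect, the control node further includes a storage module, configured to store the recoverable information, where the recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.
With reference to the second aspect, the first sending module is further configured to send, to each data node, third step information for executing the query request;
the receiving module is further configured to receive a first success message sent by each data node, where the first success message indicates that the third step was executed successfully;
the storage module is further configured to record first recoverable point information, the first recoverable point being located before the third step and closest to the third step.
With reference to the second aspect, the processing module determining, according to preset recoverable information, the second step for executing the query request includes: taking the step immediately following the first recoverable point as the second step.
With reference to the second aspect, the first sending module is further configured to send, to each data node, fourth step information for executing the query request;
the receiving module is further configured to receive a second success message sent by each data node, where the second success message indicates that the fourth step was executed successfully;
the storage module is further configured to record second recoverable point information, the second recoverable point being located before the fourth step and closest to the fourth step.
With reference to the second aspect, the processing module determining, according to preset recoverable information, the second step for executing the query request includes: taking the step immediately following the second recoverable point as the second step.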
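In yet another aspect, an embodiment of the present invention provides a control node, including:
a transmitter, configured to send, to each data node in a system, first step information for executing a client query request;
a receiver, configured to receive a failure message sent by at least one of the data nodes, where the failure message indicates that execution of the first step failed or was abnormal;
a processor, configured to determine, according to preset recoverable information, a second step for executing the query request, where the second step is the starting step for re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed;
the transmitter being further configured to send the second step information to each data node.
With reference to the third aspect, the processor is further configured to generate the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
With reference to the third aspect, the control node further includes a memory, configured to store the recoverable information, where the recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.
With reference to the third aspect, the transmitter is further configured to send, to each data node, third step information for executing the query request;
the receiver is further configured to receive a first success message sent by each data node, where the first success message indicates that the third step was executed successfully;
the memory is further configured to record first recoverable point information, the first recoverable point being located before the third step and closest to the third step.
With reference to the third aspect, the processor determining, according to preset recoverable information, the second step for executing the query request includes: taking the step immediately following the first recoverable point as the second step.
With reference to the third aspect, the transmitter is further configured to send, to each data node, fourth step information for executing the query request;
the receiver is further configured to receive a second success message sent by each data node, where the second success message indicates that the fourth step was executed successfully;
the memory is further configured to record second recoverable point information, the second recoverable point being located before the fourth step and closest to the fourth step.
With reference to the third aspect, the processor determining, according to preset recoverable information, the second step for executing the query request includes: taking the step immediately following the second recoverable point as the second step.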
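In the technical solution provided by the embodiments of the present invention, while the control node directs the data nodes in the system in executing query processing, if a step fails, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. After a step fails during the query, execution need not restart from the beginning, which improves the efficiency of parallel database query processing.

Brief Description of the Drawings

To describe the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a flowchart of Embodiment 1 of the query processing method of the present invention;
FIG. 2 is a flowchart of Embodiment 2 of the query processing method of the present invention;
FIG. 3 is a flowchart of Embodiment 3 of the query processing method of the present invention;
FIG. 4 is a flowchart of Embodiment 4 of the query processing method of the present invention;
FIG. 5 is a schematic diagram of Embodiment 5 of the query processing method of the present invention;
FIG. 6 is a flowchart of Embodiment 5 of the query processing method of the present invention;
FIG. 7 is a schematic structural diagram of Embodiment 1 of the control node of the present invention;
FIG. 8 is a schematic structural diagram of Embodiment 2 of the control node of the present invention;
FIG. 9 is a schematic structural diagram of Embodiment 3 of the control node of the present invention;
FIG. 10 is a schematic structural diagram of Embodiment 4 of the control node of the present invention.

Detailed Description

To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.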
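FIG. 1 is a flowchart of Embodiment 1 of the query processing method of the present invention. As shown in FIG. 1, the method of this embodiment may include:
S101. Send, to each data node in the system, first step information for executing a client query request.
Specifically, the control node may generate execution plan steps according to the client's query request. Before each step is executed, the control node sends the information for that step to every data node in the system. The first step refers to any step that the control node sends to the data nodes while the query task is being executed.
S102. Receive a failure message sent by at least one data node.
The failure message may indicate that execution of the first step failed or was abnormal. Specifically, the control node issues each execution step task to every data node in the system, and the data nodes execute it. On success, a data node sends its execution result to the control node; when an execution exception occurs, it sends a failure message to the control node. When the control node receives a failure message from any data node, that task step is considered to have failed.
S103. Determine, according to preset recoverable information, a second step for executing the query request.
Specifically, after receiving the failure message sent by a data node, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. The second step is that starting step, and the recoverable information includes the steps that need not be repeated when the query request is re-executed.
S104. Send the second step information to each data node.
After determining that the starting step for re-executing the query request is the second step, the control node sends the second step information to every data node in the system.
In this embodiment, after receiving a failure message from any data node, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. After a step fails during the query, execution need not restart from the beginning, which improves the efficiency of parallel database query processing.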
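FIG. 2 is a flowchart of Embodiment 2 of the query processing method of the present invention. As shown in FIG. 2, on the basis of method Embodiment 1 shown in FIG. 1, the control node evaluates the cost of executing the query processing task according to the steps of the query request and generates the recoverable information. Therefore, before the control node sends the first step information for executing the client query request to each data node in the system, the method may further include the following operation:
S201. Generate the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
Specifically, the control node generates execution plan steps according to the client's query request. Each step describes a specific database operation, such as scanning or joining data tables. The control node evaluates the resource consumption corresponding to each step of the query request, for example the system resources consumed when retrieving data and the size of the data set matching the retrieval conditions, and decides on the recoverable information according to a certain algorithm. The recoverable information may include the number and positions of recoverable points.
In this embodiment, the control node evaluates the system resources consumed by each planned step of the query request on the data nodes and generates recoverable information such as the number and positions of recoverable points among the execution steps. While the control node directs the data nodes in executing the query request, if an execution step fails, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. Execution therefore need not restart from the beginning after a failure, which improves the efficiency of parallel database query processing.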
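The text leaves the algorithm for placing recoverable points unspecified. The following is a minimal sketch of one possible rule, offered only as an illustration: a recoverable point is placed after a step once the estimated cost accumulated since the previous point reaches a threshold. The names `PlanStep`, `estimated_cost`, `choose_recoverable_points`, and `cost_threshold`, as well as the cost figures, are assumptions and do not come from the original.

```python
from dataclasses import dataclass


@dataclass
class PlanStep:
    name: str
    estimated_cost: float  # hypothetical unit, e.g. estimated rows scanned


def choose_recoverable_points(plan, cost_threshold):
    """Return the names of steps after which a recoverable point is placed.

    Assumed rule (not from the source): place a point as soon as the cost
    accumulated since the previous point reaches cost_threshold.
    """
    points = []
    accumulated = 0.0
    for step in plan:
        accumulated += step.estimated_cost
        if accumulated >= cost_threshold:
            points.append(step.name)
            accumulated = 0.0  # start counting again after the new point
    return points


# Hypothetical cost estimates for a five-step plan A..E.
plan = [
    PlanStep("A", 40),
    PlanStep("B", 70),
    PlanStep("C", 30),
    PlanStep("D", 90),
    PlanStep("E", 10),
]
recoverable_points = choose_recoverable_points(plan, cost_threshold=100)
print(recoverable_points)
```

With these made-up costs the rule places points after steps B and D, which happens to match the placement used in the FIG. 5 example. A real control node could weigh other factors, such as the cost of storing intermediate results.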
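In the query processing method above, the recoverable information may include at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.
Specifically, the control node evaluates the resource consumption corresponding to each step of the query request and sets at least one recoverable point. When a step fails, the control node analyzes the error information, finds, among the recoverable points whose execution results have been saved, the one closest to the failed step, and selects the step following that recoverable point as the starting step for re-executing the query request.
In this embodiment, setting recoverable points makes it possible, when a step fails, to find the closest recoverable point whose execution result has been saved and to restart from the step that follows it. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.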
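The lookup described above can be sketched as a small function. This is an illustrative sketch, not the patented implementation; `restart_step`, `plan`, and `saved_points` are hypothetical names, and the plan is assumed to be a simple linear sequence of step names.

```python
def restart_step(plan, saved_points, failed_step):
    """Return the step from which the query should be re-executed.

    plan:         ordered list of step names
    saved_points: set of recoverable points whose results are already saved
    failed_step:  name of the step that reported a failure
    """
    failed_index = plan.index(failed_step)
    # Walk backwards from the step just before the failed one and stop at
    # the closest recoverable point that has a saved result.
    for i in range(failed_index - 1, -1, -1):
        if plan[i] in saved_points:
            return plan[i + 1]
    # No saved recoverable point precedes the failure: start from the top.
    return plan[0]


plan = ["A", "B", "C", "D", "E"]
print(restart_step(plan, {"B"}, "D"))       # only B is saved when D fails
print(restart_step(plan, {"B", "D"}, "E"))  # D is saved when E fails
```

In the first call the restart step is C, because B is the closest saved recoverable point before D. In the second call the failed step E is itself the step after the latest saved point, so only E is repeated.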
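FIG. 3 is a flowchart of Embodiment 3 of the query processing method of the present invention. As shown in FIG. 3, on the basis of method Embodiment 2 shown in FIG. 2, other execution steps may exist before the first step fails or becomes abnormal. Therefore, before the first step information for executing the client query request is sent to each data node in the system, the method may further include the following operations:
S301. Send, to each data node, third step information for executing the query request.
Specifically, the third step is any step before the first step. The control node sends the step information for executing the query request to every data node in the system and waits for the data nodes to send back execution result information.
S302. Receive a first success message sent by each data node.
Specifically, the first success message indicates that the third step was executed successfully; every data node in the system sends its execution result to the control node.
S303. Record first recoverable point information, the first recoverable point being located before the third step and closest to the third step.
Specifically, after the third step has been executed successfully, the control node records the first recoverable point information for the recoverable point located before the third step and closest to it. This recoverable point information contains the concrete results of the steps already executed, so that when a subsequent step fails, the step following the recoverable point can be selected as the starting step for re-executing the query request.
In this embodiment, a first recoverable point is set before the first step. When a step after the first recoverable point fails, the step following the first recoverable point can be selected, according to the first recoverable point information, as the starting step for re-executing the query request. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
In the query processing method above, determining, according to the preset recoverable information, the second step for executing the query request includes: taking the step immediately following the first recoverable point as the second step.
Specifically, the first recoverable point information describes the recoverable point located before the third step and closest to it, and the third step is any step before the first step. Therefore, when the first step fails, the step following the first recoverable point can be selected as the second step, that is, as the starting step for re-executing the query request.
In this embodiment, because a first recoverable point is set before the first step, the step following the first recoverable point can be selected as the second step when the first step fails. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.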
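FIG. 4 is a flowchart of Embodiment 4 of the query processing method of the present invention. As shown in FIG. 4, on the basis of method Embodiment 3 shown in FIG. 3, a second recoverable point may additionally be recorded after the first recoverable point and before the first step. Therefore, after the first recoverable point information is recorded and before the first step information for executing the client query request is sent to each data node in the system, the method may further include the following operations:
S401. Send, to each data node, fourth step information for executing the query request.
Specifically, the fourth step is a step before the first step and after the first recoverable point. The control node sends the step information for executing the query request to every data node in the system and waits for the data nodes to send back execution result information.
S402. Receive a second success message sent by each data node.
Specifically, the second success message indicates that the fourth step was executed successfully; every data node in the system sends its execution result to the control node.
S403. Record second recoverable point information, the second recoverable point being located before the fourth step and closest to the fourth step.
Specifically, after the fourth step has been executed successfully, the control node records the second recoverable point information for the recoverable point located before the fourth step and closest to it. This recoverable point information contains the concrete results of the steps already executed, so that when a subsequent step fails, the step following the recoverable point can be selected as the starting step for re-executing the query request.
Because the second recoverable point lies after the first recoverable point, the second recoverable point information includes the concrete results of the execution steps recorded in the first recoverable point information. The second recoverable point information can therefore overwrite the first recoverable point information.
In this embodiment, a second recoverable point is recorded after the first recoverable point. When a step after the second recoverable point fails, the step following the second recoverable point can be selected, according to the second recoverable point information, as the starting step for re-executing the query request. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.
In the query processing method above, determining, according to the preset recoverable information, the second step for executing the query request includes: taking the step immediately following the second recoverable point as the second step.
Specifically, the second recoverable point information describes the recoverable point located before the fourth step and closest to it, and the fourth step is a step before the first step and after the first recoverable point. Therefore, when the first step fails, the step following the second recoverable point can be selected as the second step, that is, as the starting step for re-executing the query request.
In this embodiment, because a second recoverable point is set after the first recoverable point, the step following the second recoverable point can be selected as the second step when the first step fails. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.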
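FIG. 5 is a schematic diagram of Embodiment 5 of the query processing method of the present invention. As shown in FIG. 5, the control node 500 generates execution plan steps 503 according to the client's query request and generates recoverable point information by evaluating the resource consumption of each step, where the first recoverable point 501 and the second recoverable point 502 correspond to step B and step D, respectively. The concrete execution steps 504 are as follows:
1) The control node first issues the step A task to every data node. If every data node executes step A successfully, the control node issues the step B task;
2) If every data node executes step B successfully, the control node first records the information of the first recoverable point 501 and then issues the step C task to every data node;
3) If a data node encounters an exception or failure while executing step C, the control node analyzes the error information and, according to the currently saved information of the first recoverable point 501, reissues the step C task to every data node instead of starting over from step A;
4) The control node issues the step D task to every data node. If execution succeeds, the control node records the information of the second recoverable point 502, overwriting the information of the first recoverable point 501; if execution fails, execution resumes from step C;
5) When a data node encounters an exception while executing step E, the control node, according to the most recently recorded information of the second recoverable point 502, executes step E again instead of starting from step C or from step A.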
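The walkthrough above can be traced with a short simulation. This is a hedged sketch under simplifying assumptions: the plan is linear, failures are injected through a hypothetical `failures` table rather than reported by real data nodes, and only the index of the latest recoverable point is kept, since a later point overwrites an earlier one. None of the names come from the original.

```python
def run_plan(plan, recoverable_points, failures):
    """Simulate the control node dispatching steps to the data nodes.

    failures maps a step name to the number of times it fails before it
    succeeds (hypothetical fault injection). Returns the dispatch order.
    """
    remaining = dict(failures)
    dispatched = []
    latest_point = None  # index of the latest recoverable point with saved results
    i = 0
    while i < len(plan):
        step = plan[i]
        dispatched.append(step)
        if remaining.get(step, 0) > 0:
            remaining[step] -= 1
            # Failure: resume from the step after the latest recoverable
            # point, or from the beginning if none has been recorded yet.
            i = 0 if latest_point is None else latest_point + 1
            continue
        if step in recoverable_points:
            latest_point = i  # overwrites any earlier recoverable point
        i += 1
    return dispatched


# Steps A..E with recoverable points at B and D; C and E each fail once.
trace = run_plan(list("ABCDE"), {"B", "D"}, {"C": 1, "E": 1})
print("".join(trace))
```

The trace shows C and E each being dispatched twice while A, B, and D run only once. If D failed instead, the sketch would resume from C, as item 4) describes.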
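FIG. 6 is a flowchart of Embodiment 5 of the query processing method of the present invention. As shown in FIG. 6, in this embodiment the control node stores the recoverable point information of the execution plan and the recoverable point state of the task currently being executed, where the first recoverable point 501 and the second recoverable point 502 correspond to step B and step D, respectively. Assuming that step E fails, the interaction among the client, the control node, and the data nodes is as follows:
S601. The client sends a query request message to the control node.
S602. The control node generates the execution plan steps and the recoverable point information.
The control node generates the execution plan steps according to the client's query request and generates the recoverable information by evaluating the resource consumption of each step, where the recoverable information includes the number and positions of recoverable points.
S603. The control node sends step A information to each data node.
The control node sends the task message for executing step A to each data node.
S604. The data nodes execute step A.
Each data node executes task step A.
S605. The data nodes send the execution result of step A to the control node.
Each data node sends its execution result to the control node.
S606. The control node sends step B information to each data node.
The control node sends the task message for executing step B to each data node.
S607. The data nodes execute step B.
Each data node executes task step B.
S608. The data nodes send the execution result to the control node.
Each data node sends the execution result of step B to the control node.
S609. The control node records recoverable point state = B.
The control node records that the first recoverable point executed successfully and stores the concrete result data of the executed steps.
S610. The control node sends step C information to each data node.
The control node sends the task message for executing step C to each data node.
S611. The data nodes execute step C.
Each data node executes task step C.
S612. The data nodes send the execution result to the control node.
Each data node sends the execution result of step C to the control node.
S613. The control node sends step D information to each data node.
The control node sends the task message for executing step D to each data node.
S614. The data nodes execute step D.
Each data node executes task step D.
S615. The data nodes send the execution result to the control node.
Each data node sends the execution result of step D to the control node.
S616. The control node updates recoverable point state = D.
The control node updates the recoverable point information: it records the information of the second recoverable point, overwriting the information of the first recoverable point.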
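S617. The control node sends step E information to each data node.
The control node sends the task message for executing step E to each data node.
S618. The data nodes execute step E.
Each data node executes task step E.
S619. A data node fails to execute step E and returns failure information to the control node.
Each data node sends its execution result to the control node. When the control node receives a step E failure message from any data node, it determines that step E has failed.
S620. Based on the recorded recoverable point state D, and after analyzing the cause of the error, the control node re-executes step E starting from D.
When a step fails or becomes abnormal, the control node decides, according to the latest recoverable point state information, to resume from the nearest step. Therefore, when step E fails, the control node, based on the recorded recoverable point state D and after analyzing the cause of the error, re-executes step E starting from D, rather than executing step C starting from B or executing step A from the very beginning.
S621. The control node resends step E information to each data node.
The control node resends the task message for executing step E to each data node.
S622. The data nodes re-execute step E.
Each data node re-executes task step E.
S623. The data nodes send the execution result to the control node.
Each data node sends the execution result of step E to the control node.
S624. The control node returns the result to the client.
The control node returns the final query processing result to the client.
In this embodiment, the control node generates the recoverable point information for the execution steps at the same time as it generates the execution plan, and it records and manages the recoverable point state. When an execution failure occurs, the control node determines the nearest recoverable point from the recoverable point state information and resumes execution from that point. This avoids repeated processing of massive data and improves the efficiency of parallel database query processing.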
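To describe the technical solutions of the embodiments of the present invention more clearly, an example of parallel database query processing is given below.
Assume there are two tables: the student table and the sc table. The student table has three fields, sid, name, and sex; the sc table has three fields, sid, cn, and score. The tables can be defined with the following statements:
create table student (sid int, name varchar(20), sex varchar(20)) partitioning key sid on all;
create table sc (sid int, cn varchar(20), score float) partitioning key cn on all;
Table 1 is the student table, and its data is shown in Table 1; Table 2 is the sc table, and its data is shown in Table 2. The table data is stored on three data nodes. The query can be written with the following statement:
Select student.name, sc.cn, sc.score from student, sc where student.sid
[Tables 1 and 2: figure imgf000014_0001, table data not reproduced in the extracted text]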
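Table 3 is the query processing table. The query plan for the student table and the sc table is divided into 4 steps; the query statements are listed in Table 3.
Table 3. Query processing table
[Figure imgf000014_0002: the first part of Table 3 is an image and is not reproduced in the extracted text]
Step 1
Operation: SELECT student.name as name, student.sid as sid FROM student
targetTable = TMPTT1_1
targetSchema = CREATE TABLE "TMPTT1_1" ("name" VARCHAR (20), "sid" INT) WITHOUT OIDS
destType = DEST_TYPE_BROADCAST
Nodes: isProducer = true, isConsumer = true; nodeId = 2: isProducer = true, isConsumer = true; nodeId = 3: isProducer = true, isConsumer = true
Description: first create the temporary table TMPTT1_1, then execute the query statement and broadcast the result to the other data nodes.
Step 2
Operation: queryString = SELECT TMPTT1_1.name as name, scl.cn as cn, scl.score as score FROM TMPTT1_1 INNER join scl on (scl.sid = TMPTT1_1.sid)
destType = DEST_TYPE_COORD_FINAL
Nodes: nodeId = 1: isProducer = true, isConsumer = false; nodeId = 2: isProducer = true, isConsumer = false; nodeId = 3: isProducer = true, isConsumer = false
Description: execute the join query on each underlying data node.
Step 3
Operation: Final Result Set construct
Nodes: Master
Description: the control node wraps the multiple ResultSets returned by the data nodes into a single external ResultSet.
Step 4
Operation: Drop temporary table TMPTT1_1
Nodes: nodeId = 1,2,3
Description: after the result has been returned to the client, the control node notifies each data node to delete the temporary table.
The control node proceeds as follows: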
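1) Step 1 is executed first. If it succeeds, the control node records the recoverable point state as step 1; otherwise, the control node analyzes the cause of the abnormality and decides whether to re-execute or report an error;
2) When step 2 is executed: if execution fails, the control node analyzes the cause of the abnormality and re-executes step 2 based on the latest recoverable point state; otherwise, the control node overwrites the recoverable point state with step 2 and continues with steps 3 and 4;
3) When an abnormality occurs during step 3 or step 4, the control node analyzes the cause and, based on the latest stored recoverable point information, resumes directly from step 3;
4) After all steps are completed, the control node deletes the recoverable point information.
In this embodiment, step 1 mainly creates a temporary table, scans the underlying data table, and stores the result in the temporary table; step 2 joins the temporary table with the sc table. Both steps are time-consuming, and they are also the steps most likely to fail as the number of data nodes grows. Steps 1 and 2 can therefore be set as recoverable points, which avoids repeating operations on massive data during query processing and improves the efficiency of parallel database query processing.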
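The benefit of keeping the step 1 result can be illustrated on a single machine. The sketch below uses Python's built-in `sqlite3` module as a stand-in for one data node; it is not a parallel database, the `partitioning key` clauses are dropped because SQLite does not support them, and the sample rows, the injected failure, and the variable `recoverable_state` are all made up for illustration. The join condition follows the one shown for step 2 of the plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (sid INT, name VARCHAR(20), sex VARCHAR(20));
    CREATE TABLE sc (sid INT, cn VARCHAR(20), score FLOAT);
    INSERT INTO student VALUES (1, 'Ann', 'F'), (2, 'Bob', 'M');  -- made-up rows
    INSERT INTO sc VALUES (1, 'math', 90.0), (2, 'physics', 85.0);
""")


def step1(conn):
    # Step 1: materialize the scan of student into a temporary table.
    conn.execute("CREATE TABLE tmptt1_1 AS SELECT name, sid FROM student")


def step2(conn):
    # Step 2: join the temporary table with sc.
    return conn.execute(
        "SELECT t.name, sc.cn, sc.score FROM tmptt1_1 AS t "
        "INNER JOIN sc ON sc.sid = t.sid ORDER BY t.sid"
    ).fetchall()


def step4(conn):
    # Step 4: drop the temporary table once the result has been returned.
    conn.execute("DROP TABLE tmptt1_1")


recoverable_state = None
step1(conn)
recoverable_state = 1  # step 1 result is saved; it need not be redone

attempts = 0
rows = None
while rows is None:
    attempts += 1
    try:
        if attempts == 1:
            raise RuntimeError("injected failure in step 2")  # hypothetical fault
        rows = step2(conn)
    except RuntimeError:
        # Retry step 2 only; the temporary table from step 1 is reused.
        continue
recoverable_state = 2  # overwrite the recoverable point state

step4(conn)
recoverable_state = None  # all steps done: delete the recoverable point info
print(rows, attempts)
```

Step 1 runs exactly once even though step 2 is attempted twice, which is the saving the embodiment describes; on a real cluster the saved state would also have to survive whatever caused the failure.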
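FIG. 7 is a schematic structural diagram of Embodiment 1 of the control node of the present invention. As shown in FIG. 7, the control node of this embodiment may include: a first sending module 701, a receiving module 702, a processing module 703, and a second sending module 704. The first sending module 701 is configured to send, to each data node in the system, first step information for executing a client query request. The receiving module 702 is configured to receive a failure message sent by at least one data node, where the failure message indicates that execution of the first step failed or was abnormal. The processing module 703 is configured to determine, according to preset recoverable information, a second step for executing the query request, where the second step is the starting step for re-executing the query request, and the recoverable information includes the steps that need not be repeated when the query request is re-executed. The second sending module 704 is configured to send the second step information to each data node.
The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 1; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 1; details are not repeated here.
In this embodiment, after receiving a failure message from any data node, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. After a step fails during the query, execution need not restart from the beginning, which improves the efficiency of parallel database query processing.
In the control node above, the processing module 703 may be further configured to generate the recoverable information according to the resource consumption corresponding to each step of the query request executed by the data nodes.
The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 2; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 2; details are not repeated here.
In this embodiment, the control node evaluates the system resources consumed by each planned step of the query request on the data nodes and generates recoverable information such as the number and positions of recoverable points among the execution steps. While the control node directs the data nodes in executing the query request, if an execution step fails, the control node analyzes the error information and determines, according to the currently saved recoverable information, the starting step for re-executing the query request. Execution therefore need not restart from the beginning after a failure, which improves the efficiency of parallel database query processing.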
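FIG. 8 is a schematic structural diagram of Embodiment 2 of the control node of the present invention. As shown in FIG. 8, on the basis of Embodiment 1 of the control node shown in FIG. 7, the control node of this embodiment may further include a storage module 801.
In this embodiment, the storage module 801 may be used to store the recoverable information. The recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.
Specifically, the processing module 703 evaluates the resource consumption corresponding to each step of the query request and sets at least one recoverable point. When a step fails, the processing module 703 analyzes the error information, finds, among the recoverable points whose execution results are stored in the storage module 801, the one closest to the failed step, and selects the step following that recoverable point as the starting step for re-executing the query request.
In this embodiment, setting recoverable points makes it possible, when a step fails, to find the closest recoverable point whose execution result has been saved and to restart from the step that follows it. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
In the control node above, the first sending module 701 may be further configured to send, to each data node, third step information for executing the query request; the receiving module 702 may be further configured to receive a first success message sent by each data node, where the first success message indicates that the third step was executed successfully; and the storage module 801 may be further used to record the first recoverable point information, that is, the recoverable point located before the third step and closest to it.
The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 3; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 3; details are not repeated here.
In this embodiment, a first recoverable point is set before the first step. When a step after the first recoverable point fails, the step following the first recoverable point can be selected, according to the first recoverable point information, as the starting step for re-executing the query request. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.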
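In the control node above, the processing module 703 determining, according to the preset recoverable information, the second step for executing the query request may specifically include: taking the step immediately following the first recoverable point as the second step.
Specifically, the first recoverable point information describes the recoverable point located before the third step and closest to it, and the third step is any step before the first step. Therefore, when the first step fails, the processing module 703 can select the step following the first recoverable point as the second step, that is, as the starting step for re-executing the query request.
In this embodiment, because a first recoverable point is set before the first step, the step following the first recoverable point can be selected as the second step when the first step fails. This avoids executing from the beginning after a failure and improves the efficiency of parallel database query processing.
In the control node above, the first sending module 701 may be further configured to send, to each data node, fourth step information for executing the query request; the receiving module 702 may be further configured to receive a second success message sent by each data node, where the second success message indicates that the fourth step was executed successfully; and the storage module 801 may be further used to record the second recoverable point information, that is, the recoverable point located before the fourth step and closest to it.
The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 4; its implementation principle and technical effect are similar. For the specific process, refer to the related description in the method embodiment shown in FIG. 4; details are not repeated here.
In this embodiment, a second recoverable point is recorded after the first recoverable point. When a step after the second recoverable point fails, the step following the second recoverable point can be selected, according to the second recoverable point information, as the starting step for re-executing the query request. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.
In the control node above, the processing module 703 determining, according to the preset recoverable information, the second step for executing the query request may specifically include: taking the step immediately following the second recoverable point as the second step.
Specifically, the second recoverable point information describes the recoverable point located before the fourth step and closest to it, and the fourth step is a step before the first step and after the first recoverable point. Therefore, when the first step fails, the processing module 703 can select the step following the second recoverable point as the second step, that is, as the starting step for re-executing the query request.
In this embodiment, because a second recoverable point is set after the first recoverable point, the step following the second recoverable point can be selected as the second step when the first step fails. This avoids re-executing some query processing steps after a failure and improves the efficiency of parallel database query processing.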
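FIG. 9 is a schematic structural diagram of Embodiment 3 of the control node of the present invention. As shown in FIG. 9, the control node of this embodiment may include a transmitter 901, a receiver 902, and a processor 903. The transmitter 901 is configured to send, to each data node in the system, first step information for executing a client query request. The receiver 902 is configured to receive a failure message sent by at least one data node, where the failure message indicates that execution of the first step failed or was abnormal. The processor 903 is configured to determine, according to preset recoverable information, a second step of executing the query request, where the second step is the starting step for re-executing the query request and the recoverable information includes the steps that do not need to be repeated when the query request is re-executed. The transmitter 901 is further configured to send the second step information to each data node.
The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 1. Its implementation principle and technical effect are similar; for the specific query processing procedure, refer to the description of the method embodiment shown in FIG. 1, which is not repeated here.
In this embodiment, after receiving a failure message from any data node, the control node analyzes the error information and, according to the currently stored recoverable information, determines the starting step for re-executing the query request. A step that fails during the query therefore does not force execution to restart from the beginning, which improves the efficiency of parallel database query processing.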
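The overall send, receive, determine, and resend flow of the control node can be simulated in a few lines. This is a sketch under simplifying assumptions, not the implementation of the application: data nodes are modeled as plain callables that return `True` for a success message and `False` for a failure message, steps are consecutive integers, every recoverable point before the failed step is assumed to have its result saved, and the names `make_flaky_node`, `run_query`, and `max_retries` are hypothetical.

```python
def make_flaky_node(fail_once_at):
    """Hypothetical data node that fails the given step once, then succeeds."""
    state = {"failed": False}

    def node(step_id):
        if step_id == fail_once_at and not state["failed"]:
            state["failed"] = True
            return False  # stands in for a failure message
        return True       # stands in for a success message

    return node


def run_query(steps, data_nodes, recoverable_points, max_retries=3):
    """Dispatch steps to all data nodes, restarting after failures."""
    points = sorted(recoverable_points)
    executed = []  # every step dispatched, in order, including repeats
    index = 0
    retries = 0
    while index < len(steps):
        step = steps[index]
        executed.append(step)
        # Send the step information to every data node and collect replies.
        replies = [node(step) for node in data_nodes]
        if all(replies):
            index += 1
            continue
        retries += 1
        if retries > max_retries:
            raise RuntimeError(f"step {step} failed after {max_retries} retries")
        # Determine the restart step from the nearest earlier recoverable point.
        earlier = [p for p in points if p < step]
        restart = earlier[-1] + 1 if earlier else steps[0]
        index = steps.index(restart)
    return executed


nodes = [make_flaky_node(4), lambda step_id: True]
print(run_query([1, 2, 3, 4, 5], nodes, recoverable_points=[2]))
```

In this run step 4 fails once, so the control node resumes at step 3 (the step after the recoverable point at step 2) instead of at step 1.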
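In the control node described above, the processor 903 may further be configured to generate the recoverable information according to the resource consumption corresponding to each step in which the data nodes execute the query request.
The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 2. Its implementation principle and technical effect are similar; for the specific query processing procedure, refer to the description of the method embodiment shown in FIG. 2, which is not repeated here.
In this embodiment, the control node evaluates the system resources consumed by each planned step that the data nodes execute for the query request, and from this generates recoverable information such as the number and positions of recoverable points among the execution steps. While the control node directs the data nodes in the system to execute the query request, if any step fails, the control node analyzes the error information and, according to the currently stored recoverable information, determines the starting step for re-executing the query request. Execution therefore does not have to restart from the beginning after a failure, which improves the efficiency of parallel database query processing.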
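FIG. 10 is a schematic structural diagram of Embodiment 4 of the control node of the present invention. As shown in FIG. 10, on the basis of Embodiment 3 of the control node shown in FIG. 9, the control node of this embodiment may further include a memory 1001.
In this embodiment, the memory 1001 may be configured to store the recoverable information. The recoverable information includes at least one piece of recoverable point information, and the step immediately following a recoverable point serves as the starting step for re-executing the query request.
Specifically, the processor 903 evaluates the resource consumption corresponding to each step of executing the query request and sets at least one recoverable point. When a step fails, the processor 903 analyzes the error information, finds, among the recoverable points whose execution results are already stored in the memory 1001, the recoverable point nearest to the failed step, and selects the step following that recoverable point as the starting step for re-executing the query request.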
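In this embodiment, recoverable points are set so that, when a step fails, the recoverable point nearest to the failed step is found among the recoverable points whose execution results have been saved, and the step following that recoverable point is selected as the starting step for re-executing the query request. This avoids re-executing from the beginning after a failure and improves the efficiency of parallel database query processing.
In the control node described above, the transmitter 901 may further be configured to send third step information for executing the query request to each data node; the receiver 902 may further be configured to receive a first success message sent by each data node, where the first success message indicates that the third step was executed successfully; and the memory 1001 may further be configured to record first recoverable point information, the first recoverable point being located before the third step and nearest to the third step.
The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 3. Its implementation principle and technical effect are similar; for the specific query processing procedure, refer to the description of the method embodiment shown in FIG. 3, which is not repeated here.
In this embodiment, a first recoverable point is set before the first step, so that when a step after the first recoverable point fails, the step following the first recoverable point can be selected, according to the first recoverable point information, as the starting step for re-executing the query request. This avoids re-executing from the beginning after a failure and improves the efficiency of parallel database query processing.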
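In the control node described above, the determining by the processor 903, according to the preset recoverable information, of the second step of executing the query request may specifically include: using the step immediately following the first recoverable point as the second step.
Specifically, the first recoverable point information is the information of the recoverable point located before the third step and nearest to the third step, and the third step is any step before the first step. Therefore, when the first step fails, the processor 903 may select the step following the first recoverable point as the second step, that is, as the starting step for re-executing the query request.
In this embodiment, a first recoverable point is set before the first step, so that when the first step fails, the step following the first recoverable point can be selected as the second step, that is, as the starting step for re-executing the query request. This avoids re-executing from the beginning after a failure and improves the efficiency of parallel database query processing.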
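In the control node described above, the transmitter 901 may further be configured to send fourth step information for executing the query request to each data node; the receiver 902 may further be configured to receive a second success message sent by each data node, where the second success message indicates that the fourth step was executed successfully; and the memory 1001 may further be configured to record second recoverable point information, the second recoverable point being located before the fourth step and nearest to the fourth step.
The control node of this embodiment may be used to perform the method of the method embodiment shown in FIG. 4. Its implementation principle and technical effect are similar; for the specific query processing procedure, refer to the description of the method embodiment shown in FIG. 4, which is not repeated here.
In this embodiment, the recorded recoverable point is updated to a second recoverable point located after the first recoverable point, so that when a step after the second recoverable point fails, the step following the second recoverable point can be selected, according to the second recoverable point information, as the starting step for re-executing the query request. This avoids re-executing some of the query processing steps after a failure and improves the efficiency of parallel database query processing.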
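In the control node described above, the determining by the processor 903, according to the preset recoverable information, of the second step of executing the query request may specifically include: using the step immediately following the second recoverable point as the second step.
Specifically, the second recoverable point information is the information of the recoverable point located before the fourth step and nearest to the fourth step, and the fourth step is a step that precedes the first step and follows the first recoverable point. Therefore, when the first step fails, the processor 903 may select the step following the second recoverable point as the second step, that is, as the starting step for re-executing the query request.
In this embodiment, a second recoverable point is set after the first recoverable point, so that when the first step fails, the step following the second recoverable point can be selected as the second step, that is, as the starting step for re-executing the query request. This avoids re-executing some of the query processing steps after a failure and improves the efficiency of parallel database query processing.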
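A person of ordinary skill in the art may understand that all or some of the steps of the foregoing method embodiments may be implemented by hardware related to program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the foregoing embodiments are intended only to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.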

Claims

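1. A query processing method, comprising:
sending, to each data node in a system, first step information for executing a client query request;
receiving a failure message sent by at least one of the data nodes, wherein the failure message indicates that execution of the first step failed or was abnormal;
determining, according to preset recoverable information, a second step of executing the query request, wherein the second step is a starting step for re-executing the query request, and the recoverable information includes steps that do not need to be repeated when the query request is re-executed; and
sending the second step information to each of the data nodes.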
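2. The method according to claim 1, wherein before the sending, to each data node in the system, of the first step information for executing the client query request, the method further comprises:
generating the recoverable information according to the resource consumption corresponding to each step in which the data nodes execute the query request.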
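3. The method according to claim 1 or 2, wherein the recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.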
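4. The method according to claim 3, wherein before the sending, to each data node in the system, of the first step information for executing the client query request, the method further comprises:
sending, to each of the data nodes, third step information for executing the query request;
receiving a first success message sent by each of the data nodes, wherein the first success message indicates that the third step was executed successfully; and
recording first recoverable point information, the first recoverable point being located before the third step and nearest to the third step.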
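5. The method according to claim 4, wherein the determining, according to the preset recoverable information, of the second step of executing the query request comprises:
using the step immediately following the first recoverable point as the second step.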
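6. The method according to claim 4, wherein after the recording of the first recoverable point information located before the third step and nearest to the third step, and before the sending, to each data node in the system, of the first step information for executing the client query request, the method further comprises:
sending, to each of the data nodes, fourth step information for executing the query request;
receiving a second success message sent by each of the data nodes, wherein the second success message indicates that the fourth step was executed successfully; and
recording second recoverable point information, the second recoverable point being located before the fourth step and nearest to the fourth step.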
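7. The method according to claim 6, wherein the determining, according to the preset recoverable information, of the second step of executing the query request comprises:
using the step immediately following the second recoverable point as the second step.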
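8. A control node, comprising:
a first sending module, configured to send, to each data node in a system, first step information for executing a client query request;
a receiving module, configured to receive a failure message sent by at least one of the data nodes, wherein the failure message indicates that execution of the first step failed or was abnormal;
a processing module, configured to determine, according to preset recoverable information, a second step of executing the query request, wherein the second step is a starting step for re-executing the query request, and the recoverable information includes steps that do not need to be repeated when the query request is re-executed; and
a second sending module, configured to send the second step information to each of the data nodes.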
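9. The control node according to claim 8, wherein the processing module is further configured to generate the recoverable information according to the resource consumption corresponding to each step in which the data nodes execute the query request.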
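10. The control node according to claim 8 or 9, wherein the control node further comprises a storage module, configured to store the recoverable information, wherein the recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.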
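11. The control node according to claim 10, wherein the first sending module is further configured to send, to each of the data nodes, third step information for executing the query request;
the receiving module is further configured to receive a first success message sent by each of the data nodes, wherein the first success message indicates that the third step was executed successfully; and
the storage module is further configured to record first recoverable point information, the first recoverable point being located before the third step and nearest to the third step.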
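12. The control node according to claim 11, wherein the determining by the processing module, according to the preset recoverable information, of the second step of executing the query request comprises: using the step immediately following the first recoverable point as the second step.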
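13. The control node according to claim 11, wherein
the first sending module is further configured to send, to each of the data nodes, fourth step information for executing the query request;
the receiving module is further configured to receive a second success message sent by each of the data nodes, wherein the second success message indicates that the fourth step was executed successfully; and
the storage module is further configured to record second recoverable point information, the second recoverable point being located before the fourth step and nearest to the fourth step.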
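14. The control node according to claim 13, wherein the determining by the processing module, according to the preset recoverable information, of the second step of executing the query request comprises: using the step immediately following the second recoverable point as the second step.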
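15. A control node, comprising:
a transmitter, configured to send, to each data node in a system, first step information for executing a client query request;
a receiver, configured to receive a failure message sent by at least one of the data nodes, wherein the failure message indicates that execution of the first step failed or was abnormal; and
a processor, configured to determine, according to preset recoverable information, a second step of executing the query request, wherein the second step is a starting step for re-executing the query request, and the recoverable information includes steps that do not need to be repeated when the query request is re-executed;
wherein the transmitter is further configured to send the second step information to each of the data nodes.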
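16. The control node according to claim 15, wherein the processor is further configured to generate the recoverable information according to the resource consumption corresponding to each step in which the data nodes execute the query request.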
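17. The control node according to claim 15 or 16, wherein the control node further comprises a memory, configured to store the recoverable information, wherein the recoverable information includes at least one piece of recoverable point information, and the step immediately following the recoverable point serves as the starting step for re-executing the query request.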
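18. The control node according to claim 17, wherein the transmitter is further configured to send, to each of the data nodes, third step information for executing the query request;
the receiver is further configured to receive a first success message sent by each of the data nodes, wherein the first success message indicates that the third step was executed successfully; and
the memory is further configured to record first recoverable point information, the first recoverable point being located before the third step and nearest to the third step.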
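19. The control node according to claim 18, wherein the determining by the processor, according to the preset recoverable information, of the second step of executing the query request comprises: using the step immediately following the first recoverable point as the second step.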
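20. The control node according to claim 18, wherein
the transmitter is further configured to send, to each of the data nodes, fourth step information for executing the query request;
the receiver is further configured to receive a second success message sent by each of the data nodes, wherein the second success message indicates that the fourth step was executed successfully; and
the memory is further configured to record second recoverable point information, the second recoverable point being located before the fourth step and nearest to the fourth step.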
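21. The control node according to claim 20, wherein the determining by the processor, according to the preset recoverable information, of the second step of executing the query request comprises: using the step immediately following the second recoverable point as the second step.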
PCT/CN2013/076366 2012-09-14 2013-05-29 Query processing method and apparatus WO2014040426A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210341682.3A CN103678368B (zh) 2012-09-14 2012-09-14 Query processing method and apparatus
CN201210341682.3 2012-09-14

Publications (1)

Publication Number Publication Date
WO2014040426A1 true WO2014040426A1 (zh) 2014-03-20

Family

ID=50277569

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/076366 WO2014040426A1 (zh) 2012-09-14 2013-05-29 Query processing method and apparatus

Country Status (2)

Country Link
CN (1) CN103678368B (zh)
WO (1) WO2014040426A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784479A (zh) * 2017-02-16 2018-03-09 平安科技(深圳)有限公司 Business process handling method and apparatus
CN111736977B (zh) * 2020-07-21 2020-11-10 成都新希望金融信息有限公司 Central scheduling method and system for multiple middle platforms

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0657813A1 (en) * 1993-12-06 1995-06-14 International Business Machines Corporation Distributed database management
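CN101120340A (zh) * 2004-02-21 2008-02-06 数据迅捷股份有限公司 Ultra-shared-nothing parallel database
CN101299219A (zh) * 2008-06-27 2008-11-05 北京邮电大学 Customizable intranet crawler system with multi-threaded resumable transfer
CN102323946A (zh) * 2011-09-05 2012-01-18 天津神舟通用数据技术有限公司 Method for implementing operator reuse in a parallel database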

Also Published As

Publication number Publication date
CN103678368A (zh) 2014-03-26
CN103678368B (zh) 2017-02-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13836328

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13836328

Country of ref document: EP

Kind code of ref document: A1