CN117812077A

CN117812077A - Data scheduling method, device and system

Info

Publication number: CN117812077A
Application number: CN202311846903.7A
Authority: CN
Inventors: 吴怀江; 刘佳鑫; 邓福喜
Original assignee: Ant Blockchain Technology Shanghai Co Ltd
Current assignee: Ant Blockchain Technology Shanghai Co Ltd
Priority date: 2023-12-28
Filing date: 2023-12-28
Publication date: 2024-04-02

Abstract

The specification provides a data scheduling method, device and system. The method is applied to any node of a plurality of nodes contained in a data scheduling system, the blockchain related data of a blockchain network associated with the system and the execution related information of a target scheduling task are stored in a database, and the plurality of nodes respectively have access rights to the data and the information, and the method comprises the following steps: under the condition that the execution related information does not contain the execution time information or the latest execution time information contained indicates that the target scheduling task is not executed at the current moment, determining the current execution related information of any node aiming at the target scheduling task, and storing the current execution related information as the latest execution related information of the task in a database; and executing the task in the execution time period so as to schedule the target data corresponding to the target block from the database to the data analysis party.

Description

Data scheduling method, device and system

Technical Field

The embodiment of the specification belongs to the technical field of blockchains, and particularly relates to a data scheduling method, device and system.

Background

Blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. In a blockchain system, data blocks are linked sequentially in chronological order into a chained data structure, forming a distributed ledger that is cryptographically guaranteed to be tamper-resistant and unforgeable.

A blockchain network generates on-chain data as it runs, and analyzing that data can reveal its value, for example by computing performance metrics of the blockchain network, detecting network faults, or identifying attack transactions. Because the data analysis capability of a blockchain network itself is typically weak, on-chain data is at present usually pulled off-chain for storage and analysis.

In the related art, in order to analyze data in parallel and thereby improve analysis efficiency, a multi-node analysis system is generally used to schedule and analyze the blockchain-related data (including the above-mentioned on-chain data) stored off-chain. Because on-chain data typically forms a chain structure (e.g., blocks with increasing heights are generated in chronological order), the blockchain-related data needs to be scheduled in a certain order when it is analyzed. In a multi-node analysis system, coordinating the plurality of nodes so that they schedule the data in order is the basis for ensuring accurate and credible analysis results.

Disclosure of Invention

The purpose of the present specification is to provide a data scheduling method, device and system.

According to a first aspect of one or more embodiments of the present specification, there is provided a data scheduling method applied to any one of a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the plurality of nodes having access rights for the blockchain-related data and the execution-related information, respectively, the method comprising:

determining, in a case where the execution-related information does not contain execution time information, or the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment, current execution-related information of said any node for the target scheduling task, and storing the current execution-related information in the database as the latest execution-related information of the target scheduling task; wherein the current execution-related information comprises target block height information representing a target block and current execution time information representing a current execution time period;

and executing the target scheduling task in the current execution time period so as to schedule target data corresponding to the target block in the blockchain related data from the database to a data analysis party.

According to a second aspect of one or more embodiments of the present specification, there is provided a data scheduling method applied to a master node and a target slave node among a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the master node having access rights for the blockchain-related data and the execution-related information, the method comprising:

the master node determining, in a case where the execution-related information does not contain execution time information, or the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment, current execution-related information of a target slave node for the target scheduling task, and storing the current execution-related information in the database as the latest execution-related information of the target scheduling task; wherein the current execution-related information comprises target block height information representing a target block and current execution time information representing a current execution time period;

and the target slave node executes the target scheduling task in the current execution time period so as to schedule target data corresponding to the target block in the blockchain related data from the database to a data analysis party.

According to a third aspect of one or more embodiments of the present specification, there is provided a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, a plurality of nodes in the data scheduling system respectively having access rights for the blockchain-related data and the execution-related information, wherein any node of the plurality of nodes is configured to perform the data scheduling method of the first aspect.

According to a fourth aspect of one or more embodiments of the present specification, there is provided a data scheduling system having a blockchain network connected thereto, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, a plurality of nodes in the data scheduling system including a master node and at least one slave node, the master node having access rights for the blockchain-related data and the execution-related information, wherein:

The master node is configured to: determining current execution related information of a target slave node aiming at the target scheduling task under the condition that the execution related information does not contain execution time information or contains latest execution time information which indicates that the target scheduling task is not executed at the current moment, and storing the current execution related information as the latest execution related information of the target scheduling task in the database; the current execution related information comprises target block height information for representing a target block and current execution time information for representing a current execution time period;

the target slave node is configured to: and executing the target scheduling task in the current execution time period so as to schedule target data corresponding to the target block in the blockchain related data from the database to a data analysis party.

According to a fifth aspect of one or more embodiments of the present specification, there is provided a data scheduling apparatus applied to any one of a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the plurality of nodes having access rights for the blockchain-related data and the execution-related information, respectively, the apparatus comprising:

A task preempting unit, configured to determine, when the execution related information does not include execution time information or includes latest execution time information indicating that the target scheduling task is not executed at the current time, current execution related information of the target scheduling task for the any node, and store the current execution related information as latest execution related information of the target scheduling task in the database; the current execution related information comprises target block height information for representing a target block and current execution time information for representing a current execution time period;

and the task execution unit is used for executing the target scheduling task in the current execution time period so as to schedule target data corresponding to the target block in the block chain related data from the database to a data analysis party.

According to a sixth aspect of one or more embodiments of the present specification, there is provided a data scheduling apparatus applied to a master node and a target slave node among a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the master node having access rights for the blockchain-related data and the execution-related information, the apparatus comprising:

a task allocation unit, configured to enable the master node to determine current execution-related information of a target slave node for the target scheduling task when the execution-related information does not contain execution time information, or the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment, and to store the current execution-related information in the database as the latest execution-related information of the target scheduling task; wherein the current execution-related information comprises target block height information representing a target block and current execution time information representing a current execution time period;

and the task execution unit is used for enabling the target slave node to execute the target scheduling task in the current execution time period so as to schedule target data corresponding to the target block in the block chain related data from the database to a data analysis party.

According to a seventh aspect of one or more embodiments of the present specification, there is provided an electronic device, comprising:

a processor; a memory for storing processor-executable instructions;

wherein the processor implements the method of any of the first or second aspects by executing the executable instructions.

According to an eighth aspect of one or more embodiments of the present description, there is provided a computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method as in any of the first or second aspects.

In the embodiments of the present specification, a data scheduling system connected to a blockchain network includes a plurality of nodes; the blockchain-related data of the network and the execution-related information of a target scheduling task created for that data are stored in a database and can be accessed by the nodes. The data scheduling method described in the present specification may be implemented by any node in the system, or by the interaction of a master node and a target slave node in the system.

In this scheme, for a target scheduling task, if the execution-related information stored in the database does not contain execution time information, or the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment, any node determines its own current execution-related information for the task (or the master node determines that of the target slave node) and stores it in the database as the latest execution-related information of the task. The current execution-related information includes target block height information representing a target block and current execution time information representing a current execution time period. That node (or the target slave node) then executes the target scheduling task within the current execution time period, so that the target data corresponding to the target block in the blockchain-related data is scheduled from the database to a data analysis party, which analyzes and processes the data.

In addition to storing the blockchain-related data in the database, this scheme also stores the execution-related information of the target scheduling task there. When the data scheduling method is executed by any node (in which case the nodes in the system form a masterless structure), that node preempts and executes the pending target scheduling task on its own. Specifically, the node can access the blockchain-related data (e.g., to schedule the target data in it) and the execution-related information (e.g., to read it), so that when the execution-related information indicates that the target scheduling task is not being executed at the current moment, the node determines and stores its own current execution-related information for the task. It will be appreciated that the current execution-related information indicates that this node executes the target scheduling task within the current execution time period, so by storing this information in the database the node preempts the execution right of the task for that period. Other nodes in the system can then read the information and conclude from it that this node is executing the task, which effectively prevents the task from being preempted again and ultimately ensures that only this node executes the task within the current execution time period. The scheme therefore avoids conflicts in which different nodes schedule the same target data at the same time, thereby ensuring orderly scheduling of the target data and reducing task competition among the nodes.
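The preemption step described above can be illustrated with a minimal Python sketch. It is not the implementation of this specification: the in-memory `SharedStore`, the `ExecutionInfo` layout, and the fixed `period` are hypothetical names introduced only for illustration, and the lock merely stands in for whatever atomicity the real database would have to provide for the read-then-write step (the specification does not state how concurrent writes are serialized).

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ExecutionInfo:
    """Latest execution-related information of one task (hypothetical layout)."""
    node_id: str
    target_height: int   # target block height information
    start: float         # start of the current execution time period
    end: float           # end of the current execution time period


class SharedStore:
    """In-memory stand-in for the shared database (an assumption)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stands in for a database transaction
        self._latest: Dict[str, ExecutionInfo] = {}

    def try_preempt(self, task_id: str, node_id: str, target_height: int,
                    now: float, period: float) -> Optional[ExecutionInfo]:
        """Store new execution info only if the task is not being executed at `now`."""
        with self._lock:
            latest = self._latest.get(task_id)
            if latest is not None and latest.start <= now < latest.end:
                return None  # another node holds the execution right
            info = ExecutionInfo(node_id, target_height, now, now + period)
            self._latest[task_id] = info
            return info


store = SharedStore()
# Node A preempts the task for the period [0, 10).
first = store.try_preempt("task-i", "A", target_height=100, now=0.0, period=10.0)
# Node C tries inside that period and is refused.
second = store.try_preempt("task-i", "C", target_height=100, now=5.0, period=10.0)
# Once the period has ended, node C can preempt the task.
third = store.try_preempt("task-i", "C", target_height=103, now=10.0, period=10.0)
```

In this sketch the execution time period is treated as a half-open interval, so a period ending at a given time and the next one starting at that time do not overlap.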

When the data scheduling method is executed cooperatively by the master node and the target slave node (in which case the nodes in the system form a master-slave structure), the master node allocates the pending target scheduling task to the target slave node for execution. In this scheme, the master node can access the blockchain-related data (e.g., to schedule the target data in it) and the execution-related information (e.g., to read it), so that when the execution-related information indicates that the target scheduling task is not being executed at the current moment, the master node determines and stores the current execution-related information of the target slave node for the task. It will be appreciated that the current execution-related information indicates that the target slave node executes the target scheduling task within the current execution time period, so by storing this information in the database the master node allocates the execution right of the task for that period to the target slave node. On this basis, the master node avoids allocating the execution right of the task for the current execution time period to other slave nodes again, which ultimately ensures that only the target slave node executes the task within that period. The scheme therefore avoids conflicts in which different slave nodes schedule the same target data at the same time, thereby ensuring orderly scheduling of the target data and reducing task competition among the nodes.
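The master-slave allocation can be sketched in the same spirit. The following Python example is only an illustration under stated assumptions: the `Master` class, the `Assignment` record, and the round-robin choice of the target slave node are hypothetical (the specification does not say how the master node selects the target slave node), a plain dictionary stands in for the database, and the way the assignment is delivered to the slave node is not modeled.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, Iterator, List, Optional


@dataclass
class Assignment:
    """Current execution-related information written by the master (hypothetical)."""
    slave_id: str
    target_height: int
    start: float
    end: float


class Master:
    """Hypothetical master node; `db` is a dict standing in for the database."""

    def __init__(self, slaves: List[str], db: Dict[str, Assignment]) -> None:
        # Round-robin selection is an assumption made for this sketch.
        self._next_slave: Iterator[str] = cycle(slaves)
        self._db = db

    def allocate(self, task_id: str, target_height: int,
                 now: float, period: float) -> Optional[Assignment]:
        """Allocate the execution right unless it is already held at `now`."""
        latest = self._db.get(task_id)
        if latest is not None and latest.start <= now < latest.end:
            return None  # execution right already allocated for this period
        assignment = Assignment(next(self._next_slave), target_height,
                                now, now + period)
        self._db[task_id] = assignment
        return assignment


db: Dict[str, Assignment] = {}
master = Master(["S1", "S2"], db)
a1 = master.allocate("task-j", target_height=7, now=0.0, period=5.0)
a2 = master.allocate("task-j", target_height=7, now=2.0, period=5.0)
a3 = master.allocate("task-j", target_height=8, now=5.0, period=5.0)
```

Because only the master node writes the record here, no lock is used; a deployment with a changing master would need the same kind of atomic check as in a masterless structure.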

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present disclosure, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic diagram of a data scheduling system according to an exemplary embodiment.

Fig. 2 is a schematic diagram of a data scheduling task execution process according to an exemplary embodiment.

Fig. 3 is a flowchart of a data scheduling method according to an exemplary embodiment.

Fig. 4 is a schematic diagram of dependencies among blockchain-related data according to an exemplary embodiment.

Fig. 5 is a flow chart of another data scheduling method provided by an exemplary embodiment.

Fig. 6 is a schematic diagram of an apparatus according to an exemplary embodiment.

Fig. 7 is a block diagram of a data scheduling apparatus according to an exemplary embodiment.

Fig. 8 is a block diagram of another data scheduling apparatus provided by an exemplary embodiment.

Detailed Description

In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.

In order to solve the technical problems in the related art, the present specification proposes a data scheduling system and a data scheduling method implemented by the same, so that nodes in the system can implement ordered scheduling of blockchain related data. The scheme is described in detail below with reference to the accompanying drawings.

Referring to Fig. 1, Fig. 1 is a schematic diagram of a data scheduling system according to an exemplary embodiment. As shown in Fig. 1, the system comprises a plurality of nodes and is connected to at least one blockchain network; for example, the system may comprise nodes A to E and so on, and be connected to blockchains 1 to 4 and so on. The nodes in the data scheduling system described in this specification are at least used to implement a data scheduling function, so they can be regarded as scheduling nodes. At least one blockchain node in any blockchain network is connected to at least one node of the system, which realizes the connection between that network and the system. For example, at least one blockchain node in blockchain 1 (such as a master node or any other node of that network) may be connected to node A in the data scheduling system, and at least one blockchain node in blockchain 2 (such as a master node or any other node of that network) may be connected to nodes B and C in the data scheduling system; further examples are not described here. It should be noted that each node in the data scheduling system can be used to implement the data scheduling function, and the nodes do not need to pull data from one another, which distinguishes a node in the data scheduling system from a blockchain node in a blockchain network as described in this specification.

In addition to the above nodes, the data scheduling system may further include a server to implement corresponding functions of the system. The data scheduling system may also include or be associated with a database: for example, a first database may be included in the system (i.e., the first database belongs to the system), or a second database that does not belong to the system may be connected to it. Based on the connection between a blockchain network and the data scheduling system, a node in the data scheduling system can pull the original on-chain data generated by any blockchain network connected to it. The node may store the original on-chain data in the aforementioned database. The node may also generate pre-analysis data from the original on-chain data and store it in the database, or send the pulled original on-chain data to the server so that the server generates the pre-analysis data and stores it in the database. For any blockchain network, the original on-chain data of the network and/or the pre-analysis data generated from it, as stored in the database, constitute the blockchain-related data of the network and can serve as the scheduling object of the subsequent data scheduling process.

The server may also cooperate with a corresponding client to provide a task creation function, so that a user can create data scheduling tasks for the blockchain-related data stored in the database. The user may be an administrator or operation and maintenance engineer of the data scheduling system, an administrator of the blockchain network, an ordinary user corresponding to a blockchain node, and so on. The client may run on an electronic device used by the user, such as a mobile phone or a computer, and the user may initiate a task creation request to the server through the client. The request may include data indication information specified by the user for the target data to be scheduled (such as a data identifier or a block height range), information about the data analysis party (such as its identification information or data interface), and so on. In response to the request, the server may generate a data scheduling task from this information and store the execution-related information of the task in the database, where it can be read by the nodes that subsequently execute the task. Alternatively, the data scheduling task may be generated automatically by the data scheduling system according to a preset plan or user requirements, or generated in response to a request sent by the data analysis party to the server, which is not described again here.
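As an illustration of the task creation flow, the following Python sketch shows how a server might turn a task creation request into a task record. All names here (`TaskCreationRequest`, `TaskRecord`, `create_task`, the field names, and the example endpoint) are hypothetical, and a dictionary stands in for the task list kept in the database; the specification does not prescribe any particular request format or validation.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TaskCreationRequest:
    """Request sent by the client (hypothetical field names)."""
    data_id: str                 # data identifier of the target data
    start_height: int            # lower bound of the block height range
    end_height: Optional[int]    # upper bound, or None for no upper bound
    analyzer_id: str             # identification of the data analysis party
    analyzer_endpoint: str       # data interface of the data analysis party
    creator: str                 # identifier of the requesting user


@dataclass
class TaskRecord:
    """Task entry stored in the task list (hypothetical layout)."""
    task_id: str
    request: TaskCreationRequest
    created_at: float


def create_task(task_list: Dict[str, TaskRecord],
                request: TaskCreationRequest) -> TaskRecord:
    """Validate the request, generate a task, and record it in the task list."""
    if request.start_height < 0:
        raise ValueError("start_height must be non-negative")
    if request.end_height is not None and request.end_height < request.start_height:
        raise ValueError("end_height must not be below start_height")
    record = TaskRecord(uuid.uuid4().hex, request, time.time())
    task_list[record.task_id] = record
    return record


task_list: Dict[str, TaskRecord] = {}
record = create_task(task_list, TaskCreationRequest(
    data_id="tx-stats", start_height=100, end_height=None,
    analyzer_id="analyzer-1",
    analyzer_endpoint="https://analyzer.example/ingest",
    creator="ops-user"))
```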

In addition, each created data scheduling task and its execution-related information may be recorded in the aforementioned database. For example, a task list may be maintained in the database to record each created data scheduling task; the data scheduling system is used to execute these tasks. To reduce the database storage space occupied by the list, the task list may record only tasks that are being executed or have not yet been executed, and not tasks that have finished (for example, a task is deleted from the list once it has finished). Alternatively, the task list may also record finished tasks in addition to those being executed or not yet executed, so that they can be queried retrospectively when needed. The specific recording manner may be chosen flexibly according to the actual situation, and is not limited in this specification.

The execution-related information of any data scheduling task recorded in the task list may include various items. By way of example, it may include the task ID; data indication information (which tells the executor of the task which target data needs to be scheduled when executing the task, and may include a block height range, a data ID, dependencies between data, etc.); creation time information; creator-related information (such as the identifier of the user who created the task); and information about the data analysis party corresponding to the task (such as an analysis party ID, or a data interface or access address of the analysis party). If the task is being executed or has been executed, its execution-related information may further include a time slicing table of the task. This table records the task ID, a time slice ID, task status information (indicating whether the task is currently being executed), an executing node identifier (that is, the identifier of the node that is executing the task or executed it most recently), execution time information (such as the start time and end time of the latest or current execution time period), and so on. A node can judge whether the task is currently being executed by checking whether the time slicing table exists (that is, whether execution time information exists), by checking the task status information in the table, or by comparing the current moment with the execution time information.
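The check described above can be written down as a short Python sketch. The `TimeSlice` record and its field names are hypothetical, and combining all three checks (existence of the table, task status, and time comparison) in one function is a choice made for this sketch; the specification lists them as alternative ways of making the judgment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimeSlice:
    """One entry of a task's time slicing table (hypothetical field names)."""
    task_id: str
    slice_id: int
    executing: bool   # task status information
    node_id: str      # executing node identifier
    start: float      # start time of the execution time period
    end: float        # end time of the execution time period


def is_being_executed(time_slice: Optional[TimeSlice], now: float) -> bool:
    """Return True only if the task is being executed at `now`."""
    if time_slice is None:
        return False  # no time slicing table, so no execution time information
    if not time_slice.executing:
        return False  # status says the task is not currently being executed
    return time_slice.start <= now < time_slice.end


current = TimeSlice("task-i", slice_id=3, executing=True,
                    node_id="D", start=20.0, end=30.0)
never_run = is_being_executed(None, now=25.0)
in_period = is_being_executed(current, now=25.0)
after_period = is_being_executed(current, now=30.0)
```

A node would treat a task for which this check returns False as available for preemption or allocation.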

In addition, since the execution of any data scheduling task may include multiple phases and the nodes executing the task need to switch between these phases, the database may also store a context information table for the task (see Table 1 below for details). For any data scheduling task recorded in the task list, the information about the task recorded in the task list, together with the information recorded in the time slicing table and the context information table corresponding to the task, constitutes the execution-related information of the task; the information recorded in the time slicing table is the execution time information of the task.

Each node in the data scheduling system can access the data in the database. The blockchain-related data of the blockchain network and the execution-related information of the target scheduling task created for the blockchain-related data can be stored in a database, and nodes in the system can have access rights for the blockchain-related data and the execution-related information. Specifically, blockchain-related data stored therein may be scheduled or execution-related information stored therein queried, etc., in a manner that is described in more detail below.

For the data scheduling tasks in the task list, consider first a masterless structure. From the perspective of the nodes, the plurality of nodes in the system can execute the tasks in parallel, where any node may execute only one task at any moment or may execute several tasks simultaneously. From the perspective of a task, at any moment a task is either executed by no node or executed by exactly one node (several nodes never execute the same task at the same time). The same holds in a master-slave structure, in which the system includes a master node and a plurality of slave nodes. From the perspective of the nodes, the slave nodes can execute the tasks in parallel (the master node can schedule each task to a corresponding slave node), where any slave node may execute only one task at any moment or may execute several tasks simultaneously. From the perspective of a task, at any moment a task is either executed by no slave node or executed by exactly one slave node (several slave nodes never execute the same task at the same time). In the master-slave structure, the master node may be a fixed node. Alternatively, the master node may be determined through negotiation among the nodes in the system, for example by having the nodes take turns as master node in a preset order or by electing the master node with a preset election algorithm, so as to achieve higher system stability; this is not described again here.
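The constraint stated above, that a node may run several tasks at once but a task is never run by several nodes at once, can be checked mechanically. The following Python sketch is an illustration only: `Period` and `find_conflicts` are hypothetical names, periods are treated as half-open intervals, and the check is slightly stricter than the text in that it flags any two overlapping periods of the same task. It reports at least one conflicting pair whenever any overlap exists, not necessarily every pair.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Period(NamedTuple):
    """One execution time period of a task on a node (hypothetical record)."""
    task_id: str
    node_id: str
    start: float
    end: float   # the period is the half-open interval [start, end)


def find_conflicts(periods: List[Period]) -> List[Tuple[Period, Period]]:
    """Return pairs of overlapping periods that belong to the same task."""
    by_task: Dict[str, List[Period]] = defaultdict(list)
    for period in periods:
        by_task[period.task_id].append(period)
    conflicts: List[Tuple[Period, Period]] = []
    for task_periods in by_task.values():
        task_periods.sort(key=lambda p: p.start)
        # After sorting by start time, any overlap shows up between neighbours.
        for prev, cur in zip(task_periods, task_periods[1:]):
            if cur.start < prev.end:
                conflicts.append((prev, cur))
    return conflicts


# Node C runs task i and task j in the same period, which is allowed.
schedule = [
    Period("i", "A", 0, 1), Period("i", "C", 1, 2), Period("i", "D", 2, 3),
    Period("j", "C", 0, 1), Period("j", "C", 1, 2), Period("j", "A", 2, 3),
]
ok = find_conflicts(schedule)
# Two nodes holding task i at overlapping times violates the constraint.
bad = find_conflicts(schedule + [Period("i", "B", 0.5, 1.5)])
```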

The following describes the complete execution process of any data scheduling task. Based on the data scheduling system shown in Fig. 1, and as shown in Fig. 2, a data scheduling task may be executed continuously (e.g., task i) or may stop automatically once execution reaches a preset extent (e.g., task j). Any created task i is executed as follows. In the time period t0 to t1, the task is executed by node A (which acquired it at time t0) to schedule the target data corresponding to the blocks with heights Hn to Hn+2; this period is node A's current execution time period for task i, with t0 and t1 as its start time and end time. In the time period t1 to t2, the task is executed by node C (which acquired it at time t1) to schedule the target data corresponding to the block with height Hn+3. In the time period t2 to t3, the task is executed by node D (which acquired it at time t2) to schedule the target data corresponding to the blocks with heights Hn+4 and Hn+5, where the target data corresponding to the block with height Hn+5 is only partly scheduled. In the time period t3 to t4, the task is executed by node B (which acquired it at time t3) to schedule the remaining target data corresponding to the block with height Hn+5 and the target data corresponding to the block with height Hn+6. In the time period t4 to t5, the task is not executed by any node. After time t5, the task is executed by node D (which acquired it at time t5) to schedule the target data corresponding to the block with height Hn+7 and the blocks after it; the details are not repeated here.

It can be seen that, as the blockchain network runs, its blockchain-related data is continuously generated and stored in the database, so the nodes in the data scheduling system keep advancing the execution of task i over time to achieve continuous scheduling of the corresponding target data.
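The handover between nodes in the middle of a block, as with node D and node B at the block with height Hn+5, can be modeled with a progress cursor. The following Python sketch makes several assumptions that are not stated in the specification: progress is recorded as a block height plus an offset within that block, a `budget` of items stands in for the end of an execution time period, and `Cursor` and `run_slice` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Cursor:
    """Scheduling progress of a task (hypothetical representation)."""
    height: int   # block whose target data is scheduled next
    offset: int   # number of items of that block already scheduled


def run_slice(blocks: Dict[int, List[str]], cursor: Cursor,
              budget: int) -> Tuple[List[str], Cursor]:
    """Schedule up to `budget` items from `cursor`; return them and the new cursor."""
    sent: List[str] = []
    height, offset = cursor.height, cursor.offset
    # Stop when the budget is used up or the next block is not yet in the database.
    while budget > 0 and height in blocks:
        items = blocks[height]
        take = items[offset:offset + budget]
        sent.extend(take)
        budget -= len(take)
        offset += len(take)
        if offset >= len(items):
            height, offset = height + 1, 0   # block finished, move to the next one
    return sent, Cursor(height, offset)


# Target data per block height; heights 4, 5, 6 play the role of Hn+4..Hn+6.
blocks = {4: ["a", "b"], 5: ["c", "d", "e"], 6: ["f"]}
# Node D schedules block 4 and only part of block 5 before its period ends.
sent_d, cursor_d = run_slice(blocks, Cursor(4, 0), budget=3)
# Node B resumes from the stored cursor and schedules the remainder.
sent_b, cursor_b = run_slice(blocks, cursor_d, budget=10)
```

After the second call the cursor points at a block that has not been generated yet, which corresponds to a task that waits for the blockchain network to produce more data.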

Similarly, any created task j is executed as follows. In the time period t0 to t1, the task is executed by node C (which acquired it at time t0) to schedule the target data corresponding to the blocks with heights Hm and Hm+1. In the time period t1 to t2, the task is still executed by node C (which acquired it again at time t1) to schedule the target data corresponding to the block with height Hm+2. In the time period t2 to t3, the task is executed by node A (which acquired it at time t2) to schedule the target data corresponding to the block with height Hm+3. In the time period t3 to t4, the task is not executed by any node. After time t4, the task is executed by node B (which acquired it at time t4) to schedule the target data corresponding to the blocks with heights Hm+4 to Hm+6. Once node B finishes at time t5, the task has been executed completely.

Here, each of the variables n and m ranges over the set of natural numbers, i.e. the variable may be zero or any positive integer. The difference between task i and task j is reflected in their respective block height ranges. If the block height range of a task is specified as [Hn, +∞), as for task i, the task needs to be executed continuously starting from Hn (i.e., executing the task schedules the target data corresponding to every block whose height is greater than or equal to Hn), and the specific stopping time can be determined by the user. If the block height range of a task is specified as [Hm, Hm+6], as for task j, the task starts executing from Hm and stops once Hm+6 has been processed (i.e. executing the task schedules the target data corresponding to the seven blocks with heights Hm to Hm+6). In addition, similar to the way node D and node B each schedule part of the target data corresponding to the block with height Hn+5 when executing task i, node C and node A may each schedule part of the target data corresponding to the block with height Hm+2 when executing task j: if node C has scheduled only part of the target data corresponding to the block with height Hm+2 by time t2, node A may schedule the remaining target data corresponding to that block starting from time t2, which is not described again.
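The two kinds of block height range described above can be illustrated with a minimal Python sketch. The class name `BlockHeightRange`, its methods, and the concrete heights (Hn = 100, Hm = 200) are hypothetical and do not appear in the specification; the sketch only assumes that an open-ended range never finishes on its own, while a bounded range finishes once the data of its last block has been fully scheduled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockHeightRange:
    """Block height range of a task; end=None models the open-ended range [Hn, +inf)."""
    start: int
    end: Optional[int] = None  # None: keep scheduling until the user stops the task

    def contains(self, height: int) -> bool:
        """Return True if the block at this height belongs to the task."""
        if height < self.start:
            return False
        return self.end is None or height <= self.end

    def is_finished(self, last_fully_scheduled_height: int) -> bool:
        """A bounded task ends once its last block is fully scheduled.

        An open-ended task never finishes automatically.
        """
        return self.end is not None and last_fully_scheduled_height >= self.end


# Hypothetical values: Hn = 100 for task i, Hm = 200 for task j.
task_i = BlockHeightRange(start=100)           # [Hn, +inf)
task_j = BlockHeightRange(start=200, end=206)  # [Hm, Hm+6], seven blocks
```

With these values, `task_j.is_finished(206)` is true while `task_i.is_finished(...)` is false for any height, matching the behaviour of task j and task i in fig. 2.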

It can be understood that the target scheduling task described in the present specification may be executed by at least one node (or at least one slave node). When any node (or any slave node) executes the target scheduling task in any execution time period, only a part of the task is actually executed, that is, only a part of all target data corresponding to the task is scheduled; the task ends only after all the target data corresponding to the task has been scheduled. For task j, for example, execution ends only when all the target data corresponding to the blocks with heights Hm to Hm+6 has been scheduled to the corresponding data analysis party. It can further be seen that, at any given time, a task is either not executed by any node or executed by one node. It should be emphasized that, whenever a task is being executed, it is executed by only one node and never by multiple nodes at the same time.

Any task can be acquired by any node shown in fig. 2, and can be acquired by the node in a preemptive manner, or can be distributed to the node by a master node for execution, which is described in the following embodiments.

It should be noted that, whether the nodes in the data scheduling system form a non-master structure or a master-slave structure, the system can execute a plurality of data scheduling tasks simultaneously. The "target scheduling task" in the specification is any data scheduling task executed in the system, and the execution method is universal; that is, the specification takes the target scheduling task (i.e. any data scheduling task) as an example to describe the scheduling scheme. For brevity, parts of this specification refer to the data scheduling task or the target scheduling task simply as a task, and the specific meaning may be determined from the context, which is hereby noted.

Each node in the data scheduling system can realize different functions (namely, play different functional roles in the system). For example, when the nodes form a non-master structure, the data scheduling function can be realized by each node individually; when the nodes form a master-slave structure, the master node and the slave nodes can cooperate with each other to realize the data scheduling function. On this basis, for the blockchain related data of the blockchain network stored in the aforementioned database and the target data within it, the nodes in the data scheduling system may implement the data scheduling method in different manners.

In an embodiment, each node in the data scheduling system may form a non-master structure (i.e., each node has similar functions and can respectively implement the data scheduling functions), and the data scheduling method described in the specification may be applied to any node in the plurality of nodes, where each node in the system may acquire and execute the target scheduling task in a "preemptive" manner. The plurality of nodes may have access rights to the blockchain-related data and the execution-related information, respectively, and any node of the plurality of nodes (such as any of nodes a-E shown in fig. 1) may be configured to:

In another embodiment, each node in the data scheduling system may form a master-slave structure, that is, at any time, the plurality of nodes in the data scheduling system includes a master node and at least one slave node (as in nodes a to E in fig. 1, if node a is the master node, nodes B to E are the slave nodes). At this time, the master node in the system may "allocate" a corresponding target scheduling task for each slave node, so that each slave node may acquire and execute the corresponding task. The master node has access rights to the blockchain-related data and the execution-related information, wherein:

The master node is configured to: determine current execution related information of a target slave node (which may be any slave node) for the target scheduling task in the case where the execution related information does not contain execution time information, or the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment, and store the current execution related information in the database as the latest execution related information of the target scheduling task; the current execution related information comprises target block height information for representing a target block and current execution time information for representing a current execution time period;

The following describes in detail, with reference to the accompanying drawings and embodiments, the data scheduling method implemented based on the data scheduling system in the non-master structure scenario and the master-slave structure scenario respectively.

Referring to fig. 3, fig. 3 is a flowchart of a data scheduling method according to an exemplary embodiment. As shown in fig. 3, the method is applied to any one of a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data are stored in a database, and the plurality of nodes respectively have access rights for the blockchain-related data and the execution-related information, and the method includes the following steps 302 to 304.

Step 302, determining the current execution related information of the target scheduling task by any node under the condition that the execution related information does not contain execution time information or the latest execution time information contained indicates that the target scheduling task is not executed at the current moment, and storing the current execution related information as the latest execution related information of the target scheduling task in the database; the current execution related information comprises target block height information used for representing a target block and current execution time information used for representing a current execution time period.

As described above, the data scheduling system has the blockchain network connected thereto, and the blockchain-related data of the blockchain network is stored in the database, and the execution-related information of the data scheduling task created for the data is also stored in the database, that is, the database is used for storing the blockchain-related data of the blockchain network and the execution-related information of the data scheduling task. In an embodiment, the database may comprise a first database belonging to the data scheduling system and/or a second database not belonging to the data scheduling system. In other words, the database may include only the aforementioned first database, in which both blockchain-related data and execution-related information are stored; the database may also include only the aforementioned second database, where both blockchain-related data and execution-related information are stored in the second database; the databases may also include the first database and the second database at the same time, at this time, the blockchain related data and the execution related information may be stored in the same database, or may be stored in different databases, or the two databases may store the same data and information, so as to be mutually backed up, which is not repeated.

Any of the databases may be a cloud database based on OSS (Object Storage Service), where OSS may be used to store data in the cloud database for management. The service can provide highly available and highly reliable distributed storage for data objects of any size and, compared with traditional storage, offers advantages such as large storage capacity, high security, low cost, and high durability. In addition, the data storage modes of the first database and the second database are not limited in this specification; for example, any database may be a relational database such as PolarDB, MySQL, SQL Server, or Oracle, or a non-relational database such as MongoDB, Redis, or CouchDB. Taking the first database as an example, one of the above-mentioned relational databases may be selected for it so as to achieve a higher read/write speed.

The data scheduling system may be connected to a plurality of blockchain networks at the same time, and the network may be any network of the plurality of blockchain networks, that is, any network is taken as an example in the specification to describe the scheduling process of the blockchain related data. In particular, the network may be connected to a first node of a plurality of nodes included in the data scheduling system, such as any one or more blockchain nodes in the network may be connected to the first node, or a master blockchain node in the network may be connected to the first node, etc. The first node may be any node in the data scheduling system, in other words, a node connected to the blockchain network in the data scheduling system is the first node.

In addition, the scheduling object of the data scheduling method described in the present specification is the target data in the blockchain-related data. The blockchain-related data may include on-chain raw data and/or pre-analysis data (generated based on the on-chain raw data), and the data scheduling system may pull the on-chain raw data from the blockchain network through the first node and obtain the corresponding pre-analysis data. Accordingly, the scheduled target data may include target on-chain raw data and/or target pre-analysis data.

In one embodiment, the blockchain network is connected to a first node of the plurality of nodes, and the blockchain-related data of the blockchain network includes on-chain raw data generated by the network during operation; the first node may therefore pull the on-chain raw data from the blockchain network and send the pulled data to the database for storage. The on-chain raw data may include data such as blocks, transactions, receipts, and logs. Because this data is continuously generated by the blockchain network while it runs, the first node may synchronize the data from the blockchain network in real time, or may periodically pull, in an incremental manner at a preset time interval, the data generated during each period. It is of course also possible for the blockchain nodes in the network to send the generated data to a message queue (the network acting as the producer of the messages), so that the first node can acquire the data asynchronously by consuming the messages (the first node acting as the consumer of the messages). The specific manner of pulling the on-chain raw data can be set reasonably according to the actual situation, and the specification does not limit it.

Further, where the blockchain-related data also includes pre-analysis data, the data scheduling system may acquire and store the pre-analysis data in a variety of ways. In an embodiment, the first node may generate pre-analysis data, for example, the first node may perform pre-analysis on the pulled original data on the chain to obtain the pre-analysis data, and send the data to the database for storage. In another embodiment, when the data scheduling system further includes a server, pre-analysis data may be generated by the server, for example, the first node may send the original data on the chain pulled by itself to the server, so that the server may perform pre-analysis processing on the data to obtain the pre-analysis data, and send the data to the database for storage. Of course, in addition to the pre-analysis data generated by the first node and the server, other nodes may also generate the pre-analysis data, for example, when the first node pulls the on-chain original data, if the remaining available resources at the current moment are insufficient to generate the pre-analysis data due to the fact that the node is performing the data scheduling task, the first node may send the on-chain original data to other nodes with sufficient resources in the data scheduling system, so that the node generates the pre-analysis data according to the on-chain original data and sends the pre-analysis data to the database for storage. In this way, the data scheduling system can ensure that pre-analysis data is timely generated and stored according to the raw data on the chain pulled by the first node, so as to improve the execution success rate of the data scheduling task for the part of data (if the database does not store some pre-analysis data, the target scheduling task taking the data as target data will fail to execute).

As shown in fig. 4, the pre-analysis data for any block, generated in the manner described above based on the on-chain raw data of that block, may include various data tables, such as data table 11 generated based on blocks, data table 12 generated based on transactions, data table 13 generated based on receipts, data table 14 generated based on logs, etc. In addition, the pre-analysis data may also include new data tables further generated based on the on-chain raw data and/or the data tables described above, such as data table 21 generated based on data table 11 and data table 12, data table 22 generated based on data table 12 and data table 13, data table 23 generated based on receipts and data table 14, data table 31 generated based on data table 21 and data table 22, data table 32 generated based on data table 23, data table 41 generated based on data table 22 and data table 32, and so forth. It will be understood that the pre-analysis data belongs to a plurality of levels: data tables 11 to 14 belong to a first level, data tables 21 to 23 to a second level, data tables 31 and 32 to a third level, data table 41 to a fourth level, etc. There are also dependency relationships between different pre-analysis data; for example, data table 21 depends on data table 11 and data table 12, data table 32 depends on data table 23, and data table 41 depends on data tables 22 and 32. Dependency relationships may likewise exist between the pre-analysis data and the on-chain raw data: data tables 11 to 14 depend on the blocks, transactions, receipts and logs respectively, and data table 23 depends not only on data table 14 but also on the receipts, which is not described again.
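The dependency relationships of fig. 4 can be written down as a graph, from which a valid generation order and the level of each data table follow. The sketch below is only an illustration under assumptions: the names `DEPENDENCIES`, `generation_order` and `levels`, and the string labels such as `"table_21"`, are hypothetical, and the specification does not state how the dependencies are actually represented. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each entry maps a data table to the items it is generated from (per fig. 4).
DEPENDENCIES = {
    "table_11": ["block"],
    "table_12": ["transaction"],
    "table_13": ["receipt"],
    "table_14": ["log"],
    "table_21": ["table_11", "table_12"],
    "table_22": ["table_12", "table_13"],
    "table_23": ["receipt", "table_14"],
    "table_31": ["table_21", "table_22"],
    "table_32": ["table_23"],
    "table_41": ["table_22", "table_32"],
}


def generation_order(deps):
    """Return an order in which every item appears after all of its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def levels(deps):
    """Level 0 is on-chain raw data; a table is one level above its deepest dependency."""
    level = {}
    for name in generation_order(deps):
        parents = deps.get(name, [])
        level[name] = (1 + max(level[p] for p in parents)) if parents else 0
    return level
```

Under this representation, `levels(DEPENDENCIES)` places data tables 11 to 14 on level 1, 21 to 23 on level 2, 31 and 32 on level 3, and 41 on level 4, in line with the levels described above.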

For the at least one to-be-executed data scheduling task created in the data scheduling system, if the any node preempts one of the tasks, that task becomes the target scheduling task. The target scheduling task is used for scheduling target data, where the target data belongs to the blockchain related data of the blockchain network; that is, the target data to be scheduled may include on-chain raw data and/or pre-analysis data of the blockchain network.

The node may include a plurality of functional components, such as a multi-condition allocation decision component (MultiConditionDispatchJudge), a time slicing production component, a time allocation persistence component, a time slicing task opening component, and the like; the specific functions of each component are described in the following embodiments.

In an embodiment, for any task in the task list, the any node may first determine whether the task can be preempted and executed; this determination may be implemented by the aforementioned multi-condition allocation decision component MultiConditionDispatchJudge. The component may be used to determine whether the time slicing table of the task does not exist (the TimeWheelRecordNotExist event), whether the latest execution time period has expired (the TimeWheelRunOut event), whether an error has occurred in the latest scheduling (the TimeWheelOccurError event), and so on.

Illustratively, MultiConditionDispatchJudge may first determine whether a corresponding time slicing table exists for the task (i.e., determine whether the TimeWheelRecordNotExist event is true): if the table does not exist (i.e., the TimeWheelRecordNotExist event is true), the task may be preempted; if the table exists (i.e., the TimeWheelRecordNotExist event is not true), it is further determined whether the current time is later than the end time of the last execution time period recorded in the time slicing table (i.e., whether the TimeWheelRunOut event is true): if it is later (i.e., the TimeWheelRunOut event is true), the task may be preempted; if not, it is further determined whether an error occurred in the latest scheduling (i.e., whether the TimeWheelOccurError event is true): if an error occurred (i.e., the TimeWheelOccurError event is true), the task may be preempted; otherwise (i.e., the TimeWheelOccurError event is not true) the task is not preempted. If this determination shows that the task can be preempted, the node preempts it and the preempted task becomes the target scheduling task; if the task cannot be preempted, the same determination can be performed on the next task in the task list, and details are not repeated. The determination process is described in detail in the following embodiments (to avoid ambiguity, the task is referred to directly as the target scheduling task hereinafter).
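The three-event determination above can be sketched in Python as follows. This is a minimal illustration rather than the component's actual implementation: the `TimeSliceRecord` type, its fields, and the function names are hypothetical, and a missing time slicing table is modelled simply as `None`.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class TimeSliceRecord:
    """Hypothetical view of a task's time slicing table."""
    end_time: float        # end of the latest recorded execution time period
    error_occurred: bool   # whether the latest scheduling reported an error


def can_preempt(record: Optional[TimeSliceRecord], now: float) -> bool:
    """Evaluate the three events in the order described in the text."""
    if record is None:              # TimeWheelRecordNotExist is true
        return True
    if now > record.end_time:       # TimeWheelRunOut is true
        return True
    return record.error_occurred    # TimeWheelOccurError decides the rest


def first_preemptable(task_ids: Iterable[str],
                      records: Dict[str, TimeSliceRecord],
                      now: float) -> Optional[str]:
    """Walk the task list and return the first task that can be preempted."""
    for task_id in task_ids:
        if can_preempt(records.get(task_id), now):
            return task_id
    return None
```

For example, a task with no record is always preemptable, whereas a task whose period has not ended and whose latest scheduling had no error is skipped in favour of the next task in the list.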

Specifically, the any node may determine, according to the execution related information of the target scheduling task, whether the target scheduling task is being executed at the current time. If the task is currently being executed (in which case it may be being executed by any node in the data scheduling system, i.e. either by the any node itself or by some other node), the task cannot be preempted by the any node at this time; if the task is not being executed by any node at the current moment, the task can be preempted and executed by the any node.

The execution time information in the execution related information is used to represent the time period in which the node executing the task at the current moment executes it (the node can execute the task only within the execution time period represented by the execution time information). It can be understood that, if the execution related information does not contain any execution time information, the target scheduling task has not yet been executed by any node; if the execution related information contains execution time information (hereinafter referred to as the latest execution time information), the target scheduling task either was executed by a node before the current time or is currently being executed. It can be seen that, whether the execution related information contains no execution time information or the latest execution time information it contains indicates that the target scheduling task is not currently being executed, the task is not being executed at the current moment, and the any node may then preempt and execute it.

The latest execution time information may include a start time and an end time of an execution time period. If the current time is between the start time and the end time (i.e. the current time is within the execution time period), the target scheduling task is currently being executed. If the current time is before the start time (i.e., the current time is before the execution time period), the target scheduling task has been preempted by a certain node but its execution has not yet started; this case is not discussed in the present specification and will not be described again. If the current time is after the end time (i.e. the current time is after the execution time period), the previous execution of the target scheduling task has been completed and the task has not been preempted again by any node; in this case the execution time period is the previous execution time period, i.e. the most recent time period before the current time in which the task was executed. Therefore, if the current time is later than the end time of the previous execution time period characterized by the latest execution time information, it can be determined that the latest execution time information indicates that the target scheduling task is not being executed at the current time.

Alternatively, an update duration threshold, such as 10 min or 5 min, may be set for the execution time information of the target scheduling task. On this basis, if the interval between the current time and the time at which the latest execution time information was stored exceeds the update duration threshold, the latest execution time information has not been updated for too long, which in turn suggests that the previous execution of the target scheduling task may have gone wrong; it can therefore be determined that the latest execution time information indicates that the target scheduling task is not being executed at the current time. Of course, the update duration threshold may be set to be greater than the length of the aforementioned previous execution time period, to ensure that the node that last executed the task has finished executing it by the time the update duration threshold is reached.
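The two alternative criteria described above (comparing the current time with the end time, or comparing the age of the stored information with an update duration threshold) might be expressed as the following sketch. The `ExecutionTimeInfo` type, its field `stored_at`, and the function names are hypothetical; times are plain seconds for simplicity.

```python
from dataclasses import dataclass


@dataclass
class ExecutionTimeInfo:
    """Hypothetical form of the latest execution time information (seconds)."""
    start: float      # start time of the execution time period
    end: float        # end time of the execution time period
    stored_at: float  # time at which this information was written to the database


def is_executing(info: ExecutionTimeInfo, now: float) -> bool:
    """The task is being executed if the current time lies within the period."""
    return info.start <= now <= info.end


def period_has_ended(info: ExecutionTimeInfo, now: float) -> bool:
    """First criterion: the current time is later than the end time."""
    return now > info.end


def update_is_overdue(info: ExecutionTimeInfo, now: float,
                      update_threshold: float) -> bool:
    """Second criterion: the information has not been refreshed within the threshold."""
    return now - info.stored_at > update_threshold
```

If the threshold is chosen larger than the period length, as suggested above, the second criterion can only become true after the first one, so it acts as a conservative fallback.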

In the case where the execution related information does not contain execution time information, or the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment, the any node may determine its current execution related information for the target scheduling task and store that information in the database as the latest execution related information of the target scheduling task. It will be appreciated that storing the latest execution related information means persisting it in the database so that each node in the system can view it when needed. The persisting may be completed by the time allocation persistence component TimeWheelPersist. Because the current execution related information includes current execution time information characterizing the current execution time period, once it is written into the database the other nodes in the data scheduling system can read the current execution time information and determine from it that the target scheduling task is being executed by the any node at the current moment (i.e., the moment after the other nodes read the information). The other nodes therefore cannot (or will not) preempt the task again, which effectively prevents the task from being executed repeatedly by a plurality of nodes in the same time period.

The current execution time information may include a start time and an end time of the current execution time period. When determining the current execution time period, the any node may therefore calculate the start time and the end time according to its current running duration and/or its current available resource amount. The calculation of the start time and the end time may be done by the aforementioned time slicing production component GenTimeWheel. Specifically, the current time, or the time at which it was determined that the task is not being executed, may be taken as the start time; the time interval between the start time and the end time (i.e., the length of the current execution time period) may be calculated based on the current running duration and/or the current available resource amount; and the sum of the start time and that length may then be taken as the end time.

The size of the time interval is positively correlated with the current running duration and/or the current available resource amount. By way of example, the time interval ΔH may be calculated according to the following formula (1):

$$\Delta H = \frac{H}{1 + e^{-a \cdot b}} \tag{1}$$

wherein H is a preset default duration (e.g. 5 min), a is the current running duration of the any node, and b is the current available resource amount of the any node. It will be appreciated that a larger a indicates that the any node has been running for longer (i.e. a longer interval between two shutdowns), i.e. that the node runs more stably; as can be seen from formula (1), the larger a is (i.e., the longer the current running duration), the larger ΔH is, so the higher the running stability of the any node, the larger ΔH. Similarly, a larger b indicates that the any node currently has more available resources, i.e. that its available resources are better able to meet the execution requirement of the target scheduling task, so the execution success rate of the task is higher; as can be seen from formula (1), the larger b is, the larger ΔH is, so the larger the current available resource amount of the any node, the larger ΔH.

The larger ΔH is, the longer the any node executes the target scheduling task after preempting it. As shown in fig. 2, assuming that the current execution time period is [t0, t1], after t0 is determined, node A (i.e., the any node) may calculate ΔH in the foregoing manner and thereby determine t1; thereafter, the node may execute the target scheduling task within [t0, t1].
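As a worked illustration of formula (1), the following Python sketch computes ΔH and the resulting execution time period. The function names are hypothetical, and the sketch assumes that a and b are non-negative numbers; the specification does not state their units or scaling, so the example values are arbitrary.

```python
import math
from typing import Tuple


def execution_interval(default_duration: float,
                       running_duration: float,
                       available_resources: float) -> float:
    """Formula (1): delta_H = H / (1 + e^(-a*b)).

    Assumes running_duration (a) and available_resources (b) are non-negative,
    so the result lies between H/2 and H.
    """
    a, b = running_duration, available_resources
    return default_duration / (1.0 + math.exp(-a * b))


def execution_period(start_time: float,
                     default_duration: float,
                     running_duration: float,
                     available_resources: float) -> Tuple[float, float]:
    """Return (t0, t1) where t1 = t0 + delta_H."""
    delta_h = execution_interval(default_duration, running_duration,
                                 available_resources)
    return start_time, start_time + delta_h
```

With H = 300 s, a product a·b of zero gives ΔH = 150 s (half of H), and ΔH approaches, but under this formula does not exceed, 300 s as a·b grows; the interval therefore increases with both the running duration and the available resource amount.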

In addition to the foregoing current execution time information, the current execution related information further includes target block height information, which is used to characterize a target block. The target block is the block corresponding to the target data to be scheduled, and the target data may be the target block itself, a transaction contained in the block, a receipt generated by executing a transaction in the block, a log corresponding to the block, or the pre-analysis data generated based on the block.

In an embodiment, when determining its current execution related information for the target scheduling task, the any node may determine its own node identifier as the execution node identifier of the target scheduling task (the node identifier of the any node may be recorded in the foregoing time slicing table, e.g. by adding it to the table or by replacing the execution node identifier already recorded in the table), where the execution node identifier is used to characterize which node is executing the target scheduling task. Once this has been recorded, the nodes in the data scheduling system can determine from the execution node identifier that the any node is executing the task in the current execution time period. And/or, the any node may update the task state of the target scheduling task to the executing state (e.g. by updating the task state information in the time slicing table from "non-executing state" to "executing state"); a task in the executing state cannot be allocated to another node for execution.

In an embodiment, in the case where the execution related information does not contain execution time information, or the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment, the any node may further allocate a mutual exclusion lock for the target scheduling task to itself (in other words, the any node may preempt the mutual exclusion lock for the task). Allocating the mutual exclusion lock (i.e. locking) prevents other nodes from preempting the target scheduling task as far as possible, thereby avoiding the task being executed repeatedly by different nodes at the same time.
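The effect of the mutual exclusion lock can be illustrated with a small in-memory sketch in Python. It is not the mechanism of the specification: the `LockTable` class and the `race` helper are hypothetical, a `threading.Lock` stands in for whatever atomic check-and-set the shared database would provide, and tying the lock's expiry to the end of the execution time period is an assumption made only for this example.

```python
import threading
from typing import Dict, List, Tuple


class LockTable:
    """Hypothetical in-memory stand-in for per-task mutual exclusion locks."""

    def __init__(self) -> None:
        self._guard = threading.Lock()  # models the database's atomicity
        self._owners: Dict[str, Tuple[str, float]] = {}  # task -> (node, expires_at)

    def try_acquire(self, task_id: str, node_id: str,
                    now: float, expires_at: float) -> bool:
        """Atomically take the lock unless another holder's lock is still valid."""
        with self._guard:
            owner = self._owners.get(task_id)
            if owner is not None and now <= owner[1]:
                return False
            self._owners[task_id] = (node_id, expires_at)
            return True


def race(table: LockTable, task_id: str, node_ids: List[str],
         now: float, expires_at: float) -> List[str]:
    """Let several nodes try to preempt the same task concurrently."""
    results: Dict[str, bool] = {}

    def attempt(node_id: str) -> None:
        results[node_id] = table.try_acquire(task_id, node_id, now, expires_at)

    threads = [threading.Thread(target=attempt, args=(n,)) for n in node_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [n for n, ok in results.items() if ok]


winners = race(LockTable(), "task_i", ["A", "B", "C", "D", "E"], 0.0, 300.0)
```

However the threads interleave, `winners` contains exactly one node, which is the property the lock is meant to provide.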

After the current execution related information is determined in the foregoing manner, it may be stored in the database in various ways. For example, the latest execution related information of the target scheduling task stored in the database may be updated to the current execution related information. In this way, the stored latest execution related information ("latest" here means the last time before the current time, i.e. the information corresponds to the last execution of the task by the any node or another node before the current time) is replaced with the current execution related information (which corresponds to the execution about to be performed by the any node), so that at any time the database stores the execution related information of only one node for the task, which helps save storage space in the database. As shown in fig. 2, at time t1 the execution related information of node A for task i is replaced with the execution related information of node C for task i, at time t2 the execution related information of node C for task i is replaced with that of node D, and so on.

Alternatively, the any node may store the current execution related information in the database in association with the latest execution related information of the target scheduling task; once stored, the current execution related information serves as the latest execution related information of the target scheduling task from the current time onward. In this way, the current execution related information is added to the database as a new record, so the database retains the execution related information of every node that has executed the target scheduling task (equivalent to storing the history of the task being executed by the nodes in turn), which makes it easy to trace the execution process of the task. As shown in fig. 2, in the time period from t0 to t1, the database stores only the execution related information of node A for task i; at time t1, the execution related information of node C for task i is added, and in the time period from t1 to t2 the database stores the execution related information of nodes A and C for task i, of which that of node C is the latest execution related information; at time t2, the execution related information of node D for task i is added, and in the time period from t2 to t3 the database stores the execution related information of nodes A, C and D for task i, of which that of node D is the latest execution related information. After the current execution related information is stored as the latest execution related information, the current execution time information it contains correspondingly becomes the latest execution time information.
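The two storage manners described above (replacing the stored record, or appending to a history) can be contrasted in a short Python sketch. A plain dictionary stands in for the database, and the function names and the record layout are hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, object]
Database = Dict[str, List[Record]]  # task id -> stored execution related information


def store_by_replacing(db: Database, task_id: str, info: Record) -> None:
    """Keep only one record per task: the new information overwrites the old."""
    db[task_id] = [info]


def store_by_appending(db: Database, task_id: str, info: Record) -> None:
    """Keep the full history: the new information is added after the old records."""
    db.setdefault(task_id, []).append(info)


def latest(db: Database, task_id: str) -> Optional[Record]:
    """In both manners the most recently stored record is the latest one."""
    records = db.get(task_id)
    return records[-1] if records else None


# Nodes A, C and D acquiring task i at t0, t1 and t2, as in fig. 2.
replaced: Database = {}
history: Database = {}
for node in ["A", "C", "D"]:
    store_by_replacing(replaced, "task_i", {"node": node})
    store_by_appending(history, "task_i", {"node": node})
```

After the loop, `replaced` holds only node D's record, whereas `history` holds the records of nodes A, C and D in order; in both cases `latest` returns node D's record.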

And step 304, executing the target scheduling task in the current execution time period to schedule target data corresponding to the target block in the block chain related data from the database to a data analysis party.

Based on the stored latest execution related information, any node can determine the current execution time period according to the latest execution time information and execute the target scheduling task in the time period. Of course, instead of storing the aforementioned current execution time information (as latest execution time information) in the database, the any node may locally cache the information so as to determine the current execution time period accordingly, without having to read the information from the database. The process of executing the target scheduling task by any node is a process of scheduling target data corresponding to the target block in the blockchain related data from the database to a data analysis party. The target scheduling task may be triggered to be executed by a time slicing task opening component StartTimeWheelWork, and a task container may be created in any node, where the component may run at least one thread in the container, and each thread is used to execute one data scheduling task. For the target scheduling task, the thread of the task may include at least one process, so as to implement single-thread serial execution or multi-thread parallel execution of the task, which will not be described herein.

Before scheduling the target data, it is necessary to determine which data in the above blockchain-related data is the target data. In an embodiment, in view of the fact that the latest execution related information (i.e., the execution related information after the replacement or the addition) stored in the database includes target block height information (e.g., a block height range included in the data indication information of the target scheduling task recorded in the task list, etc.), the any node may determine the corresponding target block according to the information. If the target block height information is "60,70", 11 blocks having a block height of 60 to 70 may be determined as target blocks, and the blockchain related data corresponding to these blocks may be determined as target data. In addition, in the case that the foregoing data indication information further includes a data ID (such as an ID of a certain data table in the pre-analysis data), the arbitrary node may directly determine the corresponding target data according to the data ID.
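As a small illustration of the block height example above, the following sketch expands target block height information of the form "60,70" into the individual target block heights. The function name target_block_heights and the assumption that the information is a comma-separated inclusive range are hypothetical.

```python
from typing import List


def target_block_heights(height_info: str) -> List[int]:
    """Expand 'low,high' into every block height in the inclusive range."""
    low_text, high_text = height_info.split(",")
    low, high = int(low_text), int(high_text)
    if low > high:
        raise ValueError("lower bound must not exceed upper bound")
    return list(range(low, high + 1))


heights = target_block_heights("60,70")
```

Because both ends of the range are included, "60,70" yields the 11 heights from 60 to 70.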

In another embodiment, the pre-analysis data may belong to multiple levels and there may be a dependency relationship between different pre-analysis data, where in a case where the target data includes multiple target pre-analysis data, the any node may determine, according to data indication information (such as the block height range, the data ID, etc.) in the execution related information, target pre-analysis data of a highest level, and determine, according to the dependency relationship, target pre-analysis data of each other level on which the target pre-analysis data of the highest level depends. The dependency relationship may be characterized by dependency indication information, which may be recorded in the latest execution related information, or may be requested and acquired from a database.

For the embodiment corresponding to fig. 4, if the target pre-analysis data represented by the data indication information is the data table 22, the transaction, receipt, and data tables 12 and 13 (all of which are relied on by the data table 22) may be determined as the target pre-analysis data, respectively; for another example, if the target pre-analysis data represented by the data indication information is the data table 41, the block, the transaction, the receipt, the log, the data tables 11 to 14, the data tables 21 to 23, and the data tables 31 to 32 (all of which are relied on by the data table 41) may be determined as the target pre-analysis data, which is not described in detail.
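The dependency resolution described above amounts to collecting everything reachable from the highest-level target pre-analysis data. The sketch below assumes the dependency indication information can be represented as a mapping from each data item to the items it directly depends on; the individual edges in DEPENDS_ON are hypothetical (fig. 4 is not reproduced here), chosen only so that data table 22 ends up relying on the transaction, the receipt, and data tables 12 and 13 as stated in the text.

```python
from typing import Dict, List, Set

# Hypothetical dependency indication information (direct dependencies only).
DEPENDS_ON: Dict[str, List[str]] = {
    "table22": ["table12", "table13"],
    "table12": ["transaction"],
    "table13": ["receipt"],
    "transaction": [],
    "receipt": [],
}


def resolve_dependencies(top: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every item that `top` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = [top]
    while stack:
        name = stack.pop()
        for dep in depends_on.get(name, []):
            if dep not in seen:  # guards against revisiting shared items
                seen.add(dep)
                stack.append(dep)
    return seen


deps_of_table22 = resolve_dependencies("table22", DEPENDS_ON)
```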

In an embodiment, when the target data is scheduled from the database to the data analysis party, the target data of each target block may be scheduled in sequence in order of block height from small to large. In this way, the order in which the target data is scheduled is consistent with the order in which the corresponding target data was generated, so that the any node only needs to determine the current execution time period and execute the task within it (whichever target data can be scheduled within the time period is scheduled), and does not need to predict in advance which target data needs to be scheduled within the time period, thereby avoiding accidents such as scheduling errors or data omission caused by inaccurate presetting. Of course, within the current execution time period, the any node may also first determine which target data can be scheduled, and then schedule the target data corresponding to each block in sequence in order of block height from high to low, which is not described again.

In the process of scheduling according to the sequence from small to large of the block heights, the completed block heights in the context information can be updated to ensure that the sequential scheduling is realized. The context information may be recorded in a context information table of the target scheduled task. By way of example, the context information table may be as shown in table 1 below:

TABLE 1

The current value of the completed block height recorded in the context information table at any time represents the block height of the target block whose target data was most recently scheduled in the process of executing the target scheduling task by the any node. As shown in fig. 2, if deliveredBlockHeight = n+x, it indicates that the target data corresponding to the target block with height n+x has already been scheduled.

In an embodiment, the any node may read the context information of the target scheduling task from the database, and add one to the current value of the completed block height in the context information. If the current time is earlier than the end time of the current execution time period, determining target data of a target block corresponding to the current value of the completed block height in the block chain related data, and scheduling the target data from the database to a data analysis party.

As shown in fig. 4, taking the example that the scheduled target data is a block: at or after time t2 (before node D actually performs task i), node D may increment the current value of the completed block height of task i by one (i.e., update the current value of deliveredBlockHeight from n+3 to n+4); after the update is completed, if t3 has not been exceeded (i.e., the current time is still within the current execution time period of node D for task i), the block with height n+4 may be scheduled to the corresponding data analysis party. Similarly, after the scheduling of the block with height n+4 is completed, node D may update the current value of deliveredBlockHeight from n+4 to n+5; after the update is completed, if t3 has not been exceeded, the block with height n+5 may be scheduled to the corresponding data analysis party, which will not be described again.
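The increment-then-schedule loop in this example can be sketched as follows. The function name run_time_slice, the dictionary standing in for the context information table, and the injected now and deliver callables are assumptions for illustration. One further assumption is labeled in the code: if the execution time period has already ended after the increment, the sketch restores the previous value so that deliveredBlockHeight keeps pointing at the last block actually scheduled; the specification does not state how this case is handled.

```python
from typing import Callable, Dict, List


def run_time_slice(
    context: Dict[str, int],
    end_time: float,
    now: Callable[[], float],
    deliver: Callable[[int], None],
) -> List[int]:
    """Schedule blocks in ascending height until the time period ends."""
    delivered: List[int] = []
    while True:
        previous = context["deliveredBlockHeight"]
        context["deliveredBlockHeight"] = previous + 1  # add one first
        if now() >= end_time:
            # Assumption: roll back so the unscheduled block is not skipped.
            context["deliveredBlockHeight"] = previous
            break
        deliver(previous + 1)  # schedule the block at the new height
        delivered.append(previous + 1)
    return delivered


# Demo with a fake clock: n = 0, so the context starts at n+3 = 3 and t3 = 3.
context = {"deliveredBlockHeight": 3}
fake_clock = iter([0, 1, 2, 10])
sent: List[int] = []
result = run_time_slice(context, end_time=3, now=lambda: next(fake_clock), deliver=sent.append)
```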

In the process of scheduling the target data from the database to the data analysis party, scheduling may fail for various reasons (such as absence of the target data, an unstable network, downtime of the downstream data analysis party, program processing errors, etc.). In this regard, the any node may perform corresponding processing according to the cause of the failure. For example, if the scheduling fails due to a failure related to the data analysis party (such as an unstable network, downtime of the data analysis party, etc.), the scheduling may be retried after waiting for a preset time period (such as 1 s). For another example, if the scheduling fails because the target data of any target block does not exist in the blockchain related data, the scheduling may be retried after waiting for a preset block-out duration (such as 12 s or 10 min). The target data may be absent because the blockchain network has not yet generated the target block, so that the on-chain original data or the corresponding pre-analysis data has not yet been generated. If the retry after the block-out duration succeeds, it indicates that the target data was generated during the waiting period.

Further, if the scheduling still fails during the retry, that is, the target data of the target block still does not exist after waiting for the block-out duration, the block height of that target block may be recorded in a missing block height set. Then, when the number of block heights recorded in the missing block height set reaches a maximum allowable missing block number threshold, the difference between the current block height and that threshold is taken as the block height of an initial retry block, and scheduling of the target data corresponding to the initial retry block from the database to the data analysis party is retried. In this way, when the number of consecutively missing (i.e., absent) blocks reaches the preset value, the node returns to the initial height to retry scheduling. Of course, the number of scheduling retries may also be flexibly configured to avoid falling into an infinite loop.
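The failure handling described above can be sketched as two small helpers. The cause labels, the constant names, the function names, and the chosen wait values (1 s and 12 s, taken from the examples in the text) are assumptions for illustration only.

```python
from typing import Optional, Set

ANALYST_RETRY_WAIT_S = 1.0  # preset time period from the example above
BLOCK_OUT_WAIT_S = 12.0     # one of the example block-out durations


def retry_wait_seconds(cause: str) -> float:
    """Pick the waiting time before a retry from the failure cause."""
    if cause == "analyst_failure":  # unstable network, analysis party down
        return ANALYST_RETRY_WAIT_S
    if cause == "data_missing":     # target data not generated yet
        return BLOCK_OUT_WAIT_S
    raise ValueError(f"unhandled failure cause: {cause}")


def record_missing(missing: Set[int], height: int, max_missing: int) -> Optional[int]:
    """Record a missing height; return the initial retry height if the
    number of recorded heights has reached the threshold, else None.

    The text does not say whether the set is cleared afterwards, so this
    sketch leaves it unchanged.
    """
    missing.add(height)
    if len(missing) >= max_missing:
        return height - max_missing  # current height minus the threshold
    return None


missing_heights: Set[int] = set()
outcomes = [record_missing(missing_heights, h, max_missing=3) for h in (101, 102, 103)]
```

With a threshold of 3, the third missing block (height 103) sends the node back to height 100.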

In one embodiment, the any node may schedule the target data from the database to the data analyst in a variety of ways. For example, the target data may be read from a database and sent (i.e. the data itself) to the data analyst, i.e. forwarded by the any node to the data analyst. Alternatively, in view of the fact that the data size of the target data is generally large, in order to reduce the data processing burden of any node, the storage address of the target data in the database may be sent to the data analysis party, so that the data analysis party may read the target data from the database according to the storage address.

For the two scheduling policies (i.e., the policy of directly sending the data and the policy of sending the access address), in order to balance the data transmission speed against the time the downstream data analysis party spends acquiring the target data, a block scheduling selector may be used to switch between the policies automatically. For example, if the data size of the target data is smaller than a preset data size threshold (for example, the size of the target block and/or its transaction count is not greater than the threshold), the policy of directly sending the data may be selected, so as to reduce the time consumed by network communication during sending and schedule the target data quickly. Otherwise, if the data size of the target data is not smaller than the preset data size threshold (for example, the size of the target block and/or its transaction count is greater than the threshold), the policy of sending the access address may be selected, so as to reduce the time the downstream data analysis party spends acquiring the target data.
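The switching rule of the block scheduling selector reduces to a single comparison, sketched below. The function name select_policy, the policy labels, and the use of a byte count as the data size are assumptions; the text also allows the transaction count to be used as the measure.

```python
def select_policy(data_size: int, threshold: int) -> str:
    """Send the data itself when it is small, otherwise send its address."""
    if data_size < threshold:
        return "send_data"      # the node forwards the target data directly
    return "send_address"       # the analysis party reads from the database


small_choice = select_policy(data_size=512, threshold=1024)
large_choice = select_policy(data_size=1024, threshold=1024)
```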

The data analysis party described in this specification may be a server of the data scheduling system, that is, the system may be used for data analysis (i.e., it integrates a data analysis function) in addition to data scheduling. Each node in the system may even have an analysis function for the target data; in that case, the any node may read the target data from the database, analyze it by itself, and feed back the corresponding analysis result to the server or a user of the data scheduling system. Alternatively, the data analysis party may be another analysis party connected to the data scheduling system (such as a server of a third-party analysis platform, or an analysis party specified by a user), in which case the any node only needs to schedule the target data to the data analysis party and does not need to be concerned with the specific analysis process. The specific analysis process of the scheduled target data is not limited in this specification and will not be described in detail.

In addition to storing the blockchain related data in the database, this scheme also stores the execution related information of the target scheduling task in the database. In the case that the data scheduling method is executed by the any node (in which case the plurality of nodes in the system form a non-master structure), the node preempts and executes the target scheduling task by itself. Specifically, the node may access the blockchain related data (e.g., to schedule the target data in it) and the execution related information (e.g., to read it), so as to determine and store its own current execution related information for the task when the execution related information indicates that the target scheduling task is not being executed at the current time. It will be appreciated that the current execution related information indicates that the target scheduling task is executed by the any node during the current execution time period, so that by storing this information in the database the node preempts the execution right of the target scheduling task for that time period. Other nodes in the system can then read the information and determine from it that the any node is executing the task, which effectively prevents the task from being preempted again and ensures that only the any node executes the task during the current execution time period. Therefore, the scheme avoids the conflict in which different nodes schedule the same target data at the same time, thereby ensuring ordered scheduling of the target data and reducing task competition among the nodes.
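The preemption idea summarized above (a node claims a task by writing its own execution time period, and other nodes back off while that period is still running) can be sketched as follows. The class name TaskTable, the tuple layout, and the use of an in-process lock to make the check-and-write step atomic are assumptions for illustration; the specification does not state how atomicity is achieved in the database.

```python
import threading
from typing import Dict, Optional, Tuple


class TaskTable:
    """In-memory stand-in for the execution related information table."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # task_id -> (node_id, start_time, end_time)
        self._latest: Dict[str, Tuple[str, float, float]] = {}

    def try_preempt(self, task_id: str, node_id: str, now: float, length: float) -> bool:
        """Claim the task unless another execution time period is still open."""
        with self._lock:  # assumption: check and write happen atomically
            record = self._latest.get(task_id)
            if record is not None and now <= record[2]:
                return False  # latest period has not ended; do not preempt
            self._latest[task_id] = (node_id, now, now + length)
            return True

    def executor(self, task_id: str) -> Optional[str]:
        record = self._latest.get(task_id)
        return record[0] if record else None


table = TaskTable()
first = table.try_preempt("task-i", "C", now=0, length=10)   # no record yet
second = table.try_preempt("task-i", "D", now=5, length=10)  # C still running
third = table.try_preempt("task-i", "D", now=11, length=10)  # C's period ended
```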

So far, the description of the data scheduling process under the non-master structure is completed. The data scheduling process under the master-slave structure is described below with reference to fig. 5.

Referring to fig. 5, fig. 5 is a flowchart of another data scheduling method according to an exemplary embodiment. As shown in fig. 5, the method is applied to a master node and a target slave node among a plurality of nodes included in a data scheduling system, the data scheduling system having a blockchain network connected thereto, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the master node having access rights for the blockchain-related data and the execution-related information, the method comprising the following steps 502 to 504.

Step 502, when the execution related information does not include execution time information or includes latest execution time information indicating that the target scheduling task is not executed at the current time, the master node determines current execution related information of the target scheduling task for a target slave node, and stores the current execution related information as latest execution related information of the target scheduling task in the database; the current execution related information comprises target block height information used for representing a target block and current execution time information used for representing a current execution time period.

Similar to the aforementioned non-master structure, under the master-slave structure the blockchain related data of the blockchain network to which the data scheduling system is connected, and the execution related information of the data scheduling task created for that data, are also stored in the database. The database may include a first database belonging to the data scheduling system and/or a second database not belonging to the data scheduling system; for the types of the first database and the second database, the data stored in them, and so on, reference may be made to the embodiments of the non-master structure, which will not be described again.

The scheduling object of the data scheduling method is the target data in the blockchain related data. The blockchain related data may include on-chain original data and/or pre-analysis data (generated based on the on-chain original data). The data scheduling system may pull the on-chain original data from the blockchain network based on the first node and obtain the corresponding pre-analysis data; for the specific pulling and generating manners, reference may be made to the embodiments corresponding to the non-master structure, which are not described herein. Accordingly, the scheduled target data may include target on-chain original data and/or target pre-analysis data.

For at least one data scheduling task to be executed, which is created in the data scheduling system, if any one of the tasks is distributed to any slave node by the master node, the task becomes a target scheduling task, and the slave node becomes a target slave node. The target scheduling task is used for scheduling target data, wherein the target data belongs to the block chain related data of the block chain network, namely the target data to be scheduled can comprise on-chain original data and/or pre-analysis data of the block chain network.

The master node and the target slave node may each include corresponding functional components. For example, the master node may include a multi-condition allocation decision component MultiConditionDispatchJudge, a time slicing production component GenerateTimeWheel, a time slice persistence component TimeWheelPersistence, a time slicing task opening component StartTimeWheelWork, etc.; the specific functions of these components are described in the following embodiments. In fact, since each node in the data scheduling system may act as either a master node or a slave node, each node in the system may contain all of the above functional components, so as to allocate tasks to other nodes when it is the master node, to execute tasks allocated by the master node when it is a slave node, and so on. In addition, the master node may allocate tasks to each slave node and may also allocate tasks to itself (i.e., it may preempt tasks to be executed by itself), which is not limited in this specification.

In an embodiment, for any task in the task list, the master node may first determine whether the task can be allocated and executed; this determination may be implemented by the above-mentioned multi-condition allocation decision component MultiConditionDispatchJudge. This component may be used to determine whether the time slicing table does not exist (the TimeWheelRecordNotExist event), whether the latest execution time period has expired (the TimeWheelRunOut event), whether an error has occurred (the TimeWheelOccurError event), and so on.

Illustratively, MultiConditionDispatchJudge may first determine whether the corresponding time slicing table exists for the task (i.e., whether the TimeWheelRecordNotExist event is true): if the table does not exist (i.e., the event is true), the task may be allocated; if it exists, it is further determined whether the current time is later than the end time of the latest execution time period recorded in the time slicing table (i.e., whether the TimeWheelRunOut event is true): if it is later (i.e., the event is true), the task may be allocated; if not, it is further determined whether an error occurred in the latest scheduling (i.e., whether the TimeWheelOccurError event is true): if an error occurred (i.e., the event is true), the task may be allocated; otherwise the task is not allocated. If the determination concludes that the task can be allocated, the task becomes a target scheduling task; if the task cannot be allocated, the same determination may be performed on the next task, and the details are not repeated. The above determination process is described in detail in the following embodiments (to avoid ambiguity, the task is referred to directly as the target scheduling task hereinafter):

Specifically, the master node may first determine, according to the execution related information, whether the target scheduling task is being executed at the current time: if the task is currently being executed (in which case it may be being executed by any node in the data scheduling system, e.g., by the target slave node or by any other slave node), the master node cannot allocate the task at this time; if the task is not currently being executed (by any node), the task can be allocated by the master node at this time.

The execution time information in the execution related information represents the time period during which the node executing the task executes it (i.e., the node executes the task within the execution time period represented by the execution time information). Therefore, if the execution related information does not contain any execution time information, the target scheduling task has not yet been executed by any node; otherwise, if the execution related information contains execution time information (hereinafter referred to as latest execution time information), the target scheduling task was executed by some node before the current time or is currently being executed. It can be seen that, whether the execution related information contains no execution time information or the latest execution time information it contains indicates that the target scheduling task is not currently being executed, the task is not being executed at the current time, and the master node may allocate the task.
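The allocation judgment described above (TimeWheelRecordNotExist, then TimeWheelRunOut, then TimeWheelOccurError) can be sketched as a short chain of checks. The class name TimeSliceRecord, its fields, and the function name may_allocate are assumptions for illustration; only the three event names come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimeSliceRecord:
    """Hypothetical view of the time slicing table for one task."""
    end_time: float        # end time of the latest execution time period
    error_occurred: bool   # whether the latest scheduling hit an error


def may_allocate(record: Optional[TimeSliceRecord], now: float) -> bool:
    """Return True when the master node may allocate the task."""
    if record is None:          # TimeWheelRecordNotExist is true
        return True
    if now > record.end_time:   # TimeWheelRunOut is true
        return True
    return record.error_occurred  # TimeWheelOccurError decides the rest


never_run = may_allocate(None, now=100.0)
expired = may_allocate(TimeSliceRecord(end_time=50.0, error_occurred=False), now=100.0)
running = may_allocate(TimeSliceRecord(end_time=150.0, error_occurred=False), now=100.0)
failed = may_allocate(TimeSliceRecord(end_time=150.0, error_occurred=True), now=100.0)
```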

Wherein, the latest execution time information may include a start time and an end time of an execution time period: and if the current time is between the starting time and the ending time (i.e. the current time is within the execution time period), indicating that the target scheduling task is currently executed. If the current time is before the start time (i.e., the current time is before the execution time period), it indicates that the target scheduling task is currently preempted by a certain node but has not yet been executed, which is not discussed in the present specification and will not be described again. If the current time is after the end time (i.e. the current time is after the execution time period), it indicates that the previous execution of the target scheduling task has been completed and has not been allocated again, and at this time, the execution time period is the previous execution time period, which is used to characterize that the task is executed last time in the time period before the current time. Therefore, if the current time is later than the ending time of the previous execution time period characterized by the latest execution time information, it can be determined that the latest execution time information indicates that the target scheduling task is not executed at the current time.
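The three cases distinguished above can be summarized by comparing the current time with the two ends of the execution time period. The function name task_state and the returned labels are assumptions for illustration.

```python
def task_state(start: float, end: float, now: float) -> str:
    """Classify a task from its latest execution time period."""
    if now < start:
        return "claimed_not_started"  # preempted but not yet executed
    if now <= end:
        return "executing"            # current time is within the period
    return "not_executing"            # previous execution has finished


before = task_state(start=10, end=20, now=5)
during = task_state(start=10, end=20, now=15)
after = task_state(start=10, end=20, now=25)
```

Only the last case, where the current time is later than the end time, allows the task to be allocated again.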

In the case that the execution related information does not include execution time information, or includes latest execution time information indicating that the target scheduling task is not being executed at the current time, the master node may determine the current execution related information of the target scheduling task for the target slave node, and store the current execution related information in the database as the latest execution related information of the target scheduling task. It will be appreciated that the latest execution related information is stored, i.e., persisted, in the database for later viewing by the master node at that time (which may be the current master node or a newly elected master node). The persisting of the information may be performed by the above-described time slice persistence component TimeWheelPersistence, which is included in the master node. Because the current execution related information includes current execution time information representing the current execution time period, after it is written into the database a new master node can read the current execution time information and determine from it that the target scheduling task is being executed by the target slave node at the current time (i.e., the time at which the new master node reads the information), so that the new master node does not allocate the task again, which effectively prevents the task from being executed repeatedly by multiple slave nodes within the same time period.

The current execution time information may include a start time and an end time of the current execution time period. In an embodiment, the master node may calculate, according to the current running duration and/or the current available resource amount of each slave node, a start time and an end time of the current execution time period and a time interval between the two times, which correspond to each slave node, respectively, and determine the slave node with the largest time interval as the target slave node. The calculation of the start time, the end time and the time interval between the start time and the end time can be completed by the time slicing production component GenerateTimeWheel. As shown in fig. 2, at time t2, if the master node E calculates that the current execution time periods of the node D and the node a for the task i are 15min and 6min, respectively, the node D may be determined as the target slave node, and then the task i may be allocated to the node D (within 15 min) for execution.

Alternatively, when determining the current execution time information, the master node may calculate the start time and the end time according to the current running duration and/or the current available resource amount of the target slave node. The calculation of the start time and the end time may be performed by the aforementioned time slicing production component GenerateTimeWheel. Specifically, the current time, or the time at which it is determined that the task is not being executed, may be taken as the start time; the length of the current execution time period (i.e., the time interval between the start time and the end time) may be calculated from the current running duration and/or the current available resource amount; and the sum of the start time and that length may then be taken as the end time. The size of the time interval is positively correlated with the current running duration and/or the current available resource amount. For the specific calculation method, reference may be made to the description of the foregoing formula (1) and the related embodiments, which will not be repeated here.
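A minimal sketch of the two steps described above, choosing the slave node with the largest time interval and deriving the end time from the start time plus the length, is given below. The function names are hypothetical, and the lengths are taken as given inputs because formula (1) is not reproduced in this passage.

```python
from typing import Dict, Tuple


def choose_target_slave(length_by_node: Dict[str, float]) -> str:
    """Pick the slave node whose execution time period is the longest."""
    return max(length_by_node, key=lambda node: length_by_node[node])


def execution_period(start: float, length: float) -> Tuple[float, float]:
    """Return (start time, end time), where end = start + length."""
    if length <= 0:
        raise ValueError("length of the execution time period must be positive")
    return start, start + length


# Example from the text: node D gets 15 min and node A gets 6 min for task i.
lengths_min = {"D": 15, "A": 6}
target = choose_target_slave(lengths_min)
period = execution_period(start=0, length=lengths_min[target])
```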

In an embodiment, when determining the current execution related information of the target slave node for the target scheduling task, the node identifier of the target slave node may be determined as the execution node identifier of the target scheduling task (the node identifier of the target slave node may be recorded in the foregoing time slicing table, for example by adding it to the table or by replacing the execution node identifier already recorded in the table), where the execution node identifier represents which slave node executes the target scheduling task. After the recording is completed in this manner, the nodes in the data scheduling system can determine from the execution node identifier that the target slave node executes the task in the current execution time period. And/or, the master node may update the task state of the target scheduling task to the executing state (e.g., update the task state information in the time slicing table from "non-executing state" to "executing state"); a task in the executing state cannot be allocated to a slave node again for execution.

In an embodiment, when the execution related information does not include execution time information or includes latest execution time information indicating that the target scheduling task is not executed at the current moment, the master node may further allocate a mutual exclusion lock for the target scheduling task to the target slave node, or the target slave node may preempt the mutual exclusion lock for the task, so as to avoid that other nodes also preempt the target scheduling task as much as possible, thereby avoiding that the task is repeatedly executed by different nodes at the same moment.

After the current execution related information is determined in the foregoing manner, it may be stored in the database in various manners. For example, the latest execution related information of the target scheduling task stored in the database may be updated to the current execution related information. In this manner, the stored latest execution related information (here "latest" means the last time before the current time, so this information corresponds to the task as most recently executed by the target slave node or by another node) is replaced with the current execution related information (which corresponds to the task about to be executed by the target slave node), so that the database stores the execution related information of only one node for the task at any time, which helps save storage space in the database.

Alternatively, the master node may store the current execution related information in the database in association with the latest execution related information of the target scheduling task, where, once the storage is completed, the current execution related information serves as the latest execution related information of the target scheduling task from the current time onward. In this manner, the current execution related information is newly added to the database, so that the execution related information of every slave node that has executed the target scheduling task is retained (equivalent to storing a history of the slave nodes that executed the task in turn), which makes it convenient to trace back the execution process of the task.

In an embodiment, once the latest execution related information has been stored, the target scheduling task has been successfully allocated to the target slave node. At this point, the master node or the database may send an execution trigger message to the target slave node to trigger it to execute the target scheduling task allocated to it.

In step 504, the target slave node executes the target scheduling task in the current execution time period, so as to schedule target data corresponding to the target block in the blockchain related data from the database to a data analyzer.

Based on the stored latest execution related information, the target slave node can determine the current execution time period according to the latest execution time information and execute the target scheduling task in the time period. Of course, instead of storing the aforementioned current execution time information (as latest execution time information) in the database, the master node may also send this information to the target slave node so that the latter determines the current execution time period accordingly without having to read this information from the database. The process of executing the target scheduling task by the target slave node is a process of scheduling the target data corresponding to the target block in the block chain related data from the database to a data analysis party. The target scheduling task may be triggered to be executed by a time slicing task opening component StartTimeWheelWork contained in the target slave node, where a task container may be created, and the component may run at least one thread in the container, and each thread is used to execute one data scheduling task. For the target scheduling task, the thread of the task may include at least one process, so as to implement single-thread serial execution or multi-thread parallel execution of the task, which will not be described herein.

Before scheduling the target data, it is necessary to determine which data in the above blockchain-related data is the target data. In one embodiment, in view of the fact that the latest execution related information (i.e., the replaced or added execution related information) stored in the database includes target block height information (e.g., a block height range included in the data indication information of the target scheduling task recorded in the task list), the target slave node may determine the corresponding target block according to the information. If the target block height information is "60,70", 11 blocks having a block height of 60 to 70 may be determined as target blocks, and the blockchain related data corresponding to these blocks may be determined as target data. In addition, in the case that the foregoing data indication information further includes a data ID (such as an ID of a certain data table in the pre-analysis data), the target slave node may directly determine the corresponding target data according to the data ID.
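The block height example above can be illustrated with a minimal sketch. It assumes the target block height information is a string of the form "low,high" with inclusive bounds, as in the "60,70" example; the function name parse_block_height_range is hypothetical and does not come from the specification.

```python
from typing import List


def parse_block_height_range(info: str) -> List[int]:
    """Expand a 'low,high' string into the inclusive list of block heights.

    Assumption: both bounds are inclusive, matching the "60,70" example
    in which 11 blocks are determined as target blocks.
    """
    low_text, high_text = info.split(",")
    low, high = int(low_text), int(high_text)
    if low > high:
        raise ValueError(f"invalid block height range: {info!r}")
    return list(range(low, high + 1))


target_heights = parse_block_height_range("60,70")
print(len(target_heights))  # 11 blocks: heights 60 through 70 inclusive
```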

In another embodiment, the pre-analysis data may belong to multiple levels and there may be a dependency relationship between different pre-analysis data, where in a case where the target data includes multiple target pre-analysis data, the target slave node may determine, according to data indication information (such as the block height range, the data ID, etc.) in the current execution related information, target pre-analysis data of a highest level, and determine, according to the dependency relationship, target pre-analysis data of each other level on which the target pre-analysis data of the highest level depends. The dependency relationship may be characterized by dependency indication information, which may be recorded in the latest execution related information, or may be requested and acquired from a database. The specific manner may be referred to the foregoing embodiments, and will not be described herein.

In an embodiment, when the target data is scheduled from the database to the data analysis party, the target data of each target block may be scheduled in sequence in ascending order of block height. This keeps the order in which the target data is scheduled consistent with the order in which it was generated, so that the target slave node only needs to determine the current execution time period and execute the task within it (scheduling whatever target data can be scheduled in that period), without predicting in advance which target data needs to be scheduled in the period, thereby avoiding problems such as scheduling errors or data omission caused by inaccurate presetting. Of course, the target slave node may instead first determine which target data can be scheduled within the current execution time period, and then schedule the target data corresponding to each block in sequence in order of block height from high to low, which is not described again.

During scheduling in ascending order of block height, the completed block height in the context information can be updated to ensure that scheduling proceeds sequentially. The context information may be recorded in a context information table of the target scheduling task. For the context information table and the manner of updating the context information, see Table 1 and the related embodiments above, which are not described again here.
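The sequential scheduling described above might be sketched as follows. This is an illustrative sketch only: the context is modeled as a plain dictionary with a hypothetical key completed_height, the send callback stands in for whatever delivers a block's target data to the data analysis party, and recording progress only after a successful send is an assumption rather than something the specification prescribes.

```python
import time
from typing import Callable, Dict


def schedule_in_order(context: Dict[str, int], last_height: int, end_time: float,
                      send: Callable[[int], None],
                      now: Callable[[], float] = time.time) -> None:
    """Schedule blocks in ascending height until the range or the period ends."""
    while context["completed_height"] < last_height and now() < end_time:
        next_height = context["completed_height"] + 1
        send(next_height)                          # schedule this block's target data
        context["completed_height"] = next_height  # record progress after success
```

Because progress is kept in the context rather than in the loop, a node that takes over the task in a later execution time period can resume from the recorded completed block height.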

In the process of scheduling the target data from the database to the data analysis party, scheduling may fail for various reasons (e.g., the target data does not exist, the network is unstable, the downstream data analysis party is down, or the program encounters an error). In this regard, the target slave node may handle the failure according to its cause. For example, if the scheduling fails due to a fault related to the data analysis party (such as an unstable network or downtime of the data analysis party), the scheduling may be retried after waiting for a preset duration (such as 1 s). For another example, if the scheduling fails because the target data of any target block does not exist in the blockchain related data, the scheduling may be retried after waiting for a preset block-out duration (such as 12 s or 10 min). The target data may be absent because the blockchain network has not yet generated the target block, so that the on-chain raw data or the corresponding pre-analysis data has not yet been generated. If the retry after the block-out duration succeeds, this indicates that the target data was generated during the waiting period.
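As a rough illustration of cause-dependent retrying, the sketch below maps each failure cause to a waiting time and retries a bounded number of times. The names FailureCause, SchedulingError and schedule_with_retry are hypothetical, the waiting times reuse the example values from the text (1 s and 12 s), and the retry limit is an assumed configuration value.

```python
import time
from enum import Enum, auto
from typing import Callable


class FailureCause(Enum):
    ANALYSIS_PARTY_FAULT = auto()  # unstable network or analysis party downtime
    DATA_MISSING = auto()          # target data not yet present in the database


RETRY_WAIT_SECONDS = 1.0   # preset duration (example value from the text)
BLOCK_OUT_SECONDS = 12.0   # preset block-out duration (example value from the text)


class SchedulingError(Exception):
    def __init__(self, cause: FailureCause) -> None:
        super().__init__(cause.name)
        self.cause = cause


def wait_before_retry(cause: FailureCause) -> float:
    """Return how long to wait before retrying, depending on the failure cause."""
    if cause is FailureCause.DATA_MISSING:
        return BLOCK_OUT_SECONDS
    return RETRY_WAIT_SECONDS


def schedule_with_retry(attempt: Callable[[], None], max_retries: int = 3,
                        sleep: Callable[[float], None] = time.sleep) -> bool:
    """Run attempt(); on failure wait according to the cause and retry."""
    for attempt_index in range(max_retries + 1):
        try:
            attempt()
            return True
        except SchedulingError as err:
            if attempt_index < max_retries:  # no wait after the final failure
                sleep(wait_before_retry(err.cause))
    return False
```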

Further, if the scheduling still fails on retry, that is, the target data of any target block still does not exist after waiting for the block-out duration, the block height of that block may be recorded in a missing block height set. Then, when the number of block heights recorded in the missing block height set reaches a maximum allowable missing block number threshold, the difference between the current block height and that threshold is taken as the block height of an initial retry block, and scheduling of the target data corresponding to the initial retry block from the database to the data analysis party is retried. In this way, when the number of consecutively missing (i.e., non-existent) blocks reaches a preset value, the target slave node may return to the earlier block height and retry scheduling from there. Of course, the number of scheduling retries can also be flexibly configured to avoid falling into an infinite loop.
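The rollback rule above can be expressed in a few lines. In this sketch the function name record_missing is hypothetical, and clearing the set once a rollback is triggered is an assumption made so that the next round of counting starts fresh; the specification does not state what happens to the set.

```python
from typing import Optional, Set


def record_missing(missing: Set[int], current_height: int,
                   max_missing: int) -> Optional[int]:
    """Record a missing block height.

    Returns the block height of the initial retry block once the number of
    recorded heights reaches max_missing, otherwise None.
    """
    missing.add(current_height)
    if len(missing) >= max_missing:
        missing.clear()                      # assumption: restart the count
        return current_height - max_missing  # current height minus threshold
    return None
```

For example, with a threshold of 3 and blocks 65, 66 and 67 missing in turn, the third call returns 64, so scheduling is retried starting from block height 64.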

In one embodiment, the target slave node may schedule the target data from the database to the data analysis party in several ways. For example, the target slave node may read the target data from the database and send the data itself to the data analysis party, that is, forward it. Alternatively, since the target data is generally large, the storage address of the target data in the database may be sent to the data analysis party in order to reduce the data processing load of the target slave node, so that the data analysis party reads the target data from the database according to the storage address. To balance the data sending speed against the time the downstream data analysis party spends obtaining the target data, a block scheduling selector may be used to switch automatically between the two scheduling policies (i.e., sending the data directly and sending the storage address). The specific switching manner is described in the foregoing embodiments and is not repeated here.
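The two scheduling policies can be sketched as follows. The switching rule used here, a simple size threshold, is purely an assumption for illustration; the actual rule applied by the block scheduling selector is described in embodiments outside this section. The names choose_strategy and build_message, the message layout, and the address "blocks/60" are likewise hypothetical.

```python
from typing import Callable, Dict


def choose_strategy(data_size: int, size_threshold: int) -> str:
    """Pick a policy. Assumed rule: small data is forwarded, large data is not."""
    return "send_data" if data_size <= size_threshold else "send_address"


def build_message(strategy: str, storage_address: str,
                  read_data: Callable[[str], bytes]) -> Dict[str, object]:
    """Build what the target slave node sends to the data analysis party."""
    if strategy == "send_data":
        # The node reads the data itself and forwards it.
        return {"type": "data", "payload": read_data(storage_address)}
    if strategy == "send_address":
        # The analysis party reads the data from the database on its own.
        return {"type": "address", "payload": storage_address}
    raise ValueError(f"unknown strategy: {strategy!r}")
```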

The data analysis party described in the specification can be a server of the data scheduling system, that is, the system can be used for data analysis (that is, integrated with a data analysis function) besides data scheduling. Even the nodes in the system can also have the analysis function for the target data, for example, the target slave node can read the target data from the database and then analyze the target data by itself, and feed back the corresponding analysis result to the server side or the user of the data scheduling system. Alternatively, the data analysis party may also be another analysis party (such as a server of the third party analysis platform, or an analysis party specified by a user) connected to the data scheduling system, where the target slave node only needs to schedule the target data to the data analysis party, without paying attention to a specific analysis process. The specific analysis process of the scheduled target data is not limited in this specification, and will not be described in detail.

In the case where the data scheduling method is performed cooperatively by a master node and a target slave node (in which case the plurality of nodes in the system form a master-slave structure), the master node allocates the target scheduling task to the target slave node, which then executes it. In this scheme, the master node may access the above blockchain related data (e.g., schedule the target data in that data) and the execution related information (e.g., read the information), so as to determine and store the current execution related information of the target slave node for the task if the execution related information indicates that the target scheduling task is not being executed at the current time. It will be appreciated that the current execution related information indicates that the target slave node executes the target scheduling task within the current execution time period, so that by storing the current execution related information in the database the master node allocates the execution right of the target scheduling task within that period to the target slave node. On this basis, the master node avoids allocating the execution right of the task within the current execution time period to other slave nodes, which ensures that only the target slave node executes the task in that period. The scheme therefore avoids conflicts in which different slave nodes schedule the same target data at the same moment, thereby ensuring orderly scheduling of the target data and reducing task contention among the nodes.

Fig. 6 is a schematic block diagram of a device provided in an exemplary embodiment. Referring to fig. 6, at the hardware level, the device includes a processor 602, an internal bus 604, a network interface 606, a memory 608, and a non-volatile memory 610, and may of course also include hardware required for other functions. One or more embodiments of the present description may be implemented in software, for example by the processor 602 reading a corresponding computer program from the non-volatile memory 610 into the memory 608 and then running it. Of course, in addition to software implementations, one or more embodiments of the present description do not exclude other implementations, such as a logic device or a combination of software and hardware; that is, the execution subject of the following processing flow is not limited to logic units, and may also be hardware or a logic device.

As shown in fig. 7, fig. 7 is a block diagram of a data scheduling apparatus according to an exemplary embodiment of the present disclosure, and the apparatus may be applied to the device shown in fig. 6 to implement the technical solution of the present disclosure. The apparatus is applied to any one of a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data are stored in a database, the plurality of nodes respectively having access rights for the blockchain-related data and the execution-related information, the apparatus comprising:

a task preemption unit 701, configured to determine, when the execution related information does not include execution time information or includes latest execution time information indicating that the target scheduling task is not executed at the current time, current execution related information of the any node for the target scheduling task, and store the current execution related information as the latest execution related information of the target scheduling task in the database; the current execution related information comprises target block height information for representing a target block and current execution time information for representing a current execution time period;

and a task execution unit 702, configured to execute the target scheduling task in the current execution time period, so as to schedule target data corresponding to the target block in the blockchain related data from the database to a data analyzer.
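One way to picture the preemption performed by unit 701 is a conditional update that succeeds for only one node per execution time period. The sketch below uses Python's built-in sqlite3 module as a stand-in for the shared database; the table task_execution, its columns, and the function names are all hypothetical, and the specification does not prescribe a relational database or this particular statement.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical table holding the latest execution related info."""
    conn.execute(
        "CREATE TABLE task_execution ("
        " task_id TEXT PRIMARY KEY, node_id TEXT,"
        " start_time REAL, end_time REAL, block_range TEXT)"
    )
    conn.commit()


def try_preempt(conn: sqlite3.Connection, task_id: str, node_id: str,
                now: float, duration: float, block_range: str) -> bool:
    """Claim the task if it has no execution time info or its period has ended."""
    cursor = conn.execute(
        "UPDATE task_execution"
        " SET node_id = ?, start_time = ?, end_time = ?, block_range = ?"
        " WHERE task_id = ? AND (end_time IS NULL OR end_time < ?)",
        (node_id, now, now + duration, block_range, task_id, now),
    )
    conn.commit()
    return cursor.rowcount == 1  # exactly one row changed: preemption succeeded
```

Because the check and the write happen in a single statement, a second node that attempts the same update while the period is still open changes no rows and therefore knows it did not obtain the task.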

Optionally, the current execution time information includes a start time and an end time of the current execution time period, and the task preemption unit 701 is specifically configured to:

and calculating the starting time and the ending time according to the current running time of any node and/or the current available resource quantity.

Optionally, the task preemption unit 701 is specifically configured to:

determining the node identification of the any node as an execution node identification of the target scheduling task, wherein the execution node identification is used for representing the node for executing the target scheduling task; and/or,

and updating the task state of the target scheduling task into an executing state, wherein the task in the executing state cannot be allocated to the node for execution again.

Optionally, the apparatus further includes:

a locking unit 703, configured to allocate, for the any node, a mutex lock for the target scheduling task if the execution related information does not include execution time information or includes latest execution time information indicating that the target scheduling task is not executed at the current time.

As shown in fig. 8, fig. 8 is a block diagram of another data scheduling apparatus provided in the present specification according to an exemplary embodiment, and the apparatus may be applied to the device shown in fig. 6 to implement the technical solution of the present specification. The apparatus is applied to a master node and a target slave node among a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the master node having access rights for the blockchain-related data and the execution-related information, the apparatus comprising:

A task allocation unit 801, configured to, when the execution related information does not include execution time information or includes latest execution time information indicating that the target scheduling task is not executed at the current time, determine current execution related information of a target slave node for the target scheduling task, and store the current execution related information as latest execution related information of the target scheduling task in the database; the current execution related information comprises target block height information for representing a target block and current execution time information for representing a current execution time period;

and a task execution unit 802, configured to cause the target slave node to execute the target scheduling task in the current execution time period, so as to schedule target data corresponding to the target block in the blockchain related data from the database to a data analyzer.

Optionally, the current execution time information includes a start time and an end time of the current execution time period, and the task allocation unit 801 is specifically configured to:

calculating the starting time and the ending time according to the current running time of the target slave node and/or the current available resource quantity; or,

And calculating the starting time and the ending time of the current execution time period and the time interval between the two times respectively corresponding to each slave node according to the current operation time length and/or the current available resource quantity of each slave node, and determining the slave node with the maximum time interval as the target slave node.
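The selection rule in the second option can be sketched as below. How the start and end times are derived from a slave node's running time or available resources is not specified in this section, so the sketch takes the per-node periods as given input; the function name pick_target_slave and the node names are hypothetical.

```python
from typing import Dict, Tuple


def pick_target_slave(periods: Dict[str, Tuple[float, float]]) -> str:
    """Return the slave node whose (start, end) period has the largest interval."""
    return max(periods, key=lambda node: periods[node][1] - periods[node][0])


candidate_periods = {
    "slave-1": (0.0, 10.0),
    "slave-2": (0.0, 25.0),
    "slave-3": (5.0, 20.0),
}
print(pick_target_slave(candidate_periods))  # slave-2 (interval of 25.0)
```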

Optionally, the task allocation unit 801 is specifically configured to:

determining the node identification of the target slave node as an execution node identification of the target scheduling task, wherein the execution node identification is used for representing a node for executing the target scheduling task; and/or,

Optionally, the apparatus further includes:

and a locking unit 803, configured to allocate, by the master node, a mutex lock for the target scheduling task to the target slave node when the execution related information does not include execution time information or includes latest execution time information indicating that the target scheduling task is not executed at the current time.

The data scheduling apparatuses shown in fig. 7 and 8 further include the following means:

Optionally, the latest execution time information indicates that the target scheduling task is not executed at the current moment, including:

the current moment is later than the ending moment of the previous execution time period characterized by the latest execution time information; or,

the time length of the interval between the current time and the storage time of the latest execution time information exceeds an updating time length threshold value.
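The conditions under which the task counts as not being executed can be combined into a single check, sketched below. The specification presents the two conditions as alternatives; treating them as a combined "either one suffices" test, and the names task_is_idle, stored_at and update_threshold, are assumptions made for illustration.

```python
from typing import Optional


def task_is_idle(now: float, end_time: Optional[float],
                 stored_at: Optional[float], update_threshold: float) -> bool:
    """Decide whether the target scheduling task is not executed at `now`."""
    if end_time is None:
        return True   # no execution time information has been stored yet
    if now > end_time:
        return True   # the previous execution time period has already ended
    if stored_at is not None and now - stored_at > update_threshold:
        return True   # the stored information has not been refreshed for too long
    return False
```

The second condition lets another node take over when the executing node stops refreshing its record, for example after a crash, even though the recorded end time has not yet been reached.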

Optionally, the task preemption unit 701 and/or the task allocation unit 801 are specifically configured to:

updating the latest execution related information of the target scheduling task stored in the database into the current execution related information; or,

and storing the current execution related information and the latest execution related information of the target scheduling task in the database in a correlated manner, wherein the current execution related information is used as the latest execution related information of the target scheduling task at the current moment and in the future after the storage is completed.

Optionally, the database includes: a first database belonging to the data scheduling system and/or a second database not belonging to the data scheduling system.

Optionally, the task execution unit 702 and/or the task execution unit 802 are specifically configured to:

And sequentially scheduling target data corresponding to each target block in the block chain related data from the database to a data analysis party according to the sequence of the block heights from small to large.

reading context information of the target scheduling task from the database, and adding one to the current value of the height of the completed block in the context information;

and if the current time is earlier than the ending time of the current execution time period, determining target data of a target block corresponding to the current value of the completed block height in the block chain related data, and scheduling the target data from the database to a data analysis party.

if the scheduling fails due to the related faults of the data analysis party, retrying the scheduling after waiting for a preset time length; or,

if the scheduling fails due to the fact that the target data of any target block does not exist in the related data of the block chain, the scheduling is retried after waiting for the preset block-out time.

Optionally, the apparatus further includes:

A height recording unit 901, configured to record, if the target data of the any target block still does not exist after waiting for the block-out duration, a block height of the any target block in a missing block height set;

a scheduling retry unit 902, configured to, when the number of blocks recorded in the missing block height set reaches a maximum allowable missing block number threshold, take a difference between a current block height and the maximum allowable missing block number threshold as a block height of an initial retry block, and retry scheduling target data corresponding to the initial retry block from the database to a data analyzer.

reading the target data from the database and sending it to the data analysis party; or,

and sending the storage address of the target data in the database to the data analysis party so that the data analysis party reads the target data from the database according to the storage address.

Optionally, the blockchain network is connected to a first node of the plurality of nodes, the blockchain-related data includes on-chain raw data generated by the blockchain network during operation, and the apparatus further includes:

The data pulling unit 903 is configured to pull, by a first node, the raw data on the chain from the blockchain network, and send the pulled raw data on the chain to the database for storage.

Optionally, the blockchain-related data further includes pre-analysis data, and the apparatus further includes:

a first pre-analysis unit 904, configured to perform pre-analysis processing on the pulled original data on the chain by using a first node to obtain the pre-analysis data, and send the data to the database for storage; or,

the second pre-analysis unit 905 is configured to, when the data scheduling system further includes a server, perform pre-analysis processing on the raw data on the chain pulled by the first node to obtain the pre-analysis data, and send the data to the database for storage.

Optionally, the pre-analysis data belong to a plurality of levels and there is a dependency relationship between different pre-analysis data, and the task preemption unit 701 and/or the task allocation unit 801 are specifically configured to:

and determining target pre-analysis data of the highest level according to the data indication information in the current execution related information, and determining target pre-analysis data of each other level on which the target pre-analysis data of the highest level depends according to the dependency relationship.

In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program by themselves to "integrate" a digital system onto a single PLD, without requiring a chip manufacturer to design and fabricate an application specific integrated circuit chip.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functionality in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or even, the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.

The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation device is a server system. Of course, this specification does not exclude that as future computer technology advances, the computer implementing the functions of the above-described embodiments may be, for example, a personal computer, a laptop computer, a car-mounted human-computer interaction device, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

Although one or more embodiments of the present description provide method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one of many possible orders of execution and does not represent the only order. When implemented in an actual device or end product, the steps may be executed sequentially or in parallel (e.g., in a parallel processor or multi-threaded processing environment, or even in a distributed data processing environment) according to the methods illustrated in the embodiments or the figures. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, article, or apparatus that comprises a described element is not excluded. Words such as first and second, where used, indicate names and do not denote any particular order.

For convenience of description, the above apparatus is described as being divided into various modules by function. Of course, when implementing one or more embodiments of the present description, the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module that implements a given function may be implemented by a combination of multiple sub-modules or sub-units. The apparatus embodiments described above are merely illustrative. For example, the division into units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage, graphene storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

One skilled in the relevant art will recognize that one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

One or more embodiments of the present specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the present description may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant parts, refer to the corresponding description of the method embodiments. In the description of the present specification, a reference to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present specification. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided that they do not contradict one another.

The foregoing is merely an example of one or more embodiments of the present specification and is not intended to limit the one or more embodiments of the present specification. Various modifications and alterations to one or more embodiments of this description will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of the present specification, should be included in the scope of the claims.

Claims

1. A data scheduling method applied to any one of a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the plurality of nodes having access rights to the blockchain-related data and the execution-related information, respectively, the method comprising:

2. The method according to claim 1, wherein the current execution time information includes a start time and an end time of a current execution time period, and the any node determining the current execution time information in the current execution-related information for the target scheduling task comprises:

3. The method of claim 1, wherein the any node determining the current execution-related information for the target scheduling task comprises:

4. The method of claim 1, the method further comprising:

allocating a mutual exclusion lock for the target scheduling task to the any node if the execution-related information does not contain execution time information, or if the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment.
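The condition recited in claim 4 can be illustrated with a minimal sketch. The sketch is not the claimed implementation: the dictionary `db` stands in for the database, an in-process `threading.Lock` stands in for the mutual exclusion lock, and the names `try_claim`, `window`, and `duration` are hypothetical. It further assumes that "not executed at the current moment" means the current time lies outside the most recently recorded execution time period.

```python
import threading

# Stand-in for the mutual exclusion lock of one target scheduling task (assumption).
_task_lock = threading.Lock()

def try_claim(db: dict, task_id: str, node_id: str, now: float, duration: float) -> bool:
    """Return True if node_id may execute task_id during [now, now + duration)."""
    with _task_lock:
        info = db.get(task_id)
        window = info.get("window") if info else None
        # Assumption: the task counts as "being executed" only while `now`
        # falls inside the latest recorded execution time period.
        if window is not None and window[0] <= now < window[1]:
            return False
        # Store the current execution-related information as the latest one.
        db[task_id] = {"node": node_id, "window": (now, now + duration)}
        return True
```

In a multi-node deployment the in-process lock would have to be replaced by a lock that all nodes can see, for example one held in the shared database; that choice is outside the scope of this sketch.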

5. A data scheduling method applied to a master node and a target slave node among a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the master node having access rights for the blockchain-related data and the execution-related information, the method comprising:

6. The method of claim 5, wherein the current execution time information includes a start time and an end time of the current execution time period,

determining the current execution time information in the current execution-related information of the target slave node for the target scheduling task comprises: calculating the start time and the end time according to the current running duration and/or the current amount of available resources of the target slave node; or,

the master node determining the target slave node comprises: calculating, for each slave node and according to its current running duration and/or current amount of available resources, the start time and the end time of the current execution time period and the time interval between them, and determining the slave node with the largest time interval as the target slave node.
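The selection step of claim 6 can be sketched as follows. The claim does not state how the start time and end time are derived from the running duration and the available resources, so the rule in `execution_window` is purely an illustrative assumption, as are the names `SlaveNode`, `pick_target_slave`, and `base_seconds`.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SlaveNode:
    name: str
    running_seconds: float      # current running duration of the node
    available_resources: float  # current available resources, arbitrary unit

def execution_window(node: SlaveNode, now: float,
                     base_seconds: float = 60.0) -> Tuple[float, float]:
    """Hypothetical rule: more free resources lengthen the period,
    a longer running duration shortens it."""
    running_hours = node.running_seconds / 3600.0
    length = base_seconds * node.available_resources / (1.0 + running_hours)
    return now, now + length

def pick_target_slave(nodes: List[SlaveNode], now: float) -> SlaveNode:
    """Choose the slave node whose execution time period is the longest."""
    def interval(node: SlaveNode) -> float:
        start, end = execution_window(node, now)
        return end - start
    return max(nodes, key=interval)
```

Only the final `max` over the time interval reflects the claim language; any other mapping from running duration and resources to a time period could be substituted for `execution_window`.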

7. The method according to claim 5, wherein determining the current execution-related information of the target slave node for the target scheduling task comprises:

8. The method of claim 5, the method further comprising:

the master node allocating a mutual exclusion lock for the target scheduling task to the target slave node if the execution-related information does not contain execution time information, or if the latest execution time information it contains indicates that the target scheduling task is not being executed at the current moment.

9. The method of claim 1 or 5, wherein the latest execution time information indicating that the target scheduling task is not executed at the current moment comprises:

10. The method according to claim 1 or 5, wherein the storing the current execution related information as the latest execution related information of the target scheduling task in the database includes:

11. The method of claim 1 or 5, the database comprising: a first database belonging to the data scheduling system and/or a second database not belonging to the data scheduling system.

12. The method of claim 1 or 5, the scheduling target data corresponding to the target block in the blockchain-related data from the database to a data analyzer, comprising:

13. The method of claim 12, scheduling target data corresponding to any target block in the blockchain-related data from the database to a data analyzer, comprising:

14. The method of claim 12, scheduling target data corresponding to any target block in the blockchain-related data from the database to a data analyzer, comprising:

15. The method of claim 14, further comprising:

if the target data of the any target block still does not exist after waiting for the block generation time, recording the block height of the any target block in a missing block height set;

and when the number of block heights recorded in the missing block height set reaches a maximum allowable missing block number threshold, taking the difference between the current block height and the maximum allowable missing block number threshold as the block height of an initial retry block, and retrying to schedule the target data corresponding to the initial retry block from the database to the data analyzer.
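The bookkeeping described in claim 15 can be sketched as below, assuming the missing block height set is held as an in-memory Python set and that the function is called once per block whose target data is still absent after the wait. The function name `record_missing_block` and its parameters are hypothetical.

```python
from typing import Optional, Set

def record_missing_block(missing_heights: Set[int], block_height: int,
                         current_height: int, max_missing: int) -> Optional[int]:
    """Record a missing block height; return the height of the initial
    retry block once the threshold is reached, otherwise None."""
    missing_heights.add(block_height)
    if len(missing_heights) >= max_missing:
        # Initial retry height = current block height minus the threshold.
        return current_height - max_missing
    return None
```

For example, with a threshold of 3 and a current block height of 100, recording heights 98, 99, and 100 returns `None` twice and then 97. Whether the set is cleared after a retry is not stated in the claim, so the sketch leaves it unchanged.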

16. The method of claim 1 or 5, the scheduling target data corresponding to the target block in the blockchain-related data from the database to a data analyzer, comprising:

17. The method of claim 1 or 5, the blockchain network being connected to a first node of the plurality of nodes, the blockchain-related data including on-chain raw data generated by the blockchain network during operation, the method further comprising:

the first node pulls the on-chain raw data from the blockchain network and sends the pulled on-chain raw data to the database for storage.

18. The method of claim 17, the blockchain-related data further including pre-analysis data, the method further comprising:

the first node performs pre-analysis processing on the pulled on-chain raw data to obtain the pre-analysis data, and sends the pre-analysis data to the database for storage; or,

in the case where the data scheduling system further comprises a server, the server performs pre-analysis processing on the on-chain raw data pulled by the first node to obtain the pre-analysis data, and sends the pre-analysis data to the database for storage.

19. The method of claim 18, the pre-analysis data belonging to multiple tiers and there being a dependency between different pre-analysis data, the any node determining the target data if the target data includes multiple target pre-analysis data, comprising:

20. A data scheduling system having a blockchain network connected thereto, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, a plurality of nodes in the data scheduling system having access rights for the blockchain-related data and the execution-related information, respectively, any one of the plurality of nodes being configured to:

21. A data scheduling system having a blockchain network connected thereto, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, a plurality of nodes in the data scheduling system including a master node and at least one slave node, the master node having access rights for the blockchain-related data and the execution-related information, wherein:

22. A data scheduling apparatus applied to any one of a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the plurality of nodes having access rights to the blockchain-related data and the execution-related information, respectively, the apparatus comprising:

23. A data scheduling apparatus applied to a master node and a target slave node among a plurality of nodes included in a data scheduling system to which a blockchain network is connected, blockchain-related data of the blockchain network and execution-related information of a target scheduling task created for the blockchain-related data being stored in a database, the master node having access rights for the blockchain-related data and the execution-related information, the apparatus comprising:

24. An electronic device, comprising:

a processor; a memory for storing processor-executable instructions;

wherein the processor is configured to implement the method of any one of claims 1-19 by executing the executable instructions.

25. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of any of claims 1-19.