WO2019218814A1 - Graph data processing method, computing task publishing method for graph data, apparatus, storage medium, and computer device - Google Patents

Graph data processing method, computing task publishing method for graph data, apparatus, storage medium, and computer device

Info

Publication number
WO2019218814A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
task
blockchain network
graph
computing
Prior art date
Application number
PCT/CN2019/082192
Other languages
English (en)
French (fr)
Inventor
郑博
刘日佳
刘志斌
陈谦
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Priority to JP2020563591A priority Critical patent/JP7158801B2/ja
Priority to KR1020207031862A priority patent/KR102485652B1/ko
Priority to SG11202010651RA priority patent/SG11202010651RA/en
Priority to EP19804270.7A priority patent/EP3771179B1/en
Publication of WO2019218814A1 publication Critical patent/WO2019218814A1/zh
Priority to US16/983,948 priority patent/US11847488B2/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3239Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/0442Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply asymmetric encryption, i.e. different keys for encryption and decryption
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/12Applying verification of the received information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004Server selection for load balancing
    • H04L67/1008Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1095Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/50Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using hash chains, e.g. blockchains or hash trees
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/104Peer-to-peer [P2P] networks

Definitions

  • The present application relates to the field of graph computing technologies, and in particular to a graph data processing method, a computing task publishing method for graph data, an apparatus, a storage medium, and a computer device.
  • Graph computing abstracts real-world "graph" structures on the basis of graph theory and defines a computational model over that data structure.
  • The graph data structure expresses the correlations between data well. Many application problems can therefore be abstracted as graphs and solved with models built on graph theory.
  • Graph data can represent information such as social networks, commodity purchase relationships, road traffic networks, and communication networks, and the computing requirements placed on graph data are becoming increasingly complex.
  • The examples of the present application provide a graph data processing method, a computing task publishing method for graph data, an apparatus, a computer-readable storage medium, and a computer device.
  • An example of the present application provides a graph data processing method, executed by a computer device on which a computing node of a distributed computing node cluster is deployed, the method including:
  • the global data in the blockchain network is updated by the distributed computing node cluster;
  • the calculation task for the subgraph data is executed iteratively according to the acquired latest global data and the local data until an iteration stop condition is satisfied, at which point the calculation result is obtained.
  • An example of the present application further provides a graph data processing apparatus, including:
  • a calculation management module configured to acquire subgraph data divided from the to-be-processed graph data;
  • the calculation management module being further configured to execute a calculation task for the subgraph data and obtain corresponding global data and local data;
  • a communication module configured to write the global data into a blockchain network, the global data in the blockchain network being updated by the distributed computing node cluster;
  • the communication module being further configured to obtain the latest global data from the blockchain network;
  • the calculation management module being further configured to iteratively execute the calculation task for the subgraph data according to the acquired latest global data and the local data, and to obtain a calculation result when an iteration stop condition is met.
  • The present application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, causes the processor to perform the steps of the graph data processing method.
  • The examples of the present application also provide a computer device including a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the graph data processing method.
  • An example of the present application also provides a computing task publishing method for graph data, executed by a computer device, the method including:
  • the new block is added to the blockchain network;
  • the broadcast task information instructs the computing node to acquire the subgraph data divided from the to-be-processed graph data and to iteratively execute, based on the blockchain network, the calculation task for the subgraph data to obtain the calculation result.
  • An example of the present application further provides a computing task publishing apparatus for graph data, the apparatus including:
  • a task management module configured to acquire task information corresponding to the to-be-processed graph data;
  • the communication module being further configured to write the task information into the new block;
  • the communication module being further configured to add the new block to the blockchain network after the new block is verified by the blockchain network;
  • the communication module being further configured to broadcast the task information in the blockchain network, the broadcast task information instructing the computing node to acquire the subgraph data divided from the to-be-processed graph data and to iteratively execute, based on the blockchain network, the calculation task for the subgraph data to obtain a calculation result.
  • The examples of the present application also provide a computer-readable storage medium storing a computer program that, when executed by a processor, causes the processor to perform the steps of the computing task publishing method for graph data.
  • The examples of the present application also provide a computer device including a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the computing task publishing method for graph data.
  • FIG. 1 is an application environment diagram of a graph data processing method in some examples.
  • FIG. 2 is a schematic flowchart of a graph data processing method in some examples.
  • FIG. 3 is a schematic flowchart of the steps of obtaining task information, broadcast in a blockchain network, that corresponds to the to-be-processed graph data in some examples.
  • FIG. 5 is a schematic flowchart of a node exchange step in some examples.
  • FIG. 6 is a schematic flowchart of a graph data processing method in another example.
  • FIG. 7 is a schematic flowchart of a computing task publishing method for graph data in some examples.
  • FIG. 8 is a schematic diagram of a task editing interface in some examples.
  • FIG. 9 is a schematic structural diagram of a task execution state display interface in some examples.
  • FIG. 10 is a schematic structural diagram of a computing node status display interface in some examples.
  • FIG. 11 is a schematic flowchart of a computing task publishing method for graph data in another example.
  • FIG. 12 is a structural block diagram of a graph data processing apparatus in some examples.
  • FIG. 13 is a structural block diagram of a graph data processing apparatus in other examples.
  • FIG. 14 is a block diagram of a graph data processing apparatus in some examples.
  • FIG. 15 is a structural block diagram of a computer device in some examples.
  • FIG. 16 is a structural block diagram of a computing task publishing apparatus for graph data in some examples.
  • FIG. 17 is a structural block diagram of a computer device in some examples.
  • A graph computing method may be based on a centralized design, in which a parameter server is often used to store and update the data that needs to be shared.
  • Although the parameter server can distribute the shared data across multiple server nodes and thereby avoid a single point of failure to some extent, it relies on a centralized driver to coordinate processing when the data is updated. This often leads to very high network throughput during graph computation, which makes network communication the bottleneck of the graph computing system.
  • The traditional centralized graph computing method therefore suffers from low computational efficiency.
  • FIG. 1 is an application environment diagram of a graph data processing method in some examples.
  • The distributed graph computing system includes a control node 110, a computing node 120, and a data warehouse 130.
  • The control node 110 and the computing node 120 are connected through a blockchain network, and each of them is connected to the data warehouse 130 via a network.
  • The control node 110 can be implemented by a terminal, which may be a desktop terminal or a mobile terminal such as a mobile phone, a tablet computer, or a notebook computer.
  • The computing node 120 can be implemented by a program deployed on one or more servers, such as the computing nodes a, b, and c.
  • The data warehouse 130 can be a centralized storage device or a distributed storage cluster or device.
  • The distributed graph computing system includes a distributed computing node cluster, and the graph data processing method is applicable to the computing nodes in that cluster.
  • In some examples, a graph data processing method is provided. It is illustrated here mainly as applied to the computing nodes of the distributed computing node cluster in FIG. 1 above.
  • The graph data processing method specifically includes the following steps:
  • Graph data is structured data organized as a graph; it stores relationship information between entities by applying graph theory.
  • The to-be-processed graph data is the graph data that the computing nodes are to process.
  • The subgraph data is the part of the graph data divided from the to-be-processed graph data.
  • Graph data consists of graph nodes and the edges between graph nodes.
  • Graph nodes are the vertices of the graph data and can be used to represent the subjects in it. For example, when graph data stores information about individuals in a social network, different graph nodes represent different individuals; when graph data represents a merchandise purchase relationship, each user and each product is a graph node.
  • A graph node may include information such as its node identifier and node attributes.
  • The edges between graph nodes can be used to represent the relationships between different subjects in the graph data.
  • For example, when graph data stores information about individuals in a social network, different graph nodes represent different individuals and the edges represent relationships between individuals, such as friend relationships.
  • When graph data represents a merchandise purchase relationship, each user and each product is a graph node, and a purchase of a product by a user is an edge.
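  • A minimal Python sketch of the graph data described above, with graph nodes carrying an identifier and attributes and edges carrying a relationship. The class and field names (`GraphNode`, `GraphData`, `relation`) are illustrative assumptions and do not come from the application.

```python
from dataclasses import dataclass, field


@dataclass
class GraphNode:
    node_id: str                                # node identifier
    attrs: dict = field(default_factory=dict)   # node attributes


@dataclass
class GraphData:
    nodes: dict = field(default_factory=dict)   # node_id -> GraphNode
    edges: list = field(default_factory=list)   # (source_id, target_id, relation)

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = GraphNode(node_id, attrs)

    def add_edge(self, source_id, target_id, relation):
        # Both endpoints must already exist as graph nodes.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("unknown graph node")
        self.edges.append((source_id, target_id, relation))

    def neighbors(self, node_id):
        # Targets of all edges that start at node_id.
        return [t for s, t, _ in self.edges if s == node_id]


# Merchandise purchase relationship: users and products are graph nodes,
# and a purchase is an edge from the user to the product.
purchases = GraphData()
purchases.add_node("user:1", kind="user")
purchases.add_node("item:42", kind="product")
purchases.add_edge("user:1", "item:42", "purchased")
```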
  • Each computing node may be a program deployed on one or more servers, and the computing nodes may form a virtual network structure with the blockchain as the medium.
  • A computing node can write information such as data, calculation task execution logs, or computing node status into a new block and add the new block to the blockchain network. This facilitates communication and coordination between the computing nodes so that they complete the calculation tasks for the subgraph data and, in turn, the calculation task for the to-be-processed graph data.
  • The computing node status is the state information of the computing node, including the CPU (central processing unit) and memory resources consumed by the computing node in performing the calculation task, the algorithm name and type, the number of errors, and the like.
  • The distributed graph computing system includes a control node, which can access the blockchain network by communicating with one or more computing nodes in the distributed graph computing system.
  • The control node may broadcast the task information in the blockchain network by recording the task information corresponding to the to-be-processed graph data in a new block and adding the new block to the blockchain network.
  • The task information is information related to the calculation task.
  • The computing node can periodically check the information recorded in the latest block of the blockchain network by means of a timer, a timing program, or the like. After detecting task information, the computing node determines its own load status; when it is not overloaded, it may load the task information from the blockchain network.
  • The task information may include an acquisition path of the task executable file, the algorithm parameters for executing the calculation task, an acquisition path of the subgraph data, an output path of the calculation result, an expected completion time, and the like.
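  • The task information fields and the load check described above could be sketched as follows. The field names, the dictionary layout of a block, and the `max_load` threshold are hypothetical; the application does not specify them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskInfo:
    executable_path: str             # acquisition path of the task executable file
    algorithm_params: dict           # algorithm parameters for the calculation task
    subgraph_data_path: str          # acquisition path of the subgraph data
    result_output_path: str          # output path of the calculation result
    expected_completion_time: float  # expected completion time (unit is an assumption)


def maybe_load_task(latest_block: dict, current_load: float,
                    max_load: float = 0.8) -> Optional[TaskInfo]:
    """Load task information from the latest block unless this node is overloaded."""
    raw = latest_block.get("task_info")
    if raw is None or current_load >= max_load:
        return None
    return TaskInfo(**raw)


block = {"task_info": {
    "executable_path": "tasks/pagerank.bin",
    "algorithm_params": {"damping": 0.85},
    "subgraph_data_path": "graphs/social",
    "result_output_path": "results/social",
    "expected_completion_time": 3600.0,
}}
task = maybe_load_task(block, current_load=0.35)
```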
  • According to its own load state, the computing node may randomly obtain a certain number of graph nodes and the corresponding edges from the graph structure of the to-be-processed graph data to form a subgraph structure, and then obtain the subgraph data corresponding to that subgraph structure from the data warehouse according to the acquisition path of the subgraph data recorded in the task information.
  • The distributed graph computing system may also divide the to-be-processed graph data into a corresponding number of pieces of subgraph data according to the current number of computing nodes and a preset condition, such as evenly or randomly.
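  • One way to divide graph nodes into as many parts as there are computing nodes, either evenly or randomly, is sketched below. Partitioning by node identifier and assigning each edge to the part that holds its source node are assumptions made for the example, not rules stated in the application.

```python
import random


def partition_nodes(node_ids, num_compute_nodes, mode="average", seed=None):
    """Split node ids into num_compute_nodes parts whose sizes differ by at most one."""
    if num_compute_nodes < 1:
        raise ValueError("need at least one computing node")
    ids = list(node_ids)
    if mode == "random":
        random.Random(seed).shuffle(ids)
    elif mode != "average":
        raise ValueError("mode must be 'average' or 'random'")
    # Round-robin assignment keeps the parts balanced.
    return [ids[i::num_compute_nodes] for i in range(num_compute_nodes)]


def subgraph_edges(edges, part):
    """Edges whose source node belongs to the given part (an assumed ownership rule)."""
    owned = set(part)
    return [(s, t) for s, t in edges if s in owned]


parts = partition_nodes(range(7), 3)
edges = [(0, 1), (1, 2), (3, 0), (5, 6)]
first_subgraph = subgraph_edges(edges, parts[0])
```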
  • The calculation task is the workload that must be computed when processing the to-be-processed graph data.
  • Global data is data that needs to be globally shared and globally updated while each computing node computes on its subgraph data.
  • The size of the global data is limited.
  • Local data is data that is used and updated by only a few computing nodes while each computing node computes on its subgraph data.
  • The computing node may perform the corresponding calculation on the graph nodes and edges in the acquired subgraph data to obtain the corresponding global data and local data.
  • Each graph node in the subgraph data may correspond to one unit calculation task.
  • The unit calculation task is the smallest unit of computation allocation in the distributed graph computing system, such as the calculation task corresponding to one graph node in the current iteration.
  • The computing node pulls the task executable file from the data warehouse according to the acquisition path recorded in the task information, together with the algorithm parameters for executing the calculation task. After acquiring the subgraph data, the computing node runs the pulled executable file with those algorithm parameters to perform the calculation task for the subgraph data.
  • The computing node may assign initial values to the global data and the local data either randomly or according to historical experience.
  • The computing node runs the task executable file and updates the global data and the local data according to the algorithm parameters and the initial values of the global data and the local data.
  • The calculation on the to-be-processed graph data is iterative, and as the algorithm iterates, new calculation tasks are generated in each iteration.
  • Driven by the algorithm, each graph node in the subgraph data can continuously generate unit calculation tasks that need to be completed.
  • After the computing node performs the calculation task for the subgraph data and obtains the corresponding global data and local data, the data may be stored in isolation by task and by graph node.
  • The global data is written into the blockchain network; the global data in the blockchain network is updated by the distributed computing node cluster.
  • Blockchain technology (BT), also known as distributed ledger technology, is an Internet database technology characterized by decentralization, openness, and transparency, so that everyone can participate in database records.
  • Blockchain technology uses blockchain data structures to validate and store data, uses distributed node consensus algorithms to generate and update data, uses cryptography to secure data transfer and access, and utilizes intelligent script code
  • A distributed computing node cluster is a collection of distributed computing nodes.
  • After a computing node performs the calculation task for the subgraph data and obtains the corresponding global data and local data, the global data can be shared within the distributed computing node cluster through the blockchain network.
  • The computing node can write the global data into a new block and, after the new block is verified by the blockchain network, add the new block to the end of the blockchain, thereby writing the global data into the blockchain network.
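  • The write path above (put the global data in a new block, have it verified, append it to the end of the chain) might look like the following sketch. Verification by the blockchain network is replaced here by a local check of the index and the hash link; the real consensus procedure is not described in this passage, and the `Chain` class and block layout are hypothetical.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class Chain:
    def __init__(self):
        # Genesis block with an all-zero previous hash.
        self.blocks = [{"index": 0, "prev_hash": "0" * 64, "data": {}}]

    def new_block(self, data):
        tail = self.blocks[-1]
        return {"index": tail["index"] + 1,
                "prev_hash": block_hash(tail),
                "data": data}

    def verify(self, block):
        # Stand-in for network verification: the block must extend the current tail.
        tail = self.blocks[-1]
        return (block["index"] == tail["index"] + 1
                and block["prev_hash"] == block_hash(tail))

    def append(self, block):
        if not self.verify(block):
            return False
        self.blocks.append(block)
        return True


chain = Chain()
block = chain.new_block({"global_data": {"rank_sum": 1.0}})
accepted = chain.append(block)
replayed = chain.append(block)  # same block again no longer extends the tail
```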
  • Each computing node in the distributed graph computing system can perform the calculation task on its subgraph data asynchronously and obtain its own global data and local data.
  • Because the computing nodes work asynchronously, each can write its global data into the blockchain network when it completes its calculation task, according to its own situation, and the global data in the blockchain network is jointly updated by the distributed computing node cluster.
  • The computing node may hash the global data to obtain a corresponding string, encrypt the string with a preset private key to generate a signature, and then write the global data and the signature into the blockchain network.
  • After another node obtains the global data and the signature, it can obtain the corresponding public key from the data warehouse and verify the validity of the signature. After verification passes, it caches the global data and updates its local global data.
  • The blockchain network described above may consist of a private blockchain, and the keys required to record data in the blockchain network may be generated according to an asymmetric encryption algorithm standard, such as the elliptic curve encryption algorithm or the RSA encryption algorithm (an asymmetric encryption algorithm).
  • Computing nodes can access the data warehouse to obtain the relevant keys when they need to use them.
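  • The hash-then-sign flow can be illustrated with textbook RSA built only from the Python standard library. This is a toy under loud assumptions: the key is far too small to be secure, there is no padding, the digest is reduced modulo the toy modulus, and the keys are module constants rather than being fetched from a data warehouse. A real system would use a vetted cryptography library.

```python
import hashlib
import json

# Toy key pair from two known Mersenne primes (insecure, illustration only).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(global_data):
    """Hash the global data to an integer, reduced to fit the toy modulus."""
    payload = json.dumps(global_data, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def sign(global_data):
    """Signature = digest raised to the private exponent."""
    return pow(digest(global_data), D, N)


def verify(global_data, signature):
    """Recover the digest with the public exponent and compare."""
    return pow(signature, E, N) == digest(global_data)


data = {"rank_sum": 1.5}
signature = sign(data)
is_valid = verify(data, signature)
is_tampered_valid = verify({"rank_sum": 2.5}, signature)
```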
  • The computing node may also record in a new block all relevant data produced between the previous block and the current time, such as the current global data, calculation task execution logs, or computing node status, and add the new block to the blockchain network.
  • A computing node may share related data, such as the global data recorded in a new block, with all computing nodes in the distributed graph computing system by actively broadcasting the data recorded in the new block.
  • The related data recorded in the new block can also be diffused in the form of requests.
  • Diffusion in the form of requests means that a computing node acquires the related data recorded in the new block by sending a request.
  • The graph structure corresponding to the to-be-processed graph data is usually a sparse graph: the number of edges M is much smaller than the number of pairwise combinations of the N graph nodes, N(N-1)/2. As a result, the global data that needs to be globally shared is usually limited in size, and most of the data has a certain locality and is used only in the calculations of a few computing nodes.
  • The distributed graph computing system therefore shares the global data globally through the blockchain network, while local data is cached on the corresponding computing node and obtained on request, which avoids a large amount of unnecessary network communication overhead.
  • The computing node can obtain the latest global data recorded in a new block through the blockchain network.
  • The computing node can cache the latest global data from the blockchain network by periodically checking for new blocks.
  • The computing node can obtain the global data in the new block.
  • The computing node caches the acquired global data locally.
  • Driven by a timer, the computing node can replace the oldest global data. The update strategy is similar to the common FIFO (first in, first out) and LFU (least frequently used) replacement strategies and is not described here.
  • The computing node may periodically detect the number of blocks in the blockchain network. When the number of blocks increases, it acquires the global data recorded in the new block and updates its local global data accordingly.
  • The computing node can periodically detect block data in the blockchain network by a timer or by running a preset timing detection program. When the number of blocks increases, the computing node can obtain the global data recorded in the new block by request and update its local global data accordingly. The global data is thus updated quickly and conveniently, which improves the efficiency of subsequent processing of the subgraph data.
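  • A sketch of the detection step: compare the number of blocks with the number already seen and merge the global data of any new blocks into the local cache. A timer would call `sync_global_data` periodically; the list-of-dicts chain and the function name are assumptions for illustration.

```python
def sync_global_data(blocks, seen_count, local_cache):
    """Merge global data from blocks added since the last check.

    Returns the new block count so the caller can pass it to the next check.
    """
    for block in blocks[seen_count:]:
        local_cache.update(block.get("global_data", {}))
    return len(blocks)


chain = [{"global_data": {"rank_sum": 1.0}}]
cache = {}
seen = sync_global_data(chain, 0, cache)      # first check picks up one block

chain.append({"global_data": {"rank_sum": 1.5, "step": 2}})
seen = sync_global_data(chain, seen, cache)   # block count increased, cache updated
```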
  • a computing node may generate a new iteratively calculated computing task based on global data and local data.
  • the computing node When the computing node generates a new round of iterative computing computing tasks, the latest global data is obtained from the blockchain network in the form of a data request, and the required local data is obtained from the adjacent computing nodes, according to the latest acquired data.
  • the global data, the local data at adjacent compute nodes, and the local local data generate a new round of iterative computing computing tasks.
  • S210 Iteratively perform a calculation task for the sub-picture data according to the obtained latest global data and local data, and obtain a calculation result when the iteration stop condition is satisfied.
  • the iteration stop condition is the condition for ending the iterative calculation.
  • the iterative stop condition may be that the preset number of iterations is reached, or the iterative calculation reaches a preset duration, or the calculation result obtained by the iterative calculation converges to a stable value.
  • the computing node may perform the calculation task for the sub-picture data in an iterative manner according to the latest global data and the local data that is acquired. If the iteration stop condition is not satisfied after the calculation task for the sub-picture data is performed, the process returns to step S204 to continue. The calculation ends until the iteration stop condition is met.
  • the graph nodes in the subgraph data can be driven by the algorithm to continuously generate computational tasks that require computation.
  • The computing node can determine the computing tasks still to be processed in the subgraph data from the subgraph data, the number and content of the completed computing tasks, the latest global data, and the local data. For example, driven by the algorithm, the computing node may generate the computing task for the next iteration from the subgraph data, the completed computing tasks of the previous s time points on which the current iteration's computing task depends, the corresponding latest global data, and the current local data. The computing node performs the generated computing tasks and updates the global data and local data. This loop repeats until the iteration stop condition is met and the calculation result is obtained.
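The three iteration stop conditions named above (a preset number of iterations, a preset duration, or convergence to a stable value) can be sketched in one loop. The function name `iterate`, its parameters, and the scalar state are illustrative assumptions; a real computing node would iterate over subgraph data rather than a single number.

```python
import time


def iterate(step, state, max_iters=100, max_seconds=5.0, tol=1e-6):
    """Repeat `step` until one of the iteration stop conditions is met."""
    start = time.monotonic()
    for i in range(1, max_iters + 1):
        new_state = step(state)
        converged = abs(new_state - state) < tol  # result has settled to a stable value
        state = new_state
        if converged:
            return state, i, "converged"
        if time.monotonic() - start > max_seconds:  # preset duration reached
            return state, i, "timeout"
    return state, max_iters, "max_iters"  # preset number of iterations reached


# Example: the fixed-point iteration x <- (x + 2/x) / 2 converges to sqrt(2).
result, iters, reason = iterate(lambda x: (x + 2 / x) / 2, 1.0)
```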
  • Taking model training as an example application of the graph data processing method: the graph data can be input to a machine learning model to determine an intermediate prediction result for each graph node in the graph data. The model parameters of the machine learning model are adjusted according to the difference between the intermediate prediction result and the label corresponding to the graph data, and training continues until the iteration stop condition is met.
  • the model parameters obtained by the training may include global data and local data.
  • During model training, each round of training uses the model parameters obtained from the previous round, and the parameters obtained in that round are in turn used for the next round, and so on.
  • the consistency requirements for the data vary greatly depending on the algorithm.
  • The global data obtained by a computing node usually has some data delay, so strong consistency cannot be guaranteed; however, the error in the global parameters stays within a controllable range. Algorithms such as stochastic gradient descent are not sensitive to this.
  • Because the local data is stored locally or exchanged directly through point-to-point communication, its consistency can be well guaranteed and there is no data delay.
  • When the computing node has iteratively performed the computing task for the subgraph data according to the latest global data and local data obtained, and the iteration stop condition is satisfied, the calculation result corresponding to the subgraph data is obtained.
  • Each computing node can store the local data of each subgraph and the calculation result into the data warehouse.
  • the data warehouse can consistently integrate the global data in the blockchain network, the local data of each subgraph, and the calculation results corresponding to each subgraph.
  • The above graph data processing method greatly improves the efficiency of processing graph data by dividing the graph data to be processed into subgraph data for distributed processing. The global data that must be shared by all computing nodes during distributed computation is written into the blockchain network and shared globally through it, which greatly reduces the communication volume needed for data sharing. Moreover, during iterative calculation on the graph data to be processed, the latest global data can be obtained quickly and directly from the blockchain network, alongside the locally held local data, without a centralized driver to coordinate data updates, which further improves the efficiency of processing graph data.
  • Step S202 specifically includes: acquiring the task information, broadcast in the blockchain network, that corresponds to the graph data to be processed; and, according to the task information, reading the corresponding task executable file and the subgraph data divided from the graph data to be processed.
  • Step S204 specifically includes: executing the task executable file to perform the computing task on the subgraph data, and obtaining the corresponding global data and local data.
  • The computing node may obtain, through the blockchain network, the task information corresponding to the graph data to be processed that is broadcast in the blockchain network, and, according to the task information, read the corresponding task executable file and the subgraph data divided from the graph data to be processed from local storage or from the data warehouse. It then executes the task executable file to perform the computing task for the subgraph data.
  • the distributed graph computing system includes a control node, and the control node can record the task information corresponding to the graph data to be processed in the new block, and add the new block to the blockchain network.
  • the task information is broadcast in the blockchain network.
  • All of the computing nodes in the blockchain network can receive the task information corresponding to the graph data to be processed through the broadcast of the new block.
  • The computing node can periodically check the information recorded in the latest block in the blockchain network by means of a timer, a timing program, or the like, and thereby actively acquire the task information corresponding to the graph data to be processed that is broadcast in the blockchain network.
  • The task information corresponding to the graph data to be processed that is broadcast in the blockchain network is obtained; the corresponding task executable file and subgraph data are then read according to the task information, and the task executable file is executed to carry out the step of performing the computing task.
  • the task information can be a very light file, so that the task release through the blockchain network can greatly reduce the traffic of task release and acquisition in the distributed graph computing system, and greatly improve the efficiency of task publishing.
  • The step of acquiring the task information corresponding to the graph data to be processed that is broadcast in the blockchain network specifically includes the following steps:
  • The task information corresponding to the graph data to be processed that is broadcast in the blockchain network is periodically detected.
  • The computing node may periodically detect the task information corresponding to the graph data to be processed that is broadcast in the blockchain network by using a timer, a timing program, or the like.
  • the control node broadcasts the task information through the blockchain network by recording the task information in the new block.
  • When task information is detected, the load state of the current computing node may be determined.
  • the preset load condition is a preset condition, such as the load amount is less than a preset threshold, or the load amount is within a preset range.
  • the computing node can determine whether the current load state meets the preset load condition. When the load state of the computing node meets the preset load condition, for example, when the computing node is in an idle state, the computing node can pull the task information. When the load state of the computing node does not meet the preset load condition, such as when the computing node is in an overload state, the computing node may ignore the task information.
  • By periodically detecting the task information corresponding to the graph data to be processed that is broadcast in the blockchain network, the latest task information can be obtained in real time.
  • Task information is pulled only when the load state of the computing node meets the preset load condition. This ensures that a computing node that pulls task information has idle resources to process the corresponding computing task, avoids invalid pulls, and further improves the efficiency of graph data processing.
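The load-gated pull described above might look like the following sketch. The threshold value, the block layout (`{"tasks": [...]}`), and the function names are assumptions made for illustration only.

```python
def should_pull(load: float, threshold: float = 0.7) -> bool:
    """Preset load condition (assumed form): pull only when load is below a threshold."""
    return load < threshold


def detect_tasks(latest_block: dict, load: float, seen: set) -> list:
    """Return the task ids from the newest block that this node decides to pull."""
    pulled = []
    for task_id in latest_block.get("tasks", []):
        if task_id in seen:
            continue  # already handled in an earlier polling cycle
        seen.add(task_id)
        if should_pull(load):
            pulled.append(task_id)
        # otherwise the node is overloaded and ignores this task information
    return pulled
```

For example, an idle node (`load=0.2`) pulls every new task it detects, while an overloaded node (`load=0.9`) records the tasks as seen but pulls none of them.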
  • step S206 specifically includes the steps of: creating a new block; writing the global data to the created new block; and adding the new block to the blockchain network after the new block is verified by the blockchain network.
  • The computing node may periodically create a new block and write into it the global data obtained after performing the computing task on the subgraph data.
  • After the new block is verified by the blockchain network, the new block is added to the blockchain network.
  • The computing node may generate a new block according to a corresponding consensus algorithm, for example a consensus hash algorithm.
  • the compute node can write the global data corresponding to the completed compute task to the new block.
  • the compute node can write global data, computation task execution logs, exchange data for graph nodes, or compute node status to new blocks.
  • The number of bits of the consensus hash algorithm can be reduced to improve processing efficiency and system throughput, for example by using a 128-bit SHA (Secure Hash Algorithm) digest instead of a 256-bit one. The consensus algorithm can also be changed to PoS (Proof of Stake), in which coin age determines the generator of the next new block.
  • The number of leading zeros required in the consensus hash can also be used to control difficulty. As with bit-length control, the more leading zeros the hash result is required to have, the harder the consensus problem.
  • the blockchain network may employ a corresponding consensus algorithm to verify the new block. After the new block is verified by the blockchain network, the compute node may join the new block to the blockchain network.
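The relationship between the required number of leading zeros and the difficulty of producing a block can be illustrated with a small hash-based sketch. It uses SHA-256 from Python's standard `hashlib` and counts leading zeros in the hexadecimal digest; the block layout and the function names `mine_block` and `verify_block` are hypothetical and do not describe the patent's actual block format.

```python
import hashlib
import json


def mine_block(prev_hash: str, payload: dict, zero_prefix: int = 2) -> dict:
    """Search for a nonce whose SHA-256 hex digest starts with `zero_prefix` zeros."""
    body = json.dumps(payload, sort_keys=True)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{prev_hash}{body}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * zero_prefix):
            return {"prev": prev_hash, "payload": payload, "nonce": nonce, "hash": digest}
        nonce += 1


def verify_block(block: dict, zero_prefix: int = 2) -> bool:
    """Recompute the hash, as a verifying node would before accepting the block."""
    body = json.dumps(block["payload"], sort_keys=True)
    digest = hashlib.sha256(f"{block['prev']}{body}{block['nonce']}".encode()).hexdigest()
    return digest == block["hash"] and digest.startswith("0" * zero_prefix)


# Hypothetical payload: global data plus a computing task execution log.
block = mine_block("00" * 32, {"global_data": {"loss": 0.42}, "log": ["task-1 done"]})
```

Each additional required hexadecimal zero multiplies the expected number of attempts by 16, which is why a smaller prefix raises throughput at the cost of weaker consensus difficulty.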
  • Each computing node can cache the corresponding data locally for a period of time.
  • By writing the global data into a new block and adding the block to the blockchain network, the global data can be shared in the blockchain network, which greatly reduces the amount of communication required to share the global data.
  • the step of writing global data to the created new block includes writing the global data and the corresponding computing task execution log to the new block.
  • The graph data processing method further includes a step of reconstructing lost data, which specifically includes:
  • S402: Determine the faulty computing node and the corresponding lost data according to the computing task execution log in the blockchain network.
  • the calculation task execution log is log data generated when the calculation task is executed.
  • When the computing node fails to acquire local data from a neighboring node, the corresponding computing task execution log may be pulled from the blockchain network and collated as needed to determine the faulty computing node and the corresponding lost data.
  • the compute nodes involved in reconstructing the lost data are typically neighboring compute nodes of the failed node.
  • A computing node adjacent to the faulty computing node may, according to its own load state, take over part of the subgraph data structure of the faulty computing node, obtain the corresponding partial subgraph data from the data warehouse according to that structure, and combine it to form new subgraph data.
  • the compute node may obtain local data associated with the faulty compute node from other compute nodes involved in reconstructing the lost data.
  • The computing node may also search its locally cached local data for local data related to the faulty computing node and share what it finds with the computing nodes participating in reconstructing the lost data, so that those computing nodes can reconstruct the lost data.
  • the sharing method may be that the computing node actively sends the local data related to the fault computing node to the corresponding computing node, or the computing node that needs the data initiates the data acquiring request to obtain the corresponding local data.
  • A computing node participating in reconstructing the lost data can also be an idle computing node with no load.
  • the computing node participating in the reconstruction of the lost data may acquire local data related to the faulty computing node shared by the computing node adjacent to the faulty computing node, and reconstruct the lost data according to the shared local data.
  • the computing node can reconstruct the lost data according to the principle of minimum error.
  • the computing node can reconstruct the lost data by interpolation according to the obtained local data.
  • Commonly used interpolation methods include the moving average window method, the regression method, and the interpolation function method; alternatively, the mean, median, or mode of the obtained local data can be used as the reconstructed value of the lost data.
  • By recording the computing task execution log in the blockchain network, the faulty node and the corresponding lost data can be located quickly. The lost data can then be reconstructed quickly from the local data shared by the computing nodes participating in the reconstruction, which prevents a single point of failure at one computing node from affecting the overall calculation and gives the distributed graph processing system as a whole high reliability.
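Two of the reconstruction options mentioned above, a summary statistic of the neighbors' shared values and interpolation between known values, can be sketched as follows. The function names and the flat list representation of local data are assumptions for illustration.

```python
from statistics import mean, median


def reconstruct(neighbor_values: list, method: str = "mean") -> float:
    """Estimate a lost value from local data shared by neighboring computing nodes."""
    if not neighbor_values:
        raise ValueError("no shared local data to reconstruct from")
    if method == "mean":
        return mean(neighbor_values)
    if method == "median":
        return median(neighbor_values)
    raise ValueError(f"unknown method: {method}")


def linear_interpolate(series: list) -> list:
    """Fill None gaps by linear interpolation between the nearest known values."""
    known = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled
```

Gaps before the first or after the last known value are left as `None` in this sketch, since linear interpolation needs a known value on both sides.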
  • The graph data processing method further includes a step of exchanging graph nodes, which specifically includes:
  • the blockchain data is data recorded in the blockchain network, and the new block in the blockchain network can share the recorded information by means of broadcast.
  • the quantized value is a value obtained by quantizing a calculation task, such as a unit of calculation task corresponding to one unit of quantized value.
  • The first quantized value is obtained by quantizing the completed computing tasks in the computing node for which corresponding blockchain data has been formed.
  • The first quantized value can measure the computing capability of the computing node and corresponds to one portion of the completed computing tasks.
  • the first quantized value may specifically be a resource that is circulated and exchanged in a blockchain network, and may be referred to as a currency value or a virtual currency value.
  • The global data corresponding to a portion of the completed computing tasks may be recorded in the blockchain network to form corresponding blockchain data; the quantized value corresponding to those completed computing tasks is the first quantized value.
  • The computing node may also record all relevant data from the previous block up to the current time, such as the current global data, computing task execution logs, or computing node status, in a new block, add the new block to the blockchain network, and obtain the corresponding first quantized value.
  • the second quantized value is a value obtained by quantizing the completed computing task without forming corresponding blockchain data in the computing node.
  • The second quantized value can measure the computing capability of the computing node and corresponds to the other portion of the computing tasks that the node has completed.
  • The second quantized value may be a resource that the computing node currently holds and that can be redeemed for the first quantized value; it may be referred to as an asset value.
  • Once the corresponding blockchain data is formed, the asset value can be converted into the same amount of currency value circulating in the blockchain network.
  • After the computing node performs the computing task for the subgraph data, the corresponding global data and local data are obtained.
  • The computing node quantizes the completed computing task to obtain a second quantized value.
  • That is, once a computing task is completed, the second quantized value corresponding to the completed computing task can be obtained.
  • the computing node obtains the historical second quantized value corresponding to the completed computing task before writing the global data to the new block.
  • the historical second quantized value corresponds to the completed computing task before the global data is written to the new block.
  • By writing the global data to a new block to form the corresponding blockchain data, the corresponding historical second quantized value can be converted into the first quantized value.
  • In this way, a second quantized value that is redeemable in the future is quickly and conveniently converted into a corresponding first quantized value that can be circulated.
  • the sum of the first quantized value and the current second quantized value can represent the computing power of the computing node.
  • the third quantized value is a value obtained by quantizing an unfinished computing task in the computing node.
  • the third quantized value may be a value corresponding to the computing task to be calculated by the computing node, and may be referred to as a liability value, which may be a measure of the load state of the computing node.
  • When an unfinished computing task is completed, the corresponding third quantized value may be converted into the same amount of second quantized value, which may later be converted into the same amount of first quantized value.
  • the computing node can obtain the current unfinished computing task in real time, and determine a corresponding third quantized value according to the unfinished computing task.
  • the first quantized value, the second quantized value, and the third quantized value have the same units, all corresponding to the computing task.
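The conversions among the three quantized values can be summarized as a small bookkeeping sketch: unfinished work (third, "liability") becomes completed but unrecorded work (second, "asset") when a task finishes, and becomes recorded work (first, "currency") when the global data is written to a block. The class `Ledger` and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ledger:
    first: float = 0.0   # "currency": completed work already recorded on chain
    second: float = 0.0  # "asset": completed work not yet recorded on chain
    third: float = 0.0   # "liability": work still to be done

    def add_task(self, units: float) -> None:
        self.third += units

    def complete_task(self, units: float) -> None:
        units = min(units, self.third)  # cannot complete more than is outstanding
        self.third -= units
        self.second += units

    def write_block(self) -> None:
        """Writing global data to a new block redeems the asset value as currency."""
        self.first += self.second
        self.second = 0.0

    @property
    def computing_power(self) -> float:
        return self.first + self.second
```

Because all three values share the same unit, each conversion preserves the amount, and `first + second` stays a consistent measure of the work the node has completed.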
  • the total task corresponding to the graph data to be processed is constantly changing, and the subtasks generated by the subgraph data in each computing node are also constantly changing.
  • the compute node can iteratively generate computational tasks based on the subgraph data. As the algorithm is iteratively updated, new computational tasks are generated during each iteration.
  • Each graph node in the subgraph data can be driven by the algorithm to continuously generate unit computing tasks that need to be calculated.
  • The computing node may determine the uncompleted computing tasks in the subgraph data based on the subgraph data, the number of completed computing tasks, the content of the completed computing tasks, the global data shared in the blockchain network, and the local data, and thereby determine the third quantized value corresponding to the uncompleted computing tasks.
  • the equalization condition is a condition that is preset to measure the equilibrium relationship between the current computing power and the load state of the computing node.
  • An example of the equalization condition is that the contrast value of the second sum (the sum of the current third quantized values of the subgraph data) relative to the first sum (the sum of the current second quantized values of the subgraph data) is within a specified range.
  • The contrast value is a measure of the difference between two values and can be determined by mathematical calculation, for example by dividing the two numbers directly, taking logarithms and then dividing, subtracting, or applying another operation before taking logarithms and dividing.
  • the contrast value measures the difference state of one value relative to another.
  • When the second quantized value and the third quantized value do not meet the equalization condition, graph nodes and first quantized values may be exchanged between the computing nodes so that the second quantized value and the third quantized value again meet the equalization condition.
  • the specified range corresponding to the equalization condition may be a preset fixed range or a range determined by a function that changes with time.
  • the computing node may obtain a first sum of the current second quantized values of the sub-picture data, and obtain a second sum of the current third quantized values of the sub-picture data.
  • The contrast value of the second sum relative to the first sum is determined; when the contrast value is outside the specified range, the second quantized value and the third quantized value do not meet the equalization condition.
  • the compute node may calculate a comparison of the second sum relative to the first sum according to the following formula:
  • where a and m are constants, with a > 0, a ≠ 1, and m ≥ 1; the formula also uses the second sum of the current third quantized values of the subgraph data and the first sum of the current second quantized values of the subgraph data.
  • The contrast value is obtained by dividing the logarithm of the second sum by the logarithm of the first sum.
  • The minimum value μ(t) and the maximum value λ(t) of the specified range are linearly decreasing functions of time t, with μ(t) < λ(t).
  • the following formula can be used to indicate that the comparison value is within the specified range:
  • the constant a can be 10 and the constant m is 1, then the above formula can be simplified to
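A form of the contrast value consistent with the definitions above (a logarithm base a, an offset m, and the ratio of the logarithms of the two sums) can be written as below. This is a reconstruction under stated assumptions rather than the formula as filed: the symbols D_i (third quantized value of graph node i), C_i (second quantized value of graph node i), and r(t) are assumed notation, as are μ(t) and λ(t) for the lower and upper bounds of the specified range and the non-strict inequalities.

```latex
% Assumed general form of the contrast value and the range condition
r(t) = \frac{\log_a\!\left(m + \sum_i D_i\right)}{\log_a\!\left(m + \sum_i C_i\right)},
\qquad \mu(t) \le r(t) \le \lambda(t)

% With a = 10 and m = 1 the same expression reduces to
r(t) = \frac{\lg\!\left(1 + \sum_i D_i\right)}{\lg\!\left(1 + \sum_i C_i\right)}
```

With m ≥ 1 the argument of each logarithm is at least 1 whenever the sums are non-negative, so both logarithms are non-negative for a > 1.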
  • When the contrast value is less than the minimum value of the specified range, the first quantized value of the current computing node is exchanged for graph nodes of another computing node so as to keep the contrast value within the specified range; when the contrast value is greater than the maximum value of the specified range, graph nodes of the current computing node are exchanged for the first quantized value of another computing node, likewise to keep the contrast value within the specified range.
  • Specifically, when the contrast value is that of the second sum relative to the first sum: if the contrast value is less than the minimum value of the specified range, the first quantized value of the current computing node is exchanged for graph nodes of another computing node; if the contrast value is greater than the maximum value of the specified range, graph nodes of the current computing node are exchanged for the first quantized value of another computing node, so as to keep the contrast value within the specified range.
  • The contrast value may also be that of the first sum relative to the second sum. In that case, if the contrast value is greater than the maximum value of the specified range, the first quantized value of the current computing node is exchanged for graph nodes of another computing node; if the contrast value is less than the minimum value of the specified range, graph nodes of the current computing node are exchanged for the first quantized value of another computing node, so as to keep the contrast value within the specified range.
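The decision rule for the case where the contrast value is taken as the second sum relative to the first sum can be sketched as follows. The logarithm-ratio form with base `a` and offset `m` is an assumed reading of the description, and the function names and returned labels are hypothetical.

```python
import math


def contrast(second_sum: float, third_sum: float, a: float = 10.0, m: float = 1.0) -> float:
    """Assumed contrast value: ratio of logarithms of load (third) to capability (second)."""
    denominator = math.log(m + second_sum, a)
    if denominator == 0:
        return math.inf  # no unrecorded completed work to compare against
    return math.log(m + third_sum, a) / denominator


def decide(value: float, low: float, high: float) -> str:
    """Map the contrast value onto the exchange described in the text."""
    if value < low:
        # Too little load for the node's capability: buy graph nodes with currency.
        return "offer_first_quantized_value_for_graph_nodes"
    if value > high:
        # Too much load: hand graph nodes to another node in return for currency.
        return "offer_graph_nodes_for_first_quantized_value"
    return "balanced"
```

For instance, with a second sum of 99 and a third sum of 9 the contrast value is lg(10)/lg(100) = 0.5, which falls below a range of [0.8, 1.2], so the node would seek additional graph nodes.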
  • The step of exchanging graph nodes and first quantized values between computing nodes specifically includes: determining, among the computing nodes involved in the exchange, the first party that provides the graph node to be exchanged and the second party that receives it in exchange; determining the first party's first estimated quantized value of the graph node to be exchanged; determining the second party's second estimated quantized value of the graph node to be exchanged; determining, from the first estimated quantized value and the second estimated quantized value, the first quantized value to be exchanged for the graph node; and exchanging the graph node for the determined first quantized value between the computing nodes involved.
  • The estimated quantized value is an estimate of the value obtained by quantizing the uncompleted computing tasks owned by the graph node to be exchanged.
  • The first estimated quantized value is the value at which the first party, which provides the graph node to be exchanged, quantizes the uncompleted computing tasks owned by that graph node.
  • The second estimated quantized value is the value at which the second party, which receives the graph node to be exchanged, quantizes the uncompleted computing tasks owned by that graph node.
  • The computing node may determine the first party's first estimated quantized value of the graph node to be exchanged according to the relationship between the second quantized value corresponding to the graph node to be exchanged and the second quantized value corresponding to the first party, and the relationship between the third quantized value corresponding to the graph node to be exchanged and the third quantized value corresponding to the first party.
  • the compute node can calculate the first estimated quantized value by the following formula:
  • v_i represents the graph node to be exchanged
  • k represents the first party providing the graph node to be exchanged
  • y_1(v_i, k) represents the first party's first estimated quantized value of the graph node to be exchanged
  • a third quantized value representing a graph node to be exchanged
  • C_i represents the second quantized value of the graph node to be exchanged
  • α and β are the corresponding parameters
  • e is the natural constant.
  • When the computing node is the second party that receives the graph node to be exchanged, determining the second estimated quantized value may take into account that, after the exchange, the second party's computing node takes on additional computing tasks and reduces its communication distance. Therefore, the computing node may determine the second party's second estimated quantized value of the graph node to be exchanged according to the third quantized value of the graph node to be exchanged and the communication distance between the graph node to be exchanged and the graph nodes in the second party. For example, the computing node can calculate the second estimated quantized value according to the following formula:
  • v_i represents the graph node to be exchanged
  • l represents the second party receiving the graph node to be exchanged
  • y_2(v_i, l) represents the second party's second estimated quantized value of the graph node to be exchanged
  • Σ_{j∈l} dist(i, j) represents the sum of the communication distances between each graph node j in the second party l and the graph node i to be exchanged.
  • dist(i, j) may represent the communication distance between graph node i and graph node j. Because computing the communication distance between every pair of graph nodes in the graph data is computationally very expensive, the computing node uses a mathematical approximation to calculate the communication distance between graph nodes. For example, a local approximation method can be used, calculating the communication distance between graph nodes according to the following formula:
  • dist(i, j) represents the communication distance between graph node j in the second party l and the graph node i to be exchanged;
  • e_{i,j} ∈ E indicates that graph node j and graph node i are connected by an edge; e_{i,j} ∉ E indicates that there is no edge between graph node j and graph node i.
  • When the graph node to be exchanged is connected by an edge to a graph node in the second party, acquiring it lets the second party's computing node reduce the corresponding communication distance, so the second estimated quantized value is higher.
  • When there is no such edge, acquiring the graph node does not reduce the communication distance.
  • The computing node may determine the first quantized value to be exchanged for the graph node based on the first estimated quantized value and the second estimated quantized value. For example, the average of the first estimated quantized value and the second estimated quantized value may be calculated and used as the first quantized value exchanged for the graph node.
  • Alternatively, the computing node may compute a weighted sum of the first estimated quantized value and the second estimated quantized value according to given weights and use the result as the first quantized value exchanged for the graph node. Then, between the computing nodes involved in the exchange, the second party gives the determined first quantized value to the first party in exchange for the graph node.
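The two pricing options just described, a plain average and a weighted combination of the two estimates, fit in one small function. The name `exchange_price` and the weight parameter `w1` are illustrative assumptions; the weights are assumed to sum to 1.

```python
def exchange_price(y1: float, y2: float, w1: float = 0.5) -> float:
    """Combine the first party's estimate y1 and the second party's estimate y2.

    w1 is the weight on y1 and (1 - w1) the weight on y2; w1 = 0.5 gives the plain average.
    """
    if not 0.0 <= w1 <= 1.0:
        raise ValueError("w1 must lie in [0, 1]")
    return w1 * y1 + (1.0 - w1) * y2
```

With estimates of 4.0 and 6.0 the plain average gives a price of 5.0; weighting the first party's estimate at 0.25 against estimates of 4.0 and 8.0 gives 7.0.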
  • The first quantized value, the second quantized value, and the third quantized value are obtained respectively, and when the second quantized value and the third quantized value do not meet the equalization condition, graph nodes and first quantized values are exchanged between the computing nodes.
  • The first quantized value and the second quantized value measure the computing capability of the computing node, and the third quantized value measures its load state, so both can be expressed accurately and intuitively as quantized values derived from computing tasks.
  • Exchanging graph nodes and first quantized values between computing nodes keeps the second quantized value and the third quantized value within the equalization condition. Tasks therefore do not have to be assigned by a specific server or node; instead, the computing nodes coordinate with one another and dynamically adjust the allocation of graph nodes, achieving self-organized load balancing, avoiding the single-point failures and network congestion associated with a specific server, and greatly improving task scheduling efficiency.
  • the self-organized dynamic task scheduling method can adapt to the computational task scheduling of a larger-scale cluster, and the number of dynamically increasing or decreasing computing nodes does not affect existing computing tasks, and has high scalability.
  • the graph data processing method includes the following steps:
  • The task information corresponding to the graph data to be processed that is broadcast in the blockchain network is periodically detected.
  • S610: Execute the task executable file to perform the computing task on the subgraph data, and obtain the corresponding global data and local data.
  • The above graph data processing method greatly improves the efficiency of processing graph data by dividing the graph data to be processed into subgraph data for distributed processing. The global data that must be shared by all computing nodes during distributed computation is written into the blockchain network and shared globally through it, which greatly reduces the communication volume needed for data sharing. Moreover, during iterative calculation on the graph data to be processed, the latest global data can be obtained quickly and directly from the blockchain network, alongside the locally held local data, without a centralized driver to coordinate data updates, which further improves the efficiency of processing graph data.
  • FIG. 6 is a schematic flowchart of a graph data processing method in some examples. It should be understood that although the steps in the flowchart of FIG. 6 are shown in sequence as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 6 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be executed at different times; nor are these sub-steps or stages necessarily performed sequentially, as they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
  • control node may write the task information into a new block and publish the task information in the blockchain network. After detecting the task information, the computing node determines its load state; when the load state meets the preset load condition, it pulls the task information and obtains the subgraph data and the task executable file from the data warehouse according to the task information.
  • the compute node executes the task executable file to calculate the subgraph data.
  • the computing node can write the calculated global data, the calculation task execution log, and the status of the computing node into the blockchain network.
  • the computing node may respectively obtain corresponding first quantized values, second quantized values, and third quantized values.
  • when the second quantized values and the third quantized values do not meet the equalization condition, graph nodes and first quantized values are exchanged between the computing nodes. In this way, the graph data is iteratively calculated until the iteration stop condition is satisfied.
  • the compute node can record the corresponding calculation results into the blockchain or store them in the data warehouse.
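  • The task pull step in the flow above (a computing node detects task information, checks its load state, and pulls the task only when a preset load condition is met) can be sketched as follows. The names `poll_once` and `max_load`, the 0.8 threshold, and the dictionary layout of the task information are assumptions made for illustration; the text does not specify them.

```python
def poll_once(pending_tasks, current_load, max_load=0.8):
    """One detection pass: return a task to pull, or None.

    pending_tasks: list of task-information dicts seen on the network (hypothetical layout).
    current_load:  load of this computing node as a fraction between 0 and 1.
    max_load:      assumed preset load condition; pull only while below it.
    """
    if not pending_tasks:
        return None                 # nothing has been broadcast yet
    if current_load >= max_load:
        return None                 # load condition not met, leave the task for other nodes
    return pending_tasks.pop(0)     # pull the oldest pending task


# Usage: a node would call poll_once periodically, for example from a timer loop.
tasks = [{"task": "pagerank", "subgraph": "part-0"},
         {"task": "pagerank", "subgraph": "part-1"}]
busy_result = poll_once(tasks, current_load=0.9)
idle_result = poll_once(tasks, current_load=0.5)
```

  • Because each node decides for itself whether to pull, no central scheduler has to track node load, which is the self-organizing behavior the text describes.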
  • the TSP problem specifically asks how a salesman can pass through all the cities (graph nodes) over the shortest distance and return to the starting city.
  • This is an NP-complete problem (NP: Non-deterministic Polynomial), and generally an approximate solution is sought.
  • the TSP problem can be split into the shortest traversal problem in the subgraph, that is, the global optimal solution is approximated by finding the local optimum.
  • the global data is the path and total length in the current graph data to be processed, and the local data includes the preferred path length in each subgraph data.
  • the calculation process continuously optimizes the path selection. By writing the global data into the blockchain network for sharing, the computing nodes can greatly reduce the network traffic consumed during the calculation. The computing tasks are performed on the basis of continuously updated global data and local data to find the optimal solution.
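  • A small sketch may help make the split into subgraphs concrete. It assumes cities are 2-D points, uses a greedy nearest-neighbor heuristic inside each subgraph (a swapped-in, simple heuristic, not the method of the present application), and joins the per-subgraph paths in the given order. The returned tour and total length play the role of the global data, and the per-subgraph path lengths play the role of the local data.

```python
import math


def dist(a, b):
    """Euclidean distance between two cities given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_neighbor_path(cities):
    """Greedy open path through `cities`, starting at the first city."""
    path = [cities[0]]
    remaining = list(cities[1:])
    while remaining:
        nxt = min(remaining, key=lambda c: dist(path[-1], c))
        remaining.remove(nxt)
        path.append(nxt)
    return path


def path_length(path):
    """Sum of the distances between consecutive cities on the path."""
    return sum(dist(a, b) for a, b in zip(path, path[1:]))


def approximate_tour(subgraphs):
    """Solve each subgraph locally, then chain the local paths into one closed tour."""
    local_paths = [nearest_neighbor_path(s) for s in subgraphs]
    local_lengths = [path_length(p) for p in local_paths]    # local data
    tour = [city for p in local_paths for city in p]
    tour.append(tour[0])                                     # return to the starting city
    return tour, path_length(tour), local_lengths            # tour and total length: global data
```

  • The result is only an approximation: each local path is greedy, and the order in which subgraphs are chained is not optimized here.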
  • a computing task publishing method for graph data is provided. This example is mainly illustrated by the method being applied to the control node 110 in FIG. 1 described above.
  • the computing task publishing method of the graph data includes the following steps:
  • control node may obtain the task information corresponding to the to-be-processed graph data that is stored locally, or obtain the task information stored by other devices through a network connection, an interface connection, or the like.
  • control node can receive a task addition instruction and present a task editing interface in accordance with the task addition instruction. Users can enter task information through the task editing interface. The control node may obtain the task information corresponding to the to-be-processed graph data input to the task editing interface.
  • Figure 8 shows an interface diagram of the task editing interface in some examples.
  • the control node may display a task editing interface. Specifically, information such as the number of tasks, the number of running tasks, the number of successful tasks, and the number of failed tasks may be displayed at the top of the task editing interface. In the middle of the task editing interface, the information about each task is displayed in the form of a table, such as the task name, running time, running command, progress, and processing controls that can change the state of the existing task.
  • a text box for adding a task is displayed at the bottom of the task editing interface. When the user clicks the “add task” button, the corresponding task information, such as algorithm parameters and the task executable file, can be entered in the corresponding text box at the bottom of the task editing interface.
  • control node may create a new block when a new computing task is issued, and issue task information in a manner of writing task information to the new block.
  • a new block may also be created and the modified content written into the new block, so as to update the task information in the blockchain network.
  • when a control node terminates a computing task that already exists or is executing, the task can be marked as unavailable in the blockchain. This change is likewise broadcast in the blockchain network by adding a new block.
  • control node may write the acquired task information to the new block.
  • the blockchain network may employ a corresponding consensus algorithm to verify the new block. After the new block is verified by the blockchain network, the compute node may add the new block to the blockchain network.
  • Broadcast the task information in the blockchain network; the broadcast task information is used to instruct the computing node to acquire the subgraph data divided from the to-be-processed graph data and to iteratively perform the calculation task on the subgraph data based on the blockchain network, obtaining the calculation result.
  • the task information in the new block may be broadcast through the blockchain network.
  • the broadcast task information is used to instruct the computing node to acquire the subgraph data divided from the to-be-processed graph data and to iteratively perform the calculation task on the subgraph data based on the blockchain network, obtaining the calculation result.
  • in the above computing task publishing method for graph data, the acquired task information is written into a new block in the blockchain network so that it is broadcast and published there. Other nodes in the blockchain network can then pull the corresponding task information, obtain the subgraph data divided from the to-be-processed graph data, and iteratively execute the calculation task on the subgraph data based on the blockchain network to obtain the calculation result.
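  • The publishing steps above (create a new block, write the task information into it, verify it, add it to the chain) can be sketched with a hash-linked list. This is a minimal sketch under assumptions: the chain is a plain Python list, the block fields are hypothetical, and `verify_block` is a local consistency check standing in for the consensus algorithm, which the text does not specify.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumed previous-hash value for the first block


def block_hash(block):
    """SHA-256 over the block's index, previous hash, and task information."""
    payload = json.dumps(
        {k: block[k] for k in ("index", "prev_hash", "task_info")}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def create_block(chain, task_info):
    """Create a new block that carries the task information."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    block = {"index": len(chain), "prev_hash": prev_hash, "task_info": task_info}
    block["hash"] = block_hash(block)
    return block


def verify_block(chain, block):
    """Stand-in for network verification: check linkage and content hash."""
    expected_prev = chain[-1]["hash"] if chain else GENESIS_PREV
    return (
        block["index"] == len(chain)
        and block["prev_hash"] == expected_prev
        and block["hash"] == block_hash(block)
    )


def publish_task(chain, task_info):
    """Write task information into a new block and append it once verified."""
    block = create_block(chain, task_info)
    if not verify_block(chain, block):
        raise ValueError("new block rejected by verification")
    chain.append(block)  # appending makes the task visible to nodes reading the chain
    return block
```

  • Modifying or terminating a task fits the same pattern: the change is written as task information in a further block rather than by editing an earlier one.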
  • the task execution state of the compute node in the blockchain network may also be displayed by the control node.
  • the control node may receive a task execution state display instruction; according to this instruction, it accesses the computing task execution logs of the corresponding computing nodes in the blockchain network to obtain task execution state information, and presents a task execution status display interface.
  • the task execution status display interface shows the task execution state information corresponding to the existing computing tasks in the blockchain network.
  • FIG. 9 shows a schematic structural diagram of a task execution state display interface in some examples.
  • information such as the number of compute nodes, the number of tasks, the running time, the accumulated running time, the number of successful tasks, the number of failed tasks, the number of running tasks, the CPU usage, the network throughput, the memory usage, and the node availability can be displayed in the task execution status display interface. This information can also be presented with the aid of charts.
  • the control node can access the compute task execution log of the compute node on the blockchain network and display it in the task execution status display interface.
  • FIG. 10 shows a schematic structural diagram of a computing node status display interface in some examples.
  • information such as the number of computing nodes, the availability of computing nodes, the number of computing node failures, the CPU usage rate, the memory usage rate, and the network throughput can be displayed in the computing node status display interface.
  • the status information of each node may also be displayed in the computing node status display interface, such as computing node name, running time, current status, load rate, and viewing logs.
  • a text box for adding a node is displayed at the bottom of the computing node status display interface.
  • the computing node information is, for example, global data, node parameters, log options, and computing tasks.
  • the operating parameters of the system may be set by the control node, including parameters of the blockchain network, such as block size, update speed, authentication mode, and encryption algorithm.
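  • As an illustration only, the blockchain network parameters named above could be grouped in a small configuration object. Every field name and default value below is hypothetical; the text lists the parameter kinds (block size, update speed, authentication mode, encryption algorithm) but gives no concrete values or formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockchainParams:
    """Hypothetical operating parameters set by the control node."""
    block_size_bytes: int = 1_048_576         # block size (assumed unit: bytes)
    update_interval_s: float = 5.0            # update speed, expressed as seconds per block
    authentication_mode: str = "certificate"  # placeholder value
    encryption_algorithm: str = "sha256"      # placeholder value

    def validate(self):
        """Reject obviously invalid settings before they are applied."""
        if self.block_size_bytes <= 0:
            raise ValueError("block_size_bytes must be positive")
        if self.update_interval_s <= 0:
            raise ValueError("update_interval_s must be positive")
        return self
```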
  • the computing task publishing method of the graph data includes the following steps:
  • S1104 Display a task editing interface according to a task addition instruction.
  • S1106 Acquire task information corresponding to the to-be-processed graph data input to the task editing interface.
  • S1114 Broadcast the task information in the blockchain network; the broadcast task information is used to instruct the computing node to acquire the subgraph data divided from the to-be-processed graph data and to iteratively perform the calculation task on the subgraph data based on the blockchain network, obtaining the calculation result.
  • in the above computing task publishing method for graph data, the acquired task information is written into a new block in the blockchain network so that it is broadcast and published there. Other nodes in the blockchain network can then pull the corresponding task information, obtain the subgraph data divided from the to-be-processed graph data, and iteratively execute the calculation task on the subgraph data based on the blockchain network to obtain the calculation result.
  • a graph data processing apparatus 1200 is provided, including: a calculation management module 1201 and a communication module 1202.
  • a calculation management module 1201 configured to acquire subgraph data divided from the to-be-processed graph data;
  • the calculation management module 1201 is further configured to perform a calculation task on the subgraph data, and obtain corresponding global data and local data;
  • the communication module 1202 is configured to write global data into the blockchain network; the global data in the blockchain network is updated by the distributed computing node cluster;
  • the communication module 1202 is further configured to obtain the latest global data from the blockchain network;
  • the calculation management module 1201 is further configured to iteratively execute the calculation task on the subgraph data according to the acquired latest global data and the local data, and obtain the calculation result when the iteration stop condition is satisfied.
  • the communication module 1202 is further configured to acquire task information corresponding to the to-be-processed graph data broadcast in the blockchain network;
  • the calculation management module 1201 is further configured to: read, according to the task information, the corresponding task executable file and the subgraph data divided from the to-be-processed graph data; and execute the task executable file to perform the calculation task on the subgraph data, obtaining corresponding global data and local data.
  • the communication module 1202 is further configured to periodically detect task information corresponding to the to-be-processed graph data broadcast in the blockchain network; when the task information is detected, determine the load state of the current computing node; and pull the task information when the load state satisfies the preset load condition.
  • communication module 1202 includes a new block generation sub-module 12021 for creating a new block; writing global data to the created new block; and, after the new block is verified by the blockchain network, adding the new block to the blockchain network.
  • the new block generation sub-module 12021 is further configured to write the global data and the corresponding calculation task execution log to the new block;
  • the calculation management module 1201 further includes a data recovery sub-module 12011, and the data recovery sub-module 12011 is configured to: determine a faulty computing node and the corresponding lost data according to the computing task execution logs in the blockchain network; determine the computing nodes participating in reconstructing the lost data; obtain local data related to the faulty computing node from the determined computing nodes; and reconstruct the lost data based on the acquired local data.
  • the data recovery sub-module 12011 is further configured to look up local data related to the faulty computing node and share the found local data with the computing nodes that participate in reconstructing the lost data; the shared local data is used to reconstruct the lost data.
  • the communication module 1202 is further configured to periodically detect the number of blocks in the blockchain network; when the number of blocks increases, acquire global data recorded in the new block; and update the local global data according to the acquired global data.
  • the graph data processing apparatus 1200 further includes a scheduling management module 1203, and the scheduling management module 1203 is configured to: acquire a first quantized value corresponding to completed computing tasks that have formed corresponding blockchain data; acquire a second quantized value corresponding to completed computing tasks that have not formed corresponding blockchain data; determine a third quantized value corresponding to unfinished computing tasks in the subgraph data; and, when the second quantized value and the third quantized value do not meet the equalization condition, exchange graph nodes and first quantized values between the computing nodes.
  • the above graph data processing apparatus greatly improves the processing efficiency of the graph data by dividing the graph data to be processed into subgraph data for distributed processing. The global data that needs to be shared by all computing nodes during the distributed computation is written into the blockchain network and shared globally through it, which greatly reduces the communication volume needed for data sharing. Moreover, during iterative computation on the graph data, the latest global data can be obtained quickly and directly from the blockchain network and combined with the locally held local data, without a centralized driver to coordinate data updates, which further improves the efficiency of processing graph data.
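  • The periodic block-count check performed by the communication module (detect the number of blocks, read the global data recorded in new blocks, update the local copy) can be sketched as below. The chain is assumed to be a list of dicts with a hypothetical `global_data` field; a real node would read blocks from the blockchain network instead.

```python
def sync_global_data(chain, last_seen_count, local_global):
    """Apply global data from any blocks added since the previous check.

    chain:           list of block dicts (hypothetical layout).
    last_seen_count: number of blocks already processed by this node.
    local_global:    this node's local copy of the global data, updated in place.
    Returns the new block count to pass to the next check.
    """
    current_count = len(chain)
    if current_count > last_seen_count:                  # the number of blocks increased
        for block in chain[last_seen_count:current_count]:
            local_global.update(block.get("global_data", {}))
    return current_count


# Usage: call sync_global_data on a timer and keep the returned count.
chain = [{"global_data": {"rank_sum": 1.0}}]
local_copy = {}
seen = sync_global_data(chain, 0, local_copy)
chain.append({"global_data": {"rank_sum": 2.0, "iteration": 3}})
seen = sync_global_data(chain, seen, local_copy)
```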
  • FIG. 14 illustrates that the graph data processing apparatus may include a communication module, a data cache module, a schedule management module, and a calculation management module.
  • the communication module includes a new block generation sub-module, a transaction accounting sub-module, a log management sub-module, and a status reporting sub-module.
  • the data cache module includes a global data cache sub-module, a local data cache sub-module, an update timer sub-module, and a data read/write back sub-module.
  • the scheduling management module includes a multitasking coordination submodule, an ad hoc network submodule, a process control submodule, and a key management submodule.
  • the calculation management module includes a calculation execution sub-module, an internal variable storage sub-module, a data recovery sub-module, and a result verification sub-module.
  • the global data cache sub-module and the local data cache sub-module in the data cache module can be placed into the computation management module, allowing internal variable memory to directly access the data warehouse.
  • the data cache module is omitted, with data caching handled entirely by the calculation management module and the cache-driving policy defined by the algorithm being executed, thereby realizing a more compact structure.
  • the communication module is responsible for all functions of communication interaction with the blockchain network, including maintaining the normal operation of the blockchain network, and maintaining effective information transmission between other modules and the blockchain network.
  • the blockchain network mainly carries the interaction of global data, the algorithm configuration of calculation tasks, and computing node state data.
  • the new block generation sub-module in the communication module is configured to write the current global data, the computing node state data (including the second quantized value, the third quantized value, graph node indexes, and the like), and the graph node exchange information generated since the previous block into the new block.
  • the transaction accounting sub-module is used to process the exchange of graph nodes, that is, the buying and selling of graph nodes and the management of the first quantized values.
  • the log management sub-module is used to organize the computing task execution logs generated by the current computing node, and to pull and sort the required computing task execution logs from the blockchain as needed, which is very important in the data recovery process for a faulty computing node.
  • the status report sub-module obtains the state information of the computing node from the calculation management module, including the CPU and memory resources currently consumed by the computing task, the algorithm name and type, and the number of errors.
  • the communication module may also include a verification module to ensure the reliability of new block generation and accounting.
  • the data cache module includes four sub-modules: global data cache, local data cache, update timer, and data read/write-back.
  • the data update strategy is similar to the common FIFO and LFU strategies, with replacement of the oldest cache entries driven by a timer.
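  • One way to picture a timer-driven, oldest-first replacement policy is the sketch below. It is an assumption-laden illustration: the class name `TimedCache`, the capacity limit, and the maximum age are hypothetical, and only the FIFO-like aspect is modeled (no LFU frequency counting). Time is passed in explicitly so that the update timer can be any caller.

```python
from collections import OrderedDict


class TimedCache:
    """Cache whose oldest entries are replaced, either on overflow or when the timer fires."""

    def __init__(self, max_age_s, capacity):
        self.max_age_s = max_age_s
        self.capacity = capacity
        self._items = OrderedDict()   # key -> (value, stored_at), oldest entry first

    def put(self, key, value, now):
        self._items.pop(key, None)    # re-inserting a key makes it the newest entry
        self._items[key] = (value, now)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)   # FIFO: drop the oldest entry

    def get(self, key):
        entry = self._items.get(key)
        return entry[0] if entry else None

    def on_timer(self, now):
        """Called by the update timer: evict entries older than max_age_s."""
        expired = [k for k, (_, stored_at) in self._items.items()
                   if now - stored_at > self.max_age_s]
        for key in expired:
            del self._items[key]
        return expired
```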
  • the scheduling management module is the core of the entire distributed graph computing system, which drives the entire system in an orderly manner.
  • the self-organizing network sub-module is responsible for executing the self-organizing strategy, including determining the first quantized value, making transaction decisions, asset accounting, and managing the exchange of graph nodes.
  • the multi-task coordination sub-module controls the data computation when multiple different tasks are executed. Here, common queue scheduling and priority scheduling methods can be used for control and coordination.
  • the key management sub-module obtains the authorization information of the computing node and the corresponding key from the data warehouse.
  • the process control sub-module monitors the integrity and validity of the computational process, particularly when the computational process needs to be rolled back.
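  • For the multi-task coordination mentioned above, priority scheduling with FIFO order inside each priority level can be sketched with the standard `heapq` module. The class name `TaskQueue` and the convention that a lower number means a higher priority are assumptions for this example.

```python
import heapq
import itertools


class TaskQueue:
    """Priority queue of computing tasks; ties are served in submission order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()   # tie-breaker that preserves FIFO order

    def submit(self, task_name, priority=0):
        # Lower priority number is served first (assumed convention).
        heapq.heappush(self._heap, (priority, next(self._counter), task_name))

    def next_task(self):
        """Return the next task to run, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

  • With all priorities equal, the same structure degenerates to plain queue scheduling, so one class covers both methods named in the text.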
  • the calculation execution sub-module in the calculation management module acquires the required task executable file from the data warehouse according to the instruction of the scheduling management module to complete the related calculation.
  • the corresponding variables are stored in the internal variable storage sub-module, and the data needs to be stored separately according to different computing tasks and graph nodes.
  • the data recovery sub-module is used to effectively recover the data stored on a faulty computing node when one occurs in the blockchain network.
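  • The recovery idea (use the execution logs to find a faulty node, then rebuild its data from local data that other nodes hold about it) can be sketched as follows. The log layout, the rule that a node is faulty when its last logged round lags behind the current round, and the per-node `local_stores` structure are all assumptions for illustration; the text does not define them.

```python
def find_faulty_nodes(logs, current_round):
    """Treat a node as faulty if its last logged round is behind current_round."""
    last_round = {}
    for entry in logs:   # each entry: {"node": ..., "round": ...} (hypothetical layout)
        node = entry["node"]
        last_round[node] = max(entry["round"], last_round.get(node, -1))
    return sorted(node for node, rnd in last_round.items() if rnd < current_round)


def rebuild_lost_data(faulty, local_stores):
    """Merge what the other nodes hold about the faulty node's graph nodes.

    local_stores maps node -> {other_node: {graph_node: value}} (hypothetical layout).
    """
    rebuilt = {}
    for node, store in local_stores.items():
        if node == faulty:
            continue                        # the faulty node's own store is unavailable
        rebuilt.update(store.get(faulty, {}))
    return rebuilt
```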
  • the graph data processing apparatus in the example of the present application has the following advantages:
  • the newly added nodes can also quickly and adaptively join this computing network, making the scale of the computing network flexible.
  • the apparatus can adapt to heterogeneous computing. Because load balancing is autonomous, heterogeneous devices can be coordinated in the same way, and no bottleneck is caused by differences in computing power or mode.
  • Figure 15 shows an internal block diagram of a computer device in some examples.
  • the computer device may specifically be a computer device distributed by the computing node 120 of FIG.
  • the computer device includes a processor, a memory, and a network interface connected by a system bus.
  • the memory comprises a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium of the computer device stores an operating system, and may also store a computer program that, when executed by the processor, causes the processor to implement the graph data processing method.
  • the internal memory may also store a computer program that, when executed by the processor, causes the processor to execute the graph data processing method.
  • FIG. 15 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation of the computer device to which the solution of the present application is applied.
  • a specific computer device may include more or fewer components than those shown in the figure, combine some components, or have a different component arrangement.
  • the graph data processing apparatus can be implemented in the form of a computer program that can be run on a computer device as shown in FIG.
  • the various program modules constituting the graph data processing apparatus may be stored in a memory of the computer device, such as the calculation management module and the communication module shown in FIG.
  • the computer program composed of the program modules causes the processor to perform the steps in the graph data processing method of the various examples of the present application described in this specification.
  • the computer device shown in FIG. 15 can perform steps S202, S204, and S210 through the calculation management module in the graph data processing apparatus shown in FIG.
  • the computer device can perform steps S206 and S208 through the communication module.
  • a computing task publishing apparatus 1600 for graph data is provided, including a task management module 1601 and a communication module 1602.
  • a task management module 1601 configured to acquire task information corresponding to the to-be-processed graph data
  • a communication module 1602 configured to create a new block
  • the communication module 1602 is further configured to write task information into the new block
  • the communication module 1602 is further configured to add the new block to the blockchain network after the new block is verified by the blockchain network;
  • the communication module 1602 is further configured to broadcast the task information in the blockchain network; the broadcast task information is used to instruct the computing node to acquire the subgraph data divided from the to-be-processed graph data and to iteratively execute the calculation task on the subgraph data based on the blockchain network, obtaining the calculation result.
  • the task management module 1601 is further configured to receive a task adding instruction, display a task editing interface according to the task adding instruction, and acquire task information corresponding to the to-be-processed graph data input into the task editing interface.
  • the above computing task publishing apparatus for graph data writes the acquired task information into a new block in the blockchain network so that it is broadcast and published there. Other nodes in the blockchain network can then pull the corresponding task information, obtain the subgraph data divided from the to-be-processed graph data, and iteratively execute the calculation task on the subgraph data based on the blockchain network to obtain the calculation result.
  • the efficiency of publishing computing tasks for graph data is greatly improved, which in turn improves the processing efficiency of computing the graph data.
  • Figure 17 shows an internal block diagram of a computer device in some examples.
  • the computer device may specifically be the control node 110 of FIG.
  • the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected by a system bus.
  • the memory comprises a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium of the computer device stores an operating system, and can also store a computer program, which when executed by the processor, can cause the processor to implement a computing task publishing method of the graph data.
  • the internal memory may also store a computer program that, when executed by the processor, causes the processor to execute a computational task publishing method of the graph data.
  • the display screen of the computer device may be a liquid crystal display or an electronic ink display screen.
  • the input device of the computer device may be a touch layer covering the display screen, or a button, a trackball, or a touchpad provided on the computer device casing; it may also be an external keyboard, trackpad, or mouse.
  • FIG. 17 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation of the computer device to which the solution of the present application is applied.
  • a specific computer device may include more or fewer components than those shown in the figure, combine some components, or have a different component arrangement.
  • the computing task publishing device of the graph data provided herein can be implemented in the form of a computer program that can be run on a computer device as shown in FIG.
  • the program modules constituting the computing task publishing apparatus for graph data may be stored in a memory of the computer device, such as the task management module and the communication module shown in FIG.
  • the computer program composed of the program modules causes the processor to perform the steps in the computing task publishing method for graph data of the various examples of the present application described in this specification.
  • the computer device shown in FIG. 17 can execute step S702 through the task management module in the computing task publishing apparatus for graph data shown in FIG. 16.
  • the computer device can perform steps S704, S706, and S708 through the communication module.
  • the present scheme provides a distributed graph computing system including a computing node and a control node. The control node is configured to: acquire task information corresponding to the graph data to be processed; create a new block; write the task information into the new block; after the new block is verified by the blockchain network, add the new block to the blockchain network; and broadcast the task information in the blockchain network. The broadcast task information is used to instruct the computing node to obtain the subgraph data divided from the to-be-processed graph data and to iteratively execute the calculation task on the subgraph data based on the blockchain network, obtaining the calculation result.
  • the computing node is configured to: acquire subgraph data divided from the to-be-processed graph data; perform the calculation task on the subgraph data to obtain corresponding global data and local data; write the global data into the blockchain network, where the global data in the blockchain network is updated by the distributed computing node cluster; obtain the latest global data from the blockchain network; and iteratively execute the calculation task on the subgraph data according to the obtained latest global data and the local data until the iteration stop condition is satisfied, obtaining the calculation result.
  • the control node can broadcast and publish the task information in the blockchain network by writing the acquired task information into a new block in the blockchain network, so that tasks can be published with extremely low network traffic.
  • once the computing node has obtained the task information, the graph data to be processed can be handled as subgraph data in a distributed manner, which can greatly improve the processing efficiency of the graph data.
  • the global data that needs to be shared by all computing nodes during the distributed computation on the graph data is written into the blockchain network and shared globally through it, which greatly reduces the communication volume needed for data sharing.
  • the latest global data can be obtained quickly and directly from the blockchain network and combined with the locally held local data, without a centralized driver to coordinate data updates, which improves the efficiency of processing graph data.
  • a computer device is provided, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following steps: acquiring subgraph data divided from the to-be-processed graph data; performing a calculation task on the subgraph data to obtain corresponding global data and local data; writing the global data into the blockchain network, where the global data in the blockchain network is updated by the distributed computing node cluster; obtaining the latest global data from the blockchain network; and iteratively executing the calculation task on the subgraph data according to the obtained latest global data and the local data until the iteration stop condition is satisfied, obtaining the calculation result.
  • when performing the step of acquiring the subgraph data divided from the to-be-processed graph data, the computer program causes the processor to specifically perform the following steps: acquiring task information corresponding to the to-be-processed graph data broadcast in the blockchain network; and reading, according to the task information, the corresponding task executable file and the subgraph data divided from the to-be-processed graph data. When performing the step of performing the calculation task on the subgraph data to obtain corresponding global data and local data, the computer program causes the processor to specifically perform the following step: executing the task executable file to perform the calculation task on the subgraph data, obtaining corresponding global data and local data.
  • when performing the step of acquiring the task information corresponding to the to-be-processed graph data broadcast in the blockchain network, the computer program causes the processor to specifically perform the following steps: periodically detecting the task information corresponding to the to-be-processed graph data broadcast in the blockchain network; when the task information is detected, determining the load state of the current computing node; and pulling the task information when the load state meets the preset load condition.
  • when performing the step of writing the global data into the blockchain network, the computer program causes the processor to specifically perform the following steps: creating a new block; writing the global data to the created new block; and, after the new block is verified by the blockchain network, adding the new block to the blockchain network.
  • when performing the step of writing the global data to the created new block, the computer program causes the processor to specifically perform the following step: writing the global data and the corresponding computing task execution log to the new block. The computer program further causes the processor to perform the following steps: determining a faulty computing node and the corresponding lost data according to the computing task execution logs in the blockchain network; determining the computing nodes participating in reconstructing the lost data; acquiring local data related to the faulty computing node from the determined computing nodes; and reconstructing the lost data based on the acquired local data.
  • the computer program further causes the processor to perform the following steps: looking up local data related to the faulty computing node; and sharing the found local data with the computing nodes participating in reconstructing the lost data, the shared local data being used to reconstruct the lost data.
  • the computer program causes the processor to perform the following steps when performing the step of obtaining the latest global data from the blockchain network: periodically detecting the number of blocks in the blockchain network; when the number of blocks is increased, Get the global data recorded in the new block; update the local global data according to the acquired global data.
  • the computer program causes the processor to further perform the steps of: acquiring a first quantized value corresponding to the completed computing task that has formed the corresponding blockchain data; and obtaining the completed calculation that does not form the corresponding blockchain data a second quantized value corresponding to the task; determining a third quantized value corresponding to the unfinished computing task in the sub-picture data; and performing a graph between the computing nodes when the second quantized value and the third quantized value do not meet the equalization condition The exchange of nodes and first quantized values.
  • the above computer device can greatly improve the processing efficiency of the graph data by dividing the data to be processed into sub-picture data for distributed processing. Then, the global data that needs to be globally shared by each computing node in the distributed computing process of the graph data is written into the blockchain network, and globally shared by the blockchain network. The amount of communication used for data sharing is greatly reduced. Moreover, in the process of iterative calculation of the processed graph data, the latest global data and the localized local data can be quickly and directly obtained from the blockchain network, without a centralized driver to coordinate the update data, Improve the efficiency of processing graph data.
  • a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following steps: acquiring task information corresponding to the graph data to be processed; creating a new block; writing the task information into the new block; after the new block is verified by the blockchain network, adding the new block to the blockchain network; and broadcasting the task information in the blockchain network, the broadcast task information being used to instruct computing nodes to acquire the subgraph data divided from the graph data to be processed, and to iteratively execute the computing task for the subgraph data based on the blockchain network to obtain a calculation result.
  • when performing the step of acquiring the task information corresponding to the graph data to be processed, the computer program causes the processor to perform the following steps: receiving a task addition instruction; displaying a task editing interface according to the task addition instruction; and acquiring the task information that corresponds to the graph data to be processed and is entered into the task editing interface.
  • the above computer device writes the acquired task information into a new block in the blockchain network so as to broadcast and publish the task information in the blockchain network. Other nodes in the blockchain network can then pull the corresponding task information, acquire the subgraph data divided from the graph data to be processed, and iteratively execute the computing task for the subgraph data based on the blockchain network to obtain a calculation result. This greatly improves the efficiency of publishing computing tasks for graph data, and in turn the efficiency of computing on the graph data.
  • a computer-readable storage medium storing a computer program which, when executed by a processor, implements the following steps: acquiring subgraph data divided from the graph data to be processed; executing a computing task for the subgraph data to obtain corresponding global data and local data; writing the global data into a blockchain network, the global data in the blockchain network being updated by the distributed computing node cluster; acquiring the latest global data from the blockchain network; and iteratively executing the computing task for the subgraph data according to the acquired latest global data and the local data, until a calculation result is obtained when an iteration stop condition is met.
  • when performing the step of acquiring the subgraph data divided from the graph data to be processed, the computer program causes the processor to perform the following steps: acquiring task information that corresponds to the graph data to be processed and is broadcast in the blockchain network; and reading, according to the task information, the corresponding task executable file and the subgraph data divided from the graph data to be processed. When performing the step of executing the computing task for the subgraph data to obtain the corresponding global data and local data, the computer program causes the processor to perform the following step: executing the task executable file so as to execute the computing task for the subgraph data and obtain the corresponding global data and local data.
  • when performing the step of acquiring the task information that corresponds to the graph data to be processed and is broadcast in the blockchain network, the computer program causes the processor to perform the following steps: periodically detecting task information that corresponds to the graph data to be processed and is broadcast in the blockchain network; when the task information is detected, determining the load state of the current computing node; and when the load state meets a preset load condition, pulling the task information.
  • when performing the step of writing the global data into the blockchain network, the computer program causes the processor to perform the following steps: creating a new block; writing the global data into the created new block; and after the new block is verified by the blockchain network, adding the new block to the blockchain network.
  • when performing the step of writing the global data into the created new block, the computer program causes the processor to write the global data and the corresponding computing task execution log into the new block. The computer program further causes the processor to perform the following steps: determining a faulty computing node and the corresponding lost data according to the computing task execution logs in the blockchain network; determining the computing nodes that participate in reconstructing the lost data; acquiring, from the determined computing nodes, local data related to the faulty computing node; and reconstructing the lost data according to the acquired local data.
  • the computer program further causes the processor to perform the following steps: searching locally for local data related to the faulty computing node; and sharing the found local data with the computing nodes that participate in reconstructing the lost data, the shared local data being used to reconstruct the lost data.
  • when performing the step of acquiring the latest global data from the blockchain network, the computer program causes the processor to perform the following steps: periodically detecting the number of blocks in the blockchain network; when the number of blocks increases, acquiring the global data recorded in the new block; and updating the local copy of the global data according to the acquired global data.
  • the computer program further causes the processor to perform the following steps: acquiring a first quantized value corresponding to completed computing tasks for which corresponding blockchain data has been formed; acquiring a second quantized value corresponding to completed computing tasks for which corresponding blockchain data has not been formed; determining a third quantized value corresponding to the uncompleted computing tasks in the subgraph data; and when the second quantized value and the third quantized value do not meet an equilibrium condition, exchanging graph nodes and first quantized values between computing nodes.
  • the above computer-readable storage medium divides the graph data to be processed into subgraph data for distributed processing, which can greatly improve the processing efficiency of the graph data. The global data that each computing node obtains during the distributed computation and that needs to be shared globally is written into the blockchain network and shared globally through it, which greatly reduces the communication traffic spent on data sharing. Moreover, during the iterative computation on the graph data to be processed, the latest global data can be obtained quickly and directly from the blockchain network and the local data from the local cache, without a centralized driver to coordinate data updates, which greatly improves the efficiency of processing graph data.
  • a computer-readable storage medium storing a computer program which, when executed by a processor, performs the following steps: acquiring task information corresponding to the graph data to be processed; creating a new block; writing the task information into the new block; after the new block is verified by the blockchain network, adding the new block to the blockchain network; and broadcasting the task information in the blockchain network, the broadcast task information being used to instruct computing nodes to acquire the subgraph data divided from the graph data to be processed, and to iteratively execute the computing task for the subgraph data based on the blockchain network to obtain a calculation result.
  • when performing the step of acquiring the task information corresponding to the graph data to be processed, the computer program causes the processor to perform the following steps: receiving a task addition instruction; displaying a task editing interface according to the task addition instruction; and acquiring the task information that corresponds to the graph data to be processed and is entered into the task editing interface.
  • the above computer-readable storage medium writes the acquired task information into a new block in the blockchain network so as to broadcast and publish the task information in the blockchain network. Other nodes in the blockchain network can then pull the corresponding task information, acquire the subgraph data divided from the graph data to be processed, and iteratively execute the computing task for the subgraph data based on the blockchain network to obtain a calculation result. This greatly improves the efficiency of publishing computing tasks for graph data, and in turn the efficiency of computing on the graph data.
  • Non-volatile memory can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Power Engineering (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

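The present application relates to a graph data processing method, a method for publishing computing tasks for graph data, an apparatus, a storage medium, and a computer device. The graph data processing method is executed by a computer device on which a computing node of a distributed computing node cluster is deployed, and includes: acquiring subgraph data divided from graph data to be processed; executing a computing task for the subgraph data to obtain corresponding global data and local data; writing the global data into a blockchain network, where the global data in the blockchain network is updated by the distributed computing node cluster; acquiring the latest global data from the blockchain network; and iteratively executing the computing task for the subgraph data according to the acquired latest global data and the local data, until a calculation result is obtained when an iteration stop condition is met.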

Description

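Graph data processing method, method for publishing computing tasks for graph data, apparatus, storage medium, and computer device
This application claims priority to Chinese Patent Application No. 201810467817.8, entitled "Graph data processing method and method for publishing computing tasks for graph data", filed with the Chinese Patent Office on May 16, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of graph computing, and in particular to a graph data processing method and apparatus, a method and apparatus for publishing computing tasks for graph data, a storage medium, and a computer device.
Background
With the development of computer technology, an abstract representation of the real world as a graph structure, based on graph theory, has emerged, together with a computing model that operates on this data structure, namely graph computing.
Because a graph data structure expresses the associations between data well, many problems that arise in applications can be abstracted as graphs and solved with ideas from graph theory or with models built on graphs. For example, graph data can represent social networks, commodity purchase relationships, road traffic networks, communication networks, and similar information, and the demand for computation on graph data keeps growing in both volume and complexity.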
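Summary
Examples of the present application provide a graph data processing method, a method for publishing computing tasks for graph data, an apparatus, a computer-readable storage medium, and a computer device.
An example of the present application provides a graph data processing method, executed by a computer device on which a computing node of a distributed computing node cluster is deployed, including:
acquiring subgraph data divided from graph data to be processed;
executing a computing task for the subgraph data to obtain corresponding global data and local data;
writing the global data into a blockchain network, where the global data in the blockchain network is updated by the distributed computing node cluster;
acquiring the latest global data from the blockchain network; and
iteratively executing the computing task for the subgraph data according to the acquired latest global data and the local data, until a calculation result is obtained when an iteration stop condition is met.
An example of the present application further provides a graph data processing apparatus, including:
a computing management module, configured to acquire subgraph data divided from graph data to be processed;
the computing management module being further configured to execute a computing task for the subgraph data to obtain corresponding global data and local data;
a communication module, configured to write the global data into a blockchain network, where the global data in the blockchain network is updated by the distributed computing node cluster;
the communication module being further configured to acquire the latest global data from the blockchain network; and
the computing management module being further configured to iteratively execute the computing task for the subgraph data according to the acquired latest global data and the local data, until a calculation result is obtained when an iteration stop condition is met.
An example of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the graph data processing method.
An example of the present application further provides a computer device including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the graph data processing method.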
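An example of the present application further provides a method for publishing computing tasks for graph data, executed by a computer device, the method including:
acquiring task information corresponding to graph data to be processed;
creating a new block;
writing the task information into the new block;
after the new block is verified by a blockchain network, adding the new block to the blockchain network; and
broadcasting the task information in the blockchain network, the broadcast task information being used to instruct computing nodes to acquire subgraph data divided from the graph data to be processed, and to iteratively execute a computing task for the subgraph data based on the blockchain network to obtain a calculation result.
An example of the present application further provides an apparatus for publishing computing tasks for graph data, the apparatus including:
a task management module, configured to acquire task information corresponding to graph data to be processed;
a communication module, configured to create a new block;
the communication module being further configured to write the task information into the new block;
the communication module being further configured to add the new block to a blockchain network after the new block is verified by the blockchain network; and
the communication module being further configured to broadcast the task information in the blockchain network, the broadcast task information being used to instruct computing nodes to acquire subgraph data divided from the graph data to be processed, and to iteratively execute a computing task for the subgraph data based on the blockchain network to obtain a calculation result.
An example of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method for publishing computing tasks for graph data.
An example of the present application further provides a computer device including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method for publishing computing tasks for graph data.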
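Brief Description of the Drawings
FIG. 1 is a diagram of the application environment of a graph data processing method in some examples;
FIG. 2 is a schematic flowchart of a graph data processing method in some examples;
FIG. 3 is a schematic flowchart of the step of acquiring task information that corresponds to graph data to be processed and is broadcast in a blockchain network in some examples;
FIG. 4 is a schematic flowchart of the step of reconstructing lost data in some examples;
FIG. 5 is a schematic flowchart of the graph node exchange step in some examples;
FIG. 6 is a schematic flowchart of a graph data processing method in other examples;
FIG. 7 is a schematic flowchart of a method for publishing computing tasks for graph data in some examples;
FIG. 8 is a schematic diagram of a task editing interface in some examples;
FIG. 9 is a schematic structural diagram of a task execution status display interface in some examples;
FIG. 10 is a schematic structural diagram of a computing node status display interface in some examples;
FIG. 11 is a schematic flowchart of a method for publishing computing tasks for graph data in other examples;
FIG. 12 is a structural block diagram of a graph data processing apparatus in some examples;
FIG. 13 is a structural block diagram of a graph data processing apparatus in other examples;
FIG. 14 is a schematic module diagram of a graph data processing apparatus in some examples;
FIG. 15 is a structural block diagram of a computer device in some examples;
FIG. 16 is a structural block diagram of an apparatus for publishing computing tasks for graph data in some examples;
FIG. 17 is a structural block diagram of a computer device in some examples.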
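Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and examples. It should be understood that the specific examples described here only explain the present application and do not limit it.
In some examples, the graph computing method may be based on a centralized design, in which a parameter server is often used to store and update the data that needs to be shared during graph computing. Although a parameter server can store the shared data in a distributed manner across server nodes and thereby avoid single points of failure to some extent, updating the data still depends on a centralized driver to coordinate the process. During graph computing this often leads to enormous network throughput, so that network communication becomes the bottleneck of the graph computing system. In particular, when very large volumes of data are processed, data updates are delayed, so traditional centralized graph computing methods suffer from low computing efficiency.
In view of this, to address the technical problem of low efficiency in computing on graph data, it is necessary to provide a graph data processing method, a method for publishing computing tasks for graph data, an apparatus, a computer-readable storage medium, and a computer device.
FIG. 1 is a diagram of the application environment of a graph data processing method in some examples. Referring to FIG. 1, the graph data processing method is applied to a distributed graph computing system. The distributed graph computing system includes a control node 110, computing nodes 120, and a data warehouse 130. The control node 110 and the computing nodes 120 are connected through a blockchain network, and each of them is connected to the data warehouse 130 through a network. The control node 110 may be implemented by a terminal, which may be a desktop terminal or a mobile terminal such as a mobile phone, a tablet computer, or a notebook computer. A computing node 120 may be implemented by a program deployed on one or more servers; for example, the computing nodes a, b, c, and so on shown in FIG. 1 are distributed across high-performance computer 1 to high-performance computer N. The data warehouse 130 may be a centralized storage device, or a distributed storage cluster or device. In some examples, the distributed graph computing system includes a distributed computing node cluster, and the graph data processing method may be applied to the computing nodes in that cluster.
As shown in FIG. 2, in some examples, a graph data processing method is provided. This example is described mainly with the method applied to a computing node in the distributed computing node cluster of FIG. 1. Referring to FIG. 2, the graph data processing method specifically includes the following steps: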
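S202: Acquire subgraph data divided from the graph data to be processed.
Graph data is structured data organized using a graph structure, which stores the relationship information between entities by applying graph theory. Mathematically, a graph structure can be represented by a two-tuple G=<V,E>, where
Figure PCTCN2019082192-appb-000001
represents the set of N graph nodes in the graph data, and
Figure PCTCN2019082192-appb-000002
Figure PCTCN2019082192-appb-000003
represents the connecting edges between graph nodes. The graph data to be processed is the graph data that the computing nodes are to process. Subgraph data is a part of the graph data divided from the graph data to be processed.
Generally, graph data consists of graph nodes and the edges between them. A graph node is a vertex in the graph data and can represent a subject in the graph data. For example, when graph data stores information about individuals in a social network, different graph nodes can represent different individuals; when graph data represents commodity purchase relationships, each user and each commodity is a graph node. A graph node may contain information such as its node identifier and node attributes. An edge between graph nodes is an edge in the graph data and can represent the relationship between different subjects. For example, in the social network case, an edge can represent a relationship between individuals, such as friendship; in the commodity purchase case, a user purchasing a commodity is represented as an edge.
In some examples, each computing node may be a program deployed on one or more servers, and the computing nodes may form a virtual network structure through the blockchain as a medium. A computing node may write information such as data, computing task execution logs, or computing node status into a new block and add the new block to the blockchain network. This makes it easier for computing nodes to communicate and coordinate with each other in order to complete the computing tasks corresponding to the subgraph data, and thereby the computing task corresponding to the graph data to be processed. The computing node status is the state information of a computing node, including the CPU (Central Processing Unit) and memory resources consumed by the computing node in executing computing tasks, the algorithm name and type, the number of errors, and so on.
In some examples, the distributed graph computing system includes a control node, which may join the blockchain network by communicating with one or more computing nodes in the system. The control node may record the task information corresponding to the graph data to be processed in a new block and add the new block to the blockchain network, thereby broadcasting the task information in the blockchain network. In this way, all computing nodes in the blockchain network can receive the broadcast task information. Task information is information related to a computing task.
In some examples, a computing node may periodically check the information recorded in the latest block of the blockchain network by means of a timer or a scheduled program. When task information is detected, the computing node assesses its own load state; if it is not overloaded, it may load the task information from the blockchain network.
In some examples, the task information may include the path for obtaining the task executable file, the algorithm parameters for executing the computing task, the path for obtaining the subgraph data, the output path for the calculation result, the expected completion time, and so on. After loading the task information from the blockchain network, a computing node may obtain a subgraph structure from the graph structure corresponding to the graph data to be processed, and obtain the corresponding subgraph data from the data warehouse according to that subgraph structure.
In some examples, after loading the task information from the blockchain network, a computing node may, according to its own load state, randomly take a certain number of graph nodes and the corresponding edges from the graph structure corresponding to the graph data to be processed to form a subgraph structure, and then obtain the subgraph data corresponding to that subgraph structure from the data warehouse according to the path recorded in the task information.
In some examples, the distributed graph computing system may divide the graph data to be processed into as many pieces of subgraph data as there are computing nodes at present, according to a preset condition such as even or random division.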
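The division into subgraph data can be illustrated with a minimal sketch. It assumes a random, roughly even split of graph nodes across a given number of computing nodes; the function name `partition_graph`, the dictionary layout, and the choice to keep every edge that touches a part are illustrative assumptions, not details given in the source.

```python
import random

def partition_graph(nodes, edges, num_parts, seed=0):
    """Randomly split graph nodes into num_parts roughly equal subgraphs.

    Hypothetical helper: the source only says the split may be even or random.
    """
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    # Striding over the shuffled list yields parts whose sizes differ by at most 1.
    parts = [shuffled[i::num_parts] for i in range(num_parts)]
    subgraphs = []
    for part in parts:
        members = set(part)
        # Assumption: a subgraph keeps every edge with at least one endpoint in it.
        local_edges = [(u, v) for (u, v) in edges if u in members or v in members]
        subgraphs.append({"nodes": sorted(members), "edges": local_edges})
    return subgraphs

# Usage: a 10-node chain split across 3 computing nodes.
chain_nodes = list(range(10))
chain_edges = [(i, i + 1) for i in range(9)]
pieces = partition_graph(chain_nodes, chain_edges, 3)
```

Edges that cross two parts appear in both subgraphs here, which is one simple way to let each computing node see its boundary; other conventions are equally possible.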
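S204: Execute the computing task for the subgraph data to obtain corresponding global data and local data.
The computing task is the workload that has to be computed when the graph data to be processed is processed. Global data is data that needs to be shared and updated globally while the computing nodes compute on the subgraph data; the scale of global data is limited. Local data is data that is used and updated by only a small number of computing nodes while they compute on the subgraph data.
Specifically, after acquiring the subgraph data divided from the graph data to be processed, a computing node may perform the corresponding computation on the graph nodes and edges in the acquired subgraph data to obtain the corresponding global data and local data.
In some examples, each graph node in the subgraph data may correspond to one unit computing task. In this example, a unit computing task is the smallest unit of computation allocation in the distributed graph computing system, for example the computing task corresponding to a given graph node in the current iteration.
In some examples, after loading the task information, a computing node pulls the corresponding task executable file and the algorithm parameters for executing the computing task from the data warehouse, according to the path for obtaining the task executable file and the algorithm parameter information recorded in the task information. After acquiring the subgraph data, the computing node then runs the pulled task executable file with those algorithm parameters in order to execute the computing task for the subgraph data.
In some examples, a computing node may assign initial values to the global data and local data randomly or based on historical experience. The computing node runs the task executable file according to the algorithm parameters and the initial values of the global data and local data, and updates the global data and local data.
In some examples, the computation on the graph data to be processed is iterative, and as the algorithm iterates, new computing tasks are generated in every iteration. Driven by the algorithm, each graph node in the subgraph data can continuously generate unit computing tasks that need to be completed.
In some examples, after a computing node executes the computing task for the subgraph data and obtains the corresponding global data and local data, it may store the data in isolation by task and by graph node.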
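S206: Write the global data into the blockchain network; the global data in the blockchain network is updated by the distributed computing node cluster.
The blockchain network is the carrier and organizational form in which blockchain technology runs. Blockchain technology, BT for short, is also known as distributed ledger technology. It is an Internet database technology characterized by decentralization, openness, and transparency, which allows everyone to take part in database records. Blockchain technology is a distributed infrastructure and computing paradigm that uses a chained block data structure to verify and store data, a consensus algorithm among distributed nodes to generate and update data, cryptography to secure data transmission and access, and smart contracts composed of automated script code to program and manipulate data. A distributed computing node cluster is a collection of distributed computing nodes.
Specifically, after a computing node executes the computing task for the subgraph data and obtains the corresponding global data and local data, it may share the obtained global data among the distributed computing nodes through the blockchain network. The computing node may write the global data into a new block and, after the new block is verified by the blockchain network, append the new block to the end of the blockchain, thereby writing the global data into the blockchain network.
In some examples, the computing nodes in the distributed graph computing system may execute the computing tasks for their subgraph data asynchronously, each obtaining its corresponding global data and local data. After completing its computing task, each computing node may write global data into the blockchain network according to its own circumstances, so that the global data in the blockchain network is updated jointly by the distributed computing node cluster.
In some examples, a computing node may hash the global data to obtain a corresponding string, encrypt the string with a preset private key to generate a signature, and then write both the global data and the signature into the blockchain network. When another node obtains the global data and the signature, it may obtain the corresponding public key from the data warehouse to verify the validity of the signature, and only after verification succeeds does it take the global data and update its locally cached global data.
In some examples, the blockchain network may consist of a private blockchain. The keys needed to record data in the blockchain network may be generated in advance according to a relevant asymmetric encryption standard, such as elliptic curve cryptography or RSA, and stored in the data warehouse. A computing node may access the data warehouse to obtain the relevant keys when it needs them.
In some examples, a computing node may also record all relevant data from the previous block up to the current time, such as the current global data, computing task execution logs, or computing node status, in a new block and add the new block to the blockchain network.
In some examples, a computing node may actively broadcast the relevant data recorded in a new block, such as the global data recorded in it, so as to share that data with all computing nodes in the distributed graph computing system. Alternatively, the relevant data recorded in a new block may be spread on request, meaning that a computing node obtains the data recorded in the new block by sending a request.
In practical application scenarios, the graph structure corresponding to the graph data to be processed is usually a sparse graph: the number of edges M in the graph data to be processed is far smaller than the number of pairwise combinations of the N graph nodes:
Figure PCTCN2019082192-appb-000004
As a result, the scale of the global data that needs to be shared globally is usually limited, and most of the data has a certain locality, being used in the computation of only a few computing nodes. In this distributed graph computing system, global data is shared globally through the blockchain network, while local data can be cached on the corresponding computing nodes and obtained on request, which avoids a large amount of unnecessary network communication overhead.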
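S208: Acquire the latest global data from the blockchain network.
Specifically, a computing node may obtain the latest global data recorded in a new block through the blockchain network. In some examples, a computing node may cache the latest global data from the blockchain network by periodically checking for new blocks.
In some examples, when a new block is broadcast in the blockchain network, a computing node may obtain the global data in that new block and cache it locally. Driven by a timer, the computing node may replace the oldest global data; the update policy is similar to the common FIFO (first in, first out) and LFU (least frequently used) policies and is not described in detail here.
In some examples, a computing node may periodically detect the number of blocks in the blockchain network. When the number of blocks increases, it obtains the global data recorded in the new block and updates its local global data accordingly.
In some examples, a computing node may use a timer or run a preset scheduled detection program to periodically check the blocks in the blockchain network. When the number of blocks increases, the computing node may request the global data recorded in the new block and update its local global data accordingly. In this way global data can be updated quickly and conveniently, which improves the efficiency of subsequent processing of the subgraph data.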
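The block-count check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the chain is modeled as a plain Python list of dictionaries with a `global_data` field, and the class name `GlobalDataCache` is hypothetical. A real node would call `poll` from a timer.

```python
class GlobalDataCache:
    """Local copy of global data, refreshed when the block count grows (illustrative)."""

    def __init__(self):
        self.seen_blocks = 0   # number of blocks already processed
        self.global_data = {}  # locally cached global data

    def poll(self, chain):
        """Check the chain once; return True if new blocks were applied."""
        if len(chain) <= self.seen_blocks:
            return False  # block count unchanged, nothing to update
        for block in chain[self.seen_blocks:]:
            # Later blocks overwrite earlier values for the same key.
            self.global_data.update(block.get("global_data", {}))
        self.seen_blocks = len(chain)
        return True

# Usage: two blocks arrive, then nothing changes.
cache = GlobalDataCache()
chain = [{"global_data": {"w": 1.0}}, {"global_data": {"w": 0.8, "b": 0.1}}]
updated = cache.poll(chain)
```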
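In some examples, a computing node may generate the computing tasks of a new iteration from the global data and local data. When doing so, it obtains the latest global data from the blockchain network by data request, obtains the local data it needs from adjacent computing nodes, and generates the computing tasks of the new iteration from the latest global data, the local data of the adjacent computing nodes, and its own local data.
S210: Iteratively execute the computing task for the subgraph data according to the acquired latest global data and the local data, until a calculation result is obtained when the iteration stop condition is met.
The iteration stop condition is the condition for ending the iterative computation. It may be that a preset number of iterations has been reached, that the iterative computation has run for a preset duration, or that the calculation result has converged to a stable value.
Specifically, a computing node may iteratively execute the computing task for the subgraph data according to the acquired latest global data and the local data. If the iteration stop condition is not met after the computing task for the subgraph data has been executed, the process returns to step S204 and continues computing, and ends only when the iteration stop condition is met.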
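The loop with its stop conditions can be sketched in a few lines. The sketch assumes a scalar state and covers two of the three stop conditions named above (iteration limit and convergence); the function name `iterate_until_stop` and its parameters are illustrative, and a time limit could be added in the same way.

```python
def iterate_until_stop(step, state, max_iters=100, tol=1e-6):
    """Repeat `step` until the state converges or the iteration limit is hit.

    Returns the final state and the number of iterations performed.
    """
    for i in range(1, max_iters + 1):
        new_state = step(state)
        if abs(new_state - state) < tol:
            return new_state, i  # converged to a stable value
        state = new_state
    return state, max_iters      # preset number of iterations reached

# Usage: the fixed point of x -> 0.5 * x + 1 is 2.
result, iters = iterate_until_stop(lambda x: 0.5 * x + 1, 0.0)
```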
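In some examples, driven by the algorithm, the graph nodes in the subgraph data can continuously generate computing tasks that need to be completed. A computing node may determine the pending computing tasks in the subgraph data according to the subgraph data, the number and content of the completed computing tasks, the latest global data, and the local data. For example, driven by the algorithm, the computing node may generate the computing tasks of the next iteration from the subgraph data, the completed computing tasks of the previous s time points on which the computing tasks of the current iteration depend, and the corresponding latest global data and current local data. The computing node executes the generated computing tasks and updates the global data and local data. This cycle repeats until a calculation result is obtained when the iteration stop condition is met.
Taking model training as an example application of the graph data processing method: graph data may be fed into a machine learning model to determine intermediate prediction results for the graph nodes in the graph data, and the model parameters are adjusted according to the difference between the intermediate prediction results and the labels of the graph data. Training continues until the iteration stop condition is met. The model parameters obtained in each training round may include global data and local data; each round uses the model parameters obtained in the previous round, and the parameters it produces are used in the next round, and so on.
In some examples, the data consistency requirement in graph computing problems varies greatly from algorithm to algorithm. In this example, because information is propagated through the blockchain network by flooding, the global data obtained by a computing node usually has some delay and strong consistency cannot be guaranteed; in other words, the error of the global parameters stays within a controllable range. Algorithms such as stochastic gradient descent are not sensitive to this. Correspondingly, because local data is stored locally or shared through direct point-to-point communication, its consistency is well guaranteed and there is no data delay.
In some examples, when the iteration stop condition is met after a computing node has iteratively executed the computing task for the subgraph data according to the acquired latest global data and the local data, the calculation result corresponding to the subgraph data is obtained. Each computing node may store the local data of its subgraph and the calculation result in the data warehouse. The data warehouse may consolidate the global data in the blockchain network, the local data of each subgraph, and the calculation result of each subgraph into a consistent whole.
The above graph data processing method divides the graph data to be processed into subgraph data for distributed processing, which can greatly improve the processing efficiency of the graph data. The global data that each computing node obtains during the distributed computation and that needs to be shared globally is written into the blockchain network and shared globally through it, which greatly reduces the communication traffic spent on data sharing. Moreover, during the iterative computation on the graph data to be processed, the latest global data can be obtained quickly and directly from the blockchain network and the local data from the local cache, without a centralized driver to coordinate data updates, which greatly improves the efficiency of processing graph data.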
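In some examples, step S202 specifically includes: acquiring task information that corresponds to the graph data to be processed and is broadcast in the blockchain network; and reading, according to the task information, the corresponding task executable file and the subgraph data divided from the graph data to be processed. Step S204 specifically includes: executing the task executable file so as to execute the computing task for the subgraph data and obtain the corresponding global data and local data.
Specifically, a computing node may obtain, through the blockchain network, the task information that corresponds to the graph data to be processed and is broadcast in the blockchain network, and, according to the task information, read the corresponding task executable file and the subgraph data divided from the graph data to be processed from local storage or from the data warehouse. It then runs the task executable file to execute the computing task for the subgraph data.
In some examples, the distributed graph computing system includes a control node, which may record the task information corresponding to the graph data to be processed in a new block and add the new block to the blockchain network, thereby broadcasting the task information in the blockchain network.
In some examples, all computing nodes in the blockchain network can receive the task information corresponding to the graph data to be processed through the broadcast of the new block.
In some examples, a computing node may periodically check the information recorded in the latest block of the blockchain network by means of a timer or a scheduled program, and thus actively obtain the task information that corresponds to the graph data to be processed and is broadcast in the blockchain network.
In the above examples, the task information that corresponds to the graph data to be processed and is broadcast in the blockchain network is acquired, the corresponding task executable file and subgraph data are read according to the task information, and the task executable file is run, which implements the step of executing the computing task. The task information can be a very lightweight file, so publishing tasks through the blockchain network can greatly reduce the communication traffic needed to publish and obtain tasks in the distributed graph computing system, and greatly improves the efficiency of task publishing.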
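In some examples, the step of acquiring the task information that corresponds to the graph data to be processed and is broadcast in the blockchain network specifically includes the following steps:
S302: Periodically detect task information that corresponds to the graph data to be processed and is broadcast in the blockchain network.
Specifically, a computing node may use a timer or a scheduled program to periodically detect task information that corresponds to the graph data to be processed and is broadcast in the blockchain network. In some examples, the control node broadcasts the task information through the blockchain network by recording it in a new block.
S304: When task information is detected, determine the load state of the current computing node.
Specifically, when a computing node detects new task information, it may determine its current load state.
S306: When the load state meets a preset load condition, pull the task information.
The preset load condition is a condition set in advance, for example that the load is below a preset threshold or within a preset range. A computing node may judge whether its current load state meets the preset load condition. When it does, for example when the computing node is idle, the computing node may pull the task information. When it does not, for example when the computing node is overloaded, the computing node may ignore the task information.
In the above examples, periodically detecting the task information that corresponds to the graph data to be processed and is broadcast in the blockchain network allows the latest task information to be obtained in a timely manner. Pulling the task information only when the load state of the computing node meets the preset load condition ensures that the computing node that pulls it has idle resources to handle the corresponding computing task, which avoids futile pulls and further improves the efficiency of processing graph data.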
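Steps S302 to S306 amount to a small decision made each time the latest block is inspected. The sketch below assumes the load is a fraction between 0 and 1 and that the preset load condition is "load below a threshold"; the function name `maybe_pull_task`, the `task_info` field, and the default threshold of 0.8 are illustrative assumptions.

```python
def maybe_pull_task(latest_block, current_load, max_load=0.8):
    """Return the task information if one is present and this node has spare capacity.

    latest_block: dict that may carry a "task_info" entry (assumed layout).
    current_load: load of this computing node as a fraction in [0, 1].
    """
    task_info = latest_block.get("task_info")
    if task_info is None:
        return None        # S302: nothing was published in this block
    if current_load >= max_load:
        return None        # S306: overloaded, ignore the task
    return task_info       # S306: load condition met, pull the task

# Usage: an idle node pulls the task, a busy node ignores it.
block = {"task_info": {"algorithm_params": {"k": 3}}}
pulled = maybe_pull_task(block, current_load=0.2)
ignored = maybe_pull_task(block, current_load=0.95)
```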
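In some examples, step S206 specifically includes the following steps: creating a new block; writing the global data into the created new block; and after the new block is verified by the blockchain network, adding the new block to the blockchain network.
Specifically, a computing node may periodically create a new block and write into it the global data obtained from executing the computing task for the subgraph data. After the new block is verified by the blockchain network, the new block is added to the blockchain network.
In some examples, a computing node may generate a new block according to a corresponding consensus algorithm, for example a consensus hash algorithm. The computing node may write the global data corresponding to completed computing tasks into the new block. In some examples, the computing node may write information such as global data, computing task execution logs, graph node exchange data, or computing node status into the new block.
In some examples, when the blockchain is a private chain, the bit length of the consensus hash can be reduced to improve processing efficiency and thereby increase system throughput, for example by using a 128-bit SHA (Secure Hash Algorithm) digest rather than a 256-bit one. Alternatively, the consensus mechanism can be changed to PoS (Proof of Stake), in which coin age determines who generates the next new block.
In some examples, the complexity of the consensus hash function can also be controlled by limiting the number of leading zeros required in the hash result. As with controlling the bit length, the more leading zeros the hash result is required to have, the harder the consensus problem becomes.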
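The effect of the leading-zero constraint can be seen in a small hash puzzle. This is a generic proof-of-work style sketch, assuming SHA-256 and counting leading zeros in the hexadecimal digest; it is not the consensus algorithm of the source, and the function name `find_nonce` and its parameters are illustrative.

```python
import hashlib

def find_nonce(payload: bytes, zero_prefix: int, max_nonce: int = 1_000_000):
    """Search for a nonce whose SHA-256 digest starts with `zero_prefix` hex zeros.

    Each extra required zero multiplies the expected work by 16.
    Returns (nonce, digest) or None if no nonce is found within max_nonce tries.
    """
    target = "0" * zero_prefix
    for nonce in range(max_nonce):
        digest = hashlib.sha256(payload + str(nonce).encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
    return None

# Usage: a low difficulty keeps block creation cheap on a private chain.
found = find_nonce(b"global-data-v1", zero_prefix=2)
```

On a private chain where all nodes are trusted to some degree, a small `zero_prefix` trades tamper resistance for throughput, which matches the motivation given in the text.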
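In some examples, the blockchain network may use a corresponding consensus algorithm to verify a new block, and after the new block is verified by the blockchain network, the computing node may add it to the blockchain network.
In some examples, several new blocks may be generated at the same time in the distributed graph computing system, causing the blockchain to fork. Forks are handled by the common majority-agreement principle, so that in the end only one branch becomes the main chain. If the new block created by the current computing node does not end up on the main chain, the state of the computing node rolls back to the state before the new block was generated. To avoid losing the data recorded in that new block, each computing node may cache the corresponding data locally for a period of time.
In the above examples, the global data is written into a new block, and once the new block is verified by the blockchain network, the global data can be shared in the blockchain network, which greatly reduces the communication traffic needed to share global data.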
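In some examples, the step of writing the global data into the created new block includes writing the global data and the corresponding computing task execution log into the new block. The graph data processing method further includes a step of reconstructing lost data, which specifically includes:
S402: Determine a faulty computing node and the corresponding lost data according to the computing task execution logs in the blockchain network.
The computing task execution log is the log data generated while a computing task is executed. In some examples, when a computing node fails to obtain local data from an adjacent node, it may pull and collate the relevant computing task execution logs from the blockchain network as needed, and determine the faulty computing node and the corresponding lost data.
S404: Determine the computing nodes that participate in reconstructing the lost data.
Specifically, the computing nodes that participate in reconstructing the lost data are usually the computing nodes adjacent to the faulty node. In some examples, once a faulty computing node has been identified, its adjacent computing nodes may each take over part of the subgraph structure of the faulty computing node according to their own load state, obtain the corresponding part of the subgraph data from the data warehouse according to that partial structure, and merge it to form new subgraph data.
S406: Acquire local data related to the faulty computing node from the determined computing nodes.
Specifically, a computing node may obtain local data related to the faulty computing node from the other computing nodes that participate in reconstructing the lost data. In some examples, a computing node may also search its locally cached local data for data related to the faulty computing node and share what it finds with the computing nodes that participate in reconstructing the lost data, so that the other computing nodes can use the shared local data to reconstruct the lost data. The sharing may be done by the computing node actively sending its local data related to the faulty computing node to the relevant computing nodes, or by the computing node that needs the data issuing a data request to obtain it.
In some examples, a computing node that participates in reconstructing the lost data may also be an idle computing node. In this case, it may obtain the local data related to the faulty computing node that is shared by the computing nodes adjacent to the faulty computing node, and reconstruct the lost data from the shared local data.
S408: Reconstruct the lost data according to the acquired local data.
Specifically, a computing node may reconstruct the lost data according to the minimum error principle. It may reconstruct the lost data from the acquired local data by interpolation. Common approaches include the moving average window method, regression methods, and interpolation function methods; alternatively, the mean, median, or mode of the acquired local data may be used as the reconstructed value of the lost data.
In the above examples, the computing task execution logs recorded in the blockchain network make it possible to locate the faulty node and the corresponding lost data quickly. The lost data can be reconstructed quickly from the local data shared by the computing nodes that participate in the reconstruction, which prevents a single point of failure at one computing node from affecting the overall computation and gives the distributed graph processing system as a whole high reliability.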
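The simplest of the reconstruction options in S408, taking the mean, median, or mode of the local data collected from neighbors, can be sketched with the Python standard library. The function name `rebuild_value` and the idea that each neighbor contributes one numeric value are illustrative assumptions; regression or moving-window interpolation would replace the body in a fuller version.

```python
import statistics

def rebuild_value(neighbor_values, method="mean"):
    """Estimate a lost value from related local data held by neighboring nodes."""
    if not neighbor_values:
        raise ValueError("no local data available for reconstruction")
    if method == "mean":
        return statistics.fmean(neighbor_values)
    if method == "median":
        return statistics.median(neighbor_values)
    if method == "mode":
        return statistics.mode(neighbor_values)
    raise ValueError(f"unknown method: {method}")

# Usage: four neighbors report values related to the failed node.
shared = [1, 2, 2, 5]
estimate = rebuild_value(shared, method="median")
```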
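In some examples, the graph data processing method further includes a graph node exchange step, which specifically includes:
S502: Acquire a first quantized value corresponding to completed computing tasks for which corresponding blockchain data has been formed.
Blockchain data is data recorded in the blockchain network; a new block in the blockchain network can share the information recorded in it by broadcast. A quantized value is a number obtained by quantizing computing tasks; for example, one unit computing task corresponds to one unit of quantized value. The first quantized value is the number obtained by quantizing the completed computing tasks of a computing node for which corresponding blockchain data has been formed. It can measure the computing capacity of the computing node and corresponds to part of the completed computing tasks. Specifically, the first quantized value may be a resource that circulates and is exchanged in the blockchain network, and may be called a currency value or a virtual currency value.
Specifically, after executing the computing task for the subgraph data, a computing node may record the global data corresponding to part of the completed computing tasks in the blockchain network to form corresponding blockchain data; the quantized value corresponding to this part of the completed computing tasks is the first quantized value.
In some examples, a computing node may also record all relevant data from the previous block up to the current time, such as the current global data, computing task execution logs, or computing node status, in a new block, add the new block to the blockchain network, and obtain the corresponding first quantized value.
S504: Acquire a second quantized value corresponding to completed computing tasks for which corresponding blockchain data has not been formed.
The second quantized value is the number obtained by quantizing the completed computing tasks of a computing node for which corresponding blockchain data has not been formed. It can measure the computing capacity of the computing node and corresponds to the other part of the completed computing tasks. Specifically, the second quantized value may be a resource currently held by the computing node that can be redeemed as first quantized value, and may be called an asset value. After the computing node records the data corresponding to completed computing tasks in the blockchain network to form corresponding blockchain data, the asset value corresponding to that blockchain data can be converted into the same amount of currency value circulating in the blockchain network.
Specifically, after executing the computing task for the subgraph data, a computing node obtains the corresponding global data and local data, and quantizes the completed computing tasks to obtain the second quantized value. In some examples, once the computing node has completed the corresponding computing tasks, it can obtain the second quantized value corresponding to them.
In some examples, before writing the global data into a new block, a computing node obtains the historical second quantized value corresponding to the completed computing tasks, that is, to the computing tasks completed before the global data is written into the new block. After the computing node writes the global data into the new block and the new block is verified by the blockchain network, the first quantized value corresponding to the computing tasks that produced the written global data is deducted from the historical second quantized value to obtain the current second quantized value.
In this way, by writing global data into a new block to form corresponding blockchain data, the corresponding historical second quantized value can be converted into first quantized value. Through this bookkeeping, second quantized value that is redeemable in the future is quickly and conveniently converted into the corresponding circulating first quantized value. The sum of the first quantized value and the current second quantized value can thus represent the current computing capacity of the computing node.
S506: Determine a third quantized value corresponding to the uncompleted computing tasks in the subgraph data.
The third quantized value is the number obtained by quantizing the uncompleted computing tasks of a computing node. Specifically, the third quantized value may be the number corresponding to the computing tasks that the computing node still has to compute; it may be called a liability value and can measure the load state of the computing node. After the computing node executes and completes uncompleted computing tasks, the corresponding third quantized value can be converted into the same amount of second quantized value, and then into the same amount of first quantized value.
Specifically, a computing node may obtain its current uncompleted computing tasks in real time and determine the corresponding third quantized value from them. In some examples, the first, second, and third quantized values have the same unit, and all correspond to computing tasks.
In some examples, the overall task corresponding to the graph data to be processed changes continuously, and so do the subtasks generated by the subgraph data on each computing node. A computing node may generate computing tasks iteratively from the subgraph data. As the algorithm iterates, new computing tasks are generated in every iteration. Driven by the algorithm, each graph node in the subgraph data can continuously generate unit computing tasks that need to be completed.
In some examples, a computing node may determine the third quantized value corresponding to the uncompleted computing tasks in the subgraph data according to the subgraph data, the number of completed computing tasks, the content of the completed computing tasks, the global data shared in the blockchain network, and the local data.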
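The bookkeeping among the three quantized values can be made concrete with a small ledger. The sketch assumes one unit of value per unit computing task, as in the example given in the text; the class name `NodeLedger` and its method names are illustrative, and it models only the conversions liability to asset to currency described above.

```python
class NodeLedger:
    """Per-node accounting of the three quantized values (illustrative names)."""

    def __init__(self):
        self.currency = 0   # first quantized value: completed and recorded on chain
        self.asset = 0      # second quantized value: completed, not yet on chain
        self.liability = 0  # third quantized value: tasks still to be computed

    def add_pending(self, units):
        """New unit computing tasks generated by the algorithm."""
        self.liability += units

    def complete(self, units):
        """Finishing tasks converts liability into asset of the same amount."""
        if units > self.liability:
            raise ValueError("cannot complete more tasks than are pending")
        self.liability -= units
        self.asset += units

    def record_on_chain(self, units):
        """Writing results into a verified block converts asset into currency."""
        if units > self.asset:
            raise ValueError("cannot record more than the completed, unrecorded tasks")
        self.asset -= units
        self.currency += units

    def capacity(self):
        """Currency plus asset represents the node's current computing capacity."""
        return self.currency + self.asset

# Usage: 5 tasks generated, 3 completed, 2 of those written to the chain.
ledger = NodeLedger()
ledger.add_pending(5)
ledger.complete(3)
ledger.record_on_chain(2)
```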
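S508: When the second quantized value and the third quantized value do not meet the equilibrium condition, exchange graph nodes and first quantized values between computing nodes.
The equilibrium condition is a preset condition used to judge whether the current computing capacity and the load state of a computing node are in balance. An example of an equilibrium condition is that the contrast value between the second sum, namely the sum of the current third quantized values of the subgraph data, and the first sum, namely the sum of the current second quantized values of the subgraph data, lies within a specified range. The contrast value is a measure of the difference between two numbers and can be determined by mathematical calculation, for example by dividing the two numbers directly, dividing their logarithms, subtracting them, or applying some other operation before taking logarithms and dividing. The contrast value can measure how one number differs relative to another.
Specifically, when the second quantized value and the third quantized value do not meet the equilibrium condition, graph nodes and first quantized values may be exchanged between computing nodes so that the second and third quantized values continue to meet the equilibrium condition.
In some examples, the specified range corresponding to the equilibrium condition may be a preset fixed range, or a range determined by a function that varies with time.
In some examples, a computing node may obtain the first sum of the current second quantized values of the subgraph data and the second sum of the current third quantized values of the subgraph data, and determine the contrast value of the second sum relative to the first sum. When the contrast value falls outside the specified range, the second and third quantized values do not meet the equilibrium condition.
In some examples, a computing node may calculate the contrast value of the second sum relative to the first sum according to the following formula:
Figure PCTCN2019082192-appb-000005
where a and m are constants, a>0, a≠1, and m≥1;
Figure PCTCN2019082192-appb-000006
represents the second sum of the current third quantized values of the subgraph data; and
Figure PCTCN2019082192-appb-000007
represents the first sum of the current second quantized values of the subgraph data.
In some examples, when the contrast value is obtained by taking the logarithms of the second sum and the first sum and then dividing them, the minimum μ(t) and maximum λ(t) of the specified range are both linearly decreasing functions of time t, with μ(t)<λ(t). For example, the following formula expresses that the contrast value is within the specified range:
Figure PCTCN2019082192-appb-000008
In a specific example, the constant a may be 10 and the constant m may be 1, in which case the above formula simplifies to
Figure PCTCN2019082192-appb-000009
In some examples, when the contrast value is less than the minimum of the specified range, first quantized value of the current computing node is exchanged for corresponding graph nodes of other computing nodes, so as to keep the contrast value within the specified range. When the contrast value is greater than the maximum of the specified range, graph nodes of the current computing node are exchanged for corresponding first quantized value of other computing nodes, so as to keep the contrast value within the specified range.
Note that in the above examples the contrast value is that of the second sum relative to the first sum. The contrast value may equally be that of the first sum relative to the second sum, in which case the directions are reversed: when the contrast value is greater than the maximum of the specified range, first quantized value of the current computing node is exchanged for corresponding graph nodes of other computing nodes; and when the contrast value is less than the minimum of the specified range, graph nodes of the current computing node are exchanged for corresponding first quantized value of other computing nodes, so as to keep the contrast value within the specified range.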
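The equilibrium check can be sketched as follows. The exact formula appears only as an image in the source, so the form used here is an assumption: the logarithm to base a of (m plus the liability sum) divided by the logarithm to base a of (m plus the asset sum), which is consistent with the stated constraints on a and m but is not confirmed by the text. The function names and the returned action labels are likewise illustrative.

```python
import math

def contrast_value(liability_sum, asset_sum, a=10.0, m=1.0):
    """Assumed contrast value: log_a(m + liabilities) / log_a(m + assets)."""
    numerator = math.log(m + liability_sum, a)
    denominator = math.log(m + asset_sum, a)
    if denominator == 0:
        # No completed work yet: treat any pending load as unbounded imbalance.
        return math.inf if numerator > 0 else 1.0
    return numerator / denominator

def rebalance_action(liability_sum, asset_sum, mu, lam, a=10.0, m=1.0):
    """Decide the exchange direction for the current node given the range [mu, lam]."""
    ratio = contrast_value(liability_sum, asset_sum, a, m)
    if ratio < mu:
        return "trade_currency_for_nodes"  # under-loaded: acquire graph nodes
    if ratio > lam:
        return "trade_nodes_for_currency"  # over-loaded: hand off graph nodes
    return "balanced"

# Usage: a node with 99 pending units and 9 completed units is over-loaded.
action = rebalance_action(99, 9, mu=0.5, lam=1.5)
```

In the source the bounds mu and lam decrease linearly with time; here they are passed in as plain numbers for one point in time.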
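In some examples, the step of exchanging graph nodes and first quantized values between computing nodes specifically includes the following steps: among the computing nodes that are to make an exchange, determining a first party that provides the graph node to be exchanged and a second party that acquires it; determining a first estimated quantized value of the graph node to be exchanged as assessed by the first party; determining a second estimated quantized value of the graph node to be exchanged as assessed by the second party; determining, according to the first and second estimated quantized values, the first quantized value to be exchanged for the graph node; and exchanging the graph node for the determined first quantized value between the computing nodes that are to make the exchange.
An estimated quantized value is an estimate of the value obtained by quantizing the uncompleted computing tasks held by the graph node to be exchanged. The first estimated quantized value is that value as assessed by the first party, which provides the graph node to be exchanged. The second estimated quantized value is that value as assessed by the second party, which acquires the graph node to be exchanged.
Specifically, when a computing node is the first party that provides the graph node to be exchanged, it may determine the first estimated quantized value according to the relationship between the second quantized value of the graph node to be exchanged and the second quantized value of the first party, and the relationship between the third quantized value of the graph node to be exchanged and the third quantized value of the first party. For example, the computing node may calculate the first estimated quantized value by the following formula:
Figure PCTCN2019082192-appb-000010
where v_i represents the graph node to be exchanged; k represents the first party that provides the graph node to be exchanged; y_1(v_i, k) represents the first estimated quantized value of the graph node to be exchanged as assessed by the first party;
Figure PCTCN2019082192-appb-000011
represents the third quantized value of the graph node to be exchanged; C_i represents the second quantized value of the graph node to be exchanged;
Figure PCTCN2019082192-appb-000012
represents the average third quantized value of the graph nodes in the first party k that provides the graph node to be exchanged;
Figure PCTCN2019082192-appb-000013
represents the average second quantized value of the graph nodes in the first party k that provides the graph node to be exchanged; α and β are corresponding parameters; and e is the natural constant.
In some examples, when a computing node is the second party that acquires the graph node to be exchanged, it may take into account, when determining the second estimated quantized value, the computing tasks it will have to add and the communication distance it will save after acquiring the graph node. The computing node may therefore determine the second estimated quantized value according to the third quantized value of the graph node to be exchanged and the communication distance between that graph node and the graph nodes in the second party. For example, the computing node may calculate the second estimated quantized value according to the following formula:
Figure PCTCN2019082192-appb-000014
where v_i represents the graph node to be exchanged; l represents the second party that acquires the graph node to be exchanged; y_2(v_i, l) represents the second estimated quantized value of the graph node to be exchanged as assessed by the second party;
Figure PCTCN2019082192-appb-000015
represents the third quantized value of the graph node to be exchanged; and ∑_{j∈l} dist(i, j) represents the sum of the communication distances between the graph nodes j in the second party l and the graph node i to be exchanged.
In some examples, dist(i, j) may represent the communication distance between graph node i and graph node j. Because computing the communication distance between every pair of graph nodes in the graph data is extremely expensive, the computing node uses a mathematical approximation. For example, a local approximation may be used, in which the communication distance between graph nodes is calculated according to the following formula:
Figure PCTCN2019082192-appb-000016
where dist(i, j) represents the communication distance between a graph node j in the second party l and the graph node i to be exchanged; e_{i,j} ∈ E indicates that graph node j and graph node i are connected by an edge; and
Figure PCTCN2019082192-appb-000017
indicates that there is no edge between graph node j and graph node i.
That is, when graph node i and graph node j are connected by an edge, they can be regarded as approximately close to each other. After the computing node corresponding to the second party acquires the graph node to be exchanged, it reduces the corresponding communication distance, so the second estimated quantized value is higher. When there is no edge between graph node i and graph node j, the computing node corresponding to the second party does not reduce any communication distance by acquiring the graph node to be exchanged.
In some examples, a computing node may determine the first quantized value to be exchanged for the graph node according to the first and second estimated quantized values. For example, it may take the average of the first and second estimated quantized values as the first quantized value to be exchanged for the graph node. Alternatively, it may compute a weighted average of the first and second estimated quantized values according to given weights and use that as the first quantized value to be exchanged. The second party can then acquire the graph node from the first party in exchange for the determined first quantized value.
In the above examples, the first, second, and third quantized values are obtained, and when the second and third quantized values do not meet the equilibrium condition, graph nodes and first quantized values are exchanged between computing nodes. The first and second quantized values measure the computing capacity of a computing node and the third quantized value measures its load state, so both can be expressed accurately and intuitively by quantized values that correspond to computing tasks. Exchanging graph nodes and first quantized values between computing nodes keeps the second and third quantized values in equilibrium. As a result, task allocation does not depend on a particular server or node; instead, computing nodes coordinate with each other to allocate graph nodes and adjust the allocation dynamically, achieving self-organized load balancing. This avoids the single point of failure and network congestion associated with a dedicated server and greatly improves task scheduling efficiency. Moreover, this self-organized dynamic scheduling method can handle computing task scheduling for larger clusters, and adding or removing computing nodes dynamically does not affect existing computing tasks, so the system is highly scalable.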
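As shown in FIG. 6, in a specific example, the graph data processing method includes the following steps:
S602: Periodically detect task information that corresponds to the graph data to be processed and is broadcast in the blockchain network.
S604: When task information is detected, determine the load state of the current computing node.
S606: When the load state meets the preset load condition, pull the task information.
S608: According to the task information, read the corresponding task executable file and the subgraph data divided from the graph data to be processed.
S610: Execute the task executable file so as to execute the computing task for the subgraph data and obtain the corresponding global data and local data.
S612: Create a new block.
S614: Write the global data and the corresponding computing task execution log into the new block.
S616: After the new block is verified by the blockchain network, add the new block to the blockchain network; the global data in the blockchain network is updated by the distributed computing node cluster.
S618: Acquire the first quantized value corresponding to completed computing tasks for which corresponding blockchain data has been formed.
S620: Acquire the second quantized value corresponding to completed computing tasks for which corresponding blockchain data has not been formed.
S622: Determine the third quantized value corresponding to the uncompleted computing tasks in the subgraph data.
S624: When the second quantized value and the third quantized value do not meet the equilibrium condition, exchange graph nodes and first quantized values between computing nodes.
S626: Periodically detect the number of blocks in the blockchain network.
S628: When the number of blocks increases, acquire the global data recorded in the new block.
S630: Update the local global data according to the acquired global data.
S632: Iteratively execute the computing task for the subgraph data according to the acquired latest global data and the local data, until a calculation result is obtained when the iteration stop condition is met.
S634: Determine a faulty computing node and the corresponding lost data according to the computing task execution logs in the blockchain network.
S636: Determine the computing nodes that participate in reconstructing the lost data.
S638: Acquire local data related to the faulty computing node from the determined computing nodes.
S640: Reconstruct the lost data according to the acquired local data.
The above graph data processing method divides the graph data to be processed into subgraph data for distributed processing, which can greatly improve the processing efficiency of the graph data. The global data that each computing node obtains during the distributed computation and that needs to be shared globally is written into the blockchain network and shared globally through it, which greatly reduces the communication traffic spent on data sharing. Moreover, during the iterative computation on the graph data to be processed, the latest global data can be obtained quickly and directly from the blockchain network and the local data from the local cache, without a centralized driver to coordinate data updates, which greatly improves the efficiency of processing graph data.
FIG. 6 is a schematic flowchart of a graph data processing method in some examples. It should be understood that although the steps in the flowchart of FIG. 6 are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict restriction on the order of execution, and the steps may be executed in other orders. Moreover, at least some of the steps in FIG. 6 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times, and which are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In a specific example, the control node may write task information into a new block and publish it in the blockchain network. On detecting the task information, a computing node assesses its own load state; when the load state meets the preset load condition, it pulls the task information and obtains the subgraph data and the task executable file from the data warehouse according to the task information. The computing node runs the task executable file to compute on the subgraph data. During the computation, the computing node may write information such as the computed global data, the computing task execution log, and the computing node status into the blockchain network. It may also obtain the corresponding first, second, and third quantized values, and when the second and third quantized values do not meet the equilibrium condition, graph nodes and first quantized values are exchanged between computing nodes. The iterative computation on the graph data continues in this way until a calculation result is obtained when the iteration stop condition is met. The computing node may record the calculation result in the blockchain or store it in the data warehouse.
In a practical application scenario such as the TSP (Traveling Salesman Problem), the above graph data processing method can be used to find a good solution efficiently. The TSP asks how a salesman can visit every city (graph node) and return to the starting city while covering the shortest distance. It is an NP (non-deterministic polynomial) complete problem, for which an approximate solution is usually sought. With the graph data processing method of this solution, the TSP can be split into shortest-traversal problems on subgraphs, that is, the global optimum is approximated by searching for local optima. The global data is the path in the current graph data to be processed and its total length, while the local data includes the better path lengths within each piece of subgraph data. The computation is a process of continuously optimizing the path selection. Computing nodes share global data by writing it into the blockchain network, which can greatly reduce the network traffic consumed during the computation. Computing tasks are executed according to the continuously updated global data and local data, and the optimal solution is obtained.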
如图7所示,在一些实例中,提供了一种图数据的计算任务发布方法。本实例主要以该方法应用于上述图1中的控制节点110来举例说明。参照图7,该图数据的计算任务发布方法具体包括如下步骤:
S702,获取与待处理图数据相对应的任务信息。
在一些实例中,控制节点可获取本地存储的与待处理图数据相对应的任务信息,或者通过网络连接、接口连接等方式获取其他设备处所存储的任务信息。
在一些实例中,控制节点可接收任务添加指令,根据任务添加指令展示任务编辑界面。用户可通过任务编辑界面输入任务信息。控制节点可获取输入至任务编辑界面中的与待处理图数据相对应的任务信息。
图8示出了一些实例中任务编辑界面的界面示意图。如图8所示,控制节点可展示任务编辑界面,具体地,可在任务编辑界面的上方展示任务数量、运行中任务数、成功任务数、失败任务数等信息。在任务编辑界面的中间部位以表格的形式展示各任务的相关信息,比如任务名称、运行时间、运行命令、进度,以及可对已存在的任务进行状态变更的处理控件。在任务编辑界面的下方展示添加任务的文本框,当用户点击“添加任务”的按钮时,可在任务编辑界面的下方对应的文本框中输入相应的任务信息,比如算法参数、任务可执行文件的获取路径、输入数据路径、结果输出路径,以及期望完成的时间等。
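S704: Create a new block.
Specifically, when a new computing task is to be published, the control node may create a new block and publish the task information by writing it into the new block.
In some examples, when the control node modifies information such as the parameters or the output result path of a computing task that already exists or is being executed, it may also create a new block and write the modified content into the new block, so as to update the task information in the blockchain network.
In some examples, when the control node terminates a computing task that already exists or is being executed, it may mark the task as unavailable in the blockchain. This change is likewise broadcast in the blockchain network by adding a new block.
S706: Write the task information into the new block.
Specifically, the control node may write the obtained task information into the new block.
S708: After the new block is verified by the blockchain network, add the new block to the blockchain network.
In some examples, the blockchain network may use a corresponding consensus algorithm to verify the new block. After the new block is verified by the blockchain network, the control node may add the new block to the blockchain network.
S710: Broadcast the task information in the blockchain network. The broadcast task information is used to instruct a computing node to obtain the sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result.
Specifically, after the new block used by the control node to record the task information is verified by the blockchain network, the task information in the new block may be broadcast through the blockchain network. The broadcast task information is used to instruct a computing node to obtain the sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result.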
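Steps S704 to S710 can be sketched with a toy, single-process hash chain. Everything here is an assumption made for illustration: the `ToyChain` class, the block layout, and in particular the `verify` method, which only checks the hash link and stands in for whatever consensus algorithm a real blockchain network would use.

```python
import hashlib
import json


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps({"index": index, "prev": prev_hash, "payload": payload},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class ToyChain:
    def __init__(self):
        genesis = {"index": 0, "prev": "0" * 64, "payload": {}}
        genesis["hash"] = block_hash(0, genesis["prev"], genesis["payload"])
        self.blocks = [genesis]
        self.subscribers = []  # callbacks standing in for computing nodes

    def create_block(self, payload):
        """S704 and S706: create a new block and write the payload into it."""
        tip = self.blocks[-1]
        index = tip["index"] + 1
        return {"index": index, "prev": tip["hash"], "payload": payload,
                "hash": block_hash(index, tip["hash"], payload)}

    def verify(self, block):
        """Stand-in for consensus: check position, hash link, and block hash."""
        tip = self.blocks[-1]
        return (block["index"] == tip["index"] + 1
                and block["prev"] == tip["hash"]
                and block["hash"] == block_hash(block["index"], block["prev"],
                                                block["payload"]))

    def publish(self, task_info):
        """S708 and S710: append the verified block, then broadcast the task."""
        block = self.create_block({"type": "task", "task": task_info})
        if not self.verify(block):
            return False
        self.blocks.append(block)
        for notify in self.subscribers:
            notify(task_info)
        return True


chain = ToyChain()
received = []
chain.subscribers.append(received.append)
task = {"name": "demo-task", "input": "hypothetical/input/path"}
published = chain.publish(task)

# A block whose payload is altered after hashing no longer verifies.
tampered = chain.create_block({"x": 1})
tampered["payload"] = {"x": 2}
tamper_ok = chain.verify(tampered)
```

Modifying or terminating a task, as described above, would follow the same path with a different payload, since every change is expressed as a new block.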
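With the foregoing method for publishing a computing task of graph data, the obtained task information is written into a new block in the blockchain network so that it is broadcast and published in the blockchain network. Other nodes in the blockchain network can then pull the corresponding task information, obtain the sub-graph data partitioned from the to-be-processed graph data, and iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result. In this way, the computing nodes in the blockchain network can receive the published task information quickly and effectively at very low network communication cost, which greatly improves the efficiency of publishing computing tasks of graph data and, in turn, the efficiency of computing on graph data.
In some examples, the task execution state of the computing nodes in the blockchain network may also be displayed through the control node. Specifically, the control node may receive a task execution state display instruction; access, according to the instruction, the computing task execution logs of the corresponding computing nodes in the blockchain network to obtain task execution state information; display a task execution state display interface; and display, in that interface, the task execution state information corresponding to the computing tasks that exist in the blockchain network.
FIG. 9 is a schematic structural diagram of the task execution state display interface in some examples. As shown in FIG. 9, the task execution state display interface may display information such as the number of computing nodes, the number of tasks, the continuous running time, the cumulative running time, the number of successful tasks, the number of failed tasks, the number of running tasks, the CPU usage, the network throughput, the memory usage, and the node availability. This information may also be presented with the aid of charts. For a specific computing node, the control node may access that node's computing task execution log on the blockchain network and display it in the task execution state display interface.
FIG. 10 is a schematic structural diagram of a computing node state display interface in some examples. As shown in FIG. 10, the computing node state display interface may display information such as the number of computing nodes, the computing node availability, the number of faulty computing nodes, the CPU usage, the memory usage, and the network throughput. The interface may also display the state information of each node, such as the computing node name, running time, current state, load rate, and a log viewing option. A text box for adding a node is displayed at the bottom of the computing node state display interface. When the user clicks the "Add node" button, the corresponding computing node information can be entered in the text box at the bottom of the interface to add a computing node. The computing node information includes, for example, global data, node parameters, log options, and computing tasks.
In some examples, the running parameters of the system may be set through the control node, including parameters of the blockchain network such as the block size, update speed, verification method, and encryption algorithm.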
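As shown in FIG. 11, in a specific example, the method for publishing a computing task of graph data includes the following steps:
S1102: Receive a task adding instruction.
S1104: Display a task editing interface according to the task adding instruction.
S1106: Obtain the task information that corresponds to the to-be-processed graph data and that is entered into the task editing interface.
S1108: Create a new block.
S1110: Write the task information into the new block.
S1112: After the new block is verified by the blockchain network, add the new block to the blockchain network.
S1114: Broadcast the task information in the blockchain network. The broadcast task information is used to instruct a computing node to obtain the sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result.
With the foregoing method for publishing a computing task of graph data, the obtained task information is written into a new block in the blockchain network so that it is broadcast and published in the blockchain network. Other nodes in the blockchain network can then pull the corresponding task information, obtain the sub-graph data partitioned from the to-be-processed graph data, and iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result. In this way, the computing nodes in the blockchain network can receive the published task information quickly and effectively at very low network communication cost, which greatly improves the efficiency of publishing computing tasks of graph data and, in turn, the efficiency of computing on graph data.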
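As shown in FIG. 12, in some examples, a graph data processing apparatus 1200 is provided, including a computing management module 1201 and a communication module 1202.
The computing management module 1201 is configured to obtain the sub-graph data partitioned from the to-be-processed graph data.
The computing management module 1201 is further configured to perform the computing task on the sub-graph data to obtain the corresponding global data and local data.
The communication module 1202 is configured to write the global data into the blockchain network; the global data in the blockchain network is updated by the distributed computing node cluster.
The communication module 1202 is further configured to obtain the latest global data from the blockchain network.
The computing management module 1201 is further configured to iteratively perform the computing task on the sub-graph data according to the obtained latest global data and the local data, until a computing result is obtained when an iteration stop condition is met.
In some examples, the communication module 1202 is further configured to obtain the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data. The computing management module 1201 is further configured to read, according to the task information, the corresponding task executable file and the sub-graph data partitioned from the to-be-processed graph data, and to execute the task executable file so as to perform the computing task on the sub-graph data and obtain the corresponding global data and local data.
In some examples, the communication module 1202 is further configured to periodically detect the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data; when the task information is detected, determine the load state of the current computing node; and when the load state meets a preset load condition, pull the task information.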
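The detect, check load, then pull behaviour can be sketched as a single polling pass. The function name `poll_once`, the task fields `id` and `est_load`, and the specific load condition (current load plus the task's estimated load must not exceed `max_load`) are hypothetical; the application only states that some preset load condition must be met.

```python
def poll_once(broadcast_tasks, seen_ids, current_load, max_load=0.8):
    """One timer tick: pull every unseen task that still fits under the load limit.

    Returns the pulled tasks and the node's load after accepting them.
    """
    pulled = []
    for task in broadcast_tasks:
        if task["id"] in seen_ids:
            continue  # already pulled on an earlier tick
        # Hypothetical preset load condition: stay at or below max_load.
        if current_load + task["est_load"] <= max_load:
            seen_ids.add(task["id"])
            pulled.append(task)
            current_load += task["est_load"]
    return pulled, current_load


tasks = [{"id": "t1", "est_load": 0.25}, {"id": "t2", "est_load": 0.25}]
seen = set()
first_pull, load_after = poll_once(tasks, seen, current_load=0.5)
# Later, once the node is less busy, the skipped task can be picked up.
second_pull, _ = poll_once(tasks, seen, current_load=0.25)
```

Because each node decides for itself whether to pull, no central scheduler is needed to assign tasks; a busy node simply leaves the task for others or for a later tick.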
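In some examples, the communication module 1202 includes a new block generation submodule 12021. The new block generation submodule 12021 is configured to create a new block; write the global data into the created new block; and, after the new block is verified by the blockchain network, add the new block to the blockchain network.
In some examples, the new block generation submodule 12021 is further configured to write the global data and the corresponding computing task execution log into the new block. The computing management module 1201 further includes a data recovery submodule 12011. The data recovery submodule 12011 is configured to determine a faulty computing node and the corresponding lost data according to the computing task execution logs in the blockchain network; determine the computing nodes that participate in reconstructing the lost data; obtain, from the determined computing nodes, local data related to the faulty computing node; and reconstruct the lost data according to the obtained local data.
In some examples, the data recovery submodule 12011 is further configured to search locally for local data related to the faulty computing node, and share the found local data with the computing nodes that participate in reconstructing the lost data; the shared local data is used to reconstruct the lost data.
In some examples, the communication module 1202 is further configured to periodically detect the number of blocks in the blockchain network; when the number of blocks increases, obtain the global data recorded in the new block; and update the local global data according to the obtained global data.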
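The block-count check can be sketched as follows, assuming (hypothetically) that the chain is visible as a list of dictionaries and that a block carrying shared state stores it under a `"global_data"` key. The class name `GlobalDataSync` is likewise invented for the example.

```python
class GlobalDataSync:
    """Keeps a local copy of the global data in step with the chain."""

    def __init__(self):
        self.known_height = 0   # number of blocks already processed
        self.global_data = {}   # local copy of the shared global data

    def poll(self, blocks):
        """Timer callback: apply any blocks added since the last poll.

        Returns True when the block count grew and the local copy was updated.
        """
        if len(blocks) <= self.known_height:
            return False
        for block in blocks[self.known_height:]:
            # Blocks without global data (e.g. task blocks) change nothing.
            self.global_data.update(block.get("global_data", {}))
        self.known_height = len(blocks)
        return True


sync = GlobalDataSync()
chain_blocks = [{"global_data": {"best_len": 12.0}}]
changed_1 = sync.poll(chain_blocks)
changed_2 = sync.poll(chain_blocks)            # nothing new
chain_blocks.append({"task": "hypothetical"})  # block without global data
chain_blocks.append({"global_data": {"best_len": 10.5}})
changed_3 = sync.poll(chain_blocks)
```

Comparing only the block count keeps each poll cheap; the node reads block contents only when the count has actually grown.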
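As shown in FIG. 13, in some examples, the graph data processing apparatus 1200 further includes a scheduling management module 1203. The scheduling management module 1203 is configured to obtain a first quantized value corresponding to completed computing tasks for which corresponding blockchain data has been formed; obtain a second quantized value corresponding to completed computing tasks for which corresponding blockchain data has not been formed; determine a third quantized value corresponding to the uncompleted computing tasks in the sub-graph data; and, when the second quantized value and the third quantized value do not meet the equilibrium condition, exchange graph nodes and first quantized values between computing nodes.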
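The following sketch illustrates one way the three quantized values could drive an exchange. It rests on several assumptions that are not taken from this passage: the equilibrium condition is modelled as the ratio of the third value to the second value staying within a fixed range, each graph node is assumed to carry one unit of unfinished work, and the node that takes over a graph node is assumed to pay for it with its first quantized value.

```python
from dataclasses import dataclass, field


@dataclass
class NodeState:
    first: float    # completed tasks already recorded as blockchain data
    second: float   # completed tasks not yet recorded as blockchain data
    third: float    # unfinished tasks in the node's sub-graph data
    graph_nodes: list = field(default_factory=list)


def is_balanced(state, low=0.5, high=2.0):
    """Hypothetical equilibrium rule: third/second must stay within [low, high]."""
    if state.second == 0:
        return state.third == 0
    return low <= state.third / state.second <= high


def exchange(seller, buyer, price=1.0, work=1.0):
    """Move one graph node from seller to buyer; the buyer pays in first value."""
    if not seller.graph_nodes or buyer.first < price:
        return False
    buyer.graph_nodes.append(seller.graph_nodes.pop())
    seller.third -= work   # the seller's backlog shrinks
    buyer.third += work    # the buyer's backlog grows
    buyer.first -= price
    seller.first += price
    return True


overloaded = NodeState(first=1.0, second=1.0, third=4.0,
                       graph_nodes=["a", "b", "c", "d"])
idle = NodeState(first=3.0, second=2.0, third=1.0, graph_nodes=["x"])
was_balanced = is_balanced(overloaded)
while not is_balanced(overloaded) and exchange(overloaded, idle):
    pass
```

The loop stops as soon as the overloaded node is back inside the assumed range, so only as many graph nodes move as are needed to restore balance.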
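With the foregoing graph data processing apparatus, the to-be-processed graph data is partitioned into sub-graph data for distributed processing, which can greatly improve the processing efficiency of graph data. The global data that each computing node obtains during the distributed computation and that needs to be shared globally is then written into the blockchain network and shared globally through the blockchain network, which greatly reduces the communication traffic consumed by data sharing. In addition, during iterative computation on the to-be-processed graph data, the latest global data can be obtained quickly and directly from the blockchain network, together with the local data cached locally, without a centralized driver to coordinate data updates, which greatly improves the efficiency of graph data processing.
In a specific application scenario, as shown in FIG. 14, the graph data processing apparatus in some examples may include a communication module, a data cache module, a scheduling management module, and a computing management module. The communication module includes a new block generation submodule, a transaction accounting submodule, a log management submodule, and a state reporting submodule. The data cache module includes a global data cache submodule, a local data cache submodule, an update timer submodule, and a data read/write-back submodule. The scheduling management module includes a multi-task coordination submodule, a self-organizing network submodule, a process control submodule, and a key management submodule. The computing management module includes a computation execution submodule, an internal variable storage submodule, a data recovery submodule, and a result verification submodule.
In some examples, as a simplified implementation of the example shown in FIG. 14, the global data cache submodule and the local data cache submodule of the data cache module may be placed in the computing management module, and the internal variable storage may be allowed to access the data warehouse directly. The data cache module is then omitted: data caching is handled entirely by the computing management module, and the caching policy is defined and driven by the algorithm being executed, which yields a more compact structure.
The following uses FIG. 14 as an example to describe in detail how the modules cooperate.
The communication module is responsible for all communication and interaction with the blockchain network, which includes both keeping the blockchain network running normally and maintaining effective information transfer between the other modules and the blockchain network. The information exchanged with the blockchain network mainly includes the global data, the algorithm configuration of the computing task, and the computing node state data. The new block generation submodule in the communication module is used to write the current global data, the computing node state data (including the second quantized value, the third quantized value, and data such as graph node indexes), and the graph node exchange information generated since the previous block into a new block. The transaction accounting submodule is used to handle graph node exchanges, that is, the buying and selling of graph nodes and the management of the first quantized value. The log management submodule is used to organize the computing task execution logs generated by the current computing node, and to pull and organize the required computing task execution logs from the blockchain as needed, which is very important in the data recovery process for a faulty computing node. The state reporting submodule obtains the state information of the computing node from the computing management module, including the CPU and memory resources consumed by the computing task currently being executed, the algorithm name and type, the number of errors, and so on.
In some examples, the communication module may further include a verification module to ensure the reliability of new block generation and accounting.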
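The data cache module includes four submodules: global data cache, local data cache, update timer, and data read/write-back. The data update policy is similar to the common FIFO and LFU policies: driven by the timer, the oldest cache entries are replaced.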
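A timer-driven cache that replaces its oldest entries might look like the sketch below. The class `TimedCache`, its `capacity` and `max_age` parameters, and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are assumptions for illustration; the application does not specify the cache interface.

```python
from collections import OrderedDict


class TimedCache:
    """FIFO-style cache: the oldest entry is replaced first."""

    def __init__(self, capacity, max_age):
        self.capacity = capacity
        self.max_age = max_age
        self._items = OrderedDict()  # key -> (stored_at, value), oldest first

    def put(self, key, value, now):
        """Store a value; when full, the oldest entry makes room for it."""
        self._items.pop(key, None)  # re-inserting a key makes it the newest
        if len(self._items) >= self.capacity:
            self._items.popitem(last=False)
        self._items[key] = (now, value)

    def get(self, key):
        entry = self._items.get(key)
        return None if entry is None else entry[1]

    def tick(self, now):
        """Called by the update timer: drop entries older than max_age."""
        expired = [k for k, (stored_at, _) in self._items.items()
                   if now - stored_at > self.max_age]
        for key in expired:
            del self._items[key]
        return expired


cache = TimedCache(capacity=2, max_age=10)
cache.put("a", 1, now=0)
cache.put("b", 2, now=1)
cache.put("c", 3, now=2)      # cache is full, so "a" (the oldest) is replaced
expired = cache.tick(now=12)  # "b" is 11 time units old, "c" exactly 10
```

A global data cache and a local data cache could each be an instance of such a structure with different capacities and ages.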
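The scheduling management module is the core of the entire distributed graph computing system and drives the orderly operation of the whole system. The self-organizing network submodule is responsible for executing the self-organization policy, including determining the first quantized value used for exchanging graph nodes, transaction decisions, and asset accounting and management. The multi-task coordination submodule controls the data of the multiple different tasks being computed; common queue scheduling and priority scheduling methods may be used here for control and coordination. The key management submodule obtains the authorization information of the computing node and the corresponding key from the data warehouse. The process control submodule monitors the completeness and validity of the computing process, in particular the handling of the relevant context when the computing process needs to be rolled back.
The computation execution submodule in the computing management module obtains the required task executable file from the data warehouse according to the instructions of the scheduling management module in order to complete the relevant computation. The corresponding variables are stored in the internal variable storage submodule, and the data needs to be stored in isolation according to the different computing tasks and graph nodes. The data recovery submodule is used to attempt to effectively recover the data stored on a faulty computing node when one appears in the blockchain network.
Compared with a centralized graph data processing apparatus, the graph data processing apparatus in the examples of this application has the following advantages:
High reliability: because all computing nodes can compute spontaneously, a single computing node going offline does not affect the overall computation or cause the task to fail, so the whole system has high reliability.
High flexibility: as noted in the previous point, a newly added node can likewise join the computing network quickly and adaptively, so the scale of the computing network can be expanded flexibly.
Adaptability to heterogeneous computing: because load balancing is autonomous, heterogeneous devices that join the network can be coordinated in a similar way, and differences in computing capability and computing mode do not create bottlenecks.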
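FIG. 15 is a diagram of the internal structure of a computer device in some examples. The computer device may specifically be the computer device on which the computing node 120 in FIG. 1 is distributed. As shown in FIG. 15, the computer device includes a processor, a memory, and a network interface connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the graph data processing method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the graph data processing method.
A person skilled in the art can understand that the structure shown in FIG. 15 is only a block diagram of part of the structure related to the solution of this application and does not limit the computer device to which the solution of this application is applied. A specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different component arrangement.
In some examples, the graph data processing apparatus provided in this application may be implemented in the form of a computer program that can run on the computer device shown in FIG. 15. The memory of the computer device may store the program modules that make up the graph data processing apparatus, for example, the computing management module and the communication module shown in FIG. 12. The computer program formed by these program modules causes the processor to perform the steps of the graph data processing method of the examples of this application described in this specification.
For example, the computer device shown in FIG. 15 may perform steps S202, S204, and S210 through the computing management module of the graph data processing apparatus shown in FIG. 12, and may perform steps S206 and S208 through the communication module.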
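As shown in FIG. 16, in some examples, an apparatus 1600 for publishing a computing task of graph data is provided, including a task management module 1601 and a communication module 1602.
The task management module 1601 is configured to obtain task information corresponding to the to-be-processed graph data.
The communication module 1602 is configured to create a new block.
The communication module 1602 is further configured to write the task information into the new block.
The communication module 1602 is further configured to add the new block to the blockchain network after the new block is verified by the blockchain network.
The communication module 1602 is further configured to broadcast the task information in the blockchain network. The broadcast task information is used to instruct a computing node to obtain the sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result.
In some examples, the task management module 1601 is further configured to receive a task adding instruction; display a task editing interface according to the task adding instruction; and obtain the task information that corresponds to the to-be-processed graph data and that is entered into the task editing interface.
With the foregoing apparatus for publishing a computing task of graph data, the obtained task information is written into a new block in the blockchain network so that it is broadcast and published in the blockchain network. Other nodes in the blockchain network can then pull the corresponding task information, obtain the sub-graph data partitioned from the to-be-processed graph data, and iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result. In this way, the computing nodes in the blockchain network can receive the published task information quickly and effectively at very low network communication cost, which greatly improves the efficiency of publishing computing tasks of graph data and, in turn, the efficiency of computing on graph data.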
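FIG. 17 is a diagram of the internal structure of a computer device in some examples. The computer device may specifically be the control node 110 in FIG. 1. As shown in FIG. 17, the computer device includes a processor, a memory, a network interface, an input apparatus, and a display screen connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the method for publishing a computing task of graph data. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the method for publishing a computing task of graph data. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input apparatus of the computer device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, mouse, or the like.
A person skilled in the art can understand that the structure shown in FIG. 17 is only a block diagram of part of the structure related to the solution of this application and does not limit the computer device to which the solution of this application is applied. A specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different component arrangement.
In some examples, the apparatus for publishing a computing task of graph data provided in this application may be implemented in the form of a computer program that can run on the computer device shown in FIG. 17. The memory of the computer device may store the program modules that make up the apparatus, for example, the task management module and the communication module shown in FIG. 16. The computer program formed by these program modules causes the processor to perform the steps of the method for publishing a computing task of graph data of the examples of this application described in this specification.
For example, the computer device shown in FIG. 17 may perform step S702 through the task management module of the apparatus for publishing a computing task of graph data shown in FIG. 16, and may perform steps S704, S706, and S708 through the communication module.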
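In some examples, this solution provides a distributed graph computing system, including a computing node and a control node. The control node is configured to obtain task information corresponding to the to-be-processed graph data; create a new block; write the task information into the new block; after the new block is verified by the blockchain network, add the new block to the blockchain network; and broadcast the task information in the blockchain network, where the broadcast task information is used to instruct the computing node to obtain the sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result. The computing node is configured to obtain the sub-graph data partitioned from the to-be-processed graph data; perform the computing task on the sub-graph data to obtain the corresponding global data and local data; write the global data into the blockchain network, where the global data in the blockchain network is updated by the distributed computing node cluster; obtain the latest global data from the blockchain network; and iteratively perform the computing task on the sub-graph data according to the obtained latest global data and the local data, until a computing result is obtained when an iteration stop condition is met.
In the foregoing distributed graph computing system, the control node writes the obtained task information into a new block in the blockchain network so that it is broadcast and published in the blockchain network. The computing nodes in the blockchain network can thus receive the published task information quickly and effectively at very low network communication cost, which greatly improves the efficiency of publishing computing tasks of graph data. After the computing nodes obtain the task information, the to-be-processed graph data can be partitioned into sub-graph data for distributed processing, which can greatly improve the processing efficiency of graph data. The global data that each computing node obtains during the distributed computation and that needs to be shared globally is then written into the blockchain network and shared globally through the blockchain network, which greatly reduces the communication traffic consumed by data sharing. In addition, during iterative computation on the to-be-processed graph data, the latest global data can be obtained quickly and directly from the blockchain network, together with the local data cached locally, without a centralized driver to coordinate data updates, which greatly improves the efficiency of graph data processing.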
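In some examples, a computer device is provided, including a memory and a processor. The memory stores a computer program which, when executed by the processor, causes the processor to perform the following steps: obtaining the sub-graph data partitioned from the to-be-processed graph data; performing the computing task on the sub-graph data to obtain the corresponding global data and local data; writing the global data into the blockchain network, where the global data in the blockchain network is updated by the distributed computing node cluster; obtaining the latest global data from the blockchain network; and iteratively performing the computing task on the sub-graph data according to the obtained latest global data and the local data, until a computing result is obtained when an iteration stop condition is met.
In some examples, when performing the step of obtaining the sub-graph data partitioned from the to-be-processed graph data, the computer program causes the processor to specifically perform the following steps: obtaining the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data; and reading, according to the task information, the corresponding task executable file and the sub-graph data partitioned from the to-be-processed graph data. When performing the step of performing the computing task on the sub-graph data to obtain the corresponding global data and local data, the computer program causes the processor to specifically perform the following step: executing the task executable file so as to perform the computing task on the sub-graph data and obtain the corresponding global data and local data.
In some examples, when performing the step of obtaining the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data, the computer program causes the processor to specifically perform the following steps: periodically detecting the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data; when the task information is detected, determining the load state of the current computing node; and when the load state meets a preset load condition, pulling the task information.
In some examples, when performing the step of writing the global data into the blockchain network, the computer program causes the processor to specifically perform the following steps: creating a new block; writing the global data into the created new block; and, after the new block is verified by the blockchain network, adding the new block to the blockchain network.
In some examples, when performing the step of writing the global data into the created new block, the computer program causes the processor to specifically perform the following step: writing the global data and the corresponding computing task execution log into the new block. The computer program further causes the processor to perform the following steps: determining a faulty computing node and the corresponding lost data according to the computing task execution logs in the blockchain network; determining the computing nodes that participate in reconstructing the lost data; obtaining, from the determined computing nodes, local data related to the faulty computing node; and reconstructing the lost data according to the obtained local data.
In some examples, the computer program further causes the processor to perform the following steps: searching locally for local data related to the faulty computing node; and sharing the found local data with the computing nodes that participate in reconstructing the lost data, where the shared local data is used to reconstruct the lost data.
In some examples, when performing the step of obtaining the latest global data from the blockchain network, the computer program causes the processor to specifically perform the following steps: periodically detecting the number of blocks in the blockchain network; when the number of blocks increases, obtaining the global data recorded in the new block; and updating the local global data according to the obtained global data.
In some examples, the computer program further causes the processor to perform the following steps: obtaining a first quantized value corresponding to completed computing tasks for which corresponding blockchain data has been formed; obtaining a second quantized value corresponding to completed computing tasks for which corresponding blockchain data has not been formed; determining a third quantized value corresponding to the uncompleted computing tasks in the sub-graph data; and, when the second quantized value and the third quantized value do not meet the equilibrium condition, exchanging graph nodes and first quantized values between computing nodes.
With the foregoing computer device, the to-be-processed graph data is partitioned into sub-graph data for distributed processing, which can greatly improve the processing efficiency of graph data. The global data that each computing node obtains during the distributed computation and that needs to be shared globally is then written into the blockchain network and shared globally through the blockchain network, which greatly reduces the communication traffic consumed by data sharing. In addition, during iterative computation on the to-be-processed graph data, the latest global data can be obtained quickly and directly from the blockchain network, together with the local data cached locally, without a centralized driver to coordinate data updates, which greatly improves the efficiency of graph data processing.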
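In some examples, a computer device is provided, including a memory and a processor. The memory stores a computer program which, when executed by the processor, causes the processor to perform the following steps: obtaining task information corresponding to the to-be-processed graph data; creating a new block; writing the task information into the new block; after the new block is verified by the blockchain network, adding the new block to the blockchain network; and broadcasting the task information in the blockchain network, where the broadcast task information is used to instruct a computing node to obtain the sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result.
In some examples, when performing the step of obtaining the task information corresponding to the to-be-processed graph data, the computer program causes the processor to specifically perform the following steps: receiving a task adding instruction; displaying a task editing interface according to the task adding instruction; and obtaining the task information that corresponds to the to-be-processed graph data and that is entered into the task editing interface.
With the foregoing computer device, the obtained task information is written into a new block in the blockchain network so that it is broadcast and published in the blockchain network. Other nodes in the blockchain network can then pull the corresponding task information, obtain the sub-graph data partitioned from the to-be-processed graph data, and iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result. In this way, the computing nodes in the blockchain network can receive the published task information quickly and effectively at very low network communication cost, which greatly improves the efficiency of publishing computing tasks of graph data and, in turn, the efficiency of computing on graph data.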
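A computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps: obtaining the sub-graph data partitioned from the to-be-processed graph data; performing the computing task on the sub-graph data to obtain the corresponding global data and local data; writing the global data into the blockchain network, where the global data in the blockchain network is updated by the distributed computing node cluster; obtaining the latest global data from the blockchain network; and iteratively performing the computing task on the sub-graph data according to the obtained latest global data and the local data, until a computing result is obtained when an iteration stop condition is met.
In some examples, when performing the step of obtaining the sub-graph data partitioned from the to-be-processed graph data, the computer program causes the processor to specifically perform the following steps: obtaining the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data; and reading, according to the task information, the corresponding task executable file and the sub-graph data partitioned from the to-be-processed graph data. When performing the step of performing the computing task on the sub-graph data to obtain the corresponding global data and local data, the computer program causes the processor to specifically perform the following step: executing the task executable file so as to perform the computing task on the sub-graph data and obtain the corresponding global data and local data.
In some examples, when performing the step of obtaining the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data, the computer program causes the processor to specifically perform the following steps: periodically detecting the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data; when the task information is detected, determining the load state of the current computing node; and when the load state meets a preset load condition, pulling the task information.
In some examples, when performing the step of writing the global data into the blockchain network, the computer program causes the processor to specifically perform the following steps: creating a new block; writing the global data into the created new block; and, after the new block is verified by the blockchain network, adding the new block to the blockchain network.
In some examples, when performing the step of writing the global data into the created new block, the computer program causes the processor to specifically perform the following step: writing the global data and the corresponding computing task execution log into the new block. The computer program further causes the processor to perform the following steps: determining a faulty computing node and the corresponding lost data according to the computing task execution logs in the blockchain network; determining the computing nodes that participate in reconstructing the lost data; obtaining, from the determined computing nodes, local data related to the faulty computing node; and reconstructing the lost data according to the obtained local data.
In some examples, the computer program further causes the processor to perform the following steps: searching locally for local data related to the faulty computing node; and sharing the found local data with the computing nodes that participate in reconstructing the lost data, where the shared local data is used to reconstruct the lost data.
In some examples, when performing the step of obtaining the latest global data from the blockchain network, the computer program causes the processor to specifically perform the following steps: periodically detecting the number of blocks in the blockchain network; when the number of blocks increases, obtaining the global data recorded in the new block; and updating the local global data according to the obtained global data.
In some examples, the computer program further causes the processor to perform the following steps: obtaining a first quantized value corresponding to completed computing tasks for which corresponding blockchain data has been formed; obtaining a second quantized value corresponding to completed computing tasks for which corresponding blockchain data has not been formed; determining a third quantized value corresponding to the uncompleted computing tasks in the sub-graph data; and, when the second quantized value and the third quantized value do not meet the equilibrium condition, exchanging graph nodes and first quantized values between computing nodes.
With the foregoing computer-readable storage medium, the to-be-processed graph data is partitioned into sub-graph data for distributed processing, which can greatly improve the processing efficiency of graph data. The global data that each computing node obtains during the distributed computation and that needs to be shared globally is then written into the blockchain network and shared globally through the blockchain network, which greatly reduces the communication traffic consumed by data sharing. In addition, during iterative computation on the to-be-processed graph data, the latest global data can be obtained quickly and directly from the blockchain network, together with the local data cached locally, without a centralized driver to coordinate data updates, which greatly improves the efficiency of graph data processing.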
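A computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps: obtaining task information corresponding to the to-be-processed graph data; creating a new block; writing the task information into the new block; after the new block is verified by the blockchain network, adding the new block to the blockchain network; and broadcasting the task information in the blockchain network, where the broadcast task information is used to instruct a computing node to obtain the sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result.
In some examples, when performing the step of obtaining the task information corresponding to the to-be-processed graph data, the computer program causes the processor to specifically perform the following steps: receiving a task adding instruction; displaying a task editing interface according to the task adding instruction; and obtaining the task information that corresponds to the to-be-processed graph data and that is entered into the task editing interface.
With the foregoing computer-readable storage medium, the obtained task information is written into a new block in the blockchain network so that it is broadcast and published in the blockchain network. Other nodes in the blockchain network can then pull the corresponding task information, obtain the sub-graph data partitioned from the to-be-processed graph data, and iteratively perform the computing task on the sub-graph data based on the blockchain network to obtain a computing result. In this way, the computing nodes in the blockchain network can receive the published task information quickly and effectively at very low network communication cost, which greatly improves the efficiency of publishing computing tasks of graph data and, in turn, the efficiency of computing on graph data.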
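A person of ordinary skill in the art can understand that all or part of the processes in the methods of the foregoing examples can be completed by a computer program instructing relevant hardware. The program may be stored in a non-volatile computer-readable storage medium, and when executed, the program may include the processes of the examples of the foregoing methods. Any reference to a memory, storage, database, or other medium used in the examples provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or an external cache. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the foregoing examples may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the foregoing examples are described. However, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The foregoing examples express only several implementations of this application. Their description is relatively specific and detailed, but it should not therefore be understood as limiting the patent scope of this application. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of the patent of this application shall be subject to the appended claims.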

Claims (15)

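  1. A graph data processing method, performed by a computer device on which a computing node in a distributed computing node cluster is distributed, comprising:
    obtaining sub-graph data partitioned from to-be-processed graph data;
    performing a computing task on the sub-graph data to obtain corresponding global data and local data;
    writing the global data into a blockchain network, the global data in the blockchain network being updated by the distributed computing node cluster;
    obtaining latest global data from the blockchain network; and
    iteratively performing the computing task on the sub-graph data according to the obtained latest global data and the local data, until a computing result is obtained when an iteration stop condition is met.
  2. The method according to claim 1, wherein the obtaining sub-graph data partitioned from to-be-processed graph data comprises:
    obtaining task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data; and
    reading, according to the task information, a corresponding task executable file and the sub-graph data partitioned from the to-be-processed graph data;
    and the performing a computing task on the sub-graph data to obtain corresponding global data and local data comprises:
    executing the task executable file so as to perform the computing task on the sub-graph data and obtain the corresponding global data and local data.
  3. The method according to claim 2, wherein the obtaining task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data comprises:
    periodically detecting the task information that is broadcast in the blockchain network and that corresponds to the to-be-processed graph data;
    determining a load state of a current computing node when the task information is detected; and
    pulling the task information when the load state meets a preset load condition.
  4. The method according to claim 1, wherein the writing the global data into a blockchain network comprises:
    creating a new block;
    writing the global data into the created new block; and
    adding the new block to the blockchain network after the new block is verified by the blockchain network.
  5. The method according to claim 4, wherein the writing the global data into the created new block comprises: writing the global data and a corresponding computing task execution log into the new block;
    and the method further comprises:
    determining a faulty computing node and corresponding lost data according to the computing task execution logs in the blockchain network;
    determining computing nodes that participate in reconstructing the lost data;
    obtaining, from the determined computing nodes, local data related to the faulty computing node; and
    reconstructing the lost data according to the obtained local data.
  6. The method according to claim 5, further comprising:
    searching locally for local data related to the faulty computing node; and
    sharing the found local data with the computing nodes that participate in reconstructing the lost data, the shared local data being used to reconstruct the lost data.
  7. The method according to claim 1, wherein the obtaining latest global data from the blockchain network comprises:
    periodically detecting the number of blocks in the blockchain network;
    obtaining the global data recorded in a new block when the number of blocks increases; and
    updating local global data according to the obtained global data.
  8. The method according to any one of claims 1 to 7, further comprising:
    obtaining a first quantized value corresponding to completed computing tasks for which corresponding blockchain data has been formed;
    obtaining a second quantized value corresponding to completed computing tasks for which corresponding blockchain data has not been formed;
    determining a third quantized value corresponding to uncompleted computing tasks in the sub-graph data; and
    exchanging graph nodes and first quantized values between computing nodes when the second quantized value and the third quantized value do not meet an equilibrium condition.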
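  9. A method for publishing a computing task of graph data, performed by a computer device, the method comprising:
    obtaining task information corresponding to to-be-processed graph data;
    creating a new block;
    writing the task information into the new block;
    adding the new block to a blockchain network after the new block is verified by the blockchain network; and
    broadcasting the task information in the blockchain network, the broadcast task information being used to instruct a computing node to obtain sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform a computing task on the sub-graph data based on the blockchain network to obtain a computing result.
  10. The method according to claim 9, wherein the obtaining task information corresponding to to-be-processed graph data comprises:
    receiving a task adding instruction;
    displaying a task editing interface according to the task adding instruction; and
    obtaining the task information that corresponds to the to-be-processed graph data and that is entered into the task editing interface.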
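  11. A graph data processing apparatus, comprising:
    a computing management module, configured to obtain sub-graph data partitioned from to-be-processed graph data;
    the computing management module being further configured to perform a computing task on the sub-graph data to obtain corresponding global data and local data;
    a communication module, configured to write the global data into a blockchain network, the global data in the blockchain network being updated by the distributed computing node cluster;
    the communication module being further configured to obtain latest global data from the blockchain network; and
    the computing management module being further configured to iteratively perform the computing task on the sub-graph data according to the obtained latest global data and the local data, until a computing result is obtained when an iteration stop condition is met.
  12. An apparatus for publishing a computing task of graph data, the apparatus comprising:
    a task management module, configured to obtain task information corresponding to to-be-processed graph data;
    a communication module, configured to create a new block;
    the communication module being further configured to write the task information into the new block;
    the communication module being further configured to add the new block to a blockchain network after the new block is verified by the blockchain network; and
    the communication module being further configured to broadcast the task information in the blockchain network, the broadcast task information being used to instruct a computing node to obtain sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform a computing task on the sub-graph data based on the blockchain network to obtain a computing result.
  13. A distributed graph computing system, comprising a computing node and a control node, wherein:
    the control node is configured to obtain task information corresponding to to-be-processed graph data; create a new block; write the task information into the new block; add the new block to a blockchain network after the new block is verified by the blockchain network; and broadcast the task information in the blockchain network, the broadcast task information being used to instruct the computing node to obtain sub-graph data partitioned from the to-be-processed graph data, and to iteratively perform a computing task on the sub-graph data based on the blockchain network to obtain a computing result; and
    the computing node is configured to obtain the sub-graph data partitioned from the to-be-processed graph data; perform the computing task on the sub-graph data to obtain corresponding global data and local data; write the global data into the blockchain network, the global data in the blockchain network being updated by the distributed computing node cluster; obtain latest global data from the blockchain network; and iteratively perform the computing task on the sub-graph data according to the obtained latest global data and the local data, until a computing result is obtained when an iteration stop condition is met.
  14. A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 8, or the steps of the method according to any one of claims 9 to 10.
  15. A computer device, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 8, or the steps of the method according to any one of claims 9 to 10.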
PCT/CN2019/082192 2018-05-16 2019-04-11 图数据处理方法、图数据的计算任务发布方法、装置、存储介质及计算机设备 WO2019218814A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
JP2020563591A JP7158801B2 (ja) 2018-05-16 2019-04-11 グラフデータ処理方法、グラフデータの計算タスクの配布方法、装置、コンピュータプログラム、及びコンピュータ機器
KR1020207031862A KR102485652B1 (ko) 2018-05-16 2019-04-11 그래프 데이터 처리 방법, 그래프 데이터 계산 태스크들을 공개하는 방법 및 디바이스, 저장 매체 및 컴퓨터 장치
SG11202010651RA SG11202010651RA (en) 2018-05-16 2019-04-11 Graph data processing method, method and device for publishing graph data computational tasks, storage medium, and computer apparatus
EP19804270.7A EP3771179B1 (en) 2018-05-16 2019-04-11 Graph data processing method, system and storage medium
US16/983,948 US11847488B2 (en) 2018-05-16 2020-08-03 Graph data processing method, method and device for publishing graph data computational tasks, storage medium, and computer apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810467817.8A CN108683738B (zh) 2018-05-16 2018-05-16 图数据处理方法和图数据的计算任务发布方法
CN201810467817.8 2018-05-16

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/983,948 Continuation US11847488B2 (en) 2018-05-16 2020-08-03 Graph data processing method, method and device for publishing graph data computational tasks, storage medium, and computer apparatus

Publications (1)

Publication Number Publication Date
WO2019218814A1 true WO2019218814A1 (zh) 2019-11-21

Family

ID=63805486

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/082192 WO2019218814A1 (zh) 2018-05-16 2019-04-11 图数据处理方法、图数据的计算任务发布方法、装置、存储介质及计算机设备

Country Status (7)

Country Link
US (1) US11847488B2 (zh)
EP (1) EP3771179B1 (zh)
JP (1) JP7158801B2 (zh)
KR (1) KR102485652B1 (zh)
CN (2) CN111030802B (zh)
SG (1) SG11202010651RA (zh)
WO (1) WO2019218814A1 (zh)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353709A (zh) * 2020-02-29 2020-06-30 国网上海市电力公司 一种基于区块链的电力信息产品的生产方法及系统
US10754700B2 (en) 2017-01-24 2020-08-25 Oracle International Corporation Distributed graph processing system featuring interactive remote control mechanism including task cancellation
CN113704067A (zh) * 2021-09-09 2021-11-26 合肥新青罗数字技术有限公司 无形资产管理系统监控方法
US11250059B2 (en) 2020-01-09 2022-02-15 Oracle International Corporation Optimizing graph queries by performing early pruning
US11456946B2 (en) 2020-06-11 2022-09-27 Oracle International Corporation Regular path queries (RPQS) for distributed graphs
US11461130B2 (en) 2020-05-26 2022-10-04 Oracle International Corporation Methodology for fast and seamless task cancelation and error handling in distributed processing of large graph data
CN115378803A (zh) * 2022-04-13 2022-11-22 网易(杭州)网络有限公司 日志管理方法、装置、区块链节点和存储介质
US11675785B2 (en) 2020-01-31 2023-06-13 Oracle International Corporation Dynamic asynchronous traversals for distributed graph queries
CN117479235A (zh) * 2023-12-28 2024-01-30 中通信息服务有限公司 一种末梢网络设施调度管理方法及系统
JP7490394B2 (ja) 2020-03-06 2024-05-27 株式会社日立製作所 情報共有支援方法、及び情報共有支援システム
US12001425B2 (en) 2020-12-09 2024-06-04 Oracle International Corporation Duplication elimination in depth based searches for distributed systems

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111030802B (zh) 2018-05-16 2020-12-29 腾讯科技(深圳)有限公司 图数据的计算任务发布方法、装置、设备和存储介质
CN108681482B (zh) * 2018-05-16 2020-12-01 腾讯科技(深圳)有限公司 基于图数据的任务调度方法和装置
CN109542602B (zh) * 2018-11-20 2021-05-11 苏州朗润创新知识产权运营有限公司 一种基于区块链的分布式任务处理方法、装置及系统
WO2020107351A1 (zh) * 2018-11-29 2020-06-04 袁振南 模型训练方法及其节点、网络及存储装置
CN109710406B (zh) * 2018-12-21 2023-01-17 腾讯科技(深圳)有限公司 数据分配及其模型训练方法、装置、及计算集群
CN109801070B (zh) * 2019-01-12 2020-11-06 杭州复杂美科技有限公司 交易排队方法、设备和存储介质
CN109947565B (zh) 2019-03-08 2021-10-15 北京百度网讯科技有限公司 用于分配计算任务的方法和装置
CN110399208B (zh) * 2019-07-15 2023-10-31 创新先进技术有限公司 分布式任务调度拓扑图的展示方法、装置及设备
CN110807125B (zh) * 2019-08-03 2020-12-22 北京达佳互联信息技术有限公司 推荐系统、数据访问方法及装置、服务器、存储介质
CN112487080A (zh) * 2019-08-20 2021-03-12 厦门本能管家科技有限公司 一种用于区块链的差值回退方法及系统
CN110750385B (zh) * 2019-10-25 2022-09-09 东北大学 一种基于受限恢复的图迭代器及方法
CN111369144B (zh) * 2020-03-04 2022-02-18 浙江大学 一种基于雾计算与区块链的智慧能源调度系统及方法
CN111581443B (zh) * 2020-04-16 2023-05-30 南方科技大学 分布式图计算方法、终端、系统及存储介质
CN111598036B (zh) * 2020-05-22 2021-01-01 广州地理研究所 分布式架构的城市群地理环境知识库构建方法及系统
US20220019742A1 (en) * 2020-07-20 2022-01-20 International Business Machines Corporation Situational awareness by fusing multi-modal data with semantic model
CN112118290B (zh) * 2020-08-12 2022-03-18 北京大学 一种基于程序分析的数据资源的管控方法
CN112445940B (zh) * 2020-10-16 2022-05-24 苏州浪潮智能科技有限公司 图划分方法、装置及计算机可读存储介质
US11755543B2 (en) * 2020-12-29 2023-09-12 International Business Machines Corporation Optimization of workflows with dynamic file caching
US20220210140A1 (en) * 2020-12-30 2022-06-30 Atb Financial Systems and methods for federated learning on blockchain
CN113434702A (zh) * 2021-07-27 2021-09-24 支付宝(杭州)信息技术有限公司 一种用于图计算的自适应控制方法和系统
CN114049213A (zh) * 2021-11-15 2022-02-15 深圳前海鸿泰源兴科技发展有限公司 一种信息化金融数据分析系统与分析方法
CN114553917B (zh) * 2021-12-30 2024-01-26 北京天成通链科技有限公司 一种基于区块链的网络智能治理方法
CN114567634B (zh) * 2022-03-07 2023-02-07 华中科技大学 面向后e级图计算的方法、系统、存储介质及电子设备
WO2023206142A1 (zh) * 2022-04-27 2023-11-02 西门子股份公司 数据同步方法、装置、计算机设备和存储介质
CN115017234A (zh) * 2022-06-29 2022-09-06 贵州财经大学 一种区块链信息管理系统、区块链信息存储及查询方法
CN115511086B (zh) * 2022-11-03 2024-05-24 上海人工智能创新中心 一种针对超大模型的分布式推理部署系统
CN115809686B (zh) * 2023-02-03 2023-06-16 中国科学技术大学 提升循环图结构数据处理系统处理效率方法、设备及介质
CN116681104B (zh) * 2023-05-11 2024-03-12 中国地质大学(武汉) 分布式空间图神经网络的模型建立及实现方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105590321A (zh) * 2015-12-24 2016-05-18 华中科技大学 一种基于块的子图构建及分布式图处理方法
CN106033476A (zh) * 2016-05-19 2016-10-19 西安交通大学 一种云计算环境中分布式计算模式下的增量式图计算方法
CN106777351A (zh) * 2017-01-17 2017-05-31 中国人民解放军国防科学技术大学 基于art树分布式系统图存储计算系统及其方法
US20170364701A1 (en) * 2015-06-02 2017-12-21 ALTR Solutions, Inc. Storing differentials of files in a distributed blockchain
CN108683738A (zh) * 2018-05-16 2018-10-19 腾讯科技(深圳)有限公司 图数据处理方法和图数据的计算任务发布方法

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2589931B1 (en) * 2011-11-07 2016-06-29 Elektrobit Automotive GmbH Technique for structuring navigation data
US11394773B2 (en) * 2014-06-19 2022-07-19 Jim Austin Joseph Cryptographic currency block chain based voting system
JP5858507B1 (ja) 2015-05-18 2016-02-10 株式会社Orb 仮想通貨管理プログラム、及び仮想通貨管理方法
US10366204B2 (en) * 2015-08-03 2019-07-30 Change Healthcare Holdings, Llc System and method for decentralized autonomous healthcare economy platform
CN105404701B (zh) * 2015-12-31 2018-11-13 浙江图讯科技股份有限公司 一种基于对等网络的异构数据库同步方法
CN106357405A (zh) * 2016-09-19 2017-01-25 弗洛格(武汉)信息科技有限公司 一种基于区块链技术一致性算法的数据管理方法及系统
US10291627B2 (en) * 2016-10-17 2019-05-14 Arm Ltd. Blockchain mining using trusted nodes
US10726346B2 (en) * 2016-11-09 2020-07-28 Cognitive Scale, Inc. System for performing compliance operations using cognitive blockchains
CN106611061B (zh) * 2016-12-29 2018-02-23 北京众享比特科技有限公司 基于区块链网络的数据库写入方法及系统
US20210125083A1 (en) * 2017-03-16 2021-04-29 Facet Labs, Llc Edge devices, systems and methods for processing extreme data
CN113766035B (zh) * 2017-03-28 2023-05-23 创新先进技术有限公司 一种业务受理及共识的方法及装置
CN107943833B (zh) * 2017-10-25 2021-11-19 华南农业大学 一种基于区块链的无中心分布式文件存储及检索方法
CN107864198B (zh) * 2017-11-07 2019-09-24 山东浪潮人工智能研究院有限公司 一种基于深度学习训练任务的区块链共识方法

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10754700B2 (en) 2017-01-24 2020-08-25 Oracle International Corporation Distributed graph processing system featuring interactive remote control mechanism including task cancellation
US11250059B2 (en) 2020-01-09 2022-02-15 Oracle International Corporation Optimizing graph queries by performing early pruning
US11675785B2 (en) 2020-01-31 2023-06-13 Oracle International Corporation Dynamic asynchronous traversals for distributed graph queries
CN111353709A (zh) * 2020-02-29 2020-06-30 国网上海市电力公司 一种基于区块链的电力信息产品的生产方法及系统
CN111353709B (zh) * 2020-02-29 2023-05-16 国网上海市电力公司 一种基于区块链的电力信息产品的生产方法及系统
JP7490394B2 (ja) 2020-03-06 2024-05-27 株式会社日立製作所 情報共有支援方法、及び情報共有支援システム
US11461130B2 (en) 2020-05-26 2022-10-04 Oracle International Corporation Methodology for fast and seamless task cancelation and error handling in distributed processing of large graph data
US11456946B2 (en) 2020-06-11 2022-09-27 Oracle International Corporation Regular path queries (RPQS) for distributed graphs
US12001425B2 (en) 2020-12-09 2024-06-04 Oracle International Corporation Duplication elimination in depth based searches for distributed systems
CN113704067B (zh) * 2021-09-09 2023-10-24 合肥新青罗数字技术有限公司 无形资产管理系统监控方法
CN113704067A (zh) * 2021-09-09 2021-11-26 合肥新青罗数字技术有限公司 无形资产管理系统监控方法
CN115378803B (zh) * 2022-04-13 2023-12-12 网易(杭州)网络有限公司 日志管理方法、装置、区块链节点和存储介质
CN115378803A (zh) * 2022-04-13 2022-11-22 网易(杭州)网络有限公司 日志管理方法、装置、区块链节点和存储介质
CN117479235A (zh) * 2023-12-28 2024-01-30 中通信息服务有限公司 一种末梢网络设施调度管理方法及系统
CN117479235B (zh) * 2023-12-28 2024-03-19 中通信息服务有限公司 一种末梢网络设施调度管理方法及系统

Also Published As

Publication number Publication date
EP3771179A1 (en) 2021-01-27
CN111030802B (zh) 2020-12-29
CN111030802A (zh) 2020-04-17
US20200364084A1 (en) 2020-11-19
KR20200139780A (ko) 2020-12-14
JP2021523474A (ja) 2021-09-02
SG11202010651RA (en) 2020-11-27
CN108683738A (zh) 2018-10-19
KR102485652B1 (ko) 2023-01-06
EP3771179A4 (en) 2021-05-19
US11847488B2 (en) 2023-12-19
JP7158801B2 (ja) 2022-10-24
CN108683738B (zh) 2020-08-14
EP3771179B1 (en) 2023-09-20

Similar Documents

Publication Publication Date Title
WO2019218814A1 (zh) 图数据处理方法、图数据的计算任务发布方法、装置、存储介质及计算机设备
KR102499076B1 (ko) 그래프 데이터 기반의 태스크 스케줄링 방법, 디바이스, 저장 매체 및 장치
US11922308B2 (en) Generating neighborhood convolutions within a large network
JP7470476B2 (ja) 蒸留を用いたそれぞれのターゲット・クラスを有するモデルの統合
US9213574B2 (en) Resources management in distributed computing environment
CN112154462A (zh) 高性能流水线并行深度神经网络训练
US10191998B1 (en) Methods of data reduction for parallel breadth-first search over graphs of connected data elements
CN106446061A (zh) 用于存储虚拟机镜像的方法及设备
JP6556659B2 (ja) ニューラルネットワークシステム、シェア計算装置、ニューラルネットワークの学習方法、プログラム
Gu et al. From server-based to client-based machine learning: A comprehensive survey
CN114330730A (zh) 量子线路分块编译方法、装置、设备、存储介质和产品
Sulich et al. Schemes for verification of resources in the cloud: comparison of the cloud technology providers
Patman et al. Predictive cyber foraging for visual cloud computing in large-scale IoT systems
Li et al. Two-level incremental checkpoint recovery scheme for reducing system total overheads
WO2011131248A1 (en) Method and apparatus for losslessly compressing/decompressing data
CN111309786A (zh) 基于MapReduce的并行频繁项集挖掘方法
Fiosina et al. Distributed nonparametric and semiparametric regression on spark for big data forecasting
JP2015001954A (ja) 情報処理システムおよび情報処理方法
US20240028669A1 (en) Experimentally validating causal graphs
WO2017028930A1 (en) Methods and apparatus for running an analytics function
US20220405631A1 (en) Data quality assessment for unsupervised machine learning
US20230111378A1 (en) Framework for estimation of resource usage and execution time of workloads in a heterogeneous infrastructure
US20230186168A1 (en) Performing automated tuning of hyperparameters in a federated learning environment
Bădică et al. Optimizing communication costs in acoda using simulated annealing: Initial experiments
CN115329189A (zh) 基于层次图属性网络表示学习的论文推荐方法及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19804270

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20207031862

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2019804270

Country of ref document: EP

Effective date: 20201023

ENP Entry into the national phase

Ref document number: 2020563591

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE