WO2020107350A1

WO2020107350A1 - Node management method and apparatus for blockchain system, and storage device

Info

Publication number: WO2020107350A1
Application number: PCT/CN2018/118290
Authority: WO
Inventors: 袁振南; 朱鹏新
Original assignee: 区链通网络有限公司
Priority date: 2018-11-29
Filing date: 2018-11-29
Publication date: 2020-06-04
Also published as: CN109964452A; CN109964452B

Abstract

Disclosed are a node management method and apparatus for a blockchain system, and a storage device. The method comprises: acquiring characteristic data of each node; acquiring a node characteristic representation of each node by using the characteristic data; acquiring an overall characteristic representation of the blockchain system by using the node characteristic representation; performing training on the basis of a reinforcement learning algorithm by using the node characteristic representation and the overall characteristic representation to obtain control strategies, there being a plurality of control strategies; and performing consensus-based voting on the plurality of control strategies to determine the control strategy of each node, and managing each node according to the control strategy. In this way, the present application can implement automatic management and control of the nodes of the blockchain system.

Description

Node management method, device and storage device of a blockchain system

【Technical Field】

The present application relates to the field of network communication technology, and in particular, to a node management method, device, and storage device of a blockchain system.

【Background】

In a blockchain system, each node server provides computing power and resources for the entire system and thus forms the basis of the system. In the course of long-term research, the inventors of the present application found that, because nodes in a blockchain system differ in state and in the computing power they can provide, various abnormal and malicious nodes may exist alongside nodes that provide long-term stable service. It is therefore necessary to manage the access of new nodes, the removal of abnormal and malicious nodes, and the promotion of permissions for server nodes that provide long-term stable service, and to perform role conversion and permission adjustment between nodes with different roles and permissions.

【Summary of the Invention】

The technical problem mainly solved by the present application is to provide a node management method, device and storage device of the blockchain system, which can realize automatic management and control of the nodes of the blockchain system.

To solve the above technical problem, one technical solution adopted by the present application is to provide a node management method of a blockchain system. The method includes: obtaining characteristic data of each node; using the characteristic data to obtain a node feature representation of each node; using the node feature representation to obtain an overall feature representation of the blockchain system; training, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies, of which there are multiple; and performing consensus-based voting on the multiple control strategies to determine the control strategy of the node, and managing the node according to the control strategy.

To solve the above technical problem, another technical solution adopted by the present application is to provide a node management device of a blockchain system. The device includes a processor, and the processor is configured to: obtain characteristic data of each node; use the characteristic data to obtain a node feature representation of each node; use the node feature representation to obtain an overall feature representation of the blockchain system; train, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies, of which there are multiple; and perform consensus-based voting on the multiple control strategies to determine the control strategy of the node, and manage the node according to the control strategy.

To solve the above technical problem, another technical solution adopted by the present application is to provide a node management device of a blockchain system. The device includes: a data collection module, used to obtain characteristic data of each node; a first feature representation module, used to obtain a node feature representation of each node using the characteristic data; a second feature representation module, used to obtain an overall feature representation of the blockchain system using the node feature representation; a management and control strategy module, used to train, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies, of which there are multiple; and a voting module, used to perform consensus-based voting on the multiple control strategies, determine the control strategy of the node, and manage the node according to the control strategy.

To solve the above technical problem, another technical solution adopted by the present application is to provide a device with a storage function. The device stores a program, and when the program is executed, the above node management method of the blockchain system is implemented.

The beneficial effects of the present application are as follows. In contrast to the prior art, the present application provides a node management method of a blockchain system that uses a reinforcement learning algorithm to train the control strategy, so that the access, removal and permission changes of nodes in the blockchain system are controlled automatically without setting up a central management node.

【Brief Description of the Drawings】

FIG. 1 is a schematic flowchart of a first embodiment of a node management method of a blockchain system of the present application;

FIG. 2 is a schematic structural diagram of a first embodiment of a blockchain system of the present application;

FIG. 3 is a schematic structural diagram of a first embodiment of a node management device of a blockchain system of the present application;

FIG. 4 is a schematic structural diagram of a second embodiment of a node management device of a blockchain system of the present application;

FIG. 5 is a schematic structural diagram of a first embodiment of a device with a storage function of the present application.

【Detailed Description】

In order to make the purpose, technical solutions and effects of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments.

This application provides a node management method for a blockchain system. By using machine learning algorithms to extract node feature data and to train control strategies, the entire system can regulate node permissions autonomously.

Please refer to FIG. 1. FIG. 1 is a schematic flowchart of a first embodiment of a node management method of a blockchain system of the present application. In this embodiment, the node management method includes the following steps:

S101: Obtain the characteristic data of each node.

The characteristic data is data that reflects the performance and status of a node. For example, it may be one or more of physical hardware data, network data, operating status data, log data, and task assignment data between nodes.

S102: Obtain a node feature representation of each node.

A predetermined algorithm is used to process the acquired characteristic data to obtain its vector representation, which is the node feature representation.
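As a minimal sketch of step S102 only, the following Python code turns raw characteristic data into a fixed-length vector. The application does not specify the predetermined algorithm, so simple scaling and clipping is used here as a stand-in; the field names in `NodeMetrics` and the bounds in `SCALES` are hypothetical and chosen purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Hypothetical characteristic data for one node (field names are assumptions)."""
    cpu_usage: float      # physical hardware data: fraction of CPU in use
    latency_ms: float     # network data: round-trip time in milliseconds
    uptime_hours: float   # operating status data: hours since last restart
    error_count: int      # log data: error entries in recent logs
    assigned_tasks: int   # task assignment data: tasks currently assigned


# Assumed upper bounds used to scale each field into the range [0, 1].
SCALES = {
    "cpu_usage": 1.0,
    "latency_ms": 1000.0,
    "uptime_hours": 720.0,
    "error_count": 100.0,
    "assigned_tasks": 50.0,
}


def to_feature_vector(m: NodeMetrics) -> list[float]:
    """Scale each metric by its assumed bound and clip the result to [0, 1]."""
    return [
        min(max(getattr(m, name) / scale, 0.0), 1.0)
        for name, scale in SCALES.items()
    ]
```

In the described method this placeholder would be replaced by the trained decentralized graph algorithm; the sketch only shows the shape of the input and output.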

S103: Obtain the overall feature representation of the blockchain system.

The collected node data and neighbor node data are used to train and update a system feature model to obtain a feature representation of the current system state, which is the overall feature representation.

S104: Based on a reinforcement learning algorithm, train using the node feature representation and the overall feature representation to obtain control strategies, of which there are multiple.

In reinforcement learning, an agent learns by trial and error: its behavior is guided by the rewards it obtains through interaction with the environment, and its goal is to maximize the reward it receives. Reinforcement learning differs from supervised learning in connectionist learning mainly in the teacher signal. The reinforcement signal provided by the environment is an evaluation of how good an action was (usually a scalar signal); it does not tell the reinforcement learning system (RLS) how to produce the correct action. Because the external environment provides little information, the RLS must learn from its own experience. In this way, the RLS acquires knowledge in an action-evaluation environment and improves its action plan to adapt to the environment. Through reinforcement learning, an optimal control strategy can be trained.

S105: Perform consensus-based voting on the multiple control strategies, determine the control strategy of the node, and manage the node according to the control strategy.

If a lower role node corresponds to multiple upper role nodes, the different upper role nodes will derive multiple control strategies. After the control strategies are obtained, the node control strategy results generated by the control model are put to a consensus-based vote to determine the control strategy of the node. The reputation value of the node is then adjusted according to the control strategy to control node access, node removal, or role adjustment between nodes with different roles, thereby managing the node.
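The link between a confirmed control strategy, the reputation value and the resulting role can be sketched as follows. The application does not state how the reputation value is computed, so the action names, the deltas in `REPUTATION_DELTA` and the two thresholds are all hypothetical values used only to make the idea concrete.

```python
from __future__ import annotations

# Hypothetical reputation change for each confirmed control action.
REPUTATION_DELTA = {"promote": 10, "keep": 0, "demote": -10, "remove": -100}

# Hypothetical thresholds: below REMOVE_BELOW the node is removed,
# at or above UPPER_ROLE_AT it holds an upper role.
REMOVE_BELOW = 0
UPPER_ROLE_AT = 80


def apply_strategy(reputation: int, action: str) -> tuple[int, str]:
    """Adjust the reputation value by the confirmed action and derive the role."""
    new_reputation = reputation + REPUTATION_DELTA[action]
    if new_reputation < REMOVE_BELOW:
        return new_reputation, "removed"
    if new_reputation >= UPPER_ROLE_AT:
        return new_reputation, "upper"
    return new_reputation, "lower"
```

Under these assumed values, a node with reputation 75 that receives a `promote` action crosses the upper-role threshold, while a `remove` action on a node with reputation 50 takes it below zero and out of the system.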

This embodiment uses a reinforcement learning algorithm to train the control strategy, so that the access, removal and permission changes of nodes in the blockchain system are controlled automatically without setting up a central management node.

Please refer to FIG. 2, which is a schematic structural diagram of the first embodiment of the blockchain system of the present application. In this embodiment, the blockchain system includes node A, node B, and node C as examples, but it is not limited to this architecture. In this embodiment, a data collection unit is provided in each node server of the blockchain system, and the data collection unit may be used to collect and/or report characteristic data. Specifically, it is used to collect characteristic data of this node, collect characteristic data of neighbor nodes reported to this node, and collect characteristic data of lower role nodes reported to this node. At the same time, the collected characteristic data can be reported to the corresponding upper role nodes and neighbor nodes. Specifically, when reporting, in addition to reporting the characteristic data of the local node, the collected characteristic data reported to the local node will also be reported. That is, cross-level and cross-region feature data will not be reported repeatedly, but will be reported step by step or point by point through the upper role node or neighbor node closest to the node.
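The collect-and-forward behavior of the data collection unit can be sketched as below. This is an illustrative assumption rather than the disclosed implementation: the class name `DataCollectionUnit`, its methods, and the choice to key stored data by source node identifier (so that data relayed along several paths is kept only once) are all hypothetical.

```python
class DataCollectionUnit:
    """Hypothetical per-node unit that collects and forwards characteristic data."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.store = {}  # source node id -> latest characteristic data

    def collect_local(self, data):
        """Record the characteristic data of this node."""
        self.store[self.node_id] = data

    def receive(self, report):
        """Merge a report from a neighbor or lower role node.

        Entries are keyed by source node id, so data that arrives more
        than once overwrites itself instead of being duplicated.
        """
        self.store.update(report)

    def report(self):
        """Return own data plus all collected data, for the next hop upward."""
        return dict(self.store)
```

For example, a lower role node A1 would call `collect_local` and send `report()` to upper role node A, which merges it with `receive` and forwards its own `report()` in turn, giving the step-by-step reporting described above.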

In one embodiment, the blockchain system is a hierarchical blockchain system, that is, a blockchain system composed of blockchain nodes with different roles and permissions, in which the access and removal of nodes and the conversion of node roles and permissions are jointly determined by the upper role nodes. In this system, the node servers can be divided into upper role nodes and lower role nodes, and the upper role nodes can manage the lower role nodes. There can be multiple upper role nodes; one upper role node can manage multiple lower role nodes, and one lower role node can be managed by multiple upper role nodes. Role conversion and/or permission adjustment can be performed between nodes with different roles and permissions in different tasks or in different time periods.

A feature representation unit is provided in the upper role node. It is responsible for automatically extracting dynamic data features using a decentralized training algorithm and converting them into high-dimensional state feature representations. The feature representations of nodes, regions and the system are trained and computed layer by layer.

Specifically, the collected characteristic data is processed, and a decentralized graph algorithm is used to train on the characteristic data to obtain the node feature representation. Decentralization means that, in a system with many distributed nodes, each node is highly autonomous. Nodes can connect freely to one another to form new connection units. Any node may become a temporary center, but none has a mandatory central control function. The influence between nodes forms non-linear causal relationships through the network, producing an open, flat and equal system structure. In blockchain technology, decentralization (distribution) depends on the consensus algorithm, which governs the process of reaching agreement on a proposal so that the system satisfies the required degree of consistency.

After obtaining the node feature representation, the decentralized deep learning algorithm is used to train the regional feature model on the collected feature data of each node and neighbor nodes to obtain the regional feature representation of the blockchain system.

Further, the collected feature data of each node and neighbor nodes and the regional feature representation are used to train and update the system feature model to obtain the feature representation of the current system state, that is, the overall feature representation.
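The layer-by-layer computation from node to region to system can be illustrated with the sketch below. The application uses trained regional and system feature models; this sketch deliberately swaps in plain element-wise averaging as a placeholder, so the function names and the pooling rule are assumptions and not the disclosed models.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def regional_representation(node_vector, neighbor_vectors):
    """Placeholder regional model: pool a node with its neighbor nodes."""
    return mean_vector([node_vector] + list(neighbor_vectors))


def overall_representation(regional_vectors):
    """Placeholder system model: pool the regional representations."""
    return mean_vector(regional_vectors)


# Two hypothetical regions, each built from one node and its neighbors.
region_a = regional_representation([1.0, 0.0], [[0.0, 1.0]])
region_b = regional_representation([0.0, 0.0], [[1.0, 1.0], [2.0, 2.0]])
system_state = overall_representation([region_a, region_b])
```

Averaging keeps the vector length fixed regardless of how many neighbors a node has, which is the property any replacement model would also need in order to feed the control strategy training.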

After the state feature representations are obtained, they are used to train the control strategy model to obtain the control strategy. Specifically, a management strategy unit is also provided in the upper role node, which is used to train, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain the control strategy.

A reinforcement learning (RL) algorithm consists mainly of an agent and an environment. Specifically, it is a cyclic process in which the agent takes an action, changes its state, and obtains a reward through interaction with the environment. The environment is the object on which the agent acts (such as a node server), and the agent represents the RL algorithm. The environment first sends a state to the agent, and the agent then takes an action based on its knowledge in response to that state. After that, the environment sends the agent the next state together with a reward. The agent updates its knowledge with the reward returned by the environment to evaluate its last action. Its strategy depends entirely on the current state (only the present matters). Reinforcement learning algorithms include Q-learning, SARSA, Deep Q-Network, Policy Gradient, Actor-Critic, and others.
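Of the algorithms listed, tabular Q-learning is the simplest to show. The sketch below is a generic Q-learning agent, not the control strategy model of the application: the action names in `ACTIONS`, the state labels used in the usage lines, and the learning rate and discount values are hypothetical.

```python
from collections import defaultdict

# Hypothetical control actions an upper role node could take on a lower node.
ACTIONS = ["keep", "reduce_tasks", "remove"]


class QLearningAgent:
    """Tabular Q-learning: knowledge is a table of (state, action) values."""

    def __init__(self, alpha=0.5, gamma=0.9):
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor for future rewards
        self.q = defaultdict(float)   # (state, action) -> estimated value

    def best_action(self, state):
        """Pick the action with the highest estimated value in this state."""
        return max(ACTIONS, key=lambda action: self.q[(state, action)])

    def update(self, state, action, reward, next_state):
        """Apply the Q-learning update after the environment returns a reward."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


agent = QLearningAgent()
# The environment reports that reducing tasks on an overloaded node helped.
agent.update("overloaded", "reduce_tasks", reward=1.0, next_state="healthy")
chosen = agent.best_action("overloaded")
```

The update depends only on the current state, the action taken, the reward and the next state, which matches the statement that the strategy depends entirely on the current state.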

In this embodiment, the control strategy model is trained and updated based on a preset objective function, using the current system state and the current state feature representation of each node. The preset objective function is used to evaluate and measure the current node and its task status. The current system state includes the regional system state and the overall system state.

For example, when task 1 is processed, given the current feedback information of lower role node A1 and that node's CPU usage, current task volume, and historical task status information, the corresponding upper role node A, when training the control strategy, obtains a control strategy for the regional environment of task 1 based on the current state of node A1 and on the regional system state and overall system state of the region where A1 is located. For example, the control strategy may be to remove node A1 while increasing the task share of nodes A2 and A3. Alternatively, when the upper role node A trains the control strategy, it obtains a control strategy for the overall environment of task 1 based on the current state of node A1, the regional system state of the region where A1 is located, and the overall system state. For example, the control strategy may be to remove node A1 while increasing the task share of nodes A2 and B2.

In one embodiment, if a lower role node corresponds to multiple upper role nodes, then after the control strategies are obtained, the node control strategy results generated by the control model are put to a consensus-based vote and the transaction is recorded. For example, the system of task 1 also contains an upper role node B. When lower role nodes A1 and B2 report to node A, they also report to node B; node B is then also trained to obtain a control strategy for nodes A1 and B2. For example, that control strategy may be to remove node A1, reduce the task share of node B2, and increase the task share of node B1. Node B2 then has two different control strategies, so a consensus-based vote on the control strategies is needed to confirm the final control strategy.
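The A/B example can be expressed as a small voting sketch. The application does not fix the voting rule, so a strict-majority rule with a `no_change` fallback on ties is assumed here; the function name `vote_on_strategies` and the action labels are likewise hypothetical.

```python
from collections import Counter


def vote_on_strategies(proposals, fallback="no_change"):
    """Confirm one action per target node from the upper role nodes' proposals.

    proposals maps each upper role node to {target node: proposed action}.
    An action is confirmed only if a strict majority of the upper role
    nodes that voted on that target chose it; otherwise the fallback applies.
    """
    ballots = {}
    for actions in proposals.values():
        for target, action in actions.items():
            ballots.setdefault(target, []).append(action)

    confirmed = {}
    for target, votes in ballots.items():
        action, count = Counter(votes).most_common(1)[0]
        confirmed[target] = action if count * 2 > len(votes) else fallback
    return confirmed


# Strategies from the example: node A's overall-environment strategy and node B's.
proposals = {
    "A": {"A1": "remove", "A2": "increase_tasks", "B2": "increase_tasks"},
    "B": {"A1": "remove", "B2": "reduce_tasks", "B1": "increase_tasks"},
}
final = vote_on_strategies(proposals)
```

With only two upper role nodes, the conflicting proposals for B2 tie and fall back to `no_change` under this assumed rule, while the agreed removal of A1 is confirmed; a real deployment would rely on the consensus algorithm of the blockchain to break such ties.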

Then, the nodes are regulated according to the confirmed control strategy, and the regulated nodes change state. The upper role node then evaluates whether it has earned a strategy reward from indicators such as the throughput of the entire system and the block generation speed, and updates its knowledge with the reward returned by the environment to evaluate its subsequent actions. This cycle repeats, so that an optimal control strategy can be trained.
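One possible way to turn throughput and block generation speed into a scalar reward is sketched below. The application names these indicators but gives no formula, so the relative-change form, the equal weights and the dictionary keys `tps` and `block_time_s` are assumptions for illustration only.

```python
def strategy_reward(before, after, w_throughput=0.5, w_block=0.5):
    """Score a control strategy from system metrics measured before and after it.

    before and after are dicts with 'tps' (transactions per second) and
    'block_time_s' (seconds per block), both assumed to be positive.
    The reward is positive when throughput rises or block time falls.
    """
    throughput_gain = (after["tps"] - before["tps"]) / before["tps"]
    block_gain = (before["block_time_s"] - after["block_time_s"]) / before["block_time_s"]
    return w_throughput * throughput_gain + w_block * block_gain


before = {"tps": 100.0, "block_time_s": 10.0}
improved = strategy_reward(before, {"tps": 120.0, "block_time_s": 8.0})
worsened = strategy_reward(before, {"tps": 80.0, "block_time_s": 12.0})
```

A scalar of this kind is what the reinforcement learning update consumes: a positive value reinforces the confirmed strategy and a negative value discourages it.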

Based on the above method, the present application also provides a node management device of the blockchain system. Please refer to FIG. 3, which is a schematic structural diagram of a first embodiment of the node management device of the blockchain system of the present application. In this embodiment, the node management device 30 of the blockchain system includes a processor 301, which is configured to: obtain characteristic data of each node; use the characteristic data to obtain a node feature representation of each node; use the node feature representation to obtain an overall feature representation of the blockchain system; train, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies, of which there are multiple; and perform consensus-based voting on the multiple control strategies to determine the control strategy of the node, and manage the node according to the control strategy.

In one embodiment, the processor 301 is specifically configured to train, based on the reinforcement learning algorithm and a preset objective function and using the node feature representation and the overall feature representation, to obtain a control strategy for the regional and/or overall environment.

In one embodiment, the processor 301 is specifically configured to train on the characteristic data using a decentralized graph algorithm to obtain the node feature representation.

In one embodiment, the processor 301 is specifically used to train a regional feature model on the collected feature data of each node and neighbor nodes using a decentralized deep learning algorithm to obtain a regional feature representation of the blockchain system.

In one embodiment, the processor 301 is specifically configured to use the collected feature data and regional feature representations of each node and neighbor nodes to train a system model to obtain an overall feature representation.

As described above, the node management device 30 of the blockchain system can be used to execute the above node management method of the blockchain system to manage the nodes of the blockchain system, and has the corresponding beneficial effects. For the specific process, refer to the description of the above embodiments, which is not repeated here. The device may be a standalone device independent of the server, a module in the server, or a processing unit.

Please refer to FIG. 4, which is a schematic structural diagram of a second embodiment of a node management device of a blockchain system of the present application. In this embodiment, the node management device 40 of the blockchain system is a module in the server, which specifically includes a data collection module 401, a first feature representation module 402, a second feature representation module 403, a management and control strategy module 404, and a voting module 405.

The data collection module 401 is used to obtain characteristic data of each node.

The first feature representation module 402 is used to obtain the node feature representation of each node using the feature data.

The second feature representation module 403 is used to obtain the overall feature representation of the blockchain system using the node feature representation.

The management and control strategy module 404 is used to train, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies, of which there are multiple.

The voting module 405 is used to conduct consensus-based voting on multiple regulation strategies, determine the regulation strategy of the node, and manage the node according to the regulation strategy.

The node management device 40 of the blockchain system can be used to execute the node management method of the blockchain system described above to manage the nodes of the blockchain system, and has the corresponding beneficial effects. For the specific process, refer to the description of the above embodiments, which is not repeated here.

The present application also provides a device with a storage function. Please refer to FIG. 5, which is a schematic structural diagram of a first embodiment of a device with a storage function of the present application. In this embodiment, the storage device 50 stores one or more programs 501. When a program 501 is executed, the node management method of the blockchain system described above is implemented. The specific working process is the same as in the above method embodiment and is not repeated here; for details, refer to the description of the corresponding method steps above. The device with a storage function may be a portable storage medium such as a USB flash drive, an optical disc, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), or a magnetic disk; the medium storing the program code may also be a terminal, a server, or the like.

In summary, the present application provides a node management method of a blockchain system. By using decentralized machine learning algorithms for feature training and reinforcement learning algorithms for control strategy training, the access, removal and permission changes of nodes in the blockchain system are managed and controlled automatically, without setting up a separate central management node.

In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device implementations described above are only illustrative. The division into modules or units is only a division by logical function; in actual implementation there may be other divisions. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may be electrical, mechanical or in other forms.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.

The above are only embodiments of the present application and do not limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the description and drawings of this application, or any direct or indirect use in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

1. A node management method of a blockchain system, wherein the method includes:

obtaining characteristic data of each node;

using the characteristic data to obtain a node feature representation of each node;

using the node feature representation to obtain an overall feature representation of the blockchain system;

training, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies, wherein there are multiple control strategies;

performing consensus-based voting on the multiple control strategies to determine the control strategy of the node, and managing the node according to the control strategy.

2. The node management method of the blockchain system according to claim 1, wherein the training, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies includes:

training, based on the reinforcement learning algorithm and a preset objective function and using the node feature representation and the overall feature representation, to obtain a control strategy for the regional and/or overall environment.

3. The node management method of the blockchain system according to claim 1, wherein the managing the node according to the control strategy includes:

controlling, according to the control strategy, node access, node removal, or role adjustment between nodes with different roles.

4. The node management method of the blockchain system according to claim 1, wherein the obtaining a node feature representation of each node includes:

training on the characteristic data using a decentralized graph algorithm to obtain the node feature representation.

5. The node management method of the blockchain system according to claim 4, wherein after the obtaining a node feature representation of each node, the method further includes:

training a regional feature model on the collected characteristic data of each node and its neighbor nodes using a decentralized deep learning algorithm to obtain a regional feature representation of the blockchain system.

6. The node management method of the blockchain system according to claim 5, wherein the obtaining an overall feature representation of the blockchain system includes:

training a system model using the collected characteristic data of each node and its neighbor nodes and the regional feature representation to obtain the overall feature representation.

7. The node management method of the blockchain system according to claim 1, wherein the obtaining characteristic data of each node includes:

collecting characteristic data of the local node, and/or collecting characteristic data of neighbor nodes reported to the local node, and/or collecting characteristic data of lower role nodes reported to the local node.

8. The node management method of the blockchain system according to claim 1, wherein the obtaining characteristic data of each node further includes:

reporting the collected characteristic data of the local node and the characteristic data reported to the local node to an upper role node and/or a neighbor node.

9. The node management method of the blockchain system according to claim 1, wherein the characteristic data is one or more of physical hardware data, network data, operating status data, log data, or task assignment data between nodes.

10. A node management device of a blockchain system, wherein the device includes a processor, and the processor is configured to:

obtain characteristic data of each node;

use the characteristic data to obtain a node feature representation of each node;

use the node feature representation to obtain an overall feature representation of the blockchain system;

train, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies, wherein there are multiple control strategies;

perform consensus-based voting on the multiple control strategies to determine the control strategy of the node, and manage the node according to the control strategy.

11. The node management device of the blockchain system according to claim 10, wherein the processor is specifically configured to train, based on the reinforcement learning algorithm and a preset objective function and using the node feature representation and the overall feature representation, to obtain a control strategy for the regional and/or overall environment.

12. The node management device of the blockchain system according to claim 10, wherein the processor is specifically configured to train on the characteristic data using a decentralized graph algorithm to obtain the node feature representation.

13. The node management device of the blockchain system according to claim 12, wherein the processor is specifically configured to train a regional feature model on the collected characteristic data of each node and its neighbor nodes using a decentralized deep learning algorithm to obtain a regional feature representation of the blockchain system.

14. The node management device of the blockchain system according to claim 13, wherein the processor is specifically configured to train a system model using the collected characteristic data of each node and its neighbor nodes and the regional feature representation to obtain the overall feature representation.

15. A node management device of a blockchain system, wherein the device includes:

a data collection module, used to obtain characteristic data of each node;

a first feature representation module, used to obtain a node feature representation of each node using the characteristic data;

a second feature representation module, used to obtain an overall feature representation of the blockchain system using the node feature representation;

a management and control strategy module, used to train, based on a reinforcement learning algorithm and using the node feature representation and the overall feature representation, to obtain control strategies, wherein there are multiple control strategies;

a voting module, used to perform consensus-based voting on the multiple control strategies, determine the control strategy of the node, and manage the node according to the control strategy.

16. A device having a storage function, wherein the device stores a program, and when the program is executed, the node management method of the blockchain system according to any one of claims 1 to 9 is implemented.