CN113537625A

CN113537625A - Data sharing method considering energy consumption efficiency in power internet of things based on block chain

Info

Publication number: CN113537625A
Application number: CN202110873613.6A
Authority: CN
Inventors: 蔡婷; 蔡宇; 闫会峰
Original assignee: Chongqing Yitong College
Current assignee: Chongqing Yitong College
Priority date: 2021-07-30
Filing date: 2021-07-30
Publication date: 2021-10-22

Abstract

The invention discloses a data sharing method considering energy consumption efficiency in an electric power Internet of Things based on a block chain, which solves the technical problems of low data transmission security and integrity, low data sharing rate and high energy consumption in existing data sharing methods. The method comprises: training an optimal routing inspection path at an inspection site of the power system; inspecting the power system based on the optimal routing inspection path to obtain sensing data; and uploading the sensing data to a preset block chain.

Description

Data sharing method considering energy consumption efficiency in power internet of things based on block chain

Technical Field

The invention relates to the technical field of data sharing, in particular to a data sharing method considering energy consumption efficiency in an electric power Internet of things based on a block chain.

Background

With the deep integration of Internet of Things technology and the smart grid, and with the intelligent mining of massive user-side data, the secure sharing of energy data, the real-time processing of inspection data and the wide interconnection of edge data in power systems, building a ubiquitous power Internet of Things with comprehensive state sensing and ubiquitous data connectivity has become a current research hotspot, and massive data sensing and secure sharing in the power Internet of Things have become key problems to be solved urgently.

During long-term operation in harsh outdoor (or indoor) environments, the power transmission (distribution) equipment of a large-scale power system is prone to frequent abnormal conditions. The traditional mode of regular manual inspection involves a heavy workload, low efficiency and high cost, and long-duration inspection work in complex environments can threaten the lives of operators. Inspection by intelligent robots has therefore become a safe and efficient alternative.

However, how to collect high-quality data at a power system site using mobile intelligent devices (such as inspection robots and unmanned aerial vehicles) and share that data securely and in real time remains a problem to be solved urgently. On the one hand, given the limited endurance and sensing range of a Mobile Smart Terminal (MST), obtaining as much high-quality sensing data as possible is very important. On the other hand, how to share the sensed data with other MSTs and promote mutual understanding and cooperation among devices, so that assigned tasks are completed better, is also one of the key problems to be solved urgently.

The traditional centralized power Internet of Things architecture poses serious challenges to data privacy, real-time data sharing and low-latency transmission. In addition, existing robot inspection is mostly manually operated and does not consider optimizing the robot's inspection path under limited endurance in order to improve the data sharing rate.

In summary, existing data sensing and sharing in the power Internet of Things has the following disadvantages. First, when high-definition data collected by a mobile intelligent device at a power system site is transmitted back to a base for processing through a centralized power Internet of Things system, the security and integrity of the data cannot be guaranteed. Second, manually operated on-site inspection cannot fully take into account the insufficient endurance of the unmanned aerial vehicle. Finally, existing motion path optimization algorithms for mobile intelligent devices have no autonomous learning capability and cannot minimize the device's own energy consumption cost while sharing as much data as possible.

Disclosure of Invention

The invention provides a data sharing method considering energy consumption efficiency in an electric power Internet of things based on a block chain, which is used for solving the technical problems of low data transmission safety and integrity, low data sharing rate and high energy consumption of the existing data sharing method.

The invention provides a data sharing method of an electric power Internet of things, which is applied to mobile intelligent equipment and comprises the following steps:

training an optimal routing inspection path at a routing inspection site of the power system;

patrolling the power system based on the optimal patrolling path to obtain sensing data;

and uploading the sensing data to a preset block chain.

Optionally, the step of training an optimal patrol route at a patrol site of the power system includes:

acquiring the current environmental state of the power system at the current moment of the inspection site;

inputting the current environment state into a preset first network to obtain a current output action;

calculating rewards according to the current output action and the current environment state, and generating a next environment state;

storing the current environmental state, the current output action, the reward, and the next environmental state in a preset playback pool;

judging whether the current time is equal to a preset time or not, if not, adopting the time corresponding to the next environment state as the current time, and returning to the step of inputting the current environment state into a preset first network to obtain a current output action until the current time is equal to the preset time;

acquiring a training sample from the playback pool, calculating a target value of the training sample through a preset second network, and obtaining a strategy gradient based on the target value;

optimizing the second network based on the target value to obtain a second optimization parameter;

optimizing the first network by adopting the strategy gradient to obtain a first optimization parameter;

and generating an optimal routing inspection path based on the first optimization parameter and the second optimization parameter.

Optionally, the step of uploading the sensing data to a preset block chain includes:

obtaining a private key;

signing the perception data by adopting the private key to obtain encrypted data;

sending the encrypted data to a preset authentication and authorization center for verification;

and when receiving verification passing information returned by the authentication and authorization center, sending the encrypted data to a preset block chain.

Optionally, the step of sending the encrypted data to a preset block chain when receiving verification passing information returned by the authentication and authorization center includes:

when verification passing information returned by the authentication and authorization center is received, a storage request is sent to the block chain; the storage request carries the encrypted data; and the block chain is used for verifying the encrypted data and storing the sensing data when the verification is passed.

The invention also provides a data sharing device of the power Internet of things, which is applied to mobile intelligent equipment, and the device comprises:

the optimal routing inspection path training module is used for training an optimal routing inspection path on a routing inspection site of the power system;

the perception data acquisition module is used for patrolling the power system based on the optimal patrolling path to acquire perception data;

and the uploading module is used for uploading the sensing data to a preset block chain.

Optionally, the optimal patrol path training module includes:

the current environment state acquisition submodule is used for acquiring the current environment state of the power system at the current moment of the inspection site;

the current output action acquisition submodule is used for inputting the current environment state into a preset first network to obtain a current output action;

the reward calculation submodule is used for calculating reward according to the current output action and the current environment state and generating the next environment state;

a storage submodule, configured to store the current environment state, the current output action, the reward, and the next environment state in a preset playback pool;

the circulation submodule is used for judging whether the current time is equal to a preset time or not, if not, the time corresponding to the next environment state is taken as the current time, and the step of inputting the current environment state into a preset first network to obtain the current output action is returned until the current time is equal to the preset time;

the strategy gradient generation submodule is used for acquiring a training sample from the playback pool, calculating a target value of the training sample through a preset second network, and obtaining a strategy gradient based on the target value;

a second optimization parameter obtaining submodule, configured to optimize the second network based on the target value, so as to obtain a second optimization parameter;

a first optimization parameter obtaining submodule, configured to optimize the first network by using the policy gradient to obtain a first optimization parameter;

and the optimal routing inspection path generation submodule is used for generating an optimal routing inspection path based on the first optimization parameter and the second optimization parameter.

Optionally, the upload module includes:

the private key obtaining sub-module is used for obtaining a private key;

the encryption submodule is used for signing the sensing data by adopting the private key to obtain encrypted data;

the verification sub-module is used for sending the encrypted data to a preset authentication authorization center for verification;

and the uploading sub-module is used for sending the encrypted data to a preset block chain when receiving verification passing information returned by the authentication and authorization center.

Optionally, the upload sub-module includes:

the uploading unit is used for sending a storage request to the block chain when receiving verification passing information returned by the authentication and authorization center; the storage request carries the encrypted data; and the block chain is used for verifying the encrypted data and storing the sensing data when the verification is passed.

The invention also provides an electronic device comprising a processor and a memory:

the memory is used for storing program codes and transmitting the program codes to the processor;

the processor is used for executing the data sharing method of the power internet of things according to instructions in the program codes.

The invention also provides a computer readable storage medium for storing program code for executing the data sharing method of the power internet of things as described in any one of the above.

According to the above technical solution, the invention has the following advantages: the optimal routing inspection path of the mobile intelligent device at the inspection site of the power system is trained based on a preset deterministic policy gradient algorithm, so that the mobile intelligent device can share as much data as possible while reducing its energy consumption cost. Meanwhile, uploading the sensing data collected during inspection to a block chain for storage improves the security and integrity of data storage.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without inventive exercise.

FIG. 1 is a flowchart of the steps of a data sharing method of the power Internet of Things according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a deep reinforcement learning process according to an embodiment of the present invention;

FIG. 3 is a flowchart of the steps of training an optimal routing inspection path according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of data sensing based on multi-agent DRL according to an embodiment of the present invention;

FIG. 5 is an MST training framework provided by an embodiment of the present invention;

FIG. 6 is a flowchart of the steps of uploading sensing data to a preset block chain according to an embodiment of the present invention;

FIG. 7 is a block diagram of a data sharing apparatus of the power Internet of Things according to an embodiment of the present invention.

Detailed Description

The embodiment of the invention provides a data sharing method considering energy consumption efficiency in an electric power Internet of things based on a block chain, which is used for solving the technical problems of low data transmission safety and integrity, low data sharing rate and high energy consumption in the existing data sharing method.

In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, fig. 1 is a flowchart illustrating steps of a data sharing method for an internet of things of electric power according to an embodiment of the present invention.

The invention provides a data sharing method of an electric power Internet of things, which is applied to mobile intelligent equipment, wherein the mobile intelligent equipment can comprise an inspection robot and an unmanned aerial vehicle, and the method specifically comprises the following steps:

Step 101, training an optimal routing inspection path at an inspection site of a power system;

Manually operated on-site inspection cannot fully take into account the insufficient endurance of the unmanned aerial vehicle, and existing motion path optimization algorithms for mobile intelligent devices have no autonomous learning capability and cannot minimize the device's own energy consumption cost while sharing as much data as possible. To address these problems, the embodiment of the invention models the learning of the optimal routing inspection path of the mobile intelligent device as a Markov Decision Process (MDP) with a continuous action space. Each mobile smart terminal (MST) is an agent whose goal is to maximize its overall reward by learning the optimal inspection path policy through interaction with the environment. Because the invention is oriented to distributed, continuous multi-agent data sensing and sharing, traditional DRL (Deep Reinforcement Learning) algorithms cannot meet its requirements. Therefore, the embodiment of the invention provides a multi-agent DRL solution based on DDPG (Deep Deterministic Policy Gradient) to train optimal paths across multiple agents. In particular, a mixed cooperation-competition relationship exists between MSTs that communicate with each other: a single MST can cooperate with other MSTs to complete the field data sensing of the power system, but when the total number of data points is relatively fixed, MSTs also compete with each other in order to maximize their own rewards. In one example, the multi-agent DRL can employ a framework of centralized training and decentralized execution. During training, the additional information of some MSTs (e.g., actions, rewards, training cycles, etc.) may be used to train the cooperative learning policies of other MSTs. During execution, however, the private information of other MSTs may not be used, so as to maintain the independence and autonomy of each MST.

The Markov decision process is a mathematical model of sequential decision-making, used to model the policies an agent can follow and the returns it can achieve in an environment whose system state has the Markov property. An MDP is built on a pair of interacting objects, the agent and the environment, and its elements include states, actions, policies and rewards. In an MDP, the agent perceives the current system state and acts on the environment according to its policy, thereby changing the state of the environment and receiving a reward; the accumulation of rewards over time is referred to as the return.
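As a minimal illustration of how per-step rewards accumulate into a return, the following sketch computes a discounted sum. The function name `discounted_return` and the use of a discount factor at this point are assumptions made for illustration; the description introduces the discount factor only later, in the training procedure.

```python
def discounted_return(rewards, gamma=0.9):
    """Accumulate per-step rewards into one return, discounting later steps.

    `gamma` is an assumed discount factor in (0, 1]; gamma = 1 gives a plain sum.
    """
    total = 0.0
    # Walk backwards so each step adds its reward to the discounted future return.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Three steps with reward 1 each and gamma = 0.5: 1 + 0.5 * (1 + 0.5 * 1)
example = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```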

DRL combines the perception ability of Deep Learning (DL) and the decision-making ability of Reinforcement Learning (RL), can be controlled directly according to input state information, and is an artificial intelligence method closer to a human thinking mode. The brief learning process may be as shown in FIG. 2:

At each moment, the agent (a software or hardware entity capable of autonomous activity, such as the mobile intelligent device in the embodiment of the present invention) interacts with the environment to obtain a high-dimensional observation, and perceives the observation (Observations) using a DL method to obtain a specific state feature representation; it evaluates the value function of each action based on the expected reward (Reward), and maps the current state to a corresponding action (Action) through a certain policy; the environment (Environment) reacts to this action and produces the next observation. By continuously cycling through the above process, the optimal policy for achieving the goal can finally be obtained.

Step 102, inspecting the power system based on the optimal routing inspection path to obtain sensing data;

In the embodiment of the invention, the mobile intelligent device can collect field data of the power system (such as substations, power transmission lines and the like) and obtain the running condition of the equipment.

Step 103, uploading the sensing data to a preset block chain.

The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms, and is characterized by decentralization, tamper resistance, full-process traceability, collective maintenance, openness and transparency.

In order to provide a safe data storage and sharing service, the mobile intelligent device in the embodiment of the invention can realize distributed storage and sharing of the sensed field data through the blockchain with the help of the edge server.

The optimal routing inspection path of the mobile intelligent device at the inspection site of the power system is trained based on a preset deterministic policy gradient algorithm, so that the mobile intelligent device can share as much data as possible while reducing its energy consumption cost. Meanwhile, uploading the sensing data collected during inspection to a block chain for storage improves the security and integrity of data storage.

Referring to fig. 3, fig. 3 is a flowchart illustrating a procedure for training an optimal routing inspection path according to an embodiment of the present invention.

In the embodiment of the present invention, the step of training the optimal routing inspection path at the routing inspection site of the power system may specifically include:

S31, acquiring the current environmental state of the power system at the current moment of the inspection site;

S32, inputting the current environment state into a preset first network to obtain the current output action;

S33, calculating a reward according to the current output action and the current environment state, and generating the next environment state;

S34, storing the current environment state, the current output action, the reward and the next environment state in a preset playback pool;

S35, judging whether the current time is equal to a preset time; if not, adopting the time corresponding to the next environment state as the current time, and returning to the step of inputting the current environment state into a preset first network to obtain the current output action, until the current time is equal to the preset time;

S36, obtaining training samples from the playback pool, calculating the target value of the training samples through a preset second network, and obtaining a policy gradient based on the target value;

S37, optimizing the second network based on the target value to obtain a second optimization parameter;

S38, optimizing the first network by adopting the policy gradient to obtain a first optimization parameter;

and S39, generating an optimal routing inspection path based on the first optimization parameter and the second optimization parameter.
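The data collection part of the above steps (S31 to S36) can be sketched as a simple loop. This is only a sketch under stated assumptions: `collect_episode`, `toy_policy` and `toy_env_step` are hypothetical names, the one-dimensional toy environment stands in for the inspection site, and the network optimization of S37 to S39 is not shown.

```python
import random
from collections import deque


def collect_episode(policy, env_step, initial_state, horizon, pool):
    """Run one episode and store every transition in the playback pool."""
    state = initial_state                        # S31: current environment state
    for _ in range(horizon):                     # S35: repeat until the preset time
        action = policy(state)                   # S32: first network gives the action
        reward, next_state = env_step(state, action)       # S33: reward and next state
        pool.append((state, action, reward, next_state))   # S34: store the transition
        state = next_state
    return pool


def toy_policy(state):
    """Hypothetical stand-in for the first network: always move one unit."""
    return 1.0


def toy_env_step(state, action):
    """Hypothetical environment: reward is higher the closer the MST gets to x = 5."""
    next_state = state + action
    return -abs(next_state - 5.0), next_state


pool = deque(maxlen=1000)
collect_episode(toy_policy, toy_env_step, 0.0, 4, pool)
batch = random.sample(list(pool), 2)             # S36: draw training samples
```

A bounded `deque` is used so that the oldest transitions are discarded once the pool is full, which is a common design choice for playback pools rather than something the description specifies.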

In one example of the present invention, an embodiment of the present invention proposes a DDPG based multi-agent DRL solution to achieve optimal path training between multi-agents. In particular, a mixed cooperation-competition relationship exists between the MSTs which are communicated with each other, and a single MST can cooperate with other MSTs to complete the field data perception of the power system. However, in the case that the total amount of data points is relatively fixed, there is a case that MSTs compete with each other in order to maximize the self-reward. In one example, a multi-agent DRL can employ a framework of centralized training, decentralized execution. During the training process, the additional information (e.g., actions, rewards, training periods, etc.) of some MSTs may be used to train the cooperative learning strategies of other MSTs. But may not use the private information of other MSTs in the process of execution to maintain independence and autonomy of the MSTs.

Referring to fig. 4, fig. 4 shows a schematic diagram of data sensing based on multi-agent DRL.

As shown in fig. 4, states, actions and rewards are the three basic elements of DRL. Given a state and a series of alternative actions, the goal of an MST is to find a policy that maximizes the cumulative reward. It is assumed that MST $m$ ($m = 1, 2, \ldots, M$) obtains the state $s_t^m$ by observing the environment and selects an action $a_t^m$ that acts on the environment, for which it obtains a reward $r_t^m$. The system environment is composed of a group of states, including the current distribution of inspection data points, the MST position coordinates, historical tracks and the like. After the action $a_t^m$ is executed, the environment state transitions from $s_t$ to $s_{t+1}$. The present invention defines the state, action and reward as follows:

1) State space $S$: $S = \{S_1, S_2, S_3\}$ is a description of the environment.

$S_1$ represents the coordinate positions of the inspection data points and the obstacles in the two-dimensional area. Each data point $n$ in the set of data points to be collected within the sensing region has coordinates $(x^n, y^n)$, and each obstacle $c$ in the set of obstacles within the sensing region has coordinates $(x^c, y^c)$, where $x^n, x^c \in [0, Q_x]$ and $y^n, y^c \in [0, Q_y]$; $Q_x$ and $Q_y$ respectively represent the maximum X-axis and Y-axis coordinates of the region in which the mobile smart terminal (MST) senses data.

$S_2$ defines the current position of each MST in the two-dimensional region. For each MST $m$ in the set of mobile smart devices, it contains the coordinates of MST $m$ and the percentage of remaining power of MST $m$ at time $t$.

$S_3$ represents the accumulated sensed time $h_t(n) \in [0, T]$ of data point $n$ at time $t$, where $T$ represents the maximum number of time periods in each data sensing round:

$$h_{t+1}(n) = h_t(n) + 1$$

2) Action space $A$: the action space defines the continuous movement behavior of the MST within the two-dimensional target region. Each action consists of the direction (angle) of the MST movement and the distance moved, where $l_{\max}$ indicates the maximum distance that the MST can move within a unit time $t$.
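To make the action space concrete, the following sketch applies one action (a direction and a distance) to an MST position. It assumes that the angle is given in radians measured from the X axis, that a requested distance above the maximum is clipped, and that the MST is held at the boundary of the sensing region; none of these conventions is stated in the description, and `apply_action` is a hypothetical helper.

```python
import math


def apply_action(x, y, angle, distance, l_max, q_x, q_y):
    """Move an MST from (x, y) by one action and return the new position.

    angle    : movement direction in radians (assumed convention)
    distance : requested travel distance, clipped to [0, l_max]
    q_x, q_y : maximum X and Y coordinates of the sensing region
    """
    travelled = max(0.0, min(distance, l_max))
    new_x = x + travelled * math.cos(angle)
    new_y = y + travelled * math.sin(angle)
    # Assumed boundary handling: keep the MST inside [0, q_x] x [0, q_y].
    new_x = max(0.0, min(new_x, q_x))
    new_y = max(0.0, min(new_y, q_y))
    return new_x, new_y, travelled


# A request of 5 units along the X axis is clipped to l_max = 2.
position = apply_action(0.0, 0.0, 0.0, 5.0, 2.0, 10.0, 10.0)
```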

3) Reward $R$: the reward includes the amount of data perceived and the energy consumed, where energy consumption is mainly due to MST data perception and self-movement. Suppose that $\alpha$ represents the energy expended to perceive a unit of data and $\beta$ represents the energy expended to move a unit of distance. The reward $r_t^m$ is then calculated from the perceived data amount and the energy consumption.
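The energy model described above can be sketched as a weighted sum of the data perceived and the distance moved. The way the reward then combines data amount and energy is an assumption in this sketch: a data-per-energy ratio is used purely for illustration, and the reward formula of the embodiment may differ. The names `energy_consumption` and `efficiency_reward` are hypothetical.

```python
def energy_consumption(data_amount, distance, alpha, beta):
    """Energy spent in one time step: alpha per unit of data, beta per unit of distance."""
    return alpha * data_amount + beta * distance


def efficiency_reward(data_amount, distance, alpha, beta):
    """Hypothetical reward: data perceived per unit of energy consumed.

    This ratio is an illustrative assumption, not the formula of the embodiment.
    """
    energy = energy_consumption(data_amount, distance, alpha, beta)
    if energy == 0.0:
        return 0.0  # nothing sensed and no movement, so no reward
    return data_amount / energy


# 4 units of data and 2 units of distance with alpha = 0.5 and beta = 1.0
step_energy = energy_consumption(4.0, 2.0, 0.5, 1.0)
step_reward = efficiency_reward(4.0, 2.0, 0.5, 1.0)
```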

Each MST is trained by four deep neural networks: an action network, the Actor network $\mu(s \mid \theta^{\mu})$; the Critic network $Q(s, a \mid \theta^{Q})$; and the target networks T_Actor and T_Critic. In the embodiment of the present invention, as shown in fig. 5, the Actor network and the target network T_Actor constitute the first network, and the Critic network $Q(s, a \mid \theta^{Q})$ and the target network T_Critic constitute the second network. Here $\theta^{\mu}$ and $\theta^{Q}$ are the randomly initialized parameters of the Actor and Critic networks, and the initial parameters of the T_Actor and T_Critic networks are $\theta^{\mu'} := \theta^{\mu}$ and $\theta^{Q'} := \theta^{Q}$. The function of each neural network is as follows:

1) Actor network: responsible for iteratively updating the parameters $\theta^{\mu}$; based on the input current environment state $s_t$, it selects the current output action $a_t$, which is used to interact with the environment, generate the next state $s_{t+1}$ and receive the current reward $r_t$.

2) Critic network: responsible for iteratively updating the parameters $\theta^{Q}$ and calculating the current Q value $Q(s, a \mid \theta^{Q})$.

3) T_Actor network: selects the optimal next action $a'_t$ according to the mini-batch of training samples sampled from the experience playback pool $B$; its network parameters $\theta^{\mu'}$ are periodically updated from $\theta^{\mu}$.

4) T_Critic network: responsible for calculating $Q'(s', a' \mid \theta^{Q'})$; its network parameters $\theta^{Q'}$ are periodically updated from $\theta^{Q}$.

In each round of data perception, the MST first observes the environment state $s_t \in S$ and inputs the state $s_t$ to the Actor network to generate an action output $a_t$. To increase the randomness of the training process and the exploration of the state space, a certain amount of noise can be added to the selected action, so that $a_t$ is the output of the Actor network $\mu(s_t \mid \theta^{\mu})$ plus a noise term, where the noise can be generated according to the Ornstein-Uhlenbeck (OU) random process.
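One common way to generate such exploration noise is a discretised Ornstein-Uhlenbeck process, sketched below for a single action dimension. The class name `OUNoise`, the parameter names and the default values are assumptions chosen for illustration and are not taken from the embodiment.

```python
import math
import random


class OUNoise:
    """Discretised Ornstein-Uhlenbeck process for one action dimension."""

    def __init__(self, mean=0.0, reversion_rate=0.15, sigma=0.2, dt=1.0, x0=0.0, seed=None):
        self.mean = mean                      # long-run value the noise drifts back to
        self.reversion_rate = reversion_rate  # strength of the pull towards the mean
        self.sigma = sigma                    # scale of the random perturbation
        self.dt = dt                          # length of one time step
        self.x = x0                           # current noise value
        self.rng = random.Random(seed)

    def sample(self):
        """Advance the process by one step and return the new noise value."""
        drift = self.reversion_rate * (self.mean - self.x) * self.dt
        diffusion = self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.x += drift + diffusion
        return self.x


def noisy_action(actor_output, noise):
    """Add temporally correlated exploration noise to the Actor network output."""
    return actor_output + noise.sample()
```

Because each sample depends on the previous one, consecutive noise values are correlated, which tends to give smoother exploration of movement directions than independent noise at every step.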

In the centralized training process, each MST has a private playback pool $B_m$ that stores state transition samples $(s_t, a_t, r_t, s_{t+1})$, and $H$ samples are drawn from each private playback pool to form $H$ groups of mini-batch training samples. The target T_Actor network outputs the action $a'_t$ from the mini-batch training samples, and the Critic network updates its own parameters by minimizing the loss function $\mathrm{Loss}(\theta^{Q})$. The loss is computed against the target value $y_t$ of the training samples, in which $\gamma \in (0, 1)$ represents the discount factor.
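For orientation, the sketch below shows the target value and Critic loss in the form used by standard single-agent DDPG: the target is the reward plus the discounted T_Critic value of the next state and next action, and the loss is the mean squared difference between targets and current Q values. These standard forms are an assumption here; the multi-agent formulas of the embodiment may differ. Plain floats stand in for network outputs.

```python
def td_target(reward, next_q, gamma):
    """Target value: y = r + gamma * Q'(s', a'), with Q' given by the T_Critic network."""
    return reward + gamma * next_q


def critic_loss(targets, q_values):
    """Mean squared error between target values and the Critic's current Q values."""
    if len(targets) != len(q_values):
        raise ValueError("targets and q_values must have the same length")
    squared_errors = [(y - q) ** 2 for y, q in zip(targets, q_values)]
    return sum(squared_errors) / len(squared_errors)


# Mini-batch of two samples: (reward, Q' of the next state-action pair, current Q)
samples = [(1.0, 2.0, 1.0), (0.0, 0.0, 2.0)]
targets = [td_target(r, next_q, gamma=0.5) for r, next_q, _ in samples]
loss = critic_loss(targets, [q for _, _, q in samples])
```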

Then, the Actor network is updated according to the policy gradient.

It is noted that the target networks, namely the T_Actor network and the T_Critic network, are copies of the Actor and Critic networks, respectively. After the parameters of the Actor network and the Critic network are updated, the parameters of the T_Actor and T_Critic networks may be updated in a soft update manner in order to improve the stability of the learning process. The details are as follows:

$$\theta^{\mu'} := \tau\theta^{\mu} + (1 - \tau)\theta^{\mu'}$$

$$\theta^{Q'} := \tau\theta^{Q} + (1 - \tau)\theta^{Q'}$$

wherein τ is an update factor, and generally takes a small value (e.g., 0.1 or 0.01). After sufficient learning, all parameters of the neural network are optimized to obtain the optimal action selection strategy, so that the optimal routing inspection path is obtained. The data sensing task is completed through the optimal routing inspection path, and energy consumption cost can be reduced.
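The soft update rule can be sketched directly from the two formulas above. Representing the network parameters as flat lists of floats is a simplification made for this sketch, and `soft_update` is a hypothetical helper name.

```python
def soft_update(target_params, online_params, tau):
    """Blend online parameters into target parameters.

    Implements theta' := tau * theta + (1 - tau) * theta' element by element.
    """
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [
        tau * online + (1.0 - tau) * target
        for online, target in zip(online_params, target_params)
    ]


# With tau = 0.5 each target parameter moves halfway towards the online value.
new_target = soft_update([0.0, 2.0], [1.0, 4.0], tau=0.5)
```

With a small update factor such as 0.01, the target networks trail the online networks slowly, which is what keeps the target value from shifting abruptly between updates.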

In one example, the multi-agent optimal patrol path learning process is as shown in table 1 below:

TABLE 1

Referring to fig. 6, fig. 6 is a flowchart illustrating a procedure for uploading sensing data to a predetermined block chain according to an embodiment of the present invention.

In this embodiment of the present invention, the step of uploading the sensing data to the preset block chain may include:

S61, obtaining a private key;

S62, signing the sensing data by using the private key to obtain encrypted data;

S63, sending the encrypted data to a preset authentication and authorization center for verification;

and S64, when the verification passing information returned by the authentication and authorization center is received, sending the encrypted data to a preset block chain.
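Steps S61 to S64 can be illustrated with a toy signature scheme. The sketch below uses textbook RSA with a deliberately tiny key (p = 61, q = 53), which is insecure and chosen only so that the example runs with the Python standard library; the embodiment does not specify a signature algorithm, and the function names `sign`, `verify` and `upload` are hypothetical.

```python
import hashlib

# Toy RSA key (p = 61, q = 53). Far too small for real use; illustration only.
MODULUS = 3233
PUBLIC_EXPONENT = 17
PRIVATE_EXPONENT = 2753


def digest(data):
    """Hash the sensing data and reduce it into the toy key's range."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % MODULUS


def sign(data, private_exponent=PRIVATE_EXPONENT):
    """S61/S62: sign the hash of the data with the private key."""
    return pow(digest(data), private_exponent, MODULUS)


def verify(data, signature, public_exponent=PUBLIC_EXPONENT):
    """S63: check the signature with the public key, as the authorization center would."""
    return pow(signature, public_exponent, MODULUS) == digest(data)


def upload(data, signature, center_verify, chain):
    """S64: send the signed data to the chain only after verification passes."""
    if not center_verify(data, signature):
        return False
    chain.append((data, signature))
    return True


chain = []
reading = b"substation-7 temperature=41.5"
accepted = upload(reading, sign(reading), verify, chain)
```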

In practical applications, in order to provide a secure data storage and sharing service, a private chain may be constructed based on Ethereum, wherein the private chain includes miner nodes and non-miner nodes.

Ethereum is an open-source public blockchain platform with smart contract functionality. It provides the decentralized Ethereum Virtual Machine (EVM) to process peer-to-peer contracts by means of its dedicated cryptocurrency, Ether (abbreviated as "ETH").

1) Miner nodes: on the basis of satisfying the block chain consensus protocol, miner nodes write data sharing transactions into blocks and participate in transaction verification. Considering that participating in block consensus and verification requires a certain amount of computing power from the MST, the embodiment of the present invention introduces an edge computing server to help the MST complete the computation tasks and to add verified blocks to the blockchain network.

2) Non-miner nodes: since a non-miner node is only responsible for accepting and broadcasting data sharing transaction requests, it need not possess the same computing resources as a miner node.

According to the characteristics of the block chain, each Ethereum node keeps a complete and verified copy of the block chain and of the MST service. Assuming that each MST is actively connected to the blockchain network, after each round of data sensing the MST uploads the data sensed at the power system site to the blockchain. Therefore, by the $u$-th round, the total number of data storage requests issued by all MSTs is $Y = M \times u$, where $M$ is the number of MSTs and $u$ is the number of data sensing rounds. In addition, to ensure that the data sent to the blockchain network is valid and not forged, the MST needs to sign the data using its own private key so that the legitimacy of the data source can be verified. The data sharing process is shown in table 2 below.

TABLE 2

In Table 2, id^m: the identity (ID) obtained after registration of the MST numbered m;

cert^m: the authentication certificate obtained after registration of the MST numbered m;

pk^m: the public key of the MST numbered m;

sk^m: the private key of the MST numbered m;

wa^m: the blockchain wallet address of the MST numbered m;

the cumulative total amount of data collected by the MST numbered m in the u-th data sensing round;

the result of encrypting the collected data with a hash algorithm;

the signature produced on the encrypted data with the private key sk^m of MST m;

the verification, by the authentication and authorization center, of the encrypted data packet carrying the private-key signature, to prevent data forgery.

In a specific implementation, Gen(1^v) is first executed to obtain the public-private key pair (pk^m, sk^m) of MST m, where 1^v denotes the security parameter. The total number of rounds in which the MST senses data is U, and each round comprises T time periods. Thus, the amount of data that the MST has cumulatively sensed by the end of each data sensing round is accumulated from the amount of data sensed by MST m in each t-th time period of sensing round u.

The MST signs the data with its own private key sk^m and then sends it to the authentication and authorization center for verification. The authentication and authorization center decrypts the data packet and verifies whether it was sent by a legitimate MST. In addition, the value obtained after decryption is compared with the data stored in the MST to prevent data forgery.

After receiving the storage request from the MST, the blockchain verifies the signature on the request and determines, based on the verification result, whether to submit the transaction to the blockchain network. A submitted request transaction is mined by a miner node and written into a new block.
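The accumulate, hash, sign and verify steps described above can be sketched in Python as follows. Several points are assumptions rather than details from the embodiment: the per-round accumulation is taken to be a plain sum over the T time periods, the hash algorithm is taken to be SHA-256, and the signature scheme is a toy textbook RSA without padding, built from two known Mersenne primes. The text does not name a signature scheme, and this toy scheme is insecure and serves only to show the sign-with-sk^m, verify-with-pk^m pattern.

```python
import hashlib

# Toy key generation standing in for Gen(1^v); NOT secure, illustration only.
P = 2**127 - 1                      # known Mersenne prime
Q = 2**89 - 1                       # known Mersenne prime
N = P * Q                           # public modulus (part of pk^m)
E = 65537                           # public exponent (part of pk^m)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (plays the role of sk^m)


def cumulative_volume(per_period_volumes):
    """Assumed accumulation: sum of the volumes sensed in the T periods of a round."""
    return sum(per_period_volumes)


def digest(data: bytes) -> int:
    """Hash the data with SHA-256 and reduce it into the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """MST side: sign the hashed data with the private exponent."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Verifier side: recover the digest with the public key and compare."""
    return pow(signature, E, N) == digest(data)


volume = cumulative_volume([10, 20, 30])            # hypothetical per-period volumes
packet = f"mst=1;round=1;volume={volume}".encode()  # hypothetical packet layout
signature = sign(packet)
```

A verifier holding only `N` and `E` can call `verify(packet, signature)`; any change to the packet after signing makes the recovered digest differ from the recomputed one, so the check fails.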

The distributed storage and sharing of the routing inspection data of the power system are realized by using the block chain, and the safety and the integrity of the power data are ensured. Meanwhile, the decentralized sharing mode improves the flexibility and the access efficiency of the system.

Referring to fig. 7, fig. 7 is a block diagram of a data sharing device of an electric power internet of things according to an embodiment of the present invention.

The embodiment of the invention provides a data sharing device of an electric power Internet of things, which is applied to mobile intelligent equipment and comprises:

an optimal routing inspection path training module 701, configured to train an optimal routing inspection path at a routing inspection site of the power system;

a perception data obtaining module 702, configured to patrol the power system based on the optimal patrol route, and obtain perception data;

an uploading module 703 is configured to upload the sensing data to a preset block chain.

In this embodiment of the present invention, the optimal routing inspection path training module 701 includes:

the storage submodule is used for storing the current environment state, the current output action, the reward and the next environment state in a preset playback pool;

the circulation submodule is used for judging whether the current time is equal to the preset time or not, if not, the time corresponding to the next environment state is adopted as the current time, and the step of inputting the current environment state into the preset first network to obtain the current output action is returned until the current time is equal to the preset time;

the strategy gradient generation submodule is used for acquiring a training sample from the playback pool, calculating the target value of the training sample through a preset second network, and obtaining a strategy gradient based on the target value;

the second optimization parameter obtaining submodule is used for optimizing a second network based on the target value to obtain a second optimization parameter;

the first optimization parameter obtaining submodule is used for optimizing the first network by adopting strategy gradient to obtain a first optimization parameter;
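The storage submodule and the target-value step of the strategy gradient generation submodule can be sketched in Python as follows. The class name `ReplayPool`, the capacity, the discount factor `gamma`, and the one-step form of the target value (reward plus discounted estimate from the second network) are assumptions made for illustration. The second network is represented here by an arbitrary callable that maps a state to a value estimate.

```python
import random
from collections import deque


class ReplayPool:
    """Hypothetical playback pool holding (state, action, reward, next_state) tuples."""

    def __init__(self, capacity=10000):
        # The oldest transitions are discarded once the capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Draw training samples uniformly at random, without replacement.
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)


def target_values(batch, second_network, gamma=0.9):
    """Assumed one-step target: reward + gamma * second_network(next_state)."""
    return [reward + gamma * second_network(next_state)
            for (_state, _action, reward, next_state) in batch]
```

For example, with a stand-in second network that always returns 2.0 and `gamma=0.5`, a transition with reward 1.0 yields a target value of 2.0. The resulting targets would then drive the update of the second network and the strategy gradient for the first network.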

In this embodiment of the present invention, the upload module 703 includes:

the private key obtaining sub-module is used for obtaining a private key;

the encryption submodule is used for signing the sensing data by adopting a private key to obtain encrypted data;

and the uploading sub-module is used for sending the encrypted data to the preset block chain when receiving the verification passing information returned by the authentication and authorization center.

In an embodiment of the present invention, the upload sub-module includes:

the uploading unit is used for sending a storage request to the block chain when receiving verification passing information returned by the authentication and authorization center; the storage request carries encrypted data; the block chain is used for verifying the encrypted data and storing the sensing data when the verification is passed.

the memory is used for storing the program codes and transmitting the program codes to the processor;

the processor is used for executing the data sharing method of the power internet of things according to the instructions in the program codes.

The embodiment of the invention also provides a computer-readable storage medium, which is used for storing the program codes, and the program codes are used for executing the data sharing method of the power internet of things in the embodiment of the invention.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.

Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A data sharing method of an electric power Internet of things, applied to mobile intelligent equipment, comprising the following steps:

and uploading the sensing data to a preset block chain.

2. The method of claim 1, wherein the step of training an optimal routing inspection path at a routing inspection site of the power system comprises:

3. The method of claim 1, wherein the step of uploading the perception data to a preset blockchain comprises:

obtaining a private key;

4. The method according to claim 3, wherein the step of sending the encrypted data to a preset block chain when receiving verification passing information returned by the authentication and authorization center comprises:

5. A data sharing device of an electric power Internet of things, characterized in that the device is applied to mobile intelligent equipment and comprises:

6. The apparatus of claim 5, wherein the optimal routing inspection path training module comprises:

7. The apparatus of claim 5, wherein the upload module comprises:

the private key obtaining sub-module is used for obtaining a private key;

8. The apparatus of claim 7, wherein the upload sub-module comprises:

9. An electronic device, comprising a processor and a memory:

the processor is used for executing the data sharing method of the power internet of things as claimed in any one of claims 1-4 according to instructions in the program code.

10. A computer-readable storage medium for storing program code for performing the data sharing method of the power internet of things of any one of claims 1-4.