CN114936606A - Asynchronous decentralized model training method suitable for edge Internet of things agent device - Google Patents


Info

Publication number
CN114936606A
Authority
CN
China
Prior art keywords
model
edge internet
things
agent device
edge
Prior art date
Legal status
Granted
Application number
CN202210651051.5A
Other languages
Chinese (zh)
Other versions
CN114936606B (en)
Inventor
于东晓
张良旭
陈姝祯
邹逸飞
王鹏
杜超
Current Assignee
Shandong University
Shanghai Step Electric Corp
Original Assignee
Shandong University
Priority date
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202210651051.5A
Publication of CN114936606A
Application granted
Publication of CN114936606B
Status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60: Protecting data
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/54: Interprogram communication
    • G06F 9/544: Buffers; Shared memory; Pipes
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S 10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S 10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention discloses an asynchronous decentralized model training method suitable for an edge Internet of things agent device, which comprises the following steps: each edge Internet of things agent device collects data from user terminals and initializes it; calculates a loss function value, performs back propagation according to the loss function value to obtain a gradient, and updates the model by using a stochastic gradient descent formula; applies a differential privacy mechanism to the updated model for data privacy protection; sends the model to its neighbors, receives the neighbor models, and updates a local buffer pool whenever a neighbor model is received; and aggregates the models in the local buffer pool to obtain the model for the next iteration. The disclosed training method reduces the bandwidth requirement and improves the robustness of the algorithm through decentralization, and accelerates training and improves resource utilization through an asynchronous mode; meanwhile, differential privacy technology is used to protect local data privacy and effectively prevent privacy disclosure.

Description

Asynchronous decentralized model training method suitable for edge Internet of things agent device
Technical Field
The invention belongs to the field of distributed machine learning, and particularly relates to an asynchronous decentralized model training method suitable for an edge Internet of things agent device.
Background
Distributed machine learning refers to a computing technique that distributes data across different machines for parallel large-scale machine learning training. The parameter server architecture is a basic distributed model: each client computes gradients from its local data and sends them to the server, and the server aggregates the models and sends the global model back to each client for the next round of gradient computation. This architecture breaks the bottleneck of stand-alone training and can process larger-scale data, but its robustness is limited. First, the architecture is limited by the bandwidth and performance of the parameter server: when the parameter server is down or damaged, the entire training process terminates until the server functions properly again. Second, each client must wait for the other clients to finish training and send their models to the parameter server before entering the next round, so the performance of the parameter server architecture is limited by the worst-performing client, which wastes training resources.
Decentralized machine learning can effectively solve the above problems. Under a decentralized architecture, all clients form a peer-to-peer network, and each client has own data and model. In each training round, each client uses its own data to perform gradient calculation, then shares the model with the connected neighbor clients, and calculates a local average model. Each client only needs to communicate with the connected neighbor clients and updates the model of the client, and the downtime or the error of any client only affects the training of a part of machines, so that the decentralized machine learning has higher robustness to the problems of communication bottleneck, edge equipment failure and the like.
The architecture of decentralized machine learning is tightly coupled with edge computing and can be applied to many of its fields. In the smart grid field, power load forecasting must predict the electricity demand of many locations, so the electricity consumption data of each region are uploaded to local edge devices and cooperatively trained into a global prediction model. For another example, in an intelligent inspection system, a machine learning model is needed to determine whether an inspected circuit or piece of equipment has a fault; however, the fault information differs between scene areas, and the data may come from a large number of inspection devices in different areas. Therefore, inspection robots can be used for data acquisition, and the data are then uploaded to local edge devices to cooperatively train a global model covering the various kinds of fault information. Edge computing essentially provides a decentralized service, and decentralized cooperative training therefore has unprecedented advantages in the edge computing field.
Although decentralized cooperative training algorithms have many advantages over the algorithms of the parameter server architecture, they still have certain disadvantages. As introduced above, each client needs to communicate with the neighbor's clients and share model data in each round of training. Therefore, any client needs to synchronously wait for the completion of the training of all the neighbor nodes and successfully receive the model to perform the next round of training, which causes resource waste to a certain extent. In addition, data privacy during communication needs to be further ensured.
Disclosure of Invention
In order to solve the technical problems, the invention provides an asynchronous decentralized model training method suitable for an edge Internet of things agent device, bandwidth requirements are reduced through decentralized and robustness of an algorithm is improved, and training is accelerated through an asynchronous mode and resource utilization rate is improved; meanwhile, the protection of the local data privacy is realized by utilizing a differential privacy technology.
In order to achieve the purpose, the technical scheme of the invention is as follows:
an asynchronous decentralized model training method suitable for an edge Internet of things agent device comprises the following steps:
(1) an initialization stage: each edge Internet of things agent device collects data from a user terminal, preprocesses the data to obtain a sample data set, and initializes a local model and a local buffer pool;
(2) a model updating stage: each edge Internet of things agent device calculates a loss function value on the sample data set, and then performs back propagation according to the loss function value to obtain a gradient; the model is updated by using a stochastic gradient descent formula;
(3) a privacy protection stage: each edge Internet of things agent device utilizes a differential privacy mechanism to introduce quantitative noise into the updated model for data privacy protection, and a noise-adding model is obtained;
(4) a model sharing stage: each edge Internet of things agent device sends the noise-added model to a neighbor edge Internet of things agent device, receives the model from the neighbor edge Internet of things agent device, and updates the model to a local buffer pool when receiving the neighbor model;
(5) a model aggregation stage: each edge Internet of things agent device aggregates the models in its local buffer pool to obtain the model for the next iteration, and returns to step (2) until the maximum number of iteration rounds is reached, after which the model is output.
In the above scheme, the parameters initialized in step (1) include: the total iteration number K, the learning rate γ, the weight matrix W, the variance σ of the noise, the number n of edge Internet of things agent devices, the initialization model $x_i^0$ of each edge Internet of things agent device i, the initialized local buffer pool variables $\tilde{x}_j^i$, the sample data set $D_i$, and the loss function, which adopts a cross-entropy loss function F(·).
In the above scheme, the step (2) is specifically as follows:

(2.1) in the k-th iteration, the edge Internet of things agent device i randomly samples a data sample $(\xi_i^k, y_i^k)$ from its local data set $D_i$, where $\xi_i^k \in \mathbb{R}^d$ represents the sample features, $y_i^k$ represents the sample label, d is the dimension of the sample features, and $\mathbb{R}^d$ represents the d-dimensional real space;

(2.2) the edge Internet of things agent device i trains the local model by using its AI module, namely, the sample features $\xi_i^k$ are input into the k-th round model $x_i^k$ to obtain a predicted value $y_{pred}$;

(2.3) the sample label $y_i^k$ and the predicted value $y_{pred}$ are input into the cross-entropy loss function, namely the loss function value is obtained according to the following formula:

$$F(x_i^k; \xi_i^k, y_i^k) = -\sum_{c} (y_i^k)_c \log (y_{pred})_c$$

(2.4) the gradient $\nabla F(x_i^k; \xi_i^k, y_i^k)$ of the loss function with respect to the model $x_i^k$ is obtained by back-propagation differentiation; the parameters are then updated with the stochastic gradient descent formula to obtain the updated model $x_i^{k+\frac{1}{2}}$:

$$x_i^{k+\frac{1}{2}} = x_i^k - \gamma \nabla F(x_i^k; \xi_i^k, y_i^k)$$

where γ is the learning rate.
In the above scheme, the step (3) is specifically as follows:

(3.1) according to the initialized noise variance σ, a d-dimensional noise vector $\epsilon_i^k$ is sampled from the Gaussian distribution with variance σ in each dimension, i.e. $\epsilon_i^k \sim \mathcal{N}(0, \sigma I_d)$;

(3.2) the sampled noise vector $\epsilon_i^k$ is added to the updated model $x_i^{k+\frac{1}{2}}$ to obtain the noise-added model $\hat{x}_i^k$:

$$\hat{x}_i^k = x_i^{k+\frac{1}{2}} + \epsilon_i^k$$
In the above scheme, the step (4) is specifically as follows:

(4.1) the edge Internet of things agent device i sends its own noise-added model $\hat{x}_i^k$ to each neighbor edge Internet of things agent device j by using the uplink communication interface and network protocol;

(4.2) the edge Internet of things agent device i receives the noise-added model $\hat{x}_j^{k'}$ from a neighbor edge Internet of things agent device j by using the uplink communication interface and network protocol, saves the model into its own buffer pool, and updates the buffer pool variable: $\tilde{x}_j^i \leftarrow \hat{x}_j^{k'}$.
In the above scheme, the step (5) is specifically as follows:

the model data in the buffer pool are aggregated by using the initialized weight matrix W, namely the edge Internet of things agent device i performs model aggregation to obtain the model for the next round:

$$x_i^{k+1} = \sum_{j=1}^{n} W_{ij}\, \tilde{x}_j^i$$

where $W_{ij}$, the element in the i-th row and j-th column, represents the weight assigned by edge Internet of things agent device i to the model $\tilde{x}_j^i$ received from edge Internet of things agent device j.
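The five stages above can be sketched as a single-process simulation (a sketch under assumptions: all names, the linear-model gradient, and the round-based scheduling are illustrative; the patented method runs the stages asynchronously on separate devices, and here σ is passed to `random.gauss` as a standard deviation):

```python
import random

def train_decentralized(models, datasets, W, K, gamma, sigma, grad_fn):
    """Single-process simulation of the five stages. models: list of weight
    vectors, datasets: per-device lists of (features, label), W: weight
    matrix, grad_fn(model, features, label) -> gradient vector."""
    n, d = len(models), len(models[0])
    # Stage (1): every device starts with a zeroed buffer pool.
    buffers = [[[0.0] * d for _ in range(n)] for _ in range(n)]
    for _ in range(K):
        noisy = []
        for i in range(n):
            # Stage (2): SGD step on one randomly sampled local example.
            feats, label = random.choice(datasets[i])
            g = grad_fn(models[i], feats, label)
            models[i] = [w - gamma * gw for w, gw in zip(models[i], g)]
            # Stage (3): add per-coordinate Gaussian noise (sigma = std dev).
            noisy.append([w + random.gauss(0.0, sigma) for w in models[i]])
        for i in range(n):
            # Stage (4): store received neighbor models in the buffer pool.
            for j in range(n):
                if W[i][j] > 0:
                    buffers[i][j] = noisy[j]
            # Stage (5): weighted aggregation over the buffer pool.
            models[i] = [sum(W[i][j] * buffers[i][j][t] for j in range(n))
                         for t in range(d)]
    return models
```

With σ = 0 and a doubly stochastic W, the loop reduces to plain decentralized SGD with gossip averaging, which is a useful sanity check before enabling noise.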
Through the technical scheme, the asynchronous decentralized model training method suitable for the edge Internet of things agent device has the following beneficial effects:
(1) each edge Internet of things agent device initializes a buffer pool to receive model data from a neighbor edge Internet of things agent device, updates variables in the buffer pool when the neighbor data is successfully received, and finally uses the variables in the buffer pool to aggregate the models; by an asynchronous mode, any edge Internet of things agent device can directly use model data in the buffer pool for aggregation, and the training of other edge Internet of things agent devices is not required to be completed. This will greatly increase the speed of model training and solve the problem of resource waste caused by different equipment differences.
(2) The invention introduces a differential privacy mechanism, carries out noise processing on information transmitted by different edge Internet of things agent devices, and ensures privacy safety for data of each edge Internet of things agent device.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a schematic diagram of a decentralized machine learning architecture according to the present disclosure;
FIG. 2 is a functional block diagram of an edge IOT agent according to the present disclosure;
FIG. 3 is a flowchart of an asynchronous decentralized model training method suitable for an edge Internet of things proxy device according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
The invention provides an asynchronous decentralized model training method suitable for an edge Internet of things agent device, wherein a decentralized machine learning architecture is shown in figure 1: and the user terminal of each region uploads the data to a local edge Internet of things agent device, and the edge Internet of things agent device processes the data to obtain a data set with a uniform format. And then, each edge Internet of things agent device performs machine learning model training, and shares the model trained by itself to other connected edge Internet of things agent devices in each iteration. And finally, each edge Internet of things agent device aggregates all collected models to perform the next iteration.
As shown in fig. 2, the edge internet of things proxy apparatus is an operation terminal for completing cooperative training, and the functional architecture of the edge internet of things proxy apparatus used in the embodiment of the present invention mainly includes the following parts:
(1) data acquisition module
The wireless communication device provides diversified wired and wireless interfaces, and is compatible with various wired communication protocols such as LAN, USB, RS485/232 and the like and wireless communication protocols such as RFID, Bluetooth, ZigBee, LoRa, NB-IoT and the like. Different wired communication and short-distance wireless communication technologies are adopted for different acquisition terminals to access the data of the terminals to the edge Internet of things agent device nearby, and unified acquisition of the data is achieved. Aiming at a wireless data acquisition scene, a wireless communication protocol frequency spectrum isolation technology is adopted, so that the mutual interference of different protocols is avoided, and the accuracy and the real-time performance of data acquisition are ensured. The northbound open interface supports multiple interface protocols such as MQTT, HTTP, COAP, XMPP, TCP/UDP and the like, and realizes unified data interaction with a platform layer Internet of things platform, a cloud platform and a full-service data center.
(2) Edge calculation module
The edge Internet of things agent device provides light-weight data caching capacity, format conversion is carried out on collected data according to a uniform service data model, and service-level analysis is carried out on the data by using an artificial intelligence technology. The device provides micro-service capability, can rapidly deploy customized application, and meets personalized requirements under different service scenes. The calculation result can be directly returned and transmitted to the terminal and the user, and the quick response of the service request is realized.
(3) AI function module
The edge Internet of things agent device hardware adopts a proprietary AI accelerating chip, deploys a general AI SDK, supports a mainstream deep learning reasoning framework and provides reasoning calculation accelerating capability for edge services. Especially for video monitoring services and high-bandwidth and low-delay service scenes, the local video coding and decoding hardware capability and AI calculation acceleration capability can effectively relieve the calculation load of a CPU (Central processing Unit) and meet the requirements of related services.
(4) Safety control module
In the aspect of data security, the edge Internet of things agent device provides a zero trust security access authentication mechanism, and performs multi-condition identity authentication on an access terminal by comprehensively analyzing fingerprint information such as an IP address, an MAC address, a communication protocol, a digital certificate and the like of equipment, so that the occurrence of equipment counterfeiting is avoided, and the identity credibility, safety and reliability of the access equipment are ensured. In the aspect of transmission safety, the edge Internet of things agent device establishes a virtual safe private network based on an encryption tunnel and an SD-WAN technology, so that a private safe isolation network is established on an unsafe wide area network, and meanwhile, different safe private networks can be established for different services, so that service isolation is achieved. In the aspect of equipment safety, unified maintenance management of the device and the access terminal is supported, wherein the unified maintenance management comprises login management, authority management, task management, data management, state monitoring, abnormal alarm and the like, and meanwhile periodic vulnerability scanning and automatic firmware upgrading of the device can be realized.
It should be noted that the existing edge internet of things agent apparatus can complete the model training method of the present invention, and is not limited to the above edge internet of things agent apparatus.
The invention discloses an asynchronous decentralized model training method suitable for an edge Internet of things agent device. First, each device maintains a local buffer pool of received neighbor models, so that it can aggregate asynchronously without waiting for the training of other devices to finish. Second, the method adds a certain amount of noise to the local model before model sharing to protect data privacy.
As shown in fig. 3, the method specifically includes the following steps:
(1) an initialization stage: each edge Internet of things agent device collects data from a user terminal, preprocesses the data to obtain a sample data set, and initializes a local model and a local buffer pool.
The method comprises the following specific steps:
(1.1) initializing the total number of iterations K; a suitable value of K is usually found through several training runs.

(1.2) initializing the learning rate γ, which controls the convergence behavior; a suitable value is usually found through multiple training runs. A larger learning rate converges faster in the early stage but tends to oscillate near the optimal solution with a poor final convergence effect, while a smaller learning rate converges slowly and makes the training time very long.

(1.3) the number of edge Internet of things agent devices is denoted by the variable n, and the n devices are numbered [1, 2, …, n]. Each edge Internet of things agent device i initializes its model $x_i^0$ and its buffer pool variables $\tilde{x}_j^i$; these values may be set to 0.

(1.4) initializing the local data set $D_i$ of each edge Internet of things agent device and the loss function F(·); the data set is usually preprocessed data containing features and labels, and the loss function can be a cross-entropy loss function.

(1.5) initializing a weight matrix W with the doubly stochastic property; each edge Internet of things agent device uses the values of this matrix for model aggregation. $W_{ij}$, the element in the i-th row and j-th column, represents the weight assigned by edge Internet of things agent device i to the model received from edge Internet of things agent device j.

(1.6) initializing the variance σ of the noise; a larger variance means that more noise is added and the degree of privacy protection is higher, but it also makes the noise-added data deviate further from the original data, which reduces the convergence speed.
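The doubly stochastic weight matrix of step (1.5) can, for example, be built from the neighbor graph with Metropolis weights, a common construction that the patent does not prescribe (function name illustrative):

```python
def metropolis_weights(adj):
    """Build a doubly stochastic weight matrix from a symmetric 0/1
    adjacency matrix (adj[i][j] == 1 iff devices i and j are neighbors).
    Each neighbor pair gets weight 1 / (1 + max degree of the pair);
    the leftover mass goes on the diagonal, so rows and columns sum to 1."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                W[i][j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i][i] = 1.0 - sum(W[i])  # remaining mass on the diagonal
    return W
```

Because the adjacency matrix is symmetric, W is symmetric as well, which makes the row-sum and column-sum conditions hold simultaneously.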
(2) a model updating stage: each edge Internet of things agent device calculates a loss function value of the sample data set by using its AI function module, and then performs back propagation according to the loss function value to obtain a gradient; the model is then updated by using a stochastic gradient descent formula.
The method comprises the following specific steps:
(2.1) in the k-th iteration, the edge Internet of things agent device i randomly samples a data sample $(\xi_i^k, y_i^k)$ from its local data set $D_i$, where $\xi_i^k \in \mathbb{R}^d$ represents the sample features, $y_i^k$ represents the sample label, d is the dimension of the sample features, and $\mathbb{R}^d$ represents the d-dimensional real space;

(2.2) the edge Internet of things agent device i trains the local model by using its AI module, namely, the sample features $\xi_i^k$ are input into the k-th round model $x_i^k$ to obtain a predicted value $y_{pred}$;

(2.3) the sample label $y_i^k$ and the predicted value $y_{pred}$ are input into the cross-entropy loss function, namely the loss function value is obtained according to the following formula:

$$F(x_i^k; \xi_i^k, y_i^k) = -\sum_{c} (y_i^k)_c \log (y_{pred})_c$$

(2.4) the gradient $\nabla F(x_i^k; \xi_i^k, y_i^k)$ of the loss function with respect to the model $x_i^k$ is obtained by back-propagation differentiation; the parameters are then updated with the stochastic gradient descent formula to obtain the updated model $x_i^{k+\frac{1}{2}}$:

$$x_i^{k+\frac{1}{2}} = x_i^k - \gamma \nabla F(x_i^k; \xi_i^k, y_i^k)$$

where γ is the learning rate.
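Steps (2.1) to (2.4) can be sketched for a binary logistic model with cross-entropy loss (the patent leaves the concrete model architecture open; the logistic model and all names here are illustrative assumptions):

```python
import math

def sgd_step(x, feats, label, gamma):
    """One pass of steps (2.1)-(2.4) for a binary logistic model:
    forward pass, cross-entropy loss, gradient by back propagation,
    SGD update. label is 0 or 1; returns (updated model, loss value)."""
    z = sum(w * f for w, f in zip(x, feats))       # (2.2) forward pass
    y_pred = 1.0 / (1.0 + math.exp(-z))            # predicted probability
    eps = 1e-12                                    # numerical safety margin
    loss = -(label * math.log(y_pred + eps)        # (2.3) cross-entropy
             + (1 - label) * math.log(1 - y_pred + eps))
    grad = [(y_pred - label) * f for f in feats]   # (2.4) back propagation
    x_new = [w - gamma * g for w, g in zip(x, grad)]  # SGD update
    return x_new, loss
```

Repeating the step on a fixed sample should drive the loss down monotonically, which is a quick correctness check for the gradient sign.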
(3) A privacy protection stage: and each edge Internet of things agent device utilizes a differential privacy mechanism to introduce quantitative noise into the updated model for data privacy protection, so that a noise-adding model is obtained.
The method comprises the following specific steps:
(3.1) according to the initialized noise variance σ, a d-dimensional noise vector $\epsilon_i^k$ is sampled from the Gaussian distribution with variance σ in each dimension, i.e. $\epsilon_i^k \sim \mathcal{N}(0, \sigma I_d)$;

(3.2) the sampled noise vector $\epsilon_i^k$ is added to the updated model $x_i^{k+\frac{1}{2}}$ to obtain the noise-added model $\hat{x}_i^k$:

$$\hat{x}_i^k = x_i^{k+\frac{1}{2}} + \epsilon_i^k$$
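Steps (3.1) and (3.2) amount to adding independent per-coordinate Gaussian noise, for example (a minimal sketch; the function name is illustrative, and σ is passed to `random.gauss` as a standard deviation):

```python
import random

def add_dp_noise(x, sigma, rng=random):
    """Steps (3.1)-(3.2): draw a noise vector with each coordinate sampled
    from a zero-mean Gaussian and add it to the updated model vector x."""
    return [w + rng.gauss(0.0, sigma) for w in x]
```

Setting σ = 0 recovers the unperturbed model, which makes it easy to ablate the privacy mechanism during debugging.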
(4) A model sharing stage: each edge Internet of things agent device sends the noise-added model to a neighbor edge Internet of things agent device, receives the model from the neighbor edge Internet of things agent device, and updates the model to a local buffer pool when receiving the neighbor model.
The method comprises the following specific steps:
(4.1) the edge Internet of things agent device i sends its own noise-added model $\hat{x}_i^k$ to each neighbor edge Internet of things agent device j by using the uplink communication interface and network protocol;

(4.2) the edge Internet of things agent device i receives the noise-added model $\hat{x}_j^{k'}$ from a neighbor edge Internet of things agent device j by using the uplink communication interface and network protocol, saves the model into its own buffer pool, and updates the buffer pool variable: $\tilde{x}_j^i \leftarrow \hat{x}_j^{k'}$.
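The buffer pool of step (4) can be sketched as a thread-safe structure in which each arriving neighbor model overwrites that neighbor's slot, so aggregation never has to wait for all neighbors (a minimal sketch; the class and method names are illustrative assumptions):

```python
import threading

class BufferPool:
    """Step (4) buffer pool sketch: one slot per device. An incoming
    neighbor model overwrites that neighbor's slot, and the training
    thread reads a consistent snapshot whenever it aggregates."""
    def __init__(self, n, d):
        self._lock = threading.Lock()
        self._slots = [[0.0] * d for _ in range(n)]  # initialized to zero

    def on_receive(self, j, model):
        with self._lock:             # called by the communication thread
            self._slots[j] = list(model)

    def snapshot(self):
        with self._lock:             # called by the training thread
            return [list(s) for s in self._slots]
```

Returning copies from `snapshot` keeps the aggregation step isolated from models that arrive while it is running.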
(5) A model aggregation stage: each edge Internet of things agent device aggregates the models in its local buffer pool to obtain the model for the next iteration, and returns to step (2) until the maximum number of iteration rounds is reached, after which the model is output.
The method comprises the following specific steps:
the model data in the buffer pool are aggregated by using the initialized weight matrix W, namely the edge Internet of things agent device i performs model aggregation to obtain the model for the next round:

$$x_i^{k+1} = \sum_{j=1}^{n} W_{ij}\, \tilde{x}_j^i$$

where $W_{ij}$, the element in the i-th row and j-th column, represents the weight assigned by edge Internet of things agent device i to the model $\tilde{x}_j^i$ received from edge Internet of things agent device j. To ensure convergence, W must have the doubly stochastic property, i.e. the sum of every row and of every column is 1; moreover, if i and j are not neighbors, then $W_{ij} = W_{ji} = 0$.
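The aggregation formula of step (5) can be sketched as a weighted sum over the buffer slots (names illustrative):

```python
def aggregate(W_row, slots):
    """Step (5): next-round model for device i as the weighted sum of the
    buffered neighbor models, using row i of the weight matrix W."""
    d = len(slots[0])
    return [sum(w * s[t] for w, s in zip(W_row, slots)) for t in range(d)]
```

Because each row of W sums to 1, the aggregated model is a convex combination of the buffered models.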
The invention also discloses an asynchronous decentralized power equipment fault identification method suitable for the edge Internet of things agent device, which comprises the following steps:
first, data acquisition stage
1. The type of equipment and the type of fault to be detected are determined, depending on the actual scene requirements.
2. Collect data for the determined equipment by deploying the inspection robot to the area that needs to be detected. Since most scenes need to be covered, data must be collected for the same power equipment at different times and under different weather conditions, and both pictures containing fault information and normal pictures must be collected.
3. Acquire part of the data from the Internet. This step supplements the data acquisition of the previous step; a fault detection model trained on a larger data set has higher robustness.
4. Crop all data pictures into a uniform format and name the pictures uniformly.
II, data processing stage:
1. First determine the fault categories, i.e. the number of sample labels and the specific sample labels. For example, no equipment fault may be set to label 0, a transformer fault to label 1, a line fault to label 2, and so on. This step depends on the actual scene requirements.
2. Annotate the collected pictures. This process can be accomplished using labeling tools such as labelme or labelImg. Specifically, the labeling tool is used to mark the equipment to be detected in each picture, generating a VOC or COCO data set format.
3. Split the equipment fault data set into a training set, a validation set and a test set.
Thirdly, model training stage:
After sufficient data are collected, the data are uploaded to the local edge Internet of things agent device. The fault detection model is trained in the edge Internet of things agent device according to the process shown in fig. 3 to obtain the output fault detection model. For target detection, a Fast R-CNN model or a YOLO model is generally used.
Fourth, the model deployment stage:
Deploy the environments required for model operation on each inspection robot to be deployed, such as the python environment, the detection system, and the early warning system required at runtime. Then deploy the finally output fault detection model to each inspection robot, for example as a docker container or an http service.
Fifth, the fault detection stage:
The intelligent inspection robot collects data from the power equipment and performs fault detection on the collected picture data using the fault detection model.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. An asynchronous decentralized model training method suitable for an edge Internet of things agent device, characterized by comprising the following steps:
(1) Initialization stage: each edge Internet of things agent device collects data from the user terminals, preprocesses the data to obtain a sample data set, and initializes a local model and a local buffer pool;
(2) Model updating stage: each edge Internet of things agent device calculates the loss function value on its sample data, performs back propagation according to the loss function value to obtain the gradient, and updates the model with a stochastic gradient descent formula;
(3) Privacy protection stage: each edge Internet of things agent device uses a differential privacy mechanism to add quantified noise to the updated model for data privacy protection, obtaining a noised model;
(4) Model sharing stage: each edge Internet of things agent device sends its noised model to the neighbor edge Internet of things agent devices, receives models from the neighbor edge Internet of things agent devices, and updates each received neighbor model into the local buffer pool;
(5) Model aggregation stage: each edge Internet of things agent device aggregates the models in its local buffer pool to obtain the model for the next iteration, and returns to step (2) until the maximum number of iteration rounds is reached, whereupon the model is output.
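The five stages of claim 1 can be condensed into a minimal numerical sketch. For readability the two agents below run their rounds synchronously and fully connected, whereas the patented method is asynchronous; the quadratic objectives, doubly stochastic weight matrix, step size, and function names are illustrative assumptions.

```python
import numpy as np

def decentralized_round(models, grad_fns, W, gamma, sigma, rng):
    """One pass through stages (2)-(5) for every agent: local SGD step,
    Gaussian noise for privacy, model sharing, and weighted aggregation."""
    n = len(models)
    # stage (2) gradient step + stage (3) differential-privacy noise
    noisy = [models[i] - gamma * grad_fns[i](models[i])
             + rng.normal(0.0, sigma, models[i].shape)
             for i in range(n)]
    # stage (4) every agent sees its neighbors' noised models,
    # stage (5) aggregation with the weight matrix W
    return [sum(W[i][j] * noisy[j] for j in range(n)) for i in range(n)]

rng = np.random.default_rng(0)
grad_fns = [lambda x: 2 * (x - 1.0),   # agent 0 minimizes (x - 1)^2
            lambda x: 2 * (x + 1.0)]   # agent 1 minimizes (x + 1)^2
W = [[0.5, 0.5], [0.5, 0.5]]           # doubly stochastic mixing weights
models = [np.array([5.0]), np.array([-5.0])]
for _ in range(50):                    # with sigma=0 both agents reach consensus at 0
    models = decentralized_round(models, grad_fns, W, gamma=0.1, sigma=0.0, rng=rng)
```

With sigma=0 the agents converge to the minimizer of the averaged objective; turning sigma up trades accuracy for the privacy protection of stage (3).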
2. The asynchronous decentralized model training method suitable for the edge Internet of things agent device according to claim 1, wherein the parameters initialized in step (1) include: the total number of iteration rounds K, the learning rate γ, the weight matrix W, the noise variance σ, the number n of edge Internet of things agent devices, the initialization model $x_i^0$ of edge Internet of things agent device i, the initialized local buffer pool variables $\hat{x}_j^0$, the sample data set $D_i$, and the loss function, which adopts the cross-entropy loss function F(·).
3. The asynchronous decentralized model training method suitable for the edge Internet of things agent device according to claim 1, wherein step (2) is specifically:
(2.1) in the k-th iteration, edge Internet of things agent device i randomly samples a data sample $\xi_i^k = (s_i^k, y_i^k)$ from its sample data set $D_i$, where $s_i^k \in \mathbb{R}^d$ represents the sample features and $y_i^k$ represents the sample label, d being the dimension of the sample features and $\mathbb{R}^d$ the d-dimensional real space;
(2.2) edge Internet of things agent device i trains the local model with an AI module, namely the sample features $s_i^k$ of the sampled $\xi_i^k$ are input into the k-th round model $x_i^k$ to obtain the predicted value $y_{pred}$;
(2.3) the sample label $y_i^k$ and the predicted value $y_{pred}$ are input into the cross-entropy loss function, i.e. the loss function value is obtained according to the following formula:

$$F(x_i^k; \xi_i^k) = -\sum_{c} \mathbf{1}\{y_i^k = c\} \log (y_{pred})_c$$

(2.4) the gradient $\nabla F(x_i^k; \xi_i^k)$ of the loss function $F(x_i^k; \xi_i^k)$ with respect to the model $x_i^k$ is obtained by back-propagation derivation, and the parameters are then updated with the stochastic gradient descent formula to obtain the updated model $x_i^{k+1/2}$:

$$x_i^{k+1/2} = x_i^k - \gamma \nabla F(x_i^k; \xi_i^k)$$

where γ is the learning rate.
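A minimal sketch of steps (2.2)-(2.4) for a single sample; the patent's model is a detection network trained by the AI module, so the linear-softmax model, the (d × num_classes) weight matrix, and the function name here are illustrative assumptions.

```python
import numpy as np

def sgd_step(x, features, label, gamma):
    """Forward pass, cross-entropy loss, back propagation, and SGD update
    for a linear model x of shape (d, num_classes)."""
    logits = features @ x                        # model output for one sample
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                         # softmax -> y_pred
    loss = -np.log(probs[label])                 # cross-entropy F(x; xi)
    grad_logits = probs.copy()
    grad_logits[label] -= 1.0                    # dF/dlogits for cross-entropy
    grad = np.outer(features, grad_logits)       # gradient w.r.t. the weights
    return x - gamma * grad, loss                # x^{k+1/2} = x^k - gamma * grad

x = np.zeros((4, 3))                             # d = 4 features, 3 fault labels
features = np.array([1.0, 0.5, -0.5, 2.0])
x_new, loss = sgd_step(x, features, label=1, gamma=0.1)
print(round(loss, 4))  # -log(1/3) ≈ 1.0986 with all-zero initial weights
```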
4. The asynchronous decentralized model training method suitable for the edge Internet of things agent device according to claim 1, wherein step (3) is specifically:
(3.1) according to the initialized noise variance σ, a d-dimensional noise vector $n_i^k$ is sampled from a d-dimensional Gaussian distribution with variance σ, i.e. $n_i^k \sim \mathcal{N}(0, \sigma I_d)$;
(3.2) the sampled noise vector $n_i^k$ is added to the updated model $x_i^{k+1/2}$ to obtain the noised model $\tilde{x}_i^{k+1/2}$:

$$\tilde{x}_i^{k+1/2} = x_i^{k+1/2} + n_i^k$$
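Step (3) amounts to one Gaussian perturbation of the model per round. A sketch, assuming numpy and treating σ as the scale passed to the sampler; the function name is illustrative.

```python
import numpy as np

def add_dp_noise(x, sigma, rng):
    """Perturb the updated model: x_tilde = x + n, with n drawn elementwise
    from a zero-mean Gaussian whose spread is controlled by sigma."""
    noise = rng.normal(0.0, sigma, size=x.shape)
    return x + noise

rng = np.random.default_rng(42)
x = np.ones(5)                      # stands in for the updated model x^{k+1/2}
x_tilde = add_dp_noise(x, sigma=0.1, rng=rng)
```

Larger σ gives stronger privacy protection for the shared model at the cost of slower, noisier convergence.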
5. The asynchronous decentralized model training method suitable for the edge Internet of things agent device according to claim 1, wherein step (4) is specifically:
(4.1) edge Internet of things agent device i sends its own noised model $\tilde{x}_i^{k+1/2}$ to the neighbor edge Internet of things agent device j through an uplink communication interface and network protocol;
(4.2) edge Internet of things agent device i receives the noised model $\tilde{x}_j^{k+1/2}$ from the neighbor edge Internet of things agent device j through the uplink communication interface and network protocol, and saves the model $\tilde{x}_j^{k+1/2}$ into its own buffer pool, i.e. updates the buffer pool variable $\hat{x}_j = \tilde{x}_j^{k+1/2}$.
6. The asynchronous decentralized model training method suitable for the edge Internet of things agent device according to claim 1, wherein step (5) is specifically:
the model data in the buffer pool are aggregated using the initialized weight matrix W, namely edge Internet of things agent device i performs model aggregation to obtain the model data of the next round:

$$x_i^{k+1} = \sum_{j=1}^{n} W_{ij} \hat{x}_j$$

where $W_{ij}$, the element in the i-th row and j-th column, represents the weight assigned by edge agent device i to the model $\hat{x}_j$ received from edge agent device j.
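The aggregation of claim 6 is a weighted sum over the buffer pool; a sketch assuming a dict-shaped buffer and numpy, with illustrative names.

```python
import numpy as np

def aggregate(buffer, W_row):
    """Stage (5): x_i^{k+1} = sum_j W_ij * x_j_hat over the cached models."""
    return sum(W_row[j] * model for j, model in buffer.items())

# agent i's buffer pool holds its own and one neighbor's latest noised models
buffer = {0: np.array([2.0, 0.0]), 1: np.array([0.0, 2.0])}
W_row = {0: 0.5, 1: 0.5}            # row i of the weight matrix W
x_next = aggregate(buffer, W_row)   # -> [1.0, 1.0]
```

Because the buffer always holds the freshest model received from each neighbor, an agent can aggregate at any time without waiting, which is what makes the scheme asynchronous.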
CN202210651051.5A 2022-06-10 2022-06-10 Asynchronous decentralization model training method suitable for edge internet of things proxy device Active CN114936606B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210651051.5A CN114936606B (en) 2022-06-10 2022-06-10 Asynchronous decentralization model training method suitable for edge internet of things proxy device

Publications (2)

Publication Number Publication Date
CN114936606A true CN114936606A (en) 2022-08-23
CN114936606B CN114936606B (en) 2024-08-23

Family

ID=82866832

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210651051.5A Active CN114936606B (en) 2022-06-10 2022-06-10 Asynchronous decentralization model training method suitable for edge internet of things proxy device

Country Status (1)

Country Link
CN (1) CN114936606B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818394A (en) * 2021-01-29 2021-05-18 西安交通大学 Self-adaptive asynchronous federal learning method with local privacy protection
CN114116198A (en) * 2021-10-21 2022-03-01 西安电子科技大学 Asynchronous federal learning method, system, equipment and terminal for mobile vehicle
US20220114475A1 (en) * 2020-10-09 2022-04-14 Rui Zhu Methods and systems for decentralized federated learning
CN114363043A (en) * 2021-12-30 2022-04-15 华东师范大学 Asynchronous federated learning method based on verifiable aggregation and differential privacy in peer-to-peer network
CN114362940A (en) * 2021-12-29 2022-04-15 华东师范大学 Server-free asynchronous federated learning method for data privacy protection

Also Published As

Publication number Publication date
CN114936606B (en) 2024-08-23

Similar Documents

Publication Publication Date Title
CN112203282B (en) 5G Internet of things intrusion detection method and system based on federal transfer learning
Mohammed et al. Budgeted online selection of candidate IoT clients to participate in federated learning
Zainudin et al. An efficient hybrid-dnn for ddos detection and classification in software-defined iiot networks
CN107770263A (en) A kind of internet-of-things terminal safety access method and system based on edge calculations
EP4078899A1 (en) Systems and methods for enhanced feedback for cascaded federated machine learning
CN111510433A (en) Internet of things malicious flow detection method based on fog computing platform
CN107683597A (en) Network behavior data collection and analysis for abnormality detection
CN114330544A (en) Method for establishing business flow abnormity detection model and abnormity detection method
US20240106836A1 (en) Learning of malicious behavior vocabulary and threat detection
CN114710330A (en) Anomaly detection method based on heterogeneous hierarchical federated learning
CN116347492A (en) 5G slice flow abnormality detection method, device, computer equipment and storage medium
CN116319437A (en) Network connectivity detection method and device
Barsellotti et al. Introducing data processing units (DPU) at the edge
Ozer et al. Offloading deep learning powered vision tasks from UAV to 5G edge server with denoising
Abbasi et al. FLITC: A novel federated learning-based method for IoT traffic classification
Ageyev et al. Traffic Monitoring and Abnormality Detection Methods for IoT
CN112749403B (en) Edge data encryption method suitable for edge Internet of things agent device
Chen et al. Federated meta-learning framework for few-shot fault diagnosis in industrial IoT
WO2020192922A1 (en) Intermediate network node and method performed therein for handling data of communication networks
CN117729540A (en) Perception equipment cloud edge safety control method based on unified edge computing framework
CN114070775A (en) Block chain network slice safety intelligent optimization method facing 5G intelligent network connection system
CN114936606B (en) Asynchronous decentralization model training method suitable for edge internet of things proxy device
Chen et al. A 5G Enabled Adaptive Computing Workflow for Greener Power Grid
Malini et al. An efficient deep learning mechanisms for IoT/Non-IoT devices classification and attack detection in SDN-enabled smart environment
CN116155592A (en) AMI network intrusion detection method based on DCGAN federal semi-supervised learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Wang Peng

Inventor after: Zhang Liangxu

Inventor after: Chen Shuzhen

Inventor after: Yu Dongxiao

Inventor after: Zou Yifei

Inventor after: Du Chao

Inventor before: Yu Dongxiao

Inventor before: Zhang Liangxu

Inventor before: Chen Shuzhen

Inventor before: Zou Yifei

Inventor before: Wang Peng

Inventor before: Du Chao

TA01 Transfer of patent application right

Effective date of registration: 20221116

Address after: No.72 Binhai Road, Jimo District, Qingdao, Shandong 266200

Applicant after: SHANDONG University

Applicant after: SHANGHAI STEP ELECTRIC Corp.

Address before: No.72 Binhai Road, Jimo District, Qingdao, Shandong 266200

Applicant before: SHANDONG University

GR01 Patent grant