CN116614484B - Heterogeneous data federated learning method based on structure enhancement and related equipment - Google Patents
Heterogeneous data federated learning method based on structure enhancement and related equipment
- Publication number
- CN116614484B CN116614484B CN202310884262.8A CN202310884262A CN116614484B CN 116614484 B CN116614484 B CN 116614484B CN 202310884262 A CN202310884262 A CN 202310884262A CN 116614484 B CN116614484 B CN 116614484B
- Authority
- CN
- China
- Prior art keywords
- loss function
- data
- network model
- determining
- local
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/04—Protocols for data compression, e.g. ROHC
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application provides a heterogeneous data federated learning method based on structure enhancement and related equipment, comprising: receiving a global model from the server side, and initializing a local network model according to the global model; in response to determining that the initialization is completed, acquiring local data and performing regularization training on the local network model according to the local data to determine a first loss function; in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, and sampling the local network model according to the preset sampling coefficient and the preset sampling width to determine a plurality of sub-network models; performing data enhancement processing on the local data to determine training data, and performing structure enhancement training on the plurality of sub-network models according to the training data to determine a global loss function; and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side.
Description
Technical Field
The application relates to the technical field of federated learning, and in particular to a heterogeneous data federated learning method based on structure enhancement and related equipment.
Background
Federated learning enables a large number of clients to collaboratively train machine learning models without compromising data privacy. In a federated learning setting, participating clients are typically deployed in varied environments or owned by different users or organizations. As a result, the distribution of local data can vary greatly from client to client, i.e., the data is heterogeneous. Such non-IID (not independent and identically distributed) data among participating devices reduces the accuracy of the global model.
Disclosure of Invention
In view of the above, the present application aims to provide a heterogeneous data federated learning method based on structure enhancement and related equipment.
Based on the above object, the present application provides a heterogeneous data federated learning method based on structure enhancement, applied to a federated learning client, comprising:
receiving a global model of a server side, and initializing a local network model according to the global model;
acquiring local data in response to determining that the initialization is completed, and performing initial training on the local network model according to the local data to determine a first loss function;
in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models;
performing data enhancement processing on the local data, determining training data, performing structure enhancement training on a plurality of sub-network models according to the training data, and determining a global loss function;
and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side.
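The client-side steps above can be sketched as follows. All function and variable names are illustrative stand-ins (the patent does not specify concrete APIs), and the toy losses merely make the control flow runnable:

```python
def augment(data):
    # stand-in data enhancement: horizontally flip each sample
    return [list(reversed(x)) for x in data]

def toy_loss(weights, data):
    # stand-in for a trained model's loss on `data`; here it depends
    # only on the weights, just to keep the sketch runnable
    return sum(abs(w) for w in weights) / max(len(weights), 1)

def client_update(global_weights, local_data, sample_widths, mu=0.5):
    # step 1: initialize the local model from the received global model
    local_weights = list(global_weights)
    # step 2: initial training on local data -> first (cross-entropy) loss
    first_loss = toy_loss(local_weights, local_data)
    # step 3: sample sub-networks by the preset widths
    subnets = [local_weights[:max(1, int(len(local_weights) * w))]
               for w in sample_widths]
    # step 4: enhance the data, train each sub-network -> second (divergence) losses
    second_losses = [toy_loss(s, augment(local_data)) for s in subnets]
    # step 5: global loss = first loss plus the weighted sum of sub-network losses
    return first_loss + mu * sum(second_losses)
```

With four toy weights and widths 0.5 and 0.75, `client_update([1.0, -2.0, 3.0, -4.0], [[1, 2]], [0.5, 0.75])` returns 4.25; in a real client the toy losses would be replaced by actual training passes.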
Optionally, the sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining the sub-network model includes:
determining a model width of the local network model;
sampling the local network model according to the model width of the local network model, the preset sampling coefficient and the preset sampling width, and determining the sub-network model; wherein the preset sampling coefficients are in one-to-one correspondence with the preset sampling widths.
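A minimal sketch of width-based sampling for one dense layer, assuming (as the later 100-neuron example in the description does) that a sub-network keeps the first fraction of a layer's neurons; shapes and names are illustrative:

```python
import numpy as np

def sample_subnetwork(weight, width_fraction):
    # keep the first `width_fraction` of the layer's output units (neurons);
    # `weight` has shape (out_units, in_units)
    keep = max(1, int(round(weight.shape[0] * width_fraction)))
    return weight[:keep, :]

layer = np.zeros((100, 32))           # a 100-neuron layer
sub = sample_subnetwork(layer, 0.8)   # an "80% model" of that layer
```

Here `sub.shape` is `(80, 32)`, matching the 100-neuron, 80%-width example given later in the description.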
Optionally, the sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models includes:
determining the preset sampling width corresponding to the preset sampling coefficient according to the preset sampling coefficient;
and determining a target sampling part for the local network model according to the preset sampling width, and taking the target sampling part as the sub-network model.
Optionally, the local data is image data;
the data enhancement processing is performed on the local data, and the training data is determined, including:
and rotating, flipping, scaling, cropping, adding noise to, and changing the color range of the image data to determine the training data.
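The listed operations can each be sketched as simple array transforms; this is an illustrative NumPy sketch, not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_image(img, mode):
    # one enhancement mode per call; the mode names are illustrative
    if mode == "flip":
        return img[:, ::-1]                           # horizontal flip
    if mode == "rotate":
        return np.rot90(img)                          # 90-degree rotation
    if mode == "crop":
        h, w = img.shape
        return img[h // 8: h - h // 8, w // 8: w - w // 8]  # central crop
    if mode == "noise":
        return img + rng.normal(0.0, 0.05, img.shape)       # additive noise
    raise ValueError(f"unknown mode: {mode}")
```

For example, `augment_image(img, "crop")` turns a 32x32 image into a 24x24 one.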
Optionally, the updating training is performed on the plurality of sub-network models according to the training data, and determining the global loss function includes:
performing regularization training on each of the sub-network models according to the training data, and determining a second loss function for each of the plurality of sub-network models; wherein the second loss function is a divergence loss function;
and determining the global loss function according to the first loss function and the second loss functions.
Optionally, the first loss function is a cross entropy loss function;
said determining said global loss function from said first loss function and said plurality of second loss functions comprises:
summing a plurality of the second loss functions to determine a loss function sum;
and determining the global loss function according to the loss function sum and the first loss function.
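A divergence (second) loss of this kind can be sketched as the KL divergence between the softened outputs of two networks; the symbols below are illustrative:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl_loss(sub_logits, full_logits):
    # KL(p || q): the difference measure between a sub-network's output p
    # and the full network's output q; it is 0 when the two outputs agree
    p, q = softmax(sub_logits), softmax(full_logits)
    return float(np.sum(p * np.log(p / q)))
```

`kl_loss` is zero for identical logits and grows as the outputs diverge, so minimizing the summed second losses pulls each sub-network toward the full model's predictions.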
Based on the same inventive concept, an embodiment of the application also provides a heterogeneous data federated learning method based on structure enhancement, applied to a federated learning server side, comprising:
transmitting the global model to a client;
and receiving the updated local network model from the client, and updating the weight of the global model according to the local network model.
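The patent only states that the server updates the global weights from the uploaded local models; a FedAvg-style data-size-weighted average is the standard choice and is assumed in this sketch:

```python
import numpy as np

def aggregate(client_weights, client_sizes):
    # weighted average of client model weights, proportional to local data size
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
```

For two clients with weight vectors `[1, 2]` and `[3, 4]` holding 1 and 3 samples respectively, the aggregate is `[2.5, 3.5]`.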
Based on the same inventive concept, an embodiment of the application also provides a heterogeneous data federated learning system based on structure enhancement, comprising: a client and a server side; the client comprises a memory, a processor and a computer program stored on the memory and executable by the processor for performing any of the client-side methods described above; the server side comprises a memory, a processor and a computer program stored on the memory and executable by the processor for performing any of the server-side methods described above.
Based on the same inventive concept, an embodiment of the application also provides an electronic device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the heterogeneous data federated learning method based on structure enhancement described above.
Based on the same inventive concept, an embodiment of the application further provides a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute any of the heterogeneous data federated learning methods based on structure enhancement described above.
From the above, it can be seen that the heterogeneous data federated learning method and related equipment based on structure enhancement provided by the application comprise: receiving a global model from the server side, and initializing a local network model according to the global model; acquiring local data in response to determining that the initialization is completed, and performing initial training on the local network model according to the local data to determine a first loss function; in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models; performing data enhancement processing on the local data to determine training data, performing structure enhancement training on the plurality of sub-network models according to the training data, and determining a global loss function; and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side. The local network model is sampled in a structured manner to obtain a plurality of sub-networks, and different sub-networks are trained on data processed with different enhancement modes to learn enhanced representations, so that a global loss function is determined. Updating the weight of the local network model according to the global loss function improves the generality of the local network model and yields a client model with stronger generalization performance, which resists the client drift phenomenon caused by local data heterogeneity and thereby improves the performance of the aggregated global model.
Drawings
In order to more clearly illustrate the technical solutions of the present application or related art, the drawings that are required to be used in the description of the embodiments or related art will be briefly described below, and it is apparent that the drawings in the following description are only embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort to those of ordinary skill in the art.
FIG. 1 is a flow chart of a heterogeneous data federated learning method based on structure enhancement according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a model structure of a heterogeneous data federated learning method based on structure enhancement according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a heterogeneous data federated learning model based on structure enhancement according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a heterogeneous data federated learning system based on structure enhancement according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present application more apparent.
It should be noted that unless otherwise defined, technical or scientific terms used in the embodiments of the present application should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present application belongs. The terms "first," "second," and the like, as used in embodiments of the present application, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
As described in the background section, federated learning enables a large number of clients to collaboratively train machine learning models without compromising data privacy. In a federated learning setting, participating clients are typically deployed in varied environments or owned by different users or organizations. As a result, the distribution of local data can vary greatly from client to client (i.e., data heterogeneity). Such non-IID data among participating devices reduces the accuracy of the global model. Federated learning mainly involves a client side and a server side: the server is responsible for coordinating the distribution and aggregation of models throughout the federated training process, while the client updates its local model based on local data.
In the prior art, two approaches are generally used to address the degradation of global model accuracy caused by non-IID data in federated learning. In the first, each client performs feature mapping on its local data and uploads the mapping results to the server; the server integrates the results and sends the integrated result back to each client. When performing a local model update, the client first trains on the shared mapped data set and then on its local data set, thereby mitigating the loss of global accuracy caused by data heterogeneity. In the second, a constraint between the local model and the global model is added to the local training process: the difference between the two models is constructed as a penalty term and added to the loss function of local model training, so that the local model stays closer to the global model. This alleviates drift of the local model and yields a global model with better aggregation performance, mitigating the effect of heterogeneous local data distributions and the accuracy loss caused by non-IID data among participating devices.
However, mitigating data heterogeneity by sharing data mappings between nodes still inherently exposes local data features and carries a risk of privacy disclosure. Forcing the local model to converge directly toward the global model restrains local model drift to some extent, but it objectively limits the learning ability of the local model, so the approach lacks good generality.
In view of this, embodiments of the present application provide a heterogeneous data federated learning method, system, device and storage medium based on structure enhancement, applied to a client. The method comprises: receiving a global model from the server side, and initializing a local network model according to the global model; acquiring local data in response to determining that the initialization is completed, and performing regularization training on the local network model according to the local data to determine a first loss function; in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models; performing data enhancement processing on the local data to determine training data, updating and training the plurality of sub-network models according to the training data, and determining a global loss function; and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side. The local network model is sampled in a structured manner to obtain a plurality of sub-networks, and different sub-networks are trained on data processed with different enhancement modes to learn enhanced representations, so that a global loss function is determined. Updating the weight of the local network model according to the global loss function improves the generality of the local network model and yields a client model with stronger generalization performance, which resists the client drift phenomenon caused by local data heterogeneity and thereby improves the performance of the aggregated global model.
As shown in fig. 1, the heterogeneous data federated learning method based on structure enhancement applied to a federated learning client includes:
step S102, receiving a global model of a server side, and initializing a local network model according to the global model;
step S104, obtaining local data in response to the completion of the initialization, and carrying out initial training on the local network model according to the local data to determine a first loss function;
step S106, in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models;
step S108, carrying out data enhancement processing on the local data, determining training data, carrying out structure enhancement training on a plurality of sub-network models according to the training data, and determining a global loss function;
and step S110, updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side.
Referring to fig. 2, the present application includes a client and a server. Specifically, the server is responsible for coordinating the distribution and aggregation of models throughout the federated training process, and the client updates its local model based on local data. That is, the method mainly comprises client model updating and server-side aggregation.
In step S102, in the initial stage, the client first receives the global model issued by the server, that is, the initial global model, and initializes the local network model according to it, so that the local network model can be trained following the structure of the global model, thereby obtaining, locally at the client, a local model with better generalization that more closely fits the global model.
Further, after the initialization is completed, local data, that is, local image data, is obtained according to step S104, and the local model is trained according to the local data. The local image data may be image data currently received by the client or image data stored in the memory by the client in advance. Since the data is different on each client, the training results in different local models.
In some optional embodiments, when the local network model is trained according to the local image data, the pixel values of each image in a batch of image data are first obtained, and the image data is converted according to those pixel values into a corresponding array or tensor.
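The conversion described above amounts to reading the pixel values into a numeric array and scaling them to a training-friendly range; the `[0, 1]` normalization here is a common convention and an assumption, not specified by the patent:

```python
import numpy as np

# a toy 2x2 grayscale image with 8-bit pixel values
pixels = [[0, 128], [255, 64]]

# convert the pixel values into a float tensor, scaled to [0, 1]
x = np.asarray(pixels, dtype=np.float32) / 255.0
```

`x` now has shape `(2, 2)` with values in `[0, 1]` and can be fed to the local network model as a training input.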
In some optional embodiments, the sampling the local network model according to the preset sampling coefficient and the preset sampling width, determining a plurality of sub-network models includes: determining the preset sampling width corresponding to the preset sampling coefficient according to the preset sampling coefficient; and determining a target sampling part for the local network model according to the preset sampling width, and taking the target sampling part as the sub-network model.
In some optional embodiments, the client model is subjected to structurally regularized initial training by a sub-network gradient enhancement method to obtain a client model with stronger generality, resisting the local client drift caused by heterogeneous local data distributions. Specifically, in response to determining that training is complete, the local network model is sampled to determine a number of sub-networks. In the present application, as shown in fig. 2, taking client k (k = 1, 2, 3, …, n) as an example, client k first trains its local network model on the initial image data, that is, image data not subjected to any processing, thereby completing the training of the local network model and determining its cross-entropy loss function, i.e., the first loss function. The local model is then sampled as shown in fig. 2 to obtain a plurality of sub-networks, and the divergence loss function of each sub-network, i.e., the second loss function, is determined. A global loss function is then determined based on the first loss function and the second loss functions. Finally, the local network model is updated according to the global loss function. The second loss function is a KL divergence loss function.
In some alternative embodiments, the local network model is regularized according to the local data to determine a first loss function.
In some alternative embodiments, in response to determining that training is completed, a preset sampling coefficient and a preset sampling width are obtained, the local network model is sampled according to the preset sampling coefficient and the preset sampling width, and a plurality of sub-network models are determined. This specifically comprises: determining the model width of the local network model; and sampling the local network model according to the model width, the preset sampling coefficient and the preset sampling width to determine the sub-network models. The preset sampling coefficients and the preset sampling widths are in one-to-one correspondence; the preset sampling coefficients may form a sequence of sampling rounds, with each sequence number corresponding to a preset sampling width. For example, if the preset sampling coefficients are 1, 2, 3, 4, 5, 6, each of the six sampling rounds has a corresponding preset sampling width: the width for the 1st sampling may be 80%, the width for the 2nd sampling may be 78%, the width for the 3rd sampling may be 74%, and so on. Of course, the above sampling widths are only examples; in actual operation the sampling width may be preset according to the actual situation, and may be a percentage or a specific value calculated from the model width of the local network model.
In some alternative embodiments, where the local network model is sampled several times, several sub-networks may be determined.
In some alternative embodiments, the sampling is performed according to the network width of the neural network used in federated learning, resulting in a number of sub-networks. For example, if a layer has 100 neurons, the network width is 100; with the sampling width set to 80%, the sampled sub-network comprises 80 neurons. It should be noted that in federated learning each client has its own model: federated learning includes one server and a plurality of clients, where the server maintains the global model and each client has its own model, also called the local model or client model. In the initial stage, the server initializes the global model and distributes it to each client; the client takes the global model as its initialized local model and trains it on local data. Since the data differs across clients, training yields different local models.
Further, the global model is also called the full network, in contrast to a sub-network, where a sub-network refers to a partial model sampled by width; according to the example above, such a sub-network can be understood as an 80% model.
In step S108, the method includes performing data enhancement processing on the local data, determining training data, updating and training a plurality of sub-network models according to the training data, and determining a global loss function.
The data enhancement processing performed on the local data to determine training data includes, for example, subjecting a batch of image data (pictures) to operations such as rotation, flipping, scaling, cropping, noise addition, and color range change. Changing the color range may mean converting the image to grayscale, so that the image contains only shades between black and white and its parts differ only in gray level. For example, an image may comprise parts A, B, C and D, where the gray level of part A is 75%, that of part B is 25%, that of part C is 0%, and that of part D is 50%. Of course, changing the color range may also mean increasing the color saturation, increasing the brightness of the picture, and so on.
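Gray processing of the kind described can be sketched with the standard luminance weights; the specific weights are a common convention, not taken from the patent:

```python
import numpy as np

def to_grayscale(rgb):
    # collapse the color range: weighted sum of the R, G, B channels
    return rgb @ np.array([0.299, 0.587, 0.114])

# a pure-white 2x2 RGB image maps to gray level 1.0 everywhere
gray = to_grayscale(np.ones((2, 2, 3)))
```

Each output pixel is a single gray level between 0 (black) and 1 (white), matching the description of parts differing only in gray level.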
In some optional embodiments, training a plurality of the sub-networks on the local data after data enhancement processing and determining a global loss function includes: performing regularization training on each of the sub-network models according to the training data, and determining the second loss function corresponding to each of the plurality of sub-network models. The second loss function is a divergence loss function used to compute a distance, i.e., a measure of the difference between the outputs of two different sub-networks; reducing this loss is equivalent to reducing the difference between the two networks. Here the divergence loss function is the KL divergence.
In some alternative embodiments, determining the global loss function from the first loss function and the number of second loss functions includes: summing a plurality of the second loss functions to determine a loss function sum; determining the global loss function according to the loss function sum and the first loss function; wherein the first loss function is a cross entropy loss function.
In some alternative embodiments, a number of the second loss functions are summed to determine a loss function sum; and determining a global loss function in the global loss function according to the loss function sum and the first loss function, wherein the global loss function is determined by the following formula:
L = L_CE(F_θ(x), y) + μ · Σ_{i=1}^{n} L_KD(F_{θ_{p_i}}(A_i(x)), F_θ(x))

wherein L_CE is the cross-entropy loss of the local network model; L_CE(F_θ(x), y) is the cross-entropy loss of the local network model on a local training sample; x is an input sample; F_θ(x) is the output of the local network model for the input sample; y is the label of the input sample; L_KD is the divergence loss of a sub-network; μ is the balance parameter between the cross-entropy loss L_CE and the summed divergence losses L_KD; p_i is the sampling width of the i-th sub-network; A_i is a specific data enhancement processing mode; and i indexes the sub-networks, i = 1, 2, 3, …, k, …, n.
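The formula above can be sketched in plain Python as follows. The softmax outputs and the value of μ are illustrative assumptions, and a real implementation would operate on framework tensors with automatic differentiation rather than Python lists.

```python
import math

def cross_entropy(probs, label):
    """First loss L_CE: negative log-probability of the true label."""
    return -math.log(probs[label] + 1e-12)

def kl_divergence(p, q, eps=1e-12):
    """Second loss L_KD between a sub-network output p and the full model output q."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def global_loss(full_output, label, sub_outputs, mu):
    """L = L_CE(F_theta(x), y) + mu * sum_i L_KD(sub_i, full)."""
    kd_sum = sum(kl_divergence(s, full_output) for s in sub_outputs)
    return cross_entropy(full_output, label) + mu * kd_sum

full = [0.8, 0.1, 0.1]          # output of the full local model
subs = [[0.7, 0.2, 0.1],        # outputs of three sampled sub-networks
        [0.8, 0.1, 0.1],
        [0.6, 0.2, 0.2]]
loss = global_loss(full, label=0, sub_outputs=subs, mu=0.5)
```

Note that when every sub-network output matches the full model's output, the divergence sum vanishes and the global loss reduces to the cross-entropy term alone.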
In some alternative embodiments, sampling the local network model according to a preset sampling coefficient and a preset sampling width to determine a plurality of sub-network models; performing data enhancement processing on the local data to determine training data, training the plurality of sub-network models according to the training data, and determining a global loss function; and updating the weight of the local network model according to the global loss function and uploading the updated local network model to the server side, may proceed as follows. For example, the local network model is sampled at a sampling width of 80% to obtain three sub-networks A, B and C, and the number of iterations is determined to be 3 according to the number of sub-networks. The specific steps are as follows:
step 1, flipping a picture, sampling the three sub-networks A, B and C to obtain 3 sub-networks of 80% width (3x80% sub-networks), inputting the flipped picture into the 3x80% sub-networks, and training the 3x80% sub-networks with the flipped picture to obtain trained 3x80% sub-networks;
step 2, sampling the 3x80% sub-networks again to obtain 3x80%x80% sub-networks, inputting the flipped picture into the 3x80%x80% sub-networks, and training the 3x80%x80% sub-networks with the flipped picture to obtain trained 3x80%x80% sub-networks;
step 3, sampling the 3x80%x80% sub-networks to obtain 3x80%x80%x80% sub-networks, inputting the flipped picture into the 3x80%x80%x80% sub-networks, and training the 3x80%x80%x80% sub-networks with the flipped picture to obtain trained 3x80%x80%x80% sub-networks;
step 4, determining that the number of iterations has reached the number of sub-networks, outputting the sub-networks after the last iteration of training; determining the loss function between the sub-networks of each iteration and those of the previous iteration, recording them as a first loss function, a second loss function and a third loss function; and determining the total loss function of the iterative training according to the first, second and third loss functions;
step 5, updating the local network model of the client according to the total loss function, and sending the updated local model to the server side so that the server side updates the global model according to the updated local model.
In the above steps 1 to 3, "x" denotes multiplication; for example, in step 1, "obtaining 3x80% sub-networks" means obtaining 3 sub-networks, each sampled to 80% of the original model width.
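The iterative width sampling in steps 1 to 3 can be sketched as follows. The patent does not specify which units of a layer are retained; keeping the leading fraction of each layer (as in slimmable-network-style structured sampling) is an assumption made purely for illustration, as is the list-of-lists model representation.

```python
def sample_width(model, width=0.8):
    """Structured sampling: keep the leading `width` fraction of units in
    every layer, truncating toward zero but keeping at least one unit."""
    return [layer[: max(1, int(len(layer) * width))] for layer in model]

# A toy "model" with two layers of 10 and 20 units (unit indices as weights):
model = [list(range(10)), list(range(20))]

step1 = sample_width(model)   # widths 8 and 16 (80%)
step2 = sample_width(step1)   # widths 6 and 12 (80% of 80%, truncated)
step3 = sample_width(step2)   # widths 4 and 9
```

Applying the same 80% sampling three times, once per iteration, yields the 80%, 80%x80%, and 80%x80%x80% sub-networks of the example above.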
From the above, it can be seen that the heterogeneous data federal learning method based on structure enhancement applied to a client provided by the present application includes: receiving a global model of a server side, and initializing a local network model according to the global model; obtaining local data in response to determining that the initialization is completed, and performing initial training on the local network model according to the local data to determine a first loss function; in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models; performing data enhancement processing on the local data, determining training data, performing structure enhancement training on a plurality of sub-network models according to the training data, and determining a global loss function; and updating the weight of the local network model according to the global loss function, and uploading the updated local network model to the server side. By sampling the local network model in a structured manner to obtain a plurality of sub-networks, and training different sub-networks on data processed with different enhancement modes so that they learn enhanced representations, a global loss function is determined. Updating the weight of the local network model according to the global loss function improves the generality of the local network model and yields a client model with stronger generalization performance, which resists the client drift phenomenon caused by local data heterogeneity and thereby improves the performance of the globally aggregated model.
In some optional embodiments, the present application further provides a heterogeneous data federal learning method based on structure enhancement applied to a server, including: transmitting the global model to a client; and receiving the updated local network model of the client, and updating the weight of the initial global model in the server according to the local network model.
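A minimal sketch of the server-side update just described. The patent only states that the server updates the global model according to the received local models, so the data-size-weighted (FedAvg-style) averaging below is an assumed, conventional aggregation choice, and the flat-vector weight representation is illustrative.

```python
def aggregate(client_weights, client_sizes):
    """Weighted average of client weight vectors (FedAvg-style): each
    client's contribution is proportional to its local data size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
            for j in range(dim)]

# Two clients with equal data sizes and 2-parameter models:
global_update = aggregate([[1.0, 2.0], [3.0, 4.0]], client_sizes=[1, 1])
# -> [2.0, 3.0]
```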
It should be noted that the method of the embodiments of the present application may be performed by a single device, for example, a computer or a server. The method of the embodiments may also be applied to a distributed scenario and completed by a plurality of devices cooperating with each other. In such a distributed scenario, one of the devices may perform only one or more steps of the method of an embodiment of the present application, and the devices interact with each other to complete the method.
It should be noted that the foregoing describes some embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments described above and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Based on the same inventive concept, the application also provides a heterogeneous data federal learning device based on structure enhancement, which is applied to a client, corresponding to the method of any embodiment.
Referring to fig. 3, the heterogeneous data federation learning device based on structure enhancement is applied to a client of federation learning, and includes:
the initialization module 302 is configured to receive a global model of a server side, and initialize a local network model according to the global model;
a first determining module 304 configured to obtain local data in response to determining that the initialization is completed, and perform initial training on the local network model according to the local data, to determine a first loss function;
the sampling module 306 is configured to obtain a preset sampling coefficient and a preset sampling width in response to determining that the structure enhancement training of the local network model is completed, sample the local network model according to the preset sampling coefficient and the preset sampling width, and determine a plurality of sub-network models;
a second determining module 308, configured to perform data enhancement processing on the local data, determine training data, and perform structure enhancement training on a plurality of sub-network models according to the training data, to determine a global loss function;
and the updating module 310 is configured to update the weight of the local network model according to the global loss function, and upload the updated local network model to the server side.
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, the functions of each module may be implemented in the same piece or pieces of software and/or hardware when implementing the present application.
The device of the foregoing embodiment is configured to implement the heterogeneous data federal learning method based on structural enhancement in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein.
Based on the same inventive concept, the present application also provides a heterogeneous data federal learning system based on structural enhancement, corresponding to the method of any embodiment, as shown in fig. 4, which is characterized by comprising: a client and a server; the client comprises a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method according to any of the embodiments above; the server side comprises a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method according to any of the embodiments described above.
Based on the same inventive concept, the application also provides an electronic device corresponding to the method of any embodiment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the heterogeneous data federal learning method based on structure enhancement, which is applied to the client and the server and is described in any embodiment, when executing the program.
Fig. 5 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit ), microprocessor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc. for executing relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory ), static storage device, dynamic storage device, or the like. Memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in memory 1020 and executed by processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The electronic device of the foregoing embodiment is configured to implement the heterogeneous data federal learning method based on structure enhancement applied to the client and the server in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein.
Based on the same inventive concept, the present application also provides a non-transitory computer readable storage medium corresponding to the method of any embodiment, wherein the non-transitory computer readable storage medium stores computer instructions for causing the computer to execute the heterogeneous data federal learning method based on the structure enhancement according to any embodiment.
The computer readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may be used to implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static Random Access Memory (SRAM), dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), read Only Memory (ROM), electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device.
The storage medium of the foregoing embodiments stores computer instructions for causing the computer to perform the heterogeneous data federal learning method based on structural enhancement as in any one of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein.
Based on the same inventive concept, the present disclosure also provides a computer program product, corresponding to the heterogeneous data federal learning method based on structure enhancement applied to the client side and the server side described in any of the above embodiments, which includes computer program instructions. In some embodiments, the computer program instructions may be executed by one or more processors of a computer to cause the computer and/or the processor to perform the heterogeneous data federal learning method. Corresponding to the execution subject of each step in each embodiment of the method, the processor executing the corresponding step may belong to the corresponding execution subject.
The computer program product of the foregoing embodiment is configured to enable the computer and/or the processor to perform the heterogeneous data federal learning method based on structure enhancement applied to the client and the server according to any one of the foregoing embodiments, and has the beneficial effects of corresponding method embodiments, which are not described herein.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the application (including the claims) is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the application, the steps may be implemented in any order, and there are many other variations of the different aspects of the embodiments of the application as described above, which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the embodiments of the present application. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the embodiments of the present application, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the embodiments of the present application are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the application, it should be apparent to one skilled in the art that embodiments of the application can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the application has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The present embodiments are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omissions, modifications, equivalent substitutions, improvements, and the like, which are within the spirit and principles of the embodiments of the application, are intended to be included within the scope of the application.
Claims (9)
1. The heterogeneous data federation learning method based on structure enhancement is characterized by being applied to a federation learning client and comprising the following steps:
receiving a global model of a server side, and initializing a local network model according to the global model;
obtaining local data in response to determining that the initialization is completed, and performing initial training on the local network model according to the local data to determine a first loss function;
in response to determining that the structure enhancement training of the local network model is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models;
performing data enhancement processing on the local data, determining training data, performing structure enhancement training on a plurality of sub-network models according to the training data, and determining a global loss function;
and updating the weight of the local network model according to the global loss function, and uploading the updated local network model to the server side.
2. The method of claim 1, wherein the sampling the local network model according to the preset sampling coefficient and the preset sampling width, determining a sub-network model, comprises:
determining a model width of the local network model;
sampling the local network model according to the model width of the local network model, the preset sampling coefficient and the preset sampling width, and determining the sub-network model; wherein the preset sampling coefficients are in one-to-one correspondence with the preset sampling widths.
3. The method of claim 1, wherein the sampling the local network model according to the preset sampling coefficient and the preset sampling width, determining a plurality of sub-network models, comprises:
determining the preset sampling width corresponding to the preset sampling coefficient according to the preset sampling coefficient;
and determining a target sampling part for the local network model according to the preset sampling width, and taking the target sampling part as the sub-network model.
4. The method of claim 1, wherein the local data is image data;
the data enhancement processing is performed on the local data, and the training data is determined, including:
and rotating, flipping, scaling, cropping, adding noise to and changing the color range of the image data to determine the training data.
5. The method of claim 1, wherein said training of a plurality of said sub-network models based on said training data to determine a global loss function comprises:
carrying out regularization training on any one of the sub-network models according to the training data, and determining second loss functions of a plurality of the sub-network models; wherein the second loss function is a divergence loss function;
and determining the global loss function according to the first loss function and the second loss functions.
6. The method of claim 5, wherein the first loss function is a cross entropy loss function;
said determining said global loss function from said first loss function and said plurality of second loss functions comprises:
summing a plurality of the second loss functions to determine a loss function sum;
and determining the global loss function according to the loss function sum and the first loss function.
7. A heterogeneous data federal learning system based on structure enhancement, comprising: a client and a server; the client comprising a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method of any one of claims 1 to 6; the server side comprising a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method of any of claims 1 to 6.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 6 when executing the program.
9. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310884262.8A CN116614484B (en) | 2023-07-19 | 2023-07-19 | Heterogeneous data federal learning method based on structure enhancement and related equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116614484A CN116614484A (en) | 2023-08-18 |
CN116614484B true CN116614484B (en) | 2023-11-10 |
Family
ID=87683888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310884262.8A Active CN116614484B (en) | 2023-07-19 | 2023-07-19 | Heterogeneous data federal learning method based on structure enhancement and related equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116614484B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113792890A (en) * | 2021-09-29 | 2021-12-14 | 国网浙江省电力有限公司信息通信分公司 | Model training method based on federal learning and related equipment |
CN114386570A (en) * | 2021-12-21 | 2022-04-22 | 中山大学 | Heterogeneous federated learning training method based on multi-branch neural network model |
CN116306987A (en) * | 2023-02-02 | 2023-06-23 | 北京邮电大学 | Multitask learning method based on federal learning and related equipment |
CN116451593A (en) * | 2023-06-14 | 2023-07-18 | 北京邮电大学 | Reinforced federal learning dynamic sampling method and equipment based on data quality evaluation |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230136378A1 (en) * | 2021-11-03 | 2023-05-04 | Korea Advanced Institute Of Science And Technology | System, method, and computer-readable storage medium for federated learning of local model based on learning direction of global model |
Non-Patent Citations (2)
Title |
---|
Knowledge-Aided Federated Learning for Energy-Limited Wireless Networks; Zhixiong Chen, Wenqiang Yi, Yuanwei Liu, Arumugam Nallanathan; IEEE Transactions on Communications, vol. 71, no. 6 *
Research on Federated Learning Optimization Algorithms for Non-IID Data (面向非独立同分布数据的联邦学习优化算法研究); Yan Zhongyi; China Master's Theses Electronic Journal *
Also Published As
Publication number | Publication date |
---|---|
CN116614484A (en) | 2023-08-18 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |