CN116614484B - Heterogeneous data federal learning method based on structure enhancement and related equipment - Google Patents

Heterogeneous data federal learning method based on structure enhancement and related equipment

Info

Publication number
CN116614484B
CN116614484B CN202310884262.8A
Authority
CN
China
Prior art keywords
loss function
data
network model
determining
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310884262.8A
Other languages
Chinese (zh)
Other versions
CN116614484A (en)
Inventor
梁美玉
张珉
李雅文
薛哲
管泽礼
潘圳辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202310884262.8A
Publication of CN116614484A
Application granted
Publication of CN116614484B
Active legal status
Anticipated expiration legal status


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/098 Distributed learning, e.g. federated learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/04 Protocols for data compression, e.g. ROHC
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Machine Translation (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a heterogeneous data federation learning method based on structure enhancement and related equipment. The method comprises the steps of: receiving a global model from a server side and initializing a local network model according to the global model; in response to determining that the initialization is completed, acquiring local data and performing regularization training on the local network model according to the local data to determine a first loss function; in response to determining that training is completed, obtaining a preset sampling coefficient and a preset sampling width, and sampling the local network model according to the preset sampling coefficient and the preset sampling width to determine a plurality of sub-network models; performing data enhancement processing on the local data to determine training data, and performing update training on the plurality of sub-network models according to the training data to determine a global loss function; and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side.

Description

Heterogeneous data federal learning method based on structure enhancement and related equipment
Technical Field
The application relates to the technical field of federal learning, in particular to a heterogeneous data federal learning method based on structural enhancement and related equipment.
Background
Federal learning enables a large number of clients to collaboratively train machine learning models without compromising data privacy. In a federal learning setting, participating clients are typically deployed in diverse environments or owned by different users or organizations. Thus, the distribution of local data may vary greatly from client to client, i.e., the data is heterogeneous. Such non-independent and identically distributed (non-IID) data among participating devices in federal learning results in reduced global model accuracy.
Disclosure of Invention
In view of the above, the present application aims to provide a heterogeneous data federal learning method and related devices based on structural enhancement.
Based on the above object, the present application provides a heterogeneous data federation learning method based on structural enhancement applied to federation learning clients, comprising:
receiving a global model of a server side, and initializing a local network model according to the global model;
obtaining local data in response to determining that the initialization is completed, and performing initial training on the local network model according to the heterogeneous local data to determine a first loss function;
in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models;
performing data enhancement processing on the local data, determining training data, performing structure enhancement training on a plurality of sub-network models according to the training data, and determining a global loss function;
and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side.
Optionally, the sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining the sub-network model includes:
determining a model width of the local network model;
sampling the local network model according to the model width of the local network model, the preset sampling coefficient and the preset sampling width, and determining the sub-network model; wherein the preset sampling coefficients are in one-to-one correspondence with the preset sampling widths.
Optionally, the sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models includes:
determining the preset sampling width corresponding to the preset sampling coefficient according to the preset sampling coefficient;
and determining a target sampling part for the local network model according to the preset sampling width, and taking the target sampling part as the sub-network model.
Optionally, the local data is image data;
the data enhancement processing is performed on the local data, and the training data is determined, including:
and rotating, flipping, scaling, cropping, adding noise to, and changing the color range of the image data to determine the training data.
Optionally, the updating training is performed on the plurality of sub-network models according to the training data, and determining the global loss function includes:
carrying out regularization training on any one of the sub-network models according to the training data, and determining second loss functions of a plurality of the sub-network models; wherein the second loss function is a divergence loss function;
and determining the global loss function according to the first loss function and the second loss functions.
Optionally, the first loss function is a cross entropy loss function;
said determining said global loss function from said first loss function and said plurality of second loss functions comprises:
summing a plurality of the second loss functions to determine a loss function sum;
and determining the global loss function according to the loss function sum and the first loss function.
Based on the same inventive concept, the embodiment of the application also provides a heterogeneous data federation learning method based on structure enhancement, which is applied to a server side of federation learning and comprises the following steps:
transmitting the global model to a client;
and receiving the updated local network model of the client, and updating the global model weight according to the local network model.
Based on the same inventive concept, the embodiment of the application also provides a heterogeneous data federation learning system based on structure enhancement, which comprises: a client and a server; the client comprising a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method of any one of the above; the server side comprising a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method of any one of the above.
Based on the same inventive concept, the embodiment of the application also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the heterogeneous data federal learning method based on the structural enhancement.
Based on the same inventive concept, the embodiment of the application further provides a non-transitory computer readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute any one of the above heterogeneous data federal learning methods based on structure enhancement.
From the above, it can be seen that the heterogeneous data federal learning method and related device based on structural enhancement provided by the application comprise: receiving a global model of a server side, and initializing a local network model according to the global model; obtaining local data in response to determining that the initialization is completed, and performing initial training on the local network model according to the local data to determine a first loss function; in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models; performing data enhancement processing on the local data, determining training data, performing structure enhancement training on the plurality of sub-network models according to the training data, and determining a global loss function; and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side. The local network model is sampled in a structured manner to obtain a plurality of sub-networks, and different sub-networks are trained on data processed with different enhancement modes so as to learn enhanced representations and determine a global loss function. Updating the weight of the local network model according to the global loss function improves the generality of the local network model and yields a client model with stronger generalization performance, which resists the client drift phenomenon caused by local data heterogeneity and thereby improves the performance of the globally aggregated model.
Drawings
In order to more clearly illustrate the technical solutions of the present application or the related art, the drawings required in the description of the embodiments or the related art are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained from them by those of ordinary skill in the art without inventive effort.
FIG. 1 is a flow chart of a heterogeneous data federation learning method based on structural enhancement according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a model structure of a heterogeneous data federation learning method based on structural enhancement according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a heterogeneous data federal learning model based on structural enhancement according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a heterogeneous data federation learning system based on structural enhancement according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present application more apparent.
It should be noted that, unless otherwise defined, technical or scientific terms used in the embodiments of the present application should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present application belongs. The terms "first," "second," and the like, as used in embodiments of the present application, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item preceding the word encompasses the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected", "coupled", and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may change when the absolute position of the described object changes.
As described in the background section, federal learning enables a large number of clients to collaboratively train machine learning models without compromising data privacy. In a federal learning setting, participating clients are typically deployed in diverse environments or owned by different users or organizations. Thus, the distribution of local data may vary greatly from client to client (i.e., the data is heterogeneous). Such non-independent and identically distributed (non-IID) data among participating devices in federal learning results in reduced global model accuracy. Federal learning mainly involves a client side and a server side: the server is responsible for coordinating the distribution and aggregation of models throughout the federal training process, and the client updates its local model based on local data.
In the prior art, in order to solve the problem of global model accuracy degradation caused by non-IID data in federal learning, one approach performs feature mapping on the local data of each client, uploads the mapping results to the server, has the server integrate the mapping results, and downloads the integrated result to each client; when a client performs a local model update, it first trains on the common mapped data set and then trains on its local data set, so as to alleviate the global accuracy loss caused by data heterogeneity. Another approach adds a constraint between the local model and the global model during local training: the difference between the local model and the global model is constructed as a penalty term and added to the loss function of local model training, so that the local model and the global model are drawn closer together. This alleviates drift of the local model and yields a global model with better aggregation performance, thereby mitigating the heterogeneity of different local data distributions and addressing the reduced global model accuracy caused by non-IID data among participating devices.
However, the method that alleviates data heterogeneity by sharing data mappings between nodes still inherently exposes local data features and carries a risk of privacy disclosure; forcing the local model to converge directly toward the global model restrains local model drift to a certain extent, but it objectively limits the learning ability of the local model, so the method lacks good generality.
In view of this, the embodiment of the present application provides a heterogeneous data federation learning method, system, device and storage medium based on structure enhancement, which are applied to a client, and the method includes: receiving a global model of a server side, and initializing a local network model according to the global model; acquiring local data in response to determining that the initialization is completed, and performing regularization training on the local network model according to the local data to determine a first loss function; in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models; performing data enhancement processing on the local data, determining training data, updating and training the plurality of sub-network models according to the training data, and determining a global loss function; and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side. The local network model is sampled in a structured manner to obtain a plurality of sub-networks, and different sub-networks are trained on data processed with different enhancement modes so as to learn enhanced representations and determine a global loss function. Updating the weight of the local network model according to the global loss function improves the generality of the local network model and yields a client model with stronger generalization performance, which resists the client drift phenomenon caused by local data heterogeneity and thereby improves the performance of the globally aggregated model.
As shown in fig. 1, the heterogeneous data federation learning method based on structure enhancement applied to a federal learning client includes:
step S102, receiving a global model of a server side, and initializing a local network model according to the global model;
step S104, obtaining local data in response to the completion of the initialization, and carrying out initial training on the local network model according to the local data to determine a first loss function;
step S106, in response to determining that the structure enhancement training of the local network model is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models;
step S108, carrying out data enhancement processing on the local data, determining training data, carrying out structure enhancement training on a plurality of sub-network models according to the training data, and determining a global loss function;
and step S110, updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side.
Referring to fig. 2, the present application includes a client and a server, specifically, the server is responsible for coordinating the distribution and aggregation of models in the whole federal training process, and the client updates the local model based on local data. Namely, it mainly comprises: client model updating and server side aggregation.
In step S102, in the initial stage, the client first receives the global model issued by the server, that is, the initial global model, and initializes the local network model according to the initial global model, so that the local network model can be trained according to the structure of the global model, thereby obtaining, locally at the client, a local model with a better generalization effect that fits the global model more closely.
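As an illustrative sketch only (not part of the original disclosure), the initialization in step S102 can be expressed as copying the received global model's weights into the client's local network model; the PyTorch usage and the names below are assumptions:

```python
import copy

import torch

def initialize_local_model(global_model: torch.nn.Module) -> torch.nn.Module:
    # Step S102: the client adopts the received global model as its
    # initial local network model (same architecture, same weights).
    local_model = copy.deepcopy(global_model)
    local_model.load_state_dict(global_model.state_dict())
    return local_model
```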
Further, after the initialization is completed, local data, that is, local image data, is obtained according to step S104, and the local model is trained according to the local data. The local image data may be image data currently received by the client or image data stored in the memory by the client in advance. Since the data is different on each client, the training results in different local models.
In some optional embodiments, when the local network model is trained on the local image data, the pixel values of each image in a batch of image data are first obtained, and the image data are converted according to these pixel values into an array or tensor corresponding to the pixel values.
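For illustration only, a minimal sketch of this pixel-to-tensor conversion, assuming NumPy/PyTorch and RGB images (neither is mandated by the patent):

```python
import numpy as np
import torch
from PIL import Image

def image_to_tensor(path: str) -> torch.Tensor:
    # Read the pixel values of one image and convert them into a
    # normalized float tensor of shape (C, H, W).
    pixels = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)
    return torch.from_numpy(pixels).permute(2, 0, 1) / 255.0
```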
In some optional embodiments, the sampling the local network model according to the preset sampling coefficient and the preset sampling width, determining a plurality of sub-network models includes: determining the preset sampling width corresponding to the preset sampling coefficient according to the preset sampling coefficient; and determining a target sampling part for the local network model according to the preset sampling width, and taking the target sampling part as the sub-network model.
In some optional embodiments, the client model is subjected to structurally regularized initial training by a sub-network gradient enhancement method to obtain a client model with stronger generality and to resist local client drift caused by heterogeneous local data distributions. Specifically, in response to determining that training is complete, the local network model is sampled to determine a number of sub-networks. In the present application, as shown in fig. 2, taking client k (k=1, 2, 3, …, n) as an example, client k first trains the local network model on the initial image data, that is, image data that has not been subjected to any processing, thereby completing the training of the local network model and determining its cross entropy loss function, that is, the first loss function. Further, the local model is subsequently sampled as shown in fig. 2 to obtain a plurality of sub-networks, and a divergence loss function, i.e. a second loss function, is determined for the plurality of sub-networks. A global loss function is then determined based on the first loss function and the second loss functions. Finally, the local network model is updated according to the global loss function. The second loss function is a KL divergence loss function.
In some alternative embodiments, the local network model is regularized according to the local data to determine a first loss function.
In some alternative embodiments, in response to determining that training is completed, a preset sampling coefficient and a preset sampling width are obtained, the local network model is sampled according to the preset sampling coefficient and the preset sampling width, and a plurality of sub-network models are determined. This specifically includes: determining a model width of the local network model; and sampling the local network model according to the model width of the local network model, the preset sampling coefficient and the preset sampling width, and determining the sub-network model. The preset sampling coefficients and the preset sampling widths are in one-to-one correspondence; the preset sampling coefficients may be a sequence of sampling indices, and the preset sampling widths correspond one-to-one to the sequence numbers of the preset sampling coefficients. For example, the preset sampling coefficients may be 1, 2, 3, 4, 5, 6, i.e., six samplings, each with a corresponding preset sampling width: the preset sampling width corresponding to the 1st sampling may be 80%, that of the 2nd sampling may be 78%, that of the 3rd sampling may be 74%, and so on. Of course, these sampling widths are only examples; in actual operation the sampling width may be preset according to the actual situation, and may be a percentage or a specific value calculated from the model width of the local network model.
In some alternative embodiments, where the local network model is sampled several times, several sub-networks may be determined.
In some alternative embodiments, the sampling is performed according to the network width of the neural network used in federal learning, resulting in a number of sub-networks. For example, if a layer of the network has 100 neurons, the network width is 100; with the sampling width set to 80%, the sub-network obtained by sampling comprises 80 neurons. It should be noted that, in federal learning, each client has a local model: federal learning includes a server and a plurality of clients, the server corresponds to the global model, and each client has its own local model. In the initial stage, the server side initializes the global model and distributes it to each client; the client takes the global model as its initialized local model and trains it on local data, and since the data differ across clients, training yields different local models.
Further, the global model is also called the full network, in contrast to a sub-network, which refers to a partial model sampled by width; according to the example above, such a sub-network can be understood as an 80% model.
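A minimal sketch of width-based sub-network sampling for a single fully connected layer, matching the 100-neuron/80% example above; the slicing strategy (keeping the first neurons) and all names are assumptions, since the patent does not prescribe a concrete implementation:

```python
import torch
import torch.nn as nn

def sample_subnetwork(layer: nn.Linear, width_ratio: float) -> nn.Linear:
    # Keep the first `width_ratio` fraction of the layer's output neurons,
    # e.g. 80 of 100 neurons when width_ratio = 0.8.
    keep = max(1, int(layer.out_features * width_ratio))
    sub = nn.Linear(layer.in_features, keep, bias=layer.bias is not None)
    with torch.no_grad():
        sub.weight.copy_(layer.weight[:keep, :])
        if layer.bias is not None:
            sub.bias.copy_(layer.bias[:keep])
    return sub

full_layer = nn.Linear(32, 100)                 # network width 100
sub_layer = sample_subnetwork(full_layer, 0.8)  # sub-network width 80
```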
In step S108, the method includes performing data enhancement processing on the local data, determining training data, updating and training a plurality of sub-network models according to the training data, and determining a global loss function.
The data enhancement processing performed on the local data to determine training data includes, for example, subjecting a batch of image data (pictures) to operations such as rotation, flipping, scaling, cropping, noise addition, and color range change. Changing the color range may mean converting the image to grayscale, so that the resulting image contains only black, white, and intermediate gray levels, with the parts differing only in gray level. For example, an image comprising parts A, B, C and D may have a gray scale of 75% for part A, 25% for part B, 0% for part C, and 50% for part D. Of course, changing the color range may also mean increasing the color saturation, increasing the brightness of the picture, and the like.
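For illustration, one possible realization of such an augmentation pipeline with torchvision transforms (the specific transform set and parameters are assumptions, not requirements of the patent):

```python
import torch
from torchvision import transforms

# Covers rotation, flipping, scaling/cropping, color-range change
# (grayscale / color jitter) and additive noise, applied to PIL images.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=30),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(size=32, scale=(0.8, 1.0)),
    transforms.ColorJitter(brightness=0.2, saturation=0.2),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x + 0.01 * torch.randn_like(x)),  # additive Gaussian noise
])
```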
In some optional embodiments, training a plurality of the sub-networks according to the local data after the data enhancement processing and determining a global loss function includes: carrying out regularization training on any one of the sub-network models according to the training data, and determining second loss functions corresponding to the plurality of sub-network models. The second loss function is a divergence loss function, which calculates a distance, i.e. a measure of the difference, between the output results of two different sub-networks; reducing this loss is equivalent to reducing the difference between the two networks. The divergence loss function is a KL divergence function.
In some alternative embodiments, determining the global loss function from the first loss function and the plurality of second loss functions includes: summing the plurality of second loss functions to determine a loss function sum; and determining the global loss function according to the loss function sum and the first loss function; wherein the first loss function is a cross entropy loss function.
In some alternative embodiments, the plurality of second loss functions are summed to determine a loss function sum, and the global loss function is determined according to the loss function sum and the first loss function. The global loss function can be written as

L = L_{CE}(F_\theta(x), y) + \mu \sum_{i=1}^{n} L_{KD}\big(F_{\theta_{w_i}}(A_i(x)),\, F_\theta(x)\big)

wherein L_{CE} is the cross entropy loss of the local network model, L_{CE}(F_\theta(x), y) is the cross entropy loss of the local network model trained on the input local training sample, x is the input sample, F_\theta(x) is the output of the local network model for input sample x, y is the label of the input sample, L_{KD} is the divergence loss of a sub-network, \mu is the balance parameter between the cross entropy loss L_{CE} and the summed divergence losses L_{KD}, \theta_{w_i} denotes the sub-network obtained with sampling width w_i, A_i denotes a specific data enhancement processing mode, and i indexes the sub-networks, i = 1, 2, 3, …, k, …, n.
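A direct transcription of this loss into code, given the full-network logits and the logits of each sub-network on its own augmented input; treating F_\theta(x) as a fixed target in the divergence term and using a batch-mean KL reduction are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

def structure_enhanced_loss(full_logits, sub_logits_list, labels, mu=1.0):
    # L = L_CE(F_theta(x), y) + mu * sum_i L_KD(F_theta_wi(A_i(x)), F_theta(x))
    loss = F.cross_entropy(full_logits, labels)
    target = F.softmax(full_logits.detach(), dim=-1)  # full-network output as target
    for sub_logits in sub_logits_list:
        loss = loss + mu * F.kl_div(
            F.log_softmax(sub_logits, dim=-1), target, reduction="batchmean"
        )
    return loss
```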
In some alternative embodiments, the local network model is sampled according to a preset sampling coefficient and a preset sampling width, and a plurality of sub-network models are determined; data enhancement processing is performed on the local data to determine training data, the plurality of sub-network models are updated and trained according to the training data, and a global loss function is determined; the weight of the local network model is then updated according to the global loss function, and the updated local network model is uploaded to the server side. For example, suppose the local network model is sampled with a sampling width of 80%, three sub-networks A, B and C are obtained after sampling, and the number of iterations is determined to be 3 according to the number of sub-networks. The specific steps are as follows (an illustrative code sketch of this procedure is given after the note following step 5):
Step 1: flip a picture; sample the local network model to obtain the three sub-networks A, B and C, i.e. the 3x80% sub-networks; input the flipped picture into the 3x80% sub-networks and train them with it; when training is completed, the trained 3x80% sub-networks are obtained;
Step 2: sample the 3x80% sub-networks to obtain the 3x80%x80% sub-networks; input the flipped picture into the 3x80%x80% sub-networks and train them with it; when training is completed, the trained 3x80%x80% sub-networks are obtained;
Step 3: sample the 3x80%x80% sub-networks to obtain the 3x80%x80%x80% sub-networks; input the flipped picture into the 3x80%x80%x80% sub-networks and train them with it; when training is completed, the trained 3x80%x80%x80% sub-networks are obtained;
Step 4: upon determining that the number of iterations exceeds the number of sub-networks, output the sub-networks after the last iteration of training; determine the loss function between each iteration of training and the previous iteration, recording them as a first loss function, a second loss function and a third loss function; and determine the total loss function of the iterative training according to the first, second and third loss functions;
Step 5: update the local network model of the client according to the total loss function, and send the updated local model to the server so that the server updates the global model according to the updated local model.
In the above steps 1 to 3, "x" is a multiplier; for example, in step 1, "the 3x80% sub-networks" means "3 times 80% sub-networks", i.e. three sub-networks, each of 80% width.
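A sketch of this iterative local training procedure, assuming a model whose forward pass accepts a width argument so that the sampled sub-networks share weights with the full network; the helper names, the width-aware forward signature, and the optimizer handling are all assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def local_update(model, optimizer, data_loader, augment,
                 widths=(0.8, 0.8 * 0.8, 0.8 * 0.8 * 0.8), mu=1.0):
    # Each batch: train the full-width network with cross entropy (first loss)
    # and each width-sampled sub-network on augmented data with a KL divergence
    # toward the full network's output (second losses); the sum is the total loss.
    model.train()
    for x, y in data_loader:
        full_logits = model(x, width=1.0)
        loss = F.cross_entropy(full_logits, y)
        for width in widths:                           # e.g. 80%, 80%x80%, 80%x80%x80%
            sub_logits = model(augment(x), width=width)
            loss = loss + mu * F.kl_div(
                F.log_softmax(sub_logits, dim=-1),
                F.softmax(full_logits.detach(), dim=-1),
                reduction="batchmean",
            )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```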
From the above, it can be seen that the heterogeneous data federal learning method based on structure enhancement applied to a client provided by the present application includes: receiving a global model of a server side, and initializing a local network model according to the global model; obtaining local data in response to determining that the initialization is completed, and performing initial training on the local network model according to the local data to determine a first loss function; in response to determining that training is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models; performing data enhancement processing on the local data, determining training data, performing structure enhancement training on the plurality of sub-network models according to the training data, and determining a global loss function; and updating the weight of the local network model according to the global loss function, and uploading the local network model updated by the global loss function to the server side. The local network model is sampled in a structured manner to obtain a plurality of sub-networks, and different sub-networks are trained on data processed with different enhancement modes so as to learn enhanced representations and determine a global loss function. Updating the weight of the local network model according to the global loss function improves the generality of the local network model and yields a client model with stronger generalization performance, which resists the client drift phenomenon caused by local data heterogeneity and thereby improves the performance of the globally aggregated model.
In some optional embodiments, the present application further provides a heterogeneous data federal learning method based on structure enhancement applied to a server, including: transmitting the global model to a client; and receiving the updated local network model of the client, and updating the weight of the initial global model in the server according to the local network model.
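As an illustrative sketch of the server-side update (the patent does not specify an aggregation rule; a FedAvg-style weighted average of the received client models is assumed here):

```python
import torch

def aggregate(global_model, client_state_dicts, client_sample_counts):
    # Weighted average of the uploaded client model parameters,
    # weighted by each client's local sample count (FedAvg-style).
    total = float(sum(client_sample_counts))
    new_state = {}
    for key in client_state_dicts[0]:
        new_state[key] = torch.stack(
            [sd[key].float() * (n / total)
             for sd, n in zip(client_state_dicts, client_sample_counts)]
        ).sum(dim=0)
    global_model.load_state_dict(new_state)
    return global_model
```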
It should be noted that the method of the embodiments of the present application may be performed by a single device, for example, a computer or a server. The method of the embodiments may also be applied to a distributed scenario and be completed by a plurality of devices cooperating with one another. In such a distributed scenario, one of the devices may perform only one or more steps of the method of an embodiment of the present application, and the devices interact with each other to accomplish the method.
It should be noted that the foregoing describes some embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments described above and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Based on the same inventive concept, the application also provides a heterogeneous data federal learning device based on structure enhancement, which is applied to a client, corresponding to the method of any embodiment.
Referring to fig. 3, the heterogeneous data federation learning device based on structure enhancement is applied to a client of federation learning, and includes:
the initialization module 302 is configured to receive a global model of a server side, and initialize a local network model according to the global model;
a first determining module 304 configured to obtain local data in response to determining that the initialization is completed, and perform regularization training on the local network model according to the local data, to determine a first loss function;
the sampling module 306 is configured to obtain a preset sampling coefficient and a preset sampling width in response to determining that the structure enhancement training of the local network model is completed, sample the local network model according to the preset sampling coefficient and the preset sampling width, and determine a plurality of sub-network models;
a second determining module 308, configured to perform data enhancement processing on the local data, determine training data, and perform structure enhancement training on a plurality of sub-network models according to the training data, to determine a global loss function;
and the updating module 310 is configured to update the weight of the local network model according to the global loss function, and upload the updated local network model to the server side.
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, the functions of each module may be implemented in the same piece or pieces of software and/or hardware when implementing the present application.
The device of the foregoing embodiment is configured to implement the heterogeneous data federal learning method based on structural enhancement in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein.
Based on the same inventive concept, the present application also provides a heterogeneous data federal learning system based on structural enhancement, corresponding to the method of any embodiment, as shown in fig. 4, which is characterized by comprising: a client and a server; the client comprises a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method according to any of the embodiments above; the server side comprises a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method according to any of the embodiments described above.
Based on the same inventive concept, the application also provides an electronic device corresponding to the method of any embodiment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the heterogeneous data federal learning method based on structure enhancement, which is applied to the client and the server and is described in any embodiment, when executing the program.
Fig. 5 shows a more specific hardware architecture of an electronic device according to this embodiment, where the device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 implement communication connections therebetween within the device via a bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), microprocessor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc. for executing relevant programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage device, dynamic storage device, or the like. Memory 1020 may store an operating system and other application programs, and when the embodiments of the present specification are implemented in software or firmware, the associated program code is stored in memory 1020 and executed by processor 1010.
The input/output interface 1030 is used to connect with an input/output module for inputting and outputting information. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
Communication interface 1040 is used to connect communication modules (not shown) to enable communication interactions of the present device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, Wi-Fi, Bluetooth, etc.).
Bus 1050 includes a path for transferring information between components of the device (e.g., processor 1010, memory 1020, input/output interface 1030, and communication interface 1040).
It should be noted that although the above-described device only shows processor 1010, memory 1020, input/output interface 1030, communication interface 1040, and bus 1050, in an implementation, the device may include other components necessary to achieve proper operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
The electronic device of the foregoing embodiment is configured to implement the heterogeneous data federal learning method based on structure enhancement applied to the client and the server in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein.
Based on the same inventive concept, the present application also provides a non-transitory computer readable storage medium corresponding to the method of any embodiment, wherein the non-transitory computer readable storage medium stores computer instructions for causing the computer to execute the heterogeneous data federal learning method based on the structure enhancement according to any embodiment.
The computer readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may be used to implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device.
The storage medium of the foregoing embodiments stores computer instructions for causing the computer to perform the heterogeneous data federal learning method based on structural enhancement as in any one of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein.
Based on the same inventive concept, the present disclosure also provides a computer program product, corresponding to the heterogeneous data federal learning method based on structure enhancement applied to the client side and the server side described in any of the above embodiments, which includes computer program instructions. In some embodiments, the computer program instructions may be executed by one or more processors of a computer to cause the computer and/or the processor to perform the heterogeneous data federal learning method based on structure enhancement. Corresponding to the execution subject of each step in each embodiment of the method, the processor executing a given step may belong to the corresponding execution subject.
The computer program product of the foregoing embodiment is configured to enable the computer and/or the processor to perform the heterogeneous data federal learning method based on structure enhancement applied to the client and the server according to any one of the foregoing embodiments, and has the beneficial effects of corresponding method embodiments, which are not described herein.
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the application (including the claims) is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the application, the steps may be implemented in any order, and there are many other variations of the different aspects of the embodiments of the application as described above, which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the embodiments of the present application. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the embodiments of the present application, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the embodiments of the present application are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the application, it should be apparent to one skilled in the art that embodiments of the application can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the application has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The present embodiments are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omissions, modifications, equivalent substitutions, improvements, and the like, which are within the spirit and principles of the embodiments of the application, are intended to be included within the scope of the application.

Claims (9)

1. The heterogeneous data federation learning method based on structure enhancement is characterized by being applied to a federation learning client and comprising the following steps:
receiving a global model of a server side, and initializing a local network model according to the global model;
obtaining local data in response to determining that the initialization is completed, and performing initial training on the local network model according to the local data to determine a first loss function;
in response to determining that the structure enhancement training of the local network model is completed, acquiring a preset sampling coefficient and a preset sampling width, sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models;
performing data enhancement processing on the local data, determining training data, performing structure enhancement training on a plurality of sub-network models according to the training data, and determining a global loss function;
and updating the weight of the local network model according to the global loss function, and uploading the updated local network model to the server side.
2. The method of claim 1, wherein the sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a sub-network model, comprises:
determining a model width of the local network model;
sampling the local network model according to the model width of the local network model, the preset sampling coefficient and the preset sampling width, and determining the sub-network model; wherein the preset sampling coefficients are in one-to-one correspondence with the preset sampling widths.
3. The method of claim 1, wherein the sampling the local network model according to the preset sampling coefficient and the preset sampling width, and determining a plurality of sub-network models, comprises:
determining the preset sampling width corresponding to the preset sampling coefficient according to the preset sampling coefficient;
and determining a target sampling part for the local network model according to the preset sampling width, and taking the target sampling part as the sub-network model.
4. The method of claim 1, wherein the local data is image data;
the data enhancement processing is performed on the local data, and the training data is determined, including:
and rotating, flipping, scaling, cropping, adding noise to, and changing the color range of the image data to determine the training data.
5. The method of claim 1, wherein said updating training of said plurality of sub-network models based on said training data to determine a global loss function comprises:
carrying out regularization training on any one of the sub-network models according to the training data, and determining second loss functions of a plurality of the sub-network models; wherein the second loss function is a divergence loss function;
and determining the global loss function according to the first loss function and the second loss functions.
6. The method of claim 5, wherein the first loss function is a cross entropy loss function;
said determining said global loss function from said first loss function and said plurality of second loss functions comprises:
summing a plurality of the second loss functions to determine a loss function sum;
and determining the global loss function according to the loss function sum and the first loss function.
7. A heterogeneous data federal learning system based on structure enhancement, comprising: a client and a server; the client comprising a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method of any one of claims 1 to 6; the server side comprising a memory, a processor and a computer program stored on the memory and executable by the processor for performing the method of any one of claims 1 to 6.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 6 when executing the program.
9. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 6.
CN202310884262.8A 2023-07-19 2023-07-19 Heterogeneous data federal learning method based on structure enhancement and related equipment Active CN116614484B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310884262.8A CN116614484B (en) 2023-07-19 2023-07-19 Heterogeneous data federal learning method based on structure enhancement and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310884262.8A CN116614484B (en) 2023-07-19 2023-07-19 Heterogeneous data federal learning method based on structure enhancement and related equipment

Publications (2)

Publication Number Publication Date
CN116614484A CN116614484A (en) 2023-08-18
CN116614484B true CN116614484B (en) 2023-11-10

Family

ID=87683888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310884262.8A Active CN116614484B (en) 2023-07-19 2023-07-19 Heterogeneous data federal learning method based on structure enhancement and related equipment

Country Status (1)

Country Link
CN (1) CN116614484B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792890A (en) * 2021-09-29 2021-12-14 国网浙江省电力有限公司信息通信分公司 Model training method based on federal learning and related equipment
CN114386570A (en) * 2021-12-21 2022-04-22 中山大学 Heterogeneous federated learning training method based on multi-branch neural network model
CN116306987A (en) * 2023-02-02 2023-06-23 北京邮电大学 Multitask learning method based on federal learning and related equipment
CN116451593A (en) * 2023-06-14 2023-07-18 北京邮电大学 Reinforced federal learning dynamic sampling method and equipment based on data quality evaluation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230136378A1 (en) * 2021-11-03 2023-05-04 Korea Advanced Institute Of Science And Technology System, method, and computer-readable storage medium for federated learning of local model based on learning direction of global model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792890A (en) * 2021-09-29 2021-12-14 国网浙江省电力有限公司信息通信分公司 Model training method based on federal learning and related equipment
CN114386570A (en) * 2021-12-21 2022-04-22 中山大学 Heterogeneous federated learning training method based on multi-branch neural network model
CN116306987A (en) * 2023-02-02 2023-06-23 北京邮电大学 Multitask learning method based on federal learning and related equipment
CN116451593A (en) * 2023-06-14 2023-07-18 北京邮电大学 Reinforced federal learning dynamic sampling method and equipment based on data quality evaluation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Knowledge-Aided Federated Learning for Energy-Limited Wireless Networks; Zhixiong Chen; Wenqiang Yi; Yuanwei Liu; Arumugam Nallanathan; IEEE Transactions on Communications; Vol. 71, No. 6; full text *
Research on Federated Learning Optimization Algorithms for Non-Independent and Identically Distributed Data; 燕忠毅; China Excellent Master's Theses Electronic Journal; full text *

Also Published As

Publication number Publication date
CN116614484A (en) 2023-08-18

Similar Documents

Publication Publication Date Title
US20200160493A1 (en) Image filtering based on image gradients
US11755901B2 (en) Dynamic quantization of neural networks
US10621764B2 (en) Colorizing vector graphic objects
US11533458B2 (en) Image processing device including neural network processor and operating method thereof
WO2020074989A1 (en) Data representation for dynamic precision in neural network cores
CN109657081B (en) Distributed processing method, system and medium for hyperspectral satellite remote sensing data
CN110390075B (en) Matrix preprocessing method, device, terminal and readable storage medium
US11398015B2 (en) Iterative image inpainting with confidence feedback
CN113327318B (en) Image display method, image display device, electronic equipment and computer readable medium
Christophe et al. Open source remote sensing: Increasing the usability of cutting-edge algorithms
CN113077384B (en) Data spatial resolution improving method, device, medium and terminal equipment
CN110782391A (en) Image processing method and device in driving simulation scene and storage medium
CN116614484B (en) Heterogeneous data federal learning method based on structure enhancement and related equipment
CN114511100B (en) Graph model task implementation method and system supporting multi-engine framework
CN113706606B (en) Method and device for determining position coordinates of spaced hand gestures
CN111932466B (en) Image defogging method, electronic equipment and storage medium
Diaz et al. Estimating photometric properties from image collections
US10049425B2 (en) Merging filters for a graphic processing unit
US10740937B1 (en) Freeform gradient style transfer
CN115688042A (en) Model fusion method, device, equipment and storage medium
CN113570659A (en) Shooting device pose estimation method and device, computer equipment and storage medium
CN111191602A (en) Pedestrian similarity obtaining method and device, terminal equipment and readable storage medium
Techawatcharapaikul et al. Improved Weighted Least Square Radiometric Calibration Based Noise and Outlier Rejection by Adjacent Comparagraph and Brightness Transfer Function
CN111324860B (en) Lightweight CNN calculation method and device based on random matrix approximation
US20220067487A1 (en) Electronic device for generating data and improving task performance by using only very small amount of data without prior knowledge of associative domain and operating method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant