CN113988260B - Data processing method, device, equipment and system - Google Patents

Data processing method, device, equipment and system

Info

Publication number
CN113988260B
CN113988260B (application CN202111254170.9A)
Authority
CN
China
Prior art keywords
network
client
model
layer
target
Prior art date
Legal status
Active
Application number
CN202111254170.9A
Other languages
Chinese (zh)
Other versions
CN113988260A
Inventor
吴昌建
张迪
陈鹏
张玉全
薛军印
黄球
连欢欢
田清波
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202111254170.9A
Publication of CN113988260A
Application granted
Publication of CN113988260B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a data processing method, device, equipment and system. The method comprises the following steps: acquiring a first target federated model matched with the local domain data of a first client, wherein the first target federated model comprises a network backbone layer and a first network head layer; sending the network backbone layer to each second client so that each second client generates a second target federated model comprising the network backbone layer and a random network head layer, and adjusts the model parameters of the random network head layer based on its own local domain data to obtain a second network head layer; receiving the second network head layer returned by each second client, and generating a final federated model based on the network backbone layer, the first network head layer and the second network head layers; the final federated model is used for processing data to be processed. With this technical scheme, the final federated model performs well on both the local domain data and the other domain data.

Description

Data processing method, device, equipment and system
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a data processing method, apparatus, device, and system.
Background
Artificial intelligence is developing rapidly in many fields: face recognition, image recognition, target detection, semantic segmentation, instance segmentation, autonomous driving and more can all be handled with artificial intelligence.
Machine learning is one way to realize artificial intelligence. It is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. Machine learning studies how computers simulate or implement human learning behaviors to acquire new knowledge or skills and to reorganize existing knowledge structures so as to improve their performance. It focuses on algorithm design, so that a computer can automatically learn rules from data and use those rules to predict unknown data.
In order to realize artificial intelligence by machine learning, a large amount of training data needs to be acquired, a high-performance machine learning model is trained by the large amount of training data, and then the artificial intelligence is realized by the machine learning model.
However, each data holder has limited data and cannot train a high-performance machine learning model with its local data alone. Due to data privacy requirements, a data holder cannot send its local data to other data holders, so a high-performance machine learning model cannot be trained jointly on the data of multiple data holders.
Disclosure of Invention
The application provides a data processing method. A federated learning system comprises a first client and at least one second client; the method is applied to the first client and comprises the following steps:
acquiring a first target federated model matched with the local domain data of the first client, wherein the first target federated model comprises a network backbone layer and a first network head layer;
sending the network backbone layer to each second client, so that each second client generates a second target federated model comprising the network backbone layer and a random network head layer; the structure of the random network head layer is the same as that of the first network head layer, but its model parameters differ from those of the first network head layer, and each second client adjusts the model parameters of the random network head layer based on its own local domain data to obtain a second network head layer;
receiving the second network head layer returned by each second client, and generating a final federated model based on the network backbone layer, the first network head layer and the second network head layers returned by the second clients;
the final federated model is used for processing data to be processed.
The application provides a data processing device. A federated learning system comprises a first client and at least one second client; the device is applied to the first client and comprises:
an obtaining module, configured to obtain a first target federated model matched with the local domain data of the first client, wherein the first target federated model comprises a network backbone layer and a first network head layer;
a sending module, configured to send the network backbone layer to each second client, so that each second client generates a second target federated model comprising the network backbone layer and a random network head layer; the structure of the random network head layer is the same as that of the first network head layer, but its model parameters differ from those of the first network head layer, and each second client adjusts the model parameters of the random network head layer based on its own local domain data to obtain a second network head layer;
a receiving module, configured to receive the second network head layer returned by each second client;
a generating module, configured to generate a final federated model based on the network backbone layer, the first network head layer and the second network head layers returned by the second clients;
the final federated model is used for processing data to be processed.
The application provides a client device. A federated learning system comprises a first client and at least one second client, and the client device is the first client. The client device comprises a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor executes the machine-executable instructions to implement the above data processing method.
The application provides a federated learning system comprising a first client and at least one second client. The first client is configured to acquire a first target federated model matched with its local domain data, wherein the first target federated model comprises a network backbone layer and a first network head layer, and to send the network backbone layer to each second client.
Each second client is configured to generate a second target federated model comprising the network backbone layer and a random network head layer, where the structure of the random network head layer is the same as that of the first network head layer but its model parameters differ; to adjust the model parameters of the random network head layer based on its own local domain data to obtain a second network head layer; and to send the second network head layer to the first client.
The first client is further configured to receive the second network head layer returned by each second client and to generate a final federated model based on the network backbone layer, the first network head layer and the second network head layers; the final federated model is used for processing data to be processed.
As can be seen from the above technical solutions, in the embodiments of the application, for each client (i.e., data holding end), a target federated model matched with the local domain data of the client is obtained. The target federated model is obtained by weighted fusion in which the client's own model is given a larger weight and the models of the other clients are given smaller weights, so the local domain data is fully used; this improves the generalization ability and feature extraction ability of the target federated model on the local domain data, and therefore its performance on the local domain data. For each client, a final federated model is then generated based on the target federated model of the client and the network head layers of the other clients; that is, the network structures of the other clients are added on top of the target federated model. Since those network structures were trained on the other clients' data, the other domain data is fully used, which improves the generalization ability and performance of the final federated model on the other domain data.
In summary, after the final federated model is obtained, its performance on the local domain data is high and its performance on the other domain data is also high; that is, a high-performance final federated model is trained that performs well on the data of all clients. On the premise of preserving data privacy and data security, a high-performance final federated model can be obtained, artificial intelligence can be realized based on it, and the data of all clients can be used jointly, more efficiently and more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them.
FIG. 1 is a schematic flow chart diagram of a data processing method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the structure of a federated learning system in one embodiment of the present application;
FIG. 3 is a schematic flow chart diagram of a data processing method in one embodiment of the present application;
FIG. 4 is a schematic flow chart diagram of a data processing method according to an embodiment of the present application;
FIG. 5 is a schematic representation of the structure of the final federated model in an embodiment of the present application;
FIG. 6 is a schematic flow chart diagram of a data processing method according to an embodiment of the present application;
FIG. 7 is a block diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 8 is a hardware configuration diagram of a client device in an embodiment of the present application.
Detailed Description
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
The embodiments of the present application provide a data processing method that may be applied to a federated learning system. The federated learning system may comprise a server and at least two clients (also called data holding ends, e.g., cameras). For convenience, one of the clients is called the first client and the remaining clients are called second clients. For example, suppose the federated learning system comprises clients a, b and c. When client a executes the data processing method, client a is the first client and clients b and c are second clients; when client b executes it, client b is the first client and clients a and c are second clients; when client c executes it, client c is the first client and clients a and b are second clients. In summary, the federated learning system may comprise a first client and at least one second client, and the data processing method may be applied to the first client. As shown in fig. 1, the method may comprise the following steps:
step 101, obtaining a first target federated model matched with local domain data of the first client, wherein the first target federated model may include a network backbone layer and a first network header layer.
In one possible embodiment, the first client may first obtain an original model (for example, from the server) and train it on the first client's local domain data to obtain a trained initial federated model. The initial federated model can then be sent to the server, and the server generates, based on the initial federated model of the first client and the initial federated models of the second clients, a first target federated model matched with the local domain data of the first client. The first client may then obtain the first target federated model from the server.
For example, the server may generate the first target federated model by weighted fusion: based on a weighting coefficient for the first client and a weighting coefficient for each second client, the server fuses the initial federated model of the first client with the initial federated models of the second clients to obtain a first target federated model matched with the local domain data of the first client. For example, the weighting coefficient of the first client may be greater than the weighting coefficient of each second client.
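As an illustration (not part of the patent text), the weighted fusion described above can be sketched as a per-parameter weighted average of the clients' model weights, with the client's own coefficient chosen larger than the others'. The function and coefficient names below are hypothetical, and plain Python lists stand in for real parameter tensors:

```python
def weighted_fusion(own_params, other_params_list, own_coef, other_coef):
    """Fuse one client's model parameters with the other clients' parameters.

    own_params / each element of other_params_list: dict mapping parameter
    name -> list of floats (a stand-in for real tensors).
    own_coef is chosen larger than other_coef so the fused (target) model
    stays biased toward the client's own local domain data.
    """
    coefs = [own_coef] + [other_coef] * len(other_params_list)
    total = sum(coefs)
    coefs = [c / total for c in coefs]  # normalize so coefficients sum to 1
    models = [own_params] + other_params_list
    fused = {}
    for name in own_params:
        fused[name] = [
            sum(c * m[name][i] for c, m in zip(coefs, models))
            for i in range(len(own_params[name]))
        ]
    return fused

# Example: one first client (a) and two second clients (b, c).
a = {"w": [1.0, 1.0]}
b = {"w": [0.0, 2.0]}
c = {"w": [0.0, 2.0]}
fused = weighted_fusion(a, [b, c], own_coef=0.5, other_coef=0.25)
# fused["w"] is pulled toward client a's parameters: [0.5, 1.5]
```

Each client would run this fusion with itself as `own_params`, so every client ends up with a target model biased toward its own domain.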
In one possible embodiment, the first target federated model may comprise K network layers, which can be divided into a network backbone layer and a first network head layer. For example, the K network layers can be divided based on a configured division parameter M: the first M of the K network layers form the network backbone layer, and the remaining layers form the first network head layer. Alternatively, the division can be based on a configured parameter N: the last N of the K network layers form the first network head layer, and the remaining layers form the network backbone layer. Of course, these are only examples; any division of the K network layers into a network backbone layer and a first network head layer is acceptable.
In the above embodiments, K is a positive integer greater than 1, and M and N are positive integers. For example, M may be less than K, N may be less than K, and M may be less than, equal to, or greater than N.
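A minimal sketch of the first division rule above, splitting a list of K layers at the configured position M (illustrative only; the names are hypothetical):

```python
def split_model(layers, m):
    """Split K network layers: the first m layers form the network
    backbone layer, the remaining layers form the network head layer."""
    if not 0 < m < len(layers):
        raise ValueError("m must satisfy 0 < m < K")
    return layers[:m], layers[m:]

layers = ["conv1", "conv2", "conv3", "fc1", "fc2"]  # K = 5
backbone, head = split_model(layers, m=3)
# backbone = ["conv1", "conv2", "conv3"], head = ["fc1", "fc2"]
```

The same function covers the second rule as well, since splitting off the last N layers is equivalent to choosing M = K - N.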
Step 102: sending the network backbone layer of the first target federated model to each second client, so that each second client generates a second target federated model. The second target federated model may comprise the network backbone layer and a random network head layer; the structure of the random network head layer is the same as that of the first network head layer of the first target federated model, but its model parameters differ (i.e., the parameter values of the random network head layer and the first network head layer are different). Each second client may adjust the model parameters of the random network head layer based on its own local domain data to obtain a second network head layer.
For example, after receiving the network backbone layer of the first target federated model, each second client may generate a second target federated model based on it, comprising the network backbone layer and a random network head layer.
For example, after obtaining the second target federated model, each second client may input its local domain data to the second target federated model so as to train it. During this training, only the model parameters of the random network head layer are adjusted, yielding a second network head layer matched with the second client's local domain data; the model parameters of the network backbone layer are not adjusted.
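One way to realize "adjust only the head, keep the backbone fixed" is to apply gradient updates only to the head parameters. This is an illustrative sketch, not the patent's implementation; a real system would use a deep-learning framework's parameter-freezing mechanism (e.g., setting `requires_grad=False` on backbone parameters in PyTorch), and the names and toy values below are hypothetical:

```python
def train_step(backbone_params, head_params, head_grads, lr=0.1):
    """One update step on the second target federated model.

    Only the (random) head parameters are updated from their gradients;
    the backbone parameters received from the first client are returned
    unchanged, so the shared backbone stays identical across clients.
    """
    new_head = {
        name: [w - lr * g for w, g in zip(weights, head_grads[name])]
        for name, weights in head_params.items()
    }
    return backbone_params, new_head

backbone = {"conv_w": [0.2, -0.1]}   # received from the first client
head = {"fc_w": [1.0, 2.0]}          # randomly initialized head
grads = {"fc_w": [0.5, -1.0]}        # gradients from local domain data
backbone2, head2 = train_step(backbone, head, grads, lr=0.1)
# backbone2 is unchanged; head2["fc_w"] is approximately [0.95, 2.1]
```

Repeating such steps over the second client's local domain data turns the random network head layer into the second network head layer that is sent back to the first client.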
Step 103: receiving the second network head layer returned by each second client, and generating a final federated model based on the network backbone layer of the first target federated model, the first network head layer of the first target federated model, and the second network head layers returned by the second clients. For example, the output of the network backbone layer is connected to the first network head layer and to each second network head layer; the final federated model thus comprises one network backbone layer, the first network head layer and the second network head layer returned by each second client. In the final federated model, the first network head layer and the second network head layers are in parallel.
Illustratively, after the final federated model is obtained, it is used to process the data to be processed. The data to be processed is input to the network backbone layer of the final federated model, which processes it to obtain data features. The data features are then input to the first network head layer of the final federated model, which produces a first processing result, and to each second network head layer, each of which produces a second processing result. Finally, a target processing result of the data to be processed is determined based on the first processing result and the second processing results.
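The inference flow above can be sketched as follows. This is illustrative: the patent does not fix here how the head outputs are combined into the target processing result, so the element-wise averaging at the end is an assumption, and the toy backbone and heads are hypothetical:

```python
def final_model_predict(backbone_fn, head_fns, x):
    """Run the final federated model on input x.

    backbone_fn: shared feature extractor (the network backbone layer).
    head_fns: list of head functions [first_head, second_head_1, ...],
              all applied in parallel to the same features.
    Returns the per-head results and a combined target result
    (here: element-wise average of the head outputs -- an assumption).
    """
    features = backbone_fn(x)
    results = [head(features) for head in head_fns]
    n = len(results)
    combined = [sum(r[i] for r in results) / n for i in range(len(results[0]))]
    return results, combined

# Toy example: the backbone doubles each value, each head adds a bias.
backbone = lambda x: [2 * v for v in x]
heads = [lambda f: [v + 1 for v in f],   # first network head layer
         lambda f: [v + 3 for v in f]]   # a second network head layer
per_head, target = final_model_predict(backbone, heads, [1.0, 2.0])
# features = [2.0, 4.0]; per_head = [[3.0, 5.0], [5.0, 7.0]]; target = [4.0, 6.0]
```

Because the heads run in parallel on one shared feature pass, adding more clients' heads grows inference cost only by the (typically small) head layers, not by the backbone.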
As can be seen from the above technical solutions, in the embodiments of the application, for each client (i.e., data holding end), a target federated model matched with the local domain data of the client is obtained. The target federated model is obtained by weighted fusion in which the client's own model is given a larger weight and the models of the other clients are given smaller weights, so the local domain data is fully used; this improves the generalization ability and feature extraction ability of the target federated model on the local domain data, and therefore its performance on the local domain data. For each client, a final federated model is then generated based on the target federated model of the client and the network head layers of the other clients; that is, the network structures of the other clients are added on top of the target federated model. Since those network structures were trained on the other clients' data, the other domain data is fully used, which improves the generalization ability and performance of the final federated model on the other domain data.
In summary, after the final federated model is obtained, its performance on the local domain data is high and its performance on the other domain data is also high; that is, a high-performance final federated model is trained that performs well on the data of all clients. On the premise of preserving data privacy and data security, a high-performance final federated model can be obtained, artificial intelligence can be realized based on it, and the data of all clients can be used jointly, more efficiently and more accurately.
The data processing method according to the embodiment of the present application is described below with reference to specific application scenarios.
Before the technical solutions of the embodiments of the present application are introduced, technical terms related to the present application are introduced:
a machine learning model: the network model obtained by machine learning may be referred to as a machine learning model, and the machine learning model may include a model structure (which may also be referred to as a network structure) and model parameters thereof.
The models in the embodiments of the present application all refer to machine learning models; for example, the original model, the initial federated model, the target federated model and the final federated model are all machine learning models.
For example, the machine learning model may be a machine learning model using a neural network (e.g., a convolutional neural network), or may be a machine learning model using a deep learning algorithm, which is not limited thereto.
A neural network: the neural network may include a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), a fully connected network, and the like. The structural elements of the neural network may include, but are not limited to: a convolutional layer (Conv), a Pool layer (Pool), an excitation layer, a full connection layer (FC), etc., without limitation.
In a convolutional layer, image features are enhanced by convolving the image with a convolution kernel. The convolutional layer performs the convolution operation over a spatial range; the kernel may be an m × n matrix, and the output of the convolutional layer is obtained by convolving its input with the kernel. Convolution is essentially a filtering process: the pixel value f(x, y) at point (x, y) on the image is convolved with the kernel w(x, y). For example, a 4 × 4 convolution kernel contains 16 values whose magnitudes can be configured as required. Sliding a 4 × 4 window over the image yields a set of windows; convolving the kernel with each window yields the convolution features, which are the output of the convolutional layer and are provided to the next layer.
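The sliding-window convolution described above can be sketched as follows (an illustrative "valid" correlation with stride 1; a 2 × 2 kernel is used instead of 4 × 4 to keep the example small):

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    sum the element-wise products at each window position."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            row.append(sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw)
            ))
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]  # sums each window's main diagonal
features = conv2d_valid(image, kernel)
# features == [[6, 8], [12, 14]]
```

A 3 × 3 image with a 2 × 2 kernel yields a 2 × 2 feature map, matching the "sliding window" description: one output value per window position.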
A pooling layer is in effect a down-sampling process: taking the maximum, minimum or average over groups of convolution features (i.e., the input to the pooling layer) reduces the amount of computation while maintaining feature invariance. In the pooling layer, the image can be sub-sampled by exploiting local image correlation, reducing the amount of data to process while retaining useful information in the image.
In the excitation layer, an activation function (e.g., a nonlinear function) maps the features output by the previous layer, introducing a nonlinear factor; the neural network gains expressive power through compositions of nonlinearities. The activation function may include, but is not limited to, a ReLU (Rectified Linear Unit) function. Taking ReLU as an example: among all features output by the previous layer, ReLU sets features smaller than 0 to 0 and keeps features larger than 0 unchanged.
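The ReLU behavior just described, zeroing out negatives and passing non-negatives through, is a one-liner (illustrative sketch):

```python
def relu(features):
    """Set features smaller than 0 to 0; keep the rest unchanged."""
    return [max(0.0, f) for f in features]

relu([-2.0, -0.5, 0.0, 1.5, 3.0])
# -> [0.0, 0.0, 0.0, 1.5, 3.0]
```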
A fully-connected layer performs a fully-connected operation on all features input to it, yielding a feature vector that may comprise multiple features. A fully-connected layer can also be implemented as a 1 × 1 convolutional layer, in which case a fully-convolutional network is formed.
In practical application, one or more convolution layers, one or more pooling layers, one or more excitation layers and one or more fully-connected layers can be combined to construct a neural network according to different requirements.
Of course, the above is only an example of the neural network, and the structure of the neural network is not limited.
In summary, in a machine learning model based on a neural network, the model structure comprises network layers such as convolutional layers, pooling layers, excitation layers and fully-connected layers, together with the connection relationships among them. The model parameters may be convolutional-layer parameters, pooling-layer parameters, excitation-layer parameters, fully-connected-layer parameters, etc., without limitation.
Convolutional Neural Network (CNN): CNN is a feedforward neural network and one of the most representative network structures in deep learning; its artificial neurons respond to surrounding units within a coverage range and process image characteristics. In general, the basic structure of a CNN includes two kinds of layers. One is the feature extraction layer (e.g., a convolutional layer): the input of each neuron is connected to a local receptive field of the previous layer, from which the local feature is extracted; once a local feature is extracted, its positional relationship to other features is also determined. The other is the feature mapping layer (e.g., a structure using a ReLU function): each computing layer of the network is composed of multiple feature maps, each feature map is a plane, and all neurons on the plane share equal weights. The feature mapping structure may use the sigmoid function (an S-shaped growth curve), the ReLU function, etc. as the activation function of the convolutional network.
For convenience of description, the present embodiment takes the implementation of a machine learning model using a convolutional neural network as an example.
Federated learning: federated machine learning is also known as federated learning (FL), joint learning or alliance learning. Federated learning is a machine learning paradigm that can effectively help multiple clients perform data usage and machine learning modeling while meeting the requirements of user privacy protection and data security.
Local-domain data and other-domain data: local-domain data refers to the data usable by the local client, and other-domain data refers to the data usable by clients other than the local client. For example, for client a, the data that client a can use is called local-domain data, while the data used by client b and the data used by client c are both called other-domain data. For client b, the data that client b can use is called local-domain data, while the data used by client a and the data used by client c are both called other-domain data.
Independent and identical distribution (IID) of data and non-independent and identical distribution (non-IID) of data: data are IID when the data of all clients follow the same distribution and are independent of each other. Data are non-IID when the data distributions of the clients differ to some extent, for example, when the category distributions of the clients are inconsistent or the data scenes differ.
Training process of the machine learning model: during training, training data are used to train each model parameter in the machine learning model, that is, model parameters such as convolutional layer parameters, pooling layer parameters, excitation layer parameters, and fully connected layer parameters are adjusted and optimized, without limitation; all model parameters in the machine learning model may be trained. By training these model parameters, the machine learning model can fit the mapping relationship between input data and output data.
Use process of the machine learning model: during use, data to be processed (i.e., input data such as image data, audio data, text data, or video data) are provided to the machine learning model, and the machine learning model processes the data to be processed, for example, using all model parameters (such as artificial intelligence processing); the processing process is not limited. Output data are obtained, and the input data and output data satisfy the input-output mapping relationship of the machine learning model.
Illustratively, artificial intelligence has developed rapidly in various fields, such as face recognition, image recognition, object detection, semantic segmentation, instance segmentation, and unmanned driving, all of which can be handled by artificial intelligence. To realize artificial intelligence with machine learning, a high-performance machine learning model needs to be trained on a large amount of training data, and artificial intelligence is then realized through that model. However, the data held by each client is limited, and a high-performance machine learning model cannot be trained with local-domain data alone. Due to data privacy requirements, no client can send its local-domain data to other clients, so a high-performance machine learning model cannot be trained directly on local-domain data together with other-domain data. In summary, how to train a high-performance machine learning model while satisfying data privacy and data security, so that the data of multiple clients can be used more efficiently and accurately, is an important direction for artificial intelligence.
In view of the above findings, the embodiment of the present application provides a data processing method, which obtains a final federated model through federated learning training. The final federated model performs well both on local-domain data and on other-domain data; that is, a high-performance final federated model is trained that has good performance on the data of all clients. On the premise of satisfying data privacy and data security, a high-performance final federated model can be obtained, and the data of all clients can be jointly used more efficiently and accurately.
Referring to fig. 2, a schematic structural diagram of a federated learning system is shown. The federated learning system may include a server and at least two clients (a client may also be referred to as a user end or a data holding end, such as a camera). In fig. 2, three clients (client a, client b, and client c) are taken as an example; in practical applications, the number of clients may be larger, which is not limited. The server is used to fuse and transmit models, and each client is used to train models with its local-domain data.
Illustratively, when client a is taken as the first client, client b and client c are taken as the second clients. When the client b is used as a first client, the client a and the client c are used as second clients. When the client c is taken as a first client, the client a and the client b are taken as second clients.
The embodiment of the present application involves a local-domain performance improvement process and an other-domain performance improvement process. In the local-domain performance improvement process, a target federated model matched with the local-domain data of a client is obtained by increasing the weight of the local-domain model and reducing the weights of the other-domain models; the local-domain data are thus fully used to obtain the target federated model, improving its generalization capability and feature extraction capability on local-domain data, and hence its performance on local-domain data. In the other-domain performance improvement process, the network structures of other clients, which are trained on the data of those clients, are extended on the basis of the target federated model, so that the other-domain data are fully used to obtain a final federated model; this improves the generalization capability and performance of the final federated model on other-domain data and alleviates the other-domain performance degradation caused by non-IID data.
For the local-domain performance improvement process, see fig. 3, which is a schematic flow chart of a data processing method used to obtain the target federated model after local-domain performance improvement. The method may include:
Step 301, the server sends the original model to each client.
For example, the server may obtain an original model, where the original model may be a machine learning model that uses a neural network (e.g., a convolutional neural network), or may also be a machine learning model that uses a deep learning algorithm, and the structure and function of the original model are not limited, such as an original model for implementing a classification function, an original model for implementing a detection function, and the like. For example, the original model is a model for realizing face recognition, a model for realizing image recognition, a model for realizing target detection, a model for realizing semantic segmentation, a model for realizing instance segmentation, or a model for realizing unmanned driving.
After obtaining the original model, the server may send the original model to each client, for example, send the original model to client a, client b, and client c, respectively.
Step 302, for each client, after receiving the original model, the client trains the original model based on local domain data of the client to obtain a trained initial federated model.
For example, after receiving the original model, the client a trains the original model based on local data (denoted as data Da) of the client a to obtain a trained initial federated model a1, and the model training process is not limited as long as the original model can be trained based on the data Da.
For example, after receiving the original model, the client b trains the original model based on local data (denoted as data Db) of the client b to obtain a trained initial federated model b1.
For example, after receiving the original model, the client c trains the original model based on local domain data (denoted as data Dc) of the client c to obtain a trained initial federated model c1.
Step 303, each client sends its initial federated model to the server.
For example, the client a sends the initial federal model a1 to the server, the client b sends the initial federal model b1 to the server, and the client c sends the initial federal model c1 to the server, so that the server can obtain the initial federal model a1, the initial federal model b1 and the initial federal model c1.
Step 304, the server generates, for each client, a target federated model matched with the local-domain data of that client based on the initial federated models of all clients. It should be noted that when the server generates a target federated model for each client, the target federated models of different clients may be different.
For example, the client a is used as a first client, the client b and the client c are both used as second clients, and based on the initial federal model a1, the initial federal model b1 and the initial federal model c1, the server can generate a target federal model a2 matched with the local data Da of the client a for the client a.
The server may perform weighted fusion of the initial federated models a1, b1, and c1 based on the weighting coefficients of clients a, b, and c, to obtain the target federated model a2 matched with the local-domain data Da. The weighting coefficient of client a is greater than that of client b and greater than that of client c; the weighting coefficients of clients b and c may be the same or different, which is not limited.
See formula (1) for an example of generating the target federated model a2; this is not a limitation.
a2 = a1 × w11 + b1 × w12 + c1 × w13    formula (1)
In formula (1), w11 denotes the weighting coefficient of client a, w12 denotes the weighting coefficient of client b, and w13 denotes the weighting coefficient of client c when the target federated model a2 is generated for client a.
Obviously, w11 may be greater than w12, w11 may be greater than w13, and w12 and w13 may be the same.
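Formula (1) can be sketched as a parameter-wise weighted average (a minimal illustration; the dict-based model representation and the concrete weights 0.6/0.2/0.2 are assumptions for the example, and a real system would fuse tensors):

```python
def fuse_models(models, weights):
    """Parameter-wise weighted fusion of client models, as in formula (1)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights assumed to sum to 1"
    fused = {}
    for name in models[0]:
        fused[name] = [sum(w * m[name][i] for m, w in zip(models, weights))
                       for i in range(len(models[0][name]))]
    return fused

# Target federated model a2 for client a: its own coefficient w11 dominates.
a1 = {"conv1": [1.0, 2.0]}
b1 = {"conv1": [3.0, 4.0]}
c1 = {"conv1": [5.0, 6.0]}
a2 = fuse_models([a1, b1, c1], [0.6, 0.2, 0.2])
print(a2["conv1"])  # approximately [2.2, 3.2]
```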
For another example, the client b is used as a first client, the client a and the client c are both used as second clients, and the server may generate a target federal model b2 matched with the local data Db of the client b for the client b based on the initial federal model b1, the initial federal model a1 and the initial federal model c1.
The server may perform weighted fusion of the initial federated models b1, a1, and c1 based on the weighting coefficients of clients b, a, and c, to obtain the target federated model b2 matched with the local-domain data Db. The weighting coefficient of client b is greater than that of client a and greater than that of client c.
See formula (2) for an example of generating the target federated model b2; this is not a limitation.
b2 = a1 × w21 + b1 × w22 + c1 × w23    formula (2)
In formula (2), w21 denotes the weighting coefficient of client a, w22 denotes the weighting coefficient of client b, and w23 denotes the weighting coefficient of client c when the target federated model b2 is generated for client b.
Obviously, w22 may be greater than w21, w22 may be greater than w23, and w21 and w23 may be the same.
For another example, client c is used as the first client and clients a and b are both used as second clients; based on the initial federated models a1, b1, and c1, the server may generate a target federated model c2 matched with the local-domain data Dc of client c, which is not described herein again.
In summary, a target federated model a2 of client a, a target federated model b2 of client b, and a target federated model c2 of client c are obtained, where the target federated models a2, b2, and c2 are pairwise different.
Step 305, the server sends each client's target federated model to that client; for example, the target federated model a2 is sent to client a, the target federated model b2 to client b, and the target federated model c2 to client c.
In one possible implementation, a single iteration is performed. After client a obtains the target federated model a2, it takes the target federated model a2 as the target federated model with the best local-domain performance, ends the local-domain performance improvement process, and then adopts the other-domain performance improvement process to further improve the other-domain performance of the target federated model a2. Similarly, client b takes the target federated model b2 as the target federated model with the best local-domain performance, ends the local-domain performance improvement process, and adopts the other-domain performance improvement process to further improve the other-domain performance of the target federated model b2. Client c does the same with the target federated model c2.
In another possible implementation, multiple iterations are performed, for example P iterations, where P is a positive integer greater than 1, and steps 302-305 above constitute the first iteration. After client a obtains the target federated model a2, it takes the target federated model a2 as the original model and returns to step 302; likewise, client b takes the target federated model b2 as the original model and returns to step 302, and client c takes the target federated model c2 as the original model and returns to step 302. Steps 302-305 are then repeated, i.e., the second iteration is executed, and so on until P iterations are completed. At that point, each client obtains a target federated model, takes it as the target federated model with the best local-domain performance, ends the local-domain performance improvement process, and adopts the other-domain performance improvement process to further improve the other-domain performance of its target federated model.
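The P-iteration procedure of steps 302-305 can be sketched as follows (a toy illustration in which a model is reduced to a single number and "training" and "fusion" are stand-in functions; all names and values are assumptions for the example):

```python
def local_domain_training(original, clients, train, fuse, iterations):
    """P rounds of steps 302-305: each client trains its current model on
    local-domain data (step 302), then the server fuses the initial models
    with client-specific weights (step 304)."""
    models = {c: original for c in clients}
    for _ in range(iterations):
        initial = {c: train(c, models[c]) for c in clients}  # step 302
        models = {c: fuse(c, initial) for c in clients}      # step 304
    return models

clients = ["a", "b", "c"]
bump = {"a": 1.0, "b": 2.0, "c": 3.0}
train = lambda c, m: m + bump[c]  # stand-in for local training

def fuse(first, initial):
    # Own weight 0.6, other weights 0.2 each (w11 > w12, w11 > w13).
    return sum((0.6 if k == first else 0.2) * v for k, v in initial.items())

targets = local_domain_training(0.0, clients, train, fuse, iterations=1)
print(targets)  # each client's target model weighs its own update most heavily
```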
In summary, client a obtains the target federated model with the best local-domain performance, denoted a2; client b obtains the target federated model with the best local-domain performance, denoted b2; and client c obtains the target federated model with the best local-domain performance, denoted c2. The local-domain performance improvement process is thus completed, and the other-domain performance improvement process is executed next.
For the other-domain performance improvement process, see fig. 4, which is a schematic flow chart of a data processing method used to obtain the final federated model after other-domain performance improvement. The method may include:
Step 401, after client a obtains the target federated model a2 matched with its local-domain data Da, client a divides the target federated model a2 into a network backbone layer and a first network head layer.
In one possible implementation, a division parameter M may be configured in advance, indicating that the first M network layers are used as the network backbone layer (e.g., Backbone layer). On this basis, assuming the target federated model a2 includes K network layers, the K network layers may be divided into a network backbone layer and a first network head layer (e.g., Head layer), where the first M of the K network layers form the network backbone layer and the remaining network layers form the first network head layer.
In another possible implementation, a division parameter N may be configured in advance, indicating that the last N network layers are used as the first network head layer. On this basis, the K network layers of the target federated model a2 may be divided into a network backbone layer and a first network head layer, where the last N of the K network layers form the first network head layer and the remaining network layers form the network backbone layer.
In summary, the K network layers of the target federated model a2 are divided into a network backbone layer, denoted Backbone layer A, and a first network head layer, denoted Head layer A.
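The division by parameter N can be sketched as follows (illustrative only; the layer names are placeholders):

```python
def split_model(layers, n_head):
    """Split an ordered list of K network layers into a network backbone
    layer (first K - n_head layers) and a network head layer (last n_head
    layers), following the division parameter N described above."""
    assert 0 < n_head < len(layers)
    return layers[:-n_head], layers[-n_head:]

layers = ["layer%d" % i for i in range(1, 23)]  # K = 22, as in Table 1
backbone, head = split_model(layers, n_head=3)
print(len(backbone), head)  # 19 ['layer20', 'layer21', 'layer22']
```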
Illustratively, taking an image classification task in which the machine learning model uses VGG16 as an example, see Table 1 for an example of the network structure of the target federated model a2. The target federated model a2 includes 22 network layers, and the division parameter N may be configured as 3, so that the last 3 network layers of the target federated model a2 (e.g., the last three Fc layers) are used as the first network head layer, and the remaining 19 network layers are used as the network backbone layer.
TABLE 1
(Table 1 is reproduced as an image in the original publication; it lists the 22 network layers of the VGG16-based target federated model a2.)
Step 402, client a sends the network backbone layer of the target federated model a2 (i.e., Backbone layer A) to client b and client c respectively. It should be noted that, for data privacy protection, client a does not send the entire target federated model a2 to client b and client c; it sends only the network backbone layer, so that other clients cannot learn the complete target federated model a2.
Step 403, client b receives the network backbone layer of the target federated model a2 (i.e., Backbone layer A) and generates a target federated model a2'. The target federated model a2' includes Backbone layer A and a random network head layer; the structure (i.e., model structure) of the random network head layer is the same as that of the first network head layer of the target federated model a2 (i.e., Head layer A), while the model parameters of the random network head layer differ from those of Head layer A.
For example, assuming in step 401 the target federated model a2 is divided into Backbone layer A and Head layer A according to the division parameter N, client b first divides its own target federated model b2 into a network backbone layer and a network head layer (denoted network head layer x) based on the same division parameter N; network head layer x consists of the last N network layers of the target federated model b2. A random network head layer is then generated based on network head layer x; since the structure of network head layer x is the same as that of Head layer A, the structure of the random network head layer is also the same as that of Head layer A. The model parameters of the random network head layer are set randomly, and the setting manner is not limited.
After the random network head layer is obtained, it is combined with Backbone layer A, with Backbone layer A placed in front of the random network head layer, to obtain the target federated model a2'.
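Step 403 and the construction of the target federated model a2' can be sketched as follows (a minimal illustration; the dict-based representation, parameter counts, and initialization range are assumptions for the example):

```python
import random

def make_random_head(head_template, seed=None):
    """Build a random network head layer with the same structure (layer
    names and parameter counts) as the given head, but with randomly set
    model parameters."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in params]
            for name, params in head_template.items()}

# Client b derives the structure from its own network head layer x (which
# has the same structure as Head layer A), then places the received
# Backbone layer A in front of the random head to form model a2'.
head_x = {"fc1": [0.5, 0.5], "fc2": [0.5]}
random_head = make_random_head(head_x, seed=0)
model_a2_prime = {"backbone": "Backbone layer A (received, unchanged)",
                  "head": random_head}
print(sorted(random_head))  # same structure as head_x: ['fc1', 'fc2']
```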
Step 404, client b adjusts the model parameters of the random network head layer of the target federated model a2' based on its local-domain data Db to obtain a second network head layer, denoted Head layer AB.
For example, after obtaining the target federated model a2', client b may input its local-domain data Db into the target federated model a2' to train it. Illustratively, during this training, only the model parameters of the random network head layer are adjusted to obtain Head layer AB matched with the local-domain data Db; the model parameters of Backbone layer A are not adjusted.
In summary, when training the target federated model a2', the model parameters of Backbone layer A are kept unchanged, the model parameters of the random network head layer are adjusted, and the random network head layer with adjusted parameters is called Head layer AB.
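Step 404 (updating only the head while the backbone stays frozen) can be sketched as a single toy gradient step (illustrative only; the gradient values and learning rate are assumptions):

```python
def train_head_only(backbone, head, gradients, lr=0.1):
    """One toy gradient step mirroring step 404: only the head parameters
    are updated; the backbone is returned untouched (frozen)."""
    new_head = {name: [p - lr * gradients[name][i] for i, p in enumerate(ps)]
                for name, ps in head.items()}
    return backbone, new_head

backbone_a = {"conv1": [1.0, 2.0]}   # Backbone layer A (frozen)
random_head = {"fc": [0.5, 0.5]}     # random network head layer
grads = {"fc": [1.0, -1.0]}          # toy gradients computed from data Db
bb, head_ab = train_head_only(backbone_a, random_head, grads)
print(bb == backbone_a)  # True: backbone parameters unchanged
```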
Similarly, after receiving Backbone layer A of the target federated model a2, client c may also obtain a second network head layer, denoted Head layer AC; the implementation follows steps 403-404 and is not repeated here.
Step 405, client a receives the second network head layer returned by each client; for example, client a receives Head layer AB returned by client b and Head layer AC returned by client c.
Step 406, client a generates a final federated model based on Backbone layer A, Head layer A, and the second network head layers returned by the clients (i.e., Head layer AB and Head layer AC).
For example, the output of Backbone layer A is spliced with Head layer A and each second network head layer to obtain the final federated model. As shown in fig. 5, in the final federated model the output of Backbone layer A is spliced with Head layer A, Head layer AB, and Head layer AC; Head layer A, Head layer AB, and Head layer AC are in a parallel relationship and are all connected to the output of Backbone layer A.
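The forward pass of the final federated model in fig. 5 can be sketched as follows (the lambda stand-ins are illustrative only, not the claimed layers):

```python
def final_model_forward(x, backbone, heads):
    """The shared backbone produces one feature, which is fed to every
    head in parallel; each head yields its own processing result."""
    feature = backbone(x)
    return {name: head(feature) for name, head in heads.items()}

backbone_a = lambda x: x * 2             # stand-in for Backbone layer A
heads = {"A": lambda f: f + 1,           # Head layer A
         "AB": lambda f: f + 2,          # Head layer AB (trained on Db)
         "AC": lambda f: f + 3}          # Head layer AC (trained on Dc)
print(final_model_forward(5, backbone_a, heads))  # {'A': 11, 'AB': 12, 'AC': 13}
```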
In summary, on the basis of the target federated model a2 (i.e., Backbone layer A and Head layer A, with the output of Backbone layer A spliced to Head layer A), Head layer AB and Head layer AC are added through network structure extension to obtain the final federated model, thereby improving performance over the target federated model. For example, Head layer AB of the final federated model is trained on the local-domain data Db of client b and improves performance on data Db; Head layer AC is trained on the local-domain data Dc of client c and improves performance on data Dc.
Steps 401-406 obtain the final federated model of client a, with client a as the first client and clients b and c as the second clients. Similarly, client b may act as the first client with clients a and c as second clients to obtain the final federated model of client b, and client c may act as the first client with clients a and b as second clients to obtain the final federated model of client c; the implementation follows steps 401-406 and is not repeated here.
For the other-domain performance improvement process, see fig. 6, which is a schematic flow chart of a data processing method used to obtain the final federated model after other-domain performance improvement. The method may include:
Step 601, after the server obtains the target federated model a2 matched with the local-domain data Da of client a, the server divides the target federated model a2 into a network backbone layer and a first network head layer.
Step 602, the server sends the network backbone layer of the target federated model a2 (i.e., Backbone layer A) to client b and to client c.
Step 603, client b receives the network backbone layer of the target federated model a2 (i.e., Backbone layer A) and generates a target federated model a2', which includes Backbone layer A and a random network head layer.
Step 604, client b adjusts the model parameters of the random network head layer of the target federated model a2' based on its local-domain data Db to obtain a second network head layer, denoted Head layer AB.
Similarly, client c may also obtain a second network head layer, denoted Head layer AC.
Step 605, the server receives the second network head layer returned by each client; for example, the server receives Head layer AB returned by client b and Head layer AC returned by client c.
Step 606, the server generates a final federated model for client a based on the network backbone layer and the first network head layer of the target federated model a2, together with the second network head layers returned by the clients.
Illustratively, steps 601-606 are similar to steps 401-406 and are not repeated here.
Step 607, the server sends the final federated model to client a.
Similarly, based on steps 601-607, the server may also generate a final federated model for client b and send it to client b, and generate a final federated model for client c and send it to client c, which is not described in detail herein.
In one possible implementation, after obtaining the final federated model, client a may process data to be processed based on the final federated model. For example, the data to be processed are input to Backbone layer A of the final federated model, which processes them to obtain the data features corresponding to the data to be processed. The data features are then input to Head layer A, Head layer AB, and Head layer AC respectively. After receiving the data features, Head layer A processes them to obtain a processing result, denoted r1; Head layer AB processes them to obtain a processing result, denoted r2; and Head layer AC processes them to obtain a processing result, denoted r3.
A target processing result of the data to be processed is then determined based on processing results r1, r2, and r3; for example, r1, r2, and r3 are logically fused to obtain the target processing result. This embodiment does not limit the fusion as long as the target processing result is related to r1, r2, and r3.
For example, when the final federated model is used to implement a classification function (e.g., identifying the class of a target object in an image), if processing result r1 represents the target class of the target object and a confidence 1 of that class, processing result r2 represents the target class and a confidence 2, and processing result r3 represents the target class and a confidence 3, then the target processing result is the target class and a target confidence of the target object, where the target confidence is the average of confidence 1, confidence 2, and confidence 3.
For another example, when the final federated model is used to implement a detection function (e.g., identifying the position of a target object in an image), if processing result r1 represents rectangular frame 1 of the target object (i.e., the target object is located in rectangular frame 1, which can be represented by vertex coordinates, the length and width of the rectangle, etc.), processing result r2 represents rectangular frame 2, and processing result r3 represents rectangular frame 3, then the target processing result may be a target rectangular frame of the target object, where the target rectangular frame is the union of rectangular frames 1, 2, and 3, i.e., the target rectangular frame may include rectangular frames 1, 2, and 3.
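The two fusion examples above can be sketched as follows (one possible logical fusion among many; the toy classes, confidences, and boxes are illustrative):

```python
def fuse_classification(results):
    """Classification fusion: the heads are assumed to agree on the target
    class; the target confidence is the mean of the per-head confidences."""
    target_class = results[0][0]
    target_conf = sum(conf for _, conf in results) / len(results)
    return target_class, target_conf

def fuse_detection(boxes):
    """Detection fusion: the target rectangular frame is the union
    (smallest enclosing rectangle) of the per-head frames, each given
    as (x1, y1, x2, y2)."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

print(fuse_classification([("cat", 0.9), ("cat", 0.8), ("cat", 0.7)]))
print(fuse_detection([(0, 0, 4, 4), (1, 1, 5, 5), (2, 2, 6, 3)]))  # (0, 0, 6, 5)
```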
Of course, the above are only two examples of performing logical fusion on the processing result r1, the processing result r2, and the processing result r3, and the implementation manner of this logical fusion is not limited in this embodiment of the application.
According to the above technical solution, in the embodiment of the present application, for each client, a target federated model matched with the local-domain data of that client is obtained by increasing the weight of the local-domain model and reducing the weights of the other-domain models, so that the local-domain data are fully used; this improves the generalization capability, feature extraction capability, and performance of the target federated model on local-domain data. For each client, a final federated model is then generated based on the target federated model of that client and the network head layers of other clients; that is, the network structures of other clients, trained on the data of those clients, are extended on the basis of the target federated model, so that the other-domain data are fully used to obtain the final federated model. This improves the generalization capability and performance of the final federated model on other-domain data; in other words, the problem of other-domain performance degradation can be solved through network structure extension, and the improvement is especially significant when the data are non-IID.
Based on the same application concept as the above method, an embodiment of the present application provides a data processing apparatus, where the federated learning system includes a first client and at least one second client. Fig. 7 is a schematic structural diagram of the apparatus. The apparatus is applied to the first client and may include:
an obtaining module 71, configured to obtain a first target federated model matched with the local domain data of the first client, where the first target federated model includes a network backbone layer and a first network head layer; a sending module 72, configured to send the network backbone layer to each second client, so that the second client generates a second target federated model, where the second target federated model includes the network backbone layer and a random network head layer, the structure of the random network head layer is the same as that of the first network head layer, the model parameters of the random network head layer are different from those of the first network head layer, and the model parameters of the random network head layer are adjusted based on the local domain data of the second client to obtain a second network head layer; a receiving module 73, configured to receive the second network head layer returned by each second client; and a generating module 74, configured to generate a final federated model based on the network backbone layer, the first network head layer, and the second network head layer returned by each second client, where the final federated model is used for processing data to be processed.
In a possible implementation, when generating the final federated model based on the network backbone layer, the first network head layer, and the second network head layer returned by each second client, the generating module 74 is specifically configured to: splice the output end of the network backbone layer with the first network head layer and each second network head layer to obtain the final federated model; in the final federated model, the first network head layer and each second network head layer are in a parallel relationship.
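The parallel splicing described above might look like the following sketch, where the backbone and the heads are toy callables standing in for trained network layers; the class name and the lambda/built-in stand-ins are invented for illustration:

```python
class FinalFederatedModel:
    """Shared network backbone layer feeding several parallel head layers."""

    def __init__(self, backbone, heads):
        self.backbone = backbone  # trained network backbone layer (a callable here)
        self.heads = heads        # first network head layer + each second network head layer

    def forward(self, x):
        features = self.backbone(x)                      # backbone runs once per input
        return [head(features) for head in self.heads]   # heads run in parallel on the same features

# Toy stand-ins: a "feature extractor" that doubles its input, and three heads.
model = FinalFederatedModel(
    backbone=lambda x: [v * 2 for v in x],
    heads=[sum, max, min],
)
model.forward([1, 2, 3])  # [12, 6, 2]
```

The key structural point is that the backbone's output end is computed once and fanned out to every head, matching the parallel relationship described above.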
Illustratively, the data processing apparatus further comprises (not shown in the figures):
the processing module is used for inputting data to be processed to the network backbone layer of the final federated model, and the network backbone layer processes the data to be processed to obtain data characteristics corresponding to the data to be processed; inputting the data characteristics to a first network head layer of the final federated model, processing the data characteristics by the first network head layer to obtain a first processing result, inputting the data characteristics to each second network head layer of the final federated model, and processing the data characteristics by each second network head layer to obtain a second processing result; on this basis, a target processing result of the data to be processed may be determined based on the first processing result and the respective second processing results.
Illustratively, the first target federated model includes K network layers, and the data processing apparatus further includes (not shown in the figure): a dividing module, configured to divide the K network layers of the first target federated model into the network backbone layer and the first network head layer based on a configured division parameter M, where the first M network layers of the K network layers are the network backbone layer and the remaining network layers other than the network backbone layer are the first network head layer; or to divide the K network layers of the first target federated model into the network backbone layer and the first network head layer based on a configured division parameter N, where the last N network layers of the K network layers are the first network head layer and the remaining network layers other than the first network head layer are the network backbone layer. K, M and N are positive integers, M is smaller than K, and N is smaller than K.
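The two division schemes (first M layers as backbone, or last N layers as head) can be sketched as follows; the layer names are illustrative placeholders:

```python
def split_layers(layers, M=None, N=None):
    """Divide the K network layers into (network backbone layer, network head layer).

    With division parameter M, the first M layers form the backbone; with
    division parameter N, the last N layers form the head. Exactly one of
    M or N is given, and 0 < M (or N) < K.
    """
    K = len(layers)
    if M is not None:
        assert 0 < M < K
        return layers[:M], layers[M:]
    assert N is not None and 0 < N < K
    return layers[:K - N], layers[K - N:]

# K = 5 illustrative layer names; M = 3 and N = 2 describe the same split.
layers = ["conv1", "conv2", "conv3", "fc1", "fc2"]
split_layers(layers, M=3)  # (['conv1', 'conv2', 'conv3'], ['fc1', 'fc2'])
split_layers(layers, N=2)  # (['conv1', 'conv2', 'conv3'], ['fc1', 'fc2'])
```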
For example, when obtaining the first target federated model matched with the local domain data of the first client, the obtaining module 71 is specifically configured to: train the acquired original model based on the local domain data of the first client to obtain a trained initial federated model, and send the initial federated model to a server, where the server generates, for the first client, a first target federated model matched with the local domain data of the first client based on the initial federated model of the first client and the initial federated models of the second clients; and acquire the first target federated model from the server.
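One way to picture the server-side generation of a client-matched target model is the weighted fusion described in claim 7, where the requesting client's weighting coefficient exceeds the others'. The sketch below is a toy illustration; the parameter names and coefficient values are invented:

```python
def weighted_fuse(models, weights):
    """Weighted fusion of the clients' initial models into one target model.

    `models` maps client id -> parameter dict; `weights` maps client id ->
    weighting coefficient (assumed to sum to 1). Giving the requesting
    client the largest coefficient biases the fused model toward its
    local domain data.
    """
    names = next(iter(models.values())).keys()
    return {n: sum(weights[c] * models[c][n] for c in models) for n in names}

# Invented two-parameter initial models for three clients.
initial = {
    "client1": {"w": 1.0, "b": 0.0},
    "client2": {"w": 3.0, "b": 2.0},
    "client3": {"w": 5.0, "b": 4.0},
}
# Target model for client1: its own coefficient (0.6) exceeds the others' (0.2 each).
target_for_c1 = weighted_fuse(initial, {"client1": 0.6, "client2": 0.2, "client3": 0.2})
# target_for_c1 ≈ {'w': 2.2, 'b': 1.2}
```

A different coefficient vector would be used when generating the target model for client2 or client3, so each client receives a model biased toward its own domain.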
Based on the same application concept as the above method, an embodiment of the present application provides a client device, where the federated learning system includes a first client and at least one second client, and the client device serves as the first client. Referring to fig. 8, the client device includes a processor 81 and a machine-readable storage medium 82, where the machine-readable storage medium 82 stores machine-executable instructions that can be executed by the processor 81; the processor 81 is configured to execute the machine-executable instructions to implement the data processing method described above.
Based on the same application concept as the method, embodiments of the present application further provide a machine-readable storage medium, where several computer instructions are stored, and when the computer instructions are executed by a processor, the data processing method disclosed in the above example of the present application can be implemented.
The machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (e.g., a hard disk drive), a solid-state drive, any type of storage disk (e.g., an optical disk or a DVD), or a similar storage medium, or a combination thereof.
Based on the same application concept as the method, the embodiment of the application provides a federated learning system, wherein the federated learning system comprises a first client and at least one second client; wherein:
the first client is used for acquiring a first target federal model matched with local domain data of the first client; wherein the first target federated model comprises a network backbone layer and a first network header layer; and sending the network backbone layer to each second client;
the second client is used for generating a second target federated model, the second target federated model comprises the network backbone layer and a random network head layer, the structure of the random network head layer is the same as that of the first network head layer, and the model parameters of the random network head layer are different from those of the first network head layer; adjusting the model parameters of the random network head layer based on the local domain data of the second client to obtain a second network head layer, and sending the second network head layer to the first client;
the first client is further used for receiving the second network head layer returned by each second client, and generating a final federated model based on the network backbone layer, the first network head layer and the second network head layer returned by each second client; and the final federal model is used for processing the data to be processed.
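The second client's step in the system above — fitting only the random network head layer while keeping the received backbone frozen — can be illustrated with a minimal toy example. The single-parameter layers, the `scale`/`w` names, the learning rate, and the supervision signal are all invented for illustration:

```python
def fine_tune_head(backbone_params, head_params, local_data, lr=0.1, epochs=50):
    """Adjust only the head parameters; the received backbone stays frozen.

    Toy stand-in for the second client's training step: the head's single
    parameter `w` is fitted to the client's local domain data while
    `backbone_params` are read during the forward pass but never written.
    """
    for _ in range(epochs):
        for x in local_data:
            feature = backbone_params["scale"] * x   # forward through the frozen backbone
            pred = head_params["w"] * feature        # forward through the head being tuned
            target = 2.0 * feature                   # toy supervision signal
            head_params["w"] -= lr * (pred - target) * feature  # gradient step on the head only
    return head_params

backbone = {"scale": 1.0}   # network backbone layer received from the first client
random_head = {"w": 0.0}    # random network head layer (same structure, different parameters)
second_head = fine_tune_head(backbone, random_head, [0.5, 1.0, 1.5])
# second_head["w"] converges toward 2.0; the backbone is unchanged, and only the head is returned
```

Only `second_head` would be sent back to the first client, matching the flow above in which the backbone travels from the first client to the second clients and only the tuned head layers travel back.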
Illustratively, the federal learning system further comprises a server;
the first client is used for training the acquired original model based on local domain data of the first client to obtain a trained initial federated model, and sending the initial federated model to the server;
the server is used for generating a first target federal model matched with local domain data of the first client for the first client based on the initial federal model of the first client and the initial federal models of the second clients, and sending the first target federal model to the first client;
the first client is further configured to obtain the first target federated model from the server.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may be in the form of a personal computer, laptop, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art to which the present application pertains. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (12)

1. A data processing method is characterized in that a federal learning system comprises a first client and at least one second client, the method is applied to the first client, and the method comprises the following steps:
acquiring a first target federal model matched with the local image data of the first client; wherein the first target federal model comprises a network backbone layer and a first network head layer; the first target federal model comprises K network layers, wherein the first M network layers of the K network layers are the network backbone layer, and the remaining network layers other than the network backbone layer are the first network head layer; or, the last N network layers of the K network layers are the first network head layer, and the remaining network layers other than the first network head layer are the network backbone layer; wherein K, M and N are positive integers, M is less than K, and N is less than K;
sending the network backbone layer to each second client to enable the second clients to generate a second target federated model, wherein the second target federated model comprises the network backbone layer and a random network head layer, the structure of the random network head layer is the same as that of the first network head layer, model parameters of the random network head layer are different from those of the first network head layer, and the model parameters of the random network head layer are adjusted based on local domain image data of the second clients to obtain the second network head layer;
receiving a second network head layer returned by each second client, and generating a final federated model based on the network backbone layer, the first network head layer and the second network head layer returned by each second client;
and the final federal model is used for processing the image data to be processed.
2. The method of claim 1, wherein generating a final federated model based on the network backbone layer, the first network header layer, and the second network header layer returned by each second client comprises:
splicing the output end of the network backbone layer with the first network head layer and each second network head layer to obtain the final federal model; wherein, in the final federated model, the first network header layer and each second network header layer are in a parallel relationship.
3. The method of claim 2,
after generating a final federated model based on the network backbone layer, the first network head layer and the second network head layer returned by each second client, the method further includes:
inputting image data to be processed to the network backbone layer of the final federal model, and processing the image data to be processed by the network backbone layer to obtain data characteristics corresponding to the image data to be processed;
inputting the data characteristics to the first network head layer, processing the data characteristics by the first network head layer to obtain a first processing result, inputting the data characteristics to each second network head layer, and processing the data characteristics by each second network head layer to obtain a second processing result;
and determining a target processing result of the image data to be processed based on the first processing result and each second processing result.
4. The method according to claim 1, wherein the adjusting model parameters of the random network header layer based on the local image data of the second client to obtain a second network header layer comprises:
the second client inputs the local image data of the second client to the second target federal model so as to train the second target federal model through the local image data of the second client;
when the second target federated model is trained through the local image data of the second client, the model parameters of the random network head layer are adjusted to obtain a second network head layer matched with the local image data of the second client, and the model parameters of the network backbone layer are not adjusted.
5. The method according to any one of claims 1-4, further comprising:
dividing K network layers of the first target federal model into a network backbone layer and a first network head layer based on a configured division parameter M; or dividing K network layers of the first target federal model into a network backbone layer and a first network head layer based on the configured dividing parameter N.
6. The method according to any one of claims 1 to 4,
the obtaining of the first target federal model matched with the local image data of the first client includes:
training the obtained original model based on the local image data of the first client to obtain a trained initial federated model, sending the initial federated model to a server, and generating a first target federated model matched with the local image data of the first client for the first client by the server based on the initial federated model of the first client and the initial federated models of the second clients;
and acquiring the first target federal model from the server.
7. The method according to claim 6, wherein the server generates a first target federated model for the first client that matches the local image data of the first client based on the initial federated model of the first client and the initial federated model of each second client, including:
the server side performs weighted fusion on the initial federal model of the first client and the initial federal model of each second client based on the weighting coefficient of the first client and the weighting coefficient of each second client to obtain a first target federal model matched with the local image data of the first client;
wherein the weighting coefficient of the first client is greater than the weighting coefficient of each second client.
8. A data processing apparatus, wherein a federal learning system includes a first client and at least one second client, the apparatus is applied to the first client, and the apparatus includes:
the acquisition module is used for acquiring a first target federal model matched with the local image data of the first client, and the first target federal model comprises a network backbone layer and a first network head layer; the first target federal model comprises K network layers, wherein the first M network layers of the K network layers are the network backbone layer, and the remaining network layers other than the network backbone layer are the first network head layer; or, the last N network layers of the K network layers are the first network head layer, and the remaining network layers other than the first network head layer are the network backbone layer; wherein K, M and N are positive integers, M is less than K, and N is less than K;
a sending module, configured to send the network backbone layer to each second client, so that the second clients generate a second target federated model, where the second target federated model includes the network backbone layer and a random network head layer, the structure of the random network head layer is the same as that of the first network head layer, the model parameters of the random network head layer are different from those of the first network head layer, and the model parameters of the random network head layer are adjusted based on the local image data of the second client to obtain a second network head layer;
the receiving module is used for receiving the second network head layer returned by each second client;
the generating module is used for generating a final federal model based on the network backbone layer, the first network head layer and the second network head layer returned by each second client;
and the final federal model is used for processing the image data to be processed.
9. The apparatus according to claim 8, wherein the generating module is specifically configured to, when generating the final federated model based on the network backbone layer, the first network head layer, and the second network head layer returned by each second client: splice the output end of the network backbone layer with the first network head layer and each second network head layer to obtain the final federated model; in the final federated model, the first network head layer and each second network head layer are in a parallel relationship;
wherein the apparatus further comprises: the processing module is used for inputting image data to be processed to the network backbone layer of the final federated model, and the network backbone layer processes the image data to be processed to obtain data characteristics corresponding to the image data to be processed; inputting the data characteristics to a first network head layer of the final federated model, processing the data characteristics by the first network head layer to obtain a first processing result, inputting the data characteristics to each second network head layer of the final federated model, and processing the data characteristics by each second network head layer to obtain a second processing result; determining a target processing result of the image data to be processed based on the first processing result and each second processing result;
wherein the first target federated model includes K network layers, the apparatus further comprising: the dividing module is used for dividing K network layers of the first target federated model into a network backbone layer and a first network head layer based on the configured dividing parameter M; or dividing K network layers of the first target federal model into a network backbone layer and a first network head layer based on a configured dividing parameter N;
the obtaining module is specifically configured to, when obtaining a first target federal model matched with the local image data of the first client: train the obtained original model based on the local image data of the first client to obtain a trained initial federated model, and send the initial federated model to a server, where the server generates a first target federal model matched with the local image data of the first client for the first client based on the initial federated model of the first client and the initial federated models of the second clients; and acquire the first target federal model from the server.
10. A client device, wherein a federal learning system includes a first client and at least one second client, the client device being the first client;
wherein the client device comprises a processor and a machine-readable storage medium having stored thereon machine-executable instructions executable by the processor; the processor is configured to execute machine executable instructions to implement the method steps of any of claims 1-7.
11. A federated learning system is characterized in that,
the federal learning system comprises a first client and at least one second client; wherein:
the first client is used for acquiring a first target federal model matched with local image data of the first client; wherein the first target federal model comprises a network backbone layer and a first network head layer; and sending the network backbone layer to each second client; the first target federal model comprises K network layers, wherein the first M network layers of the K network layers are the network backbone layer, and the remaining network layers other than the network backbone layer are the first network head layer; or, the last N network layers of the K network layers are the first network head layer, and the remaining network layers other than the first network head layer are the network backbone layer; wherein K, M and N are positive integers, M is less than K, and N is less than K;
the second client is used for generating a second target federated model, the second target federated model comprises the network backbone layer and a random network head layer, the structure of the random network head layer is the same as that of the first network head layer, and the model parameters of the random network head layer are different from those of the first network head layer; adjusting the model parameters of the random network head layer based on the local image data of the second client to obtain a second network head layer, and sending the second network head layer to the first client;
the first client is further used for receiving the second network head layer returned by each second client, and generating a final federated model based on the network backbone layer, the first network head layer and the second network head layer returned by each second client; and the final federal model is used for processing the image data to be processed.
12. The system of claim 11,
the federal learning system further comprises a server;
the first client is used for training the acquired original model based on the local image data of the first client to obtain a trained initial federated model, and sending the initial federated model to the server;
the server is used for generating a first target federal model matched with the local image data of the first client for the first client based on the initial federal model of the first client and the initial federal models of the second clients, and sending the first target federal model to the first client;
the first client is further configured to obtain the first target federated model from the server.
CN202111254170.9A 2021-10-27 2021-10-27 Data processing method, device, equipment and system Active CN113988260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111254170.9A CN113988260B (en) 2021-10-27 2021-10-27 Data processing method, device, equipment and system

Publications (2)

Publication Number Publication Date
CN113988260A CN113988260A (en) 2022-01-28
CN113988260B true CN113988260B (en) 2022-11-25


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329010A (en) * 2020-10-16 2021-02-05 深圳前海微众银行股份有限公司 Adaptive data processing method, device, equipment and storage medium based on federal learning
CN113297396A (en) * 2021-07-21 2021-08-24 支付宝(杭州)信息技术有限公司 Method, device and equipment for updating model parameters based on federal learning
CN113408743A (en) * 2021-06-29 2021-09-17 北京百度网讯科技有限公司 Federal model generation method and device, electronic equipment and storage medium
CN113537518A (en) * 2021-07-19 2021-10-22 哈尔滨工业大学 Model training method and device based on federal learning, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Toward Efficient Federated Learning in Multi-Channeled Mobile Edge Network; Haizhou Du et al.; arXiv; 2021-09-18; pp. 1-12 *
Intelligent Ecological Network: Knowledge-Driven Future Value Internet Infrastructure; Lei Kai et al.; Journal of Applied Sciences (应用科学学报); January 2020; Vol. 38, No. 1; pp. 152-172 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant