CN113807157A - Method, device and system for training neural network model based on federated learning


Info

Publication number
CN113807157A
Authority
CN
China
Prior art keywords
neural network
network model
training
trained
server
Prior art date
Legal status
Pending
Application number
CN202011352528.7A
Other languages
Chinese (zh)
Inventor
毛伟
王希予
张立平
裴积全
Current Assignee
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd
Priority to CN202011352528.7A
Publication of CN113807157A

Classifications

    • G06F 21/602: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity; Protecting data; Providing cryptographic facilities or services
    • G06N 3/045: Computing arrangements based on biological models; Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
    • G06N 3/08: Computing arrangements based on biological models; Neural networks; Learning methods

Abstract

The embodiments of the disclosure disclose a method, a device, and a system for training a neural network model based on federated learning. One embodiment of the method comprises: receiving an untrained neural network model sent by a server; training the untrained neural network model using local training data to obtain a trained neural network model, and sending the trained neural network model to the server, wherein the server is used for executing the following processing steps: aggregating the trained neural network models received from each participating terminal to obtain an aggregated neural network model, and sending the aggregated neural network model to each participating terminal in response to determining that training of the aggregated neural network model is completed; and storing the trained, aggregated neural network model. This embodiment enables federated learning to be applied to the training of neural networks.

Description

Method, device and system for training neural network model based on federated learning
Technical Field
The embodiments of the disclosure relate to the field of computer technology, and in particular to a method and a device for training a neural network model based on federated learning.
Background
With the development of technologies such as cloud computing and distributed storage, the era of big data has arrived. In the big data era, all kinds of information, including business activities of enterprises, item or product descriptions, user behavior data, the natural and social environment, and economic and political conditions, can be recorded, forming valuable data assets. Analyzing and exploiting big data to uncover objective patterns drives the development and application of artificial intelligence.
With ongoing research and application, artificial intelligence has shown its advantages in many industries such as autonomous driving, health care, and finance. Researchers therefore expect to use more sophisticated and more effective artificial intelligence techniques in many fields. However, the data available in many fields is usually limited and often of poor quality, which directly hinders the practical deployment of artificial intelligence technology. Researchers have therefore asked whether data from different sources can be fused together to overcome limited volume and poor quality, but breaking the barriers between different data sources is very difficult in many cases. For example, different enterprises hold different data, and data privacy and data security concerns between enterprises leave each enterprise's data in an isolated silo that cannot be used jointly.
Federated learning is an emerging artificial intelligence technique. Its original design goal is to carry out efficient machine learning among multiple participants (such as multiple computing nodes or multiple user terminals) while ensuring data security, data privacy, and legal compliance during data exchange, for example the training and application of models such as support vector machines and XGBoost (eXtreme Gradient Boosting). The parties involved in federated learning usually have equal status, contribute jointly, and share the results. Compared with traditional distributed learning, federated learning keeps the data exchange process legally compliant and does not require confidential or private data to be migrated, so private data is not disclosed.
Disclosure of Invention
The embodiments of the disclosure provide a method, a device, and a system for training a neural network model based on federated learning.
In a first aspect, an embodiment of the present disclosure provides a method for training a neural network model based on federated learning, applied to a participating end, and the method includes: receiving an untrained neural network model sent by a server; training the untrained neural network model using local training data to obtain a trained neural network model, and sending the trained neural network model to the server, wherein the server is used for executing the following processing steps: aggregating the trained neural network models received from each participating terminal to obtain an aggregated neural network model, and sending the aggregated neural network model to each participating terminal in response to determining that training of the aggregated neural network model is completed; and storing the trained, aggregated neural network model.
In some embodiments, the processing step further comprises: and in response to determining that the aggregated neural network model is not trained completely, sending the aggregated neural network model to each participant terminal to continue training.
In some embodiments, the training data does not include one-dimensional data.
In some embodiments, the server is one of the participating peers.
In some embodiments, the training data is encrypted data.
In some embodiments, training the untrained neural network model using local training data to obtain a trained neural network model, includes: and training the untrained neural network model by using local training data, and encrypting the trained neural network model in the training process to obtain the trained neural network model.
In some embodiments, training an untrained neural network model using local training data to obtain a trained neural network model, and sending the trained neural network model to a server, includes: encrypting the trained neural network model by using a preset encryption algorithm to obtain an encrypted trained neural network model; and determining the encrypted trained neural network model as the trained neural network model and sending the trained neural network model to the server.
In some embodiments, the neural network model is a convolutional neural network.
In some embodiments, the data transmission between the server and each participant is implemented based on a preset acceleration library.
In a second aspect, an embodiment of the present disclosure provides a face recognition method, including: acquiring a face image to be recognized; inputting the face image into a face recognition model trained in advance to obtain a recognition result, wherein the face recognition model is obtained by training through a method described in any one implementation mode in the first aspect; and generating prompt information according to the identification result.
In a third aspect, an embodiment of the present disclosure provides a system for training a neural network model based on federated learning, including a server and at least two participating terminals; the server sends an untrained neural network model to each of the at least two participating terminals; each of the at least two participating terminals trains the received untrained neural network model using local training data to obtain a trained neural network model and sends the trained neural network model to the server; the server aggregates the received trained neural network models to obtain an aggregated neural network model, and sends the aggregated neural network model to the at least two participating terminals in response to determining that training of the aggregated neural network model is completed; and the at least two participating terminals store the received trained, aggregated neural network model.
In some embodiments, the server sends the aggregated neural network model to each participant terminal to continue training in response to determining that the aggregated neural network model is not trained.
In some embodiments, the training data does not include one-dimensional data.
In some embodiments, the server is one of the participating peers.
In some embodiments, the training data is encrypted data.
In some embodiments, for a participant terminal of the at least two participant terminals, the participant terminal trains an untrained neural network model using local training data, and encrypts the trained neural network model during the training process to obtain a trained neural network model.
In some embodiments, for a participant end of the at least two participant ends, the participant end encrypts the trained neural network model by using a preset encryption algorithm to obtain an encrypted trained neural network model; and determining the encrypted trained neural network model as the trained neural network model and sending the trained neural network model to the server.
In some embodiments, the neural network model is a convolutional neural network.
In some embodiments, the data transmission between the server and each participant is implemented based on a preset acceleration library.
In a fourth aspect, an embodiment of the present disclosure provides an apparatus for training a neural network model based on federated learning, applied to a participating end, and the apparatus includes: a receiving unit configured to receive an untrained neural network model sent by a server; a training unit configured to train the untrained neural network model using local training data to obtain a trained neural network model and send the trained neural network model to the server, wherein the server is used for executing the following processing steps: aggregating the trained neural network models received from each participating terminal to obtain an aggregated neural network model, and sending the aggregated neural network model to each participating terminal in response to determining that training of the aggregated neural network model is completed; and a storage unit configured to store the trained, aggregated neural network model.
In some embodiments, the processing step further comprises: and in response to determining that the aggregated neural network model is not trained completely, sending the aggregated neural network model to each participant terminal to continue training.
In some embodiments, the training data does not include one-dimensional data.
In some embodiments, the server is one of the participating peers.
In some embodiments, the training data is encrypted data.
In some embodiments, the training unit is further configured to train the untrained neural network model using local training data, and encrypt the trained neural network model during the training process to obtain the trained neural network model.
In some embodiments, the training unit is further configured to encrypt the trained neural network model by using a preset encryption algorithm, so as to obtain an encrypted trained neural network model; and determining the encrypted trained neural network model as the trained neural network model and sending the trained neural network model to the server.
In some embodiments, the neural network model is a convolutional neural network.
In some embodiments, the data transmission between the server and each participant is implemented based on a preset acceleration library.
In a fifth aspect, an embodiment of the present disclosure provides a face recognition apparatus, including: an acquisition unit configured to acquire a face image to be recognized; a recognition unit configured to input the face image into a face recognition model trained in advance to obtain a recognition result, wherein the face recognition model is obtained by training according to the method described in the first aspect; and a generation unit configured to generate prompt information according to the recognition result.
In a sixth aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; storage means for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method as described in any one of the implementations of the first aspect or the second aspect.
In a seventh aspect, embodiments of the present disclosure provide a computer-readable medium on which a computer program is stored, which, when executed by a processor, implements the method as described in any one of the implementations of the first aspect or the second aspect.
According to the method, device, and system for training a neural network model based on federated learning provided by the embodiments of the disclosure, the participating terminals of federated learning train the same neural network model using their local training data, send the trained models to the server for aggregation, and then continue training until the neural network model aggregated by the server has finished training, after which the trained neural network model can be stored for subsequent use. In this way, federated learning and neural networks can be combined, so that federated learning is applied to the training of neural networks, the problems of isolation and fragmentation among neural network training data belonging to different data sources can be solved in a legally compliant and convenient manner, and the processing capability and generalization ability of the trained neural network can be improved.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of a method of training a neural network model based on federated learning according to the present disclosure;
FIG. 3 is a flow diagram of one embodiment of a face recognition method according to the present disclosure;
FIG. 4 is a timing diagram of one embodiment of a system for training a neural network model based on federated learning according to the present disclosure;
FIG. 5 is a schematic structural diagram of one embodiment of an apparatus for training a neural network model based on federated learning, according to the present disclosure;
FIG. 6 is a schematic block diagram of one embodiment of a face recognition apparatus according to the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary architecture 100 to which embodiments of the method of training a neural network model based on federated learning or the apparatus for training a neural network model based on federated learning of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include participating ends 101, 102, 103 and a server 104. The participating ends 101, 102, 103 and the server 104 are communicatively connected; the specific connection manner and connection type are not limited and may be, for example, wired links, wireless communication links, or fiber optic cables.
Data interaction can take place between the participating ends 101, 102, 103 and the server 104 to receive or send data (such as neural network models). Various applications or tools may be installed on the participating ends 101, 102, 103 and the server 104 to implement different data processing functions, for example frameworks for building neural network models, tools for training neural network models, and the like.
The participating terminals 101, 102, 103 may be hardware or software. When the participants 101, 102, 103 are hardware, they can be various electronic devices, including but not limited to various terminal devices (such as laptop, desktop, tablet, etc.) and various servers, etc. When the participating terminals 101, 102, 103 are software, they may be installed in the electronic devices listed above. It may be implemented as multiple pieces of software or software modules (e.g., multiple pieces of software or software modules to provide distributed services) or as a single piece of software or software module. And is not particularly limited herein.
The server 104 may be a server providing various services, such as a server that aggregates neural network models transmitted by the participating terminals 101, 102, 103. Further, the server may also send the aggregated neural network model to the participating terminals 101, 102, 103.
It should be noted that the method for training a neural network model based on federated learning provided in the embodiments of the present disclosure is generally performed by the participating ends 101, 102, 103, and accordingly, the apparatus for training a neural network model based on federated learning is generally disposed in the participating ends 101, 102, 103.
It is further noted that in some cases, one of the participants 101, 102, 103 may also act as a server while acting as a participant.
The server 104 may be hardware or software. When the server 104 is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or may be implemented as a single server. When the server 104 is software, it may be implemented as multiple pieces of software or software modules (e.g., multiple pieces of software or software modules used to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be understood that the number of participating and serving peers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of a method of training a neural network model based on federated learning in accordance with the present disclosure is illustrated. The method for training a neural network model based on federated learning comprises the following steps:
step 201, receiving an untrained neural network model sent by a server.
In this embodiment, the execution body of the method for training a neural network model based on federated learning (e.g., the participating ends 101, 102, 103 shown in fig. 1) may receive an untrained neural network model sent by a server (e.g., the server 104 shown in fig. 1).
The neural network model may be any of various types of neural network models, such as a deep neural network, a recurrent neural network, or a convolutional neural network. The processing data, processing procedure, and processing result of the neural network model may differ according to actual application requirements and application scenarios. For example, neural network models may be used for face recognition, speech recognition, video tracking, and the like. The untrained neural network model may be a model that has not been trained at all or one whose training has not yet been completed.
Step 202, training the untrained neural network model by using local training data to obtain a trained neural network model, and sending the trained neural network model to the server.
In this embodiment, after receiving the untrained neural network model from the server, the executive body may train the untrained neural network model by using its local training data. The corresponding training data may be different according to the neural network model. For example, if the neural network model is used for face recognition, the training data may include a face image and a face recognition result corresponding to the face image.
Specifically, the untrained neural network model may be trained by using a machine learning method until a preset training stop condition is satisfied, so as to obtain the trained neural network model. The training stopping condition can be flexibly set according to actual application requirements and application scenes. For example, training stop conditions may include, but are not limited to: the training time reaches a preset duration, the training times reach preset times, the value of the loss function or the attenuation amplitude reaches a preset threshold value, and the like.
After the executing agent obtains the trained neural network model through training, the trained neural network model can be further sent to the server.
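As a rough illustration of steps 201 and 202 on the participating end, the following sketch assumes a PyTorch-style model and data loader; the transport helpers named in the comments (receive_model_from_server, send_model_to_server) are placeholders rather than functions defined by the disclosure.

```python
from torch import nn, optim

def local_training_round(model: nn.Module, train_loader, epochs: int = 1, lr: float = 0.01):
    """Train the model received from the server on local data (steps 201 and 202)."""
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for features, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(features), labels)
            loss.backward()
            optimizer.step()
    return model

# Hypothetical participant flow:
# model = receive_model_from_server()                 # step 201 (placeholder transport)
# model = local_training_round(model, train_loader)   # step 202, local data only
# send_model_to_server(model.state_dict())            # step 202 (placeholder transport)
```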
It should be noted that, in addition to the execution main body, other participating ends may be provided. Different participating peers may have different training data locally. In some cases, there may also be an intersection of the local training data of the different participating ends. Each participant can obtain the neural network model trained by using its local training data by performing the above steps 201 and 202. Generally, the trained neural network model obtained by each participant is different.
After receiving the trained neural network models respectively sent by each participating end (including the execution main body and other participating ends), the server may perform aggregation processing on each trained neural network model to fuse the characteristics of the trained neural network models of each participating end, thereby obtaining an aggregated neural network model.
The specifically adopted aggregation processing algorithm can be flexibly set according to actual application requirements and application scenes. For example, the aggregation process may refer to averaging or summing network parameters of the trained neural network models, and so on.
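As one possible realization of such an aggregation step, the server can average the corresponding network parameters of the models returned by the participating ends (the federated averaging idea). The sketch below assumes PyTorch state_dict representations and weights all participants equally; weighting by local sample count is a common variant.

```python
import torch

def aggregate_state_dicts(state_dicts):
    """Average corresponding parameters of the trained models sent by the participants."""
    aggregated = {}
    for name in state_dicts[0]:
        aggregated[name] = torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
    return aggregated

# Example: fuse the models returned by three participating ends into one global model.
# aggregated = aggregate_state_dicts([sd_1, sd_2, sd_3])
# global_model.load_state_dict(aggregated)
```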
After the aggregated neural network model is obtained, the server may first determine whether the training of the aggregated neural network model is completed. Specifically, the server may determine whether training of the aggregated neural network model is completed according to a preset stop condition. The preset stop condition can be flexibly set according to actual application requirements and application scenes. For example, the preset stop conditions include, but are not limited to: the training time reaches the preset duration, the training times reaches the preset times, and the like.
If the training of the aggregated neural network model is completed, the aggregated neural network model can be sent to each participating end, so that each participating end can process corresponding data by using the trained neural network model.
And step 203, storing the trained and aggregated neural network model.
In this embodiment, after receiving the aggregated neural network model sent by the server and having completed training, the execution subject may store the aggregated neural network model for subsequent use. The other participating terminals can also store the trained and aggregated neural network model.
In some optional implementations of this embodiment, in step 202, if the server determines that training of the aggregated neural network model is not complete, it may send the aggregated, not yet fully trained neural network model to each participating end so that each participating end continues to train it.
Specifically, each participating end may treat the aggregated, not yet fully trained neural network model received from the server as the untrained neural network model, and continue to perform steps 201 and 202 above until training of the neural network model is completed.
It should be noted that, before training starts, the server may obtain an initial model as the untrained neural network model and send it to each participating end. Thereafter, the server may use the neural network model obtained by aggregating the trained neural network models sent by the participating terminals as the untrained neural network model for the next round.
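Putting the pieces together, the server-side control flow described above can be sketched as a loop over training rounds. The participant methods, the aggregate_state_dicts helper (see the sketch above), and the stop condition training_finished are placeholders for whatever transport, aggregation, and stopping criteria a concrete deployment uses.

```python
def run_federated_training(global_model, participants, max_rounds: int = 100):
    """Server side: broadcast the model, collect locally trained copies, aggregate, repeat."""
    for round_idx in range(max_rounds):
        # Each participant trains the current (not yet fully trained) model on its own data.
        returned = [p.train_locally(global_model) for p in participants]   # placeholder API
        aggregated = aggregate_state_dicts([m.state_dict() for m in returned])
        global_model.load_state_dict(aggregated)
        if training_finished(round_idx):   # placeholder stop condition, e.g. round count or elapsed time
            break
    # Distribute the final aggregated model so that each participating end can store it.
    for p in participants:
        p.receive_final_model(global_model)   # placeholder API
    return global_model
```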
In addition, it should be noted that the federated learning framework employed in this embodiment can be selected by a technician according to actual application requirements or application scenarios. Currently, common federated learning frameworks include TensorFlow Federated, PySyft, FATE, and the like. As an example, the neural network model may be constructed and trained using the federated learning framework PySyft.
Optionally, different federated learning methods may be selected to train the neural network model according to the task of the neural network model to be trained, the characteristics of the training data of each participating end, and so on. In general, federated learning includes horizontal federated learning, vertical federated learning, and federated transfer learning.
As an example, if the task of the neural network model is face recognition, and the training data of different participating terminals share largely the same user features but have little overlap in users, a horizontal federated learning method may be adopted: the training data of each participating end is split horizontally, that is, along the user dimension, and data with the same user features but different users is used to train the neural network model, as in the sketch below.
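A minimal illustration of such a horizontal partition, assuming two participants that record the same (invented) user features for disjoint sets of users:

```python
import numpy as np

# Both participants record the same user features (columns) for different users (rows).
feature_names = ["age", "avg_spend", "visits_per_month"]
participant_a_rows = np.array([[34, 120.5, 3], [51, 80.0, 1]])   # users known only to A
participant_b_rows = np.array([[29, 300.0, 7], [42, 55.5, 2]])   # users known only to B

# Horizontal federated learning trains one model over the union of such rows
# without either participant ever handing its raw rows to the other.
print(participant_a_rows.shape, participant_b_rows.shape)   # (2, 3) (2, 3): same feature dimension
```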
In some optional implementations of this embodiment, the server may be one of the participating terminals. For example, the execution body may act as both the server and a participating end. In this case, the execution body may send the untrained neural network model to each of the other participating ends so that each participating end trains it with its own local training data to obtain a trained neural network model, and the execution body itself likewise trains the untrained neural network model with its local training data.
The execution body can then receive the trained neural network models sent by the participating ends and aggregate them together with its own locally trained model to obtain an aggregated neural network model. If training of the aggregated neural network model is complete, the aggregated model can be stored and sent to each participating end. If not, the aggregated model can be treated as the untrained neural network model and sent to each participating end for further training.
In this way, the steps of building a separate server and establishing trust between the server and each participating end can be omitted, simplifying the overall processing flow for the neural network model.
In some optional implementations of this embodiment, the training data used to train the untrained neural network model may contain no one-dimensional data; that is, the training data is high-dimensional data with two or more dimensions. Specifically, during training of the neural network model, the training data can be loaded for training by reading it from its original storage path.
In the prior art, training machine learning models with federated learning generally only supports one-dimensional data, such as signal data and time-series data, and cannot handle high-dimensional data represented as matrices, multi-dimensional lists, and the like, such as images, video, and speech, so federated learning cannot be applied to training neural network models that process high-dimensional data. To address this problem, when the neural network model is trained here, the training data is loaded and used by reading its original storage path, so that federated learning can be applied to training neural network models that process high-dimensional data, further improving the processing capability of the neural network model. A sketch of this path-based loading is given below.
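A minimal sketch of this path-based loading, assuming a PyTorch/torchvision environment and an illustrative directory layout of <root>/<label>/<image>.jpg; neither the layout nor the transform is prescribed by the disclosure.

```python
from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class PathBackedFaceDataset(Dataset):
    """Keeps only file paths; decodes the high-dimensional sample when it is requested."""
    def __init__(self, root: str):
        self.samples = sorted(Path(root).glob("*/*.jpg"))        # e.g. <root>/<label>/<image>.jpg
        self.labels = sorted({p.parent.name for p in self.samples})
        self.to_tensor = transforms.Compose([transforms.Resize((112, 112)), transforms.ToTensor()])

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path = self.samples[idx]
        image = Image.open(path).convert("RGB")                  # read from the original storage path
        return self.to_tensor(image), self.labels.index(path.parent.name)
```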
In some optional implementations of this embodiment, the training data local to each participant may be encrypted data. Specifically, the training data may be encrypted using various existing encryption algorithms. The encryption algorithm used by each participating peer may be the same.
This protects the security of each participating end's training data: because the trained neural network model is obtained by training on encrypted data, it is difficult for the server, after receiving the trained model from a participating end, to infer information such as the distribution of that participating end's training data from the model, preventing the training data of each participating end from leaking during joint training.
In some optional implementations of this embodiment, after receiving the untrained neural network model from the server, each participating end may train it using local training data and encrypt the model being trained during the training process, thereby obtaining an encrypted trained neural network model.
Specifically, when the model is trained, a perturbation (e.g., random noise) may be added to it to encrypt it. When decryption is needed, the trained neural network model can be decrypted by removing the corresponding perturbation.
In this way, the security of the exchange of trained neural network models between each participating end and the server can be further guaranteed, preventing malicious parties from recovering information about the trained neural network model by reverse inference or similar means.
Optionally, the model being trained may be encrypted using a differential privacy technique, so that the neural network model is naturally encrypted during the training process, ensuring both the security of the neural network model and the training speed. A sketch of such a perturbed training step follows.
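A rough sketch of such a perturbed training step, implemented here as gradient clipping plus Gaussian noise in the style of DP-SGD; the clipping norm and noise scale are arbitrary illustrative values, not parameters taken from the disclosure.

```python
import torch
from torch import nn

def noisy_training_step(model: nn.Module, optimizer, criterion, features, labels,
                        max_grad_norm: float = 1.0, noise_std: float = 0.01):
    """One training step with gradient clipping and Gaussian noise (DP-SGD flavour)."""
    optimizer.zero_grad()
    loss = criterion(model(features), labels)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    for param in model.parameters():
        if param.grad is not None:
            param.grad.add_(torch.randn_like(param.grad) * noise_std)
    optimizer.step()
    return loss.item()
```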
In some optional implementation manners of this embodiment, after each participant terminal obtains the trained neural network model, the trained neural network model may be encrypted by using a preset encryption algorithm to obtain an encrypted trained neural network model, and then the encrypted trained neural network model is determined as the trained neural network model and sent to the server terminal.
Specifically, various existing encryption algorithms can be used to encrypt the trained neural network model. The encryption algorithm used by the various participating peers may be the same. For example, the trained neural network model may be encrypted using a homomorphic encryption technique.
In this way, after each participant receives the aggregated neural network model from the server, the aggregated neural network model can be decrypted according to the used encryption algorithm, and then the decrypted neural network model is trained continuously by using local training data. Therefore, the safety of the multi-participant joint training neural network model is further ensured.
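As one illustrative choice of preset encryption algorithm, the following sketch encrypts a flat list of model parameters with the Paillier homomorphic scheme via the python-paillier (phe) package, so that an aggregator can add ciphertexts without seeing the plaintext parameters. Key distribution and the flattening of model tensors into plain floats are simplifying assumptions.

```python
from phe import paillier

def encrypt_weights(weights, public_key):
    """Encrypt a flat list of model parameters before sending them to the server."""
    return [public_key.encrypt(float(w)) for w in weights]

def decrypt_weights(encrypted, private_key):
    """Decrypt the (aggregated) parameters received back from the server."""
    return [private_key.decrypt(c) for c in encrypted]

# Illustrative usage; in practice every participant would share the same public key.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)
ciphertexts = encrypt_weights([0.12, -0.85, 1.4], public_key)
summed = [a + b for a, b in zip(ciphertexts, ciphertexts)]   # the server can add ciphertexts
print([round(v, 6) for v in decrypt_weights(summed, private_key)])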
In some alternative implementations of this embodiment, the neural network model trained by each participant based on federated learning may be a convolutional neural network, so that a variety of convolution operators can be used. Specifically, when the neural network model to be trained is constructed, it can be loaded and packaged into a single structure, so as to support the application of convolution operators.
In the prior art, machine learning models trained with federated learning, such as XGBoost and support vector machines, generally only support the operators of simple machine learning (such as the Laplace operator) and cannot support more complex operators. To address this problem, the neural network model is loaded and packaged into a structure, and the required convolution operators are realized through that structure, further improving the processing capability of a neural network model trained with federated learning.
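One way to read "loading and packaging the model into a structure" is to wrap the convolutional layers in a single model object whose convolution operators are then available to the federated training loop. A minimal, assumed face-recognition-style CNN in PyTorch (the layer sizes and input resolution are illustrative only):

```python
import torch
from torch import nn

class SimpleFaceCNN(nn.Module):
    """A small convolutional network packaged as one trainable structure."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 28 * 28, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, start_dim=1))

# Sanity check with a 112x112 RGB input (shape assumptions are illustrative).
logits = SimpleFaceCNN()(torch.randn(1, 3, 112, 112))
print(logits.shape)  # torch.Size([1, 10])
```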
In some optional implementation manners of this embodiment, data transmission between the server and each participant may be implemented based on a preset acceleration library (e.g., a Wsaccel acceleration library, etc.), so as to accelerate a training process of the neural network model and improve training efficiency.
The method provided by the above embodiments of the disclosure applies federated learning to the training of a neural network model, thereby solving, in a legally compliant way, the problem of data isolation among the training data of different participating terminals while guaranteeing the security of each participating terminal's training data. Moreover, with the adjustments to the training process described above, the neural network model trained based on federated learning can support reading high-dimensional data, support more complex convolution operators, and offer stronger data protection and faster data transmission, further improving the efficiency of the whole training process and the processing capability of the trained neural network model.
With further reference to FIG. 3, a flow 300 of one embodiment of a face recognition method is shown. The process 300 of the face recognition method includes the following steps:
step 301, obtaining a face image to be recognized.
In the present embodiment, the face image may refer to an image for presenting a face. The face image to be recognized can be any face image. For example, the face image may be a face image designated in advance, or may be a face image currently detected or photographed.
The execution subject of the face recognition method can acquire the face image to be recognized locally or from other electronic devices, or capture it with its own image acquisition device (such as a camera).
Step 302, inputting the face image into a pre-trained face recognition model to obtain a recognition result.
In this embodiment, the face recognition model can be obtained by training through the method as shown in the embodiment of fig. 2. At this time, the neural network model in the embodiment of fig. 2 may be a face recognition model.
The face recognition model can be used to recognize the identity of the person to whom the face in the face image belongs. Accordingly, the recognition result obtained from the face recognition model can represent the identity of that person.
And 303, generating prompt information according to the identification result.
In this embodiment, different prompt information may be generated according to different application scenarios and application requirements and according to the recognition result. For example, if the recognition result is used to represent that the person to which the face corresponding to the face image belongs to the designated user group, prompt information for prompting that the verification is successful may be generated. If the recognition result is used for representing that the person to which the face corresponding to the face image belongs does not belong to the designated user group, prompt information for prompting verification failure can be generated.
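Steps 301 to 303 can be illustrated with the following sketch; the model, image path, and allowed-identity set are placeholders, and the recognition model is assumed to output class logits over known identities.

```python
import torch
from PIL import Image
from torchvision import transforms

def recognize_and_prompt(model, image_path: str, allowed_ids: set) -> str:
    """Step 301: load the face image; step 302: run the model; step 303: build a prompt."""
    preprocess = transforms.Compose([transforms.Resize((112, 112)), transforms.ToTensor()])
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    model.eval()
    with torch.no_grad():
        predicted_id = int(model(image).argmax(dim=1))
    if predicted_id in allowed_ids:
        return "Verification succeeded"
    return "Verification failed"

# prompt = recognize_and_prompt(face_model, "face.jpg", allowed_ids={0, 3, 7})
```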
The method provided by this embodiment of the disclosure jointly trains the face recognition model using the training data of multiple participating ends. This solves problems such as data isolation among the training data of different participating ends, ensures the data security of each participating end, and allows the model to learn feature representations from every participating end's training data, so that the trained face recognition model generalizes better and can be applied to the various application scenarios of the different participating ends with improved recognition accuracy.
With further reference to FIG. 4, a timing diagram 400 of one embodiment of a system for training a neural network model based on federated learning is shown. The system for training a neural network model based on federated learning in the embodiment of the disclosure may include a server (e.g., the server 104 shown in fig. 1) and at least two participating ends (e.g., the participating ends 101, 102, 103 shown in fig. 1). It should be noted that the number of participating ends may be two or more, depending on the actual application scenario.
As shown in fig. 4, in step 401, the server sends untrained neural network models to at least two participating terminals respectively.
In the present embodiment, the neural network model may be any of various types of neural network models, such as a deep neural network, a recurrent neural network, or a convolutional neural network. The processing data, processing procedure, and processing result of the neural network model may differ according to actual application requirements and application scenarios. For example, neural network models may be used for face recognition, speech recognition, video tracking, and the like. The untrained neural network model may be a model that has not been trained at all or one whose training has not yet been completed.
In step 402, each participant terminal trains the received untrained neural network model using local training data to obtain a trained neural network model.
In this embodiment, after receiving the untrained neural network model from the server, each participant may train the untrained neural network model using its local training data. The corresponding training data may be different according to the neural network model. For example, if the neural network model is used for face recognition, the training data may include a face image and a face recognition result corresponding to the face image.
Specifically, the untrained neural network model may be trained by using a machine learning method until a preset training stop condition is satisfied, so as to obtain the trained neural network model. The training stopping condition can be flexibly set according to actual application requirements and application scenes. For example, training stop conditions may include, but are not limited to: the training time reaches a preset duration, the training times reach preset times, the value of the loss function or the attenuation amplitude reaches a preset threshold value, and the like.
It should be noted that different participating sites may have different training data locally. In some cases, there may also be an intersection of the local training data of the different participating ends. Generally, the trained neural network model obtained by each participant is different.
And step 403, each participant sends the trained neural network model to the server.
And step 404, the server side aggregates the received trained neural network models to obtain an aggregated neural network model.
In this embodiment, after receiving the trained neural network models respectively sent by each participating end, the server may perform aggregation processing on each trained neural network model to fuse the characteristics of the trained neural network models of each participating end, so as to obtain an aggregated neural network model.
The specifically adopted aggregation processing algorithm can be flexibly set according to actual application requirements and application scenes. For example, the aggregation process may refer to averaging or summing network parameters of the trained neural network models, and so on.
Step 405, in response to determining that the training of the aggregated neural network model is completed, sending the aggregated neural network model to at least two participating terminals.
In this embodiment, after obtaining the aggregated neural network model, the server may first determine whether training of the aggregated neural network model is completed. Specifically, the server may determine whether training of the aggregated neural network model is completed according to a preset stop condition. The preset stop condition can be flexibly set according to actual application requirements and application scenes. For example, the preset stop conditions include, but are not limited to: the training time reaches the preset duration, the training times reaches the preset times, and the like.
If the training of the aggregated neural network model is completed, the server side can send the aggregated neural network model to each participating side, so that each participating side can process corresponding data by using the trained neural network model.
And step 406, each participating end stores the received trained and aggregated neural network model.
Optionally, in step 405, if the server determines that training of the aggregated neural network model is not complete, the aggregated, not yet fully trained neural network model may be sent to each participating end, so that each participating end continues to train it.
Specifically, each participating end may treat the aggregated, not yet fully trained neural network model received from the server as an untrained neural network model and continue training it with local training data until training of the neural network model is completed.
It should be noted that, before training starts, the server may obtain an initial model as the untrained neural network model and send it to each participating end. Thereafter, the server may use the neural network model obtained by aggregating the trained neural network models sent by the participating terminals as the untrained neural network model for the next round.
In addition, it should be noted that the federated learning framework employed in this embodiment can be selected by a technician according to actual application requirements or application scenarios. Currently, common federated learning frameworks include TensorFlow Federated, PySyft, FATE, and the like. As an example, the neural network model may be constructed and trained using the federated learning framework PySyft.
Optionally, different federated learning methods may be selected to train the neural network model according to the task of the neural network model to be trained, the characteristics of the training data of each participating end, and so on. In general, federated learning includes horizontal federated learning, vertical federated learning, and federated transfer learning.
As an example, if the task of the neural network model is face recognition, and the training data of different participating terminals share largely the same user features but have little overlap in users, a horizontal federated learning method may be adopted: the training data of each participating end is split horizontally, that is, along the user dimension, and data with the same user features but different users is used to train the neural network model.
Optionally, the server may be one of the participating terminals.
Optionally, the training data local to each participant for training the untrained neural network model may not include one-dimensional data.
Alternatively, the training data local to each participant may be encrypted data.
Optionally, after each participant receives the untrained neural network model from the server, the participant can train the untrained neural network model by using local training data, and encrypt the untrained neural network model in the training process, so as to obtain the trained neural network model which is encrypted.
Optionally, the model being trained may be encrypted using a differential privacy technique, so that the neural network model is naturally encrypted during the training process, ensuring both the security of the neural network model and the training speed.
Optionally, after each participant terminal obtains the trained neural network model, the trained neural network model may be encrypted by using a preset encryption algorithm to obtain an encrypted trained neural network model, and then the encrypted trained neural network model is determined as the trained neural network model and sent to the server terminal.
Alternatively, the neural network model trained by each participating end based on federated learning may be a convolutional neural network, so that a variety of convolution operators can be used.
Optionally, data transmission between the server and each participant can be implemented based on a preset acceleration library (e.g., a Wsaccel acceleration library, etc.), so as to accelerate the training process of the neural network model and improve the training efficiency.
It should be noted that, for the content that is not described in detail in this embodiment, reference may be made to the related description in the embodiment corresponding to fig. 2, and details are not described here again.
The system provided by the above embodiment of the present disclosure applies federated learning to the training of a neural network model, thereby solving, in a legally compliant way, the problem of data isolation among the training data of different participating terminals while ensuring the security of each participating terminal's training data. Moreover, with the adjustments to the training process described above, the neural network model trained based on federated learning can support reading high-dimensional data, support more complex convolution operators, and offer stronger data protection and faster data transmission, further improving the efficiency of the whole training process and the processing capability of the trained neural network model.
With further reference to fig. 5, as an implementation of the method shown in fig. 2, the present disclosure provides an embodiment of an apparatus for training a neural network model based on federated learning. This apparatus embodiment corresponds to the method embodiment shown in fig. 2, and the apparatus may be applied to various electronic devices.
As shown in fig. 5, the apparatus 500 for training a neural network model based on federated learning provided in this embodiment includes a receiving unit 501, a training unit 502, and a storage unit 503. The receiving unit 501 is configured to receive an untrained neural network model sent by a server; the training unit 502 is configured to train the untrained neural network model using local training data, obtain a trained neural network model, and send the trained neural network model to the server, wherein the server is configured to perform the following processing steps: aggregating the trained neural network models received from each participating terminal to obtain an aggregated neural network model, and sending the aggregated neural network model to each participating terminal in response to determining that training of the aggregated neural network model is completed; the storage unit 503 is configured to store the trained, aggregated neural network model.
In the present embodiment, in the apparatus 500 for training a neural network model based on federated learning: the specific processing of the receiving unit 501, the training unit 502, and the storage unit 503 and their technical effects can refer to the descriptions of step 201, step 202, and step 203 in the corresponding embodiment of fig. 2, which are not repeated here.
In some optional implementations of this embodiment, the processing step further includes: and in response to determining that the aggregated neural network model is not trained completely, sending the aggregated neural network model to each participant terminal to continue training.
In some optional implementations of this embodiment, the training data does not include one-dimensional data.
In some optional implementation manners of this embodiment, the server is one of the participating terminals.
In some optional implementations of this embodiment, the training data is encrypted data.
In some optional implementations of this embodiment, the training unit 502 is further configured to train the untrained neural network model using local training data, and encrypt the trained neural network model during the training process to obtain the trained neural network model.
In some optional implementation manners of this embodiment, the training unit 502 is further configured to perform encryption processing on the trained neural network model by using a preset encryption algorithm, so as to obtain an encrypted trained neural network model; and determining the encrypted trained neural network model as the trained neural network model and sending the trained neural network model to the server.
In some optional implementations of the present embodiment, the neural network model is a convolutional neural network.
In some optional implementation manners of this embodiment, the data transmission between the server and each participant is implemented based on a preset acceleration library.
According to the device provided by the embodiment of the disclosure, the untrained neural network model sent by the server is received by the receiving unit; the training unit trains an untrained neural network model by using local training data to obtain a trained neural network model, and sends the trained neural network model to the server, wherein the server is used for executing the following processing steps: aggregating the trained neural network models received from each participating terminal to obtain aggregated neural network models, and sending the aggregated neural network models to each participating terminal in response to determining that the training of the aggregated neural network models is completed; and the storage unit stores the trained and aggregated neural network model. Therefore, the problem of data isolation between training data of different participating ends can be solved legally and compliantly, the safety of the training data of each participating end is guaranteed, and the processing capacity and the generalization capacity of the trained neural network model can be greatly improved and the processing precision is improved due to the fact that the training data of the different participating ends are used for training the neural network model in a combined mode.
With further reference to fig. 6, as an implementation of the method shown in fig. 3, the present disclosure provides an embodiment of a face recognition apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 3, and the apparatus may be applied to various electronic devices.
As shown in fig. 6, the face recognition apparatus 600 provided in this embodiment includes an acquisition unit 601, a recognition unit 602, and a generation unit 603. The acquisition unit 601 is configured to acquire a face image to be recognized; the recognition unit 602 is configured to input the face image into a face recognition model trained in advance to obtain a recognition result, wherein the face recognition model is obtained by training through the method shown in the embodiment of fig. 2; the generation unit 603 is configured to generate prompt information according to the recognition result.
In the present embodiment, in the face recognition apparatus 600: the specific processing of the acquisition unit 601, the recognition unit 602, and the generation unit 603 and their technical effects can refer to the descriptions of step 301, step 302, and step 303 in the corresponding embodiment of fig. 3, which are not repeated here.
According to the device provided by the embodiment of the disclosure, the face image to be recognized is acquired through the acquisition unit, the recognition unit inputs the face image into the face recognition model obtained based on the federal learning training to obtain the recognition result, and the generation unit generates the prompt information according to the recognition result. Therefore, the problems of data isolation and the like existing between training data of different participating ends can be solved, the data safety of the different participating ends can be guaranteed, and the feature expression of the model trained by the training data of each participating end can be learned, so that the trained face recognition model has stronger generalization, and the face recognition model can be applied to various different application scenes related to the different participating ends, and the recognition accuracy of the face recognition model is improved.
Referring now to fig. 7, a schematic structural diagram of an electronic device (e.g., the participant terminal in fig. 1) 700 suitable for implementing embodiments of the present disclosure is shown. The electronic device shown in fig. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 701, which may perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. Various programs and data necessary for the operation of the electronic device 700 are also stored in the RAM 703. The processing device 701, the ROM 702 and the RAM 703 are connected to one another via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope and the like; output devices 707 including, for example, a liquid crystal display (LCD), a speaker, a vibrator and the like; a storage device 708 including, for example, a magnetic tape, a hard disk and the like; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 illustrates an electronic device 700 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided. Each block shown in fig. 7 may represent one device or may represent multiple devices as needed.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication means 709, or may be installed from the storage means 708, or may be installed from the ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of embodiments of the present disclosure.
It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. A computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the embodiments of the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: an electrical wire, an optical cable, RF (radio frequency), or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device, or it may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive an untrained neural network model sent by a server; train the untrained neural network model by using local training data to obtain a trained neural network model, and send the trained neural network model to the server, wherein the server is configured to perform the following processing steps: aggregating the trained neural network models received from each participant end to obtain an aggregated neural network model, and, in response to determining that training of the aggregated neural network model is complete, sending the aggregated neural network model to each participant end; and store the trained and aggregated neural network model.
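The disclosure does not fix a particular aggregation rule for the server-side processing steps above. As one plausible, non-limiting realization, the sketch below averages the uploaded model parameters (a FedAvg-style rule) and redistributes the result; collect_updates, broadcast and the stopping rule are hypothetical placeholders introduced only for this example.

# Server-side sketch (one assumed realization, not the patented method):
# collect the trained models uploaded by the participant ends, average their
# parameters, and either redistribute the aggregated model for another round
# or, once training is judged complete, send out the final model.
from typing import Dict, List
import torch

def aggregate_state_dicts(updates: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    # Element-wise average of the participants' model parameters.
    aggregated = {}
    for name in updates[0]:
        aggregated[name] = torch.stack([u[name].float() for u in updates]).mean(dim=0)
    return aggregated

def server_loop(collect_updates, broadcast, max_rounds: int = 50):
    for round_idx in range(max_rounds):
        updates = collect_updates()                    # one state dict per participant end
        aggregated = aggregate_state_dicts(updates)
        training_done = round_idx == max_rounds - 1    # assumed stopping rule
        broadcast(aggregated, training_done)           # participants store it when done
        if training_done:
            return aggregated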
Computer program code for carrying out the operations of the embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including a receiving unit, a training unit and a storage unit. In some cases, the name of a unit does not constitute a limitation on the unit itself; for example, the receiving unit may also be described as "a unit that receives an untrained neural network model sent by the server".
The foregoing description is only a description of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.

Claims (16)

1. A method for training a neural network model based on federated learning, applied to a participant end, the method comprising:
receiving an untrained neural network model sent by a server;
training the untrained neural network model by using local training data to obtain a trained neural network model, and sending the trained neural network model to the server, wherein the server is configured to perform the following processing steps: aggregating the trained neural network models received from each participant end to obtain an aggregated neural network model, and, in response to determining that training of the aggregated neural network model is complete, sending the aggregated neural network model to each participant end;
and storing the trained and aggregated neural network model.
2. The method of claim 1, wherein the processing steps further comprise:
in response to determining that training of the aggregated neural network model is not complete, sending the aggregated neural network model to each participant end to continue training.
3. The method of claim 1, wherein the training data does not include one-dimensional data.
4. The method of claim 1, wherein the server is one of the participants.
5. The method of claim 1, wherein the training data is encrypted data.
6. The method of claim 1, wherein the training the untrained neural network model with local training data to obtain a trained neural network model comprises:
training the untrained neural network model by using local training data, and encrypting the neural network model in the training process, to obtain the trained neural network model.
7. The method of claim 1, wherein the training the untrained neural network model with local training data to obtain a trained neural network model, and sending the trained neural network model to the server comprises:
encrypting the trained neural network model by using a preset encryption algorithm to obtain an encrypted trained neural network model;
and determining the encrypted trained neural network model as the trained neural network model and sending the trained neural network model to the server.
8. The method of claim 1, wherein the neural network model is a convolutional neural network.
9. The method according to one of claims 1 to 8, wherein the data transmission between the server and each participant is implemented based on a preset acceleration library.
10. A face recognition method, comprising:
acquiring a face image to be recognized;
inputting the face image into a face recognition model trained in advance to obtain a recognition result, wherein the face recognition model is obtained by training according to the training method of any one of claims 1 to 9;
and generating prompt information according to the identification result.
11. A system for training a neural network model based on federated learning, comprising a server end and at least two participant ends, wherein:
the server side sends untrained neural network models to the at least two participating sides respectively;
for a participant end of the at least two participant ends, the participant end trains a received untrained neural network model by using local training data to obtain a trained neural network model, and sends the trained neural network model to the server end;
the server side aggregates the received trained neural network models to obtain an aggregated neural network model, and sends the aggregated neural network model to the at least two participating terminals in response to the fact that the training of the aggregated neural network model is completed;
and the at least two participating terminals store the received trained and aggregated neural network model.
12. The system of claim 11, wherein the server end sends the aggregated neural network model to each participant end to continue training in response to determining that training of the aggregated neural network model is not complete.
13. An apparatus for training a neural network model based on federated learning, applied to a participant end, the apparatus comprising:
the receiving unit is configured to receive the untrained neural network model sent by the server;
a training unit configured to train the untrained neural network model by using local training data to obtain a trained neural network model, and to send the trained neural network model to the server, wherein the server is configured to perform the following processing steps: aggregating the trained neural network models received from each participant end to obtain an aggregated neural network model, and, in response to determining that training of the aggregated neural network model is complete, sending the aggregated neural network model to each participant end;
a storage unit configured to store the trained aggregated neural network model.
14. A face recognition apparatus comprising:
an acquisition unit configured to acquire a face image to be recognized;
a recognition unit configured to input the face image to a face recognition model trained in advance to obtain a recognition result, wherein the face recognition model is obtained by training according to the training method of one of claims 1 to 9;
and the generating unit is configured to generate prompt information according to the identification result.
15. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-10.
16. A computer-readable medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1-10.
CN202011352528.7A 2020-11-27 2020-11-27 Method, device and system for training neural network model based on federal learning Pending CN113807157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011352528.7A CN113807157A (en) 2020-11-27 2020-11-27 Method, device and system for training neural network model based on federal learning

Publications (1)

Publication Number Publication Date
CN113807157A true CN113807157A (en) 2021-12-17

Family

ID=78943466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011352528.7A Pending CN113807157A (en) 2020-11-27 2020-11-27 Method, device and system for training neural network model based on federal learning

Country Status (1)

Country Link
CN (1) CN113807157A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200293887A1 (en) * 2019-03-11 2020-09-17 doc.ai, Inc. System and Method with Federated Learning Model for Medical Research Applications
CN110442457A (en) * 2019-08-12 2019-11-12 北京大学深圳研究生院 Model training method, device and server based on federation's study
CN110874484A (en) * 2019-10-16 2020-03-10 众安信息技术服务有限公司 Data processing method and system based on neural network and federal learning
CN111190487A (en) * 2019-12-30 2020-05-22 中国科学院计算技术研究所 Method for establishing data analysis model
CN111428881A (en) * 2020-03-20 2020-07-17 深圳前海微众银行股份有限公司 Recognition model training method, device, equipment and readable storage medium
CN111860832A (en) * 2020-07-01 2020-10-30 广州大学 Method for enhancing neural network defense capacity based on federal learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114530028A (en) * 2022-02-14 2022-05-24 大连理工大学 Campus student intelligent bracelet monitoring system and method based on LoRa communication and federal learning
CN114530028B (en) * 2022-02-14 2023-04-25 大连理工大学 Campus student intelligent bracelet monitoring system and method based on LoRa communication and federal learning
CN116798103A (en) * 2023-08-29 2023-09-22 广州诚踏信息科技有限公司 Artificial intelligence-based face image processing method and system
CN116798103B (en) * 2023-08-29 2023-12-01 广州诚踏信息科技有限公司 Artificial intelligence-based face image processing method and system

Similar Documents

Publication Publication Date Title
CN110189192B (en) Information recommendation model generation method and device
CN113159327B (en) Model training method and device based on federal learning system and electronic equipment
KR102215245B1 (en) Blockchain data protection using quasi-homogeneous encryption
CN112232527B (en) Safe distributed federal deep learning method
EP3616356B1 (en) Preventing misrepresentation of input data by participants in a secure multi-party computation
CN111428887B (en) Model training control method, device and system based on multiple computing nodes
CN110011793A (en) Anti-fake data processing method of tracing to the source, device, equipment and medium
CN113609508A (en) Block chain-based federal learning method, device, equipment and storage medium
CN114731274A (en) Secure federation of distributed stochastic gradient descent
CN111506909A (en) Silver tax data interaction method and system
CN113129149A (en) Transaction risk identification method and device based on block chain and safe multi-party calculation
CN113807157A (en) Method, device and system for training neural network model based on federal learning
CN111046857A (en) Face recognition method, device, equipment, medium and system based on knowledge federation
CN114117386A (en) Conference management method and device, computer readable storage medium and electronic device
CN113591097A (en) Service data processing method and device, electronic equipment and storage medium
CN113722753B (en) Private data processing method, device and system based on blockchain
CN114547658A (en) Data processing method, device, equipment and computer readable storage medium
CN116502732B (en) Federal learning method and system based on trusted execution environment
CN112949866A (en) Poisson regression model training method and device, electronic equipment and storage medium
CN116110159B (en) User authentication method, device and medium based on CFCA authentication standard
US20230418794A1 (en) Data processing method, and non-transitory medium and electronic device
CN113821811B (en) Block chain-based data acquisition method and system, electronic equipment and storage medium
CN111832046B (en) Trusted data certification method based on blockchain technology
CN114595474A (en) Federal learning modeling optimization method, electronic device, medium, and program product
CN110276403B (en) Model building method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination