CN113807157B - Method, device and system for training neural network model based on federal learning


Info

Publication number
CN113807157B
CN113807157B (application CN202011352528.7A)
Authority
CN
China
Prior art keywords
neural network
network model
training
trained
aggregated
Prior art date
Legal status
Active
Application number
CN202011352528.7A
Other languages
Chinese (zh)
Other versions
CN113807157A (en)
Inventor
毛伟
王希予
张立平
裴积全
Current Assignee
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd filed Critical Jingdong Technology Holding Co Ltd
Priority to CN202011352528.7A
Publication of CN113807157A
Application granted
Publication of CN113807157B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure disclose methods, devices, and systems for training neural network models based on federal learning. One embodiment of the method, applied to a participating end, comprises: receiving an untrained neural network model sent by a server; training the untrained neural network model with local training data to obtain a trained neural network model and sending the trained neural network model to the server, where the server aggregates the trained neural network models received from the participating ends to obtain an aggregated neural network model and, in response to determining that training of the aggregated neural network model is complete, transmits the aggregated neural network model to each participating end; and storing the aggregated neural network model whose training is complete. This embodiment enables federal learning to be applied to the training of neural networks.

Description

Method, device and system for training neural network model based on federal learning
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method and apparatus for training a neural network model based on federal learning.
Background
With the development of technologies such as cloud computing and distributed storage, the era of big data has arrived. In the big data era, all kinds of information, such as the business activities of an enterprise, project or product descriptions, user behavior data, the natural and social environment, and the economic and political situation, can be recorded to form valuable data assets. Analyzing and exploiting big data to capture objective laws has driven the development and application of artificial intelligence.
With continued research and application, artificial intelligence has shown its advantages in many industries such as driverless vehicles, healthcare and finance. Researchers therefore hope to apply more complex and more effective artificial intelligence techniques in many fields. However, the data available in many fields is usually limited and often of poor quality, which directly hinders the deployment of artificial intelligence technology. Researchers have therefore asked whether data from different data sources can be fused to overcome the problems of limited and low-quality data, but breaking the barriers between different data sources is very difficult in many cases. For example, different enterprises hold different data, and data privacy and data security concerns between enterprises leave each enterprise's data in an isolated island that cannot be used jointly.
Federal learning is an emerging artificial intelligence technology. Its original design goal is to carry out efficient machine learning among multiple participants (e.g. multiple computing nodes or multiple user terminals) while guaranteeing data security, data privacy and legal compliance during data exchange, for example the training and application of models such as support vector machines and XGBoost (eXtreme Gradient Boosting). The participants involved in federal learning typically have equal status, contribute jointly and share the results. Compared with traditional distributed learning, federal learning keeps the data exchange process legal and compliant and avoids migrating confidential or private data, so that private data is not disclosed.
Disclosure of Invention
Embodiments of the present disclosure provide methods, apparatus, and systems for training neural network models based on federal learning.
In a first aspect, embodiments of the present disclosure provide a method for training a neural network model based on federal learning, applied to a participating end, the method comprising: receiving an untrained neural network model sent by a server; training the untrained neural network model by using local training data to obtain a trained neural network model, and sending the trained neural network model to a server, wherein the server is used for executing the following processing steps: aggregating the trained neural network models received from the participating ends to obtain aggregated neural network models, and transmitting the aggregated neural network models to the participating ends in response to determining that the aggregated neural network models are trained; the aggregated neural network model after training is stored.
In some embodiments, the above processing step further comprises: and in response to determining that the aggregated neural network model is not trained, transmitting the aggregated neural network model to each participant to continue training.
In some embodiments, the training data does not include one-dimensional data.
In some embodiments, the service end is one of the participating ends.
In some embodiments, the training data is encrypted data.
In some embodiments, training the untrained neural network model using the local training data to obtain a trained neural network model includes: training the untrained neural network model by using the local training data, and encrypting the trained neural network model in the training process to obtain the trained neural network model.
In some embodiments, training the untrained neural network model using the local training data to obtain a trained neural network model, and sending the trained neural network model to the server, including: encrypting the trained neural network model by using a preset encryption algorithm to obtain an encrypted trained neural network model; and determining the encrypted trained neural network model as the trained neural network model and sending the encrypted trained neural network model to the server.
In some embodiments, the neural network model is a convolutional neural network.
In some embodiments, the data transmission between the server and each participant is implemented based on a preset acceleration library.
In a second aspect, embodiments of the present disclosure provide a face recognition method, including: acquiring a face image to be recognized; inputting a face image into a pre-trained face recognition model to obtain a recognition result, wherein the face recognition model is obtained through training by the method described in any implementation manner in the first aspect; and generating prompt information according to the identification result.
In a third aspect, embodiments of the present disclosure provide a system for training a neural network model based on federal learning, including a server and at least two participating ends. The server sends an untrained neural network model to each of the at least two participating ends; each of the at least two participating ends trains the received untrained neural network model with its local training data to obtain a trained neural network model and sends the trained neural network model to the server; the server aggregates the received trained neural network models to obtain an aggregated neural network model and, in response to determining that training of the aggregated neural network model is complete, transmits the aggregated neural network model to the at least two participating ends; and the at least two participating ends store the received aggregated neural network model whose training is complete.
In some embodiments, the server transmits the aggregated neural network model to each participant to continue training in response to determining that the aggregated neural network model is not trained.
In some embodiments, the training data does not include one-dimensional data.
In some embodiments, the service end is one of the participating ends.
In some embodiments, the training data is encrypted data.
In some embodiments, for a participant of the at least two participants, the participant trains an untrained neural network model using local training data and encrypts the trained neural network model during the training process to obtain a trained neural network model.
In some embodiments, for a participant in the at least two participants, the participant encrypts the trained neural network model using a preset encryption algorithm to obtain an encrypted trained neural network model; and determining the encrypted trained neural network model as the trained neural network model and sending the encrypted trained neural network model to the server.
In some embodiments, the neural network model is a convolutional neural network.
In some embodiments, the data transmission between the server and each participant is implemented based on a preset acceleration library.
In a fourth aspect, embodiments of the present disclosure provide an apparatus for training a neural network model based on federal learning, for application to a participant, the apparatus comprising: the receiving unit is configured to receive the untrained neural network model sent by the server; the training unit is configured to train the untrained neural network model by utilizing the local training data to obtain a trained neural network model, and send the trained neural network model to the server, wherein the server is used for executing the following processing steps: aggregating the trained neural network models received from the participating ends to obtain aggregated neural network models, and transmitting the aggregated neural network models to the participating ends in response to determining that the aggregated neural network models are trained; and a storage unit configured to store the aggregated neural network model after training is completed.
In some embodiments, the above processing step further comprises: and in response to determining that the aggregated neural network model is not trained, transmitting the aggregated neural network model to each participant to continue training.
In some embodiments, the training data does not include one-dimensional data.
In some embodiments, the service end is one of the participating ends.
In some embodiments, the training data is encrypted data.
In some embodiments, the training unit is further configured to train the untrained neural network model using the local training data, and encrypt the trained neural network model during the training process to obtain a trained neural network model.
In some embodiments, the training unit is further configured to encrypt the trained neural network model by using a preset encryption algorithm to obtain an encrypted trained neural network model; and determining the encrypted trained neural network model as the trained neural network model and sending the encrypted trained neural network model to the server.
In some embodiments, the neural network model is a convolutional neural network.
In some embodiments, the data transmission between the server and each participant is implemented based on a preset acceleration library.
In a fifth aspect, embodiments of the present disclosure provide a face recognition apparatus, including: an acquisition unit configured to acquire a face image to be recognized; a recognition unit configured to input the face image into a pre-trained face recognition model to obtain a recognition result, wherein the face recognition model is trained by the method described in any implementation manner of the first aspect; and a generating unit configured to generate prompt information according to the recognition result.
In a sixth aspect, embodiments of the present disclosure provide an electronic device, comprising: one or more processors; a storage means for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described in any implementation manner of the first aspect or the second aspect.
In a seventh aspect, embodiments of the present disclosure provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect or the second aspect.
According to the method, apparatus and system for training a neural network model based on federal learning provided by embodiments of the present disclosure, each participating end of federal learning trains the same neural network model with its own local training data, the trained models are then sent to the server for aggregation, and the participating ends continue training until the server determines that training of the aggregated neural network model is complete, at which point the trained neural network model can be stored for subsequent use. Federal learning and neural networks are thus combined: federal learning is applied to the training of neural networks, the problems of isolation and fragmentation among training data belonging to different data sources can be solved legally and compliantly, and the processing capability and generalization of the trained neural network can be improved.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure may be applied;
FIG. 2 is a flow chart of one embodiment of a method of training a neural network model based on federal learning, according to the present disclosure;
FIG. 3 is a flow chart of one embodiment of a face recognition method according to the present disclosure;
FIG. 4 is a timing diagram of one embodiment of a federal learning-based training neural network model system according to the present disclosure;
FIG. 5 is a schematic structural view of one embodiment of an apparatus for training a neural network model based on federal learning according to the present disclosure;
fig. 6 is a schematic structural diagram of one embodiment of a face recognition device according to the present disclosure;
Fig. 7 is a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary architecture 100 to which embodiments of the method or apparatus for training a neural network model based on federal learning of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include participating ends 101, 102, 103 and a server 104. The communication connections between the participating ends 101, 102, 103 and the server 104 are not limited to any specific connection manner or connection type, and may be, for example, wired or wireless communication links or optical fiber cables.
Data interaction may be performed between the participating ends 101, 102, 103 and the server 104 to receive or transmit data (e.g., a neural network model). Various applications or tools may be installed on each of the participating ends 101, 102, 103 and the server 104 to implement different data processing functions, for example a framework for building a neural network model, a tool for training a neural network model, and the like.
The participating ends 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices, including but not limited to various terminal devices (e.g., laptop computers, desktop computers, tablet computers) and various servers. When they are software, they may be installed in the electronic devices listed above and implemented as multiple pieces of software or software modules (e.g., for providing distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The server 104 may be a server that provides various services, such as a server that performs aggregation or the like on the neural network models transmitted by the participating terminals 101, 102, 103. Further, the server may also send the aggregated neural network model to the participating ends 101, 102, 103.
It should be noted that, the method based on the federal learning training neural network model provided by the embodiments of the present disclosure is generally performed by the participating terminals 101, 102, 103, and accordingly, the device based on the federal learning training neural network model is generally disposed in the participating terminals 101, 102, 103.
It should also be noted that in some cases, one of the participating terminals 101, 102, 103 may also be a service terminal while acting as a participating terminal.
The server 104 may be hardware or software. When the server 104 is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server 104 is software, it may be implemented as a plurality of software or software modules (e.g., a plurality of software or software modules for providing distributed services), or as a single software or software module. The present invention is not particularly limited herein.
It should be understood that the number of participating and serving ends in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of a method of training a neural network model based on federal learning according to the present disclosure is shown. The method for training the neural network model based on federal learning comprises the following steps:
step 201, receiving an untrained neural network model sent by a server.
In this embodiment, the execution subject of the method for training a neural network model based on federal learning (such as the participating ends 101, 102, 103 shown in fig. 1) may receive, from a server (such as the server 104 shown in fig. 1), a neural network model whose training is not complete.
The neural network model may be of various types, such as a deep neural network, a recurrent neural network or a convolutional neural network. The data processed, the processing procedure and the processing results of the neural network model may differ according to the actual application requirements and scenarios; for example, a neural network model may be used for face recognition, speech recognition, video tracking, and so on. The untrained neural network model may be a model that has not been trained at all or a model whose training has not yet finished.
Step 202, training the untrained neural network model by using the local training data to obtain a trained neural network model, and sending the trained neural network model to the server.
In this embodiment, after receiving the untrained neural network model from the server, the executing entity may train the untrained neural network model using its local training data. The corresponding training data may be different depending on the neural network model. For example, if the neural network model is used for face recognition, the training data may include face images and face recognition results corresponding to the face images.
Specifically, the neural network model which is not trained can be trained by using a machine learning method until a preset training stop condition is met, so that the trained neural network model is obtained. The training stopping condition can be flexibly set according to actual application requirements and application scenes. For example, training stop conditions may include, but are not limited to: the training time reaches a preset duration, the training times reach a preset number of times, the value of the loss function or the attenuation amplitude reaches a preset threshold, etc.
After the execution body trains to obtain the trained neural network model, the trained neural network model can be further sent to the server.
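By way of illustration only, the local training step of a participating end might look like the following PyTorch-style sketch. The model architecture, loss function, optimizer settings and stop condition (a fixed number of epochs here) are assumptions made for this example rather than requirements of the disclosure.

```python
import torch
from torch import nn

def local_train(model: nn.Module, data_loader, epochs: int = 5, lr: float = 0.01):
    """Train the received (untrained) model on the participating end's local data."""
    criterion = nn.CrossEntropyLoss()          # loss depends on the actual task
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):                    # stop condition: fixed number of passes
        for features, labels in data_loader:
            optimizer.zero_grad()
            loss = criterion(model(features), labels)
            loss.backward()
            optimizer.step()
    # the trained parameters are what gets sent back to the server
    return model.state_dict()
```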
It should be noted that, in addition to the execution subject, there may be other participating ends. Different participating ends may hold different local training data, and in some cases the local training data of different participating ends may also overlap. Each participating end may obtain a neural network model trained with its local training data by performing steps 201 and 202 described above. Typically, the trained neural network models obtained by the different participating ends differ from one another.
After receiving the trained neural network models sent by each participant (including the execution subject and other participants), the server may aggregate each trained neural network model to fuse the features of the trained neural network models of each participant, thereby obtaining an aggregated neural network model.
The aggregation processing algorithm adopted specifically can be flexibly set according to actual application requirements and application scenes. For example, the aggregation process may refer to averaging or summing network parameters of the various trained neural network models, and so forth.
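If the aggregation is a plain parameter average over the received models (the averaging case mentioned above, commonly known as a FedAvg-style step), the server-side computation could be sketched as follows; equal weighting of the participating ends is an assumption of the example.

```python
import torch

def aggregate(state_dicts):
    """Average the parameters of the trained models received from all participating ends."""
    aggregated = {}
    for name in state_dicts[0]:
        aggregated[name] = torch.stack(
            [sd[name].float() for sd in state_dicts]
        ).mean(dim=0)
    return aggregated  # load into the model with model.load_state_dict(aggregated)
```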
After the aggregated neural network model is obtained, the server may first determine whether the aggregated neural network model is trained. Specifically, the server may determine whether training of the aggregated neural network model is completed according to a preset stopping condition. The preset stopping conditions can be flexibly set according to actual application requirements and application scenes. For example, preset stop conditions include, but are not limited to: the training time reaches a preset duration, the training times reach a preset number of times, and the like.
If the training of the aggregated neural network model is completed, the aggregated neural network model may be sent to each participating end, so that each participating end may process the corresponding data using the trained neural network model.
Step 203, storing the aggregated neural network model after training.
In this embodiment, after receiving the training-completed and aggregated neural network model sent by the server, the execution body may store the aggregated neural network model for subsequent use. Other participating ends may also store the trained, aggregated neural network model.
In some optional implementations of this embodiment, in step 202, if the server side determines that the aggregated neural network model is not trained, the server side may send the untrained aggregated neural network model to each participant, so that each participant may continuously train the untrained aggregated neural network model.
Specifically, each participating end may determine the untrained, aggregated neural network model received from the server end as an untrained neural network model, and continue performing steps 201-202 described above until the neural network model training is completed.
It should be noted that, before training starts, the server may take an initial model as the untrained neural network model and send it to each participating end. Thereafter, the server may treat the neural network model obtained by aggregating the trained neural network models sent by the participating ends as the untrained neural network model.
In addition, it should be noted that the federal learning framework adopted in this embodiment may be selected by a technician according to the actual application requirements or scenario. Common federal learning frameworks currently include TensorFlow Federated, PySyft, FATE, and the like. As an example, the federal learning framework PySyft may be employed to construct and train the neural network model.
Optionally, different federal learning methods may be selected to train the neural network model according to the task of the neural network model to be trained, the characteristics of the training data of each participating end, and so on. In general, federal learning includes horizontal federated learning, vertical federated learning, and federated transfer learning.
As an example, if the task of the neural network model is face recognition, and the user features in the training data on different participating ends overlap heavily while the users themselves overlap little, horizontal federal learning may be adopted: the training data on each participating end is split horizontally, that is, along the user dimension, and the portions that share the same user features but belong to different users are taken out to train the neural network model.
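A minimal sketch of that horizontal (user-dimension) split, assuming each participating end holds a table with the same feature columns and a user identifier column; the column names are illustrative only.

```python
import pandas as pd

def horizontal_partition(samples: pd.DataFrame, local_user_ids: set) -> pd.DataFrame:
    """Keep only the rows (users) owned by this participating end.

    All participating ends share the same feature columns ("user_id", "feature_1", ...),
    but the sets of users barely overlap, which is the horizontal federal learning setting.
    """
    return samples[samples["user_id"].isin(local_user_ids)].reset_index(drop=True)
```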
In some optional implementations of this embodiment, the server may be one of the participating ends. For example, the execution subject may act both as a participating end and as the server. In that case, the execution subject sends the untrained neural network model to each of the other participating ends so that each of them trains it with its local training data to obtain a trained neural network model, while the execution subject itself also trains the untrained neural network model with its own local training data to obtain a trained neural network model.
The execution subject may then receive the trained neural network models sent by the other participating ends and aggregate them together with its locally trained neural network model to obtain an aggregated neural network model. If training of the aggregated neural network model is complete, the aggregated neural network model can be stored and at the same time sent to each participating end. If training is not complete, the aggregated neural network model can be treated as the untrained neural network model and sent to each participating end to continue training.
In this way, building a separate server and establishing trust between the server and each participating end can be omitted, simplifying the overall processing flow for the neural network model.
In some optional implementations of this embodiment, the training data used to train the untrained neural network model may contain no one-dimensional data; that is, the training data is high-dimensional data such as two-dimensional or multi-dimensional data. Specifically, during training of the neural network model, the training data can be loaded by reading its original storage path.
In the prior art, training machine learning models based on federal learning generally supports only one-dimensional data, such as signal data and time-series data, and cannot handle high-dimensional data such as images, video and speech that are represented as matrices or multi-dimensional lists, so federal learning cannot be applied to training neural network models that process high-dimensional data. To solve this problem, when the neural network model is trained, the training data is loaded by reading its original storage path, so that federal learning can be applied to training neural network models that process high-dimensional data, further improving the processing capability of the neural network model.
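One common way to realize such path-based loading, shown here only as an assumed example built on PyTorch's Dataset interface and the Pillow image reader, is to keep the original storage paths in memory and read the high-dimensional tensors from disk on demand.

```python
from PIL import Image
from torch.utils.data import Dataset

class PathBackedFaceDataset(Dataset):
    """Stores only the original storage paths; the high-dimensional image
    tensors are read from disk on demand during training."""

    def __init__(self, image_paths, labels, transform=None):
        self.image_paths = image_paths   # e.g. ["/data/faces/0001.jpg", ...] (illustrative paths)
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, self.labels[idx]
```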
In some alternative implementations of the present embodiment, the training data local to each participant may be encrypted data. Specifically, the training data may be encrypted using various existing encryption algorithms. The encryption algorithm used by each participating end may be the same.
In this way, the security of each participating end's training data is protected. Moreover, because the untrained neural network model is trained with encrypted training data, after the server receives a trained neural network model from a participating end it is difficult for the server to infer information such as the distribution of that participating end's training data from the model, so the training data of each participating end is not leaked during joint training.
In some optional implementations of this embodiment, after receiving the untrained neural network model from the server, each participating end may train it with local training data and encrypt the model being trained during the training process, thereby obtaining a trained neural network model that is itself encrypted.
Specifically, while the untrained neural network model is being trained, a perturbation (e.g., random noise) may be added to it so as to encrypt it. When decryption is needed, the trained neural network model can be decrypted by removing the corresponding perturbation.
In this way, the security of the exchange of trained neural network models between each participating end and the server is further guaranteed, preventing malicious parties from recovering information about the trained neural network model by reverse inference or similar means.
Optionally, the model being trained can be encrypted using differential privacy, so that the neural network model is naturally encrypted during training, which protects both the security and the training speed of the neural network model.
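A simplified sketch of that idea is given below: Gaussian noise is added to the gradients in each update step so that the uploaded parameters already carry the perturbation. The clipping bound and noise scale are illustrative values, and a production system would rely on a dedicated differential-privacy library with per-sample gradient clipping rather than this hand-rolled loop.

```python
import torch
from torch import nn

def dp_train_step(model: nn.Module, batch, criterion, lr=0.01, clip=1.0, noise_std=0.1):
    """One training step with gradient clipping plus Gaussian noise (DP-SGD style)."""
    features, labels = batch
    model.zero_grad()
    loss = criterion(model(features), labels)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip)   # bound each update
    with torch.no_grad():
        for param in model.parameters():
            if param.grad is not None:
                param.grad += noise_std * torch.randn_like(param.grad)  # the "disturbance"
                param -= lr * param.grad
    return loss.item()
```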
In some optional implementations of this embodiment, after obtaining the trained neural network model, each participating end may firstly encrypt the trained neural network model by using a preset encryption algorithm to obtain an encrypted trained neural network model, and then determine the encrypted trained neural network model as the trained neural network model and send the encrypted trained neural network model to the server.
Specifically, the trained neural network model can be encrypted by using various existing encryption algorithms. The encryption algorithms used by the various participating ends may be the same. For example, the trained neural network model may be encrypted using homomorphic encryption techniques.
In this way, after each participating end receives the aggregated neural network model from the server end, the aggregated neural network model can be decrypted according to the encryption algorithm used, and then the local training data is utilized to train the neural network model obtained after decryption. Therefore, the safety of the neural network model jointly trained by the multiple participating ends is further guaranteed.
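As an illustration of why an additively homomorphic scheme fits this exchange, the following sketch uses the python-paillier (phe) package, which is an assumption of this example and is not named in the disclosure. The server can add the ciphertexts received from the participating ends without seeing any plaintext parameter, and each participating end decrypts the aggregated result locally. In this simplified setting the participating ends share one keypair and only the public key is visible to the server; real deployments would use more elaborate key management.

```python
from phe import paillier

# shared by the participating ends; the server only ever sees the public key
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

def encrypt_params(flat_params):
    """Encrypt a flattened list of model parameters before sending them to the server."""
    return [public_key.encrypt(float(p)) for p in flat_params]

def decrypt_params(encrypted_params, num_participants):
    """Decrypt the aggregated (summed) ciphertexts and finish the average locally."""
    return [private_key.decrypt(c) / num_participants for c in encrypted_params]

# Server side: ciphertexts from different participating ends can simply be added,
# e.g. aggregated[i] = enc_a[i] + enc_b[i], without access to the private key.
```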
In some optional implementations of this embodiment, the neural network model jointly trained by the participating ends through federal learning may be a convolutional neural network, so that a variety of convolution operators can be supported. Specifically, when the neural network model to be trained is constructed, it can be loaded and packaged into a structure so as to support the use of convolution operators.
In the prior art, machine learning models trained with federal learning generally support only XGBoost, support vector machines and the operators used in simple machine learning (e.g. the Laplacian operator), and cannot support more complex operators. To address this problem, the neural network model is loaded and packaged into a structure, and the structure is used to implement the convolution operators, which further improves the processing capability of the neural network model trained with federal learning.
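The disclosure does not spell out the packaging format, but the general idea of bundling a convolutional network and its bookkeeping into one structure could look roughly like the sketch below; the layer sizes and the fields of the wrapper are purely illustrative assumptions.

```python
from dataclasses import dataclass, field
import torch.nn as nn

class SmallConvNet(nn.Module):
    """A minimal convolutional network, i.e. a model that needs convolution operators."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

@dataclass
class FederatedModelPackage:
    """Structure wrapping the network together with bookkeeping for the participating ends."""
    model: nn.Module = field(default_factory=SmallConvNet)
    round_index: int = 0
    participant_id: str = "unset"   # hypothetical metadata fields
```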
In some optional implementations of this embodiment, the data transmission between the server and each participating end may be implemented based on a preset acceleration library (such as Wsaccel acceleration library, etc.), so as to accelerate the training process of the neural network model and improve the training efficiency.
The method provided by the embodiments of the present disclosure applies federal learning to the training of a neural network model. The problem of data isolation between the training data of different participating ends is thereby solved legally and compliantly, the security of each participating end's training data is guaranteed, and, because the neural network model is jointly trained with the training data of different participating ends, the processing capability, generalization ability and processing precision of the trained neural network model are greatly improved. Moreover, through adjustments to the training process, the neural network model trained based on federal learning can support reading high-dimensional data, support more complex convolution operators, provide stronger data protection and faster data transmission, and so on, further improving the efficiency of the whole training process and the processing capability of the trained neural network model.
With further reference to fig. 3, a flow 300 of one embodiment of a face recognition method is shown. The face recognition method flow 300 includes the following steps:
Step 301, a face image to be recognized is acquired.
In this embodiment, the face image may refer to an image for presenting a face. The face image to be recognized may be any face image. For example, the image may be a pre-designated face image, or may be a currently detected or photographed face image.
The execution subject of the face recognition method can acquire the face image to be recognized from local or other electronic equipment, and can also acquire the face image by utilizing an image acquisition device (such as a camera and the like) of the execution subject.
Step 302, inputting the face image into a pre-trained face recognition model to obtain a recognition result.
In this embodiment, the face recognition model may be trained by the method described above in the embodiment of fig. 2. At this time, the neural network model in the embodiment of fig. 2 may be a face recognition model.
The face recognition model can be used for recognizing the identity of the person to whom the face image corresponds. Therefore, the identification result obtained by using the face identification model can be used for representing the identity of the person to whom the face image corresponds.
Step 303, generating prompt information according to the identification result.
In this embodiment, different prompt information may be generated from the recognition result depending on the application scenario and requirements. For example, if the recognition result indicates that the person whose face appears in the face image belongs to a specified user group, prompt information indicating successful verification may be generated; if the recognition result indicates that the person does not belong to the specified user group, prompt information indicating failed verification may be generated.
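Putting steps 301 to 303 together, a participating end holding the trained face recognition model might run something like the sketch below; the preprocessing, the label convention (class 1 meaning "belongs to the specified user group") and the prompt strings are assumptions for illustration only.

```python
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((112, 112)),
    transforms.ToTensor(),
])

def recognize_and_prompt(model: torch.nn.Module, image_path: str) -> str:
    """Step 301: acquire the image; step 302: run the model; step 303: build the prompt."""
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    model.eval()
    with torch.no_grad():
        in_group = model(image).argmax(dim=1).item() == 1   # assumed label convention
    return "Verification succeeded" if in_group else "Verification failed"
```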
The method provided by this embodiment jointly trains the face recognition model with the training data of multiple participating ends. This solves problems such as data isolation between the training data of different participating ends, guarantees the data security of each participating end, and allows the model to learn the feature representations contained in each participating end's training data, so that the trained face recognition model generalizes well and can be applied to the various application scenarios of the different participating ends with improved recognition accuracy.
With further reference to FIG. 4, a timing diagram 400 of one embodiment of a system for training a neural network model based on federal learning is shown. The system in embodiments of the present disclosure may include a server (e.g., server 104 shown in fig. 1) and at least two participating ends (e.g., participating ends 101, 102, 103 shown in fig. 1). Depending on the actual application scenario, the number of participating ends may be greater than two.
As shown in fig. 4, in step 401, the server sends untrained neural network models to at least two participating terminals, respectively.
In this embodiment, the neural network model may be of various types, such as a deep neural network, a recurrent neural network or a convolutional neural network. The data processed, the processing procedure and the processing results of the neural network model may differ according to the actual application requirements and scenarios; for example, a neural network model may be used for face recognition, speech recognition, video tracking, and so on. The untrained neural network model may be a model that has not been trained at all or a model whose training has not yet finished.
In step 402, each participating end trains the received untrained neural network model with its local training data to obtain a trained neural network model.
In this embodiment, after receiving the untrained neural network model from the server, each participant may train the untrained neural network model using its local training data. The corresponding training data may be different depending on the neural network model. For example, if the neural network model is used for face recognition, the training data may include face images and face recognition results corresponding to the face images.
Specifically, the neural network model which is not trained can be trained by using a machine learning method until a preset training stop condition is met, so that the trained neural network model is obtained. The training stopping condition can be flexibly set according to actual application requirements and application scenes. For example, training stop conditions may include, but are not limited to: the training time reaches a preset duration, the training times reach a preset number of times, the value of the loss function or the attenuation amplitude reaches a preset threshold, etc.
It should be noted that different participating ends may hold different local training data, and in some cases the local training data of different participating ends may also overlap. Typically, the trained neural network models obtained by the different participating ends differ from one another.
In step 403, each participant sends the trained neural network model to the server.
Step 404, the server aggregates the received trained neural network models to obtain aggregated neural network models.
In this embodiment, after receiving the trained neural network models sent by each participant, the server may aggregate each trained neural network model to fuse the features of the trained neural network model of each participant, thereby obtaining an aggregated neural network model.
The aggregation processing algorithm adopted specifically can be flexibly set according to actual application requirements and application scenes. For example, the aggregation process may refer to averaging or summing network parameters of the various trained neural network models, and so forth.
Step 405: in response to determining that training of the aggregated neural network model is complete, the server transmits the aggregated neural network model to the at least two participating ends.
In this embodiment, after obtaining the aggregated neural network model, the server may first determine whether training of the aggregated neural network model is completed. Specifically, the server may determine whether training of the aggregated neural network model is completed according to a preset stopping condition. The preset stopping conditions can be flexibly set according to actual application requirements and application scenes. For example, preset stop conditions include, but are not limited to: the training time reaches a preset duration, the training times reach a preset number of times, and the like.
If the training of the aggregated neural network model is completed, the server side can send the aggregated neural network model to each participating side so that each participating side can process corresponding data by using the trained neural network model.
In step 406, each participant stores the received training-completed aggregated neural network model.
Optionally, in step 405, if the server determines that the aggregated neural network model is not trained, the server may send the untrained aggregated neural network model to each participating end, so that each participating end may continuously train the untrained aggregated neural network model.
Specifically, each participating end may determine the untrained, aggregated neural network model received from the server end as an untrained neural network model, and continue training using the local training data until the neural network model training is completed.
It should be noted that, before training starts, the server may take an initial model as the untrained neural network model and send it to each participating end. Thereafter, the server may treat the neural network model obtained by aggregating the trained neural network models sent by the participating ends as the untrained neural network model.
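Read end to end, the server side of the timing diagram amounts to a round loop along the following lines. The transport calls (send_to, receive_from) and the aggregate function are placeholders for whatever communication layer and aggregation algorithm are actually used, and the fixed round count stands in for the preset stop condition.

```python
def run_federated_training(initial_model, participants, max_rounds: int,
                           send_to, receive_from, aggregate):
    """Server-side round loop: distribute, collect, aggregate, repeat until the stop condition."""
    current_model = initial_model
    for round_index in range(max_rounds):          # stop condition: fixed number of rounds
        for p in participants:
            send_to(p, current_model)              # step 401 / continue-training branch
        trained_models = [receive_from(p) for p in participants]   # step 403
        current_model = aggregate(trained_models)  # step 404
    for p in participants:
        send_to(p, current_model)                  # step 405: final model to every participating end
    return current_model
```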
In addition, it should be noted that the federal learning framework adopted in this embodiment may be selected by a technician according to the actual application requirements or scenario. Common federal learning frameworks currently include TensorFlow Federated, PySyft, FATE, and the like. As an example, the federal learning framework PySyft may be employed to construct and train the neural network model.
Optionally, different federal learning methods may be selected to train the neural network model according to the task of the neural network model to be trained, the characteristics of the training data of each participating end, and so on. In general, federal learning includes horizontal federated learning, vertical federated learning, and federated transfer learning.
As an example, if the task of the neural network model is face recognition, and the user features in the training data on different participating ends overlap heavily while the users themselves overlap little, horizontal federal learning may be adopted: the training data on each participating end is split horizontally, that is, along the user dimension, and the portions that share the same user features but belong to different users are taken out to train the neural network model.
Alternatively, the service end may be one of the participating ends.
Alternatively, the training data local to each participating end used to train the untrained neural network model may contain no one-dimensional data.
Alternatively, the training data local to each participant may be encrypted data.
Optionally, after receiving the untrained neural network model from the server, each participating end may train it with local training data and encrypt the model being trained during the training process, thereby obtaining a trained neural network model that is itself encrypted.
Optionally, the model being trained can be encrypted using differential privacy, so that the neural network model is naturally encrypted during training, which protects both the security and the training speed of the neural network model.
Optionally, after each participating end obtains the trained neural network model, the trained neural network model may be encrypted by using a preset encryption algorithm to obtain an encrypted trained neural network model, and then the encrypted trained neural network model is determined to be the trained neural network model and sent to the server.
Alternatively, the neural network model jointly trained by the participating ends through federal learning may be a convolutional neural network, so that a variety of convolution operators can be supported.
Optionally, data transmission between the server and each participating end may be implemented based on a preset acceleration library (such as Wsaccel acceleration library, etc.), so as to accelerate the training process of the neural network model and improve training efficiency.
It should be noted that, in this embodiment, details not described in detail may refer to the related descriptions in the corresponding embodiment of fig. 2, and are not described herein.
The system provided by the embodiments of the present disclosure applies federal learning to the training of a neural network model. The problem of data isolation between the training data of different participating ends is thereby solved legally and compliantly, the security of each participating end's training data is guaranteed, and, because the neural network model is jointly trained with the training data of different participating ends, the processing capability, generalization ability and processing precision of the trained neural network model are greatly improved. Moreover, through adjustments to the training process, the neural network model trained based on federal learning can support reading high-dimensional data, support more complex convolution operators, provide stronger data protection and faster data transmission, and so on, further improving the efficiency of the whole training process and the processing capability of the trained neural network model.
With further reference to fig. 5, as an implementation of the method shown in fig. 2 described above, the present disclosure provides one embodiment of an apparatus for federally learning-based training of neural network models, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in a variety of electronic devices.
As shown in fig. 5, the apparatus 500 for training a neural network model based on federal learning according to the present embodiment includes a receiving unit 501, a training unit 502, and a storage unit 503. Wherein, the receiving unit 501 is configured to receive an untrained neural network model sent by a server; the training unit 502 is configured to train the untrained neural network model by using the local training data, obtain a trained neural network model, and send the trained neural network model to the server, where the server is configured to perform the following processing steps: aggregating the trained neural network models received from the participating ends to obtain aggregated neural network models, and transmitting the aggregated neural network models to the participating ends in response to determining that the aggregated neural network models are trained; the storage unit 503 is configured to store the aggregated neural network model after training is completed.
In this embodiment, in the apparatus 500 for training a neural network model based on federal learning: the specific processing of the receiving unit 501, the training unit 502 and the storage unit 503 and the technical effects thereof may refer to the descriptions related to step 201, step 202 and step 203 in the corresponding embodiment of fig. 2, and are not repeated herein.
In some optional implementations of this embodiment, the foregoing processing step further includes: and in response to determining that the aggregated neural network model is not trained, transmitting the aggregated neural network model to each participant to continue training.
In some optional implementations of this embodiment, the training data does not include one-dimensional data.
In some optional implementations of this embodiment, the service end is one of the participating ends.
In some optional implementations of this embodiment, the training data is encrypted data.
In some optional implementations of this embodiment, the training unit 502 is further configured to train the neural network model that is not trained using the local training data, and encrypt the trained neural network model during the training process to obtain the trained neural network model.
In some optional implementations of this embodiment, the training unit 502 is further configured to encrypt the trained neural network model with a preset encryption algorithm to obtain an encrypted trained neural network model; and determining the encrypted trained neural network model as the trained neural network model and sending the encrypted trained neural network model to the server.
In some optional implementations of this embodiment, the neural network model is a convolutional neural network.
In some optional implementations of this embodiment, the data transmission between the server and each participant is implemented based on a preset acceleration library.
The device provided by the embodiment of the disclosure receives the untrained neural network model sent by the server through the receiving unit; the training unit trains an untrained neural network model by using local training data to obtain a trained neural network model, and sends the trained neural network model to the server, wherein the server is used for executing the following processing steps: aggregating the trained neural network models received from the participating ends to obtain aggregated neural network models, and transmitting the aggregated neural network models to the participating ends in response to determining that the aggregated neural network models are trained; the storage unit stores the aggregated neural network model after training is completed. Therefore, the problem of data isolation between training data of different participating ends can be solved legally and reasonably, meanwhile, the safety of the training data of each participating end is guaranteed, and the processing capacity and generalization capacity of the neural network model after training can be greatly improved and the processing precision is improved due to the fact that the neural network model is jointly trained by the training data of different participating ends.
With further reference to fig. 6, as an implementation of the method shown in fig. 3, the disclosure provides an embodiment of a face recognition apparatus, which corresponds to the method embodiment shown in fig. 3, and which is particularly applicable to various electronic devices.
As shown in fig. 6, the face recognition apparatus 600 provided in this embodiment includes an acquisition unit 601, a recognition unit 602, and a generation unit 603. The acquisition unit 601 is configured to acquire a face image to be recognized; the recognition unit 602 is configured to input the face image into a pre-trained face recognition model to obtain a recognition result, wherein the face recognition model is trained by the method shown in the embodiment of fig. 2; the generation unit 603 is configured to generate prompt information according to the recognition result.
In this embodiment, in the face recognition apparatus 600: for the specific processing of the acquisition unit 601, the recognition unit 602 and the generation unit 603 and the technical effects thereof, reference may be made to the descriptions of step 301, step 302 and step 303 in the embodiment corresponding to fig. 3, which are not repeated here.
According to the device provided by this embodiment of the disclosure, the acquisition unit acquires the face image to be recognized, the recognition unit inputs the face image into the face recognition model obtained by training based on federal learning to obtain a recognition result, and the generation unit generates prompt information according to the recognition result. In this way, the data isolation between the training data of different participating ends is overcome and the data security of each participating end is preserved, while the model learns the feature representations of the training data of every participating end; the trained face recognition model therefore generalizes better, can be applied to the various application scenarios of the different participating ends, and achieves higher recognition accuracy.
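A sketch of how these units might cooperate at inference time is shown below. The face recognition model is treated as an opaque callable returning an identity and a confidence score; that interface, the threshold, and all names are illustrative assumptions, not part of the disclosure.

```python
# Minimal sketch (assumption: the federated face recognition model is an opaque
# callable image -> (identity, confidence); threshold and names are illustrative).
import numpy as np

def recognize(face_image, model):
    """Recognition unit: run the acquired face image through the trained model."""
    return model(face_image)

def generate_prompt(identity, confidence, threshold=0.8):
    """Generation unit: turn the recognition result into prompt information."""
    if confidence >= threshold:
        return f"Recognized {identity} (confidence {confidence:.2f})"
    return "Face not recognized, please try again"

# Usage with a stand-in model and a blank "acquired" image
dummy_model = lambda img: ("user_001", 0.93)
face = np.zeros((112, 112, 3))                     # acquisition unit output (placeholder)
print(generate_prompt(*recognize(face, dummy_model)))
```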
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., the participant in fig. 1) 700 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device shown in fig. 7 is only one example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processing unit, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. Various programs and data required for the operation of the electronic device 700 are also stored in the RAM 703. The processing device 701, the ROM 702, and the RAM 703 are connected to one another through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
In general, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, and the like; output devices 707 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; storage devices 708 including, for example, a magnetic tape, a hard disk, and the like; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 shows an electronic device 700 having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided. Each block shown in fig. 7 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 709, or installed from storage 708, or installed from ROM 702. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 701.
It should be noted that, the computer readable medium according to the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In an embodiment of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Whereas in embodiments of the present disclosure, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device, or may exist separately without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive an untrained neural network model sent by a server; train the untrained neural network model using local training data to obtain a trained neural network model and send the trained neural network model to the server, wherein the server performs the following processing steps: aggregating the trained neural network models received from the participating ends to obtain an aggregated neural network model, and transmitting the aggregated neural network model to the participating ends in response to determining that the aggregated neural network model has completed training; and store the aggregated neural network model after training is completed.
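The participant-side control flow enumerated above can be sketched as a single loop; the transport helpers and the local training routine below are hypothetical placeholders introduced only for illustration, not an API defined by the disclosure.

```python
# Minimal sketch (assumption: receive_from_server yields (model, done_flag);
# all four callables are hypothetical placeholders for illustration).
def run_participant(receive_from_server, send_to_server, train_locally, save_model):
    """Participant-side control flow for federated training."""
    while True:
        model, training_done = receive_from_server()
        if training_done:
            save_model(model)           # store the final aggregated model locally
            return model
        trained = train_locally(model)  # local training data never leaves this end
        send_to_server(trained)         # upload the (perturbed/encrypted) update
```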
Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor, for example described as: a processor including a receiving unit, a training unit, and a storage unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the receiving unit may also be described as "a unit that receives an untrained neural network model sent by the server".
The foregoing description covers only the preferred embodiments of the present disclosure and explains the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to the specific combinations of the above technical features, and also encompasses other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions in which the above features are replaced with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims (14)

1. A method for training a neural network model based on federal learning, applied to a participating end, the method comprising:
Receiving an untrained neural network model sent by a server;
reading the training data by reading the original storage path of the local training data, training the untrained neural network model using the local training data to obtain a trained neural network model, and sending the trained neural network model to the server, wherein the server is configured to execute the following processing steps: aggregating the trained neural network models received from the participating ends to obtain an aggregated neural network model, and transmitting the aggregated neural network model to the participating ends in response to determining that the aggregated neural network model has completed training, wherein the neural network model is a convolutional neural network and, when the neural network model to be trained is constructed, the neural network model is loaded and encapsulated into a structure body to support a convolutional operator;
Storing the aggregated neural network model after training is completed;
wherein training the untrained neural network model using the local training data to obtain a trained neural network model comprises: training the untrained neural network model using local training data, and encrypting the neural network model by adding disturbance during the training process, to obtain the trained neural network model.
2. The method of claim 1, wherein the processing step further comprises:
in response to determining that the aggregated neural network model has not completed training, transmitting the aggregated neural network model to each participating end to continue training.
3. The method of claim 1, wherein the training data does not include one-dimensional data.
4. The method of claim 1, wherein the server is one of the participating ends.
5. The method of claim 1, wherein the training data is encrypted data.
6. The method of claim 1, wherein training the untrained neural network model using the local training data to obtain a trained neural network model and sending the trained neural network model to the server comprises:
Encrypting the trained neural network model by using a preset encryption algorithm to obtain an encrypted trained neural network model;
determining the encrypted trained neural network model as the trained neural network model, and sending the encrypted trained neural network model to the server.
7. The method according to one of claims 1 to 6, wherein the data transmission between the server and each participant is implemented based on a preset acceleration library.
8. A face recognition method, comprising:
acquiring a face image to be recognized;
inputting the face image into a pre-trained face recognition model to obtain a recognition result, wherein the face recognition model is obtained by training according to the training method of one of claims 1 to 7;
generating prompt information according to the recognition result.
9. A system for training a neural network model based on federal learning comprises a server side and at least two participating sides;
The server side sends untrained neural network models to the at least two participating sides respectively;
For each participating end of the at least two participating ends, the participating end reads the training data by reading the original storage path of the local training data, trains the received untrained neural network model using the local training data to obtain a trained neural network model, and sends the trained neural network model to the server side;
The server side aggregates the received trained neural network models to obtain an aggregated neural network model and transmits the aggregated neural network model to the at least two participating ends in response to determining that the aggregated neural network model has completed training, wherein the neural network model is a convolutional neural network and, when the neural network model to be trained is constructed, the neural network model is loaded and encapsulated into a structure body to support a convolutional operator;
the at least two participating ends store the received aggregated neural network model after training is completed;
the participating end trains the untrained neural network model using local training data and encrypts the neural network model by adding disturbance during the training process, to obtain the trained neural network model.
10. The system of claim 9, wherein the server side transmits the aggregated neural network model to each participating end to continue training in response to determining that the aggregated neural network model has not completed training.
11. An apparatus for training a neural network model based on federal learning, for use at a participating end, the apparatus comprising:
The receiving unit is configured to receive the untrained neural network model sent by the server;
The training unit is configured to read the training data by reading the original storage path of the local training data, train the untrained neural network model using the local training data to obtain a trained neural network model, and send the trained neural network model to the server, wherein the server is configured to execute the following processing steps: aggregating the trained neural network models received from the participating ends to obtain an aggregated neural network model, and transmitting the aggregated neural network model to the participating ends in response to determining that the aggregated neural network model has completed training, wherein the neural network model is a convolutional neural network and, when the neural network model to be trained is constructed, the neural network model is loaded and encapsulated into a structure body to support a convolutional operator;
a storage unit configured to store the aggregated neural network model after training is completed;
wherein the training unit is further configured to: train the untrained neural network model using local training data, and encrypt the neural network model by adding disturbance during the training process, to obtain the trained neural network model.
12. A face recognition device, comprising:
An acquisition unit configured to acquire a face image to be recognized;
A recognition unit configured to input the face image into a pre-trained face recognition model to obtain a recognition result, wherein the face recognition model is trained by the training method according to one of claims 1 to 7;
and the generating unit is configured to generate prompt information according to the identification result.
13. An electronic device, comprising:
One or more processors;
a storage device having one or more programs stored thereon;
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-7.
14. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-7.
CN202011352528.7A 2020-11-27 2020-11-27 Method, device and system for training neural network model based on federal learning Active CN113807157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011352528.7A CN113807157B (en) 2020-11-27 2020-11-27 Method, device and system for training neural network model based on federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011352528.7A CN113807157B (en) 2020-11-27 2020-11-27 Method, device and system for training neural network model based on federal learning

Publications (2)

Publication Number Publication Date
CN113807157A CN113807157A (en) 2021-12-17
CN113807157B true CN113807157B (en) 2024-07-19

Family

ID=78943466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011352528.7A Active CN113807157B (en) 2020-11-27 2020-11-27 Method, device and system for training neural network model based on federal learning

Country Status (1)

Country Link
CN (1) CN113807157B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114530028B (en) * 2022-02-14 2023-04-25 大连理工大学 Campus student intelligent bracelet monitoring system and method based on LoRa communication and federal learning
CN116798103B (en) * 2023-08-29 2023-12-01 广州诚踏信息科技有限公司 Artificial intelligence-based face image processing method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442457A (en) * 2019-08-12 2019-11-12 北京大学深圳研究生院 Model training method, device and server based on federation's study
CN111860832A (en) * 2020-07-01 2020-10-30 广州大学 Method for enhancing neural network defense capacity based on federal learning

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020185973A1 (en) * 2019-03-11 2020-09-17 doc.ai incorporated System and method with federated learning model for medical research applications
EP3970074A1 (en) * 2019-05-16 2022-03-23 FRAUNHOFER-GESELLSCHAFT zur Förderung der angewandten Forschung e.V. Concepts for federated learning, client classification and training data similarity measurement
CN110399742B (en) * 2019-07-29 2020-12-18 深圳前海微众银行股份有限公司 Method and device for training and predicting federated migration learning model
CN110874484A (en) * 2019-10-16 2020-03-10 众安信息技术服务有限公司 Data processing method and system based on neural network and federal learning
CN111190487A (en) * 2019-12-30 2020-05-22 中国科学院计算技术研究所 Method for establishing data analysis model
CN111428881B (en) * 2020-03-20 2021-12-07 深圳前海微众银行股份有限公司 Recognition model training method, device, equipment and readable storage medium
CN111507481B (en) * 2020-04-17 2023-03-03 腾讯科技(深圳)有限公司 Federated learning system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442457A (en) * 2019-08-12 2019-11-12 北京大学深圳研究生院 Model training method, device and server based on federation's study
CN111860832A (en) * 2020-07-01 2020-10-30 广州大学 Method for enhancing neural network defense capacity based on federal learning

Also Published As

Publication number Publication date
CN113807157A (en) 2021-12-17

Similar Documents

Publication Publication Date Title
CN110245510B (en) Method and apparatus for predicting information
CN112232527B (en) Safe distributed federal deep learning method
CN110189192B (en) Information recommendation model generation method and device
CN113159327B (en) Model training method and device based on federal learning system and electronic equipment
KR102145701B1 (en) Prevent false display of input data by participants in secure multi-party calculations
CN111428887B (en) Model training control method, device and system based on multiple computing nodes
US11410081B2 (en) Machine learning with differently masked data in secure multi-party computing
CN113807157B (en) Method, device and system for training neural network model based on federal learning
CN111506909A (en) Silver tax data interaction method and system
CN113505520A (en) Method, device and system for supporting heterogeneous federated learning
CN112434620B (en) Scene text recognition method, device, equipment and computer readable medium
CN113591097A (en) Service data processing method and device, electronic equipment and storage medium
CN111553443A (en) Training method and device for referee document processing model and electronic equipment
CN112949866B (en) Training method and device of poisson regression model, electronic equipment and storage medium
CN114006769A (en) Model training method and device based on horizontal federal learning
CN113722738A (en) Data protection method, device, medium and electronic equipment
CN116110159B (en) User authentication method, device and medium based on CFCA authentication standard
US20230418794A1 (en) Data processing method, and non-transitory medium and electronic device
CN111858753A (en) Block chain-based training parameter processing method, device and storage medium
CN114298699B (en) Method for generating and acquiring non-homogeneous general evidence and device
CN113032838B (en) Label prediction model generation method, prediction method, model generation device, system and medium based on privacy calculation
CN115757933A (en) Recommendation information generation method, device, equipment, medium and program product
CN114595474A (en) Federal learning modeling optimization method, electronic device, medium, and program product
CN110276403B (en) Model building method and device
CN116384461A (en) Model optimization training method and device based on joint learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant