CN113516227A - Neural network training method and device based on federal learning - Google Patents


Info

Publication number
CN113516227A
CN113516227A (application CN202110639374.8A)
Authority
CN
China
Prior art keywords
loss function
neural network
target
weight matrix
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110639374.8A
Other languages
Chinese (zh)
Other versions
CN113516227B (en)
Inventor
许奕星
陈汉亭
王云鹤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202110639374.8A priority Critical patent/CN113516227B/en
Publication of CN113516227A publication Critical patent/CN113516227A/en
Application granted granted Critical
Publication of CN113516227B publication Critical patent/CN113516227B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a neural network training method and device based on federated learning, applicable to the field of artificial intelligence. The method comprises the following steps: a target loss function comprising a first loss function and a second loss function is constructed, where the first loss function characterizes the probability that labeled data input into the neural network is predicted incorrectly, and the second loss function characterizes the probability that the prediction result for unlabeled data input into the neural network does not belong to a preset classification category. Each client then uses the target loss function to train its local target neural network on its local training set (each local training set comprises both labeled and unlabeled data) and uploads the resulting weight matrix to a server for integration; the integrated weight matrix is sent back to each client for further training, and this is repeated until a training termination condition is reached. The target loss function constructed in this way accounts for the contributions of both the labeled data and the unlabeled data on each client.

Description

Neural network training method and device based on federated learning
Technical Field
The application relates to the field of federated learning, and in particular to a neural network training method and device based on federated learning.
Background
In recent years, neural networks have been widely applied in the field of computer vision (e.g., in application scenarios such as object detection, image classification and semantic segmentation), and high-quality data plays an irreplaceable role in their training. As a result, different vendors increasingly protect their data, which easily leads to the problem of data islands, i.e., high-quality data does not circulate among vendors. Federated learning (FL) emerged to address this. Federated learning, also known as federated machine learning, joint learning or alliance learning, can effectively help multiple client devices perform data usage and machine-learning modeling while meeting the requirements of user privacy protection, data security and government regulation. It allows a neural network to be trained jointly on the data of each client without the clients sharing that data, improving the performance of the neural network while protecting users' data privacy.
For example, one federated learning method is federated averaging (FedAvg). A FedAvg deployment generally comprises a server and a plurality of clients, and is suitable for scenarios in which the data on each client is labeled. Specifically, as shown in fig. 1, the process mainly consists of a network-parameter issuing step and a network-parameter aggregation step. In the issuing step, each client downloads the network parameters of the neural network from the server, trains on its local data, and uploads the updated network parameters to the server once training has progressed sufficiently. In the aggregation step, the server collects the network parameters uploaded by each client and fuses them. The two steps are iterated until the neural network converges.
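The parameter issuing/aggregation cycle described above can be sketched in a few lines of Python. This is an illustrative rendering rather than the patent's implementation: `local_update` is a hypothetical callback standing in for local training on one client, and the weighted average is the standard FedAvg aggregation rule.

```python
import numpy as np

def fedavg_round(server_weights, client_data, local_update, sizes=None):
    """One FedAvg round: each client downloads the current server
    weights, trains locally, and the server averages the uploads."""
    # Issuing step: every client starts from a copy of the server weights.
    client_weights = [local_update(server_weights.copy(), d)
                      for d in client_data]
    if sizes is None:                     # unweighted average by default
        sizes = [1.0] * len(client_weights)
    total = float(sum(sizes))
    # Aggregation step: weighted average of the clients' weight matrices.
    return sum(w * (s / total) for w, s in zip(client_weights, sizes))
```

In FedAvg proper, `sizes` would be each client's local dataset size, so clients with more data contribute more to the fused parameters.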
Existing federated learning methods assume that the local data on each client is labeled. In practice, constrained by time and labeling capability, a client may label only part of its data. The industry has proposed the federated matching (FedMatch) algorithm, which combines semi-supervised learning with federated learning to address the case where only part of the data on a client is labeled, but it presupposes that labeled data exists on the server for auxiliary training, which contradicts the federated-learning goal of protecting user data privacy. Therefore, under the premise that each client has only partially labeled data from a subset of categories, the server has no data to assist training, and only the parameters of the neural network may be transmitted between client and server, how to train a neural network with excellent performance is an urgent problem to be solved.
Disclosure of Invention
The embodiments of the application provide a neural network training method and device based on federated learning. The method solves the problem of federated learning in the case where only partially labeled data exists on a client (i.e., a first device) by constructing a target loss function that differs from the traditional federated-learning formulation.
Based on this, the embodiment of the present application provides the following technical solutions:
In a first aspect, an embodiment of the present application first provides a neural network training method based on federated learning, usable in the field of artificial intelligence. The method includes: first, a first device obtains a target loss function from a first loss function and a second loss function; then the first device (also referred to as a first client) uses the target loss function to train a target neural network on the first device (i.e., the local target neural network) with a training set on the first device (i.e., a first training set), obtaining a first weight matrix of the local target neural network, where the first training set includes both a plurality of labeled data (referred to as first labeled data) and a plurality of unlabeled data (referred to as first unlabeled data). After obtaining the first weight matrix, the first device sends it to a service device (e.g., a server or another client). Finally, the first device receives an integrated weight matrix sent by the service device, obtained by the service device integrating the first weight matrix with a second weight matrix, where the second weight matrix is the weight matrix obtained by a second device using the target loss function to train a target neural network on the second device with a second training set on the second device. Similarly, the second training set also includes a plurality of second labeled data and a plurality of second unlabeled data, and the unlabeled data in the second training set may likewise belong to a plurality of classification categories (merely unlabeled).
It should be noted here that the target neural network on the first device and the target neural network on the second device have the same network structure, i.e., the same initial network model; in the subsequent training process, however, the weight matrices obtained by their respective training differ.
In the above embodiments of the present application, the problem of federated learning in the case where only partially labeled data exists on a client (i.e., the first device) is solved by constructing a target loss function that differs from the traditional federated-learning formulation. The target loss function considers not only the contribution of the labeled data but also the contribution of the unlabeled data, thereby improving the performance of the target neural network obtained by the joint training of multiple clients (i.e., the first device and the second device) based on this loss function.
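The first-aspect flow on the first device — train locally with the target loss, upload the first weight matrix, then adopt the integrated matrix — can be outlined as below. All three callables are placeholders assumed for illustration; the patent does not prescribe this interface.

```python
def client_round(current_weights, local_train, upload, download):
    """One client-side round of the first aspect.

    local_train: runs training with the target loss on the local
                 labeled + unlabeled training set (placeholder).
    upload:      sends the first weight matrix to the service device.
    download:    returns the integrated weight matrix sent back.
    """
    first_weight_matrix = local_train(current_weights)
    upload(first_weight_matrix)
    integrated = download()
    # The integrated matrix becomes the current weights for the next round.
    return integrated
```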
In a possible implementation manner of the first aspect, the first loss function in the constructed target loss function may specifically include a first sub-loss function and a second sub-loss function, where the first sub-loss function characterizes the probability that the prediction result for labeled data input to the neural network does not belong to the classification category corresponding to that labeled data, and the second sub-loss function characterizes the probability that the prediction result for labeled data input to the neural network does not belong to a preset classification category.
In the above embodiments of the present application, it is specifically stated that the first loss function may comprise two sub-parts, each contributing differently to the target neural network as a whole, which provides flexibility.
In a possible implementation manner of the first aspect, the first loss function may specifically be: a difference between the first sub-loss function and the second sub-loss function.
In the above embodiments of the present application, the specific form in which the first sub-loss function and the second sub-loss function constitute the first loss function is described, which provides realizability.
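As a hedged illustration of this difference form, the two sub-losses can be read as probability masses computed from softmax outputs. The function below is one plausible reading, not the patent's actual formulation; `preset_classes` is an assumed parameter naming the preset classification categories.

```python
import numpy as np

def first_loss(probs, labels, preset_classes):
    """Sketch: first loss = sub-loss 1 minus sub-loss 2.

    probs:  (N, C) softmax outputs for the labeled samples
    labels: (N,)   ground-truth class indices
    preset_classes: indices of the preset classification categories
    """
    n = probs.shape[0]
    # Sub-loss 1: probability that a labeled sample is NOT predicted
    # as its own (corresponding) classification category.
    sub1 = float(np.mean(1.0 - probs[np.arange(n), labels]))
    # Sub-loss 2: probability that the prediction does NOT fall in
    # the preset classification categories.
    sub2 = float(np.mean(1.0 - probs[:, preset_classes].sum(axis=1)))
    return sub1 - sub2
```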
In a possible implementation manner of the first aspect, a specific implementation manner of the first device obtaining the target loss function according to the first loss function and the second loss function may be: the first device adds the first loss function and the second loss function to obtain a target loss function.
In the above embodiments of the present application, the specific form in which the first loss function and the second loss function constitute the target loss function is described, which provides realizability.
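Under the same assumptions as above, the second loss on unlabeled data and the additive combination can be sketched as follows; again, this is an illustrative reading, with `preset_classes` an assumed parameter.

```python
import numpy as np

def second_loss(unlabeled_probs, preset_classes):
    """Probability that predictions on unlabeled samples fall OUTSIDE
    the preset classification categories (illustrative reading)."""
    outside = 1.0 - unlabeled_probs[:, preset_classes].sum(axis=1)
    return float(np.mean(outside))

def target_loss(first, second):
    # Per this implementation manner, the target loss is their sum.
    return first + second
```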
In a possible implementation manner of the first aspect, the first device may also obtain the target loss function from the first loss function, the second loss function and a third loss function, where the third loss function characterizes the probability that the prediction result for target labeled data input into the neural network does not belong to a target preset classification category; the target labeled data belongs to the first labeled data on the first device and not to the second labeled data on the second device, and the target preset classification category is a classification category preset on the second device. To facilitate understanding of the target labeled data and the target preset classification category in the third loss function, consider the following example. Assume client k1 is the first device and client k2 is the second device. For the first device, the target labeled data refers to the first labeled data belonging to the first device, not the second labeled data belonging to the second device. Similarly, the target preset classification category refers to a preset classification category defined on the second device. For example, assume there are 10 classification categories in total, client k1 has labeled part of the data in categories 1 to 5 (i.e., the first labeled data), and all data in categories 6 to 10 on client k1 is unlabeled (i.e., the first unlabeled data); then, for client k1, categories 1 to 5 are the target classification categories and categories 6 to 10 are the preset classification categories. Assume further that client k2 has labeled part of the data in categories 6 to 10 (i.e., the second labeled data) and all data in categories 1 to 5 on client k2 is unlabeled (i.e., the second unlabeled data); then, for client k2, categories 6 to 10 are the target classification categories and categories 1 to 5 are the preset classification categories. Note that the preset classification categories of client k2 are the target preset classification categories for client k1.
In the foregoing embodiments of the present application, it is specifically stated that the constructed target loss function may further include a third loss function in addition to the first loss function and the second loss function, where the third loss function is used to further distinguish each category in the unlabeled data, so as to improve the reliability of the target loss function.
In a possible implementation manner of the first aspect, a specific implementation of the first device obtaining the target loss function from the first loss function, the second loss function and the third loss function may be: first, the first device adds the first loss function and the second loss function to obtain an addition result; then, the first device subtracts the third loss function from the addition result to obtain the target loss function.
In the above-described embodiments of the present application, specific forms in which the first loss function, the second loss function, and the third loss function constitute the target loss function are specifically described, and the present application has realizability.
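The three-term combination just described reduces to simple scalar arithmetic once the individual losses have been computed:

```python
def target_loss_three(first, second, third):
    """Combination from the implementation manner above: add the first
    and second losses, then subtract the third. The inputs are
    already-computed scalar loss values."""
    return (first + second) - third
```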
In a possible implementation manner of the first aspect, after the first device receives the integrated weight matrix sent by the service device, the method further includes: the first device takes the integrated weight matrix as the current weight matrix of the target neural network on the first device, and again performs the steps of training the target neural network on the first device with the target loss function and the first training set, sending the resulting first weight matrix to the service device, and receiving the integrated weight matrix sent by the service device; the first device repeats these steps until a training termination condition is reached.
In the above embodiments of the present application, the first device may repeatedly train the local target neural network with the updated integrated weight matrix to improve the performance of the target neural network.
In a possible implementation manner of the first aspect, in some embodiments of the present application, the manner of determining that the training termination condition is reached includes, but is not limited to: 1) when the number of updates of the integrated weight matrix reaches a preset number, the training termination condition is considered reached; 2) when the difference between two consecutive integrated weight matrices is smaller than a preset threshold, the training termination condition is considered reached.
In the above embodiments of the present application, several determination manners for reaching the training termination condition are specifically described, which provides flexibility.
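Both termination checks can be expressed compactly. The round budget and threshold below are illustrative values, not values fixed by the patent:

```python
import numpy as np

def reached_termination(integrated_history, max_updates=50, tol=1e-4):
    """integrated_history: list of integrated weight matrices, one per round."""
    # Condition 1): the integrated matrix has been updated a preset
    # number of times.
    if len(integrated_history) >= max_updates:
        return True
    # Condition 2): two consecutive integrated matrices differ by less
    # than a preset threshold.
    if len(integrated_history) >= 2:
        delta = float(np.abs(integrated_history[-1]
                             - integrated_history[-2]).max())
        return delta < tol
    return False
```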
The second aspect of the embodiments of the present application further provides a neural network training method based on federated learning. The method includes: first, a service device (e.g., a server) receives a first weight matrix sent by a first device and a second weight matrix sent by a second device, where the first weight matrix is obtained by the first device using a target loss function to train a target neural network on the first device with a first training set on the first device, and the second weight matrix is obtained by the second device using the target loss function to train a target neural network on the second device with a second training set on the second device. It should be noted that the target neural networks on the first and second devices have the same network structure, i.e., the same initial network model; only in the subsequent training process do the weight matrices obtained by their respective training differ. It should further be noted that the target loss function is obtained from a first loss function and a second loss function: the first loss function characterizes the probability that labeled data input to the neural network is predicted incorrectly, and the second loss function characterizes the probability that the prediction result for unlabeled data input to the neural network does not belong to a preset classification category. The first training set includes the first labeled data and the first unlabeled data, and the second training set includes the second labeled data and the second unlabeled data.
It should be noted that, in the embodiment of the present application, the unlabeled data in the first training set may belong to a plurality of classification categories (merely unlabeled); similarly, the second training set also includes a plurality of second labeled data and a plurality of second unlabeled data, and the unlabeled data in the second training set may likewise belong to a plurality of classification categories (merely unlabeled). After receiving the first and second weight matrices sent by the first and second devices respectively, the service device obtains an integrated weight matrix from them (e.g., by integrating the first weight matrix and the second weight matrix) and sends the integrated weight matrix to the first device, so that the first device uses it as the current weight matrix of the target neural network on the first device. In some embodiments of the present application, besides sending the integrated weight matrix to the first device, the service device may also send it to the second device, or send it to the first device and the second device separately.
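One plausible integration rule on the service device is the element-wise mean of the two uploaded weight matrices. The patent text does not fix the exact integration formula, so this sketch is an assumption:

```python
import numpy as np

def integrate(first_w, second_w):
    """Integrate the first and second weight matrices into one
    integrated weight matrix (element-wise mean, as an assumption)."""
    return 0.5 * (first_w + second_w)
```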
In the above embodiment of the present application, the problem of federated learning is solved by constructing a target loss function that differs from the traditional federated-learning formulation; the target loss function considers not only the contribution of the labeled data but also that of the unlabeled data, improving the performance of the target neural network obtained by the joint training of multiple clients (i.e., the first device and the second device) based on this loss function.
In a possible implementation manner of the second aspect, after the service device sends the integrated weight matrix to the first device and the second device, the method further includes: the service device repeatedly performs the above steps until a training termination condition is reached.
In the above embodiments of the present application, after the service device sends the integrated weight matrix to the first device and the second device, the above process may be repeatedly performed, so that the first device and the second device may repeatedly train the local target neural network with the updated integrated weight matrix, so as to improve the performance of the target neural network.
A third aspect of the embodiments of the present application further provides a data processing method. The method includes: first, an execution device obtains input data to be processed, which is related to the target task to be executed; for example, when the target task is a classification task, the input data is the data to be classified. The execution device then processes the input data with the trained target neural network to obtain output data (i.e., a predicted classification result), where the weight matrix of the trained target neural network is obtained by the joint training of the first device and the second device on their respective local first and second training sets based on the constructed target loss function; the first training set includes first labeled data and first unlabeled data, and the second training set includes second labeled data and second unlabeled data. That is, the trained target neural network is trained based on the target loss function constructed by the method of the first aspect or any one of its possible implementation manners.
In the above embodiments of the present application, how to execute a corresponding task on input data through a trained target neural network is specifically described, and the trained target neural network is obtained through the training of the constructed target loss function, so that the inference process is accelerated.
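On the execution device, inference for a classification task reduces to a forward pass plus an arg-max over class scores. Here `model` is any callable standing in for the trained target neural network; the interface is assumed for illustration:

```python
import numpy as np

def run_inference(model, input_data):
    """Process input data with the trained target neural network and
    return the predicted classification category index."""
    scores = model(input_data)   # forward pass -> class scores
    return int(np.argmax(scores))
```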
In a possible implementation manner of the third aspect, the input data may be image data, audio data, or text data, and a data type of the input data is determined by a target task to be processed, which is not limited herein.
In the above-described embodiments of the present application, the data type of the input data is not limited, and the present application has wide applicability.
A fourth aspect of the embodiments of the present application provides a training device which, as a first device (i.e., one of the clients involved in federated learning), has the function of implementing the method of the first aspect or any one of its possible implementation manners, and which, as a service device, has the function of implementing the method of the second aspect or any one of its possible implementation manners. The function may be implemented by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules corresponding to the above functions.
A fifth aspect of the embodiments of the present application provides an execution device having the function of implementing the method of the third aspect or any one of its possible implementation manners. The function may be implemented by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules corresponding to the above functions.
A sixth aspect of the embodiments of the present application provides a training apparatus, which may include a memory, a processor, and a bus system, where the memory is configured to store a program, and the processor is configured to call the program stored in the memory to execute the method according to the first aspect or any one of the possible implementation manners of the first aspect of the embodiments of the present application, or the processor is configured to call the program stored in the memory to execute the method according to the second aspect or any one of the possible implementation manners of the second aspect of the embodiments of the present application.
A seventh aspect of the present embodiment provides an execution device, which may include a memory, a processor, and a bus system, where the memory is configured to store a program, and the processor is configured to call the program stored in the memory to execute the method of the third aspect or any one of its possible implementation manners of the embodiments of the present application.
An eighth aspect of the present application provides a computer-readable storage medium, which stores instructions that, when executed on a computer, enable the computer to perform the method of the first aspect or any one of the possible implementations of the first aspect, or enable the computer to perform the method of the second aspect or any one of the possible implementations of the second aspect, or enable the computer to perform the method of any one of the possible implementations of the third aspect.
A ninth aspect of the embodiments of the present application provides a computer program which, when run on a computer, causes the computer to perform the method of the first aspect or any one of its possible implementation manners, or the method of the second aspect or any one of its possible implementation manners, or the method of the third aspect or any one of its possible implementation manners.
A tenth aspect of the embodiments of the present application provides a chip comprising at least one processor and at least one interface circuit coupled to the processor. The at least one interface circuit performs a transceiving function and sends instructions to the at least one processor, and the at least one processor executes a computer program or instructions so as to implement the method of the first aspect or any one of its possible implementation manners, or the method of the second aspect or any one of its possible implementation manners. The function may be implemented by hardware, by software, or by a combination of hardware and software, the hardware or software including one or more modules corresponding to the above functions. In addition, the interface circuit communicates with modules outside the chip; for example, it may send the target neural network obtained by on-chip joint training to other terminal devices (e.g., a mobile phone, a personal computer, a smart band) for processing images, text or audio, or send the target neural network to various intelligent driving agents (e.g., unmanned driving, assisted driving) for motion planning (e.g., driving-behavior decision, global path planning).
Drawings
FIG. 1 is a block diagram of a conventional federated learning framework;
FIG. 2 is a schematic structural diagram of an artificial intelligence body framework provided by an embodiment of the present application;
FIG. 3 is a system architecture diagram of a task processing system according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a neural network training method based on federated learning according to an embodiment of the present application;
FIG. 5 is a schematic diagram comparing an ordinary two-class learning method with a conventional two-class PU learning method according to an embodiment of the present application;
FIG. 6 is a schematic diagram comparing an ordinary multi-class learning method with the multi-class PU learning method of the present application according to an embodiment of the present application;
FIG. 7 is a block diagram of the joint training of a neural network training method based on federated learning according to an embodiment of the present application;
FIG. 8 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 9 is a schematic diagram of an application scenario of a shared neural network model obtained by a neural network training method based on federated learning according to an embodiment of the present application;
FIG. 10 is a schematic diagram of an application scenario in which a trained target neural network performs target detection on a terminal handset according to an embodiment of the present application;
FIG. 11 is a schematic diagram of an application scenario of automatic-driving scene segmentation by a trained target neural network on a wheeled mobile device according to an embodiment of the present application;
FIG. 12 is a schematic diagram of an application scenario of a trained target neural network in a face recognition application according to an embodiment of the present application;
FIG. 13 is a schematic diagram of an application scenario of a trained target neural network in a speech recognition application according to an embodiment of the present application;
FIG. 14 is a schematic diagram of a training device provided by an embodiment of the present application;
FIG. 15 is another schematic diagram of a training device provided by an embodiment of the present application;
FIG. 16 is a schematic diagram of an execution device provided by an embodiment of the present application;
FIG. 17 is another schematic diagram of a training device provided by an embodiment of the present application;
FIG. 18 is another schematic diagram of an execution device provided by an embodiment of the present application;
FIG. 19 is a schematic structural diagram of a chip according to an embodiment of the present application.
Detailed Description
The embodiments of the present application provide a neural network training method and apparatus based on federal learning. By constructing a target loss function different from that of the conventional federated learning paradigm, the method solves the federated learning problem in the case where only part of the data on a client (i.e., a first device) is labeled.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely descriptive of the various embodiments of the application and how objects of the same nature can be distinguished. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments of the present application involve a good deal of background knowledge about neural networks and federated learning. To better understand the scheme of the embodiments, the related terms and concepts that may be involved are first introduced below. It should be understood that these conceptual explanations may be constrained by the specific details of the embodiments, but this does not mean that the application is limited to those details; the specific details may vary from embodiment to embodiment and are not limited here.
(1) Neural network
A neural network is composed of neural units and can be understood as a network having an input layer, a hidden layer, and an output layer: generally the first layer is the input layer, the last layer is the output layer, and the layers in between are hidden layers. A neural network with many hidden layers is called a deep neural network (DNN). Mathematically, the operation of each layer in the neural network can be expressed as

y = a(W·x + b)

From the physical level, the work of each layer can be understood as completing a transformation from the input space to the output space (i.e., from the row space to the column space of the matrix) through five operations on the input space (the set of input vectors): 1. raising/lowering the dimension; 2. enlarging/shrinking; 3. rotating; 4. translating; 5. "bending". Operations 1, 2, and 3 are completed by W·x, operation 4 is completed by +b, and operation 5 is realized by a(·). The word "space" is used here because the objects to be classified are not single objects but a class of objects; the space refers to the set of all individuals of such objects. W is the weight matrix of a layer of the neural network, and each value in the matrix represents the weight value of one neuron of that layer. The matrix W determines the spatial transformation from input space to output space described above; that is, W at each layer of the neural network controls how the space is transformed. The purpose of training a neural network is ultimately to obtain the weight matrices of all layers of the trained network. Therefore, the training process of a neural network is essentially a way of learning to control the spatial transformation, and more specifically, of learning the weight matrices.
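The single-layer mapping described above can be sketched in a few lines of NumPy; the ReLU activation and the concrete numbers below are illustrative choices, not part of the embodiments:

```python
import numpy as np

def layer(x, W, b):
    """One neural-network layer: the affine map W.x + b followed by a
    non-linear activation a(.) (here ReLU), i.e. y = a(W.x + b)."""
    return np.maximum(0.0, W @ x + b)  # ReLU plays the role of the "bending" step

# A layer mapping a 3-dimensional input space to a 2-dimensional output space:
W = np.array([[1.0, 0.0, -1.0],
              [0.5, 2.0,  0.0]])   # weight matrix: one row of weights per neuron
b = np.array([0.1, -0.2])          # bias vector, performing the translation step
x = np.array([1.0, 1.0, 1.0])

y = layer(x, W, b)
print(y)  # [0.1 2.3]
```

Note that W alone fixes the dimension change, scaling, and rotation; training adjusts the entries of W (and b) until the overall transformation maps inputs to the desired outputs.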
(2) Loss function (loss function)
In the process of training a neural network, because the output of the network is expected to be as close as possible to the value that is truly desired, the weight matrix of each layer can be updated according to the difference between the current network's predicted value and the truly desired target value. (Of course, there is usually an initialization process before the first update, i.e., parameters are pre-configured for each layer of the network.) For example, if the network's predicted value is too high, the weight matrices are adjusted to make it predict lower, and the adjustment continues until the neural network can predict the truly desired target value. It is therefore necessary to define in advance "how to compare the difference between the predicted value and the target value"; this is the purpose of loss functions (also called objective functions), which are important equations for measuring that difference. Taking the loss function as an example, if a higher output value (loss) of the loss function indicates a larger difference, then training the neural network becomes the process of reducing this loss as much as possible.
(3) Back propagation algorithm
In the training process of a neural network, a back propagation (BP) algorithm can be used to correct the parameters of the initial neural network model, so that the reconstruction error loss of the model becomes smaller and smaller. Specifically, the input signal is propagated forward until the output produces an error loss, and the parameters of the initial model are updated by propagating the error-loss information backward, so that the error loss converges. The back propagation algorithm is thus a backward pass dominated by the error loss, aiming at obtaining the optimal parameters of the neural network model, such as the weight matrices.
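The forward-then-backward loop described above can be illustrated with a deliberately tiny example: a single linear neuron trained by gradient descent on one sample. All numbers below are made up for illustration:

```python
# Minimal sketch of "forward signal, backward error" for a single linear
# neuron y = w * x with squared-error loss.
x, target = 2.0, 6.0   # one training sample: input and truly desired value
w = 1.0                # initial weight (the "initialization process")
lr = 0.05              # learning rate

for _ in range(50):
    y = w * x                      # forward propagation: predicted value
    loss = (y - target) ** 2       # loss: difference between prediction and target
    grad = 2 * (y - target) * x    # back propagation: d(loss)/d(w)
    w -= lr * grad                 # update the parameter against the gradient

print(round(w, 3))  # 3.0 (since 3.0 * 2.0 == 6.0, the error loss has converged)
```

Each iteration shrinks the loss, which is exactly the "error loss becomes smaller and smaller" behavior the BP algorithm aims for.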
(4) Tagged and untagged data
In the embodiments of the present application, labeled data refers to training data annotated with a classification category, while data without such an annotation is treated as unlabeled. That is, training data marked with a classification category is necessarily labeled data, whereas training data not marked with a classification category is unlabeled data even though it may in fact belong to any of the categories, including those covered by the labeled data.
(5) Learning with positive and unlabeled data (PU learning)
Specifically, in a given set of training data, only part of the positive data is labeled, while the remaining positive data and all negative data are unlabeled. PU learning, i.e., learning from positive and unlabeled data, trains a two-class classifier in this situation to separate the positive and negative data.
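A minimal sketch of how such a PU training set arises; the class sizes and the 30% labeling fraction below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy set-up: 100 positive and 100 negative samples, but only 30 of the
# positives carry a label; everything else is unlabeled ("?").
y_true = np.array([1] * 100 + [0] * 100)   # ground truth, unknown to the PU learner
labeled = np.zeros_like(y_true, dtype=bool)
pos_idx = np.flatnonzero(y_true == 1)
labeled[rng.choice(pos_idx, size=30, replace=False)] = True

# The PU learner only ever sees these observed labels:
#   1 = labeled positive, -1 = unlabeled (may actually be positive or negative).
y_observed = np.where(labeled, 1, -1)

print((y_observed == 1).sum(), (y_observed == -1).sum())  # 30 170
```

The 170 unlabeled samples are a mixture of the 70 unmarked positives and all 100 negatives, which is precisely what makes PU learning harder than ordinary two-class learning.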
(6) Federal Learning (FL)
Federated learning is a machine learning method that protects user privacy. In some practical application scenarios in the field of machine learning, due to insufficient data features or a small number of samples on a single device, it is difficult to train a good machine learning model on that device alone; the data of multiple devices therefore need to be combined for training in order to obtain a model of better quality. At the same time, when training across multiple devices, the data privacy of users must be guaranteed: the data cannot leave the user's device and can only be used locally for model training. Federated learning arose from this requirement; it can effectively help multiple computing nodes use data and build machine learning models while satisfying the requirements of user privacy protection, data security, and government regulations.
Embodiments of the present application are described below with reference to the accompanying drawings. As can be known to those skilled in the art, with the development of technology and the emergence of new scenarios, the technical solution provided in the embodiments of the present application is also applicable to similar technical problems.
First, the general workflow of an artificial intelligence system is described. Please refer to FIG. 2, which shows a structural diagram of an artificial intelligence framework, explained below along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis). The "intelligent information chain" reflects the general process from data acquisition onward, for example the stages of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, the data undergoes a refinement from "data" to "information" to "knowledge" to "wisdom". The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure and information (provision and processing technology) up to the industrial ecology of the system.
(1) Infrastructure
The infrastructure provides computing power support for the artificial intelligent system, realizes communication with the outside world, and realizes support through a foundation platform. Communicating with the outside through a sensor; the computing power is provided by intelligent chips (hardware acceleration chips such as CPU, NPU, GPU, ASIC, FPGA and the like); the basic platform comprises distributed computing framework, network and other related platform guarantees and supports, and can comprise cloud storage and computing, interconnection and intercommunication networks and the like. For example, sensors and external communications acquire data that is provided to intelligent chips in a distributed computing system provided by the base platform for computation.
(2) Data
Data at the upper level of the infrastructure is used to represent the data source for the field of artificial intelligence. The data relates to graphs, images, voice and texts, and also relates to the data of the Internet of things of traditional equipment, including service data of the existing system and sensing data such as force, displacement, liquid level, temperature, humidity and the like.
(3) Data processing
Data processing typically includes data training, machine learning, deep learning, searching, reasoning, decision making, and the like.
The machine learning and the deep learning can perform symbolized and formalized intelligent information modeling, extraction, preprocessing, training and the like on data.
Inference refers to the process of simulating the intelligent reasoning of humans in a computer or intelligent system: according to an inference control strategy, the machine uses formalized information to think about and solve problems, a typical function being searching and matching.
The decision-making refers to a process of making a decision after reasoning intelligent information, and generally provides functions of classification, sequencing, prediction and the like.
(4) General capabilities
After the above-mentioned data processing, further based on the result of the data processing, some general capabilities may be formed, such as algorithms or a general system, e.g. translation, analysis of text, computer vision processing, speech recognition, recognition of images, etc.
(5) Intelligent product and industrial application
Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they are the encapsulation of the overall artificial intelligence solution, turning intelligent information decisions into products and realizing practical applications. The application fields mainly include: intelligent terminals, intelligent manufacturing, intelligent transportation, smart home, intelligent medical care, intelligent security, autonomous driving, smart city, etc.
The embodiments of the present application can be applied to the optimized design of a neural network (e.g., a convolutional neural network (CNN), a recurrent neural network (RNN), etc., without specific limitation), and in particular to the optimized design of the neural network's loss function. The neural network whose loss function is optimized can be applied to the field of artificial intelligence, for example to image processing in the field of computer vision, such as image segmentation, target detection, and super-resolution reconstruction, as well as to text processing, speech processing, and other fields. Specifically, referring to FIG. 1 and taking the image processing field as an example, the data in the data set acquired by the infrastructure in the embodiments of the present application may be multiple pieces of image data (which may also be called training samples or training data, multiple pieces of training data forming a training set) collected by sensors such as surveillance cameras and mobile-phone camera modules, or image data derived from collected video data. As long as the training set can serve to iteratively train the neural network, the embodiments of the present application do not limit the type of data in the training set.
Referring to FIG. 3, the overall framework of the task processing system provided by the embodiment of the present application is described. FIG. 3 is a system architecture diagram of the task processing system. In FIG. 3, the task processing system 300 includes an execution device 210, a first training device 2201, a second training device 2202, a database 230, a client device 240, a data storage system 250, and a data collection device 260; the execution device 210 includes a computation module 211 and an input/output (I/O) interface 212, and the computation module 211 includes the trained target neural network 201. It should be noted that, in this embodiment, the first training device 2201 may be regarded as one of the clients in federated learning and the second training device 2202 as another client; when there are n clients (n ≥ 2) in federated learning, there may correspondingly be n training devices (not shown in FIG. 3), which is not described again here. For ease of understanding, FIG. 3 takes 2 clients in federated learning, i.e., 2 training devices, as the example.
In the training phase, the data collection device 260 may be configured to obtain the large-scale data set (i.e., the training set) required by the user and store it in the database 230. The training set includes multiple pieces of labeled data and multiple pieces of unlabeled data; these data may be called training data or training samples and may be image data, video data, audio data, text data, etc., depending on the target task to be executed, which is not limited here. In this embodiment, the training set acquired by the data collection device 260 and stored in the database 230 serves only the first training device 2201, so it may be regarded as the local training set of the first training device 2201 and may be called the first training set. Similarly, the second training device 2202 also has its own data collection device, database, and so on (not shown in FIG. 3), and the local training set of the second training device 2202, i.e., the second training set, may be obtained by a similar process. If n clients participate in federated learning, every client may obtain its local training set by a process similar to that of the first training device 2201, which is not repeated here. Then, the first training device 2201 trains the target neural network 201 with its local training set (i.e., the first training set) and updates the weight matrix of each neural network layer of the target neural network 201 using the target loss function constructed in the embodiments of the present application. Similarly, the other clients in federated learning (e.g., the second training device 2202) train their respective local target neural networks 201 with their respective local training sets and likewise update the weight matrices of each layer of their local networks. Finally, the weight matrices of the clients' target neural networks 201 may be aggregated by one of the clients (e.g., the first training device 2201) or by an additionally deployed server (not shown in FIG. 3); for example, the weight matrices of all clients may be aggregated, or only those of some clients, which is not limited in this application. The aggregated weight matrix is then sent by that client (or the additionally deployed server) to each training device (e.g., to the first training device 2201 and the second training device 2202). The first training device 2201 and the second training device 2202 take the received aggregated weight matrix as the current weight matrix of their respective local target neural networks 201, continue training with their local training sets, and aggregate the resulting new weight matrices, alternating in this way until the iteration termination condition of training is reached, at which point the training of the target neural network 201 is considered complete.
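The aggregation step described above can be sketched as follows; simple (optionally size-weighted) averaging of the clients' weight matrices is one plausible choice, but the embodiments do not fix the aggregation formula:

```python
import numpy as np

def aggregate(weight_matrices, sizes=None):
    """Aggregate the clients' weight matrices into one shared matrix by
    size-weighted averaging (an illustrative choice of aggregation rule)."""
    if sizes is None:
        sizes = [1.0] * len(weight_matrices)   # unweighted: plain average
    total = float(sum(sizes))
    return sum(s / total * W for s, W in zip(sizes, weight_matrices))

# Two clients (e.g. the first and second training devices) each hold a
# locally trained weight matrix for the same layer:
W1 = np.array([[1.0, 2.0], [3.0, 4.0]])
W2 = np.array([[3.0, 0.0], [1.0, 2.0]])

# Client 2 holds 3x as much data, so its matrix gets 3x the weight.
W_shared = aggregate([W1, W2], sizes=[100, 300])
print(W_shared)   # [[2.5 0.5]
                  #  [1.5 2.5]]
```

After aggregation, W_shared would be sent back to every training device as the current weight matrix, and the next round of local training begins from it.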
The trained target neural network 201 may be applied to different systems or devices (i.e., execution devices 210), for example end-side devices such as mobile phones, tablets, and notebook computers; wearable devices such as smart watches, smart bracelets, and smart glasses; smart vehicles such as intelligent cars, connected vehicles, and robots; or edge devices such as the cameras of a monitoring system, internet protocol cameras (IP cameras, IPC) in a security system, augmented reality (AR) and virtual reality (VR) devices, identity recognition devices (e.g., attendance machines, card punches, etc.), and smart speakers. In particular, the trained target neural network 201 is not limited to devices with limited computing resources; it may also be applied to any other device capable of deploying a neural network, such as a cloud server or a computing platform, which is not limited in this application.
During the inference phase, the execution device 210 may invoke data, code, etc. from the data storage system 250 and may store data, instructions, etc. in the data storage system 250. The data storage system 250 may be disposed in the execution device 210 or the data storage system 250 may be an external memory with respect to the execution device 210. The calculation module 211 performs processing (e.g., super-resolution reconstruction, target detection, image classification, etc.) corresponding to the target task for each input data (e.g., image data) through the trained target neural network 201.
In FIG. 3, the execution device 210 is configured with an I/O interface 212 to interact with data from an external device, and a "user" may input data to the I/O interface 212 via a client device 240. As an example, assuming that the target task is a super-resolution reconstruction task, the client device 240 may be an image capturing device of a monitoring system, an image captured by the image capturing device is input to the computing module 211 of the execution device 210 as input data, the computing module 211 performs super-resolution reconstruction on the input image to obtain an enhanced image, and then outputs the enhanced image to the image capturing device, or directly displays the enhanced image on a display interface (if any) of the execution device 210, or stores the enhanced image in a storage module of the execution device 210 for subsequent use.
In addition, in some embodiments of the present application, the client device 240 may also be integrated in the execution device 210, for example, when the execution device 210 is a mobile phone, if the target task is a super-resolution reconstruction task with an image, a target image to be processed (for example, an image that can be captured by a camera of the mobile phone, or an image obtained based on a video captured by the camera of the mobile phone) or a target image to be processed that is sent by another device (for example, another mobile phone) may be directly obtained by the mobile phone, and then a calculation module 211 in the mobile phone performs super-resolution reconstruction on the target image to obtain an enhanced result (i.e., an enhanced image) of the image, and directly presents the enhanced image on a display interface of the mobile phone or stores the enhanced image. The product forms of the execution device 210 and the client device 240 are not limited herein.
It should be noted that fig. 3 is only a schematic diagram of a system architecture provided in the embodiment of the present application, and the position relationship between the devices, modules, etc. shown in the diagram does not constitute any limitation, for example, in fig. 3, the data storage system 250 is an external memory with respect to the execution device 210, and in other cases, the data storage system 250 may be disposed in the execution device 210; in fig. 3, the client device 240 is an external device with respect to the execution device 210, and in other cases, the client device 240 may be integrated in the execution device 210.
In some embodiments of the present application, for example, in fig. 3, the first training device 2201, the second training device 2202, and the executing device 210 are distributed independent devices, but fig. 3 is only a schematic structural diagram of a task processing system provided by an embodiment of the present invention, and the positional relationship among the devices, modules, and the like shown in the diagram does not constitute any limitation. Further, the example in FIG. 3 is not intended to limit the number of each device, e.g., database 230 may be in communication with a plurality of client devices 240.
With reference to the above description, a specific implementation flow of the training phase and the application phase of the data processing method provided in the embodiment of the present application is set forth below.
First, training phase
In the embodiment of the present application, the training phase includes a process in which the first training device 2201 performs a training operation on the local target neural network 201 by using the training samples in the first training set in fig. 3, and a process in which the second training device 2202 performs a training operation on the local target neural network 201 by using the training samples in the second training set (i.e., a process of joint training in federal learning). It should be noted that the present application is described by taking two clients as an example, and federal learning participated in by more than 2 clients is similar to this, and is not described herein again. Referring to fig. 4, fig. 4 is a schematic flow chart of a neural network training method based on federal learning according to an embodiment of the present application, which may specifically include the following steps:
401. the first device trains a target neural network on the first device by using the constructed target loss function and a first training set on the first device to obtain a first weight matrix, wherein the first training set comprises first labeled data and first unlabeled data.
First, a first device (i.e., a first training device 2201) trains a target neural network (i.e., a local target neural network) on the first device with a training set (i.e., a first training set) on the first device by using a constructed target loss function, so as to obtain a first weight matrix of the local target neural network, wherein the first training set includes both labeled data (which may be referred to as first labeled data) and unlabeled data (which may be referred to as first unlabeled data).
It should be noted that, in the embodiment of the present application, before the first device trains the local target neural network with the constructed target loss function, the target loss function needs to be constructed, and a background and a construction process for constructing the target loss function according to the present application are described below.
The background for constructing the target loss function is introduced first: because different clients have limited ability to mark data, the present application assumes that each client only marks part of the data of some categories. For example, assuming a total of 10 classes of training data, the first device (i.e., one of the clients in federated learning) marks only 20% of the training data in classes 1-5 and none of the training data in classes 6-10, while the second device (i.e., another client in federated learning) marks only 40% of the training data in classes 6-10 and none of the training data in classes 1-5. As shown in FIG. 5, FIG. 5 is a schematic diagram comparing a normal two-class learning method with the conventional two-class PU learning method. Sub-diagram (a) of FIG. 5 illustrates the normal two-class learning method, in which the data used for training are all labeled, where "×" represents one class and "o" represents the other class. Sub-diagram (b) of FIG. 5 illustrates the conventional two-class PU learning method, in which only part of the training data is labeled and the rest is unlabeled; for example, in sub-diagram (b), "o" represents labeled training data and "?" represents unlabeled training data.
On the basis, the application provides a multi-classification PU learning method, which is different from the traditional two-classification PU learning method in that: in the embodiment of the present application, the non-tag data may have a plurality of different classification categories, for example, it is assumed that there are 10 classification categories, where part of data in 5 categories is marked as tagged data, and the other 5 categories are not marked and are all non-tag data, that is, in the embodiment of the present application, although all of the non-tag data are non-tag data, what category the non-tag data specifically belongs to may not be known; while the non-tag data in the conventional two-classification PU learning method only includes one classification category (the number of categories of tagged data is not limited), for example, assuming that there are 10 classification categories, the non-tag data can only be one of the categories, and the other 9 categories must be tagged data. For easy understanding, referring to fig. 6 in particular, fig. 6 is a schematic diagram illustrating a comparison between a normal multi-class learning method provided in the present application and a multi-class PU learning method provided in the present application.
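The per-client partial labeling described above (each client marks only a fraction of the data in some categories) can be sketched as follows; classes are 0-indexed here for convenience, and the fractions mirror the 20%/40% example in the text:

```python
import numpy as np

rng = np.random.default_rng(1)
NUM_CLASSES = 10

def client_labels(y_true, labeled_classes, fraction):
    """Keep labels only for `fraction` of the samples whose true class is in
    `labeled_classes`; every other sample becomes unlabeled (-1)."""
    y_obs = np.full_like(y_true, -1)
    for c in labeled_classes:
        idx = np.flatnonzero(y_true == c)
        keep = rng.choice(idx, size=int(fraction * len(idx)), replace=False)
        y_obs[keep] = c
    return y_obs

y = rng.integers(0, NUM_CLASSES, size=1000)        # toy ground-truth classes
# First device: 20% of classes 0-4 labeled; second device: 40% of classes 5-9.
y_client1 = client_labels(y, range(0, 5), 0.20)
y_client2 = client_labels(y, range(5, 10), 0.40)
print((y_client1 >= 0).mean(), (y_client2 >= 0).mean())
```

On each client the unlabeled pool (-1) thus mixes unmarked samples of the labeled categories with all samples of the remaining categories, which is the multi-class PU situation the target loss function must handle.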
Because the training data adopted by the traditional multi-classification learning method are all labeled data, the constructed loss function is generally shown as the following formula (1):
R(f) = Σ_{i=1}^{C} π_i R_i(f),  with  R_i(f) = P_i(f(x) ≠ i)    (1)
wherein R(f) is the loss function used in the normal multi-class learning method, C is the number of classes, R_i(f) = P_i(f(x) ≠ i) is the loss function for data of class i (also called class-i data), x is the input data of the neural network, f(x) is the output (e.g., the predicted classification result) when the input is x, P_i(f(x) ≠ i) is the probability that the predicted classification result is not class i (i.e., the probability of a wrong prediction), and π_i is the class prior (acting like a weight) of class-i data. The overall loss function R(f) is thus a linear combination of the loss functions of the individual classes.
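A quick numerical check of formula (1): with made-up class priors and per-class error probabilities, R(f) is simply their prior-weighted sum:

```python
import numpy as np

# Formula (1): R(f) = sum_i pi_i * P_i(f(x) != i), a prior-weighted linear
# combination of per-class error probabilities. All numbers are illustrative.
priors = np.array([0.30, 0.45, 0.25])           # pi_i: class priors (sum to 1)
per_class_error = np.array([0.10, 0.05, 0.20])  # P_i(f(x) != i), per class

R = float(priors @ per_class_error)
print(round(R, 4))  # 0.1025
```

Classes with a larger prior (here the 45% class) contribute proportionally more to the total loss, which is why training drives their errors down fastest.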
However, in the multi-class PU learning method of the present application, since the unlabeled data is data without any label, if the loss function commonly used in the normal multi-class learning method described in formula (1) is adopted, the loss function related to the unlabeled data cannot be calculated, and based on this, the embodiment of the present application constructs a new target loss function, which includes not only the contribution of the labeled data to the target loss function, but also the contribution of the unlabeled data to the target loss function can be obtained by using the loss of the unlabeled data and the labeled data.
The following describes the process of constructing the objective loss function constructed by the present application: based on the above, the objective loss function not only includes the contribution of the labeled data to the objective loss function, but also can obtain the contribution of the unlabeled data to the objective loss function by using the loss of the unlabeled data and the labeled data. Therefore, the target loss function can be constructed according to a first loss function and a second loss function, wherein the first loss function is used for representing the probability of wrong prediction of the labeled data input to the neural network, and the second loss function is used for representing the probability that the prediction result obtained by predicting the unlabeled data input to the neural network does not belong to the preset classification category. Specifically, in some embodiments of the present application, the target loss function may be obtained by adding the first loss function and the second loss function. As shown in the following formula (2):
R(f) = Σ_{i∈C_{pk1}} π_i R'_i(f) + Σ_m P_U(f(x) ≠ m)    (2)
wherein R(f) is the target loss function constructed in the present application, i is a classification category corresponding to the training data labeled on the current client k1, and m is a category (which may be referred to as a preset classification category) to which the unlabeled data on the current client may correspond. For example, assuming there are 10 classification categories, where part of the data in categories 1 to 5 is labeled on client k1 (i.e. labeled data) and all data in categories 6 to 10 on client k1 is unlabeled (i.e. unlabeled data), then i takes values over the categories corresponding to categories 1 to 5 (i.e. C_p^{k1}), and m takes values over the categories corresponding to categories 6 to 10. It should be noted here that the total number of classification categories across all clients is fixed; still taking 10 classification categories as an example, the data on each client, whether labeled or unlabeled, will not exceed the range of these 10 categories (the training data may cover fewer than the 10 categories but cannot exceed them).
wherein Σ_{i∈C_p^{k1}} π_i R′_i(f) is the first loss function, R′_i(f) indicates the probability that the prediction result obtained by predicting the class-i data input to the neural network is wrong, and P_U(f(x) ≠ m) is the probability that the prediction result obtained by predicting the unlabeled data input to the neural network does not belong to the preset classification category m.
It is to be noted here that in formula (2) the class prior π_i is a fixed value preset based on real data in the actual application; its value does not change during the subsequent training process, and it acts like a weight. The class prior is unrelated to the amount of training data on each client and depends only on the class distribution in the real application. For example, if in a real scene cat data accounts for 30%, dog data for 45%, and bird data for 25%, then the class prior of dog is greater than the class prior of cat, and the class prior of cat is greater than the class prior of bird; the class prior of each category is positively correlated with the amount of data of that category in the real scene.
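The relationship between class priors and real-world class proportions described above can be sketched in a few lines of Python (a hedged illustration only: the 30%/45%/25% cat/dog/bird split is the example from the text, while the dictionary representation and normalization step are our own assumptions, not part of the patent's method):

```python
# Illustrative only: class priors act as fixed per-class weights, preset from
# assumed real-world class proportions and left unchanged during training.
real_world_proportions = {"cat": 0.30, "dog": 0.45, "bird": 0.25}

# Normalize defensively so the priors sum to 1 even if the raw proportions do not.
total = sum(real_world_proportions.values())
class_priors = {c: p / total for c, p in real_world_proportions.items()}

# A larger real-world share yields a larger prior, independent of how much
# training data each client happens to hold.
assert class_priors["dog"] > class_priors["cat"] > class_priors["bird"]
```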
It should be noted that, in some embodiments of the present application, the first loss function may further include a first sub-loss function and a second sub-loss function, where the first sub-loss function is used to characterize a probability that the prediction result obtained by predicting the labeled data input to the neural network does not belong to the classification category corresponding to the labeled data; and the second sub-loss function is used for representing the probability that the prediction result obtained by predicting the labeled data input into the neural network does not belong to the preset classification category. Specifically, the first loss function may be a difference between the first sub-loss function and the second sub-loss function, as shown in the following equation (3):
R′_i(f) = P_i(f(x) ≠ i) − P_i(f(x) ≠ m)    (3)

wherein P_i(f(x) ≠ i) is the first sub-loss function and P_i(f(x) ≠ m) is the second sub-loss function.
In addition to being the plain difference between the first sub-loss function and the second sub-loss function, the first loss function may also take other forms. For example, the first loss function may be the difference between the product of the first sub-loss function and a preset coefficient a and the product of the second sub-loss function and a preset coefficient b, where a + b = 1, a > 0 and b > 0, as shown in the following formula (4):
R′_i(f) = a · P_i(f(x) ≠ i) − b · P_i(f(x) ≠ m)    (4)
It should be further noted that, in other embodiments of the present application, the constructed target loss function may include, in addition to the first loss function and the second loss function, a third loss function, which is used to characterize the probability that the prediction result obtained by predicting target labeled data input to the neural network does not belong to a target preset classification category. The third loss function serves to further distinguish the individual categories within the unlabeled data. For the current client k1 alone, further distinguishing the categories in its local unlabeled data is impossible, because client k1 has no labels for its local unlabeled data. However, since the labeled training data differs from client to client in federated learning, a remaining client (e.g. client kq, q = 2, 3, …, n, n ≥ 2) can be found on which the unlabeled data of client k1 is labeled data, and the third loss function of client k1 can then be calculated with the help of client kq, as shown in the following formula (5):
π_j P_j(f(x) ≠ m′)    (5)

when j ∉ C_p^{k1} and j ∈ C_p^{kq}, that is, when class j is unlabeled on client k1 but labeled on client kq.
Therefore, for the current client k1, the third loss function in the target loss function is used to help calculate the target loss functions of other clients. The third loss function characterizes the probability that the prediction result obtained by predicting target labeled data input to the neural network does not belong to a target preset classification category (which may be denoted as m′). In this embodiment, for the current client k1, the target labeled data refers to labeled data on the current client k1, not labeled data on the other clients kq; similarly, the target preset classification category m′ is not the preset classification category defined by the current client k1, but belongs to the preset classification categories defined by the other clients kq.
To facilitate understanding of the target labeled data and the target preset classification category in the third loss function, the following example is given. Assume client k1 is the first device and client k2 is the second device. Then, for the first device, the target labeled data refers to the first labeled data belonging to the first device, not the second labeled data belonging to the second device; the target preset classification category, by contrast, refers to the preset classification category defined on the second device. For example, assume there are 10 classification categories, where part of the data in categories 1 to 5 is labeled on client k1 (i.e. the first labeled data) and all data in categories 6 to 10 on client k1 is unlabeled (i.e. the first unlabeled data). Then, for client k1, i takes values over the categories corresponding to categories 1 to 5 (i.e. C_p^{k1}), which are the target classification categories, and m takes values over the categories corresponding to categories 6 to 10; this m is the preset classification category for client k1. Assume further that client k2 is labeled with part of the data in categories 6 to 10 (i.e. the second labeled data), and all data of categories 1 to 5 on client k2 is unlabeled (i.e. the second unlabeled data). Then, for client k2, i takes values over the categories corresponding to categories 6 to 10 (i.e. C_p^{k2}), and m takes values over the categories corresponding to categories 1 to 5; this m is the preset classification category for client k2. It should be noted that the preset classification category of client k2 is exactly the target preset classification category m′ for client k1.
To sum up, equation (5) above shows that when client k1 needs to compute the third loss function for its class-j data but has no labeled data of class j, while client kq does have class-j labeled data, client kq can use its local labeled data to calculate the value of client k1's third loss function.
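The cross-client calculation just described can be sketched as follows (a minimal NumPy illustration under our own assumptions: soft-max probabilities stand in for the 0-1 error probability P_j(f(x) ≠ m′), and the function name and signature are hypothetical, not the patent's implementation):

```python
import numpy as np

def _softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def third_loss_term(logits, labels, j, prior_j, target_preset):
    # Client kq selects its local class-j labeled samples (class j is
    # unlabeled on client k1 but labeled on client kq).
    p_j = _softmax(logits)[np.asarray(labels) == j]
    preset = np.array(sorted(target_preset))
    # Estimate P_j(f(x) not in m') as the mean probability mass assigned
    # outside the target preset classification categories m'.
    not_in_preset = 1.0 - p_j[:, preset].sum(axis=1).mean()
    return prior_j * not_in_preset  # pi_j * P_j(f(x) != m'), cf. formula (5)
```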
Thus, taking the third loss function into account, in some embodiments of the present application the constructed target loss function may be obtained as follows: first add the first loss function and the second loss function, then subtract the third loss function, as shown in the following formula (6):
R(f) = Σ_{i∈C_p^{k1}} π_i R′_i(f) + P_U(f(x) ≠ m) − Σ_{q=2}^{n} Σ_{j∈C_p^{kq}} π_j P_j(f(x) ≠ m′)    (6)
wherein π_j, like π_i, is a class prior; see the description of π_i above, which is not repeated here for brevity. m′ is the target preset classification category.
It should be noted that in formula (6) the third loss function of the target loss function at the current client k1 is calculated with respect to all other clients kq (q = 2, 3, …, n, n ≥ 2). In practical applications, it may also be calculated with respect to only one or several of the clients; taking the calculation with respect to one client as an example, the target loss function may also be as shown in the following formula (7):
R(f) = Σ_{i∈C_p^{k1}} π_i R′_i(f) + P_U(f(x) ≠ m) − Σ_{j∈C_p^{kq}} π_j P_j(f(x) ≠ m′)    (7)
It should be noted that, in other embodiments of the present application, if R′_i(f) takes the form shown in formula (3), then the above formula (6) can also be expanded as shown in the following formula (8):
R(f) = Σ_{i∈C_p^{k1}} π_i [P_i(f(x) ≠ i) − P_i(f(x) ≠ m)] + P_U(f(x) ≠ m) − Σ_{q=2}^{n} Σ_{j∈C_p^{kq}} π_j P_j(f(x) ≠ m′)    (8)
Similarly, the above formula (7) can also be expanded as shown in the following formula (9):
R(f) = Σ_{i∈C_p^{k1}} π_i [P_i(f(x) ≠ i) − P_i(f(x) ≠ m)] + P_U(f(x) ≠ m) − Σ_{j∈C_p^{kq}} π_j P_j(f(x) ≠ m′)    (9)
After the target loss function is constructed, each client can train the target neural network on that client with its local training data based on the target loss function; the training process is the process of updating the weight matrix of the local target neural network. It should be noted here that the network models (also referred to as network structures) of the target neural networks on the clients are identical; only the trained weight matrices differ, because the local data differ. For the first device, the process is: the first device trains the target neural network on the first device (i.e. the local target neural network) with the constructed target loss function using the training set on the first device (i.e. the first training set), thereby obtaining a first weight matrix of the local target neural network, wherein the first training set includes both labeled data (which may be called first labeled data) and unlabeled data (which may be called first unlabeled data).
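As a rough sketch of the local training objective, the following hedged NumPy example estimates the two local terms of the target loss function (the first loss function in its formula (3) form plus the second loss function) on one mini-batch. Soft-max probabilities are used as a differentiable stand-in for the 0-1 error probabilities, and all names and the surrogate itself are our own assumptions rather than the patent's implementation:

```python
import numpy as np

def _softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def local_target_loss(logits_labeled, labels, logits_unlabeled, priors, preset_classes):
    probs_l = _softmax(logits_labeled)
    probs_u = _softmax(logits_unlabeled)
    preset = np.array(sorted(preset_classes))
    labels = np.asarray(labels)
    first = 0.0
    for i in np.unique(labels):
        p_i = probs_l[labels == i]
        err_own = 1.0 - p_i[:, i].mean()                      # P_i(f(x) != i)
        err_preset = 1.0 - p_i[:, preset].sum(axis=1).mean()  # P_i(f(x) != m)
        first += priors[i] * (err_own - err_preset)           # pi_i * R'_i(f)
    second = 1.0 - probs_u[:, preset].sum(axis=1).mean()      # P_U(f(x) != m)
    return first + second
```

Under this estimate an accurate network scores lower than a random one; as with other PU-style risk estimators, the value can be negative.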
402. And the second equipment trains the target neural network on the second equipment by using the constructed target loss function and a second training set on the second equipment to obtain a second weight matrix, wherein the second training set comprises second labeled data and second unlabeled data.
Similar to the first device, the second device (i.e., the second training device 2202) may also train the target neural network (i.e., the local target neural network) on the second device using the constructed target loss function using a training set (i.e., a second training set) on the second device to obtain a second weight matrix for the local target neural network, where the second training set includes both tagged data (which may be referred to as second tagged data) and non-tagged data (which may be referred to as second non-tagged data).
It should be noted here that the execution order of step 401 and step 402 is not limited, and step 401 may be executed first and then step 402 may be executed, or step 402 may be executed first and then step 401 may be executed, or step 401 and step 402 may be executed simultaneously, which is not limited herein.
403. The first device sends the first weight matrix to the serving device.
The first device transmits a first weight matrix of the local target neural network to the serving device. It should be noted that, in this embodiment of the application, the service device may be a server, and may also be another device (e.g., a second device) serving as a client, which is not limited herein. For ease of illustration, the service device is hereinafter referred to as a server by default.
404. The second device sends the second weight matrix to the serving device.
Similarly, the second device also sends a second weight matrix of the local target neural network to the serving device.
It should be noted that, the execution sequence of step 403 and step 404 is not limited, step 403 may be executed first, and then step 404 is executed, or step 404 may be executed first, and then step 403 is executed, or step 403 and step 404 may be executed simultaneously, which is not limited herein.
405. The service equipment integrates the first weight matrix and the second weight matrix to obtain an integrated weight matrix.
After receiving the first weight matrix sent by the first device and the second weight matrix sent by the second device, the service device further integrates the first weight matrix and the second weight matrix to obtain an integrated weight matrix.
For ease of understanding, one possible integration process is illustrated below. Assume that there are n1 training data on the first device for training the local target neural network and n2 training data on the second device for training the local target neural network. Denote the weight matrix trained by the first device in the (t+1)-th iteration as W_{t+1}^{k1}, and the weight matrix trained by the second device in the (t+1)-th iteration as W_{t+1}^{k2}. Then the integrated weight matrix W_{t+1} obtained by the service device can be described as shown in the following formula (10):

W_{t+1} = (n1 · W_{t+1}^{k1} + n2 · W_{t+1}^{k2}) / N    (10)
where N is the total amount of training data on the first device and the second device, i.e. N = n1 + n2.
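The integration in formula (10) is a data-size-weighted average and can be sketched as follows (the function name is illustrative; the patent does not prescribe an implementation):

```python
import numpy as np

def integrate_weights(weight_matrices, sample_counts):
    # Formula (10)/(11): weight each client's matrix by its amount of
    # training data, then divide by the total amount N.
    n_total = float(sum(sample_counts))
    return sum(n * w for n, w in zip(sample_counts, weight_matrices)) / n_total
```

For example, with n1 = 1 and n2 = 3, a client whose matrix is all ones and another whose matrix is all threes integrate to a matrix of 2.5s.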
In an embodiment of the application, after receiving the integrated weight matrix, the first device may apply the integrated weight matrix to the target neural network as the final weight matrix of the target neural network.
It should be noted that, in some embodiments of the present application, iterative training may also be performed to improve the application performance of the target neural network, that is, some embodiments of the present application may further include steps 406 to 409.
406. The service device sends the integrated weight matrix to the first device and the second device respectively.
After the service device completes the integration of the first weight matrix and the second weight matrix to obtain an integrated weight matrix, the service device sends the integrated weight matrix to the first device and the second device, respectively.
407. The first device updates a current weight matrix of the local target neural network to an integrated weight matrix.
After receiving the integrated weight matrix sent by the service device, the first device updates the current weight matrix of the local target neural network to the integrated weight matrix.
408. The second device updates a current weight matrix of the local target neural network to an integrated weight matrix.
Similarly, the second device, after receiving the integrated weight matrix sent by the serving device, will also update the current weight matrix of the local target neural network to the integrated weight matrix.
It should be noted that, in some embodiments of the present application, the service device in step 406 may also send the obtained integrated weight matrix to only one of the clients participating in federated learning (e.g. the first device or the second device), so that that client updates the current weight matrix of its local target neural network to the integrated weight matrix. The service device in step 406 may also send the obtained integrated weight matrix to several of the clients participating in federated learning (e.g. with n clients in total, send the integrated weight matrix to only m of them, where n > m ≥ 2) or to all of them (e.g. send the integrated weight matrix to all n clients), so that those clients each update the current weight matrix of their local target neural network to the integrated weight matrix; this is not limited in the present application.
It should be noted that, step 407 and step 408 are not limited by the execution sequence, step 407 may be executed first, and then step 408 is executed, or step 408 may be executed first, and then step 407 is executed, or step 407 and step 408 are executed simultaneously, which is not limited herein.
409. And repeating the steps 401 to 408 until the training termination condition is reached.
After the first device and the second device respectively update the current weight matrix of the local target neural network to the integrated weight matrix, steps 401 to 408 may be repeatedly performed until a training termination condition is reached.
It should be noted that, in some embodiments of the present application, the ways of determining that the training termination condition is reached include, but are not limited to: 1) when the number of updates of the integrated weight matrix reaches a preset number, the training termination condition is considered to be reached; 2) when the difference between two adjacent integrated weight matrices is smaller than a preset threshold, the training termination condition is considered to be reached.
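The two termination conditions can be sketched as follows (a hedged illustration; the function name, the element-wise maximum as the "difference value", and the argument layout are our own assumptions):

```python
import numpy as np

def reached_termination(matrix_history, max_updates, threshold):
    # matrix_history holds the integrated weight matrix of each round so far.
    if len(matrix_history) >= max_updates:   # condition 1: preset number of updates
        return True
    if len(matrix_history) >= 2:             # condition 2: two adjacent integrated
        diff = np.abs(matrix_history[-1] - matrix_history[-2]).max()
        return bool(diff < threshold)        # matrices differ by less than a threshold
    return False
```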
It should be further noted that in the embodiment corresponding to fig. 4, 2 clients, namely the first device and the second device, are taken as an example. In practical applications there may be more than 2 clients, in which case the differences are: 1) for each client, the process of training the local target neural network with the constructed target loss function and its local training set is similar, and details are not repeated here; 2) in step 405, the service device integrates the weight matrices sent by all the clients to obtain the integrated weight matrix. For example, assuming there are k clients, the service device receives k weight matrices, namely the first weight matrix, the second weight matrix, …, and the k-th weight matrix, and then obtains the integrated weight matrix based on the first weight matrix, the second weight matrix, …, and the k-th weight matrix. For ease of understanding, one possible integration process is illustrated below. Assume that there are n1 training data on the first device, n2 training data on the second device, …, and nk training data on the k-th device for training the respective local target neural networks. Similarly, denote the weight matrix trained in the (t+1)-th iteration by the first device as W_{t+1}^{k1}, by the second device as W_{t+1}^{k2}, …, and by the k-th device as W_{t+1}^{kk}. Then the integrated weight matrix W_{t+1} obtained by the service device can be described as shown in the following formula (11):

W_{t+1} = (n1 · W_{t+1}^{k1} + n2 · W_{t+1}^{k2} + … + nk · W_{t+1}^{kk}) / N    (11)
where N is the total amount of training data on the first device, the second device, …, and the k-th device, i.e. N = n1 + n2 + … + nk.
In summary, the neural network training method based on federal learning according to the embodiment of the present application includes two parts (specifically, refer to fig. 7): a. the server integrates the weight matrix of the local target neural network of each client and returns the updated integrated weight matrix to the client; b. and each client updates the weight matrix of each local target neural network by using each local training data and uploads the weight matrix to the server. The two parts are iterated for a plurality of times until the network parameters converge.
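The alternating procedure of parts (a) and (b) can be sketched as one round of a simulation loop (a toy illustration with scalar "weight matrices" and hypothetical client dictionaries; a real deployment exchanges full matrices over the network):

```python
def federated_round(clients, integrate):
    # Part (b): each client updates its local weight matrix on its own data.
    matrices, counts = [], []
    for c in clients:
        c["weights"] = c["local_update"](c["weights"], c["data"])
        matrices.append(c["weights"])
        counts.append(len(c["data"]))
    # Part (a): the server integrates and returns the result to every client.
    integrated = integrate(matrices, counts)
    for c in clients:
        c["weights"] = integrated
    return integrated

def weighted_average(matrices, counts):
    # Data-size-weighted integration, as in formula (11).
    return sum(n * m for n, m in zip(counts, matrices)) / sum(counts)
```

Repeating federated_round until the training termination condition is met mirrors step 409.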
In the above embodiments of the present application, the federated learning problem is addressed by constructing a target loss function different from that of the traditional federated learning method. The target loss function considers not only the contribution of the labeled data to the target loss function, but also the contribution of the unlabeled data, thereby improving the performance of the target neural network obtained by the joint training of multiple clients (i.e. the first device and the second device) based on this target loss function.
Second, reasoning phase
In this embodiment of the application, the inference phase describes a process of the executing device 210 in fig. 3 processing input data (e.g., an image) by using the trained target neural network 201, specifically please refer to fig. 8, where fig. 8 is a flowchart of a data processing method provided in this embodiment of the application, and specifically includes the following steps:
801. the execution device obtains input data related to the target task.
First, the execution device obtains input data to be processed, which is related to a target task to be executed, for example, when the target task is a classification task, the input data refers to data for classification.
It should be noted that, in the embodiment of the present application, the input data may be image data, audio data, or text data, and the data type of the input data is determined by the target task to be processed, which is not limited herein.
802. The execution equipment processes input data through a trained target neural network to obtain output data, a weight matrix of the trained target neural network is obtained by joint training of first equipment and second equipment based on a constructed target loss function respectively by using a local first training set and a local second training set, the first training set comprises first labeled data and first unlabeled data, and the second training set comprises second labeled data and second unlabeled data.
Then, the execution device processes the input data with the trained target neural network to obtain output data (i.e. a prediction classification result). The weight matrix of the trained target neural network is obtained by joint training of the first device and the second device, each using its local training set (the first training set and the second training set respectively), based on the constructed target loss function; the first training set includes first labeled data and first unlabeled data, and the second training set includes second labeled data and second unlabeled data. That is, the trained target neural network is trained based on the target loss function constructed in the embodiment corresponding to fig. 4; for how this training is performed, reference may be made to the related description of that embodiment, and details are not repeated here.
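As a toy illustration of the inference step (not the patent's architecture: a single linear layer stands in for the trained target neural network, and the function name is hypothetical), classification reduces to applying the integrated weight matrix to the input data and taking the arg-max:

```python
import numpy as np

def classify(input_vector, weight_matrix):
    # Apply the (integrated) weight matrix to the input data and return the
    # index of the largest logit as the prediction classification result.
    logits = np.asarray(input_vector) @ weight_matrix
    return int(np.argmax(logits))
```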
It should be noted that, the above is described by taking 2 clients as an example, if there are more than 2 clients, for example, the first device, the second device, … …, and the nth device, where n is greater than 2, the weight matrix of the trained target neural network is obtained by joint training of the first device, the second device, … …, and the nth device based on the constructed target loss function, respectively using the local first training set, the second training set, … …, and the nth training set, where the first training set includes the first labeled data and the first unlabeled data, and the second training set includes the second labeled data and the second unlabeled data, and … …, and the nth training set includes the nth labeled data and the nth unlabeled data.
In summary, the shared neural network model (i.e., the trained target neural network, and the initial neural network model of each client is the same) trained by the neural network training method based on federal learning according to the embodiment of the present application may be used in various application scenarios such as computer vision, natural language processing, and speech, and the target neural network may be used in a device capable of operating a neural network, for example, a mobile phone, a tablet computer, and various embedded and portable devices, and may be specifically shown in fig. 9.
In order to more intuitively appreciate the beneficial effects brought by the embodiments of the present application, the technical effects brought by the embodiments are further compared below. Specifically, the present application was tested on two commonly used datasets, CIFAR-10 and MNIST. First, for the MNIST dataset, it is assumed that each client owns data of all 10 classes, but only P of those classes are labeled. The results are shown in Table 1.
TABLE 1 Classification results on MNIST data set 1
As can be seen from Table 1, the method greatly surpasses the results obtained by training each client with only its local labeled data, and comes very close to the results obtained by training after labeling all local training data on each client.
Then, the present application performed an experiment in which each client is labeled with a different number of classes; the results are shown in Table 2.
TABLE 2 Classification results on MNIST data set 2
The results in both Table 1 and Table 2 assume that each client owns data of all categories (some labeled, some unlabeled), i.e. independent and identically distributed (iid) data. Next, the present application assumes that each client owns data of only part of the categories; in this non-independent and identically distributed (non-iid) case, the classification results are shown in Table 3 (10 clients in total).
TABLE 3 Classification results on MNIST data set 3
Second, the present application performed experiments on the CIFAR-10 dataset; the results are shown in Table 4.
TABLE 4 Classification results on CIFAR-10 data set
As can be seen from Table 4, the method of the present application is far better than the results of training each client with only its local labeled data, and is close to the results of training after labeling all local training data on each client.
Since the trained target neural network of the embodiments of the present application can be used for task processing (e.g. image processing, audio processing, semantic analysis, etc.) in fields such as intelligent security, smart city, and intelligent terminals, for example for common tasks in computer vision such as face recognition, image classification, target detection, and semantic segmentation, several concrete product application scenarios are introduced below.
(1) Target detection
As an example, the target neural network trained by the training method of the embodiment of the present application may be used for target detection on a terminal (e.g. a mobile phone, a smart watch, a personal computer, etc.). Referring to fig. 10, taking a mobile phone as the terminal, when a user takes a picture with the mobile phone, targets such as human faces and animals can be captured automatically, which helps the mobile phone focus and beautify automatically. The trained target neural network can be applied to the mobile phone: because its training data consists of both labeled and unlabeled data on multiple clients, the rich training data and the constructed target loss function improve the performance of the target neural network, making the mobile phone smoother when performing target detection. This smoothness brings better user experience and improves the quality of the mobile phone product.
(2) Automatic driving scene segmentation
As another example, the trained target neural network of the present application may also be used for automatic driving scene segmentation on a wheeled mobile device (e.g. an automatic driving vehicle, an assisted driving vehicle, etc.). Referring to fig. 11, taking an automatic driving vehicle as the wheeled mobile device, automatic driving scene segmentation is a semantic segmentation problem: the camera of the automatic driving vehicle captures a road image, and the image needs to be segmented to separate different objects such as the road surface, the roadbed, vehicles, and pedestrians, so that the vehicle stays in the correct safe area. For automatic driving, where safety requirements are extremely high, each picture must be understood in real time, so a target neural network that can run semantic segmentation in real time is of great importance. Since the training data of the target neural network consists of labeled and unlabeled data on multiple clients, the rich training data and the constructed target loss function improve its performance and make it run faster, which can well meet the series of requirements that automatic driving vehicles place on a neural network; the trained target neural network can therefore also serve as the neural network model for automatic driving scene segmentation on wheeled mobile devices.
The wheel-type moving equipment described in the present application may be a wheel-type robot, a wheel-type construction equipment, an autonomous vehicle, or the like, and any equipment having a movable wheel-type may be the wheel-type moving equipment described in the present application. In addition, the autonomous vehicle described above in the present application may be a car, a truck, a motorcycle, a bus, a boat, an airplane, a helicopter, a lawn mower, an amusement car, a playground vehicle, construction equipment, an electric car, a golf cart, a train, a cart, or the like, and the present embodiment is not particularly limited.
(3) Face recognition
As another example, the trained target neural network of the present application can also be used for face recognition (e.g. face verification at an entrance gate). Referring to fig. 12, face recognition is an image-similarity comparison problem: on gates at entrances of high-speed rail stations, airports, and the like, when a passenger performs face authentication, a camera captures a face image, the neural network extracts its features, and similarity calculation is performed against the features of the identity-document image stored in the system; if the similarity is high, verification succeeds. Feature extraction by the neural network is the most time-consuming step, so an efficient neural network is needed to perform feature extraction for fast face verification. Since the training data of the target neural network trained by the present application consists of labeled and unlabeled data on multiple clients, the rich training data and the constructed target loss function improve its performance and make it run faster, which can well meet the series of requirements that the face recognition application scenario places on a neural network.
(4) Speech recognition
As another example, the trained target neural network of the present application can also be used for speech recognition (e.g., a translator performing simultaneous interpretation). Referring specifically to fig. 13, simultaneous interpretation is a speech recognition and machine translation problem. For speech recognition and machine translation, a neural network is likewise a common recognition model. A scenario requiring simultaneous interpretation must achieve real-time speech recognition and translation, which requires the neural network deployed on the device to compute quickly. Because the training data adopted by the target neural network trained in the present application are the labeled data and unlabeled data on a plurality of clients, the abundant training data and the constructed target loss function improve its performance and make it run faster, so that it can well meet the series of requirements that the speech recognition scenario places on a neural network.
It should be noted that the target neural network trained by the neural network training method based on federal learning according to the present application can be applied not only to the application scenarios described in fig. 10 to fig. 13, but also to various subdivided fields in the artificial intelligence field, such as the image processing field, the computer vision field, the semantic analysis field, and so on.
On the basis of the above embodiments, related equipment for implementing the above aspects of the embodiments of the present application is also provided below. Referring specifically to fig. 14, fig. 14 is a schematic diagram of a training device provided in an embodiment of the present application. The training device 1400, as a first device, may specifically include an acquisition module 1401, a training module 1402, a sending module 1403, and a receiving module 1404. The acquisition module 1401 is configured to obtain a target loss function according to a first loss function and a second loss function, where the first loss function is used for representing the probability of a prediction error for labeled data input to a neural network, and the second loss function is used for representing the probability that a prediction result obtained by predicting unlabeled data input to the neural network does not belong to a preset classification category. The training module 1402 is configured to train a target neural network on the first device with the target loss function using a first training set on the first device, and to update the current weight matrix of the target neural network on the first device to a first weight matrix, where the first training set includes first labeled data and first unlabeled data. The sending module 1403 is configured to send the first weight matrix to a service device. The receiving module 1404 is configured to receive an integrated weight matrix sent by the service device, where the integrated weight matrix is obtained by the service device integrating the first weight matrix and a second weight matrix, and the second weight matrix is obtained by the second device training a target neural network on the second device with the target loss function using a second training set on the second device, where the second training set includes second labeled data and second unlabeled data. It should be noted that the network structures of the target neural network on the first device and the target neural network on the second device are the same, that is, the initial network models are the same; only in the subsequent training process do the weight matrices of the respectively trained network models differ from each other.
In the above embodiment of the present application, the federated learning problem in which only part of the data on a client (i.e., the training device 1400) is labeled is solved by constructing a target loss function different from that of the conventional federated learning approach. The target loss function considers not only the contribution of the labeled data but also the contribution of the unlabeled data, thereby improving the performance of the target neural network jointly trained by a plurality of clients based on this target loss function.
In one possible design, the first loss function includes a first sub-loss function and a second sub-loss function, where the first sub-loss function is used to characterize the probability that the prediction result obtained by predicting the labeled data input to the neural network does not belong to the classification category corresponding to the labeled data; the second sub-loss function is used for representing the probability that the prediction result obtained by predicting the labeled data input into the neural network does not belong to the preset classification category.
In the above embodiment of the present application, it is specifically stated that the first loss function may further include two sub-parts, each of which contributes differently to the overall target neural network, which provides flexibility.
In one possible design, the obtaining module 1401 is specifically configured to: and adding the first loss function and the second loss function to obtain the target loss function.
In the above embodiment of the present application, a specific form in which the first loss function and the second loss function constitute the target loss function is described, which makes the design realizable.
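As a purely illustrative aid (not part of the claimed method), the additive form of the target loss function can be sketched in Python. The softmax output model, the negative-log-likelihood form of the first loss, and all function names are assumptions introduced for illustration:

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the class axis
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def first_loss(logits, labels):
    # probability of a prediction error on labeled data, written here
    # as the negative log-likelihood of the true class
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def second_loss(logits, preset_classes):
    # probability that a prediction on unlabeled data does not
    # belong to the preset classification categories
    p = softmax(logits)
    return -np.mean(np.log(p[:, preset_classes].sum(axis=1) + 1e-12))

def target_loss(labeled_logits, labels, unlabeled_logits, preset_classes):
    # this embodiment: target loss = first loss + second loss
    return (first_loss(labeled_logits, labels)
            + second_loss(unlabeled_logits, preset_classes))
```

The second term pushes predictions on unlabeled data toward the preset categories, which is how the unlabeled data contributes to the target loss function.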
In one possible design, the obtaining module 1401 is further configured to: obtaining a target loss function according to the first loss function, the second loss function and a third loss function, wherein the third loss function is used for representing the probability that a prediction result obtained by predicting target labeled data input into the neural network does not belong to a target preset classification category, the target labeled data belongs to first labeled data on the first equipment and does not belong to second labeled data on the second equipment, and the target preset classification category is a classification category preset on the second equipment.
In the foregoing embodiments of the present application, it is specifically stated that the constructed target loss function may further include a third loss function in addition to the first loss function and the second loss function, where the third loss function is used to further distinguish each category in the unlabeled data, so as to improve the reliability of the target loss function.
In one possible design, the obtaining module 1401 is further configured to: adding the first loss function and the second loss function to obtain an addition result; and subtracting the third loss function from the addition result to obtain the target loss function.
In the above embodiment of the present application, a specific form in which the first loss function, the second loss function, and the third loss function constitute the target loss function is described, which makes the design realizable.
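Under illustrative assumptions (softmax outputs, negative-log-likelihood losses, hypothetical helper names not taken from the patent), the three-term form can be sketched as follows; reusing one out-of-set loss for both the second and third terms is an assumption made for compactness:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll_loss(logits, labels):
    # first loss: probability of mispredicting labeled data (as NLL)
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def out_of_set_loss(logits, class_set):
    # -log P(prediction lies inside class_set): used both for the
    # second loss (unlabeled data vs. preset categories) and for the
    # third loss (target labeled data vs. target preset categories)
    p = softmax(logits)
    return -np.mean(np.log(p[:, class_set].sum(axis=1) + 1e-12))

def three_term_target_loss(labeled_logits, labels,
                           unlabeled_logits, preset,
                           target_labeled_logits, target_preset):
    first = nll_loss(labeled_logits, labels)
    second = out_of_set_loss(unlabeled_logits, preset)
    third = out_of_set_loss(target_labeled_logits, target_preset)
    # add the first and second losses, then subtract the third
    return (first + second) - third
```

Subtracting the third term rewards predictions for the first device's unique labeled data that fall outside the categories preset on the second device, which helps keep the two devices' category sets distinguishable.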
In one possible design, the training device 1400 further includes a triggering module 1405, configured to: take the integrated weight matrix as the current weight matrix of the target neural network on the first device, and trigger the training module 1402, the sending module 1403, and the receiving module 1404 to execute again; this step is repeated until a training termination condition is reached.
In the above embodiment of the present application, the training device 1400 may repeatedly train the local target neural network with the updated integrated weight matrix to improve the performance of the target neural network.
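The repeated interaction described above can be sketched as a toy simulation in Python. Everything below is illustrative only: the local training step, the averaging rule attributed to the service device, and all names are assumptions rather than the claimed method.

```python
import numpy as np

def local_train(weights):
    # stand-in for minimizing the target loss function on the local
    # training set; a real client would run SGD over its labeled and
    # unlabeled data
    return weights - 0.1 * np.sign(weights)

def server_integrate(matrices):
    # stand-in for the service device: average the uploaded matrices
    return np.mean(matrices, axis=0)

# both clients start from the same initial network (same structure)
w1 = w2 = np.ones((2, 2))
for _ in range(3):                        # until a termination condition
    w1 = local_train(w1)                  # first device: first weight matrix
    w2 = local_train(w2)                  # second device trains likewise
    integrated = server_integrate([w1, w2])   # uploaded to / returned by server
    w1 = w2 = integrated                  # becomes the current weight matrix
```

In a real deployment the devices would communicate with the service device over a network rather than in-process, but the control flow (train, send, receive, replace, repeat) is the same.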
It should be noted that, the contents of information interaction, execution process, and the like between the modules/units in the training device 1400 are based on the same concept as the method embodiment corresponding to fig. 4 in the present application, and specific contents may refer to the description of the step executed by the first device in the foregoing method embodiment in the present application, and are not described again here.
An embodiment of the present application further provides a training device. Referring specifically to fig. 15, fig. 15 is a schematic diagram of a training device provided in an embodiment of the present application. The training device 1500, as a service device, may specifically include a receiving module 1501, an integrating module 1502, and a sending module 1503. The receiving module 1501 is configured to receive a first weight matrix sent by a first device and a second weight matrix sent by a second device, where the first weight matrix is obtained by the first device training a target neural network on the first device with a target loss function using a first training set on the first device, and the second weight matrix is obtained by the second device training a target neural network on the second device with the target loss function using a second training set on the second device. The network structures of the target neural network on the first device and the target neural network on the second device are the same, that is, the initial network models are the same; only in the subsequent training process do the weight matrices of the respectively trained network models differ.
In addition, the target loss function is obtained according to a first loss function and a second loss function, both of which are constructed according to a target task. The first loss function is used for representing the probability of a prediction error for labeled data input to the neural network, and the second loss function is used for representing the probability that a prediction result obtained by predicting unlabeled data input to the neural network does not belong to a preset classification category, where labeled data is training data labeled with a classification category and unlabeled data is training data not labeled with a classification category. The first training set includes first labeled data and first unlabeled data, and the second training set includes second labeled data and second unlabeled data. The integrating module 1502 is configured to obtain an integrated weight matrix according to the first weight matrix and the second weight matrix. The sending module 1503 is configured to send the integrated weight matrix to the first device, so that the first device uses the integrated weight matrix as the current weight matrix of the target neural network on the first device.
In the above embodiment of the present application, the federated learning problem is solved by constructing a target loss function different from that of the conventional federated learning approach. The target loss function considers not only the contribution of the labeled data but also the contribution of the unlabeled data, thereby improving the performance of the target neural network jointly trained by a plurality of clients (i.e., the first device and the second device) based on this target loss function.
In one possible design, the training device 1500 further includes a triggering module 1504, configured to trigger the receiving module 1501, the integrating module 1502, and the sending module 1503 to repeatedly execute the above steps until a training termination condition is reached.
In the above embodiment of the present application, after the training device 1500 (as the service device) sends the integrated weight matrix to the first device and the second device, the above process may be repeated, so that the first device and the second device can repeatedly train their local target neural networks with the updated integrated weight matrix to improve the performance of the target neural network.
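The integration step performed by the service device is not fixed to a particular rule by this embodiment; the sketch below shows one plausible assumption, a FedAvg-style average with an optional sample-weighted variant:

```python
import numpy as np

def integrate(weight_matrices, sample_counts=None):
    # plain average of the uploaded weight matrices, or an average
    # weighted by each client's training-set size if counts are given;
    # both are illustrative choices, not mandated by the embodiment
    if sample_counts is None:
        return np.mean(weight_matrices, axis=0)
    w = np.asarray(sample_counts, dtype=float)
    w /= w.sum()
    return np.tensordot(w, np.asarray(weight_matrices), axes=1)
```

Either form yields one integrated weight matrix to be sent back to the first device and the second device as their new current weight matrix.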
It should be noted that, the contents of information interaction, execution process, and the like between the modules/units in the training device 1500 are based on the same concept as the method embodiment corresponding to fig. 4 in the present application, and specific contents may refer to the description of the steps executed by the service device in the foregoing method embodiment in the present application, and are not described again here.
An execution device is further provided in an embodiment of the present application. Referring specifically to fig. 16, fig. 16 is a schematic diagram of an execution device provided in an embodiment of the present application. The execution device 1600 may specifically include an obtaining module 1601 and a processing module 1602, where the obtaining module 1601 is configured to obtain input data related to a target task, and the processing module 1602 is configured to process the input data through the trained target neural network to obtain output data, where the weight matrix of the trained target neural network is the integrated weight matrix obtained by the method in the embodiment corresponding to fig. 4.
It should be noted that, the contents of information interaction, execution process, and the like between the modules/units in the execution device 1600 are based on the same concept as the method embodiment corresponding to fig. 8 in the present application, and specific contents may refer to the description in the foregoing method embodiment in the present application, and are not described herein again.
Referring to fig. 17, fig. 17 is a schematic structural diagram of a training device provided in an embodiment of the present application. When the training device 1700 serves as the first device, the training device 1400 described in the embodiment corresponding to fig. 14 may be deployed on the training device 1700 to implement the functions of the training device 1400 in that embodiment; when the training device 1700 serves as the service device, the training device 1500 described in the embodiment corresponding to fig. 15 may be deployed on the training device 1700 to implement the functions of the training device 1500 in that embodiment. Specifically, the training device 1700 is implemented by one or more servers and may vary considerably with configuration or performance; it may include one or more central processing units (CPUs) 1722, memory 1732, and one or more storage media 1730 (e.g., one or more mass storage devices) storing applications 1742 or data 1744. The memory 1732 and the storage medium 1730 may be transitory storage or persistent storage. The program stored on the storage medium 1730 may include one or more modules (not shown), each of which may include a series of instruction operations on the training device 1700. Further, the central processing unit 1722 may be configured to communicate with the storage medium 1730 to execute the series of instruction operations in the storage medium 1730 on the training device 1700.
The training device 1700 may also include one or more power supplies 1726, one or more wired or wireless network interfaces 1750, one or more input/output interfaces 1758, and/or one or more operating systems 1741, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
In this embodiment, the central processing unit 1722 is configured to execute the steps executed by the first device or the service device in the embodiment corresponding to fig. 4, which is not described herein again in detail.
Referring to fig. 18, fig. 18 is a schematic structural diagram of an execution device provided in an embodiment of the present application. The execution device 1800 may be embodied as various terminal devices, such as a virtual reality (VR) device, a mobile phone, a tablet, a laptop, an intelligent wearable device, a monitoring data processing device, or a radar data processing device, which is not limited herein. The execution device 1600 described in the embodiment corresponding to fig. 16 may be deployed on the execution device 1800 to implement the functions of the execution device 1600 in that embodiment. Specifically, the execution device 1800 includes: a receiver 1801, a transmitter 1802, a processor 1803, and a memory 1804 (the number of processors 1803 in the execution device 1800 may be one or more; one processor is taken as an example in fig. 18), where the processor 1803 may include an application processor 18031 and a communication processor 18032. In some embodiments of the present application, the receiver 1801, the transmitter 1802, the processor 1803, and the memory 1804 may be connected by a bus or otherwise.
The memory 1804 may include both read-only memory and random access memory, and provides instructions and data to the processor 1803. A portion of the memory 1804 may also include non-volatile random access memory (NVRAM). The memory 1804 stores processor-executable operating instructions, executable modules, or data structures, or a subset or an extended set thereof, where the operating instructions may include various operating instructions for implementing various operations.
The processor 1803 controls the operation of the execution device 1800. In particular implementations, the various components of the execution device 1800 are coupled together by a bus system that may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus. For clarity of illustration, the various buses are referred to in the figures as a bus system.
The method disclosed in the above embodiment corresponding to fig. 8 may be implemented in the processor 1803 or implemented by the processor 1803. The processor 1803 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 1803 or by instructions in the form of software. The processor 1803 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The processor 1803 may implement or perform the methods, steps, and logic block diagrams disclosed in the embodiment corresponding to fig. 8 of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory 1804, and the processor 1803 reads the information in the memory 1804 and completes the steps of the above method in combination with its hardware.
The receiver 1801 may be used to receive input numeric or character information and to generate signal inputs related to the settings and function control of the execution device 1800. The transmitter 1802 may be used to output numeric or character information through a first interface; the transmitter 1802 may further be used to send instructions to a disk group through the first interface to modify data in the disk group; the transmitter 1802 may also include a display device such as a display screen.
In this embodiment, in one case, the processor 1803 is configured to perform corresponding data processing on input data through the trained target neural network, so as to obtain corresponding output data (i.e., a prediction classification result). The trained target neural network may be obtained by a training method corresponding to fig. 4 of the present application, and specific contents may be referred to in the description of the foregoing illustrated method embodiments of the present application, and are not described herein again.
Also provided in an embodiment of the present application is a computer-readable storage medium, in which a program for signal processing is stored, and when the program is executed on a computer, the program causes the computer to execute the steps executed by the training apparatus described in the foregoing illustrated embodiment, or causes the computer to execute the steps executed by the execution apparatus described in the foregoing illustrated embodiment shown in fig. 4 or fig. 8.
The training device, the execution device and the like provided by the embodiment of the application can be specifically chips, and the chips comprise: a processing unit, which may be for example a processor, and a communication unit, which may be for example an input/output interface, a pin or a circuit, etc. The processing unit may execute the computer executable instructions stored by the storage unit to cause the chip within the training apparatus to perform the steps performed by the training apparatus described in the illustrated embodiment described above, or to cause the chip within the execution apparatus to perform the steps performed by the execution apparatus described in the embodiment illustrated in fig. 4 or fig. 8 described above.
Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, and the like, and the storage unit may also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a Random Access Memory (RAM), and the like.
Specifically, referring to fig. 19, fig. 19 is a schematic structural diagram of a chip provided in an embodiment of the present application. The chip may be embodied as a neural network processing unit (NPU) 200. The NPU 200 is mounted on a host CPU as a coprocessor, and the host CPU allocates tasks to it. The core portion of the NPU is the arithmetic circuit 2003; the controller 2004 controls the arithmetic circuit 2003 to extract matrix data from memory and perform multiplication.
In some implementations, the arithmetic circuit 2003 internally includes a plurality of processing units (PEs). In some implementations, the arithmetic circuitry 2003 is a two-dimensional systolic array. The arithmetic circuit 2003 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 2003 is a general purpose matrix processor.
For example, assume that there is an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to the matrix B from the weight memory 2002 and buffers it in each PE in the arithmetic circuit. The arithmetic circuit takes the matrix a data from the input memory 2001 and performs matrix arithmetic with the matrix B, and partial results or final results of the obtained matrix are stored in an accumulator (accumulator) 2008.
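The flow just described, in which partial results of the matrix product are summed in the accumulator 2008, can be illustrated with a small Python sketch; the tiling over the inner dimension is an assumption made only to make the accumulation step visible:

```python
import numpy as np

def npu_matmul(A, B, tile=2):
    # B is buffered into the PEs of the arithmetic circuit; rows of A
    # stream in, and partial products are summed in the accumulator
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))                      # plays the role of accumulator 2008
    for t in range(0, k, tile):               # partial results over the inner dim
        C += A[:, t:t + tile] @ B[t:t + tile, :]
    return C
```

Regardless of the tile size, the accumulated result equals the full matrix product `A @ B`, which is why partial results and the final result can share one accumulator.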
The unified memory 2006 is used to store input data and output data. Weight data is transferred directly to the weight memory 2002 through a direct memory access controller (DMAC) 2005. Input data is also carried into the unified memory 2006 by the DMAC.
A bus interface unit (BIU) 2010 is used for interaction between the AXI bus and both the DMAC and the instruction fetch buffer (IFB) 2009.
The bus interface unit 2010 is used by the instruction fetch memory 2009 to fetch instructions from the external memory, and is also used by the storage unit access controller 2005 to fetch the original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 2006 or to transfer weight data to the weight memory 2002 or to transfer input data to the input memory 2001.
The vector calculation unit 2007 includes a plurality of operation processing units that, if necessary, further process the output of the arithmetic circuit, for example by vector multiplication, vector addition, exponential operation, logarithmic operation, or magnitude comparison. It is mainly used for network computation at non-convolutional/fully-connected layers of the neural network, such as batch normalization, pixel-level summation, and up-sampling of a feature plane.
In some implementations, the vector calculation unit 2007 can store the processed output vector to the unified memory 2006. For example, the vector calculation unit 2007 may apply a linear function and/or a non-linear function to the output of the arithmetic circuit 2003, such as linear interpolation of the feature planes extracted by the convolutional layers or a vector of accumulated values, to generate activation values. In some implementations, the vector calculation unit 2007 generates normalized values, pixel-level summed values, or both. In some implementations, the processed output vector can be used as an activation input to the arithmetic circuit 2003, for example for use in subsequent layers of the neural network.
An instruction fetch buffer 2009 connected to the controller 2004 is used to store instructions used by the controller 2004. The unified memory 2006, the input memory 2001, the weight memory 2002, and the instruction fetch memory 2009 are all on-chip memories; the external memory is private to the NPU hardware architecture.
Wherein any of the aforementioned processors may be a general purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits configured to control the execution of the programs of the method of the first aspect.
It should be noted that the above-described embodiments of the apparatus are merely schematic, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiments of the apparatus provided in the present application, the connection relationship between the modules indicates that there is a communication connection therebetween, and may be implemented as one or more communication buses or signal lines.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus necessary general-purpose hardware, and certainly can also be implemented by special-purpose hardware including application-specific integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components, and the like. Generally, functions performed by a computer program can be easily implemented by corresponding hardware, and the specific hardware structure for implementing the same function may take various forms, such as an analog circuit, a digital circuit, or a dedicated circuit. For the present application, however, a software implementation is preferable in most cases. Based on such understanding, the technical solutions of the present application may be substantially embodied in the form of a software product, which is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a training device, or a network device) to execute the methods according to the embodiments of the present application.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a training device or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.

Claims (26)

1. A neural network training method based on federal learning is characterized by comprising the following steps:
the method comprises the steps that a first device obtains a target loss function according to a first loss function and a second loss function, wherein the first loss function is used for representing the probability of wrong prediction of labeled data input into a neural network, and the second loss function is used for representing the probability that a prediction result obtained by predicting unlabeled data input into the neural network does not belong to a preset classification category;
the first device trains a target neural network on the first device by using a first training set on the first device by using the target loss function, and updates a current weight matrix of the target neural network on the first device into a first weight matrix, wherein the first training set comprises first labeled data and first unlabeled data;
the first device sends the first weight matrix to a service device;
the first device receives an integrated weight matrix sent by the service device, wherein the integrated weight matrix is obtained by the service device integrating the first weight matrix and a second weight matrix, the second weight matrix is obtained by the second device training a target neural network on the second device by using the target loss function and adopting a second training set on the second device, the second training set comprises second labeled data and second unlabeled data, and the network structure of the target neural network on the first device is the same as that of the target neural network on the second device.
2. The method according to claim 1, wherein
the first loss function comprises a first sub-loss function and a second sub-loss function;
the first sub-loss function represents the probability that the prediction made for labeled data input to the neural network does not belong to the classification category corresponding to that labeled data; and
the second sub-loss function represents the probability that the prediction made for labeled data input to the neural network does not belong to the preset classification category.
3. The method according to claim 2, wherein
the first loss function is the difference between the first sub-loss function and the second sub-loss function.
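Claims 2-3 define the first loss as a difference of two sub-losses. A minimal numeric sketch, reading each sub-loss directly as a probability over a softmax-style class output (an assumption; the claims do not fix the functional form):

```python
import numpy as np

def first_sub_loss(probs, label):
    # Probability that the prediction does NOT fall on the class
    # corresponding to the labeled sample.
    return 1.0 - probs[label]

def second_sub_loss(probs, preset_categories):
    # Probability that the prediction does NOT fall inside the
    # preset classification categories.
    return 1.0 - probs[list(preset_categories)].sum()

def first_loss(probs, label, preset_categories):
    # Claim 3: first loss = first sub-loss minus second sub-loss.
    return first_sub_loss(probs, label) - second_sub_loss(probs, preset_categories)

p = np.array([0.6, 0.3, 0.1])  # model output for one labeled sample
loss = first_loss(p, label=0, preset_categories=[0, 1])
```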
4. The method according to any one of claims 1-3, wherein the first device obtaining the target loss function according to the first loss function and the second loss function comprises:
the first device adding the first loss function and the second loss function to obtain the target loss function.
5. The method according to any one of claims 1-3, wherein the first device obtaining the target loss function according to the first loss function and the second loss function comprises:
the first device obtaining the target loss function according to the first loss function, the second loss function, and a third loss function, wherein the third loss function represents the probability that the prediction made for target labeled data input to the neural network does not belong to a target preset classification category, the target labeled data belongs to the first labeled data on the first device and does not belong to the second labeled data on the second device, and the target preset classification category is a classification category preset on the second device.
6. The method according to claim 5, wherein the first device obtaining the target loss function according to the first loss function, the second loss function, and the third loss function comprises:
the first device adding the first loss function and the second loss function to obtain an addition result; and
the first device subtracting the third loss function from the addition result to obtain the target loss function.
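The three-term combination of claims 5-6 might be sketched like this, treating each loss as a scalar and the third loss as a probability over the target preset categories (the probability-valued form is an assumption, not fixed by the claims):

```python
import numpy as np

def third_loss(probs, target_preset):
    # Probability that the prediction for labeled data unique to the
    # first device does not belong to the target preset categories
    # (the categories preset on the second device).
    return 1.0 - probs[list(target_preset)].sum()

def target_loss_three_term(l_first, l_second, l_third):
    # Claim 6: add the first and second losses, then subtract the
    # third loss from that addition result.
    addition_result = l_first + l_second
    return addition_result - l_third

p = np.array([0.2, 0.5, 0.3])  # output for one device-specific sample
l3 = third_loss(p, target_preset=[1])
```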
7. The method according to any one of claims 1-6, wherein after the first device receives the integrated weight matrix sent by the serving device, the method further comprises:
the first device taking the integrated weight matrix as the current weight matrix of the target neural network on the first device, and again performing the steps of: training the target neural network on the first device using the target loss function and the first training set on the first device, sending the first weight matrix to the serving device, and receiving the integrated weight matrix sent by the serving device;
the first device repeating the preceding step until a training termination condition is reached.
8. The method according to any one of claims 1-7, wherein reaching the training termination condition comprises:
the number of updates of the integrated weight matrix reaching a preset number, or the difference between the integrated weight matrices obtained in two consecutive rounds being smaller than a preset threshold.
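The two termination tests of claim 8 (a preset round budget, or a small change between consecutive integrated weight matrices) could look like the following; the max-abs-difference norm and the default values are assumptions:

```python
import numpy as np

def reached_termination(round_count, w_new, w_old,
                        preset_rounds=100, preset_threshold=1e-4):
    # Stop when the integrated weight matrix has been updated a preset
    # number of times, or when two consecutive integrated matrices
    # differ by less than a preset threshold.
    if round_count >= preset_rounds:
        return True
    return float(np.abs(w_new - w_old).max()) < preset_threshold
```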
9. A neural network training method based on federated learning, characterized by comprising:
a serving device receiving a first weight matrix sent by a first device and a second weight matrix sent by a second device, wherein the first weight matrix is obtained by the first device training a target neural network on the first device using a target loss function and a first training set on the first device, the second weight matrix is obtained by the second device training a target neural network on the second device using the target loss function and a second training set on the second device, the network structure of the target neural network on the first device is the same as that of the target neural network on the second device, the target loss function is obtained according to a first loss function and a second loss function, the first loss function represents the probability that the prediction made for labeled data input to the neural network is wrong, the second loss function represents the probability that the prediction made for unlabeled data input to the neural network does not belong to a preset classification category, the first training set comprises first labeled data and first unlabeled data, and the second training set comprises second labeled data and second unlabeled data;
the serving device obtaining an integrated weight matrix according to the first weight matrix and the second weight matrix;
the serving device sending the integrated weight matrix to the first device, so that the first device takes the integrated weight matrix as the current weight matrix of the target neural network on the first device.
10. The method according to claim 9, wherein after the serving device sends the integrated weight matrix to the first device, the method further comprises:
the serving device repeating the above steps until a training termination condition is reached.
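The serving-device side of claims 9-10 integrates the per-device weight matrices. The claims do not fix the integration rule; a FedAvg-style average, optionally weighted by local sample counts, is one plausible reading and is purely an assumption here:

```python
import numpy as np

def integrate(weight_matrices, sample_counts=None):
    # Integrated weight matrix per claim 9: combine the first and
    # second (and any further) weight matrices. Equal weighting, or
    # weighting by each device's sample count, are both assumptions.
    stack = np.stack(weight_matrices)
    if sample_counts is None:
        return stack.mean(axis=0)
    w = np.asarray(sample_counts, dtype=float)
    w = w / w.sum()
    # Weighted sum over the leading (device) axis.
    return np.tensordot(w, stack, axes=1)

a = np.ones((2, 2))
b = 3.0 * np.ones((2, 2))
m_equal = integrate([a, b])
m_weighted = integrate([a, b], sample_counts=[3, 1])
```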
11. A data processing method, characterized by comprising:
obtaining input data related to a target task;
processing the input data through a trained target neural network to obtain output data, wherein the weight matrix of the trained target neural network is an integrated weight matrix obtained by the method according to any one of claims 1-10.
12. The method of claim 11, wherein the input data comprises any one of:
image data, audio data, or text data.
13. A training apparatus, characterized in that the training apparatus, serving as a first device, comprises:
an obtaining module, configured to obtain a target loss function according to a first loss function and a second loss function, wherein the first loss function represents the probability that the neural network's prediction for labeled data input to it is wrong, and the second loss function represents the probability that the neural network's prediction for unlabeled data input to it does not belong to a preset classification category;
a training module, configured to train a target neural network on the first device using the target loss function and a first training set on the first device, and to update the current weight matrix of the target neural network on the first device to a first weight matrix, wherein the first training set comprises first labeled data and first unlabeled data;
a sending module, configured to send the first weight matrix to a serving device;
a receiving module, configured to receive an integrated weight matrix sent by the serving device, wherein the integrated weight matrix is obtained by the serving device integrating the first weight matrix with a second weight matrix, the second weight matrix is the weight matrix obtained by a second device training a target neural network on the second device using the target loss function and a second training set on the second device, the second training set comprises second labeled data and second unlabeled data, and the network structure of the target neural network on the first device is the same as that of the target neural network on the second device.
14. The apparatus according to claim 13, wherein
the first loss function comprises a first sub-loss function and a second sub-loss function;
the first sub-loss function represents the probability that the prediction made for labeled data input to the neural network does not belong to the classification category corresponding to that labeled data; and
the second sub-loss function represents the probability that the prediction made for labeled data input to the neural network does not belong to the preset classification category.
15. The apparatus according to claim 13 or 14, wherein the obtaining module is specifically configured to:
add the first loss function and the second loss function to obtain the target loss function.
16. The apparatus according to claim 13 or 14, wherein the obtaining module is further configured to:
obtain the target loss function according to the first loss function, the second loss function, and a third loss function, wherein the third loss function represents the probability that the prediction made for target labeled data input to the neural network does not belong to a target preset classification category, the target labeled data belongs to the first labeled data on the first device and does not belong to the second labeled data on the second device, and the target preset classification category is a classification category preset on the second device.
17. The apparatus according to claim 16, wherein the obtaining module is further configured to:
add the first loss function and the second loss function to obtain an addition result; and
subtract the third loss function from the addition result to obtain the target loss function.
18. The apparatus according to any one of claims 13-17, further comprising a triggering module configured to:
take the integrated weight matrix as the current weight matrix of the target neural network on the first device, and trigger the training module, the sending module, and the receiving module to execute again;
repeat the preceding step until a training termination condition is reached.
19. A training apparatus, characterized in that the training apparatus, serving as a serving device, comprises:
a receiving module, configured to receive a first weight matrix sent by a first device and a second weight matrix sent by a second device, wherein the first weight matrix is obtained by the first device training a target neural network on the first device using a target loss function and a first training set on the first device, the second weight matrix is obtained by the second device training a target neural network on the second device using the target loss function and a second training set on the second device, the network structure of the target neural network on the first device is the same as that of the target neural network on the second device, the target loss function is obtained according to a first loss function and a second loss function, the first loss function and the second loss function are constructed according to a target task, the first loss function represents the probability that the prediction made for labeled data input to the neural network is wrong, the second loss function represents the probability that the prediction made for unlabeled data input to the neural network does not belong to a preset classification category, the first training set comprises first labeled data and first unlabeled data, and the second training set comprises second labeled data and second unlabeled data;
an integration module, configured to obtain an integrated weight matrix according to the first weight matrix and the second weight matrix;
a sending module, configured to send the integrated weight matrix to the first device, so that the first device takes the integrated weight matrix as the current weight matrix of the target neural network on the first device.
20. The apparatus according to claim 19, further comprising a triggering module configured to:
trigger the receiving module, the integration module, and the sending module to execute repeatedly until a training termination condition is reached.
21. An execution device, characterized by comprising:
an obtaining module, configured to obtain input data related to a target task;
a processing module, configured to process the input data through a trained target neural network to obtain output data, wherein the weight matrix of the trained target neural network is an integrated weight matrix obtained by the method according to any one of claims 1-10.
22. A training device, comprising a processor and a memory, the processor being coupled to the memory, wherein
the memory is configured to store a program; and
the processor is configured to execute the program in the memory, so that the training device performs the method according to any one of claims 1-10.
23. An execution device, comprising a processor and a memory, the processor being coupled to the memory, wherein
the memory is configured to store a program; and
the processor is configured to execute the program in the memory, so that the execution device performs the method according to any one of claims 11-12.
24. A computer-readable storage medium comprising a program which, when run on a computer, causes the computer to perform the method of any one of claims 1-12.
25. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1-12.
26. A chip, comprising a processor and a data interface, wherein the processor reads, through the data interface, instructions stored on a memory, to perform the method according to any one of claims 1-12.
CN202110639374.8A 2021-06-08 2021-06-08 Neural network training method and device based on federal learning Active CN113516227B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110639374.8A CN113516227B (en) 2021-06-08 2021-06-08 Neural network training method and device based on federal learning


Publications (2)

Publication Number Publication Date
CN113516227A true CN113516227A (en) 2021-10-19
CN113516227B CN113516227B (en) 2023-04-18

Family

ID=78065788

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110639374.8A Active CN113516227B (en) 2021-06-08 2021-06-08 Neural network training method and device based on federal learning

Country Status (1)

Country Link
CN (1) CN113516227B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170011280A1 (en) * 2015-07-07 2017-01-12 Xerox Corporation Extracting gradient features from neural networks
US20180158181A1 (en) * 2016-12-02 2018-06-07 Tata Consultancy Services Limited System and method for layer-wise training of deep neural networks
US20200342326A1 (en) * 2019-04-25 2020-10-29 Booz Allen Hamilton Inc. System and method for training a neural network system
CN112085041A (en) * 2019-06-12 2020-12-15 北京地平线机器人技术研发有限公司 Training method and training device for neural network and electronic equipment
CN112488104A (en) * 2020-11-30 2021-03-12 华为技术有限公司 Depth and confidence estimation system
CN112668716A (en) * 2020-12-29 2021-04-16 奥比中光科技集团股份有限公司 Training method and device of neural network model


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Zhongwei et al.: "A survey of knowledge reasoning research based on neural networks", Computer Engineering and Applications *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114021168A (en) * 2021-11-09 2022-02-08 深圳大学 Subway foundation pit excavation risk identification method and device based on federal learning
CN114021168B (en) * 2021-11-09 2022-05-24 深圳大学 Subway foundation pit excavation risk identification method and device based on federal learning
CN114154645A (en) * 2021-12-03 2022-03-08 中国科学院空间应用工程与技术中心 Cross-center image joint learning method and system, storage medium and electronic equipment
CN114154645B (en) * 2021-12-03 2022-05-17 中国科学院空间应用工程与技术中心 Cross-center image joint learning method and system, storage medium and electronic equipment
EP4246379A1 (en) * 2022-03-15 2023-09-20 Zenseact AB System and method for federated learning of self-supervised networks in automated driving systems
WO2024183052A1 (en) * 2023-03-09 2024-09-12 Nvidia Corporation Federated learning technique
CN117910539A (en) * 2024-03-19 2024-04-19 电子科技大学 Household characteristic recognition method based on heterogeneous semi-supervised federal learning
CN117910539B (en) * 2024-03-19 2024-05-31 电子科技大学 Household characteristic recognition method based on heterogeneous semi-supervised federal learning


Similar Documents

Publication Publication Date Title
CN111797893B (en) Neural network training method, image classification system and related equipment
CN113516227B (en) Neural network training method and device based on federal learning
WO2021190451A1 (en) Method and apparatus for training image processing model
CN112446398B (en) Image classification method and device
CN110555481B (en) Portrait style recognition method, device and computer readable storage medium
WO2019228358A1 (en) Deep neural network training method and apparatus
CN112183577A (en) Training method of semi-supervised learning model, image processing method and equipment
CN112990211B (en) Training method, image processing method and device for neural network
US20220375213A1 (en) Processing Apparatus and Method and Storage Medium
CN111368972B (en) Convolutional layer quantization method and device
CN111898635A (en) Neural network training method, data acquisition method and device
CN112651511A (en) Model training method, data processing method and device
CN113807399B (en) Neural network training method, neural network detection method and neural network training device
CN113159283B (en) Model training method based on federal transfer learning and computing node
CN111738403B (en) Neural network optimization method and related equipment
CN110222718B (en) Image processing method and device
WO2021129668A1 (en) Neural network training method and device
CN113065635A (en) Model training method, image enhancement method and device
CN113191241A (en) Model training method and related equipment
CN113011568B (en) Model training method, data processing method and equipment
WO2021175278A1 (en) Model updating method and related device
US20240135174A1 (en) Data processing method, and neural network model training method and apparatus
CN114550053A (en) Traffic accident responsibility determination method, device, computer equipment and storage medium
CN111950700A (en) Neural network optimization method and related equipment
CN113128285A (en) Method and device for processing video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant