CN112052938A - Multi-terminal model compression method based on knowledge federation, task prediction method and device and electronic equipment

Multi-terminal model compression method based on knowledge federation, task prediction method and device and electronic equipment

Info

Publication number
CN112052938A
CN112052938A
Authority
CN
China
Prior art keywords
model, global, compression, compressed, data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010818643.2A
Other languages
Chinese (zh)
Inventor
韦达
孟丹
李宏宇
李晓林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongdun Holdings Co Ltd
Original Assignee
Tongdun Holdings Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongdun Holdings Co Ltd
Priority to CN202010818643.2A
Publication of CN112052938A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a knowledge-federation-based multi-terminal model compression method, a task prediction method, a device, and an electronic device. The multi-terminal model compression method comprises the following steps: aggregating the local models reported by a plurality of participants after the Nth round of training to obtain a global model to be compressed, wherein N is greater than or equal to 1; compressing the global model to be compressed with a public data set based on a preset performance index to obtain a global compression model, wherein the preset performance index characterizes the performance of the global compression model during prediction, and the public data set is obtained by performing data enhancement on the data of the plurality of participants; and sending the global compression model to the plurality of participants for the (N+1)th round of training.

Description

Multi-terminal model compression method based on knowledge federation, task prediction method and device and electronic equipment
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a multi-terminal model compression method, a task prediction method and device, and an electronic device.
Background
As Artificial Intelligence (AI) has matured, people have recognized its great potential in complex application scenarios such as autonomous driving, healthcare, and financial data analysis. People hope to exploit the advantages of artificial intelligence at a deeper level and to improve the robustness and accuracy of models. The current interest in artificial intelligence is driven by big data: in 2016, AlphaGo used a total of 300,000 board games as training data to achieve its excellent performance.
With the success of AlphaGo, it is natural to expect that big-data-driven AI like AlphaGo will soon be realized in many aspects of our lives. However, the real-world situation is somewhat disappointing: apart from a few businesses, most fields have only limited data or data of poor quality. Today's AI still faces two major challenges. One is that in most industries data exists in isolated islands; the other is the need to strengthen data privacy and security. How to reasonably solve the problems of data islands and data security in the AI industry is a major challenge facing AI researchers and practitioners.
Multi-terminal joint training based on knowledge federation is one solution to these problems: a global model is established cooperatively among multiple participants without exchanging private data, so that the global model can be fully trained. Taking the model-layer federation in knowledge federation as an example, it can enhance data privacy and security and solve the data island problem while still allowing multiple participants to cooperatively build a global model.
However, because many parties participate in training, frequent communication and encrypted data exchange are required, which poses a great challenge to communication traffic. Moreover, as the amount of data and the number of participants increase, the model becomes more and more complex, more and more data must be propagated during model training and prediction, the communication pressure keeps growing, and the efficiency of model training drops sharply.
Therefore, how to improve the efficiency of model training in knowledge federation has become an urgent technical problem.
Disclosure of Invention
The technical problem to be solved by the embodiments of the invention is how to improve the efficiency of knowledge-federation-based model training.
According to a first aspect, an embodiment of the present invention provides a knowledge-federation-based multi-terminal model compression method, including: aggregating the local models reported by a plurality of participants after the Nth round of training to obtain a global model to be compressed, wherein N is greater than or equal to 1; compressing the global model to be compressed with a public data set based on a preset performance index to obtain a global compression model, wherein the preset performance index characterizes the performance of the global compression model during prediction, and the public data set is obtained by performing data enhancement on the data of the plurality of participants; and sending the global compression model to the plurality of participants for the (N+1)th round of training.
Optionally, compressing the global model to be compressed with the public data set based on a preset accuracy to obtain the global compression model includes: pruning the global model to be compressed with the public data set based on a pruning algorithm to obtain the global compression model.
Optionally, the service side performing the pruning operation on the global model to be compressed with the public data set based on a pruning algorithm to obtain the global compression model includes: updating the current global parameters of the global model to be compressed with the public data set to obtain the update gradient of the global model to be compressed; determining the contribution degree of the neurons of the global model to be compressed from the update gradient, wherein the contribution degree characterizes the activation degree of a neuron; and retaining the neurons that meet a preset condition as the global compression model, wherein the preset condition comprises a preset contribution threshold and/or a neuron quantity ratio.
Optionally, updating the current global parameters of the global model to be compressed with the public data set to obtain the update gradient of the global model to be compressed includes: performing forward propagation and backward propagation on the global model to be compressed with the public data set to obtain the update gradient.
Optionally, determining the contribution degree of the neurons of the global model to be compressed from the update gradient includes: determining the weight gradient of a neuron from the update gradient; and calculating the contribution degree of the neuron based on the weight gradient.
Optionally, between compressing the global model to be compressed with the public data set based on the preset accuracy and sending the global compression model to the plurality of participants for the (N+1)th round of training, the method includes: testing the global compression model with the public data set to obtain a first performance index of the global compression model; calculating the performance attenuation value of the first performance index relative to a second performance index of the global model to be compressed, wherein the second performance index is obtained by testing the global model to be compressed with the public data set; and when the performance attenuation value is greater than a preset attenuation value, repeating the step of compressing the global model to be compressed with the public data set based on the preset accuracy until the performance attenuation value is less than or equal to the preset attenuation value, and then proceeding to the step of sending the global compression model to the plurality of participants for the (N+1)th round of training.
Optionally, aggregating the local models reported by the plurality of participants after the Nth round of training to obtain the global model to be compressed includes: aggregating the local models and local model parameters reported by the participants to obtain a global model and global model parameters; sending the aggregated global model and global model parameters to the participants to complete one round of training; judging whether the number of training rounds has reached N; and when the number of training rounds reaches N, taking the global model after the Nth round of training as the global model to be compressed.
According to a second aspect, an embodiment of the present invention provides a task prediction method, including: acquiring task data to be predicted; and inputting the task data to be predicted into a knowledge federation model to obtain a prediction result, wherein the knowledge federation model is obtained by any one of the multi-terminal model compression methods of the first aspect.
According to a third aspect, an embodiment of the present invention provides a knowledge-federation-based multi-terminal model compression apparatus, including: an aggregation module, configured to aggregate the local models reported by the plurality of participants after the Nth round of training to obtain a global model to be compressed, where N is greater than or equal to 1; a compression module, configured to compress the global model to be compressed with a public data set based on a preset performance index to obtain a global compression model, where the preset performance index characterizes the performance of the global compression model during prediction, and the public data set is obtained by performing data enhancement on the data of the plurality of participants; and a sending module, configured to send the global compression model to the plurality of participants for the (N+1)th round of training.
According to a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer instructions for causing a computer to execute the multi-terminal model compression method of the first aspect and/or the task prediction method of the second aspect.
According to a fifth aspect, an embodiment of the present invention provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to cause the at least one processor to perform the multi-terminal model compression method of any one of the above-mentioned first aspects and/or the task prediction method of any one of the second aspects.
In the above solutions, model compression is placed on the service side. After receiving the local models uploaded by the participants, the service side aggregates them and compresses the global model to be compressed, guided by the preset performance index, using the public data set deployed on the service side. Because the service side aggregates the local models of multiple participants and compresses the aggregated model uniformly with a public data set obtained by enhancing the data of those participants, the structure and/or parameters of the model can be compressed to the greatest extent while the model accuracy remains essentially unchanged, compared with compressing separately at each participant. The data characteristics of all participants can be simulated during compression, so the compressed model retains good generalization ability on the data of the multiple participants and the compression ratio is improved. At the same time, this avoids the problem that, because the data distributions of the participants differ, the pruned models obtained by different participants differ, and the small intersection of the different models at re-aggregation leads to a small compression ratio.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a diagram illustrating a multi-terminal knowledge-federation-based model compression method according to an embodiment of the present invention;
FIG. 2 shows a schematic diagram of a multi-terminal model compression method of an embodiment of the invention;
FIG. 3 is a diagram illustrating the compression effect of the model according to the embodiment of the present invention;
FIG. 4 is a schematic diagram of a multi-terminal model compression method according to another embodiment of the invention;
FIG. 5 is a schematic diagram of a knowledge-federation-based multi-terminal model compression apparatus according to an embodiment of the present invention;
fig. 6 shows a schematic view of an electronic device of an embodiment of the invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Neural Networks (NN) are complex network systems formed by a large number of simple processing units (neurons) that are widely interconnected, and they reflect many basic features of human brain function. The function and characteristics of a neuron can be described by a mathematical model, and a network model (also referred to in this application as the network model of a neural network) can be constructed on that basis.
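Illustratively, the textbook neuron model underlying such network models can be sketched in a few lines of Python; this is a generic illustration, not specific to this application, and the ReLU activation is an assumption:

```python
# Standard mathematical neuron model: a weighted sum of inputs plus a bias,
# passed through an activation function (ReLU chosen here for illustration).
import numpy as np

def neuron(x: np.ndarray, w: np.ndarray, b: float) -> float:
    return float(np.maximum(0.0, w @ x + b))
```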
As stated in the background, although knowledge federation can enhance data privacy and effectively solve the data island problem, during model training the model becomes more and more complex as the number of participants and the amount of data grow. Taking a convolutional neural network (CNN) as an example, a complex network model usually has tens of millions or even hundreds of millions of parameters and a size of hundreds of MB. How to transmit model parameters efficiently and improve training efficiency while guaranteeing accuracy is therefore a real problem. The inventors found that some model compression schemes, such as model pruning, already exist in knowledge federation/distributed computing scenarios; the usual scheme is to prune on the client side and then transmit the pruned models to the service side for aggregation. However, because the data distributions of the participants differ (the data are not independent and identically distributed), the pruned models obtained by different clients differ, and the small intersection of the different models at aggregation yields a small compression ratio. In addition, a model pruned on each participant's own non-IID data suffers from low generalization accuracy. Existing multi-terminal model compression methods therefore struggle to satisfy both training efficiency and the accuracy of the compressed model.
Based on these specific problems, the inventors provide a multi-terminal model compression method. It can be applied in a knowledge federation scenario, and also in other model-training scenarios with multiple participants and a service side; this embodiment takes the knowledge federation scenario as the example. As shown in FIG. 1, the scenario may include two roles, participants and a service side, where the service side may be a third party other than the participants or any one of the participants. A public data set related to the current knowledge federation task may be deployed on the service side. The public data set is obtained by merging the enhanced data sets produced by performing data enhancement on the local data of all participants; specifically, data enhancement can generate new data by applying transformations, noise addition, and oversampling to the local data. Illustratively, the local data may be oversampled with generative methods such as SMOTE or MAHAKIL to produce an enhanced data set that conforms to the local data distribution without revealing the real local data. The public data set therefore contains the data characteristics of all participants, so the data characteristics of all participants can be simulated during pruning. The multi-terminal model compression method of this embodiment runs on the service side and, as shown in FIG. 2, may include the following steps:
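Illustratively, the participant-side data enhancement can be sketched as follows with the SMOTE implementation from the imbalanced-learn package; the helper name and the choice to keep only the synthetic rows are assumptions for illustration, not a mandated implementation:

```python
# Sketch of participant-side data enhancement with SMOTE (imbalanced-learn).
# fit_resample returns the original rows followed by the synthetic rows, so
# keeping only the tail shares distribution-conforming data without exposing
# real local records. All names here are illustrative.
import numpy as np
from imblearn.over_sampling import SMOTE

def build_enhanced_dataset(local_x: np.ndarray, local_y: np.ndarray, seed: int = 0):
    resampled_x, resampled_y = SMOTE(random_state=seed).fit_resample(local_x, local_y)
    return resampled_x[len(local_x):], resampled_y[len(local_y):]

# The service side would merge the enhanced sets of all participants into
# the public data set used later for pruning and testing.
```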
S10: aggregate the local models reported by the multiple participants after the Nth round of training to obtain the global model to be compressed, where N is greater than or equal to 1. As an exemplary embodiment, the global model to be compressed may include its structure and its parameters. Each participant holds a local model and local model parameters, and the service side aggregates the local models and parameters reported by the multiple participants to obtain the global model and global model parameters. The participants hold their respective private data and can, through mutual negotiation, process their data into local data that meets the training requirements. As shown in FIG. 1, each participant trains its local model on its local data and uploads the local model and parameters to the service side, which stores them in a model cache and aggregates them; the service side then synchronizes the aggregated global model and parameters back to the participants, which update their local models and parameters, completing one round of federated training. When N rounds of training have been completed, the service side takes the aggregated global model and parameters as the global model to be compressed; that is, after training for the preset number of rounds N, the method proceeds to the compression step. If the preset number of rounds has not been reached, the aggregated model is returned to the model cache and synchronized to the local models for the next round, until the preset number is reached. In this embodiment N is the preset number of training rounds; N may be 1 or greater than 1 and may be determined by the actual application. For example, the value of N may be chosen according to model performance, such as prediction accuracy: when the prediction accuracy of the aggregated model exceeds 95%, the model may be compressed, i.e., the aggregated model is taken as the global model to be compressed.
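Illustratively, the service-side aggregation can be sketched as follows in PyTorch; the patent does not fix an aggregation rule, so the equally weighted averaging of state dicts (FedAvg-style) and all names are assumptions:

```python
# Sketch of service-side aggregation: average the parameters reported by
# all participants (equal weighting assumed for illustration).
from typing import Dict, List
import torch

def aggregate_local_models(local_states: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    global_state = {}
    for name in local_states[0]:
        global_state[name] = torch.stack(
            [state[name].float() for state in local_states]
        ).mean(dim=0)
    return global_state

# After N such rounds, the aggregated state dict is taken as the global
# model to be compressed and handed to the compression step S20.
```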
S20: compress the global model to be compressed with the public data set based on the preset performance index to obtain the global compression model. As an exemplary embodiment, the preset performance index may include a preset accuracy, i.e., a lower bound characterizing the accuracy of the global compression model's output. The preset performance index may also include indexes such as precision, recall, average precision, and false detection rate; this embodiment uses the preset accuracy as the example. Specifically, after compression the size of the model (for example, the number of neurons, the number of layers, and the parameter count) is reduced, and the accuracy of the model's predictions may decrease; a minimum value is therefore set as the preset accuracy relative to the accuracy before compression, and the accuracy after compression must be greater than or equal to the preset accuracy for the finally trained model to meet the required performance. As an exemplary embodiment, the compression may use any of several methods, for example model pruning, low-rank decomposition, transferred/compact convolution filters, or knowledge distillation; this embodiment uses model pruning as the example, and those skilled in the art will understand that other methods of compressing the global model to be compressed with the public data set based on the preset accuracy are also applicable. In this embodiment, multiple rounds of compression may be performed with the preset performance index as the guide: after one or more rounds, if the performance of the compressed model reaches the preset performance index, the global compression model is obtained; if not, the public data set continues to be used to compress the model until the compressed model reaches the preset performance index.
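Illustratively, the accuracy test that guides the compression rounds can be sketched as follows; the evaluation loop and all names are assumptions:

```python
# Sketch of testing a (compressed) model on the public data set; the model
# and DataLoader stand in for whatever the service side actually holds.
import torch

@torch.no_grad()
def accuracy_on_public_set(model, public_loader, device: str = "cpu") -> float:
    model.eval().to(device)
    correct, total = 0, 0
    for inputs, labels in public_loader:
        preds = model(inputs.to(device)).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
        total += labels.numel()
    return correct / max(total, 1)
```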
S30: send the global compression model to the multiple participants for the (N+1)th round of training. As an exemplary embodiment, the global compression model sent to the participants may include both its structure and its parameters. After the service side sends the global compression model, the participants update their local models and parameters based on it and start the next round of training; after another N rounds (or some other number of rounds), steps S10-S30 are executed again, until the model converges. Those skilled in the art should understand that the foregoing only describes one training-and-compression round of the cyclic process by way of example; in practice, the training-and-compression process may be run cyclically for one or more rounds.
In this embodiment, after the participants have repeated N rounds of training their local models on local data and uploading them to the service side for aggregation, with the service side synchronizing the aggregated global model and parameters back so that the participants can update their local models, the global model to be compressed is obtained and the method enters the model compression stage. In that stage the model is compressed over multiple rounds and tested after each compression: the output accuracy of the compressed model is compared against the preset accuracy, and compression stops as it approaches the preset accuracy, yielding the global compression model. The global compression model is then synchronized to the participants, which update their local models based on it, and the knowledge federation training and compression operations are performed again until the model converges.
In this embodiment, model compression is placed on the service side: after receiving the local models uploaded by the participants, the service side aggregates them and compresses the global model to be compressed with the public data set, guided by the preset performance index. Because the service side aggregates the local models of multiple participants and compresses them uniformly, compared with compressing the model structure and/or parameters separately at each participant, the model can be compressed to the greatest extent while its accuracy remains essentially unchanged, improving the compression rate while mitigating the accuracy attenuation caused by compression. The aggregated model is compressed with the public data set obtained by enhancing the data of the multiple participants; since the public data contains the data distribution of each participant, the data characteristics of all participants can be simulated during compression, and the compressed model retains good generalization ability on the data of the multiple participants.
As an exemplary embodiment, the global model to be compressed is pruned with the public data set based on a pruning algorithm to obtain the global compression model. Pruning may remove neurons or remove convolutional layers; this embodiment uses pruning of neurons as the example. Illustratively, the current global parameters of the global model to be compressed are updated with the public data set to obtain the update gradient of the model, where the update gradient characterizes the gap between the output result and the target result. Specifically, at least one forward propagation and one backward propagation may be performed on the global model to be compressed over the public data set to obtain the gradient of the model update; the model parameters may be updated with methods such as gradient descent, stochastic gradient descent (SGD), momentum, Adam, or adaptive gradient adjustment (AdaGrad). After the update gradient is obtained, the contribution degree of each neuron of the global model to be compressed is determined from it: for example, the weight gradient of a neuron is determined from the update gradient, the activation degree of the neuron is calculated from its weight gradient, and the contribution degree of each neuron is determined from that. Some neurons produce strong activation for inputs containing specific data features; the contribution degree in this embodiment may be the product of the output value of the neuron's activation function and its gradient. The higher a neuron's contribution degree, the more important it is to the task, so after the contribution degrees are obtained, the neurons meeting a preset condition are retained as the global compression model, where the preset condition includes a preset contribution threshold and/or a preset neuron quantity ratio. For example, the contribution degrees may be sorted, a contribution threshold preset, the neurons whose contribution degree falls below the threshold deleted, and those above it retained. As an exemplary embodiment, retention may also be by percentage of the neuron count based on contribution degree, for example retaining ninety percent and deleting ten percent. Neurons may also be retained and deleted based on both a preset contribution threshold and a neuron quantity ratio, for example requiring both to be satisfied and retaining, among the neurons that satisfy the contribution threshold, those above the preset quantity ratio. For the result of neuron deletion, reference may be made to the exemplary post-compression effect diagram of FIG. 3; those skilled in the art should understand that FIG. 3 does not necessarily depict the global compression model of this application, it merely illustrates the compression effect by example, and other model compression effects are also applicable to this embodiment.
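Illustratively, the activation-times-gradient contribution score described above can be sketched as follows for one fully connected layer; the first-order form of the score and all names are assumptions consistent with, but not mandated by, this embodiment:

```python
# Sketch of neuron-contribution scoring on the public data set: one forward
# and one backward pass, then contribution = |activation * gradient| averaged
# over the batch, keeping the highest-scoring neurons (e.g., the top 90%).
# Assumes `layer` is a fully connected layer with output shape (batch, units).
import torch
import torch.nn.functional as F

def neuron_contributions(model, layer, public_batch, keep_ratio: float = 0.9):
    inputs, labels = public_batch
    acts = {}

    # Capture the layer's activations during the forward pass.
    handle = layer.register_forward_hook(lambda mod, inp, out: acts.setdefault("a", out))
    loss = F.cross_entropy(model(inputs), labels)
    handle.remove()

    # Backward pass: gradient of the loss w.r.t. the captured activations.
    grads = torch.autograd.grad(loss, acts["a"])[0]
    scores = (acts["a"] * grads).abs().mean(dim=0)  # one score per neuron

    keep_count = max(1, int(keep_ratio * scores.numel()))
    keep_idx = scores.topk(keep_count).indices  # neurons to retain
    return scores, keep_idx
```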
When a model is compressed, for example pruned, the pruned model must still meet the performance required for task prediction, so its performance must be tested after each pruning to check whether pruning can stop. As an exemplary embodiment, after compression is completed, the global compression model may be checked before being sent to the participants for the next round of training, to prevent over-compression. The multi-terminal model compression method with the model performance check added may be as shown in FIG. 4 and may specifically include the following steps:
s100, aggregating the local models after the Nth round of training reported by the multiple participants to obtain a global model to be compressed, wherein N is greater than or equal to 1. See in particular the description of step S10 in the above embodiment.
S200, compressing the global model to be compressed by adopting a public data set based on a preset performance index to obtain a global compression model, wherein the preset performance index is used for representing the performance index of the global compression model during prediction, and the public data set is obtained by performing data enhancement on data of a plurality of participants. See in particular the description of step S20 in the above embodiment.
S300, testing the global compression model with the public data set to obtain a first performance index of the global compression model. As an exemplary embodiment, after compression is completed, the public data set is input into the global compression model to obtain a first performance index characterizing its performance; the first performance index may be used to evaluate the global compression model, for example its accuracy, precision, recall, average precision, or false detection rate.
S400, calculating a performance attenuation value of the first performance index relative to a second performance index of the global model to be compressed, wherein the second performance index is obtained by testing the global model to be compressed with the public data set. After the performance index of the global compression model is obtained, the performance attenuation of the compressed model must be determined by comparison with the performance index of the model before compression. As an exemplary embodiment, the public data set is input into the model before compression to obtain the second performance index, which may likewise be accuracy, precision, recall, average precision, false detection rate, or the like; comparing the first and second performance indexes yields the performance attenuation value, which may correspondingly be an attenuation of any of these indexes.
S500, judging whether the performance attenuation value is greater than the preset attenuation value. When it is greater, the method repeats the step of compressing the global model to be compressed with the public data set based on the preset accuracy and re-tests via step S300, until the performance attenuation value is less than or equal to the preset attenuation value, after which the method proceeds to step S600.
S600, sending the global compression model to the multiple participants for the (N+1)th round of training. See the description of step S30 in the above embodiment.
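Illustratively, the S300-S600 gate can be sketched as the following loop, reusing the accuracy_on_public_set helper sketched above. Following the reading in the worked example below, the sketch prunes repeatedly while the accuracy decay stays within the preset bound and keeps the last acceptable model; the relative-accuracy-drop metric and the prune_once helper are assumptions:

```python
# Sketch of the performance-decay gate: `prune_once` stands in for one round
# of the contribution-based pruning above; `max_decay` is the preset
# attenuation value.
import copy

def compress_with_decay_gate(model, public_loader, prune_once, max_decay: float = 0.02):
    baseline = accuracy_on_public_set(model, public_loader)  # second performance index
    acceptable = model  # zero prunings is also an acceptable outcome
    while True:
        candidate = prune_once(copy.deepcopy(acceptable))
        first_index = accuracy_on_public_set(candidate, public_loader)
        if (baseline - first_index) / baseline > max_decay:
            return acceptable  # decay too large: keep the last acceptable model
        acceptable = candidate  # still within bound: prune further
```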
The principle of the multi-terminal model compression method is described in detail below with a specific example:
two parties X1,X2For Federal pruning training, we set the service side at third party X3(the third party C may also be any of the parties). Among the three parties, the service party provides functions of local model aggregation of the participants and pruning of the aggregated global model. X1And X2Respective sensitive data may be held. For X1 and X2, before launching knowledge federation, data needs to be aligned in a federated manner. Illustratively, after alignment, X1And X2Having a characteristic x1,x2,x3,x4
The process of training begins with the server initializing the model, assuming that the initialized model is (Ax)1+Bx2+Cx3+Dx4) Let a be 1, B be 2, C be 3, and D be 4. The server distributes the model and parameters to X1And X2Respectively with X1And X2Is trained. X1The model and parameters updated after local training are (2 ×)1+1x2+3x3+5x4),X2The updated model and parameters after local update are (3 ×)1+ax2+3x3+3x4). Model-entering data x of the known model1、x2、x3Respectively has a parameter gradient of Wa(1, -1, 0, 1) and WbIs (2, -2, 0, -1). Finally, transmitting the parameter gradient to a server side for aggregation to obtain new parameters: 0.5 (W)a+Wb) (2.5, 0.5, 3, 4). Thus, the corresponding new model and parameters are (2.5 ×)1+0.5x2+3x3+4x4). And after the new model is redistributed to each participant, carrying out the next training, and obtaining the model to be pruned through a certain number of rounds of training.
After the model to be pruned is obtained, the pruning step begins. In the pruning step, the performance of the model is maintained according to the preset performance index while the scale of the model (model structure and/or model parameters) is cut down. Before the whole training starts, the following are set: the total number of training rounds, for example E = 300; the final performance acceptance degree t, for example an accuracy ratio of ninety percent; the proportion r of model neurons deleted each time relative to the total number, for example ten percent; and the interval at which pruning occurs, for example pruning every N = 50 rounds. With these settings, a pruning test is performed on the model once every fifty rounds; as long as the performance reduction is acceptable (for example, accuracy after pruning / accuracy before pruning > t), the scale of the model is cut down to a certain extent, and the final performance reduction of the model does not exceed the initially set acceptance degree.
In the pruning step, X1 and X2 transmit the gradients of their locally trained models to the service side for aggregation, and the service side then performs one forward propagation and one backward propagation on the aggregated model over the public data set it maintains. This yields the gradients of the updated model parameters. The activation degree of each neuron can be calculated from the gradient of its weight values and, according to the ranking of activation degrees, the portion of neurons with the lowest activation is selected for deletion. The pruned model is run once over the public data set for prediction and compared with the prediction result of the unpruned model. If the performance attenuation does not exceed the preset performance attenuation value, the model is pruned again and the performance compared again, until the performance attenuation exceeds the preset value. The last acceptable model obtained (the number of prunings may be greater than or equal to 0) is distributed to the participants. After multiple rounds of training, aggregation, and pruning at the specified rounds, the pruned model in the knowledge federation is finally obtained.
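The aggregation arithmetic of this example can be checked in a few lines (a sketch; the 0.5 weight is the two-party average used in the example):

```python
# Reproducing the worked example: params_new = params_old + 0.5 * (Wa + Wb).
import numpy as np

params_old = np.array([1.0, 2.0, 3.0, 4.0])  # A, B, C, D
Wa = np.array([1.0, -1.0, 0.0, 1.0])         # gradient reported by X1
Wb = np.array([2.0, -2.0, 0.0, -1.0])        # gradient reported by X2

params_new = params_old + 0.5 * (Wa + Wb)
print(params_new)  # [2.5 0.5 3.  4. ] -> model 2.5x1 + 0.5x2 + 3x3 + 4x4
```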
The embodiment of the invention also provides a task prediction method, comprising: acquiring task data to be predicted; and inputting the task data to be predicted into a knowledge federation model to obtain a prediction result, wherein the knowledge federation model is obtained by the multi-terminal model compression method described in the above embodiments.
The embodiment of the present invention further provides a knowledge-federation-based multi-terminal model compression apparatus, as shown in FIG. 5, which may include: an aggregation module 10, configured to aggregate the local models reported by the multiple participants after the Nth round of training to obtain a global model to be compressed, where N is greater than or equal to 1; a compression module 20, configured to compress the global model to be compressed with a public data set based on a preset performance index to obtain a global compression model, where the preset performance index characterizes the performance of the global compression model during prediction and the public data set is obtained by performing data enhancement on the data of the multiple participants; and a sending module 30, configured to send the global compression model to the multiple participants for the (N+1)th round of training.
An embodiment of the present invention provides an electronic device, as shown in FIG. 6; the electronic device includes one or more processors 61 and a memory 62, with one processor 61 taken as an example in FIG. 6.
The electronic device may further include: an input device 63 and an output device 64.
The processor 61, the memory 62, the input device 63 and the output device 64 may be connected by a bus or other means, as exemplified by the bus connection in fig. 6.
The processor 61 may be a Central Processing Unit (CPU). The processor 61 may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or combinations thereof. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 62, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the methods in the embodiments of the present application. The processor 61 executes the various functional applications and data processing of the server, i.e., implements the multi-terminal model compression method of the above method embodiments, by running the non-transitory software programs, instructions, and modules stored in the memory 62.
The memory 62 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of a processing device operated by the server, and the like. Further, the memory 62 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 62 may optionally include memory located remotely from the processor 61, which may be connected to a network connection device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 63 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the processing device of the server. The output device 64 may include a display device such as a display screen.
One or more modules are stored in the memory 62 and, when executed by the one or more processors 61, perform the method as shown in fig. 1 or 3.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing related hardware; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD), etc.; the storage medium may also comprise a combination of the above kinds of memories.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. A multi-terminal model compression method based on knowledge federation is characterized by comprising the following steps:
aggregating the local models after the Nth round of training reported by a plurality of participants to obtain a global model to be compressed, wherein N is greater than or equal to 1;
compressing the global model to be compressed by adopting a public data set based on a preset performance index to obtain a global compression model, wherein the preset performance index is used for representing the performance index of the global compression model during prediction, and the public data set is obtained by performing data enhancement on the data of the multiple participants;
sending the global compression model to the plurality of participants for the (N+1)th round of training.
2. The multi-terminal model compression method of claim 1, wherein the compressing the global model to be compressed based on a preset accuracy rate by using a common data set to obtain a global compression model comprises:
and pruning the global model to be compressed by utilizing a public data set based on a pruning algorithm to obtain the global compression model.
3. The multi-terminal model compression method of claim 1, wherein the service side performing pruning operations on the global model to be compressed using a common data set based on a pruning algorithm to obtain the global compression model comprises:
updating the current global parameters of the global model to be compressed by using the public data set to obtain the updating gradient of the global model to be compressed;
determining contribution degrees of neurons of the global model to be compressed by using the updating gradient, wherein the contribution degrees are used for representing the activation degrees of the neurons;
and reserving neurons meeting preset conditions as the global compression model, wherein the preset conditions comprise a preset contribution threshold and/or a neuron quantity ratio.
4. The multi-end model compression method of claim 3, wherein the updating the current global parameters of the global model to be compressed with the common data set to obtain the update gradient of the global model to be compressed comprises:
and forward propagation and backward propagation are carried out on the global model to be compressed by utilizing the public data set, so that the updating gradient is obtained.
5. The multi-terminal model compression method of claim 3 or 4, wherein determining the contribution of the neurons of the global model to be compressed using the update gradient comprises:
determining the weight value gradient of the neuron according to the updating gradient;
calculating a contribution of the neuron based on the weight value gradient.
6. The multi-terminal model compression method as claimed in claim 1, wherein between said compressing the global model to be compressed with a common data set based on a preset accuracy and said sending the global compression model to the plurality of participants for the (N+1)th round of training, the method comprises:
testing the global compression model by using the public data set to obtain a first performance index of the global compression model;
calculating a performance attenuation value of the first performance index relative to a second performance index of the global model to be compressed, wherein the second performance index is obtained by testing the global model to be compressed by utilizing the public data set;
and when the performance attenuation value is larger than a preset attenuation value, repeating the step of compressing the global model to be compressed by adopting a public data set based on a preset accuracy rate until the performance attenuation value is smaller than or equal to the preset attenuation value, and entering the step of sending the global compression model to the plurality of participants for the (N+1)th round of training.
7. The multi-terminal model compression method of claim 1, wherein the aggregating the N-th round of trained local models reported by the multiple participants to obtain a global model to be compressed comprises:
aggregating the local model and the local model parameters reported by the participants to obtain a global model and global model parameters;
distributing the aggregated global model and global model parameters to the participating parties to complete a round of training;
judging whether the training times reach N times or not;
and when the training times reach N times, taking the global model after the Nth round of training as a global model to be compressed.
8. A method of task prediction, comprising:
acquiring task data to be predicted;
and inputting the task data to be predicted into a knowledge federation model to obtain a prediction result, wherein the knowledge federation model is obtained by adopting a multi-terminal model compression method of any one of claims 1 to 7.
9. A multi-terminal model compression apparatus based on knowledge federation, comprising:
the aggregation module is used for aggregating the local models after the Nth round of training reported by the multiple participants to obtain a global model to be compressed, wherein N is greater than or equal to 1;
the compression module is used for compressing the global model to be compressed by adopting a public data set based on a preset performance index to obtain a global compression model, the preset performance index is used for representing the performance index of the global compression model during prediction, and the public data set is obtained by performing data enhancement on the data of the multiple parties;
a sending module, configured to send the global compression model to the multiple participants for the (N+1)th round of training.
10. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to cause the at least one processor to perform the multi-terminal model compression method of any one of claims 1-7 and/or the task prediction method of claim 8.
CN202010818643.2A 2020-08-14 2020-08-14 Multi-terminal model compression method based on knowledge federation, task prediction method and device and electronic equipment Pending CN112052938A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010818643.2A CN112052938A (en) 2020-08-14 2020-08-14 Multi-terminal model compression method based on knowledge federation, task prediction method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010818643.2A CN112052938A (en) 2020-08-14 2020-08-14 Multi-terminal model compression method based on knowledge federation, task prediction method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112052938A 2020-12-08

Family

ID=73599065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010818643.2A Pending CN112052938A (en) 2020-08-14 2020-08-14 Multi-terminal model compression method based on knowledge federation, task prediction method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112052938A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309847A (en) * 2019-04-26 2019-10-08 深圳前海微众银行股份有限公司 A kind of model compression method and device
CN110263921A (en) * 2019-06-28 2019-09-20 深圳前海微众银行股份有限公司 A kind of training method and device of federation's learning model
CN110399742A (en) * 2019-07-29 2019-11-01 深圳前海微众银行股份有限公司 A kind of training, prediction technique and the device of federation's transfer learning model
CN110647765A (en) * 2019-09-19 2020-01-03 济南大学 Privacy protection method and system based on knowledge migration under collaborative learning framework
CN111340227A (en) * 2020-05-15 2020-06-26 支付宝(杭州)信息技术有限公司 Method and device for compressing business prediction model through reinforcement learning model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QIANG YANG et al.: "Federated Machine Learning: Concept and Applications", ACM, 31 January 2019 (2019-01-31) *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734032A (en) * 2020-12-31 2021-04-30 杭州电子科技大学 Optimization method for horizontal federal learning
CN112508194A (en) * 2021-02-02 2021-03-16 支付宝(杭州)信息技术有限公司 Model compression method, system and computing equipment
CN112508194B (en) * 2021-02-02 2022-03-18 支付宝(杭州)信息技术有限公司 Model compression method, system and computing equipment
CN113159287A (en) * 2021-04-16 2021-07-23 中山大学 Distributed deep learning method based on gradient sparsity
CN113159287B (en) * 2021-04-16 2023-10-10 中山大学 Distributed deep learning method based on gradient sparsity
CN113222175A (en) * 2021-04-29 2021-08-06 深圳前海微众银行股份有限公司 Information processing method and system
CN113222175B (en) * 2021-04-29 2023-04-18 深圳前海微众银行股份有限公司 Information processing method and system

Similar Documents

Publication Publication Date Title
CN112052938A (en) Multi-terminal model compression method based on knowledge federation, task prediction method and device and electronic equipment
WO2021115480A1 (en) Federated learning method, device, equipment, and storage medium
TWI822792B (en) Method of characterizing activity in an artificial nerual network, and system comprising one or more computers operable to perform said method
CN110880036B (en) Neural network compression method, device, computer equipment and storage medium
CN112862011A (en) Model training method and device based on federal learning and federal learning system
CN110309847B (en) Model compression method and device
EP3639206A1 (en) Systems and methods for compression and distribution of machine learning models
CN110071798B (en) Equivalent key obtaining method and device and computer readable storage medium
CN109102461B (en) Image reconstruction method, device, equipment and medium for low-sampling block compressed sensing
CN113988310A (en) Deep learning model selection method and device, computer equipment and medium
CN111368984B (en) Method and device for league learning and league learning system
CN112600697B (en) QoS prediction method and system based on federal learning, client and server
CN113238867A (en) Federated learning method based on network unloading
CN112861659A (en) Image model training method and device, electronic equipment and storage medium
CN113377797A (en) Method, device and system for jointly updating model
CN112948885A (en) Method, device and system for realizing privacy protection of multi-party collaborative update model
CN116194933A (en) Processing system, processing method, and processing program
WO2023024378A1 (en) Multi-agent model training method, apparatus, electronic device, storage medium and program product
CN115905978A (en) Fault diagnosis method and system based on layered federal learning
CN113438237B (en) Data security collaborative computing method and system
CN115131196A (en) Image processing method, system, storage medium and terminal equipment
CN117076918A (en) Model training system and model training method based on federal learning
CN117033997A (en) Data segmentation method, device, electronic equipment and medium
CN116402366A (en) Data contribution evaluation method and device based on joint learning
CN114897185A (en) Joint learning training method and device based on category heterogeneous data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201208