CN112686368A - Cooperative learning method, storage medium, terminal and system for updating center side - Google Patents

Cooperative learning method, storage medium, terminal and system for updating center side

Info

Publication number
CN112686368A
Authority
CN
China
Prior art keywords
central server
task
model
cooperative
cooperative learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011455627.8A
Other languages
Chinese (zh)
Inventor
戴晶帼
杨旭
陈�光
苏新铎
叶鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GRG Banking Equipment Co Ltd
Original Assignee
GRG Banking Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GRG Banking Equipment Co Ltd filed Critical GRG Banking Equipment Co Ltd
Priority to CN202011455627.8A priority Critical patent/CN112686368A/en
Priority to PCT/CN2020/140757 priority patent/WO2022121026A1/en
Publication of CN112686368A publication Critical patent/CN112686368A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/08: Learning methods

Abstract

The invention provides a collaborative learning method, a storage medium, a terminal and a system with a dynamically updated central party. The system comprises a data provider determining module; a performance evaluation module; an optimal selection module, which selects the best-performing data provider in the network as the central server according to the evaluated performance and runs the central program; a communication transmission module; and a dynamic update learning module, which judges according to the initial state whether to enter the cooperative learning task or to update the central server, and, if the former, starts the cooperative learning task and predicts the risk of the central server, or, if the latter, updates the central server and judges whether the cooperative learning task is continued, recovered or suspended until the task is finished. Compared with the prior art, the invention can quickly select the optimal data provider participating in the training task as the central server when the central server goes down and can quickly reconnect, so that cooperative learning can continue to run.

Description

Cooperative learning method, storage medium, terminal and system for updating center side
Technical Field
The invention relates to joint processing of multi-party data, and in particular to a collaborative learning method, a storage medium, a terminal and a system with a dynamically updated central party.
Background
In traditional centralized deep learning, a central server needs to collect a large amount of user data to train a neural network model (hereinafter, model). However, because of the high network communication overhead of data transmission and issues such as user data ownership and user data privacy, user data for deep learning is often difficult to obtain.
Some solutions to the above problems have emerged in the prior art, such as federated learning, an emerging machine learning framework that takes a different approach to training machine learning models. In one round of training, each user trains a local model with its own private data and then uploads the local model parameters to a central server; the central server fuses the parameters of all users to generate the global model parameters and issues them to the users; the users then update their local models according to the global model parameters. Multiple rounds of training are repeated in this way until the global model converges, and training ends.
To summarize, federated (machine) learning is an emerging distributed machine learning paradigm in which computing entities (mobile/edge devices, cross-regional organizations) jointly train a machine learning model under the coordination of a central server (e.g., a service provider). Since the data always resides locally at the computing entities, federated learning reduces the privacy risks and data transfer costs of traditional centralized machine learning. As a new foundational artificial intelligence technology, federated learning has in recent years gained extensive attention from academia and industry and has become a new trend in the development and application of machine learning.
Based on this technology, federated learning allows multiple users to jointly train a machine learning model without their private training data leaving the local device and to complete specified learning tasks such as image classification and text prediction, which solves the problem that user data is difficult to obtain in traditional centralized machine learning.
However, federated learning also has reliability concerns. Once the central server goes down, the parties must train again and then aggregate their results at a new central server, which is costly and time-consuming. Specifically, one model training run in federated learning generally comprises many iteration rounds, and each round comprises four steps: model distribution, model computation, model aggregation and model update (an entity-selection step may be added when the number of computing entities is large). Model distribution means the central server distributes the latest model to each participating node; model computation means each participating node computes a model update or gradient from the latest model and its local data; model aggregation means the participating nodes send the computed model updates or gradients to the central server; and model update means the central server updates the global model with the aggregated updates or gradients. This process repeats until the global model converges (i.e., the model's accuracy on a standard test set reaches a target value). In existing federated learning frameworks (such as TensorFlow Federated and FATE), model distribution and model aggregation generally adopt a hub-and-spoke pattern, in which the central server, as the sole model distributor and aggregator, periodically exchanges a large volume of model traffic with the participating nodes. In an actual deployment, the central server and the participating nodes are usually distributed across regions, and the network between them is part of a cross-domain public network with limited bandwidth and heterogeneous, dynamic performance. The communication overhead generated by frequent and extensive model exchange is therefore a major bottleneck for the training efficiency of federated learning.
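For reference, the four steps of one iteration round described above can be illustrated with the following minimal federated-averaging sketch in Python; the local_train helper and the list-of-NumPy-arrays weight layout are assumptions made for the illustration, and the sketch shows the generic hub-and-spoke round rather than the specific implementation of the invention.

    import numpy as np

    def federated_round(global_weights, clients, local_train):
        """One hub-and-spoke round: distribute, compute, aggregate, update.

        global_weights: list of NumPy arrays (one per layer).
        local_train(client, weights) -> (new_weights, n_samples): hypothetical helper
        that trains on that client's local data only.
        """
        updates, sizes = [], []
        for client in clients:
            # Model distribution: every node starts from the latest global model.
            local_weights = [np.copy(w) for w in global_weights]
            # Model computation: the node trains with its own local data.
            new_weights, n_samples = local_train(client, local_weights)
            updates.append(new_weights)
            sizes.append(n_samples)
        # Model aggregation + model update: sample-weighted average of the local models.
        total = float(sum(sizes))
        return [
            sum((s / total) * u[layer] for u, s in zip(updates, sizes))
            for layer in range(len(global_weights))
        ]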
It can be seen that existing research generally studies compression from the perspective of communication-efficient algorithms to reduce the data volume of model communication, but such methods may degrade model quality. For the central server, it is difficult to quickly select the optimal data providers participating in the cooperative learning task when the server goes down, and they cannot be reconnected quickly, so model aggregation in cooperative learning cannot continue to run; this is a problem that urgently needs to be solved.
Disclosure of Invention
In order to overcome the disadvantages of the prior art, an object of the present invention is to provide a collaborative learning method, a readable storage medium, a terminal and a system with a dynamically updated central party, which can solve the above problems.
A cooperative learning method for dynamically updating a center side comprises the following steps:
S1, each data provider is ready, and the task initiator initiates a task;
S2, each data provider provides the data used in the cooperative learning task, and the currently available data providers are obtained through the network connection state;
S3, evaluating the current performance of the available data providers through indexes;
S4, selecting the optimal data provider as the central server by comparing the performance of the data providers, and running the central program;
S5, the central server establishes connections with all available data providers, i.e. clients;
S6, judging whether the current task is in the initial state; if yes, the cooperative learning task starts and the process goes to step S7a; if not, the task is currently in the cooperative-task execution state and the process goes to step S7b;
S7a, after the cooperative learning task starts, the task flow is as follows: 7a1) the central server initializes the model parameters; 7a2) the central server distributes the model to the clients; 7a3) each client performs model training with its local data; 7a4) each client encrypts and sends the trained model to the central server, and the central server performs model aggregation; after the aggregation finishes, the model information is saved and sent to a public space, the stability of the central server is predicted, and the flow proceeds to step S8;
S7b, the central-server status task flow comprises the following steps: 7b1) judging whether the current central server has been updated; 7b2) if so, the central server of the previous round of the task was abnormal, the model information is read from the public space on the newly selected optimal central server, and the cooperative learning task is recovered; if not, the abnormal risk of the central server has been cleared and the cooperative learning task continues; 7b3) after one round of model aggregation finishes, judging whether the stop condition of the cooperative learning task is met; if it is met, the flow proceeds to step S9, and if not, the stability of the central server is predicted and the flow proceeds to step S8;
S8, predicting the stability of the central server; if the system is unstable and the central server is abnormal, the task of the current round is stopped, the flow returns to step S2, the central server is reselected and the flow switches to S7a; if the system is stable, the cooperative learning task continues, the flow switches to S7b, and the above process is repeated until the task stop condition is met and the flow proceeds to step S9;
S9, the task stop condition is met and the task ends.
Preferably, the indexes in step S3 include computational power, bandwidth, and memory.
Preferably, in step S8, the central server reliability is analyzed using a probabilistic graph model to enable a prediction of its stability.
The present invention also provides a computer readable storage medium having stored thereon computer instructions which, when executed, perform the steps of the aforementioned method.
The invention also provides a terminal comprising a memory and a processor, wherein the memory stores data provider information and computer instructions executable on the processor, and the processor performs the steps of the foregoing method when executing the computer instructions.
The invention also provides a cooperative learning system based on a dynamically updated central party; the central server of the system is communicatively connected with each data provider and performs the foregoing steps. The system comprises:
the data provider determining module is used for determining an available data provider through the network connection state;
the performance evaluation module evaluates the performance of the data provider through parameters such as computing power, bandwidth and memory;
the optimization module is used for autonomously selecting an optimal data provider in the network as a central server according to the evaluation performance and operating a central program;
the central server establishes connection with all available data providers, namely clients;
the dynamic update learning module judges whether to enter a cooperative learning task or update the central server according to the initial state, and if so, starts the cooperative learning task and predicts the risk of the central server; if not, entering a central server for updating, and judging whether the cooperative learning task is continued, recovered or suspended until the task is finished.
Compared with the prior art, the invention has the beneficial effect that, when the central server goes down, the optimal data provider participating in the cooperative learning task can be quickly selected as the new central server and quickly connected, so that model training can continue to run.
Drawings
FIG. 1 is a flowchart of a collaborative learning method of a dynamic update center according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be understood that "system", "device", "unit" and/or "module" as used in this specification is a method for distinguishing different components, elements, parts or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" may also include the plural unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the operations are not necessarily performed exactly in the order shown; rather, the steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more steps may be removed from them.
Large amounts of information data are abundant in industries such as economy, culture, education, medical care and public administration, and are used in more and more scenarios for data processing and analysis such as data analysis, data mining and trend prediction. Data cooperation allows multiple data owners to obtain better data processing results; for example, more accurate model parameters may be obtained through multi-party collaborative learning.
In some embodiments, the collaborative learning method with a dynamically updated central party may be applied to a scenario in which parties collaboratively train a machine learning model for use by multiple parties while ensuring the data security of each party. In this scenario, multiple data parties have their own data and want to use each other's data for unified modeling (e.g., a classification model, a linear regression model, a logistic regression model, etc.), but do not want their respective data (especially private data) to be revealed. For example, an Internet deposit institution A has one batch of user data and a bank B has another batch of user data, and a better-performing machine learning model can be trained on a training sample set determined from the user data of A and B. Both A and B would like to participate in model training together with the other's user data, but for various reasons they do not want their own user data to be revealed, or at least do not want the other party to learn it.
In some embodiments, federated learning may be employed for collaborative learning. Federated learning enables efficient machine learning among multiple parties or computing nodes: the parties can train a model without their training samples leaving the local device, transmitting only the trained model or the computed gradients, so that the privacy of the training samples held by each party is protected.
In some embodiments, federated learning is applied where the model is computationally intensive and has many parameters. In such scenarios the amount of data transmitted during federated learning is large and the communication transmission pressure is high, so some method is often needed to reduce the communication pressure during transmission.
In some embodiments of the present disclosure, during each iterative update of the model, the cooperative learning task state updated by the central server (including the trained model gradients or model parameters) can be reused. Specifically, by recovering and continuing the updated task, client model training is not interrupted and does not need to be restarted, so the communication pressure is reduced. Meanwhile, risk prediction is performed for abnormal conditions of the central server, which guarantees the stability of the model.
First embodiment
A cooperative learning method for dynamically updating a central party comprises the following steps.
S1 the data providers are ready and the task originator initiates the task.
S2 each data provider provides data used in the cooperative learning task, and obtains a currently available data provider through the network connection status.
S3 evaluates the current performance of the available data providers by metrics.
S4 selects the optimal data provider as the central server by comparing the performance of the data providers, and runs the central program.
S5 the central server establishes connections with all available data providers, i.e. clients.
S6 judges whether the current task is in the initial task state; if so, the cooperative learning task is started and the process proceeds to step S7a. If not, the task is currently in the cooperative-task execution state and the flow proceeds to step S7b.
After the cooperative learning task of S7a is started, the task flow is as follows:
7a1) The central server initializes the model parameters. For example, before the network is used, the central server needs to initialize model parameters such as the weights and biases of a linear regression model. The init module can be imported from MXNet; this module provides various methods of model parameter initialization (init is the abbreviated form of initializer). Specifying the weight parameters by init.Normal(sigma=0.01) means that at initialization each weight element is randomly sampled from a normal distribution with a mean of 0 and a standard deviation of 0.01; the bias parameters are initialized to zero by default. A short code sketch is given after this task flow.
7a2) The central server distributes the models to the clients.
7a3) Each client uses its local data for model training.
7a4) Each client encrypts and sends its trained model to the central server, and the central server performs model aggregation. After the aggregation finishes, the model information is saved and sent to a public space, the stability of the central server is predicted, and the flow proceeds to step S8.
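As a concrete illustration of the MXNet initialization mentioned in step 7a1), the following minimal sketch builds a single-output (linear-regression) layer and initializes its weights from a normal distribution with mean 0 and standard deviation 0.01, the bias defaulting to zero; the network structure is only an assumed example.

    from mxnet import init
    from mxnet.gluon import nn

    net = nn.Sequential()
    net.add(nn.Dense(1))                     # a single-output (linear regression) layer
    # Each weight element is sampled from N(0, 0.01^2); biases default to zero.
    net.initialize(init.Normal(sigma=0.01))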
The S7b central server status task flow comprises the following steps:
7b1) Judge whether the current central server has been updated.
7b2) If so, the central server of the previous round of the task was abnormal; the model information is read from the public space on the newly selected optimal central server, and the cooperative learning task is recovered (a checkpoint sketch is given after this task flow). If not, the abnormal risk of the central server has been cleared and the cooperative learning task continues.
7b3) After one round of model aggregation finishes, judge whether the stop condition of the cooperative learning task is met; if it is met, the flow proceeds to step S9, and if not, the stability of the central server is predicted and the flow proceeds to step S8.
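A minimal sketch of the "public space" checkpointing used in steps 7a4) and 7b2) follows. It assumes the public space is a shared directory reachable by every candidate central server, serializes the aggregated weights with NumPy, and uses the hypothetical helper names save_checkpoint and restore_checkpoint; the invention does not prescribe this particular interface.

    import os
    import numpy as np

    def save_checkpoint(public_space_dir, round_idx, aggregated_weights):
        """Called by the current central server after each aggregation (step 7a4)."""
        path = os.path.join(public_space_dir, f"round_{round_idx:06d}.npz")
        np.savez(path, *aggregated_weights)

    def restore_checkpoint(public_space_dir):
        """Called by a newly selected central server to resume the task (step 7b2)."""
        files = sorted(f for f in os.listdir(public_space_dir) if f.endswith(".npz"))
        if not files:
            return None, 0                               # nothing saved yet
        latest = files[-1]
        with np.load(os.path.join(public_space_dir, latest)) as data:
            weights = [data[f"arr_{i}"] for i in range(len(data.files))]
        next_round = int(latest[len("round_"):-len(".npz")]) + 1
        return weights, next_round                       # resume from the next round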
S8, predict the stability of the central server; if the system is unstable and the central server is abnormal, stop the task of the current round, return to step S2, reselect the central server and switch to S7a. If the system is stable, the cooperative learning task continues and the process proceeds to S7b; the above process is repeated until the task stop condition is satisfied, and the process proceeds to step S9.
S9, the task stop condition is met and the task ends.
The indexes in step S3 include computation power, bandwidth, and memory.
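One possible realization of the performance evaluation in step S3 and the selection in step S4 is to normalize each index and rank the available providers by a weighted score, as in the sketch below; the weight values and the score formula are illustrative assumptions, not values prescribed by the invention.

    def select_central_server(providers, weights=(0.5, 0.3, 0.2)):
        """providers: list of dicts with 'name', 'flops', 'bandwidth' and 'memory' keys."""
        def normalize(values):
            top = max(values) or 1.0                  # avoid division by zero
            return [v / top for v in values]

        flops = normalize([p["flops"] for p in providers])
        bw    = normalize([p["bandwidth"] for p in providers])
        mem   = normalize([p["memory"] for p in providers])
        scores = [
            weights[0] * f + weights[1] * b + weights[2] * m
            for f, b, m in zip(flops, bw, mem)
        ]
        # The best-scoring available data provider becomes the central server (step S4).
        best = max(range(len(providers)), key=scores.__getitem__)
        return providers[best]["name"], scores[best]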
Further, in step S7a, model parameter initialization conventionally follows these guidelines:
(1) the parameters cannot all be initialized to 0, nor can they all be initialized to the same value;
(2) preferably, the parameters are initialized with a mean of 0, with positive and negative values interleaved in roughly equal numbers.
Common initialization methods include random initialization according to a normal distribution (normal) and random initialization according to a uniform distribution (uniform), which are not described again here. In addition, there are:
① Glorot initialization. a) The Glorot normal initializer (glorot_normal), also known as the Xavier normal initializer, draws samples from a truncated normal distribution centered at 0 with stddev = sqrt(2 / (fan_in + fan_out)), where fan_in is the number of input units in the weight tensor and fan_out is the number of output units in the weight tensor. b) The Glorot uniform initializer (glorot_uniform), also called the Xavier uniform initializer, draws samples from a uniform distribution in [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)), with fan_in and fan_out defined as above.
② Kaiming initialization, also called He initialization or MSRA initialization. a) The Kaiming normal initializer (he_normal) draws samples from a truncated normal distribution centered at 0 with stddev = sqrt(2 / fan_in), where fan_in is the number of input units in the weight tensor; in Keras it is written he_normal(seed=None). b) The Kaiming uniform initializer (he_uniform), the He uniform variance-scaling initializer, draws samples from a uniform distribution in [-limit, limit], where limit = sqrt(6 / fan_in); in Keras it is written he_uniform(seed=None).
③ LeCun initialization. a) The LeCun uniform initializer (lecun_uniform), lecun_uniform(seed=None), draws samples from a uniform distribution in [-limit, limit], where limit = sqrt(3 / fan_in) and fan_in is the number of input units in the weight tensor. b) The LeCun normal initializer (lecun_normal), lecun_normal(seed=None), draws samples from a truncated normal distribution centered at 0 with stddev = sqrt(1 / fan_in), where fan_in is the number of input units in the weight tensor.
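Where Keras is used on a client, the initializers listed above can be selected per layer as in the following sketch; the layer sizes and seed values are arbitrary illustrative choices.

    from tensorflow import keras
    from tensorflow.keras import layers, initializers

    model = keras.Sequential([
        # Glorot/Xavier normal: stddev = sqrt(2 / (fan_in + fan_out))
        layers.Dense(64, activation="relu", input_shape=(32,),
                     kernel_initializer=initializers.GlorotNormal(seed=0)),
        # He/Kaiming uniform: limit = sqrt(6 / fan_in)
        layers.Dense(64, activation="relu",
                     kernel_initializer=initializers.HeUniform(seed=0)),
        # LeCun normal: stddev = sqrt(1 / fan_in)
        layers.Dense(1, kernel_initializer=initializers.LecunNormal(seed=0)),
    ])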
④ Batch Normalization (BN). BN normalizes the inputs so that the input of each layer of the neural network keeps approximately the same distribution. As the number of network layers increases, the distribution gradually shifts toward the upper and lower limits of the value range of the nonlinear activation function, which slows convergence and causes the gradient to vanish during back-propagation. BN forcibly pulls the distribution of the input values of every neuron in each layer back to a standard normal distribution with mean 0 and variance 1 through normalization, so that the activation inputs fall in the sensitive region of the nonlinear function; this increases the gradients and makes learning converge much faster. A scale-and-shift step then applies γ and β, which are learned parameters that allow the standardized distribution to be scaled and shifted left and right.
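The normalization followed by the scale-and-shift with γ and β described above corresponds to the following minimal per-feature sketch (training-mode batch statistics only; running averages and a framework's own BatchNormalization layer are omitted).

    import numpy as np

    def batch_norm(x, gamma, beta, eps=1e-5):
        """x: array of shape (batch, features). Returns gamma * x_hat + beta."""
        mean = x.mean(axis=0)                       # per-feature batch mean
        var = x.var(axis=0)                         # per-feature batch variance
        x_hat = (x - mean) / np.sqrt(var + eps)     # pull inputs back toward N(0, 1)
        return gamma * x_hat + beta                 # learned scale (gamma) and shift (beta)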
In step S8, the reliability of the central server is analyzed using a probabilistic graphical model to predict its stability. A probabilistic graphical model is the general name for models that express probability-based dependency relationships in graph form; it combines knowledge from probability theory and graph theory and uses a graph to represent the joint probability distribution of the variables associated with the model. Common probabilistic graphical models include Bayesian networks, Markov networks and hidden Markov models, and any of these models may be used in this scheme.
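As one hedged illustration of such a reliability analysis (not the specific probabilistic graphical model of the invention), the sketch below estimates the probability that the central server becomes abnormal before the next aggregation from two observed indicators, using a hand-specified conditional probability table in the spirit of a small Bayesian network; all probability values and thresholds are assumptions made for illustration.

    # P(failure | high_load, low_memory): illustrative CPT, not calibrated values.
    FAILURE_CPT = {
        (False, False): 0.01,
        (False, True):  0.10,
        (True,  False): 0.15,
        (True,  True):  0.60,
    }

    def predict_instability(cpu_load, free_mem_ratio,
                            load_threshold=0.9, mem_threshold=0.1):
        """Return an assumed P(central server abnormal before the next round)."""
        high_load = cpu_load > load_threshold
        low_memory = free_mem_ratio < mem_threshold
        return FAILURE_CPT[(high_load, low_memory)]

    def is_stable(cpu_load, free_mem_ratio, risk_threshold=0.3):
        """Step S8: if the predicted risk is too high, return to S2 and reselect."""
        return predict_instability(cpu_load, free_mem_ratio) < risk_threshold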
Second embodiment
The present invention also provides a computer readable storage medium having stored thereon computer instructions which, when executed, perform the steps of the aforementioned method. For details, the method is described in the foregoing section, and is not repeated here.
It will be appreciated by those of ordinary skill in the art that all or part of the steps of the methods in the above embodiments may be performed by related hardware instructed by a program, and the program may be stored in a computer-readable storage medium. The storage medium may include volatile and non-volatile, removable and non-removable media, which implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, a computer-readable medium does not include transitory computer-readable media such as modulated data signals and carrier waves.
Third embodiment
The invention also provides a terminal comprising a memory and a processor, wherein the memory is stored with data provider information and computer instructions capable of running on the processor, and the processor executes the computer instructions to execute the steps of the method. For details, the method is described in the foregoing section, and is not repeated here.
Fourth embodiment
A collaborative learning system based on a dynamically updated central party, in which the central server is communicatively coupled to each data provider; the system comprises:
the data provider determining module is used for determining an available data provider through the network connection state;
the performance evaluation module evaluates the performance of the data provider through parameters such as computing power, bandwidth and memory;
the optimization module is used for autonomously selecting an optimal data provider in the network as a central server according to the evaluation performance and operating a central program;
and the central server establishes connection with all available data providers, namely clients.
The dynamic update learning module judges whether to enter a cooperative learning task or update the central server according to the initial state, and if so, starts the cooperative learning task and predicts the risk of the central server; if not, entering a central server for updating, and judging whether the cooperative learning task is continued, recovered or suspended until the task is finished.
The key point of the method is the state of the cooperative learning task after the central server is updated; this part mainly runs steps S7a and S7b of the preceding method. While the method is executed, client model training is not interrupted and does not need to be restarted, which guarantees fast updating of the central server and fast recovery after an abnormality until the learning task is completed.
It is to be understood that the system and its modules in one or more implementations of the present description can be implemented in various ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. Wherein the hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory for execution by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD-or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules of the present application may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., but also by software executed by various types of processors, for example, or by a combination of the above hardware circuits and software (e.g., firmware).
It is to be noted that different embodiments may produce different advantages, and in different embodiments, any one or combination of the above advantages may be produced, or any other advantages may be obtained.
It should be noted that the above description of the processing device and its modules is merely for convenience of description and is not intended to limit the present application to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given the teachings of the present system, any combination of modules or sub-system configurations may be used to connect to other modules without departing from such teachings.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered merely illustrative and not restrictive of the broad application. Various modifications, improvements and adaptations to the present application may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present application and thus fall within the spirit and scope of the exemplary embodiments of the present application.
Also, this application uses specific language to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereon. Accordingly, various aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as "data block," module, "" engine, "" unit, "" component, "or" system. Furthermore, aspects of the present application may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of the present application may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET and Python, conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP and ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or processing device. In the latter scenario, the remote computer may be connected to the user's computer through any network format, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service such as software as a service (SaaS).
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing processing device or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to require more features than are expressly recited in the claims. Indeed, the embodiments may be characterized as having less than all of the features of a single embodiment disclosed above.
Numerals describing the number of components, attributes, etc. are used in some embodiments, it being understood that such numerals used in the description of the embodiments are modified in some instances by the use of the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
The entire contents of each patent, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, documents, and the like, are hereby incorporated by reference into this application. Except where the application is filed in a manner inconsistent or contrary to the present disclosure, and except where the claim is filed in its broadest scope (whether present or later appended to the application) as well. It is noted that the descriptions, definitions and/or use of terms in this application shall control if they are inconsistent or contrary to the statements and/or uses of the present application in the material attached to this application.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present application. Other variations are also possible within the scope of the present application. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the present application can be viewed as being consistent with the teachings of the present application. Accordingly, the embodiments of the present application are not limited to only those embodiments explicitly described and depicted herein.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises that element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (6)

1. A collaborative learning method for dynamically updating a central party is characterized by comprising the following steps:
S1, each data provider is ready, and the task initiator initiates a task;
S2, each data provider provides the data used in the cooperative learning task, and the currently available data providers are obtained through the network connection state;
S3, evaluating the current performance of the available data providers through indexes;
S4, selecting the optimal data provider as the central server by comparing the performance of the data providers, and running the central program;
S5, the central server establishes connections with all available data providers, i.e. clients;
S6, judging whether the current task is in the initial state; if yes, the cooperative learning task starts and the process goes to step S7a; if not, the task is currently in the cooperative-task execution state and the process goes to step S7b;
S7a, after the cooperative learning task starts, the task flow is as follows: 7a1) the central server initializes the model parameters; 7a2) the central server distributes the model to the clients; 7a3) each client performs model training with its local data; 7a4) each client encrypts and sends the trained model to the central server, and the central server performs model aggregation; after the aggregation finishes, the model information is saved and sent to a public space, the stability of the central server is predicted, and the flow proceeds to step S8;
S7b, the central-server status task flow comprises the following steps: 7b1) judging whether the current central server has been updated; 7b2) if so, the central server of the previous round of the task was abnormal, the model information is read from the public space on the newly selected optimal central server, and the cooperative learning task is recovered; if not, the abnormal risk of the central server has been cleared and the cooperative learning task continues; 7b3) after one round of model aggregation finishes, judging whether the stop condition of the cooperative learning task is met; if it is met, the flow proceeds to step S9, and if not, the stability of the central server is predicted and the flow proceeds to step S8;
S8, predicting the stability of the central server; if the system is unstable and the central server is abnormal, the task of the current round is stopped, the flow returns to step S2, the central server is reselected and the flow switches to S7a; if the system is stable, the cooperative learning task continues, the flow switches to S7b, and the above process is repeated until the task stop condition is met and the flow proceeds to step S9;
S9, the task stop condition is met and the task ends.
2. The method of claim 1, wherein: the indexes in step S3 include calculation power, bandwidth, and memory.
3. The method of claim 1, wherein: in step S8, the central server reliability is analyzed using a probabilistic graph model to enable a prediction of its stability.
4. A computer-readable storage medium having stored thereon computer instructions, characterized in that: the computer instructions when executed perform the steps of the method of any one of claims 1 to 3.
5. A terminal comprising a memory and a processor, characterized in that: the memory having stored thereon data provider information and computer instructions executable on the processor, the processor when executing the computer instructions performing the steps of the method of any one of claims 1-3.
6. A collaborative learning system based on a dynamically updated central party, wherein a central server of the system is in communication connection with each data provider and executes the steps of the method according to any one of claims 1 to 3, and the system comprises:
the data provider determining module is used for determining an available data provider through the network connection state;
the performance evaluation module evaluates the performance of the data provider through parameters such as computing power, bandwidth and memory;
the optimization module is used for autonomously selecting an optimal data provider in the network as a central server according to the evaluation performance and operating a central program;
the central server establishes connection with all available data providers, namely clients;
the dynamic update learning module judges whether to enter a cooperative learning task or update the central server according to the initial state, and if so, starts the cooperative learning task and predicts the risk of the central server; if not, entering a central server for updating, and judging whether the cooperative learning task is continued, recovered or suspended until the task is finished.
CN202011455627.8A 2020-12-10 2020-12-10 Cooperative learning method, storage medium, terminal and system for updating center side Pending CN112686368A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011455627.8A CN112686368A (en) 2020-12-10 2020-12-10 Cooperative learning method, storage medium, terminal and system for updating center side
PCT/CN2020/140757 WO2022121026A1 (en) 2020-12-10 2020-12-29 Collaborative learning method that updates central party, storage medium, terminal and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011455627.8A CN112686368A (en) 2020-12-10 2020-12-10 Cooperative learning method, storage medium, terminal and system for updating center side

Publications (1)

Publication Number Publication Date
CN112686368A true CN112686368A (en) 2021-04-20

Family

ID=75448915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011455627.8A Pending CN112686368A (en) 2020-12-10 2020-12-10 Cooperative learning method, storage medium, terminal and system for updating center side

Country Status (2)

Country Link
CN (1) CN112686368A (en)
WO (1) WO2022121026A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114358316A (en) * 2022-01-14 2022-04-15 中国人民解放军总医院 Federal learning system and large-scale image training method and device thereof
CN114707430A (en) * 2022-06-02 2022-07-05 青岛鑫晟汇科技有限公司 Multi-user encryption-based federated learning visualization system and method

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117034000A (en) * 2023-03-22 2023-11-10 浙江明日数据智能有限公司 Modeling method and device for longitudinal federal learning, storage medium and electronic equipment
CN116308965B (en) 2023-05-24 2023-08-04 成都秦川物联网科技股份有限公司 Intelligent gas underground gas pipe network safety management method, internet of things system and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929880A (en) * 2019-11-12 2020-03-27 深圳前海微众银行股份有限公司 Method and device for federated learning and computer readable storage medium
CN111538608A (en) * 2020-04-30 2020-08-14 深圳前海微众银行股份有限公司 Method for preventing terminal equipment from being down, terminal equipment and storage medium
CN111611610A (en) * 2020-04-12 2020-09-01 西安电子科技大学 Federal learning information processing method, system, storage medium, program, and terminal

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109525411B (en) * 2017-09-19 2022-03-04 北京金山云网络技术有限公司 Network function component cluster, system, control method, device and storage medium
US11816548B2 (en) * 2019-01-08 2023-11-14 International Business Machines Corporation Distributed learning using ensemble-based fusion
CN111143173A (en) * 2020-01-02 2020-05-12 山东超越数控电子股份有限公司 Server fault monitoring method and system based on neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110929880A (en) * 2019-11-12 2020-03-27 深圳前海微众银行股份有限公司 Method and device for federated learning and computer readable storage medium
CN111611610A (en) * 2020-04-12 2020-09-01 西安电子科技大学 Federal learning information processing method, system, storage medium, program, and terminal
CN111538608A (en) * 2020-04-30 2020-08-14 深圳前海微众银行股份有限公司 Method for preventing terminal equipment from being down, terminal equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114358316A (en) * 2022-01-14 2022-04-15 中国人民解放军总医院 Federal learning system and large-scale image training method and device thereof
CN114707430A (en) * 2022-06-02 2022-07-05 青岛鑫晟汇科技有限公司 Multi-user encryption-based federated learning visualization system and method

Also Published As

Publication number Publication date
WO2022121026A1 (en) 2022-06-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20210420)