CN115293358A - Internet of things-oriented clustered federal multi-task learning method and device - Google Patents


Info

Publication number
CN115293358A
Authority
CN
China
Prior art keywords
internet
things
training
terminal equipment
cluster
Prior art date
Legal status
Pending
Application number
CN202210751781.2A
Other languages
Chinese (zh)
Inventor
张群
马珊珊
徐洋
杨少杰
熊翱
王冰
廖双乐
周游
Current Assignee
State Grid Comprehensive Energy Service Group Co ltd
Beijing University of Posts and Telecommunications
China Electronics Standardization Institute
Suzhou Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
State Grid Comprehensive Energy Service Group Co ltd
Beijing University of Posts and Telecommunications
China Electronics Standardization Institute
Suzhou Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Comprehensive Energy Service Group Co ltd, Beijing University of Posts and Telecommunications, China Electronics Standardization Institute, and Suzhou Power Supply Co of State Grid Jiangsu Electric Power Co Ltd
Priority to CN202210751781.2A
Publication of CN115293358A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16YINFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
    • G16Y30/00IoT infrastructure


Abstract

The invention provides a clustered federated multi-task learning method and device for the Internet of things. During the local training of each Internet of things terminal device, the number of local training rounds is adjusted according to the device's computing power, so that the computing resources of every terminal device are fully utilized and model training efficiency is improved. The global model imposes a regularization constraint on the personalized training task, which effectively prevents overfitting, controls the degree of personalization, and improves model quality.

Description

Internet of things-oriented clustered federal multi-task learning method and device
Technical Field
The invention relates to the technical field of data processing, and in particular to a clustered federated multi-task learning method and device for the Internet of things.
Background
The Internet of things digitizes and networks physical objects, enabling efficient information exchange between objects, between objects and people, and with the real environment; by integrating various information technologies into social behavior through new service modes, it is regarded as a major revolution in the information field. To make better use of the large amount of data generated in the Internet of things, the traditional approach for a given training task is to collect the data centrally, train a unified model, and then distribute the trained model to terminals. In this process, the privacy and security of terminal data cannot be guaranteed: the data is vulnerable to attack during transmission, leading to leakage, and collecting data to the cloud conflicts with users' privacy rights. Another conventional approach is to train the model on the terminal itself, but this is constrained by the terminal's hardware, such as computing and memory resources, which restricts the choice of model; moreover, if a single terminal has too few data samples or poor sample quality, the trained model may be of low quality.
In the prior art, federated learning achieves data sharing while preserving terminal data privacy: terminals collaboratively train a shared model without their data leaving the device, which gives the approach broad application prospects in the Internet of things. However, the Internet of things environment is complex, and applying federated learning there faces the challenges of device heterogeneity, data heterogeneity, and model heterogeneity.
Device heterogeneity refers to differences in terminal device configurations in the Internet of things, including hardware and network conditions, which lead to differences in terminal computing, storage, and communication capability. It causes problems such as high communication cost, dropped connections, and fault tolerance, and indirectly reduces training efficiency or leads to uneven user participation in model training. Data heterogeneity means that the data of different terminals is non-IID (not independent and identically distributed). Distribution differences arise from factors such as a terminal's surroundings and operating times, and they often cause weight divergence during federated learning, degrading model quality. Model heterogeneity means that in federated learning different terminals may deploy models with different architectures, which breaks the aggregation step of traditional federated learning.
In Internet of things scenarios, these forms of heterogeneity adversely affect model quality, training efficiency, user experience, and the like; mitigating the adverse effects caused by heterogeneity in Internet of things scenarios is therefore an urgent problem.
Disclosure of Invention
The embodiments of the invention provide a clustered federated multi-task learning method and device for the Internet of things, which eliminate or mitigate one or more defects in the prior art and address the adverse effects of the Internet of things' device heterogeneity and model heterogeneity on training tasks.
One aspect of the invention provides a clustered federated multi-task learning method for the Internet of things. The method runs on Internet of things terminal devices, which are connected to a macro base station through micro base station relays and are divided into a preset number of clusters with the goal of minimizing data distribution differences. The method performs federated multi-task training, and each training round comprises the following steps:
downloading, from the macro base station, the current-round global model of the cluster to which the current Internet of things terminal device belongs;
performing a global training task on the downloaded global model using the local data of the current Internet of things terminal device, wherein the global training task updates parameters with a first loss function to obtain a global update model;
performing a personalized training task on the personalized model of the current Internet of things terminal device using its local data, wherein the personalized training updates parameters with a second loss function to obtain an updated personalized model; the second loss function adds, on top of the first loss function, a regularization constraint on the personalized training task that uses the parameters of the global model;
and sending the global update model to the macro base station, where it is aggregated with the models obtained by the other Internet of things terminal devices in the current cluster executing the global training task, so as to update the cluster's global model.
In some embodiments, the method clusters the Internet of things terminal devices with the k-means algorithm, and dividing the devices into a preset number of clusters with the goal of minimizing data distribution differences comprises:
randomly initializing the preset number of Internet of things terminal devices as centroids to establish the corresponding clusters;
and computing, one by one, the distribution distance between each remaining device's local data and that of each centroid, assigning the device to the cluster of the nearest centroid, and updating each centroid to the mean distribution of the local data of the devices in its cluster, until all Internet of things terminal devices are assigned.
In some embodiments, when the Internet of things terminal devices are divided into the preset number of clusters with the goal of minimizing data distribution differences, the cluster centers are selected as follows:
randomly selecting one Internet of things terminal device as a cluster center;
computing, one by one, the distribution distance between each remaining device's local data and that of every existing cluster center, and selecting the device with the largest distribution distance as a new cluster center;
and iterating until the preset number of cluster centers has been selected.
In some embodiments, the method further comprises determining the preset number using an elbow method.
In some embodiments, the method further adjusts the local training rounds according to the computing power of each Internet of things terminal device in the cluster, comprising:
calculating the delay required for each Internet of things terminal device to execute one round of local training, according to:

$$l_{k,z} = \frac{|D_z| \cdot L}{f_z}$$

where $l_{k,z}$ is the delay required by Internet of things terminal device $z$ in the $k$-th cluster to execute one round of local training, $|D_z|$ is the amount of local data on device $z$, $L$ is the computation required per data sample, and $f_z$ is the computing power of device $z$;
obtaining, as the longest delay of each cluster, the delay of the Internet of things terminal device that needs the most time to execute one round of local training:

$$l_{k,\max} = \max_{z \in C_k} l_{k,z}$$

where $l_{k,\max}$ is the longest one-round local-training delay among all Internet of things terminal devices in the $k$-th cluster $C_k$;
adjusting the number of local training rounds executed by each Internet of things terminal device in each cluster according to the ratio between its one-round delay and the longest delay of its cluster:

$$e_z = \left\lfloor \frac{l_{k,\max}}{l_{k,z}} \, e_{k,\max} \right\rfloor$$

where $e_z$ is the number of local training rounds of Internet of things terminal device $z$, and $e_{k,\max}$ is the number of local training rounds of the device with the longest one-round local-training delay in the $k$-th cluster.
In some embodiments, the second loss function adds, on top of the first loss function, a regularization constraint on the personalized training task that uses the parameters of the global model, and its expression is:

$$F_{z,P}(w_z, D_z) = F_{z,g}(w_z, D_z) + \sum_{i \in s} \frac{\lambda_{z,i}}{2} \left\| w_{z,i} - w_{k,i}^{n} \right\|^2$$

where $F_{z,P}(\cdot)$ is the second loss function, $F_{z,g}(\cdot)$ is the first loss function, $w_z$ are the parameters of the personalized model of Internet of things terminal device $z$ in the $n$-th round, $D_z$ is the local data of device $z$ in the $n$-th round, $w_{z,i}$ are the $i$-th-layer parameters of $w_z$, $w_{k,i}^{n}$ are the $i$-th-layer parameters of the global model adopted by device $z$ in the $n$-th round, $\lambda_{z,i}$ is a control parameter, and $s$ is the set of hidden layers participating in soft parameter sharing.
In some embodiments, the global update model is sent to the macro base station and aggregated with the models obtained by the other Internet of things terminal devices in the current cluster executing the global training task, so as to update the cluster's global model, according to:

$$w_k^{n+1} = \sum_{z \in C_k} p_z \, \tilde{w}_z^{n}$$

where $w_k^{n+1}$ are the parameters of the global model of the $k$-th cluster in round $n+1$, $p_z$ is the aggregation weight of Internet of things terminal device $z$, and $\tilde{w}_z^{n}$ are the parameters of the global update model of device $z$ in the $n$-th round.
In some embodiments, the method further comprises:
stopping the federated multi-task training when a preset convergence condition is reached; the preset convergence condition is that a preset number of training rounds has been reached.
In another aspect, the invention further provides a clustered federal multitask learning device for internet of things, which includes a processor and a memory, wherein the memory stores computer instructions, the processor is configured to execute the computer instructions stored in the memory, and when the computer instructions are executed by the processor, the device implements the steps of the method.
In another aspect, the present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
The invention has the beneficial effects that:
according to the clustering federated multitask learning method and device for the Internet of things, data distribution in the same cluster tends to be approximate by clustering the Internet of things terminal equipment, a federated multitask learning algorithm is executed in each cluster, a global training task and an individualized training task are executed at each Internet of things terminal equipment, data sharing is achieved in the clusters, meanwhile, local data of the Internet of things terminal equipment are fully utilized to train the individualized tasks, the local data of each Internet of things terminal equipment are efficiently utilized, and the training effect is improved.
Furthermore, during each Internet of things terminal device's local training, the number of local training rounds is adjusted according to its computing power, so that every device's computing resources are fully utilized and model training efficiency is improved.
Furthermore, the global model imposes a regularization constraint on the personalized training task, which effectively prevents overfitting, controls the degree of personalization, and improves model quality.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present invention are not limited to the specific details set forth above, and that these and other objects that can be achieved with the present invention will be more clearly understood from the detailed description that follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
Fig. 1 is a schematic diagram of the Internet of things of the present invention.
Fig. 2 is a schematic diagram of a clustering process of terminal devices of the internet of things in an embodiment of the present invention.
Fig. 3 is a schematic diagram of a process of clustering terminal devices of the internet of things in another embodiment of the present invention.
Fig. 4 is a schematic flow chart of the clustering federal multitask learning method for the internet of things according to an embodiment of the present invention.
Fig. 5 is a logic diagram of a clustered federal multitask learning method for the internet of things according to another embodiment of the present invention.
Fig. 6 is a schematic diagram of aggregation of global models in a cluster according to the internet-of-things-oriented clustered federal multitask learning method in the embodiment of the invention.
Fig. 7 is a graph of average accuracy of the FedAvg algorithm (federal learning algorithm) and the PCFML algorithm (clustered federal multitask learning algorithm) as a function of training rounds when the Ratio is set to 0.7 under the mnist data set.
FIG. 8 is a graph of mean accuracy of the FedAvg algorithm and the PCFML algorithm as a function of training rounds with Ratio set to 0.9 in the mnist dataset.
FIG. 9 is a graph of mean accuracy of the FedAvg algorithm and the PCFML algorithm as a function of training runs with Ratio set to 0.7 for cifar10 dataset.
FIG. 10 is a graph of the mean accuracy of the FedAvg algorithm and the PCFML algorithm as a function of training rounds with Ratio set to 0.5 for the cifar10 dataset.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the following embodiments and the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
It should be noted that, in order to avoid obscuring the present invention with unnecessary details, only the structures and/or processing steps closely related to the solution according to the present invention are shown in the drawings, and other details not so related to the present invention are omitted.
It should be emphasized that the term "comprises/comprising" when used herein, is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
It is also noted herein that the term "coupled," if not specifically stated, may refer herein to not only a direct connection, but also an indirect connection in which an intermediate is present.
Some traditional federated learning methods adopt knowledge distillation: clients upload model predictions instead of model parameters, which lets each client choose its own model architecture, and the server aggregates the predictions dynamically according to the importance of the knowledge each client provides, so that the aggregated prediction better integrates the clients' model knowledge. After server-side aggregation, only the prediction distribution (pseudo-label) information on public data is transmitted back to the clients, which improves communication efficiency.
Other approaches create federated learning tasks with blockchain smart contracts: terminal devices collect data samples and offload them to edge computing devices for local model training; the edge computing devices encrypt the locally trained model parameters and upload them to the blockchain, and a new block is generated after the blockchain nodes reach consensus; the smart contract aggregates the model parameters and updates the overall model; the smart contract then checks whether the model's preset convergence condition is met, starting the next training round if not and terminating the federated learning task if so; finally, each edge computing device trains a personalized model from the global model information combined with its own data.
However, in Internet of things scenarios, device configuration differences and local data structure differences create significant heterogeneity among terminal devices, so the traditional federated learning methods of the prior art cannot account for the heterogeneous characteristics of data and models across different terminals.
Based on this, one aspect of the invention provides a clustered federated multi-task learning method for the Internet of things. As shown in fig. 1, the method runs on Internet of things terminal devices, which are connected to a macro base station through micro base station relays and are divided into a preset number of clusters with the goal of minimizing data distribution differences.
It should be noted that the Internet of things terminal devices may include mobile phones, computers, vehicles equipped with smart in-vehicle devices, smart home appliances, and other user terminal devices capable of storing and executing programs. The micro base stations relay between the terminal devices and the macro base station; the MEC server of each micro base station has a certain amount of computing power, and the macro base station they connect to covers multiple terminal devices. The MEC server within the macro base station has powerful computing and communication resources.
In this embodiment, to weaken the data-plane heterogeneity of the Internet of things terminal devices, devices with similar data distributions are grouped into the same cluster by evaluating distribution similarity, and federated multi-task learning is performed per cluster. For example, wearable smart devices for collecting vital signs may include smart watches, smart bracelets, and smart sphygmomanometers; the data they collect mainly covers parameters such as heart rate, body temperature, blood pressure, and blood oxygen. Owing to differences in collection form, the data structure of a smart watch resembles that of a smart bracelet, whereas the data collected by a smart sphygmomanometer differs somewhat from both. When building vital-sign analysis models, devices differ in application scenario, data structure, and analysis purpose, so different terminal devices need personalized training. Therefore, on the one hand to make training more efficient and avoid poor convergence caused by small local data volumes, and on the other hand to make training attend to device and data heterogeneity, this embodiment clusters the terminal devices: the smart watch and smart bracelet can be grouped into one cluster, and the smart sphygmomanometer into another.
Specifically, the Internet of things terminal devices are clustered according to the similarity of their local data distributions. In some embodiments, the method clusters the devices with the k-means algorithm, dividing them into a preset number of clusters with the goal of minimizing data distribution differences, as shown in fig. 2, in steps S101 to S102:
step S101: and initializing a preset number of Internet of things terminal devices into a mass center at random to establish a corresponding cluster.
Step S102: and calculating the distribution distance between the local data of the rest Internet of things terminal equipment and the local data of each mass center one by one, attributing the local data to the cluster corresponding to the mass center with the closest distribution distance, and calculating the distribution mean value of the local data of each Internet of things terminal equipment in each cluster to update the mass center until all the Internet of things terminal equipment are classified.
This embodiment uses k-means as the clustering algorithm, where k is the number of classes and "means" denotes the mean value: similar data points are grouped by presetting the value of k and an initial centroid for each class, and the clustering result is then refined by iterative mean updates. The distribution distance here is the Euclidean distance, and terminal devices whose local data are close in Euclidean distance are placed in the same cluster, so that heterogeneity can be attended to.
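As an illustration of this step, the following is a minimal sketch of k-means over per-device label distributions, assuming each device reports its normalized label distribution to the macro base station as a vector (the function name and data layout are assumptions, not taken from the patent):

```python
import numpy as np

def kmeans_cluster_devices(dists, k, n_iters=100, seed=0):
    """Cluster IoT devices by local label-distribution similarity.

    dists: (num_devices, num_classes) array; row z is device z's
           normalized label distribution, as reported to the macro base station.
    k:     preset number of clusters.
    Returns an array of cluster indices, one per device.
    """
    rng = np.random.default_rng(seed)
    # Initialize k centroids from randomly chosen devices.
    centroids = dists[rng.choice(len(dists), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each device to the centroid with the closest distribution
        # (Euclidean distance, as in the text above).
        d = np.linalg.norm(dists[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update each centroid to the mean distribution of its cluster.
        new_centroids = np.array([
            dists[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilized
        centroids = new_centroids
    return labels
```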
In some embodiments, when the Internet of things terminal devices are divided into the preset number of clusters with the goal of minimizing data distribution differences, the cluster centers are selected as shown in fig. 3, in steps S201 to S203:
step S201: and randomly selecting one Internet of things terminal device as a clustering center.
Step S202: and calculating the distribution distance between the rest of the Internet of things terminal equipment and the local data of each existing clustering center one by one, and selecting the Internet of things terminal equipment with the largest distribution distance as a new clustering center.
Step S203: and iterating to select a preset number of clustering centers.
When the centroids are chosen at random as in steps S101 to S102, the k-means algorithm has low time complexity, but a drawback of random selection is that two centroids may end up close together. In steps S201 to S203, only one Internet of things terminal device is selected as the initial cluster center (centroid); the distances from the remaining devices to the existing cluster centers are then computed repeatedly, and the device with the largest distance is established as a new cluster center, which effectively avoids the problem of two cluster centroids being too close.
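A sketch of this farthest-point initialization, under the same assumed label-distribution representation as above:

```python
import numpy as np

def farthest_point_centers(dists, k, seed=0):
    """Pick k cluster centers so that each new center is the device whose
    label distribution is farthest from all centers chosen so far."""
    rng = np.random.default_rng(seed)
    centers = [int(rng.integers(len(dists)))]  # one random initial center
    for _ in range(k - 1):
        # Distance from every device to its nearest existing center.
        d = np.min(
            [np.linalg.norm(dists - dists[c], axis=1) for c in centers], axis=0
        )
        centers.append(int(d.argmax()))  # farthest device becomes a new center
    return centers
```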
In other embodiments, the final number of clusters can be determined by selecting the inflection point with the elbow method. The elbow method uses the sum of squared errors (SSE) to measure clustering quality: as the number of clusters increases, the SSE at first falls sharply, and as k continues to grow the decline flattens out; the inflection point is taken as the final number of clusters.
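A sketch of the elbow criterion, reusing the hypothetical kmeans_cluster_devices helper above; one computes the SSE over candidate values of k and picks the inflection point by inspection:

```python
import numpy as np

def sse_for_k(dists, k):
    """Sum of squared distances from each device's distribution to its
    cluster centroid, for a given cluster count k."""
    labels = kmeans_cluster_devices(dists, k)  # sketch defined above
    sse = 0.0
    for j in range(k):
        members = dists[labels == j]
        if len(members):
            sse += ((members - members.mean(axis=0)) ** 2).sum()
    return sse

# Sweep candidate k values and pick the "elbow" where the SSE curve flattens:
# dists = ...  # (num_devices, num_classes) label-distribution matrix
# curve = [sse_for_k(dists, k) for k in range(1, 11)]
```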
Specifically, after clustering the Internet of things terminal devices according to their local data distributions, the clustered federated multi-task learning method performs federated multi-task training. As shown in figs. 4 and 5, each training round comprises steps S301 to S304:
step S301: and downloading the global model of the current cluster local round to which the current terminal equipment of the Internet of things belongs from the macro base station.
Step S302: and performing a global training task on the global model by using local data of the current terminal equipment of the Internet of things, wherein the global training task performs parameter updating by using a first loss function to obtain a global updating model.
Step S303: performing personalized training tasks on the personalized models of the current terminal equipment of the Internet of things by using local data of the current terminal equipment of the Internet of things, and updating parameters of the personalized training by adopting a second loss function to obtain updated personalized models; and the second loss function is combined with the parameters of the global model to carry out regularization constraint on the personalized training task on the basis of the first loss function.
Step S304: and sending the global updating model to the macro base station, and aggregating the global updating model with models obtained by executing global training tasks by other terminal equipment of the Internet of things in the current cluster so as to update the global model of the current cluster.
In step S301 of this embodiment, each cluster undergoes federated multi-task training separately, and each cluster has its own global model, which is downloaded to every Internet of things terminal device in the cluster and trained with each device's local data. First, in step S302, the global training task is executed within the cluster: each device trains with its local data according to the first loss function, and in step S304 the trained global update model is returned to the macro base station to aggregate and update the global model. In some embodiments, the first loss function is the cross-entropy loss.
In step S303, each Internet of things terminal device in the cluster also executes a personalized training task based on its own task objective. The personalized training likewise uses local data to train the device's personalized model for the current round; the personalized model's structure is consistent with that of the global model, and it is initially obtained by transfer learning from the global model. The structures of the personalized and global models are set according to the specific task. In the personalized training task, the second loss function is configured for the designated training objective, but to prevent overfitting and to control the degree of personalization, it introduces the global model's parameters on top of the first loss function as a regularization constraint on the personalized training.
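The per-round control flow of steps S301 to S304 can be summarized in the following skeleton; the device and macro_base_station objects and their methods are purely illustrative stand-ins, not names taken from the patent:

```python
def training_round(device, macro_base_station):
    """One federated round from the viewpoint of a single IoT device
    (steps S301-S304)."""
    # S301: download the current global model of the device's cluster.
    w_global = macro_base_station.download_global(device.cluster_id)
    # S302: global training task on local data with the first loss,
    # producing the global update model.
    w_update = device.train_global_task(w_global)
    # S303: personalized training task; the second loss regularizes the
    # personalized model toward w_global on the shared hidden layers.
    device.train_personalized_task(reference=w_global)
    # S304: upload the global update model; the macro base station
    # aggregates it with the other devices' updates in the same cluster.
    macro_base_station.upload_update(device.cluster_id, w_update)
```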
In some embodiments, the second loss function adds, on top of the first loss function, a regularization constraint on the personalized training task that uses the parameters of the global model, and its expression is:

$$F_{z,P}(w_z, D_z) = F_{z,g}(w_z, D_z) + \sum_{i \in s} \frac{\lambda_{z,i}}{2} \left\| w_{z,i} - w_{k,i}^{n} \right\|^2$$

where $F_{z,P}(\cdot)$ is the second loss function, $F_{z,g}(\cdot)$ is the first loss function, $w_z$ are the parameters of the personalized model of Internet of things terminal device $z$ in the $n$-th round, $D_z$ is the local data of device $z$ in the $n$-th round, $w_{z,i}$ are the $i$-th-layer parameters of $w_z$, $w_{k,i}^{n}$ are the $i$-th-layer parameters of the global model adopted by device $z$ in the $n$-th round, $\lambda_{z,i}$ is a control parameter, and $s$ is the set of hidden layers participating in soft parameter sharing.
In step S304, the global update model is sent to the macro base station and aggregated with the models obtained by the other Internet of things terminal devices in the current cluster executing the global training task, so as to update the cluster's global model, according to:

$$w_k^{n+1} = \sum_{z \in C_k} p_z \, \tilde{w}_z^{n}$$

where $w_k^{n+1}$ are the parameters of the global model of the $k$-th cluster in round $n+1$, $p_z$ is the aggregation weight of Internet of things terminal device $z$, and $\tilde{w}_z^{n}$ are the parameters of the global update model of device $z$ in the $n$-th round.
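A sketch of this intra-cluster aggregation over the uploaded state dicts; it assumes all entries are floating-point tensors and that the weights p_z sum to 1:

```python
import torch

def aggregate_cluster(updates, weights):
    """Weighted aggregation of one cluster's global-update models.

    updates: list of state_dicts uploaded by the cluster's devices
    weights: aggregation weights p_z (e.g. proportional to local data size)
    Returns the state_dict of the cluster's global model for round n+1.
    """
    agg = {name: torch.zeros_like(t) for name, t in updates[0].items()}
    for state, p_z in zip(updates, weights):
        for name in agg:
            agg[name] += p_z * state[name]  # w_k^{n+1} = sum_z p_z * w~_z^n
    return agg
```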
In steps S301 to S304, the federated multi-task training stops when a preset convergence condition is reached; here the condition is reaching a preset number of training rounds. In other embodiments, the convergence condition may instead be that the loss function falls below a set value.
Because the Internet of things terminal devices differ in hardware structure, computation load, and data volume, the time they need for one round of local training differs. The time for one round of local training across the devices usually depends on the slowest device; if all devices ran the same number of local rounds, devices that finish local training quickly, owing to small data volume or strong computing power, would sit idle for long periods. To make maximal use of computing resources, this embodiment also adjusts the number of local training rounds, preventing waste of computing resources.
Specifically, in some embodiments, the method further adjusts the local training rounds according to the computing power of each Internet of things terminal device in the cluster, in steps S401 to S403:
step S401: calculating the time delay required by each Internet of things terminal device to execute a round of local training, wherein the calculation formula is as follows:
Figure BDA0003721221440000101
wherein l k,z Represents the time delay D needed by the terminal equipment z of the Internet of things in the kth cluster to execute a round of local training z L represents the amount of z local data of the terminal equipment of the Internet of things, f z And the computing power of the terminal equipment z of the Internet of things is represented.
Step S402: obtain, as the longest delay of each cluster, the delay of the Internet of things terminal device that needs the most time to execute one round of local training:

$$l_{k,\max} = \max_{z \in C_k} l_{k,z}$$

where $l_{k,\max}$ is the longest one-round local-training delay among all Internet of things terminal devices in the $k$-th cluster $C_k$.
step S403: adjusting the round of local training executed by each Internet of things terminal device in each cluster according to the proportional relation between the time delay required by each Internet of things terminal device for executing a round of local training and the longest time delay corresponding to the cluster to which the terminal device belongs, wherein the calculation formula is as follows:
Figure BDA0003721221440000103
wherein e is z Representing the local training turn of the terminal equipment z of the internet of things,
Figure BDA0003721221440000104
and representing the local training turn of the Internet of things terminal equipment with the longest time delay required by executing the local training in the kth cluster.
In steps S401 to S403, within a cluster, the longer a device's one-round local-training delay, the fewer local training rounds it runs, so that in the end every Internet of things terminal device finishes its local training at roughly the same time and the computing resources of all devices are fully used. To be precise, a "big round" is the process in which every device in a cluster downloads the global model, executes the global training task, performs the personalized training task, and returns the global update model for aggregation; "local training" refers to the process of locally training and updating the global and personalized models.
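A sketch of this delay-proportional adjustment (steps S401-S403); the flooring to an integer round count and the constant per-sample cost are assumptions:

```python
def local_rounds(data_sizes, cycles_per_sample, speeds, e_max):
    """Per-device local training rounds within one cluster (steps S401-S403).

    data_sizes:        |D_z| for each device z in the cluster
    cycles_per_sample: per-sample processing cost L (assumed constant)
    speeds:            computing power f_z of each device
    e_max:             rounds assigned to the device with the longest delay
    """
    delays = [d * cycles_per_sample / f for d, f in zip(data_sizes, speeds)]
    l_max = max(delays)  # longest one-round delay l_{k,max} in the cluster
    # Faster devices run proportionally more local rounds, so all devices
    # finish local training at roughly the same time.
    return [max(1, int(l_max / l * e_max)) for l in delays]
```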
In another aspect, the invention further provides a clustered federal multitask learning device for internet of things, which includes a processor and a memory, wherein the memory stores computer instructions, the processor is configured to execute the computer instructions stored in the memory, and when the computer instructions are executed by the processor, the device implements the steps of the method.
In another aspect, the present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
The invention is illustrated below with reference to specific examples:
the embodiment provides a clustering federal multi-task learning method facing to the Internet of things, and the scheme is mainly based on the idea of personalized federal learning. The personalized federal learning is a solution to the data isomerism problem on the basis of the federal learning, and the personalized federal learning establishes a personalized model suitable for the terminal by considering the specificity of the terminal while enjoying the knowledge sharing advantage. The personalized federal learning method mainly comprises federal multitask learning, federal transfer learning, federal meta-learning and the like. In the federal learning, the establishment of the model for different terminals can be regarded as a plurality of tasks, so that the multitask learning can be naturally applied to the federal learning, and independent but related models can be fitted for the plurality of terminals. However, in the case of data heterogeneity, data sharing and personalization of a terminal model are often incompatible, and for a certain terminal, knowledge obtained from other terminals may be meaningless in implementing personalization. Therefore, the embodiment provides a clustered federal multi-task learning scheme, terminals with close data distribution are divided into a cluster, the terminals in the same cluster perform collaborative multi-task training due to similarity, and a hidden layer at the bottom of a shared model network learns some common low-level abstract features, so that data sharing in the cluster is realized.
The system architecture constructed in this embodiment is shown in fig. 1. The traditional federated learning training process mainly consists of terminals training local models, terminals uploading parameters, and the macro base station performing model aggregation and parameter distribution. Unlike a traditional federated learning task that trains all terminals collaboratively in a uniform way, this embodiment clusters the terminals, associating strongly related terminals together; the traditional federated learning task is executed within each cluster, and the intra-cluster global model provides a reference for the personalized federated learning of the terminals in that cluster. Each terminal completes the training of its personalized model through soft parameter sharing in multi-task learning. The proposed scheme balances intra-cluster data sharing with model personalization, improving performance such as accuracy; and because tasks within the same cluster are strongly related, it also has an advantage in convergence speed over traditional federated learning.
The Internet of things network is assumed to consist of Internet of things terminal devices, micro base stations, a macro base station, and the corresponding mobile edge computing (MEC) servers. The MEC server within the macro base station has powerful computing and communication resources. The micro base stations relay between the terminal devices and the macro base station; the MEC server in each micro base station has a certain computing capacity, and the base stations it connects to cover multiple terminal devices. Let $Z$ denote the set of Internet of things terminal devices, and let $D_z = \{x_z, y_z\}$ denote the local data set of terminal $z$, $z \in Z$.
In this embodiment, the internet-of-things-oriented clustering federal multitask learning method is shown in fig. 5, and includes the following steps:
1.1 clustering the terminal equipment of the Internet of things by the macro base station end, and initializing the personalized model of the terminal equipment of the Internet of things locally.
1.2 the terminal equipment of the Internet of things downloads the global model of the current turn of the cluster to which the terminal equipment belongs from the macro base station and locally generates a copy.
1.3 the terminal equipment of the Internet of things executes a global model training task on the local data set to complete global model copy updating and sends the copy to the macro base station.
1.4 the terminal equipment of the Internet of things executes the personalized model training task on the local data set to complete the updating of the personalized model.
1.5, aggregating the copies of the global model of the terminal equipment of the Internet of things at a macro base station end and updating the global model.
1.6 If the preset convergence condition is not met, return to step 1.2; otherwise, finish.
In Internet of things scenarios, the terminal devices often execute the same type of task but with different data distributions. In the clustering stage, the macro base station clusters according to the devices' data distributions: it first sends a request to all Internet of things terminal devices querying their data distributions, and each device replies with its own distribution after receiving the request. The macro base station then clusters the devices with the k-means algorithm, grouping devices with similar data distributions into a cluster, in the following steps:
2.1 Randomly initialize the centroids according to the number of clusters.
2.2 Compute the distance from each terminal's data distribution to all centroids, and assign the terminal to the cluster of the nearest centroid.
2.3 Adjust the centroid positions (each centroid moves to the mean of its cluster).
2.4 Repeat steps 2.2-2.3 until the cluster assignment of each Internet of things terminal device no longer changes.
The k-means algorithm requires the number of clusters to be preset, and in this embodiment that number is not known in advance, so the elbow method is used to determine it. SSE serves as the criterion for evaluating a candidate number of clusters: as the number of clusters increases, the samples are divided more finely and the SSE decreases; while the number of clusters is below the optimum, each additional cluster makes the SSE fall quickly, and beyond the optimum the decline flattens, so the optimal number of clusters can be identified.
After clustering is completed, let $K$ denote the set of clusters. Federated multi-task learning is performed within each cluster, and the model trained on the Internet of things terminal devices is taken to be a neural network. The hidden layers of the neural network are viewed as feature extractors: each model has several hidden layers, and the bottom hidden layers learn low-level features, so sharing the bottom hidden layers of the network within a cluster lets them learn common low-level abstract features. Some higher-level hidden layers are reserved for each terminal's personalized model, so that the personalized model can learn specific features at a higher level of abstraction.
In a traditional federated learning scheme, the local training rounds of all terminal devices are the same. In the Internet of things, however, terminals differ in data volume and computing capacity, so the time a device needs for one round of local training differs. With identical local rounds, the duration of one round of global training depends on the client with the longest local training time, and devices that finish local training quickly sit in a waiting state after uploading their locally updated parameters, wasting computing resources. To solve this, this embodiment adjusts each terminal's local training rounds based on the clustering result; see steps S401 to S403 for the specific adjustment.
In this embodiment, sharing of the bottom hidden layers is realized through soft parameter sharing, which does not require the bottom-hidden-layer parameters of different terminal models to be identical but encourages the parameters to be similar.
Referring to fig. 6, let $w_z$ be the parameters of the personalized model trained on Internet of things terminal device $z$ in the $n$-th round, with $w_{z,i}$ denoting the parameters of its $i$-th layer, and let $w_k^n$ be the parameters of the global model of cluster $k$ after the $n$-th round of global aggregation, with $w_{k,i}^n$ denoting its $i$-th layer. In the $n$-th round, device $z$ generates a local copy $\tilde{w}_z^n$ of the cluster global model. Let $s$ be the set of hidden layers participating in soft parameter sharing. The local training performed on $z$ is split into two parts: $task_g$ for global model training and $task_p$ for personalized model training.

The loss function when executing $task_g$ on $z$ is $F_{z,g}(\tilde{w}_z, D_z)$, whose specific form is determined by the task; taking a multi-classification terminal task as an example, $F_{z,g}$ is the cross-entropy loss. The gradient when executing $task_g$ on $z$ is $\nabla F_{z,g}(\tilde{w}_z, D_z)$, and the update of $\tilde{w}_z$ is:

$$\tilde{w}_z \leftarrow \tilde{w}_z - \varepsilon_z \nabla F_{z,g}(\tilde{w}_z, D_z)$$

where $\varepsilon_z$ is the step size for $task_g$. In each round of federated multi-task learning, $task_g$ may be executed locally multiple times; the resulting $\tilde{w}_z^n$ is sent to the macro base station for the current round of global-model aggregation.

The loss function when executing $task_p$ on $z$ is $F_{z,P}(w_z, D_z)$, and the update of $w_z$ is:

$$w_z \leftarrow w_z - \eta_z \nabla F_{z,P}(w_z, D_z)$$

where $\eta_z$ is the step size for $task_p$ and $\nabla F_{z,P}(w_z, D_z)$ is the gradient when executing $task_p$ on $z$. Likewise, $task_p$ may be executed multiple times per round. To further improve the efficiency of federated multi-task learning, and because the communication delay when uploading parameters is large, execution of $task_p$ may begin while the parameters are being uploaded. Here $\lambda_{z,i}$ is the parameter controlling the similarity between the personalized model and the global model; adjusting $\lambda_{z,i}$ controls the model's degree of personalization. A larger $\lambda_{z,i}$ forces the personalized model's parameters on the shared hidden-layer set $s$ closer to those of the global model, giving a higher degree of data sharing; a smaller $\lambda_{z,i}$ imposes a weaker constraint on the personalized model, giving a higher degree of personalization.
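As a worked illustration of the two update rules above, here is a sketch on flat parameter vectors; grad_g and grad_ce stand for gradients that in practice come from backpropagation, a single scalar lam stands in for the per-layer $\lambda_{z,i}$, and the gradient of the proximal term on the shared layers is written out explicitly:

```python
import numpy as np

def task_g_step(w_tilde, grad_g, eps_z):
    """Global-task update: w~ <- w~ - eps_z * grad F_{z,g}(w~, D_z)."""
    return w_tilde - eps_z * grad_g(w_tilde)

def task_p_step(w, w_tilde, grad_ce, lam, shared_mask, eta_z):
    """Personalized-task update: w <- w - eta_z * grad F_{z,P}(w, D_z).
    The proximal term (lam/2)||w_i - w~_i||^2 over the shared layers i in s
    contributes the gradient lam * (w - w~), applied only where shared_mask
    is 1 (a 0/1 vector marking shared-layer coordinates)."""
    grad_p = grad_ce(w) + lam * shared_mask * (w - w_tilde)
    return w - eta_z * grad_p
```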
After collecting the global-model update results of all terminals in cluster $k$ for the current round of federated multi-task learning, the macro base station aggregates them and updates the global model, according to:

$$w_k^{n+1} = \sum_{z \in C_k} p_z \, \tilde{w}_z^{n}$$

where $w_k^{n+1}$ are the parameters of the global model of the $k$-th cluster in round $n+1$, $p_z$ is the aggregation weight of Internet of things terminal device $z$, and $\tilde{w}_z^{n}$ are the parameters of the global update model of device $z$ in the $n$-th round.
The effects of the present invention will be described below with reference to specific examples:
In this example, experiments were performed on the mnist and cifar10 datasets, comparing the experimental results of the invention's PCFML algorithm (clustered federated multi-task learning algorithm) with the FedAvg algorithm (federated learning algorithm).
The mnist and cifar10 datasets both contain 10 classes and are processed similarly. The training set is divided into equal-length subsets according to the number of terminals; each subset has a "main class", the label with the largest share in the subset, which is taken to represent the class to which that terminal belongs. A parameter Ratio adjusts the proportion of the main class in the subset; the larger it is, the more extreme the data distribution. The test data is likewise partitioned into 10 subsets, each with a main class in proportion Ratio, and a terminal's training and test sets share the same main class.
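As an illustration of this partition, a minimal sketch follows; it assumes balanced classes, main classes assigned round-robin, and uniform top-up sampling, none of which are specified in the text:

```python
import numpy as np

def split_with_main_class(labels, num_terminals, ratio, seed=0):
    """Split a training set into equal-length subsets, each with a 'main
    class' making up a `ratio` fraction of the subset.

    labels: array of class labels for the full training set
    Returns a list of index arrays, one per terminal.
    """
    rng = np.random.default_rng(seed)
    n_classes = int(labels.max()) + 1
    per = len(labels) // num_terminals
    idx_by_class = {c: list(rng.permutation(np.where(labels == c)[0]))
                    for c in range(n_classes)}
    subsets = []
    for t in range(num_terminals):
        main = t % n_classes                 # assign main classes round-robin
        take = [idx_by_class[main].pop() for _ in range(int(ratio * per))]
        # Top up with samples drawn from whatever remains of the other classes.
        rest = [i for c in range(n_classes) if c != main for i in idx_by_class[c]]
        extra = rng.choice(rest, size=per - len(take), replace=False)
        take += list(extra)
        for i in extra:  # remove the drawn indices from their class pools
            idx_by_class[int(labels[i])].remove(i)
        subsets.append(np.array(take))
    return subsets
```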
The subsets of the dataset split are randomly assigned to the terminals as local data sets; each terminal trains on its local training set and is tested on the test set. LeNet is used as the training model, structured as 2 convolutional layers and 3 fully connected layers, with the Adam optimizer. The average accuracy of the terminals on their local test sets is the main evaluation metric.
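For reference, a sketch of a LeNet-style model matching the stated structure; the channel counts and kernel sizes are assumptions, since only the layer counts are given:

```python
import torch.nn as nn

class LeNet(nn.Module):
    """LeNet-style network: 2 convolutional layers and 3 fully connected layers."""
    def __init__(self, in_ch=1, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(120), nn.ReLU(),  # infers input size (28x28 or 32x32)
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```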
The experimental results on the mnist dataset are shown in figs. 7 and 8. With Ratio set to 0.7, the relationship between training rounds and the terminals' average accuracy on the test set is shown in fig. 7. As training rounds increase, the accuracy of both FedAvg and PCFML rises and then stabilizes; after convergence, PCFML's accuracy is about 3% higher than FedAvg's, and PCFML is also observed to converge faster.
With Ratio set to 0.9, the terminals' data distribution is more extreme; the relationship between training rounds and average accuracy is shown in fig. 8. FedAvg's accuracy drops markedly while PCFML's remains high: once both tend to converge, PCFML's accuracy is about 24% higher than FedAvg's, and it retains its faster convergence. With the more extreme distribution, FedAvg's average accuracy also fluctuates noticeably.
The experimental results on the cifar10 dataset are shown in figs. 9 and 10. With Ratio set to 0.7, the relationship between training rounds and average accuracy is shown in fig. 9. FedAvg's accuracy is lower because the LeNet model is structurally simple and image recognition on cifar10 is harder than on mnist. Overfitting appeared when training with PCFML and was addressed with early stopping; even so, PCFML's accuracy and convergence speed remain much better than FedAvg's.
With Ratio set to 0.5, the relationship between training rounds and average accuracy is shown in fig. 10. As the data distribution becomes less extreme, FedAvg's accuracy improves and the post-convergence gap between FedAvg and PCFML narrows; combined with the mnist experiments, this suggests PCFML has a greater advantage when the data distribution differences between terminals in different clusters are larger.
In sum, this embodiment provides a clustered federated multi-task learning method and device for the Internet of things that emphasize terminal-model personalization and alleviate the adverse effects of data heterogeneity. Terminals are clustered according to the data conditions of the Internet of things terminal devices, and federated multi-task learning is executed among highly similar terminals, eliminating the negative effect of weight divergence during aggregation between terminals whose data distributions differ too much. Each device's local training rounds are adjusted according to the clustering situation: devices with strong computing power or small data volume execute more local training, avoiding idle waiting by some terminals and raising computing-resource utilization. Some hidden-layer parameters of the different personalized models within a cluster are driven to be similar, allowing a device's personalized model to balance intra-cluster data sharing with personalization, with an adjustable degree of personalization. Meanwhile, the training of the global model and of the personalized model is decoupled, so personalized-model training can proceed concurrently with parameter uploading and aggregation, reducing wasted computing resources.
In summary, according to the Internet of things-oriented clustered federated multi-task learning method and device, clustering the Internet of things terminal equipment makes the data distributions within each cluster approximate, and a federated multi-task learning algorithm is executed within each cluster. Each Internet of things terminal device executes both a global training task and a personalized training task, so that data sharing is achieved within the cluster while the local data of each terminal device is fully used to train the personalized task, making efficient use of local data and improving the training effect.
Furthermore, during the local training of each Internet of things terminal device, the number of local training rounds is adjusted according to its computing power, so that the computing resources of every device are fully used and model training efficiency is improved.
Furthermore, the global model is used to impose a regularization constraint on the personalized training task, which effectively prevents overfitting, controls the degree of personalization, and improves model quality.
Corresponding to the above method, the invention also provides an Internet of things-oriented clustered federated multi-task learning device. The device comprises computer equipment including a processor and a memory; computer instructions are stored in the memory, the processor is used to execute the computer instructions stored in the memory, and when the computer instructions are executed by the processor, the device implements the steps of the above method.
Embodiments of the present invention further provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the foregoing Internet of things-oriented clustered federated multi-task learning method are implemented. The computer-readable storage medium may be a tangible storage medium such as random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a floppy disk, a hard disk, a removable storage disk, a CD-ROM, or any other form of storage medium known in the art.
Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein can be implemented as hardware, software, or a combination of the two. Whether they are implemented as hardware or software depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. When implemented in hardware, it may be, for example, an electronic circuit, an application-specific integrated circuit (ASIC), suitable firmware, a plug-in, a function card, and so on. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted over a transmission medium or communication link by a data signal carried in a carrier wave.
It is to be understood that the invention is not limited to the specific arrangements and instrumentalities described above and shown in the drawings. For brevity, detailed descriptions of known methods are omitted here. In the above embodiments, several specific steps are described and shown as examples; however, the method processes of the present invention are not limited to these specific steps, and those skilled in the art can make changes, modifications, and additions, or change the order of the steps, after comprehending the spirit of the present invention.
Features described and/or illustrated for one embodiment may be used in the same or a similar way in one or more other embodiments, and/or in combination with, or in place of, the features of other embodiments.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and changes to the embodiments. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in its protection scope.

Claims (10)

1. An Internet of things-oriented clustered federated multi-task learning method, characterized in that the method runs on Internet of things terminal equipment, the Internet of things terminal equipment is connected to a macro base station through a micro base station relay, and the Internet of things terminal equipment is divided into a preset number of clusters with the goal of minimizing data distribution differences; the method is used for executing federated multi-task training, and each training round comprises the following steps:
downloading, from the macro base station, the current-round global model of the cluster to which the current Internet of things terminal equipment belongs;
performing a global training task on the global model by using the local data of the current Internet of things terminal equipment, wherein the global training task uses a first loss function for parameter updating to obtain a global update model;
performing a personalized training task on the personalized model of the current Internet of things terminal equipment by using its local data, wherein the personalized training uses a second loss function for parameter updating to obtain an updated personalized model, and the second loss function, on the basis of the first loss function, combines parameters of the global model to impose a regularization constraint on the personalized training task;
and sending the global update model to the macro base station, where it is aggregated with the models obtained by the other Internet of things terminal equipment in the current cluster executing their global training tasks, so as to update the global model of the current cluster.
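For illustration only (not part of the claims), the following is a minimal runnable sketch of one such training round on a single device. It uses a linear model with a mean-squared-error loss as a stand-in for the first loss function and a single proximal term as the regularization constraint of the second loss function; the names (grad_mse, device_round, lam) are illustrative assumptions, not interfaces defined by this application.

    import numpy as np

    def grad_mse(w, X, y):
        # Gradient of the stand-in "first loss function" (mean squared error).
        return 2.0 * X.T @ (X @ w - y) / len(y)

    def device_round(w_global, w_personal, X, y, lr=0.01, lam=0.1):
        # Global training task: update the downloaded global model on local data.
        w_update = w_global - lr * grad_mse(w_global, X, y)
        # Personalized training task: the "second loss" adds a proximal term
        # (lam / 2) * ||w_personal - w_global||^2, whose gradient pulls the
        # personalized model toward the downloaded global model.
        g = grad_mse(w_personal, X, y) + lam * (w_personal - w_global)
        w_personal = w_personal - lr * g
        # w_update is uploaded for in-cluster aggregation; w_personal stays local.
        return w_update, w_personal

    # Toy usage: 20 local samples with 3 features.
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(20, 3)), rng.normal(size=20)
    w_up, w_p = device_round(np.zeros(3), np.zeros(3), X, y)

Setting lam to 0 makes the personalized task fully local, while a large lam ties it to the global model; this is the knob that adjusts the degree of personalization.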
2. The Internet of things-oriented clustered federated multi-task learning method according to claim 1, wherein the method clusters the Internet of things terminal equipment by a k-means algorithm, and dividing the Internet of things terminal equipment into a preset number of clusters with the goal of minimizing data distribution differences comprises the following steps:
randomly selecting the preset number of Internet of things terminal devices as centroids to establish the corresponding clusters;
and calculating, one by one, the distribution distance between the local data of each remaining Internet of things terminal device and the local data of each centroid, assigning the device to the cluster of the centroid with the closest distribution distance, and computing the mean of the local data distributions of the Internet of things terminal devices in each cluster to update its centroid, until all Internet of things terminal devices are classified.
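An illustrative sketch of this clustering step is given below, assuming each device is summarized by its normalized label histogram and that the "distribution distance" is Euclidean; both the summary and the distance measure are assumptions, since the application does not fix them here.

    import numpy as np

    def kmeans_distributions(dists, k, iters=10, seed=0):
        # dists: one row per device, each row a normalized label distribution.
        rng = np.random.default_rng(seed)
        centroids = dists[rng.choice(len(dists), size=k, replace=False)]
        for _ in range(iters):
            # Assign every device to the nearest centroid ...
            gaps = np.linalg.norm(dists[:, None, :] - centroids[None, :, :], axis=2)
            labels = gaps.argmin(axis=1)
            # ... then move each centroid to the mean distribution of its cluster.
            for j in range(k):
                if (labels == j).any():
                    centroids[j] = dists[labels == j].mean(axis=0)
        return labels, centroids

    # Toy usage: 8 devices, 10-class label histograms.
    rng = np.random.default_rng(1)
    h = rng.random((8, 10)); h /= h.sum(axis=1, keepdims=True)
    labels, _ = kmeans_distributions(h, k=2)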
3. The Internet of things-oriented clustered federated multi-task learning method according to claim 1, wherein selecting the initial cluster centers when dividing the Internet of things terminal equipment into the preset number of clusters with the goal of minimizing data distribution differences comprises:
randomly selecting one Internet of things terminal device as a cluster center;
calculating, one by one, the distribution distance between the local data of each remaining Internet of things terminal device and that of each existing cluster center, and selecting the Internet of things terminal device with the largest distribution distance as a new cluster center;
and iterating until the preset number of cluster centers have been selected.
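A short sketch of this farthest-point selection follows; it interprets "largest distribution distance" as the largest distance to the nearest already-chosen center (a k-means++-style rule), which is an assumption where the claim wording is ambiguous.

    import numpy as np

    def init_centers(dists, k, seed=0):
        # dists: one row per device, each row a normalized label distribution.
        rng = np.random.default_rng(seed)
        centers = [int(rng.integers(len(dists)))]
        while len(centers) < k:
            # Distance from every device to its nearest chosen center;
            # already-chosen devices get distance 0 and are never re-picked.
            gaps = np.min(np.linalg.norm(
                dists[:, None, :] - dists[centers][None, :, :], axis=2), axis=1)
            centers.append(int(gaps.argmax()))
        return centers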
4. The Internet of things-oriented clustered federated multi-task learning method according to claim 3, wherein the method further comprises determining the preset number by using an elbow method.
5. The Internet of things-oriented clustered federated multi-task learning method according to claim 1, wherein the method further adjusts the local training rounds according to the computing power of each Internet of things terminal device in the cluster, comprising the following steps:
calculating the time delay required by each Internet of things terminal device to execute one round of local training, with the calculation formula:

$$l_{k,z} = \frac{D_z}{f_z}$$

wherein $l_{k,z}$ denotes the time delay needed by Internet of things terminal device $z$ in the $k$-th cluster to execute one round of local training, $D_z$ denotes the amount of local data of Internet of things terminal device $z$, and $f_z$ denotes the computing power of Internet of things terminal device $z$;

taking, as the longest delay of each cluster, the delay of the Internet of things terminal device that needs the longest time to execute one round of local training, expressed as:

$$l_{k,\max} = \max_{z}\, l_{k,z}$$

wherein $l_{k,\max}$ denotes the longest delay required for one round of local training among all Internet of things terminal devices in the $k$-th cluster;

adjusting the number of local training rounds executed by each Internet of things terminal device in each cluster according to the ratio between the delay it needs for one round of local training and the longest delay of its cluster, with the calculation formula:

$$e_z = \left\lfloor \frac{l_{k,\max}}{l_{k,z}} \cdot e_{k,\max} \right\rfloor$$

wherein $e_z$ denotes the number of local training rounds of Internet of things terminal device $z$, and $e_{k,\max}$ denotes the number of local training rounds of the Internet of things terminal device in the $k$-th cluster that needs the longest delay to execute one round of local training.
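A minimal sketch of this round adjustment follows, assuming the formulas reconstructed above (delay = data amount / computing power; rounds scaled by the delay ratio and floored); the original formula images are unavailable, so the exact form is an inference from the surrounding definitions.

    import math

    def local_rounds(data_sizes, compute_powers, e_max=1):
        # Per-device delay for one round of local training: l = D / f.
        delays = [d / f for d, f in zip(data_sizes, compute_powers)]
        l_max = max(delays)
        # Faster devices (or those with less data) run proportionally more
        # local rounds, so no device idles while waiting for the slowest one.
        return [math.floor(l_max / l * e_max) for l in delays]

    # Toy usage: three devices in one cluster.
    print(local_rounds(data_sizes=[1000, 500, 2000],
                       compute_powers=[1.0, 1.0, 2.0]))   # -> [1, 2, 1]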
6. The Internet of things-oriented clustered federated multi-task learning method according to claim 1, wherein the second loss function, on the basis of the first loss function, combines parameters of the global model to impose a regularization constraint on the personalized training task, and the expression of the second loss function is:
$$F_{z,P}(w_z) = F_{z,g}(w_z;\, D_z^{\,n}) + \sum_{i} \frac{\lambda_{z,i}}{2} \left\lVert w_{z,i} - \bar{w}_{k,i}^{\,n} \right\rVert^2$$

wherein $F_{z,P}(\cdot)$ denotes the second loss function, $F_{z,g}(\cdot)$ denotes the first loss function, $w_z$ denotes the parameters of the personalized model of Internet of things terminal device $z$ in the $n$-th round, $D_z^{\,n}$ denotes the local data of Internet of things terminal device $z$ in the $n$-th round, $w_{z,i}$ denotes the layer-$i$ parameters of $w_z$, $\bar{w}_{k,i}^{\,n}$ denotes the layer-$i$ parameters of the global model adopted by Internet of things terminal device $z$ in the $n$-th round, and $\lambda_{z,i}$ denotes a control parameter.
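The layer-wise form of the regularizer is what lets different layers be personalized to different degrees. A sketch follows, assuming the expression reconstructed above and using a precomputed scalar as a stand-in for the first loss value:

    import numpy as np

    def second_loss(first_loss_value, w_layers, w_global_layers, lambdas):
        # Layer-wise proximal regularization: (lambda_i / 2) * ||w_i - wg_i||^2.
        reg = sum(lam / 2.0 * np.sum((w - wg) ** 2)
                  for w, wg, lam in zip(w_layers, w_global_layers, lambdas))
        return first_loss_value + reg

    # Toy usage: two "layers"; lambda = 0 leaves a layer fully personalized.
    w  = [np.ones(4), np.ones(3)]
    wg = [np.zeros(4), np.zeros(3)]
    print(second_loss(0.5, w, wg, lambdas=[0.1, 0.0]))   # -> 0.7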
7. The Internet of things-oriented clustered federated multi-task learning method according to claim 6, wherein the global update model is sent to the macro base station and aggregated with the models obtained by the other Internet of things terminal devices in the current cluster executing their global training tasks, so as to update the global model of the current cluster, with the calculation formula:
$$w_k^{\,n+1} = \sum_{z} p_z\, \hat{w}_z^{\,n}$$

wherein $w_k^{\,n+1}$ denotes the parameters of the global model of the $k$-th cluster in round $n+1$, $p_z$ denotes the weight of Internet of things terminal device $z$ during aggregation, and $\hat{w}_z^{\,n}$ denotes the parameters of the global update model of Internet of things terminal device $z$ in the $n$-th round.
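A one-function sketch of this in-cluster aggregation follows; weighting each device by its share of the cluster's data (FedAvg-style) is a common choice and an assumption here, since the claim only requires weights $p_z$.

    import numpy as np

    def aggregate(updates, data_sizes):
        # Aggregation weights p_z, proportional to local data size, summing to 1.
        p = np.asarray(data_sizes, dtype=float)
        p /= p.sum()
        return sum(w * pz for w, pz in zip(updates, p))

    # Toy usage: three devices' global update models (as parameter vectors).
    ups = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
    print(aggregate(ups, data_sizes=[100, 100, 200]))   # -> [0.75 0.75]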
8. The Internet of things-oriented clustered federated multi-task learning method according to claim 7, wherein the method further comprises:
stopping the federated multi-task training when a preset convergence condition is reached, the preset convergence condition being that a preset number of training rounds has been reached.
9. An Internet of things-oriented clustered federated multi-task learning device, comprising a processor and a memory, wherein computer instructions are stored in the memory and the processor is configured to execute the computer instructions stored in the memory; when the computer instructions are executed by the processor, the device implements the steps of the method according to any one of claims 1 to 8.
10. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN202210751781.2A 2022-06-29 2022-06-29 Internet of things-oriented clustered federal multi-task learning method and device Pending CN115293358A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210751781.2A CN115293358A (en) 2022-06-29 2022-06-29 Internet of things-oriented clustered federal multi-task learning method and device

Publications (1)

Publication Number Publication Date
CN115293358A true CN115293358A (en) 2022-11-04

Family

ID=83820920

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210751781.2A Pending CN115293358A (en) 2022-06-29 2022-06-29 Internet of things-oriented clustered federal multi-task learning method and device

Country Status (1)

Country Link
CN (1) CN115293358A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024103291A1 (en) * 2022-11-16 2024-05-23 华为技术有限公司 Federated learning method and communication apparatus
CN116991587B (en) * 2023-08-14 2024-04-12 北京百度网讯科技有限公司 Equipment scheduling method and device in federal learning
CN116991587A (en) * 2023-08-14 2023-11-03 北京百度网讯科技有限公司 Equipment scheduling method and device in federal learning
CN117313901A (en) * 2023-11-28 2023-12-29 北京邮电大学 Model training method and device based on multitask clustering federal personalized learning
CN117313901B (en) * 2023-11-28 2024-04-02 北京邮电大学 Model training method and device based on multitask clustering federal personalized learning
CN117724853B (en) * 2024-02-08 2024-05-07 亚信科技(中国)有限公司 Data processing method and device based on artificial intelligence
CN117724853A (en) * 2024-02-08 2024-03-19 亚信科技(中国)有限公司 Data processing method and device based on artificial intelligence
CN117808127A (en) * 2024-02-29 2024-04-02 浪潮电子信息产业股份有限公司 Image processing method, federal learning method and device under heterogeneous data condition
CN117808126A (en) * 2024-02-29 2024-04-02 浪潮电子信息产业股份有限公司 Machine learning method, device, equipment, federal learning system and storage medium
CN117808128A (en) * 2024-02-29 2024-04-02 浪潮电子信息产业股份有限公司 Image processing method, federal learning method and device under heterogeneous data condition
CN117808127B (en) * 2024-02-29 2024-05-28 浪潮电子信息产业股份有限公司 Image processing method, federal learning method and device under heterogeneous data condition
CN117808126B (en) * 2024-02-29 2024-05-28 浪潮电子信息产业股份有限公司 Machine learning method, device, equipment, federal learning system and storage medium
CN117808128B (en) * 2024-02-29 2024-05-28 浪潮电子信息产业股份有限公司 Image processing method and device under heterogeneous data condition
CN117876156A (en) * 2024-03-11 2024-04-12 国网江西省电力有限公司南昌供电分公司 Multi-task-based electric power Internet of things terminal monitoring method, electric power Internet of things terminal and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination