CN116596065A - Gradient calculation method and device, storage medium, product and electronic equipment - Google Patents


Info

Publication number: CN116596065A
Application number: CN202310848180.8A
Authority: CN (China)
Legal status: Granted
Other languages: Chinese (zh)
Other versions: CN116596065B
Inventor
朱凯旋
孙仁恩
魏鹏
张冠男
Current Assignee: Alipay Hangzhou Information Technology Co Ltd
Original Assignee: Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202310848180.8A
Publication of CN116596065A; application granted; publication of CN116596065B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N 3/098: Distributed learning, e.g. federated learning (under G06N 3/00 computing arrangements based on biological models; G06N 3/02 neural networks; G06N 3/08 learning methods)
    • G06F 17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis (under G06F 17/00 digital computing or data processing equipment or methods, specially adapted for specific functions; G06F 17/10 complex mathematical operations)
    • G06F 18/10: Pattern recognition; pre-processing; data cleansing
    • G06F 18/24: Pattern recognition; analysing; classification techniques
    • G06F 9/5072: Grid computing (under G06F 9/50 allocation of resources, e.g. of the central processing unit [CPU]; G06F 9/5061 partitioning or combining of resources)


Abstract

The application discloses a gradient calculation method, a gradient calculation device, a storage medium, a product and electronic equipment. The gradient calculation method comprises the following steps: an edge side receives a preset number of gradient data items sent by clients, the gradient data having been obtained by the clients during training of a model; the edge side performs average calculation processing on the preset number of gradient data items to obtain average gradient data and sends the average gradient data to a cloud; the cloud receives the average gradient data sent by the edge side and sends it to the clients, instructing the clients to perform parameter adjustment on the model.

Description

Gradient calculation method and device, storage medium, product and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a gradient computing method, a gradient computing device, a storage medium, a product, and an electronic device.
Background
A client often hosts a neural network model that helps it perform data processing operations such as information processing and mathematical calculation, and such a model typically must be trained by the client before it has that data processing capability. In the prior art, a central node device is commonly used to help multiple clients cooperatively train a neural network model: a single central node computes over the gradient data uploaded by all clients, so the computational pressure on that node is high, and the repeated data transmission across training rounds easily congests bandwidth and network communication. A cooperative model training method with high data transmission speed and reduced computational pressure is therefore needed.
Disclosure of Invention
The embodiment of the application provides a gradient calculation method, a gradient calculation device, a storage medium, a product and electronic equipment. Because the edge side performs the heavy, large-scale data calculation, the cloud only needs to forward the resulting average gradient data to the clients; the model is trained through the three-way cooperation of the client, the edge side and the cloud, which improves gradient data transmission efficiency and computational efficiency and reduces data transmission pressure. The technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a gradient calculating method, applied to an edge side, where the method includes:
receiving a preset number of gradient data items sent by a client, wherein the gradient data are acquired by the client during training of a model;
performing average calculation processing on the preset number of gradient data items to obtain average gradient data;
and sending the average gradient data to a cloud, wherein the cloud is configured to send the average gradient data to the client and instruct the client to perform parameter adjustment on the model.
In a second aspect, an embodiment of the present application provides a gradient calculation method, applied to a cloud, where the method includes:
receiving average gradient data sent by an edge side, wherein the average gradient data is calculated by the edge side based on gradient data acquired by clients during training of a model;
and sending the average gradient data to the client, instructing the client to perform parameter adjustment on the model.
In a third aspect, an embodiment of the present application provides an edge side apparatus, including:
a gradient data receiving module, configured to receive a preset number of gradient data items sent by a client, wherein the gradient data are acquired by the client during training of a model;
an average calculation processing module, configured to perform average calculation processing on the preset number of gradient data items to obtain average gradient data;
an average data sending module, configured to send the average gradient data to a cloud, wherein the cloud is configured to send the average gradient data to the client and instruct the client to perform parameter adjustment on the model.
In a fourth aspect, an embodiment of the present application provides a cloud device, where the cloud device includes:
an average data receiving module, configured to receive average gradient data sent by an edge side, wherein the average gradient data is calculated by the edge side based on gradient data acquired by clients during training of a model;
and a gradient data sending module, configured to send the average gradient data to the client and instruct the client to perform parameter adjustment on the model.
In a fifth aspect, embodiments of the present application provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-described method steps.
In a sixth aspect, embodiments of the present application provide a computer program product storing a plurality of instructions adapted to be loaded by a processor and to perform the above-described method steps.
In a seventh aspect, an embodiment of the present application provides an electronic device, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
In one or more embodiments of the present application, an edge side receives a preset number of gradient data items sent by clients, the gradient data having been acquired by the clients during training of a model; the edge side performs average calculation processing on the preset number of gradient data items to obtain average gradient data and sends the average gradient data to a cloud; the cloud receives the average gradient data sent by the edge side and sends it to the clients, instructing the clients to perform parameter adjustment on the model. Because the edge side performs the heavy, large-scale data calculation, the cloud only needs to forward the average gradient data to the clients; training the model through the cooperation of the client, the edge side and the cloud improves gradient data transmission efficiency and computational efficiency and reduces data transmission pressure.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is an exemplary schematic diagram of an average gradient data calculation provided by an embodiment of the present application;
FIG. 2 is a schematic flow chart of a gradient calculation method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a gradient calculation method according to an embodiment of the present application;
FIG. 4 is an exemplary schematic diagram of gradient data reception provided by an embodiment of the present application;
FIG. 5 is a schematic flow chart of a gradient calculation method according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of a gradient calculation method according to an embodiment of the present application;
FIG. 7 is a schematic flow chart of a gradient calculation method according to an embodiment of the present application;
FIG. 8 is a schematic flow chart of a gradient calculation method according to an embodiment of the present application;
FIG. 9 is a schematic diagram of an edge side device according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a cloud device according to an embodiment of the present application;
FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Edge computing is a computing model that places computing, storage, and network functions close to the data source, at the edge of the network, to achieve real-time data processing, reduce data transmission delays, and improve system reliability. The edge side device may be, for example, an edge-side Kubernetes (K8s) cluster, or a module in the edge side for implementing the gradient calculation method; the cloud device may be a cloud server or a module in the cloud server for implementing the gradient calculation method. Gradient computation is a technique in machine learning for optimizing model parameters and thereby improving the predictive performance of the model. The basic idea of gradient computation is to use samples in the dataset to compute the error between the model's predicted value and the true value, and to adjust the model parameters according to the magnitude of that error so as to minimize it. Gradient data refers to the rate of change, or slope, of a function at a point, indicating the direction in which the function's value rises most steeply.
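As a minimal illustration of the gradient computation just described, the following sketch (hypothetical code, not taken from the patent) computes the gradient of a squared-error loss for one sample and adjusts the parameters against that gradient:

```python
import numpy as np

def mse_loss_gradient(w, x, y):
    """Gradient of the squared error 0.5 * (w @ x - y)**2 with respect to w."""
    error = w @ x - y  # difference between the model's prediction and the true value
    return error * x   # d(loss)/dw for a linear model

w = np.array([0.5, -1.0])   # model parameters
x = np.array([2.0, 3.0])    # one training sample
y = 4.0                     # its true value
grad = mse_loss_gradient(w, x, y)  # the "gradient data" for this step
w = w - 0.1 * grad                 # adjust parameters against the gradient
```

Repeating this step over many samples drives the error down, which is the iterative training the clients perform.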
The client can acquire gradient data during the training of the model. It can be understood that model training requires many iterations of training and calculation; if only one client were used to train the model, the demand on that client's computing power would be excessive and the many iterations would take too long. Moreover, a model is often deployed not on a single client but on many clients, so multiple clients can be used to train the model and acquire gradient data during training. The edge side device and the cloud device can obtain average gradient data by aggregating the gradient data acquired by the multiple clients during model training, and the clients can then train the model using the average gradient data. This speeds up model training and, because it combines the training situation of all clients, makes the trained model more accurate and suited to more usage scenarios.
Referring to fig. 1 together, an exemplary schematic diagram of average gradient data calculation is provided for an embodiment of the present application. The edge side device may include a data gateway process and a gradient calculation process, where the data gateway process is used for data transmission and data interaction with the client and the cloud device. The client may acquire gradient data during training of the model and then send gradient data information to the data gateway process. The gradient data information includes the gradient data acquired by the client during training and may also include information such as a client identifier and a model version. The client identifier may be the client's internet protocol (Internet Protocol, IP) address, the client's device model, its international mobile equipment identity (International Mobile Equipment Identity, IMEI), and the like. Because parameters are updated during training, the model generates different model versions, and the training progress of different clients may differ, so the model versions held by different clients may also differ; the gradient data information sent by a client may therefore also include the model version to which the gradient data corresponds.
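A minimal sketch of what such a gradient data information message might look like (all field names are illustrative assumptions; the patent only specifies that the message carries the gradient data and may carry a client identifier and a model version):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GradientDataInfo:
    """One gradient upload from a client to the edge-side data gateway process.

    Field names are hypothetical; the source only states the message holds the
    gradient data, a client identifier (e.g. IP address, device model, or IMEI),
    and the model version that produced the gradient."""
    gradient: List[float]   # gradient data from one training step
    client_id: str          # e.g. the client's IP address or IMEI
    model_version: int      # version of the model that produced the gradient

msg = GradientDataInfo(gradient=[0.12, -0.03], client_id="192.0.2.7", model_version=3)
```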
The data gateway process may send the acquired gradient data to the gradient calculation process in batches: for example, once the data gateway process has acquired a preset number of gradient data items, it may send that preset number of items to the gradient calculation process. The preset number may be an initial setting of the edge side device or may be set by a user or related staff; for example, it may be 100. The gradient calculation process is used to perform average calculation processing on the preset number of gradient data items to obtain average gradient data; for example, the gradient calculation process may use NumPy (Numerical Python) for the average calculation, and the average calculation may be federated averaging (Federated Averaging Algorithm, FedAvg). The gradient calculation process then sends the average gradient data to the data gateway process, which sends it on to the cloud device.
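A minimal sketch of the averaging step the gradient calculation process performs, using NumPy as the text suggests (shown here in its simplest, unweighted form; full FedAvg additionally weights each client's contribution by its local sample count):

```python
import numpy as np

def average_gradients(gradients):
    """Element-wise mean of a batch of client gradients.

    This is unweighted averaging; the FedAvg algorithm proper weights each
    client's gradient by the number of local samples it trained on."""
    return np.mean(np.stack(gradients), axis=0)

batch = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
avg = average_gradients(batch)  # -> array([3., 4.])
```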
The data gateway process may send average gradient data information to the cloud. The average gradient data information includes the average gradient data, the client identifiers, the model version, and the like, where the client identifiers and model version are used to instruct the cloud device to send the average gradient data to the clients corresponding to those identifiers; the average gradient data information may further include edge-side content delivery network (Content Delivery Network, CDN) information. After the cloud device receives the average gradient data information sent by the data gateway process, it may send the average gradient data to the clients corresponding to the client identifiers. A client may perform parameter adjustment on the model based on the average gradient data and then repeat the process of sending gradient data information to the data gateway process and receiving average gradient data from the cloud device for parameter adjustment, until the model converges and model training is completed.
The gradient calculation method provided by the application is described in detail below with reference to specific examples.
Referring to fig. 2, a flow chart of a gradient calculating method is provided in an embodiment of the application. As shown in fig. 2, the embodiment of the present application describes a gradient calculation method with respect to the edge side and the cloud, and the method may include the following steps S102 to S108.
S102, the edge side receives a preset number of gradient data items sent by the client.
Specifically, the client can acquire gradient data during training of the model and then send the gradient data to the edge side. The edge side receives the gradient data sent by the client, and each time a preset number of gradient data items has been received, one round of average calculation processing can be performed on them. For example, the data gateway process on the edge side may store the gradient data in a gradient data queue until a preset number of items is stored in the queue, then send the preset number of items to the gradient calculation process on the edge side for average calculation processing.
It can be appreciated that the client may send gradient data information to the edge side, where the gradient data information may include, in addition to the gradient data, information such as a client identifier and a model version; the client identifier may be the client's IP address, device model, IMEI, and the like.
S104, the edge side performs average calculation processing on the preset number of gradient data items to obtain average gradient data.
Specifically, the edge side may perform average calculation processing on the preset number of gradient data items to obtain average gradient data. The average gradient data may be the arithmetic mean of the preset number of gradient data items, or may be obtained by the edge side performing federated averaging on them. For example, the gradient calculation process on the edge side may use NumPy to perform federated averaging on the preset number of gradient data items.
And S106, the edge side sends the average gradient data to the cloud.
Specifically, the edge side may send the average gradient data to the cloud. It can be understood that the data gateway process is responsible for data interaction and data transmission with the client and the cloud device, so after the gradient calculation process computes the average gradient data, it may send the average gradient data to the data gateway process, and the data gateway process sends the average gradient data to the cloud.
It may be appreciated that the edge side may send average gradient data information to the cloud, where the average gradient data information may include, in addition to the average gradient data, a client identifier, a model version, and the like, which are used to instruct the cloud to send the average gradient data to the client corresponding to the client identifier.
S108, the cloud end sends the average gradient data to the client end, and the client end is instructed to conduct parameter adjustment on the model.
Specifically, after the cloud receives the average gradient data sent by the edge side, it can send the average gradient data to the client, thereby instructing the client to perform parameter adjustment on the model. That is, after the client receives the average gradient data sent by the cloud, it can adjust the model's parameters using the average gradient data. These steps are repeated until the model converges, at which point model training is completed.
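A minimal sketch of the client-side parameter adjustment (the patent does not fix a particular update rule; plain gradient descent with an assumed learning rate is used here purely for illustration):

```python
import numpy as np

def apply_average_gradient(params, avg_grad, lr=0.01):
    """One parameter-adjustment step using the average gradient received
    from the cloud. Plain gradient descent; the learning rate and update
    rule are assumptions, not specified by the source."""
    return params - lr * avg_grad

params = np.array([1.0, -2.0])
avg_grad = np.array([10.0, -10.0])
params = apply_average_gradient(params, avg_grad)  # -> array([ 0.9, -1.9])
```

The client would repeat this step with each new average gradient until the model converges.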
In the embodiment of the application, the edge side receives a preset number of gradient data items sent by the client, the gradient data having been acquired by the client during training of a model; average calculation processing is performed on the preset number of gradient data items to obtain average gradient data, which is sent to the cloud; the cloud receives the average gradient data sent by the edge side and sends it to the client, instructing the client to perform parameter adjustment on the model. Because the edge side performs the heavy, large-scale data calculation, the cloud only needs to forward the average gradient data to the client; training the model through the cooperation of the client, the edge side and the cloud improves gradient data transmission efficiency and computational efficiency and reduces data transmission pressure.
Referring to fig. 3, a flow chart of a gradient calculating method is provided in an embodiment of the application. As shown in fig. 3, the embodiment of the present application describes a gradient calculation method with respect to the edge side and the cloud, and the method may include the following steps S202 to S216.
S202, the edge side receives gradient data sent by the client and stores the gradient data in a gradient data queue.
Specifically, the client can acquire gradient data during training of the model and then send the gradient data to the edge side. The edge side receives the gradient data sent by the client, and each time a preset number of gradient data items has been received, one round of average calculation processing can be performed on them. It can be appreciated that the client may send gradient data information to the edge side, where the gradient data information may include, in addition to the gradient data, information such as a client identifier and a model version; the client identifier may be the client's IP address, device model, IMEI, and the like.
The edge side can receive the gradient data sent by the client and store it in a gradient data queue. It can be understood that the edge side may include a data gateway process and a gradient calculation process: the data gateway process handles data interaction and data transmission with the client and the cloud, and the gradient calculation process performs average calculation processing on the gradient data. Therefore, when the data gateway process receives gradient data, it does not forward the data directly to the gradient calculation process for calculation; instead, it stores the items in the gradient data queue in the order they were received. By scheduling the received gradient data in batches, the data gateway process makes gradient calculation more orderly and more efficient.
S204, if the number of gradient data items in the gradient data queue reaches the preset number, the edge side acquires the preset number of items from the gradient data queue.
Specifically, if the number of gradient data items in the gradient data queue reaches the preset number, the edge side may take the preset number of items from the queue and delete them from it. It can be understood that the edge side stores gradient data in the queue in the order it was received, so when the queue holds the preset number of items, the edge side can take the preset number of items from the head of the queue. The preset number may be an initial setting of the edge side device or may be set by a user or related staff; for example, it may be 100.
Optionally, if the number of gradient data items in the gradient data queue reaches the preset number, the data gateway process on the edge side may take the preset number of items from the queue and send them to the gradient calculation process. Referring to fig. 4, an exemplary schematic diagram of gradient data reception is provided for an embodiment of the present application. A client may send gradient data to the data gateway process on the edge side, and the data gateway process stores the gradient data in the gradient queue in the order it was received, for example gradient 1, gradient 2, gradient 3, ..., gradient 100. If the preset number is 100, then after gradient 100 is received and stored in the queue, the queue holds 100 gradient data items and the preset number is satisfied; the data gateway process may then take the preset number of items from the front of the queue and send them to the gradient calculation process, that is, gradient 1, gradient 2, gradient 3, ..., gradient 100 in the queue are sent to the gradient calculation process.
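The queue-and-batch behaviour of the data gateway process described above can be sketched as follows (class and function names, and the choice of a preset number of 100, are illustrative assumptions):

```python
from collections import deque

PRESET_NUMBER = 100  # the source gives 100 only as an example value

class DataGateway:
    """Stores incoming gradients in arrival order and dispatches them to
    the gradient calculation process in batches of PRESET_NUMBER items."""
    def __init__(self, compute_fn):
        self.queue = deque()          # gradient data queue, FIFO order
        self.compute_fn = compute_fn  # stands in for the gradient calculation process

    def receive(self, gradient):
        self.queue.append(gradient)
        if len(self.queue) >= PRESET_NUMBER:
            # take the oldest PRESET_NUMBER items from the head of the queue
            batch = [self.queue.popleft() for _ in range(PRESET_NUMBER)]
            self.compute_fn(batch)
```

For example, after 250 gradients arrive, two batches of 100 have been dispatched and 50 items remain queued.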
S206, the edge side performs grouping and filtering processing on the preset number of gradient data items to obtain filtered gradient data.
Specifically, because parameters are updated during training, different model versions are generated, and the training progress of different clients may differ, so the model version corresponding to each of the preset number of gradient data items may differ. Gradient data from old model versions has little reference value, so the edge side can perform grouping and filtering processing on the preset number of items, filtering out the gradient data belonging to old model versions and obtaining the filtered gradient data, that is, the gradient data corresponding to the latest model version among the preset number of items.
Optionally, the edge side may group the preset number of gradient data items by their corresponding model versions to obtain at least one gradient data set, where different gradient data sets correspond to different model versions, that is, gradient data of the same model version is stored in the same set. The edge side then takes the filtered gradient data set corresponding to the latest model version and obtains the filtered gradient data in it. It can be understood that the filtered gradient data is the gradient data in the filtered gradient data set, and every item in that set corresponds to the latest model version.
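The grouping-and-filtering step can be sketched as follows (the `(model_version, gradient)` pair representation is an illustrative assumption, and higher version numbers are assumed to be newer):

```python
from collections import defaultdict

def filter_latest_version(gradient_infos):
    """Group gradients by model version and keep only those belonging to
    the newest version present in the batch; stale gradients are dropped.

    gradient_infos: iterable of (model_version, gradient) pairs."""
    groups = defaultdict(list)
    for version, gradient in gradient_infos:
        groups[version].append(gradient)
    latest = max(groups)   # newest model version seen in this batch
    return groups[latest]  # the filtered gradient data set

batch = [(2, [0.1]), (3, [0.2]), (3, [0.3]), (1, [0.9])]
filtered = filter_latest_version(batch)  # -> [[0.2], [0.3]]
```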
Optionally, after the gradient calculation process receives the preset number of gradient data items sent by the data gateway process, it may perform grouping and filtering processing on them to obtain the filtered gradient data.
And S208, the edge side performs average calculation processing on the filtered gradient data to obtain average gradient data.
Specifically, the edge side may perform average calculation processing on the filtered gradient data to obtain average gradient data. The average gradient data may be the arithmetic mean of the filtered gradient data, or may be obtained by the edge side performing federated averaging on the filtered gradient data. For example, the gradient calculation process on the edge side may perform federated averaging on the filtered gradient data using NumPy.
Optionally, after the gradient calculation process obtains the filtered gradient data, it may perform federated averaging on the filtered gradient data to obtain the average gradient data.
S210, the edge side sends the average gradient data to the cloud.
Specifically, the edge side may send the average gradient data to the cloud. It can be understood that the data gateway process is responsible for data interaction and data transmission with the client and the cloud device, so after the gradient calculation process computes the average gradient data, it may send the average gradient data to the data gateway process, and the data gateway process sends the average gradient data to the cloud.
It may be appreciated that the edge side may send average gradient data information to the cloud, where the average gradient data information may include, in addition to the average gradient data, a client identifier, a model version, and the like, which are used to instruct the cloud to send the average gradient data to the client corresponding to the client identifier.
S212, the cloud end determines the secondary average number of pieces of average gradient data.
Specifically, if the cloud end sent the average gradient data to all clients whenever it received average gradient data from the edge side, the data communication pressure on the cloud end would easily become high and network bandwidth consumption would be large. Therefore, the cloud end may wait until the average gradient data received from the edge side reaches the secondary average number before performing data transmission with the clients. The secondary average number may be an initial setting of the cloud end, or may be set by users or related staff; for example, the secondary average number may be 100.
Optionally, after the cloud end receives the average gradient data sent by the edge side, it may store the average gradient data in a gradient data storage table. It can be understood that the cloud end may store the average gradient data in the gradient data storage table sequentially, in the time order in which the average gradient data is received, and if the number of pieces of average gradient data in the gradient data storage table meets the secondary average number, determine the secondary average number of pieces of average gradient data in the gradient data storage table.
S214, the cloud end performs average calculation processing on the secondary average number of pieces of average gradient data to obtain secondary average gradient data.
Specifically, the cloud end may perform average calculation processing on the secondary average number of pieces of average gradient data to obtain the secondary average gradient data, where the secondary average gradient data may be the mean of the secondary average number of pieces of average gradient data.
Optionally, the gradient data storage table may be used to record the average gradients used in the model training process, so its entries may be retained rather than deleted and kept in the cloud end for relevant staff to check and to monitor the model training process. Accordingly, the cloud end may mark the average gradient data that has already undergone average calculation processing: if it detects that the number of unmarked pieces of average gradient data in the gradient data storage table meets the secondary average number, it determines the secondary average number of unmarked pieces of average gradient data in the table, performs average calculation processing on them to obtain the secondary average gradient data, and then marks those pieces of average gradient data.
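The storage-and-marking scheme described in this step could be sketched as below; the in-memory list standing in for the storage table and the threshold of 3 are illustrative (the embodiment's example threshold is 100).

```python
import numpy as np

SECONDARY_AVERAGE_NUMBER = 3  # illustrative; the embodiment's example value is 100

storage_table = []  # each row: {"gradient": array, "marked": bool}; rows are never deleted

def on_average_gradient_received(avg_gradient):
    """Store an incoming edge-side average; once the unmarked rows reach
    the secondary average number, average them, mark them, and return
    the secondary average gradient. Returns None until then."""
    storage_table.append({"gradient": np.asarray(avg_gradient, dtype=float),
                          "marked": False})
    unmarked = [row for row in storage_table if not row["marked"]]
    if len(unmarked) < SECONDARY_AVERAGE_NUMBER:
        return None
    batch = unmarked[:SECONDARY_AVERAGE_NUMBER]
    secondary_avg = np.mean(np.stack([row["gradient"] for row in batch]), axis=0)
    for row in batch:
        row["marked"] = True  # keep the row for staff to inspect, but never reuse it
    return secondary_avg
```

Feeding three averages in, the first two calls return None and the third returns their mean, after which all three rows remain in the table marked as used.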
S216, the cloud end sends the secondary average gradient data to the client and instructs the client to perform parameter adjustment on the model.
Specifically, after the cloud end obtains the secondary average gradient data, it may send the secondary average gradient data to the client, so as to instruct the client to perform parameter adjustment on the model. That is, after the client receives the secondary average gradient data sent by the cloud end, it may use the secondary average gradient data to adjust the parameters of the model, and the above steps are repeated until the model converges, so that model training is completed.
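On the client side, the parameter adjustment could be as simple as one gradient-descent step; the learning rate and the plain-SGD update rule below are assumptions, since the embodiment does not fix the optimizer.

```python
import numpy as np

LEARNING_RATE = 0.1  # illustrative hyperparameter

def adjust_parameters(params, secondary_avg_gradient):
    """Apply one vanilla gradient-descent update using the secondary
    average gradient received from the cloud end."""
    return params - LEARNING_RATE * np.asarray(secondary_avg_gradient)

params = np.array([1.0, 2.0])
params = adjust_parameters(params, [0.5, -1.0])  # -> array([0.95, 2.1])
```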
In the embodiment of the application, the edge side receives the gradient data sent by the client and stores the gradient data in the gradient data queue. If the number of pieces of gradient data in the gradient data queue meets the preset number, the edge side acquires the preset number of pieces of gradient data from the gradient data queue and performs grouping and filtering processing on them to obtain the filtered gradient data, filtering out gradient data of old model versions that has little reference significance; this improves the accuracy of the average gradient data and reduces the calculation pressure on the edge side. The edge side performs federated average calculation processing on the filtered gradient data to obtain average gradient data, and sends the average gradient data to the cloud end. The cloud end determines the secondary average number of pieces of average gradient data and performs average calculation processing on them to obtain the secondary average gradient data, which reduces the communication pressure on the cloud end and reduces network bandwidth consumption; the cloud end then sends the secondary average gradient data to the client and instructs the client to perform parameter adjustment on the model. By having the edge side perform one level of data calculation and the cloud end send the averaged gradient data to the client, the model is trained based on the cooperation of the client, the edge side and the cloud end, which improves gradient data transmission efficiency and data processing efficiency and reduces data transmission pressure.
Referring to fig. 5, a flow chart of a gradient calculating method is provided in an embodiment of the present application. As shown in fig. 5, the embodiment of the present application describes a gradient calculation method based on the edge side, and the method may include the following steps S302 to S306.
S302, receiving the preset number of pieces of gradient data sent by a client.
Specifically, the client may acquire gradient data in the training process of the model and then send the gradient data to the edge side; the edge side may receive the gradient data sent by the client, and may perform average calculation processing once for every preset number of pieces of gradient data received. For example, the data gateway process at the edge side may store the gradient data in the gradient data queue until the preset number of pieces of gradient data is stored in the gradient data queue, and then send the preset number of pieces of gradient data to the gradient calculation process at the edge side for average calculation processing.
It can be appreciated that the client may send gradient data information to the edge side, where the gradient data information may include, in addition to gradient data, information such as a client identifier, a model version, and the like, where the client identifier may be an IP address of the client, a device model number, an IMEI of the client, and the like.
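The gradient data information mentioned above can be pictured as a small message; the field names below are illustrative, since the embodiment only requires the gradient plus identifying metadata such as a client identifier and a model version.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class GradientDataInfo:
    """Illustrative shape of the message a client sends to the edge side."""
    client_id: str      # e.g. an IP address, device model number, or IMEI
    model_version: int  # version of the model the gradient was computed on
    gradient: List[float] = field(default_factory=list)

msg = GradientDataInfo(client_id="client-001", model_version=3,
                       gradient=[0.1, -0.2, 0.05])
```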
S304, carrying out average calculation processing on the gradient data with the preset number to obtain average gradient data.
Specifically, the edge side may perform average calculation processing on the preset number of pieces of gradient data, thereby obtaining average gradient data. The average gradient data may be the arithmetic mean of the preset number of pieces of gradient data, or may be obtained by the edge side performing federated average calculation processing on the preset number of pieces of gradient data. For example, the gradient calculation process at the edge side may use NumPy to perform the federated average calculation processing on the preset number of pieces of gradient data.
S306, sending the average gradient data to the cloud end.
Specifically, the edge side may send the average gradient data to the cloud end. It can be understood that the data gateway process is used for data interaction and data transmission between the client and the cloud device, so after the gradient calculation process calculates the average gradient data, it may send the average gradient data to the data gateway process, and the data gateway process sends the average gradient data to the cloud end.
It can be understood that the edge side may send average gradient data information to the cloud end, where the average gradient data information may include, in addition to average gradient data, a client identifier, a model version, and the like, and is configured to instruct the cloud end to send the average gradient data to a client corresponding to the client identifier, and instruct the client to perform parameter adjustment on the model.
In the embodiment of the application, the preset number of pieces of gradient data sent by the client is received, where the gradient data is acquired by the client in the training process of the model; average calculation processing is performed on the preset number of pieces of gradient data to obtain average gradient data, and the average gradient data is sent to the cloud end, which sends the average gradient data to the client to instruct the client to perform parameter adjustment on the model. By having the edge side perform one level of data calculation and the cloud end send the averaged gradient data to the client, the model is trained based on the cooperation of the client, the edge side and the cloud end, which improves gradient data transmission efficiency and data processing efficiency and reduces data transmission pressure.
Referring to fig. 6, a flow chart of a gradient calculating method is provided in an embodiment of the application. As shown in fig. 6, the embodiment of the present application describes a gradient calculation method based on the edge side, and the method may include the following steps S402 to S410.
S402, receiving gradient data sent by a client and storing the gradient data in a gradient data queue.
Specifically, the client may acquire gradient data in the training process of the model and then send the gradient data to the edge side; the edge side may receive the gradient data sent by the client, and may perform average calculation processing once for every preset number of pieces of gradient data received. It can be understood that the client may send gradient data information to the edge side, where the gradient data information may include, in addition to the gradient data, information such as a client identifier and a model version, where the client identifier may be an IP address of the client, a device model number, an IMEI of the client, or the like.
The edge side may receive the gradient data sent by the client and store the gradient data in the gradient data queue. It can be understood that the edge side may include a data gateway process and a gradient calculation process: the data gateway process is used for data interaction and data transmission with the client and the cloud end, and the gradient calculation process is used for performing average calculation processing on the gradient data. Therefore, when the data gateway process receives gradient data, it does not directly transmit the gradient data to the gradient calculation process for calculation, but stores it in the gradient data queue sequentially, in the time order in which the gradient data is received. The data gateway process thus schedules the received gradient data in batches, which improves the orderliness of gradient calculation and thereby the efficiency of gradient calculation.
S404, if the number of the gradient data in the gradient data queue meets the preset number, acquiring the gradient data with the preset number from the gradient data queue.
Specifically, if the number of pieces of gradient data in the gradient data queue meets the preset number, the edge side may acquire the preset number of pieces of gradient data from the gradient data queue and may delete them from the queue. It can be understood that the edge side stores the gradient data in the gradient data queue sequentially, in the time order in which the gradient data is received, so if the number of pieces of gradient data in the queue meets the preset number, the edge side may acquire the preset number of pieces of gradient data from the head of the queue. The preset number may be an initial setting of the edge side device, or may be set by a user or related staff.
Optionally, if the number of gradient data in the gradient data queue meets the preset number, the data gateway process in the edge side may acquire the preset number of gradient data from the gradient data queue, and send the preset number of gradient data to the gradient computing process.
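The data gateway's queue-then-batch behaviour described in steps S402 and S404 might be sketched as follows; the batch size of 4 and the in-memory deque are illustrative stand-ins for the preset number and the gradient data queue.

```python
from collections import deque

PRESET_NUMBER = 4  # illustrative stand-in for the preset number

gradient_queue = deque()

def on_gradient_received(gradient_record):
    """Enqueue records in arrival order; when the queue holds the preset
    number of records, pop that many from the head and return them as
    one batch for the gradient calculation process (else None)."""
    gradient_queue.append(gradient_record)
    if len(gradient_queue) < PRESET_NUMBER:
        return None
    return [gradient_queue.popleft() for _ in range(PRESET_NUMBER)]
```

The first three calls return None; the fourth returns all four records in the order they arrived, leaving the queue empty for the next batch.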
S406, performing grouping and filtering processing on the preset number of pieces of gradient data to obtain filtered gradient data.
Specifically, different model versions are generated during the model training process as parameters are updated, and the training progress of different clients may differ, so the model version corresponding to each piece of gradient data in the preset number of pieces of gradient data may differ. Since gradient data of old model versions has little reference significance, the edge side may perform grouping and filtering processing on the preset number of pieces of gradient data, filtering out the gradient data of old model versions, to obtain the filtered gradient data, which is the gradient data corresponding to the latest model version among the preset number of pieces of gradient data.
Optionally, the edge side may perform grouping processing on the preset number of pieces of gradient data according to the model versions corresponding to the gradient data, so as to obtain at least one gradient data set, where the model versions corresponding to different gradient data sets are different; that is, gradient data of the same model version is stored in the same gradient data set. The edge side may then obtain the filtered gradient data set corresponding to the latest model version and obtain the filtered gradient data in that set. It can be understood that the filtered gradient data is the gradient data in the filtered gradient data set, and the model versions corresponding to the filtered gradient data in the filtered gradient data set are all the latest model version.
Optionally, after the gradient calculation process receives the preset number of pieces of gradient data sent by the data gateway process, it may perform grouping and filtering processing on the preset number of pieces of gradient data to obtain the filtered gradient data.
S408, carrying out average calculation processing on the filtered gradient data to obtain average gradient data.
Specifically, the edge side may perform average calculation processing on the filtered gradient data, thereby obtaining average gradient data. The average gradient data may be the arithmetic mean of the filtered gradient data, or may be obtained by the edge side performing federated average calculation processing on the filtered gradient data. For example, the gradient calculation process at the edge side may perform the federated average calculation processing on the filtered gradient data using NumPy.
Optionally, after the gradient calculation process obtains the filtered gradient data, it may perform federated average calculation processing on the filtered gradient data, so as to obtain average gradient data.
S410, sending the average gradient data to the cloud end.
Specifically, the edge side may send the average gradient data to the cloud end. It can be understood that the data gateway process is used for data interaction and data transmission between the client and the cloud device, so after the gradient calculation process calculates the average gradient data, it may send the average gradient data to the data gateway process, and the data gateway process sends the average gradient data to the cloud end; the cloud end may then send the average gradient data to the client to instruct the client to perform parameter adjustment on the model.
In the embodiment of the application, the edge side receives the gradient data sent by the client and stores the gradient data in the gradient data queue. If the number of pieces of gradient data in the gradient data queue meets the preset number, the edge side acquires the preset number of pieces of gradient data from the gradient data queue and performs grouping and filtering processing on them to obtain the filtered gradient data, filtering out gradient data of old model versions that has little reference significance; this improves the accuracy of the average gradient data and reduces the calculation pressure on the edge side. The edge side performs federated average calculation processing on the filtered gradient data to obtain average gradient data and sends it to the cloud end, and the cloud end may send the average gradient data to the client to instruct the client to perform parameter adjustment on the model. By having the edge side perform one level of data calculation and the cloud end send the averaged gradient data to the client, the model is trained based on the cooperation of the client, the edge side and the cloud end, which improves gradient data transmission efficiency and data processing efficiency and reduces data transmission pressure.
Referring to fig. 7, a flow chart of a gradient calculating method is provided in an embodiment of the application. As shown in fig. 7, the embodiment of the present application describes a gradient calculation method based on cloud, and the method may include the following steps S502 to S504.
S502, receiving average gradient data sent by the edge side.
Specifically, the edge side may receive gradient data sent by the client, perform average calculation processing based on the obtained gradient data, obtain average gradient data, and send the average gradient data to the cloud. The cloud may receive average gradient data sent by the edge side. It may be appreciated that the edge side may send average gradient data information to the cloud end, where the average gradient data information may include, in addition to average gradient data, a client identifier, a model version, and the like, and is configured to instruct the cloud end to send the average gradient data to a client corresponding to the client identifier.
S504, sending the average gradient data to the client, and instructing the client to perform parameter adjustment on the model.
Specifically, after the cloud end receives the average gradient data sent by the edge side, it may send the average gradient data to the client, so as to instruct the client to perform parameter adjustment on the model. That is, after the client receives the average gradient data sent by the cloud end, it may use the average gradient data to adjust the parameters of the model, and the above steps are repeated until the model converges, so that model training is completed.
In the embodiment of the application, average gradient data sent by the edge side is received, where the average gradient data is obtained by the edge side through calculation on the gradient data acquired by the client in the training process of the model, and the average gradient data is sent to the client to instruct the client to perform parameter adjustment on the model. By having the edge side perform one level of data calculation and the cloud end send the averaged gradient data to the client, the model is trained based on the cooperation of the client, the edge side and the cloud end, which improves gradient data transmission efficiency and data processing efficiency and reduces data transmission pressure.
Referring to fig. 8, a flow chart of a gradient calculating method is provided in an embodiment of the application. As shown in fig. 8, the embodiment of the present application describes a gradient calculation method based on cloud, and the method may include the following steps S602 to S608.
S602, receiving average gradient data sent by the edge side.
Specifically, the edge side may receive gradient data sent by the client, perform average calculation processing based on the obtained gradient data, obtain average gradient data, and send the average gradient data to the cloud. The cloud may receive average gradient data sent by the edge side.
It may be appreciated that the edge side may send average gradient data information to the cloud end, where the average gradient data information may include, in addition to average gradient data, a client identifier, a model version, and the like, and is configured to instruct the cloud end to send the average gradient data to a client corresponding to the client identifier.
S604, determining the secondary average number of pieces of average gradient data.
Specifically, if the cloud end sent the average gradient data to all clients whenever it received average gradient data from the edge side, the data communication pressure on the cloud end would easily become high and network bandwidth consumption would be large. Therefore, the cloud end may wait until the average gradient data received from the edge side reaches the secondary average number before performing data transmission with the clients. The secondary average number may be an initial setting of the cloud end, or may be set by users or related staff; for example, the secondary average number may be 100.
Optionally, after the cloud end receives the average gradient data sent by the edge side, it may store the average gradient data in a gradient data storage table. It can be understood that the cloud end may store the average gradient data in the gradient data storage table sequentially, in the time order in which the average gradient data is received, and if the number of pieces of average gradient data in the gradient data storage table meets the secondary average number, determine the secondary average number of pieces of average gradient data in the gradient data storage table.
S606, performing average calculation processing on the secondary average number of pieces of average gradient data to obtain secondary average gradient data.
Specifically, the cloud end may perform average calculation processing on the secondary average number of pieces of average gradient data to obtain the secondary average gradient data, where the secondary average gradient data may be the mean of the secondary average number of pieces of average gradient data.
Optionally, the gradient data storage table may be used to record the average gradients used in the model training process, so its entries may be retained rather than deleted and kept in the cloud end for relevant staff to check and to monitor the model training process. Accordingly, the cloud end may mark the average gradient data that has already undergone average calculation processing: if it detects that the number of unmarked pieces of average gradient data in the gradient data storage table meets the secondary average number, it determines the secondary average number of unmarked pieces of average gradient data in the table, performs average calculation processing on them to obtain the secondary average gradient data, and then marks those pieces of average gradient data.
S608, sending the secondary average gradient data to the client, and instructing the client to perform parameter adjustment on the model.
Specifically, after the cloud end obtains the secondary average gradient data, it may send the secondary average gradient data to the client, so as to instruct the client to perform parameter adjustment on the model. That is, after the client receives the secondary average gradient data sent by the cloud end, it may use the secondary average gradient data to adjust the parameters of the model, and the above steps are repeated until the model converges, so that model training is completed.
In the embodiment of the application, the cloud end may receive the average gradient data sent by the edge side, where the average gradient data is obtained by the edge side through calculation on the gradient data acquired by the client in the training process of the model. The cloud end determines the secondary average number of pieces of average gradient data and performs average calculation processing on them to obtain the secondary average gradient data, which reduces the communication pressure on the cloud end and reduces network bandwidth consumption; the cloud end then sends the secondary average gradient data to the client to instruct the client to perform parameter adjustment on the model. By having the edge side perform one level of data calculation and the cloud end send the averaged gradient data to the client, the model is trained based on the cooperation of the client, the edge side and the cloud end, which improves gradient data transmission efficiency and data processing efficiency and reduces data transmission pressure.
The following describes the edge side device according to the embodiment of the present application in detail with reference to fig. 9. It should be noted that, in fig. 9, the edge side device is used to perform the method of the embodiment of fig. 5 and 6, for convenience of explanation, only the portion relevant to the embodiment of the present application is shown, and specific technical details are not disclosed, please refer to the embodiment of fig. 5 and 6 of the present application.
Referring to fig. 9, a schematic structural diagram of an edge side device according to an exemplary embodiment of the present application is shown. The edge side device may be implemented as all or part of the device by software, hardware, or a combination of both. The apparatus 1 comprises a gradient data receiving module 11, an average calculation processing module 12 and an average data transmitting module 13.
The gradient data receiving module 11 is configured to receive preset number of gradient data sent by a client, where the gradient data is acquired by the client in a training process of a model;
optionally, the gradient data receiving module 11 is specifically configured to receive gradient data sent by a client, and store the gradient data in a gradient data queue;
and if the number of the gradient data in the gradient data queue meets the preset number, acquiring the gradient data with the preset number from the gradient data queue.
The average calculation processing module 12 is configured to perform average calculation processing on the preset number of gradient data, so as to obtain average gradient data;
optionally, the average calculation processing module 12 is specifically configured to perform grouping and filtering processing on the preset number of pieces of gradient data to obtain filtered gradient data;
and carrying out average calculation processing on the filtered gradient data to obtain average gradient data.
Optionally, the average calculation processing module 12 is specifically configured to perform grouping processing on the preset number of gradient data according to model versions corresponding to the gradient data, so as to obtain at least one gradient data set, where model versions corresponding to different gradient data sets are different;
and acquiring a filtering gradient data set corresponding to the latest model version, and acquiring filtering gradient data in the filtering gradient data set.
Optionally, the average calculation processing module 12 is specifically configured to perform federated average calculation processing on the preset number of pieces of gradient data to obtain average gradient data.
The average data sending module 13 is configured to send the average gradient data to a cloud end, where the cloud end is configured to send the average gradient data to the client, and instruct the client to perform parameter adjustment on the model.
In this embodiment, the edge side receives the gradient data sent by the client and stores the gradient data in the gradient data queue. If the number of pieces of gradient data in the gradient data queue meets the preset number, the edge side acquires the preset number of pieces of gradient data from the gradient data queue and performs grouping and filtering processing on them to obtain the filtered gradient data, filtering out gradient data of old model versions that has little reference significance; this improves the accuracy of the average gradient data and reduces the calculation pressure on the edge side. The edge side performs federated average calculation processing on the filtered gradient data to obtain average gradient data and sends it to the cloud end, and the cloud end may send the average gradient data to the client to instruct the client to perform parameter adjustment on the model. By having the edge side perform one level of data calculation and the cloud end send the averaged gradient data to the client, the model is trained based on the cooperation of the client, the edge side and the cloud end, which improves gradient data transmission efficiency and data processing efficiency and reduces data transmission pressure.
Referring to fig. 10, a block diagram of an electronic device according to an exemplary embodiment of the present application is shown. The electronic device of the present application may include one or more of the following components: processor 110, memory 120, input device 130, output device 140, and bus 150. The processor 110, the memory 120, the input device 130, and the output device 140 may be connected by a bus 150.
Processor 110 may include one or more processing cores. The processor 110 connects various parts within the overall electronic device using various interfaces and lines, and performs various functions of the terminal 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and invoking data stored in the memory 120. Optionally, the processor 110 may be implemented in hardware using at least one of digital signal processing (Digital Signal Processing, DSP), a field-programmable gate array (Field-Programmable Gate Array, FPGA), or a programmable logic array (Programmable Logic Array, PLA). The processor 110 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is responsible for rendering and drawing display content; and the modem is used to handle wireless communication. It can be understood that the modem may also not be integrated into the processor 110 and may instead be implemented by a single communication chip.
The memory 120 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). Optionally, the memory 120 includes a non-transitory computer-readable storage medium (Non-Transitory Computer-Readable Storage Medium). The memory 120 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 120 may include a stored program area and a stored data area, where the stored program area may store instructions for implementing an operating system (which may be an Android system, including a system deeply developed based on the Android system, an IOS system developed by Apple Inc., including a system deeply developed based on the IOS system, or another system), instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the various method embodiments described above, and the like.
Memory 120 may be divided into an operating system space in which the operating system runs and a user space in which native and third party applications run. In order to ensure that different third party application programs can achieve better operation effects, the operating system allocates corresponding system resources for the different third party application programs. However, the requirements of different application scenarios in the same third party application program on system resources are different, for example, under the local resource loading scenario, the third party application program has higher requirement on the disk reading speed; in the animation rendering scene, the third party application program has higher requirements on the GPU performance. The operating system and the third party application program are mutually independent, and the operating system often cannot timely sense the current application scene of the third party application program, so that the operating system cannot perform targeted system resource adaptation according to the specific application scene of the third party application program.
To enable the operating system to distinguish the specific application scenario of a third-party application, a data communication channel between the third-party application and the operating system needs to be established, so that the operating system can obtain the application's current scenario information at any time and adapt system resources accordingly.
The input device 130 is configured to receive input instructions or data; the input device 130 includes, but is not limited to, a keyboard, a mouse, a camera, a microphone, or a touch device. The output device 140 is configured to output instructions or data; the output device 140 includes, but is not limited to, a display device, a speaker, and the like. In one example, the input device 130 and the output device 140 may be combined into a single touch display screen.
The touch display screen may be designed as a full screen, a curved screen, or a special-shaped screen. The touch display screen may also be designed as a combination of a full screen and a curved screen, or a combination of a special-shaped screen and a curved screen, which is not limited in this embodiment of the application.
In addition, those skilled in the art will appreciate that the structure of the electronic device shown in the above figures does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than illustrated, combine certain components, or have a different arrangement of components. For example, the electronic device may further include components such as a radio frequency circuit, an input unit, a sensor, an audio circuit, a wireless fidelity (WiFi) module, a power supply, and a Bluetooth module, which are not described herein.
In the electronic device shown in fig. 10, the processor 110 may be configured to invoke the gradient computing application program stored in the memory 120, and specifically perform the following operations:
receiving gradient data of a preset number sent by a client, wherein the gradient data are acquired by the client in the training process of a model;
carrying out average calculation processing on the gradient data of the preset number to obtain average gradient data;
and sending the average gradient data to a cloud end, wherein the cloud end is used for sending the average gradient data to the client end and indicating the client end to carry out parameter adjustment on the model.
In one embodiment, when executing the step of receiving the preset number of gradient data sent by the client, the processor 110 specifically performs the following operations:
receiving gradient data sent by a client, and storing the gradient data in a gradient data queue;
and if the number of the gradient data in the gradient data queue meets the preset number, acquiring the gradient data with the preset number from the gradient data queue.
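The queueing step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the preset number of 4 and the scalar-list gradients are hypothetical, since the patent leaves both configurable.

```python
from collections import deque

PRESET_NUMBER = 4  # hypothetical batch size; the patent does not fix this value

class GradientQueue:
    """Edge-side buffering: gradients from clients accumulate in a queue,
    and a batch is released only once the preset number has been reached."""

    def __init__(self, preset_number):
        self.preset_number = preset_number
        self.queue = deque()

    def receive(self, gradient):
        """Store one client's gradient; return a full batch when enough have arrived."""
        self.queue.append(gradient)
        if len(self.queue) >= self.preset_number:
            # take exactly the preset number of gradients, oldest first
            return [self.queue.popleft() for _ in range(self.preset_number)]
        return None  # not enough gradients yet

q = GradientQueue(PRESET_NUMBER)
batches = [q.receive(g) for g in ([0.1], [0.2], [0.3], [0.4])]
# the first three calls only buffer; the fourth releases a full batch
```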
In one embodiment, when the processor 110 performs the average calculation processing on the preset number of gradient data to obtain average gradient data, the following operations are specifically performed:
performing grouping and filtering processing on the gradient data of the preset number to obtain filtered gradient data;
and carrying out average calculation processing on the filtered gradient data to obtain average gradient data.
In one embodiment, when performing the packet filtering processing on the preset number of gradient data to obtain filtered gradient data, the processor 110 specifically performs the following operations:
grouping the preset number of gradient data according to model versions corresponding to the gradient data to obtain at least one gradient data set, wherein model versions corresponding to different gradient data sets are different;
and acquiring a filtering gradient data set corresponding to the latest model version, and acquiring filtering gradient data in the filtering gradient data set.
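The grouping-and-filtering step can be sketched as below. The `version` field and the gradient values are illustrative assumptions; the patent specifies only that gradients are grouped by model version and that the group for the latest version is kept.

```python
from collections import defaultdict

# Each submission pairs a model version with a gradient vector (hypothetical values).
submissions = [
    {"version": 2, "grad": [0.1, 0.2]},
    {"version": 3, "grad": [0.3, 0.1]},
    {"version": 2, "grad": [0.2, 0.4]},
    {"version": 3, "grad": [0.1, 0.3]},
]

def filter_latest(submissions):
    """Group gradients by model version and keep only the newest version's group."""
    groups = defaultdict(list)
    for s in submissions:
        groups[s["version"]].append(s["grad"])
    latest = max(groups)  # the most recent model version
    return groups[latest]

filtered = filter_latest(submissions)
# gradients from the stale version 2 are discarded
```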
In one embodiment, when the processor 110 performs the average calculation processing on the preset number of gradient data to obtain average gradient data, the following operations are specifically performed:
and performing federal average calculation processing on the gradient data with the preset number to obtain average gradient data.
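A minimal federated-averaging sketch in the spirit of FedAvg follows. The uniform-weight default and the flat-vector gradient representation are assumptions, as the patent does not specify the weighting scheme.

```python
def federated_average(gradients, weights=None):
    """Element-wise (optionally weighted) average of client gradient vectors.
    With no weights given, all clients contribute equally."""
    n = len(gradients)
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(weights)
    dim = len(gradients[0])
    return [
        sum(w * g[i] for w, g in zip(weights, gradients)) / total
        for i in range(dim)
    ]

avg = federated_average([[0.2, 0.4], [0.4, 0.8]])
# element-wise mean of the two gradient vectors, within floating-point tolerance
```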
In this embodiment, the edge side receives gradient data sent by clients and stores it in a gradient data queue. If the number of gradient data items in the queue reaches the preset number, the edge side retrieves the preset number of gradient data items from the queue and performs grouping and filtering processing on them to obtain filtered gradient data, discarding gradient data from old model versions that has little reference value. This improves the accuracy of the average gradient data and reduces the computing load on the edge side. The edge side then performs federated average computation on the filtered gradient data to obtain average gradient data and sends it to the cloud, and the cloud can send the average gradient data to the clients and instruct them to adjust the parameters of the model. By using the edge side to perform one level of data aggregation and having the cloud distribute the average gradient data to clients, the model is trained through the cooperation of client, edge side, and cloud, which improves gradient data transmission efficiency and computational efficiency while reducing data transmission pressure.
The cloud device provided by the embodiment of the application will be described in detail with reference to fig. 11. It should be noted that the cloud device in fig. 11 is used to perform the methods of the embodiments of figs. 7 and 8 of the present application. For convenience of explanation, only the portions relevant to the embodiments of the present application are shown; for specific technical details that are not disclosed, please refer to the embodiments of figs. 7 and 8 of the present application.
Fig. 11 is a schematic structural diagram of a cloud device according to an exemplary embodiment of the application. The cloud device may be implemented as all or part of a device by software, hardware, or a combination of both. The apparatus 2 includes an average data receiving module 21 and a gradient data sending module 22.
an average data receiving module 21, configured to receive average gradient data sent by an edge side, where the average gradient data is calculated by the edge side from gradient data obtained by the client during training of the model;
and the gradient data sending module 22 is configured to send the average gradient data to the client, and instruct the client to perform parameter adjustment on the model.
Optionally, the gradient data sending module 22 is specifically configured to determine average gradient data of the secondary average number;
Carrying out average calculation processing on the average gradient data of the secondary average number to obtain secondary average gradient data;
and sending the secondary average gradient data to a client, and indicating the client to carry out parameter adjustment on the model.
Optionally, the gradient data sending module 22 is specifically configured to store the average gradient data in a gradient data storage table;
and if the number of the average gradient data in the gradient data storage table meets the secondary average number, acquiring the average gradient data of the secondary average number in the gradient data storage table.
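The storage-table logic above can be sketched as follows. The secondary-average count of 2 and the gradient values are hypothetical; the patent specifies only that edge-side averages accumulate in a storage table and are averaged again once the secondary average number is reached.

```python
class CloudAggregator:
    """Cloud-side secondary averaging: edge-side average gradients accumulate
    in a storage table, and once the secondary-average count is reached they
    are averaged again before being pushed to clients."""

    def __init__(self, secondary_count):
        self.secondary_count = secondary_count
        self.table = []  # the "gradient data storage table"

    def receive_average(self, avg_gradient):
        """Store one edge-side average; return the secondary average when ready."""
        self.table.append(avg_gradient)
        if len(self.table) >= self.secondary_count:
            batch = self.table[:self.secondary_count]
            self.table = self.table[self.secondary_count:]
            dim = len(batch[0])
            return [sum(g[i] for g in batch) / len(batch) for i in range(dim)]
        return None  # table not yet full

cloud = CloudAggregator(secondary_count=2)
first = cloud.receive_average([0.2, 0.6])   # buffered, returns None
second = cloud.receive_average([0.4, 0.2])  # triggers the secondary average
```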
In the embodiment of the application, the cloud can receive the average gradient data sent by the edge side, where the average gradient data is calculated by the edge side from gradient data obtained by the client during training of the model. The cloud determines average gradient data of the secondary average number and performs average computation on it to obtain secondary average gradient data, which reduces the communication pressure of cloud data and lowers network bandwidth consumption. The cloud then sends the secondary average gradient data to the client and instructs the client to adjust the parameters of the model. By using the edge side to perform one level of data aggregation and having the cloud distribute the averaged gradient data to clients, the model is trained through the cooperation of client, edge side, and cloud, which improves gradient data transmission efficiency and computational efficiency while reducing data transmission pressure.
Referring to fig. 12, a block diagram of an electronic device according to an exemplary embodiment of the present application is shown. The electronic device of the present application may include one or more of the following components: processor 210, memory 220, input device 230, output device 240, and bus 250. The processor 210, memory 220, input device 230, and output device 240 may be connected by a bus 250.
Processor 210 may include one or more processing cores. The processor 210 connects various parts of the electronic device using various interfaces and lines, and performs the various functions of the electronic device 200 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 220 and invoking data stored in the memory 220. Alternatively, the processor 210 may be implemented in at least one hardware form of a DSP, an FPGA, or a PLA. The processor 210 may integrate one or a combination of a CPU, a GPU, a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and so on; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It can be understood that the modem may also not be integrated into the processor 210 and may instead be implemented by a separate communication chip.
The memory 220 may include RAM or ROM. Optionally, the memory 220 includes a non-transitory computer-readable storage medium. The memory 220 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 220 may include a program storage area and a data storage area. The program storage area may store instructions for implementing an operating system, which may be the Android system (including systems deeply developed on the basis of Android), the iOS system developed by Apple Inc. (including systems deeply developed on the basis of iOS), or another system; instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function); instructions for implementing the various method embodiments described below; and the like. The data storage area may also store data created by the electronic device in use, such as phone books, audio and video data, and chat log data.
The memory 220 may be divided into an operating system space, in which the operating system runs, and a user space, in which native and third-party applications run. To ensure that different third-party applications achieve good running performance, the operating system allocates corresponding system resources to them. However, different application scenarios within the same third-party application place different demands on system resources: in a local resource loading scenario, the application requires a high disk read speed, while in an animation rendering scenario it requires high GPU performance. Because the operating system and the third-party application are independent of each other, the operating system often cannot perceive the application's current scenario in time and therefore cannot adapt system resources to that specific scenario.
To enable the operating system to distinguish the specific application scenario of a third-party application, a data communication channel between the third-party application and the operating system needs to be established, so that the operating system can obtain the application's current scenario information at any time and adapt system resources accordingly.
The input device 230 is configured to receive input instructions or data; the input device 230 includes, but is not limited to, a keyboard, a mouse, a camera, a microphone, or a touch device. The output device 240 is configured to output instructions or data; the output device 240 includes, but is not limited to, a display device, a speaker, and the like. In one example, the input device 230 and the output device 240 may be combined into a single touch display screen.
The touch display screen may be designed as a full screen, a curved screen, or a special-shaped screen. The touch display screen may also be designed as a combination of a full screen and a curved screen, or a combination of a special-shaped screen and a curved screen, which is not limited in this embodiment of the application.
In addition, those skilled in the art will appreciate that the structure of the electronic device shown in the above figures does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than illustrated, combine certain components, or have a different arrangement of components. For example, the electronic device may further include components such as a radio frequency circuit, an input unit, a sensor, an audio circuit, a WiFi module, a power supply, and a Bluetooth module, which are not described herein.
In the electronic device shown in fig. 12, the processor 210 may be configured to invoke the gradient computing application program stored in the memory 220, and specifically perform the following operations:
receiving average gradient data sent by an edge side, wherein the average gradient data is calculated by the edge side based on gradient data obtained in the training process of a model by a client;
and sending the average gradient data to the client side, and indicating the client side to carry out parameter adjustment on the model.
In one embodiment, the processor 210, when executing the sending of the average gradient data to the client, specifically performs the following operations:
determining average gradient data of the secondary average number;
carrying out average calculation processing on the average gradient data of the secondary average number to obtain secondary average gradient data;
and sending the secondary average gradient data to a client, and indicating the client to carry out parameter adjustment on the model.
In one embodiment, when executing the determination of the average gradient data of the secondary average number, the processor 210 specifically performs the following operations:
storing the average gradient data in a gradient data storage table;
And if the number of the average gradient data in the gradient data storage table meets the secondary average number, acquiring the average gradient data of the secondary average number in the gradient data storage table.
In the embodiment of the application, the cloud can receive the average gradient data sent by the edge side, where the average gradient data is calculated by the edge side from gradient data obtained by the client during training of the model. The cloud determines average gradient data of the secondary average number and performs average computation on it to obtain secondary average gradient data, which reduces the communication pressure of cloud data and lowers network bandwidth consumption. The cloud then sends the secondary average gradient data to the client and instructs the client to adjust the parameters of the model. By using the edge side to perform one level of data aggregation and having the cloud distribute the averaged gradient data to clients, the model is trained through the cooperation of client, edge side, and cloud, which improves gradient data transmission efficiency and computational efficiency while reducing data transmission pressure.
Those skilled in the art will appreciate that all or part of the processes of the above-described method embodiments may be implemented by a computer program stored on a computer-readable storage medium, which, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
The foregoing disclosure is illustrative of the present application and is not to be construed as limiting the scope of the application, which is defined by the appended claims.
It should be noted that, information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals according to the embodiments of the present disclosure are all authorized by the user or are fully authorized by the parties, and the collection, use, and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant countries and regions. For example, gradient data, client information, and the like referred to in this specification are all acquired with sufficient authorization.

Claims (13)

1. A gradient calculation method applied to an edge side, the method comprising:
receiving gradient data of a preset number sent by a client, wherein the gradient data are acquired by the client in the training process of a model;
carrying out average calculation processing on the gradient data of the preset number to obtain average gradient data;
and sending the average gradient data to a cloud end, wherein the cloud end is used for sending the average gradient data to the client end and indicating the client end to carry out parameter adjustment on the model.
2. The method of claim 1, wherein the receiving the preset number of gradient data sent by the client includes:
receiving gradient data sent by a client, and storing the gradient data in a gradient data queue;
and if the number of the gradient data in the gradient data queue meets the preset number, acquiring the gradient data with the preset number from the gradient data queue.
3. The method of claim 1, wherein the performing an average calculation on the preset number of gradient data to obtain average gradient data includes:
performing grouping and filtering processing on the gradient data of the preset number to obtain filtered gradient data;
and carrying out average calculation processing on the filtered gradient data to obtain average gradient data.
4. The method of claim 3, wherein the performing packet filtering on the preset number of gradient data to obtain filtered gradient data includes:
grouping the preset number of gradient data according to model versions corresponding to the gradient data to obtain at least one gradient data set, wherein model versions corresponding to different gradient data sets are different;
and acquiring a filtering gradient data set corresponding to the latest model version, and acquiring filtering gradient data in the filtering gradient data set.
5. The method of claim 1, wherein the performing an average calculation on the preset number of gradient data to obtain average gradient data includes:
and performing federal average calculation processing on the gradient data with the preset number to obtain average gradient data.
6. A gradient computing method applied to a cloud, the method comprising:
receiving average gradient data sent by an edge side, wherein the average gradient data is calculated by the edge side based on gradient data obtained in the training process of a model by a client;
and sending the average gradient data to the client side, and indicating the client side to carry out parameter adjustment on the model.
7. The method of claim 6, the sending the average gradient data to the client, comprising:
determining average gradient data of the secondary average number;
carrying out average calculation processing on the average gradient data of the secondary average number to obtain secondary average gradient data;
and sending the secondary average gradient data to a client, and indicating the client to carry out parameter adjustment on the model.
8. The method of claim 7, the determining average gradient data for the secondary average number comprising:
Storing the average gradient data in a gradient data storage table;
and if the number of the average gradient data in the gradient data storage table meets the secondary average number, acquiring the average gradient data of the secondary average number in the gradient data storage table.
9. An edge side device, the device comprising:
the system comprises a gradient data receiving module, a model training module and a model training module, wherein the gradient data receiving module is used for receiving preset number of gradient data sent by a client, and the gradient data are acquired by the client in the model training process;
the average calculation processing module is used for carrying out average calculation processing on the gradient data with the preset number to obtain average gradient data;
the average data sending module is used for sending the average gradient data to a cloud end, and the cloud end is used for sending the average gradient data to the client end and indicating the client end to carry out parameter adjustment on the model.
10. A cloud device, the device comprising:
the average data receiving module is used for receiving average gradient data sent by the edge side, wherein the average gradient data is obtained by calculating gradient data obtained in the training process of the model by the edge side based on the client side;
And the gradient data sending module is used for sending the average gradient data to the client and indicating the client to carry out parameter adjustment on the model.
11. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method steps of any one of claims 1 to 5 or 6 to 8.
12. A computer program product storing a plurality of instructions adapted to be loaded by a processor and to perform the method steps of any of claims 1 to 5 or 6 to 8.
13. An electronic device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1-5 or 6-8.
CN202310848180.8A 2023-07-12 2023-07-12 Gradient calculation method and device, storage medium, product and electronic equipment Active CN116596065B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310848180.8A CN116596065B (en) 2023-07-12 2023-07-12 Gradient calculation method and device, storage medium, product and electronic equipment

Publications (2)

Publication Number Publication Date
CN116596065A true CN116596065A (en) 2023-08-15
CN116596065B CN116596065B (en) 2023-11-28

Family

ID=87599434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310848180.8A Active CN116596065B (en) 2023-07-12 2023-07-12 Gradient calculation method and device, storage medium, product and electronic equipment

Country Status (1)

Country Link
CN (1) CN116596065B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507520A (en) * 2020-04-15 2020-08-07 瑞纳智能设备股份有限公司 Dynamic prediction method and system for load of heat exchange unit
CN113807538A (en) * 2021-04-09 2021-12-17 京东科技控股股份有限公司 Federal learning method and device, electronic equipment and storage medium
CN114530245A (en) * 2022-02-25 2022-05-24 山东浪潮科学研究院有限公司 Cloud edge coordination medical system based on edge calculation and federal learning
CN114861993A (en) * 2022-04-18 2022-08-05 国网浙江省电力有限公司象山县供电公司 Regional photovoltaic power generation prediction method based on federal learning and deep neural network
CN115840900A (en) * 2022-09-16 2023-03-24 河海大学 Personalized federal learning method and system based on self-adaptive clustering layering
CN116094993A (en) * 2022-12-22 2023-05-09 电子科技大学 Federal learning security aggregation method suitable for edge computing scene

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHENG WANG 等: "Edge-Based Stochastic Gradient Algorithm for Distributed Optimization", 《IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING》, pages 1421 - 1430 *
袁性忠 等: "基于边缘计算的能量自治区域调度策略", 《智慧电力》, pages 46 - 54 *

Also Published As

Publication number Publication date
CN116596065B (en) 2023-11-28


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant