CN115794359A - Heterogeneous system and processing method for federated learning

Heterogeneous system and processing method for federated learning

Info

Publication number
CN115794359A
CN115794359A (application number CN202111055426.3A)
Authority
CN
China
Prior art keywords
operator
heterogeneous
processing device
type
heterogeneous processing
Prior art date
Legal status
Pending
Application number
CN202111055426.3A
Other languages
Chinese (zh)
Inventor
彭瑞 (Peng Rui)
王玮 (Wang Wei)
陈沫 (Chen Mo)
Current Assignee
Shenzhen Zhixing Technology Co Ltd
Original Assignee
Shenzhen Zhixing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Zhixing Technology Co Ltd
Priority to CN202111055426.3A
Publication of CN115794359A
Legal status: Pending

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a heterogeneous system and a processing method for federated learning. The processing method includes: determining an operator type for each of a plurality of operators involved in a federated learning computation task; determining, from the heterogeneous system, the heterogeneous processing device corresponding to each operator according to its operator type; assigning each operator to its corresponding heterogeneous processing device for execution; and, once every operator has finished executing, obtaining the operation results corresponding to the operators and deriving the processing result of the federated learning computation task from them. This improves the flexibility with which federated learning operators are assigned and relieves the computing-power bottleneck of federated learning.

Description

Heterogeneous system and processing method for federated learning
Technical Field
The present application relates to the technical field of privacy computing, and in particular to a heterogeneous system and a processing method for federated learning.
Background
Federated learning arose to address two major challenges: data islands and privacy security. Federated learning is a machine learning technique for training models across multiple decentralized edge devices or servers. It allows multiple data parties to jointly build a common, powerful machine learning model without sharing their raw data, thereby addressing key problems such as data privacy, data security, data access rights, and access to heterogeneous data. Federated learning has consequently developed rapidly, and its application scenarios now span many fields, such as national defense, communications, the Internet of Things, pharmaceuticals, and finance. However, because federated learning must process far more data than traditional machine learning, computing power has become a technical bottleneck for deploying federated learning in industry.
Disclosure of Invention
The present application provides a heterogeneous system and a processing method for federated learning that complete the computations of federated learning by assigning suitable heterogeneous processing devices to the federated learning operators.
In a first aspect, the present application provides a processing method for a federated learning computation task based on a heterogeneous system, where the heterogeneous system includes a plurality of heterogeneous processing devices, and the processing method includes:
determining an operator type for each of a plurality of operators involved in the federated learning computation task; determining, from the heterogeneous system, the heterogeneous processing device corresponding to each operator according to its operator type; assigning each operator to its corresponding heterogeneous processing device for execution; and, once every operator has finished executing, obtaining the operation results corresponding to the operators and deriving the processing result of the federated learning computation task from them. In this way, each federated learning operator is assigned, via its operator type, to a suitable heterogeneous processing device that carries out its computation, which improves the computation efficiency of federated learning.
In an embodiment, determining the operator type of each of the plurality of operators involved in the federated learning computation task specifically includes: determining the operator type of each operator according to the operation types that the operator contains. By summarizing the commonalities of federated learning operators and grouping them into a few categories, the computation efficiency of federated learning is further improved.
In an embodiment, determining the operator type of each operator according to the operation types it contains includes: determining the operation types contained in each of the plurality of operators involved in the federated learning computation task; if an operator contains modular exponentiation, determining that its operator type is a first type; if an operator contains no modular exponentiation but contains modular multiplication, determining that its operator type is a second type; and if an operator contains neither modular exponentiation nor modular multiplication, determining that its operator type is a third type. With these three types, the corresponding heterogeneous processing device can be determined quickly from an operator's type, further improving the computation efficiency of federated learning.
In one embodiment, the heterogeneous system includes a GPU processing device, an FPGA processing device, and a CPU processing device; the heterogeneous processing device corresponding to an operator of the first type is a GPU, FPGA, or CPU processing device; that corresponding to an operator of the second type is an FPGA or CPU processing device; and that corresponding to an operator of the third type is a CPU processing device. Different operation types thus correspond to different heterogeneous processing devices or processing manners, further improving the computation efficiency of federated learning.
In one embodiment, the processing method further includes: acquiring at least one of the device state, the operator execution efficiency, and the energy consumption of the plurality of heterogeneous processing devices; determining, from the heterogeneous system, the heterogeneous processing device corresponding to each operator then specifically includes: determining the device according to at least one of the device state, operator execution efficiency, and energy consumption, together with the operator's type. Taking device state, operator execution efficiency, and/or energy consumption into account improves the computation efficiency of federated learning and accommodates the needs of different users.
In an embodiment, before assigning each operator to its corresponding heterogeneous processing device, the processing method further includes:
determining whether a heterogeneous processing device can support the operator according to the member functions of the classes in that device's operator library; if the device can support the operator, assigning the operator to it; and if the device does not support the operator, continuing to determine whether the next heterogeneous processing device in the heterogeneous system can support it. Automatically judging whether a heterogeneous processing device can support an operator makes it convenient to assign suitable devices sensibly, improving the computation efficiency of federated learning.
In an embodiment, if it is determined that at least two heterogeneous processing devices correspond to a particular operator among the plurality of operators, the processing method further includes:
acquiring, for each of the at least two heterogeneous processing devices, a performance test result and a reference value for performance evaluation; determining a performance test score from the test result and the reference value; and determining a target heterogeneous processing device from the at least two devices according to the performance test score. Correspondingly, assigning each operator to its corresponding heterogeneous processing device includes assigning the particular operator to the target heterogeneous processing device for execution. Performance-testing the heterogeneous processing devices improves the computation efficiency of federated learning and makes sensible use of the heterogeneous system's hardware resources.
In a second aspect, the present application further provides another processing method for a federated learning computation task based on a heterogeneous system, where the heterogeneous system includes a plurality of heterogeneous processing devices, and the processing method includes:
obtaining a device selection policy chosen by a user, where the policy is used to select a corresponding heterogeneous processing device for each of the plurality of operators involved in the federated learning computation task, and different policies correspond to different device selection manners; determining the heterogeneous processing device corresponding to each operator based on the user-selected policy; assigning each operator to its corresponding heterogeneous processing device for execution; and, once every operator has finished executing, obtaining the operation results corresponding to the operators and deriving the processing result of the federated learning computation task from them. In this way, each federated learning operator is assigned to a suitable heterogeneous processing device that carries out its computation, improving the computation efficiency of federated learning.
In some embodiments, the device selection policy includes at least a first selection policy and a second selection policy; the first selection policy determines the heterogeneous processing device corresponding to each operator from a preset correspondence table between operators and heterogeneous processing devices; the second selection policy determines the device according to each operator's operator type. With at least two selection policies, the first static and the second dynamic, the computation efficiency of federated learning is improved and the needs of different users are met.
In some embodiments, if the user selects the first selection policy, determining the heterogeneous processing device corresponding to each operator based on the user-selected policy includes: determining the device from the preset correspondence table. If the user selects the second selection policy, it includes: acquiring each operator's type and thereby determining the corresponding heterogeneous processing device. With at least two selection policies, users can choose according to their actual needs, improving the computation efficiency of federated learning while meeting different users' requirements.
In some embodiments, the second selection policy is further configured to determine the heterogeneous processing device corresponding to each operator according to at least one of the device state, operator execution efficiency, and energy consumption of the heterogeneous processing devices, together with the operator's type, and the devices corresponding to an operator may comprise one or more heterogeneous processing devices. This improves the computation efficiency of federated learning and meets the needs of different users.
In some embodiments, the processing method further includes: determining the operator type of each operator according to the operation types it contains. By summarizing the commonalities of federated learning operators and grouping them into a few categories, the computation efficiency of federated learning is further improved.
In some embodiments, determining the operator type of each operator according to the operation types it contains includes:
determining the operation types contained in each operator; if an operator contains modular exponentiation, determining that its operator type is the first type; if it contains no modular exponentiation but contains modular multiplication, determining that its type is the second type; and if it contains neither modular exponentiation nor modular multiplication, determining that its type is the third type. With these three types, the corresponding heterogeneous processing device can be determined quickly from an operator's type, further improving the computation efficiency of federated learning.
In some embodiments, the heterogeneous system includes a GPU processing device, an FPGA processing device, and a CPU processing device; the heterogeneous processing device corresponding to the operator of the first type is a GPU processing device, an FPGA processing device or a CPU processing device, the heterogeneous processing device corresponding to the operator of the second type is an FPGA processing device or a CPU processing device, and the heterogeneous processing device corresponding to the operator of the third type is a CPU processing device.
In an embodiment, before assigning each operator to its corresponding heterogeneous processing device, the processing method further includes:
determining whether a heterogeneous processing device can support the operator according to the member functions of the classes in that device's operator library; if the device can support the operator, assigning the operator to it; and if the device does not support the operator, continuing to determine whether the next heterogeneous processing device in the heterogeneous system can support it. Automatically judging whether a heterogeneous processing device can support an operator makes it convenient to assign suitable devices sensibly, improving the computation efficiency of federated learning.
In an embodiment, if it is determined that at least two heterogeneous processing devices correspond to a particular operator among the plurality of operators, the processing method further includes:
acquiring, for each of the at least two heterogeneous processing devices, a performance test result and a reference value for performance evaluation; determining a performance test score from the test result and the reference value; and determining a target heterogeneous processing device from the at least two devices according to the performance test score. Correspondingly, assigning each operator to its corresponding heterogeneous processing device includes assigning the particular operator to the target heterogeneous processing device for execution. Performance-testing the heterogeneous processing devices improves the computation efficiency of federated learning and makes sensible use of the heterogeneous system's hardware resources.
In some embodiments, the heterogeneous system further includes a display device, and the processing method further includes: controlling the display device to display a policy selection interface, containing a plurality of different device selection policies, for the user to choose from. The display helps the user quickly select the desired device selection policy, improving the user experience.
In a third aspect, the present application further provides a heterogeneous system for federated learning, comprising a device selector and a plurality of heterogeneous processing devices, where the plurality of heterogeneous processing devices include FPGA, GPU, and/or CPU processing devices, and the device selector is configured to execute the steps of any processing method for a federated learning computation task provided in this application.
With the heterogeneous system and processing method for federated learning provided by this application, suitable heterogeneous processing devices are assigned to the federated learning operators, for example via a dynamic or static allocation policy; the heterogeneous processing devices complete the operators' computations, and the results are fed back to the federated learning process, improving the efficiency of federated learning.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic block diagram of a heterogeneous system for federated learning provided by an embodiment of the present application;
FIG. 2 is a schematic block diagram of another heterogeneous system for federated learning provided by embodiments of the present application;
FIG. 3 is a schematic block diagram of yet another heterogeneous system for federated learning provided by an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating an effect of a policy selection interface provided by an embodiment of the present application;
FIG. 5 is a schematic step diagram of a processing method of a federated learning calculation task based on a heterogeneous system according to an embodiment of the present application;
FIG. 6 is a schematic step diagram of another processing method for a federated learning calculation task based on a heterogeneous system according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that, for the convenience of clearly describing the technical solutions of the embodiments of the present application, the words "first", "second", and the like are used in the embodiments to distinguish identical or similar items with essentially the same functions and effects. For example, the first selection policy and the second selection policy merely distinguish different selection policies and do not limit their order. Those skilled in the art will appreciate that the words "first", "second", and the like do not limit quantity or execution order, nor do they denote any difference in importance.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
For the purpose of facilitating understanding of the embodiments of the present application, some terms referred to in the embodiments of the present application will be briefly described below.
1. Federated learning: a machine learning technique for training models on multiple decentralized edge devices or servers that allows multiple data parties to jointly build a common, powerful machine learning model without sharing the original data.
Artificial intelligence and machine learning are significant drivers of technological progress and social evolution, and have brought major advances in many areas, such as autonomous cars, robots, and wearable devices. However, they currently face two important challenges. First, in industry, factors such as commercial competition, privacy security, administrative processes, and enterprises' internal risk control mean that the data owned by each entity often exists as an isolated island, and the difficulty of breaking down data barriers between different data parties keeps growing. Second, society places ever higher ethical and legal demands on data privacy and security, and the emphasis on data privacy and data security is a global trend.
To address the two challenges of data islands and privacy security, researchers in academia and industry have proposed federated learning. Federated learning is a machine learning technique for training models on multiple decentralized edge devices or servers; it allows multiple data parties to build a common, powerful machine learning model without sharing their original data, thereby addressing key problems such as data privacy, data security, data access rights, and access to heterogeneous data. Federated learning is essentially a distributed machine learning technique whose aim is joint modeling that improves the effectiveness of AI models while guaranteeing data privacy, security, and legal compliance.
Currently, the application scenarios of federated learning span multiple fields in academia and industry, such as national defense, communications, the Internet of Things, pharmaceuticals, and finance.
2. Heterogeneous system: a system architecture and execution standard derived to achieve heterogeneous computing optimization, whose purpose is cooperative operation among the heterogeneous architectures of a system's core chips, including FPGAs (Field Programmable Gate Arrays), GPUs (Graphics Processing Units), CPUs (Central Processing Units), DSPs (Digital Signal Processors), and other processors.
It should be noted that the embodiments of the present application are described using the example of a heterogeneous system that mainly includes a CPU, a GPU, and an FPGA, but it is understood that the heterogeneous system provided by the present application may also include other heterogeneous processing devices, such as ASICs (Application Specific Integrated Circuits), TPUs (Tensor Processing Units), and the like.
As federated learning technology is gradually deployed, the inventors found that computing power becomes a technical bottleneck for implementing federated learning in industry. Generally speaking, federated learning algorithms must process far more data than traditional machine learning while computing resources are limited, and they also require low latency.
A federated learning computation task refers to the computations involved in federated learning, such as sample alignment, model training, and model prediction, where different computation tasks involve different operators.
An operator is a single specific "operation" or computation unit, or a series of them, in a federated learning algorithm. Operators include those that the federated learning algorithm can call directly, such as homomorphic encryption operators, homomorphic decryption operators, ciphertext addition operators, ciphertext multiplication operators, ciphertext summation operators, and ciphertext matrix multiplication operators, as well as internal operators called during the computation of other operators, such as modular exponentiation operators, modular multiplication operators, and modular inverse operators.
The inventors found that, in federated learning, metrics such as implementation difficulty, execution efficiency, and energy consumption differ greatly for different operators on different heterogeneous processing devices. In particular, the implementation difficulty of a given operator varies across heterogeneous processing devices.
Take the modular inverse operator called within the ciphertext multiplication operator as an example: on the GPU it can be implemented directly by calling the relevant interface of a CUDA (Compute Unified Device Architecture) big-number arithmetic library, whereas the FPGA requires a full, complicated development flow for the modular inverse operation (design, synthesis, simulation, implementation, place-and-route, verification, and so on), so its development difficulty is far greater and its development cycle far longer than the GPU's.
The execution efficiency and energy consumption of different operators (and even of the same operator with different data or parameters) also differ across heterogeneous processing devices. For example, with a public key length of 1024 bits, the GPU performs decryption at about 160,000 operations per second, while the FPGA performs it at about 70,000 operations per second; meanwhile, the GPU and FPGA draw about 250 W and 80 W, respectively. For a single large decryption workload the GPU takes less time but consumes more energy. When the data size is small, it may be more advantageous to complete the computation on the CPU, because initializing a heterogeneous processing device and transferring data between the host and the device take some time.
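A quick back-of-the-envelope check of this trade-off, using only the figures quoted above (the batch size is illustrative):

```python
# Energy per decryption = power / throughput, using the cited figures.
gpu_energy_per_op = 250 / 160_000    # ~1.56 mJ per operation
fpga_energy_per_op = 80 / 70_000     # ~1.14 mJ per operation

# For a batch of one million decryptions:
ops = 1_000_000
print(f"GPU:  {ops / 160_000:.1f} s, {gpu_energy_per_op * ops / 1000:.2f} kJ")
print(f"FPGA: {ops / 70_000:.1f} s, {fpga_energy_per_op * ops / 1000:.2f} kJ")
# GPU:  6.2 s, 1.56 kJ  (faster but more energy)
# FPGA: 14.3 s, 1.14 kJ (slower but more frugal)
```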
On this basis, to resolve federated learning's computing-power bottleneck, the embodiments of the present application provide a heterogeneous system and a processing method for federated learning. Specifically, each federated learning operator is assigned to a suitable heterogeneous processing device, and the operator's computation is completed on that device of the heterogeneous system, improving the efficiency of federated learning and solving its computing-power bottleneck.
Embodiments of the application may be used in application scenarios including, but not limited to, secure multi-party computation, machine learning model training related to federated learning, data security, privacy protection, and other scenarios applying a privacy computing framework or algorithm. The embodiments of the present application may be modified and improved for specific application environments, and are not limited here.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a schematic block diagram of a heterogeneous system for federated learning provided by an embodiment of the present application. The heterogeneous system can be applied to federated learning, and in particular can be used to complete federated learning computation tasks.
As shown in fig. 1, the heterogeneous system includes a device selector 11 and a processing device 12, and the processing device 12 includes a plurality of heterogeneous processing devices, such as a CPU processing device 121, a GPU processing device 122, and an FPGA processing device 123, which will be referred to as CPU, GPU, and FPGA for short in the following description.
It should be noted that the processing device may also include other heterogeneous processing devices, such as an ASIC, a TPU, and the like. Meanwhile, the number of heterogeneous processing devices is not limited, and for example, one or more heterogeneous processing devices of the same type may be included.
Specifically, the device selector 11 is configured to distribute the operators of the federated learning computation task to suitable heterogeneous processing devices; each heterogeneous processing device completes its operators' computations and returns the results to the federated learning process, thereby completing the federated learning computation task.
The embodiment of the present application further provides multiple operator allocation policies, so that the device selector 11 allocates the federated learning computation task to suitable heterogeneous processing devices according to different allocation policies, or uses the allocation policy corresponding to the user's requirements.
The multiple operator allocation policies provided by the present application can be roughly divided, by allocation process, into static allocation policies and dynamic allocation policies; the user may select whether the device selector 11 follows a static or a dynamic allocation policy, or the device selector 11 may be configured in advance. The dynamic allocation policy provided by the present application is described in detail below.
The characteristics of federated learning operators are summarized and the operators are classified by their commonalities, and the device selector 11 schedules and allocates them by operator type. In the embodiment of the present application, one classification approach is based on each operator's main performance bottleneck: among federated learning operators, the time-consuming operations are modular exponentiation, modular inverse, and modular multiplication of large integers, where modular exponentiation has the highest time complexity and modular multiplication a relatively low one, so the operators can be divided into three types. Classification by main performance bottleneck thus allows a suitable heterogeneous processing device to be selected.
It should be noted that an operator containing a modular inverse operation necessarily contains modular exponentiation, so operators that perform modular inverse operations are classified under the modular exponentiation type.
Accordingly, the federated learning operators provided by the embodiments of the present application can be divided into at least three types; that is, the operator types include at least a first type, a second type, and a third type. The first type comprises operators containing modular exponentiation; the second type comprises operators containing modular multiplication but no modular exponentiation; and the third type comprises operators containing neither. The details are shown in Table 1.
Table 1 shows the correspondence between operators and operator types
Operator | Operator type
Fixed point number coding | Conventional operation type
Fixed point number decoding | Conventional operation type
Paillier encryption (without obfuscation) | Modular multiplication type
Generating an obfuscation value | Modular exponentiation type
Adding obfuscation to a ciphertext | Modular multiplication type
Paillier decryption | Modular exponentiation type
Fixed point number multiplication | Modular multiplication type
Ciphertext addition | Modular exponentiation type
Ciphertext multiplication | Modular exponentiation type
Ciphertext matrix multiplication | Modular exponentiation type
Ciphertext summation | Modular exponentiation type
In Table 1, the modular exponentiation type is the first type, the modular multiplication type is the second type, and the conventional operation type is the third type.
The corresponding allocation policy is thus: determine the operator type of each of the plurality of operators involved in the federated learning computation task; determine, from the heterogeneous system, the heterogeneous processing device corresponding to each operator according to its type; and allocate each operator to its corresponding heterogeneous processing device for execution.
In some embodiments, the operator type of each of the plurality of operators involved in the federated learning computation task may be determined specifically according to the operation types contained in each operator.
Illustratively, if an operator is determined to contain modular exponentiation, its operator type is determined to be the first type; if it contains no modular exponentiation but contains modular multiplication, its type is determined to be the second type; and if it contains neither modular exponentiation nor modular multiplication, its type is determined to be the third type. The operator type can thus be determined quickly so that the operator is dispatched to a suitable heterogeneous processing device, improving processing efficiency.
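The following Python sketch illustrates this classification rule; the operation names, enum, and function are illustrative assumptions for exposition, not identifiers from the patent.

```python
from enum import Enum

class OperatorType(Enum):
    MODULAR_EXPONENTIATION = 1  # first type
    MODULAR_MULTIPLICATION = 2  # second type
    CONVENTIONAL = 3            # third type

def classify_operator(operations: set) -> OperatorType:
    # An operator containing a modular inverse necessarily contains
    # modular exponentiation, so it falls under the first type.
    if "modular_exponentiation" in operations or "modular_inverse" in operations:
        return OperatorType.MODULAR_EXPONENTIATION
    if "modular_multiplication" in operations:
        return OperatorType.MODULAR_MULTIPLICATION
    return OperatorType.CONVENTIONAL

# Per Table 1, Paillier decryption contains modular exponentiation:
assert classify_operator({"modular_exponentiation"}) is OperatorType.MODULAR_EXPONENTIATION
```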
It should be noted that different operator types correspond to different heterogeneous processing devices, or to several different heterogeneous processing devices together with the processing manners for those devices.
As for different operator types corresponding to different heterogeneous processing devices, as shown in Table 2, an operator of the third type (conventional operation type) may, for example, be computed on the CPU processing device, while operators of the other types use other heterogeneous processing devices.
As for different operator types corresponding to several different heterogeneous processing devices, as shown in Table 2, an operator of the second type (modular multiplication type) may, for example, use an FPGA or CPU processing device, and an operator of the first type (modular exponentiation type) may use a GPU, FPGA, or CPU processing device.
As for different operator types corresponding to several heterogeneous processing devices together with their processing manners, as shown in Table 2, an operator of the second type (modular multiplication type) may use an FPGA or CPU processing device, with the corresponding processing manner: if the FPGA processing device supports the current operator and its remaining resources suffice to complete it, compute on the FPGA; otherwise, compute on the CPU. An operator of the first type (modular exponentiation type) may use a GPU, FPGA, or CPU processing device, with the corresponding processing manner: if the GPU processing device supports the current operator and its remaining resources suffice to complete it, compute on the GPU; otherwise, if the FPGA processing device supports the current operator and its remaining resources suffice, compute on the FPGA; otherwise, compute on the CPU.
Table 2 shows the correspondence between operator types, their judgment methods, and their scheduling methods (provided as an image in the original publication; reconstructed here from the description above):

Operator type | Judgment method | Scheduling method
Modular exponentiation type (first type) | The operator contains modular exponentiation (or modular inverse) operations | Compute on the GPU if it supports the operator and has sufficient remaining resources; otherwise on the FPGA under the same conditions; otherwise on the CPU
Modular multiplication type (second type) | The operator contains modular multiplication but no modular exponentiation | Compute on the FPGA if it supports the operator and has sufficient remaining resources; otherwise on the CPU
Conventional operation type (third type) | The operator contains neither modular exponentiation nor modular multiplication | Compute on the CPU
In practical applications, Table 1 and Table 2 may be configured in the device selector 11 so that a suitable heterogeneous processing device and the corresponding processing manner can be determined from the operator's type. They may, of course, also be stored on a cloud server and queried there when in use.
In Table 2, to further improve the computation efficiency of federated learning, not only the operator type but also the device state of each heterogeneous processing device is considered; the device state includes whether the device supports the operator, its remaining resources, and the like, and may of course include other state information.
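The scheduling column of Table 2 amounts to a small fallback chain. The sketch below renders it in Python, reusing the OperatorType enum from the classification sketch; supports(), remaining_resources(), and required_resources are hypothetical names, not interfaces defined by the patent.

```python
def select_device(op_type, op, gpu, fpga, cpu):
    # A device is usable if it supports the operator and has enough
    # remaining resources to complete it (the Table 2 conditions).
    def usable(dev):
        return dev.supports(op) and dev.remaining_resources() >= op.required_resources

    if op_type is OperatorType.MODULAR_EXPONENTIATION:
        candidates = (gpu, fpga, cpu)   # first type: GPU, then FPGA, then CPU
    elif op_type is OperatorType.MODULAR_MULTIPLICATION:
        candidates = (fpga, cpu)        # second type: FPGA, then CPU
    else:
        candidates = (cpu,)             # third type: CPU only

    for dev in candidates:
        if usable(dev):
            return dev
    return cpu  # the CPU is the final fallback in every chain
```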
In some embodiments, to further improve the computation efficiency of federated learning, at least one of the device state, operator execution efficiency, and energy consumption of the heterogeneous processing devices in the heterogeneous system may also be acquired, and the heterogeneous processing device corresponding to each operator is then determined from the heterogeneous system according to at least one of these quantities together with the operator's type.
Specifically, when allocating suitable heterogeneous processing devices to operators, not only the operator type but also one or more of the device state, operator execution efficiency, and energy consumption of the devices need to be considered, so that more suitable devices are allocated to federated learning, for example devices with high processing efficiency and low energy consumption.
In some embodiments, an operator's task may be further split into partial tasks according to the operator type and the operator-execution efficiency of the heterogeneous processing devices; the partial tasks are allocated to the corresponding devices for processing, and the results are then aggregated.
In some embodiments, to further improve processing efficiency, the operation results are aggregated: once every operator has finished executing, the operation results corresponding to the operators are obtained, and the processing result of the federated learning computation task is derived from them. Specifically, the results of the several heterogeneous processing devices may be aggregated by another heterogeneous processing device of the system, i.e., a device other than those to which the partial tasks were allocated. The heterogeneous system's devices are thus used sensibly, further improving the computation efficiency of federated learning.
For example, suppose the current heterogeneous system has an unoccupied GPU processing device and an unoccupied FPGA processing device, the public key used for encryption is 1024 bits wide, and a decryption operator with a large data volume needs to be executed as fast as possible. Considering that the ratio of the decryption efficiency (operator-execution efficiency) of the GPU to that of the FPGA is about 16:7, the decryption task can be split between the GPU and the FPGA in roughly that proportion, so that both devices finish at about the same time.
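A sketch of such a proportional split, assuming hypothetical device objects with an asynchronous run_async() interface (these names are illustrative, not from the patent):

```python
def split_and_decrypt(ciphertexts, gpu, fpga, gpu_rate=160_000, fpga_rate=70_000):
    # Split the batch in proportion to measured throughput (about 16:7),
    # so both devices finish at roughly the same time.
    cut = len(ciphertexts) * gpu_rate // (gpu_rate + fpga_rate)
    gpu_part = gpu.run_async("paillier_decrypt", ciphertexts[:cut])
    fpga_part = fpga.run_async("paillier_decrypt", ciphertexts[cut:])
    # Each partial result preserves input order, so concatenation
    # reassembles the full result for the federated learning task.
    return gpu_part.result() + fpga_part.result()
```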
In some embodiments, the heterogeneous processing device corresponding to an operator may be determined according to the devices' energy consumption, and that device completes the operator's computation. A user may use this approach, for example, when energy saving matters more than processing efficiency.
In some embodiments, to determine quickly and accurately whether an assigned heterogeneous processing device can support an operator, whether the device can support the operator may be determined from the member functions of the classes in the device's operator library: if the device can support the operator, the operator is assigned to it; if not, the operator is offered to the next heterogeneous processing device in the heterogeneous system, until a device that can support the operator is found.
Specifically, in the embodiment of the present application, the device selector 11 can automatically analyze whether a heterogeneous processing device has the computing capability for a specific operator. The device selector 11 dynamically determines this mainly by means of duck typing, a technique from object-oriented programming.
Specifically, the operator library of each heterogeneous processing device (CPU, GPU, FPGA) is wrapped in a dedicated class (for example, class CPU_operators, class GPU_operators, class FPGA_operators), the operator names are unified, and every implemented operator is a member function of the corresponding class (for example, the Paillier encryption operator on the CPU would be CPU_operators.paillier_encrypt()). If the FPGA processing device does not support the Paillier encryption operator, the device selector 11 cannot find the corresponding member function in the FPGA operator class (that is, there is no function named FPGA_operators.paillier_encrypt()); and since the Paillier encryption operator belongs to the "modular multiplication operator" category defined herein, i.e., the modular multiplication type, the device selector 11 automatically allocates it to the CPU for computation.
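A minimal sketch of this duck-typing check, reusing the class-per-device convention above (the lookup helper itself is an illustrative assumption):

```python
def find_supporting_device(op_name, device_libs):
    # device_libs is an ordered fallback chain, e.g.
    # [FPGA_operators(), CPU_operators()] for a modular multiplication operator.
    for lib in device_libs:
        member = getattr(lib, op_name, None)  # duck typing: just look for the method
        if callable(member):
            return lib
    raise RuntimeError("no heterogeneous device supports operator %r" % op_name)

# If FPGA_operators lacks paillier_encrypt(), the search falls through to
# CPU_operators, matching the behavior described above.
```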
In some embodiments, if an FPGA/GPU/CPU device is damaged or faulty, the device selector 11, upon determining that the corresponding heterogeneous processing device is abnormal, allocates the operator to another heterogeneous processing device that supports it according to a preset scheduling manner. A heterogeneous processing device abnormality includes damage or a fault.
For example, when an FPGA or GPU processing device is damaged or fails, the device selector 11 can schedule according to the dynamic scheduling manner corresponding to Table 2 (for example, for a Paillier encryption operator, if the FPGA is damaged, the CPU is selected). When scheduling, the availability and remaining resources of the designated heterogeneous processing device are first checked via the member function corresponding to each operator; if the device is damaged or faulty, the device selector 11 treats it as not supporting the current operator and continues to determine whether the next heterogeneous processing device in the system can support it. This fallback mechanism follows the scheduling manner described in Table 2 (that is, falling back from the GPU to the FPGA, and from the FPGA to the CPU). Every operator can thus be completed effectively, and the federated learning computation task finishes smoothly.
In some embodiments, to improve the computation efficiency of federated learning further, when it is determined that at least two heterogeneous processing devices correspond to a particular operator, a performance test result and a reference value for performance evaluation may be acquired for each of those devices; a performance test score is determined from the test result and the reference value; a target heterogeneous processing device is determined from the candidates according to the score; and the operator is allocated to the target device to complete its computation.
Specifically, in the embodiment of the present application, the device selector 11 has a test mechanism for gauging the performance of the heterogeneous processing devices in the heterogeneous system and can allocate operators dynamically according to the test results; this mechanism suits the situation where several heterogeneous processing devices of the same type exist. Each heterogeneous processing device periodically self-tests to evaluate its current performance and wear. One evaluation approach is to run a performance test script (covering 64-bit floating point, 32-bit floating point, 16-bit floating point, and 8-bit integer performance, among others) and compare each test result with a reference value to obtain a performance test score; specifically, the score is the average of the ratios of the test results to their reference values. When a specific operator is invoked, the device selector first checks each device's support for the operator and its remaining resources; among same-type heterogeneous processing devices (GPU/FPGA/CPU) able to perform the computation, the device with the lowest degree of wear, i.e., the highest performance test score, is preferred.
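The scoring rule above reduces to a mean of benchmark-to-reference ratios. A sketch, with illustrative benchmark names and reference values (the patent does not specify them):

```python
REFERENCE = {"fp64": 1.0, "fp32": 2.0, "fp16": 4.0, "int8": 8.0}  # assumed units

def performance_score(results):
    # Average of (measured / reference) over all benchmark items.
    ratios = [results[name] / ref for name, ref in REFERENCE.items()]
    return sum(ratios) / len(ratios)

def pick_target_device(same_type_devices):
    # Among same-type devices that can run the operator, prefer the one
    # with the highest self-test score (least worn).
    return max(same_type_devices, key=lambda d: performance_score(d.latest_selftest()))
```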
The embodiments above provide multiple dynamic allocation policies; the device selector 11 allocates suitable heterogeneous processing devices to the operators of the federated learning computation task according to any one of them, or a combination of several, so that the task can be completed quickly.
The device selector 11 may allocate by operator type, or split an operator's task by operator type, either because a specific dynamic allocation policy was configured in it in advance or according to user requirements: for example, if the user needs the federated learning computation task completed quickly, or, if the user's requirement is energy saving, operators may be allocated to the corresponding heterogeneous processing devices in an energy-saving manner.
The static allocation policy provided by the present application is described in detail below. Compared with the dynamic allocation policy, the static policy's allocation logic is simpler, so allocation is faster.
Under the static allocation policy, the operator allocation for the federated learning computation task is already fixed when the program is compiled. For each incoming operator, the device selector 11 first identifies the operator name (e.g., the fixed point number coding operator) and then determines the heterogeneous processing device corresponding to that operator.
In some embodiments, for example, a preset correspondence table may be stored in advance, recording the heterogeneous processing device corresponding to each operator; the device for an operator is determined by querying this table, and the determined devices may comprise a single heterogeneous processing device, several of them, or several together with the corresponding processing manners.
The heterogeneous processing device corresponding to the operator is looked up in Table 3 (for example, the device corresponding to the fixed point number coding operator is the GPU processing device), and the operator's computation task is then allocated to the device designated there (for example, the device selector allocates all computation tasks of the fixed point number coding operator to the GPU processing device). For the Paillier encryption operator, the device selector allocates the modular multiplication part of the computation to the FPGA processing device and the modular addition part to the GPU processing device.
Table 3 is a table of correspondence between operators and heterogeneous processing devices
(Table 3 is provided as an image in the original publication. Per the description above, it maps each federated learning operator to designated heterogeneous processing devices, e.g., the fixed point number coding operator to the GPU processing device, and the Paillier encryption operator to the FPGA processing device for its modular multiplication part and the GPU processing device for its modular addition part.)
Table 3 records the heterogeneous processing devices corresponding to the operators of federated learning. It may be stored in the device selector 11, or on a cloud server, in which case it is downloaded from the cloud server when needed, or queries are sent to the cloud server.
It should be noted that if federated learning involves an operator not recorded in Table 3, the user may be prompted to designate the heterogeneous processing device on which it should be processed; specifically, if the device corresponding to the operator cannot be found in the preset correspondence table, prompt information is output asking the user to designate the corresponding heterogeneous processing device.
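A minimal sketch of this static lookup, assuming Table 3 is materialized as a dictionary; only the two mappings mentioned in the text are shown, and the names are illustrative:

```python
STATIC_TABLE = {
    "fixed_point_encode": [("GPU", "entire operator")],
    # The Paillier encryption operator is split across two devices:
    "paillier_encrypt": [("FPGA", "modular multiplication part"),
                         ("GPU", "modular addition part")],
}

def static_assign(op_name):
    try:
        return STATIC_TABLE[op_name]
    except KeyError:
        # Operator absent from the correspondence table: prompt the user
        # to designate a device, as described above.
        raise LookupError("operator %r not in Table 3; ask the user to "
                          "designate a heterogeneous processing device" % op_name)
```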
It should be noted that the device selector 11 may be a processing device independent of the heterogeneous processing devices, as shown in fig. 1; the device selector 11 may, of course, also be a program module, as shown in fig. 2, deployed in one of the several heterogeneous processing devices and implemented to allocate suitable heterogeneous processing devices to the federated learning operators.
In some embodiments, the heterogeneous system provided by the application may further obtain a device selection policy chosen by a user, where the policy is used to select the corresponding heterogeneous processing device for each of the operators involved in the federated learning computation task, and different policies correspond to different device selection manners; the device corresponding to each operator is determined based on the user-selected policy; each operator is allocated to its corresponding device for execution; and, once every operator has finished, the operation results are obtained and the processing result of the federated learning computation task is derived from them. Both the computation efficiency of federated learning and the user experience are thus improved.
In some embodiments, the device selection policy includes at least a first selection policy and a second selection policy; the first selection strategy is used for respectively determining heterogeneous processing equipment corresponding to each operator according to a preset correspondence table (such as table 3) between the operators and the heterogeneous processing equipment, specifically, for example, the heterogeneous processing equipment corresponding to the operator is queried by using table 3; and the second selection strategy is used for determining the heterogeneous processing equipment corresponding to each operator according to the operator type corresponding to each operator in the operators.
Correspondingly, if the user selects the first selection policy, the heterogeneous processing device corresponding to each operator can be determined according to the preset correspondence table; if the user selects the second selection policy, the operator type of each operator can be obtained and the corresponding heterogeneous processing device determined accordingly.
It should be noted that the second selection policy may further determine the heterogeneous processing device corresponding to each operator according to at least one of the device state, operator execution efficiency, and energy consumption of the heterogeneous processing devices, together with the operator type of the operator; the heterogeneous processing devices corresponding to different operators may include one or more heterogeneous processing devices.
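As a rough illustration of how such a dynamic policy might weigh these factors, consider the following sketch; the throughput-per-watt scoring rule is an assumption, since the text only names the factors that may be considered.

```python
from dataclasses import dataclass

@dataclass
class DeviceStatus:
    name: str            # "GPU", "FPGA", or "CPU"
    busy: bool           # device state
    ops_per_sec: float   # efficiency executing this operator
    watts: float         # energy consumption

def select_dynamic(candidate_names, statuses):
    # Restrict to devices suited to the operator type; prefer idle devices,
    # then rank by throughput per watt (an assumed scoring rule).
    suited = [s for s in statuses if s.name in candidate_names]
    idle = [s for s in suited if not s.busy] or suited
    return max(idle, key=lambda s: s.ops_per_sec / max(s.watts, 1e-9)).name

statuses = [DeviceStatus("GPU", False, 5e6, 250.0),
            DeviceStatus("FPGA", False, 2e6, 60.0)]
print(select_dynamic(["GPU", "FPGA", "CPU"], statuses))  # FPGA: best ops/watt here
```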
In some embodiments, as shown in fig. 3, the heterogeneous system 100 further includes a display device 13. The device selector 11 is communicatively connected to the display device 13 and may control it to display a policy selection interface, which presents a plurality of different device selection policies for the user to choose from.
Illustratively, as shown in fig. 4, the display device 13 displays a static selection policy (the first selection policy) and a dynamic selection policy (the second selection policy) for the user to choose from; of course, further device selection policies may be offered to the user according to the device allocation approaches provided in the present application.
It should be noted that the display device 13 may be a touch display screen, or another type of display screen, in which case the device selection policy may be selected via physical keys.
The processing method for a federated learning calculation task provided by the embodiments of the present application is described below based on the heterogeneous system 100 provided by the embodiments of the present application. The method allocates a suitable heterogeneous processing device to each federated learning operator to complete the federated learning calculation task, thereby improving the computational efficiency of federated learning.
Referring to fig. 5, which is a schematic flow chart of a processing method for a federated learning calculation task provided in an embodiment of the present application. The method is applicable to the device selector of the heterogeneous system in the foregoing embodiments: a suitable heterogeneous processing device is allocated to each federated learning operator, and the different heterogeneous processing devices complete the federated learning calculation task, improving computational efficiency.
As shown in fig. 5, the processing method specifically includes steps S101 to S104:
S101, determining an operator type of each of a plurality of operators involved in a federated learning calculation task;
S102, determining, from the heterogeneous system, the heterogeneous processing device corresponding to each operator according to its operator type;
S103, allocating each operator to its corresponding heterogeneous processing device for operation;
S104, once every operator has finished running, obtaining the plurality of operation results corresponding to the plurality of operators, and obtaining the processing result of the federated learning calculation task according to those operation results.
The operator type of each of the plurality of operators involved in the federated learning calculation task is determined; specifically, the operator type of an operator can be determined according to the operation types the operator contains.
In some embodiments, determining the operator type of an operator according to the operation types it contains specifically includes: determining the operation types contained in each of the plurality of operators involved in the federated learning calculation task; if an operator contains a modular exponentiation operation, determining its operator type to be a first type; if an operator does not contain modular exponentiation but contains modular multiplication, determining its operator type to be a second type; and if an operator contains neither modular exponentiation nor modular multiplication, determining its operator type to be a third type.
In some embodiments, different operator types correspond to different heterogeneous processing devices; or an operator type corresponds to a plurality of different heterogeneous processing devices; or an operator type corresponds to a plurality of different heterogeneous processing devices together with the processing modes of those devices.
In some embodiments, the heterogeneous system includes a GPU processing device, an FPGA processing device, and a CPU processing device; the heterogeneous processing device corresponding to the operator of the first type is a GPU processing device, an FPGA processing device or a CPU processing device, the heterogeneous processing device corresponding to the operator of the second type is an FPGA processing device or a CPU processing device, and the heterogeneous processing device corresponding to the operator of the third type is a CPU processing device.
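The type classification and the type-to-device correspondence described above can be summarized in a short, self-contained sketch; the operation names (mod_exp, mod_mul) are illustrative assumptions.

```python
def classify_operator(operations: set) -> str:
    """Classify an operator by the operations it contains (first/second/third type)."""
    if "mod_exp" in operations:
        return "first"    # contains modular exponentiation
    if "mod_mul" in operations:
        return "second"   # modular multiplication but no modular exponentiation
    return "third"        # neither modular exponentiation nor modular multiplication

# Candidate devices per operator type, as described above.
CANDIDATE_DEVICES = {
    "first":  ["GPU", "FPGA", "CPU"],
    "second": ["FPGA", "CPU"],
    "third":  ["CPU"],
}

# A Paillier-style operator containing modular exponentiation is first type.
assert classify_operator({"mod_exp", "mod_mul"}) == "first"
assert CANDIDATE_DEVICES[classify_operator({"mod_mul"})] == ["FPGA", "CPU"]
```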
In some embodiments, at least one of the device state, operator execution efficiency, and energy consumption of the plurality of heterogeneous processing devices may also be obtained; the heterogeneous processing device corresponding to each operator is then determined from the heterogeneous system according to at least one of these items together with the operator type of the operator.
In some embodiments, before assigning each operator to the corresponding heterogeneous processing device for operation, the processing method may further determine whether the heterogeneous processing device can support the operator according to member functions of classes in an operator library of the heterogeneous processing device; if the heterogeneous processing equipment can support the operator, the operator is distributed to the corresponding heterogeneous processing equipment; if the heterogeneous processing device does not support the operator, continuing to determine whether a next heterogeneous processing device in the heterogeneous system can support the operator.
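One plausible reading of this support check, sketched with a toy operator library in which member functions define what a device supports (the class name, method names, and device ordering are assumptions):

```python
class GpuOperatorLibrary:
    """Toy operator library; its member functions define what the device supports."""
    def mod_exp(self, base: int, exp: int, mod: int) -> int:
        return pow(base, exp, mod)

def find_supporting_device(operator_name: str, device_libraries):
    """Walk the devices in order; return the first whose operator library exposes
    a member function named after the operator, else None."""
    for device_name, library in device_libraries:
        if callable(getattr(library, operator_name, None)):
            return device_name
    return None

print(find_supporting_device("mod_exp", [("GPU", GpuOperatorLibrary())]))  # GPU
print(find_supporting_device("mod_inv", [("GPU", GpuOperatorLibrary())]))  # None
```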
In some embodiments, if it is determined that the heterogeneous processing device corresponding to a specific operator in the multiple operators includes at least two heterogeneous processing devices, the processing method may further obtain a performance test result and a reference value for performance evaluation of each of the at least two heterogeneous processing devices; determining a performance test score according to the performance test result and the reference value; and determining a target heterogeneous processing device from the at least two heterogeneous processing devices according to the performance test score; and allocating the specific operator to the target heterogeneous processing equipment for operation.
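A minimal sketch of this scoring step, assuming the score is simply the measured result normalized by the reference value; the actual formula is not fixed by the text.

```python
def performance_score(test_result: float, reference: float) -> float:
    """Higher is better: the measured result normalized by the reference value."""
    return test_result / reference

def pick_target_device(candidates):
    """candidates: list of (device_name, test_result, reference_value) tuples."""
    return max(candidates, key=lambda c: performance_score(c[1], c[2]))[0]

# E.g. GPU measured at 1.8x its reference, FPGA at 1.2x -> GPU is the target.
print(pick_target_device([("GPU", 180.0, 100.0), ("FPGA", 120.0, 100.0)]))  # GPU
```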
Referring to fig. 6, which is an exemplary flowchart of another processing method for a federated learning calculation task provided in an embodiment of the present application. The method is likewise applicable to the device selector of the heterogeneous system in the foregoing embodiments: a suitable heterogeneous processing device is allocated to each federated learning operator, and the different heterogeneous processing devices complete the federated learning calculation task, improving computational efficiency.
As shown in fig. 6, the processing method specifically includes steps S201 to S204:
S201, obtaining a device selection policy selected by a user, where the device selection policy is used to select a corresponding heterogeneous processing device for each of a plurality of operators involved in a federated learning calculation task, and different device selection policies correspond to different device selection modes;
S202, determining, based on the device selection policy selected by the user, the heterogeneous processing device corresponding to each operator;
S203, allocating each operator to its corresponding heterogeneous processing device for operation;
S204, once every operator has finished running, obtaining the plurality of operation results corresponding to the plurality of operators, and obtaining the processing result of the federated learning calculation task according to those operation results.
Specifically, the device selection policy includes at least a first selection policy and a second selection policy. The first selection policy determines the heterogeneous processing device corresponding to each operator according to a preset correspondence table between the operators and the heterogeneous processing devices (e.g., Table 3); the second selection policy determines the heterogeneous processing device corresponding to each operator according to the operator type of that operator.
Correspondingly, if it is determined that the user has selected the first selection policy, the heterogeneous processing device corresponding to each operator is determined according to the preset correspondence table; if the user has selected the second selection policy, the operator type of each operator is obtained and the corresponding heterogeneous processing device determined accordingly.
In some embodiments, the second selection policy is further configured to determine the heterogeneous processing device corresponding to each operator according to at least one of the device state, operator execution efficiency, and energy consumption of the heterogeneous processing devices, together with the operator type of the operator, where the heterogeneous processing devices corresponding to different operators may include one or more heterogeneous processing devices.
In some embodiments, the operator type of each operator may be determined according to the operation types the operator contains, which specifically may include: determining the operation types contained in each operator; if an operator contains modular exponentiation, determining its operator type to be the first type; if an operator does not contain modular exponentiation but contains modular multiplication, determining its operator type to be the second type; and if an operator contains neither modular exponentiation nor modular multiplication, determining its operator type to be the third type.
In some embodiments, the heterogeneous system includes a GPU processing device, an FPGA processing device, and a CPU processing device; the heterogeneous processing device corresponding to the operator of the first type is a GPU processing device, an FPGA processing device or a CPU processing device, the heterogeneous processing device corresponding to the operator of the second type is an FPGA processing device or a CPU processing device, and the heterogeneous processing device corresponding to the operator of the third type is a CPU processing device.
In some embodiments, before said assigning each of said operators to a corresponding heterogeneous processing device for operation, said processing method further includes: determining whether the heterogeneous processing equipment can support the operator according to member functions of classes in an operator library of the heterogeneous processing equipment; if the heterogeneous processing equipment can support the operator, the operator is distributed to the corresponding heterogeneous processing equipment; if the heterogeneous processing device does not support the operator, continuing to determine whether a next heterogeneous processing device in the heterogeneous system can support the operator.
In some embodiments, if it is determined that the heterogeneous processing device corresponding to a specific operator in the plurality of operators includes at least two heterogeneous processing devices, the processing method further includes: acquiring a performance test result and a reference value for performance evaluation of each heterogeneous processing device in the at least two heterogeneous processing devices; determining a performance test score according to the performance test result and the reference value; and determining a target heterogeneous processing device from the at least two heterogeneous processing devices according to the performance test score; correspondingly, the allocating each operator to the corresponding heterogeneous processing device for operation includes: and allocating the specific operator to the target heterogeneous processing equipment for operation.
In some embodiments, the heterogeneous system further includes a display device, and the processing method may further control the display device to display a policy selection interface for selection by a user, wherein the policy selection interface includes a plurality of different device selection policies.
It should be noted that, for details of this processing method, such as the strategies for selecting heterogeneous processing devices, and for the technical effects it achieves, reference may be made to the detailed description of the foregoing embodiments, which is not repeated here.
While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto: any person skilled in the art can readily conceive of equivalent modifications or substitutions within the technical scope disclosed herein, and such modifications or substitutions shall also be covered. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. A processing method for federated learning calculation tasks based on a heterogeneous system is characterized in that the heterogeneous system comprises a plurality of heterogeneous processing devices, and the processing method comprises the following steps:
determining an operator type for each of a plurality of operators involved in a federated learning computing task;
respectively determining heterogeneous processing equipment corresponding to each operator from the heterogeneous system according to the operator type of each operator;
distributing each operator to corresponding heterogeneous processing equipment for operation;
and under the condition that the operation of each operator is completed, acquiring a plurality of operation results corresponding to the plurality of operators respectively, and acquiring a processing result of the federal learning calculation task according to the plurality of operation results.
2. The method for processing a federated learning computing task as recited in claim 1, wherein the determining an operator type for each of the plurality of operators involved in the federated learning computing task comprises:
and correspondingly determining the operator type of each operator according to the operation type contained by each operator in the operators involved in the federal learning calculation task.
3. The method for processing a federated learning computing task as claimed in claim 2, wherein the determining, according to the operation type included in each of the operators in the plurality of operators involved in the federated learning computing task, the operator type of each of the operators accordingly comprises:
determining an operation type contained by each operator in a plurality of operators involved in a federated learning calculation task;
if the operator is determined to contain modular exponentiation, determining that the operator type of the operator is a first type;
if the operator is determined not to comprise the modular exponentiation operation but comprises the modular multiplication operation, determining that the operator type of the operator is a second type;
and if the operator is determined not to comprise the modular exponentiation operation and the modular multiplication operation, determining that the operator type of the operator is a third type.
4. The processing method of the federal learning computation task of claim 3, wherein the heterogeneous system includes a GPU processing device, an FPGA processing device, and a CPU processing device;
the heterogeneous processing device corresponding to the operator of the first type is a GPU processing device, an FPGA processing device or a CPU processing device, the heterogeneous processing device corresponding to the operator of the second type is an FPGA processing device or a CPU processing device, and the heterogeneous processing device corresponding to the operator of the third type is a CPU processing device.
5. The method for processing a federated learning computing task as recited in claim 1, wherein the method further comprises:
acquiring at least one of equipment states, equipment operator execution efficiency and equipment energy consumption of a plurality of heterogeneous processing equipment;
the determining, from the heterogeneous system, the heterogeneous processing device corresponding to each operator according to the operator type of each operator includes:
and determining heterogeneous processing equipment corresponding to each operator from the heterogeneous system according to at least one item of the equipment state, the equipment operator execution efficiency and the equipment energy consumption and the operator type of the operator.
6. The processing method of a federated learning computation task as recited in claim 1, wherein before each operator is respectively assigned to the corresponding heterogeneous processing device for operation, the processing method further comprises:
determining whether the heterogeneous processing equipment can support the operator according to member functions of classes in an operator library of the heterogeneous processing equipment;
if the heterogeneous processing equipment can support the operator, the operator is distributed to the corresponding heterogeneous processing equipment;
if the heterogeneous processing device does not support the operator, continuing to determine whether a next heterogeneous processing device in the heterogeneous system can support the operator.
7. The method for processing a federated learning computing task as recited in claim 1, wherein if it is determined that the heterogeneous processing device corresponding to a particular operator among the plurality of operators includes at least two heterogeneous processing devices, the method further comprises:
acquiring a performance test result and a reference value for performance evaluation of each heterogeneous processing device in the at least two heterogeneous processing devices;
determining a performance test score according to the performance test result and the reference value; and
determining a target heterogeneous processing device from the at least two heterogeneous processing devices according to the performance test score;
correspondingly, the allocating each operator to the corresponding heterogeneous processing device for operation includes: and allocating the specific operator to the target heterogeneous processing equipment for operation.
8. A processing method for a federated learning calculation task based on a heterogeneous system is characterized in that the heterogeneous system comprises a plurality of heterogeneous processing devices, and the processing method comprises the following steps:
obtaining an equipment selection strategy selected by a user, wherein the equipment selection strategy is used for selecting corresponding heterogeneous processing equipment for each operator in a plurality of operators related to a federated learning calculation task, and different equipment selection strategies correspond to different equipment selection modes;
respectively determining heterogeneous processing equipment corresponding to each operator based on an equipment selection strategy selected by a user;
distributing each operator to corresponding heterogeneous processing equipment for operation;
and under the condition that the operation of each operator is completed, acquiring a plurality of operation results corresponding to the plurality of operators respectively, and acquiring a processing result of the federal learning calculation task according to the plurality of operation results.
9. The method for processing a federated learning computing task as recited in claim 8, wherein the device selection policy includes at least a first selection policy and a second selection policy;
the first selection strategy is used for respectively determining the heterogeneous processing equipment corresponding to each operator according to a preset corresponding relation table between the operators and the heterogeneous processing equipment;
the second selection strategy is used for determining heterogeneous processing equipment corresponding to each operator according to the operator type corresponding to each operator in the operators;
if it is determined that the user selects the first selection strategy, respectively determining the heterogeneous processing device corresponding to each operator based on the device selection strategy selected by the user, including: respectively determining heterogeneous processing equipment corresponding to each operator according to the preset corresponding relation table;
if it is determined that the user selects the second selection strategy, respectively determining heterogeneous processing equipment corresponding to each operator based on the equipment selection strategy selected by the user, including: and acquiring the operator type of each operator, thereby determining the heterogeneous processing equipment corresponding to each operator.
10. The method for processing a federated learning computing task as recited in claim 9, wherein the second selection policy is further configured to determine the heterogeneous processing device corresponding to each operator according to at least one of a device state of the heterogeneous processing device, a device execution operator efficiency, a device energy consumption, and an operator type of the operator, where the heterogeneous processing devices corresponding to different operators include one or more heterogeneous processing devices.
11. A method for processing a federal learning computation task as claimed in claim 9 or 10, wherein the method further comprises:
and correspondingly determining the operator type of each operator according to the operation type contained in each operator.
12. The method for processing a federated learning computing task as recited in claim 11, wherein the determining the operator type of each operator according to the operation type included in each operator comprises:
determining the operation type contained by each operator;
if the operator is determined to contain modular exponentiation, determining that the operator type of the operator is the first type;
if the operator is determined not to include modular exponentiation but to include modular multiplication, determining the operator type of the operator to be the second type;
and if the operator does not comprise the modular exponentiation operation and the modular multiplication operation, determining that the operator type of the operator is the third type.
13. The method for processing the federal learning computation task of claim 12, wherein the heterogeneous system includes a GPU processing device, an FPGA processing device, and a CPU processing device;
the heterogeneous processing device corresponding to the operator of the first type is a GPU processing device, an FPGA processing device or a CPU processing device, the heterogeneous processing device corresponding to the operator of the second type is an FPGA processing device or a CPU processing device, and the heterogeneous processing device corresponding to the operator of the third type is a CPU processing device.
14. The processing method of a federated learning computation task as recited in claim 8, wherein before each operator is respectively assigned to the corresponding heterogeneous processing device for operation, the processing method further comprises:
determining whether the heterogeneous processing equipment can support the operator according to member functions of classes in an operator library of the heterogeneous processing equipment;
if the heterogeneous processing equipment can support the operator, the operator is distributed to the corresponding heterogeneous processing equipment;
if the heterogeneous processing device does not support the operator, continuing to determine whether a next heterogeneous processing device in the heterogeneous system can support the operator.
15. The method for processing a federated learning computing task as recited in claim 8, wherein if it is determined that the heterogeneous processing device corresponding to a particular operator among the plurality of operators includes at least two heterogeneous processing devices, the method further comprises:
acquiring a performance test result and a reference value for performance evaluation of each heterogeneous processing device in the at least two heterogeneous processing devices;
determining a performance test score according to the performance test result and the reference value; and
determining a target heterogeneous processing device from the at least two heterogeneous processing devices according to the performance test score;
correspondingly, the allocating each operator to the corresponding heterogeneous processing device for operation includes: and allocating the specific operator to the target heterogeneous processing equipment for operation.
16. A heterogeneous system for federal learning, comprising:
a device selector;
the system comprises a plurality of heterogeneous processing devices, a processing device and a processing device, wherein the heterogeneous processing devices comprise at least one of an FPGA processing device, a GPU processing device and a CPU processing device;
wherein the device selector is configured to perform the steps of the method for processing the federal learning computation task as defined in any one of claims 1-15.
CN202111055426.3A 2021-09-09 2021-09-09 Heterogeneous system and processing method for federal learning Pending CN115794359A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111055426.3A CN115794359A (en) 2021-09-09 2021-09-09 Heterogeneous system and processing method for federal learning

Publications (1)

Publication Number Publication Date
CN115794359A true CN115794359A (en) 2023-03-14

Family

ID=85473475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111055426.3A Pending CN115794359A (en) 2021-09-09 2021-09-09 Heterogeneous system and processing method for federal learning

Country Status (1)

Country Link
CN (1) CN115794359A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180157711A1 (en) * 2016-12-06 2018-06-07 Electronics And Telecommunications Research Institute Method and apparatus for processing query based on heterogeneous computing device
CN111488205A (en) * 2019-01-25 2020-08-04 上海登临科技有限公司 Scheduling method and scheduling system for heterogeneous hardware architecture
CN112883408A (en) * 2021-04-29 2021-06-01 深圳致星科技有限公司 Encryption and decryption system and chip for private calculation
CN113112029A (en) * 2021-04-22 2021-07-13 中国科学院计算技术研究所 Federal learning system and method applied to heterogeneous computing equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116700934A (en) * 2023-08-04 2023-09-05 浪潮电子信息产业股份有限公司 Multi-element heterogeneous computing power equipment scheduling method, device, equipment and storage medium
CN116700934B (en) * 2023-08-04 2023-11-07 浪潮电子信息产业股份有限公司 Multi-element heterogeneous computing power equipment scheduling method, device, equipment and storage medium
CN117521150A (en) * 2024-01-04 2024-02-06 极术(杭州)科技有限公司 Data collaborative processing method based on multiparty security calculation
CN117521150B (en) * 2024-01-04 2024-04-09 极术(杭州)科技有限公司 Data collaborative processing method based on multiparty security calculation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination