WO2024078428A1 - Acceleration device, computing system, and acceleration method - Google Patents

Acceleration device, computing system, and acceleration method

Info

Publication number
WO2024078428A1
Authority
WO
WIPO (PCT)
Prior art keywords
component
data
ciphertext
processing
bucket
Prior art date
Application number
PCT/CN2023/123473
Other languages
French (fr)
Chinese (zh)
Inventor
何倩雯
蒋佳立
邬贵明
Original Assignee
杭州阿里云飞天信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 杭州阿里云飞天信息技术有限公司
Publication of WO2024078428A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Definitions

  • the embodiments of the present application relate to the field of computer technology, and in particular, to an acceleration device, a computing system, and an acceleration method.
  • Homomorphic encryption is a type of encryption algorithm with a special property. Processing homomorphically encrypted data yields output data which, once decrypted, is the same as the result obtained by processing the unencrypted original data in the same way. That is, computing first and then decrypting is equivalent to decrypting first and then computing. This feature is of great significance for protecting data security.
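  • The "compute then decrypt equals decrypt then compute" property can be illustrated with a minimal sketch. It uses a toy Paillier scheme, which is an assumption chosen for brevity and is not the scheme named in this application; the primes, function names, and key sizes are hypothetical and far too small for real use.

```python
import math
import random

# Toy Paillier parameters (hypothetical; far too small to be secure).
P, Q = 293, 433
N = P * Q
N2 = N * N
G = N + 1
# Carmichael function of N, computed as lcm(P - 1, Q - 1).
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
# With G = N + 1 the decryption constant is the inverse of LAMBDA mod N.
MU = pow(LAMBDA, -1, N)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with fresh randomness."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Recover the plaintext from ciphertext c."""
    u = pow(c, LAMBDA, N2)
    return ((u - 1) // N * MU) % N


def add_ciphertexts(c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % N2


# Accumulate in ciphertext form, then decrypt once.
total = decrypt(add_ciphertexts(encrypt(15), encrypt(27)))
```

  • Here `total` equals the plaintext sum of 15 and 27 even though the addition was carried out on ciphertexts only.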
  • the data initiator performs homomorphic encryption on the target data obtained by calculating the feature values of each object to obtain ciphertext data, and then provides the ciphertext data corresponding to the multiple objects to the data receiver; the data receiver buckets the ciphertext data corresponding to the multiple objects according to different feature values for each feature it has; then it calculates and processes the ciphertext data in each bucket to obtain the ciphertext processing result, and then returns the ciphertext processing results of each bucket corresponding to each feature to the data initiator.
  • the data initiator can then decrypt and obtain the plaintext processing results of each bucket, and can perform subsequent processing operations based on the plaintext processing results of each bucket, thereby achieving the purpose of the data initiator using the features of the data receiver to process the data, while protecting the data security of both parties.
  • the embodiments of the present application provide an acceleration device, a computing system and an acceleration method for solving the technical problems that affect processing efficiency in the prior art.
  • an acceleration device comprising: a first storage component, a first acceleration component connected to the first storage component, and a second acceleration component; the first storage component is connected to a first host processing component via a bus;
  • the first storage component is used to store multiple ciphertext data corresponding to multiple objects sent by the first host processing component;
  • the second acceleration component is used to obtain the multiple ciphertext data from the first storage component, and for any feature, perform bucket processing on the multiple ciphertext data to obtain multiple bucket results; and store the multiple bucket results in the first storage component;
  • the first acceleration component is used to obtain the multiple bucket results from the first storage component; perform calculations on the ciphertext data in the same bucket result to obtain a ciphertext processing result; and store the ciphertext processing results corresponding to the multiple bucket results respectively in the first storage component;
  • the first storage component is used to provide the ciphertext processing results corresponding to the multiple bucket results to the first host processing component.
  • an embodiment of the present application provides a computing system, including a first computing device and a second computing device, wherein the first computing device includes a first host processing component and an acceleration device as described in any one of the first aspects above;
  • the second computing device includes a second host processing component and a second acceleration device; the second acceleration device includes a second storage component and at least one third acceleration component; the second storage component is connected to the second host processing component via a bus;
  • the second storage component is used to store a plurality of to-be-processed data sent by the second host processing component; the to-be-processed data is target data to be encrypted or a ciphertext processing result to be decrypted;
  • the third acceleration component is used to obtain at least one to-be-processed data from the second storage component; for any to-be-processed data, encrypt or decrypt the to-be-processed data to obtain a calculation result, and store the calculation result in the second storage component;
  • the second host processing component is used to obtain a calculation result corresponding to any data to be processed from the second storage component.
  • an embodiment of the present application provides a computing device, including a host processing component, a host storage component, and an acceleration device as described in the first aspect above.
  • an acceleration method is provided in an embodiment of the present application, which is applied to an acceleration device, wherein the acceleration device includes a first storage component, a first acceleration component connected to the first storage component, and a second acceleration component; the first storage component is connected to a first host processing component via a bus; wherein the first storage component stores multiple ciphertext data corresponding to multiple objects sent by the first host processing component.
  • the method includes:
  • for any feature, the second acceleration component buckets the multiple ciphertext data obtained from the first storage component to obtain multiple bucket results;
  • the multiple bucket results are stored in the first storage component; the first acceleration component obtains the multiple bucket results from the first storage component, performs calculation processing on the ciphertext data in the same bucket result to obtain a ciphertext processing result, and stores the ciphertext processing results respectively corresponding to the multiple bucket results in the first storage component; the first storage component provides the ciphertext processing results corresponding to the multiple bucket results to the first host processing component.
  • the acceleration device provided in the embodiment of the present application includes a first storage component, a first acceleration component connected to the first storage component, and a second acceleration component; the first storage component is connected to the first host processing component through a bus; the second acceleration component performs bucket processing, and the first host processing component stores multiple ciphertext data in the first storage component. Multiple features can share the multiple ciphertext data for bucket processing, and then the first acceleration component obtains the bucket result from the first storage component, and performs calculation processing on the ciphertext data in the same bucket result to obtain the ciphertext processing result; the ciphertext processing result can be provided to the first host processing component via the first storage component.
  • the bucket processing operation and the calculation processing operation can be implemented using the acceleration device, which reduces the amount of calculation of the host processing component, thereby improving the processing efficiency, and can reduce the I/O overhead to ensure the acceleration performance of the acceleration device.
  • FIG1 is a schematic structural diagram of an embodiment of an acceleration device provided by the present application.
  • FIG2 is a schematic structural diagram of an embodiment of a second accelerating component provided by the present application.
  • FIG3 shows a schematic structural diagram of an embodiment of a first accelerating component provided by the present application.
  • FIG4 shows a schematic structural diagram of an embodiment of a first computing unit provided by the present application.
  • FIG5 is a schematic diagram showing the operation structure of a first operation unit in a practical application of an embodiment of the present application.
  • FIG6a shows a schematic diagram of the structure of an embodiment of a computing system provided by the present application.
  • FIG6b is a schematic diagram showing an interaction scenario of a computing system provided by the present application in an actual application.
  • FIG7a shows a schematic structural diagram of an embodiment of a second acceleration device provided by the present application.
  • FIG7b shows a schematic structural diagram of an embodiment of a third acceleration component provided by the present application.
  • FIG8 shows a flow chart of an embodiment of an acceleration method provided by the present application.
  • FIG9 shows a flow chart of an embodiment of an acceleration method provided by the present application.
  • FIG. 10 shows a schematic structural diagram of an embodiment of a computing device provided by the present application.
  • the data initiator will calculate the target data based on the feature value of each object, perform homomorphic encryption, obtain ciphertext data, and then provide the ciphertext data corresponding to multiple objects to the data receiver; the data receiver will bucket the ciphertext data corresponding to multiple objects according to different feature values for each feature it possesses; then calculate and process the ciphertext data in each bucket result to obtain the ciphertext processing result, and then return the ciphertext processing results of each bucket result corresponding to each feature to the data initiator, and the data initiator can decrypt and obtain the plaintext processing results of each bucket result, and can perform subsequent processing operations based on the plaintext processing results of each bucket result.
  • federated learning is used for multi-party joint modeling.
  • Federated learning is a distributed machine learning method that can use data from multiple parties for joint modeling while protecting data privacy.
  • Vertical federated learning is a commonly used federated learning method, which refers to multi-party joint modeling when the feature data and label information of the sample object are distributed among different data providers. Multiple data providers have the same sample object but different feature data. For example, data provider A and data provider B have the same user C, but data provider A has the educational background data of user C, and data provider B has the age data of user C. The educational background data and age data are feature data.
  • the data provider with the label data is also called the data initiator (active party), and the data provider without label data is also called the data receiver (passive party).
  • the active party can use the feature data of the passive party to improve the capabilities of the machine learning model while protecting the data privacy of each participant.
  • the decision tree model is a commonly used machine learning model. The most important thing in training the decision tree model is to find the optimal split point, where the split point refers to the specific value of a certain feature data. For example, if the label data is user C as the target group, the split point may be age less than 20 years old or age less than 30 years old, etc.
  • When training a decision tree model, the following method is usually used: the active party first determines the gradient information corresponding to the model based on the feature values and label data of the sample objects it has, and then encrypts the gradient information into ciphertext gradient information using homomorphic encryption and transmits it to the passive party.
  • the passive party calculates the ciphertext gradient accumulation value of the split space corresponding to each feature based on the ciphertext gradient information, and then sends the ciphertext gradient accumulation value to the active party.
  • the active party decrypts it to obtain the gradient accumulation value, and can finally determine the optimal split point based on the gradient accumulation values of multiple features. It can be seen that the passive party needs to accumulate, in ciphertext form, the ciphertext gradient information obtained by homomorphic encryption.
  • the bucketing method can be used. For each feature data, the passive party can bucket the ciphertext gradient information corresponding to different sample objects according to the feature value, accumulate the ciphertext gradient information in each bucket result, and then send the ciphertext gradient accumulation value corresponding to each bucket result to the active party. The active party then decrypts these values and determines the optimal split point based on the gradient accumulation value of each bucket result.
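  • The per-bucket accumulation performed by the passive party can be sketched as follows. This is a minimal illustration, not the implementation of this application: the function name `accumulate_by_bucket` is hypothetical, and the ciphertext addition is passed in as a parameter `add` so that any additively homomorphic operation can be substituted.

```python
from typing import Callable, Dict, Hashable, Sequence, TypeVar

C = TypeVar("C")  # opaque ciphertext type


def accumulate_by_bucket(
    ciphertexts: Sequence[C],
    bucket_ids: Sequence[Hashable],
    add: Callable[[C, C], C],
) -> Dict[Hashable, C]:
    """Return one accumulated ciphertext per bucket.

    ciphertexts[i] is the ciphertext gradient of sample object i and
    bucket_ids[i] is the bucket that object falls into for one feature.
    """
    if len(ciphertexts) != len(bucket_ids):
        raise ValueError("need exactly one bucket id per ciphertext")
    sums: Dict[Hashable, C] = {}
    for c, b in zip(ciphertexts, bucket_ids):
        # The first ciphertext seen for a bucket seeds its accumulator.
        sums[b] = add(sums[b], c) if b in sums else c
    return sums


# Stand-in demo: plain integers play the role of ciphertexts.
demo = accumulate_by_bucket([1, 2, 3, 4], [0, 1, 0, 1], lambda a, b: a + b)
```

  • In the demo, plain integer addition stands in for ciphertext accumulation; only the accumulated value of each bucket would be returned to the active party.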
  • the data receiver needs to bucket each feature and perform corresponding calculations on the ciphertext gradient information in each bucket result. These calculation and processing operations are usually completed by the host processing component of the computing device, which also has to perform its other work. This results in a large amount of calculation for the processing component, thereby affecting the processing performance and reducing the processing efficiency.
  • the inventors found in their research that the computational processing of ciphertext data obtained by encrypting using a homomorphic encryption algorithm essentially requires large integer multiplication and addition to achieve, which consumes a lot of processing performance. Therefore, they thought of using a dedicated accelerator to perform computational processing on ciphertext data to achieve better processing performance.
  • the inventors also found that if a dedicated accelerator is used, the host processing component is still required to perform bucket processing.
  • the number of objects is often large, especially in joint modeling scenarios, where sample objects are usually in the hundreds of thousands or even millions, and the number of features is also very large. Since bucket processing is required for each feature, the bucketing results need to be transmitted to the accelerator for each feature.
  • the resulting data volume is on the order of (number of objects * number of features), which in turn brings a large I/O overhead and creates an acceleration performance bottleneck.
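  • The scale of this I/O overhead can be estimated with a short calculation. The sketch below compares sending bucketed ciphertexts for every feature against sending the ciphertexts once plus a small bucket identifier per object and feature. The function names, the ciphertext size of 512 bytes, and the identifier size of 1 byte are assumptions for illustration only.

```python
def host_bucketing_bytes(num_objects: int, num_features: int,
                         ciphertext_bytes: int) -> int:
    """Host buckets: every feature re-sends all ciphertexts to the accelerator."""
    return num_objects * num_features * ciphertext_bytes


def device_bucketing_bytes(num_objects: int, num_features: int,
                           ciphertext_bytes: int, bucket_id_bytes: int) -> int:
    """Device buckets: ciphertexts are sent once, plus one bucket id
    per (object, feature) pair."""
    once = num_objects * ciphertext_bytes
    ids = num_objects * num_features * bucket_id_bytes
    return once + ids


# Assumed workload: one million objects, 100 features.
host_total = host_bucketing_bytes(1_000_000, 100, 512)
device_total = device_bucketing_bytes(1_000_000, 100, 512, 1)
```

  • Under these assumed sizes the host-side approach transfers 51.2 GB while the device-side approach transfers about 0.6 GB.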
  • the embodiment of the present application provides an acceleration device, which is composed of a first storage component, a first acceleration component connected to the first storage component, and a second acceleration component; the first storage component is connected to the first host processing component through a bus. The second acceleration component performs bucket processing, and the first host processing component only needs to send the multiple ciphertext data corresponding to multiple objects once to be stored in the first storage component; multiple features can then share the multiple ciphertext data for bucket processing. The first acceleration component obtains the bucket results from the first storage component, and performs calculation processing on the ciphertext data in the same bucket result to obtain the ciphertext processing result; the ciphertext processing result can be provided to the first host processing component via the first storage component. The host processing component therefore only needs to perform data transmission once, and the acceleration device can be used to implement both bucket processing and calculation processing, which reduces the amount of calculation of the host processing component.
  • FIG1 is a schematic diagram of the structure of an embodiment of an acceleration device provided by an embodiment of the present application, and the acceleration device may include a first storage component 101, a first acceleration component 102 and a second acceleration component 103 respectively connected to the first storage component 101.
  • the first storage component 101 is connected to the first host processing component 100 via a bus, and the bus type may be, for example, PCIE (peripheral component interconnect express, a high-speed serial computer expansion bus standard), and of course, other high-speed buses such as Ethernet may also be used for interconnection, and this application does not limit this.
  • the acceleration device can be implemented by an application-specific integrated circuit (ASIC) or a field programmable gate array (FPGA), or by a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a controller, a microcontroller, a microprocessor, or other forms of integrated circuits (IC), etc. This application does not limit this.
  • the acceleration device can be deployed in a first computing device.
  • the first computing device can be referred to as a host device of the acceleration device.
  • the first host processing component can be, for example, a central processing unit (CPU) in the first computing device, which is responsible for traditional processing tasks in the first computing device.
  • the first storage component 101 is used to store multiple ciphertext data corresponding to multiple objects sent by the first host processing component 100;
  • the second acceleration component 103 is used to obtain multiple ciphertext data from the first storage component 101, and for any feature, perform bucket processing on the multiple ciphertext data to obtain multiple bucket results; and store the multiple bucket results in the first storage component 101;
  • the first acceleration component 102 is used to obtain multiple bucket results from the first storage component; perform calculations on the ciphertext data in the same bucket result to obtain a ciphertext processing result; and store the ciphertext processing results corresponding to the multiple bucket results in the first storage component;
  • the first storage component 101 is used to provide the ciphertext processing results corresponding to the multiple bucket results to the first host processing component 100 .
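  • The division of work between the three components described above can be mirrored in a small software model. This is only a sketch under stated assumptions: the class name `AccelerationDeviceModel`, its method names, and the dictionary used as the first storage component are hypothetical, bucket assignments are supplied as precomputed identifiers, and ciphertext accumulation is an injected `add` function.

```python
class AccelerationDeviceModel:
    """Software stand-in for the storage and the two acceleration components."""

    def __init__(self, add):
        self.storage = {}  # models the first storage component
        self.add = add     # ciphertext accumulation operation

    def host_write_ciphertexts(self, ciphertexts):
        # The host sends the ciphertexts once; all features share them.
        self.storage["ciphertexts"] = list(ciphertexts)

    def run_bucketing(self, feature, bucket_ids):
        # Second acceleration component: group object indices per bucket.
        buckets = {}
        for idx, b in enumerate(bucket_ids):
            buckets.setdefault(b, []).append(idx)
        self.storage[("buckets", feature)] = buckets

    def run_accumulation(self, feature):
        # First acceleration component: accumulate ciphertexts per bucket.
        cts = self.storage["ciphertexts"]
        results = {}
        for b, idxs in self.storage[("buckets", feature)].items():
            acc = cts[idxs[0]]
            for i in idxs[1:]:
                acc = self.add(acc, cts[i])
            results[b] = acc
        self.storage[("results", feature)] = results

    def host_read_results(self, feature):
        # The host reads the per-bucket ciphertext processing results.
        return self.storage[("results", feature)]


# Stand-in demo: integers play the role of ciphertexts.
device = AccelerationDeviceModel(lambda a, b: a + b)
device.host_write_ciphertexts([10, 20, 30])
device.run_bucketing("age", [0, 1, 0])
device.run_accumulation("age")
age_results = device.host_read_results("age")
```

  • Note that the ciphertexts are written once and only bucket assignments change per feature, which is the sharing that the device relies on to limit bus traffic.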
  • Each object may correspond to a ciphertext data, and thus multiple objects may correspond to multiple ciphertext data.
  • the ciphertext data may be obtained by encrypting the target data using a homomorphic encryption algorithm.
  • the ciphertext data may refer to ciphertext gradient information, which is obtained by the data initiator by encrypting the gradient data using a homomorphic encryption algorithm.
  • the first host processing component 100 can transmit multiple ciphertext data corresponding to multiple objects sent by the data initiator to the first storage component 101 in the acceleration device for storage.
  • the first host processing component 100 may send corresponding instruction information to the first storage component 101, the first acceleration component 102, and the second acceleration component 103 to start or trigger each component to perform corresponding operations. For example, after the first host processing component 100 stores multiple ciphertext data in the first storage component 101, it may send corresponding instruction information to the second acceleration component 103, and the second acceleration component 103 may obtain the multiple ciphertext data from the first storage component 101 based on the instruction information.
  • the first host processing component 100 may also notify the first storage component 101, the first acceleration component 102, and the second acceleration component 103 to start after receiving the multiple ciphertext data sent by the data initiator, and the first storage component 101, the first acceleration component 102, and the second acceleration component 103 may trigger the execution of their respective operations in real time or periodically.
  • the second acceleration component 103 is responsible for the bucket processing operation corresponding to each feature owned by the data receiver. It can bucket multiple ciphertext data for each feature to obtain multiple bucket results corresponding to each feature. The bucket results corresponding to different features can then be stored in the first storage component 101.
  • the second acceleration component 103 can also send a bucket end notification to the first host processing component 100. After receiving the bucket end notification, the first host processing component 100 can notify the first acceleration component 102 to obtain multiple bucket results and perform calculation processing.
  • the second acceleration component 103 can adopt a parallel method to perform bucket processing on multiple ciphertext data for multiple features at the same time.
  • the multiple features can be notified by the first host processing component 100, etc.
  • the first host processing component 100 can divide the features to be processed into multiple groups, each group includes multiple features, and after the bucketing operation corresponding to multiple features in any group is completed, multiple features of another group are issued.
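  • The grouping of features by the host can be sketched as simple chunking. The function name `group_features` and the choice of group size (here the number of parallel bucketing units) are assumptions for illustration; the last group may be smaller than the others.

```python
from typing import List, Sequence


def group_features(features: Sequence[str], group_size: int) -> List[List[str]]:
    """Split the features into consecutive groups of at most group_size."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [list(features[i:i + group_size])
            for i in range(0, len(features), group_size)]


# Hypothetical feature names; each group is issued after the previous
# group's bucketing has completed.
groups = group_features(["age", "edu", "income", "city", "tenure"], 2)
```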
  • After the first acceleration component 102 obtains multiple bucket results from the first storage component 101, it can perform calculation processing on the ciphertext data in the same bucket result to obtain the ciphertext processing result, and store the ciphertext processing results corresponding to the multiple bucket results in the first storage component 101. Optionally, the first acceleration component 102 can perform the calculation processing according to a target calculation processing mode: the corresponding operation method is determined from the target calculation processing mode, and the calculation processing is then performed according to that operation method.
  • the target computing processing mode or the operation method can be notified to the first acceleration component 102 by the first host processing component 100 .
  • the target calculation processing mode may include, for example, ciphertext accumulation, and may also include ciphertext multiplication, ciphertext subtraction, etc. In a multi-party joint modeling scenario, the target calculation processing mode may specifically refer to ciphertext accumulation.
  • the operation method corresponding to the accumulation of ciphertext can be point addition operation.
  • for example, with elliptic curve cryptography (ECC), the accumulation of ciphertexts is converted into the point addition operation of two points on the elliptic curve.
  • the point addition operation is converted into arithmetic operations such as modular addition and modular multiplication when it is executed.
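  • How point addition reduces to modular additions and multiplications can be seen in a short sketch. The curve y^2 = x^3 + 2x + 2 over the integers modulo 17 is a small textbook example chosen here as an assumption; it is not a curve named in this application, and a real device would use large-integer arithmetic on a standardized curve.

```python
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]  # None represents the point at infinity

P_MOD = 17  # toy field modulus (assumption)
A = 2       # curve: y^2 = x^3 + A*x + 2 over F_17 (assumption)


def point_add(p1: Point, p2: Point) -> Point:
    """Add two curve points using only modular arithmetic."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        # Tangent slope for point doubling.
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        # Chord slope for two distinct points.
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


g = (5, 1)               # a point on the toy curve
g2 = point_add(g, g)     # doubling
g3 = point_add(g, g2)    # general addition
```

  • Each addition costs one modular inversion plus a handful of modular multiplications and additions, which is the arithmetic the accelerator is built to execute.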
  • After the first storage component 101 stores the ciphertext processing results corresponding to the multiple bucket results, it can notify the first host processing component 100, so that the first host processing component 100 can obtain the ciphertext processing results respectively corresponding to the multiple bucket results from the first storage component 101.
  • the first storage component can be implemented by an external memory with a higher bandwidth.
  • the first host processing component 100 can send the ciphertext processing results corresponding to the multiple bucket results to the data initiator to facilitate the data initiator to perform subsequent operations.
  • the data initiator can first decrypt to obtain the plaintext processing results corresponding to the multiple bucket results corresponding to each feature, and then calculate the plaintext processing results according to the target calculation processing mode; or the data initiator can first calculate the ciphertext processing results corresponding to the multiple bucket results corresponding to each feature according to the target calculation processing mode, and then decrypt the processing results.
  • bucket operations and computing processing operations can be performed by the acceleration device.
  • the host processing component only needs to transmit the ciphertext data once, which can be shared by multiple features for bucket operations, thereby reducing the amount of computation of the host processing component, improving processing efficiency, and reducing I/O overhead, thereby ensuring the acceleration performance of the acceleration device.
  • the acceleration device may further include a bus interface 104, which may be used to access the first computing device, so that the first acceleration component 102, the second acceleration component 103, and the first storage component 101 are connected to the first host processing component 100 in the first computing device through a bus.
  • the bus interface 104 may be used to enable the acceleration device to be pluggable and installed in the first computing device.
  • the acceleration device may further include a substrate 105 , on which the first storage component 101 , the first acceleration component 102 , and the second acceleration component 103 are welded, so as to realize electrical connection between the first acceleration component 102 , the second acceleration component 103 and the first storage component 101 , respectively.
  • the multiple ciphertext data can be divided into multiple data intervals, each data interval is similar to a bucket, and the ciphertext data contained in each data interval constitutes a bucket result.
  • the second acceleration component 103 buckets the multiple ciphertext data for any feature, and obtaining multiple bucket results may include: for any feature, bucketing the multiple ciphertext data according to at least one feature value corresponding to the feature, and obtaining multiple bucket results.
  • the bucket processing operation can first divide multiple objects according to at least one feature value, and then divide the ciphertext data corresponding to the multiple objects according to the division results of the multiple objects, so that the ciphertext data corresponding to the objects in the same feature value interval are divided into the same bucket result.
  • For example, if the feature is age and the feature values include 10, 20, and 30, the age can be divided into four age intervals: 0-10, 10-20, 20-30, and 30-∞ (infinity).
  • According to the four age intervals, multiple users can be divided into different age intervals. The ciphertext data corresponding to users in the same age interval is then also divided into the same bucket, thereby obtaining multiple bucket results.
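  • The age example can be written as a threshold lookup. This is a minimal sketch: the function names are hypothetical, string labels stand in for ciphertexts, and the treatment of boundary values (a value equal to a threshold goes to the upper interval) is an assumption, since the text does not specify it.

```python
from bisect import bisect_right
from typing import List, Sequence


def bucket_index(value: float, thresholds: Sequence[float]) -> int:
    """Index of the interval containing value; thresholds must be sorted."""
    return bisect_right(thresholds, value)


def bucket_ciphertexts(values: Sequence[float], ciphertexts: Sequence[str],
                       thresholds: Sequence[float]) -> List[List[str]]:
    """Place each object's ciphertext in the bucket of its feature value."""
    buckets: List[List[str]] = [[] for _ in range(len(thresholds) + 1)]
    for v, c in zip(values, ciphertexts):
        buckets[bucket_index(v, thresholds)].append(c)
    return buckets


# Ages of four users and placeholder ciphertext labels.
age_buckets = bucket_ciphertexts([5, 10, 25, 47],
                                 ["c0", "c1", "c2", "c3"],
                                 [10, 20, 30])
```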
  • At least one feature value corresponding to each feature can be stored in the first storage component 101 by the first host processing component 100, and obtained from the first storage component 101 by the second acceleration component 103.
  • the first host processing component 100 can directly send at least one feature value corresponding to each feature to the second acceleration component 103.
  • the first storage component 101 is also used to store bucket information of multiple objects corresponding to different features sent by the first host processing component 100;
  • the second acceleration component 103 performing bucket processing on multiple ciphertext data for any feature to obtain multiple bucket results may include: for any feature, determining bucket information of multiple objects corresponding to the feature respectively; dividing the ciphertext data corresponding to at least one object corresponding to the same bucket information into the same bucket result, so as to obtain multiple bucket results.
  • the bucket information may refer to a bucket identifier, which is used to uniquely identify a bucket and may be implemented in the form of any one or more characters (such as a combination of numbers, letters, etc.), which is not limited in this application. Bucket information corresponding to different features of multiple objects may be determined by the first host processing component 100.
  • After the first host processing component 100 obtains the ciphertext data corresponding to each object, it can take the multiple features possessed by the data receiver itself and, for each feature, divide the multiple objects according to at least one feature value corresponding to that feature, so as to determine the feature value interval of each object. It sets the same bucket information for objects in the same feature value interval, and the bucket information corresponding to different feature value intervals is different.
  • the first host processing component 100 can store the bucket information of each feature corresponding to each object in the first storage component 101, and the second acceleration component 103 can obtain the bucket information of each feature corresponding to multiple objects from the first storage component 101.
  • the first host processing component 100 can also send the bucket information of different features corresponding to multiple objects to the second acceleration component 103.
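  • The bucket-information variant can be sketched in two steps: the host derives a bucket identifier per object and feature, and the accelerator groups object indices by identifier. The function names, the integer identifiers, and the sample feature data are assumptions for illustration; the text allows any character-based identifier.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence


def make_bucket_info(feature_values: Dict[str, Sequence[float]],
                     thresholds: Dict[str, Sequence[float]]) -> Dict[str, List[int]]:
    """Host side: one bucket identifier per object for every feature."""
    return {
        feature: [bisect_right(thresholds[feature], v) for v in values]
        for feature, values in feature_values.items()
    }


def group_indices(bucket_ids: Sequence[int]) -> Dict[int, List[int]]:
    """Accelerator side: object indices that share a bucket identifier."""
    groups: Dict[int, List[int]] = {}
    for idx, b in enumerate(bucket_ids):
        groups.setdefault(b, []).append(idx)
    return groups


# Three objects, two hypothetical features with their split thresholds.
info = make_bucket_info({"age": [15, 35, 8], "edu": [1, 3, 3]},
                        {"age": [10, 20, 30], "edu": [2]})
edu_groups = group_indices(info["edu"])
```

  • Because the groups hold object indices rather than copies, every feature can refer to the same stored ciphertexts.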
  • the second acceleration component 103 may include a data loading unit 201 , a plurality of bucketing units 202 , and a data storage unit 203 .
  • the data loading unit 201 is used to obtain multiple ciphertext data from the first storage component 101, and provide the multiple ciphertext data to multiple bucketing units respectively; assign features to be processed to the multiple bucketing units respectively, and control the multiple bucketing units to process the assigned features to be processed in parallel;
  • the bucketing unit 202 is used to bucket the multiple ciphertext data according to the features assigned to it, obtain multiple bucketing results, and send the multiple bucketing results to the data storage unit 203;
  • the data storage unit 203 is used to store the multiple bucketing results sent by each bucketing unit into the first storage component.
  • the data storage unit can be implemented by RAM (Random Access Memory).
  • Each bucketing unit 202 can be assigned at least one feature, and can bucket the at least one feature in a pipelined manner.
  • each bucket unit 202 may be assigned a feature, and the first host processing component 100 may determine the number of features to be processed in parallel at one time according to the number of units of the multiple bucket units 202 , and the number of features may be less than or equal to the number of units.
  • the first host processing component 100 can select at least one feature according to the number of features, and provide the bucket information of the at least one feature corresponding to multiple objects to the acceleration device. The data loading unit 201 can then assign the bucket information of the at least one feature to at least one bucket unit 202 one by one, so that each bucket unit 202 obtains the bucket information of one feature and, for the feature assigned to it, divides the ciphertext data of the objects that share the same bucket information into the same bucket result. Alternatively, the first host processing component 100 can select at least one feature according to the number of features, and provide at least one feature value corresponding to the at least one feature to the acceleration device. The data loading unit 201 can then assign the at least one feature value corresponding to the at least one feature to at least one bucket unit 202 one by one, so that each bucket unit 202 obtains at least one feature value of one feature and, for the feature assigned to it, buckets the multiple ciphertext data according to the at least one feature value.
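  • The parallel bucketing described above can be sketched in software as follows. This is only an analogy for the hardware units: a thread pool stands in for the multiple bucket units 202, ciphertext data are opaque placeholders, and the names `bucket_one_feature`, `bucket_all_features` and `num_units` are assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def bucket_one_feature(ciphertexts, bucket_ids):
    """One bucketing unit: group ciphertexts that share a bucket identifier."""
    buckets = defaultdict(list)
    for ct, bid in zip(ciphertexts, bucket_ids):
        buckets[bid].append(ct)
    return dict(buckets)

def bucket_all_features(ciphertexts, bucket_info, num_units):
    """Data loading unit: assign one feature per unit, num_units at a time."""
    features = list(bucket_info)
    results = {}
    with ThreadPoolExecutor(max_workers=num_units) as pool:
        for start in range(0, len(features), num_units):
            batch = features[start:start + num_units]
            futures = {
                f: pool.submit(bucket_one_feature, ciphertexts, bucket_info[f])
                for f in batch
            }
            for f, fut in futures.items():
                results[f] = fut.result()
    return results

ciphertexts = ["ct0", "ct1", "ct2", "ct3"]   # placeholders for ciphertext data
bucket_info = {"age": [0, 1, 1, 0], "income": [2, 2, 0, 1]}
bucketing_results = bucket_all_features(ciphertexts, bucket_info, num_units=2)
```

  • The number of features processed at one time never exceeds `num_units`, mirroring the constraint that the number of features is less than or equal to the number of units.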
  • the first acceleration component 102 may include at least one first acceleration unit;
  • each first acceleration unit can be used to obtain at least one bucket result from the first storage component 101, and for any bucket result, according to the target calculation processing mode, calculate and process multiple ciphertext data in the bucket result to obtain the ciphertext processing result; store the ciphertext processing result corresponding to any bucket result in the first storage component 101.
  • the first acceleration component 102 may be provided with a plurality of first acceleration units, so as to improve parallel processing capability, processing efficiency, and acceleration performance.
  • each first acceleration unit may include a first control unit 301 and a plurality of first computing units 302 .
  • the first control unit 301 is used to obtain at least one bucket result from the first storage component 101, and dispatch the at least one bucket result to at least one first computing unit 302;
  • the first computing unit 302 is used to perform computing processing on a plurality of ciphertext data in the bucket result according to a target computing processing mode for any bucket result assigned thereto to obtain a ciphertext processing result;
  • the first control unit 301 is used to store the ciphertext processing result corresponding to any bucket result in the first storage component 101.
  • Multiple first computing units 302 can be used to implement parallel computing of multiple bucket results, thereby improving processing efficiency and further ensuring acceleration performance.
  • each first acceleration unit 300 may further include a first storage unit 303 ;
  • the first operation unit 302 may also be used to save the ciphertext processing result corresponding to any bucket result to the first storage unit 303;
  • the first control unit 301 stores the ciphertext processing result corresponding to any bucket result in the first storage component, which may be:
  • the ciphertext processing result corresponding to any bucket result stored in the first storage unit 303 is stored in the first storage component 101.
  • each first acceleration unit 300 may further include a first loading unit 304 .
  • the first control unit 301 obtaining at least one bucket result from the first storage component 101 may specifically control the first loading unit 304 to obtain at least one bucket result from the first storage component 101 .
  • the first control unit 301 can perform corresponding operations according to the instructions of the first host processing component 100. Therefore, in some embodiments, the first control unit 301 can also be used to receive first control information sent by the first host processing component 100, and control the operation of multiple first computing units 302, first storage units 303, and first loading units 304 according to the first control information.
  • the first control information may include the first total data amount of at least one bucket result that the first acceleration unit 300 needs to obtain and the second total data amount corresponding to the at least one bucket result after the calculation and processing are performed on the at least one bucket result.
  • it may also include the first storage address corresponding to the at least one bucket result that needs to be obtained and the second storage address corresponding to the at least one ciphertext processing result obtained after the calculation and processing are performed on the at least one bucket result.
  • the first control unit 301 may specifically obtain at least one bucket result from the first storage component 101 according to the first total data amount and the first storage address; and may control the first storage unit 303 to store at least one ciphertext processing result to the first storage component 101 according to the second total data amount and the second storage address.
  • the first control unit 301 may specifically control the first loading unit 304 to obtain at least one bucket result from the first storage component 101 according to the first total data amount and the first storage address.
  • the first control information may further include the target computing processing mode or the operation method corresponding to the target computing processing mode, and the first control unit 301 may specifically notify the first operation unit 302 of the corresponding operation method according to the first control information.
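  • The role of the first control information can be pictured with the following sketch, in which a `bytearray` stands in for the first storage component 101. The field names of `ControlInfo`, the function `run_unit` and the stand-in `toy_compute` are all hypothetical; the application does not define a concrete layout for the control information.

```python
from dataclasses import dataclass

@dataclass
class ControlInfo:
    # All field names are illustrative, not taken from the application.
    load_addr: int      # first storage address (where bucket results start)
    load_bytes: int     # first total data amount
    store_addr: int     # second storage address (where results are written)
    store_bytes: int    # second total data amount
    operation: str      # operation method, e.g. "point_add"

def run_unit(memory: bytearray, info: ControlInfo, compute):
    """Load, compute, and store as directed by the control information."""
    data = bytes(memory[info.load_addr:info.load_addr + info.load_bytes])
    result = compute(info.operation, data)
    if len(result) != info.store_bytes:
        raise ValueError("result size does not match the second total data amount")
    memory[info.store_addr:info.store_addr + info.store_bytes] = result

def toy_compute(operation, data):
    # Stand-in for the computing units: sums the bytes modulo 256.
    if operation != "point_add":
        raise ValueError("unsupported operation in this sketch")
    return bytes([sum(data) % 256])

memory = bytearray(16)
memory[0:4] = bytes([1, 2, 3, 4])
run_unit(memory, ControlInfo(0, 4, 8, 1, "point_add"), toy_compute)
```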
  • the first operation unit 302 calculates and processes multiple ciphertext data in any bucket result assigned to it to obtain a ciphertext processing result, including: for any bucket result assigned to it, according to the operation method, calculating and processing multiple ciphertext data in the bucket result to obtain a ciphertext processing result.
  • the operation method corresponding to each target calculation mode can be pre-configured with one or more corresponding operation instructions, and the calculation and processing of multiple ciphertext data in each bucket result can be achieved by executing one or more operation instructions.
  • each first operation unit 302 can be implemented by a programmable processor (PC), which can store corresponding instructions to perform corresponding operations.
  • the first operation unit 302 may include a first storage subunit 401, a first parsing subunit 402, a first control subunit 403, and a first calculation subunit 404;
  • the first storage subunit 401 is used to store one or more operation instructions corresponding to the target computing processing mode
  • the first parsing subunit 402 is used to parse one or more operation instructions
  • the first control subunit 403 is used to send calculation indication information to the first calculation subunit 404 based on the parsing result of the first parsing subunit 402;
  • the first calculation subunit 404 is used to perform calculation processing on multiple ciphertext data based on the calculation indication information to obtain a ciphertext processing result.
  • the one or more operation instructions may be converted into corresponding calculation instruction information after being parsed to control the operation of the first calculation subunit.
  • the first storage subunit may be implemented by RAM, etc.
  • the target computing processing mode can be ciphertext accumulation, and the corresponding operation mode is point addition.
  • ciphertext data is encrypted using a homomorphic encryption algorithm based on elliptic curves, such as the EC-ElGamal semi-homomorphic encryption algorithm.
  • EC-ElGamal is a type of ECC, which is an implementation of ElGamal transplanted to elliptic curves.
  • the main calculations include: elliptic curve point addition, point subtraction, point multiplication, modular inversion and discrete logarithm.
  • ElGamal is an asymmetric encryption algorithm based on Diffie-Hellman key exchange.
  • P represents the public key, which is a point on the elliptic curve
  • G is the base point of the elliptic curve
  • k is a random number
  • m is the plaintext data to be encrypted, that is, the target data
  • Enc(P, m) represents the ciphertext obtained by encryption, which is composed of the point pair data C1 and C2 .
  • M represents the decryption result
  • s represents the private key
  • the private key multiplied by the base point is the public key
  • the decryption result is $M = C_2 - sC_1 = mG$.
  • Ciphertext addition is essentially point addition on an elliptic curve, while decryption requires point multiplication on an elliptic curve.
  • A point multiplication operation essentially takes a scalar and a point as its operands.
  • the point multiplication operation kP includes the scalar k and the point P
  • the point multiplication operation mG includes the scalar m and the point G.
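  • The relationship between encryption, ciphertext addition and decryption can be illustrated with the following toy sketch. It assumes the standard EC-ElGamal form Enc(P, m) = (C1, C2) = (kG, kP + mG), which is consistent with the decryption relation C2 - sC1 = mG given above. The curve y^2 = x^3 + 2x + 2 over GF(17) with base point (5, 1) of order 19 is a textbook example chosen for readability and offers no security; k is passed explicitly for reproducibility, whereas a real implementation would draw a fresh random k for every encryption. All function names are hypothetical.

```python
P_MOD, A = 17, 2          # toy curve y^2 = x^3 + 2x + 2 over GF(17); NOT secure
G, ORDER = (5, 1), 19     # base point and its (prime) order
INF = None                # point at infinity

def point_add(p, q):
    if p is INF:
        return q
    if q is INF:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                                   # p + (-p)
    if p == q:                                       # doubling
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def point_neg(p):
    return INF if p is INF else (p[0], -p[1] % P_MOD)

def point_mul(k, p):
    """Double-and-add scalar multiplication kP."""
    result = INF
    while k:
        if k & 1:
            result = point_add(result, p)
        p = point_add(p, p)
        k >>= 1
    return result

def encrypt(pub, m, k):
    # Enc(P, m) = (C1, C2) = (kG, kP + mG)
    return point_mul(k, G), point_add(point_mul(k, pub), point_mul(m, G))

def add_ciphertexts(c, d):
    # Ciphertext addition is component-wise point addition.
    return point_add(c[0], d[0]), point_add(c[1], d[1])

def decrypt(s, c):
    # M = C2 - s*C1 = mG, then recover m by a brute-force discrete logarithm.
    M = point_add(c[1], point_neg(point_mul(s, c[0])))
    return next(m for m in range(ORDER) if point_mul(m, G) == M)

s = 7                                  # private key
P = point_mul(s, G)                    # public key P = sG
c = add_ciphertexts(encrypt(P, 3, k=5), encrypt(P, 4, k=11))
total = decrypt(s, c)                  # plaintext sum of 3 and 4
```

  • Only point additions are needed to combine the two ciphertexts, while decryption needs a point multiplication by the private key followed by a discrete logarithm, which is why the plaintext range must stay small.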
  • Ciphertext accumulation means adding multiple ciphertext data.
  • the first calculation subunit 404 performs calculation processing on multiple ciphertext data based on the calculation indication information to obtain the ciphertext processing result, which can be: sequentially obtain one ciphertext data from the multiple ciphertext data, perform a point addition operation on it and the previous point addition result, and determine whether the current accumulation count has reached the preset number of times. If yes, the latest point addition result is output as the ciphertext processing result; if no, the point addition result is saved in the first storage subunit 401.
  • the first control unit can provide the multiple ciphertext data to the first calculation subunit in the form of an input data stream.
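  • The accumulation loop can be sketched as follows. To keep the example self-contained, point addition is replaced by a stand-in group operation (pairs of integers added component-wise modulo 19); the function `accumulate`, its parameters and `toy_add` are hypothetical names, and a real implementation would pass an elliptic curve point addition instead.

```python
def accumulate(stream, point_add, preset_count, identity):
    """Fold ciphertexts from an input stream with point addition.

    `stored` plays the role of the intermediate result kept in storage,
    and `preset_count` the preset number of accumulations.
    """
    stored = identity
    for count, ciphertext in enumerate(stream, start=1):
        stored = point_add(ciphertext, stored)   # new input + previous result
        if count == preset_count:
            return stored                        # output the final result
    raise ValueError("stream ended before the preset count was reached")

# Stand-in group: pairs of integers added component-wise modulo 19.
def toy_add(c, d):
    return ((c[0] + d[0]) % 19, (c[1] + d[1]) % 19)

bucket = [(1, 2), (3, 4), (18, 18)]
result = accumulate(iter(bucket), toy_add, preset_count=3, identity=(0, 0))
```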
  • the first storage subunit 401 may include a first instruction storage subunit, a first data storage subunit, and a first number storage subunit.
  • the first instruction storage subunit is used to store one or more operation instructions
  • the first data storage subunit is used to store intermediate results in the calculation process, such as the previous point addition result
  • the first number storage subunit is used to store a preset number of times, etc.
  • the first storage subunit may further include a first scalar storage subunit for storing scalar data.
  • the operation schematic diagram of the first operation unit 302 can be as shown in FIG. 5.
  • the first instruction storage subunit, the first parsing subunit, the first control subunit, the first data storage subunit, the first calculation subunit, the first number storage subunit and the first scalar storage subunit shown in FIG. 5 have been described in detail above and will not be repeated here.
  • the first calculation subunit can have a basic calculation logic, which can include a first input A, a second input B, a third input C and a fourth input D; the first input A can come from the input data stream or the first data storage subunit, and the second input B, the third input C and the fourth input D can come from the first data storage subunit, and of course each input can be empty.
  • the ciphertext data obtained from the input data stream can enter the first input A, the previous point addition result is used as the second input B, and the third input C and the fourth input D can be empty; the first calculation subunit performs a point addition operation on the first input A and the second input B to obtain a point addition result, which is either stored in the first data storage subunit or output as the ciphertext processing result.
  • the first computing device may be a computing device corresponding to a data receiver responsible for computing and processing multiple ciphertext data, etc.
  • the data initiator corresponds to a second computing device, which is used to transmit ciphertext data corresponding to multiple objects to the first computing device.
  • the embodiment of the present application further provides a computing system, which may include a first computing device 60 and a second computing device 70 .
  • the first computing device 60 may include a first host processing component 100 and a first acceleration device 601.
  • the specific structural implementation of the first acceleration device 601 may be described in detail in any of the embodiments shown in FIG. 1 to FIG. 5 above, and will not be repeated here.
  • the second computing device 70 may include a second host processing component 700 and a second acceleration device 602 .
  • the second computing device 70 may also be configured with a second acceleration device 602 for accelerating encryption or decryption operations. Therefore, the second acceleration device 602 can be used to obtain multiple data to be processed, and for any data to be processed, encrypt or decrypt the data to be processed to obtain a calculation result.
  • the data to be processed may be target data to be encrypted or a ciphertext processing result to be decrypted; correspondingly, the calculation processing result may be ciphertext data or a plaintext processing result.
  • the second host processing component 700 may obtain the ciphertext data corresponding to the multiple objects respectively from the second acceleration device 602 and send the ciphertext data to the first computing device 60 .
  • the second host processing component 700 can obtain the plaintext processing result from the second acceleration device 602 and perform subsequent processing operations.
  • the technical solution of the embodiment of the present application can be applied to a scenario in which multi-party joint modeling is performed using a vertical federated learning method.
  • the second acceleration device 602 in the second computing device 70 of the data initiator first encrypts the gradient information corresponding to different sample objects to obtain the ciphertext gradient information of multiple sample objects, wherein the gradient information is calculated based on the feature values and label data corresponding to the sample objects provided by the data initiator using a decision tree model.
  • the second acceleration device 602 sends the ciphertext gradient information of the multiple sample objects to the second host processing component 700, and the second host processing component 700 sends the ciphertext gradient information of the multiple sample objects to the first computing device 60 corresponding to the data recipient.
  • the ciphertext gradient information of the multiple sample objects can be sent to the first acceleration device 601.
  • the first acceleration device 601 can first perform bucket processing on the ciphertext gradient information of the multiple sample objects according to different features to obtain multiple bucket results of each feature, and then use the technical solution of the present application to calculate the ciphertext gradient cumulative values corresponding to the multiple bucket results of each feature, and then send the ciphertext gradient cumulative values corresponding to the multiple bucket results of each feature to the first host processing component 100; the first host processing component 100 then sends the ciphertext gradient cumulative values corresponding to the multiple bucket results of each feature to the second computing device 70 of the data initiator.
  • the second host processing component 700 in the second computing device 70 receives the ciphertext gradient accumulation values corresponding to the multiple bucket results of each feature, and can send them to the second acceleration device 602.
  • the second acceleration device 602 can decrypt and obtain the gradient accumulation values corresponding to the multiple bucket results of each feature, and then accumulate the gradient accumulation values of the multiple bucket results of each feature to obtain the gradient accumulation value corresponding to the feature.
  • the second acceleration device 602 may send the gradient accumulation values corresponding to the plurality of features to the second host processing component 700 .
  • the second host processing component 700 can specifically determine the optimal split point of the decision tree model based on the gradient accumulation values corresponding to the multiple features. According to the optimal split point, the decision tree model can be constructed.
  • the decision tree model may be an XGBoost (eXtreme Gradient Boosting) model, etc.
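  • How a host might pick a split point from decrypted gradient accumulation values can be sketched with the split gain commonly used for XGBoost-style trees, gain = 1/2 [GL^2/(HL + lambda) + GR^2/(HR + lambda) - G^2/(H + lambda)]. This application does not state which criterion is used, so the formula, the regularization parameter `lam` and the function name `best_split` are assumptions for illustration; the complexity penalty term is omitted.

```python
def best_split(bucket_g, bucket_h, lam=1.0):
    """Pick the bucket boundary with the highest split gain.

    bucket_g / bucket_h: decrypted first- and second-order gradient sums
    per bucket, ordered by feature value interval.
    """
    G, H = sum(bucket_g), sum(bucket_h)

    def score(g, h):
        return g * g / (h + lam)

    best_gain, best_idx = 0.0, None
    gl = hl = 0.0
    for i in range(len(bucket_g) - 1):        # candidate split after bucket i
        gl += bucket_g[i]
        hl += bucket_h[i]
        gain = 0.5 * (score(gl, hl) + score(G - gl, H - hl) - score(G, H))
        if gain > best_gain:
            best_gain, best_idx = gain, i
    return best_idx, best_gain

idx, gain = best_split([-4.0, -2.0, 3.0, 5.0], [2.0, 2.0, 2.0, 2.0])
```

  • Because only per-bucket sums are needed, the data initiator never has to see which objects the data recipient placed in which bucket.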
  • the gradient information may include the first-order gradient and second-order gradient corresponding to each sample object, which is obtained by deriving the loss function of the decision tree model.
  • the feature values of the sample object are input into the decision tree model to obtain the predicted data.
  • the loss function can be used to estimate the degree of inconsistency between the predicted data and the label data.
  • the first-order gradient and second-order gradient can be obtained by deriving the loss function.
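  • As a concrete case, the sketch below computes the first-order and second-order gradients for a binary logistic loss with respect to the raw model score. The choice of logistic loss is an assumption made for the example, since this application does not fix a particular loss function; `logistic_gradients` is a hypothetical name.

```python
import math

def logistic_gradients(raw_score, label):
    """First- and second-order gradients of log loss w.r.t. the raw score.

    With p = sigmoid(raw_score): g = p - label, h = p * (1 - p).
    """
    p = 1.0 / (1.0 + math.exp(-raw_score))
    return p - label, p * (1.0 - p)

g, h = logistic_gradients(0.0, 1)
```

  • These per-object values are the target data that would be encrypted before being sent to the data recipient.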
  • the target data when the data to be processed is the target data to be encrypted, the target data may be the gradient information corresponding to the decision tree model calculated based on the feature values and label data corresponding to the sample object provided by the data initiator;
  • the data to be processed may be the ciphertext processing result to be decrypted obtained by calculation for any feature provided by the data recipient, wherein the ciphertext processing result to be decrypted may be the ciphertext gradient accumulated value, the ciphertext gradient accumulated value corresponding to each feature, or the ciphertext gradient accumulated value corresponding to each bucket result, and the corresponding calculation processing result obtained by decrypting it is the gradient accumulated value.
  • the first computing device 60 and the second computing device 70 are connected via a network.
  • the network provides a medium for a communication link between the first computing device 60 and the second computing device 70.
  • the network may include various connection types, such as wired, wireless, or optical fiber cables, etc.
  • the wireless connection may be implemented via a mobile network, and accordingly, the network standard of the mobile network may be any one of 2G (GSM), 2.5G (GPRS), 3G (WCDMA, TD-SCDMA, CDMA2000, UMTS), 4G (LTE), 4G+ (LTE+), 5G, WiMax, etc.
  • a communication connection may also be established via Bluetooth, WiFi, infrared, etc.
  • the first computing device 60 and the second computing device 70 may also include other components, such as input/output interfaces, display components, communication components for implementing the above-mentioned communication connections, and host storage components for storing computer instructions for host processing components to call and execute to implement corresponding operations, etc. This application does not go into details.
  • the second acceleration device may include a second storage component 701 and at least one third acceleration component 702 ; the second storage component 701 is connected to the second host processing component 700 via a bus;
  • the second storage component 701 is used to store a plurality of data to be processed sent by the second host processing component 700; the data to be processed is target data to be encrypted or a ciphertext processing result to be decrypted;
  • the third acceleration component 702 is used to obtain at least one to-be-processed data from the second storage component; for any to-be-processed data, encrypt or decrypt the to-be-processed data to obtain a calculation result, and store the calculation result in the second storage component 701;
  • the second host processing component 700 obtains a calculation result corresponding to any data to be processed from the second storage component 701 .
  • the second acceleration device may be provided with a plurality of third acceleration components 702, thereby improving parallel processing capability, improving processing efficiency, and improving acceleration performance.
  • each third acceleration component 702 may include a second control unit 7021 and a plurality of second computing units 7022 ;
  • the second control unit 7021 is used to obtain at least one data to be processed from the second storage component; and dispatch the at least one data to be processed to at least one second computing unit;
  • the second computing unit 7022 is used to encrypt or decrypt any data to be processed assigned to it to obtain a computing result
  • the second control unit 7021 is used to store the calculation results corresponding to any data to be processed into the second storage component 701.
  • each third acceleration component 702 may further include a second storage unit 7023; the second computing unit 7022 is further configured to save a calculation result corresponding to any to-be-processed data to the second storage unit 7023;
  • the second control unit 7021 stores the calculation processing result corresponding to any data to be processed in the second storage component 701, including: storing the calculation processing result corresponding to any data to be processed stored in the second storage unit 7023 in the second storage component 701.
  • each third acceleration component 702 may further include a second loading unit 7024.
  • the second control unit 7021 may specifically control the second loading unit 7024 to obtain at least one to-be-processed data from the second storage component 701 .
  • the second control unit 7021 is further used to receive second control information sent by the second host processing component 700, and control the operation of the plurality of second computing units 7022 and the second storage unit 7023 according to the second control information;
  • the second control unit 7021 is further used to notify the second computing unit 7022 of the corresponding computing method according to the second control information; wherein the computing methods corresponding to encryption are point addition and point multiplication, and the computing method corresponding to decryption is point multiplication.
  • the second control information may include the first total amount of at least one data to be processed that the third acceleration component needs to obtain and the second total amount of data corresponding to at least one data to be processed after the at least one data to be processed is calculated and processed.
  • it may also include a first storage address corresponding to the at least one data to be processed that needs to be obtained and a second storage address corresponding to at least one calculation result obtained after the at least one data to be processed is calculated and processed.
  • the second control unit 7021 can specifically obtain at least one data to be processed from the second storage component 701 according to the first total amount of data and the first storage address; and can control the second storage unit 7023 to store at least one calculation result to the second storage component 701 according to the second total amount of data and the second storage address.
  • the second control unit 7021 can specifically control the second loading unit 7024 to obtain at least one data to be processed from the second storage component 701 according to the first total amount of data and the first storage address.
  • the second control information may also include an operation method corresponding to encryption or decryption, and the second control unit 7021 may specifically notify the second operation unit 7022 of the corresponding operation method according to the second control information;
  • the second computing unit 7022 performs computing processing on any data to be processed assigned to it to obtain a computing processing result, including: for any data to be processed assigned to it, processing the data to be processed according to the computing method to obtain a computing processing result.
  • the operation mode corresponding to encryption or decryption can be pre-configured with one or more corresponding operation instructions, and each data to be processed can be calculated and processed by executing one or more operation instructions.
  • the second operation unit 7022 can be implemented by a programmable processor (PC) in actual application, which can store corresponding instructions to perform corresponding operations.
  • the second operation unit 7022 can include a second storage subunit, a second parsing subunit, a second calculation subunit, and a second control subunit;
  • the second storage subunit is used to store one or more operation instructions corresponding to encryption or decryption
  • the second parsing subunit is used to parse one or more operation instructions
  • the second control subunit is used to send calculation indication information to the second calculation subunit based on the parsing result of the second parsing subunit;
  • the second calculation subunit is used to perform calculation processing on the data to be processed based on the calculation indication information to obtain a calculation processing result.
  • the one or more operation instructions may be converted into corresponding calculation instruction information after being parsed to control the operation of the second calculation subunit.
  • the second storage subunit may be implemented by RAM, etc.
  • the second storage subunit may include a second instruction storage subunit, a second data storage subunit, and a second number storage subunit.
  • the second instruction storage subunit is used to store one or more operation instructions
  • the second data storage subunit is used to store intermediate results in the calculation process
  • the second number storage subunit is used to store a preset number of times, etc.
  • the second storage subunit may further include a second scalar storage subunit for storing scalar data.
  • the specific structure of the second operation unit can be the same as the structure of the first operation unit 302 described in the corresponding embodiment above. Therefore, the specific implementation can be found in the above explanation of the first operation unit, which will not be repeated here.
  • the processing efficiency of the encryption or decryption operation in the second computing device and the processing efficiency of the ciphertext accumulation operation in the first computing device can be improved, the amount of calculation of the host processing component can be reduced, the processing performance can be improved, the acceleration performance can be improved, and efficient and high-performance data joint processing can be achieved.
  • the first computing device and the second computing device may be physical machines, which may be physical machines providing cloud computing capabilities, etc.
  • an embodiment of the present application further provides an acceleration method, which can be applied to an acceleration device as shown in FIG1 , wherein the acceleration device includes a first storage component, a first acceleration component connected to the first storage component, and a second acceleration component; the first storage component is connected to the first host processing component through a bus; wherein the first storage component stores multiple ciphertext data corresponding to multiple objects sent by the first host processing component; the specific structural implementation of the acceleration device can be described in detail in the corresponding embodiment, which will not be repeated here.
  • the method can be specifically executed by the second acceleration component in the acceleration device, as shown in FIG8 , and the method can include the following steps:
  • the first acceleration component is used to obtain multiple bucket results from the first storage component; calculate and process the ciphertext data in the same bucket result to obtain the ciphertext processing result; store the ciphertext processing results corresponding to the multiple bucket results respectively in the first storage component; the first storage component is used to provide the ciphertext processing results corresponding to the multiple bucket results respectively to the first host processing component.
  • an embodiment of the present application further provides an acceleration method, which can be applied to an acceleration device as shown in FIG1 , wherein the acceleration device includes a first storage component, a first acceleration component connected to the first storage component, and a second acceleration component; the first storage component is connected to the first host processing component through a bus; wherein the first storage component stores multiple ciphertext data corresponding to multiple objects sent by the first host processing component; the specific structural implementation of the acceleration device can be described in detail in the corresponding embodiment, which will not be repeated here.
  • the method can be specifically executed by the first acceleration component in the acceleration device, as shown in FIG9 , and the method can include the following steps:
  • the multiple bucketing results can be obtained by the second acceleration component obtaining multiple ciphertext data from the first storage component, and performing bucketing processing on the multiple ciphertext data according to multiple features.
  • the first storage component is used to provide the ciphertext processing results corresponding to the multiple bucket results to the first host processing component.
  • an embodiment of the present application further provides a computing device, as shown in FIG. 10, which may include a host processing component 1001, a host storage component 1002, and an acceleration device 1003, wherein the acceleration device may adopt a structure as described in any of the embodiments of FIG. 1 to FIG. 5 or FIG. 7a, which will not be repeated here.
  • the host storage component 1002 may store one or more computer instructions for the host processing component 1001 to call and execute to implement corresponding operations.
  • the computing device may also include other components, such as input/output interfaces, display components, communication components, etc.
  • the input/output interface provides an interface between the processing component and the peripheral interface module, which may be an output device, an input device, etc.
  • the communication component is configured to facilitate wired or wireless communication between the computing device and other devices.
  • the host processing component may include one or more processors to execute computer instructions to complete all or part of the steps in the above method.
  • the host processing component may also be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components to perform the above method.
  • the host storage component is configured to store various types of data to support operations in the computing device.
  • the host storage component can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
  • the acceleration device can be implemented by using an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field programmable gate array (FPGA), a controller, a microcontroller, a microprocessor or other electronic components. It can be connected to the host processing component through a bus and deployed in a computing device in a hot-swappable manner.
  • the embodiment of the present application also provides a computer-readable storage medium storing a computer program, which can implement the acceleration method of the embodiment shown in Figure 8 or Figure 9 when executed by a computer.
  • the computer-readable medium can be included in the computing device described in the above embodiment; or it can exist independently without being assembled into the electronic device.
  • the embodiment of the present application also provides a computer program product, which includes a computer program carried on a computer-readable storage medium, and when the computer program is executed by a computer, it can implement the acceleration method of the embodiment shown in Figure 8 or Figure 9 as described above.
  • the computer program can be downloaded and installed from a network, and/or installed from a removable medium.
  • various functions defined in the system of the present application are executed.
  • the device embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the scheme of this embodiment. Those of ordinary skill in the art may understand and implement it without creative work.
  • each embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course it can also be implemented in hardware.
  • the above technical solution, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product; the computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes a number of instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments or in some parts of the embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Storage Device Security (AREA)

Abstract

Embodiments of the present application provide an acceleration device, a computing system, and an acceleration method. The acceleration device comprises: a first storage component, and a first acceleration component and a second acceleration component that are connected to the first storage component. The first storage component is connected to a first host processing component by means of a bus. The first storage component is used for storing multiple pieces of ciphertext data corresponding to multiple objects and sent by the first host processing component. The second acceleration component is used for acquiring the multiple pieces of ciphertext data from the first storage component, for any feature, performing bucketing processing on the multiple pieces of ciphertext data to obtain multiple bucketing results, and storing the multiple bucketing results in the first storage component. The first acceleration component is used for acquiring the multiple bucketing results from the first storage component, performing computing processing on the ciphertext data in a same bucketing result to obtain a ciphertext processing result, and storing, in the first storage component, the ciphertext processing results respectively corresponding to the multiple bucketing results. The technical solution provided in the embodiments of the present application improves the processing efficiency.

Description

Acceleration device, computing system and acceleration method
This application claims priority to Chinese patent application No. 202211241151.7, filed with the China Patent Office on October 11, 2022 and entitled "Acceleration device, computing system and acceleration method", the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of computer technology, and in particular to an acceleration device, a computing system, and an acceleration method.
Background
With the development of science and technology, the value of data is receiving increasing attention, and different data providers often need to fuse their data. However, out of considerations such as privacy protection, data cannot be shared between different data providers, which creates data islands. To solve the data island problem, privacy-preserving computation based on homomorphic encryption has emerged. It aims to break down data islands and to use multi-party data for computation, modeling, and the like without leaking private data.
Homomorphic encryption is a class of encryption algorithms with a special property: processing homomorphically encrypted data yields an output which, once decrypted, is the same as the output obtained by processing the unencrypted original data in the same way. In other words, computing first and then decrypting is equivalent to decrypting first and then computing, a property that is of great significance for protecting data security.
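The "compute first, then decrypt" property can be illustrated with a toy additively homomorphic scheme. The sketch below uses the Paillier construction purely as an assumption: this section does not name a specific scheme, and the function names and the tiny primes are hypothetical and insecure, chosen only so the arithmetic is easy to follow.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy key generation. The primes are tiny and insecure (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                   # common simple choice of generator
    mu = pow(lam, -1, n)        # modular inverse; valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:  # the blinding factor must be invertible mod n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    l_value = (pow(c, lam, n * n) - 1) // n
    return (l_value * mu) % n


def add_ciphertexts(pub, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen()
c_sum = add_ciphertexts(pub, encrypt(pub, 17), encrypt(pub, 25))
# Decrypting the combined ciphertext gives the same value as adding the plaintexts.
print(decrypt(pub, priv, c_sum) == 17 + 25)
```

In a scheme of this kind, adding plaintexts corresponds to multiplying large integers modulo n squared. This is one way to read the later remark that ciphertext computation relies on large-integer multiplication and addition.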
In one practical application, where multiple data providers hold the same objects but different features, the following joint data processing requirement arises. The data initiator homomorphically encrypts the target data computed from the feature values of each object to obtain ciphertext data, and then provides the ciphertext data corresponding to the multiple objects to the data receiver. For each feature it holds, the data receiver buckets the ciphertext data corresponding to the multiple objects according to their feature values, performs computation on the ciphertext data in each bucket to obtain a ciphertext processing result, and then returns the ciphertext processing results of the buckets for each feature to the data initiator. The data initiator can decrypt them to obtain the plaintext processing result of each bucket and carry out subsequent processing based on these results. In this way the data initiator processes data using the features of the data receiver, while the data security of both parties is protected.
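The receiver-side steps described above (bucket by feature value, then combine within each bucket) can be sketched as follows. This is a minimal illustration under stated assumptions: the bucket boundaries, the function names, and the small integers standing in for ciphertexts are all hypothetical, and the combine operation is modular multiplication, which is how ciphertext addition looks in some additively homomorphic schemes such as Paillier.

```python
from bisect import bisect_right
from collections import defaultdict


def bucket_by_feature(feature_values, ciphertexts, boundaries):
    """Group each object's ciphertext by the bucket its feature value falls into.

    boundaries is a sorted list of cut points; bucket i holds values in
    [boundaries[i - 1], boundaries[i]).
    """
    buckets = defaultdict(list)
    for value, ciphertext in zip(feature_values, ciphertexts):
        buckets[bisect_right(boundaries, value)].append(ciphertext)
    return dict(buckets)


def aggregate(buckets, combine, identity):
    """Fold the ciphertexts of every bucket into one ciphertext processing result."""
    results = {}
    for index, items in buckets.items():
        acc = identity
        for ciphertext in items:
            acc = combine(acc, ciphertext)
        results[index] = acc
    return results


# Hypothetical data: four objects, one feature (age), opaque stand-in ciphertexts.
ages = [18, 25, 33, 41]
ciphertexts = [3, 5, 7, 11]
MODULUS = 1000  # stand-in for the ciphertext modulus

buckets = bucket_by_feature(ages, ciphertexts, boundaries=[20, 30])
results = aggregate(buckets, lambda a, b: (a * b) % MODULUS, identity=1)
print(results)
```

The receiver never needs the private key for these steps: it only groups and combines opaque values, and the initiator decrypts the per-bucket results afterwards.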
As can be seen from the above, the multiple ciphertext data must be bucketed for each feature and the ciphertext data in each bucket must then be computed on, so the amount of computation is large, which affects processing efficiency.
Summary of the Invention
The embodiments of the present application provide an acceleration device, a computing system, and an acceleration method to address the technical problem in the prior art that processing efficiency is impaired.
In a first aspect, an embodiment of the present application provides an acceleration device, comprising: a first storage component, and a first acceleration component and a second acceleration component that are connected to the first storage component; the first storage component is connected to a first host processing component via a bus;
the first storage component is configured to store multiple ciphertext data that correspond to multiple objects and are sent by the first host processing component;
the second acceleration component is configured to obtain the multiple ciphertext data from the first storage component, perform, for any feature, bucketing processing on the multiple ciphertext data to obtain multiple bucketing results, and store the multiple bucketing results in the first storage component;
the first acceleration component is configured to obtain the multiple bucketing results from the first storage component, perform computation on the ciphertext data in a same bucketing result to obtain a ciphertext processing result, and store the ciphertext processing results respectively corresponding to the multiple bucketing results in the first storage component;
the first storage component is configured to provide the ciphertext processing results respectively corresponding to the multiple bucketing results to the first host processing component.
In a second aspect, an embodiment of the present application provides a computing system, comprising a first computing device and a second computing device, wherein the first computing device comprises a first host processing component and the acceleration device according to any implementation of the first aspect;
the second computing device comprises a second host processing component and a second acceleration device; the second acceleration device comprises a second storage component and at least one third acceleration component; the second storage component is connected to the second host processing component via a bus;
the second storage component is configured to store multiple pieces of data to be processed that are sent by the second host processing component; the data to be processed is target data to be encrypted or a ciphertext processing result to be decrypted;
the third acceleration component is configured to obtain at least one piece of data to be processed from the second storage component, encrypt or decrypt any such piece of data to obtain a computation result, and store the computation result in the second storage component;
the second host processing component is configured to obtain, from the second storage component, the computation result corresponding to any piece of data to be processed.
In a third aspect, an embodiment of the present application provides a computing device, comprising a host processing component, a host storage component, and the acceleration device according to the first aspect.
In a fourth aspect, an embodiment of the present application provides an acceleration method applied to an acceleration device, wherein the acceleration device comprises a first storage component, and a first acceleration component and a second acceleration component connected to the first storage component; the first storage component is connected to a first host processing component via a bus; the first storage component stores multiple ciphertext data that correspond to multiple objects and are sent by the first host processing component; the method comprises:
obtaining the multiple ciphertext data from the first storage component;
for any feature, performing bucketing processing on the multiple ciphertext data to obtain multiple bucketing results;
storing the multiple bucketing results in the first storage component; wherein the first acceleration component is configured to obtain the multiple bucketing results from the first storage component, perform computation on the ciphertext data in a same bucketing result to obtain a ciphertext processing result, and store the ciphertext processing results respectively corresponding to the multiple bucketing results in the first storage component; and the first storage component is configured to provide the ciphertext processing results respectively corresponding to the multiple bucketing results to the first host processing component.
The acceleration device provided by the embodiments of the present application comprises a first storage component, and a first acceleration component and a second acceleration component connected to the first storage component; the first storage component is connected to the first host processing component via a bus. The second acceleration component performs the bucketing processing: the first host processing component stores the multiple ciphertext data in the first storage component, and multiple features can then share these ciphertext data for bucketing. The first acceleration component then obtains the bucketing results from the first storage component and performs computation on the ciphertext data in a same bucketing result to obtain a ciphertext processing result, which can be provided to the first host processing component via the first storage component. Because the first host processing component needs to transfer the data only once and the acceleration device carries out both the bucketing and the computation, the computational load on the host processing component is reduced, which improves processing efficiency; I/O overhead is also reduced, which safeguards the acceleration performance of the acceleration device.
These and other aspects of the present application will be more readily understood from the description of the following embodiments.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of an embodiment of an acceleration device provided by the present application;
FIG. 2 is a schematic structural diagram of an embodiment of a second acceleration component provided by the present application;
FIG. 3 is a schematic structural diagram of an embodiment of a first acceleration component provided by the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a first computing unit provided by the present application;
FIG. 5 is a schematic diagram of the operation structure of the first computing unit in a practical application of an embodiment of the present application;
FIG. 6a is a schematic structural diagram of an embodiment of a computing system provided by the present application;
FIG. 6b is a schematic diagram of an interaction scenario of the computing system provided by the present application in a practical application;
FIG. 7a is a schematic structural diagram of an embodiment of a second acceleration device provided by the present application;
FIG. 7b is a schematic structural diagram of an embodiment of a third acceleration component provided by the present application;
FIG. 8 is a flowchart of an embodiment of an acceleration method provided by the present application;
FIG. 9 is a flowchart of an embodiment of an acceleration method provided by the present application;
FIG. 10 is a schematic structural diagram of an embodiment of a computing device provided by the present application.
Detailed Description
To enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application.
Some of the processes described in the specification, the claims, and the above drawings of the present application contain multiple operations that appear in a specific order. It should be clearly understood, however, that these operations need not be executed in the order in which they appear herein and may be executed in parallel. Operation numbers such as 101 and 102 serve only to distinguish the operations from one another and do not themselves represent any execution order. In addition, these processes may include more or fewer operations, which may be executed sequentially or in parallel. It should be noted that descriptions such as "first" and "second" herein are used to distinguish different messages, devices, modules, and the like; they do not represent a sequence, nor do they require that the "first" and the "second" be of different types.
The technical solutions of the embodiments of the present application are applicable to multi-party joint data processing scenarios, such as multi-party joint modeling, although the present application is not limited thereto.
As described in the Background, one existing joint data processing requirement is as follows. The data initiator homomorphically encrypts the target data computed from the feature values of each object to obtain ciphertext data, and then provides the ciphertext data corresponding to the multiple objects to the data receiver. For each feature it holds, the data receiver buckets the ciphertext data corresponding to the multiple objects according to their feature values, performs computation on the ciphertext data in each bucketing result to obtain a ciphertext processing result, and then returns the ciphertext processing results of the bucketing results for each feature to the data initiator. The data initiator can decrypt them to obtain the plaintext processing result of each bucketing result and carry out subsequent processing based on these results.
In practice, this requirement can arise, for example, in multi-party joint modeling by means of federated learning. Federated learning is a distributed machine learning approach that can use data from multiple parties for joint modeling while protecting data privacy. Vertical federated learning is a commonly used form of federated learning in which multi-party joint modeling is performed when the feature data and label information of the sample objects are distributed among different data providers: the data providers hold the same sample objects but different feature data. For example, data provider A and data provider B both hold user C, but data provider A holds the educational background data of user C and data provider B holds the age data of user C, where the educational background data and the age data are the feature data. In joint modeling, usually only one party holds the label data of the sample objects. The data provider that holds the label data is also called the data initiator (active party), and a data provider without label data is also called a data receiver (passive party). Through vertical federated learning, the active party can use the feature data of the passive party to improve the capability of the machine learning model while the data privacy of every participant is protected.
In vertical federated learning, the decision tree model is a commonly used machine learning model. The most important part of training a decision tree model is finding the optimal split point, where a split point is a specific value of some feature. For example, if the label data is that user C belongs to the target group, the split point may be an age below 20, an age below 30, and so on.
A decision tree model is usually trained as follows. The active party first determines the gradient information of the model from the feature values and label data of the sample objects it holds, then encrypts the gradient information homomorphically into ciphertext gradient information and transmits it to the passive party. From the ciphertext gradient information, the passive party computes the ciphertext gradient accumulation value of the split space of each feature and sends the ciphertext gradient accumulation values to the active party. The active party decrypts them to obtain the gradient accumulation values and, from the gradient accumulation values of the multiple features, finally determines the optimal split point. The passive party therefore has to accumulate, in ciphertext form, the ciphertext gradient information obtained by homomorphic encryption. To improve training efficiency, bucketing can be used: for each feature, the passive party buckets the ciphertext gradient information of the different sample objects according to their feature values, accumulates the ciphertext gradient information within each bucketing result, and sends the ciphertext gradient accumulation value of each bucketing result to the active party, which then determines the optimal split point from the ciphertext gradient accumulation values of the bucketing results.
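Once the active party has decrypted the per-bucket accumulation values, selecting a split point amounts to scanning the bucket boundaries. The sketch below is an illustration under loud assumptions: the text above only says that the optimal split point is determined from the accumulated values, so the gain formula (the regularised G²/(H + λ) score common in gradient-boosted trees), the second-order sums `hess_sums`, and all names are hypothetical.

```python
def best_split(grad_sums, hess_sums, reg_lambda=1.0):
    """Return (bucket_index, gain) of the best split between adjacent buckets.

    grad_sums[i] and hess_sums[i] are the decrypted accumulation values of
    bucket i for one feature. Splitting at index i sends buckets 0..i to the
    left child and the remaining buckets to the right child.
    """
    total_g, total_h = sum(grad_sums), sum(hess_sums)

    def score(g, h):
        # Assumed scoring function; not taken from the source text.
        return g * g / (h + reg_lambda)

    best_index, best_gain = None, float("-inf")
    left_g = left_h = 0.0
    for i in range(len(grad_sums) - 1):  # the last bucket cannot be a split
        left_g += grad_sums[i]
        left_h += hess_sums[i]
        gain = (score(left_g, left_h)
                + score(total_g - left_g, total_h - left_h)
                - score(total_g, total_h))
        if gain > best_gain:
            best_index, best_gain = i, gain
    return best_index, best_gain


# Hypothetical decrypted sums for four buckets of one feature.
index, gain = best_split([-4.0, -3.0, 5.0, 6.0], [2.0, 2.0, 2.0, 2.0])
print(index, round(gain, 3))
```

Because only one pair of sums per bucket is needed, the active party's work after decryption is linear in the number of buckets rather than in the number of sample objects.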
As can be seen from the above, the data receiver has to bucket the data for each feature and perform the corresponding computation on the ciphertext gradient information in each bucketing result. These operations are usually carried out by the processing component of each party's own computing device, and the host processing component also has other work to perform, so the computational load on the processing component is very large, which degrades processing performance and reduces processing efficiency.
To improve processing performance and efficiency, the inventors found in their research that computation on ciphertext data produced by a homomorphic encryption algorithm essentially relies on multiplying and adding large integers and therefore consumes considerable processing capacity. They therefore considered using a dedicated accelerator for the ciphertext computation to achieve better processing performance. However, the inventors further found that, with a dedicated accelerator, the host processing component would still have to perform the bucketing. In practice the number of objects is often large; in joint modeling scenarios in particular, sample objects typically number in the hundreds of thousands or even millions, and the number of features is also very large. Because bucketing must be performed for every feature, and the bucketing results of every feature must be transferred to the accelerator, the amount of data transferred is on the order of the number of objects multiplied by the number of features. This in turn causes a large I/O overhead and creates a bottleneck for the acceleration performance.
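The scale of the I/O problem can be made concrete with a short calculation. The figures below are assumptions chosen for illustration: one million objects, one hundred features, and 512-byte ciphertexts. None of these numbers appear in the source, and the helper name is hypothetical. The count also ignores bucket indices and other metadata that an on-device bucketing design would still need to receive.

```python
def ciphertexts_sent_to_accelerator(num_objects, num_features, bucket_on_host):
    """Count the ciphertexts that cross the host-to-accelerator bus.

    bucket_on_host=True:  the host buckets per feature and ships each
                          feature's bucketed ciphertexts separately.
    bucket_on_host=False: the ciphertexts are shipped once and shared by
                          all features on the device.
    """
    if bucket_on_host:
        return num_objects * num_features
    return num_objects


CIPHERTEXT_BYTES = 512  # assumption: 4096-bit ciphertexts

objects, features = 1_000_000, 100
on_host = ciphertexts_sent_to_accelerator(objects, features, True)
on_device = ciphertexts_sent_to_accelerator(objects, features, False)

print(on_host * CIPHERTEXT_BYTES / 1e9, "GB with host-side bucketing")
print(on_device * CIPHERTEXT_BYTES / 1e9, "GB when ciphertexts are sent once")
```

Under these assumptions the host-side design moves roughly 51 GB of ciphertext per round, against roughly 0.5 GB when the ciphertexts are transferred once; the ratio equals the number of features.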
Accordingly, after a series of studies, the inventors proposed the technical solution of the present application. The embodiments of the present application provide an acceleration device composed of a first storage component, and a first acceleration component and a second acceleration component connected to the first storage component; the first storage component is connected to the first host processing component via a bus. The second acceleration component performs the bucketing processing: the first host processing component needs to send the multiple ciphertext data corresponding to the multiple objects only once for storage in the first storage component, and multiple features can then share these ciphertext data for bucketing. The first acceleration component then obtains the bucketing results from the first storage component and performs computation on the ciphertext data in a same bucketing result to obtain a ciphertext processing result, which can be provided to the first host processing component via the first storage component. The host processing component thus needs to transfer the data only once, and the acceleration device carries out both the bucketing and the computation. This reduces the computational load on the host processing component; using a dedicated acceleration device for the computation improves processing efficiency, and the I/O overhead of the acceleration device is reduced, which safeguards its acceleration performance.
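The data flow just described can be mimicked in software to show why a single host transfer suffices. The class below is a behavioural sketch, not the hardware design: the class and method names are hypothetical, small integers stand in for ciphertexts, modular multiplication stands in for ciphertext accumulation, and the way per-feature bucket indices reach the device is not described in this section, so they are simply passed in as an argument.

```python
from collections import defaultdict


class AccelerationDeviceModel:
    """Software stand-in for the storage component and two acceleration components."""

    def __init__(self, modulus):
        self.storage = {}       # plays the role of the first storage component
        self.modulus = modulus  # stand-in for the ciphertext modulus
        self.host_writes = 0    # number of ciphertext transfers from the host

    def host_write_ciphertexts(self, ciphertexts):
        """Host sends the ciphertexts of all objects once."""
        self.storage["ciphertexts"] = list(ciphertexts)
        self.host_writes += 1

    def bucketize(self, feature, bucket_ids):
        """Second acceleration component: bucket the shared ciphertexts for one feature."""
        buckets = defaultdict(list)
        for bucket_id, ciphertext in zip(bucket_ids, self.storage["ciphertexts"]):
            buckets[bucket_id].append(ciphertext)
        self.storage[("buckets", feature)] = dict(buckets)

    def accumulate(self, feature):
        """First acceleration component: combine the ciphertexts within each bucket."""
        results = {}
        for bucket_id, items in self.storage[("buckets", feature)].items():
            acc = 1
            for ciphertext in items:
                acc = (acc * ciphertext) % self.modulus
            results[bucket_id] = acc
        self.storage[("results", feature)] = results

    def host_read_results(self, feature):
        """Host fetches the ciphertext processing results from storage."""
        return self.storage[("results", feature)]


device = AccelerationDeviceModel(modulus=1000)
device.host_write_ciphertexts([3, 5, 7, 11])
# Hypothetical per-object bucket indices for two features.
for feature, bucket_ids in {"age": [0, 1, 2, 2], "income": [1, 1, 0, 0]}.items():
    device.bucketize(feature, bucket_ids)
    device.accumulate(feature)

print(device.host_read_results("age"), device.host_read_results("income"))
print("ciphertext transfers from host:", device.host_writes)
```

Both features reuse the same stored ciphertexts, so the transfer counter stays at one regardless of how many features are processed.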
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。The following will be combined with the drawings in the embodiments of the present application to clearly and completely describe the technical solutions in the embodiments of the present application. Obviously, the described embodiments are only part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those skilled in the art without creative work are within the scope of protection of this application.
FIG. 1 is a schematic structural diagram of one embodiment of an acceleration device provided by the present application. The acceleration device may include a first storage component 101, and a first acceleration component 102 and a second acceleration component 103 each connected to the first storage component 101. The first storage component 101 is connected to the first host processing component 100 through a bus. The bus may be, for example, PCIe (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard); other high-speed interconnects such as Ethernet may also be used, which is not limited in this application.
The acceleration device may be implemented as an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA). It may also be implemented using a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a controller, a microcontroller, a microprocessor, or another form of integrated circuit (IC), which is not limited in this application.
The acceleration device may be deployed in a first computing device, which, relative to the acceleration device, may be referred to as its host device. The first host processing component may be, for example, a central processing unit (CPU) in the first computing device, responsible for the conventional processing tasks of the first computing device.
The first storage component 101 is configured to store multiple ciphertext data, corresponding to multiple objects, sent by the first host processing component 100;
The second acceleration component 103 is configured to obtain the multiple ciphertext data from the first storage component 101; for any feature, perform bucketing processing on the multiple ciphertext data to obtain multiple bucket results; and store the multiple bucket results in the first storage component 101;
The first acceleration component 102 is configured to obtain the multiple bucket results from the first storage component 101; perform calculation processing on the ciphertext data in the same bucket result to obtain a ciphertext processing result; and store the ciphertext processing results respectively corresponding to the multiple bucket results in the first storage component 101;
The first storage component 101 is further configured to provide the ciphertext processing results respectively corresponding to the multiple bucket results to the first host processing component 100.
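The division of work among the four components can be sketched in software as follows. This is only an illustrative model, not the device itself: the dictionary `storage`, the function names, the per-object bucket labels, and the use of plain integers with `sum` in place of real ciphertexts and ciphertext accumulation are all assumptions made for the sketch.

```python
from collections import defaultdict

# Stand-in for the first storage component 101 (hypothetical software model).
storage = {}

def host_write(ciphertexts, labels_by_feature):
    """Host processing component 100: transfers the ciphertexts once."""
    storage["ciphertexts"] = ciphertexts
    storage["labels"] = labels_by_feature  # hypothetical per-object bucket labels

def second_acceleration_component():
    """Buckets the same stored ciphertexts once per feature."""
    results = {}
    for feature, labels in storage["labels"].items():
        buckets = defaultdict(list)
        for label, ct in zip(labels, storage["ciphertexts"]):
            buckets[label].append(ct)
        results[feature] = dict(buckets)
    storage["bucket_results"] = results

def first_acceleration_component(combine):
    """Reduces each bucket result to one processing result."""
    storage["processing_results"] = {
        feature: {label: combine(cts) for label, cts in buckets.items()}
        for feature, buckets in storage["bucket_results"].items()
    }

# Integers stand in for ciphertexts; sum stands in for ciphertext accumulation.
host_write([3, 5, 7, 11], {"age": [0, 1, 1, 0], "income": [0, 0, 1, 1]})
second_acceleration_component()
first_acceleration_component(sum)
print(storage["processing_results"])
```

Note that `host_write` is called once, yet both features are bucketed from the same stored ciphertexts, which is the property the embodiment relies on to reduce I/O.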
Each object may correspond to one ciphertext data item, so multiple objects correspond to multiple ciphertext data. The ciphertext data may be obtained by encrypting target data with a homomorphic encryption algorithm. In a practical application such as a multi-party joint modeling scenario, the ciphertext data may be ciphertext gradient information, obtained by the data initiator by encrypting gradient data with a homomorphic encryption algorithm.
The first host processing component 100 may transfer the multiple ciphertext data, corresponding to multiple objects and sent by the data initiator, to the first storage component 101 in the acceleration device for storage.
The first host processing component 100 may send corresponding indication information to the first storage component 101, the first acceleration component 102, and the second acceleration component 103 to start or trigger each component to perform its operations. For example, after storing the multiple ciphertext data in the first storage component 101, the first host processing component 100 may send indication information to the second acceleration component 103, which then obtains the multiple ciphertext data from the first storage component 101 based on that indication information. Alternatively, after receiving the multiple ciphertext data sent by the data initiator, the first host processing component 100 may notify the first storage component 101, the first acceleration component 102, and the second acceleration component 103 to start, after which these components may trigger their respective operations in real time or periodically.
The second acceleration component 103 is responsible for the bucketing operations for each feature owned by the data receiver. For each feature, it buckets the multiple ciphertext data to obtain the multiple bucket results corresponding to that feature, and then stores the bucket results corresponding to the different features in the first storage component 101. The second acceleration component 103 may also send a bucketing-complete notification to the first host processing component 100; on receiving it, the first host processing component 100 may notify the first acceleration component 102 to obtain the multiple bucket results and perform the calculation processing.
Optionally, to improve processing efficiency, the second acceleration component 103 may work in parallel, bucketing the multiple ciphertext data for multiple features at the same time. The multiple features may be communicated by the first host processing component 100, which may divide the features to be processed into multiple groups, each containing multiple features, and issue the next group only after the bucketing operations for all features in the current group have finished.
After obtaining the multiple bucket results from the first storage component 101, the first acceleration component 102 performs calculation processing on the ciphertext data in the same bucket result to obtain a ciphertext processing result, and stores the ciphertext processing results respectively corresponding to the multiple bucket results in the first storage component 101. Optionally, the first acceleration component 102 performs this calculation according to a target calculation processing mode: the target calculation processing mode determines a corresponding operation method, and the calculation is carried out using that operation method.
The target calculation processing mode or the operation method may be communicated to the first acceleration component 102 by the first host processing component 100.
The target calculation processing mode may include, for example, ciphertext accumulation, and may also include ciphertext multiplication, ciphertext subtraction, and so on. In a multi-party joint modeling scenario, the target calculation processing mode is specifically ciphertext accumulation.
The operation method corresponding to ciphertext accumulation may be point addition. For example, in ECC (Elliptic Curve Cryptography), accumulating ciphertexts translates into point addition of two points on the elliptic curve. In a homomorphic encryption algorithm based on elliptic curves, point addition is in turn carried out as arithmetic operations such as modular addition and modular multiplication.
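To make the reduction to modular arithmetic concrete, the standard affine point addition formulas are given below. The source does not state which curve form or coordinate system is used; a short Weierstrass curve $y^2 = x^3 + ax + b$ over a prime field $\mathbb{F}_p$ with affine coordinates is assumed here purely for illustration.

```latex
% Assumed setting: y^2 = x^3 + ax + b over F_p, affine coordinates.
% Adding P_1 = (x_1, y_1) and P_2 = (x_2, y_2), with P_1 \neq -P_2:
\lambda =
\begin{cases}
  (y_2 - y_1)\,(x_2 - x_1)^{-1} \bmod p, & P_1 \neq P_2 \\[4pt]
  (3x_1^2 + a)\,(2y_1)^{-1} \bmod p,     & P_1 = P_2
\end{cases}
\qquad
\begin{aligned}
  x_3 &= \lambda^2 - x_1 - x_2 \bmod p \\
  y_3 &= \lambda\,(x_1 - x_3) - y_1 \bmod p
\end{aligned}
```

Each point addition therefore costs a handful of modular additions, subtractions, and multiplications plus one modular inversion, which is the kind of arithmetic a dedicated computing unit can pipeline.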
After storing the ciphertext processing results respectively corresponding to the multiple bucket results, the first storage component 101 may notify the first host processing component 100, so that the first host processing component 100 can obtain those ciphertext processing results from the first storage component 101. The first storage component may be implemented, for example, as a high-bandwidth external memory.
The first host processing component 100 may send the ciphertext processing results respectively corresponding to the multiple bucket results to the data initiator for subsequent operations. For example, the data initiator may first decrypt them to obtain the plaintext processing results for the multiple bucket results of each feature, and then process the plaintext results according to the target calculation processing mode; alternatively, it may first process the ciphertext processing results for the multiple bucket results of each feature according to the target calculation processing mode, and then decrypt the outcome.
With the acceleration device provided in this embodiment, both the bucketing operations and the calculation processing operations are performed by the acceleration device. The host processing component transfers the ciphertext data only once, and that data is shared by multiple features for bucketing. This reduces the computational load on the host processing component, improves processing efficiency, and reduces I/O overhead, preserving the acceleration performance of the acceleration device.
In some embodiments, as shown in FIG. 1, the acceleration device may further include a bus interface 104, through which it can be attached to the first computing device so that the first acceleration component 102, the second acceleration component 103, and the first storage component 101 are connected through the bus to the first host processing component 100 in the first computing device. The bus interface 104 allows the acceleration device to be installed in the first computing device in a pluggable manner.
In some embodiments, as shown in FIG. 1, the acceleration device may further include a substrate 105. The first storage component 101, the first acceleration component 102, and the second acceleration component 103 are soldered onto the substrate 105, which provides the electrical connections between the first storage component 101 and each of the first acceleration component 102 and the second acceleration component 103.
Bucketing the multiple ciphertext data divides them into multiple data intervals. Each data interval acts like a bucket, and the ciphertext data contained in one data interval constitutes one bucket result.
In one optional approach, the second acceleration component 103 performing bucketing processing on the multiple ciphertext data for any feature to obtain multiple bucket results may include: for any feature, bucketing the multiple ciphertext data according to at least one feature value corresponding to that feature, to obtain multiple bucket results.
The bucketing operation may first divide the multiple objects according to the at least one feature value and then, following that division, divide the ciphertext data corresponding to the objects, so that the ciphertext data of objects falling in the same feature value interval is placed in the same bucket result.
For example, where the objects are users and the feature is age, with feature values of 10, 20, and 30, these three values divide age into four intervals: 0 to 10, 10 to 20, 20 to 30, and 30 to ∞ (infinity). The users are first assigned to these age intervals, and the ciphertext data of users falling in the same age interval is placed in the same bucket, yielding multiple bucket results.
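The age example can be sketched as follows. The function name and the placeholder ciphertext strings are hypothetical, and the boundary convention is an assumption: the text does not say which interval a value equal to a threshold (for example, exactly 10) belongs to, so the sketch places it in the upper interval.

```python
from bisect import bisect_right
from collections import defaultdict

def bucket_by_thresholds(feature_values, ciphertexts, thresholds):
    """Group ciphertexts by the interval their object's feature value falls in.

    `thresholds` must be sorted; k thresholds yield k + 1 intervals.
    Assumed convention: intervals are [low, high), so a value equal to a
    threshold goes to the upper interval.
    """
    buckets = defaultdict(list)
    for value, ciphertext in zip(feature_values, ciphertexts):
        interval_index = bisect_right(thresholds, value)
        buckets[interval_index].append(ciphertext)
    return dict(buckets)

ages = [5, 10, 17, 25, 30, 64]
ciphertexts = ["ct0", "ct1", "ct2", "ct3", "ct4", "ct5"]  # opaque placeholders
result = bucket_by_thresholds(ages, ciphertexts, [10, 20, 30])
print(result)
# {0: ['ct0'], 1: ['ct1', 'ct2'], 2: ['ct3'], 3: ['ct4', 'ct5']}
```

The ciphertexts are never inspected; only the plaintext feature values held by the data receiver decide the bucket, which is why bucketing can run directly on encrypted data.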
The at least one feature value corresponding to each feature may be stored in the first storage component 101 by the first host processing component 100 and obtained from the first storage component 101 by the second acceleration component 103. Alternatively, because the amount of data is small, the first host processing component 100 may send the at least one feature value corresponding to each feature directly to the second acceleration component 103.
As another optional approach, the first storage component 101 is further configured to store bucket information, sent by the first host processing component 100, of the multiple objects for each of the different features;
The second acceleration component 103 performing bucketing processing on the multiple ciphertext data for any feature to obtain multiple bucket results includes: for any feature, determining the bucket information of each of the multiple objects for that feature; and placing the ciphertext data of the objects that share the same bucket information into the same bucket, to obtain multiple bucket results.
The bucket information may be, for example, a bucket identifier that uniquely identifies a bucket. It may take the form of any one or more characters (such as a combination of digits, letters, and so on), which is not limited in this application. The bucket information of the multiple objects for the different features may be determined by the first host processing component 100.
After obtaining the ciphertext data corresponding to each object, the first host processing component 100 may take the multiple features owned by the data receiver and, for each feature, divide the multiple objects according to the at least one feature value corresponding to that feature. This determines the feature value interval in which each object lies. Objects in the same feature value interval are given the same bucket information, and different feature value intervals have different bucket information. The first host processing component 100 may store the bucket information of each object for each feature in the first storage component 101, from which the second acceleration component 103 can obtain it. Alternatively, because the amount of bucket information is small, the first host processing component 100 may send the bucket information of the multiple objects for the different features directly to the second acceleration component 103.
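A minimal sketch of this split of responsibilities is shown below: the host derives a bucket identifier per object and per feature, and the accelerator only groups ciphertexts by identifier. The function names, the identifier format (`"age-0"` and so on), and the sample values are hypothetical, and integer thresholds are assumed to define the feature value intervals.

```python
from bisect import bisect_right
from collections import defaultdict

def assign_bucket_ids(feature_name, values, thresholds):
    """Host side: label each object with a bucket identifier for one feature."""
    return [f"{feature_name}-{bisect_right(thresholds, v)}" for v in values]

def group_by_bucket_id(bucket_ids, ciphertexts):
    """Accelerator side: group ciphertexts whose objects share an identifier."""
    buckets = defaultdict(list)
    for bucket_id, ciphertext in zip(bucket_ids, ciphertexts):
        buckets[bucket_id].append(ciphertext)
    return dict(buckets)

ciphertexts = ["ct0", "ct1", "ct2", "ct3"]  # opaque placeholders, stored once
age_ids = assign_bucket_ids("age", [8, 25, 27, 41], [10, 20, 30])
income_ids = assign_bucket_ids("income", [900, 300, 950, 100], [500])

age_buckets = group_by_bucket_id(age_ids, ciphertexts)
income_buckets = group_by_bucket_id(income_ids, ciphertexts)
print(age_buckets)
print(income_buckets)
```

With this approach the accelerator never sees raw feature values, only short identifiers, which keeps the per-feature data sent by the host small.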
In some embodiments, as shown in FIG. 2, the second acceleration component 103 may include a data loading unit 201, multiple bucketing units 202, and a data storage unit 203.
The data loading unit 201 is configured to obtain the multiple ciphertext data from the first storage component 101 and provide them to each of the multiple bucketing units; and to assign features to be processed to the multiple bucketing units and control the bucketing units to process their assigned features in parallel;
The bucketing unit 202 is configured to bucket the multiple ciphertext data for the feature assigned to it to obtain multiple bucket results, and to send the multiple bucket results to the data storage unit 203;
The data storage unit 203 is configured to store the multiple bucket results sent by each bucketing unit in the first storage component 101. The data storage unit may be implemented, for example, using RAM (Random Access Memory).
The multiple bucketing units 202 enable multiple features to be processed in parallel. Each bucketing unit 202 may be assigned at least one feature; where a unit is assigned more than one feature, it performs the bucketing operation for each of them in turn, in a serial manner.
Optionally, each bucketing unit 202 may be assigned one feature. The first host processing component 100 may determine, from the number of bucketing units 202, the number of features to process in parallel at one time; this number may be less than or equal to the number of units. In one approach, the first host processing component 100 selects at least one feature according to this number and provides the bucket information of the multiple objects for the selected features to the acceleration device. The data loading unit 201 then assigns the bucket information of each selected feature to a respective bucketing unit 202, so that each bucketing unit 202 receives the bucket information of one feature and, for that feature, places the ciphertext data of the objects sharing the same bucket information into the same bucket result. In another approach, the first host processing component 100 selects at least one feature according to this number and provides the at least one feature value of each selected feature to the acceleration device. The data loading unit 201 then assigns the feature values of each selected feature to a respective bucketing unit 202, so that each bucketing unit 202 receives the at least one feature value of one feature and, for that feature, buckets the multiple ciphertext data according to those feature values.
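The group-by-group scheduling described above can be modelled in software roughly as follows. A thread pool stands in for the hardware bucketing units; `split_into_groups`, `run_groups`, and the callable `bucket_one_feature` are hypothetical names introduced for the sketch, not part of the source.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_groups(features, unit_count):
    """Split features into groups of at most `unit_count` (one per unit)."""
    if unit_count < 1:
        raise ValueError("unit_count must be at least 1")
    return [features[i:i + unit_count] for i in range(0, len(features), unit_count)]

def run_groups(features, unit_count, bucket_one_feature):
    """Process one group at a time; features within a group run in parallel."""
    results = {}
    with ThreadPoolExecutor(max_workers=unit_count) as pool:
        for group in split_into_groups(features, unit_count):
            # Consuming the map iterator waits for the whole group to finish
            # before the next group is issued.
            for feature, outcome in zip(group, pool.map(bucket_one_feature, group)):
                results[feature] = outcome
    return results

features = ["f1", "f2", "f3", "f4", "f5"]
groups = split_into_groups(features, 2)
# A trivial stand-in for the real per-feature bucketing work.
results = run_groups(features, 2, lambda name: name.upper())
print(groups)
print(results)
```

The last group may contain fewer features than there are units, which matches the statement that the number of features processed at one time may be less than or equal to the number of units.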
Because the calculation processing that the first acceleration component 102 performs on the ciphertext data is also computationally heavy, the first acceleration component 102 may include at least one first acceleration unit in order to further improve processing efficiency and acceleration performance;
Each first acceleration unit may be configured to obtain at least one bucket result from the first storage component 101; for any bucket result, perform calculation processing on the multiple ciphertext data in that bucket result according to the target calculation processing mode to obtain a ciphertext processing result; and store the ciphertext processing result corresponding to that bucket result in the first storage component 101.
Optionally, the first acceleration component 102 may be provided with multiple first acceleration units, which increases parallel processing capability and thereby improves processing efficiency and acceleration performance.
In some embodiments, as shown in FIG. 3, each first acceleration unit may include a first control unit 301 and multiple first computing units 302.
The first control unit 301 is configured to obtain at least one bucket result from the first storage component 101, and to dispatch the at least one bucket result to at least one first computing unit 302;
The first computing unit 302 is configured to, for any bucket result dispatched to it, perform calculation processing on the multiple ciphertext data in that bucket result according to the target calculation processing mode to obtain a ciphertext processing result;
The first control unit 301 is further configured to store the ciphertext processing result corresponding to any bucket result in the first storage component 101.
The multiple first computing units 302 allow multiple bucket results to be processed in parallel, which improves processing efficiency and further safeguards acceleration performance.
In some embodiments, as shown in FIG. 3, each first acceleration unit 300 may further include a first storage unit 303;
The first computing unit 302 may further be configured to save the ciphertext processing result corresponding to any bucket result to the first storage unit 303;
The first control unit 301 storing the ciphertext processing result corresponding to any bucket result in the first storage component may be: storing the ciphertext processing result corresponding to that bucket result, as held in the first storage unit 303, in the first storage component 101.
In some embodiments, as shown in FIG. 3, each first acceleration unit 300 may further include a first loading unit 304.
The first control unit 301 obtaining at least one bucket result from the first storage component 101 may specifically be: controlling the first loading unit 304 to obtain the at least one bucket result from the first storage component 101.
Optionally, the first control unit 301 may perform its operations as directed by the first host processing component 100. Accordingly, in some embodiments, the first control unit 301 may further be configured to receive first control information sent by the first host processing component 100 and to control the operation of the multiple first computing units 302, the first storage unit 303, and the first loading unit 304 according to the first control information.
The first control information may include a first total data amount of the at least one bucket result to be obtained by the first acceleration unit 300, and a second total data amount of the results produced by performing calculation processing on the at least one bucket result. It may further include a first storage address of the at least one bucket result to be obtained, and a second storage address for the at least one ciphertext processing result obtained from the calculation processing. Accordingly, the first control unit 301 may obtain the at least one bucket result from the first storage component 101 according to the first total data amount and the first storage address, specifically by controlling the first loading unit 304 to do so, and may control the first storage unit 303 to store the at least one ciphertext processing result in the first storage component 101 according to the second total data amount and the second storage address.
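One way to picture the first control information is as a small descriptor that tells the control unit where to read and where to write. The sketch below is an assumption throughout: the source does not define a layout, so the class name, field names, byte-based sizes, and the `bytearray` standing in for the first storage component 101 are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FirstControlInfo:
    """Hypothetical descriptor; field names and units are assumptions."""
    first_total_bytes: int    # total size of the bucket results to load
    first_address: int        # where the bucket results start
    second_total_bytes: int   # total size of the processing results to store
    second_address: int       # where the processing results are written
    mode: str = "ciphertext_accumulation"

def load_bucket_results(memory: bytearray, info: FirstControlInfo) -> bytes:
    """Models the first loading unit 304 reading from storage."""
    end = info.first_address + info.first_total_bytes
    return bytes(memory[info.first_address:end])

def store_processing_results(memory: bytearray, info: FirstControlInfo, payload: bytes) -> None:
    """Models writing results back; rejects payloads of unexpected size."""
    if len(payload) != info.second_total_bytes:
        raise ValueError("payload size does not match the second total data amount")
    memory[info.second_address:info.second_address + len(payload)] = payload

memory = bytearray(16)            # stand-in for the first storage component 101
memory[0:4] = b"abcd"             # pretend these bytes are bucket results
info = FirstControlInfo(first_total_bytes=4, first_address=0,
                        second_total_bytes=2, second_address=8)
loaded = load_bucket_results(memory, info)
store_processing_results(memory, info, b"xy")
print(loaded, bytes(memory[8:10]))
```

Carrying both sizes and both addresses in one message lets the host configure a whole load, compute, and store cycle with a single control transfer.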
In addition, the first control information may include the target calculation processing mode, or the operation method corresponding to the target calculation processing mode. The first control unit 301 may notify the first computing unit 302 of the corresponding operation method according to the first control information.
The first computing unit 302 performing calculation processing on the multiple ciphertext data in any bucket result dispatched to it to obtain a ciphertext processing result includes: for any bucket result dispatched to it, performing calculation processing on the multiple ciphertext data in that bucket result according to the operation method, to obtain the ciphertext processing result.
In one or more of the above embodiments, the operation method corresponding to each target calculation processing mode may be preconfigured with one or more operation instructions, and the calculation processing of the multiple ciphertext data in each bucket result is carried out by executing those operation instructions.
In practical applications, each first computing unit 302 may be implemented as a programmable processor (PC) that stores instructions and executes them to perform the corresponding operations. In some embodiments, as shown in FIG. 4, the first computing unit 302 may include a first storage subunit 401, a first parsing subunit 402, a first control subunit 403, and a first calculation subunit 404;
The first storage subunit 401 is configured to store the one or more operation instructions corresponding to the target calculation processing mode;
The first parsing subunit 402 is configured to parse the one or more operation instructions;
The first control subunit 403 is configured to send calculation indication information to the first calculation subunit 404 based on the parsing result of the first parsing subunit 402;
The first calculation subunit 404 is configured to perform calculation processing on the multiple ciphertext data based on the calculation indication information to obtain the ciphertext processing result.
Once parsed, the one or more operation instructions are converted into the corresponding calculation indication information, which controls the operation of the first calculation subunit.
The first storage subunit may be implemented, for example, using RAM.
In practical applications, the target calculation processing mode may be ciphertext accumulation, with point addition as the corresponding operation method. For example, the ciphertext data may be encrypted with an elliptic-curve-based homomorphic encryption algorithm such as the EC-ElGamal semi-homomorphic encryption algorithm. EC-ElGamal is a form of ECC that ports ElGamal to elliptic curves; its main computations are elliptic curve point addition, point subtraction, point multiplication, modular inversion, and discrete logarithm. ElGamal itself is an asymmetric encryption algorithm based on Diffie-Hellman key exchange.
Taking the EC-ElGamal semi-homomorphic encryption algorithm as an example, the encryption formula is:
$$\mathrm{Enc}(P, m) = (C_1 = kG,\; C_2 = kP + mG)$$
Here, P denotes the public key, which is a point on the elliptic curve; G is the base point of the elliptic curve; k is a random number; m is the plaintext data to be encrypted, that is, the target data; and Enc(P, m) denotes the resulting ciphertext, which consists of the pair of points $C_1$ and $C_2$.
The ciphertext addition formula is:
$$\mathrm{Enc}(P, m_1) + \mathrm{Enc}(P, m_2) = \bigl(k_1 G + k_2 G,\; (k_1 P + m_1 G) + (k_2 P + m_2 G)\bigr)$$
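Regrouping the terms makes the additive homomorphism explicit. The step below uses only the commutativity of point addition and the distributivity of scalar multiplication over it; the symbols are those of the formulas above.

```latex
\begin{aligned}
\mathrm{Enc}(P, m_1) + \mathrm{Enc}(P, m_2)
  &= \bigl(k_1 G + k_2 G,\; (k_1 P + m_1 G) + (k_2 P + m_2 G)\bigr) \\
  &= \bigl((k_1 + k_2)\,G,\; (k_1 + k_2)\,P + (m_1 + m_2)\,G\bigr)
\end{aligned}
```

The result has exactly the form of an encryption of $m_1 + m_2$ under the random number $k_1 + k_2$, so summing the ciphertexts in a bucket yields a valid ciphertext of the sum of the underlying plaintexts.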
The decryption formula is:
$$M = C_2 - sC_1 = mG$$
Here, M denotes the decryption result and s denotes the private key. The public key is the private key multiplied by the base point, that is, $P = sG$, so $sC_1 = s \cdot kG = kP$ and therefore $C_2 - sC_1 = mG$.
可见,加密实质上需要椭圆曲线上点乘和两个椭圆曲线上的点乘结果相加(点加),密文相加实质上是椭圆曲线的点加,而解密则需要椭圆曲线上的点乘。点乘运算本质上行由标量以及点构成,例如点乘运算kP包括标量k以及点P;点乘运算mG包括标量m和点G。It can be seen that encryption essentially requires point multiplication on an elliptic curve and the addition of the point multiplication results on two elliptic curves (point addition). Ciphertext addition is essentially point addition on an elliptic curve, while decryption requires point multiplication on an elliptic curve. Point multiplication operations essentially consist of scalars and points. For example, the point multiplication operation kP includes the scalar k and the point P; the point multiplication operation mG includes the scalar m and the point G.
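The encryption, ciphertext addition, and decryption formulas can be exercised end to end with a minimal Python sketch. It is only an illustration under stated assumptions: the curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with base point $(5, 1)$ of order 19 is a textbook toy curve chosen here, not one named in the application; the keys and random numbers are fixed for reproducibility; and the plaintext is recovered from $mG$ by a brute-force discrete logarithm, which is only feasible for small $m$.

```python
# Toy EC-ElGamal on the textbook curve y^2 = x^3 + 2x + 2 over F_17.
# Curve, base point and keys are illustrative assumptions, far too small to be secure.
P_MOD, A = 17, 2
G, ORDER = (5, 1), 19          # base point G and its prime order
INF = None                     # point at infinity (group identity)

def point_add(p, q):
    """Elliptic curve point addition in affine coordinates."""
    if p is INF:
        return q
    if q is INF:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF             # p + (-p)
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def point_neg(p):
    """Negate a point, used for point subtraction."""
    return INF if p is INF else (p[0], -p[1] % P_MOD)

def point_mul(k, p):
    """Point multiplication k*p by double-and-add."""
    result = INF
    k %= ORDER
    while k:
        if k & 1:
            result = point_add(result, p)
        p = point_add(p, p)
        k >>= 1
    return result

def encrypt(pub, m, k):
    """Enc(P, m) = (C1 = kG, C2 = kP + mG)."""
    return (point_mul(k, G), point_add(point_mul(k, pub), point_mul(m, G)))

def add_ciphertexts(c, d):
    """Ciphertext addition is component-wise point addition."""
    return (point_add(c[0], d[0]), point_add(c[1], d[1]))

def decrypt(s, c):
    """M = C2 - s*C1 = mG, then recover m by a brute-force discrete log."""
    big_m = point_add(c[1], point_neg(point_mul(s, c[0])))
    return next(m for m in range(ORDER) if point_mul(m, G) == big_m)

s = 7                              # private key (fixed for the example)
pub = point_mul(s, G)              # public key P = sG
c_sum = add_ciphertexts(encrypt(pub, 3, k=5), encrypt(pub, 4, k=11))
total = decrypt(s, c_sum)          # plaintext sum of the two messages
```

In this sketch only `point_add` is needed on the side that accumulates ciphertexts, while `point_mul` is needed for encryption and decryption, which mirrors the division of operations described above.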
Ciphertext accumulation means adding multiple ciphertext data together. In some embodiments, the first calculation subunit 404 performing calculation processing on the multiple ciphertext data based on the calculation indication information to obtain the ciphertext processing result may be: sequentially obtaining one ciphertext data from the multiple ciphertext data and performing a point addition operation on it and the previous point addition result; determining whether the current number of accumulations has reached the preset number of times; if so, outputting the last point addition result as the ciphertext processing result; if not, saving the point addition result to the first storage subunit 401. The first control unit may provide the multiple ciphertext data to the first calculation subunit in the form of an input data stream.
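The accumulation loop described in this paragraph can be sketched as follows. This is a minimal software illustration, not the hardware design: the function name `accumulate`, the variable `stored` (standing in for the intermediate result held in the first storage subunit), and the `toy_add` stand-in for elliptic curve point addition are all assumptions introduced here.

```python
def accumulate(stream, preset_count, point_add):
    """Add ciphertexts one at a time until the preset number of accumulations is reached."""
    stored = None                       # previous point addition result (storage stand-in)
    for count, ciphertext in enumerate(stream, start=1):
        # The first ciphertext has no previous result to be added to.
        result = ciphertext if stored is None else point_add(ciphertext, stored)
        if count >= preset_count:
            return result               # output as the ciphertext processing result
        stored = result                 # otherwise save the intermediate result
    raise ValueError("input stream ended before the preset count was reached")

# Stand-in for point addition: component-wise integer addition of a (C1, C2) pair.
toy_add = lambda c, d: (c[0] + d[0], c[1] + d[1])
example = accumulate(iter([(1, 2), (3, 4), (5, 6)]), preset_count=3, point_add=toy_add)
```

Passing the addition as a parameter keeps the control flow separate from the arithmetic, so a real point addition routine could be substituted for `toy_add` without changing the loop.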
In one implementation, the first storage subunit 401 may include a first instruction storage subunit, a first data storage subunit, and a first number storage subunit.
The first instruction storage subunit is used to store the one or more operation instructions; the first data storage subunit is used to store intermediate results of the calculation processing, such as the previous point addition result; and the first number storage subunit is used to store the preset number of times and the like.
In addition, where the target computing processing mode involves a point multiplication operation, the first storage subunit may further include a first scalar storage subunit for storing scalar data.
In a practical application, the operation of the first operation unit 302 may be as shown in FIG. 5. The first instruction storage subunit, first parsing subunit, first control subunit, first data storage subunit, first calculation subunit, first number storage subunit, and first scalar storage subunit shown in FIG. 5 have been described in detail above and are not described again here. As shown in FIG. 5, the first calculation subunit may have basic calculation logic with a first input A, a second input B, a third input C, and a fourth input D. The first input A may come from the input data stream or from the first data storage subunit; the second input B, the third input C, and the fourth input D may come from the first data storage subunit; and any input may be empty. Taking the point addition operation corresponding to ciphertext accumulation as an example: ciphertext data obtained from the input data stream enters the first input A, the previous point addition result serves as the second input B, and the third input C and the fourth input D may be empty. The first calculation subunit performs a point addition operation on the first input A and the second input B to obtain a point addition result, which is either stored in the first data storage subunit or output as the ciphertext processing result.
In practical applications of the embodiments of the present application, the first computing device may be the computing device of a data receiver that is responsible for performing calculation processing on multiple ciphertext data. The data initiator corresponds to a second computing device, which is used to transmit the ciphertext data respectively corresponding to multiple objects to the first computing device.
As shown in FIG. 6a, an embodiment of the present application further provides a computing system, which may include a first computing device 60 and a second computing device 70.
The first computing device 60 may include a first host processing component 100 and a first acceleration device 601. For the specific structure of the first acceleration device 601, see any of the embodiments shown in FIG. 1 to FIG. 5 above; it is not described again here.
The second computing device 70 may include a second host processing component 700 and a second acceleration device 602.
That is, the second computing device 70 may also be configured with a second acceleration device 602 for accelerating encryption or decryption operations. The second acceleration device 602 can therefore be used to obtain multiple to-be-processed data and, for any to-be-processed data, encrypt or decrypt it to obtain a calculation processing result.
The to-be-processed data may be target data to be encrypted or a ciphertext processing result to be decrypted; correspondingly, the calculation processing result may be ciphertext data or a plaintext processing result.
For ciphertext data, the second host processing component 700 may obtain the ciphertext data respectively corresponding to multiple objects from the second acceleration device 602 and send it to the first computing device 60.
For a plaintext processing result, the second host processing component 700 may obtain the plaintext processing result from the second acceleration device 602 and perform subsequent processing operations.
For example, in a practical application, the technical solution of the embodiments of the present application can be applied to multi-party joint modeling using vertical federated learning. In the interaction diagram shown in FIG. 6b, the second acceleration device 602 in the second computing device 70 of the data initiator first encrypts the gradient information corresponding to different sample objects to obtain ciphertext gradient information of multiple sample objects. The gradient information is calculated with a decision tree model from the feature values and label data of the sample objects provided by the data initiator.
The second acceleration device 602 then sends the ciphertext gradient information of the multiple sample objects to the second host processing component 700, and the second host processing component 700 sends it to the first computing device 60 of the data receiver.
After receiving the ciphertext gradient information of the multiple sample objects, the first host processing component 100 in the first computing device 60 may send it to the first acceleration device 601. The first acceleration device 601 may first bucket the ciphertext gradient information of the multiple sample objects according to different features to obtain multiple bucketing results for each feature, then use the technical solution of the present application to calculate the ciphertext gradient accumulated value corresponding to each bucketing result of each feature, and send these ciphertext gradient accumulated values to the first host processing component 100. The first host processing component 100 then sends the ciphertext gradient accumulated values corresponding to the bucketing results of each feature to the second computing device 70 of the data initiator.
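The bucketing and per-bucket accumulation performed on the data receiver side can be sketched as below for a single feature. This is an illustration under assumptions: the application does not specify how bucket boundaries are chosen, so the sorted `split_edges` list is hypothetical, and `ct_add` is a stand-in for ciphertext point addition (plain integer addition in the example).

```python
from bisect import bisect_right
from collections import defaultdict

def bucket_and_accumulate(feature_values, ciphertexts, split_edges, ct_add):
    """Group ciphertext gradients by feature bucket, then accumulate within each bucket.

    feature_values[i] is sample i's value for one feature, ciphertexts[i] is that
    sample's encrypted gradient, split_edges are sorted (hypothetical) bucket boundaries.
    """
    buckets = defaultdict(list)
    for value, ct in zip(feature_values, ciphertexts):
        buckets[bisect_right(split_edges, value)].append(ct)    # bucketing step
    sums = {}
    for bucket_id, members in sorted(buckets.items()):
        acc = members[0]
        for ct in members[1:]:
            acc = ct_add(acc, ct)                               # ciphertext accumulation
        sums[bucket_id] = acc
    return sums

# Integers stand in for ciphertexts so that the grouping is easy to follow.
bucket_sums = bucket_and_accumulate(
    feature_values=[0.1, 0.5, 0.9, 0.4],
    ciphertexts=[1, 2, 3, 4],
    split_edges=[0.3, 0.6],
    ct_add=lambda a, b: a + b,
)
```

The feature values are only used to choose a bucket; the gradients themselves stay encrypted throughout, so the data receiver never sees them in plaintext.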
After receiving the ciphertext gradient accumulated values corresponding to the bucketing results of each feature, the second host processing component 700 in the second computing device 70 may send them to the second acceleration device 602.
The second acceleration device 602 may decrypt them to obtain the gradient accumulated value corresponding to each bucketing result of each feature, and then add up the gradient accumulated values of the bucketing results of each feature to obtain the gradient accumulated value for that feature. Alternatively, the ciphertext gradient accumulated values corresponding to the bucketing results may first be added up to obtain the ciphertext gradient accumulated value for the feature, which is then decrypted to obtain the gradient accumulated value for that feature.
The second acceleration device 602 may send the gradient accumulated values respectively corresponding to the multiple features to the second host processing component 700.
The second host processing component 700 may determine the optimal split point of the decision tree model based on the gradient accumulated values corresponding to the multiple features. The decision tree model can then be constructed according to the optimal split point.
The decision tree model may be an XGBoost (eXtreme Gradient Boosting) model or the like. It may also be another type of decision tree model, such as GBDT (Gradient Boosting Decision Tree) or GBM (Gradient Boosting Machine).
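As one illustration of how decrypted gradient accumulated values can yield a split point, the sketch below applies the standard XGBoost split gain, $\tfrac{1}{2}\bigl[\tfrac{G_L^2}{H_L+\lambda} + \tfrac{G_R^2}{H_R+\lambda} - \tfrac{(G_L+G_R)^2}{H_L+H_R+\lambda}\bigr]$, to per-bucket sums of first-order and second-order gradients. The application does not state this formula; it is assumed here because XGBoost is named as a candidate model. The function name `best_split`, the parameter `reg_lambda`, and the omission of the complexity penalty term are choices made for this example.

```python
def best_split(bucket_g, bucket_h, reg_lambda=1.0):
    """Return (gain, index) of the best split between ordered buckets.

    bucket_g / bucket_h hold the decrypted first- and second-order gradient
    sums per bucket. Splitting at index i sends buckets [0..i] to the left child.
    """
    g_total, h_total = sum(bucket_g), sum(bucket_h)

    def score(g, h):
        return g * g / (h + reg_lambda)

    best = (float("-inf"), None)
    g_left = h_left = 0.0
    for i in range(len(bucket_g) - 1):          # the last bucket cannot be a split
        g_left += bucket_g[i]
        h_left += bucket_h[i]
        gain = 0.5 * (score(g_left, h_left)
                      + score(g_total - g_left, h_total - h_left)
                      - score(g_total, h_total))
        best = max(best, (gain, i))
    return best

gain, index = best_split([-2.0, -1.0, 3.0], [1.0, 1.0, 1.0])
```

Because the gain depends only on bucket-level sums, the data initiator needs nothing more than the accumulated values returned by the data receiver to evaluate every candidate split.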
The gradient information may include a first-order gradient and a second-order gradient for each sample object, obtained by differentiating the loss function of the decision tree model. Inputting the feature values of a sample object into the decision tree model yields prediction data; the loss function estimates the degree of inconsistency between the prediction data and the label data; and differentiating the loss function yields the first-order gradient and the second-order gradient.
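For a concrete case, the sketch below computes both gradients for the binary logistic loss, where differentiating with respect to the raw model score gives $g = p - y$ and $h = p(1 - p)$. The application does not name a particular loss function, so the logistic loss and the function name `logistic_gradients` are assumptions for illustration.

```python
import math

def logistic_gradients(raw_score, label):
    """First- and second-order gradients of the log loss w.r.t. the raw score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))    # predicted probability
    return p - label, p * (1.0 - p)           # (first-order g, second-order h)

g, h = logistic_gradients(raw_score=0.0, label=1)
```

These per-sample values are the target data that the data initiator would encrypt before sending them out.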
As can be seen from the above description, for the second computing device, where the to-be-processed data is target data to be encrypted, the target data may be the gradient information of the decision tree model calculated from the feature values and label data of the sample objects provided by the data initiator.
Where the to-be-processed data is a ciphertext processing result to be decrypted, it may be the ciphertext processing result calculated for any feature provided by the data receiver. The ciphertext processing result to be decrypted may be a ciphertext gradient accumulated value, either the one corresponding to each feature or the one corresponding to each bucketing result, and the calculation processing result obtained by decrypting it is the gradient accumulated value.
The first computing device 60 and the second computing device 70 are connected via a network, which provides the medium for the communication link between them. The network may include various connection types, such as wired links, wireless links, or optical fiber cables. Optionally, the wireless connection may be implemented via a mobile network, whose network standard may be any one of 2G (GSM), 2.5G (GPRS), 3G (WCDMA, TD-SCDMA, CDMA2000, UMTS), 4G (LTE), 4G+ (LTE+), 5G, WiMax, and the like. Optionally, the communication connection may also be established via Bluetooth, WiFi, infrared, or the like.
The first computing device 60 and the second computing device 70 may of course also include other components, such as input/output interfaces, display components, communication components that implement the above communication connection, and host storage components that store computer instructions for the host processing components to call and execute to implement the corresponding operations. These are not described further in the present application.
As shown in FIG. 7a, to further improve processing efficiency and acceleration performance, the second acceleration device may include a second storage component 701 and at least one third acceleration component 702; the second storage component 701 is connected to the second host processing component 700 via a bus;
The second storage component 701 is used to store multiple to-be-processed data sent by the second host processing component 700; the to-be-processed data is target data to be encrypted or a ciphertext processing result to be decrypted;
The third acceleration component 702 is used to obtain at least one to-be-processed data from the second storage component 701; for any to-be-processed data, encrypt or decrypt the to-be-processed data to obtain a calculation processing result; and store the calculation processing result in the second storage component 701;
The second host processing component 700 is used to obtain, from the second storage component 701, the calculation processing result corresponding to any to-be-processed data.
Optionally, the second acceleration device may be provided with multiple third acceleration components 702, which increases parallel processing capability and thus improves processing efficiency and acceleration performance.
In some embodiments, as shown in FIG. 7b, each third acceleration component 702 may include a second control unit 7021 and multiple second computing units 7022;
The second control unit 7021 is used to obtain at least one to-be-processed data from the second storage component 701 and dispatch the at least one to-be-processed data to at least one second computing unit 7022;
The second computing unit 7022 is used to, for any to-be-processed data dispatched to it, encrypt or decrypt the to-be-processed data to obtain a calculation processing result;
The second control unit 7021 is used to store the calculation processing result corresponding to any to-be-processed data in the second storage component 701.
In some embodiments, each third acceleration component 702 may further include a second storage unit 7023; the second computing unit 7022 is further used to save the calculation processing result corresponding to any to-be-processed data to the second storage unit 7023;
The second control unit 7021 storing the calculation processing result corresponding to any to-be-processed data in the second storage component 701 includes: storing, in the second storage component 701, the calculation processing result corresponding to any to-be-processed data that is held in the second storage unit 7023.
In some embodiments, as shown in FIG. 7b, each third acceleration component 702 may further include a second loading unit 7024. Specifically, the second control unit 7021 may control the second loading unit 7024 to obtain the at least one to-be-processed data from the second storage component 701.
In some embodiments, the second control unit 7021 is further used to receive second control information sent by the second host processing component 700 and to control the operation of the multiple second computing units 7022 and the second storage unit 7023 according to the second control information;
The second control unit 7021 is further used to notify the second computing unit 7022 of the corresponding operation mode according to the second control information; the operation modes corresponding to encryption are point addition and point multiplication, and the operation mode corresponding to decryption is point multiplication.
The second control information may include a first data total amount of the at least one to-be-processed data that the third acceleration component needs to obtain, and a second data total amount corresponding to the at least one to-be-processed data after calculation processing. It may further include a first storage address of the at least one to-be-processed data to be obtained, and a second storage address for the at least one calculation processing result obtained by processing the at least one to-be-processed data. The second control unit 7021 may thus obtain the at least one to-be-processed data from the second storage component 701 according to the first data total amount and the first storage address, and may control the second storage unit 7023 to store the at least one calculation processing result in the second storage component 701 according to the second data total amount and the second storage address. Specifically, the second control unit 7021 may control the second loading unit 7024 to obtain the at least one to-be-processed data from the second storage component 701 according to the first data total amount and the first storage address.
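The role of the two data totals and two storage addresses can be illustrated with a small software model. Everything in this sketch is hypothetical: the field names, the choice of bytes as the unit for the data totals, the `bytearray` standing in for the second storage component, and the `process` callback standing in for the second computing units are assumptions, since the application does not define a concrete layout.

```python
from dataclasses import dataclass

@dataclass
class ControlInfo:
    """Hypothetical layout of the second control information."""
    first_data_total: int      # amount of to-be-processed data to load (bytes assumed)
    first_address: int         # where that data starts in the storage component
    second_data_total: int     # amount of result data to write back (bytes assumed)
    second_address: int        # where the results are to be stored
    operation: str             # operation mode, e.g. "encrypt" or "decrypt"

def run_component(storage: bytearray, info: ControlInfo, process) -> None:
    """Load the input named by the control info, process it, and store the result."""
    start = info.first_address
    data = bytes(storage[start:start + info.first_data_total])   # loading unit
    result = process(info.operation, data)                       # computing units
    if len(result) != info.second_data_total:
        raise ValueError("result size does not match the second data total")
    out = info.second_address
    storage[out:out + len(result)] = result                      # write back

storage = bytearray(16)
storage[0:4] = b"abcd"
info = ControlInfo(4, 0, 4, 8, "encrypt")
run_component(storage, info, lambda op, data: data[::-1])        # dummy "encryption"
```

Carrying both addresses and both totals in one message lets the host describe a whole batch up front, after which the component can load, process, and write back without further host involvement.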
In addition, the second control information may also include the operation mode corresponding to encryption or decryption, and the second control unit 7021 may notify the second computing unit 7022 of the corresponding operation mode according to the second control information;
The second computing unit 7022 performing calculation processing on any to-be-processed data dispatched to it to obtain a calculation processing result includes: for any to-be-processed data dispatched to it, processing the to-be-processed data according to the operation mode to obtain the calculation processing result.
The operation mode corresponding to encryption or decryption may be pre-configured with one or more corresponding operation instructions, and the calculation processing of each to-be-processed data is carried out by executing the one or more operation instructions. In some embodiments, the second computing unit 7022 may in practice be implemented with a programmable processor (PC), which can store corresponding instructions to perform the corresponding operations. The second computing unit 7022 may include a second storage subunit, a second parsing subunit, a second calculation subunit, and a second control subunit;
The second storage subunit is used to store the one or more operation instructions corresponding to encryption or decryption;
The second parsing subunit is used to parse the one or more operation instructions;
The second control subunit is used to send calculation indication information to the second calculation subunit based on the parsing result of the second parsing subunit;
The second calculation subunit is used to perform calculation processing on the to-be-processed data based on the calculation indication information to obtain the calculation processing result.
After being parsed, the one or more operation instructions may be converted into corresponding calculation indication information to control the operation of the second calculation subunit.
The second storage subunit may be implemented with RAM or the like.
In one implementation, the second storage subunit may include a second instruction storage subunit, a second data storage subunit, and a second number storage subunit.
The second instruction storage subunit is used to store the one or more operation instructions; the second data storage subunit is used to store intermediate results of the calculation processing; and the second number storage subunit is used to store the preset number of times and the like.
In addition, since the encryption operation involves point multiplication, the second storage subunit may further include a second scalar storage subunit for storing scalar data.
It should be noted that the specific structure of the second computing unit may be the same as that of the first operation unit 302 described in the corresponding embodiments above. For the specific implementation, see the earlier explanation of the first operation unit; it is not described again here.
The technical solution of the embodiments of the present application improves the processing efficiency of the encryption or decryption operations in the second computing device and of the ciphertext accumulation operation in the first computing device, and reduces the computational load on the host processing components. This improves processing performance and acceleration performance and achieves efficient, high-performance joint data processing.
The first computing device and the second computing device may be physical machines, for example physical machines that provide cloud computing capabilities.
In addition, an embodiment of the present application further provides an acceleration method that can be applied to the acceleration device shown in FIG. 1. The acceleration device includes a first storage component, and a first acceleration component and a second acceleration component connected to the first storage component; the first storage component is connected to the first host processing component via a bus, and stores multiple ciphertext data corresponding to multiple objects sent by the first host processing component. For the specific structure of the acceleration device, see the corresponding embodiments; it is not described again here. The method may be executed by the second acceleration component in the acceleration device and, as shown in FIG. 8, may include the following steps:
801: Obtain multiple ciphertext data from the first storage component.
802: For any feature, perform bucketing processing on the multiple ciphertext data to obtain multiple bucketing results.
803: Store the multiple bucketing results in the first storage component.
The first acceleration component is used to obtain the multiple bucketing results from the first storage component; perform calculation processing on the ciphertext data in the same bucketing result to obtain a ciphertext processing result; and store the ciphertext processing results respectively corresponding to the multiple bucketing results in the first storage component. The first storage component is used to provide the ciphertext processing results respectively corresponding to the multiple bucketing results to the first host processing component.
In addition, an embodiment of the present application further provides an acceleration method that can be applied to the acceleration device shown in FIG. 1. The acceleration device includes a first storage component, and a first acceleration component and a second acceleration component connected to the first storage component; the first storage component is connected to the first host processing component via a bus, and stores multiple ciphertext data corresponding to multiple objects sent by the first host processing component. For the specific structure of the acceleration device, see the corresponding embodiments; it is not described again here. The method may be executed by the first acceleration component in the acceleration device and, as shown in FIG. 9, may include the following steps:
901: Obtain multiple bucketing results from the first storage component.
The multiple bucketing results may be obtained by the second acceleration component, which obtains the multiple ciphertext data from the first storage component and, for each of multiple features, performs bucketing processing on the multiple ciphertext data.
902: Perform calculation processing on the ciphertext data in the same bucketing result to obtain a ciphertext processing result.
903: Store the ciphertext processing results respectively corresponding to the multiple bucketing results in the first storage component.
The first storage component is used to provide the ciphertext processing results respectively corresponding to the multiple bucketing results to the first host processing component.
It should be noted that the specific manner in which each step of the acceleration methods of the embodiments shown in FIG. 8 and FIG. 9 is performed has been described in detail in the related device embodiments and is not elaborated here.
In addition, an embodiment of the present application further provides a computing device. As shown in FIG. 10, the computing device may include a host processing component 1001, a host storage component 1002, and an acceleration device 1003, where the acceleration device may adopt the structure described in any of the embodiments of FIG. 1 to FIG. 5 or FIG. 7a, which is not described again here.
The host storage component 1002 may store one or more computer instructions for the host processing component 1001 to call and execute to implement the corresponding operations.
The computing device may of course also include other components, such as input/output interfaces, display components, and communication components.
The input/output interface provides an interface between the processing component and peripheral interface modules, which may be output devices, input devices, and the like. The communication component is configured to facilitate wired or wireless communication between the computing device and other devices.
The host processing component may include one or more processors that execute computer instructions to complete all or some of the steps of the above methods. The host processing component may also be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above methods.
The host storage component is configured to store various types of data to support operation of the computing device. The host storage component may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The acceleration device may be implemented with an application-specific integrated circuit (ASIC), digital signal processor (DSP), digital signal processing device (DSPD), programmable logic device (PLD), field programmable gate array (FPGA), controller, microcontroller, microprocessor, or other electronic component. It may be connected to the host processing component via a bus and deployed in the computing device in a hot-swappable manner.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a computer, implements the acceleration method of the embodiment shown in FIG. 8 or FIG. 9. The computer-readable medium may be included in the computing device described in the above embodiments, or it may exist separately without being assembled into that device.
An embodiment of the present application further provides a computer program product comprising a computer program carried on a computer-readable storage medium; when executed by a computer, the computer program implements the acceleration method of the embodiment shown in FIG. 8 or FIG. 9. In such an embodiment, the computer program may be downloaded and installed from a network and/or installed from a removable medium. When the computer program is executed by a processor, the functions defined in the system of the present application are performed.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
From the above description of the implementations, those skilled in the art will clearly understand that each implementation may be realized by software together with a necessary general-purpose hardware platform, or alternatively by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (14)

  1. An acceleration device, characterized by comprising: a first storage component, and a first acceleration component and a second acceleration component connected to the first storage component; the first storage component is connected to a first host processing component via a bus;
    the first storage component is used to store a plurality of ciphertext data corresponding to a plurality of objects and sent by the first host processing component;
    the second acceleration component is used to obtain the plurality of ciphertext data from the first storage component; for any feature, perform bucketing on the plurality of ciphertext data to obtain a plurality of bucket results; and store the plurality of bucket results in the first storage component;
    the first acceleration component is used to obtain the plurality of bucket results from the first storage component; perform calculation processing on the ciphertext data in the same bucket result to obtain a ciphertext processing result; and store the ciphertext processing results respectively corresponding to the plurality of bucket results in the first storage component;
    the first storage component is used to provide the ciphertext processing results respectively corresponding to the plurality of bucket results to the first host processing component.
  2. The device according to claim 1, characterized in that the second acceleration component performing bucketing on the plurality of ciphertext data for any feature to obtain a plurality of bucket results comprises: for any feature, determining the bucket information of each of the plurality of objects for that feature; and placing the ciphertext data of the one or more objects that share the same bucket information into the same bucket result, so as to obtain a plurality of bucket results; wherein the bucket information of the plurality of objects for the different features is determined by the first host processing component;
    the first storage component is further used to store the bucket information of the plurality of objects for the different features, as sent by the first host processing component.
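The data flow of claims 1 and 2 can be sketched in software as follows. This is a minimal illustration only, not the claimed hardware: the first storage component is modeled as a plain dictionary, ciphertexts are stand-in integers, and `toy_add` (modular integer addition) is a hypothetical placeholder for whatever ciphertext operation the device actually performs. The feature name `"age"` and all function names are assumptions introduced for the example.

```python
from collections import defaultdict

MODULUS = 2**61 - 1  # hypothetical modulus for the stand-in ciphertext operation


def toy_add(a, b):
    """Placeholder for the real ciphertext operation (assumption)."""
    return (a + b) % MODULUS


def bucketize(ciphertexts, bucket_ids):
    """Role of the second acceleration component for one feature:
    group each object's ciphertext under that object's bucket id."""
    buckets = defaultdict(list)
    for ciphertext, bucket_id in zip(ciphertexts, bucket_ids):
        buckets[bucket_id].append(ciphertext)
    return dict(buckets)


def aggregate(buckets, combine):
    """Role of the first acceleration component:
    reduce every bucket to a single ciphertext processing result."""
    results = {}
    for bucket_id, members in buckets.items():
        acc = members[0]
        for ciphertext in members[1:]:
            acc = combine(acc, ciphertext)
        results[bucket_id] = acc
    return results


# The dictionary stands in for the first storage component. The host writes
# the ciphertexts and the per-feature bucket information into it.
storage = {
    "ciphertexts": [11, 22, 33, 44, 55],
    "bucket_info": {"age": [0, 1, 0, 2, 1]},
}
storage["buckets"] = bucketize(storage["ciphertexts"], storage["bucket_info"]["age"])
storage["results"] = aggregate(storage["buckets"], toy_add)
print(storage["results"])
```

Both stages communicate only through `storage`, mirroring the way the claims route every hand-off through the first storage component.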
  3. The device according to claim 1, characterized in that the second acceleration component comprises a data loading unit, a plurality of bucketing units, and a data storage unit;
    the data loading unit is used to obtain the plurality of ciphertext data from the first storage component and provide the plurality of ciphertext data to each of the plurality of bucketing units; the data loading unit is further used to assign features to be processed to the respective bucketing units and to control the plurality of bucketing units to process their assigned features in parallel;
    each bucketing unit is used to perform bucketing on the plurality of ciphertext data for the feature assigned to it, obtain a plurality of bucket results, and send the plurality of bucket results to the data storage unit;
    the data storage unit is used to store the plurality of bucket results sent by each bucketing unit in the first storage component.
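The per-feature parallelism of claim 3 can be approximated in software with a thread pool, as sketched below. The sketch assumes one task per feature and uses Python's standard `concurrent.futures.ThreadPoolExecutor`; the function names, the `num_units` parameter, and the string ciphertexts are hypothetical and only illustrate how a loading stage might fan work out to bucketing units and collect their results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def bucket_one_feature(ciphertexts, bucket_ids):
    """One bucketing unit: bucket all ciphertexts for its assigned feature."""
    buckets = defaultdict(list)
    for ciphertext, bucket_id in zip(ciphertexts, bucket_ids):
        buckets[bucket_id].append(ciphertext)
    return dict(buckets)


def load_and_dispatch(ciphertexts, bucket_info, num_units=4):
    """Data loading unit: hand the same ciphertexts to every bucketing unit,
    assign one feature per task, and run the tasks in parallel.
    The returned mapping plays the role of what the data storage unit
    would write back to the first storage component."""
    with ThreadPoolExecutor(max_workers=num_units) as pool:
        futures = {
            feature: pool.submit(bucket_one_feature, ciphertexts, ids)
            for feature, ids in bucket_info.items()
        }
        return {feature: future.result() for feature, future in futures.items()}


ciphertexts = ["c0", "c1", "c2", "c3"]
bucket_info = {"age": [0, 0, 1, 1], "income": [2, 0, 2, 1]}
all_buckets = load_and_dispatch(ciphertexts, bucket_info, num_units=2)
print(all_buckets)
```

Because every feature is bucketed independently of the others, the tasks share only read-only input and need no locking.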
  4. The device according to claim 1, characterized by further comprising a substrate, wherein the first storage component, the first acceleration component, and the second acceleration component are soldered onto the substrate.
  5. The device according to claim 1, characterized in that the first acceleration component comprises at least one first acceleration unit;
    the first acceleration unit is used to obtain at least one bucket result from the first storage component; for any bucket result, perform calculation processing on the plurality of ciphertext data in the bucket result according to a target calculation processing mode to obtain a ciphertext processing result; and store the ciphertext processing result corresponding to any bucket result in the first storage component.
  6. The device according to claim 5, characterized in that the first acceleration unit comprises a first control unit and a plurality of first computing units;
    the first control unit is used to obtain at least one bucket result from the first storage component and dispatch the at least one bucket result to at least one first computing unit;
    the first computing unit is used to, for any bucket result dispatched to it, perform calculation processing on the plurality of ciphertext data in the bucket result according to the target calculation processing mode to obtain a ciphertext processing result;
    the first control unit is used to store the ciphertext processing result corresponding to any bucket result in the first storage component.
  7. The device according to claim 6, characterized in that the first acceleration unit further comprises a first storage unit; the first computing unit is further used to save the ciphertext processing result corresponding to any bucket result to the first storage unit;
    the first control unit storing the ciphertext processing result corresponding to any bucket result in the first storage component comprises: storing, in the first storage component, the ciphertext processing result corresponding to any bucket result that is held in the first storage unit.
  8. The device according to claim 7, characterized in that the first control unit is further used to receive first control information sent by the first host processing component and to control the operation of the plurality of first computing units and the first storage unit according to the first control information;
    the first control unit is further used to notify the first computing unit of the corresponding operation method according to the first control information;
    the first computing unit performing calculation processing on the plurality of ciphertext data in any bucket result dispatched to it to obtain a ciphertext processing result comprises: for any bucket result dispatched to it, processing the plurality of ciphertext data in the bucket result according to the operation method to obtain a ciphertext processing result.
  9. The device according to claim 6, characterized in that the first computing unit comprises a first storage subunit, a first parsing subunit, a first calculation subunit, and a first control subunit;
    the first storage subunit is used to store one or more operation instructions corresponding to the target calculation processing mode;
    the first parsing subunit is used to parse the one or more operation instructions;
    the first control subunit is used to send calculation indication information to the first calculation subunit based on the parsing result of the first parsing subunit;
    the first calculation subunit is used to perform calculation processing on the plurality of ciphertext data based on the calculation indication information to obtain a ciphertext processing result.
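The division of labor in claim 9 (stored instructions, a parser, a controller, and a calculator) resembles a very small instruction interpreter. The sketch below is one hypothetical way to picture it. The textual instruction format `"ADD <count>"`, the `Instruction` type, and the modular-addition operation are all assumptions made for illustration; the source does not specify an instruction encoding.

```python
from dataclasses import dataclass

MODULUS = 2**61 - 1  # hypothetical modulus for the stand-in ciphertext operation


@dataclass
class Instruction:
    opcode: str  # which operation to apply
    count: int   # how many ciphertexts to consume


def parse(raw):
    """Parsing subunit: decode one stored instruction (hypothetical format)."""
    opcode, count = raw.split()
    return Instruction(opcode, int(count))


# Operations the calculation subunit can perform (assumption: addition only).
OPERATIONS = {"ADD": lambda a, b: (a + b) % MODULUS}


def run_unit(program, ciphertexts):
    """Execute the stored program over a stream of ciphertexts."""
    stream = iter(ciphertexts)
    acc = None
    for raw in program:                    # storage subunit: raw instructions
        instruction = parse(raw)           # parsing subunit
        op = OPERATIONS[instruction.opcode]  # control subunit: select the operation
        for _ in range(instruction.count):   # calculation subunit: apply it
            ciphertext = next(stream)
            acc = ciphertext if acc is None else op(acc, ciphertext)
    return acc


result = run_unit(["ADD 3"], [5, 7, 9])
print(result)
```

Keeping the operation table separate from the loop is what lets a different target calculation processing mode be selected by loading different instructions rather than by changing the datapath.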
  10. The device according to claim 9, characterized in that the target calculation processing mode is ciphertext accumulation and the operation method is a point addition operation;
    the first calculation subunit performing calculation processing on the plurality of ciphertext data based on the calculation indication information to obtain a ciphertext processing result comprises: obtaining one ciphertext data at a time, in sequence, from the plurality of ciphertext data; performing a point addition operation on it and the previous point addition result; and determining whether the current accumulation count has reached a preset count; if so, outputting the last point addition result as the ciphertext processing result; if not, saving the point addition result to the first storage subunit.
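The accumulation loop of claim 10 can be sketched with a textbook elliptic curve. The example assumes that "point addition" means elliptic-curve point addition and uses the toy curve y² = x³ + 2x + 2 over F₁₇ with base point (5, 1); these parameters are chosen purely for illustration and are far too small for real use. The source does not state which curve or encryption scheme the device uses, and the `scratch` dictionary is a hypothetical stand-in for the first storage subunit.

```python
P_MOD = 17   # toy field prime (assumption, for illustration only)
A_COEFF = 2  # curve: y^2 = x^3 + 2x + 2 over F_17
G = (5, 1)   # a point on the toy curve


def point_add(p, q):
    """Add two affine points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p + (-p), including doubling a point with y == 0
    if p == q:
        slope = (3 * x1 * x1 + A_COEFF) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def accumulate(ciphertexts, preset_count, scratch=None):
    """Fold ciphertext points one at a time, as described in claim 10."""
    scratch = {} if scratch is None else scratch  # stands in for the storage subunit
    acc = None
    count = 0
    for ciphertext in ciphertexts:
        acc = point_add(acc, ciphertext)  # add to the previous point addition result
        count += 1
        if count == preset_count:
            return acc                    # last point addition result is the output
        scratch["partial"] = acc          # otherwise keep the partial result
    raise ValueError("fewer ciphertexts than preset_count")


total = accumulate([G, G, G], preset_count=3)
print(total)
```

The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.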
  11. A computing system, characterized by comprising a first computing device and a second computing device, wherein the first computing device comprises a first host processing component and the acceleration device according to any one of claims 1 to 10;
    the second computing device comprises a second host processing component and a second acceleration device; the second acceleration device comprises a second storage component and at least one third acceleration component; the second storage component is connected to the second host processing component via a bus;
    the second storage component is used to store a plurality of data to be processed sent by the second host processing component; the data to be processed is target data to be encrypted or a ciphertext processing result to be decrypted;
    the third acceleration component is used to obtain at least one data to be processed from the second storage component; for any data to be processed, encrypt or decrypt the data to be processed to obtain a calculation processing result; and store the calculation processing result in the second storage component;
    the second host processing component is used to obtain, from the second storage component, the calculation processing result corresponding to any data to be processed.
  12. The system according to claim 11, characterized in that, in the case where the data to be processed is target data to be encrypted, the target data is gradient information for a decision tree model, calculated from the feature values and label data of sample objects provided by the data initiator;
    or,
    in the case where the data to be processed is a ciphertext processing result to be decrypted, the data to be processed is specifically the ciphertext processing result to be decrypted that was calculated for any feature provided by the data recipient, and the calculation processing result corresponding to the ciphertext processing result is a gradient accumulation value; the second host processing component is further used to determine the optimal split point of the decision tree model based on the gradient accumulation values corresponding to multiple features.
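Claim 12 does not say how the optimal split point is computed from the decrypted gradient accumulation values. One common choice in gradient-boosted trees is a second-order gain of the form G_L²/(H_L+λ) + G_R²/(H_R+λ) − G²/(H+λ); the sketch below assumes that criterion. The use of second-order (Hessian) sums, the regularization parameter `lam`, and all names here are assumptions, not statements from the source.

```python
def best_split(bucket_sums, lam=1.0):
    """Pick the (feature, bucket index) with the highest assumed gain.

    bucket_sums maps each feature to a list of (gradient_sum, hessian_sum)
    pairs, one per bucket, ordered by bucket. Splitting at index i sends
    buckets 0..i to the left child and the rest to the right child.
    """
    best_feature, best_index, best_gain = None, None, float("-inf")
    for feature, sums in bucket_sums.items():
        total_g = sum(g for g, _ in sums)
        total_h = sum(h for _, h in sums)
        left_g = left_h = 0.0
        for index, (g, h) in enumerate(sums[:-1]):  # last bucket cannot split
            left_g += g
            left_h += h
            right_g = total_g - left_g
            right_h = total_h - left_h
            gain = (
                left_g ** 2 / (left_h + lam)
                + right_g ** 2 / (right_h + lam)
                - total_g ** 2 / (total_h + lam)
            )
            if gain > best_gain:
                best_feature, best_index, best_gain = feature, index, gain
    return best_feature, best_index, best_gain


# Hypothetical decrypted per-bucket sums for two features.
decrypted = {
    "a": [(-4.0, 2.0), (4.0, 2.0)],
    "b": [(0.0, 2.0), (0.0, 2.0)],
}
feature, index, gain = best_split(decrypted)
print(feature, index, gain)
```

Only per-bucket sums are needed, which is why the bucketed ciphertext accumulation performed on the other device is sufficient input for this step.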
  13. A computing device, characterized by comprising a host processing component, a host storage component, and the acceleration device according to any one of claims 1 to 10.
  14. An acceleration method, characterized in that it is applied to an acceleration device, the acceleration device comprising a first storage component, and a first acceleration component and a second acceleration component connected to the first storage component; the first storage component is connected to a first host processing component via a bus; wherein the first storage component stores a plurality of ciphertext data corresponding to a plurality of objects and sent by the first host processing component; the method comprising:
    obtaining the plurality of ciphertext data from the first storage component;
    for any feature, performing bucketing on the plurality of ciphertext data to obtain a plurality of bucket results;
    storing the plurality of bucket results in the first storage component; wherein the first acceleration component is used to obtain the plurality of bucket results from the first storage component, perform calculation processing on the ciphertext data in the same bucket result to obtain a ciphertext processing result, and store the ciphertext processing results respectively corresponding to the plurality of bucket results in the first storage component; and the first storage component is used to provide the ciphertext processing results respectively corresponding to the plurality of bucket results to the first host processing component.
PCT/CN2023/123473 2022-10-11 2023-10-09 Acceleration device, computing system, and acceleration method WO2024078428A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211241151.7A CN115801221A (en) 2022-10-11 2022-10-11 Acceleration apparatus, computing system, and acceleration method
CN202211241151.7 2022-10-11

Publications (1)

Publication Number Publication Date
WO2024078428A1 true WO2024078428A1 (en) 2024-04-18

Family

ID=85432823

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/123473 WO2024078428A1 (en) 2022-10-11 2023-10-09 Acceleration device, computing system, and acceleration method

Country Status (2)

Country Link
CN (1) CN115801221A (en)
WO (1) WO2024078428A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115801221A (en) * 2022-10-11 2023-03-14 阿里云计算有限公司 Acceleration apparatus, computing system, and acceleration method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111464282A (en) * 2019-01-18 2020-07-28 百度在线网络技术(北京)有限公司 Data processing method and device based on homomorphic encryption
CN112989368A (en) * 2021-02-07 2021-06-18 支付宝(杭州)信息技术有限公司 Method and device for processing private data by combining multiple parties
CN114039785A (en) * 2021-11-10 2022-02-11 奇安信科技集团股份有限公司 Data encryption, decryption and processing method, device, equipment and storage medium
WO2022142038A1 (en) * 2020-12-29 2022-07-07 平安普惠企业管理有限公司 Data transmission method and related device
CN115801220A (en) * 2022-10-11 2023-03-14 阿里云计算有限公司 Acceleration apparatus, computing system, and acceleration method
CN115801221A (en) * 2022-10-11 2023-03-14 阿里云计算有限公司 Acceleration apparatus, computing system, and acceleration method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111464282A (en) * 2019-01-18 2020-07-28 百度在线网络技术(北京)有限公司 Data processing method and device based on homomorphic encryption
WO2022142038A1 (en) * 2020-12-29 2022-07-07 平安普惠企业管理有限公司 Data transmission method and related device
CN112989368A (en) * 2021-02-07 2021-06-18 支付宝(杭州)信息技术有限公司 Method and device for processing private data by combining multiple parties
CN114039785A (en) * 2021-11-10 2022-02-11 奇安信科技集团股份有限公司 Data encryption, decryption and processing method, device, equipment and storage medium
CN115801220A (en) * 2022-10-11 2023-03-14 阿里云计算有限公司 Acceleration apparatus, computing system, and acceleration method
CN115801221A (en) * 2022-10-11 2023-03-14 阿里云计算有限公司 Acceleration apparatus, computing system, and acceleration method

Also Published As

Publication number Publication date
CN115801221A (en) 2023-03-14

Similar Documents

Publication Publication Date Title
WO2024078347A1 (en) Acceleration device, computing system and acceleration method
US9942039B1 (en) Applying modular reductions in cryptographic protocols
CN110689349B (en) Transaction hash value storage and searching method and device in blockchain
WO2024078428A1 (en) Acceleration device, computing system, and acceleration method
KR20030078873A (en) Packet encrypton system and method
US20100011047A1 (en) Hardware-Based Cryptographic Accelerator
CN112070222B (en) Processing device, accelerator and method for federal learning
US9953184B2 (en) Customized trusted computer for secure data processing and storage
WO2022121623A1 (en) Data set intersection method and apparatus
US11750403B2 (en) Robust state synchronization for stateful hash-based signatures
CN115174267B (en) TLS protocol negotiation method, equipment and medium
Garimella et al. Characterizing and optimizing end-to-end systems for private inference
Zeydan et al. Recent advances in post-quantum cryptography for networks: A survey
CN114944935A (en) Multi-party fusion computing system, multi-party fusion computing method and readable storage medium
CN114760023A (en) Model training method and device based on federal learning and storage medium
WO2023236899A1 (en) Data processing method, apparatus, device and storage medium
CN111314080B (en) SM9 algorithm-based collaborative signature method, device and medium
CN109428876A (en) One kind is shaken hands connection method and device
JPWO2020152831A1 (en) Information processing equipment, secret calculation method and program
CN116681141A (en) Federal learning method, terminal and storage medium for privacy protection
CN108768994A (en) Data matching method, device and computer readable storage medium
JP2009038416A (en) Multicast communication system, and group key management server
CN111224777A (en) SDN network multicast member information encryption method, system, terminal and storage medium
CN116305187B (en) Decision flow model calculation method and device based on hybrid encryption
CN116166429B (en) Channel attribute determining method of multiple security chips and security chip device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23876630

Country of ref document: EP

Kind code of ref document: A1