CN116455612B - Privacy calculation intermediate data stream zero-copy device and method - Google Patents


Info

Publication number
CN116455612B
Authority
CN
China
Prior art keywords
data
arrow
kernel
mode
state
Prior art date
Legal status
Active
Application number
CN202310292403.7A
Other languages
Chinese (zh)
Other versions
CN116455612A (en)
Inventor
王济平
黎刚
高俊杰
汤克云
杨劲业
梁孟
Current Assignee
Jingxin Data Technology Co ltd
Original Assignee
Jingxin Data Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Jingxin Data Technology Co ltd
Priority to CN202310292403.7A
Publication of CN116455612A
Application granted
Publication of CN116455612B
Legal status: Active


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/04: Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428: Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/008: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40: Network security protocols
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00: Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/46: Secure multiparty computation, e.g. millionaire problem
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a zero-copy device and method for privacy-computation intermediate data streams. The zero-copy device comprises a privacy computing alliance network module, which in turn comprises a plurality of nodes; privacy computation is carried out among the nodes, and encrypted process data and intermediate factors are transmitted between them so as to optimize and compute each node's local model and algorithm. Each node has two service states, namely a user mode and a kernel mode: the user mode hosts the node's upper-layer application program and bottom-layer computing engine, while the kernel mode handles the node's data sharing and circulation on the hardware. The zero-copy device performs zero copy across the software layer, the kernel, the hardware, and the network transmission layer, so that data memory is shared among the systems within a node and network transmission requires no serialization or deserialization, improving execution efficiency; the system as a whole thus achieves zero copy of data throughout the entire data transmission, data sharing, and data execution process.

Description

Privacy calculation intermediate data stream zero-copy device and method
Technical Field
The invention belongs to the technical field of privacy computation, and particularly relates to a zero copy device and method for an intermediate data stream in privacy computation.
Background
A variety of privacy computing technologies are known at present; each has specific application scenarios and therefore provides its own security protection guarantees.
However, many efficiency problems arise in processes such as network data transmission and node data execution. Moreover, because different technologies or programming languages are used between the application-layer system and the underlying systems within a node, or between the underlying systems themselves, a great deal of time and resources is wasted when sharing data between these systems; for example, the CPU's scarce resources are consumed entirely by serializing and deserializing the data.
At present, existing privacy computing technology mostly addresses privacy security itself; the problems of timeliness and resource waste in data transmission, data sharing, and serialization/deserialization remain unsolved.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a device and method for zero copy of intermediate data streams in privacy computation, with the aim of further improving, within the existing privacy computing technology system, the timeliness and hardware-resource utilization affected by network data transmission, node data execution, data serialization/deserialization, and the like.
The aim of the invention can be achieved by the following technical scheme:
a zero-copy device of an intermediate data stream for privacy calculation comprises a privacy calculation alliance network module, wherein the privacy calculation alliance network module comprises a plurality of nodes, privacy calculation is carried out among the nodes, and encryption process data and intermediate factors are mutually transmitted among the nodes so as to optimize and calculate a local model and an algorithm of each node.
The node has two service states, wherein the two service states are a user state and a kernel state respectively, the user state is an upper-layer application program and a bottom-layer computing engine of the node, and the kernel state is data sharing circulation of the node on hardware.
Further, the upper-layer application program is implemented in the Java language, and the bottom-layer computing engine is implemented in the Python or Rust language.
The upper-layer application program is deployed with a platform application that preprocesses the local private data and converts it into an in-memory data format; the conversion technology used is Apache Arrow.
The bottom-layer computing engine is deployed with a variety of underlying privacy computing technologies.
Further, the upper-layer application program and the bottom-layer computing engine fully realize memory sharing through Arrow together with the related mmap, sendfile, and DMA technologies.
The kernel mode uses DMA at the hardware level and the mmap and sendfile+DMA modes at the instruction level; when the kernel mode transmits data externally through the network card, the user mode adopts the Apache Arrow Flight technology to transmit the data over the network.
The zero-copy method of the zero-copy device comprises the following steps:
S1: The user-mode upper-layer service converts the private data.
S2: The user mode switches to the kernel mode for memory mapping and disk persistence.
S3: The user-mode upper layer calls the bottom layer and the memory-sharing data mechanism.
S4: The user-mode bottom layer performs computation on the data.
S5: The user mode switches to the kernel mode to map the computation result into memory and transmit it externally.
S6: The user mode receives the transmitted data and participates in the bottom-layer operation.
Further, the specific operation of S1 is as follows: the user-mode upper-layer application program accesses the user's private data, which must first be preprocessed. The specific preprocessing method is:
S11: Fill null values, replacing each null with another known identifier.
S12: Screen and combine data of the same type in the data set, so that same-typed data falls into one column or into adjacent columns.
The int8 and int16 data types are grouped into one column; the larger-scale int32 and int64 types are each placed in their own column, with columns expected to participate in the same range calculations laid out adjacently; likewise, float and double are each placed in their own column, with range-calculation columns laid out adjacently.
S13: After the private data has been preprocessed through S11 and S12, it is further converted, using the Arrow technology, into a columnar data table in memory, denoted: arrow_private_data.
Further, the specific operation of S2 is as follows:
The user mode and the kernel mode perform one service-state switch; at the same time, the user-mode upper-layer application program calls the mmap instruction, address mapping is performed between the user-mode memory buffer and the kernel buffer of the kernel mode, and the kernel buffer thereby directly shares the arrow_private_data held in the user-mode memory buffer.
The user-mode program then persists arrow_private_data to disk: the service state is switched to the kernel mode, and the kernel-mode CPU notifies the DMA device that the user mode has written the data into the user-mode memory buffer and shared it into the kernel buffer; arrow_private_data is written to the local disk and denoted arrow_private_data_file.
Further, the specific operation of S3 is as follows:
The user-mode bottom-layer computing engine performs data computation or model training on the local private data, initiated by a call from the user-mode upper-layer application.
After the service state is switched to the kernel mode, the CPU points to arrow_private_data_file on the disk through the characteristics of the Arrow technology.
When the kernel mode reads the Arrow file, it is written directly into the kernel-mode memory buffer in a zero-copy manner; the service state is then switched from the kernel mode to the user mode, and the user-mode bottom-layer computing engine reads the arrow_private_data_file data already shared in the kernel-mode memory buffer.
The user-mode bottom-layer computing engine is written in the Python and Rust languages and receives the arrow_private_data_file data using Arrow's Python and Rust libraries.
Further, the specific operation of S4 is as follows:
The user-mode bottom-layer computing engine computes on arrow_private_data_file; the computation process includes machine learning, deep learning, data governance and cleaning, data encryption and decryption, and homomorphic computation.
During computation, part of the process data and the ciphertext data computed by the current node are output into user-mode memory, and the process data is encrypted homomorphically.
The user-mode bottom-layer computing engine uses the Arrow technology to perform contiguous data conversion and preprocessing on the process data, in the same manner as step S1, finally obtaining columnar process data and columnar ciphertext data, denoted arrow_process_encrypt_data and arrow_encrypt_data.
Further, the specific operation of S5 is as follows:
The user-mode bottom-layer computing engine temporarily persists arrow_process_encrypt_data and arrow_encrypt_data to disk.
The persistence operation is the same as in step S2, yielding arrow_process_encrypt_data_file and arrow_encrypt_data_file; the service state is in the kernel mode, where the sendfile instruction is used to notify the DMA, and the arrow_process_encrypt_data and arrow_encrypt_data in the kernel buffer are gathered and copied directly to the network card via the DMA technology.
The kernel-mode network card, combined with the user-mode Flight network transmission protocol technology, transmits arrow_process_encrypt_data and arrow_encrypt_data to the other nodes.
Further, the specific operation of S6 is as follows:
The kernel-mode network card of each other node receives the externally transmitted arrow_process_encrypt_data and arrow_encrypt_data; the kernel mode uses the sendfile instruction to notify the DMA, and the arrow_process_encrypt_data and arrow_encrypt_data in the network card are gathered and copied directly, via the DMA technology, into the kernel buffer of the kernel mode.
The kernel mode establishes a memory address mapping and sharing relationship with the user mode through the mmap instruction; the arrow_process_encrypt_data and arrow_encrypt_data in the kernel buffer are shared into the user-mode memory buffer, the state is switched to the user mode, and the user-mode bottom-layer computing engine reads the arrow_process_encrypt_data and arrow_encrypt_data from the user-mode memory buffer.
The node's own process data and ciphertext data are fused with the arrow_process_encrypt_data and arrow_encrypt_data transmitted by the other nodes.
If the node's in-memory data has been lost, zero copy is performed as in step S3, and the arrow_process_encrypt_data_file and arrow_encrypt_data_file data are read without any deserialization.
Multi-node, multi-party privacy computation is carried out on the private data by continuously iterating and polling steps S3-S6, yielding the final computation result.
The invention has the following beneficial effects:
1. The copying method and device perform zero copy of data across four layers (the software layer, the kernel, the hardware, and the network transmission layer), so that data memory is shared among the systems within a node, network transmission involves no serialization or deserialization, and the execution efficiency of data processing is improved; the system as a whole thus achieves zero copy of data throughout the entire data transmission, data sharing, and data execution process.
2. The whole flow of the copying method builds on the existing privacy computing technology system and further improves data-circulation efficiency in timeliness-critical processes such as network data transmission, node data execution, and data serialization/deserialization. In terms of hardware resources, CPU resources are fully freed, so that more CPU time participates in the privacy computation performed by the user-mode bottom-layer computing engine; meanwhile, the in-software data format is Arrow's columnar storage format. Making full use of CPU resources and the Arrow data format greatly improves the execution efficiency of the entire privacy computation process.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to those skilled in the art that other drawings can be obtained according to these drawings without inventive effort.
FIG. 1 is a diagram of the overall architecture of a zero-copy device of the present invention;
FIG. 2 is a flow chart of the zero copy method of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIG. 1, the zero-copy device for the privacy-computation intermediate data stream includes a privacy computing alliance network module comprising a plurality of nodes. Privacy computation is carried out among the nodes, and encrypted process data, intermediate factors, and the like are transmitted between them so as to optimize and compute each node's local model and algorithm; the internal data flow of each node is essentially the same, as can be seen from the node structure diagram of FIG. 1.
The node has two service states, which are a user state and a kernel state, respectively.
The user mode is a service state that does not directly access the node hardware and is usually where software applications are deployed.
The upper-layer application program is deployed with a platform application that can both preprocess the local private data and convert it into an in-memory data format. The conversion technology used is Apache Arrow. Apache Arrow is a memory-mappable, in-memory columnar data format that supports zero-copy use of data between systems, allowing data to move between heterogeneous big-data systems and be processed faster; the conversion uses CPU resources to turn the private data into a columnar data table.
The bottom-layer computing engine is deployed with a variety of underlying privacy computing technologies, such as federated learning and multi-party secure computation. The upper-layer application program and the bottom-layer computing engine can fully realize memory sharing through Arrow together with the related mmap, sendfile, and DMA technologies (the latter three being kernel-mode functions). Building on this memory sharing, the bottom-layer computing engine needs no extra CPU-mediated copies and can directly access the data in memory for computation.
The kernel mode mainly responds to instruction calls from the user mode and accesses the hardware directly; it is the node's data-sharing and circulation layer on the hardware. At the hardware level it uses DMA (direct memory access), a transfer mode that does not require the CPU to control or carry out the transfer, thereby freeing CPU resources during data transmission.
At the instruction level, the kernel mode uses the mmap and sendfile+DMA modes. mmap is a method of mapping a file or other object into a process's address space, establishing a one-to-one mapping between a file's disk address and a segment of virtual addresses in the process's virtual address space. With mmap, memory data produced in the user mode can be shared directly into the kernel buffer of the kernel mode, and likewise kernel-buffer data can be shared directly into the user-mode memory buffer, so that neither the CPU nor the DMA needs to copy the data from memory again. sendfile is a system call that passes data directly between two file descriptors, operating entirely in the kernel mode; it avoids copying data between the kernel buffer and user buffers and is therefore highly efficient, which is what is called zero copy.
In the invention, because sendfile operates entirely in the kernel mode, it is used directly to have the DMA gather and copy the kernel-buffer data to the network card, without first copying part of the kernel buffer into a socket buffer; this process likewise requires no CPU participation.
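As an illustration of these two primitives, the following Python sketch maps a landed file with mmap and then pushes it through a socket with os.sendfile; the file name and payload are hypothetical stand-ins, not taken from the patent. Both operations avoid extra user-space copies of the payload.

```python
import mmap
import os
import socket
import tempfile

# Hypothetical stand-in for the node's landed Arrow file.
src_path = os.path.join(tempfile.mkdtemp(), "arrow_private_data_file")
payload = b"columnar-bytes" * 256
with open(src_path, "wb") as f:
    f.write(payload)

# mmap: map the file into this process's address space. Reads go through
# the page cache, so no extra user-space copy of the file data is made.
with open(src_path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        header = mm[:14]  # touch the mapped pages directly

# sendfile: move file -> socket entirely in kernel mode (zero copy from
# the application's point of view), demonstrated over a local socketpair.
left, right = socket.socketpair()
with open(src_path, "rb") as f:
    sent = os.sendfile(left.fileno(), f.fileno(), 0, len(payload))
left.close()

received = b""
while True:
    chunk = right.recv(65536)
    if not chunk:
        break
    received += chunk
right.close()
```

The same pattern, with the network card as the destination descriptor, is what lets the kernel hand buffer pages to the NIC without a CPU copy.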
In terms of the network card, the kernel mode mainly performs network transmission of encrypted process data, intermediate factors, and the like between nodes; the corresponding user-mode application uses the Apache Arrow Flight technology. Flight provides a high-performance wire protocol for large-volume analytical data transfer, designed for the needs of the modern data world: cross-platform language support, unlimited parallelism, high efficiency, strong security, multi-region distribution, and efficient network utilization. Moreover, within the Arrow ecosystem, Flight can transmit Arrow-format data without serialization or deserialization.
In the invention, by combining these two service states, zero copy of the data stream is carried out in all respects, from software and hardware to network transmission; performance, timeliness, efficiency, and executability under the privacy computing system are greatly improved, precious CPU resources are further released, and CPU resources are fully devoted to the privacy computation process.
The zero-copy method of the zero-copy device, as shown in fig. 2, includes the following steps:
s1: user state upper layer service conversion privacy data
The user state upper layer application program accesses the privacy data of the user, and firstly, the privacy data needs to be preprocessed, and the specific method for preprocessing the privacy data is as follows:
s11: null value filling directly affects the zero copy mode of data between a user mode and a kernel mode, and when null value filling is not performed, arrow technology cannot identify data and data types of the null value filling to perform continuous conversion on the data, so that a CPU still copies the data from a memory or a disk to a memory buffer of the user mode in a copy mode when the data is read later.
In the invention, null values are replaced by other known identifiers, such as empty strings, nan, undefined and the like, and Arrow defaults to real data and the like to operate in continuous conversion, so that data copying is avoided, and a user mode bottom layer of the invention can filter and calculate the data in the calculation process and does not participate in calculation.
S12: the data sets exist in a column format in the memory, and the column format is very suitable for large-scale OLAP (on-line analytical processing) calculation, so that the data sets of the same type are required to be subjected to screening combination, the data of the same type are classified into one column or adjacent columns, and the data sets are relatively concentrated in the columns in the sets to be calculated in the calculation process, so that the large-scale OLAP calculation of the data is achieved without scanning all data rows and columns.
In the invention, two types of data such as int8 and int16 are classified into one column, larger-scale columns such as int32 and int64 are respectively classified into one column, the projected data range calculation is designed into an adjacent column, the columns such as float and double are respectively classified into one column, and the projected data range calculation is designed into an adjacent column.
And finally, the data are distributed more uniformly in the same type of data, and the data of the same type can be stored in a continuous memory based on the conversion of the Arrow technology to form a vector data structure and a column type storage structure.
S13: After the private data has been preprocessed through S11 and S12, it is further converted, using the Arrow technology, into a columnar data table in memory, denoted: arrow_private_data.
S2: user mode switching kernel mode memory mapping and landing
The user state and the kernel state are switched for one time, and simultaneously, the user state upper layer application program calls the mmap instruction, address mapping is carried out on the user state memory buffer area and the kernel buffer area of the kernel state, so that the kernel buffer area of the kernel state is directly shared to obtain the arow_private_data data in the user state memory buffer area, and zero copy of the data of the CPU is not needed.
At this time, the user state program needs to drop the arow_private_data, the service state is switched to the kernel state, and at this time, the kernel state CPU notifies the DMA device that the user state has been written into the user state memory buffer and shared into the kernel buffer, and writes the arow_private_data into the local disk, which is denoted as arow_private_data_file. In the process, the user mode and the kernel mode use an Arrow+mmap+DMA mode to carry out the disc-falling data, the memory data do not need to be copied, the CPU resources are not occupied, and the CPU is only responsible for conveying instructions.
S3: mechanism for calling bottom layer by user mode upper layer and sharing data in memory
The bottom layer computing engine in the user mode starts to perform data computation or model training on the local privacy data, and is initiated by the call of the upper layer application in the user mode.
At this time, the user mode and the kernel mode perform one-time service state switching, and call mmap instructions, address mapping is performed again on the kernel buffer zone of the kernel mode and the memory buffer zone of the user mode, so that the user memory buffer zone can directly share to obtain data of the kernel buffer zone of the kernel mode, and after the service state is switched to the kernel mode, the CPU points to an arow_private_data_file in a disk through the characteristic of arow technology.
Because Arrow is a memory mapping file, and the private data set is preprocessed in the step S1, so that the Arrow can be directly written into the kernel-state memory buffer area in a zero copy mode when the kernel-state is read, the service state is switched from the kernel-state to the user-state, at the moment, because the kernel buffer area already has the data of the Arrow_private_da_ta_file, and the user-state and the kernel-state establish a memory address mapping sharing relation by using mmap, the user-state bottom computing engine can directly read the data of the Arrow_private_data_file which is already shared by the kernel-state kernel buffer area in the user-state memory buffer area, and meanwhile, because the user-state upper-layer service program is written by using java, the java library of the Arrow is used for writing data when writing.
The user-mode bottom layer computing engine is written by using the python and rust languages, and receives the arow_private_data_file data by using the python and rust libraries of arow as well, and can directly use the arow_private_data_file data in the user-mode memory buffer without deserializing the data due to the characteristic of arow. In the step, the problems of timeliness, resource waste and the like caused by zero copy of the data, serialization and reverse serialization of the data participated by the CPU are avoided, and the data transmission sharing efficiency of the data in the node system is greatly improved.
S4: user-state bottom layer performs computation on data
The user-state underlying computing engine computes the arow_private_data_file, and the computing process includes but is not limited to: machine learning, deep learning, data management cleaning, data encryption and decryption, homomorphic calculation and the like, and outputting a part of process data (intermediate factors, gradients, loss values and the like) calculated by the current node, ciphertext data and the like in a user-state memory in the calculation process, wherein the process data is transmitted to other nodes by a network in the follow-up process, so that homomorphic encryption is performed by using a homomorphic encryption mode, and the data privacy safety is ensured.
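The patent does not name a specific homomorphic scheme; as one hedged illustration of the additive homomorphism that lets encrypted process data be combined across nodes, here is a textbook Paillier sketch with tiny, insecure demo keys (not the patent's actual scheme).

```python
import math
import secrets

# Toy Paillier cryptosystem: ciphertext multiplication adds plaintexts.
p, q = 2039, 2063                 # tiny primes, demo only; never use in practice
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)      # lambda(n) = lcm(p-1, q-1)
mu = pow(lam, -1, n)              # valid because gcd(lam, n) == 1 here


def encrypt(m: int) -> int:
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    # With g = n + 1, g^m mod n^2 = 1 + m*n, so this is g^m * r^n mod n^2.
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(c: int) -> int:
    u = pow(c, lam, n2)
    return (((u - 1) // n) * mu) % n  # L(u) = (u - 1) / n, then * mu mod n


# Homomorphic addition: multiplying ciphertexts adds the plaintexts.
c = (encrypt(12) * encrypt(30)) % n2
```

With such a scheme, a receiving node can aggregate encrypted gradients or intermediate factors without ever seeing the plaintext values.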
The user-mode bottom-layer computing engine uses the Arrow technology to perform contiguous data conversion and preprocessing on the process data, in the same manner as step S1, finally obtaining columnar process data and columnar ciphertext data, denoted arrow_process_encrypt_data and arrow_encrypt_data.
S5: user mode switching kernel mode calculation result memory mapping and external transmission
The user-state underlying computing engine needs to temporarily drop the arow_process_encrypt_data and arow_encrypt_data for subsequent process computing use.
The landing operation is consistent with the step S2, the arow_process_encrypt_data_file and arow_encrypt_data_file are obtained, at this time, the service state is in a kernel state, sendfile instructions are used in the kernel state, a user state is not required to be switched back, meanwhile, a DMA (direct memory access) is informed, and the arow_process_encrypt_data and arow_encrypt_data data of a kernel buffer area in the landing operation are directly collected and copied to a network card in a mode of no CPU (Central processing unit) copy, one CPU copy is omitted, and CPU resources are saved.
And finally, the kernel mode network card and the user mode Flight network transmission protocol technology are combined to transmit the arow_process_encrypt_data to other nodes.
S6: user mode receiving transmission data and participating in bottom layer operation
The network cards of other node kernel states receive data from the arow_process_encrypt_data and arow_encrypt_data transmitted from the outside, the data are in the network cards in the kernel states at the moment, the kernel states use sendfile instructions, the user states do not need to be switched back, meanwhile, the DMA is informed, and the data of the arow_process_encrypt_data and the arow_encrypt_data in the network cards are collected and copied to the kernel buffer areas in the kernel states directly through the DMA technology, so that one CPU copy is omitted, and CPU resources are saved.
Then the kernel state establishes a memory-address mapping and sharing relationship with the user state through the mmap instruction, and the arow_process_encrypt_data and arow_encrypt_data in the kernel buffer are shared into the user-state memory buffer; the state is switched to the user state, and the underlying computing engine of the user state reads the arow_process_encrypt_data and arow_encrypt_data in the user-state memory buffer. Because the data were transmitted with the Flight technology and therefore belong to the Arrow technology system, the user-state underlying computing engine can use the arow_process_encrypt_data and arow_encrypt_data in the user-state memory buffer directly, without deserialization.
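The mmap sharing step can be sketched with the Python standard library. This is a hedged illustration, not the patent's implementation: mapping a file exposes the same page-cache pages the kernel already holds, so the user-state reader sees the bytes without an extra CPU copy; the file name and contents are invented.

```python
# Hedged stdlib sketch of mmap sharing: the mapping gives the process a
# view onto kernel-held page-cache pages, so reading a slice does not
# require the kernel to copy the data out first. Contents are illustrative.
import mmap
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"arow_encrypt_data: columnar bytes")
    path = tmp.name

with open(path, "rb") as f:
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    header = bytes(view[:17])   # a slice reads straight from the mapping
    view.close()
os.unlink(path)
```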
Because the whole flow is a privacy computing network and the privacy computing tasks between the nodes proceed simultaneously, each of the other nodes already holds one copy of its own underlying process data and ciphertext data before receiving the arow_process_encrypt_data and arow_encrypt_data. At this point the node's own process data and ciphertext data are fused with the arow_process_encrypt_data and arow_encrypt_data transmitted by the other nodes. Because the other nodes use the Arrow Flight technology to transmit the Arrow data format to the current node, the current node can, upon receiving the data, fuse the local process data and ciphertext data with the arow_process_encrypt_data and arow_encrypt_data directly, without performing any serialization or deserialization operation.
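The fusion step can be sketched as column-wise concatenation. This is a hedged plain-Python illustration, with invented column names, of why no deserialization is needed once both sides are already columnar: merging reduces to appending buffers column by column.

```python
# Illustrative sketch of the fusion step: both sides are columnar, so the
# merge is column-wise concatenation of a row-free representation, with no
# decoding of a row-oriented wire format. Column names are invented.

def fuse(local_cols, remote_cols):
    """Append remote columns onto local ones, column by column."""
    fused = {name: list(values) for name, values in local_cols.items()}
    for name, values in remote_cols.items():
        fused.setdefault(name, []).extend(values)
    return fused

local = {"cipher": ["0x01", "0x02"]}     # this node's ciphertext column
remote = {"cipher": ["0xa1"]}            # column received via Flight
fused = fuse(local, remote)
```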
If the node's data in memory are lost, zero copy can be performed in the manner of step S3, and the arow_process_encrypt_data_file and arow_encrypt_data_file are read without deserialization.
In the calculation process, all data have been converted through Arrow, i.e., all data are columnar data tables, so large-scale OLAP computation can be performed effectively and in a targeted manner, improving execution efficiency.
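A small sketch of the OLAP point above: with a columnar layout, an aggregate scans one packed, contiguous buffer instead of hopping across every field of every row. The figures below are invented for the example.

```python
# One column stored as a packed, contiguous buffer (as Arrow stores it);
# an aggregate touches only this buffer, not whole rows. Values invented.
from array import array

cipher_lengths = array("Q", [16, 32, 32, 48])   # one contiguous column
total_bytes = sum(cipher_lengths)               # scans only this column
```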
Finally, through continuous iterative polling of steps S3-S6, multi-node multi-party privacy computation is performed on the privacy data to obtain the final calculation result.
The whole process builds on the original privacy computing technology system and further improves data circulation efficiency in the links that affect timeliness, such as network data transmission, node data execution, and data serialization/deserialization. On hardware resources, CPU resources are fully freed, so that the CPU participates more in the privacy calculation process inside the user-state underlying computing engine; meanwhile, on the software side, the data format is the Arrow columnar storage format. The ample CPU resources and the Arrow data format together greatly improve the execution efficiency of the whole privacy computing process.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims.

Claims (4)

1. A privacy-calculation intermediate-data-stream zero-copy device, comprising a privacy computing alliance network module, characterized in that the privacy computing alliance network module comprises a plurality of nodes, privacy computation is carried out among the nodes, and encrypted process data and intermediate factors are mutually transmitted among the nodes so as to optimize and compute the local model and algorithm of each node;
the node has two service states, namely a user state and a kernel state; the user state comprises the upper-layer application program and the underlying computing engine of the node, and the kernel state handles the data sharing and circulation of the node on the hardware;
the zero-copy method of the zero-copy device comprises the following steps:
s1: the user state upper layer service converts the privacy data;
the user-state upper-layer application program accesses the privacy data of the user; the privacy data first need to be preprocessed, and the specific preprocessing method is as follows:
s11: null values are populated, null values being replaced with other known identifiers;
s12: data of the same type are screened and combined; the data of the same type in the data set are screened and combined so that they are arranged into one column or into adjacent columns;
s13: after the preprocessing of the privacy data is completed through S11 and S12, the privacy data are then converted in memory into a columnar data table using the Arrow technology, denoted: arow_private_data;
s2: user mode switching kernel mode memory mapping and disk dropping;
the specific operation of S2 is as follows:
the user state and the kernel state are switched once; meanwhile, the user-state upper-layer application program calls the mmap instruction to perform address mapping between the user-state memory buffer and the kernel buffer of the kernel state, so that the kernel buffer of the kernel state directly shares the arow_private_data in the user-state memory buffer;
the user-state program lands the arow_private_data on disk, and the service state is switched to the kernel state; the kernel-state CPU notifies the DMA device that the arow_private_data written by the user state into the user-state memory buffer and shared into the kernel buffer is to be written to the local disk, recorded as arow_private_data_file; the CPU is only responsible for the notification, and the subsequent landing to the disk is handled by the DMA, freeing CPU resources;
s3: the user state upper layer calls the bottom layer and the memory sharing data mechanism;
the specific operation of S3 is as follows:
the underlying computing engine in the user state performs data computation or model training on the local privacy data, and is invoked and initiated by the upper-layer application program in the user state;
the user state and the kernel state undergo one service-state switch, the mmap instruction is called, the kernel buffer of the kernel state and the memory buffer of the user state are address-mapped again, and the user memory buffer directly shares the data in the kernel buffer of the kernel state; after the service state is switched to the kernel state, the CPU points to the arow_private_data_file on the disk through the characteristics of the Arrow technology;
when the kernel state reads the Arrow file, the file is written directly into the kernel-state memory buffer in a zero-copy manner, the service state is switched from the kernel state to the user state, and the user-state underlying computing engine reads the arow_private_data_file data already shared in the kernel-state memory buffer;
the user-state underlying computing engine is written in the python and rust languages, and receives the arow_private_data_file data using the Arrow python and rust libraries;
s4: the user mode bottom layer performs calculation on the data;
the specific operation of S4 is as follows:
the user-state underlying computing engine computes on the arow_private_data_file, the computing process comprising: machine learning, deep learning, data governance and cleaning, data encryption and decryption, and homomorphic computation;
during the computation, part of the process data and ciphertext data computed by the current node are output in the user-state memory, the process data being homomorphically encrypted in a homomorphic encryption manner;
the user-state underlying computing engine uses the Arrow technology to perform columnar conversion and data preprocessing on the process data in the same manner as step S1, finally obtaining columnar process data and columnar ciphertext data, denoted arow_process_encrypt_data and arow_encrypt_data;
s5: the user state switches to the kernel state for calculation-result memory mapping and external transmission;
the specific operation of S5 is as follows:
the user-state underlying computing engine temporarily lands the arow_process_encrypt_data and the arow_encrypt_data on disk;
the disk-landing operation is consistent with step S2, obtaining arow_process_encrypt_data_file and arow_encrypt_data_file; the service state is now the kernel state, the sendfile instruction is used in the kernel state to notify the DMA, and the arow_process_encrypt_data and arow_encrypt_data in the kernel buffer during the disk-landing operation are gathered and copied directly to the network card through the DMA technology;
the kernel mode network card and the user mode Flight network transmission protocol technology are combined to transmit the arow_process_encrypt_data and the arow_encrypt_data to other nodes;
s6: the user state receives the transmission data and participates in the bottom layer operation.
2. The privacy-calculation intermediate-data-stream zero-copy device of claim 1, wherein the upper-layer application program is implemented in the java language and the underlying computing engine is implemented in the python or rust language;
the upper-layer application program is deployed with a platform application for performing data preprocessing and converting the local privacy data into a memory data format, the conversion technology used being Apache Arrow;
the underlying computing engine is deployed with a variety of privacy computing underlying technologies.
3. The privacy-calculation intermediate-data-stream zero-copy device of claim 2, wherein the upper-layer application program and the underlying computing engine each fully realize memory sharing through Arrow and the associated mmap, sendfile and DMA technologies;
the kernel state uses DMA at the hardware level and uses the mmap and sendfile+DMA modes at the instruction level; when the kernel state transmits data externally through the network card, the user state adopts the Apache Arrow Flight technology to transmit the data over the network.
4. A privacy computing intermediate data stream zero copy device according to claim 3, wherein the specific operation of S6 is as follows:
the network cards in the kernel state of the other nodes receive the externally transmitted arow_process_encrypt_data and arow_encrypt_data; the kernel state uses the sendfile instruction to notify the DMA, and the arow_process_encrypt_data and arow_encrypt_data in the network card are gathered and copied directly to the kernel buffer in the kernel state through the DMA technology;
the kernel state establishes a memory-address mapping and sharing relationship with the user state through the mmap instruction, the arow_process_encrypt_data and arow_encrypt_data in the kernel buffer of the kernel state are shared into the user-state memory buffer, the state is switched to the user state, and the underlying computing engine of the user state reads the arow_process_encrypt_data and arow_encrypt_data in the user-state memory buffer;
the process data and the ciphertext data of the node are fused with the arow_process_encrypt_data and arow_encrypt_data transmitted by other nodes;
if the node's data in memory are lost, zero copy is performed in the manner of step S3, and the arow_process_encrypt_data_file and arow_encrypt_data_file are read without deserialization;
and through continuous iterative polling of steps S3-S6, multi-node multi-party privacy computation is performed on the privacy data to obtain the final calculation result.
CN202310292403.7A 2023-03-23 2023-03-23 Privacy calculation intermediate data stream zero-copy device and method Active CN116455612B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310292403.7A CN116455612B (en) 2023-03-23 2023-03-23 Privacy calculation intermediate data stream zero-copy device and method


Publications (2)

Publication Number Publication Date
CN116455612A CN116455612A (en) 2023-07-18
CN116455612B true CN116455612B (en) 2023-11-28

Family

ID=87119385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310292403.7A Active CN116455612B (en) 2023-03-23 2023-03-23 Privacy calculation intermediate data stream zero-copy device and method

Country Status (1)

Country Link
CN (1) CN116455612B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101873337A (en) * 2009-04-22 2010-10-27 电子科技大学 Zero-copy data capture technology based on rt8169 gigabit net card and Linux operating system
CN102291298A (en) * 2011-08-05 2011-12-21 曾小荟 Efficient computer network communication method oriented to long message
CN102402487A (en) * 2011-11-15 2012-04-04 北京天融信科技有限公司 Zero copy message reception method and system
CN103678203A (en) * 2013-12-13 2014-03-26 国家计算机网络与信息安全管理中心 Method and device for achieving zero copy of network card
CN104239249A (en) * 2014-09-16 2014-12-24 国家计算机网络与信息安全管理中心 PCI-E (peripheral component interconnect-express) zero-copy DMA (direct memory access) data transmission method
CN104333533A (en) * 2014-09-12 2015-02-04 北京华电天益信息科技有限公司 A Data packet zero-copy acquiring method for industrial control system network
WO2016078313A1 (en) * 2014-11-20 2016-05-26 中兴通讯股份有限公司 Data writing method and device
CN109117270A (en) * 2018-08-01 2019-01-01 湖北微源卓越科技有限公司 The method for improving network packet treatment effeciency
CN111240853A (en) * 2019-12-26 2020-06-05 天津中科曙光存储科技有限公司 Method and system for bidirectionally transmitting massive data in node
CN112995753A (en) * 2019-12-16 2021-06-18 中兴通讯股份有限公司 Media stream distribution method, CDN node server, CDN system and readable storage medium
CN113986811A (en) * 2021-09-23 2022-01-28 北京东方通网信科技有限公司 High-performance kernel-mode network data packet acceleration method
WO2022105884A1 (en) * 2020-11-23 2022-05-27 中兴通讯股份有限公司 Data transmission method and apparatus, network device, and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2477930C2 (en) * 2008-08-04 2013-03-20 ЗетТиИ Корпорейшн Method and system for transmitting flow multimedia data with zero copying
US10530747B2 (en) * 2017-01-13 2020-01-07 Citrix Systems, Inc. Systems and methods to run user space network stack inside docker container while bypassing container Linux network stack
US20210117246A1 (en) * 2020-09-25 2021-04-22 Intel Corporation Disaggregated computing for distributed confidential computing environment
US20220263869A1 (en) * 2021-02-17 2022-08-18 Seagate Technology Llc Data validation for zero copy protocols
US11784990B2 (en) * 2021-12-13 2023-10-10 Intel Corporation Protecting data transfer between a secure application and networked devices

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research and implementation of zero-copy-based network data capture technology; Zhang Ke, Quan Yining; Electronic Science and Technology (No. 11); full text *
Research and implementation of a zero-copy packet capture platform; Wang Bailing, Fang Binxing, Yun Xiaochun; Chinese Journal of Computers (No. 01); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant