CN103220226B - Transparent real-time traffic compression method and system between data centers

Transparent real-time traffic compression method and system between data centers

Info

Publication number
CN103220226B
CN103220226B
Authority
CN
China
Prior art keywords
compression
data
stream
data block
queue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310158691.3A
Other languages
Chinese (zh)
Other versions
CN103220226A (en)
Inventor
王燕飞
吴教仁
刘晓光
刘涛
刘宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201310158691.3A priority Critical patent/CN103220226B/en
Publication of CN103220226A publication Critical patent/CN103220226A/en
Application granted granted Critical
Publication of CN103220226B publication Critical patent/CN103220226B/en

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention proposes a transparent real-time traffic compression method between data centers, comprising the steps of: performing flow-based fine-grained compression on a data stream, including dividing the data stream into flows according to port attributes and compressing each flow with a corresponding flow-specific compression strategy to obtain multiple compressed data blocks; and, according to the flow types of the multiple compressed data blocks, transmitting the compressed data blocks of the multiple flow types simultaneously through different tunnels, wherein a batching strategy and local buffer pools are used to transmit the compressed data blocks. The invention compresses the redundant information of specific services, fully exploits the bandwidth utilization of the transmission links between data centers, and optimizes system overhead; it is flexible, efficient, and offers clear performance advantages. The invention also discloses a transparent real-time traffic compression system between data centers.

Description

Transparent real-time traffic compression method and system between data centers
Technical field
The present invention relates to the field of computer technology, and in particular to a transparent real-time traffic compression method and system between data centers.
Background art
On high-speed links of 10 Gbps magnitude, real-time packet compression faces significant challenges. Three such challenges are described below.
(1) In system implementation, a hardware acceleration device is generally used to offload system processing pressure. However, when a peripheral acceleration device participates in system interaction, it faces substantial overhead, such as I/O overhead (including PCIe bandwidth utilization efficiency and high-latency device register accesses) and operating system overhead (including system call overhead and packet copies between kernel space and user space).
(2) On a general multi-core platform, a 10 Gbps line-rate processing requirement necessarily leads to a concurrent design. By Amdahl's law, the serial portion of the system ultimately limits the achievable concurrent speedup, so optimizing the concurrent design is essential. However, some current compression-card drivers fail, in both the hardware and the driver, to suit high-speed concurrent processing scenarios. Although increasing the length of the data to be compressed improves the efficiency of the compression device, it also increases packet processing delay. A blocking I/O communication pattern cannot fully exploit the concurrent processing efficiency between the general-purpose processor and the acceleration device.
(3) At the network design level, transparent compression requires symmetric deployment of compression and decompression. This raises the problem of how to best exploit application characteristics and apply different compression strategies according to the data redundancy features of different applications. To optimize compression bandwidth and efficiency, multiple packets are usually aggregated and compressed together, but packet loss caused by the unreliability of the underlying IP network may then affect the TCP performance of multiple services simultaneously, so mitigating the performance loss caused by packet loss is also essential, and adaptive control must be considered in the design. Moreover, to transmit packets that were aggregated and compressed together, a new header must be re-encapsulated; the single-flow characteristic of the traditional tunnel pattern may limit packet transmission efficiency in the data center core layer, so designing an efficient tunnel scheme is likewise important.
The prior art is mainly applied to high-latency transmission links, such as satellite links. Because there has been no high-bandwidth requirement between data centers, no patents or techniques have optimized the overhead of the compression engine.
In existing compression techniques, to improve compression efficiency, block-based compression configurations are adopted in which multiple packets are compressed together. A virtual-link method achieves transparent packet compression: the compressed data is transmitted over the virtual link, with compression and decompression performed at the two ends of the link.
The prior art has the following shortcomings:
(1) Although existing block-based compression techniques improve the compression efficiency of the acceleration device to some degree, they lack fine-grained control, which amplifies the impact of anomalies such as packet loss on network performance: a single loss may affect multiple TCP flows, and hence multiple services, simultaneously. Moreover, without fine-grained control it is difficult to exploit the data redundancy features of different TCP flows and different services, so the achievable compression efficiency is limited.
(2) In the data center core network, a single virtual link may prevent high-speed traffic from fully utilizing the underlying network.
(3) There has been no high-throughput design requirement, so system overhead is difficult to characterize, and acceleration of applications on general multi-core parallel platforms is even harder to demonstrate. A high-performance real-time system at the 10 Gbps level must optimize system overhead, with particular emphasis on efficient parallelization strategies.
Summary of the invention
The present invention aims to solve at least one of the technical problems existing in the prior art.
To this end, one object of the present invention is to propose a transparent real-time traffic compression method between data centers. The method compresses the redundant information of specific services, fully exploits the bandwidth utilization of the transmission links between data centers, and optimizes system overhead; it is flexible, efficient, and offers clear performance advantages.
A second object of the present invention is to propose a transparent real-time traffic compression system between data centers.
To achieve the above objects, an embodiment of the first aspect of the present invention proposes a transparent real-time traffic compression method between data centers, comprising the steps of: performing flow-based fine-grained compression on a data stream, including dividing the data stream into flows according to port attributes and compressing each flow with a corresponding flow-specific compression strategy to obtain multiple compressed data blocks; and, according to the flow types of the multiple compressed data blocks, transmitting the compressed data blocks of the multiple flow types simultaneously through different tunnels, wherein a batching strategy and local buffer pools are used to transmit the compressed data blocks.
The transparent real-time traffic compression method between data centers according to this embodiment adopts a fine-grained control strategy: it performs finer-grained performance optimization according to the characteristics of the carried applications, compresses the redundant information of specific services, fully exploits the bandwidth resources between data centers, improves bandwidth utilization, optimizes cost, reduces the impact of packet loss on an unreliable network, and provides flexibility and performance advantages, thereby optimizing transmission performance. In the data center network, packets of different flows can be propagated over different routing paths simultaneously to improve transmission efficiency and redundancy and to reduce system overhead.
In one embodiment of the invention, the number of tunnels is of the same order of magnitude as the number of packet flows in the data stream. This optimizes the tunnel design: the multi-tunnel model removes the single-link restriction and makes full use of the bandwidth of the data center network.
In one embodiment of the invention, the flow-based fine-grained compression of the data stream operates with a pipelining strategy: the central processing unit (CPU) fills a compression request queue, a compression card compresses the data in the request queue and delivers the compressed data into a response queue, and the CPU then retrieves the compressed data from the response queue. The pipelining strategy keeps the CPU and the compression card running at high load simultaneously, fully exploiting the utilization of the general multi-core CPU and the compression card, avoiding idle cycles, and thereby improving overall system performance.
In one embodiment of the invention, a buffer system between the CPU and the compression card is used, and the buffer size between them is increased according to the amplitude of system load jitter.
In one embodiment of the invention, the request queue and the response queue are polled. Polling the request queue and the response queue reduces system interrupt overhead and context-switching overhead.
In one embodiment of the invention, the method further comprises the step of transmitting the compressed data blocks using a user-space-driven I/O model. This reduces system call overhead and packet-copy overhead between kernel space and user space, mitigates the CPU consumption and cache pollution caused by copying large packets, and thereby improves system performance.
An embodiment of the second aspect of the present invention proposes a transparent real-time traffic compression system between data centers, comprising a compression processing device and a flow management device.
The compression processing device performs flow-based fine-grained compression on a data stream: according to the port attributes of the data stream, it divides the stream into flows and compresses each flow with a corresponding flow-specific compression strategy to obtain multiple compressed data blocks. The flow management device transmits the compressed data blocks of the multiple flow types simultaneously through different tunnels according to their flow types, using a batching strategy and local buffer pools to transmit the compressed data blocks.
The transparent real-time traffic compression system between data centers according to this embodiment adopts a fine-grained control strategy: it performs finer-grained performance optimization according to the characteristics of the carried applications, compresses the redundant information of specific services, fully exploits the bandwidth resources between data centers, improves bandwidth utilization, optimizes cost, reduces the impact of packet loss on an unreliable network, and provides flexibility and performance advantages, thereby optimizing transmission performance. In the data center network, packets of different flows can be propagated over different routing paths simultaneously to improve transmission efficiency and redundancy and to reduce system overhead.
In one embodiment of the invention, the number of tunnels is of the same order of magnitude as the number of packet flows in the data stream. This optimizes the tunnel design: the multi-tunnel model removes the single-link restriction and makes full use of the bandwidth of the data center network.
In one embodiment of the invention, the compression processing device performs the flow-based fine-grained compression of the data stream with a pipelining strategy. The compression processing device comprises a CPU and a compression card: the CPU fills a compression request queue; the compression card compresses the data in the request queue and delivers the compressed data into a response queue; and the CPU retrieves the compressed data from the response queue. The pipelining strategy keeps the CPU and the compression card running at high load simultaneously, fully exploiting the utilization of the general multi-core CPU and the compression card, avoiding idle cycles, and thereby improving overall system performance.
In one embodiment of the invention, the compression processing device further comprises a buffer system located between the CPU and the compression card, used to increase the buffer size between them according to the amplitude of system load jitter.
In one embodiment of the invention, the compression processing device also polls the request queue and the response queue, which reduces system interrupt overhead and context-switching overhead.
In one embodiment of the invention, the flow processing device transmits the compressed data blocks using a user-space-driven I/O model. This reduces system call overhead and packet-copy overhead between kernel space and user space, mitigates the CPU consumption and cache pollution caused by copying large packets, and thereby improves system performance.
Additional aspects and advantages of the present invention will be set forth in part in the following description, will in part become apparent from the description, or may be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of the transparent real-time traffic compression method between data centers according to an embodiment of the present invention;
Fig. 2 is a flow chart of the pipelining strategy according to an embodiment of the present invention;
Fig. 3 is the overall framework of a system according to an embodiment of the present invention; and
Fig. 4 is a structural schematic diagram of the transparent real-time traffic compression system between data centers according to an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting the present invention.
The transparent real-time traffic compression method between data centers according to an embodiment of the present invention is described below with reference to Fig. 1 and comprises the following steps:
Step S110: performing flow-based fine-grained compression on the data stream, comprising: dividing the data stream into flows according to port attributes and compressing each flow with a corresponding flow-specific compression strategy to obtain multiple compressed data blocks.
In one embodiment of the invention, the flow-based fine-grained compression of the data stream operates with a pipelining strategy: the CPU fills a compression request queue, a compression card compresses the data in the request queue and delivers the compressed data into a response queue, and the CPU retrieves the compressed data from the response queue. A buffer system between the CPU and the compression card is used, and the buffer size between them is increased according to the amplitude of system load jitter.
The request queue and the response queue are polled.
Step S120: according to the flow types of the multiple compressed data blocks, transmitting the compressed data blocks of the multiple flow types simultaneously through different tunnels, wherein a batching strategy and local buffer pools are used to transmit the compressed data blocks.
The number of tunnels is of the same order of magnitude as the number of packet flows in the data stream.
The method further comprises the step of transmitting the compressed data blocks using a user-space-driven I/O model.
The transparent real-time traffic compression method between data centers according to an embodiment of the present invention is described below with a concrete example. It should be understood that the following explanation is for illustrative purposes only, and embodiments of the invention are not limited to it.
Step S210: performing flow-based fine-grained compression on the data stream.
The flow-based fine-grained compression design is as follows:
The data stream is divided into flows according to port attributes, and a different compression strategy is applied to each division, yielding multiple compressed data blocks. Exploiting the similarity of data redundancy features within a single flow improves compression efficiency and optimizes bandwidth. When multiple packets are aggregated and compressed to raise bandwidth, this design ensures that a packet loss affects only a single TCP flow, avoiding packet loss being triggered across multiple flows simultaneously, and hence preventing multiple connections from entering congestion avoidance at the same time, which in turn optimizes overall transmission performance.
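To make the flow division concrete, the following minimal sketch (in C) hashes a packet's address/port tuple to a per-flow session context and chooses the compression strategy from the destination port. All names (flow_ctx, pick_strategy, the port-to-strategy rule) are illustrative assumptions and not code from the patent.

/* Minimal sketch of port-based flow division; names are illustrative. */
#include <stdint.h>
#include <stdlib.h>

enum strategy { STRAT_FAST, STRAT_DEEP };

struct flow_ctx {                      /* per-flow session context */
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    enum strategy strat;               /* compression strategy for this flow */
    struct flow_ctx *next;             /* hash-bucket chaining */
};

#define FLOW_BUCKETS 4096
/* One table per core in the design described above, so no lock is shown. */
static __thread struct flow_ctx *flow_tab[FLOW_BUCKETS];

/* Assumed rule: text-like services (e.g. HTTP) compress deeply,
 * everything else uses a fast strategy. */
static enum strategy pick_strategy(uint16_t dport)
{
    return (dport == 80 || dport == 8080) ? STRAT_DEEP : STRAT_FAST;
}

static uint32_t flow_hash(uint32_t s, uint32_t d, uint16_t sp, uint16_t dp)
{
    return (s ^ d ^ ((uint32_t)sp << 16 | dp)) % FLOW_BUCKETS;
}

/* Look up (or create) the flow a packet belongs to, so packets of the
 * same flow are aggregated and compressed together. */
struct flow_ctx *flow_lookup(uint32_t s, uint32_t d, uint16_t sp, uint16_t dp)
{
    uint32_t h = flow_hash(s, d, sp, dp);
    for (struct flow_ctx *f = flow_tab[h]; f; f = f->next)
        if (f->saddr == s && f->daddr == d && f->sport == sp && f->dport == dp)
            return f;
    struct flow_ctx *f = calloc(1, sizeof(*f));
    f->saddr = s; f->daddr = d; f->sport = sp; f->dport = dp;
    f->strat = pick_strategy(dp);
    f->next = flow_tab[h]; flow_tab[h] = f;
    return f;
}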
Step S220: according to the flow types of the multiple compressed data blocks, transmitting the compressed data blocks of the multiple flow types simultaneously through different tunnels.
A single-tunnel mode would limit the transmission efficiency of high traffic in the data center network. To improve network transmission efficiency and redundancy, packets of different flows can be propagated over different routing paths in the data center network simultaneously. Specifically, the virtual-transmission-link design based on a multi-tunnel pattern adopts a NAT-like layout strategy: tunnels are mapped jointly by port and IP address, which reduces IP address consumption. In the extreme case, the number of tunnels is of the same order of magnitude as the number of packet flows in the data stream. This optimizes the tunnel design: the multi-tunnel model removes the single-link restriction and makes full use of the bandwidth of the data center network.
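The NAT-like multi-tunnel mapping can be pictured with the short sketch below: one local IP address is shared by many tunnels that differ only in port, and a flow hash picks the tunnel, so different flows can take different paths through the core (for example via 5-tuple ECMP). The tunnel count and field names are assumptions for illustration.

/* Sketch of the NAT-like multi-tunnel mapping; names are illustrative. */
#include <stdint.h>

#define N_TUNNELS 64                   /* assumed tunnel pool size */

struct tunnel {
    uint32_t local_ip;                 /* one IP shared by many tunnels */
    uint16_t local_port;               /* the port distinguishes the tunnel */
    uint32_t remote_ip;
    uint16_t remote_port;
};

static struct tunnel tunnels[N_TUNNELS];

/* Different flows hash to different tunnels, so the data center core
 * can spread them over different routed paths. */
static struct tunnel *select_tunnel(uint32_t flow_hash)
{
    return &tunnels[flow_hash % N_TUNNELS];
}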
To reduce the per-packet bookkeeping overhead at the 10 Gbps level, such as the high-latency register access overhead of the accelerator card's PCIe interface and packet memory allocation overhead, a batching strategy and local buffer pools are used to transmit the compressed data blocks, reducing contention and amortized per-packet cost.
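A minimal sketch of the batching strategy with a local buffer pool follows: buffers are recycled through a thread-local free list so the common case allocates without locks or per-packet malloc, and work is pushed to the device a batch at a time so per-packet bookkeeping (such as PCIe doorbell writes) is amortized. device_submit(), BATCH, and BUF_SIZE are illustrative assumptions.

/* Sketch of batching plus a per-thread local buffer pool. */
#include <stddef.h>
#include <stdlib.h>

#define BATCH    32                    /* assumed batch size */
#define BUF_SIZE 2048

struct buf { struct buf *next; char data[BUF_SIZE]; };

static __thread struct buf *local_pool;    /* thread-local free list */

static struct buf *buf_get(void)
{
    struct buf *b = local_pool;
    if (b) { local_pool = b->next; return b; }
    return malloc(sizeof(struct buf));     /* slow path: refill from heap */
}

static void buf_put(struct buf *b)
{
    b->next = local_pool;                  /* no lock: pool is per-thread */
    local_pool = b;
}

/* Stand-in for the hardware enqueue; one call covers a whole batch. */
extern void device_submit(struct buf **bufs, size_t n);

static __thread struct buf *pending[BATCH];
static __thread size_t npending;

static void submit(struct buf *b)
{
    pending[npending++] = b;
    if (npending == BATCH) {               /* one doorbell per BATCH buffers */
        device_submit(pending, npending);
        npending = 0;
    }
}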
The lock-free concurrent system design of this method at different levels is as follows:
(1) Application level: to fully exploit the concurrent processing capability of a general multi-core system, a fine-grained flow-affinity strategy lets packets of different flows be compressed in parallel without locking. The flow-level partition strategy guarantees that different physical cores of the multi-core processor can perform compression/decompression lock-free and fully concurrently.
(2) Driver level: a thread-safe, fully concurrent driver is adopted so that serial code in the driver does not become the bottleneck of system concurrency. The driver should make full use of the concurrent hardware access supported by the accelerator card, minimizing lock contention and ensuring that high-traffic real-time compression proceeds fully concurrently.
(3) Hardware level: accelerator hardware supporting multiple hardware queues is selected, so that the driver and the application can access the accelerator concurrently without locks. For an accelerator card without hardware multi-queue support, highly concurrent drivers and upper-layer software inevitably serialize access to the device under mutual exclusion, and this serial execution time directly limits the speedup of the multi-core system.
(4) Data structure level: all potential memory allocation in the compression system uses per-core data structures (software caches) together with global lock-free queues, improving system concurrency and flexibility to the greatest extent.
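As one way to realize level (4), the sketch below pairs a per-core software cache with a global lock-free free list (a Treiber stack using C11 atomics). The fast path touches only core-local memory; only cache misses and overflows reach the shared list. This is an illustrative sketch (ABA hardening such as tagged pointers is omitted for brevity), not the patent's own data structure.

/* Per-core cache in front of a global lock-free free list. */
#include <stdatomic.h>
#include <stddef.h>

struct node { struct node *next; };

/* Global lock-free free list (Treiber stack) shared by all cores. */
static _Atomic(struct node *) global_free;

static void global_push(struct node *n)
{
    struct node *head = atomic_load(&global_free);
    do { n->next = head; }
    while (!atomic_compare_exchange_weak(&global_free, &head, n));
}

static struct node *global_pop(void)
{
    struct node *head = atomic_load(&global_free);
    while (head && !atomic_compare_exchange_weak(&global_free, &head, head->next))
        ;                                  /* retry with refreshed head */
    return head;
}

/* Per-core cache: the common case allocates and frees with no shared
 * memory traffic at all. */
#define CACHE_MAX 64
static __thread struct node *cache[CACHE_MAX];
static __thread int cache_n;

static struct node *node_get(void)
{
    return cache_n ? cache[--cache_n] : global_pop();
}

static void node_put(struct node *n)
{
    if (cache_n < CACHE_MAX) cache[cache_n++] = n;
    else global_push(n);
}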
The flow-based fine-grained compression of the data stream operates with a pipelining strategy. Pipelining keeps the CPU and the compression card running at high load simultaneously, fully exploiting the utilization of the general multi-core CPU and the compression card, avoiding idle cycles, and thereby improving overall system performance. The specific design, shown in Fig. 2, comprises the CPU, a request ring buffer (RequestRingbuffer), a response ring buffer (ResponseRingbuffer), and the hardware HW. In Fig. 2, CPU denotes the central processing unit, the buffer area consists of the request ring buffer and the response ring buffer, and HW denotes the compression card. The design comprises:
(1) Making full use of the buffer system between the CPU and the compression card, and increasing the buffer size appropriately according to the amplitude of system load jitter.
(2) Dividing compression into three stages: request enqueue, handled by the CPU; compression, handled by the compression card; and response dequeue, handled by the CPU.
The concrete steps are shown in Fig. 2:
Step S1: the CPU places compression requests into the request ring buffer (RequestRingbuffer).
Step S2: the compression card HW compresses the data in the request ring buffer (RequestRingbuffer).
Step S3: the compression card HW delivers the compressed data into the response ring buffer (ResponseRingbuffer).
Step S4: the CPU retrieves the compressed data from the response ring buffer.
The goal of the pipeline design in Fig. 2 is to keep both the CPU and the compression-card hardware HW relatively busy, fully exploiting the throughput of the pipeline.
(3) In the I/O pattern, avoiding a blocking communication mode. In the extreme case of a blocking model, the CPU is blocked until packet compression completes, which lowers pipeline efficiency. A user-space-driven I/O model can be used to transmit the compressed data blocks: to reduce system call overhead and kernel/user packet-copy overhead, an I/O model based on a user-space driver (UserSpace Driver, UIO) realizes zero-copy I/O, mitigating the CPU consumption and cache pollution caused by copying large packets and thereby improving system performance (a minimal UIO mapping sketch follows this list).
(4) Polling the request queue and the response queue, to extract the most from the pipeline. Polling the request queue and the response queue reduces system interrupt overhead and context-switching overhead (see the ring sketch below).
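Combining points (2) and (4), the sketch below models the Fig. 2 pipeline with two single-producer/single-consumer rings: the CPU keeps the request ring fed (step S1), the compression card drains it and fills the response ring (steps S2 and S3), and the CPU polls the response ring (step S4) instead of waiting on interrupts. Ring size and callback names are illustrative assumptions.

/* SPSC rings between CPU and compression card, with CPU-side polling. */
#include <stdatomic.h>
#include <stddef.h>

#define RING_SZ 256                        /* power of two, assumed */

struct ring {
    _Atomic size_t head, tail;             /* producer writes head, consumer tail */
    void *slot[RING_SZ];
};

static int ring_put(struct ring *r, void *p)        /* producer side */
{
    size_t h = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t t = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (h - t == RING_SZ) return -1;                 /* full: caller retries */
    r->slot[h % RING_SZ] = p;
    atomic_store_explicit(&r->head, h + 1, memory_order_release);
    return 0;
}

static void *ring_get(struct ring *r)               /* consumer side */
{
    size_t t = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t h = atomic_load_explicit(&r->head, memory_order_acquire);
    if (t == h) return NULL;                         /* empty */
    void *p = r->slot[t % RING_SZ];
    atomic_store_explicit(&r->tail, t + 1, memory_order_release);
    return p;
}

static struct ring req_ring, rsp_ring;

/* CPU side: keep the request ring full and poll the response ring, so
 * both the CPU and the card stay busy. */
void cpu_poll_loop(void *(*next_chunk)(void), void (*emit)(void *))
{
    for (;;) {
        void *chunk = next_chunk();
        if (chunk) ring_put(&req_ring, chunk);       /* step S1 */
        void *done = ring_get(&rsp_ring);            /* step S4 */
        if (done) emit(done);
    }
}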
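For point (3), the following minimal sketch shows the core of a user-space driver on Linux UIO: the device's register window is mmap()ed from a /dev/uioX node directly into the process, so the data path needs no per-packet system calls and no kernel/user copies. The device path and mapping size are assumptions; a real driver would also program the DMA rings through this window.

/* Minimal UIO mapping sketch; device path and size are assumptions. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/uio0", O_RDWR);    /* assumed UIO device node */
    if (fd < 0) { perror("open"); return 1; }

    /* UIO exposes mapping 0 at file offset 0; the size is device-specific. */
    size_t len = 4096;
    volatile void *bar = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (bar == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* From here the process can ring doorbells and read status directly,
     * with no copy between kernel space and user space. */
    munmap((void *)bar, len);
    close(fd);
    return 0;
}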
Fig. 3 is the overall framework of a system realized according to this method. It should be understood that Fig. 3 is for illustrative purposes only, and embodiments of the invention are not limited to it. The system is divided into a flow management module and a compression processing module. Fine-grained flow management supplies the compression processing module with a continuous sequence of flow-divided compressed data blocks (chunks); in the middle, efficient compressed-data transmission is carried out over multiple tunnels; and this symmetric system provides transparent compression. Because the decompression-side flow and the compression side are largely symmetric, only the compression process on the left of Fig. 3 is described. FlowsManagement denotes the flow management module; C0, C1, and C2 denote physical processor cores of the multi-core platform, where, according to the parallelization strategy, each physical core owns a number of session contexts, denoted flows in Fig. 3. Session contexts are generally organized as a hash table; data from different flows is organized according to the strategy, chunks to be compressed are formed from the flows (Flowschunk in Fig. 3) and submitted to the compression engine (CompressEngine); the final result Newdata is re-encapsulated with a new header (+header in Fig. 3) and sent through the tunnel (Tunnel) to the peer for decompression.
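The left-hand data path of Fig. 3 can be summarized by the sketch below: packets of one flow accumulate in that flow's chunk; a full chunk is handed to the compression engine, and the result is re-encapsulated with a new header by the tunnel layer before transmission. compress_block() and tunnel_send() are stand-ins for the engine and the tunnel layer, and the chunk size is an assumption.

/* Sketch of per-flow chunk assembly and re-encapsulation. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SZ 8192                      /* assumed chunk size */

struct chunk {
    uint16_t flow_id;                      /* lets the peer demultiplex */
    uint16_t len;
    uint8_t  data[CHUNK_SZ];
};

/* Stand-ins for the compression engine and the tunnel layer, which
 * re-encapsulates the compressed block with a new header ("+header"). */
extern size_t compress_block(const uint8_t *in, size_t n, uint8_t *out);
extern void   tunnel_send(uint16_t flow_id, const uint8_t *pkt, size_t n);

/* Assumes a packet never exceeds CHUNK_SZ. */
void chunk_append(struct chunk *c, const uint8_t *pkt, uint16_t n)
{
    if (c->len + n > CHUNK_SZ) {           /* chunk full: compress and ship */
        uint8_t out[CHUNK_SZ];
        size_t m = compress_block(c->data, c->len, out);
        tunnel_send(c->flow_id, out, m);
        c->len = 0;
    }
    memcpy(c->data + c->len, pkt, n);      /* same-flow packets aggregate */
    c->len += n;
}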
The transparent real-time traffic compression method between data centers according to embodiments of the present invention is mainly applied to a unified high-performance inter-data-center transmission optimization platform, for optimizing packet transmission over links with a high latency-bandwidth product, including bandwidth utilization optimization and latency optimization. By compressing the redundant information of specific services with a real-time compression design, it fully exploits the bandwidth resources between data centers, improves the bandwidth utilization of services with high data redundancy, optimizes cost, and provides flexibility and performance advantages. It adopts a fine-grained control strategy, performing finer-grained performance optimization according to the characteristics of the carried applications, and an I/O system highly optimized for performance, so that a system at this magnitude does not introduce additional high overhead.
The transparent real-time traffic compression system 100 between data centers according to an embodiment of the present invention is described below with reference to Fig. 4 and comprises a compression processing device 110 and a flow management device 120.
The compression processing device 110 performs flow-based fine-grained compression on the data stream: according to the port attributes of the data stream, it divides the stream into flows and compresses each flow with a corresponding flow-specific compression strategy to obtain multiple compressed data blocks. The flow management device 120 transmits the compressed data blocks of the multiple flow types simultaneously through different tunnels according to their flow types, using a batching strategy and local buffer pools to transmit the compressed data blocks.
The number of tunnels is of the same order of magnitude as the number of packet flows in the data stream.
The compression processing device 110 performs the flow-based fine-grained compression with a pipelining strategy. The compression processing device 110 comprises a CPU 111 and a compression card 112: the CPU 111 fills the compression request queue; the compression card 112 compresses the data in the request queue and delivers the compressed data into the response queue; and the CPU 111 retrieves the compressed data from the response queue.
The compression processing device 110 further comprises a buffer system 113 located between the CPU 111 and the compression card 112, used to increase the buffer size between them according to the amplitude of system load jitter.
The compression processing device 110 also polls the request queue and the response queue.
The flow processing device transmits the compressed data blocks using a user-space-driven I/O model.
The transparent real-time traffic compression system between data centers according to an embodiment of the present invention is described below with a concrete example. It should be understood that the following explanation is for illustrative purposes only, and embodiments of the invention are not limited to it.
The compression processing device 110 performs flow-based fine-grained compression on the data stream.
The compression processing device 110 comprises the CPU 111, the compression card 112, and the buffer system 113, with the buffer system 113 located between the CPU 111 and the compression card 112. The flow-based fine-grained compression design in the compression processing device 110 is as follows:
The compression processing device 110 divides the data stream into flows according to port attributes and applies a different compression strategy to each division, yielding multiple compressed data blocks. It exploits the similarity of data redundancy features within a single flow to improve compression efficiency and optimize bandwidth. When multiple packets are aggregated and compressed to raise bandwidth, this ensures that a packet loss affects only a single TCP flow, avoiding packet loss being triggered across multiple flows simultaneously, and hence preventing multiple connections from entering congestion avoidance at the same time, which in turn optimizes overall transmission performance.
The flow management device 120 transmits the compressed data blocks of the multiple flow types simultaneously through different tunnels according to their flow types.
A single-tunnel mode would limit the transmission efficiency of high traffic in the data center network. To improve network transmission efficiency and redundancy, packets of different flows can be propagated over different routing paths in the data center network simultaneously. Specifically, in the flow management device 120 the virtual-transmission-link design based on a multi-tunnel pattern adopts a NAT-like layout strategy: tunnels are mapped jointly by port and IP address, which reduces IP address consumption. In the extreme case, the number of tunnels is of the same order of magnitude as the number of packet flows in the data stream. This optimizes the tunnel design: the multi-tunnel model removes the single-link restriction and makes full use of the bandwidth of the data center network.
To reduce the per-packet bookkeeping overhead at the 10 Gbps level, such as the high-latency register access overhead of the accelerator card's PCIe interface and packet memory allocation overhead, the flow management device 120 uses a batching strategy and local buffer pools to transmit the compressed data blocks, reducing contention and amortized per-packet cost.
The lock-free concurrent system design of this system at different levels is as follows:
(1) Application level: to fully exploit the concurrent processing capability of a general multi-core system, the fine-grained flow-affinity strategy of the compression processing device 110 lets packets of different flows be compressed in parallel without locking. The flow-level partition strategy guarantees that different physical cores of the multi-core processor can perform compression/decompression lock-free and fully concurrently.
(2) Driver level: a thread-safe, fully concurrent driver is adopted so that serial code in the driver does not become the bottleneck of system concurrency. The driver should make full use of the concurrent hardware access supported by the accelerator card, minimizing lock contention and ensuring that high-traffic real-time compression proceeds fully concurrently.
(3) Hardware level: accelerator hardware supporting multiple hardware queues is selected, so that the driver and the application can access the accelerator concurrently without locks. For an accelerator card without hardware multi-queue support, highly concurrent drivers and upper-layer software inevitably serialize access to the device under mutual exclusion, and this serial execution time directly limits the speedup of the multi-core system.
(4) Data structure level: all potential memory allocation in the compression system uses per-core data structures (software caches) together with global lock-free queues, improving system concurrency and flexibility to the greatest extent.
The compression processing device 110 performs the flow-based fine-grained compression of the data stream with a pipelining strategy. Pipelining keeps the CPU 111 and the compression card 112 running at high load simultaneously, fully exploiting the utilization of the general multi-core CPU and the compression card 112, avoiding idle cycles, and thereby improving overall system performance. The specific design, shown in Fig. 2, comprises the CPU, the request ring buffer (RequestRingbuffer), the response ring buffer (ResponseRingbuffer), and the hardware HW. In the figure, CPU denotes the CPU 111, the buffer system 113 consists of the request ring buffer and the response ring buffer, and HW denotes the compression card 112. The design comprises:
(1) Making full use of the buffer system 113 between the CPU and the compression card 112, and increasing the buffer size appropriately according to the amplitude of system load jitter.
(2) Dividing compression into three stages: request enqueue, handled by the CPU 111; compression, handled by the compression card 112; and response dequeue, handled by the CPU 111.
The concrete steps are shown in Fig. 2:
Step S1: the CPU 111 places compression requests into the request ring buffer (RequestRingbuffer).
Step S2: the compression card 112 (HW) compresses the data in the request ring buffer (RequestRingbuffer).
Step S3: the compression card 112 (HW) delivers the compressed data into the response ring buffer (ResponseRingbuffer).
Step S4: the CPU 111 retrieves the compressed data from the response ring buffer.
The goal of the pipeline design in Fig. 2 is to keep both the CPU and the compression card 112 hardware HW relatively busy, fully exploiting the throughput of the pipeline.
(3) In the I/O pattern, avoiding a blocking communication mode. In the extreme case of a blocking model, the CPU is blocked until packet compression completes, which lowers pipeline efficiency. A user-space-driven I/O model can be used to transmit the compressed data blocks: to reduce system call overhead and kernel/user packet-copy overhead, the flow processing device adopts an I/O model based on a user-space driver (UserSpace Driver, UIO), realizing zero-copy I/O, mitigating the CPU consumption and cache pollution caused by copying large packets, and thereby improving system performance.
(4) Polling the request queue and the response queue, to extract the most from the pipeline. The compression processing device 110 polls the request queue and the response queue, reducing system interrupt overhead and context-switching overhead.
The transparent real-time traffic compression system between data centers according to embodiments of the present invention is mainly applied to a unified high-performance inter-data-center transmission optimization platform, for optimizing packet transmission over links with a high latency-bandwidth product, including bandwidth utilization optimization and latency optimization. By compressing the redundant information of specific services with a real-time compression design, it fully exploits the bandwidth resources between data centers, improves the bandwidth utilization of services with high data redundancy, optimizes cost, and provides flexibility and performance advantages. It adopts a fine-grained control strategy, performing finer-grained performance optimization according to the characteristics of the carried applications, and an I/O system highly optimized for performance, so that a system at this magnitude does not introduce additional high overhead.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in conjunction with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been illustrated and described, those of ordinary skill in the art will understand that numerous changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the claims and their equivalents.

Claims (12)

1. A transparent real-time traffic compression method between data centers, characterized by comprising the steps of:
performing flow-based fine-grained compression on a data stream, comprising: dividing the data stream into flows according to port attributes and compressing each flow with a corresponding flow-specific compression strategy to obtain multiple compressed data blocks;
according to the flow types of the multiple compressed data blocks, transmitting the compressed data blocks of the multiple flow types simultaneously through different tunnels,
wherein a batching strategy and local buffer pools are used to transmit the compressed data blocks.
2. the method for claim 1, is characterized in that, the magnitude in described tunnel is equal with the magnitude of the packet of described data flow.
3. the method for claim 1, it is characterized in that, describedly carry out adopting the running of flowing water strategy based on the fine granularity compression of stream to data stream, wherein, central processor CPU receives request compression queue, have compressing card to compress described request compression queue, and data feeding after compression is responded out group queue, described central processor CPU obtains the described described packed data responded out in group queue.
4. The method of claim 3, characterized in that a buffer system between the CPU and the compression card is used, and the buffer size between the CPU and the compression card is increased according to the amplitude of system load jitter.
5. The method of claim 3, characterized in that the request queue and the response queue are polled.
6. the method for claim 1, is characterized in that, also comprises the steps: that the I/O model adopting User space to drive transmits described compression data block.
7. A transparent real-time traffic compression system between data centers, characterized by comprising:
a compression processing device, configured to perform flow-based fine-grained compression on a data stream, wherein the compression processing device divides the data stream into flows according to port attributes and compresses each flow with a corresponding flow-specific compression strategy to obtain multiple compressed data blocks; and
a flow management device, configured to transmit the compressed data blocks of the multiple flow types simultaneously through different tunnels according to their flow types, wherein a batching strategy and local buffer pools are used to transmit the compressed data blocks.
8. The system of claim 7, characterized in that the number of tunnels is of the same order of magnitude as the number of packet flows in the data stream.
9. The system of claim 7, characterized in that the compression processing device performs the flow-based fine-grained compression of the data stream with a pipelining strategy, wherein the compression processing device comprises a central processing unit (CPU) and a compression card, wherein
the CPU is configured to fill a compression request queue;
the compression card is configured to compress the data in the request queue and deliver the compressed data into a response queue; and
the CPU is further configured to retrieve the compressed data from the response queue.
10. The system of claim 9, characterized in that the compression processing device further comprises a buffer system, wherein the buffer system is located between the CPU and the compression card and is configured to increase the buffer size between the CPU and the compression card according to the amplitude of system load jitter.
11. The system of claim 9, characterized in that the compression processing device is further configured to poll the request queue and the response queue.
12. The system of claim 7, characterized in that the flow processing device transmits the compressed data blocks using a user-space-driven I/O model.
CN201310158691.3A 2013-05-02 2013-05-02 Transparent real-time traffic compression method and system between data centers Active CN103220226B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310158691.3A CN103220226B (en) 2013-05-02 2013-05-02 Transparent real-time traffic compression method and system between data centers

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310158691.3A CN103220226B (en) 2013-05-02 2013-05-02 Transparent real-time traffic compression method and system between data centers

Publications (2)

Publication Number Publication Date
CN103220226A CN103220226A (en) 2013-07-24
CN103220226B true CN103220226B (en) 2016-04-20

Family

ID=48817705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310158691.3A Active CN103220226B (en) 2013-05-02 2013-05-02 Transparent real-time traffic compression method and system between data centers

Country Status (1)

Country Link
CN (1) CN103220226B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160253096A1 (en) * 2015-02-28 2016-09-01 Altera Corporation Methods and apparatus for two-dimensional block bit-stream compression and decompression
CN113301123B (en) * 2021-04-30 2024-04-05 阿里巴巴创新公司 Data stream processing method, device and storage medium
CN114827125A (en) * 2022-03-23 2022-07-29 深圳北鲲云计算有限公司 Parallel data transmission method, system and medium for high-performance computing cloud platform

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101977162A (en) * 2010-12-03 2011-02-16 电子科技大学 Load balancing method of high-speed network
CN102916905A (en) * 2012-10-18 2013-02-06 曙光信息产业(北京)有限公司 Gigabit network card multi-path shunting method and system based on hash algorithm
CN102984269A (en) * 2012-12-10 2013-03-20 北京网御星云信息技术有限公司 Method and device for peer-to-peer flow identification
CN202907104U (en) * 2012-08-07 2013-04-24 上海算芯微电子有限公司 Compression and decompression system of video data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6819271B2 (en) * 1999-01-29 2004-11-16 Quickshift, Inc. Parallel compression and decompression system and method having multiple parallel compression and decompression engines

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101977162A (en) * 2010-12-03 2011-02-16 电子科技大学 Load balancing method of high-speed network
CN202907104U (en) * 2012-08-07 2013-04-24 上海算芯微电子有限公司 Compression and decompression system of video data
CN102916905A (en) * 2012-10-18 2013-02-06 曙光信息产业(北京)有限公司 Gigabit network card multi-path shunting method and system based on hash algorithm
CN102984269A (en) * 2012-12-10 2013-03-20 北京网御星云信息技术有限公司 Method and device for peer-to-peer flow identification

Also Published As

Publication number Publication date
CN103220226A (en) 2013-07-24


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant