CN112612518B - Network checksum algorithm optimization method based on Feiteng platform - Google Patents
- Publication number
- CN112612518B (application CN202011420425.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- neon
- cnt
- buff
- result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/32—Address formation of the next instruction, e.g. by incrementing the instruction counter
- G06F9/322—Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
- G06F9/325—Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for loops, e.g. loop detection or loop counter
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0896—Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Detection And Prevention Of Errors In Transmission (AREA)
Abstract
The invention discloses a network checksum algorithm optimization method based on the Feiteng platform, implemented as follows: first, 128 bits of data are loaded into a 128-bit NEON register to reduce the number of loop iterations; then a NEON vector pairwise-add instruction treats the data in the 128-bit NEON register as 8 pieces of 16-bit data and adds them pairwise, and once the data stream has been processed to a certain length, processing switches to ARM64 assembly; finally, the 64-bit result is folded down to 16 bits. Compared with the prior art, the network checksum algorithm optimization method based on the Feiteng platform effectively reduces the delay caused by the checksum algorithm when the network receives UDP data, thereby improving UDP packet transmission efficiency, and has the advantages of being independently controllable, original in implementation, and significant in effect.
Description
Technical Field
The invention belongs to the technical field of communications and computers, and particularly relates to a network checksum algorithm optimization method based on the Feiteng platform.
Background
The Chinese-made Feiteng series of processors is based on the ARM64 architecture, is fully compatible with the ARMv8 instruction set, and internally implements the NEON extension instructions. The SIMD portion of this instruction extension compensates for the Feiteng processor's weakness in CPU frequency and can improve the memory-access and data-calculation speed of data-intensive applications. Common data-intensive applications include graphics computing, entertainment audio, and data verification.
Ethernet is the most common communication protocol standard in today's local area networks, and many protocols are available at the transport layer; among them the UDP protocol is widely used in local area networks because of its compact structure and low transmission overhead. UDP is a simple datagram-oriented transport-layer protocol that provides connectionless, unreliable transport of data streams. A UDP datagram is carried in the data segment of an Ethernet packet; the UDP header comprises the source port, destination port, UDP length, and UDP checksum, and the calculation of the 16-bit UDP checksum covers all of the data following the UDP checksum field.
In the existing UDP checksum calculation method, the UDP pseudo-header, UDP header and data segment are divided into 16-bit values, the data are added cyclically in groups, any carry generated is added back into the low-order bit of the sum, the result of the successive cyclic additions is then inverted bit by bit, and the calculated result is backfilled into the UDP checksum field. Therefore, to calculate the UDP checksum, all 16-bit values in the data stream must be added cyclically one by one, and as the amount of data carried in a UDP packet increases, the number of step-by-step cyclic additions increases accordingly, which greatly reduces the efficiency of UDP packet data transmission.
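As a concrete illustration of the conventional calculation just described, the following portable C sketch folds a buffer as 16-bit big-endian (network-order) words, adds carries back in, and inverts the result. The function name and interface are ours for illustration; a real stack would also fold in the UDP pseudo-header.

```c
#include <stddef.h>
#include <stdint.h>

/* Conventional one's-complement checksum: fold the buffer as 16-bit
 * big-endian words, add any carry back into the low-order bits, then
 * invert.  Illustrative sketch of the prior-art method only. */
uint16_t checksum16(const uint8_t *buff, size_t len)
{
    uint32_t sum = 0;
    while (len > 1) {               /* consume 16 bits per iteration */
        sum += (uint32_t)((buff[0] << 8) | buff[1]);
        buff += 2;
        len -= 2;
    }
    if (len)                        /* odd trailing byte, zero-padded */
        sum += (uint32_t)buff[0] << 8;
    while (sum >> 16)               /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Note that every 16-bit word costs a separate loop iteration here, which is exactly the inefficiency the present method targets.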
Chinese patent application CN201210087407.3 provides a UDP checksum calculation method, implemented by first setting the UDP checksum to a constant; then calculating according to the traditional UDP checksum calculation method; and finally appending the obtained result to the end of the UDP data portion. This method simplifies the packing flow of UDP packets so that all data can be packed and sent immediately after being read once, but it adopts no effective means of reducing the number of step-by-step additions over the data stream when the checksum is calculated.
Chinese patent application CN201510536324.1 provides a checksum calculation method and a network processor, implemented as follows: first, a multithreaded microengine obtains the calculation parameters for the current thread and sends them to a calculation unit; the calculation unit then performs the checksum calculation while the thread-scheduling module puts the current thread into a dormant state; when the calculation completes, the calculation unit writes the checksum into the checksum register of the current thread and instructs the thread-scheduling module to wake the thread; and when the thread-scheduling module moves the current thread from the awakened state into the working state, the multithreaded microengine writes the calculated checksum into the position corresponding to the current thread in the data storage unit. By embedding the checksum calculation into the pipeline of the multithreaded microengine, this method reduces scheduling steps and improves the performance of the network processor, but again it takes no effective measure to reduce the number of step-by-step additions when the checksum is calculated.
Disclosure of Invention
In order to solve the problems, the invention provides a network checksum algorithm optimization method based on a Feiteng platform, which comprises the following steps:
s1: determining the cycle number cnt _ neo of the NEON instruction and the assembly cycle number cnt _ asm;
s2: defining NEON register variables VA and VB, and initializing to 0;
s3: judging whether cnt _ neon > 0 is true or not; if yes, go to step S4; if not, go to step S7;
s4: loading 8 16bit data from buff to VB;
s5: adopting a UADALP vector addition instruction to complete the vector addition calculation of VA and VB;
s6: decrementing cnt _ neon by 1, and shifting back buff by 16 bytes, and returning to step S3;
s7: accumulating 4 32bit data in VA to result;
s8: judging whether cnt _ asm is greater than 0; if yes, go to step S9; if not, go to step S12;
s9: load 4 16bit data from buff to X1;
s10: completing result + X1 accumulation operation by using an ADDS ADCS addition instruction;
s11: subtracting 1 from cnt _ asm, and shifting back buff by 8 bytes, and returning to step S8;
s12: circularly accumulating the buff residual data to result;
s13: convert result to 16bit number and then invert.
Compared with the prior art, the invention has the advantages that:
(1) The design and implementation of the optimized checksum algorithm are independent research and development, so the method carries complete intellectual property rights.
(2) The implementation is original: it fully utilizes the NEON feature of the Feiteng processor, combines it with the characteristics of the checksum algorithm to exploit the advantages of the NEON instruction, and reduces the number of step-by-step cyclic additions by combining assembly instructions.
(3) The implementation effect is significant: the single-byte processing before optimization is widened to 16-byte processing after optimization, which greatly reduces the delay caused by the checksum algorithm when the network receives UDP data and improves UDP network bandwidth.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a generic Checksum algorithm on a Feiteng platform in the prior art;
FIG. 2 is a diagram illustrating the execution of UADALP VA.4S and VB.8H instructions.
FIG. 3 is a flowchart of the Feiteng-platform-based Checksum algorithm optimization method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
As shown in FIG. 2 to FIG. 3, in the embodiment of the present application, the invention provides a network checksum algorithm optimization method based on the Feiteng platform, where the method comprises the steps of:
s1: determining the cycle number cnt _ neo of the NEON instruction and the assembly cycle number cnt _ asm;
s2: defining NEON register variables VA and VB, and initializing to 0;
s3: judging whether cnt _ neon > 0 is true or not; if yes, go to step S4; if not, go to step S7;
s4: loading 8 16bit data from buff to VB;
s5: adopting a UADALP vector addition instruction to complete the vector addition calculation of VA and VB;
s6: decrementing cnt _ neon by 1, and shifting back buff by 16 bytes, and returning to step S3;
s7: accumulating 4 32bit data in VA to result;
s8: judging whether cnt _ asm is greater than 0; if yes, go to step S9; if not, go to step S12;
s9: load 4 16bit data from buff to X1;
s10: completing result + X1 accumulation operation by using an ADDS ADCS addition instruction;
s11: subtracting 1 from cnt _ asm, and shifting back buff by 8 bytes, and returning to step S8;
s12: circularly accumulating the buff residual data to result;
s13: convert result to 16bit number and then invert.
The most time-consuming part of FIG. 1 is the loop, in which only 16 bits of data can be processed per iteration; besides the accumulation, each pass must also handle an overflow condition, which consumes a large number of CPU instruction cycles.
The full range of Feiteng processors is 64-bit, and their registers support 64-bit operands. The ADDS addition instruction affects the condition-code flags of the Feiteng CPU: the carry flag C is set to 1 when the result overflows and cleared to 0 when it does not. Using the ADDS and ADCS addition instructions, 4 pieces of 16-bit data can be processed in one operation without any extra overflow handling. In addition, the full range of Feiteng processors supports NEON, a 128-bit SIMD instruction set that can process 8 16-bit numbers at a time, twice as fast as the ADDS/ADCS path. Unlike the ADDS and ADCS instructions, however, a NEON add does not handle overflow through a carry flag, so overflow must be considered. To fully exploit the advantages of NEON, the accumulation is performed with the NEON vector pairwise-add instruction UADALP VA.4S, VB.8H, depicted schematically in FIG. 2. This instruction processes 8 pieces of 16-bit data at a time, and because the loop accumulation u32 += u16 + u16 does not overflow within 0xffff iterations, no overflow-carry handling is needed inside the loop as long as the iteration count does not exceed 0xffff. The portion beyond 0xffff iterations is accelerated with the ARM64 assembly instructions ADDS and ADCS. In this way the advantages of the 128-bit NEON instruction are retained while the drawbacks of overflow handling are avoided.
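The ADDS/ADCS phase can be modeled in portable C as follows: the buffer is consumed 64 bits (four 16-bit words) at a time, and whenever the 64-bit addition overflows, the carry is folded straight back in, mirroring result += X1 + C. The function name and interface are illustrative assumptions, not the patent's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable model of the ADDS/ADCS phase: add the buffer 64 bits at a
 * time; when the 64-bit add wraps around, add the carry-out back in
 * (end-around carry), which keeps the one's-complement sum correct. */
uint16_t checksum64(const uint8_t *buff, size_t n_qwords)
{
    uint64_t result = 0;
    for (size_t i = 0; i < n_qwords; i++) {
        uint64_t x1 = 0;
        for (int j = 0; j < 8; j += 2)      /* pack 4 big-endian u16s */
            x1 = (x1 << 16) | (uint64_t)((buff[j] << 8) | buff[j + 1]);
        result += x1;
        if (result < x1)                    /* carry out -> C flag    */
            result += 1;                    /* end-around carry       */
        buff += 8;
    }
    /* fold the four 16-bit fields of result down to one and invert */
    while (result >> 16)
        result = (result & 0xffff) + (result >> 16);
    return (uint16_t)~result;
}
```

Summing in 64-bit chunks with end-around carry yields the same one's-complement sum as 16-bit summation, which is why the final fold recovers the correct checksum.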
FIG. 1 is a flowchart of the general network Checksum algorithm on the current Feiteng platform before optimization, and FIG. 3 is a flowchart of the Checksum algorithm after NEON optimization. The optimized algorithm builds on FIG. 1, widening the data-calculation path and raising the efficiency of data transmission. The embodiment steps do not consider the problem of data-address alignment; if the data address is not aligned, only a simple conversion is needed. The specific embodiment steps are as follows:
Step S201: Define a 64-bit-wide result and initialize it to 0.
Step S202: Determine the NEON loop count cnt_neon, ensuring that it is not greater than 0xffff, and determine the assembly loop count cnt_asm.
Step S203: Define a 4-lane 128-bit NEON register variable VA, 32 bits per lane, initialized to 0, and an 8-lane 128-bit NEON register variable VB, 16 bits per lane, for loading data from buff.
Step S204: Repeat steps S205, S206 and S207 until cnt_neon reaches 0.
Step S205: Load 8 pieces of 16-bit data from buff into the 8 lanes of the NEON register variable VB with an LDR instruction.
Step S206: Use the NEON vector pairwise-add instruction UADALP to complete A0 += B0 + B1, A1 += B2 + B3, A2 += B4 + B5, A3 += B6 + B7.
Step S207: Advance buff by 16 bytes and decrement cnt_neon by 1.
Step S208: Accumulate the NEON register variable VA into result, i.e. result += A0 + A1 + A2 + A3.
Step S209: Repeat steps S210, S211 and S212 until cnt_asm reaches 0.
Step S210: Load 4 pieces of 16-bit data from buff into register X1 with an LDR instruction.
Step S211: Use the ADDS and ADCS addition instructions to complete result += X1 + C.
Step S212: Advance buff by 8 bytes and decrement cnt_asm by 1.
Step S213: Cyclically accumulate the remaining data in buff into result, convert result to a 16-bit number, and invert it to obtain the checksum.
In step S202, cnt_neon is capped at 0xffff to prevent overflow of the NEON vector pairwise-add instruction UADALP; the u32 += u16 + u16 operation ensures that no overflow carry is generated within 0xffff loop additions.
In step S206, A0, A1, A2 and A3 refer to the 4 32-bit lanes of the NEON register variable VA, and B0, B1, B2, B3, B4, B5, B6 and B7 refer to the 8 16-bit lanes of the NEON register variable VB; the vector addition operation is completed as shown in FIG. 2.
In step S213, the preceding steps process data in 128-bit/64-bit units, so a few bytes may remain unprocessed in buff. The accumulation here must consider the overflow condition, but since fewer than 8 bytes remain, it causes no significant delay. result then undergoes the 64-bit-to-16-bit conversion, i.e. the 64 bits are divided into 4 pieces of 16-bit data, which are accumulated together.
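The final conversion described in step S213 can be sketched as follows: split the 64-bit accumulator into its four 16-bit fields, add them with carry folding, and invert. The helper name is our own; this is an illustration of the described operation, not the patent's code.

```c
#include <stdint.h>

/* Step S213's 64 -> 16 conversion: add the four 16-bit fields of the
 * 64-bit accumulator, fold any carries generated, then invert. */
uint16_t fold64_to_checksum(uint64_t result)
{
    uint64_t s = (result & 0xffff) + ((result >> 16) & 0xffff)
               + ((result >> 32) & 0xffff) + (result >> 48);
    while (s >> 16)                 /* fold carries back into 16 bits */
        s = (s & 0xffff) + (s >> 16);
    return (uint16_t)~s;
}
```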
Compared with the prior art, the invention has the advantages that:
(1) The design and implementation of the optimized checksum algorithm are independent research and development, so the method carries complete intellectual property rights.
(2) The implementation is original: it fully utilizes the NEON feature of the Feiteng processor, combines it with the characteristics of the checksum algorithm to exploit the advantages of the NEON instruction, and reduces the number of step-by-step cyclic additions by combining assembly instructions.
(3) The implementation effect is significant: the single-byte processing before optimization is widened to 16-byte processing after optimization, which greatly reduces the delay caused by the checksum algorithm when the network receives UDP data and improves UDP network bandwidth.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.
Claims (1)
1. A network checksum algorithm optimization method based on a Feiteng platform is characterized by comprising the following steps:
s1: determining the cycle number cnt _ neo of the NEON instruction and the assembly cycle number cnt _ asm;
s2: defining NEON register variables VA and VB, and initializing to 0;
s3: judging whether cnt _ neon > 0 is true or not; if yes, go to step S4; if not, go to step S7;
s4: loading 8 16bit data from buff to VB;
s5: adopting a UADALP vector addition instruction to complete the vector addition calculation of VA and VB;
s6: decrementing cnt _ neon by 1, and shifting back buff by 16 bytes, and returning to step S3;
s7: accumulating 4 32bit data in VA to result;
s8: judging whether cnt _ asm is greater than 0; if yes, go to step S9; if not, go to step S12;
s9: load 4 16bit data from buff to X1;
s10: completing result + X1 accumulation operation by using an ADDS ADCS addition instruction;
s11: subtracting 1 from cnt _ asm, and shifting back buff by 8 bytes, and returning to step S8;
s12: circularly accumulating the buff residual data to result;
s13: convert result to 16bit number and then invert.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011420425.XA CN112612518B (en) | 2020-12-08 | 2020-12-08 | Network checksum algorithm optimization method based on Feiteng platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112612518A CN112612518A (en) | 2021-04-06 |
CN112612518B (en) | 2022-04-01
Family
ID=75229268
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011420425.XA Active CN112612518B (en) | 2020-12-08 | 2020-12-08 | Network checksum algorithm optimization method based on Feiteng platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112612518B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114968653B (en) * | 2022-07-14 | 2022-11-11 | 麒麟软件有限公司 | Method for determining RAIDZ check value of ZFS file system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104937542A (en) * | 2013-01-23 | 2015-09-23 | 国际商业机器公司 | Vector checksum instruction |
CN106293870A (en) * | 2015-06-29 | 2017-01-04 | 联发科技股份有限公司 | Computer system and strategy thereof guide compression method |
CN106484503A (en) * | 2015-08-27 | 2017-03-08 | 深圳市中兴微电子技术有限公司 | A kind of computational methods of verification sum and network processing unit |
US9648102B1 (en) * | 2012-12-27 | 2017-05-09 | Iii Holdings 2, Llc | Memcached server functionality in a cluster of data processing nodes |
CN107094369A (en) * | 2014-09-26 | 2017-08-25 | 英特尔公司 | Instruction and logic for providing SIMD SM3 Cryptographic Hash Functions |
CN108139907A (en) * | 2015-10-14 | 2018-06-08 | Arm有限公司 | Vector data send instructions |
CN110620585A (en) * | 2018-06-20 | 2019-12-27 | 英特尔公司 | Supporting random access of compressed data |
CN111654265A (en) * | 2020-06-19 | 2020-09-11 | 京东方科技集团股份有限公司 | Quick checking circuit, method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200177660A1 (en) * | 2020-02-03 | 2020-06-04 | Intel Corporation | Offload of streaming protocol packet formation |
-
2020
- 2020-12-08 CN CN202011420425.XA patent/CN112612518B/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9648102B1 (en) * | 2012-12-27 | 2017-05-09 | Iii Holdings 2, Llc | Memcached server functionality in a cluster of data processing nodes |
CN104937542A (en) * | 2013-01-23 | 2015-09-23 | 国际商业机器公司 | Vector checksum instruction |
CN107094369A (en) * | 2014-09-26 | 2017-08-25 | 英特尔公司 | Instruction and logic for providing SIMD SM3 Cryptographic Hash Functions |
CN106293870A (en) * | 2015-06-29 | 2017-01-04 | 联发科技股份有限公司 | Computer system and strategy thereof guide compression method |
CN106484503A (en) * | 2015-08-27 | 2017-03-08 | 深圳市中兴微电子技术有限公司 | A kind of computational methods of verification sum and network processing unit |
CN108139907A (en) * | 2015-10-14 | 2018-06-08 | Arm有限公司 | Vector data send instructions |
CN110620585A (en) * | 2018-06-20 | 2019-12-27 | 英特尔公司 | Supporting random access of compressed data |
CN111654265A (en) * | 2020-06-19 | 2020-09-11 | 京东方科技集团股份有限公司 | Quick checking circuit, method and device |
Non-Patent Citations (1)
Title |
---|
UDP: a programmable accelerator for extract-transform-load workloads and more; Yuanwei Fang et al.; Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture; 2017-10-14; full text *
Also Published As
Publication number | Publication date |
---|---|
CN112612518A (en) | 2021-04-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8230144B1 (en) | High speed multi-threaded reduced instruction set computer (RISC) processor | |
US10140124B2 (en) | Reconfigurable microprocessor hardware architecture | |
JP5269610B2 (en) | Perform cyclic redundancy check operations according to user level instructions | |
US9015443B2 (en) | Reducing remote reads of memory in a hybrid computing environment | |
US11489773B2 (en) | Network system including match processing unit for table-based actions | |
US20140208069A1 (en) | Simd instructions for data compression and decompression | |
EP1126367A1 (en) | Data processing device, system and method using a state transition table | |
US20120030451A1 (en) | Parallel and long adaptive instruction set architecture | |
US9274802B2 (en) | Data compression and decompression using SIMD instructions | |
US20120317360A1 (en) | Cache Streaming System | |
US10666288B2 (en) | Systems, methods, and apparatuses for decompression using hardware and software | |
CN112612518B (en) | Network checksum algorithm optimization method based on Feiteng platform | |
US9959066B2 (en) | Memory-attached computing resource in network on a chip architecture to perform calculations on data stored on memory external to the chip | |
WO2023169267A1 (en) | Network device-based data processing method and network device | |
US11343358B2 (en) | Flexible header alteration in network devices | |
US20040103086A1 (en) | Data structure traversal instructions for packet processing | |
WO2013036950A1 (en) | Instruction packet including multiple instructions having a common destination | |
US7320013B2 (en) | Method and apparatus for aligning operands for a processor | |
US8745235B2 (en) | Networking system call data division for zero copy operations | |
Zolfaghari et al. | A custom processor for protocol-independent packet parsing | |
US10445099B2 (en) | Reconfigurable microprocessor hardware architecture | |
US7571258B2 (en) | Method and apparatus for a pipeline architecture | |
KR101449732B1 (en) | System and method of processing hierarchical very long instruction packets | |
US7877581B2 (en) | Networked processor for a pipeline architecture | |
Zilberman | Technical perspective: hXDP: Light and efficient packet processing offload |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||