CN101599294B - Method for storing multiple virtual queues data based on FPGA - Google Patents

Publication number
CN101599294B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2009100838925A
Other languages
Chinese (zh)
Other versions
CN101599294A (en
Inventor
曾宇
方信我
郑臣明
白宗元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing City Cloud Computing Center Co., Ltd.
Original Assignee
Dawning Information Industry Beijing Co Ltd
Application filed by Dawning Information Industry Beijing Co Ltd filed Critical Dawning Information Industry Beijing Co Ltd

Abstract

The invention is an FPGA-based method for storing the data of multiple virtual queues. It builds on techniques for implementing network card data reads and writes in an FPGA, and in particular relates to a method for storing the data of multiple virtual queues. The method uses a receive engine module, a read-write module, and a block memory holding a plurality of register groups, where each register group belongs to one of the queues. The block memory is provided with a read-write mode state switching module, and both the block memory and the read-write mode state switching module support three operating modes. The block RAM in the FPGA stores and processes all registers of a large number of queues (for example, 2048 queues), so that related registers of the same queue can be read and written simultaneously. The method solves the problems of insufficient logic resources, slow storage speed, and long storage cycles found in the logic storage and external storage of the prior art.

Description

An FPGA-based method for storing the data of multiple virtual queues
Technical field
The present invention builds on techniques for implementing high-speed network card data reads and writes in an FPGA, and in particular relates to a method for storing the data of multiple virtual queues.
Background art
Virtualization technology is a focus of the current market; adopting it can save up to 70% of hardware cost. Virtualization helps users consolidate multiple application workloads and run multiple operating environments on a single system, optimize application development by testing and developing on a single system, and improve system availability by migrating virtual environments between systems. Current designs mainly rely on logic storage and external storage, where external storage mainly uses devices such as DDR2 SDRAM and FLASH.
In the case of logic storage: a high-speed virtual network card implemented in an FPGA uses virtualization technology. The virtual network card design requires a large number of queues, for example 2048, and each queue in turn has a large number of 32-bit registers. Figure 10 is the resource requirement table of the virtual network card chip for designs with 2048, 512, 128 and 64 queues. To save design cost, the FPGA is the Xilinx XC5VLX50T, which has 7200 slices, amounting by calculation to 28 Kb of flip-flops (FF) in total. If 2048 virtual queues are designed as required, every register that corresponds one-to-one with the queues needs 2048 32-bit registers, and the total resource occupied is 1047.6 Kb, as shown in Figure 10. Logic storage suits cases that use few resources; clearly, if these registers were stored entirely in logic cells they would take up all the logic resources, and the design could not be realized.
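The resource shortfall described above can be checked with a little arithmetic. The following is a minimal sketch in Python, assuming 1 Kb = 1024 bits; the 28 Kb flip-flop budget and the 1047.6 Kb total are taken from the text as quoted and are not re-derived here, and the helper name group_size_kb is hypothetical.

```python
# Figures quoted in the text.
QUEUES = 2048              # number of virtual queues
REG_WIDTH_BITS = 32        # width of each register
FF_BUDGET_KB = 28          # flip-flop capacity quoted for the XC5VLX50T
TOTAL_DEMAND_KB = 1047.6   # total demand quoted from the resource table

BITS_PER_KB = 1024         # assumption: 1 Kb = 1024 bits


def group_size_kb(queues=QUEUES, width=REG_WIDTH_BITS):
    """Storage needed by one register that exists once per queue, in Kb."""
    return queues * width / BITS_PER_KB


one_group_kb = group_size_kb()
print(f"one per-queue register: {one_group_kb} Kb")
print(f"exceeds the flip-flop budget on its own: {one_group_kb > FF_BUDGET_KB}")
print(f"quoted total demand / budget: {TOTAL_DEMAND_KB / FF_BUDGET_KB:.1f}x")
```

Under these assumptions a single per-queue register already needs more storage than all the flip-flops on the chip, which is why the design turns to block RAM.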
In the case of external storage: suppose DDR2 SDRAM is used to store the large volume of data, with a 333 MHz clock. For a write to the DDR2 memory, the command word is 0, the address is held valid for two cycles, the write enable is held valid for one cycle, and the data is supplied over two cycles (128 bits per cycle). This implements a burst length of 4, since 4 × 64 bit = 2 × 128 bit. Addresses are given in multiples of 4; that is, each 64-bit word has one address, and one write operation uses 4 addresses. For a read, the command word is 1, the address is held valid for one cycle, the read enable is held valid for one cycle, and the address is again a multiple of 4. When read data is valid, it appears on the read data output port, with each 128 bits held for one cycle. However, 8 cycles elapse between issuing the read command word and reading out the data, and the DDR2 SDRAM controller is rather complex, so the data storage cycle is long and data storage is slow and inflexible.
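The burst and latency figures above can be worked through as follows. This is a small illustrative calculation in Python; it assumes the 8-cycle read latency is counted in cycles of the quoted 333 MHz clock, which the text does not state explicitly.

```python
CLOCK_HZ = 333e6            # DDR2 clock quoted in the text
WORD_BITS = 64              # one address per 64-bit word
BURST_LENGTH = 4            # words per burst
BUS_BITS_PER_CYCLE = 128    # data presented per cycle
READ_LATENCY_CYCLES = 8     # command-to-data delay quoted in the text

# 4 x 64 bit = 2 x 128 bit: one burst occupies two data cycles.
burst_bits = BURST_LENGTH * WORD_BITS
data_cycles = burst_bits // BUS_BITS_PER_CYCLE

# Assumption: latency cycles are cycles of the 333 MHz clock.
read_latency_ns = READ_LATENCY_CYCLES / CLOCK_HZ * 1e9

print(burst_bits, data_cycles, round(read_latency_ns, 1))
```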
Therefore, a storage mechanism is needed that stores data quickly, is flexible and convenient to control, and can hold large volumes of data, so as to overcome the above defects of the prior art.
Summary of the invention
To solve the above problems of insufficient logic resources, slow storage speed and long storage cycles, this design uses the block RAM in the FPGA to store and process the registers of all of a large number of queues (for example, 2048), so that related registers of the same queue can be read and written simultaneously.
To achieve this, the invention provides an FPGA-based method for storing the data of multiple virtual queues, involving a network card and an FPGA, where the FPGA comprises a block memory (block RAM), a MAC controller module, a DMA controller and a PCIe controller module, characterized in that:
The FPGA is provided with a receive engine module (Rx engine), a transmit engine module (Tx engine) and a read-write module (Reg_Rd_Wr). The receive engine module (Rx engine) is connected to the transmit engine module (Tx engine) through the read-write module (Reg_Rd_Wr). The receive engine module (Rx engine) classifies and processes the packets sent from the host; the transmit engine module (Tx engine) packs the data that the user sends to the host and delivers it to the core interface with the appropriate timing; the read-write module (Reg_Rd_Wr) stores and retrieves data;
The receive engine module (Rx engine) comprises a receive finite state machine module (Rx FSM), a receive Posted packet module (Rx posted), a receive NonPosted packet module (Rx Nonposted) and a receive Completion packet module (Rx Completion);
The transmit engine module (Tx engine) comprises a transmit finite state machine module (Tx FSM) and a transmit Completion packet module (Tx Completion);
A plurality of addressed register groups are virtually arranged in the block memory (block RAM). These comprise transmit-side register groups (TX reg), receive-side register groups (RX reg) and an interrupt frequency control register group, and each register is virtually provided with a plurality of queues;
The steps of the FPGA-based method for storing the data of multiple virtual queues are as follows:
a. The host sends a packet to the PCIe module, which passes it to the receive engine module (Rx engine) for processing; go to step b.
b. Received packets are of three kinds: posted request, non-posted request and Rx completion. The receive finite state machine module (Rx FSM) classifies the three kinds of packet; it applies step c to a posted request and step d to a non-posted request.
c. A posted request is dispatched by the receive Posted packet module (Rx posted) through the read-write module (Reg_Rd_Wr) to a register for a write operation; go to step c1.
c1. The receive Posted packet module (Rx posted) decides from the address in the packet whether to use logic storage or block memory (block RAM) storage; for block memory storage, go to step c2.
c2. The receive Posted packet module (Rx posted) decides from the address in the packet whether the operation targets a transmit-side register (TX reg) or a receive-side register (RX reg); for a receive-side register (RX reg) operation, go to step c3.
c3. The receive Posted packet module (Rx posted) determines from the address in the packet which specific block memory (block RAM) of the receive-side registers is to be written, and then determines from the address which queue of that block memory is to be written; go to step c4.
c4. The read-write module (Reg_Rd_Wr) writes the data into the queue of the block memory, completing the write operation.
d. A non-posted request is dispatched by the receive NonPosted packet module (Rx Nonposted) through the read-write module (Reg_Rd_Wr) to a register for a read operation; go to step d1.
d1. The receive NonPosted packet module (Rx Nonposted) decides from the address in the packet whether to use logic storage or block memory (block RAM) storage; for block memory storage, go to step d2.
d2. The receive NonPosted packet module (Rx Nonposted) decides from the address in the packet whether the operation targets a transmit-side register or a receive-side register; for a receive-side register operation, go to step d3.
d3. The receive NonPosted packet module (Rx Nonposted) determines from the address in the packet which specific block memory (block RAM) of the receive-side registers is to be read, and then determines from the address in the packet which queue of that block memory is to be read; go to step d4.
d4. The read-write module (Reg_Rd_Wr) reads the data from the queue of the block memory, completing the read operation; go to step d5.
d5. The read-write module (Reg_Rd_Wr) returns the data read from the specific queue of the specific block memory (block RAM) to the host through the transmit Completion packet module (Tx Completion).
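Steps a to d5 can be sketched as a small software model. The Python below is illustrative only and is not the patent's hardware design: the Packet class, the RegRdWr class and the rx_fsm function are hypothetical names, and the address layout used in locate (region byte 0x10 for transmit, 0x40 for receive, low five bits selecting the block RAM, a 0x20 stride per queue) is an assumption based on the example addresses given in the detailed description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    kind: str                  # "posted", "non_posted" or "completion"
    address: int
    data: Optional[int] = None


class RegRdWr:
    """Toy read-write module: logic registers plus block RAM queues."""

    TX_REGION, RX_REGION = 0x10, 0x40   # assumed address[23:16] values

    def __init__(self):
        self.logic = {}   # single registers held in logic
        self.block = {}   # (side, block RAM index, queue index) -> value

    def locate(self, address):
        """Steps c1-c3 / d1-d3: pick storage type, side, block RAM, queue."""
        region = (address >> 16) & 0xFF
        if region not in (self.TX_REGION, self.RX_REGION):
            return None                                  # logic storage
        side = "TX" if region == self.TX_REGION else "RX"
        return side, (address & 0x1F) >> 2, (address >> 5) & 0x7FF

    def write(self, address, data):                      # step c4
        key = self.locate(address)
        if key is None:
            self.logic[address] = data
        else:
            self.block[key] = data

    def read(self, address):                             # step d4
        key = self.locate(address)
        if key is None:
            return self.logic.get(address, 0)
        return self.block.get(key, 0)


def rx_fsm(packet, regs):
    """Step b: classify the packet; return read data for a completion, if any."""
    if packet.kind == "posted":
        regs.write(packet.address, packet.data)
        return None                    # posted requests return nothing
    if packet.kind == "non_posted":
        return regs.read(packet.address)   # step d5: goes back via Tx Completion
    if packet.kind == "completion":
        return None                    # handling is not covered by steps a-d5
    raise ValueError(f"unknown packet kind: {packet.kind}")
```

For example, a posted write to 0x40_0020 followed by a non-posted read of the same address returns the written value, since both resolve to receive side, block RAM 0, queue 1 under the assumed layout.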
The block memory is provided with a read-write mode state switching module, which handles read-mode state transitions and write-mode state transitions. The read-mode states are: initial state (idle), host read enable state (host_rd_en), read state (read), read end state (read_end) and read end delay state (read_end_dly). The write-mode states are: initial state (idle), host write enable state (host_wr_en), write state (write) and write end state (write_end).
I. The write operation of step c4 is performed as follows:
c4-1. The read-write mode state switching module starts in the initial idle state.
c4-2. When the read-write mode state switching module detects the host_wr_en command sent by the host, it enters the host write enable state.
c4-3. The read-write mode state switching module enters the write state (write).
c4-4. The read-write module (Reg_Rd_Wr) writes the data into the queue of the block memory, performing the write.
c4-5. The read-write mode state switching module enters the write end state (write_end) and deasserts the write command.
c4-6. The read-write mode state switching module returns to the initial idle state and waits for the next write operation; the write operation ends.
II. The read operation of step d4 is performed as follows:
d4-1. The read-write mode state switching module starts in the initial idle state.
d4-2. When the read-write mode state switching module detects the host_rd_en command sent by the host, it enters the host read enable state.
d4-3. The read-write mode state switching module enters the read state (read).
d4-4. The read-write module (Reg_Rd_Wr) reads the data from the queue of the block memory, performing the read.
d4-5. The read-write mode state switching module enters the read end state (read_end) and deasserts the read command.
d4-6. The read-write mode state switching module enters the read end delay state (read_end_dly), in which the data read out is copied to a latch and latched.
d4-7. The read-write mode state switching module returns to the initial idle state and waits for the next read operation; the read operation ends.
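The two state sequences above can be modelled in software as follows. This is a minimal Python sketch, not the hardware implementation: the class ReadWriteFsm and its method names are hypothetical, each state is visited once per operation without modelling clock cycles, and the choice of which state performs the memory access follows the glossary later in the document (data is written in the write state, read out in read_end and latched in read_end_dly).

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    HOST_WR_EN = auto()
    WRITE = auto()
    WRITE_END = auto()
    HOST_RD_EN = auto()
    READ = auto()
    READ_END = auto()
    READ_END_DLY = auto()


WRITE_PATH = [State.HOST_WR_EN, State.WRITE, State.WRITE_END, State.IDLE]
READ_PATH = [State.HOST_RD_EN, State.READ, State.READ_END,
             State.READ_END_DLY, State.IDLE]


class ReadWriteFsm:
    """Software model of the read-write mode state switching module."""

    def __init__(self, depth=2048):
        self.mem = [0] * depth        # one block RAM, one entry per queue
        self.state = State.IDLE
        self.latch = None             # data latched in read_end_dly
        self.trace = [State.IDLE]     # visited states, for inspection

    def _advance(self, state):
        self.state = state
        self.trace.append(state)

    def host_write(self, queue, data):
        """Steps c4-1 to c4-6."""
        if self.state is not State.IDLE:
            raise RuntimeError("write requested while busy")
        for state in WRITE_PATH:
            self._advance(state)
            if state is State.WRITE:
                self.mem[queue] = data

    def host_read(self, queue):
        """Steps d4-1 to d4-7; returns the latched data."""
        if self.state is not State.IDLE:
            raise RuntimeError("read requested while busy")
        bus = None
        for state in READ_PATH:
            self._advance(state)
            if state is State.READ_END:
                bus = self.mem[queue]         # data read out of the block RAM
            elif state is State.READ_END_DLY:
                self.latch = bus              # copy to the latch
        return self.latch
```

After either operation the model is back in the idle state, which mirrors the point made in the text that the module returns to its initial state and waits for the next command.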
The depth of the plurality of queues virtually provided for each register is 16, 32, 64, 128, 256, 2048, and so on.
When the depth of the queues virtually provided for each register is 2048, there are 15 register groups: 7 transmit-side register groups (TX reg), 7 receive-side register groups (RX reg) and 1 interrupt frequency control register group.
The interrupt frequency control register group may be the MSIX register with depth 4096, the Receive VLAN Filter Table Array (RVFTA register) with depth 2176, the Receive Multicast Table Array register (RMTA register) with depth 128, or the Flexible TCO Filter Table register (FTFT register) with depth 256.
The FPGA is the Xilinx XC5VLX50T.
The block memory (block RAM) may use single-port, dual-port, data-width conversion and other transfer modes.
When the same queue is read and written simultaneously, the block memory (block RAM) and the read-write mode state switching module both support three operating modes: write-first mode, read-first mode and no-change mode.
In write-first mode, the data read out is identical to the data being written. In read-first mode, the data at the addressed location is read out first and the new data is then written, so the write does not affect the value read. In no-change mode, data is only written to the addressed location and the output is unaffected; the output buffer keeps the data from the last read operation.
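The three operating modes can be illustrated with a small behavioral model. The Python below is a sketch under assumptions: the class SinglePortBlockRam and its access method are hypothetical names, each call stands for one clock edge on a single-port memory, and the mode names follow the write_first, read_first and no_change labels used in the figure list.

```python
class SinglePortBlockRam:
    """Behavioral model of a single-port block RAM with three write modes."""

    MODES = ("write_first", "read_first", "no_change")

    def __init__(self, depth, mode="write_first"):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mem = [0] * depth
        self.mode = mode
        self.dout = 0   # output register

    def access(self, addr, we=False, din=0):
        """Model one clock edge and return the value on the output."""
        if we:
            old = self.mem[addr]
            self.mem[addr] = din
            if self.mode == "write_first":
                self.dout = din      # output shows the data being written
            elif self.mode == "read_first":
                self.dout = old      # output shows the previous contents
            # no_change: output keeps the data from the last read
        else:
            self.dout = self.mem[addr]
        return self.dout
```

A write of 7 to a location holding 0 returns 7 in write_first mode and 0 in read_first mode, while in no_change mode the output stays at whatever the last read produced.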
The packet is a Transaction Layer Packet (TLP).
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. The receive engine module classifies the three kinds of packet, handles both logic storage and block memory storage, and determines whether an operation targets a transmit-side or a receive-side register, which makes data control and processing flexible and efficient.
2. The read-write module lets the host flexibly read and write the data stored in block RAM.
3. The block memory holds a plurality of register groups, each belonging to one of the queues. This supports the storage of large volumes of data. Because the data in a register group belongs to one queue, when the receive controller or transmit controller needs to read or write related registers it can access them simultaneously, which mainly makes reading data convenient and shortens the data storage cycle.
4. The block memory is provided with a read-write mode state switching module, which returns to the initial state immediately after completing a read or write, waits for the next command, and then performs the next read or write. This mainly simplifies control, improves operating and read-write efficiency, and shortens the data storage cycle.
5. The block memory and the read-write mode state switching module both support three operating modes: write-first, read-first and no-change. The three modes are selected and switched through interface control, which mainly simplifies control, improves operating and read-write efficiency, and shortens the data storage cycle.
The invention therefore provides a storage method that stores data quickly, is flexible and convenient to control, and can hold large volumes of data.
Description of the drawings
Fig. 1: framework diagram of the present invention;
Fig. 2: block diagram of how the read-write module implements logic storage and block RAM storage of registers;
Fig. 3: storage layout of the transmit registers in block RAM;
Fig. 4: storage layout of the receive registers in block RAM;
Fig. 5: single-port block RAM model;
Fig. 6: data reads and writes in write_first (write-first) mode;
Fig. 7: data reads and writes in read_first (read-first) mode;
Fig. 8: data reads and writes in no_change (output unchanged) mode;
Fig. 9: the read-write mode state switching module in the block RAM;
Fig. 10: resource requirement table of a prior-art virtual network card chip for designs with 2048, 512, 128 and 64 queues.
Related technical terms:
1. Rx Engine: receive engine module;
2. Rx FSM: receive finite state machine module (Receive Finite State Machine);
3. Rx Posted: a received Posted packet; Posted request: a Posted packet request;
4. Rx Non-Posted: a received NonPosted packet; Non-Posted request: a Non-Posted packet request;
5. Rx Completion: a received Completion packet;
6. TLP: Transaction Layer Packet;
7. Reg_Rd_Wr: read-write module (register_read_write);
8. Tx Engine: transmit engine module (Transmit Engine);
9. Tx FSM: transmit finite state machine (Transmit Finite State Machine);
10. Tx Completion: a transmitted Completion packet;
11. TX Reg: transmit-side register group;
12. RX Reg: receive-side register group;
13. Idle: initial state (Idle conventionally denotes the initial state and has no special meaning);
14. host_wr_en: host write enable, the command the host sends to write a register;
15. host_rd_en: host read enable, the command the host sends to read a register;
16. write: write state; in this state the write command is issued together with the address and data to be written;
17. write_end: write end state; in this state the write command is set invalid;
18. read: read state; in this state the read command and the address to be read are issued;
19. read_end: read end state; in this state the data at the addressed location is read out;
20. read_end_dly: read end delay state; in this state the data read out is latched;
21. Block RAM: block memory;
22. Interrupt frequency control register group;
23. MSIX register: MSI-X interrupt register (MSI-X is a well-known interrupt mechanism);
24. RVFTA register: Receive VLAN Filter Table Array;
25. RMTA register: Receive Multicast Table Array register;
26. FTFT register: Flexible TCO Filter Table register.
Detailed description of the embodiments
The present invention is further described below with reference to the accompanying drawings.
The invention provides an FPGA-based method for storing the data of multiple virtual queues, involving a network card and an FPGA, where the FPGA comprises a block memory (block RAM), a MAC controller module, a DMA controller and a PCIe controller module, characterized in that:
The FPGA is provided with a receive engine module (Rx engine), a transmit engine module (Tx engine) and a read-write module (Reg_Rd_Wr). The receive engine module (Rx engine) is connected to the transmit engine module (Tx engine) through the read-write module (Reg_Rd_Wr). The receive engine module (Rx engine) classifies and processes the packets sent from the host; the transmit engine module (Tx engine) packs the data that the user sends to the host and delivers it to the core interface with the appropriate timing; the read-write module (Reg_Rd_Wr) stores and retrieves data;
The receive engine module (Rx engine) comprises a receive finite state machine module (Rx FSM), a receive Posted packet module (Rx posted), a receive NonPosted packet module (Rx Nonposted) and a receive Completion packet module (Rx Completion);
The transmit engine module (Tx engine) comprises a transmit finite state machine module (Tx FSM) and a transmit Completion packet module (Tx Completion);
A plurality of addressed register groups are virtually arranged in the block memory (block RAM). These comprise transmit-side register groups (TX reg), receive-side register groups (RX reg) and an interrupt frequency control register group, and each register is virtually provided with a plurality of queues;
The steps of the FPGA-based method for storing the data of multiple virtual queues are as follows:
a. The host sends a packet to the PCIe module, which passes it to the receive engine module (Rx engine) for processing; go to step b.
b. Received packets are of three kinds: posted request, non-posted request and Rx completion. The receive finite state machine module (Rx FSM) classifies the three kinds of packet; it applies step c to a posted request and step d to a non-posted request.
c. A posted request is dispatched by the receive Posted packet module (Rx posted) through the read-write module (Reg_Rd_Wr) to a register for a write operation; go to step c1.
c1. The receive Posted packet module (Rx posted) decides from the address in the packet whether to use logic storage or block memory (block RAM) storage; for block memory storage, go to step c2.
c2. The receive Posted packet module (Rx posted) decides from the address in the packet whether the operation targets a transmit-side register (TX reg) or a receive-side register (RX reg); for a receive-side register (RX reg) operation, go to step c3.
c3. The receive Posted packet module (Rx posted) determines from the address in the packet which specific block memory (block RAM) of the receive-side registers is to be written, and then determines from the address which queue of that block memory is to be written; go to step c4.
c4. The read-write module (Reg_Rd_Wr) writes the data into the queue of the block memory, completing the write operation.
d. A non-posted request is dispatched by the receive NonPosted packet module (Rx Nonposted) through the read-write module (Reg_Rd_Wr) to a register for a read operation; go to step d1.
d1. The receive NonPosted packet module (Rx Nonposted) decides from the address in the packet whether to use logic storage or block memory (block RAM) storage; for block memory storage, go to step d2.
d2. The receive NonPosted packet module (Rx Nonposted) decides from the address in the packet whether the operation targets a transmit-side register or a receive-side register; for a receive-side register operation, go to step d3.
d3. The receive NonPosted packet module (Rx Nonposted) determines from the address in the packet which specific block memory (block RAM) of the receive-side registers is to be read, and then determines from the address in the packet which queue of that block memory is to be read; go to step d4.
d4. The read-write module (Reg_Rd_Wr) reads the data from the queue of the block memory, completing the read operation; go to step d5.
d5. The read-write module (Reg_Rd_Wr) returns the data read from the specific queue of the specific block memory (block RAM) to the host through the transmit Completion packet module (Tx Completion).
The block memory is provided with a read-write mode state switching module, which handles read-mode state transitions and write-mode state transitions. The read-mode states are: initial state (idle), host read enable state (host_rd_en), read state (read), read end state (read_end) and read end delay state (read_end_dly). The write-mode states are: initial state (idle), host write enable state (host_wr_en), write state (write) and write end state (write_end).
I. The write operation of step c4 is performed as follows:
c4-1. The read-write mode state switching module starts in the initial idle state.
c4-2. When the read-write mode state switching module detects the host_wr_en command sent by the host, it enters the host write enable state.
c4-3. The read-write mode state switching module enters the write state (write).
c4-4. The read-write module (Reg_Rd_Wr) writes the data into the queue of the block memory, performing the write.
c4-5. The read-write mode state switching module enters the write end state (write_end) and deasserts the write command.
c4-6. The read-write mode state switching module returns to the initial idle state and waits for the next write operation; the write operation ends.
II. The read operation of step d4 is performed as follows:
d4-1. The read-write mode state switching module starts in the initial idle state.
d4-2. When the read-write mode state switching module detects the host_rd_en command sent by the host, it enters the host read enable state.
d4-3. The read-write mode state switching module enters the read state (read).
d4-4. The read-write module (Reg_Rd_Wr) reads the data from the queue of the block memory, performing the read.
d4-5. The read-write mode state switching module enters the read end state (read_end) and deasserts the read command.
d4-6. The read-write mode state switching module enters the read end delay state (read_end_dly), in which the data read out is copied to a latch and latched.
d4-7. The read-write mode state switching module returns to the initial idle state and waits for the next read operation; the read operation ends.
The depth of the plurality of queues virtually provided for each register is 16, 32, 64, 128, 256, 2048, and so on. For example, when the depth of the queues virtually provided for each register is 2048, there are 15 register groups: 7 transmit-side register groups (TX reg), 7 receive-side register groups (RX reg) and 1 interrupt frequency control register group. The interrupt frequency control register group may be the MSIX register with depth 4096, the Receive VLAN Filter Table Array (RVFTA register) with depth 2176, the Receive Multicast Table Array register (RMTA register) with depth 128, or the Flexible TCO Filter Table register (FTFT register) with depth 256.
The FPGA is the Xilinx XC5VLX50T, the packet is a Transaction Layer Packet (TLP), and the block memory (block RAM) may use single-port, dual-port, data-width conversion and other transfer modes.
When the same queue is read and written simultaneously, the block memory (block RAM) and the read-write mode state switching module both support three operating modes: write-first mode, read-first mode and no-change mode. In write-first mode, the data read out is identical to the data being written. In read-first mode, the data at the addressed location is read out first and the new data is then written, so the write does not affect the value read. In no-change mode, data is only written to the addressed location and the output is unaffected; the output buffer keeps the data from the last read operation.
As shown in Figure 1: data sent from the host can reach the Rx engine only through PCIe, and data sent from outside to the host is delivered by the Tx engine to the host through PCIe. The Rx Engine module classifies and processes the packets sent from the host. Received packets are of three kinds:
(1) Posted request: no Completion needs to be returned, for example a register write.
(2) Non-Posted request: a Completion must be returned, for example a register read.
(3) Rx Completion: the network card requests data or a descriptor from the host, and the host sends the requested object to the network card in an Rx Completion packet.
Based on the type of TLP received, the Rx Engine module is divided into the Rx FSM, RxPosted, RxNonPosted and Rx Completion modules, and the Rx FSM dispatches data to the different modules according to TLP type. Posted and Non-Posted packets are dispatched to the register read-write module. Non-Posted packets require data to be returned, and the Reg_Rd_Wr module returns that data to the host through Tx Completion.
As shown in Figure 2: the Reg_Rd_Wr module mainly implements register storage and register reads and writes; the storage types are logic storage and block RAM storage.
First, the data received from the host through the PCIe core passes through the Rx FSM module, which dispatches it to different modules according to TLP type. The RxPosted module performs register writes and decides from the address whether to use logic storage or block RAM storage. The RxNonPosted module performs register reads and likewise decides from the address whether to use logic storage or block RAM storage. Logic storage mainly holds individual 32-bit registers; it is fairly simple and is not described further here. The following mainly describes how block RAM is used to store registers of large depth.
As shown in Figures 3 and 4, the design first assigns addresses to the registers of depth 2048. The depth of each register need not be 2048; it depends on the available resources. If resources are insufficient, the depth can be designed as 256, 128, 64 and so on. A depth of 2048 is used as the example here.
Analysis shows that there are 15 register groups of depth 2048: seven TX Reg groups, seven RX Reg groups, and one interrupt frequency control register group, ITRC. The interrupt frequency control register group ITRC may be an MSIX register of depth 4096, an RVFTA register of depth 2176, an RMTA register of depth 128, an (RMHTA+RMATA) register of depth 6272, or an FTFT register of depth 256.
Taking the transmit registers (TX) as an example, the address of register TDBAL is set to 0x10_0000 + n × 0x20 (n = 0 to 2047), the address of register TDBAH is set to 0x10_0004 + n × 0x20 (n = 0 to 2047), and so on for the other registers. Taking register RDBAL as an example of the receive registers, its address is set to 0x40_0000 + n × 0x20 (n = 0 to 2047), the address of register RDBAH is set to 0x40_0004 + n × 0x20 (n = 0 to 2047), and so on for the other registers. Figures 3 and 4 show how the transmit registers and the receive registers, respectively, are stored in block RAM.
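The addressing rule can be checked with a few lines of arithmetic. The sketch below is a minimal illustration; the constant and function names are hypothetical, and only the two offsets stated in the text (0x00 for TDBAL/RDBAL and 0x04 for TDBAH/RDBAH) are included, since the offsets of the remaining registers are not given here.

```python
TX_BASE = 0x10_0000    # base address of the transmit registers (from the text)
RX_BASE = 0x40_0000    # base address of the receive registers (from the text)
QUEUE_STRIDE = 0x20    # address step between consecutive queues
NUM_QUEUES = 2048      # depth used in the example

# Offsets stated in the text; other registers follow "and so on" and are omitted.
BAL_OFFSET = 0x00      # TDBAL / RDBAL
BAH_OFFSET = 0x04      # TDBAH / RDBAH


def register_address(base, offset, n):
    """Address of one register of queue n: base + offset + n * 0x20."""
    if not 0 <= n < NUM_QUEUES:
        raise ValueError("queue index out of range")
    return base + offset + n * QUEUE_STRIDE
```

For instance, `register_address(TX_BASE, BAH_OFFSET, 1)` gives 0x10_0024, the TDBAH register of queue 1.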
Suppose the address at which the host reads or writes data is address. address[23:16] determines whether the operation targets the transmit registers or the receive registers. Here n denotes the block RAM selector, n = address[4:0] (not the queue index n used in the address formulas above): n = 5'h0 means the read or write takes place in block RAM 0; n = 5'h4, in block RAM 1; n = 5'h8, in block RAM 2; n = 5'hc, in block RAM 3; n = 5'h10, in block RAM 4; n = 5'h14, in block RAM 5; n = 5'h18, in block RAM 6.
Once the block RAM to be read or written has been determined, the specific address within that block RAM must be determined; this address is address[15:5], which is the queue index (0 to 2047) of the address formulas.
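The three bit fields can be illustrated with a short decoder. This is a sketch under stated assumptions: the function name `decode_register_address` is hypothetical, and the selector values 0x10 (transmit) and 0x40 (receive) are inferred from the base addresses 0x10_0000 and 0x40_0000 rather than stated as such in the text.

```python
TX_SELECT = 0x10  # address[23:16] of 0x10_0000 (inferred from the TX base address)
RX_SELECT = 0x40  # address[23:16] of 0x40_0000 (inferred from the RX base address)


def decode_register_address(address):
    """Split a host address into (group select, block RAM number, block RAM address)."""
    group_select = (address >> 16) & 0xFF    # address[23:16]: TX or RX registers
    n = address & 0x1F                       # address[4:0]: block RAM selector
    ram_number, remainder = divmod(n, 4)     # 5'h0 -> 0, 5'h4 -> 1, ..., 5'h18 -> 6
    if remainder != 0 or ram_number > 6:
        raise ValueError("address[4:0] does not select one of block RAM 0..6")
    ram_address = (address >> 5) & 0x7FF     # address[15:5]: location inside the block RAM
    return group_select, ram_number, ram_address
```

For example, the address 0x10_0024 decodes to the transmit group, block RAM 1, location 1.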
As shown in Figure 5, to meet practical needs, the single-port block RAM adopted here is built with a Xilinx IP core; the generation of the IP core is not described here. Figure 5 shows the single-port block RAM model.
The signals of the block RAM are defined as follows.
CLKA: port A clock signal of the block RAM.
ENA: enable signal of the block RAM. When this pin is low, both write and read operations are disabled.
WEA: read/write control of the block RAM. When ENA is high, WEA=1 performs a write to the target address and WEA=0 performs a read from the target address.
SSRA: set/reset control signal of the block RAM. This signal is effective when ENA is high.
ADDRA: address input of the block RAM.
DINA: data input of the block RAM.
DOUTA: data output of the block RAM.
REGCEA: enable for the final output register of the block RAM.
Besides implementing single-port and dual-port memories, block RAM also supports data-width conversion (including PB), and several block RAMs can be combined to form a storage unit of greater depth and width.
As shown in Figures 6, 7 and 8, block RAM supports three operating modes, which are described in detail below.
Write_first (write-priority) mode: in write_first mode, when the same address of the block RAM is read and written at the same time, the data read out is identical to the data being written, as shown in Figure 6. This mode makes the data output bus very flexible to use while a write is in progress on the same port.
Read-first (read-priority) mode: in this mode, when the same address of the block RAM is read and written at the same time, the data stored at the addressed location is read out first. The write operation does not affect this data, as shown in Figure 7.
No_change (output-unchanged) mode: in this mode, when the same address of the block RAM is read and written at the same time, the data is only written to the addressed location and the output is not affected. The output buffer keeps the data from the last read operation, as shown in Figure 8.
Among these three block RAM modes, write_first (write-priority), read-first (read-priority) or no_change (output-unchanged) is selected according to the actual needs of the project.
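The difference between the three modes is easiest to see in a behavioral model. The following is a simplified sketch, not the Xilinx IP core: the class name `SinglePortBlockRam` and the mode strings are hypothetical, each call to `clock` stands for one CLKA edge, and the SSRA and REGCEA signals are not modelled.

```python
class SinglePortBlockRam:
    """Behavioral model of a single-port RAM with the three write modes."""

    MODES = ("write_first", "read_first", "no_change")

    def __init__(self, depth=2048, mode="write_first"):
        if mode not in self.MODES:
            raise ValueError("unknown mode")
        self.mem = [0] * depth
        self.mode = mode
        self.douta = 0  # value currently held on the data output

    def clock(self, ena, wea, addra, dina=0):
        """Apply one clock edge and return DOUTA after that edge."""
        if not ena:
            return self.douta              # ENA low: neither read nor write happens
        if wea:
            old = self.mem[addra]
            self.mem[addra] = dina
            if self.mode == "write_first":
                self.douta = dina          # output shows the data being written
            elif self.mode == "read_first":
                self.douta = old           # output shows the previously stored data
            # no_change: output keeps the value of the last read
        else:
            self.douta = self.mem[addra]   # plain read
        return self.douta
```

Writing the same value to the same address in the three modes yields three different outputs on that clock edge: the new data, the old data, or whatever the last read left on the output.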
Figure 9 shows the block RAM read/write mode state transition module. The state transitions are divided into two main operations, read and write. Starting from the idle state, if the host_wr_en command sent by the host is detected, the address and the data to be written are ready at that moment and a write operation begins; if the host_rd_en command sent by the host is detected, the address to be read is ready at that moment and a read operation begins. The address of the data to be written or read is parsed to determine which block RAM is accessed, and which location of that block RAM.
A write operation first enters the write state, in which the data is written to the address. In the next cycle the write_end state is entered, in which the signals wea and ena are both set to 0, closing the write command and completing the write of this data. In summary, the write state writes the data into the block RAM address, and once the write has finished the write_end state closes the write command.
A read operation first enters the read state, in which the data at the address is read. In the next cycle the read_end state is entered, in which the signals wea and ena are both set to 0, closing the read command and completing the read of this data; at this point the data is output on douta. To keep the data stable and avoid metastability, the result is latched with a latch in the read_end_dly state. In summary, the read state reads the data at the block RAM address and sends it to the designated register, the read_end state closes the read command once the read has finished, and after the read has completed the read_end_dly state latches the result.
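The read and write sequences can be traced with a small cycle-by-cycle model. This is a hedged sketch of the behavior described above, not the actual HDL: the class name `RegRdWrFsm` and its attributes are hypothetical, the block RAM is reduced to a Python list, and each call to `step` performs the work of the current state and then moves to the next one.

```python
class RegRdWrFsm:
    """Cycle-level model of the idle/write/write_end and idle/read/read_end/read_end_dly paths."""

    def __init__(self, depth=2048):
        self.mem = [0] * depth   # stands in for one block RAM
        self.state = "idle"
        self.ena = 0
        self.wea = 0
        self.douta = 0
        self.latched = None      # result held by the latch after read_end_dly
        self._addr = 0
        self._data = 0

    def step(self, host_wr_en=0, host_rd_en=0, addr=0, data=0):
        """Advance one clock cycle and return the name of the new state."""
        if self.state == "idle":
            if host_wr_en:                       # address and data are ready
                self._addr, self._data = addr, data
                self.ena, self.wea = 1, 1
                self.state = "write"
            elif host_rd_en:                     # address is ready
                self._addr = addr
                self.ena, self.wea = 1, 0
                self.state = "read"
        elif self.state == "write":
            self.mem[self._addr] = self._data    # write the data to the address
            self.state = "write_end"
        elif self.state == "write_end":
            self.ena = self.wea = 0              # close the write command
            self.state = "idle"
        elif self.state == "read":
            self.douta = self.mem[self._addr]    # read the data at the address
            self.state = "read_end"
        elif self.state == "read_end":
            self.ena = self.wea = 0              # close the read command
            self.state = "read_end_dly"
        elif self.state == "read_end_dly":
            self.latched = self.douta            # latch the result for stability
            self.state = "idle"
        return self.state
```

In this model a write returns to idle after three calls to `step` and a read after four, the extra cycle being the read_end_dly latch stage.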
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently replaced, and any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall be covered by the scope of the claims of the present invention.

Claims (10)

1. A method for storing data of multiple virtual queues, based on a network card and an FPGA, the FPGA comprising a block memory, a MAC controller module, a DMA controller and a PCIe interface, characterized in that:
The FPGA is provided with a receive engine module, a send engine module and a read/write module; the receive engine module is connected to the send engine module through the read/write module; the receive engine module classifies and processes the packets sent by the host; the send engine module packs the data that the user sends to the host and sends it to the interface according to the corresponding timing; the read/write module stores and retrieves data;
The receive engine module comprises a receive finite state machine module, a receive Posted packet module, a receive NonPosted packet module and a receive Completion packet module;
The send engine module comprises a send finite state machine module and a send Completion packet module;
A plurality of addressed register groups are virtually provided in the block memory; the register groups comprise transmit-side register groups, receive-side register groups and an interrupt frequency control register group; each register is virtually provided with a plurality of queues;
The steps of the data storage method are as follows:
a. The host sends a packet to the PCIe interface, the PCIe interface delivers it to the receive engine module, the receive engine module processes it, and step b is performed;
b. Three kinds of packets are received: posted request, non-posted request and rx completion; the send finite state machine module classifies and processes the three kinds of packets; for a posted request the send finite state machine module performs step c; for a non-posted request the send finite state machine module performs step d;
c. The posted request is dispatched by the receive Posted packet module, through the read/write module, to a register for a write operation, and step c1 is performed;
c1. The receive Posted packet module determines from the address in the packet whether logic storage or block memory storage applies; for block memory storage, step c2 is performed;
c2. The receive Posted packet module determines from the address in the packet whether the operation targets a transmit-side register or a receive-side register; for a receive-side register, step c3 is performed;
c3. The receive Posted packet module determines from the address in the packet the specific block memory of the receive-side register to be written, then determines from the address which queue of that block memory is to be written, and step c4 is performed;
c4. The read/write module writes the data into the queue of the block memory, thereby accomplishing the data write, and the write operation ends;
d. The non-posted request is dispatched by the receive NonPosted packet module, through the read/write module, to a register for a read operation, and step d1 is performed;
d1. The receive NonPosted packet module determines from the address in the packet whether logic storage or block memory storage applies; for block memory storage, step d2 is performed;
d2. The receive NonPosted packet module determines from the address in the packet whether the operation targets a transmit-side register or a receive-side register; for a receive-side register, step d3 is performed;
d3. The receive NonPosted packet module determines from the address in the packet the specific block memory of the receive-side register to be read, then determines from the address in the packet which queue of that block memory is to be read, and step d4 is performed;
d4. The read/write module reads the data from the queue of the block memory, thereby accomplishing the data read; the read operation ends, and step d5 is performed;
d5. The read/write module returns the data read from the specific queue of the specific block memory to the host through the send Completion packet module.
2. The method for storing data of multiple virtual queues according to claim 1, characterized in that: the block memory is provided with a read/write mode state transition module, and the read/write mode state transition module is divided into read-mode state transitions and write-mode state transitions,
The read-mode states are: initial state, host read enable, read state, read end state, and read end delay state,
The write-mode states are: initial state, host write enable, write state, and write end state,
I. The write operation of step c4 is specifically:
c4-1. The initial state of the read/write mode state transition module is the idle state,
c4-2. When the read/write mode state transition module detects the host_wr_en command sent by the host, it is in the host write enable state,
c4-3. The read/write mode state transition module enters the write state,
c4-4. The read/write module writes the data into the queue of the block memory, thereby accomplishing the data write,
c4-5. The read/write mode state transition module enters the write end state and closes the write command,
c4-6. The read/write mode state transition module returns to the initial idle state and waits for the next write operation; the write operation ends;
II. The read operation of step d4 is specifically:
d4-1. The initial state of the read/write mode state transition module is the idle state,
d4-2. When the read/write mode state transition module detects the host_rd_en command sent by the host, it is in the host read enable state,
d4-3. The read/write mode state transition module enters the read state,
d4-4. The read/write module reads the data from the queue of the block memory, thereby accomplishing the data read,
d4-5. The read/write mode state transition module enters the read end state and closes the read command,
d4-6. The read/write mode state transition module enters the read end delay state, in which the data read is copied into a latch and latched,
d4-7. The read/write mode state transition module returns to the initial idle state and waits for the next read operation; the read operation ends.
3. The method for storing data of multiple virtual queues according to claim 1 or 2, characterized in that: the depth of the plurality of queues virtually provided for each register is 16, 32, 64, 128, 256 or 2048.
4. The method for storing data of multiple virtual queues according to claim 3, characterized in that: the depth of the plurality of queues virtually provided for each register is 2048; there are 15 register groups, of which 7 are transmit-side register groups, 7 are receive-side register groups, and 1 is the interrupt frequency control register group.
5. The method for storing data of multiple virtual queues according to claim 4, characterized in that: the interrupt frequency control register group may be an MSIX register of depth 4096, a Receive VLAN Filter Table Array register of depth 2176, a Receive Multicast Table Array Register of depth 128, or a Flexible TCO Filter Table Register of depth 256.
6. The method for storing data of multiple virtual queues according to claim 1 or 2, characterized in that: the FPGA is a Xilinx XC5VLX50T chip.
7. The method for storing data of multiple virtual queues according to claim 1 or 2, characterized in that: the block memory can adopt data-width conversion and single-port and dual-port transfer modes.
8. The method for storing data of multiple virtual queues according to claim 1 or 2, characterized in that: when the same queue is read and written simultaneously, the block memory and the read/write mode state transition module both support three operating modes: write-priority mode, read-priority mode and output-unchanged mode.
9. The method for storing data of multiple virtual queues according to claim 8, characterized in that: in write-priority mode, the data read out is identical to the data written; in read-priority mode, the data at the addressed buffer location is read out first and the data is then written, and the write operation does not affect the data previously read from the addressed buffer location; in output-unchanged mode, the data is only written to the corresponding buffer and the output is not affected, the output buffer keeping the data from the last read operation.
10. The method for storing data of multiple virtual queues according to claim 1 or 2, characterized in that: the packet is a transport layer packet.
CN2009100838925A 2009-05-11 2009-05-11 Method for storing multiple virtual queues data based on FPGA Active CN101599294B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009100838925A CN101599294B (en) 2009-05-11 2009-05-11 Method for storing multiple virtual queues data based on FPGA

Publications (2)

Publication Number Publication Date
CN101599294A CN101599294A (en) 2009-12-09
CN101599294B true CN101599294B (en) 2012-01-25

Family

ID=41420702

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009100838925A Active CN101599294B (en) 2009-05-11 2009-05-11 Method for storing multiple virtual queues data based on FPGA

Country Status (1)

Country Link
CN (1) CN101599294B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: NANJING CITY CLOUD COMPUTING CENTER CO., LTD.

Free format text: FORMER OWNER: SHUGUANG INFORMATION INDUSTRIAL (BEIJING) CO., LTD.

Effective date: 20130326

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100084 HAIDIAN, BEIJING TO: 211153 NANJING, JIANGSU PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20130326

Address after: 211153, 1, 37, general road, Jiangning economic and Technological Development Zone, Nanjing, Jiangsu

Patentee after: Nanjing City Cloud Computing Center Co., Ltd.

Address before: 100084 Beijing Haidian District City Mill Street No. 64

Patentee before: Dawning Information Industry (Beijing) Co., Ltd.