WO2016206490A1 - Apparatus and method for improving entry access bandwidth and atomicity of operations - Google Patents

Apparatus and method for improving entry access bandwidth and atomicity of operations

Info

Publication number
WO2016206490A1
Authority
WO
WIPO (PCT)
Prior art keywords
entry
cache
address
data
request
Prior art date
Application number
PCT/CN2016/081618
Other languages
English (en)
French (fr)
Inventor
包闯
闫振林
张春晖
安康
Original Assignee
深圳市中兴微电子技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市中兴微电子技术有限公司
Priority to US15/739,243 (US10545867B2)
Priority to EP16813606.7A (EP3316543B1)
Priority to ES16813606T (ES2813944T3)
Priority to SG11201710789YA
Publication of WO2016206490A1

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101 Access control lists [ACL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0877 Cache access modes
    • G06F12/0879 Burst mode
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/5682 Policies or rules for updating, deleting or replacing the stored data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement
    • G06F2212/1024 Latency reduction

Definitions

  • The present invention relates to entry management technology, and in particular to an apparatus and method for improving entry access bandwidth and the atomicity of entry operations, and a computer storage medium.
  • NP: Network Processor
  • ASIP: Application Specific Instruction-set Processor
  • The network processor manages the entries and requires that, while a certain update performance is achieved, entry lookups remain continuous (streaming) and error-free.
  • For a table entry with a large single-entry width, multiple addresses may be needed to store a single entry.
  • The atomicity of entry operations is therefore particularly important while an entry is being updated.
  • Embodiments of the present invention are directed to an apparatus and method for improving entry access bandwidth and atomicity of operations, and a computer storage medium, which at least improve the access bandwidth when an entry is queried and ensure that lookups remain continuous and error-free while entries are updated atomically.
  • An apparatus for improving entry access bandwidth and atomicity of operations, where the apparatus includes: a comparison module, a Cache, and a distribution module;
  • the comparison module is configured to receive a query request from the service side and determine whether the address pointed to by the query request is equal to the stored entry address in the Cache; if they are equal and the valid flag vld is currently valid, no lookup request needs to be issued to the off-chip memory, which reduces accesses to the off-chip memory, and the entry data stored in the Cache is returned directly to the service side; if they are not equal, a lookup request is issued to the off-chip memory, and the entry data returned by the off-chip memory is processed according to a first preset rule, so that the atomic operation of updating an entry during entry lookup keeps lookups continuous and error-free;
  • the Cache is configured to store entry data and entry addresses;
  • the distribution module is configured to identify whether the data to be returned to the service side is entry data from the Cache or entry data from the off-chip memory, and then return it to the service side.
  • The comparison module is further configured to determine, according to the first preset rule, whether the address pointed to by the lookup request is equal to the address stored in the Cache, in any one of the following manners:
  • Manner 1: if the vld corresponding to the low M bits (the first threshold) of the address is valid and the high N bits (the second threshold) of the address are equal to the address stored in the Cache, the data in the Cache is returned to the service side and the data in the Cache is not updated; if the addresses are not equal, the data in the Cache is not updated and the data returned by the off-chip memory request is sent to the service side;
  • both M and N are natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
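  • The rule above can be illustrated with a minimal Python sketch (this code is not part of the patent text): the Cache is modelled as a direct-mapped array indexed by the low M bits of the request address, with the high N bits kept as a tag next to the entry data and a vld flag. The names M, N, CacheLine, lookup and read_off_chip are illustrative assumptions.

```python
from dataclasses import dataclass

M, N = 10, 14                      # example split; M + N = request bit width

@dataclass
class CacheLine:
    vld: bool = False              # valid flag managed by the control module
    addrh: int = 0                 # high N bits of the stored entry address (tag)
    data: bytes = b""              # entry data

cache = [CacheLine() for _ in range(1 << M)]

def lookup(request_addr, read_off_chip):
    """Manner 1: hit only if the line selected by the low M bits is valid and
    its stored high N bits equal the high N bits of the request."""
    low_m = request_addr & ((1 << M) - 1)
    high_n = request_addr >> M
    line = cache[low_m]
    if line.vld and line.addrh == high_n:
        return line.data           # served from the Cache, no off-chip access
    # addresses not equal (or line invalid): forward to off-chip memory;
    # the Cache contents are left untouched in this case
    return read_off_chip(request_addr)
```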
  • the device further includes:
  • the first arbitration module is configured to arbitrate between writes to the Cache from the central processing unit and writes to the Cache of entry data returned by the off-chip memory;
  • the control module is configured to manage the vld flag bits and determine when to initiate an update operation on the off-chip memory;
  • the central processing unit is configured to configure service entries; for the update of a single-burst entry, an instruction to write the single-burst entry is issued; after arbitration by the first arbitration module, the low M bits (the first threshold) of the entry are used as the address, the high N bits (the second threshold) of the entry address and the entry data are written into the Cache, the vld register corresponding to the address is set by the control module, and an instruction to update the off-chip memory is issued, completing the entry update;
  • both M and N are natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the device further includes:
  • the first arbitration module is configured to arbitrate between writes to the Cache from the central processing unit and writes to the Cache of entry data returned by the off-chip memory;
  • the control module is configured to manage the vld flag bits and determine when to initiate an update operation on the off-chip memory;
  • the central processing unit is configured to configure service entries; for the update of a multi-burst entry, an instruction to write the multi-burst entry is issued; after arbitration by the first arbitration module, the first burst uses the value obtained by shifting the low M bits (the first threshold) of the entry left by S bits as the address, the high N bits (the second threshold) of the entry address and the entry data are written into the Cache, and the control module sets the vld corresponding to the address to 0 and does not issue an instruction to update the off-chip memory entry; the second burst uses that shifted value plus 1 as the address, and the high N bits of the entry address and the entry data are likewise written into the Cache.
  • the device further includes:
  • a lookup information storage module, configured to store the table lookup request and the multi-burst flag information;
  • the comparison module is further configured, for a single-burst entry, to determine whether the vld flag corresponding to the low M bits (the first threshold) of the lookup request is valid; if it is valid, a query of the Cache is initiated with the low M bits of the lookup request, the query result is obtained and parsed, and the address found is compared with the high N bits (the second threshold) of the lookup request; if they are equal, the Cache query result is returned directly to the service side through the distribution module, no request to query the off-chip memory is issued, and the data read from the lookup information storage module is discarded; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side;
  • the second arbitration module is configured to arbitrate between reads of the Cache from the service side and reads of the Cache triggered by data returned from the off-chip memory;
  • the off-chip memory is configured to store the lookup entries;
  • the device further includes:
  • a lookup information storage module, configured to store the table lookup request and the multi-burst flag information;
  • the comparison module is further configured, for a multi-burst entry where the number of bursts is 2^S, such that the control module determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits (the first threshold) of the lookup request left by S bits are all valid; if they are all valid, 2^S consecutive query requests are sent to the Cache starting from the shifted address, the query results are obtained and parsed, and the addresses found are compared with the high N bits (the second threshold) of the lookup request; if they are equal, the Cache results are spliced and returned directly to the service side through the distribution module, no request to query the off-chip memory is issued, and the data read from the lookup information storage module is discarded (see the sketch after this list); wherein M and N are both natural numbers, and S is a natural number;
  • the second arbitration module is configured to arbitrate between reads of the Cache from the service side and reads of the Cache triggered by data returned from the off-chip memory;
  • the off-chip memory is configured to store the lookup entries.
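  • The multi-burst hit path can be sketched roughly as below; this is an illustrative assumption, not the patent's implementation, reusing the CacheLine model from the earlier sketch and assuming the Cache is deep enough for the shifted index. lookup_multi_burst and read_off_chip are made-up names.

```python
def lookup_multi_burst(request_addr, S, cache, M, read_off_chip):
    """Hit only if all 2**S consecutive lines starting at (low M bits << S)
    are valid and their tags equal the high N bits of the request; the burst
    results are then spliced together and returned without any off-chip access."""
    low_m = request_addr & ((1 << M) - 1)
    high_n = request_addr >> M
    base = low_m << S                                # index of the first burst
    lines = [cache[base + k] for k in range(1 << S)]
    if all(l.vld for l in lines) and all(l.addrh == high_n for l in lines):
        return b"".join(l.data for l in lines)       # spliced Cache result
    # otherwise fall back to the off-chip memory (the miss path)
    return read_off_chip(request_addr)
```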
  • The comparison module is further configured to, when the query request does not match any address in the Cache, initiate a lookup request to the off-chip memory; after the entry data is returned, the entry address and the multi-burst information are taken out of the lookup information storage module;
  • for a single-burst entry, if the control module determines that the vld corresponding to the low M bits (the first threshold) of the address is valid, the Cache is read after arbitration by the second arbitration module, the stored high N bits (the second threshold) are obtained and compared with the high N bits of the taken-out address; if they match, the data of the corresponding address is replaced with the entry data returned from the off-chip memory and written back to the Cache, and the data is returned to the service side through the distribution module.
  • The comparison module is further configured to, when the query request does not match any address in the Cache, initiate a lookup request to the off-chip memory; after the entry data is returned, the entry address and the multi-burst information are taken out of the lookup information storage module;
  • for a multi-burst entry, the control module determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits (the first threshold) of the address left by S bits are all valid; if they are all valid, the Cache is read after arbitration by the second arbitration module, the stored high N bits (the second threshold) are obtained and compared with the high N bits of the taken-out address; if they match, the data of the corresponding addresses is replaced with the entry data returned from the off-chip memory and written back to the Cache, and the data is returned to the service side through the distribution module.
  • The comparison module is further configured to, after receiving a query request and according to the multi-burst flag carried in the query request, have the control module determine whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits (the first threshold) of the service request left by S bits are all valid; if they are valid, the corresponding data in the Cache is read and it is determined whether the high N bits (the second threshold) of the service request match the addresses in the Cache; if they match, the data is returned directly to the service side; if they do not match, a request to query the off-chip memory is issued.
  • The comparison module is further configured to, after the entry data is returned, read the lookup information storage module to obtain the lookup request address and the multi-burst flag; the control module determines whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits (the first threshold) of the service request left by S bits are all valid; if they are all valid, the corresponding data in the Cache is read and it is determined whether the high N bits (the second threshold) of the service request match the addresses returned from the Cache; if they match, the entry data in the Cache is returned to the service side through the distribution module and the entry data in the Cache is not updated; otherwise, the entry data from the off-chip memory is returned directly to the service side through the distribution module and the entry data in the Cache is updated; if only some of the vld flags corresponding to the multiple bursts are valid, the entry update has not been completed, the entry data from the off-chip memory is returned to the service side through the distribution module, and the entry data in the Cache is not updated.
  • In practical applications, the above modules can be implemented by a Central Processing Unit (CPU), a Digital Signal Processor (DSP), or a Field-Programmable Gate Array (FPGA).
  • CPU Central Processing Unit
  • DSP Digital Signal Processor
  • FPGA Field-Programmable Gate Array
  • A method for improving entry access bandwidth and atomicity of operations according to an embodiment of the present invention is applied to the apparatus according to any one of the foregoing aspects, the method comprising: receiving a query request from the service side and determining whether the address pointed to by the query request is equal to the stored entry address in the Cache; if they are equal and the valid flag vld is currently valid, not issuing a lookup request to the off-chip memory, so as to reduce accesses to the off-chip memory, and returning the entry data stored in the Cache directly to the service side; if they are not equal, issuing a lookup request to the off-chip memory, and processing the entry data returned by the off-chip memory according to the first preset rule, so that the atomic operation of updating an entry during entry lookup keeps lookups continuous and error-free.
  • The first preset rule used to determine whether the address of the lookup request is equal to the address stored in the Cache includes any one of the following manners:
  • Manner 1: if the vld corresponding to the low M bits (the first threshold) of the address is valid and the high N bits (the second threshold) of the address are equal to the address stored in the Cache, the data in the Cache is returned to the service side and the data in the Cache is not updated; if the addresses are not equal, the data in the Cache is not updated and the data returned by the off-chip memory request is sent to the service side;
  • both M and N are natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the method further includes:
  • the central processing unit configures a service entry and, for the update of a single-burst entry, issues an instruction to write the single-burst entry;
  • the low M bits (the first threshold) of the entry are used as the address, the high N bits (the second threshold) of the entry address and the entry data are written into the Cache, the vld register corresponding to the address is set by the control module, and an instruction to update the off-chip memory is issued to complete the entry update; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
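  • A minimal sketch of the single-burst update path just described (not the patent's implementation), reusing the CacheLine/cache model from the earlier sketch; write_off_chip stands in for the instruction that updates the off-chip memory and is an assumed interface.

```python
def update_single_burst(entry_addr, entry_data, cache, M, write_off_chip):
    """Write one single-burst entry: the low M bits index the Cache, the high
    N bits are stored as the tag, vld is set, and the off-chip copy is updated."""
    low_m = entry_addr & ((1 << M) - 1)     # Cache index
    high_n = entry_addr >> M                # stored alongside the data
    line = cache[low_m]
    line.addrh, line.data = high_n, entry_data
    line.vld = True                         # control module sets the vld register
    write_off_chip(entry_addr, entry_data)  # issue the off-chip update instruction
```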
  • the method further includes:
  • the central processing unit configures a service entry and, for the update of a multi-burst entry, issues an instruction to write the multi-burst entry;
  • the first burst uses the value obtained by shifting the low M bits (the first threshold) of the entry left by S bits as the address, and the high N bits (the second threshold) of the entry address and the entry data are written into the Cache;
  • the vld corresponding to that address is set to 0 by the control module, and no instruction to update the off-chip memory entry is issued;
  • the second burst uses the value obtained by shifting the low M bits of the entry left by S bits, plus 1, as the address, and writes the high N bits of the entry address and the entry data into the Cache;
  • the control module sets the vld corresponding to that address to 0 and does not issue an instruction to update the off-chip memory, but sets the vld of the first burst to 1 and issues an instruction to update the first burst's entry in the off-chip memory; each subsequent burst is handled in the same way as the second burst;
  • when the address of the entry returned by the off-chip memory matches the address obtained by shifting the low M bits of the entry left by S bits, plus 2^S - 2 (the second-to-last burst), the vld corresponding to the last burst is set to 1 and an instruction to update the off-chip memory entry is issued, completing the update of the entry; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
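  • A hedged sketch of the multi-burst update order described above (2**S bursts). It models only the vld bookkeeping that gives the update its atomicity; the Cache is assumed deep enough for the shifted index, and write_off_chip / on_second_to_last_done are assumed stand-ins for the off-chip write and its completion notification.

```python
def update_multi_burst(low_m, high_n, bursts, cache, S, write_off_chip):
    """Write a 2**S-burst entry. Burst k lands at index (low_m << S) + k with
    vld = 0; writing burst k also commits burst k-1 (vld = 1, off-chip write).
    The last burst is committed only after the off-chip write of the
    second-to-last burst completes, so a partially written entry never looks
    fully valid to concurrent lookups."""
    base = low_m << S
    n = 1 << S
    assert len(bursts) == n
    for k, data in enumerate(bursts):
        line = cache[base + k]
        line.addrh, line.data = high_n, data
        line.vld = False                      # burst k itself starts invalid
        if k > 0:
            cache[base + k - 1].vld = True    # previous burst is now committed
            write_off_chip(base + k - 1, bursts[k - 1])

    def on_second_to_last_done():
        # called when the off-chip memory acknowledges the write of burst n - 2
        cache[base + n - 1].vld = True
        write_off_chip(base + n - 1, bursts[-1])
    return on_second_to_last_done
```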
  • the method further includes:
  • the comparison module determines whether the vld flag corresponding to the low M bits (the first threshold) of the lookup request is valid; if it is valid, a query of the Cache is initiated with the low M bits of the lookup request and the query result is obtained;
  • the query result is parsed and the address found is compared with the high N bits (the second threshold) of the lookup request; if they are equal, the Cache query result is returned directly to the service side through the distribution module, no request to query the off-chip memory is issued, and the data read from the lookup information storage module is discarded; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the method further includes:
  • for a multi-burst entry where the number of bursts is 2^S, the control module determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits (the first threshold) of the lookup request left by S bits are all valid; if they are all valid, 2^S consecutive query requests are sent to the Cache starting from the shifted address and the query results are obtained; the query results are parsed and the addresses found are compared with the high N bits (the second threshold) of the lookup request; if they are equal, the Cache results are spliced and returned directly to the service side through the distribution module, no request to query the off-chip memory is issued, and the data read from the lookup information storage module is discarded.
  • the method further includes:
  • when the query request does not match any address in the Cache, the comparison module initiates a lookup request to the off-chip memory; after the entry data is returned, the entry address and the multi-burst information are retrieved from the lookup information storage module;
  • the control module determines whether the vld corresponding to the low M bits (the first threshold) of the address is valid; if it is valid, the Cache is read after arbitration by the second arbitration module, the stored high N bits (the second threshold) are obtained and compared with the high N bits of the retrieved address; if they match, the data of the corresponding address is replaced with the entry data returned from the off-chip memory and written back to the Cache, and the data is returned to the service side through the distribution module.
  • the method further includes:
  • when the query request does not match any address in the Cache, the comparison module initiates a lookup request to the off-chip memory; after the entry data is returned, the entry address and the multi-burst information are retrieved from the lookup information storage module;
  • the control module determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits (the first threshold) of the address left by S bits are all valid; if they are all valid, the Cache is read after arbitration by the second arbitration module, the stored high N bits (the second threshold) are obtained and compared with the high N bits of the retrieved address; if they match, the data of the corresponding addresses is replaced with the entry data returned from the off-chip memory and written back to the Cache, and the data is returned to the service side through the distribution module.
  • the method further includes:
  • the comparison module receives the query request and, according to the multi-burst flag carried in the query request, the control module determines whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits (the first threshold) of the service request left by S bits are all valid; if they are valid, the corresponding data in the Cache is read and it is determined whether the high N bits (the second threshold) of the service request match the addresses in the Cache; if they match, the data is returned directly to the service side; if they do not match, a request to query the off-chip memory is issued.
  • the method further includes:
  • after the entry data is returned, the comparison module reads the lookup information storage module to obtain the lookup request address and the multi-burst flag;
  • the control module determines whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits (the first threshold) of the service request left by S bits are all valid; if they are all valid, the corresponding data in the Cache is read and it is determined whether the high N bits (the second threshold) of the service request match the addresses returned from the Cache; if they match, the entry data in the Cache is returned to the service side through the distribution module and the entry data in the Cache is not updated; otherwise, the entry data from the off-chip memory is returned directly to the service side through the distribution module and the entry data in the Cache is updated; if only some of the vld flags corresponding to the multiple bursts are valid, it indicates that the entry update has not been completed, the entry data from the off-chip memory is returned to the service side through the distribution module, and the entry data in the Cache is not updated.
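  • The reconciliation step just described, run when the off-chip data comes back for a request that originally missed, can be sketched as follows; the function name, the argument layout, and the byte-splicing format are assumptions, and the three branches mirror the three cases in the text.

```python
def on_off_chip_return(addr, off_chip_bursts, S, cache, M):
    """Reconcile a returned multi-burst entry with the Cache:
    - all lines valid with a matching tag  -> serve the Cache copy, leave it alone;
    - all lines valid with a different tag -> serve the off-chip copy, refresh the Cache;
    - only some lines valid               -> an update is in flight, serve the
      off-chip copy and leave the Cache untouched."""
    low_m = addr & ((1 << M) - 1)
    high_n = addr >> M
    base = low_m << S
    lines = [cache[base + k] for k in range(1 << S)]
    if all(l.vld for l in lines):
        if all(l.addrh == high_n for l in lines):
            return b"".join(l.data for l in lines)    # Cache not updated
        for l, burst in zip(lines, off_chip_bursts):  # refresh the Cache lines
            l.addrh, l.data = high_n, burst
        return b"".join(off_chip_bursts)
    # partially valid: the entry is mid-update, do not touch the Cache
    return b"".join(off_chip_bursts)
```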
  • Embodiments of the present invention also provide a computer storage medium in which computer-executable instructions are stored, the computer-executable instructions being configured to perform the above method for improving entry access bandwidth and atomicity of operations.
  • The apparatus for improving entry access bandwidth and atomicity of operations according to the embodiments of the present invention includes: a comparison module, a Cache, and a distribution module; the comparison module is configured to receive a query request from the service side and determine whether the address pointed to by the query request is equal to the stored entry address in the Cache; if they are equal and the valid flag vld is currently valid, no lookup request needs to be issued to the off-chip memory, which reduces accesses to the off-chip memory, and the entry data stored in the Cache is returned to the service side; if they are not equal, a lookup request is issued to the off-chip memory, and the entry data returned by the off-chip memory is processed according to the first preset rule, so that the atomic operation of updating an entry during entry lookup keeps lookups continuous and error-free;
  • the Cache is configured to store the entry data and the entry addresses;
  • the distribution module is configured to identify whether the data returned to the service side is entry data from the Cache or entry data from the off-chip memory, and then return it to the service side.
  • In this way, the number of accesses to the off-chip memory is reduced, thereby increasing the query bandwidth; further, the entry data returned by the off-chip memory is processed according to the first preset rule, so that the atomic operation of updating an entry during entry lookup keeps lookups continuous and error-free.
  • FIG. 1 is a schematic diagram of an apparatus for atomic entry operations and for improving off-chip memory lookup bandwidth to which an embodiment of the present invention is applied;
  • FIG. 2 is a schematic diagram of an atomic operation on an entry to which an embodiment of the present invention is applied;
  • FIG. 3 is a schematic diagram of the Cache internal data structure to which an embodiment of the present invention is applied;
  • FIG. 4 is a schematic diagram of a process for improving entry lookup bandwidth performance and atomicity of entry operations according to an embodiment of the present invention;
  • FIG. 5 is a schematic diagram of a process for updating a single-burst entry according to an embodiment of the present invention;
  • FIG. 6 is a schematic diagram of a process for updating a multi-burst entry to which an embodiment of the present invention is applied.
  • An apparatus for improving entry access bandwidth and atomicity of operations comprises a comparison module, a Cache, and a distribution module;
  • the comparison module is configured to receive a query request from the service side and determine whether the address pointed to by the query request is equal to the stored entry address in the Cache; if they are equal and the valid flag vld is currently valid, no lookup request needs to be issued to the off-chip memory, which reduces accesses to the off-chip memory, and the entry data stored in the Cache is returned directly to the service side; if they are not equal, a lookup request is issued to the off-chip memory, and the entry data returned by the off-chip memory is processed according to the first preset rule, so that the atomic operation of updating an entry during entry lookup keeps lookups continuous and error-free;
  • the Cache is configured to store entry data and entry addresses;
  • the distribution module is configured to identify whether the data to be returned to the service side is entry data from the Cache or entry data from the off-chip memory, and then return it to the service side.
  • The comparison module is further configured to determine, according to the first preset rule, whether the address pointed to by the lookup request is equal to the address stored in the Cache, in any one of the following manners:
  • Manner 1: if the vld corresponding to the low M bits (the first threshold) of the address is valid and the high N bits (the second threshold) of the address are equal to the address stored in the Cache, the data in the Cache is returned to the service side and the data in the Cache is not updated; if the addresses are not equal, the data in the Cache is not updated and the data returned by the off-chip memory request is sent to the service side;
  • both M and N are natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the apparatus further includes:
  • the first arbitration module is configured to arbitrate between writes to the Cache from the central processing unit and writes to the Cache of entry data returned by the off-chip memory;
  • the control module is configured to manage the vld flag bits and determine when to initiate an update operation on the off-chip memory;
  • the central processing unit is configured to configure service entries; for the update of a single-burst entry, an instruction to write the single-burst entry is issued; after arbitration by the first arbitration module, the low M bits (the first threshold) of the entry are used as the address, the high N bits (the second threshold) of the entry address and the entry data are written into the Cache, the vld register corresponding to the address is set by the control module, and an instruction to update the off-chip memory is issued, completing the entry update; wherein both M and N are natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the apparatus further includes:
  • the first arbitration module is configured to arbitrate between writes to the Cache from the central processing unit and writes to the Cache of entry data returned by the off-chip memory;
  • the control module is configured to manage the vld flag bits and determine when to initiate an update operation on the off-chip memory;
  • the central processing unit is configured to configure service entries; for the update of a multi-burst entry, an instruction to write the multi-burst entry is issued; after arbitration by the first arbitration module, the first burst uses the value obtained by shifting the low M bits (the first threshold) of the entry left by S bits as the address, the high N bits (the second threshold) of the entry address and the entry data are written into the Cache, and the control module sets the vld corresponding to the address to 0 and does not issue an instruction to update the off-chip memory entry; the second burst uses that shifted value plus 1 as the address, and the high N bits of the entry address and the entry data are likewise written into the Cache.
  • the apparatus further includes:
  • the lookup information storage module is configured to store the table lookup request and the multi-burst flag information; the table lookup request includes the address of the entry to be looked up;
  • the comparison module is further configured, for a single-burst entry, to determine whether the vld flag corresponding to the low M bits (the first threshold) of the lookup request is valid; if it is valid, a query of the Cache is initiated with the low M bits of the lookup request, the query result is obtained and parsed, and the address found is compared with the high N bits (the second threshold) of the lookup request; if they are equal, the Cache query result is returned directly to the service side through the distribution module, no request to query the off-chip memory is issued, and the data read from the lookup information storage module is discarded; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side;
  • the second arbitration module is configured to arbitrate between reads of the Cache from the service side and reads of the Cache triggered by data returned from the off-chip memory;
  • the off-chip memory is configured to store the lookup entries;
  • the apparatus further includes:
  • the lookup information storage module is configured to store the table lookup request and the multi-burst flag information; the table lookup request includes the address of the entry to be looked up;
  • the comparison module is further configured, for a multi-burst entry where the number of bursts is 2^S, such that the control module determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits (the first threshold) of the lookup request left by S bits are all valid; if they are all valid, 2^S consecutive query requests are sent to the Cache starting from the shifted address, the query results are obtained and parsed, and the addresses found are compared with the high N bits (the second threshold) of the lookup request; if they are equal, the Cache results are spliced and returned directly to the service side through the distribution module, no request to query the off-chip memory is issued, and the data read from the lookup information storage module is discarded; wherein M and N are both natural numbers, and S is a natural number;
  • the second arbitration module is configured to arbitrate between reads of the Cache from the service side and reads of the Cache triggered by data returned from the off-chip memory;
  • the off-chip memory is configured to store the lookup entries.
  • The comparison module is further configured to, when the query request does not match any address in the Cache, initiate a lookup request to the off-chip memory; after the entry data is returned, the entry address and the multi-burst information are retrieved from the lookup information storage module; for a single-burst entry, the control module determines whether the vld corresponding to the low M bits (the first threshold) of the address is valid; if it is valid, the Cache is read after arbitration by the second arbitration module, the stored high N bits (the second threshold) are obtained and compared with the high N bits of the retrieved address; if they match, the data of the corresponding address is replaced with the entry data returned from the off-chip memory and written back to the Cache, and the data is returned to the service side through the distribution module.
  • The comparison module is further configured to, when the query request does not match any address in the Cache, initiate a lookup request to the off-chip memory; after the entry data is returned, the entry address and the multi-burst information are retrieved from the lookup information storage module; for a multi-burst entry, the control module first determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits (the first threshold) of the address left by S bits are all valid; if they are all valid, the Cache is read after arbitration by the second arbitration module, the stored high N bits (the second threshold) are obtained and compared with the high N bits of the retrieved address; if they match, the data of the corresponding addresses is replaced with the entry data returned from the off-chip memory and written back to the Cache, and the data is returned to the service side through the distribution module.
  • The comparison module is further configured to, after receiving a query request and according to the multi-burst flag carried in the query request, have the control module determine whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits (the first threshold) of the service request left by S bits are all valid; if they are valid, the corresponding data in the Cache is read and it is determined whether the high N bits (the second threshold) of the service request match the addresses in the Cache; if they match, the data is returned directly to the service side; if they do not match, a request to query the off-chip memory is issued.
  • The comparison module is further configured to, after the entry data is returned, read the lookup information storage module to obtain the lookup request address and the multi-burst flag; the control module determines whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits (the first threshold) of the service request left by S bits are all valid; if they are all valid, the corresponding data in the Cache is read and it is determined whether the high N bits (the second threshold) of the service request match the addresses returned from the Cache; if they match, the entry data in the Cache is returned to the service side through the distribution module and the entry data in the Cache is not updated; otherwise, the entry data from the off-chip memory is returned directly to the service side through the distribution module and the entry data in the Cache is updated; if only some of the vld flags corresponding to the multiple bursts are valid, it indicates that the entry update has not been completed, the entry data from the off-chip memory is returned to the service side through the distribution module, and the entry data in the Cache is not updated.
  • A method for improving entry access bandwidth and atomicity of operations includes:
  • Step S11: receiving a query request from the service side, and determining whether the address pointed to by the query request is equal to the stored entry address in the Cache;
  • Step S12: if they are equal and the valid flag vld is currently valid, not issuing a lookup request to the off-chip memory, so as to reduce accesses to the off-chip memory, and returning the entry data stored in the Cache directly to the service side;
  • Step S13: if they are not equal, issuing a lookup request to the off-chip memory, and processing the entry data returned by the off-chip memory according to the first preset rule, so that the atomic operation of updating an entry during entry lookup keeps lookups continuous and error-free.
  • The first preset rule used to determine whether the address of the lookup request is equal to the address stored in the Cache includes any one of the following manners:
  • Manner 1: if the vld corresponding to the low M bits (the first threshold) of the address is valid and the high N bits (the second threshold) of the address are equal to the address stored in the Cache, the data in the Cache is returned to the service side and the data in the Cache is not updated; if the addresses are not equal, the data in the Cache is not updated and the data returned by the off-chip memory request is sent to the service side;
  • both M and N are natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the method further includes:
  • the central processing unit configures a service entry and, for the update of a single-burst entry, issues an instruction to write the single-burst entry;
  • the low M bits (the first threshold) of the entry are used as the address, the high N bits (the second threshold) of the entry address and the entry data are written into the Cache, the vld register corresponding to the address is set by the control module, and an instruction to update the off-chip memory is issued to complete the entry update; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the method further includes:
  • the central processing unit configures a service entry and, for the update of a multi-burst entry, issues an instruction to write the multi-burst entry;
  • the first burst uses the value obtained by shifting the low M bits (the first threshold) of the entry left by S bits as the address, and the high N bits (the second threshold) of the entry address and the entry data are written into the Cache;
  • the vld corresponding to that address is set to 0 by the control module, and no instruction to update the off-chip memory entry is issued;
  • the second burst uses the value obtained by shifting the low M bits of the entry left by S bits, plus 1, as the address, and writes the high N bits of the entry address and the entry data into the Cache;
  • the control module sets the vld corresponding to that address to 0 and does not issue an instruction to update the off-chip memory, but sets the vld of the first burst to 1 and issues an instruction to update the first burst's entry in the off-chip memory; each subsequent burst is handled in the same way as the second burst;
  • when the address of the entry returned by the off-chip memory matches the address obtained by shifting the low M bits of the entry left by S bits, plus 2^S - 2 (the second-to-last burst), the vld corresponding to the last burst is set to 1 and an instruction to update the off-chip memory entry is issued, completing the update of the entry; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the method further includes:
  • the comparison module determines whether the vld flag corresponding to the low M bits (the first threshold) of the lookup request is valid; if it is valid, a query of the Cache is initiated with the low M bits of the lookup request and the query result is obtained;
  • the query result is parsed and the address found is compared with the high N bits (the second threshold) of the lookup request; if they are equal, the Cache query result is returned directly to the service side through the distribution module, no request to query the off-chip memory is issued, and the data read from the lookup information storage module is discarded; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side.
  • the method further includes:
  • for a multi-burst entry where the number of bursts is 2^S, the control module determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits (the first threshold) of the lookup request left by S bits are all valid; if they are all valid, 2^S consecutive query requests are sent to the Cache starting from the shifted address and the query results are obtained;
  • the query results are parsed and the addresses found are compared with the high N bits (the second threshold) of the lookup request; if they are equal, the Cache query results are spliced and returned directly to the service side through the distribution module, no request to query the off-chip memory is issued, and the data read from the lookup information storage module is discarded; wherein M and N are both natural numbers, and the sum of M and N is the bit width of the request sent by the service side; S is a natural number.
  • the method further includes:
  • when the query request does not match any address in the Cache, the comparison module initiates a lookup request to the off-chip memory; after the entry data is returned, the entry address and the multi-burst information are retrieved from the lookup information storage module;
  • the control module determines whether the vld corresponding to the low M bits (the first threshold) of the address is valid; if it is valid, the Cache is read after arbitration by the second arbitration module, the stored high N bits (the second threshold) are obtained and compared with the high N bits of the retrieved address; if they match, the data of the corresponding address is replaced with the entry data returned from the off-chip memory and written back to the Cache, and the data is returned to the service side through the distribution module.
  • the method further includes:
  • when the query request does not match any address in the Cache, the comparison module initiates a lookup request to the off-chip memory; after the entry data is returned, the entry address and the multi-burst information are retrieved from the lookup information storage module;
  • the control module determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits (the first threshold) of the address left by S bits are all valid; if they are all valid, the Cache is read after arbitration by the second arbitration module, the stored high N bits (the second threshold) are obtained and compared with the high N bits of the retrieved address; if they match, the data of the corresponding addresses is replaced with the entry data returned from the off-chip memory and written back to the Cache, and the data is returned to the service side through the distribution module.
  • the method further includes:
  • the comparison module receives the query request and, according to the multi-burst flag carried in the query request, the control module determines whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits (the first threshold) of the service request left by S bits are all valid; if they are valid, the corresponding data in the Cache is read and it is determined whether the high N bits (the second threshold) of the service request match the addresses in the Cache; if they match, the data is returned directly to the service side; if they do not match, a request to query the off-chip memory is issued.
  • the method further includes:
  • after the entry data is returned, the comparison module reads the lookup information storage module to obtain the lookup request address and the multi-burst flag;
  • the control module determines whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits (the first threshold) of the service request left by S bits are all valid; if they are all valid, the corresponding data in the Cache is read and it is determined whether the high N bits (the second threshold) of the service request match the addresses returned from the Cache; if they match, the entry data in the Cache is returned to the service side through the distribution module and the entry data in the Cache is not updated; otherwise, the entry data from the off-chip memory is returned directly to the service side through the distribution module and the entry data in the Cache is updated; if only some of the vld flags corresponding to the multiple bursts are valid, it indicates that the entry update has not been completed, the entry data from the off-chip memory is returned to the service side through the distribution module, and the entry data in the Cache is not updated.
  • An application scenario of entry management is the management of entries stored in synchronous dynamic random access memory (SDRAM); because the network processor is applied in different settings, the table types, entry capacities, entry sizes, and lookup performance requirements differ considerably.
  • SDRAM: synchronous dynamic random access memory
  • The prior art relies on the structural characteristics of the memory itself, either copying data across multiple banks or reducing row-switching operations; that is, the structure of the off-chip memory itself has to be changed.
  • By adopting the embodiments of the present invention, this application scenario does not need to change the structure of the off-chip memory itself; instead, it improves the lookup bandwidth by reducing accesses to the external SDRAM. This is an efficient entry management scheme that not only improves the lookup performance of the external SDRAM but also overcomes the problem of atomically updating multi-burst entries during lookup, ensuring that lookups remain continuous and error-free while an entry is being updated.
  • Because the structure of the off-chip memory itself does not need to be changed, the design and use are also more convenient and flexible in terms of manufacturing cost.
  • A so-called single-burst entry is an entry that can be stored at a single address in the memory;
  • the entry result is obtained directly from a single lookup request.
  • A so-called multi-burst entry is an entry that must be stored at multiple addresses in the memory;
  • the entry management module needs to convert a single lookup request into multiple lookup requests to obtain the entry result.
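  • To make the single-burst / multi-burst distinction concrete: an entry wider than one memory word has to be split into 2^S bursts before it can be written or looked up. The small helper below is purely illustrative; its name, the padding policy, and rounding to a power of two are assumptions.

```python
def split_into_bursts(entry: bytes, word_bytes: int) -> list:
    """Split a wide entry into the smallest power-of-two number of bursts of
    word_bytes each (single-burst entries return a one-element list)."""
    n = max(1, -(-len(entry) // word_bytes))        # ceiling division
    s = (n - 1).bit_length()                        # smallest S with 2**S >= n
    n = 1 << s
    padded = entry.ljust(n * word_bytes, b"\x00")   # pad to whole bursts
    return [padded[i * word_bytes:(i + 1) * word_bytes] for i in range(n)]
```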
  • This application scenario uses the apparatus of the embodiment of the present invention, specifically an apparatus for improving the access bandwidth and atomicity of operations on off-chip memory entries.
  • The apparatus includes: a lookup information storage module 101, a comparison module 102, a distribution module 103, a control module 104, a second arbitration module 105, a Cache 106, a central processing unit 107, a first arbitration module 108, and an SDRAM 109.
  • The lookup information storage module 101 is configured to store the table lookup address and the multi-burst flag information.
  • The comparison module 102: first, when the service side initiates a lookup request, it determines whether the address of the lookup request is equal to the address stored in the Cache; if they are equal and the valid flag (vld) is valid, the entry stored in the Cache is returned to the service side through the distribution module 103 without initiating a request to look up the SDRAM 109; otherwise, when the address of the lookup request is not equal to the address stored in the Cache, a request to look up the SDRAM 109 is initiated and processing continues according to the returned data; second, when the SDRAM 109 returns data, it determines whether the address of the lookup request is equal to the address stored in the Cache, with the cases described below.
  • The thresholds are set according to the requirements of the actual application; that is, "low M bits" refers to the bits below a first threshold, for example M bits, and "high N bits" refers to the bits above a second threshold, for example N bits.
  • The distribution module 103 is configured to identify whether the data returned to the service side is entry data from the Cache 106 or entry data from the SDRAM 109.
  • The control module 104 is configured to manage the vld flags and determine when to initiate an update operation on the SDRAM 109.
  • The action rules for the entry-update flag vld are as follows:
  • when the central processing unit 107 writes a single-burst entry, the vld is set to 1 and an operation to update the SDRAM 109 entry is issued.
  • The second arbitration module 105 is configured to arbitrate between reads of the Cache 106 from the service side and reads of the Cache 106 triggered by data returned from the SDRAM 109.
  • The Cache 106 is configured to store the entry data and the entry address; FIG. 3 shows the Cache internal data structure to which the embodiment of the present invention is applied.
  • the central processing unit 107 is configured to configure a service entry.
  • The first arbitration module 108 is configured to arbitrate between writes to the Cache 106 from the central processing unit 107 and writes to the Cache 106 of data returned by the SDRAM 109.
  • The SDRAM 109 is configured to store the lookup entries.
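  • The modules 101-109 listed above can be summarized in a small structural sketch; this is only an organizational aid with made-up class and field names, not the patent's design.

```python
from dataclasses import dataclass, field

@dataclass
class EntryLookupDevice:
    """Structural sketch of modules 101-109 (all names are assumptions)."""
    lookup_info_store: dict = field(default_factory=dict)  # 101: pending requests + multi-burst flags
    cache: dict = field(default_factory=dict)              # 106: index -> (vld, addrh, data)
    sdram: dict = field(default_factory=dict)              # 109: off-chip copy of the tables
    # 102 (comparison) compares request addresses with the Cache tags,
    # 103 (distribution) marks whether a result came from the Cache or the SDRAM,
    # 104 (control) owns the vld flags and decides when to update the SDRAM,
    # 105/108 (arbitration) arbitrate read and write access to the Cache,
    # 107 (CPU) configures the service entries.
```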
  • This application scenario uses the apparatus of the embodiment of the present invention as an efficient method for improving the access bandwidth and atomicity of operations on off-chip memory entries; it mainly includes the following aspects, covering both single-burst and multi-burst processing:
  • When the central processing unit issues a single-burst entry write, after arbitration by the first arbitration module 108 the entry to be updated and the high N bits of the corresponding address are spliced and written into the Cache 106 at the address given by the low M bits, the vld flag of that address is set to 1 by the control module 104, and an operation to update the SDRAM 109 entry is issued. When the central processing unit issues a multi-burst entry write, as shown in FIG. 6:
  • when the first burst of the entry is written, the processing flow is the same as the single-burst operation, except that the vld flag of the corresponding address is set to 0 by the control module 104 and no entry update operation is issued;
  • the central processing unit then issues the next burst, whose handling is consistent with that of the first burst, except that in addition the vld flag corresponding to the previous burst is set to 1 and the previous burst is sent to update the SDRAM 109 entry;
  • when the address corresponding to the entry returned by the SDRAM 109 matches the address of the second-to-last burst of the multi-burst entry, the last burst is written to the SDRAM 109, completing the update of the multi-burst entry.
  • The service side sends a request, and the table lookup request and the multi-burst flag are stored in the lookup information storage module 101. For a single-burst entry, it is first determined whether the vld flag corresponding to the low M bits of the lookup request is valid; if it is valid, a query of the Cache 106 is initiated with the low M bits of the lookup request and the query result is obtained;
  • the comparison module 102 compares the address found with the high N bits of the service-side lookup request, and if they are equal,
  • the lookup result of the Cache 106 is returned directly to the service side through the distribution module 103, no request to query the external SDRAM 109 is issued, and the data read from the lookup information storage module 101 is discarded. For a multi-burst entry (assuming 2^S bursts), the control module 104 first determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits of the lookup request left by S bits are all valid; if they are all valid, 2^S consecutive query requests are sent to the Cache 106 starting from the shifted address;
  • when the query results are obtained, the comparison module 102 compares the addresses found with the high N bits of the service-side lookup request, and if they are equal the lookup results of the Cache 106 are spliced and returned directly to the service side through the distribution module 103, no request to query the SDRAM 109 is issued, and the data read from the lookup information storage module 101 is discarded.
  • Otherwise, a request to query the SDRAM 109 is sent. After the entry is returned, the address and the multi-burst information are retrieved from the lookup information storage module 101.
  • For a single-burst entry, the control module 104 determines whether the vld corresponding to the low M bits of the address is valid; if it is valid, the Cache 106 is read after arbitration by the second arbitration module 105 to obtain the stored high N bits, and the comparison module 102 compares them with the high N bits of the address taken from the lookup information storage module 101;
  • if they match, the data at the corresponding address is replaced with the data returned by the SDRAM 109 and written back to the Cache 106, and the data is returned to the service side through the distribution module 103.
  • For a multi-burst entry, the control module 104 first determines whether the vld flags corresponding to the 2^S consecutive addresses obtained by shifting the low M bits of the address left by S bits are all valid; if they are all valid, the Cache 106 is read after arbitration by the second arbitration module 105, and the subsequent operations are consistent with the single-burst case.
  • When the entry result corresponding to the service-side lookup request is stored in the Cache 106, the query result in the Cache 106 is returned directly to the service side through the distribution module 103 and no access request is issued to the external SDRAM 109, thereby improving the query bandwidth of the entries.
  • When a service-side query request is received, according to the multi-burst flag carried in the query request, the control module 104 determines whether the vld flags corresponding to the 2^S consecutive requests obtained by shifting the low M bits of the service request left by S bits are all valid; if they are valid, the corresponding data in the Cache 106 is read, and the comparison module 102 determines whether the high N bits of the service request match the addresses in the Cache 106; if they match, the data is returned directly to the service side; if they do not match, a request to query the external SDRAM 109 is issued.
  • the lookup information storage module 101 is read, and the search request address and the multi-burst identifier are obtained.
  • the control module 104 determines that the low Mbit of the service request is shifted to the left by the Sbit, and whether the corresponding vld identifiers of the consecutive 2 ⁇ S requests are valid. If the data is valid, the data of the corresponding Cache 106 is read, and the comparison module 102 determines whether the high Nbit address of the service request matches the service address of the return Cache 106. If the match returns the cache 106 data to the service side through the distribution module 103, the Cache 106 data is not updated.
  • the SDRAM 109 data is directly returned to the service side through the distribution module 103, and the Cache 106 data is updated; if the vld corresponding to the multiple bursts is partially valid, the update of the entry is not completed, and the SDRAM 109 data is returned to the service side through the distribution module 103. , does not update Cache106 data.
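The return-handling rule just described (return the Cache copy on a full match without updating it, return the off-chip data and refresh the Cache on a miss or stale tag, and leave the Cache untouched while its vld flags are only partially valid) can be sketched as a single routine. It reuses the illustrative cache_line_t declarations from the previous sketch; deliver_to_service() is a hypothetical stand-in for the distribution module 103, the index arithmetic (low Mbit shifted left by S bits to reach 2^S consecutive lines) is one reading of the description, and the vld flags are deliberately left unchanged here because the description manages them on the entry-update path.

```c
#include <stdio.h>

/* Hypothetical stand-in for the distribution module 103: hand a completed
 * lookup result (count bursts of data) back to the service side. */
static void deliver_to_service(const uint64_t *bursts, unsigned count)
{
    for (unsigned i = 0; i < count; i++)
        printf("burst %u: %llx\n", i, (unsigned long long)bursts[i]);
}

/* Applied when the off-chip memory returns data for a pending multi-burst
 * lookup: req_addr is the original service request, sdram_data holds the
 * 2^S bursts read from the off-chip memory. */
static void on_offchip_return(uint32_t req_addr, const uint64_t sdram_data[1u << BURST_S])
{
    const unsigned n    = 1u << BURST_S;
    const uint32_t base = low_m(req_addr) << BURST_S;  /* first of 2^S consecutive lines */
    const uint32_t tag  = high_n(req_addr);

    bool all_vld = true, any_vld = false;
    for (unsigned i = 0; i < n; i++) {
        all_vld = all_vld && cache[base + i].vld;
        any_vld = any_vld || cache[base + i].vld;
    }

    if (all_vld && cache[base].tag == tag) {
        /* Entry already present and stable: return the Cache copy, do not update it. */
        uint64_t out[1u << BURST_S];
        for (unsigned i = 0; i < n; i++)
            out[i] = cache[base + i].data;
        deliver_to_service(out, n);
    } else if (any_vld && !all_vld) {
        /* Update still in progress: return the off-chip data, leave the Cache alone. */
        deliver_to_service(sdram_data, n);
    } else {
        /* Invalid or stale lines: return the off-chip data and refresh the Cache data
         * with it (the vld flags are managed by the entry-update path, not here). */
        for (unsigned i = 0; i < n; i++) {
            cache[base + i].tag  = tag;
            cache[base + i].data = sdram_data[i];
        }
        deliver_to_service(sdram_data, n);
    }
}
```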
  • corresponding to the above description, FIG. 4 is a schematic flowchart of the process for improving table lookup bandwidth and handling atomic operations on entries according to an embodiment of the present invention; it shows the complete flow, which includes the following steps (an illustrative C sketch of this flow follows the list):
  • Step S21: according to the multi-burst flag of the entry, when the vld flags of the 2^S consecutive addresses obtained by shifting the low Mbit of the service lookup request left by 2^S bits are all valid, the data in the Cache is read using those 2^S consecutive addresses of the service request;
  • Step S22: the high Nbit of the service lookup request is compared with addrl in the Cache, and if they are equal the data in the Cache is returned to the service query module;
  • Step S23: otherwise the service lookup request and the multi-burst flag are stored and a query to the off-chip memory is issued; after the data is returned, the service lookup request and the multi-burst flag are read back, and when the vld flags of the 2^S consecutive addresses obtained by shifting the low Mbit of the service lookup request left by 2^S bits are all valid, the data in the Cache is read using those 2^S consecutive addresses of the service request;
  • Step S24: the high Nbit of the lookup request is compared with addrl in the Cache; if they are equal, the data in the Cache is returned to the service query module, otherwise the queried data is written into the Cache and also returned to the service module;
  • Step S25: if the vld flags are only partially valid, no Cache read request is issued and the data obtained from the SDRAM is returned directly to the service module.
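An illustrative C sketch of the flow in steps S21 to S25 is given below. It reuses the cache_line_t declarations and the deliver_to_service() and on_offchip_return() stand-ins from the earlier sketches; save_pending_request() and issue_offchip_read() are hypothetical helpers standing in for the lookup information storage module 101 and the off-chip read request, and the routine is a sketch of the described flow rather than an implementation of the claimed hardware.

```c
/* Hypothetical helpers: remember a pending request (address + multi-burst flag)
 * in place of the lookup information storage module 101, and queue a read
 * toward the off-chip memory. */
static void save_pending_request(uint32_t addr, bool multi_burst) { (void)addr; (void)multi_burst; }
static void issue_offchip_read(uint32_t addr, bool multi_burst)   { (void)addr; (void)multi_burst; }

/* Service-side lookup of a multi-burst entry, following steps S21 to S25. */
static void lookup_multi_burst(uint32_t req_addr)
{
    const unsigned n    = 1u << BURST_S;
    const uint32_t base = low_m(req_addr) << BURST_S;
    const uint32_t tag  = high_n(req_addr);

    /* S21: check the vld flags of the 2^S consecutive Cache lines. */
    bool all_vld = true;
    for (unsigned i = 0; i < n; i++)
        all_vld = all_vld && cache[base + i].vld;

    /* S22: on a full hit with a matching tag, return the concatenated Cache data. */
    if (all_vld && cache[base].tag == tag) {
        uint64_t out[1u << BURST_S];
        for (unsigned i = 0; i < n; i++)
            out[i] = cache[base + i].data;
        deliver_to_service(out, n);
        return;
    }

    /* S23 to S25: otherwise store the request and query the off-chip memory;
     * on_offchip_return() later applies the return/update rule when data arrives. */
    save_pending_request(req_addr, true);
    issue_offchip_read(req_addr, true);
}
```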
  • FIG. 5 is a schematic flowchart of the process for updating a single-burst entry according to an embodiment of the present invention, which includes the following steps (an illustrative C sketch follows the steps):
  • Step S31: the central processing unit issues a write of a single-burst entry;
  • Step S32: using the low Mbit of the entry as the address, the high Nbit of the entry address and the entry data are written into the Cache, the VLD register corresponding to the address is set, and an instruction to update the external memory is issued, completing the entry update.
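A minimal sketch of the single-burst update in steps S31 and S32, reusing the illustrative declarations above; issue_offchip_write() is a hypothetical stand-in for the instruction that updates the external memory.

```c
/* Hypothetical stand-in for the instruction that updates the external memory. */
static void issue_offchip_write(uint32_t addr, uint64_t data) { (void)addr; (void)data; }

/* CPU-driven update of a single-burst entry, following steps S31 and S32. */
static void update_single_burst(uint32_t entry_addr, uint64_t entry_data)
{
    const uint32_t idx = low_m(entry_addr);       /* low Mbit of the entry indexes the Cache */

    cache[idx].tag  = high_n(entry_addr);         /* high Nbit kept as the tag               */
    cache[idx].data = entry_data;
    cache[idx].vld  = true;                       /* set the VLD register for this address   */

    issue_offchip_write(entry_addr, entry_data);  /* then update the external memory         */
}
```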
  • FIG. 6 is a schematic flowchart of the process for updating a multi-burst entry according to an embodiment of the present invention, which includes the following steps (an illustrative C sketch follows the steps):
  • Step S41: the central processing unit issues a write of a multi-burst entry;
  • Step S42: for the first burst, the address obtained by shifting the low Mbit of the entry left by 2^S bits is used as the Cache address, the data is written into the Cache at that address, the vld corresponding to the address is set to 0, and no instruction to update the SDRAM entry is issued;
  • Step S43: for the second burst, the address obtained by shifting the low Mbit left by 2^S bits plus 1 is used as the Cache address, the data is written into the Cache at that address, the vld corresponding to the address is set to 0, and no instruction to update the SDRAM entry is issued; at the same time, the vld of the first burst is set to 1 and an instruction to update the SDRAM entry for the first burst is issued;
  • Step S44: and so on; when the address of the second-to-last burst returned by the SDRAM matches the address obtained by shifting the low Mbit left by 2^S bits plus S-2, the vld corresponding to the last burst is set to 1 and an instruction to update the SDRAM entry is issued, completing the update of the entry.
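The vld sequencing in steps S41 to S44 is what makes the multi-burst update atomic: a burst's vld is raised, and its off-chip write issued, only after the next burst has already been written into the Cache, and the last burst is released only when the address returned by the SDRAM shows that the second-to-last burst has landed. The sketch below illustrates that ordering under the same assumptions as the earlier sketches; issue_offchip_write() is the hypothetical helper from the previous sketch, on_offchip_write_ack() models the SDRAM return that triggers the final step, and the off-chip addressing is simplified to the Cache index.

```c
/* CPU-driven update of a multi-burst entry, following steps S41 to S44.
 * bursts[] holds the 2^S bursts of the new entry. */
static void update_multi_burst(uint32_t entry_addr, const uint64_t bursts[1u << BURST_S])
{
    const unsigned n    = 1u << BURST_S;
    const uint32_t base = low_m(entry_addr) << BURST_S;
    const uint32_t tag  = high_n(entry_addr);

    for (unsigned i = 0; i < n; i++) {
        /* Write burst i into the Cache with its vld cleared: a lookup arriving now
         * sees a partially valid entry and is served from the off-chip memory
         * (step S25) instead of a mix of old and new Cache lines. */
        cache[base + i].tag  = tag;
        cache[base + i].data = bursts[i];
        cache[base + i].vld  = false;

        if (i > 0) {
            /* Burst i-1 is released only after burst i is in the Cache:
             * its vld is raised and its off-chip write is issued. */
            cache[base + i - 1].vld = true;
            issue_offchip_write(base + i - 1, bursts[i - 1]);   /* addressing simplified */
        }
    }
    /* The last burst stays pending until the acknowledgement below sees the
     * second-to-last burst's address come back from the off-chip memory. */
}

/* Models the SDRAM return: called with the address the off-chip memory reports
 * as written, plus the pending entry it belongs to. */
static void on_offchip_write_ack(uint32_t returned_addr, uint32_t entry_addr,
                                 const uint64_t bursts[1u << BURST_S])
{
    const unsigned n    = 1u << BURST_S;
    const uint32_t base = low_m(entry_addr) << BURST_S;

    if (returned_addr == base + n - 2) {      /* second-to-last burst has landed */
        cache[base + n - 1].vld = true;       /* release the last burst          */
        issue_offchip_write(base + n - 1, bursts[n - 1]);
    }
}
```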
  • in summary, this application scenario adopts the scheme of the embodiments of the present invention for improving SDRAM entry access bandwidth and ensuring atomic operations, and overcomes the atomicity problem that arises when a multi-burst entry is updated while lookups are in progress, ensuring that lookups continue without interruption and without errors during an entry update. At the same time, entry data already stored in the Cache is returned directly from the Cache during a query, so no SDRAM read access needs to be issued, which improves the lookup performance of the SDRAM. The scheme is simple and flexible to use and can therefore be widely applied to the management of other single-burst and multi-burst entries.
  • if the integrated modules described in the embodiments of the present invention are implemented in the form of software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. Thus, the embodiments of the present invention are not limited to any specific combination of hardware and software.
  • correspondingly, an embodiment of the present invention further provides a computer storage medium storing a computer program, the computer program being configured to perform the method for improving entry access bandwidth and atomic operations of the embodiments of the present invention.
  • with the embodiments of the present invention, since a request to look up the off-chip memory does not always need to be issued, the number of accesses to the off-chip memory is reduced and the query bandwidth consumed is reduced accordingly; furthermore, the entry data returned by the off-chip memory is processed according to the first preset rule, so that the atomic operation of updating an entry during the entry lookup process allows lookups to continue without interruption and without errors.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

An embodiment of the present invention discloses a device and a method for improving entry access bandwidth and atomic operations, and a computer storage medium. The device includes a comparison module, a cache (Cache), and a distribution module. The comparison module is configured to receive a query request from the service side and determine whether the address indicated by the query request is equal to an entry address stored in the Cache; if they are equal and the valid flag vld is currently valid, there is no need to issue a request to look up the off-chip memory, so accesses to the off-chip memory are reduced and the entry data stored in the Cache is returned directly to the service side; if they are not equal, a request to look up the off-chip memory is issued, and the entry data returned by the off-chip memory is processed according to a first preset rule, so that the atomic operation of updating an entry during the entry lookup process allows lookups to continue without interruption and without errors.

Description

一种用于提高表项访问带宽和原子性操作的装置及方法 技术领域
本发明涉及表项管理技术,尤其涉及一种用于提高表项访问带宽和原子性操作的装置及方法、计算机存储介质。
背景技术
网络处理器(NP,Network Processor)是为网络应用领域涉及的专用指令处理器(ASIP,Application Specific Instruction Process),ASIP具有自己的结构特征和专门的电路设计以适用于网络分组处理,同时它又是一块软件可编程芯片。它使得网络系统能够具备高性能和灵活性。
由于网络处理器应用于不同的场合,查表的类型、表项容量、表项条目大小及查表性能需求差异较大,因此,需要解决的技术问题包括以下两方面:
一、网络处理器对表项管理,要求表项在实现一定更新性能下进行查找时需要达到不断流和不出现错误的目的。对于单条目宽度比较大的表项存储,可能需要存储器多个地址来存储单个条目,此时表项更新过程中,保证表项的原子性操作尤为重要;
二、同时存储在片外存储器由于受到自身结构的影响,表项查询时的访问带宽也是亟待解决的问题。
发明内容
有鉴于此,本发明实施例希望提供一种用于提高表项访问带宽和原子性操作的装置及方法、计算机存储介质,至少提高了表项查询时的访问带宽和确保原子性操作能实现不断流和不出现错误的目的。
本发明实施例的技术方案是这样实现的:
本发明实施例的一种用于提高表项访问带宽和原子性操作的装置,所述装置包括:比较模块、Cache、分发模块;
比较模块,配置为收到业务侧的查询请求,判断所述查询请求指向的地址与缓存Cache中的存储的表项地址是否相等,如果相等,且有效标识vld当前为有效,则无需发起查找片外存储器的请求,以减少对所述片外存储器的访问,直接将所述Cache中存储的表项数据返回至业务侧;如果不相等,发起查找片外存储器的请求,以将请求片外存储器返回的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错;
所述Cache,配置为存储表项数据和表项地址;
分发模块,配置为识别返回给业务侧的数据是Cache中的表项数据还是所述片外存储器中的表项数据后返回给业务侧。
上述方案中,所述比较模块,还配置为按照第一预设规则判断所述查找请求指向的地址与Cache中的存储的地址是否相等,包括以下任意一种方式:
方式一:若低于第一阈值Mbit地址对应的所述vld为全有效,高于第二阈值Nbit地址和Cache中存储的地址相等,将Cache中的数据返回至业务侧,不更新Cache中的数据;若地址不相等,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
方式二:若低于第一阈值Mbit地址对应的所述vld为部分有效,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
方式三:若低于第一阈值Mbit地址对应的所述vld为无效,更新Cache中的数据,将片外存储器返回的数据发送至业务侧;
其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
上述方案中,所述装置还包括:
第一仲裁模块,配置为完成中央处理单元写入Cache与片外存储器返回表项数据写入Cache间的仲裁;
控制模块,配置为管理vld标识位及判断何时发起对片外存储器的更新操作;
中央处理单元,配置为配置业务表项,对于单burst表项更新的情况,发出写单burst表项的指令;经所述第一仲裁模块仲裁后以表项低于第一阈值Mbit为地址,将表项高于第二阈值Nbit地址/表项数据写入Cache中,通过所述控制模块将此地址对应的vld寄存器置位,发出更新片外存储器的指令,完成表项更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
上述方案中,所述装置还包括:
第一仲裁模块,配置为完成中央处理单元写入Cache与片外存储器返回表项数据写入Cache间的仲裁;
控制模块,配置为管理vld标识位及判断何时发起对片外存储器的更新操作;
中央处理单元,配置为配置业务表项,对于多burst表项更新的情况,发出写多burst表项的指令;经所述第一仲裁模块仲裁后首burst以表项低于第一阈值Mbit左移2^S bit得到的值作为地址,将表项高于第二阈值Nbit地址/表项数据写入所述Cache中,通过所述控制模块将此地址对应的vld置0,不发出更新所述片外存储器表项的指令;第二个burst以表项低于第一阈值Mbit左移2^S bit的得到值+1作为地址,将表项高于第二阈值Nbit地址/表项数据置写入所述Cache中,通过所述控制模块将此地址对应vld置0,不发出更新所述片外存储器的指令,同时将首burst的vld置1,发出更新vld表项的指令;依此类推,当所述片外存储器返回的倒数第二个burst 地址低于第一阈值Mbit左移2^S bit得到的地址+S-2地址匹配时,将最后一个burst对应的vld置1,发出更新所述片外存储器表项的指令,完成表项的更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
上述方案中,所述装置还包括:
查找信息存储模块,配置为存储查表请求和多burst标志信息;
所述比较模块,还配置为对于单burst表项的情况,判断所述查找请求低于第一阈值Mbit对应的vld标识是否有效,有效,则用查找请求低于第一阈值Mbit发起查询Cache,获得查询结果,解析所查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽;
第二仲裁模块,配置为业务侧读取Cache与片外存储器返回读取Cache间的仲裁;
所述片外存储器,配置为存储查找表项;
上述方案中,所述装置还包括:
查找信息存储模块,配置为存储查表请求和多burst标志信息;
所述比较模块,还配置为对于多burst表项的情况,且多burst表项为2^S个时,通过所述控制模块判断所述查找请求低于第一阈值Mbit左移Sbit后,连续2^S个地址对应的vld标识是否有效,若全有效,用查找请求低于第一阈值Mbit左移sbit后连续发出2^S个查询Cache的请求,获得查询结果,解析所述查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果拼接后通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读 取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽;所述S为自然数;
第二仲裁模块,配置为业务侧读取Cache与片外存储器返回读取Cache间的仲裁;
所述片外存储器,配置为存储查找表项。
上述方案中,所述比较模块,还配置为在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息,对于单burst表项的情况,通过所述控制模块判断地址低于第一阈值Mbit对应的vld是否有效,有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
上述方案中,所述比较模块,还配置为在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息,对于多burst表项的情况,先通过所述控制模块判断地址低于第一阈值Mbit左边移Sbit后对应连续2^S个地址的对应的vld是否有效,全有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
上述方案中,所述比较模块,还配置为在接收到所述查询请求,根据查询请求携带的多burst标识,通过所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若有效,则读取对应Cache的数据,判断业务请求高于第二阈值Nbit与Cache中的 地址是否匹配,匹配,则直接将数据返回给业务侧;不匹配,则发出查询片外存储器的请求。
上述方案中,所述比较模块,还配置为待表项数据返回后,读取查找信息存储模块,以获取查找请求地址和多burst标识,在所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若全有效,则读取对应Cache的数据,判断业务请求的高于第二阈值Nbit地址与返回Cache的业务地址是否匹配,若匹配,将Cache中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据,否则,直接将片外存储器中的表项数据通过所述分发模块返回给业务侧,同时更新Cache中的表项数据;若多burst对应的vld为部分有效,则表明表项更新未完成,将片外存储器中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据。
所述比较模块、所述Cache、所述分发模块、所述第一仲裁模块、所述控制模块、所述中央处理单元、所述查找信息存储模块、所述第二仲裁模块、所述片外存储器在执行处理时,可以采用中央处理器(CPU,Central Processing Unit)、数字信号处理器(DSP,Digital Singnal Processor)或可编程逻辑阵列(FPGA,Field-Programmable Gate Array)实现。
本发明实施例的一种用于提高表项访问带宽和原子性操作的方法,所述方法应用于上述方案中任一项所述的装置,所述方法包括:
收到业务侧的查询请求,判断所述查询请求指向的地址与缓存Cache中的存储的表项地址是否相等;
如果相等,且有效标识vld当前为有效,则无需发起查找片外存储器的请求,以减少对所述片外存储器的访问,直接将所述Cache中存储的表项数据返回至业务侧;
如果不相等,发起查找片外存储器的请求,以将请求片外存储器返回 的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错。
上述方案中,所述第一预设规则用于判断查找请求的地址与Cache中的存储的地址是否相等,包括以下任意一种方式:
方式一:若低于第一阈值Mbit地址对应的所述vld为全有效,高于第二阈值Nbit地址和Cache中存储的地址相等,将Cache中的数据返回至业务侧,不更新Cache中的数据;若地址不相等,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
方式二:若低于第一阈值Mbit地址对应的所述vld为部分有效,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
方式三:若低于第一阈值Mbit地址对应的所述vld为无效,更新Cache中的数据,将片外存储器返回的数据发送至业务侧;
其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
上述方案中,所述方法还包括:
中央处理单元配置业务表项,对于单burst表项更新的情况,发出写单burst表项的指令;
经所述第一仲裁模块仲裁后以表项低于第一阈值Mbit为地址,将表项高于第二阈值Nbit地址/表项数据写入Cache中,通过所述控制模块将此地址对应的vld寄存器置位,发出更新片外存储器的指令,完成表项更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
上述方案中,所述方法还包括:
中央处理单元配置业务表项,对于多burst表项更新的情况,发出写多burst表项的指令;
经所述第一仲裁模块仲裁后首burst以表项低于第一阈值Mbit左移2^S bit得到的值作为地址,将表项高于第二阈值Nbit地址/表项数据写入所述 Cache中,通过所述控制模块将此地址对应的vld置0,不发出更新所述片外存储器表项的指令;
第二个burst以表项低于第一阈值Mbit左移2^S bit的得到值+1作为地址,将表项高于第二阈值Nbit地址/表项数据置写入所述Cache中,通过所述控制模块将此地址对应vld置0,不发出更新所述片外存储器的指令,同时将首burst的vld置1,发出更新vld表项的指令;
依此类推,当所述片外存储器返回的倒数第二个burst地址低于第一阈值Mbit左移2^S bit得到的地址+S-2地址匹配时,将最后一个burst对应的vld置1,发出更新所述片外存储器表项的指令,完成表项的更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
上述方案中,所述方法还包括:
比较模块对于单burst表项的情况,判断所述查找请求低于第一阈值Mbit对应的vld标识是否有效,有效,则用查找请求低于第一阈值Mbit发起查询Cache,获得查询结果;
解析所查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
上述方案中,所述方法还包括:
比较模块对于多burst表项的情况,且多burst表项为2^S个时,通过所述控制模块判断所述查找请求低于第一阈值Mbit左移sbit后,连续2^S个地址对应的vld标识是否有效,若全有效,用查找请求低于第一阈值Mbit左移sbit后连续发出2^S个查询Cache的请求,获得查询结果;
解析所述查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果拼接后通过所述分发模块 返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽;所述S为自然数。
上述方案中,所述方法还包括:
比较模块在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息;
对于单burst表项的情况,通过所述控制模块判断地址低于第一阈值Mbit对应的vld是否有效,有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,若匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
上述方案中,所述方法还包括:
比较模块在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息;
对于多burst表项的情况,先通过所述控制模块判断地址低于第一阈值Mbit左边移Sbit后对应连续2^S个地址的对应的vld是否有效,全有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,若匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
上述方案中,所述方法还包括:
比较模块在接收到所述查询请求,根据查询请求携带的多burst标识,通过所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连 续2^S个请求的vld标识是否有效,若有效,则读取对应Cache的数据,判断业务请求高于第二阈值Nbit与Cache中的地址是否匹配,匹配,则直接将数据返回给业务侧;不匹配,则发出查询片外存储器的请求。
上述方案中,所述方法还包括:
比较模块在待表项数据返回后,读取查找信息存储模块,以获取查找请求地址和多burst标识;
在所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若全有效,则读取对应Cache的数据,判断业务请求的高于第二阈值Nbit地址与返回Cache的业务地址是否匹配,若匹配,将Cache中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据,否则,直接将片外存储器中的表项数据通过所述分发模块返回给业务侧,同时更新Cache中的表项数据;若多burst对应的vld为部分有效,则表明表项更新未完成,将片外存储器中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据。
本发明实施例还提供一种计算机存储介质,其中存储有计算机可执行指令,该计算机可执行指令配置执行上述用于提高表项访问带宽和原子性操作的方法。
本发明实施例的用于提高表项访问带宽和原子性操作的装置包括:比较模块、缓存Cache、分发模块;其中,比较模块,用于收到业务侧的查询请求,判断所述查询请求指向的地址与缓存Cache中的存储的表项地址是否相等,如果相等,且有效标识vld当前为有效,则无需发起查找片外存储器的请求,以减少对所述片外存储器的访问,直接将所述Cache中存储的表项数据返回至业务侧;如果不相等,发起查找片外存储器的请求,以将请求片外存储器返回的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错;所述 Cache,用于存储表项数据和表项地址;分发模块,用于识别返回给业务侧的数据是Cache中的表项数据还是所述片外存储器中的表项数据后返回给业务侧。
采用本发明实施例,由于并不是总是需发起查找片外存储器的请求,从而减少了对所述片外存储器的访问次数,从而减少了查询带宽,而且,在将请求片外存储器返回的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错。
附图说明
图1为应用本发明实施例的表项原子性操作及提高片外存储器查找带宽装置示意图;
图2为应用本发明实施例的表项原子性操作说明示意图;
图3为应用本发明实施例的缓冲(Cache)内部数据结构示意图;
图4为应用本发明实施例的提高查表带宽性能及表项原子性操作处理流程示意图;
图5为应用本发明实施例的单burst表项更新处理流程示意图;
图6为应用本发明实施例的多burst表项更新处理流程示意图。
具体实施方式
下面结合附图对技术方案的实施作进一步的详细描述。
本发明实施例的一种用于提高表项访问带宽和原子性操作的装置,包括比较模块、缓存Cache、分发模块;
比较模块,配置为收到业务侧的查询请求,判断所述查询请求指向的地址与缓存Cache中的存储的表项地址是否相等,如果相等,且有效标识vld当前为有效,则无需发起查找片外存储器的请求,以减少对所述片外存储器的访问,直接将所述Cache中存储的表项数据返回至业务侧;如果不 相等,发起查找片外存储器的请求,以将请求片外存储器返回的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错;
所述Cache,配置为存储表项数据和表项地址;
分发模块,配置为识别返回给业务侧的数据是Cache中的表项数据还是所述片外存储器中的表项数据后返回给业务侧。
在本发明实施例一实施方式中,所述比较模块,还配置为按照第一预设规则判断所述查找请求指向的地址与Cache中的存储的地址是否相等,包括以下任意一种方式:
方式一:若低于第一阈值Mbit地址对应的所述vld为全有效,高于第二阈值Nbit地址和Cache中存储的地址相等,将Cache中的数据返回至业务侧,不更新Cache中的数据;若地址不相等,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
方式二:若低于第一阈值Mbit地址对应的所述vld为部分有效,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
方式三:若低于第一阈值Mbit地址对应的所述vld为无效,更新Cache中的数据,将片外存储器返回的数据发送至业务侧;
其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
在本发明实施例一实施方式中,所述装置还包括:
第一仲裁模块,配置为完成中央处理单元写入Cache与片外存储器返回表项数据写入Cache间的仲裁;
控制模块,配置为管理vld标识位及判断何时发起对片外存储器的更新操作;
中央处理单元,配置为配置业务表项,对于单burst表项更新的情况,发出写单burst表项的指令;经所述第一仲裁模块仲裁后以表项低于第一阈 值Mbit为地址,将表项高于第二阈值Nbit地址/表项数据写入Cache中,通过所述控制模块将此地址对应的vld寄存器置位,发出更新片外存储器的指令,完成表项更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
在本发明实施例一实施方式中,所述装置还包括:
第一仲裁模块,配置为完成中央处理单元写入Cache与片外存储器返回表项数据写入Cache间的仲裁;
控制模块,配置为管理vld标识位及判断何时发起对片外存储器的更新操作;
中央处理单元,配置为配置业务表项,对于多burst表项更新的情况,发出写多burst表项的指令;经所述第一仲裁模块仲裁后首burst以表项低于第一阈值Mbit左移2^S bit得到的值作为地址,将表项高于第二阈值Nbit地址/表项数据写入所述Cache中,通过所述控制模块将此地址对应的vld置0,不发出更新所述片外存储器表项的指令;第二个burst以表项低于第一阈值Mbit左移2^S bit的得到值+1作为地址,将表项高于第二阈值Nbit地址/表项数据置写入所述Cache中,通过所述控制模块将此地址对应vld置0,不发出更新所述片外存储器的指令,同时将首burst的vld置1,发出更新vld表项的指令;依此类推,当所述片外存储器返回的倒数第二个burst地址低于第一阈值Mbit左移2^S bit得到的地址+S-2地址匹配时,将最后一个burst对应的vld置1,发出更新所述片外存储器表项的指令,完成表项的更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
在本发明实施例一实施方式中,所述装置还包括:
查找信息存储模块,配置为存储查表请求和多burst标志信息;所述查表请求包括查表的表项地址;
所述比较模块,还配置为对于单burst表项的情况,判断所述查找请求低于第一阈值Mbit对应的vld标识是否有效,有效,则用查找请求低于第一阈值Mbit发起查询Cache,获得查询结果,解析所查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽;
第二仲裁模块,配置为业务侧读取Cache与片外存储器返回读取Cache间的仲裁;
所述片外存储器,配置为存储查找表项;
在本发明实施例一实施方式中,所述装置还包括:
查找信息存储模块,配置为存储查表请求和多burst标志信息;所述查表请求包括查表的表项地址;
所述比较模块,还配置为对于多burst表项的情况,且多burst表项为2^S个时,通过所述控制模块判断所述查找请求低于第一阈值Mbit左移Sbit后,连续2^S个地址对应的vld标识是否有效,若全有效,用查找请求低于第一阈值Mbit左移sbit后连续发出2^S个查询Cache的请求,获得查询结果,解析所述查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果拼接后通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数;所述S为自然数;
第二仲裁模块,配置为业务侧读取Cache与片外存储器返回读取Cache间的仲裁;
所述片外存储器,配置为存储查找表项。
在本发明实施例一实施方式中,所述比较模块,还配置为在所述查询 请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息,对于单burst表项的情况,通过所述控制模块判断地址低于第一阈值Mbit对应的vld是否有效,有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
在本发明实施例一实施方式中,所述比较模块,还配置为在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息,对于多burst表项的情况,先通过所述控制模块判断地址低于第一阈值Mbit左边移Sbit后对应连续2^S个地址的对应的vld是否有效,全有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
在本发明实施例一实施方式中,所述比较模块,还配置为在接收到所述查询请求,根据查询请求携带的多burst标识,通过所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若有效,则读取对应Cache的数据,判断业务请求高于第二阈值Nbit与Cache中的地址是否匹配,匹配,则直接将数据返回给业务侧;不匹配,则发出查询片外存储器的请求。
在本发明实施例一实施方式中,所述比较模块,还配置为待表项数据返回后,读取查找信息存储模块,以获取查找请求地址和多burst标识,在所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S 个请求的vld标识是否有效,若全有效,则读取对应Cache的数据,判断业务请求的高于第二阈值Nbit地址与返回Cache的业务地址是否匹配,若匹配,将Cache中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据,否则,直接将片外存储器中的表项数据通过所述分发模块返回给业务侧,同时更新Cache中的表项数据;若多burst对应的vld为部分有效,则表明表项更新未完成,将片外存储器中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据。
一种用于提高表项访问带宽和原子性操作的方法,包括,
步骤S11、收到业务侧的查询请求,判断所述查询请求指向的地址与缓存Cache中的存储的表项地址是否相等;
步骤S12、如果相等,且有效标识vld当前为有效,则无需发起查找片外存储器的请求,以减少对所述片外存储器的访问,直接将所述Cache中存储的表项数据返回至业务侧;
步骤S13、如果不相等,发起查找片外存储器的请求,以将请求片外存储器返回的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错。
在本发明实施例一实施方式中,所述第一预设规则用于判断查找请求的地址与Cache中的存储的地址是否相等,包括以下任意一种方式:
方式一:若低于第一阈值Mbit地址对应的所述vld为全有效,高于第二阈值Nbit地址和Cache中存储的地址相等,将Cache中的数据返回至业务侧,不更新Cache中的数据;若地址不相等,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
方式二:若低于第一阈值Mbit地址对应的所述vld为部分有效,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
方式三:若低于第一阈值Mbit地址对应的所述vld为无效,更新Cache 中的数据,将片外存储器返回的数据发送至业务侧;
其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
在本发明实施例一实施方式中,所述方法还包括:
中央处理单元配置业务表项,对于单burst表项更新的情况,发出写单burst表项的指令;
经所述第一仲裁模块仲裁后以表项低于第一阈值Mbit为地址,将表项高于第二阈值Nbit地址/表项数据写入Cache中,通过所述控制模块将此地址对应的vld寄存器置位,发出更新片外存储器的指令,完成表项更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
在本发明实施例一实施方式中,所述方法还包括:
中央处理单元配置业务表项,对于多burst表项更新的情况,发出写多burst表项的指令;
经所述第一仲裁模块仲裁后首burst以表项低于第一阈值Mbit左移2^S bit得到的值作为地址,将表项高于第二阈值Nbit地址/表项数据写入所述Cache中,通过所述控制模块将此地址对应的vld置0,不发出更新所述片外存储器表项的指令;
第二个burst以表项低于第一阈值Mbit左移2^S bit的得到值+1作为地址,将表项高于第二阈值Nbit地址/表项数据置写入所述Cache中,通过所述控制模块将此地址对应vld置0,不发出更新所述片外存储器的指令,同时将首burst的vld置1,发出更新vld表项的指令;
依此类推,当所述片外存储器返回的倒数第二个burst地址低于第一阈值Mbit左移2^S bit得到的地址+S-2地址匹配时,将最后一个burst对应的vld置1,发出更新所述片外存储器表项的指令,完成表项的更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
在本发明实施例一实施方式中,所述方法还包括:
比较模块对于单burst表项的情况,判断所述查找请求低于第一阈值Mbit对应的vld标识是否有效,有效,则用查找请求低于第一阈值Mbit发起查询Cache,获得查询结果;
解析所查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
在本发明实施例一实施方式中,所述方法还包括:
比较模块对于多burst表项的情况,且多burst表项为2^S个时,通过所述控制模块判断所述查找请求低于第一阈值Mbit左移sbit后,连续2^S个地址对应的vld标识是否有效,若全有效,用查找请求低于第一阈值Mbit左移sbit后连续发出2^S个查询Cache的请求,获得查询结果;
解析所述查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果拼接后通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽;所述S为自然数。
在本发明实施例一实施方式中,所述方法还包括:
比较模块在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息;
对于单burst表项的情况,通过所述控制模块判断地址低于第一阈值Mbit对应的vld是否有效,有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,若匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进 行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
在本发明实施例一实施方式中,所述方法还包括:
比较模块在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息;
对于多burst表项的情况,先通过所述控制模块判断地址低于第一阈值Mbit左边移Sbit后对应连续2^S个地址的对应的vld是否有效,全有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,若匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
在本发明实施例一实施方式中,所述方法还包括:
比较模块在接收到所述查询请求,根据查询请求携带的多burst标识,通过所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若有效,则读取对应Cache的数据,判断业务请求高于第二阈值Nbit与Cache中的地址是否匹配,匹配,则直接将数据返回给业务侧;不匹配,则发出查询片外存储器的请求。
在本发明实施例一实施方式中,所述方法还包括:
比较模块在待表项数据返回后,读取查找信息存储模块,以获取查找请求地址和多burst标识;
在所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若全有效,则读取对应Cache的数据,判断业务请求的高于第二阈值Nbit地址与返回Cache的业务地址是否匹配,若匹配,将Cache中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据,否则,直接将片外存储器中的表项数据通过所述分发 模块返回给业务侧,同时更新Cache中的表项数据;若多burst对应的vld为部分有效,则表明表项更新未完成,将片外存储器中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据。
以一个现实应用场景为例对本发明实施例阐述如下:
一个表项管理的应用场景为:基于同步动态随机存取内存(SDRAM,synchronous dynamic random-access memory)的表项管理进行的,由于网络处理器应用于不同的场合,查表的类型、表项容量、表项条目大小及查表性能需求差异较大,一方面需要实现网络处理器表项原子性操作时确保数据为全新或全旧,以达到不断流和不出现错误的目的,另一方面需要实现提高查找时的访问带宽,以达到高效的表项管理目的。
而对于本应用场景,采用现有技术,均是从存储器自身的结构特点作为出发点,或是多bank复制,或是减少换行操作等,也就是说,都需要改变片外存储器的自身结构才可行,而本应用场景采用本发明实施例,无需改变片外存储器的自身结构,而是从减少对外部存储器SDRAM的访问来提高查找带宽来入手,是一种高效的表项管理方案,不仅提高了外部存储器SDRAM的查找性能,而且克服多burst表项在查找过程中更新表项存在的原子性操作的问题,保证在表项更新过程中,查找不断流不出错误的目的。另外,由于本应用场景采用本发明实施例,无需改变片外存储器的自身结构,在制造成本上,设计使用上都更方便和灵活。
需要指出的是,本应用场景包括对单burst表项和对多burst表项更新处理的不同情况,后续具体阐述,这里只对所涉及的技术术语描述如下:
1)所谓单burst表项,指表项条目需要存放在存储器的单个地址中,查表时直接根据查找请求获取表项结果。
2)所谓多burst表项,指表项条目需要存放在存储器的多个地址中,查表时表项管理模块需要将单个查找请求转换为多个查找请求获取表项结 果。
3)表项原子性操作:如图2所示,多burst表项更新的过程中,可能会出现写多burst表项操作过程中夹杂读操作,使得表项返回结果出现新老交替情况,如对地址A、B、C、D操作时,得到返回结果为:A’、B、C、D,这就是一种新老交替的情况,其中,A’为新值,B、C、D为旧值;原子性操作就是查找获取表项结果或全是老值或全是新值,保证查找不出错或不断流,如对地址A、B、C、D操作时,得到返回结果为:A’、B’、C’、D’,这就是一种全是新值的情况。
本应用场景采用本发明实施例的装置,具体为一种高效的提高片外存储器表项访问带宽和原子性操作的装置,如图1所示,主要包括:查找信息存储模块101、比较模块102、控制模块104、第二仲裁模块105、第一仲裁模块108、分发模块103、Cache106和SDRAM109。
查找信息存储模块101,用于存储查表地址和多burst标志信息。
比较模块102,一是业务侧发起查找请求,当查找请求过来时,用于判断查找请求的地址与Cache中的存储的地址是否相等,若相等且有效标识(vld,valid)标识有效,则直接将Cache中存储的表项通过分发模块103返回至业务侧,而无需发起查找SDRAM109的请求;否则,说明查找请求的地址与Cache中的存储的地址不相等,则需等待,发起查找SDRAM109的请求,以便根据返回数据来继续处理;二是当SDRAM109返回数据时,用于判断查找请求的地址与Cache中的存储的地址是否相等,存在如下几种情况:
1)若低Mbit地址对应vld标识全有效,高Nbit地址和Cache中存储的地址相等,将Cache106中的数据返回至业务侧,不更新Cache106中的数据;若地址不相等,不更新Cache106中的数据,将SDRAM109返回数据送至业务侧;其中,所述M和N都为自然数,为不同的取值,作用都是表示一个 阈值,需按照实际应用的需求进行设置,即所述低Mbit指低于一个第一阈值,如M比特时如何处理的情况;而所述高Nbit指高于一个第二阈值,如N比特时如何处理的情况。
2)若低Mbit地址对应vld标识部分有效(针对多burst表项),不更新Cache106中的数据,将SDRAM109返回数据送至业务侧。
3)若低Mbit地址对应vld标识无效,更新Cache106中的数据,将SDRAM109返回数据送至业务侧。
分发模块103,用于识别返回给业务侧的数据是Cache106中表项数据或是SDRAM109中表项数据。
控制模块104,用于管理vld标识位及判断何时发起对SDRAM109的更新操作。表项更新标识vld操作规则如下所示:
1)中央处理单元107,用于写单burst表项时更新为1,发出更新SDRAM109表项更新操作。
2)多burst表项写入第一个burst时对应地址的vld置0,写入第二个burst对应vld置0,同时将第一个burst时对应的vld置1,发出更新表项第一个burst操作,以此类推。当外部存储器SDRAM109返回的读数据对应的地址与cache中倒数第二个burst匹配时,最后一个burst对应的vld置1完成表项的更新。
3)其他情况vld保持不变。
第二仲裁模块105,用于业务侧读Cache106和SDRAM109返回读Cache106的仲裁。
Cache106,用于存储表项数据、表项地址,如图3所示,图3为应用本发明实施例的缓冲(Cache)内部数据结构。
中央处理单元107,用于配置业务表项。
第一仲裁模块108,用于完成中央处理单元107写Cache106和 SDRAM109返回数据写Cache106的仲裁。
SDRAM109,用于存储查找表项。
本应用场景采用本发明实施例的装置,具体为一种高效的提高片外存储器表项访问带宽和原子性操作的方法,主要包括如下几个方面,均包括单burst和多burst处理:
1)中央处理单元更新表项过程;
2)提高SDRAM访问带宽处理过程;
3)多burst表项原子性操作处理过程。
一、就所述中央处理单元更新表项过程而言,如发送是单burst表项更新,如图5的处理流程,首先通过第一仲裁模块108仲裁,将要更新的表项和对应地址的高Nbit拼接后写入Cache106低Mbit地址中,同时通过控制模块104置低Mbit对应地址的vld标识位为1,发出更新SDRAM109表项操作;如中央处理单元发送的是多burst表项,如图6的处理流程,发出首burst表项更新时与单burst操作一致,区别是此时通过控制模块104置低Mbit对应地址的vld标识位为0,且不发出表项更新操作,中央处理单元发出第二个burst表项更新时,实现过程与首个burst处理过程一致,此时将上一个burst对应的vld标识置1,发出上一个表项更新SDRAM109表项的操作。以此类推,当SDRAM109返回的表项对应的地址与多burst表项的倒数第二个burst表项地址一致时,将最后一个burst表项更新到SDRAM109中,完成多burst表项的更新。
二、就提高SDRAM访问带宽处理过程而言,如图1所示,接收到业务侧发送请求,将查表请求和多burst标识存储在查找信息存储模块101,若是单burst,首先判断该查找请求低Mbit对应的VLD标识是否有效,有效,则用查找请求低Mbit发起查询cache106,获得查询结果,通过比较模块102将查到的地址与业务侧查找请求的高Nbit进行比较,相等直接将 Cache106的查表结果通过分发模块103返回给业务侧,不发出查询外部存储器SDRAM109的请求,同时将查找信息存储模块101的数据读取丢弃;若是多burst表项(假设2^S个),首先通过控制模块104判断该查找请求低Mbit左移sbit后,连续2^S个地址对应的VLD标识是否有效,若全有效,用查找请求低Mbit左移sbit后连续发出2^S个查询Cache106请求,获得查询结果,通过比较模块102将查到的地址与业务侧查找请求的高Nbit进行比较,相等直接将Cache106的查表结果拼接后通过分发模块103返回给业务侧,不发出查询SDRAM109的请求,同时将查找信息存储模块101的数据读取丢弃。
所述业务侧查询请求和Cache106中的所有地址均不匹配,则发出查询SDRAM109的请求,待表项返回后,从查找信息存储模块101取出地址和多burst信息,若为单burst表项首先通过控制模块104判断地址低Mbit对应的vld是否有效,有效,则通过第二仲裁模块105仲裁后读取Cache106,获取地址的高Nbit,通过比较模块102和查找信息存储模块101取出地址高Nbit进行比较,匹配,则将对应地址的数据用SDRAM109返回的数据进行替换回写至Cache106,同时将此数据通过分发模块103返回至业务侧;若为多burst表项,首先通过控制模块104判断地址低Mbit左边移Sbit后对应连续2^S个地址的对应的vld是否有效,全有效,则通过第二仲裁模块105仲裁后读取Cache106,后续操作与单burst一致。
所述接收到业务侧查表请求对应的表项结果已存储在Cache106中,直接将cache106中的查询结果通过分发模块103返回给业务侧,同时不会发出外部存储器SDRAM109的访问请求,进而提高了表项的查询带宽。
三、就所述多burst表项原子性操作处理过程而言,如图1所示,接收到业务侧查询请求,根据查询请求携带的多burst标识,通过控制模块104判断业务请求的低Mbit左移Sbit,对应的连续2^S个请求的vld标识是否 有效,若有效,则读取对应Cache106的数据,在比较模块102判断业务请求高Nbit与Cache106中的地址是否匹配,匹配则直接将数据返回给业务侧,不匹配发出查询外部存储器SDRAM109的请求。待表项返回后,读取查找信息存储模块101,获取查找请求地址和多burst标识,在控制模块104判断业务请求的低Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若全有效则读取对应Cache106的数据,通过比较模块102判断业务请求的高Nbit地址与返回Cache106的业务地址是否匹配,若匹配将cache106数据通过分发模块103返回给业务侧,不更新Cache106数据,否则直接将SDRAM109数据通过分发模块103返回给业务侧,同时更新Cache106数据;若多burst对应的vld有部分有效则表明表项更新未完成,此时将SDRAM109数据通过分发模块103返回给业务侧,不更新Cache106数据。
对应上述描述,如图4所示为应用本发明实施例的提高查表带宽性能及表项原子性操作处理流程示意图,是一个完整的原理流程,包括:
步骤S21、依据表项多burst标志,将业务查找请求低Mbit左移2^S bit后的连续2^S个地址对应的vld均有效时,则用业务请求这个连续2^S个地址读取Cache中的数据;
步骤S22、业务查找请求高Nbit与Cache中的addrl进行比较,将Cache中的数据返回给业务查询模块;
步骤S23、否则将业务查找请求和多burst标志存储,同时发出查询片外存储器请求,待数据返回后读取业务查找请求和多burst标志,当业务查找请求低Mbit左移2^S bit后的连续2^S个地址对应的vld均有效时,则用业务请求这连续2^S个地址读取Cache中的数据;
步骤S24、查找请求高Nbit与Cache中的addrl进行比较若相等,将Cache中的数据返回给业务查询模块,否则将查询到的数据写入Cache的同时返回给业务模块;
步骤S25、若vld部分有效不发出查cache请求,直接将SDRAM得到的数据返回给业务模块。
对应上述描述,图5为应用本发明实施例的单burst表项更新处理流程示意图,包括:
步骤S31、中央处理单元发出的写单burst表项;
步骤S32、以表项低Mbit为地址将表项高Nbit地址/表项数据写入cache中,同时将此地址对应的VLD寄存器置位,发出更新外部存储器的指令,完成表项更新。
对应上述描述,图6为应用本发明实施例的多burst表项更新处理流程示意图,包括:
步骤S41、中央处理单元发出的写多burst表项;
步骤S42、首burst以低Mbit左移2^S bit得到的地址作为Cache的地址,将数据写入此地址对应的Cache中,同时将此地址对应的vld置0,不发出更新SDRAM表项的指令;
步骤S43、第二个burst以低Mbit左移2^S bit的得到的地址+1作为Cache的地址,将数据写入此地址对应的Cache中,把此地址对应的vld置0,不发出更新SDRAM表项的指令,同时将首burst的vld置1,发出更新SDRAM表项的指令;
步骤S44、依此类推,当SDRAM返回的倒数第二个burst地址低Mbit左移2^S bit得到的地址+s-2匹配时,将最后一个burst对应的vld置1,发出更新SDRAM表项的指令,完成表项的更新。
总之,本应用场景采用本发明实施例这种高效的提高SDRAM表项访问带宽和原子性操作的方案,克服了多burst表项在查找过程中更新表项存在的原子性操作的问题,保证在表项更新过程中,查找不断流和不出错,同时对已经存储在Cache中的表项数据,查询时直接由Cache返回,不需 要发出SDRAM的读访问请求,提高了SDRAM的查找性能,且该方案具有使用简单、灵活等特点。从而可以广泛的应用于其他单burst/多burst表项管理中。
本发明实施例所述集成的模块如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本发明各个实施例所述方法的全部或部分。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。这样,本发明实施例不限制于任何特定的硬件和软件结合。
相应的,本发明实施例还提供一种计算机存储介质,其中存储有计算机程序,该计算机程序用于执行本发明实施例的用于提高表项访问带宽和原子性操作的方法。
以上所述,仅为本发明的较佳实施例而已,并非用于限定本发明的保护范围。
工业实用性
采用本发明实施例,由于并不是总是需发起查找片外存储器的请求,从而减少了对所述片外存储器的访问次数,从而减少了查询带宽,而且,在将请求片外存储器返回的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错。

Claims (21)

  1. 一种用于提高表项访问带宽和原子性操作的装置,所述装置包括:比较模块、缓存Cache、分发模块;
    比较模块,配置为收到业务侧的查询请求,判断所述查询请求指向的地址与缓存Cache中的存储的表项地址是否相等,如果相等,且有效标识vld当前为有效,则无需发起查找片外存储器的请求,以减少对所述片外存储器的访问,直接将所述Cache中存储的表项数据返回至业务侧;如果不相等,发起查找片外存储器的请求,以将请求片外存储器返回的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错;
    所述Cache,配置为存储表项数据和表项地址;
    分发模块,配置为识别返回给业务侧的数据是Cache中的表项数据还是所述片外存储器中的表项数据后返回给业务侧。
  2. 根据权利要求1所述的装置,其中,所述比较模块,还配置为按照第一预设规则判断所述查找请求指向的地址与Cache中的存储的地址是否相等,包括以下任意一种方式:
    方式一:若低于第一阈值Mbit地址对应的所述vld为全有效,高于第二阈值Nbit地址和Cache中存储的地址相等,将Cache中的数据返回至业务侧,不更新Cache中的数据;若地址不相等,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
    方式二:若低于第一阈值Mbit地址对应的所述vld为部分有效,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
    方式三:若低于第一阈值Mbit地址对应的所述vld为无效,更新Cache中的数据,将片外存储器返回的数据发送至业务侧;
    其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
  3. 根据权利要求2所述的装置,其中,所述装置还包括:
    第一仲裁模块,配置为完成中央处理单元写入Cache与片外存储器返回表项数据写入Cache间的仲裁;
    控制模块,配置为管理vld标识位及判断何时发起对片外存储器的更新操作;
    中央处理单元,配置为配置业务表项,对于单burst表项更新的情况,发出写单burst表项的指令;经所述第一仲裁模块仲裁后以表项低于第一阈值Mbit为地址,将表项高于第二阈值Nbit地址/表项数据写入Cache中,通过所述控制模块将此地址对应的vld寄存器置位,发出更新片外存储器的指令,完成表项更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
  4. 根据权利要求2所述的装置,其中,所述装置还包括:
    第一仲裁模块,配置为完成中央处理单元写入Cache与片外存储器返回表项数据写入Cache间的仲裁;
    控制模块,配置为管理vld标识位及判断何时发起对片外存储器的更新操作;
    中央处理单元,配置为配置业务表项,对于多burst表项更新的情况,发出写多burst表项的指令;经所述第一仲裁模块仲裁后首burst以表项低于第一阈值Mbit左移2^S bit得到的值作为地址,将表项高于第二阈值Nbit地址/表项数据写入所述Cache中,通过所述控制模块将此地址对应的vld置0,不发出更新所述片外存储器表项的指令;第二个burst以表项低于第一阈值Mbit左移2^S bit的得到值+1作为地址,将表项高于第二阈值Nbit地址/表项数据置写入所述Cache中,通过所述控制模块将此地址对应vld置0,不发出更新所述片外存储器的指令,同时将首burst的vld置1,发出更新vld表项的指令;依此类推,当所述片外存储器返回的倒数第二个burst 地址低于第一阈值Mbit左移2^S bit得到的地址+S-2地址匹配时,将最后一个burst对应的vld置1,发出更新所述片外存储器表项的指令,完成表项的更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
  5. 根据权利要求3所述的装置,其中,所述装置还包括:
    查找信息存储模块,配置为存储查表请求和多burst标志信息;
    所述比较模块,还配置为对于单burst表项的情况,判断所述查找请求低于第一阈值Mbit对应的vld标识是否有效,有效,则用查找请求低于第一阈值Mbit发起查询Cache,获得查询结果,解析所查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽;
    第二仲裁模块,配置为业务侧读取Cache与片外存储器返回读取Cache间的仲裁;
    所述片外存储器,配置为存储查找表项;
  6. 根据权利要求4所述的装置,其中,所述装置还包括:
    查找信息存储模块,配置为存储查表请求和多burst标志信息;
    所述比较模块,还配置为对于多burst表项的情况,且多burst表项为2^S个时,通过所述控制模块判断所述查找请求低于第一阈值Mbit左移Sbit后,连续2^S个地址对应的vld标识是否有效,若全有效,用查找请求低于第一阈值Mbit左移sbit后连续发出2^S个查询Cache的请求,获得查询结果,解析所述查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果拼接后通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读 取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽;所述S为自然数;
    第二仲裁模块,配置为业务侧读取Cache与片外存储器返回读取Cache间的仲裁;
    所述片外存储器,配置为存储查找表项。
  7. 根据权利要求5所述的装置,其中,所述比较模块,还配置为在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息,对于单burst表项的情况,通过所述控制模块判断地址低于第一阈值Mbit对应的vld是否有效,有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
  8. 根据权利要求6所述的装置,其中,所述比较模块,还配置为在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息,对于多burst表项的情况,先通过所述控制模块判断地址低于第一阈值Mbit左边移Sbit后对应连续2^S个地址的对应的vld是否有效,全有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
  9. 根据权利要求8所述的装置,其中,所述比较模块,还配置为在接收到所述查询请求,根据查询请求携带的多burst标识,通过所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld 标识是否有效,若有效,则读取对应Cache的数据,判断业务请求高于第二阈值Nbit与Cache中的地址是否匹配,匹配,则直接将数据返回给业务侧;不匹配,则发出查询片外存储器的请求。
  10. 根据权利要求9所述的装置,其中,所述比较模块,还配置为待表项数据返回后,读取查找信息存储模块,以获取查找请求地址和多burst标识,在所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若全有效,则读取对应Cache的数据,判断业务请求的高于第二阈值Nbit地址与返回Cache的业务地址是否匹配,若匹配,将Cache中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据,否则,直接将片外存储器中的表项数据通过所述分发模块返回给业务侧,同时更新Cache中的表项数据;若多burst对应的vld为部分有效,则表明表项更新未完成,将片外存储器中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据。
  11. 一种用于提高表项访问带宽和原子性操作的方法,所述方法应用于权利要求1至10任一项所述的装置,所述方法包括:
    收到业务侧的查询请求,判断所述查询请求指向的地址与缓存Cache中的存储的表项地址是否相等;
    如果相等,且有效标识vld当前为有效,则无需发起查找片外存储器的请求,以减少对所述片外存储器的访问,直接将所述Cache中存储的表项数据返回至业务侧;
    如果不相等,发起查找片外存储器的请求,以将请求片外存储器返回的表项数据按照第一预设规则进行处理,使在表项查找过程中更新表项存在的原子性操作能实现查找不断流和不出错。
  12. 根据权利要求11所述的方法,其中,所述第一预设规则用于判断查找请求的地址与Cache中的存储的地址是否相等,包括以下任意一种方 式:
    方式一:若低于第一阈值Mbit地址对应的所述vld为全有效,高于第二阈值Nbit地址和Cache中存储的地址相等,将Cache中的数据返回至业务侧,不更新Cache中的数据;若地址不相等,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
    方式二:若低于第一阈值Mbit地址对应的所述vld为部分有效,不更新Cache中的数据,将请求片外存储器返回的数据发送至业务侧;
    方式三:若低于第一阈值Mbit地址对应的所述vld为无效,更新Cache中的数据,将片外存储器返回的数据发送至业务侧;
    其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
  13. 根据权利要求12所述的方法,其中,所述方法还包括:
    中央处理单元配置业务表项,对于单burst表项更新的情况,发出写单burst表项的指令;
    经所述第一仲裁模块仲裁后以表项低于第一阈值Mbit为地址,将表项高于第二阈值Nbit地址/表项数据写入Cache中,通过所述控制模块将此地址对应的vld寄存器置位,发出更新片外存储器的指令,完成表项更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
  14. 根据权利要求12所述的方法,其中,所述方法还包括:
    中央处理单元配置业务表项,对于多burst表项更新的情况,发出写多burst表项的指令;
    经所述第一仲裁模块仲裁后首burst以表项低于第一阈值Mbit左移2^Sbit得到的值作为地址,将表项高于第二阈值Nbit地址/表项数据写入所述Cache中,通过所述控制模块将此地址对应的vld置0,不发出更新所述片外存储器表项的指令;
    第二个burst以表项低于第一阈值Mbit左移2^S bit的得到值+1作为地 址,将表项高于第二阈值Nbit地址/表项数据置写入所述Cache中,通过所述控制模块将此地址对应vld置0,不发出更新所述片外存储器的指令,同时将首burst的vld置1,发出更新vld表项的指令;
    依此类推,当所述片外存储器返回的倒数第二个burst地址低于第一阈值Mbit左移2^S bit得到的地址+S-2地址匹配时,将最后一个burst对应的vld置1,发出更新所述片外存储器表项的指令,完成表项的更新;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
  15. 根据权利要求13所述的方法,其中,所述方法还包括:
    比较模块对于单burst表项的情况,判断所述查找请求低于第一阈值Mbit对应的vld标识是否有效,有效,则用查找请求低于第一阈值Mbit发起查询Cache,获得查询结果;
    解析所查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽。
  16. 根据权利要求14所述的方法,其中,所述方法还包括:
    比较模块对于多burst表项的情况,且多burst表项为2^S个时,通过所述控制模块判断所述查找请求低于第一阈值Mbit左移sbit后,连续2^S个地址对应的vld标识是否有效,若全有效,用查找请求低于第一阈值Mbit左移sbit后连续发出2^S个查询Cache的请求,获得查询结果;
    解析所述查询结果,将查到的地址与所述查找请求的高于第二阈值Nbit进行比较,相等,直接将Cache的查表结果拼接后通过所述分发模块返回给业务侧,不发出查询片外存储器的请求,将查找信息存储模块的数据读取丢弃;其中,所述M和N都为自然数,M和N之和为业务侧发送请求位宽;所述S为自然数。
  17. 根据权利要求15所述的方法,其中,所述方法还包括:
    比较模块在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息;
    对于单burst表项的情况,通过所述控制模块判断地址低于第一阈值Mbit对应的vld是否有效,有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,若匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
  18. 根据权利要求16所述的方法,其中,所述方法还包括:
    比较模块在所述查询请求和Cache中的所有地址均不匹配,发起查找片外存储器的请求,待表项数据返回后,从所述查找信息存储模块取出表项地址和多burst信息;
    对于多burst表项的情况,先通过所述控制模块判断地址低于第一阈值Mbit左边移Sbit后对应连续2^S个地址的对应的vld是否有效,全有效,则通过所述第二仲裁模块仲裁后读取Cache,获取地址的高于第二阈值Nbit,取出地址高于第二阈值Nbit进行比较,若匹配,则将对应地址的数据用从所述片外存储器返回的表项数据进行替换回写至Cache,将此数据通过所述分发模块返回至业务侧。
  19. 根据权利要求18所述的方法,其中,所述方法还包括:
    比较模块在接收到所述查询请求,根据查询请求携带的多burst标识,通过所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若有效,则读取对应Cache的数据,判断业务请求高于第二阈值Nbit与Cache中的地址是否匹配,匹配,则直接将数据返回给业务侧;不匹配,则发出查询片外存储器的请求。
  20. 根据权利要求19所述的方法,其中,所述方法还包括:
    比较模块在待表项数据返回后,读取查找信息存储模块,以获取查找请求地址和多burst标识;
    在所述控制模块判断业务请求的低于第一阈值Mbit左移Sbit,对应的连续2^S个请求的vld标识是否有效,若全有效,则读取对应Cache的数据,判断业务请求的高于第二阈值Nbit地址与返回Cache的业务地址是否匹配,若匹配,将Cache中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据,否则,直接将片外存储器中的表项数据通过所述分发模块返回给业务侧,同时更新Cache中的表项数据;若多burst对应的vld为部分有效,则表明表项更新未完成,将片外存储器中的表项数据通过所述分发模块返回给业务侧,不更新Cache中的表项数据。
  21. 一种计算机存储介质,其中存储有计算机可执行指令,该计算机可执行指令配置执行上述权利要求11-20任一项用于提高表项访问带宽和原子性操作的方法。
PCT/CN2016/081618 2015-06-26 2016-05-10 一种用于提高表项访问带宽和原子性操作的装置及方法 WO2016206490A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US15/739,243 US10545867B2 (en) 2015-06-26 2016-05-10 Device and method for enhancing item access bandwidth and atomic operation
EP16813606.7A EP3316543B1 (en) 2015-06-26 2016-05-10 Device and method of enhancing item access bandwidth and atomic operation
ES16813606T ES2813944T3 (es) 2015-06-26 2016-05-10 Dispositivo y procedimiento para mejorar un ancho de banda de acceso a elemento y una operación atómica
SG11201710789YA SG11201710789YA (en) 2015-06-26 2016-05-10 Device and method of enhancing item access bandwidth and atomic operation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510364814.8 2015-06-26
CN201510364814.8A CN106302374B (zh) 2015-06-26 2015-06-26 一种用于提高表项访问带宽和原子性操作的装置及方法

Publications (1)

Publication Number Publication Date
WO2016206490A1 true WO2016206490A1 (zh) 2016-12-29

Family

ID=57584540

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/081618 WO2016206490A1 (zh) 2015-06-26 2016-05-10 一种用于提高表项访问带宽和原子性操作的装置及方法

Country Status (7)

Country Link
US (1) US10545867B2 (zh)
EP (1) EP3316543B1 (zh)
CN (1) CN106302374B (zh)
ES (1) ES2813944T3 (zh)
PT (1) PT3316543T (zh)
SG (1) SG11201710789YA (zh)
WO (1) WO2016206490A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109656832A (zh) * 2017-10-11 2019-04-19 深圳市中兴微电子技术有限公司 一种查表方法、计算机可读存储介质
CN107888513A (zh) * 2017-10-23 2018-04-06 深圳市楠菲微电子有限公司 用于交换芯片的缓存方法及装置
US10776281B2 (en) * 2018-10-04 2020-09-15 International Business Machines Corporation Snoop invalidate filter for distributed memory management unit to reduce snoop invalidate latency
CN113630462B (zh) * 2021-08-09 2022-06-03 北京城建设计发展集团股份有限公司 一种数据中台实现设备下控的方法与系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1863169A (zh) * 2006-03-03 2006-11-15 清华大学 基于网络处理器的路由查找结果缓存方法
CN101526896A (zh) * 2009-01-22 2009-09-09 杭州中天微系统有限公司 嵌入式处理器的加载/存储单元
CN101719055A (zh) * 2009-12-03 2010-06-02 杭州中天微系统有限公司 快速执行加载存储指令模块
CN104378295A (zh) * 2013-08-12 2015-02-25 中兴通讯股份有限公司 表项管理装置及表项管理方法

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1300707C (zh) 2002-07-23 2007-02-14 华为技术有限公司 外部sdram读写处理方法
US7107425B2 (en) 2003-09-06 2006-09-12 Match Lab, Inc. SDRAM controller that improves performance for imaging applications
US7200713B2 (en) * 2004-03-29 2007-04-03 Intel Corporation Method of implementing off-chip cache memory in dual-use SRAM memory for network processors
EP1605360B1 (en) * 2004-06-08 2010-02-17 Freescale Semiconductors, Inc. Cache coherency maintenance for DMA, task termination and synchronisation operations
US7921243B1 (en) 2007-01-05 2011-04-05 Marvell International Ltd. System and method for a DDR SDRAM controller
US20100146415A1 (en) * 2007-07-12 2010-06-10 Viasat, Inc. Dns prefetch
US8335122B2 (en) * 2007-11-21 2012-12-18 The Regents Of The University Of Michigan Cache memory system for a data processing apparatus
US20090138680A1 (en) * 2007-11-28 2009-05-28 Johnson Timothy J Vector atomic memory operations
CN101340365A (zh) 2008-08-11 2009-01-07 杭州瑞纳科技有限公司 一种高带宽利用率的ddr2 sdram控制器设计方法
CN101534477A (zh) 2009-04-23 2009-09-16 杭州华三通信技术有限公司 一种表项管理方法和装置
CN101620623A (zh) 2009-08-12 2010-01-06 杭州华三通信技术有限公司 内容可寻址存储器表项管理方法和装置
CN101651628A (zh) 2009-09-17 2010-02-17 杭州华三通信技术有限公司 一种三状态内容可寻址存储器实现方法及装置
JP5006472B2 (ja) 2009-12-04 2012-08-22 隆敏 柳瀬 表検索装置、表検索方法、及び、表検索システム
US9507735B2 (en) * 2009-12-29 2016-11-29 International Business Machines Corporation Digital content retrieval utilizing dispersed storage
US8447798B2 (en) 2010-03-25 2013-05-21 Altera Corporation Look up table (LUT) structure supporting exclusive or (XOR) circuitry configured to allow for generation of a result using quaternary adders
CN102073539B (zh) * 2010-12-02 2013-10-09 华为技术有限公司 队列请求处理方法和装置
CN102831078B (zh) 2012-08-03 2015-08-26 中国人民解放军国防科学技术大学 一种cache中提前返回访存数据的方法
CN104601471B (zh) * 2015-02-02 2017-12-01 华为技术有限公司 一种转发信息表的读写方法及网络处理器

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1863169A (zh) * 2006-03-03 2006-11-15 清华大学 基于网络处理器的路由查找结果缓存方法
CN101526896A (zh) * 2009-01-22 2009-09-09 杭州中天微系统有限公司 嵌入式处理器的加载/存储单元
CN101719055A (zh) * 2009-12-03 2010-06-02 杭州中天微系统有限公司 快速执行加载存储指令模块
CN104378295A (zh) * 2013-08-12 2015-02-25 中兴通讯股份有限公司 表项管理装置及表项管理方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3316543A4 *

Also Published As

Publication number Publication date
CN106302374A (zh) 2017-01-04
EP3316543A4 (en) 2018-08-01
US10545867B2 (en) 2020-01-28
EP3316543A1 (en) 2018-05-02
ES2813944T3 (es) 2021-03-25
US20180314634A1 (en) 2018-11-01
CN106302374B (zh) 2019-08-16
SG11201710789YA (en) 2018-01-30
PT3316543T (pt) 2020-07-16
EP3316543B1 (en) 2020-07-01

Similar Documents

Publication Publication Date Title
US10198363B2 (en) Reducing data I/O using in-memory data structures
US10097466B2 (en) Data distribution method and splitter
WO2016206490A1 (zh) 一种用于提高表项访问带宽和原子性操作的装置及方法
US6594722B1 (en) Mechanism for managing multiple out-of-order packet streams in a PCI host bridge
US11099746B2 (en) Multi-bank memory with one read port and one or more write ports per cycle
WO2015172533A1 (zh) 数据库查询方法和服务器
US11269956B2 (en) Systems and methods of managing an index
US20150143065A1 (en) Data Processing Method and Apparatus, and Shared Storage Device
WO2008119269A1 (fr) Procédé et dispositif de moteur de stockage et de consultation d'informations
KR102126592B1 (ko) 멀티코어 프로세서들에 대한 내부 및 외부 액세스를 갖는 룩-어사이드 프로세서 유닛
CN111737564B (zh) 一种信息查询方法、装置、设备及介质
US20200057722A1 (en) Data reading method based on variable cache line
US20170078200A1 (en) Multi-table hash-based lookups for packet processing
WO2022156650A1 (zh) 访问数据的方法及装置
US9697127B2 (en) Semiconductor device for controlling prefetch operation
WO2018177184A1 (zh) 一种实现查表处理的方法及装置、设备、存储介质
CN113377689B (zh) 一种路由表项查找、存储方法及网络芯片
US8539135B2 (en) Route lookup method for reducing overall connection latencies in SAS expanders
US9571541B1 (en) Network device architecture using cache for multicast packets
CN104378295B (zh) 表项管理装置及表项管理方法
US9996468B1 (en) Scalable dynamic memory management in a network device
US8359528B2 (en) Parity look-ahead scheme for tag cache memory
US9514060B2 (en) Device, system and method of accessing data stored in a memory
CN105302745B (zh) 高速缓冲存储器及其应用方法
US12001333B1 (en) Early potential HPA generator

Legal Events

  • 121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 16813606; Country of ref document: EP; Kind code of ref document: A1)
  • WWE Wipo information: entry into national phase (Ref document number: 15739243; Country of ref document: US)
  • NENP Non-entry into the national phase (Ref country code: DE)
  • WWE Wipo information: entry into national phase (Ref document number: 11201710789Y; Country of ref document: SG)
  • WWE Wipo information: entry into national phase (Ref document number: 2016813606; Country of ref document: EP)