CN113966532A - Content addressable storage device, method and related equipment - Google Patents

Content addressable storage device, method and related equipment

Info

Publication number
CN113966532A
CN113966532A (application CN201980096977.9A)
Authority
CN
China
Prior art keywords
data
matched
memory
comparison units
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980096977.9A
Other languages
Chinese (zh)
Inventor
杨兵 (Yang Bing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN113966532A publication Critical patent/CN113966532A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G11 — INFORMATION STORAGE
    • G11C — STATIC STORES
    • G11C15/00 — Digital stores in which information comprising one or more characteristic parts is written into the store and in which information is read-out by searching for one or more of these characteristic parts, i.e. associative or content-addressed stores
    • G — PHYSICS
    • G11 — INFORMATION STORAGE
    • G11C — STATIC STORES
    • G11C7/00 — Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10 — Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present application disclose a content addressable storage device, a method, and related equipment. The content addressable storage device may include: a memory for storing data to be matched; a comparator comprising N comparison units respectively coupled to the memory; a scheduler for acquiring K pieces of search data and scheduling them respectively to K target comparison units among the N comparison units, where a target comparison unit is a comparison unit in an idle state; and a controller for controlling the memory to read out the data to be matched and send it respectively to the K target comparison units. Each of the K target comparison units compares its corresponding search data with the data to be matched and outputs a corresponding matching result according to the comparison result. By adopting the method and the device, low power consumption and flexibility of the content addressable storage device can be ensured.

Description

Content addressable storage device, method and related equipment

Technical Field
The present application relates to the field of memory technologies, and in particular, to a content addressable memory device, a content addressable memory method, and a related apparatus.
Background
A Content Addressable Memory (CAM) is a memory that is addressed by its contents; it is a special kind of Random Access Memory (RAM). A CAM can be read and written like a conventional RAM, but it additionally supports a search operation: an input data item is compared with all data items stored in the CAM, the CAM determines whether the input matches any stored data item, and it outputs the matching information corresponding to that item. Fig. 1A is a schematic diagram illustrating the functional difference between a RAM and a CAM in the prior art: the RAM looks up data by an input address, while the CAM looks up an address by the input data.
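The functional difference between RAM and CAM can be illustrated with a minimal software sketch (purely conceptual; both devices are of course hardware circuits, and all names below are illustrative):

```python
# Conceptual software model of the RAM-vs-CAM difference described above.
# A RAM maps an address to data; a CAM maps data back to its address(es).
# This is an illustrative sketch, not a model of any real device.

def ram_read(ram, address):
    """RAM: look up data by address."""
    return ram[address]

def cam_search(cam, search_key):
    """CAM: return the addresses of all entries matching the search key.
    Hardware compares all entries in parallel; here we simply iterate."""
    return [addr for addr, data in enumerate(cam) if data == search_key]

entries = [0b1010, 0b0110, 0b1010, 0b0001]   # stored data items
assert ram_read(entries, 1) == 0b0110        # address -> data
assert cam_search(entries, 0b1010) == [0, 2] # data -> matching addresses
```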
Fig. 1B is a schematic diagram of a typical prior-art CAM structure. The CAM mainly comprises the following parts: a CAM array, sense amplifiers (SA), search line drivers (SL drivers), and an encoder. The CAM array in fig. 1B contains (W × K) CAM memory cells M, each of which both stores data and compares the stored data with external search data. A row of W memory cells M constitutes one word of the CAM (which may also be referred to as a data entry or entry); W is the "word width" or "bit width", the number of words in the CAM array (K in the figure) is called the "depth", and the capacity of the CAM is expressed as (K words × W bits). During a search operation, the search data (search key) is loaded onto the search lines SL by the search line drivers; the data stored in each memory cell M is matched against the SL, and the match result is reflected as a level change on the match line (ML). The voltage change on the ML is amplified by the sense amplifier SA and finally fed to a priority encoder, which encodes and outputs the highest-priority address among the data items that match the search data.
Therefore, when the CAM searches its contents, the comparison circuit in every memory cell M compares the search data with the stored data simultaneously, so all data items of the whole CAM array can be queried in as little as one clock cycle. This yields fast matching with high parallelism, and the search speed is not affected by the CAM capacity. Compared with an ordinary memory, in which data can only be read out item by item in address order under software control for comparison, the CAM implements high-speed search in a hardware circuit, which greatly improves the search efficiency and the performance of the search system; CAMs are therefore widely applied in fields such as network communication and pattern recognition.
However, because the CAM relies on dedicated circuitry for parallel comparison, it inevitably suffers from large chip area and high power consumption, as well as poor portability and low flexibility, which reduce the performance and reliability of the CAM chip.
Summary of the application
The embodiment of the application provides a content addressable storage device, a content addressable storage method and related equipment, which can ensure the low power consumption and flexibility of the content addressable storage device while realizing the content addressable function.
In a first aspect, an embodiment of the present application provides a content addressable storage apparatus, which may include: the memory is used for storing data to be matched; a comparator comprising N comparison units respectively coupled to the memory, N being an integer greater than 1; the scheduler is used for acquiring K search data and scheduling the K search data to K target comparison units in the N comparison units respectively, wherein the target comparison units are comparison units in an idle state, and K is an integer which is greater than or equal to 1 and less than or equal to N; the controller is used for controlling the memory to read out the data to be matched and respectively sending the data to the K target comparison units; and each target comparison unit in the K target comparison units is used for comparing the corresponding search data with the data to be matched and outputting a corresponding matching result according to the comparison result.
In the embodiments of the present application, the multiple comparison units in the comparator serve as dynamically and flexibly callable comparison resources of the content addressable storage device. When input search data needs to be looked up, idle comparison units among them are called to compare the search data against the data to be matched in the memory in sequence; since the comparator contains N comparison units, parallel search of up to N pieces of search data can be performed simultaneously. In the prior-art CAM, by contrast, every memory cell of the CAM array has its own comparison circuit, so the search data can be compared with all stored data in the memory array at once, and comparison results are obtained in a few clock cycles or even one; but once such a CAM leaves the factory, its physical structure and related parameters are fixed, so its portability and flexibility are poor. In the embodiments of the present application, the comparison units of the comparator in the content addressable storage device are shared, and a parallel search function can be provided according to the actual search demand, multiplying the processing capability of the device and accommodating scenarios in which the instantaneous network throughput varies between low and high, while avoiding the use of a costly, power-hungry CAM structure; performance is thus ensured while cost and power consumption are reduced. Further, since the memory in the device does not need a special CAM structure but can use a relatively general memory structure, and the other functional blocks of the device (the scheduler, the controller, etc.) can be implemented in a general hardware description language such as Verilog, the device is highly portable and flexible, which greatly ensures the high availability, low cost and low power consumption of the content addressable storage device.
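The scheduling behavior described above can be sketched in software as follows (a conceptual model only; class names such as `Scheduler` and `ComparisonUnit` are hypothetical and not taken from the application, and a real implementation would be hardware described in a language such as Verilog):

```python
# Illustrative sketch of the scheduler described above: K search keys are
# dispatched to K idle comparison units out of N; units busy with an
# earlier search are skipped. All names here are hypothetical.

class ComparisonUnit:
    def __init__(self):
        self.search_key = None   # None means the unit is idle

    @property
    def idle(self):
        return self.search_key is None

class Scheduler:
    def __init__(self, n_units):
        self.units = [ComparisonUnit() for _ in range(n_units)]

    def dispatch(self, search_keys):
        """Assign each search key to an idle unit; return the units used."""
        idle_units = [u for u in self.units if u.idle]
        if len(search_keys) > len(idle_units):
            raise RuntimeError("not enough idle comparison units")
        used = []
        for key, unit in zip(search_keys, idle_units):
            unit.search_key = key
            used.append(unit)
        return used

sched = Scheduler(n_units=8)
targets = sched.dispatch([0x12, 0x34, 0x56])     # K = 3 of N = 8
assert len(targets) == 3
assert sum(u.idle for u in sched.units) == 5     # 5 units remain idle
```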
In a possible implementation manner, the apparatus further includes a result output register, and the N comparison units are respectively coupled to the result output register; the data to be matched comprises a plurality of data items; each target comparison unit in the K target comparison units is further configured to send a corresponding matching result to the result output register, where the matching result includes one or more of matching indication information, a matching data item of corresponding search data, and an address of the matching data item, where the matching indication information is used to indicate whether there is a matching data item; and the result output register is used for receiving and storing the matching results respectively sent by the K target comparison units. In this embodiment, the N comparison units are further respectively coupled to a result output register in the content addressable storage device, and when any one of the N comparison units completes matching of the search data, the matching result may be sent to the result output register, where the matching result may be one or more of whether matching is successful, specific matching data, or an address of the matching data.
In one possible implementation, the data to be matched includes M data items, where M is an integer greater than 1; the controller is specifically configured to control the memory to read out the M data items in sequence, in an address traversal manner, and send them respectively to the K target comparison units; each of the K target comparison units is specifically configured to compare its corresponding search data with the M data items in sequence and determine the matching data item of that search data. In the embodiment of the application, the M data items stored in the memory are read out serially in address-traversal order and sent one after another to the corresponding target comparison units for comparison, so as to obtain the corresponding matching results.
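As a conceptual sketch of the address-traversal comparison described above (illustrative only; in the device itself the traversal is performed by the controller and a hardware comparison unit):

```python
# Sketch of the address-traversal read-out described above: the M data
# items are read from memory in address order and compared one by one
# against a comparison unit's search key. Illustrative only.

def traverse_and_match(memory, search_key):
    """Return the address of the first matching data item, or None."""
    for address in range(len(memory)):        # address traversal
        if memory[address] == search_key:     # the comparison unit's job
            return address
    return None

mem = [7, 3, 9, 3]
assert traverse_and_match(mem, 9) == 2
assert traverse_and_match(mem, 5) is None
```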
In one possible implementation, the data to be matched includes M data items; the controller is specifically configured to control the memory to read out L data items per clock cycle in an address polling manner and broadcast the L read-out data items to the N comparison units, where L is a positive integer less than or equal to M; each of the K comparison units is configured to compare the L data items received each time with its corresponding search data. In the embodiment of the present application, when multiple pieces of search data are searched, multiple comparison units execute search operations at the same time. Specifically, the controller controls the memory to read out L data items and send them in parallel to the comparison units currently executing search operations, and each such comparison unit compares the L data items received each time with its stored search data to obtain the corresponding matching result. For example, suppose the comparator includes 8 comparison units, of which 4 are currently performing data comparison; the controller reads out 2 of the 64 data items to be matched every clock cycle and sends them in parallel to the 4 comparison units, and each comparison unit compares the 2 data items received each time with its stored search data. For each comparison unit, once it obtains a matching result (having been compared with all or part of the 64 data items to be matched), its search operation may be stopped; the controller then no longer sends data items read from the memory to a comparison unit that has completed its search operation.
In one possible implementation, the bit width of each of the N comparison units is L times the bit width of each of the M data items. In this embodiment, each comparison unit in the comparator may have a bit width L times that of each data item in the data to be matched stored in the memory, so that the comparison unit can compare L data items in each clock cycle. For example, if the bit width of each data item is W and the comparison bit width of the comparison unit is L × W, the comparison unit can complete a comparison over L × W bits of data in one clock cycle.
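The polling scheme in the example above (L = 2 items read per cycle, 64 data items, early stop once a unit matches) can be sketched as follows; `polled_search` is a hypothetical software model, not the application's implementation:

```python
# Sketch of the polling scheme described above: L data items are read per
# clock cycle and broadcast to all active comparison units; a unit that
# finds its match stops receiving further items. Illustrative only.

def polled_search(memory, search_keys, items_per_cycle):
    """Return {search_key: matched address or None} after polling memory."""
    results = {key: None for key in search_keys}
    unmatched = set(search_keys)            # units still searching
    for base in range(0, len(memory), items_per_cycle):
        batch = memory[base:base + items_per_cycle]   # L items this cycle
        for key in list(unmatched):                    # active units only
            for offset, item in enumerate(batch):
                if item == key:
                    results[key] = base + offset
                    unmatched.discard(key)             # early stop
                    break
        if not unmatched:
            break
    return results

mem = list(range(64))                        # 64 data items to be matched
results = polled_search(mem, [5, 40, 99], items_per_cycle=2)
assert results[5] == 5 and results[40] == 40 and results[99] is None
```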
In a possible implementation manner, the controller is further configured to control the memory to write the data to be matched in a preset manner. In this embodiment of the application, the controller may, under the control of a processor external or internal to the device, complete the initial configuration of the data to be matched in a preset manner at an initial stage, for example by enumerating the addresses of all the data to be matched and writing the data to be matched corresponding to each address into the memory; the content of the data to be matched depends on the specific business requirements.
In a possible implementation manner, the scheduler is further configured to, when the search data received within a preset time period exceeds a preset number, or when no comparison unit is currently idle, defer determining target comparison units from the N comparison units until after a preset time interval. In the embodiment of the present application, when the throughput in the network is high, for example when the instantaneous burst reaches a large peak pps, or when no comparison unit is currently idle, traffic shaping is performed, that is, scheduling is deferred for a certain time. This improves the adaptability of the content addressable storage device of the embodiment of the present application and reduces the demand for instantaneous comparison resources.
In a possible implementation manner, the controller is further configured to control the memory to write new data to be matched, or to modify the data to be matched, while controlling the memory to read out the data to be matched. In the embodiment of the application, the controller can thus write data to be matched into the memory in a read-while-write mode. For example, for a system that requires online refreshing of the data to be matched, a memory with two independent read/write ports can be chosen so that the data to be matched can be updated or modified while it is being read. If a single-port memory is used, the controller needs to insert some extra refresh time into the read-out of the data to be matched in order to complete the refresh, and the query speed may be partially reduced during that time. Refresh time here means that during a refresh (writing a table entry/data entry), the incrementing of the query (read table entry/data entry) address sequence is suspended (and the read enable is deasserted); after the write is finished, the query is enabled again (the read enable is asserted and the read address sequence resumes incrementing).
In a possible implementation manner, the data to be matched includes M data items, where each data item includes content to be matched, query control information, and an output result; each of the K target comparison units is specifically configured to compare its corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and to output the output result in the matched data item as the matching result of the corresponding search data. In the embodiment of the application, each data item contained in the data to be matched carries query control information and a corresponding output result (for example, the output result may be the address corresponding to the content to be matched), so as to control the query method and related parameters used by the comparison unit during the search query, thereby meeting the business requirements of various scenarios.
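A software sketch of data items carrying query control information and output results might look as follows; the `exact`/`masked` control encoding is a hypothetical example of such control information, not one specified by the application:

```python
# Sketch of data items that carry query control information and an output
# result, as described above. The control field here selects exact match
# vs. masked (ternary-style) match -- a hypothetical encoding chosen
# purely for illustration.

def match_item(item, search_key):
    """Compare a search key against one data item per its control info."""
    if item["control"] == "exact":
        return search_key == item["content"]
    if item["control"] == "masked":               # ignore masked-out bits
        return (search_key & item["mask"]) == (item["content"] & item["mask"])
    return False

def search(items, search_key):
    """Return the output result of the first matching item, or None."""
    for item in items:
        if match_item(item, search_key):
            return item["output"]
    return None

table = [
    {"content": 0xAB, "control": "exact",  "mask": 0,    "output": "port 1"},
    {"content": 0xA0, "control": "masked", "mask": 0xF0, "output": "port 2"},
]
assert search(table, 0xAB) == "port 1"
assert search(table, 0xA7) == "port 2"   # matches 0xA0 under mask 0xF0
assert search(table, 0x17) is None
```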
In a second aspect, an embodiment of the present application provides a content addressable storage method, which is applicable to a content addressable storage apparatus, where the apparatus includes: the comparator comprises N comparison units, the N comparison units are respectively coupled to the memory, and N is an integer greater than 1; the method comprises the following steps: storing data to be matched in the memory; acquiring K search data, and respectively scheduling the K search data to K target comparison units in the N comparison units, wherein the target comparison units are in idle states, and K is an integer greater than or equal to 1 and less than or equal to N; controlling the memory to read the data to be matched and respectively sending the data to the K target comparison units; and comparing the corresponding search data with the data to be matched through each target comparison unit in the K target comparison units, and outputting a corresponding matching result according to the comparison result.
In a possible implementation manner, the apparatus further includes a result output register, and the N comparison units are respectively coupled to the result output register; the data to be matched comprises a plurality of data items; the method further comprises the following steps: sending a matching result corresponding to each target comparing unit in the K target comparing units to the result output register, wherein the matching result comprises one or more of matching indication information, a matching data item of corresponding search data and an address of the matching data item, and the matching indication information is used for indicating whether a matching data item exists or not; and receiving and storing the matching results respectively sent by the K target comparison units through the result output register.
In a possible implementation manner, the data to be matched includes M data items, where M is an integer greater than 1; the controlling the memory to read out the data to be matched and respectively send the data to the K target comparison units includes: controlling the memory to read out the M data items in sequence and send the M data items to the K target comparison units respectively according to an address traversal mode; the comparing, by each of the K target comparing units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparing result, includes: and each target comparison unit in the K target comparison units compares the corresponding search data with the M data items in sequence to determine the matching data item of the corresponding search data.
In a possible implementation manner, the data to be matched includes M data items, where M is an integer greater than 1; the controlling the memory to read out the data to be matched and respectively send the data to the K target comparison units includes: controlling the memory to read out L data items in each clock cycle in an address polling mode, and broadcasting the read-out L data items to the N comparison units, wherein L is a positive integer less than or equal to M; comparing the corresponding search data with the data to be matched by each of the K target comparison units, including: comparing, by each of the K comparison units, the L data items received at each time with the corresponding search data.
In one possible implementation, the bit width of each of the N comparison units is L times the bit width of each of the M data items.
In one possible implementation, the method further includes: and controlling the memory to write the data to be matched in a preset mode.
In one possible implementation, the method further includes: and when the search data received in a preset time period exceeds a preset number or no idle comparison unit exists currently, controlling to determine a target comparison unit from the N comparison units after a preset time interval.
In one possible implementation, the method further includes: and in the process of controlling the memory to read out the data to be matched, controlling the memory to write in new data to be matched or modifying the data to be matched.
In one possible implementation manner, the data to be matched includes M data items, where each data item includes content to be matched, query control information, and an output result; the comparing, by each of the K target comparison units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparison result, includes: comparing, by each of the K target comparison units, the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and outputting the output result in the matched data item as the matching result of the corresponding search data.
In a third aspect, the present application provides a semiconductor chip, which may include the content addressable storage apparatus provided in the first aspect or any implementation manner of the first aspect.
In a fourth aspect, the present application provides a semiconductor chip, which may include the content addressable storage apparatus provided in the first aspect or any implementation manner of the first aspect, a processor coupled to the content addressable storage apparatus, and a memory external to the content addressable storage apparatus.
In a fifth aspect, the present application provides a system-on-chip (SoC) chip, which includes the content addressable storage apparatus provided in the first aspect or any implementation manner of the first aspect, a processor coupled to the content addressable storage apparatus, and an external memory of the content addressable storage apparatus. The chip system may consist of a chip, or may include a chip and other discrete devices.
In a sixth aspect, the present application provides a chip system, comprising a chip that includes the content addressable storage apparatus provided in the first aspect or any implementation manner of the first aspect, a processor coupled to the content addressable storage apparatus, and an external memory of the content addressable storage apparatus. The chip system may consist of a chip, or may include a chip and other discrete devices.
In a seventh aspect, the present application provides an electronic device, which includes the content addressable storage apparatus provided in the first aspect or any implementation manner of the first aspect, an external memory of the content addressable storage apparatus, and a processor coupled to the content addressable storage apparatus. The external memory is used for storing necessary program instructions and data, and the processor is used for running the necessary general-purpose operating system of the electronic device and is coupled to the content addressable storage apparatus to complete the relevant processing functions in the content addressable storage apparatus. The electronic device may also include a communication interface for communicating with other devices or a communication network.
In an eighth aspect, the present application provides a computer storage medium storing a computer program which, when executed by a processor, implements the flow of the content addressable storage method provided in the second aspect or any implementation manner of the second aspect.
In a ninth aspect, the present application provides a computer program comprising instructions which, when executed by a computer, cause the computer to execute the flow of the content addressable storage method provided in the second aspect or any implementation manner of the second aspect.
Drawings
FIG. 1A is a diagram illustrating the functional differences between a RAM and a CAM in the prior art;
FIG. 1B is a schematic diagram of a typical CAM structure of the prior art;
fig. 2 is a schematic processing diagram of a simulated CAM based on memory address traversal according to an embodiment of the present application;
fig. 3 is a schematic processing diagram of an extended-bit-width simulated CAM based on memory address traversal according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a content addressable storage device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of another content addressable storage device according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of another content addressable storage device according to an embodiment of the present application;
FIG. 7 is a timing diagram of comparison by a comparison unit according to an embodiment of the present application;
fig. 8 is a schematic flowchart of a content addressable storage method according to an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described below with reference to the drawings.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
As used in this specification, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between 2 or more computers. In addition, these components can execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from two components interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal).
First, some terms in the present application are explained so as to be easily understood by those skilled in the art.
(1) A System on Chip (SoC), also called a system-level chip, is an integrated circuit with a dedicated purpose that contains a complete system and the entire content of its embedded software. The term also refers to the technology for achieving the whole design process, from defining system functions through software/hardware partitioning to completing the design.
(2) Random Access Memory (RAM) is used to store data that can be read and written at any time. RAM is typically used as a temporary storage medium for the operating system or other running programs (in which role it may be referred to as system memory). RAM cannot retain data when powered off; if data needs to be saved, it must be written to long-term storage (e.g., a hard disk).
(3) Random access memory (RAM) can be further divided into Static Random Access Memory (SRAM) and Dynamic Random Access Memory (DRAM). The two share the same basic principle: data is stored as charge in the memory. SRAM has a more complex structure, lower capacity per unit area, and a high access speed; DRAM has a simple structure, a large storage capacity per unit area, and a slower access time than SRAM. Because of its simple structure, the charge stored in a DRAM gradually leaks away over time, so a periodic recharge (refresh) is required to maintain the data stored in its capacitors.
(4) Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), DDR for short, was developed on the basis of SDRAM and continues to use the SDRAM production system; memory manufacturers therefore need only minor improvements to the equipment used for manufacturing ordinary SDRAM in order to produce DDR memory, which effectively reduces cost. Compared with the traditional single data rate, DDR technology performs two read/write operations in one clock cycle, one on the rising edge and one on the falling edge of the clock.
(5) A Read-Only Memory (ROM) is a solid-state semiconductor memory from which only pre-stored data can be read. Its characteristic is that once data is stored, it cannot be changed or deleted. It is commonly used in electronic or computer systems where the data does not need to change frequently and must not be lost on power-down.
(6) A Field Programmable Gate Array (FPGA) is a further development of programmable devices such as Programmable Array Logic (PAL), Generic Array Logic (GAL), and the Complex Programmable Logic Device (CPLD). It is a semi-custom circuit in the Application Specific Integrated Circuit (ASIC) field: it overcomes the inflexibility of fully custom circuits while also overcoming the limited gate count of earlier programmable devices.
(7) An Emulator (Emu) is a device or program that can reproduce nearly all of the characteristics and behaviors of a hardware or software system, with the aim of fully replicating how the emulated hardware reacts to external inputs.
(8) The Round Robin Scheduling algorithm services the task queues or processes that are ready for scheduling/processing in a fixed cyclic order.
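As an illustrative sketch (not part of this application), the cyclic order of round-robin scheduling can be modeled with a simple queue: each ready task gets one unit of service per turn and, if unfinished, rejoins the tail of the queue:

```python
from collections import deque

def round_robin(tasks):
    """tasks: dict mapping task name -> remaining work units.
    Returns the order in which turns are granted."""
    ready = deque(tasks)           # ready queue, in arrival order
    remaining = dict(tasks)
    order = []
    while ready:
        name = ready.popleft()     # head of the queue gets the next turn
        order.append(name)
        remaining[name] -= 1
        if remaining[name] > 0:    # unfinished: back to the tail
            ready.append(name)
    return order

print(round_robin({"A": 2, "B": 1, "C": 2}))  # ['A', 'B', 'C', 'A', 'C']
```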
(9) A Look-up Table (LUT) is essentially a RAM. Most current FPGAs use 4-input LUTs, so each LUT can be regarded as a RAM with a 4-bit address. When a user describes a logic circuit through a schematic or an HDL language, the PLD/FPGA development software automatically computes all possible outputs of the logic circuit and writes the truth table (i.e., the results) into the RAM in advance. Thereafter, every logic evaluation amounts to an address lookup: the input signals form an address, the content stored at that address is found, and that content is output.
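The "truth table in RAM" idea can be sketched in a few lines. This is a simplified software model, not the actual FPGA toolchain flow: all 16 outputs of a 4-input function are precomputed, and evaluation becomes a pure address lookup:

```python
# A 4-input LUT is a 16-entry truth table stored in RAM: the 4 input bits
# form the address, and the stored bit is the precomputed result.
def make_lut4(func):
    """Precompute the truth table of a 4-input boolean function."""
    return [func((a >> 3) & 1, (a >> 2) & 1, (a >> 1) & 1, a & 1) & 1
            for a in range(16)]

# Example function: f(a,b,c,d) = (a AND b) XOR (c OR d)
table = make_lut4(lambda a, b, c, d: (a & b) ^ (c | d))

def lut_read(table, a, b, c, d):
    # "Logic evaluation" is now just an address lookup into the RAM
    return table[(a << 3) | (b << 2) | (c << 1) | d]

print(lut_read(table, 1, 1, 0, 0))  # (1 AND 1) XOR (0 OR 0) = 1
```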
(10) The packet forwarding rate of a network (i.e., how many packets are sent per second) is commonly expressed in packets per second (pps), and network performance is usually measured by throughput. The packet forwarding rate indicates a switch's capacity to forward packets; switch forwarding rates typically range from tens of Kpps to hundreds of Mpps. The packet forwarding rate is the number of millions of packets per second (Mpps) a switch can forward, i.e., the number of packets the switch can forward simultaneously, and it embodies the switching capability of the switch measured in packets.
(11) Throughput is the amount of data (measured in bits, bytes, etc.) successfully transmitted through a network, device, port, or other facility per unit time; it denotes the maximum data rate a device can receive and forward without frame loss. Throughput is determined mainly by the hardware of the network device's internal and external ports and by the efficiency of its algorithms. For devices that must perform heavy computation in particular, an inefficient algorithm can greatly reduce the achievable traffic.
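The relation between throughput (bits per second) and packet forwarding rate (pps) can be made concrete with standard Ethernet figures (a generic calculation, not specific to this application): each frame also occupies 8 bytes of preamble and 12 bytes of inter-frame gap on the wire:

```python
# Relationship between line rate (bits/s) and packet forwarding rate (pps).
def max_pps(line_rate_bps: float, frame_bytes: int) -> float:
    wire_bytes = frame_bytes + 8 + 12   # + preamble + inter-frame gap
    return line_rate_bps / (wire_bytes * 8)

# 10 Gbit/s Ethernet with minimum-size (64-byte) frames: ~14.88 Mpps
print(round(max_pps(10e9, 64) / 1e6, 2))  # 14.88
```

This is why short-packet bursts are the stress case for lookup engines: the pps demand is highest exactly when frames are smallest.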
To facilitate understanding of the embodiments of the present application, the technical problems to be solved and the corresponding practical application scenarios are further analyzed below. Because the CAM searches by content, it performs a highly parallel search using the comparison circuit built into each storage unit of the CAM array. The CAM array structure therefore differs from the data storage array of an ordinary Memory (such as a RAM), which means the content-based addressing function of a CAM cannot be implemented with a general-purpose memory structure. The following two schemes implement a content-addressable storage function using an ordinary Memory structure:
In the first scheme, an analog CAM based on Memory address traversal:
the scheme can be briefly described as follows: initially, the table entry data to be matched is stored in the Memory. After each subsequent query starts, all table entries over the full Memory depth are read out once by address traversal. During the read-out, the target matching data (i.e., the search data in this application) is compared one by one with the read table entries (i.e., the data items in this application); when the address traversal ends, one analog-CAM lookup is complete. The next query can start only after the previous query ends, after which the complete address traversal is repeated. As shown in fig. 2, fig. 2 is a schematic processing diagram of an analog CAM based on Memory address traversal according to an embodiment of the present application, where the addresses of all table entries in the Memory are assumed to be 0, 1, 2, 3, 4, 5, 6, 7, T0 represents the start of the first query, T1 the start of the second query, and T2 the start of the third query. It can be seen that the 2nd query can start only after the 1st query completes, the time slices of individual queries do not overlap, and only one query request is processed at a time. This scheme is a serial comparison approach using a general-purpose Memory structure, so its cost and power consumption are relatively low, its back-end risk is small (i.e., the feasibility risk of converting the logic design into a physical circuit is small), and its flexibility is high.
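Scheme one can be sketched as follows (an illustrative software model of the behavior, with an assumed depth-8 table): every query pays the latency of a full address traversal, regardless of where the hit occurs:

```python
def serial_cam_lookup(memory, key):
    """Analog CAM, scheme one: one entry compared per cycle over the
    whole Memory depth; the latency always equals the depth."""
    hit_addr, cycles = None, 0
    for addr, entry in enumerate(memory):  # full address traversal
        cycles += 1
        if entry == key and hit_addr is None:
            hit_addr = addr                # record the first match
    return hit_addr, cycles

memory = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88]  # depth 8
print(serial_cam_lookup(memory, 0x44))  # (3, 8): hit at address 3, 8 cycles
```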
The disadvantage of this processing structure is that each query must wait out the full address-traversal latency, resulting in low throughput and low performance.
Scheme two, an analog CAM based on Memory address traversal with widened bit width:
this scheme builds on scheme one by increasing the Memory word width, for example widening it to hold 2 table entries, so that 2 table entries are compared at a time. The Memory depth is then reduced to 1/2 and the processing delay drops accordingly: a single query takes half the original time because two table entries are compared in each clock cycle. As shown in fig. 3, fig. 3 is a schematic diagram of the widened-bit-width analog CAM based on Memory address traversal provided in the embodiment of the present application, where the addresses of all table entries in the Memory are assumed to be 0, 1, 2, 3, 4, 5, 6, 7, T0 represents the start of the first query, T1 the start of the second query, and T2 the start of the third query. Two table entries are compared per clock cycle and the single-query time is halved; however, the 2nd query can still start only after the 1st query completes, the time slices of individual queries still do not overlap, and only one query request is processed at a time. This scheme is likewise a serial comparison approach that can use a general-purpose Memory structure, with relatively low cost and power consumption, small back-end risk, and high flexibility.
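Scheme two can be modeled the same way (again an illustrative sketch with an assumed depth-8 table): widening the word to two entries halves the number of read cycles, but queries still cannot overlap:

```python
def widened_cam_lookup(memory, key):
    """Analog CAM, scheme two: two entries per wide word, so the
    traversal takes half as many cycles (8 entries -> 4 cycles)."""
    hit_addr, cycles = None, 0
    for base in range(0, len(memory), 2):   # one wide word = two entries
        cycles += 1
        for off in (0, 1):
            if memory[base + off] == key and hit_addr is None:
                hit_addr = base + off
    return hit_addr, cycles

memory = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88]
print(widened_cam_lookup(memory, 0x44))  # (3, 4): same hit, half the cycles
```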
The disadvantage of this processing structure is that the Memory word width is widened significantly while the depth shrinks significantly. Real, resource-optimized Memory shapes are usually subject to certain constraints, so this change may worsen Memory selection; in extreme scenarios the high-density storage advantage of the Memory may be lost, ultimately increasing chip area and power consumption.
Analysis of typical network system scenarios shows that there is no extreme requirement to initiate a query in every system clock cycle. In Ethernet packet processing, for example, lookups are generally performed per packet rather than per byte, and the packet rate is far lower than the byte rate, so a large number of query requests will not arrive in a short time. However, in a short-packet burst scenario with a deep table (a large number of data items to be matched), the pps performance of scheme one may fail to meet the requirement: because the table is deep, a single query takes a long time, and since individual queries cannot overlap in time, network throughput cannot be guaranteed under short-packet bursts. The prior-art CAM and scheme two improve network throughput compared with scheme one, but suffer from resource and power-consumption degradation or from limited matching flexibility.
In summary, existing content-based addressing schemes cannot meet network performance requirements. The flexible CAM provided by this application, with low power consumption, small area, strong portability, and strong scalability, is therefore used to solve these technical problems.
Based on the above, a content addressable storage apparatus and related devices provided in the embodiments of the present application are described below. Referring to fig. 4, fig. 4 is a schematic structural diagram of a content addressable storage device 40 according to an embodiment of the present disclosure. The content addressable storage device 40 may include a scheduler 401, a comparator 402, a controller 403, and a memory 404, where the comparator 402 includes N comparison units 4021, the N comparison units are respectively coupled to the memory 404, and N is an integer greater than 1. Optionally, the scheduler 401, the comparator 402, the controller 403, and the memory 404 may be located on one integrated circuit substrate. Wherein,
the memory 404 is configured to store data to be matched, where the data to be matched is a plurality of data items that need to be matched against the K search data acquired by the scheduler 401; that is, each search datum needs to be matched against the data to be matched to obtain a corresponding matching result. The data to be matched can be obtained in various ways, such as static configuration or dynamic learning. Before any target comparison unit 4021 starts comparing search data, the controller 403 controls the memory 404 to write the data to be matched. Optionally, the controller 403 may also control the memory 404 to write new data to be matched, or modify existing data to be matched, while the data to be matched is being read out. That is, the controller 403 may operate the memory 404 in a read-while-write manner: for example, for a system that requires online refresh of the data to be matched, a memory with two independent read and write ports may be selected so that the data can be updated or modified while being read. If a single-port memory is used, the controller 403 may insert additional refresh slots while controlling the memory 404 to read the data to be matched, at the cost of some query speed. A refresh slot here means that during the refresh (writing a table entry/data item), the query (table-entry read) address stops incrementing, and after the write ends, the query is re-enabled (read enable asserted and the read address incrementing again).
In a possible implementation, the storage structure of the memory 404 may take the form of a storage array built from a plurality of basic storage cells, each storing one binary digit (1 or 0), called a bit. A row of storage cells forms one word of the memory 404 (also called a data item, data table entry, table entry, etc.); W is the word width (bit width), the number K of words in the storage array is called the depth, and the capacity of the memory 404 is expressed as (K words × W bits). Unlike the prior-art CAM, the memory 404 itself has no data-comparison function, i.e., it includes no comparison circuit, and thus a general-purpose memory structure may be adopted. For example, the memory 404 may be a general-purpose Random Access Memory (RAM) or another power-down-volatile memory device, such as a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), a Synchronous Dynamic Random Access Memory (Synchronous DRAM, SDRAM), a Double Data Rate SDRAM (DDR SDRAM), and the like; the memory 404 may also be a general-purpose Read Only Memory (ROM) or another nonvolatile memory, such as a Programmable ROM (PROM), an Erasable Programmable ROM (EPROM), an Electrically Erasable Programmable ROM (EEPROM), or a flash ROM; the memory 404 may also be a set of general-purpose registers on a processor, flash memory, or any other suitable type of memory. It can be understood that the data to be matched may be variable if the memory 404 is a RAM, whereas the data to be matched is fixed in place if the memory 404 is a ROM. It should be noted that this application does not limit the form in which the data to be matched is stored in the memory 404; it may change according to actual needs or business situations.
The scheduler 401 is configured to obtain K search data and schedule them to K target comparison units among the N comparison units, where the target comparison units are in an idle state and K is an integer greater than or equal to 1 and less than or equal to N. For example, when the scheduler 401 receives a query request containing search data, a lookup type, and so on, the content addressable storage device 40 needs to perform a lookup for that search data. The scheduler 401 then allocates a comparison unit to the search data to look up the matching result; when there are multiple query requests, the scheduler 401 allocates different target comparison units to different requests. Specifically, according to the dynamically received query requests and the current states of the N comparison units 4021 in the comparator 402, the scheduler 401 dynamically selects an idle, available target comparison unit (which may be any comparison unit 4021 in fig. 4), starts or enables its query function, and sends the search data to it, thereby completing the allocation of query requests to query resources. The K search data acquired by the scheduler 401 may be acquired at the same time or at different times; correspondingly, they may be scheduled to the K target comparison units simultaneously or one after another. This may depend on the order in which the K search data reach the scheduler, or on a scheduling rule that is preset or flexibly variable in the scheduler 401, which is not specifically limited in this embodiment of the present application.
For example, the scheduler 401 may schedule a search datum to a target comparison unit immediately upon receiving it, or it may wait until a certain number of search data have been received and then schedule them to the corresponding target comparison units together. It should be noted that the scheduler 401 may receive a large number of query requests, i.e., a large amount of search data, at a certain moment or within a certain period. Because the number of comparison units in the comparator 402 is limited, the scheduler 401 may then apply traffic shaping control: only K search data are currently taken for querying, and the remaining search data are scheduled once comparison units become idle, so that congestion or discarding of search data is avoided as much as possible.
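The dispatch-plus-traffic-shaping behavior described above can be sketched as follows. This is a simplified model; the state encoding and queueing policy are assumptions for illustration, not the patent's implementation:

```python
# Sketch of the scheduler's dispatch step: assign each incoming search
# datum to an idle comparison unit; excess requests wait (traffic shaping).
IDLE, BUSY = 0, 1

def dispatch(search_data, unit_states):
    """Returns ({unit_index: search_datum}, leftover queue)."""
    assignments, pending = {}, []
    for datum in search_data:
        for i, state in enumerate(unit_states):
            if state == IDLE:
                unit_states[i] = BUSY   # enable the unit's query function
                assignments[i] = datum
                break
        else:
            pending.append(datum)       # no idle unit: hold for later
    return assignments, pending

states = [IDLE] * 4
assigned, waiting = dispatch(["a", "b", "c", "d", "e"], states)
print(assigned)  # {0: 'a', 1: 'b', 2: 'c', 3: 'd'}
print(waiting)   # ['e']
```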
The controller 403 is configured to control the memory 404 to read out the data to be matched and send it to the K target comparison units respectively. When one or more comparison units in the comparator 402 are selected as target comparison units, i.e., need to perform a lookup operation, the comparator 402 may notify the controller 403 of the state information (idle state or active state) of those units, thereby completing the read/write-enable merge control of the controller 403 by the comparator 402 and ultimately controlling the output of the memory 404. That is, as long as at least one comparison unit in the comparator 402 is selected as a target comparison unit, the controller 403 must control the memory 404 to keep reading out the data items to be matched, because the port of the memory 404 must be held in a readable state for the read operation to proceed. Optionally, the scheduler 401 may instead notify the controller 403 of the state information of the target comparison units to complete this read/write-enable merge control, which is not specifically limited in this embodiment of the present application. The structure in fig. 4 takes the comparator 402 notifying the controller 403 as an example; it can be understood that if the scheduler 401 notifies the controller 403, a communication connection is required between the scheduler 401 and the controller 403, and details are not described herein. Under the control of the controller 403, the memory 404 sends the read-out data to be matched to the K target comparison units, either by broadcasting it to all N comparison units (including the K target comparison units) or, optionally, by sending it only to the K target comparison units.
For example, when the controller 403 learns through the scheduler 401 or the comparator 402 that the comparison units 4021-2 and 4021-4 currently serve as target comparison units performing a comparison on search data, the controller 403 controls the memory 404 to broadcast the read-out data to be matched to the N comparison units, or to send it only to the comparison units 4021-2 and 4021-4. It should be noted that if the memory 404 sends the read-out data only to the K target comparison units, it must know which comparison units are currently the targets; in that case the output of the memory 404 can be controlled through the per-unit state information known to the controller 403 (as notified by the scheduler 401 or the comparator 402).
Further, before the lookup operation starts, the controller 403 is also used to control the memory 404 to write the data to be matched in a preset manner. In an initial stage, the controller 403 may perform the initial configuration of the data to be matched under the control of a processor inside or outside the apparatus. For example, after the content addressable storage device 40 is powered on and before it enters data lookup operation, the controller 403 completes the initialization of the memory 404 under the control of the external processor. The initialization may include defining the matching word width and the output result bit width, selecting the read/write pattern of the data table entries (data to be matched), enumerating the addresses of all data to be matched, and writing the data to be matched into the memory 404 at the corresponding addresses; the content of the data to be matched depends on the specific business requirements.
The comparator 402 includes N comparison units 4021 for performing search data lookup, i.e., the set of comparison units 4021-1, 4021-2, ... 4021-N in fig. 4. Since the N comparison units in the comparator 402 are respectively coupled to the memory 404 (as shown in fig. 4, each comparison unit 4021 has its own physical connection to the memory 404), the N comparison units 4021 form a parallel comparison structure: their lookup processes do not interfere with one another, and both the lookup start times and the search data being looked up may be unrelated. When an idle comparison unit 4021 is selected as a target comparison unit, it compares the search data sent by the scheduler 401 with the data to be matched sent from the memory 404, and determines the matching result of the search data from the comparison result. The matching result includes one or more of matching indication information, the matching data item of the corresponding search data, and the address of the matching data item, where the matching indication information indicates whether a matching data item exists. In practical applications it may not be necessary to traverse all data items completely; for example, only the first matching result may be taken, or it may only be determined whether a match exists, in which case the traversal can stop early to speed up the response or reduce switching power consumption. The matching result may be held for a fixed delay or output immediately, as determined by the requirements of other components of the system. That is, in some cases only the existence of a hit needs to be found, while in others the specific matching item or its address is required.
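One comparison unit's behavior can be sketched as follows (an illustrative model; the class and field names are assumptions, not from this application). The unit latches its search data in a register, consumes one streamed data item per cycle, and assembles the matching result (hit indication, matching item, matching address); in first-match mode it stops early, as described above:

```python
class CompareUnit:
    """Model of one comparison unit: latches a search datum, then compares
    one data item per cycle and builds the matching result."""
    def __init__(self, search_datum, first_match_only=True):
        self.key = search_datum            # register holding the search data
        self.first_only = first_match_only
        self.result = {"hit": False, "item": None, "addr": None}
        self.done = False

    def consume(self, addr, item):
        """Called once per cycle with the item read out of the memory."""
        if self.done:
            return
        if item == self.key:
            self.result = {"hit": True, "item": item, "addr": addr}
            if self.first_only:            # no need to finish the traversal
                self.done = True

unit = CompareUnit(0x33)
for addr, item in enumerate([0x11, 0x22, 0x33, 0x44]):
    unit.consume(addr, item)
print(unit.result)  # hit at address 2 with item 0x33
```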
It can be understood that, in addition to logic resources such as a comparison circuit, each comparison unit 4021 may include maintenance control of necessary information such as its own comparison state, and a register for storing the search data sent from the scheduler 401.
In a possible implementation, when the search data received by the scheduler 401 within a preset period exceeds a preset number, or when no comparison unit is currently idle, the scheduler 401 further waits a preset interval before determining an idle target comparison unit among the N comparison units. That is, when network throughput is high, for example when the instantaneous burst maximum pps is large, or when no comparison unit is idle, the adaptability of the content addressable storage device of this embodiment can be improved by traffic shaping: the comparison is deferred for a certain time, reducing the instantaneous demand for comparison resources.
In the embodiment of this application, the multiple comparison units in the comparator serve as dynamically and flexibly callable comparison resources within the content addressable storage device. When input search data needs to be looked up, idle comparison units are invoked to compare the search data, item by item, with the data to be matched in the memory; since the comparator contains N comparison units, parallel lookup of up to N search data can be performed simultaneously. This differs from the prior-art CAM, in which every storage cell of the CAM array has a comparison circuit so that the search data can be compared with all stored data simultaneously, yielding a comparison result within a few clock cycles or even one clock cycle; but once such a CAM leaves the factory, its physical structure and related parameters are fixed, so its portability and flexibility are poor. In this embodiment, the comparison units of the comparator in the content addressable storage device are shared: the parallel lookup function for search data can be provided according to the actual lookup demand, multiplying the processing capability of the device and covering scenarios of both low and high instantaneous network throughput, while avoiding the high-cost, high-power CAM structure. Performance is thus guaranteed while cost and power consumption are reduced. Further, because the memory in the device requires no special CAM structure but can use a comparatively general memory structure, and the other functional blocks (scheduler, controller, etc.) can be implemented in a general hardware description language such as Verilog, the device is highly portable and flexible, which greatly ensures the high availability, low cost, and low power consumption of the content addressable storage device.
In fig. 4, before the scheduler 401 selects an idle target comparison unit from the N comparison units 4021, or before the controller 403 determines to which comparison units the currently read data to be matched should be sent, each comparison unit 4021 in the comparator 402 may send its own state information (idle or in-use) directly to the scheduler 401 and/or the controller 403; alternatively, each comparison unit 4021 may send its state information to a state merging module in the comparator 402 for aggregation, after which the state merging module sends all aggregated state information to the scheduler 401 and/or the controller 403.
As shown in fig. 5, fig. 5 is a schematic structural diagram of another content addressable storage device according to an embodiment of the present application. The scheduler 401 is connected in parallel to the N comparison units 4021 through N physical connection lines 001. The idle or in-use state of each of the N comparison units 4021 may be reflected on its physical connection line 001 as a level change, and the scheduler 401 can sense the current state of each comparison unit 4021 from the voltage on the line 001. For example, when a comparison unit 4021 is idle, the physical connection 001 between it and the scheduler 401 is held at a low level; when the comparison unit 4021 is in use, the line is pulled to a high level. When idle comparison units 4021 are selected as target comparison units, the target comparison units (e.g., 4021-2, 4021-4, and 4021-5 in fig. 5) receive the search data (search data a, search data b, and search data c) sent by the scheduler 401 through the corresponding physical connection lines (001a, 001b, and 001c in fig. 5) and store the corresponding search data.
Further, each comparison unit 4021 in the comparator 402 may feed its current state information (idle or in-use) back to the controller 403 through the physical connection 002; through this connection the controller 403 learns and tallies which comparison units are currently performing lookup operations, and controls the data to be matched in the memory 404 to be sent step by step, in parallel, to the corresponding target comparison units through the physical connections 003.
Assume there are currently three search data, for which the scheduler 401 selects the target comparison units 4021-2, 4021-4, and 4021-5 respectively. After the lookup functions of 4021-2, 4021-4, and 4021-5 are started, the comparator 402 notifies the controller 403 (or the scheduler 401 does) of the enabled states of the target comparison units. The controller 403 collects this state information and merges the states of the N comparison units: comparison units 4021-2, 4021-4, and 4021-5 are enabled and the others are disabled. Combining this merged state with the initialization control of the memory 404, the controller 403 sends the merged control information to the memory 404, thereby controlling the data items to be read out step by step and sent in parallel to the enabled target comparison units 4021-2, 4021-4, and 4021-5. For example, by address polling, one data item (or several, depending on the bit-width relationship between the comparison units 4021 and the memory 404) is sent in parallel to the target comparison units 4021-2, 4021-4, and 4021-5 in each clock cycle. Finally, the target comparison units 4021-2, 4021-4, and 4021-5 each compare, step by step, their stored search data with the data items the controller 403 reads out of the memory 404 and sends through the corresponding physical links (003a, 003b, 003c in fig. 5), until the final matching result of the search data is obtained. It can be understood that the comparison units 4021 may start their lookups at the same moment or at different moments; that is, the lookups performed by different comparison units 4021 neither affect nor interfere with one another.
In one possible implementation, the data to be matched includes M data items. The controller 403 is specifically configured to control the memory to read out the M data items in sequence, by address traversal, and send them to the K target comparison units respectively: the M data items may be sent only to the K target comparison units (e.g., comparison units 4021-2, 4021-4, and 4021-5 in fig. 5) or broadcast to all N comparison units (e.g., comparison units 4021-1, 4021-2, ... 4021-N in fig. 5). Each of the K target comparison units is specifically configured to compare its search data with the M data items in sequence and determine the matching data item of that search data. In this embodiment, the M data items stored in the memory are read out serially by address traversal and sent step by step to the corresponding target comparison units for comparison, yielding the corresponding matching results. The read-out may be performed one data item at a time, two at a time, or several at a time.
As shown in fig. 6, fig. 6 is a schematic structural diagram of another content addressable storage device provided in this embodiment of the present application. The content addressable storage device 40 in fig. 6 further includes a result output register 405, and the N comparison units 4021 are respectively coupled to the result output register 405. A target comparison unit also sends its matching result to the result output register 405, which receives and stores it. Since the N comparison units 4021 are coupled to the result output register 405 through the physical connection lines 004, whenever any of the N comparison units 4021 completes the matching of its search data, it can send the matching result to the result output register 405 in parallel through the corresponding physical connection line 004. Optionally, the memory 404 reads data by address polling under the control of the controller 403, so the read-out data can participate in the independent queries of multiple comparison units at the same time. In a possible implementation, the comparator 402 of the content addressable storage device 40 in fig. 6 further includes a state merging module 4022; each comparison unit 4021 sends its state information to the state merging module 4022 for aggregation, and the state merging module 4022 then sends all aggregated state information to the controller 403. Optionally, the number N of comparison units satisfies N ≥ (maximum burst pps) × (static lookup period), where the maximum burst pps may be determined from the minimum interval between actual query requests, and the static lookup period is the time to traverse the lookup-table addresses (the addresses of the data to be matched) once. In practical applications, the peak pps can be reduced by combining this with the system's existing traffic shaping module.
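The sizing rule N ≥ (maximum burst pps) × (static lookup period) can be checked with a quick calculation. The clock rate, table depth, and burst rate below are illustrative assumptions, not values from this application:

```python
import math

def min_comparison_units(max_burst_pps: float, table_depth: int,
                         clock_hz: float) -> int:
    # Static lookup period: time for one full pass over the table addresses
    static_lookup_period_s = table_depth / clock_hz
    return math.ceil(max_burst_pps * static_lookup_period_s)

# e.g. a 10 Mpps burst against a 1024-entry table read at 500 MHz:
# period = 2.048 us, so at least ceil(10e6 * 2.048e-6) = 21 units
print(min_comparison_units(10e6, 1024, 500e6))  # 21
```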
For example, as shown in fig. 7, fig. 7 is a timing diagram of comparison by the comparison units provided in the embodiment of the present application. In fig. 7, the time points are denoted as T0, T1, T2, T3, T4, and T5, corresponding to the first to sixth queries respectively, where the first and fourth queries (T0 and T3) are executed by comparison unit 402-2, the second and fifth queries (T1 and T4) are executed by comparison unit 402-4, and the third and sixth queries (T2 and T5) are executed by comparison unit 402-5. That is, the comparison units can perform search data queries in parallel, and each comparison unit returns to the idle state after a single query is completed, after which the next round of search data query can be performed. For example, comparison unit 402-2 starts the query for search data a at time T0 and starts a new query for search data d after the query for search data a is completed; while comparison unit 402-2 is processing search data a, comparison unit 402-4 starts the query for search data b and then immediately starts the query for search data e. It should be noted that all the data to be matched in the memory 404 are read out by address polling; that is, the addresses of the data to be matched in fig. 7 are read cyclically, by address polling, throughout the time sequence. As long as one or more comparison units are currently querying their corresponding search data, the controller 403 controls the data in the memory 404 to keep being read cyclically, and a comparison unit does not necessarily start its comparison from the data item with the lowest or highest address: for example, comparison at time T1 starts from address 6, and comparison at time T3 starts from address 2. In other words, the data items to be compared by a target comparison unit depend on the data currently being read out of the memory 404. In the time period corresponding to address 5 (hold) in fig. 7, no comparison unit is executing a query task; outputting the data to be matched at this point would waste resources. If the comparison units indicate their states to the controller 403 through high and low levels, all N comparison units are in the low-level state at this time, and the controller 403 controls the reading and matching of data to be suspended. When search data is dispatched to comparison unit 402-5 again at time T5, the controller 403 resumes reading from the suspended data item and sends it to comparison unit 402-5 for comparison.
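The wrap-around behavior described above — a comparison unit that joins mid-stream starts from whatever address the memory is currently emitting and traverses circularly until it has seen all M items — can be sketched as follows (illustrative Python, names not from the patent):

```python
def polled_match(memory, search_data, start_addr):
    """A comparison unit joining at an arbitrary point of the address
    polling starts at start_addr and wraps around until all M data
    items have been compared once."""
    m = len(memory)
    for offset in range(m):
        addr = (start_addr + offset) % m   # circular address traversal
        if memory[addr] == search_data:
            return addr
    return None                            # full pass, no match

table = [7, 1, 9, 4, 9, 3, 6, 2]
# Joining at address 6 (as at time T1 in fig. 7) still reaches the
# matching item at address 2 after the traversal wraps around.
print(polled_match(table, 9, start_addr=6))  # → 2
```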
In one possible implementation manner, K of the N comparison units currently perform comparison of search data, and different comparison units respectively correspond to different search data, where K is a positive integer less than or equal to N; the data to be matched includes M data items; the controller 403 is specifically configured to control the memory 404 to read out L data items in each clock cycle in an address polling manner and broadcast the read-out L data items to the N comparison units, where L is a positive integer less than or equal to M; each of the K comparison units is configured to compare the L data items received each time with its corresponding search data. Specifically, the controller 403 may control the memory 404 to read out L data items in each clock cycle and broadcast them to the N comparison units; optionally, the L data items may instead be sent in parallel only to the target comparison units currently performing a search operation. Each target comparison unit performing a search operation compares the L data items received each time with the search data stored in it, so as to obtain the corresponding matching result.
For example, assuming that M is 64, N is 8, L is 2, and K is 4, the comparator 402 includes 8 comparison units, and 4 target comparison units are currently performing data comparison. The controller 403 reads 2 of the 64 data items to be matched out of the memory at a time and sends them to the 4 target comparison units in parallel (or broadcasts them to all 8 comparison units). Each target comparison unit compares the 2 data items received each time with its stored search data. For each target comparison unit, after it obtains its matching result (by comparing with all or part of the 64 data items to be matched), the search operation of that comparison unit may be stopped; at this time, the controller 403 may control the data items read out from the memory so that they are no longer sent to the comparison unit that has completed its search operation.
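The example above (M = 64, L = 2, K = 4 active units) can be modeled with a short sketch; the names and the dictionary-based bookkeeping are illustrative only.

```python
def broadcast_search(memory, queries, L=2):
    """Each cycle, read L consecutive data items out of the memory and
    broadcast them to the active comparison units; a unit stops (returns
    to idle) as soon as it has found its match."""
    results = {q: None for q in queries}
    pending = set(queries)                     # units still searching
    for base in range(0, len(memory), L):
        chunk = memory[base:base + L]          # L items read this cycle
        for q in list(pending):                # each active target unit
            for i, item in enumerate(chunk):
                if item == q:
                    results[q] = base + i      # record matching address
                    pending.discard(q)         # stop this unit's search
                    break
        if not pending:                        # all searches finished:
            break                              # stop reading early
    return results

table = list(range(64))                        # M = 64 data items
print(broadcast_search(table, [5, 17, 40, 63]))
```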
In one possible implementation, the bit width of each of the N comparison units is L times the bit width of each of the M data items. That is, the bit width of each comparison unit in the comparator 402 may be L times the bit width of each data item in the data to be matched stored in the memory 404; in this case, a comparison unit can compare L data items in every clock cycle. For example, if the bit width of each data item is W and the comparison bit width of a comparison unit is L × W, the comparison unit can complete the comparison of L × W bits of data in one clock cycle. Within the range allowed by device selection, and after a reasonable evaluation of the impact on the back-end design, widening the word width in this way can further balance performance against cost: performing multiple comparisons simultaneously in one clock cycle accelerates the query at a relatively low cost in power consumption.
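The latency benefit of the word-width expansion is a simple ratio: one full traversal of the M data items takes ⌈M / L⌉ cycles. A one-line check:

```python
import math

def lookup_cycles(M, L):
    """With comparison bit width L*W, a unit checks L data items per
    clock, so one full table traversal takes ceil(M / L) cycles."""
    return math.ceil(M / L)

print(lookup_cycles(64, 1))  # → 64
print(lookup_cycles(64, 2))  # → 32  (word width doubled, latency halved)
```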
In one possible implementation, when multiple matching results occur, a priority encoder may be added to the content addressable storage device 40 between the comparison units 4021 and the result output register 405; the priority encoder encodes and outputs the matching result with the highest priority (e.g., the data item with the lowest address).
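The selection rule such a priority encoder implements — lowest matching address wins, as mentioned in the text — reduces to the following sketch (illustrative only):

```python
def priority_encode(match_addresses):
    """Pick the single highest-priority result when several data items
    match; here priority means lowest address first."""
    return min(match_addresses) if match_addresses else None

print(priority_encode([12, 3, 40]))  # → 3
print(priority_encode([]))           # → None (no match at all)
```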
In one possible implementation, a matching address history register may be added to the content addressable storage device 40. When a match occurs for the first time, the address at which the match occurs is recorded; when a match occurs again later, whether to update the matching result and the matching address history register is determined by comparing the recorded address with the new matching address. Under a low-address-priority policy, for example, an old, larger matching address and its result are always refreshed with a new, smaller matching address and result.
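The update rule of such a history register under the low-address-priority policy can be sketched as a tiny state holder (class and field names are illustrative, not from the patent):

```python
class MatchHistoryRegister:
    """Keeps the best match seen so far under low-address priority:
    a new match with a smaller address replaces the stored one."""
    def __init__(self):
        self.addr = None
        self.result = None

    def update(self, addr, result):
        if self.addr is None or addr < self.addr:  # lower address wins
            self.addr, self.result = addr, result

reg = MatchHistoryRegister()
reg.update(9, "r9")   # first match: recorded
reg.update(4, "r4")   # smaller address: refreshes the register
reg.update(7, "r7")   # larger address: ignored
print(reg.addr, reg.result)  # → 4 r4
```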
In a possible implementation manner, the data to be matched includes M data items, where each data item includes content to be matched, query control information, and an output result; each of the K target comparison units is specifically configured to compare the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and to output the output result in the matched data item as the matching result of the corresponding search data. That is, in the above search, the query control information and the corresponding output result (for example, an address corresponding to the content to be matched) may be carried in the data items included in the data to be matched. The query control information may specify, for example, that the comparison is performed using a min–max range or a bit mask, or in combination with other logical operations, as determined by system requirements. For example, the data items stored in the memory may be defined as a ternary data set of {content to be matched, query control information, output result}, such as a multivariate data set {min, max, mask, item_en, result} for one or more data items, indicating that when the search data d0 to be queried satisfies min < (d0 & mask) < max and item_en is equal to 1, the data item is a matching data item of the search data, and result, i.e., the matching result in this application, may be output.
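The match condition for a {min, max, mask, item_en, result} data item spelled out above translates directly into a predicate. This is a minimal sketch; the dictionary representation of a data item is an assumption for illustration.

```python
def entry_matches(search, entry):
    """Evaluate the {min, max, mask, item_en, result} condition from
    the text: the data item matches when item_en == 1 and
    min < (search & mask) < max; return its result on a match."""
    if entry["item_en"] != 1:
        return None                  # data item disabled
    masked = search & entry["mask"]  # apply the bit mask to d0
    if entry["min"] < masked < entry["max"]:
        return entry["result"]       # the matching result to output
    return None

item = {"min": 0x10, "max": 0x1F, "mask": 0xFF, "item_en": 1, "result": 42}
print(entry_matches(0x17, item))   # → 42
print(entry_matches(0x20, item))   # → None (outside the min-max range)
```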
The scheduler 401 of the content addressable storage apparatus 40 in fig. 4 to fig. 6 may be a hardware circuit module, or may be a functional device running software. Similarly, the controller 403 may also be a hardware circuit module or a functional device running software. It will be appreciated that, in general, the speed is faster when hardware circuit modules are used than when software is used to implement the corresponding functions. The scheduler 401 may be deployed within the controller 403, i.e., considered to be physically part of the controller 403 or integrated with the controller 403; alternatively, the scheduler 401 may also be disposed outside the controller 403, or of course, a part of the scheduler 401 may also be disposed inside the controller 403, and another part of the scheduler is disposed outside the controller 403, which is not specifically limited in this embodiment of the present application.
It should be noted that the content addressable storage device in the present application can have three operation modes: a read mode, a write mode, and a match mode. In the read and write modes, data access and operation in the memory are the same as in a normal memory. The match mode implements the content-based lookup function of the content addressable storage device described above in fig. 4 to fig. 6.
The data search function of the content addressable storage device in this application can be used in various application scenarios, such as virtual memory, data compression, pattern recognition, image processing, caching, and table-lookup applications. For example, when performing Media Access Control (MAC) address retrieval, a switch (which may include any content addressable storage device described in this application) uses a MAC address as a key (the search data) to retrieve a corresponding index value from a MAC-CAM table (i.e., the data to be matched in this application). The MAC-CAM table is the address table (generally referred to as the "CAM table") that the switch maintains for layer-2 switching on the Ethernet, and it records the correspondence between MAC addresses and egress interfaces. The switch therefore makes a decision whenever it receives an Ethernet frame: it extracts the destination MAC address of the data frame, and if the frame is not addressed to the switch itself, queries the CAM table according to the destination MAC address; if there is a hit (i.e., a forwarding entry corresponding to the MAC address is found in the CAM table), the frame is forwarded according to the result of the query (usually an egress interface list); if not, the data frame is broadcast to all ports. The CAM table of the switch can be obtained in a number of ways, such as static configuration or dynamic learning.
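The hit-or-flood forwarding decision described above can be sketched with a plain dictionary standing in for the CAM table (the MAC addresses and port numbers below are made up for illustration):

```python
def forward_decision(cam_table, dst_mac, all_ports):
    """Look the destination MAC up in the CAM table; on a hit forward to
    the learned egress interface, otherwise flood to all ports."""
    port = cam_table.get(dst_mac)   # key = MAC address (the search data)
    if port is not None:
        return [port]               # hit: egress interface list
    return all_ports                # miss: broadcast to all ports

cam = {"aa:bb:cc:00:00:01": 1, "aa:bb:cc:00:00:02": 3}
print(forward_decision(cam, "aa:bb:cc:00:00:02", [1, 2, 3, 4]))  # → [3]
print(forward_decision(cam, "ff:ee:dd:00:00:09", [1, 2, 3, 4]))  # → [1, 2, 3, 4]
```

In hardware the dictionary lookup would be exactly the content-based match performed by the comparison units, with the egress interface stored as the `result` field of the data item.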
When the content addressable storage device only needs to emulate general CAM operation, only one set of table-entry storage resources (the data to be matched) is required; the throughput (pps) performance can be scaled up linearly by adding a small amount of control logic, the shape of the storage does not need to change significantly (compared with the original table-lookup matching requirement), and the advantages in cost and power consumption are preserved while throughput and realizability are guaranteed. If more flexible dynamic table-lookup requirements are needed, operation information related to the table entries can conveniently be stored in the memory at the same time, so that additional features such as table-entry enabling, bit masks, numeric range masks, and out-of-order table-entry priority can easily be supported, greatly improving the flexibility of table-lookup matching. Meanwhile, combining a specific scenario with word-width expansion further balances performance against area and power consumption, improving the energy-efficiency ratio. In summary, the content addressable storage device provided in the present application has at least the following advantages:
1. Low cost and low power consumption: the matched application scenario is one with a relatively low pps peak, so a high-cost, high-power CAM structure is avoided and both cost (area) and power consumption (average and peak) are reduced; meanwhile, because the circuit is simpler, the risk in the back-end engineering process is also reduced;
2. Strong portability: because a general-purpose memory (Memory) structure can be used, the logic scheme is highly portable and can be directly synthesized for scenarios such as emulation (Emu) and FPGA, occupying only general logic resources such as LUTs and greatly reducing emulation and verification cost;
3. Strong expandability: a TCAM can easily be realized through a small amount of control and modification of the table-entry content, and fuzzy CAM processing in other complex modes (such as table-entry enabling, or fuzzy modes with min and max) can be planned into the Memory table-entry content for dynamic matching processing; the processing delay is easy to fix, so a pipeline structure is easy to form in the overall processing of the system; the word-width dimension is easy to combine and expand, further improving the cost-performance balance; online dynamic table-entry update operations are easy to implement, satisfying service scenarios with strict online requirements; and in partial applications, memory (Memory) resources of other non-coexisting scenarios can be multiplexed, further reducing cost.
Referring to fig. 8, fig. 8 is a schematic flowchart of a content addressable storage method according to an embodiment of the present application. The content addressable storage method is applicable to any one of the content addressable storage apparatuses in fig. 4 to fig. 7 and to a device including such an apparatus, where the content addressable storage apparatus includes: a memory and a comparator, where the comparator includes N comparison units, the N comparison units are respectively coupled to the memory, and N is an integer greater than 1; the method may include the following steps S801 to S804.
S801: storing data to be matched in the memory.
S802: acquiring K search data, and respectively dispatching the K search data to K target comparison units in the N comparison units, wherein the target comparison units are in idle states, and K is an integer which is greater than or equal to 1 and less than or equal to N.
S803: and reading out the data to be matched from the memory and sending the data to the target comparison unit.
S804: and comparing the corresponding search data with the data to be matched through each target comparison unit in the K target comparison units, and outputting a corresponding matching result according to the comparison result.
In a possible implementation manner, the apparatus further includes a result output register, and the N comparison units are respectively coupled to the result output register; the data to be matched comprises a plurality of data items; the method further comprises the following steps: sending a matching result corresponding to each target comparing unit in the K target comparing units to the result output register, wherein the matching result comprises one or more of matching indication information, a matching data item of corresponding search data and an address of the matching data item, and the matching indication information is used for indicating whether a matching data item exists or not; and receiving and storing the matching results respectively sent by the K target comparison units through the result output register.
In a possible implementation manner, the data to be matched includes M data items, where M is an integer greater than 1; the controlling the memory to read out the data to be matched and respectively send the data to the K target comparison units includes:
controlling the memory to read out the M data items in sequence and send the M data items to the K target comparison units respectively according to an address traversal mode; the comparing, by each of the K target comparing units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparing result, includes: and each target comparison unit in the K target comparison units compares the corresponding search data with the M data items in sequence to determine the matching data item of the corresponding search data.
In a possible implementation manner, the data to be matched includes M data items, where M is an integer greater than 1; the controlling the memory to read out the data to be matched and respectively send the data to the K target comparison units includes:
controlling the memory to read out L data items in each clock cycle in an address polling mode, and broadcasting the read-out L data items to the N comparison units, wherein L is a positive integer less than or equal to M; comparing the corresponding search data with the data to be matched by each of the K target comparison units, including: comparing, by each of the K comparison units, the L data items received at each time with the corresponding search data.
In one possible implementation, the bit width of each of the N comparison units is L times the bit width of each of the M data items.
In one possible implementation, the method further includes: and controlling the memory to write the data to be matched in a preset mode.
In one possible implementation, the method further includes: and when the search data received in a preset time period exceeds a preset number or no idle comparison unit exists currently, controlling to determine a target comparison unit from the N comparison units after a preset time interval.
In one possible implementation, the method further includes: and in the process of controlling the memory to read out the data to be matched, controlling the memory to write in new data to be matched or modifying the data to be matched.
In one possible implementation manner, the data to be matched includes a plurality of data items, where each data item includes content to be matched, query control information, and an output result; the comparing, by each of the K target comparing units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparing result, includes: and comparing the corresponding search data with the content to be matched in each data item according to the query control information of the M data items by each target comparison unit in the K target comparison units, and outputting the output result in the matched data item as the matching result of the corresponding search data.
It should be noted that, for specific flows in the content addressable storage method described in the embodiment of the present application, reference may be made to the related descriptions in the embodiment of the application described in fig. 4 to fig. 7, and details are not repeated here.
The embodiment of the present application further provides a computer storage medium, wherein the computer storage medium may store a program, and the program includes some or all of the steps of any one of the method embodiments described above when executed.
Embodiments of the present application also provide a computer program, which includes instructions that, when executed by a computer, enable the computer to perform some or all of the steps of any one of the content addressable storage methods. In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the above-described division of the units is only one type of division of logical functions, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be an electric or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute all or part of the steps of the above-described method of the embodiments of the present application. The storage medium may include: a U-disk, a removable hard disk, a magnetic disk, an optical disk, a Read-Only Memory (ROM) or a Random Access Memory (RAM), and the like.

Claims (20)

  1. A content addressable storage device, comprising:
    the memory is used for storing data to be matched;
    a comparator comprising N comparison units respectively coupled to the memory, N being an integer greater than 1;
    the scheduler is used for acquiring K search data and scheduling the K search data to K target comparison units in the N comparison units respectively, wherein the target comparison units are comparison units in an idle state, and K is an integer which is greater than or equal to 1 and less than or equal to N;
    the controller is used for controlling the memory to read out the data to be matched and respectively sending the data to the K target comparison units;
    and each target comparison unit in the K target comparison units is used for comparing the corresponding search data with the data to be matched and outputting a corresponding matching result according to the comparison result.
  2. The apparatus of claim 1, further comprising a result output register, the N comparison units being respectively coupled to the result output register; the data to be matched comprises a plurality of data items;
    each target comparison unit in the K target comparison units is further configured to send a corresponding matching result to the result output register, where the matching result includes one or more of matching indication information, a matching data item of corresponding search data, and an address of the matching data item, where the matching indication information is used to indicate whether there is a matching data item;
    and the result output register is used for receiving and storing the matching results respectively sent by the K target comparison units.
  3. The apparatus according to claim 1 or 2, wherein the data to be matched comprises M data items, M being an integer greater than 1;
    the controller is specifically configured to control the memory to read out the M data items in sequence and send the M data items to the K target comparison units respectively in an address traversal manner;
    each of the K target comparison units is specifically configured to compare the corresponding search data with the M data items in sequence, and determine a matching data item of the corresponding search data.
  4. The apparatus according to claim 1 or 2, wherein the data to be matched comprises M data items;
    the controller is specifically configured to control the memory to read out L data items in each clock cycle in an address polling manner, and broadcast the read out L data items to the N comparison units, where L is a positive integer less than or equal to M;
    each of the K comparison units is configured to compare L data items received each time with corresponding search data.
  5. The apparatus of claim 4, wherein each of the N comparison units has a bit width that is L times a bit width of each of the M data items.
  6. The apparatus of any of claims 1-5, wherein the controller is further configured to control the memory to write the data to be matched in a predetermined manner.
  7. The apparatus of any one of claims 1-6,
    the scheduler is further configured to determine a target comparison unit from the N comparison units after a preset time interval if the search data received within a preset time period exceeds a preset number or if there is no idle comparison unit currently.
  8. The apparatus of any of claims 1-7, wherein the controller is further configured to control the memory to write new data to be matched or modify the data to be matched in controlling the memory to read the data to be matched.
  9. The apparatus of claim 1, wherein the data to be matched comprises M data items, wherein each data item comprises content to be matched, query control information, and an output result;
    each of the K target comparison units is specifically configured to compare corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and output an output result in the matched data item as a matching result of the corresponding search data.
  10. A content addressable storage method, applied to a content addressable storage apparatus, the apparatus comprising: a memory and a comparator, wherein the comparator comprises N comparison units, the N comparison units are respectively coupled to the memory, and N is an integer greater than 1; the method comprises the following steps:
    storing data to be matched in the memory;
    acquiring K search data, and respectively scheduling the K search data to K target comparison units in the N comparison units, wherein the target comparison units are in idle states, and K is an integer greater than or equal to 1 and less than or equal to N;
    controlling the memory to read the data to be matched and respectively sending the data to the K target comparison units;
    and comparing the corresponding search data with the data to be matched through each target comparison unit in the K target comparison units, and outputting a corresponding matching result according to the comparison result.
  11. The method of claim 10, wherein the apparatus further comprises a result output register, the N comparison units being respectively coupled to the result output register; the data to be matched comprises a plurality of data items; the method further comprises the following steps:
    sending a matching result corresponding to each target comparing unit in the K target comparing units to the result output register, wherein the matching result comprises one or more of matching indication information, a matching data item of corresponding search data and an address of the matching data item, and the matching indication information is used for indicating whether a matching data item exists or not;
    and receiving and storing the matching results respectively sent by the K target comparison units through the result output register.
  12. The method according to claim 10 or 11, wherein the data to be matched includes M data items, M being an integer greater than 1; the controlling the memory to read out the data to be matched and respectively send the data to the K target comparison units includes:
    controlling the memory to read out the M data items in sequence and send the M data items to the K target comparison units respectively according to an address traversal mode;
    the comparing, by each of the K target comparing units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparing result, includes:
    and each target comparison unit in the K target comparison units compares the corresponding search data with the M data items in sequence to determine the matching data item of the corresponding search data.
  13. The method according to claim 10 or 11, wherein the data to be matched includes M data items, M being an integer greater than 1; the controlling the memory to read out the data to be matched and respectively send the data to the K target comparison units includes:
    controlling the memory to read out L data items in each clock cycle in an address polling mode, and broadcasting the read-out L data items to the N comparison units, wherein L is a positive integer less than or equal to M;
    comparing the corresponding search data with the data to be matched by each of the K target comparison units, including:
    comparing, by each of the K comparison units, the L data items received at each time with the corresponding search data.
  14. The method of claim 13, wherein each of the N comparison units has a bit width that is L times a bit width of each of the M data items.
  15. The method of any one of claims 10-14, further comprising:
    and controlling the memory to write the data to be matched in a preset mode.
  16. The method of any one of claims 10-15, further comprising:
    and when the search data received in a preset time period exceeds a preset number or no idle comparison unit exists currently, controlling to determine a target comparison unit from the N comparison units after a preset time interval.
  17. The method of any one of claims 10-16, further comprising:
    and in the process of controlling the memory to read out the data to be matched, controlling the memory to write in new data to be matched or modifying the data to be matched.
  18. The method of claim 10, wherein the data to be matched comprises a plurality of data items, wherein each data item comprises content to be matched, query control information, and an output result;
    the comparing, by each of the K target comparing units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparing result, includes:
    and comparing the corresponding search data with the content to be matched in each data item according to the query control information of the M data items by each target comparison unit in the K target comparison units, and outputting the output result in the matched data item as the matching result of the corresponding search data.
  19. A semiconductor chip, comprising:
    the content addressable storage device of any of claims 1 to 9, a processor coupled to the content addressable storage device, and a memory external to the content addressable storage device.
  20. An electronic device, comprising:
    the content addressable storage device of any of claims 1 to 9, and a discrete device coupled to the content addressable storage device.
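The scheme the claims describe (notably claims 13 and 18) can be sketched as a small software model. All names and structure below are illustrative assumptions, not taken from the patent: the memory holds data items each carrying content to be matched, query control information (modelled here as a TCAM-style care-mask), and a pre-stored output result; each clock "cycle" the memory reads out L items by address polling and broadcasts them to every comparison unit, and each target unit checks the broadcast items against its own search data.

```python
from collections import namedtuple

# One stored entry: content to be matched, query control information
# (a care-mask: bits set to 1 must match), and the pre-stored output result.
DataItem = namedtuple("DataItem", ["content", "mask", "result"])

def cam_search(memory, search_keys, L):
    """Return, per search key, the output result of the first matching
    data item, or None if nothing matched."""
    results = {key: None for key in search_keys}
    # Address polling: step through the memory L items per cycle.
    for base in range(0, len(memory), L):
        broadcast = memory[base:base + L]   # the L items read out this cycle
        for key in search_keys:             # every unit sees the same broadcast
            if results[key] is not None:
                continue                    # this unit has already matched
            for item in broadcast:
                # Compare only the bits the query control info cares about.
                if (key & item.mask) == (item.content & item.mask):
                    results[key] = item.result
                    break
    return results
```

For example, with two stored items, `DataItem(0b1010, 0b1111, "A")` (exact match) and `DataItem(0b1000, 0b1100, "B")` (only the top two bits cared about), the key `0b1001` misses the first item but hits the second, illustrating how the per-item query control information widens a match beyond exact equality.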
CN201980096977.9A 2019-05-31 2019-05-31 Content addressable storage device, method and related equipment Pending CN113966532A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/089687 WO2020237682A1 (en) 2019-05-31 2019-05-31 Content-addressable storage apparatus and method, and related device

Publications (1)

Publication Number Publication Date
CN113966532A (en) 2022-01-21

Family

ID=73552497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980096977.9A Pending CN113966532A (en) 2019-05-31 2019-05-31 Content addressable storage device, method and related equipment

Country Status (2)

Country Link
CN (1) CN113966532A (en)
WO (1) WO2020237682A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448733B (en) * 2021-07-12 2024-05-28 中国银行股份有限公司 Data processing method and system
CN114356418B (en) * 2022-03-10 2022-08-05 之江实验室 Intelligent table entry controller and control method

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US20060067356A1 (en) * 2004-08-23 2006-03-30 Han-Gyoo Kim Method and apparatus for network direct attached storage
US9082484B1 (en) * 2013-12-23 2015-07-14 International Business Machines Corporation Partial update in a ternary content addressable memory
US10068645B2 (en) * 2016-05-31 2018-09-04 Qualcomm Incorporated Multiple cycle search content addressable memory
CN109408898B (en) * 2018-09-28 2023-03-31 北京时代民芯科技有限公司 On-chip CAM (computer-aided manufacturing) structure system and implementation method thereof

Also Published As

Publication number Publication date
WO2020237682A1 (en) 2020-12-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination