WO2020237682A1 - Content-addressable storage apparatus and method, and related device - Google Patents

Content-addressable storage apparatus and method, and related device

Info

Publication number
WO2020237682A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
matched
memory
comparison units
target
Prior art date
Application number
PCT/CN2019/089687
Other languages
French (fr)
Chinese (zh)
Inventor
杨兵
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201980096977.9A priority Critical patent/CN113966532A/en
Priority to PCT/CN2019/089687 priority patent/WO2020237682A1/en
Publication of WO2020237682A1 publication Critical patent/WO2020237682A1/en

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C15/00 Digital stores in which information comprising one or more characteristic parts is written into the store and in which information is read-out by searching for one or more of these characteristic parts, i.e. associative or content-addressed stores
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers

Definitions

  • This application relates to the field of memory technology, and in particular to a content addressable storage device, method and related equipment.
  • Content-addressable memory (CAM) is a special kind of storage-array random access memory (Random Access Memory, RAM). In addition to reading and writing like traditional RAM, it can also perform search operations. Its search mechanism compares an input data item with all data items stored in the CAM, determines whether the input data item matches any stored data item, and outputs the matching information corresponding to that data item. FIG. 1A is a schematic diagram of the functional difference between RAM and CAM in the prior art: RAM looks up the corresponding data according to an input address, whereas CAM looks up the corresponding address according to input data.
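  • The functional difference between the two access styles can be illustrated with a minimal software sketch. The stored contents and the function names `ram_read` and `cam_search` are hypothetical and chosen only for illustration; a real CAM performs the comparison in hardware.

```python
# Toy model: a RAM maps address -> data; a CAM maps data -> address.
stored = ["0101", "1100", "0011", "1100"]  # hypothetical contents, index = address

def ram_read(address):
    """RAM-style access: return the data stored at an address."""
    return stored[address]

def cam_search(key):
    """CAM-style access: return every address whose stored data equals the key."""
    return [addr for addr, data in enumerate(stored) if data == key]

print(ram_read(2))         # data stored at address 2
print(cam_search("1100"))  # addresses whose content is "1100"
```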
  • Fig. 1B is a schematic diagram of a typical CAM structure in the prior art.
  • CAM mainly consists of the following parts: a CAM array (M array), sense amplifiers (SA), search line drivers (SL drivers), and an encoder.
  • the CAM array in FIG. 1B includes (W × K) CAM memory cells M, and each memory cell M is responsible for storing data and comparing the stored data with external search data.
  • a row of memory cells M constitutes a word of the CAM (also called a data item, data table item, table item, etc.), W is the "word width" or "bit width", the number of all words in the CAM array (K in the figure, as an example) is called the "depth", and the capacity of the CAM is expressed as (K words × W bits).
  • the search data (Search Key) is loaded onto the search lines (SL) through the search line drivers (SL drivers), the data stored in each memory cell M is compared with the data on the SL, and the matching result is reflected on the match line (Match Line, ML) in the form of a level change. The voltage change on the ML is amplified by the sense amplifier (SA) and finally output to the priority encoder, which encodes and outputs the address of the highest-priority data item that matches the search data.
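  • The match-line and priority-encoder stages can be sketched in software as follows. This is only an illustrative model: the stored words are hypothetical, and the rule that the lowest address has the highest priority is an assumption, since the text does not state the priority order.

```python
def match_lines(entries, search_key):
    """One match-line bit per stored word: True where the word equals the key."""
    return [word == search_key for word in entries]

def priority_encode(lines):
    """Return the address of the first asserted match line, or None if no match.
    Assumption: the lowest address has the highest priority."""
    for address, hit in enumerate(lines):
        if hit:
            return address
    return None

entries = [0b1010, 0b0110, 0b1010, 0b0001]  # hypothetical stored words
lines = match_lines(entries, 0b1010)
print(lines, priority_encode(lines))
```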
  • when a CAM searches for content, it uses the comparison circuit in each memory cell M to compare the search data with the data stored in all memory cells at the same time, so that all data items in the entire CAM array can be queried in as little as one clock cycle. This achieves fast matching and high parallelism, and the search speed is not affected by the CAM capacity.
  • the data in the memory can be read one by one in the order of address for comparison.
  • CAM realizes the function of high-speed search through hardware circuit, which greatly improves the search efficiency and performance of the search system. It is widely used in network communication, pattern recognition and other fields.
  • the embodiments of the present application provide a content addressable storage device, method, and related equipment, which can realize the content addressable function while ensuring the low power consumption and flexibility of the content addressable storage device.
  • an embodiment of the present application provides a content-addressable storage device, which may include: a memory for storing data to be matched; a comparator including N comparison units, where the N comparison units are respectively coupled to the memory and N is an integer greater than 1; a scheduler for obtaining K search data and scheduling the K search data to K target comparison units among the N comparison units, where each target comparison unit is a comparison unit in an idle state and K is an integer greater than or equal to 1 and less than or equal to N; a controller for controlling the memory to read out the data to be matched and send it to the K target comparison units respectively; and each of the K target comparison units is used to compare the corresponding search data with the data to be matched and to output a corresponding matching result according to the comparison result.
  • multiple comparison units in the comparator are used as dynamic, flexible and callable comparison resources in the content addressable storage device.
  • the idle comparison unit in the unit sequentially compares the search data with the data to be matched in the memory.
  • since the comparator includes N comparison units, a maximum of N search data can be searched in parallel at the same time.
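  • A minimal sketch of how a scheduler might hand search data to idle comparison units is given below. The class `ComparisonUnit`, the function `schedule`, and the choice of N = 4 are hypothetical; the text does not specify how pending search data are handled, so the `pending` list is an assumption.

```python
class ComparisonUnit:
    def __init__(self, uid):
        self.uid = uid
        self.search_key = None  # None means the unit is idle

    @property
    def idle(self):
        return self.search_key is None

def schedule(units, search_keys):
    """Assign each search key to an idle unit; return (assigned, pending)."""
    assigned, pending = [], []
    idle_units = [u for u in units if u.idle]
    for key in search_keys:
        if idle_units:
            unit = idle_units.pop(0)
            unit.search_key = key          # the unit becomes a target comparison unit
            assigned.append((unit.uid, key))
        else:
            pending.append(key)            # no idle unit left for this key
    return assigned, pending

units = [ComparisonUnit(i) for i in range(4)]  # N = 4 (hypothetical)
units[1].search_key = "busy"                   # unit 1 is already searching
assigned, pending = schedule(units, ["a", "b", "c", "d"])
print(assigned, pending)
```

  • With one of the four units already busy, only three search data can be dispatched at once, which reflects the limit of at most N parallel searches.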
  • each memory cell of the CAM array has a comparison circuit.
  • because each memory cell has a dedicated comparison circuit, this causes problems such as large chip area, high power consumption, and high cost.
  • the comparison unit of the comparator in the content addressable storage device is shared, and the parallel search function of search data can be realized according to actual search requirements, which doubles the processing capacity of the device and meets the network transient.
  • the use of high-cost and high-power CAM structures is avoided, which ensures performance and reduces costs and power consumption.
  • the memory in the device does not need a dedicated CAM structure
  • a relatively general memory structure can be used, and the other functional structures in the device (scheduler, controller, etc.) can be implemented in a general hardware description language such as Verilog, which makes the device highly portable and flexible and thereby helps ensure the high availability, low cost, and low power consumption of the content addressable storage device.
  • the device further includes a result output register, and the N comparison units are respectively coupled to the result output register; the data to be matched includes multiple data items; each of the K target comparison units is further configured to send a corresponding matching result to the result output register, and the matching result includes one or more of matching indication information, the matching data item of the corresponding search data, and the address of the matching data item.
  • the N comparison units in the embodiment of the present application are respectively coupled to the result output register in the content addressable storage device. When any one of the N comparison units completes the matching of its search data, the matching result can be sent to the result output register, where the matching result can be one or more of whether the matching is successful, the specific matching data, or the address of the matching data.
  • the data to be matched includes M data items.
  • M is an integer greater than 1; the controller is specifically configured to control the memory to sequentially read out the M data items in an address traversal manner and send them to the K target comparison units; each of the K target comparison units is specifically configured to sequentially compare the corresponding search data with the M data items to determine the matching data items of the corresponding search data.
  • the M data items stored in the memory are serially read out through address traversal, and are gradually sent to the corresponding target comparison unit for comparison, so as to obtain the corresponding matching result.
  • the data to be matched includes M data items; the controller is specifically configured to control the memory to read out L data items in an address polling manner in each clock cycle, and to broadcast the read L data items to the N comparison units, where L is a positive integer less than or equal to M; each of the K comparison units is used to compare the L data items received each time with the corresponding search data.
  • when there are multiple search data to be searched, multiple corresponding comparison units perform search operations at the same time. Specifically, the controller may read out L data items from the memory and send them in parallel.
  • the comparison unit currently performing the search operation compares the L pieces of data received each time with the stored search data to obtain the corresponding matching result.
  • the comparator includes 8 comparison units, and currently 4 comparison units are performing data comparison.
  • the controller reads out 2 data items from the 64 data items to be matched every clock cycle and sends them to the above 4 comparison units in parallel.
  • each comparison unit compares the two data items received each time with the search data stored in itself. Once a comparison unit has obtained its matching result (after comparing with all or part of the 64 data items to be matched), its search operation can be stopped. At this point, the controller no longer sends the data items read from the memory to the comparison unit that has completed the search operation.
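  • The example above (64 data items, 2 items broadcast per clock cycle, several comparison units searching at once) can be approximated by the following sketch. It is a simplified model under stated assumptions: the function `broadcast_search` and the item values are hypothetical, each search key stands for one active comparison unit, and a unit is assumed to stop at its first match.

```python
def broadcast_search(items, search_keys, per_cycle):
    """Return {key: (matched_address or None, cycles_used)}.
    Items are broadcast per_cycle at a time; a unit stops at its first match."""
    results = {}
    active = set(search_keys)          # one active comparison unit per key
    cycle = 0
    for base in range(0, len(items), per_cycle):
        cycle += 1
        chunk = items[base:base + per_cycle]
        for key in list(active):
            for offset, item in enumerate(chunk):
                if item == key:
                    results[key] = (base + offset, cycle)
                    active.discard(key)   # this unit no longer receives data
                    break
        if not active:
            break
    for key in active:                 # units that traversed everything without a match
        results[key] = (None, cycle)
    return results

items = list(range(100, 164))  # 64 hypothetical data items at addresses 0..63
out = broadcast_search(items, [100, 131, 163, 999], per_cycle=2)
print(out)
```

  • In this model a full traversal takes 64 / 2 = 32 cycles, while a unit whose key sits at a low address finishes earlier and is released sooner.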
  • the bit width of each of the N comparison units is L times the bit width of each of the M data items.
  • the bit width of each comparison unit in the comparator may be L times the bit width of each data item in the data to be matched stored in the memory.
  • the comparison unit may compare L data items per clock cycle. For example, if the bit width of each data item is W and the comparison bit width of the comparison unit is L*W, then the comparison unit can complete the comparison of L*W bits of data in one clock cycle.
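  • A rough software analogue of an L*W-bit comparison is sketched below. The values W = 8 and L = 2, the lane ordering (first item in the low-order bits), and the function names `pack` and `wide_compare` are all assumptions made for illustration.

```python
W = 8  # bit width of one data item (hypothetical)
L = 2  # data items compared per clock cycle (hypothetical)

def pack(items, width=W):
    """Concatenate items into one wide word, first item in the low-order bits."""
    word = 0
    for i, item in enumerate(items):
        word |= (item & ((1 << width) - 1)) << (i * width)
    return word

def wide_compare(wide_word, key, lanes=L, width=W):
    """Compare the key against each W-bit lane; return per-lane match flags."""
    mask = (1 << width) - 1
    return [((wide_word >> (i * width)) & mask) == key for i in range(lanes)]

word = pack([0x3C, 0xA5])          # one L*W = 16-bit word holding two data items
print(hex(word), wide_compare(word, 0xA5))
```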
  • the controller is further configured to control the memory to write the data to be matched in a preset manner.
  • the controller can complete the initial configuration of the data to be matched in a preset manner under the control of a processor external or internal to the device in the initial stage, for example, by enumerating the addresses of all the data to be matched and writing the data to be matched corresponding to each address into the memory; the content of the data to be matched depends on the specific business requirements.
  • the scheduler is further configured to, when the number of search data received within a preset time period exceeds a preset number, or when there is currently no idle comparison unit, wait for a preset time interval and then determine the target comparison unit from the N comparison units.
  • when the throughput in the network is high, for example when the peak pps of an instantaneous burst is large, or when there is currently no idle comparison unit, the comparison can be deferred for a certain period of time through traffic shaping. This improves the adaptability of the content addressable storage device in the embodiments of the present application and reduces the demand for instantaneous comparison resources.
  • the controller is further configured to control the memory to write new data to be matched or modify the data to be matched during the process of controlling the memory to read the data to be matched.
  • the data to be matched can be written into the memory while it is being read.
  • for example, a two-port memory with independent read and write ports can be selected so that the data to be matched can be read and updated or modified at the same time; if a single-port memory is selected, the controller needs to insert some additional refresh time while reading the data to be matched in order to refresh it, which may reduce the query speed.
  • the refresh time means that, during a refresh (writing a table entry/data item), the sequential increment of the query (reading a table entry/data item) address is suspended (while the write enable is invalid); after the writing is completed, the query is re-enabled (read enable, with the read address increasing sequentially).
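  • The cost of inserting refresh time on a single-port memory can be illustrated with a small cycle-level sketch. It assumes one port operation per cycle and that a write simply stalls the read address for that cycle; the function `traverse_with_refresh` and its `writes` argument are hypothetical, and the enable signals are not modeled.

```python
def traverse_with_refresh(memory, writes):
    """Read every address once on a single-port memory.
    writes maps a cycle number to (address, value); a write cycle stalls the read."""
    seen, read_addr, cycle = [], 0, 0
    while read_addr < len(memory):
        if cycle in writes:
            addr, value = writes[cycle]
            memory[addr] = value            # the port is used for the write this cycle
        else:
            seen.append(memory[read_addr])  # the port is used for the read this cycle
            read_addr += 1                  # the read address resumes incrementing
        cycle += 1
    return seen, cycle

memory = [10, 11, 12, 13]
seen, cycles = traverse_with_refresh(memory, {1: (3, 99)})
print(seen, cycles)
```

  • One inserted write extends the four-entry traversal from 4 to 5 cycles, which is the reduction in query speed mentioned above; a two-port memory would avoid the stall.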
  • the data to be matched includes M data items, where each data item includes content to be matched, query control information, and an output result; each of the K target comparison units is specifically configured to compare the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and to output the output result in the matched data item as the matching result of the corresponding search data.
  • the query control information and the corresponding output result are carried in the data items contained in the data to be matched (for example, the output result is the address corresponding to the content to be matched), thereby controlling the comparison unit to search for data
  • the embodiments of the present application provide a content addressable storage method, which can be applied to a content addressable storage device.
  • the device includes a comparator and a memory.
  • the comparator includes N comparison units.
  • the N comparison units are respectively coupled to the memory, and N is an integer greater than 1.
  • the method includes: storing the data to be matched in the memory; obtaining K search data and scheduling them respectively to K target comparison units among the N comparison units, where each target comparison unit is a comparison unit in an idle state and K is an integer greater than or equal to 1 and less than or equal to N; controlling the memory to read out the data to be matched and send it to the K target comparison units respectively; and comparing, by each of the K target comparison units, the corresponding search data with the data to be matched, and outputting the corresponding matching result according to the comparison result.
  • the device further includes a result output register, and the N comparison units are respectively coupled to the result output register; the data to be matched includes multiple data items; the method further includes: sending the matching result corresponding to each of the K target comparison units to the result output register, where the matching result includes one or more of matching indication information, the matching data item of the corresponding search data, and the address of the matching data item, and the matching indication information is used to indicate whether there is a matching data item; and receiving and storing, through the result output register, the matching results sent by the K target comparison units.
  • the data to be matched includes M data items, and M is an integer greater than 1; the controlling the memory to read out the data to be matched and send it to the K target comparison units respectively includes: controlling the memory to sequentially read out the M data items in an address traversal manner and send them to the K target comparison units; the comparing, by each of the K target comparison units, the corresponding search data with the data to be matched and outputting the corresponding matching result according to the comparison result includes: sequentially comparing, by each of the K target comparison units, the corresponding search data with the M data items to determine the matching data items of the corresponding search data.
  • the data to be matched includes M data items, and M is an integer greater than 1; the controlling the memory to read out the data to be matched and send it to the K target comparison units respectively includes: controlling the memory to read out L data items in each clock cycle in an address polling manner, and broadcasting the read L data items to the N comparison units, where L is a positive integer less than or equal to M; the comparing, by each of the K target comparison units, the corresponding search data with the data to be matched includes: comparing, by each of the K comparison units, the L data items received each time with the corresponding search data.
  • the bit width of each of the N comparison units is L times the bit width of each of the M data items.
  • the method further includes: controlling the memory to write the data to be matched in a preset manner.
  • the method further includes: when the number of search data received within a preset time period exceeds a preset number, or when there is currently no idle comparison unit, waiting for a preset time interval and then determining the target comparison unit from the N comparison units.
  • the method further includes: in the process of controlling the memory to read the data to be matched, controlling the memory to write new data to be matched or to modify the data to be matched.
  • the data to be matched includes multiple data items, wherein each data item includes content to be matched, query control information, and an output result; the comparing, by each of the K target comparison units, the corresponding search data with the data to be matched and outputting the corresponding matching result according to the comparison result includes: comparing, by each of the K target comparison units and according to the query control information of the M data items, the corresponding search data with the content to be matched in each data item, and using the output result in the matched data item as the matching result of the corresponding search data.
  • this application provides a semiconductor chip, which may include:
  • the present application provides a system-on-chip SoC chip.
  • the SoC chip includes the content addressable storage device provided by the first aspect or any one of the implementations of the first aspect, a processor coupled to the content addressable storage device, and an external memory of the content addressable storage device.
  • the present application provides a chip system, which includes: the content addressable storage device provided by the first aspect or any one of the implementations of the first aspect, and a chip including the processor of the content addressable storage device and the external memory of the content addressable storage device.
  • the chip system can be composed of chips, or include chips and other discrete devices.
  • the present application provides an electronic device that includes the foregoing first aspect and the content addressable storage device provided in combination with any one of the foregoing first aspect implementations, and the content addressable An external memory of the storage device, and a processor coupled to the content addressable storage device.
  • the external memory is used to store necessary program instructions and data
  • the processor is used to run the necessary general operating system of the electronic device, and is used to couple with the content addressable storage device to complete the content addressable storage device Related processing functions.
  • the electronic device may also include a communication interface for the electronic device to communicate with other devices or a communication network.
  • the present application provides a computer storage medium that stores a computer program, and when the computer program is executed by a processor, it implements the flow of the content addressable storage method provided by the second aspect or any one of the implementations of the second aspect.
  • an embodiment of the present application provides a computer program, the computer program includes instructions, when the computer program is executed by a computer, the computer can execute the above-mentioned second aspect and any combination of the above-mentioned second aspect.
  • FIG. 1A is a schematic diagram of the functional difference between RAM and CAM in the prior art
  • FIG. 1B is a schematic diagram of a typical CAM structure in the prior art
  • FIG. 2 is a schematic diagram of processing a simulated CAM based on Memory address traversal provided by an embodiment of the application;
  • FIG. 3 is a schematic diagram of extended bit-width simulated CAM processing based on Memory address traversal provided by an embodiment of the application;
  • FIG. 4 is a schematic structural diagram of a content addressable storage device provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of another content addressable storage device provided by an embodiment of the application.
  • FIG. 6 is a schematic structural diagram of another content addressable device provided by an embodiment of this application.
  • FIG. 7 is a sequence diagram of comparison of a comparison unit provided by an embodiment of the application.
  • FIG. 8 is a schematic flowchart of a content addressable storage method provided by an embodiment of the present application.
  • component used in this specification are used to denote computer-related entities, hardware, firmware, a combination of hardware and software, software, or software in execution.
  • the component may be, but is not limited to, a process, a processor, an object, an executable file, an execution thread, a program, and/or a computer running on a processor.
  • the application running on the computing device and the computing device can be components.
  • One or more components may reside in processes and/or threads of execution, and components may be located on one computer and/or distributed among two or more computers.
  • these components can be executed from various computer readable media having various data structures stored thereon.
  • the components may communicate through local and/or remote processes based on, for example, a signal having one or more data packets (such as data from two components interacting with another component in a local system, a distributed system, and/or a network, such as the Internet, which interacts with other systems through signals).
  • An SoC (System-on-Chip), also called a system on a chip, is a product: an integrated circuit with a dedicated purpose that contains a complete system together with the entire contents of its embedded software.
  • SoC is a kind of technology to realize the whole process from determining system functions to software/hardware division and completing the design.
  • Random access memory (RAM) can be further divided into two categories: Static Random Access Memory (SRAM) and Dynamic Random Access Memory (DRAM).
  • the basic principles of the two are similar in that both store charge inside the memory. SRAM has a more complicated structure, a smaller capacity per unit area, and a faster access speed; DRAM has a simpler structure, a larger storage capacity per unit area, and a slower access time than SRAM. In addition, DRAM has a relatively simple structure and its stored charge gradually dissipates over time, so it must be recharged regularly (Refresh) to maintain the data stored in the capacitors.
  • DDR SDRAM Double Data Rate Synchronous Dynamic Random Access Memory
  • DDR memory is developed on the basis of SDRAM memory, and still uses the SDRAM production system. As far as memory manufacturers are concerned, they only need to slightly improve the equipment for manufacturing ordinary SDRAM to realize the production of DDR memory, which can effectively reduce costs.
  • DDR technology implements two read/write operations in one clock cycle, that is, one read/write operation is performed on each of the rising and falling edges of the clock.
  • ROM Read Only Memory
  • ROM is a solid-state semiconductor memory that can only read data stored in advance. Its characteristic is that once the data is stored, it cannot be changed or deleted. It is usually used in electronic or computer systems that do not need to change data frequently, and the data will not disappear because the power is turned off.
  • Field Programmable Gate Array (FPGA) is a product further developed on the basis of programmable devices such as programmable array logic (PAL), general array logic (GAL), and complex programmable logic devices (CPLD). It emerged as a semi-custom circuit in the field of application-specific integrated circuits (ASIC), which both addresses the deficiencies of custom circuits and overcomes the limited number of gate circuits in the original programmable devices.
  • Emulator refers to a device or program that can simulate almost 100% of the characteristics and behaviors of a hardware or software system. Its purpose is to completely simulate how the emulated hardware reacts when it receives various kinds of external information.
  • the Round Robin Scheduling algorithm is to sequentially request scheduling/processing ready task queues or processes in a cyclic manner.
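  • A minimal round-robin picker is sketched below to make the cyclic order concrete. The function `round_robin_pick` and the `ready` flags are hypothetical; the text does not say how the algorithm is applied inside the device.

```python
def round_robin_pick(ready, last):
    """Return the index of the next ready requester after `last`, or None."""
    n = len(ready)
    for step in range(1, n + 1):
        candidate = (last + step) % n   # wrap around cyclically
        if ready[candidate]:
            return candidate
    return None

ready = [True, False, True, True]   # requester 1 has nothing to do
order, last = [], -1
for _ in range(4):
    last = round_robin_pick(ready, last)
    order.append(last)
print(order)
```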
  • Lookup Table (LUT) is essentially a RAM. At present, 4-input LUTs are mostly used in FPGA, so each LUT can be regarded as a RAM with 4-bit address lines.
  • the PLD/FPGA development software automatically calculates all possible results of the logic circuit and writes the truth table (i.e. the results) into the RAM in advance, so that each time a signal is input for a logic operation, it is equivalent to inputting an address to look up the table, finding the content at that address, and then outputting it.
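  • The idea of a 4-input LUT as a 16-entry RAM can be sketched as follows. The logic function used here is an arbitrary example, and the helper names `build_lut` and `lut_eval` are hypothetical.

```python
def build_lut(func, inputs=4):
    """Precompute the truth table of `func` for every input combination."""
    table = []
    for address in range(1 << inputs):
        bits = [(address >> i) & 1 for i in range(inputs)]
        table.append(func(*bits) & 1)
    return table

def lut_eval(table, a, b, c, d):
    """Evaluate the logic by using the input bits as a 4-bit address."""
    return table[a | (b << 1) | (c << 2) | (d << 3)]

# Hypothetical logic function: (a AND b) OR (c XOR d)
lut = build_lut(lambda a, b, c, d: (a & b) | (c ^ d))
print(len(lut), lut_eval(lut, 1, 1, 0, 0))
```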
  • Packets per second (pps) is a commonly used unit of network throughput (that is, how many data packets are sent per second). The performance of a network is usually measured by throughput.
  • the packet forwarding rate indicates the ability of the switch to forward data packets. Generally, the packet forwarding rate of a switch ranges from tens of Kpps to hundreds of Mpps.
  • the packet forwarding rate refers to how many millions of data packets (Mpps) the switch can forward per second, that is, the number of data packets that the switch can forward at the same time.
  • the packet forwarding rate reflects the switching capability of the switch in units of data packets.
  • Throughput refers to the amount of data (measured in bits, bytes, etc.) that is successfully transmitted to a network, device, port or other facility in a unit of time, that is, the maximum data rate that the device can receive and forward without loss.
  • the throughput is mainly determined by the internal and external network port hardware of the network equipment, and the efficiency of the program algorithm, especially the program algorithm. For devices that require a lot of calculations, the low efficiency of the algorithm will reduce the communication volume.
  • when a CAM performs a content search, its highly parallel search capability relies on the comparison circuit in each memory cell of the CAM array. That is to say, the CAM array structure differs from the data storage array structure of ordinary memory (such as Memory, RAM, etc.), so a CAM cannot use the general Memory structure to implement content-based addressing.
  • the following provides two solutions that can utilize the general Memory structure to realize the content addressable storage function:
  • Solution 1: simulated CAM based on Memory address traversal:
  • FIG. 2 is a schematic diagram of processing a simulated CAM based on Memory address traversal provided by an embodiment of the application. It is assumed that the addresses corresponding to all entry data in the Memory are 0, 1, 2, 3, 4, and 5.
  • T0 represents the start time of the first query
  • T1 represents the start time of the second query
  • T3 represents the start time of the third query.
  • the above scheme is a serial comparison method, and uses a general memory structure, with relatively low cost and power consumption, low back-end risk (that is, the feasibility risk of converting the logical design into a physical circuit is low), and the flexibility is high.
  • FIG. 3 is a schematic diagram of extended bit-width simulated CAM processing based on Memory address traversal provided by an embodiment of the application. 4, 5, 6, 7, T0 represents the start time of the first query, T1 represents the start time of the second query, and T3 represents the start time of the third query.
  • the data of two entries can be compared in each clock cycle, and the single query time is reduced to half of the original.
  • the second query can only be started after the first query is completed at the earliest.
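  • The timing of the two traversal-based solutions can be estimated with a short calculation. The sketch assumes one memory read per clock cycle and strictly serial queries; the entry counts of 6 and 8 and the helper names `query_cycles` and `serial_start_times` are illustrative assumptions rather than values taken from the figures.

```python
import math

def query_cycles(num_entries, entries_per_cycle=1):
    """Clock cycles needed for one full address traversal."""
    return math.ceil(num_entries / entries_per_cycle)

def serial_start_times(num_queries, num_entries, entries_per_cycle=1):
    """Earliest start cycle of each query when queries run one after another."""
    per_query = query_cycles(num_entries, entries_per_cycle)
    return [i * per_query for i in range(num_queries)]

print(query_cycles(6))               # one entry compared per cycle
print(query_cycles(8, 2))            # two entries compared per cycle
print(serial_start_times(3, 8, 2))   # three back-to-back queries
```

  • Widening the comparison halves the time of a single query, but because the queries remain serial, throughput is still bounded by one query per traversal.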
  • the existing content-based addressing schemes cannot meet the performance requirements of the network. Therefore, this application provides a flexible CAM with low power consumption, small area, strong portability, and strong scalability to solve the above technical problems.
  • FIG. 4 is a schematic structural diagram of a content addressable storage device provided by an embodiment of the present application.
  • the content addressable storage device 40 may include a scheduler 401, a comparator 402, a controller 403, and a memory 404.
  • the comparator 402 includes N comparison units 4021, and the N comparison units are respectively coupled to the memory 404, where N is an integer greater than 1; optionally, the aforementioned scheduler 401, comparator 402, controller 403, and memory 404 can be located on an integrated circuit substrate. Specifically:
  • the memory 404 is used to store data to be matched.
  • the data to be matched are the multiple data items against which the K search data acquired by the scheduler 401 need to be matched, that is, each search data needs to be matched against the data to be matched so as to obtain the corresponding matching result.
  • the data to be matched can be obtained in various ways, such as static configuration, dynamic learning, and so on.
  • the controller 403 controls the memory 404 to write the data to be matched.
  • the controller 403 may also control the memory 404 to write new data to be matched or modify the data to be matched while controlling the memory 404 to read the data to be matched.
  • the controller 403 can control the memory 404 to write the data to be matched in a read-while-write manner. For example, for a system that requires online refreshing of the data to be matched, a two-port memory with independent read and write ports can be selected to read and to update or modify the data to be matched; if a single-port memory is selected, the controller 403 can insert some additional refresh time while controlling the memory 404 to read the data to be matched, so as to complete the refresh of the data to be matched, at the cost of some query speed.
  • the refresh time means that, during a refresh (writing a table entry/data item), the increment of the query (reading a table entry/data item) address sequence is paused (at the same time, the read enable is invalid). After the write is completed, the query is re-enabled (read enable is valid and the read address sequence increases again).
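  • The refresh behaviour for a single-port memory can be sketched as follows. This is a minimal model under stated assumptions: the function `run_single_port`, the `writes` dictionary, and the trace format are hypothetical and only illustrate that the read address stops advancing during a write cycle.

```python
def run_single_port(depth, cycles, writes):
    """Simulate read-address sequencing on a single-port memory.

    writes maps a cycle number to an (address, value) update. In a write
    cycle the query read is paused and the read address does not advance.
    Returns the list of per-cycle operations.
    """
    read_addr = 0
    trace = []
    for cycle in range(cycles):
        if cycle in writes:
            addr, _value = writes[cycle]
            trace.append(("write", addr))        # refresh: query paused
        else:
            trace.append(("read", read_addr))    # query: read enabled
            read_addr = (read_addr + 1) % depth  # read address sequence increases
    return trace


trace = run_single_port(depth=4, cycles=6, writes={2: (1, 0xAB)})
print(trace)
```

  • Each inserted write cycle lengthens the traversal by one cycle, which is the reduction in query speed mentioned above.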
  • the storage structure of the memory 404 may be in the form of a storage array.
  • the storage array is composed of many basic storage units. Each basic storage unit stores one binary digit (1 or 0), called a bit, and a row of storage units constitutes a word of the memory 404 (also called a data item, data table item, table item, etc.). W is the "word width" or "bit width", the number K of all words in the memory array is called the "depth", and the capacity of the memory 404 is characterized as (K words × W bits).
  • the memory 404 has no data comparison function, that is, it does not include a comparison circuit. Therefore, a general memory structure can be adopted.
  • the memory 404 can be a general random access memory (Random Access Memory, RAM) or another storage device that is volatile on power failure, such as static random access memory (Static Random Access Memory, SRAM), dynamic random access memory (Dynamic Random Access Memory, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), or double data rate SDRAM (Double Data Rate SDRAM, DDR SDRAM);
  • the memory 404 can also be a general read-only memory (Read Only Memory, ROM) or another memory that is non-volatile on power failure, such as programmable ROM (Programmable ROM, PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash ROM (FLASH ROM).
  • the memory 404 can also be a processor general-purpose register, flash memory, or any other suitable type of memory on a computer.
  • when the memory 404 is a RAM, the data to be matched can be changed.
  • when the memory 404 is a ROM, the data to be matched is the data solidified in the memory 404. It should be noted that this application does not limit the specific form in which the memory 404 stores the data to be matched, and relevant changes can be made according to actual needs or business conditions.
  • the scheduler 401 is configured to obtain K search data and schedule the K search data respectively to K target comparison units among the N comparison units, where each target comparison unit is a comparison unit in an idle state, and K is an integer greater than or equal to 1 and less than or equal to N.
  • the scheduler 401 needs to allocate different target comparison units for different query requests; according to the dynamically received query requests, and combined with the current status of the N comparison units 4021 in the comparator 402, the scheduler 401 dynamically selects an available target comparison unit (which can be any comparison unit 4021 in FIG. 4), starts or enables the query function of the target comparison unit, and sends the search data to the target comparison unit 4021, thereby completing the allocation and scheduling of query requests to query resources.
  • the K search data acquired by the scheduler 401 may be acquired at the same time or at different times, that is, the K search data may be scheduled to the K target comparison units at the same time or one after another; the order may depend on the order in which the K search data reach the scheduler, or on preset or flexible scheduling rules of the scheduler 401.
  • the embodiments of this application do not specifically limit this.
  • the scheduler 401 can schedule a given search data to a target comparison unit immediately after receiving it, or it can schedule the search data to the corresponding target comparison units after receiving a certain amount of search data.
  • the scheduler 401 may receive a large number of query requests, that is, a large amount of search data, at a certain moment or within a certain time period. Because the number of comparison units in the comparator 402 is limited, the scheduler 401 can apply traffic shaping control, that is, it currently obtains only K of the search data for query and schedules the other search data once comparison units become free, so that search data is neither congested nor discarded.
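  • The scheduling behaviour described above can be sketched in Python as follows. The class `Scheduler`, its `pending` queue, and the first-come-first-served policy are assumptions for illustration; the application leaves the scheduling rules open.

```python
from collections import deque


class Scheduler:
    """Toy model of the scheduler 401: dispatch search data to idle units."""

    def __init__(self, n_units):
        self.busy = [False] * n_units   # status of the N comparison units
        self.pending = deque()          # search data waiting for a free unit

    def submit(self, search_data):
        self.pending.append(search_data)

    def dispatch(self):
        """Assign pending search data to idle units; return {unit: data}."""
        assigned = {}
        for unit, in_use in enumerate(self.busy):
            if not self.pending:
                break
            if not in_use:
                self.busy[unit] = True
                assigned[unit] = self.pending.popleft()
        return assigned

    def release(self, unit):
        self.busy[unit] = False         # the unit has finished its search


sched = Scheduler(n_units=2)
for key in ("a", "b", "c"):
    sched.submit(key)
first = sched.dispatch()    # only two units are idle, so "c" stays queued
sched.release(0)
second = sched.dispatch()   # "c" is scheduled once unit 0 is idle again
print(first, second)
```

  • Holding excess search data in a queue rather than dropping it is one simple way to model the traffic shaping mentioned above.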
  • the controller 403 is configured to control the memory 404 to read the data to be matched and send them to the K target comparison units respectively.
  • when a comparison unit in the comparator 402 is selected as a target comparison unit, that is, when one or more comparison units need to perform a search operation, the comparator 402 can notify the controller 403 of the status information (idle state or use state) of the target comparison unit, so that the comparator 402 completes the combined read-write enable control of the controller 403, which finally controls the output of the memory 404.
  • the controller 403 needs to control the memory 404 to continuously read the data items to be matched, because the memory 404 needs to be controlled.
  • the scheduler 401 may also notify the controller 403 of the status information of the target comparison unit to complete the combined read-write enable control of the controller 403, which is not specifically limited in this embodiment of the application.
  • the structure in FIG. 4 is based on the example in which the comparator 402 notifies the controller 403. It can be understood that if the controller 403 is notified by the scheduler 401, a communication connection is required between the scheduler 401 and the controller 403. Details are not repeated here.
  • Under the control of the controller 403, the memory 404 sends the read data to be matched to the K target comparison units respectively; the data may be broadcast to the N comparison units (including the K target comparison units) or, optionally, sent only to the K target comparison units.
  • the controller 403 learns through the scheduler 401 or the comparator 402 that the comparison unit 4021-2 and the comparison unit 4021-4 are currently target comparison units that need to perform a comparison operation on search data; the controller 403 then controls the memory 404 so that the data to be matched that is read out is broadcast to the N comparison units, or is sent only to the comparison unit 4021-2 and the comparison unit 4021-4.
  • the memory 404 needs to know which comparison units are currently the target comparison units.
  • the controller 403 controls the output of the memory 404 according to the status information of each comparison unit that it has learned (notified by the scheduler 401 or by the comparator 402).
  • the controller 403 is further configured to control the memory 404 to write the data to be matched in a preset manner.
  • the controller 403 may complete the initial configuration control of the data to be matched in a preset manner under the control of the external or internal processor of the device in the initial stage.
  • the controller 403, under the control of the external processor, first completes the initialization process of the memory 404, such as defining the matching word width and the output result bit width, selecting the read/write mode of the data table items (data to be matched), enumerating the addresses of all the data to be matched, and writing the data to be matched corresponding to each address into the memory 404; the content of the data to be matched depends on the specific business requirements.
  • the comparator 402 includes N comparison units 4021 for searching data, that is, the collection of comparison units 4021-1, 4021-2, ... 4021-N in FIG. 4. Since the N comparison units in the comparator 402 are respectively coupled to the memory 404, as shown in FIG. 4, there is a physical connection between each comparison unit 4021 and the memory 404; that is, the N comparison units 4021 form a parallel structure, their search processes do not interfere with each other, and neither the start times of their searches nor their search data need to be related.
  • the target comparison unit compares the search data sent by the scheduler 401 with the data to be matched sent from the memory 404, and determines the matching result of the search data according to the comparison result.
  • the matching result includes one or more of matching indication information, a matching data item of the corresponding search data, and an address of the matching data item, wherein the matching indication information is used to indicate whether there is a matching data item. For example, in practical applications, it may not be necessary to completely traverse all data items, such as only taking the result of the first match or only judging whether there is a match.
  • each comparison unit 4021 may also include maintenance control of necessary information such as its own comparison status, and a register for storing search data sent by the scheduler 401.
  • when the search data received by the scheduler 401 within a preset time period exceeds a preset number, or when there is currently no idle comparison unit, the scheduler 401 also determines an idle target comparison unit from the N comparison units after a preset time interval. That is, when the throughput in the network is high, such as when the maximum pps of an instantaneous burst is large, or when there is no idle comparison unit, traffic shaping can be applied, that is, the comparison is started only after a certain period of time. This improves the adaptability of the content addressable storage device in the embodiment of the present application and reduces the demand for instantaneous comparison resources.
  • multiple comparison units in the comparator are used as dynamic, flexible and callable comparison resources in the content addressable storage device.
  • the idle comparison unit among the comparison units sequentially compares the search data with the data to be matched in the memory.
  • since the comparator includes N comparison units, a maximum of N search data can be searched in parallel at the same time.
  • each memory cell of the CAM array has a comparison circuit.
  • because each memory cell has a dedicated comparison circuit, problems such as large chip area, high power consumption, and high cost arise.
  • the comparison units of the comparator in the content addressable storage device are shared, and the parallel search of search data can be realized according to actual search requirements, which multiplies the processing capacity of the device and meets transient network demands.
  • the use of high-cost and high-power CAM structures is avoided, which ensures performance and reduces costs and power consumption.
  • the memory in the device does not need a dedicated CAM structure
  • a relatively general memory structure can be used, and the other functional structures in the device (scheduler, controller, etc.) can be implemented in a general hardware description language such as Verilog, which makes the device highly portable and flexible and thereby helps ensure the high availability, low cost, and low power consumption of the content addressable storage device.
  • before the scheduler 401 selects an idle target comparison unit from the N comparison units 4021, or before the controller 403 determines to which comparison units the currently read data to be matched should be sent, each comparison unit 4021 in the comparator 402 can send its own status information (idle state or use state) directly to the scheduler 401 and/or the controller 403; alternatively, each comparison unit 4021 sends its status information to a state merging module in the comparator 402, which collects it and then sends all the summarized state information to the scheduler 401 and/or the controller 403. This is not specifically limited in the embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of another content addressable storage device provided by an embodiment of the application, in which the scheduler 401 and the N comparison units 4021 are connected in parallel through N physical connections 001
  • the idle state or use state of the N comparison units 4021 can be reflected on the physical connections 001 in the form of level changes, and the scheduler 401 can perceive the current idle or use state of each comparison unit 4021 according to the voltage change on the corresponding physical connection 001. For example, when a comparison unit 4021 is in the idle state, the physical connection 001 between it and the scheduler 401 is kept at low level, and when the comparison unit 4021 is in use, that physical connection 001 is pulled to high level.
  • the target comparison units (such as the target comparison units 4021-2, 4021-4, and 4021-5 in FIG. 5) can receive, through the corresponding physical connections (001a, 001b, 001c in FIG. 5), the search data (search data a, search data b, search data c) sent by the scheduler 401, and store the corresponding search data.
  • each comparison unit 4021 in the comparator 402 can feed back its current state information (idle state or use state) to the controller 403 through the physical connection 002; the controller 403 learns and counts, through the physical connection 002, which comparison units are currently performing the search operation, and controls the data to be matched in the memory 404 to be sent step by step, in parallel, to the corresponding target comparison units through the physical connection 003.
  • the comparison units selected by the scheduler 401 for them are target comparison unit 4021-2, target comparison unit 4021-4, and target comparison unit 4021-5.
  • After the search operation functions of 4021-2, 4021-4 and 4021-5 are turned on, the comparator 402 notifies the controller 403 of the enabled state of the target comparison units (the notification can also be given by the scheduler 401).
  • after the controller 403 obtains the state information of the comparison units, it combines the states of the N comparison units, that is, 4021-2, 4021-4, and 4021-5 are all in the enabled state, and the other comparison units are in the non-enabled state.
  • the controller 403 combines the state of the comparison units with the initialization control of the memory 404 and sends the combined control information to the memory 404, thereby controlling the data items to be read step by step and sent in parallel to the target comparison units 4021-2, 4021-4 and 4021-5 in the enabled state.
  • in each clock cycle, one data item (or multiple data items, depending on the bit width relationship between the comparison unit 4021 and the memory 404) is sent to the target comparison units 4021-2, 4021-4, and 4021-5 in parallel.
  • the target comparison units 4021-2, 4021-4, and 4021-5 compare, step by step, the stored search data with the data items that the controller 403 controls the memory 404 to read and send through the corresponding physical connections (003a, 003b, 003c in FIG. 5), until the final matching result of the search data is obtained. It can be understood that the comparison units 4021 may start searching at the same time or at different times, that is, the searches performed by different comparison units 4021 do not affect or interfere with each other.
  • the data to be matched includes M data items; the controller 403 is specifically configured to control the memory to read out the M data items in sequence according to the address traversal mode and send them to each of the K target comparison units. The data items can be sent only to the K target comparison units (comparison units 4021-2, 4021-4, 4021-5 in FIG. 5), or broadcast to the N comparison units (comparison units 4021-1, 4021-2, ... 4021-N in FIG. 5); each of the K target comparison units is specifically configured to compare the corresponding search data with the M data items in sequence to determine the matching data items of the corresponding search data.
  • the M data items stored in the memory are serially read out through address traversal, and are gradually sent to the corresponding target comparison unit for comparison, so as to obtain the corresponding matching result.
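  • The address traversal described above can be sketched as follows. This is a software model under stated assumptions: the function `traverse_and_compare`, the result tuple layout (match indication, address, data item), and the sample values are hypothetical, and the inner loop stands in for comparison units that work in parallel in hardware.

```python
def traverse_and_compare(memory, searches):
    """memory: list of M data items; searches: {unit_id: search_data}.

    One data item is read per step and sent to every target unit, which
    compares it with its own stored search data. Only the first match
    per unit is kept. Returns {unit_id: (matched, address, item)}.
    """
    results = {unit: (False, None, None) for unit in searches}
    for address, item in enumerate(memory):      # serial address traversal
        for unit, key in searches.items():       # parallel units in hardware
            matched, _, _ = results[unit]
            if not matched and item == key:
                results[unit] = (True, address, item)
    return results


memory = [0x10, 0x20, 0x30, 0x40]
results = traverse_and_compare(memory, {2: 0x30, 4: 0x10, 5: 0x99})
print(results)
```

  • Unit 5 in this example finishes the traversal without a match, so its match indication stays false.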
  • the data items can be read one at a time, two at a time, or several at a time.
  • FIG. 6 is a schematic structural diagram of another content addressable device provided by an embodiment of the application.
  • the content addressable device 40 in FIG. 6 further includes a result output register 405, and the N comparison units 4021 are respectively coupled to the result output register 405; the target comparison unit also sends the matching result to the result output register 405, and the result output register 405 receives and stores the matching result. Since the N comparison units 4021 are respectively coupled to the result output register 405 in the content addressable storage device 40 through the physical connections 004, when any one of the N comparison units 4021 completes the matching of its search data, the matching result can be sent to the result output register 405 in parallel through the corresponding physical connection 004.
  • the memory 404 reads data in an address polling manner under the control of the controller 403, so that the read data can participate in independent queries of multiple comparison units at the same time.
  • the comparator 402 of the content addressable device 40 in FIG. 6 further includes a state combining module 4022, and each comparing unit 4021 sends the state information to the state combining module 4022 in the comparator 402 in a unified manner. After summarizing, the state merging module 4022 sends all the summarized state information to the controller 403.
  • the number of comparison units N is greater than or equal to (instantaneous burst maximum pps) × (static table lookup period), where the instantaneous burst maximum pps can be determined by the minimum period of the actual query requests, and the static table lookup period refers to the time needed to traverse the lookup table addresses (the addresses of the data to be matched). In practical applications, the existing traffic shaping module of the system can be used to reduce the peak pps.
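  • A short worked example of this sizing rule follows. The helper `min_comparison_units` and all numeric figures (64 entries, 500 MHz clock, 25 million queries per second) are assumptions chosen only to make the arithmetic concrete.

```python
import math


def min_comparison_units(peak_pps, lookup_period_s):
    """Smallest integer N with N >= peak_pps * lookup_period_s."""
    return max(1, math.ceil(peak_pps * lookup_period_s))


# Assumed figures: 64 entries read one per cycle at 500 MHz,
# and an instantaneous burst of 25 million queries per second.
lookup_period = 64 / 500e6          # 128 ns to traverse the whole table
print(min_comparison_units(25e6, lookup_period))  # 25e6 * 128e-9 = 3.2, so N = 4
```

  • Lowering the peak pps through traffic shaping, or shortening the traversal by reading several entries per cycle, reduces the required N proportionally.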
  • FIG. 7 is a sequence diagram for comparison of the comparison unit provided by the embodiment of the application.
  • they are marked as T0, T1, T2, T3, T4, T5 in chronological order, corresponding to the first query, the second query, the third query, the fourth query, the fifth query, and the sixth query, where the first query and the fourth query (T0 and T3) are executed by the comparison unit 4021-2, the second query and the fifth query (T1 and T4) are executed by the comparison unit 4021-4, and the third query and the sixth query (T2 and T5) are executed by the comparison unit 4021-5.
  • each comparison unit can perform its search data query in parallel, and each comparison unit returns to the idle state after a single query is completed, so that the next round of search data query can be performed. For example, the comparison unit 4021-2 starts the query of search data a at time T0 and, after completing it, can start a new query of search data d; while the comparison unit 4021-2 is comparing search data a, the comparison unit 4021-4 starts the search for search data b and then starts the search for search data e. It should be noted that all the data to be matched in the memory 404 is read out in the form of address polling. From the timing point of view, this corresponds to the addresses of the data to be matched in FIG. 7.
  • the controller 403 controls the data in the memory 404 to be read cyclically, and a comparison unit does not necessarily start its comparison from the data item at the lowest or highest address. For example, at T1 the comparison starts from address 6, and at T3 it starts from address 2. That is, the data item from which each target comparison unit starts comparing depends on the data currently read from the memory 404. The time period corresponding to address 5 (hold) in FIG. 7 occurs because no comparison unit is executing a query task at that time.
  • the controller 403 pauses the reading and matching of data.
  • the controller 403 resumes reading from the paused data item and sends it to the comparison unit 4021-5 for comparison.
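  • The address polling with hold described above can be modelled roughly as follows. The function `poll`, the arrival schedule, and the assumption that a free comparison unit is always available are all hypothetical; the sketch only shows that each query starts at whatever address is currently being read, completes after seeing all M items, and that the address holds when no query is active.

```python
def poll(memory, arrivals, total_cycles):
    """arrivals: {cycle: search_data}. Returns a list of
    (search_data, start_address, match_address) in completion order."""
    m = len(memory)
    addr = 0
    active = []      # each entry: [key, start_addr, items_seen, match_addr]
    done = []
    for cycle in range(total_cycles):
        if cycle in arrivals:
            active.append([arrivals[cycle], addr, 0, None])
        if not active:
            continue                      # hold: the read address does not advance
        for q in active:
            if q[3] is None and memory[addr] == q[0]:
                q[3] = addr               # record the first matching address
            q[2] += 1
        for q in [q for q in active if q[2] == m]:
            active.remove(q)              # all M items seen: unit becomes idle
            done.append((q[0], q[1], q[3]))
        addr = (addr + 1) % m             # address polling wraps around
    return done


memory = ["p", "q", "r", "s"]
done = poll(memory, arrivals={0: "r", 2: "p", 9: "s"}, total_cycles=16)
print(done)
```

  • In this run the second query starts at address 2 and finds its match at address 0 only after the address wraps around, and the third query resumes from the held address 2.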
  • K comparison units are currently performing search data comparison, and different comparison units correspond to different search data, where K is a positive integer less than or equal to M; the data to be matched includes M data items; the controller 403 is specifically used to control the memory 404 to read out L data items in each clock cycle through address polling and to broadcast the L data items to the N comparison units, where L is a positive integer less than or equal to M; each of the K comparison units is used to compare the L data items received each time with the corresponding search data. That is, when multiple search data are being searched, the corresponding multiple comparison units perform the search operation at the same time.
  • each target comparison unit compares the 2 data items received each time with the search data stored by itself; each target comparison unit can obtain a matching result after comparison with all or part of the 64 data items to be matched.
  • the search operation of the comparison unit is stopped.
  • the controller 403 can control the data item read from the memory to no longer be sent to the comparison unit that has completed the search operation task.
  • the bit width of each of the N comparison units is L times the bit width of each of the M data items. That is, the bit width of each comparison unit in the comparator 402 can be L times the bit width of each data item in the data to be matched stored in the memory 404.
  • the comparison unit can perform the comparison of L data items in each clock cycle. For example, if the bit width of each data item is W and the comparison bit width of the comparison unit is L*W, then the comparison unit can complete the comparison of L*W bits of data in one clock cycle.
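  • The following sketch shows one way a comparison unit of width L*W could compare L data items packed in a single memory word. The function `wide_compare` and the packing order (item 0 in the least significant bits) are assumptions for illustration.

```python
def wide_compare(word, key, item_width, items_per_word):
    """Compare key against each item packed in one wide memory word.

    Item 0 occupies the least significant item_width bits (an assumed
    packing order). Returns the indices of the matching items.
    """
    mask = (1 << item_width) - 1
    return [i for i in range(items_per_word)
            if ((word >> (i * item_width)) & mask) == key]


# Two 8-bit items packed into one 16-bit word: item 0 = 0x3C, item 1 = 0xA5
word = (0xA5 << 8) | 0x3C
print(wide_compare(word, 0xA5, item_width=8, items_per_word=2))
```

  • With L = 2 and W = 8 as here, a table of M items is traversed in M/2 word reads instead of M.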
  • the embodiments of this application can further balance performance and cost by expanding the word width, within the range that component selection allows and for which the back-end impact is evaluated as reasonable. Within a range of relatively low cost and power consumption, multiple comparisons can be performed simultaneously in one clock cycle to speed up queries.
  • a priority encoder may be added to the content addressable device 40, located between the comparison unit 4021 and the result output register 405; the priority encoder encodes and outputs the matching result with the highest priority (for example, the data item at the lowest address).
  • a matching address history register can be added to the content addressable device 40.
  • the address where a match occurred is recorded. When a new match occurs, the recorded address is compared with the new matching address, and their relative size is used to determine whether to update the matching result and the matching address history register. For example, in the low address priority strategy, a new match with a smaller address always replaces the old match with a larger address and its result.
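  • A minimal sketch of the low address priority strategy with a matching address history register is given below. The class `MatchHistory`, its `update` method, and the sample results are hypothetical names and values used only for illustration.

```python
class MatchHistory:
    """Toy matching-address history register with low-address priority."""

    def __init__(self):
        self.address = None   # address of the best match seen so far
        self.result = None    # matching result stored with that address

    def update(self, address, result):
        """Keep the new match only if its address is lower than the stored one."""
        if self.address is None or address < self.address:
            self.address, self.result = address, result
            return True
        return False


history = MatchHistory()
# Matches arrive in polling order, which need not be ascending address order.
for addr, res in [(6, "port3"), (2, "port1"), (4, "port7")]:
    history.update(addr, res)
print(history.address, history.result)
```

  • Because address polling may begin at any address, a register of this kind lets the final result follow the priority order rather than the arrival order.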
  • the data to be matched includes M data items, where each data item includes content to be matched, query control information, and an output result; each of the K target comparison units is specifically configured to compare the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and to output the output result in the matched data item as the matching result of the corresponding search data.
  • the query control information and the corresponding output result can be carried in the data items contained in the data to be matched, where the query control information may include:
  • the min/max range or bit mask used in the specific comparison, or their combination with other logical operations, which can be determined based on system requirements.
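  • One possible layout of such a data item is sketched below. The `Entry` fields (`enabled`, `mask`, `value_range`) and the `match` function are assumptions for illustration; the application leaves the exact query control information to system requirements.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Entry:
    content: int                                   # content to be matched
    output: str                                    # output result returned on a match
    enabled: bool = True                           # table item enable
    mask: int = 0xFFFF                             # bit mask: only bits set to 1 are compared
    value_range: Optional[Tuple[int, int]] = None  # (min, max), inclusive


def match(entry, key):
    """Return the entry's output result if key matches, else None."""
    if not entry.enabled:
        return None
    if entry.value_range is not None:
        lo, hi = entry.value_range
        hit = lo <= key <= hi
    else:
        hit = (key & entry.mask) == (entry.content & entry.mask)
    return entry.output if hit else None


table = [
    Entry(content=0x1200, mask=0xFF00, output="prefix 0x12"),
    Entry(content=0, value_range=(0x2000, 0x2FFF), output="range 0x2000-0x2FFF"),
    Entry(content=0x1234, output="disabled exact", enabled=False),
]
hits = [match(e, 0x1234) for e in table]
print(hits)
```

  • Storing the control fields next to the content means the comparison unit needs no per-entry configuration beyond what it reads from the memory.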
  • the scheduler 401 of the content addressable storage device 40 in the foregoing FIGS. 4 to 6 may be a hardware circuit module or a functional device that runs software.
  • the controller 403 may also be a hardware circuit module or a functional device running software. It is understandable that under normal circumstances, the use of hardware circuit modules is faster than the use of software to implement corresponding functions.
  • the above scheduler 401 can be deployed inside the controller 403, that is, it is considered to be a part of the controller 403 in physical form or integrated with the controller 403; alternatively, the scheduler 401 can be deployed outside the controller 403. It can also be partially deployed inside the controller 403 with the other part deployed outside the controller 403, which is not specifically limited in the embodiment of the present application.
  • the content addressable storage device in this application can have three operating modes: read mode, write mode and matching mode.
  • in read mode, data access and operation in the memory are the same as in an ordinary memory.
  • the matching mode can realize the content-based search function implemented by the content-addressable storage device in Figures 4-6.
  • the data search function of the content addressable storage device in this application can be used in various application scenarios such as virtual memory, data compression, pattern recognition, image processing, high-speed caching, and table lookup applications.
  • the switch when performing Media Access Control Address (MAC) address retrieval, the switch (the switch may include any of the content addressable storage devices described in this application) first uses the MAC address as the key ( Search data) retrieve the corresponding index value through the MAC-CAM table (that is, the data to be matched in this application).
  • the MAC-CAM table maintains a piece of data for layer 2 switching for the switch Address table (usually called "CAM table"), which maintains the correspondence between MAC addresses and outbound interfaces.
  • the switch will make a judgment. Extract the destination MAC address of the data frame, if the data frame is not sent to yourself, query the CAM table according to the destination MAC address of the data frame; if it can be hit (the so-called hit, it is to find the corresponding MAC address in the CAM table The forwarding item), the forwarding is performed according to the result of the query (usually an outbound interface list); if it cannot be hit, the data frame is broadcast to all ports.
  • This CAM table of the switch can be obtained in many ways, such as static configuration and dynamic learning.
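  • The forwarding decision described above can be sketched as follows. The function `forward`, the port names, and the MAC addresses are hypothetical, and a Python dictionary stands in for the content-based lookup that the device would perform in hardware.

```python
def forward(cam_table, own_mac, dst_mac, all_ports):
    """Return the list of egress ports for a frame, or [] if it is for the switch."""
    if dst_mac == own_mac:
        return []                       # frame addressed to the switch itself
    ports = cam_table.get(dst_mac)      # content-based lookup keyed by MAC address
    if ports is not None:
        return ports                    # hit: forward according to the lookup result
    return list(all_ports)              # miss: broadcast to all ports


cam_table = {
    "00:11:22:33:44:55": ["eth1"],
    "66:77:88:99:aa:bb": ["eth2", "eth3"],
}
all_ports = ["eth0", "eth1", "eth2", "eth3"]
own_mac = "02:00:00:00:00:01"

print(forward(cam_table, own_mac, "00:11:22:33:44:55", all_ports))
print(forward(cam_table, own_mac, "de:ad:be:ef:00:00", all_ports))
```

  • Static configuration would populate `cam_table` up front, while dynamic learning would add entries as frames arrive.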
  • When the content addressable storage device in this application only needs to emulate a general CAM operation, it can use only a single set of table item storage resources (the data to be matched) and, by adding a small amount of control logic, increase the throughput (pps) performance linearly by multiples, without significantly changing the memory shape (relative to the original look-up table matching requirements). This guarantees throughput and realizability without losing the cost and power consumption advantages. If more flexible dynamic table lookup is required, some operation information about the table items can conveniently be stored in the memory at the same time, which makes it easier to support additional features such as table item enable, bit mask, value range mask, and out-of-order table item priority, greatly improving the flexibility of table look-up matching processing. In combination with specific scenarios, word width expansion can also be used to further balance performance against area and power consumption, and to further improve the energy efficiency ratio.
  • the content addressable storage device provided by this application includes at least the following advantages:
  • TCAM and other more complex fuzzy CAM processing can be easily realized through a small amount of control and modification of table item content; for example, table item enable and fuzzy methods with min, max, etc. can be planned in the Memory table contents for dynamic matching; it is easier to achieve a fixed processing delay, which makes it easy to form a pipeline structure in the overall processing of the system; it is easier to combine the expanded word width dimension to further improve the cost-performance balance; it is easier to extend the design to online dynamic entry update operations, satisfying business scenarios with high online requirements; in some applications, memory resources (such as Memory) used by other scenarios that do not coexist can be reused to further reduce costs.
  • FIG. 8 is a schematic flowchart of a content addressable storage method provided by an embodiment of the present application.
  • The content addressable storage method is applicable to any of the devices described above with reference to FIG. 4 to FIG. 7.
  • S801 Store data to be matched in the memory.
  • S802 Obtain K search data, and schedule the K search data respectively to K target comparison units among the N comparison units, where each target comparison unit is a comparison unit in an idle state, and K is an integer greater than or equal to 1 and less than or equal to N.
  • The device further includes a result output register, and the N comparison units are respectively coupled to the result output register; the data to be matched includes multiple data items. The method further includes: sending the matching result corresponding to each of the K target comparison units to the result output register, where the matching result includes one or more of matching indication information, the matching data item of the corresponding search data, and the address of the matching data item, and the matching indication information indicates whether there is a matching data item; and receiving and storing, through the result output register, the matching results sent by the K target comparison units.
  • The data to be matched includes M data items, where M is an integer greater than 1. Controlling the memory to read out the data to be matched and send it to the K target comparison units respectively includes: controlling the memory to read out the M data items in sequence and send them to the K target comparison units respectively. Comparing, through each of the K target comparison units, the corresponding search data with the data to be matched and outputting the corresponding matching result according to the comparison result includes: comparing, through each of the K target comparison units, the corresponding search data with the M data items in sequence to determine the matching data item of the corresponding search data.
  • The data to be matched includes M data items, where M is an integer greater than 1. Controlling the memory to read out the data to be matched and send it to the K target comparison units respectively includes: controlling the memory to read out L data items in each clock cycle and broadcasting the L data items to the N comparison units, where L is a positive integer less than or equal to M. Comparing, through each of the K target comparison units, the corresponding search data with the data to be matched includes: comparing, through each of the K comparison units, the L data items received each time with the corresponding search data.
  • the bit width of each of the N comparison units is L times the bit width of each of the M data items.
  • the method further includes: controlling the memory to write the data to be matched in a preset manner.
  • the method further includes: when the search data received within a preset time period exceeds a preset number, or when there is currently no idle comparison unit, waiting for a preset time interval and then determining the target comparison units from the N comparison units.
  • the method further includes: in the process of controlling the memory to read out the data to be matched, controlling the memory to write new data to be matched or to modify the data to be matched.
  • The data to be matched includes multiple data items, where each data item includes content to be matched, query control information, and an output result. Comparing, through each of the K target comparison units, the corresponding search data with the data to be matched and outputting the corresponding matching result according to the comparison result includes: through each of the K target comparison units, comparing the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and outputting the output result in the matched data item as the matching result of the corresponding search data.
  • An embodiment of the present application further provides a computer storage medium, where the computer storage medium may store a program, and the program, when executed, performs part or all of the steps of any one of the above method embodiments.
  • The embodiments of the present application also provide a computer program that includes instructions; when the computer program is executed by a computer, the computer can perform part or all of the steps of any content addressable storage method.
  • the disclosed device may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the above-mentioned units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the above integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
  • The technical solution of this application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that make a computer device (which may be a personal computer, a server, or a network device, and specifically a processor in the computer device) execute all or part of the steps of the methods of the various embodiments of the present application.
  • The aforementioned storage medium may include: a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), and the like.

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed in embodiments of the present application are a content-addressable storage apparatus and method, and a related device, wherein the content-addressable storage apparatus may comprise: a memory used for storing data to be matched; a comparator comprising N comparison units, the N comparison units being respectively coupled to the memory; a scheduler used for acquiring K pieces of search data and for scheduling the K pieces of search data to K target comparison units among the N comparison units respectively, the target comparison units being comparison units in an idle state; a controller used for controlling the memory to read out the data to be matched and send same to the K target comparison units respectively; and each target comparison unit among the K target comparison units is used for comparing corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparison result. By employing the present application, the low power consumption and flexibility of a content-addressable storage device may be ensured.

Description

Content addressable storage device, method and related equipment
Technical field
This application relates to the field of memory technology, and in particular to a content addressable storage device, a content addressable storage method, and related equipment.
Background
A content addressable memory (CAM) is a memory that is addressed by content; it is a special kind of storage array random access memory (RAM). In addition to being read and written like a conventional RAM, it supports a search operation: an input data item is compared with all data items stored in the CAM to determine whether it matches any of them, and the matching information corresponding to that data item is output. FIG. 1A is a schematic diagram of the functional difference between a RAM and a CAM in the prior art: a RAM looks up the data corresponding to an input address, whereas a CAM looks up the address corresponding to input data.
FIG. 1B is a schematic diagram of a typical CAM structure in the prior art. The CAM mainly consists of a CAM array (M array), sense amplifiers (SA), search line drivers (SL drivers), and an encoder. The CAM array in FIG. 1B includes (W×K) CAM memory cells M, each of which stores data and compares the stored data with incoming search data. One row of memory cells M (W cells in the figure) forms one word of the CAM (also called a data item, data table entry, or table entry); W is the "word width" or "bit width", the number of words in the array (K in the figure) is the "depth", and the capacity of the CAM is expressed as (K words × W bits). During a search operation, the search data (search key) is loaded onto the search lines SL by the SL drivers, the data stored in the memory cells M is matched against SL, and the result appears as a level change on the match line (ML). The voltage change on ML is amplified by the sense amplifier SA and finally output to the priority encoder, which encodes and outputs the address of the highest-priority data item that matches the search data.
It can be seen that when a CAM performs a content lookup, the comparison circuit in each memory cell M allows the search data to be compared with the data stored in all memory cells at the same time, so that all data items in the entire CAM array can be queried in as little as one clock cycle. Matching is fast, parallelism is high, and the lookup speed is not affected by the CAM capacity. By contrast, with ordinary memory the data can only be read out one by one in address order and compared under software control. The CAM implements high-speed search in hardware, which greatly improves search efficiency and the performance of the search system, and it is widely used in fields such as network communication and pattern recognition.
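The functional difference between the two access styles can be illustrated with a small Python model. This is only a sketch of the behaviour, not of the hardware: the software loop stands in for comparisons that a real CAM performs in parallel, and treating the lowest matching address as the highest priority is an assumption made for the example.

```python
# Illustrative model only: index = address, value = stored word.
STORE = ["0x1A", "0x2B", "0x3C", "0x2B"]

def ram_read(address):
    """RAM-style access: return the data stored at the given address."""
    return STORE[address]

def cam_search(key):
    """CAM-style access: return the address of a word equal to key, or None.

    Assumption: the lowest matching address has the highest priority.
    """
    for address, word in enumerate(STORE):
        if word == key:
            return address
    return None
```

Here `ram_read(2)` goes from an address to data, while `cam_search("0x2B")` goes from data to an address.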
However, because a CAM uses dedicated circuits for parallel comparison, it inevitably suffers from large chip area and high power consumption, as well as poor portability and low flexibility, which reduces the performance and reliability of CAM chips.
Summary
The embodiments of the present application provide a content addressable storage device, a content addressable storage method, and related equipment, which implement the content addressable function while ensuring low power consumption and flexibility of the content addressable storage device.
In a first aspect, an embodiment of the present application provides a content addressable storage device, which may include: a memory, configured to store data to be matched; a comparator, including N comparison units that are respectively coupled to the memory, where N is an integer greater than 1; a scheduler, configured to obtain K search data and schedule the K search data respectively to K target comparison units among the N comparison units, where each target comparison unit is a comparison unit in an idle state, and K is an integer greater than or equal to 1 and less than or equal to N; and a controller, configured to control the memory to read out the data to be matched and send it to the K target comparison units respectively. Each of the K target comparison units is configured to compare its corresponding search data with the data to be matched and output a corresponding matching result according to the comparison result.
In the embodiments of the present application, the multiple comparison units in the comparator of the content addressable storage device are used as comparison resources that can be invoked dynamically and flexibly. When input search data needs to be looked up, an idle comparison unit is invoked to compare the search data with the data to be matched in the memory in sequence. Because the comparator includes N comparison units, up to N search data can be looked up in parallel at the same time. In a prior-art CAM, every memory cell of the CAM array has its own comparison circuit. This allows the search data to be compared with all data stored in the array at the same time, so that the comparison result is obtained in a few clock cycles or even one, but the dedicated comparison circuit in every memory cell leads to large chip area, high power consumption, and high cost. Moreover, once a CAM leaves the factory, its physical structure and related parameters are fixed, so its portability and flexibility are poor. In the embodiments of the present application, by contrast, the comparison units of the comparator are shared, and parallel lookup of search data can be provided according to actual lookup demand, which multiplies the processing capacity of the device and serves scenarios with both low and high instantaneous network throughput. The high-cost, high-power CAM structure is avoided, so performance is ensured while cost and power consumption are reduced. Furthermore, because the memory in the device does not need a dedicated CAM structure and can use a relatively general memory structure, and the other functional blocks in the device (scheduler, controller, and so on) can be implemented in a general hardware description language such as Verilog, the device is portable and flexible, which helps ensure high availability, low cost, and low power consumption of the content addressable storage device.
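The scheduling step described above (placing each incoming search datum into an idle comparison unit) can be sketched in Python as follows. The representation is an assumption made for illustration: each comparison unit is modelled as one list slot that holds its current search key, and `None` marks an idle unit. The function name `schedule` is hypothetical.

```python
def schedule(unit_keys, search_keys):
    """Place search keys into idle comparison units.

    unit_keys:   list of length N; None marks an idle unit (modelling assumption).
    search_keys: search data waiting to be dispatched.
    Returns (assigned, leftover): the indices of the units that received a key,
    and the keys for which no idle unit was available.
    """
    pending = list(search_keys)
    assigned = []
    for index, held in enumerate(unit_keys):
        if not pending:
            break
        if held is None:                      # only idle units can become target units
            unit_keys[index] = pending.pop(0)
            assigned.append(index)
    return assigned, pending
```

With four units of which one is busy, two new keys go to the first two idle units; a later request for two more keys fills the last idle unit and leaves one key waiting.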
In a possible implementation, the device further includes a result output register, and the N comparison units are respectively coupled to the result output register; the data to be matched includes multiple data items. Each of the K target comparison units is further configured to send its corresponding matching result to the result output register, where the matching result includes one or more of matching indication information, the matching data item of the corresponding search data, and the address of the matching data item, and the matching indication information indicates whether there is a matching data item. The result output register is configured to receive and store the matching results sent by the K target comparison units. In the embodiments of the present application, the N comparison units are also respectively coupled to the result output register in the content addressable storage device. When any of the N comparison units completes the matching of its search data, it can send the matching result to the result output register, where the matching result may be one or more of whether the match succeeded, the specific matching data, and the address of the matching data.
In a possible implementation, the data to be matched includes M data items, where M is an integer greater than 1. The controller is specifically configured to control the memory, by address traversal, to read out the M data items in sequence and send them to the K target comparison units respectively. Each of the K target comparison units is specifically configured to compare its corresponding search data with the M data items in sequence and determine the matching data item of the corresponding search data. In the embodiments of the present application, the M data items stored in the memory are read out serially by address traversal and sent step by step to the corresponding target comparison units for comparison, so as to obtain the corresponding matching results.
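As a rough illustration of address traversal, the following Python sketch reads the entries in address order and presents each one to every target comparison unit. The result layout (a match flag, the matched item, and its address) mirrors the three kinds of matching result named in the text; the dictionary field names and the first-match-wins rule are assumptions made for the example.

```python
def traverse_and_match(memory, target_keys):
    """Read the M entries in address order and feed each one to all target units.

    Returns, for each search key, a dict with a match flag, the matched item
    and its address (field names are illustrative assumptions).
    """
    results = {key: {"hit": False, "item": None, "address": None}
               for key in target_keys}
    for address, item in enumerate(memory):   # one entry per modelled clock cycle
        for key in target_keys:               # every target unit sees the same entry
            record = results[key]
            if not record["hit"] and item == key:
                # Assumption: the first matching address wins.
                record.update(hit=True, item=item, address=address)
    return results
```

A single pass over the memory serves all K searches at once, which is why the traversal cost does not grow with K.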
In a possible implementation, the data to be matched includes M data items. The controller is specifically configured to control the memory, by address polling, to read out L data items in each clock cycle and broadcast the L data items to the N comparison units, where L is a positive integer less than or equal to M. Each of the K comparison units is configured to compare the L data items received each time with its corresponding search data. In the embodiments of the present application, when multiple search data are being looked up, multiple comparison units are performing search operations at the same time. Specifically, the controller reads out L data items from the memory and sends them in parallel to the comparison units that are currently searching, and each of those comparison units compares the L data items received each time with its stored search data to obtain the corresponding matching result. For example, suppose the comparator includes 8 comparison units and 4 of them are currently comparing data. In each clock cycle the controller reads 2 data items out of the 64 data items to be matched and sends them in parallel to those 4 comparison units, and each comparison unit compares the 2 data items received each time with the search data it holds. Once a comparison unit has obtained its matching result (after comparing against all or part of the 64 data items to be matched), its search operation can stop, and the controller no longer sends the data items read from the memory to the comparison unit that has completed its search.
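The cycle count in the example above can be checked with a small simulation. The sketch below is a behavioural model under stated assumptions: one group of L entries is read per modelled clock cycle, a unit stops at its first match, and the pass ends early once every active unit has a result. The function name `broadcast_search` is hypothetical.

```python
def broadcast_search(memory, keys, items_per_cycle):
    """Return ({key: matched address or None}, cycles_used) for one pass."""
    results = {key: None for key in keys}
    cycles = 0
    for base in range(0, len(memory), items_per_cycle):
        cycles += 1
        group = memory[base:base + items_per_cycle]   # the L entries read this cycle
        for key in keys:
            if results[key] is not None:
                continue          # this unit has finished; it is no longer fed
            for offset, item in enumerate(group):
                if item == key:
                    results[key] = base + offset
                    break
        if all(address is not None for address in results.values()):
            break                 # every active unit has its result
    return results, cycles
```

With 64 entries and L = 2, a search that misses needs the full 64 / 2 = 32 cycles, while a search whose match sits at address 3 completes in 2 cycles.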
In a possible implementation, the bit width of each of the N comparison units is L times the bit width of each of the M data items. In the embodiments of the present application, the bit width of each comparison unit in the comparator may be L times the bit width of each data item of the data to be matched stored in the memory, in which case the comparison unit can compare L data items in each clock cycle. For example, if the bit width of each data item is W and the comparison bit width of the comparison unit is L*W, the comparison unit can complete the comparison of L*W bits of data in one clock cycle.
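One way to picture an L*W-bit comparison unit is as L lanes of W bits each, all checked against the same search key. The Python sketch below models this with integer shifts and masks; the lane ordering (lane 0 in the least significant bits) and the function name `match_lanes` are assumptions for illustration only.

```python
def match_lanes(wide_word, key, item_width, lanes):
    """Compare key against each item_width-bit lane of wide_word.

    Returns the indices of the lanes whose content equals key.
    Assumption: lane 0 occupies the least significant bits.
    """
    mask = (1 << item_width) - 1
    return [lane for lane in range(lanes)
            if ((wide_word >> (lane * item_width)) & mask) == key]
```

For W = 8 and L = 2, the 16-bit word `0xAB12` holds item `0x12` in lane 0 and item `0xAB` in lane 1, and both lanes are examined in the same modelled cycle.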
In a possible implementation, the controller is further configured to control the memory to write the data to be matched in a preset manner. In the embodiments of the present application, in the initial stage the controller can complete the initial configuration of the data to be matched in a preset manner under the control of a processor outside or inside the device, for example by enumerating the addresses of all data to be matched and writing the data corresponding to each address into the memory. The content of the data to be matched depends on the specific business requirements.
In a possible implementation, the scheduler is further configured to, when the search data received within a preset time period exceeds a preset number, or when there is currently no idle comparison unit, wait for a preset time interval and then determine the target comparison units from the N comparison units. In the embodiments of the present application, when network throughput is high (for example, when the maximum instantaneous burst pps is large) or when there is currently no idle comparison unit, traffic shaping can be applied, that is, the comparison is deferred for a certain time. This improves the adaptability of the content addressable storage device and reduces the demand for instantaneous comparison resources.
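The deferral behaviour can be sketched as a small timing model. The sketch covers only the "no idle comparison unit" condition, not the per-window request limit, and it assumes that every search occupies a unit for a fixed number of cycles; the function name and parameters are hypothetical.

```python
def dispatch_with_deferral(arrivals, n_units, busy_cycles, wait_interval):
    """Return {key: cycle at which the key was dispatched to a unit}.

    arrivals:      list of (arrival_cycle, key), in arrival order.
    busy_cycles:   cycles a unit stays busy per search (modelling assumption).
    wait_interval: preset interval to wait when no unit is idle; must be >= 1.
    """
    free_at = [0] * n_units              # cycle at which each unit becomes idle
    dispatched = {}
    for arrival_cycle, key in arrivals:
        t = arrival_cycle
        while True:
            idle = [i for i, free in enumerate(free_at) if free <= t]
            if idle:
                break
            t += wait_interval           # no idle unit: wait, then look again
        free_at[idle[0]] = t + busy_cycles
        dispatched[key] = t
    return dispatched
```

With two units that stay busy for four cycles, a third request arriving at cycle 0 is dispatched at cycle 4 when the interval is 1, and at cycle 6 when the interval is 3.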
In a possible implementation, the controller is further configured to, in the process of controlling the memory to read out the data to be matched, control the memory to write new data to be matched or modify the data to be matched. In the embodiments of the present application, the controller can write data to be matched into the memory while reading. For example, for a system that needs to refresh the data to be matched online, a memory with two independent read and write ports can be selected so that the data to be matched is updated or modified while it is being read. If a single-port memory is selected, the controller needs to insert some additional refresh time while reading out the data to be matched in order to complete the refresh, which may reduce the query speed somewhat. The refresh time means that, during a refresh (writing a table entry or data item), the increment of the query (read table entry or data item) address sequence is paused (with write enable inactive at the same time), and after the write finishes, the query is re-enabled (read enable active, read address sequence incrementing).
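For the single-port case, the effect of inserted refresh time on one traversal can be modelled cycle by cycle. In this sketch the set of cycles in which a write is requested is given up front, which is a simplification; the function name `single_port_schedule` is hypothetical.

```python
def single_port_schedule(num_entries, write_cycles):
    """Return the per-cycle operations of one traversal on a single-port memory.

    write_cycles: set of cycle indices in which a refresh write is performed.
    Each element of the result is ("read", address) or ("write", None).
    """
    operations = []
    read_address = 0
    cycle = 0
    while read_address < num_entries:
        if cycle in write_cycles:
            operations.append(("write", None))   # read address is held, not incremented
        else:
            operations.append(("read", read_address))
            read_address += 1                    # query resumes: address sequence advances
        cycle += 1
    return operations
```

A traversal of 4 entries with 2 refresh writes takes 6 cycles instead of 4, which is the reduced query speed mentioned above; a memory with independent read and write ports would not pay this cost.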
In a possible implementation, the data to be matched includes M data items, where each data item includes content to be matched, query control information, and an output result. Each of the K target comparison units is specifically configured to compare its corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and to output the output result in the matched data item as the matching result of the corresponding search data. In the embodiments of the present application, the data items contained in the data to be matched carry query control information and a corresponding output result (for example, the output result is the address corresponding to the content to be matched), which controls the query method and related parameters used by the comparison unit when looking up search data, so as to meet business requirements in a variety of scenarios.
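A data item that carries its own query control information might look like the following Python sketch. The specific fields (an enable flag and a bit mask) are assumptions chosen to illustrate the idea; the text does not fix a field layout, and the names `Entry`, `care_mask`, and `lookup` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    content: int            # content to be matched
    output: int             # output result returned on a match
    enabled: bool = True    # query control: disabled entries never match
    care_mask: int = 0xFF   # query control: 1-bits are compared, 0-bits are ignored

def lookup(entries, key):
    """Return the output result of the first enabled entry that matches key, or None."""
    for entry in entries:
        if not entry.enabled:
            continue
        if (key & entry.care_mask) == (entry.content & entry.care_mask):
            return entry.output
    return None
```

Because the control fields live in ordinary memory next to the content, changing the matching behaviour of an entry is a memory write rather than a hardware change.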
In a second aspect, an embodiment of the present application provides a content addressable storage method, which can be applied to a content addressable storage device. The device includes a comparator and a memory, the comparator includes N comparison units that are respectively coupled to the memory, and N is an integer greater than 1. The method includes: storing data to be matched in the memory; obtaining K search data and scheduling the K search data respectively to K target comparison units among the N comparison units, where each target comparison unit is a comparison unit in an idle state, and K is an integer greater than or equal to 1 and less than or equal to N; controlling the memory to read out the data to be matched and send it to the K target comparison units respectively; and comparing, through each of the K target comparison units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparison result.
In a possible implementation, the device further includes a result output register, and the N comparison units are respectively coupled to the result output register; the data to be matched includes multiple data items. The method further includes: sending the matching result corresponding to each of the K target comparison units to the result output register, where the matching result includes one or more of matching indication information, the matching data item of the corresponding search data, and the address of the matching data item, and the matching indication information indicates whether there is a matching data item; and receiving and storing, through the result output register, the matching results sent by the K target comparison units.
In a possible implementation, the data to be matched includes M data items, where M is an integer greater than 1. Controlling the memory to read out the data to be matched and send it to the K target comparison units respectively includes: controlling the memory, by address traversal, to read out the M data items in sequence and send them to the K target comparison units respectively. Comparing, through each of the K target comparison units, the corresponding search data with the data to be matched and outputting the corresponding matching result according to the comparison result includes: comparing, through each of the K target comparison units, the corresponding search data with the M data items in sequence to determine the matching data item of the corresponding search data.
In a possible implementation, the data to be matched includes M data items, where M is an integer greater than 1. Controlling the memory to read out the data to be matched and send it to the K target comparison units respectively includes: controlling the memory, by address polling, to read out L data items in each clock cycle and broadcasting the L data items to the N comparison units, where L is a positive integer less than or equal to M. Comparing, through each of the K target comparison units, the corresponding search data with the data to be matched includes: comparing, through each of the K comparison units, the L data items received each time with the corresponding search data.
在一种可能的实现方式中,所述N个比较单元中的每一个比较单元的位宽为所述M条数据项中的每一个数据项的位宽的L倍。In a possible implementation, the bit width of each of the N comparison units is L times the bit width of each of the M data items.
In a possible implementation, the method further includes: controlling the memory to write the data to be matched in a preset manner.
In a possible implementation, the method further includes: when the number of search data received within a preset time period exceeds a preset number, or when no comparison unit is currently idle, waiting for a preset time interval before determining target comparison units from the N comparison units.
In a possible implementation, the method further includes: while controlling the memory to read out the data to be matched, controlling the memory to write new data to be matched or to modify the data to be matched.
In a possible implementation, the data to be matched includes a plurality of data items, where each data item includes content to be matched, query control information, and an output result. Comparing, by each of the K target comparison units, the corresponding search data with the data to be matched and outputting a corresponding matching result according to the comparison result includes: by each of the K target comparison units, comparing the corresponding search data with the content to be matched in each data item according to the query control information of the respective data item, and outputting the output result of the matching data item as the matching result of the corresponding search data.
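A data item with these three fields can be modeled as follows. This is a minimal sketch, not the claimed implementation. It assumes that the query control information is a bit mask selecting which bits take part in the comparison, and that the first matching item wins; the text fixes neither point. The names `DataItem` and `lookup` and the table contents are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DataItem:
    content: int   # content to be matched
    mask: int      # query control information (assumed: 1 bits are compared)
    result: str    # output result returned when this item matches


def lookup(search: int, items: Sequence[DataItem]) -> Optional[str]:
    """Return the output result of the first item that matches `search`."""
    for item in items:
        if (search & item.mask) == (item.content & item.mask):
            return item.result
    return None


table = [
    DataItem(content=0xC0A80100, mask=0xFFFFFF00, result="port 1"),
    DataItem(content=0x0A000000, mask=0xFF000000, result="port 2"),
]
print(lookup(0xC0A80137, table))  # port 1
print(lookup(0x0A010203, table))  # port 2
print(lookup(0x08080808, table))  # None
```

Because the comparison rule travels with each data item, the same comparison unit can serve exact and masked matches without any change to the memory structure.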
In a third aspect, this application provides a semiconductor chip, which may include:
the content addressable storage device provided in the first aspect or in any implementation of the first aspect.
In a fourth aspect, this application provides a semiconductor chip, which may include:
the content addressable storage device provided in the first aspect or in any implementation of the first aspect, a processor coupled to the content addressable storage device, and a memory external to the content addressable storage device.
In a fifth aspect, this application provides a system-on-chip (SoC) chip. The SoC chip includes the content addressable storage device provided in the first aspect or in any implementation of the first aspect, a processor coupled to the content addressable storage device, and an external memory of the content addressable storage device. The chip system may consist of chips, or may include chips and other discrete devices.
In a sixth aspect, this application provides a chip system. The chip system includes a chip containing the content addressable storage device provided in the first aspect or in any implementation of the first aspect, and a chip containing a processor coupled to the content addressable storage device and an external memory of the content addressable storage device. The chip system may consist of chips, or may include chips and other discrete devices.
In a seventh aspect, this application provides an electronic device. The electronic device includes the content addressable storage device provided in the first aspect or in any implementation of the first aspect, an external memory of the content addressable storage device, and a processor coupled to the content addressable storage device. The external memory stores the necessary program instructions and data. The processor runs the general-purpose operating system required by the electronic device and is coupled to the content addressable storage device to carry out the related processing functions of the content addressable storage device. The electronic device may further include a communication interface through which the electronic device communicates with other devices or with a communication network.
In an eighth aspect, this application provides a computer storage medium storing a computer program. When the computer program is executed by a processor, it implements the flow of the content addressable storage method provided in the second aspect or in any implementation of the second aspect.
In a ninth aspect, an embodiment of this application provides a computer program including instructions. When the computer program is executed by a computer, the computer performs the flow of the content addressable storage method provided in the second aspect or in any implementation of the second aspect.
Description of the Drawings
FIG. 1A is a schematic diagram of the functional difference between a RAM and a CAM in the prior art;
FIG. 1B is a schematic structural diagram of a typical CAM in the prior art;
FIG. 2 is a schematic diagram of emulated CAM processing based on memory address traversal provided by an embodiment of this application;
FIG. 3 is a schematic diagram of extended-bit-width emulated CAM processing based on memory address traversal provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of a content addressable storage device provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of another content addressable storage device provided by an embodiment of this application;
FIG. 6 is a schematic structural diagram of yet another content addressable storage device provided by an embodiment of this application;
FIG. 7 is a timing diagram of the comparison performed by the comparison units provided by an embodiment of this application;
FIG. 8 is a schematic flowchart of a content addressable storage method provided by an embodiment of this application.
Detailed Description
The embodiments of this application are described below with reference to the drawings in the embodiments of this application.
The terms "first", "second", "third", "fourth", and the like in the specification, claims, and drawings of this application are used to distinguish different objects rather than to describe a particular order. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units that are not listed, or optionally further includes other steps or units inherent to the process, method, product, or device.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of this application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
The terms "component", "module", "system", and the like used in this specification denote a computer-related entity, hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable file, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a process and/or thread of execution, and a component may be located on one computer and/or distributed between two or more computers. In addition, these components may execute from various computer-readable media having various data structures stored on them. Components may communicate by way of local and/or remote processes, for example in accordance with a signal having one or more data packets (for example, data from two components interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet, which interacts with other systems by way of the signal).
First, some terms used in this application are explained for ease of understanding by those skilled in the art.
(1) System on Chip (SoC). As a product, an SoC is an integrated circuit with a dedicated purpose that contains a complete system together with all of its embedded software. As a technology, it covers the whole process from defining the system functions, through software/hardware partitioning, to completing the design.
(2) Random Access Memory (RAM) is used to store and hold data and can be read and written at any time. RAM usually serves as temporary storage for the operating system or other running programs (it may be called system memory). RAM does not retain data when power is turned off; data that must be kept has to be written to long-term storage (for example, a hard disk).
(3) RAM can be further divided into two categories: Static Random Access Memory (SRAM) and Dynamic Random Access Memory (DRAM). The two share some basic principles, in that both store charge inside the memory. SRAM has a more complex structure, lower capacity per unit area, and very fast access. DRAM has a simple structure, higher capacity per unit area, and slower access than SRAM. Because of its simpler construction, the charge stored in DRAM gradually leaks away over time, so DRAM must be periodically recharged (refreshed) to preserve the data held in its capacitors.
(4) Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), DDR for short. DDR memory was developed from SDRAM and still uses the SDRAM production system, so memory manufacturers need only slightly modify the equipment that makes ordinary SDRAM to produce DDR memory, which effectively lowers cost. Compared with conventional single data rate operation, DDR performs two read/write operations per clock cycle: one on the rising edge and one on the falling edge of the clock.
(5) Read Only Memory (ROM) is a solid-state semiconductor memory from which only previously stored data can be read. Once data has been stored, it can no longer be changed or deleted. ROM is typically used in electronic or computer systems whose data does not need to change often, and the data is not lost when power is turned off.
(6) The Field Programmable Gate Array (FPGA) is a further development of programmable devices such as programmable array logic (PAL), generic array logic (GAL), and complex programmable logic devices (CPLD). It emerged as a semi-custom circuit in the field of application-specific integrated circuits (ASIC); it addresses the shortcomings of fully custom circuits and overcomes the limited gate count of earlier programmable devices.
(7) An emulator (Emu) is a device or program that can reproduce almost all of the characteristics and behavior of a hardware or software system. Its purpose is to reproduce exactly how the emulated hardware responds to the various external inputs it receives.
(8) The round robin scheduling algorithm requests the scheduling or processing of ready task queues, processes, and the like one after another in a cyclic manner.
(9) A look-up table (LUT) is essentially a RAM. Current FPGAs mostly use 4-input LUTs, so each LUT can be regarded as a RAM with 4 address lines. After a user describes a logic circuit with a schematic or a hardware description language (HDL), the PLD/FPGA development software automatically computes all possible results of the logic circuit and writes the truth table (that is, the results) into the RAM in advance. Each time signals are input for a logic operation, this is equivalent to inputting an address for a table lookup: the content at that address is found and output.
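The equivalence between a 4-input logic function and a 16-entry RAM can be shown with a short model. This is an illustrative sketch only; `build_lut`, `lut_read`, and the example function are hypothetical names and do not correspond to any FPGA vendor tool.

```python
def build_lut(fn):
    """Precompute the 16-entry truth table of a 4-input boolean function."""
    table = []
    for addr in range(16):
        a, b, c, d = (addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
        table.append(fn(a, b, c, d) & 1)
    return table


def lut_read(table, a, b, c, d):
    """Evaluate the function by using the four inputs as a RAM address."""
    addr = (a << 3) | (b << 2) | (c << 1) | d
    return table[addr]


# Example logic circuit: (a AND b) XOR (c OR d)
lut = build_lut(lambda a, b, c, d: (a & b) ^ (c | d))
print(lut_read(lut, 1, 1, 0, 0))  # 1
print(lut_read(lut, 1, 1, 1, 0))  # 0
```

After `build_lut` has run, no logic is evaluated at lookup time; `lut_read` only forms an address and reads the stored bit.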
(10) Packets per second (pps) is a common unit of network throughput, that is, the number of packets sent per second. Network performance is usually measured by throughput. The packet forwarding rate indicates a switch's capacity to forward packets and generally ranges from tens of Kpps to hundreds of Mpps. Expressed in Mpps, it states how many million packets the switch can forward per second. The packet forwarding rate thus expresses the switching capacity of the switch in units of packets.
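As a worked example of the pps unit, the sketch below computes the maximum packet rate of an Ethernet link and the number of clock cycles available per packet. The link speed (10 Gbit/s), the 1 GHz clock, and the function names are assumptions chosen for illustration and do not come from this application. The 20 bytes of per-frame overhead are the standard Ethernet preamble, start-of-frame delimiter, and inter-frame gap.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap


def line_rate_pps(link_bps, frame_bytes, overhead_bytes=ETHERNET_OVERHEAD_BYTES):
    """Maximum packets per second for back-to-back frames of one size."""
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def cycles_per_packet(clock_hz, pps):
    """Clock cycles available between consecutive packets."""
    return clock_hz / pps


short = line_rate_pps(10e9, 64)     # minimum-size frames
long_ = line_rate_pps(10e9, 1518)   # maximum-size untagged frames
print(f"{short / 1e6:.2f} Mpps")                        # 14.88 Mpps
print(f"{long_ / 1e6:.3f} Mpps")                        # 0.813 Mpps
print(f"{cycles_per_packet(1e9, short):.1f} cycles")    # 67.2 cycles
```

Under these assumed figures, a per-packet lookup has about 67 cycles in a burst of short packets, so a table deeper than that cannot be fully traversed for each packet when lookups are strictly serial.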
(11) Throughput is the amount of data (measured in bits, bytes, and so on) successfully transferred through a network, device, port, or other facility per unit time. In other words, throughput is the maximum data rate that a device can receive and forward without frame loss. Throughput is determined mainly by the hardware of the network device's internal and external network ports and by the efficiency of its algorithms. The algorithms matter in particular: for devices that must perform a large amount of computation, an inefficient algorithm greatly reduces the traffic that can be handled.
To aid understanding of the embodiments of this application, the technical problem to be solved and the corresponding application scenarios are analyzed further below. When a CAM performs a content lookup, its highly parallel search relies on a comparison circuit in every storage cell of the CAM array. In other words, the CAM array structure differs from the data storage array structure of ordinary memories (such as general-purpose memory and RAM), so a CAM cannot be built on a general-purpose memory structure to provide content-based addressing. Two schemes that use a general-purpose memory structure to provide a content addressable storage function are described below:
Scheme 1: emulated CAM based on memory address traversal:
This scheme can be summarized as follows. The entry data to be matched is first stored in the memory. Each time a query is started, all entries across the full memory depth are read once by address traversal, and during the read the target match data (the search input data in this application) is compared one by one with each entry read out (a data item in this application). When the address traversal ends, one emulated CAM operation is complete. The next query must wait for the previous query to finish and then repeats the full address traversal. As shown in FIG. 2, which is a schematic diagram of emulated CAM processing based on memory address traversal provided by an embodiment of this application, assume that the addresses of all entries in the memory are 0, 1, 2, 3, 4, 5, 6, and 7, that T0 is the start time of the first query, T1 the start time of the second query, and T3 the start time of the third query. It can be seen that the second query can start no earlier than the completion of the first query; the time slices of the individual queries do not overlap, and only one query request is processed at any time. This scheme compares serially and uses a general-purpose memory structure, so its cost and power consumption are relatively low, its back-end risk (the feasibility risk of converting the logic design into a physical circuit) is small, and its flexibility is high.
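The serial behavior of Scheme 1 can be captured in a few lines. This is a simplified timing model under stated assumptions: one entry is compared per clock cycle, time is counted in cycles, and the function name `serial_schedule` and the arrival times are hypothetical.

```python
def serial_schedule(arrivals, depth):
    """Return (start, finish) cycles per query when queries cannot overlap.

    arrivals: cycle at which each query request arrives, in arrival order
    depth:    number of entries traversed per query (one entry per cycle)
    """
    schedule = []
    free_at = 0                       # cycle at which the traversal logic is idle again
    for arrival in arrivals:
        start = max(arrival, free_at)  # must wait for the previous query to finish
        finish = start + depth
        schedule.append((start, finish))
        free_at = finish
    return schedule


print(serial_schedule([0, 1, 2], depth=8))
# [(0, 8), (8, 16), (16, 24)]
```

Three queries arriving in consecutive cycles finish at cycles 8, 16, and 24, so the wait experienced by a query grows with the number of queries ahead of it.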
Drawback of this processing structure: every query incurs the processing delay of a full address traversal, so throughput is low and performance is poor.
Scheme 2: extended-bit-width emulated CAM based on memory address traversal:
This scheme builds on Scheme 1 by increasing the storage word width of the memory. For example, if the word is widened to hold 2 entries and 2 entries are compared at a time, the memory depth is halved and the processing delay falls accordingly: the time for a single query is reduced to half, with two entries compared in one clock cycle. As shown in FIG. 3, which is a schematic diagram of extended-bit-width emulated CAM processing based on memory address traversal provided by an embodiment of this application, assume that the addresses of all entries in the memory are 0, 1, 2, 3, 4, 5, 6, and 7, that T0 is the start time of the first query, T1 the start time of the second query, and T3 the start time of the third query. It can be seen that two entries are compared in each clock cycle and the time for a single query is halved. However, the second query still can start no earlier than the completion of the first query; the time slices of the individual queries still do not overlap, and only one query request is processed at any time. This scheme is therefore also a serial comparison method and can also use a general-purpose memory structure, with relatively low cost and power consumption, small back-end risk, and high flexibility.
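How widening the word changes the memory shape can be computed directly. The sketch below is illustrative only: the table size (1024 entries of 64 bits) and the function name `widened_shape` are assumptions, not values from this application.

```python
def widened_shape(depth, item_bits, items_per_word):
    """Return (words, word_bits, cycles_per_query) after packing items into wider words."""
    words = -(-depth // items_per_word)     # ceiling division
    word_bits = item_bits * items_per_word
    return words, word_bits, words          # one word is read per clock cycle


for items_per_word in (1, 2, 16):
    print(items_per_word, widened_shape(1024, 64, items_per_word))
# 1 (1024, 64, 1024)
# 2 (512, 128, 512)
# 16 (64, 1024, 64)
```

The total capacity is unchanged, but the query time only falls as fast as the word grows: reaching 64 cycles per query in this example requires a memory only 64 words deep and 1024 bits wide.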
Drawback of this processing structure: the memory word becomes markedly wider and the memory markedly shallower. In practice, memory shapes that use resources efficiently are subject to certain constraints, so this change may force a worse memory selection. In extreme cases the high-density storage advantage of the memory is lost, which ultimately shows up as increased chip area and power consumption.
Analysis of typical network system application scenarios shows that the extreme requirement of issuing one query request in every system clock cycle usually does not arise. In Ethernet packet processing, for example, lookups are generally performed per packet rather than per byte, and the packet rate is far lower than the byte rate, so a large number of query requests does not normally occur within a short time. In a short-packet burst scenario, however, if the table is deep (that is, there are many data items to be matched), the pps performance of Scheme 1 may fall short: the deep table makes each query long, and because queries do not overlap in time, the network throughput cannot be guaranteed during a burst of short packets. The prior-art CAM and Scheme 2 achieve better network throughput than Scheme 1, but suffer from degraded resource usage and power consumption or from limited matching flexibility.
In summary, existing content-based addressing schemes cannot meet network performance requirements. This application therefore provides a flexible CAM with low power consumption, small area, strong portability, and strong scalability to solve the above technical problems.
Based on the above, the content addressable storage device and related equipment provided by the embodiments of this application are described below. Refer to FIG. 4, which is a schematic structural diagram of a content addressable storage device provided by an embodiment of this application. The content addressable storage device 40 may include a scheduler 401, a comparator 402, a controller 403, and a memory 404. The comparator 402 includes N comparison units 4021, each of which is coupled to the memory 404, where N is an integer greater than 1. Optionally, the scheduler 401, the comparator 402, the controller 403, and the memory 404 may be located on one integrated circuit substrate. Specifically:
The memory 404 is configured to store the data to be matched. The data to be matched consists of the data items against which each of the K search data acquired by the scheduler 401 must be matched; that is, every search data is matched against the data to be matched to obtain its matching result. The data to be matched can be obtained in various ways, such as static configuration or dynamic learning. Before any target comparison unit 4021 starts comparing search data, the controller 403 controls the memory 404 to write the data to be matched. Optionally, while controlling the memory 404 to read out the data to be matched, the controller 403 may also control the memory 404 to write new data to be matched or to modify the data to be matched. In other words, the controller 403 may control the memory 404 so that the data to be matched is written into the memory 404 while it is being read. For example, a system that needs to refresh the data to be matched online can use a memory with two independent ports, one for reading and one for writing, so that the data to be matched is updated or modified while it is being read. If a single-port memory is used, the controller 403 can insert additional refresh time while controlling the memory 404 to read out the data to be matched in order to complete the refresh, which may reduce the query speed somewhat. Refresh time here means that, during a refresh (writing an entry or data item), the increment of the query (entry or data item read) address sequence is paused (with the write enable deasserted at the same time), and after the write finishes, the query is enabled again (read enable asserted, read address sequence incrementing).
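For the single-port case, the inserted refresh time can be pictured as a stall of the read address sequence. The following is a simplified sketch under stated assumptions: each write occupies exactly one cycle, the read address wraps at the memory depth, and the function name `single_port_trace` is hypothetical.

```python
def single_port_trace(depth, write_cycles, total_cycles):
    """Return the per-cycle operation of a single-port memory.

    write_cycles: set of cycle numbers in which an entry refresh (write) occurs
    Each element is ("read", address) or ("write", None).
    """
    trace = []
    addr = 0
    for cycle in range(total_cycles):
        if cycle in write_cycles:
            trace.append(("write", None))   # read address sequence is held
        else:
            trace.append(("read", addr))
            addr = (addr + 1) % depth       # resume incrementing after the write
    return trace


print(single_port_trace(depth=4, write_cycles={2}, total_cycles=6))
# [('read', 0), ('read', 1), ('write', None), ('read', 2), ('read', 3), ('read', 0)]
```

The read sequence resumes at the address where it was paused, so no entry is skipped; the cost is one extra cycle of query latency per write.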
In a possible implementation, the storage structure of the memory 404 may take the form of a storage array. The storage array consists of many basic storage cells, each of which stores one binary digit (1 or 0), called a bit. One row of storage cells forms one word of the memory 404 (also called a data item, data table entry, or entry), W is the "word width" or "bit width", and the number K of words in the storage array is called the "depth"; the capacity of the memory 404 is expressed as (K words × W bits). Unlike the prior-art CAM, the memory 404 itself has no data comparison function, that is, it contains no comparison circuit, so a general-purpose memory structure can be used. For example, the memory 404 may be a general-purpose random access memory (RAM) or another volatile storage device, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), or double data rate SDRAM (DDR SDRAM). The memory 404 may also be a general-purpose read only memory (ROM) or another non-volatile memory, such as programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash ROM. The memory 404 may further be general-purpose registers on a processor, flash memory, or any other suitable type of memory. It can be understood that if the memory 404 is a RAM, the data to be matched can change, and if the memory 404 is a ROM, the data to be matched is fixed in the memory 404. This application does not limit the specific form in which the memory 404 stores the data to be matched, and the form may vary according to actual needs or service conditions.
The scheduler 401 is configured to obtain K pieces of search data and schedule them respectively to K target comparison units among the N comparison units, where a target comparison unit is a comparison unit in an idle state, and K is an integer greater than or equal to 1 and less than or equal to N. For example, when the scheduler 401 receives a query request that includes search data, a search type, and so on, this indicates that the content addressable storage device 40 needs to perform a search operation for that search data. The scheduler 401 then needs to allocate a comparison unit to the search data to look up the matching result; when there are multiple query requests, the scheduler 401 needs to allocate a different target comparison unit to each query request. Based on the dynamically received query requests and the current states of the N comparison units 4021 in the comparator 402, the scheduler 401 dynamically selects an idle target comparison unit (which may be any comparison unit 4021 in FIG. 4), starts or enables the query function of that target comparison unit, and sends the search data to it, thereby completing the allocation and scheduling of query requests to query resources. The K pieces of search data may be obtained by the scheduler 401 at the same time or at different times; that is, they may be scheduled to the K target comparison units simultaneously or one after another. This may depend on the order in which the K pieces of search data arrive at the scheduler, or on a preset or flexibly adjustable scheduling rule of the scheduler 401, which is not specifically limited in the embodiments of this application. For example, the scheduler 401 may schedule a piece of search data to a target comparison unit immediately after receiving it, or may wait until a certain amount of search data has been received and then schedule it all together to the corresponding target comparison units. It should be noted that the scheduler 401 may receive a large number of query requests, that is, a large amount of search data, at a certain moment or within a certain time period. Because the number of comparison units in the comparator 402 is limited, the scheduler 401 may apply traffic shaping control: it currently obtains only K pieces of the search data for querying, and schedules the remaining search data after comparison units become idle, so as to avoid congestion or discarding of search data as far as possible.
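As an illustration only, the scheduling behaviour described above can be sketched in software. The sketch is written in Python rather than a hardware description language; the class name Scheduler, its method names, the waiting queue, and the rule of picking the lowest-numbered idle unit are assumptions made for this example and are not defined in this application.

```python
from collections import deque


class Scheduler:
    """Toy model of scheduler 401: hands search data to idle comparison units."""

    def __init__(self, n_units):
        self.busy = [False] * n_units      # in-use flag of each comparison unit
        self.assigned = [None] * n_units   # search data currently held by each unit
        self.pending = deque()             # search data waiting for an idle unit

    def submit(self, search_data):
        """Accept one query request and try to dispatch it immediately."""
        self.pending.append(search_data)
        return self.dispatch()

    def dispatch(self):
        """Assign waiting search data to idle units; return (unit, data) pairs."""
        started = []
        for unit, is_busy in enumerate(self.busy):
            if not self.pending:
                break
            if not is_busy:
                data = self.pending.popleft()
                self.busy[unit] = True
                self.assigned[unit] = data
                started.append((unit, data))
        return started

    def release(self, unit):
        """Mark a unit idle after its search completes, then reuse it if needed."""
        self.busy[unit] = False
        self.assigned[unit] = None
        return self.dispatch()
```

With two comparison units, a third request stays in the queue until release() reports that a unit has become idle, which mirrors the traffic shaping behaviour in which the remaining search data is scheduled only after a comparison unit is free.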
The controller 403 is configured to control the memory 404 to read out the data to be matched and send it to the K target comparison units respectively. When a comparison unit in the comparator 402 is selected as a target comparison unit, that is, when one or more comparison units need to perform a search operation, the comparator 402 may notify the controller 403 of the status information (idle state or in-use state) of the target comparison unit, so as to complete the combined read/write enable control exerted by the comparator 402 on the controller 403 and ultimately control the output of the memory 404. In other words, when at least one comparison unit in the comparator 402 is selected as a target comparison unit, the controller 403 needs to control the memory 404 to keep reading out the data items to be matched, which requires keeping the port of the memory 404 in a readable state. Optionally, the scheduler 401 may instead notify the controller 403 of the status information of the target comparison unit, which is not specifically limited in the embodiments of this application. The structure in FIG. 4 takes notification by the comparator 402 as an example; it can be understood that if the notification comes from the scheduler 401, a communication connection is required between the scheduler 401 and the controller 403, and details are not repeated here. Under the control of the controller 403, the memory 404 sends the read data to be matched to the K target comparison units respectively; the data may be broadcast to the N comparison units (which include the K target comparison units) or, optionally, sent only to the K target comparison units. For example, when the controller 403 learns through the scheduler 401 or the comparator 402 that the comparison unit 4021-2 and the comparison unit 4021-4 are currently acting as target comparison units and comparing search data, the controller 403 controls the memory 404 to broadcast the read data to be matched to the N comparison units, or to send it only to the comparison unit 4021-2 and the comparison unit 4021-4. It should be noted that if the memory 404 sends the read data only to the K target comparison units, the memory 404 needs to know which comparison units are currently target comparison units; in this case, the output of the memory 404 can be controlled by using the status information of each comparison unit known to the controller 403 (notified by the scheduler 401 or the comparator 402).
Further, before the search operation starts, the controller 403 is further configured to control the memory 404 to write the data to be matched in a preset manner. In the initial stage, the controller 403 may complete the initial configuration of the data to be matched in a preset manner under the control of a processor external or internal to the device. For example, after the content addressable storage device 40 is powered on and before it enters the data search operation, the controller 403, under the control of an external processor, first completes the initialization of the memory 404, which includes, for example, defining the match word width and the output result bit width, selecting the read/write mode of the data table entries (the data to be matched), enumerating the addresses of all the data to be matched, and writing the data to be matched corresponding to each address into the memory 404. The content of the data to be matched depends on specific service requirements.
The comparator 402 includes N comparison units 4021 used to look up search data, that is, the set of comparison units 4021-1, 4021-2, ..., 4021-N in FIG. 4. The N comparison units in the comparator 402 are each coupled to the memory 404; as shown in FIG. 4, there is a separate physical connection between each comparison unit 4021 and the memory 404. The N comparison units 4021 therefore form a parallel comparison structure: their lookups do not interfere with one another, and neither the start times of the lookups nor the search data being looked up need to be related. When an idle comparison unit 4021 is selected as a target comparison unit, it compares the search data sent by the scheduler 401 with the data to be matched sent from the memory 404, and determines the matching result of the search data according to the comparison result. The matching result includes one or more of the following: matching indication information, the matching data item of the corresponding search data, and the address of the matching data item, where the matching indication information indicates whether there is a matching data item. For example, in practical applications it may be unnecessary to traverse all data items, such as when only the first match is needed or when it is only necessary to determine whether any match exists; in this case the traversal can stop early to speed up the response or reduce switching power consumption. The corresponding matching result may be output after a fixed delay or immediately, depending on the requirements of other components of the system. That is, in some cases it is only necessary to find whether there is a hit, and in other cases it is necessary to find the specific matching item or its address. It can be understood that, in addition to logic resources such as a comparison circuit, each comparison unit 4021 may include maintenance control for necessary information such as its own comparison state, and a register for storing the search data sent by the scheduler 401.
In a possible implementation, when the amount of search data received by the scheduler 401 within a preset time period exceeds a preset number, or when there is currently no idle comparison unit, the scheduler 401 further waits for a preset time interval before determining an idle target comparison unit from the N comparison units. That is, when the throughput in the network is high, for example when the instantaneous burst maximum pps is large, or when there is currently no idle comparison unit, traffic shaping can be used, that is, the comparison is deferred for a certain time. This improves the adaptability of the content addressable storage device in the embodiments of this application and reduces the demand for instantaneous comparison resources.
In the embodiments of this application, the multiple comparison units in the comparator of the content addressable storage device serve as comparison resources that can be called dynamically and flexibly. When input search data needs to be looked up, an idle comparison unit among the multiple comparison units is called to compare the search data with the data to be matched in the memory item by item. Because the comparator includes N comparison units, up to N pieces of search data can be looked up in parallel at the same time. In a prior-art CAM, by contrast, every storage cell of the CAM array has its own comparison circuit. Although this allows the search data to be compared with all the data stored in the array simultaneously, so that the comparison result is obtained in a few clock cycles or even a single clock cycle, the dedicated comparison circuit in every storage cell leads to a large chip area, high power consumption, and high cost. Moreover, once a CAM leaves the factory, its physical structure and related parameters are fixed, so its portability and flexibility are poor. In the embodiments of this application, the comparison units of the comparator in the content addressable storage device are shared, and parallel lookup of search data can be implemented according to actual lookup requirements. This multiplies the processing capability of the device, covers scenarios with both low and high instantaneous network throughput, and avoids the high-cost, high-power CAM structure, thereby maintaining performance while reducing cost and power consumption. Furthermore, because the memory in the device does not need a dedicated CAM structure and can use a relatively general memory structure, and the other functional structures in the device (the scheduler, the controller, and so on) can be implemented in a general description language such as the hardware description language Verilog, the device is highly portable and flexible, which helps ensure the high availability, low cost, and low power consumption of the content addressable storage device.
In FIG. 4 above, before the scheduler 401 selects an idle target comparison unit from the N comparison units 4021, or before the controller 403 determines which comparison unit the currently read data to be matched is to be sent to, each comparison unit 4021 in the comparator 402 may send its own status information (idle state or in-use state) directly to the scheduler 401 and/or the controller 403; alternatively, each comparison unit 4021 may send its status information to a state merging module in the comparator 402 for aggregation, and the state merging module then sends all the aggregated status information to the scheduler 401 and/or the controller 403. This is not specifically limited in the embodiments of this application.
As shown in FIG. 5, FIG. 5 is a schematic structural diagram of another content addressable storage device provided by an embodiment of this application. The scheduler 401 and the N comparison units 4021 are connected in parallel through N physical connections 001. The idle state or in-use state of each of the N comparison units 4021 can be reflected on its physical connection 001 as a level change, and the scheduler 401 can sense the current idle or in-use state of each comparison unit 4021 from the level change on that connection. For example, when a comparison unit 4021 is idle, the scheduler 401 keeps the physical connection 001 to it at a low level; when the comparison unit 4021 is in use, the scheduler 401 pulls the physical connection 001 to it to a high level. After idle comparison units 4021 are selected as target comparison units (such as the target comparison units 4021-2, 4021-4, and 4021-5 in FIG. 5), they receive the search data (search data a, search data b, and search data c) sent by the scheduler 401 through the corresponding physical connections (001a, 001b, and 001c in FIG. 5) and store it.
Further, each comparison unit 4021 in the comparator 402 can feed back its current status information (idle state or in-use state) to the controller 403 through a physical connection 002. Through the physical connections 002, the controller 403 learns and keeps track of which comparison units are currently performing a search operation, and controls the memory 404 to send the data to be matched step by step, in parallel, to the corresponding target comparison units through physical connections 003.
Assume that there are currently three pieces of search data: search data a, search data b, and search data c, and that the comparison units selected for them by the scheduler 401 are the target comparison unit 4021-2, the target comparison unit 4021-4, and the target comparison unit 4021-5 respectively. After the search operation functions of 4021-2, 4021-4, and 4021-5 are enabled, the comparator 402 notifies the controller 403 of the enabled state of the target comparison units (the notification may also come from the scheduler 401). After learning the status information, the controller 403 merges the states of the N comparison units: 4021-2, 4021-4, and 4021-5 are in the enabled state, and the other comparison units are in the disabled state. The controller 403 combines these states with the initialization control of the memory 404 and sends the merged control information to the memory 404, so that data items are read out step by step and sent in parallel to the enabled target comparison units 4021-2, 4021-4, and 4021-5. For example, in the address polling manner, one data item is sent in parallel to each of the target comparison units 4021-2, 4021-4, and 4021-5 in each clock cycle (multiple data items may also be sent, depending on the bit width relationship between the comparison unit 4021 and the memory 404). Finally, the target comparison units 4021-2, 4021-4, and 4021-5 compare their stored search data, step by step, with the data items that the controller 403 causes to be read out of the memory 404 and sent over the corresponding physical connections (003a, 003b, and 003c in FIG. 5), until the final matching result of the search data is obtained. It can be understood that the comparison units 4021 may start their lookups at the same time or at different times; that is, the lookups performed by different comparison units 4021 do not affect or interfere with one another.
In a possible implementation, the data to be matched includes M data items. The controller 403 is specifically configured to control the memory, in an address traversal manner, to read out the M data items in sequence and send them to the K target comparison units respectively; the data items may be sent only to the K target comparison units (such as the comparison units 4021-2, 4021-4, and 4021-5 in FIG. 5) or broadcast to the N comparison units (such as the comparison units 4021-1, 4021-2, ..., 4021-N in FIG. 5). Each of the K target comparison units is specifically configured to compare the corresponding search data with the M data items in sequence and determine the matching data item of the corresponding search data. In the embodiments of this application, the M data items stored in the memory are read out serially by address traversal and sent step by step to the corresponding target comparison units for comparison, so as to obtain the corresponding matching results. Optionally, the data items may be read out one at a time, two at a time, or several at a time.
As shown in FIG. 6, FIG. 6 is a schematic structural diagram of yet another content addressable storage device provided by an embodiment of this application. The content addressable storage device 40 in FIG. 6 further includes a result output register 405, and the N comparison units 4021 are each coupled to the result output register 405. The target comparison unit further sends the matching result to the result output register 405, which receives and stores it. Because the N comparison units 4021 are each coupled to the result output register 405 in the content addressable storage device 40 through physical connections 004, any comparison unit 4021 that completes the matching of its search data can send the matching result to the result output register 405 in parallel through the corresponding physical connection 004. Optionally, the memory 404 reads out data in the address polling manner under the control of the controller 403, so that the data read out can take part in the independent queries of multiple comparison units at the same time. In a possible implementation, the comparator 402 of the content addressable storage device 40 in FIG. 6 further includes a state merging module 4022; each comparison unit 4021 sends its status information to the state merging module 4022 for aggregation, and the state merging module 4022 then sends all the aggregated status information to the controller 403. Optionally, the number N of comparison units is greater than or equal to (instantaneous burst maximum pps) × (static table lookup period), where the instantaneous burst maximum pps can be determined from the minimum interval at which query requests actually arrive, and the static table lookup period is the time taken to traverse the lookup table addresses (the addresses of the data to be matched) once. In practical applications, the device can be used together with a traffic shaping module already present in the system to reduce the peak pps.
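The sizing rule N ≥ (instantaneous burst maximum pps) × (static table lookup period) can be checked with a short calculation. The following Python sketch is illustrative only: the function name min_comparison_units, the assumption that the memory delivers a fixed number of data items per clock cycle, and all numeric values in the usage note are assumptions made for this example and do not come from this application.

```python
import math


def min_comparison_units(peak_pps, entries, items_per_cycle, clock_hz):
    """Smallest N with N >= peak_pps * (static table lookup period).

    Assumes one memory read per clock cycle that returns `items_per_cycle`
    data items, so one full traversal takes ceil(entries / items_per_cycle)
    cycles. Integer arithmetic is used to avoid rounding errors.
    """
    cycles_per_lookup = math.ceil(entries / items_per_cycle)
    # lookup period = cycles_per_lookup / clock_hz seconds, so
    # N >= peak_pps * cycles_per_lookup / clock_hz, rounded up.
    needed = (peak_pps * cycles_per_lookup + clock_hz - 1) // clock_hz
    return max(1, needed)
```

For instance, with an assumed table of 64 data items read 2 per cycle at an assumed 500 MHz clock, one traversal takes 32 cycles (64 ns); an assumed burst of 100 million requests per second then needs at least 7 comparison units, while 10 million requests per second needs only 1.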
For example, as shown in FIG. 7, FIG. 7 is a timing diagram of comparisons by the comparison units provided by an embodiment of this application. In FIG. 7, the times T0, T1, T2, T3, T4, and T5, in chronological order, correspond to the first, second, third, fourth, fifth, and sixth queries respectively. The first and fourth queries (T0 and T3) are performed by the comparison unit 402-2, the second and fifth queries (T1 and T4) by the comparison unit 402-4, and the third and sixth queries (T2 and T5) by the comparison unit 402-5. That is, the comparison units can look up search data in parallel, and each comparison unit returns to the idle state after completing a query and can then take on the next one. For example, the comparison unit 402-2 starts the query for search data a at time T0 and, after completing it, can start a new query for search data d; while the comparison unit 402-2 is still comparing search data a, the comparison unit 402-4 starts the lookup of search data b and later the lookup of search data e. It should be noted that all the data to be matched in the memory 404 is read out by address polling; in terms of timing, this is the row of addresses of the data to be matched in FIG. 7, which are read out cyclically. As long as any comparison unit is still querying its search data, the controller 403 keeps the data in the memory 404 being read out cyclically. A comparison unit does not necessarily start comparing from the data item with the lowest or highest address: for example, the comparison starts from address 6 at time T1 and from address 2 at time T3. That is, the data item from which a target comparison unit starts comparing depends on the data currently being read out of the memory 404. In the time period corresponding to address 5 (hold) in FIG. 7, no comparison unit is executing a query task, and outputting the data to be matched at this time would waste resources. If the comparison units indicate their state to the controller 403 through high and low levels, all N comparison units are at the low level in this period, and the controller 403 pauses the reading and matching of data. When search data is sent to the comparison unit 402-5 at time T5, the controller 403 resumes reading from the data item at which it paused and sends the data to the comparison unit 402-5 for comparison.
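The polling behaviour in FIG. 7 (a query joins at whatever address is currently being read, completes after one full lap, and the read address holds when no unit is active) can be modelled in a few lines. The following Python sketch is an illustration under stated assumptions: the class name AddressPoller, one data item per clock cycle, exact-equality matching, and a full lap for every query are choices made for this example, not features defined in this application.

```python
class AddressPoller:
    """Toy model of controller 403 cycling through the table addresses."""

    def __init__(self, table):
        self.table = table
        self.addr = 0        # address that will be read on the next cycle
        self.active = {}     # unit id -> [search key, items left to see, hit addresses]

    def start(self, unit, key):
        """A comparison unit joins the stream at the current read address."""
        self.active[unit] = [key, len(self.table), []]

    def tick(self):
        """Advance one clock cycle; return {unit: hit addresses} for finished units."""
        if not self.active:
            return {}                      # hold: nothing is read, address unchanged
        item = self.table[self.addr]       # one read, shared by all active units
        done = {}
        for unit, state in list(self.active.items()):
            if item == state[0]:
                state[2].append(self.addr)
            state[1] -= 1
            if state[1] == 0:              # this unit has seen every address once
                done[unit] = state[2]
                del self.active[unit]
        self.addr = (self.addr + 1) % len(self.table)
        return done
```

In this model a unit that starts while address 2 is being read compares addresses 2, 3, 0, and 1 of a four-entry table, and once the last active unit finishes, tick() leaves the address pointer where it stopped so that the next query resumes from there.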
In a possible implementation, K of the N comparison units are currently comparing search data, and different comparison units correspond to different search data, where K is a positive integer less than or equal to M. The data to be matched includes M data items. The controller 403 is specifically configured to control the memory 404, by address polling, to read out L data items in each clock cycle and broadcast the L data items to the N comparison units, where L is a positive integer less than or equal to M. Each of the K comparison units is configured to compare the L data items received each time with its corresponding search data. That is, when multiple pieces of search data are being looked up, multiple comparison units perform search operations at the same time. Specifically, the controller 403 controls the memory 404 to read out L data items in each clock cycle and broadcast them to the N comparison units; optionally, the data items may be sent in parallel only to the target comparison units that are currently performing a search operation. Each target comparison unit that is performing a search operation compares the L data items received each time with the search data it stores, so as to obtain the corresponding matching result. For example, assume that M=64, N=8, L=2, and K=4. The comparator 402 then includes 8 comparison units, of which 4 target comparison units are currently comparing data. In each clock cycle, the controller 403 reads out 2 of the 64 data items to be matched and sends them in parallel to the 4 target comparison units (or broadcasts them to all 8 comparison units), and each target comparison unit compares the 2 data items received each time with the search data it stores. Once a target comparison unit has obtained its matching result (after comparing with all or some of the 64 data items to be matched), its search operation can be stopped, and the controller 403 can control the memory so that the data items read out are no longer sent to the comparison unit that has completed its search task.
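The M=64, L=2 example above can be traced with a small model in which each cycle delivers L data items to every unit that is still searching. The following Python sketch is illustrative only: the function name broadcast_search, exact-equality matching, stopping at the first hit, and starting every unit at address 0 (rather than at the current polling address) are simplifying assumptions made for this example.

```python
def broadcast_search(table, keys, items_per_cycle):
    """Simulate K units sharing one read stream of L items per clock cycle.

    table: the M data items, indexed by address.
    keys:  {unit id: search data} for the K target comparison units.
    Returns {unit id: (matched address or None, cycles used)}.
    """
    results = {}
    pending = dict(keys)                 # units that have not finished yet
    cycles = 0
    for base in range(0, len(table), items_per_cycle):
        if not pending:
            break                        # no active unit left: stop reading
        cycles += 1
        chunk = table[base:base + items_per_cycle]   # the L items read this cycle
        for unit, key in list(pending.items()):
            for offset, item in enumerate(chunk):
                if item == key:
                    results[unit] = (base + offset, cycles)
                    del pending[unit]    # this unit stops receiving data
                    break
    for unit in pending:                 # traversed all M items without a hit
        results[unit] = (None, cycles)
    return results
```

With 64 data items and L=2, a unit whose search data sits at address 63 finishes in 32 cycles instead of 64, and a unit whose search data sits at address 0 finishes in the first cycle while the others continue.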
In a possible implementation, the bit width of each of the N comparison units is L times the bit width of each of the M data items. That is, the bit width of each comparison unit in the comparator 402 may be L times the bit width of each data item in the data to be matched stored in the memory 404, in which case the comparison unit can compare L data items in each clock cycle. For example, if the bit width of each data item is W and the comparison bit width of the comparison unit is L*W, the comparison unit can complete the comparison of L*W bits of data in one clock cycle. Within the range permitted by component selection and with an acceptable back-end impact, the embodiments of this application can further balance performance and cost by widening the word width, performing multiple comparisons in one clock cycle to speed up queries at a relatively low cost in area and power consumption.
In a possible implementation, when multiple matching results occur, a priority encoder may be added to the content addressable storage device 40, located between the comparison units 4021 and the result output register 405. The priority encoder encodes and outputs the matching result with the highest priority (for example, the data item at the lowest address).
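The priority encoder's selection rule can be illustrated as follows, assuming the lowest-address-wins policy given as an example above. The function name priority_encode and the representation of match results as a list of per-address flags are assumptions made for this Python sketch.

```python
def priority_encode(match_flags):
    """Return the lowest address whose match flag is set, or None if no match.

    match_flags[i] is True when the data item at address i matched.
    """
    for address, hit in enumerate(match_flags):
        if hit:
            return address
    return None
```

For flags [False, True, False, True], addresses 1 and 3 both match and the sketch reports address 1.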
In a possible implementation, a match address history register may be added to the content addressable storage device 40. When a match occurs for the first time, the address at which the match occurred is recorded; when a later match occurs, the recorded address is compared with the new match address to determine whether to update the matching result and the match address history register. For example, under a low-address-first policy, a new match with a smaller address always replaces the previously recorded match with a larger address, together with its result.
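A minimal sketch of the match address history register under the low-address-first policy might look as follows. The class name MatchHistory, the update method, and its boolean return value are assumptions made for this Python illustration.

```python
class MatchHistory:
    """Toy model of a match address history register (low address wins)."""

    def __init__(self):
        self.address = None   # address of the best match seen so far
        self.result = None    # matching result stored with that address

    def update(self, address, result):
        """Record a new match; return True if the register was refreshed."""
        if self.address is None or address < self.address:
            self.address = address
            self.result = result
            return True
        return False
```

Because matches can arrive in any address order when a query joins the polling stream mid-lap, this comparison lets the unit keep the lowest matching address without buffering every match.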
In a possible implementation, the data to be matched includes M data items, where each data item includes content to be matched, query control information, and an output result. Each of the K target comparison units is specifically configured to compare the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and to output the output result of the matched data item as the matching result of the corresponding search data. That is, in the above search, each data item contained in the data to be matched can carry query control information and a corresponding output result (for example, the output result is the address corresponding to the content to be matched). The query control information may include the min/max range or bit mask used in the comparison, possibly combined with other logical operations, as determined by system requirements. For example, a data item stored in the memory can be defined as a triple {content to be matched, query control information, output result}. One or more data items may take the form {min, max, mask, item_en, result}, meaning that when the search data d0 to be queried satisfies min <= (d0 & mask) <= max and item_en equals 1, the data item is the matching data item of the search data, and result, that is, the matching result in this application, can be output.
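The condition min <= (d0 & mask) <= max with item_en equal to 1 can be written out directly. In the following Python sketch the field names follow the {min, max, mask, item_en, result} example above; the class name Entry, the functions entry_matches and lookup, the first-match policy, and the sample values in the usage note are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One data item: range and mask as query control information, plus a result."""
    min: int
    max: int
    mask: int
    item_en: int
    result: str


def entry_matches(entry, d0):
    """True when the entry is enabled and min <= (d0 & mask) <= max."""
    return entry.item_en == 1 and entry.min <= (d0 & entry.mask) <= entry.max


def lookup(entries, d0):
    """Return the result of the first matching entry, or None if none matches."""
    for entry in entries:
        if entry_matches(entry, d0):
            return entry.result
    return None
```

For example, with an entry Entry(0x10, 0x1F, 0xFF, 1, "port1"), the search data 0x115 is masked to 0x15, falls within the range, and yields "port1"; an otherwise matching entry with item_en set to 0 is skipped.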
It should be noted that the scheduler 401 of the content addressable storage device 40 in FIG. 4 to FIG. 6 may be a hardware circuit module or a functional device that runs software. Likewise, the controller 403 may be a hardware circuit module or a functional device that runs software. In general, implementing a function in a hardware circuit module is faster than implementing it in software. The scheduler 401 may be deployed inside the controller 403, that is, it is physically part of the controller 403 or integrated with it. Alternatively, the scheduler 401 may be deployed outside the controller 403, or partly inside and partly outside the controller 403. This is not specifically limited in the embodiments of this application.
It should be noted that the content addressable storage device in this application may have three operating modes: read mode, write mode, and match mode. In read mode and write mode, data in the memory is accessed and operated on in the same way as in an ordinary memory. Match mode provides the content-based lookup function implemented by the content addressable storage device in FIG. 4 to FIG. 6.
The data search function of the content addressable storage device in this application can be used in a variety of application scenarios, such as virtual memory, data compression, pattern recognition, image processing, caching, and table lookup applications. For example, when performing a Media Access Control (MAC) address lookup, a switch (which may include any of the content addressable storage devices described in this application) uses the MAC address as the key (the search data) to search the MAC-CAM table (the data to be matched in this application) and obtain the corresponding index value. On an Ethernet network, the MAC-CAM table is an address table that the switch maintains for layer 2 switching (usually called the "CAM table"), and it records the mapping between MAC addresses and outbound interfaces. Whenever the switch receives an Ethernet data frame, it extracts the destination MAC address of the frame. If the frame is not addressed to the switch itself, the switch queries the CAM table with the destination MAC address. On a hit (that is, a forwarding entry corresponding to the MAC address is found in the CAM table), the switch forwards the frame according to the query result (usually a list of outbound interfaces). On a miss, the switch broadcasts the frame to all ports. The switch can obtain its CAM table in several ways, such as static configuration and dynamic learning.
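The forwarding decision described above can be illustrated with the following sketch. A Python dictionary stands in for the CAM table purely for illustration; the function name `forward`, the MAC addresses, and the port numbers are hypothetical and are not part of this application.

```python
from typing import Dict, List


def forward(cam_table: Dict[str, List[int]],
            dst_mac: str,
            own_mac: str,
            all_ports: List[int]) -> List[int]:
    """Return the list of ports on which a received frame is sent out."""
    if dst_mac == own_mac:
        # The frame is addressed to the switch itself: no layer 2 forwarding.
        return []
    out_ports = cam_table.get(dst_mac)
    if out_ports is not None:
        # Hit: forward according to the stored outbound interface list.
        return list(out_ports)
    # Miss: broadcast the frame to all ports, as described in the text.
    return list(all_ports)


cam_table = {
    "aa:bb:cc:00:00:01": [1],      # statically configured or learned entries
    "aa:bb:cc:00:00:02": [2, 3],
}
own_mac = "aa:bb:cc:00:00:ff"
all_ports = [1, 2, 3, 4]

print(forward(cam_table, "aa:bb:cc:00:00:02", own_mac, all_ports))  # [2, 3]
print(forward(cam_table, "aa:bb:cc:00:00:09", own_mac, all_ports))  # [1, 2, 3, 4]
```

In the device of this application, the dictionary lookup would be replaced by a match-mode search in which the destination MAC address is the search data and the table entries are the data to be matched.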
When the content addressable storage device in this application only needs to emulate ordinary CAM operation, it can use a single set of table entry storage resources (the data to be matched) and, by adding a small amount of control logic, multiply throughput (pps) linearly. This does not require significantly changing the memory shape (relative to the original table lookup and matching requirement), so throughput and implementability are ensured without losing the cost and power advantages. If more flexible dynamic table lookup is required, operation information about the table entries can conveniently be stored in the memory alongside them. This makes it relatively easy to support additional features such as table entry enable, bit masks, value range masks, and priority among out-of-order table entries, which greatly improves the flexibility of table lookup and matching. Depending on the specific scenario, word width expansion can also be applied to further balance performance against area and power consumption, further improving energy efficiency. In summary, the content addressable storage device provided in this application has at least the following advantages:
1. Low cost and power consumption. The target application scenarios are those with a relatively low peak pps, where the high-cost, high-power CAM structure can be avoided, reducing cost (area) and power consumption (both average and peak). Because the circuit is simpler, back-end engineering and process risk is also reduced.
2. Strong portability. Because a general-purpose memory structure can be used, the logic design is highly portable: it can be synthesized directly for emulation (Emu), FPGA, and similar scenarios, and it consumes few logic resources such as LUTs, which greatly reduces porting or verification cost.
3. Strong extensibility. With a small amount of control logic and changes to the table entry content, it is relatively easy to implement TCAM and other, more complex fuzzy CAM processing, such as table entry enable or fuzzy matching with min and max, all of which can be placed in the memory table entry content and matched dynamically. It is relatively easy to achieve a fixed processing latency, which makes it easy to form a pipeline in the overall system processing. It is relatively easy to combine the design with word width expansion to further improve the balance between cost and performance. It is relatively easy to extend the design to support online dynamic table entry updates, to meet business scenarios with high online-availability requirements. In some applications, memory resources from other scenarios that do not coexist with this one can also be reused to further reduce cost.
Refer to FIG. 8. FIG. 8 is a schematic flowchart of a content addressable storage method provided in an embodiment of this application. The method is applicable to any of the content addressable storage devices in FIG. 4 to FIG. 7 and to equipment that includes such a device. The content addressable storage device includes a comparator and a memory. The comparator includes N comparison units, each coupled to the memory, where N is an integer greater than 1. The method may include the following steps S801 to S804.
S801: Store the data to be matched in the memory.
S802: Obtain K pieces of search data, and schedule the K pieces of search data respectively to K target comparison units among the N comparison units, where a target comparison unit is a comparison unit in an idle state, and K is an integer greater than or equal to 1 and less than or equal to N.
S803: Read the data to be matched out of the memory and send it to the target comparison units.
S804: By each of the K target comparison units, compare the corresponding search data with the data to be matched, and output a corresponding matching result according to the comparison result.
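Steps S801 to S804 can be sketched as a behavioral model as follows. This is an assumption-laden software illustration, not the circuit itself: the names `ComparisonUnit` and `search` are hypothetical, exact equality is used as the comparison, and each unit keeps the first (lowest-address) match.

```python
from typing import List, Optional


class ComparisonUnit:
    """Hypothetical model of one comparison unit."""

    def __init__(self) -> None:
        self.search_data: Optional[int] = None    # None means the unit is idle
        self.match_address: Optional[int] = None

    @property
    def idle(self) -> bool:
        return self.search_data is None

    def load(self, search_data: int) -> None:
        self.search_data = search_data
        self.match_address = None

    def compare(self, address: int, item: int) -> None:
        # Keep the first match only (low-address priority, an assumption).
        if self.match_address is None and item == self.search_data:
            self.match_address = address


def search(memory: List[int], keys: List[int], n_units: int) -> List[Optional[int]]:
    units = [ComparisonUnit() for _ in range(n_units)]

    # S802: schedule the K pieces of search data to K idle comparison units.
    idle_units = [u for u in units if u.idle]
    if len(keys) > len(idle_units):
        raise ValueError("K must not exceed the number of idle comparison units")
    targets = idle_units[:len(keys)]
    for unit, key in zip(targets, keys):
        unit.load(key)

    # S803: read each data item once and send it to every target unit.
    for address, item in enumerate(memory):
        # S804: each target unit compares the item with its own search data.
        for unit in targets:
            unit.compare(address, item)

    return [unit.match_address for unit in targets]


memory = [0x11, 0x22, 0x33, 0x44]                          # S801: stored data to be matched
print(search(memory, keys=[0x33, 0x11, 0x99], n_units=4))  # [2, 0, None]
```

The point of the structure is visible in the loop nesting: the memory is read once per pass regardless of K, so K lookups share a single traversal of the stored data.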
In a possible implementation, the device further includes a result output register, and the N comparison units are each coupled to the result output register. The data to be matched includes multiple data items. The method further includes: sending the matching result corresponding to each of the K target comparison units to the result output register, where the matching result includes one or more of match indication information, the matching data item of the corresponding search data, and the address of the matching data item, and the match indication information indicates whether there is a matching data item; and receiving and storing, by the result output register, the matching results respectively sent by the K target comparison units.
In a possible implementation, the data to be matched includes M data items, where M is an integer greater than 1. The controlling the memory to read out the data to be matched and send it respectively to the K target comparison units includes: controlling the memory, in an address traversal manner, to read out the M data items in sequence and send them respectively to the K target comparison units. The comparing, by each of the K target comparison units, the corresponding search data with the data to be matched and outputting a corresponding matching result according to the comparison result includes: comparing, by each of the K target comparison units, the corresponding search data with the M data items in sequence, and determining the matching data item of the corresponding search data.
In a possible implementation, the data to be matched includes M data items, where M is an integer greater than 1. The controlling the memory to read out the data to be matched and send it respectively to the K target comparison units includes: controlling the memory, in an address polling manner, to read out L data items in each clock cycle, and broadcasting the L data items read out to the N comparison units, where L is a positive integer less than or equal to M. The comparing, by each of the K target comparison units, the corresponding search data with the data to be matched includes: comparing, by each of the K comparison units, the L data items received each time with the corresponding search data.
In a possible implementation, the bit width of each of the N comparison units is L times the bit width of each of the M data items.
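To illustrate the effect of reading L data items per clock cycle, the following sketch models one pass over M data items and counts the cycles it takes, which is ceil(M / L). The function name `wide_search`, the sample data, and the rule that the lowest address wins within a word are assumptions made for this example.

```python
import math
from typing import List, Optional, Tuple


def wide_search(memory: List[int],
                keys: List[int],
                items_per_cycle: int) -> Tuple[List[Optional[int]], int]:
    """Search all keys in one pass, reading L items per modeled clock cycle.

    Returns (match address per key, number of cycles used).
    """
    matches: List[Optional[int]] = [None] * len(keys)
    cycles = 0
    for base in range(0, len(memory), items_per_cycle):
        # One memory read returns a word holding up to L data items,
        # which is broadcast to every comparison unit.
        word = memory[base:base + items_per_cycle]
        cycles += 1
        for k, key in enumerate(keys):
            if matches[k] is not None:
                continue  # this unit already holds its first match
            for offset, item in enumerate(word):
                if item == key:
                    matches[k] = base + offset
                    break
    return matches, cycles


memory = [10, 20, 30, 40, 50, 60, 70, 80]  # M = 8
matches, cycles = wide_search(memory, keys=[50, 10, 99], items_per_cycle=4)
print(matches, cycles)  # [4, 0, None] 2
```

With L = 1 the same pass takes 8 cycles, so widening the comparison units to L times the item width trades comparator area for a pass that is roughly L times shorter.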
In a possible implementation, the method further includes: controlling the memory to write the data to be matched in a preset manner.
In a possible implementation, the method further includes: when the search data received within a preset time period exceeds a preset quantity, or when there is currently no idle comparison unit, determining the target comparison units from the N comparison units only after a preset time interval.
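One way the deferral described above could be modeled is sketched below. The text does not specify how deferred search data is held or how the time window is tracked, so the pending queue, the per-window counter, and the names `Scheduler`, `submit`, and `tick` are all assumptions introduced for this example.

```python
from collections import deque
from typing import Deque, List


class Scheduler:
    """Hypothetical scheduler that defers search data when it cannot dispatch."""

    def __init__(self, n_units: int, max_per_window: int) -> None:
        self.idle_units = n_units             # comparison units currently idle
        self.max_per_window = max_per_window  # preset quantity per time period
        self.received_in_window = 0
        self.pending: Deque[int] = deque()

    def submit(self, search_data: int) -> bool:
        """Return True if dispatched immediately, False if deferred."""
        self.received_in_window += 1
        over_limit = self.received_in_window > self.max_per_window
        if over_limit or self.idle_units == 0:
            self.pending.append(search_data)
            return False
        self.idle_units -= 1
        return True

    def tick(self, released_units: int = 0) -> List[int]:
        """Model the end of the preset interval.

        Starts a new time window, returns finished units to the idle pool,
        and dispatches deferred search data while idle units remain.
        """
        self.received_in_window = 0
        self.idle_units += released_units
        dispatched: List[int] = []
        while self.pending and self.idle_units > 0:
            dispatched.append(self.pending.popleft())
            self.idle_units -= 1
        return dispatched


scheduler = Scheduler(n_units=2, max_per_window=10)
accepted = [scheduler.submit(key) for key in (0xA, 0xB, 0xC)]
print(accepted)                          # [True, True, False]
print(scheduler.tick(released_units=1))  # [12]
```

The third piece of search data is held because both units are busy, and it is dispatched once the interval elapses and a unit has been released.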
In a possible implementation, the method further includes: while controlling the memory to read out the data to be matched, controlling the memory to write new data to be matched or to modify the data to be matched.
In a possible implementation, the data to be matched includes multiple data items, each of which includes content to be matched, query control information, and an output result. The comparing, by each of the K target comparison units, the corresponding search data with the data to be matched and outputting a corresponding matching result according to the comparison result includes: by each of the K target comparison units, comparing the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and outputting the output result of the matched data item as the matching result of the corresponding search data.
It should be noted that, for the specific procedures of the content addressable storage method described in the embodiments of this application, refer to the related descriptions in the embodiments of FIG. 4 to FIG. 7. Details are not repeated here.
An embodiment of this application further provides a computer storage medium. The computer storage medium may store a program, and when executed, the program performs some or all of the steps of any one of the foregoing method embodiments.
An embodiment of this application further provides a computer program. The computer program includes instructions, and when the computer program is executed by a computer, the computer is enabled to perform some or all of the steps of any one of the content addressable storage methods.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, a person skilled in the art should know that this application is not limited by the described order of actions, because according to this application some steps may be performed in another order or simultaneously. A person skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by this application.
In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical or take other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units. That is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like, and specifically may be a processor in a computer device) to perform all or some of the steps of the methods in the embodiments of this application. The storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM).

Claims (20)

  1. A content addressable storage device, characterized in that it comprises:
    a memory, configured to store data to be matched;
    a comparator, comprising N comparison units, wherein the N comparison units are each coupled to the memory, and N is an integer greater than 1;
    a scheduler, configured to obtain K pieces of search data and schedule the K pieces of search data respectively to K target comparison units among the N comparison units, wherein a target comparison unit is a comparison unit in an idle state, and K is an integer greater than or equal to 1 and less than or equal to N;
    a controller, configured to control the memory to read out the data to be matched and send it respectively to the K target comparison units;
    each of the K target comparison units being configured to compare the corresponding search data with the data to be matched, and output a corresponding matching result according to the comparison result.
  2. The device according to claim 1, characterized in that the device further comprises a result output register, and the N comparison units are each coupled to the result output register; the data to be matched comprises multiple data items;
    each of the K target comparison units is further configured to send the corresponding matching result to the result output register, wherein the matching result comprises one or more of match indication information, the matching data item of the corresponding search data, and the address of the matching data item, and the match indication information indicates whether there is a matching data item;
    the result output register is configured to receive and store the matching results respectively sent by the K target comparison units.
  3. The device according to claim 1 or 2, characterized in that the data to be matched comprises M data items, and M is an integer greater than 1;
    the controller is specifically configured to control the memory, in an address traversal manner, to read out the M data items in sequence and send them respectively to the K target comparison units;
    each of the K target comparison units is specifically configured to compare the corresponding search data with the M data items in sequence, and determine the matching data item of the corresponding search data.
  4. The device according to claim 1 or 2, characterized in that the data to be matched comprises M data items;
    the controller is specifically configured to control the memory, in an address polling manner, to read out L data items in each clock cycle, and broadcast the L data items read out to the N comparison units, wherein L is a positive integer less than or equal to M;
    each of the K comparison units is configured to compare the L data items received each time with the corresponding search data.
  5. The device according to claim 4, characterized in that the bit width of each of the N comparison units is L times the bit width of each of the M data items.
  6. The device according to any one of claims 1 to 5, characterized in that the controller is further configured to control the memory to write the data to be matched in a preset manner.
  7. The device according to any one of claims 1 to 6, characterized in that
    the scheduler is further configured to, when the search data received within a preset time period exceeds a preset quantity, or when there is currently no idle comparison unit, determine the target comparison units from the N comparison units only after a preset time interval.
  8. The device according to any one of claims 1 to 7, characterized in that the controller is further configured to, while controlling the memory to read out the data to be matched, control the memory to write new data to be matched or to modify the data to be matched.
  9. The device according to claim 1, characterized in that the data to be matched comprises M data items, wherein each data item comprises content to be matched, query control information, and an output result;
    each of the K target comparison units is specifically configured to compare the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and output the output result of the matched data item as the matching result of the corresponding search data.
  10. A content addressable storage method, characterized in that it is applied to a content addressable storage device, wherein the device comprises a comparator and a memory, the comparator comprises N comparison units, the N comparison units are each coupled to the memory, and N is an integer greater than 1; the method comprises:
    storing data to be matched in the memory;
    obtaining K pieces of search data, and scheduling the K pieces of search data respectively to K target comparison units among the N comparison units, wherein a target comparison unit is a comparison unit in an idle state, and K is an integer greater than or equal to 1 and less than or equal to N;
    controlling the memory to read out the data to be matched and send it respectively to the K target comparison units;
    comparing, by each of the K target comparison units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparison result.
  11. The method according to claim 10, characterized in that the device further comprises a result output register, and the N comparison units are each coupled to the result output register; the data to be matched comprises multiple data items; the method further comprises:
    sending the matching result corresponding to each of the K target comparison units to the result output register, wherein the matching result comprises one or more of match indication information, the matching data item of the corresponding search data, and the address of the matching data item, and the match indication information indicates whether there is a matching data item;
    receiving and storing, by the result output register, the matching results respectively sent by the K target comparison units.
  12. The method according to claim 10 or 11, characterized in that the data to be matched comprises M data items, and M is an integer greater than 1; the controlling the memory to read out the data to be matched and send it respectively to the K target comparison units comprises:
    controlling the memory, in an address traversal manner, to read out the M data items in sequence and send them respectively to the K target comparison units;
    the comparing, by each of the K target comparison units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparison result comprises:
    comparing, by each of the K target comparison units, the corresponding search data with the M data items in sequence, and determining the matching data item of the corresponding search data.
  13. The method according to claim 10 or 11, characterized in that the data to be matched comprises M data items, and M is an integer greater than 1; the controlling the memory to read out the data to be matched and send it respectively to the K target comparison units comprises:
    controlling the memory, in an address polling manner, to read out L data items in each clock cycle, and broadcasting the L data items read out to the N comparison units, wherein L is a positive integer less than or equal to M;
    the comparing, by each of the K target comparison units, the corresponding search data with the data to be matched comprises:
    comparing, by each of the K comparison units, the L data items received each time with the corresponding search data.
  14. The method according to claim 13, characterized in that the bit width of each of the N comparison units is L times the bit width of each of the M data items.
  15. The method according to any one of claims 10 to 14, characterized in that the method further comprises:
    controlling the memory to write the data to be matched in a preset manner.
  16. The method according to any one of claims 10 to 15, characterized in that the method further comprises:
    when the search data received within a preset time period exceeds a preset quantity, or when there is currently no idle comparison unit, determining the target comparison units from the N comparison units only after a preset time interval.
  17. The method according to any one of claims 10 to 16, characterized in that the method further comprises:
    while controlling the memory to read out the data to be matched, controlling the memory to write new data to be matched or to modify the data to be matched.
  18. The method according to claim 10, characterized in that the data to be matched comprises multiple data items, wherein each data item comprises content to be matched, query control information, and an output result;
    the comparing, by each of the K target comparison units, the corresponding search data with the data to be matched, and outputting a corresponding matching result according to the comparison result comprises:
    by each of the K target comparison units, comparing the corresponding search data with the content to be matched in each data item according to the query control information of the M data items, and outputting the output result of the matched data item as the matching result of the corresponding search data.
  19. A semiconductor chip, characterized in that it comprises:
    the content addressable storage device according to any one of claims 1 to 9, a processor coupled to the content addressable storage device, and a memory external to the content addressable storage device.
  20. An electronic device, characterized in that it comprises:
    the content addressable storage device according to any one of claims 1 to 9, and a discrete component coupled to the content addressable storage device.
PCT/CN2019/089687 2019-05-31 2019-05-31 Content-addressable storage apparatus and method, and related device WO2020237682A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980096977.9A CN113966532A (en) 2019-05-31 2019-05-31 Content addressable storage device, method and related equipment
PCT/CN2019/089687 WO2020237682A1 (en) 2019-05-31 2019-05-31 Content-addressable storage apparatus and method, and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/089687 WO2020237682A1 (en) 2019-05-31 2019-05-31 Content-addressable storage apparatus and method, and related device

Publications (1)

Publication Number Publication Date
WO2020237682A1 (en) 2020-12-03

Family

ID=73552497

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089687 WO2020237682A1 (en) 2019-05-31 2019-05-31 Content-addressable storage apparatus and method, and related device

Country Status (2)

Country Link
CN (1) CN113966532A (en)
WO (1) WO2020237682A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448733A (en) * 2021-07-12 2021-09-28 中国银行股份有限公司 Data processing method and system
CN114356418A (en) * 2022-03-10 2022-04-15 之江实验室 Intelligent table entry controller and control method

Citations (4)

Publication number Priority date Publication date Assignee Title
US20060067356A1 (en) * 2004-08-23 2006-03-30 Han-Gyoo Kim Method and apparatus for network direct attached storage
CN104733040A (en) * 2013-12-23 2015-06-24 国际商业机器公司 Partial update in a ternary content addressable memory
CN109196588A (en) * 2016-05-31 2019-01-11 高通股份有限公司 The memory of multicycle search content addressable
CN109408898A (en) * 2018-09-28 2019-03-01 北京时代民芯科技有限公司 A kind of on piece CAM structure system and its implementation

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20140278329A1 (en) * 2013-03-15 2014-09-18 Mentor Graphics Corporation Modeling Content-Addressable Memory For Emulation

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN113448733A (en) * 2021-07-12 2021-09-28 中国银行股份有限公司 Data processing method and system
CN113448733B (en) * 2021-07-12 2024-05-28 中国银行股份有限公司 Data processing method and system
CN114356418A (en) * 2022-03-10 2022-04-15 之江实验室 Intelligent table entry controller and control method

Also Published As

Publication number Publication date
CN113966532A (en) 2022-01-21

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19930987; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 19930987; Country of ref document: EP; Kind code of ref document: A1)