WO2020125741A1 - Hash collision processing method, apparatus, device, and computer readable storage medium - Google Patents

Hash collision processing method, apparatus, device, and computer readable storage medium

Info

Publication number
WO2020125741A1
WO2020125741A1 PCT/CN2019/126860 CN2019126860W WO2020125741A1 WO 2020125741 A1 WO2020125741 A1 WO 2020125741A1 CN 2019126860 W CN2019126860 W CN 2019126860W WO 2020125741 A1 WO2020125741 A1 WO 2020125741A1
Authority
WO
WIPO (PCT)
Prior art keywords
hash
value
index
key
key value
Prior art date
Application number
PCT/CN2019/126860
Other languages
French (fr)
Chinese (zh)
Inventor
王磊
刘明强
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2020125741A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures

Definitions

  • The embodiments of the present application relate to the technical field of network communication, for example, to a hash collision processing method, apparatus, device, and computer-readable storage medium.
  • A hash algorithm maps a key value (key) of k-bit width to an index value (index) of n-bit width, where k > n.
  • For example, the key is 200-bit binary data that must be mapped by hashing to a 20-bit binary index. Because the sample space of the key is far larger than the sample space of the index, two different key values, key1 and key2, may map to the same index; this is a hash conflict.
  • A hash table is a data structure, based on a hash algorithm, that stores the corresponding key value and other entry information with the index as the address. Devices often use hash tables for route lookup and data forwarding. When a hash conflict occurs, it causes route lookup errors and data forwarding failures.
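The collision described above follows from the pigeonhole principle: once more than 2^n distinct keys have been hashed to n-bit indexes, at least two must share an index. The following minimal Python sketch illustrates this. The use of truncated SHA-256 as the hash function, the 8-bit index width, and the names `index_of` and `find_collision` are assumptions chosen for illustration; the patent does not specify a hash function.

```python
import hashlib


def index_of(key: bytes, n_bits: int) -> int:
    """Map a key to an n-bit index (truncated SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") & ((1 << n_bits) - 1)


def find_collision(n_bits: int, key_bits: int = 200):
    """Enumerate distinct keys until two of them map to the same index."""
    seen = {}  # index -> first key that produced it
    counter = 0
    while True:
        key = counter.to_bytes(key_bits // 8, "big")  # a 200-bit key by default
        idx = index_of(key, n_bits)
        if idx in seen:
            return seen[idx], key, idx
        seen[idx] = key
        counter += 1


# With an 8-bit index, a collision is guaranteed within 257 distinct keys.
key1, key2, idx = find_collision(n_bits=8)
print(key1 != key2, index_of(key1, 8) == index_of(key2, 8))
```

A small index width is used so the search finishes immediately; the same argument applies to the 200-bit key and 20-bit index of the example above.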
  • In the related art, hash conflicts are resolved by expanding the hash table: the hash table address pointed to by an index is divided into M positions (slots), and each slot stores a different key, so that the hash table space of the original N storage units is expanded to M*N storage units. When a hash conflict occurs, different keys are stored in different slots of the address pointed to by the same index.
  • However, the hash table must be stored in a memory. Expanding the hash table means that a larger-capacity memory is needed to store it, which increases the overhead of the device.
  • In addition, the memory has a data bit width limit. When the data bit width of one hash table address is too large, the memory must be accessed multiple times to read out the information of all slots at that address, which reduces the lookup speed of the hash table.
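The lookup-speed cost of the slot-expansion approach can be made concrete with a little arithmetic. The sketch below counts the memory accesses needed to read every slot at one hash table address. The entry width (a 200-bit key plus 56 bits of entry information) and the 256-bit memory data width are hypothetical values, and `reads_per_lookup` is an illustrative helper, not something defined in the patent.

```python
def reads_per_lookup(slots: int, entry_bits: int, data_width_bits: int) -> int:
    """Memory accesses needed to fetch all slots stored at one hash table address."""
    total_bits = slots * entry_bits
    # Integer ceiling division: a partially filled memory word still costs one access.
    return (total_bits + data_width_bits - 1) // data_width_bits


# Hypothetical widths: 256-bit entry, 256-bit memory data width.
single = reads_per_lookup(slots=1, entry_bits=256, data_width_bits=256)
expanded = reads_per_lookup(slots=4, entry_bits=256, data_width_bits=256)
print(single, expanded)  # prints: 1 4
```

Under these assumed widths, expanding each address to M = 4 slots quadruples both the storage and the number of reads per lookup.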
  • Embodiments of the present application provide a hash conflict processing method, apparatus, device, and computer-readable storage medium, which can reduce hash conflicts without expanding the hash table.
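  • An embodiment of the present application provides a method for processing hash conflicts, including:
  • obtaining the key value and the result value to be inserted into the hash table;
  • performing a hash operation on the key value using each of at least two preset different hash functions to obtain at least two hash values;
  • when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value does not cause a hash conflict, assigning to the key value a target index value that is different from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table;
  • adding the key value and the result value to the hash table with the target index value as the address.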
  • An embodiment of the present application further provides an apparatus for processing hash conflicts, including:
  • an acquisition module, configured to acquire the key value and the result value to be inserted into the hash table;
  • an operation module, configured to perform a hash operation on the key value using each of at least two preset different hash functions to obtain at least two hash values;
  • a processing module, configured to assign to the key value a target index value that is different from the historical index values when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value does not cause a hash conflict, where a historical index value is an index value corresponding to a key value already in the hash table;
  • an adding module, configured to add the key value and the result value to the hash table with the target index value as the address.
  • An embodiment of the present application further provides a hash collision processing device, including a processor and a memory, where the memory stores the following instructions that can be executed by the processor:
  • obtaining the key value and the result value to be inserted into the hash table;
  • performing a hash operation on the key value using each of at least two preset different hash functions to obtain at least two hash values;
  • when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value does not cause a hash conflict, assigning to the key value a target index value that is different from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table;
  • adding the key value and the result value to the hash table with the target index value as the address.
  • An embodiment of the present application further provides a computer-readable storage medium on which computer-executable instructions are stored, the computer-executable instructions being used to perform the following steps:
  • obtaining the key value and the result value to be inserted into the hash table;
  • performing a hash operation on the key value using each of at least two preset different hash functions to obtain at least two hash values;
  • when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value does not cause a hash conflict, assigning to the key value a target index value that is different from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table;
  • adding the key value and the result value to the hash table with the target index value as the address.
  • FIG. 1 is a schematic structural diagram of a hash table in the related art.
  • FIG. 2 is a schematic structural diagram of a hash table expanded to handle hash conflicts in the related art.
  • FIG. 3 is a schematic flowchart of a method for processing hash conflicts provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an index table provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an apparatus for processing hash conflicts provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of another hash collision processing apparatus provided by an embodiment of the present application.
  • As shown in FIG. 1, the hash table includes N storage units.
  • The hash table shown in FIG. 2 is based on the hash table shown in FIG. 1, with each storage unit expanded to 4 slots, so that the hash table shown in FIG. 1 is expanded to a hash table with 4*N storage units.
  • Taking a hash table stored in DDR memory as an example, the larger the hash table, the more DDR chips (particles) are required. Because of hardware design and cost constraints, the number of DDR chips that can be used is limited and cannot accommodate an oversized hash table.
  • An embodiment of the present application provides a method for processing hash conflicts. As shown in FIG. 3, the method includes:
  • Step 1010: Obtain the key value and the result value to be inserted into the hash table.
  • The result value is action information (action) indicating how the key value is to be handled.
  • Step 1020: Perform a hash operation on the key value to be inserted into the hash table using each of at least two preset different hash functions to obtain at least two hash values.
  • The number of preset hash functions may be two or more, and can be set according to circumstances. Using more hash functions reduces the incidence of hash collisions, but increases the implementation complexity of the method and its consumption of computing resources.
  • Step 1030: If it is determined, based on the obtained hash values and the pre-established index table for storing index values, that the key value to be inserted into the hash table does not cause a hash conflict, assign to the key value to be inserted into the hash table a target index value that is different from the historical index values.
  • A historical index value is an index value corresponding to a key value already in the hash table.
  • Step 1040: Write the key value and the result value to be inserted into the hash table into the hash table with the target index value as the address.
  • In the method for processing hash conflicts provided by the embodiments of the present application, whether the key value to be inserted into the hash table will cause a hash conflict is determined from hash values obtained with at least two hash functions, which reduces the probability of hash collision compared with a method that uses a single hash function. After it is determined that the key value to be inserted into the hash table will not cause a hash conflict, the key value is assigned a target index value that is different from the index values corresponding to the key values already in the hash table. The target index value serves as the address of the key value and the result value to be inserted into the hash table, and it is necessarily different from the addresses of the key values already in the hash table. Hash conflicts are thereby reduced without expanding the hash table, which saves device overhead and improves the lookup speed of the hash table.
  • the at least two different hash functions preset include: a first hash function and a second hash function.
  • the first hash function is used to perform a hash operation on the key value to be inserted into the hash table to obtain the first hash value.
  • the second hash function is used to perform a hash operation on the key value to be inserted into the hash table to obtain a second hash value.
  • the index table includes: multiple addresses.
  • Each address includes: multiple locations.
  • Each position includes a first field, a second field, and a third field, where the first field is used to store the hash value obtained by hashing the key value using the second hash function, the second field is used to store the index value, and the third field is used to store the mark indicating whether the position is occupied.
  • the product of the number of addresses of the index table and the number of positions of each address is not less than the number of addresses of the hash table.
  • Index values and hash values are in one-to-one correspondence: there are as many index values in the index table as there are hash values in the hash table, so the size of the index table should be determined from the maximum number of hash values that the hash table can store.
  • The maximum number of hash values that the hash table can store equals the number of addresses in the hash table, and index values are stored in the positions of the index table, so the number of positions in the index table cannot be less than the number of addresses in the hash table; that is, the product of the number of addresses in the index table and the number of positions in each address is not less than the number of addresses in the hash table.
  • The number of addresses of the index table is the depth of the index table.
  • For example, the depth of the index table is 2^10.
  • Marks can include digital marks, letter marks, and symbol marks.
  • When the mark is a digital mark, two different numbers can be used to indicate that the position is occupied and that it is not occupied; when the mark is a letter mark, two different letters can be used; when the mark is a symbol mark, two different symbols can be used.
  • The pre-established index table may be as shown in FIG. 4. The depth of the index table (that is, the number of addresses of the index table) is 2^i, each address includes 4 slots, and each slot includes: a mark (flag) field, equivalent to the third field used to store the mark indicating whether the position is occupied; a hash value (sig) field, equivalent to the first field used to store the hash value obtained by hashing the key value using the second hash function; and an index value (index) field, equivalent to the second field used to store the index value.
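The index table layout and the sizing constraint described above can be sketched as a small data structure. This is a minimal Python illustration, not the patented implementation: the class and method names (`Slot`, `IndexTable`, `capacity`, `can_cover`) and the example depth of 2^10 with a 4096-address hash table are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List

SLOTS_PER_ADDRESS = 4  # FIG. 4 is described as having 4 slots per address


@dataclass
class Slot:
    flag: bool = False  # third field: whether this position is occupied
    sig: int = 0        # first field: hash value from the second hash function
    index: int = 0      # second field: index value (an address in the hash table)


@dataclass
class IndexTable:
    depth_bits: int  # i, so the table has 2**i addresses
    rows: List[List[Slot]] = field(init=False)

    def __post_init__(self) -> None:
        self.rows = [
            [Slot() for _ in range(SLOTS_PER_ADDRESS)]
            for _ in range(1 << self.depth_bits)
        ]

    def capacity(self) -> int:
        """Total number of positions: addresses times positions per address."""
        return len(self.rows) * SLOTS_PER_ADDRESS

    def can_cover(self, hash_table_addresses: int) -> bool:
        """Sizing rule: positions in the index table >= addresses in the hash table."""
        return self.capacity() >= hash_table_addresses


table = IndexTable(depth_bits=10)
print(table.capacity(), table.can_cover(4096), table.can_cover(4097))
# prints: 4096 True False
```

With a depth of 2^10 and 4 slots per address, the index table offers 4096 positions, so under this rule it can serve a hash table of at most 4096 addresses.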
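  • The method for processing hash conflicts further includes: finding, among the addresses of the index table, the address corresponding to the first hash value as the target address; determining whether each position in the target address is occupied; in response to a judgment result that all positions in the target address are unoccupied, determining that the key value to be inserted into the hash table will not cause a hash conflict; and in response to a judgment result that some positions in the target address are occupied and the remaining positions are unoccupied, using the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict.
  • Using the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict includes: obtaining the hash values in the first field of all occupied positions in the target address as the hash values to be compared; determining whether the second hash value is the same as any of the hash values to be compared; and in response to a judgment result that the second hash value differs from every hash value to be compared, determining that the key value to be inserted into the hash table will not cause a hash conflict.
  • The method further includes: adding the second hash value to the first field of the first unoccupied position in the target address.
  • The method further includes: adding the target index value to the second field of the first unoccupied position in the target address.
  • The method further includes: adding a mark indicating that the position is occupied to the third field of the first unoccupied position in the target address.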
  • The size of the hash table is the same as the number of key value samples, and the hash table is not expanded.
  • An index table is introduced, and each key value is assigned a unique index value in the index table, so its index value never duplicates the index value corresponding to a key value already in the hash table, thereby avoiding hash conflicts.
  • The size of the index table should likewise be consistent with the number of key value samples; however, the bit width of the index table is much smaller than that of the hash table, so the index table can be expanded to reduce conflicts.
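The claim that the index table is cheap to expand because of its narrow bit width can be checked with a rough storage estimate. All widths in the sketch below are hypothetical (a 256-bit hash table entry, and an index table slot of 1 flag bit, a 16-bit sig, and a 12-bit index); the patent gives no concrete field widths, and `table_bits` is an illustrative helper.

```python
def table_bits(addresses: int, slots: int, slot_bits: int) -> int:
    """Total storage of a table, in bits."""
    return addresses * slots * slot_bits


HASH_ENTRY_BITS = 200 + 56     # hypothetical: 200-bit key plus 56 bits of entry info
INDEX_SLOT_BITS = 1 + 16 + 12  # hypothetical: flag + sig + index

# Approach of the embodiment: unexpanded hash table plus a 4-slot index table.
hash_table_bits = table_bits(4096, 1, HASH_ENTRY_BITS)
index_table_bits = table_bits(1024, 4, INDEX_SLOT_BITS)

# Related art: the hash table itself is expanded to 4 slots per address.
expanded_bits = table_bits(4096, 4, HASH_ENTRY_BITS)

print(hash_table_bits + index_table_bits, expanded_bits)
# prints: 1167360 4194304
```

Under these assumed widths, adding the index table costs roughly 11% on top of the unexpanded hash table, while expanding the hash table to 4 slots per address costs 300%.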
  • The hash collision processing apparatus 2 includes:
  • The obtaining module 21 is configured to obtain the key value and the result value to be inserted into the hash table.
  • The operation module 22 is configured to perform a hash operation on the key value to be inserted into the hash table using each of at least two preset different hash functions to obtain at least two hash values.
  • The processing module 23 is configured to assign to the key value to be inserted into the hash table a target index value that is different from the historical index values if it is determined, based on the obtained hash values and the pre-established index table for storing index values, that the key value to be inserted into the hash table will not cause a hash conflict, where a historical index value is an index value corresponding to a key value already in the hash table.
  • The adding module 24 is configured to add the key value and the result value to be inserted into the hash table to the hash table with the target index value as the address.
  • the at least two different hash functions preset include: a first hash function and a second hash function.
  • The operation module 22 is configured to perform a hash operation on the key value to be inserted into the hash table using the first hash function to obtain the first hash value, and to perform a hash operation on the key value to be inserted into the hash table using the second hash function to obtain the second hash value.
  • The index table includes multiple addresses; each address includes multiple positions; each position includes a first field, a second field, and a third field, where the first field is used to store the hash value obtained by hashing the key value using the second hash function, the second field is used to store the index value, and the third field is used to store the mark indicating whether the position is occupied.
  • the product of the number of addresses of the index table and the number of positions of each address is not less than the number of addresses of the hash table.
  • The processing module 23 is further configured to: find, among the addresses of the index table, the address corresponding to the first hash value as the target address; determine whether each position in the target address is occupied; in response to a judgment result that all positions in the target address are unoccupied, determine that the key value to be inserted into the hash table will not cause a hash conflict; and in response to a judgment result that some positions in the target address are occupied and the remaining positions are unoccupied, use the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict.
  • The processing module 23 is configured to use the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict as follows: obtain the hash values in the first field of all occupied positions in the target address as the hash values to be compared; determine whether the second hash value is the same as any of the hash values to be compared; and in response to a judgment result that the second hash value differs from every hash value to be compared, determine that the key value to be inserted into the hash table will not cause a hash conflict.
  • the adding module 24 is further configured to add the second hash value to the first field of the first unoccupied position in the target address.
  • the adding module 24 is further configured to add the target index value to the second field of the first unoccupied position in the target address.
  • the adding module 24 is further configured to add a mark indicating that the position is occupied to the third field of the first unoccupied position in the target address.
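The conflict check performed by the processing module and the slot update performed by the adding module can be sketched as two small functions over the positions of one target address. This is a minimal Python illustration under stated assumptions: the names `Slot`, `causes_conflict`, and `fill_first_free` are hypothetical, and treating a completely full target address as a conflict is an assumption, since the text only describes the all-unoccupied and partly-occupied cases.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    flag: bool = False  # third field: occupied mark
    sig: int = 0        # first field: second hash value of the stored key
    index: int = 0      # second field: index value


def causes_conflict(slots: List[Slot], second_hash: int) -> bool:
    """Decide whether inserting a key with this second hash value conflicts."""
    occupied = [s for s in slots if s.flag]
    if not occupied:
        return False  # all positions unoccupied: no conflict
    if len(occupied) == len(slots):
        return True   # assumption: no free position is handled as a conflict
    # Partly occupied: conflict only if the second hash matches a stored sig.
    return any(s.sig == second_hash for s in occupied)


def fill_first_free(slots: List[Slot], second_hash: int, target_index: int) -> int:
    """Write sig, index, and the occupied mark into the first unoccupied position."""
    for pos, slot in enumerate(slots):
        if not slot.flag:
            slot.sig, slot.index, slot.flag = second_hash, target_index, True
            return pos
    raise ValueError("no unoccupied position in the target address")


slots = [Slot() for _ in range(4)]
print(causes_conflict(slots, 0xBEEF))  # False: the address is empty
fill_first_free(slots, 0xBEEF, target_index=7)
print(causes_conflict(slots, 0xBEEF))  # True: same second hash value already stored
print(causes_conflict(slots, 0x1234))  # False: differs from every stored sig
```

Only keys that agree on both the first hash value (same target address) and the second hash value (same sig) are reported as conflicting, which is why using two hash functions lowers the conflict rate.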
  • In the apparatus for processing hash conflicts provided by the embodiments of the present application, whether the key value to be inserted into the hash table will cause a hash conflict is determined from hash values obtained with at least two hash functions, which reduces the probability of hash collision compared with a method that uses a single hash function. After it is determined that the key value to be inserted into the hash table will not cause a hash conflict, the key value is assigned a target index value that is different from the index values corresponding to the key values already in the hash table. The target index value serves as the address of the key value and the result value to be inserted into the hash table, and it is necessarily different from the addresses of the key values already in the hash table. Hash conflicts are thereby reduced without expanding the hash table, which saves device overhead and improves the lookup speed of the hash table.
  • The obtaining module 21, the operation module 22, the processing module 23, and the adding module 24 can each be implemented by a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like located in the hash collision processing apparatus.
  • An embodiment of the present application further provides a device for processing hash conflicts. As shown in FIG. 6, the device includes:
  • The hash function module 31 is configured to accept the input key. The module contains two different hash functions, denoted hash_i and hash_h, which hash the key and output two hash values: the first hash value (Hi) and the second hash value (Hh).
  • The hash conflict processing module 32 is configured to receive the two hash values output by the hash function module 31, handle hash conflicts, and perform the corresponding read and write operations on the index table and the hash table for the insertion, query, and deletion operations.
  • The action is also input together with the key.
  • The key insertion process includes the following steps:
  • Step 110: Read the index table with Hi as the address, reading out all the slots in the Hi address space of the index table.
  • Step 120: The index table data is returned, and the four slot flags are extracted from the returned data.
  • Step 160: Using the newly allocated index as the address, write the key and action into the hash table.
  • Step 170: End the process.
  • The key query process includes the following steps:
  • Step 210: Read the index table with Hi as the address, reading out all the slots in the Hi address space of the index table, and jump to step 220.
  • Step 220: The index table data is returned, and the four slot flags are extracted from the returned data.
  • Step 240: Read the hash table with the extracted index as the address, and extract the key and action fields from the returned data. If the extracted key is the same as the key used in the query, the hash table query succeeds and the action is the required query information. If they differ, the hash table query fails and the action is invalid. Jump to step 250.
  • Step 250: End the process.
  • The key deletion process includes the following steps:
  • Step 310: Read the index table with Hi as the address, reading out all the slots in the Hi address space of the index table.
  • Step 320: The index table data is returned, and the four slot flags are extracted from the returned data.
  • extract the index from slot(h) and jump to step 340; when the index table query fails, jump to step 360.
  • Step 340: Read the hash table with the extracted index as the address, extract the key and action fields from the returned data, and jump to step 350.
  • Step 350: Determine whether the read key is the same as the key used in the query. If they match, write all 0s to the hash table with the index as the address, and clear slot(h) in the index table with Hi as the address. If they differ, the deletion has failed. Jump to step 360.
  • Step 360: End the process.
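The insertion, query, and deletion flows above can be sketched end to end in Python. Several steps are not present in the text as extracted (for example steps 130 to 150), so the branch logic below follows the description of the processing module instead, and a number of details are assumptions made for illustration: the BLAKE2b-based `hash_i` and `hash_h`, the small table sizes, the free list used to allocate an unused index, the use of `None` as the "unoccupied" flag, and reporting a full target address as a failed insertion.

```python
import hashlib
from typing import List, Optional, Tuple

ADDR_BITS = 4  # index table depth 2**4 (kept small for the example)
SIG_BITS = 16  # width of the second hash value stored in each slot
SLOTS = 4      # slots per index table address


def _hash(key: bytes, person: bytes, bits: int) -> int:
    # Two differently personalized BLAKE2b instances stand in for hash_i and hash_h.
    digest = hashlib.blake2b(key, digest_size=8, person=person).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


def hash_i(key: bytes) -> int:
    return _hash(key, b"hash_i", ADDR_BITS)


def hash_h(key: bytes) -> int:
    return _hash(key, b"hash_h", SIG_BITS)


class IndexedHashTable:
    def __init__(self, size: int) -> None:
        # Each slot is None (unoccupied) or a (sig, index) pair (occupied).
        self.index_table: List[List[Optional[Tuple[int, int]]]] = [
            [None] * SLOTS for _ in range(1 << ADDR_BITS)
        ]
        self.hash_table: List[Optional[Tuple[bytes, str]]] = [None] * size
        self.free = list(range(size - 1, -1, -1))  # assumed allocator of unused indexes

    def insert(self, key: bytes, action: str) -> bool:
        row, sig = self.index_table[hash_i(key)], hash_h(key)
        if any(s is not None and s[0] == sig for s in row):
            return False  # hash conflict (re-inserting a stored key also lands here)
        if None not in row or not self.free:
            return False  # assumption: no free slot or no free index means failure
        index = self.free.pop()  # differs from every index currently in use
        row[row.index(None)] = (sig, index)
        self.hash_table[index] = (key, action)
        return True

    def _find(self, key: bytes) -> Optional[Tuple[int, int, int]]:
        hi, sig = hash_i(key), hash_h(key)
        for pos, slot in enumerate(self.index_table[hi]):
            if slot is not None and slot[0] == sig:
                entry = self.hash_table[slot[1]]
                # The stored key is compared so a matching sig alone is not trusted.
                if entry is not None and entry[0] == key:
                    return hi, pos, slot[1]
                return None
        return None

    def query(self, key: bytes) -> Optional[str]:
        found = self._find(key)
        return None if found is None else self.hash_table[found[2]][1]

    def delete(self, key: bytes) -> bool:
        found = self._find(key)
        if found is None:
            return False
        hi, pos, index = found
        self.hash_table[index] = None    # clear the hash table entry
        self.index_table[hi][pos] = None  # clear the slot in the index table
        self.free.append(index)
        return True


table = IndexedHashTable(size=64)
print(table.insert(b"route-a", "port1"), table.query(b"route-a"))
print(table.delete(b"route-a"), table.query(b"route-a"))
```

Every stored key owns a distinct hash table address, so a lookup costs one index table read plus one hash table read regardless of how many keys share the same first hash value.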
  • An embodiment of the present application also provides a hash collision processing device, including a memory and a processor, where the memory stores the following instructions that can be executed by the processor:
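  • obtaining the key value and the result value to be inserted into the hash table;
  • performing a hash operation on the key value to be inserted into the hash table using each of at least two preset different hash functions to obtain at least two hash values;
  • when it is determined, based on the obtained hash values and the pre-established index table for storing index values, that the key value to be inserted into the hash table does not cause a hash conflict, assigning to the key value to be inserted into the hash table a target index value that is different from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table.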
  • the key value and the result value to be inserted into the hash table are added to the hash table with the target index value as the address.
  • the at least two different hash functions preset include: a first hash function and a second hash function.
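  • The memory stores the following instructions executable by the processor: performing a hash operation on the key value to be inserted into the hash table using the first hash function to obtain the first hash value, and performing a hash operation on the key value to be inserted into the hash table using the second hash function to obtain the second hash value.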
  • the index table includes: multiple addresses.
  • Each address includes: multiple locations.
  • Each position includes a first field, a second field, and a third field, where the first field is used to store the hash value obtained by hashing the key value using the second hash function, the second field is used to store the index value, and the third field is used to store the mark indicating whether the position is occupied.
  • the product of the number of addresses of the index table and the number of positions of each address is not less than the number of addresses of the hash table.
  • the memory also stores the following instructions that can be executed by the processor:
  • the second hash value and the index table are used to determine whether the key value to be inserted into the hash table will cause a hash conflict.
  • An embodiment of the present application also provides a computer-readable storage medium, on which a computer-executable instruction is stored, and the computer-executable instruction is used to perform the following steps:
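  • obtaining the key value and the result value to be inserted into the hash table;
  • performing a hash operation on the key value to be inserted into the hash table using each of at least two preset different hash functions to obtain at least two hash values;
  • when it is determined, based on the obtained hash values and the pre-established index table for storing index values, that the key value to be inserted into the hash table does not cause a hash conflict, assigning to the key value to be inserted into the hash table a target index value that is different from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table.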
  • the key value and the result value to be inserted into the hash table are added to the hash table with the target index value as the address.
  • the at least two different hash functions preset include: a first hash function and a second hash function.
  • Computer executable instructions also perform the following steps:
  • the first hash function is used to perform a hash operation on the key value to be inserted into the hash table to obtain the first hash value.
  • the second hash function is used to perform a hash operation on the key value to be inserted into the hash table to obtain a second hash value.
  • the index table includes: multiple addresses.
  • Each address includes: multiple locations.
  • Each position includes a first field, a second field, and a third field, where the first field is used to store the hash value obtained by hashing the key value using the second hash function, the second field is used to store the index value, and the third field is used to store the mark indicating whether the position is occupied.
  • the product of the number of addresses of the index table and the number of positions of each address is not less than the number of addresses of the hash table.
  • the computer executable instructions also perform the following steps:
  • the second hash value and the index table are used to determine whether the key value to be inserted into the hash table will cause a hash conflict.

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Software Systems
  • Data Mining & Analysis
  • Databases & Information Systems
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

The present application discloses a hash collision processing method, an apparatus, a device, and a computer readable storage medium. Said method comprises: acquiring a key and a result value that are to be inserted into a hash table; using at least two different pre-set hash functions to perform a hash operation of the key respectively, to obtain at least two hash values; and in the case where it is determined, according to the obtained hash values and a pre-established index table for storing indexes, that the key will not cause any hash collision, allocating, to the key, a target index different from a historical index, the historical index being an index corresponding to a key existing in the hash table; and adding the key and the result value into the hash table with the target index as an address.

Description

Hash conflict processing method, apparatus, device, and computer-readable storage medium
This application claims priority to Chinese patent application No. 201811571387.0, filed with the Chinese Patent Office on December 21, 2018, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present application relate to the technical field of network communication, for example, to a hash conflict processing method, apparatus, device, and computer-readable storage medium.
Background
A hash algorithm maps a key value (key) of k-bit width to an index value (index) of n-bit width, where k > n. For example, the key is 200-bit binary data that must be mapped by hashing to a 20-bit binary index. Because the sample space of the key is far larger than the sample space of the index, two different key values, key1 and key2, may map to the same index; this is a hash conflict. A hash table is a data structure, based on a hash algorithm, that stores the corresponding key value and other entry information with the index as the address. Devices often use hash tables for route lookup and data forwarding. When a hash conflict occurs, it causes route lookup errors and data forwarding failures.
In the related art, hash conflicts are resolved by expanding the hash table: the hash table address pointed to by an index is divided into M positions (slots), and each slot stores a different key, so that the hash table space of the original N storage units is expanded to M*N storage units. When a hash conflict occurs, different keys are stored in different slots of the address pointed to by the same index.
However, the hash table must be stored in a memory, so expanding the hash table means that a larger-capacity memory is needed to store it, which increases the overhead of the device. In addition, the memory has a data bit width limit: when the data bit width of one hash table address is too large, the memory must be accessed multiple times to read out the information of all slots at that address, which reduces the lookup speed of the hash table.
Summary of the invention
Embodiments of the present application provide a hash conflict processing method, apparatus, device, and computer-readable storage medium, which can reduce hash conflicts without expanding the hash table.
An embodiment of the present application provides a method for processing hash conflicts, including:
obtaining the key value and the result value to be inserted into the hash table;
performing a hash operation on the key value using each of at least two preset different hash functions to obtain at least two hash values;
when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value does not cause a hash conflict, assigning to the key value a target index value that is different from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table;
adding the key value and the result value to the hash table with the target index value as the address.
An embodiment of the present application further provides an apparatus for processing hash conflicts, including:
an acquisition module, configured to obtain a key value and a result value to be inserted into a hash table;
an operation module, configured to perform a hash operation on the key value with each of at least two different preset hash functions to obtain at least two hash values;
a processing module, configured to assign the key value a target index value that differs from the historical index values when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value will not cause a hash conflict, where a historical index value is an index value corresponding to a key value already in the hash table;
an adding module, configured to add the key value and the result value to the hash table with the target index value as the address.
An embodiment of the present application further provides a device for processing hash conflicts, including a processor and a memory, where the memory stores the following instructions executable by the processor:
obtaining a key value and a result value to be inserted into a hash table;
performing a hash operation on the key value with each of at least two different preset hash functions to obtain at least two hash values;
when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value will not cause a hash conflict, assigning the key value a target index value that differs from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table;
adding the key value and the result value to the hash table with the target index value as the address.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the following steps:
obtaining a key value and a result value to be inserted into a hash table;
performing a hash operation on the key value with each of at least two different preset hash functions to obtain at least two hash values;
when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value will not cause a hash conflict, assigning the key value a target index value that differs from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table;
adding the key value and the result value to the hash table with the target index value as the address.
Brief description of the drawings
FIG. 1 is a schematic structural diagram of a hash table in the related art;
FIG. 2 is a schematic structural diagram of a hash table in the related art that has been enlarged to handle hash conflicts;
FIG. 3 is a schematic flowchart of a method for processing hash conflicts provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an index table provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an apparatus for processing hash conflicts provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of another apparatus for processing hash conflicts provided by an embodiment of the present application.
Detailed description
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
FIG. 1 is a schematic structural diagram of a hash table in the related art, and FIG. 2 is a schematic structural diagram of a hash table in the related art that has been enlarged to handle hash conflicts. As shown in FIG. 1, the hash table contains N storage units. As shown in FIG. 2, each storage unit of the hash table of FIG. 1 is enlarged into 4 slots, so the table becomes a hash table with 4*N storage units. When a hash conflict occurs, the different key values are stored in different slots at the address pointed to by the same index, which resolves the conflict. However, a hash table has to be held in memory, so enlarging it requires a memory of larger capacity. Take a hash table stored in double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR) as an example: the larger the hash table, the more DDR chips are needed, and because hardware design and cost constraints limit the number of DDR chips that can be used, an oversized hash table cannot be accommodated. Moreover, the data bit width of DDR is limited: when the data at one hash table address is too wide, the DDR must be accessed several times to read out all the slots at that address, which reduces the lookup speed of the hash table. In data forwarding applications on network devices, this severely degrades route lookup and data forwarding performance, so that wire speed cannot be reached.
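The lookup-speed argument above can be illustrated with a small piece of arithmetic. The following Python sketch is only an illustration: the entry width (256 bits) and the amount of data returned per memory access (512 bits) are hypothetical numbers chosen for the example and do not come from the application.

```python
import math


def reads_per_lookup(slots: int, entry_bits: int, access_bits: int) -> int:
    """Memory accesses needed to fetch every slot stored at one hash-table address."""
    return math.ceil(slots * entry_bits / access_bits)


# Hypothetical widths: a 256-bit entry (key + action), 512 bits per memory access.
print(reads_per_lookup(1, 256, 512))  # one slot per address, as in FIG. 1 -> 1
print(reads_per_lookup(4, 256, 512))  # four slots per address, as in FIG. 2 -> 2
```

Under these assumed widths, enlarging each address to four slots doubles the number of accesses per lookup, which is the effect the paragraph above describes.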
To this end, an embodiment of the present application provides a method for processing hash conflicts. As shown in FIG. 3, the method includes:
Step 1010: Obtain the key value and the result value to be inserted into the hash table.
The result value (action) is action information describing how the key value is to be handled.
Step 1020: Perform a hash operation on the key value to be inserted into the hash table with each of at least two different preset hash functions to obtain at least two hash values.
Two hash functions may be preset, or more than two, depending on the situation. The more hash functions are preset, the lower the incidence of hash conflicts, but the complexity of implementing the method and the computing resources it consumes increase accordingly.
Step 1030: When it is determined, based on the obtained hash values and the pre-established index table for storing index values, that the key value to be inserted into the hash table will not cause a hash conflict, assign the key value a target index value that differs from the historical index values.
In this embodiment, a historical index value is an index value corresponding to a key value already in the hash table.
Step 1040: Write the key value and the result value to be inserted into the hash table with the target index value as the address.
In the method for processing hash conflicts provided by the embodiments of the present application, the hash values used to determine whether the key value to be inserted would cause a hash conflict are obtained from at least two hash functions, and using different hash functions reduces the probability of a hash conflict. Once it is determined that the key value will not cause a hash conflict, the key value is assigned a target index value that differs from the index values of all key values already in the hash table. Because the target index value serves as the address at which the key value and the result value are inserted, that address necessarily differs from the addresses of the key values already in the hash table. Hash conflicts are thus reduced without enlarging the hash table, which saves device cost and improves the lookup speed of the hash table.
Optionally, the at least two different preset hash functions include a first hash function and a second hash function.
Performing the hash operation on the key value to be inserted into the hash table to obtain at least two hash values includes:
performing a hash operation on the key value to be inserted into the hash table with the first hash function to obtain a first hash value;
performing a hash operation on the key value to be inserted into the hash table with the second hash function to obtain a second hash value.
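A minimal Python sketch of computing the two hash values follows. The application does not say which hash functions are used, so CRC-32 and Adler-32 from the standard `zlib` module stand in here purely as two different functions; the names `hash_i` and `hash_h` and the bit widths `I_BITS` and `H_BITS` are illustrative assumptions.

```python
import zlib

I_BITS = 10  # assumed bit width of the first hash value
H_BITS = 16  # assumed bit width of the second hash value


def hash_i(key: bytes) -> int:
    """First hash function (stand-in: CRC-32 truncated to I_BITS)."""
    return zlib.crc32(key) & ((1 << I_BITS) - 1)


def hash_h(key: bytes) -> int:
    """Second hash function (stand-in: Adler-32 truncated to H_BITS)."""
    return zlib.adler32(key) & ((1 << H_BITS) - 1)


key = b"example-key"
first_hash, second_hash = hash_i(key), hash_h(key)
print(first_hash, second_hash)
```

Two keys that collide under one function are unlikely to collide under the other as well, which is why comparing both values lowers the chance of treating different keys as the same.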
In an embodiment, the index table includes multiple addresses.
Each address includes multiple positions.
Each position includes a first field, a second field, and a third field. The first field stores the hash value obtained by performing a hash operation on the key value with the second hash function; the second field stores the index value; the third field stores a mark indicating whether the position is occupied.
In this embodiment, the product of the number of addresses in the index table and the number of positions per address is not less than the number of addresses in the hash table.
Index values and hash values correspond one to one: the index table must hold as many index values as there are hash values in the hash table. The index table should therefore be sized for the maximum number of hash values the hash table can hold, which is the number of addresses in the hash table. Because index values are stored in the positions of the index table, the number of positions in the index table must not be less than the number of addresses in the hash table; that is, the product of the number of addresses in the index table and the number of positions per address is not less than the number of addresses in the hash table.
The number of addresses in the index table is the depth of the index table. The bit width of the hash value produced by the first hash function determines this depth, which guarantees that the address corresponding to the first hash value can always be found among the addresses of the index table. If the first hash function produces a hash value i bits wide, the depth of the index table is 2^i; for example, when i = 10, the depth is 2^10.
The mark may be a digit, a letter, a symbol, or the like. With a digit mark, two different digits indicate that the position is occupied or unoccupied; with a letter mark, two different letters indicate that the position is occupied or unoccupied; with a symbol mark, two different symbols indicate that the position is occupied or unoccupied.
In an embodiment, the pre-established index table may be as shown in FIG. 4. The depth of the index table (that is, the number of addresses in the index table) is 2^i, each address includes 4 slots, and each slot includes: a flag field (the third field, which stores the mark indicating whether the position is occupied), a hash value (sig) field (the first field, which stores the hash value obtained by performing a hash operation on the key value with the second hash function), and an index value (index) field (the second field, which stores the index value).
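The layout of FIG. 4 and the sizing rule can be sketched as a small Python data structure. This is a sketch under assumptions: the class name `Slot`, the helper names, and the choice of 0/1 for the flag are illustrative, and field widths are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    flag: int = 0   # third field: 1 = occupied, 0 = free (assumed digit mark)
    sig: int = 0    # first field: hash value from the second hash function
    index: int = 0  # second field: index value, i.e. the address in the hash table


def make_index_table(i_bits: int, slots_per_address: int = 4) -> list[list[Slot]]:
    """Build an empty index table whose depth is 2**i_bits."""
    return [[Slot() for _ in range(slots_per_address)] for _ in range(1 << i_bits)]


def is_large_enough(index_table: list[list[Slot]], hash_table_addresses: int) -> bool:
    """Sizing rule: addresses * positions per address >= hash-table addresses."""
    return len(index_table) * len(index_table[0]) >= hash_table_addresses


table = make_index_table(i_bits=10)
print(len(table), len(table[0]))     # 1024 4
print(is_large_enough(table, 4096))  # True
```

With i = 10 and 4 slots per address, the index table offers 4096 positions, so it can serve a hash table of up to 4096 addresses.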
Optionally, the method for processing hash conflicts further includes:
finding, among the addresses of the index table, the address corresponding to the first hash value, as the target address;
determining whether each position at the target address is occupied;
in response to a determination that no position at the target address is occupied, determining that the key value to be inserted into the hash table will not cause a hash conflict;
in response to a determination that some positions at the target address are occupied and the remaining positions are unoccupied, using the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict.
When some positions at the target address are occupied and the others are not, a hash conflict is possible, so the second hash value and the index table are needed to determine whether the key value to be inserted into the hash table will cause one.
Optionally, using the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict includes:
obtaining the hash values in the first field of all occupied positions at the target address, as the hash values to be compared;
determining whether the second hash value is the same as any of the hash values to be compared;
in response to a determination that the second hash value differs from every hash value to be compared, determining that the key value to be inserted into the hash table will not cause a hash conflict.
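The determination described above can be sketched as one Python function. This is an illustrative sketch, not the application's implementation: each position is modeled as a `(flag, sig, index)` tuple with flag 1 meaning occupied, and the function name `may_insert` is an assumption. The case where every position is occupied is not covered by the paragraphs above; the sketch treats it as not insertable, which matches the step-by-step insertion procedure given later in the description.

```python
def may_insert(bucket: list[tuple[int, int, int]], second_hash: int) -> bool:
    """Return True if a key with this second hash value causes no hash conflict.

    bucket holds the (flag, sig, index) positions read from the target address.
    """
    occupied_sigs = [sig for flag, sig, _ in bucket if flag == 1]
    if len(occupied_sigs) == len(bucket):
        return False  # no free position left at the target address
    # All free: the list is empty, so the test below is trivially True.
    # Partly occupied: conflict only if a stored sig equals the second hash value.
    return second_hash not in occupied_sigs


print(may_insert([(0, 0, 0)] * 4, 7))               # all free
print(may_insert([(1, 7, 3)] + [(0, 0, 0)] * 3, 7))  # same sig already stored
```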
Optionally, after the key value to be inserted into the hash table is assigned a target index value that differs from the historical index values, the method further includes:
adding the second hash value to the first field of the first unoccupied position at the target address.
Optionally, after the key value to be inserted into the hash table is assigned a target index value that differs from the historical index values, the method further includes:
adding the target index value to the second field of the first unoccupied position at the target address.
Optionally, after the key value to be inserted into the hash table is assigned a target index value that differs from the historical index values, the method further includes:
adding a mark indicating that the position is occupied to the third field of the first unoccupied position at the target address.
In the method for processing hash conflicts provided by the embodiments of the present application, the size of the hash table matches the size of the sample space of the key values, so the hash table is not enlarged. To reduce conflicts, an index table is introduced, in which every key value is assigned a unique index value. The index value therefore never duplicates the index value of a key value already in the hash table, which avoids hash conflicts. In theory the size of the index table should match the number of key value samples, but the bit width of the index table is much smaller than that of the hash table, so the index table can be enlarged to reduce conflicts. When the index table is stored in DDR, all the required index table information can be read with a single read command, and because every key value has a unique index value, once the index table is hit the content of the hash table can be read with one read operation, which improves the query rate.
An embodiment of the present application provides an apparatus for processing hash conflicts. As shown in FIG. 5, the apparatus 2 for processing hash conflicts includes:
an acquisition module 21, configured to obtain the key value and the result value to be inserted into the hash table;
an operation module 22, configured to perform a hash operation on the key value to be inserted into the hash table with each of at least two different preset hash functions to obtain at least two hash values;
a processing module 23, configured to assign the key value to be inserted into the hash table a target index value that differs from the historical index values when it is determined, based on the obtained hash values and the pre-established index table for storing index values, that the key value will not cause a hash conflict, where a historical index value is an index value corresponding to a key value already in the hash table;
an adding module 24, configured to add the key value and the result value to be inserted into the hash table with the target index value as the address.
The at least two different preset hash functions include a first hash function and a second hash function.
The operation module 22 is configured to perform a hash operation on the key value to be inserted into the hash table with the first hash function to obtain a first hash value, and to perform a hash operation on the key value with the second hash function to obtain a second hash value.
Optionally, the index table includes multiple addresses; each address includes multiple positions; each position includes a first field, a second field, and a third field. The first field stores the hash value obtained by performing a hash operation on the key value with the second hash function; the second field stores the index value; the third field stores a mark indicating whether the position is occupied. The product of the number of addresses in the index table and the number of positions per address is not less than the number of addresses in the hash table.
Optionally, the processing module 23 is further configured to: find, among the addresses of the index table, the address corresponding to the first hash value, as the target address; determine whether each position at the target address is occupied; in response to a determination that no position at the target address is occupied, determine that the key value to be inserted into the hash table will not cause a hash conflict; and in response to a determination that some positions at the target address are occupied and the remaining positions are unoccupied, use the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict.
Optionally, the processing module 23 is configured to use the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict as follows: obtain the hash values in the first field of all occupied positions at the target address, as the hash values to be compared; determine whether the second hash value is the same as any of the hash values to be compared; and in response to a determination that the second hash value differs from every hash value to be compared, determine that the key value to be inserted into the hash table will not cause a hash conflict.
Optionally, the adding module 24 is further configured to add the second hash value to the first field of the first unoccupied position at the target address.
Optionally, the adding module 24 is further configured to add the target index value to the second field of the first unoccupied position at the target address.
Optionally, the adding module 24 is further configured to add a mark indicating that the position is occupied to the third field of the first unoccupied position at the target address.
In the apparatus for processing hash conflicts provided by the embodiments of the present application, the hash values used to determine whether the key value to be inserted would cause a hash conflict are obtained from at least two hash functions, and using different hash functions reduces the probability of a hash conflict. Once it is determined that the key value will not cause a hash conflict, the key value is assigned a target index value that differs from the index values of all key values already in the hash table. Because the target index value serves as the address at which the key value and the result value are inserted, that address necessarily differs from the addresses of the key values already in the hash table. Hash conflicts are thus reduced without enlarging the hash table, which saves device cost and improves the lookup speed of the hash table.
In practical applications, the acquisition module 21, the operation module 22, the processing module 23, and the adding module 24 may each be implemented by a central processing unit (Central Processing Unit, CPU), a microprocessor (Micro Processor Unit, MPU), a digital signal processor (Digital Signal Processor, DSP), a field programmable gate array (Field Programmable Gate Array, FPGA), or the like located in the apparatus for processing hash conflicts.
An embodiment of the present application further provides an apparatus for processing hash conflicts. As shown in FIG. 6, the apparatus includes:
a hash function module 31, configured to accept an input key. The module contains two different hash functions, denoted hash_i and hash_h, which hash the key and output two hash values: the first hash value (Hi) and the second hash value (Hh).
a hash conflict processing module 32, configured to receive the two hash values output by the hash function module 31, handle hash conflicts, and perform the corresponding read and write operations on the index table and the hash table for insert, query, and delete operations.
When a key is inserted, an action is input together with the key. Data is first read from the index table with Hi as the address. Hash conflict processing is then performed, an index is allocated to the key, and Hh and the index are written to the corresponding position in the Hi address space of the index table. Finally, the key and the action are written to the hash table with the index as the address.
When a key is queried, only the key is input. There is no action, because the action is what the query retrieves from the hash table. Data is first read from the index table with Hi as the address, and the hash values of the hash function hash_h are extracted from the returned data, denoted Hh' here. The Hh generated from the current key is compared with each Hh', the index at the position where they match is selected, and data is read from the hash table with that index as the address. The hash table returns the required key and action.
When a key is deleted, only the key is input, without an action. Data is first read from the index table with Hi as the address, and Hh' is extracted from the returned data. The Hh generated from the current key is compared with each Hh', the index at the position where they match is selected, and data is read from the hash table with that index as the address. The keys are then compared. If they match, the deletion succeeds: 0 is written to the hash table with the index as the address, and the corresponding slot at the Hi address of the index table is cleared. If they do not match, the deletion fails.
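The deletion flow just described can be sketched in Python as follows. The sketch rests on assumptions: slots are modeled as `(flag, sig, index)` tuples, a cleared hash-table entry is modeled as `None` in place of writing 0, and a miss in the index table is treated as a failed deletion, which the paragraph above does not state explicitly. The function and variable names are illustrative.

```python
def delete(index_table, hash_table, hi, hh, key) -> bool:
    """Delete `key`; return True on success and False on failure."""
    bucket = index_table[hi]                    # read the index table at address Hi
    for j, (flag, sig, index) in enumerate(bucket):
        if flag == 1 and sig == hh:             # stored Hh' matches the current Hh
            stored = hash_table[index]          # read the hash table at `index`
            if stored is not None and stored[0] == key:
                hash_table[index] = None        # stands in for writing 0 to the entry
                bucket[j] = (0, 0, 0)           # clear the slot in the index table
                return True
            return False                        # keys differ: deletion fails
    return False                                # assumed: index-table miss is a failure


index_table = [[(0, 0, 0)] * 4 for _ in range(8)]
hash_table = [None] * 32
index_table[3][0] = (1, 0xAB, 5)
hash_table[5] = (b"k1", "fwd")

print(delete(index_table, hash_table, 3, 0xAB, b"other"))  # False
print(delete(index_table, hash_table, 3, 0xAB, b"k1"))     # True
```

The description does not say how a released index is returned for reuse, so the sketch leaves that out.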
Assume the index table is as shown in FIG. 4, where flag=0 indicates that a slot is free and flag=1 indicates that a slot is occupied. The processing of key insertion, key query, and key deletion is described below.
Key insertion includes the following steps:
Step 110: Read the index table with Hi as the address, reading out all the slots in the Hi address space of the index table.
Step 120: When the index table data is returned, extract the flags of the 4 slots from the returned data.
Step 130: From slot(1) to slot(4), use the flags to find the first free slot, denoted slot(j); this is the slot position at which the key is to be inserted. If all 4 slots are free, take j=1, determine that the current key can be inserted, and go to step 140. If all slots are occupied, the current entry cannot be inserted; go to step 170. Otherwise, go to step 150.
Step 140: Set flag=1 and sig=Hh, allocate a new index for the current key, write the new flag (now flag=1), sig, and index to the slot(1) position in the Hi address space of the index table, and go to step 160.
Step 150: Extract the sig of each slot with flag=1, denoted sig_h, and compare Hh with every extracted sig_h. If any sig_h matches, the key cannot be inserted; if none matches, the key can be inserted. If the key can be inserted, allocate a new index for the current key, write the new flag, sig, and index to the slot(j) position in the Hi address space of the index table, and go to step 160. If the key cannot be inserted, the insertion fails; go to step 170.
Step 160: Write the key and the action to the hash table with the newly allocated index as the address.
Step 170: End the procedure.
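Steps 110 to 170 can be sketched in Python as below. Several parts are assumptions made for the example: slots are modeled as dictionaries, the function and variable names are illustrative, and the allocator is a simple free list (`free_indexes`), since the application only requires that the new index differ from the indexes already in use and does not say how it is chosen.

```python
SLOTS = 4  # slots per index-table address, as in FIG. 4


def insert(index_table, hash_table, free_indexes, hi, hh, key, action):
    """Insert (key, action); return the allocated index, or None on failure."""
    bucket = index_table[hi]                           # steps 110-120: read all slots at Hi
    free = [j for j, s in enumerate(bucket) if s["flag"] == 0]
    if not free:                                       # step 130: every slot is occupied
        return None
    # Step 150: compare Hh with the sig of each occupied slot.
    # When all slots are free (step 140) there is nothing to compare.
    if any(s["flag"] == 1 and s["sig"] == hh for s in bucket):
        return None
    if not free_indexes:                               # assumed: no unused index is left
        return None
    index = free_indexes.pop()                         # assumed allocator: unused indexes only
    bucket[free[0]] = {"flag": 1, "sig": hh, "index": index}  # write slot(j)
    hash_table[index] = (key, action)                  # step 160
    return index


index_table = [[{"flag": 0, "sig": 0, "index": 0} for _ in range(SLOTS)] for _ in range(8)]
hash_table = [None] * 32
free_indexes = list(range(31, -1, -1))  # pop() hands out 0, 1, 2, ...

print(insert(index_table, hash_table, free_indexes, 3, 0xAB, b"k1", "fwd"))   # 0
print(insert(index_table, hash_table, free_indexes, 3, 0xAB, b"k2", "drop"))  # None
```

The second call fails because a slot at address 3 already stores the same Hh, which is the conflict case of step 150.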
The key query process includes the following steps:
Step 210: Read the index table using Hi as the address, reading out all slots at address Hi, and go to step 220.
Step 220: The index table data is returned; extract the flags of the four slots from the returned data.
Step 230: Denote the sig in each slot with flag=1 as sig_h, then compare Hh with all sig_h values. If one matches, the index table query succeeds, and the slot whose sig_h equals Hh is recorded as slot(h); if none matches, the index table query fails. When the index table query succeeds, extract the index from slot(h) and go to step 240; when it fails, go to step 250.
Step 240: Read the hash table using the extracted index as the address, and extract the key and action fields from the returned data. If the extracted key matches the key used in the query, the hash table query succeeds and action is the required query result; if they do not match, the hash table query fails and action is invalid. Go to step 250.
Step 250: End the process.
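A minimal Python sketch of the query steps follows. The function name `query` and the representation of the tables (a list of `[flag, sig, index]` slots per index-table address, and `(key, action)` tuples or `None` in the hash table) are assumptions made for illustration.

```python
def query(index_table, hash_table, hi, hh, key):
    """Return the action stored for key, or None when either lookup fails."""
    for flag, sig, index in index_table[hi]:      # steps 210-220: all slots at Hi
        if flag == 1 and sig == hh:               # step 230: sig_h equals Hh, slot(h) found
            entry = hash_table[index]             # step 240: read the hash table at index
            if entry is not None and entry[0] == key:
                return entry[1]                   # full key matches, action is valid
            return None                           # same Hi and Hh but a different key
    return None                                   # index table query failed
```

The final comparison of the full key is what distinguishes a real hit from a different key that happens to share both Hi and Hh.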
The key deletion process includes the following steps:
Step 310: Read the index table using Hi as the address, reading out all slots at address Hi.
Step 320: The index table data is returned; extract the flags of the four slots from the returned data.
Step 330: Denote the sig in each slot with flag=1 as sig_h, then compare Hh with all sig_h values. If one matches, the index table query succeeds, and the slot whose sig_h equals Hh is recorded as slot(h); if none matches, the index table query fails. When the index table query succeeds, extract the index from slot(h) and go to step 340; when it fails, go to step 360.
Step 340: Read the hash table using the extracted index as the address, extract the key and action fields from the returned data, and go to step 350.
Step 350: Determine whether the key read out matches the key used in the query. If it matches, write all zeros to the hash table at address index and, using Hi as the address, clear slot(h) in the index table. If it does not match, the deletion fails. Go to step 360.
Step 360: End the process.
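The deletion steps can be sketched in the same style. The function name `delete` and the table representation are again assumptions; writing `None` stands in for writing all zeros to the hash table entry.

```python
def delete(index_table, hash_table, hi, hh, key):
    """Remove key from both tables; return True on success, False otherwise."""
    for slot in index_table[hi]:                  # steps 310-320: all slots at Hi
        flag, sig, index = slot
        if flag == 1 and sig == hh:               # step 330: slot(h) found
            entry = hash_table[index]             # step 340: read the hash table
            if entry is not None and entry[0] == key:   # step 350: keys match
                hash_table[index] = None          # stands in for writing all zeros
                slot[:] = [0, 0, 0]               # clear slot(h) in the index table
                return True
            return False                          # key mismatch, deletion fails
    return False                                  # index table query failed
```

After a successful deletion the slot is free again, so a later insertion at the same Hi can reuse it.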
An embodiment of the present application further provides a hash conflict processing device, including a memory and a processor, where the memory stores the following instructions executable by the processor:
Obtain a key value and a result value to be inserted into a hash table.
Perform a hash operation on the key value to be inserted into the hash table using each of at least two different preset hash functions, obtaining at least two hash values.
When it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value to be inserted into the hash table will not cause a hash conflict, allocate to that key value a target index value different from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table.
Add the key value and the result value to be inserted to the hash table, using the target index value as the address.
Optionally, the at least two different preset hash functions include a first hash function and a second hash function.
The memory stores the following instructions executable by the processor: perform a hash operation on the key value to be inserted into the hash table using the first hash function to obtain a first hash value.
Perform a hash operation on the key value to be inserted into the hash table using the second hash function to obtain a second hash value.
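As an illustration of deriving two hash values from one key value, the sketch below uses `zlib.crc32` and `zlib.adler32` from the Python standard library as stand-ins. This passage does not name the hash functions, so these choices, the function name `two_hashes`, and both bit widths are assumptions.

```python
import zlib

INDEX_ADDRESS_BITS = 18  # hypothetical width of the first hash value (index table address)
SIGNATURE_BITS = 16      # hypothetical width of the second hash value (stored signature)


def two_hashes(key: bytes):
    """Return (first_hash, second_hash) computed by two different functions."""
    first_hash = zlib.crc32(key) & ((1 << INDEX_ADDRESS_BITS) - 1)
    second_hash = zlib.adler32(key) & ((1 << SIGNATURE_BITS) - 1)
    return first_hash, second_hash
```

Because the two functions are different, two key values that share a first hash value are unlikely to share the second one as well, which is what lets the second hash value separate keys that land on the same index table address.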
Optionally, the index table includes multiple addresses.
Each address includes multiple positions.
Each position includes a first field, a second field, and a third field. The first field stores the hash value obtained by hashing a key value with the second hash function; the second field stores an index value; the third field stores a flag indicating whether the position is occupied.
The product of the number of addresses in the index table and the number of positions per address is not less than the number of addresses in the hash table.
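One way to picture a position is as a packed word holding the three fields, together with a check of the sizing constraint. The field widths, the bit order, and the function names below are assumptions chosen for illustration; the text only fixes what each field stores.

```python
SIGNATURE_BITS = 16  # hypothetical width of the first field (second hash value)
INDEX_BITS = 20      # hypothetical width of the second field (index value)


def pack_position(flag, signature, index):
    """Pack the occupied flag, first field and second field into one integer."""
    return (flag << (SIGNATURE_BITS + INDEX_BITS)) | (signature << INDEX_BITS) | index


def unpack_position(word):
    """Inverse of pack_position; returns (flag, signature, index)."""
    index = word & ((1 << INDEX_BITS) - 1)
    signature = (word >> INDEX_BITS) & ((1 << SIGNATURE_BITS) - 1)
    flag = (word >> (SIGNATURE_BITS + INDEX_BITS)) & 1
    return flag, signature, index


def index_table_large_enough(index_addresses, positions_per_address, hash_addresses):
    """Sizing rule: addresses x positions must be no less than the hash table size."""
    return index_addresses * positions_per_address >= hash_addresses
```

The sizing rule ensures that the index table has at least as many positions as the hash table has addresses, so every hash table address can be referenced by some position.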
Optionally, the memory further stores the following instructions executable by the processor:
Find, among the addresses of the index table, the address corresponding to the first hash value, and take it as the target address.
Determine whether each position in the target address is occupied.
In response to a determination that no position in the target address is occupied, determine that the key value to be inserted into the hash table will not cause a hash conflict.
In response to a determination that some positions in the target address are occupied and the remaining positions are not, use the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict.
Optionally, the memory further stores the following instructions executable by the processor:
Obtain the hash values in the first field of all occupied positions in the target address as the hash values to be compared.
Determine whether the second hash value is the same as any of the hash values to be compared.
In response to a determination that the second hash value differs from every hash value to be compared, determine that the key value to be inserted into the hash table will not cause a hash conflict.
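The decision described above can be written as a small function over the positions of the target address. The names `Verdict` and `check_conflict` and the tuple layout are hypothetical; the separate outcome for a fully occupied address follows step 130 of the insertion walkthrough rather than this passage.

```python
from enum import Enum


class Verdict(Enum):
    NO_CONFLICT = "no conflict"
    CONFLICT = "conflict"
    FULL = "all positions occupied"


def check_conflict(positions, second_hash):
    """positions: list of (first_field_hash, index_value, occupied) tuples."""
    occupied = [p for p in positions if p[2]]
    if not occupied:                       # no position occupied
        return Verdict.NO_CONFLICT
    if len(occupied) == len(positions):    # no free position left for the key value
        return Verdict.FULL
    to_compare = [p[0] for p in occupied]  # first fields of the occupied positions
    if second_hash in to_compare:
        return Verdict.CONFLICT
    return Verdict.NO_CONFLICT
```

Only the first fields of occupied positions are compared, so stale values left in free positions cannot produce a false conflict.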
Optionally, the memory further stores the following instructions executable by the processor:
Add the second hash value to the first field of the first unoccupied position in the target address.
Optionally, the memory further stores the following instructions executable by the processor:
Add the target index value to the second field of the first unoccupied position in the target address.
Optionally, the memory further stores the following instructions executable by the processor:
Add a flag indicating that the position is occupied to the third field of the first unoccupied position in the target address.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the following steps:
Obtain a key value and a result value to be inserted into a hash table.
Perform a hash operation on the key value to be inserted into the hash table using each of at least two different preset hash functions, obtaining at least two hash values.
When it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value to be inserted into the hash table will not cause a hash conflict, allocate to that key value a target index value different from the historical index values, where a historical index value is an index value corresponding to a key value already in the hash table.
Add the key value and the result value to be inserted to the hash table, using the target index value as the address.
Optionally, the at least two different preset hash functions include a first hash function and a second hash function. The computer-executable instructions further perform the following steps:
Perform a hash operation on the key value to be inserted into the hash table using the first hash function to obtain a first hash value.
Perform a hash operation on the key value to be inserted into the hash table using the second hash function to obtain a second hash value.
Optionally, the index table includes multiple addresses.
Each address includes multiple positions.
Each position includes a first field, a second field, and a third field. The first field stores the hash value obtained by hashing a key value with the second hash function; the second field stores an index value; the third field stores a flag indicating whether the position is occupied.
The product of the number of addresses in the index table and the number of positions per address is not less than the number of addresses in the hash table.
Optionally, the computer-executable instructions further perform the following steps:
Find, among the addresses of the index table, the address corresponding to the first hash value, and take it as the target address.
Determine whether each position in the target address is occupied.
In response to a determination that no position in the target address is occupied, determine that the key value to be inserted into the hash table will not cause a hash conflict.
In response to a determination that some positions in the target address are occupied and the remaining positions are not, use the second hash value and the index table to determine whether the key value to be inserted into the hash table will cause a hash conflict.
Optionally, the computer-executable instructions further perform the following steps:
Obtain the hash values in the first field of all occupied positions in the target address as the hash values to be compared.
Determine whether the second hash value is the same as any of the hash values to be compared.
In response to a determination that the second hash value differs from every hash value to be compared, determine that the key value to be inserted into the hash table will not cause a hash conflict.
Optionally, the computer-executable instructions further perform the following steps:
Add the second hash value to the first field of the first unoccupied position in the target address.
Optionally, the computer-executable instructions further perform the following steps:
Add the target index value to the second field of the first unoccupied position in the target address.
Optionally, the computer-executable instructions further perform the following steps:
Add a flag indicating that the position is occupied to the third field of the first unoccupied position in the target address.

Claims (11)

  1. A method for processing hash conflicts, comprising:
    obtaining a key value and a result value to be inserted into a hash table;
    performing a hash operation on the key value using each of at least two different preset hash functions to obtain at least two hash values;
    when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value will not cause a hash conflict, allocating to the key value a target index value different from historical index values, wherein the historical index values are index values corresponding to key values already in the hash table; and
    adding the key value and the result value to the hash table using the target index value as the address.
  2. The method according to claim 1, wherein the at least two different preset hash functions comprise a first hash function and a second hash function;
    and wherein performing a hash operation on the key value using each of the at least two different preset hash functions to obtain at least two hash values comprises:
    performing a hash operation on the key value using the first hash function to obtain a first hash value; and
    performing a hash operation on the key value using the second hash function to obtain a second hash value.
  3. The method according to claim 2, wherein the index table comprises multiple addresses, each address comprises multiple positions, and each position comprises a first field, a second field, and a third field;
    wherein the first field is used to store a second hash value obtained by hashing a key value with the second hash function; the second field is used to store an index value; the third field is used to store a flag indicating whether the position is occupied; and the product of the number of addresses in the index table and the number of positions per address is not less than the number of addresses in the hash table.
  4. The method according to claim 3, further comprising:
    finding, among the addresses of the index table, the address corresponding to the first hash value as a target address;
    determining whether each position in the target address is occupied;
    in response to a determination that no position in the target address is occupied, determining that the key value will not cause the hash conflict; and
    in response to a determination that some positions in the target address are occupied and the other positions in the target address are not occupied, using the second hash value and the index table to determine whether the key value will cause the hash conflict.
  5. The method according to claim 4, wherein using the second hash value and the index table to determine whether the key value will cause the hash conflict comprises:
    obtaining the hash values in the first field of all occupied positions in the target address as hash values to be compared;
    determining whether the second hash value is the same as one of the hash values to be compared; and
    in response to a determination that the second hash value differs from all of the hash values to be compared, determining that the key value will not cause the hash conflict.
  6. The method according to claim 4 or 5, further comprising, after allocating to the key value the target index value different from the historical index values:
    adding the second hash value to the first field of the first unoccupied position in the target address.
  7. The method according to any one of claims 4-6, further comprising, after allocating to the key value the target index value different from the historical index values:
    adding the target index value to the second field of the first unoccupied position in the target address.
  8. The method according to any one of claims 4-7, further comprising, after allocating to the key value the target index value different from the historical index values:
    adding a flag indicating that the position is occupied to the third field of the first unoccupied position in the target address.
  9. An apparatus for processing hash conflicts, comprising:
    an obtaining module, configured to obtain a key value and a result value to be inserted into a hash table;
    an operation module, configured to perform a hash operation on the key value using each of at least two different preset hash functions to obtain at least two hash values;
    a processing module, configured to, when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value will not cause a hash conflict, allocate to the key value a target index value different from historical index values, wherein the historical index values are index values corresponding to key values already in the hash table; and
    an adding module, configured to add the key value and the result value to the hash table using the target index value as the address.
  10. A device for processing hash conflicts, comprising a processor and a memory, wherein the memory stores the following instructions executable by the processor:
    obtaining a key value and a result value to be inserted into a hash table;
    performing a hash operation on the key value using each of at least two different preset hash functions to obtain at least two hash values;
    when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value will not cause a hash conflict, allocating to the key value a target index value different from historical index values, wherein the historical index values are index values corresponding to key values already in the hash table; and
    adding the key value and the result value to the hash table using the target index value as the address.
  11. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the following steps:
    obtaining a key value and a result value to be inserted into a hash table;
    performing a hash operation on the key value using each of at least two different preset hash functions to obtain at least two hash values;
    when it is determined, based on the obtained hash values and a pre-established index table for storing index values, that the key value will not cause a hash conflict, allocating to the key value a target index value different from historical index values, wherein the historical index values are index values corresponding to key values already in the hash table; and
    adding the key value and the result value to the hash table using the target index value as the address.
PCT/CN2019/126860 2018-12-21 2019-12-20 Hash collision processing method, apparatus, device, and computer readable storage medium WO2020125741A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811571387.0A CN111352931A (en) 2018-12-21 2018-12-21 Hash collision processing method and device and computer readable storage medium
CN201811571387.0 2018-12-21

Publications (1)

Publication Number Publication Date
WO2020125741A1 true WO2020125741A1 (en) 2020-06-25

Family

ID=71102527

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/126860 WO2020125741A1 (en) 2018-12-21 2019-12-20 Hash collision processing method, apparatus, device, and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN111352931A (en)
WO (1) WO2020125741A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112162950B (en) * 2020-09-11 2022-11-15 杭州涂鸦信息技术有限公司 Data processing method and device based on file system and computer equipment
CN112202677B (en) * 2020-09-29 2023-04-07 中移(杭州)信息技术有限公司 Hardware acceleration query method, system, electronic equipment and storage medium
CN113051302B (en) * 2021-04-19 2022-04-29 哈尔滨工业大学 Overall design-oriented multi-dimensional data matching method and device and computer storage medium
CN114244817A (en) * 2021-11-30 2022-03-25 慧之安信息技术股份有限公司 Hash collision processing method and device based on osi protocol stack header field
CN115576954B (en) * 2022-11-24 2023-04-07 恒生电子股份有限公司 Hash table determining method and device
CN116401258B (en) * 2023-06-06 2023-09-22 支付宝(杭州)信息技术有限公司 Data indexing method, data query method and corresponding devices
CN116450656B (en) * 2023-06-16 2023-08-22 北京数巅科技有限公司 Data processing method, device, equipment and storage medium
CN116822456A (en) * 2023-07-03 2023-09-29 中科驭数(北京)科技有限公司 Character string encoding method, device, equipment and storage medium
CN117390029B (en) * 2023-12-11 2024-05-17 格创通信(浙江)有限公司 Table entry inserting method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110185359A1 (en) * 2010-01-25 2011-07-28 Dhruva Chakrabarti Determining A Conflict in Accessing Shared Resources Using a Reduced Number of Cycles
CN102609509A (en) * 2010-04-26 2012-07-25 华为技术有限公司 Method and device for processing hash data
CN103577564A (en) * 2013-10-25 2014-02-12 盛科网络(苏州)有限公司 Method and device for reducing HASH collision through software shift
CN104158744A (en) * 2014-07-09 2014-11-19 中国电子科技集团公司第三十二研究所 Method for building table and searching for network processor
CN108111421A (en) * 2017-11-28 2018-06-01 郑州云海信息技术有限公司 A kind of message diversion method and device based on multiple Hash

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100502353C (en) * 2005-09-22 2009-06-17 中兴通讯股份有限公司 Signalling flow distributing method and signalling distributing processing unit
CN101692651B (en) * 2009-09-27 2014-12-31 中兴通讯股份有限公司 Method and device for Hash lookup table
CN101827137B (en) * 2010-04-13 2013-01-30 西安邮电学院 Hash table-based and extended memory-based high-performance IPv6 address searching method
CN102346735A (en) * 2010-07-29 2012-02-08 高通创锐讯通讯科技(上海)有限公司 Hash search method capable of reducing hash collision
US9230548B2 (en) * 2012-06-06 2016-01-05 Cypress Semiconductor Corporation Hybrid hashing scheme for active HMMS
CN103107945B (en) * 2013-01-10 2016-01-27 中国科学院信息工程研究所 A kind of system and method for fast finding IPV6 route
JP2016015011A (en) * 2014-07-02 2016-01-28 日本電信電話株式会社 Database device, database management method, and program
CN107153707B (en) * 2017-05-12 2020-08-14 华中科技大学 Hash table construction method and system for nonvolatile memory
CN108255912B (en) * 2017-08-17 2020-02-11 新华三技术有限公司 Method and device for storing and inquiring table data

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230244650A1 (en) * 2022-02-03 2023-08-03 TripleBlind, Inc. Systems and methods for enabling two parties to find an intersection between private data sets without learning anything other than the intersection of the datasets
CN117807277A (en) * 2024-03-01 2024-04-02 中国人民解放军国防科技大学 High-order dynamic image data storage method, device, equipment and storage medium
CN117807277B (en) * 2024-03-01 2024-05-17 中国人民解放军国防科技大学 High-order dynamic image data storage method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111352931A (en) 2020-06-30

Similar Documents

Publication Publication Date Title
WO2020125741A1 (en) Hash collision processing method, apparatus, device, and computer readable storage medium
US10706101B2 (en) Bucketized hash tables with remap entries
US11102120B2 (en) Storing keys with variable sizes in a multi-bank database
JP6542909B2 (en) File operation method and apparatus
US8266116B2 (en) Method and apparatus for dual-hashing tables
US9871727B2 (en) Routing lookup method and device and method for constructing B-tree structure
US9032143B2 (en) Enhanced memory savings in routing memory structures of serial attached SCSI expanders
JP2017182803A (en) Memory deduplication method and deduplication DRAM memory module
US8874866B1 (en) Memory access system
US6725216B2 (en) Partitioning search key thereby distributing table across multiple non-contiguous memory segments, memory banks or memory modules
TWI644216B (en) Priority-based access of compressed memory lines in memory in a processor-based system
CN108255912B (en) Method and device for storing and inquiring table data
CN106599091B (en) RDF graph structure storage and index method based on key value storage
CN103001878A (en) Determination method and device for media access control (MAC) address Hash collision
US20090282167A1 (en) Method and apparatus for bridging
WO2016070341A1 (en) Data processing method and apparatus
CN108762915B (en) Method for caching RDF data in GPU memory
CN108664518B (en) Method and device for realizing table look-up processing
US10102116B2 (en) Multi-level page data structure
WO2013108745A1 (en) Storage device, control method for same, and program
US9703484B2 (en) Memory with compressed key
US20160124950A1 (en) Data processing device, data processing method, and non-transitory computer readable medium
CN108614879A (en) Small documents processing method and device
US20160105363A1 (en) Memory system for multiple clients
US10162525B2 (en) Translating access requests for a multi-level page data structure

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19899922

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.11.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19899922

Country of ref document: EP

Kind code of ref document: A1