CN110825921A - Method for solving Hash collision - Google Patents

Info

Publication number
CN110825921A
Authority
CN
China
Prior art keywords
hash
rule
collision
key
hash table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911107261.2A
Other languages
Chinese (zh)
Inventor
陈晖�
张晓峰
陈伟峰
王东锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Optical Electrical Communication Technology Co Ltd
Original Assignee
Tianjin Optical Electrical Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Optical Electrical Communication Technology Co Ltd
Priority to CN201911107261.2A
Publication of CN110825921A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9014 Indexing; Data structures therefor; Storage structures hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing

Abstract

The invention discloses a method for resolving hash collisions. Each rule is hashed a second time with a different hash function and stored in a new hash table, which addresses the problem of colliding rules overwriting one another in the original hash table when only a single hash calculation is used. When a key is later searched and matched, the key is hashed twice, both hash tables are searched, and the two search results are combined with a logical OR to obtain the final result. In this way, even if the original hash table fails to find the corresponding rule because of a collision, the new hash table finds it, and the OR of the two search results shows that the expected rule was found, that is, the result is a hit. The method handles hash collisions well and provides a useful reference for implementing hash-based search and matching.

Description

Method for solving Hash collision
Technical Field
The invention relates to the field of data search and matching, and in particular to a method for resolving the hash collisions that arise when data search and matching is implemented with hashing.
Background
In data search and matching, received data is compared against the rules of a known database in order to match and filter it. Because hash tables offer fast lookup and can store a large number of rules, they are widely used in this field. However, hash collisions cannot be avoided when a hash table is used for data search. A hash collision occurs when different rules or keys, processed by the same hash function, produce the same hash result, that is, they are mapped to the same position in the hash table. A common way to handle hash collisions is to reserve a collision depth interval for each hash result and, when a collision occurs, to store the colliding rules at different positions within that interval, so that no rule overwrites another. When a key is searched, each rule in the collision depth interval is traversed and compared with the key to determine whether there is a hit. This implementation is the most straightforward, but its drawbacks are also obvious: because every rule in the collision depth interval must be traversed during a key search, search performance may degrade. In addition, if the collision depth interval is too large, the whole hash table becomes large and requires a large amount of storage, which is unacceptable for applications where storage space is tight.
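The collision depth interval approach described above can be sketched in software as follows. This is only an illustrative model, not code from the patent: the class name `BucketHashTable`, the toy byte-sum hash, and the parameters `size` and `depth` are all assumptions chosen for the example.

```python
from typing import List, Optional


class BucketHashTable:
    """Illustrative table that reserves `depth` slots per hash address."""

    def __init__(self, size: int, depth: int) -> None:
        self.size = size
        self.depth = depth
        # size * depth slots are allocated up front, as in a fixed memory.
        self.slots: List[List[Optional[str]]] = [
            [None] * depth for _ in range(size)
        ]

    def _addr(self, rule: str) -> int:
        # Toy hash function (an assumption): byte sum modulo the table size.
        return sum(rule.encode()) % self.size

    def insert(self, rule: str) -> bool:
        """Store the rule in the first free slot of its interval."""
        bucket = self.slots[self._addr(rule)]
        for i, entry in enumerate(bucket):
            if entry is None or entry == rule:
                bucket[i] = rule
                return True
        return False  # the collision depth interval is full

    def lookup(self, key: str) -> bool:
        """Traverse the interval; the worst case compares `depth` entries."""
        return any(entry == key for entry in self.slots[self._addr(key)])
```

With `depth` slots reserved per address, a lookup may need up to `depth` comparisons and the table occupies `size * depth` slots, which are the two costs the background section points to.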
Disclosure of Invention
In view of the problems of the prior art described above, the present invention provides a method for resolving hash collisions. The invention aims to solve the problems of reduced hash table search performance and excessive storage space in the prior art.
The technical solution adopted by the invention is as follows: a method for resolving hash collisions, implemented on an FPGA-based hardware platform, characterized by comprising the following steps:
suppose two rules, rule_0 and rule_1, need to be stored in a hash table whose hash function is hash; if hash(rule_0) and hash(rule_1) are equal, rule_1 overwrites rule_0, and when key_0, the key corresponding to rule_0, is subsequently searched, it cannot be matched successfully;
therefore, another hash function, denoted hash_s, is applied to rule_0 and rule_1 again; the results hash_s(rule_0) and hash_s(rule_1) are now different, so both rules are stored in a new hash table without one overwriting the other; when key_0 corresponding to rule_0 is searched, the hash table of the hash function hash misses while the hash table of the hash function hash_s hits, and the two results are combined with a logical OR, so that key_0 finally obtains the expected hit.
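The two steps above can be modelled in software as a minimal sketch. The patent targets an FPGA and does not specify the hash functions, so the two toy functions here (`hash_primary`, standing in for the patent's hash, and `hash_s`) and the class name `DualHashTable` are assumptions made purely for illustration.

```python
from typing import List, Optional


def hash_primary(rule: str, size: int) -> int:
    # Toy stand-in for the patent's `hash` (assumption): byte sum.
    return sum(rule.encode()) % size


def hash_s(rule: str, size: int) -> int:
    # Toy stand-in for `hash_s` (assumption): position-sensitive polynomial.
    h = 0
    for b in rule.encode():
        h = (h * 31 + b) % size
    return h


class DualHashTable:
    """Two single-slot tables of equal size, one per hash function."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.table: List[Optional[str]] = [None] * size
        self.table_s: List[Optional[str]] = [None] * size

    def insert(self, rule: str) -> None:
        # One slot per address: a later colliding rule overwrites an earlier one.
        self.table[hash_primary(rule, self.size)] = rule
        self.table_s[hash_s(rule, self.size)] = rule

    def lookup(self, key: str) -> bool:
        hit = self.table[hash_primary(key, self.size)] == key
        hit_s = self.table_s[hash_s(key, self.size)] == key
        return hit or hit_s  # logical OR of the two search results


table = DualHashTable(16)
table.insert("ab")  # plays the role of rule_0
table.insert("ba")  # plays the role of rule_1; collides with "ab" under hash_primary
found_rule_0 = table.lookup("ab")
```

In this example "ab" is overwritten in the first table but survives in the second, so the OR of the two lookups still reports a hit. In software the two lookups run one after the other; the simultaneous search described in the patent is a property of the hardware implementation.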
The beneficial effects of the invention are as follows: the same rule is hashed twice with different hash functions, so when rules collide under the first hash function, the probability that the same rules also collide under the second hash function is very low, because the two functions compute their results differently. The method has wide application value in the field of data search and matching.
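The "very low" probability can be made concrete under an idealized assumption that the patent does not state: that hash and hash_s behave as independent, uniformly distributed functions over m table addresses. The symbols m, r_0 and r_1 below are introduced only for this illustration.

```latex
% Assumption: hash and hash_s are independent and uniform over m addresses.
\Pr\bigl[\mathrm{hash}(r_0) = \mathrm{hash}(r_1)\bigr] = \frac{1}{m},
\qquad
\Pr\bigl[\mathrm{hash}(r_0) = \mathrm{hash}(r_1)
      \;\wedge\; \mathrm{hash\_s}(r_0) = \mathrm{hash\_s}(r_1)\bigr]
  = \frac{1}{m} \cdot \frac{1}{m} = \frac{1}{m^{2}}
```

For example, with m = 1024 a given pair of rules collides in one table with probability about 9.8 × 10⁻⁴, but in both tables with probability about 9.5 × 10⁻⁷. Real hash functions only approximate this model, so the figures are indicative rather than guaranteed.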
Drawings
FIG. 1 is a schematic diagram of a hash collision;
FIG. 2 is a schematic diagram of how the invention resolves the hash collision.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
FIG. 1 is a schematic diagram of a hash collision. Rules rule_0 and rule_1 are stored in a hash table in advance, and key_0, the key corresponding to rule_0, is then searched in that table. As FIG. 1 shows, hash(rule_0) and hash(rule_1) have the same value, that is, both map to the same address addr_r in the hash table; a hash collision therefore occurs and rule_1 overwrites rule_0. When key_0 is searched, hash(key_0) also maps to addr_r, and the rule read from the table is rule_1 rather than the expected rule_0, so the search result is a miss.
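The FIG. 1 scenario can be reproduced with a small sketch of a single-slot table. The table size, the toy byte-sum hash `toy_hash`, and the example strings standing in for rule_0 and rule_1 are assumptions chosen so that the two rules collide; they are not taken from the patent.

```python
from typing import List, Optional

SIZE = 16  # hypothetical number of table addresses


def toy_hash(rule: str) -> int:
    # Toy hash (assumption): byte sum, so strings with the same characters collide.
    return sum(rule.encode()) % SIZE


hash_table: List[Optional[str]] = [None] * SIZE


def store(rule: str) -> None:
    # One slot per address: storing overwrites whatever was there before.
    hash_table[toy_hash(rule)] = rule


def search(key: str) -> bool:
    # A hit requires the stored rule to equal the key.
    return hash_table[toy_hash(key)] == key


rule_0, rule_1 = "ab", "ba"   # both map to the same address
store(rule_0)
store(rule_1)                 # overwrites rule_0
addr_r = toy_hash(rule_0)
key_0_hit = search(rule_0)    # miss: the slot now holds rule_1
```

After both stores, the slot at `addr_r` holds only rule_1, so searching with the key for rule_0 reports a miss even though rule_0 was inserted.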
FIG. 2 illustrates how the invention resolves the hash collision. rule_0 and rule_1 are hashed again with another hash function, hash_s, and are also stored in a new hash table, the hash_s table. As FIG. 2 shows, hash_s(rule_0) differs from hash_s(rule_1), so the two rules map to different locations in the hash_s table: rule_0 maps to addr_m and rule_1 maps to addr_n. No rule is overwritten by a collision in the new hash_s table. When key_0 is searched in the original hash table, the result is again a miss. When key_0 is searched in the new hash_s table, hash_s(key_0) also maps to addr_m, and the rule read out is rule_0, the expected rule, so the search in the hash_s table is a match, that is, a hit. The results of the two table searches are combined with a logical OR, and the overall search result is a hit.
As the above description shows, the main idea of the invention is to use a different hash function to resolve collisions of the original hash function; since the probability that the same rules collide under both hash functions is very small, the hash collision problem is resolved to a certain extent. As the explanation of the drawings also shows, using another hash function requires one additional hash table of the same size, so the hash table storage is doubled; according to the invention, this is much less than the storage normally consumed when a collision depth interval is used to resolve hash collisions. In addition, when a key is searched, the two hash tables are searched simultaneously, so search performance is the same as with a single hash table, that is, matching consumes no extra clock cycles. The method handles hash collisions well and provides a useful reference for search and matching techniques based on hash tables.
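The storage and lookup comparison in this paragraph can be illustrated with simple arithmetic. The function names and the example figures (1024 addresses, a collision depth of 4) are hypothetical; the patent gives no concrete sizes.

```python
def slots_collision_depth(addresses: int, depth: int) -> int:
    """Slots needed when every address reserves `depth` entries."""
    return addresses * depth


def slots_two_tables(addresses: int) -> int:
    """Slots needed for two single-entry tables of the same size."""
    return 2 * addresses


def worst_case_compares_collision_depth(depth: int) -> int:
    """Entries traversed one after another within one interval."""
    return depth


# Hypothetical example: 1024 addresses, collision depth of 4.
depth_slots = slots_collision_depth(1024, 4)
dual_slots = slots_two_tables(1024)
```

Under these assumed figures the two-table scheme needs half the slots of the depth-4 scheme. The saving depends on the depth being compared: it appears only when the depth exceeds 2, and at a depth of exactly 2 both schemes use the same number of slots.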

Claims (1)

1. A method for resolving hash collisions, implemented on an FPGA-based hardware platform, characterized by comprising the following steps:
suppose two rules, rule_0 and rule_1, need to be stored in a hash table whose hash function is hash; if hash(rule_0) and hash(rule_1) are equal, rule_1 overwrites rule_0, and when key_0, the key corresponding to rule_0, is subsequently searched, it cannot be matched successfully;
therefore, another hash function, denoted hash_s, is applied to rule_0 and rule_1 again; the results hash_s(rule_0) and hash_s(rule_1) are now different, so both rules are stored in a new hash table without one overwriting the other; when key_0 corresponding to rule_0 is searched, the hash table of the hash function hash misses while the hash table of the hash function hash_s hits, and the two results are combined with a logical OR, so that key_0 finally obtains the expected hit.
CN201911107261.2A 2019-11-13 2019-11-13 Method for solving Hash collision Pending CN110825921A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911107261.2A CN110825921A (en) 2019-11-13 2019-11-13 Method for solving Hash collision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911107261.2A CN110825921A (en) 2019-11-13 2019-11-13 Method for solving Hash collision

Publications (1)

Publication Number Publication Date
CN110825921A (en) 2020-02-21

Family

ID=69554937

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911107261.2A Pending CN110825921A (en) 2019-11-13 2019-11-13 Method for solving Hash collision

Country Status (1)

Country Link
CN (1) CN110825921A (en)

Similar Documents

Publication Publication Date Title
CN105718455B (en) A kind of data query method and device
CN105354151B (en) Cache management method and equipment
US9390134B2 (en) Regular expression matching method and system, and searching device
US20120330965A1 (en) Method and apparatus for storing and searching for keyword
US20230161822A1 (en) Fast and accurate geomapping
CN107368527B (en) Multi-attribute index method based on data stream
US20090063527A1 (en) Processing of database statements with join predicates on range-partitioned tables
CN106326475B (en) Efficient static hash table implementation method and system
US20020138648A1 (en) Hash compensation architecture and method for network address lookup
CN102880628A (en) Hash data storage method and device
CN102187642B (en) Method and device for adding, searching for and deleting key in hash table
CN112148738A (en) Hash collision processing method and system
CN113901279B (en) Graph database retrieval method and device
CN109800228B (en) Method for efficiently and quickly solving hash conflict
CN115438081A (en) Multi-stage aggregation and real-time updating method for massive ship position point clouds
CN110825921A (en) Method for solving Hash collision
CN107943807B (en) Data processing method and storage device
CN108614879A (en) Small documents processing method and device
CN112269784A (en) Hash table structure based on hardware realization and inserting, inquiring and deleting method
US20150324484A1 (en) Offline radix tree compression with key sequence skip
US9996569B2 (en) Index traversals utilizing alternate in-memory search structure and system memory costing
CN113641681B (en) Space self-adaptive mass data query method
CN103399920A (en) Key value searching method, key value searching device and chip
CN111723266A (en) Mass data processing method and device
CN109359111B (en) Android view access method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200221
