CN101004741A - Modified hash method, and application - Google Patents
Modified hash method, and application Download PDFInfo
- Publication number
- CN101004741A CN101004741A CNA2006100084483A CN200610008448A CN101004741A CN 101004741 A CN101004741 A CN 101004741A CN A2006100084483 A CNA2006100084483 A CN A2006100084483A CN 200610008448 A CN200610008448 A CN 200610008448A CN 101004741 A CN101004741 A CN 101004741A
- Authority
- CN
- China
- Prior art keywords
- key
- subclauses
- clauses
- page
- hash
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Images
Abstract
An improved Hash method includes providing Hash key code table with storage space table, providing key code page with key code item and pointer pointing to next key code page, providing key code item with effective flag position and key code as well as result pointer, providing actual stored item with actual information content corresponding to stored key code.
Description
Technical field
The present invention relates to use the computer realm of HASH (Hash) technology, relate in particular to the network communication field of using the HASH technology.
Background technology
The HASH method is a kind of common method of carrying out information retrieval and storage in computer realm, and HASH method commonly used has immediately allocating method, digital analysis method, leaving remainder method and takes advantage of the surplus method etc. that rounds.
The HASH method is set up a definite respective function and is concerned Hash () between the memory location of list item and its key (KEY), make each key corresponding with a unique memory location in the structure (Address=Hash (Rec.key)).The transfer function that uses in the HASH method is called the HASH function, and just is called the HASH table by table or the structure that this kind method construct comes out.
Use the HASH method to search and to carry out repeatedly the comparison of key, so seek rate can directly arrive or approach the actual storage address of the list item with this key than very fast.In some network processing systems (as route system etc.), usually use the HASH method to store and retrieve the corresponding information of transmitting.
In general, the HASH function is a compression map function.Usually the key set is than big many of HASH table address set, therefore might be through the calculating of HASH function, different keys is mapped on the same HASH address, this has just produced the HASH conflict, must look for an address to deposit the list item of conflict again, the problem that how to manage conflict that Here it is for this reason.
The solution a kind of commonly used of handling the HASH conflict is the linear probing method.In the linear probing method, when the HASH conflict takes place, search successively, up to finding a position to deposit the list item that clashes for sky from the memory location of conflict list item correspondence backward.The linear probing method is easy to generate " accumulation " problem, and promptly different keys of detecting sequence have occupied available sky bucket, makes and need experience the different elements of detecting sequence for seeking a certain key, causes searching the time increase.
The method that another kind of solution HASH commonly used conflicts is many HASH method.First HASH function Hash1 () presses the key of list item and calculates the deposit position H1=Hash1 () at list item place, in case clash, utilize second the HASH function Hash2 () that is different from first HASH function to calculate " next one " deposit position of this list item.The deposit position that might some keys uses second HASH function to obtain also is identical, needs this moment to continue to use the 3rd HASH function Hash3 () to calculate " next one " deposit position of this list item.So repeat, till all HASH conflicts are solved.
Compare the method for other a lot of HASH of solution conflicts, many HASH method is fairly simple, and is efficient, and is applicable to the method that great majority are used, and has certain problem but many HASH method is applied in network communication field.In network communication is used, " shake " be with " time delay " and a parameter of equal importance, for example, sound and picture record can be used as data and are transmitting on the Internet, by router by transmission be forwarded to the destination.If router device uses many HASH method to come the destination of locator data message and transmits data message, most of data messages (70%~80%) can be by destination, an internal storage access location, but, 20%~30% data message need carry out internal storage access again, and, having also needs under many situations three times, four times or even five visits to internal memory, the uncertainty of internal storage access number of times greatly influences performance, to cause data message transmission shake too big, sound is discontinuous, off and on.
Therefore, need a kind of improved hash method and application thereof, both had, the overwhelming majority's message is had fixing search number of times again, also need to have higher memory usage simultaneously to reduce network jitter than the higher performance of searching.
Summary of the invention
Technical matters to be solved by this invention is: overcome internal memory number of times that the hash method that exists in the prior art need visit and change greatly problem and defective, a kind of improved hash method and application thereof are provided.
The present invention by the following technical solutions, a kind of improved hash method may further comprise the steps:
The Hash key code table is provided, comprises the storage space table that constitutes by a series of Hash key pages;
The key page is provided, comprises the key clauses and subclauses of some and the pointer of the next key page of sensing;
The key clauses and subclauses are provided, include valid flag position, key and result pointer;
The actual storage clauses and subclauses are provided, comprise the actual information content of depositing the key correspondence.
Further, the actual information content of described key correspondence comprises forwarding information.
A kind of application of improved hash method is used for canned data, may further comprise the steps:
Steps A: use a hash function that key is carried out the Hash operation, obtain a key page address of key correspondence in the Hash key code table;
Step B: key is written in idle key clauses and subclauses in the key page that the steps A computing obtains;
Step C: in the key clauses and subclauses, write the result pointer of a correspondence, this pointer and key correspondence, and point to the actual storage clauses and subclauses.
Further, described step B comprises:
Step B1: the key record clauses and subclauses of in the current key page, seeking a free time;
Step B2: if found certain idle key clauses and subclauses, key is write, change step C;
Step B3: if do not find available idle key clauses and subclauses, check whether next key page pointer is empty, if be empty, changes step B4, if be not idle running step B6;
Step B4: distribute a new key page, with the next key page pointer of new key page address as the current key page;
Step B5: in the new key page, seek the key record clauses and subclauses of a free time, change step C;
Step B6: read the next key page according to next key page pointer, change step B1.
A kind of application of improved hash method is used to the information of searching, and may further comprise the steps:
Step a: use a hash function that key is carried out the Hash operation, obtain a key page address of key correspondence in the Hash key code table;
Step b: read the key page according to the key page address;
Step c: the key of searching key clauses and subclauses in the current key page equals to search the key of usefulness, if found the key clauses and subclauses of coupling, changes steps d, otherwise searches failure;
Steps d: the result pointer according to the key clauses and subclauses that match reads the actual storage clauses and subclauses.
Further, described step c comprises:
Step c1: search the key that the key that whether has key clauses and subclauses equals to search usefulness at the current key page;
Step c2: if found the key clauses and subclauses of coupling, change steps d, otherwise change step c3;
Step c3: whether the next key page pointer of checking the current key page is empty, if be empty, searches failure, otherwise changes step c4;
Step c4: read the next key page according to next key page pointer, change step c1.
Beneficial effect: adopt the inventive method, searching in most cases HASH only needs twice internal storage access, internal storage access is for reading the key page for the first time, internal storage access is for reading the actual storage clauses and subclauses for the second time, compared with prior art, overcome access memory number of times variation shortcoming greatly, " shake " problem that transmits in network for information such as solving voice, image is very favourable, has also improved the utilization ratio of internal memory simultaneously.
Description of drawings
Fig. 1 stores and searches needed data structure diagram according to the inventive method;
Fig. 2 is the process flow diagram of storing according to the inventive method;
Fig. 3 is the process flow diagram of searching according to the inventive method.
Embodiment
Be described in further detail below in conjunction with the enforcement of accompanying drawing technical scheme.
As shown in Figure 1, the data structure of the improved HASH method application of the present invention comprises:
HASH key code table 10: the storage space table that constitutes by a series of HASH key pages 11.
The key page 11: be a storage block depositing information such as a series of keys, comprise the key clauses and subclauses 12 of some and point to the pointer 17 of the next key page, suppose that in the present embodiment a key page 11 comprises 8 key clauses and subclauses.
Key clauses and subclauses 12: include valid flag position 13, key 14 and result pointer 15.
Whether effective marker position 13: it is effective to identify current key clauses and subclauses, is that the current key clauses and subclauses of 0 expression are invalid, is idle clauses and subclauses, is that the current key clauses and subclauses of 1 sign are effective.
Key 14: what deposit is key, is used for and the key of searching compares and determines the coupling clauses and subclauses.
Result pointer 15: the actual storage clauses and subclauses 18 of pointing to this key correspondence.
Effective marker position 16: representing whether next key page pointer 17 effective, is that 0 sign is invalid, be 1 sign effectively.
Next key page pointer 17: if all key clauses and subclauses of the current key page are all occupied, might need to distribute the next key page, this moment is with the address of the next key page " the next key page pointer 17 " as the current key page.
Actual storage clauses and subclauses 18: deposit and search needed actual information, such as forwarding information etc.
The next key page 19: the same with the key page 11, be a storage block depositing information such as a series of keys.
As shown in Figure 2, use the present invention and carry out information stores, may further comprise the steps:
Step 201: use a HASH function that key is carried out the HASH operation, obtain a key page address of key correspondence in the HASH key code table;
Step 202:, read corresponding key content of pages according to the key page address;
Step 203: get first key clauses and subclauses of the current key page;
Step 204: whether the effective marker position of judging current key clauses and subclauses is 0, if be 0, changes step 205, if be not 0, changes step 209;
Step 205: the effective marker position of the key clauses and subclauses that match is made as 1, and expression effectively;
Step 206: the key that will search usefulness writes in the key clauses and subclauses that match;
Step 207: distribute a slice internal memory to be used to deposit the actual storage clauses and subclauses;
Step 208: actual storage clauses and subclauses corresponding memory address is write the result pointer territory of the key clauses and subclauses of coupling, change step 216;
Step 209: judge whether the current key page also exists next key clauses and subclauses,, change step 210, if there is no, change step 211 if exist;
Step 210: get next key clauses and subclauses, change step 204;
Step 211: whether the next key page pointer of judging the current key page is empty, if be not empty, changes step 212, if be empty, changes step 213;
Step 212: read next key content of pages according to next key page pointer, change step 203;
Step 213: redistribute a key page memory block;
Step 214: the next key page pointer territory that new key page address is write the current key page;
Step 215: the new crucial page as the current key page, is changeed step 203;
Step 216: finish.
As shown in Figure 3, use the present invention and carry out information searching, may further comprise the steps:
Step 301: use a HASH function that key is carried out the HASH operation, obtain a key page address of key correspondence in the HASH key code table;
Step 302:, read corresponding key content of pages according to the key page address;
Step 303: get first key clauses and subclauses of the current key page;
Step 304: the effective marker position of judging current key clauses and subclauses whether be 1 and the key of current key clauses and subclauses and the key of searching usefulness equate, if equate, change step 305, if unequal, change step 306;
Step 305: the result pointer according to current key clauses and subclauses reads actual storage clauses and subclauses resultant content, changes step 311;
Step 306: judge whether current page also has next key clauses and subclauses,, change step 307,, change step 308 if do not have if having;
Step 307: get next key clauses and subclauses, change step 304;
Step 308: whether the next key page pointer of judging the current key page is empty, if be empty, changes step 310, if be not empty, changes step 309;
Step 309: read the content of the next key page according to next key page pointer, change step 303;
Step 310: it fails to match;
Step 311: finish.
IPV6 of supposition transmits and has 512 in the following example, 000 clauses and subclauses, adopt existing many HASH method and improvement HASH method of the present invention to realize respectively, memory usage of the present invention as can be seen is than HASH method memory usage height commonly used from example.
The key of supposing each forwarding entry is 128bits (16bytes), and corresponding actual storage items for information size is relevant with actual application, is assumed to 32bytes.
Existing many HASH method realizes:
For performance requirement, good HASH function requires usually for the first time that the HASH hit rate is 70%~80%, is assumed to 80%, and the first order HASH table key number that needs to load is 409,600 so.
In addition, owing to carry out HASH when searching, it is very high conflict, and can not expect, therefore obtains reasonable search efficiency if desired, and all requiring the loading factor (the dominant record entry number of the actual record entry number/distribution that takies) usually is 1/4.The entry number of first order HASH table needs distribution is 2,048,000 like this.
The key of residue 20% (102,400) need use second level HASH function to carry out HASH once more, according to the computing method of front, the entry number that second level HASH table needs to distribute be (102,400*80%)/(1/5)=409,600.
Have 20,480 again (102,400*20%) individual key need use third level HASH function to carry out HASH, according to the computing method of front, the entry number that third level HASH table needs to distribute be (20,480*80%)/(1/5)=81,920.
In order to solve all HASH conflict, the back also needs the fourth stage, and level V HASH function or the like saves as in adopting like this that many HASH method needs at least: (2,048,000+409,600+81,920) * (16+32) bytes=1.22*10
8Bytes.
Improvement HASH method of the present invention realizes:
As noted earlier, when for the first time carrying out the HASH computing and clash, according to the present invention, do not need this key again with another HASH functional operation this moment to obtain new memory address, but this key is also put into the key page that HASH computing for the first time obtains, only seek the key entry line of a free time in addition at this key page.
Each key page can be deposited a plurality of key entry line, and this distribution utilization factor for memory headroom is highly profitable.In common HASH method, because conflict can't be predicted, in order make to reduce the probability that conflict takes place, it is 1/4 that a good HASH method requires to load the factor (total entry number of the entry number/distribution of filling) usually, that is to say that memory usage has only 20%.In the present invention, because a key page can be deposited a plurality of key clauses and subclauses (being assumed to be 8 in this example), therefore the number of the key page can be 1/4 of total key number only, can reach 70%~90% so that load the factor like this, and carry out 2 internal memories and search the hit rate of (internal storage access is for reading the key page, and another time internal storage access reads in the actual storage items for information for the result pointer according to the key clauses and subclauses of coupling) and can reach 98%.
Thus, first order key number of pages only need be 128,000 (1/4 key number), the number that need redistribute the next key page (carry out need increasing when HASH searches once this moment internal memory visit so that next key content of pages is read in the internal memory) for (512,000*2%) * (1/4)=2560.
Certainly may also need to distribute the key page of the third level, but such ratio only accounts for 0.04% (2%*2%) of total key number.
Consider that each key clauses and subclauses effective marker position and result pointer are taken as 18bits (3bytes) altogether, each key page comprises 8 key clauses and subclauses, the next key page pointer of each key page also is 18bits (3bytes), and the size of each key page is so: 8* (3bytes+16btes)+3bytes=155bytes.
The internal memory that adopts the present invention to need is divided into two parts:
The actual storage clauses and subclauses need internal memory: 512, and 000*32bytes=0.16*10
8Bytes
The key page needs internal memory: (128,000+2560) * 155bytes=0.2*10
8Bytes
Save as in needing altogether: 0.16*10
8+ 0.2*10
8=0.36*10
8Bytes<<1.22*10
8Bytes
From last example as can be seen, compare with traditional HASH, the present invention has several important feature and advantage: lookup method of the present invention makes that searching a key only needs internal storage access 2 times in most situations, it is very important that this has stable performance for the system that uses the HASH method to search, and the system that makes can not shake; In addition, the present invention is searching performance relatively also than under the conditions of higher, and the memory headroom utilization factor is greatly improved.
Claims (6)
1. improved hash method may further comprise the steps:
The Hash key code table is provided, comprises the storage space table that constitutes by a series of Hash key pages;
The key page is provided, comprises the key clauses and subclauses of some and the pointer of the next key page of sensing;
The key clauses and subclauses are provided, include valid flag position, key and result pointer;
The actual storage clauses and subclauses are provided, comprise the actual information content of depositing the key correspondence.
2. method according to claim 1 is characterized in that the actual information content of described key correspondence comprises forwarding information.
3. the application of an improved hash method is used for canned data, may further comprise the steps:
Steps A: use a hash function that key is carried out the Hash operation, obtain a key page address of key correspondence in the Hash key code table;
Step B: key is written in idle key clauses and subclauses in the key page that the steps A computing obtains;
Step C: in the key clauses and subclauses, write the result pointer of a correspondence, this pointer and key correspondence, and point to the actual storage clauses and subclauses.
4. the application of a kind of improved hash method of method according to claim 3 is characterized in that, described step B comprises:
Step B1: the key record clauses and subclauses of in the current key page, seeking a free time;
Step B2: if found certain idle key clauses and subclauses, key is write, change step C;
Step B3: if do not find available idle key clauses and subclauses, check whether next key page pointer is empty, if be empty, changes step B4, if be not idle running step B6;
Step B4: distribute a new key page, with the next key page pointer of new key page address as the current key page;
Step B5: in the new key page, seek the key record clauses and subclauses of a free time, change step C;
Step B6: read the next key page according to next key page pointer, change step B1.
5. the application of an improved hash method is used to the information of searching, and may further comprise the steps:
Step a: use a hash function that key is carried out the Hash operation, obtain a key page address of key correspondence in the Hash key code table;
Step b: read the key page according to the key page address;
Step c: the key of searching key clauses and subclauses in the current key page equals to search the key of usefulness, if found the key clauses and subclauses of coupling, changes steps d, otherwise searches failure;
Steps d: the result pointer according to the key clauses and subclauses that match reads the actual storage clauses and subclauses.
6. the application of a kind of improved hash method according to claim 5, described step c comprises:
Step c1: search the key that the key that whether has key clauses and subclauses equals to search usefulness at the current key page;
Step c2: if found the key clauses and subclauses of coupling, change steps d, otherwise change step c3;
Step c3: whether the next key page pointer of checking the current key page is empty, if be empty, searches failure, otherwise changes step c4;
Step c4: read the next key page according to next key page pointer, change step c1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006100084483A CN100487697C (en) | 2006-01-22 | 2006-01-22 | Searching method by using modified hash method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006100084483A CN100487697C (en) | 2006-01-22 | 2006-01-22 | Searching method by using modified hash method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101004741A true CN101004741A (en) | 2007-07-25 |
CN100487697C CN100487697C (en) | 2009-05-13 |
Family
ID=38703886
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2006100084483A Expired - Fee Related CN100487697C (en) | 2006-01-22 | 2006-01-22 | Searching method by using modified hash method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100487697C (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101887457A (en) * | 2010-07-02 | 2010-11-17 | 杭州电子科技大学 | Content-based copy image detection method |
CN109542908A (en) * | 2018-11-23 | 2019-03-29 | 中科驭数(北京)科技有限公司 | Data compression method, storage method, access method and system in key-value database |
CN110177106A (en) * | 2019-05-31 | 2019-08-27 | 贵州精准健康数据有限公司 | Medical imaging data transmission system |
CN111953682A (en) * | 2020-08-11 | 2020-11-17 | 北京八分量信息科技有限公司 | Tamper-proof method and device for bank cloud computing portal website page and related product |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101655861B (en) * | 2009-09-08 | 2011-06-01 | 中国科学院计算技术研究所 | Hashing method based on double-counting bloom filter and hashing device |
-
2006
- 2006-01-22 CN CNB2006100084483A patent/CN100487697C/en not_active Expired - Fee Related
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101887457A (en) * | 2010-07-02 | 2010-11-17 | 杭州电子科技大学 | Content-based copy image detection method |
CN101887457B (en) * | 2010-07-02 | 2012-10-03 | 杭州电子科技大学 | Content-based copy image detection method |
CN109542908A (en) * | 2018-11-23 | 2019-03-29 | 中科驭数(北京)科技有限公司 | Data compression method, storage method, access method and system in key-value database |
CN110177106A (en) * | 2019-05-31 | 2019-08-27 | 贵州精准健康数据有限公司 | Medical imaging data transmission system |
CN111953682A (en) * | 2020-08-11 | 2020-11-17 | 北京八分量信息科技有限公司 | Tamper-proof method and device for bank cloud computing portal website page and related product |
Also Published As
Publication number | Publication date |
---|---|
CN100487697C (en) | 2009-05-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8099421B2 (en) | File system, and method for storing and searching for file by the same | |
CN101354726B (en) | Method for managing memory metadata of cluster file system | |
CN101141389B (en) | Reinforcement multidigit Trie tree searching method and apparatus | |
CN106294190B (en) | Storage space management method and device | |
US20020138648A1 (en) | Hash compensation architecture and method for network address lookup | |
CN101655861A (en) | Hashing method based on double-counting bloom filter and hashing device | |
CN104820717A (en) | Massive small file storage and management method and system | |
CN103907099B (en) | Short address conversion table uncached in cache coherence computer system | |
CN100487697C (en) | Searching method by using modified hash method | |
CN101267381B (en) | Operation method and device for Hash table | |
CN102739622A (en) | Expandable data storage system | |
CN102541925A (en) | Method and device for rapidly storing and retrieving detailed tickets | |
CN102542041A (en) | Method and system for processing raster data | |
US9697898B2 (en) | Content addressable memory with an ordered sequence | |
CN101783814A (en) | Metadata storing method for mass storage system | |
CN111541617B (en) | Data flow table processing method and device for high-speed large-scale concurrent data flow | |
CN102984071B (en) | Method for organizing routing table of segment address route and method for checking route | |
CN101459599B (en) | Method and system for implementing concurrent execution of cache data access and loading | |
CN1723506A (en) | Method and system for maximizing DRAM memory bandwidth | |
CN101566985A (en) | System and method for searching static data converted from dynamic data | |
CN111178965B (en) | Resource release method and server | |
CN102521228A (en) | Key value mapping method for linear data table | |
CN1228935C (en) | Method of searching international nobile recognition number and electronic sequence number | |
CN100461806C (en) | Voice value-added service data information processing method | |
CN109325149A (en) | XML message search method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20090513 Termination date: 20150122 |
|
EXPY | Termination of patent right or utility model |