CN101004741A - Modified hash method, and application - Google Patents

Modified hash method, and application Download PDF

Info

Publication number
CN101004741A
CN101004741A CNA2006100084483A CN200610008448A CN101004741A CN 101004741 A CN101004741 A CN 101004741A CN A2006100084483 A CNA2006100084483 A CN A2006100084483A CN 200610008448 A CN200610008448 A CN 200610008448A CN 101004741 A CN101004741 A CN 101004741A
Authority
CN
China
Prior art keywords
key
subclauses
clauses
page
hash
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2006100084483A
Other languages
Chinese (zh)
Other versions
CN100487697C (en
Inventor
胡龙斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CNB2006100084483A priority Critical patent/CN100487697C/en
Publication of CN101004741A publication Critical patent/CN101004741A/en
Application granted granted Critical
Publication of CN100487697C publication Critical patent/CN100487697C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Abstract

An improved Hash method includes providing Hash key code table with storage space table, providing key code page with key code item and pointer pointing to next key code page, providing key code item with effective flag position and key code as well as result pointer, providing actual stored item with actual information content corresponding to stored key code.

Description

A kind of improved hash method and application thereof
Technical field
The present invention relates to use the computer realm of HASH (Hash) technology, relate in particular to the network communication field of using the HASH technology.
Background technology
The HASH method is a kind of common method of carrying out information retrieval and storage in computer realm, and HASH method commonly used has immediately allocating method, digital analysis method, leaving remainder method and takes advantage of the surplus method etc. that rounds.
The HASH method is set up a definite respective function and is concerned Hash () between the memory location of list item and its key (KEY), make each key corresponding with a unique memory location in the structure (Address=Hash (Rec.key)).The transfer function that uses in the HASH method is called the HASH function, and just is called the HASH table by table or the structure that this kind method construct comes out.
Use the HASH method to search and to carry out repeatedly the comparison of key, so seek rate can directly arrive or approach the actual storage address of the list item with this key than very fast.In some network processing systems (as route system etc.), usually use the HASH method to store and retrieve the corresponding information of transmitting.
In general, the HASH function is a compression map function.Usually the key set is than big many of HASH table address set, therefore might be through the calculating of HASH function, different keys is mapped on the same HASH address, this has just produced the HASH conflict, must look for an address to deposit the list item of conflict again, the problem that how to manage conflict that Here it is for this reason.
The solution a kind of commonly used of handling the HASH conflict is the linear probing method.In the linear probing method, when the HASH conflict takes place, search successively, up to finding a position to deposit the list item that clashes for sky from the memory location of conflict list item correspondence backward.The linear probing method is easy to generate " accumulation " problem, and promptly different keys of detecting sequence have occupied available sky bucket, makes and need experience the different elements of detecting sequence for seeking a certain key, causes searching the time increase.
The method that another kind of solution HASH commonly used conflicts is many HASH method.First HASH function Hash1 () presses the key of list item and calculates the deposit position H1=Hash1 () at list item place, in case clash, utilize second the HASH function Hash2 () that is different from first HASH function to calculate " next one " deposit position of this list item.The deposit position that might some keys uses second HASH function to obtain also is identical, needs this moment to continue to use the 3rd HASH function Hash3 () to calculate " next one " deposit position of this list item.So repeat, till all HASH conflicts are solved.
Compare the method for other a lot of HASH of solution conflicts, many HASH method is fairly simple, and is efficient, and is applicable to the method that great majority are used, and has certain problem but many HASH method is applied in network communication field.In network communication is used, " shake " be with " time delay " and a parameter of equal importance, for example, sound and picture record can be used as data and are transmitting on the Internet, by router by transmission be forwarded to the destination.If router device uses many HASH method to come the destination of locator data message and transmits data message, most of data messages (70%~80%) can be by destination, an internal storage access location, but, 20%~30% data message need carry out internal storage access again, and, having also needs under many situations three times, four times or even five visits to internal memory, the uncertainty of internal storage access number of times greatly influences performance, to cause data message transmission shake too big, sound is discontinuous, off and on.
Therefore, need a kind of improved hash method and application thereof, both had, the overwhelming majority's message is had fixing search number of times again, also need to have higher memory usage simultaneously to reduce network jitter than the higher performance of searching.
Summary of the invention
Technical matters to be solved by this invention is: overcome internal memory number of times that the hash method that exists in the prior art need visit and change greatly problem and defective, a kind of improved hash method and application thereof are provided.
The present invention by the following technical solutions, a kind of improved hash method may further comprise the steps:
The Hash key code table is provided, comprises the storage space table that constitutes by a series of Hash key pages;
The key page is provided, comprises the key clauses and subclauses of some and the pointer of the next key page of sensing;
The key clauses and subclauses are provided, include valid flag position, key and result pointer;
The actual storage clauses and subclauses are provided, comprise the actual information content of depositing the key correspondence.
Further, the actual information content of described key correspondence comprises forwarding information.
A kind of application of improved hash method is used for canned data, may further comprise the steps:
Steps A: use a hash function that key is carried out the Hash operation, obtain a key page address of key correspondence in the Hash key code table;
Step B: key is written in idle key clauses and subclauses in the key page that the steps A computing obtains;
Step C: in the key clauses and subclauses, write the result pointer of a correspondence, this pointer and key correspondence, and point to the actual storage clauses and subclauses.
Further, described step B comprises:
Step B1: the key record clauses and subclauses of in the current key page, seeking a free time;
Step B2: if found certain idle key clauses and subclauses, key is write, change step C;
Step B3: if do not find available idle key clauses and subclauses, check whether next key page pointer is empty, if be empty, changes step B4, if be not idle running step B6;
Step B4: distribute a new key page, with the next key page pointer of new key page address as the current key page;
Step B5: in the new key page, seek the key record clauses and subclauses of a free time, change step C;
Step B6: read the next key page according to next key page pointer, change step B1.
A kind of application of improved hash method is used to the information of searching, and may further comprise the steps:
Step a: use a hash function that key is carried out the Hash operation, obtain a key page address of key correspondence in the Hash key code table;
Step b: read the key page according to the key page address;
Step c: the key of searching key clauses and subclauses in the current key page equals to search the key of usefulness, if found the key clauses and subclauses of coupling, changes steps d, otherwise searches failure;
Steps d: the result pointer according to the key clauses and subclauses that match reads the actual storage clauses and subclauses.
Further, described step c comprises:
Step c1: search the key that the key that whether has key clauses and subclauses equals to search usefulness at the current key page;
Step c2: if found the key clauses and subclauses of coupling, change steps d, otherwise change step c3;
Step c3: whether the next key page pointer of checking the current key page is empty, if be empty, searches failure, otherwise changes step c4;
Step c4: read the next key page according to next key page pointer, change step c1.
Beneficial effect: adopt the inventive method, searching in most cases HASH only needs twice internal storage access, internal storage access is for reading the key page for the first time, internal storage access is for reading the actual storage clauses and subclauses for the second time, compared with prior art, overcome access memory number of times variation shortcoming greatly, " shake " problem that transmits in network for information such as solving voice, image is very favourable, has also improved the utilization ratio of internal memory simultaneously.
Description of drawings
Fig. 1 stores and searches needed data structure diagram according to the inventive method;
Fig. 2 is the process flow diagram of storing according to the inventive method;
Fig. 3 is the process flow diagram of searching according to the inventive method.
Embodiment
Be described in further detail below in conjunction with the enforcement of accompanying drawing technical scheme.
As shown in Figure 1, the data structure of the improved HASH method application of the present invention comprises:
HASH key code table 10: the storage space table that constitutes by a series of HASH key pages 11.
The key page 11: be a storage block depositing information such as a series of keys, comprise the key clauses and subclauses 12 of some and point to the pointer 17 of the next key page, suppose that in the present embodiment a key page 11 comprises 8 key clauses and subclauses.
Key clauses and subclauses 12: include valid flag position 13, key 14 and result pointer 15.
Whether effective marker position 13: it is effective to identify current key clauses and subclauses, is that the current key clauses and subclauses of 0 expression are invalid, is idle clauses and subclauses, is that the current key clauses and subclauses of 1 sign are effective.
Key 14: what deposit is key, is used for and the key of searching compares and determines the coupling clauses and subclauses.
Result pointer 15: the actual storage clauses and subclauses 18 of pointing to this key correspondence.
Effective marker position 16: representing whether next key page pointer 17 effective, is that 0 sign is invalid, be 1 sign effectively.
Next key page pointer 17: if all key clauses and subclauses of the current key page are all occupied, might need to distribute the next key page, this moment is with the address of the next key page " the next key page pointer 17 " as the current key page.
Actual storage clauses and subclauses 18: deposit and search needed actual information, such as forwarding information etc.
The next key page 19: the same with the key page 11, be a storage block depositing information such as a series of keys.
As shown in Figure 2, use the present invention and carry out information stores, may further comprise the steps:
Step 201: use a HASH function that key is carried out the HASH operation, obtain a key page address of key correspondence in the HASH key code table;
Step 202:, read corresponding key content of pages according to the key page address;
Step 203: get first key clauses and subclauses of the current key page;
Step 204: whether the effective marker position of judging current key clauses and subclauses is 0, if be 0, changes step 205, if be not 0, changes step 209;
Step 205: the effective marker position of the key clauses and subclauses that match is made as 1, and expression effectively;
Step 206: the key that will search usefulness writes in the key clauses and subclauses that match;
Step 207: distribute a slice internal memory to be used to deposit the actual storage clauses and subclauses;
Step 208: actual storage clauses and subclauses corresponding memory address is write the result pointer territory of the key clauses and subclauses of coupling, change step 216;
Step 209: judge whether the current key page also exists next key clauses and subclauses,, change step 210, if there is no, change step 211 if exist;
Step 210: get next key clauses and subclauses, change step 204;
Step 211: whether the next key page pointer of judging the current key page is empty, if be not empty, changes step 212, if be empty, changes step 213;
Step 212: read next key content of pages according to next key page pointer, change step 203;
Step 213: redistribute a key page memory block;
Step 214: the next key page pointer territory that new key page address is write the current key page;
Step 215: the new crucial page as the current key page, is changeed step 203;
Step 216: finish.
As shown in Figure 3, use the present invention and carry out information searching, may further comprise the steps:
Step 301: use a HASH function that key is carried out the HASH operation, obtain a key page address of key correspondence in the HASH key code table;
Step 302:, read corresponding key content of pages according to the key page address;
Step 303: get first key clauses and subclauses of the current key page;
Step 304: the effective marker position of judging current key clauses and subclauses whether be 1 and the key of current key clauses and subclauses and the key of searching usefulness equate, if equate, change step 305, if unequal, change step 306;
Step 305: the result pointer according to current key clauses and subclauses reads actual storage clauses and subclauses resultant content, changes step 311;
Step 306: judge whether current page also has next key clauses and subclauses,, change step 307,, change step 308 if do not have if having;
Step 307: get next key clauses and subclauses, change step 304;
Step 308: whether the next key page pointer of judging the current key page is empty, if be empty, changes step 310, if be not empty, changes step 309;
Step 309: read the content of the next key page according to next key page pointer, change step 303;
Step 310: it fails to match;
Step 311: finish.
IPV6 of supposition transmits and has 512 in the following example, 000 clauses and subclauses, adopt existing many HASH method and improvement HASH method of the present invention to realize respectively, memory usage of the present invention as can be seen is than HASH method memory usage height commonly used from example.
The key of supposing each forwarding entry is 128bits (16bytes), and corresponding actual storage items for information size is relevant with actual application, is assumed to 32bytes.
Existing many HASH method realizes:
For performance requirement, good HASH function requires usually for the first time that the HASH hit rate is 70%~80%, is assumed to 80%, and the first order HASH table key number that needs to load is 409,600 so.
In addition, owing to carry out HASH when searching, it is very high conflict, and can not expect, therefore obtains reasonable search efficiency if desired, and all requiring the loading factor (the dominant record entry number of the actual record entry number/distribution that takies) usually is 1/4.The entry number of first order HASH table needs distribution is 2,048,000 like this.
The key of residue 20% (102,400) need use second level HASH function to carry out HASH once more, according to the computing method of front, the entry number that second level HASH table needs to distribute be (102,400*80%)/(1/5)=409,600.
Have 20,480 again (102,400*20%) individual key need use third level HASH function to carry out HASH, according to the computing method of front, the entry number that third level HASH table needs to distribute be (20,480*80%)/(1/5)=81,920.
In order to solve all HASH conflict, the back also needs the fourth stage, and level V HASH function or the like saves as in adopting like this that many HASH method needs at least: (2,048,000+409,600+81,920) * (16+32) bytes=1.22*10 8Bytes.
Improvement HASH method of the present invention realizes:
As noted earlier, when for the first time carrying out the HASH computing and clash, according to the present invention, do not need this key again with another HASH functional operation this moment to obtain new memory address, but this key is also put into the key page that HASH computing for the first time obtains, only seek the key entry line of a free time in addition at this key page.
Each key page can be deposited a plurality of key entry line, and this distribution utilization factor for memory headroom is highly profitable.In common HASH method, because conflict can't be predicted, in order make to reduce the probability that conflict takes place, it is 1/4 that a good HASH method requires to load the factor (total entry number of the entry number/distribution of filling) usually, that is to say that memory usage has only 20%.In the present invention, because a key page can be deposited a plurality of key clauses and subclauses (being assumed to be 8 in this example), therefore the number of the key page can be 1/4 of total key number only, can reach 70%~90% so that load the factor like this, and carry out 2 internal memories and search the hit rate of (internal storage access is for reading the key page, and another time internal storage access reads in the actual storage items for information for the result pointer according to the key clauses and subclauses of coupling) and can reach 98%.
Thus, first order key number of pages only need be 128,000 (1/4 key number), the number that need redistribute the next key page (carry out need increasing when HASH searches once this moment internal memory visit so that next key content of pages is read in the internal memory) for (512,000*2%) * (1/4)=2560.
Certainly may also need to distribute the key page of the third level, but such ratio only accounts for 0.04% (2%*2%) of total key number.
Consider that each key clauses and subclauses effective marker position and result pointer are taken as 18bits (3bytes) altogether, each key page comprises 8 key clauses and subclauses, the next key page pointer of each key page also is 18bits (3bytes), and the size of each key page is so: 8* (3bytes+16btes)+3bytes=155bytes.
The internal memory that adopts the present invention to need is divided into two parts:
The actual storage clauses and subclauses need internal memory: 512, and 000*32bytes=0.16*10 8Bytes
The key page needs internal memory: (128,000+2560) * 155bytes=0.2*10 8Bytes
Save as in needing altogether: 0.16*10 8+ 0.2*10 8=0.36*10 8Bytes<<1.22*10 8Bytes
From last example as can be seen, compare with traditional HASH, the present invention has several important feature and advantage: lookup method of the present invention makes that searching a key only needs internal storage access 2 times in most situations, it is very important that this has stable performance for the system that uses the HASH method to search, and the system that makes can not shake; In addition, the present invention is searching performance relatively also than under the conditions of higher, and the memory headroom utilization factor is greatly improved.

Claims (6)

1. improved hash method may further comprise the steps:
The Hash key code table is provided, comprises the storage space table that constitutes by a series of Hash key pages;
The key page is provided, comprises the key clauses and subclauses of some and the pointer of the next key page of sensing;
The key clauses and subclauses are provided, include valid flag position, key and result pointer;
The actual storage clauses and subclauses are provided, comprise the actual information content of depositing the key correspondence.
2. method according to claim 1 is characterized in that the actual information content of described key correspondence comprises forwarding information.
3. the application of an improved hash method is used for canned data, may further comprise the steps:
Steps A: use a hash function that key is carried out the Hash operation, obtain a key page address of key correspondence in the Hash key code table;
Step B: key is written in idle key clauses and subclauses in the key page that the steps A computing obtains;
Step C: in the key clauses and subclauses, write the result pointer of a correspondence, this pointer and key correspondence, and point to the actual storage clauses and subclauses.
4. the application of a kind of improved hash method of method according to claim 3 is characterized in that, described step B comprises:
Step B1: the key record clauses and subclauses of in the current key page, seeking a free time;
Step B2: if found certain idle key clauses and subclauses, key is write, change step C;
Step B3: if do not find available idle key clauses and subclauses, check whether next key page pointer is empty, if be empty, changes step B4, if be not idle running step B6;
Step B4: distribute a new key page, with the next key page pointer of new key page address as the current key page;
Step B5: in the new key page, seek the key record clauses and subclauses of a free time, change step C;
Step B6: read the next key page according to next key page pointer, change step B1.
5. the application of an improved hash method is used to the information of searching, and may further comprise the steps:
Step a: use a hash function that key is carried out the Hash operation, obtain a key page address of key correspondence in the Hash key code table;
Step b: read the key page according to the key page address;
Step c: the key of searching key clauses and subclauses in the current key page equals to search the key of usefulness, if found the key clauses and subclauses of coupling, changes steps d, otherwise searches failure;
Steps d: the result pointer according to the key clauses and subclauses that match reads the actual storage clauses and subclauses.
6. the application of a kind of improved hash method according to claim 5, described step c comprises:
Step c1: search the key that the key that whether has key clauses and subclauses equals to search usefulness at the current key page;
Step c2: if found the key clauses and subclauses of coupling, change steps d, otherwise change step c3;
Step c3: whether the next key page pointer of checking the current key page is empty, if be empty, searches failure, otherwise changes step c4;
Step c4: read the next key page according to next key page pointer, change step c1.
CNB2006100084483A 2006-01-22 2006-01-22 Searching method by using modified hash method Expired - Fee Related CN100487697C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100084483A CN100487697C (en) 2006-01-22 2006-01-22 Searching method by using modified hash method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006100084483A CN100487697C (en) 2006-01-22 2006-01-22 Searching method by using modified hash method

Publications (2)

Publication Number Publication Date
CN101004741A true CN101004741A (en) 2007-07-25
CN100487697C CN100487697C (en) 2009-05-13

Family

ID=38703886

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100084483A Expired - Fee Related CN100487697C (en) 2006-01-22 2006-01-22 Searching method by using modified hash method

Country Status (1)

Country Link
CN (1) CN100487697C (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101887457A (en) * 2010-07-02 2010-11-17 杭州电子科技大学 Content-based copy image detection method
CN109542908A (en) * 2018-11-23 2019-03-29 中科驭数(北京)科技有限公司 Data compression method, storage method, access method and system in key-value database
CN110177106A (en) * 2019-05-31 2019-08-27 贵州精准健康数据有限公司 Medical imaging data transmission system
CN111953682A (en) * 2020-08-11 2020-11-17 北京八分量信息科技有限公司 Tamper-proof method and device for bank cloud computing portal website page and related product

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101655861B (en) * 2009-09-08 2011-06-01 中国科学院计算技术研究所 Hashing method based on double-counting bloom filter and hashing device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101887457A (en) * 2010-07-02 2010-11-17 杭州电子科技大学 Content-based copy image detection method
CN101887457B (en) * 2010-07-02 2012-10-03 杭州电子科技大学 Content-based copy image detection method
CN109542908A (en) * 2018-11-23 2019-03-29 中科驭数(北京)科技有限公司 Data compression method, storage method, access method and system in key-value database
CN110177106A (en) * 2019-05-31 2019-08-27 贵州精准健康数据有限公司 Medical imaging data transmission system
CN111953682A (en) * 2020-08-11 2020-11-17 北京八分量信息科技有限公司 Tamper-proof method and device for bank cloud computing portal website page and related product

Also Published As

Publication number Publication date
CN100487697C (en) 2009-05-13

Similar Documents

Publication Publication Date Title
US8099421B2 (en) File system, and method for storing and searching for file by the same
CN101354726B (en) Method for managing memory metadata of cluster file system
CN101141389B (en) Reinforcement multidigit Trie tree searching method and apparatus
CN106294190B (en) Storage space management method and device
US20020138648A1 (en) Hash compensation architecture and method for network address lookup
CN101655861A (en) Hashing method based on double-counting bloom filter and hashing device
CN104820717A (en) Massive small file storage and management method and system
CN103907099B (en) Short address conversion table uncached in cache coherence computer system
CN100487697C (en) Searching method by using modified hash method
CN101267381B (en) Operation method and device for Hash table
CN102739622A (en) Expandable data storage system
CN102541925A (en) Method and device for rapidly storing and retrieving detailed tickets
CN102542041A (en) Method and system for processing raster data
US9697898B2 (en) Content addressable memory with an ordered sequence
CN101783814A (en) Metadata storing method for mass storage system
CN111541617B (en) Data flow table processing method and device for high-speed large-scale concurrent data flow
CN102984071B (en) Method for organizing routing table of segment address route and method for checking route
CN101459599B (en) Method and system for implementing concurrent execution of cache data access and loading
CN1723506A (en) Method and system for maximizing DRAM memory bandwidth
CN101566985A (en) System and method for searching static data converted from dynamic data
CN111178965B (en) Resource release method and server
CN102521228A (en) Key value mapping method for linear data table
CN1228935C (en) Method of searching international nobile recognition number and electronic sequence number
CN100461806C (en) Voice value-added service data information processing method
CN109325149A (en) XML message search method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090513

Termination date: 20150122

EXPY Termination of patent right or utility model