Summary of the invention
The problem addressed by the present invention is that existing lookup methods are inefficient and consume excessive resources. With the method of the present invention, each original element is converted into an index value by a specific method, which both reduces wasted resources and improves lookup speed.
To solve the above technical problem, the object of the present invention is achieved by the following method:
A hash algorithm is applied to every element in a set to generate an index value in one-to-one correspondence with that element, and the index values form an index value array; the index value array is then saved.
Wherein, the index value is generated with the following specific hash algorithm:
$$\text{Index value} = \bigl(\cdots(((L_0 \cdot K + L_1) \cdot K + L_2) \cdot K + L_3) \cdot K + \cdots + L_{M-1}\bigr) \bmod 2^{N}$$
K is a preset hash key value;
$L_0, \ldots, L_{M-1}$ are the code values found in a preset hash table, one for each of the M characters of the element;
N is the preset number of storage bits of the index value.
Wherein, the saved index value array is sorted.
Wherein, the hash key value is a two-digit number or 31.
To solve the above problem, the present invention also provides a method for searching for an element, which is specifically as follows:
A hash algorithm is applied to every element in a set to generate an index value in one-to-one correspondence with that element, and the index values form an index value array; the index value array is then saved;
The element to be searched for is input, the hash algorithm is applied to generate the index value corresponding to this element, and this index value is searched for in the index value array.
Wherein, the index value is generated with the following specific hash algorithm:
$$\text{Index value} = \bigl(\cdots(((L_0 \cdot K + L_1) \cdot K + L_2) \cdot K + L_3) \cdot K + \cdots + L_{M-1}\bigr) \bmod 2^{N}$$
K is a preset hash key value;
$L_0, \ldots, L_{M-1}$ are the code values found in a preset hash table, one for each of the M characters of the element;
N is the preset number of storage bits of the index value.
Wherein, the saved index value array is sorted.
Wherein, the hash key value is a two-digit number or 31.
To implement the above method, the present invention provides a system for storing elements, the system comprising: a computing unit, an acquiring unit and a storage unit;
The acquiring unit is used to obtain elements;
The computing unit is used to generate, from each obtained element, an index value in one-to-one correspondence with the element, using the following hash algorithm:
$$\text{Index value} = \bigl(\cdots(((L_0 \cdot K + L_1) \cdot K + L_2) \cdot K + L_3) \cdot K + \cdots + L_{M-1}\bigr) \bmod 2^{N}$$
K is a preset hash key value;
$L_0, \ldots, L_{M-1}$ are the code values found in a preset hash table, one for each of the M characters of the element;
N is the preset number of storage bits of the index value;
The storage unit is used to form the generated index values into an index value array and save it.
Wherein, the system further comprises a sorting unit;
The sorting unit is used to sort the index value array that has been formed.
To implement the method for searching for an element, the present invention provides a system for searching for elements, the system comprising: a computing unit, an acquiring unit, a storage unit and a search unit;
The acquiring unit is used to obtain elements;
The computing unit is used to generate, from each obtained element, an index value in one-to-one correspondence with the element, using the following hash algorithm:
$$\text{Index value} = \bigl(\cdots(((L_0 \cdot K + L_1) \cdot K + L_2) \cdot K + L_3) \cdot K + \cdots + L_{M-1}\bigr) \bmod 2^{N}$$
K is a preset hash key value;
$L_0, \ldots, L_{M-1}$ are the code values found in a preset hash table, one for each of the M characters of the element;
N is the preset number of storage bits of the index value;
The storage unit is used to form the generated index values into an index value array and save it;
The search unit is used to search the index value array for a given index value generated by the computing unit.
Wherein, the system further comprises a sorting unit;
The sorting unit is used to sort the index value array that has been formed.
In the prior art, storing an element itself also requires storing the length information of the string. Taking English words as an example, when the average word length in the set is 4 to 5 characters, storage takes 8 to 10 bytes, whereas the present invention uses the hash algorithm to store each element of the set as an index value, so that only one integer, i.e. 4 bytes, needs to be stored. The present invention can therefore reduce the storage needed for the same content to roughly one half to one third of that of the prior art.
Moreover, when the value of K is well chosen, cases in which the hash algorithm computes the same index value for different elements are avoided as far as possible, which improves the accuracy of element lookup.
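The effect of the choice of K on collisions can be checked with a small experiment. The following is a minimal sketch, not the implementation of the invention: it assumes Unicode code points (`ord`) stand in for the preset hash table, and the names `index_value` and `count_collisions` and the sample word list are hypothetical.

```python
from collections import Counter


def index_value(element, k=31, n_bits=32):
    """Polynomial hash of a string, reduced modulo 2**n_bits."""
    value = 0
    for ch in element:
        # Assumption: the character code is the Unicode code point.
        value = (value * k + ord(ch)) % (1 << n_bits)
    return value


def count_collisions(elements, k, n_bits=32):
    """Count distinct elements whose index value is already taken by another."""
    counts = Counter(index_value(e, k, n_bits) for e in set(elements))
    return sum(c - 1 for c in counts.values())


words = ["ab", "ba", "hello", "world"]
# With K = 1 the hash is a plain sum of codes, so "ab" and "ba" collide.
print(count_collisions(words, k=1))
print(count_collisions(words, k=31))
```

With K = 1 the order of the characters is lost, which is why a larger key value such as 31 is preferable; collisions can still occur for any K because the index value has only N bits.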
Once the elements have been stored as an index value array, a lookup only compares integers, whereas the prior art must compare every character of a string. If each string has 4 to 5 characters on average, the present invention can make lookup about four to five times faster.
In summary, compared with the prior art, the present invention reduces the resources occupied when elements are stored and improves the efficiency of lookup.
Embodiment
The problem addressed by the present invention is that existing lookup methods are inefficient and consume excessive resources. With the method of the present invention, each original element is converted into an index value by a specific method, which both reduces wasted resources and improves lookup speed.
To achieve the above effect, the implementation of the present invention is described in detail below.
For example, suppose all elements in a set are English strings. At present, storing these English strings also requires saving additional information besides the strings themselves. This additional information is not actually needed, so it occupies too many resources. The present invention therefore converts these strings by means of the hash algorithm it provides.
Referring to Fig. 1, this figure shows the flow of storing strings according to the present invention.
Step 100: obtain the elements in the set, and apply the specific hash algorithm to one English string in the set to obtain an index value in one-to-one correspondence with this English string.
The specific hash algorithm is as follows:
Suppose the element is the string N = N0 N1 N2 ... N(M-1);
M is the number of characters in the string N;
The hash key value is K;
Repeated experiments show that K = 31 gives comparatively good results.
$$I = \bigl(\cdots(((L_0 \cdot K + L_1) \cdot K + L_2) \cdot K + L_3) \cdot K + \cdots + L_{M-1}\bigr) \bmod 2^{N}$$
where I is the index value;
$L_0, \ldots, L_{M-1}$ are the code values found in a preset hash table, one for each character of the element;
N is the preset number of storage bits of the index value; if the index value is to be a 32-bit integer, N is taken as 32.
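The formula can be evaluated character by character, carrying one running value (Horner's rule). The following is a minimal sketch under stated assumptions: the function name `index_value` and the optional `code_table` parameter are hypothetical, and when no table is supplied the Unicode code point is used as the character code.

```python
def index_value(element, k=31, n_bits=32, code_table=None):
    """Evaluate ((L0*K + L1)*K + ... )*K + L(M-1), reduced modulo 2**n_bits."""
    modulus = 1 << n_bits
    value = 0
    for ch in element:
        # Look up the character code in the preset table if one is given;
        # otherwise fall back to the Unicode code point (an assumption).
        code = code_table[ch] if code_table is not None else ord(ch)
        # Reducing at every step gives the same result as reducing once at
        # the end, and keeps the running value within n_bits.
        value = (value * k + code) % modulus
    return value


print(index_value("hello"))  # 99162322
```

Because the modulus is a power of two, the reduction corresponds to keeping only the low N bits, which is what fixed-width unsigned integer arithmetic does implicitly in many languages.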
Step 101: save this index value into a temporary index value array.
Step 102: perform steps 100 and 101 in turn for every other English string in the set.
Step 103: store the index value array.
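Steps 100 to 103 can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, Unicode code points stand in for the preset hash table, N is fixed at 32, and the choice of a little-endian 4-byte layout for the stored array is an assumption, since the text does not specify a storage format.

```python
import struct


def index_value(element, k=31, n_bits=32):
    """Polynomial hash of a string, reduced modulo 2**n_bits."""
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) % (1 << n_bits)
    return value


def build_index_array(elements, k=31):
    """Steps 100 to 102: compute one 32-bit index value per element."""
    return [index_value(e, k, 32) for e in elements]


def pack_index_array(index_values):
    """Step 103: serialize the array as 4 bytes per index value."""
    return struct.pack(f"<{len(index_values)}I", *index_values)


def unpack_index_array(data):
    """Read the stored bytes back into a list of index values."""
    return list(struct.unpack(f"<{len(data) // 4}I", data))


index_array = build_index_array(["hello", "world"])
stored = pack_index_array(index_array)
print(len(stored))  # 8 bytes for two elements
```

The returned bytes can be written to a file or any other store; each element costs exactly 4 bytes regardless of its original length.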
A concrete example of computing the index value in this storage method is given below.
Taking the string N = hello as an example, the Unicode codes of its characters are looked up in the preset hash table: h: 104; e: 101; l: 108; o: 111.
Case 1: K = 31, N = 32.
Then
$$I = ((((h \cdot 31 + e) \cdot 31 + l) \cdot 31 + l) \cdot 31 + o) \bmod 2^{32}$$
$$= ((((104 \cdot 31 + 101) \cdot 31 + 108) \cdot 31 + 108) \cdot 31 + 111) \bmod 2^{32}$$
= 99162322 (decimal)
= 05E918D2 (hexadecimal; written as d218e905 when the four bytes are listed in little-endian order)
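The arithmetic of this example can be reproduced in a few lines. This sketch assumes, as in the example, that the character codes are the Unicode code points; the variable names are illustrative.

```python
codes = [ord(c) for c in "hello"]   # [104, 101, 108, 108, 111]

value = 0
for code in codes:
    value = (value * 31 + code) % 2**32

print(value)                              # 99162322
print(f"{value:08x}")                     # 05e918d2
print(value.to_bytes(4, "little").hex())  # d218e905
```

The last line shows that the byte string d218e905 is the same 32-bit value with its four bytes listed least significant byte first.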
The three groups of experimental data listed below are based on one and the same set and show the distribution of the index values I calculated with different K values.
The values in each table describe the distribution of the index values I calculated with the respective K value.
Table 1: distribution for K = 53
0 | 0 | 2 | 0 | 26 | 0 | 234 | 15 | 807 | 1713 | 3473
Table 2: distribution for K = 31
0 | 0 | 2 | 0 | 26 | 56 | 178 | 723 | 423 | 1246 | 3616
Table 3: distribution for K = 389
0 | 0 | 2 | 0 | 0 | 27 | 0 | 8 | 351 | 1053 | 4829
The experimental data in Tables 1 to 3 show that, with all other conditions identical, the distribution of the index values I differs when K takes different values.
Table 4: distribution statistics for different K values
In Table 4, "digits" refers to the number of digits of the index value I, and "count" refers to how many index values I have that number of digits. Table 4 shows that when K is 31, the index values are spread more evenly across the different digit counts than with the other K values.
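A digit-count distribution of this kind can be computed as sketched below. This is an illustration under assumptions, not the procedure used for the experiments: the function names and the three-element sample are hypothetical, Unicode code points stand in for the preset hash table, and the digits are counted in decimal.

```python
from collections import Counter


def index_value(element, k=31, n_bits=32):
    """Polynomial hash of a string, reduced modulo 2**n_bits."""
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) % (1 << n_bits)
    return value


def digit_distribution(elements, k, n_bits=32):
    """Map each decimal digit count to the number of index values having it."""
    counts = Counter(len(str(index_value(e, k, n_bits))) for e in elements)
    return dict(sorted(counts.items()))


sample = ["a", "ab", "hello"]
print(digit_distribution(sample, k=31))  # {2: 1, 4: 1, 8: 1}
```

Running the same function over one set with several K values gives one distribution per K, which can then be compared for evenness.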
Although the above steps form an index value array made up of index values, the entries of this array are not sorted, so lookup efficiency suffers when the elements of the set need to be searched. Because searching a sorted array is clearly more efficient than searching an unsorted one, the index value array can be sorted before step 103 of the above flow.
Corresponding to the above method, the present invention also provides a system for storing elements. Referring to Fig. 4, the system comprises: a computing unit (01), an acquiring unit (03) and a storage unit (02);
The acquiring unit (03) is used to obtain elements, which may take various forms such as English strings;
The computing unit (01) is used to generate, from each obtained element, an index value in one-to-one correspondence with the element;
The storage unit (02) is used to form the generated index values into an index value array and save it.
Referring to Fig. 2, the steps of the storage flow with sorting are as follows.
Step 200: obtain the elements in the set, and apply the specific hash algorithm to one English string in the set to obtain an index value in one-to-one correspondence with this English string.
The specific hash algorithm is as follows:
Suppose the element is the string N = N0 N1 N2 ... N(M-1);
M is the number of characters in the string N;
The hash key value is K;
Repeated experiments show that K = 31 gives comparatively good results; the experimental data are essentially the same as above and are not repeated here.
$$I = \bigl(\cdots(((L_0 \cdot K + L_1) \cdot K + L_2) \cdot K + L_3) \cdot K + \cdots + L_{M-1}\bigr) \bmod 2^{N}$$
where I is the index value;
$L_0, \ldots, L_{M-1}$ are the code values found in a preset hash table, one for each character of the element;
N is the preset number of storage bits of the index value; if the index value is to be a 32-bit integer, N is taken as 32.
Step 201: save this index value into a temporary index value array.
Step 202: perform steps 200 and 201 in turn for every other English string in the set.
Step 203: sort the index value array.
Step 204: store the index value array.
Corresponding to the above second method, the present invention provides a system for storing elements. Referring to Fig. 5, the system comprises: a computing unit (01), an acquiring unit (03), a storage unit (02) and a sorting unit (04);
The acquiring unit (03) is used to obtain elements, which may take various forms such as English strings;
The computing unit (01) is used to generate, from each obtained element, an index value in one-to-one correspondence with the element, using the following hash algorithm:
$$\text{Index value} = \bigl(\cdots(((L_0 \cdot K + L_1) \cdot K + L_2) \cdot K + L_3) \cdot K + \cdots + L_{M-1}\bigr) \bmod 2^{N}$$
K is a preset hash key value;
$L_0, \ldots, L_{M-1}$ are the code values found in a preset hash table, one for each of the M characters of the element;
N is the preset number of storage bits of the index value;
The storage unit (02) is used to form the generated index values into an index value array and save it;
The sorting unit (04) is used to sort the index value array that has been formed.
The above two embodiments describe the storage flow for elements. Once the index value array has been formed, lookups of elements in this array can be performed faster.
The specific lookup steps are shown in Fig. 3:
Step 300: load the generated index value array into memory.
Step 301: apply the specific hash algorithm to the input element to obtain the index value corresponding to that element.
The specific hash algorithm here is the same as the hash algorithm of the above two embodiments and is not described again.
Step 302: search for this index value in the index value array; if it is found, the element is in the set.
The search in step 302 may be a sequential search; when the index value array is sorted, binary search or a similar method may be used.
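Steps 300 to 302 on a sorted index value array can be sketched with the standard-library `bisect` module. This is a minimal sketch under assumptions: the function names and the sample set are hypothetical, and Unicode code points stand in for the preset hash table.

```python
from bisect import bisect_left


def index_value(element, k=31, n_bits=32):
    """Polynomial hash of a string, reduced modulo 2**n_bits."""
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) % (1 << n_bits)
    return value


def build_sorted_index(elements, k=31, n_bits=32):
    """Compute the index values of all elements and sort them."""
    return sorted(index_value(e, k, n_bits) for e in elements)


def contains(sorted_index, element, k=31, n_bits=32):
    """Step 301: hash the query; step 302: binary-search the sorted array."""
    target = index_value(element, k, n_bits)
    pos = bisect_left(sorted_index, target)
    return pos < len(sorted_index) and sorted_index[pos] == target


index = build_sorted_index(["hello", "world", "index"])
print(contains(index, "hello"))   # True
print(contains(index, "absent"))  # False
```

Binary search needs about log2 of the array length integer comparisons per query. Note that a hit only shows that some stored element has the same index value, so two different elements that happen to share an index value cannot be told apart.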
Corresponding to this lookup method, the present invention provides a system for searching for elements. Referring to Fig. 6, the system comprises: a computing unit (01), an acquiring unit (03), a storage unit (02), a search unit (05) and a sorting unit (04);
The acquiring unit (03) is used to obtain elements;
The computing unit (01) is used to generate, from each obtained element, an index value in one-to-one correspondence with the element, using the following hash algorithm:
$$\text{Index value} = \bigl(\cdots(((L_0 \cdot K + L_1) \cdot K + L_2) \cdot K + L_3) \cdot K + \cdots + L_{M-1}\bigr) \bmod 2^{N}$$
K is a preset hash key value;
$L_0, \ldots, L_{M-1}$ are the code values found in a preset hash table, one for each of the M characters of the element;
N is the preset number of storage bits of the index value;
The storage unit (02) is used to form the generated index values into an index value array and save it;
The search unit (05) is used to search the index value array for a given index value generated by the computing unit;
The sorting unit (04) is used to sort the index value array that has been formed.
The elements described above are not limited to English strings; they may take various forms, such as Chinese, Japanese or Korean characters, as long as the specific hash algorithm can be applied to them.
The method and system for storing and searching for elements provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and the embodiments of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the invention, make changes to the specific embodiments and to the scope of application. In summary, the content of this description should not be construed as limiting the present invention.