CN1949221A - Method and system of storing element and method and system of searching element - Google Patents


Info

Publication number
CN1949221A
Authority
CN
China
Prior art keywords
index value
unit
array
searching
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200610144123
Other languages
Chinese (zh)
Other versions
CN100476824C (en)
Inventor
彭锦臻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Office Software Inc
Original Assignee
Beijing Kingsoft Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Software Co Ltd
Priority to CN 200610144123 (patent CN100476824C)
Publication of CN1949221A
Application granted
Publication of CN100476824C
Current legal status: Active
Anticipated expiration

Abstract

The invention provides a method for storing elements, comprising the following steps: applying a hash algorithm to every element in a set to generate a corresponding index value; assembling the index values into an index value array; and storing the array. The invention also provides a system for storing elements, comprising a computing unit, an acquiring unit, and a storage unit. The method for searching for an element comprises the following steps: applying a hash algorithm to every element in a set to generate a corresponding index value and assembling the index values into an index value array; storing the array; receiving the element to be searched for; generating its index value with the hash algorithm; and searching for that index value in the index value array. The corresponding system for searching for elements comprises a computing unit, an acquiring unit, a storage unit, and a searching unit.

Description

Method and system for storing elements, and method and system for searching for elements
Technical field
The present invention relates to a method and system for storing elements and a method and system for searching for elements, and in particular to methods and systems that use a hash algorithm to store and search for elements.
Background technology
As technology has developed, picking out the content one needs from a large volume of information has become an important technique. Determining whether an element belongs to a specific set is one small part of that technique.
At present, the most common way to determine whether an element is in a set is sequential search. The idea of sequential search is to compare the element being searched for with each element of the set in turn; if an identical element is found the search succeeds, otherwise it fails.
If the elements of the set can be ordered (English character strings, for example, can be arranged in alphabetical order), the set can first be sorted in ascending or descending order and then searched with binary search. The basic idea of binary search is as follows: first sort the elements of the set by key; then compare the element being searched for with the element at the middle position of the set. If they are equal, the search succeeds. If they are not equal, the middle element is either greater or less than the element being searched for, so in either case the search continues in only one half of the data.
However, neither method is efficient. For an unordered set, the search time is proportional to the number of elements in the set. Even for an ordered set, binary search improves little on sequential search, because when the elements are character strings, each comparison has to compare the individual characters of the strings one by one, which lowers search efficiency.
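The two prior-art approaches described above can be sketched in Python as follows. This is a minimal illustration only: the function names and the sample word list are hypothetical, and `bisect_left` from the standard library stands in for a hand-written binary search.

```python
from bisect import bisect_left

def sequential_search(items, target):
    """Compare the target with each element in turn; time grows with len(items)."""
    for item in items:
        if item == target:  # a string comparison inspects characters one by one
            return True
    return False

def binary_search(sorted_items, target):
    """Halve the search range at each step; sorted_items must already be sorted."""
    pos = bisect_left(sorted_items, target)
    return pos < len(sorted_items) and sorted_items[pos] == target

words = ["pear", "apple", "fig", "hello"]   # hypothetical sample set
print(sequential_search(words, "fig"))       # True
print(binary_search(sorted(words), "fig"))   # True
print(binary_search(sorted(words), "grape")) # False
```

In both cases every comparison is a string comparison, which is the cost the invention aims to avoid.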
Moreover, most elements in a set are character strings, and besides the string itself, additional information such as the string length must also be stored, even though this information is not needed during the search. This unnecessary information occupies excess resources and causes waste.
In summary, although the prior art can determine whether an element is in a specific set, its efficiency is low and it consumes considerable resources.
Summary of the invention
The problem to be solved by the present invention is low search efficiency and high resource consumption. With the method of the present invention, an index value is generated from each original element by a specific method, which both reduces wasted resources and increases search speed.
To solve the above technical problem, the object of the present invention is achieved by the following method:
A hash algorithm is applied to every element in the set to generate an index value in one-to-one correspondence with that element, and the index values form an index value array; the index value array is then saved.
The index value is generated with the following specific hash algorithm:
Index value = (( ... ((L0*K + L1)*K + L2)*K + ... )*K + L(M-1)) mod 2^N
K is a preset hash key;
L0 ... L(M-1) are the coded data looked up in a preset hash table, one for each character of the element;
N is the preset number of bits used to store the index value.
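Expanding the nested products shows that the formula evaluates a polynomial in K whose coefficients are the character codes. The following is only a restatement of the formula above in LaTeX notation, using the same symbols (L, K, M, N); it adds nothing beyond the source.

```latex
I \;=\; \Bigl(\,\sum_{i=0}^{M-1} L_i \, K^{\,M-1-i}\Bigr) \bmod 2^{N}
```

For M = 3, for example, this gives (L0*K^2 + L1*K + L2) mod 2^N, which matches ((L0*K + L1)*K + L2) mod 2^N.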
The saved index value array is sorted.
The hash key is a two-digit number or 31.
To solve the above problem, the present invention also provides a method for searching for an element, as follows:
A hash algorithm is applied to every element in the set to generate an index value in one-to-one correspondence with that element, and the index values form an index value array; the index value array is then saved;
The element to be searched for is input, the hash algorithm is applied to generate the index value corresponding to this element, and this index value is searched for in the index value array.
The index value is generated with the following specific hash algorithm:
Index value = (( ... ((L0*K + L1)*K + L2)*K + ... )*K + L(M-1)) mod 2^N
K is a preset hash key;
L0 ... L(M-1) are the coded data looked up in a preset hash table, one for each character of the element;
N is the preset number of bits used to store the index value.
The saved index value array is sorted.
The hash key is a two-digit number or 31.
To implement the above method, the invention provides a system for storing elements, comprising a computing unit, an acquiring unit, and a storage unit;
The acquiring unit obtains the elements;
The computing unit generates, for each obtained element, an index value in one-to-one correspondence with that element;
The storage unit forms the generated index values into an index value array and saves it.
The system further comprises a sorting unit;
The sorting unit sorts the index value array that has been formed.
To implement the element search method, the invention provides a system for searching for elements, comprising a computing unit, an acquiring unit, a storage unit, and a searching unit;
The acquiring unit obtains the elements;
The computing unit generates, for each obtained element, an index value in one-to-one correspondence with that element;
The storage unit forms the generated index values into an index value array and saves it;
The searching unit searches the index value array for a given index value generated by the computing unit.
The system further comprises a sorting unit;
The sorting unit sorts the index value array that has been formed.
In the prior art, storing an element also requires storing the length information of the character string. Taking English words as an example, when the average word length in the set is 4 to 5 characters, each word takes 8 to 10 bytes to store, whereas the present invention uses the hash algorithm to store each element of the set as an index value, so only one integer, i.e. 4 bytes, needs to be stored. The invention can therefore reduce the storage needed for the same content to between one half and one third of that required by the prior art.
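The storage comparison can be illustrated with a short Python calculation. This is a sketch under stated assumptions: UTF-16 (two bytes per character for these letters) stands in for the string storage the text implies, the string length information is not counted, and the word list is hypothetical.

```python
import struct

words = ["fig", "pear", "hello"]   # hypothetical sample set, 12 characters in total

# Assumption: each character occupies 2 bytes, as in UTF-16 for these letters.
string_bytes = sum(len(w.encode("utf-16-le")) for w in words)

# One 32-bit unsigned integer (4 bytes) per element when only index values are kept.
index_bytes = len(words) * struct.calcsize("<I")

print(string_bytes, index_bytes)   # 24 12
```

Any per-string length field would add to the first figure, which is where the source's further saving comes from.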
Moreover, when K is well chosen, cases in which the hash algorithm produces identical index values for different elements are avoided as far as possible, which improves the accuracy of the search.
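Why identical index values affect accuracy can be seen from a small Python example. It is a sketch only: `index_value` is a hypothetical helper, and Python's `ord` is assumed to stand in for the preset code table. With K = 31 the two-character strings "Aa" and "BB" produce the same index value, so a lookup by index value alone would report "BB" as present when only "Aa" was stored.

```python
def index_value(element, k=31, n_bits=32):
    """Evaluate (...((L0*K + L1)*K + L2)...)*K + L(M-1) modulo 2**N."""
    mask = (1 << n_bits) - 1
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) & mask  # assumption: ord() replaces the preset table
    return value

stored = {index_value("Aa")}                   # only "Aa" is stored
print(index_value("Aa"), index_value("BB"))    # 2112 2112, since 65*31+97 == 66*31+66
print(index_value("BB") in stored)             # True, although "BB" was never stored
```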
Once the index value array has been formed, a search compares integers, whereas the prior art must compare each character of the character strings. If each string has 4 to 5 characters on average, the invention can therefore increase search speed by a factor of four to five.
In summary, compared with the prior art, the present invention reduces the resources occupied when storing elements and improves the efficiency of searching.
Description of drawings
Fig. 1 is a flow chart of one embodiment of the invention;
Fig. 2 is a flow chart of one embodiment of the invention;
Fig. 3 is a flow chart of one embodiment of the invention;
Fig. 4 is a system diagram of one embodiment of the invention;
Fig. 5 is a system diagram of one embodiment of the invention;
Fig. 6 is a system diagram of one embodiment of the invention.
Embodiment
The problem to be solved by the present invention is low search efficiency and high resource consumption. With the method of the present invention, an index value is generated from each original element by a specific method, which both reduces wasted resources and increases search speed.
To achieve this effect, the implementation of the present invention is described in detail below.
For example, suppose all the elements of a set are English character strings. At present, storing these strings also requires saving additional information besides the strings themselves, yet this additional information is not actually needed and takes up excess resources. The present invention therefore converts these character strings, using the hash algorithm given by the invention.
Fig. 1 shows the flow for storing character strings according to the invention.
Step 100: obtain the elements of the set, apply the specific hash algorithm to one English character string in the set, and obtain the index value in one-to-one correspondence with that string.
The specific hash algorithm is as follows:
Let the element be the character string N = N0 N1 N2 ... N(M-1);
M is the number of characters in the string N;
the hash key is K.
Repeated experiments show that K = 31 gives comparatively good results.
Index value I = (( ... ((L0*K + L1)*K + L2)*K + ... )*K + L(M-1)) mod 2^N
L0 ... L(M-1) are the coded data looked up in a preset hash table, one for each character of the element;
N in the exponent (the bit count, not the character string above) is the preset number of bits used to store the index value; if the index value is to be a 32-bit integer, N is taken as 32.
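The hash formula can be written as a short loop. The following Python sketch is one possible reading, not the patent's own implementation: the function name `index_value` is hypothetical, and `ord` (the Unicode code point) is assumed to stand in for the preset table that supplies L0 ... L(M-1).

```python
def index_value(element, k=31, n_bits=32):
    """Evaluate (...((L0*K + L1)*K + L2)...)*K + L(M-1) modulo 2**N."""
    mask = (1 << n_bits) - 1           # reducing modulo 2**N equals keeping the low N bits
    value = 0
    for ch in element:
        code = ord(ch)                 # assumption: code point replaces the preset table lookup
        value = (value * k + code) & mask
    return value

print(index_value("hello"))            # 99162322
```

Reducing after every step gives the same result as reducing once at the end, and it keeps the intermediate value within N bits, as a fixed-width integer would.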
Step 101: save this index value into a temporary index value array.
Step 102: perform steps 100 and 101 in turn for every other English character string in the set.
Step 103: store the index value array.
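Steps 100 to 103 can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, `ord` stands in for the preset code table, and packing the array as 32-bit little-endian integers is only one possible storage format, since the text does not specify one.

```python
import struct

def index_value(element, k=31, n_bits=32):
    mask = (1 << n_bits) - 1
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) & mask   # assumption: ord() replaces the preset table
    return value

def build_index_array(elements):
    temp = []                                  # step 101: temporary index value array
    for element in elements:                   # step 102: repeat for every element
        temp.append(index_value(element))      # step 100: hash one element
    return temp

def serialize(values):
    """Step 103: pack the array as 32-bit little-endian unsigned integers."""
    return struct.pack(f"<{len(values)}I", *values)

words = ["hello", "world", "fig"]              # hypothetical sample set
blob = serialize(build_index_array(words))
print(len(blob))                               # 12, i.e. 4 bytes per element
```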
A concrete example of computing the index value in this storage method is given below.
Take the character string N = hello as an example. Looking up the preset hash table gives its Unicode codes: h: 104; e: 101; l: 108; o: 111.
Case 1: take K = 31 and N = 32 bits;
then I = ((((h*31 + e)*31 + l)*31 + l)*31 + o) mod 2^32
= ((((104*31 + 101)*31 + 108)*31 + 108)*31 + 111) mod 2^32
= 99162322 (decimal)
= 05E918D2 (hexadecimal; d218e905 when the four bytes are written in little-endian order)
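The arithmetic of this example can be checked step by step in Python. The snippet uses only the character codes given in the text and K = 31; it also shows that the hexadecimal string d218e905 corresponds to the four bytes of the result written in little-endian order.

```python
value = 0
for code in (104, 101, 108, 108, 111):      # codes for h, e, l, l, o
    value = (value * 31 + code) % 2**32

print(value)                                 # 99162322
print(f"{value:08x}")                        # 05e918d2
print(value.to_bytes(4, "little").hex())     # d218e905
```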
The three groups of experimental data listed below are based on the same set and show the distribution of the index values I calculated with different values of K.
The values in the tables come from the index values I calculated with the different K values.
Table 1: distribution for K = 53
0 0 2
0 26 0
234 15 807
1713 3473
Table 2: distribution for K = 31
0 0 2
0 26 56
178 723 423
1246 3616
Table 3: distribution for K = 389
0 0 2
0 0 27
0 8 351
1053 4829
The experimental data in Tables 1 to 3 show that, with all other conditions the same, different values of K give different distributions of the index value I.
Table 4: summary of the distributions for different K values
"Digits" in Table 4 refers to the number of digits of the index value I; "count" refers to the number of index values I that have each number of digits. Table 4 shows clearly that with K = 31 the index values are spread more evenly across the different digit counts than with the other K values.
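A statistic of the kind described for Table 4 (how many index values have each number of decimal digits) could be computed as in the following sketch. The sample set and helper names are hypothetical and do not reproduce the patent's experimental data; `ord` again stands in for the preset code table.

```python
from collections import Counter

def index_value(element, k, n_bits=32):
    mask = (1 << n_bits) - 1
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) & mask   # assumption: ord() replaces the preset table
    return value

def digit_histogram(elements, k):
    """Count how many index values have 1, 2, ..., 10 decimal digits."""
    return Counter(len(str(index_value(e, k))) for e in elements)

sample = ["a", "hello", "storage", "searching", "index"]   # hypothetical sample set
for k in (31, 53, 389):
    print(k, sorted(digit_histogram(sample, k).items()))
```

Running the same function over a real word list for several K values would give one row of counts per K, which can then be compared for evenness.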
Although the above steps produce an index value array, the values in the array are not sorted, which affects efficiency when the elements of the set later need to be looked up. Because searching a sorted array is clearly more efficient than searching an unsorted one, the index value array can be sorted before step 103 of the above flow.
Corresponding to the above method, the present invention also provides a system for storing elements. Referring to Fig. 4, the system comprises a computing unit (01), an acquiring unit (03), and a storage unit (02);
The acquiring unit (03) obtains the elements, which may take various forms, such as English character strings;
The computing unit (01) generates, for each obtained element, an index value in one-to-one correspondence with that element;
The storage unit (02) forms the generated index values into an index value array and saves it.
Referring to Fig. 2, the steps of this flow are as follows.
Step 200: obtain the elements of the set, apply the specific hash algorithm to one English character string in the set, and obtain the index value in one-to-one correspondence with that string.
The specific hash algorithm is as follows:
Let the element be the character string N = N0 N1 N2 ... N(M-1);
M is the number of characters in the string N;
the hash key is K.
Repeated experiments show that K = 31 gives comparatively good results; the experimental data are essentially the same as above and are not repeated here.
Index value I = (( ... ((L0*K + L1)*K + L2)*K + ... )*K + L(M-1)) mod 2^N
L0 ... L(M-1) are the coded data looked up in a preset hash table, one for each character of the element;
N in the exponent (the bit count, not the character string above) is the preset number of bits used to store the index value; if the index value is to be a 32-bit integer, N is taken as 32.
Step 201: save this index value into a temporary index value array.
Step 202: perform steps 200 and 201 in turn for every other English character string in the set.
Step 203: sort the index value array.
Step 204: store the index value array.
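Steps 200 to 204 differ from the first flow only in the sorting step. The sketch below writes the sorted array to a file; the file name, the helper names, and the 32-bit little-endian file format are assumptions made for illustration, and `ord` stands in for the preset code table.

```python
import os
import struct
import tempfile

def index_value(element, k=31, n_bits=32):
    mask = (1 << n_bits) - 1
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) & mask   # assumption: ord() replaces the preset table
    return value

def save_sorted_index(elements, path):
    values = sorted(index_value(e) for e in elements)        # steps 200 to 203
    with open(path, "wb") as fh:                             # step 204
        fh.write(struct.pack(f"<{len(values)}I", *values))
    return values

path = os.path.join(tempfile.mkdtemp(), "index.bin")         # hypothetical file name
saved = save_sorted_index(["pear", "hello", "fig"], path)
print(saved == sorted(saved), os.path.getsize(path))         # True 12
```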
Corresponding to this second method, the invention provides a system for storing elements. Referring to Fig. 5, the system comprises a computing unit (01), an acquiring unit (03), a storage unit (02), and a sorting unit (04);
The acquiring unit (03) obtains the elements, which may take various forms, such as English character strings;
The computing unit (01) generates, for each obtained element, an index value in one-to-one correspondence with that element;
The storage unit (02) forms the generated index values into an index value array and saves it;
The sorting unit (04) sorts the index value array that has been formed.
The two embodiments above describe the storage flow for elements. Once the index value array has been formed, looking up elements in this array is faster.
The specific lookup steps are shown in Fig. 3:
Step 300: load the generated index value array into memory.
Step 301: apply the specific hash algorithm to the input element to obtain the index value corresponding to that element.
The specific hash algorithm here is the same as in the two embodiments above and is not repeated.
Step 302: search for this index value in the index value array; if it is found, the element is in the set.
The search in step 302 can be a sequential search; if the index value array is sorted, binary search or a similar method can be used.
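The lookup steps 300 to 302 can be sketched for the sorted case as follows. The helper names and the packed 32-bit format are assumptions; `bisect_left` from the Python standard library performs the binary search, and `ord` stands in for the preset code table.

```python
import struct
from bisect import bisect_left

def index_value(element, k=31, n_bits=32):
    mask = (1 << n_bits) - 1
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) & mask   # assumption: ord() replaces the preset table
    return value

def load_index(blob):
    """Step 300: unpack 32-bit little-endian values into an in-memory list."""
    return [v for (v,) in struct.iter_unpack("<I", blob)]

def contains(sorted_index, element):
    target = index_value(element)               # step 301: hash the input element
    pos = bisect_left(sorted_index, target)     # step 302: binary search on integers
    return pos < len(sorted_index) and sorted_index[pos] == target

values = sorted(index_value(w) for w in ["pear", "hello", "fig"])
index = load_index(struct.pack(f"<{len(values)}I", *values))
print(contains(index, "hello"))                 # True
print(contains(index, "grape"))                 # False
```

Each probe compares two integers rather than two strings, which is the source of the speed-up the text claims.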
Corresponding to this lookup method, the invention provides a system for searching for elements. Referring to Fig. 6, the system comprises a computing unit (01), an acquiring unit (03), a storage unit (02), a searching unit (05), and a sorting unit (04);
The acquiring unit (03) obtains the elements;
The computing unit (01) generates, for each obtained element, an index value in one-to-one correspondence with that element;
The storage unit (02) forms the generated index values into an index value array and saves it;
The searching unit (05) searches the index value array for a given index value generated by the computing unit;
The sorting unit (04) sorts the index value array that has been formed.
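One way to picture how the units cooperate is to map each of them to a method of a single class. The class below is a hypothetical sketch, not the structure the patent prescribes: the class and method names are invented for illustration, and `ord` stands in for the preset code table.

```python
from bisect import bisect_left

class ElementIndex:
    """Hypothetical grouping of the units described above into one class."""

    def __init__(self, k=31, n_bits=32):
        self.k = k
        self.mask = (1 << n_bits) - 1
        self.values = []                        # storage unit: the index value array

    def compute(self, element):                 # computing unit
        value = 0
        for ch in element:
            value = (value * self.k + ord(ch)) & self.mask
        return value

    def store(self, elements):                  # acquiring unit feeds the storage unit
        self.values = [self.compute(e) for e in elements]
        self.values.sort()                      # sorting unit

    def search(self, element):                  # searching unit
        target = self.compute(element)
        pos = bisect_left(self.values, target)
        return pos < len(self.values) and self.values[pos] == target

idx = ElementIndex()
idx.store(["pear", "hello", "fig"])
print(idx.search("hello"), idx.search("grape"))  # True False
```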
The elements described above are not limited to English character strings; they can take various forms, such as Chinese, Japanese, or Korean characters, as long as the specific hash algorithm can be applied to them.
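Because the formula only needs a numeric code per character, the same loop applies to non-English text. In the sketch below the Unicode code point returned by `ord` is assumed to be the code for each character; the function name and sample strings are hypothetical.

```python
def index_value(element, k=31, n_bits=32):
    mask = (1 << n_bits) - 1
    value = 0
    for ch in element:
        value = (value * k + ord(ch)) & mask   # assumption: code point used as the character code
    return value

for text in ("你好", "こんにちは", "안녕"):       # Chinese, Japanese, Korean samples
    print(text, index_value(text))
```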
The methods and systems for storing elements and for searching for elements provided by the present invention have been described in detail above. Specific examples have been used to explain the principle and embodiments of the invention, and the description of the embodiments is intended only to help in understanding the method of the invention and its core idea. A person of ordinary skill in the art may, following the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, this description should not be construed as limiting the present invention.

Claims (12)

1. A method for storing elements, characterized in that the method comprises:
applying a hash algorithm to every element in a set to generate an index value in one-to-one correspondence with that element, and forming an index value array; and saving the index value array.
2. The method for storing elements according to claim 1, characterized in that the index value is generated with the following specific hash algorithm:
Index value = (( ... ((L0*K + L1)*K + L2)*K + ... )*K + L(M-1)) mod 2^N, where:
K is a preset hash key;
L0 ... L(M-1) are the coded data looked up in a preset hash table, one for each character of the element;
N is the preset number of bits used to store the index value.
3. The method for storing elements according to claim 1, characterized in that the saved index value array is sorted.
4. The method for storing elements according to any one of claims 1 to 3, characterized in that the hash key is a two-digit number or 31.
5. A method for searching for an element, characterized in that the method comprises:
applying a hash algorithm to every element in a set to generate an index value in one-to-one correspondence with that element, and forming an index value array; saving the index value array;
inputting the element to be searched for, applying the hash algorithm to generate the index value corresponding to this element, and searching for this index value in the index value array.
6. The method for searching for an element according to claim 5, characterized in that the index value is generated with the following specific hash algorithm:
Index value = (( ... ((L0*K + L1)*K + L2)*K + ... )*K + L(M-1)) mod 2^N
K is a preset hash key;
L0 ... L(M-1) are the coded data looked up in a preset hash table, one for each character of the element;
N is the preset number of bits used to store the index value.
7. The method for searching for an element according to claim 4, characterized in that the saved index value array is sorted.
8. The method for searching for an element according to any one of claims 5 to 7, characterized in that the hash key is a two-digit number or 31.
9. A system for storing elements, characterized in that the system comprises a computing unit, an acquiring unit, and a storage unit;
the acquiring unit obtains the elements;
the computing unit generates, for each obtained element, an index value in one-to-one correspondence with that element;
the storage unit forms the generated index values into an index value array and saves it.
10. The system for storing elements according to claim 9, characterized in that the system further comprises a sorting unit;
the sorting unit sorts the index value array that has been formed.
11. A system for searching for elements, characterized in that the system comprises a computing unit, an acquiring unit, a storage unit, and a searching unit;
the acquiring unit obtains the elements;
the computing unit generates, for each obtained element, an index value in one-to-one correspondence with that element;
the storage unit forms the generated index values into an index value array and saves it;
the searching unit searches the index value array for a given index value generated by the computing unit.
12. The system for searching for elements according to claim 11, characterized in that the system further comprises a sorting unit;
the sorting unit sorts the index value array that has been formed.
CN 200610144123 2006-11-27 2006-11-27 Method and system for storing element and method and system for searching element Active CN100476824C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200610144123 CN100476824C (en) 2006-11-27 2006-11-27 Method and system for storing element and method and system for searching element

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200610144123 CN100476824C (en) 2006-11-27 2006-11-27 Method and system for storing element and method and system for searching element

Publications (2)

Publication Number Publication Date
CN1949221A (en) 2007-04-18
CN100476824C (en) 2009-04-08

Family

ID=38018736

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200610144123 Active CN100476824C (en) 2006-11-27 2006-11-27 Method and system for storing element and method and system for searching element

Country Status (1)

Country Link
CN (1) CN100476824C (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100561482C (en) * 2008-01-29 2009-11-18 北京北方烽火科技有限公司 A kind of implementation method of embedded system data base
CN101996217B (en) * 2009-08-24 2012-11-07 华为技术有限公司 Method for storing data and memory device thereof
CN103077199A (en) * 2012-12-26 2013-05-01 北京思特奇信息技术股份有限公司 File resource searching and locating method and device
CN105159987A (en) * 2015-08-31 2015-12-16 深圳市茁壮网络股份有限公司 Data storage and query method and apparatus
CN105159987B (en) * 2015-08-31 2019-03-29 深圳市茁壮网络股份有限公司 A kind of storage of data, lookup method and device
CN107357632A (en) * 2017-07-17 2017-11-17 郑州云海信息技术有限公司 A kind of order line analysis method and device
CN108629049A (en) * 2018-05-14 2018-10-09 芜湖岭上信息科技有限公司 A kind of image real-time storage and lookup device and method based on hash algorithm
CN111814003A (en) * 2019-04-12 2020-10-23 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for building metadata index
CN111814003B (en) * 2019-04-12 2024-04-23 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for establishing metadata index
CN111737264A (en) * 2020-07-20 2020-10-02 智者四海(北京)技术有限公司 Information processing method and system

Also Published As

Publication number Publication date
CN100476824C (en) 2009-04-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: BEIJING KINGSOFT OFFICE SOFTWARE CO., LTD.

Free format text: FORMER OWNER: BEIJING JINSHAN SOFTWARE CO., LTD.

Effective date: 20140312

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100083 HAIDIAN, BEIJING TO: 100085 HAIDIAN, BEIJING

TR01 Transfer of patent right

Effective date of registration: 20140312

Address after: Kingsoft No. 33 building, 100085 Beijing city Haidian District Xiaoying Road

Patentee after: Beijing Kingsoft WPS Office Co., Ltd.

Address before: 100083, Beijing, Haidian District No. 238 North Fourth Ring Road, No. 20, Bai Yan building

Patentee before: Beijing Jinshan Software Co., Ltd.

C56 Change in the name or address of the patentee
CP01 Change in the name or title of a patent holder

Address after: Kingsoft No. 33 building, 100085 Beijing city Haidian District Xiaoying Road

Patentee after: Beijing Kingsoft office software Limited by Share Ltd

Address before: Kingsoft No. 33 building, 100085 Beijing city Haidian District Xiaoying Road

Patentee before: Beijing Kingsoft WPS Office Co., Ltd.