CN108572958A - Data processing method and device - Google Patents

Info

Publication number: CN108572958A
Application number: CN201710132651.XA
Authority: CN (China)
Prior art keywords: keyword, data block, data, index, target data
Legal status: Granted; Active (as listed by Google Patents, which states that the status is an assumption and not a legal conclusion)
Granted publication: CN108572958B
Other languages: Chinese (zh)
Inventors: 袁野, 周海发
Current assignee: Tencent Technology Shenzhen Co Ltd (as listed by Google Patents, which notes that the list may be inaccurate)
Original assignee: Tencent Technology Shenzhen Co Ltd
Priority: CN201710132651.XA

Abstract

The invention discloses a data processing method and device. The method includes: extracting a first keyword and a second keyword from the keyword of target data; using the first keyword as an index, comparing it in turn with the hash key of each data block; when the comparison succeeds, obtaining, based on the storage location to which the first keyword maps, the target data block whose hash key is the first keyword; using the second keyword as an index, comparing it in turn with the median of the index of the target data block and with the median of the index of the target data block after each recursive division, where the index comprises the sequence numbers of the sequentially arranged data in the target data block; and when the comparison succeeds, obtaining the target data from the corresponding storage location in the target data block, based on the storage location to which the second keyword maps. Implementing the invention allows data to be looked up efficiently.

Description

Data processing method and device
Technical field
The present invention relates to database technology, and in particular to a data processing method and device.
Background art
Data lookup technology refers to technology for finding the data required to run a service. Fast data lookup is a key factor in keeping a service running efficiently and stably.
At present, data volumes are growing explosively, and conventional data lookup technology, when searching massive data, hits the bottlenecks of low lookup efficiency and high resource usage.
Take augmented reality as an example. Augmented reality displays the real environment and, on that basis, augments the user's perception of the real world by combining the real environment with virtual objects (objects that do not exist in the real environment the user is currently in). Augmented reality therefore involves massive map data representing the real environment and the virtual objects, and conventional data lookup technology struggles to keep lookups efficient.
Take high-precision electronic maps as another example. A high-precision electronic map is used for automatic driving and automatic navigation. It offers precision that conventional electronic maps cannot match (the error is usually within one meter) and includes a large amount of data about road facilities, so the data volume is large. Conventional data lookup technology likewise struggles to keep lookups efficient in a high-precision electronic map.
In summary, the related art offers no effective solution for ensuring lookup efficiency when looking up data.
Summary of the invention
Embodiments of the present invention provide a data processing method and device that can look up data efficiently.
The technical solutions of the embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a data processing method, including:
extracting a first keyword and a second keyword from the keyword of target data;
using the first keyword as an index, comparing it in turn with the hash key of each data block;
when the comparison succeeds, obtaining, based on the storage location to which the first keyword maps, the target data block whose hash key is the first keyword;
using the second keyword as an index, comparing it in turn with the median of the index of the target data block and with the median of the index of the target data block after each recursive division, where the index comprises the sequence numbers of the sequentially arranged data in the target data block;
when the comparison succeeds, obtaining the target data from the corresponding storage location in the target data block, based on the storage location to which the second keyword maps.
In a second aspect, an embodiment of the present invention provides a data processing device, including:
an extraction unit, configured to extract a first keyword and a second keyword from the keyword of target data;
a first lookup unit, configured to use the first keyword as an index and compare it in turn with the hash key of each data block;
a first acquisition unit, configured to obtain, when the comparison succeeds and based on the storage location to which the first keyword maps, the target data block whose hash key is the first keyword;
a second lookup unit, configured to use the second keyword as an index and compare it in turn with the median of the index of the target data block and with the median of the index of the target data block after each recursive division, where the index comprises the sequence numbers of the sequentially arranged data in the target data block;
a second acquisition unit, configured to obtain, when the comparison succeeds and based on the storage location to which the second keyword maps, the target data from the corresponding storage location in the target data block.
In a third aspect, an embodiment of the present invention provides a data processing device including a processor and a storage medium, where the storage medium stores executable instructions that cause the processor to perform operations including:
extracting a first keyword and a second keyword from the keyword of target data;
using the first keyword as an index, comparing it in turn with the hash key of each data block;
when the comparison succeeds, obtaining, based on the storage location to which the first keyword maps, the target data block whose hash key is the first keyword;
using the second keyword as an index, comparing it in turn with the median of the index of the target data block and with the median of the index of the target data block after each recursive division, where the index comprises the sequence numbers of the sequentially arranged data in the target data block;
when the comparison succeeds, obtaining the target data from the corresponding storage location in the target data block, based on the storage location to which the second keyword maps.
In a fourth aspect, an embodiment of the present invention provides a storage medium storing executable instructions for performing the data processing method provided by the embodiments of the present invention.
The embodiments of the present invention have the following beneficial effects:
The first keyword is compared with the hash key of each data block to determine the target data block to which the target data belongs, and the lookup then continues inside that target data block. Lookup based on hash keys is thus combined with lookup in a recursively divided index of sequence numbers. On the one hand, this avoids the large memory footprint caused by reading all data blocks and traversing each of them. On the other hand, it avoids the low efficiency of looking up solely in a recursively divided index of data sequence numbers, and so improves lookup efficiency.
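The two-stage lookup summarized above can be sketched roughly as follows. This is a minimal illustration rather than the patented implementation: the in-memory dictionary `BLOCKS`, the key strings such as `"blk-1"`, and the function names are all assumptions made for the example, and a real system would read the target data block from its storage location instead of holding every block in memory.

```python
# Hypothetical layout: block hash key -> list of (sequence number, payload),
# with each list sorted by sequence number.
BLOCKS = {
    "blk-1": [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
    "blk-2": [(1, "e"), (2, "f"), (3, "g")],
}


def find_block(first_keyword, blocks):
    """Stage 1: compare the first keyword with each block's hash key in turn."""
    for hash_key, records in blocks.items():
        if hash_key == first_keyword:
            return records
    return None


def find_in_block(second_keyword, records):
    """Stage 2: compare with the median of the index, then with the median
    of each recursively halved sub-range (written here as a loop)."""
    lo, hi = 0, len(records) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        seq, payload = records[mid]
        if seq == second_keyword:
            return payload
        if seq < second_keyword:
            lo = mid + 1  # continue in the upper half
        else:
            hi = mid - 1  # continue in the lower half
    return None


def lookup(first_keyword, second_keyword, blocks=BLOCKS):
    """Return the payload for (first keyword, second keyword), or None."""
    records = find_block(first_keyword, blocks)
    if records is None:
        return None
    return find_in_block(second_keyword, records)
```

Only the block selected in stage 1 is searched in stage 2, which is the point of combining the two lookups.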
Description of the drawings
Fig. 1 is an optional schematic diagram of the data processing device provided by an embodiment of the present invention deployed in a server/client-based system;
Fig. 2 is an optional schematic diagram of the hardware structure of the data processing device provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of an optional structure after data is divided for storage, provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of an optional structure after stored data is divided, provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of an optional storage structure for storing data blocks, provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of an optional storage structure (mapping table) for storing data block sets, provided by an embodiment of the present invention;
Fig. 7 is an optional schematic diagram of the ordered arrangement of data in a data block, provided by an embodiment of the present invention;
Fig. 8 is an optional schematic diagram of the ordered arrangement of data in a data block, provided by an embodiment of the present invention;
Fig. 9 is an optional schematic diagram of the structure of the keyword used to look up data, provided by an embodiment of the present invention;
Fig. 10 is an optional schematic diagram of the structure of the keyword used to look up data, provided by an embodiment of the present invention;
Fig. 11 is an optional flow diagram of the data processing method provided by an embodiment of the present invention;
Fig. 12 is an optional flow diagram, provided by an embodiment of the present invention, of using the first keyword of the target data as an index and comparing it in turn with the hash key of each data block;
Fig. 13 is an optional flow diagram of the data lookup method provided by an embodiment of the present invention;
Fig. 14 is an optional schematic diagram of the structure of the data processing device provided by an embodiment of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.
Before the present invention is described in further detail, the nouns and terms used in the embodiments of the present invention are explained; the following explanations apply to them.
1) Data, also called static data or cross-sectional data, consists of data describing the state of several related phenomena at a given point in time. It describes how phenomena vary at a given moment and reflects the inherent numerical relationships between phenomena under objective conditions such as a given time and place. It may be, for example, data collected at the same point in time, or data that has been fully created before the lookup.
For example, the map data used in a map application is stable and does not change while a map data lookup is in progress. As another example, the map data used in an augmented reality application includes the image data of the real objects and virtual objects at different locations.
The data itself may be a single piece of data, multiple pieces of data, or data of a certain capacity (for example, a certain number of bytes), such as the map data of a place or a region in a high-precision electronic map, or the image data of one or more virtual objects at a certain place in an augmented reality map.
2) Data block, including: 2.1) a hash key part, that is, the hash key of the data block itself; 2.2) a data part, that is, multiple pieces of data arranged in order, with each piece of data ordered within the data block by its sequence number.
3) Sequence number: an ordered identifier assigned to the sequentially arranged data in a data block. Sequence numbers may be ordered numerically (such as 1/2/3/4), alphabetically (such as a/b/c/d), by a combination of letters and digits (such as a1/a2/a3/a4), or by any other form of ordered identifier.
4) Sequence number index: the ordered (for example, numerically or alphabetically sorted) index formed by the sequence numbers of the data in a data block.
5) Data block set, including: 5.1) a hash key part, that is, the hash key of the data block set itself; 5.2) a data part, that is, two or more data blocks.
6) Point of interest (POI, Point of Interest): corresponds to an object (target). In a high-precision electronic map, for example, a point of interest may be a place, such as a house, a shop, a mailbox, or a bus station; in an augmented reality map, a point of interest may be a virtual object at a place (such as a virtual item in a game).
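As an illustration of terms 2) to 5), the data block and the data block set could be modeled along the following lines. The class and field names (`DataBlock`, `DataBlockSet`, `records`, `sequence_index`) are hypothetical and chosen only for this sketch; the patent does not prescribe a concrete representation.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class DataBlock:
    """A hash key part plus a data part ordered by sequence number."""
    hash_key: str
    records: List[Tuple[int, Any]] = field(default_factory=list)

    def add(self, seq: int, payload: Any) -> None:
        """Insert a piece of data and keep the block ordered by sequence number."""
        self.records.append((seq, payload))
        self.records.sort(key=lambda record: record[0])

    def sequence_index(self) -> List[int]:
        """The ordered index formed by the sequence numbers in this block."""
        return [seq for seq, _ in self.records]


@dataclass
class DataBlockSet:
    """A hash key part plus a data part made up of data blocks."""
    hash_key: str
    blocks: List[DataBlock] = field(default_factory=list)
```

Numeric sequence numbers are assumed here for simplicity; any mutually comparable identifiers would serve the same purpose.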
The data processing device that implements the embodiments of the present invention is now described with reference to the drawings. The data processing device can be implemented in various forms. For example, it may be implemented as a terminal such as a smartphone, a notebook computer, a tablet computer (PAD), or a vehicle-mounted terminal, or as a fixed terminal such as a digital television (TV, Television) or a desktop computer. The terminal implements various data-lookup-based services by running an application client.
As another example, referring to Fig. 1, Fig. 1 is an optional schematic diagram of the data processing device provided by an embodiment of the present invention deployed in a server/client-based system. The data processing device described in the embodiments may be implemented as a server, which can be used for the various data-lookup-based services implemented in any server/client-based architecture.
Examples of the data-lookup-based services mentioned above include:
1) User lookup in an online social service, for example looking up users in the user database of a social network who match various targeting conditions (attributes such as user name, region, and preferences);
2) Augmented reality service: according to the orientation of the user in the environment when wearing an augmented reality device (for example, in the form of glasses or a helmet), the map data of the virtual objects for that orientation is looked up in an augmented reality map database and displayed in the field of view of the augmented reality device, so that the real world and the virtual world appear merged;
3) Electronic map (for example, a high-precision electronic map): according to a target location, the map data of the target location is looked up in a high-precision electronic map database and displayed.
Referring to Fig. 2, Fig. 2 is an optional schematic diagram of the hardware structure of the data processing device provided by an embodiment of the present invention. The data processing device 100 includes a processor 101, a display unit 102, a communication unit 103, a memory 104, an input unit 105, and a power supply unit 106. Fig. 2 shows a data processing device with various components, but it should be understood that not all of the components shown are required; more or fewer components may be implemented instead. The components of the data processing device are described in detail below.
The processor 101 controls the overall operation of the data processing device. For example, the processor 101 performs the control and processing related to implementing the various data-lookup-based services, including the services listed in the examples above.
The display unit 102 can display the intermediate information and lookup results of the various data-lookup-based services carried out in the data processing device 100.
For example, when the processor 101 performs the control and processing for an online social service, the display unit 102 can display the user interface (UI, User Interface) or graphical user interface (GUI, Graphical User Interface) of the online social service, showing the results of looking up users in the social network.
As another example, when the processor 101 performs the control and processing for an augmented reality service, the display unit 102, according to the orientation of the user in the environment when wearing an augmented reality device (for example, in the form of glasses or a helmet), displays an image of the actual environment for that orientation, with images of virtual objects superimposed on the actual environment according to various augmented reality strategies.
As a further example, when the processor 101 performs the control and processing for a high-precision electronic map, the display unit 102 displays the electronic map: according to the target location, the map data of the target location is looked up in the high-precision electronic map database and displayed.
The communication unit 103 typically includes one or more components that allow wired or wireless communication between the data processing device 100 and a wireless communication system or network. For example, the communication unit 103 may be implemented as at least one of a mobile communication module, a wireless Internet module, and a short-range communication module.
The mobile communication module sends radio signals to, and/or receives radio signals from, at least one of a base station (for example, an access point or a Node B), an external terminal, and a server. Such radio signals may include voice call signals, video call signals, or various types of data sent and/or received in text and/or multimedia messages.
The wireless Internet module supports wireless Internet access for the mobile terminal. The module may be coupled to the terminal internally or externally. The wireless Internet access technologies involved may include wireless LAN (WLAN), Wireless Fidelity (Wi-Fi), wireless broadband (WiBro), Worldwide Interoperability for Microwave Access (WiMAX), High-Speed Downlink Packet Access (HSDPA), and so on.
The short-range communication module supports short-range communication. Examples of short-range communication technologies include Bluetooth, radio frequency identification (RFID, Radio Frequency Identification), Infrared Data Association (IrDA), ultra-wideband (UWB, Ultra Wideband), ZigBee, and so on.
The memory 104 can store software programs for the processing and control operations performed by the processor 101, or can temporarily store data that has been output or is to be output (for example, intermediate or final results of the data-lookup-based services described above).
The memory 104 may include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM, Random Access Memory), static random access memory (SRAM, Static Random Access Memory), read-only memory (ROM, Read Only Memory), electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read Only Memory), programmable read-only memory (PROM, Programmable Read Only Memory), magnetic memory, a magnetic disk, an optical disc, and so on. Moreover, the data processing device 100 can cooperate with a network storage device that performs the storage function of the memory 104 over a network connection.
The input unit 105 can generate key input data from commands entered by the user to control various operations of the mobile terminal. The input unit 105 allows the user to input various types of information and may include a keyboard, a touch pad, a scroll wheel, a joystick, and so on. In particular, when the touch pad is layered on the display unit 102, a touch screen is formed.
The power supply unit 106 receives external or internal power under the control of the processor 101 and supplies the appropriate power needed to operate each element and component.
The various embodiments described here can be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof.
For a hardware implementation, the embodiments described here can be implemented using at least one of an application-specific integrated circuit (ASIC, Application Specific Integrated Circuit), a digital signal processor (DSP, Digital Signal Processor), a digital signal processing device (DSPD, Digital Signal Processing Device), a programmable logic device (PLD, Programmable Logic Device), a field programmable gate array (FPGA, Field Programmable Gate Array), a processor, a controller, a microcontroller, a microprocessor, or an electronic unit designed to perform the functions described here. In some cases, such embodiments can be implemented in the processor 101.
For a software implementation, embodiments such as processes or functions can be implemented with separate software modules, each of which performs at least one function or operation. The software code can be implemented as a software application (or program) written in any suitable programming language, stored in the memory 104, and executed by the processor 101.
The data processing device involved in the embodiments of the present invention has so far been described in terms of its functions. Based on the hardware structure diagram of the data processing device above, the data processing method provided by the embodiments of the present invention and applied to the data processing device is described next.
The way in which the data to be looked up is divided for storage, as provided by the embodiments of the present invention, is described first.
Referring to Fig. 3, Fig. 3 is a schematic diagram of an optional structure after data is divided for storage, provided by an embodiment of the present invention. The data to be looked up is divided into multiple data blocks, and the data block is the basic lookup object in a data lookup. That is, when looking up target data, the data block in which the target data resides (the target data block) is located first. Within each data block, the data is arranged in order by sequence number.
Data blocks can be obtained by dividing the data to be looked up in different ways, including the following optional ways:
1) Division based on the relevance between data: data that is related in one or more dimensions is divided into the corresponding data block. The dimensions here may include time, region, and the object described.
Taking division by the object the data describes as an example: high-precision electronic map data can be divided by the geographic area described (such as county, city, or street); augmented reality map data can be divided by place, with the image data of the virtual objects applied at each place assigned to the data block of that place.
Taking division by the time dimension of the data as an example: data whose timestamps fall within one hour can be divided into data blocks at a granularity of 10 minutes.
2) Division based on data capacity: all the data to be looked up is divided at a granularity of a specified capacity, for example into multiple data blocks in units of 100 megabytes.
Note that the above ways of dividing the data to be looked up into data blocks are only examples. The embodiments of the present invention do not specifically limit how the data to be looked up is divided into data blocks; in practice, the ways above can be used individually or in combination.
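The time-based and capacity-based division described above might look roughly like this in code. Both helpers are illustrative sketches under stated assumptions: `divide_by_time` assumes records arrive as `(timestamp in seconds, payload)` pairs, and `divide_by_capacity` assumes byte-string payloads and uses a simple greedy cut, which is only one of many possible policies.

```python
from collections import defaultdict


def divide_by_time(records, granularity_s=600):
    """Group (timestamp_seconds, payload) pairs into blocks of 10 minutes
    (600 s by default); the dict key is the bucket number."""
    blocks = defaultdict(list)
    for timestamp, payload in records:
        blocks[timestamp // granularity_s].append((timestamp, payload))
    return dict(blocks)


def divide_by_capacity(payloads, capacity_bytes):
    """Greedy division: start a new block when adding the next payload
    would push the current block past the capacity limit."""
    blocks, current, size = [], [], 0
    for payload in payloads:
        if current and size + len(payload) > capacity_bytes:
            blocks.append(current)
            current, size = [], 0
        current.append(payload)
        size += len(payload)
    if current:
        blocks.append(current)
    return blocks
```

With the defaults, one hour of data yields at most six time-based blocks; for the capacity example in the text, `capacity_bytes` would be set to 100 megabytes.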
The embodiments of the present invention also provide a structure, different from that of Fig. 3, for all the data to be looked up after division. Referring to Fig. 4, Fig. 4 is a schematic diagram of an optional structure after stored data is divided, provided by an embodiment of the present invention. Here the basic unit of data storage is the data block set. The data to be looked up is divided into multiple data blocks, the data blocks are combined into data block sets, and the data block set is the basic lookup object in a data lookup. That is, when looking up target data, the data block set in which the target data resides (the target data block set) is located first. The structure of each data block in a data block set can be understood with reference to Fig. 3.
Data blocks can be combined into data block sets along different dimensions. Examples include the following:
1) Combination based on the relevance between data blocks: data blocks that are related in one or more dimensions are combined into the corresponding data block set. The dimensions here may include time, region, and the object described.
Taking combination by the object the data describes as an example: the data blocks of high-precision electronic map data can be combined by the geographic area described (such as county, city, or street); for augmented reality map data, data blocks can be combined by the place where the virtual objects are located, with the data blocks of the virtual object data applied at multiple adjacent places combined into one data block set.
Taking combination by the time dimension of the data as an example: the multiple data blocks obtained by dividing data whose timestamps fall within one hour at a granularity of 10 minutes can be combined into a data block set in chronological order; for example, each data block set can store the data blocks of one hour.
2) Combination based on data capacity: all the data to be looked up is combined at a granularity of a specified capacity. For example, the data is divided into multiple data blocks in units of 100 megabytes, and the data blocks are combined into data block sets at a granularity of 1000 megabytes, so that each data block set includes 10 data blocks.
Note that the above ways of combining data blocks into data block sets are only examples; for instance, any predetermined number of data blocks can also be combined into a data block set. The embodiments of the present invention do not specifically limit how the data blocks obtained by dividing the data to be looked up are combined into data block sets; in practice, the ways above can be used individually or in combination.
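Combining a predetermined number of data blocks into each data block set, as in the 100-megabyte and 1000-megabyte example above, can be sketched as a simple chunking step. The function name `combine_into_sets` and the default of 10 blocks per set are assumptions taken from that example, not a prescribed interface.

```python
def combine_into_sets(blocks, blocks_per_set=10):
    """Combine consecutive data blocks into data block sets of a fixed size;
    the last set may hold fewer blocks."""
    if blocks_per_set <= 0:
        raise ValueError("blocks_per_set must be positive")
    return [
        blocks[start:start + blocks_per_set]
        for start in range(0, len(blocks), blocks_per_set)
    ]
```

The same helper covers the time-based example: with 10-minute blocks in chronological order, `blocks_per_set=6` would place one hour of blocks in each set.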
The way in which the data to be looked up is stored after division, as provided by the embodiments of the present invention, is described next.
For data blocks divided as shown in Fig. 3, refer to Fig. 5. Fig. 5 is a schematic diagram of an optional storage structure for storing data blocks, provided by an embodiment of the present invention. The storage structure (mapping table) shown in Fig. 5 maps the hash key (Hash Key) of each data block to the storage location of that data block; the mapping table provides a hash key and a corresponding storage location for every data block.
Each data block has a unique hash key and a corresponding storage location. Taking the data blocks shown in Fig. 5 as an example, data block 1 corresponds to Hash Key 11 and storage location 11, and data block 2 corresponds to Hash Key 12 and storage location 12.
For the hash key of a data block as shown in Fig. 5, the sequence number of the data block can be used directly as the hash key, or the hash key can be the hash value obtained by encoding the sequence number of the data block with a hash algorithm. For example, for the hash key of data block 1, the sequence number "1" can be used directly, or the hash value obtained by encoding the sequence number "1" with a hash algorithm can be used.
In this way, once the hash key of the data block (the target data block) to which the data to be looked up (the target data) belongs is known, the storage location of the target data block can be found from the mapping table. Based on that storage location, in non-volatile storage local to the data processing device or on the network side, the data block can be read into the memory of the data processing device, where the lookup for the target data continues. Because not all data blocks need to be read into memory, memory usage is reduced significantly.
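A mapping table of this kind can be sketched as a dictionary from hash key to storage location. The patent does not name a hash algorithm, so SHA-256 below is purely an assumption, and the file paths are hypothetical placeholders for storage locations.

```python
import hashlib


def block_hash_key(sequence_number, use_hash=True):
    """Derive a block's hash key: either the sequence number itself or a
    hash of it (SHA-256 is an assumption; no algorithm is specified)."""
    text = str(sequence_number)
    if not use_hash:
        return text
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical mapping table: hash key -> storage location of the block.
MAPPING_TABLE = {
    block_hash_key(1): "blocks/0001.bin",
    block_hash_key(2): "blocks/0002.bin",
}


def locate_block(first_keyword, table=MAPPING_TABLE):
    """Compare the keyword with each hash key in turn and return the
    storage location of the matching block, or None."""
    for hash_key, location in table.items():
        if hash_key == first_keyword:
            return location
    return None
```

Only the block at the returned location would then be read into memory, which is where the memory saving described above comes from.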
For data block sets divided as shown in Fig. 4, refer to Fig. 6. Fig. 6 is a schematic diagram of an optional storage structure (mapping table) for storing data block sets, provided by an embodiment of the present invention. The storage structure shown in Fig. 6 provides the mapping between the hash key of each data block set and the storage location of that data block set, as well as the mapping between the hash key of each data block in a data block set and the storage location of that data block.
Each data block set has a unique hash key and a corresponding storage location. Taking data block set 1 shown in Fig. 6 as an example, data block set 1 corresponds to Hash Key 1 and storage location 1. Each data block in data block set 1 also has a corresponding hash key and storage location; for example, data block 1 corresponds to Hash Key 1 and storage location 11, and data block 2 corresponds to Hash Key 2 and storage location 12.
For the hash key of a data block set as shown in Fig. 6, the sequence number of the data block set can be used directly as the hash key, or the hash key can be the hash value obtained by encoding the sequence number of the data block set with a hash algorithm. For example, for the hash key of data block set 1, the sequence number "1" can be used directly, or the hash value obtained by encoding the sequence number "1" with a hash algorithm can be used. The hash key of a data block as shown in Fig. 6 can be calculated in the same way as for the data block shown in Fig. 5.
Taking data 11 of data block 1 as an example, the corresponding hash key is Hash Key 1. In this way, once the hash key of the data block set to which the target data belongs and the hash key of the data block to which it belongs are known, the mapping table shown in Fig. 6 can be used to locate first the storage location of the data block set in which the target data resides (the target data block set), and then the storage location of the data block in which the target data resides (the target data block). The target data block is read from that storage location, for example from non-volatile storage local to the data processing device or on the network side, into the memory of the data processing device, where the lookup continues. Because neither all data block sets nor all data blocks in the target data block set need to be read into memory, memory usage is reduced significantly.
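The two-level mapping table of Fig. 6 could be approximated by a nested dictionary, as in the sketch below. The keys (`"set-1"`, `"blk-1"`) and the storage paths are hypothetical, and dictionary access stands in for the sequential hash key comparison described in the text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical two-level table:
# set hash key -> (set storage location, {block hash key -> block storage location})
MAPPING_TABLE: Dict[str, Tuple[str, Dict[str, str]]] = {
    "set-1": ("store/set-1", {"blk-1": "store/set-1/blk-1",
                              "blk-2": "store/set-1/blk-2"}),
    "set-2": ("store/set-2", {"blk-1": "store/set-2/blk-1"}),
}


def locate(set_key: str, block_key: str,
           table: Dict[str, Tuple[str, Dict[str, str]]] = MAPPING_TABLE
           ) -> Optional[Tuple[str, str]]:
    """Locate the target data block set first, then the target data block
    inside it; return both storage locations, or None if either is missing."""
    entry = table.get(set_key)
    if entry is None:
        return None
    set_location, block_locations = entry
    block_location = block_locations.get(block_key)
    if block_location is None:
        return None
    return set_location, block_location
```

Because block hash keys are scoped to their set in this sketch, the same block key (`"blk-1"`) can appear in several sets without ambiguity.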
The storage of data within the aforementioned data blocks is described next. Each piece of data in a data block is assigned a sequence number within the block, and the data are arranged in order based on these sequence numbers. Referring to Fig. 7, Fig. 7 is an optional schematic diagram of data arranged in order within a data block according to an embodiment of the present invention, in which the data in the data block are sorted by numeric sequence numbers.
Referring to Fig. 8, Fig. 8 is an optional schematic diagram of data arranged in order within a data block according to an embodiment of the present invention, in which the data in the data block are sorted by sequence numbers in lexicographic order. Of course, in practical applications the data in a data block may use any other form of ordered arrangement, for example one based on a combination of letters, numbers and symbols.
The structures for storing data provided by the embodiments of the present invention have now been described. The following describes how the keywords used to look up data are formed for the different storage structures provided by the embodiments of the present invention.
Data stored using the storage structure shown in Fig. 5 can be located using the hash key of the data block to which the data belongs and the sequence number of the data within that data block.
Referring to Fig. 9, Fig. 9 is an optional structural diagram of a keyword for looking up data according to an embodiment of the present invention, applied to looking up data stored using the storage structure shown in Fig. 5. The keyword of the data includes a first keyword and a second keyword, where the first keyword Hash Key is the sequence number of the data block to which the data belongs (or the hash value obtained by hash-encoding that sequence number), and the second keyword Main Key is the sequence number of the data within the data block to which it belongs.
Take data 11 shown in Fig. 9 as an example (its sequence number is 11; it belongs to the data block with sequence number 1, i.e. data block 1, and its sequence number within data block 1 is 11). The corresponding first keyword Hash Key11 may be "1", or the hash value obtained by hash-encoding "1", and the corresponding second keyword Main Key11 is "11".
Data stored using the storage structure shown in Fig. 6 can be located using the hash key of the data block set to which the data belongs, the hash key of the data block to which the data belongs, and the sequence number of the data within that data block.
Referring to Fig. 10, Fig. 10 is an optional structural diagram of a keyword for looking up data according to an embodiment of the present invention, applied to looking up data stored using the storage structure shown in Fig. 6. The keyword of the data includes a first keyword and a second keyword, where the first keyword Hash Key in turn includes a first sub-keyword Hash Key (1) and a second sub-keyword Hash Key (2). The first sub-keyword is the sequence number of the data block set to which the data belongs (or the hash value obtained by hash-encoding that sequence number), the second sub-keyword is the sequence number of the data block to which the data belongs (or the hash value obtained by hash-encoding that sequence number), and the second keyword Main Key is the sequence number of the data within the data block to which it belongs.
Take data 11 shown in Fig. 10 as an example (its sequence number is 11; it belongs to the data block set with sequence number 1, i.e. data block set 1; within data block set 1 it belongs to the data block with sequence number 11, i.e. data block 11; and its sequence number within data block 11 is 11). The corresponding first sub-keyword Hash Key1 may be "1", or the hash value obtained by hash-encoding "1"; the corresponding second sub-keyword Hash Key11 may be "11", or the hash value obtained by hash-encoding "11"; and the corresponding second keyword Main Key11 is "11".
The formation of keywords for the stored data provided by the embodiments of the present invention has now been described. The following describes the process of searching for target data when the keyword of the data to be found (i.e. the target data) is already known.
Referring to Fig. 11, Fig. 11 is an optional flow diagram of a data processing method according to an embodiment of the present invention, which can be applied to the aforementioned data processing device. The data to be searched are stored in the nonvolatile storage space (e.g. flash memory or hard disk) of the data processing device. The data block to which the target data belongs (the target data block) can be read from the nonvolatile storage space into memory space, and the search for the target data continues within the target data block, so that target data with a known keyword is found efficiently and with low memory usage. The steps shown in Fig. 11 are described below.
Step 101, the first keyword and the second keyword are extracted from the keyword of target data.
Depending on the storage structure used for the data (e.g. the storage structures shown in Fig. 5 and Fig. 6), the data use a keyword of the corresponding type. When the data use the storage structure shown in Fig. 5, they use the keyword type shown in Fig. 9, which includes the hash key of the data block to which the data belongs and the sequence number of the data within that data block; when the data use the storage structure shown in Fig. 6, they use the keyword type shown in Fig. 10.
As an example, the first keyword and the second keyword can be distinguished based on lengths pre-allocated within the keyword. For example, for a keyword of length 50, the section 0-25 corresponds to the first keyword and the section 26-50 corresponds to the second keyword. Of course, the lengths of the first keyword and the second keyword are determined according to actual conditions, and the above is only an example.
It should be understood that the above length may be the length of a storage space in memory. A storage space of length 50 (bits or bytes) is allocated in memory space; the first keyword is stored in the 0-25 part of the storage space and the second keyword in the 26-50 part, so that the first keyword and the second keyword can be distinguished by length in memory space.
As another example, the first keyword and the second keyword can be distinguished based on a specific separator such as ":" or "-".
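As a non-limiting illustration, the two ways of distinguishing the first keyword from the second keyword in step 101 might be sketched as follows. The function names and the sample keywords are hypothetical, and a character count stands in for the bit or byte length mentioned above.

```python
def split_by_length(key: str, first_len: int) -> tuple[str, str]:
    """Fixed layout: the first `first_len` characters hold the first
    keyword and the remainder holds the second keyword."""
    return key[:first_len], key[first_len:]


def split_by_separator(key: str, sep: str = ":") -> tuple[str, str]:
    """Separator layout: split at the first occurrence of `sep`."""
    first, found, second = key.partition(sep)
    if not found:
        raise ValueError(f"separator {sep!r} not found in {key!r}")
    return first, second


fixed = split_by_length("0001" + "0011", first_len=4)
delimited = split_by_separator("1:11")
```

The fixed-length layout needs no reserved character but requires padding; the separator layout allows variable-length keywords provided the separator never occurs inside the first keyword.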
Step 102, taking the first keyword of the target data as an index, compare it in turn with the hash key of each data block.
In one embodiment, where the data to be searched use the storage structure shown in Fig. 5, the data to be searched are divided into multiple data blocks for storage, and the target data is stored in one of those data blocks. Accordingly, the first keyword of the target data is the hash key of the data block to which the target data belongs (the target data block). The mapping table shown in Fig. 5 (which includes the hash key of each data block and the storage location of the corresponding data block) is read into memory space, and the first keyword, taken as an index, is compared one by one with the Hash Key of each data block in the mapping table in memory space.
In one embodiment, where the data to be searched use the storage structure shown in Fig. 6, the data to be searched are divided into multiple data blocks, the data blocks are combined to form multiple data block sets, and the target data is stored in one data block (the target data block) of one data block set (the target data block set).
Accordingly, the first keyword of the target data includes a first sub-keyword (i.e. the hash key of the data block set to which the target data belongs) and a second sub-keyword (i.e. the hash key of the data block to which the target data belongs within the target data block set).
As for comparing the first keyword of the target data, taken as an index, in turn with the hash key of each data block: referring to Fig. 12, Fig. 12 is an optional flow diagram of this comparison according to an embodiment of the present invention, which involves the following steps:
Step 1021, read the mapping table shown in Fig. 6 (which includes the mapping between the hash key of each data block set and the storage location of the corresponding data block set, and the mapping between the hash key of each data block in a data block set and the storage location of the corresponding data block) into memory space.
Step 1022, taking the first sub-keyword of the target data as an index, compare it with the hash key of each data block set in the mapping table in memory space to obtain the storage location of the target data block set to which the target data belongs, and read the target data block set based on that storage location.
Step 1023, taking the second sub-keyword of the target data as an index, compare it with the hash key of each data block of the target data block set in the mapping table to obtain the storage location, in nonvolatile storage space, of the target data block to which the target data belongs.
Step 1024, based on the storage location, read the target data block from the target data block set stored in nonvolatile storage space into memory space.
Step 103, based on the storage location to which the first keyword maps when the comparison succeeds, obtain the target data block whose hash key is the first keyword.
In one embodiment, where the data to be searched use the storage structure shown in Fig. 5, comparing the first keyword, taken as an index, one by one with the Hash Key of each data block in the mapping table shown in Fig. 5 yields the storage location of the target data block to which the target data belongs, so that the target data block is read from that storage location in nonvolatile storage space into memory space.
For example, for data 11 shown in Fig. 9, the first keyword Hash Key is Hash Key11 and the second keyword Main Key is Main Key11. By comparing Hash Key11 with the Hash Key of each data block in the mapping table shown in Fig. 5, storage location11 of data block 1, to which data 11 belongs, can be obtained, and data block 1 can be read from storage location11 in nonvolatile storage space into memory space. As can be seen, it is not necessary to read all data blocks into memory space; compared with reading all data blocks into memory space, this significantly reduces memory usage.
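As a non-limiting illustration, the hash lookup of steps 102 and 103 for the storage structure of Fig. 5 might be sketched as follows. The names `DISK`, `MAPPING_TABLE`, `find_block_location` and `load_target_block` are hypothetical, and a dictionary stands in for the nonvolatile storage space.

```python
# Hypothetical stand-in for nonvolatile storage: location -> block contents.
DISK = {
    "loc_1": [11, 12, 13],
    "loc_2": [21, 22, 23],
}

# Mapping table in the style of Fig. 5: (block hash key, storage location).
MAPPING_TABLE = [("1", "loc_1"), ("2", "loc_2")]


def find_block_location(first_keyword, mapping_table):
    """Compare the first keyword with each block's hash key in turn."""
    for block_hash_key, location in mapping_table:
        if block_hash_key == first_keyword:
            return location  # comparison succeeded
    return None  # no block has this hash key


def load_target_block(first_keyword):
    """Read only the target data block into memory; other blocks stay on disk."""
    location = find_block_location(first_keyword, MAPPING_TABLE)
    if location is None:
        return None
    return DISK[location]


block = load_target_block("1")
```

Only the mapping table and one block are held in memory at any time, which is the memory saving the embodiment describes.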
In another embodiment, where the data to be searched use the storage structure shown in Fig. 6, the data to be searched are divided into multiple data blocks and the data blocks are combined into multiple data block sets. Comparing the first sub-keyword of the target data, taken as an index, with the hash key of each data block set in the mapping table shown in Fig. 6 yields the storage location in nonvolatile storage space of the target data block set to which the target data belongs. Comparing the second sub-keyword of the target data with the hash key of each data block of the target data block set in the mapping table shown in Fig. 6 then yields, within the storage location of the target data block set in nonvolatile storage space, the storage location of the data block to which the target data belongs (the target data block), so that the target data block can be read from that storage location into memory space.
For example, for data 11 shown in Fig. 10, the first sub-keyword Hash Key (1) is Hash Key1, the second sub-keyword Hash Key (2) is Hash Key11, and the second keyword Main Key is Main Key11. By comparing Hash Key1 with the hash key of each data block set in the mapping table shown in Fig. 6, storage location1 of data block set 1, to which data 11 belongs, can be obtained. By comparing Hash Key11 with the hash key of each data block in the mapping table shown in Fig. 6, storage location11 of data block 1, to which data 11 belongs, can be obtained; within storage location1 in nonvolatile storage space, storage location12 of data block 1 in nonvolatile storage space can be further obtained, so that data block 1 can be read from storage location12 into memory space. As can be seen, it is not necessary to read all data block sets, or all data blocks in the target data block set, into memory; compared with reading all data block sets into memory space, this significantly reduces memory usage.
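As a non-limiting illustration, the two-level lookup of steps 1021 to 1024 for the storage structure of Fig. 6 might be sketched as follows. The nested layout of `MAPPING_TABLE`, the location strings and the function name `locate_block` are assumptions for this sketch, and dictionary lookups are used in place of the one-by-one comparison described above.

```python
# Mapping table in the style of Fig. 6 (hypothetical layout):
# set hash key -> (set location, {block hash key -> block location}).
MAPPING_TABLE = {
    "1": ("set_loc_1", {"11": "block_loc_11", "12": "block_loc_12"}),
    "2": ("set_loc_2", {"21": "block_loc_21"}),
}


def locate_block(first_sub_key, second_sub_key, table):
    """Return (set location, block location), or None if either level misses."""
    entry = table.get(first_sub_key)             # level 1: data block set
    if entry is None:
        return None
    set_location, blocks = entry
    block_location = blocks.get(second_sub_key)  # level 2: block in that set
    if block_location is None:
        return None
    return set_location, block_location


found = locate_block("1", "11", MAPPING_TABLE)
```

Once the block location is known, only that block needs to be read into memory; the other blocks of the set and the other sets are never loaded.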
The data block lookup provided by steps 102 and 103 is performed based on the hash keys of data blocks, and is also referred to as hash lookup in the embodiments of the present invention.
Step 104, taking the second keyword as an index, compare it in turn with the median of the index of the target data block and with the medians of the indexes obtained by recursively subdividing the index of the target data block; the index includes the sequence numbers of the data in the target data block arranged in order.
Step 105, based on the storage location to which the second keyword maps when the comparison succeeds, read the target data from the corresponding storage location of the target data block.
In one embodiment, step 104 involves the following steps:
Step 1041, taking the second keyword as an index, compare it with the median of the index of the target data block; execute step 1042 when the comparison succeeds, and execute step 1043 when it fails.
Step 1042, a successful comparison means that the second keyword is consistent with the median; based on the storage location to which the median maps in the target data block, the target data can be read from the corresponding storage location in memory space.
Step 1043, when the comparison does not succeed, divide the index of the target data block into a first index and a second index, the first index being the one in which the value of the second keyword lies.
For example, when the index is (1, 2, 3, 4, 5), the median is 3. If the second keyword is 4, the comparison with the median 3 does not succeed, so the index is split around the median (as evenly as possible: the index is split equally when the number of sequence numbers is even, and when it is odd the numbers of sequence numbers in the first index and the second index differ by 1), forming a first index (4, 5) and a second index (1, 2). Because the index is in ascending order and the second keyword 4 is greater than the median 3 of the original index, it is preliminarily determined that the value of the second keyword lies in the new first index (4, 5), which covers the higher values.
Step 1044, compare the second keyword with the median of the first index; execute step 1042 when the comparison succeeds, and execute step 1045 when it fails.
Step 1045, when the comparison does not succeed, divide the first index into a new first index and a new second index, the new first index being the one in which the value of the second keyword lies, and return to step 1043, until the comparison succeeds or the new first index does not contain the second keyword.
Continuing the above example, since the first index (4, 5) has only two values, one of them is chosen at random as the median and compared with the second keyword 4. If the value chosen is 4, the comparison succeeds; if the value chosen is 5, the value 4 to its left is compared next, and the comparison succeeds. As another example, when the second keyword is 3.5, the comparison still cannot succeed after the original first index has been subdivided, so the comparison is judged to have failed: the target data block does not contain the target data.
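As a non-limiting illustration, steps 1041 to 1045 might be sketched as the following recursive function. The name `search_index` and the `offset` parameter are hypothetical. For an index with an even number of values, the sketch always takes the upper of the two middle values as the median rather than choosing one at random.

```python
def search_index(index, key, offset=0):
    """Return the position of `key` in the ascending `index`, or None.

    Compare with the median; on a mismatch keep only the half that can
    contain the key and repeat on that half.
    """
    if not index:
        return None              # nothing left to subdivide: lookup fails
    mid = len(index) // 2
    median = index[mid]
    if key == median:
        return offset + mid      # comparison succeeded
    if key > median:
        # The key can only lie in the higher-valued half.
        return search_index(index[mid + 1:], key, offset + mid + 1)
    # The key can only lie in the lower-valued half.
    return search_index(index[:mid], key, offset)


index = [1, 2, 3, 4, 5]
hit = search_index(index, 4)     # median 3, then (4, 5), then (4)
miss = search_index(index, 3.5)  # the sub-index runs out without a match
```

The returned position plays the role of the storage location to which the second keyword maps within the target data block.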
The lookup provided by steps 104 and 105 is a recursive halving search within the data block based on the second keyword, and is also referred to as binary search in the embodiments of the present invention.
The following description uses data search in an augmented reality map as an example.
Local data search algorithms include hash lookup and binary search. Hash lookup used alone (i.e. searching the hash keys of all data, by traversal, using only the hash key of the target data) has the advantage of fast queries; its disadvantages are that it consumes a large amount of memory, being a typical trade of space for time, and that key collisions must be resolved. Binary search used alone (i.e. searching the recursively subdivided sequence number index using only the sequence number of the target data, comparing only with the median of the index each time, until the data is found or the index is exhausted) has the advantages of being simple, fast to query, and requiring no extra memory space; its disadvantages are that the data must be kept in order, that insertion and deletion are costly, and that the average number of comparisons is high when searching large volumes of data.
The embodiments of the present invention provide a way of combining hash lookup and binary search: the data search is divided into two stages, hash lookup and binary search, and relies on two data structures. One data structure is the keyword (Key) used for data search; the other is the mapping table that stores the Hash Key of each data block and the storage location of the data block. The two data structures are described below.
The in-memory structure of the keyword of the data is shown in Table 1 below:
Hash Key | Main Key
Table 1
The keyword (Key) of the data is divided into two parts. The first part is the Hash Key, used in the hash lookup process, i.e. the Hash Key is used to find in the mapping table the data block on which binary search is to be performed. The second part is the Main Key, used mainly in the binary search process, i.e. binary search is performed with the Main Key within the data block that was found.
For example, the map of an augmented reality service contains sets of multiple places (POIs, points of interest) used for delivering specified tasks (a place is a target that the user can see on the augmented reality map, and multiple POIs make up a set). To store the relationship between sets and POIs, the ID of the set (the unique identifier of the set) and the ID of the POI (the unique identifier of the POI) can be combined into one keyword. The ID of the set can serve as the Hash Key in the keyword (it can also be obtained by encoding the ID of the set with a hash algorithm), and POI_ID serves as the Main Key of the keyword.
The mapping table of Hash Key and data block storage location
The elements stored in the mapping table consist mainly of two parts. The first part is the Hash Key in the keyword (which may be the ID of a set of places in the augmented reality map, or a hash value calculated from the ID of the set). The second part is the storage location of the data block corresponding to the Hash Key; that data block is used for binary search.
For the keyword (Key) of a piece of data to be found, as long as the Hash Key in that keyword exists in the mapping table, the storage location of the data block to which the data belongs can be obtained quickly, and the data block can then be read into memory space for binary search.
Data search flow
Referring to Fig. 13, Fig. 13 is an optional flow diagram of a data search method according to an embodiment of the present invention, which involves the following steps:
First, the Key of the target data is split to obtain the corresponding Hash Key and Main Key, where the Hash Key is used for hash lookup and the Main Key is used for binary search.
Second, the obtained Hash Key is compared with the Hash Key of each data block stored in the mapping table to see whether a matching Hash Key can be found. If not, there is no data block corresponding to this Hash Key and the search fails. If so, there is a data block corresponding to this Hash Key, and that data block is read into memory space for the subsequent binary search.
Third, binary search is performed on the retrieved data block using the Main Key in the keyword.
The binary search proceeds as follows. First, the middle node of the data block is found and, taking the middle node as the boundary, the data block is divided into a front part and a rear part; the Main Key is then compared with the middle node. If the Main Key equals the middle node, the search succeeds and the corresponding static data has been found. If the Main Key is less than the middle node, the above process is repeated on the front half until there is no middle node (indicating that the search has failed). If the Main Key is greater than the middle node, the above process is repeated on the rear half until there is no middle node (indicating that the search has failed).
For example, suppose there are 7 POIs on the augmented reality map with IDs 1, 2, 3, 4, 5, 6 and 7. These 7 POIs are divided into 2 sets whose IDs are 1001 and 1002, giving the sets 1001:[1,3,7] and 1002:[2,4,5,6]. The resulting data arrangement is as follows:
1001:[1,3,7]
1002:[2,4,5,6]
Suppose the static Key to be found is 1002:5 (the ID of the set is 1002 and the ID of the POI is 5). According to the above lookup method, the search proceeds as follows:
First, obtain the Hash Key and the Main Key in the static Key, namely 1002 and 5.
Then, use 1002 to find the corresponding data block in the mapping table, i.e. [2,4,5,6].
Finally, perform binary search on [2,4,5,6] with 5; through comparison, the storage location to which 5 maps is eventually found.
The information of the POI can be read from that storage location, for data verification or for display to the user.
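As a non-limiting illustration, the worked example above (looking up the static Key 1002:5) might be sketched end to end as follows. The names `BLOCKS` and `lookup` are hypothetical, the blocks are kept directly in a dictionary instead of being read from a storage location, and `bisect_left` from the Python standard library performs the binary search.

```python
from bisect import bisect_left

# Mapping table: set ID (Hash Key) -> sorted POI IDs of that set (data block).
BLOCKS = {"1001": [1, 3, 7], "1002": [2, 4, 5, 6]}


def lookup(key: str):
    """Return (block, position) for a key such as "1002:5", or None."""
    hash_key, _, main_key = key.partition(":")  # split the Key
    if not main_key.isdigit():
        return None                             # malformed Key
    block = BLOCKS.get(hash_key)                # stage 1: hash lookup
    if block is None:
        return None                             # no block for this Hash Key
    target = int(main_key)
    pos = bisect_left(block, target)            # stage 2: binary search
    if pos < len(block) and block[pos] == target:
        return block, pos
    return None                                 # Main Key not in the block


result = lookup("1002:5")
```

Here `result` holds the block [2, 4, 5, 6] and position 2, which stands in for the storage location to which the Main Key 5 maps.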
When the above lookup method is applied to the object locating function of augmented reality (for example, the objects may be various virtual objects added to the real environment, such as treasure props), search efficiency can in theory be improved by up to 50% compared with using binary search directly. That is, where a user originally needed 5 seconds to see the complete POI information, the faster search means the information can be found and displayed in only 2.5 seconds.
The functional structure of the data processing device provided by the embodiments of the present invention is described below. Referring to Fig. 14, Fig. 14 is an optional structural diagram of the data processing device according to an embodiment of the present invention, including: an extraction unit 210, a first searching unit 220, a first acquisition unit 230, a second searching unit 240 and a second acquisition unit 250; in addition, it may further include a keyword unit 260.
The extraction unit 210 is configured to extract the first keyword and the second keyword from the keyword of the target data;
The first searching unit 220 is configured to take the first keyword as an index and compare it in turn with the hash key of each data block;
The first acquisition unit 230 is configured to obtain, based on the storage location to which the first keyword maps when the comparison succeeds, the target data block whose hash key is the first keyword;
The second searching unit 240 is configured to take the second keyword as an index and compare it in turn with the median of the index of the target data block and with the median of the index after recursive subdivision of the target data block, wherein the index includes the sequence numbers of the data in the target data block arranged in order;
The second acquisition unit 250 is configured to obtain, based on the storage location to which the second keyword maps when the comparison succeeds, the target data from the corresponding storage location of the target data block.
In one embodiment, the keyword unit 260 is configured to form the hash key of the corresponding data block based on the sequence number of each data block; to assign sequence numbers to the data in each data block in order according to their arrangement; and to combine the hash key of the data block to which each piece of data belongs with the sequence number assigned to that piece of data, forming the keyword of the corresponding data.
In one embodiment, the keyword unit 260 is configured to hash-encode the sequence number of each data block to obtain the hash key of the corresponding data block, or to use the sequence number of each data block as the hash key of the corresponding data block.
In one embodiment, the first searching unit 220 is further configured to read a mapping table, the mapping table including the mapping between the hash key of each data block and the corresponding storage location; and to take the first keyword as an index and compare it with the hash key of each data block in the mapping table.
In one embodiment, the first searching unit 220 is further configured to extract a first sub-keyword and a second sub-keyword from the first keyword; to take the first sub-keyword as an index and compare it in turn with the hash key of each data block set; to obtain, based on the storage location to which the first sub-keyword maps when the comparison succeeds, the target data block set whose hash key is the first sub-keyword; and to take the second sub-keyword as an index and compare it in turn with the hash key of each data block in the target data block set, obtaining the target data block in the target data block set whose hash key is the second keyword.
In one embodiment, the first acquisition unit 230 is further configured to read, based on the storage location to which the first keyword maps when the comparison succeeds, the target data block whose hash key is the first keyword from the corresponding storage location in nonvolatile storage space into memory space.
In one embodiment, the second acquisition unit 250 is further configured to read, based on the storage location in the target data block to which the second keyword maps when the comparison succeeds, the target data from the corresponding storage location of the target data block stored in the memory space.
In one embodiment, the second searching unit 240 is further configured to take the second keyword as an index and compare it with the median of the index of the target data block; when the comparison does not succeed, to divide the index of the target data block equally into a first index and a second index, the first index being the one in which the value of the second keyword lies; to compare the second keyword with the median of the first index; when the comparison does not succeed, to divide the first index into a new first index and a new second index, the new first index being the one in which the value of the second keyword lies; and to compare the second keyword with the median of the new first index, until the comparison succeeds or the new first index does not contain the second keyword.
An embodiment of the present invention also provides a storage medium storing executable instructions for executing the data processing method provided by the embodiments of the present invention as shown in any of Fig. 11, Fig. 12 and Fig. 13. The storage medium described in the embodiments of the present invention may be a storage medium such as an optical disc, flash memory or magnetic disk, and is optionally a non-transitory storage medium.
In conclusion, the embodiments of the present invention have the following advantages:
The first keyword is compared with the hash key of each data block to determine the target data block to which the target data belongs, and the search then continues within the target data block, thereby combining lookup based on hash keys with lookup based on sequence numbers in a recursively subdivided index. On the one hand, this avoids the problem of occupying a large amount of memory space caused by reading all data blocks and traversing each of them. On the other hand, it avoids the low search efficiency caused by searching only the recursively subdivided index using the sequence number of the data, and so improves search efficiency.
Those skilled in the art will understand that all or part of the steps of the above method embodiments can be completed by hardware under the instruction of a program. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, RAM, ROM, a magnetic disk or an optical disc.
Alternatively, if the above integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence or in the part that contributes to the related art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods of the embodiments of the present invention. The storage medium includes various media that can store program code, such as a removable storage device, RAM, ROM, a magnetic disk or an optical disc.
The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (16)

1. A data processing method, characterized by comprising:
extracting a first keyword and a second keyword from the keyword of target data;
taking the first keyword as an index, comparing it in turn with the hash key of each data block;
based on the storage location to which the first keyword maps when the comparison succeeds, obtaining the target data block whose hash key is the first keyword;
taking the second keyword as an index, comparing it in turn with the median of the index of the target data block and with the median of the index after recursive subdivision of the target data block, wherein the index includes the sequence numbers of the data in the target data block arranged in order;
based on the storage location to which the second keyword maps when the comparison succeeds, obtaining the target data from the corresponding storage location of the target data block.
2. The method according to claim 1, characterized by further comprising:
forming the hash key of the corresponding data block based on the sequence number of each data block;
assigning sequence numbers to the data in each data block in order according to their arrangement;
combining the hash key of the data block to which each piece of data belongs with the sequence number assigned to that piece of data, to form the keyword of the corresponding data.
3. The method of claim 2, characterized in that forming the hash key of each data block based on the sequence number of that data block comprises:
hash-encoding the sequence number of each data block to obtain the hash key of that data block, or using the sequence number of each data block as the hash key of that data block.
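The key construction in claims 2 and 3 can be sketched as follows. This is an illustrative assumption, not the patent's encoding: the choice of SHA-256 truncated to eight hex characters, the `":"` separator, and the function names are all hypothetical.

```python
import hashlib
from typing import Dict, List


def block_hash_key(block_seq: int, encode: bool = True) -> str:
    """Form a block's hash key from its sequence number.

    encode=True  -> hash-encode the sequence number (first option in claim 3)
    encode=False -> use the sequence number itself (second option in claim 3)
    """
    if encode:
        digest = hashlib.sha256(str(block_seq).encode("utf-8")).hexdigest()
        return digest[:8]  # truncation length is an arbitrary choice here
    return str(block_seq)


def build_keys(blocks: List[List[str]], encode: bool = True) -> Dict[str, str]:
    """Assign a composite keyword to every piece of data.

    Blocks and the data inside them are numbered from 1 in their given order;
    each keyword is "<block hash key>:<sequence number within the block>".
    """
    keys: Dict[str, str] = {}
    for block_seq, items in enumerate(blocks, start=1):
        hash_key = block_hash_key(block_seq, encode)
        for item_seq, item in enumerate(items, start=1):
            keys[f"{hash_key}:{item_seq}"] = item
    return keys
```

Calling `build_keys([["a", "b"], ["c"]], encode=False)` produces the keywords `1:1`, `1:2`, and `2:1`, which makes the two-part structure of each keyword easy to see.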
4. The method of claim 1, characterized in that using the first keyword as an index and comparing it in turn with the hash key of each data block comprises:
reading a mapping table into memory, the mapping table comprising the mapping between the hash key of each data block and the corresponding storage location;
using the first keyword as an index, comparing it with the hash key of each data block in the mapping table.
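A minimal sketch of the mapping-table comparison in claim 4 follows. The JSON serialization, the `hash_key` and `offset` field names, and the use of a byte offset as the storage location are assumptions made for this example only.

```python
import json
from typing import List, Optional, Tuple


def load_mapping_table(serialized: str) -> List[Tuple[str, int]]:
    """Read the mapping table into memory as (hash key, storage offset) pairs."""
    return [(entry["hash_key"], entry["offset"]) for entry in json.loads(serialized)]


def locate_block(table: List[Tuple[str, int]], first_keyword: str) -> Optional[int]:
    """Compare the first keyword with each hash key in the table in turn.

    Returns the mapped storage location on a match, otherwise None.
    """
    for hash_key, offset in table:
        if hash_key == first_keyword:
            return offset
    return None


# Hypothetical serialized table with two data blocks.
TABLE_JSON = '[{"hash_key": "b1", "offset": 0}, {"hash_key": "b2", "offset": 4096}]'
```

Because the table is held in memory, the comparison itself touches no disk; only the subsequent read of the target data block does.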
5. The method of claim 1, characterized in that using the first keyword as an index, comparing it in turn with the hash key of each data block, and, based on the storage location to which the first keyword maps when the comparison succeeds, obtaining the target data block whose hash key is the first keyword comprises:
extracting a first sub-keyword and a second sub-keyword from the first keyword;
using the first sub-keyword as an index, comparing it in turn with the hash key of each data block set;
based on the storage location to which the first sub-keyword maps when the comparison succeeds, obtaining the target data block set whose hash key is the first sub-keyword;
using the second sub-keyword as an index, comparing it in turn with the hash key of each data block in the target data block set, and obtaining, from the target data block set, the target data block whose hash key is the second keyword.
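The two-level block lookup of claim 5 can be sketched as below. The `"<set key>/<block key>"` format of the first keyword, the nested-dictionary layout, and the assumption that blocks inside a set are keyed by the second sub-keyword are illustrative choices, not details confirmed by the claim text.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: set hash key -> (block hash key -> block contents)
BlockSet = Dict[str, List[str]]


def split_first_keyword(first: str) -> Tuple[str, str]:
    """Extract the first and second sub-keyword (assumed "<set>/<block>" format)."""
    set_key, block_key = first.split("/")
    return set_key, block_key


def find_target_block(block_sets: Dict[str, BlockSet], first: str) -> Optional[List[str]]:
    first_sub, second_sub = split_first_keyword(first)

    # Level 1: compare the first sub-keyword with each block set's hash key.
    target_set = None
    for set_key, block_set in block_sets.items():
        if set_key == first_sub:
            target_set = block_set
            break
    if target_set is None:
        return None

    # Level 2: compare the second sub-keyword with each block's hash key
    # inside the target set only.
    for block_key, block in target_set.items():
        if block_key == second_sub:
            return block
    return None


SETS = {
    "s1": {"b1": ["a"], "b2": ["b"]},
    "s2": {"b1": ["c"]},
}
```

Grouping blocks into sets narrows the second comparison to the blocks of one set, so the number of comparisons depends on the number of sets plus the size of one set rather than on the total number of blocks.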
6. The method of claim 1, characterized in that obtaining, based on the storage location to which the first keyword maps when the comparison succeeds, the target data block whose hash key is the first keyword comprises:
based on the storage location to which the first keyword maps when the comparison succeeds, reading the target data block whose hash key is the first keyword from the corresponding storage location of the non-volatile storage space into memory.
7. The method of claim 6, characterized in that obtaining the target data from the corresponding storage location of the target data block, based on the storage location to which the second keyword maps when the comparison succeeds, comprises:
based on the storage location within the target data block to which the second keyword maps when the comparison succeeds, reading the target data from the corresponding storage location of the target data block stored in memory.
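Claims 6 and 7 describe reading the target block from non-volatile storage into memory and then reading the target data from the in-memory copy. A rough sketch, assuming fixed-width records, a single block file, and an (offset, length) pair as the storage location (all hypothetical choices for this example):

```python
import os
import tempfile
from typing import Dict, List, Optional, Tuple

RECORD_SIZE = 8  # hypothetical fixed record width in bytes


def write_blocks(path: str, blocks: Dict[str, List[str]]) -> Dict[str, Tuple[int, int]]:
    """Persist blocks to a file and return hash key -> (offset, length)."""
    table: Dict[str, Tuple[int, int]] = {}
    with open(path, "wb") as f:
        for hash_key, records in blocks.items():
            offset = f.tell()
            for record in records:
                f.write(record.encode("utf-8").ljust(RECORD_SIZE, b" "))
            table[hash_key] = (offset, len(records) * RECORD_SIZE)
    return table


def read_block(path: str, table: Dict[str, Tuple[int, int]], first: str) -> Optional[bytes]:
    """Read the whole target block from disk into memory (claim 6)."""
    location = table.get(first)
    if location is None:
        return None
    offset, length = location
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def read_record(block: bytes, position: int) -> Optional[str]:
    """Read one record from the in-memory block (claim 7)."""
    start = position * RECORD_SIZE
    if position < 0 or start >= len(block):
        return None
    return block[start:start + RECORD_SIZE].decode("utf-8").rstrip()


def demo() -> Optional[str]:
    """Write two sample blocks, load block "b1", and read its second record."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "blocks.bin")
        table = write_blocks(path, {"b1": ["alpha", "beta"], "b2": ["gamma"]})
        block = read_block(path, table, "b1")
        return read_record(block, 1) if block is not None else None
```

Only one disk read is issued per lookup in this sketch; the search inside the block then runs entirely against memory.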
8. The method of claim 1, characterized in that using the second keyword as an index and comparing it in turn with the median of the index of the target data block and with the medians of the indexes obtained by recursively subdividing the index of the target data block comprises:
using the second keyword as an index, comparing it with the median of the index of the target data block;
when the comparison does not succeed, dividing the index of the target data block into a first index and a second index, and determining the first index as the one in which the value of the second keyword lies;
comparing the second keyword with the median of the first index;
when the comparison does not succeed, dividing the first index into a new first index and a new second index, and determining the new first index as the one in which the value of the second keyword lies;
comparing the second keyword with the median of the new first index, until the comparison succeeds or the new first index does not contain the second keyword.
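The recursive subdivision in claim 8 corresponds to a binary search over the block's ordered index. A minimal recursive sketch, assuming the index is an ascending list of integer sequence numbers and that the storage location is simply the position in that list (both assumptions for illustration):

```python
from typing import List, Optional


def search_index(
    index: List[int], second: int, lo: int = 0, hi: Optional[int] = None
) -> Optional[int]:
    """Return the position of `second` in the ordered `index`, or None.

    Each call compares `second` with the median of index[lo..hi]; on a miss,
    the range is split in two and the search continues in the half whose
    value range contains `second`.
    """
    if hi is None:
        hi = len(index) - 1
    if lo > hi:
        # The remaining sub-index is empty: the keyword is not present.
        return None
    mid = (lo + hi) // 2
    if index[mid] == second:
        return mid
    if second < index[mid]:
        return search_index(index, second, lo, mid - 1)
    return search_index(index, second, mid + 1, hi)
```

For an index of n sequence numbers this performs at most about log2(n) + 1 comparisons, which is why the index must hold the sequence numbers in order.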
9. A data processing device, characterized by comprising:
an extraction unit, configured to extract a first keyword and a second keyword from the keyword of target data;
a first searching unit, configured to use the first keyword as an index and compare it in turn with the hash key of each data block;
a first acquisition unit, configured to obtain, based on the storage location to which the first keyword maps when the comparison succeeds, the target data block whose hash key is the first keyword;
a second searching unit, configured to use the second keyword as an index and compare it in turn with the median of the index of the target data block and with the medians of the indexes obtained by recursively subdividing the index of the target data block; wherein the index comprises the sequence numbers of the data arranged in order in the target data block;
a second acquisition unit, configured to obtain, based on the storage location to which the second keyword maps when the comparison succeeds, the target data from the corresponding storage location of the target data block.
10. The data processing device of claim 9, characterized by further comprising:
a keyword unit, configured to form the hash key of each data block based on the sequence number of that data block; to assign sequence numbers to the data in each data block according to their arrangement and order within the block; and to combine the hash key of the data block to which each piece of data belongs with the sequence number assigned to that piece of data, to form the keyword of that piece of data.
11. The data processing device of claim 10, characterized in that
the keyword unit is further configured to hash-encode the sequence number of each data block to obtain the hash key of that data block, or to use the sequence number of each data block as the hash key of that data block.
12. The data processing device of claim 9, characterized in that
the first searching unit is further configured to read a mapping table into memory, the mapping table comprising the mapping between the hash key of each data block and the corresponding storage location; and to use the first keyword as an index and compare it with the hash key of each data block in the mapping table.
13. The data processing device of claim 9, characterized in that
the first searching unit is further configured to extract a first sub-keyword and a second sub-keyword from the first keyword; to use the first sub-keyword as an index and compare it in turn with the hash key of each data block set; to obtain, based on the storage location to which the first sub-keyword maps when the comparison succeeds, the target data block set whose hash key is the first sub-keyword; and to use the second sub-keyword as an index, compare it in turn with the hash key of each data block in the target data block set, and obtain, from the target data block set, the target data block whose hash key is the second keyword.
14. The data processing device of claim 9, characterized in that
the first acquisition unit is further configured to read, based on the storage location to which the first keyword maps when the comparison succeeds, the target data block whose hash key is the first keyword from the corresponding storage location of the non-volatile storage space into memory.
15. The data processing device of claim 14, characterized in that
the second acquisition unit is further configured to read, based on the storage location within the target data block to which the second keyword maps when the comparison succeeds, the target data from the corresponding storage location of the target data block stored in memory.
16. The data processing device of claim 9, characterized in that
the second searching unit is further configured to use the second keyword as an index and compare it with the median of the index of the target data block; when the comparison does not succeed, to divide the index of the target data block into a first index and a second index, and determine the first index as the one in which the value of the second keyword lies; to compare the second keyword with the median of the first index; when the comparison does not succeed, to divide the first index into a new first index and a new second index, and determine the new first index as the one in which the value of the second keyword lies; and to compare the second keyword with the median of the new first index, until the comparison succeeds or the new first index does not contain the second keyword.
CN201710132651.XA 2017-03-07 2017-03-07 Data processing method and device Active CN108572958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710132651.XA CN108572958B (en) 2017-03-07 2017-03-07 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710132651.XA CN108572958B (en) 2017-03-07 2017-03-07 Data processing method and device

Publications (2)

Publication Number Publication Date
CN108572958A true CN108572958A (en) 2018-09-25
CN108572958B CN108572958B (en) 2022-07-29

Family

ID=63577062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710132651.XA Active CN108572958B (en) 2017-03-07 2017-03-07 Data processing method and device

Country Status (1)

Country Link
CN (1) CN108572958B (en)

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1858747A (en) * 2006-04-30 2006-11-08 北京金山软件有限公司 Data storage/searching method and system
CN101594319A (en) * 2009-06-26 2009-12-02 华为技术有限公司 List item lookup method and device
CN101692651A (en) * 2009-09-27 2010-04-07 中兴通讯股份有限公司 Method and device for Hash lookup table
CN101727465A (en) * 2008-11-03 2010-06-09 中国移动通信集团公司 Methods for establishing and inquiring index of distributed column storage database, device and system thereof
CN101782922A (en) * 2009-12-29 2010-07-21 山东山大鸥玛软件有限公司 Multi-level bucket hashing index method for searching mass data
CN102012851A (en) * 2010-12-20 2011-04-13 浪潮(北京)电子信息产业有限公司 Continuous data protection method and server
US20110145188A1 (en) * 2008-08-07 2011-06-16 Thomas Vachuska Providing data structures for determining whether keys of an index are present in a storage system
CN102193917A (en) * 2010-03-01 2011-09-21 中国移动通信集团公司 Method and device for processing and querying data
CN102467458A (en) * 2010-11-05 2012-05-23 英业达股份有限公司 Method for establishing index of data block
CN102541968A (en) * 2010-12-31 2012-07-04 百度在线网络技术(北京)有限公司 Indexing method
CN102945242A (en) * 2006-11-01 2013-02-27 起元技术有限责任公司 Managing storage method, system, and computer system
CN103412962A (en) * 2013-09-04 2013-11-27 国家测绘地理信息局卫星测绘应用中心 Storage method and reading method for mass tile data
CN103488709A (en) * 2013-09-09 2014-01-01 东软集团股份有限公司 Method and system for building indexes and method and system for retrieving indexes
CN103513956A (en) * 2012-06-26 2014-01-15 阿里巴巴集团控股有限公司 Data processing method and device of processor
US20140136802A1 (en) * 2012-11-09 2014-05-15 International Business Machines Corporation Accessing data in a storage system
CN104395904A (en) * 2012-04-27 2015-03-04 网络装置公司 Efficient data object storage and retrieval
CN104794162A (en) * 2015-03-25 2015-07-22 中国人民大学 Real-time data storage and query method
CN105320775A (en) * 2015-11-11 2016-02-10 中科曙光信息技术无锡有限公司 Data access method and apparatus
US20160103767A1 (en) * 2014-10-09 2016-04-14 Netapp, Inc. Methods and systems for dynamic hashing in caching sub-systems

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
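Yu Bo et al.: "A two-level index structure based on shared prefixes", Computer Engineering & Science *
Huang Jin et al.: "Research on the application of hash indexes in data retrieval for dedicated traffic police mobile law-enforcement terminals", China Public Security (Academic Edition) *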

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104787A (en) * 2018-10-26 2020-05-05 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for comparing files
CN111104787B (en) * 2018-10-26 2024-04-26 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for comparing files
CN110764705A (en) * 2019-10-22 2020-02-07 北京锐安科技有限公司 Data reading and writing method, device, equipment and storage medium
CN110764705B (en) * 2019-10-22 2023-08-04 北京锐安科技有限公司 Data reading and writing method, device, equipment and storage medium
CN110838199A (en) * 2019-11-12 2020-02-25 Tcl-罗格朗国际电工(惠州)有限公司 Access control card management method and device, computer equipment and storage medium
CN113626490A (en) * 2020-05-08 2021-11-09 杭州海康威视数字技术股份有限公司 Data query method, device and equipment and storage medium
CN113626490B (en) * 2020-05-08 2023-08-25 杭州海康威视数字技术股份有限公司 Data query method, device and equipment and storage medium
CN113553343A (en) * 2021-06-29 2021-10-26 通号城市轨道交通技术有限公司 Electronic map data query method and system
CN113608701A (en) * 2021-08-18 2021-11-05 合肥大唐存储科技有限公司 Data management method in storage system and solid state disk
WO2023083237A1 (en) * 2021-11-11 2023-05-19 支付宝(杭州)信息技术有限公司 Graph data management

Also Published As

Publication number Publication date
CN108572958B (en) 2022-07-29

Similar Documents

Publication Publication Date Title
CN108572958A (en) Data processing method and device
CN106484875B (en) MOLAP-based data processing method and device
CN110291518A (en) Merge tree garbage index
CN102402605B (en) Mixed distribution model for search engine indexing
EP2924594B1 (en) Data encoding and corresponding data structure in a column-store database
CN108255958A (en) Data query method, apparatus and storage medium
CN110268399A (en) Merging tree for attended operation is modified
US20140188885A1 (en) Utilization and Power Efficient Hashing
CN110383261A (en) Stream for multithread storage device selects
US20080201302A1 (en) Using promotion algorithms to support spatial searches
CN104408163B (en) A kind of data classification storage and device
CN106970958B (en) A kind of inquiry of stream file and storage method and device
KR20010077983A (en) Method and means for classifying data packets
JP2002501256A (en) Database device
CN108304484A (en) Key word matching method and device, electronic equipment and readable storage medium storing program for executing
CN109086456B (en) Data indexing method and device
CN112148217B (en) Method, device and medium for caching deduplication metadata of full flash memory system
CN100397816C (en) Method for classifying received data pocket in network apparatus
US9760836B2 (en) Data typing with probabilistic maps having imbalanced error costs
CN106469182A (en) A kind of information recommendation method based on mapping relations and device
CN103874996A (en) Method for performing logical operation, based on full-text, using hashes
CN108984723A (en) Creation index, data query method, apparatus and computer equipment
CN109299106B (en) Data query method and device
CN114579617A (en) Data query method and device, computer equipment and storage medium
CN110633275B (en) ETC transaction data retention analysis method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant