CN108572958B

CN108572958B - Data processing method and device

Info

Publication number: CN108572958B
Application number: CN201710132651.XA
Authority: CN
Inventors: 袁野; 周海发
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2017-03-07
Filing date: 2017-03-07
Publication date: 2022-07-29
Anticipated expiration: 2037-03-07
Also published as: CN108572958A

Abstract

The invention discloses a data processing method and a data processing device; the method comprises the following steps: extracting a first keyword and a second keyword from the keywords of the target data; comparing the first keyword serving as an index with the hash keywords of each data block in sequence; based on the storage position mapped by the first keyword when the comparison is successful, acquiring a target data block taking the first keyword as a Hash keyword; taking the second keyword as an index, and sequentially comparing the second keyword with the intermediate value of the index of the target data block and the intermediate value of the index of the target data block after recursive segmentation; wherein the index comprises a sequentially arranged sequence number of data in the target data block; and acquiring the target data from the corresponding storage position of the target data block based on the storage position mapped by the second keyword when the comparison is successful. By implementing the invention, data can be efficiently searched.

Description

Data processing method and device

Technical Field

The present invention relates to database technologies, and in particular, to a data processing method and apparatus.

Background

The data searching technology refers to a technology for searching data required by service operation, and the rapid data searching is a key factor for ensuring the efficient and stable operation of the service.

At present, data shows an explosive growth trend, and bottlenecks of low searching efficiency and high resource occupation are caused when a conventional data searching technology searches mass data.

Taking an augmented reality technology as an example, the augmented reality technology is to amplify the perception of a user to the real world on the basis of displaying a real environment, and realize the effect of combining the real environment with a virtual object (an object which does not exist in the real environment where the user is currently located).

Taking a high-precision electronic map as an example, the high-precision electronic map is used for automatic driving and automatic navigation, has an incomparable precision (precision error is usually less than one meter) compared with a conventional electronic map, and can include a large amount of related data of road facilities, so that the data volume is large, and the searching efficiency is difficult to guarantee when the conventional data searching technology is used for searching in the high-precision electronic map at present.

In summary, the related art does not yet provide an effective solution for ensuring search efficiency when searching data.

Disclosure of Invention

The embodiment of the invention provides a data processing method and device, which can search data in an efficient manner.

The technical scheme of the embodiment of the invention is realized as follows:

in a first aspect, an embodiment of the present invention provides a data processing method, including:

extracting a first keyword and a second keyword from the keywords of the target data;

comparing the first keyword serving as an index with the hash keywords of each data block in sequence;

based on the storage position mapped by the first keyword when the comparison is successful, acquiring a target data block taking the first keyword as a Hash keyword;

taking the second keyword as an index, and sequentially comparing the second keyword with the intermediate value of the index of the target data block and the intermediate value of the index after recursive segmentation of the target data block; wherein the index comprises a sequentially arranged sequence number of data in the target data block;

and acquiring the target data from the corresponding storage position of the target data block based on the storage position mapped by the second keyword when the comparison is successful.
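The steps of the first aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `DATA_BLOCKS`, `find_target_block`, `find_in_block` and `find_target_data` are hypothetical, the blocks are held in an in-memory list rather than at mapped storage positions, and the recursive division of the sequence-number index is written as an ordinary binary search loop.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for stored data blocks: (hash keyword, block), where a
# block is a list of (sequence number, payload) pairs sorted by sequence number.
Block = List[Tuple[int, str]]
DATA_BLOCKS: List[Tuple[str, Block]] = [
    ("1", [(11, "data 11"), (12, "data 12"), (13, "data 13")]),
    ("2", [(21, "data 21"), (22, "data 22")]),
]


def find_target_block(first_keyword: str) -> Optional[Block]:
    """Compare the first keyword with each block's hash keyword in sequence."""
    for hash_keyword, block in DATA_BLOCKS:
        if hash_keyword == first_keyword:
            return block
    return None


def find_in_block(block: Block, second_keyword: int) -> Optional[str]:
    """Compare the second keyword with the middle of the index, then keep
    halving the part of the index that can still contain it."""
    low, high = 0, len(block) - 1
    while low <= high:
        middle = (low + high) // 2
        sequence_number, payload = block[middle]
        if sequence_number == second_keyword:
            return payload
        if sequence_number < second_keyword:
            low = middle + 1   # continue in the upper half of the index
        else:
            high = middle - 1  # continue in the lower half of the index
    return None


def find_target_data(first_keyword: str, second_keyword: int) -> Optional[str]:
    """Locate the target block first, then search inside that block only."""
    block = find_target_block(first_keyword)
    return None if block is None else find_in_block(block, second_keyword)
```

For instance, `find_target_data("1", 12)` inspects only the block whose hash keyword is "1" and never touches the second block.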

In a second aspect, an embodiment of the present invention provides a data processing apparatus, including:

an extraction unit configured to extract a first keyword and a second keyword from keywords of target data;

the first searching unit is used for taking the first keyword as an index and sequentially comparing the first keyword with the hash keywords of each data block;

a first obtaining unit, configured to obtain a target data block using the first keyword as a hash keyword based on a storage location to which the first keyword is mapped when the comparison is successful;

the second searching unit is used for taking the second keyword as an index and sequentially comparing the second keyword with the intermediate value of the index of the target data block and the intermediate value of the index after the recursive division of the target data block; wherein the index comprises a sequentially arranged sequence number of data in the target data block;

and the second obtaining unit is used for obtaining the target data from the corresponding storage position of the target data block based on the storage position mapped by the second keyword when the comparison is successful.

In a third aspect, an embodiment of the present invention provides a data processing apparatus, including a processor and a storage medium, where the storage medium has stored therein executable instructions for causing the processor to perform operations including:

taking the second keyword as an index, and sequentially comparing the second keyword with the intermediate value of the index of the target data block and the intermediate value of the index of the target data block after recursive segmentation; wherein the index comprises a sequentially arranged sequence number of data in the target data block;

In a fourth aspect, an embodiment of the present invention provides a storage medium, which stores executable instructions for executing the data processing method provided in the embodiment of the present invention.

The embodiment of the invention has the following beneficial effects:

The first keyword is compared with the hash keyword of each data block to determine the target data block to which the target data belongs, and the search then continues within the target data block, so that a hash-keyword search is combined with a sequence-number search in a recursively divided index. On one hand, this avoids the large storage space occupied when all data blocks are read and each data block is traversed; on the other hand, it avoids the low search efficiency that results from relying solely on searching data sequence numbers in a recursively divided index, so search efficiency is improved.

Drawings

FIG. 1 is an alternative schematic diagram of a data processing apparatus deployed in a server/client based system in accordance with an embodiment of the present invention;

FIG. 2 is a diagram illustrating an alternative hardware configuration of a data processing apparatus according to an embodiment of the present invention;

FIG. 3 is a diagram illustrating an alternative structure for partitioning data when storing data according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an alternative structure after partitioning storage data according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an alternative storage structure for storing data blocks provided by embodiments of the present invention;

FIG. 6 is a diagram of an alternative storage structure (mapping table) for storing a set of data blocks according to an embodiment of the present invention;

FIG. 7 is an alternative diagram of the ordered arrangement of data in a data block provided by an embodiment of the invention;

FIG. 8 is an alternative diagram of the ordered arrangement of data in a data block provided by an embodiment of the invention;

FIG. 9 is an alternative structural diagram of a key for searching data according to an embodiment of the present invention;

FIG. 10 is an alternative structural diagram of a key for searching data according to an embodiment of the present invention;

FIG. 11 is a schematic flow chart diagram illustrating an alternative data processing method according to an embodiment of the present invention;

FIG. 12 is an alternative flowchart of sequentially comparing the first key of the target data, serving as an index, with the hash keys of the data blocks according to an embodiment of the present invention;

FIG. 13 is a schematic flow chart diagram illustrating an alternative data searching method according to an embodiment of the present invention;

FIG. 14 is an alternative structural schematic diagram of a data processing apparatus according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Before further detailed description of the present invention, terms and expressions referred to in the embodiments of the present invention are described, and the terms and expressions referred to in the embodiments of the present invention are applicable to the following explanations.

1) The data, also called static data or cross-sectional data, is composed of data of states of a plurality of related phenomena at a certain time point, describes changes of the phenomena at a certain time point, reflects internal numerical relations existing among the phenomena under objective conditions of a certain time, a certain place and the like, and can be data collected at the same time point or data created in advance before data search.

For example, map data used in a map application is stable and does not change when map data is searched. As another example, map data used in augmented reality applications includes imagery data of individual real and virtual objects at different locations.

In terms of quantity, the data may be one piece of data, a plurality of pieces of data, or a certain amount of data (e.g. a certain number of bytes), such as map data of a location or a region in a high-precision electronic map, or image data of one or more virtual objects located at a location in an augmented reality map.

2) A data block comprising: 2.1) hash key part, namely the hash key of the data block; 2.2) data part, namely a plurality of data arranged in sequence, wherein each data in the data block forms an ordered arrangement in the data block according to the sequence number.

3) The sequence number may be sorted numerically (e.g. 1/2/3/4), sorted alphabetically (e.g. a/b/c/d), sorted by a combination of letters and numbers (e.g. a1/a2/a3/a4), or sorted in any other form.

4) A serial number index, an ordered (e.g., numeric, alphabetical, etc.) index formed by the serial numbers of the various data in the data block.

5) A set of data blocks comprising: 5.1) hash key part, i.e. the hash key of the data block set itself; 5.2) data portion, i.e. more than two data blocks.

6) A Point of Interest (POI) corresponds to an object (target). For example, in a high-precision electronic map, a POI may be a place corresponding to a house, a shop, a mailbox, a bus station, etc.; in an augmented reality map, a POI may be a virtual object at a place (such as various virtual items in a game).
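The terms data block, serial number index and data block set defined above can be pictured with the following sketch. The class names `DataBlock` and `DataBlockSet`, the field names, and the choice of string sequence numbers are assumptions made for illustration only; the embodiment does not prescribe a concrete representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataBlock:
    hash_key: str                                         # hash key part of the block
    data: Dict[str, bytes] = field(default_factory=dict)  # data part: sequence number -> data

    def serial_number_index(self) -> List[str]:
        """Ordered index formed by the sequence numbers of the data in the block."""
        return sorted(self.data)


@dataclass
class DataBlockSet:
    hash_key: str                                          # hash key of the set itself
    blocks: List[DataBlock] = field(default_factory=list)  # data part: the data blocks
```

A block built as `DataBlock("1", {"a3": b"z", "a1": b"x", "a2": b"y"})` yields the serial number index `["a1", "a2", "a3"]`, regardless of the order in which the data was added.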

A data processing apparatus implementing an embodiment of the present invention will now be described with reference to the accompanying drawings. The data processing apparatus may be implemented in various forms. For example, the data processing apparatus described in the embodiments of the present invention may be implemented as a mobile terminal such as a smart phone, a notebook computer, a tablet computer (PAD), or a vehicle-mounted terminal, or as a stationary terminal such as a digital television (TV) or a desktop computer. The terminal implements various services based on data search by running an application client.

For another example, referring to fig. 1, which is an alternative schematic diagram of a data processing apparatus deployed in a server/client-based system according to an embodiment of the present invention, the data processing apparatus described in the embodiment of the present invention may be implemented as a server, and the server may be applied to various data-lookup-based services implemented in any server/client-based architecture.

As for the aforementioned data lookup based service, the following are exemplary:

1) user lookup in an online social service, for example, based on various targeting conditions (user name; various attributes such as region, preference, etc.), searching users meeting the conditions in a user database of the social network;

2) the augmented reality service searches map data of a virtual object corresponding to the position in an augmented reality database according to the position of the user in the environment when the user wears the augmented reality device (for example, in the form of glasses or a helmet), and displays the map data in the field of view of the augmented reality device, so that the effect of fusing the real world and the virtual world is realized.

3) An electronic map (e.g., a high-precision electronic map) is displayed by searching map data of a target position in a high-precision electronic map database according to the target position.

Referring to fig. 2, which is a schematic diagram of an alternative hardware structure of a data processing apparatus according to an embodiment of the present invention, the data processing apparatus 100 includes a processor 101, a display unit 102, a communication unit 103, a memory 104, an input unit 105, and a power supply unit 106. Fig. 2 shows a data processing apparatus with various components, but it is understood that not all of the shown components are required; more or fewer components may alternatively be implemented. The components in the data processing apparatus will be described in detail below.

The processor 101 is used to control the overall operation of the data processing apparatus. For example, the processor 101 performs control and processing associated with implementing various services based on data lookup, including the various services illustratively described above.

The display unit 102 may display intermediate information and search results of the various data-lookup-based services in the data processing apparatus 100.

For example, when the processor 101 is configured to implement relevant control and processing of an online social service, the display unit 102 may display a User Interface (UI) or a Graphical User Interface (GUI) of the online social service, and display a result of searching for a User in the social network.

For another example, when the processor 101 is used to implement relevant control and processing of an augmented reality service, the display unit 102 displays an image of a real environment corresponding to an orientation in which a user is located in the environment when wearing an augmented reality device (e.g., in the form of glasses or a helmet), and an image of a virtual object superimposed in the real environment according to various augmented reality strategies.

For another example, when the processor 101 is used to implement the relevant control and processing of the high-precision electronic map, the display unit 102 displays the electronic map, searches the map data of the target position in the high-precision electronic map database according to the target position, and displays the map data.

The communication unit 103 typically includes one or more components that allow wired or wireless communication between the data processing apparatus 100 and a wireless communication system or network. For example, the communication unit 103 may be implemented as at least one of a mobile communication module, a wireless internet module, and a short-range communication module.

The mobile communication module transmits and/or receives radio signals to and/or from at least one of a base station (e.g., access point, node B, etc.), an external terminal, and a server. Such radio signals may include voice call signals, video call signals, or various types of data transmitted and/or received according to text and/or multimedia messages.

The wireless internet module supports wireless internet access of the mobile terminal. The module may be internally or externally coupled to the terminal. The wireless internet access technologies involved may include Wireless Local Area Network (WLAN), Wireless Fidelity (Wi-Fi), Wireless Broadband (WiBro), Worldwide Interoperability for Microwave Access (WiMAX), High Speed Downlink Packet Access (HSDPA), and the like.

The short-range communication module is a module for supporting short-range communication. Some examples of short-range communication technologies include Bluetooth, Radio Frequency Identification (RFID), Infrared Data Association (IrDA), Ultra-Wideband (UWB), ZigBee, and the like.

The memory 104 may store software programs or the like that control operations and processes performed by the processor 101, or may temporarily store data that has been output or is to be output (e.g., intermediate results or final results of the aforementioned various service processes based on data lookup).

The Memory 104 may include at least one type of storage medium including a flash Memory, a hard disk, a multimedia card, a card-type Memory (e.g., SD or DX Memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic Memory, a magnetic disk, an optical disk, and so on. Also, the data processing apparatus 100 may cooperate with a network storage apparatus that performs a storage function of the memory 104 through a network connection.

The input unit 105 may generate key input data to control various operations of the mobile terminal according to a command input by a user. The input unit 105 allows a user to input various types of information, and may include a keyboard, a touch pad, a jog wheel, a jog dial, and the like. In particular, when the touch pad is superimposed on the display unit 102 in the form of a layer, a touch screen may be formed.

The power supply unit 106 receives external power or internal power and supplies appropriate power required to operate the respective elements and components under the control of the processor 101.

The various embodiments described herein may be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof.

For hardware implementation, the embodiments described herein may be implemented using at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a processor, a controller, a microcontroller, a microprocessor, and an electronic unit designed to perform the functions described herein, and in some cases, such embodiments may be implemented in the processor 101.

For a software implementation, the implementation such as a process or a function may be implemented with a separate software module that allows performing at least one function or operation. The software code may be implemented by a software application (or program) written in any suitable programming language, which may be stored in the memory 104 and executed by the processor 101.

So far, the data processing apparatus according to the embodiments of the present invention has been described in terms of its functions, and a data processing method applied to the data processing apparatus according to the embodiments of the present invention will be described based on the hardware configuration diagram of the data processing apparatus.

The dividing method for storing the data to be searched provided by the embodiment of the invention is explained.

Referring to fig. 3, fig. 3 is a schematic diagram of an optional structure obtained after dividing data when storing data according to an embodiment of the present invention. The data available for searching is divided into a plurality of data blocks, and the data block is the basic search object in a data search; that is, when searching for target data, the data block in which the target data is located (the target data block) must first be located. Within each data block, the data is arranged in order according to the sequence numbers of the data.

For a data block, the data block can be obtained by dividing the data available for searching in different ways, including the following several optional dividing ways:

1) based on the relevance partition among the data, the data associated in one or more dimensions are partitioned into corresponding data blocks, wherein the dimensions can comprise time, regions and description objects.

For example, the high-precision electronic map data can be divided according to the described geographic areas (such as county, city, street); the augmented reality map data may be divided into different locations, and the image data of the virtual object applied to each location may be divided into data blocks of the corresponding location.

Taking the example of data-based time dimension partitioning, data with time stamps distributed over an hour may be partitioned into data blocks at a granularity of every 10 minutes.

2) The dividing is performed based on the capacity of the data, that is, all the data to be searched is divided according to a specific capacity as a granularity, for example, the data is divided into a plurality of data blocks by taking 100 megabytes as a unit.

It should be noted that the above manner of dividing the data to be searched into data blocks is only an example, and the implementation manner of dividing the data to be searched into data blocks in the embodiment of the present invention is not particularly limited, and the above manners of dividing the data blocks may be used alternatively or in combination in practical applications.
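The two division manners described above (by time at a 10-minute granularity, and by capacity) can be sketched as follows. The function names, the use of timestamps in seconds, and the byte-string input are assumptions chosen for illustration; the embodiment does not limit how the division is implemented.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

GRANULARITY_SECONDS = 10 * 60  # 10-minute granularity, as in the time example


def partition_by_time(records: List[Tuple[int, str]]) -> Dict[int, List[str]]:
    """Group (timestamp in seconds, payload) records into 10-minute blocks.

    The dictionary key is the number of the 10-minute window.
    """
    blocks: Dict[int, List[str]] = defaultdict(list)
    for timestamp, payload in records:
        blocks[timestamp // GRANULARITY_SECONDS].append(payload)
    return dict(blocks)


def partition_by_capacity(payload: bytes, block_size: int) -> List[bytes]:
    """Cut a byte string into blocks of at most block_size bytes."""
    return [payload[i:i + block_size] for i in range(0, len(payload), block_size)]
```

With a `block_size` of 100 megabytes, `partition_by_capacity` corresponds to the capacity example above; the last block may be shorter than the others.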

The embodiment of the present invention further provides a structure, different from that of fig. 3, for dividing all data available for searching. Referring to fig. 4, fig. 4 is a schematic diagram of an optional structure after dividing stored data provided in the embodiment of the present invention. The basic unit of data storage is a data block set: the data available for searching is divided into a plurality of data blocks, and the data blocks are combined into data block sets. The data block set is the basic search object in a data search; that is, when searching for target data, the data block set in which the target data is located (the target data block set) must first be located. The structure of each data block in a data block set can be understood with reference to fig. 3.

For combining data blocks into a data block set, the combination can be performed from different dimensions, and exemplarily, the following ways are included:

1) Combining based on the association among the data blocks: data blocks associated in one or more dimensions are combined into a corresponding data block set, where the dimensions can include time, region, and described object.

For example, the data blocks are combined based on the data objects, and for the high-precision electronic map data, the high-precision electronic map data can be combined according to the described geographic areas (such as counties, cities and streets); for the augmented reality map data, data blocks of virtual object data applied to a plurality of adjacent locations may be combined into a data block set according to the location where the virtual object is located.

For example, data blocks are combined based on the time dimension of the data, a plurality of data blocks obtained by dividing data with time stamps distributed in one hour by taking every 10 minutes as granularity can be combined into data block sets according to the sequence of time, for example, each data block set can store a data block corresponding to one hour.

2) The merging is performed based on the data capacity, that is, all the data to be searched are combined according to a specific capacity as a granularity, for example, for dividing the data into a plurality of data blocks by taking 100 megabytes as a unit, the data blocks are combined into data block sets according to a granularity of 1000 megabytes, and each data block set includes 10 data blocks.

It should be noted that the above-mentioned manner of combining the data blocks to be searched into the data block set is only an example, for example, a predetermined number of arbitrary data blocks may also be combined into the data block set.
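Combining a predetermined number of data blocks into each data block set can be sketched as below. The function name `combine_into_sets` and the representation of blocks by their sequence numbers are assumptions for illustration only.

```python
from typing import List, Sequence


def combine_into_sets(block_numbers: Sequence[int], blocks_per_set: int) -> List[List[int]]:
    """Group consecutive data blocks into sets of at most blocks_per_set blocks."""
    return [
        list(block_numbers[i:i + blocks_per_set])
        for i in range(0, len(block_numbers), blocks_per_set)
    ]
```

With 100-megabyte blocks and `blocks_per_set=10`, each full set holds 1000 megabytes, which matches the capacity example above.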

A manner of storing the data to be searched after being divided, which is provided by the embodiment of the present invention, is described.

For the divided data blocks shown in fig. 3, referring to fig. 5, fig. 5 is a schematic diagram of an alternative storage structure for storing data blocks provided in the embodiment of the present invention. The storage structure shown in fig. 5 is a mapping table between the hash keys (Hash Keys) of the data blocks and the storage locations of the data blocks; for each data block, the mapping table holds the corresponding hash key and the corresponding storage location.

Each data block has a unique hash key and a corresponding storage location. Taking the data blocks shown in fig. 5 as an example, data block 1 corresponds to Hash Key₁₁ and storage location₁₁, and data block 2 corresponds to Hash Key₁₂ and storage location₁₂.

For the hash key of the data block as shown in fig. 5, the sequence number of the data block may be directly used as the hash key, or a hash value obtained by encoding the sequence number of the data block by using a hash algorithm may be used. For example, for the hash key of the data block 1, the sequence number "1" may be used directly as the hash key, or a hash value obtained by encoding the sequence number "1" by using a hash algorithm.
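The two options for forming the hash key of a data block can be sketched as follows. The embodiment does not name a particular hash algorithm; SHA-256 from the Python standard library is used here purely as an assumed example, and the function name `block_hash_key` is hypothetical.

```python
import hashlib


def block_hash_key(sequence_number: int, use_hash: bool = False) -> str:
    """Return the hash key of a data block from its sequence number.

    Either the sequence number itself is used as the key, or it is encoded
    with a hash algorithm (SHA-256 is an assumption made for this sketch).
    """
    raw = str(sequence_number)
    if not use_hash:
        return raw
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

For data block 1, `block_hash_key(1)` gives "1", while `block_hash_key(1, use_hash=True)` gives a 64-character hexadecimal digest.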

Therefore, once the hash key of the data block to which the data to be searched (the target data) belongs, namely the target data block, is known, the storage location of the target data block can be located based on the mapping table. Based on that storage location, which lies in a non-volatile storage space local to the data processing apparatus or on the network side, the data block can be read into the memory space of the data processing apparatus so that the search for the target data can continue.
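A mapping table of this kind, and the step of reading only the target data block into memory, can be sketched as follows. The table contents, the storage location strings, the dictionary `NON_VOLATILE_STORE` that stands in for local or network storage, and the function names are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping table: (hash key of a data block, storage location).
MAPPING_TABLE: List[Tuple[str, str]] = [
    ("1", "store/block_1"),
    ("2", "store/block_2"),
]

# Stand-in for non-volatile storage: storage location -> stored block content.
NON_VOLATILE_STORE: Dict[str, List[str]] = {
    "store/block_1": ["data 11", "data 12"],
    "store/block_2": ["data 21", "data 22"],
}


def locate_block(hash_key: str) -> Optional[str]:
    """Compare the hash key with each table entry in sequence."""
    for table_key, location in MAPPING_TABLE:
        if table_key == hash_key:
            return location
    return None


def read_block_into_memory(hash_key: str) -> Optional[List[str]]:
    """Read only the target data block; other blocks stay in storage."""
    location = locate_block(hash_key)
    if location is None:
        return None
    return list(NON_VOLATILE_STORE[location])  # the copy models the in-memory block
```

Because only one block is copied, the memory used by a lookup is bounded by the size of a single data block rather than by the whole data set.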

For the divided data block set shown in fig. 4, referring to fig. 6, fig. 6 is a schematic diagram of an optional storage structure (mapping table) for storing the data block set provided by the embodiment of the present invention, and a mapping relationship between the hash key of the data block set and the storage location of the data block set, and a mapping relationship between the hash key of each data block in the data block set and the storage location of the data block are provided in the storage structure shown in fig. 6.

Each data block has a unique hash key and a corresponding storage location. Taking data block set 1 shown in fig. 6 as an example, the hash key corresponding to data block set 1 is Hash Key₁, with corresponding storage location₁. Each data block in data block set 1 also has a corresponding hash key and storage location; for example, data block 1 corresponds to Hash Key₁ and storage location₁₁, and data block 2 corresponds to Hash Key₂ and storage location₁₂.

For the hash key of the data block set shown in fig. 6, the sequence number of the data block set may be used directly as the hash key, or a hash value obtained by encoding the sequence number of the data block set with a hash algorithm may be used. For example, for the hash key of data block set 1, the sequence number "1" may be used directly as the hash key, or a hash value obtained by encoding the sequence number "1" with a hash algorithm may be used. The hash key of a data block shown in fig. 6 may be calculated in the same manner as for the data block shown in fig. 5.

Taking the data 11 of the data block 1 as an example, the corresponding hash key is Hash Key₁. In this way, once the hash key of the data block set and the hash key of the data block to which the target data belongs are obtained, the mapping table shown in fig. 6 can be used to first locate the storage location of the data block set in which the target data is located (the target data block set), then locate the storage location of the data block in which the target data is located (the target data block), and read the target data block from that storage location. For example, the target data block may be read from a storage location in a non-volatile storage space local to the data processing apparatus or on the network side, so that it is loaded into the memory space of the data processing apparatus for the search to continue.
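The two-level location step (data block set first, then data block) can be sketched as follows. The nested table, the location strings and the names `SetEntry` and `locate` are hypothetical, and Python dictionary lookups are used here in place of the entry-by-entry comparison for brevity.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class SetEntry(NamedTuple):
    location: str                     # storage location of the data block set
    block_locations: Dict[str, str]   # block hash key -> block storage location


# Hypothetical two-level mapping table.
MAPPING_TABLE: Dict[str, SetEntry] = {
    "1": SetEntry("store/set_1", {"1": "store/set_1/block_1",
                                  "2": "store/set_1/block_2"}),
    "2": SetEntry("store/set_2", {"1": "store/set_2/block_1"}),
}


def locate(set_hash_key: str, block_hash_key: str) -> Optional[Tuple[str, str]]:
    """Locate the target data block set first, then the target data block."""
    entry = MAPPING_TABLE.get(set_hash_key)
    if entry is None:
        return None
    block_location = entry.block_locations.get(block_hash_key)
    if block_location is None:
        return None
    return entry.location, block_location
```

The same block hash key may appear in several sets; the set hash key disambiguates which block is meant.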

Next, the storage manner of data within a data block is described. Each data item in the data block is assigned a sequence number within the data block, and the data is arranged in order based on the sequence numbers. Referring to fig. 7, fig. 7 is an optional schematic diagram of the ordered arrangement of data in a data block provided in the embodiment of the present invention, in which the data in the data block is arranged in order according to numeric sequence numbers.

Referring to fig. 8 again, fig. 8 is an optional schematic diagram of sequentially arranging data in a data block according to an embodiment of the present invention, where data in the data block is sequentially ordered according to a sequence number in an alphabetical order form, and of course, in practical applications, data in the data block may be sequentially arranged in any other form, for example, based on a combination of letters, numbers, and symbols.
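The ordered arrangement by numeric or alphabetical sequence numbers can be sketched as below. The function name `arrange_in_order` is hypothetical, and the sketch assumes that all sequence numbers in one block share one form (all integers or all strings), since Python cannot compare the two directly.

```python
from typing import Dict, List, Tuple, TypeVar

K = TypeVar("K", int, str)


def arrange_in_order(block_data: Dict[K, str]) -> List[Tuple[K, str]]:
    """Return the data of a block ordered by sequence number.

    Integer sequence numbers are ordered numerically and string sequence
    numbers alphabetically.
    """
    return sorted(block_data.items(), key=lambda item: item[0])
```

An ordered block is what allows the later search by repeatedly comparing against the intermediate value of the index.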

Now, the structure of the storage data provided by the embodiment of the present invention has been described, and the following describes forming a key for searching data in conjunction with different storage structures provided by the embodiment of the present invention.

For data stored using the storage structure shown in fig. 5, the hash key of the data block to which the data belongs and the sequence number of the data in the data block to which the data belongs may be used for locating.

Referring to fig. 9, fig. 9 is an alternative structural diagram of a Key for searching data according to an embodiment of the present invention, which is applied to searching data in the storage structure shown in fig. 5, where the Key of the data includes a first Key and a second Key, where the first Key Hash Key is a serial number of a data block to which the data belongs (or a Hash value obtained by Hash-coding a serial number of a data block to which the data belongs), and the second Key Main Key is a serial number of the data in the data block to which the data belongs.

For data 11 shown in fig. 9 (which belongs to the data block with serial number 1, i.e. data block 1, and has sequence number 11 within data block 1), the corresponding first key Hash Key₁₁ may be "1", or a hash value obtained by hash-coding "1", and the corresponding second key Main Key₁₁ is "11".
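The formation of such a two-part key can be sketched as follows. This is a minimal illustration only: the function name `make_key`, the use of decimal strings, and the choice of SHA-256 for hash coding are assumptions made here, since the embodiment does not prescribe a particular hash algorithm.

```python
import hashlib
from typing import Tuple


def make_key(block_no: int, seq_no: int, hash_encode: bool = False) -> Tuple[str, str]:
    """Form (Hash Key, Main Key) for one piece of data.

    block_no:    serial number of the data block the data belongs to.
    seq_no:      sequence number of the data inside that block.
    hash_encode: if True, hash-code the block number instead of using it
                 directly (SHA-256 is an assumption of this sketch).
    """
    hash_key = str(block_no)
    if hash_encode:
        hash_key = hashlib.sha256(hash_key.encode("utf-8")).hexdigest()
    main_key = str(seq_no)
    return hash_key, main_key


# Data 11 of fig. 9: data block 1, sequence number 11.
key_11 = make_key(1, 11)
```

With `hash_encode=True` only the first component changes; the Main Key stays the plain sequence number, because it is later compared against the ordered index of the block.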

For data stored with the storage structure shown in fig. 6, the hash key of the set of data blocks to which the data belongs, the hash key of the data block to which the data belongs, and the sequence number of the data in the data block to which the data belongs may be used for locating.

Referring to fig. 10, fig. 10 is an optional structural diagram of a key for searching data according to an embodiment of the present invention, applicable to searching data in the storage structure shown in fig. 6. The key of the data includes a first key and a second key, where the first key (Hash Key) further includes a first sub-key Hash Key (1) and a second sub-key Hash Key (2). The first sub-key is the serial number of the data block set to which the data belongs (or a hash value obtained by hash-coding that serial number), the second sub-key is the serial number of the data block to which the data belongs (or a hash value obtained by hash-coding that serial number), and the second key (Main Key) is the sequence number of the data within the data block to which the data belongs.

For data 11 shown in fig. 10 (which belongs to the data block set with serial number 1, belongs to the data block with serial number 11 in data block set 1, and has sequence number 11 in data block 11), the corresponding first sub-key Hash Key₁ may be "1", or a hash value obtained by hash-coding "1"; the corresponding second sub-key Hash Key₁₁ may be "11", or a hash value obtained by hash-coding "11"; and the corresponding second key Main Key₁₁ is "11".
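The three-part key for the storage structure of fig. 6 can be sketched in the same way. The type name `SetKey`, its field names, and the helper `make_set_key` are hypothetical and chosen only for this illustration; the serial numbers are used directly as hash keys here.

```python
from typing import NamedTuple


class SetKey(NamedTuple):
    hash_key_1: str  # first sub-key: serial number of the data block set
    hash_key_2: str  # second sub-key: serial number of the data block
    main_key: str    # second key: sequence number of the data in its block


def make_set_key(set_no: int, block_no: int, seq_no: int) -> SetKey:
    """Combine the three serial numbers into one key (no hash coding)."""
    return SetKey(str(set_no), str(block_no), str(seq_no))


# Data 11 of fig. 10: data block set 1, data block 11, sequence number 11.
key_11 = make_set_key(1, 11, 11)
```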

The keys formed for stored data according to the embodiment of the present invention have now been described. The following describes the process of searching for target data, that is, the data to be searched for, when the key of the target data is already known.

Referring to fig. 11, fig. 11 is an optional flowchart of the data processing method according to the embodiment of the present invention, which can be applied to the data processing apparatus. Data available for lookup is stored in a nonvolatile storage space (e.g., a flash memory or a hard disk) of the data processing apparatus; the data block to which the target data belongs (the target data block) is read from the nonvolatile storage space into a memory space, and the search for the target data continues within the target data block. In this way, target data with a known key is found efficiently while memory space is saved. The steps involved in fig. 11 are described below.

Step 101, extracting a first keyword and a second keyword from the keywords of the target data.

The data uses a key of a type corresponding to the storage structure it adopts (such as the storage structures shown in fig. 5 and fig. 6). When the data adopts the storage structure shown in fig. 5, it uses the type of key shown in fig. 9, which includes the hash key of the data block to which the data belongs and the sequence number of the data in that data block. When the data adopts the storage structure shown in fig. 6, it uses the type of key shown in fig. 10, which additionally includes the hash key of the data block set to which the data belongs.

As an example, the first keyword and the second keyword may be distinguished based on the lengths pre-allocated within the keyword. For example, for a keyword with a length of 50, the interval 0-25 corresponds to the first keyword and the interval 26-50 corresponds to the second keyword. Of course, the lengths of the first keyword and the second keyword are determined according to the actual situation, and the above is only an example.

It is understood that the length may be a storage space in the memory space, and for a storage space with a length of 50 (bits or bytes) allocated in the memory space, the first key is stored in a portion of the storage space from 0 to 25, and the second key is stored in a portion of the storage space from 26 to 50, so that the first key and the second key can be distinguished according to the length in the memory space.

As another example, the first key and the second key may be distinguished based on a particular delimiter such as ":" or "-".
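Both ways of distinguishing the two keys in step 101 can be sketched as below. The function names and the use of character strings as the key representation are assumptions of this sketch; an implementation working on bits or bytes would slice the corresponding buffer instead.

```python
from typing import Tuple


def split_by_length(key: str, first_len: int) -> Tuple[str, str]:
    """Split a fixed-layout key: the first `first_len` characters hold the
    first key, the remainder holds the second key."""
    if not 0 < first_len < len(key):
        raise ValueError("first_len must fall strictly inside the key")
    return key[:first_len], key[first_len:]


def split_by_delimiter(key: str, delimiter: str = ":") -> Tuple[str, str]:
    """Split a delimited key such as '1:11' into (first key, second key)."""
    first, sep, second = key.partition(delimiter)
    if not sep or not first or not second:
        raise ValueError(f"key {key!r} is not of the form first{delimiter}second")
    return first, second
```

For instance, `split_by_delimiter("1:11")` separates the hash key "1" from the sequence number "11" of data 11 in fig. 9.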

Step 102, taking the first key of the target data as an index, and sequentially comparing the first key with the hash key of each data block.

In an embodiment, for the case where the data available for lookup uses the storage structure shown in fig. 5, the data available for lookup is divided into a plurality of data blocks for storage, and the target data is stored in one of the data blocks. Accordingly, the first key of the target data is the hash key of the data block (target data block) to which the target data belongs. The mapping table shown in fig. 5 (including the hash key of each data block and the storage location of the corresponding data block) is read into the memory space, and the first key is used as an index and compared one by one with the hash key of each data block in the mapping table in the memory space.
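The one-by-one comparison against the mapping table of fig. 5 can be sketched as follows. Modelling a storage location as an (offset, length) pair, and the name `find_block_location`, are assumptions made for this illustration.

```python
from typing import List, Optional, Tuple

# Mapping table of fig. 5: (hash key of data block, storage location).
# A location is modelled as (offset, length); this is an assumption.
MappingTable = List[Tuple[str, Tuple[int, int]]]


def find_block_location(table: MappingTable, first_key: str) -> Optional[Tuple[int, int]]:
    """Compare first_key with each hash key in turn and return the mapped
    storage location on the first match, or None if no block matches."""
    for hash_key, location in table:
        if hash_key == first_key:
            return location
    return None


table: MappingTable = [("1", (0, 64)), ("2", (64, 64)), ("3", (128, 32))]
location = find_block_location(table, "2")
```

Only the mapping table is held in memory at this point; the data blocks themselves remain in the nonvolatile storage space.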

In one embodiment, for the case that the data available for lookup is using a storage structure as shown in fig. 6, the data available for lookup is divided into a plurality of data blocks, and the data blocks are combined to form a plurality of data block sets, and the target data is stored in one data block (target data block) of one data block set (target data block set).

Accordingly, the first key of the target data includes a first sub-key (i.e., the hash key of the set of data blocks to which the target data belongs) and a second sub-key (i.e., the hash key of the data block to which the target data belongs in the set of target data blocks).

For sequentially comparing the first key of the target data, as an index, with the hash keys of the data blocks, refer to fig. 12. Fig. 12 is an optional flowchart, according to the embodiment of the present invention, of sequentially comparing the first key of the target data as an index with the hash keys of the data blocks, and involves the following steps:

Step 1021, reading the mapping table (including the mapping relationship between the hash key of each data block set and the storage location of the corresponding data block set, and the mapping relationship between the hash key of each data block in the data block set and the storage location of the corresponding data block) shown in fig. 6 into the memory space.

Step 1022, comparing the first sub-keyword of the target data with the hash keywords of each data block set in the mapping table of the memory space, to obtain a storage location of the target data block set to which the target data belongs, and reading the target data block set based on the storage location.

Step 1023, comparing the second sub-keyword of the target data, as an index, with the hash keywords of the data blocks in the target data block set in the mapping table to obtain the storage position in the nonvolatile storage space of the target data block to which the target data belongs.

Step 1024, reading the target data block from the stored target data block set of the non-volatile storage space to the memory space based on the storage location.
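Steps 1022 and 1023 can be sketched as a two-level comparison over an in-memory mapping table. The table layout, the byte-offset locations, and the name `locate_block` are hypothetical; the actual read of step 1024 is not modelled here.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory form of the fig. 6 mapping table. Each entry is
# (set hash key, set location, [(block hash key, block location), ...]);
# locations are modelled as byte offsets, which is an assumption.
SetEntry = Tuple[str, int, List[Tuple[str, int]]]


def locate_block(table: List[SetEntry], sub_key_1: str, sub_key_2: str) -> Optional[Tuple[int, int]]:
    """Return (set location, block location), or None when either
    comparison fails."""
    # Step 1022: compare the first sub-key with each set's hash key.
    for set_key, set_location, blocks in table:
        if set_key != sub_key_1:
            continue
        # Step 1023: compare the second sub-key with the hash keys of the
        # blocks inside the target set only.
        for block_key, block_location in blocks:
            if block_key == sub_key_2:
                return set_location, block_location
        return None
    return None


table: List[SetEntry] = [
    ("1", 0, [("11", 0), ("12", 256)]),
    ("2", 4096, [("21", 4096), ("22", 4352)]),
]
```

A block key is only searched for inside the set selected by the first sub-key, so `locate_block(table, "2", "11")` fails even though a block "11" exists in set "1".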

Step 103, based on the storage location mapped by the first keyword when the comparison is successful, obtaining a target data block using the first keyword as a hash keyword.

In an embodiment, for the case where the data available for lookup uses the storage structure shown in fig. 5, when the first key is used as an index and compared one by one with the hash keys of the data blocks in the mapping table shown in fig. 5, the storage location of the target data block to which the target data belongs can be obtained, so that the target data block is read into the memory space from that storage location in the nonvolatile storage space.

For example, for data 11 shown in fig. 9, the first key (Hash Key) is Hash Key₁₁ and the second key (Main Key) is Main Key₁₁. By comparing Hash Key₁₁ with the hash keys of the data blocks in the mapping table shown in fig. 5, storage location₁₁ of data block 1, to which data 11 belongs, can be obtained, and data block 1 can be read into the memory space from storage location₁₁ in the non-volatile storage space. It can be seen that not all data blocks need to be read into the memory space, so the occupation of memory space is significantly reduced compared with reading all data blocks into the memory space.
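The memory saving can be illustrated with a small sketch in which an in-memory byte stream stands in for the nonvolatile storage space. The block contents, the (offset, length) form of a storage location, and the name `read_block` are all assumptions of this illustration.

```python
import io

# Stand-in for the nonvolatile storage space: three blocks stored back to back.
blocks = {"1": b"block-one-data", "2": b"block-two", "3": b"block-three!"}
storage = io.BytesIO()
mapping_table = {}  # hash key -> (offset, length); the layout is an assumption
for hash_key, payload in blocks.items():
    mapping_table[hash_key] = (storage.tell(), len(payload))
    storage.write(payload)


def read_block(store: io.BytesIO, table: dict, first_key: str) -> bytes:
    """Read only the block mapped by first_key into memory."""
    offset, length = table[first_key]  # raises KeyError if no block matches
    store.seek(offset)
    return store.read(length)


target_block = read_block(storage, mapping_table, "2")
```

Only the bytes of the target block are read; the other blocks are never loaded.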

In another embodiment, for the case where the data available for lookup uses the storage structure shown in fig. 6, the data available for lookup is divided into a plurality of data blocks, and the data blocks are combined into a plurality of data block sets. The first sub-key of the target data is used as an index and compared with the hash key of each data block set in the mapping table shown in fig. 6, so that the storage location in the nonvolatile storage space of the target data block set to which the target data belongs can be obtained. The second sub-key of the target data is then compared with the hash key of each data block of the target data block set in the mapping table shown in fig. 6, so that, within the storage location of the target data block set, the storage location in the nonvolatile storage space of the data block (target data block) to which the target data belongs can be further obtained, and the target data block can be read from that storage location into the memory space.

For example, for data 11 shown in fig. 10, the first sub-key Hash Key (1) is Hash Key₁, the second sub-key Hash Key (2) is Hash Key₁₁, and the second key (Main Key) is Main Key₁₁. By comparing Hash Key₁ with the hash keys of the data block sets in the mapping table shown in fig. 6, storage location₁ of data block set 1, to which data 11 belongs, can be obtained. By comparing Hash Key₁₁ with the hash keys of the data blocks in the mapping table shown in fig. 6, storage location₁₁ of the data block to which data 11 belongs can be obtained; within storage location₁ in the non-volatile storage space, storage location₁₂ of that data block in the non-volatile storage space can be further obtained, so that the data block can be read from storage location₁₂ into the memory space. It can be seen that neither all data block sets nor all data blocks in the target data block set need to be read into the memory, so the occupation of memory space is significantly reduced compared with reading all data blocks into the memory space.

The manner of searching for the data block provided in step 102 and step 103 is based on the hash key of the data block, and is also referred to as hash lookup in the embodiment of the present invention.

Step 104, taking the second keyword as an index, and sequentially comparing the second keyword with the intermediate value of the index of the target data block and the intermediate value of the index of the target data block after recursive segmentation; wherein the index includes a sequentially arranged sequence number of each data in the target data block.

Step 105, reading the target data from the corresponding storage position of the target data block based on the storage position mapped by the second keyword when the comparison is successful.

In one embodiment of step 104, the following steps are involved:

Step 1041, comparing the second keyword, as an index, with the intermediate value of the index of the target data block; if the comparison succeeds, step 1042 is executed, and if the comparison fails, step 1043 is executed.

Step 1042, the successful comparison indicates that the second keyword is equal to the intermediate value, and the target data can be read from the storage location in the memory space that is mapped by the intermediate value in the target data block.

Step 1043, when the comparison is unsuccessful, equally dividing the index of the target data block into a first index and a second index, and determining the first index where the value of the second keyword is located.

For example, when the index is (1, 2, 3, 4, 5), the intermediate value is 3. If the second keyword is 4, the comparison with the intermediate value 3 is unsuccessful, and the index is divided at the intermediate value as equally as possible (the index is divided equally when the number of sequence numbers is even, and so that the numbers of sequence numbers in the first index and the second index differ by 1 when it is odd), so as to form a first index (4, 5) and a second index (1, 2). Since the index is in ascending order and the second keyword 4 is greater than the intermediate value 3 of the original index, it is determined that the value of the second keyword lies in the first index (4, 5), that is, in the high-value range.

Step 1044, comparing the second keyword with the middle value of the first index, executing step 1042 if the comparison is successful, and executing step 1045 if the comparison is failed.

Step 1045, when the comparison is not successful, equally dividing the first index into a new first index and a new second index, determining the new first index where the value of the second keyword is located, and returning to step 1044 with the new first index until the comparison is successful, or the new first index does not contain the second keyword.

Continuing with the previous example, since the first index (4, 5) has only 2 values, one of them is randomly selected as the intermediate value to be compared with the second keyword 4. If the selected intermediate value is 4, the comparison is successful; if the selected value is 5, the comparison continues with 4 to its left, and the comparison is then successful. For another example, when the second keyword is 3.5, the comparison still cannot succeed after the original index has been divided, so it is determined that the comparison fails and the target data block does not include the target data.

The search manner provided in step 104 and step 105 performs a recursive halving search within the data block based on the second key, and is also referred to as binary search in the embodiment of the present invention.
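Steps 1041 to 1045 can be sketched as the loop below, run on the example index (1, 2, 3, 4, 5). This is an iterative form of the recursive division, and it always picks the lower of two remaining values as the intermediate value rather than choosing randomly; both are simplifications of this sketch, and the name `binary_lookup` is hypothetical.

```python
from typing import Optional, Sequence


def binary_lookup(index: Sequence[float], second_key: float) -> Optional[int]:
    """Return the position of second_key in the ascending index, or None.

    Each round compares second_key with the intermediate value of the current
    sub-index and, on a mismatch, keeps only the half that could contain it.
    """
    low, high = 0, len(index) - 1
    while low <= high:
        mid = (low + high) // 2        # with two values left, picks the lower one
        if index[mid] == second_key:   # steps 1041/1044: comparison succeeds
            return mid                 # step 1042: the position maps to the data
        if index[mid] < second_key:    # steps 1043/1045: keep the high-value half
            low = mid + 1
        else:                          # otherwise keep the low-value half
            high = mid - 1
    return None                        # sub-index exhausted: data is not in the block


index = (1, 2, 3, 4, 5)
```

For the second keyword 4, the first comparison with the intermediate value 3 fails and the search continues in (4, 5), where it succeeds; for 3.5 the sub-index becomes empty and the lookup fails.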

The following description uses data search in an augmented reality map as an example.

Local data search algorithms include two types, hash lookup and binary search. Hash lookup used alone (that is, searching by traversal among the hash keys of all data according only to the hash key of the target data) has the advantage of high search speed; its disadvantages are that it consumes a large amount of memory, being a typical trade of space for time, and that key-value collisions need to be resolved. Binary search used alone (that is, searching the recursively divided sequence-number index according only to the sequence number of the target data, comparing only with the intermediate value of the index each time, until the data is found or the index is exhausted) has the advantages of simplicity, high query speed, and no need for extra memory space; its disadvantages are that the data needs to be kept in order, insertion and deletion are costly, and the average number of comparisons is high when searching mass data.
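The growth in the number of comparisons for binary search can be made concrete with a small calculation. A binary search over n sorted items needs at most floor(log2(n)) + 1 comparisons with an intermediate value. The data-set size and block size below are hypothetical figures chosen for illustration, not measurements from the embodiment, and the comparisons spent on a mapping table in a hash stage are not counted.

```python
def max_binary_comparisons(n: int) -> int:
    """Worst-case number of intermediate-value comparisons for a binary search
    over n sorted items: floor(log2(n)) + 1 for n >= 1, and 0 for n == 0."""
    return n.bit_length()


total_items = 1_000_000   # hypothetical size of the whole data set
items_per_block = 1_000   # hypothetical size of one data block

whole_set = max_binary_comparisons(total_items)      # search over all data
one_block = max_binary_comparisons(items_per_block)  # search inside one block
```

Restricting the binary search to a single block therefore shortens the binary stage, at the cost of first locating that block.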

The embodiment of the invention provides a mode of combining Hash lookup and binary lookup, wherein data lookup is divided into two stages of Hash lookup and binary lookup, and mainly depends on two data structures, one data structure is a Key (Key) used for data lookup, the other data structure is a mapping table used for storing the Hash Key of a data block and the storage position of the data block, and the two data structures are explained below.

The memory structure of the key of the data is shown in table 1 below:

Hash Key | Main Key

TABLE 1

The Key (Key) of the data is divided into two parts: the first part is Hash Key used for Hash search process, namely, finding out the data block needing binary search in the mapping table according to the Hash Key; the second part is Main Key, which is mainly used in binary search process, i.e. binary search is performed in the found data block according to Main Key.

For example, in the map of an augmented reality service, a set of multiple locations (points of interest, POIs) is used for delivering a specified task (a location is a target that a user can see on the augmented reality map, and a plurality of POIs form a set). In order to store the relationship between a set and a POI, the ID of the set (the unique identifier of the set) and the ID of the POI (the unique identifier of the POI) are combined into a key: the ID of the set (or a hash value obtained by encoding it with a hash algorithm) is used as the Hash Key of the key, and the ID of the POI (POI_ID) is used as the Main Key of the key.

Hash Key and data block storage location mapping table

The elements stored in the mapping table are mainly composed of two parts: the first part is a Hash Key in the keyword (which can be the ID of a place set in an augmented reality map, or a Hash value calculated by the ID of the set); the second part is the storage location of the data block corresponding to the Hash Key, which is used for binary lookup.

For the key of data to be searched, as long as the Hash Key in the key exists in the mapping table, the storage location of the data block to which the data belongs can be obtained quickly, and binary search is then performed in the memory space into which the data block has been read.

Data lookup procedure

Referring to fig. 13, fig. 13 is an optional flowchart of the data searching method according to the embodiment of the present invention, and the method includes the following steps:

First, the key of the target data is split, and the corresponding Hash Key and Main Key are obtained from it, wherein the Hash Key is used for hash lookup and the Main Key is used for binary search.

Second, the obtained Hash Key is compared with the Hash Key of each data block stored in the mapping table to see whether a matching Hash Key can be found. If not, no data block corresponding to the Hash Key exists, and the search fails. If a match is found, a data block corresponding to the Hash Key exists, and the data block is read into the memory space for the subsequent binary search.

Third, binary search is performed on the retrieved data block using the Main Key in the key.

The binary search process is as follows: first, the intermediate node of the data block is found, the data block is divided into a front part and a rear part with the intermediate node as the boundary, and the Main Key is then compared with the intermediate node. If the Main Key and the intermediate node are equal, the lookup succeeds and the corresponding static data is found; if the Main Key is smaller than the intermediate node, the above process is repeated in the front part until no intermediate node remains (indicating that the lookup fails); if the Main Key is larger than the intermediate node, the above process is repeated in the rear part until no intermediate node remains (indicating that the lookup fails).

For example, suppose there are 7 POIs on the augmented reality map with IDs 1, 2, 3, 4, 5, 6, and 7, and the 7 POIs are divided into 2 sets whose IDs are 1001 and 1002, respectively, giving the sets 1001: [1,3,7] and 1002: [2,4,5,6]. The resulting data is arranged as follows:

1001:[1,3,7]

1002:[2,4,5,6]

Assuming that the location with the static key 1002:5 (the ID of the set is 1002 and the ID of the POI is 5) needs to be found, the search proceeds as follows according to the above method:

First, the Hash Key and the Main Key in the static key are obtained, namely 1002 and 5.

Then, 1002 is used to find the corresponding data block in the mapping table, i.e. [2,4,5,6].

Finally, binary search is carried out on [2,4,5,6] by using 5, and finally the storage position mapped by 5 is found through comparison.

The information of the POI point can be read from the storage location for verification of the data or display to the user.
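The lookup of key 1002:5 walked through above can be sketched end to end as follows. Holding the blocks directly in a dictionary is a simplification of this sketch (in the embodiment the mapping table yields a storage location from which the block is read), the standard-library `bisect_left` stands in for the binary search stage, and the name `find_poi` is hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List, Optional

# Mapping table of the example: set ID (Hash Key) -> sorted POI IDs (data block).
# Keeping the blocks in the table itself is a simplification of this sketch.
MAPPING_TABLE: Dict[str, List[int]] = {
    "1001": [1, 3, 7],
    "1002": [2, 4, 5, 6],
}


def find_poi(key: str) -> Optional[int]:
    """Return the position of the POI inside its block, or None on failure."""
    hash_key, _, main_key = key.partition(":")  # split the key into its two parts
    block = MAPPING_TABLE.get(hash_key)          # hash lookup stage
    if block is None or not main_key:
        return None                              # no such block: lookup fails
    target = int(main_key)
    pos = bisect_left(block, target)             # binary search stage
    if pos < len(block) and block[pos] == target:
        return pos
    return None


position = find_poi("1002:5")
```

The returned position plays the role of the storage location mapped by the Main Key; POI 5 does not belong to set 1001, so `find_poi("1001:5")` fails.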

When this search method is applied to the function of searching for augmented reality objects (for example, various virtual objects added to the real environment, such as treasure props), the search efficiency can, in theory, be improved by 50% compared with direct binary search; that is, where the user could originally see the complete POI information within 5 seconds, the faster search allows the POI information to be found and displayed within 2.5 seconds.

The functional structure of the data processing apparatus provided in the embodiment of the present invention is described below. Referring to fig. 14, fig. 14 is an optional structural schematic diagram of the data processing apparatus provided in the embodiment of the present invention, which includes an extracting unit 210, a first searching unit 220, a first obtaining unit 230, a second searching unit 240, and a second obtaining unit 250; in addition, a keyword unit 260 may also be included.

An extracting unit 210 for extracting a first keyword and a second keyword from the keywords of the target data;

a first searching unit 220, configured to take the first keyword as an index, and sequentially compare the first keyword with hash keywords of each data block;

a first obtaining unit 230, configured to obtain, based on a storage location mapped by the first keyword when the comparison is successful, a target data block using the first keyword as a hash keyword;

a second searching unit 240, configured to take the second keyword as an index, and compare the second keyword with a middle value of the index of the target data block and a middle value of the index after recursive partitioning of the target data block in sequence; wherein the index comprises a sequentially arranged sequence number of data in the target data block;

a second obtaining unit 250, configured to obtain the target data from a corresponding storage location of the target data block based on the storage location mapped by the second keyword when the comparison is successful.

In one embodiment, the key unit 260 is configured to form a hash key of each data block based on a sequence number of the data block; sequentially assign sequence numbers to the data in each data block according to its arrangement; and combine the hash key of the data block to which each piece of data belongs with the sequence number assigned to that piece of data to form the key of the corresponding data.

In an embodiment, the key unit 260 is configured to perform hash coding on the serial number of each data block to obtain a hash key of the corresponding data block, or use the serial number of each data block as the hash key of the corresponding data block.
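What the key unit does for a whole collection of data blocks can be sketched as below. Numbering both blocks and data from 1 by position, using the block serial number directly as the hash key, and the name `build_keys` are assumptions of this sketch; the embodiment only requires that sequence numbers follow the arrangement of the data.

```python
from typing import Dict, List, Tuple


def build_keys(blocks: List[List[str]]) -> Dict[Tuple[str, str], str]:
    """Assign a key to every piece of data: the block's serial number serves
    directly as the Hash Key, and the 1-based position of the data in its
    block serves as the Main Key (numbering scheme assumed for this sketch)."""
    keyed: Dict[Tuple[str, str], str] = {}
    for block_no, block in enumerate(blocks, start=1):
        for seq_no, item in enumerate(block, start=1):
            keyed[(str(block_no), str(seq_no))] = item
    return keyed


keys = build_keys([["a", "b"], ["c"]])
```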

In an embodiment, the first lookup unit 220 is further configured to read a mapping table, where the mapping table includes a mapping relationship between a hash key of each data block and a corresponding storage location; and comparing the first key word serving as an index with the hash key word of each data block in the mapping table.

In an embodiment, the first searching unit 220 is further configured to extract a first sub-keyword and a second sub-keyword from the first keyword; take the first sub-keyword as an index, and sequentially compare the first sub-keyword with the hash keywords of each data block set; based on the storage position mapped by the first sub-keyword when the comparison is successful, acquire a target data block set taking the first sub-keyword as a hash keyword; and compare the second sub-keyword, as an index, with the hash keywords of the data blocks in the target data block set in sequence to obtain the target data block in the target data block set that takes the second sub-keyword as its hash keyword.

In an embodiment, the first obtaining unit 230 is further configured to, based on a storage location mapped by the first keyword when the comparison is successful, read, from a corresponding storage location of the non-volatile storage space, a target data block using the first keyword as a hash keyword into the memory space.

In an embodiment, the second obtaining unit 250 is further configured to read the target data from a corresponding storage location of the target data block stored in the memory space based on a storage location mapped by the second keyword in the target data block when the comparison is successful.

In an embodiment, the second lookup unit 240 is further configured to compare the second keyword, as an index, with the intermediate value of the index of the target data block; when the comparison is unsuccessful, equally divide the index of the target data block into a first index and a second index, and determine the first index where the value of the second keyword is located; compare the second keyword with the intermediate value of the first index; when the comparison is not successful, equally divide the first index into a new first index and a new second index, and determine the new first index where the value of the second keyword is located; and compare the second keyword with the intermediate value of the new first index until the comparison is successful, or the new first index does not contain the second keyword.

The embodiment of the present invention further provides a storage medium, which stores executable instructions for executing the data processing method provided in any one of fig. 11, fig. 12 and fig. 13 in the embodiment of the present invention, where the storage medium in the embodiment of the present invention may be a storage medium such as an optical disc, a flash memory, or a magnetic disc, and may be a non-transitory storage medium.

In summary, the embodiments of the present invention have the following advantages:

The first key is compared with the hash key of each data block to determine the target data block to which the target data belongs, and the search then continues within the target data block, thereby combining hash-key-based lookup with sequence-number-based lookup in a recursively divided index. On one hand, this avoids the large occupation of storage space caused by reading all data blocks and traversing each of them; on the other hand, it overcomes the low search efficiency caused by searching solely by the sequence number of the data in a recursively divided index, and so improves search efficiency.

Those skilled in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a removable storage device, a RAM, a ROM, a magnetic or optical disk, or various other media that can store program code.

Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a RAM, a ROM, a magnetic or optical disk, or various other media that can store program code.

The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A data processing method, comprising:

reading a mapping table to a memory space, wherein the mapping table comprises a mapping relation between hash keywords of each data block and a corresponding storage position;

comparing the first keyword serving as an index with the hash keywords of each data block in the mapping table in sequence;

based on the storage position mapped by the first keyword when the comparison is successful, reading a target data block taking the first keyword as a hash keyword from a corresponding storage position of a nonvolatile storage space into the memory space;

and acquiring the target data from the corresponding storage position of the target data block stored in the memory space based on the storage position mapped by the second keyword in the target data block when the comparison is successful.

2. The method of claim 1, further comprising:

forming a hash key of a corresponding data block based on the sequence number of each data block;

sequentially assigning sequence numbers to the arrangement relation of each data in each data block;

and combining the hash keywords of the data blocks to which the data belong and the sequence numbers correspondingly distributed to the data to form keywords of the corresponding data.

3. The method of claim 2, wherein forming a hash key for each of the data blocks based on the sequence number of the respective data block comprises:

and carrying out Hash coding on the serial number of each data block to obtain a Hash key word of the corresponding data block, or taking the serial number of each data block as the Hash key word of the corresponding data block.

4. The method of claim 1, wherein the first key is used as an index and is sequentially compared with hash keys of the data blocks; based on the storage location mapped by the first keyword when the comparison is successful, obtaining a target data block using the first keyword as a hash keyword, including:

extracting a first sub keyword and a second sub keyword from the first keyword;

taking the first sub-keyword as an index, and sequentially comparing the first sub-keyword with the hash keywords of each data block set;

based on the storage position mapped by the first sub-keyword when the comparison is successful, acquiring a target data block set taking the first sub-keyword as a Hash keyword;

and comparing the second sub-keyword serving as an index with the hash keywords of the data blocks in the target data block set in sequence to obtain the target data block in the target data block set, wherein the second keyword serves as a hash keyword.

5. The method of claim 1, wherein comparing the second keyword, as an index, in sequence with the middle value of the index of the target data block and with the middle value of the index after recursive division of the target data block comprises:

comparing the second keyword, as an index, with the middle value of the index of the target data block;

when the comparison is unsuccessful, equally dividing the index of the target data block into a first index and a second index, and determining the first index, in which the value of the second keyword is located;

comparing the second keyword with the middle value of the first index;

when the comparison is unsuccessful, equally dividing the first index into a new first index and a new second index, and determining the new first index, in which the value of the second keyword is located;

and comparing the second keyword with the middle value of the new first index, until the comparison is successful or the new first index does not contain the second keyword.
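The recursive halving described in claim 5 corresponds to a binary search over the sorted index of the target data block. The sketch below writes it iteratively; the function name `find_in_block` and the representation of a block as two parallel sequences (`index` and `data`) are assumptions for illustration.

```python
from typing import Optional, Sequence


def find_in_block(
    index: Sequence[int], data: Sequence[str], second_key: int
) -> Optional[str]:
    """Binary search: compare the key with the middle value of the index, then halve.

    `index` holds the sequentially arranged sequence numbers of the block;
    `data[i]` is the item whose sequence number is `index[i]`.
    """
    low, high = 0, len(index) - 1
    while low <= high:
        mid = (low + high) // 2
        if index[mid] == second_key:
            return data[mid]  # comparison successful
        if second_key < index[mid]:
            high = mid - 1  # the key lies in the lower half
        else:
            low = mid + 1  # the key lies in the upper half
    return None  # the remaining index does not contain the key


print(find_in_block([10, 11, 12, 13, 14], ["a", "b", "c", "d", "e"], 13))
```

Each unsuccessful comparison discards half of the remaining index, so a block of n items needs on the order of log2(n) comparisons.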

6. A data processing apparatus, comprising:

a first searching unit, configured to read a mapping table into a memory space, wherein the mapping table comprises a mapping relation between the hash keyword of each data block and a corresponding storage position; and to compare the first keyword, as an index, with the hash keywords of the data blocks in the mapping table in sequence;

a first obtaining unit, configured to, based on the storage position mapped by the first keyword when the comparison is successful, read the target data block having the first keyword as its hash keyword from the corresponding storage position in a nonvolatile storage space into the memory space;

and a second obtaining unit, configured to obtain the target data from the corresponding storage position of the target data block stored in the memory space, based on the storage position mapped by the second keyword in the target data block when the comparison is successful.
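The division of work in claim 6, with the mapping table held in memory and only the target data block read from nonvolatile storage, can be sketched with an ordinary file. Everything concrete here is an assumption: the storage position is taken to be a `(byte offset, length)` pair, records are newline-separated, and the names `write_blocks` and `lookup` are hypothetical. For brevity the sketch uses a dictionary lookup and direct positional access in place of the sequential and halving comparisons recited in the claims.

```python
import os
import tempfile
from typing import Dict, List, Optional, Tuple

Position = Tuple[int, int]  # assumed storage position: (byte offset, byte length)


def write_blocks(path: str, blocks: Dict[str, List[str]]) -> Dict[str, Position]:
    """Write all blocks to one file and return the in-memory mapping table."""
    mapping: Dict[str, Position] = {}
    with open(path, "wb") as f:
        for hash_keyword, records in blocks.items():
            payload = "\n".join(records).encode("utf-8")
            mapping[hash_keyword] = (f.tell(), len(payload))
            f.write(payload)
    return mapping


def lookup(
    path: str, mapping: Dict[str, Position], first_key: str, second_key: int
) -> Optional[str]:
    """Read only the target block into memory, then pick the target record."""
    position = mapping.get(first_key)
    if position is None:
        return None
    offset, length = position
    with open(path, "rb") as f:
        f.seek(offset)  # jump to the block's storage position
        records = f.read(length).decode("utf-8").split("\n")
    if 0 <= second_key < len(records):
        return records[second_key]
    return None


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "blocks.bin")
    demo_map = write_blocks(demo_path, {"k0": ["a", "b"], "k1": ["c", "d", "e"]})
    print(lookup(demo_path, demo_map, "k1", 2))
```

Only the small mapping table and a single data block occupy memory at any time, which is the resource saving the apparatus is aimed at.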

7. The data processing apparatus of claim 6, further comprising:

a keyword unit, configured to form the hash keyword of each data block based on the sequence number of that data block; assign sequence numbers to the data items in each data block in the order in which the data items are arranged; and combine the hash keyword of the data block to which each data item belongs with the sequence number assigned to that data item to form the keyword of that data item.

8. The data processing apparatus of claim 7, wherein

the keyword unit is further configured to hash-encode the sequence number of each data block to obtain the hash keyword of that data block, or to use the sequence number of each data block as the hash keyword of that data block.

9. The data processing apparatus of claim 6,

the first searching unit is further configured to extract a first sub-keyword and a second sub-keyword from the first keyword; compare the first sub-keyword, as an index, with the hash keywords of the data block sets in sequence; acquire, based on the storage position mapped by the first sub-keyword when the comparison is successful, a target data block set having the first sub-keyword as its hash keyword; and compare the second sub-keyword, as an index, with the hash keywords of the data blocks in the target data block set in sequence, to obtain from the target data block set the target data block having the second sub-keyword as its hash keyword.

10. The data processing apparatus of claim 6,

the second searching unit is further configured to compare the second keyword, as an index, with the middle value of the index of the target data block; when the comparison is unsuccessful, equally divide the index of the target data block into a first index and a second index, and determine the first index, in which the value of the second keyword is located; compare the second keyword with the middle value of the first index; when the comparison is unsuccessful, equally divide the first index into a new first index and a new second index, and determine the new first index, in which the value of the second keyword is located; and compare the second keyword with the middle value of the new first index, until the comparison is successful or the new first index does not contain the second keyword.

11. An electronic device, characterized in that the electronic device comprises:

a memory for storing executable instructions;

a processor for implementing the data processing method of any one of claims 1 to 5 when executing the executable instructions stored in the memory.

12. A computer-readable storage medium, characterized by storing executable instructions which, when executed, implement the data processing method of any one of claims 1 to 5.