CN109460406B - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN109460406B
Authority
CN
China
Prior art keywords
value
block
data
key
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811198259.6A
Other languages
Chinese (zh)
Other versions
CN109460406A (en
Inventor
李翰
黄斐一
李琳
邹易展
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Culture Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201811198259.6A priority Critical patent/CN109460406B/en
Publication of CN109460406A publication Critical patent/CN109460406A/en
Application granted granted Critical
Publication of CN109460406B publication Critical patent/CN109460406B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method, which comprises the following steps: receiving a storage request for key-value data; extracting key values and data values in the key-value data based on the storage request; and respectively storing the hash value of the key value and the data value in the form of data blocks, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data. The invention also provides a data processing device.

Description

Data processing method and device
Technical Field
The present invention relates to data processing technologies, and in particular, to a method and an apparatus for processing data.
Background
Small embedded devices have strict limits on memory (RAM) space and data storage (ROM) space, which leads to two requirements:
1. executable files for data manipulation should be as small as possible;
2. files used for storing data should be as small as possible; in addition, in view of the ease of use, the number of files used for data storage should be as small as possible, and in the most ideal case, data storage is realized by only a single file.
In the prior art, key-value pair (K-V, Key-Value) data is stored in a row-based form, that is, each row stores one K-V pair in a given format.
The prior art has the following problems. When data is modified, if a key (K1) in the key-value data is modified and the new K1 data is longer than the old K1 data, all data after the old K1 data must be moved backward as a whole; otherwise the newly written K1 data overwrites other key-value data. When data is deleted, if the deleted row is not the last row, a hole is left in the file, which wastes file space; to avoid this waste, all data after the hole must be moved forward as a whole. When data is searched, it can only be searched row by row, which results in long search times, high complexity and low search efficiency.
Disclosure of Invention
In order to solve the foregoing technical problem, embodiments of the present invention provide a data processing method and apparatus.
The technical scheme of the embodiment of the invention is realized as follows:
according to an aspect of the embodiments of the present invention, there is provided a data processing method, including:
receiving a storage request for key-value data;
extracting key values and data values in the key-value data based on the storage request;
and respectively storing the hash value of the key value and the data value in the form of data blocks, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data.
In the above scheme, the method further comprises:
when the storage length of the data value is larger than that of a single data block, storing the data value in a plurality of data blocks; wherein the plurality of data blocks for storing the data value are associated with each other by an index value of the block.
In the above scheme, the method further comprises:
the indexes of the key value blocks are associated to form a balanced binary tree structure;
when new key-value data is inserted, the hash value of the left node in the balanced binary tree structure is less than or equal to the hash value of the root node and the hash value of the right node, and the hash value of the root node is less than or equal to the hash value of the right node, so as to maintain the orderliness of the balanced binary tree structure; the height difference value of the left subtree and the right subtree of any root is less than or equal to 2 so as to keep the balance of the balanced binary tree structure.
In the above scheme, the method further comprises:
based on a block head index value in the key value block, searching a data value block corresponding to the key value block, wherein the block head index value is used for storing a block chain index value of the data value block;
and/or, based on the index value of the associated one of the blocks of data values, looking up the block index value of the preceding or succeeding block corresponding to the block of data values.
In the above scheme, the method further comprises:
detecting an empty block state in a block linked list storing the key-value data;
when the empty block state represents that no empty block exists in the block linked list, storing the data of the key value and the data of the data value in a preset storage position of the block linked list in a data block mode respectively;
or, when the empty block state indicates that there is an empty block in the block linked list, storing the data of the key value and the data of the data value in the position of the empty block in the form of a data block, respectively.
In the above scheme, the method further comprises:
receiving a modification request for key-value data;
determining a block index value of a key value block to be modified based on the modification request;
determining a target data value block corresponding to the target key value block based on a block header index in the target key value block corresponding to the block index value;
modifying the data value data in the target data value block.
In the above scheme, the method further comprises:
comparing the data storage length corresponding to the target data value block with the data storage length of the data value to be modified;
when the comparison result represents that the data value storage length corresponding to the target data value block is smaller than the storage length of the data value to be modified, adding a new data value block behind a block chain taking the target data value block as a block head;
and when the comparison result represents that the storage length of the data value corresponding to the target data value block is greater than the storage length of the data value to be modified, releasing the block space occupied by the data value corresponding to the target data value block.
In the above scheme, the method further comprises:
receiving a delete request for key-value data;
determining a block index value corresponding to the key value block to be deleted based on the deletion request;
determining a target data value block corresponding to the target key value block based on a block head index value in the target key value block corresponding to the block index value;
deleting the data value stored in the target data value block and the key value stored in the target key value block to release the block space occupied by the key value and the corresponding data value.
According to another aspect of embodiments of the present invention, there is provided a data processing apparatus, the apparatus including:
a receiving unit for receiving a storage request for key-value data;
an extracting unit, configured to extract, based on the storage request, a key value in the key-value data and a data value corresponding to the key value;
and a storage unit, configured to store the hash value of the key value and the data value respectively in the form of data blocks, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data.
According to a third aspect of the embodiments of the present invention, there is provided a data processing apparatus including: a memory, a processor and an executable program stored in the memory for execution by the processor, wherein the processor executes the executable program to perform the steps of any of the above data processing methods.
The technical scheme of the embodiment of the invention provides a data processing method and a data processing device, in which a storage request for key-value data is received; key values and data values in the key-value data are extracted based on the storage request; and the hash value of the key value and the data value are respectively stored in the form of data blocks, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data. Because the key-value pair data is stored in the form of data blocks and every data block has the same length, the data searching efficiency can be improved.
Drawings
FIG. 1 is a flow chart illustrating a data processing method according to an embodiment of the present invention;
FIG. 2 is a first block diagram illustrating a data processing apparatus according to an embodiment of the present invention;
FIG. 3 is a second block diagram illustrating a data processing apparatus according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating an AVL tree structure of a key-value block according to an embodiment of the present invention;
FIG. 5 is a block diagram illustrating a file structure of a block of data values according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating the insertion of key-value pair (K-V) data according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating the insertion of key-value pair (K-V) data according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating the insertion of key-value pair (K-V) data according to an embodiment of the present invention;
FIG. 9 is a diagram illustrating a modification of key-value pair (K-V) data according to an embodiment of the present invention;
FIG. 10 is a diagram illustrating deletion of key-value pair (K-V) data according to an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
FIG. 1 is a flow chart illustrating a data processing method according to an embodiment of the present invention; as shown in fig. 1, the method includes:
step 101, receiving a storage request for key-value data;
in the embodiment of the invention, the method is mainly applied to electronic equipment with a key-value data processing function; the electronic equipment can be a terminal such as a desktop computer or a server. Here, key-value data refers to key-value pairs, which are the programming-language realization of the mathematical concept of a mapping. The key is used as the index of an element, and the value represents the data that is stored and read.
Step 102, based on the storage request, extracting a key value and a data value in the key-value data;
in the embodiment of the invention, when the electronic equipment receives a storage request for key-value data, the electronic equipment responds to the storage request and extracts key (key) data and value (value) data in a key-value pair corresponding to the storage request respectively based on the storage request.
Step 103, storing the hash value of the key value and the data value in the form of data blocks respectively, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data.
In the embodiment of the invention, after the electronic device extracts the key data and the value data to be stored, the key data and the value data are respectively stored in the form of data blocks, and a single key value block has the same storage length as a single data value block.
Specifically, a file structure in which key (key) data and value (value) data are stored in the form of data blocks is shown in table 1.
block0 block1 block2 block3 block4
block5 block6 block7 block8 block9
TABLE 1
In table 1, each cell represents one block, and the storage length of every block is the same; that is, a single key value block has the same storage length as a single data value block. Block index values within the same storage file usually start from 0, so the first block is denoted "block0".
In the embodiment of the invention, when the electronic equipment stores the key data in the form of a data block, the key data is not stored directly; instead, the key data is converted by a Hash algorithm to obtain the Hash value of the key data, and the Hash value is then stored. Here, a Hash algorithm (also called a hashing algorithm) converts input key data of arbitrary length into an output of fixed length. Thus, errors occurring during data transmission can be intuitively detected.
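The conversion of a key of arbitrary length into a fixed-length Hash value can be sketched as follows. This is only an illustration: the description does not name a specific Hash algorithm, so CRC-32 from Python's standard library is an assumption, chosen because the example Hash values in this description are eight hexadecimal digits (32 bits) long. The function name hash_key is hypothetical.

```python
import zlib


def hash_key(key: str) -> bytes:
    """Map a key of any length to a fixed 4-byte value.

    CRC-32 is an assumed stand-in; the source does not say which
    Hash algorithm is actually used.
    """
    return zlib.crc32(key.encode("utf-8")).to_bytes(4, "big")


# Keys of very different lengths always produce 4 bytes (8 hex digits).
for k in ("key", "a-much-longer-key-name-than-the-first-one"):
    print(k, hash_key(k).hex().upper())
```

Because the output length never changes, the Hash value always fits in a fixed-size field of a key value block, regardless of how long the original key is.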
The storage structure of a specific key value block may be as shown in table 2:
Type(0)
Left-child
Right-child
Data-head
Hash(key)
TABLE 2
In table 2, Type indicates that the content stored in the block is a key value; Hash(key) stores the Hash value of the key; Left-child stores the index of the left sub-tree key value block; Right-child stores the index of the right sub-tree key value block; Data-head stores the head block index of the linked list of data value blocks. If the index value stored in Left-child or Right-child is 0xFFFFFFFF, the current key value block has no left or right subtree, respectively. It can be seen that the indexes of all the key value blocks form a binary tree structure.
When inserting new key-value data, in order for the binary tree to remain an AVL (Adelson-Velsky and Landis) tree, two properties must be maintained: 1) ordering; 2) balance;
wherein ordering means that the Hash value of every root node is larger than the Hash value of its left node and smaller than the Hash value of its right node; balance means that the heights of the left and right subtrees of any root cannot differ by more than 2.
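A minimal sketch of how the fields of table 2 might be serialized into one fixed-length block is given below. The field widths (4 bytes each, little-endian), the 32-byte block size and the function names are assumptions made for illustration; the description fixes only the field names and the 0xFFFFFFFF marker.

```python
import struct

BLOCK_SIZE = 32        # assumed fixed block length in bytes
NO_INDEX = 0xFFFFFFFF  # "no subtree" marker from the description
TYPE_KEY = 0           # Type(0) marks a key value block
# Assumed layout: Type, Left-child, Right-child, Data-head, Hash(key)
KEY_FMT = "<IIIII"


def pack_key_block(left: int, right: int, data_head: int, key_hash: int) -> bytes:
    """Serialize one key value block and pad it to BLOCK_SIZE."""
    body = struct.pack(KEY_FMT, TYPE_KEY, left, right, data_head, key_hash)
    return body.ljust(BLOCK_SIZE, b"\x00")


def unpack_key_block(block: bytes) -> dict:
    """Read the five header fields back out of a key value block."""
    btype, left, right, data_head, key_hash = struct.unpack_from(KEY_FMT, block)
    return {"type": btype, "left": left, "right": right,
            "data_head": data_head, "hash": key_hash}


# A leaf key block whose value chain starts at block 3.
block = pack_key_block(NO_INDEX, NO_INDEX, 3, 0x0287014A)
fields = unpack_key_block(block)
```

Storing child links as block indexes rather than memory pointers is what lets the whole tree live inside a single file of equal-sized blocks.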
That is, when key value pairs are added or deleted, i.e., the number of key (k) blocks changes, the order and balance of the AVL tree need to be maintained.
Because the key value blocks in the embodiment of the invention form an ordered binary tree structure, when key-value pair data is added, the input key (K) data is first converted by a Hash algorithm into a fixed-length output, namely the Hash value. Then, the position in the AVL tree structure of the key value block to be stored is determined according to the Hash value. Next, based on the determined position, the Hash value of the key value block to be stored is compared with the first Hash value of a first adjacent key value block and with the second Hash value of a second adjacent key value block to obtain a comparison result. When the comparison result indicates that the Hash value of the key value block to be stored is greater than the first Hash value and less than the second Hash value, the block index value of the key value block to be stored is determined as the index value of the tree root in the AVL tree structure, the block index value of the first key value block is determined as the index value of the left sub-tree, and the block index value of the second key value block is determined as the index value of the right sub-tree. Since the AVL tree is always balanced and its structure is not changed by data queries, storing the key value blocks in a tree structure can improve data query efficiency.
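The step of locating the position of a new key value block by comparing Hash values can be sketched as an ordered descent through the tree. This sketch deliberately omits rebalancing and shows only the ordering property. The in-memory dictionary of blocks, the block indexes 2 and 4, and the name insert_ordered are illustrative assumptions; equal Hash values are sent to the right, since the description does not say how collisions are handled.

```python
NO_INDEX = 0xFFFFFFFF  # "no child" marker from the description


def insert_ordered(blocks: dict, root: int, new_index: int) -> int:
    """Link blocks[new_index] into the tree rooted at `root`.

    Returns the index of the tree root. Ordering only: no rebalancing.
    """
    if root == NO_INDEX:
        return new_index  # empty tree: the new block becomes the root
    cur = root
    while True:
        goes_left = blocks[new_index]["hash"] < blocks[cur]["hash"]
        side = "left" if goes_left else "right"
        if blocks[cur][side] == NO_INDEX:
            blocks[cur][side] = new_index  # free slot found: link here
            return root
        cur = blocks[cur][side]            # otherwise keep descending


# Hypothetical block indexes; the hashes are the example values for
# "key" and "key1" used later in this description.
blocks = {
    2: {"hash": 0x0287014A, "left": NO_INDEX, "right": NO_INDEX},
    4: {"hash": 0x0402017B, "left": NO_INDEX, "right": NO_INDEX},
}
root = insert_ordered(blocks, NO_INDEX, 2)
root = insert_ordered(blocks, root, 4)
```

After these two insertions the block with the larger Hash value hangs off the right child link of the first block.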
Here, when the electronic device stores a Hash value of key (key) data, the length of the Hash value may be less than or equal to the length of a single key value block and/or a single data value block that can store data.
Fig. 4 is a schematic structural diagram of an AVL tree of key value blocks in the embodiment of the present invention. As shown in fig. 4, taking block0, block1, block2 and block3 as examples: block0 is the root block with a Hash value of BXXXXXXX; block1 is the left sub-tree block with a Hash value of AXXXXXXX; block2 is the right sub-tree block with a Hash value of CXXXXXXX; and AXXXXXXX < BXXXXXXX < CXXXXXXX, i.e. the Hash value of the root block is greater than the Hash value of the left sub-tree block and less than the Hash value of the right sub-tree block. block3 is a data value block, and the data value block head index of block0 is block3.
The storage structure of a particular block of data values may be as shown in table 3:
Type(1)
Prev-index
Next-index
Data-type
Data
TABLE 3
In table 3, Type indicates that the content stored in the block is data; Prev-index stores the index of the predecessor block of the data value block; Next-index stores the index of the successor block of the data value block. If the value of Prev-index or Next-index is 0xFFFFFFFF, the data value block currently has no predecessor or successor block, respectively. Data-type represents the permission type (read-only, read-write, etc.) of the data value block; Data holds the data actually stored in the data value block. Since the storage length of a single data value block is fixed, when the storage length of a data value to be stored is greater than that of a single data value block, the data value may be stored in multiple data value blocks that are associated with each other by block index values, i.e., the blocks are concatenated into a block chain by Prev-index and Next-index.
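Splitting a long data value across several chained blocks can be sketched as follows. The 16-byte payload size, the use of consecutive block indexes and the name build_value_chain are assumptions for illustration only; in the scheme described here the blocks of a chain need not be adjacent in the file.

```python
NO_INDEX = 0xFFFFFFFF  # "no predecessor/successor" marker from the description
PAYLOAD = 16           # assumed number of data bytes one block can hold


def build_value_chain(value: bytes, first_index: int) -> dict:
    """Split `value` into PAYLOAD-sized chunks linked by prev/next indexes.

    For simplicity the chunks are given consecutive block indexes
    starting at `first_index` (an assumption of this sketch).
    """
    chunks = [value[i:i + PAYLOAD] for i in range(0, len(value), PAYLOAD)] or [b""]
    chain = {}
    for n, chunk in enumerate(chunks):
        idx = first_index + n
        chain[idx] = {
            "type": 1,  # Type(1) marks a data value block
            "prev": idx - 1 if n > 0 else NO_INDEX,
            "next": idx + 1 if n < len(chunks) - 1 else NO_INDEX,
            "data": chunk,
        }
    return chain


# A 40-byte value needs three 16-byte blocks: indexes 3, 4 and 5.
chain = build_value_chain(b"x" * 40, first_index=3)
```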
Fig. 5 is a schematic diagram of a file structure of a data value block in the embodiment of the present invention, as shown in fig. 5: taking block0, block1, block2 and block3 as examples, wherein block0 is a precursor block of block 1; block1 is a successor to block0 and is a predecessor to block 2; block2 is a successor to block1 and is a predecessor to block 3; block3 is a successor to block2, and 0xFFFFFFFF indicates that there is no predecessor or successor block currently.
In the embodiment of the present invention, although the content stored in each block differs, the storage length of every block is the same, and blocks of different types can be converted into one another by the block operation program.
In the embodiment of the present invention, when a data value block needs to be searched, the electronic device may further search a data value block corresponding to the key value block based on a block head index value in the key value block, where the block head index value is used to store a block chain index value of the data value block. When the length of the data value to be searched is greater than the storage length of a single block, the electronic device may further search the block index value of the predecessor block or successor block corresponding to the data value block based on the index value of the associated block in the data value block. Thus, the searching efficiency of the data value block can be improved.
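The lookup just described, from a key value block to its data value chain, can be sketched as a walk over Next-index links. The dictionary representation of blocks, the example indexes and the name read_value are assumptions for illustration.

```python
NO_INDEX = 0xFFFFFFFF  # end-of-chain marker from the description


def read_value(blocks: dict, key_index: int) -> bytes:
    """Follow Data-head, then Next-index links, and join the stored data."""
    idx = blocks[key_index]["data_head"]
    parts = []
    while idx != NO_INDEX:
        parts.append(blocks[idx]["data"])
        idx = blocks[idx]["next"]
    return b"".join(parts)


# Hypothetical layout: key block 2 points at a two-block value chain 3 -> 5.
blocks = {
    2: {"data_head": 3},
    3: {"next": 5, "data": b"hello "},
    5: {"next": NO_INDEX, "data": b"world"},
}
value = read_value(blocks, 2)
```

Only the blocks belonging to the chain are visited, so reading a value does not require scanning the rest of the file.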
The following describes a detailed workflow of an embodiment of the present invention:
first, a storage file is initialized and a fixed file header is generated (as shown in table 4).
[Figure GDA0001956219060000091: image of Table 4, the fixed file header blocks]
TABLE 4
In table 4, block0 and block1 are generated file headers, where the Next-index of block0 is to be used for a tree root pointing to a key value block, and block1 is used as a block header node of an empty block chain.
When key-value pair (K-V) data is inserted, the electronic device may also detect an empty block state in the block linked list storing key-value data. When the empty block state indicates that the block linked list has no empty block, the K data and the V data in the K-V data to be inserted can be respectively stored, in the form of data blocks, in preset storage positions of the block linked list; or, when the empty block state indicates that the block linked list has an empty block, the K data and the V data can be stored directly in the position of the empty block in the form of data blocks.
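The choice between reusing an empty block and adding a new one can be sketched with a simple free list. In this sketch the "preset storage position" is assumed to be the end of the file, block1 is the head of the empty block chain as described for the file header, and the empty chain is singly linked for brevity. The names new_store, allocate and release are hypothetical.

```python
NO_INDEX = 0xFFFFFFFF  # end-of-chain marker from the description


def new_store() -> list:
    """Create the two header blocks: block0 (tree root link) and block1 (empty chain head)."""
    return [{"next": NO_INDEX}, {"next": NO_INDEX}]


def allocate(blocks: list) -> int:
    """Return the index of a usable block, reusing an empty block if one exists."""
    head = blocks[1]
    if head["next"] != NO_INDEX:
        idx = head["next"]
        head["next"] = blocks[idx]["next"]  # unlink from the empty chain
        blocks[idx] = {"next": NO_INDEX}
        return idx
    blocks.append({"next": NO_INDEX})       # no empty block: grow the file
    return len(blocks) - 1


def release(blocks: list, idx: int) -> None:
    """Put block `idx` at the front of the empty block chain."""
    blocks[idx] = {"next": blocks[1]["next"]}
    blocks[1]["next"] = idx


store = new_store()
a = allocate(store)       # file grows: block 2
b = allocate(store)       # file grows: block 3
release(store, a)         # block 2 becomes empty
reused = allocate(store)  # block 2 is handed out again
```

Reusing released blocks this way keeps the file from growing while holes remain, which addresses the data-hole problem described in the background.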
Fig. 6 is a schematic diagram of inserting key-value pair (K-V) data in the embodiment of the present invention. As shown in fig. 6, K is "key", V is "value", and the Hash value of "key" is 0287014A. When no empty block is detected in the current empty block chain, several blocks for storing the K-V data are newly added to the generated storage file and appended to the end of the block chain in the storage file.
Fig. 7 is a schematic diagram of inserting key-value pair (K-V) data in the embodiment of the present invention. As shown in fig. 7, when K-V data continues to be added, the inserted K is "key1", V is "value1", and the Hash value of "key1" is 0402017B. When no empty block exists in the current empty block chain, several blocks for storing the K-V data are newly allocated in the storage file and appended to the end of the block chain in the storage file.
In fig. 7, since Hash(key1) is greater than Hash(key), the key block of key1 is placed into the right sub-tree of key.
Fig. 8 is a schematic diagram of inserting key-value pair (K-V) data in the embodiment of the present invention. As shown in fig. 8, when K-V data continues to be added, the inserted K is "key2", V is "value2", and the Hash value of "key2" is 0403017C. When the detection result indicates that there is no empty block in the current empty block chain, several blocks for storing the K-V data are added to the storage file and appended to the end of the storage file.
In fig. 8, since Hash(key2) is greater than Hash(key1) and Hash(key1) is greater than Hash(key), the key block of key1 is adjusted to be the tree root of the binary tree, key becomes the left sub-tree, and key2 becomes the right sub-tree, so as to maintain the ordering and balance of the AVL tree.
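The adjustment in fig. 8 can be reproduced with a textbook AVL insertion over block indexes. This sketch uses the conventional AVL rule (subtree heights differ by at most 1), which is an assumption; the source does not give its rotation procedure. The block indexes 2, 4 and 6 and all function names are hypothetical, and heights are recomputed on demand rather than stored, to keep the code short.

```python
NO_INDEX = 0xFFFFFFFF  # "no child" marker from the description


def height(blocks, i):
    """Height of the subtree rooted at block i (0 for an empty subtree)."""
    if i == NO_INDEX:
        return 0
    return 1 + max(height(blocks, blocks[i]["left"]),
                   height(blocks, blocks[i]["right"]))


def rotate_left(blocks, i):
    """Promote the right child of i; return the new subtree root."""
    r = blocks[i]["right"]
    blocks[i]["right"] = blocks[r]["left"]
    blocks[r]["left"] = i
    return r


def rotate_right(blocks, i):
    """Promote the left child of i; return the new subtree root."""
    l = blocks[i]["left"]
    blocks[i]["left"] = blocks[l]["right"]
    blocks[l]["right"] = i
    return l


def avl_insert(blocks, root, new):
    """Insert block `new` under `root`, rebalance, return the subtree root."""
    if root == NO_INDEX:
        return new
    node = blocks[root]
    side = "left" if blocks[new]["hash"] < node["hash"] else "right"
    node[side] = avl_insert(blocks, node[side], new)
    balance = height(blocks, node["left"]) - height(blocks, node["right"])
    if balance > 1:    # left-heavy
        child = blocks[node["left"]]
        if height(blocks, child["left"]) < height(blocks, child["right"]):
            node["left"] = rotate_left(blocks, node["left"])
        return rotate_right(blocks, root)
    if balance < -1:   # right-heavy
        child = blocks[node["right"]]
        if height(blocks, child["right"]) < height(blocks, child["left"]):
            node["right"] = rotate_right(blocks, node["right"])
        return rotate_left(blocks, root)
    return root


def leaf(h):
    return {"hash": h, "left": NO_INDEX, "right": NO_INDEX}


# key, key1, key2 with the Hash values from figs. 6 to 8.
blocks = {2: leaf(0x0287014A), 4: leaf(0x0402017B), 6: leaf(0x0403017C)}
root = NO_INDEX
for index in (2, 4, 6):
    root = avl_insert(blocks, root, index)
```

With these three Hash values the third insertion makes the tree right-heavy, and a single left rotation promotes the key1 block to the root with key on its left and key2 on its right.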
In the embodiment of the present invention, the electronic device may further receive a modification request for key-value data, respond to the modification request, and determine, based on the modification request, a block index value of a key value block to be modified; then, based on the block head index in the target key value block corresponding to the block index value, determining a target data value block corresponding to the target key value block; finally, the data value data in the target data value block is modified.
Fig. 9 is a schematic diagram of modifying key-value pair (K-V) data in the embodiment of the present invention. As shown in fig. 9, if the value of key is to be modified to value_new, the Hash value of key is first calculated to obtain 0287014A; the key is then searched for in the key block tree pointed to by block0; when the key block index corresponding to key is found to be block2, the value block3 pointed to by block2 is modified.
Here, after the key block corresponding to the key is found, the electronic device may further take the data value block pointed to by that key block as the target data value block, and compare the data storage length corresponding to the target data value block with the data storage length of the data value to be modified. When the comparison result indicates that the data value storage length corresponding to the target data value block is smaller than the storage length of the data value to be modified, a new data value block can be added at the end of the block chain that has the target data value block as its block head; or, when the comparison result indicates that the storage length of the data value corresponding to the target data value block is greater than the storage length of the data value to be modified, the block space occupied by the data value corresponding to the target data value block may be released.
As shown in fig. 9, if the storage length that the value chain with block3 as the block head can accommodate is smaller than the storage length of value_new, a new value block for storing V data is added to the storage file and appended to the block chain with block3 as the block head; if the storage length that the value chain can accommodate exceeds the storage length of value_new by one or more value blocks, the redundant value blocks can be released to the empty block chain.
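The length comparison that decides whether a value chain grows or shrinks can be sketched as a small planning function. The 16-byte payload size and the names blocks_needed and plan_resize are assumptions for illustration; the sketch only computes how many blocks to append or release and leaves the actual linking to the block operations.

```python
import math

PAYLOAD = 16  # assumed number of data bytes one value block can hold


def blocks_needed(length: int) -> int:
    """Number of value blocks required for `length` bytes (at least one)."""
    return max(1, math.ceil(length / PAYLOAD))


def plan_resize(chain_len: int, new_value: bytes):
    """Compare the current chain length with what the new value requires."""
    need = blocks_needed(len(new_value))
    if need > chain_len:
        return ("append", need - chain_len)   # add blocks at the chain end
    if need < chain_len:
        return ("release", chain_len - need)  # return blocks to the empty chain
    return ("keep", 0)                        # overwrite in place


grow = plan_resize(1, b"x" * 40)    # 40 bytes need 3 blocks, chain has 1
shrink = plan_resize(3, b"x" * 10)  # 10 bytes need 1 block, chain has 3
same = plan_resize(1, b"abc")
```

Because only the tail of one chain changes, no other key-value data in the file has to be moved when a value becomes longer or shorter.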
In the embodiment of the present invention, the electronic device may further receive a deletion request for the key-value data; responding to the deletion request, and determining a block index value corresponding to the key value block to be deleted based on the deletion request; then, based on the block head index value in the target key value block corresponding to the block index value, determining a target data value block corresponding to the target key value block; and finally, deleting the data value stored in the target data value block and the key value stored in the target key value block so as to release the block space occupied by the key value and the corresponding data value.
Fig. 10 is a schematic diagram of deleting key-value pair (K-V) data in the embodiment of the present invention. As shown in fig. 10, if the key-value pair to be deleted is that of key, the key block index block2 corresponding to key is found first, and then the value chain head block3 pointed to by block2 is found; all blocks in the value block chain are then released to the empty block chain, the key block is released to the empty block chain, and finally the tree nodes in the AVL tree are readjusted.
In fig. 10, two additional empty blocks, block2 and block3, are added to the empty block chain with block1 as the block head.
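The release of block2 and block3 to the empty block chain can be sketched as follows. The dictionary representation, the singly linked empty chain and the names release and delete_pair are assumptions for illustration; unlinking the key block from the AVL tree and rebalancing are omitted from this sketch.

```python
NO_INDEX = 0xFFFFFFFF  # end-of-chain marker from the description


def release(blocks: dict, idx: int) -> None:
    """Put block `idx` at the front of the empty chain headed by block1."""
    blocks[idx] = {"type": "empty", "next": blocks[1]["next"]}
    blocks[1]["next"] = idx


def delete_pair(blocks: dict, key_index: int) -> None:
    """Release every block of the value chain, then the key block itself."""
    idx = blocks[key_index]["data_head"]
    while idx != NO_INDEX:
        nxt = blocks[idx]["next"]  # remember the successor before overwriting
        release(blocks, idx)
        idx = nxt
    release(blocks, key_index)


# Layout from fig. 10: key block2 points at the value chain head block3.
blocks = {
    1: {"next": NO_INDEX},                   # head of the empty block chain
    2: {"data_head": 3},                     # key block for "key"
    3: {"next": NO_INDEX, "data": b"value"}, # value block for "value"
}
delete_pair(blocks, 2)
```

After the call, the empty chain runs block1, block2, block3, so both blocks are available for the next insertion.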
According to the embodiment of the invention, the K-V data is stored in the form of data blocks, and all key (K) blocks are managed through the AVL tree, so that data searching efficiency is improved when data is inserted, modified and deleted.
Fig. 2 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention, as shown in fig. 2, the apparatus includes:
a receiving unit 201 for receiving a storage request for key-value data;
an extracting unit 202, configured to extract, based on the storage request, a key value in the key-value data and a data value corresponding to the key value;
the storage unit 203 is configured to store the hash value of the key value and the data value in the form of data blocks, where a single key value block and a single data value block have the same storage length, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block that can store data.
In this embodiment of the present invention, the storage unit 203 is further configured to store the data value in the form of a plurality of data blocks when the storage length of the data value is greater than the storage length of a single data block; wherein the plurality of data blocks for storing the data value are associated with each other by an index value of the block.
In the embodiment of the invention, indexes of each key value block are associated to form a balanced binary tree structure;
when new key-value data is inserted, the hash value of the left node in the balanced binary tree structure is less than or equal to the hash value of the root node and the hash value of the right node, and the hash value of the root node is less than or equal to the hash value of the right node, so as to maintain the orderliness of the balanced binary tree structure; the height difference value of the left subtree and the right subtree of any root is less than or equal to 2 so as to keep the balance of the balanced binary tree structure.
In the embodiment of the present invention, the apparatus further includes: a search unit 204;
the searching unit 204 is configured to search a data value block corresponding to the key value block based on a block head index value in the key value block, where the block head index value is used to store a block chain index value of the data value block; and/or, based on the index value of the associated one of the blocks of data values, looking up the block index value of the preceding or succeeding block corresponding to the block of data values.
In the embodiment of the present invention, the apparatus further includes: a detection unit 205;
the detection unit 205 is configured to detect an empty block state in a block linked list storing the key-value data;
the storage unit 203 is specifically configured to store the data of the key value and the data of the data value in a preset storage position of the block linked list in a data block form, respectively, when the empty block state indicates that there is no empty block in the block linked list; or, when the empty block state indicates that there is an empty block in the block linked list, storing the data of the key value and the data of the data value in the position of the empty block in the form of a data block, respectively.
In the embodiment of the present invention, the apparatus further includes: a determination unit 206 and a modification unit 207;
the receiving unit 201 is further configured to receive a modification request for key-value data;
the determining unit 206 is configured to determine, based on the modification request, a block index value of the key value block to be modified; and to determine a target data value block corresponding to the target key value block based on a block header index in the target key value block corresponding to the block index value;
the modifying unit 207 is configured to modify the data value data in the target data value block.
In the embodiment of the present invention, the apparatus further includes: a comparison unit 208, an addition unit 209, and a release unit 210;
the comparing unit 208 is further configured to compare the data storage length corresponding to the target data value block with the data storage length of the data value to be modified;
the adding unit 209 is configured to, when the comparison result indicates that the data value storage length corresponding to the target data value block is smaller than the storage length of the data value to be modified, add a new data value block after a block chain with the target data value block as a block header;
the releasing unit 210 is configured to release a block space occupied by a data value corresponding to the target data value block when the comparison result indicates that the data value storage length corresponding to the target data value block is greater than the storage length of the data value to be modified.
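The comparison, addition and release steps can be illustrated with the Python sketch below. It assumes a block chain represented as a list of block indexes whose first element is the block header, a dictionary `store` mapping block indexes to payloads, and a list `free` of released indexes; these structures, the 8-byte payload size and the function names are hypothetical.

```python
from typing import Dict, List

BLOCK_PAYLOAD = 8  # assumed number of payload bytes per block (hypothetical)


def blocks_needed(length: int) -> int:
    """Number of blocks required for `length` bytes (at least one)."""
    return max(1, -(-length // BLOCK_PAYLOAD))


def modify_value(store: Dict[int, bytes], chain: List[int],
                 free: List[int], new_value: bytes) -> List[int]:
    """Rewrite the value held by `chain`; return the updated chain."""
    need = blocks_needed(len(new_value))
    chain = list(chain)
    while len(chain) < need:   # new value is longer: add blocks after the chain
        index = free.pop() if free else max(store, default=-1) + 1
        store[index] = b""
        chain.append(index)
    while len(chain) > need:   # new value is shorter: release surplus blocks
        index = chain.pop()
        del store[index]
        free.append(index)
    for i, index in enumerate(chain):
        store[index] = new_value[i * BLOCK_PAYLOAD:(i + 1) * BLOCK_PAYLOAD]
    return chain
```

The head of the chain never changes in this sketch, so the block header index held in the key value block remains valid after the modification.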
In this embodiment of the present invention, the receiving unit 201 is further configured to receive a deletion request for key-value data;
the determining unit 206 is further configured to determine, based on the deletion request, a block index value corresponding to the key value block to be deleted; and determining a target data value block corresponding to the target key value block based on a block head index value in the target key value block corresponding to the block index value;
the releasing unit 210 is further configured to delete the data value stored in the target data value block and the key value stored in the target key value block, so as to release the block space occupied by the key value and the corresponding data value.
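A deletion along these lines might be sketched in Python as follows. The dictionary layouts (`key_blocks` mapping a block index to a pair of key hash and head index, `value_blocks` mapping a block index to a pair of next index and payload) and the name `delete_entry` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

NIL = -1  # assumed marker for the end of a block chain


def delete_entry(key_blocks: Dict[int, Tuple[int, int]],
                 value_blocks: Dict[int, Tuple[int, bytes]],
                 key_index: int) -> List[int]:
    """Delete a key value block and its data value chain.

    Returns the indexes of the released data value blocks.
    """
    # The key value block holds (key_hash, head_index).
    _, index = key_blocks.pop(key_index)
    released: List[int] = []
    while index != NIL:
        # Each data value block holds (next_index, payload).
        next_index, _ = value_blocks.pop(index)
        released.append(index)
        index = next_index
    return released
```

The returned indexes could then be recorded as empty blocks so that later storage requests reuse them.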
It should be noted that the division into program modules described for the data processing apparatus of the above embodiment is only an example. In practical applications, the processing may be distributed among different program modules as needed; that is, the internal structure of the data processing apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the data processing apparatus and the data processing method provided by the above embodiments belong to the same concept; the specific implementation of the apparatus is described in detail in the method embodiments and is not repeated here.
Fig. 3 is a schematic diagram illustrating a structure of a data processing apparatus 300, which may be a mobile phone, a computer, a digital broadcast terminal, an information transceiver, a game console, a tablet device, a personal digital assistant, an information push server, a content server, or the like. The data processing apparatus 300 shown in fig. 3 includes: at least one processor 301, memory 302, at least one network interface 304, and a user interface 305. The various components in data processing apparatus 300 are coupled together by a bus system 306. It is understood that the bus system 306 is used to enable connective communication between these components. The bus system 306 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 306 in FIG. 3.
The user interface 305 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, keys, buttons, a touch pad, or a touch screen.
It will be appreciated that the memory 302 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory can be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 302 described in connection with the embodiments of the invention is intended to comprise, without being limited to, these and any other suitable types of memory.
The memory 302 in embodiments of the present invention is used to store various types of data to support the operation of the data processing apparatus 300. Examples of such data include: any computer programs for operating on data processing apparatus 300, such as an operating system 3021 and application programs 3022; music data; animation data; book information; video, etc. Operating system 3021 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and for processing hardware-based tasks. The application programs 3022 may contain various application programs such as a Media Player (Media Player), a Browser (Browser), etc. for implementing various application services. A program implementing the method of an embodiment of the present invention may be included in the application program 3022.
The method disclosed in the above embodiments of the present invention may be applied to the processor 301, or implemented by the processor 301. The processor 301 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 301. The Processor 301 may be a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 301 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 302, and the processor 301 reads the information in the memory 302 and performs the steps of the aforementioned methods in conjunction with its hardware.
In an exemplary embodiment, the data processing apparatus 300 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, Micro Controllers (MCUs), microprocessors (microprocessors), or other electronic components for performing the foregoing methods.
Specifically, when the processor 301 runs the computer program, it executes: receiving a storage request for key-value data; extracting key values and data values in the key-value data based on the storage request; and respectively storing the hash value of the key value and the data value in the form of data blocks, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data.
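The storage step can be pictured with a small Python sketch. The embodiment does not name a hash function or a block size, so the use of SHA-256, the 64-byte block, the 16-byte header and the function name `split_request` are assumptions chosen only to show the length condition.

```python
import hashlib
from typing import Tuple

BLOCK_SIZE = 64                          # assumed total length of one block
HEADER_SIZE = 16                         # assumed bytes reserved for index fields
PAYLOAD_SIZE = BLOCK_SIZE - HEADER_SIZE  # bytes of a block able to store data


def split_request(pair: Tuple[str, bytes]) -> Tuple[bytes, bytes]:
    """Extract key and data value; return (hash of the key, data value)."""
    key, value = pair
    # SHA-256 is an assumption; it yields a fixed 32-byte digest.
    key_hash = hashlib.sha256(key.encode("utf-8")).digest()
    # The hash must fit in the part of a single block that can store data.
    if len(key_hash) > PAYLOAD_SIZE:
        raise ValueError("hash does not fit in one key value block")
    return key_hash, value
```

Because the hash has a fixed length regardless of the key, every key occupies exactly one key value block in this sketch, while the data value may span several blocks of the same size.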
Specifically, when the processor 301 runs the computer program, the following steps are further performed: when the storage length of the data value is larger than that of a single data block, storing the data value in a plurality of data blocks; wherein the plurality of data blocks for storing the data value are associated with each other by an index value of the block.
Specifically, when the processor 301 runs the computer program, the following steps are further performed: the indexes of the key value blocks are associated to form a balanced binary tree structure; when new key-value data is inserted, the hash value of the left node in the balanced binary tree structure is less than or equal to the hash value of the root node and the hash value of the right node, and the hash value of the root node is less than or equal to the hash value of the right node, so as to maintain the orderliness of the balanced binary tree structure; the height difference value of the left subtree and the right subtree of any root is less than or equal to 2 so as to keep the balance of the balanced binary tree structure.
Specifically, when the processor 301 runs the computer program, the following steps are further performed: based on a block head index value in the key value block, searching a data value block corresponding to the key value block, wherein the block head index value is used for storing a block chain index value of the data value block; and/or, based on the index value of the associated one of the blocks of data values, looking up the block index value of the preceding or succeeding block corresponding to the block of data values.
Specifically, when the processor 301 runs the computer program, the following steps are further performed: detecting an empty block state in a block linked list storing the key-value data; when the empty block state indicates that no empty block exists in the block linked list, storing the data of the key value and the data of the data value at a preset storage position of the block linked list in the form of data blocks, respectively; or, when the empty block state indicates that there is an empty block in the block linked list, storing the data of the key value and the data of the data value at the position of the empty block in the form of data blocks, respectively.
Specifically, when the processor 301 runs the computer program, the following steps are further performed: receiving a modification request for key-value data; determining a block index value of a key value block to be modified based on the modification request; determining a target data value block corresponding to the target key value block based on a block header index in the target key value block corresponding to the block index value; modifying the data value data in the target data value block.
Specifically, when the processor 301 runs the computer program, the following steps are further performed: comparing the data storage length corresponding to the target data value block with the data storage length of the data value to be modified; when the comparison result represents that the data value storage length corresponding to the target data value block is smaller than the storage length of the data value to be modified, adding a new data value block behind a block chain taking the target data value block as a block head; and when the comparison result represents that the storage length of the data value corresponding to the target data value block is greater than the storage length of the data value to be modified, releasing the block space occupied by the data value corresponding to the target data value block.
Specifically, when the processor 301 runs the computer program, the following steps are further performed: receiving a delete request for key-value data; determining a block index value corresponding to the key value block to be deleted based on the deletion request; determining a target data value block corresponding to the target key value block based on a block head index value in the target key value block corresponding to the block index value; deleting the data value stored in the target data value block and the key value stored in the target key value block to release the block space occupied by the key value and the corresponding data value.
In an exemplary embodiment, the present invention further provides a computer readable storage medium, such as a memory 302, comprising a computer program, which is executable by a processor 301 of a data processing apparatus 300 to perform the steps of the aforementioned method. The computer readable storage medium can be Memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface Memory, optical disk, or CD-ROM; or may be a variety of devices including one or any combination of the above memories, such as a mobile phone, computer, tablet device, personal digital assistant, etc.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, performs: receiving a storage request for key-value data; extracting key values and data values in the key-value data based on the storage request; and respectively storing the hash value of the key value and the data value in the form of data blocks, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data.
The computer program, when executed by the processor, further performs: when the storage length of the data value is larger than that of a single data block, storing the data value in a plurality of data blocks; wherein the plurality of data blocks for storing the data value are associated with each other by an index value of the block.
The computer program, when executed by the processor, further performs: the indexes of the key value blocks are associated to form a balanced binary tree structure; when new key-value data is inserted, the hash value of the left node in the balanced binary tree structure is less than or equal to the hash value of the root node and the hash value of the right node, and the hash value of the root node is less than or equal to the hash value of the right node, so as to maintain the orderliness of the balanced binary tree structure; the height difference value of the left subtree and the right subtree of any root is less than or equal to 2 so as to keep the balance of the balanced binary tree structure.
The computer program, when executed by the processor, further performs: based on a block head index value in the key value block, searching a data value block corresponding to the key value block, wherein the block head index value is used for storing a block chain index value of the data value block; and/or, based on the index value of the associated one of the blocks of data values, looking up the block index value of the preceding or succeeding block corresponding to the block of data values.
The computer program, when executed by the processor, further performs: detecting an empty block state in a block linked list storing the key-value data; when the empty block state indicates that no empty block exists in the block linked list, storing the data of the key value and the data of the data value at a preset storage position of the block linked list in the form of data blocks, respectively; or, when the empty block state indicates that there is an empty block in the block linked list, storing the data of the key value and the data of the data value at the position of the empty block in the form of data blocks, respectively.
The computer program, when executed by the processor, further performs: receiving a modification request for key-value data; determining a block index value of a key value block to be modified based on the modification request; determining a target data value block corresponding to the target key value block based on a block header index in the target key value block corresponding to the block index value; modifying the data value data in the target data value block.
The computer program, when executed by the processor, further performs: comparing the data storage length corresponding to the target data value block with the data storage length of the data value to be modified; when the comparison result represents that the data value storage length corresponding to the target data value block is smaller than the storage length of the data value to be modified, adding a new data value block behind a block chain taking the target data value block as a block head; and when the comparison result represents that the storage length of the data value corresponding to the target data value block is greater than the storage length of the data value to be modified, releasing the block space occupied by the data value corresponding to the target data value block.
The computer program, when executed by the processor, further performs: receiving a delete request for key-value data; determining a block index value corresponding to the key value block to be deleted based on the deletion request; determining a target data value block corresponding to the target key value block based on a block head index value in the target key value block corresponding to the block index value; deleting the data value stored in the target data value block and the key value stored in the target key value block to release the block space occupied by the key value and the corresponding data value.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention.

Claims (9)

1. A method of data processing, the method comprising:
receiving a storage request for key-value data;
extracting key values and data values in the key-value data based on the storage request;
respectively storing the hash value of the key value and the data value in the form of data blocks, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data; wherein the single key value block is to store the hash value; the single block of data values is for storing the data values;
based on a block head index value in the key value block, searching a data value block corresponding to the key value block, wherein the block head index value is used for storing a block chain index value of the data value block; and/or, based on the index value of the associated one of the blocks of data values, looking up the block index value of the preceding or succeeding block corresponding to the block of data values.
2. The method of claim 1, further comprising:
when the storage length of the data value is larger than that of a single data block, storing the data value in a plurality of data blocks; wherein the plurality of data blocks for storing the data value are associated with each other by an index value of the block.
3. The method of claim 1, further comprising:
the indexes of the key value blocks are associated to form a balanced binary tree structure;
when new key-value data is inserted, the hash value of the left node in the balanced binary tree structure is less than or equal to the hash value of the root node and the hash value of the right node, and the hash value of the root node is less than or equal to the hash value of the right node, so as to maintain the orderliness of the balanced binary tree structure; the height difference value of the left subtree and the right subtree of any root is less than or equal to 2 so as to keep the balance of the balanced binary tree structure.
4. The method of claim 1, further comprising:
detecting an empty block state in a block linked list storing the key-value data;
when the empty block state represents that no empty block exists in the block linked list, storing the data of the key value and the data of the data value in a preset storage position of the block linked list in a data block mode respectively;
or, when the empty block state indicates that there is an empty block in the block linked list, storing the data of the key value and the data of the data value in the position of the empty block in the form of a data block, respectively.
5. The method of claim 1, further comprising:
receiving a modification request for key-value data;
determining a block index value of a key value block to be modified based on the modification request;
determining a target data value block corresponding to the target key value block based on a block header index in the target key value block corresponding to the block index value;
modifying the data value data in the target data value block.
6. The method of claim 5, further comprising:
comparing the data storage length corresponding to the target data value block with the data storage length of the data value to be modified;
when the comparison result represents that the data value storage length corresponding to the target data value block is smaller than the storage length of the data value to be modified, adding a new data value block behind a block chain taking the target data value block as a block head;
and when the comparison result represents that the storage length of the data value corresponding to the target data value block is greater than the storage length of the data value to be modified, releasing the block space occupied by the data value corresponding to the target data value block.
7. The method of claim 1, further comprising:
receiving a delete request for key-value data;
determining a block index value corresponding to the key value block to be deleted based on the deletion request;
determining a target data value block corresponding to the target key value block based on a block head index value in the target key value block corresponding to the block index value;
deleting the data value stored in the target data value block and the key value stored in the target key value block to release the block space occupied by the key value and the corresponding data value.
8. A data processing apparatus, the apparatus comprising:
a receiving unit for receiving a storage request for key-value data;
an extracting unit, configured to extract, based on the storage request, a key value in the key-value data and a data value corresponding to the key value;
a storage unit, configured to respectively store the hash value of the key value and the data value in the form of data blocks, wherein the storage length of a single key value block is the same as that of a single data value block, and the length of the hash value is less than or equal to the length of the key value block and/or the length of the data value block capable of storing data; wherein the single key value block is to store the hash value; the single block of data values is for storing the data values;
a lookup unit configured to lookup a data value block corresponding to the key value block based on a block header index value in the key value block, where the block header index value is used to store a block chain index value of the data value block; and/or, based on the index value of the associated one of the blocks of data values, looking up the block index value of the preceding or succeeding block corresponding to the block of data values.
9. A data processing apparatus, the apparatus comprising: memory, processor and executable program stored in the memory for execution by the processor, characterized in that the processor executes the executable program to perform the steps of a data processing method as claimed in any one of claims 1 to 7.
CN201811198259.6A 2018-10-15 2018-10-15 Data processing method and device Active CN109460406B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811198259.6A CN109460406B (en) 2018-10-15 2018-10-15 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811198259.6A CN109460406B (en) 2018-10-15 2018-10-15 Data processing method and device

Publications (2)

Publication Number Publication Date
CN109460406A CN109460406A (en) 2019-03-12
CN109460406B true CN109460406B (en) 2021-03-23

Family

ID=65607767

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811198259.6A Active CN109460406B (en) 2018-10-15 2018-10-15 Data processing method and device

Country Status (1)

Country Link
CN (1) CN109460406B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110399104B (en) * 2019-07-23 2023-06-09 网易(杭州)网络有限公司 Data storage method, data storage device, electronic apparatus, and storage medium
CN110825363B (en) * 2019-11-01 2024-05-17 北京知道创宇信息技术股份有限公司 Intelligent contract acquisition method and device, electronic equipment and storage medium
CN112464619B (en) * 2021-01-25 2021-05-25 平安国际智慧城市科技股份有限公司 Big data processing method, device and equipment and computer readable storage medium
CN116414828A (en) * 2021-12-31 2023-07-11 华为技术有限公司 Data management method and related device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108153757A (en) * 2016-12-02 2018-06-12 深圳市中兴微电子技术有限公司 A kind of method and apparatus of Hash table management
CN108446376A (en) * 2018-03-16 2018-08-24 众安信息技术服务有限公司 Data storage method and device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9110936B2 (en) * 2010-12-28 2015-08-18 Microsoft Technology Licensing, Llc Using index partitioning and reconciliation for data deduplication
US9846642B2 (en) * 2014-10-21 2017-12-19 Samsung Electronics Co., Ltd. Efficient key collision handling
CN105320775B (en) * 2015-11-11 2019-05-14 中科曙光信息技术无锡有限公司 The access method and device of data
CN106202548B (en) * 2016-07-25 2018-09-04 网易(杭州)网络有限公司 Data storage method, lookup method and device
CN107918612B (en) * 2016-10-08 2019-03-05 腾讯科技(深圳)有限公司 The implementation method and device of key assignments memory system data structure
US10243939B2 (en) * 2016-12-23 2019-03-26 Amazon Technologies, Inc. Key distribution in a distributed computing environment
CN108595720B (en) * 2018-07-12 2020-05-19 中国科学院深圳先进技术研究院 Block chain space-time data query method, system and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Research on Key-Value NoSQL Local Storage Systems"; Ma Wenlong et al.; Chinese Journal of Computers; 20170601; Vol. 41 (No. 8); 1722-1751 *
"PHash: A memory-efficient, high-performance key-value store for large-scale data-intensive applications"; Hyotaek Shim; Journal of Systems and Software; 20170131; Vol. 123; 33-44 *

Also Published As

Publication number Publication date
CN109460406A (en) 2019-03-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant