CN114090575A - Data storage method and retrieval method based on key value database and corresponding devices - Google Patents

Info

Publication number
CN114090575A
Authority
CN
China
Prior art keywords
key
value
storage area
target
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111258633.9A
Other languages
Chinese (zh)
Inventor
罗京
潘松强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides a key value database-based data storage method, a key value database-based data retrieval method and a corresponding device. The data storage method based on the key value database comprises the following steps: acquiring key-value pair data to be stored, wherein the key-value pair data comprises an original key and a value; the original key is of a character string type; converting the original key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain an index key corresponding to key value pair data; storing the value of the key value pair data into a value storage area of a key value database, and recording the offset of the value in the value storage area and the length of the value; generating a directory entry corresponding to the value according to the index key, the offset and the length, and storing the directory entry into a key storage area of a key value database; the method and the device realize the conversion of the original key of the character string type into the corresponding index key of the integer type with the specified byte length, facilitate the arrangement of the key storage area of the key value database, and further reduce the occupied space of the key storage area.

Description

Data storage method and retrieval method based on key value database and corresponding devices
Technical Field
The present invention relates to the field of data management technologies, and in particular, to a data storage method based on a key-value database, a retrieval method based on a key-value database, a data storage apparatus based on a key-value database, a retrieval apparatus based on a key-value database, a device, a readable storage medium, and a computer program product.
Background
A key-value database is a non-relational database that organizes data in key-value form. The keys of different key-value elements are all unique, and values are stored and retrieved quickly through their unique keys.
In the prior art, the original key of a key-value element is stored directly in the key-value database regardless of the key's type. When the key is of a character string type, its data length is not fixed and generally exceeds 8 bytes, so storing the original string key directly occupies a large amount of storage space; moreover, when the stored data is subsequently retrieved, string keys must be compared byte by byte, which results in low retrieval efficiency.
Disclosure of Invention
The technical problem to be solved by the embodiment of the invention is to provide a data storage method based on a key value database to reduce the storage space of keys; and provides a retrieval method based on the key value database to improve the retrieval efficiency.
Correspondingly, the embodiment of the invention also provides a data storage device based on the key value database, a retrieval device based on the key value database, equipment, a readable storage medium and a computer program product, which are used for ensuring the realization and the application of the method.
In order to solve the above problems, an embodiment of the present invention discloses a data storage method based on a key value database, including:
acquiring key-value pair data to be stored, wherein the key-value pair data comprises an original key and a value; the original key is of a character string type;
converting the original key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain an index key corresponding to the key value pair data;
storing the value of the key-value pair data into a value storage area of a key-value database, and recording the offset of the value in the value storage area and the length of the value;
and generating a directory entry corresponding to the value according to the index key, the offset and the length, and storing the directory entry into a key storage area of the key value database.
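The four steps above can be sketched as follows. This is a minimal in-memory sketch under stated assumptions, not the implementation claimed here: the names `DirectoryEntry`, `KVStore`, `to_index_key` and `put` are hypothetical, and CRC-32 from Python's `zlib` stands in for the unspecified conversion rule.

```python
import zlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DirectoryEntry:
    index_key: int  # fixed-width integer derived from the string key
    offset: int     # where the value starts in the value storage area
    length: int     # number of bytes the value occupies


def to_index_key(original_key: str) -> int:
    # Hypothetical conversion rule: a 4-byte unsigned integer via CRC-32.
    return zlib.crc32(original_key.encode("utf-8"))


@dataclass
class KVStore:
    value_area: bytearray = field(default_factory=bytearray)
    key_area: list = field(default_factory=list)

    def put(self, original_key: str, value: bytes) -> DirectoryEntry:
        index_key = to_index_key(original_key)
        offset = len(self.value_area)  # the value is appended at the end
        self.value_area += value
        entry = DirectoryEntry(index_key, offset, len(value))
        self.key_area.append(entry)    # only the fixed-size entry goes here
        return entry


store = KVStore()
first = store.put("user-001", b"profile of student a")
second = store.put("user-002-long-id", b"profile of student b")
```

Whatever the length of the string key, each directory entry holds three integers, so the key storage area can be laid out in fixed-size slots.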
Optionally, the storing the directory entry into a key storage area of the key-value store further includes:
determining the target storage position of the directory entry in the key storage area according to the numerical value of an index key in the directory entry;
storing the directory entry in the target storage location in the key storage area.
Optionally, the method further comprises:
and sorting the directory entries in the key storage area according to the numerical size of the index key in each directory entry.
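Determining the target storage position from the numerical value of the index key can be illustrated with a sorted insertion. The sketch below assumes directory entries are represented as `(index_key, offset, length)` tuples held in a Python list; `insert_entry` is a hypothetical helper name.

```python
import bisect

key_area = []  # sorted list of (index_key, offset, length) tuples


def insert_entry(index_key: int, offset: int, length: int) -> int:
    """Insert a directory entry at its sorted position and return that position."""
    entry = (index_key, offset, length)
    position = bisect.bisect_left(key_area, entry)  # target storage position
    key_area.insert(position, entry)
    return position


positions = [
    insert_entry(50, 0, 10),
    insert_entry(10, 10, 5),
    insert_entry(30, 15, 7),
]
```

Because tuples compare element by element, the entries stay ordered by index key, which is what later allows lookup by numerical comparison.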
Optionally, the converting the original key of the string type into a corresponding integer type with a specified byte length by using a conversion rule to obtain an index key corresponding to the key-value pair data includes:
inverting the character string corresponding to the original key to obtain an inverted key;
performing a first hash operation on the original key to obtain a first hash value of a first specified byte length, and performing a second hash operation on the inverted key to obtain a second hash value of a second specified byte length;
and generating an index key corresponding to the key value pair data according to the first hash value and the second hash value.
Optionally, the first hash value and the second hash value are both 4-byte integer data; generating an index key corresponding to the key-value pair data according to the first hash value and the second hash value, including:
and performing preset operation on the first hash value and the second hash value, and taking an operation result of the preset operation as an index key corresponding to the key value pair data.
Optionally, an operation formula of the preset operation is as follows:
(((long)hash1)<<32)|(hash2&0X00000000FFFFFFFFL);
wherein hash1 in the operation formula represents the first hash value and hash2 represents the second hash value; or hash1 in the operation formula represents the second hash value and hash2 represents the first hash value.
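The operation formula reads like a Java expression over a signed 32-bit `int` and a signed 64-bit `long`; that reading is an assumption. Under it, the mask on hash2 keeps a negative hash2 from sign-extending into the upper 32 bits. The sketch below emulates that arithmetic in Python, whose integers are unbounded; `to_signed` and `combine` are hypothetical helper names.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def combine(hash1: int, hash2: int) -> int:
    """Pack two signed 32-bit hashes into one signed 64-bit index key."""
    high = (hash1 << 32) & 0xFFFFFFFFFFFFFFFF  # hash1 fills the upper 32 bits
    low = hash2 & 0x00000000FFFFFFFF           # mask keeps hash2 in the lower 32 bits
    return to_signed(high | low, 64)


packed = combine(1, 2)
negative_low = combine(0, -1)   # without the mask this would become -1
negative_high = combine(-1, 0)
```

Swapping hash1 and hash2 gives a different but equally valid index key, provided storage and retrieval use the same order.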
Optionally, the generating an index key corresponding to the key-value pair data according to the first hash value and the second hash value includes:
and performing splicing operation on the first hash value and the second hash value to obtain an index key corresponding to the key value pair data.
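The splicing alternative can be read as concatenating the bytes of the two hash values. The sketch below assumes big-endian byte order and unsigned interpretation, neither of which is specified in the source; `splice` is a hypothetical name.

```python
def splice(first_hash: bytes, second_hash: bytes) -> int:
    """Concatenate two hash byte strings and read the result as one unsigned integer."""
    return int.from_bytes(first_hash + second_hash, byteorder="big")


first_hash = (0x11223344).to_bytes(4, "big")
second_hash = (0xAABBCCDD).to_bytes(4, "big")
index_key = splice(first_hash, second_hash)
```

For two 4-byte unsigned values, splicing gives the same result as the shift-and-OR formula, and it also covers the case where the first and second specified byte lengths differ.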
Optionally, the storing the value of the key-value pair data into a value storage area of a key-value database, and recording an offset of the value in the value storage area and a length of the value, further includes:
when the index key is the same as a stored index key in the key value database, acquiring the stored index key which is the same as the index key;
and aiming at the index key and the stored index key, respectively storing the corresponding original key and value in the value storage area of the key value database, and recording the offset of the original key and value in the value storage area and the length corresponding to the original key and value.
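One way to picture this collision handling is a batch build in which entries sharing an index key have their original key written next to the value. The source fixes neither the batch approach nor the record layout; both are assumptions here, as are the function name `build` and the layout of a 2-byte key length, then the key bytes, then the value bytes.

```python
from collections import defaultdict


def build(pairs, to_index_key):
    """Build (value_area, key_area) from (original_key, value) pairs in one pass."""
    groups = defaultdict(list)
    for original_key, value in pairs:
        groups[to_index_key(original_key)].append((original_key, value))

    value_area = bytearray()
    key_area = []  # (index_key, offset, length), sorted by index_key
    for index_key in sorted(groups):
        members = groups[index_key]
        for original_key, value in members:
            record = value
            if len(members) > 1:
                # Collision: keep the original key next to the value.
                # Hypothetical layout: 2-byte key length, key bytes, value bytes.
                key_bytes = original_key.encode("utf-8")
                record = len(key_bytes).to_bytes(2, "big") + key_bytes + value
            key_area.append((index_key, len(value_area), len(record)))
            value_area += record
    return bytes(value_area), key_area


# A deliberately weak conversion rule (string length) to force a collision.
value_area, key_area = build([("ab", b"1"), ("cd", b"2"), ("xyz", b"3")], len)
```

Only colliding entries pay for the embedded key; entries with a unique index key still store the bare value.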
The embodiment of the invention also discloses a key value database-based retrieval method, which comprises the following steps:
receiving retrieval information, wherein the retrieval information comprises a retrieval key; the retrieval key is of a character string type;
converting the retrieval key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain a target key corresponding to the retrieval key;
searching a key storage area of a key value database for a target directory entry corresponding to the target key; wherein, when key value pair data whose key is of a character string type was stored in the key value database, the original key of the character string type was converted into a corresponding integer type with a specified byte length by adopting the same conversion rule to obtain an index key corresponding to the key value pair data, the value of the key value pair data was stored in a value storage area of the key value database, a directory entry corresponding to the value was generated according to the offset of the value in the value storage area, the length of the value and the index key, and the directory entry was stored in the key storage area;
and acquiring a corresponding target value from a value storage area of the key value database according to the offset and the length in the target directory entry.
Optionally, the directory entries in the key storage area of the key value database are sorted and stored according to the numerical value of the corresponding index key; the searching the target directory entry corresponding to the target key from the key storage area of the key value database comprises:
and searching the key storage area of the key value database for the target directory entry corresponding to the index key by binary search.
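Since the directory entries are sorted by index key, the lookup reduces to integer comparisons. A minimal sketch, assuming entries are `(index_key, offset, length)` tuples and the value storage area is a byte string; `find_entry` and `read_value` are hypothetical names.

```python
def find_entry(key_area, target_key):
    """Binary search over directory entries sorted by index key."""
    lo, hi = 0, len(key_area) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        index_key = key_area[mid][0]
        if index_key == target_key:
            return key_area[mid]
        if index_key < target_key:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def read_value(value_area, entry):
    """Slice the value out of the value storage area using offset and length."""
    _, offset, length = entry
    return value_area[offset:offset + length]


value_area = b"alphabetagamma"
key_area = [(10, 0, 5), (30, 5, 4), (50, 9, 5)]
found = find_entry(key_area, 30)
```

Each probe compares two integers, so no byte-by-byte string comparison is needed.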
Optionally, when there are a plurality of target directory entries, obtaining a corresponding target value from a value storage area of the key-value store, further includes:
respectively acquiring a candidate value corresponding to each target directory entry;
and parsing each candidate value, and taking the candidate value that contains the retrieval key corresponding to the target key as the target value.
Optionally, the method further comprises:
and returning the target value.
Optionally, the returning the target value further includes:
and if the target value contains the corresponding retrieval key, deleting the retrieval key from the target value and returning the remaining target value.
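When several directory entries share the target key, the candidates can be told apart by the original key stored alongside each value. The sketch below assumes a record layout of a 2-byte key length, then the key bytes, then the value bytes; that layout and the names `parse_record` and `resolve` are illustrative assumptions, not taken from the source.

```python
def parse_record(record: bytes):
    """Split a hypothetical collision record into (original_key, value)."""
    key_length = int.from_bytes(record[:2], "big")
    original_key = record[2:2 + key_length].decode("utf-8")
    return original_key, record[2 + key_length:]


def resolve(value_area: bytes, target_entries, retrieval_key: str):
    """Pick the candidate whose embedded key matches and return only its value."""
    for _, offset, length in target_entries:
        candidate = value_area[offset:offset + length]
        original_key, value = parse_record(candidate)
        if original_key == retrieval_key:
            return value  # the retrieval key is stripped before returning
    return None


value_area = b"\x00\x02" + b"ab" + b"1" + b"\x00\x02" + b"cd" + b"2"
target_entries = [(2, 0, 5), (2, 5, 5)]
result = resolve(value_area, target_entries, "cd")
```

The string comparison runs only over the few colliding candidates, not over every stored key.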
The embodiment of the invention also discloses a data storage device based on the key value database, which comprises:
the data acquisition module is used for acquiring key-value pair data to be stored, wherein the key-value pair data comprises an original key and a value; the original key is of a character string type;
the original key conversion module is used for converting the original key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain an index key corresponding to the key value pair data;
the value storage module is used for storing the value of the key value pair data into a value storage area of a key value database and recording the offset of the value in the value storage area and the length of the value;
and the key storage module is used for generating a directory entry corresponding to the value according to the index key, the offset and the length, and storing the directory entry into a key storage area of the key value database.
Optionally, the key storage module further includes:
the target storage position determining module is used for determining the target storage position of the directory entry in the key storage area according to the numerical value of the index key in the directory entry;
a target storage location based storage module to store the directory entry in the target storage location in the key storage area.
Optionally, the key storage module further includes:
and the sorting module is used for sorting the directory entries in the key storage area according to the numerical value of the index key in each directory entry.
Optionally, the original key conversion module includes:
the inversion conversion module is used for inverting the character string corresponding to the original key to obtain an inverted key;
the hash conversion module is used for carrying out first hash operation on the original key to obtain a first hash value with a first specified byte length and carrying out second hash operation on the inverted key to obtain a second hash value with a second specified byte length;
and the index key generation module is used for generating an index key corresponding to the key value pair data according to the first hash value and the second hash value.
Optionally, the first hash value and the second hash value are both 4-byte integer data; the index key generation module is specifically configured to:
and performing preset operation on the first hash value and the second hash value, and taking an operation result of the preset operation as an index key corresponding to the key value pair data.
Optionally, an operation formula of the preset operation is as follows:
(((long)hash1)<<32)|(hash2&0X00000000FFFFFFFFL);
wherein hash1 in the operation formula represents the first hash value and hash2 represents the second hash value; or hash1 in the operation formula represents the second hash value and hash2 represents the first hash value.
Optionally, the index key generation module is specifically configured to:
and performing splicing operation on the first hash value and the second hash value to obtain an index key corresponding to the key value pair data.
Optionally, the value storage module further includes:
the same key acquisition module is used for acquiring the stored index key which is the same as the index key when the index key is the same as the stored index key in the key value database;
and the same key value storage module is used for respectively storing corresponding original keys and values in a value storage area of the key value database aiming at the index keys and the stored index keys, and recording offsets of the original keys and the values in the value storage area and the corresponding lengths of the original keys and the values.
The embodiment of the invention also discloses a retrieval device based on the key value database, which comprises:
the retrieval information receiving module is used for receiving retrieval information input by a user, and the retrieval information comprises a retrieval key input by the user in a retrieval process; the retrieval key is of a character string type;
the retrieval key conversion module is used for converting the retrieval key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain a target key corresponding to the retrieval key;
the target directory entry searching module is used for searching a key storage area of the key value database for a target directory entry corresponding to the target key; wherein, when key value pair data whose key is of a character string type was stored in the key value database, the original key of the character string type was converted into a corresponding integer type with a specified byte length by adopting the same conversion rule to obtain an index key corresponding to the key value pair data, the value of the key value pair data was stored in a value storage area of the key value database, a directory entry corresponding to the value was generated according to the offset of the value in the value storage area, the length of the value and the index key, and the directory entry was stored in the key storage area;
and the target value acquisition module is used for acquiring a corresponding target value from the value storage area of the key value database according to the offset and the length in the target directory entry.
Optionally, the directory entries in the key storage area of the key value database are sorted and stored according to the numerical value of the corresponding index key; the target directory entry searching module is specifically configured to:
and searching the key storage area of the key value database for the target directory entry corresponding to the index key by binary search.
Optionally, when there are a plurality of target directory entries, the target value obtaining module further includes:
the candidate value acquisition module is used for respectively acquiring a candidate value corresponding to each target directory entry;
and the candidate value analysis module is used for analyzing each candidate value and taking the candidate value containing the retrieval key corresponding to the target key as the target value.
Optionally, the apparatus further comprises:
and the target value returning module is used for returning the target value.
Optionally, the target value returning module is further configured to, when the target value contains the corresponding retrieval key, delete the retrieval key from the target value and return the remaining target value.
The embodiment of the invention also discloses a device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, and the one or more programs comprise instructions for executing one or more of the key value database-based data storage methods in the embodiment of the invention.
The embodiment of the invention also discloses a device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, and the one or more programs comprise instructions for executing one or more of the key value database-based retrieval methods in the embodiment of the invention.
The embodiment of the invention also discloses a readable storage medium, and when the instructions in the storage medium are executed by a processor of the device, the device can execute one or more of the key-value database-based data storage methods in the embodiment of the invention.
The embodiment of the invention also discloses a readable storage medium, and when the instructions in the storage medium are executed by a processor of the equipment, the equipment can execute one or more retrieval methods based on the key value database in the embodiment of the invention.
The embodiment of the invention also discloses a computer program product, which comprises a computer program or computer instructions, and when the computer program or the computer instructions are executed by a processor, the data storage method based on the key value database is realized.
The embodiment of the invention also discloses a computer program product, which comprises a computer program or computer instructions, and when the computer program or the computer instructions are executed by a processor, the retrieval method based on the key value database is realized.
The embodiment of the invention has the following advantages:
in the process of storing data based on a key value database, the embodiment of the invention obtains key value pair data to be stored, wherein the key value pair data comprises an original key and a value; the original key is of a character string type; converting the original key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain an index key corresponding to key value pair data; storing the value of the key value pair data into a value storage area of a key value database, and recording the offset of the value in the value storage area and the length of the value; generating a directory entry corresponding to the value according to the index key, the offset and the length, and storing the directory entry into a key storage area of a key value database; the method and the device realize the conversion of the original key of the character string type into the corresponding index key of the integer type with the specified byte length, facilitate the arrangement of the key storage area of the key value database, and further reduce the occupied space of the key storage area.
In the retrieval process based on the key value database, the retrieval information is received, and the retrieval information comprises a retrieval key; the retrieval key is of a character string type; converting the retrieval key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain a target key corresponding to the retrieval key; searching a target directory entry corresponding to a target key from a key storage area of a key value database; in the process of storing key value pair data with keys of a character string type, a key value database converts original keys of the character string type into corresponding integer types with appointed byte length by adopting the same conversion rule to obtain index keys corresponding to the key value pair data, the values of the key value pair data are stored in a value storage area of the key value database, a directory entry corresponding to the values is generated according to the offset and the length of the values in the value storage area and the index keys, and the directory entry is stored in the key storage area; acquiring a corresponding target value from a value storage area of a key value database according to the offset and the length in the target directory entry; the method and the device realize that the retrieval key of the character string type is converted into the corresponding target key of the integer type by adopting the same conversion rule as that in the data storage process during retrieval, so that the retrieval is not required to be carried out in a byte-by-byte comparison mode, and the retrieval efficiency is improved.
Drawings
FIG. 1 is a flowchart illustrating steps of an embodiment of a key-value store based data storage method of the present invention;
FIG. 2 is a schematic diagram of a directory entry stored by a key store in an example of the invention;
FIG. 3 is a diagrammatic illustration of determining a target storage location for a directory entry in an example of the present invention;
FIG. 4 is a schematic diagram of ordering directory entries in a key store in an example of the invention;
FIG. 5 is a flowchart illustrating steps of an embodiment of a key-value store based retrieval method of the present invention;
FIG. 6 is a block diagram illustrating an embodiment of an index library constructing apparatus according to the present invention;
FIG. 7 is a block diagram of an embodiment of a search apparatus based on an index database according to the present invention;
FIG. 8 is a block diagram illustrating the structure of a device according to an exemplary embodiment;
FIG. 9 is a schematic structural diagram of a server in an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
In the field of data management, key-value databases are widely used. Take scenarios that store an identifier (ID) together with user information, such as file management, online shopping, sessions, and mailbox use: each user ID corresponds to a set of user information, so key-value pair data can be formed with the user ID as the key and the user information set as the value and stored in the key-value database. Once a user ID is given, the corresponding user information set can be obtained from the key-value database quickly and efficiently.
Generally, the user ID may employ a combination of letters and numbers, and the number of characters of different user IDs is not limited, and thus, the key of the key-value pair data employs a character string type, i.e., the key of the key-value pair data is a character string. In the prior art, when key value pair data is stored in a key value database, keys of the key value pair data are stored in a key storage area of the key value database, and the length of the keys of the key value pair data is uncertain, so that the storage space of the key storage area for storing the keys is difficult to reasonably plan, and when the length of character strings of the keys exceeds 8 bytes, a relatively large storage space is needed. In addition, in the search stage, since the stored keys are character strings, byte-by-byte comparison is required during the search, the same key as the searched key is found from the stored keys, and the value corresponding to the key is further searched out, which is obviously inefficient.
One of the core concepts of the embodiment of the invention is as follows: in the storage stage of key value pair data, converting the character string type key into a corresponding integer type key with specified byte length, thereby storing the key regularly and reducing the occupied space of a key storage area; in addition, the integer type key can directly determine the corresponding value in a numerical value comparison mode in the retrieval stage, so that the retrieval efficiency is improved.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a data storage method based on a key-value database according to the present invention is shown, which may specifically include the following steps:
Step 101: acquiring key-value pair data to be stored, wherein the key-value pair data comprises an original key and a value; the original key is of a character string type.
The key-value pair data includes a key and a value, where the key is the index of the value and the value is the content obtained through its corresponding key. The key may be derived by analyzing the content of the value, or it may be specified by the sender of the key-value pair data when the value is sent. The embodiment of the invention does not limit how the acquired key-value pair data is generated.
For ease of distinction, the key carried by the key-value pair data at the time of acquisition is called the original key. The original key is of a character string type, that is, a string of characters consisting of letters, numbers and underscores, each of which occupies one byte.
For example, in a student profile management scenario, each piece of key-value pair data to be stored corresponds to one student's profile: the original key may be the student number, and the value is the content of the student profile. The student number of student a in school A is STUA202111101, and the student number of student b in school B is STUB2111101. The keys of different key-value pair data to be stored may therefore occupy different numbers of bytes.
Step 102: converting the original key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain an index key corresponding to the key-value pair data.
The problems that the byte length occupied by the original keys of the character string type is not fixed, and when the original keys of the character string type are compared in the retrieval stage, the characters need to be compared one by one, the retrieval efficiency is low and the like are considered; in the process of storing key-value pair data, the embodiment of the invention converts the original key of the character string type into the corresponding integer type with the appointed byte length through a conversion rule to obtain an index key corresponding to the key-value pair data; therefore, the problem that the original key length is not fixed is solved, and the index key of the integer type has corresponding numerical values, so that the index key can be retrieved through numerical value comparison in the retrieval stage without character-by-character comparison, and the problem of low retrieval efficiency is solved.
It should be noted that, in the storage stage of the key value pair data, the conversion rule used for converting the original key is the same as the conversion rule used for converting the received search key in the search stage. In different application scenarios, the conversion rules may be different, specifically including different conversion processes and different lengths of bytes corresponding to the converted index keys.
Generally, when the number of key-value pair data to be stored in an application scenario is known, that is, the key-value pair data to be stored is a closed set of a fixed number, and the number is relatively small, the specified byte length of the index key obtained by conversion using the conversion rule may be relatively short, for example, 4 bytes, 8 bytes, and the like; when the number of key-value pair data to be stored in the application scenario is known and is relatively large, or when the number of key-value pair data to be stored in the application scenario is unknown, that is, the key-value pair data to be stored is an open set of an unfixed number, the specified byte length of the index key converted by the conversion rule may be relatively long, for example, 8 bytes, 16 bytes, 32 bytes, and the like.
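As a rough guide to why larger data sets call for longer index keys, suppose the conversion rule behaves like a uniformly random hash; this is an idealizing assumption not stated in the source. The probability of at least one collision among n keys mapped to b-bit index keys then follows the standard birthday approximation:

$$
P_{\text{collision}}(n, b) \approx 1 - \exp\!\left(-\frac{n(n-1)}{2 \cdot 2^{b}}\right)
$$

For n = 10^6 keys, a 4-byte index key (b = 32) gives an exponent of about -116, so a collision is practically certain. An 8-byte index key (b = 64) gives a probability of roughly 3 × 10^{-8}.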
In an optional embodiment of the present invention, the process of obtaining an index key corresponding to key-value pair data by converting an original key of a string type into a corresponding integer type with a specified byte length by using a conversion rule may include:
inverting the character string corresponding to the original key to obtain an inverted key;
performing a first hash operation on the original key to obtain a first hash value of a first specified byte length, and performing a second hash operation on the inverted key to obtain a second hash value of a second specified byte length;
and generating an index key corresponding to the key value pair data according to the first hash value and the second hash value.
In this embodiment, converting the original key includes inverting the character string corresponding to the original key to obtain an inverted key; for example, if the original key is abcde12345, the corresponding inverted key is 54321edcba. A first hash value of the original key is then obtained through a first hash operation, a second hash value of the inverted key is obtained through a second hash operation, and the index key corresponding to the original key is obtained according to the first hash value and the second hash value.
Since hash collision may exist in hash operation, in an optional embodiment, when hash collision occurs, the hash collision may be solved by combining methods such as a linked list, but the efficiency of data storage and retrieval is affected to a certain extent by solving the hash collision.
In another optional embodiment, when a hash collision occurs, the original key corresponding to the index key involved in the hash collision may be stored in the value storage area, which stores the values of the key-value pair data in the key value database, to resolve the hash collision; details are described later.
The first hash operation and the second hash operation may be the same or different. Illustratively, the first hash operation and the second hash operation may be the same; e.g., both use the MurmurHash3 hash function to compute the first hash value of the original key and the second hash value of the inverted key, respectively.
Illustratively, the first hash operation and the second hash operation may be different; e.g., the first hash operation computes the first hash value of the original key using the MurmurHash3 hash function, and the second hash operation computes the second hash value of the inverted key using the MurmurHash2 hash function; alternatively, the first hash operation uses the MurmurHash2 hash function to compute the first hash value of the original key, and the second hash operation uses the MurmurHash3 hash function to compute the second hash value of the inverted key.
It should be noted that, in practical applications, other hash functions may be used to perform the hash operation, and the hash function is not limited to the hash function presented in the above example.
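The inversion and double-hashing steps can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name `KeyHashing` and its methods are hypothetical, and a 32-bit FNV-1a hash stands in for the MurmurHash functions named above, because the Java standard library does not provide MurmurHash. Any hash function producing a fixed-length integer could be substituted.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; FNV-1a is a stand-in for the MurmurHash functions in the text.
final class KeyHashing {
    // Inverts the character string of the original key, e.g. "abcde12345" -> "54321edcba".
    static String invert(String originalKey) {
        return new StringBuilder(originalKey).reverse().toString();
    }

    // 32-bit FNV-1a over the UTF-8 bytes of the key; the result is a 4-byte integer.
    static int fnv1a32(String key) {
        int hash = 0x811C9DC5;                  // FNV offset basis
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;                 // FNV prime
        }
        return hash;
    }

    public static void main(String[] args) {
        String originalKey = "abcde12345";
        String invertedKey = invert(originalKey);
        int hash1 = fnv1a32(originalKey);       // first hash value (of the original key)
        int hash2 = fnv1a32(invertedKey);       // second hash value (of the inverted key)
        System.out.println(invertedKey + " " + hash1 + " " + hash2);
    }
}
```

Here the same function is applied to both strings, which corresponds to the case where the first and second hash operations are the same.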
In an optional embodiment of the present invention, the generating an index key corresponding to the key-value pair data according to the first hash value and the second hash value may include:
and performing preset operation on the first hash value and the second hash value, and taking an operation result of the preset operation as an index key corresponding to the key value pair data.
The lengths of the first hash value and the second hash value are fixed, and the preset operation may be a bit operation. The bit operations may include bit and operations, bit or operations, bitwise shift operations, and the like.
In a specific example, the first hash value and the second hash value are both 4-byte integer data, and an operation formula of the preset operation is as follows:
(((long)hash1)<<32)|(hash2&0X00000000FFFFFFFFL);
wherein, hash1 in the operation formula represents a first hash value, and hash2 represents a second hash value; alternatively, the hash1 in the operation formula represents the second hash value, and the hash2 represents the first hash value.
In this example, when hash1 in the operation formula represents the first hash value and hash2 represents the second hash value, the corresponding operation process is as follows: the first hash value is converted into 8-byte long integer data and then shifted left by 32 bits to obtain a transformed first hash value; the second hash value is bitwise ANDed with the 8-byte long integer data 0X00000000FFFFFFFFL to obtain a transformed second hash value; finally, a bitwise OR operation is performed on the transformed first hash value and the transformed second hash value to obtain the operation result, namely the index key.
For example, assuming that the first hash value is 0X684067AC and the second hash value is 0X627C96D7, the first hash value is converted into 8-byte long integer data by the above operation formula and then shifted left by 32 bits to obtain a transformed first hash value of 0X684067AC00000000L; the second hash value is bitwise ANDed with the 8-byte long integer data 0X00000000FFFFFFFFL, giving a transformed second hash value of 0X00000000627C96D7L; finally, a bitwise OR operation is performed on the transformed first hash value and the transformed second hash value to obtain an operation result of 0X684067AC627C96D7L = 7512118168538355415, that is, the obtained index key is 7512118168538355415.
When the hash1 in the operation formula represents the second hash value and the hash2 represents the first hash value, the corresponding operation process is similar to that of the previous operation formula, and is not described again.
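The operation formula can be checked with a few lines of Java. The class and method names below (`IndexKeys`, `combine`) are hypothetical; the expression itself is the one given in the operation formula, and the inputs are the hash values used in this description.

```java
// Hypothetical wrapper around the operation formula from the text.
final class IndexKeys {
    // Places hash1 in the high 32 bits and hash2 in the low 32 bits of an 8-byte integer.
    static long combine(int hash1, int hash2) {
        // The mask keeps a negative hash2 from sign-extending into the high 32 bits.
        return (((long) hash1) << 32) | (hash2 & 0x00000000FFFFFFFFL);
    }

    public static void main(String[] args) {
        System.out.println(combine(0x684067AC, 0x627C96D7));    // 7512118168538355415
        System.out.println(combine(-1749051308, 1652332247));   // -7512118165233690921
    }
}
```

The second call shows that a negative first hash value produces a negative index key, since the sign bit of hash1 becomes the sign bit of the 8-byte result.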
In another optional embodiment of the present invention, the generating an index key corresponding to the key-value pair data according to the first hash value and the second hash value may include:
and carrying out splicing operation on the first hash value and the second hash value to obtain an index key corresponding to the key value pair data.
The lengths of the first hash value and the second hash value are fixed. The splicing operation on the first hash value and the second hash value specifically splices the first hash value to the end of the second hash value, or splices the second hash value to the end of the first hash value. For example, assume that the first hash value is 0X26 and the second hash value is 0X0103; splicing the first hash value to the end of the second hash value gives 0X010326, that is, the index key is 0X010326 = 66342.
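A minimal Java sketch of the splicing operation, treating each hash value as a fixed-length byte sequence. The class name `HashSplicing` and the method `splice` are hypothetical; the input bytes are those of the example above.

```java
import java.math.BigInteger;

// Hypothetical helper that splices two fixed-length hash values byte by byte.
final class HashSplicing {
    // Returns the bytes of `head` followed by the bytes of `tail`.
    static byte[] splice(byte[] head, byte[] tail) {
        byte[] out = new byte[head.length + tail.length];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(tail, 0, out, head.length, tail.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] firstHash = {0x26};              // first hash value 0X26
        byte[] secondHash = {0x01, 0x03};       // second hash value 0X0103
        // Splice the first hash value to the end of the second hash value.
        byte[] indexKey = splice(secondHash, firstHash);
        System.out.println(new BigInteger(1, indexKey));   // 66342
    }
}
```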
Step 103, storing the value of the key value pair data into a value storage area of the key value database, and recording the offset of the value in the value storage area and the length of the value.
In this embodiment, a key value database is used to store key value pair data, and the key value database has a value storage area and a key storage area, where the value storage area is used to store values of the key value pair data, and the key storage area is used to store keys corresponding to the values stored in the value storage area. In the process of storing values in the value storage area, the values can be stored in sequence according to the time sequence of storage and the sequence of addresses of the value storage area from low to high, namely, the storage address corresponding to the value stored first is lower than the storage address corresponding to the value stored later; or, random storage, etc.; after the value is stored in the value storage area, the specific location of the value in the value storage area can be obtained.
The length of the value corresponds to the size of the storage space occupied by the value, and the offset corresponds to the starting address of the value in the value storage area. For example, if the length of the value of one key-value pair is 300 and the offset is 0, the value occupies 300 bytes of storage space and its starting address in the value storage area is 0; i.e., the value occupies the 300 bytes starting at address 0 (bytes 0 to 299) of the value storage area.
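The sequential storage case, with values appended from low to high addresses, might look like the following Java sketch. The class `ValueArea` and its methods are hypothetical, and an in-memory byte stream stands in for the value storage area.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical in-memory value storage area with append-only, sequential storage.
final class ValueArea {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

    // Appends the value at the next free address and returns {offset, length}.
    long[] append(byte[] value) {
        long offset = bytes.size();             // starting address of this value
        bytes.write(value, 0, value.length);
        return new long[] {offset, value.length};
    }

    // Reads `length` bytes starting at `offset`.
    byte[] read(long offset, int length) {
        byte[] all = bytes.toByteArray();       // copies the area; acceptable for a sketch
        return Arrays.copyOfRange(all, (int) offset, (int) offset + length);
    }
}
```

With this sketch, a first value of 300 bytes is recorded with offset 0 and length 300, and the next value appended starts at offset 300.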
In an optional embodiment of the present invention, when the index key has a hash collision, the storing the value of the key-value pair data into the value storage area of the key-value database, and recording the offset of the value in the value storage area and the length of the value may include:
when the index key is the same as the index key corresponding to the key value pair data stored in the key value database, acquiring the stored index key which is the same as the index key;
and respectively storing the original key and the value corresponding to the index key and the stored index key in a value storage area of the key value database, and recording the offset of the original key and the value in the value storage area and the length corresponding to the original key and the value.
In this embodiment, after the index key corresponding to the key-value pair data to be stored has been calculated, it can be compared against the index keys already in the key value database to determine whether it already exists. If it exists, a hash collision has occurred. In that case, for each index key involved in the hash collision, the corresponding original key is stored together with the value as a new value, so that when a plurality of values are obtained during subsequent retrieval, the correct value can be determined according to the original key contained in each obtained value.
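The description does not specify how the original key and the value are laid out when they are stored together. The Java sketch below assumes one possible layout, a 2-byte key length followed by the key bytes and then the value bytes; the class name `CollisionRecord` and this layout are assumptions made purely for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical record layout: [2-byte key length][original key bytes][value bytes].
final class CollisionRecord {
    // Builds the new value that is written to the value storage area on a hash collision.
    static byte[] encode(String originalKey, byte[] value) {
        byte[] key = originalKey.getBytes(StandardCharsets.UTF_8);
        if (key.length > 0xFFFF) {
            throw new IllegalArgumentException("original key too long for a 2-byte length");
        }
        ByteBuffer buf = ByteBuffer.allocate(2 + key.length + value.length);
        buf.putShort((short) key.length);       // big-endian key length
        buf.put(key);
        buf.put(value);
        return buf.array();
    }
}
```

The offset and length recorded for such a record then cover the whole encoded byte sequence, not just the value.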
And step 104, generating a directory entry corresponding to the value according to the index key, the offset and the length, and storing the directory entry into a key storage area of the key value database.
The key storage area of the key value database is used for storing an index corresponding to the value stored in the value storage area, that is, the value corresponding to the key can be obtained from the value storage area by any one key in the key storage area. The specific implementation process of the embodiment of the invention is that a directory entry composed of an index key, an offset and a length is stored in a key storage area, the directory entry corresponding to the index key can be determined under the condition of determining the index key, the offset and the length in the directory entry are further obtained, and the corresponding value can be obtained from a value storage area according to the offset and the length.
Fig. 2 is a schematic diagram illustrating a storage structure of key-value pair data in a key-value database according to an example of the present invention. In Fig. 2, the key storage area is a byte array structure, and each directory entry is stored as two 8-byte long integers: one stores the index key, and the other stores the offset and the length. The 8-byte integer that stores the offset and the length can be further divided according to actual requirements. For example, when the maximum storage capacity of the value storage area is 1024 gigabytes and the maximum length of a value is 16 megabytes, the upper 5 bytes can be used to store the offset and the lower 3 bytes to store the length, so that compact storage is realized and the storage space occupied by the key storage area is saved.
It can be understood that the key storage area can be reasonably planned according to the byte length of the index key, the storage space of the value storage area, and the length of the maximum value to be stored, so as to reduce waste.
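The 5-byte/3-byte division of the 8-byte integer can be sketched in Java as follows. The class name `OffsetLength` and its methods are hypothetical. Note that 3 bytes can represent lengths up to 2^24 - 1, one byte short of a full 16 megabytes.

```java
// Hypothetical packing of offset (upper 5 bytes) and length (lower 3 bytes) into one long.
final class OffsetLength {
    static final long MAX_OFFSET = (1L << 40) - 1;   // 5 bytes: addresses within 1024 GB
    static final long MAX_LENGTH = (1L << 24) - 1;   // 3 bytes: lengths just under 16 MB

    static long pack(long offset, int length) {
        if (offset < 0 || offset > MAX_OFFSET || length < 0 || length > MAX_LENGTH) {
            throw new IllegalArgumentException("offset or length out of range");
        }
        return (offset << 24) | length;
    }

    // Unsigned shift recovers the offset even when the top bit of the packed value is set.
    static long offsetOf(long packed) {
        return packed >>> 24;
    }

    static int lengthOf(long packed) {
        return (int) (packed & MAX_LENGTH);
    }
}
```

For the value in the earlier example (offset 0, length 300), `pack(0, 300)` is simply 300.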
In order to facilitate subsequent retrieval, in an optional embodiment of the present invention, the storing the directory entry in the key storage area of the key-value store further includes:
determining the target storage position of the directory entry in the key storage area according to the numerical value of the index key in the directory entry;
the directory entry is stored in a target storage location in the key storage area.
In this embodiment, the key storage area is a byte array structure, and the directory entries stored in the key storage area may be limited to be arranged in the order of the size of the corresponding index key, specifically in the order from large to small, or in the order from small to large. After the arrangement sequence of the key storage areas is determined, the target storage position of the directory entry to be stored in the key storage area can be determined by comparing the numerical value between the index key in the directory entry to be stored and the index key in the stored directory entry, and the directory entry to be stored is stored into the target storage position according to the target storage position.
As shown in fig. 3, it is assumed that directory entry 1, directory entry 2, and directory entry 3 have been stored in the key storage area, and index key 1 in directory entry 1 is smaller than index key 2 in directory entry 2, and index key 2 is smaller than index key 3 in directory entry 3; at this time, the index key 4 in the directory entry 4 to be stored is larger than the index key 1 and smaller than the index key 2; thus, it may be determined that the target storage location for directory entry 4 is between directory entry 1 and directory entry 2, and directory entry 4 is stored to the target storage location.
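Determining the target storage location by numerical comparison can be sketched in Java as follows, assuming ascending order and signed comparison of index keys. The class `SortedKeyArea` is hypothetical, and a list of `{indexKey, offset, length}` triples stands in for the byte array structure of the key storage area.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical key storage area that keeps directory entries in ascending index-key order.
final class SortedKeyArea {
    // Each entry is {indexKey, offset, length}.
    final List<long[]> entries = new ArrayList<>();

    void insert(long indexKey, long offset, long length) {
        int lo = 0, hi = entries.size();
        // Find the first position whose index key is greater than the new index key.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (entries.get(mid)[0] <= indexKey) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        entries.add(lo, new long[] {indexKey, offset, length});
    }
}
```

Inserting index keys 10, 30 and 50 and then 20 places the last entry between the entries for 10 and 30, mirroring the placement of directory entry 4 between directory entry 1 and directory entry 2 in Fig. 3.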
Optionally, in another embodiment of the present invention, when the key-value pair data to be stored is a closed set with a fixed number, after the key-value pair data to be stored are all stored, the directory entries in the key storage area may be sorted, that is, sorted according to the numerical value of the index key in each directory entry from large to small or from small to large. As shown in fig. 4, which is a schematic diagram of sorting directory entries in a key storage area, a corresponding storage structure schematic diagram when storage of all directory entries is completed corresponds to an upper part in fig. 4; the sorted structural schematic diagram corresponds to the lower part in fig. 4; it can be seen that in this example, the locations of directory entry 1 and directory entry 4 have changed.
Furthermore, the embodiment of the present invention may further export the directory entry stored in the key storage area and the value stored in the value storage area to a binary file stored in a disk, so as to reduce the occupation of the memory space.
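The description does not define the format of the binary file. As one possible sketch, each directory entry could be written as two big-endian 8-byte integers (the index key, then the combined offset and length), matching the two-integer entry of Fig. 2. The class `KeyAreaFile`, its methods, and the file format are assumptions.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical export of directory entries as fixed 16-byte records.
final class KeyAreaFile {
    // Each entry is {indexKey, packedOffsetAndLength}; both are written big-endian.
    static void export(long[][] entries, Path file) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(entries.length * 16);
        for (long[] entry : entries) {
            buf.putLong(entry[0]).putLong(entry[1]);
        }
        Files.write(file, buf.array());
    }

    // Reads the records back in the order they were written.
    static long[][] load(Path file) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(Files.readAllBytes(file));
        long[][] entries = new long[buf.remaining() / 16][];
        for (int i = 0; i < entries.length; i++) {
            entries[i] = new long[] {buf.getLong(), buf.getLong()};
        }
        return entries;
    }
}
```

Because every record has the same size, the i-th directory entry starts at byte 16 * i of the file, so a sorted file can be searched without loading all entries.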
In the process of storing data based on a key value database, the embodiment of the invention obtains key-value pair data to be stored, where the key-value pair data includes an original key and a value, and the original key is of a character string type. A conversion rule is used to convert the original key of the character string type into a corresponding integer type with a specified byte length, obtaining the index key corresponding to the key-value pair data. The value of the key-value pair data is stored in the value storage area of the key value database, and the offset of the value in the value storage area and the length of the value are recorded. A directory entry corresponding to the value is generated according to the index key, the offset and the length, and the directory entry is stored in the key storage area of the key value database. In this way, the original key of the character string type is converted into an index key of the integer type with the specified byte length, which makes the key storage area of the key value database easier to organize and reduces the space it occupies.
Referring to fig. 5, a flowchart illustrating steps of an alternative embodiment of a key-value database-based retrieval method according to the present invention is shown, which may specifically include the following steps:
step 501, receiving retrieval information, wherein the retrieval information comprises a retrieval key; the search key is of a string type.
Step 502, a conversion rule is adopted to convert the search key of the character string type into a corresponding integer type with a specified byte length, and a target key corresponding to the search key is obtained.
Step 503, searching the target directory entry corresponding to the target key from the key storage area of the key value database.
Step 504, according to the offset and the length in the target directory entry, a corresponding target value is obtained from the value storage area of the key-value store.
In this embodiment, the retrieval information is generally a retrieval key input by a user in a retrieval process, and is used for retrieving a corresponding value from a key value database; the key value database in the embodiment is the key value database corresponding to the data storage method based on the key value database; in the process of storing key value pair data with keys of a character string type, a key value database converts original keys of the character string type into corresponding integer types with appointed byte length by adopting a conversion rule to obtain index keys corresponding to the key value pair data, the values of the key value pair data are stored in a value storage area of the key value database, a directory entry corresponding to the values is generated according to the offset and the length of the values in the value storage area and the index keys, and the directory entry is stored in the key storage area.
Correspondingly, in this embodiment, the search key input by the user is the original key corresponding to the value that the user desires to search, and is of a character string type. Because the key storage area of the key value database in the invention is not the original key for directly storing key value pair data, in the retrieval stage, the retrieval key input by a user needs to be converted by adopting the same conversion rule as the storage stage, and an integer type target key corresponding to the retrieval key and with the specified byte length is obtained; so as to find out the corresponding target directory entry from the key storage area according to the target key, and further obtain the target value corresponding to the target key from the value storage area and return the target value according to the offset and the length in the target directory entry.
Illustratively, continuing with the student archive management scenario above, the student archives are stored by the data storage method based on the key value database. When a user needs to obtain the archive of a certain student, the user inputs the student number, and the target key corresponding to the student number is obtained through the conversion rule. Because the target key is of an integer type, the target directory entry whose index key is numerically equal to the target key can be found directly in the key storage area by numerical comparison. The offset and the length of the target value in the value storage area are then read from the target directory entry, and the target value is obtained from the value storage area.
In the retrieval process based on the key value database, retrieval information is received, where the retrieval information includes a retrieval key of a character string type. A conversion rule is used to convert the retrieval key of the character string type into a corresponding integer type with a specified byte length, obtaining the target key corresponding to the retrieval key. The target directory entry corresponding to the target key is then searched for in the key storage area of the key value database. In the process of storing key-value pair data whose keys are of a character string type, the key value database uses the same conversion rule to convert the original key of the character string type into a corresponding integer type with the specified byte length, obtaining the index key corresponding to the key-value pair data; it stores the value of the key-value pair data in the value storage area, generates a directory entry corresponding to the value according to the offset of the value in the value storage area, the length of the value and the index key, and stores the directory entry in the key storage area. Finally, the corresponding target value is obtained from the value storage area of the key value database according to the offset and the length in the target directory entry. Because the retrieval key of the character string type is converted into a target key of the integer type using the same conversion rule as in the data storage process, retrieval does not need to compare keys byte by byte, and the retrieval efficiency is improved.
Further, in an optional embodiment of the present invention, directory entries in the key storage area of the key-value database are sorted and stored according to the numerical size of the corresponding index key; searching a target directory entry corresponding to a target key from a key storage area of a key value database, comprising:
and searching the key storage area of the key value database for the target directory entry corresponding to the target key based on the dichotomy (binary search).
In this embodiment, since the directory entries stored in the key storage area are sorted according to the numerical value of the index key, the dichotomy (binary search) may be used to find the index key whose value equals the target key, so as to obtain the target directory entry corresponding to the target key, thereby further improving the retrieval efficiency.
Specifically, the number of directory entries stored in the key storage area may be obtained, the directory entry located in the middle may be determined, the size of the index key in the directory entry located in the middle may be compared with the size of the target key, the search range of the directory entry may be determined based on the result obtained by the comparison, in the newly determined search range, the size of the index key of the directory entry located in the middle of the search range may be further compared with the size of the target key, and so on until the target index key having the same size as the value of the target key is found.
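The halving procedure just described corresponds to a standard binary search. The Java sketch below assumes the index keys are held in an ascending array of 8-byte integers compared as signed values; the class name `KeyAreaSearch` and the method `find` are hypothetical.

```java
// Hypothetical lookup over index keys sorted in ascending (signed) order.
final class KeyAreaSearch {
    // Returns the position of the directory entry whose index key equals targetKey, or -1.
    static int find(long[] indexKeys, long targetKey) {
        int lo = 0, hi = indexKeys.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;          // directory entry in the middle of the range
            if (indexKeys[mid] < targetKey) {
                lo = mid + 1;                   // continue in the upper half
            } else if (indexKeys[mid] > targetKey) {
                hi = mid - 1;                   // continue in the lower half
            } else {
                return mid;
            }
        }
        return -1;                              // no directory entry has this index key
    }
}
```

The standard library method `java.util.Arrays.binarySearch(long[], long)` performs the same search on a sorted `long` array; the explicit loop is shown only to mirror the steps in the text.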
Further, when a plurality of target directory entries exist, a hash collision occurred in the data storage stage. When a hash collision occurs in the data storage stage, the original key involved in the collision is stored in the value storage area together with the value. Therefore, when a plurality of target directory entries exist, the value corresponding to each target directory entry is obtained and recorded as a candidate value, the specific content of each candidate value is parsed, and the candidate value whose content includes the retrieval key corresponding to the target key (i.e., the corresponding original key) is taken as the target value. When the target value is returned to the user, the original key included in the target value may be deleted.
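Selecting the target value among several candidate values can be sketched in Java as follows. The sketch assumes, for illustration only, that each candidate value was stored as a 2-byte key length, the original key bytes, and then the value bytes; the description does not prescribe this layout, and the class `CandidateResolver` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical resolver; assumes candidates are laid out as [2-byte key length][key][value].
final class CandidateResolver {
    // Returns the value bytes of the candidate whose original key equals the retrieval key,
    // with the original key removed, or null if no candidate matches.
    static byte[] resolve(List<byte[]> candidates, String retrievalKey) {
        byte[] wanted = retrievalKey.getBytes(StandardCharsets.UTF_8);
        for (byte[] candidate : candidates) {
            ByteBuffer buf = ByteBuffer.wrap(candidate);
            int keyLength = buf.getShort() & 0xFFFF;
            byte[] key = new byte[keyLength];
            buf.get(key);
            if (Arrays.equals(key, wanted)) {
                byte[] value = new byte[buf.remaining()];
                buf.get(value);                 // only the value part is returned
                return value;
            }
        }
        return null;
    }
}
```

The full string comparison happens only among the few candidates that share an index key, so the cost of character-by-character comparison is paid only when a collision has actually occurred.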
In order to facilitate understanding of the present solution by those skilled in the art, the following describes an implementation process of the data storage phase and the data retrieval phase according to an embodiment of the present invention with reference to a specific example.
In the data storage stage, assume that the original key of the key-value pair data to be stored is "abcde12345". A hash operation gives a first hash value of -1749051308 for the original key. Inverting the character string of the original key gives the inverted key "54321edcba", and a hash operation gives a second hash value of 1652332247 for the inverted key. Operating on the first hash value and the second hash value according to operation formula (1) gives an operation result of -7512118165233690921; that is, the index key is -7512118165233690921.
(((long)hash1)<<32)|(hash2&0X00000000FFFFFFFFL) (1)
Assuming that the size of the value of the key-value pair data to be stored is 300 bytes, and that it is the first value stored in the value storage area of the byte array structure, the length of the value is 300 and the offset is 0, and the corresponding directory entry is: the index key is -7512118165233690921, the offset is 0, and the length is 300. The directory entry is stored into the key storage area of the byte array structure.
After all the data to be stored are stored, the directory entries in the key storage area are sorted in an ascending order according to the numerical value of the corresponding index key, and if the memory is insufficient, the directory entries stored in the key storage area and the values stored in the value storage area can be exported to a binary file stored in a disk.
In the data retrieval phase, assuming that the original key of the key-value pair data to be retrieved by the user is "abcde12345", the retrieval key input by the user is "abcde12345". Converting the retrieval key in the same way as the original key was converted in the data storage phase gives the corresponding target key, -7512118165233690921.
A dichotomy (binary) search is performed on the directory entries stored in the key storage area in memory or on disk to obtain the target directory entry corresponding to the target key, which is as follows: the index key is -7512118165233690921, the offset is 0, and the length is 300.
Data of length 300 is read starting from offset 0 in the value storage area in memory or on disk and returned to the user, and the retrieval is finished.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the present invention is not limited by the described order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily all required by the embodiments of the invention.
Referring to fig. 6, a block diagram of a data storage device based on a key-value database according to an embodiment of the present invention is shown, which may specifically include the following modules:
a data obtaining module 601, configured to obtain key-value pair data to be stored, where the key-value pair data includes an original key and a value; the original key is of a character string type;
an original key conversion module 602, configured to convert, by using a conversion rule, the original key of the character string type into a corresponding integer type with a specified byte length, so as to obtain an index key corresponding to the key-value pair data;
a value storage module 603, configured to store a value of the key-value pair data into a value storage area of a key-value database, and record an offset of the value in the value storage area and a length of the value;
a key storage module 604, configured to generate a directory entry corresponding to the value according to the index key, the offset, and the length, and store the directory entry in a key storage area of the key-value database.
In an optional embodiment of the present invention, the key storage module 604 further includes:
the target storage position determining module is used for determining the target storage position of the directory entry in the key storage area according to the numerical value of the index key in the directory entry;
a target storage location based storage module to store the directory entry in the target storage location in the key storage area.
In an optional embodiment of the present invention, the key storage module 604 further includes:
and the sorting module is used for sorting the directory entries in the key storage area according to the numerical value of the index key in each directory entry.
In an optional embodiment of the present invention, the original key conversion module 602 includes:
the inversion conversion module is used for inverting the character string corresponding to the original key to obtain an inverted key;
the hash conversion module is used for carrying out first hash operation on the original key to obtain a first hash value with a first specified byte length and carrying out second hash operation on the inverted key to obtain a second hash value with a second specified byte length;
and the index key generation module is used for generating an index key corresponding to the key value pair data according to the first hash value and the second hash value.
In an optional embodiment of the present invention, the first hash value and the second hash value are both 4-byte integer data; the index key generation module is specifically configured to:
and performing preset operation on the first hash value and the second hash value, and taking an operation result of the preset operation as an index key corresponding to the key value pair data.
In an optional embodiment of the present invention, an operation formula of the preset operation is as follows:
(((long)hash1)<<32)|(hash2&0X00000000FFFFFFFFL);
wherein a hash1 in the operation formula represents the first hash value, and the hash2 represents the second hash value; or the hash1 in the operation formula represents the second hash value, and the hash2 represents the first hash value.
In an optional embodiment of the present invention, the index key generation module is specifically configured to:
and performing splicing operation on the first hash value and the second hash value to obtain an index key corresponding to the key value pair data.
In an optional embodiment of the present invention, the value storage module 603 further includes:
the same key acquisition module is used for acquiring the stored index key which is the same as the index key when the index key is the same as the stored index key in the key value database;
and the same key value storage module is used for respectively storing corresponding original keys and values in a value storage area of the key value database aiming at the index keys and the stored index keys, and recording offsets of the original keys and the values in the value storage area and the corresponding lengths of the original keys and the values.
Referring to fig. 7, a block diagram of a key-value database-based retrieval apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
a retrieval information receiving module 701, configured to receive retrieval information input by a user, where the retrieval information includes a retrieval key input by the user in a retrieval process; the retrieval key is of a character string type;
a retrieval key conversion module 702, configured to convert the retrieval key of the character string type into a corresponding integer type with a specified byte length by using a conversion rule, so as to obtain a target key corresponding to the retrieval key;
a target directory entry searching module 703, configured to search a target directory entry corresponding to the target key from a key storage area of a key value database; wherein, in the process of storing key-value pair data whose key is of a character string type, the key value database converts the original key of the character string type into a corresponding integer type with a specified byte length by adopting the conversion rule to obtain an index key corresponding to the key-value pair data, stores the value of the key-value pair data in a value storage area of the key value database, generates a directory entry corresponding to the value according to the offset of the value in the value storage area, the length of the value and the index key, and stores the directory entry in the key storage area;
a target value obtaining module 704, configured to obtain a corresponding target value from a value storage area of the key-value store according to the offset and the length in the target directory entry.
In an optional embodiment of the present invention, the directory entries in the key storage area of the key-value database are sorted and stored according to the numerical size of the corresponding index key; the target directory entry searching module 703 is specifically configured to:
search the key storage area of the key-value database, by binary search (dichotomy), for the target directory entry whose index key equals the target key.
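By way of illustration only, the binary search over sorted directory entries might be sketched in Java as follows. The class name `DirectorySearch`, the method `findAll`, and the representation of the key storage area as a sorted `long[]` of index keys are assumptions made for this sketch; the source only states that entries are sorted by the numerical size of the index key (taken here as signed 64-bit order) and searched by dichotomy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: locate every directory entry whose index key equals the target key. */
final class DirectorySearch {
    /**
     * Binary search for the first position whose key is not smaller than targetKey,
     * then collect the run of equal keys (several entries may share one index key).
     */
    static List<Integer> findAll(long[] sortedIndexKeys, long targetKey) {
        int lo = 0;
        int hi = sortedIndexKeys.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedIndexKeys[mid] < targetKey) {
                lo = mid + 1;   // target lies in the upper half
            } else {
                hi = mid;       // target lies at mid or in the lower half
            }
        }
        List<Integer> hits = new ArrayList<>();
        for (int i = lo; i < sortedIndexKeys.length && sortedIndexKeys[i] == targetKey; i++) {
            hits.add(i);
        }
        return hits;
    }
}
```

Returning all matching positions, rather than a single one, keeps the sketch compatible with the optional embodiment in which a plurality of target directory entries exist.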
In an optional embodiment of the present invention, when there are a plurality of target directory entries, the target value obtaining module 704 further includes:
a candidate value acquisition module, configured to acquire a candidate value corresponding to each target directory entry;
and a candidate value analysis module, configured to parse each candidate value and to take, as the target value, the candidate value that contains the retrieval key corresponding to the target key.
In an optional embodiment of the invention, the apparatus further comprises:
a target value returning module, configured to return the target value to the user.
In an optional embodiment of the present invention, the target value returning module is further configured to, when the target value contains the corresponding retrieval key, delete the retrieval key from the target value and return the resulting target value.
Because the apparatus embodiments are substantially similar to the method embodiments, they are described only briefly; for relevant details, refer to the corresponding description of the method embodiments.
Fig. 8 is a block diagram illustrating the structure of an apparatus 1200 according to an example embodiment. For example, device 1200 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like; or may be a server-side device, such as a server.
Referring to fig. 8, device 1200 may include one or more of the following components: processing component 1202, memory 1204, power component 1206, multimedia component 1208, audio component 1210, input/output (I/O) interface 1212, sensor component 1214, and communications component 1216.
The processing component 1202 generally controls the overall operation of the device 1200, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1202 may include one or more processors 1220 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 1202 can include one or more modules that facilitate interaction between the processing component 1202 and other components. For example, the processing component 1202 can include a multimedia module to facilitate interaction between the multimedia component 1208 and the processing component 1202.
The memory 1204 is configured to store various types of data to support operation at the device 1200. Examples of such data include instructions for any application or method operating on device 1200, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1204 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 1206 provides power to the various components of the device 1200. The power component 1206 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 1200.
The multimedia component 1208 includes a screen that provides an output interface between the device 1200 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1208 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 1200 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 1210 is configured to output and/or input audio signals. For example, the audio component 1210 includes a Microphone (MIC) configured to receive external audio signals when the device 1200 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may further be stored in the memory 1204 or transmitted via the communication component 1216. In some embodiments, the audio component 1210 further includes a speaker for outputting audio signals.
The I/O interface 1212 provides an interface between the processing component 1202 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 1214 includes one or more sensors for providing status assessments of various aspects of the device 1200. For example, the sensor component 1214 may detect an open/closed state of the device 1200 and the relative positioning of components, such as the display and keypad of the device 1200. The sensor component 1214 may also detect a change in the position of the device 1200 or of a component of the device 1200, the presence or absence of user contact with the device 1200, the orientation or acceleration/deceleration of the device 1200, and a change in the temperature of the device 1200. The sensor component 1214 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor component 1214 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1214 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1216 is configured to facilitate communication between the device 1200 and other devices in a wired or wireless manner. The device 1200 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1216 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1216 further includes a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 1200 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided that includes instructions, such as the memory 1204 that includes instructions, that are executable by the processor 1220 of the device 1200 to perform the above-described methods. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium having instructions therein which, when executed by a processor of a device, enable the device to perform a key-value database-based data storage method, the method comprising:
acquiring key-value pair data to be stored, wherein the key-value pair data comprises an original key and a value; the original key is of a character string type;
converting the original key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain an index key corresponding to the key value pair data;
storing the value of the key-value pair data into a value storage area of a key-value database, and recording the offset of the value in the value storage area and the length of the value;
and generating a directory entry corresponding to the value according to the index key, the offset and the length, and storing the directory entry into a key storage area of the key value database.
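By way of illustration only, the four storage steps above might be sketched in Java as follows. This is a minimal in-memory sketch under stated assumptions, not the implementation of the embodiments: the names `KvStoreSketch`, `DirectoryEntry`, `put` and `read` are hypothetical, the value storage area is modeled as a growing byte stream, and `String.hashCode()` widened to 8 bytes stands in for the conversion rule, which the source leaves open at this point.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory sketch of the storage steps; all names are illustrative. */
final class KvStoreSketch {
    /** Directory entry: the index key plus the location of the value in the value area. */
    static final class DirectoryEntry {
        final long indexKey;
        final int offset;
        final int length;

        DirectoryEntry(long indexKey, int offset, int length) {
            this.indexKey = indexKey;
            this.offset = offset;
            this.length = length;
        }
    }

    private final ByteArrayOutputStream valueArea = new ByteArrayOutputStream();
    final List<DirectoryEntry> keyArea = new ArrayList<>();

    /** Stand-in conversion rule (assumption): widen String.hashCode() to an 8-byte integer. */
    static long toIndexKey(String originalKey) {
        return originalKey.hashCode() & 0xFFFFFFFFL;
    }

    /** Appends the value to the value area and records a directory entry in the key area. */
    DirectoryEntry put(String originalKey, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        int offset = valueArea.size();              // offset of the value in the value area
        valueArea.write(bytes, 0, bytes.length);    // store the value
        DirectoryEntry entry = new DirectoryEntry(toIndexKey(originalKey), offset, bytes.length);
        keyArea.add(entry);                         // store the directory entry
        return entry;
    }

    /** Reads back the bytes addressed by a directory entry. */
    String read(DirectoryEntry entry) {
        byte[] all = valueArea.toByteArray();
        byte[] slice = Arrays.copyOfRange(all, entry.offset, entry.offset + entry.length);
        return new String(slice, StandardCharsets.UTF_8);
    }
}
```

Because each directory entry has a fixed width regardless of how long the original string key or the value is, the key storage area stays compact and can be scanned or sorted cheaply.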
Optionally, the storing the directory entry into a key storage area of the key-value database further includes:
determining the target storage position of the directory entry in the key storage area according to the numerical value of an index key in the directory entry;
storing the directory entry in the target storage location in the key storage area.
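By way of illustration only, determining the target storage position from the numerical value of the index key might be sketched in Java as follows. The names `SortedKeyArea`, `Entry`, `targetPosition` and `insert` are hypothetical, and "numerical size" is assumed here to mean signed 64-bit order; an unsigned order would work equally well as long as storage and retrieval agree.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical key storage area kept sorted by index key; all names are illustrative. */
final class SortedKeyArea {
    static final class Entry {
        final long indexKey;
        final int offset;
        final int length;

        Entry(long indexKey, int offset, int length) {
            this.indexKey = indexKey;
            this.offset = offset;
            this.length = length;
        }
    }

    // Assumption: entries are ordered by the signed value of the index key.
    private static final Comparator<Entry> BY_INDEX_KEY =
            Comparator.comparingLong((Entry e) -> e.indexKey);

    final List<Entry> entries = new ArrayList<>();

    /** Position at which the entry can be inserted while keeping the list sorted. */
    int targetPosition(Entry entry) {
        int found = Collections.binarySearch(entries, entry, BY_INDEX_KEY);
        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        return found >= 0 ? found : -found - 1;
    }

    /** Stores the directory entry at its target position. */
    void insert(long indexKey, int offset, int length) {
        Entry entry = new Entry(indexKey, offset, length);
        entries.add(targetPosition(entry), entry);
    }
}
```

Keeping the key storage area ordered at insertion time is what later allows a lookup by binary search instead of a full scan.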
Optionally, the method further comprises:
and sorting the directory entries in the key storage area according to the numerical size of the index key in each directory entry.
Optionally, the converting the original key of the string type into a corresponding integer type with a specified byte length by using a conversion rule to obtain an index key corresponding to the key-value pair data includes:
inverting the character string corresponding to the original key to obtain an inverted key;
performing a first hash operation on the original key to obtain a first hash value of a first specified byte length, and performing a second hash operation on the inverted key to obtain a second hash value of a second specified byte length;
and generating an index key corresponding to the key value pair data according to the first hash value and the second hash value.
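By way of illustration only, the inversion and the two hash operations might be sketched in Java as follows. The source does not name the hash functions, so using `String.hashCode()` for both operations is an assumption of this sketch, as are the names `DualHash`, `invert`, `firstHash` and `secondHash`.

```java
/** Hypothetical sketch of the two hash operations; the hash function is an assumption. */
final class DualHash {
    /** Inverts (reverses) the character string of the original key. */
    static String invert(String originalKey) {
        return new StringBuilder(originalKey).reverse().toString();
    }

    /** First hash operation: a 4-byte hash of the original key. */
    static int firstHash(String originalKey) {
        return originalKey.hashCode();
    }

    /** Second hash operation: a 4-byte hash of the inverted key. */
    static int secondHash(String originalKey) {
        return invert(originalKey).hashCode();
    }
}
```

With this particular hash, the strings "Aa" and "BB" collide on the first hash but differ on the second, which suggests why hashing both the original key and the inverted key lowers the chance that two different original keys end up with the same index key.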
Optionally, the first hash value and the second hash value are both 4-byte integer data; generating an index key corresponding to the key-value pair data according to the first hash value and the second hash value, including:
performing preset operation on the first hash value and the second hash value, and taking an operation result of the preset operation as an index key corresponding to the key value pair data; the operation formula of the preset operation is as follows:
(((long) hash1) << 32) | (hash2 & 0x00000000FFFFFFFFL);
wherein hash1 in the operation formula represents the first hash value and hash2 represents the second hash value; or hash1 in the operation formula represents the second hash value and hash2 represents the first hash value.
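By way of illustration only, the operation formula can be wrapped in a small Java method to show what each part does. The names `IndexKeyFormula`, `combine` and `withoutMask` are hypothetical; `withoutMask` is included only to show why the mask in the formula matters.

```java
/** Hypothetical wrapper around the operation formula; names are illustrative. */
final class IndexKeyFormula {
    /** Places hash1 in the upper 32 bits and hash2 in the lower 32 bits of an 8-byte key. */
    static long combine(int hash1, int hash2) {
        return (((long) hash1) << 32) | (hash2 & 0x00000000FFFFFFFFL);
    }

    /**
     * Counter-example without the mask: a negative hash2 is sign-extended to 64 bits,
     * so its leading one-bits overwrite the upper half that should hold hash1.
     */
    static long withoutMask(int hash1, int hash2) {
        return (((long) hash1) << 32) | hash2;
    }
}
```

For example, `combine(1, -1)` yields `0x00000001FFFFFFFFL`, whereas `withoutMask(1, -1)` yields `-1L`, in which the contribution of `hash1` is lost.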
Optionally, the generating an index key corresponding to the key-value pair data according to the first hash value and the second hash value includes:
and performing splicing operation on the first hash value and the second hash value to obtain an index key corresponding to the key value pair data.
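By way of illustration only, the splicing operation might be sketched in Java by writing the two 4-byte hash values one after the other into an 8-byte buffer. The names `HashSplice` and `splice` are hypothetical, and big-endian byte order (the default of `java.nio.ByteBuffer`) is an assumption; the source does not fix a byte order.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of splicing two 4-byte hashes into one 8-byte index key. */
final class HashSplice {
    static long splice(int firstHash, int secondHash) {
        // Big-endian by default: firstHash occupies the high-order 4 bytes.
        return ByteBuffer.allocate(8)
                .putInt(firstHash)
                .putInt(secondHash)
                .getLong(0);
    }
}
```

Under the big-endian assumption, splicing gives the same 8-byte result as shifting the first hash left by 32 bits and OR-ing in the masked second hash.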
Optionally, the storing the value of the key-value pair data into a value storage area of a key-value database, and recording an offset of the value in the value storage area and a length of the value, further includes:
when the index key is identical to an index key already stored in the key-value database, acquiring the stored index key that is identical to the index key;
and, for the index key and for the stored index key respectively, storing the corresponding original key and value in the value storage area of the key-value database, and recording the offset of each original key and value in the value storage area and the corresponding length.
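By way of illustration only, the handling of identical index keys at storage time might be sketched in Java as a batch step that decides what is written to the value storage area for each pair. Everything specific here is an assumption of the sketch: the names `CollisionAwareWriter`, `buildPayloads` and `indexKey`, the use of `String.hashCode()` as the conversion rule, and the layout "original key, a NUL separator, then the value" for colliding entries. The source only requires that the original key be stored together with the value when index keys coincide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: embed the original key in the payload only when index keys collide. */
final class CollisionAwareWriter {
    /** Stand-in conversion rule (assumption). */
    static long indexKey(String originalKey) {
        return originalKey.hashCode() & 0xFFFFFFFFL;
    }

    /** Each pair is {originalKey, value}; returns the payloads in input order. */
    static List<String> buildPayloads(List<String[]> pairs) {
        // Count how many original keys map to each index key.
        Map<Long, Integer> counts = new HashMap<>();
        for (String[] pair : pairs) {
            counts.merge(indexKey(pair[0]), 1, Integer::sum);
        }
        List<String> payloads = new ArrayList<>();
        for (String[] pair : pairs) {
            boolean collides = counts.get(indexKey(pair[0])) > 1;
            // Colliding entries carry their original key so retrieval can tell them apart.
            payloads.add(collides ? pair[0] + '\u0000' + pair[1] : pair[1]);
        }
        return payloads;
    }
}
```

Entries whose index key is unique store the bare value, so the extra space for the original key is spent only on the colliding entries.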
A non-transitory computer readable storage medium having instructions therein which, when executed by a processor of a device, enable the device to perform a key-value database-based retrieval method, the method comprising:
receiving retrieval information, wherein the retrieval information comprises a retrieval key; the retrieval key is of a character string type;
converting the retrieval key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain a target key corresponding to the retrieval key;
searching a key storage area of a key-value database for a target directory entry corresponding to the target key; wherein, when key-value pair data whose key is of a character string type is stored in the key-value database, the original key of the character string type is converted, by the conversion rule, into a corresponding integer type with a specified byte length to obtain an index key corresponding to the key-value pair data, the value of the key-value pair data is stored in a value storage area of the key-value database, a directory entry corresponding to the value is generated according to the offset of the value in the value storage area, the length of the value and the index key, and the directory entry is stored in the key storage area;
and acquiring a corresponding target value from a value storage area of the key value database according to the offset and the length in the target directory entry.
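By way of illustration only, reading the target value through the offset and length of a directory entry might be sketched in Java as follows. The flat layout of a directory entry (an 8-byte index key, a 4-byte offset and a 4-byte length, 16 bytes in total) and the names `FlatLookup` and `get` are assumptions of this sketch, and a plain linear scan is used to keep it short.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of retrieval over a flat key area and a flat value area. */
final class FlatLookup {
    // Assumed layout: 8-byte index key + 4-byte offset + 4-byte length.
    static final int ENTRY_SIZE = 16;

    /** Returns the value for targetKey, or null if no directory entry matches. */
    static String get(ByteBuffer keyArea, byte[] valueArea, long targetKey) {
        int count = keyArea.limit() / ENTRY_SIZE;
        for (int i = 0; i < count; i++) {
            int base = i * ENTRY_SIZE;
            if (keyArea.getLong(base) == targetKey) {        // compare integers, not strings
                int offset = keyArea.getInt(base + 8);
                int length = keyArea.getInt(base + 12);
                return new String(valueArea, offset, length, StandardCharsets.UTF_8);
            }
        }
        return null;
    }
}
```

Since every entry has the same width, the position of the i-th entry is a simple multiplication, which is also what makes a binary search over a sorted key area straightforward.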
Optionally, the directory entries in the key storage area of the key value database are sorted and stored according to the numerical value of the corresponding index key; the searching the target directory entry corresponding to the target key from the key storage area of the key value database comprises:
searching the key storage area of the key-value database, by binary search (dichotomy), for the target directory entry whose index key equals the target key.
Optionally, when there are a plurality of target directory entries, the obtaining a corresponding target value from a value storage area of the key-value database further includes:
acquiring a candidate value corresponding to each target directory entry;
and parsing each candidate value, and taking, as the target value, the candidate value that contains the retrieval key corresponding to the target key.
Optionally, the taking, as the target value, the candidate value that contains the retrieval key corresponding to the target key further includes:
if the target value contains the corresponding retrieval key, deleting the retrieval key from the target value and returning the resulting target value.
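By way of illustration only, selecting the target value among several candidates and deleting the retrieval key before returning might be sketched in Java as follows. The sketch assumes that a colliding entry was stored as "original key, a NUL separator, then the value"; this layout and the names `CandidateResolver`, `SEPARATOR` and `resolve` are hypothetical and are not taken from the source.

```java
import java.util.List;

/** Hypothetical sketch: pick the candidate that carries the retrieval key, then strip the key. */
final class CandidateResolver {
    // Assumed separator between the embedded original key and the value.
    static final char SEPARATOR = '\u0000';

    /** Returns the value with the retrieval key removed, or null if no candidate matches. */
    static String resolve(List<String> candidates, String retrievalKey) {
        String prefix = retrievalKey + SEPARATOR;
        for (String candidate : candidates) {
            if (candidate.startsWith(prefix)) {
                // Delete the embedded retrieval key and return only the value.
                return candidate.substring(prefix.length());
            }
        }
        return null;
    }
}
```

Comparing the full string key is needed only for the few candidates that share one index key, so the cost of string comparison stays off the common lookup path.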
Fig. 9 is a schematic structural diagram of a server in an embodiment of the present invention. The server 1300 may vary widely in configuration or performance and may include one or more Central Processing Units (CPUs) 1322 (e.g., one or more processors) and memory 1332, one or more storage media 1330 (e.g., one or more mass storage devices) storing applications 1342 or data 1344. Memory 1332 and storage medium 1330 may be, among other things, transitory or persistent storage. The program stored on the storage medium 1330 may include one or more modules (not shown), each of which may include a sequence of instructions operating on a server. Still further, the central processor 1322 may be arranged in communication with the storage medium 1330, executing a sequence of instruction operations in the storage medium 1330 on the server 1300.
The server 1300 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input-output interfaces 1358, one or more keyboards 1356, and/or one or more operating systems 1341 such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The embodiment of the invention also discloses a computer program product, which comprises a computer program or computer instructions, and when the computer program or the computer instructions are executed by a processor, the data storage method based on the key value database is realized.
The embodiment of the invention also discloses a computer program product, which comprises a computer program or computer instructions, and when the computer program or the computer instructions are executed by a processor, the retrieval method based on the key value database is realized.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The key-value database-based data storage method and apparatus, the key-value database-based retrieval method and apparatus, the readable storage medium and the computer program product provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and the implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method and the core idea of the invention. Meanwhile, a person skilled in the art may, according to the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (14)

1. A method for storing data based on a key-value database, the method comprising:
acquiring key-value pair data to be stored, wherein the key-value pair data comprises an original key and a value; the original key is of a character string type;
converting the original key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain an index key corresponding to the key value pair data;
storing the value of the key-value pair data into a value storage area of a key-value database, and recording the offset of the value in the value storage area and the length of the value;
and generating a directory entry corresponding to the value according to the index key, the offset and the length, and storing the directory entry into a key storage area of the key value database.
2. The method of claim 1, wherein storing the directory entry in a key storage area of the key-value database further comprises:
determining the target storage position of the directory entry in the key storage area according to the numerical value of an index key in the directory entry;
storing the directory entry in the target storage location in the key storage area.
3. The method according to claim 1, wherein the converting the original key of the string type into a corresponding integer type with a specified byte length by using a conversion rule to obtain an index key corresponding to the key-value pair data comprises:
inverting the character string corresponding to the original key to obtain an inverted key;
performing a first hash operation on the original key to obtain a first hash value of a first specified byte length, and performing a second hash operation on the inverted key to obtain a second hash value of a second specified byte length;
and generating an index key corresponding to the key value pair data according to the first hash value and the second hash value.
4. The method of claim 3, wherein the first hash value and the second hash value are both 4-byte integer data; generating an index key corresponding to the key-value pair data according to the first hash value and the second hash value, including:
and performing preset operation on the first hash value and the second hash value, and taking an operation result of the preset operation as an index key corresponding to the key value pair data.
5. The method of any one of claims 1-4, wherein storing the value of the key-value pair data in a value storage area of a key-value database, and recording an offset of the value in the value storage area and a length of the value, further comprises:
when the index key is identical to an index key already stored in the key-value database, acquiring the stored index key that is identical to the index key;
and, for the index key and for the stored index key respectively, storing the corresponding original key and value in the value storage area of the key-value database, and recording the offset of each original key and value in the value storage area and the corresponding length.
6. A key-value database-based retrieval method, the method comprising:
receiving retrieval information, wherein the retrieval information comprises a retrieval key; the retrieval key is of a character string type;
converting the retrieval key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain a target key corresponding to the retrieval key;
searching a key storage area of a key-value database for a target directory entry corresponding to the target key; wherein, when key-value pair data whose key is of a character string type is stored in the key-value database, the original key of the character string type is converted, by the conversion rule, into a corresponding integer type with a specified byte length to obtain an index key corresponding to the key-value pair data, the value of the key-value pair data is stored in a value storage area of the key-value database, a directory entry corresponding to the value is generated according to the offset of the value in the value storage area, the length of the value and the index key, and the directory entry is stored in the key storage area;
and acquiring a corresponding target value from a value storage area of the key value database according to the offset and the length in the target directory entry.
7. The method of claim 6, wherein the directory entries in the key storage area of the key-value database are sorted and stored according to the numerical size of the corresponding index key; the searching the target directory entry corresponding to the target key from the key storage area of the key-value database comprises:
searching the key storage area of the key-value database, by binary search (dichotomy), for the target directory entry whose index key equals the target key.
8. The method of claim 6 or 7, wherein, when there are a plurality of target directory entries, obtaining the corresponding target value from a value storage area of the key-value database further comprises:
acquiring a candidate value corresponding to each target directory entry;
and parsing each candidate value, and taking, as the target value, the candidate value that contains the retrieval key corresponding to the target key.
9. The method of claim 8, wherein, after the step of taking, as the target value, the candidate value that contains the retrieval key corresponding to the target key, the method further comprises:
deleting the retrieval key from the target value and returning the resulting target value.
10. A key-value database-based data storage device, comprising:
the data acquisition module is used for acquiring key-value pair data to be stored, wherein the key-value pair data comprises an original key and a value; the original key is of a character string type;
the original key conversion module is used for converting the original key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain an index key corresponding to the key value pair data;
the value storage module is used for storing the value of the key value pair data into a value storage area of a key value database and recording the offset of the value in the value storage area and the length of the value;
and the key storage module is used for generating a directory entry corresponding to the value according to the index key, the offset and the length, and storing the directory entry into a key storage area of the key value database.
11. A key-value database-based retrieval apparatus, comprising:
the retrieval information receiving module is used for receiving retrieval information input by a user, and the retrieval information comprises a retrieval key input by the user in a retrieval process; the retrieval key is of a character string type;
the retrieval key conversion module is used for converting the retrieval key of the character string type into a corresponding integer type with a specified byte length by adopting a conversion rule to obtain a target key corresponding to the retrieval key;
the target directory entry searching module is used for searching a key storage area of the key-value database for a target directory entry corresponding to the target key; wherein, when key-value pair data whose key is of a character string type is stored in the key-value database, the original key of the character string type is converted, by the conversion rule, into a corresponding integer type with a specified byte length to obtain an index key corresponding to the key-value pair data, the value of the key-value pair data is stored in a value storage area of the key-value database, a directory entry corresponding to the value is generated according to the offset of the value in the value storage area, the length of the value and the index key, and the directory entry is stored in the key storage area;
and the target value acquisition module is used for acquiring a corresponding target value from the value storage area of the key value database according to the offset and the length in the target directory entry.
12. An apparatus comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for performing the key-value database-based data storage method according to any one of claims 1-5, or the key-value database-based retrieval method according to any one of claims 6-9.
13. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of a device, enable the device to perform the key-value database-based data storage method according to any one of claims 1-5, or the key-value database-based retrieval method according to any one of claims 6-9.
14. A computer program product, characterized in that it comprises a computer program or computer instructions which, when executed by a processor, implement the key-value database-based data storage method according to any one of claims 1-5, or the key-value database-based retrieval method according to any one of claims 6-9.
CN202111258633.9A 2021-10-27 2021-10-27 Data storage method and retrieval method based on key value database and corresponding devices Pending CN114090575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111258633.9A CN114090575A (en) 2021-10-27 2021-10-27 Data storage method and retrieval method based on key value database and corresponding devices

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111258633.9A CN114090575A (en) 2021-10-27 2021-10-27 Data storage method and retrieval method based on key value database and corresponding devices

Publications (1)

Publication Number Publication Date
CN114090575A true CN114090575A (en) 2022-02-25

Family

ID=80297933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111258633.9A Pending CN114090575A (en) 2021-10-27 2021-10-27 Data storage method and retrieval method based on key value database and corresponding devices

Country Status (1)

Country Link
CN (1) CN114090575A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757315A (en) * 2022-06-15 2022-07-15 深圳市成为信息股份有限公司 Method for reader-writer to rapidly output tag data, reader-writer and receiving terminal
CN115061637A (en) * 2022-07-12 2022-09-16 平安科技(深圳)有限公司 Disk data indexing method and device, computer equipment and storage medium
WO2023160115A1 (en) * 2022-02-28 2023-08-31 华为技术有限公司 Key-value pair retrieving method and apparatus, and storage medium
WO2024021667A1 (en) * 2022-07-26 2024-02-01 华为云计算技术有限公司 Data processing method and apparatus, device and storage medium
CN117827849A (en) * 2024-03-04 2024-04-05 支付宝(杭州)信息技术有限公司 Data dictionary maintenance method and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023160115A1 (en) * 2022-02-28 2023-08-31 华为技术有限公司 Key-value pair retrieving method and apparatus, and storage medium
CN114757315A (en) * 2022-06-15 2022-07-15 深圳市成为信息股份有限公司 Method for reader-writer to rapidly output tag data, reader-writer and receiving terminal
CN114757315B (en) * 2022-06-15 2022-08-26 深圳市成为信息股份有限公司 Method for reader-writer to rapidly output label data, reader-writer and receiving terminal
CN115061637A (en) * 2022-07-12 2022-09-16 平安科技(深圳)有限公司 Disk data indexing method and device, computer equipment and storage medium
WO2024021667A1 (en) * 2022-07-26 2024-02-01 华为云计算技术有限公司 Data processing method and apparatus, device and storage medium
CN117827849A (en) * 2024-03-04 2024-04-05 支付宝(杭州)信息技术有限公司 Data dictionary maintenance method and device

Similar Documents

Publication Publication Date Title
CN114090575A (en) Data storage method and retrieval method based on key value database and corresponding devices
CN109089133B (en) Video processing method and device, electronic equipment and storage medium
CN108304475B (en) Data query method and device and electronic equipment
CN110472091B (en) Image processing method and device, electronic equipment and storage medium
CN111581488B (en) Data processing method and device, electronic equipment and storage medium
CN109144285B (en) Input method and device
RU2663707C1 (en) Method and device for resource search
CN105095427A (en) Search recommendation method and device
CN105843951B (en) Data query method and device
CN104809157A (en) Number recognition method and device
TWI755890B (en) Data processing method, electronic device and computer-readable storage medium
CN110826697B (en) Method and device for acquiring sample, electronic equipment and storage medium
CN110442844B (en) Data processing method, device, electronic equipment and storage medium
CN112307281A (en) Entity recommendation method and device
CN114168808A (en) Regular expression-based document character string coding identification method and device
CN113987128A (en) Related article searching method and device, electronic equipment and storage medium
CN106959970B (en) Word bank, processing method and device of word bank and device for processing word bank
CN109992790B (en) Data processing method and device for data processing
CN112131999B (en) Identity determination method and device, electronic equipment and storage medium
CN108304491B (en) Data query method and device and electronic equipment
CN110471538B (en) Input prediction method and device
CN108241438B (en) Input method, input device and input device
CN112905023A (en) Input error correction method and device for input error correction
CN112651221A (en) Data processing method and device and data processing device
CN109388251B (en) Input method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination