CN111767289A

CN111767289A - Data storage method and device based on memory database

Info

Publication number: CN111767289A
Application number: CN202010909904.1A
Authority: CN
Inventors: 张艳清; 肖杰; 张永飞; 杨尧; 胥莉君
Original assignee: Chengdu Sefon Software Co Ltd
Current assignee: Chengdu Sefon Software Co Ltd
Priority date: 2020-09-02
Filing date: 2020-09-02
Publication date: 2020-10-13

Abstract

The invention discloses a data storage method and a data storage device based on a memory database. They mainly address a problem in the prior art: when a memory database is queried, the required data has to be filtered in memory, so a lot of useless data is read, which causes read amplification and wastes resources. The data storage method based on the memory database comprises the following steps: dividing the data into blocks according to a set quantity and storing the blocks; and establishing a multi-level index, with a reverse index established in each level of the index. With this scheme, the invention avoids retrieving useless data, reduces read amplification and reduces memory consumption, and has high practical and popularization value.

Description

Data storage method and device based on memory database

Technical Field

The invention relates to the technical field of memory data storage, in particular to a data storage method and device based on a memory database.

Background

A database is a warehouse for storing data. It organizes, stores and manages data according to a data structure defined by its users, so that the data can be kept and used more conveniently. According to where the data is stored, databases fall broadly into disk databases and memory databases; a memory database essentially keeps in memory the data that a disk database would keep on disk.

An existing memory database sorts its stored data by an index formed from multiple columns and reads data according to that index. Multiple columns form a primary key; when only some of those columns are queried, data is first fetched by the combined primary key and then filtered by the specified column. A query therefore reads out all data under the primaryKey and filters the required data in memory, so a lot of useless data is read, which causes read amplification and wastes resources.

Disclosure of Invention

The invention aims to provide a data storage method and a data storage device based on a memory database, which are used for solving the problems of reading amplification and resource waste caused by the fact that required data need to be screened in a memory and a lot of useless data are read when the existing memory database is used for data query.

In order to solve the above problems, the present invention provides the following technical solutions:

A data storage method based on a memory database comprises the following steps:

dividing the data into blocks according to a set quantity and storing the blocks;

and establishing a multi-level index, with a reverse index established in each level of the index.

By the scheme, the data storage structure is optimized, the corresponding data are directly retrieved according to the reverse index, the retrieval of useless data is avoided, the reading amplification is reduced, and the memory consumption is reduced.

Further, the process of establishing the multi-level index is as follows:

S101, establishing a Manifest Block;

S102, establishing a plurality of lower-level Index Blocks for the Manifest Block in step S101 according to a set threshold;

S103, establishing a plurality of lower-level Data Blocks for the Index Blocks in step S102 according to a set threshold.

During a query, the first-level index is searched according to the specified conditions, and the matching entry leads to the corresponding second-level Index Block. The second-level Index Block is then searched, and its matching entry leads to the corresponding Data Block. Finally, the required data is looked up in that Data Block by its index. Only one block per level is touched, which reduces read amplification.

Further, at step S101, there is one Manifest Block, which stores the largest Index value in each Index Block, and a pointer to the Index Block.

Further, the Index Block in step S102 stores the value with the largest Index in each Data Block, and a pointer pointing to the Data Block.

Further, the Data Block in step S103 stores an index and the real data.

Further, establishing a reverse index according to the fixed columns of the data, and splicing the fixed columns into primary keys.

Further, the database is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary key column is greater than the set threshold; otherwise, no compression is performed.

A data storage device based on a memory database comprises:

a memory: for storing executable instructions;

a processor: for executing the executable instructions stored in the memory to implement the data storage method based on the memory database.

Compared with the prior art, the invention has the following beneficial effects:

(1) the storage structure of the invention cuts the fixed data into a plurality of data blocks to be cached in the memory, and establishes the reverse index according to the data blocks, thereby avoiding the retrieval of useless data, reducing the reading amplification and reducing the memory consumption.

(2) In the reading process, partial query and range query aiming at the primaryKey are provided through reverse index, and compared with a traditional memory database, the reading amplification is reduced.

(3) The invention determines whether the non-primary key columns are compressed according to the query frequency of each block and of each column in the block, thereby reducing memory consumption.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts, wherein:

Fig. 1 is a schematic diagram of storage after blocking according to a set quantity in embodiment 1.

Fig. 2 is a schematic diagram of index establishment according to fixed columns in embodiment 3.

Fig. 3 is a schematic diagram of the index structure in embodiment 2.

Fig. 4 is a flowchart of the read procedure after data storage in embodiment 6.

Fig. 5 is a schematic diagram of deciding whether to compress according to access frequency in embodiment 4.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to fig. 1 to 5, the described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

Example 1

A data storage method based on a memory database comprises the following steps:

the data are stored after being blocked according to the set quantity;

As shown in Fig. 1, every 1000 pieces of data form one data block, and each data block is then cached in memory. Block storage optimizes the data storage structure; indexes are then established, with a reverse index in each level of the index. When reading, the corresponding data can be looked up directly in the data block according to the index, which reduces read amplification and avoids reading useless data.
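The blocking step can be illustrated with a minimal Python sketch. The block size of 1000 comes from the example above; the function name `split_into_blocks`, the `block_cache` list and the sample rows are assumptions made only for illustration and are not part of the disclosed method.

```python
from typing import Iterator, List, Sequence

BLOCK_SIZE = 1000  # rows per data block, as in the example above


def split_into_blocks(rows: Sequence[dict],
                      block_size: int = BLOCK_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive blocks holding at most `block_size` rows each."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for start in range(0, len(rows), block_size):
        yield list(rows[start:start + block_size])


# Hypothetical input: 2500 rows, cached in memory as a list of blocks.
rows = [{"id": i} for i in range(2500)]
block_cache = list(split_into_blocks(rows))
block_sizes = [len(block) for block in block_cache]
```

In this sketch the last block is simply shorter than the others; the source does not say how a partially filled block is handled.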

Example 2

As shown in fig. 3, in this embodiment, based on embodiment 1, a process of establishing a multi-level index is as follows:

S101, establishing a Manifest Block, wherein there is one Manifest Block, and the Manifest Block stores the maximum index value in each Index Block and a pointer pointing to that Index Block;

S102, establishing a plurality of lower-level Index Blocks for the Manifest Block in step S101 according to a set threshold, wherein each Index Block stores the maximum index value in each Data Block and a pointer pointing to that Data Block;

S103, establishing a plurality of lower-level Data Blocks for the Index Blocks in step S102 according to a set threshold, wherein each Data Block stores an index and the real data.

During reading, the index is first searched in the Manifest Block and then in the Index Block; finally, the data is fetched from the corresponding block cache according to the index of the Data Block and returned. The read is more precise, reading of useless data is avoided, read amplification is reduced, and memory resources are saved.
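A minimal Python sketch of the three-level structure and the read path described in this embodiment follows. The class names mirror the block names in the text; the fields `max_keys` and `children`, the parameters `rows_per_block` and `blocks_per_index` (standing in for the "set threshold"), the use of binary search, and the sample keys are all assumptions for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataBlock:
    rows: Dict[str, dict]        # index key -> real data


@dataclass
class IndexBlock:
    max_keys: List[str]          # largest key in each child Data Block
    children: List[DataBlock]    # "pointers" to the Data Blocks


@dataclass
class ManifestBlock:
    max_keys: List[str]          # largest key in each child Index Block
    children: List[IndexBlock]   # "pointers" to the Index Blocks


def build(rows: Dict[str, dict], rows_per_block: int,
          blocks_per_index: int) -> ManifestBlock:
    """Build Data Blocks, then Index Blocks, then the single Manifest Block."""
    keys = sorted(rows)
    data_blocks = [DataBlock({k: rows[k] for k in keys[i:i + rows_per_block]})
                   for i in range(0, len(keys), rows_per_block)]
    index_blocks = []
    for i in range(0, len(data_blocks), blocks_per_index):
        group = data_blocks[i:i + blocks_per_index]
        index_blocks.append(IndexBlock([max(b.rows) for b in group], group))
    return ManifestBlock([ib.max_keys[-1] for ib in index_blocks], index_blocks)


def lookup(manifest: ManifestBlock, key: str) -> Optional[dict]:
    """Visit one block per level: Manifest Block, Index Block, Data Block."""
    i = bisect_left(manifest.max_keys, key)   # first Index Block with max >= key
    if i == len(manifest.max_keys):
        return None                           # key is larger than every stored key
    index_block = manifest.children[i]
    j = bisect_left(index_block.max_keys, key)  # first Data Block with max >= key
    return index_block.children[j].rows.get(key)


# Hypothetical data: keys A01..A12, 2 rows per Data Block, 3 Data Blocks per Index Block.
rows = {f"A{n:02d}": {"n": n} for n in range(1, 13)}
manifest = build(rows, rows_per_block=2, blocks_per_index=3)
```

With these sample values, `lookup(manifest, "A07")` touches the Manifest Block, the second Index Block and one Data Block only, which is the property the embodiment relies on to limit read amplification.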

Example 3

In this embodiment, based on embodiment 1, further, a reverse index is established according to a fixed column of data, and the fixed column is spliced into a primary key; partial search is facilitated, and read amplification is reduced.

As shown in Fig. 2, for example, one reverse index is established with the place as a fixed column, one with the model as a fixed column, and one with the number as a fixed column.
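The idea of one reverse index per fixed column, with the fixed columns spliced into a primary key, can be sketched in Python as follows. The column names place, model and number come from the example above; the `|` separator, the function names, the `price` column and the sample rows are assumptions for illustration, since the source does not specify them.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

FIXED_COLUMNS = ("place", "model", "number")  # fixed columns from the example above


def primary_key(row: dict) -> str:
    """Splice the fixed columns into one primary key ('|' is an assumed separator)."""
    return "|".join(str(row[col]) for col in FIXED_COLUMNS)


def build_reverse_index(rows: List[dict]) -> Dict[Tuple[str, str], List[str]]:
    """Map (column, value) to the primary keys of the rows holding that value."""
    index = defaultdict(list)
    for row in rows:
        pk = primary_key(row)
        for col in FIXED_COLUMNS:
            index[(col, str(row[col]))].append(pk)
    return dict(index)


# Hypothetical rows; "price" stands in for a non-primary key column.
rows = [
    {"place": "Chengdu", "model": "A01", "number": 1, "price": 10},
    {"place": "Chengdu", "model": "A02", "number": 2, "price": 12},
    {"place": "Beijing", "model": "A01", "number": 3, "price": 11},
]
reverse_index = build_reverse_index(rows)
matches = reverse_index[("model", "A01")]
```

A partial query on a single fixed column, such as model A01, then yields the matching primary keys directly instead of reading every row under the combined key and filtering afterwards.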

Example 4

In this embodiment, on the basis of embodiment 1, the database is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary key column is greater than a set threshold; otherwise, no compression is performed.

As shown in Fig. 5, a data block with a large magnitude and a high access frequency is compressed, so as to reduce memory consumption.
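The compression decision stated in this embodiment can be sketched in Python. The two conditions follow the text literally; the threshold values, the function names, the JSON serialization and the choice of zlib are assumptions, because the source names neither concrete thresholds nor a compression algorithm.

```python
import json
import zlib
from typing import List, Union

SIZE_THRESHOLD = 1000        # rows per block; hypothetical value
FREQUENCY_THRESHOLD = 100    # queries on non-primary key columns; hypothetical value


def should_compress(row_count: int, non_pk_query_count: int) -> bool:
    """Both conditions from the embodiment must hold; otherwise do not compress."""
    return row_count > SIZE_THRESHOLD and non_pk_query_count > FREQUENCY_THRESHOLD


def store_block(rows: List[dict], non_pk_query_count: int) -> Union[bytes, List[dict]]:
    """Return a zlib-compressed payload when the rule fires, else the rows unchanged."""
    if should_compress(len(rows), non_pk_query_count):
        return zlib.compress(json.dumps(rows).encode("utf-8"))
    return rows


def load_block(stored: Union[bytes, List[dict]]) -> List[dict]:
    """Undo store_block so callers always get plain rows back."""
    if isinstance(stored, bytes):
        return json.loads(zlib.decompress(stored).decode("utf-8"))
    return stored


# Hypothetical block of 2000 rows with a repetitive non-primary key column.
big_block = [{"id": i, "note": "same text"} for i in range(2000)]
compressed = store_block(big_block, non_pk_query_count=500)
uncompressed = store_block(big_block, non_pk_query_count=10)
```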

Example 5

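In this embodiment, based on embodiment 1, a data storage device based on a memory database comprises:

a memory: for storing executable instructions;

a processor: for executing the executable instructions stored in the memory to implement the data storage method based on the memory database.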

Example 6

In this embodiment, based on embodiment 1, the read flow after data has been stored by the present invention is shown in Fig. 4. For example, to query the data whose model is A01:

The reverse index in the Manifest Block is queried, and A01 is judged to sort before A05 according to the index order of the places, so the search continues in Index Block1.

In Index Block1, A01 is judged to sort before A02, so the data may be in Data Block1. Data Block1 is then searched for the data of A01, and the query result is returned.
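The A01 query can be traced with a small Python sketch. Fig. 4 is not reproduced in the text, so only the boundary values A05 and A02 are taken from the example; the remaining block contents, the tuple layout and the function name `trace` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical contents; only the boundaries A05 and A02 come from the example.
data_block1 = {"A01": {"model": "A01"}, "A02": {"model": "A02"}}
data_block2 = {"A03": {"model": "A03"}, "A05": {"model": "A05"}}
index_block1 = (["A02", "A05"], [data_block1, data_block2])  # (max keys, Data Blocks)
manifest_block = (["A05"], [index_block1])                   # (max keys, Index Blocks)


def trace(key):
    """Return the comparisons made at each level and the row found, if any."""
    steps = []
    max_keys, index_blocks = manifest_block
    i = bisect_left(max_keys, key)
    if i == len(max_keys):
        return steps, None                    # key sorts after every stored key
    steps.append(f"Manifest Block: {key} <= {max_keys[i]}")
    block_max, data_blocks = index_blocks[i]
    j = bisect_left(block_max, key)
    steps.append(f"Index Block{i + 1}: {key} <= {block_max[j]}")
    return steps, data_blocks[j].get(key)     # None if the key is absent


steps, result = trace("A01")
```

The trace records one comparison per level, matching the two judgments in the example, before the final lookup in Data Block1.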

The invention is mainly intended for scenarios that need data caching and efficient queries, for example algorithm workloads that need a data cache, and data analysis and data mining workloads that need efficient data caching and querying.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.

The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A data storage method based on a memory database is characterized by comprising the following steps:

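dividing the data into blocks according to a set quantity and storing the blocks;

and establishing a multi-level index, with a reverse index established in each level of the index.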

2. The in-memory database-based data storage method according to claim 1, wherein the process of establishing the multi-level index is as follows:

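S101, establishing a Manifest Block;

S102, establishing a plurality of lower-level Index Blocks for the Manifest Block in step S101 according to a set threshold;

S103, establishing a plurality of lower-level Data Blocks for the Index Blocks in step S102 according to a set threshold.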

3. The in-memory database-based data storage method according to claim 2, wherein in step S101, there is one Manifest Block, and the largest Index value in each Index Block and the pointer pointing to Index Block are stored.

4. The in-memory database-based Data storage method according to claim 2, wherein the Index Block in step S102 stores the largest Index value in each Data Block and the pointer pointing to the Data Block.

5. The in-memory database-based Data storage method according to claim 2, wherein the Data Block in step S103 stores an index and stores real Data.

6. The data storage method based on the in-memory database as claimed in claim 1, wherein the reverse index is established according to the fixed columns of the data, and the fixed columns are spliced into primary keys.

7. The data storage method based on the in-memory database as claimed in claim 6, wherein the database is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary key column is higher than a set threshold; otherwise, no compression is performed.

8. A data storage device based on a memory database is characterized by comprising

A memory: for storing executable instructions;

a processor: the executable instructions stored in the memory are executed to realize the data storage method based on the memory database according to any one of claims 1 to 7.