CN111767289A - Data storage method and device based on memory database - Google Patents
Data storage method and device based on memory database
- Publication number
- CN111767289A
- Authority
- CN
- China
- Prior art keywords
- data
- index
- block
- data storage
- memory database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data storage method and device based on a memory database. It mainly addresses a problem of the prior art: when a memory database is queried, the required data has to be filtered in memory, so a large amount of useless data is read, which causes read amplification and wastes resources. The data storage method based on the memory database comprises the following steps: dividing the data into blocks of a set quantity and storing the blocks; and establishing a multi-level index, with a reverse index established in each level of the index. With this scheme, the invention avoids retrieving useless data, reduces read amplification, and reduces memory consumption, and has high practical and promotional value.
Description
Technical Field
The invention relates to the technical field of memory data storage, in particular to a data storage method and device based on a memory database.
Background
A database is a warehouse for storing data; it organizes, stores, and manages data according to a data structure defined by its designers, so that the data can be kept and used more conveniently. By where the data is stored, databases are broadly divided into disk databases and memory databases, and memory databases have largely replaced disk databases.
An existing memory database stores its data sorted by an index formed from multiple columns, and reads data according to that index. Several columns together form a primary key; when only some of those columns are queried, the data is first fetched by the combined primary key and then filtered by the specified column. A query therefore reads out all the data under the primary key and filters the required data in memory, so a large amount of useless data is read, which causes read amplification and wastes resources.
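The read amplification described above can be illustrated with a small sketch. The store layout, the sample rows, and the function name `query_by_model` are hypothetical and are not taken from the patent; the sketch only shows how filtering in memory after a broad read inflates the number of rows touched.

```python
# Hypothetical rows keyed by a composite primary key (place, model, number).
rows = {
    ("P1", "A01", 1): "r1",
    ("P1", "A02", 2): "r2",
    ("P2", "A01", 3): "r3",
    ("P2", "A03", 4): "r4",
}


def query_by_model(store, model):
    """Read every row, then filter in memory on a single key column."""
    rows_read = 0
    matches = []
    for key, value in store.items():
        rows_read += 1            # every row is read, wanted or not
        if key[1] == model:       # key[1] is the model column
            matches.append(value)
    return matches, rows_read


matches, rows_read = query_by_model(rows, "A01")
# Ratio of rows read to rows actually needed by the query.
read_amplification = rows_read / len(matches)
```

Here four rows are read to return two, so the ratio is 2; the method disclosed below aims to bring the rows read closer to the rows needed.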
Disclosure of Invention
The invention aims to provide a data storage method and a data storage device based on a memory database, so as to solve the read amplification and resource waste that arise in existing memory databases, where the required data has to be filtered in memory at query time and a large amount of useless data is read.
In order to solve the above problems, the present invention provides the following technical solutions:
A data storage method based on a memory database comprises the following steps:
dividing the data into blocks of a set quantity and storing the blocks;
establishing a multi-level index, and establishing a reverse index in each level of the index.
With this scheme, the data storage structure is optimized and the corresponding data is retrieved directly through the reverse index, which avoids retrieving useless data, reduces read amplification, and reduces memory consumption.
Further, the process of establishing the multi-level index is as follows:
S101, establishing a Manifest Block;
S102, establishing a plurality of lower-level Index Blocks for the Manifest Block of step S101 according to a set threshold;
S103, establishing a plurality of lower-level Data Blocks according to the set threshold of step S102.
At query time, the first-level index is searched according to the specified conditions and the corresponding second-level index block is located; the index within that second-level index block is then filtered to locate the corresponding data block; finally, the corresponding data is retrieved from the data block according to its index, which reduces read amplification.
Further, in step S101 there is exactly one Manifest Block, which stores the largest index value of each Index Block together with a pointer to that Index Block.
Further, the Index Block in step S102 stores the largest index value of each Data Block together with a pointer to that Data Block.
Further, the Data Block in step S103 stores an index and the actual data.
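The three block types described above can be sketched as plain data structures. This is a minimal illustration under stated assumptions, not the patent's implementation: the class names, the `build` helper, and the two sizing parameters `rows_per_block` and `blocks_per_index` (standing in for the "set threshold") are hypothetical, Python object references stand in for pointers, and the levels are built bottom-up for simplicity.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class DataBlock:
    keys: List[str]   # index: sorted keys held by this block
    rows: List[Any]   # the actual data, parallel to keys


@dataclass
class IndexBlock:
    # (largest key in a Data Block, reference to that Data Block)
    entries: List[Tuple[str, DataBlock]]


@dataclass
class ManifestBlock:
    # (largest key in an Index Block, reference to that Index Block)
    entries: List[Tuple[str, IndexBlock]]


def build(records, rows_per_block, blocks_per_index):
    """Build the three levels bottom-up from (key, row) pairs."""
    records = sorted(records, key=lambda kv: kv[0])
    data_blocks = []
    for i in range(0, len(records), rows_per_block):
        chunk = records[i:i + rows_per_block]
        data_blocks.append(
            DataBlock([k for k, _ in chunk], [r for _, r in chunk])
        )
    index_blocks = []
    for i in range(0, len(data_blocks), blocks_per_index):
        group = data_blocks[i:i + blocks_per_index]
        index_blocks.append(IndexBlock([(b.keys[-1], b) for b in group]))
    # The single Manifest Block records the largest key of each Index Block.
    return ManifestBlock([(ib.entries[-1][0], ib) for ib in index_blocks])


# Eight hypothetical records with keys A01..A08, two rows per Data Block,
# two Data Blocks per Index Block.
manifest = build([(f"A{n:02d}", {"n": n}) for n in range(1, 9)], 2, 2)
```

Because each level stores only the largest key of each child, a lookup needs to compare against one key per child rather than open every child.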
Further, the reverse index is established according to fixed columns of the data, and the fixed columns are spliced into a primary key.
Further, the data block is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary-key column is greater than a set threshold; otherwise, no compression is performed.
A data storage device based on a memory database comprises:
a memory, for storing executable instructions;
a processor, for executing the executable instructions stored in the memory to implement the data storage method based on the memory database.
Compared with the prior art, the invention has the following beneficial effects:
(1) The storage structure of the invention cuts the data, a fixed quantity at a time, into a plurality of data blocks that are cached in memory, and establishes the reverse index over those data blocks, which avoids retrieving useless data, reduces read amplification, and reduces memory consumption.
(2) During reading, partial queries and range queries on the primaryKey are supported through the reverse index, which reduces read amplification compared with a traditional memory database.
(3) The invention decides whether a non-primary-key column is compressed according to the query frequency of each block and of each column within the block, which reduces memory consumption.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts, wherein:
Fig. 1 is a schematic diagram of storage after blocking by a set quantity in embodiment 1.
Fig. 2 is a schematic diagram of establishing indexes according to fixed columns in embodiment 3.
Fig. 3 is a schematic diagram of the index structure in embodiment 2.
Fig. 4 is a flowchart of the read procedure after data storage in embodiment 6.
Fig. 5 is a schematic diagram of deciding whether to compress according to access frequency in embodiment 4.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to fig. 1 to 5, the described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.
Example 1
A data storage method based on a memory database comprises the following steps:
dividing the data into blocks of a set quantity and storing the blocks;
establishing a multi-level index, and establishing a reverse index in each level of the index.
As shown in Fig. 1, every 1000 records form one data block, and the data block is then cached in memory. Block storage optimizes the data storage structure; indexes are then established, with a reverse index in each level of the index. When reading, the corresponding data can be located directly in the data block according to the index, which reduces read amplification and avoids reading useless data.
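The blocking step can be sketched as follows. The function name `split_into_blocks` and the dictionary used as a block cache are hypothetical; only the block size of 1000 comes from the text.

```python
def split_into_blocks(records, block_size=1000):
    """Cut a list of records into consecutive blocks of at most block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        records[i:i + block_size]
        for i in range(0, len(records), block_size)
    ]


# 2500 hypothetical records: two full blocks and one partial block.
blocks = split_into_blocks(list(range(2500)))

# A plain dict keyed by block number stands in for the in-memory block cache.
block_cache = {block_id: block for block_id, block in enumerate(blocks)}
```

The last block may hold fewer than 1000 records; how the patent treats a partial block is not stated, so the sketch simply keeps it.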
Example 2
As shown in fig. 3, in this embodiment, based on embodiment 1, a process of establishing a multi-level index is as follows:
S101, establishing a Manifest Block, wherein there is exactly one Manifest Block, and the Manifest Block stores the largest index value of each Index Block and a pointer to that Index Block;
S102, establishing a plurality of lower-level Index Blocks for the Manifest Block of step S101 according to a set threshold, wherein each Index Block stores the largest index value of each Data Block and a pointer to that Data Block;
S103, establishing a plurality of lower-level Data Blocks according to the set threshold of step S102, wherein each Data Block stores an index and the actual data.
During reading, the index is first searched in the Manifest Block, then in the Index Block, and finally the data is fetched from the corresponding block cache according to the index of the Data Block and returned. The read path is thus more precise: reading useless data is avoided, read amplification is reduced, and memory resources are saved.
Example 3
In this embodiment, based on embodiment 1, a reverse index is further established according to fixed columns of the data, and the fixed columns are spliced into a primary key; this facilitates partial search and reduces read amplification.
As shown in Fig. 2, for example, one reverse index is established with place as the fixed column, another with model as the fixed column, and another with number as the fixed column.
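One way to picture this embodiment is sketched below, assuming that "reverse index" means a mapping from a column value to the primary keys of the rows that contain it. That reading, the `|` separator used to splice the key, the sample rows, and the function names are all assumptions made for illustration; the patent does not specify them.

```python
from collections import defaultdict

# Column names taken from the Fig. 2 description.
FIXED_COLUMNS = ("place", "model", "number")


def primary_key(row):
    """Splice the fixed columns, in order, into one primary key string."""
    return "|".join(str(row[c]) for c in FIXED_COLUMNS)


def build_column_indexes(rows):
    """For each fixed column, map each value to the primary keys holding it."""
    indexes = {c: defaultdict(list) for c in FIXED_COLUMNS}
    for row in rows:
        pk = primary_key(row)
        for c in FIXED_COLUMNS:
            indexes[c][row[c]].append(pk)
    return indexes


rows = [
    {"place": "P1", "model": "A01", "number": 1},
    {"place": "P1", "model": "A02", "number": 2},
    {"place": "P2", "model": "A01", "number": 3},
]
indexes = build_column_indexes(rows)

# A partial query on one column returns only the matching primary keys.
hits = indexes["model"]["A01"]
```

With such a per-column mapping, a query on model alone does not need to read the rows for other models before filtering.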
Example 4
In this embodiment, on the basis of embodiment 1, the data block is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary-key column is greater than a set threshold; otherwise, no compression is performed.
As shown in Fig. 5, data blocks with a large magnitude and a high access frequency are compressed, which reduces memory consumption.
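The compression rule can be sketched as a two-condition check. Several details here are assumptions: "magnitude" is read as the number of values in the block, the threshold defaults are arbitrary, and `zlib` is used only because the patent names no compression algorithm.

```python
import zlib


def should_compress(block_size, query_frequency,
                    size_threshold, frequency_threshold):
    """Both conditions from the text must hold; otherwise do not compress."""
    return (block_size > size_threshold
            and query_frequency > frequency_threshold)


def store_column(values, query_frequency,
                 size_threshold=1000, frequency_threshold=10):
    """Return (encoding, payload) for one non-primary-key column of a block."""
    raw = "\n".join(values).encode("utf-8")
    if should_compress(len(values), query_frequency,
                       size_threshold, frequency_threshold):
        return ("zlib", zlib.compress(raw))
    return ("raw", raw)


# A large, frequently queried column is compressed; the others are not.
big_hot = store_column(["x"] * 2000, query_frequency=50)
small_hot = store_column(["x"] * 10, query_frequency=50)
big_cold = store_column(["x"] * 2000, query_frequency=1)
```

Tagging the payload with its encoding lets a reader decide whether to decompress before scanning the column.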
Example 5
In this embodiment, based on embodiment 1, a data storage device based on a memory database comprises:
a memory, for storing executable instructions;
a processor, for executing the executable instructions stored in the memory to implement the data storage method based on the memory database.
Example 6
In this embodiment, based on embodiment 1, the read flow after data has been stored by the invention is shown in Fig. 4, taking a query for data whose model is A01 as an example:
querying the reverse index in the Manifest Block, and determining from the index order that A01 sorts before A05;
in Index Block 1, determining from the index order that A01 sorts before A02, so the data can only be in Data Block 1; Data Block 1 is then searched for the A01 data, and the query result is returned.
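The A01 walk-through can be traced with a short sketch. The block contents below are hypothetical and chosen only so that the largest keys match the A05 and A02 values in the text; the helper names `pick_child` and `lookup` are likewise illustrative, and binary search via `bisect_left` is an assumed way of finding the first child whose largest key is not smaller than the query key.

```python
from bisect import bisect_left

# Each level pairs the largest key of a child with that child.
data_block_1 = {"A01": "row for A01", "A02": "row for A02"}
data_block_2 = {"A03": "row for A03", "A05": "row for A05"}
data_block_3 = {"A07": "row for A07", "A09": "row for A09"}
index_block_1 = [("A02", data_block_1), ("A05", data_block_2)]
index_block_2 = [("A09", data_block_3)]
manifest_block = [("A05", index_block_1), ("A09", index_block_2)]


def pick_child(entries, key):
    """Return the first child whose largest key is >= key, or None."""
    max_keys = [max_key for max_key, _ in entries]
    pos = bisect_left(max_keys, key)
    return entries[pos][1] if pos < len(entries) else None


def lookup(manifest, key):
    """Manifest Block -> Index Block -> Data Block -> row (or None)."""
    index_block = pick_child(manifest, key)
    if index_block is None:
        return None
    data_block = pick_child(index_block, key)
    if data_block is None:
        return None
    return data_block.get(key)


result = lookup(manifest_block, "A01")
```

For A01 the search selects Index Block 1 (A01 sorts before A05), then Data Block 1 (A01 sorts before A02), and only that one Data Block is opened. A key such as A04 reaches Data Block 2 and is reported as absent without any other Data Block being read.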
The invention is mainly intended for scenarios that need data caching and efficient query, for example providing a data cache for algorithm workloads, and providing efficient data caching and query for data analysis and data mining workloads.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (8)
1. A data storage method based on a memory database is characterized by comprising the following steps:
dividing the data into blocks of a set quantity and storing the blocks;
establishing a multi-level index, and establishing a reverse index in each level of the index.
2. The in-memory database-based data storage method according to claim 1, wherein the process of establishing the multi-level index is as follows:
S101, establishing a Manifest Block;
S102, establishing a plurality of lower-level Index Blocks for the Manifest Block of step S101 according to a set threshold;
S103, establishing a plurality of lower-level Data Blocks according to the set threshold of step S102.
3. The in-memory database-based data storage method according to claim 2, wherein in step S101, there is one Manifest Block, and the largest Index value in each Index Block and the pointer pointing to Index Block are stored.
4. The in-memory database-based Data storage method according to claim 2, wherein the Index Block in step S102 stores the largest Index value in each Data Block and the pointer pointing to the Data Block.
5. The in-memory database-based Data storage method according to claim 2, wherein the Data Block in step S103 stores an index and stores real Data.
6. The data storage method based on the in-memory database as claimed in claim 1, wherein the reverse index is established according to the fixed columns of the data, and the fixed columns are spliced into primary keys.
7. The data storage method based on the in-memory database as claimed in claim 6, wherein the data block is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary-key column is higher than a set threshold; otherwise, no compression is performed.
8. A data storage device based on a memory database, characterized by comprising:
a memory, for storing executable instructions;
a processor, for executing the executable instructions stored in the memory to implement the data storage method based on the memory database according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010909904.1A CN111767289A (en) | 2020-09-02 | 2020-09-02 | Data storage method and device based on memory database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111767289A (en) | 2020-10-13 |
Family
ID=72729280
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010909904.1A Pending CN111767289A (en) | 2020-09-02 | 2020-09-02 | Data storage method and device based on memory database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111767289A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113076068A (en) * | 2021-04-27 | 2021-07-06 | 哈尔滨工业大学(深圳) | Data storage method and device, electronic equipment and readable storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120254189A1 (en) * | 2011-03-31 | 2012-10-04 | Biren Narendra Shah | Multilevel indexing system |
CN104376119A (en) * | 2014-12-03 | 2015-02-25 | 天津南大通用数据技术股份有限公司 | Data access method and device adapted to super-large scale column-storage database |
CN106909623A (en) * | 2017-01-19 | 2017-06-30 | 中国科学院信息工程研究所 | A kind of data set and date storage method of supporting efficient mass data to analyze and retrieve |
CN109408515A (en) * | 2018-11-01 | 2019-03-01 | 郑州云海信息技术有限公司 | A kind of index execution method and apparatus |
Non-Patent Citations (1)
Title |
---|
陆小丽 等: ""基于Map/Reduce的索引数据云存储模型研究"", 《宁波大学学报(理工版)》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113076068A (en) * | 2021-04-27 | 2021-07-06 | 哈尔滨工业大学(深圳) | Data storage method and device, electronic equipment and readable storage medium |
CN113076068B (en) * | 2021-04-27 | 2022-10-21 | 哈尔滨工业大学(深圳) | Data storage method and device, electronic equipment and readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10725987B2 (en) | Forced ordering of a dictionary storing row identifier values | |
US8266147B2 (en) | Methods and systems for database organization | |
US9898551B2 (en) | Fast row to page lookup of data table using capacity index | |
US9965504B2 (en) | Transient and persistent representation of a unified table metadata graph | |
US8935233B2 (en) | Approximate index in relational databases | |
US9892157B2 (en) | Min/max query with synopsis guided scan order | |
US11023452B2 (en) | Data dictionary with a reduced need for rebuilding | |
EP2924594A1 (en) | Data encoding and corresponding data structure in a column-store database | |
US20150006509A1 (en) | Incremental maintenance of range-partitioned statistics for query optimization | |
CN107688488B (en) | Metadata-based task scheduling optimization method and device | |
US9633059B2 (en) | Data table performance optimization | |
CN112015741A (en) | Method and device for storing massive data in different databases and tables | |
US20180276264A1 (en) | Index establishment method and device | |
US20170046394A1 (en) | Fast incremental column store data loading | |
CN111767289A (en) | Data storage method and device based on memory database | |
US10929434B2 (en) | Data warehouse single-row operation optimization | |
CN116303628A (en) | Alarm data query method, system and equipment based on elastic search | |
CN104636474A (en) | Method and equipment for establishment of audio fingerprint database and method and equipment for retrieval of audio fingerprints | |
CN114911826A (en) | Associated data retrieval method and system | |
CN110489601B (en) | Real-time data index rapid dynamic updating method based on cache mechanism | |
Kvet et al. | Efficiency of the relational database tuple access | |
WO2001025962A1 (en) | Database organization for increasing performance by splitting tables | |
CN117725096B (en) | Data storage and query method, device, equipment and medium of relational database | |
KR101311409B1 (en) | Partition scan method and device, memory system, and data alignment method using partial index rid alignment | |
WO2010060179A1 (en) | Methods for organizing a relational database by using clustering operations |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20201013 |