CN111767289A - Data storage method and device based on memory database - Google Patents


Publication number
CN111767289A
Authority
CN
China
Prior art keywords
data
index
block
data storage
memory database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010909904.1A
Other languages
Chinese (zh)
Inventor
张艳清
肖杰
张永飞
杨尧
胥莉君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Sefon Software Co Ltd
Original Assignee
Chengdu Sefon Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data storage method and device based on an in-memory database. It mainly addresses a problem in the prior art: when an in-memory database is queried, a large amount of useless data is read and the required data has to be filtered out in memory, which causes read amplification and wastes resources. The data storage method based on the in-memory database comprises the following steps: dividing the data into blocks of a set number of records and storing the blocks; and establishing a multi-level index, with a reverse index established in each level of the index. With this scheme, the invention avoids retrieving useless data, reduces read amplification, and reduces memory consumption, and it is of high practical and promotional value.

Description

Data storage method and device based on memory database
Technical Field
The invention relates to the technical field of memory data storage, in particular to a data storage method and device based on a memory database.
Background
A database is a warehouse for storing data. It organizes, stores, and manages data according to a user-defined data structure so that the data can be retained and used more conveniently. By storage location, databases fall into two broad categories, disk databases and in-memory databases, and in-memory databases have largely replaced disk databases.
An existing in-memory database sorts its stored data by an index formed from multiple columns and reads data according to that index. The columns together form a primary key (primaryKey); when only some of the columns are queried, data is first fetched by the composite primary key and then filtered on the specified columns. A query therefore reads out all data under the primaryKey and filters the required data in memory, so a large amount of useless data is read, which causes read amplification and wastes resources.
Disclosure of Invention
The invention aims to provide a data storage method and device based on an in-memory database, in order to solve the read amplification and resource waste that arise when an existing in-memory database reads a large amount of useless data during a query and has to filter the required data in memory.
To solve the above problems, the invention provides the following technical solution:
A data storage method based on an in-memory database comprises the following steps:
dividing the data into blocks of a set quantity and storing the blocks;
and establishing a multi-level index, with a reverse index established in each level of the index.
With this scheme, the data storage structure is optimized and the corresponding data is retrieved directly through the reverse index, which avoids retrieving useless data, reduces read amplification, and reduces memory consumption.
Further, the multi-level index is established as follows:
S101, establishing a Manifest Block;
S102, establishing a plurality of lower-level Index Blocks for the Manifest Block of step S101 according to a set threshold;
S103, establishing a plurality of lower-level Data Blocks according to the set threshold of step S102.
During a query, the first-level index is searched according to the specified conditions and the corresponding second-level Index Block is located; the index entries in that Index Block are then filtered to locate the corresponding Data Block; finally, the required data is looked up in the Data Block according to its index, which reduces read amplification.
Further, in step S101 there is exactly one Manifest Block, which stores the largest index value of each Index Block together with a pointer to that Index Block.
Further, each Index Block in step S102 stores the largest index value of each Data Block together with a pointer to that Data Block.
Further, each Data Block in step S103 stores an index and the actual data.
Further, the reverse index is established on fixed columns of the data, and the fixed columns are concatenated to form the primary key.
Further, the database is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary-key columns is greater than a set threshold; otherwise, no compression is performed.
A data storage device based on an in-memory database comprises:
a memory, for storing executable instructions;
a processor, for executing the executable instructions stored in the memory to implement the data storage method based on the in-memory database.
Compared with the prior art, the invention has the following beneficial effects:
(1) The storage structure of the invention cuts the data into a plurality of fixed-size data blocks that are cached in memory and establishes the reverse index over those data blocks, which avoids retrieving useless data, reduces read amplification, and reduces memory consumption.
(2) During reads, partial queries and range queries on the primaryKey are served through the reverse index, which reduces read amplification compared with a traditional in-memory database.
(3) The invention decides whether to compress the non-primary-key columns according to the query frequency of each block and of each column within the block, thereby reducing memory consumption.
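Effect (2) can be illustrated with a small sketch of a range query over per-block maximum keys. This is only an illustration under stated assumptions, not the patent's implementation: the function name blocks_for_range, the sample keys, and the assumption that blocks are sorted, non-overlapping, and hold unique keys are all hypothetical. It shows how knowing the largest key of each block lets a range query skip blocks that cannot contain matching data.

```python
from bisect import bisect_left
from typing import List


def blocks_for_range(block_max_keys: List[str], low: str, high: str) -> range:
    """Return the indices of blocks that may hold keys in [low, high].

    Assumes (hypothetically) that blocks are sorted, do not overlap, hold
    unique keys, and that block_max_keys[i] is the largest key in block i.
    """
    if not block_max_keys or low > high:
        return range(0)
    # First block whose largest key is >= low.
    first = bisect_left(block_max_keys, low)
    # First block whose largest key is >= high, clamped to the last block.
    last = min(bisect_left(block_max_keys, high), len(block_max_keys) - 1)
    return range(first, last + 1)


# Four blocks holding A01-A02, A03-A04, A05-A06, A07-A08 (made-up keys).
max_keys = ["A02", "A04", "A06", "A08"]
touched = list(blocks_for_range(max_keys, "A03", "A05"))
# Only the blocks in `touched` need to be read; the others are skipped.
```

In this example two of the four blocks are read, whereas a scan without per-block maximum keys would have to read all four.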
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; a person skilled in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is a schematic diagram of storage after blocking according to a set quantity, as in embodiment 1.
Fig. 2 is a schematic diagram of establishing an index according to fixed columns, as in embodiment 3.
Fig. 3 is a schematic diagram of the index structure in embodiment 2.
Fig. 4 is a flowchart of the read procedure after data storage, as in embodiment 6.
Fig. 5 is a schematic diagram of deciding whether to compress according to access frequency, as in embodiment 4.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to Figs. 1 to 5. The described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
Example 1
A data storage method based on an in-memory database comprises the following steps:
dividing the data into blocks of a set quantity and storing the blocks;
and establishing a multi-level index, with a reverse index established in each level of the index.
As shown in Fig. 1, every 1000 records are grouped into one data block, and the data block is then cached in memory. Block storage optimizes the data storage structure; indexes are then established, with a reverse index in each level of the index. When reading, the corresponding data can be looked up directly in the data block according to the index, which reduces read amplification and avoids reading useless data.
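The blocking step of this embodiment can be sketched as follows. This is a minimal sketch, not the patent's code: the function name split_into_blocks, the record layout, and the dictionary used as a block cache are assumptions made for illustration; only the block size of 1000 records comes from the text.

```python
from typing import Any, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def split_into_blocks(records: Iterable[Record],
                      block_size: int = 1000) -> Iterator[List[Record]]:
    """Yield consecutive blocks holding at most `block_size` records each."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    block: List[Record] = []
    for record in records:
        block.append(record)
        if len(block) == block_size:
            yield block
            block = []
    if block:  # final block, possibly smaller than block_size
        yield block


# Hypothetical sample data: 2500 records with made-up columns.
rows = [{"id": i, "model": f"A{i % 5 + 1:02d}"} for i in range(2500)]

# Hypothetical in-memory block cache: block number -> list of records.
block_cache = dict(enumerate(split_into_blocks(rows, block_size=1000)))
block_sizes = [len(block) for block in block_cache.values()]
```

With 2500 records and a block size of 1000, the cache holds two full blocks and one partial block of 500 records.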
Example 2
As shown in Fig. 3, in this embodiment, based on embodiment 1, the multi-level index is established as follows:
S101, establishing a Manifest Block, wherein there is exactly one Manifest Block, which stores the largest index value of each Index Block together with a pointer to that Index Block;
S102, establishing a plurality of lower-level Index Blocks for the Manifest Block of step S101 according to a set threshold, wherein each Index Block stores the largest index value of each Data Block together with a pointer to that Data Block;
S103, establishing a plurality of lower-level Data Blocks according to the set threshold of step S102, wherein each Data Block stores an index and the actual data.
During a read, the index is first searched in the Manifest Block, then in the Index Block, and finally the data is fetched from the corresponding block cache according to the index of the Data Block and returned. The read is more precise: reading useless data is avoided, read amplification is reduced, and memory resources are saved.
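The three-level structure of steps S101 to S103 and the read path can be sketched as below. This is a minimal sketch under stated assumptions rather than the patented implementation: the class and function names, the use of binary search via bisect, the string keys, and the tiny thresholds (two rows per Data Block, two Data Blocks per Index Block) are all hypothetical choices made to keep the example small, and keys are assumed to be unique.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


@dataclass
class DataBlock:
    keys: List[str]   # sorted index keys
    rows: List[Row]   # actual data, aligned with `keys`


@dataclass
class IndexBlock:
    max_keys: List[str] = field(default_factory=list)      # largest key per Data Block
    children: List[DataBlock] = field(default_factory=list)


@dataclass
class ManifestBlock:
    max_keys: List[str] = field(default_factory=list)      # largest key per Index Block
    children: List[IndexBlock] = field(default_factory=list)


def build(rows: List[Row], key: str, rows_per_block: int = 2,
          blocks_per_index: int = 2) -> ManifestBlock:
    """Build Data Blocks, then Index Blocks, then the single Manifest Block."""
    ordered = sorted(rows, key=lambda r: r[key])
    data_blocks = []
    for i in range(0, len(ordered), rows_per_block):
        chunk = ordered[i:i + rows_per_block]
        data_blocks.append(DataBlock([r[key] for r in chunk], chunk))
    manifest = ManifestBlock()
    for i in range(0, len(data_blocks), blocks_per_index):
        group = data_blocks[i:i + blocks_per_index]
        index_block = IndexBlock([b.keys[-1] for b in group], group)
        manifest.max_keys.append(index_block.max_keys[-1])
        manifest.children.append(index_block)
    return manifest


def lookup(manifest: ManifestBlock, wanted: str) -> Optional[Row]:
    """Descend Manifest Block -> Index Block -> Data Block for one key."""
    i = bisect_left(manifest.max_keys, wanted)
    if i == len(manifest.max_keys):
        return None  # larger than every stored key
    index_block = manifest.children[i]
    j = bisect_left(index_block.max_keys, wanted)
    data_block = index_block.children[j]
    k = bisect_left(data_block.keys, wanted)
    if k < len(data_block.keys) and data_block.keys[k] == wanted:
        return data_block.rows[k]
    return None


# Hypothetical sample rows keyed by a made-up "model" column.
sample = [{"model": f"A{n:02d}", "number": n} for n in range(1, 9)]
manifest = build(sample, key="model")
hit = lookup(manifest, "A05")
miss = lookup(manifest, "A09")
```

Each lookup touches one Index Block and one Data Block; the other blocks are never read, which is the read-amplification saving the embodiment describes.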
Example 3
In this embodiment, based on embodiment 1, a reverse index is further established on fixed columns of the data, and the fixed columns are concatenated to form the primary key; this facilitates partial queries and reduces read amplification.
As shown in Fig. 2, for example, one reverse index is established with the place as a fixed column, another with the model as a fixed column, and another with the number as a fixed column.
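A per-column reverse index (commonly called an inverted index) over the fixed columns might look like the sketch below. The column names place, model, and number come from the example above; everything else, including the "|" separator used to concatenate the primary key, the function names, and the sample rows, is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Set

Row = Dict[str, Any]
FIXED_COLUMNS = ("place", "model", "number")  # fixed columns from the Fig. 2 example


def primary_key(row: Row, sep: str = "|") -> str:
    """Concatenate the fixed columns into one primary key (separator is assumed)."""
    return sep.join(str(row[column]) for column in FIXED_COLUMNS)


def build_reverse_indexes(rows: List[Row]) -> Dict[str, Dict[Any, Set[int]]]:
    """For each fixed column, map every value to the positions of rows holding it."""
    indexes: Dict[str, Dict[Any, Set[int]]] = {
        column: defaultdict(set) for column in FIXED_COLUMNS
    }
    for position, row in enumerate(rows):
        for column in FIXED_COLUMNS:
            indexes[column][row[column]].add(position)
    return indexes


def partial_query(rows: List[Row], indexes: Dict[str, Dict[Any, Set[int]]],
                  **conditions: Any) -> List[Row]:
    """Answer a query on any subset of the fixed columns without scanning all rows."""
    if not conditions:
        return list(rows)
    hits: Optional[Set[int]] = None
    for column, value in conditions.items():
        matched = indexes[column].get(value, set())
        hits = matched if hits is None else hits & matched
    return [rows[p] for p in sorted(hits or ())]


# Hypothetical sample rows; "price" stands in for a non-primary-key column.
rows = [
    {"place": "Chengdu", "model": "A01", "number": 1, "price": 10},
    {"place": "Chengdu", "model": "A02", "number": 2, "price": 20},
    {"place": "Beijing", "model": "A01", "number": 3, "price": 30},
]
indexes = build_reverse_indexes(rows)
by_model = partial_query(rows, indexes, model="A01")
by_place_and_model = partial_query(rows, indexes, place="Chengdu", model="A01")
```

A query on the model column alone is answered from that column's index, so rows that share the place but not the model are never fetched and filtered.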
Example 4
In this embodiment, based on embodiment 1, the database is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary-key columns is greater than a set threshold; otherwise, no compression is performed.
As shown in Fig. 5, data blocks of large magnitude and high access frequency are compressed, which reduces memory consumption.
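The compression rule of this embodiment can be sketched as a two-threshold check. The sketch follows the rule as stated in the text (compress only when both thresholds are exceeded); the threshold values, the units (bytes and queries per minute), the BlockStats type, and the choice of zlib as the compressor are assumptions, since the text does not specify them.

```python
import zlib
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BlockStats:
    size_bytes: int                  # magnitude of the data block (unit is assumed)
    non_key_queries_per_min: float   # query frequency of non-primary-key columns


def should_compress(stats: BlockStats,
                    size_threshold: int = 1_000_000,
                    frequency_threshold: float = 100.0) -> bool:
    """Compress only when BOTH values exceed their (hypothetical) thresholds."""
    return (stats.size_bytes > size_threshold
            and stats.non_key_queries_per_min > frequency_threshold)


def maybe_compress(payload: bytes, stats: BlockStats,
                   size_threshold: int = 1_000_000,
                   frequency_threshold: float = 100.0) -> Tuple[bool, bytes]:
    """Return (compressed?, bytes); the payload is left untouched if not compressed."""
    if should_compress(stats, size_threshold, frequency_threshold):
        return True, zlib.compress(payload)
    return False, payload


# Made-up, highly repetitive payload standing in for non-primary-key column data.
payload = b"A01,Chengdu,1\n" * 1000
compressed, blob = maybe_compress(
    payload, BlockStats(len(payload), 250.0), size_threshold=10_000
)
untouched, same = maybe_compress(
    payload, BlockStats(len(payload), 5.0), size_threshold=10_000
)
```

The second call leaves the payload uncompressed because only the size threshold is exceeded, matching the "otherwise, no compression is performed" branch.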
Example 5
In this embodiment, based on embodiment 1, a data storage device based on an in-memory database comprises:
a memory, for storing executable instructions;
a processor, for executing the executable instructions stored in the memory to implement the data storage method based on the in-memory database.
Example 6
In this embodiment, based on embodiment 1, the read flow after data has been stored by the invention is shown in Fig. 4, taking a query for data whose model is A01 as an example:
The reverse index in the Manifest Block is queried first; according to the ordering of the index entries, A01 is determined to sort before A05, which leads to Index Block 1.
In Index Block 1, A01 is determined to sort before A02, so the data may be in Data Block 1; Data Block 1 is then searched for data matching A01, and the query result is returned.
The invention is mainly intended for scenarios that require data caching and efficient queries, for example providing a data cache for algorithm workloads, or providing efficient data caching and querying for data analysis and data mining workloads.
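The A01 query above can be traced step by step with a small sketch. Only the boundary keys A05 (in the Manifest Block) and A02 (in Index Block 1) come from the example; all other keys, the block contents, and the function name trace are made up for illustration, and block numbers are counted from 1 within their parent.

```python
from bisect import bisect_left
from typing import List, Tuple

# Largest key per child. "A05" and "A02" come from the example; the rest is made up.
MANIFEST = ["A05", "A10"]
INDEX_BLOCKS = [["A02", "A05"], ["A08", "A10"]]
# Hypothetical Data Block contents, grouped by Index Block.
DATA_BLOCKS = [
    [["A01", "A02"], ["A03", "A05"]],
    [["A06", "A08"], ["A09", "A10"]],
]


def trace(model: str) -> Tuple[List[str], bool]:
    """Return the blocks visited for `model` and whether the model was found."""
    path: List[str] = []
    i = bisect_left(MANIFEST, model)         # first Index Block whose max key >= model
    if i == len(MANIFEST):
        return path, False
    path.append(f"Index Block {i + 1}")
    j = bisect_left(INDEX_BLOCKS[i], model)  # first Data Block whose max key >= model
    path.append(f"Data Block {j + 1}")
    return path, model in DATA_BLOCKS[i][j]


path, found = trace("A01")
```

For A01 the trace visits Index Block 1 and then Data Block 1, mirroring the two steps in the text; a model such as A04 reaches Data Block 2 and is reported as not found because that block does not hold it.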
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A data storage method based on an in-memory database, characterized by comprising the following steps:
dividing the data into blocks of a set quantity and storing the blocks;
and establishing a multi-level index, with a reverse index established in each level of the index.
2. The data storage method based on an in-memory database according to claim 1, wherein the multi-level index is established as follows:
S101, establishing a Manifest Block;
S102, establishing a plurality of lower-level Index Blocks for the Manifest Block of step S101 according to a set threshold;
S103, establishing a plurality of lower-level Data Blocks according to the set threshold of step S102.
3. The data storage method based on an in-memory database according to claim 2, wherein in step S101 there is one Manifest Block, which stores the largest index value of each Index Block and a pointer to that Index Block.
4. The data storage method based on an in-memory database according to claim 2, wherein each Index Block in step S102 stores the largest index value of each Data Block and a pointer to that Data Block.
5. The data storage method based on an in-memory database according to claim 2, wherein each Data Block in step S103 stores an index and the actual data.
6. The data storage method based on an in-memory database according to claim 1, wherein the reverse index is established on fixed columns of the data, and the fixed columns are concatenated to form the primary key.
7. The data storage method based on an in-memory database according to claim 6, wherein the database is compressed when the magnitude of the data block is greater than a set threshold and the query frequency of the non-primary-key columns is higher than a set threshold; otherwise, no compression is performed.
8. A data storage device based on an in-memory database, characterized by comprising:
a memory, for storing executable instructions;
a processor, for executing the executable instructions stored in the memory to implement the data storage method based on an in-memory database according to any one of claims 1 to 7.
CN202010909904.1A 2020-09-02 2020-09-02 Data storage method and device based on memory database Pending CN111767289A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010909904.1A CN111767289A (en) 2020-09-02 2020-09-02 Data storage method and device based on memory database

Publications (1)

Publication Number Publication Date
CN111767289A true CN111767289A (en) 2020-10-13

Family

ID=72729280

Country Status (1)

Country Link
CN (1) CN111767289A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120254189A1 (en) * 2011-03-31 2012-10-04 Biren Narendra Shah Multilevel indexing system
CN104376119A (en) * 2014-12-03 2015-02-25 天津南大通用数据技术股份有限公司 Data access method and device adapted to super-large scale column-storage database
CN106909623A (en) * 2017-01-19 2017-06-30 中国科学院信息工程研究所 A kind of data set and date storage method of supporting efficient mass data to analyze and retrieve
CN109408515A (en) * 2018-11-01 2019-03-01 郑州云海信息技术有限公司 A kind of index execution method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
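Lu Xiaoli et al.: "Research on a Map/Reduce-based cloud storage model for index data", Journal of Ningbo University (Natural Science & Engineering Edition) *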

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113076068A (en) * 2021-04-27 2021-07-06 哈尔滨工业大学(深圳) Data storage method and device, electronic equipment and readable storage medium
CN113076068B (en) * 2021-04-27 2022-10-21 哈尔滨工业大学(深圳) Data storage method and device, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
US10725987B2 (en) Forced ordering of a dictionary storing row identifier values
US8266147B2 (en) Methods and systems for database organization
US9898551B2 (en) Fast row to page lookup of data table using capacity index
US9965504B2 (en) Transient and persistent representation of a unified table metadata graph
US8935233B2 (en) Approximate index in relational databases
US9892157B2 (en) Min/max query with synopsis guided scan order
US11023452B2 (en) Data dictionary with a reduced need for rebuilding
EP2924594A1 (en) Data encoding and corresponding data structure in a column-store database
US20150006509A1 (en) Incremental maintenance of range-partitioned statistics for query optimization
CN107688488B (en) Metadata-based task scheduling optimization method and device
US9633059B2 (en) Data table performance optimization
CN112015741A (en) Method and device for storing massive data in different databases and tables
US20180276264A1 (en) Index establishment method and device
US20170046394A1 (en) Fast incremental column store data loading
CN111767289A (en) Data storage method and device based on memory database
US10929434B2 (en) Data warehouse single-row operation optimization
CN116303628A (en) Alarm data query method, system and equipment based on elastic search
CN104636474A (en) Method and equipment for establishment of audio fingerprint database and method and equipment for retrieval of audio fingerprints
CN114911826A (en) Associated data retrieval method and system
CN110489601B (en) Real-time data index rapid dynamic updating method based on cache mechanism
Kvet et al. Efficiency of the relational database tuple access
WO2001025962A1 (en) Database organization for increasing performance by splitting tables
CN117725096B (en) Data storage and query method, device, equipment and medium of relational database
KR101311409B1 (en) Partition scan method and device, memory system, and data alignment method using partial index rid alignment
WO2010060179A1 (en) Methods for organizing a relational database by using clustering operations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201013