CN111291235A - Metadata storage method and device based on time sequence database - Google Patents
Metadata storage method and device based on time sequence database
- Publication number
- CN111291235A (application number CN202010399862.1A)
- Authority
- CN
- China
- Prior art keywords
- metadata
- block
- key
- time
- time sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
- G06F16/164—File meta data generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
Abstract
The invention discloses a metadata storage method and device based on a time sequence database. For the storage and reading of time series data, the invention compresses data in batches and uses a timestamp as the KV column name of each compressed block, and it provides a metadata storage format that records data block information, which effectively reduces read amplification. Together, these two measures address a problem of existing time sequence databases: they must also store an uncompressed copy of the original data, and because write I/O in time sequence database workloads far exceeds read I/O and the compression ratio is high, write amplification is very serious.
Description
Technical Field
The invention relates to the field of data storage, in particular to a metadata storage method and device based on a time sequence database.
Background
Based on the characteristics of time series data, a relational database cannot meet the requirement of effective storage and processing of time series data, so that a database system, namely a time series database, specially optimized for time series data is urgently needed.
Time series big data is often stored and processed in relational databases, but the inherent limitations of relational databases prevent efficient storage and querying. Time series big data solutions use a dedicated storage format so that massive time series data can be stored efficiently and processed quickly, and they are an important technology for mass data processing. This dedicated storage format greatly improves the processing of time-related data: compared with a relational database, the storage space is halved and the query speed is greatly improved.
Existing time sequence databases generally write and read data as follows: 1. a time period of fixed length is extracted according to the timestamp, and its data is compressed and written into a KV engine, while the uncompressed data is written as well; 2. when data is read, the column to be read is calculated from the specified timestamp, and the corresponding time series data block is then read from the KV engine.
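The column calculation in step 2 can be illustrated with a minimal sketch. It assumes that each column is named by the start of its fixed-length period and that the period is one hour; `WINDOW_SECONDS`, `column_for` and `columns_for_range` are hypothetical names chosen for illustration and are not taken from any particular database.

```python
WINDOW_SECONDS = 3600  # assumed fixed period length (one hour)


def column_for(timestamp: int, window: int = WINDOW_SECONDS) -> int:
    """Return the start of the fixed-length period that contains timestamp."""
    return timestamp - timestamp % window


def columns_for_range(start: int, end: int, window: int = WINDOW_SECONDS) -> list[int]:
    """List the start of every period that overlaps the closed range [start, end]."""
    first = column_for(start, window)
    return list(range(first, end + 1, window))
```

With these assumptions, a read covering timestamps 3700 to 7300 touches the two columns that start at 3600 and 7200.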
An existing time sequence database needs to store a copy of the uncompressed original data. In time sequence database workloads, write I/O far exceeds read I/O (at least 20:1) and the compression ratio is high (about 10:1). Because the uncompressed data and the compressed data represent the same data, and the uncompressed copy is usually about 10 times the size of the compressed copy, write amplification is very serious.
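The scale of this write amplification can be worked through with the figures quoted above. The following is only an arithmetic sketch: it measures everything in units of compressed data, and the function name `units_written` is a hypothetical helper introduced for illustration.

```python
def units_written(compressed_units: int, compression_ratio: int, keep_raw: bool) -> int:
    """Total units written, measured in units of compressed data.

    compression_ratio is raw size divided by compressed size (about 10 in the text).
    """
    raw_units = compressed_units * compression_ratio
    return compressed_units + (raw_units if keep_raw else 0)


# With a 10:1 compression ratio, keeping the raw copy writes 11 units
# for every 1 unit written when only the compressed block is stored.
with_raw = units_written(1, 10, keep_raw=True)
compressed_only = units_written(1, 10, keep_raw=False)
```

Under these assumptions, writing both copies costs roughly 11 times the write volume of writing the compressed block alone.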
Disclosure of Invention
The invention aims to provide a metadata storage method and device based on a time sequence database that solve the following problem: an existing time sequence database must store an uncompressed copy of the original data, and because write I/O in time sequence database workloads far exceeds read I/O and the compression ratio is high, write amplification is very serious.
The technical scheme adopted by the invention is as follows:
a metadata storage method based on a time sequence database comprises the following steps:
marking time sequence data to obtain a plurality of labels, and setting at least one Key, wherein each Key corresponds to one or more labels;
compressing time series data corresponding to each Key according to a timestamp to obtain a compressed data block;
writing each Key and the corresponding compressed data block into a KV engine as a KV;
and establishing a metadata linked list, wherein the metadata linked list records each Key and the timestamp of the corresponding compressed data block.
For the storage and reading of time series data, the invention compresses data in batches and uses a timestamp as the KV column name of each compressed block, and it provides a metadata storage format that records data block information, which effectively reduces read amplification. Together, these two measures address a problem of existing time sequence databases: they must also store an uncompressed copy of the original data, and because write I/O in time sequence database workloads far exceeds read I/O and the compression ratio is high, write amplification is very serious.
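The four steps of the method can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described by the invention: a plain dict stands in for the KV engine, a Python list per Key stands in for the metadata linked list, and JSON plus zlib stand in for the compression scheme, which the text does not name. The function names `make_key`, `compress_batch` and `store` are hypothetical.

```python
import json
import zlib


def make_key(labels: dict[str, str]) -> str:
    """Build one Key from a set of labels (sorted so the Key is stable)."""
    return ",".join(f"{name}={value}" for name, value in sorted(labels.items()))


def compress_batch(points: list[tuple[int, float]]) -> tuple[int, bytes]:
    """Order points by timestamp and compress them into one block.

    Returns the first timestamp of the batch and the compressed bytes.
    """
    ordered = sorted(points)
    payload = json.dumps(ordered).encode("utf-8")
    return ordered[0][0], zlib.compress(payload)


def store(kv: dict, metadata: dict, labels: dict[str, str],
          points: list[tuple[int, float]]) -> str:
    """Write one compressed block and record its timestamp in the metadata."""
    key = make_key(labels)
    first_ts, block = compress_batch(points)
    kv.setdefault(key, {})[first_ts] = block      # only the compressed block is written
    metadata.setdefault(key, []).append(first_ts)  # Key plus block timestamp
    return key
```

Note that no uncompressed copy is written in this sketch; only the compressed block and a small metadata entry are stored for each batch.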
Further, each Key corresponds to at least one compressed data block.
Further, each column in the KV engine is named by the first timestamp of the corresponding compressed data block.
Further, the metadata linked list includes metadata corresponding to each Key one to one.
Further, the metadata includes a Key value corresponding to the Key and a Block corresponding to the compressed data Block.
Furthermore, the content recorded in the Block includes a parameter of the previous Block linked to this Block, a tag of the compressed data block corresponding to the current Block, and the compressed data.
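One possible shape for such a metadata entry is sketched below. It assumes that the parameter of the previous Block is a direct reference to that Block and that the tag is the first timestamp of the compressed data block; both are assumptions made for illustration, and the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Block:
    prev: Optional["Block"]  # the previous Block linked to this one
    tag: int                 # assumed: first timestamp of the compressed data block
    data: bytes              # the compressed data


@dataclass
class Metadata:
    key: str                      # the Key value
    tail: Optional[Block] = None  # most recently added Block

    def append(self, tag: int, data: bytes) -> Block:
        """Link a new Block after the current tail."""
        self.tail = Block(prev=self.tail, tag=tag, data=data)
        return self.tail

    def tags(self) -> Iterator[int]:
        """Walk from the newest Block back to the oldest."""
        node = self.tail
        while node is not None:
            yield node.tag
            node = node.prev
```

Because each Block only points to its predecessor, appending a new block is a constant-time operation, which suits a write-heavy workload.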
A metadata storage device based on a time sequence database, comprising:
a memory for storing executable instructions, a metadata linked list and a KV engine;
and the processor is used for executing the executable instructions stored in the memory to realize the metadata storage method based on the time sequence database.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
1. the metadata storage method and device based on the time sequence database solve the problem that an existing time sequence database must store an uncompressed copy of the original data, so that write amplification is very serious because write I/O in time sequence database workloads far exceeds read I/O and the compression ratio is high;
2. the metadata storage method and device based on the time sequence database improve write performance, reduce the writing of redundant data, and greatly improve the throughput of the time sequence database;
3. the metadata storage method and device based on the time sequence database reduce read amplification, because the stored metadata allows the corresponding column names to be located directly.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts, wherein:
FIG. 1 is a schematic diagram of a write KV engine after compression of time series data according to the present invention;
FIG. 2 is a diagram of a metadata linked list of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to fig. 1 to 2, the described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
Before further detailed description of the embodiments of the present invention, terms and expressions mentioned in the embodiments of the present invention are explained, and the terms and expressions mentioned in the embodiments of the present invention are applied to the following explanations.
Example 1
A metadata storage method based on a time sequence database comprises the following steps:
marking time sequence data to obtain a plurality of labels, and setting at least one Key, wherein each Key corresponds to one or more labels;
compressing time series data corresponding to each Key according to a timestamp to obtain a compressed data block;
writing each Key and the corresponding compressed data block into a KV engine as a KV;
and establishing a metadata linked list, wherein the metadata linked list records each Key and the timestamp of the corresponding compressed data block.
For the storage and reading of time series data, the invention compresses data in batches and uses a timestamp as the KV column name of each compressed block, and it provides a metadata storage format that records data block information, which effectively reduces read amplification. Together, these two measures address a problem of existing time sequence databases: they must also store an uncompressed copy of the original data, and because write I/O in time sequence database workloads far exceeds read I/O and the compression ratio is high, write amplification is very serious.
Example 2
This embodiment is further based on embodiment 1, where each Key corresponds to at least one compressed data block.
Further, each column in the KV engine is named by the first timestamp of the corresponding compressed data block.
Example 3
Further to embodiment 1, the metadata linked list includes metadata in one-to-one correspondence with each Key.
Further, the metadata includes a Key value corresponding to the Key and a Block corresponding to the compressed data Block.
Furthermore, the content recorded in the Block includes a parameter of the previous Block linked to this Block, a tag of the compressed data block corresponding to the current Block, and the compressed data.
Example 4
As shown in fig. 1 and fig. 2, this embodiment is a specific operation scheme. After the time series data is written in batches, Keys are extracted according to the specified tags, including Key1, Key2, Key3, Key4 and Key5; for each extracted Key, the time series data is compressed according to the timestamp, producing the Compress_Block entries shown in fig. 1;
the time following each Compress_Block, e.g. t1, t2, t3, represents the first timestamp of that compressed batch of data; each Key in fig. 1 and the Compress_Blocks that follow it are written as one KV into a KV engine, such as an HBase engine, and the first timestamp of each Compress_Block is used as its column name;
furthermore, each Key in fig. 1 and its corresponding Compress_Blocks correspond to one piece of metadata in fig. 2. Each Key value in fig. 2 corresponds to Key1, Key2, Key3, Key4 or Key5 in fig. 1, and the timestamp of each Compress_Block in fig. 1 is stored in the Block corresponding to that Key; for example, one Block stores the timestamp of Compress_Block_t1. A KV read usually specifies a time period: the metadata table is queried by time to obtain the column names of the corresponding Compress_Blocks, and exactly those columns of the corresponding KV are then read.
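The read path described above can be sketched as a lookup over the first timestamps recorded in the metadata. The sketch assumes that the first timestamps of a Key's blocks are available as a sorted list and that each block covers the span from its first timestamp up to the first timestamp of the next block; the function names and the dict-based KV layout are hypothetical.

```python
from bisect import bisect_right


def columns_to_read(first_timestamps: list[int], start: int, end: int) -> list[int]:
    """Return the column names of blocks that may hold points in [start, end].

    first_timestamps must be sorted in ascending order.
    """
    if start > end or not first_timestamps:
        return []
    # The block that begins at or before `start` may still cover it.
    lo = max(bisect_right(first_timestamps, start) - 1, 0)
    hi = bisect_right(first_timestamps, end)
    return first_timestamps[lo:hi]


def read_blocks(kv: dict, metadata: dict, key: str, start: int, end: int) -> list[bytes]:
    """Read only the columns selected through the metadata."""
    return [kv[key][column] for column in columns_to_read(metadata[key], start, end)]
```

Only the selected columns are fetched from the KV store, so blocks outside the requested time period are never read or decompressed.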
Example 5
A metadata storage device based on a time sequence database, comprising:
a memory for storing executable instructions, a metadata linked list and a KV engine;
and the processor is used for executing the executable instructions stored in the memory to realize the metadata storage method based on the time sequence database.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (7)
1. A metadata storage method based on a time sequence database is characterized in that: the method comprises the following steps:
marking time sequence data to obtain a plurality of labels, and setting at least one Key, wherein each Key corresponds to one or more labels;
compressing time series data corresponding to each Key according to a timestamp to obtain a compressed data block;
writing each Key and the corresponding compressed data block into a KV engine as a KV;
and establishing a metadata linked list, wherein the metadata linked list records each Key and the timestamp of the corresponding compressed data block.
2. The metadata storage method based on the time-series database according to claim 1, wherein: each Key corresponds to at least one compressed data block.
3. The metadata storage method based on the time-series database according to claim 1, wherein: the column name in the KV engine is the first timestamp of each compressed block.
4. The metadata storage method based on the time-series database according to claim 1, wherein: the metadata linked list includes metadata in one-to-one correspondence with each Key.
5. The metadata storage method based on the time-series database according to claim 4, wherein: the metadata includes a Key value corresponding to a Key and a Block corresponding to a compressed data Block.
6. The metadata storage method based on the time-series database according to claim 5, wherein: the content recorded in the Block comprises a parameter of the previous Block linked to the Block, the tag of the compressed data Block corresponding to the current Block, and the compressed data.
7. A metadata storage device based on a time series database, characterized by: the device comprises:
a memory for storing executable instructions, a metadata linked list and a KV engine;
a processor for executing the executable instructions stored in the memory to implement a method for metadata storage based on a time series database as claimed in claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010399862.1A CN111291235A (en) | 2020-05-13 | 2020-05-13 | Metadata storage method and device based on time sequence database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010399862.1A CN111291235A (en) | 2020-05-13 | 2020-05-13 | Metadata storage method and device based on time sequence database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111291235A (en) | 2020-06-16 |
Family
ID=71031219
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010399862.1A Pending CN111291235A (en) | 2020-05-13 | 2020-05-13 | Metadata storage method and device based on time sequence database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111291235A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112783927A (en) * | 2021-01-27 | 2021-05-11 | 浪潮云信息技术股份公司 | Database query method and system |
CN113127660A (en) * | 2021-05-24 | 2021-07-16 | 成都四方伟业软件股份有限公司 | Timing graph database storage method and device |
CN113312313A (en) * | 2021-01-29 | 2021-08-27 | 淘宝(中国)软件有限公司 | Data query method, nonvolatile storage medium and electronic device |
CN113641763A (en) * | 2021-08-31 | 2021-11-12 | 优刻得科技股份有限公司 | Distributed time sequence database system, electronic equipment and storage medium |
CN114327264A (en) * | 2021-12-22 | 2022-04-12 | 北京力控元通科技有限公司 | Time sequence data compression method, device and equipment |
CN114726379A (en) * | 2022-06-13 | 2022-07-08 | 西安热工研究院有限公司 | Self-adaptive compression method and system based on time sequence database sample storage characteristics |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1886238A1 (en) * | 2005-05-31 | 2008-02-13 | Nptv | Analysis and classification of a multimedia flux in a homogeneous sequences |
CN101641674A (en) * | 2006-10-05 | 2010-02-03 | 斯普兰克公司 | Time series search engine |
CN104731896A (en) * | 2015-03-18 | 2015-06-24 | 北京百度网讯科技有限公司 | Data processing method and system |
CN106503276A (en) * | 2017-01-06 | 2017-03-15 | 山东浪潮云服务信息科技有限公司 | A kind of method and apparatus of the time series databases for real-time monitoring system |
CN107037980A (en) * | 2015-12-07 | 2017-08-11 | Sap欧洲公司 | Many expressions storage of time series data |
CN109687875A (en) * | 2018-11-20 | 2019-04-26 | 成都四方伟业软件股份有限公司 | A kind of time series data processing method |
CN110362572A (en) * | 2019-06-25 | 2019-10-22 | 浙江邦盛科技有限公司 | A kind of time series database system based on column storage |
Non-Patent Citations (1)
Title |
---|
Hu Pengfei et al.: "Design of a time series database system based on IEC 61850", Electronic Measurement Technology (《电子测量技术》) * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112783927A (en) * | 2021-01-27 | 2021-05-11 | 浪潮云信息技术股份公司 | Database query method and system |
CN112783927B (en) * | 2021-01-27 | 2023-03-17 | 浪潮云信息技术股份公司 | Database query method and system |
CN113312313A (en) * | 2021-01-29 | 2021-08-27 | 淘宝(中国)软件有限公司 | Data query method, nonvolatile storage medium and electronic device |
CN113312313B (en) * | 2021-01-29 | 2023-09-29 | 淘宝(中国)软件有限公司 | Data query method, nonvolatile storage medium and electronic device |
CN113127660A (en) * | 2021-05-24 | 2021-07-16 | 成都四方伟业软件股份有限公司 | Timing graph database storage method and device |
CN113641763A (en) * | 2021-08-31 | 2021-11-12 | 优刻得科技股份有限公司 | Distributed time sequence database system, electronic equipment and storage medium |
CN113641763B (en) * | 2021-08-31 | 2023-11-10 | 优刻得科技股份有限公司 | Distributed time sequence database system, electronic equipment and storage medium |
CN114327264A (en) * | 2021-12-22 | 2022-04-12 | 北京力控元通科技有限公司 | Time sequence data compression method, device and equipment |
CN114726379A (en) * | 2022-06-13 | 2022-07-08 | 西安热工研究院有限公司 | Self-adaptive compression method and system based on time sequence database sample storage characteristics |
CN114726379B (en) * | 2022-06-13 | 2022-09-13 | 西安热工研究院有限公司 | Self-adaptive compression method and system based on time sequence database sample storage characteristics |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111291235A (en) | Metadata storage method and device based on time sequence database | |
CN110879813B (en) | Binary log analysis-based MySQL database increment synchronization implementation method | |
JP5961689B2 (en) | Incremental data extraction | |
JP5377318B2 (en) | Storage management of individually accessible data units | |
CN111309720A (en) | Time sequence data storage method, time sequence data reading method, time sequence data storage device, time sequence data reading device, electronic equipment and storage medium | |
CN111930751A (en) | Time sequence data storage method and device | |
CN105373541A (en) | Processing method and system for data operation request of database | |
US11429658B1 (en) | Systems and methods for content-aware image storage | |
CN105447168A (en) | Method for restoring and recombining fragmented files in MP4 format | |
CN106874399B (en) | Networking backup system and backup method | |
CN111008183A (en) | Storage method and system for business wind control log data | |
CN110716739A (en) | Code change information statistical method, system and readable storage medium | |
CN103207916A (en) | Metadata processing method and device | |
CN112835918A (en) | MySQL database increment synchronization implementation method | |
CN111045994A (en) | KV database-based file classification retrieval method and system | |
CN111753518B (en) | Autonomous file consistency checking method | |
CN115794861A (en) | Offline data query multiplexing method based on feature abstract and application thereof | |
CN111444194B (en) | Method, device and equipment for clearing indexes in block chain type account book | |
CN111209285A (en) | Statistical index storage method and device based on time sequence data | |
CN106802922A (en) | A kind of object-based storage system and method for tracing to the source | |
CN113704227A (en) | Incremental update data storage method and device, electronic equipment and storage medium | |
CN113127660A (en) | Timing graph database storage method and device | |
CN111767436A (en) | HASH index data storage and reading method and system | |
CN113298106A (en) | Sample generation method and device, server and storage medium | |
CN110727845A (en) | Crawler text-based recent text-sending priority processing method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200616 |