WO2018119976A1 - Efficient data layout optimization method for a data warehouse system - Google Patents

Efficient data layout optimization method for a data warehouse system

Info

Publication number
WO2018119976A1
WO2018119976A1 PCT/CN2016/113364 CN2016113364W WO2018119976A1 WO 2018119976 A1 WO2018119976 A1 WO 2018119976A1 CN 2016113364 W CN2016113364 W CN 2016113364W WO 2018119976 A1 WO2018119976 A1 WO 2018119976A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
column
file
columns
block
Prior art date
Application number
PCT/CN2016/113364
Other languages
English (en)
Chinese (zh)
Inventor
李挥
李鑫
危奕
黄志浩
朱兵
Original Assignee
日彩电子科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日彩电子科技(深圳)有限公司
Priority to CN201680090379.7A (CN110268397B)
Priority to PCT/CN2016/113364 (WO2018119976A1)
Publication of WO2018119976A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present invention relates to the field of data processing, and in particular to an efficient and optimized data layout method applied to a data warehouse system.
  • Structured data is the most common type of data stored in database management systems.
  • The method used to partition the table structure of structured data has a great impact on query efficiency and space efficiency, because it affects both the data processing efficiency of a single node and the amount of network data transmission between different nodes.
  • Row storage is a commonly used data layout structure. It divides the data in a table by rows and stores the partitioned data blocks on different data nodes, where each node stores the rows one after another on disk. Its shortcoming is that, during a query, the entire row must be loaded into memory even if some of its columns are not used, which causes unnecessary query work and extends the query time.
  • Another common data layout structure is column storage. It divides the data in a table by columns and stores the different columns on different data nodes, each of which stores its columns one after another on disk. Its disadvantage is that the results obtained from different columns during a query must be transmitted between nodes to produce the final result, which increases data transmission overhead and reduces query efficiency.
  • RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems
  • The proposed RCFile is a common data layout scheme applied to distributed storage systems. It combines row storage and column storage to organize the data within a file block. When table data is stored, the table file is first divided by rows into row groups of equal size. The row groups are then stored in different areas of the file block, and within each row group the data is stored contiguously in column order. This avoids the drawbacks of both pure row storage and pure column storage.
  • RCFile's data compression method is relatively simple: the data of each column in each row group is compressed and stored separately. Compressing all of the data in this way is not well suited to reading frequently used data.
  • For example, the primary key of a table is used in almost every query, and every query that uses it must decompress the column data once, which results in higher time and computational overhead.
  • In addition, RCFile relies on the multi-copy mode of the underlying storage system for fault tolerance, which takes up more storage space than error-correcting codes.
  • Zebra is a column-oriented data layout structure. To avoid the multi-node recombination of query results inherent in the column layout, it divides the columns of a data table into multiple column groups and stores each column group separately; within each column group, the data is stored in row format. Each column group consists of multiple columns, and one column can belong to several column groups. This storage method largely avoids having the data needed for a query result spread over multiple nodes.
  • The Zebra storage layout must group the columns of a table before the table is stored. For a given query, there is no guarantee that all the columns it uses are in the same column group; in that case, producing the query result still requires reorganizing data rows across multiple nodes. Moreover, because each column group is kept on a single node and one column can be in multiple column groups at the same time, duplicate data is effectively added to the original data, which increases storage overhead.
  • The present invention provides an efficient and optimized data layout method applied to a data warehouse system, which solves the problems of increased query overhead and larger storage space occupation in the prior art.
  • The invention is realized by the following technical solution: an efficient and optimized data layout method applied to a data warehouse system, comprising the following steps: (A) performing block file basic data layout; (B) performing column classification processing; (C) performing table file storage.
  • In step (A), the table file is horizontally divided into equal-sized row groups, and the row groups are then stored sequentially in the block file in column storage form. Each row group is composed of three parts: a synchronization part, a metadata part and an actual data part. The synchronization part is used by the system to distinguish two adjacent row groups when reading data. The metadata part contains the size information with which the system can distinguish different columns, and different fields in each column, within the row group, as well as the column classification information with which the system distinguishes different kinds of columns. The actual data part is used for storing the actual data.
  • In step (B), a column classification strategy based on frequency of use is used to reduce the decompression cost of commonly used columns; the columns are divided into query columns and code columns.
  • In step (C), data is stored using both copies and RDP code check blocks.
  • The matrix generated by the RDP code is a file group. When stored, each data block of a file group is kept as two copies, and each file group also stores the two RDP code check blocks generated from its storage blocks. The two copies are stored on different nodes, and the two check blocks are stored on nodes that do not contain any data block of the file group.
  • An RDP code generation group is a (p-1) × (p+1) matrix, where the parameter p is an arbitrary prime number greater than 2. The last two columns of each matrix are the generated check data, and the other columns store information data. The RDP check data is divided into a row check block and a diagonal check block: the row check block is obtained by adding the information data along each row, and the diagonal check block is obtained by adding the information data along the diagonals. The RDP code organizes the information block files to generate the check files.
  • The coding threshold is used to divide the data columns: a column whose frequency of use is greater than or equal to the coding threshold is classified as a query column, and a column whose frequency of use is less than the coding threshold is a code column.
  • The beneficial effects of the invention are that, when processing large-scale structured data, the upper-layer data warehouse system obtains a faster query rate and occupies less storage space than with conventional solutions. In terms of query rate, different coding thresholds can be set to meet the needs of data management. Generally speaking, the smaller the coding threshold, the higher the query rate and the larger the storage space occupied by the data. Data warehouse managers can set a reasonable coding threshold according to actual business requirements, so that the data warehouse system reaches a good compromise between query rate and space occupation. In terms of storage space, the construction parameter determines the size of the row group and of the file group; constructing row groups lets the data avoid additional data reads and result reorganization during queries.
  • For fault tolerance, the data table occupies less space than the three-copy method while providing fault tolerance no lower than that of three copies. This storage method lets the system use less storage space for data fault tolerance, saving physical storage costs.
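  • As a rough check of this space claim, the sketch below compares the number of stored blocks per data block under three-way replication with the two-copies-plus-two-check-blocks scheme, assuming a file group holds p-1 data blocks for a prime parameter p. The function names and the accounting are illustrative assumptions, not part of the disclosure.

```python
from fractions import Fraction


def estore_overhead(p: int) -> Fraction:
    """Stored blocks per data block for one file group (assumed accounting).

    A file group is assumed to hold p-1 data blocks, each kept as two
    copies, plus two check blocks (row check and diagonal check).
    """
    data_blocks = p - 1
    stored_blocks = 2 * data_blocks + 2
    return Fraction(stored_blocks, data_blocks)


def replication_overhead(copies: int = 3) -> Fraction:
    """Stored blocks per data block under plain n-way replication."""
    return Fraction(copies, 1)


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, float(estore_overhead(p)), float(replication_overhead()))
```

  • Under this accounting the ratio is 2p/(p-1), which equals 3 at p = 3 and falls below 3 for every larger prime.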
  • FIG. 1 is a schematic diagram of the space occupation of the present invention;
  • FIG. 2 is a schematic diagram of the query rate of the present invention.
  • An efficient and optimized data layout method applied to a data warehouse system includes the following steps: (A) performing block file basic data layout; (B) performing column classification processing; and (C) performing table file storage.
  • In step (A), the table file is horizontally divided into equal-sized row groups, and the row groups are then stored sequentially in the block file in column storage form. Each row group is composed of three parts: a synchronization part, a metadata part and an actual data part. The synchronization part is used by the system to distinguish two adjacent row groups when reading data. The metadata part contains the size information with which the system distinguishes different columns, and different fields in each column, within the row group, as well as the column classification information with which the system distinguishes different kinds of columns. The actual data part is used to store the actual data.
  • In step (B), a column classification strategy based on frequency of use is used to reduce the decompression cost of commonly used columns; the columns are divided into query columns and code columns.
  • In step (C), the data is stored using both copies and RDP code check blocks.
  • The matrix generated by the RDP code is a file group. When stored, each data block of a file group is kept as two copies, and each file group also stores the two RDP code check blocks generated from its storage blocks. The two copies are stored on different nodes, and the two check blocks are stored on nodes that do not contain any data block of the file group.
  • An RDP code generation group is a (p-1) × (p+1) matrix, where the parameter p is an arbitrary prime number greater than 2. The last two columns of each matrix are the generated check data, and the other columns store information data. The RDP check data is divided into a row check block and a diagonal check block: the row check block is obtained by adding the information data along each row, and the diagonal check block is obtained by adding the information data along the diagonals. The RDP code organizes the information block files to generate the check files.
  • The coding threshold is used to divide the data columns: a column whose frequency of use is greater than or equal to the coding threshold is classified as a query column, and a column whose frequency of use is less than the coding threshold is a code column.
  • The EStore of the present invention is used for the data layout of the underlying storage file blocks of a distributed storage system. The optimized layout improves the system's query execution rate while reducing the storage space that the data occupies for error correction.
  • In the block file data layout, EStore first divides the table file into equal-sized row groups and then stores the row groups in the block file in column storage form.
  • The size of a row group is determined by the system's construction parameter, which also affects the size of the file group, as described in the next section.
  • The EStore system stores the row groups in such blocks. Each row group consists of three parts.
  • The first part is the synchronization part, which is used by the system to distinguish two adjacent row groups when reading data.
  • The second part is the metadata part, which contains the size information with which the system distinguishes different columns, and different fields in each column, within the row group. It also contains column classification information with which the system distinguishes different types of columns.
  • The third part is the actual data part, which stores the actual data, organized by column in column storage form.
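  • The horizontal split into row groups with column-wise storage inside each group can be sketched as follows. This is a minimal illustration that assumes "equal-sized" means a fixed number of rows per group; the function name and the in-memory dictionary layout are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def split_into_row_groups(
    rows: Sequence[Sequence[Any]],
    column_names: Sequence[str],
    rows_per_group: int,
) -> List[Dict[str, List[Any]]]:
    """Cut a table horizontally, then lay each group out column by column."""
    groups = []
    for start in range(0, len(rows), rows_per_group):
        chunk = rows[start:start + rows_per_group]
        # Within a row group, all values of one column are kept together.
        groups.append(
            {name: [row[i] for row in chunk] for i, name in enumerate(column_names)}
        )
    return groups


if __name__ == "__main__":
    table = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
    for group in split_into_row_groups(table, ["id", "name"], 2):
        print(group)
```

  • A query that touches only one column can then read that column's list from each row group without loading the others.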
  • A column classification strategy based on frequency of use is used to reduce the decompression cost of commonly used columns.
  • The system divides the columns of a data table into two types, query columns and code columns. Each column is assigned to exactly one of these types.
  • The system uses the coding threshold to divide the data columns.
  • Some queries are usually executed periodically for decision making or data mining. Such a query is considered to have a frequency of use, and each such query tends to use the same columns of a data table.
  • The data table is preprocessed: for each column in the table, the queries that use it are counted, and the sum of the frequencies of those queries is the frequency of use of the column. In this way, the frequency of use of every column can be obtained.
  • A column whose frequency of use is greater than the coding threshold is classified as a query column; a column whose frequency of use is less than the coding threshold is a code column.
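  • The preprocessing step can be sketched as below: each column's frequency of use is the sum of the frequencies of the queries that reference it, and the coding threshold then splits the columns. The input format (pairs of query frequency and referenced columns) and the function names are assumptions made for illustration; the boundary case, a frequency equal to the threshold, is treated as a query column here.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple


def column_frequencies(
    queries: Iterable[Tuple[int, Sequence[str]]]
) -> Counter:
    """Sum, per column, the frequencies of all queries that use it."""
    freq: Counter = Counter()
    for query_frequency, columns in queries:
        for column in set(columns):  # count a column once per query
            freq[column] += query_frequency
    return freq


def classify_columns(
    all_columns: Sequence[str], freq: Dict[str, int], threshold: int
) -> Dict[str, str]:
    """Label each column 'query' (stored raw) or 'code' (stored compressed)."""
    return {
        column: "query" if freq.get(column, 0) >= threshold else "code"
        for column in all_columns
    }


if __name__ == "__main__":
    workload = [(10, ["id", "price"]), (3, ["id", "comment"])]
    freq = column_frequencies(workload)
    print(classify_columns(["id", "price", "comment", "note"], freq, 10))
```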
  • When the columns of a row group are actually stored, they are stored in different forms depending on their kind.
  • A query column must be fast to read, so its data is stored in the original format of the data. A code column is stored with storage space in mind, so it is compressed with a common data compression algorithm before being stored.
  • The classification information for these columns is saved in the second part of the row group. This reduces the likelihood of having to decompress data during a query and improves the query efficiency of the system.
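  • A minimal sketch of such a row group, with a synchronization marker, a metadata part carrying field sizes and column kinds, and a data part in which only code columns are compressed. The marker bytes, the JSON metadata encoding and the use of zlib as the "common data compression algorithm" are all assumptions chosen for illustration; the source does not specify a byte format.

```python
import json
import struct
import zlib
from typing import Dict, List

SYNC_MARKER = b"\x00ESTORE_SYNC\x00"  # hypothetical marker, not from the source


def pack_row_group(columns: Dict[str, List[bytes]], kinds: Dict[str, str]) -> bytes:
    """Serialize one row group as: sync marker | metadata | column data."""
    metadata, payloads = [], []
    for name, fields in columns.items():
        raw = b"".join(fields)
        # Query columns stay in their original form; code columns are compressed.
        stored = zlib.compress(raw) if kinds[name] == "code" else raw
        metadata.append({
            "name": name,
            "kind": kinds[name],
            "stored_len": len(stored),
            "field_lens": [len(f) for f in fields],
        })
        payloads.append(stored)
    meta_bytes = json.dumps(metadata).encode("utf-8")
    return SYNC_MARKER + struct.pack(">I", len(meta_bytes)) + meta_bytes + b"".join(payloads)


def read_column(row_group: bytes, wanted: str) -> List[bytes]:
    """Read one column, decompressing only if it is a code column."""
    if not row_group.startswith(SYNC_MARKER):
        raise ValueError("missing synchronization marker")
    pos = len(SYNC_MARKER)
    (meta_len,) = struct.unpack_from(">I", row_group, pos)
    pos += 4
    metadata = json.loads(row_group[pos:pos + meta_len].decode("utf-8"))
    pos += meta_len
    for column in metadata:
        if column["name"] == wanted:
            stored = row_group[pos:pos + column["stored_len"]]
            raw = zlib.decompress(stored) if column["kind"] == "code" else stored
            fields, offset = [], 0
            for length in column["field_lens"]:
                fields.append(raw[offset:offset + length])
                offset += length
            return fields
        pos += column["stored_len"]  # skip columns the query does not need
    raise KeyError(wanted)
```

  • Reading a query column in this layout is a plain slice of the stored bytes, which is where the saving in decompression time would come from.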
  • The data system manager can set different coding thresholds so that the system reaches a good balance between query rate and storage space.
  • EStore applies RDP codes to make the data fault-tolerant.
  • An RDP code generation group is a (p-1) × (p+1) matrix, where the parameter p is an arbitrary prime number greater than 2. The last two columns of each matrix are the generated check data, and the other columns store information data.
  • The first check column is obtained by adding the information data along each row and is called the row check block.
  • The second check column is obtained by adding the information data along the diagonals and is called the diagonal check block.
  • The main problem in applying RDP codes to block verification in a distributed storage system is how to organize the information block files to generate the check files.
  • EStore uses the construction parameter to determine the size of this check matrix.
  • EStore defines the construction parameter as an arbitrary prime number greater than 2. If the construction parameter is k, the RDP generation matrix is of size (k-1) × (k+1), that is, a total of k+1 files, each internally divided into k-1 blocks. As described in the previous section, file blocks are composed of row groups of the same size, so the row group is regarded as the basic symbol in the RDP generation matrix and each file block contains k-1 row groups. The size of a row group in a file block is therefore determined by the size of the block and the construction parameter.
  • The matrix generated by each RDP code in EStore is referred to as a file group.
  • A large-scale table file often contains a large number of storage blocks.
  • EStore divides the storage blocks into different file groups according to the construction parameter, so that each file group contains k-1 such storage blocks. It then uses these blocks to generate 2 check blocks in each file group, so that each file group finally contains k+1 file blocks.
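  • The sizing rule and the grouping of storage blocks can be sketched as follows. The function names and the check-block labels are hypothetical, and how a short final group is handled is not described in the text, so the sketch simply leaves it short.

```python
from typing import Dict, List, Sequence


def is_valid_construction_parameter(k: int) -> bool:
    """The construction parameter must be a prime number greater than 2."""
    return k > 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def row_group_size(block_size: int, k: int) -> int:
    """Each file block holds k-1 row groups, so the block size fixes their size."""
    if not is_valid_construction_parameter(k):
        raise ValueError("k must be a prime greater than 2")
    if block_size % (k - 1):
        raise ValueError("block size is not divisible by k-1")
    return block_size // (k - 1)


def make_file_groups(block_ids: Sequence[int], k: int) -> List[Dict[str, list]]:
    """Put k-1 data blocks in each file group and reserve two check blocks."""
    if not is_valid_construction_parameter(k):
        raise ValueError("k must be a prime greater than 2")
    per_group = k - 1
    groups: List[Dict[str, list]] = []
    for start in range(0, len(block_ids), per_group):
        index = len(groups)
        groups.append({
            "data": list(block_ids[start:start + per_group]),
            "checks": [f"row-check-{index}", f"diag-check-{index}"],
        })
    return groups


if __name__ == "__main__":
    print(row_group_size(64 * 1024 * 1024, 5))
    print(make_file_groups(list(range(8)), 5))
```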
  • The row groups in block 4 contain all of the row check symbols; for example, r_{0,4} is the exclusive OR of the row groups {r_{0,0}, r_{0,1}, r_{0,2}, r_{0,3}}.
  • Block 5 contains all of the diagonal check symbols; for example, r_{0,5} is the exclusive OR of the row groups {r_{0,0}, r_{3,2}, r_{2,3}, r_{1,4}}.
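  • The two examples above correspond to a construction parameter of 5 (four data blocks, block 4 for row checks, block 5 for diagonal checks). The sketch below generates both check columns with the standard RDP rule, in which diagonal d collects the symbols r_{i,j} with (i + j) mod p = d over the data columns and the row check column. The function names are illustrative, and each row group is modelled as a byte string of equal length.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise exclusive OR of two equal-length symbols."""
    return bytes(x ^ y for x, y in zip(a, b))


def rdp_encode(data: List[List[bytes]], p: int) -> List[List[Optional[bytes]]]:
    """Return a (p-1) x (p+1) array; data[i][j] is row group i of block j.

    Column p-1 receives the row check symbols, column p the diagonal ones.
    """
    rows = p - 1
    zero = bytes(len(data[0][0]))
    out: List[List[Optional[bytes]]] = []
    for i in range(rows):
        row_check = reduce(xor_bytes, data[i], zero)
        out.append(list(data[i]) + [row_check, None])
    for d in range(rows):  # diagonals 0..p-2; diagonal p-1 is not stored
        acc = zero
        for j in range(p):  # data columns plus the row check column
            i = (d - j) % p
            if i < rows:  # row p-1 is an imaginary all-zero row
                acc = xor_bytes(acc, out[i][j])
        out[d][p] = acc
    return out


if __name__ == "__main__":
    p = 5
    data = [[bytes([16 * i + j + 1]) for j in range(p - 1)] for i in range(p - 1)]
    for row in rdp_encode(data, p):
        print([symbol.hex() for symbol in row])
```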
  • Data is stored in EStore using both copies and check blocks.
  • Each data block of a file group has two copies in the storage system, and the two RDP code check blocks generated from these storage blocks are stored as well.
  • The system still uses copies because the RDP code requires a large transmission bandwidth when a file is restored. For each data block, the system therefore stores one more copy on another node, so that when a single node fails the data block can still be obtained by transferring the copy. Only when the two nodes storing the same data block fail at the same time does the data block have to be restored by RDP code data recovery. Since this situation does not occur frequently in a distributed storage system, the transmission bandwidth of RDP code repair is acceptable as long as the construction parameter is not very large.
  • In EStore's file group storage, the two copies of each data block are stored on different nodes, and the two check blocks are stored on nodes that do not contain any data block of the file group, so that when both copies of any data block are corrupted at the same time, the original data block can be restored using the RDP repair method.
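  • The simplest repair case, one data block of a file group lost on both of its nodes, needs only the row check block: each missing row group is the exclusive OR of the corresponding row groups in the surviving data blocks and in the row check block. The sketch below shows that case only; full RDP decoding of two lost blocks, which also uses the diagonal check block, is not shown. The function names and the block representation (a list of equal-length byte strings) are assumptions.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise exclusive OR of two equal-length row groups."""
    return bytes(x ^ y for x, y in zip(a, b))


def row_check_block(data_blocks: List[List[bytes]]) -> List[bytes]:
    """data_blocks[j][i] is row group i of data block j."""
    rows = len(data_blocks[0])
    return [reduce(xor_bytes, (block[i] for block in data_blocks)) for i in range(rows)]


def recover_block(surviving_blocks: List[List[bytes]], row_check: List[bytes]) -> List[bytes]:
    """Rebuild the single missing data block from the survivors and the row check."""
    return [
        reduce(xor_bytes, (block[i] for block in surviving_blocks), row_check[i])
        for i in range(len(row_check))
    ]


if __name__ == "__main__":
    blocks = [[bytes([10 * j + i]) for i in range(4)] for j in range(4)]
    check = row_check_block(blocks)
    rebuilt = recover_block(blocks[:2] + blocks[3:], check)
    print(rebuilt == blocks[2])
```

  • The repair reads every surviving block of the file group, which illustrates why the text prefers fetching the second copy whenever one is still available.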
  • The present invention uses three optimization strategies to improve the data processing performance of the data warehouse system, which shows in a higher query rate and a smaller fault-tolerance space occupation.
  • First, the data is organized within HDFS file blocks.
  • The table file of the relational data is horizontally divided into equal-sized row groups, each HDFS file block stores one or more row groups, and data is stored in column storage form within each row group.
  • Second, the data is preprocessed before the data table is stored: the frequency of use of the different columns is counted and a coding threshold is set to classify the columns. Columns below the coding threshold are code columns and are stored in compressed form; columns above the coding threshold are query columns and are kept in their native format.
  • Third, the RDP code is used to construct the error correction strategy, and the data table file is divided into different file groups.
  • The size of a file group is determined by the system's construction parameter.
  • The system generates two additional check file blocks for data repair within each file group.
  • The system uses double copies plus check blocks for data fault tolerance. Each data block in a file group is saved on two different nodes, and the check blocks are saved on other nodes.
  • The system uses two important parameters, the construction parameter and the coding threshold, when storing a table file. By setting these parameters differently, the system can meet specific data management requirements in terms of query efficiency and space occupation.
  • The EStore system was built on an actual distributed system, and its performance was compared with that of the existing common data layout structure RCFile in terms of storage space occupation and query rate.
  • The parameter t denotes the coding threshold.
  • FIG. 2 shows the query rates of the different data layouts. The EStore query rate is higher than that of RCFile under all coding thresholds, because EStore reduces the time spent on data decompression during queries.
  • The performance comparison between RCFile and EStore under different coding thresholds covers space occupancy and query execution time.
  • The above experimental results reflect the performance advantages of EStore's column classification and fault tolerance strategies.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an efficient data layout optimization method for a data warehouse system, the method comprising the steps of: (A) performing block file basic data layout; (B) performing column classification; and (C) storing table files. When the upper-layer data warehouse system processes large-scale structured data, the invention achieves very high query efficiency and occupies less storage space than conventional solutions.
PCT/CN2016/113364 2016-12-30 2016-12-30 Procédé d'optimisation d'implantation de données efficace destiné à un système d'entrepôt de données WO2018119976A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201680090379.7A CN110268397B (zh) 2016-12-30 2016-12-30 应用于数据仓库系统的高效优化数据布局方法
PCT/CN2016/113364 WO2018119976A1 (fr) 2016-12-30 2016-12-30 Procédé d'optimisation d'implantation de données efficace destiné à un système d'entrepôt de données

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/113364 WO2018119976A1 (fr) 2016-12-30 2016-12-30 Procédé d'optimisation d'implantation de données efficace destiné à un système d'entrepôt de données

Publications (1)

Publication Number Publication Date
WO2018119976A1 (fr) 2018-07-05

Family

ID=62706678

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/113364 WO2018119976A1 (fr) 2016-12-30 2016-12-30 Procédé d'optimisation d'implantation de données efficace destiné à un système d'entrepôt de données

Country Status (2)

Country Link
CN (1) CN110268397B (fr)
WO (1) WO2018119976A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112579597A (zh) * 2020-12-15 2021-03-30 西安邮电大学 一种压缩敏感的数据库文件存储方法及系统
CN116931845A (zh) * 2023-09-18 2023-10-24 新华三信息技术有限公司 一种数据布局方法、装置及电子设备

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116719822B (zh) * 2023-08-10 2023-12-22 深圳市连用科技有限公司 一种海量结构化数据的存储方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101339524A (zh) * 2008-05-22 2009-01-07 清华大学 大规模磁盘阵列存储系统的磁盘容错方法
US20100235677A1 (en) * 2007-09-21 2010-09-16 Wylie Jay J Generating A Parallel Recovery Plan For A Data Storage System
CN103186566A (zh) * 2011-12-28 2013-07-03 中国移动通信集团河北有限公司 一种数据分级存储方法、装置及系统
CN103699676A (zh) * 2013-12-30 2014-04-02 厦门市美亚柏科信息股份有限公司 基于mssql server表分区及自动维护方法及系统

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3694813A (en) * 1970-10-30 1972-09-26 Ibm Method of achieving data compaction utilizing variable-length dependent coding techniques
CN102521363A (zh) * 2011-12-15 2012-06-27 武汉达梦数据库有限公司 基于列分解的列存储数据库数值数据压缩方法
CN102737132A (zh) * 2012-06-25 2012-10-17 天津神舟通用数据技术有限公司 基于数据库行列混合存储的多规则复合压缩方法
CN103118133B (zh) * 2013-02-28 2015-09-02 浙江大学 基于文件访问频次的混合云存储方法
US9722637B2 (en) * 2013-03-26 2017-08-01 Peking University Shenzhen Graduate School Construction of MBR (minimum bandwidth regenerating) codes and a method to repair the storage nodes
US9244935B2 (en) * 2013-06-14 2016-01-26 International Business Machines Corporation Data encoding and processing columnar data
CN103440244A (zh) * 2013-07-12 2013-12-11 广东电子工业研究院有限公司 一种大数据存储优化方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG, Z.: "Performance Optimization of a Massive Data Query and Analysis System on Hadoop", ELECTRONIC TECHNOLOGY & INFORMATION SCIENCE, CHINA MASTER'S THESES, 15 August 2015 (2015-08-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112579597A (zh) * 2020-12-15 2021-03-30 西安邮电大学 一种压缩敏感的数据库文件存储方法及系统
CN112579597B (zh) * 2020-12-15 2023-03-21 西安邮电大学 一种压缩敏感的数据库文件存储方法及系统
CN116931845A (zh) * 2023-09-18 2023-10-24 新华三信息技术有限公司 一种数据布局方法、装置及电子设备
CN116931845B (zh) * 2023-09-18 2023-12-12 新华三信息技术有限公司 一种数据布局方法、装置及电子设备

Also Published As

Publication number Publication date
CN110268397B (zh) 2023-06-13
CN110268397A (zh) 2019-09-20

Similar Documents

Publication Publication Date Title
US20220368457A1 (en) Distributed Storage System Data Management And Security
US10719250B2 (en) System and method for combining erasure-coded protection sets
CN103944981B (zh) 一种基于纠删码技术改进的云存储系统及实现方法
US8051362B2 (en) Distributed data storage using erasure resilient coding
US20130232153A1 (en) Modifying an index node of a hierarchical dispersed storage index
US20120089799A1 (en) Data backup processing method, data storage node apparatus and data storage device
CN106527993A (zh) 一种分布式系统中的海量文件储存方法及装置
US11656942B2 (en) Methods for data writing and for data recovery, electronic devices, and program products
CN114090345B (zh) 一种磁盘阵列数据恢复方法、系统、存储介质及设备
CN106484559A (zh) 一种校验矩阵的构造方法及水平阵列纠删码的构造方法
WO2018119976A1 (fr) Procédé d'optimisation d'implantation de données efficace destiné à un système d'entrepôt de données
CN105703782B (zh) 一种基于递增移位矩阵的网络编码方法及系统
CN101840366A (zh) 环链式n+1位奇偶校验码的存储方法
WO2023103213A1 (fr) Procédé et dispositif de stockage de données pour base de données distribuée
WO2024021594A1 (fr) Procédé et dispositif de codage pour réseau de disques raid6, procédé et dispositif de décodage pour réseau de disques raid6, et support
Esmaili et al. CORE: Cross-object redundancy for efficient data repair in storage systems
CN102843212B (zh) 编解码处理方法及装置
WO2024001974A1 (fr) Procédé et dispositif de récupération locale de données, et support de stockage
CN114115729B (zh) 一种raid下的高效数据迁移方法
CN107153661A (zh) 一种基于hdfs系统的数据的存储、读取方法及其装置
US7831859B2 (en) Method for providing fault tolerance to multiple servers
US20230418827A1 (en) Processing multi-column streams during query execution via a database system
US10599520B2 (en) Meta-copysets for fault-tolerant data storage
CN116248129A (zh) 一种容错的数据分段压缩方法、恢复方法、设备及系统
WO2018209541A1 (fr) Structure de codage sur la base de codes de répétition fractionnelle à conception en t, et procédé de codage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16925712

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16925712

Country of ref document: EP

Kind code of ref document: A1