WO2017101477A1 - 一种PostgreSQL块 - Google Patents

一种PostgreSQL块 Download PDF

Info

Publication number
WO2017101477A1
WO2017101477A1 PCT/CN2016/095365 CN2016095365W WO2017101477A1 WO 2017101477 A1 WO2017101477 A1 WO 2017101477A1 CN 2016095365 W CN2016095365 W CN 2016095365W WO 2017101477 A1 WO2017101477 A1 WO 2017101477A1
Authority
WO
WIPO (PCT)
Prior art keywords
postgresql
block
postgresql block
data
database
Prior art date
Application number
PCT/CN2016/095365
Other languages
English (en)
French (fr)
Inventor
朱晓伟
Original Assignee
深圳市华讯方舟软件技术有限公司
华讯方舟科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市华讯方舟软件技术有限公司, 华讯方舟科技有限公司 filed Critical 深圳市华讯方舟软件技术有限公司
Publication of WO2017101477A1 publication Critical patent/WO2017101477A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device

Definitions

  • the present invention relates to a PostgreSQL database storage device, and more particularly to a PostgreSQL block for storing a PostgreSQL data table in a PostgreSQL database.
  • the original PostgreSQL database system's underlying storage architecture is on the file system. From the table to the disk, the four layers of the tablespace, file system, logical volume, and disk are finally written to the physical disk.
  • Fsm file It stores information about the free space in the data table file.
  • Vm file A tuple that marks which file blocks in the data table file are not invalid.
  • Table data file This file stores data for the main user, but the file size will be limited. Normally, 2G will be saved. Files larger than 2G will be saved in files.
  • the general operating system has a limit on the number of files that can be opened.
  • the PostgreSQL database system internally uses its own virtual file management to manage file handles to ensure that many tables can be opened at the same time. For the database, there may be many tables at the same time, and the PostgreSQL database system uses one table and multiple files to save data.
  • you open a file you will first find in the file handle management whether there is a file handle that has been opened. Set the offset according to the file handle.
  • centralized storage disk array, etc.
  • This method only hardly improves the I/O throughput of the back end, and does not improve the utilization of the underlying I/O in principle. And in terms of security, this method does not protect the data at any level, and still exposes the data. It can only achieve the speedup of the database system to a certain extent. But for the high security and high performance requirements, we can't just rely on this pure hardware acceleration and ignore the optimization of the software itself.
  • the above is the original PostgreSQL database system storage architecture.
  • the above structure has the following disadvantages:
  • 64-bit space is used to record the number of blocks and the number of i-nodes.
  • 64-bit space is used to record the number of blocks and the number of i-nodes.
  • database systems there may be a large number of tables, and each table may have at least 3 files. , resulting in a lot of files in a folder.
  • the file system When the file system performs block allocation, it is basically allocated in a 4K block mode. At the same time, our database application is applied in 8K units. This means that the blocks allocated by the system are always smaller than the blocks of our table.
  • the direct problem with this is that the database's business blocks are cluttered to disk. The direct problem is that the disk seek time is long and the read and write speed is slowed down.
  • the database is migrated between different systems and its complexity, and the operator's professional knowledge is relatively high.
  • the database data files are directly exposed in the operating system, the data has security risks, and the data security is not guaranteed. It is a great security vulnerability for some environments with high security requirements.
  • the technical problem to be solved by the present invention is to provide a PostgreSQL block, which can greatly improve the query speed and read/write speed of the PostgreSQL block.
  • the PostgreSQL block of the present invention is the smallest unit allocated to the data table in the PostgreSQL database system, and the storage capacity of the PostgreSQL block is greater than 4 KB.
  • the storage capacity of the PostgreSQL block is a positive integer multiple of 8 KB.
  • the storage capacity of the PostgreSQL block is a natural multiple of 8 KB of 2 KB.
  • the storage capacity of the PostgreSQL block is 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, 32 MB, 64 MB, 128 MB, 256 MB, 512 MB or 1024 MB.
  • a PostgreSQL block storage device read/write module is a PostgreSQL block storage device read/write module for managing a PostgreSQL block as described above in a PostgreSQL block storage device.
  • the PostgreSQL block storage device read/write module can be architected on a PostgreSQL database system.
  • the PostgreSQL block storage device read/write module manages PostgreSQL blocks in the PostgreSQL block storage device through a mapping relationship table between PostgreSQL block-data tables and an idle PostgreSQL block table.
  • the mapping relationship table between the PostgreSQL block and the data table includes fields Relfilenode, Reltablespace, Forknum, Blockid, Blockno, and the free PostgreSQL block table includes fields Blockid, Isfree, Dev.
  • the PostgreSQL block storage device read and write module has the following submodules:
  • a submodule that writes data in a PostgreSQL block is a submodule that writes data in a PostgreSQL block.
  • the sub-module that allocates the PostgreSQL block allocates a PostgreSQL block to the PostgreSQL data table by using a nearby allocation policy or a hot-hot data hierarchical allocation policy, and the nearby allocation strategy is a PostgreSQL block nearest allocation policy or an idle PostgreSQL block table recording nearest allocation policy.
  • the hierarchical allocation strategy of the hot and cold data table is a common data table allocation strategy or a recently used data table.
  • the allocation strategy, the PostgreSQL block nearest allocation strategy is to find the free PostgreSQL block to be allocated to the data table before and after the last allocated PostgreSQL block of the data table, and the free PostgreSQL block table record nearest allocation strategy is to search from the idle PostgreSQL block table record.
  • the first free PostgreSQL block, the commonly used data table allocation strategy is that the frequently used data table is preferentially assigned to the faster PostgreSQL block device, and the recently used data table allocation strategy is to prioritize the recently used data table to be faster.
  • the hot and cold data table hierarchical allocation strategy is used on a PostgreSQL block device having two or more different read/write speeds;
  • the submodule of the Recycled PostgreSQL block is used to reclaim the PostgreSQL block that is no longer used by the data table, and the corresponding PostgreSQL block record is deleted from the mapping relationship table between the PostgreSQL block and the data table, and the record of the corresponding PostgreSQL block is sent to the free table.
  • the sub-module for locating the PostgreSQL block is used to locate a page in the data table to a location specified by the PostgreSQL block device;
  • the submodule for reading data in the PostgreSQL block is used to read data of a specified size in a specified position in the specified block;
  • the sub-module that writes the data in the PostgreSQL block is used to write data of a specified size within a specified location within the specified block.
  • the submodule that writes the data in the PostgreSQL block writes the data to the PostgreSQL block in an encrypted manner; accordingly,
  • the sub-module that reads the data in the PostgreSQL block reads the data back to the database business layer in a decrypted manner.
  • the encryption method is a method of performing storage encryption on the user password and data XOR;
  • the decryption method is to use the user password and the data read from the PostgreSQL block.
  • the Dev field in the idle PostgreSQL block table identifies different storage devices to implement capacity expansion.
  • a data read/write method based on a PostgreSQL block storage device which reads and writes a PostgreSQL block as described above by using a PostgreSQL block storage device read/write module as described above.
  • the driver of the PostgreSQL block storage device read/write module completes the operation of the block file, and its main tasks are:
  • the table space is located to a specified position of the specified block device, and a certain offset of the block device is located by the pageid of the table.
  • Each table in the PostgreSQL database system is assigned at least one of the PostgreSQL blocks, and the PostgreSQL block can be dynamically allocated incrementally;
  • Each PostgreSQL block can write to 512 8-KB PostgreSQL block database system pages.
  • a block device can contain an infinite number of tables.
  • a table can contain an unlimited number of PostgreSQL blocks. These block file information is stored in PostgreSQL block-data. The mapping relationship between the tables;
  • the information of all the PostgreSQL blocks of a data table can be obtained by one oid, and the PostgreSQL block information will sequentially form the entire PostgreSQL block-data table file.
  • the pageid*8k of this table is the internal bias of the table. Move, and at the same time be able to determine this is The first few blocks of the block file, so far get the position of the page we want to get in the block device.
  • the PostgreSQL block of the present invention has the following advantageous effects as compared with the prior art.
  • the data in the database will be more secure.
  • the data in the database will be saved to the block device according to the rules we have established, which is more secure in terms of security.
  • the data will be saved in pieces and saved in multiple copies to different data nodes, which is more than 100 times more secure than a single disk.
  • Database business and data separation As long as it is the same version of the database, it can be freely migrated in different operating systems. As long as the verification code set at the head of the data block can be directly mounted, the database can be directly mounted. It is free to migrate from different versions of Linux, and can also migrate with the Windows operating system platform.
  • PostgreSQL block the PostgreSQL block device read and write storage module, and the data read and write method based on the PostgreSQL block storage device of the present invention are further described in detail below with reference to the accompanying drawings and specific embodiments.
  • FIG. 1 is a schematic diagram of a file system read and write hierarchical structure of an operating system in the prior art.
  • FIG. 2 is a schematic diagram of a file system storage space allocation structure of an operating system in the prior art.
  • FIG. 3 is a schematic diagram of the principle of read and write of a file system head of an operating system in the prior art.
  • FIG. 4 is a schematic diagram showing the structure and principle of a PostgreSQL block, a PostgreSQL block device read/write memory module, and a data read/write method based on a PostgreSQL block storage device.
  • Embodiment 1 is a diagrammatic representation of Embodiment 1:
  • the PostgreSQL block of the present invention is the smallest unit allocated to the data table in the PostgreSQL database system, and the storage capacity of the PostgreSQL block is equal to 4 MB.
  • the storage capacity of the PostgreSQL block is greater than 4 KB.
  • the storage capacity of the PostgreSQL block is a positive integer multiple of 8 KB.
  • the storage capacity of the PostgreSQL block is a natural multiple of 8 KB of 2 KB.
  • the storage capacity of the PostgreSQL block is 1 MB, 2 MB, 8 MB, 16 MB, 32 MB, 64 MB, 128 MB, 256 MB, 512 MB or 1024 MB.
  • Embodiment 2 is a diagrammatic representation of Embodiment 1:
  • the PostgreSQL block storage device read/write module of the present invention is a PostgreSQL block storage device read/write module for managing a PostgreSQL block as described above in a PostgreSQL block storage device.
  • the PostgreSQL block storage device read/write module can be architected on a PostgreSQL database system.
  • the PostgreSQL block storage device read/write module manages PostgreSQL blocks in the PostgreSQL block storage device through a mapping relationship table between PostgreSQL block-data tables and an idle PostgreSQL block table.
  • the mapping relationship table between the PostgreSQL block and the data table includes fields Relfilenode, Reltablespace, Forknum, Blockid, Blockno, and the free PostgreSQL block table includes fields Blockid, Isfree, Dev.
  • the PostgreSQL block storage device read and write module has the following submodules:
  • a submodule that writes data in a PostgreSQL block is a submodule that writes data in a PostgreSQL block.
  • the sub-module that allocates the PostgreSQL block allocates a PostgreSQL block to the PostgreSQL data table by using a nearby allocation policy or a hot-hot data hierarchical allocation policy, and the nearby allocation strategy is a PostgreSQL block nearest allocation policy or an idle PostgreSQL block table recording nearest allocation policy.
  • the hot-cold data table hierarchical allocation strategy is a common data table allocation strategy or a recently used data table allocation strategy.
  • the PostgreSQL block nearest allocation strategy is to find an idle PostgreSQL block to be assigned to the data table before and after the last allocated PostgreSQL block of the data table.
  • the free PostgreSQL block table record near allocation strategy is to find the first free PostgreSQL block from the idle PostgreSQL block table record, the commonly used data table allocation strategy is a frequently used data table.
  • Priority is assigned to a faster PostgreSQL block device that prioritizes the recent use of the data table to a faster PostgreSQL block device, the hot-cold data table hierarchical allocation strategy for having two On a PostgreSQL block device with different read/write speeds above the block;
  • the submodule of the Recycled PostgreSQL block is used to reclaim the PostgreSQL block that is no longer used by the data table, and the corresponding PostgreSQL block record is deleted from the mapping relationship table between the PostgreSQL block and the data table, and the record of the corresponding PostgreSQL block is sent to the free table.
  • the sub-module for locating the PostgreSQL block is used to locate a page in the data table to a location specified by the PostgreSQL block device;
  • the submodule for reading data in the PostgreSQL block is used to read data of a specified size in a specified position in the specified block;
  • the sub-module that writes the data in the PostgreSQL block is used to write data of a specified size within a specified location within the specified block.
  • the submodule that writes the data in the PostgreSQL block writes the data to the PostgreSQL block in an encrypted manner; accordingly,
  • the sub-module that reads the data in the PostgreSQL block reads the data back to the database business layer in a decrypted manner.
  • the encryption method is a method of performing storage encryption on the user password and data XOR;
  • the decryption method is a method of deciphering the user password and the data read from the PostgreSQL block and then deciphering it to the database service layer.
  • the Dev field in the idle PostgreSQL block table identifies different storage devices to implement capacity expansion.
  • the PostgreSQL block table is copied from the original library to the new library, and the PostgreSQL block storage device is migrated from the computer where the original library resides to the computer where the new library is located.
  • Embodiment 3 is a diagrammatic representation of Embodiment 3
  • the present invention is based on a data read/write method of a PostgreSQL block storage device, and reads and writes a PostgreSQL block as described above by using a PostgreSQL block storage device read/write module as described above.
  • the driver of the PostgreSQL block storage device read/write module completes the operation of the block file, and its main tasks are:
  • the table space is located to a specified position of the specified block device, and a certain offset of the block device is located by the pageid of the table.
  • Each table in the PostgreSQL database system is assigned at least one of the PostgreSQL blocks, and the PostgreSQL block can be dynamically allocated incrementally;
  • Each PostgreSQL block can write to 512 8-KB PostgreSQL block database system pages.
  • a block device can contain an infinite number of tables.
  • a table can contain an unlimited number of PostgreSQL blocks. These block file information is stored in PostgreSQL block-data. The mapping relationship between the tables;
  • the information of all the PostgreSQL blocks of a data table can be obtained by one oid, and the PostgreSQL block information will sequentially form the entire PostgreSQL block-data table file.
  • the pageid*8k of this table is the internal bias of the table. Move, and at the same time be able to determine the first block of this block file, and thus get the position of the page we want to get in the block device.
  • the PostgreSQL block storage device In general, we split the PostgreSQL block storage device by 4MB. We only need to calculate the total number of PostgreSQL block storage devices by dividing the total capacity of the PostgreSQL block storage device by the capacity of a PostgreSQL block. This does not require scanning the PostgreSQL block storage device. This allocation time is extremely short. For example, the 2T PostgreSQL block storage device only needs to be divided into 524288 PostgreSQL blocks. We only need to add 524288 records to the free PostgreSQL block table, and the blockid can be incremented from 0--524287. Also mark isfree as true.
  • the storage space of the data table is allocated in units of 4 MB.
  • the actual occupied space of a table is allocated on demand.
  • the original SMgrRelation reln, ForkNumber forknum two parameters are located to the file system's table file modified to locate the offset of the block device.
  • the database business layer is separated from the data layer. Mount the block device directly to the new node. This upper level system error does not affect the underlying data.
  • PostgreSQL block storage device selection rules can be customized.
  • step B we are the choice of order.
  • PostgreSQL database system background storage is stored on a bare disk basis.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

提供一种PostgreSQL块。该PostgreSQL块是分配给PostgreSQL数据库系统中数据表的最小单位,该PostgreSQL块的存储容量大于4KB。根据本发明,数据库的查询速度、读写速度得到提升,数据库的数据更安全,而且数据库业务和数据分离。

Description

一种PostgreSQL块 技术领域
本发明涉及一种PostgreSQL数据库存储设备,尤其涉及一种用于存储PostgreSQL数据库中PostgreSQL数据表的PostgreSQL块。
背景技术
随着互联网、移动互联网和物联网的发展,我们迎来了一个海量数据的时代,而数据库内保存的数据也越来越多,而我们需要的查询时间反而要越来越小。现在的众多应用场景都需要后台的存储具有高并发,高容量,高响应。高速的入库需求迫使我们不得不放弃实时索引,而大数据量的数据扫描又被存储端的IO所限制。现在的PostgreSQL数据库系统中,高并发的多数据库扫描需要碰到基本都是随机读,而与此同时,普通磁盘的吞吐量已经不能满足需求。
为了适应大数据量的应用场景,官方推出的PostgreSQL-XC、PostgreSQL-XL这两个MPP数据库还没有成熟,而且还存在很多安全性问题。而传统的提速方案仅仅是利用集中式存储的高I/O(输入/输出)来在一定程度上提高整个系统的。但这种提升依旧存在很大的资源浪费。
如图1所示,原有的PostgreSQL数据库系统底层存储架构在文件系统上,从表到磁盘需要经过:表空间,文件系统,逻辑卷、磁盘这四层最终才会写入到物理磁盘内。
这样的架构首先会使得磁盘的I/O形成衰减。最终的磁盘I/O利用率只能达到80%左右,甚至更低。其次,当数据库表存在大量小表的情况下,势必会增加文件系统的压力。而且对于数据库操作而言,表的写入读出都较为随机,极容易造成磁盘碎片,而大量的随机读写,也必将会使得整个数据的读写性能下降。对于小数据量的数据库而言,现有的缺点也许不明显。但现在更多的使用场景是大数据量、高并发、高I/O。这些使用场景下,这些缺点将会大大影响数据库性能。此外在不同的操作系统迁移中也需要 极复杂的数据库迁移操作。
普通磁盘的I/O速度较低,传统的解决方法是使用集中式存储(即磁盘阵列)。这种方式只是简单的提高磁盘的I/O速度,而且成本极高,具有一定的局限性。
如图2和图3所示,PostgreSQL数据库系统现有的后台存储架构:
数据库的数据操作每条记录都有几个主要标识。表ID、记录所在black id、文件类型、行记录对应的标识。现有的PostgreSQL数据库系统读写表操作需要:
a根据表ID查找到这个表的表空间,然后定位到文件系统中表空间的位置。
b根据文件类型要获取到指定的表文件。一般一个表会包含3种类型的文件。
fsm文件:其中存放了数据表文件中空闲空间的信息。
vm文件:标记了数据表文件中哪些文件块没有失效的元组。
表数据文件:此文件主要用户存储数据,但是此文件大小会有一定的限制,普通情况下会保存2G,超过2G的文件会分文件保存。
c根据记录号以及fsm文件,定位page。
而一般的操作系统对于打开的文件数会有限制。PostgreSQL数据库系统内部使用自己的虚拟文件管理来管理文件句柄,来保证能够同时打开很多的表。对于数据库而言可能同时存在很多的表,而PostgreSQL数据库系统采用1个表多个文件方式来保存数据。当打开一个文件的时候,会首先到文件句柄管理中找一下是不是存在已经打开的文件句柄。根据文件句柄设定一下offset。
d从文件中取到对应的page,并取到对应的行。
根据记录的blockid定位到文件,读取指定大小的块。而这些记录就保存在page内。
现有技术中,采用集中式存储作为数据库后端。
1、采用集中式存储(磁盘阵列等)做为存储后端。把集中式存储提供的磁盘挂载到PostgreSQL数据库系统所安装的机器。并根据不同的操作系统格式化成不同的文件系统。
2、把数据库中的数据指定到集中式存储提供的磁盘。
此种方法仅仅硬性提高后端的I/O吞吐量,并没有从原理上提高底层I/O的利用率。而且就安全性而言,这种方法并没有对数据做任何层面的保护,依旧让数据暴露在外。只能够在一定程度上实现数据库系统的提速。但是对于高安全高性能的需求来说,我们不能仅仅依靠这种纯硬件的加速而忽略软件本身的优化。
以上是原有的PostgreSQL数据库系统存储架构。以上结构存在以下几个缺点:
第一、对磁盘的使用必经过文件系统层。
就linux系统下较为流行的文件系统而言,采用64位空间来记录块数量和i-节点数量,对于数据库系统,可能存在及大量的表,每个表都将可能至少存在3个以上的文件,导致一个文件夹内存在极多的文件。
文件系统进行块分配时,基本上都是按照4K一个块的模式进行分配。同时我们数据库申请的时候都是以8K为单位进行申请。也就是说系统分配的块总是比我们的表的块要小。这样导致的直接问题就是数据库的业务块杂乱的分配到磁盘上。导致的直接问题就是磁盘寻道时间长,读写速度减慢。
假如我们的数据库中存在一个极大的表,文件系统在处理的时候效率极其低。例如,在ext3文件系统中100MB的文件就需要近25600个数据块。而对于PostgreSQL数据库系统而言,达到或超过GB级别的表处理会很常见,而且是随机的读写。
第二、文件系统的扩容也非常麻烦。除了已知的极个别商业文件系统外,其他的文件系统扩容多会要求关闭数据库。
第三、数据库在不同的系统之间进行迁移及其复杂,对操作人员专业知识要求较高。
第四、数据库数据文件直接暴露在操作系统中,数据存在安全隐患,数据安全也就得不到保证,对于一些对安全级别要求较高的使用环境而言是个极大的安全漏洞。
发明内容
本发明要解决的技术问题是提供一种PostgreSQL块,该PostgreSQL块的查询速度、读写速度能够有极大提升。
为了解决上述技术问题,本发明的PostgreSQL块是分配给PostgreSQL数据库系统中数据表的最小单位,所述PostgreSQL块的存储容量大于4KB。
所述PostgreSQL块的存储容量是8KB的正整数倍。
所述PostgreSQL块的存储容量是8KB之2的自然数次方倍。
所述PostgreSQL块的存储容量是1MB、2MB、4MB、8MB、16MB、32MB、64MB、128MB、256MB、512MB或1024MB。
一种PostgreSQL块存储设备读写模块,所述PostgreSQL块存储设备读写模块是对PostgreSQL块存储设备中如前面所述的PostgreSQL块进行管理的PostgreSQL块存储设备读写模块。
所述PostgreSQL块存储设备读写模块可以架构在PostgreSQL数据库系统上。
所述PostgreSQL块存储设备读写模块通过PostgreSQL块-数据表之间的映射关系表和空闲PostgreSQL块表对PostgreSQL块存储设备中的PostgreSQL块进行管理。
所述PostgreSQL块-数据表之间的映射关系表包括字段Relfilenode、Reltablespace、Forknum、Blockid、Blockno,所述空闲PostgreSQL块表包括字段Blockid、Isfree、Dev。
所述PostgreSQL块存储设备读写模块,具有以下子模块:
分配PostgreSQL块的子模块,
回收PostgreSQL块的子模块,
定位PostgreSQL块的子模块,
读出PostgreSQL块中数据的子模块,
写入PostgreSQL块中数据的子模块。
所述分配PostgreSQL块的子模块采用就近分配策略或冷热数据分层分配策略给PostgreSQL数据表分配PostgreSQL块,所述就近分配策略是PostgreSQL块就近分配策略或空闲PostgreSQL块表记录就近分配策略,所述冷热数据表分层分配策略是常用数据表分配策略或近期使用数据表 分配策略,所述PostgreSQL块就近分配策略是在数据表上次分配的PostgreSQL块前后就近寻找空闲PostgreSQL块分配给数据表,所述空闲PostgreSQL块表记录就近分配策略是从空闲PostgreSQL块表记录中寻找第一个空闲PostgreSQL块,所述常用数据表分配策略是经常使用的数据表优先分配到较快的PostgreSQL块设备上,所述近期使用数据表分配策略是将近期使用数据表优先分配到较快的PostgreSQL块设备上,所述冷热数据表分层分配策略用于具有两块以上不同读写速度的PostgreSQL块设备上;
所述回收PostgreSQL块的子模块用于回收数据表不再使用的PostgreSQL块,从PostgreSQL块-数据表之间的映射关系表中删除对应的PostgreSQL块记录,到空闲表中将相应PostgreSQL块的记录设置为空闲;
所述定位PostgreSQL块的子模块用于将数据表中的页定位到PostgreSQL块设备指定的位置上;
所述读出PostgreSQL块中数据的子模块用于读取指定块内指定位置指定大小的数据;
所述写入PostgreSQL块中数据的子模块用于写入指定块内指定位置指定大小的数据。
所述写入PostgreSQL块中数据的子模块以加密方式将数据写入PostgreSQL块中;相应地,
所述读出PostgreSQL块中数据的子模块以解密的方式将数据读出返回给数据库业务层。
所述加密的方式是采用将用户密码与数据异或计算后进行存储加密的方式;
所述解密的方式是采用将用户密码与从PostgreSQL块中读出的数据 异或计算后反回给数据库业务层解密的方式。
所述空闲PostgreSQL块表中的Dev字段对不同的存储设备进行标识实现扩容。
将所述PostgreSQL块-数据表之间的映射关系表和所述空闲PostgreSQL块表从原库复制到新库,将PostgreSQL块存储设备从原库所在的计算机迁移到新库所在的计算机。
一种基于PostgreSQL块存储设备的数据读写方法,利用如前面所述的PostgreSQL块存储设备读写模块对如前面所述的PostgreSQL块进行读写。
A、在数据库后台驱动层中,PostgreSQL块存储设备读写模块的驱动完成对块文件的操作,其主要任务为:
a、通过文件句柄管理器管理上层业务对底层设备的读写接口,实现提供高并发的读写接口;
b、实现对PostgreSQL块存储设备的覆盖写,直接定位到PostgreSQL块存储设备的指定位置,并写入指定长度的数据;
B、根据所述PostgreSQL块-数据表之间的映射关系表oid、表空间定位到指定块设备的指定位置,通过表的pageid定位到块设备的某个偏移。
PostgreSQL数据库系统中的每一个表分配有至少一个所述PostgreSQL块,而且可动态地增量分配PostgreSQL块;
每个PostgreSQL块可写入512个8KB的PostgreSQL块数据库系统的页,一个块设备可以包含无数多个表,一个表将可以包含无限多个PostgreSQL块,这些块文件信息被保存在PostgreSQL块-数据表之间的映射关系表内;
通过1个oid能获取到一个数据表的所有PostgreSQL块的信息,而这些PostgreSQL块信息会顺序地组成整个的所述PostgreSQL块-数据表的文件,这个表的pageid*8k就是这个表内部的偏移,同时也就能够确定是这个 块文件的第几个块,至此获得我们想要获取的页在块设备内位置。
本发明的PostgreSQL块与现有技术相比具有以下有益效果。
1)查询速度、读写速度能够有极大提升,在相同的后端存储情况下能够达到10%~20%的速度提升。这个提升对于数据库而言是十分可观的。
2)数据库的数据将会更为安全。数据库的数据将按照我们制定的规则保存到块设备内,从安全上来说更为安全。而且,由于使用分布式存储,将会使得数据将会分片保存,并保存多份到不同的数据节点内,就安全度而言是单磁盘百倍以上。
3)数据库业务和数据分离。只要是同一个版本的数据库,就能够在不同的操作系统中自由迁移。只要能够通过设定在数据块头部的验证码,就能够实现数据库的直接挂载。能够自由的从linux不同版本之间进行迁移,也能够和windows操作系统平台进行相互迁移。
4)数据库在不同的系统之间进行迁移及其简单,对操作人员专业知识要求不高。
附图说明
下面结合附图和具体实施方式对本发明的PostgreSQL块、PostgreSQL块设备读写存储模块及基于PostgreSQL块存储设备的数据读写方法作进一步的详细描述。
图1是现有技术中操作系统的文件系统读写分层结构示意图。
图2是现有技术中操作系统的文件系统存储空间分配结构示意图。
图3是现有技术中操作系统的文件系统磁头读写原理示意图。
图4是本发明的PostgreSQL块、PostgreSQL块设备读写存储模块及基于PostgreSQL块存储设备的数据读写方法的结构、原理示意图。
具体实施方式
实施例一:
如图4所示,本发明的PostgreSQL块是分配给PostgreSQL数据库系统中数据表的最小单位,所述PostgreSQL块的存储容量等于4MB。
当然,也可以是,所述PostgreSQL块的存储容量大于4KB。
还可以是,所述PostgreSQL块的存储容量是8KB的正整数倍。
又可以是,所述PostgreSQL块的存储容量是8KB之2的自然数次方倍。
更可以是,所述PostgreSQL块的存储容量是1MB、2MB、8MB、16MB、32MB、64MB、128MB、256MB、512MB或1024MB。
实施例二:
如图4所示,本发明的PostgreSQL块存储设备读写模块是对PostgreSQL块存储设备中如前面所述的PostgreSQL块进行管理的PostgreSQL块存储设备读写模块。
所述PostgreSQL块存储设备读写模块可以架构在PostgreSQL数据库系统上。
所述PostgreSQL块存储设备读写模块通过PostgreSQL块-数据表之间的映射关系表和空闲PostgreSQL块表对PostgreSQL块存储设备中的PostgreSQL块进行管理。
所述PostgreSQL块-数据表之间的映射关系表包括字段Relfilenode、Reltablespace、Forknum、Blockid、Blockno,所述空闲PostgreSQL块表包括字段Blockid、Isfree、Dev。
所述PostgreSQL块存储设备读写模块,具有以下子模块:
分配PostgreSQL块的子模块,
回收PostgreSQL块的子模块,
定位PostgreSQL块的子模块,
读出PostgreSQL块中数据的子模块,
写入PostgreSQL块中数据的子模块。
所述分配PostgreSQL块的子模块采用就近分配策略或冷热数据分层分配策略给PostgreSQL数据表分配PostgreSQL块,所述就近分配策略是PostgreSQL块就近分配策略或空闲PostgreSQL块表记录就近分配策略,所述冷热数据表分层分配策略是常用数据表分配策略或近期使用数据表分配策略,所述PostgreSQL块就近分配策略是在数据表上次分配的PostgreSQL块前后就近寻找空闲PostgreSQL块分配给数据表,所述空闲PostgreSQL块表记录就近分配策略是从空闲PostgreSQL块表记录中寻找第一个空闲PostgreSQL块,所述常用数据表分配策略是经常使用的数据表 优先分配到较快的PostgreSQL块设备上,所述近期使用数据表分配策略是将近期使用数据表优先分配到较快的PostgreSQL块设备上,所述冷热数据表分层分配策略用于具有两块以上不同读写速度的PostgreSQL块设备上;
所述回收PostgreSQL块的子模块用于回收数据表不再使用的PostgreSQL块,从PostgreSQL块-数据表之间的映射关系表中删除对应的PostgreSQL块记录,到空闲表中将相应PostgreSQL块的记录设置为空闲;
所述定位PostgreSQL块的子模块用于将数据表中的页定位到PostgreSQL块设备指定的位置上;
所述读出PostgreSQL块中数据的子模块用于读取指定块内指定位置指定大小的数据;
所述写入PostgreSQL块中数据的子模块用于写入指定块内指定位置指定大小的数据。
所述写入PostgreSQL块中数据的子模块以加密方式将数据写入PostgreSQL块中;相应地,
所述读出PostgreSQL块中数据的子模块以解密的方式将数据读出返回给数据库业务层。
所述加密的方式是采用将用户密码与数据异或计算后进行存储加密的方式;
所述解密的方式是采用将用户密码与从PostgreSQL块中读出的数据异或计算后反回给数据库业务层解密的方式。
所述空闲PostgreSQL块表中的Dev字段对不同的存储设备进行标识实现扩容。
将所述PostgreSQL块-数据表之间的映射关系表和所述空闲 PostgreSQL块表从原库复制到新库,将PostgreSQL块存储设备从原库所在的计算机迁移到新库所在的计算机。
实施例三:
如图4所示,本发明基于PostgreSQL块存储设备的数据读写方法,利用如前面所述的PostgreSQL块存储设备读写模块对如前面所述的PostgreSQL块进行读写。
A、在数据库后台驱动层中,PostgreSQL块存储设备读写模块的驱动完成对块文件的操作,其主要任务为:
a、通过文件句柄管理器管理上层业务对底层设备的读写接口,实现提供高并发的读写接口;
b、实现对PostgreSQL块存储设备的覆盖写,直接定位到PostgreSQL块存储设备的指定位置,并写入指定长度的数据;
B、根据所述PostgreSQL块-数据表之间的映射关系表oid、表空间定位到指定块设备的指定位置,通过表的pageid定位到块设备的某个偏移。
PostgreSQL数据库系统中的每一个表分配有至少一个所述PostgreSQL块,而且可动态地增量分配PostgreSQL块;
每个PostgreSQL块可写入512个8KB的PostgreSQL块数据库系统的页,一个块设备可以包含无数多个表,一个表将可以包含无限多个PostgreSQL块,这些块文件信息被保存在PostgreSQL块-数据表之间的映射关系表内;
通过1个oid能获取到一个数据表的所有PostgreSQL块的信息,而这些PostgreSQL块信息会顺序地组成整个的所述PostgreSQL块-数据表的文件,这个表的pageid*8k就是这个表内部的偏移,同时也就能够确定是这个块文件的第几个块,至此获得我们想要获取的页在块设备内位置。
如图4所示,详细设计如下:
创建PostgreSQL块-数据表之间的映射关系表和空闲PostgreSQL块表。表一:PostgreSQL块-数据表之间映射关系表
  字段 类型 描述
1 Relfilenode(表编号) oid 表ID
2 Reltablespace(表空间编号) oid 表空间ID
3 Forknum(文件类型) Int2 表fork类型
4 Blockid(块编号) oid 表所占用的块ID
5 Blockno(块位置) oid 被占用的块所在数据表中的位置
表二:空闲PostgreSQL块表
  字段 类型 描述
1 Blockid(块编号) oid 块ID
2 Isfree(占用标识) bool 是否被使用
3 Dev(设备标识) string 块设备标识
A首先我们把PostgreSQL块存储设备在逻辑上以4MB为单位分割为N个PostgreSQL块。
一般情况下我们把PostgreSQL块存储设备按照4MB为单位进行分割。我们只需要通过PostgreSQL块存储设备的总容量除以一个PostgreSQL块的容量计算出PostgreSQL块存储设备的总块数即可,不需要对PostgreSQL块存储设备进行扫描,这个分配时间极短。比如2T的PostgreSQL块存储设备只需要分成524288个PostgreSQL块,我们只需要在空闲PostgreSQL块表中新增524288条记录,blockid从0--524287递增即可。同时标注isfree为true。
B、数据表的存储空间以4MB为单位进行分配。
按照PostgreSQL数据库系统的流程而言,一个表实际占用空间是按需分配的。我们修改PostgreSQL数据库系统后台驱动文件storage/smgr/smgr.c类型typedef struct f_smgr中需要的方法进行重写。
包括{
mdinit,
mdclose,
mdcreate,
mdexists,
mdunlink,
mdextend,
mdprefetch,
mdread,
mdwrite,
mdnblocks,
mdtruncate,
mdimmedsync,
mdpreckpt,
mdsync,
mdpostckpt
}
这些方法的具体实现在md.c文件中。(此文件中的其他方法也要进行修改)
把各个方法中涉及到的MdfdVec进行修改。
修改的主要思想为:
原本SMgrRelation reln,ForkNumber forknum两个参数定位到文件系统的表文件修改为定位到块设备的offset。我们能根据reln获取到relfilenode,reltablespace,forknum我们从PostgreSQL块-数据表之间的映射关系表中能够定位到我们的表所占用的数据。保证(pageno+1)*4*1024KB>reln.pageid*8KB>pageno*4*1024KB。
假如PostgreSQL块-数据表之间的映射关系表中不存在,说明我们需要新增一个PostgreSQL块给数据表。这时我们从空闲PostgreSQL块表中直接搜到一个没被使用的块设备就可以直接使用。直接避免文件系统在处理大文件时候的繁杂分配过程。
假如PostgreSQL块-数据表之间的映射关系表中存在,我们就能把PostgreSQL块存储设备直接定位到pageno*4*1024KB+(reln.pageid%512) *8KB。读出这个8K数据即可。
PostgreSQL数据库业务到磁盘的中间过程及其简单。磁盘利用率极高。
C、PostgreSQL块存储设备的扩容。
当我们的PostgreSQL块存储设备容量不够的时候,我们的分布式存储能够极容易给其扩容。只需要在空闲PostgreSQL块表新增新条目即可。扩容的方式有两种:一种方式是原有的块设备扩容;另一种方式是新增块设备。对于前者我们更新PostgreSQL块-数据表之间的映射关系表,比如从2TB新增到4TB,我们只需要新插入524288条记录,524288-1048575递增即可。对于后者,比如新增2T块设备。我们新增524287条记录,blockid从0-524287递增,dev能够定位即可。
D、数据库迁移。
数据库业务层跟数据层分割。把块设备直接挂载到新的节点上。这样上层的系统错误不会影响到底层数据。
E、数据安全。
PostgreSQL块存储设备选择规则可自定义,在B步骤中,我们是顺序的选择,我们也可以在我们的选择方法中添加一些规则(比如我们可以在其中插入一致性hash算法),打乱我们数据库的存储规则,保证数据安全。
本发明的关键点如下。
1)PostgreSQL数据库系统后台存储以裸盘为基础进行保存。
2)PostgreSQL数据库系统迁移的跨操作系统直接迁移,以及验证。
3)PostgreSQL数据库系统对分布式文件系统(ceph)块设备的直接支持。
与现有技术相比,本发明的有益效果如下。
1)插入和查询速度更快。
2)迁移更灵活。
3)数据更为安全。
4)PostgreSQL数据库系统业务和数据存储相分离。
需要说明的是,以上参照附图所描述的各个实施例仅用以说明本发明而非限制本发明的范围,本领域的普通技术人员应当理解,在不脱离本发明的精神和范围的前提下对本发明进行的修改或者等同替换,均应涵盖在 本发明的范围之内。此外,除上下文另有所指外,以单数形式出现的词包括复数形式,反之亦然。另外,除非特别说明,那么任何实施例的全部或一部分可结合任何其它实施例的全部或一部分来使用。

Claims (6)

  1. 一种PostgreSQL块,所述PostgreSQL块是分配给PostgreSQL数据库系统中数据表的最小单位,其特征在于:所述PostgreSQL块的存储容量大于4KB。
  2. 根据权利要求1所述的PostgreSQL块,其特征在于:所述PostgreSQL块的存储容量是8KB的正整数倍。
  3. 根据权利要求2所述的PostgreSQL块,其特征在于:所述PostgreSQL块的存储容量是8KB之2的自然数次方倍。
  4. 根据权利要求3所述的PostgreSQL块,其特征在于:所述PostgreSQL块的存储容量是1MB、2MB、4MB、8MB、16MB、32MB、64MB、128MB、256MB、512MB或1024MB。
  5. 根据权利要求1至4之一所述的PostgreSQL块,其特征在于:所述PostgreSQL块利用PostgreSQL块设备读写存储模块进行读写。
  6. 根据权利要求5所述的PostgreSQL块,其特征在于:所述PostgreSQL块通过基于PostgreSQL块存储设备的数据读写方法进行读写。
PCT/CN2016/095365 2015-12-17 2016-08-15 一种PostgreSQL块 WO2017101477A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510953093.4 2015-12-17
CN201510953093.4A CN105573678B (zh) 2015-12-17 2015-12-17 一种PostgreSQL块

Publications (1)

Publication Number Publication Date
WO2017101477A1 true WO2017101477A1 (zh) 2017-06-22

Family

ID=55883872

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/095365 WO2017101477A1 (zh) 2015-12-17 2016-08-15 一种PostgreSQL块

Country Status (2)

Country Link
CN (1) CN105573678B (zh)
WO (1) WO2017101477A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105630879B (zh) * 2015-12-17 2019-03-26 深圳市华讯方舟软件技术有限公司 一种PostgreSQL块存储设备读写模块
CN105824879B (zh) * 2015-12-17 2019-06-28 深圳市华讯方舟软件技术有限公司 一种基于PostgreSQL块存储设备的迁移方法
CN105573678B (zh) * 2015-12-17 2018-11-09 深圳市华讯方舟软件技术有限公司 一种PostgreSQL块
CN112380217B (zh) * 2020-11-17 2024-04-12 安徽鸿程光电有限公司 数据处理方法、装置、设备及介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7908245B2 (en) * 2007-03-09 2011-03-15 Fujitsu Limited Database management method and database management apparatus
CN103345518A (zh) * 2013-07-11 2013-10-09 清华大学 基于数据块的自适应数据存储管理方法及系统
CN104809152A (zh) * 2015-03-12 2015-07-29 杭州米加科技有限公司 一种节约PostgreSQL共享内存的方法及系统
CN105573678A (zh) * 2015-12-17 2016-05-11 深圳市华讯方舟软件技术有限公司 一种PostgreSQL块
CN105630879A (zh) * 2015-12-17 2016-06-01 深圳市华讯方舟软件技术有限公司 一种PostgreSQL块存储设备读写模块
CN105824879A (zh) * 2015-12-17 2016-08-03 深圳市华讯方舟软件技术有限公司 一种基于PostgreSQL块存储设备的迁移方法

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102768672B (zh) * 2012-06-12 2016-02-17 北大方正集团有限公司 一种磁盘空间管理方法和装置
CN103577470B (zh) * 2012-08-03 2017-11-21 上海交通大学 一种提升web服务器性能的文件系统及方法
US8862847B2 (en) * 2013-02-08 2014-10-14 Huawei Technologies Co., Ltd. Distributed storage method, apparatus, and system for reducing a data loss that may result from a single-point failure
CN104731863B (zh) * 2015-02-13 2017-10-13 杭州米加科技有限公司 简化PostgreSQL分区代码的方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7908245B2 (en) * 2007-03-09 2011-03-15 Fujitsu Limited Database management method and database management apparatus
CN103345518A (zh) * 2013-07-11 2013-10-09 清华大学 基于数据块的自适应数据存储管理方法及系统
CN104809152A (zh) * 2015-03-12 2015-07-29 杭州米加科技有限公司 一种节约PostgreSQL共享内存的方法及系统
CN105573678A (zh) * 2015-12-17 2016-05-11 深圳市华讯方舟软件技术有限公司 一种PostgreSQL块
CN105630879A (zh) * 2015-12-17 2016-06-01 深圳市华讯方舟软件技术有限公司 一种PostgreSQL块存储设备读写模块
CN105824879A (zh) * 2015-12-17 2016-08-03 深圳市华讯方舟软件技术有限公司 一种基于PostgreSQL块存储设备的迁移方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GU, RUI ET AL.: "Research and Implementation of Tablespace Storage Strategy in PostgreSQL", COMPUTER ENGINEERING, vol. 32, no. 16, 31 August 2006 (2006-08-31), pages 38 - 40, ISSN: 1000-3428 *

Also Published As

Publication number Publication date
CN105573678A (zh) 2016-05-11
CN105573678B (zh) 2018-11-09

Similar Documents

Publication Publication Date Title
WO2017101505A1 (zh) 一种基于PostgreSQL块存储设备的迁移方法
WO2017101478A1 (zh) 一种PostgreSQL块存储设备读写模块
KR102147905B1 (ko) 어드레스 기반의 멀티-스트림 스토리지 장치 액세스
US9047301B2 (en) Method for optimizing the memory usage and performance of data deduplication storage systems
CN106663047A (zh) 用于优化的签名比较和数据复制的系统和方法
TW201935243A (zh) 固態驅動器、分散式資料儲存系統和利用鍵值儲存的方法
US20140281307A1 (en) Handling snapshot information for a storage device
US10956071B2 (en) Container key value store for data storage devices
US10296250B2 (en) Method and apparatus for improving performance of sequential logging in a storage device
WO2017101477A1 (zh) 一种PostgreSQL块
CN103597440A (zh) 用于创建克隆文件的方法以及采用该方法的文件系统
US11775476B2 (en) Techniques for snapshotting scalable multitier storage structures
CN102012852A (zh) 一种增量写快照的实现方法
JP2006293981A (ja) データベース格納方法、および、データベース格納システム
CN111324305B (zh) 一种分布式存储系统中数据写入/读取方法
KR20170038853A (ko) 호스트-관리 비휘발성 메모리
US20180267713A1 (en) Method and apparatus for defining storage infrastructure
CN113535670B (zh) 一种虚拟化资源镜像存储系统及其实现方法
WO2014089828A1 (zh) 访问存储设备的方法和存储设备
WO2022262381A1 (zh) 一种数据压缩方法及装置
US8131966B2 (en) System and method for storage structure reorganization
US10089348B2 (en) Computer device and storage device
Yang et al. Reducing relational database performance bottlenecks using 3D XPoint storage technology
WO2023235040A1 (en) File system improvements for zoned storage device operations
Chen et al. Facilitating external sorting on SMR-based large-scale storage systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16874537

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16874537

Country of ref document: EP

Kind code of ref document: A1