CN110888937A - Novel distributed storage system and method using QLC as storage medium - Google Patents

Novel distributed storage system and method using QLC as storage medium

Info

Publication number
CN110888937A
Authority
CN
China
Prior art keywords
qlc
scm
data
disk
distributed storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911210914.XA
Other languages
Chinese (zh)
Inventor
刘凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Virtual Clusters Information Technology Co Ltd
Original Assignee
Shenzhen Virtual Clusters Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Virtual Clusters Information Technology Co Ltd
Priority to CN201911210914.XA
Publication of CN110888937A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention relates to a novel distributed storage system and method using QLC as a storage medium. The storage system comprises a plurality of servers, a switch and a plurality of QLC/SCM chassis. All the servers are connected to the QLC/SCM chassis through the switch; each QLC/SCM chassis houses QLC disks and SCM disks and can be accessed by all the servers through the NVMe-oF protocol. The invention makes full use of three new technologies: QLC, SCM and NVMe-oF. By redesigning the storage system around them, it eliminates the access bottleneck in performance, avoids complicated tiering mechanisms and the high price of ordinary SSDs, and can meet the storage requirements of today's big-data workloads.

Description

Novel distributed storage system and method using QLC as storage medium
Technical Field
The present invention relates to computer storage systems, and more particularly, to a novel distributed storage system and method using a QLC as a storage medium.
Background
In the era of big data, artificial intelligence and the Internet of Things, and particularly with the arrival of 5G, data volumes are growing very fast. Storing this much data requires not only capacity but also fast reads and writes at any time, so that the data can be analyzed and processed in real time. A large storage system therefore has to meet the following requirements:
(1) ultra-large capacity, from the PB level up to the EB level;
(2) ultra-high performance, millions or even tens of millions of IOPS;
(3) low cost, so that it can be deployed on a large scale;
(4) high reliability, with uninterrupted access to data;
(5) linear scalability, which calls for distributed storage.
Among the three major components of computing, networking and storage, storage is the most prominent performance bottleneck. At present, most storage systems use mechanical disks, whose read/write performance, and in particular their I/O rate (IOPS), is severely limited and cannot keep pace with the rapid development of CPUs and networks.
Current storage systems use mechanical disks, all-SSD configurations, or SSDs as a cache, each with its own advantages and disadvantages: (1) mechanical disks are a mature technology with large capacity, but their performance shortcomings are obvious; (2) all-SSD storage systems perform very well, but are expensive and can be used only in some scenarios; (3) SSD caching solves some problems, but the final performance is still limited by the underlying mechanical disks.
Therefore, the prior art has problems and needs to be further improved.
Disclosure of Invention
To address the above problems, the present invention provides a novel distributed storage system and method using QLC as a storage medium.
To achieve this purpose, the specific technical scheme of the invention is as follows:
A novel distributed storage system using QLC as a storage medium, comprising a plurality of servers, a switch and a plurality of QLC/SCM chassis, wherein all the servers are connected to the QLC/SCM chassis through the switch, and each QLC/SCM chassis houses QLC disks and SCM disks and can be accessed by all the servers through the NVMe-oF protocol.
Preferably, the QLC/SCM chassis provides dual power supplies and dual 10/100G interfaces.
Preferably, the switch is an Ethernet switch or an IB (InfiniBand) switch.
Preferably, the server is an x86 server, configured with CPUs and memory of different performance levels according to requirements, and runs distributed storage software.
Preferably, all the SCM disks in the system form a global write cache: data is first cached on the SCM disks and, once the write is complete, all of it is written to the QLC disks in a single pass.
The invention also provides a storage method of the system, which comprises the following steps:
a data writing step: caching the data on an SCM disk and, once the write is complete, writing all of the data to a QLC disk in a single pass;
a data modification step: making frequent modifications of hot data on the SCM disk, and flushing the data to the QLC disk after it has cooled.
Preferably, in the data writing step, if the operation modifies old data, then when the data is written to the QLC disk a new data block is allocated and the data is appended to it.
In the invention, every storage server can see all the QLC SSDs (QLC disks) and also has the SCM disks as a write cache, so data can be protected by erasure coding with a very high erasure ratio. A 40:4 erasure ratio may be used at small capacity, 150:4 at medium capacity, and 500:8 at large capacity; at such high erasure ratios, the utilization of storage capacity can reach 95% or more.
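The capacity figures above can be checked with a short calculation. The sketch below assumes that a ratio written k:m means k data fragments plus m parity fragments per stripe, which the text does not state explicitly; the function name is illustrative only. Under that reading, 150:4 and 500:8 come out at roughly 97.4% and 98.4% usable capacity, while 40:4 comes out at roughly 90.9%.

```python
from fractions import Fraction


def usable_fraction(data_fragments: int, parity_fragments: int) -> Fraction:
    """Share of raw capacity that holds user data in a k:m erasure-coded stripe.

    Assumption: the ratio is read as data:parity, so a stripe occupies
    data_fragments + parity_fragments fragments in total.
    """
    return Fraction(data_fragments, data_fragments + parity_fragments)


# Ratios quoted in the text, keyed by the capacity class they are suggested for.
RATIOS = {"small": (40, 4), "medium": (150, 4), "large": (500, 8)}

for name, (k, m) in RATIOS.items():
    share = float(usable_fraction(k, m))
    print(f"{name:>6}: {k}:{m} -> {share:.1%} usable, tolerates {m} lost fragments")
```

The parity count also bounds how many fragments of a stripe can be lost at once, so wider stripes trade fault tolerance per fragment for capacity efficiency.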
The invention makes full use of three new technologies: QLC, SCM and NVMe-oF. By redesigning the storage system around them, it eliminates the access bottleneck in performance, avoids complicated tiering mechanisms and the high price of ordinary SSDs, and can meet the storage requirements of today's big-data workloads.
Drawings
FIG. 1 is a system architecture diagram of the present invention;
FIG. 2 is a QLC/SCM enclosure configuration diagram in accordance with the present invention.
Detailed Description
In order that those skilled in the art can understand and implement the present invention, the following embodiments of the present invention will be further described with reference to the accompanying drawings.
Referring to FIG. 1, the present invention provides a novel distributed storage system using QLC as a storage medium, comprising a plurality of servers, a switch and a plurality of QLC/SCM chassis, wherein all the servers are connected to the QLC/SCM chassis through the switch, and each QLC/SCM chassis houses QLC disks and SCM disks and can be accessed by all the servers through the NVMe-oF protocol.
Wherein:
a. The server (Storage Server) is an x86 server, configured with CPUs and memory of different performance levels according to requirements, and runs distributed storage software.
b. The switch (10/100G switch) is an Ethernet switch or an IB switch.
c. The QLC/SCM chassis is equivalent to a disk expansion cabinet. It houses QLC disks and SCM disks, provides dual power supplies and dual 10/100G interfaces for high-availability access, and can be accessed by all Storage Servers through the NVMe-oF protocol.
The invention adopts stateless storage servers. The service programs take the form of stateless containers and can run on standard x86 servers; each server can access every NVMe flash and SCM storage device using NVMe over Fabrics, with low-latency access characteristics similar to DAS. This has the following advantages: ① stateless services do not need to interact with each other, so performance is very high, and when any service program goes down another service program can take over its work immediately; ② because the services are containers, they can be orchestrated and run by a container platform, which is efficient, and a container program can either occupy a server exclusively or be co-located with other applications on the same server; ③ the service programs can be deployed locally or run on a public cloud.
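The takeover behaviour described above follows from keeping all state in the shared chassis rather than in the servers. The following is a minimal sketch of that idea, not the patent's implementation: `SharedPool`, `StatelessService` and `dispatch` are hypothetical names, and a Python dictionary stands in for the NVMe-oF-attached devices.

```python
class SharedPool:
    """Stand-in for the QLC/SCM chassis that every server can reach."""

    def __init__(self):
        self._blocks = {}

    def put(self, key, value):
        self._blocks[key] = value

    def get(self, key):
        return self._blocks[key]


class StatelessService:
    """A service instance that keeps no data of its own, only a pool handle."""

    def __init__(self, name, pool):
        self.name = name
        self.pool = pool
        self.alive = True

    def handle(self, op, key, value=None):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        if op == "put":
            self.pool.put(key, value)
            return None
        return self.pool.get(key)


def dispatch(services, op, key, value=None):
    """Send the request to the first service that answers."""
    for service in services:
        try:
            return service.handle(op, key, value)
        except ConnectionError:
            continue  # no state to recover: simply try the next instance
    raise RuntimeError("no service available")


pool = SharedPool()
services = [StatelessService("server-a", pool), StatelessService("server-b", pool)]
dispatch(services, "put", "block-1", b"payload")   # served by server-a
services[0].alive = False                          # server-a goes down
recovered = dispatch(services, "get", "block-1")   # server-b takes over
print(recovered)
```

Because no instance holds data locally, failover needs no state transfer between servers; the surviving instance reads the same shared devices.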
FIG. 2 is a configuration diagram of a QLC/SCM chassis; the SCM disks and the QLC disks together form the storage space of the system. The data storage method in the system is as follows:
a. For newly written data: the data is cached on an SCM disk and, once the write is complete, all of it is written to a QLC disk in a single pass. All SCM disks in the system form a global write cache; after a series of processing steps, the cached data is written to the QLC disks.
b. For modified data: frequent modifications of hot data are made on the SCM disk, and the data is flushed to the QLC disk after it has cooled. Hot data in the system is accessed and modified frequently, so its modifications are handled in the SCM layer; once the modifications are complete, the data is flushed to the underlying QLC disk in a single pass.
c. Changing the way overwrites are handled: if the operation modifies old data, then when the data is written to the QLC disk a new data block is allocated and the data is appended to it, instead of overwriting in place on the QLC disk.
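Steps a to c can be sketched as a small in-memory model. This is an illustration under stated assumptions rather than the patented implementation: the class names, the tick-based clock and the `cool_after` threshold are hypothetical, and "cooled" is taken to mean "not modified for a fixed number of ticks", which the text does not define.

```python
from dataclasses import dataclass, field


@dataclass
class QlcLog:
    """Append-only stand-in for the QLC tier: blocks are never overwritten."""
    blocks: list = field(default_factory=list)   # physical blocks in write order
    index: dict = field(default_factory=dict)    # key -> position of the live block

    def append(self, key, value):
        self.blocks.append((key, value))         # step c: allocate a new block
        self.index[key] = len(self.blocks) - 1   # the older block becomes garbage

    def read(self, key):
        return self.blocks[self.index[key]][1]


@dataclass
class ScmWriteCache:
    """Stand-in for the global SCM write cache in front of the QLC tier."""
    qlc: QlcLog
    cool_after: int = 3                          # hypothetical cooling threshold
    dirty: dict = field(default_factory=dict)    # key -> (value, tick of last write)
    now: int = 0

    def write(self, key, value):
        self.dirty[key] = (value, self.now)      # steps a and b: absorb in SCM

    def read(self, key):
        if key in self.dirty:
            return self.dirty[key][0]
        return self.qlc.read(key)

    def tick(self):
        """Advance time and flush every entry that has cooled down."""
        self.now += 1
        cooled = [k for k, (_, t) in self.dirty.items()
                  if self.now - t >= self.cool_after]
        for key in cooled:
            value, _ = self.dirty.pop(key)
            self.qlc.append(key, value)          # one QLC write per cooled entry


cache = ScmWriteCache(QlcLog())
for version in range(5):          # five quick modifications of one hot key
    cache.write("hot", version)
    cache.tick()
for _ in range(3):                # no further writes: the key cools and is flushed
    cache.tick()
cache.write("hot", 99)            # later modification of already-flushed data
for _ in range(3):
    cache.tick()
print(len(cache.qlc.blocks), cache.read("hot"))
```

In this run six host writes reach the QLC tier as two appended blocks, and the second flush lands in a new block instead of overwriting the first. Reclaiming the stale block (garbage collection) is left out of the sketch.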
The invention uses QLC SSDs (QLC disks) as the storage medium, which offer large capacity, high performance and low cost; combined with suitably developed distributed storage software, they can meet the requirements that the big-data era places on storage systems. Compared with ordinary SSDs, QLC SSDs are much cheaper, and compared with mechanical disks they have very high read/write performance. They also have an obvious drawback: a QLC SSD endures only 1000 erase/write cycles, far fewer than a mechanical disk or an ordinary SSD. Used sensibly, however, so that its strengths are exploited and its weaknesses avoided, QLC gives good results. QLC SSD capacity keeps increasing, the price is relatively low and the performance is very good, so it is well suited to serve as the underlying storage medium.
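To see why the 1000-cycle limit makes the write pattern matter, a rough endurance budget helps. The sketch below uses the common approximation that total host writes are about capacity times erase/write cycles divided by write amplification. The drive capacity, the daily write volume and both write-amplification values are hypothetical illustration inputs, not figures from the patent.

```python
def lifetime_writes_tb(capacity_tb: float, pe_cycles: int,
                       write_amplification: float) -> float:
    """Approximate host data (TB) a drive can absorb before wearing out."""
    return capacity_tb * pe_cycles / write_amplification


def lifetime_years(capacity_tb: float, pe_cycles: int,
                   write_amplification: float,
                   host_writes_tb_per_day: float) -> float:
    """Approximate service life at a constant daily host write volume."""
    total = lifetime_writes_tb(capacity_tb, pe_cycles, write_amplification)
    return total / host_writes_tb_per_day / 365


CAPACITY_TB = 8.0        # hypothetical QLC drive size
PE_CYCLES = 1000         # erase/write cycle figure quoted in the text
DAILY_WRITES_TB = 2.0    # hypothetical host write volume per drive

for label, waf in [("in-place random overwrite", 4.0),
                   ("large appended writes", 1.0)]:
    years = lifetime_years(CAPACITY_TB, PE_CYCLES, waf, DAILY_WRITES_TB)
    print(f"{label}: WAF {waf} -> about {years:.1f} years")
```

With these inputs, cutting write amplification from 4 to 1 quadruples the budget, which is the effect the SCM write cache and append-only QLC writes aim for.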
SCM (storage class memory), also called persistent memory, is a hybrid storage technology that combines characteristics of SSD storage and memory: it provides faster read/write speeds than SSD storage at a lower cost than dynamic random-access memory (DRAM). It bridges the gap between the limited capacity of memory and the limited speed of flash, and can serve as a write cache layer in front of the QLC, which reduces the number of writes to the QLC.
The above embodiments express only several implementations of the present invention; their description is specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (7)

1. A novel distributed storage system using QLC as a storage medium, comprising a plurality of servers, a switch and a plurality of QLC/SCM chassis, wherein all the servers are connected to the QLC/SCM chassis through the switch, and each QLC/SCM chassis houses QLC disks and SCM disks and can be accessed by all the servers through the NVMe-oF protocol.
2. The novel distributed storage system of claim 1, wherein said QLC/SCM chassis provides dual power supplies and dual 10/100G interfaces.
3. The novel distributed storage system according to claim 1 or 2, wherein said switch is an Ethernet switch or an IB switch.
4. The novel distributed storage system according to claim 1 or 2, wherein the server is an x86 server, configured with CPUs and memory of different performance levels according to requirements, and runs distributed storage software.
5. The novel distributed storage system according to claim 1 or 2, wherein all SCM disks in the system form a global write cache, data is first cached on the SCM disks and, once the write is complete, all of the data is written to the QLC disks in a single pass.
6. A novel distributed storage method using QLC as a storage medium, comprising the steps of:
a data writing step: caching the data on an SCM disk and, once the write is complete, writing all of the data to a QLC disk in a single pass;
a data modification step: making frequent modifications of hot data on the SCM disk, and flushing the data to the QLC disk after it has cooled.
7. The novel distributed storage method according to claim 6, wherein in the data writing step, if the operation modifies old data, then when the data is written to the QLC disk a new data block is allocated and the data is appended to it.
CN201911210914.XA 2019-11-29 2019-11-29 Novel distributed storage system and method using QLC as storage medium Pending CN110888937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911210914.XA CN110888937A (en) 2019-11-29 2019-11-29 Novel distributed storage system and method using QLC as storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911210914.XA CN110888937A (en) 2019-11-29 2019-11-29 Novel distributed storage system and method using QLC as storage medium

Publications (1)

Publication Number Publication Date
CN110888937A true CN110888937A (en) 2020-03-17

Family

ID=69749846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911210914.XA Pending CN110888937A (en) 2019-11-29 2019-11-29 Novel distributed storage system and method using QLC as storage medium

Country Status (1)

Country Link
CN (1) CN110888937A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113190180A (en) * 2021-05-26 2021-07-30 北京自由猫科技有限公司 Storage device based on mixed media and distributed storage system
CN114489484A (en) * 2021-12-27 2022-05-13 得一微电子股份有限公司 Data storage method of SSD, terminal device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9910618B1 (en) * 2017-04-10 2018-03-06 Pure Storage, Inc. Migrating applications executing on a storage system
US20180095872A1 (en) * 2016-10-04 2018-04-05 Pure Storage, Inc. Distributed integrated high-speed solid-state non-volatile random-access memory
US10141050B1 (en) * 2017-04-27 2018-11-27 Pure Storage, Inc. Page writes for triple level cell flash memory

Similar Documents

Publication Publication Date Title
CN106528001B (en) A kind of caching system based on nonvolatile memory and software RAID
US9747318B2 (en) Retrieving data in a storage system using thin provisioning
CN101604226B (en) Method for building dynamic buffer pool to improve performance of storage system based on virtual RAID
Leventhal Flash storage memory
CN102843396B (en) Data write-in and read method and device in a kind of distributed cache system
WO2018019119A1 (en) Method and device for dynamic partial-parallel data layout for continuous data storage
CN108829341B (en) Data management method based on hybrid storage system
CN102306503B (en) Method and system for detecting false capacity memory
US9009392B2 (en) Leveraging a hybrid infrastructure for dynamic memory allocation and persistent file storage
CN109062505A (en) A kind of write performance optimization method under cache policy write-in layering hardware structure
CN102804151A (en) Memory agent to access memory blade as part of the cache coherency domain
CN111488125B (en) Cache Tier Cache optimization method based on Ceph cluster
CN110888937A (en) Novel distributed storage system and method using QLC as storage medium
CN104657461A (en) File system metadata search caching method based on internal memory and SSD (Solid State Disk) collaboration
Thereska et al. Sierra: a power-proportional, distributed storage system
CN105938447B (en) Data backup device and method
CN104778018A (en) Broad-strip disk array based on asymmetric hybrid type disk image and storage method of broad-strip disk array
Chen et al. Unified non-volatile memory and NAND flash memory architecture in smartphones
Xie et al. MICRO: A multilevel caching-based reconstruction optimization for mobile storage systems
Leventhal Flash Storage Today: Can flash memory become the foundation for a new tier in the storage hierarchy?
Xie et al. SAIL: self-adaptive file reallocation on hybrid disk arrays
CN103500147A (en) Embedded and layered storage method of PB-class cluster storage system
CN105630699B (en) A kind of solid state hard disk and read-write cache management method using MRAM
CN101807212B (en) Caching method for embedded file system and embedded file system
CN117056087A (en) Cloud data center hybrid memory optimization method, computer device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination