CN116450042A - Distributed storage method, system and device - Google Patents

Distributed storage method, system and device

Info

Publication number
CN116450042A
CN116450042A (application CN202310425670.7A)
Authority
CN
China
Prior art keywords
data
operation request
disk
read operation
access frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310425670.7A
Other languages
Chinese (zh)
Inventor
郭长伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310425670.7A priority Critical patent/CN116450042A/en
Publication of CN116450042A publication Critical patent/CN116450042A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/061 Improving I/O performance
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0629 Configuration or reconfiguration of storage systems
    • G06F 3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F 3/0656 Data buffering arrangements
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a distributed storage method, system and device, applied to the storage field, comprising the following steps: dividing the cache disk corresponding to a data disk in a storage pool into spaces according to the access frequency of the stored data; when a write operation request is received, determining the access frequency of the data corresponding to the request and storing that data into the space corresponding to its access frequency; when a read operation request is received, determining the access frequency of the data corresponding to the request, searching for the data in the space corresponding to that access frequency, and returning it to the sending end of the read operation request. Because the cache disk is divided into spaces by data access frequency and written data is placed in the corresponding space, the time spent searching for data when a read operation request arrives is shortened and the read operation is accelerated.

Description

Distributed storage method, system and device
Technical Field
The present invention relates to the field of storage, and in particular, to a method, system, and apparatus for distributed storage.
Background
Ceph is a decentralized distributed storage system offering good performance, reliability and scalability. A common optimization is to use an SSD (Solid State Drive) as a cache layer in front of an HDD (Hard Disk Drive), whose Input/Output (IO) speed is slower, so as to improve the effective IO rate of the HDD. One cache device (SSD) may provide caching for multiple back-end devices (HDDs) simultaneously. Even under this approach, read/write bandwidth and IOPS (Input/Output Operations Per Second) remain relatively low; increasing the read/write speed of a distributed storage system is therefore a problem to be solved.
Disclosure of Invention
The invention aims to provide a distributed storage method, a distributed storage system and a distributed storage device, which can shorten the data searching time and accelerate the reading operation speed.
In order to solve the technical problems, the present invention provides a method for distributed storage, including:
dividing a space for a cache disk corresponding to a data disk in a storage pool according to the access frequency of stored data;
when a write operation request is received, determining the access frequency of data corresponding to the write operation request;
storing the data of the write operation request into a space corresponding to the access frequency of the data of the write operation request;
when a read operation request is received, determining the access frequency of data corresponding to the read operation request;
searching the data of the read operation request from the space corresponding to the access frequency of the data of the read operation request, and returning the data of the read operation request to the sending end of the read operation request.
Preferably, dividing a space for a cache disk corresponding to a data disk in a storage pool according to an access frequency of stored data includes:
dividing the cache disk into a first space, a second space and a third space according to the access frequency of the stored data;
the first space is used for storing logs, the second space is used for storing data storage paths, the third space is used for storing cache data except the storage logs and the data storage paths, the access frequency of the storage logs is higher than that of the data storage paths, and the access frequency of the data storage paths is higher than that of the cache data.
Preferably, before dividing the space for the buffer memory disk corresponding to the data disk in the storage pool according to the access frequency of the stored data, the method further includes:
clearing signature data in the data disk and the cache disk;
formatting the data disk and the cache disk.
Preferably, after determining the access frequency of the data corresponding to the write operation request, the method further includes:
judging whether the size of the data corresponding to the write operation request is larger than a data threshold;
if so, storing the data corresponding to the write operation request directly to the data disk;
if not, entering the step of storing the data of the write operation request into the space corresponding to the access frequency of the data of the write operation request.
Preferably, searching the data requested by the read operation from the space corresponding to the access frequency of the data requested by the read operation includes:
determining a corresponding address of the data of the read operation request in a space corresponding to the access frequency of the data of the read operation request;
searching the data of the read operation request in a preset data capacity by taking the corresponding address as a starting address;
if the data of the read operation request is not found, searching for the data of the read operation request on the data disk and returning it to the sending end of the read operation request;
if the data of the read operation request is found, entering the step of returning the data of the read operation request to the sending end of the read operation request.
Preferably, before receiving the write operation request or the read operation request, the method further includes:
setting a cache policy between the data disk and the cache disk as a write-back policy, wherein the write-back policy is to write data into the cache disk first and then write the data into the data disk by the cache disk;
setting a percentage of the write-back strategy, wherein the percentage represents a proportion of dirty data written into the data disk in the cache disk, and the dirty data is modified data;
after the data of the write operation request is stored in the space corresponding to the access frequency of the data of the write operation request, the method further comprises the following steps:
judging whether the proportion of dirty data in the cache disk exceeds the percentage;
and if so, flushing data from the cache disk down to the data disk until the proportion of dirty data in the cache disk no longer exceeds the percentage.
Preferably, after storing the data of the write operation request in a space corresponding to the access frequency of the data of the write operation request, the method further includes:
judging whether the capacity of the data in the cache disk exceeds a capacity threshold;
and if so, flushing data from the cache disk down to the data disk until the capacity of the data in the cache disk no longer exceeds the capacity threshold.
Preferably, after setting the cache policy between the data disc and the cache disc to be a write-back policy, the method further includes:
and when the delay time of the write operation request exceeds the maximum delay time of the write operation request or the delay time of the read operation request exceeds the maximum delay time of the read operation request, reducing the number of the write operation requests or the number of the read operation requests received in a preset time.
In order to solve the technical problem, the present invention further provides a distributed storage system, including:
the dividing unit is used for dividing the space of the cache disk corresponding to the data disk in the storage pool according to the access frequency of the stored data;
a first determining unit, configured to determine, when a write operation request is received, an access frequency of data corresponding to the write operation request;
a storage unit, configured to store the data of the write operation request into a space corresponding to an access frequency of the data of the write operation request;
a second determining unit, configured to determine, when a read operation request is received, an access frequency of data corresponding to the read operation request;
a searching unit, configured to search for the data requested by the read operation from a space corresponding to the access frequency of the data requested by the read operation;
and the return unit is used for returning the data of the read operation request to the sending end of the read operation request.
In order to solve the technical problem, the present invention further provides a distributed storage device, including:
a memory for storing a computer program;
and a processor for executing the steps of the above-described distributed storage method when the computer program is executed.
The application provides a method, a system and a device for distributed storage, applied to the storage field, comprising the following steps: dividing the cache disk corresponding to a data disk in a storage pool into spaces according to the access frequency of the stored data; when a write operation request is received, determining the access frequency of the data corresponding to the request and storing that data into the space corresponding to its access frequency; when a read operation request is received, determining the access frequency of the data corresponding to the request, searching for the data in the space corresponding to that access frequency, and returning it to the sending end of the read operation request. Because the cache disk is divided into spaces by data access frequency and written data is placed in the corresponding space, the time spent searching for data when a read operation request arrives is shortened and the read operation is accelerated.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the prior art and the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a method of distributed storage provided by the present invention;
FIG. 2 is a schematic diagram of a distributed storage system according to the present invention;
fig. 3 is a schematic structural diagram of a distributed storage device according to the present invention.
Detailed Description
The core of the invention is to provide a distributed storage method, system and device that shorten the time spent searching for data and accelerate the read operation speed.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Ceph is a decentralized distributed storage system offering good performance, reliability and scalability. A common optimization is to use an SSD (Solid State Drive) as a cache layer in front of an HDD (Hard Disk Drive), whose Input/Output (IO) speed is slower, so as to improve the effective IO rate of the HDD. One cache device (SSD) may provide caching for multiple back-end devices (HDDs) simultaneously. Even under this approach, read/write bandwidth and IOPS (Input/Output Operations Per Second) remain relatively low; increasing the read/write speed of a distributed storage system is therefore a problem to be solved.
FIG. 1 is a flow chart of a method of distributed storage according to the present invention, the method of distributed storage comprising:
s11: dividing a space for a cache disk corresponding to a data disk in a storage pool according to the access frequency of stored data;
it can be understood that the cache disk stores multiple types of data, and the more data a space holds, the longer a lookup takes and thus the longer read and write operations take. Since mixing many data types in one cache disk makes lookups slow, the cache disk is divided into spaces according to data access frequency, so that data of different access frequencies are not stored in the same space and the time spent searching for data is shortened. Specific partitioning schemes include, but are not limited to, the above partitioning by data access frequency, which is not restricted here.
S12: when a write operation request is received, determining the access frequency of data corresponding to the write operation request;
S13: storing the data of the write operation request into the space corresponding to the access frequency of the data of the write operation request;
when a write operation request is received, in order to accurately write data into a corresponding space, the access frequency of the data, which needs to be written into a cache disk by the write operation request, needs to be determined first, and the data is stored into the corresponding space after the access frequency of the data is determined.
S14: when a read operation request is received, determining the access frequency of data corresponding to the read operation request;
S15: searching the data of the read operation request from the space corresponding to the access frequency of the data of the read operation request, and returning the data of the read operation request to the sending end of the read operation request.
Accordingly, since data is written into the space corresponding to its access frequency during write operations, the data can be found in the corresponding space when a read operation request is received. Before searching, the access frequency of the data corresponding to the read operation request is determined; the data is then located in the corresponding space according to that access frequency and returned to the sending end of the read operation request.
Because data is stored into the corresponding space according to its access frequency during write operations, the time spent searching for data during read operations is reduced and the read operation speed is improved.
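The routing described above (write into the space matching the data's access frequency, then read from that same space) can be sketched as follows. This is an illustrative model, not the patent's implementation: the three-level split and all class and function names are assumptions.

```python
from enum import Enum

class Freq(Enum):
    """Access-frequency levels (hypothetical names for the three spaces)."""
    HIGH = "wal"      # e.g. storage logs
    MEDIUM = "db"     # e.g. data storage paths / metadata
    LOW = "cache"     # all other cached data

class CacheDisk:
    def __init__(self):
        # one key-value store per access-frequency space
        self.spaces = {f: {} for f in Freq}

    def write(self, key, value, freq):
        # S12/S13: place the data into the space matching its access frequency
        self.spaces[freq][key] = value

    def read(self, key, freq):
        # S14/S15: only the matching space is searched, which shortens the
        # lookup compared with scanning all data on the cache disk
        return self.spaces[freq].get(key)

disk = CacheDisk()
disk.write("journal-1", b"log entry", Freq.HIGH)
assert disk.read("journal-1", Freq.HIGH) == b"log entry"
assert disk.read("journal-1", Freq.LOW) is None  # other spaces never scanned
```

The key design point the sketch illustrates is that the access-frequency class narrows the search domain before any lookup happens.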
It should also be noted that the method for distributed storage provided in the present application is applied to an operating system of a server, and the operating system executes the above steps.
The application provides a distributed storage method, which is applied to the field of storage and comprises the following steps: dividing a space for a cache disk corresponding to a data disk in a storage pool according to the access frequency of stored data; when a write operation request is received, determining the access frequency of data corresponding to the write operation request; storing the data of the write operation request into a space corresponding to the access frequency of the data of the write operation request; when a read operation request is received, determining the access frequency of data corresponding to the read operation request; searching the data of the read operation request from the space corresponding to the access frequency of the data of the read operation request, and returning the data of the read operation request to the sending end of the read operation request. After the buffer memory disk is divided into spaces according to the access frequency of the data, the data is stored in the corresponding space when the write operation is received, the data searching time is shortened when the read operation request is received, and the read operation speed is increased.
Based on the above embodiments:
as a preferred embodiment, dividing a space for a cache disk corresponding to a data disk in a storage pool according to an access frequency of stored data includes:
dividing a cache disk into a first space, a second space and a third space according to the access frequency of the stored data;
the first space is used for storing logs, the second space is used for storing data storage paths, and the third space is used for storing cached data other than the storage logs and data storage paths; the access frequency of the storage logs is higher than that of the data storage paths, and the access frequency of the data storage paths is higher than that of the cached data.
Considering that different data have different access frequencies, access frequency can be divided into three levels, each level having its own corresponding space.
Specifically, three spaces are divided:
The first space is the ultra-high-speed space (WAL): it mainly stores log files, such as storage logs, and its capacity is generally 15 GB;
The second space is the high-speed space (DB): it stores metadata generated internally by BlueStore, such as search directories or the directory structure, and its capacity is generally 30 GB;
The third space is the cache space (Cache): it stores cached data other than the above two types, and its capacity is computed as:
Cache = (cache-disk single-disk capacity × number of cache disks) / number of data disks − capacity of the ultra-high-speed space (WAL) − capacity of the high-speed space (DB).
The three spaces have the same read-write speed, different storage spaces and different types of stored data.
By dividing the cache disk into the three spaces, when the access requests of different types of data are received, the corresponding space can be searched for the data, and the reading operation can be accelerated.
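The capacity formula above can be written as a small helper; the WAL and DB defaults follow the 15 GB and 30 GB figures in the text, while the disk counts in the example are hypothetical:

```python
def cache_space_gib(ssd_capacity_gib, ssd_count, hdd_count,
                    wal_gib=15, db_gib=30):
    """Per-data-disk capacity of the third (Cache) space:
    Cache = (SSD single-disk capacity * SSD count) / HDD count - WAL - DB.
    """
    return (ssd_capacity_gib * ssd_count) / hdd_count - wal_gib - db_gib

# e.g. two 960 GiB SSDs caching twelve HDDs (numbers are hypothetical):
# 960 * 2 / 12 = 160 GiB per HDD, minus 15 GiB WAL and 30 GiB DB
assert cache_space_gib(960, 2, 12) == 115.0
```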
As a preferred embodiment, before dividing the space for the buffer disk corresponding to the data disk in the storage pool according to the access frequency of the stored data, the method further includes:
clearing signature data in the data disk and the cache disk;
the data disk is formatted and the disk is cached.
Distributed storage allocates space in fixed-size units. Taking 1M allocation units as an example, if 4K of data needs to be stored, a 1M unit is allocated, the 4K of data is stored in it, and the remainder stays idle. The cache disk must be formatted before being partitioned, aligning every data address in the cache disk.
Likewise, before the distributed storage is put into use, the data disks are also formatted for storing data. The signature data is cleared first because, if stale signatures remain, the subsequent partitioning and formatting steps will report errors.
Specifically, the execution code is as follows:
a) Clear cache disk (SSD) signatures:
All SSD disk signatures are cleared using the wipefs command; a specific command reference is as follows:
wipefs -a /dev/nvme0n1
b) Format cache disk (SSD) disks:
All SSD disks are formatted using the nvme command; a specific command reference is as follows:
nvme format /dev/nvme0n1 -s 1 -f
Adjust the data layout of the data disks, i.e., format the HDD disks:
a) Clear data disk (HDD) signatures:
All HDD disk signatures are cleared using the wipefs command; a specific command reference is as follows:
wipefs -a /dev/sda
b) Format data disks (HDD):
The dd command is used to overwrite the first 20M of each HDD disk; a specific command reference is as follows:
dd if=/dev/zero of=/dev/sda bs=1M count=20
since the data stored in the front 20M of the HDD disk is often pointer data, when the data of the front 20M is cleared, it can be understood that the HDD disk is emptied.
In addition, placement group weight balancing is performed on the distributed storage system, as follows:
a) ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it: allow pg (placement group) balancing operations;
b) ceph osd getmap -o osd.map: obtain the map of all OSDs;
c) osdmaptool osd.map --upmap out.txt --upmap-pool rbdpool --upmap-max 2000: compute a pg balancing plan for the storage pool rbdpool;
d) source out.txt: apply the pg balancing results.
As a preferred embodiment, after determining the access frequency of the data corresponding to the write operation request, the method further includes:
judging whether the size of the data corresponding to the write operation request is larger than a data threshold;
if so, storing the data corresponding to the write operation request directly to the data disk;
if not, proceeding to the step of storing the data of the write operation request into the space corresponding to the access frequency of the data of the write operation request.
The value of the parameter "sequential_cutoff" is preset; the set value is the data threshold. Once a sequential IO crosses this threshold, it bypasses the cache disk and is saved directly to the data disk.
When a write operation request is received, in addition to determining the access frequency of the data corresponding to the write operation request, the size of the data needs to be determined, and if the data threshold set by the sequential_cutoff parameter is exceeded, the data will not be written into the cache disk, but directly written into the data disk.
It should be noted that in the present application the parameter sequential_cutoff is set to 4M, i.e. data not exceeding 4M may be saved to the cache disk; the data threshold can be set according to actual requirements and is not restricted here.
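A minimal sketch of the sequential_cutoff decision described above, assuming the 4M threshold from the text; the function name is invented for illustration:

```python
SEQUENTIAL_CUTOFF = 4 * 1024 * 1024  # 4M threshold, as set in the text

def write_target(request_size_bytes):
    """Writes larger than the threshold bypass the cache disk and go
    straight to the data disk; smaller writes go to the cache disk."""
    if request_size_bytes > SEQUENTIAL_CUTOFF:
        return "data_disk"
    return "cache_disk"

assert write_target(64 * 1024) == "cache_disk"          # small write cached
assert write_target(8 * 1024 * 1024) == "data_disk"     # large write bypasses
```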
As a preferred embodiment, searching for data of a read operation request from a space corresponding to an access frequency of the data of the read operation request includes:
determining the corresponding address of the data of the read operation request in the space corresponding to the access frequency of the data of the read operation request;
searching data of a read operation request in a preset data capacity by taking a corresponding address as a starting address;
if the data of the read operation request is not found, the data of the read operation request is searched from the data disk, and the data of the read operation request is returned to a sending end of the read operation request;
if the data of the read operation request is found, a step of returning the data of the read operation request to the sending end of the read operation request is entered.
The value of the parameter readahead is set; the set value is the preset data capacity. It will be appreciated that after the space corresponding to the data's access frequency is determined, a corresponding address is also determined, but this address is not the exact saved address of the data: it may be the start address of the space or a preset location. To improve the read operation speed, the cache disk is not searched across all addresses; instead the search starts at the corresponding address and covers only the preset data capacity. If the data is found, it is returned to the sending end of the read operation request; if the data is not found within the preset data capacity, it is read from the data disk and returned to the sending end of the read operation request.
It should be noted that in the present application the parameter readahead is set to 1M, i.e. 1M of data following the corresponding start address is searched; the preset data capacity can be set according to actual requirements and is not restricted here.
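The bounded lookup described above (search only a readahead-sized window starting at the corresponding address, then fall back to the data disk) might be modelled as follows; the flat (addr, key, value) representation of a space is an assumption made for illustration:

```python
READAHEAD = 1 * 1024 * 1024  # 1M search window, as set in the text

def bounded_lookup(space, start_addr, key):
    """Search only [start_addr, start_addr + READAHEAD) of the space.
    `space` is modelled as a list of (addr, key, value) entries.
    Returns (value, source): the caller reads from the data disk when the
    key is not found inside the window."""
    for addr, k, v in space:
        if start_addr <= addr < start_addr + READAHEAD and k == key:
            return v, "cache_disk"
    return None, "data_disk"

space = [(0, "a", b"x"), (2 * 1024 * 1024, "b", b"y")]
assert bounded_lookup(space, 0, "a") == (b"x", "cache_disk")
# "b" sits at 2M, outside the 1M window, so it falls back to the data disk
assert bounded_lookup(space, 0, "b") == (None, "data_disk")
```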
As a preferred embodiment, before receiving the write operation request or the read operation request, the method further includes:
setting a cache policy between the data disk and the cache disk as a write-back policy, wherein the write-back policy is to write data into the cache disk firstly and then write the data into the data disk by the cache disk;
setting a percentage of a write-back strategy, wherein the percentage represents the proportion of dirty data written into a data disk in a cache disk, and the dirty data is modified data;
after storing the data of the write operation request into the space corresponding to the access frequency of the data of the write operation request, the method further comprises:
judging whether the proportion of dirty data of the cache disk exceeds the percentage;
if so, the data in the cache disk is flushed down to the data disk until the proportion of dirty data in the cache disk no longer exceeds the percentage.
There are various caching policies between the data disk and the cache disk, for example:
writeback: write-back strategy; all data is written to the cache disk first, and the system later writes it back to the back-end data disk;
writethrough: write-through strategy (the default); data is written to the cache disk and the back-end data disk at the same time;
writearound: data is written directly to the back-end data disk.
The parameter "cache_mode" can be one of writethrough, writeback and writearound; here it is set to writeback to achieve a higher read operation speed.
The parameter cache_mode is set; a specific command reference is as follows:
echo "writeback" > /sys/block/bcache0/bcache/cache_mode
The parameter writeback_percent is set; its value represents the percentage of the write-back strategy.
Taking setting writeback_percent to 40, i.e. a write-back percentage of 40%, a specific command reference is as follows:
echo 40 > /sys/block/bcache0/bcache/writeback_percent
The parameter "writeback_running" represents the write-back state of dirty data, where 1 is on and 0 is off: if it is off, no write-back of dirty data occurs, and dirty data keeps accumulating in the cache until the cache is nearly full. Taking setting the write-back state to on, that is, setting "writeback_running" to 1, as an example:
echo 1 > /sys/block/bcache0/bcache/writeback_running
After the caching strategy is set to the write-back strategy and the data corresponding to a write operation has been stored in the corresponding space, it is judged whether the proportion of dirty data in the cache disk exceeds the percentage; if it does, data is flushed down to the data disk until the proportion of dirty data in the cache disk falls below the percentage.
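The dirty-ratio check just described can be sketched in a few lines. This is a simplified model under assumed data structures (a list of cache entries with a `dirty` flag), not the actual bcache writeback thread:

```python
def flush_dirty(cache: list, writeback_percent: int = 40) -> list:
    """Flush dirty cache entries until the dirty proportion no longer
    exceeds writeback_percent; returns the keys that were written back."""
    flushed = []

    def dirty_ratio() -> float:
        return 100 * sum(e["dirty"] for e in cache) / max(len(cache), 1)

    while dirty_ratio() > writeback_percent:
        entry = next(e for e in cache if e["dirty"])  # pick any dirty entry
        entry["dirty"] = False                        # written back to the data disk
        flushed.append(entry["key"])
    return flushed
```

With the 40% setting from the example commands, a cache that is 60% dirty would flush entries until the ratio drops to 40%.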
As a preferred embodiment, after storing the data of the write operation request in the space corresponding to the access frequency of the data of the write operation request, the method further includes:
judging whether the capacity of the data in the cache disk exceeds a capacity threshold value;
if so, the data in the cache disk is flushed to the data disk until the capacity of the data in the cache disk does not exceed the capacity threshold.
Because the capacity of the cache disk is limited, when the amount of data in the cache disk grows too large and reaches the capacity threshold, the read and write speeds of the cache disk decrease, so the cache disk needs to be flushed: data is flushed down to the data disk until the data capacity of the cache disk is below the capacity threshold.
Specifically, the capacity threshold may be set to a value according to the capacity of the cache disk, or set to a percentage, which is not limited herein.
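The capacity check can be sketched as below. The oldest-first eviction order is an assumption made for the example; the text only requires flushing until the capacity threshold is no longer exceeded:

```python
def flush_for_capacity(cache: list, capacity_threshold: int) -> list:
    """Flush entries (oldest first, an assumed policy) to the data disk
    until the total cached bytes no longer exceed capacity_threshold."""
    evicted = []
    while sum(size for _, size in cache) > capacity_threshold:
        evicted.append(cache.pop(0))  # flush the oldest entry down
    return evicted
```

The threshold itself may be an absolute byte count or a percentage of the cache disk, as the text notes.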
As a preferred embodiment, after setting the write-back strategy between the data disc and the cache disc, the method further comprises:
when the delay time of the write operation request exceeds the maximum delay time of the write operation request or the delay time of the read operation request exceeds the maximum delay time of the read operation request, the number of the write operation requests or the number of the read operation requests received in the preset time is reduced.
The preset parameter "congested_read_threshold_us" characterizes the maximum delay time of a read operation, i.e., the read congestion control term: when the delay exceeds this threshold, traffic is gradually reduced. Taking setting the maximum read delay to 20000 microseconds as an example:
echo 20000 > /sys/fs/bcache/*/congested_read_threshold_us
Similarly, the parameter "congested_write_threshold_us" characterizes the maximum delay time of a write operation, i.e., the write congestion control term, and traffic is gradually reduced when the delay exceeds this threshold. Taking setting the maximum write delay to 200000 microseconds as an example:
echo 200000 > /sys/fs/bcache/*/congested_write_threshold_us
when it is determined that the delay time of the write operation exceeds the maximum delay time of the write operation request, it is necessary to reduce the number of received write operation requests. When it is determined that the delay time of the read operation exceeds the maximum delay time of the read operation request, it is necessary to reduce the number of received read operation requests.
Specifically, the operating system receives read or write operation requests and executes them correspondingly; if the delay of a read or write operation is found to exceed the corresponding maximum delay time, the requests are queued, and the user is prompted to reduce the number of read or write operation requests issued within the preset time.
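The gradual traffic reduction can be sketched as a simple feedback rule. The halving/recovery policy here is an assumption for illustration; the text only states that the accepted request count is reduced when latency exceeds its threshold:

```python
CONGESTED_READ_THRESHOLD_US = 20_000    # from the example read setting
CONGESTED_WRITE_THRESHOLD_US = 200_000  # from the example write setting

def next_request_budget(current_budget: int, latency_us: int,
                        threshold_us: int) -> int:
    """Reduce the number of requests accepted per interval when observed
    latency exceeds its threshold; otherwise slowly recover capacity."""
    if latency_us > threshold_us:
        return max(1, current_budget // 2)  # halve the accepted request count
    return current_budget + 1               # gradually restore capacity
```

The same rule is applied separately to the read and write thresholds.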
It should be noted that the present application also configures the values of several parameters; after these values are configured, the read operation speed and the write operation speed can be improved.
Specifically, the method comprises the following eight parameters:
1. mon_clock_drift_allowed: determines the clock deviation between nodes allowed by the system; this value affects the stability of the Ceph monitors and the Ceph storage cluster. It is set to 2;
2. mon_clock_drift_warn_backoff: after a time inconsistency is detected, the cluster raises an alarm once the mon_clock_drift_warn_backoff count is reached. It is set to 30;
These two parameters affect the stability of the distributed storage system and are inversely related to the read and write speeds: the larger their values, the slower the read and write operations.
3. osd_client_message_cap: the maximum number of client messages allowed in memory, i.e., the number of client requests accepted by the OSD network layer; set to 1000;
4. osd_client_message_size_cap: the maximum client data (in bytes) allowed in memory, i.e., the size of client requests accepted by the OSD network layer; set to 2 GB;
5. osd_op_num_shards_hdd: the number of operation queues (shards) for an HDD data disk; set to 10;
6. osd_op_num_shards_ssd: the number of operation queues (shards) for an SSD cache disk; set to 16;
7. osd_op_num_threads_per_shard_hdd: the number of threads per queue in each data disk; the product of osd_op_num_shards_hdd and osd_op_num_threads_per_shard_hdd is the number of threads the OSD process uses to handle IO requests. The default is 5*1; modifying it to 10*2 ensures that maximum performance is exerted. This parameter is set to 2;
8. osd_op_num_threads_per_shard_ssd: the number of threads per queue in each cache disk, similar in function to osd_op_num_threads_per_shard_hdd; set to 4.
The latter six parameters are positively correlated with the read and write speeds: the larger their values, the faster the read and write operations.
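The eight parameter values above might be collected in a ceph.conf fragment such as the following sketch. Section placement and value formats are assumptions and may vary by Ceph release; the 2 GB cap is expressed in bytes:

```ini
[global]
mon_clock_drift_allowed = 2
mon_clock_drift_warn_backoff = 30

[osd]
osd_client_message_cap = 1000
osd_client_message_size_cap = 2147483648   # 2 GB, in bytes
osd_op_num_shards_hdd = 10
osd_op_num_shards_ssd = 16
osd_op_num_threads_per_shard_hdd = 2
osd_op_num_threads_per_shard_ssd = 4
```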
Fig. 2 is a schematic structural diagram of a distributed storage system according to the present invention, where the distributed storage system includes:
a dividing unit 21, configured to divide a space for a cache disk corresponding to a data disk in a storage pool according to an access frequency of stored data;
a first determining unit 22, configured to determine, when a write operation request is received, an access frequency of data corresponding to the write operation request;
a saving unit 23, configured to save the data of the write operation request into a space corresponding to the access frequency of the data of the write operation request;
a second determining unit 24, configured to determine, when a read operation request is received, an access frequency of data corresponding to the read operation request;
a searching unit 25, configured to search for data requested by a read operation from a space corresponding to an access frequency of the data requested by the read operation;
and a return unit 26, configured to return the data of the read operation request to the sending end of the read operation request.
Based on the above embodiments:
a dividing unit 21, specifically configured to divide the buffer disk into a first space, a second space, and a third space according to the access frequency of the stored data;
the first space is used for storing logs, the second space is used for storing data storage paths, and the third space is used for storing cache data other than the logs and the data storage paths; the access frequency of the logs is higher than that of the data storage paths, and the access frequency of the data storage paths is higher than that of the cache data.
Further comprises:
a clearing unit, configured to clear the signature data in the data disk and the cache disk;
and the formatting unit is used for formatting the data disc and the cache disc.
a first judging unit, configured to judge whether the size of the data corresponding to the write operation request is greater than a data threshold; if yes, trigger the second storage unit; if not, trigger the saving unit 23;
the second storage unit is configured to store the data corresponding to the write operation request directly to the data disk;
a third determining unit, configured to determine a corresponding address of the data of the read operation request in a space corresponding to the access frequency of the data of the read operation request;
the searching unit 25 is specifically configured to search for the data of the read operation request within a preset data capacity, taking the corresponding address as the start address; if the data of the read operation request is not found, trigger the second searching unit; if the data of the read operation request is found, trigger the return unit 26;
the second searching unit is used for searching the data of the read operation request from the data disk and returning the data of the read operation request to the sending end of the read operation request;
the first setting unit is used for setting a cache strategy between the data disk and the cache disk as a write-back strategy, wherein the write-back strategy is to write data into the cache disk firstly and then write the data into the data disk by the cache disk;
the second setting unit is configured to set a percentage for the write-back strategy, wherein the percentage represents the threshold proportion of dirty data in the cache disk at which dirty data is written back to the data disk, and dirty data is modified data;
a second judging unit, configured to judge whether the proportion of dirty data in the cache disk exceeds the percentage; if yes, trigger the first flushing unit;
the first flushing unit is configured to flush the data in the cache disk down to the data disk until the proportion of dirty data in the cache disk does not exceed the percentage.
a third judging unit, configured to judge whether the capacity of the data in the cache disk exceeds a capacity threshold; if yes, trigger the second flushing unit;
the second flushing unit is configured to flush the data in the cache disk down to the data disk until the capacity of the data in the cache disk does not exceed the capacity threshold.
And the adjusting unit is used for reducing the number of the received write operation requests or the number of the read operation requests in the preset time when the delay time of the write operation requests exceeds the maximum delay time of the write operation requests or the delay time of the read operation requests exceeds the maximum delay time of the read operation requests.
The application provides a system of distributed storage, is applied to the storage field, includes: dividing a space for a cache disk corresponding to a data disk in a storage pool according to the access frequency of stored data; when a write operation request is received, determining the access frequency of data corresponding to the write operation request; storing the data of the write operation request into a space corresponding to the access frequency of the data of the write operation request; when a read operation request is received, determining the access frequency of data corresponding to the read operation request; searching the data of the read operation request from the space corresponding to the access frequency of the data of the read operation request, and returning the data of the read operation request to the sending end of the read operation request. After the buffer memory disk is divided into spaces according to the access frequency of the data, the data is stored in the corresponding space when the write operation is received, the data searching time is shortened when the read operation request is received, and the read operation speed is increased.
Fig. 3 is a schematic structural diagram of a distributed storage device according to the present invention. The distributed storage device comprises:
a memory 31 for storing a computer program;
a processor 32 for implementing the steps of the above-described method of distributed storage when executing a computer program.
The steps implemented when the processor 32 executes a computer program are as follows:
s11: dividing a space for a cache disk corresponding to a data disk in a storage pool according to the access frequency of stored data;
s12: when a write operation request is received, determining the access frequency of data corresponding to the write operation request;
s13: storing the data of the write operation request into a space corresponding to the access frequency of the data of the write operation request;
s14: when a read operation request is received, determining the access frequency of data corresponding to the read operation request;
s15: searching the data of the read operation request from the space corresponding to the access frequency of the data of the read operation request, and returning the data of the read operation request to the sending end of the read operation request.
Based on the above embodiments:
dividing a space for a cache disk corresponding to a data disk in a storage pool according to the access frequency of stored data, wherein the method comprises the following steps:
dividing a cache disk into a first space, a second space and a third space according to the access frequency of the stored data;
the first space is used for storing logs, the second space is used for storing data storage paths, and the third space is used for storing cache data other than the logs and the data storage paths; the access frequency of the logs is higher than that of the data storage paths, and the access frequency of the data storage paths is higher than that of the cache data.
Before the buffer memory disk corresponding to the data disk in the storage pool divides the space according to the access frequency of the stored data, the method further comprises the following steps:
clearing signature data in the data disk and the cache disk;
formatting the data disk and the cache disk.
After determining the access frequency of the data corresponding to the write operation request, the method further comprises the following steps:
judging whether the size of the data corresponding to the write operation request is larger than a data threshold value or not;
if it is larger, directly storing the data corresponding to the write operation request to the data disk;
if not, the method proceeds to a step of storing the data of the write operation request in a space corresponding to the access frequency of the data of the write operation request.
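The size-based routing of a write request can be sketched as below. The 4 MB threshold and the frequency-to-space mapping are hypothetical values chosen for the example; the patent does not fix them:

```python
DATA_THRESHOLD = 4 * 1024 * 1024  # hypothetical data threshold: 4 MB

def route_write(size: int, access_freq: str) -> str:
    """Decide where a write lands: writes larger than the data threshold
    bypass the cache disk; smaller writes go to the cache space that
    matches the access frequency of their data."""
    if size > DATA_THRESHOLD:
        return "data_disk"
    return {"high": "space1", "medium": "space2", "low": "space3"}[access_freq]
```

Large writes thus avoid polluting the cache disk, while small writes land in the partitioned space so later reads can find them quickly.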
Searching the data of the read operation request from the space corresponding to the access frequency of the data of the read operation request comprises the following steps:
determining a corresponding address of the data of the read operation request in the space corresponding to the access frequency of the data of the read operation request;
searching the data of the read operation request within a preset data capacity, taking the corresponding address as a start address;
if the data of the read operation request is not found, the data of the read operation request is searched from the data disk, and the data of the read operation request is returned to a sending end of the read operation request;
if the data of the read operation request is found, a step of returning the data of the read operation request to the sending end of the read operation request is entered.
Before receiving the write operation request or the read operation request, the method further comprises:
setting a cache policy between the data disk and the cache disk as a write-back policy, wherein the write-back policy is to write data into the cache disk firstly and then write the data into the data disk by the cache disk;
setting a percentage for the write-back strategy, wherein the percentage represents the threshold proportion of dirty data in the cache disk at which dirty data is written back to the data disk, and dirty data is modified data;
after storing the data of the write operation request into the space corresponding to the access frequency of the data of the write operation request, the method further comprises:
judging whether the proportion of dirty data of the cache disk exceeds the percentage;
if so, the data in the cache disk is flushed down to the data disk until the proportion of dirty data in the cache disk does not exceed the percentage.
After storing the data of the write operation request into the space corresponding to the access frequency of the data of the write operation request, the method further comprises:
judging whether the capacity of the data in the cache disk exceeds a capacity threshold;
if so, the data in the cache disk is flushed to the data disk until the capacity of the data in the cache disk does not exceed the capacity threshold.
After setting the cache policy between the data disc and the cache disc as the write-back policy, the method further comprises:
when the delay time of the write operation request exceeds the maximum delay time of the write operation request or the delay time of the read operation request exceeds the maximum delay time of the read operation request, the number of the write operation requests or the number of the read operation requests received in the preset time is reduced.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and relevant details can be found in the description of the method.
It should also be noted that in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of distributed storage, comprising:
dividing a space for a cache disk corresponding to a data disk in a storage pool according to the access frequency of stored data;
when a write operation request is received, determining the access frequency of data corresponding to the write operation request;
storing the data of the write operation request into a space corresponding to the access frequency of the data of the write operation request;
when a read operation request is received, determining the access frequency of data corresponding to the read operation request;
searching the data of the read operation request from the space corresponding to the access frequency of the data of the read operation request, and returning the data of the read operation request to the sending end of the read operation request.
2. The method of distributed storage according to claim 1, wherein partitioning a space for a cache disk corresponding to a data disk in a storage pool according to an access frequency of stored data, comprises:
dividing the cache disk into a first space, a second space and a third space according to the access frequency of the stored data;
the first space is used for storing logs, the second space is used for storing data storage paths, and the third space is used for storing cache data other than the logs and the data storage paths; the access frequency of the logs is higher than that of the data storage paths, and the access frequency of the data storage paths is higher than that of the cache data.
3. The method of distributed storage according to claim 1, wherein before dividing a space for a cache disk corresponding to a data disk in a storage pool according to an access frequency of stored data, further comprising:
clearing signature data in the data disk and the cache disk;
formatting the data disk and the cache disk.
4. The method of distributed storage of claim 1, wherein after determining the access frequency of the data corresponding to the write operation request, further comprising:
judging whether the size of the data corresponding to the write operation request is larger than a data threshold value or not;
if it is larger, directly storing the data corresponding to the write operation request to the data disk;
if not, the method enters a step of storing the data of the write operation request into a space corresponding to the access frequency of the data of the write operation request.
5. The method of distributed storage of claim 1, wherein finding the data of the read operation request from within a space corresponding to an access frequency of the data of the read operation request comprises:
determining a corresponding address of the data of the read operation request in a space corresponding to the access frequency of the data of the read operation request;
searching the data of the read operation request within a preset data capacity, taking the corresponding address as a starting address;
if the data of the read operation request is not found, the data of the read operation request is found from the data disk, and the data of the read operation request is returned to the sending end of the read operation request;
and if the data of the read operation request is found, entering a step of returning the data of the read operation request to the sending end of the read operation request.
6. The method of distributed storage of any of claims 1 to 5, further comprising, prior to receiving a write operation request or a read operation request:
setting a cache policy between the data disk and the cache disk as a write-back policy, wherein the write-back policy is to write data into the cache disk first and then write the data into the data disk by the cache disk;
setting a percentage for the write-back strategy, wherein the percentage represents the threshold proportion of dirty data in the cache disk at which dirty data is written back to the data disk, and the dirty data is modified data;
after the data of the write operation request is stored in the space corresponding to the access frequency of the data of the write operation request, the method further comprises the following steps:
judging whether the proportion of dirty data of the cache disk exceeds the percentage;
and if so, flushing the data in the cache disk down to the data disk until the proportion of dirty data of the cache disk does not exceed the percentage.
7. The method of distributed storage according to claim 6, further comprising, after storing the data of the write operation request in a space corresponding to an access frequency of the data of the write operation request:
judging whether the capacity of the data in the cache disk exceeds a capacity threshold;
and if so, flushing the data in the cache disk down to the data disk until the capacity of the data in the cache disk does not exceed the capacity threshold.
8. The method of distributed storage according to claim 6, wherein after setting the cache policy between the data disk and the cache disk to a write-back policy, further comprising:
and when the delay time of the write operation request exceeds the maximum delay time of the write operation request or the delay time of the read operation request exceeds the maximum delay time of the read operation request, reducing the number of the write operation requests or the number of the read operation requests received in a preset time.
9. A system for distributed storage, comprising:
the dividing unit is used for dividing the space of the cache disk corresponding to the data disk in the storage pool according to the access frequency of the stored data;
a first determining unit, configured to determine, when a write operation request is received, an access frequency of data corresponding to the write operation request;
a storage unit, configured to store the data of the write operation request into a space corresponding to an access frequency of the data of the write operation request;
a second determining unit, configured to determine, when a read operation request is received, an access frequency of data corresponding to the read operation request;
a searching unit, configured to search for the data requested by the read operation from a space corresponding to the access frequency of the data requested by the read operation;
and the return unit is used for returning the data of the read operation request to the sending end of the read operation request.
10. An apparatus for distributed storage, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method of distributed storage according to any one of claims 1 to 8 when executing the computer program.
CN202310425670.7A 2023-04-20 2023-04-20 Distributed storage method, system and device Pending CN116450042A (en)


Publications (1)

Publication Number Publication Date
CN116450042A true CN116450042A (en) 2023-07-18



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination