CN105915595B - Method for cluster storage system to access data and cluster storage system - Google Patents

Method for cluster storage system to access data and cluster storage system

Publication number
CN105915595B
Authority
CN
China
Prior art keywords
cache
program
data block
storage
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610223289.2A
Other languages
Chinese (zh)
Other versions
CN105915595A (en)
Inventor
林沧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Vclusters Information Technology Co ltd
Original Assignee
Shenzhen Vclusters Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Vclusters Information Technology Co ltd
Priority to CN201610223289.2A
Publication of CN105915595A
Application granted
Publication of CN105915595B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for storing data by a cluster storage system, and the cluster storage system. The cluster storage system comprises storage spaces and cache spaces and runs a storage program, a cache program and a client program, which are interconnected; one storage program corresponds to one storage space, and one cache program corresponds to one cache space. When data to be stored is received, it is divided into a plurality of data blocks; for each data block, N-1 copies are generated, the data block is cached in one cache space, and each copy is cached in its own cache space. When the utilization rate of a data block is low, the data block is flushed into the storage space and the cache space is released, reducing the amount of cache space used. In this way, the read-write performance of the cluster storage system is optimized, its cost is reduced, and its reliability is improved.

Description

Method for cluster storage system to access data and cluster storage system
Technical Field
The invention relates to the technical field of cloud storage, in particular to a method for accessing data by a cluster storage system and the cluster storage system.
Background
The cluster storage system is a system which aggregates storage spaces in a plurality of storage devices into a storage pool capable of providing a uniform access interface and a management interface for an application server, and an application can transparently access and utilize disks on all the storage devices through the access interface. The cluster storage system can fully exert the performance and the disk utilization rate of the storage devices, and easily expand the storage capacity of the storage pool.
Most storage space in prior-art cluster storage systems consists of mechanical hard disks. Mechanical hard disks are inexpensive and long-lived, but their read-write speed is low. When all of the storage space of a cluster storage system is on mechanical hard disks, the read-write speed is very low; in particular, when a large number of concurrent access requests arrive at the same time, the read-write performance of the mechanical hard disks directly limits the overall read-write performance of the cluster storage system.
Disclosure of Invention
The invention mainly solves the technical problem of providing a method for storing data by a cluster storage system and the cluster storage system, which not only optimizes the read-write performance of the cluster storage system, but also reduces the cost of the cluster storage system.
In order to solve the above technical problem, one technical scheme adopted by the invention is to provide a cluster storage system comprising storage spaces and cache spaces. The cluster storage system also runs a storage program, a cache program and a client program, which are interconnected. There are a plurality of storage spaces, cache spaces, storage programs and cache programs; one storage program corresponds to one storage space, one cache program corresponds to one cache space, and the read-write performance of the cache space is superior to that of the storage space. The client program is used for, when data to be stored is received, dividing the data to be stored into a plurality of data blocks according to a preset data division algorithm, establishing a correspondence between the identifiers of the data blocks and the identifier of the data to be stored, selecting a storage program for each data block according to a preset first storage algorithm, and sending each data block to its selected storage program.
The selected storage program is used for selecting a cache program according to a preset cache algorithm, taking the selected cache program as the main cache program, sending the received data block to the main cache program, and writing its own identifier into the metadata of the data block. The main cache program is used for caching the data block and its metadata into the cache space corresponding to the main cache program, writing the association between the identifier of the data block and the identifier of the main cache program into a cache set, selecting N-1 cache programs according to a preset copy storage algorithm, taking the selected cache programs as copy cache programs, generating N-1 copies of the data block, and sending one copy to each copy cache program, wherein N is a natural number greater than 1.
Each copy cache program that receives a copy of the data block caches the copy in its corresponding cache space, writes the association between its own identifier and the identifier of the data block into the cache set, and sends a copy cache success response to the main cache program.
The main cache program is also used for sending a data cache success response to the selected storage program after receiving the copy cache success responses; the selected storage program is also used for sending a data storage success response to the client program when it receives the data cache success response.
The cache program is further configured to screen a data block to be stored from its own cache space according to a preset screening algorithm, identify from the metadata of the data block to be stored the storage program that handles it, and send the data block to be stored to that storage program; the storage program stores the data block to be stored in its corresponding storage space and returns a data flush success response.
After receiving the data flush success response, the cache program obtains from the cache set the other cache programs that hold the data block to be stored, deletes the data block to be stored from its own cache space, and sends a copy deletion request carrying the identifier of the data block to be stored to the other cache programs. Each of the other cache programs deletes the data block to be stored from its own cache space according to the copy deletion request and returns a copy deletion success response once the deletion is complete. After receiving the copy deletion success response, the cache program deletes the association between the identifier of the data block to be stored and the identifiers of the cache programs from the cache set.
The preset screening algorithm comprises a least recently used algorithm. Screening the data block to be stored from the corresponding cache space according to the preset screening algorithm comprises: when the cache program detects that the free space remaining in its corresponding cache space is smaller than a threshold value, screening the data block to be stored from its cache space according to the least recently used algorithm.
The storage space is a mechanical hard disk, and the cache space is a solid state disk.
The client program is further configured to, when receiving a data reading request, identify the data block to be read that corresponds to the requested data and the storage program that handles the data block to be read, and send a data block reading request to that storage program. After receiving the data block reading request, the storage program looks up in the cache set whether the data block to be read is cached and, if so, sends the data block reading request to the corresponding cache program. The cache program that receives the data block reading request reads the data block to be read from its own cache space and returns it, and the storage program returns the data block to be read to the client program.
When the cache of the data block to be read cannot be found from the cache set, the acquired storage program extracts the data block to be read from the storage space of the storage program and returns the data block to be read to the client program.
In order to solve the above technical problem, another technical scheme adopted by the invention is to provide a method for storing data by a cluster storage system. The cluster storage system comprises storage spaces and cache spaces and also runs a storage program, a cache program and a client program, which are interconnected. There are a plurality of storage spaces, cache spaces, storage programs and cache programs; one storage program corresponds to one storage space, one cache program corresponds to one cache space, and the read-write performance of the cache space is superior to that of the storage space. The method comprises the following steps: receiving a data block sent by the client program, wherein the data block is obtained by the client program dividing uploaded data to be stored according to a preset data division algorithm, and a correspondence is established between the identifier of the data block and the identifier of the data to be stored; selecting a cache program according to a preset cache algorithm, taking the selected cache program as the main cache program, writing the storage program's own identifier into the metadata corresponding to the data block, and sending the data block to the main cache program, so that the main cache program caches the data block in its corresponding cache space and writes the association between its own identifier and the identifier of the data block into a cache set;
selecting N-1 cache programs according to a preset copy storage algorithm, taking the selected cache programs as copy cache programs, generating N-1 copies of the data block, and sending one copy to each copy cache program, so that each copy cache program that receives a copy caches it in its corresponding cache space and writes the association between its own identifier and the identifier of the data block into the cache set, wherein N is a natural number greater than 1.
The storage space is a mechanical hard disk, and the cache space is a solid state disk.
The beneficial effects of the invention are as follows. Unlike the prior art, when data to be stored is received, it is divided into a plurality of data blocks; for each data block, N-1 copies are generated, the data block is cached in one cache space, and each copy is cached in its own cache space. Because the read-write performance of the cache space is superior to that of the storage space, the writing speed of the data blocks is greatly improved; and because a data block and its copies are cached N times in total, the reliability of the cluster storage system is effectively improved. In addition, in storage processing, a recently stored data block is more likely to be accessed soon, and reading such a data block from the cache space greatly improves its reading speed and optimizes the read performance of the cluster storage system. Finally, when the utilization rate of a data block is low, the data block is flushed into the storage space and the cache space is released, which reduces the amount of cache space used and thus the cost of the cluster storage system.
Drawings
FIG. 1 is a schematic diagram of a cluster storage system embodiment of the present invention;
FIG. 2 is a flowchart of an embodiment of a method for storing data in a cluster storage system according to the present invention.
Detailed Description
Referring to fig. 1, the cluster storage system 20 includes storage spaces 21 and cache spaces 22. The cluster storage system 20 also runs a storage program 211, a cache program 221 and a client program 231, which are interconnected. There are a plurality of storage spaces 21, cache spaces 22, storage programs 211 and cache programs 221; one storage program 211 corresponds to one storage space 21, one cache program 221 corresponds to one cache space 22, and the read-write performance of the cache space 22 is better than that of the storage space 21. It is worth mentioning that the cluster storage system 20 is formed by combining a plurality of storage nodes 2; each storage node 2 can mount one or more storage spaces 21 and/or cache spaces 22, the cache program 221 and the storage program 211 run on the storage nodes 2, and the client program 231 can be installed on a client 3 or on one of the storage nodes 2.
The client program 231 is configured to, when data to be stored is received, divide the data to be stored into a plurality of data blocks according to a preset data division algorithm, establish a correspondence between the identifiers of the data blocks and the identifier of the data to be stored, select a storage program 211 for each data block according to a preset first storage algorithm, and send the data block to the selected storage program 211. Once the correspondence has been established, the data blocks can be recombined into the complete data to be stored according to that correspondence. The preset data division algorithm and the preset first storage algorithm are not specifically limited and may be set according to actual conditions; for example, the preset data division algorithm may be an equal-division algorithm and the preset first storage algorithm may be a load balancing algorithm.
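The client-side step can be illustrated with a minimal Python sketch. It assumes equal-size division and a least-loaded choice of storage program, which are the two examples the text mentions; the function names, the block identifier format `data_id#index` and the tie-breaking rule are hypothetical and not taken from the patent.

```python
from typing import Dict, List, Tuple


def split_into_blocks(data_id: str, data: bytes,
                      block_size: int) -> Tuple[Dict[str, bytes], Dict[str, List[str]]]:
    """Equal-size division. Returns the blocks keyed by block id, plus the
    correspondence from the data id to its ordered list of block ids."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks: Dict[str, bytes] = {}
    order: List[str] = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        block_id = f"{data_id}#{index}"  # hypothetical identifier format
        blocks[block_id] = data[offset:offset + block_size]
        order.append(block_id)
    return blocks, {data_id: order}


def assign_blocks(block_ids: List[str], storage_programs: List[str]) -> Dict[str, str]:
    """Load balancing: each block goes to the storage program that has been
    given the fewest blocks so far (ties broken by name)."""
    load = {name: 0 for name in storage_programs}
    placement: Dict[str, str] = {}
    for block_id in block_ids:
        target = min(sorted(load), key=lambda name: load[name])
        placement[block_id] = target
        load[target] += 1
    return placement


def reassemble(data_id: str, blocks: Dict[str, bytes],
               correspondence: Dict[str, List[str]]) -> bytes:
    """Recombine the blocks into the original data using the correspondence."""
    return b"".join(blocks[block_id] for block_id in correspondence[data_id])
```

Keeping the block order inside the correspondence is what allows the client program to rebuild the complete data later.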
The storage program 211 that receives the data block is configured to select a cache program 221 according to a preset cache algorithm, take the selected cache program 221 as the main cache program (not shown), send the data block to the main cache program, and write its own identifier into the metadata of the data block. Once the identifier of the storage program 211 has been written into the metadata of the data block, the storage program 211 responsible for storing the data block can be found directly from that metadata. The identifier of the storage program 211 may be a pre-assigned code or the address of the storage program 211. The main cache program is configured to cache the data block and its metadata in its corresponding cache space 22, write the association between the identifier of the data block and the identifier of the main cache program into a cache set, select N-1 cache programs according to a preset copy storage algorithm, take the selected cache programs as copy cache programs, generate N-1 copies of the data block, and send one copy to each copy cache program, where N is a natural number greater than 1.
Each copy cache program that receives a copy of the data block caches the copy in its corresponding cache space and writes the association between its own identifier and the identifier of the data block into the cache set. Once these associations have been written, the copy cache programs and the main cache program that cache the data block can all be found through the cache set.
After a copy cache program finishes caching, it sends a copy cache success response to the main cache program; after the main cache program has received the copy cache success responses, it sends a data cache success response to the selected storage program 211. On receiving the data cache success response, the selected storage program 211 sends a data storage success response to the client program 231. After receiving the data storage success response, the client program 231 knows that the data to be stored has been stored.
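The write path described above can be sketched as follows. This is only an illustration under stated assumptions: message passing between programs is collapsed into direct function calls, the preset cache algorithm is assumed to be a byte-sum modulo choice, and the preset copy storage algorithm is assumed to pick the next N-1 cache programs in ring order. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CacheProgram:
    """Hypothetical stand-in for one cache program and its cache space."""
    name: str
    space: Dict[str, dict] = field(default_factory=dict)

    def cache(self, block_id: str, data: bytes, meta: dict,
              cache_set: Dict[str, List[str]]) -> bool:
        # Cache the block with its metadata and record the association
        # between this cache program and the block in the cache set.
        self.space[block_id] = {"data": data, "meta": dict(meta)}
        cache_set.setdefault(block_id, []).append(self.name)
        return True  # stands in for a "cache success response"


def store_block(storage_name: str, block_id: str, data: bytes,
                cache_programs: List[CacheProgram],
                cache_set: Dict[str, List[str]], n: int) -> bool:
    """Cache one block N times: once on the main cache program and once on
    each of N-1 copy cache programs. Returns True when all have acknowledged."""
    if not 2 <= n <= len(cache_programs):
        raise ValueError("need 2 <= n <= number of cache programs")
    # The storage program writes its own identifier into the block metadata.
    meta = {"storage_program": storage_name}
    # Assumed cache algorithm: deterministic choice from the block id bytes.
    start = sum(block_id.encode("utf-8")) % len(cache_programs)
    main = cache_programs[start]
    main_ok = main.cache(block_id, data, meta, cache_set)
    # Assumed copy storage algorithm: the next N-1 programs in ring order.
    replicas = [cache_programs[(start + i) % len(cache_programs)]
                for i in range(1, n)]
    acks = [replica.cache(block_id, data, meta, cache_set) for replica in replicas]
    return main_ok and all(acks)
```

In this sketch the main cache program is always the first entry recorded for a block in the cache set, which is a convenience of the example rather than something the text requires.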
The read-write performance of the cache space 22 is better than that of the storage space 21. Because data is first written to the cache space 22, the speed of writing data into the cluster storage system is greatly increased, which improves the write performance of the cluster storage system. In addition, in storage processing, recently stored data is more likely to be accessed soon; dividing newly stored data into data blocks and keeping them in the cache space 22 therefore also helps to optimize the processing performance of the cluster storage system. Moreover, the data block and its N-1 copies are cached together, which greatly improves the reliability of the cluster storage system. In this embodiment, the storage space 21 is preferably a mechanical hard disk and the cache space 22 is preferably a solid state disk; a solid state disk has a higher read-write speed than a mechanical hard disk, but is also more expensive.
Since the size of each cache space 22 is limited, the cache program 221 may also flush data blocks from the cache space 22 into the storage space 21 to avoid running short of cache space. Specifically, the cache program 221 is further configured to screen a data block to be stored from its own cache space 22 according to a preset screening algorithm, identify from the metadata of the data block to be stored the storage program 211 that handles it, and send the data block to be stored to that storage program 211. Since the identifier of the storage program 211 is recorded in the metadata of the data block to be stored, the storage program 211 that handles the data block can be obtained from that metadata. The storage program 211 so obtained stores the data block to be stored in its own storage space 21 and returns a data flush success response.
After the data block to be stored has been flushed from the cache space 22 into the storage space 21, it is deleted from all cache spaces 22. To this end, after receiving the data flush success response, the cache program 221 obtains from the cache set the other cache programs 221 that hold the data block to be stored, deletes the data block from its own cache space 22, and sends a copy deletion request carrying the identifier of the data block to each of the other cache programs 221. Each of the other cache programs 221 deletes the data block from its own cache space 22 according to the copy deletion request and returns a copy deletion success response once the deletion is complete. After receiving the copy deletion success response, the cache program 221 deletes the association between the identifier of the data block to be stored and the identifiers of the cache programs 221 from the cache set.
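The flush and copy deletion sequence can be sketched in Python as below. Cache spaces, storage spaces and the cache set are modelled as plain dictionaries, and requests and responses are collapsed into direct operations; these structures and the function name are assumptions made for illustration.

```python
from typing import Dict, List


def flush_block(block_id: str, owner: str,
                cache_spaces: Dict[str, Dict[str, dict]],
                cache_set: Dict[str, List[str]],
                storage_spaces: Dict[str, Dict[str, bytes]]) -> str:
    """Flush one block from the cache program `owner` to the storage program
    named in the block metadata, then remove every cached instance.
    Returns the name of the storage program that now holds the block."""
    entry = cache_spaces[owner][block_id]
    # The metadata records which storage program handles this block.
    target = entry["meta"]["storage_program"]
    # The storage program persists the block (stands in for "flush success").
    storage_spaces[target][block_id] = entry["data"]
    # Look up the other cache programs holding the block in the cache set.
    others = [name for name in cache_set.get(block_id, []) if name != owner]
    # Delete the local instance, then ask the others to delete their copies.
    del cache_spaces[owner][block_id]
    for name in others:
        cache_spaces[name].pop(block_id, None)
    # Finally drop the associations for this block from the cache set.
    cache_set.pop(block_id, None)
    return target
```

The block is written to the storage space before any cached instance is deleted, so at no point in the sequence is the block absent from both tiers.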
Further, the preset screening algorithm includes a least recently used algorithm. Flushing the least recently used data blocks into the storage space 21 optimizes how data is distributed between the cache space 22 and the storage space 21 of the cluster storage system, and thereby optimizes its processing performance. Screening the data block to be stored from the corresponding cache space 22 according to the preset screening algorithm includes: when the cache program 221 detects that the free space remaining in its corresponding cache space 22 is smaller than a threshold, it screens the data block to be stored from the cache space 22 according to the least recently used algorithm.
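A least recently used screening step with a free-space threshold might look like the following sketch. The class name, the byte-based accounting and the rule of selecting blocks until free space is back at the threshold are assumptions; the text only states that screening starts when free space drops below a threshold.

```python
from collections import OrderedDict
from typing import List


class LruCacheSpace:
    """Hypothetical cache space that tracks recency of use per block."""

    def __init__(self, capacity: int, free_threshold: int) -> None:
        self.capacity = capacity              # total bytes available
        self.free_threshold = free_threshold  # minimum free bytes to keep
        self.blocks: "OrderedDict[str, bytes]" = OrderedDict()
        self.used = 0

    def put(self, block_id: str, data: bytes) -> None:
        # Re-inserting a block replaces it and makes it most recently used.
        if block_id in self.blocks:
            self.used -= len(self.blocks.pop(block_id))
        self.blocks[block_id] = data
        self.used += len(data)

    def get(self, block_id: str) -> bytes:
        # Reading a block marks it as most recently used.
        self.blocks.move_to_end(block_id)
        return self.blocks[block_id]

    def select_for_flush(self) -> List[str]:
        """Return the least recently used block ids whose removal would bring
        free space back up to the threshold (empty if already above it)."""
        victims: List[str] = []
        free = self.capacity - self.used
        for block_id, data in self.blocks.items():  # oldest first
            if free >= self.free_threshold:
                break
            victims.append(block_id)
            free += len(data)
        return victims
```

The method only selects candidates; the actual flush to the storage space and the deletion from the cache would follow as separate steps.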
In addition, the main cache program and the copy cache programs may be connected to one another, so that when one of the copy cache programs fails, the data block is immediately flushed into the storage space 21. Specifically, heartbeat detection is performed between the main cache program and the copy cache programs. When the heartbeat of one of the copy cache programs cannot be detected, the main cache program sends a data flush request, together with the data block, to the storage program 211 identified in the metadata of the data block. After receiving the data flush request and the data block, the storage program 211 writes the data block into its storage space and, once the write succeeds, sends a data flush success response to the main cache program that sent the request. After receiving the data flush success response, the main cache program sends a copy deletion request to the other copy cache programs that hold the data block; each of them deletes the data block from its cache space 22 and, once the deletion succeeds, returns a copy deletion success response. After receiving the copy deletion success responses, the main cache program marks in the metadata of the data block that the data block has been flushed into the storage space 21 and that the copies in the cache spaces 22 have been deleted. If it is the main cache program whose heartbeat cannot be detected, the copy cache program holding the first copy is promoted to main cache program and performs the actions that the main cache program should take in this state.
At the same time, the storage program 211 is notified that the topology has changed, so that it can find the current main cache program when reading the data block.
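The failure handling decision can be sketched as a small pure function. It assumes that the cache set lists the holders of a block with the main cache program first, that liveness is judged by comparing a last-heartbeat timestamp against a timeout, and that when the main cache program is lost the first surviving copy holder is promoted; the action labels and parameter names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def check_heartbeats(block_id: str,
                     cache_set: Dict[str, List[str]],
                     last_seen: Dict[str, float],
                     now: float,
                     timeout: float) -> Tuple[str, Optional[str]]:
    """Decide what to do for one block after a round of heartbeat checks.
    Returns (action, acting_main) where action is one of:
      "ok"                 - every holder is alive, nothing to do
      "flush"              - a copy holder is down; the main should flush
      "promote_then_flush" - the main is down; a copy holder takes over
      "lost"               - no holder is alive
    """
    holders = cache_set[block_id]  # assumed order: main first, then copies
    alive = [name for name in holders if now - last_seen[name] <= timeout]
    if len(alive) == len(holders):
        return ("ok", holders[0])
    if not alive:
        return ("lost", None)
    if holders[0] not in alive:
        # Promote the first surviving copy holder and record the new order;
        # the storage program would then be told that the topology changed.
        cache_set[block_id] = alive
        return ("promote_then_flush", alive[0])
    return ("flush", holders[0])
```

Passing `now` in as a parameter keeps the function deterministic, which makes the promotion rule easy to check in isolation.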
When reading data, the client program 231 preferentially reads from the cache space 22 to increase the reading speed. Specifically, the client program 231 is further configured to, when receiving a data reading request, identify the data blocks to be read that correspond to the requested data and the storage program 211 that handles each data block to be read, and send a data block reading request to that storage program 211.
After receiving the data block reading request, the storage program 211 looks up in the cache set whether the data block to be read is cached and, if so, forwards the data block reading request to the corresponding cache program 221. The cache program 221 that receives the data block reading request reads the data block to be read from its own cache space 22 and returns it; the storage program 211 then returns the data block to be read to the client program 231.
When the storage program 211 finds from the cache set that the data block to be read is not cached, it extracts the data block from its own storage space 21 and returns it to the client program 231. The client program 231 can combine the data blocks it has read into the complete data.
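The read path, with its preference for cached blocks and its fallback to the storage space, can be sketched as follows. All structures are plain dictionaries and the function name is hypothetical; `placement` stands for the client program's record of which storage program handles each block, which the text implies but does not name.

```python
from typing import Dict, List


def read_data(data_id: str,
              correspondence: Dict[str, List[str]],
              placement: Dict[str, str],
              cache_set: Dict[str, List[str]],
              cache_spaces: Dict[str, Dict[str, bytes]],
              storage_spaces: Dict[str, Dict[str, bytes]]) -> bytes:
    """Read every block of `data_id`, preferring cached instances, and
    combine the blocks into the complete data."""
    parts: List[bytes] = []
    for block_id in correspondence[data_id]:
        storage_name = placement[block_id]
        holders = cache_set.get(block_id, [])
        if holders:
            # Cached: the storage program forwards the read to a cache program.
            parts.append(cache_spaces[holders[0]][block_id])
        else:
            # Not cached: the storage program reads its own storage space.
            parts.append(storage_spaces[storage_name][block_id])
    return b"".join(parts)
```

A block that has already been flushed no longer appears in the cache set, so the same lookup covers both recently written and older data.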
The beneficial effects of the invention are as follows. Unlike the prior art, when data to be stored is received, it is divided into a plurality of data blocks; for each data block, N-1 copies are generated, the data block is cached in one cache space, and each copy is cached in its own cache space. Because the read-write performance of the cache space is superior to that of the storage space, the writing speed of the data blocks is greatly improved; and because a data block and its copies are cached N times in total, the reliability of the cluster storage system is effectively improved. In addition, in storage processing, a recently stored data block is more likely to be accessed soon, and reading such a data block from the cache space greatly improves its reading speed and optimizes the read performance of the cluster storage system. Finally, when the utilization rate of a data block is low, the data block is flushed into the storage space and the cache space is released, which reduces the amount of cache space used and thus the cost of the cluster storage system.
The invention also provides an embodiment of a method for storing data by a cluster storage system. The cluster storage system comprises storage spaces and cache spaces and also runs a storage program, a cache program and a client program, which are interconnected. There are a plurality of storage spaces, cache spaces, storage programs and cache programs; one storage program corresponds to one storage space, one cache program corresponds to one cache space, and the read-write performance of the cache space is superior to that of the storage space.
It is worth mentioning that the cluster storage system is formed by combining a plurality of storage nodes; each storage node can mount one or more storage spaces and/or cache spaces, and the cache program and the storage program run on the storage nodes. When a storage node has only storage space or only cache space, it runs only the storage program or only the cache program, respectively. Referring to fig. 2, the method includes:
Step S301: receiving a data block sent by a client program, wherein the data block is obtained by the client program dividing uploaded data to be stored according to a preset data division algorithm, and a correspondence is established between the identifier of the data block and the identifier of the data to be stored.
After the data to be stored has been divided into data blocks, the client program selects a storage program for each data block according to a preset first storage algorithm and sends the data block to the selected storage program. Conversely, the divided data blocks can be recombined into the data to be stored.
Step S302: selecting a cache program according to a preset cache algorithm, taking the selected cache program as the main cache program, writing the storage program's own identifier into the metadata corresponding to the data block, and sending the data block to the main cache program, so that the main cache program caches the data block in its corresponding cache space and writes the association between its own identifier and the identifier of the data block into a cache set.
The read-write performance of the cache space is superior to that of the storage space, so caching the data blocks in the cache space greatly improves the read-write performance of the cluster storage system. When the utilization rate of a data block is low, the data block is flushed into the storage space. By using the storage space and the cache space together, the system achieves better read-write performance than a cluster storage system that stores data blocks only in storage space, and costs less than one that has only cache space. In this embodiment, the storage space is preferably a mechanical hard disk and the cache space is preferably a solid state disk.
Step S303: selecting N-1 cache programs according to a preset copy storage algorithm, taking the selected cache programs as copy cache programs, generating N-1 copies of the data block, and sending one copy to each copy cache program, so that each copy cache program that receives a copy caches it in its corresponding cache space and writes the association between its own identifier and the identifier of the data block into the cache set, wherein N is a natural number greater than 1.
Besides caching the data block itself, the method also caches N-1 copies of it, and the data block and each of its copies are located in different cache spaces. Even if N-1 of these N cached instances are damaged, at least one still exists, so the reliability of the data block is greatly improved.
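As a rough illustration of why N cached instances improve reliability, assume (this is not stated in the source) that each cache space is lost independently with probability p during some interval. A data block is then lost from the cache tier only if all N instances are lost:

```latex
P_{\text{loss}} = p^{N}, \qquad \text{e.g. } p = 0.01,\ N = 3 \ \Rightarrow\ P_{\text{loss}} = 10^{-6}
```

Real failures are often correlated (shared node, power or network), so this figure is an optimistic estimate rather than a prediction.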
Other operations of the cluster storage system, for example storing a plurality of copies, flushing data blocks from the cache space into the storage space, and reading data, are described in the embodiment of the cluster storage system and are not repeated here.
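As an illustration of the flush operation mentioned above, the following sketch flushes least recently used blocks to the storage space once the free cache space falls below a threshold. The class `LruCacheSpace`, its byte-count accounting and the dictionary standing in for the storage space are all hypothetical; deletion of the copies held by other cache programs after a successful flush is omitted.

```python
from collections import OrderedDict


class LruCacheSpace:
    """Hypothetical cache space that flushes least recently used blocks."""

    def __init__(self, capacity, threshold, storage):
        self.capacity = capacity      # total bytes in the cache space
        self.threshold = threshold    # flush while free bytes < threshold
        self.storage = storage        # dict standing in for the storage space
        self.blocks = OrderedDict()   # block id -> bytes, oldest first
        self.used = 0

    def put(self, block_id, payload):
        if block_id in self.blocks:
            self.used -= len(self.blocks.pop(block_id))
        self.blocks[block_id] = payload
        self.used += len(payload)
        self.flush_if_needed()

    def get(self, block_id):
        self.blocks.move_to_end(block_id)  # mark as most recently used
        return self.blocks[block_id]

    def flush_if_needed(self):
        flushed = []
        while self.blocks and self.capacity - self.used < self.threshold:
            block_id, payload = self.blocks.popitem(last=False)  # least recently used
            self.storage[block_id] = payload   # write to the storage space first
            self.used -= len(payload)          # then release the cache space
            flushed.append(block_id)
        return flushed
```

Writing to the storage space before releasing the cache entry mirrors the order in the text, where the cached block is deleted only after the flush has succeeded.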
The beneficial effects of the invention are as follows. Unlike the prior art, when data to be stored is received, it is divided into a plurality of data blocks, N-1 copies of each data block are generated, and the data block and each of its copies are cached in cache spaces. First, because the read-write performance of the cache space is superior to that of the storage space, the writing speed of the data blocks is greatly improved, and because the data block and its copies are cached N times in total, the reliability of the cluster storage system is effectively improved. Second, in the field of storage, a recently stored data block is more likely to be accessed in the near future; reading such a data block from the cache space greatly improves its reading speed and optimizes the reading performance of the cluster storage system. Third, when the utilization rate of a data block is low, the data block is flushed into the storage space and the cache space is released, which reduces the amount of cache space used and lowers the cost of the cluster storage system.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. A cluster storage system is characterized by comprising a storage space and a cache space, wherein the cluster storage system also runs a storage program, a cache program and a client program which are interconnected, wherein the storage program, the cache program and the client program are in a plurality of numbers, one storage program corresponds to one storage space, one cache program corresponds to one cache space, and the read-write performance of the cache space is superior to that of the storage space;
the client program is used for dividing the data to be stored into a plurality of data blocks according to a preset data division algorithm when the data to be stored is received, establishing a corresponding relation between the identification of the data blocks and the identification of the data to be stored, respectively selecting a storage program for each data block according to a preset first storage algorithm, and sending the corresponding data block to the selected storage program;
the selected storage program is used for selecting a cache program according to a preset cache algorithm, taking the selected cache program as a main cache program, sending the received data block to the main cache program, and writing the identifier of the selected storage program into metadata of the data block;
the main cache program is used for caching the data block and the metadata of the data block into a cache space corresponding to the main cache program, writing the association relationship between the identifier of the data block and the identifier of the main cache program into a cache set, selecting N-1 cache programs according to a preset copy storage algorithm, taking the selected cache programs as copy cache programs, generating N-1 copies of the data block, and respectively sending one copy of the data block to one copy cache program, wherein N is a natural number greater than 1;
the copy cache program receiving the copy of the data block caches the copy of the data block to the cache space corresponding to the copy cache program, writes the association relationship between the identification of the copy cache program and the identification of the data block into the cache set, and sends a copy caching success response to the main cache program; the main cache program is also used for sending a data caching success response to the selected storage program after receiving the copy caching success response;
the selected storage program is also used for sending a data storage success response to the client program when receiving a data caching success response;
the main cache program and the copy cache programs are connected, and when one copy cache program fails, the data blocks are flushed into the storage space.
2. The cluster storage system of claim 1,
the cache program is further used for screening a data block to be stored from the cache space of the cache program according to a preset screening algorithm, acquiring a storage program for processing the data block to be stored according to metadata of the data block to be stored, and sending the data block to be stored to the acquired storage program;
and the acquired storage program stores the data block to be stored to a storage space corresponding to the acquired storage program, and returns a data flush success response.
3. The cluster storage system of claim 2,
the cache program is used for acquiring other cache programs for processing the data block to be stored according to the cache set after receiving the data flush success response, and deleting the data block to be stored from the cache space of the cache program;
sending a copy deleting request to the other cache programs, wherein the copy deleting request carries the identification of the data block to be stored;
the other cache programs delete the data block to be stored from the cache space of the other cache programs according to the copy deleting request, and return a copy deletion success response after the deletion is completed;
and after receiving the copy deletion success response, the cache program deletes the association relationship between the identifier of the data block to be stored and the identifier of the cache program from the cache set.
4. The cluster storage system of claim 2,
the preset screening algorithm comprises a least recently used algorithm;
the cache program screening the data block to be stored from the cache space corresponding to the cache program according to the preset screening algorithm comprises:
when the cache program detects that the free space of the cache space corresponding to the cache program is smaller than a threshold value, screening the data block to be stored from the cache space of the cache program according to the least recently used algorithm.
5. The cluster storage system of claim 2,
the storage space is a mechanical hard disk, and the cache space is a solid state disk.
6. The cluster storage system of claim 1,
the client program is also used for, when a data reading request is received, acquiring the data block to be read corresponding to the data requested by the data reading request and the storage program that processes the data block to be read, and sending a data block reading request to the acquired storage program;
after receiving the data block reading request, the acquired storage program searches the cache set to determine whether the data block to be read is cached, and if so, sends the data block reading request to the corresponding cache program;
the cache program receiving the data block reading request reads the data block to be read from the cache space of the cache program and returns the data block to be read;
and the acquired storage program returns the data block to be read to the client program.
7. The cluster storage system of claim 6,
and when the obtained storage program cannot find the cache of the data block to be read from the cache set, extracting the data block to be read from the storage space of the storage program, and returning the data block to be read to the client program.
8. A method for accessing data by a cluster storage system is characterized in that the cluster storage system comprises a storage space and a cache space, the cluster storage system also runs a storage program, a cache program and a client program, and the storage program, the cache program and the client program are interconnected, wherein the storage space, the cache space, the storage program and the cache program are all in multiple numbers, one storage program corresponds to one storage space, one cache program corresponds to one cache space, and the read-write performance of the cache space is superior to the read-write performance of the storage space, and the method comprises the following steps:
when the client program receives uploaded data to be stored, the client program divides the data to be stored into a plurality of data blocks according to a preset data division algorithm, establishes a corresponding relation between the identification of the data blocks and the identification of the data to be stored, selects a storage program for each data block according to a preset first storage algorithm, and sends the corresponding data block to the selected storage program, and the selected storage program receives the data block sent by the client program;
the storage program receiving a data block selects a cache program according to a preset cache algorithm, takes the selected cache program as a main cache program, writes an identifier of the main cache program into metadata corresponding to the data block, and sends the data block to the main cache program, so that the main cache program caches the data block into a cache space corresponding to the main cache program and writes an association relationship between the identifier of the main cache program and the identifier of the data block into a cache set;
the main cache program selects N-1 cache programs according to a preset copy storage algorithm, takes the selected cache programs as copy cache programs, generates N-1 copies of the data block, and sends one copy of the data block to each copy cache program, so that each copy cache program receiving a copy of the data block caches the copy to a cache space corresponding to the copy cache program and writes an association relationship between the identification of the copy cache program and the identification of the data block into the cache set, wherein N is a natural number larger than 1;
the main cache program and the copy cache programs are connected, and when one copy cache program fails, the data blocks are flushed into the storage space.
9. The method of claim 8,
the storage space is a mechanical hard disk, and the cache space is a solid state disk.
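The read path set out in claims 6 and 7 can be sketched as a lookup in the cache set followed by a fallback to the storage space. The function `read_block` and the dictionaries used for the cache set, cache spaces and storage space are hypothetical; trying further cache programs when the first one misses is an added assumption, not something the claims state.

```python
def read_block(block_id, cache_set, cache_spaces, storage_space):
    """Return (payload, cache id) from a cache space, or (payload, None) from storage."""
    for cache_id in cache_set.get(block_id, []):
        payload = cache_spaces.get(cache_id, {}).get(block_id)
        if payload is not None:
            return payload, cache_id      # served from the faster cache space
    # No cached instance found: extract the block from the storage space.
    return storage_space[block_id], None
```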
CN201610223289.2A 2016-04-11 2016-04-11 Method for cluster storage system to access data and cluster storage system Active CN105915595B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610223289.2A CN105915595B (en) 2016-04-11 2016-04-11 Method for cluster storage system to access data and cluster storage system

Publications (2)

Publication Number Publication Date
CN105915595A (en) 2016-08-31
CN105915595B (en) 2020-05-26

Family

ID=56745682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610223289.2A Active CN105915595B (en) 2016-04-11 2016-04-11 Method for cluster storage system to access data and cluster storage system

Country Status (1)

Country Link
CN (1) CN105915595B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant