CN111522514A - Cluster file system, data processing method, computer device and storage medium - Google Patents


Info

Publication number
CN111522514A
Authority
CN
China
Prior art keywords
instruction
virtual block
data
metadata
block device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010343972.6A
Other languages
Chinese (zh)
Other versions
CN111522514B (en)
Inventor
张和泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Intelligent Technology Co Ltd
Priority to CN202010343972.6A priority Critical patent/CN111522514B/en
Publication of CN111522514A publication Critical patent/CN111522514A/en
Application granted granted Critical
Publication of CN111522514B publication Critical patent/CN111522514B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • G06F3/0607Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1464Management of the backup or restore process for networked environments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0643Management of files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065Replication mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a cluster file system, a data processing method, a computer device, and a storage medium, wherein the cluster file system includes: an object storage server, a plurality of object storage devices, and a plurality of first virtual block devices; the object storage server is connected with a plurality of object storage devices; each object storage device is connected with at least one first virtual block device; the object storage server is used for receiving a first input/output (IO) instruction sent by the client and accessing the first virtual block device through the object storage device based on the first IO instruction; target data is stored in the first virtual block device; wherein the target data is stored as at least three copies of data in the first virtual block device.

Description

Cluster file system, data processing method, computer device and storage medium
Technical Field
The present disclosure relates to the field of computer storage technologies, and in particular, to a cluster file system, a data processing method, a computer device, and a storage medium.
Background
The cluster file system is a file system which runs on a plurality of computers, integrates and virtualizes all storage space resources in a cluster through mutual communication in a certain mode and provides file access service to the outside.
The reliability of data is an important index for measuring the stability of a cluster file system. To prevent data loss, Lustre, which is one kind of cluster file system, generally relies on the Zettabyte File System (ZFS) together with hardware redundancy to ensure data reliability. ZFS is a kernel file system that combines a plurality of disks into a disk array (Redundant Array of Independent Disks, RAID), which serves as the data storage disk for the Lustre cluster.
This data storage approach suffers from poor scalability.
Disclosure of Invention
The embodiment of the disclosure at least provides a cluster file system, a data processing method, computer equipment and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a cluster file system, including: an object storage server, a plurality of object storage devices, and a plurality of first virtual block devices; wherein the object storage server is connected with a plurality of the object storage devices; each object storage device is connected with at least one first virtual block device; the object storage server is used for receiving a first input/output (IO) instruction sent by a client and accessing the first virtual block device through the object storage device based on the first IO instruction; target data is stored in the first virtual block device; wherein the target data is stored as at least three copies of data in the first virtual block device.
In this way, the first virtual block device can aggregate a plurality of physical disks Disk together to form a large-capacity virtual device, and thus has stronger expandability. Meanwhile, the target data can be stored as at least three copies in the first virtual block device, so that the reliability of the data can be greatly improved.
In a possible embodiment, the cluster file system further includes: a metadata server, a plurality of metadata storage devices, and a plurality of second virtual block devices; wherein the metadata server is connected to a plurality of the metadata storage devices; each metadata storage device is connected with at least one second virtual block device; the metadata server is configured to receive a second IO instruction sent by the client, and access the second virtual block device through the metadata storage device based on the second IO instruction; metadata corresponding to the target data is stored in the second virtual block device; wherein the metadata is stored as at least three copies of metadata in the second virtual block device.
In this way, the second virtual block device can aggregate a plurality of physical disks Disk together to form a large-capacity virtual device, and the number of disks contained in the second virtual block device is larger, so that the scalability is stronger.
Meanwhile, the storage capacity of the second virtual block device is larger, and because the metadata can be stored as at least three copies in the second virtual block device, the reliability of the data can be greatly improved.
In one possible embodiment, the target data includes: a data block aggregated from a plurality of files; the metadata corresponding to the data blocks includes location information of each of the files in the data blocks.
Thus, the target data comprises data blocks formed by aggregating a plurality of files; for a metadata server, the file directory size required to be maintained is significantly reduced, and the cluster file system provided by the embodiment of the disclosure can be applied to massive small files.
In a possible implementation manner, the first virtual block device is further configured to determine and return a target data copy to the object storage device from at least three data copies corresponding to the first IO instruction according to a preset first reading order.
In this way, smooth reading of the target data is ensured.
In a second aspect, an embodiment of the present disclosure provides a data processing method, including: receiving an input/output (IO) instruction sent by a client; based on the IO instruction, controlling storage equipment to write data corresponding to the IO instruction into the virtual block equipment or read data corresponding to the IO instruction from the virtual block equipment; wherein the data is stored as at least three copies of data in the virtual block device.
In one possible implementation, the storage device includes: an object storage device; the virtual block device includes: a first virtual block device; the IO instruction includes: a first IO instruction; the data comprises target data; based on the IO instruction, controlling a storage device to write data corresponding to the IO instruction into the virtual block device, or reading data corresponding to the IO instruction from the virtual block device includes: based on a first IO instruction, controlling the object storage device to write target data corresponding to the first IO instruction into the first virtual block device, or reading the target data corresponding to the first IO instruction from the first virtual block device.
In one possible implementation, the storage device includes: a metadata storage device; the virtual block device includes: a second virtual block device; the IO instruction includes: a second IO instruction; the data comprises metadata; based on the IO instruction, controlling a storage device to write data corresponding to the IO instruction into the virtual block device, or reading data corresponding to the IO instruction from the virtual block device includes: based on a second IO instruction, controlling the metadata storage device to write metadata corresponding to the second IO instruction into the second virtual block device, or reading metadata corresponding to the second IO instruction from the second virtual block device.
In a possible implementation, reading the data corresponding to the IO instruction from the virtual block device includes: sending the IO instruction to the virtual block device, so that the virtual block device determines a target data copy from at least three data copies corresponding to the IO instruction according to a preset reading order and returns it to the object storage device.
In a third aspect, an embodiment of the present disclosure further provides a computer device, including: a processor and a memory coupled to each other, the memory storing machine-readable instructions executable by the processor, the machine-readable instructions being executed by the processor when the computer device is running to implement the steps of the data processing method of the second aspect described above, or of any one of the possible embodiments of the second aspect.
In a fourth aspect, the disclosed embodiments also provide a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of the data processing method of the second aspect, or of any possible implementation of the second aspect.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for use in the embodiments will be briefly described below, and the drawings herein incorporated in and forming a part of the specification illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It is appreciated that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope, for those skilled in the art will be able to derive additional related drawings therefrom without the benefit of the inventive faculty.
FIG. 1 is a schematic diagram illustrating a cluster file system provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram illustrating a data reading process in a cluster file system according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating a process of an OST reading target data from a VBD in a cluster file system according to an embodiment of the present disclosure;
FIG. 4 is a flow chart illustrating a data processing method provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating a computer device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
Research has shown that the cluster file system Lustre usually uses a disk array to store data. The number of disks that can be installed in an object storage device is limited, so the number of object storage devices must be increased whenever the storage capacity of Lustre needs to be expanded; because physical devices are expensive, Lustre scales poorly.
Meanwhile, because the number of disks that can be installed in an object storage device is limited, current Lustre can provide data redundancy of at most two copies, resulting in poor data reliability.
In addition, the cluster and parallel architecture of Lustre is better suited to scenarios in which a plurality of clients concurrently read and write large files, and is not well suited to small-file applications. In particular, in applications with massive numbers of small files, the metadata server and the object storage server need to maintain huge file directories, which reduces the response speed of Lustre.
Based on this research, the present disclosure provides a cluster file system and a data processing method. The cluster file system includes an object storage server, a plurality of object storage devices, and a plurality of first virtual block devices; wherein the object storage server is connected with a plurality of the object storage devices; each object storage device is connected with at least one first virtual block device; the object storage server is configured to receive a first Input/Output (IO) instruction sent by a client, and access a first virtual block device through the object storage device based on the first IO instruction; target data is stored in the first virtual block device. The first virtual block device (Virtual Block Disk, VBD) can aggregate a plurality of physical disks together to form a large-capacity virtual device, and thus has stronger scalability. Meanwhile, the target data can be stored as at least three copies in the first virtual block device, so that the reliability of the data can be greatly improved.
The drawbacks described above were identified by the inventor through practice and careful study; therefore, both the discovery of these problems and the solutions that the present disclosure proposes for them should be regarded as the inventor's contribution to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
For the convenience of understanding the present embodiment, a detailed description will be given to a cluster file system disclosed in the embodiments of the present disclosure.
The Lustre cluster includes: a metadata server (Meta Data Server, MDS) and an object storage server (Object Storage Server, OSS); the MDS is connected with a plurality of metadata storage devices (Meta Data Target, MDT), and the OSS is connected with a plurality of object storage devices (Object Storage Target, OST). Each MDT and each OST is connected with a physical storage device (Disk), which provides data storage services for the MDT and the OST.
The MDS is responsible for providing metadata service for the cluster and managing the name space of the whole cluster; sharing access to one MDT among a plurality of MDSs; each MDT stores file metadata objects such as file names, directory structures, access rights and the like; the Client (Client) reads the metadata stored on the MDT through the MDS.
The OSS is responsible for interaction between the Client and the Disk and data storage, and provides an Input/Output (I/O) interface of data for the Client; each OSS manages one or more OSTs, each OST for storing file data objects.
Referring to fig. 1, a schematic structural diagram of a cluster file system provided in the embodiment of the present disclosure is shown, where the cluster file system includes:
an object storage server 11, a plurality of object storage devices 12, and a plurality of first virtual block devices 13;
wherein, the object storage server 11 is connected with a plurality of the object storage devices 12; each of the object storage devices 12 is connected to at least one of the first virtual block devices 13;
the object storage server 11 is configured to receive a first input/output IO instruction sent by a client, and access the first virtual block device 13 through the object storage device 12 based on the first IO instruction;
target data is stored in the first virtual block device 13;
wherein the target data is stored as at least three copies of data in the first virtual block device.
In a specific implementation, the object storage server (Object Storage Server, OSS) 11 receives a first IO instruction sent by a client. When the first IO instruction is a data write instruction, it also carries the target data to be written. The object storage server 11 determines a target object storage device from the plurality of object storage devices (Object Storage Target, OST) 12 according to the first IO instruction, and transmits the first IO instruction to the target object storage device; the target object storage device is connected with a first virtual block device (Virtual Block Disk, VBD) 13, and transmits the target data carried in the first IO instruction to the first virtual block device 13 connected to it; the first virtual block device 13 stores the target data as at least three data copies.
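The write path described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names, the checksum-based choice of target OST, and the modeling of each physical disk as an in-memory dictionary are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class VirtualBlockDevice:
    """Hypothetical VBD aggregating several disks (each modeled as a dict)."""
    disks: list[dict[str, bytes]]
    replicas: int = 3  # the text requires at least three copies

    def write(self, key: str, data: bytes) -> int:
        if len(self.disks) < self.replicas:
            raise ValueError("not enough disks for the requested replica count")
        # Assumption: one copy goes on each of the first `replicas` disks.
        for disk in self.disks[: self.replicas]:
            disk[key] = data
        return self.replicas


@dataclass
class ObjectStorageTarget:
    """Hypothetical OST that forwards writes to its attached VBD."""
    vbd: VirtualBlockDevice

    def write(self, key: str, data: bytes) -> int:
        return self.vbd.write(key, data)


@dataclass
class ObjectStorageServer:
    """Hypothetical OSS that selects a target OST for each IO instruction."""
    osts: list[ObjectStorageTarget]

    def pick_ost(self, key: str) -> ObjectStorageTarget:
        # Assumption: deterministic placement by a simple checksum of the key.
        return self.osts[sum(key.encode()) % len(self.osts)]

    def handle_write(self, key: str, data: bytes) -> int:
        return self.pick_ost(key).write(key, data)
```

Calling `handle_write` on a server with two OSTs, each backed by a four-disk VBD, leaves three copies of the object on the selected OST's VBD and none on the other.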
When the first IO instruction is a data read instruction, it carries the metadata of the target data to be read. The object storage server 11 determines, according to the first IO instruction, the target object storage device that stores the target data corresponding to the metadata from among the plurality of object storage devices 12, and then transmits the first IO instruction to the target object storage device. The target object storage device reads the target data corresponding to the metadata carried in the first IO instruction from the first virtual block device 13 connected to it. Here, when returning the target data to the target object storage device, the first virtual block device 13 reads the target data copy from the at least three stored data copies in a preset first reading order: if the first data copy in the reading order cannot be found, the second data copy is read; if the second data copy cannot be found, the third data copy is read. After reading the target data from the first virtual block device 13 connected to it, the target object storage device returns the target data to the object storage server 11, and the object storage server 11 returns the target data to the client.
In another embodiment of the present disclosure, the cluster file system further includes: a metadata server 14, a plurality of metadata storage devices 15, and a plurality of second virtual block devices 16;
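The ordered fallback across data copies can be illustrated with a short sketch. The function name, the representation of disks as dictionaries, and the choice to raise an error when no copy is found are assumptions; the text does not specify what the first virtual block device does when every data copy is missing.

```python
def read_with_fallback(disks, read_order, key):
    """Try each copy in the preset reading order; return (disk index, data).

    `disks` is a list of dicts standing in for physical disks, and
    `read_order` is a list of disk indices (both are illustrative).
    """
    for idx in read_order:
        data = disks[idx].get(key)
        if data is not None:
            return idx, data
    # Assumption: report a read failure when no copy can be found.
    raise LookupError(f"no readable copy of {key!r}")


# The first copy is missing, so the read falls through to the second disk.
disks = [{}, {"obj-1": b"payload"}, {"obj-1": b"payload"}]
result = read_with_fallback(disks, [0, 1, 2], "obj-1")
```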
wherein the metadata server 14 is connected to a plurality of the metadata storage devices 15; each of the metadata storage devices 15 is connected to at least one of the second virtual block devices 16;
the metadata server 14 is configured to receive a second IO instruction sent by the client, and access the second virtual block device 16 through the metadata storage device 15 based on the second IO instruction;
the second virtual block device 16 stores therein metadata corresponding to the target data;
wherein the metadata is stored as at least three copies of metadata in the second virtual block device 16.
In a specific implementation, the metadata server (Meta Data Server, MDS) 14 receives a second IO instruction sent by the client. When the second IO instruction is a data write instruction, it also carries the metadata to be written. The metadata server 14 determines a target metadata storage device from the plurality of metadata storage devices (Meta Data Target, MDT) 15 according to the second IO instruction, and transmits the second IO instruction to the determined target metadata storage device; the target metadata storage device is connected with a second virtual block device 16, and transmits the metadata carried in the second IO instruction to the second virtual block device 16; the second virtual block device 16 stores the metadata as at least three metadata copies.
When the second IO instruction is a data read instruction, it carries information related to the metadata to be read, such as a file name. The metadata server 14 determines a target metadata storage device from the plurality of metadata storage devices 15 connected to it according to this information, and transmits the second IO instruction to the target metadata storage device. The target metadata storage device reads the metadata corresponding to the second IO instruction from the second virtual block device 16 connected to it, according to the second IO instruction and the metadata-related information it carries. Here, when returning the metadata to the target metadata storage device, the second virtual block device 16 reads the target metadata copy from the at least three stored metadata copies in a preset second reading order: if the first metadata copy in the reading order cannot be found, the second metadata copy is read; if the second metadata copy cannot be found, the third metadata copy is read; and so on, until a metadata copy is found. If no metadata copy can be found, read-failure information is returned to the target metadata storage device. After reading the metadata from the second virtual block device 16 connected to it, the target metadata storage device returns the metadata to the metadata server 14; the metadata server 14 returns the metadata to the client, so that the client can initiate a first IO instruction to the object storage server according to the acquired metadata to read the corresponding target data.
In the above process, each time the client initiates a data reading process, the client first accesses the metadata server to obtain the metadata of the file to be read, and then accesses the object storage server according to the metadata of the file to be read to read the target data of the file.
Similarly, each time the client initiates a data writing process, the client first accesses the metadata server, and after the metadata server generates metadata, the client stores target data corresponding to the metadata in the first virtual block device connected to the object storage device by accessing the object storage server.
In one possible embodiment, the metadata includes: file name, directory structure, access rights, etc. of the target data.
In another possible embodiment, the target data includes: a data block aggregated from a plurality of files;
the metadata corresponding to the data blocks includes location information of each of the files in the data blocks.
In this way, multiple files can be aggregated into a data block; when accessing a file in the data block, the client can determine a storage location of the file to be accessed in the second virtual block device based on the location information.
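As a rough illustration of small-file aggregation, the sketch below packs several files into one data block and records each file's location. Representing the location information as an (offset, length) pair is an assumption; the text only states that the metadata includes the location of each file in the data block.

```python
def pack_files(files):
    """Concatenate files into one block; return (block, locations).

    `locations` maps each file name to a hypothetical (offset, length) pair.
    """
    block = bytearray()
    locations = {}
    for name, content in files.items():
        locations[name] = (len(block), len(content))
        block += content
    return bytes(block), locations


def read_file(block, locations, name):
    """Extract a single file from the aggregated block using its location."""
    offset, length = locations[name]
    return block[offset:offset + length]


block, locations = pack_files({"a.txt": b"hello", "b.txt": b"world!"})
second = read_file(block, locations, "b.txt")
```

With this layout the metadata server tracks one entry per data block plus a compact location table, rather than one directory entry per small file.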
As shown in fig. 2, an embodiment of the present disclosure provides a complete process for reading data, including:
A Client requests metadata from the metadata server MDS; the MDS initiates a data read to a specific metadata storage device MDT based on the Client request; the MDT reads the metadata from the second virtual block device VBD2, and VBD2 returns the metadata to the MDT; the MDT returns the metadata to the MDS; the MDS returns the metadata to the Client.
The Client locates the corresponding object storage server OSS according to the obtained metadata in order to obtain the target data; the OSS forwards the IO request to a specific object storage device OST; the OST requests the target data from the first virtual block device VBD1 connected to it, and VBD1 returns the target data to the OST; the OST returns the target data to the OSS; the OSS returns the target data to the Client.
Through the above process, one data reading is completed.
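The two-phase read above can be traced with a small simulation. Everything in it is a hypothetical stand-in: the class names, the dictionary-backed devices, and the metadata fields `ost` and `object` are assumptions, and because the text does not say how the MDS selects an MDT, the sketch simply passes the target identifier explicitly.

```python
from typing import Any, Dict, List

trace: List[str] = []  # records each hop so the call order can be inspected


class VirtualBlockDevice:
    """Stands in for VBD1 or VBD2; a plain dict replaces real block storage."""

    def __init__(self, name: str, contents: Dict[str, Any]) -> None:
        self.name = name
        self.contents = contents

    def read(self, key: str) -> Any:
        trace.append(self.name)
        return self.contents[key]


class StorageTarget:
    """Stands in for an MDT or an OST: forwards reads to its virtual block device."""

    def __init__(self, name: str, vbd: VirtualBlockDevice) -> None:
        self.name = name
        self.vbd = vbd

    def read(self, key: str) -> Any:
        trace.append(self.name)
        return self.vbd.read(key)


class Server:
    """Stands in for the MDS or the OSS: forwards a request to one of its targets."""

    def __init__(self, name: str, targets: Dict[str, StorageTarget]) -> None:
        self.name = name
        self.targets = targets

    def read(self, target_id: str, key: str) -> Any:
        trace.append(self.name)
        return self.targets[target_id].read(key)


def client_read(mds: Server, oss: Server, file_name: str) -> bytes:
    # Phase 1: fetch the metadata (MDS -> MDT -> VBD2).
    metadata = mds.read("MDT0", file_name)
    # Phase 2: use the metadata to fetch the target data (OSS -> OST -> VBD1).
    return oss.read(metadata["ost"], metadata["object"])


vbd2 = VirtualBlockDevice("VBD2", {"a.txt": {"ost": "OST0", "object": "obj-1"}})
vbd1 = VirtualBlockDevice("VBD1", {"obj-1": b"file contents"})
mds = Server("MDS", {"MDT0": StorageTarget("MDT0", vbd2)})
oss = Server("OSS", {"OST0": StorageTarget("OST0", vbd1)})
data = client_read(mds, oss, "a.txt")
```

After the call, `trace` lists the hops in the order they were visited, which makes it easy to compare the simulated path against the flow in fig. 2.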
Referring to fig. 3, an embodiment of the present disclosure further provides a specific data processing process in which a first virtual block device 13 provides a data storage service for an OST, where the first virtual block device 13 is a virtual storage device obtained by virtualizing at least one physical disk (Disk). A Virtual Block Disk (VBD) service is deployed in the object storage device 12; alternatively, the VBD service may be deployed separately in another dedicated device, which is referred to as a VBD server.
The OST requests secondary metadata from the VBD service, and the VBD service acquires specific secondary metadata from the manager node Monitor in the VBD cluster based on the request.
Here, the metadata includes a file name of the target data, a directory structure and access authority in the virtual block device, and the like.
The secondary metadata includes a file name of the target data, a directory structure and access rights of the target data in at least one physical disk constituting the virtual block device, and the like.
The Monitor returns the secondary metadata to the VBD service; the VBD service returns the secondary metadata to the OSD; and the OSD acquires a target data copy corresponding to the secondary metadata from the specific physical Disk according to the secondary metadata.
Here, it should be noted that different target data copies are deployed, for example, on different disks constituting the first virtual block device. In this example, the disks constituting the first virtual block device are Disk1, Disk2, and Disk3, and each disk stores one copy of the target data. The preset first reading order is Disk1 → Disk2 → Disk3: when the OSD cannot read the target data copy from Disk1, it reads the copy from Disk2; when the copy cannot be read from Disk2 either, it reads the copy from Disk3.
The probability that the three disks forming the virtual block device simultaneously lose the target data copy is extremely low, and the reliability of the data is further ensured.
In addition, when data is written, the OST also sends a write request to the VBD service; the VBD service sends the writing request to the Monitor to generate secondary metadata; and the VBD service writes the corresponding target data to different disks that make up the VBD to be stored as different copies of the target data in the multiple disks.
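The write path can be sketched in the same spirit. The `Monitor` and `VBDService` classes below are hypothetical stand-ins, the secondary metadata is reduced to a list of disk names per key, and picking the first three disks is an assumed placement policy that the text does not specify.

```python
from typing import Dict, List


class Monitor:
    """Hypothetical manager node: records which disks hold the copies of each key."""

    def __init__(self) -> None:
        self.secondary_metadata: Dict[str, List[str]] = {}

    def place(self, key: str, disk_names: List[str]) -> List[str]:
        self.secondary_metadata[key] = list(disk_names)
        return self.secondary_metadata[key]


class VBDService:
    """Hypothetical VBD service: writes one copy of the data to each chosen disk."""

    def __init__(self, disks: Dict[str, Dict[str, bytes]], monitor: Monitor,
                 copies: int = 3) -> None:
        if len(disks) < copies:
            raise ValueError("need at least as many disks as copies")
        self.disks = disks
        self.monitor = monitor
        self.copies = copies

    def write(self, key: str, data: bytes) -> List[str]:
        # Assumed policy: use the first `copies` disks in insertion order.
        chosen = list(self.disks)[: self.copies]
        placement = self.monitor.place(key, chosen)  # generate secondary metadata
        for disk_name in placement:
            self.disks[disk_name][key] = data  # store one copy per disk
        return placement


disks: Dict[str, Dict[str, bytes]] = {"Disk1": {}, "Disk2": {}, "Disk3": {}}
monitor = Monitor()
placement = VBDService(disks, monitor).write("obj-1", b"payload")
```

Because every write lands on three separate disks, a later read can fall back from one disk to the next in the preset order when a copy is missing.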
The cluster file system provided by the embodiment of the disclosure comprises an object storage server, a plurality of object storage devices and a plurality of first virtual block devices; wherein the object storage server is connected with a plurality of the object storage devices; each object storage device is connected with at least one first virtual block device; the object storage server is configured to receive a first Input/Output (IO) instruction sent by a client, and access a first virtual block device through the object storage device based on the first IO instruction; target data is stored in the first virtual block device; the first virtual block device can aggregate a plurality of physical disks Disk together to form a large-capacity virtual device, and therefore the first virtual block device has stronger expandability.
Meanwhile, the target data can be stored as at least three copies in the first virtual block device, which greatly improves the reliability of the data.
In addition, the target data of the embodiment of the present disclosure includes a data block into which a plurality of files are aggregated; for a metadata server, the file directory size required to be maintained is significantly reduced, and the cluster file system provided by the embodiment of the disclosure can be applied to massive small files.
Referring to fig. 4, an embodiment of the present disclosure further provides a data processing method, including:
S401: receiving an input/output (IO) instruction sent by the client.
S402: based on the IO instruction, controlling a storage device to write data corresponding to the IO instruction into the virtual block device, or reading data corresponding to the IO instruction from the virtual block device.
Wherein the data is stored as at least three copies of data in the virtual block device.
In one possible implementation, the storage device includes: an object storage device; the virtual block device includes: a first virtual block device; the IO instruction includes: a first IO instruction; the data comprises target data;
based on the IO instruction, controlling a storage device to write data corresponding to the IO instruction into the virtual block device, or reading data corresponding to the IO instruction from the virtual block device includes:
based on a first IO instruction, controlling the object storage device to write target data corresponding to the first IO instruction into the first virtual block device, or reading the target data corresponding to the first IO instruction from the first virtual block device.
Here, for a specific data reading manner and a data writing manner of the target data, reference may be made to the above embodiment corresponding to fig. 1, and details are not described here again.
In another possible embodiment, the storage device includes: a metadata storage device; the virtual block device includes: a second virtual block device; the IO instruction includes: a second IO instruction; the data comprises metadata;
based on the IO instruction, controlling a storage device to write data corresponding to the IO instruction into the virtual block device, or reading data corresponding to the IO instruction from the virtual block device includes:
based on a second IO instruction, controlling the metadata storage device to write metadata corresponding to the second IO instruction into the second virtual block device, or reading metadata corresponding to the second IO instruction from the second virtual block device.
Here, for a specific reading and writing manner of the metadata, reference may be made to the embodiment corresponding to fig. 1, which is not described herein again.
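Steps S401 and S402, together with the two embodiments above, can be summarized in a short dispatcher sketch. The `IOInstruction` fields, the `ReplicatedDevice` class, and the `handle_io` function are hypothetical names introduced for illustration; the "target" and "metadata" kinds correspond to the first and second IO instructions respectively.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IOInstruction:
    kind: str  # "target" (first IO instruction) or "metadata" (second IO instruction)
    op: str    # "read" or "write"
    key: str
    payload: Optional[bytes] = None


class ReplicatedDevice:
    """Hypothetical storage device backed by a virtual block device with N copies."""

    def __init__(self, copies: int = 3) -> None:
        self.replicas: List[Dict[str, bytes]] = [{} for _ in range(copies)]

    def write(self, key: str, data: bytes) -> None:
        for replica in self.replicas:
            replica[key] = data

    def read(self, key: str) -> bytes:
        for replica in self.replicas:
            if key in replica:
                return replica[key]
        raise KeyError(key)


def handle_io(instr: IOInstruction, devices: Dict[str, ReplicatedDevice]) -> Optional[bytes]:
    """S401 receives the instruction; S402 routes it to the matching device."""
    device = devices[instr.kind]
    if instr.op == "write":
        if instr.payload is None:
            raise ValueError("write instruction needs a payload")
        device.write(instr.key, instr.payload)
        return None
    if instr.op == "read":
        return device.read(instr.key)
    raise ValueError(f"unknown operation {instr.op!r}")


devices = {"target": ReplicatedDevice(), "metadata": ReplicatedDevice()}
handle_io(IOInstruction("metadata", "write", "a.txt", b"meta"), devices)
handle_io(IOInstruction("target", "write", "obj-1", b"data"), devices)
read_back = handle_io(IOInstruction("target", "read", "obj-1"), devices)
```

Keeping the two kinds of instruction behind one entry point mirrors the way the method claims cover both the object storage path and the metadata storage path.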
It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.
The embodiment of the present disclosure further provides a computer device 10, as shown in fig. 5, which is a schematic structural diagram of the computer device 10 provided in the embodiment of the present disclosure, and includes:
a processor 11 and a memory 12 connected to each other, the memory 12 storing machine-readable instructions executable by the processor 11, the machine-readable instructions being executed by the processor 11 to implement the following steps when the computer device runs:
receiving an input/output (IO) instruction sent by a client;
based on the IO instruction, controlling storage equipment to write data corresponding to the IO instruction into the virtual block equipment or read data corresponding to the IO instruction from the virtual block equipment;
wherein the data is stored as at least three copies of data in the virtual block device.
For the specific execution process of the instruction, reference may be made to the steps of the data processing method described in the embodiments of the present disclosure, and details are not described here.
The embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the data processing method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The computer program product of the data processing method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute steps of the data processing method in the above method embodiments, which may be referred to specifically for the above method embodiments, and are not described herein again.
The embodiments of the present disclosure also provide a computer program, which when executed by a processor implements any one of the methods of the foregoing embodiments. The computer program product may be embodied in hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are merely specific embodiments of the present disclosure, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may, within the technical scope of the present disclosure, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present disclosure, and shall be covered by its protection scope. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (10)

1. A clustered file system, comprising: an object storage server, a plurality of object storage devices, and a plurality of first virtual block devices;
wherein the object storage server is connected with a plurality of the object storage devices; each object storage device is connected with at least one first virtual block device;
the object storage server is used for receiving a first input/output (IO) instruction sent by a client and accessing the first virtual block device through the object storage device based on the first IO instruction;
target data is stored in the first virtual block device;
wherein the target data is stored as at least three copies of data in the first virtual block device.
2. The clustered file system of claim 1, further comprising: a metadata server, a plurality of metadata storage devices, and a plurality of second virtual block devices;
wherein the metadata server is connected to a plurality of the metadata storage devices; each metadata storage device is connected with at least one second virtual block device;
the metadata server is configured to receive a second IO instruction sent by the client, and access the second virtual block device through the metadata storage device based on the second IO instruction;
metadata corresponding to the target data is stored in the second virtual block device;
wherein the metadata is stored as at least three copies of metadata in the second virtual block device.
3. The cluster file system of claim 2, wherein the target data comprises: a data block aggregated from a plurality of files;
the metadata corresponding to the data blocks includes location information of each of the files in the data blocks.
4. The cluster file system of any of claims 1-3, wherein the first virtual block device is further configured to determine and return a target data copy to the object storage device from among the at least three data copies corresponding to the first IO instruction according to a preset first read order.
5. A data processing method, comprising:
receiving an input/output (IO) instruction sent by a client;
based on the IO instruction, controlling storage equipment to write data corresponding to the IO instruction into the virtual block equipment or read data corresponding to the IO instruction from the virtual block equipment;
wherein the data is stored as at least three copies of data in the virtual block device.
6. The data processing method of claim 5, wherein the storage device comprises: an object storage device; the virtual block device includes: a first virtual block device; the IO instruction includes: a first IO instruction; the data comprises target data;
based on the IO instruction, controlling a storage device to write data corresponding to the IO instruction into the virtual block device, or reading data corresponding to the IO instruction from the virtual block device includes:
based on a first IO instruction, controlling the object storage device to write target data corresponding to the first IO instruction into the first virtual block device, or reading the target data corresponding to the first IO instruction from the first virtual block device.
7. The data processing method according to claim 5 or 6, wherein the storage device comprises: a metadata storage device; the virtual block device includes: a second virtual block device; the IO instruction includes: a second IO instruction; the data comprises metadata;
based on the IO instruction, controlling a storage device to write data corresponding to the IO instruction into the virtual block device, or reading data corresponding to the IO instruction from the virtual block device includes:
based on a second IO instruction, controlling the metadata storage device to write metadata corresponding to the second IO instruction into the second virtual block device, or reading metadata corresponding to the second IO instruction from the second virtual block device.
8. The data processing method according to any one of claims 5 to 7, wherein the reading data corresponding to the IO instruction from the virtual block device includes:
sending the IO instruction to the virtual block device, so that the virtual block device determines and returns a target data copy to the object storage device from at least three data copies corresponding to the IO instruction according to a preset reading order.
9. An electronic device, comprising: an interconnected processor and memory, the memory storing machine-readable instructions executable by the processor, the machine-readable instructions being executable by the processor to implement a data processing method as claimed in any one of claims 5 to 8 when executed by a computer device.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the data processing method of any one of claims 5 to 8.
CN202010343972.6A 2020-04-27 2020-04-27 Cluster file system, data processing method, computer equipment and storage medium Active CN111522514B (en)

Publications (2)

Publication Number Publication Date
CN111522514A (en) 2020-08-11
CN111522514B (en) 2023-11-03

Family

ID=71906212

Country Status (1)

Country Link
CN (1) CN111522514B (en)

Also Published As

Publication number Publication date
CN111522514B (en) 2023-11-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant