CN116501252A - Data management method and device - Google Patents

Data management method and device

Info

Publication number
CN116501252A
CN116501252A (application number CN202310423776.3A)
Authority
CN
China
Prior art keywords
file
stored
file block
disk
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310423776.3A
Other languages
Chinese (zh)
Inventor
江文龙
戴恩亮
李文俊
周明伟
王志豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202310423776.3A priority Critical patent/CN116501252A/en
Publication of CN116501252A publication Critical patent/CN116501252A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0643Management of files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device
    • G06F3/0676Magnetic disk device
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a data management method and device. The data management method comprises the following steps: creating at least one storage area on a conventional magnetic recording (CMR) disk; writing data to be stored into a storage area in the form of files; and storing the correspondence between the data to be stored and the storage area, so that data on the conventional magnetic recording disk can be managed, based on the correspondence, with the storage area as the management unit. The method can improve the efficiency with which a storage device manages a CMR disk.

Description

Data management method and device
Technical Field
The present disclosure relates to the field of data management technologies, and in particular, to a data management method and apparatus.
Background
A CMR (Conventional Magnetic Recording) disk is a conventional magnetic recording disk whose adjacent tracks are independent and separated by gaps; it therefore supports reading and writing at random locations and freeing storage space at any location, and it can be managed by an ordinary file system. At present, management operations on a CMR disk, such as reading, writing, and deleting expired file blocks, are generally performed directly through a standard file system. This makes the management of expired file blocks on the CMR disk overly cumbersome, so the efficiency with which a storage device manages the CMR disk is low.
Disclosure of Invention
The application provides a data management method and device, which can improve the efficiency with which a storage device manages a CMR disk.
To achieve the above object, the present application provides a data management method, including:
creating at least one storage area on a conventional magnetic recording disk;
writing data to be stored into a storage area in the form of files;
and storing the correspondence between the data to be stored and the storage area, so that data on the conventional magnetic recording disk can be managed, based on the correspondence, with the storage area as the management unit.
In one embodiment, the method further comprises:
and performing data management on a preset disk using the same partition management logic as that of the conventional magnetic recording disk, where the disk type of the preset disk differs from that of the conventional magnetic recording disk.
In one embodiment, writing the data to be stored into a storage area in the form of files, so that the conventional magnetic recording disk performs file management with the storage area as the management unit, includes:
if the used space of the storage area is greater than or equal to a preset space threshold, creating a new storage area and writing the data to be stored into the newly created storage area;
and if the used space of the storage area is smaller than the preset space threshold, writing the data to be stored into that storage area.
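The threshold rule above can be sketched as follows. This is an illustrative sketch only; the class and constant names (StorageArea, CmrDisk, PRESET_SPACE_THRESHOLD) are assumptions, not taken from the patent.

```python
PRESET_SPACE_THRESHOLD = 256 * 1024 * 1024  # e.g. one SMR Zone block, per the description

class StorageArea:
    def __init__(self, area_id: int):
        self.area_id = area_id
        self.used = 0  # bytes already written into this area

class CmrDisk:
    def __init__(self):
        self.areas = []

    def write(self, data: bytes) -> int:
        # Create a new storage area when none exists yet or when the latest
        # one has reached the preset space threshold; otherwise reuse it.
        if not self.areas or self.areas[-1].used >= PRESET_SPACE_THRESHOLD:
            self.areas.append(StorageArea(len(self.areas) + 1))
        area = self.areas[-1]
        area.used += len(data)
        return area.area_id  # returned id records the data-to-area correspondence
```

Returning the area id models the stored correspondence between data and storage area that later lookups rely on.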
The data management method is applied to a storage device on which at least two types of disks are deployed, or whose distributed data management system contains at least two types of disks; the method further comprises the steps of:
acquiring a file block to be stored;
when the file block to be stored is a repaired damaged file block, judging whether, among all disks on the storage device of the type corresponding to the file block to be stored, there is a disk meeting a preset requirement;
if such a disk exists, storing the file block to be stored on the disk meeting the preset requirement;
if no such disk exists, storing the file block to be stored on a disk meeting the preset requirement among the disks of types other than the type corresponding to the file block;
where the type corresponding to a file block is the type of the disk on which the original damaged file block was stored.
Judging whether, among all disks of the type corresponding to the file block to be stored on the storage device, there is a disk meeting the preset requirement comprises:
selecting a disk from all disks of the type corresponding to the file block to be stored according to a load balancing strategy;
if a disk is selected, that disk is the disk meeting the preset requirement among all disks of the corresponding type, and the step of storing the file block on the disk meeting the preset requirement is executed;
and if none of the disks of the corresponding type satisfies the load balancing strategy, executing the step of storing the file block on a disk meeting the preset requirement among the disks of the other types.
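The type-preferred selection with fallback described above can be sketched as a small function. The dict layout and the load-balancing predicate are assumptions made for illustration.

```python
def select_disk(disks, block_type, load_ok):
    """Pick a disk for a repaired file block: prefer disks of the block's
    original storage type that pass the load-balancing predicate, and fall
    back to qualifying disks of other types only if none is found."""
    same = [d for d in disks if d["type"] == block_type and load_ok(d)]
    if same:
        return same[0]
    other = [d for d in disks if d["type"] != block_type and load_ok(d)]
    return other[0] if other else None
```

A caller would supply `load_ok` according to whatever load balancing strategy is configured (round-robin state, weights, or a load ceiling as below).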
Obtaining the file block to be stored comprises:
in response to obtaining a file recovery task, reading the undamaged file blocks of the file;
recovering the damaged file blocks of the file based on the undamaged file blocks to obtain at least one repaired file block of the file;
and taking, as the file block to be stored, each repaired file block whose preset storage node is this storage device.
After recovering the damaged file blocks of the file based on the undamaged file blocks to obtain at least one repaired file block, the method further comprises:
if a repaired file block's preset storage node is not this storage device, sending the repaired file block and its corresponding type information to its preset storage node, so that the preset storage node stores the repaired file block on a disk based on that type information.
Wherein the method further comprises:
and counting the device's own recovery-task processing status, and limiting read-write traffic and/or the number of recovery tasks based on that status.
The recovery-task processing status includes the current number of tasks and/or the current recovery read-write traffic. Counting the device's own recovery-task processing status and limiting read-write traffic and/or the number of recovery tasks based on it includes:
in response to obtaining a file recovery task, judging whether the current number of tasks has reached a first threshold, and if so, waiting for a resource refresh before executing the file recovery task;
and/or counting the current recovery read-write traffic during file recovery, and if the traffic reaches a second threshold, waiting for a resource refresh before performing the read-write operation.
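The two limits above (a cap on concurrent recovery tasks, and a traffic cap per refresh window) can be sketched as a small limiter. Names and the refresh model are illustrative assumptions, not the patent's implementation.

```python
class RecoveryLimiter:
    """First threshold: max concurrent recovery tasks.
    Second threshold: max recovery read-write traffic per refresh window."""

    def __init__(self, max_tasks: int, max_traffic: int):
        self.max_tasks = max_tasks
        self.max_traffic = max_traffic
        self.tasks = 0
        self.traffic = 0

    def try_start_task(self) -> bool:
        if self.tasks >= self.max_tasks:
            return False          # caller waits for a resource refresh
        self.tasks += 1
        return True

    def try_io(self, nbytes: int) -> bool:
        if self.traffic + nbytes > self.max_traffic:
            return False          # caller waits for a resource refresh
        self.traffic += nbytes
        return True

    def refresh(self):
        # A periodic resource refresh resets the traffic counter.
        self.traffic = 0
```

Callers that receive `False` would block or requeue until the next refresh, which matches the "wait for the resource refresh" behavior described above.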
To achieve the above object, the present application further provides an electronic device, including a processor; the processor is configured to execute instructions to implement the above-described method.
To achieve the above object, the present application also provides a computer readable storage medium storing instructions/program data capable of being executed to implement the above method.
In the present application, at least one storage area is created on the conventional magnetic recording disk, the data to be stored is written into a storage area in the form of files, and after the data is written the correspondence between the data and the storage area is stored. Based on this correspondence, the storage device can quickly find the corresponding stored data, which makes the stored data easy to manage; the conventional magnetic recording disk can then manage data with the storage area as the management unit, which simplifies the management of data on the disk.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a schematic diagram of an embodiment of a distributed data management system of the present application;
FIG. 2 is a flow chart of an embodiment of a data management method of the present application;
FIG. 3 is a flow chart of an embodiment of a data management method of the present application;
FIG. 4 is a schematic diagram of a unified storage base of storage nodes in the distributed data management system of the present application;
FIG. 5 is a schematic flow chart of file block reading in the data management method of the present application;
FIG. 6 is a flow chart of another embodiment of a data management method of the present application;
FIG. 7 is a flow chart of another embodiment of a data management method of the present application;
FIG. 8 is a flow chart of yet another embodiment of a data management method of the present application;
FIG. 9 is a schematic diagram of an embodiment of an electronic device of the present application;
fig. 10 is a schematic structural diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application is made clearly and completely with reference to the accompanying drawings; it is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the present disclosure without inventive effort fall within the scope of the present disclosure. In addition, the term "or" as used herein refers to a non-exclusive "or" (i.e., "and/or") unless otherwise indicated. Also, the various embodiments described herein are not necessarily mutually exclusive, as some embodiments may be combined with one or more other embodiments to form new embodiments.
An SMR disk divides its internal physical storage into a series of logical zones, called Zones, according to certain logic. Writes within a Zone are sequential only; random writes are not supported, and data is written moving backward from the position of the write pointer. The Zone size is usually fixed at 256 MB. Data within a Zone can be read randomly.
In the existing scheme for distributed storage on SMR disks, a distributed object storage system is designed around the ZoneGroup (ZG): a group of freshly initialized Zones is associated through a ZGID, and the sliced data corresponding to erasure codes is written concurrently, in batches, into that group of Zones. This provides unified load and management capability at the ZG level and supports erasure coding through policy management of the raw SMR disk's Zone space. However, it requires the metadata nodes to manage and maintain all storage areas of all SMR disks on all storage nodes, which imposes a heavy management burden on the metadata nodes, makes their distributed storage efficiency relatively poor, and leaves the scheme incompatible with other disk types (such as CMR disks).
Based on this, the application proposes a data management method in which the metadata node only needs to determine, based on a client's data writing request, the predetermined number of storage nodes that will store the client data; each of those storage nodes then decides whether to place the client's data on its own SMR disk and into which storage area. The metadata node therefore does not need to manage and maintain all storage areas of all SMR disks, and can uniformly manage all storage nodes with a conventional distributed file management method regardless of the disk types on the storage nodes. The disk type on a storage node is thus completely transparent to the metadata node, which greatly reduces the metadata node's management burden, improves its distributed storage efficiency, and allows the distributed data management system comprising the metadata node and the storage nodes to be compatible with other disk types such as CMR (Conventional Magnetic Recording) disks.
First, the present application provides a distributed data management system. As shown in fig. 1, the distributed data management system includes a metadata node and a storage node.
A distributed data management system may include one or more storage nodes, which are responsible for storing data, managing hard disks, and the like. A storage node can also provide functions such as responding to data-stream writes, onboarding and managing disks, and/or periodically scanning and reporting file blocks. Each storage node may manage a single type of disk or a mix of disk types.
A distributed data management system may include one or more metadata nodes. The metadata nodes may be organized in a single machine, a master and slave machine, a cluster, and the like.
The metadata nodes may be responsible for managing file information, metadata information, task scheduling, etc. of the distributed data management system.
That is, the metadata nodes internally maintain a metadata image of the entire system and respond to metadata requests from the whole distributed data management system. The metadata maintained by the metadata nodes may include the file states in the system; in other words, the metadata nodes control and manage a global file view of the distributed data management system. The global file view may include file block identification information, file block status bits, the offset of each file block within its file, and the length and location of the file blocks that make up the file.
The metadata node's responsibility for task scheduling in the distributed data management system can be understood as follows: in response to a client data storage request, the metadata node determines the storage nodes that will store the client data, so as to achieve load balancing across storage nodes. Specifically, the metadata node considers the load of all storage nodes in the system and selects a target number of storage nodes through a load balancing strategy (round-robin, weighted selection, or load-based selection), returning the storage node list to the client so that the data can be written to the corresponding storage nodes. After the metadata node returns the storage node list, the client writes service data concurrently and directly to the storage nodes on the list, which is one reason for the efficiency of the distributed file system. The file reading flow is similar: after the client obtains the storage node list from the metadata node, it reads the data directly from the relevant storage nodes. That is, when a file is written to the distributed data management system, the content of each file block of the file is written to storage nodes according to the configured rules. Preferably, different file blocks of the same file are written to different storage nodes through a load balancing policy.
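Two of the load balancing strategies mentioned above can be sketched as follows; the node-dict layout and the module-level round-robin cursor are assumptions for illustration only.

```python
def pick_storage_nodes(nodes, target_count, strategy="load"):
    """Return the names of target_count storage nodes for a write.
    'load' picks the least-loaded nodes; 'round_robin' rotates through
    the node list ('rotation' in the description)."""
    if strategy == "load":
        ranked = sorted(nodes, key=lambda n: n["load"])
        return [n["name"] for n in ranked[:target_count]]
    if strategy == "round_robin":
        start = pick_storage_nodes.cursor % len(nodes)
        pick_storage_nodes.cursor += target_count
        return [nodes[(start + i) % len(nodes)]["name"] for i in range(target_count)]
    raise ValueError(strategy)

pick_storage_nodes.cursor = 0  # rotation state for round-robin selection
```

The metadata node would run a function like this against its cached node loads and hand the resulting list back to the client.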
Based on the file storage flow and the file reading flow, the metadata node may provide only metadata management functions, i.e., management of the corresponding signaling flow. The data-stream interaction, that is, the writing of the real service data, is the responsibility of the corresponding storage nodes.
In addition, the metadata node can also control processes such as file recovery. Metadata-node-controlled file recovery can be understood as: the metadata node executes the file recovery process and controls the recovery progress. In other embodiments, it can also be understood as: the metadata node issues a recovery task to a target storage node, so that the target storage node executes the file's recovery task.
In the distributed data management system, when a storage node starts and registers, it reports its stored-file information (such as the information of the blocks in each file) to the metadata node; the metadata node updates its in-memory metadata cache of block states based on this information and refreshes the real-time state of the files. A storage node can also periodically report its internal stored-file information so that the metadata node can check and update the cache. After a real-time service IO write completes, the storage node likewise reports the newly written data information to the metadata node. Through this periodic and/or real-time reporting, the correctness of the metadata cache in the metadata node is ensured, and with it the correctness and accuracy of the metadata node's load balancing.
The present application provides a data management method, as shown in fig. 2, which includes the following steps. The data management method can be applied to a storage device, and the storage device can be a stand-alone storage device or can be a storage node in the distributed data management system. It should be noted that the following step numbers are only for simplifying the description, and are not intended to limit the execution order of the steps, and the execution order of the steps of the present embodiment may be arbitrarily changed without departing from the technical idea of the present application.
S101: at least one storage area is created for a conventional magnetic recording disk.
S102: and writing the data to be stored into the storage area in the form of a file.
S103: and storing the corresponding relation between the data to be stored and the storage area, so as to manage the data in the traditional magnetic recording disk by taking the storage area as a management unit based on the corresponding relation.
In the prior art, data transmitted from a client is written directly to a conventional magnetic recording (CMR) disk through a standard file system. By contrast, this embodiment creates at least one storage area on the conventional magnetic recording disk, writes the data to be stored into a storage area in the form of files, and, after writing, stores the correspondence between the data and the storage area. The storage device can then quickly find the corresponding stored data based on this correspondence, which makes the stored data easy to manage; the conventional magnetic recording disk can manage data with the storage area as the management unit, which simplifies data management on the disk.
In one possible implementation, the storage space of a conventional magnetic recording disk may be divided directly into a plurality of storage areas. When there is data to be recorded, the data is written into a storage area having sufficient storage space.
In another implementation, for a new conventional magnetic recording disk, a storage area may be created when there is data to be recorded, and the data is then written to that storage area. When the unused space of the most recently created storage area is insufficient and new data needs to be recorded, another storage area is created on the disk and the new data is written into it. In this way all data to be recorded is placed into the storage areas of the conventional magnetic recording disk, so the disk can manage data with the storage area as the management unit.
When the storage area is created, the storage space of the created storage area can be set to be a preset value, so that the sizes of the plurality of storage areas on the traditional magnetic recording disk are the same, and the plurality of storage areas on the traditional magnetic recording disk are convenient to manage.
Determining that the unused storage space of a storage area is insufficient may mean confirming whether the unused space of the storage area is below a lower limit (e.g., zero, or the minimum size of a file block).
Of course, when the storage space of a storage area is preset to a fixed value, whether the unused space of the most recently created storage area is sufficient may also be confirmed by checking whether the used space of that storage area is greater than or equal to a preset space threshold. Specifically, if the used space of the most recently created storage area is greater than or equal to the preset space threshold, its unused space is insufficient; if it is smaller than the preset space threshold, its unused space is sufficient. The preset space threshold may be a preset value, for example 125 MB or 250 MB, related to the storage space of a Zone of the SMR disk. In other embodiments, the preset space threshold may equal the difference between the preset value and the size of the current data to be stored, so that after the current data is written the used space of the storage area does not exceed the preset value.
The data management method described above may be applied to a storage device provided with at least two types of disks, or to a distributed data management system provided with at least two types of disks. That is, the method can be applied both to the conventional magnetic recording disk and to a preset disk whose disk type differs from that of the conventional magnetic recording disk: the two are managed with unified partition management logic, so at least two kinds of disks can be managed in the same way, which simplifies their management. Managing the conventional magnetic recording disk and the preset disk with unified partition management logic can be understood as deleting expired data from both disks in units of storage areas. In some embodiments, it can also be understood as performing read or write operations on the data in both disks according to the stored correspondence between data and storage areas.
The preset disk may include an SMR disk; that is, the at least two types of disks may include the above-mentioned CMR disk and an SMR disk. Managing both the CMR disk and the SMR disk in units of storage areas facilitates data management in the storage device or the distributed data management system.
For example, with reference to the Zone block size of an SMR disk (typically 256 MB), the CMR disk also creates Zone directories of that size on demand; each Zone directory holds the file block data and the metadata files corresponding to the data blocks. As shown in Fig. 3, when file block data is written to the CMR disk, a Zone1 directory is first created, the file block data is written into the Zone1 directory in the form of files, and the metadata files of the corresponding file blocks are generated. When the Zone1 directory reaches the Zone block size of the SMR disk, a Zone2 directory is created and subsequent file block data is written into it in the same way, generating the metadata files of the corresponding file blocks, and so on.
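The Zone-directory write cycle just described can be sketched as follows. The directory layout (`Zone<N>`), metadata format, and function names are assumptions for illustration, not the patent's implementation.

```python
import json
import os

ZONE_SIZE = 256 * 1024 * 1024  # mirrors the SMR Zone block size

def write_block(root: str, zone_usage: dict, block_name: str, data: bytes) -> int:
    """Write a file block into the latest Zone directory on a CMR disk,
    creating Zone1 on first use and Zone<N+1> once the current directory
    reaches ZONE_SIZE; a small metadata file is written next to the block."""
    current = max(zone_usage) if zone_usage else 0
    if current == 0 or zone_usage[current] >= ZONE_SIZE:
        current += 1
        zone_usage[current] = 0
        os.makedirs(os.path.join(root, f"Zone{current}"), exist_ok=True)
    zone_dir = os.path.join(root, f"Zone{current}")
    with open(os.path.join(zone_dir, block_name), "wb") as f:
        f.write(data)
    # Metadata file recording the block-to-zone correspondence and length.
    with open(os.path.join(zone_dir, block_name + ".meta"), "w") as f:
        json.dump({"block": block_name, "zone": current, "length": len(data)}, f)
    zone_usage[current] += len(data)
    return current
```

`zone_usage` stands in for whatever bookkeeping tracks how full each Zone directory is.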
Thus, on the CMR disk, file blocks from the same time period can be placed in the same storage area. When the file blocks of a time period expire, expired-file-block management can be accomplished by deleting all file blocks in the storage area corresponding to that period. This avoids the problem that arises when the CMR disk is managed through a standard file system, where expired file blocks must be located one by one, based on their storage time, before they can be deleted, making expired-file-block management overly cumbersome.
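Whole-area expiry then reduces to removing one directory. A minimal sketch, assuming a `Zone<N>` directory layout (the path convention is illustrative):

```python
import os
import shutil

def expire_zone(root: str, zone_id: int) -> None:
    """Delete every file block of one time period at once by removing the
    whole storage-area directory, instead of locating and deleting expired
    blocks one by one."""
    shutil.rmtree(os.path.join(root, f"Zone{zone_id}"))
```

One `rmtree` replaces per-block scans of storage timestamps, which is the efficiency gain the paragraph above describes.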
To make it easy for a storage node to read and store file blocks, a unified storage base can be provided on the storage node, so that the storage node stores and reads file blocks through the unified storage base.
The unified storage base of the storage node service layer may be as shown in fig. 4. An SMR disk is divided into a CMR area and an SMR area (the Zone block area). The CMR area is independently formatted with a standard file system (such as xfs, ext3, or ext4) and used as a KV index partition supporting random reads and writes, to manage the metadata information of the file blocks in the SMR data partition (such as the correspondence between file blocks and Zone blocks). The SMR area is a bare block device with no file system and is used to store file block contents. Because a CMR disk supports random reads and writes, it can be directly formatted with a standard file system, and the file block data and the corresponding metadata information are persisted directly in file form.

The storage base senses disks of different media types, automatically loads and manages the metadata information of file blocks according to the medium, and reads and writes data from the different media. The metadata information of an SMR disk is stored in KV form in the CMR area of that disk; the Zone block holding the file block data, the position within the Zone block, and similar information can be looked up by the file block information, and the file block data is then read from the SMR Zone block area accordingly. When a node writes file block data, a Zone is allocated first, and the KV index is updated after the data is written. Similarly, the metadata information and file block contents on a CMR disk are stored as files in the corresponding Zone directory, and reading or writing file block data directly operates on the file named after the file block in that Zone directory.
The flow by which a storage node reads a file block through the unified storage base may be as shown in fig. 5. The storage node finds the disk/directory corresponding to the file block in the global file-block-to-disk index map, and then calls the appropriate disk management interface through the universal storage interface to read data from the corresponding disk. When reading data on an SMR disk, the node first uses the file block name to look up, in the metadata recorded in the KV index partition, the storage unit (i.e., the Zone) holding the file block and the offset and length of the file block within that Zone, and then reads file block data of the corresponding length from that offset in the Zone. When reading data on a CMR disk, the node finds the Zone directory corresponding to the file block in the global index map, locates the data file with the corresponding name in that Zone directory by the file block name, and then reads the file contents directly.
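The two read paths of fig. 5 can be sketched as one dispatch function. The index layouts and field names here are illustrative assumptions, and a regular file stands in for the raw SMR device in the usage example:

```python
import os

ZONE_SIZE = 256 * 1024 * 1024   # assumed fixed Zone size on the raw SMR area

def read_file_block(global_index, kv_index, block_name):
    """Dispatch a file block read to the right medium. For an SMR disk the
    KV index yields (zone, offset, length) and the data is read from the
    raw zone area; for a CMR disk the block is an ordinary file stored in
    its Zone directory."""
    entry = global_index[block_name]        # e.g. {"type": "SMR", "location": ...}
    if entry["type"] == "SMR":
        zone, offset, length = kv_index[block_name]
        with open(entry["location"], "rb") as dev:    # raw SMR block device
            dev.seek(zone * ZONE_SIZE + offset)
            return dev.read(length)
    # CMR: "location" is the Zone directory holding the block as a plain file
    with open(os.path.join(entry["location"], block_name), "rb") as f:
        return f.read()
```

The caller never distinguishes the media; only the per-disk branch inside the storage base does.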
For a storage device provided with at least two types of disks, or a distributed data management system provided with at least two types of disks, the present application further provides a data management method according to another embodiment. In this embodiment, when the file block to be stored is a restored damaged file block, it is preferentially stored on a disk of the type corresponding to that file block; only when none of the disks of the corresponding type satisfies a preset condition is the file block stored on a disk of another type. In this way, the storage disk type of a file block is preserved as far as possible during recovery, so that as many file blocks of a file as possible remain on one type of disk, which facilitates file management.
As shown in fig. 6, the data management method of this another embodiment may include the following steps.
S201: acquiring a file block to be stored.
A file block to be stored is acquired. The file block to be stored may be a new file block to be recorded, a file block that was recorded but damaged and has since been repaired (i.e., a restored damaged file block), or a file block of various other types. When the acquired file block to be stored is a restored damaged file block, the method proceeds to step S202, and the restored damaged file block is stored through the following steps.
S202: when the file block to be stored is a restored damaged file block, judging whether a disk satisfying the preset requirement exists among all the disks on the storage device of the type corresponding to the file block to be stored.
When the file block to be stored is a restored damaged file block, it is determined whether a disk satisfying the preset requirement exists among all the disks on the storage device of the type corresponding to the file block. If so, the method proceeds to step S203, and the file block to be stored is stored on a disk of the corresponding type that satisfies the preset requirement; otherwise, the method proceeds to step S204, and the file block to be stored is stored on a disk satisfying the preset requirement among the disks of types other than the type corresponding to the file block.
The type corresponding to a file block is the storage disk type of the original damaged file block. For example, if the storage disk of the original damaged file block is an SMR disk, the corresponding type of the file block is SMR; if the storage disk of the original damaged file block is a CMR disk, the corresponding type is CMR.
Whether a disk satisfies the preset requirement may mean whether the remaining storage space of the disk is greater than or equal to the size of the file block to be stored. In this case, if the remaining storage space of every disk of the type corresponding to the file block is smaller than the size of the file block, then no disk of that type on the storage device satisfies the preset requirement, and the method proceeds to step S204; otherwise, it proceeds to step S203.
Alternatively, in other embodiments, a disk satisfying the preset requirement for the file block to be stored may mean a disk selected, based on a load balancing policy, from all the disks of the type corresponding to the file block. In this case, if a disk is selected by the load balancing policy from the disks of the corresponding type, the selected disk is a disk satisfying the preset requirement, and the method proceeds to step S203 to store the file block on the selected disk; if no disk of the corresponding type can be selected under the load balancing policy, i.e., none of those disks satisfies the policy, the method proceeds to step S204. The load balancing policy may comprehensively select a suitable disk according to various conditions such as disk capacity and disk load.
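The prefer-then-fall-back selection of steps S202 to S204 can be sketched with the simple free-space variant of the preset requirement (the dictionary shape and the most-free-space tiebreak are illustrative assumptions standing in for a real load balancing policy):

```python
def choose_disk(disks, block_size, preferred_type):
    """Pick a disk for a restored block: try disks of the block's original
    type first, and fall back to the other types only if none qualifies.
    A disk qualifies when its free space can hold the block (the simple
    'preset requirement'); a fuller policy would also weigh disk load."""
    preferred = [d for d in disks if d["type"] == preferred_type]
    others = [d for d in disks if d["type"] != preferred_type]
    for pool in (preferred, others):
        fits = [d for d in pool if d["free"] >= block_size]
        if fits:
            # simplest load-balancing stand-in: most free space first
            return max(fits, key=lambda d: d["free"])
    return None
```

Note that a qualifying disk of the original type always wins, even when a disk of the other type has more free space; this is what keeps a file's blocks on one media type.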
S203: storing the file block to be stored on a disk satisfying the preset requirement.
After confirming that a disk satisfying the preset requirement exists among the disks of the type corresponding to the file block to be stored on the storage device, the file block can be written to the corresponding disk through the universal storage interface (i.e., the unified storage base).
If the corresponding type of the file block to be stored is SMR, a suitable Zone block may be selected on an SMR disk satisfying the preset requirement, the file block is written into the selected Zone block, and the disk content index information (such as the KV index information of the SMR disk) is updated.
If the corresponding type of the file block to be stored is CMR, a suitable Zone directory may be selected on a CMR disk satisfying the preset requirement; a file named after the file block is then created directly, the data is written into the file, and the corresponding metadata information is written into the corresponding metadata file.
After the file block to be stored has been written, the mapping information between the file block and the disk/directory can be updated in the global index.
S204: storing the file block to be stored on a disk satisfying the preset requirement among all the disks of types other than the type corresponding to the file block.
If no disk among all the disks of the type corresponding to the file block to be stored on the storage device satisfies the preset requirement, a disk satisfying the preset requirement is determined among the disks of the other types, and the file block to be stored is then stored on that disk.
Similarly, in step S204, whether a disk satisfies the preset requirement may mean whether the remaining storage space of the disk is greater than or equal to the size of the file block to be stored; or a disk satisfying the preset requirement may mean a disk selected, based on the load balancing policy, from all the disks of types other than the type corresponding to the file block.
To help those skilled in the art understand the data management method of the above embodiment, a specific example is also provided. In this example, the storage device contains only SMR disks and CMR disks. If the original damaged file block of the file block to be stored is on an SMR disk, a disk is preferentially selected from the SMR disk list according to the load balancing policy; if no SMR disk on the node satisfies the policy, a suitable disk is selected from the CMR disk list according to the policy, and the file block to be stored is stored there. If the original damaged file block is on a CMR disk, a disk is preferentially selected from the CMR disk list according to the load balancing policy; if no CMR disk on the node satisfies the policy, a suitable disk is selected from the SMR disk list and the file block to be stored is stored there.
Further, if the file block to be stored is a restored damaged file block, then as shown in fig. 7, step S201 may include: reading the undamaged file blocks of the file in response to a file recovery task issued by the metadata node; recovering the damaged file blocks of the file based on the undamaged file blocks to obtain at least one repaired file block of the file; and taking each repaired file block whose preset storage node is the storage device as a file block to be stored.
The file recovery task may be issued by the metadata node. Specifically, when a file needs to be restored because file blocks are damaged or missing, the metadata node may determine a target storage node (i.e., the above-mentioned storage device) from all the storage nodes of the distributed storage system, and direct the storage device to restore the damaged and/or missing file blocks of the file, so that the storage device controls this round of the file data recovery flow. Through this recovery scheme, data recovery at the file block level can be achieved, and the file recovery function based on bare SMR disk management can be realized without the management node perceiving the differences between disk types such as CMR and SMR, which simplifies the system architecture design.
The file recovery task issued by the metadata node may carry information about the normal and damaged file blocks of the whole file, so that the storage device can read the normal file block data based on this information. The information about an undamaged file block may include its storage node information, so that the storage device can find the storage node storing the undamaged file block and read the corresponding block from it. When the storage device and the storage node storing an undamaged file block are not the same node, the storage device may send a read request to that node over the network, so that the undamaged file block is read over the network. When the storage device and the storage node storing an undamaged file block are the same node, i.e., the undamaged file block happens to reside on the storage device itself, the block can be read through the network loopback.
In addition, as can be seen from the data management method of the above embodiment, both CMR and SMR disks in the present application are managed in units of storage areas. When normal file block data is read, the corresponding storage node can read undamaged file block data for recovery from disks of different media types (without distinguishing between CMR and SMR disks) through the unified storage base, and can call different interfaces to operate the disks according to the storage medium. This realizes data read/write and recovery across SMR and CMR disk media, improving the extensibility and compatibility of the distributed storage system and reducing data loss.
After a sufficient number of undamaged file blocks of the file have been read, e.g., at least N undamaged file blocks, the damaged file blocks of the file can be restored from them. In one embodiment, a file is stored by being split into N file blocks, from which M redundant file blocks are computed with an EC erasure code; the resulting N+M file blocks are then written to at least one storage node. The management node distributes the N+M file blocks across different storage nodes according to a load balancing policy, maintaining the uniformity of cluster data and improving fault tolerance: the file blocks of one file are spread evenly over the storage nodes, so that the failure of any single storage node does not damage all file blocks of the file. In this case, when the number of undamaged file blocks of the file is greater than or equal to N, between 1 and M damaged file blocks can be recomputed from N undamaged file blocks through the EC erasure code, yielding at least one repaired file block of the file. Therefore, even when a file is damaged due to a hard disk fault, a storage node or metadata node fault, a network outage, or similar causes, the file can be completely recovered as long as the number of damaged file blocks is at most M.
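The N+M redundancy property can be illustrated with the simplest erasure code, a single XOR parity block (i.e., M = 1). Real deployments use EC codes such as Reed-Solomon that tolerate M > 1 losses, but the recovery condition is the same: any N surviving blocks suffice.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# split a file into N = 3 data blocks and compute M = 1 parity block
data_blocks = [b"hell", b"o wo", b"rld!"]
parity = xor_blocks(data_blocks)

# any single lost data block can be rebuilt from the N remaining blocks
lost_index = 1
survivors = [b for i, b in enumerate(data_blocks) if i != lost_index]
recovered = xor_blocks(survivors + [parity])
assert recovered == data_blocks[lost_index]
```

With N = 3 and M = 1, losing any one block (on any one storage node) leaves three survivors, which is enough to rebuild the file.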
After at least one repaired file block of the file is obtained, the repaired file blocks can be distributed to their respective storage nodes and stored. When the preset storage node of a repaired file block is the storage device itself, that repaired file block can be stored in the storage device using the data management method of the above embodiment.
For a repaired file block whose preset storage node is not the storage device, the repaired file block can be sent over the network to its preset storage node, which then stores it. The storage device may send the repaired file block together with its corresponding type information, so that the preset storage node stores the repaired file block on a disk based on that type information. The preset storage node can write the repaired file block to disks of different media types through the unified storage base.
The preset storage node of each repaired file block may be determined by the metadata node; in this case, the file recovery task issued by the metadata node may carry the preset storage node of each repaired file block. In other embodiments, the preset storage nodes of the repaired file blocks may also be determined by the storage device itself.
In addition, to prevent recovery tasks from consuming excessive system resources, the storage device can track its recovery task processing status and limit the read/write traffic and/or the number of recovery tasks accordingly, preventing recovery storms from consuming large amounts of system resources and ensuring that normal storage services are unaffected. The tracked recovery task processing status may include information such as the read/write traffic per second and the number of recovery tasks per second. As shown in fig. 8, after a recovery task is issued, the storage node first checks whether the current number of tasks in this second has reached a set threshold (i.e., a first threshold); if the upper limit has been reached, the node waits for the next second's resource refresh before executing the recovery task. During recovery, the node also tracks the current recovery read/write traffic; if it reaches a set per-second threshold (i.e., a second threshold), the read/write operation is executed only after waiting for the next second's resource refresh.
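The per-second gating of fig. 8 can be sketched as a small limiter. The counter layout, threshold names, and refresh-on-new-second behavior are illustrative assumptions; callers retry after the next second when a cap is hit:

```python
import time

class RecoveryLimiter:
    """Cap the number of recovery tasks and the recovery read/write bytes
    admitted per wall-clock second; both counters reset when a new second
    begins (the 'resource refresh')."""

    def __init__(self, max_tasks_per_sec, max_bytes_per_sec, clock=time.monotonic):
        self.max_tasks = max_tasks_per_sec
        self.max_bytes = max_bytes_per_sec
        self.clock = clock
        self.window = int(clock())
        self.tasks = 0
        self.bytes = 0

    def _refresh(self):
        # reset the counters when we enter a new one-second window
        now = int(self.clock())
        if now != self.window:
            self.window, self.tasks, self.bytes = now, 0, 0

    def try_start_task(self):
        """True if a recovery task may start now (first threshold)."""
        self._refresh()
        if self.tasks >= self.max_tasks:
            return False        # caller waits for the next second's refresh
        self.tasks += 1
        return True

    def try_io(self, nbytes):
        """True if nbytes of recovery read/write may proceed (second threshold)."""
        self._refresh()
        if self.bytes + nbytes > self.max_bytes:
            return False
        self.bytes += nbytes
        return True
```

Injecting the clock makes the per-second behavior testable without real waiting.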
Referring to fig. 9, fig. 9 is a schematic structural diagram of an embodiment of an electronic device 20 according to the present application. The electronic device 20 of the present application includes a processor 22, the processor 22 being configured to execute instructions to implement the methods provided by any of the method embodiments of the present application and any non-conflicting combination thereof.
The processor 22 may also be referred to as a CPU (Central Processing Unit). The processor 22 may be an integrated circuit chip having signal processing capabilities. The processor 22 may also be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general purpose processor may be a microprocessor, or the processor 22 may be any conventional processor or the like.
The electronic device 20 may further comprise a memory 21 for storing instructions and data needed for the operation of the processor 22.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a computer readable storage medium according to an embodiment of the present application. The computer readable storage medium 30 of the embodiments of the present application stores instructions/program data 31 that, when executed, implement the methods provided by any of the method embodiments described above and any non-conflicting combination thereof. The instructions/program data 31 may be stored in the storage medium 30 as a software product, so that a computer device (which may be a personal computer, a server, a network device, or the like) or a processor can perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium 30 includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code, or devices such as a computer, a server, a mobile phone, or a tablet.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of units is merely a logical functional division, and there may be other division manners in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing describes only embodiments of the present application and does not thereby limit the patent scope of the present application; all equivalent structures or equivalent process transformations made using the contents of the specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, are likewise included within the patent protection scope of the present application.

Claims (10)

1. A method of data management, the method comprising:
creating at least one storage area for a conventional magnetic recording disk;
writing data to be stored into a storage area in a file form;
and storing a correspondence between the data to be stored and the storage area, so as to manage the data in the conventional magnetic recording disk in units of storage areas based on the correspondence.
2. The method of data management according to claim 1, wherein the method further comprises:
and performing data management on a preset magnetic disk with the same partition management logic as that of the traditional magnetic recording disk, wherein the magnetic disk type of the preset magnetic disk is different from that of the traditional magnetic recording disk.
3. The data management method according to claim 1, wherein writing the data to be stored into a storage area in file form, so that the conventional magnetic recording disk performs file management in units of storage areas, comprises:
if the used space of the storage area is greater than or equal to a preset space threshold, creating a new storage area and writing the data to be stored into the newly created storage area;
and if the used space of the storage area is smaller than the preset space threshold, writing the data to be stored into the storage area.
4. The data management method according to claim 1, wherein the data management method is applied to a storage device provided with at least two types of disks, or to a distributed data management system in which the storage device is provided with at least two types of disks; the method further comprises:
acquiring a file block to be stored;
when the file block to be stored is a restored damaged file block, determining whether a disk satisfying a preset requirement exists among all the disks on the storage device of the type corresponding to the file block to be stored;
if such a disk exists, storing the file block to be stored on the disk satisfying the preset requirement;
if no such disk exists, storing the file block to be stored on a disk satisfying the preset requirement among all the disks of types other than the type corresponding to the file block to be stored;
wherein the type corresponding to a file block is the storage disk type of the original damaged file block of the file block.
5. The data management method according to claim 4, wherein the determining whether a disk satisfying the preset requirement exists among all the disks on the storage device of the type corresponding to the file block to be stored comprises:
selecting a disk from all the disks of the type corresponding to the file block to be stored according to a load balancing policy;
if a disk is selected from all the disks of the type corresponding to the file block to be stored, taking the selected disk as the disk satisfying the preset requirement, and executing the step of storing the file block to be stored on the disk satisfying the preset requirement;
and if none of the disks of the type corresponding to the file block to be stored satisfies the load balancing policy, executing the step of storing the file block to be stored on a disk satisfying the preset requirement among all the disks of types other than the type corresponding to the file block to be stored.
6. The method of claim 4, wherein the obtaining the file block to be stored comprises:
reading undamaged file blocks of a file in response to obtaining a file recovery task;
recovering damaged file blocks of the file based on the undamaged file blocks to obtain at least one repaired file block of the file;
and taking a repaired file block whose preset storage node is the storage device as the file block to be stored;
wherein, after the recovering the damaged file blocks of the file based on the undamaged file blocks to obtain at least one repaired file block of the file, the method further comprises: for a repaired file block whose preset storage node is not the storage device, sending the repaired file block and its corresponding type information to the preset storage node of the repaired file block, so that the preset storage node stores the repaired file block on a disk based on the corresponding type information of the repaired file block.
7. The method of data management according to claim 4, wherein the method further comprises:
and counting the self recovery task processing conditions, and limiting the read-write flow and/or the recovery task number based on the recovery task processing conditions.
8. The method according to claim 7, wherein the recovery task processing status includes a current task number and/or a current recovery read/write traffic, and the counting its own recovery task processing status and limiting the read/write traffic and/or the number of recovery tasks based on the recovery task processing status comprises:
in response to obtaining a file recovery task, determining whether the current task number has reached a first threshold, and if so, executing the file recovery task after waiting for a resource refresh;
and/or counting the current recovery read/write traffic during file recovery, and if the read/write traffic reaches a second threshold, executing the read/write operation after waiting for a resource refresh.
9. An electronic device comprising a processor configured to execute instructions to implement the method of any one of claims 1-8.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores instructions/program data, the instructions/program data being executed to implement the method of any one of claims 1-8.
CN202310423776.3A 2023-04-17 2023-04-17 Data management method and device Pending CN116501252A (en)


Publications (1)

Publication Number Publication Date
CN116501252A true CN116501252A (en) 2023-07-28

Family

ID=87315982



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination